I have the code below, which takes a two-column dataset and creates age group categories based on CustomerAge.
library(tidyverse)
df <- read.table(textConnection("Area CustomerAge
A 28
A 40
A 70
A 19
B 13
B 12
B 72
B 90"), header=TRUE)
df2 <- df %>%
  mutate(
    # Create categories
    Customer_Age_Group = dplyr::case_when(
      CustomerAge <= 18 ~ "0-18",
      CustomerAge > 18 & CustomerAge <= 60 ~ "19-60",
      CustomerAge > 60 ~ ">60"
    )
  )
I would like to produce a summary that looks like this:
Area | Customer_Age_Group | Occurrences |
---|---|---|
A | 0-18 | 0 |
A | 19-60 | 3 |
A | >60 | 1 |
B | 0-18 | 2 |
B | 19-60 | 0 |
B | >60 | 2 |
CodePudding user response:
group_by() and summarise() are what you're looking for:
df2 %>% group_by(Area, Customer_Age_Group) %>% summarise(Occurrences = n())
However, note that this won't show categories with zero occurrences in your data set.
CodePudding user response:
To also include the zero counts, you need count(), ungroup() and complete():
df2 %>%
  group_by(Area, Customer_Age_Group, .drop = FALSE) %>%
  count() %>%
  ungroup() %>%
  # Add the Area/age-group combinations missing from the counts, with n = 0
  complete(Area, Customer_Age_Group, fill = list(n = 0))
This also shows the combinations with zero occurrences; the count column is named n.
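One caveat: complete() only fills in combinations of values that appear somewhere in the data, so an age group that no area contains at all would still be missing. Below is a minimal sketch of one way around that, assuming the three labels from the question are the full set of categories. It converts the column to a factor with explicit levels and lets count() keep the empty groups via .drop = FALSE. The names age_levels and summary_tbl are made up for illustration.

```r
library(dplyr)

# Same sample data as in the question
df <- read.table(text = "Area CustomerAge
A 28
A 40
A 70
A 19
B 13
B 12
B 72
B 90", header = TRUE)

# Assumption: these three labels are all the categories you want reported
age_levels <- c("0-18", "19-60", ">60")

summary_tbl <- df %>%
  mutate(
    Customer_Age_Group = case_when(
      CustomerAge <= 18 ~ "0-18",
      CustomerAge <= 60 ~ "19-60",
      CustomerAge > 60 ~ ">60"
    ),
    # Explicit factor levels let count() know about groups with no rows
    Customer_Age_Group = factor(Customer_Age_Group, levels = age_levels)
  ) %>%
  # .drop = FALSE keeps empty factor levels; name sets the count column
  count(Area, Customer_Age_Group, name = "Occurrences", .drop = FALSE)

print(summary_tbl)
```

With the sample data this should give six rows, including A / 0-18 and B / 19-60 with a count of 0, and the rows follow the order of age_levels rather than alphabetical order.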