At the moment I have the following code:
categories <- df %>% # this is a very large df, but that should not matter for my question
  group_by(category, subcategory, IV_type) %>%
  summarise(n = n())
This produces a data frame like the following:
category <- c('a','a','a','a','b','b','b','c','c')
subcategory <- c(1,1,2,3,4,4,5,6,7)
N <- c(21,13,7,9,11,17,19,23,27)
type <- c('nom', 'ord', 'nom', 'scale', 'nom', 'scale', 'nom', 'scale', 'scale')
categories <- data.frame(category, subcategory, N, type)
However, I would like to obtain this dataframe:
category1 <- c('a','a','a','b','b','c','c')
subcategory1 <- c(1,2,3,4,5,6,7)
N1 <- c(34,7,9,28,19,23,27)
type1 <- c('nom, ord', 'nom', 'scale', 'nom, scale', 'nom', 'scale', 'scale')
categories1 <- data.frame(category1, subcategory1, N1, type1)
My attempt:
categories <- df %>%
  group_by(category, subcategory) %>%
  summarise(n = n(), unique_types = unique(type))
Unfortunately, this throws an error. Does anyone know how I can accomplish this?
Answer:
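Group the already summarised categories data frame by category and subcategory only, then sum N and collapse the distinct types into a single comma-separated string with toString(unique(type)):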
categories %>%
  group_by(category, subcategory) %>%
  summarise(N = sum(N), type = toString(unique(type)), .groups = 'drop')
  category subcategory     N type
  <chr>          <dbl> <dbl> <chr>
1 a                  1    34 nom, ord
2 a                  2     7 nom
3 a                  3     9 scale
4 b                  4    28 nom, scale
5 b                  5    19 nom
6 c                  6    23 scale
7 c                  7    27 scale
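For completeness, here is a self-contained sketch that reproduces the result from the example data in the question, and then shows how the same summary could probably be computed in one step from the raw data. The raw data frame is hypothetical: it is rebuilt here by repeating each example row N times, and it assumes the type lives in a column called IV_type, as in the question's first snippet.

```r
library(dplyr)

# Example data copied from the question (one row per category/subcategory/type)
categories <- data.frame(
  category    = c('a', 'a', 'a', 'a', 'b', 'b', 'b', 'c', 'c'),
  subcategory = c(1, 1, 2, 3, 4, 4, 5, 6, 7),
  N           = c(21, 13, 7, 9, 11, 17, 19, 23, 27),
  type        = c('nom', 'ord', 'nom', 'scale', 'nom', 'scale', 'nom', 'scale', 'scale')
)

# Collapse the per-type rows: add up the counts and join the distinct types
result <- categories %>%
  group_by(category, subcategory) %>%
  summarise(
    N    = sum(N),                  # total count across all types in the group
    type = toString(unique(type)),  # e.g. "nom, ord"
    .groups = 'drop'
  )

# Hypothetical raw data (an assumption, not the asker's real df):
# one row per observation, with the type stored in IV_type
raw <- categories[rep(seq_len(nrow(categories)), categories$N),
                  c('category', 'subcategory', 'type')]
names(raw)[3] <- 'IV_type'

# Same summary in a single step, counting rows with n() instead of summing N
direct <- raw %>%
  group_by(category, subcategory) %>%
  summarise(N = n(), type = toString(unique(IV_type)), .groups = 'drop')
```

The reason toString() helps is that it always returns a single string per group, whereas unique(type) can return several values for one group, which summarise() cannot fit into one row.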