I've used the group_by function in R, like this:
data = r %>%
group_by(Name, yp) %>%
summarise(nb = n()) %>%
mutate(Frac = nb / sum(nb))
This is what I get:
Name yp nb Frac
0_S 0 1 0.03030303
0_S 1 20 0.60606061
0_S 2 12 0.36363636
1_S 1 16 0.59259259
1_S 2 11 0.40740741
Each Name should have three yp values (0, 1, 2). When a Name/yp combination has no observations, its row is missing from the result entirely instead of appearing with a count of 0. Here is what I would like: for example, yp = 0 is missing for 1_S, so a 1_S 0 row should be added with nb = 0 and Frac = 0.
Name yp nb Frac
0_S 0 1 0.03030303
0_S 1 20 0.60606061
0_S 2 12 0.36363636
1_S 0 0 0
1_S 1 16 0.59259259
1_S 2 11 0.40740741
Reproducible example:
library(dplyr)
Df <- data.frame(A = c('0_S','0_S','0_S','0_S','0_S','0_S','1_S','1_S','1_S','1_S','1_S','1_S'),
B = c(0,0,1,1,2,2,1,1,1,1,2,2),
C = c(0,0,1,1,2,2,0,0,1,1,2,2))
Df
DDf = Df %>%
group_by(A,B) %>%
summarise(n = n()) %>%
mutate(Frac = n / sum(n))
head(DDf)
CodePudding user response:
You can use tidyr::complete:
library(tidyverse)
DDf %>%
ungroup() %>%
complete(A, B, fill = list(n = 0, Frac = 0))
# A tibble: 6 x 4
A B n Frac
<chr> <dbl> <dbl> <dbl>
1 0_S 0 2 0.333
2 0_S 1 2 0.333
3 0_S 2 2 0.333
4 1_S 0 0 0
5 1_S 1 4 0.667
6 1_S 2 2 0.333
Data:
Df <- data.frame(A = c('0_S','0_S','0_S','0_S','0_S','0_S','1_S','1_S','1_S','1_S','1_S','1_S'),
B = c(0,0,1,1,2,2,1,1,1,1,2,2),
C = c(0,0,1,1,2,2,0,0,1,1,2,2))
DDf = Df %>%
group_by(A,B) %>%
summarise(n = n()) %>%
mutate(Frac = n / sum(n))
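As a variation on the approach above, the counting and the fill step can probably be combined into one pipeline. This is only a sketch, assuming dplyr and tidyr are installed; the name res is made up for illustration, and column C is left out because it is not used. Here count() returns an ungrouped result, so no ungroup() is needed before complete(), and Frac is computed after the missing rows are added, so only n needs a fill value.

```r
library(dplyr)
library(tidyr)

# Same A and B columns as in the example data
Df <- data.frame(
  A = c(rep("0_S", 6), rep("1_S", 6)),
  B = c(0, 0, 1, 1, 2, 2, 1, 1, 1, 1, 2, 2)
)

res <- Df %>%
  count(A, B) %>%                          # one row per observed (A, B) pair
  complete(A, B, fill = list(n = 0)) %>%   # add unobserved pairs with n = 0
  group_by(A) %>%
  mutate(Frac = n / sum(n)) %>%            # fraction within each A
  ungroup()

print(res)
```

Computing Frac after complete() means the zero-count rows get Frac = 0 from the division itself rather than from a separate fill value.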