Percentage by group with many columns

Time:03-19

I have a data frame with around 100 columns, named var1 to var100.

Grouping variable is a logical variable with 2 possible values, TRUE and FALSE.

What I am looking for is the proportion of 1s in each of the variables var1 to var100, calculated separately for each value of the grouping variable, which is called Group.

Sample

Test_dataframe <- data.frame(Group=c(TRUE,TRUE,TRUE,FALSE,FALSE,FALSE),
                             var1=c(1,0,1,0,1,0),
                             var2=c(0,1,0,1,0,1),
                             var3=c(0,0,0,1,0,1))

Expected result:

Result_dataframe <- data.frame(Group=c(TRUE,FALSE),
                               var1=c(0.66,0.33),
                               var2=c(0.33,0.66),
                               var3=c(0,0.66))

What I have tried so far.

Result_dataframe <- Test_dataframe %>% group_by(Group) %>% summarise_each(funs(sum))

CodePudding user response:

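We can use mean instead of sum. Because the variables in the sample are coded 0/1, the mean within each group is the proportion of 1s in that group.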

library(dplyr)
Test_dataframe %>% 
   group_by(Group) %>%
   summarise(across(everything(), mean), .groups = "drop")

Output:

# A tibble: 2 × 4
  Group  var1  var2  var3
  <lgl> <dbl> <dbl> <dbl>
1 FALSE 0.333 0.667 0.667
2 TRUE  0.667 0.333 0    
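
If the real data has missing values, other non-variable columns, or you want values on a 0 to 100 scale, a variation along these lines might help. This is only a sketch: it assumes the columns of interest all start with "var" and are coded 0/1, and `Percent_dataframe` is just an illustrative name.

```r
library(dplyr)

# Sample data from the question
Test_dataframe <- data.frame(Group = c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE),
                             var1 = c(1, 0, 1, 0, 1, 0),
                             var2 = c(0, 1, 0, 1, 0, 1),
                             var3 = c(0, 0, 0, 1, 0, 1))

Percent_dataframe <- Test_dataframe %>%
  group_by(Group) %>%
  # Only summarise columns whose names start with "var" (assumed naming);
  # na.rm = TRUE ignores missing values, and 100 * mean gives a percentage
  summarise(across(starts_with("var"), ~ 100 * mean(.x, na.rm = TRUE)),
            .groups = "drop")

Percent_dataframe
```

Using starts_with("var") instead of everything() keeps any other columns out of the calculation, which may matter if the full data frame has more than just Group and the var columns.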