Calculate the percentage of a value per group?

I have this data frame:

df.bar <- data.frame(diagnosis = c("A","A","A", "nb" ,"nb", "hg"),
  C1 = c(1,1,0,0,1,0), C2 = c(0,1,0,0,0,0))


    df.bar
      diagnosis C1 C2
    1         A  1  0
    2         A  1  1
    3         A  0  0
    4        nb  0  0
    5        nb  1  0
    6        hg  0  0

I want to calculate the percentage of 1s in each column for each diagnosis, as follows:

      diagnosis  C1  C2
    1         A 66% 33%
    2        nb 50%  0%
    3        hg  0%  0%

CodePudding user response:

Base R solution with aggregate() (the \(x) lambda shorthand requires R >= 4.1):

aggregate(cbind(C1, C2) ~ diagnosis, df.bar,
          \(x) paste0(round(mean(x) * 100, 2), '%'))
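
Note that aggregate() sorts the groups, so the rows come out as A, hg, nb rather than in the A, nb, hg order shown in the question. If that order matters, one option is to turn diagnosis into a factor first. This is only a sketch: it assumes "order of first appearance" is what is wanted, res is a name introduced here for illustration, and the values are left numeric so they can still be sorted or plotted before a "%" is pasted on.

```r
df.bar <- data.frame(diagnosis = c("A", "A", "A", "nb", "nb", "hg"),
                     C1 = c(1, 1, 0, 0, 1, 0), C2 = c(0, 1, 0, 0, 0, 0))

# Assumption: groups should keep their order of first appearance (A, nb, hg)
df.bar$diagnosis <- factor(df.bar$diagnosis, levels = unique(df.bar$diagnosis))

# The mean of a 0/1 column is the share of 1s; multiply by 100 for a percentage
res <- aggregate(cbind(C1, C2) ~ diagnosis, data = df.bar,
                 FUN = function(x) mean(x) * 100)
res
```

function(x) is used instead of \(x) here so the snippet also runs on R versions older than 4.1.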

dplyr solution:

library(dplyr)

df.bar %>%
  group_by(diagnosis) %>%
  summarise(across(C1:C2, ~ paste0(round(mean(.x) * 100, 2), '%')))

# # A tibble: 3 × 3
#   diagnosis C1     C2
#   <chr>     <chr>  <chr>
# 1 A         66.67% 33.33%
# 2 hg        0%     0%
# 3 nb        50%    0%

CodePudding user response:

dplyr answer (returns numeric percentages rather than strings; the |> pipe requires R >= 4.1):

library(dplyr)

df.bar |>
  group_by(diagnosis) |>
  summarise(C1 = sum(C1) / n() * 100,
            C2 = sum(C2) / n() * 100)
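
Since sum(C1) / n() is just mean(C1), this gives the same numbers as the mean()-based answers. The two approaches only diverge once missing values are involved. The question's data has no NAs, so the vector x below is purely hypothetical, and length(x) stands in for n(). It is a minimal base R sketch of how the denominator changes:

```r
# Hypothetical 0/1 vector with one missing value (not from the question's data)
x <- c(1, NA, 0, 1)

# NA propagates: the result is NA
p_plain <- sum(x) / length(x) * 100

# NA dropped from numerator and denominator: 2 of 3 observed values
p_mean <- mean(x, na.rm = TRUE) * 100

# NA dropped from the numerator only: 2 of 4 rows
p_sum_all <- sum(x, na.rm = TRUE) / length(x) * 100

c(plain = p_plain, mean = p_mean, sum_all = p_sum_all)
```

Which denominator is right depends on whether a missing value should count as "not 1" or be excluded altogether.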

CodePudding user response:

tidyverse solution with str_c():

library(tidyverse)
df.bar %>%
  group_by(diagnosis) %>%
  summarise(
    C1 = str_c(round(sum(C1) / n() * 100, 2), "%"),
    C2 = str_c(round(sum(C2) / n() * 100, 2), "%")
  )

# # A tibble: 3 × 3
#   diagnosis C1     C2
#   <chr>     <chr>  <chr>
# 1 A         66.67% 33.33%
# 2 hg        0%     0%
# 3 nb        50%    0%