R: Count percentage of "yes" in a column by group

Time:09-13

I want to count the percentage of the response "yes" in a column that contains "yes" and "no".

Student Response
S1 yes
S2 yes
S1 no
S5 yes
S5 yes
S7 no
S8 no

This is what I would like to get:

Student Response percentage
S1 yes 50%
S2 yes 100%
S1 no 50%
S5 yes 100%
S5 yes 100%
S7 no 0%
S8 no 0%

This is what I have been working on, but I don't understand what's not working. Thanks!

df %>%
group_by(Student)%>%
summarize(sum_total = n(Response)%>%
filter(Response== "yes") %>%
summarize(sum_yes = n(Response))%>%
mutate(yes_percentage = scales::label_percent()(sum_yes/sum_total))

CodePudding user response:

You can do this with the base R function ave(), which applies a function within each group and returns a vector the same length as the input:

dat$percentage <- scales::label_percent()(ave(dat$Response == "yes", dat$Student, FUN = mean))

dat
  Student Response percentage
1      S1      yes        50%
2      S2      yes       100%
3      S1       no        50%
4      S5      yes       100%
5      S5      yes       100%
6      S7       no         0%
7      S8       no         0%
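
To check this on your own machine, the sample data from the question can be rebuilt by hand. The sketch below keeps the name dat used in this answer and swaps scales::label_percent() for base sprintf(), which is my substitution so that the example needs no extra packages. The intermediate name prop_yes is only illustrative.

```r
# Rebuild the sample data from the question (column names as shown there).
dat <- data.frame(
  Student  = c("S1", "S2", "S1", "S5", "S5", "S7", "S8"),
  Response = c("yes", "yes", "no", "yes", "yes", "no", "no"),
  stringsAsFactors = FALSE
)

# Proportion of "yes" within each student, repeated on every row of that
# student. as.numeric() makes the 0/1 coding explicit before averaging.
prop_yes <- ave(as.numeric(dat$Response == "yes"), dat$Student, FUN = mean)

# Format as a percentage string using base R only.
dat$percentage <- sprintf("%.0f%%", 100 * prop_yes)

print(dat)
```

Because ave() returns one value per input row, the result can be assigned straight back as a new column without any join or merge step.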

CodePudding user response:

A simple group_by and mutate will do the job.

library(dplyr)
library(scales)  # provides label_percent()

df %>% 
  group_by(Student) %>% 
  mutate(Percent = label_percent()(sum(Response == "yes") / n()))

# A tibble: 7 × 3
# Groups:   Student [5]
  Student Response Percent
  <chr>   <chr>    <chr>  
1 S1      yes      50%    
2 S2      yes      100%   
3 S1      no       50%    
4 S5      yes      100%   
5 S5      yes      100%   
6 S7      no       0%     
7 S8      no       0%    
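
As for why the attempt in the question fails: n() takes no arguments, so n(Response) is an error, and the first summarize() call is also missing a closing parenthesis. Even with those fixed, summarize() collapses each group to one row and drops Response, so the later filter(Response == "yes") would have nothing to work on. If one row per student is what you want instead of a per-row column, both counts can be computed in a single summarise() call. This is a minimal sketch, assuming dplyr (1.0 or later, for .groups) and scales are installed; the names per_student, n_total and n_yes are only illustrative.

```r
library(dplyr)

# Sample data from the question.
df <- tibble(
  Student  = c("S1", "S2", "S1", "S5", "S5", "S7", "S8"),
  Response = c("yes", "yes", "no", "yes", "yes", "no", "no")
)

per_student <- df %>%
  group_by(Student) %>%
  summarise(
    n_total = n(),                    # rows per student; n() takes no arguments
    n_yes   = sum(Response == "yes"), # count of "yes" within the group
    yes_percentage = scales::label_percent()(n_yes / n_total),
    .groups = "drop"                  # return an ungrouped tibble
  )

print(per_student)
```

Use mutate() as in the answer above when every original row should keep its percentage, and summarise() when a per-student table is enough.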