I want to compute, for each student, the percentage of "yes" responses in a column that contains "yes" and "no".
Student | Response |
---|---|
S1 | yes |
S2 | yes |
S1 | no |
S5 | yes |
S5 | yes |
S7 | no |
S8 | no |
This is what I would like to get:
Student | Response | percentage |
---|---|---|
S1 | yes | 50% |
S2 | yes | 100% |
S1 | no | 50% |
S5 | yes | 100% |
S5 | yes | 100% |
S7 | no | 0% |
S8 | no | 0% |
This is what I have been working on, but I don't understand why it doesn't work. Thanks!
df %>%
group_by(Student)%>%
summarize(sum_total = n(Response)%>%
filter(Response== "yes") %>%
summarize(sum_yes = n(Response))%>%
mutate(yes_percentage = scales::label_percent()(sum_yes/sum_total))
CodePudding user response:
You can do this with the base R function ave, which applies a function within each group and returns a vector of the same length as its input:
dat$percentage <- scales::label_percent()(ave( dat$Response=="yes", dat$Student, FUN=mean))
dat
Student Response percentage
1 S1 yes 50%
2 S2 yes 100%
3 S1 no 50%
4 S5 yes 100%
5 S5 yes 100%
6 S7 no 0%
7 S8 no 0%
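For anyone who wants to reproduce this, here is a minimal base R sketch. It assumes the data frame is called dat, as in the answer above, and rebuilds it from the table in the question. sprintf is swapped in for scales::label_percent() only to avoid the extra dependency. The approach relies on the fact that the mean of a 0/1 vector is the proportion of ones.

```r
# Rebuild the example data from the question (the name `dat` is an assumption)
dat <- data.frame(
  Student  = c("S1", "S2", "S1", "S5", "S5", "S7", "S8"),
  Response = c("yes", "yes", "no", "yes", "yes", "no", "no")
)

# "yes" becomes 1 and "no" becomes 0, so the per-student mean
# is the share of "yes" answers for that student
prop_yes <- ave(as.numeric(dat$Response == "yes"), dat$Student, FUN = mean)

# Format as a percentage string without extra packages
dat$percentage <- sprintf("%.0f%%", 100 * prop_yes)
dat
```

Because ave returns one value per input row, the result can be assigned straight back as a new column without any join.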
CodePudding user response:
A simple group_by and mutate will do the job.
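library(dplyr)
library(scales)  # provides label_percent()
df %>%
  group_by(Student) %>%
  # share of "yes" rows within each student, formatted as a percentage
  mutate(Percent = label_percent()(sum(Response == "yes") / n()))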
# A tibble: 7 × 3
# Groups: Student [5]
Student Response Percent
<chr> <chr> <chr>
1 S1 yes 50%
2 S2 yes 100%
3 S1 no 50%
4 S5 yes 100%
5 S5 yes 100%
6 S7 no 0%
7 S8 no 0%
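As for why the attempt in the question fails: n() takes no arguments, the first summarize() call is missing its closing parenthesis, and once summarize() has collapsed each group to one row the Response column no longer exists for filter() to use. If you do want the summarize route, the sketch below is one possible way to write it. It assumes df holds the table from the question; per_student and result are illustrative names, while sum_total, sum_yes and yes_percentage are taken from the question.

```r
library(dplyr)

# Rebuild the example data from the question
df <- data.frame(
  Student  = c("S1", "S2", "S1", "S5", "S5", "S7", "S8"),
  Response = c("yes", "yes", "no", "yes", "yes", "no", "no")
)

# One row per student: total rows and "yes" rows in a single summarise()
per_student <- df %>%
  group_by(Student) %>%
  summarise(
    sum_total      = n(),                      # n() is called without arguments
    sum_yes        = sum(Response == "yes"),   # count "yes" without filtering rows away
    yes_percentage = scales::label_percent()(sum_yes / sum_total)
  )

# Join the per-student percentage back onto every original row
result <- left_join(df, per_student[c("Student", "yes_percentage")], by = "Student")
result
```

The mutate version above gives the same result in one step. This variant is mainly useful when you also need the one-row-per-student table.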