I am trying to calculate the proportion of individuals who completed a program per age category. Completion is flagged by the variable 'retain_yn', where retain_yn=1 means the individual completed the program and retain_yn=0 means they didn't. So far I have made a bar chart of the absolute number of individuals who completed the program per age category, but I would like to show a proportion instead.
This is the code I have:
test %>%
  dplyr::mutate(freq_lab = ifelse(retain_yn %in% 1, NA, retain_yn)) %>%
  ggplot(aes(x = agecat, y = retain_yn)) +
  geom_bar(stat = "identity")
How do I make this into a proportion of people who completed the program per age group instead of a count?
Here is some reproducible data:
agecat<-c("0-180 d in person","0-180 d in person","0-180 d in person","0-180 d in person","0-180 d in person","0-180 d in person","0-180 d in person","0-180 d in person","0-180 d in person","0-180 d in person",
"180-365d in person", "180-365d in person","180-365d in person","180-365d in person","180-365d in person","180-365d in person","180-365d in person","180-365d in person","180-365d in person","180-365d in person",
"365d-2y in person","365d-2y in person","365d-2y in person","365d-2y in person","365d-2y in person","365d-2y in person","365d-2y in person","365d-2y in person","365d-2y in person","365d-2y in person",
"2y in person","2y in person","2y in person","2y in person","2y in person","2y in person","2y in person","2y in person","2y in person","2y in person")
agecat<-factor(agecat, levels=c("0-180 d in person", "180-365d in person", "365d-2y in person", "2y in person"))
retain_yn<-c(0, 0, 0, 0, 0, 0, 0, 0, 1, 1,
0, 0, 0, 0, 0, 0, 0, 1, 1, 1,
0, 0, 0, 0, 0, 0, 1, 1, 1, 1,
0, 1, 1, 1, 1, 1, 1, 1, 1, 1)
test<-data.frame(agecat, retain_yn)
Thank you!
Answer:
If you want the absolute count per age category and completion status:
library(dplyr)
library(ggplot2)
# Count rows for each combination of age category and completion flag
mydf <- test %>%
  group_by(agecat, retain_yn) %>%
  summarise(n = n(), .groups = "drop")
To plot the graph:
mydf %>%
  ggplot(aes(x = agecat, y = n, fill = as.factor(retain_yn))) +
  geom_bar(stat = "identity", position = "dodge")
Edit
If you want percentages:
mydf <- test %>%
group_by(agecat, retain_yn) %>%
summarise(n = n(), .groups = "drop") %>%
group_by(agecat) %>%
mutate(Pct = (n / sum(n)) * 100) %>%
ungroup()
Plot:
mydf %>%
  ggplot(aes(x = agecat, y = Pct, fill = as.factor(retain_yn))) +
  geom_bar(stat = "identity")
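If only the share of completers per age group is needed (one bar per category), a shorter route may be to take the mean of retain_yn within each group, since the variable is coded 0/1. The following is a minimal sketch, not the only way to do it. It rebuilds the question's data compactly with rep(), and the names levs, prop_df, prop_retained and p are made up for illustration:

```r
library(dplyr)
library(ggplot2)

# Same values as the question's data, rebuilt compactly with rep()
levs <- c("0-180 d in person", "180-365d in person",
          "365d-2y in person", "2y in person")
test <- data.frame(
  agecat = factor(rep(levs, each = 10), levels = levs),
  retain_yn = c(rep(c(0, 1), times = c(8, 2)),
                rep(c(0, 1), times = c(7, 3)),
                rep(c(0, 1), times = c(6, 4)),
                rep(c(0, 1), times = c(1, 9)))
)

# retain_yn is 0/1, so its mean per group is the proportion who completed
prop_df <- test %>%
  group_by(agecat) %>%
  summarise(prop_retained = mean(retain_yn), .groups = "drop")

# One bar per age category, y axis labelled as a percentage
p <- ggplot(prop_df, aes(x = agecat, y = prop_retained)) +
  geom_col() +
  scale_y_continuous(labels = scales::percent, limits = c(0, 1)) +
  labs(x = "Age category", y = "Completed program")
p
```

With this data, prop_retained comes out as 0.2, 0.3, 0.4 and 0.9 for the four age categories, which should match the Pct values (divided by 100) for the retain_yn == 1 rows of mydf above.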