How to make bar chart using ggplot and dplyr looking at absolute count into a proportion instead?

Time:09-01

I am trying to calculate the proportion of individuals who completed a program in each age category. Completion is flagged by the variable 'retain_yn': retain_yn=1 means the individual completed the program and retain_yn=0 means they didn't. So far I have a bar chart showing the absolute number of individuals who completed the program per age category, but I would like it to show a proportion instead.

This is the code I have:

test %>%
  dplyr::mutate(freq_lab=ifelse(test$retain_yn %in% 1, NA, retain_yn))%>%
  ggplot(aes(x=agecat, y=retain_yn)) +
  geom_bar(stat="identity")

How do I make this into a proportion of people who completed the program per age group instead of a count?

Here is some reproducible data:

agecat<-c("0-180 d in person","0-180 d in person","0-180 d in person","0-180 d in person","0-180 d in person","0-180 d in person","0-180 d in person","0-180 d in person","0-180 d in person","0-180 d in person",
          "180-365d in person", "180-365d in person","180-365d in person","180-365d in person","180-365d in person","180-365d in person","180-365d in person","180-365d in person","180-365d in person","180-365d in person",
          "365d-2y in person","365d-2y in person","365d-2y in person","365d-2y in person","365d-2y in person","365d-2y in person","365d-2y in person","365d-2y in person","365d-2y in person","365d-2y in person",
          "2y in person","2y in person","2y in person","2y in person","2y in person","2y in person","2y in person","2y in person","2y in person","2y in person")
agecat<-factor(agecat, levels=c("0-180 d in person", "180-365d in person", "365d-2y in person", "2y in person"))
retain_yn<-c(0, 0, 0, 0, 0, 0, 0, 0, 1, 1,
     0, 0, 0, 0, 0, 0, 0, 1, 1, 1,
     0, 0, 0, 0, 0, 0, 1, 1, 1, 1,
     0, 1, 1, 1, 1, 1, 1, 1, 1, 1)

test<-data.frame(agecat, retain_yn)

Thank you!

CodePudding user response:

If you want the absolute count for each combination of age category and completion status:

library(dplyr)
library(ggplot2)

mydf <- test %>%
  group_by(agecat, retain_yn) %>%
  summarise(n = n(), .groups = "drop")

To plot the graph:

mydf %>%
  ggplot(aes(x = agecat, y = n, fill = as.factor(retain_yn))) +
  geom_bar(stat = "identity", position = "dodge")
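
As a possible shortcut, ggplot2 can normalise the bars itself: `geom_bar()` counts rows by default, and `position = "fill"` rescales each stack to a total of 1. The sketch below rebuilds the question's data so it runs on its own; the object name `p_fill` and the axis labels are only illustrative, and the percent axis assumes the scales package is installed.

```r
library(ggplot2)

# Rebuild the question's data: 10 individuals per age category
lvls <- c("0-180 d in person", "180-365d in person",
          "365d-2y in person", "2y in person")
test <- data.frame(
  agecat = factor(rep(lvls, each = 10), levels = lvls),
  retain_yn = c(rep(c(0, 1), times = c(8, 2)),
                rep(c(0, 1), times = c(7, 3)),
                rep(c(0, 1), times = c(6, 4)),
                rep(c(0, 1), times = c(1, 9)))
)

# geom_bar() counts rows; position = "fill" scales each bar to sum to 1
p_fill <- ggplot(test, aes(x = agecat, fill = factor(retain_yn))) +
  geom_bar(position = "fill") +
  scale_y_continuous(labels = scales::label_percent()) +
  labs(y = "Share of individuals", fill = "retain_yn")

p_fill
```

This avoids the summarising step, but the proportions exist only inside the plot, so it is less convenient if you also need the numbers in a table.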


Edit

If you want percentages:

mydf <- test %>%
  group_by(agecat, retain_yn) %>%
  summarise(n = n(), .groups = "drop") %>%
  group_by(agecat) %>%
  mutate(Pct = (n / sum(n)) * 100) %>%
  ungroup()

Plot:

mydf %>%
  ggplot(aes(x = agecat, y = Pct, fill = as.factor(retain_yn))) +
  geom_bar(stat = "identity")
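
If only the share of completers per age group is needed (one bar per category), a shorter route may be to take the mean of the 0/1 flag, since the mean of a 0/1 variable is the proportion of 1s. This is a minimal sketch that rebuilds the question's data; the names `prop_df`, `prop_completed` and `p_prop` and the axis labels are my own, and the percent axis assumes the scales package is installed.

```r
library(dplyr)
library(ggplot2)

# Rebuild the question's data: 10 individuals per age category
lvls <- c("0-180 d in person", "180-365d in person",
          "365d-2y in person", "2y in person")
test <- data.frame(
  agecat = factor(rep(lvls, each = 10), levels = lvls),
  retain_yn = c(rep(c(0, 1), times = c(8, 2)),
                rep(c(0, 1), times = c(7, 3)),
                rep(c(0, 1), times = c(6, 4)),
                rep(c(0, 1), times = c(1, 9)))
)

# Proportion of completers per age category (mean of the 0/1 flag)
prop_df <- test %>%
  group_by(agecat) %>%
  summarise(prop_completed = mean(retain_yn == 1), n = n(), .groups = "drop")

# One bar per age category, y axis shown as a percentage
p_prop <- ggplot(prop_df, aes(x = agecat, y = prop_completed)) +
  geom_col() +
  scale_y_continuous(labels = scales::label_percent(), limits = c(0, 1)) +
  labs(x = "Age category", y = "Completed the program")

p_prop
```

With the sample data this gives 20%, 30%, 40% and 90% for the four age categories.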

