Add count values on top of gghistogram

Time:11-23

I've made a histogram for the different age groups in my data:

> dput(Agedata[1:20,])

structure(list(samples = c("Pt1", "Pt10", "Pt101", "Pt103", "Pt106", 
"Pt11", "Pt17", "Pt18", "Pt2", "Pt24", "Pt26", "Pt27", "Pt28", 
"Pt29", "Pt3", "Pt30", "Pt31", "Pt34", "Pt36", "Pt37"), resp = c("NoResponse", 
"NoResponse", "Response", "NoResponse", "NoResponse", "NoResponse", 
"NoResponse", "Response", "NoResponse", "NoResponse", "NoResponse", 
"NoResponse", "NoResponse", "NoResponse", "Response", "Response", 
"NoResponse", "Response", "NoResponse", "NoResponse"), age = c(58, 
53, 61, 57, 57, 62, 51, 59, 58, 60, 61, 49, 52, 57, 61, 61, 60, 
56, 55, 61), age_group = structure(c(6L, 6L, 7L, 6L, 6L, 7L, 
6L, 6L, 6L, 7L, 7L, 5L, 6L, 6L, 7L, 7L, 7L, 6L, 6L, 7L), levels = c("0-9", 
"10-19", "20-29", "30-39", "40-49", "50-59", "60-69", "70-79", 
"80-89", "90-99"), class = "factor")), row.names = c(NA, 20L), class = "data.frame")

Like this:

library(ggpubr)

gghistogram(Agedata, x = "age_group", bins = 8,
            rug = TRUE,
            color = "resp", fill = "resp", stat = 'count',
            palette = c("red", "green"), main = 'Age ~ Outcome') +
  ylim(c(0,500)) +
  theme_bw()


Now, how do I add the count values on top of each bin, for both the red and the green bins?

CodePudding user response:

I think I would use plain old geom_bar here instead of gghistogram. In general, functions like gghistogram make it easier to produce nice results in ggplot with minimal code, but what they gain in ease of use they lose in flexibility.

Since your bins are pre-defined rather than being constructed from a continuous variable, your data is a better fit for a bar plot than a histogram. It also allows you to add text via geom_text rather than having to work out what gghistogram is doing internally with its aesthetic mappings first.

library(ggplot2)

ggplot(Agedata, aes(age_group, color = resp)) +
  geom_bar(aes(fill = after_scale(alpha(colour, 0.4)))) +
  geom_text(stat = 'count', position = position_stack(vjust = 1),
            vjust = -0.2,
            aes(label = after_stat(count), group = resp), color = 'black') +
  scale_color_manual(values = c('red2', 'green3')) +
  theme_bw()
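
The point above about pre-defined bins versus a continuous variable can be made concrete. Here is a minimal sketch, assuming the age_group factor was built from age in ten-year bands (the question does not say how it was created), showing that base R's cut() gives the same groups for the sample ages. The names breaks and band_labels are my own.

```r
# Ages from the first 20 rows of the question's dput() output
age <- c(58, 53, 61, 57, 57, 62, 51, 59, 58, 60,
         61, 49, 52, 57, 61, 61, 60, 56, 55, 61)

# Assumed binning: ten-year bands, closed on the left, so [50, 60) is "50-59"
breaks <- seq(0, 100, by = 10)
band_labels <- paste0(breaks[-length(breaks)], "-", breaks[-1] - 1)

age_group <- cut(age, breaks = breaks, labels = band_labels, right = FALSE)

# Tabulate only the bands that actually occur in the sample
table(droplevels(age_group))
```

Once the bands exist as a factor like this, each bar is just a count per category, which is why geom_bar is a natural fit.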


CodePudding user response:

Slightly different approach:

library(tidyverse)
Agedata %>% 
  count(resp, age_group) %>% 
  ggplot(aes(x = age_group,y = n, fill = resp, label = n)) +
  geom_col() +
  geom_text(size = 6, position = position_stack(vjust = 0.5)) +
  scale_fill_manual(values = alpha(c("red", "green"), 0.5)) +
  ylim(c(0,15)) +
  theme_bw()
  theme_bw()
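
To check which numbers the labels should show, the same counts can be tabulated without any plotting. This is a minimal base-R sketch on the 20 sample rows from the question. The names group_code, counts and totals are my own, and the real Agedata presumably has more rows.

```r
# resp and age_group for the 20 sample rows, copied from the dput() output
resp <- c("NoResponse", "NoResponse", "Response", "NoResponse", "NoResponse",
          "NoResponse", "NoResponse", "Response", "NoResponse", "NoResponse",
          "NoResponse", "NoResponse", "NoResponse", "NoResponse", "Response",
          "Response", "NoResponse", "Response", "NoResponse", "NoResponse")
group_code <- c(6L, 6L, 7L, 6L, 6L, 7L, 6L, 6L, 6L, 7L,
                7L, 5L, 6L, 6L, 7L, 7L, 7L, 6L, 6L, 7L)
band_labels <- c("0-9", "10-19", "20-29", "30-39", "40-49",
                 "50-59", "60-69", "70-79", "80-89", "90-99")
age_group <- factor(band_labels[group_code], levels = band_labels)

# One row per (resp, age_group) combination that actually occurs
counts <- as.data.frame(table(resp = resp, age_group = age_group),
                        responseName = "n", stringsAsFactors = FALSE)
counts <- counts[counts$n > 0, ]
counts

# Total bar height per age group (the sum of the stacked segments)
totals <- tapply(counts$n, counts$age_group, sum)
totals
```

For the sample rows this gives 1, 9 and 5 NoResponse patients in the 40-49, 50-59 and 60-69 groups, and 2 and 3 Response patients in 50-59 and 60-69. The tallest stacked bar is therefore 11, which fits inside the ylim(c(0,15)) used above.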


CodePudding user response:

Another option is ggbarplot from the ggpubr package, after first calculating the counts (n) per group. You can then simply set label = TRUE to add the labels, like this:

library(ggpubr)
library(dplyr)

data <- Agedata %>%
  group_by(age_group, resp) %>%
  summarise(n = n())
#> `summarise()` has grouped output by 'age_group'. You can override using the
#> `.groups` argument.

ggbarplot(data, 
          x = "age_group", y = 'n', 
          color = "resp", fill = "resp", 
          palette = c("red", "green"), 
          main = 'Age ~ Outcome',
          label = TRUE) +
  ylim(c(0,12)) +
  theme_bw()
  theme_bw()

Created on 2022-11-22 with reprex v2.0.2
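
The summarise() message in the output above can be avoided. Here is a minimal sketch on a hypothetical miniature of Agedata (the toy data frame and its values are invented for illustration), showing two equivalent ways to get ungrouped counts with dplyr:

```r
library(dplyr)

# Hypothetical miniature of Agedata with only the two columns used here
toy <- data.frame(
  resp      = c("NoResponse", "NoResponse", "Response", "NoResponse", "Response"),
  age_group = c("50-59", "50-59", "50-59", "60-69", "60-69")
)

# .groups = "drop" returns an ungrouped result and silences the message
data_drop <- toy %>%
  group_by(age_group, resp) %>%
  summarise(n = n(), .groups = "drop")

# count() groups, counts and ungroups in a single call
data_count <- toy %>%
  count(age_group, resp)
```

Either result could be passed to ggbarplot in place of data. Dropping the grouping mainly matters if further dplyr steps follow.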
