Is it possible to put the frequency of each level on the x-tick label in ggplot?

Time:10-01

As the title suggests, I would like each x-axis tick to show its label together with the frequency of that level. For example, the subtitle already shows the total number of observations for Question1_3l. On each x-axis tick, I would like the label on one line and the count on the next line, e.g. n=2.

Here is some working code of what I have so far:

dt <- structure(list(Question1 = c("1", "5", "4", "4"), 
                     Question2 = c("2", "5", "4", "4"), 
                     Question3 = c("1", "3", "2", "4"),
                     Question1_3l = c("3", "3", "2", "1"),
                     Question2_3l = c("2", "3", "1", "2"),
                     Question3_3l = c("1", "3", "2", "1")),
                class = c("data.frame"), 
                row.names = c(NA, -4L))


library(ggplot2)

# Bar chart of the share of responses at each level of Question1_3l
ggplot(dt, aes(as.factor(Question1_3l))) +
  geom_bar(aes(y = (..count..)/sum(..count..)), fill = "#005A9B") +
  geom_text(aes(y = ((..count..)/sum(..count..)), label = scales::percent((..count..)/sum(..count..))), stat = "count", vjust = -0.25) +
  scale_y_continuous(labels = scales::percent) +
  scale_x_discrete(breaks = c(1, 2, 3), labels = c("Dissatisfied","Neutral","Satisfied")) +
  #scale_y_continuous(breaks = seq(0, .6, .01), labels = scales::percent, limits = c(0, .1)) +
  #ylim(0, .1) +
  scale_fill_brewer(palette = "Set2") +
  labs(title = "",
       subtitle = paste("Number of observations: ", nrow(dt)),
       x="",
       y = "Disabled",
       color = "Unemployed")

Now I just need to automate the frequencies of each level in the x-axis label, similar to what I did with the subtitle.

Is this possible?

CodePudding user response:

You might do this with dplyr upstream:

library(dplyr)
library(ggplot2)

dt %>%
  add_count(Question1_3l) %>%   # adds column n: the frequency of each level
  mutate(x_cat = paste0(
    recode(Question1_3l, 
           "1" = "Dissatisfied",
           "2" = "Neutral",
           "3" = "Satisfied"),
    "\n(", n, ")")) %>%         # label on one line, count on the next
  ggplot(aes(x_cat)) +
  geom_bar(aes(y = (..count..)/sum(..count..)), fill = "#005A9B") +
  geom_text(aes(y = ((..count..)/sum(..count..)), label = scales::percent((..count..)/sum(..count..))), stat = "count", vjust = -0.25) +
  scale_y_continuous(labels = scales::percent) +
  labs(subtitle = paste("Number of observations: ", nrow(dt)),
      x= NULL,
      y = "Disabled",
      color = "Unemployed")
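
One caveat with pasting the count into a character column: ggplot orders character values alphabetically, which only happens to match the intended order here. The following is a minimal sketch, not part of the original answer, that pins the order of the labels to the underlying codes with `reorder()` from base R's stats package. The name `labelled` and the cut-down one-column `dt` are assumptions made for illustration, and the label uses the "n=" form requested in the question.

```r
library(dplyr)

# Cut-down copy of the question's data (assumption: only this column is needed)
dt <- data.frame(Question1_3l = c("3", "3", "2", "1"))

labelled <- dt %>%
  add_count(Question1_3l) %>%
  mutate(
    x_cat = paste0(
      recode(Question1_3l,
             "1" = "Dissatisfied",
             "2" = "Neutral",
             "3" = "Satisfied"),
      "\nn=", n
    ),
    # Order the label levels by the numeric code, not by the label text
    x_cat = reorder(x_cat, as.integer(Question1_3l))
  )

levels(labelled$x_cat)
```

Passing `labelled` to `ggplot(aes(x_cat))` in place of the character column should then keep the bars in code order even if the label text is changed.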


CodePudding user response:

Here is the same approach as in Jon Spring's answer, using other utilities:

We basically do the following up front:

  1. pivot to long format
  2. filter to the question of interest
  3. recode the values as a labelled factor
  4. plot

library(tidyverse)

dt %>% 
  pivot_longer(everything()) %>% 
  add_count(name, value) %>%   # n: frequency of each value within each question
  filter(name == "Question1_3l") %>% 
  mutate(value.factor = value %>% 
           factor() %>% 
           fct_recode("Dissatisfied" = "1", "Neutral" = "2", "Satisfied" = "3")) %>% 
  mutate(my_label = paste0(value.factor, " (n=", n, ")")) %>% 
  ggplot(aes(x = my_label)) +
  geom_bar(aes(y = (..count..)/sum(..count..)), fill = "#005A9B") +
  geom_text(aes(y = ((..count..)/sum(..count..)), label = scales::percent((..count..)/sum(..count..))), stat = "count", vjust = -0.25) +
  scale_y_continuous(labels = scales::percent) +
  scale_fill_brewer(palette = "Set2") +
  labs(title = "",
       subtitle = paste("Number of observations: ", nrow(dt)),
       x="",
       y = "Disabled",
       color = "Unemployed")


CodePudding user response:

If you do most of the calculations beforehand, the ggplot call becomes less complicated. See below:

library(dplyr)
library(ggplot2)

# Summarise first: one row per level with its count, share and labels
sumdat <- dt %>% 
  count(Question1_3l) %>% 
  mutate(
    Question1_3l = factor(
      Question1_3l,
      levels = c(1,2,3), 
      labels = c("Dissatisfied","Neutral","Satisfied")
    ),
    perc = n/sum(n),
    perc_label = paste0(round(100*perc), "%"),
    plotlabels = paste0(Question1_3l, " (", n, ")")
    )

ggplot(sumdat, aes(x = plotlabels, y = perc, label = perc_label)) +
  geom_col(fill = "#005A9B") +
  geom_text(vjust = -.25) +
  scale_y_continuous(labels = scales::percent) +
  labs(
    subtitle = paste("Number of observations: ", nrow(dt)),
    x= NULL,
    y = "Disabled",
    color = "Unemployed"
    )
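
As a variation on the answers above, the data can be left untouched and the counts attached only to the tick labels. This is a rough sketch rather than any answerer's method: it builds a named character vector with base R's `table()` and hands it to `scale_x_discrete(labels = ...)`. The names `level_names`, `counts`, `tick_labels` and `p` are hypothetical, the one-column `dt` is a cut-down copy of the question's data, and `after_stat()` assumes ggplot2 3.3.0 or later.

```r
library(ggplot2)

# Cut-down copy of the question's data (assumption: only this column is needed)
dt <- data.frame(Question1_3l = c("3", "3", "2", "1"))

# Display name for each code (hypothetical lookup vector)
level_names <- c("1" = "Dissatisfied", "2" = "Neutral", "3" = "Satisfied")

# Frequency of each level, named by its code
counts <- table(dt$Question1_3l)

# Named vector: code -> "Label" on one line, "n=<count>" on the next
tick_labels <- setNames(
  paste0(level_names[names(counts)], "\nn=", as.vector(counts)),
  names(counts)
)

p <- ggplot(dt, aes(x = Question1_3l)) +
  geom_bar(aes(y = after_stat(count / sum(count))), fill = "#005A9B") +
  scale_x_discrete(labels = tick_labels) +   # labels matched to levels by name
  scale_y_continuous(labels = scales::percent) +
  labs(subtitle = paste("Number of observations: ", nrow(dt)),
       x = NULL)

p
```

Because the labels are matched to the levels by name, the bars stay in the order of the underlying codes, and the same vector could be rebuilt for any other question column.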
