Adding text (sample size) on top of stacked bar chart returns error message in ggplot R

Time:05-06

Current figure: (image not included)

Desired effect: (image not included)

I have a stacked bar chart and want to add the sample size on top of each bar. I tried using geom_text() with the following code:

Data %>% count(Month, Age) %>%
  group_by(Month) %>%
  mutate(percent = n/sum(n)*100) %>%
  ggplot(aes(Month, percent, fill = as.factor(Age))) +
  geom_col(position = "fill") +
  ylab("") +
  geom_text(aes(label = n_month, y = 1.05)) +
  scale_y_continuous(labels = scales::percent) +
  scale_fill_manual(values = c("#009E73", "#E69F00", "#0072B2")) +
  theme(axis.text = element_text(size = 17),
        legend.text = element_text(size = 18),
        axis.title.x = element_text(margin = margin(t = 10), size = 16))

This returns an error. I understand that this is because the plotted data actually has 34 rows, but I only want to display 12 numbers. So far it only works when the data has exactly 12 rows (hence the "Desired effect" figure). How should I change my code?

Error: Aesthetics must be either length 1 or the same as the data (34): label
n_month
 [1] 18  8 20 18 24 34 32 15 22 26 12 13

Answer:

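Sorry for the delay. I tried to reproduce your data, and the issue is the underlying data: geom_text() inherits the 34-row data frame piped into ggplot(), while n_month holds only 12 values, so the label aesthetic does not match the length of the data. For your approach it is easier to give each geom its own dataset.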
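To see the mismatch in isolation, here is a minimal sketch. The small Data frame below is a made-up stand-in for your data (the values are assumptions, only the column names Month and Age come from your question), and n_month is built here with table() just for illustration.

```r
library(dplyr)

# Hypothetical stand-in for the asker's Data: one row per observation
Data <- data.frame(
  Month = c(1, 1, 1, 2, 2, 3),
  Age   = c(0, 1, 1, 0, 2, 1)
)

# Same summary as in the question: one row per Month/Age combination
plot_data <- Data %>%
  count(Month, Age) %>%
  group_by(Month) %>%
  mutate(percent = n / sum(n) * 100) %>%
  ungroup()

# One total per month, i.e. fewer values than plot_data has rows
n_month <- as.vector(table(Data$Month))

nrow(plot_data)   # number of rows every layer inherits from ggplot()
length(n_month)   # number of labels supplied to geom_text()
```

Whenever these two numbers differ (and the label vector is not of length 1), ggplot2 cannot line the labels up with the rows and raises the "Aesthetics must be either length 1 or the same as the data" error.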

For this example I am using the nycflights13 data, which is probably similar to your data.

Here is my setup:

library(dplyr)
library(ggplot2)
library(nycflights13)

graph_data <- flights %>% 
  filter(carrier %in% c("UA", "B6", "EV")) %>% 
  count(carrier, month) %>% 
  add_count(month, wt = n, name = "n_month") %>% 
  mutate(percent = n / n_month * 100) 

Data looks like:

# A tibble: 36 × 5
   carrier month     n n_month percent
   <chr>   <int> <int>   <int>   <dbl>
 1 B6          1  4427   13235    33.4
 2 B6          2  4103   12276    33.4
 3 B6          3  4772   14469    33.0

Now we supply geom_col() and geom_text() with different datasets, both derived from graph_data. The labels use distinct(graph_data, month, n_month), which keeps one row per month.


ggplot() +
  geom_col(
    data = graph_data,
    aes(x = month, y = percent, fill = as.factor(carrier)),
    position = "fill") +
  ylab("") +
  geom_text(
    data = distinct(graph_data, month, n_month),
    aes(x = month, y = 1.05, label = n_month)) +
  scale_y_continuous(labels = scales::percent) +
  scale_fill_manual(values = c("#009E73", "#E69F00", "#0072B2")) +
  theme(axis.text = element_text(size = 17),
        legend.text = element_text(size = 18),
        axis.title.x = element_text(margin = margin(t = 10), size = 16))

I tried to leave your code unchanged as much as possible and only added the data = ... argument to each geom.
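Applied to your own column names, the same pattern might look roughly like the sketch below. The Data frame is generated here only so the example runs on its own (its contents are an assumption, not your real data), and plot_data and label_data are names I made up for the two datasets.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-in for the asker's Data: 12 months, 3 age groups
Month <- rep(1:12, times = 12:23)
Data <- data.frame(
  Month = Month,
  Age   = rep_len(c(0, 1, 2), length.out = length(Month))
)

# One row per Month/Age combination, with the monthly total attached
plot_data <- Data %>%
  count(Month, Age) %>%
  add_count(Month, wt = n, name = "n_month") %>%
  mutate(percent = n / n_month * 100)

# One row per month: this is what geom_text() should see
label_data <- distinct(plot_data, Month, n_month)

p <- ggplot() +
  geom_col(
    data = plot_data,
    aes(x = Month, y = percent, fill = as.factor(Age)),
    position = "fill") +
  ylab("") +
  geom_text(
    data = label_data,
    aes(x = Month, y = 1.05, label = n_month)) +
  scale_y_continuous(labels = scales::percent) +
  scale_fill_manual(values = c("#009E73", "#E69F00", "#0072B2"))

p
```

Because the labels come from a column of label_data rather than from a separate vector, each layer's aesthetics always match the length of its own data.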

Output is: (image not included)
