Adding unique labels to stacked geom_col in ggplot

Time:02-22

Data overview: 27 sample sites (bars), 3 phyla/groups shown by color, and a variable number of unique species (grouped into 1 of the three phyla) at each site.

I am trying to put a number on top of each stacked column showing the number of unique species detected at that sample site (bar). The overall goal is to show that diversity (number of species) goes down from left to right. I have the numbers stored in a vector. I tried merging it into the original data frame as a column "count" and then adding geom_text(aes(label = count)), but in the resulting plot the species count for each area is duplicated within each bar, once per stacked segment. Help!


Test data that can be used:

library(ggplot2)

set.seed(123)
test.data <- data.frame(area = rep(1:27, 3),
                        cover = c(rnorm(27, mean = 40),
                                  rnorm(27, mean = 10),
                                  rnorm(27, mean = 5)),
                        phyla = c(rep("c", 27), rep("o", 27), rep("r", 27)))

ggplot(data = test.data, aes(x = area, y = cover, fill = phyla)) +
  geom_col() +
  theme_classic() +
  scale_y_continuous(expand = expansion(mult = c(0, 0.05)),
                     limits = c(0, 75)) +
  scale_fill_manual(values = c('#a6611a', '#dfc27d', '#80cdc1')) +
  geom_vline(xintercept = c(9.5, 18.5, 27.5), linetype = "dotted")

# number I want on top of each bar
count <- c(14, 17, 15, 15, 15, 12, 18, 14, 13, 14, 12, 14, 12, 11, 13,
           9, 11, 10,  5,  4,  4,  6,  7,  6,  6,  6,  5)

CodePudding user response:

Your stacked bars include 3 groups per area, but you want just one count label per area. As you found, this can be tricky, as ggplot will try to plot a value for each row in the data. You can get around this by passing separate data to geom_col (your original test.data, with 81 rows: 27 areas times 3 phyla) and to geom_text (for which we'll create a new data.frame with 27 rows, one per bar).

We'll also need to know the total height of each bar in order to place each label atop it. We can find this by using dplyr::group_by() and summarize() to compute the sum of y values (cover) within each x value (area).

library(tidyverse)

# I'm assuming the `count`s in your question are already in the right order
count.data <- test.data %>% 
  group_by(area) %>% 
  summarize(label_height = sum(cover)) %>% 
  mutate(count = c(14, 17, 15, 15, 15, 12, 18, 14, 13, 14, 12, 14, 12, 11, 13,
                   9, 11, 10,  5,  4,  4,  6,  7,  6,  6,  6,  5)
  )

ggplot() +
  geom_col(data = test.data, aes(x = area, y = cover, fill = phyla)) +
  geom_text(
    data = count.data,
    aes(x = area, y = label_height, label = count),
    nudge_y = 2              # scoot labels up above bar tops
  ) +
  theme_classic() +
  scale_y_continuous(
    expand = expansion(mult = c(0, 0.05)),
    limits = c(0, 75)
  ) +
  scale_fill_manual(values = c('#a6611a', '#dfc27d', '#80cdc1')) +
  geom_vline(xintercept = c(9.5, 18.5, 27.5), linetype = "dotted")
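
The mutate() step above depends on the counts being listed in the same order as the areas. If that ordering is not guaranteed, one possible variant is to keep the counts in their own table keyed by area and join on that key. The sketch below assumes the counts correspond to areas 1 through 27 in the order given in the question. The name species.counts is made up for illustration and is not part of the original code.

```r
library(dplyr)

# Rebuild the question's test data so this sketch runs on its own
set.seed(123)
test.data <- data.frame(
  area  = rep(1:27, 3),
  cover = c(rnorm(27, mean = 40), rnorm(27, mean = 10), rnorm(27, mean = 5)),
  phyla = rep(c("c", "o", "r"), each = 27)
)

# Hypothetical lookup table: one row per area, pairing each count with its area
# (assumes the question's counts are listed for areas 1..27 in order)
species.counts <- data.frame(
  area  = 1:27,
  count = c(14, 17, 15, 15, 15, 12, 18, 14, 13, 14, 12, 14, 12, 11, 13,
            9, 11, 10,  5,  4,  4,  6,  7,  6,  6,  6,  5)
)

# One row per bar: total bar height, plus the count matched by area, not by position
count.data <- test.data %>%
  group_by(area) %>%
  summarize(label_height = sum(cover)) %>%
  left_join(species.counts, by = "area")
```

The resulting count.data has the same columns as before (area, label_height, count), so it can be passed to geom_text() exactly as in the plot code above. An area missing from the lookup table would get an NA count instead of silently shifting every later label.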