Position stacked identity data sample size as geom_text directly over a bar using geom_bar from ggpl

Time:01-13

In this experiment, we tracked presence or absence of bacterial infection in our subject animals. We were able to isolate which type of bacteria was present in our animals and created a plot that has Week Since Experiment Start on the X axis, and Percentage of Animals Positive for bacterial infection on the Y axis. This is a stacked identity ggplot where each geom_bar contains the different identities of the bacteria that were in the infected animals each week. Here is a sample dataset with the corresponding ggplot code and result:

DummyData <- data.frame(matrix(ncol = 5, nrow = 78))

colnames(DummyData) <- c('WeeksSinceStart','BacteriaType','PositiveOccurences','SampleSize','NewSampleSize')

DummyData$WeeksSinceStart <- c(1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,3,3,3,3,3,3,3,3,4,4,4,4,4,4,4,4,4,5,5,5,5,5,5,5,5,5,5,6,6,6,6,6,6,6,6,7,7,7,7,7,7,7,7,7,8,8,8,8,8,8,8,9,9,9,9,9,10,10,10,10)

DummyData$BacteriaType <- c("BactA","BactB","BactD","BactB","BactE","BactA","BactS","BactF","BactE","BactH","BactJ","BactK","BactE","BactB","BactS","BactF","BactL","BactE","BactW","BactH","BactS","BactJ","BactQ","BactN","BactW","BactA","BactD","BactE","BactA","BactC","BactD","BactK","BactL","BactE","BactD","BactA","BactS","BactK","BactB","BactE","BactF","BactH","BactN","BactE","BactL","BactZ","BactE","BactC","BactR","BactD","BactJ","BactN","BactK","BactW","BactR","BactE","BactW","BactA","BactM","BactG","BactO","BactI","BactE","BactD","BactM","BactH","BactC","BactM","BactW","BactA","BactL","BactB","BactE","BactA","BactS","BactH","BactQ","BactF")

PosOcc <- seq(from = 1, to = 2, by = 1)
DummyData$PositiveOccurences <- rep(PosOcc, times = 13)

DummyData$SampleSize <- c(78,78,78,78,78,78,78,78,78,78,78,78,78,78,78,78,78,78,78,78,78,78,78,78,78,78,78,78,78,78,78,78,78,78,78,78,78,78,78,78,78,78,78,78,78,78,78,78,78,78,78,78,78,78,78,78,78,78,78,78,78,78,78,78,78,78,78,78,78,29,29,29,29,29,10,10,10,10)


DummyData$NewSampleSize <- c(78,NA,NA,NA,NA,NA,NA,NA,NA,78,NA,NA,NA,NA,NA,NA,NA,NA,78,NA,NA,NA,NA,NA,NA,NA,78,NA,NA,NA,NA,NA,NA,NA,NA,78,NA,NA,NA,NA,NA,NA,NA,NA,NA,78,NA,NA,NA,NA,NA,NA,NA,78,NA,NA,NA,NA,NA,NA,NA,NA,78,NA,NA,NA,NA,NA,NA,29,NA,NA,NA,NA,10,NA,NA,NA)


library(ggplot2)
library(RColorBrewer)  # provides brewer.pal()

numcolor <- 20
plotcolors <- colorRampPalette(brewer.pal(8, "Set3"))(numcolor)

# ggplot for Dummy Data
DummyDataPlot <- ggplot(DummyData, aes(x = WeeksSinceStart, y = PositiveOccurences / SampleSize, fill = BacteriaType)) +
  geom_bar(position = "stack", stat = "identity") +
  geom_text(label = DummyData$NewSampleSize, nudge_y = 0.1) +
  scale_y_continuous(limits = c(0, 0.6), breaks = seq(0, 1, by = 0.1)) +
  scale_x_continuous(limits = c(0.5, 11), breaks = seq(0, 10, by = 1)) +
  labs(
    x = "Weeks Since Start",
    y = "Proportion Positive"
  ) +
  scale_fill_manual(values = plotcolors)

[Figure: ggplot for Dummy Data]

The problem: I cannot seem to find a way to position the labels from geom_text directly over each bar. I would also love to add the text "n = " to the sample size value directly over each bar. Thank you for your help!

I have tried different values for position_dodge and nudge_y with no success.
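
A likely reason the labels miss the bar tops: geom_bar(position = "stack") draws each bar up to the sum of that week's proportions, while a geom_text layer that is not stacked places each label at its own row's y value (plus nudge_y). The following is a minimal base R sketch that compares the two heights. The data frame `toy` and its values are hypothetical stand-ins, not the real DummyData.

```r
# Hypothetical toy data: two weeks, a few rows each (not the real DummyData)
toy <- data.frame(
  WeeksSinceStart    = c(1, 1, 1, 2, 2),
  PositiveOccurences = c(1, 2, 1, 2, 1),
  SampleSize         = c(10, 10, 10, 5, 5)
)

# Per-row proportion: the y value each label inherits from the plot's aes()
toy$prop <- toy$PositiveOccurences / toy$SampleSize

# Height of each stacked bar: the sum of the proportions within a week
bar_top <- tapply(toy$prop, toy$WeeksSinceStart, sum)

# Where the first label of each week lands when only nudge_y = 0.1 is applied
first_rows <- !duplicated(toy$WeeksSinceStart)
label_y <- toy$prop[first_rows] + 0.1

print(bar_top)
print(label_y)
```

In this toy case the label heights differ from the bar tops for both weeks, which is why tuning nudge_y alone cannot line the labels up across bars of different heights.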

CodePudding user response:

Sometimes the easiest approach is to do some data wrangling, i.e. one option would be to create a separate dataframe for your labels:

library(ggplot2)
library(dplyr)

dat_label <- DummyData |> 
  group_by(WeeksSinceStart) |> 
  summarise(y = sum(PositiveOccurences / SampleSize), SampleSize = unique(SampleSize))

ggplot(DummyData, aes(x = WeeksSinceStart, y = PositiveOccurences / SampleSize, fill = BacteriaType)) +
  geom_bar(position = "stack", stat = "identity") +
  geom_text(data = dat_label, aes(x = WeeksSinceStart, y = y, label = SampleSize), inherit.aes = FALSE, nudge_y = .01) +
  # scale_y_continuous(limits = c(0, 0.6), breaks = seq(0, 1, by = 0.1)) +
  scale_x_continuous(limits = c(0.5, 11), breaks = seq(0, 10, by = 1)) +
  labs(
    x = "Weeks Since Start",
    y = "Proportion Positive"
  ) +
  scale_fill_manual(values = plotcolors)
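
The question also asked for an "n = " prefix on each label. One possible way to get it, sketched here under the same separate-label-data-frame approach, is to build the label text with paste0() and use vjust to lift the text just above the bar top. The data frame `toy`, the `prop` and `n_label` columns, and the expansion value are hypothetical choices for illustration, and base R aggregate() is swapped in for the dplyr summary so the sketch only needs ggplot2.

```r
library(ggplot2)

# Hypothetical toy data (not the real DummyData)
toy <- data.frame(
  WeeksSinceStart    = c(1, 1, 1, 2, 2),
  BacteriaType       = c("BactA", "BactB", "BactD", "BactA", "BactE"),
  PositiveOccurences = c(1, 2, 1, 2, 1),
  SampleSize         = c(10, 10, 10, 5, 5)
)
toy$prop <- toy$PositiveOccurences / toy$SampleSize

# One row per week: stacked bar height (sum of proportions) and the label text
toy_label <- aggregate(prop ~ WeeksSinceStart + SampleSize, data = toy, FUN = sum)
toy_label$n_label <- paste0("n = ", toy_label$SampleSize)

p <- ggplot(toy, aes(x = WeeksSinceStart, y = prop, fill = BacteriaType)) +
  geom_bar(position = "stack", stat = "identity") +
  # Labels come from toy_label only; a negative vjust moves the text above y
  geom_text(
    data = toy_label,
    aes(x = WeeksSinceStart, y = prop, label = n_label),
    inherit.aes = FALSE,
    vjust = -0.5
  ) +
  # Extra headroom at the top so the tallest bar's label is not clipped
  scale_y_continuous(expand = expansion(mult = c(0, 0.1)))

print(p)
```

With the real data, the same idea would amount to adding a paste0("n = ", SampleSize) column to dat_label and mapping it to label.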
