Set size line plot with different y axis as addition to a stacked barplot

Time:02-21

I would like to draw a stacked bar plot with an added line that shows the overall set sizes. Drawing the stacked bar plot in ggplot2 is no problem; the difficulty is the additional line with a different y axis. My input is a long-format table, so there is no 'overall size' column.

Code to reproduce sample table:

library(data.table)  # provides setDT() and melt()
df <- data.frame(Sample=c("S1","S2","S3","S4","S5","S6"), A=c(30,52,50,81,23,48), B=c(12,20,15,22,30,14), C=c(rep(15,6)))
df.melt <- melt(setDT(df), id.vars = "Sample", variable.name = "Group")

Head of the table:

   Sample Group value
1:     S1     A    30
2:     S2     A    52
3:     S3     A    50
4:     S4     A    81
5:     S5     A    23
6:     S6     A    48

Code to draw stacked barplot:

library(ggplot2)
library(scales)  # provides percent() for the axis labels
ggplot(df.melt, aes(x = Sample, y = value, fill = Group)) +
  geom_col(position = position_fill(reverse = TRUE)) +
  theme(axis.text.x=element_text(angle=45, hjust=1), legend.title=element_blank()) +
  scale_fill_brewer(palette="Set3") +
  ylab("% of Total") +
  scale_y_continuous(labels = percent) +
  scale_x_discrete(limits  = unique(df.melt$Sample))

The line would run across the six stacked bars and mark the size of each set, e.g. 57 (A + B + C) for sample S1, and the y-axis labels on the right of the plot would show the range of set sizes.
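For reference, the set sizes that the line should display can be recovered from a long-format table by summing value within each Sample. The following is a minimal base-R sketch, not part of the original post; the names `long` and `sizes` are illustrative, and the table is rebuilt by hand so the snippet runs without data.table.

```r
# Long-format table equivalent to df.melt in the question, built with base R only.
# `long` and `sizes` are illustrative names, not from the original post.
long <- data.frame(
  Sample = rep(c("S1", "S2", "S3", "S4", "S5", "S6"), times = 3),
  Group  = rep(c("A", "B", "C"), each = 6),
  value  = c(30, 52, 50, 81, 23, 48,   # group A
             12, 20, 15, 22, 30, 14,   # group B
             rep(15, 6))               # group C
)

# Sum value within each Sample to get the overall set size.
sizes <- aggregate(value ~ Sample, data = long, FUN = sum)
print(sizes)
```

For S1 this gives 30 + 12 + 15 = 57, matching the value quoted above.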


CodePudding user response:

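You can pass the data set directly to each geom, which lets every geom use a different data set. Secondary axes are a bit tricky: they must be a function of the primary axis, so the data has to be rescaled accordingly. Because position_fill() puts the bars on a 0 to 1 scale, I divide the set sizes by an adjustment factor of 120 (just above the largest set size, 118) and multiply the secondary axis by the same factor.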


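library(dplyr)
library(ggplot2)

# Labels for the default breaks 0, 0.25, 0.5, 0.75, 1 of the filled bars
percent <- c("0%", "25%", "50%", "75%", "100%")

# Overall set size per sample (row-wise sum of the group columns)
set_sizes <- df %>%
    rowwise() %>%
    mutate(Size = sum(A, B, C))

ggplot() +
    geom_col(data = df.melt, mapping = aes(x = Sample, y = value, fill = Group), position = position_fill(reverse = TRUE)) +
    # Divide by 120 so the line fits the 0-1 primary axis
    geom_line(data = set_sizes, mapping = aes(x = Sample, y = Size / 120, group = 1)) +
    # Multiply by 120 so the secondary axis shows the original set sizes
    scale_y_continuous(name = "% of Total", labels = percent, sec.axis = sec_axis(~ . * 120, name = "Sample Size")) +
    theme(axis.text.x=element_text(angle=45, hjust=1), legend.title=element_blank()) +
    scale_fill_brewer(palette="Set3") +
    scale_x_discrete(limits  = unique(df.melt$Sample))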
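The factor 120 is hard-coded for this sample data. As a hedged sketch (not the answerer's code), it could instead be derived from the data by rounding the largest set size up to the next multiple of 10. The name `scale_factor` and the rounding rule are assumptions made for illustration; only base R is used, so the arithmetic can be checked without ggplot2.

```r
# Sample data from the question (wide format).
df <- data.frame(
  Sample = c("S1", "S2", "S3", "S4", "S5", "S6"),
  A = c(30, 52, 50, 81, 23, 48),
  B = c(12, 20, 15, 22, 30, 14),
  C = rep(15, 6)
)

# Overall set size per sample: the row-wise sum of the group columns.
set_sizes <- data.frame(
  Sample = df$Sample,
  Size   = rowSums(df[, c("A", "B", "C")])
)

# Assumed rule (illustrative): round the largest size up to the next
# multiple of 10 so that every scaled value stays within 0-1.
scale_factor <- ceiling(max(set_sizes$Size) / 10) * 10

# Values to plot against the 0-1 primary axis.
set_sizes$Scaled <- set_sizes$Size / scale_factor

print(set_sizes)
print(scale_factor)
```

With this data the rule reproduces 120. In the plot, `Size / 120` and `~ . * 120` would then become `Size / scale_factor` and `~ . * scale_factor`, so the line and the secondary axis stay in sync if the data changes.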


CodePudding user response:

Alternatively, you can use cowplot to arrange two independent plots on top of each other, e.g.:

suppressMessages(invisible(lapply(c("data.table", "ggplot2", "cowplot"), 
    require, character.only=TRUE)))
df <- data.table(Sample=c("S1","S2","S3","S4","S5","S6"), 
    A=c(30,52,50,81,23,48), B=c(12,20,15,22,30,14), C=c(rep(15,6)))
df.melt <- melt(df, id.vars = "Sample", variable.name = "Group")
percent <- paste0(sprintf("%s", seq(0, 100, 25)), "%")
# Stacked bar plot, each bar scaled to 100%
p1 <- ggplot(df.melt, aes(x = Sample, y = value, fill = Group)) +
  geom_col(position = position_fill(reverse = TRUE)) +
  theme(axis.text.x=element_text(angle=45, hjust=1), legend.title=element_blank()) +
  scale_fill_brewer(palette="Set3") +
  ylab("% of Total") +
  scale_y_continuous(labels = percent) +
  scale_x_discrete(limits  = unique(df.melt$Sample))

# Line plot of the overall set size per sample, with the x axis hidden
p2 <- ggplot(df.melt[, .(value=sum(value)), by="Sample"], 
    aes(x = Sample, y = value, group=1)) +
    geom_line() +
    scale_x_discrete(labels = NULL, breaks = NULL) +
    labs(x = NULL)
plot_grid(p2, NULL, p1, align="hv", nrow=3, axis='tlbr', rel_heights=c(1, -.28, 4), greedy=FALSE)

Created on 2022-02-20 by the reprex package (v2.0.1)
