Grouped barplot with SD bars from two different groups with ggplot2

Time:12-22

I'm almost certain I'm overcomplicating this, but I don't have much experience using ggplot2 so any advice is welcome!

I have one large dataset with male and female responses to three stimulus types for four variables. I have used aggregate to calculate the mean and SD of each sex's response to each stimulus for each variable. An example is shown below.

dffull
Individual  Sex     var1    var2    var3    var4   stimulustype
       A    Male     10       1       4      6          A
       B    Male     3        9       9     10          A
       C    Female   9        10      7      9          A
       D    Female   8        5       2      5          A
       A    Male     1        3       1      7          B
       B    Male     9        5       2      3          B
       C    Female   8        1       2      4          B
       D    Female   8        4       4     10          B
       A    Male     1        8       2      7          C
       B    Male     1       10       6      3          C
       C    Female   8        1       5      2          C
       D    Female   8        7       5      8          C
#splitting main df into Male and Female

femaledf <-dffull[dffull$Sex=="Female",]
maledf <-dffull[dffull$Sex=="Male",]
mstim <- maledf$stimulustype
fstim <- femaledf$stimulustype

malemeans <- aggregate(cbind(var1, var2, var3, var4) ~ mstim, data = maledf, FUN = mean, na.rm = TRUE, na.action = NULL)
malesd <- aggregate(cbind(var1, var2, var3, var4) ~ mstim, data = maledf, FUN = sd, na.rm = TRUE, na.action = NULL)
femalemeans <- aggregate(cbind(var1, var2, var3, var4) ~ fstim, data = femaledf, FUN = mean, na.rm = TRUE, na.action = NULL)
femalesd <- aggregate(cbind(var1, var2, var3, var4) ~ fstim, data = femaledf, FUN = sd, na.rm = TRUE, na.action = NULL)

I would like to plot male and female responses to each stimulus for a single variable at a time, with the SD as error bars.

Is it a mistake to split the df into male and female? If so, how can I calculate the group-specific means and SD for each sex? Or should I try to merge the aggregate results by sex?

Edit: I would like to have stimulus type on the x axis, with one bar per type for each sex, and one variable plotted for both sexes. In the example plot (image not reproduced here) I would want Male and Female instead of Gene 1/2, and stimulus A or B instead of Species A/B.

Answer:

Things like this are easier if you reshape your data to a "tidy" format and summarise with dplyr before plotting. Here's one way to do that:

library(dplyr)
library(tidyr)
library(ggplot2)

dffull %>% 
  # reshape so that each row holds one value of one variable
  pivot_longer(var1:var4, names_to = "Variable", values_to = "Val") %>% 
  group_by(stimulustype, Variable, Sex) %>% 
  # mean and SD per stimulus type, variable and sex
  summarise(mean = mean(Val, na.rm = TRUE), sd = sd(Val, na.rm = TRUE)) %>% 
  ggplot() +
  aes(Variable, mean, fill = Sex, group = Sex) +
  geom_col(position = "dodge") +
  geom_errorbar(aes(ymin = mean - sd, ymax = mean + sd), position = "dodge") +
  facet_grid(cols = vars(stimulustype))

This returns the following plot (image not included).
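
If you only want one variable at a time with stimulus type on the x axis, as described in the question's edit, a minimal sketch might look like the following. It rebuilds the example data from the question, computes the mean and SD of var1 for every Sex/stimulus combination in a single aggregate() call (so the data frame does not need to be split by sex), and dodges the bars by sex. The name var1_summary, the dodge width of 0.9 and the error bar width of 0.2 are my own choices for illustration, not anything from the original post.

```r
library(ggplot2)

# Example data copied from the question
dffull <- data.frame(
  Individual   = rep(c("A", "B", "C", "D"), times = 3),
  Sex          = rep(c("Male", "Male", "Female", "Female"), times = 3),
  var1         = c(10, 3, 9, 8, 1, 9, 8, 8, 1, 1, 8, 8),
  var2         = c(1, 9, 10, 5, 3, 5, 1, 4, 8, 10, 1, 7),
  var3         = c(4, 9, 7, 2, 1, 2, 2, 4, 2, 6, 5, 5),
  var4         = c(6, 10, 9, 5, 7, 3, 4, 10, 7, 3, 2, 8),
  stimulustype = rep(c("A", "B", "C"), each = 4)
)

# Mean and SD of var1 for each Sex x stimulustype group, without splitting by sex
var1_summary <- aggregate(
  var1 ~ Sex + stimulustype,
  data = dffull,
  FUN  = function(x) c(mean = mean(x), sd = sd(x))
)

# aggregate() returns the two statistics as a matrix column; flatten it
var1_summary <- do.call(data.frame, var1_summary)
names(var1_summary) <- c("Sex", "stimulustype", "mean", "sd")

# Use the same dodge width for bars and error bars so they line up
dodge <- position_dodge(width = 0.9)

p <- ggplot(var1_summary, aes(x = stimulustype, y = mean, fill = Sex)) +
  geom_col(position = dodge) +
  geom_errorbar(
    aes(ymin = mean - sd, ymax = mean + sd),
    position = dodge,
    width = 0.2
  ) +
  labs(x = "Stimulus type", y = "Mean of var1 (+/- SD)")

print(p)
```

To plot another variable, swap var1 for var2, var3 or var4 in the formula and the axis label. Groups whose values are identical (for example females under stimulus B) have an SD of 0, so their error bars collapse to a flat line.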
