I'm almost certain I'm overcomplicating this, but I don't have much experience using ggplot2 so any advice is welcome!
I have one large dataset with male and female responses to three stimulus types for four variables. I have used aggregate to calculate the mean and SD of each sex's response to each stimulus for each variable. An example is shown below.
dffull
Individual Sex var1 var2 var3 var4 stimulustype
A Male 10 1 4 6 A
B Male 3 9 9 10 A
C Female 9 10 7 9 A
D Female 8 5 2 5 A
A Male 1 3 1 7 B
B Male 9 5 2 3 B
C Female 8 1 2 4 B
D Female 8 4 4 10 B
A Male 1 8 2 7 C
B Male 1 10 6 3 C
C Female 8 1 5 2 C
D Female 8 7 5 8 C
# Splitting the main df into male and female subsets
femaledf <- dffull[dffull$Sex == "Female", ]
maledf <- dffull[dffull$Sex == "Male", ]
mstim <- maledf$stimulustype
fstim <- femaledf$stimulustype
# data = tells aggregate which subset var1..var4 come from
malemeans <- aggregate(cbind(var1, var2, var3, var4) ~ mstim, data = maledf, FUN = mean, na.rm = TRUE, na.action = NULL)
malesd <- aggregate(cbind(var1, var2, var3, var4) ~ mstim, data = maledf, FUN = sd, na.rm = TRUE, na.action = NULL)
femalemeans <- aggregate(cbind(var1, var2, var3, var4) ~ fstim, data = femaledf, FUN = mean, na.rm = TRUE, na.action = NULL)
femalesd <- aggregate(cbind(var1, var2, var3, var4) ~ fstim, data = femaledf, FUN = sd, na.rm = TRUE, na.action = NULL)
I would like to plot male and female responses to each stimulus for a single variable at a time, with the SD as error bars.
Is it a mistake to split the df into male and female? If so, how can I calculate the group-specific means and SD for each sex? Or should I try to merge the aggregate results by sex?
Edit: I would like to have stimulus type on the x axis, with one bar per sex for each stimulus type, and a single variable plotted for both sexes. In the case below I would want Male and Female instead of Gene 1/2, and stimulus A or B instead of Species A/B.
CodePudding user response:
Things like this are easier if you reshape your data to a "tidy" format and summarise with dplyr before plotting. Here's one way to do that:
library(dplyr)
library(tidyr)
library(ggplot2)
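dffull %>%
  # One row per individual, stimulus and variable (columns are lowercase var1..var4)
  pivot_longer(var1:var4, names_to = "Variable", values_to = "Val") %>%
  group_by(stimulustype, Variable, Sex) %>%
  summarise(mean = mean(Val, na.rm = TRUE), sd = sd(Val, na.rm = TRUE), .groups = "drop") %>%
  ggplot() +
  aes(Variable, mean, fill = Sex, group = Sex) +
  geom_col(position = position_dodge(width = 0.9)) +
  # Same dodge width as the bars so each error bar sits on its own bar
  geom_errorbar(aes(ymin = mean - sd, ymax = mean + sd),
                position = position_dodge(width = 0.9), width = 0.2) +
  facet_grid(cols = vars(stimulustype))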
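If you want the layout described in the edit (stimulus type on the x axis, one bar per sex, a single variable per plot), the same idea can be sketched without reshaping. This is only a minimal example under some assumptions: it rebuilds the var1 column from the sample data in the question, and the names var1_summary and p are made up here for illustration.

```r
library(dplyr)
library(ggplot2)

# Sample data copied from the question (only var1 is needed here)
dffull <- data.frame(
  Individual   = rep(c("A", "B", "C", "D"), times = 3),
  Sex          = rep(c("Male", "Male", "Female", "Female"), times = 3),
  var1         = c(10, 3, 9, 8, 1, 9, 8, 8, 1, 1, 8, 8),
  stimulustype = rep(c("A", "B", "C"), each = 4)
)

# Mean and SD of var1 for each sex within each stimulus type,
# computed on the full data frame without splitting it by sex
var1_summary <- dffull %>%
  group_by(stimulustype, Sex) %>%
  summarise(mean = mean(var1, na.rm = TRUE),
            sd   = sd(var1, na.rm = TRUE),
            .groups = "drop")

# Stimulus type on the x axis, one dodged bar per sex, SD as error bars
p <- ggplot(var1_summary, aes(stimulustype, mean, fill = Sex)) +
  geom_col(position = position_dodge(width = 0.9)) +
  geom_errorbar(aes(ymin = mean - sd, ymax = mean + sd),
                position = position_dodge(width = 0.9), width = 0.2) +
  labs(x = "Stimulus type", y = "Mean var1 (+/- SD)")
print(p)
```

To plot a different variable, replace var1 inside summarise() with var2, var3 or var4. Grouping by both stimulustype and Sex is what makes splitting the data frame into male and female subsets unnecessary.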