Within ggplot2, I am using the stat_summary() function to calculate and plot the mean and standard deviation of a dataset. I am simultaneously using facet_wrap() to break the dataset into two plots.
I was pleasantly surprised that adding facet_wrap() to my ggplot caused stat_summary() to correctly be applied to each subset of the data independently.
df>
| ID | Group | Strain | Condition | DoublingTime |
|-----------|-------|--------|-----------|-----------------|
| A_3g_Rep1 | A_3g | A | 3g | 122.4135 |
| A_3g_Rep2 | A_3g | A | 3g | 124.5801 |
| A_3g_Rep3 | A_3g | A | 3g | 124.9419 |
| A_6g_Rep1 | A_6g | A | 6g | 120.5004 |
| A_6g_Rep2 | A_6g | A | 6g | 124.1666 |
| A_6g_Rep3 | A_6g | A | 6g | 124.6453 |
| B_3g_Rep1 | B_3g | B | 3g | 132.568 |
| B_3g_Rep2 | B_3g | B | 3g | 137.5242 |
| B_3g_Rep3 | B_3g | B | 3g | 135.5238 |
| B_6g_Rep1 | B_6g | B | 6g | 137.1333 |
| B_6g_Rep2 | B_6g | B | 6g | 142.733 |
| B_6g_Rep3 | B_6g | B | 6g | 140.0722 |
First, I was using the following, which correctly calculates the mean and standard deviation values. However, each facet's x-axis includes groups that aren't present in that facet.
library(ggplot2)
# "mean_sdl" requires the Hmisc package to be installed
DT_plotA <- ggplot(df, aes(Group, DoublingTime)) +
  stat_summary(fun.data = "mean_sdl", fun.args = list(mult = 1),
               geom = "errorbar", width = 0.5) +
  stat_summary(fun = mean, geom = "point", size = 3) +
  facet_wrap(. ~ Strain, nrow = 1)
I was pleasantly surprised that changing my aes() x-value to Condition while including facet_wrap() caused stat_summary() to calculate the mean and standard deviation correctly for each Group.
DT_plotB <- ggplot(df, aes(Condition, DoublingTime)) +
  stat_summary(fun.data = "mean_sdl", fun.args = list(mult = 1),
               geom = "errorbar", width = 0.5) +
  stat_summary(fun = mean, geom = "point", size = 3) +
  facet_wrap(. ~ Strain, nrow = 1)
However, if facet_wrap() is removed from the plot, stat_summary() calculates the mean and standard deviation by Condition alone, so data from independent Strains are pooled. I worry that this caveat will be forgotten and lead to an incorrect calculation of mean/sd if the facet is removed.
DT_plotC <- ggplot(df, aes(Condition, DoublingTime)) +
  stat_summary(fun.data = "mean_sdl", fun.args = list(mult = 1),
               geom = "errorbar", width = 0.5) +
  stat_summary(fun = mean, geom = "point", size = 3)
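To make the caveat concrete, the two groupings can be reproduced outside of ggplot2. This is only a minimal base R sketch: the data frame is rebuilt from the table above, and the names pooled and per_group are introduced here purely for illustration.

```r
# Data copied from the table above (ID column omitted for brevity).
df <- data.frame(
  Strain = rep(c("A", "B"), each = 6),
  Condition = rep(rep(c("3g", "6g"), each = 3), times = 2),
  DoublingTime = c(122.4135, 124.5801, 124.9419, 120.5004, 124.1666, 124.6453,
                   132.568, 137.5242, 135.5238, 137.1333, 142.733, 140.0722)
)
df$Group <- paste(df$Strain, df$Condition, sep = "_")

# Grouping used when x = Condition and there is no facet:
# both strains are pooled at each Condition.
pooled <- aggregate(DoublingTime ~ Condition, data = df, FUN = mean)

# Grouping used when the plot is faceted by Strain:
# one mean per Strain/Condition combination, i.e. per Group.
per_group <- aggregate(DoublingTime ~ Strain + Condition, data = df, FUN = mean)

print(pooled)
print(per_group)
```

The pooled table has one row per Condition, while per_group has one row per Group, which is the difference between DT_plotC and DT_plotB.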
Question
Is there a way to generate a plot that looks like DT_plotB but instead uses aes(Group, DoublingTime), as shown in the code for DT_plotA?
CodePudding user response:
Maybe we could do it with some preprocessing of the data -> calculating the mean and sd:
library(dplyr)
library(ggplot2)
df %>%
  # compute mean and sd once per Strain/Condition combination
  group_by(Strain, Condition) %>%
  mutate(mean = mean(DoublingTime),
         sd = sd(DoublingTime)) %>%
  ggplot(aes(x = Condition, y = mean)) +
  geom_point() +
  geom_errorbar(aes(ymin = mean - sd, ymax = mean + sd), width = .2) +
  facet_wrap(. ~ Strain)
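Another option, possibly closer to what the question asks for, is to keep aes(Group, DoublingTime) and give each facet its own x-axis with scales = "free_x", so groups that are absent from a facet are dropped from its axis. The following is a sketch under a few assumptions: the data frame is rebuilt from the table in the question, mean_sd is a hypothetical helper written here so that Hmisc is not needed, DT_plotD is an illustrative name, and the label function assumes Group values always look like "Strain_Condition".

```r
library(ggplot2)

# Data copied from the table in the question (ID column omitted for brevity).
df <- data.frame(
  Strain = rep(c("A", "B"), each = 6),
  Condition = rep(rep(c("3g", "6g"), each = 3), times = 2),
  DoublingTime = c(122.4135, 124.5801, 124.9419, 120.5004, 124.1666, 124.6453,
                   132.568, 137.5242, 135.5238, 137.1333, 142.733, 140.0722)
)
df$Group <- paste(df$Strain, df$Condition, sep = "_")

# Hypothetical helper: mean +/- 1 sd in the y/ymin/ymax format
# that stat_summary(fun.data = ...) expects.
mean_sd <- function(y) {
  data.frame(y = mean(y), ymin = mean(y) - sd(y), ymax = mean(y) + sd(y))
}

DT_plotD <- ggplot(df, aes(Group, DoublingTime)) +
  stat_summary(fun.data = mean_sd, geom = "errorbar", width = 0.5) +
  stat_summary(fun = mean, geom = "point", size = 3) +
  # free_x lets each panel show only the groups it actually contains
  facet_wrap(. ~ Strain, nrow = 1, scales = "free_x") +
  # strip the "A_" / "B_" prefix so the tick labels read like Condition
  scale_x_discrete(labels = function(x) sub("^[^_]+_", "", x))
```

Because Group is unique for each Strain/Condition combination, the summaries should stay per Group even if facet_wrap() is later removed; only the layout changes.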