I am trying to show the percentage change between groups. As a basic approach, I want to compare each group's mean to a default value and show the differences in %. However, I would also like to represent the variability around the means, say in terms of standard deviations. I am not sure how to show this, or whether it even makes sense.
Here is my example of a simple bar plot of the means with error bars, calculated from mean and sd:
library(dplyr)
library(ggplot2)

dd <- data.frame(id = rep(c(1,2,3), 2),
                 vol = c(10,5,8,11,10,9),
                 reg = rep(c('control', 'new'), each = 3))

# calculate mean and sd
sum_dd <- dd %>%
  group_by(reg) %>%
  summarize(V_mean = mean(vol, na.rm = TRUE),
            V_sd = sd(vol, na.rm = TRUE))

# Plot bar plot and error bars
sum_dd %>%
  ggplot(aes(x = reg,
             y = V_mean)) +
  geom_bar(stat = 'identity') +
  geom_errorbar(aes(x = reg,
                    ymin = V_mean - V_sd,
                    ymax = V_mean + V_sd))
This produces a nice bar plot with error bars, where the mean and sd of each group are obvious.
But how do I express the group new as a % change from control?
Here, I first need to calculate the % change from control to new. Then I can plot a bar of the percentage change. But from which values can I calculate something like a standard deviation, to show the variability in my results (e.g., with an error bar)?
sum_dd %>%
  group_by(reg) %>%
  # Calculate % change from the control mean (7.67) to the new mean (10)
  mutate(control_mean = 7.67,
         perc_change = (10 - 7.67)/7.67 * 100) %>%
  filter(reg != 'control') %>%
  ggplot(aes(x = reg,
             y = perc_change)) +
  geom_bar(stat = 'identity') +
  # from which values calculate the error bar??
  geom_errorbar(aes(x = reg,
                    ymin = V_mean - V_sd,
                    ymax = V_mean + V_sd))
Thanks for your thoughts!
CodePudding user response:
First of all, you can get your original plot more easily using stat_summary(), because it will calculate the mean and the error bars for you directly inside the ggplot() call.
But to your question: you can easily calculate the fold change before passing the data to ggplot() by doing a mutate() where you set the mean of vol[reg == "control"] as the denominator. Then you can format the y axis as a percentage using {scales}.
library(tidyverse)
library(scales)

dd <- data.frame(id = rep(c(1,2,3), 2),
                 vol = c(10,5,8,11,10,9),
                 reg = rep(c('control', 'new'), each = 3))

# original plot using stat_summary to avoid transforming data
# (mean_cl_normal needs the Hmisc package; with mult = 1 it draws mean +/- 1 SE,
#  while mean_sdl with mult = 1 would draw mean +/- 1 SD)
dd %>%
  ggplot(aes(reg, vol)) +
  stat_summary(geom = "bar", fun = mean) +
  stat_summary(geom = "errorbar", fun.data = mean_cl_normal, fun.args = list(mult = 1))

# calculate % of control
dd %>%
  mutate(norm_vol = vol/mean(vol[reg == "control"])) %>%
  ggplot(aes(reg, norm_vol)) +
  stat_summary(geom = "bar", fun = mean) +
  stat_summary(geom = "errorbar", fun.data = mean_cl_normal, fun.args = list(mult = 1)) +
  scale_y_continuous(labels = scales::percent_format())
Created on 2022-02-21 by the reprex package (v2.0.1)
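If you want to see the numbers behind such a plot, here is a minimal sketch that puts the percent change and its standard deviation in a table. It assumes the control mean can be treated as a fixed reference value, so the uncertainty of the control mean itself is ignored. The names control_mean, pct_change, pct_mean, pct_sd and pct_summary are made up for this illustration.

```r
library(dplyr)

dd <- data.frame(id = rep(c(1, 2, 3), 2),
                 vol = c(10, 5, 8, 11, 10, 9),
                 reg = rep(c("control", "new"), each = 3))

# Mean of the reference group, treated as a fixed constant (an assumption)
control_mean <- mean(dd$vol[dd$reg == "control"])

pct_summary <- dd %>%
  # percent change of every observation relative to the control mean
  mutate(pct_change = (vol / control_mean - 1) * 100) %>%
  group_by(reg) %>%
  # mean and SD of the rescaled values, both in percent of the control mean
  summarise(pct_mean = mean(pct_change),
            pct_sd = sd(pct_change),
            .groups = "drop")

print(pct_summary)
```

Because every value is only shifted and divided by the same constant, pct_sd is simply the group's original SD expressed as a percentage of the control mean. For this data the new group comes out at roughly a 30% increase with an SD of about 13 percentage points. The pct_mean and pct_sd columns could then be passed to geom_bar() and geom_errorbar() in the same way as V_mean and V_sd in the question.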