How to view standard deviation value along with mean value in boxplot

Time:12-09

CO2 Boxplot

I am trying to show "mean (+/-SD)" in a box plot, using the CO2 data. How can I get "mean (+/-SD)" into the plot?

library(ggpubr)

means <- aggregate(conc ~ Type, CO2, mean)

ggboxplot(CO2, x = "Type", y = "conc", color = "Type", palette = "jco", add = "jitter") +
  geom_text(data = means, aes(label = round(conc, 2), y = conc + 10)) +
  stat_summary(fun.y = mean, colour = "darkred", geom = "point", shape = 18, size = 3, show_guide = FALSE) +
  stat_compare_means(method = "t.test")

CodePudding user response:

add = c("jitter", "mean", "mean_sd")

ggpubr supports this visually through the add argument of ggboxplot().
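For context, here is a minimal sketch of the full call with that argument in place. It assumes ggpubr is installed and reuses the CO2 data and styling from the question; the object name `p` is only illustrative.

```r
library(ggpubr)

# CO2 ships with R in the datasets package.
# "mean_sd" asks ggpubr to overlay the group mean with +/- SD bars.
p <- ggboxplot(CO2, x = "Type", y = "conc",
               color = "Type", palette = "jco",
               add = c("jitter", "mean_sd"))
p
```

This draws the statistics on the plot; it does not print the numeric values as text.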

CodePudding user response:

You could precalculate the mean and SD per group using aggregate(). Here is a reproducible example using the iris dataset:

library(ggpubr)
library(dplyr)
library(tidyr)
library(ggplot2)

# Long format
iris_long <- iris %>%
  pivot_longer(cols = -Species)

# Calculate mean and sd per group
ag <- do.call(data.frame, aggregate(value ~ Species, iris_long, function(x) c(mean = mean(x), sd = sd(x))))

# Plot
ggboxplot(data = iris_long, x = 'Species', y = 'value', color = 'Species', add = 'jitter') +
  geom_text(data = ag, mapping = aes(x = Species, label = paste0(round(value.mean, 2), ' +/- ', round(value.sd, 2)), y = 8))

Created on 2022-12-09 with reprex v2.0.2
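The same idea can be applied to the CO2 data from the question. The sketch below is an adaptation, not part of the original answer: the names `stats` and `label`, and the label height of 1100, are illustrative choices.

```r
library(ggpubr)
library(ggplot2)

# Mean and SD of conc per Type. aggregate() returns a matrix column,
# so do.call(data.frame, ...) flattens it into conc.mean and conc.sd.
stats <- do.call(data.frame,
                 aggregate(conc ~ Type, CO2,
                           function(x) c(mean = mean(x), sd = sd(x))))

# Build the "mean (+/-SD)" text for each group
stats$label <- sprintf("%.2f (+/-%.2f)", stats$conc.mean, stats$conc.sd)

# Place the labels just above the highest observation (conc tops out at 1000)
p <- ggboxplot(CO2, x = "Type", y = "conc", color = "Type",
               palette = "jco", add = "jitter") +
  geom_text(data = stats, aes(x = Type, y = 1100, label = label),
            inherit.aes = FALSE)
p
```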

CodePudding user response:

I'm not sure what ggboxplot is, but you can achieve this using plain ggplot() by passing mean_sdl() to the fun.data argument of stat_summary():

library(ggplot2)
ggplot(CO2,
       aes(x = Type, y = conc, color = Type, group = Type)) +
  stat_summary(fun.data = mean_sdl, fun.args = list(mult = 1),
               colour = "darkred", shape = 18, size = 2)

mean and standard deviation visualisation

Created on 2022-12-09 with reprex v2.0.2

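fun.y is a deprecated argument (replaced by fun). fun.data is a better fit for this purpose: it takes a function that summarises a vector into minimum, maximum and middle values (ymin, ymax and y) before plotting. mean_sdl() returns the mean together with the mean +/- a multiple of the standard deviation; the multiple defaults to 2, so pass fun.args = list(mult = 1) to get +/- 1 SD. Note that mean_sdl() is a wrapper around Hmisc, so that package needs to be installed.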

Additionally, show_guide is a deprecated argument to stat_summary(); use show.legend instead.
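As a rough illustration, the mean-point layer from the question could be written with the current argument names. This sketch assumes ggplot2 3.3.0 or later (where fun was introduced) and uses geom_boxplot() in place of ggboxplot(); `p` is just an illustrative name.

```r
library(ggplot2)

p <- ggplot(CO2, aes(x = Type, y = conc, colour = Type)) +
  geom_boxplot() +
  # fun replaces the deprecated fun.y; show.legend replaces show_guide
  stat_summary(fun = mean, geom = "point", colour = "darkred",
               shape = 18, size = 3, show.legend = FALSE)
p
```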
