Convenient way to plot mean and sd per factor (and level)?

Time:02-02

Is there an easy way to add the mean plus the sd to a geom_point() plot, like this:

img

Going further it would be cool to also take into account levels of a factor. My data looks like this:

 str(df)
'data.frame':   138 obs. of  7 variables:
 $ Measurement_type: Factor w/ 3 levels "block_w_same_oil",..: 2 2 2 2 2 2 2 2 2 2 ...
 $ BDV             : num  45.2 64 77 70.2 67.9 55.7 59.8 67.4 75.1 75.2 ...
 $ Temp            : Factor w/ 2 levels "cold","warm": 1 1 1 1 1 1 1 1 1 1 ...
 $ Temp_C          : num  20.1 20.1 20.1 20.1 20.1 20.1 20.1 20.5 20.5 20.5 ...
 $ Pollution       : Factor w/ 2 levels "clean","polluted": 1 1 1 1 1 1 1 1 1 1 ...
 $ Step            : num  1 2 3 4 5 6 1 2 3 4 ...
 $ Rep             : Factor w/ 5 levels "M1","M2","M3",..: 1 1 1 1 1 1 2 2 2 2 ...

I would like to be able to create such plots easily for e.g. the factor Measurement_type and Rep. But maybe also for Pollution and Temp. Is there a built-in feature so I don't have to calculate any means, sd and merge data frames on my own?

What I have atm is:

df %>%
  ggplot(aes(x = Step, y = BDV, colour = Measurement_type, shape = Rep), alpha = 0.8) +
  geom_point(aes(colour = Measurement_type), size = 3) +
  stat_summary(fun.data = 'mean_sdl', geom = 'smooth') +
  xlab("Step") + ylab("BDV / kV") +
  theme_tq()

which produces

img

which actually does the job but is not really usable as the visualization is not great (plus the sds like in geom_ribbon are not even there, yet).

CodePudding user response:

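One option would be to use two stat_summary layers, one for the mean line and one for the band around it. Note that mean_se gives the mean plus or minus one standard error, not the sd; for an sd band you could swap in mean_sdl (which needs the Hmisc package) with fun.args = list(mult = 1), since its default is two sds. If you want lines and ribbons for each combination of Rep and Measurement_type, then drop the group aes.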

Using some fake random example data:

library(ggplot2)

set.seed(123)

df <- data.frame(
  Measurement_type = sample(LETTERS[1:3], 100, replace = TRUE),
  Rep = sample(letters[1:5], 100, replace = TRUE),
  Step = sample(seq(5), 100, replace = TRUE),
  BDV = runif(100, 25, 75)
)

ggplot(df, aes(x = Step, y = BDV, colour = Measurement_type, shape = Rep), alpha = 0.8) +
  stat_summary(aes(
    fill = Measurement_type,
    group = Measurement_type
  ), fun.data = "mean_se", geom = "ribbon", alpha = .3, color = NA) +
  stat_summary(aes(group = Measurement_type), fun.data = "mean_se", geom = "line") +
  geom_point(size = 3) +
  xlab("Step") +
  ylab("BDV / kV")
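Since the question asks for the sd rather than the standard error, here is a minimal sketch of the same two-layer idea with a band of mean plus or minus one sd. The helper mean_sd is a hypothetical name made up for this example (it is not part of ggplot2); it just returns the y, ymin and ymax columns that stat_summary expects from fun.data, which avoids the Hmisc dependency of mean_sdl. The fake data mirrors the example above.

```r
library(ggplot2)

set.seed(123)

# Fake data in the same shape as the example above
df <- data.frame(
  Measurement_type = sample(LETTERS[1:3], 100, replace = TRUE),
  Step = sample(seq(5), 100, replace = TRUE),
  BDV = runif(100, 25, 75)
)

# Hypothetical helper (not a ggplot2 function): mean and mean +/- 1 sd.
# sd() is NA for a single observation, so such cells get no band.
mean_sd <- function(x) {
  m <- mean(x)
  s <- sd(x)
  data.frame(y = m, ymin = m - s, ymax = m + s)
}

p_sd <- ggplot(df, aes(x = Step, y = BDV, colour = Measurement_type)) +
  # Band: mean +/- 1 sd per Step and Measurement_type
  stat_summary(aes(fill = Measurement_type, group = Measurement_type),
               fun.data = mean_sd, geom = "ribbon", alpha = 0.3, colour = NA) +
  # Line through the per-Step means
  stat_summary(aes(group = Measurement_type),
               fun.data = mean_sd, geom = "line") +
  geom_point(size = 3) +
  xlab("Step") +
  ylab("BDV / kV")

p_sd
```

With only a handful of points per Step the sd band will be much wider than the mean_se band, so it may help to facet by Measurement_type if the ribbons overlap too much.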

EDIT

ggplot(df, aes(x = Step, y = BDV, shape = Rep), alpha = 0.8) +
  stat_summary(aes(
    fill = Measurement_type,
    group = Measurement_type
  ), fun.data = "mean_se", geom = "ribbon", alpha = .3, color = NA) +
  stat_summary(aes(
    fill = Rep,
    group = Rep
  ), fun.data = "mean_se", geom = "ribbon", alpha = .3, color = NA) +
  stat_summary(aes(colour = Measurement_type, group = Measurement_type), fun.data = "mean_se", geom = "line") +
  stat_summary(aes(colour = Rep, group = Rep), fun.data = "mean_se", geom = "line") +
  geom_point(aes(colour = Measurement_type), size = 3) +
  xlab("Step") +
  ylab("BDV / kV")
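For the case mentioned at the top of this answer, one summary per combination of Rep and Measurement_type, a rough sketch is to set the group explicitly with interaction(). This is only one way to do it and assumes the same fake data as above; only the mean line is drawn, because with this little data many Step/combination cells hold a single point and a standard error or sd is not defined there.

```r
library(ggplot2)

set.seed(123)

# Fake data in the same shape as the example above
df <- data.frame(
  Measurement_type = sample(LETTERS[1:3], 100, replace = TRUE),
  Rep = sample(letters[1:5], 100, replace = TRUE),
  Step = sample(seq(5), 100, replace = TRUE),
  BDV = runif(100, 25, 75)
)

p_int <- ggplot(df, aes(x = Step, y = BDV, colour = Measurement_type,
                        group = interaction(Measurement_type, Rep))) +
  # One mean line per Measurement_type x Rep combination
  stat_summary(fun = mean, geom = "line") +
  geom_point(aes(shape = Rep), size = 3) +
  xlab("Step") +
  ylab("BDV / kV")

p_int
```

With up to 15 lines in one panel this gets crowded quickly, so adding facet_wrap(~ Rep) is probably worth trying on the real data.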
