stat_summary() and fun.data = mean_sdl not working

Time:11-04

library(ggplot2)

set.seed(1) # generate random data
day1 <- rnorm(20, 0, 1)
day2 <- rnorm(20, 5, 1)
Subject <- rep(paste0('S', 1:20), 2)
Data <- data.frame(Value = c(day1, day2))
Day <- rep(c('Day 1', 'Day 2'), each = length(day1))
df <- cbind(Subject, Data, Day)

Using this random data, I'd like to plot the individual points, with a unique color for each subject, plus a summary point (mean ± standard deviation).

The plot looks fine when all points share the same color: stat_summary(fun.data = mean_sdl) works as expected.

ggplot(data = df, mapping = aes(x = Day, y = Value)) +
  stat_summary(fun.data = mean_sdl, fun.args = list(mult = 2),
               geom = 'pointrange', fatten = 3*1.2, size = 1.2,
               color = 'black') +
  geom_point(size = 2)

But it does not work when each point has its own color (one per subject).

ggplot(data = df, mapping = aes(x = Day, y = Value,
                                fill = Subject)) +
  stat_summary(fun.data = mean_sdl, fun.args = list(mult = 2),
               geom = 'pointrange', fatten = 3*1.2, size = 1.2,
               color = 'black') +
  geom_point(shape = 21, color = 'white', size = 2)


CodePudding user response:

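In your example, ggplot2 groups the data by every discrete variable mapped in the plot, so fill = Subject makes each subject its own group and stat_summary() computes a separate summary for each one. You want the grouping and the color to be independent, so you need to set the group explicitly with group = Day.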

ggplot(data = df, mapping = aes(x = Day, y = Value,
                                fill = Subject, group = Day)) +
  stat_summary(fun.data = mean_sdl, fun.args = list(mult = 2),
               geom = 'pointrange', fatten = 3*1.2, size = 1.2,
               color = 'black') +
  geom_point(shape = 21, color = 'white', size = 2)
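
To see why the grouping matters, here is a minimal base-R sketch of the same summary computed per Day and per Day/Subject combination. It does not call ggplot2; mean_sd is a hypothetical helper written for this illustration (it is not mean_sdl itself), and the data frame is rebuilt so the sketch runs on its own.

```r
# Rebuild the question's data so this sketch is self-contained.
set.seed(1)
df <- data.frame(
  Subject = rep(paste0("S", 1:20), 2),
  Value   = c(rnorm(20, 0, 1), rnorm(20, 5, 1)),
  Day     = rep(c("Day 1", "Day 2"), each = 20)
)

# Hypothetical helper (an assumption, not part of ggplot2):
# mean +/- mult * sd, in the y/ymin/ymax shape a pointrange needs.
mean_sd <- function(x, mult = 2) {
  m <- mean(x)
  s <- sd(x)  # NA when x holds a single value
  c(y = m, ymin = m - mult * s, ymax = m + mult * s)
}

# One group per Day: 20 values each, so the range is defined.
by_day <- t(sapply(split(df$Value, df$Day), mean_sd))

# One group per Day/Subject pair: a single value each, so sd() is NA.
by_subject <- t(sapply(split(df$Value, list(df$Day, df$Subject)), mean_sd))

print(by_day)
print(head(by_subject, 3))
```

Grouping by Day yields two rows with a usable range, while grouping by Day and Subject yields forty rows whose ymin and ymax are all NA, which is consistent with the summary disappearing in the second plot of the question.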


CodePudding user response:

Try the following:

ggplot(data = df, mapping = aes(x = Day, y = Value)) +
  stat_summary(fun.data = mean_sdl, fun.args = list(mult = 2),
               geom = 'pointrange', fatten = 3*1.2, size = 1.2,
               color = 'black') +
  geom_point(size = 2, aes(color = Subject))

Instead of mapping Subject in aes() in the first line (ggplot(...)), I've mapped it inside geom_point() (here as color rather than fill). Otherwise, stat_summary() inherits the mapping and does its calculations grouped by Subject!
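
One way to check which variant groups the summary as intended is to count the rows the summary layer computes, using ggplot2's layer_data(). The following is a sketch under some assumptions: mean_sd_range is a hypothetical stand-in for mean_sdl (so the example does not depend on the Hmisc package), and the variable names p_global, p_group and p_local are made up for this illustration.

```r
library(ggplot2)

# Rebuild the question's data so this sketch is self-contained.
set.seed(1)
df <- data.frame(
  Subject = rep(paste0("S", 1:20), 2),
  Value   = c(rnorm(20, 0, 1), rnorm(20, 5, 1)),
  Day     = rep(c("Day 1", "Day 2"), each = 20)
)

# Hypothetical stand-in for mean_sdl (an assumption for this sketch):
# returns a one-row data frame with y, ymin and ymax.
mean_sd_range <- function(x, mult = 2) {
  m <- mean(x)
  s <- sd(x)
  data.frame(y = m, ymin = m - mult * s, ymax = m + mult * s)
}

summary_layer <- stat_summary(fun.data = mean_sd_range, geom = "pointrange")

# Subject mapped at the plot level: the summary layer inherits it.
p_global <- ggplot(df, aes(Day, Value, fill = Subject)) + summary_layer

# Subject mapped at the plot level, but the group overridden to Day.
p_group <- ggplot(df, aes(Day, Value, fill = Subject, group = Day)) +
  summary_layer

# Subject mapped only inside geom_point(), so the summary never sees it.
p_local <- ggplot(df, aes(Day, Value)) +
  summary_layer +
  geom_point(aes(color = Subject))

# Rows computed by the first layer (the summary) in each plot.
n_global <- nrow(layer_data(p_global, 1))
n_group  <- nrow(layer_data(p_group, 1))
n_local  <- nrow(layer_data(p_local, 1))
print(c(global = n_global, group = n_group, local = n_local))
```

If the grouping works as described in the answers, the plot-level mapping should produce one summary row per Day/Subject pair, while both fixes should produce one row per Day.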
