Standard deviation in ggplot for separate variables

Time:02-16

I am trying to plot the standard deviation (SD) of two separate variables in ggplot with stat_summary.

I am able to do so with this code:

Figure1 <- ggplot(data.frame, aes(x = time, y = output, group = parameter, color = parameter,
                                  shape = parameter)) +
  scale_color_manual(values = c("forestgreen", "purple4")) +
  stat_summary(fun = "mean",
               geom = "pointrange",
               fun.max = function(x) mean(x) + sd(x), group = "parameter",
               fun.min = function(x) mean(x) - sd(x), size = 0.5)

However, I would like the SD line to point upwards for one variable and downwards for the other. Whenever I make a change to the code, it treats both variables the same (i.e., both show SD lines up, or both down). In the example below, both lines now point upwards.

Figure1 <- ggplot(data.frame, aes(x = time, y = output, group = parameter, color = parameter,
                                  shape = parameter)) +
  scale_color_manual(values = c("forestgreen", "purple4")) +
  stat_summary(fun = "mean",
               geom = "pointrange",
               fun.max = function(x) mean(x) + sd(x), group = "parameter",
               fun.min = function(x) mean(x), size = 0.5)
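
One way to see why both groups always move together is to inspect what stat_summary actually computes. The sketch below uses a small made-up data frame (demo is hypothetical, not the real data) and ggplot2's layer_data() to pull out the summary rows. The functions given to fun.min and fun.max only receive the y values of each group, so the same rule is applied to every group.

```r
library(ggplot2)

# Hypothetical stand-in data: two parameters measured at one time point
set.seed(42)
demo <- data.frame(
  parameter = rep(c("A", "B"), each = 10),
  time      = 1,
  output    = rnorm(20)
)

p <- ggplot(demo, aes(time, output, color = parameter)) +
  stat_summary(fun = mean,
               geom = "pointrange",
               fun.max = function(x) mean(x) + sd(x),
               fun.min = function(x) mean(x))

# One summary row per group; both rows follow the same up-only rule
summary_rows <- layer_data(p)
summary_rows[, c("group", "y", "ymin", "ymax")]
```

In both rows ymin equals y and ymax lies above it, because the summary functions have no way of knowing which parameter they are summarising.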

I would like the direction of the SD line to depend on "parameter".

Any help is appreciated, thanks.

CodePudding user response:

You did not share your data, so I generated some myself.

library(tidyverse)

df = tibble(
  parameter = "A",
  time = seq(0,100,5),
) %>% bind_rows(
  tibble(
    parameter = "B",
    time = seq(10,120,8),
  )  
) %>% rowwise(time) %>% 
  mutate(output = list(rnorm(sample(5:10,1), rnorm(1,0,10), sample(1:5,1)))) %>% 
  unnest(output)

output

# A tibble: 269 x 3
# Groups:   time [32]
   parameter  time output
   <chr>     <dbl>  <dbl>
 1 A             0   8.22
 2 A             0   1.76
 3 A             0   5.11
 4 A             0   3.68
 5 A             0  11.2 
 6 A             0   7.40
 7 A             0   3.83
 8 A             5 -21.4 
 9 A             5 -20.8 
10 A             5 -20.7 
# ... with 259 more rows

The generated data look like this:

df %>% ggplot(aes(time, output, color = parameter, shape = parameter)) +
  geom_point(size = 2) +
  scale_color_manual(values = c("forestgreen", "purple4"))

Now let's make the graph you asked for (or at least how I understood it):

df %>% ggplot(aes(time, output, group = parameter, color = parameter, shape = parameter)) +
  scale_color_manual(values = c("forestgreen", "purple4")) +
  stat_summary(
    data = df %>% filter(parameter == "A"),
    fun = mean,
    fun.max = function(x) mean(x) + sd(x), group = "parameter",
    fun.min = function(x) mean(x), size = 0.5) +
  stat_summary(
    data = df %>% filter(parameter == "B"),
    fun = mean,
    fun.max = function(x) mean(x), group = "parameter",
    fun.min = function(x) mean(x) - sd(x), size = 0.5)

As you can see, it is enough to call stat_summary twice, each time with a different subset of the data: data = df %>% filter(parameter == "A") for the upward SD lines and data = df %>% filter(parameter == "B") for the downward ones.
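
If you prefer a single layer, a possible alternative (a sketch, not the approach used above) is to compute the mean and SD per group first and then choose the direction of the line per parameter. The data frame demo and the column names m, s, ymin and ymax are made up for illustration; the real df with the same three columns could be used instead.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-in data with the same columns as df
set.seed(1)
demo <- tibble(
  parameter = rep(c("A", "B"), each = 30),
  time      = rep(rep(c(0, 5, 10), each = 10), times = 2),
  output    = rnorm(60, mean = 10, sd = 2)
)

# Summarise once, then pick the direction of the SD line per parameter
demo_summary <- demo %>%
  group_by(parameter, time) %>%
  summarise(m = mean(output), s = sd(output), .groups = "drop") %>%
  mutate(
    ymin = if_else(parameter == "B", m - s, m),  # B: line points down
    ymax = if_else(parameter == "A", m + s, m)   # A: line points up
  )

p <- ggplot(demo_summary,
            aes(time, m, ymin = ymin, ymax = ymax,
                color = parameter, shape = parameter)) +
  geom_pointrange(size = 0.5) +
  scale_color_manual(values = c("forestgreen", "purple4")) +
  labs(y = "output")
```

The trade-off is that the summary is computed outside ggplot, so the rule for each parameter is visible in one place, but the raw data are no longer available to other layers of the same plot unless they are passed in separately.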

The graph can optionally be supplemented with the raw data points:

df %>% ggplot(aes(time, output, group = parameter, color = parameter, shape = parameter)) +
  geom_point(size = 1.2, alpha = .7) +
  scale_color_manual(values = c("forestgreen", "purple4")) +
  stat_summary(
    data = df %>% filter(parameter == "A"),
    fun = mean,
    fun.max = function(x) mean(x) + sd(x), group = "parameter",
    fun.min = function(x) mean(x), size = 0.5) +
  stat_summary(
    data = df %>% filter(parameter == "B"),
    fun = mean,
    fun.max = function(x) mean(x), group = "parameter",
    fun.min = function(x) mean(x) - sd(x), size = 0.5)

Hope this is the solution you have been looking for.
