How to use geom_ribbon to plot?

Time:06-13

I am using the 'flights' data set from the 'nycflights13' package together with 'ggplot2', and I want to convert code that uses stat_summary() into code that uses geom_ribbon(), geom_line(), and geom_point(). Here is the original code:

flights %>%
  select(hour, dep_delay, arr_delay) %>%
  filter(hour > 4) %>%
  pivot_longer(!hour) %>%
  ggplot() +
  stat_summary(aes(hour, value, color = name),
               fun = mean,
               geom = "point",
               size = 3) +
  stat_summary(aes(hour, value, color = name),
               fun = mean,
               geom = "line",
               size = 1.1) +
  stat_summary(aes(hour, value, color = name),
               fun.data = "mean_sdl",
               fun.args = list(mult = 0.2),
               geom = "ribbon",
               alpha = 0.3) +
  theme_bw()

Below is my code:

df = flights %>%
  select(hour, dep_delay, arr_delay) %>%
  filter(hour > 4) %>%
  pivot_longer(!hour) %>%
  group_by(hour, name) %>%
  summarise(value = mean(value, na.rm = T))
df %>%
  mutate(low = value - sd(value)*(0.2), high = value + sd(value)*(0.2)) %>%
  ggplot() +
  geom_point(aes(hour, value, color = name), size = 3) +
  geom_line(aes(hour, value, color = name), size = 1.1) +
  geom_ribbon(aes(x = hour, ymax = high, ymin = low), alpha = 0.3) +
  theme_bw()

However, the plot I made does not look like the original one. I know the problem lies in the geom_ribbon() part, but I don't know how to fix it. Could anyone help me? Thank you so much!

CodePudding user response:

library(nycflights13)
library(tidyverse)
f <- flights %>% 
   select(hour, dep_delay, arr_delay) %>% 
   filter(hour > 4) %>%
   pivot_longer(!hour)

Replicate the calculation that stat_summary() does internally, applying the mean_sdl function to each hour/name combination:

fs <- (f
  ## partition data
  %>% group_by(hour, name)
  ## convert value to a list-column
  %>% nest()
  ## summarise each entry
  %>% mutate(across(data, map, \(x) mean_sdl(x, mult = 0.2)))
  ## collapse back to a vector
  %>% unnest(cols = c(data))
)
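As a cross-check on the step above, roughly the same numbers can be sketched with a plain summarise(). This assumes, as I understand mean_sdl(), that it returns the mean plus and minus mult standard deviations of the raw values with missing values dropped. The toy data frame and the names toy, toy_summary and s are made up for illustration and stand in for the pivoted flights data f.

```r
library(dplyr)

# Toy stand-in for the pivoted flights data (hypothetical values).
toy <- tibble(
  hour  = rep(c(5, 6), each = 6),
  name  = rep(c("dep_delay", "arr_delay"), times = 6),
  value = c(1, 4, 3, 8, NA, 6, 10, 2, 14, 6, 12, 4)
)

mult <- 0.2  # same multiplier as fun.args = list(mult = 0.2)

toy_summary <- toy %>%
  group_by(hour, name) %>%
  summarise(
    y = mean(value, na.rm = TRUE),
    s = sd(value, na.rm = TRUE),  # sd of the raw values within each hour/name cell
    .groups = "drop"
  ) %>%
  mutate(ymin = y - mult * s, ymax = y + mult * s)

print(toy_summary)
```

Note that the mean and the standard deviation are both computed inside the same summarise(), while the raw values are still available. In the question's code, sd(value) is only called after the data have been reduced to means, so it measures the spread of those means rather than of the individual delays.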

Now create the plot:

ggplot(fs) +
  aes(hour, y = y, ymin = ymin, ymax = ymax, color = name) +
  geom_point(size = 3) +
  geom_line(size = 1.1) +
  geom_ribbon(alpha = 0.3) +
  theme_bw()

The order of the layers affects the colours of the lines: if geom_ribbon() is last, it covers the lines with one or two layers of "black/alpha=0.3" (depending on whether a line is overlapped by one or both confidence regions). I might recommend drawing the lines and points after the ribbon, so that the colours are closer to the originally specified values and more predictable (but there's no need to do that if you like the way your plot looks).
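A minimal sketch of that reordering, using a made-up summary table (toy_fs, with hypothetical values) that has the same column names as fs, so it runs without the flights data:

```r
library(ggplot2)

# Made-up summary table with the same columns as fs (hypothetical values).
toy_fs <- data.frame(
  hour = rep(5:8, times = 2),
  name = rep(c("arr_delay", "dep_delay"), each = 4),
  y    = c(-4, 0, 2, 5, 1, 2, 4, 7)
)
toy_fs$ymin <- toy_fs$y - 1
toy_fs$ymax <- toy_fs$y + 1

p <- ggplot(toy_fs, aes(hour, y, ymin = ymin, ymax = ymax, color = name)) +
  geom_ribbon(alpha = 0.3) +  # added first, so it is drawn underneath
  geom_line(size = 1.1) +     # newer ggplot2 versions prefer linewidth for lines
  geom_point(size = 3) +      # added last, so points sit on top
  theme_bw()

print(p)
```

Layers are drawn in the order they are added, so the only change from the plot above is moving geom_ribbon() ahead of the line and point layers.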

CodePudding user response:

You need to add name as a grouping variable. The natural way to do this is to map it to the color aesthetic:

df %>%
  mutate(low = value - sd(value)*(0.2), high = value + sd(value)*(0.2)) %>%
  ggplot() +
  geom_point(aes(hour, value, color = name), size = 3) +
  geom_line(aes(hour, value, color = name), size = 1.1) +
  geom_ribbon(aes(x = hour, ymax = high, ymin = low, color = name),
              alpha = 0.3) +
  theme_bw()
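To illustrate the grouping point on its own, here is a small sketch with made-up data rather than the flights summary. Without any discrete aesthetic, the ribbon is built as a single group across both series. Mapping name to fill is an alternative to color that I am suggesting here (it is not part of the answer above); it splits the ribbon per series and also shades each band in its own colour. The names toy, ungrouped and grouped are hypothetical.

```r
library(ggplot2)

# Made-up data with the same column names as the question's df (hypothetical values).
toy <- data.frame(
  hour  = rep(5:8, times = 2),
  name  = rep(c("arr_delay", "dep_delay"), each = 4),
  value = c(-4, 0, 2, 5, 1, 2, 4, 7)
)
toy$low  <- toy$value - 1
toy$high <- toy$value + 1

# No discrete aesthetic: both series end up in one ribbon group.
ungrouped <- ggplot(toy) +
  geom_ribbon(aes(x = hour, ymin = low, ymax = high), alpha = 0.3)

# Mapping name to fill creates one ribbon per series.
grouped <- ggplot(toy) +
  geom_ribbon(aes(x = hour, ymin = low, ymax = high, fill = name), alpha = 0.3)

# Count the distinct groups ggplot2 computes for the ribbon layer.
n_ungrouped <- length(unique(ggplot_build(ungrouped)$data[[1]]$group))
n_grouped   <- length(unique(ggplot_build(grouped)$data[[1]]$group))
```

Adding group = name inside aes() should have the same grouping effect without changing any colours, if a uniform grey band is preferred.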
