I have two long-format datasets: df1, with time and the growth (OD) of different species, and df2, with the corresponding standard deviation for each point.
(I made simplified versions of the data below; the real data contains many more species.)

df1:

time | species | OD |
---|---|---|
1 | sp_a | 0.2 |
2 | sp_a | 0.3 |
3 | sp_a | 0.4 |
1 | sp_b | 0.1 |
2 | sp_b | 0.15 |

df2:

time | species | sd |
---|---|---|
1 | sp_a | 0.05 |
2 | sp_a | 0.02 |
3 | sp_a | 0.08 |
1 | sp_b | 0.04 |
2 | sp_b | 0.02 |
How can I make a time series plot with x=time and y=OD, and error bars using ggplot2 without manually having to add the corresponding error bars from the standard deviation of every single species?
Thank you very much in advance :)
CodePudding user response:
A straightforward approach is to join the two data frames, calculate the upper and lower values, and use `geom_linerange`:
library(tidyverse)
df1 <- tribble(
~time, ~species, ~OD,
1, "sp_a", 0.2,
2, "sp_a", 0.3,
3, "sp_a", 0.4,
1, "sp_b", 0.1,
2, "sp_b", 0.15
)
df2 <- tribble(
~time, ~species, ~sd,
1, "sp_a", 0.05,
2, "sp_a", 0.02,
3, "sp_a", 0.08,
1, "sp_b", 0.04,
2, "sp_b", 0.02
)
# Join the sd values onto the OD values, then compute the bar ends
df1 |>
  left_join(df2, by = c("time", "species")) |>
  mutate(u = OD + sd,
         l = OD - sd) |>
  ggplot(aes(time, OD, colour = species)) +
  geom_point() +
  geom_linerange(aes(ymin = l, ymax = u))
`geom_linerange` takes the aesthetics `ymin` and `ymax`. You pass your values plus and minus the standard deviations to ggplot to mark the ends of the lines. Other geoms (`geom_crossbar`, `geom_errorbar`, `geom_pointrange`) are available too!
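
As a rough sketch of one of those alternatives, the same joined data can be drawn with `geom_errorbar`, plus `geom_line` to connect the points of each species into a time series. The object names `joined` and `p` and the `width = 0.1` cap width are my own choices for illustration, and the data frames are rebuilt with `data.frame()` only so the snippet runs on its own:

```r
library(dplyr)
library(ggplot2)

# Same toy data as above, rebuilt so this snippet is self-contained
df1 <- data.frame(
  time    = c(1, 2, 3, 1, 2),
  species = c("sp_a", "sp_a", "sp_a", "sp_b", "sp_b"),
  OD      = c(0.2, 0.3, 0.4, 0.1, 0.15)
)
df2 <- data.frame(
  time    = c(1, 2, 3, 1, 2),
  species = c("sp_a", "sp_a", "sp_a", "sp_b", "sp_b"),
  sd      = c(0.05, 0.02, 0.08, 0.04, 0.02)
)

# One row per (time, species) pair, with OD and sd side by side
joined <- left_join(df1, df2, by = c("time", "species"))

# ymin/ymax are computed inside aes(), so no extra columns are needed
p <- ggplot(joined, aes(time, OD, colour = species)) +
  geom_line() +
  geom_point() +
  geom_errorbar(aes(ymin = OD - sd, ymax = OD + sd), width = 0.1)

p
```

Because `colour = species` is mapped once in `ggplot()`, every species gets its own line and error bars without any per-species code, which should carry over to the full dataset as long as each time/species pair appears once in both data frames.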