I have this dataframe from which I would like to make an overlapping time series plot, by ID, using four relevant data points for each ID. I don't know how to have all these show on top of each other. I want the lines to be divided into two colors: green for those which finish at a lower value than they began with, and red for the opposite.
I don't know if I need to change my dataframe first or if I can do all the work with ggplot.
Here's a sample of my data (which has 38 total rows). All columns after ID should be points on the x axis, and the y axis should run from 0 to 3.
ID Starting_BAI HYP1_PostSession_1 HYP1_PostSession_4 HYP1_PostSession_8
1 1 2 1 1 0
2 2 3 3 2 1
3 3 2 2 0 0
4 4 3 3 2 0
5 5 3 2 2 1
6 6 2 2 1 1
7 7 2 2 1 0
8 8 3 3 2 1
9 9 3 3 2 0
10 10 3 3 1 0
CodePudding user response:
I might rename your baseline value to HYP1_Session_0 or something like that, so that the final number at the end indicates "time" and is consistent with the other column names.
It will help to have your data in long form instead of wide (and tidy). This means each time observation for a particular ID is one row.
Using pivot_longer, your data can look like this in long form:
library(tidyverse)
library(ggplot2)
df %>%
  rename(HYP1_Session_0 = Starting_BAI) %>%
  pivot_longer(cols = -ID, names_to = "time", values_to = "value", names_pattern = "(\\d+)$", names_transform = list(time = as.numeric))
# A tibble: 40 × 3
ID time value
<int> <dbl> <int>
1 1 0 2
2 1 1 1
3 1 4 1
4 1 8 0
5 2 0 3
6 2 1 3
7 2 4 2
8 2 8 1
9 3 0 2
10 3 1 2
# … with 30 more rows
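To check the reshape on its own, here is a minimal sketch that rebuilds the sample data from the Data section of this answer and confirms that every ID ends up with four rows, at times 0, 1, 4 and 8. The object names long, rows_per_id and time_points are just illustrative.

```r
library(dplyr)
library(tidyr)

# Same values as the dput() output in the Data section of this answer
df <- data.frame(
  ID                 = 1:10,
  Starting_BAI       = c(2L, 3L, 2L, 3L, 3L, 2L, 2L, 3L, 3L, 3L),
  HYP1_PostSession_1 = c(1L, 3L, 2L, 3L, 2L, 2L, 2L, 3L, 3L, 3L),
  HYP1_PostSession_4 = c(1L, 2L, 0L, 2L, 2L, 1L, 1L, 2L, 2L, 1L),
  HYP1_PostSession_8 = c(0L, 1L, 0L, 0L, 1L, 3L, 0L, 1L, 0L, 0L)
)

# Wide to long: the trailing digits of each column name become "time"
long <- df %>%
  rename(HYP1_Session_0 = Starting_BAI) %>%
  pivot_longer(cols = -ID, names_to = "time", values_to = "value",
               names_pattern = "(\\d+)$",
               names_transform = list(time = as.numeric))

# Each ID should contribute exactly four rows
rows_per_id <- count(long, ID)

# The distinct time points recovered from the column names
time_points <- sort(unique(long$time))
```

If rows_per_id shows anything other than 4 for an ID, or time_points contains NA, the column names probably do not all end in a number.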
After that, using tidyverse you can group_by your ID and add an indicator column change that will compare the values at time 0 and 8 (for coloring).
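To see what that indicator does in isolation, here is a small sketch on made-up long-format data. The toy data frame and its values are hypothetical (not the asker's data); only the group_by and mutate step is the same as in this answer.

```r
library(dplyr)

# Hypothetical long-format data: three IDs, first and last time point only
toy <- data.frame(
  ID    = rep(1:3, each = 2),
  time  = rep(c(0, 8), times = 3),
  value = c(3, 1,   # ID 1 finishes lower than it started
            2, 3,   # ID 2 finishes higher
            2, 2)   # ID 3 finishes where it started
)

# Within each ID, compare the value at time 8 with the value at time 0;
# the single TRUE/FALSE result is recycled to every row of that ID
flagged <- toy %>%
  group_by(ID) %>%
  mutate(change = value[time == 8] < value[time == 0]) %>%
  ungroup()

# One row per ID with its indicator
distinct(flagged, ID, change)
```

Because the comparison is a strict "less than", an ID that ends at the same value it started with gets FALSE, so it would share a color with the IDs that increased.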
Finally, you can use ggplot to plot your spaghetti plot. Here we can indicate color based on change. In addition, position_jitter prevents lines from completely overlapping each other.
Note: I changed the data so that one of the IDs (ID #6) would result in a different color (red) - just for testing.
library(tidyverse)
library(ggplot2)
df %>%
  rename(HYP1_Session_0 = Starting_BAI) %>%
  pivot_longer(cols = -ID, names_to = "time", values_to = "value", names_pattern = "(\\d+)$", names_transform = list(time = as.numeric)) %>%
  group_by(ID) %>%
  mutate(change = value[time == 8] < value[time == 0]) %>%
  ggplot(aes(x = time, y = value, group = ID, color = change)) +
  geom_line(position = position_jitter(width = 0, height = .1)) +
  scale_x_continuous(breaks = c(0, 1, 4, 8)) +
  scale_color_manual(values = c("red", "green"))
Output
Data
df <- structure(list(ID = 1:10, Starting_BAI = c(2L, 3L, 2L, 3L, 3L,
2L, 2L, 3L, 3L, 3L), HYP1_PostSession_1 = c(1L, 3L, 2L, 3L, 2L,
2L, 2L, 3L, 3L, 3L), HYP1_PostSession_4 = c(1L, 2L, 0L, 2L, 2L,
1L, 1L, 2L, 2L, 1L), HYP1_PostSession_8 = c(0L, 1L, 0L, 0L, 1L,
3L, 0L, 1L, 0L, 0L)), class = "data.frame", row.names = c("1",
"2", "3", "4", "5", "6", "7", "8", "9", "10"))
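As a possible variation on the plot above, the sketch below names the color values explicitly (so green is tied to TRUE and red to FALSE regardless of order) and fixes the visible y range to roughly 0 to 3, as the question asked. The padding of 0.2 around the range is my own assumption, chosen so the jittered lines at 0 and 3 stay visible; coord_cartesian is used because it zooms without discarding points.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Same values as the dput() output above
df <- data.frame(
  ID                 = 1:10,
  Starting_BAI       = c(2L, 3L, 2L, 3L, 3L, 2L, 2L, 3L, 3L, 3L),
  HYP1_PostSession_1 = c(1L, 3L, 2L, 3L, 2L, 2L, 2L, 3L, 3L, 3L),
  HYP1_PostSession_4 = c(1L, 2L, 0L, 2L, 2L, 1L, 1L, 2L, 2L, 1L),
  HYP1_PostSession_8 = c(0L, 1L, 0L, 0L, 1L, 3L, 0L, 1L, 0L, 0L)
)

# Reshape to long form and flag IDs that finish lower than they started
long <- df %>%
  rename(HYP1_Session_0 = Starting_BAI) %>%
  pivot_longer(cols = -ID, names_to = "time", values_to = "value",
               names_pattern = "(\\d+)$",
               names_transform = list(time = as.numeric)) %>%
  group_by(ID) %>%
  mutate(change = value[time == 8] < value[time == 0]) %>%
  ungroup()

p <- ggplot(long, aes(x = time, y = value, group = ID, color = change)) +
  geom_line(position = position_jitter(width = 0, height = 0.1)) +
  scale_x_continuous(breaks = c(0, 1, 4, 8)) +
  scale_y_continuous(breaks = 0:3) +
  # Named values: TRUE (finished lower) is green, FALSE is red
  scale_color_manual(values = c("TRUE" = "green", "FALSE" = "red")) +
  # Zoom to 0..3 with a little padding (assumed) for the jitter
  coord_cartesian(ylim = c(-0.2, 3.2))

print(p)
```

With the test data above, only ID 6 should come out red.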