All I'm trying to do is plot a cumulative row count over time (so that by 2021 the line has reached 73). I'm quite new to R, and this feels like it should be easy, so I don't know why it isn't working.
My data looks like this:
ID name year
73 name73 2021
72 name72 2021
71 name71 2019
70 name70 2017
69 name69 2015
68 name68 2015
I've tried this code and it kind of works, but sometimes the line goes down, which doesn't seem right since I just want a cumulative count.
ggplot(df, aes(x=year, y=ID)) +
  geom_line()
Any help would be much appreciated!
CodePudding user response:
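Order the data by year and then by ID before plotting. geom_line() connects points in order of x, and rows that share a year are drawn in the order they appear in the data, so 73 followed by 72 in 2021 makes the line dip. After sorting, the line runs from the first year to the last, with the smaller ID first within each year.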
x <- 'ID name year
73 name73 2021
72 name72 2021
71 name71 2019
70 name70 2017
69 name69 2015
68 name68 2015'
df <- read.table(textConnection(x), header = TRUE)
library(ggplot2)
i <- order(df$year, df$ID)
ggplot(df[i,], aes(x=year, y=ID)) +
  geom_line()
Created on 2022-07-08 by the reprex package (v2.0.1)
An alternative, though I am not sure it is what the question is asking for, is to aggregate the IDs by year, keeping the maximum ID in each year.
The code below does this and pipes the result straight into the plot, without creating an extra data set (the native pipe |> requires R 4.1 or later).
aggregate(ID ~ year, df, max) |>
  ggplot(aes(x=year, y=ID)) +
  geom_line()
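Both versions above use ID as a stand-in for the running count, which only holds if the IDs are consecutive and start at 1. If that is not guaranteed, a literal cumulative row count can be sketched as below: count the rows per year, then take a running total with cumsum(). The column names n and cum_count are just illustrative, and the data frame is rebuilt from the six sample rows, so the line ends at 6 here rather than 73; on the full data it would end at the total number of rows.

```r
library(ggplot2)

# The six sample rows from the question (the full data would have more rows).
df <- data.frame(
  ID   = c(73, 72, 71, 70, 69, 68),
  name = paste0("name", 73:68),
  year = c(2021, 2021, 2019, 2017, 2015, 2015)
)

# Number of rows in each year; aggregate() returns the years in ascending order.
counts <- aggregate(ID ~ year, df, length)
names(counts)[2] <- "n"            # illustrative name: rows in that year

# Running total of rows up to and including each year.
counts$cum_count <- cumsum(counts$n)

# A cumulative count never decreases, so this line cannot go down.
ggplot(counts, aes(x = year, y = cum_count)) +
  geom_line()
```

Because there is now exactly one point per year, there are no ties on the x axis and the row order in the original data no longer matters.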