How to plot total cumulative row count over time ggplot

Time:07-09

All I'm trying to do is plot a cumulative row count over time, so that by 2021 the line has reached 73. I'm quite new to R, and I feel like this should be easy, so I don't know why it isn't working.

My data looks like this:

ID    name      year
73    name73    2021
72    name72    2021
71    name71    2019
70    name70    2017
69    name69    2015
68    name68    2015

I've tried this code and it kind of works, but sometimes the line goes down, which doesn't seem right since I just want a cumulative count.

ggplot(df, aes(x=year, y=ID)) +
  geom_line()

Any help would be much appreciated!

CodePudding user response:

Order the data by year and ID before plotting. The line then runs from the first year to the last and, within each year, the smaller ID comes first.

x <- 'ID    name      year
73    name73    2021
72    name72    2021
71    name71    2019
70    name70    2017
69    name69    2015
68    name68    2015'
df <- read.table(textConnection(x), header = TRUE)

library(ggplot2)

# row order: by year, then by ID within each year
i <- order(df$year, df$ID)
ggplot(df[i,], aes(x=year, y=ID)) +
  geom_line()

Created on 2022-07-08 by the reprex package (v2.0.1)


An alternative, which may or may not be what the question is asking for, is to aggregate the IDs by year, keeping the maximum in each year.
The code below does this and pipes the result directly into the plot, without creating an extra data set.

aggregate(ID ~ year, df, max) |>
  ggplot(aes(x=year, y=ID)) +
  geom_line()

Created on 2022-07-08 by the reprex package (v2.0.1)
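
Both snippets above rely on ID already being a running row number. If that cannot be assumed, a possible variant is to count the rows per year and accumulate the counts with cumsum(). The sketch below is only an illustration and not part of the original answer: the names counts, n and cumulative are made up here, and because it uses just the six sample rows from the question, the line ends at 6 rather than 73.

```r
library(ggplot2)

# The six sample rows from the question (the full data set is not shown there).
df <- data.frame(
  ID   = c(73, 72, 71, 70, 69, 68),
  name = paste0("name", 73:68),
  year = c(2021, 2021, 2019, 2017, 2015, 2015)
)

# Count the rows in each year; aggregate() returns the years in ascending order.
counts <- aggregate(ID ~ year, df, length)
names(counts)[2] <- "n"   # hypothetical column name: rows per year

# Running total of rows up to and including each year.
counts$cumulative <- cumsum(counts$n)

ggplot(counts, aes(x = year, y = cumulative)) +
  geom_line()
```

Since the yearly counts are never negative, the running total cannot decrease, so the line never goes down regardless of how the IDs were assigned.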
