I have hourly time series data that spans 5 years. I would like to plot it with year as the grouping variable, so that there are 5 lines across all dates and times of the year. Simply plotting dateTime on the x-axis and using lubridate::year(dateTime) as the grouping/colour in aes makes one long line with 5 different colours on it.
Rough example data:
require(data.table)
require(lubridate)
require(ggplot2)
# this is just 7 days of hourly data, over 3 separate years
dates <- c(seq(as.Date("2018-03-01"), as.Date("2018-03-07"), by = "day"),
           seq(as.Date("2019-03-01"), as.Date("2019-03-07"), by = "day"),
           seq(as.Date("2020-03-01"), as.Date("2020-03-07"), by = "day"))
# repeat each date 24 times so that every day gets a full set of hours
dt <- data.table(date = rep(dates, each = 24), hr = rep(1:24, 21))
dt[, value := sin(hr)*sample(1:3,1)]
dt[, dateTime := as.POSIXct(paste0(date," ",hr,":00:00"), format="%Y-%m-%d %H:%M")]
# the result should be an x-axis of 7 days/hours, with three lines for years.
# the below is obviously not that
ggplot(dt, aes(x=dateTime,y=value,group=year(dateTime), colour=year(dateTime))) +
  geom_line()
I thought there would be a way of formatting POSIXct times as just month/day/time without the year component, but it appears to just return NA.
(P.S. I'm not really interested in grouping by yday, for example, as I want the intricacies of the hourly cycles to be plotted.)
Answer:
You need a common timestamp for plotting on the x-axis, so create one (datePlot) by setting the year of every dateTime to the year 2000 (or any other fixed year).
When creating the labels for the x-axis, just leave the dummy year out of the format string.
# create some variables to use for plotting
dt[, year := lubridate::year(dateTime)]
dt[, datePlot := update(dateTime, year = 2000)]
# now plot
ggplot(data = dt, aes(x = datePlot, y = value, group = year, color = as.factor(year))) +
  geom_line(size = 1) +
  scale_x_datetime(breaks = "12 hours",
                   labels = function(x) format(x, "%d %b %H:%M")) +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1)) +
  labs(x = "dateTime", color = "year")
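To see what the year replacement does in isolation, here is a minimal sketch on three hand-made timestamps. The names ts, yr and ts_plot are made up for this illustration, and tz = "UTC" is an assumption chosen to keep daylight-saving shifts out of the picture; your real data may need a different time zone. One possible reason to prefer 2000 as the dummy year is that it is a leap year, so a 29 February in the data still has a valid counterpart.

```r
library(lubridate)

# The same calendar moment (1 March, 05:00) in three different years.
# tz = "UTC" is an assumption made for this sketch only.
ts <- as.POSIXct(c("2018-03-01 05:00:00",
                   "2019-03-01 05:00:00",
                   "2020-03-01 05:00:00"), tz = "UTC")

# Keep the real year for grouping, then overwrite it with a dummy year
# so that all observations share one x-axis.
yr <- year(ts)
ts_plot <- update(ts, year = 2000)

# All three timestamps now collapse onto a single x position;
# only the grouping variable yr still tells them apart.
length(unique(ts_plot))

# The axis labels simply omit the dummy year.
format(ts_plot, "%m-%d %H:%M")
```

Because the real year is stored separately before it is overwritten, the grouping and colour aesthetics still distinguish the lines while the x-axis treats all years alike.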