I have a data set of survey counts. I want to plot the date on the x axis and the count on the y axis, with one line for each year of data. I would like only the months to show on the x axis, but the points should still be plotted by day.
I want it to look like this (MD being month and day) but have just the months showing on the x axis.
I know this is possible for a single year of data using scale_x_date, but I have multiple years of data, so my date can't have a year in it, and I can't figure out an easy way to turn month-day into a "Date" class object.
Data:
structure(list(Date = structure(c(15218, 15274, 15314, 15392,
15429, 15441, 15463, 15547, 15574, 15607, 15652, 15687, 15742,
15768, 15799, 15825, 15853, 15898, 15938, 15982, 16009, 16035,
16073, 16098, 16126, 16128, 16147, 16149, 16181, 16225, 16252,
16288, 16331, 16358, 16378, 16407, 16464, 16465, 16525, 16554,
16583, 16610, 16645, 16667, 16696, 16720, 16749, 16798, 16815,
16862, 16891, 16918, 16947, 16976, 17010, 17038, 17072, 17100,
17123, 17150, 17176, 17198, 17220, 17268, 17296, 17329, 17372,
17421, 17491, 17571, 17617, 17647, 17725, 17773, 17792, 17834,
17863, 17884, 17920, 17955, 17987, 18014, 18058, 18096, 18131,
18156, 18186, 18212, 18240, 18306, 18401, 18432, 18464, 18465,
18485, 18526, 18557, 18590, 18616, 18652, 18711, 18744, 18771,
18796, 18835, 18858, 18892, 18915, 18954, 18977, 19023, 19045,
19081, 19101, 19130, 19165, 19199, 19221, 19251, 19287, 19310,
19339), class = "Date"), thecount = c(7L, 11L, 11L, 8L, 9L, 4L,
5L, 4L, 2L, 7L, 4L, 7L, 7L, 6L, 13L, 9L, 4L, 6L, 6L, 5L, 6L,
9L, 10L, 10L, 10L, 14L, 11L, 12L, 5L, 6L, 9L, 6L, 7L, 14L, 8L,
9L, 6L, 7L, 7L, 3L, 9L, 5L, 7L, 5L, 10L, 9L, 10L, 12L, 9L, 11L,
16L, 9L, 10L, 5L, 5L, 8L, 10L, 9L, 12L, 8L, 8L, 8L, 2L, 4L, 3L,
7L, 4L, 6L, 5L, 5L, 9L, 9L, 4L, 8L, 7L, 5L, 4L, 3L, 7L, 5L, 6L,
7L, 6L, 4L, 10L, 4L, 5L, 2L, 4L, 14L, 9L, 5L, 4L, 3L, 4L, 6L,
3L, 3L, 5L, 9L, 8L, 4L, 6L, 4L, 7L, 3L, 5L, 5L, 5L, 17L, 21L,
13L, 20L, 7L, 3L, 20L, 5L, 9L, 10L, 5L, 11L, 16L), Year = c("2011",
"2011", "2011", "2012", "2012", "2012", "2012", "2012", "2012",
"2012", "2012", "2012", "2013", "2013", "2013", "2013", "2013",
"2013", "2013", "2013", "2013", "2013", "2014", "2014", "2014",
"2014", "2014", "2014", "2014", "2014", "2014", "2014", "2014",
"2014", "2014", "2014", "2015", "2015", "2015", "2015", "2015",
"2015", "2015", "2015", "2015", "2015", "2015", "2015", "2016",
"2016", "2016", "2016", "2016", "2016", "2016", "2016", "2016",
"2016", "2016", "2016", "2017", "2017", "2017", "2017", "2017",
"2017", "2017", "2017", "2017", "2018", "2018", "2018", "2018",
"2018", "2018", "2018", "2018", "2018", "2019", "2019", "2019",
"2019", "2019", "2019", "2019", "2019", "2019", "2019", "2019",
"2020", "2020", "2020", "2020", "2020", "2020", "2020", "2020",
"2020", "2020", "2021", "2021", "2021", "2021", "2021", "2021",
"2021", "2021", "2021", "2021", "2021", "2022", "2022", "2022",
"2022", "2022", "2022", "2022", "2022", "2022", "2022", "2022",
"2022"), MD = c("09-01", "10-27", "12-06", "02-22", "03-30",
"04-11", "05-03", "07-26", "08-22", "09-24", "11-08", "12-13",
"02-06", "03-04", "04-04", "04-30", "05-28", "07-12", "08-21",
"10-04", "10-31", "11-26", "01-03", "01-28", "02-25", "02-27",
"03-18", "03-20", "04-21", "06-04", "07-01", "08-06", "09-18",
"10-15", "11-04", "12-03", "01-29", "01-30", "03-31", "04-29",
"05-28", "06-24", "07-29", "08-20", "09-18", "10-12", "11-10",
"12-29", "01-15", "03-02", "03-31", "04-27", "05-26", "06-24",
"07-28", "08-25", "09-28", "10-26", "11-18", "12-15", "01-10",
"02-01", "02-23", "04-12", "05-10", "06-12", "07-25", "09-12",
"11-21", "02-09", "03-27", "04-26", "07-13", "08-30", "09-18",
"10-30", "11-28", "12-19", "01-24", "02-28", "04-01", "04-28",
"06-11", "07-19", "08-23", "09-17", "10-17", "11-12", "12-10",
"02-14", "05-19", "06-19", "07-21", "07-22", "08-11", "09-21",
"10-22", "11-24", "12-20", "01-25", "03-25", "04-27", "05-24",
"06-18", "07-27", "08-19", "09-22", "10-15", "11-23", "12-16",
"01-31", "02-22", "03-30", "04-19", "05-18", "06-22", "07-26",
"08-17", "09-16", "10-22", "11-14", "12-13")), row.names = c(NA,
-122L), class = c("tbl_df", "tbl", "data.frame"))
Answer:
I'd suggest normalizing the dates to be in the same year so that you can use a normal date axis.
In this case I am going by the number of days elapsed in the year, so "March 1" will appear at a slightly different x location depending on whether it's a leap year or not. If that level of granularity matters to you, there are various other approaches to showing years on the same timeline, each with its own tradeoffs.
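library(ggplot2)
# Shift every date into the year 2000, keeping its day-of-year offset
df1$Date_norm <- as.Date("2000-01-01") +
  as.numeric(df1$Date - lubridate::floor_date(df1$Date, "year"))
# One line per year; label the x axis with abbreviated month names
df1 |>
  ggplot(aes(Date_norm, thecount, color = Year)) +
  geom_line() +
  scale_x_date(date_labels = "%b")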
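If you would rather have calendar dates line up exactly (so "March 1" sits at the same x position in every year), one of the other approaches is to build the date from the MD column instead. This is only a sketch, not the method used above: the column name Date_md is made up for illustration, and the small df1 below is just the first six rows of the question's data so the snippet runs on its own. It assumes MD is always a valid "mm-dd" string. Using a leap year such as 2000 as the placeholder keeps "02-29" valid.

```r
library(ggplot2)

# First six rows of the question's data, for illustration only
df1 <- data.frame(
  Year = c("2011", "2011", "2011", "2012", "2012", "2012"),
  MD = c("09-01", "10-27", "12-06", "02-22", "03-30", "04-11"),
  thecount = c(7L, 11L, 11L, 8L, 9L, 4L)
)

# Prefix a fixed leap year (an arbitrary placeholder) to the month-day
# string so every row becomes a real Date in the same year.
# Date_md is a hypothetical column name, not part of the original data.
df1$Date_md <- as.Date(paste0("2000-", df1$MD))

p <- ggplot(df1, aes(Date_md, thecount, color = Year)) +
  geom_line() +
  # One tick per month, labelled with the abbreviated month name
  scale_x_date(date_breaks = "1 month", date_labels = "%b")
p
```

The tradeoff runs the other way here: month and day match across years, but the number of days elapsed since January 1 is off by one after February for non-leap years.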