I have a dataframe with 4 columns: week_no (week of the year), date (the Friday date for that week), year, and sales.
I want to plot a line graph showing sales for the winter period by week. The winter period runs from week 50 to week 4. Each winter season should have its own line and colour.
Sample dataframe code
library(ggplot2)
df <- data.frame(
week_no = c("50","51","52","1","2","3","4","50","51","52","1","2","3","4"),
date = c("2018-12-14", "2018-12-21", "2018-12-28", "2019-01-04", "2019-01-11", "2019-01-18", "2019-01-25", "2019-12-13", "2019-12-20", "2019-12-27", "2020-01-03", "2020-01-10", "2020-01-17", "2020-01-24"),
year = c("2018", "2018", "2018", "2019", "2019", "2019", "2019", "2019", "2019", "2019", "2020", "2020", "2020", "2020"),
sales = c(546,873,532,424,235,321,531,865,869,458,234,654,345,984))
Sample dataframe
week_no | date | year | sales |
---|---|---|---|
50 | 2018-12-14 | 2018 | 546 |
51 | 2018-12-21 | 2018 | 873 |
52 | 2018-12-28 | 2018 | 532 |
1 | 2019-01-04 | 2019 | 424 |
2 | 2019-01-11 | 2019 | 235 |
3 | 2019-01-18 | 2019 | 321 |
4 | 2019-01-25 | 2019 | 531 |
50 | 2019-12-13 | 2019 | 865 |
51 | 2019-12-20 | 2019 | 869 |
52 | 2019-12-27 | 2019 | 458 |
1 | 2020-01-03 | 2020 | 234 |
2 | 2020-01-10 | 2020 | 654 |
3 | 2020-01-17 | 2020 | 345 |
4 | 2020-01-24 | 2020 | 984 |
I’ve tried the code below and it gives a result close to what I need, except that the lines are split by calendar year rather than by winter season, and the weeks are ordered continuously from 1 to 52.
I would like the x-axis to be ordered as 50, 51, 52, 1, 2, 3, 4, and each line to follow the data in date order for one winter season rather than for one year.
Code
winter_line_plot <- ggplot(df, aes(x = week_no, y = sales)) +
  geom_line(aes(color = year))
I tried using the dates for x-axis but my result showed one continuous line rather than separate colour/lines by the different winter seasons.
Answer:
You need to map the season (rather than the calendar year) to the colour and group aesthetics, keep week discrete, and set the scale limits to get the axis order you want. To get the season, your idea in the comment of creating a grouping variable for each season is certainly the most straightforward.
library(tidyverse)
df <- data.frame(
week_no = c("50","51","52","1","2","3","4","50","51","52","1","2","3","4"),
date = c("2018-12-14", "2018-12-21", "2018-12-28", "2019-01-04", "2019-01-11", "2019-01-18", "2019-01-25", "2019-12-13", "2019-12-20", "2019-12-27", "2020-01-03", "2020-01-10", "2020-01-17", "2020-01-24"),
year = c("2018", "2018", "2018", "2019", "2019", "2019", "2019", "2019", "2019", "2019", "2020", "2020", "2020", "2020"),
sales = c(546,873,532,424,235,321,531,865,869,458,234,654,345,984))
## if your data is indeed very regular and always contains exactly one row per week,
## then you can easily create the groups with rep
## (make sure the data frame is sorted by year and week first)
df %>%
  arrange(year, week_no) %>%
  mutate(season = rep(c("winter_1819", "winter_1920"), each = 7)) %>%
  ggplot() +
  ## week is discrete, so you need to specify group
  geom_line(aes(week_no, sales, color = season, group = season)) +
  ## set the axis order (and breaks) explicitly
  scale_x_discrete(limits = as.character(c(50:52, 1:4)))
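If the data is not perfectly regular (a missing week, or more than two winters), the rep() approach breaks. A rough sketch of an alternative is to derive the season from year and week_no directly. This is only one way to do it, and it assumes that weeks 1 to 4 always belong to the winter that started in the previous calendar year; the start_year helper and the "winter_YYYY" label format are just illustrative choices.

```r
# Same sample data as in the question
df <- data.frame(
  week_no = c("50","51","52","1","2","3","4","50","51","52","1","2","3","4"),
  date = c("2018-12-14", "2018-12-21", "2018-12-28", "2019-01-04", "2019-01-11",
           "2019-01-18", "2019-01-25", "2019-12-13", "2019-12-20", "2019-12-27",
           "2020-01-03", "2020-01-10", "2020-01-17", "2020-01-24"),
  year = c("2018", "2018", "2018", "2019", "2019", "2019", "2019",
           "2019", "2019", "2019", "2020", "2020", "2020", "2020"),
  sales = c(546, 873, 532, 424, 235, 321, 531, 865, 869, 458, 234, 654, 345, 984)
)

# Assumption: weeks 50-52 belong to the winter starting in `year`,
# weeks 1-4 belong to the winter that started the year before.
week <- as.integer(df$week_no)
start_year <- as.integer(df$year) - ifelse(week <= 4L, 1L, 0L)

# Build a label such as "winter_1819" from the two-digit start and end years
df$season <- sprintf("winter_%02d%02d",
                     start_year %% 100L,
                     (start_year + 1L) %% 100L)

# Fix the week order through factor levels, so the axis runs 50, 51, 52, 1, 2, 3, 4
df$week_no <- factor(df$week_no, levels = as.character(c(50:52, 1:4)))

# Quick check of how many rows landed in each season
table(df$season)
```

With season computed this way, the same geom_line(aes(week_no, sales, color = season, group = season)) call should work unchanged, and because week_no is now an ordered factor, the scale_x_discrete(limits = ...) line becomes optional.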