Plot lines according to "season" based on years and week number given

Time:06-15

I have a dataframe with 4 columns. This includes week_no (week of the year), date (the Friday date for that week), year, sales.

I want to plot a line graph showing sales for the winter period by week. The winter period runs from week 50 to week 4 of the following year. Each winter season should have its own line and colour.

Sample dataframe code


library(ggplot2)

df <- data.frame(
  week_no = c("50","51","52","1","2","3","4","50","51","52","1","2","3","4"),
  date = c("2018-12-14", "2018-12-21", "2018-12-28", "2019-01-04", "2019-01-11", "2019-01-18", "2019-01-25", "2019-12-13", "2019-12-20", "2019-12-27", "2020-01-03", "2020-01-10", "2020-01-17", "2020-01-24"),
  year = c("2018", "2018", "2018", "2019", "2019", "2019", "2019", "2019", "2019", "2019", "2020", "2020", "2020", "2020"),
  sales = c(546,873,532,424,235,321,531,865,869,458,234,654,345,984))

Sample dataframe

week_no date year sales
50 2018-12-14 2018 546
51 2018-12-21 2018 873
52 2018-12-28 2018 532
1 2019-01-04 2019 424
2 2019-01-11 2019 235
3 2019-01-18 2019 321
4 2019-01-25 2019 531
50 2019-12-13 2019 865
51 2019-12-20 2019 869
52 2019-12-27 2019 458
1 2020-01-03 2020 234
2 2020-01-10 2020 654
3 2020-01-17 2020 345
4 2020-01-24 2020 984

I’ve tried the code below. It gives a result close to what I need, but the years don’t match up correctly and the weeks run continuously from 1 to 52.

I want the x-axis to be ordered as 50, 51, 52, 1, 2, 3, 4, with each line following date order within its winter season rather than being split by calendar year.

Code

winter_line_plot <- ggplot(df, aes(x = week_no, y = sales)) +
  geom_line(aes(color = year))

I tried using the dates for x-axis but my result showed one continuous line rather than separate colour/lines by the different winter seasons.

CodePudding user response:

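You need to map a grouping variable to both the colour and group aesthetics. Calendar year alone cannot separate the seasons, because each winter spans two years. Keep the week discrete, and set the scale limits to fix the axis order. To get the season, your idea in the comments of creating a grouping variable for each season is certainly the most straightforward.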

library(tidyverse)

df <- data.frame(
  week_no = c("50","51","52","1","2","3","4","50","51","52","1","2","3","4"),
  date = c("2018-12-14", "2018-12-21", "2018-12-28", "2019-01-04", "2019-01-11", "2019-01-18", "2019-01-25", "2019-12-13", "2019-12-20", "2019-12-27", "2020-01-03", "2020-01-10", "2020-01-17", "2020-01-24"),
  year = c("2018", "2018", "2018", "2019", "2019", "2019", "2019", "2019", "2019", "2019", "2020", "2020", "2020", "2020"),
  sales = c(546,873,532,424,235,321,531,865,869,458,234,654,345,984))

## if your data is indeed very regular and always contains exactly one row per week,
## then you can easily create the groups with rep
## (make sure the data frame is sorted by year and numeric week first)
df %>%
  arrange(year, as.integer(week_no)) %>%
  mutate(season = rep(c("winter_1819", "winter_1920"), each = 7)) %>%
  ggplot() +
  ## week is discrete, so you need to specify group
  geom_line(aes(week_no, sales, color = season, group = season)) +
  ## set the axis order: weeks 50-52 first, then weeks 1-4
  scale_x_discrete(limits = as.character(c(50:52, 1:4)))
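
If the data is not perfectly regular (a missing week, or more than two winters), rep() will mislabel rows. A minimal sketch of an alternative is to derive the season from each row's own week number and date. It assumes that weeks 50 and above belong to the winter starting in that calendar year, and that weeks 1-4 belong to the winter that started the previous year. The names start_year and cal_year are illustrative and not part of the original question.

```r
library(ggplot2)

df <- data.frame(
  week_no = c("50","51","52","1","2","3","4","50","51","52","1","2","3","4"),
  date = c("2018-12-14", "2018-12-21", "2018-12-28", "2019-01-04", "2019-01-11",
           "2019-01-18", "2019-01-25", "2019-12-13", "2019-12-20", "2019-12-27",
           "2020-01-03", "2020-01-10", "2020-01-17", "2020-01-24"),
  sales = c(546,873,532,424,235,321,531,865,869,458,234,654,345,984))

## calendar year taken from the date, week number as an integer
cal_year <- as.integer(format(as.Date(df$date), "%Y"))
week <- as.integer(df$week_no)

## assumption: weeks >= 50 start a winter, weeks 1-4 finish the previous one
start_year <- ifelse(week >= 50L, cal_year, cal_year - 1L)

## label such as "winter_1819", built from the two-digit start and end years
df$season <- sprintf("winter_%02d%02d", start_year %% 100L, (start_year + 1L) %% 100L)

## a factor with explicit levels fixes the x-axis order to 50, 51, 52, 1, 2, 3, 4
df$week_no <- factor(df$week_no, levels = as.character(c(50:52, 1:4)))

p <- ggplot(df, aes(week_no, sales, colour = season, group = season)) +
  geom_line()
p
```

Because the week order lives in the factor levels here, scale_x_discrete(limits = ...) is not needed, and the rows do not have to be sorted before the season is assigned.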
