Plotting monthly average over time of a column grouped by another column - RStudio

Time:02-13

I have a dataframe df_have that looks like this:

  Arrival_Date     Cust_ID      Wait_Time_Mins    Cust_Priority 
  <chr>              <int>          <int>           <int> 
1 1/01/2010         612345            114               1 
2 1/01/2010         415911            146               4 
3 1/01/2010         445132             13               2 
4 1/01/2010         515619             72               3 
5 1/01/2010         725521            155               4 
6 1/01/2010         401404            100               5 
    ...               ...              ...             ...   

I want to create five line graphs, one for each unique value of Cust_Priority, overlaid on the same plot, showing the average Wait_Time_Mins by Cust_Priority by month.

How would I do this?

I know how

CodePudding user response:

You can use floor_date to round each date down to the first day of its month. Then, for each Cust_Priority in each month, compute the average wait time and draw a line plot.

We use scale_x_date to format the labels on the x-axis.

library(dplyr)
library(lubridate)
library(ggplot2)

df_have %>%
  # If the dates are in month/day/year format, use mdy() instead of dmy()
  mutate(Arrival_Date = dmy(Arrival_Date), 
         date = floor_date(Arrival_Date, 'month')) %>%
  # One row per priority per month, holding the mean wait time
  group_by(Cust_Priority, date) %>%
  summarise(Wait_Time_Mins = mean(Wait_Time_Mins), .groups = 'drop') %>%
  # ggplot layers are combined with +, not %>%
  ggplot(aes(date, Wait_Time_Mins, color = factor(Cust_Priority), 
             group = Cust_Priority)) +
  geom_line() +
  labs(x = "Month", y = "Average wait time", 
       title = "Average wait time for each month", color = "Customer Priority") +
  scale_x_date(date_labels = '%b - %Y', date_breaks = '1 month')
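If you want to check the approach before running it on your real data, here is a minimal reproducible sketch. The rows below are made-up sample values in the same shape as the question's data frame (they are an assumption, not the real data), and the dates are assumed to be day/month/year. The summary is stored in a hypothetical intermediate object, monthly_avg, so you can inspect the numbers before plotting.

```r
library(dplyr)
library(lubridate)
library(ggplot2)

# Hypothetical sample data: two priorities across two months
df_have <- data.frame(
  Arrival_Date   = c("1/01/2010", "15/01/2010", "20/01/2010",
                     "3/02/2010", "18/02/2010", "25/02/2010"),
  Cust_ID        = c(612345L, 415911L, 445132L, 515619L, 725521L, 401404L),
  Wait_Time_Mins = c(114L, 146L, 13L, 72L, 155L, 100L),
  Cust_Priority  = c(1L, 1L, 2L, 1L, 2L, 2L),
  stringsAsFactors = FALSE
)

# Round each date down to the first of its month, then average per group
monthly_avg <- df_have %>%
  mutate(Arrival_Date = dmy(Arrival_Date),
         date = floor_date(Arrival_Date, "month")) %>%
  group_by(Cust_Priority, date) %>%
  summarise(Wait_Time_Mins = mean(Wait_Time_Mins), .groups = "drop")

# Priority 1 in January 2010 averages (114 + 146) / 2 = 130
print(monthly_avg)

# One line per priority; points are added so single-month groups stay visible
p <- ggplot(monthly_avg,
            aes(date, Wait_Time_Mins, color = factor(Cust_Priority),
                group = Cust_Priority)) +
  geom_line() +
  geom_point() +
  labs(x = "Month", y = "Average wait time",
       title = "Average wait time for each month",
       color = "Customer Priority") +
  scale_x_date(date_labels = "%b - %Y", date_breaks = "1 month")

print(p)
```

Keeping the summary in its own object makes it easy to confirm that each Cust_Priority has exactly one row per month before the plot is drawn.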