Given I have the following data table:
total_sales tickets_sold show_time occu_perc ticket_price DATE
1: 3900000 26 4 4.26 150000 2018-05-05
2: 3360000 42 5 8.08 80000 2018-05-05
3: 2560000 32 4 20.00 80000 2018-05-05
4: 1200000 12 1 11.01 100000 2018-05-05
5: 1200000 15 3 16.67 80000 2018-05-05
6: 1050000 7 3 0.98 150000 2018-05-05
The data entails daily sales, the volume in sold tickets etc. I am intending to transform all but the DATE variable into a time series for multivariate analysis. However, I would like to convert the entries into aggregate monthly means, e.g., average monthly show times. How would I do that?
CodePudding user response:
From the DATE
you can extract year and month value and aggregate the variables for each month and return the mean
.
library(dplyr)
df %>%
mutate(DATE = as.Date(DATE),
ym = format(DATE, '%Y-%m')) %>%
group_by(ym) %>%
summarise(across(total_sales:ticket_price, mean, na.rm = TRUE))