I am trying to create a rolling 4 week average of the Actuals column and grouped on the State.
So starting at the bottom, the 4 week average for TX in the final row would be, (1 0 (no sales in week of may22) 1 3) / 4 (number of weeks).
Week State Actuals
4/24/2022 CA 1
5/1/2022 CA 3
5/8/2022 CA 34
5/8/2022 NV 2
5/8/2022 AZ 1
5/8/2022 TX 3
5/15/2022 CA 27
5/15/2022 TX 1
5/22/2022 CA 15
5/29/2022 TX 1
5/29/2022 CA 24
I like R more than Power BI, but either would be great
CodePudding user response:
Use complete
to include the weeks missing filling them with 0
library(tidyverse)
df %>%
mutate(Week = as.Date(Week, "%m/%d/%Y")) %>%
complete(Week, State, fill = list(Actuals = 0)) %>%
group_by(State) %>%
summarise(Results = zoo::rollmean(Actuals, 4), .groups = 'drop')
# A tibble: 12 x 2
State Results
<chr> <dbl>
1 AZ 0.25
2 AZ 0.25
3 AZ 0.25
4 CA 16.2
5 CA 19.8
6 CA 25
7 NV 0.5
8 NV 0.5
9 NV 0.5
10 TX 1
11 TX 1
12 TX 1.25
Edit:
If you want to maintain the week use stats::filter
:
df %>%
mutate(Week = as.Date(Week, "%m/%d/%Y")) %>%
complete(State, Week, fill = list(Actuals = 0)) %>%
group_by(State) %>%
mutate(Results = stats::filter(Actuals, rep(1,4)/4, sides = 1))
State Week Actuals Results
1 AZ 2022-04-24 0 NA
2 AZ 2022-05-01 0 NA
3 AZ 2022-05-08 1 NA
4 AZ 2022-05-15 0 0.25
5 AZ 2022-05-22 0 0.25
6 AZ 2022-05-29 0 0.25
7 CA 2022-04-24 1 NA
8 CA 2022-05-01 3 NA
9 CA 2022-05-08 34 NA
10 CA 2022-05-15 27 16.25
11 CA 2022-05-22 15 19.75
12 CA 2022-05-29 24 25.00
13 NV 2022-04-24 0 NA
14 NV 2022-05-01 0 NA
15 NV 2022-05-08 2 NA
16 NV 2022-05-15 0 0.50
17 NV 2022-05-22 0 0.50
18 NV 2022-05-29 0 0.50
19 TX 2022-04-24 0 NA
20 TX 2022-05-01 0 NA
21 TX 2022-05-08 3 NA
22 TX 2022-05-15 1 1.00
23 TX 2022-05-22 0 1.00
24 TX 2022-05-29 1 1.25
CodePudding user response:
Using your data :
x <- "
Week State Actuals
4/24/2022 CA 1
5/1/2022 CA 3
5/8/2022 CA 34
5/8/2022 NV 2
5/8/2022 AZ 1
5/8/2022 TX 3
5/15/2022 CA 27
5/15/2022 TX 1
5/22/2022 CA 15
5/29/2022 TX 1
5/29/2022 CA 24
"
df <- read.table(textConnection(x) , header = TRUE)
df
calculating averages every 4 weeks from the beginning
#===================================
library(dplyr , warn.conflicts = FALSE)
df$Week <- as.Date(df$Week , format = "%m/%d/%y")
uw <- unique(df$Week)
weeks <- list()
for(i in 1:ceiling(length(uw)/4)){
s <- i*4 - 3
weeks[[i]] <- df %>% filter( Week %in% uw[s:(s 3)]) %>% group_by(State) %>%
summarise(Avg = format(round(mean(Actuals) , 3) , 3))
}
#===================================
the first 4 weeks :
weeks[[1]]
#> # A tibble: 4 × 2
#> State Avg
#> <chr> <dbl>
#> 1 AZ 1
#> 2 CA 16.25
#> 3 NV 2
#> 4 TX 2
the second 4 weeks :
weeks[[2]]
#> # A tibble: 2 × 2
#> State Avg
#> <chr> <dbl>
#> 1 CA 19.5
#> 2 TX 1
Created on 2022-06-08 by the reprex package (v2.0.1)