I want to take the average of each column (except the date) after every seven rows. I tried the approach below, but I was getting incorrect values. This method also seems really long. Is there a way to shorten it?
bankamerica = read.csv('https://raw.githubusercontent.com/bandcar/Examples/main/bankamerica.csv')
library(tidyverse)
GroupLabels <- 0:(nrow(bankamerica) - 1)%/% 7
bankamerica$Group <- GroupLabels
Avgs <- bankamerica %>%
group_by(bankamerica$Group) %>%
summarize(Avg = mean(bankamerica$tr))
EDITED: Just realized this code provides the incorrect values
CodePudding user response:
I think you're on the right path.
bankamerica %>%
mutate(group = cumsum(row_number() %% 7 == 1)) %>%
group_by(group) %>%
summarise(caldate = first(caldate), across(-caldate, mean)) %>%
select(-group)
## A tibble: 144 × 3
# caldate tr var
# <chr> <dbl> <dbl>
# 1 1/2/01 28.9 -50.6
# 2 1/11/01 23.6 -45.4
# 3 1/23/01 20.9 -45
# 4 2/1/01 17.4 -48
# 5 2/12/01 14.4 -48
# 6 2/21/01 17 -48.9
# 7 3/2/01 19.1 -56
# 8 3/13/01 19.4 -56.9
# 9 3/22/01 23.3 -55.7
#10 4/2/01 7.71 -58.3
This averages every 7 rows not every 7 days, because there are missing days in the data.