I am trying to created a weighted average for each week, across multiple columns. My data looks like this:
week <- c(1,1,1,2,2,3)
col_a <- c(1,2,2,4,2,7)
col_b <- c(4,2,3,1,2,5)
col_c <- c(4,2,3,2,2,4)
dfreprex <- data.frame(week,col_a,col_b,col_c)
week col_a col_b col_c
1 1 1 4 4
2 1 2 2 2
3 1 2 3 3
4 2 4 1 2
5 2 2 2 2
6 3 7 5 4
weightsreprex <- data.frame(county = c("col_a", "col_b", "col_c")
, weights = c(.3721, .3794, .2485))
How do I weight each column and then get the mean? Is there a simpler way than just multiplying each column by its weight in a new column (col_a_weighted) and then taking the rowmean of the weighted columns only?
Tried weighted.means, rowmeans, group_by and summarise
CodePudding user response:
We may use *
for matrix multiplication:
dfreprex$wtmean <- as.matrix(dfreprex[,-1]) %*% as.matrix(weightsreprex[, 2])
dfreprex
week col_a col_b col_c wtmean
1 1 1 4 4 2.8837
2 1 2 2 2 2.0000
3 1 2 3 3 2.6279
4 2 4 1 2 2.3648
5 2 2 2 2 2.0000
6 3 7 5 4 5.4957
We might also use crossprod
crossprod(t(as.matrix(dfreprex[,-1])), as.matrix(weightsreprex[, 2]))
CodePudding user response:
You can use stats::weighted.mean()
here:
library(tidyverse)
dfreprex <- structure(list(week = c(1, 1, 1, 2, 2, 3), col_a = c(1, 2, 2, 4, 2, 7), col_b = c(4, 2, 3, 1, 2, 5), col_c = c(4, 2, 3, 2, 2, 4)), class = "data.frame", row.names = c(NA, -6L))
weightsreprex <- data.frame(county = c("col_a", "col_b", "col_c"), weights = c(.3721, .3794, .2485))
dfreprex %>%
rowwise() %>%
mutate(wt_avg = weighted.mean(c(col_a, col_b, col_c), weightsreprex$weights))
#> # A tibble: 6 × 5
#> # Rowwise:
#> week col_a col_b col_c wt_avg
#> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 1 1 4 4 2.88
#> 2 1 2 2 2 2
#> 3 1 2 3 3 2.63
#> 4 2 4 1 2 2.36
#> 5 2 2 2 2 2
#> 6 3 7 5 4 5.50
Created on 2022-11-09 with reprex v2.0.2