Home > OS >  Weighted rowmeans in R
Weighted rowmeans in R

Time:11-09

I am trying to created a weighted average for each week, across multiple columns. My data looks like this:

week <- c(1,1,1,2,2,3)
col_a <- c(1,2,2,4,2,7)
col_b <- c(4,2,3,1,2,5)
col_c <- c(4,2,3,2,2,4)
dfreprex <- data.frame(week,col_a,col_b,col_c)
  week col_a col_b col_c
1    1     1     4     4
2    1     2     2     2
3    1     2     3     3
4    2     4     1     2
5    2     2     2     2
6    3     7     5     4
weightsreprex <- data.frame(county = c("col_a", "col_b", "col_c")
                             , weights = c(.3721, .3794, .2485))

How do I weight each column and then get the mean? Is there a simpler way than just multiplying each column by its weight in a new column (col_a_weighted) and then taking the rowmean of the weighted columns only?

Tried weighted.means, rowmeans, group_by and summarise

CodePudding user response:

We may use * for matrix multiplication:

dfreprex$wtmean <- as.matrix(dfreprex[,-1]) %*% as.matrix(weightsreprex[, 2])
dfreprex

  week col_a col_b col_c wtmean
1    1     1     4     4 2.8837
2    1     2     2     2 2.0000
3    1     2     3     3 2.6279
4    2     4     1     2 2.3648
5    2     2     2     2 2.0000
6    3     7     5     4 5.4957

We might also use crossprod

crossprod(t(as.matrix(dfreprex[,-1])),  as.matrix(weightsreprex[, 2]))

CodePudding user response:

You can use stats::weighted.mean() here:

library(tidyverse)

dfreprex <- structure(list(week = c(1, 1, 1, 2, 2, 3), col_a = c(1, 2, 2, 4, 2, 7), col_b = c(4, 2, 3, 1, 2, 5), col_c = c(4, 2, 3, 2, 2, 4)), class = "data.frame", row.names = c(NA, -6L))

weightsreprex <- data.frame(county = c("col_a", "col_b", "col_c"), weights = c(.3721, .3794, .2485))

dfreprex %>%  
  rowwise() %>% 
  mutate(wt_avg = weighted.mean(c(col_a, col_b, col_c), weightsreprex$weights))
#> # A tibble: 6 × 5
#> # Rowwise: 
#>    week col_a col_b col_c wt_avg
#>   <dbl> <dbl> <dbl> <dbl>  <dbl>
#> 1     1     1     4     4   2.88
#> 2     1     2     2     2   2   
#> 3     1     2     3     3   2.63
#> 4     2     4     1     2   2.36
#> 5     2     2     2     2   2   
#> 6     3     7     5     4   5.50

Created on 2022-11-09 with reprex v2.0.2

  • Related