I have to create a weighted sum score (WSum) based on several variables. For instance, mydata
has three variables (A, B, and C). I have to create a 'WSum' score in which I multiple each of these variables by the weights present in the weights
dataset, and sum all of them (A*0.5 B*1.2 C*2)
. I have to do all of that without altering other variables in the dataset, for instance, in this case, the variable ID.
PS: I can definitely do it manually mutate(Wsum = (A*0.5 B*1.2 C*2))
. But I would like to automatize this process, because, as you might guess, I don't have only 3 variables, and that will not be a one-time job. Thanks in advance.
Example data
Packages required
library(dplyr)
Example Data
mydata <- tibble(ID = (01, 02
A = c(1, 2, 3),
B = c(2, 5, 4),
C = c(3, 6, 1))
> # A tibble: 3 x 4
> ID A B C
> <dbl> <dbl> <dbl> <dbl>
>1 1 1 2 3
>2 2 2 5 6
>3 3 3 4 1
Data Weight Dictionary
weights <- tibble( variables = c("A", "B", "C"),
w = c(0.5, 1.2, 2))
># A tibble: 3 x 2
> variables w
> <chr> <dbl>
>1 A 0.5
>2 B 1.2
>3 C 2
Expected result
># A tibble: 3 x 5
> ID A B C Wsum
> <dbl> <dbl> <dbl> <dbl> <dbl>
>1 1 1 2 3 8.9
>2 2 2 5 6 19
>3 3 3 4 1 8.3
CodePudding user response:
In base R
, use %*%
or crossprod
after converting to matrix
mydata$Wsum <- as.vector(as.matrix(mydata[weights$variables]) %*%
with(weights, setNames(w, variables)))
-output
mydata$Wsum
[1] 8.9 19.0 8.3
Or using tidyverse
, loop across
the columns in the 'mydata', get the corresponding value from deframe
d (converted to named vector) 'weights', multiply and get the rowSums
to create the 'Wsum' column
library(dplyr)
library(tibble)
mydata <- mydata %>%
mutate(Wsum = rowSums(across(all_of(weights$variables),
~ .x * deframe(weights)[[cur_column()]])))
mydata
# A tibble: 3 × 4
A B C Wsum
<dbl> <dbl> <dbl> <dbl>
1 1 2 3 8.9
2 2 5 6 19
3 3 4 1 8.3