I have a dataset with var1, var2, var3 and var4, and I am calculating a sum: var_total <- var1 + var2 + var3 + var4. I want var_total to be missing (NA) if any of var1, var2, var3 or var4 is missing.
Have:
var1 | var2 | var3 | var4 | var_total |
---|---|---|---|---|
1 | 0 | 0 | 0 | 1 |
1 | NA | 2 | 0 | 3 |
1 | 0 | 0 | NA | 1 |
Want:
var1 | var2 | var3 | var4 | var_total |
---|---|---|---|---|
1 | 0 | 0 | 0 | 1 |
1 | NA | 2 | 0 | NA |
1 | 0 | 0 | NA | NA |
I assume the solution involves something like ifelse().
CodePudding user response:
Libraries
library(dplyr)
Data
data <-
  tibble::tribble(
    ~var1, ~var2, ~var3, ~var4, ~var_total,
    1L, 0L, 0L, 0L, 1L,
    1L, NA, 2L, 0L, 3L,
    1L, 0L, 0L, NA, 1L
  )
Code
data %>%
  rowwise() %>%
  mutate(var_total = sum(c_across(cols = var1:var4), na.rm = FALSE))
Output
# A tibble: 3 x 5
# Rowwise:
var1 var2 var3 var4 var_total
<int> <int> <int> <int> <int>
1 1 0 0 0 1
2 1 NA 2 0 NA
3 1 0 0 NA NA
CodePudding user response:
Using rowSums() in base R:
data$var_total <- rowSums(data[ , 1:4])
Or with dplyr:
library(dplyr)
data %>%
  mutate(var_total = rowSums(across(var1:var4)))
Result from either approach:
var1 var2 var3 var4 var_total
1 1 0 0 0 1
2 1 NA 2 0 NA
3 1 0 0 NA NA
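Why this works, as a minimal base R sketch: rowSums() defaults to na.rm = FALSE, so a single NA in a row makes that row's sum NA. The data frame below rebuilds the example from the question. The names strict_total, lenient_total and ifelse_total are made up for illustration, and the ifelse() variant is only one possible way to write what the question guessed at.

```r
# Rebuild the example data from the question (base R only, no packages).
data <- data.frame(
  var1 = c(1L, 1L, 1L),
  var2 = c(0L, NA, 0L),
  var3 = c(0L, 2L, 0L),
  var4 = c(0L, 0L, NA)
)

# Default na.rm = FALSE: any NA in a row makes that row's sum NA.
strict_total <- unname(rowSums(data[, 1:4]))

# na.rm = TRUE drops NAs before summing, which reproduces the "Have" table.
lenient_total <- unname(rowSums(data[, 1:4], na.rm = TRUE))

# ifelse() variant: keep the sum only for rows without any missing value.
# complete.cases() is TRUE for rows that contain no NA.
ifelse_total <- ifelse(complete.cases(data[, 1:4]), lenient_total, NA)

print(strict_total)   # NA in rows 2 and 3
print(lenient_total)  # no NA in any row
print(ifelse_total)   # same NA pattern as strict_total
```

The ifelse() route gives the same NA pattern as the default rowSums() call, so the extra step is only needed if the sum was already computed with na.rm = TRUE.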