Creating a new variable adding 1 each time a condition is met in R

Time:08-24

I am working with a survey data set in which each observation (respondent) is represented by its own row. I want to create a new numeric variable that counts, per row, how many of the other variables meet a condition. More specifically, the data frame contains several numeric variables (var1, var2, var3 in the example below). Each time a value of those variables is >= 3 and not NA, the new variable (desiredvar) should increase by 1. As you can see in the example, desiredvar takes the value 2 for the first row, since var1 and var3 are both >= 3.

df1 <- data.frame(var1 = c(3, NA, 2, 1),
                  var2 = c(0, 0, 2, 1),
                  var3 = c(8, NA, 5, 6),
                  desiredvar = c(2, 0, 1, 1))

  var1 var2 var3 desiredvar
1    3    0    8          2
2   NA    0   NA          0
3    2    2    5          1
4    1    1    6          1

I assume this should be relatively easy to code with a for loop and/or apply, but I am not very experienced with R. I would appreciate any help!

Best, Carlo

CodePudding user response:

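You can use rowSums with na.rm = TRUE. Restricting the comparison to the value columns keeps an existing desiredvar column from being counted as well:

df1$desiredvar <- rowSums(df1[c("var1", "var2", "var3")] >= 3, na.rm = TRUE)

or with apply:

df1$desiredvar <- apply(df1[c("var1", "var2", "var3")] >= 3, 1, sum, na.rm = TRUE)

Both give:

  var1 var2 var3 desiredvar
1    3    0    8          2
2   NA    0   NA          0
3    2    2    5          1
4    1    1    6          1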
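
To see why this works, here is a minimal sketch that breaks the one-liner into steps and also writes the same count as the explicit for loop the question mentions. The data frame df and the names meets, count_vec and count_loop are made up for illustration; only var1 to var3 and the threshold 3 come from the question.

```r
# Toy data mirroring the question's example (object names are illustrative).
df <- data.frame(var1 = c(3, NA, 2, 1),
                 var2 = c(0, 0, 2, 1),
                 var3 = c(8, NA, 5, 6))

# Comparing a data frame with a number gives a logical matrix;
# any comparison involving NA stays NA.
meets <- df >= 3

# TRUE counts as 1 and FALSE as 0. na.rm = TRUE drops the NA cells
# instead of turning the whole row sum into NA.
count_vec <- rowSums(meets, na.rm = TRUE)

# The same count written as an explicit loop over the rows.
count_loop <- integer(nrow(df))
for (i in seq_len(nrow(df))) {
  count_loop[i] <- sum(df[i, ] >= 3, na.rm = TRUE)
}
```

Both vectors hold the same per-row counts. The vectorised rowSums form is usually preferable on large data, since it avoids iterating over rows in R code.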

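In dplyr, you could use the approaches from the answers above, or use rowwise and c_across:

library(dplyr)
df1 <- df1 %>%
  rowwise() %>%
  mutate(desiredvar = sum(c_across(var1:var3) >= 3, na.rm = TRUE)) %>%
  ungroup()  # drop the rowwise grouping so later steps are not run row by row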
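
As a possible alternative within dplyr, rowSums can be combined with across inside mutate, which stays vectorised and needs no rowwise grouping. This is a sketch assuming a dplyr version that provides across (1.0.0 or later); the names df, df_counted and n_met are hypothetical.

```r
library(dplyr)

# Toy data mirroring the question's example (object names are illustrative).
df <- data.frame(var1 = c(3, NA, 2, 1),
                 var2 = c(0, 0, 2, 1),
                 var3 = c(8, NA, 5, 6))

# across(var1:var3) selects the value columns as a data frame;
# comparing it with 3 gives a logical matrix that rowSums can count.
df_counted <- df %>%
  mutate(n_met = rowSums(across(var1:var3) >= 3, na.rm = TRUE))
```

Because only var1 to var3 are selected, other columns in the data frame do not affect the count.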