Create a new variable with an OR condition across multiple variables


I'm currently working with a dataset on gender-based violence. It has several dimensions, such as physical, sexual and psychological violence, and each dimension has a set of 6 to 10 indicators. I also have two time frames: within the last 12 months and before the last 12 months. Long story short, I'm working with at least 70 indicators.

I have to create a variable "some point in life" that indicates whether a woman suffered violence in any dimension, in any period of time. So, if a woman answers yes to any of the 70 indicators, she suffered violence at some point in her life.

My question is how I can create that new variable more quickly, because the only thing I can think of is something like

base <- base %>% mutate(newvariable = case_when(v1 == 1 | v2 == 1 | v3 == 1 | ... ~ 1))

But I have 70 variables. So, can I create this new variable with a single condition applied across a range of variables?

Answer:

You can do:


library(dplyr)

# Example data: three 0/1 indicator columns
mydf <- data.frame(a = sample(c(0,1), 10, replace = TRUE),
                   b = sample(c(0,1), 10, replace = TRUE),
                   c = sample(c(0,1), 10, replace = TRUE))

# Row by row, flag TRUE when at least one indicator equals 1
mydf |> 
  rowwise() |> 
  mutate(outcome = sum(c_across(everything())) >= 1) |> 
  ungroup()

# A tibble: 10 x 4
       a     b     c outcome
   <dbl> <dbl> <dbl> <lgl>  
 1     0     0     0 FALSE  
 2     1     0     1 TRUE   
 3     0     0     1 TRUE   
 4     1     0     0 TRUE   
 5     1     1     0 TRUE   
 6     1     0     1 TRUE   
 7     1     0     1 TRUE   
 8     0     1     1 TRUE   
 9     1     0     1 TRUE   
10     0     0     1 TRUE   
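
Since the question asks about a range of variables, here is a minimal sketch of a vectorized alternative that avoids rowwise() and selects the indicators with a column range. It assumes dplyr 1.0.4 or later (for if_any()) and that the indicators sit in adjacent columns. The data frame, the column names v1 to v5 and the new variable names are hypothetical stand-ins for the real 70 indicators.

```r
library(dplyr)

# Hypothetical stand-in for the survey data: v1..v5 play the role of the
# 70 indicator columns (1 = yes, 0 = no); id is not an indicator.
base <- data.frame(
  id = 1:4,
  v1 = c(0, 1, 0, 0),
  v2 = c(0, 0, 0, 1),
  v3 = c(0, 0, 0, 0),
  v4 = c(0, 1, 0, 0),
  v5 = c(0, 0, 0, NA)
)

base <- base %>%
  mutate(
    # TRUE if any column in the range v1:v5 equals 1 for that row
    some_point_in_life = if_any(v1:v5, ~ .x == 1),
    # Same idea as a 0/1 variable; na.rm = TRUE ignores missing answers
    some_point_in_life_01 =
      as.integer(rowSums(across(v1:v5) == 1, na.rm = TRUE) > 0)
  )
```

With the real data the range would be something like v1:v70, or a helper such as starts_with() if the indicator columns share a prefix. Note that if_any() returns NA for a row that has no 1 but at least one missing value, whereas the rowSums() version with na.rm = TRUE returns 0 for that row, so which one fits depends on how missing answers should be treated.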