I'm currently working with a dataset on gender-based violence. It has several dimensions, such as physical, sexual and psychological violence, each with a set of 6 to 10 indicators. I also have two time frames: within the last 12 months and before the last 12 months. Long story short, I'm working with at least 70 indicators.
I have to create a variable "some point in life" that indicates whether a woman suffered violence in any dimension, in any period of time. So, if a woman answered yes to any of the 70 indicators, she suffered violence at some point in her life.
My question is how I can create that new variable more quickly, because the only thing I can think of is something like
base <- base %>% mutate(newvariable = case_when(v1 == 1 | v2 == 1 | v3 == 1 | ... ~ 1))
But I have 70 variables, so writing out every condition is impractical. Can I create this new variable by applying one condition across a range of variables?
CodePudding user response:
You can work row by row with rowwise() and c_across():
library(dplyr)

# Example data: three 0/1 indicator columns
mydf <- data.frame(a = sample(c(0, 1), 10, replace = TRUE),
                   b = sample(c(0, 1), 10, replace = TRUE),
                   c = sample(c(0, 1), 10, replace = TRUE))

# Flag each row where at least one indicator equals 1
mydf |>
  rowwise() |>
  mutate(outcome = sum(c_across(everything())) >= 1) |>
  ungroup()
# A tibble: 10 x 4
       a     b     c outcome
   <dbl> <dbl> <dbl> <lgl>
 1     0     0     0 FALSE
 2     1     0     1 TRUE
 3     0     0     1 TRUE
 4     1     0     0 TRUE
 5     1     1     0 TRUE
 6     1     0     1 TRUE
 7     1     0     1 TRUE
 8     0     1     1 TRUE
 9     1     0     1 TRUE
10     0     0     1 TRUE
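Since the question asks about a range of variables, here is a minimal sketch of a vectorized alternative that avoids rowwise(). It assumes the indicators sit next to each other in the data frame and are coded 0/1. The column names v1 to v5, the id column and the name some_point_in_life are hypothetical stand-ins for the real 70 indicators.

```r
library(dplyr)

# Hypothetical data: v1..v5 stand in for the 70 indicator columns (coded 0/1)
base <- data.frame(
  id = 1:4,
  v1 = c(0, 1, 0, 0),
  v2 = c(0, 0, 0, 0),
  v3 = c(0, 0, 1, 0),
  v4 = c(0, 0, 0, 0),
  v5 = c(0, 1, 0, 1)
)

# dplyr: TRUE when any column in the range v1:v5 equals 1
base <- base %>%
  mutate(some_point_in_life = if_any(v1:v5, ~ .x == 1))

# Base R equivalent: count the 1s per row and check for at least one.
# na.rm = TRUE treats a missing answer as "not reported" (an assumption).
indicator_cols <- paste0("v", 1:5)
base$some_point_in_life_base <-
  unname(rowSums(base[indicator_cols] == 1, na.rm = TRUE) > 0)

print(base)
```

With the real data, the range v1:v5 would become something like v1:v70. If the indicator columns are not contiguous, a tidyselect helper such as starts_with() could be used inside if_any() instead, depending on how the columns are named.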