In the following dataframe for each household, individual combination if "X_1","Y_2" and "Z_3" all three variable > 0 then add new column "criteria" = "C1" else 0.
Only household 1001 - individual 1 fulfill this condition
I tried ifelse option first and then select_at but to no avail. Its throwing error
data %>% group_by(household,individual) %>%
mutate(criteria = ifelse(X_1 >0 & Y_2 >0 & X_3 >0,"C1",0))
# option_2
data %>% group_by(household,individual) %>%
select_at(vars(X_1 >0 & Y_2 >0 & Z_3 >0,"C1",0),all_vars(.>0)) %>%
mutate(criteria = "c1")
I also want to retain all other variables intact for household - individual combination like year, week, duration in the final dataframe which are not present in the group by.
Please suggest
sample dataset:
data <- data.frame(household=c(1001,1001,1001,1001,1001,1002,1002,1002,1003,1003,1003),
individual = c(1,1,1,1,1,2,2,2,1,1,1),
year = c(2021,2021,2022,2022,2022,2021,2022,2022,2022,2022,2022),
week =c("w51","w52","w1","w2","w4","w51","w1","w3","w1","w2","w3"),
duration =c(20,23,24,56,78,12,34,67,87,89,90),
X_1 = c(3,3,3,3,3,0,0,0,1,1,1),
Y_2 = c(2,2,2,2,2,1,1,1,0,0,0),
Z_3 = c(4,4,4,4,4,0,0,0,0,0,0))
CodePudding user response:
You coul use if_all()
, which is more efficient than rowwise c_across
.
data %>%
mutate(criteria = ifelse(if_all(X_1:Z_3, `>`, 0), "C1", "0"))
# household individual year week duration X_1 Y_2 Z_3 criteria
# 1 1001 1 2021 w51 20 3 2 4 C1
# 2 1001 1 2021 w52 23 3 2 4 C1
# 3 1001 1 2022 w1 24 3 2 4 C1
# 4 1001 1 2022 w2 56 3 2 4 C1
# 5 1001 1 2022 w4 78 3 2 4 C1
# 6 1002 2 2021 w51 12 0 1 0 0
# 7 1002 2 2022 w1 34 0 1 0 0
# 8 1002 2 2022 w3 67 0 1 0 0
# 9 1003 1 2022 w1 87 1 0 0 0
# 10 1003 1 2022 w2 89 1 0 0 0
# 11 1003 1 2022 w3 90 1 0 0 0
CodePudding user response:
You're doing a rowwise operation so we can call rowwise
and then do the ifelse
using the c_across
function. Calling ungroup
to get out of rowwise
library(dplyr)
data |>
rowwise() |>
mutate(criteria = ifelse(all(c_across(X_1:Z_3) > 0), "C1", "0")) |>
ungroup()
Or you can just do:
data$criteria = apply(subset(data, ,X_1:Z_3), 1, \(x) ifelse(all(x) > 0, "C1", "0"))
household individual year week duration X_1 Y_2 Z_3 criteria
<dbl> <dbl> <dbl> <chr> <dbl> <dbl> <dbl> <dbl> <chr>
1 1001 1 2021 w51 20 3 2 4 C1
2 1001 1 2021 w52 23 3 2 4 C1
3 1001 1 2022 w1 24 3 2 4 C1
4 1001 1 2022 w2 56 3 2 4 C1
5 1001 1 2022 w4 78 3 2 4 C1
6 1002 2 2021 w51 12 0 1 0 0
7 1002 2 2022 w1 34 0 1 0 0
8 1002 2 2022 w3 67 0 1 0 0
9 1003 1 2022 w1 87 1 0 0 0
10 1003 1 2022 w2 89 1 0 0 0
11 1003 1 2022 w3 90 1 0 0 0