Home > Net >  Check multiple columns for value in r after group by
Check multiple columns for value in r after group by

Time:09-01

In the following dataframe for each household, individual combination if "X_1","Y_2" and "Z_3" all three variable > 0 then add new column "criteria" = "C1" else 0.

Only household 1001 - individual 1 fulfill this condition

enter image description here

I tried ifelse option first and then select_at but to no avail. Its throwing error

 data %>% group_by(household,individual) %>%
 mutate(criteria = ifelse(X_1 >0 & Y_2 >0 & X_3 >0,"C1",0))

 # option_2
data %>% group_by(household,individual) %>%
select_at(vars(X_1 >0 & Y_2 >0 & Z_3 >0,"C1",0),all_vars(.>0)) %>%
 mutate(criteria = "c1")

I also want to retain all other variables intact for household - individual combination like year, week, duration in the final dataframe which are not present in the group by.

Please suggest

sample dataset:

   data <- data.frame(household=c(1001,1001,1001,1001,1001,1002,1002,1002,1003,1003,1003),
               individual = c(1,1,1,1,1,2,2,2,1,1,1),
               year = c(2021,2021,2022,2022,2022,2021,2022,2022,2022,2022,2022),
               week =c("w51","w52","w1","w2","w4","w51","w1","w3","w1","w2","w3"),
               duration =c(20,23,24,56,78,12,34,67,87,89,90),
               X_1 = c(3,3,3,3,3,0,0,0,1,1,1),
               Y_2 = c(2,2,2,2,2,1,1,1,0,0,0),
               Z_3 = c(4,4,4,4,4,0,0,0,0,0,0))

CodePudding user response:

You coul use if_all(), which is more efficient than rowwise c_across.

data %>%
  mutate(criteria = ifelse(if_all(X_1:Z_3, `>`, 0), "C1", "0"))

#    household individual year week duration X_1 Y_2 Z_3 criteria
# 1       1001          1 2021  w51       20   3   2   4       C1
# 2       1001          1 2021  w52       23   3   2   4       C1
# 3       1001          1 2022   w1       24   3   2   4       C1
# 4       1001          1 2022   w2       56   3   2   4       C1
# 5       1001          1 2022   w4       78   3   2   4       C1
# 6       1002          2 2021  w51       12   0   1   0        0
# 7       1002          2 2022   w1       34   0   1   0        0
# 8       1002          2 2022   w3       67   0   1   0        0
# 9       1003          1 2022   w1       87   1   0   0        0
# 10      1003          1 2022   w2       89   1   0   0        0
# 11      1003          1 2022   w3       90   1   0   0        0

CodePudding user response:

You're doing a rowwise operation so we can call rowwise and then do the ifelse using the c_across function. Calling ungroup to get out of rowwise

library(dplyr)

data |> 
  rowwise() |> 
  mutate(criteria = ifelse(all(c_across(X_1:Z_3) > 0), "C1", "0")) |> 
  ungroup()

Or you can just do:

data$criteria = apply(subset(data, ,X_1:Z_3), 1, \(x) ifelse(all(x) > 0, "C1", "0"))
  household individual  year week  duration   X_1   Y_2   Z_3 criteria
       <dbl>      <dbl> <dbl> <chr>    <dbl> <dbl> <dbl> <dbl> <chr>   
 1      1001          1  2021 w51         20     3     2     4 C1      
 2      1001          1  2021 w52         23     3     2     4 C1      
 3      1001          1  2022 w1          24     3     2     4 C1      
 4      1001          1  2022 w2          56     3     2     4 C1      
 5      1001          1  2022 w4          78     3     2     4 C1      
 6      1002          2  2021 w51         12     0     1     0 0       
 7      1002          2  2022 w1          34     0     1     0 0       
 8      1002          2  2022 w3          67     0     1     0 0       
 9      1003          1  2022 w1          87     1     0     0 0       
10      1003          1  2022 w2          89     1     0     0 0       
11      1003          1  2022 w3          90     1     0     0 0
  • Related