Use case_when() in R with multiple conditional rules and multiple columns

Time:09-30

I need to create a new column (inside_class) that classifies the rows of a data.frame according to specific rules, using two columns as a reference.

I have a column with several parameters (parameter) and another with values (value).

The rules are:

If parameter is pH and value is >= 6 and <= 9, then inside_class = yes; if not, then inside_class = no

If parameter is DO and value is >= 5.0, then inside_class = yes

I tried the code below, but some pH values are not classified according to the rule.

Data (dput output):

df<-structure(list(Estacao2 = c("1", "1", "1", "1", "1", "1", "1", 
"1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", 
"1", "1", "1", "1", "1", "1", "1", "1", "10", "10"), parameter = c("pH", 
"DO", "pH", "DO", "pH", "DO", "pH", "DO", "pH", "DO", "pH", "DO", 
"pH", "DO", "pH", "DO", "pH", "DO", "pH", "DO", "pH", "DO", "pH", 
"DO", "pH", "DO", "pH", "DO", "pH", "DO"), value = c(4.475, 7.2, 
5.65, 5.15, 6.65, 6.425, 6.4, 6.56, 6.05, 5.533, 5.75, 5.825, 
5.625, 6.25, 5.833, 6.2, 5.35, 4.3, 5.867, 5.8, 5.375, 7.4, 5.6, 
6.45, 5.55, 6.625, 6.033, 7.667, 7.438, 7.312)), class = c("grouped_df", 
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -30L), groups = structure(list(
    Estacao2 = c("1", "10"), .rows = structure(list(1:28, 29:30), ptype = integer(0), class = c("vctrs_list_of", 
    "vctrs_vctr", "list"))), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -2L), .drop = TRUE))

code:

df2<-df%>%mutate(inside_class = case_when(
    (parameter=='pH'& value %in% c(6.00:9.00) ~ 'yes'),
    (parameter=='DO' & value>=5.0 ~'yes'),
    TRUE~'no'
  ))


CodePudding user response:

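Note that 6.00:9.00 produces the integer sequence 6, 7, 8, 9, the same as c(6:9). As a result, value %in% c(6.00:9.00) is TRUE only when value is exactly 6, 7, 8 or 9, so a pH such as 6.65 falls through to 'no'. To test the whole range, you could use dplyr::between(), which is inclusive at both ends (equivalent to value >= 6 & value <= 9), like this: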

library(dplyr)

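df %>% 
  mutate(inside_class = case_when(
    # pH is inside the class when 6 <= value <= 9
    parameter == 'pH' & between(value, 6, 9) ~ 'yes',
    # DO is inside the class when value >= 5
    parameter == 'DO' & value >= 5.0 ~ 'yes',
    # everything else is outside the class
    TRUE ~ 'no'
  ))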

# A tibble: 30 × 4
# Groups:   Estacao2 [2]
   Estacao2 parameter value inside_class
   <chr>    <chr>     <dbl> <chr>       
 1 1        pH         4.47 no          
 2 1        DO         7.2  yes         
 3 1        pH         5.65 no          
 4 1        DO         5.15 yes         
 5 1        pH         6.65 yes         
 6 1        DO         6.42 yes         
 7 1        pH         6.4  yes         
 8 1        DO         6.56 yes         
 9 1        pH         6.05 yes         
10 1        DO         5.53 yes         
# … with 20 more rows
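
To see why the original %in% test misses most pH values, here is a minimal base R sketch. The vector ph_values is a hypothetical sample made up for illustration (a few pH readings from the data plus one exact integer), and the variable names are assumptions, not part of the question's code.

```r
# 6.00:9.00 is the same sequence as 6:9, i.e. 6, 7, 8, 9
ph_range <- 6.00:9.00
print(ph_range)

# hypothetical sample: some pH readings from the data, plus an exact 7
ph_values <- c(4.475, 6.65, 6.4, 6.05, 7)

# %in% checks exact membership, so only values equal to 6, 7, 8 or 9 match
in_sequence <- ph_values %in% ph_range
print(in_sequence)

# explicit comparisons test the whole closed interval [6, 9]
in_interval <- ph_values >= 6 & ph_values <= 9
print(in_interval)
```

Only the exact 7 passes the %in% test, while every reading from 6 to 9 passes the interval test. The explicit comparison gives the same result as between(value, 6, 9) and can be used in case_when() if you prefer not to rely on between().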