I'm looking for a dynamic way to specify some "condition parameters" and then feed them to a case_when
operation (or something else if better suited to the problem).
My goal is to separate the specification of the conditions from the case_when call, e.g. so that a user can type the conditions into a txt file or a list in R, and I would then take that information and feed it to case_when (or any other function if more appropriate).
So, assuming the following data, where I want to create an additional variable that recodes x, I could do:
library(dplyr)
df <- data.frame(x = 1:10)
df |>
  mutate(x2 = case_when(x < 4 ~ 1,
                        x >= 4 & x <= 7 ~ 2,
                        TRUE ~ 3))
Now, what I want to achieve is to make this code flexible in a way that I can externally specify the case_when conditions and then do the recoding.
E.g. it might look like:
all_conditions <- list(1 = "x < 2",
2 = "x >= 2 & x < 5",
3 = "x >= 5 & x < 9",
4 = "TRUE")
And then I could do some sort of:
df |>
mutate(x2 = do(case_when, all_conditions))
EDIT: while the example shows a numeric variable, for which the solution of @Mael works, the solution should also work for character variables, where the condition might look like 'x == "abc" | x == "def"'.
CodePudding user response:
A possible solution, based on rlang, is below.
EXPLANATION
First, we need to create a string with the whole code for case_when, using the list all_conditions; that is what my imap call does. Second, using rlang::parse_quo, we transform the string into an expression to be used inside mutate.
Remark
The names of the elements of the list all_conditions have to be enclosed in backticks, because bare numbers are not syntactic names in R.
library(tidyverse)
library(rlang)
df <- data.frame(x = 1:10)
all_conditions <- list(`1` = "x < 2",
                       `2` = "x >= 2 & x < 5",
                       `3` = "x >= 5 & x < 9",
                       `4` = "TRUE")
# Build the string "case_when(x < 2 ~ 1, x >= 2 & x < 5 ~ 2, ...)"
code <- imap(all_conditions, ~ str_c(.x, " ~ ", .y)) %>%
  str_c(collapse = ", ") %>%
  str_c("case_when(", ., ")")
# Parse the string and inject the result into mutate()
df %>%
  mutate(x2 = !!parse_quo(code, env = caller_env()))
#>     x x2
#> 1   1  1
#> 2   2  2
#> 3   3  2
#> 4   4  2
#> 5   5  3
#> 6   6  3
#> 7   7  3
#> 8   8  3
#> 9   9  4
#> 10 10  4
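For the character case mentioned in the question's EDIT, a variant of the same idea may be easier to handle than pasting one long string. The sketch below is an assumption on my part rather than part of the answer above: it parses each condition separately with rlang::parse_expr, pairs it with its label using rlang::call2, and splices the resulting calls into case_when with !!!. The data and the labels first, second and other are hypothetical. Because the labels are passed as values and not pasted into code, they need no extra quoting.

```r
library(dplyr)
library(rlang)

# Hypothetical character data
df <- data.frame(x = c("abc", "def", "ghi", "xyz"))

# Hypothetical specification: names are the output labels,
# values are the conditions written as strings
all_conditions <- list(
  first  = 'x == "abc" | x == "def"',
  second = 'x == "ghi"',
  other  = "TRUE"
)

# Build one unevaluated `condition ~ label` call per list element
cases <- unname(Map(
  function(cond, label) call2("~", parse_expr(cond), label),
  all_conditions,
  names(all_conditions)
))

# Splice the calls into case_when(); they are evaluated inside mutate(),
# so `x` refers to the column of df
result <- df |>
  mutate(x2 = case_when(!!!cases))

result
```

As with any approach that parses user-supplied strings, the conditions are executed as R code, so this is only appropriate when the source of the conditions is trusted.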
CodePudding user response:
In this specific case, one way to do it would be to use cut:
df$x2 <- cut(df$x, breaks = c(-Inf, 2, 5, 9, Inf), labels = 1:4)
output
df
    x x2
1   1  1
2   2  1
3   3  2
4   4  2
5   5  2
6   6  3
7   7  3
8   8  3
9   9  3
10 10  4
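Note that cut uses right-closed intervals by default, so the call above puts x = 2 into group 1, whereas the condition "x < 2" from the question would put it into group 2. A minimal sketch, assuming the goal is to reproduce the question's conditions exactly, passes right = FALSE so that the intervals are closed on the left instead:

```r
df <- data.frame(x = 1:10)

# right = FALSE gives the intervals [-Inf, 2), [2, 5), [5, 9), [9, Inf),
# which correspond to x < 2, x >= 2 & x < 5, x >= 5 & x < 9, and the rest
df$x2 <- cut(df$x,
             breaks = c(-Inf, 2, 5, 9, Inf),
             labels = 1:4,
             right = FALSE)

df
```

The new column is a factor; as.integer(as.character(df$x2)) converts it to numbers if needed.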