Using dynamic variable in right side of the dplyr formula

Time:10-20

Suppose I wish to relevel Species in the iris dataset so that the reference level becomes "virginica":

library(dplyr)

want_iris <- iris %>%
  mutate(Species = relevel(factor(Species), ref = "virginica")) 
want_iris$Species
...
[141] virginica  virginica  virginica  virginica  virginica  virginica  virginica  virginica  virginica  virginica 
Levels: virginica setosa versicolor

However, let's say I wish to set the variable (Species) and the reference level (virginica) dynamically:

var_name <- "Species"
ref_name <- "virginica"

test_iris <- iris %>% 
  mutate({{var_name}} := relevel(factor({{var_name}}), ref = {{ref_name}})) 
test_iris$Species

Error: Problem with `mutate()` column `Species`.
i `Species = relevel(factor("Species"), ref = "virginica")`.
x 'ref' must be an existing level

From what I gather from these posts (1, 2), using a dynamic variable on the right-hand side of a dplyr expression is not straightforward. I asked a similar question in 3, although that question was limited to the column name.

My rough guess is that, since {{}} unquotes the variable name, I am specifying ref = virginica instead of ref = "virginica".

How may I approach this problem?
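
Judging from the expression echoed in the error message, the string itself seems to reach factor(), so the factor being releveled has a single level, "Species", rather than the three species. The following is a minimal base R sketch of that reading; the names f and msg are introduced here for illustration only and do not appear in the original post.

```r
var_name <- "Species"
ref_name <- "virginica"

# A bare string passed to factor() yields a one-level factor,
# not the Species column of iris.
f <- factor(var_name)
levels(f)

# relevel() then fails because "virginica" is not among those levels.
# tryCatch() captures the error message instead of stopping the script.
msg <- tryCatch(
  relevel(f, ref = ref_name),
  error = function(e) conditionMessage(e)
)
msg
```

If this reading is right, the fix is to make the right-hand side refer to the column rather than to the string holding its name.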

CodePudding user response:

Here's another option using rlang::quo:

var_name <- quo(Species)
ref_name <- "virginica"

test_iris <- iris %>% 
  mutate(!!var_name := relevel(factor(!!var_name), ref = ref_name)) 

You can check out how it's evaluated using rlang::qq_show:

rlang::qq_show(mutate(!!var_name := relevel(factor(!!var_name), ref = ref_name)))

# mutate(^Species := relevel(factor(^Species), ref = ref_name))
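
If the column name arrives as a string, as in the question, one possible variant of this approach converts it to a symbol with rlang::sym() before unquoting. This is a sketch under that assumption; var_sym and sym_iris are illustrative names, not part of the answer above.

```r
library(dplyr)

var_name <- "Species"
ref_name <- "virginica"

# Turn the string into a symbol so that !! injects a column reference
# rather than the literal string "Species".
var_sym <- rlang::sym(var_name)

sym_iris <- iris %>%
  mutate(!!var_sym := relevel(factor(!!var_sym), ref = ref_name))

levels(sym_iris$Species)
```

ref_name needs no unquoting here because relevel() expects an ordinary character value for ref.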

CodePudding user response:

It may be better to use the .data pronoun:

var_name <- "Species"
ref_name <- "virginica"
test_iris <- iris %>%
  mutate(!!var_name := relevel(factor(.data[[var_name]]), ref = ref_name))

Output:

> levels(test_iris$Species)
[1] "virginica"  "setosa"     "versicolor"
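
Since the goal is to vary both the column and the reference level, the .data approach could be wrapped in a small function. The sketch below assumes a hypothetical helper named relevel_col(), which is not part of dplyr or of the original answer.

```r
library(dplyr)

# Hypothetical helper: relevel the column named by `var_name` (a string)
# so that `ref_name` becomes its first (reference) level.
relevel_col <- function(data, var_name, ref_name) {
  data %>%
    mutate(!!var_name := relevel(factor(.data[[var_name]]), ref = ref_name))
}

helper_iris <- relevel_col(iris, "Species", "virginica")
levels(helper_iris$Species)
```

Both arguments are plain strings, so the helper can be called in a loop or with values read from a configuration file.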