Avoid repeated unquoting in dplyr non-standard evaluation

Time:08-16

Suppose we have the following data:

tib <- tibble::tibble(x = 1:10)

Now suppose we want a function that takes a column name as input and returns the tibble with several derived columns added, such as:

library(dplyr)
generate_transformations <- function(data, column){
    transform <- sym(column)
    data %>% 
        mutate(
            sqrt = sqrt(!!transform),
            recip = 1 / !!transform,
            log = log(!!transform)
        )
}
# Usage is great:
tib %>% 
    generate_transformations('x')
# A tibble: 10 x 4
       x  sqrt recip   log
   <int> <dbl> <dbl> <dbl>
 1     1  1    1     0    
 2     2  1.41 0.5   0.693
 3     3  1.73 0.333 1.10 
 4     4  2    0.25  1.39 
 5     5  2.24 0.2   1.61 
 6     6  2.45 0.167 1.79 
 7     7  2.65 0.143 1.95 
 8     8  2.83 0.125 2.08 
 9     9  3    0.111 2.20 
10    10  3.16 0.1   2.30

My question: is there a way to avoid unquoting transform with !! repeatedly? I could, for example, temporarily rename the column and rename it back when I am done, but that is not what I am after here. I would like to know whether there is a way to create a variable that does not need the !! at all. The following does not work, but I was looking for something like it:

generate_transformations <- function(data, column){
    transform <- !!sym(column) # cannot unquote here :(
    data %>% 
        mutate(
            sqrt = sqrt(transform),
            recip = 1 / transform,
            log = log(transform)
        )
}

CodePudding user response:

Convert the column name to a string, extract that column from the data as a plain vector, and use that vector (transform) directly:

generate_transformations <- function(data, column){
    transform <- data[[rlang::as_string(ensym(column))]]
    data %>% 
        mutate(
            sqrt = sqrt(transform),
            recip = 1 / transform,
            log = log(transform)
        )
}

Testing:

tib %>% 
     generate_transformations('x')
# A tibble: 10 × 4
       x  sqrt recip   log
   <int> <dbl> <dbl> <dbl>
 1     1  1    1     0    
 2     2  1.41 0.5   0.693
 3     3  1.73 0.333 1.10 
 4     4  2    0.25  1.39 
 5     5  2.24 0.2   1.61 
 6     6  2.45 0.167 1.79 
 7     7  2.65 0.143 1.95 
 8     8  2.83 0.125 2.08 
 9     9  3    0.111 2.20 
10    10  3.16 0.1   2.30 

Or create a temporary column, unquoting only once, and remove it at the end:

generate_transformations <- function(data, column){
   
    data %>% 
        mutate(transform = !! rlang::ensym(column),
            sqrt = sqrt(transform),
            recip = 1 / transform,
            log = log(transform), 
            transform = NULL
        )
}

Testing:

tib %>% 
     generate_transformations('x')
# A tibble: 10 × 4
       x  sqrt recip   log
   <int> <dbl> <dbl> <dbl>
 1     1  1    1     0    
 2     2  1.41 0.5   0.693
 3     3  1.73 0.333 1.10 
 4     4  2    0.25  1.39 
 5     5  2.24 0.2   1.61 
 6     6  2.45 0.167 1.79 
 7     7  2.65 0.143 1.95 
 8     8  2.83 0.125 2.08 
 9     9  3    0.111 2.20 
10    10  3.16 0.1   2.30 
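
One difference between the two approaches above shows up on grouped data. The vector pulled out with data[[...]] has the full length of the data, so it would not line up with the rows of each group, whereas the temporary column is evaluated per group like any other column. The sketch below illustrates this with the temporary-column pattern only; the function name center_within_groups, the grouped tibble, and the centered column are hypothetical, chosen for illustration.

```r
library(dplyr)

# Hypothetical grouped data, for illustration only
grouped <- tibble(g = c("a", "a", "b", "b"), x = c(1, 3, 10, 20)) %>%
  group_by(g)

# Same temporary-column pattern as above, with a group-dependent calculation
center_within_groups <- function(data, column) {
  data %>%
    mutate(
      transform = !!rlang::ensym(column),       # unquote once
      centered  = transform - mean(transform),  # mean() is computed per group
      transform = NULL                          # drop the helper column
    )
}

res <- center_within_groups(grouped, "x")
res$centered  # -1, 1, -5, 5
```

The group means are 2 and 15, so each value is centered against its own group rather than against the overall mean.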

CodePudding user response:

You can do it in a single step if you swap !! for {{ }} and use across():

data_transformations <- function(d, col, funs=list(sqrt=sqrt, log=log, recip=~1/.)) {
  d %>% mutate(across({{col}}, .fns=funs))
}

tib %>% data_transformations(x)
# A tibble: 10 × 4
       x x_sqrt x_log x_recip
   <int>  <dbl> <dbl>   <dbl>
 1     1   1    0       1    
 2     2   1.41 0.693   0.5  
 3     3   1.73 1.10    0.333
 4     4   2    1.39    0.25 
 5     5   2.24 1.61    0.2  
 6     6   2.45 1.79    0.167
 7     7   2.65 1.95    0.143
 8     8   2.83 2.08    0.125
 9     9   3    2.20    0.111
10    10   3.16 2.30    0.1  

To get the original column names back, set the .names argument:

data_transformations <- function(d, col, funs=list(sqrt=sqrt, log=log, recip=~1/.)) {
  d %>% mutate(across({{col}}, .fns=funs, .names="{.fn}"))
}
tib %>% data_transformations(x)
# A tibble: 10 × 4
       x  sqrt   log recip
   <int> <dbl> <dbl> <dbl>
 1     1  1    0     1    
 2     2  1.41 0.693 0.5  
 3     3  1.73 1.10  0.333
 4     4  2    1.39  0.25 
 5     5  2.24 1.61  0.2  
 6     6  2.45 1.79  0.167
 7     7  2.65 1.95  0.143
 8     8  2.83 2.08  0.125
 9     9  3    2.20  0.111
10    10  3.16 2.30  0.1  
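
The across() version takes a bare column name, while the function in the question took a string. If the string interface is preferred, one possible variant wraps the string in all_of() instead of using {{ }}. This is a sketch under that assumption; the name generate_transformations_chr is hypothetical.

```r
library(dplyr)

tib <- tibble(x = 1:10)

# Hypothetical variant: `column` is a character string, as in the question
generate_transformations_chr <- function(data, column,
                                         funs = list(sqrt = sqrt,
                                                     recip = ~ 1 / .x,
                                                     log = log)) {
  data %>%
    mutate(across(all_of(column), .fns = funs, .names = "{.fn}"))
}

res <- generate_transformations_chr(tib, "x")
names(res)  # "x" "sqrt" "recip" "log"
```

No unquoting is needed here because all_of() expects a character vector of column names.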

To handle multiple columns:

data_transformations <- function(d, cols, funs=list(sqrt=sqrt, log=log, recip=~1/.)) {
  d %>% mutate(across({{cols}}, .fns=funs))
}
d1 <- tibble(x=1:10, y=seq(2, 20, 2))

d1 %>% data_transformations(c(x, y), list(sqrt=sqrt, log=log))
# A tibble: 10 × 6
       x     y x_sqrt x_log y_sqrt y_log
   <int> <dbl>  <dbl> <dbl>  <dbl> <dbl>
 1     1     2   1    0       1.41 0.693
 2     2     4   1.41 0.693   2    1.39 
 3     3     6   1.73 1.10    2.45 1.79 
 4     4     8   2    1.39    2.83 2.08 
 5     5    10   2.24 1.61    3.16 2.30 
 6     6    12   2.45 1.79    3.46 2.48 
 7     7    14   2.65 1.95    3.74 2.64 
 8     8    16   2.83 2.08    4    2.77 
 9     9    18   3    2.20    4.24 2.89 
10    10    20   3.16 2.30    4.47 3.00 
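
Because {{cols}} is forwarded to across(), the argument is not limited to c(x, y); a selection helper should work as well. The sketch below repeats the multi-column function and passes where(is.numeric); the tibble d2 is hypothetical and exists only to show that the character column is skipped.

```r
library(dplyr)

data_transformations <- function(d, cols,
                                 funs = list(sqrt = sqrt, log = log, recip = ~ 1 / .)) {
  d %>% mutate(across({{ cols }}, .fns = funs))
}

# Hypothetical data with one non-numeric column
d2 <- tibble(x = 1:3, y = c(4, 9, 16), g = c("a", "b", "c"))

# Select every numeric column; g is left untouched
res <- d2 %>% data_transformations(where(is.numeric), list(sqrt = sqrt))
names(res)  # "x" "y" "g" "x_sqrt" "y_sqrt"
```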