Using rlang double curly braces {{ in data.table

Time:06-19

Problem

The {{}} operator from the rlang package makes it incredibly easy to pass column names as function arguments (aka Quasiquotation). I understand rlang is intended to work with tidyverse, but is there a way to use {{}} in data.table?

Intended use of {{}} with dplyr

test_dplyr <- function(dt, col1, col2){
  
  temp <- dt %>%
            group_by( {{col2}} ) %>%
            summarise(test = mean( {{col1}} ))

  return(temp)
}

test_dplyr(dt=iris, col1=Sepal.Length, col2=Species)

> # A tibble: 3 x 2
>   Species     test
>   <fct>      <dbl>
> 1 setosa      5.01
> 2 versicolor  5.94
> 3 virginica   6.59

Failed attempt of using {{}} with data.table

This is ideally what I would like to do, but it returns an ERROR.

test_dt2 <- function(dt, col1, col2){
  
  data.table::setDT(dt)
  temp <- dt[, .( test = mean({{col1}})), by = {{col2}} ]
  return(temp)
}

# error
test_dt2(dt=iris, col1= Sepal.Length, col2= Species)

# and error
test_dt2(dt=iris, col1= 'Sepal.Length', col2= 'Species')
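
A likely reason for the errors above: outside of tidy-eval functions, R itself gives {{ }} no special meaning. It is parsed as two nested { blocks, so the argument is simply evaluated. The sketch below illustrates this in base R; the function name plain is hypothetical and used only for illustration.

```r
# In plain R, {{ x }} is just a block inside a block: it evaluates x.
plain <- function(x) {{ x }}
res_plain <- plain(1 + 2)
print(res_plain)

# The parsed expression is an ordinary call to `{` wrapping another `{`.
expr <- quote({{ x }})
outer_is_brace <- identical(expr[[1]], as.name("{"))
inner_is_brace <- identical(expr[[2]][[1]], as.name("{"))
print(c(outer_is_brace, inner_is_brace))
```

Functions such as dplyr::group_by() inspect the captured expression and treat {{ specially, while data.table's [ does not appear to.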

Alternative use of rlang with data.table

Here is an alternative way to use rlang with data.table. There are two inconveniences: having to call rlang::ensym() on every column-name variable, and having to wrap the data.table operation in rlang::inject().

test_dt <- function(dt, col1, col2){
  
  # convert column names (bare or quoted) to symbols
  col1 <- rlang::ensym(col1)
  col2 <- rlang::ensym(col2)
  
  data.table::setDT(dt)
  temp <- rlang::inject( dt[, .( test = mean(!!col1)), by = !!col2] )
  return(temp)
}

test_dt(dt=iris, col1='Sepal.Length', col2='Species')

>       Species  test
> 1:     setosa 5.006
> 2: versicolor 5.936
> 3:  virginica 6.588

Answer

I don't think you want to use rlang with data.table; data.table already has more convenient facilities of its own. I would also suggest not using setDT here, since it has the side effect of converting the caller's dt to a data.table in place.

library(data.table)

test_dt <- function(dt, col1, col2) {
  as.data.table(dt)[, .( test = mean(.SD[[col1]])), by = c(col2)]
}

test_dt(dt = iris, col1 = 'Sepal.Length', col2 = 'Species')
##       Species  test
## 1:     setosa 5.006
## 2: versicolor 5.936
## 3:  virginica 6.588
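
The in-place side effect of setDT can be seen in a small sketch. The helper names with_setDT and with_copy, and the toy data frames, are hypothetical and exist only for this illustration.

```r
library(data.table)

# Hypothetical helpers: one converts by reference, one works on a copy.
with_setDT <- function(d) { setDT(d); d[, .N] }
with_copy  <- function(d) { as.data.table(d)[, .N] }

df_inplace <- data.frame(x = 1:3)
df_copy    <- data.frame(x = 1:3)

invisible(with_setDT(df_inplace))
invisible(with_copy(df_copy))

# The data frame passed through setDT() is expected to have become a
# data.table in the caller; the one passed through as.data.table() is not.
print(class(df_inplace))
print(class(df_copy))
```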

Using get() instead of .SD[[ ]] also works:

test_dt <- function(dt, col1, col2) {
  as.data.table(dt)[, .( test = mean(get(col1))), by = c(col2)]
}

test_dt(dt=iris, col1='Sepal.Length', col2='Species')
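
As a further option, recent data.table versions provide an env argument to [ for this kind of programmatic substitution. The sketch below assumes data.table 1.15.0 or later; the function name test_dt_env is hypothetical. Character values passed in env are substituted as column names.

```r
library(data.table)

# Hypothetical helper; assumes data.table >= 1.15.0, where `[` accepts `env`.
test_dt_env <- function(dt, col1, col2) {
  # col1 and col2 inside j and by are placeholders replaced via env;
  # the strings supplied by the caller are substituted as column names.
  as.data.table(dt)[, .(test = mean(col1)), by = col2,
                    env = list(col1 = col1, col2 = col2)]
}

res <- test_dt_env(dt = iris, col1 = "Sepal.Length", col2 = "Species")
print(res)
```

This keeps the j and by expressions looking like ordinary data.table code, without get() or .SD[[ ]].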