Problem
The {{}} operator from the rlang package makes it incredibly easy to pass column names as function arguments (a.k.a. quasiquotation). I understand rlang is intended to work with the tidyverse, but is there a way to use {{}} in data.table?
Intended use of {{}} with dplyr
library(dplyr)

test_dplyr <- function(dt, col1, col2){
  temp <- dt %>%
    group_by( {{col2}} ) %>%
    summarise(test = mean( {{col1}} ))
  return(temp)
}
test_dplyr(dt=iris, col1=Sepal.Length, col2=Species)
> # A tibble: 3 x 2
> Species test
> <fct> <dbl>
> 1 setosa 5.01
> 2 versicolor 5.94
> 3 virginica 6.59
Failed attempt of using {{}} with data.table
Ideally, this is what I would like to do, but it throws an error.
test_dt2 <- function(dt, col1, col2){
  data.table::setDT(dt)
  temp <- dt[, .( test = mean({{col1}})), by = {{col2}} ]
  return(temp)
}
# error with unquoted column names
test_dt2(dt=iris, col1= Sepal.Length, col2= Species)
# also an error with quoted column names
test_dt2(dt=iris, col1= 'Sepal.Length', col2= 'Species')
Alternative use of rlang with data.table
And here is an alternative way to use rlang with data.table. There are two inconveniences here: every column-name argument has to be passed through rlang::ensym(), and the data.table operation has to be wrapped in rlang::inject().
test_dt <- function(dt, col1, col2){
  # capture the column names as symbols
  col1 <- rlang::ensym(col1)
  col2 <- rlang::ensym(col2)
  data.table::setDT(dt)
  temp <- rlang::inject( dt[, .( test = mean(!!col1)), by = !!col2] )
  return(temp)
}
test_dt(dt=iris, col1='Sepal.Length', col2='Species')
> Species test
> 1: setosa 5.006
> 2: versicolor 5.936
> 3: virginica 6.588
Answer
I don't think you want to use rlang with data.table; data.table already has more convenient facilities of its own. I also suggest not using setDT here, because it converts dt by reference and so has the side effect of changing the caller's object in place. Using as.data.table on a copy avoids that:
library(data.table)

test_dt <- function(dt, col1, col2) {
  as.data.table(dt)[, .( test = mean(.SD[[col1]])), by = c(col2)]
}
test_dt(dt = iris, col1 = 'Sepal.Length', col2 = 'Species')
## Species test
## 1: setosa 5.006
## 2: versicolor 5.936
## 3: virginica 6.588
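To make the setDT side effect concrete, here is a minimal sketch. The toy data and the helper names mean_with_setDT and mean_with_copy are hypothetical, made up for illustration only. After both calls, the data.frame passed to the setDT version should report a data.table class, while the one passed to the as.data.table version should still be a plain data.frame.

```r
library(data.table)

# Two identical plain data.frames (hypothetical toy data, not from the question).
df_in_place <- data.frame(x = c(1, 2, 3), g = c("a", "b", "b"))
df_copied   <- data.frame(x = c(1, 2, 3), g = c("a", "b", "b"))

# Hypothetical helper: converts its argument by reference with setDT().
mean_with_setDT <- function(dt) {
  setDT(dt)
  dt[, .(test = mean(x)), by = g]
}

# Hypothetical helper: converts a copy with as.data.table().
mean_with_copy <- function(dt) {
  as.data.table(dt)[, .(test = mean(x)), by = g]
}

mean_with_setDT(df_in_place)
mean_with_copy(df_copied)

# Inspect what happened to the caller's objects.
class(df_in_place)
class(df_copied)
```

If the function is meant to leave its input untouched, the as.data.table version is the safer default, at the cost of one copy.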
Using get() instead of .SD[[ ]] also works:
test_dt <- function(dt, col1, col2) {
  as.data.table(dt)[, .( test = mean(get(col1))), by = c(col2)]
}
test_dt(dt = iris, col1 = 'Sepal.Length', col2 = 'Species')
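As a further option that is closer in spirit to {{}}, newer data.table releases offer an env argument for substituting names into the query. The sketch below assumes data.table 1.15.0 or later (where, as far as I know, env was introduced); the function name test_dt_env is hypothetical. Character values supplied in env are substituted as column symbols, so the column names are still passed as strings.

```r
library(data.table)

# Assumes data.table >= 1.15.0, which added the `env` argument to `[.data.table`.
# test_dt_env is a hypothetical name used only for this sketch.
test_dt_env <- function(dt, col1, col2) {
  as.data.table(dt)[
    , .(test = mean(col1)), by = col2,
    # character values in env are substituted into j and by as symbols
    env = list(col1 = col1, col2 = col2)
  ]
}

res <- test_dt_env(dt = iris, col1 = "Sepal.Length", col2 = "Species")
res
```

This keeps the query readable as ordinary data.table code, with the placeholders col1 and col2 resolved in one place rather than wrapped in get() or .SD[[ ]].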