This question is the data.table equivalent of Pass a data.frame column name to a function.
Suppose I have a very simple data.table:
library(data.table)
dat <- data.table(x = 1:4,
                  y = 5:8)
Now I want to create a new column for any given function:
new_column <- function(df, col_name, expr) {
  col_name <- deparse(substitute(col_name))
  df[[col_name]] <- eval(substitute(expr), df, parent.frame())
  df
}
So that it correctly provides:
> new_column(dat, z, x + y)
x y z
1 1 5 6
2 2 6 8
3 3 7 10
4 4 8 12
However, because it is a data.table, I would like to create this new column using `:=`:
new_column_byref <- function(df,col_name,expr){
col_name <- deparse(substitute(col_name))
df[, col_name:=eval(substitute(expr)
,df
,parent.frame()
)]
df
}
But it does not work:
> a <- new_column_byref(dat, z, x + y)
Error: Check that is.data.table(DT) == TRUE. Otherwise, :=, `:=`(...) and let(...) are defined for use in j, once only and in particular ways. See help(":=").
How do I fix this? Thank you.
CodePudding user response:
new_column_byref <- function(df, col_name, expr) {
  col_name <- deparse(substitute(col_name))
  set(df, j = col_name, value = eval(substitute(expr), df, parent.frame()))
}
dat <- data.table(x = 1:4, y = 5:8)
new_column_byref(dat, z, x + y)[]
x y z
1: 1 5 6
2: 2 6 8
3: 3 7 10
4: 4 8 12
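As a minimal sketch of why nothing needs to be assigned back: `set()` modifies the data.table in place and returns it invisibly, which is why the call above ends in `[]` to print the result. The literal column name "z" and the toy data are just for illustration.

```r
library(data.table)

dat <- data.table(x = 1:4, y = 5:8)

# set() adds the column to dat itself; the return value is not assigned.
set(dat, j = "z", value = dat$x + dat$y)

names(dat)  # "z" is now among the column names
dat[]
```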
CodePudding user response:
`set` is the data.table-idiomatic way. If you need to do other stuff, like use `by`, rlang has a generic way to delay evaluation, which is to `enexpr` the args (or `enquo` if you want them evaluated in the original environment) and `!!` them inside an `inject` with the expression you'd normally use.
library(rlang)
#> Warning: package 'rlang' was built under R version 4.1.2
library(data.table)
#>
#> Attaching package: 'data.table'
#> The following object is masked from 'package:rlang':
#>
#> :=
dat <- data.table(x = 1:4,
y = 5:8)
new_column <- function(df, col_name, expr) {
  col_name <- enexpr(col_name)
  expr <- enexpr(expr)
  inject(df[, !!col_name := !!expr])
}
new_column(dat, z, x + y)
dat
#> x y z
#> <int> <int> <int>
#> 1: 1 5 6
#> 2: 2 6 8
#> 3: 3 7 10
#> 4: 4 8 12
Created on 2022-02-25 by the reprex package (v2.0.1)
Or, similarly, without rlang:
library(data.table)
dat <- data.table(x = 1:4,
y = 5:8)
new_column <- function(df, col_name, expr) {
  col_name <- deparse(substitute(col_name))
  expr <- substitute(expr)
  df[, (col_name) := eval(expr)]
}
new_column(dat, z, x + y)
dat
#> x y z
#> <int> <int> <int>
#> 1: 1 5 6
#> 2: 2 6 8
#> 3: 3 7 10
#> 4: 4 8 12
Created on 2022-02-25 by the reprex package (v2.0.1)
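Since `by` was the motivation for going beyond `set`, here is a rough sketch of the base-R version extended with grouping. The helper name `new_column_by`, its `by_cols` argument, and the grouping column `g` are hypothetical and only for illustration.

```r
library(data.table)

# Hypothetical helper: same idea as new_column(), plus grouping columns
# passed as a character vector.
new_column_by <- function(df, col_name, expr, by_cols) {
  col_name <- deparse(substitute(col_name))
  expr <- substitute(expr)
  # eval(expr) runs once per group, so x and y are the group's rows.
  df[, (col_name) := eval(expr), by = by_cols]
}

dat <- data.table(g = c("a", "a", "b", "b"),
                  x = 1:4,
                  y = 5:8)
new_column_by(dat, total, sum(x + y), by_cols = "g")
dat[]
```

Each row should end up with its group's sum of `x + y`, and `dat` is again modified by reference.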
The data.table package also has a new "programming on the language" interface in the dev version, but as best I can tell, it only allows symbols, not expressions.