Pass a data.table column name to a function using :=

Time:02-26

This question is the data.table equivalent of "Pass a data.frame column name to a function".

Suppose I have a very simple data.table:

dat <- data.table(x = 1:4,
                  y = 5:8)

Now I want a function that creates a new column from any given expression:

new_column <- function(df,col_name,expr){
    col_name <- deparse(substitute(col_name))
    df[[col_name]] <- eval(substitute(expr),df,parent.frame())
    df
}

So that it correctly provides:

> new_column(dat, z, x + y)
  x y  z
1 1 5  6
2 2 6  8
3 3 7 10
4 4 8 12

However, because it is a data.table, I would like to create this new column by reference using :=:

new_column_byref <- function(df,col_name,expr){
   col_name <- deparse(substitute(col_name))
  df[, col_name:=eval(substitute(expr)
                      ,df
                      ,parent.frame()
                      )]
  df
}

But it does not work:

> a <- new_column_byref(dat, z, x + y)
 Error: Check that is.data.table(DT) == TRUE. Otherwise, :=, `:=`(...) and let(...) are defined for use in j, once only and in particular ways. See help(":=").

How do I fix this? Thank you.

CodePudding user response:

new_column_byref <- function(df,col_name,expr){
  col_name <- deparse(substitute(col_name))
  set(df,j=col_name,value=eval(substitute(expr),df,parent.frame()))
}


dat <- data.table(x = 1:4,y = 5:8)

new_column_byref(dat, z, x + y)[]

   x y  z
1: 1 5  6
2: 2 6  8
3: 3 7 10
4: 4 8 12
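
Because set() modifies its input by reference, the data.table passed in is changed as well, not just the returned value. A minimal sketch of that behaviour, reusing the function above (the `safe` object and the `w` column are names made up for this illustration):

```r
library(data.table)

# Same function as in the answer above
new_column_byref <- function(df, col_name, expr) {
  col_name <- deparse(substitute(col_name))
  set(df, j = col_name, value = eval(substitute(expr), df, parent.frame()))
}

dat <- data.table(x = 1:4, y = 5:8)

# No assignment needed: dat itself gains the column z
new_column_byref(dat, z, x + y)
names(dat)
#> [1] "x" "y" "z"

# To keep the original untouched, work on an explicit copy
# (`safe` and `w` are hypothetical names for this sketch)
safe <- copy(dat)
new_column_byref(safe, w, x * y)
names(dat)   # unchanged by the call on `safe`
#> [1] "x" "y" "z"
```

If the in-place change is not wanted, calling copy() first, as in the second half of the sketch, is the usual way around it.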

CodePudding user response:

set is the idiomatic data.table way. If you need to do other things, such as use by, rlang offers a generic way to delay evaluation: capture the arguments with enexpr (or enquo if you want them evaluated in their original environment), then unquote them with !! inside an inject call around the expression you would normally write.

library(rlang)
#> Warning: package 'rlang' was built under R version 4.1.2
library(data.table)
#> 
#> Attaching package: 'data.table'
#> The following object is masked from 'package:rlang':
#> 
#>     :=

dat <- data.table(x = 1:4,
                  y = 5:8)

new_column <- function(df, col_name, expr) {
    col_name <- enexpr(col_name)
    expr <- enexpr(expr)
    inject(df[, !!col_name := !!expr])
}

new_column(dat, z, x + y)

dat
#>        x     y     z
#>    <int> <int> <int>
#> 1:     1     5     6
#> 2:     2     6     8
#> 3:     3     7    10
#> 4:     4     8    12

Created on 2022-02-25 by the reprex package (v2.0.1)

Or, similarly, without rlang:

library(data.table)

dat <- data.table(x = 1:4,
                  y = 5:8)

new_column <- function(df, col_name, expr) {
    col_name <- deparse(substitute(col_name))
    expr <- substitute(expr)
    df[, (col_name) := eval(expr)]
}

new_column(dat, z, x + y)

dat
#>        x     y     z
#>    <int> <int> <int>
#> 1:     1     5     6
#> 2:     2     6     8
#> 3:     3     7    10
#> 4:     4     8    12

Created on 2022-02-25 by the reprex package (v2.0.1)
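
Since the reason given for using := rather than set was operations such as by, here is a rough sketch of how the base R version might be extended with a grouping column. The function name `new_column_by`, the parameter `by_col`, and the example columns `g` and `total` are assumptions made for this illustration and are not part of the original answer.

```r
library(data.table)

# Hypothetical variant of new_column() that also accepts a grouping column
new_column_by <- function(df, col_name, expr, by_col) {
  col_name <- deparse(substitute(col_name))  # new column name as a string
  expr     <- substitute(expr)               # unevaluated expression for j
  by_col   <- deparse(substitute(by_col))    # grouping column name as a string
  # (col_name) is wrapped in parentheses so its value is used as the name
  df[, (col_name) := eval(expr), by = by_col]
}

dat <- data.table(g = c("a", "a", "b", "b"),
                  x = 1:4)

# Adds, by reference, the per-group sum of x as a new column
new_column_by(dat, total, sum(x), g)

dat$total
#> [1] 3 3 7 7
```

The parentheses around col_name matter: written as col_name := ..., data.table would create a column literally named col_name instead of using the string stored in the variable.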

The data.table package also has a new "programming on the language" interface in the dev version, but as best I can tell it only allows symbols, not expressions.

https://rdatatable.gitlab.io/data.table/news/index.html
