Subsetting a data.table using logical function by calling its string name

Time:09-17

I have a function containing a logical expression that I use to subset the rows of a data.table. I want to call this function by its name, given as a character string, and pass it a vector of column names that correspond to the function arguments (similar to using do.call). However, I do not know how to approach this.

In a simple, reproducible example:

#function with logical return value
myfunc <- function(mpg, cyl) {
  mpg/cyl > 7
}

#name of function
funcname <- "myfunc"

#vector of column names, corresponding to the function parameters
cols <- c("mpg", "cyl")

#data.table to be subsetted
library(data.table)
dt <- setDT(copy(mtcars))

What I am looking for is a function that takes the variables dt, funcname and cols and returns the following result, which is identical to dt[mpg/cyl > 7, ]:

#desired function
subsetfunc(dt, funcname, cols)

    mpg cyl disp  hp drat    wt  qsec vs am gear carb
1: 32.4   4 78.7  66 4.08 2.200 19.47  1  1    4    1
2: 30.4   4 75.7  52 4.93 1.615 18.52  1  1    4    2
3: 33.9   4 71.1  65 4.22 1.835 19.90  1  1    4    1
4: 30.4   4 95.1 113 3.77 1.513 16.90  1  1    5    2

CodePudding user response:

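If you are interested in exploring the env parameter, available in the development version 1.14.3 of data.table (released as 1.15.0), here is an alternative approach. Character values in env are substituted as names, so f becomes the function and c1, c2 become the columns: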

subsetfunc <- function(dt, funcname, cols) {
  dt[f(c1, c2), env = list(f = funcname, c1 = cols[1], c2 = cols[2])]
}

subsetfunc(dt, funcname, cols)

Output:

    mpg cyl disp  hp drat    wt  qsec vs am gear carb
1: 32.4   4 78.7  66 4.08 2.200 19.47  1  1    4    1
2: 30.4   4 75.7  52 4.93 1.615 18.52  1  1    4    2
3: 33.9   4 71.1  65 4.22 1.835 19.90  1  1    4    1
4: 30.4   4 95.1 113 3.77 1.513 16.90  1  1    5    2

CodePudding user response:

Here is one way, using get() to look up the function by its name:

subsetfunc <- function(data, funcstring, colnms) {
   data[get(funcstring)(data[[colnms[1]]], data[[colnms[2]]])]
}

Testing:

> subsetfunc(dt, funcname, cols)
     mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
   <num> <num> <num> <num> <num> <num> <num> <num> <num> <num> <num>
1:  32.4     4  78.7    66  4.08 2.200 19.47     1     1     4     1
2:  30.4     4  75.7    52  4.93 1.615 18.52     1     1     4     2
3:  33.9     4  71.1    65  4.22 1.835 19.90     1     1     4     1
4:  30.4     4  95.1   113  3.77 1.513 16.90     1     1     5     2
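
One caveat with a plain get(funcstring) is that it returns the first object of that name it finds, whether or not it is a function. A minimal base R sketch of a more defensive lookup, assuming a hypothetical wrapper shadow_demo that deliberately defines a local non-function object with the same name:

```r
# Predicate from the question
myfunc <- function(mpg, cyl) {
  mpg / cyl > 7
}

# Hypothetical demo function: a local character variable shadows "myfunc"
shadow_demo <- function(funcstring) {
  myfunc <- "not a function"
  # mode = "function" skips objects that are not functions,
  # so the lookup continues into the enclosing environment
  f <- get(funcstring, mode = "function")
  f(32.4, 4)
}

shadow_demo("myfunc")
```

Without mode = "function", get() would return the local character vector here and the call would fail.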

Or use do.call, which passes the columns in .SD to the function as named arguments:

subsetfunc <- function(data, funcstring, colnms) {
   data[data[, do.call(funcstring, .SD), .SDcols = colnms]]
}

Testing:

> subsetfunc(dt, funcname, cols)
     mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
   <num> <num> <num> <num> <num> <num> <num> <num> <num> <num> <num>
1:  32.4     4  78.7    66  4.08 2.200 19.47     1     1     4     1
2:  30.4     4  75.7    52  4.93 1.615 18.52     1     1     4     2
3:  33.9     4  71.1    65  4.22 1.835 19.90     1     1     4     1
4:  30.4     4  95.1   113  3.77 1.513 16.90     1     1     5     2
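
Because .SD is a named list, do.call matches the columns to the function parameters by name, so this only works when the column names equal the parameter names. If they differ, or if the function takes more than two columns, a positional variant may help. The following is only a sketch: subsetfunc_pos and ratio_gt7 are hypothetical names, and it assumes the data has no column called keep.

```r
library(data.table)

# Predicate whose parameter names differ from the column names (illustrative)
ratio_gt7 <- function(x, y) x / y > 7

# Hypothetical helper: columns are passed positionally, in the order of colnms
subsetfunc_pos <- function(data, funcstring, colnms) {
  f <- get(funcstring, mode = "function")            # look up the function by name
  keep <- do.call(f, unname(as.list(data)[colnms]))  # drop names: positional matching
  data[keep]                                         # subset rows with the logical vector
}

dt <- as.data.table(mtcars)
res <- subsetfunc_pos(dt, "ratio_gt7", c("mpg", "cyl"))
```

Here res should hold the same four rows as dt[mpg/cyl > 7, ], and colnms can have any length as long as it matches the number of arguments the function expects.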