How to subset data.table by external function with arbitrary conditions

Time:08-11

Suppose I have a data.table like the following.

a <- seq(2)
b <- seq(3)
c <- seq(4)
dt <- data.table(expand.grid(a,b,c))
> dt
    Var1 Var2 Var3
 1:    1    1    1
 2:    2    1    1
 3:    1    2    1
 4:    2    2    1
 5:    1    3    1
 6:    2    3    1
 7:    1    1    2
 8:    2    1    2
 9:    1    2    2
10:    2    2    2
11:    1    3    2
12:    2    3    2
13:    1    1    3
14:    2    1    3
15:    1    2    3
16:    2    2    3
17:    1    3    3
18:    2    3    3
19:    1    1    4
20:    2    1    4
21:    1    2    4
22:    2    2    4
23:    1    3    4
24:    2    3    4

Now I can easily subset by column values using a standard data.table subset call. For example,

dt[Var2==2 & Var3==1]
   Var1 Var2 Var3
1:    1    2    1
2:    2    2    1

But now suppose I want to wrap this in a function outside of the data.table call, something generic like

foo <- function(dt, ...) {
  return(dt[Var2 == 2 & Var3 == 1])
}

I have seen some examples that use only one subset column together with globalenv()$val, where Var2 is defined outside of the data.table filter.

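foo <- function(dt, ...) {
  # Filter in i (no leading comma); with a leading comma the comparison runs in j and returns a logical vector
  return(dt[Var2 == globalenv()$Var2])
}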

But, if I had a large number of columns and wanted to filter by an arbitrary subset of the columns and values, this wouldn't seem to present a simple solution. I can do this a few ways, but they all seem very cumbersome and inefficient. Is there a way to subset by a function with arbitrary columns selected by the user that would accomplish this?

Like,

foo(dt,Var2=1,Var3=1)
foo(dt,Var1=2,Var3=1,Var10=2,...)
foo(dt,c(Var1=2,Var3=1,Var10=2))

etc

I added the extra dots since I want to be able to enter any number of arbitrary selection conditions to the function call.

In case anyone is wondering, my end goal is a much larger function, but the data.table filtering is a critical part of it.

CodePudding user response:

A slight modification from Christian's answer:

fun <- function(dt, ...) {
    args <- list(...)
    filter <- Reduce(
        function(x, y) call("&", x, y),
        Map(function(val, name) call("==", as.name(name), val), args, names(args)))
    dt[eval(filter)]
}

fun(dt, Var1 = 1, Var3 = 1)
#   Var1 Var2 Var3
#1:    1    1    1
#2:    1    2    1
#3:    1    3    1
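
To see what the Map/Reduce step above actually builds, here is a minimal sketch in base R only. The helper name build_filter is hypothetical (it is not part of the answer); it returns the unevaluated filter expression so it can be inspected before being used. A plain data.frame stands in for the data.table here purely so the example runs without extra packages.

```r
# Hypothetical helper: build the combined filter expression without evaluating it.
build_filter <- function(...) {
  args <- list(...)
  # One `name == value` call per named argument
  conds <- Map(function(val, name) call("==", as.name(name), val), args, names(args))
  # Chain the individual conditions together with `&`
  Reduce(function(x, y) call("&", x, y), conds)
}

filter <- build_filter(Var1 = 1, Var3 = 1)
print(filter)
# Var1 == 1 & Var3 == 1

# The expression can then be evaluated against the columns of a table.
df <- expand.grid(1:2, 1:3, 1:4)   # columns Var1, Var2, Var3
df[eval(filter, df), ]
```

Because the filter is an ordinary R call, it can be printed or logged before use, which may help when debugging a larger function built around it.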

CodePudding user response:

One possible solution (note the use of == rather than =, unlike the calls in the post):

foo = function(dt, ...) {
  eval(substitute(dt[Reduce(`&`, list(...)),]))
}

foo(dt,Var2==1,Var3==1)

    Var1  Var2  Var3
   <int> <int> <int>
1:     1     1     1
2:     2     1     1
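
If all the conditions are simple equality tests, another option is to treat the named arguments as a one-row lookup table and join on it. This is only a sketch under a few assumptions: the helper name filter_eq is made up for illustration, it handles equality conditions only, and it relies on nomatch = NULL, which needs a reasonably recent data.table. It also accepts a single named vector, as in the third call style from the question.

```r
library(data.table)

# Hypothetical helper (not from the answers above): equality filter via a join.
filter_eq <- function(dt, ...) {
  conds <- list(...)
  # Also accept one unnamed argument holding a named vector or list,
  # e.g. filter_eq(dt, c(Var2 = 2, Var3 = 1))
  if (length(conds) == 1L && is.null(names(conds))) conds <- as.list(conds[[1L]])
  stopifnot(length(conds) > 0L,
            !is.null(names(conds)),
            all(names(conds) %in% names(dt)))
  # One-row table of the requested values
  lookup <- as.data.table(conds)
  # Keep only the rows of dt that match the lookup row on every requested column
  dt[lookup, on = names(conds), nomatch = NULL]
}

dt <- data.table(expand.grid(seq(2), seq(3), seq(4)))
res1 <- filter_eq(dt, Var2 = 2, Var3 = 1)
res2 <- filter_eq(dt, c(Var2 = 2, Var3 = 1))
print(res1)
```

The join form avoids building expressions by hand, at the cost of only supporting ==; conditions such as < or %in% would still need one of the approaches above.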