I'm trying to understand the new env
parameter in data.table.
This works (data.table version 1.14.3)
df1 = data.table(x=seq(1,10))
a="x"
df1[a<5, env=list(a=a)]
x
1: 1
2: 2
3: 3
4: 4
However, I can't get it to work in join conditions
df1 = data.table(x=seq(1,10), xz=rnorm(10))
df2 = data.table(y=seq(5,14), yz=rnorm(10))
a="x"
b="y"
# This fails
df1[df2, on=.(a<b), nomatch=0, env=list(a=a,b=b)]
Any thoughts on how to pass variables a
and b
to the on
clause of data.table? I understand how to do it via function. However, I'm struggling to find best way to pass a
and b
as characters, and still use in the on
clause
# This works
func <- function(d1,d2,a,b) {
eval(substitute(d1[d2, on=.(a<b), nomatch=0]))
}
func(df1,df2,x,y)
CodePudding user response:
According to the documentation for ?data.table
(1.14.3
)
env - List or an environment, passed to substitute2 for substitution of parameters in
i, j and by (or keyby)
. Use verbose to preview constructed expressions.
The usage of env
with on
is not specified. An option without the eval/substitute
would be to pass a named
vector in on
out1 <- df1[df2, on = setNames(b, a), nomatch = 0]
-checking with the output of actual columns
> out2 <- df1[df2, on = .(x = y), nomatch = 0]
> identical(out1, out2)
[1] TRUE
For the non-equi join, as @jangorecki mentioned in the comments, use substitute2
eval(substitute2(df1[df2, on = .(a <b), nomatch = 0], env = list(a=a, b = b)))
-output
x xz yz
<int> <num> <num>
1: 5 -1.1259516 -0.78272556
2: 5 -0.4893074 -0.78272556
3: 5 1.8534867 -0.78272556
4: 5 -0.0376076 -0.78272556
5: 6 -1.1259516 -0.38678402
6: 6 -0.4893074 -0.38678402
7: 6 1.8534867 -0.38678402
8: 6 -0.0376076 -0.38678402
9: 6 -0.8409438 -0.38678402
10: 7 -1.1259516 1.18356550
11: 7 -0.4893074 1.18356550
12: 7 1.8534867 1.18356550
13: 7 -0.0376076 1.18356550
14: 7 -0.8409438 1.18356550
15: 7 2.4536938 1.18356550
16: 8 -1.1259516 1.01226735
17: 8 -0.4893074 1.01226735
18: 8 1.8534867 1.01226735
19: 8 -0.0376076 1.01226735
20: 8 -0.8409438 1.01226735
21: 8 2.4536938 1.01226735
22: 8 -0.4296828 1.01226735
23: 9 -1.1259516 0.64996351
24: 9 -0.4893074 0.64996351
25: 9 1.8534867 0.64996351
26: 9 -0.0376076 0.64996351
27: 9 -0.8409438 0.64996351
28: 9 2.4536938 0.64996351
29: 9 -0.4296828 0.64996351
30: 9 0.9030399 0.64996351
31: 10 -1.1259516 -1.23172125
32: 10 -0.4893074 -1.23172125
33: 10 1.8534867 -1.23172125
34: 10 -0.0376076 -1.23172125
35: 10 -0.8409438 -1.23172125
36: 10 2.4536938 -1.23172125
37: 10 -0.4296828 -1.23172125
38: 10 0.9030399 -1.23172125
39: 10 0.5757248 -1.23172125
40: 11 -1.1259516 0.21203485
41: 11 -0.4893074 0.21203485
42: 11 1.8534867 0.21203485
43: 11 -0.0376076 0.21203485
44: 11 -0.8409438 0.21203485
45: 11 2.4536938 0.21203485
46: 11 -0.4296828 0.21203485
47: 11 0.9030399 0.21203485
48: 11 0.5757248 0.21203485
49: 11 -1.1732665 0.21203485
50: 12 -1.1259516 -1.28178320
51: 12 -0.4893074 -1.28178320
52: 12 1.8534867 -1.28178320
53: 12 -0.0376076 -1.28178320
54: 12 -0.8409438 -1.28178320
55: 12 2.4536938 -1.28178320
56: 12 -0.4296828 -1.28178320
57: 12 0.9030399 -1.28178320
58: 12 0.5757248 -1.28178320
59: 12 -1.1732665 -1.28178320
60: 13 -1.1259516 -0.03909294
61: 13 -0.4893074 -0.03909294
62: 13 1.8534867 -0.03909294
63: 13 -0.0376076 -0.03909294
64: 13 -0.8409438 -0.03909294
65: 13 2.4536938 -0.03909294
66: 13 -0.4296828 -0.03909294
67: 13 0.9030399 -0.03909294
68: 13 0.5757248 -0.03909294
69: 13 -1.1732665 -0.03909294
70: 14 -1.1259516 0.25522954
71: 14 -0.4893074 0.25522954
72: 14 1.8534867 0.25522954
73: 14 -0.0376076 0.25522954
74: 14 -0.8409438 0.25522954
75: 14 2.4536938 0.25522954
76: 14 -0.4296828 0.25522954
77: 14 0.9030399 0.25522954
78: 14 0.5757248 0.25522954
79: 14 -1.1732665 0.25522954