I'm sure this is a dumb question...but I've looked around and can't seem to find anything.
Say I have two tables:
a <- data.table(row_id = 1:12)
b <- data.table(start_row = c(1,5,10),end_row = c(3,7,11),value = c("a","b","c"))
> a
row_id
1: 1
2: 2
3: 3
4: 4
5: 5
6: 6
7: 7
8: 8
9: 9
10: 10
11: 11
12: 12
> b
start_row end_row value
1: 1 3 a
2: 5 7 b
3: 10 11 c
I'd like to join using the row indices from b
to insert the value
into a
, to produce something like this:
> c
row_id value
1: 1 a
2: 2 a
3: 3 a
4: 4 <NA>
5: 5 b
6: 6 b
7: 7 b
8: 8 <NA>
9: 9 <NA>
10: 10 c
11: 11 c
12: 12 <NA>
What I'm doing right now is using a for loop, which is slow:
for (i in 1:nrow(s)) {
a[row_id >= b[i,start_row] & row_id <= b[i,end_row],value := b[i,value]]
}
I've thought about dcasting b
into a long form of sort and then joining, but what I really need is someway to generate the sequences between start and end rows before joining. Any suggestions welcome!
CodePudding user response:
We just need a non-equi join
library(data.table)
a[b, value := i.value, on = .(row_id >= start_row, row_id <= end_row)]
-output
> a
row_id value
1: 1 a
2: 2 a
3: 3 a
4: 4 <NA>
5: 5 b
6: 6 b
7: 7 b
8: 8 <NA>
9: 9 <NA>
10: 10 c
11: 11 c
12: 12 <NA>