I'm trying to find the equivalent of tidyr::unnest()
for a data.table with multiple nested columns:
MT <- as.data.table(mtcars)
MT_NEST_MULT <- MT[, .(data1 = .(.SD[, .(mpg, hp)]), data2 = .(.SD[, !c("mpg", "hp")])), by = .(cyl, gear)]
cyl gear data1 data2
8 3 <S3: data.table> <S3: data.table>
8 5 <S3: data.table> <S3: data.table>
6 4 <S3: data.table> <S3: data.table>
6 3 <S3: data.table> <S3: data.table>
6 5 <S3: data.table> <S3: data.table>
4 4 <S3: data.table> <S3: data.table>
4 3 <S3: data.table> <S3: data.table>
4 5 <S3: data.table> <S3: data.table>
Un-nesting a single column is easy: MT_NEST_MULT[, rbindlist(data1), by = .(cyl, gear)]
But I don't know how to un-nest both, i.e. do the equivalent of tidyr::unnest(..., c(data1, data2))
Thanks !
CodePudding user response:
Here is one option - where we specify the columns to apply the rbindlist
with .SDcols
, loop over the .SD
(Subset of Data.table), apply rbindlist
and flatten the output with c
library(data.table)
MT_NEST_MULT[, do.call(c, unname(lapply(.SD, rbindlist))),
.SDcols = patterns('data'), by = .(cyl, gear)]
-output
cyl gear mpg hp disp drat wt qsec vs am carb
<num> <num> <num> <num> <num> <num> <num> <num> <num> <num> <num>
1: 6 4 21.0 110 160.0 3.90 2.620 16.46 0 1 4
2: 6 4 21.0 110 160.0 3.90 2.875 17.02 0 1 4
3: 6 4 19.2 123 167.6 3.92 3.440 18.30 1 0 4
4: 6 4 17.8 123 167.6 3.92 3.440 18.90 1 0 4
5: 4 4 22.8 93 108.0 3.85 2.320 18.61 1 1 1
6: 4 4 24.4 62 146.7 3.69 3.190 20.00 1 0 2
7: 4 4 22.8 95 140.8 3.92 3.150 22.90 1 0 2
8: 4 4 32.4 66 78.7 4.08 2.200 19.47 1 1 1
9: 4 4 30.4 52 75.7 4.93 1.615 18.52 1 1 2
10: 4 4 33.9 65 71.1 4.22 1.835 19.90 1 1 1
...