Assuming we have a data.table
with a nested column val
dt <- data.table(
grp = c(1, 2, 1, 3, 4),
val = list("a", c("b", "c"), c("d", "e", "f"), "g", c("h", "i"))
)
which shows as
> dt
grp val
1: 1 a
2: 2 b,c
3: 1 d,e,f
4: 3 g
5: 4 h,i
Question
I would like to unnest
the val
column, where a possible option is using tidyr::unnest
, i.e.,
> dt %>%
unnest(val)
# A tibble: 9 × 2
grp val
<dbl> <chr>
1 1 a
2 2 b
3 2 c
4 1 d
5 1 e
6 1 f
7 3 g
8 4 h
9 4 i
I am wondering if we can implement it by using the data.table
only.
Watch out the order of values in the column grp
, I would like to preserve the order like 1,2,1,3,4
rather than 1,1,2,3,4
.
My data.table
Attempt
My attempt is as below
> dt[, id := .I][, lapply(.SD, unlist), id][, id := NULL][]
grp val
1: 1 a
2: 2 b
3: 2 c
4: 1 d
5: 1 e
6: 1 f
7: 3 g
8: 4 h
9: 4 i
or
> dt[,.(grp = rep(grp,lengths(val)), val = unlist(val))]
grp val
1: 1 a
2: 2 b
3: 2 c
4: 1 d
5: 1 e
6: 1 f
7: 3 g
8: 4 h
9: 4 i
but I guess there might be some more concise and elegant way to do this, e.g., without creating an auxiliary column id
or using rep
lengths
.
Any idea? Much appreciated!
CodePudding user response:
One more option:
dt[, .(grp, val = unlist(val)), by = .I][, !"I"]
# grp val
# <num> <char>
# 1: 1 a
# 2: 2 b
# 3: 2 c
# 4: 1 d
# 5: 1 e
# 6: 1 f
# 7: 3 g
# 8: 4 h
# 9: 4 i
PS. If you are using data.table 1.4.2
or older by = .I
won't work. Instead you can use:
dt[, .(grp, val = unlist(val)), by = 1:nrow(dt)][, !"nrow"]
PS2. I think your dt[,.(grp = rep(grp,lengths(val)), val = unlist(val))]
is neater.
CodePudding user response:
I guess this works:
dt[, lapply(.SD, unlist), by = .(grp)]
grp val
1: 1 a
2: 1 d
3: 1 e
4: 1 f
5: 2 b
6: 2 c
7: 3 g
8: 4 h
9: 4 i