I need to flatten an arbitrarily nested list into a data frame, keeping the path of keys / indices in one column and putting each bottom-level element in its own row.
Consider the following list:
lst <- list(
  animals = list(
    lamas = c("brown", "white"),
    primates = list(
      humans = c("asia", "europe"),
      apes = c("good", "fast", "angry")
    )
  ),
  objects = c("expensive", "cheap"),
  plants = NULL
)
The result of flatten_list(lst, delimiter = "_") should look like this:
data.frame(
  path = c("animals_lamas", "animals_lamas", "animals_primates_humans", "animals_primates_humans", "animals_primates_apes", "animals_primates_apes", "animals_primates_apes", "objects", "objects", "plants"),
  value = c("brown", "white", "asia", "europe", "good", "fast", "angry", "expensive", "cheap", NA)
)
I was surprised that I couldn't achieve this with tidyr or data.table. Do I need a recursive function, or is there an out-of-the-box solution for this? Any help is appreciated!
EDIT: The solution provided by akrun worked on the original data. I realized that there is a problem when an element at the bottom level is NULL and hence rephrased the problem.
EDIT2: My current workaround is to recursively replace NULL by NA before applying akrun's solution, using the function supplied here [again by akrun ;) ].
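The linked function is not reproduced in this thread, so the following is only a minimal sketch of what such a workaround could look like. The name replace_null is hypothetical, not the function from the link:

```r
# Hypothetical helper (not the linked function): recursively replace
# NULL leaves with NA so that downstream melting keeps a row for them.
replace_null <- function(x) {
  if (is.null(x)) return(NA)                         # NULL leaf becomes NA
  if (is.list(x)) return(lapply(x, replace_null))    # recurse into sub-lists
  x                                                  # other leaves are unchanged
}

# Small example list with one NULL leaf
lst <- list(
  animals = list(lamas = c("brown", "white")),
  plants = NULL
)

cleaned <- replace_null(lst)
str(cleaned)
```

Because lapply() preserves names, the key paths are unchanged and only the NULL leaves are swapped for NA.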
CodePudding user response:
It can be done by melting the list into a data.frame and then combining the key columns with unite:
library(reshape2)
library(dplyr)
library(tidyr)
out2 <- melt(lst) %>%
  unite(path, L1:L3, sep = "_", na.rm = TRUE) %>%
  select(path, value)
Checking against the OP's expected output:
> all.equal(out, out2)
[1] TRUE
CodePudding user response:
A solution that can deal with NULL, based on rrapply:
library(tidyverse)
library(rrapply)
rrapply(lst, f = \(x) if (is.null(x)) NA else x, how = "melt") %>%
  unnest(value) %>%
  unite(path, L1:L3, na.rm = TRUE)
#> # A tibble: 10 × 2
#>    path                    value
#>    <chr>                   <chr>
#>  1 animals_lamas           brown
#>  2 animals_lamas           white
#>  3 animals_primates_humans asia
#>  4 animals_primates_humans europe
#>  5 animals_primates_apes   good
#>  6 animals_primates_apes   fast
#>  7 animals_primates_apes   angry
#>  8 objects                 expensive
#>  9 objects                 cheap
#> 10 plants                  <NA>
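For comparison, since the question also asks whether a recursive function is needed: a dependency-free version can be sketched in base R. This is only an illustration. The name flatten_list and the delimiter argument follow the question; the path argument and the use of positional indices for unnamed elements are assumptions of this sketch.

```r
# Sketch of a recursive flatten_list() in base R.
# `path` is an internal accumulator (an assumption of this sketch).
flatten_list <- function(x, delimiter = "_", path = NULL) {
  key <- paste(path, collapse = delimiter)
  # NULL or empty leaf: keep the path, with a single NA value
  if (length(x) == 0) {
    return(data.frame(path = key, value = NA_character_,
                      stringsAsFactors = FALSE))
  }
  # Atomic leaf: one row per element
  if (!is.list(x)) {
    return(data.frame(path = key, value = as.character(x),
                      stringsAsFactors = FALSE))
  }
  # List node: use names where present, positional indices otherwise
  keys <- names(x)
  if (is.null(keys)) keys <- rep("", length(x))
  keys[keys == ""] <- as.character(seq_along(x))[keys == ""]
  parts <- lapply(seq_along(x), function(i) {
    flatten_list(x[[i]], delimiter, c(path, keys[i]))
  })
  do.call(rbind, parts)
}

lst <- list(
  animals = list(
    lamas = c("brown", "white"),
    primates = list(
      humans = c("asia", "europe"),
      apes = c("good", "fast", "angry")
    )
  ),
  objects = c("expensive", "cheap"),
  plants = NULL
)

res <- flatten_list(lst, delimiter = "_")
print(res)
```

Unlike the melt-based approaches, this does not depend on the nesting depth (no L1:L3 column range), but all values are coerced to character.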