Flatten nested list and retain all parent keys for each bottom-level element

Time:05-14

I need to flatten an arbitrarily nested list to a data frame and retain the path of keys / indices in one column, while extracting each element on the bottom level to an individual row.

Consider the following list:

lst <- list(
    animals = list(
        lamas = c("brown", "white"),
        primates = list(
            humans = c("asia", "europe"),
            apes = c("good", "fast", "angry")
        )
    ),
    objects = c("expensive", "cheap"),
    plants = NULL
)

The result of flatten_list(lst, delimiter="_") should look like this:

out <- data.frame(
  path = c("animals_lamas", "animals_lamas", "animals_primates_humans", "animals_primates_humans", "animals_primates_apes", "animals_primates_apes", "animals_primates_apes", "objects", "objects", "plants"),
  value = c("brown", "white", "asia", "europe", "good", "fast", "angry", "expensive", "cheap", NA)
)

I was surprised that I couldn't achieve this with tidyr or data.table. Do I need a recursive function, or is there an out-of-the-box solution for this? Any help is appreciated!

EDIT: The solution provided by akrun worked on the original data. I then realized that it fails when a bottom-level element is NULL, so I rephrased the question.

EDIT2: My current workaround is to recursively replace NULL with NA before applying akrun's solution, using a function from another answer [again by akrun ;) ].
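For reference, the NULL-to-NA workaround can be sketched roughly as follows. The helper name replace_null is hypothetical, and this is not necessarily the function referred to above; it only assumes that every NULL, at any depth, should become a single NA.

```r
# Hypothetical helper (name is an assumption): walk the list recursively
# and replace every NULL element with NA, leaving other values untouched.
replace_null <- function(x) {
  if (is.null(x)) return(NA)                       # NULL leaf -> NA
  if (is.list(x)) return(lapply(x, replace_null))  # recurse into sub-lists
  x                                                # atomic leaf: keep as is
}

# Small example with a NULL at the top level and one nested level
example_lst <- list(
  animals = list(lamas = c("brown", "white"), birds = NULL),
  plants = NULL
)
cleaned <- replace_null(example_lst)
```

After this step, cleaned has the same names and nesting as example_lst, so a melt-based solution no longer has to deal with NULL elements.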

CodePudding user response:

This can be done by melting the list into a data.frame and then uniting the key columns:

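library(reshape2)
library(dplyr)
library(tidyr)
# melt() gives one row per leaf value, with the keys in columns L1, L2, L3
out2 <- melt(lst) %>%
        # paste the key columns into one path, skipping NA for shallower branches
        unite(path, L1:L3, sep = "_", na.rm = TRUE) %>%
        select(path, value)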

Checking against the OP's expected output:

> all.equal(out, out2)
[1] TRUE

CodePudding user response:

A solution that can deal with NULL, based on rrapply:

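library(tidyverse)
library(rrapply)

# Replace NULL leaves with NA while melting, then expand each leaf vector
# into one row per element and paste the key columns into a single path
rrapply(lst, f = \(x) if (is.null(x)) NA else x, how = "melt") %>%
  unnest(value) %>%
  unite(path, L1:L3, na.rm = TRUE)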

#> # A tibble: 10 × 2
#>    path                    value    
#>    <chr>                   <chr>    
#>  1 animals_lamas           brown    
#>  2 animals_lamas           white    
#>  3 animals_primates_humans asia     
#>  4 animals_primates_humans europe   
#>  5 animals_primates_apes   good     
#>  6 animals_primates_apes   fast     
#>  7 animals_primates_apes   angry    
#>  8 objects                 expensive
#>  9 objects                 cheap    
#> 10 plants                  <NA>
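
On the original question of whether a recursive function is needed: it is not, given the answers above, but a dependency-free alternative is a short recursive function in base R. This is a minimal sketch, assuming leaves are atomic vectors or NULL and that unnamed elements should fall back to their positional index. The name flatten_list and the delimiter argument follow the call in the question; the path argument is an internal assumption of this sketch.

```r
flatten_list <- function(x, delimiter = "_", path = character(0)) {
  if (!is.list(x)) {
    # Leaf: NULL (or any empty vector) becomes one NA row;
    # otherwise one row per element of the vector.
    value <- if (length(x) == 0) NA_character_ else as.character(x)
    return(data.frame(
      path = paste(path, collapse = delimiter),
      value = value,
      stringsAsFactors = FALSE
    ))
  }
  # Use names as keys where present, positional indices otherwise.
  keys <- names(x)
  if (is.null(keys)) keys <- rep("", length(x))
  unnamed <- keys == ""
  keys[unnamed] <- as.character(seq_along(x))[unnamed]
  # Recurse into each child, extending the path by that child's key.
  rows <- lapply(seq_along(x), function(i) {
    flatten_list(x[[i]], delimiter, c(path, keys[i]))
  })
  do.call(rbind, rows)
}

lst <- list(
  animals = list(
    lamas = c("brown", "white"),
    primates = list(
      humans = c("asia", "europe"),
      apes = c("good", "fast", "angry")
    )
  ),
  objects = c("expensive", "cheap"),
  plants = NULL
)
res <- flatten_list(lst, delimiter = "_")
```

All values are coerced to character here so that every leaf fits into the same column; if the leaves have mixed types that need to be preserved, a list column (as in the rrapply approach) is probably the better choice.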