Unlist the second to last list of a nested list

Time:03-19

I have a deeply nested list of lists. At the "center" of the nest, each innermost list holds vectors of n integers. I need to count how many integers are in each of these vectors, then unlist one level above to get a vector of the counts (i.e., instead of list(0, 1:5, 0, 0, 1:3) at the center of the nest, I want c(0, 5, 0, 0, 3)).

This seems relatively simple: I was able to use rapply to accomplish the first part, i.e. convert list(0, 1:5, 0, 0, 1:3) to list(0, 5, 0, 0, 3). My specific question is how to unlist the innermost lists into vectors (instead of list(0, 5, 0, 0, 3), I want c(0, 5, 0, 0, 3)).
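As a minimal sketch of the two steps on a single innermost list (the names `inner`, `counts_list`, and `counts_vec` are only for illustration), `rapply` with `how = "list"` replaces each vector with its length while keeping the list wrapper, and `unlist` then collapses that one level. One caveat worth checking against real data: `length(0)` is 1, so a placeholder `0` is counted as one element rather than zero.

```r
# Hypothetical toy input: one innermost list from the question
inner <- list(0, 1:5, 0, 0, 1:3)

# Step 1: replace every vector with its length, keeping the list structure
counts_list <- rapply(inner, length, how = "list")
str(counts_list)   # still a list with five length-1 elements

# Step 2 (the goal): collapse that list of counts into a plain vector
counts_vec <- unlist(counts_list)

# For a single level, lengths() does both steps at once
identical(counts_vec, lengths(inner))
```

The difficulty in the question is applying step 2 at the second-to-last level of a deeper nest rather than at the top.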

I have searched and tried various apply, lapply, and unlist approaches, but none of them is quite right because they target the very innermost level. Since the list I want to unlist is the second-to-last level, I am struggling to find a way to accomplish this elegantly.

In the sample data below, I can get the desired outcome in two ways: either nested lapply calls or a for loop. However, my actual data contain many more lists and millions of data points, so these are unlikely to be practical options.

Below is (1) sample data, (2) what I have tried, and (3) sample data having the desired structure.

Sample Data

have_list <- list(scenario1 = list(method1 = list(place1 = list(0, 1:5, 0, 0, 1:3),
                                                  place2 = list(1:2, 0, 1:10, 0, 0),
                                                  place3 = list(0:19, 0, 0, 0, 0),
                                                  place4 = list(1:100, 0, 0, 1:4, 0)),
                                   method2 = list(place1 = list(1:5, 1:5, 0, 0, 1:3),
                                                  place2 = list(0, 0, 1:5, 0, 0),
                                                  place3 = list(0:19, 0, 1:7, 0, 0),
                                                  place4 = list(1:22, 0, 0, 1:4, 0)),
                                   method3 = list(place1 = list(0, 1:2, 1:6, 0, 1:3),
                                                  place2 = list(1:2, 0, 1:6, 1:4, 0),
                                                  place3 = list(0:19, 0, 0, 0, 1:2),
                                                  place4 = list(1:12, 0, 0, 1:12, 0))),
                  scenario2 = list(method1 = list(place1 = list(0, 1:5, 0, 0, 1:3),
                                                  place2 = list(1:2, 0, 1:10, 0, 0),
                                                  place3 = list(0:19, 0, 0, 0, 0),
                                                  place4 = list(1:100, 0, 0, 1:4, 0)),
                                   method2 = list(place1 = list(1:5, 1:5, 0, 0, 1:3),
                                                  place2 = list(0, 0, 1:5, 0, 0),
                                                  place3 = list(0:19, 0, 1:7, 0, 0),
                                                  place4 = list(1:22, 0, 0, 1:4, 0)),
                                   method3 = list(place1 = list(0, 1:2, 1:6, 0, 1:3),
                                                  place2 = list(1:2, 0, 1:6, 1:4, 0),
                                                  place3 = list(0:19, 0, 0, 0, 1:2),
                                                  place4 = list(1:12, 0, 0, 1:12, 0))))

What I have tried

# Get the number of integers in each innermost vector
lengths <- rapply(have_list, function(x) unlist(length(x)), how = "list") # this works fine

# Each count is still wrapped in its own list element of length 1.
# Goal: collapse each innermost list of counts into a vector.
# In the "middle" of the nested list:
    # I have list(0, 5, 0, 0, 3)
    # I want c(0, 5, 0, 0, 3)

# Attempts to unlist the counts
test1 <- rapply(lengths, unlist, how = "list") # doesn't work
test2 <- unlist(lengths, recursive = FALSE) # doesn't work
test3 <- lapply(lengths, function(x) lapply(x, unlist)) # doesn't work
test4 <- lapply(lengths, function(x) lapply(x, unlist, recursive = FALSE)) # doesn't work
test5 <- rapply(have_list, function(x) unlist(length(x)), how = "list") # doesn't work
test6 <- rapply(have_list, function(x) unlist(length(x)), how = "unlist") # doesn't work
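To see why these attempts miss the target, it can help to inspect a small case. The sketch below uses a hypothetical list named `toy` (not part of the question's data). It suggests that `unlist(recursive = FALSE)` strips the outermost level rather than the innermost one, and that `rapply` only passes non-list leaves to its function, so `unlist` has nothing to flatten there.

```r
# Hypothetical toy list: scenario -> method -> list of counts
toy <- list(s1 = list(m1 = list(1L, 5L), m2 = list(2L, 3L)))

# recursive = FALSE removes the OUTER level: the names are merged,
# but the innermost lists are still lists
flat <- unlist(toy, recursive = FALSE)
names(flat)            # "s1.m1" "s1.m2"
is.list(flat$s1.m1)    # TRUE

# rapply() calls the function on each individual leaf, so unlist()
# never sees the list that wraps the counts
same <- rapply(toy, unlist, how = "list")
is.list(same$s1$m1)    # TRUE: the innermost level is still a list
```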

Data structure I want

# This works on test data but is impractical for real data
want_list <- lapply(lengths, function(w) lapply(w, function(x) lapply(x, unlist)))

# or

want_list <- lengths 

## The for loops work but are not practical

for (i in seq_along(lengths)) {
  for (j in seq_along(lengths[[i]])) {
    for (k in seq_along(lengths[[i]][[j]])) {
      want_list[[i]][[j]][[k]] <- unlist(lengths[[i]][[j]][[k]])
    }
  }
}

CodePudding user response:

One option is to melt the nested list with rrapply, replace the 'value' column with the lengths of its entries, and then use the recursive split (rsplit) from collapse:

library(rrapply)
library(collapse)
dat <- transform(rrapply(have_list, how = "melt"), value = lengths(value))
out <- rsplit(dat$value, dat[1:3])

Testing against the OP's expected output:

identical(out, want_list)
[1] TRUE

CodePudding user response:

This can be done with recursion. A simple recursive function would be:

my_fun <- function(x) if(is.list(x[[1]])) lapply(x, my_fun) else lengths(x)

out <- my_fun(have_list)

identical(out, want_list)
[1] TRUE
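
To make the recursion concrete, here is a small sketch on a hypothetical input (`small` and `count_leaves` are illustrative names, not from the answer). It relies on the same assumption as `my_fun`: every branch has the same depth, so checking only the first element is enough to tell whether the second-to-last level has been reached.

```r
count_leaves <- function(x) {
  # If the first child is itself a list, we are not yet at the
  # second-to-last level: recurse into every child.
  if (is.list(x[[1]])) {
    lapply(x, count_leaves)
  } else {
    # x is a list of atomic vectors: return their lengths as one vector
    lengths(x)
  }
}

# Hypothetical toy input with two innermost lists
small <- list(a = list(p = list(0, 1:5, 0, 0, 1:3),
                       q = list(1:2, 0)))
res <- count_leaves(small)
str(res)
```

Each level is visited once, so the cost should grow roughly linearly with the number of list elements, although that has not been benchmarked here. If branches can have different depths, the `is.list(x[[1]])` test would likely need to inspect every child rather than only the first.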