I have a nested list of lists which contains some data frames. However, the data frames can appear at any level in the list. What I want to end up with is a flat list, i.e. just one level, in which every element is a data frame and everything else is discarded.
I have come up with a solution for this, but it looks very clunky and I am sure there ought to be a more elegant solution.
Importantly, I'm looking for something in base R that can extract data frames at any level inside the nested list. I have tried unlist() and dabbled with rapply(), but have not found a satisfying solution.
Example code follows: an example list, what I am actually trying to achieve, and my own solution which I am not very happy with. Thanks for any help!
# extract dfs from list
# example of multi-level list with some dfs in it
# note, dfs could be nested at any level
problem1 <- list(x1 = 1,
                 x2 = list(
                   x3 = "dog",
                   x4 = data.frame(cats = c(1, 2),
                                   pigs = c(3, 4))
                 ),
                 x5 = data.frame(sheep = c(1, 2, 3),
                                 goats = c(4, 5, 6)),
                 x6 = list(a = 2,
                           b = "c"),
                 x7 = head(cars, 5))

# want to end up with flat list like this (names format is optional)
result1 <- list(x2.x4 = data.frame(cats = c(1, 2),
                                   pigs = c(3, 4)),
                x5 = data.frame(sheep = c(1, 2, 3),
                                goats = c(4, 5, 6)),
                x7 = head(cars, 5))
# my solution (not very satisfactory)
exit_loop <- FALSE
while (exit_loop == FALSE) {
  # find dfs (logical)
  idfs <- sapply(problem1, is.data.frame)
  # check if all data frames
  exit_loop <- all(idfs)
  # remove anything not df or list
  problem1 <- problem1[idfs | sapply(problem1, is.list)]
  # find dfs again (logical)
  idfs <- sapply(problem1, is.data.frame)
  # unlist only the non-df part
  problem1 <- c(problem1[idfs], unlist(problem1[!idfs], recursive = FALSE))
}
CodePudding user response:
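Maybe consider a simple recursive function like this:
find_df <- function(x) {
  # a data frame is a hit: wrap it so it survives the unlist() below
  if (is.data.frame(x))
    return(list(x))
  # anything else that is not a list is discarded
  if (!is.list(x))
    return(NULL)
  # recurse into the list and strip one level of nesting
  unlist(lapply(x, find_df), FALSE)
}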
Results
> find_df(problem1)
$x2.x4
  cats pigs
1    1    3
2    2    4

$x5
  sheep goats
1     1     4
2     2     5
3     3     6

$x7
  speed dist
1     4    2
2     4   10
3     7    4
4     7   22
5     8   16
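In case it helps, here is a self-contained sketch for checking the function above against the example from the question. It only uses base R. The `deep` list is a made-up extra case (not from the question) to show that deeper nesting is handled the same way.

```r
# Same recursive function as above, repeated so this snippet runs on its own.
find_df <- function(x) {
  if (is.data.frame(x)) return(list(x))
  if (!is.list(x)) return(NULL)
  unlist(lapply(x, find_df), recursive = FALSE)
}

# Example list from the question.
problem1 <- list(
  x1 = 1,
  x2 = list(x3 = "dog",
            x4 = data.frame(cats = c(1, 2), pigs = c(3, 4))),
  x5 = data.frame(sheep = c(1, 2, 3), goats = c(4, 5, 6)),
  x6 = list(a = 2, b = "c"),
  x7 = head(cars, 5)
)

found <- find_df(problem1)

# Every element of the flat result should be a data frame.
all_dfs <- all(vapply(found, is.data.frame, logical(1)))

# Hypothetical extra case: a data frame three levels down.
deep <- list(a = list(b = list(c = data.frame(z = 1))))
deep_found <- find_df(deep)

print(names(found))
print(names(deep_found))
```

The dotted names are a side effect of unlist(), which joins the outer and inner names with a "." each time it removes a level.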
CodePudding user response:
There is a package called rrapply (on CRAN, so not base R) whose rrapply() function can do this. The only downside is that I do not get the required names, since only the innermost name is kept:
rrapply::rrapply(problem1, is.data.frame, classes = 'data.frame', how = 'flatten')
$x4
  cats pigs
1    1    3
2    2    4

$x5
  sheep goats
1     1     4
2     2     5
3     3     6

$x7
  speed dist
1     4    2
2     4   10
3     7    4
4     7   22
5     8   16
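If the full path is wanted in the names and base R is acceptable, one option is to build the names by hand while recursing. This is only a sketch: `find_df_paths` and its `path` and `sep` arguments are names made up for illustration, and falling back to the position for unnamed elements is an assumption about what you would want.

```r
# Hypothetical helper (not from any package): collects data frames and
# names each one by its full path in the nested list.
find_df_paths <- function(x, path = character(0), sep = ".") {
  if (is.data.frame(x)) {
    # name the hit by the path that led to it
    return(setNames(list(x), paste(path, collapse = sep)))
  }
  if (!is.list(x)) return(NULL)

  nms <- names(x)
  if (is.null(nms)) nms <- rep("", length(x))
  # assumption: unnamed elements are labelled by their position
  unnamed <- nms == ""
  nms[unnamed] <- as.character(seq_along(x))[unnamed]

  out <- list()
  for (i in seq_along(x)) {
    out <- c(out, find_df_paths(x[[i]], c(path, nms[i]), sep))
  }
  out
}

# Example list from the question.
problem1 <- list(
  x1 = 1,
  x2 = list(x3 = "dog",
            x4 = data.frame(cats = c(1, 2), pigs = c(3, 4))),
  x5 = data.frame(sheep = c(1, 2, 3), goats = c(4, 5, 6)),
  x6 = list(a = 2, b = "c"),
  x7 = head(cars, 5)
)

res_dot   <- find_df_paths(problem1)
res_slash <- find_df_paths(problem1, sep = "/")

print(names(res_dot))
print(names(res_slash))
```

Unlike the unlist() approach, the separator is a free choice here, which may be handy when the element names themselves contain dots.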