Converting Lists With Subelements into a Data Frame-CodePudding

I have this list in R:

my_list  <- list(NULL, NULL, NULL, list(4L, c(66.4, 12.1)), list(5L, c(66.9, 12.8)), list(6L, c(67.4, 12.9)))

This looks like this:

> my_list
[[1]]
NULL

[[2]]
NULL

[[3]]
NULL

[[4]]
[[4]][[1]]
[1] 4

[[4]][[2]]
[1] 66.4 12.1


[[5]]
[[5]][[1]]
[1] 5

[[5]][[2]]
[1] 66.9 12.8


[[6]]
[[6]][[1]]
[1] 6

[[6]][[2]]
[1] 67.4 12.9

I would like to convert this to the following format:

  col1 col2 col3
1    1 NULL NULL
2    2 NULL NULL
3    3 NULL NULL
4    4 66.4 12.1
5    5 66.9 12.8
6    6 67.4 12.9

Normally, I would use "rbind.data.frame" for problems like this, but here it returns this error:

final = do.call(rbind.data.frame , my_list)

Error in (function (..., deparse.level = 1, make.row.names = TRUE, stringsAsFactors = FALSE,  : 
  invalid list argument: all variables should have the same length

I tried to do some research to see if there was a standard approach for doing this. For example:

# the numeric pair is the second sub-element, so index x[[2]], not x[[1]]
c1 = sapply(my_list, function(x) x[[2]][1])
c2 = sapply(my_list, function(x) x[[2]][2])


c1[sapply(c1, is.null)] <- NA
col2 = unlist(c1)

c2[sapply(c2, is.null)] <- NA
col3 = unlist(c2)

# now, how do I create col1?

d = data.frame(col2,col3)

  col2 col3
1   NA   NA
2   NA   NA
3   NA   NA
4 66.4 12.1
5 66.9 12.8
6 67.4 12.9

From here, I do not know how to create "col1" by directly selecting from "my_list". I know that I can create "col1" after the fact by d$col1 = 1:nrow(d) - but just to stay on the safe side, I would like to create col1 using an "sapply" statement as I did for col2 and col3. This way, in case some elements in the list get corrupted, I will be directly pulling the original numbers instead of creating these numbers after the fact.

Can someone please show me how to do this?

Thank you!

CodePudding user response：

This is how I would go about it.

Just be aware that I have converted NULL to NA, which may affect any downstream usage you had planned. I did this so that rbindlist returns a row for each list element.

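library(dplyr)
library(tidyr)  # provides pivot_wider()

# Replace each NULL element with a one-row placeholder so it still yields a row
my_list2 = lapply(my_list, function(x) {
  if (is.null(x)) data.frame(V1 = NA) else x
})

data.table::rbindlist(my_list2, idcol = TRUE, fill = TRUE) %>%
  group_by(.id) %>%
  mutate(id = row_number()) %>%
  pivot_wider(names_from = id, values_from = V2) %>%
  select(!V1)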

CodePudding user response：

library(tibble)
library(tidyr)
library(dplyr)

enframe(my_list, name  = "col1") |> 
  unnest_wider(col = value, names_sep = "_") |> 
  unnest_wider(col = value_2, names_sep = "_") |> 
  select(col1, col2 = value_2_1, col3 = value_2_2)

Output:

   col1  col2  col3
  <int> <dbl> <dbl>
1     1  NA    NA  
2     2  NA    NA  
3     3  NA    NA  
4     4  66.4  12.1
5     5  66.9  12.8
6     6  67.4  12.9

Requirements:

R 4.1 or newer (for the native pipe |>)
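
For comparison, the sapply-style extraction the question asks about can also be sketched in base R with no extra packages. This is only a sketch: it assumes every non-NULL element is a list whose first entry is a single integer index and whose second entry is a numeric vector of length 2. The fallback to the list position for NULL elements is also an assumption, since those elements store no index of their own.

```r
my_list <- list(NULL, NULL, NULL,
                list(4L, c(66.4, 12.1)),
                list(5L, c(66.9, 12.8)),
                list(6L, c(67.4, 12.9)))

# Pull the stored index directly from each element; NULL elements give NA
col1 <- vapply(my_list,
               function(x) if (is.null(x)) NA_integer_ else x[[1]],
               integer(1))

# Pull the two numeric values from the second sub-element
col2 <- vapply(my_list,
               function(x) if (is.null(x)) NA_real_ else x[[2]][1],
               numeric(1))
col3 <- vapply(my_list,
               function(x) if (is.null(x)) NA_real_ else x[[2]][2],
               numeric(1))

# Assumption: where no index was stored, fall back to the list position
col1[is.na(col1)] <- which(is.na(col1))

d <- data.frame(col1, col2, col3)
d
```

vapply is used instead of sapply so that each result is checked against a declared type and length. A malformed element then raises an error instead of silently changing the shape of the result.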