Converting Lists With Subelements into a Data Frame

Time:07-15

I have this list in R:

my_list  <- list(NULL, NULL, NULL, list(4L, c(66.4, 12.1)), list(5L, c(66.9, 12.8)), list(6L, c(67.4, 12.9)))

It prints as follows:

> my_list
[[1]]
NULL

[[2]]
NULL

[[3]]
NULL

[[4]]
[[4]][[1]]
[1] 4

[[4]][[2]]
[1] 66.4 12.1


[[5]]
[[5]][[1]]
[1] 5

[[5]][[2]]
[1] 66.9 12.8


[[6]]
[[6]][[1]]
[1] 6

[[6]][[2]]
[1] 67.4 12.9

I would like to convert this to the following format:

  col1 col2 col3
1    1 NULL NULL
2    2 NULL NULL
3    3 NULL NULL
4    4 66.4 12.1
5    5 66.9 12.8
6    6 67.4 12.9

Normally, I would use "rbind.data.frame" for problems like this, but here it returns an error:

final = do.call(rbind.data.frame , my_list)

Error in (function (..., deparse.level = 1, make.row.names = TRUE, stringsAsFactors = FALSE,  : 
  invalid list argument: all variables should have the same length
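
A quick way to see the mismatch behind this error is to inspect the lengths involved. This is only a sketch; the variable names outer_lengths and inner_lengths are made up for illustration. Each non-NULL element holds one component of length 1 (the ID) and one of length 2 (the two values), and that unequal pair is what "all variables should have the same length" refers to.

```r
my_list <- list(NULL, NULL, NULL,
                list(4L, c(66.4, 12.1)),
                list(5L, c(66.9, 12.8)),
                list(6L, c(67.4, 12.9)))

# Number of components in each top-level element (0 for NULL)
outer_lengths <- lengths(my_list)
print(outer_lengths)   # 0 0 0 2 2 2

# Lengths of the components inside one non-NULL element:
# the ID has length 1, the value vector has length 2
inner_lengths <- lengths(my_list[[4]])
print(inner_lengths)   # 1 2
```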

I tried to do some research to see if there was a standard approach for doing this. For example:

# the two values live in the second component of each element
c1 = sapply(my_list,function(x) x[[2]][1])
c2 = sapply(my_list,function(x) x[[2]][2])


c1[sapply(c1, is.null)] <- NA
col2 = unlist(c1)

c2[sapply(c2, is.null)] <- NA
col3 = unlist(c2)

# now, how do I create col1?

d = data.frame(col2,col3)

  col2 col3
1   NA   NA
2   NA   NA
3   NA   NA
4 66.4 12.1
5 66.9 12.8
6 67.4 12.9

From here, I do not know how to create "col1" by selecting directly from "my_list". I know I can create it after the fact with d$col1 = 1:nrow(d), but to stay on the safe side I would like to create col1 with an "sapply" statement, as I did for col2 and col3. That way, if some elements in the list get corrupted, I will be pulling the original numbers directly instead of generating them afterwards.

  • Can someone please show me how to do this?

Thank you!

CodePudding user response:

This is how I would go about it.

Just be aware that I have converted NULL to NA, which may affect any downstream usage you had planned. I did this so that rbindlist returns a row for each list element.

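library(dplyr)
library(tidyr)  # pivot_wider() comes from tidyr

# Replace each NULL element with a one-row placeholder so it still yields a row
my_list2 = lapply(my_list, function(x) {
  if (is.null(x)) data.frame(V1 = NA) else x
})

data.table::rbindlist(my_list2, idcol = TRUE, fill = TRUE) %>%
  group_by(.id) %>%
  mutate(id = row_number()) %>%   # position of each value within its element
  pivot_wider(names_from = id, values_from = V2) %>%
  select(!V1)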

CodePudding user response:

library(tibble)
library(tidyr)
library(dplyr)

enframe(my_list, name  = "col1") |> 
  unnest_wider(col = value, names_sep = "_") |> 
  unnest_wider(col = value_2, names_sep = "_") |> 
  select(col1, col2 = value_2_1, col3 = value_2_2)

Output:

   col1  col2  col3
  <int> <dbl> <dbl>
1     1  NA    NA  
2     2  NA    NA  
3     3  NA    NA  
4     4  66.4  12.1
5     5  66.9  12.8
6     6  67.4  12.9

Requirements:

  • R 4.1 or newer
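
For reference, here is a base R sketch in the spirit of the sapply attempt from the question. It reads the stored ID from the first component of each element rather than generating it afterwards. The helper pluck_or_na is a hypothetical name introduced here, and falling back to the list position for NULL elements is an assumption about the desired behaviour, not something the question specifies.

```r
my_list <- list(NULL, NULL, NULL,
                list(4L, c(66.4, 12.1)),
                list(5L, c(66.9, 12.8)),
                list(6L, c(67.4, 12.9)))

# Hypothetical helper: return position j of component i of one list
# element, or NA when the element, component, or position is missing.
pluck_or_na <- function(x, i, j = 1L) {
  if (is.null(x) || length(x) < i || length(x[[i]]) < j) return(NA_real_)
  as.numeric(x[[i]][j])
}

# vapply() always returns a numeric vector, so no unlist() step is needed
stored_id <- vapply(my_list, pluck_or_na, numeric(1), i = 1L)
col2      <- vapply(my_list, pluck_or_na, numeric(1), i = 2L, j = 1L)
col3      <- vapply(my_list, pluck_or_na, numeric(1), i = 2L, j = 2L)

# Assumption: where no ID is stored (NULL element), use the list position
col1 <- ifelse(is.na(stored_id), seq_along(my_list), stored_id)

d <- data.frame(col1 = as.integer(col1), col2, col3)
print(d)
```

Because col1 is taken from the list itself wherever an ID exists, comparing stored_id with seq_along(my_list) is one way to spot elements whose ID does not match their position.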