I have this list in R:
my_list <- list(NULL, NULL, NULL, list(4L, c(66.4, 12.1)), list(5L, c(66.9, 12.8)), list(6L, c(67.4, 12.9)))
It prints as follows:
> my_list
[[1]]
NULL
[[2]]
NULL
[[3]]
NULL
[[4]]
[[4]][[1]]
[1] 4
[[4]][[2]]
[1] 66.4 12.1
[[5]]
[[5]][[1]]
[1] 5
[[5]][[2]]
[1] 66.9 12.8
[[6]]
[[6]][[1]]
[1] 6
[[6]][[2]]
[1] 67.4 12.9
I would like to convert this to the following format:
  col1 col2 col3
1    1 NULL NULL
2    2 NULL NULL
3    3 NULL NULL
4    4 66.4 12.1
5    5 66.9 12.8
6    6 67.4 12.9
Normally, I would use "rbind.data.frame" for problems like this, but here it returns an error:
final = do.call(rbind.data.frame , my_list)
Error in (function (..., deparse.level = 1, make.row.names = TRUE, stringsAsFactors = FALSE, :
invalid list argument: all variables should have the same length
I looked for a standard approach and tried the following:
# the numeric pair is the second item of each element, so index with [[2]]
c1 = sapply(my_list, function(x) x[[2]][1])
c2 = sapply(my_list, function(x) x[[2]][2])
c1[sapply(c1, is.null)] <- NA
col2 = unlist(c1)
c2[sapply(c2, is.null)] <- NA
col3 = unlist(c2)
# now, how do I create col1?
d = data.frame(col2,col3)
  col2 col3
1   NA   NA
2   NA   NA
3   NA   NA
4 66.4 12.1
5 66.9 12.8
6 67.4 12.9
From here, I do not know how to create "col1" by selecting directly from "my_list". I know that I can create "col1" after the fact with d$col1 = 1:nrow(d), but to stay on the safe side, I would like to create col1 using an "sapply" statement, as I did for col2 and col3. This way, if some elements in the list get corrupted, I will be pulling the original numbers directly instead of generating them after the fact.
Can someone please show me how to do this?
Thank you!
CodePudding user response:
This is how I would go about it.
Just be aware that I have converted NULL to NA, which may affect any downstream usage you had planned. I did this so that rbindlist will return a row for each list element.
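library(dplyr)
library(tidyr)  # pivot_wider() comes from tidyr

# Replace each NULL element with a one-column NA data frame so it survives rbindlist()
my_list2 = lapply(my_list, function(x) {
  if (is.null(x)) data.frame(V1 = NA) else x
})

data.table::rbindlist(my_list2, idcol = TRUE, fill = TRUE) %>%
  group_by(.id) %>%
  mutate(id = row_number()) %>%
  pivot_wider(names_from = id, values_from = V2) %>%
  select(!V1)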
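The same NULL-to-NA idea can also be sketched in base R alone, without data.table or the tidyverse. This is only a minimal sketch, assuming every non-NULL element has the shape list(id, c(a, b)) as in the question; the helper name to_row is made up for illustration.

```r
# Same input as in the question
my_list <- list(NULL, NULL, NULL,
                list(4L, c(66.4, 12.1)),
                list(5L, c(66.9, 12.8)),
                list(6L, c(67.4, 12.9)))

# Hypothetical helper: turn one list element into a one-row data frame.
# A NULL element becomes a row of NAs so that it is not dropped.
to_row <- function(x) {
  if (is.null(x)) {
    data.frame(col1 = NA_integer_, col2 = NA_real_, col3 = NA_real_)
  } else {
    data.frame(col1 = x[[1]], col2 = x[[2]][1], col3 = x[[2]][2])
  }
}

# Every piece now has the same three columns, so rbind() can stack them
final <- do.call(rbind, lapply(my_list, to_row))
```

Note that col1 is NA for the NULL elements here, because the sketch reads it from the list and does not invent a value.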
CodePudding user response:
library(tibble)
library(tidyr)
library(dplyr)
enframe(my_list, name = "col1") |>
  unnest_wider(col = value, names_sep = "_") |>
  unnest_wider(col = value_2, names_sep = "_") |>
  select(col1, col2 = value_2_1, col3 = value_2_2)
Output:
   col1  col2  col3
  <int> <dbl> <dbl>
1     1  NA    NA
2     2  NA    NA
3     3  NA    NA
4     4  66.4  12.1
5     5  66.9  12.8
6     6  67.4  12.9
Requirements:
- R 4.1 or newer (needed for the native pipe |>)
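On the question's concern about reading col1 from the list itself: enframe() fills col1 with each element's position, not with the number stored inside the element. A possible base R sketch, assuming the stored id is always the first item of a non-NULL element, takes the stored id where it exists and falls back to the position otherwise. The variable names present and ids_match are made up for illustration.

```r
# Same input as in the question
my_list <- list(NULL, NULL, NULL,
                list(4L, c(66.4, 12.1)),
                list(5L, c(66.9, 12.8)),
                list(6L, c(67.4, 12.9)))

# Use the stored id when the element exists (assumption: it is the first item),
# otherwise fall back to the element's position in the list
col1 <- vapply(seq_along(my_list), function(i) {
  x <- my_list[[i]]
  if (is.null(x)) i else as.integer(x[[1]])
}, integer(1))

# Sanity check: do the stored ids agree with their positions?
present <- !vapply(my_list, is.null, logical(1))
ids_match <- all(col1[present] == which(present))
```

If ids_match comes back FALSE, the stored numbers and the positions disagree, which is the kind of corruption the question is worried about.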