I am working with a large nested list of tibbles. A previous post already helped me out, but I am stuck at the last step: forming a usable dataframe out of the nested list.
The dataframe should contain an 'id' column that shows the name each tibble has within the list. I tried bind_rows(.id = 'id'),
but it discards the names and fills the column with a numeric index. How can I avoid this?
Here is a minimized version of my problem:
(I am not sure the example is precise enough, since I was not able to name each list element, but I hope the idea comes across.)
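library(rrapply)
library(dplyr)

a <- tibble(a = numeric(7),
            b = letters[7:1],
            c = integer(length = 1))
b <- tibble(a = integer(length = 1),
            b = as.numeric(8),
            c = letters[7:1])
# tibble with two rows and no columns; the condition below filters it out
c <- tibble(.rows = 2)
A <- list(list(a, b, c))
B <- list(A, list(a, b, c))
C <- list(A, B)
riddle <- list(A, B, C)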
Below is the code I run to get my original data into shape. As you will see, the id column only gets numeric indexes, both for this example and for my original data:
rrapply(riddle, condition = function(x) all(dim(x) > 0),
        f = function(x) {
          # change to unique column names
          names(x) <- make.unique(names(x))
          x %>%
            # convert all columns to character if there
            # are mismatch in column types in any list elements
            mutate(across(everything(), as.character))
        }, classes = "data.frame", how = "flatten") %>%
  # bind the flattened list of data.frame/tibbles to single dataset
  bind_rows(.id = "id") %>%
  # do the column type conversion
  type.convert(as.is = TRUE)
Supposing my example had names for the 12 values of id: which command would I need, and how would I use it, to get the names of the objects as values for the .id column?
CodePudding user response:
If the list has names, we can extract them and build the 'id' column from the names of the list elements.
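As a quick check of that premise on a flat list, here is a minimal sketch. The element names `first` and `second` are made up for illustration and do not come from the question. With a named list, bind_rows(.id = ) fills the column from the names; with an unnamed list it falls back to positions, which is the behaviour described in the question.

```r
library(dplyr)

# hypothetical flat list with named elements
named_list <- list(first = tibble(x = 1:2), second = tibble(x = 3L))
named <- bind_rows(named_list, .id = "id")
# the id column holds "first", "first", "second"

# the same data without names: the id column falls back to positions
unnamed <- bind_rows(unname(named_list), .id = "id")
# the id column holds "1", "1", "2"
```

In the nested case the names sit at several levels of the list, so they have to be collected inside the function passed to rrapply, as in the code that follows.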
library(rrapply)
library(dplyr)
library(stringr)
A <- list(list(a, b, c))
B <- list(A = A, list(a, b, c))
C <- list(A = A, B = B)
riddle <- list(A = A, B = B, C = C)
-testing
out <- rrapply(riddle, condition = function(x) all(dim(x) > 0),
               f = function(x, .xparents) {
                 # change to unique column names
                 names(x) <- make.unique(names(x))
                 x %>%
                   mutate(id = str_c(setdiff(.xparents, ""),
                                     collapse = "_"), .before = 1) %>%
                   # convert all columns to character if there
                   # are mismatch in column types in any list elements
                   mutate(across(everything(), as.character))
               }, classes = "data.frame", how = "flatten") %>%
  bind_rows() %>%
  type.convert(as.is = TRUE)
-output
> out
# A tibble: 84 × 4
   id        a b     c
   <chr> <int> <chr> <chr>
 1 A_1       0 g     0
 2 A_1       0 f     0
 3 A_1       0 e     0
 4 A_1       0 d     0
 5 A_1       0 c     0
 6 A_1       0 b     0
 7 A_1       0 a     0
 8 A_1_2     0 8     g
 9 A_1_2     0 8     f
10 A_1_2     0 8     e
# … with 74 more rows
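One thing to be aware of: setdiff() returns unique values, so it removes repeated path components as well as the empty strings. The sketch below uses only base R and assumes that .xparents for the first tibble is c("A", "1", "1"). That value is a guess that fits the "A_1" id shown above; it has not been checked against rrapply itself. If the full path is wanted, subsetting out the empty strings keeps every level.

```r
# assumed value of .xparents for the first tibble (not confirmed)
path <- c("A", "1", "1")

# setdiff() drops "" but also collapses the repeated "1"
deduplicated <- paste(setdiff(path, ""), collapse = "_")

# plain subsetting drops only the empty strings
full_path <- paste(path[path != ""], collapse = "_")

print(deduplicated)  # "A_1"
print(full_path)     # "A_1_1"
```

Inside the function passed to rrapply this would mean replacing setdiff(.xparents, "") with .xparents[.xparents != ""], if ids that reflect every nesting level are preferred.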