Unnesting Lists Inside of Dataframes

Time:10-06

I have two lists that are the result of a for loop:

pid_list <- List of IDs

violations <- List of Violations

Now, when I run the following:

df <- cbind(pid_list, violations) %>%
    as.data.frame()
head(df)

I end up with a data frame that has an ID column on the left and a list column on the right, where each cell holds the violations for that ID.
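For context, here is a minimal sketch of why this happens. The three-element lists below are hypothetical stand-ins for the real loop output, not the actual data. Calling cbind() on two lists gives a matrix of mode "list", and as.data.frame() then keeps every column as a list:

```r
# Hypothetical stand-ins for the two lists produced by the loop
pid_list   <- list("a", "b", "c")
violations <- list(integer(0), 381:383, integer(0))

m  <- cbind(pid_list, violations)  # a 3 x 2 matrix whose cells are list elements
df <- as.data.frame(m)

is.list(df$pid_list)     # the ID column is a list, not a character vector
is.list(df$violations)   # the violations column is a list as well
lengths(df$violations)   # how many values sit inside each cell
```

Because some cells hold zero values and others hold several, the columns have to be expanded row by row before they can be used as ordinary vectors.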

What I would like the result to be is:

pid_list violations
-6928390558574835
-6141623232242576
-4584655687749366
-2856811727809710 381
-2856811727809710 382
-2856811727809710 383
-2856811727809710 384
-2856811727809710 385
-2856811727809710 386
-2856811727809710 387
-2856811727809710 388
-1860498996583344

That way I can join it back to a previous table that I have and remove the rows with NULL violations.

I've tried some unlist() and purrr approaches without much success. Does anyone here have any recommendations?

CodePudding user response:

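We could loop over the violations list and extract the 'upper' element from each entry, converting it to NA when it is NULL or has length zero. Then, in base R, we construct the data.frame by replicating each element of 'pid_list' according to the lengths of the violations list. Note that the `\(x)` lambda shorthand requires R 4.1 or later.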

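# Keep only the 'upper' element; use NA when it is NULL or has length zero
violations <- lapply(violations, \(x) {
    if (is.null(x$upper) || length(x$upper) == 0) {
        x$upper <- NA
    }
    x$upper
})
# Replace any empty ID with NA so that every ID has length one
pid_list[lengths(pid_list) == 0] <- NA
# Repeat each ID once per violation and flatten both lists
data.frame(
    pid_list = unlist(pid_list)[rep(seq_along(pid_list), lengths(violations))],
    violations = unlist(violations))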

-output

               pid_list violations
1  -6928390558574835712         NA
2  -6141623232242576384         NA
3  -4584655687749366272         NA
4  -2856811727809710592        381
5  -2856811727809710592        382
6  -2856811727809710592        383
7  -2856811727809710592        384
8  -2856811727809710592        385
9  -2856811727809710592        386
10 -2856811727809710592        387
11 -2856811727809710592        388
12 -1860498996583344128         NA
13  1345936609258173440         NA
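The question also mentions dropping the rows without violations and joining back to a previous table. A minimal base R sketch of that step follows. The `previous` table, its `site` column, and the shortened IDs are all hypothetical, made up only to illustrate the shape of the operation:

```r
# Unnested result in the same shape as the output above (hypothetical short IDs)
unnested <- data.frame(
  pid_list   = c("id1", "id2", "id2", "id3"),
  violations = c(NA, 381L, 382L, NA)
)

# Hypothetical "previous table" keyed by the same IDs
previous <- data.frame(
  pid_list = c("id1", "id2", "id3"),
  site     = c("north", "south", "east")
)

# Drop the rows that have no violation
with_violations <- unnested[!is.na(unnested$violations), ]

# Inner join on the shared ID column; only IDs with violations remain
joined <- merge(previous, with_violations, by = "pid_list")
joined
```

If the IDs without violations should stay in the joined table, passing `all.x = TRUE` to merge() keeps them with NA in the violations column.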

data

violations <- list(list(upper = integer(0)), list(upper = integer(0)),
    list(upper = integer(0)), list(upper = 381:388),
    list(upper = integer(0)), list(upper = integer(0)))
pid_list <- list("-6928390558574835712", "-6141623232242576384",
    "-4584655687749366272", "-2856811727809710592",
    "-1860498996583344128", "1345936609258173440")
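Since the question mentions purrr, a tidyverse alternative may also be of interest. This is only a sketch, not the approach from the answer above. It assumes the tibble and tidyr packages are installed and uses tidyr::unnest() with `keep_empty = TRUE`, which keeps a row of NA for each empty element instead of dropping it:

```r
library(tibble)
library(tidyr)

# Same example data as above
violations <- list(list(upper = integer(0)), list(upper = integer(0)),
                   list(upper = integer(0)), list(upper = 381:388),
                   list(upper = integer(0)), list(upper = integer(0)))
pid_list <- list("-6928390558574835712", "-6141623232242576384",
                 "-4584655687749366272", "-2856811727809710592",
                 "-1860498996583344128", "1345936609258173440")

# One row per ID, with the 'upper' vectors stored as a list column
df <- tibble(
  pid_list   = unlist(pid_list),
  violations = lapply(violations, \(x) x$upper)
)

# Expand the list column; empty elements become a single NA row
result <- unnest(df, violations, keep_empty = TRUE)
result
```

Leaving `keep_empty` at its default of FALSE drops the IDs without violations entirely, which may be what is wanted before the join.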