Use dplyr to get list items from a data frame in R

I have a data frame returned by Microsoft365R:

SKA_student <- structure(list(name = "Computing SKA 2021-22.xlsx", size = 22266L, 
             lastModifiedBy = 
               structure(list(user = 
                      structure(list(email = "my@email.com", 
                                     id = "8ae50289-d7af-4779-91dc-e4638421f422", 
                                     displayName = "Name, My"), class = "data.frame", row.names = c(NA, -1L))), 
                      class = "data.frame", row.names = c(NA, -1L)), 
             fileSystemInfo = structure(list(
               createdDateTime = "2021-09-08T16:03:38Z", 
               lastModifiedDateTime = "2021-09-16T00:09:04Z"), class = "data.frame", row.names = c(NA,-1L))), row.names = c(NA, -1L), class = "data.frame")

I can return all the lastModifiedBy data through:

SKA_student %>% select(lastModifiedBy)

  lastModifiedBy.user.email               lastModifiedBy.user.id lastModifiedBy.user.displayName
1              my@email.com 8ae50289-d7af-4779-91dc-e4638421f422                        Name, My

But if I want a specific item in the lastModifiedBy list, it doesn't work, e.g.:

SKA_student %>% select(lastModifiedBy.user.email)

Error: Can't subset columns that don't exist.
x Column `lastModifiedBy.user.email` doesn't exist.

I can get this working in base R, but would really like a dplyr answer.

CodePudding user response:

This function flattens all the list columns into ordinary top-level columns. (I found it on SO ages ago but can't find the original post to credit.)

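# Flatten every list column (nested data frames included) into top-level columns
SO_flat_cols <- function(data) {
    # Data frames are lists, so is.list() also flags nested data-frame columns
    ListCols <- sapply(data, is.list)
    # unlist() each row of the list columns, then bind the result to the rest
    cbind(data[!ListCols], t(apply(data[ListCols], 1, unlist)))
}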

Then you can select as you like.

SO_flat_cols(SKA_student) %>%
  select(lastModifiedBy.user.email)
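
As for why the original select() fails: the dotted names only appear when the nested data frame is printed, while the real top-level column is just lastModifiedBy. Below is a minimal sketch of that, assuming the jsonlite package is installed (its flatten() does a similar job to the helper above). The `nested` object and its values are a made-up, cut-down stand-in for SKA_student, not the real Microsoft365R result.

```r
library(dplyr)
library(jsonlite)

# Hypothetical cut-down stand-in for SKA_student: a data frame whose
# lastModifiedBy column is itself a data frame holding a `user` data frame.
user <- data.frame(email = "my@email.com", displayName = "Name, My")
lastModifiedBy <- structure(list(user = user),
                            class = "data.frame", row.names = c(NA, -1L))
nested <- structure(list(name = "file.xlsx", lastModifiedBy = lastModifiedBy),
                    class = "data.frame", row.names = c(NA, -1L))

# Only the top-level names exist as columns, which is why select() on the
# dotted name cannot find anything.
names(nested)

# jsonlite::flatten() turns nested data-frame columns into ordinary columns
# with dot-separated names.
flat <- jsonlite::flatten(nested)
names(flat)

# After flattening, the dotted name is a real column and select() works.
flat %>% select(lastModifiedBy.user.email)
```

The call is written as jsonlite::flatten() on purpose, because purrr also exports a function called flatten().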

Alternatively, you can reach the nested value by pulling out each nested data frame in turn:

SKA_student %>%
  pull(lastModifiedBy) %>%
  pull(user) %>%
  select(email)
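
If you would rather stay inside a single dplyr verb, a nested data-frame column can also be reached with `$` inside mutate(). This is only a sketch on a made-up stand-in (`nested` is hypothetical, not the real Microsoft365R result). It also shows that ending the chain with pull(email) instead of select(email) gives a bare vector rather than a one-column data frame.

```r
library(dplyr)

# Hypothetical cut-down stand-in for SKA_student.
user <- data.frame(email = "my@email.com", displayName = "Name, My")
lastModifiedBy <- structure(list(user = user),
                            class = "data.frame", row.names = c(NA, -1L))
nested <- structure(list(name = "file.xlsx", lastModifiedBy = lastModifiedBy),
                    class = "data.frame", row.names = c(NA, -1L))

# Inside mutate(), lastModifiedBy is the nested data frame itself,
# so `$` can walk down to the value; select() then keeps only that column.
email_df <- nested %>%
  mutate(email = lastModifiedBy$user$email) %>%
  select(email)

# Finishing the pull() chain with pull(email) returns a plain vector.
email_vec <- nested %>%
  pull(lastModifiedBy) %>%
  pull(user) %>%
  pull(email)
```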

CodePudding user response:

You could use unnest_wider() from tidyr:

library(dplyr)
library(tidyr)

SKA_student %>% 
  unnest_wider(lastModifiedBy) %>% 
  select(email)

This returns

# A tibble: 1 x 1
  email       
  <chr>       
1 my@email.com
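
Since lastModifiedBy is a data-frame column rather than a list of vectors, tidyr's unpack() may be a closer fit, depending on your tidyr version. This is a hedged sketch on a made-up stand-in (`nested` is hypothetical, not the real Microsoft365R result); each unpack() call lifts one level of nesting.

```r
library(dplyr)
library(tidyr)

# Hypothetical cut-down stand-in for SKA_student.
user <- data.frame(email = "my@email.com", displayName = "Name, My")
lastModifiedBy <- structure(list(user = user),
                            class = "data.frame", row.names = c(NA, -1L))
nested <- structure(list(name = "file.xlsx", lastModifiedBy = lastModifiedBy),
                    class = "data.frame", row.names = c(NA, -1L))

result <- nested %>%
  unpack(lastModifiedBy) %>%  # exposes the nested `user` data-frame column
  unpack(user) %>%            # exposes email and displayName as columns
  select(email)
```

If the unpacked names would clash with existing columns, unpack() has a names_sep argument that prefixes them with the outer column name.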