Handling lists in R with multiple value types

Time:06-21

I have a list whose elements are either a nested list of member names, a number of members, NULL, or an empty list. I would like a data frame/tibble containing the number of members and a list column with the member names, if available.

Any tips on how to get from raw.list to clean.tibble using purrr in R?

# Raw Data Example

raw.list <- list(list(list("a", "b", "c")), # members are group a, b, and c (3 members) 
                 NULL, # no members
                 list(), # no members (unknown origin of empty list)
                 10, # 10 members
                 100) # 100 members

# Outcome   

library(tibble)

clean.tibble <- tibble(members.n = c(3L, NA_integer_, NA_integer_, 10L, 100L),
                       members.list = c(list(list("a", "b", "c")), list(NULL), list(NULL), list(NULL), list(NULL)))

CodePudding user response:

You can do:

library(tidyverse)

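clean.tibble <- tibble(raw.list = raw.list) |>
  rowwise() |>  # evaluate each list element on its own
  mutate(# FALSE for NULL and list(), TRUE otherwise
         has_elements = !is_empty(lengths(raw.list)),
         # 1 for a bare number, number of names for a names list
         n_raw        = ifelse(has_elements, lengths(raw.list), NA_integer_),
         # a bare number is itself the member count
         members.n    = ifelse(n_raw == 1, raw.list, n_raw),
         # keep the names only when there is more than one item
         members.list = ifelse(n_raw > 1, raw.list, list(NULL))) |>
  ungroup() |>
  select(-raw.list, -has_elements, -n_raw)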

# A tibble: 5 × 2
  members.n members.list
      <dbl> <list>      
1         3 <list [3]>  
2        NA <NULL>      
3        NA <NULL>      
4        10 <NULL>      
5       100 <NULL>      

I suggest running the code without the final select() step to see what each intermediate column contains. I expected a single case_when() to be simpler than several ifelse() calls, but case_when() evaluates all of its right-hand sides regardless of which condition matches, so that approach fails on the NULL element in your list.
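
Since the question asks about purrr, here is a minimal sketch of an alternative to the rowwise() approach above. It assumes the member names always sit in a single inner list (as in raw.list[[1]]) and that a count is always a length-one number. The helper has_names() is a hypothetical name introduced only for this example.

```r
library(purrr)
library(tibble)

raw.list <- list(list(list("a", "b", "c")), NULL, list(), 10, 100)

# Hypothetical helper: TRUE when the element is a non-empty list,
# which is assumed to wrap the member names
has_names <- function(x) is.list(x) && length(x) > 0

clean.tibble <- tibble(
  members.n = map_int(raw.list, function(x) {
    if (has_names(x)) {
      length(x[[1]])       # count the names in the inner list
    } else if (is.numeric(x) && length(x) == 1) {
      as.integer(x)        # a bare number is already the member count
    } else {
      NA_integer_          # NULL or empty list: count unknown
    }
  }),
  # keep the inner list of names, otherwise store NULL
  members.list = map(raw.list, function(x) if (has_names(x)) x[[1]] else NULL)
)
```

Because the branches test the type of each element rather than its length, a group with a single name, such as list(list("a")), is treated as a names list and not as a count. map_int() also returns members.n as an integer column, matching the desired outcome, whereas the ifelse() version yields a double.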
