Extract elements with purrr::map only if existing in each sublist

Time:02-18

I have a sample list, list1, with 3 sublists: Alpha, Beta and Gamma. Each sublist consists of several elements, but not every element is present in every sublist.

list1 <- list(Alpha = structure(list(sample_0 = c(3, NA, 7, 9, 2),
                                     sample_1 = c(NA, 8, 5, 4, NA),
                                     sample_2 = c(7, 3, 5, NA, NA)),
                                row.names = c(NA, -5L),
                                class = c("tbl_df", "tbl", "data.frame")),
              Beta = structure (list(sample_0 = c(2, 9, NA, 3, 7),
                                     sample_1 = c(3, 7, 9, 3, NA),
                                     sample_2 = c(4, 2, 6, 4, 6)),
                                row.names = c(NA, -5L),
                                class = c("tbl_df", "tbl", "data.frame")),
              Gamma = structure(list(sample_0 = c(NA, NA, 4, 6, 3),
                                     sample_1 = c(3, 7, 3, NA, 8)),
                                row.names = c(NA, -5L),
                                class = c("tbl_df", "tbl", "data.frame")))

I want to create a list2 consisting only of a specified part of list1, e.g. only the sublists that contain both of the elements sample_1 and sample_2. I can extract an element that appears in every sublist, for example sample_0:

map(list1, `[`, "sample_0")

## Output

$Alpha
# A tibble: 5 x 1
  sample_0
     <dbl>
1        3
2       NA
3        7
4        9
5        2

$Beta
# A tibble: 5 x 1
  sample_0
     <dbl>
1        2
2        9
3       NA
4        3
5        7

$Gamma
# A tibble: 5 x 1
  sample_0
     <dbl>
1       NA
2       NA
3        4
4        6
5        3

However, when I try to extract an element that doesn't exist in every sublist, it throws an error:

map(list1, `[`, "sample_2")

Error in `stop_subscript()`:
! Can't subset columns that don't exist.
x Column `sample_2` doesn't exist.
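
The error comes from the one sublist that lacks the column. A quick way to see which sublists are affected is to test for the name before subsetting. This is a minimal sketch on a small stand-in list (`toy` is hypothetical and uses plain data frames, not the tibbles from list1):

```r
library(purrr)

# Hypothetical stand-in for list1: Gamma lacks sample_2, as in the question
toy <- list(
  Alpha = data.frame(sample_0 = 1:2, sample_1 = 3:4, sample_2 = 5:6),
  Beta  = data.frame(sample_0 = 1:2, sample_1 = 3:4, sample_2 = 5:6),
  Gamma = data.frame(sample_0 = 1:2, sample_1 = 3:4)
)

# Named logical vector: TRUE for each sublist that has a column called sample_2
has_sample_2 <- map_lgl(toy, ~ "sample_2" %in% names(.x))
```

Here `has_sample_2` is FALSE only for Gamma, which is the sublist that makes the subsetting call fail.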

My final goal is to create a new list containing only the sublists which contain ALL of a set of prespecified elements. Ideally, this would be done by passing a vector extract_vars to purrr::map:

extract_vars <- c("sample_1", "sample_2")

The desired output being:

$Alpha
# A tibble: 5 x 2
sample_1 sample_2
   <dbl>    <dbl>
1     NA        7
2      8        3
3      5        5
4      4       NA
5     NA       NA

$Beta
# A tibble: 5 x 2
sample_1 sample_2
   <dbl>    <dbl>
1      3        4
2      7        2
3      9        6
4      3        4
5     NA        6

(Element Gamma is removed from the desired list, since it doesn't contain the element sample_2.)

Answer:

One option could be:

library(purrr); library(dplyr)  # map(), keep(); select(), all_of()
map(keep(list1, ~ all(extract_vars %in% names(.))), ~ select(., all_of(extract_vars)))

$Alpha
# A tibble: 5 × 2
  sample_1 sample_2
     <dbl>    <dbl>
1       NA        7
2        8        3
3        5        5
4        4       NA
5       NA       NA

$Beta
# A tibble: 5 × 2
  sample_1 sample_2
     <dbl>    <dbl>
1        3        4
2        7        2
3        9        6
4        3        4
5       NA        6
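
If avoiding the purrr and dplyr dependencies is preferable, the same keep-then-select idea can be sketched in base R. This is an alternative, not the approach given in the answer above; `toy`, `has_all` and `kept` are hypothetical names introduced for illustration, and `toy` is a small stand-in for list1:

```r
# Hypothetical stand-in for list1: Gamma lacks sample_2
toy <- list(
  Alpha = data.frame(sample_0 = 1:2, sample_1 = 3:4, sample_2 = 5:6),
  Beta  = data.frame(sample_0 = 1:2, sample_1 = 3:4, sample_2 = 5:6),
  Gamma = data.frame(sample_0 = 1:2, sample_1 = 3:4)
)
extract_vars <- c("sample_1", "sample_2")

# Step 1: keep only the sublists that contain every requested column
has_all <- function(df) all(extract_vars %in% names(df))
kept <- Filter(has_all, toy)

# Step 2: subset each remaining data frame to the requested columns
list2 <- lapply(kept, function(df) df[, extract_vars, drop = FALSE])
```

`Filter()` keeps the list names, so `list2` holds Alpha and Beta only, each reduced to the two requested columns.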