R - how to select elements from sublists of a list by their name

I have a list of lists that looks like this:

list(list("A[1]" = data.frame(W = 1:5),
          "A[2]" = data.frame(X = 6:10),
          B = data.frame(Y = 11:15),
          C = data.frame(Z = 16:20)),
     list("A[1]" = data.frame(W = 21:25),
          "A[2]" = data.frame(X = 26:30),
          B = data.frame(Y = 31:35),
          C = data.frame(Z = 36:40)),
     list("A[1]" = data.frame(W = 41:45),
          "A[2]" = data.frame(X = 46:50),
          B = data.frame(Y = 51:55),
          C = data.frame(Z = 56:60))) -> dflist

I need my output to also be a list of lists of length 3, where each sublist keeps the elements whose names start with A[ and drops the others.

Based on some previous questions, I am trying to use this:

dflist %>% 
    map(keep, names(.) %in% "A[")

but that gives the following error:

Error in probe(.x, .p, ...) : length(.p) == length(.x) is not TRUE

Trying to select a single element, for example just A[1] like this:

dflist %>% 
    map(keep, names(.) %in% "A[1]")

also doesn't work. How can I achieve the desired output?

CodePudding user response：

I think you want:

purrr::map(dflist, ~.[stringr::str_starts(names(.), "A\\[")])

What this does is:

For each sublist (purrr::map)
- Select all elements of that sublist (.[], where . is the sublist)
- Whose names start with A[ (stringr::str_starts(names(.), "A\\["))
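To make those steps concrete, here is a minimal sketch that walks through them on a single sublist. The object `sub` and its small data frames are made up for illustration; only the element names mirror the question.

```r
library(stringr)

# One sublist shaped like those in the question (hypothetical, smaller data).
sub <- list("A[1]" = data.frame(W = 1:2),
            "A[2]" = data.frame(X = 3:4),
            B = data.frame(Y = 5:6),
            C = data.frame(Z = 7:8))

# Step 1: the names of the elements of the sublist.
nms <- names(sub)

# Step 2: which names start with a literal "A[".
# "[" is a regex metacharacter, hence the escape "\\[".
starts_with_a <- str_starts(nms, "A\\[")

# Step 3: single-bracket subsetting with a logical vector
# keeps the matching elements and still returns a list.
kept <- sub[starts_with_a]
names(kept)
```

Wrapping steps 2 and 3 in purrr::map simply repeats them for every sublist of dflist.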

You got the top-level map right, since you want to modify the sublists. However, map(keep, names(.) %in% "A[") has some issues:

- names(.) %in% "A[" should be a function or a formula (starting with ~). As written, it is evaluated once, with . bound to the whole dflist rather than to each sublist, and %in% tests exact equality rather than a prefix.
- When given a function, purrr::keep applies it to each element of the sublist, namely to the data frames directly, so it never "sees" the name of each data frame. keep only helps here if you pass it a logical vector computed from the names of the sublist.
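The error message itself can be reproduced in base R without purrr. The sketch below uses a cut-down, hypothetical `dflist` with plain numbers instead of data frames, and assumes the magrittr pipe from the question, which passes the whole `dflist` as `.` inside the nested call.

```r
# Hypothetical stand-in for the question's data (same names, simpler values).
dflist <- list(list("A[1]" = 1, "A[2]" = 2, B = 3, C = 4),
               list("A[1]" = 5, "A[2]" = 6, B = 7, C = 8))

# The outer list has no names, so names(dflist) is NULL ...
outer_names <- names(dflist)

# ... and NULL %in% "A[" is a zero-length logical vector,
# which cannot line up with a sublist of length 4.
p <- outer_names %in% "A["
length(p)

# %in% tests exact equality, so even on a sublist's own names
# the string "A[" matches nothing.
exact <- names(dflist[[1]]) %in% "A["

# A prefix test is what the question actually needs.
prefix <- startsWith(names(dflist[[1]]), "A[")
```

That zero-length vector is why the check length(.p) == length(.x) fails, both for "A[" and for "A[1]".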

Anyway this produces:

[[1]]
[[1]]$`A[1]`
  W
1 1
2 2
3 3
4 4
5 5

[[1]]$`A[2]`
   X
1  6
2  7
3  8
4  9
5 10


[[2]]
[[2]]$`A[1]`
   W
1 21
2 22
3 23
4 24
5 25

[[2]]$`A[2]`
   X
1 26
2 27
3 28
4 29
5 30


[[3]]
[[3]]$`A[1]`
   W
1 41
2 42
3 43
4 44
5 45

[[3]]$`A[2]`
   X
1 46
2 47
3 48
4 49
5 50

CodePudding user response：

If we want to use keep, use

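library(purrr)
library(stringr)
# keep() also accepts a logical vector with one value per element of .x;
# str_starts() with fixed("A[") tests for a literal "A[" prefix
map(dflist, ~ keep(.x, str_starts(names(.x), fixed("A["))))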
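If your purrr version is recent enough, selecting by name may be simpler with keep_at(). The sketch below assumes purrr 1.0.0 or later, where keep_at() should be available; the helper `starts_with_a` and the shortened `dflist` are made up for illustration.

```r
library(purrr)  # assumption: purrr >= 1.0.0, which provides keep_at()

# Hypothetical, shortened version of the question's data.
dflist <- list(
  list("A[1]" = data.frame(W = 1:2), "A[2]" = data.frame(X = 3:4),
       B = data.frame(Y = 5:6)),
  list("A[1]" = data.frame(W = 7:8), "A[2]" = data.frame(X = 9:10),
       B = data.frame(Y = 11:12))
)

# keep_at() calls this function on the names of each sublist
# and keeps the elements for which it returns TRUE.
starts_with_a <- function(nm) startsWith(nm, "A[")

result <- map(dflist, keep_at, starts_with_a)

lapply(result, names)
```

Unlike keep() with a predicate function, keep_at() works on the names rather than on the data frames, so no logical vector has to be built by hand.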

CodePudding user response：

Here is a base R solution:

lapply(dflist, function(x) x[grep("^A\\[", names(x))])

This gives the same output as shown in the first response.
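A variant of the base R approach that avoids regular expressions altogether is sketched below. It uses startsWith(), which compares literal prefixes, so "[" needs no escaping. The data is copied from the question; `result` is just a name chosen here.

```r
# Data from the question.
dflist <- list(
  list("A[1]" = data.frame(W = 1:5), "A[2]" = data.frame(X = 6:10),
       B = data.frame(Y = 11:15), C = data.frame(Z = 16:20)),
  list("A[1]" = data.frame(W = 21:25), "A[2]" = data.frame(X = 26:30),
       B = data.frame(Y = 31:35), C = data.frame(Z = 36:40)),
  list("A[1]" = data.frame(W = 41:45), "A[2]" = data.frame(X = 46:50),
       B = data.frame(Y = 51:55), C = data.frame(Z = 56:60))
)

# For each sublist, keep the elements whose names begin with the literal "A[".
result <- lapply(dflist, function(x) x[startsWith(names(x), "A[")])

# Still a list of 3 sublists, each holding only the A[...] data frames.
length(result)
lapply(result, names)
```

Because the subsetting uses a logical vector, a sublist with no matching names comes back as an empty list rather than raising an error.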