Subset a list of data frames by column names that start with all requested patterns

I have a list of data frames (L) with different column names. I want the subset of that list containing only the data frames that have at least one column name starting with A and at least one starting with B (the order of A and B does not matter).

L1 = data.frame(A1 = c(1:4) , Ab = c("u","v","w","x"))
L2 = data.frame(A2 = c(1:4) , Bc = c("u","v","w","x"))
L3 = data.frame(A3 = c(1:4) , Bd = c("u","v","w","x"))
L4 = data.frame(A = c(1:4) , B = c("u","v","w","x"))
L<-list(L1,L2, L3, L4)

The result should be a list containing L2, L3, and L4, since each of them has columns starting with A and with B.

The following command keeps the columns that start with A or B in every data frame, but it does not restrict the list to the data frames that have both prefixes:

lapply(L, function(x)   x[ , grepl( '^A|^B' , names(x))])

This function returns only the data frames that contain columns named exactly A and B, not the ones whose column names merely start with A and B:

trial <- function(x) {
  reqnames <- c('A', 'B')
  # TRUE for each data frame that contains every exact name in reqnames
  keep <- sapply(x, function(df) all(reqnames %in% names(df)))
  x[keep]
}
trial(L)

CodePudding user response：

Try this:

new_list <- lapply(L, \(x) x[
    all(
        any(grepl("^A", names(x))),
        any(grepl("^B", names(x)))
    )
])

This returns a data frame with zero columns in place of L1, and L2 to L4 unchanged.
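To see what the result looks like for each element, here is a small check. It is only a sketch: it rebuilds the example data from the question, uses `function(x)` instead of `\(x)` so it also runs on R versions older than 4.1, and the name `col_counts` is introduced purely for illustration.

```r
# Rebuild the example data from the question
L <- list(
  data.frame(A1 = 1:4, Ab = c("u", "v", "w", "x")),
  data.frame(A2 = 1:4, Bc = c("u", "v", "w", "x")),
  data.frame(A3 = 1:4, Bd = c("u", "v", "w", "x")),
  data.frame(A  = 1:4, B  = c("u", "v", "w", "x"))
)

# Same logic as the answer: keep all columns only when both prefixes occur
new_list <- lapply(L, function(x) {
  has_both <- any(grepl("^A", names(x))) && any(grepl("^B", names(x)))
  x[has_both]  # x[TRUE] keeps every column, x[FALSE] drops every column
})

# Number of columns left in each element (illustrative name)
col_counts <- sapply(new_list, ncol)
print(col_counts)
```

The first element has no columns because L1 has no column starting with B, while the other three keep both of their columns.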

If you don't want that empty element for L1, subset the result again:

new_list[sapply(new_list, length)>0]
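As an alternative, the two steps can be combined so that no empty placeholder is created in the first place. The following is a minimal sketch using base R's `Filter()` and `startsWith()`; the helper name `has_all_prefixes` and the variable `result` are hypothetical and not part of the original answer.

```r
# Rebuild the example data from the question
L <- list(
  data.frame(A1 = 1:4, Ab = c("u", "v", "w", "x")),
  data.frame(A2 = 1:4, Bc = c("u", "v", "w", "x")),
  data.frame(A3 = 1:4, Bd = c("u", "v", "w", "x")),
  data.frame(A  = 1:4, B  = c("u", "v", "w", "x"))
)

# Hypothetical helper: TRUE when every prefix matches at least one column name
has_all_prefixes <- function(df, prefixes) {
  all(vapply(
    prefixes,
    function(p) any(startsWith(names(df), p)),
    logical(1)
  ))
}

# Keep only the data frames that satisfy the predicate
result <- Filter(function(df) has_all_prefixes(df, c("A", "B")), L)
print(length(result))
```

Because the prefixes are passed as a vector, the same helper should extend to more than two prefixes, for example `c("A", "B", "C")`. Note that `startsWith()` matches literal prefixes rather than regular expressions.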