I have a data frame with a list column that stores IDs as character vectors:
a <- list(as.character(c("1","2","3")))
b <- list(as.character(c("2","3","5")))
c <- list(as.character(c("4","6","8")))
df = data.frame(NAME = c("A1", "A2", "A3"), stat = c(14, 15, 16))
df$IDs[1] <- a
df$IDs[2] <- b
df$IDs[3] <- c
Additionally, I have a list holding a character vector of reference IDs that I want to track:
x <- list(as.character(c("2","3")))
I would like to filter the data frame so that it keeps only the rows whose IDs column contains 2 and/or 3 (i.e., any value of x matching df$IDs; in this case only the rows named A1 and A2).
The actual data frame has hundreds of rows so I would appreciate a shorter route than a loop if possible.
If you would suggest a different approach (such as wrangling the initial df a bit more), I'd appreciate hearing that as well.
Many thanks in advance.
CodePudding user response:
You could use sapply or mapply:
df[sapply(df$IDs, \(a) any(x[[1]] %in% a)), ]
df[mapply(\(a, b) any(a %in% b), x, df$IDs), ]
Output
# NAME stat IDs
# 1 A1 14 1, 2, 3
# 2 A2 15 2, 3, 5
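Since the question says "2 and/or 3", it may be worth separating "at least one of the reference IDs" from "all of the reference IDs". Here is a minimal base R sketch of both, assuming the data from the question; the names wanted, keep_any, keep_all, res_any and res_all are made up for illustration. It uses vapply, which behaves like sapply here but guarantees a logical vector back.

```r
# Rebuild the example data from the question
df <- data.frame(NAME = c("A1", "A2", "A3"), stat = c(14, 15, 16))
df$IDs <- list(c("1", "2", "3"), c("2", "3", "5"), c("4", "6", "8"))
x <- list(c("2", "3"))

# Flatten the reference list once, outside the per-row function
wanted <- unlist(x)

# TRUE for rows whose IDs share at least one value with `wanted`
keep_any <- vapply(df$IDs, function(ids) any(wanted %in% ids), logical(1))

# TRUE only for rows whose IDs contain every value in `wanted`
keep_all <- vapply(df$IDs, function(ids) all(wanted %in% ids), logical(1))

res_any <- df[keep_any, ]
res_all <- df[keep_all, ]
```

With this particular data both filters keep A1 and A2, because each of those rows contains both 2 and 3; they would differ for a row holding only one of the two.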
CodePudding user response:
Using tidyverse
library(dplyr)
library(purrr)
df %>%
  filter(map_lgl(IDs, ~ any(unlist(x) %in% .x)))
NAME stat IDs
1 A1 14 1, 2, 3
2 A2 15 2, 3, 5
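On the "wrangling the initial df" part of the question: one possible option is to flatten the list column into a long lookup table with one row per NAME/ID pair, so the match becomes a plain vectorised %in%. This is only a sketch in base R under the assumption that NAME uniquely identifies a row; long, hits and res are hypothetical names.

```r
# Rebuild the example data from the question
df <- data.frame(NAME = c("A1", "A2", "A3"), stat = c(14, 15, 16))
df$IDs <- list(c("1", "2", "3"), c("2", "3", "5"), c("4", "6", "8"))
x <- list(c("2", "3"))

# Long format: repeat each NAME once per ID it holds
long <- data.frame(
  NAME = rep(df$NAME, lengths(df$IDs)),
  ID = unlist(df$IDs)
)

# Names that have at least one ID in the reference list
hits <- unique(long$NAME[long$ID %in% unlist(x)])

# Keep the matching rows of the original data frame
res <- df[df$NAME %in% hits, ]
```

The long table can be reused for other lookups (for example, counting how many reference IDs each NAME has) without applying a function over the list column each time.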