I have DFs like this
df1 <- data.frame(ID=c(222717,321817,521917), Name=c("Bob","James","Eric"), Team=c("Badgers","Tigers","Possums"))
df2 <- data.frame(Badgers=c(222717,789423), Tigers=c(438283,978748), Possums=c(521917,251233))
I use mapply to check for IDs from df1 in the corresponding Team column in df2, like this
df1$result <- mapply(function(a,b) {b %in% df2[[a]]}, a=as.character(df1$Team), b=df1$ID)
My problem is that I now have new df1s that have a variable number of ID columns to look up, like this
newDf1 <- data.frame(ID1=c(222717,321817,521917), ID2=c(13998,978748,251233), Name=c("Bob","James","Eric"), Team=c("Badgers","Tigers","Possums"))
I've attempted to do this using grep
in my mapply
but it doesn't work properly:
newDf1$result <- mapply(function(a,b) {b %in% df2[[a]]}, a=as.character(newDf1$Team), b=newDf1[grep("ID",names(newDf1))])
How do I go about checking for multiple values in this situation? Note that, as I mentioned above, the number of ID columns is variable, so while something like the following works for this example newDf1 that I've provided, which has only two ID columns, I need something that works for any number of ID columns.
newDf1$result <- mapply(function(a,b,c) {b %in% df2[[a]] | c %in% df2[[a]]}, a=as.character(newDf1$Team), b=newDf1$ID1, c=newDf1$ID2)
CodePudding user response:
We may use rowwise
library(dplyr)
newDf1 %>%
rowwise %>%
mutate(result = any(c_across(starts_with("ID")) %in% df2[[Team]])) %>%
ungroup
Or in base R
newDf1$result <- sapply(seq_len(nrow(newDf1)), \(i)
any(newDf1[i, grep("ID", names(newDf1))] %in% df2[[newDf1$Team[i]]]))
CodePudding user response:
You can use mapply()
with do.call()
:
newDf1$result <- do.call(mapply, c(\(Team, ...) any(list(...) %in% df2[[Team]]),
newDf1["Team"], newDf1[grep("ID", names(newDf1))]))
# Badgers Tigers Possums
# TRUE TRUE TRUE