I have df:
A <- c("010410", "010420","010511","010519","010519","010591","010591",NA,NA)
B <- c("010392","010410","010420","010511","010512","010519","010592","010592","010593")
C <- c("010410", "010420", "010511", "010512", "010519", "010594", "010594", "010599",NA)
D <- c("010231", "010239", "010290", "010290", "010310", "010391", "010392", "010410", "010420")
df <- data.frame(A,B,C,D)
Now I want to find out in which columns match the following codes and return the dataframe where will be YES and NOW across A B C & D columns
df2 <- as.data.frame(c("010410","010420","010511","010512","010519","010290"))
- YES - if it matches
- NO- if such code does not exist in this column
Expected outcome:
A B C D
010410 YES YES YES YES
010420 YES YES YES YES
010511 YES YES YES NO
010512 NO YES YES NO
010519 YES YES YES NO
010290 NO NO NO YES
CodePudding user response:
Here's a simple way using TRUE and FALSE:
lapply(df, \(x) df2[[1]] %in% x) |>
as.data.frame() |>
`row.names<-`(df2[[1]])
# A B C D
# 010410 TRUE TRUE TRUE TRUE
# 010420 TRUE TRUE TRUE TRUE
# 010511 TRUE TRUE TRUE FALSE
# 010512 FALSE TRUE TRUE FALSE
# 010519 TRUE TRUE TRUE FALSE
# 010290 FALSE FALSE FALSE TRUE
If the Yes/Nos are strongly preferred, we can get them with this modification:
lapply(df, \(x) c("No", "Yes")[(df2[[1]] %in% x) 1]) |>
as.data.frame() |>
`row.names<-`(df2[[1]])
# A B C D
# 010410 Yes Yes Yes Yes
# 010420 Yes Yes Yes Yes
# 010511 Yes Yes Yes No
# 010512 No Yes Yes No
# 010519 Yes Yes Yes No
# 010290 No No No Yes