I am looking for a data.table solution to a simple problem. I have data like this:
library(data.table)
data <- data.table(
  id = c(1, 1, 2, 3, 3),
  code = c("A1", "A3", "B1", "A2", "B2")
)
I wish to flag the unique IDs that have codes from both of the following sets, on two separate rows:
codes1 <- c("A1", "A2", "A3")
codes2 <- c("B1", "B2", "B3")
An ID is a match if its first row has a code contained in codes1 or codes2 and its second row has a code contained in the other set. In other words, if the code in the first row is in codes1, the code in the second row has to be in codes2, or vice versa.
So I would like to end up with this:
data_want <- data.table(
  id = c(1, 2, 3),
  match = c(0, 0, 1)
)
CodePudding user response:
We may use %in% with any(), grouped by id:
library(data.table)
data[, .(match = as.integer(any(codes1 %in% code) & any(codes2 %in% code))), by = id]
-output
      id match
   <num> <int>
1:     1     0
2:     2     0
3:     3     1
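To see why this works, here is a minimal sketch that splits the check into its two parts. The names steps, in1 and in2 are only illustrative and not part of the question. It assumes, as in the example, that codes1 and codes2 share no elements, so an id that hits both sets must do so on two different rows.

```r
library(data.table)

data <- data.table(
  id = c(1, 1, 2, 3, 3),
  code = c("A1", "A3", "B1", "A2", "B2")
)
codes1 <- c("A1", "A2", "A3")
codes2 <- c("B1", "B2", "B3")

# Per id: does any row carry a code from codes1, and does any row
# carry a code from codes2? (in1/in2 are hypothetical helper columns.)
steps <- data[, .(
  in1 = any(code %in% codes1),
  in2 = any(code %in% codes2)
), by = id]

# An id is a match only when both sets are hit. Because the two sets
# are disjoint here, the hits necessarily come from different rows.
steps[, match := as.integer(in1 & in2)]
steps
```

If an id can have more than two rows, this flags it as soon as any pair of its rows covers both sets, which may or may not be what you want.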