I am looking for a data.table solution for this problem. I have data like this:
library(data.table)
codes1 <- c("A1", "A2", "A3")
codes2 <- c("B1", "B2", "B3")
codes3 <- c("C1", "C2", "C3")
data <- data.table(
id = c(1,1,2,3,3,4,4,4),
code = c("A1","A3", "B1", "A2", "B2","A1","B2","C1")
)
I wish to count, for each unique id, the number of times data$code matches an element in the vectors codes1, codes2, and codes3, counting only once for a match in each vector. I wish to end up with the following:
data_want <- data.table(
id = c(1,2,3,4),
match = c(1,1,2,3)
)
CodePudding user response:
Place the codes vectors in a list and, after grouping by 'id', loop over the list with lapply, checking whether any of the elements of each vector are %in% the 'code' column. Then Reduce the resulting list of logical values to a single integer by adding them (TRUE -> 1 and FALSE -> 0).
library(data.table)
# For each id, test every codes vector for a match, then add up the TRUE/FALSE results
data[, .(match = Reduce(`+`, lapply(list(codes1, codes2, codes3),
    \(x) any(x %in% code)))), by = id]
-output
id match
<num> <int>
1: 1 1
2: 2 1
3: 3 2
4: 4 3
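As a cross-check, the same counts can also be reached with a join instead of the lapply/Reduce loop. This is only a sketch of an alternative, not the method used in the answer above; the names lookup, grp and res are hypothetical and introduced here for illustration. It stacks the three vectors into one lookup table tagged with the vector each code came from, inner-joins it to the data on 'code', and counts the distinct tags per id.

```r
library(data.table)

codes1 <- c("A1", "A2", "A3")
codes2 <- c("B1", "B2", "B3")
codes3 <- c("C1", "C2", "C3")
data <- data.table(
  id = c(1, 1, 2, 3, 3, 4, 4, 4),
  code = c("A1", "A3", "B1", "A2", "B2", "A1", "B2", "C1")
)

# Hypothetical lookup table: one row per code, with 'grp' naming the
# vector the code belongs to (taken from the list names via idcol)
lookup <- rbindlist(
  list(codes1 = data.table(code = codes1),
       codes2 = data.table(code = codes2),
       codes3 = data.table(code = codes3)),
  idcol = "grp"
)

# Inner join on 'code' (unmatched rows are dropped), then count the
# number of distinct vectors matched within each id
res <- lookup[data, on = "code", nomatch = NULL][
  , .(match = uniqueN(grp)), by = id]
res
```

One difference to keep in mind: because of the inner join, an id whose codes match none of the vectors would be missing from res, whereas the lapply/Reduce version returns 0 for it.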