data.table: find number of times unique id matches element in vector

Time:02-15

I am looking for a data.table solution for this problem. I have data like this:

library(data.table)

codes1 <- c("A1", "A2", "A3")
codes2 <- c("B1", "B2", "B3")
codes3 <- c("C1", "C2", "C3")

data <- data.table(
  id = c(1,1,2,3,3,4,4,4),
  code = c("A1","A3", "B1", "A2", "B2","A1","B2","C1")
)

I wish to count, for each unique id, the number of times data$code matches an element in the vectors codes1, codes2, and codes3, counting at most once per vector. I wish to end up with the following:

data_want <- data.table(
  id = c(1,2,3,4),
  match = c(1,1,2,3)
)

CodePudding user response:

Place the code vectors in a list. After grouping by 'id', loop over the list with lapply and check whether any element of each vector is %in% the 'code' column. Then Reduce the resulting list of logical values to an integer by adding them with `+` (TRUE -> 1 and FALSE -> 0).
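To see what the per-group expression computes, here is a minimal sketch of the same steps for a single id, using base R only. The values in `code` are the ones id 3 has in the question's data; the names `hits` and `n_match` are illustrative and not part of the original answer.

```r
codes1 <- c("A1", "A2", "A3")
codes2 <- c("B1", "B2", "B3")
codes3 <- c("C1", "C2", "C3")

# Codes belonging to id 3 in the question's data
code <- c("A2", "B2")

# One logical per vector: does any element of that vector appear in `code`?
hits <- lapply(list(codes1, codes2, codes3), function(x) any(x %in% code))

# Adding logicals coerces TRUE to 1 and FALSE to 0, so the sum is the
# number of vectors with at least one match
n_match <- Reduce(`+`, hits)
```

Because `any()` collapses each vector to a single TRUE or FALSE, several matches within one vector still count only once.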

library(data.table)
data[, .(match = Reduce(`+`, lapply(list(codes1, codes2, codes3),
    \(x) any(x %in% code)))), by = id]

-output

      id match
   <num> <int>
1:     1     1
2:     2     1
3:     3     2
4:     4     3
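
As an alternative sketch (not the answer's method), the same counts can be obtained with a join, assuming each code belongs to at most one of the vectors. The table `lookup` and its column `group` are hypothetical names introduced here for illustration.

```r
library(data.table)

codes1 <- c("A1", "A2", "A3")
codes2 <- c("B1", "B2", "B3")
codes3 <- c("C1", "C2", "C3")

data <- data.table(
  id = c(1,1,2,3,3,4,4,4),
  code = c("A1","A3", "B1", "A2", "B2","A1","B2","C1")
)

# Hypothetical lookup table: one row per code, tagged with the name of
# the vector it came from (taken from the list names via idcol)
lookup <- rbindlist(
  list(codes1 = data.table(code = codes1),
       codes2 = data.table(code = codes2),
       codes3 = data.table(code = codes3)),
  idcol = "group"
)

# Inner join on code, then count the distinct vectors matched per id
res <- lookup[data, on = "code", nomatch = NULL][
  , .(match = uniqueN(group)), by = id]
```

One difference to keep in mind: because this is an inner join, an id with no matching code at all drops out of `res`, whereas the Reduce approach returns 0 for such an id.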