Home > Software design >  Find index based on condition by key variable using data.table
Find index based on condition by key variable using data.table

Time:03-01

I have a data.table like this:

library(data.table)

dt1 <- data.table(
  id = 1,
  week = c( seq(1:260))
)
dt1[0:100, status := "A"][101:260, status := "B"]

dt2 <- data.table(
  id = 2,
  week = c( seq(1:260))
)
dt2[0:155, status := "A"][156:260, status := "B"]

dt3 <- data.table(
  id = 3,
  week = c( seq(1:260))
)
dt3[0:190, status := "A"][191:193, status := "B"][194:200, status := "A"][201:260, status := "B"]

data <- rbind(dt1,dt2,dt3)

I wish to find the first index by each unique id where status equals "B" for 4 weeks in a row. The result should be

data_want <- data.table(
  id = c(1,2,3),
  week = c(104, 59, 204)
)

Any suggestions?

CodePudding user response:

With rleid:

data[order(id,week),.(status,week,n=seq_len(.N)),by=.(id,rleid(status))][
     status=="B"&n==4,.(id,week)]

      id  week
   <num> <int>
1:     1   104
2:     2   159
3:     3   204
  • Related