I have a large datatable with a 'status' column. Status could be 0, 1, or 2. Zero or more rows of status 0 or 2 can precede a row with status of 1. E.g.
dt <- data.table(id = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10), status = c(1, 0, 1, 0, 2, 1, 0, 0, 0, 1))
I want to identify the groups 1 (id = 1), 2 (id = 2:3), 3 (id = 4:6), and 4 (id = 7:10). Is there an efficient way to do this instead of running through a loop?
Please let me know. Thanks!
CodePudding user response:
library(data.table)
dt <- data.table(id = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10), status = c(1, 0, 1, 0, 2, 1, 0, 0, 0, 1))
dt[,.(group=rleid(rev(cumsum(rev(status==1)))), id, status)]
#> group id status
#> 1: 1 1 1
#> 2: 2 2 0
#> 3: 2 3 1
#> 4: 3 4 0
#> 5: 3 5 2
#> 6: 3 6 1
#> 7: 4 7 0
#> 8: 4 8 0
#> 9: 4 9 0
#> 10: 4 10 1
Created on 2022-11-23 with reprex v2.0.2