Relative row index based on a variable value in R-CodePudding

wondering if there was an efficient and terse/concise data.table solution to the following problem.

Please suppose that I have the following data.table:

library(data.table)

DT <- data.table(store = c("A", "A", "A", "A", "B", "B", "B", "B"),
                 time = c(1,2,3,4,1,2,3,4),
                 treat_time = c(0,0,1,0, 0,1,0,0))

Here, treat_time is the time period where the store receives a treatment. Note that the time of the treatment is different between stores A and stores B. I would like to create a column time_rel, which describes the time period relative to when treat_time = 1. That is, the data.table should look like:


DT_outcome <- data.table(store = c("A", "A", "A", "A", "B", "B", "B", "B"),
                 time = c(1,2,3,4,1,2,3,4),
                 treat_time = c(0,0,1,0, 0,1,0,0),
                 time_rel = c(-1,0,1,2, 0,1,2,3))


    store  time treat_time time_rel
   <char> <num>      <num>    <num>
1:      A     1          0       -1
2:      A     2          0        0
3:      A     3          1        1
4:      A     4          0        2
5:      B     1          0        0
6:      B     2          1        1
7:      B     3          0        2
8:      B     4          0        3

Thanks!

CodePudding user response：

We can subtract from the sequence

library(data.table)
DT[,   time_rel := seq_len(.N) - seq_len(.N)[treat_time == 1]   1 ,store]

-output

> DT
    store  time treat_time time_rel
   <char> <num>      <num>    <num>
1:      A     1          0       -1
2:      A     2          0        0
3:      A     3          1        1
4:      A     4          0        2
5:      B     1          0        0
6:      B     2          1        1
7:      B     3          0        2
8:      B     4          0        3

Or the same logic in dplyr

library(dplyr)
DT %>% 
  group_by(store) %>%
  mutate(time_rel = row_number() - which(treat_time == 1)   1) %>% 
  ungroup