I have a big data-set with over 1000 subjects, a small piece of the data-set looks like:
mydata <- read.table(header=TRUE, text="
Id DAYS QS Event
01 50 1 1
01 57 4 1
01 70 1 1
01 78 2 1
01 85 3 1
02 70 2 1
02 92 4 1
02 98 5 1
02 105 6 1
02 106 7 0
")
I would like to get row number of the observation 28 or more days prior to last observation, eg. for id=01; last observation is 85 minus 28 would be 57 which is row number 2. For id=02; last observation is 106 minus 28; 78 and because 78 does not exist, we will use row number of 70 which is 1 (I will be getting the row number for each observation separately) or first observation for id=02.
CodePudding user response:
This should work:
mydata %>% group_by(Id) %>%
mutate(row_number = last(which(DAYS <= max(DAYS) - 28)))
# A tibble: 10 x 6
# Groups: Id [2]
Id DAYS QS Event max row_number
<int> <int> <int> <int> <dbl> <int>
1 1 50 1 1 57 2
2 1 57 4 1 57 2
3 1 70 1 1 57 2
4 1 78 2 1 57 2
5 1 85 3 1 57 2
6 2 70 2 1 78 1
7 2 92 4 1 78 1
8 2 98 5 1 78 1
9 2 105 6 1 78 1
10 2 106 7 0 78 1