tribble(
~SUBJID, ~TP.DATE, ~TPR_ar,
'2617001', '2019-04-11', 'Undefined',
'2617001', '2019-07-09', 'PD',
'2617001', '2019-09-07', 'PD',
'2617001', '2019-10-19', 'PD',
'2617001', '2019-11-12', 'PD',
'2617001', '2020-01-13', 'PR',
'2617001', '2020-02-24', 'PD',
'2617001', '2020-03-24', 'PD',
)
Hi, Stack Overflow! I would like to extract a specific date from the data above. The TPR_ar column goes through the runs 'Undefined', 'PD', 'PR', 'PD'. What I want is the first date of the second run of 'PD' (2020-02-24). Thanks in advance!
CodePudding user response:
We could use rle or rleid to group adjacent similar elements:
library(dplyr)
library(data.table)
df1 %>%
group_by(grp = rleid(TPR_ar)) %>%
filter(TPR_ar == 'PD', row_number() == 1) %>%
ungroup %>%
slice(2) %>%
pull(TP.DATE)
[1] "2020-02-24"
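To see what "grouping adjacent similar elements" means here, the following is a minimal base R sketch using rle on the values from the question. The vector names (dates, status, runs, run_start, second_pd) are made up for illustration and are not part of the answer above.

```r
# Values from the question, in date order
dates  <- c("2019-04-11", "2019-07-09", "2019-09-07", "2019-10-19",
            "2019-11-12", "2020-01-13", "2020-02-24", "2020-03-24")
status <- c("Undefined", "PD", "PD", "PD", "PD", "PR", "PD", "PD")

# rle() collapses consecutive repeats into (value, length) pairs
runs <- rle(status)
runs$values   # "Undefined" "PD" "PR" "PD"
runs$lengths  # 1 4 1 2

# Row index at which each run starts
run_start <- cumsum(c(1, head(runs$lengths, -1)))  # 1 2 6 7

# First date of the second run whose value is "PD"
second_pd <- dates[run_start[runs$values == "PD"][2]]
second_pd     # "2020-02-24"
```

rleid gives the same grouping as a per-row id (1 2 2 2 2 3 4 4 for this column), which is why it can be passed directly to group_by.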
If the result is needed per "SUBJID", group by it as well:
df1 %>%
group_by(SUBJID, grp = rleid(TPR_ar)) %>%
filter(TPR_ar == 'PD', row_number() == 1) %>%
group_by(SUBJID) %>%
slice(2) %>%
pull(TP.DATE)
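The question only contains one subject, so the per-subject variant is hard to check as posted. Below is a rough sketch that adds a second, entirely made-up subject ("9999999" with invented dates and statuses) to show the behaviour. It also swaps pull for select, which is an assumption about what is convenient: pull returns a bare vector, so the subject ids would otherwise be lost.

```r
library(dplyr)
library(data.table)

# Rows 1-8 are from the question; subject "9999999" is hypothetical
df2 <- data.frame(
  SUBJID  = c(rep("2617001", 8), rep("9999999", 4)),
  TP.DATE = c("2019-04-11", "2019-07-09", "2019-09-07", "2019-10-19",
              "2019-11-12", "2020-01-13", "2020-02-24", "2020-03-24",
              "2021-01-01", "2021-02-01", "2021-03-01", "2021-04-01"),
  TPR_ar  = c("Undefined", "PD", "PD", "PD", "PD", "PR", "PD", "PD",
              "PD", "PR", "PD", "PD"),
  stringsAsFactors = FALSE
)

res <- df2 %>%
  group_by(SUBJID, grp = rleid(TPR_ar)) %>%      # one group per subject and run
  filter(TPR_ar == "PD", row_number() == 1) %>%  # first row of each PD run
  group_by(SUBJID) %>%
  slice(2) %>%                                   # second PD run per subject
  ungroup() %>%
  select(SUBJID, TP.DATE)

res$TP.DATE  # "2020-02-24" "2021-03-01"
```

Grouping by SUBJID together with the run id matters here: the last row of the first subject and the first row of the made-up subject are both 'PD', so rleid alone would put them in the same run.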
CodePudding user response:
Here is one without rleid:
library(dplyr)
df %>%
group_by(x = cumsum(TPR_ar != lag(TPR_ar, default = first(TPR_ar))) + 1) %>%
slice(1) %>%
filter(TPR_ar == "PD") %>%
ungroup() %>%
slice(2) %>%
pull(TP.DATE)
[1] "2020-02-24"
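The grouping key in this answer is a running count of value changes. A small base R sketch of the same idea, using the statuses from the question, may make the intermediate values easier to follow; the names status, changed and run_id are illustrative only.

```r
status <- c("Undefined", "PD", "PD", "PD", "PD", "PR", "PD", "PD")

# TRUE wherever a value differs from the one before it;
# the first element has no predecessor, so it counts as "no change"
changed <- c(FALSE, status[-1] != status[-length(status)])

# Counting the changes so far yields one id per run
run_id <- cumsum(changed) + 1
run_id  # 1 2 2 2 2 3 4 4
```

The two 'PD' runs get different ids (2 and 4), so taking the first row of each group and keeping only 'PD' leaves exactly one row per 'PD' run.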