I would like to calculate Day.Before_nextCLS
with 3 columns below
tibble::tribble(
~Day, ~CLS, ~BAL.D,
0, 0, NA,
3, 0, 15000,
6, 0, 10000,
20, 0, 2000,
25, 0, -4771299,
26, 0, -1615637,
27, 0, -920917,
31, 1, -923089,
32, 1, -81863,
33, 1, 19865,
34, 1, 9865,
37, 1, 609865
)
Desired output is below tribble.
For Day27
, Day.Before_nextCLS
is 4,
because when CLS
is 2, Day is 31, and interval between 27 and 31 is 4.
tibble::tribble(
~Day, ~CLS, ~BAL.D, ~Day.Before_nextCLS
0, 0, NA, 31,
3, 0, 15000, 28,
6, 0, 10000, 25,
20, 0, 2000, 11,
25, 0, -4771299, 6,
26, 0, -1615637, 5,
27, 0, -920917, 4,
31, 1, -923089, NA, (for we don't have date when CLS ==2)
32, 1, -81863, NA,
33, 1, 19865, NA,
34, 1, 9865, NA,
37, 1, 609865, NA,
)
How can I achieve this?
Thank you very much!!
CodePudding user response:
We create a lead
column and then do a group by subtract from the last
value of lead column with the Day column
library(dplyr)
df1 %>%
mutate(DayLead = lead(Day)) %>%
group_by(CLS) %>%
mutate(Day.Before_nextCLS = last(DayLead) - Day, DayLead = NULL) %>%
ungroup
-output
# A tibble: 12 × 4
Day CLS BAL.D Day.Before_nextCLS
<dbl> <dbl> <dbl> <dbl>
1 0 0 NA 31
2 3 0 15000 28
3 6 0 10000 25
4 20 0 2000 11
5 25 0 -4771299 6
6 26 0 -1615637 5
7 27 0 -920917 4
8 31 1 -923089 NA
9 32 1 -81863 NA
10 33 1 19865 NA
11 34 1 9865 NA
12 37 1 609865 NA