I am trying to create a type of lead and lag. The dplyr::lead and dplyr::lag functions almost get me what I want, but they are missing a crucial step: "filling" in the values I want. Take the following example:
library(dplyr)  # provides tibble(), lead() and lag()
treatment <- c(0,0,0,0,0,0,1,0,0,0,0,0,1,0,0)
df_treatment <- tibble(treatment)
df_treatment
I want lead and lag columns of the treatment column, but with 3 leading 1s for each treatment indicator. dplyr::lead can give me only a single lead, since its n argument can't take a vector.
This is my desired output:
lead_2 <- c(1,1,1,0,0,0,1,1,1,0,0,0,0,0,0)
lead_1 <- c(0,0,0,1,1,1,0,0,0,1,1,1,0,0,0)
df_desired <- tibble(lead_2, lead_1, treatment)
df_desired
The goal is to create 6 lead and lag columns similar to those in df_desired.
CodePudding user response:
If your treatment looks like that, define a new function leead that accepts a vector of offsets in place of n.
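library(dplyr)

# Sum the leads of x over every offset in v
leead <- function(x, v){
  xx <- rep(0, length(x))
  for(i in v){
    # default = 0 pads the tail with 0 instead of NA
    xx <- xx + lead(x, i, default = 0)
  }
  xx
}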
tibble(treatment) %>%
  mutate(lead_1 = leead(treatment, c(1:3)),
         lead_2 = leead(treatment, c(4:6)))
   treatment lead_1 lead_2
       <dbl>  <dbl>  <dbl>
 1         0      0      1
 2         0      0      1
 3         0      0      1
 4         0      1      0
 5         0      1      0
 6         0      1      0
 7         1      0      1
 8         0      0      1
 9         0      0      1
10         0      1      0
11         0      1      0
12         0      1      0
13         1      0      0
14         0      0      0
15         0      0      0
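The question also asks for lags, which the answer above does not show. A minimal sketch of the mirror-image helper, assuming the same treatment vector: laag is a hypothetical name chosen to match leead (it is not a dplyr function), and it sums dplyr::lag over the offsets in v, padding the start with 0 through the default argument.

```r
library(dplyr)

# Hypothetical helper (not part of dplyr): sums the lags of x
# over every offset in v.
laag <- function(x, v) {
  xx <- rep(0, length(x))
  for (i in v) {
    # dplyr::lag is called explicitly to avoid stats::lag;
    # default = 0 pads the start of the shifted vector instead of NA.
    xx <- xx + dplyr::lag(x, i, default = 0)
  }
  xx
}

treatment <- c(0,0,0,0,0,0,1,0,0,0,0,0,1,0,0)

df_lags <- tibble(treatment) %>%
  mutate(lag_1 = laag(treatment, 1:3),   # rows 1 to 3 after a treatment
         lag_2 = laag(treatment, 4:6))   # rows 4 to 6 after a treatment
df_lags
```

Because the helper adds the shifted vectors together, a row can come out greater than 1 if two treatments are close enough for their windows to overlap. If a strict 0/1 indicator is needed, wrapping the result in pmin(xx, 1) would be one way to cap it.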