I have been tasked with the creation of lagging indicators for a series of data frames. Each dataframe has two columns: date and value. For each "value" column I need to create 12 new columns, where the lag increases by one unit.
A given data frame looks like this:
date | value |
01-01-2021 1
02-01-2021 2
03-01-2021 3
04-01-2021 4
05-01-2021 5
06-01-2021 6
07-01-2021 7
08-01-2021 8
And I need to add new columns so they end up looking like this:
date | value | lag1 | ... | lag12
01-01-2021 1 2 13
02-01-2021 2 3 14
03-01-2021 3 4 15
04-01-2021 4 5 16
05-01-2021 5 6 17
06-01-2021 6 7 18
07-01-2021 7 8 19
08-01-2021 8 9 20
The values shown are for illustrative purposes, the real data is of economic nature and therefore follows no particular pattern.
So far, I have only managed to figure out how to create one column at a time for a single data frame, but this isn't very efficient. I need a way to loop through all of the data frames and create the additional columns.
Thanks in advance for any help!
CodePudding user response:
Loop over however many lags you need, creating a list of the new column values, then add them back all at once.
lagfun <- function(x,n) c(tail(x,-n), rep(NA,n))
dat[paste0("lag", 1:7)] <- lapply(1:7, lagfun, x=dat$value)
dat
# date value lag1 lag2 lag3 lag4 lag5 lag6 lag7
#1 01-01-2021 1 2 3 4 5 6 7 8
#2 02-01-2021 2 3 4 5 6 7 8 NA
#3 03-01-2021 3 4 5 6 7 8 NA NA
#4 04-01-2021 4 5 6 7 8 NA NA NA
#5 05-01-2021 5 6 7 8 NA NA NA NA
#6 06-01-2021 6 7 8 NA NA NA NA NA
#7 07-01-2021 7 8 NA NA NA NA NA NA
#8 08-01-2021 8 NA NA NA NA NA NA NA
You could of course use the lag
function from dplyr
or a similar function instead of my lagfun
, but this will be self-contained.