Home > OS >  R: Perform regression with rolling data
R: Perform regression with rolling data

Time:06-11

I have a large panel data set with daily data. I would like to perform a linear regression with a rolling time window to find out the daily alpha (Intercept). I choose a rolling time window of 250 days per entity (Name). However, already when I have at least 30 days I start running the regression by just using first 30 observations, then 31, 32 etc. until I reach 250.

My data looks like this (just to show you, my real data has over 1000 observations per group and over 200 groups):

structure(list(Name = c("A", "A", "A", "A", "A", "A", "A", "A", 
"A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", 
"A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", 
"A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", 
"A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", 
"A", "A", "A", "A", "A", "A", "A", "A", "B", "B", "B", "B", "B", 
"B", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B", 
"B", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B", 
"B", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B", 
"B", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B", 
"B", "B", "B", "B", "B", "B", "B", "B", "B", "C", "C", "C", "C", 
"C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", 
"C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", 
"C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", 
"C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", 
"C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C"), Date = c("2018-08-01", 
"2018-08-02", "2018-08-03", "2018-08-06", "2018-08-07", "2018-08-08", 
"2018-08-09", "2018-08-10", "2018-08-13", "2018-08-14", "2018-08-16", 
"2018-08-17", "2018-08-20", "2018-08-21", "2018-08-22", "2018-08-23", 
"2018-08-24", "2018-08-27", "2018-08-28", "2018-08-29", "2018-08-30", 
"2018-08-31", "2018-09-03", "2018-09-04", "2018-09-05", "2018-09-06", 
"2018-09-07", "2018-09-10", "2018-09-11", "2018-09-12", "2018-09-13", 
"2018-09-14", "2018-09-17", "2018-09-18", "2018-09-19", "2018-09-20", 
"2018-09-21", "2018-09-24", "2018-09-25", "2018-09-26", "2018-09-27", 
"2018-09-28", "2018-10-01", "2018-10-02", "2018-10-03", "2018-10-04", 
"2018-10-05", "2018-10-08", "2018-10-09", "2018-10-10", "2018-10-11", 
"2018-10-12", "2018-10-15", "2018-10-16", "2018-10-17", "2018-10-18", 
"2018-10-19", "2018-10-22", "2018-10-23", "2018-10-24", "2018-10-25", 
"2018-10-26", "2018-10-29", "2018-10-30", "2018-10-31", "2018-11-02", 
"2018-11-05", "2018-08-01", "2018-08-02", "2018-08-03", "2018-08-06", 
"2018-08-07", "2018-08-08", "2018-08-09", "2018-08-10", "2018-08-13", 
"2018-08-14", "2018-08-16", "2018-08-17", "2018-08-20", "2018-08-21", 
"2018-08-22", "2018-08-23", "2018-08-24", "2018-08-27", "2018-08-28", 
"2018-08-29", "2018-08-30", "2018-08-31", "2018-09-03", "2018-09-04", 
"2018-09-05", "2018-09-06", "2018-09-07", "2018-09-10", "2018-09-11", 
"2018-09-12", "2018-09-13", "2018-09-14", "2018-09-17", "2018-09-18", 
"2018-09-19", "2018-09-20", "2018-09-21", "2018-09-24", "2018-09-25", 
"2018-09-26", "2018-09-27", "2018-09-28", "2018-10-01", "2018-10-02", 
"2018-10-03", "2018-10-04", "2018-10-05", "2018-10-08", "2018-10-09", 
"2018-10-10", "2018-10-11", "2018-10-12", "2018-10-15", "2018-10-16", 
"2018-10-17", "2018-10-18", "2018-10-19", "2018-10-22", "2018-10-23", 
"2018-10-24", "2018-10-25", "2018-10-26", "2018-10-29", "2018-10-30", 
"2018-10-31", "2018-11-02", "2018-11-05", "2018-08-01", "2018-08-02", 
"2018-08-03", "2018-08-06", "2018-08-07", "2018-08-08", "2018-08-09", 
"2018-08-10", "2018-08-13", "2018-08-14", "2018-08-16", "2018-08-17", 
"2018-08-20", "2018-08-21", "2018-08-22", "2018-08-23", "2018-08-24", 
"2018-08-27", "2018-08-28", "2018-08-29", "2018-08-30", "2018-08-31", 
"2018-09-03", "2018-09-04", "2018-09-05", "2018-09-06", "2018-09-07", 
"2018-09-10", "2018-09-11", "2018-09-12", "2018-09-13", "2018-09-14", 
"2018-09-17", "2018-09-18", "2018-09-19", "2018-09-20", "2018-09-21", 
"2018-09-24", "2018-09-25", "2018-09-26", "2018-09-27", "2018-09-28", 
"2018-10-01", "2018-10-02", "2018-10-03", "2018-10-04", "2018-10-05", 
"2018-10-08", "2018-10-09", "2018-10-10", "2018-10-11", "2018-10-12", 
"2018-10-15", "2018-10-16", "2018-10-17", "2018-10-18", "2018-10-19", 
"2018-10-22", "2018-10-23", "2018-10-24", "2018-10-25", "2018-10-26", 
"2018-10-29", "2018-10-30", "2018-10-31", "2018-11-02", "2018-11-05"
), `Return-RF` = c(0.02, 0.3, 0.425, 0.56, -0.04, -0.246, 0.425, 
0.56, -0.04, 0.425, 0.56, -0.04, 0.56, 0.425, 0.56, -0.04, -1.265791457, 
0.118560063, 1.376111562, -0.729104746, -1.355130661, 1.158722853, 
1.351227849, 0.88999491, 0.614812405, 0.437523454, 0.388822079, 
0.004467408, -0.000544858, -1.265791457, 0.118560063, 1.376111562, 
-0.729104746, -1.355130661, 1.158722853, 1.351227849, 0.88999491, 
0.614812405, 0.437523454, 0.388822079, 0.004467408, -0.000544858, 
-0.729104746, -1.355130661, 1.158722853, 1.351227849, 0.88999491, 
0.614812405, -0.729104746, -1.355130661, 1.158722853, 1.351227849, 
0.88999491, 0.614812405, 0.425, 0.56, -0.04, 0.56, 0.425, 0.56, 
-0.04, 0.88999491, 0.614812405, 0.437523454, 0.388822079, 0.004467408, 
-0.000544858, -0.729104746, -1.355130661, 0.88999491, 0.614812405, 
0.437523454, 0.388822079, 0.004467408, -0.000544858, -0.729104746, 
-1.355130661, 0.88999491, 0.614812405, 0.437523454, 0.388822079, 
0.004467408, -0.000544858, -0.729104746, -1.355130661, 0.88999491, 
0.614812405, 0.437523454, 0.388822079, 0.004467408, -0.000544858, 
-0.729104746, -1.355130661, -1.355130661, -1.355130661, -0.729104746, 
-1.355130661, -0.729104746, -1.355130661, -0.000544858, -0.729104746, 
-1.355130661, -0.729104746, -1.355130661, NA, NA, NA, NA, NA, 
NA, NA, 0.425, 0.56, -0.04, 0.56, 0.425, 0.56, -0.04, 0.56, 0.425, 
0.56, -0.04, 0.56, 0.425, 0.56, -0.04, NA, 0.425, 0.56, -0.04, 
0.56, 0.425, 0.425, 0.56, -0.04, 0.56, 0.56, -0.04, 0.56, 0.56, 
0.56, -0.04, 0.56, 0.56, -0.04, 0.56, 0.56, -0.04, -0.04, 0.56, 
0.425, 0.56, -0.04, 0.56, 0.56, -0.04, 0.56, 0.56, 0.56, -0.04, 
0.56, 0.56, 0.56, 0.56, -0.04, 0.56, -0.04, 0.56, 0.425, 0.56, 
-0.04, 0.56, -0.04, 0.56, 0.425, 0.56, -0.04, 0.56, -0.04, 0.56, 
0.425, 0.56, -0.04, -0.04, 0.56, 0.425, 0.56, -0.04, 0.56, 0.425, 
0.56, 0.56, 0.425, 0.56, 0.56, 0.56, 0.425, 0.56, -0.04, 0.56, 
0.425), `MKT-RF` = c(0.08, -0.03, 0.56, -0.22, -0.16, -0.44, 
0, -0.32, 0.15, 0.19, 0.15, 0.08, -0.03, 0.56, -0.22, -0.16, 
-0.44, 0, -0.32, 0.15, 0.08, -0.03, 0.56, -0.22, -0.16, -0.44, 
0, -0.32, 0.15, 0.19, 0.15, -0.1, -0.31, 0.08, -0.03, 0.56, -0.22, 
-0.16, -0.44, 0, 0.08, -0.03, 0.56, -0.22, -0.16, -0.44, 0, -0.32, 
0.15, 0.19, 0.15, -0.1, -0.31, 0.08, -0.03, 0.56, -0.22, -0.16, 
-0.44, 0, -0.32, 0.15, 0.19, 0.08, -0.03, 0.56, -0.22, -0.16, 
-0.44, 0, -0.32, 0.15, 0.08, -0.03, 0.56, -0.22, -0.16, -0.44, 
0, -0.32, 0.15, 0.19, 0.15, -0.1, -0.31, 0.08, -0.03, 0.56, -0.22, 
-0.16, -0.44, 0, -0.32, 0.15, 0.19, 0.15, 0.08, -0.03, 0.56, 
-0.22, -0.16, -0.44, 0, -0.32, 0.15, 0.19, 0.15, -0.1, -0.31, 
0.08, -0.03, 0.56, -0.22, -0.16, -0.44, 0, -0.32, 0.08, -0.03, 
0.56, -0.22, -0.16, -0.44, 0, -0.32, 0.15, 0.08, -0.03, 0.56, 
-0.22, -0.16, -0.44, 0, -0.32, 0.15, 0.19, 0.15, -0.1, -0.31, 
-0.08, -0.08, 0.08, -0.03, 0.56, -0.22, -0.16, -0.44, 0, -0.32, 
0.15, 0.19, 0.15, -0.1, -0.31, -0.08, 0.08, -0.03, 0.56, -0.22, 
-0.16, -0.44, 0, -0.32, 0.15, 0.19, 0.15, -0.1, -0.31, -0.08, 
0.08, -0.03, 0.56, -0.22, -0.16, -0.44, 0, -0.32, 0.15, 0.08, 
-0.03, 0.56, -0.22, -0.16, -0.44, 0, -0.32, 0.15, 0.19, 0.15, 
-0.1, -0.31, -0.08, -0.08, 0.08, -0.03, 0.56, -0.22, -0.16, -0.44, 
0, -0.32), SMB = c(0.28, 0.05, -0.27, 0.03, -0.3, 0.08, -0.03, 
0.56, -0.22, -0.16, -0.44, 0, -0.32, 0.15, 0.19, 0.15, -0.1, 
-0.31, -0.08, -0.08, 0.35, 0.73, -0.21, 0.19, 0.6, 0.02, 0.28, 
0.05, -0.27, 0.03, -0.3, 0.08, -0.03, 0.56, -0.22, -0.16, -0.44, 
0, -0.32, 0.15, 0.19, 0.15, -0.1, -0.31, -0.08, -0.08, 0.35, 
0.73, -0.21, 0.19, 0.6, 0.02, 0.28, 0.05, -0.27, 0.03, -0.3, 
0.08, -0.03, 0.56, -0.22, -0.16, -0.44, 0, -0.32, 0.15, 0.19, 
0.15, -0.1, -0.31, -0.08, -0.08, 0.35, 0.73, -0.21, 0.19, 0.6, 
0.02, 0.28, 0.05, -0.27, 0.03, -0.3, 0.08, -0.03, 0.56, -0.22, 
-0.16, -0.44, 0, -0.32, 0.15, 0.19, 0.15, -0.1, -0.31, -0.08, 
-0.08, 0.35, 0.73, -0.21, 0.19, 0.6, 0.02, 0.28, 0.05, -0.27, 
0.03, -0.3, 0.08, -0.03, 0.56, -0.22, -0.16, -0.44, 0, -0.32, 
0.15, 0.19, 0.15, -0.1, -0.31, -0.08, -0.08, 0.35, 0.73, -0.21, 
0.19, 0.6, 0.02, 0.28, 0.05, -0.27, 0.03, -0.3, 0.08, -0.03, 
0.56, -0.22, -0.16, -0.44, 0, -0.32, 0.15, 0.19, 0.15, -0.1, 
-0.31, -0.08, -0.08, 0.35, 0.73, -0.21, 0.19, 0.6, 0.02, 0.28, 
0.05, -0.27, 0.03, -0.3, 0.08, -0.03, 0.56, -0.22, -0.16, -0.44, 
0, -0.32, 0.15, 0.19, 0.15, -0.1, -0.31, -0.08, -0.08, 0.35, 
0.73, -0.21, 0.19, 0.6, 0.02, 0.28, 0.05, -0.27, 0.03, -0.3, 
0.08, -0.03, 0.56, -0.22, -0.16, -0.44, 0, -0.32, 0.15, 0.19, 
0.15, -0.1, -0.31, -0.08)), class = "data.frame", row.names = c(NA, 
-201L))

My goal is to have an alpha (intercept) for each day (after the first 30 observations per group (Name). The dependent variable is Return-RF and the independent variables are MKT-RF and SMB.

I am very new to regressions with rolling time windows. Could someone help me?

I only came so far to have the code for the regression without a rolling time window. But this is not what I would like to achieve:

Regression <- Data %>%
  na.omit(Data$`Return - RF`) %>%
  group_by(`Name`) %>%
  do(tidy(lm((`Return - RF`) ~ `Mkt-RF`   SMB, data= .)))

Extract intercept:

Regression <- as.data.frame(Regression )
Intercept<- subset(Regression , term == "`Intercept`")

CodePudding user response:

Use rollapplyr to create the row number subsets to use and for each run the indicated function. Note that unlike do within mutate dot (.) refers to all the rows in the data frame whereas cur_data() refers to the rows in the current group.

library(dplyr, exclude = c("filter", "lag"))
library(zoo)

Data %>%
  group_by(Name) %>%
  mutate(intercept = rollapplyr(1:n(), pmin(pmax(1:n(), 30), 250), function(ix)
    coef(lm(`Return-RF` ~ `MKT-RF`   SMB, cur_data(), subset=ix))[1], fill=NA)) %>%
  ungroup
  • Related