How to rearrange time series data in 10 days time window in R?

Time:11-09

I am working with streamflow time series data ranging from 2002-10-04 to 2012-10-28, and I need to develop regression models using 10-day time windows. Specifically, I need to use Oct 04 to Oct 13 across all years from 2002 to 2012 to build one regression model, then Oct 05 to Oct 14 across all years to build another, then Oct 06 to Oct 15 across all years for the next, and so on until the end of the data.

This is what my data look like.

> head(CFbasin)
        Data      Qkl     Qllt     Qhaw      Qdp     Qlit
1 2002-10-04 19.25546 8.353470 4.502379 2.217209 1.985011
2 2002-10-05 19.56694 8.126935 4.615646 1.622555 1.628219
3 2002-10-06 19.73684 7.560598 4.389111 1.251605 1.492298
4 2002-10-07 18.12278 7.079212 3.992675 1.158159 1.413011
5 2002-10-08 18.12278 6.824360 3.794457 1.070377 1.393189
6 2002-10-09 17.83961 6.739409 3.369705 1.073208 1.353545

> tail(CFbasin)
           Data      Qkl     Qllt      Qhaw      Qdp     Qlit
3673 2012-10-23 24.89051 16.67862  8.608321 2.477724 1.432832
3674 2012-10-24 25.00378 16.48040 11.638224 2.820358 1.393189
3675 2012-10-25 25.37189 16.99011  7.758816 3.001586 1.322397
3676 2012-10-26 26.07982 16.87684  6.484558 2.814695 1.279921
3677 2012-10-27 27.41071 17.07506  4.813864 3.086536 1.228951
3678 2012-10-28 28.88318 17.16001  5.635052 3.114853 1.220456

I have made one attempt so far; this is my code (using dplyr and lubridate):

CFbasin %>% filter(month(Data) == 10 & day(Data) >= 4 & day(Data) <= 13)

This gives me all the data from Oct 4 to Oct 13 for the years 2002 to 2012, on which I can then run a linear regression. But I am not sure how to make it work on the whole dataset and run a regression for every window. I am considering a for loop or the function rollapply(), but I am unclear about how to arrange my dataset. Any suggestions and recommendations would be much appreciated. Thank you in advance!

Answer:

Here is an example using the data in the Note at the end and a width of 5.

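library(zoo)
# regress column 1 of each window (Qkl) on an intercept and the remaining columns
coefs <- function(x) coef(lm.fit(cbind(1, x[, -1]), x[, 1]))
# apply to every right-aligned window of 5 rows; the first 4 results are NA
rollapplyr(CFbasin[, -1], 5, coefs, by.column = FALSE, fill = NA)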

Added

This uses NA for the first 4 output rows. It then pools the rows that fall on days 1 to 5 of the year, across all years, to form the next regression, then days 2 to 6 of all years for the regression after that, and so on. Dec 31st is not used in leap years.

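# 0-based day of year for every row (Jan 1 is 0)
yday <- as.POSIXlt(CFbasin$Data)$yday
# fit Qkl on all other columns using the rows whose day of year is in ix
coefs <- function(ix) {
  x <- CFbasin[yday %in% ix, -1]
  if (NROW(x) == 0) NA else coef(lm(Qkl ~ ., x))
}
# roll a right-aligned window of 5 days over the days of the year
rollapplyr(0:364, 5, coefs, fill = NA)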
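
To see how the day-of-year index used above behaves, here is a small base R check. The four dates are picked purely for illustration. It shows that the index is 0-based, that dates after Feb 29 shift by one in a leap year, and why Dec 31st of a leap year falls outside 0:364.

```r
# Illustrative dates only: two from non-leap years, two from leap year 2004.
dates <- as.Date(c("2002-10-04", "2003-12-31", "2004-10-04", "2004-12-31"))

# yday is the 0-based day of the year, so Jan 1 has index 0.
yday <- as.POSIXlt(dates)$yday
print(data.frame(date = dates, yday = yday))

# Dec 31 of a leap year has index 365 and is therefore never selected
# by windows drawn from 0:364.
in_range <- yday %in% 0:364
print(in_range)
```

One consequence is that a window defined by day of year covers calendar dates one day earlier in leap years (after February) than in other years.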

Note

Lines <- "
        Data      Qkl     Qllt     Qhaw      Qdp     Qlit
1 2002-10-04 19.25546 8.353470 4.502379 2.217209 1.985011
2 2002-10-05 19.56694 8.126935 4.615646 1.622555 1.628219
3 2002-10-06 19.73684 7.560598 4.389111 1.251605 1.492298
4 2002-10-07 18.12278 7.079212 3.992675 1.158159 1.413011
5 2002-10-08 18.12278 6.824360 3.794457 1.070377 1.393189
6 2002-10-09 17.83961 6.739409 3.369705 1.073208 1.353545"
CFbasin <- read.table(text = Lines)
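
The question asks about 10-day windows, so here is a minimal sketch of the same day-of-year idea with a width of 10, written as a plain lapply() loop in base R. It is not the code from the answer above. CFsim is a hypothetical, randomly generated stand-in with the same column layout as CFbasin, and the names width, starts, fits and coef_mat are made up for this illustration.

```r
# Hypothetical stand-in for CFbasin: random values, same columns and dates.
set.seed(1)
Data <- seq(as.Date("2002-10-04"), as.Date("2012-10-28"), by = "day")
n <- length(Data)
CFsim <- data.frame(Data = Data,
                    Qllt = runif(n), Qhaw = runif(n),
                    Qdp = runif(n), Qlit = runif(n))
CFsim$Qkl <- 1 + 2 * CFsim$Qllt + CFsim$Qhaw + rnorm(n, sd = 0.1)

yday <- as.POSIXlt(CFsim$Data)$yday   # 0-based day of year
width <- 10

# One regression per window of `width` consecutive days of the year,
# pooling the matching days from every year. Windows do not wrap around
# the end of the year in this sketch.
starts <- 0:(364 - width + 1)
fits <- lapply(starts, function(s) {
  rows <- yday %in% (s:(s + width - 1))
  lm(Qkl ~ Qllt + Qhaw + Qdp + Qlit, data = CFsim[rows, ])
})
names(fits) <- starts

# Coefficient matrix: one row per window, labeled by its first day of year.
coef_mat <- t(sapply(fits, coef))
print(dim(coef_mat))
```

Here fits[["276"]] is the model for the window that starts on Oct 4 in non-leap years. If the windows must line up with exact calendar dates in every year, one option would be to select rows by format(Data, "%m-%d") instead of by day of year.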