I have the following data frame in R. I'm trying to calculate a 5-day rolling Z-score (X minus its rolling mean, divided by its rolling standard deviation) by creating a function for this with rollify:
library(tibbletime)
library(tidyverse)
rolling_Z <- rollify(~((.x - mean(.x))/sd(.x)), window = 5)
data <- structure(list(Date = structure(c(19282, 19283, 19284, 19285,
19286, 19289, 19290, 19291, 19292, 19293, 19296, 19297, 19298,
19299, 19300, 19303), class = "Date"), `US 10 Year` = c(3.824,
3.881, 3.881, 3.947, 3.943, 4.018, 4.01, 4.007, 4.134, 4.228,
4.217, 4.242, 4.102, 4.003, 3.919, 3.998)), row.names = c(NA,
-16L), class = c("tbl_df", "tbl", "data.frame"))
data %>%
mutate(Z_Score = rolling_Z(`US 10 Year`))
However, I'm getting the following error. I'm guessing it is because `US 10 Year` is not the same length as the rolling mean and rolling standard deviation, seeing as the first four days would be NA. Is there a way to overcome this issue?
Error in `mutate()`:
! Problem while computing `Z_Score = rolling_Z(`US 10 Year`)`.
x `Z_Score` must be size 16 or 1, not 64.
CodePudding user response:
I'm not sure how tibbletime works, but you can use the zoo package, which has several rolling functions.
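You didn't specify whether you wanted a forward roll or a roll that took from both sides, so in this case I used rollapplyr(), which is right-aligned: each window ends at the current row and looks back over the preceding observations.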
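The zoo roll*() family of functions (rollapply(), rollmean(), and so on) has parameters that let you specify how the window is aligned, whether partial windows are allowed, and so on, and you can change these to fit your needs. With partial = TRUE, as used here, the first four rows are computed from fewer than five observations, which is why the first row has an sd of NA.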
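library(tidyverse)
library(zoo)

# Trailing 5-row mean and sd; partial = TRUE lets the first rows use shorter windows
data %>%
  mutate(mean = rollapplyr(`US 10 Year`, 5, mean, partial = TRUE),
         sd = rollapplyr(`US 10 Year`, 5, sd, partial = TRUE)) %>%
  # Z-score of each day's value against its own trailing window
  mutate(rolling_Z = (`US 10 Year` - mean) / sd)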
output ---------
   Date       `US 10 Year`  mean      sd rolling_Z
   <date>            <dbl> <dbl>   <dbl>     <dbl>
 1 2022-10-17         3.82  3.82      NA        NA
 2 2022-10-18         3.88  3.85  0.0403     0.707
 3 2022-10-19         3.88  3.86  0.0329     0.577
 4 2022-10-20         3.95  3.88  0.0503      1.27
 5 2022-10-21         3.94  3.90  0.0511     0.936
 6 2022-10-24         4.02  3.93  0.0568      1.48
 7 2022-10-25         4.01  3.96  0.0560     0.896
 8 2022-10-26         4.01  3.98  0.0368     0.598
 9 2022-10-27         4.13  4.02  0.0692      1.61
10 2022-10-28         4.23  4.08  0.0986      1.51
11 2022-10-31         4.22  4.12   0.107     0.911
12 2022-11-01         4.24  4.17  0.0981     0.778
13 2022-11-02         4.10  4.18  0.0625     -1.32
14 2022-11-03         4.00  4.16   0.103     -1.51
15 2022-11-04         3.92  4.10   0.138     -1.29
16 2022-11-07         4.00  4.05   0.124    -0.442
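If you want to sanity-check these numbers without zoo, here is a minimal base R sketch of the same trailing window. It assumes strict windows, so the first four rows come back as NA instead of using partial windows, and trailing_z() is a hypothetical helper name, not something from zoo or tibbletime.

```r
# Yields copied from the question's `US 10 Year` column.
x <- c(3.824, 3.881, 3.881, 3.947, 3.943, 4.018, 4.01, 4.007,
       4.134, 4.228, 4.217, 4.242, 4.102, 4.003, 3.919, 3.998)

# trailing_z() is a hypothetical helper, not part of zoo or tibbletime.
# For row i it scores x[i] against the `width` values ending at row i.
trailing_z <- function(x, width = 5) {
  vapply(seq_along(x), function(i) {
    if (i < width) return(NA_real_)   # strict: no partial windows
    w <- x[(i - width + 1):i]         # the trailing window
    (x[i] - mean(w)) / sd(w)          # one value per window
  }, numeric(1))
}

z <- trailing_z(x)
round(z, 3)
```

From row 5 onward the partial-window setting no longer matters, so those values should agree with the rolling_Z column in the output above.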
If you want to wrap this in a function, you could do the following.
# x is the vector to score, e.g. data$`US 10 Year`
roll_z <- function(data, x) {
  data %>%
    mutate(mean = rollapplyr(x, 5, mean, partial = TRUE),
           sd = rollapplyr(x, 5, sd, partial = TRUE)) %>%
    mutate(rolling_Z = (x - mean) / sd) %>%
    pull(rolling_Z)
}

data$rolling_Z <- roll_z(data, data$`US 10 Year`)
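As for the original error, the size of 64 looks consistent with the formula returning all five values of each window instead of a single value: 12 complete windows times 5 values, plus 4 leading NA pads, is 64. I have not checked tibbletime's internals, so treat this as a likely explanation and not a confirmed one. The base R sketch below walks through that arithmetic; last_z() is a hypothetical helper, and the guarded rollify() call at the end is an assumed fix that only runs if tibbletime is installed.

```r
x <- c(3.824, 3.881, 3.881, 3.947, 3.943, 4.018, 4.01, 4.007,
       4.134, 4.228, 4.217, 4.242, 4.102, 4.003, 3.919, 3.998)
window <- 5

# What the question's formula computes for ONE window: five values, not one.
first_window <- x[1:window]
per_window <- (first_window - mean(first_window)) / sd(first_window)
length(per_window)

# Complete windows, each contributing `window` values, plus the leading NA pads.
n_windows <- length(x) - window + 1
n_out <- n_windows * window + (window - 1)
n_out

# Scoring only the newest observation gives one value per window.
# last_z() is a hypothetical helper for illustration.
last_z <- function(w) (w[length(w)] - mean(w)) / sd(w)
length(last_z(first_window))

# Assumed tibbletime equivalent: same idea, written as a rollify() formula.
if (requireNamespace("tibbletime", quietly = TRUE)) {
  rolling_Z <- tibbletime::rollify(~ (.x[length(.x)] - mean(.x)) / sd(.x),
                                   window = 5)
}
```

If that assumption holds, rolling_Z(`US 10 Year`) inside mutate() should return 16 values, with NA for the first four rows.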