I have a daily frequency dataframe which I am trying to convert into weekly timeseries to decompose()
. I have been able to convert it into monthly but same process doesn't work as expected for weekly.
Code that I have attempted for both monthly & weekly:
Data:
library(tidyverse)
library(quantmod)
library(zoo)
library(xts)
adani_green_df <- read.csv("https://raw.githubusercontent.com/johnsnow09/covid19-df_stack-code/main/adani_daily_data.csv")
Monthly: (work as expected)
adani_monthly_zoo <- adani_green_df %>%
select(date,CLOSE) %>%
set_names(.,c("date","Close")) %>%
read.zoo(.,format = "%Y-%m-%d") %>%
to.monthly() %>%
Cl() %>%
as.ts()
adani_monthly_zoo
########### output #############
Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
2018 27.15 61.20 57.70 41.00 39.55 46.75 41.90
2019 35.70 33.05 37.25 40.25 43.10 44.45 46.75 46.20 58.25 89.65 137.20 166.50
2020 189.45 154.90 153.65 210.25 247.85 358.70 340.65 453.60 737.85 856.25 1120.80 1052.55
2021 1003.75 1168.05 1104.30 1019.00 1267.25 1116.90 888.20 1066.85 1147.25 1146.35 1291.20 1327.75
2022 1878.75 1839.10 1913.40 2887.30 1898.80 1929.00 2168.45 2436.70 2347.00
Weekly: (Resulting Data below isn't compiled as it is for monthly
in above)
adani_weekly_zoo <- adani_green_df %>%
select(date,CLOSE) %>%
set_names(.,c("date","Close")) %>%
read.zoo(.,format = "%Y-%m-%d") %>%
to.weekly() %>%
Cl() %>%
as.ts()
adani_weekly_zoo
########### output #############
Time Series:
Start = 17704
End = 19254
Frequency = 1
[1] 29.45 NA NA NA NA NA NA 27.15 NA NA NA NA
[13] NA NA 30.05 NA NA NA NA NA NA 31.50 NA NA
[25] NA NA NA NA 35.30 NA NA NA NA NA NA 53.00
[37] NA NA NA NA NA NA 70.80 NA NA NA NA NA
[49] NA 66.90 NA NA NA NA NA NA 55.05 NA NA NA
Decompose:
decompose()
is working on monthly timeseries that is created above but not working on weekly timeseries.
# works
adani_monthly_zoo %>%
decompose() %>%
plot()
# doesn't work
adani_weekly_zoo %>%
decompose() %>%
plot()
Error in decompose(.) : time series has no or less than 2 periods
CodePudding user response:
You should use ts
rather than as.ts
:
adani_weekly_zoo <- adani_green_df %>%
select(date,CLOSE) %>%
set_names(.,c("date","Close")) %>%
read.zoo(.,format = "%Y-%m-%d") %>%
to.weekly() %>%
`colnames<-`(c('Open', 'High', 'Low', 'Close')) %>%
Cl() %>%
ts(frequency = 52, start = c(2018, 25))
This results in:
print(adani_weekly_zoo, calendar = TRUE)
#> p1 p2 p3 p4 p5 p6 p7 p8 p9
#> 2018
#> 2019 40.75 39.90 38.70 35.15 35.70 33.85 31.00 32.70 34.25
#> 2020 192.65 220.15 188.35 190.00 186.35 212.30 206.95 194.95 154.90
#> 2021 1066.15 1014.90 947.55 1047.45 1003.75 1076.05 1061.20 1118.40 1168.05
#> 2022 1327.75 1422.55 1697.50 1935.30 1936.35 1951.95 1905.95 2045.25 1922.75
#> p10 p11 p12 p13 p14 p15 p16 p17 p18
#> 2018
#> 2019 35.05 37.95 36.00 37.25 36.65 36.90 45.05 43.10 41.85
#> 2020 144.95 124.80 142.05 148.70 152.95 174.80 198.75 196.05 210.25
#> 2021 1171.05 1160.30 1191.50 1211.55 1159.50 1142.90 1107.65 1052.40 1019.00
#> 2022 1891.20 1859.05 1905.75 1922.45 1941.90 2323.95 2864.75 2888.40 2887.30
#> p19 p20 p21 p22 p23 p24 p25 p26 p27
#> 2018 29.45 27.15 30.05
#> 2019 38.95 37.40 42.45 43.10 43.40 42.70 41.25 44.45 47.55
#> 2020 212.10 230.10 240.60 247.85 284.20 336.45 420.60 397.40 375.65
#> 2021 1102.70 1088.15 1300.45 1240.20 1277.20 1226.65 1062.75 1160.55 1008.70
#> 2022 2872.45 2171.70 2271.80 2165.85 1853.70 1755.75 1713.80 1856.65 1959.70
#> p28 p29 p30 p31 p32 p33 p34 p35 p36
#> 2018 31.50 35.30 53.00 70.80 66.90 55.05 55.85 57.70 55.90
#> 2019 49.55 45.85 49.60 46.95 46.35 46.80 43.80 46.20 45.00
#> 2020 372.35 347.10 336.20 340.65 371.00 357.70 376.65 462.15 544.65
#> 2021 1020.50 1003.85 978.30 888.20 911.25 917.45 931.10 1039.75 1079.80
#> 2022 1920.90 2071.45 2119.80 2168.45 2103.05 2175.80 2411.10 2393.00 2383.05
#> p37 p38 p39 p40 p41 p42 p43 p44 p45
#> 2018 55.10 49.65 41.00 33.75 35.25 37.45 36.25 41.35 45.40
#> 2019 50.20 51.55 57.30 65.05 78.80 89.70 90.15 89.85 86.70
#> 2020 608.45 664.60 647.85 747.05 683.85 704.95 718.90 856.25 859.95
#> 2021 1213.55 1153.00 1152.15 1151.15 1176.90 1205.85 1202.05 1146.35 1199.30
#> 2022 2314.80 2315.30 2347.00
#> p46 p47 p48 p49 p50 p51 p52
#> 2018 42.65 45.75 46.75 44.35 44.35 42.55 41.65
#> 2019 95.50 106.60 137.20 126.40 132.60 135.00 151.05
#> 2020 1008.70 1135.55 1120.80 1129.20 1041.10 1016.15 1035.80
#> 2021 1292.25 1346.85 1372.35 1388.90 1388.70 1405.05 1385.85
#> 2022
And
adani_weekly_zoo %>%
decompose() %>%
plot()
CodePudding user response:
The reason that the monthly one worked is that the index is automatically converted to yearmon which works with decompose.
First we show how to get the montly ts series using compact code.
Next for weekly data there is the problem of how we get around the fact that every year must have the same number of weeks. There are a number of possibilities but here we use the week definition given by %U which goes from 01-53 and remove week 53. This omits a few days per year but does maintain year alignment. Since we are dropping 6 days in every week to get weekly data it seems that perhaps it is not so bad to be dropping a few extra days per year to maintain year alignment.
First define two one-line functions to
- remove week 53 using the %U week definition
- convert Date vectors to the form year (week-1)/52 for weeks 1-52
Next remove the week 53 using rm53
and then convert the Dates using Date2yearwk. Finally we set the frequency to 52 and convert to ts.
library(zoo)
z <- read.csv.zoo(u) # u is URL shown in Note at end
### monthly
tt.m <- z |>
aggregate(as.yearmon, tail, 1) |>
as.ts()
### weekly
# given Date vector return year (week-1)/52
# It assumes that weeks are numbered 1-52 and
# that week 53 data has already been removed.
Date2yearwk <- function(x) {
with(read.table(text = format(x, "%Y %U")), V1 (V2-1)/52)
}
# given zoo series with Date index remove values in week 53
rm53 <- function(x) {
x[format(time(x), "%U") != "53"]
}
tt.w <- z |>
rm53() |>
aggregate(Date2yearwk, tail, 1) |>
`frequency<-`(52) |>
as.ts()
Note
u <- "https://raw.githubusercontent.com/johnsnow09/covid19-df_stack-code/main/adani_daily_data.csv"