Home > Net >  How to convert daily frequency dataframe into weekly time series in r?
How to convert daily frequency dataframe into weekly time series in r?

Time:09-21

I have a daily frequency dataframe which I am trying to convert into weekly timeseries to decompose(). I have been able to convert it into monthly but same process doesn't work as expected for weekly.

Code that I have attempted for both monthly & weekly:

Data:

library(tidyverse)
library(quantmod)
library(zoo)
library(xts)

adani_green_df <- read.csv("https://raw.githubusercontent.com/johnsnow09/covid19-df_stack-code/main/adani_daily_data.csv") 

Monthly: (work as expected)

adani_monthly_zoo <- adani_green_df %>% 
  select(date,CLOSE) %>% 
  set_names(.,c("date","Close")) %>% 
  
  read.zoo(.,format = "%Y-%m-%d") %>% 
  to.monthly() %>%  
  Cl() %>% 
  as.ts()

adani_monthly_zoo 
########### output #############

         Jan     Feb     Mar     Apr     May     Jun     Jul     Aug     Sep     Oct     Nov     Dec
2018                                           27.15   61.20   57.70   41.00   39.55   46.75   41.90
2019   35.70   33.05   37.25   40.25   43.10   44.45   46.75   46.20   58.25   89.65  137.20  166.50
2020  189.45  154.90  153.65  210.25  247.85  358.70  340.65  453.60  737.85  856.25 1120.80 1052.55
2021 1003.75 1168.05 1104.30 1019.00 1267.25 1116.90  888.20 1066.85 1147.25 1146.35 1291.20 1327.75
2022 1878.75 1839.10 1913.40 2887.30 1898.80 1929.00 2168.45 2436.70 2347.00

Weekly: (Resulting Data below isn't compiled as it is for monthly in above)

adani_weekly_zoo <- adani_green_df %>% 
  select(date,CLOSE) %>% 
  set_names(.,c("date","Close")) %>% 
  
  read.zoo(.,format = "%Y-%m-%d") %>% 
  to.weekly() %>%  
  Cl() %>% 
  as.ts()

adani_weekly_zoo 
########### output #############

Time Series:
Start = 17704 
End = 19254 
Frequency = 1 
   [1]   29.45      NA      NA      NA      NA      NA      NA   27.15      NA      NA      NA      NA
  [13]      NA      NA   30.05      NA      NA      NA      NA      NA      NA   31.50      NA      NA
  [25]      NA      NA      NA      NA   35.30      NA      NA      NA      NA      NA      NA   53.00
  [37]      NA      NA      NA      NA      NA      NA   70.80      NA      NA      NA      NA      NA
  [49]      NA   66.90      NA      NA      NA      NA      NA      NA   55.05      NA      NA      NA

Decompose: decompose() is working on monthly timeseries that is created above but not working on weekly timeseries.

# works
adani_monthly_zoo %>% 
  decompose() %>% 
  plot()
# doesn't work
adani_weekly_zoo %>% 
  decompose() %>% 
  plot()

Error in decompose(.) : time series has no or less than 2 periods

CodePudding user response:

You should use ts rather than as.ts:

adani_weekly_zoo <- adani_green_df %>% 
  select(date,CLOSE) %>% 
  set_names(.,c("date","Close")) %>% 
  read.zoo(.,format = "%Y-%m-%d") %>%
  to.weekly() %>%
  `colnames<-`(c('Open', 'High', 'Low', 'Close')) %>%
  Cl() %>%
  ts(frequency = 52, start = c(2018, 25)) 

This results in:

print(adani_weekly_zoo, calendar = TRUE)
#>           p1      p2      p3      p4      p5      p6      p7      p8      p9
#> 2018                                                                        
#> 2019   40.75   39.90   38.70   35.15   35.70   33.85   31.00   32.70   34.25
#> 2020  192.65  220.15  188.35  190.00  186.35  212.30  206.95  194.95  154.90
#> 2021 1066.15 1014.90  947.55 1047.45 1003.75 1076.05 1061.20 1118.40 1168.05
#> 2022 1327.75 1422.55 1697.50 1935.30 1936.35 1951.95 1905.95 2045.25 1922.75
#>          p10     p11     p12     p13     p14     p15     p16     p17     p18
#> 2018                                                                        
#> 2019   35.05   37.95   36.00   37.25   36.65   36.90   45.05   43.10   41.85
#> 2020  144.95  124.80  142.05  148.70  152.95  174.80  198.75  196.05  210.25
#> 2021 1171.05 1160.30 1191.50 1211.55 1159.50 1142.90 1107.65 1052.40 1019.00
#> 2022 1891.20 1859.05 1905.75 1922.45 1941.90 2323.95 2864.75 2888.40 2887.30
#>          p19     p20     p21     p22     p23     p24     p25     p26     p27
#> 2018                                                   29.45   27.15   30.05
#> 2019   38.95   37.40   42.45   43.10   43.40   42.70   41.25   44.45   47.55
#> 2020  212.10  230.10  240.60  247.85  284.20  336.45  420.60  397.40  375.65
#> 2021 1102.70 1088.15 1300.45 1240.20 1277.20 1226.65 1062.75 1160.55 1008.70
#> 2022 2872.45 2171.70 2271.80 2165.85 1853.70 1755.75 1713.80 1856.65 1959.70
#>          p28     p29     p30     p31     p32     p33     p34     p35     p36
#> 2018   31.50   35.30   53.00   70.80   66.90   55.05   55.85   57.70   55.90
#> 2019   49.55   45.85   49.60   46.95   46.35   46.80   43.80   46.20   45.00
#> 2020  372.35  347.10  336.20  340.65  371.00  357.70  376.65  462.15  544.65
#> 2021 1020.50 1003.85  978.30  888.20  911.25  917.45  931.10 1039.75 1079.80
#> 2022 1920.90 2071.45 2119.80 2168.45 2103.05 2175.80 2411.10 2393.00 2383.05
#>          p37     p38     p39     p40     p41     p42     p43     p44     p45
#> 2018   55.10   49.65   41.00   33.75   35.25   37.45   36.25   41.35   45.40
#> 2019   50.20   51.55   57.30   65.05   78.80   89.70   90.15   89.85   86.70
#> 2020  608.45  664.60  647.85  747.05  683.85  704.95  718.90  856.25  859.95
#> 2021 1213.55 1153.00 1152.15 1151.15 1176.90 1205.85 1202.05 1146.35 1199.30
#> 2022 2314.80 2315.30 2347.00                                                
#>          p46     p47     p48     p49     p50     p51     p52
#> 2018   42.65   45.75   46.75   44.35   44.35   42.55   41.65
#> 2019   95.50  106.60  137.20  126.40  132.60  135.00  151.05
#> 2020 1008.70 1135.55 1120.80 1129.20 1041.10 1016.15 1035.80
#> 2021 1292.25 1346.85 1372.35 1388.90 1388.70 1405.05 1385.85
#> 2022  

And

adani_weekly_zoo %>%
  decompose() %>%
  plot()

enter image description here

CodePudding user response:

The reason that the monthly one worked is that the index is automatically converted to yearmon which works with decompose.

First we show how to get the montly ts series using compact code.

Next for weekly data there is the problem of how we get around the fact that every year must have the same number of weeks. There are a number of possibilities but here we use the week definition given by %U which goes from 01-53 and remove week 53. This omits a few days per year but does maintain year alignment. Since we are dropping 6 days in every week to get weekly data it seems that perhaps it is not so bad to be dropping a few extra days per year to maintain year alignment.

First define two one-line functions to

  1. remove week 53 using the %U week definition
  2. convert Date vectors to the form year (week-1)/52 for weeks 1-52

Next remove the week 53 using rm53 and then convert the Dates using Date2yearwk. Finally we set the frequency to 52 and convert to ts.

library(zoo)

z <- read.csv.zoo(u)  # u is URL shown in Note at end

### monthly

tt.m <- z |>
  aggregate(as.yearmon, tail, 1) |>
  as.ts()

### weekly

# given Date vector return year   (week-1)/52
# It assumes that weeks are numbered 1-52 and 
# that week 53 data has already been removed.
Date2yearwk <- function(x) {
    with(read.table(text = format(x, "%Y %U")), V1   (V2-1)/52)
}

# given zoo series with Date index remove values in week 53
rm53 <- function(x) {
  x[format(time(x), "%U") != "53"]
}

tt.w <- z |>
  rm53() |>
  aggregate(Date2yearwk, tail, 1) |>
  `frequency<-`(52) |>
  as.ts()

Note

u <- "https://raw.githubusercontent.com/johnsnow09/covid19-df_stack-code/main/adani_daily_data.csv"
  • Related