I am trying to write forecasts into a data frame as they are generated in a loop. However, each forecast overwrites the same column of the data frame, and I want each one to go into its own column. I am not sure what I am doing wrong. Please share your thoughts on this.
library(dplyr)
library(tidyverse)
library(tidyr)
library(tidymodels)
library(forecast)
library(prophet)
library(readxl)
library(writexl)
library(tibble)
pd <- readxl::read_excel("C:/X/X/X/X/Dummy.xlsx")
colnames(pd)[1]="ds"
colnames(pd)
pd1 <- pd %>% select(`X1`,`X2`,`X3`)
pd2 <- pd %>% select(`X1`)
Output = data.frame()
for(i in 2:ncol(pd))
{
Yi<- ts(data = pd[,i],
frequency = 12,
start = c(2019,1),
end = c(2022,8))
#print(Yi)
Model = HoltWinters(x=Yi,
seasonal = 'additive')
Predictions = forecast(Model,h=6)
print(Predictions$mean)
Output = as.data.frame(Predictions$mean)
print(Output)
}
When I print the output, I can see that the forecast is written into the data frame, as shown in the image below, but it is overwritten on every iteration because I can't specify the column reference.
https://imgur.com/gallery/2MTZq5r
I also tried the following, but it failed as well:
Output[,i] = as.data.frame(Predictions$mean)
The ideal output is shown in the image below:
https://imgur.com/gallery/t98hUJE
But even this output will be fine if the other one is not possible.
https://imgur.com/gallery/pzxXifs
The data frame pd is given below:
structure(list(ds = c("2019-01-01", "2019-02-01", "2019-03-01",
"2019-04-01", "2019-05-01", "2019-06-01", "2019-07-01", "2019-08-01",
"2019-09-01", "2019-10-01", "2019-11-01", "2019-12-01", "2020-01-01",
"2020-02-01", "2020-03-01", "2020-04-01", "2020-05-01", "2020-06-01",
"2020-07-01", "2020-08-01", "2020-09-01", "2020-10-01", "2020-11-01",
"2020-12-01", "2021-01-01", "2021-02-01", "2021-03-01", "2021-04-01",
"2021-05-01", "2021-06-01", "2021-07-01", "2021-08-01", "2021-09-01",
"2021-10-01", "2021-11-01", "2021-12-01", "2022-01-01", "2022-02-01",
"2022-03-01", "2022-04-01", "2022-05-01", "2022-06-01", "2022-07-01",
"2022-08-01"), X1 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 5,
85, 72, 111, 96, 50, 95, 48, 87, 75, 249, 173, 74, 86, 127, 209,
92, 137, 49, 84, 75, 73, 376, 196, 91, 107, 124, 177, 244, 275,
100, 176), X2 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 19, 29, 243,
281, 262, 283, 0, 264, 104, 289, 41, 76), X3 = c(0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 157, 171, 377, 409, 375, 314, 253, 322,
130, 472, 115, 179)), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -44L))
CodePudding user response:
If you do what I did below, the output becomes a list with one element per series, which is then combined column-wise. Is this what you wanted?
library(forecast)
pd<-structure(list(ds=c("2019-01-01","2019-02-01","2019-03-01",
"2019-04-01","2019-05-01","2019-06-01","2019-07-01","2019-08-01",
"2019-09-01","2019-10-01","2019-11-01","2019-12-01","2020-01-01",
"2020-02-01","2020-03-01","2020-04-01","2020-05-01","2020-06-01",
"2020-07-01","2020-08-01","2020-09-01","2020-10-01","2020-11-01",
"2020-12-01","2021-01-01","2021-02-01","2021-03-01","2021-04-01",
"2021-05-01","2021-06-01","2021-07-01","2021-08-01","2021-09-01",
"2021-10-01","2021-11-01","2021-12-01","2022-01-01","2022-02-01",
"2022-03-01","2022-04-01","2022-05-01","2022-06-01","2022-07-01",
"2022-08-01"),
X1 = c(0,0,0,0,0,0,0,0,0,0,0,0,5,85,72,111,96,50,95,48,87,75,249,173,74,86,127,209,92,137,49,84,75,73,376,196,91,107,124,177,244,275,100,176),
X2 = c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,19,29,243,281,262,283,0,264,104,289,41,76),
X3 = c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,157,171,377,409,375,314,253,322,130,472,115,179)),
class = c("tbl_df","tbl","data.frame"),
row.names = c(NA,-44L))
colnames(pd)[1]="ds"
colnames(pd)
output = list()
for(i in c("X1","X2","X3")) {
Yi<-ts(data = pd[,i],
frequency = 12,
start = c(2019,1),
end = c(2022,8))
Model<-HoltWinters(x=Yi,seasonal = 'additive')
Predictions = forecast(Model,h=6)
print(Predictions$mean)
output[[i]] = as.data.frame(Predictions$mean)
}
do.call(cbind,output)
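A variation on the same idea, offered only as a sketch under assumptions: store each forecast as a plain numeric vector in a named list, then convert the list to a data frame so that every series keeps its own column name. Here `toy_forecast()` is a hypothetical stand-in (it just repeats the mean of the last 12 observations) so the example runs without the forecast package, and `pd_demo` is made-up data. In the real loop you would store something like as.numeric(Predictions$mean) instead.

```r
# Hypothetical stand-in for forecast(Model, h = h)$mean:
# repeats the mean of the last 12 observations h times.
toy_forecast <- function(x, h) rep(mean(tail(x, 12)), h)

# Made-up demo data with the same layout as pd (a date column plus series)
pd_demo <- data.frame(
  ds = format(seq(as.Date("2021-01-01"), by = "month", length.out = 24)),
  X1 = 1:24,
  X2 = (1:24) * 2,
  X3 = rep(5, 24)
)

h <- 6
output <- list()
for (nm in c("X1", "X2", "X3")) {
  # [[nm]] extracts the column as a plain vector; the list element takes the series name
  output[[nm]] <- toy_forecast(pd_demo[[nm]], h)
}

# A named list of equal-length vectors becomes one column per series
result <- as.data.frame(output)
print(result)
```

Because the list elements are named after the series, the resulting columns are named X1, X2 and X3 without any extra renaming step.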
CodePudding user response:
The issue arises because you created an empty data frame, and because Output = as.data.frame(Predictions$mean) inside the loop replaces the whole data frame on every iteration. If you instead create a data frame sized to the number of forecast months and the number of time series columns, and fill it with NA, the code above works with a few changes. The code below creates such a data frame to match your requirement.
A placeholder data frame of this kind can be created with, for example, data.frame(matrix(NA,nrow = 6,ncol = 6))
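To illustrate the difference, here is a minimal sketch of assigning into column 2 of an empty data frame versus a preallocated one. The vector `fc` is a made-up stand-in for six forecast values, not output from the model.

```r
# Stand-in for six forecast values (made-up numbers, not model output)
fc <- c(10, 20, 30, 40, 50, 60)

# An empty data frame has 0 rows and 0 columns, so column 2 cannot be filled
empty <- data.frame()
attempt <- try(empty[, 2] <- fc, silent = TRUE)
failed <- inherits(attempt, "try-error")
print(failed)

# A preallocated frame already has the rows and columns in place
filled <- data.frame(matrix(NA, nrow = 6, ncol = 3))
filled[, 2] <- fc
print(filled)
```

The first assignment raises an error, while the second one fills only the second column and leaves the others as NA until their turn in the loop.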
To determine the number of rows and columns dynamically, you can use the code below.
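library(dplyr)  # provides %>% and select()
freq = 6  # forecast horizon in months
pd1 <- pd %>% select(`X1`,`X2`,`X3`)
# One row per forecast month, one column per series, filled with NA
Output = data.frame(matrix(NA,nrow = freq,ncol = ncol(pd1)))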
for(i in 1:ncol(pd1))
{
Yi<- ts(data = pd1[,i],
frequency = 12,
start = c(2019,1),
end = c(2022,8))
Model = HoltWinters(x=Yi,
seasonal = 'additive')
Predictions = forecast(Model,h=freq)
Output[,i] = as.data.frame(Predictions$mean)
}
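If the final table should also carry the series names and the forecast months, something like the following could be run after the loop. This is only a sketch under assumptions: `Output_demo` is a made-up stand-in for the filled `Output`, and the forecast months are assumed to start right after the last observed month in `pd` (2022-08-01).

```r
freq <- 6

# Made-up stand-in for the filled Output frame (not real forecasts)
Output_demo <- data.frame(matrix(1:18, nrow = freq, ncol = 3))

# Reuse the series names instead of the default X1, X2, ... from matrix()
names(Output_demo) <- c("X1", "X2", "X3")

# Forecast months: the freq months following the last observation
last_obs <- as.Date("2022-08-01")
forecast_months <- seq(last_obs, by = "month", length.out = freq + 1)[-1]

# Put the dates in front, formatted like the ds column in pd
final <- cbind(ds = format(forecast_months), Output_demo)
print(final)
```

In the real script, names(Output) <- names(pd1) would carry over the column names without typing them out.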