How to Write a List into an Empty Data Frame with a Loop in R

Time:10-14

I am trying to write forecasts into a data frame as they are generated in a loop. However, each forecast overwrites the single column of the data frame, whereas I want each one to go into a separate column. I am not sure what I am doing wrong. Please share your thoughts on this.

library(dplyr)
library(tidyverse)
library(tidyr)
library(tidymodels)
library(forecast)
library(prophet)
library(readxl)
library(writexl)
library(tibble)

pd <- readxl::read_excel("C:/X/X/X/X/Dummy.xlsx")
colnames(pd)[1]="ds"

colnames(pd)
pd1 <- pd %>% select(`X1`,`X2`,`X3`)

pd2 <- pd %>% select(`X1`)

Output = data.frame()

for(i in 2:ncol(pd))
{
  Yi<- ts(data = pd[,i],
           frequency = 12,
           start = c(2019,1),
           end = c(2022,8))
  #print(Yi)
  
  Model = HoltWinters(x=Yi,
                      seasonal = 'additive')
  
  Predictions = forecast(Model,h=6)
  
  print(Predictions$mean)
  
  Output = as.data.frame(Predictions$mean)
  
  print(Output)
  
}

When I print the output, I can see that the forecast is written into the data frame, as shown in the image below, but it is overwritten on each iteration because I can't specify the column reference.

https://imgur.com/gallery/2MTZq5r

I tried the following, but it also failed:

Output[,i] = as.data.frame(Predictions$mean)

The ideal output is shown in the image below:

https://imgur.com/gallery/t98hUJE

But the output below would also be fine if the first one is not possible.

https://imgur.com/gallery/pzxXifs

The data frame pd is given below:

structure(list(ds = c("2019-01-01", "2019-02-01", "2019-03-01", 
"2019-04-01", "2019-05-01", "2019-06-01", "2019-07-01", "2019-08-01", 
"2019-09-01", "2019-10-01", "2019-11-01", "2019-12-01", "2020-01-01", 
"2020-02-01", "2020-03-01", "2020-04-01", "2020-05-01", "2020-06-01", 
"2020-07-01", "2020-08-01", "2020-09-01", "2020-10-01", "2020-11-01", 
"2020-12-01", "2021-01-01", "2021-02-01", "2021-03-01", "2021-04-01", 
"2021-05-01", "2021-06-01", "2021-07-01", "2021-08-01", "2021-09-01", 
"2021-10-01", "2021-11-01", "2021-12-01", "2022-01-01", "2022-02-01", 
"2022-03-01", "2022-04-01", "2022-05-01", "2022-06-01", "2022-07-01", 
"2022-08-01"), X1 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 5, 
85, 72, 111, 96, 50, 95, 48, 87, 75, 249, 173, 74, 86, 127, 209, 
92, 137, 49, 84, 75, 73, 376, 196, 91, 107, 124, 177, 244, 275, 
100, 176), X2 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 19, 29, 243, 
281, 262, 283, 0, 264, 104, 289, 41, 76), X3 = c(0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 157, 171, 377, 409, 375, 314, 253, 322, 
130, 472, 115, 179)), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -44L))

CodePudding user response:

If you do what I did below, output becomes a list with one element per series, and do.call(cbind, output) then binds those elements side by side. Is this what you wanted?

library(forecast)
pd<-structure(list(ds=c("2019-01-01","2019-02-01","2019-03-01",
                        "2019-04-01","2019-05-01","2019-06-01","2019-07-01","2019-08-01",
                        "2019-09-01","2019-10-01","2019-11-01","2019-12-01","2020-01-01",
                        "2020-02-01","2020-03-01","2020-04-01","2020-05-01","2020-06-01",
                        "2020-07-01","2020-08-01","2020-09-01","2020-10-01","2020-11-01",
                        "2020-12-01","2021-01-01","2021-02-01","2021-03-01","2021-04-01",
                        "2021-05-01","2021-06-01","2021-07-01","2021-08-01","2021-09-01",
                        "2021-10-01","2021-11-01","2021-12-01","2022-01-01","2022-02-01",
                        "2022-03-01","2022-04-01","2022-05-01","2022-06-01","2022-07-01",
                        "2022-08-01"),
                   X1 = c(0,0,0,0,0,0,0,0,0,0,0,0,5,85,72,111,96,50,95,48,87,75,249,173,74,86,127,209,92,137,49,84,75,73,376,196,91,107,124,177,244,275,100,176),
                   X2 = c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,19,29,243,281,262,283,0,264,104,289,41,76),
                   X3 = c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,157,171,377,409,375,314,253,322,130,472,115,179)),
              class = c("tbl_df","tbl","data.frame"),
              row.names = c(NA,-44L))
colnames(pd)[1]="ds"
colnames(pd)
output = list()
for(i in c("X1","X2","X3")) {
  Yi<-ts(data = pd[,i],
         frequency = 12,
         start = c(2019,1),
         end = c(2022,8))
  Model<-HoltWinters(x=Yi,seasonal = 'additive')
  Predictions = forecast(Model,h=6)
  print(Predictions$mean)
  output[[i]] = as.data.frame(Predictions$mean)
}

do.call(cbind,output)
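
For reference, the collect-in-a-list pattern above can be sketched in a self-contained form using only base R. This is a minimal sketch under several assumptions, not the answer's exact code: the data frame `toy` and its columns `A` and `B` are made-up data, `predict()` from the stats package stands in for `forecast()`, and the smoothing parameters are fixed by hand (values chosen arbitrarily) so the run is deterministic, whereas the code above lets HoltWinters estimate them. Storing plain numeric vectors in the list means each column keeps its series name.

```r
# Made-up monthly data (hypothetical, not the question's data): 3 years, 2 series.
month_index <- 1:36
toy <- data.frame(
  A = 100 + 0.5 * month_index + 10 * sin(2 * pi * month_index / 12),
  B = 50 + 5 * cos(2 * pi * month_index / 12)
)

horizon <- 6          # number of months to forecast
forecasts <- list()   # one element per series, filled inside the loop

for (name in names(toy)) {
  series <- ts(toy[[name]], frequency = 12, start = c(2019, 1))

  # Smoothing parameters are fixed here (an assumption of this sketch);
  # omit alpha/beta/gamma to let HoltWinters estimate them instead.
  model <- HoltWinters(series, alpha = 0.3, beta = 0.1, gamma = 0.1,
                       seasonal = "additive")

  # predict() returns a ts matrix with a "fit" column; as.numeric() flattens it.
  forecasts[[name]] <- as.numeric(predict(model, n.ahead = horizon))
}

# A named list of equal-length vectors converts directly to a data frame,
# one column per series.
output <- as.data.frame(forecasts)
print(output)
```

Because the list elements are named by series, the resulting data frame has one clearly labelled column per series and one row per forecast month.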

CodePudding user response:

The issue you are running into is that Output starts as an empty data frame with zero rows and zero columns, so there is nothing for Output[,i] to assign into, and Output = as.data.frame(Predictions$mean) replaces the whole object on every iteration. If you instead create a data frame with one row per forecast month and one column per time series, filled with NA, you should be able to get the output you want with the code above and a few changes. The code below creates such a data frame to match your requirement.

A data frame of a fixed size, filled with NA, can be created with data.frame(matrix(NA,nrow = 6,ncol = 6))
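
To illustrate the difference, here is a small sketch in base R. The vector `values` is a made-up stand-in for a forecast, and the names `empty_df` and `sized_df` are hypothetical. Assigning a column into a zero-row data frame raises an error, while the same assignment into a pre-sized data frame succeeds.

```r
values <- c(1.5, 2.5, 3.5)   # made-up stand-in for forecast values

# Zero-row, zero-column data frame, as in the question.
empty_df <- data.frame()

# Column assignment fails because the replacement has more rows than the data.
result <- tryCatch({
  empty_df[, 1] <- values
  "assigned"
}, error = function(e) "error")
print(result)

# Pre-sized data frame: one row per value, filled with NA.
sized_df <- data.frame(matrix(NA_real_, nrow = length(values), ncol = 2))
sized_df[, 2] <- values      # fills column 2 and leaves column 1 as NA
print(sized_df)
```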

To determine the number of rows and columns dynamically, you can use the code below.

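library(dplyr)     # select() and %>%
library(forecast)  # forecast()

freq = 6   # forecast horizon: number of months to predict

pd1 <- pd %>% select(`X1`,`X2`,`X3`)

# Pre-size the result: one row per forecast month, one column per series
Output = data.frame(matrix(NA,nrow = freq,ncol = ncol(pd1)))
names(Output) = names(pd1)

for(i in 1:ncol(pd1))
{
  # pd1[[i]] extracts column i as a plain numeric vector
  Yi<- ts(data = pd1[[i]],
           frequency = 12,
           start = c(2019,1),
           end = c(2022,8))
  
  Model = HoltWinters(x=Yi,
                      seasonal = 'additive')
  
  Predictions = forecast(Model,h=freq)
 
  # Fill column i instead of replacing the whole data frame
  Output[,i] = as.numeric(Predictions$mean)
  
}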