Let's say I am training a model to predict tomorrow's sales. I have feature data for both past and future days, and I know my past sales. For tomorrow, I know that it is a weekday, that it will rain, and that it is a holiday. How can I use this data to make predictions?
The dataset looks like this:
Weekday | Holiday | Weather | Sales |
---|---|---|---|
1 | 0 | Rainy | 25 |
1 | 0 | Rainy | 27 |
1 | 1 | Sunny | 23 |
0 | 0 | Sunny | 24 |
0 | 0 | Cloudy | 31 |
I created the training set from the previous 150 days and trained a multivariate LSTM. However, to make a prediction I use only the previous days' data.
I have data about tomorrow and want to use it. How can I do that?
CodePudding user response:
You can shift the Weekday/Holiday/Weather columns by -1 and use them as inputs during training, so that each row also carries the next day's known values. Then, at inference time, you use tomorrow's data as those inputs.
As an example please see "Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, 3rd Edition", by Aurélien Géron, p. 559:
"...df_mulvar["next_day_type"] = df["day_type"].shift(-1) # we know tomorrow's type"
This example is also available at (see section "Multivariate time series"): https://github.com/ageron/handson-ml3/blob/main/15_processing_sequences_using_rnns_and_cnns.ipynb
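For the table in the question, the shift could look roughly like the sketch below. It is only an illustration using pandas: the `next_*` column names, the `known_ahead` list, and the `tomorrow` dictionary are made up here, and the Weather column would still need to be encoded (for example one-hot) and the rows windowed into sequences before being fed to an LSTM.

```python
import pandas as pd

# Toy data copied from the question's table; the last row plays the role of "today".
df = pd.DataFrame({
    "Weekday": [1, 1, 1, 0, 0],
    "Holiday": [0, 0, 1, 0, 0],
    "Weather": ["Rainy", "Rainy", "Sunny", "Sunny", "Cloudy"],
    "Sales": [25, 27, 23, 24, 31],
})

# Features that are already known for the next day (assumed list).
known_ahead = ["Weekday", "Holiday", "Weather"]

# shift(-1) moves every value up one row, so row t now holds day t+1's value.
for col in known_ahead:
    df[f"next_{col}"] = df[col].shift(-1)

# The target is the next day's sales.
df["next_Sales"] = df["Sales"].shift(-1)

# Training rows: every day whose "next day" is already in the data.
# The last row has no next day yet, so it is left out of training.
train = df.iloc[:-1]

# Inference: take the last row and fill in what is known about tomorrow
# (hypothetical values matching the question: weekday, holiday, rain).
tomorrow = {"Weekday": 1, "Holiday": 1, "Weather": "Rainy"}
latest = df.iloc[[-1]].copy()
for col in known_ahead:
    latest[f"next_{col}"] = tomorrow[col]

print(train[["Sales", "next_Weekday", "next_Holiday", "next_Weather", "next_Sales"]])
print(latest[["Sales", "next_Weekday", "next_Holiday", "next_Weather"]])
```

With this layout the model sees the same kind of input at training and at inference time: the history up to today plus the already-known features of the day being predicted.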