Why is my Tensorflow LSTM Timeseries model returning only one value into the future


I have a TensorFlow model for predicting time series values using an LSTM. It trains fine, but when I ask it to predict values into the future it only gives me the T+1 value.

How can I make it give me the values from T+1 to T+n instead of just T+1?

I thought about feeding the predicted value back into the model to analyse again in a loop, e.g.

We look back at 20 samples in this example:

T±0 = current value
T-k = value k steps into the past (known)
T+n = value n steps into the future (unknown at the start)

--- Algorithm

T+1 = model.predict(data from T-20 to T±0)
T+2 = model.predict(data from T-19 to T+1) # using the previously found T+1 value
T+3 = model.predict(data from T-18 to T+2) # using the previously found T+1 and T+2 values
.
.
.
T+n = model.predict(data from T-(k-n) to T+(n-1)) # using the previously found T+1 .. T+(n-1) values
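
A rough sketch of that loop (assuming the model takes a (1, window_size, 10) array and returns one 10-feature step, as in my code below):

import numpy as np

def recursive_forecast(model, window, n_future):
    # window: np.ndarray of shape (window_size, 10), the last known values
    window = window.copy()
    preds = []
    for _ in range(n_future):
        next_step = model.predict(window[np.newaxis, ...])[0]  # shape (10,)
        preds.append(next_step)
        # slide the window: drop the oldest row, append the prediction as the newest
        window = np.vstack([window[1:], next_step])
    return np.array(preds)  # shape (n_future, 10)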

The thing is that T+1 has a mean absolute error of around 0.75%. Doesn't the error propagate/compound through the predictions? If it does, asking the program to predict T+10 would compound to roughly (1 + 0.0075)^10 - 1 ≈ 7.7% mean absolute error, which is not very good in my case. So I'm looking for other ways to predict values up to T+n.

I've looked at a few YouTube tutorials, but each time their call to model.predict(X) seems to return multiple values already, and I have no idea what parameters I could have missed.

Code:

import tensorflow.keras as tfkr
import pandas as pd
import numpy as np

def model_training(dataframe, folder_name, window_size=40, epochs=100, batch_size=64):
    """Function to start training the model on given data
    
    Parameters
    ----------
    dataframe : `pandas.DataFrame` The dataframe to train the model on
    folder_name : `str` Name of the folder to save model checkpoints in
    window_size : `int` The size of the lookback window to use
    epochs : `int` The number of epochs to train for
    batch_size : `int` The batch size to use for training

    Returns
    -------
    None
    """
    dataframe, _ = Process.pre(dataframe)  # function to standardize each column of the data

    TRAIN_SIZE = 0.7
    VAL_SIZE = 0.2
    TEST_SIZE = 0.1

    #Splitting the data into train, validation and test sets
    x, y = dataframe_to_xy(dataframe, window_size)  # converts the dataframe to windowed numpy arrays

    x_train, y_train = x[:int(len(dataframe)*TRAIN_SIZE)], y[:int(len(dataframe)*TRAIN_SIZE)]
    x_val, y_val = x[int(len(dataframe)*TRAIN_SIZE):int(len(dataframe)*(TRAIN_SIZE+VAL_SIZE))], y[int(len(dataframe)*TRAIN_SIZE):int(len(dataframe)*(TRAIN_SIZE+VAL_SIZE))]
    x_test, y_test = x[int(len(dataframe)*(TRAIN_SIZE+VAL_SIZE)):], y[int(len(dataframe)*(TRAIN_SIZE+VAL_SIZE)):]

    #Creating the model base
    model = tfkr.models.Sequential()
    model.add(tfkr.layers.InputLayer(input_shape=(window_size, 10)))
    model.add(tfkr.layers.LSTM(64))
    model.add(tfkr.layers.Dense(8, activation='relu'))
    model.add(tfkr.layers.Dense(10, activation='linear'))

    model.summary()

    #Compiling and saving the model
    cp = tfkr.callbacks.ModelCheckpoint('ai\\models\\' + folder_name + '\\', save_best_only=True)
    model.compile(loss=tfkr.losses.MeanSquaredError(), optimizer=tfkr.optimizers.Adam(learning_rate=0.0001), metrics=[tfkr.metrics.RootMeanSquaredError()])

    model.fit(x_train, y_train, epochs=epochs, batch_size=batch_size, validation_data=(x_val, y_val), callbacks=[cp])

def predict_data(model, data_pre, window_size, n_future):
    '''Function to predict data using the model
    
    Parameters
    ----------
    model : `tensorflow.keras.models.Sequential` The model to use for prediction
    data_pre : `pandas.DataFrame` The dataframe to predict on
    window_size : `int` The size of the lookback window to use
    n_future : `int` Number of values to predict in the future

    Returns
    -------
    data_pred : `pandas.DataFrame` The dataframe containing the predicted values
    '''

    time_interval = data_pre.index[1] - data_pre.index[0] 

    #Setting up the dataframe to predict on
    data_pre, proc_params = Process.pre(data_pre) #function to standardize each column of the data
    data_pred = data_pre.iloc[-window_size:]
    data_pred = data_pred.to_numpy().astype('float32')
    data_pred = data_pred.reshape(1,window_size,10)

    #Predicting the data
    data_pred = model.predict(data_pred)
    
    #Converting the data from numpy array to pandas dataframe and doing some formatting + post-processing/reversing the standardization
    #yada yada pandas dataframe

    return data_pred
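
For context, dataframe_to_xy is not shown above; it is just the usual sliding-window construction. A minimal sketch of what it does, reconstructed here for completeness:

def dataframe_to_xy(dataframe, window_size):
    # x[i] holds window_size consecutive rows, y[i] is the row right after them
    values = dataframe.to_numpy().astype('float32')
    x, y = [], []
    for i in range(len(values) - window_size):
        x.append(values[i:i + window_size])
        y.append(values[i + window_size])
    return np.array(x), np.array(y)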

If there is no way of preventing the error propagation, do you have any tips for reducing the model's error?

Thanks in advance.

CodePudding user response:

One way you can do this: let's assume you initially trained your model something like this:

x_train = [
[1, 2, 3, 4, 5, 6, 7],
[2, 3, 4, 5, 6, 7, 8],
]
y_train = [
[8],
[9]
]

And the training was done using model.fit with all the appropriate reshaping:

model.fit(x_train, y_train)

Then you would expect something like this when you try to predict:

ans = model.predict(x_train[0])
ans: 8

But let's say you wanted it to predict the next value as well, with the expected output being 8 and then 9; or, for n expected outputs, a result such as 8, 9, 10, 11... and so on.

Then I suggest approaching the problem the following way:

import numpy as np

def predict_continuous(model, input_arr, num_ahead):
    # Your input arr was [1,2,3,4,5,6,7]
    # You want the output to go beyond just 8, let's say 8, 9, 10, 11, ...
    # num_ahead: how far into the future you need predictions.
    predictions = []
    model_input = np.asarray(input_arr, dtype='float32')
    for i in range(num_ahead):
        # The model expects (batch, timesteps, features), so reshape on each call;
        # here we assume a single feature per time step.
        next_val = float(model.predict(model_input.reshape(1, -1, 1))[0, 0])
        predictions.append(next_val)  # subsequent predictions are appended
        # The LSTM was trained on a fixed window size, so keep the input length
        # constant: strip off the first value and append the latest prediction.
        model_input = np.append(model_input[1:], next_val)
    return predictions

The exact reshape depends on how your model was built (here I assumed a single feature per time step), but the core idea remains the same.
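
For instance, with the toy data above (assuming the model was trained as described), the usage would look like:

preds = predict_continuous(model, x_train[0], num_ahead=4)
print(preds)  # ideally something close to [8, 9, 10, 11]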

CodePudding user response:

You could let your LSTM return the full sequence, not only the last output, like this:

model = tfkr.models.Sequential()
model.add(tfkr.layers.InputLayer(input_shape=(window_size, 10)))
model.add(tfkr.layers.LSTM(64, return_sequences=True))
model.add(tfkr.layers.Dense(8, activation='relu'))
model.add(tfkr.layers.Dense(10, activation='linear'))

Each LSTM output would go through the same dense weights (because we did not flatten). Your output would then be of shape

(None, window_size, 10)

This means that for k input time points you would get k output time points.

The downside is that the first output would be calculated using only the first input, the second output using only the first two inputs, and so on. So I would suggest using a bidirectional LSTM and combining both directions, maybe like this:

model = tfkr.models.Sequential()
model.add(tfkr.layers.InputLayer(input_shape=(window_size, 10)))
model.add(tfkr.layers.Bidirectional(tfkr.layers.LSTM(64, return_sequences=True), merge_mode='sum'))
model.add(tfkr.layers.Dense(8, activation='relu'))
model.add(tfkr.layers.Dense(10, activation='linear'))
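
To sanity-check the output shape (a quick sketch, assuming window_size = 40 and 10 features as in your question; the same shape also holds for the plain return_sequences model above):

import numpy as np

dummy = np.zeros((1, 40, 10), dtype='float32')  # (batch, window_size, features)
print(model(dummy).shape)  # (1, 40, 10): one 10-feature output per input time step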