Can't get Keras TimeseriesGenerator to train LSTM but can train DNN


I am working on a bigger project, but I was able to reproduce the problem in a small Colab notebook that I was hoping someone could take a look at. I can train a Dense network successfully, but I cannot train an LSTM using a TimeseriesGenerator. See the following Google Colab notebook.

I know I am using a lookback length of 1 (just for this example; it doesn't make complete sense for an LSTM), but I will expand to more features and a greater lookback once I can figure out how to make this work.

In this example, I created a simple data frame with one input variable named input, from which the model predicts pos_points and neg_points (the Colab notebook outlines how those are calculated; it's very simple):

import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import *
from tensorflow.keras.models import *
tf.__version__

import pandas as pd
df = pd.DataFrame()
df['input'] = np.random.uniform(-10.0, 10.0, 50000)


# Simple rules:
#   If input is positive:
#     pos_points = input * input
#     neg_points = -0.5 * input
#
#   If input is negative:
#     pos_points = -0.5 * input
#     neg_points = -input * input
df['pos_points'] = df['input'].apply(lambda x: x*x if x > 0 else x * -0.5)
df['neg_points'] = df['input'].apply(lambda x: x*x*-1 if x < 0 else -x * 0.5)

target = pd.concat([df.pop(x) for x in ['pos_points', 'neg_points']], axis=1)

I am able to train it successfully via:

# Build a simple model to go from input to the two outputs
from tensorflow.keras import regularizers
def get_df_model():
  model = tf.keras.Sequential([
    tf.keras.layers.Dense(10, input_shape=[1,], activation='relu', kernel_regularizer=regularizers.l1_l2(l1=1e-5, l2=1e-4)),
    tf.keras.layers.Dense(10, activation='relu', kernel_regularizer=regularizers.l1_l2(l1=1e-5, l2=1e-4)),
    tf.keras.layers.Dense(2)
  ])

  model.compile(optimizer='adam', loss=tf.keras.losses.MeanSquaredError())
  return model


model = get_df_model()
model.fit(df, target, epochs=10)

which produces:

Epoch 1/10
1563/1563 [==============================] - 2s 1ms/step - loss: 194.1551
Epoch 2/10
1563/1563 [==============================] - 2s 1ms/step - loss: 26.1025
Epoch 3/10
1563/1563 [==============================] - 2s 1ms/step - loss: 7.3179
Epoch 4/10
1563/1563 [==============================] - 2s 1ms/step - loss: 1.1513
Epoch 5/10
1563/1563 [==============================] - 2s 1ms/step - loss: 0.4611
Epoch 6/10
1563/1563 [==============================] - 2s 1ms/step - loss: 0.3274
Epoch 7/10
1563/1563 [==============================] - 2s 1ms/step - loss: 0.2420
Epoch 8/10
1563/1563 [==============================] - 2s 1ms/step - loss: 0.1833
Epoch 9/10
1563/1563 [==============================] - 2s 1ms/step - loss: 0.1411
Epoch 10/10
1563/1563 [==============================] - 2s 1ms/step - loss: 0.1110

However, when I try to use a TimeseriesGenerator, I can't get the inputs quite right for the LSTM. Note that I am using a lookback of 1, so it should be effectively equivalent:

# Create timeseries generator
from tensorflow.keras.preprocessing.sequence import TimeseriesGenerator
lookback = 1
n_features = 1 # If you look at df, there is just the input column, a single number
train_generator = TimeseriesGenerator(df.astype(np.float32).to_numpy(), target.astype(np.float32).to_numpy(), length=lookback, batch_size=16)

print(train_generator[0][0].shape) # input shape, should print out (16, 1, 1) = batchsize, lookback length, input_size
print(train_generator[0][1].shape) # target shape, should print out (16, 2) = batchsize, # of outputs in final dense layer

# Build a simple model to go from input to the two outputs
def get_lstm_model():
  model = tf.keras.Sequential([
    tf.keras.layers.LSTM(10, input_shape=[lookback, n_features,], activation='relu', kernel_regularizer=regularizers.l1_l2(l1=1e-5, l2=1e-4)),
    tf.keras.layers.Dense(10, activation='relu', kernel_regularizer=regularizers.l1_l2(l1=1e-5, l2=1e-4)),
    tf.keras.layers.Dense(2)
  ])

  model.compile(optimizer='adam', loss=tf.keras.losses.MeanSquaredError())
  return model


model = get_lstm_model()
model.fit(train_generator, epochs=10)

producing:

(16, 1, 1)
(16, 2)
WARNING:tensorflow:Layer lstm_33 will not use cuDNN kernels since it doesn't meet the criteria. It will use a generic GPU kernel as fallback when running on GPU.
Epoch 1/10
3125/3125 [==============================] - 9s 3ms/step - loss: 722.0089
Epoch 2/10
3125/3125 [==============================] - 10s 3ms/step - loss: 686.9944
Epoch 3/10
3125/3125 [==============================] - 10s 3ms/step - loss: 687.0886
Epoch 4/10
3125/3125 [==============================] - 10s 3ms/step - loss: 687.0521
Epoch 5/10
3125/3125 [==============================] - 10s 3ms/step - loss: 687.0247
Epoch 6/10
3125/3125 [==============================] - 10s 3ms/step - loss: 686.9836
Epoch 7/10
3125/3125 [==============================] - 10s 3ms/step - loss: 686.9711
Epoch 8/10
3125/3125 [==============================] - 10s 3ms/step - loss: 686.9208
Epoch 9/10
3125/3125 [==============================] - 9s 3ms/step - loss: 686.9716
Epoch 10/10
3125/3125 [==============================] - 9s 3ms/step - loss: 686.9753
<keras.callbacks.History at 0x7f3d09185eb8>

CodePudding user response:

Okay, I figured out the answer, and it has to do with how the Keras TimeseriesGenerator maps inputs to targets.

I was using a table with input and output columns to organize the data. The generator always maps an input window to the target one ROW ahead of it, since it expects a traditional time-series format where the next row is predicted from the previous rows.
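
The offset is easy to see with a minimal standalone sketch: feed the generator a toy array as both data and targets and print the pairs it yields (nothing here is specific to my project, it is just the stock TimeseriesGenerator API):

import numpy as np
from tensorflow.keras.preprocessing.sequence import TimeseriesGenerator

# Toy series: inputs and targets are both just 0..4
data = np.arange(5).reshape(-1, 1)
targets = np.arange(5).reshape(-1, 1)

gen = TimeseriesGenerator(data, targets, length=1, batch_size=1)
for i in range(len(gen)):
    x, y = gen[i]
    print(x.ravel(), '=>', y.ravel())

# Output:
# [0] => [1]
# [1] => [2]
# [2] => [3]
# [3] => [4]
# i.e. the target always comes from the row *after* the input window.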

Thus, to solve this, I prepended one row of NaN to the front of the target data frame and dropped its last row, shifting the targets down by one. When I then call the generator, the inputs and targets map correctly:

# Shift the targets down by one row (NaN row in front, last row dropped) so that
# input row i lines up with target row i once the generator applies its offset
adjusted_target = pd.DataFrame([[np.nan] * len(target.columns)], columns=target.columns).append(target, ignore_index=True)[:-1]
train_generator = TimeseriesGenerator(df.astype(np.float32).to_numpy(), adjusted_target.astype(np.float32).to_numpy(), length=1, batch_size=1)
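
Side note: DataFrame.append has since been removed in pandas 2.0, so on newer pandas the same one-row shift can be written with shift() instead. A minimal equivalent sketch, assuming the same df and target frames as above:

# shift(1) moves every row down by one and fills row 0 with NaN, which is exactly
# the prepend-NaN-then-drop-last-row trick above. The NaN row is harmless: the
# generator's first target index is `length` (1 here), so target row 0 is never sampled.
adjusted_target = target.shift(1)
train_generator = TimeseriesGenerator(df.astype(np.float32).to_numpy(),
                                      adjusted_target.astype(np.float32).to_numpy(),
                                      length=1, batch_size=1)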

You can verify the inputs/outputs are mapping properly via the following code:

for i in range(len(train_generator)):
    x, y = train_generator[i]
    print('%s => %s' % (x, y))