LSTM used for regression

Problem: I have S sequences of T timesteps each, and each timestep contains F features, so the dataset has shape (S x T x F). Each sequence s in S is described by 2 values (Target_1 and Target_2).

Goal: Train an LSTM-based architecture to learn a function approximator M that, given a sequence s, predicts Target_1 and Target_2.

Something like this:

M(s) ~ (Target_1, Target_2)
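To make the data layout concrete, here is a minimal NumPy sketch of the arrays such a model would consume. The sizes, the random values, and the idea that some sequences are shorter than T and get padded are assumptions for illustration only; the padding value -10.0 is chosen to match the Masking(mask_value=-10.0) layer in the code further down, which skips a timestep when all of its features equal the mask value.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration
S, T, F = 4, 6, 3
MASK_VALUE = -10.0  # same sentinel as Masking(mask_value=-10.0)

rng = np.random.default_rng(0)

# Assumption: raw sequences may have fewer than T timesteps
lengths = [6, 4, 5, 2]
raw = [rng.normal(size=(n, F)) for n in lengths]

# Start from an array filled with the mask value, then copy each
# sequence into the leading timesteps; the tail stays padded
X = np.full((S, T, F), MASK_VALUE, dtype="float32")
for i, seq in enumerate(raw):
    X[i, :len(seq)] = seq

# One row per sequence: column 0 is Target_1, column 1 is Target_2
y = rng.normal(size=(S, 2)).astype("float32")

print(X.shape, y.shape)
```

With this layout, X is what gets passed as x_train and y as y_train; y[:, 0] and y[:, 1] are the two regression targets.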

I'm really struggling to find a way. Below is a Keras implementation of an example that probably does not work. I made 2 models, one for the first target value and one for the second.

model1 = Sequential()
model1.add(Masking(mask_value=-10.0))
model1.add(LSTM(1, input_shape=(batch, timesteps, features), return_sequences = True))  
model1.add(Flatten())
model1.add(Dense(hidden_units, activation = "relu"))
model1.add(Dense(1, activation = "linear"))
model1.compile(loss='mse', optimizer=Adam(learning_rate=0.0001))
model1.fit(x_train, y_train[:,0], validation_data=(x_test, y_test[:,0]), epochs=epochs, batch_size=batch, shuffle=False)

model2 = Sequential()
model2.add(Masking(mask_value=-10.0))
model2.add(LSTM(1, input_shape=(batch, timesteps, features), return_sequences=True))
model2.add(Flatten())
model2.add(Dense(hidden_units, activation = "relu"))
model2.add(Dense(1, activation = "linear"))
model2.compile(loss='mse', optimizer=Adam(learning_rate=0.0001))
model2.fit(x_train, y_train[:,1], validation_data=(x_test, y_test[:,1]), epochs=epochs, batch_size=batch, shuffle=False)

I want to make good use of the LSTM's time-dependent memory in order to achieve good regression.

CodePudding user response:

IIUC, you can start off with a simple (naive) approach by using two output layers:

import tensorflow as tf

timesteps, features = 20, 5
inputs = tf.keras.layers.Input((timesteps, features))
x = tf.keras.layers.Masking(mask_value=-10.0)(inputs)
x = tf.keras.layers.LSTM(32, return_sequences=False)(x)
x = tf.keras.layers.Dense(32, activation = "relu")(x)
output1 = tf.keras.layers.Dense(1, activation = "linear", name='output1')(x)
output2 = tf.keras.layers.Dense(1, activation = "linear", name='output2')(x)

model = tf.keras.Model(inputs, [output1, output2])
model.compile(loss='mse', optimizer=tf.keras.optimizers.Adam(learning_rate=0.0001))

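# Random dummy data, only to show that the model trains end to end
x_train = tf.random.normal((500, timesteps, features))
y_train = tf.random.normal((500, 2))
# One target array per output layer, in the same order as the model outputs
model.fit(x_train, [y_train[:,0],y_train[:,1]] , epochs=5, batch_size=32, shuffle=False)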
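As a possible variation on the two-output model above (not part of the original answer), both targets can also come from a single Dense layer with 2 units, so that y_train can be passed as one (S, 2) array. The sketch below assumes TensorFlow 2.x; the names single_head, x_demo and y_demo and the dummy data are hypothetical, and the layer sizes are simply copied from the code above.

```python
import numpy as np
import tensorflow as tf

timesteps, features = 20, 5

inputs = tf.keras.layers.Input((timesteps, features))
# Timesteps whose features all equal -10.0 are skipped by the LSTM
x = tf.keras.layers.Masking(mask_value=-10.0)(inputs)
# return_sequences defaults to False: only the last state is returned
x = tf.keras.layers.LSTM(32)(x)
x = tf.keras.layers.Dense(32, activation="relu")(x)
# One linear layer with 2 units: column 0 ~ Target_1, column 1 ~ Target_2
outputs = tf.keras.layers.Dense(2, activation="linear", name="targets")(x)

single_head = tf.keras.Model(inputs, outputs)
single_head.compile(loss="mse",
                    optimizer=tf.keras.optimizers.Adam(learning_rate=0.0001))

# Random dummy data, only to check that shapes line up
x_demo = np.random.normal(size=(64, timesteps, features)).astype("float32")
y_demo = np.random.normal(size=(64, 2)).astype("float32")
single_head.fit(x_demo, y_demo, epochs=1, batch_size=32, verbose=0)

# Predictions come back as one array of shape (n_sequences, 2)
preds = single_head.predict(x_demo[:4], verbose=0)
target_1_pred, target_2_pred = preds[:, 0], preds[:, 1]
```

With the 'mse' loss, this variant averages the squared error over both targets, so targets on very different scales may need to be normalized first. Keeping two separate output layers, as in the answer, makes it easier to assign a different loss or loss weight to each target.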