I have trained an LSTM model to predict multiple output values. The predicted values are almost identical for every sample, even though the loss is low. Why is this so, and how can I improve it?
`from keras import backend as K
import math
from sklearn.metrics import mean_squared_error, mean_absolute_error
from keras.models import Sequential
from keras.layers import LSTM, Dense, Dropout, Activation
def create_model():
    # Three stacked LSTM layers followed by two Dense layers with two linear outputs
    model = Sequential()
    model.add(LSTM(50, return_sequences=True, input_shape=(40000, 7)))
    model.add(LSTM(50, return_sequences=True))
    model.add(LSTM(50, return_sequences=False))
    model.add(Dense(25))
    model.add(Dense(2, activation='linear'))
    model.compile(optimizer='adam', loss='mean_squared_error')
    model.summary()
    return model
model = create_model()
model.fit(X_train, Y_train, shuffle=False, verbose=1, epochs=10)
prediction = model.predict(X_test, verbose=0)
print(prediction)
prediction =
[[0.26766795 0.00193274]
[0.2676593 0.00192017]
[0.2676627 0.00193239]
[0.2676644 0.00192784]
[0.26766634 0.00193461]
[0.2676624 0.00192487]
[0.26766685 0.00193129]
[0.26766685 0.00193165]
[0.2676621 0.00193216]
[0.26766127 0.00192624]]
`
Calculating the mean relative error:
`import tensorflow as tf
mean_relative_error = tf.reduce_mean(tf.abs((Y_test - prediction) / Y_test))
print(mean_relative_error)`
`mean_relative_error= 1.9220362`
CodePudding user response:
It means the model is not really learning a mapping x -> y; it is just pushing every prediction toward a single value that lies close to most of your y's. The relative error tells me that your y's are relatively small: the mean difference between y_hat and y is small in absolute terms (hence the low loss), but it is still large compared with y itself.
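To illustrate the point above, here is a minimal NumPy sketch. The target ranges are assumptions chosen to resemble the scale of the printed predictions, not the asker's actual data, and the "model" is simply a constant predictor that outputs the column means.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical targets: two outputs with small magnitudes (assumed ranges,
# picked only to be on a similar scale to the predictions in the question).
y_true = np.column_stack([
    rng.uniform(0.05, 0.5, size=1000),
    rng.uniform(0.0002, 0.004, size=1000),
])

# A "collapsed" model: the same output (the per-column mean) for every sample.
y_pred = np.tile(y_true.mean(axis=0), (len(y_true), 1))

# Same metrics as in the question: MSE as the loss, then mean relative error.
mse = np.mean((y_true - y_pred) ** 2)
mean_relative_error = np.mean(np.abs((y_true - y_pred) / y_true))

print(f"MSE: {mse:.5f}")
print(f"mean relative error: {mean_relative_error:.3f}")
```

With these assumed ranges the MSE comes out small while the mean relative error stays large, which is the same pattern as in the question: a low loss does not by itself show that the predictions track the targets.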
To break this symmetry, increase the number of LSTM cells and add Dropout to them; also add an L1 regularization term to your Dense layers.
Reduce the number of neurons in each Dense layer and increase the overall size of the network. Also change your loss from "mean_squared_error" to "mean_absolute_error".
One more thing: try Adagrad with a learning_rate of 1 instead of the Adam optimizer.
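The suggestions above could be combined roughly as follows. This is only a sketch: the function name `create_regularized_model`, the unit counts, the dropout rate and the L1 factor are all illustrative assumptions, and the sequence length is shortened from the 40000 used in the question so the example builds quickly.

```python
from keras.models import Sequential
from keras.layers import Input, LSTM, Dense, Dropout
from keras.optimizers import Adagrad
from keras.regularizers import l1


def create_regularized_model(timesteps, n_features, n_outputs=2):
    # All sizes and rates below are assumed values, not tuned settings.
    model = Sequential()
    model.add(Input(shape=(timesteps, n_features)))
    # More LSTM cells than the original 50, each layer followed by Dropout.
    model.add(LSTM(100, return_sequences=True))
    model.add(Dropout(0.2))
    model.add(LSTM(100, return_sequences=True))
    model.add(Dropout(0.2))
    model.add(LSTM(100))
    model.add(Dropout(0.2))
    # Fewer Dense neurons than the original 25, with an L1 penalty on the weights.
    model.add(Dense(10, kernel_regularizer=l1(1e-4)))
    model.add(Dense(n_outputs, activation='linear', kernel_regularizer=l1(1e-4)))
    # MAE loss and Adagrad with learning_rate=1, as suggested in the answer.
    model.compile(optimizer=Adagrad(learning_rate=1.0), loss='mean_absolute_error')
    return model


# Short sequences here only to keep the sketch cheap to build.
model = create_regularized_model(timesteps=20, n_features=7)
model.summary()
```

For the data in the question you would call it with `timesteps=40000, n_features=7` and then fit and predict exactly as before. Whether these settings actually help depends on the data, so compare the spread of the new predictions against `Y_test` rather than relying on the loss alone.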