All predicted values of the LSTM model are almost the same


I have trained an LSTM model to predict multiple output values. The predicted values are almost identical even though the loss is low. Why is that, and how can I improve it?


`from keras import backend as K
import math
from sklearn.metrics import mean_squared_error, mean_absolute_error
from keras.models import Sequential
from keras.layers import LSTM, Dense, Dropout, Activation

def create_model():
    model = Sequential()
    # 3 stacked LSTM layers; input is 40000 timesteps with 7 features each
    model.add(LSTM(50, return_sequences=True, input_shape=(40000, 7)))
    model.add(LSTM(50, return_sequences=True))
    model.add(LSTM(50, return_sequences=False))
    model.add(Dense(25))
    model.add(Dense(2, activation='linear'))

    model.compile(optimizer='adam', loss='mean_squared_error')
    model.summary()
    return model

model = create_model()

model.fit(X_train, Y_train, shuffle=False, verbose=1, epochs=10)


prediction = model.predict(X_test, verbose=0)
print(prediction)
`

This prints:

`
[[0.26766795 0.00193274]
 [0.2676593  0.00192017]
 [0.2676627  0.00193239]
 [0.2676644  0.00192784]
 [0.26766634 0.00193461]
 [0.2676624  0.00192487]
 [0.26766685 0.00193129]
 [0.26766685 0.00193165]
 [0.2676621  0.00193216]
 [0.26766127 0.00192624]]
`

Calculating the mean relative error:

`import tensorflow as tf

mean_relative_error = tf.reduce_mean(tf.abs((Y_test - prediction) / Y_test))
print(mean_relative_error)`


`mean_relative_error = 1.9220362`

CodePudding user response:

It means the model is mapping every input x to essentially the same output, i.e. it has collapsed to a near-constant prediction of y. The relative error tells me that your y values are relatively small, so the mean difference between y_hat and y looks small in absolute terms, while a mean relative error of about 1.9 means the predictions are off by almost twice the magnitude of the targets.
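
A quick sanity check for this kind of collapse (a minimal sketch, assuming `Y_train`, `Y_test` and `prediction` are the NumPy arrays from the question) is to compare the model against a baseline that always predicts the training-set mean; if the two MSEs are similar, the network has effectively learned a constant:

`import numpy as np
from sklearn.metrics import mean_squared_error

# Predict the per-column mean of the training targets for every test sample
baseline = np.tile(Y_train.mean(axis=0), (len(Y_test), 1))
print("baseline MSE:", mean_squared_error(Y_test, baseline))
print("model MSE:   ", mean_squared_error(Y_test, prediction))
`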

To break this symmetry, you should increase the number of LSTM cells and add Dropout to them, and also put an L1 regularization term on your Dense layers (see the sketch at the end of this answer).

Decrease the number of neurons in each Dense layer while increasing the overall network size, and also change your loss from "mean_squared_error" to "mean_absolute_error".

One more thing: use Adagrad with a learning rate of 1 instead of the Adam optimizer.
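
Putting those suggestions together, a rough sketch of the revised model might look like this (the unit counts, dropout rate and L1 factor are placeholder values to tune, not verified settings; import paths assume a recent Keras or tf.keras):

`from keras.models import Sequential
from keras.layers import LSTM, Dense, Dropout
from keras.regularizers import l1
from keras.optimizers import Adagrad

def create_model():
    model = Sequential()
    # Wider LSTM layers with Dropout to break the constant-output behaviour
    model.add(LSTM(128, return_sequences=True, input_shape=(40000, 7)))
    model.add(Dropout(0.2))
    model.add(LSTM(128, return_sequences=False))
    model.add(Dropout(0.2))
    # Smaller Dense layers with L1 regularization
    model.add(Dense(16, activation='relu', kernel_regularizer=l1(1e-4)))
    model.add(Dense(2, activation='linear', kernel_regularizer=l1(1e-4)))
    # MAE loss and Adagrad with learning rate 1, as suggested above
    model.compile(optimizer=Adagrad(learning_rate=1.0), loss='mean_absolute_error')
    return model
`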
