ValueError of Input and Output values during LSTM training

I was trying to implement a basic LSTM network using some random data, and I got the following error when running the code:

'''

Traceback (most recent call last):
  File "C:/Users/dell/Desktop/test run for LSTM thingy.py", line 39, in <module>
    history = model.fit(x_train, y_train, epochs=1, batch_size=16, verbose=1)
  File "C:\Users\dell\AppData\Local\Programs\Python\Python310\lib\site-packages\keras\utils\traceback_utils.py", line 67, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "C:\Users\dell\AppData\Local\Temp\__autograph_generated_fileu1zdna1b.py", line 15, in tf__train_function
    retval_ = ag__.converted_call(ag__.ld(step_function), (ag__.ld(self), ag__.ld(iterator)), None, fscope)
ValueError: in user code:

    File "C:\Users\dell\AppData\Local\Programs\Python\Python310\lib\site-packages\keras\engine\training.py", line 1051, in train_function  *
        return step_function(self, iterator)
    File "C:\Users\dell\AppData\Local\Programs\Python\Python310\lib\site-packages\keras\engine\training.py", line 1040, in step_function  **
        outputs = model.distribute_strategy.run(run_step, args=(data,))
    File "C:\Users\dell\AppData\Local\Programs\Python\Python310\lib\site-packages\keras\engine\training.py", line 1030, in run_step  **
        outputs = model.train_step(data)
    File "C:\Users\dell\AppData\Local\Programs\Python\Python310\lib\site-packages\keras\engine\training.py", line 890, in train_step
        loss = self.compute_loss(x, y, y_pred, sample_weight)
    File "C:\Users\dell\AppData\Local\Programs\Python\Python310\lib\site-packages\keras\engine\training.py", line 948, in compute_loss
        return self.compiled_loss(
    File "C:\Users\dell\AppData\Local\Programs\Python\Python310\lib\site-packages\keras\engine\compile_utils.py", line 201, in __call__
        loss_value = loss_obj(y_t, y_p, sample_weight=sw)
    File "C:\Users\dell\AppData\Local\Programs\Python\Python310\lib\site-packages\keras\losses.py", line 139, in __call__
        losses = call_fn(y_true, y_pred)
    File "C:\Users\dell\AppData\Local\Programs\Python\Python310\lib\site-packages\keras\losses.py", line 243, in call  **
        return ag_fn(y_true, y_pred, **self._fn_kwargs)
    File "C:\Users\dell\AppData\Local\Programs\Python\Python310\lib\site-packages\keras\losses.py", line 1787, in categorical_crossentropy
        return backend.categorical_crossentropy(
    File "C:\Users\dell\AppData\Local\Programs\Python\Python310\lib\site-packages\keras\backend.py", line 5119, in categorical_crossentropy
        target.shape.assert_is_compatible_with(output.shape)

    ValueError: Shapes (None, 133, 1320) and (None, 133, 5) are incompatible
'''

This is what my code looks like at the moment:

import tensorflow as tf
x_train = tf.random.normal((28, 133, 1320))
y_train = tf.random.normal((28, 133, 1320))
model = tf.keras.Sequential()
model.add(tf.keras.layers.LSTM(5,activation='tanh',recurrent_activation='sigmoid', input_shape=(x_train.shape[1],x_train.shape[2]),return_sequences=True))
model.add(tf.keras.layers.Dense(5, activation= "softmax"))
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.0001), loss='categorical_crossentropy', metrics=['accuracy'])
model.summary()
history = model.fit(x_train, y_train, epochs=1, batch_size=16, verbose=1)

Could anyone help me debug this code? I need to use something similar in another side project where the X and Y data have similar shapes, and I have not been able to find a solution. I know it has something to do with the loss function, but that's all.

Shape of Y - (28, 133, 1320)
Shape of X - (28, 133, 1320)
Categories needed - 5

CodePudding user response:

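You are trying to classify into 5 classes, but y has the shape (28, 133, 1320). Its last dimension (1320) does not match the 5 outputs of the final Dense layer: with return_sequences=True the model outputs (None, 133, 5), which is exactly the mismatch the error reports. Also, categorical_crossentropy expects one-hot-encoded labels, not random floats. Here is a working example for orientation, using one label per sequence: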

import tensorflow as tf

x_train = tf.random.normal((28, 133, 1320))

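# One-hot encoded labels, one per sequence: shape (28, 5).
# num_classes=5 keeps the width fixed even if a class is missing from the random draw.
y_train = tf.keras.utils.to_categorical(tf.random.uniform((28,), maxval=5, dtype=tf.int32), num_classes=5)

model = tf.keras.Sequential()
# return_sequences=False keeps only the last time step, so the model outputs (None, 5).
model.add(tf.keras.layers.LSTM(5,activation='tanh',recurrent_activation='sigmoid', input_shape=(x_train.shape[1],x_train.shape[2]), return_sequences=False))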
model.add(tf.keras.layers.Dense(5, activation= "softmax"))
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.0001), loss='categorical_crossentropy', metrics=['accuracy'])
model.summary()
history = model.fit(x_train, y_train, epochs=1, batch_size=16, verbose=1)
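
If the goal is instead one label per time step (which return_sequences=True in the question suggests), a minimal sketch might look like the following. It assumes that each of the 133 time steps is assigned one of the 5 categories; the random integer labels, the name int_labels, and the use of tf.one_hot are illustrative choices and are not taken from the question.

```python
import tensorflow as tf

num_classes = 5  # "Categories needed" from the question

x_train = tf.random.normal((28, 133, 1320))

# Assumption: one integer class label per time step, shape (28, 133).
int_labels = tf.random.uniform((28, 133), maxval=num_classes, dtype=tf.int32)
# One-hot encode along a new last axis: shape (28, 133, 5).
y_train = tf.one_hot(int_labels, depth=num_classes)

model = tf.keras.Sequential([
    tf.keras.Input(shape=(133, 1320)),
    # return_sequences=True emits one vector per time step: (None, 133, 5).
    tf.keras.layers.LSTM(5, return_sequences=True),
    # Dense acts on the last axis, so softmax is applied per time step.
    tf.keras.layers.Dense(num_classes, activation="softmax"),
])
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.0001),
    loss="categorical_crossentropy",
    metrics=["accuracy"],
)
history = model.fit(x_train, y_train, epochs=1, batch_size=16, verbose=0)
```

Here the label and output shapes agree in every dimension, which is what the shape check at the end of the traceback requires. If the labels are kept as integers, loss='sparse_categorical_crossentropy' can be used on int_labels directly, without the one-hot step.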