I have a seq2seq model built like so:
from tensorflow.keras.layers import Input, Embedding, LSTM, Dense
from tensorflow.keras.models import Model

latent_dim = 256
epochs = 20
batch_size = 64

encoder_inputs = Input(shape=(None,))
x = Embedding(num_encoder_tokens, latent_dim, input_length=max_english_sentence_length)(encoder_inputs)
x, state_h, state_c = LSTM(latent_dim, return_state=True)(x)
encoder_states = [state_h, state_c]

decoder_inputs = Input(shape=(None,))
x = Embedding(num_decoder_tokens, latent_dim, input_length=max_toki_sentence_length)(decoder_inputs)
x = LSTM(latent_dim, return_sequences=True)(x, initial_state=encoder_states)
decoder_outputs = Dense(num_decoder_tokens, activation='softmax')(x)

model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=["accuracy"])
model.summary()
model.fit([encoder_input_data, decoder_input_data], decoder_target_data,
          batch_size=batch_size,
          epochs=epochs,
          validation_split=0.2)
encoder_input_data has shape (2000, 57, 7265) and contains 2000 sentences of at most 57 words, with one-hot encoded tokens.
decoder_input_data and decoder_target_data have shape (2000, 87, 987) and contain 2000 sentences of at most 87 words, with one-hot encoded tokens. decoder_target_data is offset by one timestep from decoder_input_data.
As far as I'm aware, the data is formatted correctly, but when running model.fit I get:
Input 0 of layer "lstm" is incompatible with the layer: expected ndim=3, found ndim=4. Full shape received: (64, 57, 7265, 256)
What am I doing wrong here?
CodePudding user response:
The issue comes from the Embedding layer. You cannot feed one-hot encoded data into the Keras Embedding layer. Indeed, as stated in the docs:
Input shape
2D tensor with shape: (batch_size, input_length).
Output shape
3D tensor with shape: (batch_size, input_length, output_dim).
Note that it takes a 2D array and outputs a 3D array. This is why your input went from a 3D array to a 4D array after the embedding. And that is no good for your LSTM, which only accepts 3D input.
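You can see this shape behavior directly with a tiny Embedding layer (the vocabulary size and dimensions here are made up just for illustration):

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Embedding

# Toy embedding: vocabulary of 10 tokens, 4-dimensional vectors.
emb = Embedding(input_dim=10, output_dim=4)

# Integer-encoded input, shape (batch_size, input_length) -> 3D output.
int_input = np.array([[1, 2, 3], [4, 5, 6]])   # shape (2, 3)
print(emb(int_input).shape)                    # (2, 3, 4)

# One-hot input carries an extra vocabulary axis, so the embedding
# tacks output_dim onto a 3D input and produces a 4D tensor --
# exactly the ndim=4 the LSTM complains about.
onehot_input = tf.one_hot(int_input, depth=10)  # shape (2, 3, 10)
print(emb(onehot_input).shape)                  # (2, 3, 10, 4)
```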
You should convert your one-hot encoding to plain integer indices. So instead of a [0, 0, 1, 0, 0] input, you just need the single value 2.
That way your inputs will be a 2D array (batch_size, input_length), which the embedding layer converts into a nice 3D array (batch_size, input_length, output_dim). And your LSTM layer will be happy ;)
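The conversion is a single argmax over the vocabulary axis. Here is a minimal sketch with a tiny dummy array standing in for your (2000, 57, 7265) encoder_input_data; the same call works on the real arrays:

```python
import numpy as np

# Dummy integer token ids: 2 sentences of 3 tokens, vocabulary of 5.
tokens = np.array([[1, 2, 3], [4, 0, 2]])       # shape (2, 3)

# The equivalent one-hot encoding, shape (sentences, timesteps, vocab).
onehot = np.eye(5, dtype=np.float32)[tokens]    # shape (2, 3, 5)

# argmax over the last (vocabulary) axis recovers the integer ids,
# giving the 2D (batch_size, input_length) shape Embedding expects.
int_encoded = np.argmax(onehot, axis=-1)
print(int_encoded.shape)                        # (2, 3)
```

Apply this to encoder_input_data and decoder_input_data before calling model.fit; decoder_target_data can stay one-hot since categorical_crossentropy expects one-hot targets.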