I followed a training course, and its final Jupyter notebook is this one:
https://colab.research.google.com/drive/1Lmh1b5Ge9NodxIrukCTJC3cpYQDn9VuM?usp=sharing
I understand all of the code and how the model was trained.
At the end, I am predicting emotions for tweets in the test dataset like this:
i = random.randint(0, len(test_labels) - 1)  # pick a random test example
print('Sentence:', test_tweets[i])
print('Emotion:', index_to_class[test_labels[i]])
p = model.predict(np.expand_dims(test_seq[i], axis=0))[0]  # add a batch dimension: (50,) -> (1, 50)
pred_class = index_to_class[np.argmax(p).astype('uint8')]
print('Predicted Emotion:', pred_class)
This works perfectly fine.
However, I want to test the model's predictions on arbitrary sentences of my own, like:
sentence = 'I love you more than ever'
print('Sentence:', sentence)
#print('Emotion:', index_to_class[test_labels[i]])
p = model.predict(np.expand_dims(sentence, axis=0))[0]
pred_class = index_to_class[np.argmax(p).astype('uint8')]
print('Predicted Emotion:', pred_class)
But I got this error:
Sentence: I love you more than ever
WARNING:tensorflow:Model was constructed with shape (None, 50) for input KerasTensor(type_spec=TensorSpec(shape=(None, 50), dtype=tf.float32, name='embedding_input'), name='embedding_input', description="created by layer 'embedding_input'"), but it was called on an input with incompatible shape (None,).
What am I missing here?
CodePudding user response:
Your model needs an integer sequence, not a raw string. Try converting the sentence to its corresponding integer sequence first:
sentence = 'I love you more than ever'
print('Sentence:', sentence)
# Convert the raw string into a padded integer sequence of shape (1, 50)
sentence_seq = get_sequences(tokenizer, np.expand_dims(sentence, axis=0))
p = model.predict(sentence_seq)[0]
pred_class = index_to_class[np.argmax(p).astype('uint8')]
print('Predicted Emotion:', pred_class)
Output:
Sentence: I love you more than ever
Predicted Emotion: joy
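If get_sequences isn't in scope where you run this, it's the notebook's small helper around the Keras tokenizer. A minimal sketch of what it presumably does, assuming maxlen=50 and post-padding/truncating (the exact settings in the linked notebook may differ, so match whatever was used during training):

from tensorflow.keras.preprocessing.sequence import pad_sequences

def get_sequences(tokenizer, tweets, maxlen=50):
    # Map each word to the integer index learned when the tokenizer was fitted
    sequences = tokenizer.texts_to_sequences(tweets)
    # Pad/truncate every sequence to exactly maxlen tokens -> shape (batch, 50)
    return pad_sequences(sequences, maxlen=maxlen, padding='post', truncating='post')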
CodePudding user response:
Just to add a little:

Shape
np.expand_dims(sentence, axis=0).shape is (1,), not (None, 50) - it needs one more dimension for the batch size.

Sequences
The input to your model is a padded sequence of numbers, transformed by a tokenizer - it should be 50 in length.
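
Putting the two together, you can build the expected (1, 50) input directly from the raw sentence. A minimal sketch, assuming the tokenizer fitted during training and maxlen=50 (the padding/truncating mode here is an assumption, so match it to the notebook):

from tensorflow.keras.preprocessing.sequence import pad_sequences

sentence = 'I love you more than ever'
# texts_to_sequences expects a list of strings and returns a list of integer lists
seq = tokenizer.texts_to_sequences([sentence])
# Pad/truncate to the 50 tokens the Embedding layer was built for -> shape (1, 50)
padded = pad_sequences(seq, maxlen=50, padding='post', truncating='post')
p = model.predict(padded)[0]
print('Predicted Emotion:', index_to_class[np.argmax(p)])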