I'm playing around a bit with TensorFlow 2.7.0 and its new TextVectorization layer. However, something does not work quite right in this simple example:
import tensorflow as tf
import numpy as np
X = np.array(['this is a test', 'a nice test', 'best test this is'])
vectorize_layer = tf.keras.layers.TextVectorization()
vectorize_layer.adapt(X)
emb_layer = tf.keras.layers.Embedding(input_dim=vectorize_layer.vocabulary_size() + 1, output_dim=2, input_length=4)
flatten_layer = tf.keras.layers.Flatten()
dense_layer = tf.keras.layers.Dense(1)
model = tf.keras.models.Sequential()
model.add(vectorize_layer)
model.add(emb_layer)
model.add(flatten_layer)
#model.add(dense_layer)
model(X)
This works so far: I turn words into ints, embed them, and flatten them. But if I want to add a Dense layer after flattening (i.e. uncomment the line above), things break, and I get the error message from the question title. I even used the input_length parameter of the Embedding layer, because the documentation says I should specify it when chaining Embedding -> Flatten -> Dense. But it just does not work.
Do you know how I can get it to work using Flatten, and not something like GlobalAveragePooling1D?
Thanks a lot!
CodePudding user response:
You need to define a max length for the sequences.
vectorize_layer = tf.keras.layers.TextVectorization(output_mode = 'int',
output_sequence_length=10)
If you check model.summary(), the output shape of the TextVectorization layer will be (None, None). The first None indicates that the model can accept any batch size, and the second indicates that sentences passed to TextVectorization are neither truncated nor padded, so the output sequences can have variable length.
Example:
import tensorflow as tf
import numpy as np
X = np.array(['this is a test', 'a nice test', 'best test this is'])
vectorize_layer = tf.keras.layers.TextVectorization(output_mode = 'int')
vectorize_layer.adapt(X)
model = tf.keras.models.Sequential()
model.add(vectorize_layer)
model(np.array(['this is a test']))
>> <tf.Tensor: shape=(1, 4), dtype=int64, numpy=array([[3, 4, 5, 2]])>
model(np.array(['this is a longer test sentence']))
>> <tf.Tensor: shape=(1, 6), dtype=int64, numpy=array([[3, 4, 5, 1, 2, 1]])>
Redefining it with a fixed output length (and re-adapting and rebuilding the model, since the model above still holds the old layer):
vectorize_layer = tf.keras.layers.TextVectorization(output_mode = 'int',
                                                    output_sequence_length = 5)
vectorize_layer.adapt(X)
model = tf.keras.models.Sequential()
model.add(vectorize_layer)
model(np.array(['this is a longer test sentence']))
>> <tf.Tensor: shape=(1, 5), dtype=int64, numpy=array([[3, 4, 5, 1, 2]])>
model(np.array(['this is']))
>> <tf.Tensor: shape=(1, 5), dtype=int64, numpy=array([[3, 4, 0, 0, 0]])>
Setting output_sequence_length to a number guarantees that every output sequence has exactly that length: longer sentences are truncated and shorter ones are zero-padded. With a static sequence length, Flatten produces a known output size, so the following Dense layer can build its weights.
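Putting it together, here is a minimal end-to-end sketch of the full pipeline from the question with the fix applied. The layer sizes (output_dim=2, Dense(1)) are just the illustrative values from the question, and output_sequence_length=4 is an arbitrary choice:

```python
import numpy as np
import tensorflow as tf

X = np.array(['this is a test', 'a nice test', 'best test this is'])

# Fixing the sequence length makes the Flatten output shape static,
# which is what the Dense layer needs to build its weights.
vectorize_layer = tf.keras.layers.TextVectorization(
    output_mode='int', output_sequence_length=4)
vectorize_layer.adapt(X)

model = tf.keras.models.Sequential([
    vectorize_layer,
    tf.keras.layers.Embedding(
        input_dim=vectorize_layer.vocabulary_size() + 1,
        output_dim=2),
    tf.keras.layers.Flatten(),  # (batch, 4, 2) -> (batch, 8)
    tf.keras.layers.Dense(1),
])

out = model(X)
print(out.shape)  # (3, 1): one scalar output per input sentence
```

The same model fails without output_sequence_length, because Flatten cannot infer a fixed size from a (None, None, 2) input.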