Here's my model:
import tensorflow as tf
import tensorflow_hub as hub
import tensorflow_text  # registers the ops that TF-Hub BERT preprocessing models rely on

def build_classifier_model():
    text_input = tf.keras.layers.Input(shape=(), dtype=tf.string, name='features')
    preprocessing_layer = hub.KerasLayer(tfhub_handle_preprocess, name='preprocessing')
    encoder_inputs = preprocessing_layer(text_input)
    encoder = hub.KerasLayer(tfhub_handle_encoder, trainable=True, name='BERT_encoder')
    outputs = encoder(encoder_inputs)
    net = outputs['pooled_output']
    net = tf.keras.layers.Dropout(0.1)(net)
    net = tf.keras.layers.Dense(3, activation="softmax", name='classifier')(net)
    return tf.keras.Model(text_input, net)
In the preprocessing layer, I'm using a BERT preprocessor from TF-Hub.
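For reference, a minimal sketch of what such a preprocessing layer returns (assuming a standard handle like https://tfhub.dev/tensorflow/bert_en_uncased_preprocess/3; my actual tfhub_handle_preprocess may differ):

preprocess = hub.KerasLayer("https://tfhub.dev/tensorflow/bert_en_uncased_preprocess/3")
features = preprocess(tf.constant(["a sample sentence"]))  # note the leading batch dimension
print(features.keys())                   # contains 'input_word_ids', 'input_mask', 'input_type_ids'
print(features['input_word_ids'].shape)  # (1, 128) with the default sequence length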
I've already divided the data into corpus_train, corpus_test, labels_train, and labels_test. The corpora are pandas Series containing the texts that will be used as features, and the labels are NumPy arrays.
corpus = df_speech_EN_merged["contents"]
corpus.shape  # (1768,)

labels = np.asarray(df_speech_EN_merged["Classes"].astype("int"))
labels.shape  # (1768,)
To create the train and test datasets, I've used the following:

train_dataset = tf.data.Dataset.from_tensor_slices(
    {
        "features": tf.cast(corpus_train.values, tf.string),
        "labels": tf.cast(labels_train, tf.int32),  # labels is already an array, no need for .values
    }
)

test_dataset = tf.data.Dataset.from_tensor_slices(
    {
        "features": tf.cast(corpus_test.values, tf.string),
        "labels": tf.cast(labels_test, tf.int32),  # labels is already an array, no need for .values
    }
)
The model builds and compiles without any error message, but when I fit it with:

classifier_model.fit(x=train_dataset,
                     validation_data=test_dataset,
                     epochs=2)
I get the following error:
ValueError: Could not find matching function to call loaded from the SavedModel. Got:
Positional arguments (3 total):
* Tensor("inputs:0", shape=(), dtype=string)
* False
* None
Keyword arguments: {}
Expected these arguments to match one of the following 4 option(s):
Option 1:
Positional arguments (3 total):
* TensorSpec(shape=(None,), dtype=tf.string, name='sentences')
* False
* None
Keyword arguments: {}
Option 2:
Positional arguments (3 total):
* TensorSpec(shape=(None,), dtype=tf.string, name='sentences')
* True
* None
Keyword arguments: {}
Option 3:
Positional arguments (3 total):
* TensorSpec(shape=(None,), dtype=tf.string, name='inputs')
* False
* None
Keyword arguments: {}
Option 4:
Positional arguments (3 total):
* TensorSpec(shape=(None,), dtype=tf.string, name='inputs')
* True
* None
Keyword arguments: {}
I think this error occurs because I'm either building train_dataset/test_dataset wrong or because the text_input layer is expecting the wrong type of data. Any help would be appreciated.
CodePudding user response:
When using tf.data.Dataset.from_tensor_slices, try providing a batch_size, since the BERT preprocessing layer expects a very specific shape: note that the error lists expected TensorSpecs of shape (None,), while your dataset yields unbatched scalars of shape (). Here is a simplified, working example based on the BERT models used in this tutorial and your specific details:
def build_classifier_model():
    text_input = tf.keras.layers.Input(shape=(), dtype=tf.string, name='features')
    preprocessing_layer = hub.KerasLayer(tfhub_handle_preprocess, name='preprocessing')
    encoder_inputs = preprocessing_layer(text_input)
    encoder = hub.KerasLayer(tfhub_handle_encoder, trainable=True, name='BERT_encoder')
    outputs = encoder(encoder_inputs)
    net = outputs['pooled_output']
    net = tf.keras.layers.Dropout(0.1)(net)
    net = tf.keras.layers.Dense(3, activation="softmax", name='classifier')(net)
    return tf.keras.Model(text_input, net)
sentences = tf.constant([
    "Improve the physical fitness of your goldfish by getting him a bicycle",
    "You are unsure whether or not to trust him but very thankful that you wore a turtle neck",
    "Not all people who wander are lost",
    "There is a reason that roses have thorns",
    "Charles ate the french fries knowing they would be his last meal",
    "He hated that he loved what she hated about hate",
])
labels = tf.random.uniform((6,), minval=0, maxval=2, dtype=tf.dtypes.int32)

classifier_model = build_classifier_model()
classifier_model.compile(optimizer=tf.keras.optimizers.Adam(),
                         loss=tf.keras.losses.SparseCategoricalCrossentropy(),
                         metrics=[tf.keras.metrics.SparseCategoricalAccuracy()])
BATCH_SIZE = 1

train_dataset = (tf.data.Dataset.from_tensor_slices((sentences, labels))
                 .shuffle(sentences.shape[0])
                 .batch(BATCH_SIZE))

classifier_model.fit(x=train_dataset, epochs=2)
Epoch 1/2
6/6 [==============================] - 7s 446ms/step - loss: 2.4348 - sparse_categorical_accuracy: 0.5000
Epoch 2/2
6/6 [==============================] - 3s 447ms/step - loss: 1.3977 - sparse_categorical_accuracy: 0.5000
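To adapt this to your own train/test split, a sketch (variable names taken from your question; BATCH_SIZE is an arbitrary choice): besides batching, yield (text, label) tuples rather than a single dict containing both, since Model.fit would treat a lone dict element entirely as inputs and find no targets.

train_dataset = (
    tf.data.Dataset.from_tensor_slices(
        (tf.cast(corpus_train.values, tf.string),
         tf.cast(labels_train, tf.int32)))
    .shuffle(len(corpus_train))
    .batch(BATCH_SIZE))

test_dataset = (
    tf.data.Dataset.from_tensor_slices(
        (tf.cast(corpus_test.values, tf.string),
         tf.cast(labels_test, tf.int32)))
    .batch(BATCH_SIZE))

classifier_model.fit(x=train_dataset, validation_data=test_dataset, epochs=2)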