The evaluation section pops out an error with TensorFlow on TPU and T5 (expected shape=(None, 50), found shape=(None, 51))


I'm using a notebook from this GitHub repository: https://github.com/flogothetis/Abstractive-Summarization-T5-Keras

notebook link: https://github.com/flogothetis/Abstractive-Summarization-T5-Keras/blob/main/AbstractiveSummarizationT5.ipynb

Thanks so much to the TensorFlow implementation, it works fine and trains much faster than PyTorch, but in the last section, after training, the call

getSummary("With your permission we and our partners may use precise geolocation data and identification through device scanning. You may click to consent to our and our partners’ processing as described above. Alternatively you may access more detailed information and change your preferences before consenting or to refuse consenting.")

pops out an error and cannot generate the result: ValueError: Input 2 is incompatible with layer model_2: expected shape=(None, 50), found shape=(None, 51)

So is there any solution for this? Much obliged.

The full error from the part that triggers the bug is dropped below:

ValueError: in user code:

/opt/conda/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py:1478 predict_function  *
    return step_function(self, iterator)
/opt/conda/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py:1468 step_function  **
    outputs = model.distribute_strategy.run(run_step, args=(data,))
/opt/conda/lib/python3.7/site-packages/tensorflow/python/distribute/tpu_strategy.py:540 run
    return self.extended.tpu_run(fn, args, kwargs, options)
/opt/conda/lib/python3.7/site-packages/tensorflow/python/distribute/tpu_strategy.py:1296 tpu_run
    return func(args, kwargs)
/opt/conda/lib/python3.7/site-packages/tensorflow/python/distribute/tpu_strategy.py:1364 tpu_function
    xla_options=tpu.XLAOptions(use_spmd_for_xla_partitioning=False))
/opt/conda/lib/python3.7/site-packages/tensorflow/python/tpu/tpu.py:968 replicate
    xla_options=xla_options)[1]
/opt/conda/lib/python3.7/site-packages/tensorflow/python/tpu/tpu.py:1439 split_compile_and_replicate
    outputs = computation(*computation_inputs)
/opt/conda/lib/python3.7/site-packages/tensorflow/python/distribute/tpu_strategy.py:1325 replicated_fn
    result[0] = fn(*replica_args, **replica_kwargs)
/opt/conda/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py:1461 run_step  **
    outputs = model.predict_step(data)
/opt/conda/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py:1434 predict_step
    return self(x, training=False)
/opt/conda/lib/python3.7/site-packages/tensorflow/python/keras/engine/base_layer.py:998 __call__
    input_spec.assert_input_compatibility(self.input_spec, inputs, self.name)
/opt/conda/lib/python3.7/site-packages/tensorflow/python/keras/engine/input_spec.py:274 assert_input_compatibility
    ', found shape=' + display_shape(x.shape))

ValueError: Input 2 is incompatible with layer model_2: expected shape=(None, 50), found shape=(None, 51)



I tried multiple ways but it did not work, so any ideas?

CodePudding user response:

It looks like during training the code is dropping the last element of train_data['decoder_inputs_ids'] and train_data['decoder_attention_mask'], while during prediction it is not:

# Training: the last token of the decoder-side inputs is sliced off,
# so the model is built with decoder length max_len_sum - 1 (= 50).
model.fit(x=[train_data['input_ids'],
             train_data['attention_mask'],
             train_data['decoder_inputs_ids'][:,:-1],
             train_data['decoder_attention_mask'][:,:-1]],

# Prediction: the decoder-side inputs are passed at full length (= 51).
pred = model.predict([input_ids, attention_mask, decoder_inputs_ids, decoder_attention_mask])

That's why during inference the decoder inputs come in with shape (None, 51) while the model expects (None, 50).
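One option (a minimal sketch, assuming decoder_inputs_ids and decoder_attention_mask are the already-padded (1, max_len_sum) arrays built inside getSummary) is to mirror the training-time slicing at prediction time:

# Sketch: drop the last column at prediction time, exactly as model.fit did,
# so the decoder-side inputs become (None, max_len_sum - 1) = (None, 50).
pred = model.predict([input_ids,
                      attention_mask,
                      decoder_inputs_ids[:, :-1],
                      decoder_attention_mask[:, :-1]])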

Equivalently, you can pad decoder_inputs_ids and decoder_attention_mask to max_len_sum-1 (instead of max_len_sum) during prediction:

# Pad sequences to max_len_sum-1 (instead of the original max_len_sum).
decoder_inputs_ids = tf.keras.preprocessing.sequence.pad_sequences(
    [decoder_input_ids[:-1]], maxlen=max_len_sum-1, padding='post', truncating='post')
decoder_attention_mask = tf.keras.preprocessing.sequence.pad_sequences(
    [decoder_attention_mask[:-1]], maxlen=max_len_sum-1, padding='post', truncating='post')
