Error while saving the best model; non-serializable model while checkpointing

I'm training a model using Keras and TensorFlow for image classification with two classes. I use the callbacks shown below to save the best model (lowest validation loss) as training proceeds:

callbacks = [ModelCheckpoint(filepath="best_model.h5", save_best_only=True,
 monitor="val_loss"), EarlyStopping(monitor='val_loss', patience=patience)]

When I run the code below to train the model:

model.fit(train_dataset, validation_data=val_dataset, epochs=epochs, callbacks=callbacks)

I receive this error message:

Epoch 1/100
2022-09-13 14:51:51.230897: I tensorflow/stream_executor/cuda/cuda_dnn.cc:384] Loaded cuDNN version 8204
2022-09-13 14:51:52.250140: I tensorflow/stream_executor/cuda/cuda_blas.cc:1614] TensorFloat-32 will be used for the matrix multiplication. This will only be logged once.
23/23 [==============================] - ETA: 0s - loss: 0.6429 - accuracy: 0.7187Traceback (most recent call last):
  File "/home/hassan/workspace/project/model/model.py", line 110, in <module>
    history = model.fit(train_dataset, validation_data=val_dataset, epochs=epochs, callbacks=callbacks)
  File "/home/hassan/bin/anaconda3/envs/kerasenv/lib/python3.9/site-packages/keras/utils/traceback_utils.py", line 70, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/home/hassan/bin/anaconda3/envs/kerasenv/lib/python3.9/json/__init__.py", line 234, in dumps
    return cls(
  File "/home/hassan/bin/anaconda3/envs/kerasenv/lib/python3.9/json/encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/home/hassan/bin/anaconda3/envs/kerasenv/lib/python3.9/json/encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
TypeError: Unable to serialize [2.0896919 2.1128857 2.1081853] to JSON. Unrecognized type <class 'tensorflow.python.framework.ops.EagerTensor'>.

Moreover, when I comment out the ModelCheckpoint callback (which is responsible for saving the best model across epochs), the error goes away. So the following code

callbacks = [# ModelCheckpoint(filepath="best_model.h5", save_best_only=True,
             #                 monitor="val_loss"),
             EarlyStopping(monitor='val_loss', patience=patience)]

runs without any error.

Any suggestions on how to fix this problem?

BTW, I'm running TensorFlow 2.10 with Python 3.9 on Ubuntu 22.04.

CodePudding user response：

I downgraded Keras from version 2.10 to 2.9, and that seemed to solve the problem.
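
The traceback in the question ends inside json.dumps, which suggests that some value in the model's configuration is a tensor rather than a plain Python value at the moment the .h5 checkpoint is written. The sketch below reproduces that kind of failure with the standard library only. FakeTensor, to_jsonable and the config dict are hypothetical stand-ins made up for illustration, not Keras internals; only the three numbers are taken from the error message.

```python
import json


class FakeTensor:
    """Hypothetical stand-in for a tensor object that json cannot encode."""

    def __init__(self, values):
        self.values = list(values)

    def tolist(self):
        # Return the contents as a plain list of Python floats.
        return list(self.values)


def to_jsonable(obj):
    """Fallback hook for json.dumps: turn tensor-like objects into lists."""
    if hasattr(obj, "tolist"):
        return obj.tolist()
    raise TypeError(f"Unable to serialize {obj!r} to JSON")


# Hypothetical layer config holding a tensor-like object instead of a list.
config = {
    "name": "example_layer",
    "scale": FakeTensor([2.0896919, 2.1128857, 2.1081853]),
}

try:
    json.dumps(config)
    plain_dumps_failed = False
except TypeError:
    # The default encoder rejects types it does not recognize.
    plain_dumps_failed = True

# With a `default` hook, the value is converted before encoding.
encoded = json.dumps(config, default=to_jsonable)
print(plain_dumps_failed, encoded)
```

This only illustrates why a tensor inside a config breaks JSON encoding; it does not patch Keras. If downgrading is not an option, one workaround that may be worth trying is passing save_weights_only=True to ModelCheckpoint, so that only the weights are written and the model config is not serialized. That assumes the weights alone are enough for your use case, and it is untested against this particular setup.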