I'm training a two-class image classifier with Keras and TensorFlow. I use the callbacks below to save the best model seen so far during training:
callbacks = [ModelCheckpoint(filepath="best_model.h5", save_best_only=True, monitor="val_loss"),
             EarlyStopping(monitor="val_loss", patience=patience)]
When I run the following call to train the model:
model.fit(train_dataset, validation_data=val_dataset, epochs=epochs, callbacks=callbacks)
I get this error at the end of the first epoch:
Epoch 1/100
2022-09-13 14:51:51.230897: I tensorflow/stream_executor/cuda/cuda_dnn.cc:384] Loaded cuDNN version 8204
2022-09-13 14:51:52.250140: I tensorflow/stream_executor/cuda/cuda_blas.cc:1614] TensorFloat-32 will be used for the matrix multiplication. This will only be logged once.
23/23 [==============================] - ETA: 0s - loss: 0.6429 - accuracy: 0.7187Traceback (most recent call last):
File "/home/hassan/workspace/project/model/model.py", line 110, in <module>
history = model.fit(train_dataset, validation_data=val_dataset, epochs=epochs, callbacks=callbacks)
File "/home/hassan/bin/anaconda3/envs/kerasenv/lib/python3.9/site-packages/keras/utils/traceback_utils.py", line 70, in error_handler
raise e.with_traceback(filtered_tb) from None
File "/home/hassan/bin/anaconda3/envs/kerasenv/lib/python3.9/json/__init__.py", line 234, in dumps
return cls(
File "/home/hassan/bin/anaconda3/envs/kerasenv/lib/python3.9/json/encoder.py", line 199, in encode
chunks = self.iterencode(o, _one_shot=True)
File "/home/hassan/bin/anaconda3/envs/kerasenv/lib/python3.9/json/encoder.py", line 257, in iterencode
return _iterencode(o, 0)
TypeError: Unable to serialize [2.0896919 2.1128857 2.1081853] to JSON. Unrecognized type <class 'tensorflow.python.framework.ops.EagerTensor'>.
Moreover, when I comment out the ModelCheckpoint callback (which is responsible for saving the best model across epochs), the error goes away. So my code with
callbacks = [# ModelCheckpoint(filepath="best_model.h5", save_best_only=True, monitor="val_loss"),
             EarlyStopping(monitor='val_loss', patience=patience)]
runs without any error.
Any suggestions on how to fix this problem?
BTW, I'm running TensorFlow 2.10 with Python 3.9 on Ubuntu 22.04.
CodePudding user response:
I downgraded Keras from 2.10 to 2.9, and that seems to have solved the problem.
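If downgrading is not an option, another thing that may be worth trying is checkpointing only the weights. The traceback shows the failure happens while something is being encoded as JSON, which is consistent with the full-model .h5 save writing the model configuration; a weights-only checkpoint does not write that configuration. This is only a sketch under that assumption, not a confirmed fix for this model. The helper names `checkpoint_options` and `build_callbacks` and the file name `best_weights.h5` are made up for illustration.

```python
def checkpoint_options(path="best_weights.h5"):
    """Keyword arguments for ModelCheckpoint that save weights only.

    Hypothetical helper: the name and default path are illustrative.
    """
    return {
        "filepath": path,
        "save_best_only": True,
        # Assumption: skipping the full-model save avoids serializing
        # the model config (and any tensors inside it) to JSON.
        "save_weights_only": True,
        "monitor": "val_loss",
    }


def build_callbacks(patience):
    """Build the callback list from the question with a weights-only checkpoint."""
    # Imported here so checkpoint_options() can be used without TensorFlow.
    from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint

    return [
        ModelCheckpoint(**checkpoint_options()),
        EarlyStopping(monitor="val_loss", patience=patience),
    ]
```

With this, `model.fit(..., callbacks=build_callbacks(patience))` would write only the best weights. To restore them later, you would need to rebuild the same architecture in code and call `model.load_weights("best_weights.h5")`, since the file no longer contains the model definition.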