I want to use MLflow to track the development of a TensorFlow model. How do I log the loss at each epoch? I have written the following code:
import mlflow

mlflow.set_tracking_uri(tracking_uri)
mlflow.set_experiment("/deep_learning")
with mlflow.start_run():
    # Hyperparameters of this run
    mlflow.log_param("batch_size", batch_size)
    mlflow.log_param("learning_rate", learning_rate)
    mlflow.log_param("epochs", epochs)
    mlflow.log_param("Optimizer", opt)
    # Each metric is logged once, as a single scalar value
    mlflow.log_metric("train_loss", train_loss)
    mlflow.log_metric("val_loss", val_loss)
    mlflow.log_metric("test_loss", test_loss)
    mlflow.log_metric("test_mse", test_mse)
    # Upload the saved model directory as run artifacts
    mlflow.log_artifacts("./model")
If I change train_loss and val_loss to
train_loss = history.history['loss']
val_loss = history.history['val_loss']
I get the following error:
mlflow.exceptions.MlflowException: Got invalid value [12.041399002075195] for metric 'train_loss' (timestamp=1649783654667). Please specify value as a valid double (64-bit floating point)
How do I save the loss and the val_loss for every epoch, so that I can visualise a learning curve in MLflow?
CodePudding user response:
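The error message suggests the cause: history.history['loss'] is a list with one value per epoch, while mlflow.log_metric expects a single number. One way around this, as a minimal sketch, is to loop over the list and log each value separately, passing the epoch index as the step argument so MLflow can plot the metric against it. The helper name log_history and its rename parameter are hypothetical, introduced here only for illustration; the logging function is passed in as an argument so the sketch runs without a tracking server.

```python
from typing import Callable, Dict, List, Optional


def log_history(
    history: Dict[str, List[float]],
    log_metric: Callable[..., None],
    rename: Optional[Dict[str, str]] = None,
) -> int:
    """Log every per-epoch value in a Keras-style history dict.

    `history` maps a metric name to a list with one value per epoch,
    e.g. history.history from model.fit(). `log_metric` is any callable
    accepting (key, value, step=...), such as mlflow.log_metric.
    `rename` optionally maps Keras names to the names to log under.
    Returns the number of values logged.
    """
    rename = rename or {}
    logged = 0
    for name, values in history.items():
        key = rename.get(name, name)
        for epoch, value in enumerate(values):
            # Log one plain float per call; the epoch index becomes the step.
            log_metric(key, float(value), step=epoch)
            logged += 1
    return logged
```

Assuming history is the object returned by model.fit(), this would be called inside the with mlflow.start_run(): block as log_history(history.history, mlflow.log_metric, rename={"loss": "train_loss"}). The train_loss and val_loss metrics should then each appear as a curve over steps in the MLflow UI.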