How do I track the loss at each epoch using mlflow/tensorflow?

Time:04-15

I want to use mlflow to track the development of a TensorFlow model. How do I log the loss at each epoch? I have written the following code:

mlflow.set_tracking_uri(tracking_uri)

mlflow.set_experiment("/deep_learning")
with mlflow.start_run():
    mlflow.log_param("batch_size", batch_size)
    mlflow.log_param("learning_rate", learning_rate)
    mlflow.log_param("epochs", epochs)
    mlflow.log_param("Optimizer", opt)
    mlflow.log_metric("train_loss", train_loss)
    mlflow.log_metric("val_loss", val_loss)
    mlflow.log_metric("test_loss", test_loss)
    mlflow.log_metric("test_mse", test_mse)
    mlflow.log_artifacts("./model")

If I change the train_loss and val_loss to

train_loss = history.history['loss']
val_loss = history.history['val_loss']

I get the following error:

mlflow.exceptions.MlflowException: Got invalid value [12.041399002075195] for metric 'train_loss' (timestamp=1649783654667). Please specify value as a valid double (64-bit floating point)

How do I save the loss and the val_loss at every epoch, so that I can visualise a learning curve within mlflow?

CodePudding user response:

As you can read: [image] [image]
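The error in the question comes from passing a whole list (`history.history['loss']`) to `mlflow.log_metric`, which expects a single float. One possible way around it is to log one value per epoch and pass the epoch index as `step`, so that mlflow can draw the curve. The sketch below is only an illustration: `log_history_per_epoch` and its `rename` argument are hypothetical helper names, not part of mlflow, and it assumes `history` is the object returned by `model.fit`.

```python
from typing import Callable, Dict, List, Optional


def log_history_per_epoch(
    history: Dict[str, List[float]],
    log_metric: Callable[..., None],
    rename: Optional[Dict[str, str]] = None,
) -> None:
    """Log each entry of a Keras-style history dict as one metric point per epoch.

    `log_metric` is expected to accept (key, value, step=...), which is how
    mlflow.log_metric is called. `rename` (hypothetical) maps Keras names
    such as "loss" to the metric names you want to see in the UI.
    """
    rename = rename or {}
    for name, values in history.items():
        key = rename.get(name, name)
        for epoch, value in enumerate(values):
            # Metrics must be scalars, so pass one float at a time;
            # the epoch index becomes the x-axis of the learning curve.
            log_metric(key, float(value), step=epoch)


# Assumed usage inside the run from the question:
#
#   import mlflow
#   with mlflow.start_run():
#       log_history_per_epoch(
#           history.history,
#           mlflow.log_metric,
#           rename={"loss": "train_loss"},
#       )
```

Single end-of-training values such as `test_loss` and `test_mse` can still be logged with one plain `mlflow.log_metric` call each. Calling `mlflow.tensorflow.autolog()` before `model.fit` may also be worth trying, since it is meant to record per-epoch metrics automatically.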
