Saving a torch model in Azure ML

Time:03-09

I am following the Azure ML tutorial at https://docs.microsoft.com/en-us/azure/machine-learning/tutorial-1st-experiment-bring-data to train a deep learning model, and I want to save the model to a specified path. This is my original script: run-pytorch-data.py

The train.py script is as follows: train.py

The model is not saved in the specified location. When I print the current working directory, I get:

  print (os.getcwd())
   /mnt/batch/tasks/shared/LS_root/jobs/workspacecvazue/azureml/day1-experiment-data_1646679122_86b0ac64/wd/azureml/day1-experiment-data_1646679122_86b0ac64

How can I save the model in the desired location?

CodePudding user response:

You can try any of the following ways to save the torch model in Azure ML, depending on how the model will be used.

As suggested by user Jadiel de Armas on Stack Overflow:

1. Save the model to use it yourself for inference:

torch.save(model.state_dict(), filepath)   
#Later to restore:  
model.load_state_dict(torch.load(filepath))
model.eval()
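
Because training on Azure ML usually runs on a remote (often GPU) compute target while inference may happen elsewhere, it can help to pass map_location when restoring. The following is a minimal sketch, not the asker's code: the nn.Linear model and the temporary file path are hypothetical stand-ins for the real network and save location.

```python
import os
import tempfile

import torch
import torch.nn as nn

# Hypothetical stand-in for the real network defined in train.py.
model = nn.Linear(4, 2)

# Hypothetical save location; any writable path works.
filepath = os.path.join(tempfile.mkdtemp(), "model_state.pt")
torch.save(model.state_dict(), filepath)

# Later: rebuild the same architecture, then load the weights onto the CPU
# so the file can be read on a machine without a GPU.
restored = nn.Linear(4, 2)
restored.load_state_dict(torch.load(filepath, map_location="cpu"))
restored.eval()  # switch to inference mode
```

The restoring side has to construct the same architecture first, since a state_dict holds only parameters and buffers.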

2. Save model to resume training later:

state = {
    'epoch': epoch,
    'state_dict': model.state_dict(),
    'optimizer': optimizer.state_dict(),
    ...
}
torch.save(state, filepath)
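
The snippet above only shows the saving half. A possible way to resume from such a checkpoint is sketched below, under the assumption that the checkpoint holds exactly the three keys shown; the small nn.Linear model, the SGD optimizer, the epoch value, and the temporary path are all hypothetical placeholders.

```python
import os
import tempfile

import torch
import torch.nn as nn

# Hypothetical model, optimizer and epoch counter standing in for the real ones.
model = nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
epoch = 5

filepath = os.path.join(tempfile.mkdtemp(), "checkpoint.pt")
state = {
    "epoch": epoch,
    "state_dict": model.state_dict(),
    "optimizer": optimizer.state_dict(),
}
torch.save(state, filepath)

# Later: recreate the model and optimizer, then restore their states.
checkpoint = torch.load(filepath, map_location="cpu")
resumed_model = nn.Linear(4, 2)
resumed_optimizer = torch.optim.SGD(resumed_model.parameters(), lr=0.01)
resumed_model.load_state_dict(checkpoint["state_dict"])
resumed_optimizer.load_state_dict(checkpoint["optimizer"])
start_epoch = checkpoint["epoch"] + 1  # continue with the next epoch
resumed_model.train()  # back to training mode
```

Restoring the optimizer state matters for optimizers that keep per-parameter statistics (momentum buffers, Adam moments), which would otherwise restart from zero.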

3. Save the whole model for someone else to use (note that loading it still requires the model's class definition to be importable):

torch.save(model, filepath)
#Then later:
model = torch.load(filepath)
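
torch.save(model, filepath) pickles the model object, so whoever loads it still needs the original class to be importable, and recent PyTorch releases default torch.load to weights_only=True, which rejects such files unless weights_only=False is passed. If the goal really is to hand the model to someone without the source code, TorchScript is one alternative. This is a different technique from the one quoted above, sketched here with a hypothetical nn.Sequential model and a temporary path.

```python
import os
import tempfile

import torch
import torch.nn as nn

# Hypothetical model; TorchScript needs the code to be scriptable.
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
model.eval()

filepath = os.path.join(tempfile.mkdtemp(), "model_scripted.pt")

# Compile the model to TorchScript and save architecture plus weights together.
scripted = torch.jit.script(model)
scripted.save(filepath)

# Later, possibly in a project without the original Python class:
loaded = torch.jit.load(filepath, map_location="cpu")
loaded.eval()

# Sanity check: the original and the loaded model agree on the same input.
example = torch.randn(1, 4)
with torch.no_grad():
    same = torch.allclose(model(example), loaded(example))
```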

References: python - Best way to save a trained model in PyTorch? - Stack Overflow; Saving and Loading Models - PyTorch Tutorials 1.10.1 cu102 documentation; Save and Load the Model - PyTorch Tutorials 1.10.1 cu102 documentation

CodePudding user response:

When you train a model in Azure ML, training runs on a remote compute target, so the working directory you printed is on that machine. Once training is complete and the model is saved there, you must upload the model to your own storage.

First step: save the model on the compute target

torch.save(state, path_save + '/model.hdf5')

Second step: upload the model to your account's blob storage

datastore.upload(src_dir=path_save,
                 target_path=desired_path, overwrite=True)
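
As a complement to the explicit upload, Azure ML (at least with the v1 Python SDK, which the linked tutorial appears to use) automatically uploads anything written to a folder named outputs under the working directory to the run's history, where it can be downloaded afterwards. A minimal sketch of preparing such a path in train.py follows; prepare_output_path is a hypothetical helper, and the torch.save call is left as a comment because the model comes from the training script.

```python
import os


def prepare_output_path(filename, output_dir="outputs"):
    """Create output_dir relative to the working directory and return a file path inside it."""
    os.makedirs(output_dir, exist_ok=True)
    return os.path.join(output_dir, filename)


model_path = prepare_output_path("model.pt")

# In train.py, the path would then be handed to torch.save, for example:
# torch.save(model.state_dict(), model_path)
```

Using a relative path keeps the script independent of the long job-specific working directory shown in the question.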