I am following the Azure ML tutorial at https://docs.microsoft.com/en-us/azure/machine-learning/tutorial-1st-experiment-bring-data to train a deep learning model, and I want to save the model to a specified path. This is my original script: run-pytorch-data.py
The train.py script is as follows: train.py
The model is not saved in the specified location. When I print the current working directory, I get:
print (os.getcwd())
/mnt/batch/tasks/shared/LS_root/jobs/workspacecvazue/azureml/day1-experiment-data_1646679122_86b0ac64/wd/azureml/day1-experiment-data_1646679122_86b0ac64
How can I save the model in the desired location?
CodePudding user response:
You can try any of the following ways to save the torch model in Azure ML.
As suggested by Stack Overflow user Jadiel de Armas:
1. Save the model to use it yourself for inference:
torch.save(model.state_dict(), filepath)
# Later, to restore:
model.load_state_dict(torch.load(filepath))
model.eval()
2. Save model to resume training later:
state = {
    'epoch': epoch,
    'state_dict': model.state_dict(),
    'optimizer': optimizer.state_dict(),
    ...
}
torch.save(state, filepath)
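3. Save the whole model object so that someone else can load it without rebuilding the architecture by hand (the model's class definition must still be importable at load time, because pickle stores only a reference to it):
torch.save(model, filepath)
# Then later:
model = torch.load(filepath)  # on PyTorch 2.6+, pass weights_only=False (trusted files only)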
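Options 1 and 2 above can be sketched end to end as follows. This is only a minimal illustration, not the asker's actual training code: the tiny nn.Linear model, the SGD optimizer, the epoch number, the file names, and the temporary directory are all placeholders I made up for the example.

```python
import os
import tempfile

import torch
import torch.nn as nn

# Placeholder model and optimizer (assumptions, not from the question).
model = nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# Placeholder save location; in a real job this would be your target folder.
save_dir = tempfile.mkdtemp()
weights_path = os.path.join(save_dir, "model_weights.pt")
checkpoint_path = os.path.join(save_dir, "checkpoint.pt")

# Option 1: save only the weights, then restore them into a fresh model.
torch.save(model.state_dict(), weights_path)
restored = nn.Linear(4, 2)
restored.load_state_dict(torch.load(weights_path))
restored.eval()

# Option 2: save a checkpoint that also holds the optimizer state and epoch.
state = {
    "epoch": 3,
    "state_dict": model.state_dict(),
    "optimizer": optimizer.state_dict(),
}
torch.save(state, checkpoint_path)

# Resume: rebuild the objects, then load their saved state.
checkpoint = torch.load(checkpoint_path)
resumed = nn.Linear(4, 2)
resumed.load_state_dict(checkpoint["state_dict"])
resumed_optimizer = torch.optim.SGD(resumed.parameters(), lr=0.01)
resumed_optimizer.load_state_dict(checkpoint["optimizer"])
start_epoch = checkpoint["epoch"] + 1
```

In both cases the architecture is rebuilt in code first and only the state is read from disk, which is why these two options are usually the safer choice across PyTorch versions.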
References: "Best way to save a trained model in PyTorch?" (Stack Overflow); "Saving and Loading Models" (PyTorch Tutorials 1.10.1+cu102 documentation); "Save and Load the Model" (PyTorch Tutorials 1.10.1+cu102 documentation).
CodePudding user response:
When you train a model in Azure ML, it is trained on a remote compute target. Once training is complete and the model is saved there, you must upload the model to your own storage:
First step: save the model on the compute target.
torch.save(state, path_save + '/model.hdf5')
Second step: upload the model to your account's blob storage.
datastore.upload(src_dir=path_save,
target_path=desired_path, overwrite=True)
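For the first step, it can help to build the file path explicitly and make sure the folder exists before saving, since the job runs in a working directory you do not control. Here is a small sketch of that idea; build_model_path is a hypothetical helper name of mine, and the temporary directory only stands in for path_save on the compute target.

```python
import os
import tempfile


def build_model_path(save_dir, filename="model.hdf5"):
    """Create save_dir if it is missing and return the full model file path."""
    os.makedirs(save_dir, exist_ok=True)
    return os.path.join(save_dir, filename)


# Stand-in for path_save; on the compute target this would be your own folder.
path_save = os.path.join(tempfile.mkdtemp(), "outputs")
model_file = build_model_path(path_save)
print(model_file)

# The result would then be passed to torch.save(state, model_file),
# and path_save to datastore.upload(src_dir=path_save, ...).
```

As far as I know, in Azure ML script runs anything written under ./outputs relative to the working directory is also uploaded to the run record automatically, so that folder may be a convenient choice for path_save.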