I am iterating on a model inference deployment using script mode in SageMaker (currently running in local mode) and update my inference.py entry point script often. Every time I change it, I have to recreate the model instance like this:
model_instance = PyTorchModel(
    model_data=model_tar_path,
    role=role,
    source_dir="code",
    entry_point="inference.py",
    framework_version="1.8",
    py_version="py3",
)
and then call
predictor = model_instance.deploy(
    initial_instance_count=1,
    endpoint_name="some_name",
    instance_type=instance_type,
    serializer=JSONSerializer(),
    deserializer=JSONDeserializer(),
)
over and over every time I change something. This takes a long time because it essentially starts a new Docker container (remember, this is running in local mode) and waits for all dependencies to install before I can do anything with it. And if there is any error at all, I have to go through the whole process again.

I'd like to explore whether I can use the update_endpoint functionality to essentially redeploy the endpoint within the same container, without having to recreate a new container every time and then wait for all the dependency installations.
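For context, this is roughly what I was hoping to be able to do after each edit to inference.py instead of a full redeploy (just a sketch; I don't know whether this works in local mode):

# hoped-for workflow: update the existing endpoint in place
predictor.update_endpoint(
    initial_instance_count=1,
    instance_type=instance_type,
)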
Answer:
SageMaker local mode is designed to imitate the hosted environment. As such, a new container is started on every deploy or update.

For faster development, I usually bake all the packages I can into the container, which removes the need to install them on every deploy.
That is, you can extend the SageMaker PyTorch container and bake your packages into it instead of using a requirements.txt. You can then push the image to ECR and specify it in the PyTorchModel.
https://docs.aws.amazon.com/sagemaker/latest/dg/prebuilt-containers-extend.html
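A minimal sketch of such a Dockerfile, assuming a CPU inference image for PyTorch 1.8 in us-east-1 (look up the exact base image URI and tag for your region and framework version in the AWS Deep Learning Containers list):

# Extend the prebuilt SageMaker PyTorch inference image
# (example URI; substitute the right account/region/tag for your setup).
FROM 763104351884.dkr.ecr.us-east-1.amazonaws.com/pytorch-inference:1.8.1-cpu-py3

# Bake in the packages that currently live in code/requirements.txt,
# so they no longer get installed on every deploy.
COPY code/requirements.txt /tmp/requirements.txt
RUN pip install --no-cache-dir -r /tmp/requirements.txt

Build the image with docker build and push it to ECR with docker push; the local-mode container will then start with everything already installed.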
Then, in your PyTorchModel:
model_instance = PyTorchModel(
    image_uri=<YourImageECRURI>,
    model_data=model_tar_path,
    role=role,
    source_dir="code",
    entry_point="inference.py",
    framework_version="1.8",
    py_version="py3",
)