I want to create an API with a single endpoint, but in my app the first model's output is the second model's input. Is it possible to implement this using SageMaker?
From my understanding, model_fn, predict_fn, and output_fn can only use one model at a time.
CodePudding user response:
Pipeline Model (sequential models)
SageMaker has a dedicated feature for this: PipelineModel (an inference pipeline).
You can pass a list of sagemaker.Model objects in the order you want the inference to happen.
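To make the hand-off concrete, here is a minimal local sketch of the sequential behaviour, not SageMaker code. The two handler functions and run_pipeline are hypothetical stand-ins for the containers in the list; the only point is that each step receives the previous step's response as its request.

```python
import json
from typing import Callable, List


# Hypothetical stand-in for the first container (e.g. preprocessing).
def model_0_handler(payload: bytes) -> bytes:
    data = json.loads(payload)
    scaled = [x / 10.0 for x in data["features"]]
    return json.dumps({"features": scaled}).encode("utf-8")


# Hypothetical stand-in for the second container (e.g. the predictor).
def model_1_handler(payload: bytes) -> bytes:
    data = json.loads(payload)
    return json.dumps({"score": sum(data["features"])}).encode("utf-8")


def run_pipeline(handlers: List[Callable[[bytes], bytes]], payload: bytes) -> bytes:
    # Each handler's output becomes the next handler's input, in list order.
    for handler in handlers:
        payload = handler(payload)
    return payload


request = json.dumps({"features": [10, 20, 30]}).encode("utf-8")
response = json.loads(run_pipeline([model_0_handler, model_1_handler], request))
print(response)
```

One consequence of this design is that the first model must emit a payload in a format the second model can parse.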
This is an official AWS example to follow: "Train, register, and deploy a pipeline model".
from sagemaker import PipelineModel

# model_0, model_1, ... are sagemaker.Model objects, listed in inference order
pipeline_model = PipelineModel(
    models=[model_0, model_1, ...],
    role=role,
    sagemaker_session=pipeline_session,
)
It behaves like a regular SageMaker Model, so you deploy it with the usual deploy() method.
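As a rough sketch of that deployment step: the helper name deploy_pipeline, the endpoint name, and the instance type below are assumptions chosen for illustration. The keyword arguments passed to deploy() are the standard ones, and the helper assumes the model was created with a session that is allowed to create endpoints.

```python
def deploy_pipeline(pipeline_model, endpoint_name, instance_type="ml.m5.large"):
    """Deploy every model in the pipeline behind a single endpoint.

    pipeline_model is expected to be a sagemaker.PipelineModel; the
    instance type default is an assumption, pick one that suits your models.
    """
    return pipeline_model.deploy(
        initial_instance_count=1,
        instance_type=instance_type,
        endpoint_name=endpoint_name,
    )
```

Calling deploy_pipeline(pipeline_model, "my-pipeline-endpoint") would then create one endpoint that runs the models in sequence for each request.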
You can also follow this guide, which shows how to deploy an XGBoost model binary and add a post-processing layer through a SageMaker inference pipeline, all served from a single endpoint.