The primary container for production variant default-variant-name did not pass the ping health check


I'm trying to deploy a pre-trained ML model on SageMaker.

Whenever I try to deploy the model, I get this error in the endpoint console:

The primary container for production variant default-variant-name did not pass the ping health check. Please check CloudWatch logs for this endpoint.

and this in the CloudWatch logs:

- - [06/Oct/2022:05:43:42 +0000] "GET /ping HTTP/1.1" 200 0 "-" "AHC/2.0"

The model is a pickle file. I packaged it as a tar.gz archive and uploaded it to S3.
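
For reference, the packaging step can be sketched as follows. This is a minimal sketch using only the standard library, not the exact commands used here: the helper names `package_model` and `load_packaged_model` are hypothetical, and the file names are taken from the serve.py shown in this question. SageMaker hosting downloads the archive from S3 and extracts it into /opt/ml/model inside the container, so the pickle is placed at the archive root.

```python
import pickle
import tarfile
from pathlib import Path


def package_model(model, workdir, pkl_name="svm-model.pkl",
                  archive_name="svm-model.tar.gz"):
    """Pickle `model` and wrap the pickle in a gzip-compressed tar archive."""
    workdir = Path(workdir)
    pkl_path = workdir / pkl_name
    with open(pkl_path, "wb") as fh:
        pickle.dump(model, fh)

    archive_path = workdir / archive_name
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname stores the pickle at the archive root, without local directories
        tar.add(pkl_path, arcname=pkl_name)
    return archive_path


def load_packaged_model(archive_path, extract_dir, pkl_name="svm-model.pkl"):
    """Extract the archive into `extract_dir` and unpickle the model."""
    extract_dir = Path(extract_dir)
    extract_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(extract_dir)
    with open(extract_dir / pkl_name, "rb") as fh:
        return pickle.load(fh)
```

Round-tripping the model locally through both helpers is a cheap way to confirm the archive layout before uploading it.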

serve.py file -

import json
import joblib
import numpy as np
from sklearn import svm
import os
import sklearn
import pickle
import boto3
import tarfile

def init():
    global model
    s3_bucket = 'sagemaker-model-artifacts-dt'
    model_filename = 'svm-model.tar.gz'
    model_s3_key = model_filename

    # Note: model_url is built but never used; the archive is read from the working directory
    model_url = f's3://{s3_bucket}/{model_s3_key}'

    # Unpack the archive so svm-model.pkl exists on disk before it is loaded
    with tarfile.open(model_filename) as my_tar:
        my_tar.extractall()
    with open('svm-model.pkl', 'rb') as f:
        model = pickle.load(f)

def run(raw_data):
    # Get the input data as a numpy array
    data = np.array(json.loads(raw_data)['data'])
    # data = scaler.transform(data)
    # Get a prediction from the model
    predictions: np.ndarray = model.predict(data)
    # Return the predictions in a JSON-serializable format
    return {
        "predictions": predictions.tolist()
    }
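
For context, SageMaker hosting expects the container to listen on port 8080 and answer GET /ping (health check) and POST /invocations (inference). The serve.py above only defines init() and run() and never starts a web server, which may be one reason the health check fails. Below is a minimal sketch of such a server using only the standard library. It is not the fix confirmed in this thread: `InferenceHandler`, `make_server`, and `main` are hypothetical names, and `predict` is a stand-in for the real model call.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(data):
    # Hypothetical stand-in for model.predict(data).tolist(); returns row sums.
    return [sum(row) for row in data]


class InferenceHandler(BaseHTTPRequestHandler):
    def _send(self, status, body=b""):
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self):
        # Health check: an HTTP 200 tells SageMaker the container is ready
        self._send(200 if self.path == "/ping" else 404)

    def do_POST(self):
        if self.path != "/invocations":
            self._send(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length))
            result = {"predictions": predict(payload["data"])}
        except (ValueError, KeyError, TypeError) as exc:
            # Malformed JSON or a missing/invalid "data" field
            self._send(400, json.dumps({"error": str(exc)}).encode())
            return
        self._send(200, json.dumps(result).encode())


def make_server(host="0.0.0.0", port=8080):
    # 8080 is the port SageMaker sends /ping and /invocations requests to
    return HTTPServer((host, port), InferenceHandler)


def main():
    make_server().serve_forever()
```

In a real serve.py, the model would be loaded once before `main()` is called, and `predict` would delegate to it. A production container would normally put a proper WSGI server behind nginx rather than use `http.server`.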

Dockerfile -

FROM python:latest


RUN apt-get -y update && apt-get install -y --no-install-recommends \
         wget \
         python3 \
         nginx \
         ca-certificates \
    && rm -rf /var/lib/apt/lists/*

RUN wget https://bootstrap.pypa.io/get-pip.py && python3 get-pip.py && \
    pip install joblib numpy scikit-learn boto3 && \
    rm -rf /root/.cache

ENV PATH="/opt/program:${PATH}"

COPY service_files /opt/program

WORKDIR /opt/program

ENTRYPOINT ["python","/opt/program/serve.py"]

Answer:

Thanks, all, for taking the time to go through my question.

I tried one more method, using ezsmdeploy, and it worked. I would suggest that approach if you already have a pre-trained model and don't want to deal with Docker.
