how to use sagemaker inside pyspark


I have a simple requirement: I need to run a SageMaker prediction inside a Spark job.

I am trying to run the code below:

ENDPOINT_NAME = "MY-ENDPOINT_NAME"
from sagemaker_pyspark import SageMakerModel
from sagemaker_pyspark import EndpointCreationPolicy
from sagemaker_pyspark.transformation.serializers import ProtobufRequestRowSerializer
from sagemaker_pyspark.transformation.deserializers import ProtobufResponseRowDeserializer

attachedModel = SageMakerModel(
    existingEndpointName=ENDPOINT_NAME,
    endpointCreationPolicy=EndpointCreationPolicy.DO_NOT_CREATE,
    endpointInstanceType=None,  # Required
    endpointInitialInstanceCount=None,  # Required
    requestRowSerializer=ProtobufRequestRowSerializer(
        featuresColumnName="featureCol"
    ),  # Optional: already default value
    responseRowDeserializer=ProtobufResponseRowDeserializer(schema=output_schema),
)

transformedData2 = attachedModel.transform(df)
transformedData2.show()

I get the following error: TypeError: 'JavaPackage' object is not callable

CodePudding user response:

This was solved by putting the sagemaker_pyspark jars on the Spark classpath before creating the SparkContext:

import sagemaker_pyspark
from pyspark import SparkConf, SparkContext

# Add the sagemaker_pyspark jars to the driver classpath before the JVM starts.
classpath = ":".join(sagemaker_pyspark.classpath_jars())
conf = SparkConf() \
    .set("spark.driver.extraClassPath", classpath)
sc = SparkContext(conf=conf)
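
For completeness, here is a minimal end-to-end sketch combining the classpath fix with the attach code from the question. The endpoint name, the feature column, the undefined df, and the response schema (output_schema) are placeholders: the schema below is only an assumed example of what an endpoint might return, and whether the executor classpath also needs the jars depends on how the cluster is deployed.

import sagemaker_pyspark
from pyspark import SparkConf
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, DoubleType
from sagemaker_pyspark import SageMakerModel, EndpointCreationPolicy
from sagemaker_pyspark.transformation.serializers import ProtobufRequestRowSerializer
from sagemaker_pyspark.transformation.deserializers import ProtobufResponseRowDeserializer

# The jars must be on the classpath before the JVM is started, i.e. before the
# first SparkSession/SparkContext is created in this process.
classpath = ":".join(sagemaker_pyspark.classpath_jars())
conf = SparkConf() \
    .set("spark.driver.extraClassPath", classpath) \
    .set("spark.executor.extraClassPath", classpath)  # may also be needed on a cluster

spark = SparkSession.builder.config(conf=conf).getOrCreate()

# Placeholder response schema -- replace with the fields your endpoint actually returns.
output_schema = StructType([StructField("prediction", DoubleType(), nullable=True)])

attachedModel = SageMakerModel(
    existingEndpointName="MY-ENDPOINT_NAME",
    endpointCreationPolicy=EndpointCreationPolicy.DO_NOT_CREATE,
    endpointInstanceType=None,
    endpointInitialInstanceCount=None,
    requestRowSerializer=ProtobufRequestRowSerializer(featuresColumnName="featureCol"),
    responseRowDeserializer=ProtobufResponseRowDeserializer(schema=output_schema),
)

# df must contain a Vector column named "featureCol" (e.g. built with VectorAssembler).
transformedData2 = attachedModel.transform(df)
transformedData2.show()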