Azure ML Studio: Container has crashed. Did your init method fail?

Time:12-30

I am trying to deploy an ML model from a notebook in Azure ML Studio. The commands we are using come from https://docs.microsoft.com/en-us/azure/machine-learning/how-to-deploy-and-where?tabs=python#define-an-inference-configuration

We registered the model as follows:

from azureml.core.model import Model
model = Model.register(ws, model_name="pdmrfull", model_path="pdmrfull.model")

But when we run this command:

service = Model.deploy(
    ws,
    "myservice",
    ["pdmrfull.model"],
    dummy_inference_config,
    deployment_config,
    overwrite=True,
)
service.wait_for_deployment(show_output=True)

We get the error "Container has crashed. Did your init method fail?"

The logs are:

ModelNotFound: Model with id pdmrfull.model not found in provided workspace

Copying local model pdmrfull.model to /tmp/azureml_bj8rboqi/pdmrfull.model/0
Generating Docker build context.
Package creation Succeeded
Logging into Docker registry anomalydetectiondemo.azurecr.io
Logging into Docker registry anomalydetectiondemo.azurecr.io
Building Docker image from Dockerfile...
Step 1/5 : FROM anomalydetectiondemo.azurecr.io/azureml/azureml_1566ddc49d4bc958af64b0982f94332f
 ---> 2a67c40c6461
Step 2/5 : COPY azureml-app /var/azureml-app
 ---> a26085e19be6
Step 3/5 : RUN mkdir -p '/var/azureml-app' && echo eyJhY2NvdW50Q29udGV4dCI6eyJzdWJzY3JpcHRpb25JZCI6IjQ0NWRlMzNhLTJkZGQtNDg4Yy04Y2UwLTNhODdhNTZiZTBhNiIsInJlc291cmNlR3JvdXBOYW1lIjoid2F0ZXJ1dGlsaXRpZXNwcm9kdWN0X3JnIiwiYWNjb3VudE5hbWUiOiJ1dGlsaXR5X2FpLW1sX3dvcmtzcGFjZSIsIndvcmtzcGFjZUlkIjoiYTM5Y2M1OTQtYjYxNi00NzFkLWIxYTEtMGJmNzM5ZGVhZjgxIn0sIm1vZGVscyI6e30sIm1vZGVsc0luZm8iOnt9fQ== | base64 --decode > /var/azureml-app/model_config_map.json
 ---> Running in 6a0f7e703eb4
 ---> 9601c2fdb34a
Step 4/5 : RUN mv '/var/azureml-app/tmp61nlnxfj.py' /var/azureml-app/main.py
 ---> Running in ad83f9d6636c
 ---> 5ecb95b1a821
Step 5/5 : CMD ["runsvdir","/var/runit"]
 ---> Running in 77ba64239d16
 ---> d083827df407
Successfully built d083827df407
Successfully tagged myservice:latest
Container (name:strange_bouman, id:626a24dc3c65a0f817bf8b7efefb36b9b79a2ee1cbdaf7b92ac562dbce0ba6f5) cannot be killed.
Container has been successfully cleaned up.
Image sha256:421ee3e0b7153a6dcb372b463150909ee2cef778b45f314e69cd80224a8cb540 successfully removed.
Starting Docker container...
Docker container running.
Checking container health...
Error: Container has crashed. Did your init method fail?


Container Logs:
2021-12-29T07:23:16,299774437 00:00 - gunicorn/run 
Dynamic Python package installation is disabled.
Starting HTTP server
2021-12-29T07:23:16,306725894 00:00 - rsyslog/run 
2021-12-29T07:23:16,305934399 00:00 - iot-server/run 
2021-12-29T07:23:16,307283190 00:00 - nginx/run 
EdgeHubConnectionString and IOTEDGE_IOTHUBHOSTNAME are not set. Exiting...
2021-12-29T07:23:16,408489759 00:00 - iot-server/finish 1 0
2021-12-29T07:23:16,409794351 00:00 - Exit code 1 is normal. Not restarting iot-server.
Starting gunicorn 20.1.0
Listening at: http://127.0.0.1:31311 (11)
Using worker: sync
worker timeout is set to 300
Booting worker with pid: 39
SPARK_HOME not set. Skipping PySpark Initialization.
Exception in worker process
Traceback (most recent call last):
  File "/azureml-envs/azureml_da3e97fcb51801118b8e80207f3e01ad/lib/python3.6/site-packages/gunicorn/arbiter.py", line 589, in spawn_worker
    worker.init_process()
  File "/azureml-envs/azureml_da3e97fcb51801118b8e80207f3e01ad/lib/python3.6/site-packages/gunicorn/workers/base.py", line 134, in init_process
    self.load_wsgi()
  File "/azureml-envs/azureml_da3e97fcb51801118b8e80207f3e01ad/lib/python3.6/site-packages/gunicorn/workers/base.py", line 146, in load_wsgi
    self.wsgi = self.app.wsgi()
  File "/azureml-envs/azureml_da3e97fcb51801118b8e80207f3e01ad/lib/python3.6/site-packages/gunicorn/app/base.py", line 67, in wsgi
    self.callable = self.load()
  File "/azureml-envs/azureml_da3e97fcb51801118b8e80207f3e01ad/lib/python3.6/site-packages/gunicorn/app/wsgiapp.py", line 58, in load
    return self.load_wsgiapp()
  File "/azureml-envs/azureml_da3e97fcb51801118b8e80207f3e01ad/lib/python3.6/site-packages/gunicorn/app/wsgiapp.py", line 48, in load_wsgiapp
    return util.import_app(self.app_uri)
  File "/azureml-envs/azureml_da3e97fcb51801118b8e80207f3e01ad/lib/python3.6/site-packages/gunicorn/util.py", line 359, in import_app
    mod = importlib.import_module(module)
  File "/azureml-envs/azureml_da3e97fcb51801118b8e80207f3e01ad/lib/python3.6/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 978, in _gcd_import
  File "<frozen importlib._bootstrap>", line 961, in _find_and_load
  File "<frozen importlib._bootstrap>", line 950, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 655, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 678, in exec_module
  File "<frozen importlib._bootstrap>", line 205, in _call_with_frames_removed
  File "/var/azureml-server/entry.py", line 1, in <module>
    import create_app
  File "/var/azureml-server/create_app.py", line 4, in <module>
    from routes_common import main
  File "/var/azureml-server/routes_common.py", line 32, in <module>
    from aml_blueprint import AMLBlueprint
  File "/var/azureml-server/aml_blueprint.py", line 28, in <module>
    main_module_spec.loader.exec_module(main)
  File "/var/azureml-app/pdmscore.py", line 3, in <module>
    from pyspark.ml import Pipeline
ModuleNotFoundError: No module named 'pyspark'
Worker exiting (pid: 39)
Shutting down: Master
Reason: Worker failed to boot.
2021-12-29T07:23:16,864252915 00:00 - gunicorn/finish 3 0
2021-12-29T07:23:16,865700106 00:00 - Exit code 3 is not normal. Killing image.

---------------------------------------------------------------------------
WebserviceException                       Traceback (most recent call last)
<ipython-input-209-0ccedd6ff2bb> in <module>
      7     overwrite=True,
      8 )
----> 9 service.wait_for_deployment(show_output=True)

/anaconda/envs/newtestenv/lib/python3.6/site-packages/azureml/core/webservice/local.py in decorated(self, *args, **kwargs)
     70                 raise WebserviceException('Cannot call {}() when service is {}.'.format(func.__name__, self.state),
     71                                           logger=module_logger)
---> 72             return func(self, *args, **kwargs)
     73         return decorated
     74     return decorator

/anaconda/envs/newtestenv/lib/python3.6/site-packages/azureml/core/webservice/local.py in wait_for_deployment(self, show_output)
    612                                    self._container,
    613                                    health_url=self._get_health_url(),
--> 614                                    cleanup_if_failed=False)
    615 
    616             self.state = LocalWebservice.STATE_RUNNING

/anaconda/envs/newtestenv/lib/python3.6/site-packages/azureml/_model_management/_util.py in container_health_check(docker_port, container, health_url, cleanup_if_failed)
    750             # The container has started and crashed.
    751             _raise_for_container_failure(container, cleanup_if_failed,
--> 752                                          'Error: Container has crashed. Did your init method fail?')
    753 
    754         # The container hasn't crashed, so try to ping the health endpoint.

/anaconda/envs/newtestenv/lib/python3.6/site-packages/azureml/_model_management/_util.py in _raise_for_container_failure(container, cleanup, message)
   1268         cleanup_container(container)
   1269 
-> 1270     raise WebserviceException(message, logger=module_logger)
   1271 
   1272 

WebserviceException: WebserviceException:
    Message: Error: Container has crashed. Did your init method fail?
    InnerException None
    ErrorResponse 
{
    "error": {
        "message": "Error: Container has crashed. Did your init method fail?"
    }
}

The init method is:

def init():
    from pyspark.ml import PipelineModel
    # read in the model file
    global pipeline
    pipeline = PipelineModel.load('pdmrfull.model')

CodePudding user response:

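The logs report that the model was not found in the workspace (ModelNotFound: Model with id pdmrfull.model not found in provided workspace). To troubleshoot, please follow this document: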

https://docs.microsoft.com/en-us/azure/machine-learning/how-to-debug-pipelines

Here is a link to Azure Machine Learning endpoints.
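Related to the model file: the init method in the question loads 'pdmrfull.model' by a bare relative path, which may not resolve inside the container. A minimal sketch of building the path from the AZUREML_MODEL_DIR environment variable, which Azure ML sets in the scoring container, is shown here. The helper name resolve_model_path and the fallback directory are hypothetical, and the exact layout under that directory (for example an extra model-name/version subfolder) is an assumption to verify against your deployment.

```python
import os


def resolve_model_path(file_name, default_dir="."):
    """Build the path to a model file inside the scoring container.

    AZUREML_MODEL_DIR is set by Azure ML at deployment time. When it is
    absent (for example, when testing on a workstation), fall back to
    default_dir. This helper is hypothetical, not part of the SDK.
    """
    model_dir = os.environ.get("AZUREML_MODEL_DIR", default_dir)
    return os.path.join(model_dir, file_name)


# Inside init() this could replace the hard-coded relative path, e.g.:
#     pipeline = PipelineModel.load(resolve_model_path("pdmrfull.model"))
```

Printing the resolved path from init() and comparing it with the container logs is a quick way to confirm where the model was actually copied.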

CodePudding user response:

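The logs are fairly self-explanatory: ModuleNotFoundError: No module named 'pyspark'. The traceback shows the import failing at line 3 of /var/azureml-app/pdmscore.py (from pyspark.ml import Pipeline), so the scoring script cannot even be loaded. Which dependencies did you install in the environment used by the inference configuration? Check that list; pyspark is probably missing from it.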
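One way to catch this before deploying is to scan the scoring script for its imports and check which ones cannot be found in a given Python environment. The following is a rough sketch using only the standard library; the function name unresolved_imports is hypothetical, and the check only reflects the interpreter it runs under, so it is meaningful only when run inside an environment that mirrors the deployment image.

```python
import ast
import importlib.util
from typing import List


def unresolved_imports(source: str) -> List[str]:
    """Return top-level modules imported by `source` that cannot be
    located in the current Python environment."""
    top_level = set()
    # ast.walk also visits imports nested inside functions such as init().
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            top_level.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            top_level.add(node.module.split(".")[0])
    # find_spec returns None for a top-level module that is not installed.
    return sorted(
        name for name in top_level if importlib.util.find_spec(name) is None
    )


# Example: check an entry script on disk (the file name is taken from the logs).
#     with open("pdmscore.py") as handle:
#         print(unresolved_imports(handle.read()))
```

Any name this reports (pyspark, in the logs above) needs to be added to the dependencies of the environment referenced by the inference configuration before redeploying.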
