Using the v2 Azure ML Python SDK (azure-ai-ml), how do I get an instance of the currently running job?
In v1 (azureml-core) I would do:
from azureml.core import Run
run = Run.get_context()
if isinstance(run, Run):
    print("Running on compute...")
What is the equivalent on the v2 SDK?
Answer:
This is a little more involved in v2 than it was in v1. The reason is that v2 makes a clear distinction between the control plane (where you start/stop your job, deploy compute, etc.) and the data plane (where you run your data science code, load data from storage, etc.).
Jobs can do control plane operations, but they need to do that with a proper identity that was explicitly assigned to the job by the user.
Let me show you the code first. This script creates an MLClient, uses it to connect to the service and retrieve the job's metadata, and then extracts the name of the user who submitted the job:
# control_plane.py
from azure.ai.ml import MLClient
from azure.ai.ml.identity import AzureMLOnBehalfOfCredential
import os
def get_ml_client():
    # The tracking URI embeds the subscription, resource group and workspace;
    # each value is the path segment right after its keyword.
    uri = os.environ["MLFLOW_TRACKING_URI"]
    uri_segments = uri.split("/")
    subscription_id = uri_segments[uri_segments.index("subscriptions") + 1]
    resource_group_name = uri_segments[uri_segments.index("resourceGroups") + 1]
    workspace_name = uri_segments[uri_segments.index("workspaces") + 1]
    credential = AzureMLOnBehalfOfCredential()
    client = MLClient(
        credential=credential,
        subscription_id=subscription_id,
        resource_group_name=resource_group_name,
        workspace_name=workspace_name,
    )
    return client
ml_client = get_ml_client()
this_job = ml_client.jobs.get(os.environ["MLFLOW_RUN_ID"])
print("This job was created by:", this_job.creation_context.created_by)
As you can see, the code uses a special AzureMLOnBehalfOfCredential to create the MLClient. Options that you would use locally (AzureCliCredential or InteractiveBrowserCredential) won't work for a remote job, since you are not authenticated through az login or through the browser prompt on that remote run. For your credentials to be available on the remote job, you need to run the job with user_identity, and you need to retrieve the corresponding credential from the environment by using the AzureMLOnBehalfOfCredential class.
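The workspace lookup in get_ml_client depends on the shape of MLFLOW_TRACKING_URI, so it can help to check that parsing in isolation. The following is only a sketch: parse_tracking_uri, current_job_name and the example URI are hypothetical names and values made up for illustration, and the assumption that these environment variables signal "running inside a job" comes from the script above rather than from any documented guarantee (MLflow may also set them locally).

```python
import os
from typing import Mapping, NamedTuple, Optional


class WorkspaceRef(NamedTuple):
    subscription_id: str
    resource_group_name: str
    workspace_name: str


def parse_tracking_uri(uri: str) -> WorkspaceRef:
    """Pull the workspace coordinates out of an azureml:// tracking URI."""
    segments = uri.split("/")

    def value_after(key: str) -> str:
        # Each identifier is the path segment right after its keyword.
        return segments[segments.index(key) + 1]

    return WorkspaceRef(
        value_after("subscriptions"),
        value_after("resourceGroups"),
        value_after("workspaces"),
    )


def current_job_name(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the job name if the job environment variables are present, else None."""
    if "MLFLOW_TRACKING_URI" not in env:
        return None
    return env.get("MLFLOW_RUN_ID")


# Made-up URI in the shape the script above expects (assumption).
example_uri = (
    "azureml://westeurope.api.azureml.ms/mlflow/v1.0"
    "/subscriptions/00000000-0000-0000-0000-000000000000"
    "/resourceGroups/my-rg"
    "/providers/Microsoft.MachineLearningServices"
    "/workspaces/my-ws"
)
ref = parse_tracking_uri(example_uri)
print(ref.workspace_name)
```

A None result from current_job_name plays roughly the role of the v1 isinstance(run, Run) check: it lets the same script skip the control plane calls when run outside a job.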
So, how do you run a job with user_identity? Below is the YAML that will achieve it:
$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
type: command
command: |
  pip install azure-ai-ml
  python control_plane.py
code: code
environment:
  image: library/python:latest
compute: azureml:cpu-cluster
identity:
  type: user_identity
Note the identity section at the bottom. Also note that I am lazy and install the azure-ai-ml SDK as part of the job. In a real setting, I would of course create an environment with the package installed.
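If you prefer to define the job from Python rather than YAML, something along these lines should be roughly equivalent. Treat it as a sketch: build_user_identity_job is a hypothetical helper name, the ./code folder and cpu-cluster compute are taken from the YAML above, and I am assuming the azure-ai-ml version in use exposes UserIdentityConfiguration in azure.ai.ml.entities and accepts it through the identity argument of command().

```python
def build_user_identity_job():
    """Build (but do not submit) a command job that runs under the submitting user's identity."""
    # Imported lazily so this module still loads where azure-ai-ml is not installed.
    from azure.ai.ml import command
    from azure.ai.ml.entities import Environment, UserIdentityConfiguration

    return command(
        code="./code",  # folder containing control_plane.py (assumed layout)
        command="pip install azure-ai-ml && python control_plane.py",
        environment=Environment(image="library/python:latest"),
        compute="cpu-cluster",
        identity=UserIdentityConfiguration(),  # counterpart of `type: user_identity`
    )


# Submitting requires an authenticated MLClient created on your machine, e.g.:
# ml_client.jobs.create_or_update(build_user_identity_job())
```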
These are the valid settings for the identity type:
- aml_token: this is the default, which will not allow you to access the control plane.
- managed or managed_identity: this means the job will be run under the given managed identity (aka compute identity). This would be accessed in your job via azure.identity.ManagedIdentityCredential. Of course, you need to provide the chosen compute identity with access to the workspace to be able to read job information.
- user_identity: this will run the job under the submitting user's identity. It is to be used with the azure.ai.ml.identity.AzureMLOnBehalfOfCredential credentials as shown above.
So, for your use case, you have 2 options:
- You could run the job with user_identity and use the AzureMLOnBehalfOfCredential class to create the MLClient.
- You could create the compute with a managed identity which you give access to the workspace, and then run the job with managed_identity and use the ManagedIdentityCredential class to create the MLClient.
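To make the two options concrete, here is a minimal sketch of a helper that picks the credential class from the job's identity type. make_credential and its client_id parameter are hypothetical names; passing client_id is only an assumption for the case where the compute has a user-assigned managed identity. The returned credential can be passed to MLClient in place of the hard-coded one in get_ml_client.

```python
from typing import Optional


def make_credential(identity_type: str, client_id: Optional[str] = None):
    """Return a credential object matching the identity type the job was submitted with."""
    if identity_type == "user_identity":
        # Option 1: act on behalf of the submitting user.
        from azure.ai.ml.identity import AzureMLOnBehalfOfCredential

        return AzureMLOnBehalfOfCredential()

    if identity_type in ("managed", "managed_identity"):
        # Option 2: use the managed identity attached to the compute.
        from azure.identity import ManagedIdentityCredential

        if client_id:
            # Assumption: only needed to select a user-assigned identity.
            return ManagedIdentityCredential(client_id=client_id)
        return ManagedIdentityCredential()

    if identity_type == "aml_token":
        raise ValueError("aml_token jobs cannot access the control plane")

    raise ValueError(f"unknown identity type: {identity_type!r}")
```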