Passing Databricks ClusterID at runtime from Azure Data Bricks Pipeline

Time:10-27

I want to make the Azure Databricks linked service configurable by passing the Databricks workspace URL and the cluster ID at runtime. I will have multiple Spark clusters, and I want to choose which one to invoke based on the type/size of the cluster.

I can't find an option for getting the Databricks cluster ID and passing it from the ADF pipeline.

[Image: Databricks linked service]

Answer:

You can use the Databricks REST API (Clusters API 2.0) to get the list of clusters.

https://adb-7012303279496007.7.azuredatabricks.net/api/2.0/clusters/list

I reproduced this and got the results below.

First, generate an access token in the Databricks workspace, then use it as the authorization in a Web activity to get the list of clusters.
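For testing the endpoint outside ADF, the same call can be sketched in Python. This is a minimal sketch, not the Web activity itself: the function names `build_clusters_list_request` and `list_clusters` are hypothetical, and it assumes the token is sent as a Bearer token in the `Authorization` header and that the response carries the cluster list under a `clusters` key.

```python
import json
import urllib.request


def build_clusters_list_request(workspace_url: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for the Clusters API 2.0 list endpoint."""
    url = workspace_url.rstrip("/") + "/api/2.0/clusters/list"
    # Assumption: the access token is passed as a Bearer token.
    headers = {"Authorization": f"Bearer {token}"}
    return urllib.request.Request(url, headers=headers, method="GET")


def list_clusters(workspace_url: str, token: str) -> list:
    """Send the request and return the cluster list (empty if the key is absent)."""
    request = build_clusters_list_request(workspace_url, token)
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    # Assumption: the clusters are returned under the "clusters" key.
    return payload.get("clusters", [])
```

Calling `list_clusters("https://adb-7012303279496007.7.azuredatabricks.net", token)` with a valid token should return the same list that the Web activity receives.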


[Image: output from Web activity]

The output also contains the cluster size in MB. Store the output in an array variable.


To get the desired cluster ID based on cluster size, use a Filter activity with a condition that matches your requirement.

Here, as a sample, I used the cluster size in MB as the filter condition.
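The filtering step can be illustrated in Python as well. This is only a sketch under stated assumptions: the exact condition used in the Filter activity is not shown in the text, so a minimum-memory threshold is assumed; the field name `cluster_memory_mb` is assumed to be the size field in the cluster list; and `pick_cluster_id` and the sample cluster IDs are hypothetical.

```python
def pick_cluster_id(clusters, min_memory_mb):
    """Return the cluster_id of the first cluster with at least min_memory_mb of memory."""
    # Assumption: each cluster entry reports its size in "cluster_memory_mb".
    matches = [c for c in clusters if c.get("cluster_memory_mb", 0) >= min_memory_mb]
    if not matches:
        raise LookupError(f"no cluster with at least {min_memory_mb} MB")
    # Mirrors taking element [0] of the filtered array.
    return matches[0]["cluster_id"]


# Hypothetical sample data, shaped like entries of the cluster list.
sample_clusters = [
    {"cluster_id": "hypothetical-small", "cluster_memory_mb": 14336},
    {"cluster_id": "hypothetical-large", "cluster_memory_mb": 57344},
]

selected = pick_cluster_id(sample_clusters, 16000)  # "hypothetical-large"
```

Taking the first match corresponds to indexing the filtered array at position 0; if several clusters satisfy the condition, a stricter condition or an explicit sort may be preferable.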


In the Notebook linked service, create a parameter for cluster_id.

Pass the desired cluster_id from the filtered array as follows:

@activity('Filter1').output.Value[0].cluster_id


You can set the Notebook path using dynamic content.


[Image: my execution]
