I want to make the Azure Databricks linked service configurable, so I am passing the Databricks workspace URL and the cluster ID at runtime. I will have multiple Spark clusters and will choose which one to invoke based on the size of the cluster.
I cannot find a way to get the Databricks cluster ID and pass it from the ADF pipeline.
CodePudding user response:
You can use the Databricks REST API (Clusters API 2.0) to get the list of clusters.
https://adb-7012303279496007.7.azuredatabricks.net/api/2.0/clusters/list
I reproduced this and got the result below.
First, generate an access token in the Databricks workspace, then use it for authorization in a Web activity to get the list of clusters.
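For reference, here is a minimal Python sketch of the same call made outside ADF, which can be handy for checking the token and the response before wiring up the Web activity. It assumes the token is sent as a bearer token in the Authorization header and that the response holds the clusters under a "clusters" key. The function names and the placeholder workspace URL are hypothetical, not part of ADF or Databricks.

```python
import json
import urllib.request


def build_list_clusters_request(workspace_url: str, token: str) -> urllib.request.Request:
    """Build a GET request for the Clusters API 2.0 list endpoint."""
    url = workspace_url.rstrip("/") + "/api/2.0/clusters/list"
    # Assumption: the access token is passed as a bearer token.
    headers = {"Authorization": f"Bearer {token}"}
    return urllib.request.Request(url, headers=headers, method="GET")


def list_clusters(workspace_url: str, token: str) -> list:
    """Call the endpoint and return the list of cluster objects."""
    request = build_list_clusters_request(workspace_url, token)
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    # Assumption: clusters are returned under the "clusters" key;
    # fall back to an empty list if the key is absent.
    return payload.get("clusters", [])
```

To try it, call list_clusters with your own workspace URL and token, for example list_clusters("https://<your-workspace>.azuredatabricks.net", token). In the Web activity, the equivalent settings are the GET method, the same URL, and an Authorization header carrying the token.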
Output from web activity:
The output also includes each cluster's size in MB. Store the output in an array variable.
To get the desired cluster ID based on cluster size, apply a filter condition that suits your requirement.
For this sample, I used the cluster size in MB as the filter condition.
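The logic of the Filter activity can be sketched in Python as follows. This is only an illustration of the idea, not the ADF implementation: it assumes each cluster object carries its size in a cluster_memory_mb field next to cluster_id, and it uses a "at least this much memory" condition as one possible filter. The function name and the sample cluster IDs are made up.

```python
def pick_cluster_id(clusters, min_memory_mb):
    """Return the cluster_id of the first cluster meeting the memory threshold."""
    # Assumption: size is exposed as "cluster_memory_mb"; clusters that
    # lack the field are treated as size 0 and therefore skipped.
    matches = [c for c in clusters if c.get("cluster_memory_mb", 0) >= min_memory_mb]
    if not matches:
        return None  # nothing satisfied the filter condition
    # Same idea as taking element [0] of the Filter activity output.
    return matches[0]["cluster_id"]


# Hypothetical sample data shaped like the cluster list.
sample_clusters = [
    {"cluster_id": "small-cluster-id", "cluster_memory_mb": 14336},
    {"cluster_id": "large-cluster-id", "cluster_memory_mb": 57344},
]

selected = pick_cluster_id(sample_clusters, 32000)
```

With the sample data above, only the second cluster meets the 32000 MB threshold, so its ID is the one selected. Returning None for an empty result makes the "no matching cluster" case explicit, which the [0] index in an ADF expression would otherwise surface as a runtime error.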
Notebook linked service:
Add a parameter for cluster_id.
Pass the desired cluster_id from the filtered array, as shown below.
@activity('Filter1').output.Value[0].cluster_id
You can also set the notebook path using dynamic content.
My Execution: