I am able to connect to my Azure Databricks cluster from a Linux CentOS VM, using Visual Studio Code.
The code below even works without any issue:
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
print("Cluster access test - ", spark.range(100).count())

setting = spark.conf.get("spark.master")  # returns local[*]
if "local" in setting:
    # Running via Databricks Connect, so construct dbutils manually
    from pyspark.dbutils import DBUtils
    dbutils = DBUtils().get_dbutils(spark)
else:
    print("Do nothing - dbutils should be available already")

out = dbutils.fs.ls('/FileStore/')
print(out)
I have a local notebook which runs another notebook using %run path/anothernotebook.
Since the %run line is commented out with #, Python is not executing it.
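For context: when a Databricks notebook is exported as a .py source file, magic commands such as %run are serialized as comments, which is why plain Python skips them. A minimal sketch of what such an exported file typically looks like (the notebook name is illustrative):

# Databricks notebook source
# MAGIC %run ./anothernotebook

# COMMAND ----------

print("rest of the notebook code")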
So I tried to use dbutils.notebook.run('pathofnotebook') instead, but it errors out with:

Exception has occurred: AttributeError
'SparkServiceClientDBUtils' object has no attribute 'notebook'
Is it possible to locally debug a notebook that invokes another notebook?
CodePudding user response:
It's impossible: the dbutils implementation included in Databricks Connect supports only the fs and secrets subcommands (see the docs).
Databricks Connect is designed to work with code developed locally, not with notebooks. If you can package the content of that notebook as a Python package, then you'll be able to debug it.
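A minimal sketch of that approach (all file, module, and function names here are illustrative): move the child notebook's logic into an importable module, then call it from your local driver script instead of invoking a notebook.

# my_package/child_logic.py - hypothetical module extracted from the child notebook
def run_child(spark):
    # whatever the child notebook used to do, expressed as a plain function
    return spark.range(100).count()

# local_driver.py - debuggable with breakpoints in VS Code
from pyspark.sql import SparkSession
from my_package.child_logic import run_child

spark = SparkSession.builder.getOrCreate()
print(run_child(spark))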
P.S. Please take into account that dbutils.notebook.run executes the notebook as a separate job, in contrast to %run, which runs the child notebook inline in the caller's context.
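To illustrate the difference when running on Databricks itself, not via Databricks Connect (the notebook path and arguments are illustrative):

# %run imports the child notebook inline: its variables and functions
# land in the caller's scope, and it shares the caller's SparkSession.
# %run ./anothernotebook

# dbutils.notebook.run starts the child notebook as a separate ephemeral
# job with its own scope; values are passed in via the arguments dict
# (read with widgets) and returned via dbutils.notebook.exit.
result = dbutils.notebook.run("/path/anothernotebook", 60, {"param": "value"})
print(result)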