Databricks on Apache Spark AttributeError: 'str' object has no attribute '_jvm'

Time:03-20

When attempting to readStream data from Azure Event Hubs with Databricks on Apache Spark, I get the error:

AttributeError: 'str' object has no attribute '_jvm'

The details of the error are as follows:

----> 8 ehConf['eventhubs.connectionString'] = sparkContext._jvm.org.apache.spark.eventhubs.EventHubsUtils.encrypt(connectionString) 

The code is as follows:

sparkContext = ""
connectionString = 'Endpoint=sb://namespace.servicebus.windows.net/;SharedAccessKeyName=both4;SharedAccessKey=adfdMyKeyIGBKYBs=;EntityPath=hubv5'
# Source with default settings
connectionString = connectionString

ehConf = {}

ehConf['eventhubs.connectionString'] = sparkContext._jvm.org.apache.spark.eventhubs.EventHubsUtils.encrypt(connectionString)

streaming_df = spark \
  .readStream \
  .format("eventhubs") \
  .options(**ehConf) \
  .load()

Has anyone come across this error and found a solution?

CodePudding user response:

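Your code assigns an empty string to sparkContext, and a string has no _jvm attribute. It shouldn't be sparkContext, but just sc, the SparkContext that Databricks notebooks predefine: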

ehConf['eventhubs.connectionString'] = sc._jvm.org.apache.spark.eventhubs.EventHubsUtils.encrypt(connectionString)
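
To see why the original line fails, the error can be reproduced without Spark at all. This is only a minimal sketch of the failure mode: any attribute lookup on a plain string that does not define that attribute raises the same AttributeError.

```python
# The question binds the name to an empty string, not to a SparkContext.
sparkContext = ""

try:
    sparkContext._jvm  # a str has no _jvm attribute, so this lookup raises
except AttributeError as exc:
    message = str(exc)

print(message)
```

In a notebook where spark is already defined, spark.sparkContext refers to the same context as sc, so either name works in place of the empty string.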

P.S. It's easier to use the built-in Kafka connector with Event Hubs: you don't need to install anything, and it's more performant.
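
A rough sketch of the Kafka route, assuming the Event Hubs namespace exposes its Kafka endpoint on port 9093 and accepts the connection string as the SASL PLAIN password. The helper name eventhubs_kafka_options is hypothetical, and the kafkashaded. prefix on the login module is an assumption that applies to Databricks runtimes; on open-source Spark the class would likely be org.apache.kafka.common.security.plain.PlainLoginModule.

```python
from urllib.parse import urlparse


def eventhubs_kafka_options(connection_string):
    """Hypothetical helper: map an Event Hubs connection string to Kafka source options."""
    # Split "Key=Value;Key=Value" pairs; only the first "=" separates key from value,
    # because the shared access key itself may end in "=".
    parts = dict(item.split("=", 1) for item in connection_string.split(";") if item)
    host = urlparse(parts["Endpoint"]).hostname
    # Assumption: Databricks ships a shaded Kafka client, hence the "kafkashaded." prefix.
    jaas = (
        "kafkashaded.org.apache.kafka.common.security.plain.PlainLoginModule "
        f'required username="$ConnectionString" password="{connection_string}";'
    )
    return {
        "kafka.bootstrap.servers": f"{host}:9093",
        "kafka.security.protocol": "SASL_SSL",
        "kafka.sasl.mechanism": "PLAIN",
        "kafka.sasl.jaas.config": jaas,
        "subscribe": parts["EntityPath"],  # the event hub name acts as the Kafka topic
    }


connectionString = 'Endpoint=sb://namespace.servicebus.windows.net/;SharedAccessKeyName=both4;SharedAccessKey=adfdMyKeyIGBKYBs=;EntityPath=hubv5'
kafka_options = eventhubs_kafka_options(connectionString)

# In a Databricks notebook, where `spark` is predefined:
# streaming_df = spark.readStream.format("kafka").options(**kafka_options).load()
```

With this approach no call through sc._jvm is needed, since the connection string is passed as an ordinary option rather than being encrypted by EventHubsUtils.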
