When attempting to readStream data from Azure Event Hubs with Databricks on Apache Spark, I get the error
AttributeError: 'str' object has no attribute '_jvm'
The details of the error are as follows:
----> 8 ehConf['eventhubs.connectionString'] = sparkContext._jvm.org.apache.spark.eventhubs.EventHubsUtils.encrypt(connectionString)
The code is as follows:
sparkContext = ""
connectionString = 'Endpoint=sb://namespace.servicebus.windows.net/;SharedAccessKeyName=both4;SharedAccessKey=adfdMyKeyIGBKYBs=;EntityPath=hubv5'
# Source with default settings
connectionString = connectionString
ehConf = {}
ehConf['eventhubs.connectionString'] = sparkContext._jvm.org.apache.spark.eventhubs.EventHubsUtils.encrypt(connectionString)
streaming_df = spark \
    .readStream \
    .format("eventhubs") \
    .options(**ehConf) \
    .load()
Has anyone come across this error and found a solution?
Answer:
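It shouldn't be sparkContext, but just sc. In your code sparkContext is assigned an empty string, and a str has no _jvm attribute, which is exactly what the error says. In a Databricks notebook, sc is the predefined SparkContext: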
ehConf['eventhubs.connectionString'] = sc._jvm.org.apache.spark.eventhubs.EventHubsUtils.encrypt(connectionString)
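If you want the mistake to fail with a clearer message, the line above can be wrapped in a small helper. This is only a sketch: build_eventhubs_conf is a hypothetical name, and it assumes the azure-event-hubs-spark library is attached to the cluster so that EventHubsUtils is reachable through the JVM gateway.

```python
def build_eventhubs_conf(spark_context, connection_string):
    """Return options for the "eventhubs" source with an encrypted connection string.

    Hypothetical helper: spark_context must be a real SparkContext
    (sc in a notebook, or spark.sparkContext), not a placeholder string.
    """
    if isinstance(spark_context, str):
        # This is the situation from the question: sparkContext = ""
        raise TypeError(
            "expected a SparkContext (e.g. sc or spark.sparkContext), got a str"
        )
    # Assumes the Event Hubs connector is installed on the cluster.
    encrypt = spark_context._jvm.org.apache.spark.eventhubs.EventHubsUtils.encrypt
    return {"eventhubs.connectionString": encrypt(connection_string)}
```

In a notebook you would then call build_eventhubs_conf(sc, connectionString), or pass spark.sparkContext if you prefer not to rely on the sc variable, and hand the result to .options(**ehConf) as before.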
P.S. It's easier to use the built-in Kafka connector with Event Hubs: you don't need to install anything, and it's more performant.
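A rough sketch of the Kafka route, in case it helps. The helper names build_kafka_options and read_with_kafka are made up for illustration. The sketch assumes the namespace's Kafka endpoint is enabled on port 9093 and that you are on Databricks, where the Kafka client classes are shaded under the kafkashaded prefix (pass shaded=False elsewhere). Check the option values against your own setup.

```python
def build_kafka_options(connection_string, shaded=True):
    """Build Spark Kafka-source options from an Event Hubs connection string."""
    # Parse "Key=Value;Key=Value" pairs. Split on the first "=" only,
    # because the SharedAccessKey value itself may end with "=".
    parts = dict(p.split("=", 1) for p in connection_string.split(";") if p)
    host = parts["Endpoint"].replace("sb://", "").rstrip("/")

    module = "org.apache.kafka.common.security.plain.PlainLoginModule"
    if shaded:
        # Assumption: Databricks runtime, which shades the Kafka client.
        module = "kafkashaded." + module
    jaas = (
        f'{module} required username="$ConnectionString" '
        f'password="{connection_string}";'
    )
    return {
        "kafka.bootstrap.servers": f"{host}:9093",
        "kafka.security.protocol": "SASL_SSL",
        "kafka.sasl.mechanism": "PLAIN",
        "kafka.sasl.jaas.config": jaas,
        # The event hub name (EntityPath) plays the role of the Kafka topic.
        "subscribe": parts["EntityPath"],
    }


def read_with_kafka(spark, connection_string):
    """Return a streaming DataFrame; spark is the notebook's SparkSession."""
    return (
        spark.readStream
        .format("kafka")
        .options(**build_kafka_options(connection_string))
        .load()
    )
```

With the connection string from the question, streaming_df = read_with_kafka(spark, connectionString) would replace the eventhubs-based read, and no sparkContext or encrypt call is needed at all.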