Change spark configuration at runtime in Databricks


Is it possible to change spark configuration properties at runtime?

I'm using Databricks, and my goal is to read a Cassandra table from a cluster used for production and, after some operations, write the results to another Cassandra table in a different cluster used for development.

Currently I connect to my Cassandra cluster via Spark configuration properties using:

spark.conf.set("spark.cassandra.connection.host", "cluster")
spark.conf.set("spark.cassandra.auth.username", "username")
spark.conf.set("spark.cassandra.auth.password", "password")

but if I try to change these properties at runtime, I cannot perform the write operations.

CodePudding user response:

You can also specify options on the specific read/write operations, like this:

df = spark.read \
  .format("org.apache.spark.sql.cassandra") \
  .options(**{
    "table": "words",
    "keyspace": "test",
    "spark.cassandra.connection.host": "host",
    # ... other connector options, such as auth credentials
  }) \
  .load()
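
The same per-operation options work on the write side, which is what lets you read from one cluster and write to another within the same job. Here is a minimal sketch of that, where "dev-host", "dev_keyspace", and "results" are placeholder names standing in for your development cluster and target table:

# Write the results to a different Cassandra cluster by overriding
# the connection options on the write itself, instead of relying on
# the session-level spark.conf settings.
df.write \
  .format("org.apache.spark.sql.cassandra") \
  .mode("append") \
  .options(**{
    "table": "results",
    "keyspace": "dev_keyspace",
    "spark.cassandra.connection.host": "dev-host",
    "spark.cassandra.auth.username": "username",
    "spark.cassandra.auth.password": "password",
  }) \
  .save()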

See the Spark Cassandra Connector documentation for more examples.
