I tried the following:
conf = (
SparkConf().set("spark.driver.maxResultSize", "0"),
SparkConf().set("spark.sql.autoBroadcastJoinThreshold", "-1")
)
sc = SparkContext(conf=conf)
However, I got the following error:
AttributeError: 'tuple' object has no attribute 'get'
This works:
conf = (
SparkConf().set("spark.driver.maxResultSize", "0")
)
sc = SparkContext(conf=conf)
CodePudding user response:
OK, I think this worked:
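from pyspark import SparkConf, SparkContext

# setAll takes a list of (key, value) pairs and applies them to one SparkConf
conf = SparkConf().setAll([
    ("spark.driver.maxResultSize", "0"),
    ("spark.sql.autoBroadcastJoinThreshold", "-1"),
])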
sc = SparkContext(conf=conf)
CodePudding user response:
from pyspark import SparkConf, SparkContext

conf = SparkConf()
conf = conf.set("spark.driver.maxResultSize", "0")
conf = conf.set("spark.sql.autoBroadcastJoinThreshold", "-1")
sc = SparkContext(conf=conf)
or
conf = SparkConf().set("spark.driver.maxResultSize", "0").set("spark.sql.autoBroadcastJoinThreshold", "-1")
sc = SparkContext(conf=conf)
Both ways work and are equivalent.
Also, set() can be called without reassigning the conf variable, because it modifies the SparkConf object in place and returns that same object:
conf = SparkConf()
conf.set("spark.driver.maxResultSize", "0")
conf.set("spark.sql.autoBroadcastJoinThreshold", "-1")
sc = SparkContext(conf=conf)
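As for why the original attempt failed: a comma between two expressions inside parentheses builds a tuple, so conf there was a tuple of two separate SparkConf objects rather than one SparkConf. The error message suggests SparkContext then tried to call get on that tuple. Here is a rough sketch of the difference that runs without Spark. Note that FakeConf is a hypothetical stand-in I made up for illustration, not part of pyspark; it only mimics the fact that set() returns the same object.

```python
# FakeConf is a hypothetical stand-in for pyspark.SparkConf, used so this
# snippet runs without Spark installed. Like SparkConf.set, its set()
# stores the value and returns the same object, which allows chaining.
class FakeConf:
    def __init__(self):
        self._settings = {}

    def set(self, key, value):
        self._settings[key] = value
        return self

    def get(self, key, default=None):
        return self._settings.get(key, default)


# A comma inside the parentheses builds a tuple of two separate objects.
as_tuple = (
    FakeConf().set("spark.driver.maxResultSize", "0"),
    FakeConf().set("spark.sql.autoBroadcastJoinThreshold", "-1"),
)

# Without a comma, the parentheses only group a single expression.
single = (
    FakeConf().set("spark.driver.maxResultSize", "0")
)

# Chaining keeps both settings on one object.
chained = (
    FakeConf()
    .set("spark.driver.maxResultSize", "0")
    .set("spark.sql.autoBroadcastJoinThreshold", "-1")
)

print(type(as_tuple).__name__)   # tuple
print(hasattr(as_tuple, "get"))  # False
print(type(single).__name__)     # FakeConf
print(chained.get("spark.sql.autoBroadcastJoinThreshold"))  # -1
```

That is also why the single-setting version in the question works: with only one expression inside the parentheses there is no comma, so no tuple is created.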