With mode('overwrite') set during a saveAsTable() operation:
df1.write.format('parquet').mode('overwrite').saveAsTable('spark_no_bucket_table1')
why does saving the table still fail with the following error?
pyspark.sql.utils.AnalysisException: Can not create the managed
table('`spark_no_bucket_table1`').
The associated location('file:experiments/spark-warehouse/spark_no_bucket_table1')
already exists.
CodePudding user response:
From Spark's 2.4.0 migration guide:
Since Spark 2.4, creating a managed table with nonempty location is not allowed. An exception is thrown when attempting to create a managed table with nonempty location. Setting spark.sql.legacy.allowCreatingManagedTableUsingNonemptyLocation to true restores the previous behavior. This option will be removed in Spark 3.0.
So if you are using a Spark version >= 2.4.0 and < 3.0.0, you can work around it by setting:
spark.conf.set("spark.sql.legacy.allowCreatingManagedTableUsingNonemptyLocation","true")
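For example, a minimal sketch (assuming an existing SparkSession named spark and the df1 DataFrame from the question) that applies the legacy setting before retrying the write:
# Sketch for Spark >= 2.4.0 and < 3.0.0: enable the legacy behaviour,
# then retry the saveAsTable() call that previously failed.
spark.conf.set("spark.sql.legacy.allowCreatingManagedTableUsingNonemptyLocation", "true")
df1.write.format('parquet').mode('overwrite').saveAsTable('spark_no_bucket_table1')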
For Spark version >= 3.0.0 the legacy option has been removed, so you will have to manually clean up the data directory specified in the error message.
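As a sketch, assuming the warehouse directory from the error message lives on the local filesystem (the 'file:' scheme in the path), you could remove the leftover directory before retrying the write; adjust the path to your actual warehouse location:
import shutil

# Hypothetical path taken from the error message; adjust to your setup.
table_dir = "experiments/spark-warehouse/spark_no_bucket_table1"
shutil.rmtree(table_dir, ignore_errors=True)

# Retry the write once the stale directory is gone.
df1.write.format('parquet').mode('overwrite').saveAsTable('spark_no_bucket_table1')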