How to handle an exception that occurs while using dataframe.saveAsTable


In Spark, let's say a DataFrame df has 100 records and df.write.saveAsTable("sometablename") has saved 50 of them. If some error occurs while saving the remaining 50 records, will Spark revoke the 50 records already saved?

In SQL Server we have COMMIT and ROLLBACK TRAN. Do we have such a thing in Spark? Please help.

CodePudding user response:

Spark table writes are not atomic and there is no transaction to roll back, although its output committers do try to minimize partial results.

I would suggest implementing a strategy similar to what Spark and Hadoop themselves do: write to a temporary location first, then "commit" with a rename.

Create a temp table: df.write.saveAsTable("sometablename_temp")

Once all the data is written, change the name.

spark.sql("ALTER TABLE sometablename_temp RENAME TO sometablename;")

This de-risks the operation: if a failure happens while you are writing the temp table, the real table is untouched. The "commit" (the rename) happens at the driver level, which is the correct level to handle a failure from an executor.
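Putting the pieces together, a minimal PySpark sketch of this pattern might look like the following. The table names, the overwrite mode, and the DROP TABLE before the rename are illustrative assumptions (ALTER TABLE ... RENAME TO fails if the target table already exists; note the drop-then-rename step itself is not atomic):

from pyspark.sql import SparkSession

spark = SparkSession.builder.enableHiveSupport().getOrCreate()
df = spark.range(100)  # stand-in for your 100-record DataFrame

temp_table = "sometablename_temp"
final_table = "sometablename"

try:
    # Write everything to the temp table first; a failure here
    # leaves the real table untouched.
    df.write.mode("overwrite").saveAsTable(temp_table)

    # "Commit" at the driver: swap the temp table in via rename.
    # (Dropping the old table first is an assumption of this sketch.)
    spark.sql(f"DROP TABLE IF EXISTS {final_table}")
    spark.sql(f"ALTER TABLE {temp_table} RENAME TO {final_table}")
except Exception:
    # Manual "rollback": discard the partially written temp table.
    spark.sql(f"DROP TABLE IF EXISTS {temp_table}")
    raise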

CodePudding user response:

I think it is better to create your table as a partitioned table; in case of failure, delete the affected partition and re-run the load, as in the sketch below.
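A minimal sketch of that idea, assuming a Hive-style partitioned table with an illustrative load_date partition column (the column name and partition value are assumptions, not from the original answer):

from pyspark.sql import SparkSession
from pyspark.sql.functions import lit

spark = SparkSession.builder.enableHiveSupport().getOrCreate()
df = spark.range(100)  # stand-in for the batch being loaded

batch_date = "2021-12-07"  # illustrative partition value

try:
    # Each batch lands in its own partition.
    (df.withColumn("load_date", lit(batch_date))
       .write.mode("append")
       .partitionBy("load_date")
       .saveAsTable("sometablename"))
except Exception:
    # On failure, drop only the partially written partition;
    # partitions from earlier batches are unaffected.
    spark.sql(
        f"ALTER TABLE sometablename DROP IF EXISTS PARTITION (load_date='{batch_date}')"
    )
    raise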
