Is a checkpoint required for the Delta Lake merge operation in a streaming job?

Time：04-21

My understanding is that for a Spark streaming merge it is helpful to specify a checkpoint location, so that data isn't processed twice when the job restarts (even if the operation is idempotent, and even though a checkpoint isn't mentioned in the example notebook). Is that correct?

CodePudding user response：

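Yes. If you don't specify a checkpoint location, the query keeps no durable record of what it has already processed, so it cannot resume where it left off. Each time the job restarts it starts over from the source's starting position, which typically means all the data is reprocessed.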
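For illustration, here is a minimal sketch of a streaming merge that sets a checkpoint location. It is not taken from the example notebook. The paths, the `id` join key, and the function names are hypothetical. It assumes the `delta-spark` package and PySpark 3.3 or later (for `DataFrame.sparkSession`).

```python
# Hypothetical paths, used only for illustration.
TARGET_PATH = "/tmp/delta/target"
CHECKPOINT_PATH = "/tmp/checkpoints/target_merge"


def upsert_batch(micro_batch_df, batch_id):
    """Merge one micro-batch into the target Delta table on a hypothetical `id` key."""
    # Imported here so the module can be loaded without delta-spark installed.
    from delta.tables import DeltaTable

    target = DeltaTable.forPath(micro_batch_df.sparkSession, TARGET_PATH)
    (
        target.alias("t")
        .merge(micro_batch_df.alias("s"), "t.id = s.id")
        .whenMatchedUpdateAll()
        .whenNotMatchedInsertAll()
        .execute()
    )


def start_merge_stream(source_df, checkpoint_path=CHECKPOINT_PATH):
    """Start the merge stream; progress is recorded under checkpoint_path."""
    return (
        source_df.writeStream
        .foreachBatch(upsert_batch)
        # Without this option the query cannot resume after a restart.
        .option("checkpointLocation", checkpoint_path)
        .outputMode("update")
        .start()
    )
```

Here `source_df` would be a streaming DataFrame, for example one returned by `spark.readStream.format("delta").load(...)`. If you restart the job with the same checkpoint path, the query resumes from the recorded progress instead of starting over.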

Page link：https//www.codepudding.com/Softwaredesign/379738.html

Tags：

apache-spark

databricks

upsert

delta-lake