Is a checkpoint required for the Delta Lake merge operation in a streaming job?

Time:04-21

My understanding is that for a Spark streaming merge it's helpful to specify a checkpoint location, so that data isn't processed twice when the job restarts (even if the operation is idempotent and a checkpoint isn't mentioned in the example notebook). Is that correct?

CodePudding user response:

Yes. If you don't specify a checkpoint location, the query has no saved progress to resume from, so on each restart all the data will be reprocessed.
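
For illustration, here is a minimal sketch of a streaming merge that sets a checkpoint location. It assumes PySpark 3.3+ with the `delta-spark` package installed; the target path, the key column `id`, and the function names are hypothetical placeholders, not taken from the question or from any particular notebook.

```python
# Hypothetical path of the Delta table being merged into.
TARGET_PATH = "/tmp/delta/target"


def upsert_to_delta(micro_batch_df, batch_id):
    """Merge one micro-batch into the target table, keyed on the assumed column `id`."""
    # Imported here so the module still loads where delta-spark is not installed.
    from delta.tables import DeltaTable

    # DataFrame.sparkSession is available from Spark 3.3 onwards.
    target = DeltaTable.forPath(micro_batch_df.sparkSession, TARGET_PATH)
    (
        target.alias("t")
        .merge(micro_batch_df.alias("s"), "t.id = s.id")
        .whenMatchedUpdateAll()
        .whenNotMatchedInsertAll()
        .execute()
    )


def start_merge_stream(source_df, checkpoint_path):
    """Start the streaming query; progress is recorded under checkpoint_path."""
    return (
        source_df.writeStream
        .foreachBatch(upsert_to_delta)
        # With this option a restarted query resumes from the saved offsets.
        .option("checkpointLocation", checkpoint_path)
        .start()
    )
```

You would call it as `start_merge_stream(spark.readStream.format("delta").load(source_path), "/tmp/checkpoints/merge_job")`, where both paths are again placeholders. Since `foreachBatch` gives at-least-once guarantees, a batch can still be replayed after a failure; keying the merge on `id` is what makes such a replay harmless, while the checkpoint is what prevents starting over from the beginning.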
