In Spark difference between repartition(1) and coalesce(1)

Time:09-17

In our project we use repartition(1) to write data into a table. I would like to know why coalesce(1) cannot be used here, since repartition is a costly operation compared to coalesce. I know repartition distributes data evenly across partitions, but when the output is a single part file, why can't we use coalesce(1)? Please help me understand whether any other factors are involved.

CodePudding user response:

You have not said anything else about your logic.
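One factor that usually matters is what happens upstream of the write. coalesce(1) avoids a shuffle, but because it is a narrow transformation, the work before it in the same stage may also end up running as a single task. repartition(1) adds a shuffle step, so the upstream partitions are still processed in parallel and only the final write is single-task. Below is a minimal sketch of that trade-off, assuming `df` is a PySpark DataFrame; the helper name `to_single_partition` and its flag are hypothetical and not part of Spark.

```python
def to_single_partition(df, keep_upstream_parallelism=True):
    """Return `df` reduced to one partition, ready for a single-file write.

    `df` is expected to be a pyspark.sql.DataFrame. The helper name and the
    flag are hypothetical illustrations, not Spark API.
    """
    if keep_upstream_parallelism:
        # repartition(1) inserts a shuffle: stages before it keep their
        # original number of tasks, and only the write runs as one task.
        return df.repartition(1)
    # coalesce(1) skips the shuffle, but the upstream computation in the
    # same stage may then be executed by a single task as well.
    return df.coalesce(1)


# Usage sketch (assumes an existing DataFrame `df`; the table name is made up):
# to_single_partition(df).write.mode("overwrite").saveAsTable("my_db.my_table")
```

If the DataFrame is cheap to compute, or is already materialized in few partitions, coalesce(1) is likely fine. If heavy joins or transformations sit right before the write, repartition(1) is often the safer choice despite the shuffle cost.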
