What's the difference between repartition() and spark.sql.shuffle.partitions?

What happens when we repartition data to a higher number of partitions than the spark.sql.shuffle.partitions property specifies? Are the two related?

CodePudding user response：

It depends on which variant of Dataset.repartition you call.

If you call repartition(partitionExprs: Column*): Dataset[T], the number of partitions is taken from the spark.sql.shuffle.partitions setting.

If you call repartition(numPartitions: Int): Dataset[T], the number of partitions is the numPartitions value you pass, regardless of spark.sql.shuffle.partitions, so it can be higher or lower than that setting.
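
The two variants above can be compared with a small local sketch. Everything here is illustrative: the object name RepartitionDemo, the helper partitionCounts, the column name id, and the values 8 and 20 are assumptions chosen for the example. Adaptive query execution is switched off on purpose, because with it enabled Spark may coalesce shuffle partitions and the column-based variant can end up with fewer partitions than the setting.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

// Hypothetical demo object; names and numbers are chosen for illustration only.
object RepartitionDemo {

  // Returns (partitions after repartition(col), partitions after repartition(n)).
  def partitionCounts(): (Int, Int) = {
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("repartition-demo")
      .config("spark.sql.shuffle.partitions", "8")   // assumed value for the demo
      .config("spark.sql.adaptive.enabled", "false") // keep AQE from coalescing partitions
      .getOrCreate()

    try {
      val df = spark.range(0, 1000).toDF("id")

      // Column-based variant: no explicit count, so the shuffle setting applies.
      val byColumn = df.repartition(col("id"))

      // Count-based variant: the explicit number wins over the shuffle setting.
      val byNumber = df.repartition(20)

      (byColumn.rdd.getNumPartitions, byNumber.rdd.getNumPartitions)
    } finally {
      spark.stop()
    }
  }

  def main(args: Array[String]): Unit = {
    val (fromSetting, fromArgument) = partitionCounts()
    println(s"repartition(col):  $fromSetting")  // expected to match spark.sql.shuffle.partitions (8 here)
    println(s"repartition(20):   $fromArgument") // expected to be 20
  }
}
```

So repartitioning to a number higher than spark.sql.shuffle.partitions simply produces that higher number of partitions. The setting only acts as the default when no explicit count is given.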