I ran a Spark job on 6GB of data and it finished in about a minute. I am aware that Spark uses distributed programming to perform the task. I ran this on my MacBook, which has an 8-core CPU. My question is: do these 8 cores perform the reading simultaneously and act like an 8-node machine, or how does it work?
CodePudding user response:
If you start your local Spark session with `.master('local[*]')`, you always get a single executor with 8 vCPUs; in other words, `spark.default.parallelism` = 8. The 8 cores act as 8 parallel task slots inside one JVM on a single node, not as 8 separate machines.
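A minimal sketch (assuming PySpark is installed, with a hypothetical app name) showing how to start such a local session and verify the parallelism yourself:

```python
from pyspark.sql import SparkSession

# local[*] tells Spark to use all available cores as task slots
# inside a single local JVM (driver and executor combined).
spark = (
    SparkSession.builder
    .master("local[*]")
    .appName("local-parallelism-check")  # hypothetical app name
    .getOrCreate()
)

# On an 8-core machine this prints 8: up to 8 tasks run concurrently,
# one per core, all within the same node.
print(spark.sparkContext.defaultParallelism)

spark.stop()
```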