Build sparkSession with master('yarn') takes forever, how about master('local')?

Time:01-12

I have to continue a data processing project, and the previous developer used master('yarn') to build the Spark session. When I ran it today, it took forever. I searched for a solution, and the advice was to change 'yarn' to 'local'. That worked. But will it change anything? I have read about the difference but still don't understand it. Can anyone explain in simple terms what the difference is and whether it will affect my project?

Thank you

CodePudding user response:

If you set the master to local, Spark runs on a single machine with one worker thread, so you get no parallelism at all. (local[N] or local[*] uses multiple threads, but still on one machine.) A local master is appropriate for development or testing, but it is not a proper way to run your Spark job in production.

If you set the master to yarn, the Spark job runs on the YARN cluster, and the parallelism you get depends on your configuration parameters, such as the number of executors and the cores per executor.
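To make the difference concrete, here is a minimal PySpark sketch that picks the master URL from an environment name instead of hard-coding it. The helper names (choose_master, build_session, main), the APP_ENV variable, and the app name are hypothetical and not from the original project. It assumes PySpark is installed and, for yarn, that the machine is configured to reach the cluster.

```python
import os


def choose_master(environment: str, local_threads: str = "*") -> str:
    """Return a Spark master URL for a (hypothetical) environment name."""
    if environment == "production":
        # Run on the YARN cluster; parallelism comes from executor settings.
        return "yarn"
    # Run on this machine only; "*" means one thread per available core.
    return f"local[{local_threads}]"


def build_session(app_name: str, master: str):
    # Imported here so choose_master can be used without PySpark installed.
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder
        .appName(app_name)
        .master(master)
        .getOrCreate()
    )


def main() -> None:
    # APP_ENV is an assumed variable name; default to a local session.
    env = os.environ.get("APP_ENV", "development")
    spark = build_session("data-processing", choose_master(env))
    # Show which master is in use and how many tasks run in parallel by default.
    print(spark.sparkContext.master)
    print(spark.sparkContext.defaultParallelism)
    spark.stop()
```

Calling main() with a local master should start quickly because nothing has to be requested from a cluster. The results of the job are the same either way. What changes is where the work runs and how much data it can handle in a reasonable time.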

If you need more information, see the official Spark documentation on master URLs.
