Scala code execution on master of Spark cluster?


The Spark application makes some API calls that do not use the SparkSession. I believe that when a piece of code does not use Spark, it is executed on the master node.

Why do I want to know this? I am getting a Java heap space error while trying to POST some files using those API calls, and I believe that upgrading the master node and increasing the driver memory would solve it.

I want to understand how this type of application is executed on a Spark cluster. Is my understanding right, or am I missing something?
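(For reference: driver heap is normally set when the application is launched, not from application code, since the driver JVM is already running by then. A minimal sketch of how it is typically passed; the class name and memory size below are illustrative placeholders, not recommendations:)

```
# Illustrative values; the class name and memory size are placeholders.
spark-submit \
  --driver-memory 8g \
  --class com.example.Main \
  your-application.jar
```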

CodePudding user response:

It depends. Closures and functions passed to transformations (such as the built-in transform function), code inside UDFs you create, and code in foreachBatch (and a few other places) run on the workers. All other code runs on the driver.
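To illustrate the split, here is a minimal, hypothetical sketch; the postFile helper, app name, and file paths are made up stand-ins for the question's HTTP POST calls:

```scala
import org.apache.spark.sql.SparkSession

object WhereCodeRuns {
  // Hypothetical stand-in for the question's HTTP POST calls: plain JVM
  // code with no Spark dependency.
  def postFile(path: String): Unit = {
    // e.g. open a connection and POST the file; omitted here
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("where-code-runs").getOrCreate()
    import spark.implicits._

    // Runs on the DRIVER: ordinary code outside any Spark closure.
    // Its heap usage counts against spark.driver.memory.
    postFile("/tmp/report.json")

    val paths = Seq("a.csv", "b.csv", "c.csv").toDS()

    // The closure passed to map is serialized and runs on the EXECUTORS:
    val uploaded = paths.map { p =>
      postFile(p) // executed on whichever executor processes this partition
      p
    }

    // collect() is an action: it triggers the distributed work and brings
    // the results back to the driver.
    uploaded.collect().foreach(println)

    spark.stop()
  }
}
```

Note that where the driver lives depends on the deploy mode: in client mode it is the machine you submit from, while in cluster mode it is a process on one of the cluster nodes. So upgrading the master only helps if the driver actually runs there; increasing driver memory is the more direct fix for heap errors in non-distributed code like this.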
