Where does pyspark-shell in PYSPARK_SUBMIT_ARGS come from?


While reading a Jupyter notebook that runs a PySpark job, I came across a line saying

os.environ['PYSPARK_SUBMIT_ARGS'] = f'--name "test submit" --master yarn --deploy-mode client pyspark-shell'
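
Here is a stripped-down version of that pattern as I understand it (with local[*] in place of yarn so it runs without a cluster). As far as I can tell, the variable has to be set before the first SparkSession/SparkContext is created, because PySpark reads it when it launches the JVM:

import os
from pyspark.sql import SparkSession

# Must be set before the first SparkSession/SparkContext is created,
# because PySpark reads it when it starts the JVM via spark-submit.
# local[*] is used here instead of yarn so the snippet runs without a cluster.
os.environ["PYSPARK_SUBMIT_ARGS"] = '--name "test submit" --master local[*] pyspark-shell'

spark = SparkSession.builder.getOrCreate()
print(spark.sparkContext.appName)  # should reflect the --name passed above
spark.stop()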

I mostly understand this line, but not the last argument, pyspark-shell. So I googled PYSPARK_SUBMIT_ARGS to read the full spec for this environment variable. The problem is that I couldn't find any documentation about it. All the results I found say to use it, but not why or what it actually does. I couldn't find it in the official documentation either.

My guess is that it tells Spark to use PySpark (Python) rather than SparkR (R) to process my job, but I want to read exactly how it works and what it does. So where can I read about it?

CodePudding user response:

Have a look at the source of SparkSubmit (SparkSubmit.scala in the Spark repository) and search for the constant PYSPARK_SHELL. It is basically used to select which Java class spark-submit launches to run your code.
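
To make that concrete, here is a rough Python sketch of that selection logic (Spark's actual code is Scala, and the class names below are from memory, so treat them as illustrative): the trailing "primary resource" argument of PYSPARK_SUBMIT_ARGS is compared against the PYSPARK_SHELL constant, and for the interactive Python case spark-submit starts the Py4J gateway server that your Python process then connects to.

# Illustrative sketch only -- Spark's real logic lives in SparkSubmit.scala (Scala).
PYSPARK_SHELL = "pyspark-shell"   # the literal your notebook appends
SPARKR_SHELL = "sparkr-shell"

def pick_jvm_main_class(primary_resource: str) -> str:
    """Mimic how spark-submit chooses the JVM entry point from the
    trailing 'primary resource' argument of PYSPARK_SUBMIT_ARGS."""
    if primary_resource == PYSPARK_SHELL:
        # Interactive PySpark: start the Py4J gateway that the Python
        # driver process (your notebook kernel) connects to.
        return "org.apache.spark.api.python.PythonGatewayServer"
    if primary_resource == SPARKR_SHELL:
        # Interactive SparkR session.
        return "org.apache.spark.api.r.RBackend"
    # Otherwise it's a .jar / .py application, and spark-submit runs the
    # user-supplied main class (or the Python runner) instead.
    return "<application main class>"

print(pick_jvm_main_class("pyspark-shell"))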
