Environment Variable Error when running Python/PySpark script

Time:10-26

Is there an easy way to fix this error:

Missing Python executable 'python3', defaulting to 'C:\Users\user1\Anaconda3\Lib\site-packages\pyspark\bin\..' for SPARK_HOME environment variable. Please install Python or specify the correct Python executable in PYSPARK_DRIVER_PYTHON or PYSPARK_PYTHON environment variable to detect SPARK_HOME safely.

Would I have to modify the PATH system variable? Or export/create the environment variables PYSPARK_DRIVER_PYTHON and PYSPARK_PYTHON? I have Python 3.8.8.

CodePudding user response:

You need to add an environment variable called SPARK_HOME; it should contain the path to the installed pyspark library.

In my case, pyspark is installed under my home directory, so the variable looks like this:

SPARK_HOME=/home/zied/.local/lib/python3.8/site-packages/pyspark
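If you are not sure where pyspark is installed, you can ask Python itself. A minimal sketch using only the standard library (it simply reports the package location, or a message if pyspark is absent):

```python
import importlib.util

# Ask Python where the pyspark package is installed; the package
# directory itself is a suitable value for SPARK_HOME.
spec = importlib.util.find_spec("pyspark")
if spec is not None and spec.submodule_search_locations:
    print(list(spec.submodule_search_locations)[0])
else:
    print("pyspark is not installed in this environment")
```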

You also need another variable called PYSPARK_PYTHON, which holds the Python version you are using, like this:

PYSPARK_PYTHON=python3.8
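On Windows you would normally set these through System Properties or `setx`, but you can also set them in-process before importing pyspark. A hedged sketch (the SPARK_HOME path below is only illustrative, based on the asker's Anaconda install; substitute your own):

```python
import os
import sys

# SPARK_HOME should point at the pyspark package directory; this path
# is an example only -- substitute the location from your own install.
os.environ["SPARK_HOME"] = r"C:\Users\user1\Anaconda3\Lib\site-packages\pyspark"

# Reuse the interpreter running this script so the driver and workers
# agree on the same Python (here, the asker's Python 3.8.8).
os.environ["PYSPARK_PYTHON"] = sys.executable
os.environ["PYSPARK_DRIVER_PYTHON"] = sys.executable
```

Setting `PYSPARK_PYTHON` to `sys.executable` avoids hard-coding a version string and sidesteps the "Missing Python executable 'python3'" lookup entirely.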