I am trying to run PySpark in Jupyter (via Anaconda) on Windows. I am facing the error below while trying to create a SparkSession.
Exception: Java gateway process exited before sending its port number
I even tried adding the JAVA_HOME, SPARK_HOME, and HADOOP_HOME paths to my environment variables (a notebook-level alternative is sketched after this list):
- JAVA_HOME: C:\Java\jdk-11.0.16.1
- SPARK_HOME: C:\Spark\spark-3.1.3-bin-hadoop3.2
- HADOOP_HOME: C:\Spark\spark-3.1.3-bin-hadoop3.2
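(The same variables can also be set from inside the notebook before the session is created; a minimal sketch using the paths listed above:)

```python
import os

# Set the same paths from inside the notebook, before the session is
# built, so the Java gateway can find them (paths mirror the list above;
# adjust to your install).
os.environ["JAVA_HOME"] = r"C:\Java\jdk-11.0.16.1"
os.environ["SPARK_HOME"] = r"C:\Spark\spark-3.1.3-bin-hadoop3.2"
os.environ["HADOOP_HOME"] = r"C:\Spark\spark-3.1.3-bin-hadoop3.2"
```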
Even after this, I am still facing the same issue.
PS: My PySpark version is 3.3.1 and my Python version is 3.8.6.
CodePudding user response:
As per the Spark documentation, the master string should be "local[*]" to use all available cores, or "local[N]" to use only N cores. If you leave the master setting out, it defaults to "local[*]".
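For reference, a minimal sketch of pinning the master explicitly when building the session (the app name and the sanity check at the end are arbitrary):

```python
from pyspark.sql import SparkSession

# Explicitly set the master: "local[*]" uses all cores,
# "local[4]" would cap the session at 4 cores.
spark = (
    SparkSession.builder
    .master("local[*]")
    .appName("example")
    .getOrCreate()
)

spark.range(5).show()  # quick sanity check that the session is alive
```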
CodePudding user response:
After several attempts, I finally figured out the issue: Windows Firewall had blocked Java, which caused this error. Once I granted it access, the error was resolved!