I am trying to run a simple command, spark = SparkSession.builder.appName("Basics").getOrCreate(),
on my M1 Mac (Monterey 12.6.2), but it throws an error:
The operation couldn’t be completed. Unable to locate a Java Runtime.
Please visit http://www.java.com for information on installing Java.
/Users/user/miniforge3/envs/bigdata/lib/python3.9/site-packages/pyspark/bin/spark-class: line 96: CMD: bad array subscript
head: illegal line count -- -1
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
Cell In[2], line 2
1 # May take a little while on a local computer
----> 2 spark = SparkSession.builder.appName("Basics").getOrCreate()
File ~/miniforge3/envs/bigdata/lib/python3.9/site-packages/pyspark/sql/session.py:269, in SparkSession.Builder.getOrCreate(self)
267 sparkConf.set(key, value)
268 # This SparkContext may be an existing one.
--> 269 sc = SparkContext.getOrCreate(sparkConf)
270 # Do not update `SparkConf` for existing `SparkContext`, as it's shared
271 # by all sessions.
272 session = SparkSession(sc, options=self._options)
File ~/miniforge3/envs/bigdata/lib/python3.9/site-packages/pyspark/context.py:483, in SparkContext.getOrCreate(cls, conf)
481 with SparkContext._lock:
482 if SparkContext._active_spark_context is None:
--> 483 SparkContext(conf=conf or SparkConf())
484 assert SparkContext._active_spark_context is not None
485 return SparkContext._active_spark_context
File ~/miniforge3/envs/bigdata/lib/python3.9/site-packages/pyspark/context.py:195, in SparkContext.__init__(self, master, appName, sparkHome, pyFiles, environment, batchSize, serializer, conf, gateway, jsc, profiler_cls, udf_profiler_cls)
189 if gateway is not None and gateway.gateway_parameters.auth_token is None:
190 raise ValueError(
191 "You are trying to pass an insecure Py4j gateway to Spark. This"
...
--> 106 raise RuntimeError("Java gateway process exited before sending its port number")
108 with open(conn_info_file, "rb") as info:
109 gateway_port = read_int(info)
RuntimeError: Java gateway process exited before sending its port number
After a lot of searching, I decided to follow the solution in "RuntimeError: Java gateway process exited before sending its port number". To do that, I tried to open my zshrc by typing ~/.zshrc so that I could add this line:
export JAVA_HOME="/path/to/java_home/"
However, typing ~/.zshrc gives me this error: zsh: permission denied: /Users/user/.zshrc
I have tried the solutions here, but none of them work: https://www.stellarinfo.com/blog/fixed-zsh-permission-denied-in-mac-terminal/
I have also given Terminal Full Disk Access.
So I have two problems right now:
- Java gateway process exited before sending its port number.
- zsh permission denied.
Would anyone please help?
CodePudding user response:
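Step 1
Open your terminal.
Step 2
Open .zshrc in an editor. Typing ~/.zshrc on its own makes zsh try to execute the file, which is why you see "permission denied".
cd ~
vim .zshrc
Step 3
Press i to enter insert mode, and use the arrow keys to navigate. Insert your command:
export JAVA_HOME="/path/to/java_home/"
Then press Esc and type :wq to save and quit. Open a new terminal (or run source ~/.zshrc) so the change takes effect.
Try the above first. If it throws an error, you may need to remove the trailing slash at the end of the path.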
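If the notebook kernel was not started from a shell that read ~/.zshrc, the export may not reach PySpark. One alternative is to set JAVA_HOME inside the Python process before creating the session, since PySpark starts the JVM as a child process that inherits the environment. The sketch below is only an illustration: use_java_home is a hypothetical helper name, and the path you pass is still your own placeholder from the step above. It also strips the trailing slash mentioned above.

```python
import os


def use_java_home(java_home, env=None):
    """Set JAVA_HOME in `env` (default: os.environ) if it holds bin/java.

    Hypothetical helper for illustration; returns the launcher path.
    """
    env = os.environ if env is None else env
    # Drop a trailing slash so the value matches the usual JAVA_HOME form.
    java_home = java_home.rstrip("/") or "/"
    launcher = os.path.join(java_home, "bin", "java")
    # Refuse to set a JAVA_HOME that cannot possibly work.
    if not (os.path.isfile(launcher) and os.access(launcher, os.X_OK)):
        raise FileNotFoundError(f"no executable java launcher at {launcher}")
    env["JAVA_HOME"] = java_home
    return launcher
```

You would call use_java_home("/path/to/java_home/") in a cell before SparkSession.builder.appName("Basics").getOrCreate(). If it raises FileNotFoundError, the path does not point at a Java installation.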
From your error output, it looks like you are running this in a conda virtual environment. If the error persists, try removing the current environment with conda env remove and creating it again. Then remember to run conda install openjdk first, and conda install pyspark after that. Hope this helps.
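After reinstalling, it may help to confirm from Python that a working Java is actually reachable before starting Spark. Note that on macOS a java command can exist on PATH even when no JDK is installed (that placeholder is what prints "Unable to locate a Java Runtime"), so the sketch below runs java -version instead of only checking that the file exists. find_java is a hypothetical helper name, and the lookup order (JAVA_HOME first, then PATH) is my understanding of what Spark's launcher script does, not something taken from the question.

```python
import os
import shutil
import subprocess


def find_java(env=None):
    """Return the first `java` launcher that runs successfully, else None.

    Hypothetical helper for illustration. Checks $JAVA_HOME/bin/java
    first, then whatever `java` is found on PATH.
    """
    env = os.environ if env is None else env
    candidates = []
    java_home = env.get("JAVA_HOME")
    if java_home:
        candidates.append(os.path.join(java_home, "bin", "java"))
    on_path = shutil.which("java", path=env.get("PATH"))
    if on_path:
        candidates.append(on_path)
    for candidate in candidates:
        try:
            # A placeholder `java` exits with a non-zero status here.
            result = subprocess.run(
                [candidate, "-version"],
                capture_output=True,
                text=True,
                timeout=30,
            )
        except (OSError, subprocess.TimeoutExpired):
            continue
        if result.returncode == 0:
            return candidate
    return None
```

If find_java() returns None in the notebook, the Java gateway error is expected, and the JAVA_HOME or openjdk steps above still need attention.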