I used this simple python code with focus on spark.sql:
from pyspark.sql import SparkSession
spark = SparkSession.builder.appName('jist-test').getOrCreate()
myDF = spark.read.parquet("v3io://projects/fs-demo-example/FeatureStore/vct_all_other_basic")
myDF.show(3)
and I got this exception:
Py4JJavaError Traceback (most recent call last)
<ipython-input-28-fd92fbf397a3> in <module>
1 spark = SparkSession.builder.appName('jist-test').getOrCreate()
----> 2 myDF = spark.read.parquet("v3io://projects/fs-demo-example/FeatureStore/vct_all_other_basic")
/spark/python/pyspark/sql/readwriter.py in parquet(self, *paths, **options)
299 int96RebaseMode=int96RebaseMode)
300
--> 301 return self._df(self._jreader.parquet(_to_seq(self._spark._sc, paths)))
302
303 def text(self, paths, wholetext=False, lineSep=None, pathGlobFilter=None,
/spark/python/lib/py4j-0.10.9.3-src.zip/py4j/java_gateway.py in __call__(self, *args)
1320 answer = self.gateway_client.send_command(command)
1321 return_value = get_return_value(
-> 1322 answer, self.gateway_client, self.target_id, self.name)
1323
1324 for temp_arg in temp_args:
/spark/python/pyspark/sql/utils.py in deco(*a, **kw)
109 def deco(*a, **kw):
110 try:
--> 111 return f(*a, **kw)
112 except py4j.protocol.Py4JJavaError as e:
113 converted = convert_exception(e.java_exception)
/spark/python/lib/py4j-0.10.9.3-src.zip/py4j/protocol.py in get_return_value(answer, gateway_client, target_id, name)
326 raise Py4JJavaError(
327 "An error occurred while calling {0}{1}{2}.\n".
--> 328 format(target_id, ".", name), value)
329 else:
330 raise Py4JError(
Py4JJavaError: An error occurred while calling o91.parquet.
: java.lang.IllegalStateException: Cannot call methods on a stopped SparkContext.
This stopped SparkContext was created at:
org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java
...
It happened from time to time in pyspark version 3.2.1. Do you know, how to avoid this mistake?
CodePudding user response:
I caught it. It is possible to minimize occurrence or remove this exception based on regular closing or killing spark session.
See updated part of the code:
from pyspark.sql import SparkSession
with SparkSession.builder.appName('jist-test').getOrCreate() as spark:
myDF = spark.read.parquet("v3io://projects/fs-demo-example/FeatureStore/vct_all_other_basic")
myDF.show(3)
or
from pyspark.sql import SparkSession
try:
spark = SparkSession.builder.appName('jist-test').getOrCreate()
myDF = spark.read.parquet("v3io://projects/fs-demo-example/FeatureStore/vct_all_other_basic")
myDF.show(3)
finally:
spark.close()