I need to test my Spark project using pytest and I don't understand how to create a SparkSession for the tests. I did some research and came up with:
import pytest
import unittest
from pyspark.sql import SparkSession

@pytest.fixture(scope="session")
def spark_session():
    spark = SparkSession.builder.master("local[*]").appName("test").getOrCreate()
    return spark

class Testdb2connection(unittest.TestCase):
    @pytest.mark.usefixtures("spark_session")
    def test_connectdb2(self):
        with self.assertRaises(ValueError):
            return spark_session.format('jdbc')\
                ...
However, when running the test, I get:
AttributeError: 'function' object has no attribute 'format'
What am I doing wrong?
CodePudding user response:
In your test, spark_session refers to the fixture function itself, not to the SparkSession it returns, which is why you get AttributeError: 'function' object has no attribute 'format'.
Looking at Mixing pytest fixtures into unittest.TestCase subclasses using marks, you can define the spark_session fixture with class scope and attach the Spark session to the cls attribute of the request context, which makes it available as an attribute (self.spark) on any test class that uses the fixture.
Try the following modified code:
import pytest
import unittest
from pyspark.sql import SparkSession

@pytest.fixture(scope='class')
def spark_session(request):
    spark = SparkSession.builder.master("local[*]").appName("test").getOrCreate()
    request.addfinalizer(lambda: spark.stop())  # tear down the session after the tests
    request.cls.spark = spark

@pytest.mark.usefixtures("spark_session")
class Testdb2connection(unittest.TestCase):
    def test_connectdb2(self):
        assert isinstance(self.spark, SparkSession)
Running the test:
pytest ./mytest.py
. [100%]
1 passed in 4.12s