Pytest: create SparkSession

Time:12-27

I need to test my Spark project with pytest, and I don't understand how to create a SparkSession. I did some research and came up with:

import pytest
import unittest
from pyspark.sql import SparkSession

@pytest.fixture(scope="session")
def spark_session():
    spark = SparkSession.builder.master("local[*]").appName("test").getOrCreate()
    return spark

class Testdb2connection(unittest.TestCase):
    @pytest.mark.usefixtures("spark_session")
    def test_connectdb2(self):
       with self.assertRaises(ValueError):
          return spark_session.format('jdbc')\
          ...

However, when running the test, I get:

AttributeError: 'function' object has no attribute 'format'

What am I doing wrong?

CodePudding user response:

As described in the pytest docs under "Mixing pytest fixtures into unittest.TestCase subclasses using marks", you can define spark_session with class scope and attach the session to the cls attribute of the request context. Any test class marked with that fixture can then access the session as self.spark. (In your code, spark_session inside the test refers to the fixture function itself, not the SparkSession it returns, hence the AttributeError.)

Try with the following modified code:

import pytest
import unittest
from pyspark.sql import SparkSession

@pytest.fixture(scope='class')
def spark_session(request):
    spark = SparkSession.builder.master("local[*]").appName("test").getOrCreate()

    request.addfinalizer(spark.stop)  # tear down the session after the last test in the class

    request.cls.spark = spark  # expose the session as self.spark in the test class


@pytest.mark.usefixtures("spark_session")
class Testdb2connection(unittest.TestCase):

    def test_connectdb2(self):
        assert isinstance(self.spark, SparkSession)

Running the test:

pytest ./mytest.py

.                                        [100%]                                                          
1 passed in 4.12s
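To see why attaching the session to request.cls works, here is a simplified, pytest-free sketch of the mechanism. FakeRequest and FakeSession are stand-ins for illustration only, not pytest internals:

```python
class FakeRequest:
    """Stand-in for pytest's request object: holds the test class and
    a stack of teardown callbacks (run in LIFO order, as pytest does)."""
    def __init__(self, cls):
        self.cls = cls
        self._finalizers = []

    def addfinalizer(self, fn):
        self._finalizers.append(fn)

    def teardown(self):
        while self._finalizers:
            self._finalizers.pop()()  # last registered, first run


class FakeSession:
    """Stand-in for a SparkSession with a stop() method."""
    def __init__(self):
        self.stopped = False

    def stop(self):
        self.stopped = True


class TestDemo:
    pass


def spark_session(request):
    # Mirrors the class-scoped fixture above.
    session = FakeSession()
    request.addfinalizer(session.stop)  # teardown registered for after the tests
    request.cls.spark = session         # becomes self.spark inside the class


request = FakeRequest(TestDemo)
spark_session(request)
assert TestDemo.spark.stopped is False  # session is live while tests run
request.teardown()
assert TestDemo.spark.stopped is True   # finalizer stopped the session
```

Because the fixture writes the session onto the class object itself, every test method in the class sees it as self.spark, and the finalizer guarantees spark.stop() runs once the class-scoped fixture is torn down.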