I want to write pytest unit tests in Kedro 0.17.5. They need to perform integrity checks on dataframes created by the pipeline. These dataframes are specified in catalog.yml and have already been persisted successfully using kedro run. The catalog.yml is in conf/base. I have a test module, test_my_dataframe.py, in src/tests/pipelines/my_pipeline/.
How can I load the data catalog based on my catalog.yml programmatically from within test_my_dataframe.py in order to properly access my specified dataframes?
Or, for that matter, how can I programmatically load the whole project context (including the data catalog) in order to also execute nodes etc.?
CodePudding user response:
In a unit test, we test only the function under test, and everything external to that function should be mocked or patched. Check whether you really need the Kedro project context when writing the unit test.
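As a minimal sketch of that approach, the test below exercises a node function in isolation and replaces its external dependency with a mock. The names enrich_orders and fetch_rate are hypothetical and not part of the question's project, and rows are plain dicts so the example runs without pandas or Kedro.

```python
from unittest.mock import Mock


def enrich_orders(orders, fetch_rate):
    """Hypothetical node: convert each order amount using an external rate lookup."""
    rate = fetch_rate("EUR")  # external call that a unit test should not really make
    return [{**order, "amount_eur": order["amount"] * rate} for order in orders]


def test_enrich_orders():
    # Replace the external dependency with a mock that returns a fixed rate.
    fetch_rate = Mock(return_value=0.5)
    orders = [{"id": 1, "amount": 10.0}, {"id": 2, "amount": 4.0}]

    result = enrich_orders(orders, fetch_rate)

    # Integrity checks on the node output only; no catalog or session involved.
    assert [row["amount_eur"] for row in result] == [5.0, 2.0]
    assert all("id" in row for row in result)
    fetch_rate.assert_called_once_with("EUR")
```

Because the node receives its inputs as arguments, the test needs neither catalog.yml nor a KedroSession.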
If you really do need the project context in a test, you can do something like the following:
from pathlib import Path

from kedro.framework.session import KedroSession

# Path.cwd() assumes pytest is launched from the project root.
with KedroSession.create(package_name="demo", project_path=Path.cwd()) as session:
    context = session.load_context()
    catalog = context.catalog
Alternatively, you can create a pytest fixture, with a scope of your choice, so that the context can be reused across tests:
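from pathlib import Path

import pytest
from kedro.framework.session import KedroSession
from kedro.framework.session.session import _activate_session  # private Kedro helper


@pytest.fixture
def get_project_context():
    session = KedroSession.create(
        package_name="demo",
        project_path=Path.cwd(),
    )
    _activate_session(session, force=True)
    context = session.load_context()
    return context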
The arguments supported by KedroSession.create are documented here: https://kedro.readthedocs.io/en/0.17.5/kedro.framework.session.session.KedroSession.html#kedro.framework.session.session.KedroSession.create
To read more about pytest fixtures and their scopes, see https://docs.pytest.org/en/6.2.x/fixture.html#scope-sharing-fixtures-across-classes-modules-packages-or-session
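Once a catalog is available, the integrity checks from the question could be organised roughly as below. This is only a sketch: collect_failures, is_not_empty and StubCatalog are hypothetical names, and the stub exists purely so the example runs without a Kedro project. The only Kedro behaviour assumed is that the catalog exposes load(name), as DataCatalog does.

```python
from typing import Any, Callable, Dict, List


def collect_failures(
    catalog: Any, checks: Dict[str, List[Callable[[Any], bool]]]
) -> List[str]:
    """Load each named dataset and return a label for every check that fails."""
    failures = []
    for dataset_name, predicates in checks.items():
        data = catalog.load(dataset_name)  # same call shape as DataCatalog.load
        for predicate in predicates:
            if not predicate(data):
                failures.append(f"{dataset_name}: {predicate.__name__}")
    return failures


def is_not_empty(data: Any) -> bool:
    """Works for anything with a length, including a pandas DataFrame."""
    return len(data) > 0


class StubCatalog:
    """Stand-in catalog (assumption) so the sketch runs without Kedro."""

    def __init__(self, datasets: Dict[str, Any]) -> None:
        self._datasets = datasets

    def load(self, name: str) -> Any:
        return self._datasets[name]


stub = StubCatalog({"orders": [{"id": 1}], "customers": []})
failures = collect_failures(
    stub, {"orders": [is_not_empty], "customers": [is_not_empty]}
)
print(failures)  # ['customers: is_not_empty']
```

In a real test, context.catalog from the fixture above would be passed in place of StubCatalog, the dataset names would be the ones defined in catalog.yml, and the test would assert that the returned list is empty.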