What can I do to test Airflow's performance?

Time:12-19

I have an Airflow deployment with 150 DAGs.

I don't know much about testing. I want to run performance tests, load tests, and so on, but I don't know how to proceed because my searches didn't turn up anything useful.

What tests can I run to make sure Airflow is stable?

Please help me. Thank you :)

CodePudding user response:

The most important tests you can add are:

  • checking that the DagBag import raises no exceptions
  • checking that the DagBag import duration stays below 60-80% of dagbag_import_timeout, to avoid timeouts in production
  • checking that the time needed to process each Python file stays below dag_file_processor_timeout

A pytest example covering all three:
import pytest
from datetime import datetime, timedelta

from airflow.models import DagBag


def test_dags():
    # Time the full DagBag load.
    start_date = datetime.now()
    dagbag = DagBag()
    end_date = datetime.now()

    # Or `== nb_expected_dags` if you have a fixed number of DAGs.
    assert dagbag.size() > 0
    assert len(dagbag.import_errors) == 0

    # Change the limit based on your configuration.
    assert (end_date - start_date) < timedelta(seconds=20)

    # Each entry records a file path and its parse duration
    # (a timedelta in Airflow 2.x, so compare it with a timedelta).
    stats = dagbag.dagbag_stats
    slow_files = filter(lambda d: d.duration > <put some duration>, stats)
    res = ', '.join(map(lambda d: d.file[1:], slow_files))
    assert len(res) == 0, f"Slow DAG files: {res}"
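
Instead of hard-coding the limits, you could read them from the Airflow configuration. The following is only a sketch, assuming Airflow 2.x, where both dagbag_import_timeout and dag_file_processor_timeout sit under the [core] section (other versions may place them elsewhere). The helper names as_seconds, slow_files and check_dagbag_timeouts, and the safety_fraction parameter, are made up for illustration and are not part of Airflow.

```python
import time
from datetime import timedelta
from typing import Iterable, List, Union

Duration = Union[timedelta, float]


def as_seconds(duration: Duration) -> float:
    """Normalize a duration that may be a timedelta or a plain number of seconds."""
    if isinstance(duration, timedelta):
        return duration.total_seconds()
    return float(duration)


def slow_files(stats: Iterable, limit_seconds: float) -> List[str]:
    """Return the files whose parse duration exceeds limit_seconds."""
    return [s.file for s in stats if as_seconds(s.duration) > limit_seconds]


def check_dagbag_timeouts(safety_fraction: float = 0.7) -> None:
    """Load the DagBag and compare its timings with the configured timeouts."""
    # Imported lazily so the helpers above work without Airflow installed.
    from airflow.configuration import conf
    from airflow.models import DagBag

    # Assumption: both options live under [core], as in Airflow 2.x.
    import_timeout = conf.getfloat("core", "dagbag_import_timeout")
    processor_timeout = conf.getfloat("core", "dag_file_processor_timeout")

    started = time.monotonic()
    dagbag = DagBag()
    elapsed = time.monotonic() - started

    assert not dagbag.import_errors, f"Import errors: {dagbag.import_errors}"
    assert elapsed < safety_fraction * import_timeout, (
        f"DagBag import took {elapsed:.1f}s"
    )

    too_slow = slow_files(dagbag.dagbag_stats, processor_timeout)
    assert not too_slow, f"Files slower than dag_file_processor_timeout: {too_slow}"
```

To run it with pytest, call check_dagbag_timeouts() from a test function. The default safety_fraction of 0.7 falls in the 60-80% range suggested above; adjust it to your own margin.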