How to get maximum/minimum duration of all DagRun instances in Airflow?

Time:06-30

Is there a way to find the maximum, minimum, or even average duration of all DagRun instances in Airflow? That is, all DAG runs from all DAGs, not just a single DAG.

I can't find a way to do this in the UI, or even a page with a programmatic or command-line example.

CodePudding user response:

You can use the Airflow REST API to get all dag_runs for each DAG and calculate the statistics yourself.

An example that fetches the dag_runs of each DAG and computes each run's duration:

import datetime
import requests
from requests.auth import HTTPBasicAuth

airflow_server = "http://localhost:8080/api/v1/"
auth = HTTPBasicAuth("airflow", "airflow")

get_dags_url = f"{airflow_server}dags"
get_dag_params = {
    "limit": 100,
    "only_active": "true"
}

response = requests.get(get_dags_url, params=get_dag_params, auth=auth)
dags = response.json()["dags"]

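# "limit" caps each response at 100 entries; page with "offset" if you have more
get_dag_run_params = {
    "limit": 100,
    "state": "success",
}
durations = []  # durations of successful runs across all DAGs
for dag in dags:
    dag_id = dag["dag_id"]
    # airflow_server already ends with "/", so no leading slash here
    dag_run_url = f"{airflow_server}dags/{dag_id}/dagRuns"
    response = requests.get(dag_run_url, params=get_dag_run_params, auth=auth)
    dag_runs = response.json()["dag_runs"]
    for dag_run in dag_runs:
        start_date = datetime.datetime.fromisoformat(dag_run["start_date"])
        end_date = datetime.datetime.fromisoformat(dag_run["end_date"])
        durations.append(end_date - start_date)

if durations:
    print("max:", max(durations))
    print("min:", min(durations))
    print("avg:", sum(durations, datetime.timedelta()) / len(durations))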
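
To turn the per-run durations into the maximum, minimum and average asked about in the question, the aggregation can be separated from the HTTP calls. The following is only a sketch: `parse_ts`, `duration_stats` and `sample_runs` are hypothetical names introduced here, and the sample data is made up to mirror the `start_date` / `end_date` fields used above. It assumes those fields are ISO 8601 strings and that runs still in progress have no `end_date`.

```python
import datetime


def parse_ts(value):
    # Assumption: timestamps are ISO 8601 strings. Older Python versions
    # reject a trailing "Z" in fromisoformat, so normalise it first.
    return datetime.datetime.fromisoformat(value.replace("Z", "+00:00"))


def duration_stats(dag_runs):
    """Return (min, max, average) as timedeltas, or None if nothing is usable."""
    durations = [
        parse_ts(run["end_date"]) - parse_ts(run["start_date"])
        for run in dag_runs
        # Skip runs that have not started or finished yet.
        if run.get("start_date") and run.get("end_date")
    ]
    if not durations:
        return None
    total = sum(durations, datetime.timedelta())
    return min(durations), max(durations), total / len(durations)


# Made-up sample shaped like entries of the "dag_runs" list.
sample_runs = [
    {"start_date": "2023-06-30T10:00:00+00:00", "end_date": "2023-06-30T10:05:00+00:00"},
    {"start_date": "2023-06-30T11:00:00+00:00", "end_date": "2023-06-30T11:15:00+00:00"},
    {"start_date": "2023-06-30T12:00:00+00:00", "end_date": None},  # still running
]

stats = duration_stats(sample_runs)
if stats is not None:
    shortest, longest, average = stats
    print("min:", shortest, "max:", longest, "avg:", average)
```

In the loop above you would extend one list with the `dag_runs` of every DAG and call `duration_stats` once at the end, so the result covers all DAGs rather than one.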