Airflow: Metrics - Introducing a counter for number of DAGs in "running" state

Created on 14 Nov 2020 · 5 Comments · Source: apache/airflow

Description

Could we introduce a new counter in Airflow Metrics to track number of DAGs that are in 'running' state?

I am aware of the existing dag_processing.processes counter, which is documented as "Number of currently running DAG parsing processes". This counter tracks how many DAGs are being parsed, not how many are running.

I am also aware of the existing executor.queued_tasks and executor.running_tasks metrics, but they are per task, not per DAG.

Use case / motivation

Consider this example:
[Screenshot 2020-11-14 at 16 42 40]

I want to know how many DAGs are in the running state, both to understand memory consumption issues and to understand how long my "queue" of DAGs is. To my understanding, there is currently no metric that I can use to track that.

I propose a counter executor.running_dags that would return the number of DAGs in the running state.

Does this make sense? Am I missing something? If folks agree, I would be happy to work on a PR :)

feature

All 5 comments

Thanks for opening your first issue here! Be sure to follow the issue template!

This information is in the database so you can use:
https://github.com/robinhood/airflow-prometheus-exporter or https://github.com/PBWebMedia/airflow-prometheus-exporter or other similar exporter

We use statsd metrics only for information that we do not have in the database, i.e. runtime metrics.
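For anyone who just wants the number without setting up an exporter, here is a minimal sketch of the kind of query this comes down to. It assumes the metadata database has a dag_run table with dag_id and state columns; the in-memory SQLite table and the sample rows below are hypothetical stand-ins so the snippet runs on its own, and count_running is an illustrative helper, not an Airflow API.

```python
import sqlite3

# Hypothetical stand-in for the Airflow metadata database: only the
# dag_run columns this query needs are modeled here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dag_run (dag_id TEXT, run_id TEXT, state TEXT)")
conn.executemany(
    "INSERT INTO dag_run VALUES (?, ?, ?)",
    [
        ("etl_daily", "run_1", "running"),
        ("etl_daily", "run_2", "running"),
        ("reports", "run_1", "running"),
        ("cleanup", "run_1", "success"),
    ],
)


def count_running(conn):
    """Return (running DAG runs, distinct DAGs with at least one running run)."""
    row = conn.execute(
        "SELECT COUNT(*), COUNT(DISTINCT dag_id) "
        "FROM dag_run WHERE state = 'running'"
    ).fetchone()
    return row[0], row[1]


running_runs, running_dags = count_running(conn)
print(running_runs, running_dags)
```

Against a real deployment you would point the connection at the metadata database instead of the in-memory one. Note that a DAG can have several runs in the running state at once, so the two numbers can differ.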

@mik-laj thanks for the response! I will look into that :)
Do you think we should add two lines to this doc page to mention that statsd only tracks non-database information and that you can find more metrics in the database? https://airflow.apache.org/docs/stable/metrics.html

@SolbiatiAlessandro Yes. Please do it. ;-)

Additionally: we can't (easily) report this as a metric now that running more than one scheduler is supported.
