Airflow: New dag run / task state: disposed

Created on 9 Nov 2020 · 4 Comments · Source: apache/airflow

Description

Introduce a new dag run / task state called "disposed". This new state would represent the acknowledgment of a failed run/task that should not be retried. Update the UI and CLI to provide a mechanism for the disposal action.

Use case / motivation

In a former gig we had a homegrown job management system. One nice feature for operations was the ability to "dispose" of a failed job. Disposal indicated that we had recognized the failure, investigated it, and decided that the job should not be retried and that the failure could be ignored going forward. This removed the failure from the daily operations report we used to investigate failures. I find myself yearning for this feature lately. In Airflow I have to either re-run the job or mark it as successful for it to leave my daily operations failure report. This could be implemented simply as a new dag/task state and action. We also had a "dispose reason" for tracking purposes: just a notes field for why the operator performed the action. Since we dealt with financial transaction feeds, we needed this. The auditability of a dispose state + notes field would be quite useful.
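To make the idea concrete, here is a minimal in-memory sketch in plain Python of a "disposed" state with a required reason. It does not use any Airflow API: `TaskState`, `TaskRecord`, `dispose`, and `failure_report` are hypothetical names invented for this illustration, and a real implementation would presumably live in Airflow's models, UI, and CLI.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class TaskState(Enum):
    """Hypothetical states for illustration; not Airflow's real state enum."""
    SUCCESS = "success"
    FAILED = "failed"
    DISPOSED = "disposed"  # acknowledged failure that will not be retried


@dataclass
class TaskRecord:
    task_id: str
    state: TaskState
    dispose_reason: Optional[str] = None  # free-text note kept for auditing


def dispose(record: TaskRecord, reason: str) -> None:
    """Mark a failed task as disposed; a non-empty reason is required."""
    if record.state is not TaskState.FAILED:
        raise ValueError("only failed tasks can be disposed")
    if not reason.strip():
        raise ValueError("a dispose reason is required")
    record.state = TaskState.DISPOSED
    record.dispose_reason = reason


def failure_report(records: List[TaskRecord]) -> List[str]:
    """Return ids of tasks that are still failed (disposed ones drop out)."""
    return [r.task_id for r in records if r.state is TaskState.FAILED]
```

With this shape, disposing a task removes it from the failure report without pretending it succeeded, and the reason stays attached to the record for later review.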

Related Issues

None that I see.

Tagging @ryw

feature

All 4 comments

Thanks for opening your first issue here! Be sure to follow the issue template!

As I mentioned in the Slack thread, I think this is a good step towards improved auditability for Airflow.

Maybe bundle this together with a few other audit improvements for a 2.X release.

Love the feature (I've been thinking about this on and off for 2 years, just never enough to open an issue). The only thing I'm not sure of is the name.

"Acknowledged Failure" -- or perhaps to make the task dep checking easier, the state could be left as failure, but somewhere else we record "yes, this failure has been investigated, it's 'okay'".

Anyway, :100: to the feature idea.
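The alternative mentioned above (leave the state as failed and record the acknowledgment somewhere else) might look roughly like the sketch below. This is plain Python, not Airflow code: `AcknowledgmentLog` and the `(dag_id, task_id, run_id)` key are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Assumed key identifying one task instance: (dag_id, task_id, run_id).
TaskKey = Tuple[str, str, str]


@dataclass
class AcknowledgmentLog:
    """Hypothetical side table; the failed state itself is never modified."""
    notes: Dict[TaskKey, str] = field(default_factory=dict)

    def acknowledge(self, key: TaskKey, note: str) -> None:
        """Record that a failure was investigated, with a note explaining why."""
        self.notes[key] = note

    def unacknowledged(self, failed: List[TaskKey]) -> List[TaskKey]:
        """Filter a list of failed task keys down to those not yet acknowledged."""
        return [key for key in failed if key not in self.notes]
```

Because the state stays "failed", any logic that checks task dependencies would not need to learn about a new state; only reporting would consult the log.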

@ashb can I contribute the implementation of this feature?
