Add a new task status that would exist between Success and Failure, such as Warn, Needs Attention, or Degraded #54093
Unanswered
kdickinson87
asked this question in
Ideas
IMO this sounds like an edge case, so rather than introducing a new task state, I'd recommend sticking with existing features or simple workarounds. One approach is a custom callback: the task still finishes as successful, but it sends a notification (Slack, email, etc.) when it crosses a certain threshold. That keeps the DAG green while still flagging the issue to your team. For something cleaner, you can wrap the task in a custom decorator that checks the error rate (for example, >1%), logs a warning, and fires an alert without marking the task as failed. This keeps everything maintainable and avoids touching Airflow internals.
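A minimal sketch of the decorator idea, in plain Python so it runs without Airflow. Everything named here is an assumption for illustration: `warn_on_error_rate`, the `notify` hook, and the convention that the wrapped function returns a dict with `total` and `failed` counts are all hypothetical, not part of Airflow.

```python
import functools
import logging
from typing import Any, Callable, Dict

log = logging.getLogger(__name__)


def warn_on_error_rate(
    threshold: float = 0.01,
    notify: Callable[[str], Any] = log.warning,
) -> Callable:
    """Hypothetical decorator: alert, but do not fail, when the share of
    failed rows exceeds `threshold`.

    Assumes the wrapped function returns a dict with integer
    "total" and "failed" keys (an illustrative convention only).
    """

    def decorator(func: Callable[..., Dict[str, int]]) -> Callable[..., Dict[str, int]]:
        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Dict[str, int]:
            result = func(*args, **kwargs)
            total = result.get("total", 0)
            failed = result.get("failed", 0)
            # Guard against division by zero when nothing was processed.
            rate = failed / total if total else 0.0
            if rate > threshold:
                # Flag the problem; the task still returns normally.
                notify(
                    f"{func.__name__}: {failed}/{total} rows failed "
                    f"({rate:.2%}), above the {threshold:.2%} threshold"
                )
            return result

        return wrapper

    return decorator


# Example: collect alerts in a list instead of sending them anywhere.
alerts: list = []


@warn_on_error_rate(threshold=0.01, notify=alerts.append)
def load_vendor_records() -> Dict[str, int]:
    # Stand-in for a full-load task that tracks its own counts.
    return {"total": 1000, "failed": 25}
```

Calling `load_vendor_records()` returns its result as usual and appends one message to `alerts`, because 25 of 1000 rows is above the 1% threshold. In a DAG, the same wrapper could presumably sit beneath the TaskFlow `@task` decorator, with `notify` pointing at whatever Slack or email helper you already use; that combination is not tested here.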
We have several tasks across our Airflow environment that are full-load tasks: they process all records for a given API or dataset (some can't be made incremental because the vendor doesn't support it). Successful records are processed, and failed records are put into a failed-output/reprocess queue. Right now, when, for example, more than 99% of the rows succeed, we have to mark the run either as successful, even though there were failures, or as failed, even though more than 99% of the rows were processed successfully. Either way, we currently continue processing the successful rows, and the failures are investigated by our product team and get updated or picked up on the next run.
It would be super helpful if there were another task status between success and failure for tasks that aren't full failures but also aren't fully successful. Something like "Needs Attention" or "Warning", shown in orange.
This would help separate critical errors, which mean "the DAG should stop processing completely and return an error", from data errors, especially ones that affect only a small subset of the data. Data errors would allow downstream tasks to continue processing while giving notice that there are items (data or otherwise) that need to be investigated. It's similar to error severity levels.