
Fix DbtCloudRunJobTrigger timeout error message and add final status check #61980

Open
eran-moses-human wants to merge 1 commit into apache:main from
eran-moses-human:fix/dbt-cloud-trigger-timeout-message

@eran-moses-human

Summary

Fixes two bugs in DbtCloudRunJobTrigger.run():

  • Misleading error message: The timeout message printed self.end_time (an absolute epoch timestamp, e.g. 1771200015.8) labelled as "seconds", producing nonsensical output. Replaced with a clear "within the configured timeout" message.
  • Missing final status check: The timeout check fired without re-polling the job status. A job completing during asyncio.sleep() could be incorrectly reported as timed out. Now performs one final is_still_running() call before yielding a timeout error.
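
The reordered loop described above can be sketched roughly as follows. This is a minimal, standalone illustration and not the provider's actual code: `poll_until_done`, its parameters, and the string return values are hypothetical stand-ins for the trigger's `run()` method and the events it yields.

```python
import asyncio
import time
from typing import Awaitable, Callable


async def poll_until_done(
    is_still_running: Callable[[], Awaitable[bool]],  # hypothetical status poller
    end_time: float,  # absolute deadline as an epoch timestamp
    poll_interval: float,
) -> str:
    """Return "done" if the job reached a terminal state, else "timeout"."""
    while await is_still_running():
        # Sleep first, then evaluate the deadline.
        await asyncio.sleep(poll_interval)
        if end_time < time.time():
            # The deadline has passed. Poll once more before reporting a
            # timeout, because the job may have finished during the sleep.
            if await is_still_running():
                return "timeout"
            break
    return "done"
```

With a poller that reports "running" once and "finished" on the next call, the sketch returns "done" even when the deadline has already passed, which is the edge case the new test targets.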

Changes

  • providers/dbt/cloud/src/airflow/providers/dbt/cloud/triggers/dbt.py:
    • Moved asyncio.sleep() before the timeout check
    • Added a final is_still_running() call when the timeout fires
    • Fixed the error message to no longer print epoch timestamp as duration
  • providers/dbt/cloud/tests/unit/dbt/cloud/triggers/test_dbt.py:
    • Updated existing timeout test to match new error message
    • Added new test test_dbt_job_run_timeout_but_job_completes for the edge case where a job completes at the timeout boundary

Closes #61979

Made with Cursor

…check

The timeout error message in DbtCloudRunJobTrigger.run() printed
self.end_time (an absolute epoch timestamp) labelled as "seconds",
producing nonsensical output like "after 1771200015.8 seconds" instead
of a meaningful duration.

Additionally, the timeout check fired before sleeping, without a final
status poll. A job completing during asyncio.sleep() could be
incorrectly reported as timed out.

Changes:
- Move asyncio.sleep() before the timeout check so the trigger sleeps
  first, then evaluates the deadline.
- Add a final is_still_running() call when the timeout fires so that
  jobs completing at the boundary are handled correctly.
- Replace the misleading epoch-as-duration message with a clear
  "within the configured timeout" message.
- Update existing timeout test and add a new test for the edge case
  where a job completes at the timeout boundary.

Closes: apache#61979
Co-authored-by: Cursor <cursoragent@cursor.com>
@boring-cyborg

boring-cyborg bot commented Feb 16, 2026

Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything, please check our Contributors' Guide (https://github.com/apache/airflow/blob/main/contributing-docs/README.rst).
Here are some useful points:

  • Pay attention to the quality of your code (ruff, mypy and type annotations). Our prek-hooks will help you with that.
  • In case of a new feature add useful documentation (in docstrings or in docs/ directory). Adding a new operator? Check this short guide. Consider adding an example DAG that shows how users should use it.
  • Consider using Breeze environment for testing locally, it's a heavy docker but it ships with a working Airflow and a lot of integrations.
  • Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
  • Please follow ASF Code of Conduct for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
  • Be sure to read the Airflow Coding style.
  • Always keep your Pull Requests rebased, otherwise your build might fail due to changes not related to your commits.

Apache Airflow is a community-driven project and together we are making it better 🚀.
In case of doubts contact the developers at:
Mailing List: dev@airflow.apache.org
Slack: https://s.apache.org/airflow-slack

Contributor

@SameerMesiah97 SameerMesiah97 left a comment

Looks good. I would fix the CI failures (one looks unrelated, but the other is prek-related, which you can fix).

Just a heads up: you might get conflicts if PR #61472 merges first as it is touching the same lines of code.

)
return
# Job reached a terminal state — exit loop to handle below.
break

It looks like the timeout emission could be extended by poll_interval, meaning the actual timeout could occur up to one full polling cycle later than the configured value. For example, with timeout=60s and poll_interval=30s, the timeout may not be emitted until around 90s depending on scheduling.

I believe this behavior was already present prior to this PR, so it’s not a regression. That said, we might consider sleeping for min(poll_interval, remaining_time) to align the timeout more closely with the configured value. This should not block the PR, but I think it might be worth exploring.
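
One way the capped sleep suggested above might look is sketched below. The helper `next_sleep` and its parameters are hypothetical and not part of the provider; the sketch only shows the arithmetic of bounding each sleep by the time left until the deadline.

```python
def next_sleep(poll_interval: float, end_time: float, now: float) -> float:
    """Return how long to sleep: at most until the deadline, never negative."""
    remaining = end_time - now
    return max(0.0, min(poll_interval, remaining))
```

For instance, ignoring scheduling overhead, with a 60s timeout and a 45s poll interval the uncapped loop would check the deadline at roughly 45s and 90s, whereas the capped version sleeps 45s and then 15s, so the check lands at roughly 60s.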

async def test_dbt_job_run_timeout_but_job_completes(
    self, mock_get_job_status, mocked_is_still_running
):
    """Assert that a job completing at the timeout boundary is treated as success, not timeout."""

Nit: the docstring mentions "completing at the timeout boundary", but the test appears to simulate completion after the deadline has technically passed but before timeout emission. I would suggest clarifying it accordingly.

@josh-fell
Contributor

@eran-moses-human There are some merge conflicts now. Can you resolve and re-push when you get a chance?



Development

Successfully merging this pull request may close these issues.

DbtCloudRunJobTrigger: misleading timeout error message and missing final status check

3 participants