Fix: Persist import errors from GitDagBundle to import_error table #60100
This issue is now addressed in the linked PR. Requesting a review whenever convenient.
I believe it would be best to fix the issue reference in the description so that this PR can be automatically linked to it. I will link it here now: #60059
Thanks for pointing that out. I’ve updated the PR description to correctly reference the issue.
The issue: Runtime errors during DAG parsing in GitDagBundle were being caught but not persisted to the import_error table, causing DAGs with errors to silently disappear from the UI instead of appearing under Import Errors. This was inconsistent with LocalDagBundle behavior.
Root cause: When DAG serialization failed in _serialize_dags(), the error was stored using dag.fileloc (absolute path) instead of dag.relative_fileloc (relative path). However, DagBag stores parse-time errors with relative paths, and the update_dag_parsing_results_in_db() function expects all import errors to be keyed by (bundle_name, relative_path) tuples.
This path inconsistency caused serialization errors to have absolute paths that couldn't be properly matched to their bundle context, resulting in failed DB inserts and silent failures.
Changes:
1. Updated _serialize_dags() to use dag.relative_fileloc instead of dag.fileloc when storing serialization errors, ensuring consistency with parse-time errors
2. Added test_serialization_errors_use_relative_paths() to verify serialization errors use relative paths across bundle types
3. Added test_import_errors_persisted_with_relative_paths() to validate end-to-end error persistence for bundle-backed DAGs
This fix ensures that all DAG errors (parse-time and serialization-time) are consistently tracked and displayed in the UI, regardless of bundle type (Git, Local, S3, GCS, etc.).
Fixes: #<issue_number>
bc6a092 to 718a059
Yep, rebased it, but I guess the tests will still fail.
@Arunodoy18 I am going to close your PRs -- please review and test your changes, and write a correct PR description. Using LLMs without doing so increases the maintenance burden and CI run time. Feel free to recreate focused PRs following those guidelines.
Fixes #60059
The issue: Runtime errors during DAG parsing in GitDagBundle were being caught but not persisted to the import_error table, causing DAGs with errors to silently disappear from the UI instead of appearing under Import Errors. This was inconsistent with LocalDagBundle behavior.
Root cause: When DAG serialization failed in _serialize_dags(), the error was stored using dag.fileloc (absolute path) instead of dag.relative_fileloc (relative path). However, DagBag stores parse-time errors with relative paths, and the update_dag_parsing_results_in_db() function expects all import errors to be keyed by (bundle_name, relative_path) tuples.
This path inconsistency caused serialization errors to have absolute paths that couldn't be properly matched to their bundle context, resulting in failed DB inserts and silent failures.
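The mismatch described above can be illustrated with a small, self-contained sketch. It does not use Airflow itself: `error_key`, `to_relative`, the bundle name, and the bundle root path are all hypothetical stand-ins, and the only assumption taken from the description is that import errors are looked up by `(bundle_name, relative_path)`.

```python
from pathlib import PurePosixPath

# Hypothetical values for illustration only; not taken from Airflow.
BUNDLE_NAME = "my_git_bundle"
BUNDLE_ROOT = PurePosixPath("/tmp/airflow/bundles/my_git_bundle")


def error_key(bundle_name: str, path: str) -> tuple[str, str]:
    """Build the (bundle_name, path) key under which an import error is stored."""
    return (bundle_name, path)


def to_relative(bundle_root: PurePosixPath, fileloc: str) -> str:
    """Convert an absolute file location into a path relative to the bundle root."""
    return str(PurePosixPath(fileloc).relative_to(bundle_root))


absolute_fileloc = str(BUNDLE_ROOT / "dags" / "broken_dag.py")
relative_fileloc = to_relative(BUNDLE_ROOT, absolute_fileloc)

# Files the persistence step knows about, keyed by relative path.
known_files = {error_key(BUNDLE_NAME, "dags/broken_dag.py")}

# A parse-time error is keyed by the relative path.
parse_error_key = error_key(BUNDLE_NAME, relative_fileloc)
# A serialization error keyed by the absolute path (the described behavior before the change).
serialization_error_key_before = error_key(BUNDLE_NAME, absolute_fileloc)
# The same error keyed by the relative path (the described behavior after the change).
serialization_error_key_after = error_key(BUNDLE_NAME, relative_fileloc)

print(parse_error_key in known_files)                 # True
print(serialization_error_key_before in known_files)  # False
print(serialization_error_key_after in known_files)   # True
```

Under these assumptions, the absolute-path key never matches a known file in the bundle, which is consistent with the silent failure described above; keying by the relative path makes both kinds of error resolve to the same entry.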
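Changes:
1. Updated _serialize_dags() to use dag.relative_fileloc instead of dag.fileloc when storing serialization errors, ensuring consistency with parse-time errors
2. Added test_serialization_errors_use_relative_paths() to verify serialization errors use relative paths across bundle types
3. Added test_import_errors_persisted_with_relative_paths() to validate end-to-end error persistence for bundle-backed DAGs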
This fix ensures that all DAG errors (parse-time and serialization-time) are consistently tracked and displayed in the UI, regardless of bundle type (Git, Local, S3, GCS, etc.).
Closes #60059