🐛 Fix orphan projects left behind when project creation fails#9155
🐛 Fix orphan projects left behind when project creation fails#9155GitHK wants to merge 22 commits into
Conversation
There was a problem hiding this comment.
Pull request overview
This PR addresses a failure mode during project creation where director-v2 pipeline registration could fail silently, leaving persisted “orphaned” projects without corresponding computational pipeline entries (comp_tasks). The change makes pipeline registration failures surface properly and adds cleanup logic/tests to prevent incomplete projects from lingering.
Changes:
- Make
director_v2_service.create_or_update_pipelinere-raiseDirectorV2ServiceErrorinstead of returningNone, so callers can react to failures. - Add a catch-all cleanup path in project creation to delete the partially-created project on unexpected exceptions, while preserving intentional
web.HTTPExceptionbehavior. - Add unit tests verifying cleanup on pipeline failure/unexpected exceptions and ensuring HTTP exceptions do not trigger deletion; update non-critical node-edit flows to suppress director-v2 pipeline sync failures.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| services/web/server/tests/unit/with_dbs/02/test_projects_crud_handlers__create_project_failure_cleanup.py | New tests covering cleanup behavior for create-project failure scenarios. |
| services/web/server/src/simcore_service_webserver/studies_dispatcher/_studies_access.py | Wraps pipeline creation in a suppression block during dispatcher-driven template copy. |
| services/web/server/src/simcore_service_webserver/studies_dispatcher/_projects.py | Catches director-v2 pipeline errors during dispatcher-created projects and logs a warning. |
| services/web/server/src/simcore_service_webserver/projects/_projects_service.py | Suppresses director-v2 pipeline sync failures for add/delete/patch node operations (non-critical paths). |
| services/web/server/src/simcore_service_webserver/projects/_crud_api_create.py | Removes assert, raises HTTPBadRequest on product mismatch, and adds catch-all cleanup on unexpected exceptions. |
| services/web/server/src/simcore_service_webserver/director_v2/_director_v2_service.py | Changes pipeline creation to re-raise after logging (no longer returns None on failure). |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #9155 +/- ##
==========================================
- Coverage 87.34% 77.78% -9.56%
==========================================
Files 2065 793 -1272
Lines 81471 37246 -44225
Branches 1475 194 -1281
==========================================
- Hits 71162 28973 -42189
+ Misses 9900 8222 -1678
+ Partials 409 51 -358
Continue to review full report in Codecov by Sentry.
🚀 New features to boost your workflow:
|
…gate-comp-task-issues
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.
Comments suppressed due to low confidence (1)
services/web/server/src/simcore_service_webserver/projects/_crud_api_create.py:496
- The new
except web.HTTPException: raiseprevents rollback for any HTTPException thrown after the project has been inserted (e.g. any downstream validation error). This can reintroduce orphaned/half-created projects (project exists in DB but the create call returned an error). If the intent is to preserve only specific HTTP errors, consider narrowing this handler or performing conditional cleanup whennew_project['uuid']was created in this call.
except web.HTTPException:
# Intentional HTTP error responses (e.g. HTTPBadRequest) should not trigger cleanup
raise
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.
Comments suppressed due to low confidence (1)
services/web/server/tests/unit/with_dbs/02/test_projects_crud_handlers__create_project_failure_cleanup.py:249
- Same issue here:
submit_delete_project_taskis awaited in the code under test, so patching it withwraps=Nonecreates a non-awaitable mock and can change the observed failure mode (e.g. turning the expected HTTP 400 into a 500 due toTypeError). Use anAsyncMockfor this patch (and optionally set an appropriate return value).
delete_spy = mocker.patch(
"simcore_service_webserver.projects._crud_api_create._projects_service.submit_delete_project_task",
wraps=None,
)
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.
Comments suppressed due to low confidence (1)
services/web/server/src/simcore_service_webserver/projects/_crud_api_create.py:520
- In this generic exception handler, failures from
submit_delete_project_taskwill override the original exception being handled, which makes debugging harder and can change the error surfaced to clients. Consider making the cleanup best-effort (try/except + log) and then re-raising the original exception.
except Exception:
_logger.exception(
"Unexpected error during create_project for user '%s'. Cleaning up",
f"{user_id=}",
)
if project_uuid := new_project.get("uuid"):
await _projects_service.submit_delete_project_task(
app=app,
project_uuid=project_uuid,
user_id=user_id,
simcore_user_agent=simcore_user_agent,
product_name=product_name,
)
raise
|
There was a problem hiding this comment.
Inside this module I'd say that we want to suppress the error to have the same equivalent behaviour as before.
If there are reasons to let the error bubble up, let me know, otherwise I would not change it.



What do these changes do?
What:
create_or_update_pipelineorcopy_data_folders_from_projectcould leave half-created projects in the DB with no way for users to recover them.How:
create_project(_crud_api_create.py): Extracted_best_effort_cleanuphelper. All exception handlers now attempt project deletion (logged on failure, never masks the original error).rollback_project_on_errorasync context manager — onException, schedules deletion and raisesProjectCreationAbortedError.CancelledErrorpropagates without rollback.suppress(DirectorV2ServiceError)— pipeline creation is now a required step that triggers rollback on failure.Testing:
rollback_project_on_error(cancelled error, regular exception, cleanup failure).create_projectcleanup on pipeline failure, unexpected errors, and product-name mismatch.Related issue/s
How to test
Dev-ops