🐛 Fix orphan projects left behind when project creation fails #9155

Open
GitHK wants to merge 22 commits into ITISFoundation:master from GitHK:pr-osparc-investigate-comp-task-issues

GitHK commented May 22, 2026

What do these changes do?

What:

  • Projects created via the web server or studies dispatcher are now reliably cleaned up when any post-insertion step fails (pipeline creation, file copy, product mismatch, cancellation, etc.)
  • Previously, failures in create_or_update_pipeline or copy_data_folders_from_project could leave half-created projects in the DB with no way for users to recover them.

How:

  • create_project (_crud_api_create.py): Extracted a _best_effort_cleanup helper. All exception handlers now attempt to delete the project; a cleanup failure is logged and never masks the original error.
  • Studies dispatcher (_projects.py): Added a rollback_project_on_error async context manager. On Exception it schedules deletion and raises ProjectCreationAbortedError; CancelledError propagates without rollback.
  • Removed suppress(DirectorV2ServiceError) — pipeline creation is now a required step that triggers rollback on failure.
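
As a rough illustration of the rollback behaviour described in the list above, here is a minimal sketch of such an async context manager. Only the names rollback_project_on_error and ProjectCreationAbortedError come from the description; the signature, the schedule_deletion callback, and the demo function are assumptions made for this example and are not taken from the PR's code.

```python
import asyncio
import logging
from collections.abc import AsyncIterator, Awaitable, Callable
from contextlib import asynccontextmanager

_logger = logging.getLogger(__name__)


class ProjectCreationAbortedError(Exception):
    """Stand-in for the error named in the description; the real class may differ."""


@asynccontextmanager
async def rollback_project_on_error(
    project_uuid: str,
    schedule_deletion: Callable[[str], Awaitable[None]],  # hypothetical hook
) -> AsyncIterator[None]:
    try:
        yield
    except Exception as err:
        # asyncio.CancelledError derives from BaseException (Python >= 3.8),
        # so cancellation skips this branch and propagates without rollback.
        try:
            await schedule_deletion(project_uuid)
        except Exception:
            # A failed cleanup is only logged, so it cannot hide the cause.
            _logger.exception("Could not schedule deletion of %s", project_uuid)
        raise ProjectCreationAbortedError(project_uuid) from err


async def demo() -> tuple[list[str], str]:
    deleted: list[str] = []

    async def fake_schedule_deletion(project_uuid: str) -> None:
        deleted.append(project_uuid)

    try:
        async with rollback_project_on_error("p-1", fake_schedule_deletion):
            raise RuntimeError("pipeline creation failed")
    except ProjectCreationAbortedError as err:
        outcome = f"aborted because of {type(err.__cause__).__name__}"
    return deleted, outcome
```

Running asyncio.run(demo()) in this sketch records the project for deletion and surfaces the abort error with the original failure attached as its cause.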

Testing:

  • Unit tests for rollback_project_on_error (cancelled error, regular exception, cleanup failure).
  • Integration tests for create_project cleanup on pipeline failure, unexpected errors, and product-name mismatch.

Related issue/s

How to test

Dev-ops

GitHK self-assigned this May 22, 2026
GitHK added this to the Ionio milestone May 22, 2026
github-actions bot added the a:webserver and a:static-webserver labels May 22, 2026
GitHK requested a review from Copilot May 22, 2026 07:51

Copilot AI left a comment

Pull request overview

This PR addresses a failure mode during project creation where director-v2 pipeline registration could fail silently, leaving persisted “orphaned” projects without corresponding computational pipeline entries (comp_tasks). The change makes pipeline registration failures surface properly and adds cleanup logic/tests to prevent incomplete projects from lingering.

Changes:

  • Make director_v2_service.create_or_update_pipeline re-raise DirectorV2ServiceError instead of returning None, so callers can react to failures.
  • Add a catch-all cleanup path in project creation to delete the partially-created project on unexpected exceptions, while preserving intentional web.HTTPException behavior.
  • Add unit tests verifying cleanup on pipeline failure/unexpected exceptions and ensuring HTTP exceptions do not trigger deletion; update non-critical node-edit flows to suppress director-v2 pipeline sync failures.
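
The first change in the list above (re-raise instead of returning None) can be sketched as follows. This is a simplified, synchronous illustration: the DirectorV2ServiceError class here is a local stand-in, and _request_pipeline with its fail toggle is a hypothetical placeholder for the actual call to director-v2.

```python
import logging
from typing import Optional

_logger = logging.getLogger(__name__)


class DirectorV2ServiceError(Exception):
    """Local stand-in for the webserver's error class of the same name."""


def _request_pipeline(project_id: str, *, fail: bool) -> dict:
    # Hypothetical call to director-v2, reduced to a toggle for illustration.
    if fail:
        raise DirectorV2ServiceError(f"cannot register pipeline for {project_id}")
    return {"project_id": project_id}


def create_pipeline_swallowing(project_id: str, *, fail: bool) -> Optional[dict]:
    # Previous shape: log and return None, so the caller never sees an exception.
    try:
        return _request_pipeline(project_id, fail=fail)
    except DirectorV2ServiceError:
        _logger.exception("Pipeline registration failed for %s", project_id)
        return None


def create_pipeline_reraising(project_id: str, *, fail: bool) -> dict:
    # New shape: still log, but re-raise so the caller can roll back the project.
    try:
        return _request_pipeline(project_id, fail=fail)
    except DirectorV2ServiceError:
        _logger.exception("Pipeline registration failed for %s", project_id)
        raise
```

With the second variant the caller decides what to do: project creation can trigger cleanup, while non-critical paths can still suppress the error explicitly.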

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| services/web/server/tests/unit/with_dbs/02/test_projects_crud_handlers__create_project_failure_cleanup.py | New tests covering cleanup behavior for create-project failure scenarios. |
| services/web/server/src/simcore_service_webserver/studies_dispatcher/_studies_access.py | Wraps pipeline creation in a suppression block during dispatcher-driven template copy. |
| services/web/server/src/simcore_service_webserver/studies_dispatcher/_projects.py | Catches director-v2 pipeline errors during dispatcher-created projects and logs a warning. |
| services/web/server/src/simcore_service_webserver/projects/_projects_service.py | Suppresses director-v2 pipeline sync failures for add/delete/patch node operations (non-critical paths). |
| services/web/server/src/simcore_service_webserver/projects/_crud_api_create.py | Removes assert, raises HTTPBadRequest on product mismatch, and adds catch-all cleanup on unexpected exceptions. |
| services/web/server/src/simcore_service_webserver/director_v2/_director_v2_service.py | Changes pipeline creation to re-raise after logging (no longer returns None on failure). |

Comment thread services/web/server/src/simcore_service_webserver/studies_dispatcher/_projects.py Outdated

codecov Bot commented May 22, 2026

Codecov Report

❌ Patch coverage is 67.21311% with 20 lines in your changes missing coverage. Please review.
✅ Project coverage is 77.78%. Comparing base (344551f) to head (89209a1).

❗ There is a different number of reports uploaded between BASE (344551f) and HEAD (89209a1).

HEAD has 29 uploads less than BASE

| Flag | BASE (344551f) | HEAD (89209a1) |
| --- | --- | --- |
| unittests | 32 | 3 |

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #9155      +/-   ##
==========================================
- Coverage   87.34%   77.78%   -9.56%     
==========================================
  Files        2065      793    -1272     
  Lines       81471    37246   -44225     
  Branches     1475      194    -1281     
==========================================
- Hits        71162    28973   -42189     
+ Misses       9900     8222    -1678     
+ Partials      409       51     -358     
| Flag | Coverage Δ |
| --- | --- |
| integrationtests | 63.66% <22.95%> (-0.05%) ⬇️ |
| unittests | 77.85% <67.21%> (-8.38%) ⬇️ |

| Components | Coverage Δ |
| --- | --- |
| pkg_aws_library | ∅ <ø> (∅) |
| pkg_celery_library | ∅ <ø> (∅) |
| pkg_dask_task_models_library | ∅ <ø> (∅) |
| pkg_models_library | ∅ <ø> (∅) |
| pkg_notifications_library | ∅ <ø> (∅) |
| pkg_postgres_database | ∅ <ø> (∅) |
| pkg_service_integration | ∅ <ø> (∅) |
| pkg_service_library | ∅ <ø> (∅) |
| pkg_settings_library | ∅ <ø> (∅) |
| pkg_simcore_sdk | 67.96% <ø> (-18.02%) ⬇️ |
| agent | ∅ <ø> (∅) |
| api_server | ∅ <ø> (∅) |
| autoscaling | ∅ <ø> (∅) |
| catalog | ∅ <ø> (∅) |
| clusters_keeper | ∅ <ø> (∅) |
| dask_sidecar | ∅ <ø> (∅) |
| datcore_adapter | ∅ <ø> (∅) |
| director | ∅ <ø> (∅) |
| director_v2 | 78.45% <ø> (-12.89%) ⬇️ |
| dynamic_scheduler | ∅ <ø> (∅) |
| dynamic_sidecar | 73.96% <ø> (-14.08%) ⬇️ |
| efs_guardian | ∅ <ø> (∅) |
| invitations | ∅ <ø> (∅) |
| payments | ∅ <ø> (∅) |
| resource_usage_tracker | ∅ <ø> (∅) |
| storage | ∅ <ø> (∅) |
| webclient | ∅ <ø> (∅) |
| webserver | 79.13% <67.21%> (-7.68%) ⬇️ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 344551f...89209a1.

Copilot AI left a comment

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.

Comment thread services/web/server/src/simcore_service_webserver/projects/_crud_api_create.py Outdated
Comment thread services/web/server/src/simcore_service_webserver/studies_dispatcher/_errors.py Outdated
Andrei Neagu added 2 commits May 22, 2026 11:04

Copilot AI left a comment

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (1)

services/web/server/src/simcore_service_webserver/projects/_crud_api_create.py:496

  • The new except web.HTTPException: raise prevents rollback for any HTTPException thrown after the project has been inserted (e.g. any downstream validation error). This can reintroduce orphaned/half-created projects (project exists in DB but the create call returned an error). If the intent is to preserve only specific HTTP errors, consider narrowing this handler or performing conditional cleanup when new_project['uuid'] was created in this call.
    except web.HTTPException:
        # Intentional HTTP error responses (e.g. HTTPBadRequest) should not trigger cleanup
        raise

Copilot AI left a comment

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.

Comment thread services/web/server/src/simcore_service_webserver/projects/_crud_api_create.py Outdated

Copilot AI left a comment

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.

Comment thread services/web/server/src/simcore_service_webserver/projects/_crud_api_create.py Outdated
Comment thread services/web/server/src/simcore_service_webserver/studies_dispatcher/_projects.py Outdated

Copilot AI left a comment

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (1)

services/web/server/tests/unit/with_dbs/02/test_projects_crud_handlers__create_project_failure_cleanup.py:249

  • Same issue here: submit_delete_project_task is awaited in the code under test, so patching it with wraps=None creates a non-awaitable mock and can change the observed failure mode (e.g. turning the expected HTTP 400 into a 500 due to TypeError). Use an AsyncMock for this patch (and optionally set an appropriate return value).
    delete_spy = mocker.patch(
        "simcore_service_webserver.projects._crud_api_create._projects_service.submit_delete_project_task",
        wraps=None,
    )
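
The difference the comment above points at can be reproduced with the standard library alone. The sketch below is not the PR's test: cleanup is a hypothetical stand-in that only mirrors the fact that the patched callable is awaited.

```python
import asyncio
from unittest.mock import AsyncMock, MagicMock


async def cleanup(submit_delete_project_task) -> str:
    # Mirrors the code under test only in that it awaits the patched callable.
    await submit_delete_project_task(project_uuid="p-1")
    return "cleaned"


def run_with(mock) -> str:
    try:
        return asyncio.run(cleanup(mock))
    except TypeError:
        # Calling a plain MagicMock returns another MagicMock, which is not awaitable.
        return "TypeError"


plain_mock = MagicMock(wraps=None)
async_mock = AsyncMock(return_value=None)

result_plain = run_with(plain_mock)
result_async = run_with(async_mock)
```

In this sketch result_plain ends up as "TypeError" and result_async as "cleaned", and async_mock.assert_awaited_once_with(project_uuid="p-1") passes. Note that unittest.mock.patch already substitutes an AsyncMock when the patched attribute is an async def function, so whether the test in the PR is actually affected depends on how the target is defined.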

Copilot AI left a comment

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (1)

services/web/server/src/simcore_service_webserver/projects/_crud_api_create.py:520

  • In this generic exception handler, failures from submit_delete_project_task will override the original exception being handled, which makes debugging harder and can change the error surfaced to clients. Consider making the cleanup best-effort (try/except + log) and then re-raising the original exception.
    except Exception:
        _logger.exception(
            "Unexpected error during create_project for user '%s'. Cleaning up",
            f"{user_id=}",
        )
        if project_uuid := new_project.get("uuid"):
            await _projects_service.submit_delete_project_task(
                app=app,
                project_uuid=project_uuid,
                user_id=user_id,
                simcore_user_agent=simcore_user_agent,
                product_name=product_name,
            )
        raise
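
The suggestion above (best-effort cleanup, then re-raise the original exception) could look roughly like the following. The helper name _best_effort_cleanup appears in the PR description, but its body here is an assumption; delete_project stands in for submit_delete_project_task, and create_project is reduced to a stub that always fails.

```python
import asyncio
import logging

_logger = logging.getLogger(__name__)


async def _best_effort_cleanup(delete_project, project_uuid: str) -> None:
    # Assumed shape of the helper: swallow and log any cleanup failure.
    try:
        await delete_project(project_uuid)
    except Exception:
        _logger.exception(
            "Cleanup of project %s failed; keeping the original error", project_uuid
        )


async def create_project(delete_project) -> None:
    new_project = {"uuid": "p-1"}
    try:
        raise ValueError("pipeline creation failed")  # simulated post-insertion failure
    except Exception:
        if project_uuid := new_project.get("uuid"):
            await _best_effort_cleanup(delete_project, project_uuid)
        raise  # re-raises the original ValueError, not the cleanup error


async def failing_delete(project_uuid: str) -> None:
    raise ConnectionError("cleanup backend unreachable")


def surfaced_error() -> str:
    try:
        asyncio.run(create_project(failing_delete))
    except Exception as err:
        return type(err).__name__
    return "no error"
```

Even though the cleanup itself fails in this sketch, surfaced_error() reports the original ValueError, which is the behaviour the comment asks for.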

Comment thread services/web/server/src/simcore_service_webserver/projects/_crud_api_create.py Outdated

Copilot AI left a comment

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

Copilot AI left a comment

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.

GitHK changed the title from "🐛 Fix orphaned projects when computational pipeline creation fails" to "🐛 Fix orphan projects left behind when project creation fails" May 22, 2026

GitHK (author) commented

Inside this module, I'd say we want to suppress the error to keep the same behaviour as before.
If there are reasons to let the error bubble up, let me know; otherwise I would not change it.

GitHK marked this pull request as ready for review May 22, 2026 13:54
GitHK requested a review from odeimaiz May 22, 2026 13:55

Labels

a:static-webserver (static-webserver service), a:webserver (webserver's codebase)
