Skip to content

fix(signup): Restore orphaned standby site#6507

Open
Bowrna wants to merge 6 commits into
frappe:developfrom
Bowrna:restore_orphaned_standby_site
Open

fix(signup): Restore orphaned standby site#6507
Bowrna wants to merge 6 commits into
frappe:developfrom
Bowrna:restore_orphaned_standby_site

Conversation

@Bowrna

@Bowrna Bowrna commented May 25, 2026

Copy link
Copy Markdown
Contributor

related to PR: #6470

The initial commit in the standby site state wont be rolled back if the error occurs out of the try except block scope. This sites will be orphaned even though they are still good to use. So during replenish, if we find sites that are orphaned, we restore them back to pool.

@Bowrna Bowrna requested a review from ssiyad as a code owner May 25, 2026 09:44
@greptile-apps

greptile-apps Bot commented May 25, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR adds a restore_orphaned_standby_sites method to recover sites that got stuck with is_standby=0 but no committed user data (NULL account_request / signup_time, team still Administrator) after a double-failure during signup. The method is invoked at the start of create_standby_sites_in_each_cluster so the scheduler automatically heals orphaned sites on every replenishment cycle.

  • A 15-minute grace-period filter on modified prevents recently-claimed sites from being mis-identified as orphans, and an early return guards against a missing Administrator Team record.
  • frappe.db.commit() is called immediately after the bulk set_value, so the restoration is durable before the per-cluster site-creation loop (and its potential rollback() calls) begins.

Confidence Score: 4/5

The orphan-restoration logic is sound and the commit is placed correctly before the cluster loop, but the residual race window between the is_standby=0 commit at signup and the subsequent site.save() is not yet fully closed.

The new method correctly commits orphan restorations before any per-cluster rollback can interfere, and the 15-minute grace period materially reduces the chance of a false-positive restoration. The outstanding concern — a signup that holds is_standby=0 for an unusually long time could still be swept up — was flagged in earlier review threads and not yet addressed in this iteration.

press/saas/doctype/product_trial/product_trial.py — specifically the orphan-detection filter and its interaction with the concurrent signup path.

Important Files Changed

Filename Overview
press/saas/doctype/product_trial/product_trial.py Adds restore_orphaned_standby_sites called from create_standby_sites_in_each_cluster; the orphan query lacks a limit, and the restoration logic carries a residual (previously noted) race window between the is_standby=0 commit and the subsequent site.save().

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[replenish_standby_sites scheduler] --> B[get all products with pooling enabled]
    B --> C[for each product: create_standby_sites_in_each_cluster]
    C --> D{enable_pooling?}
    D -- No --> Z[return]
    D -- Yes --> E[restore_orphaned_standby_sites]
    E --> F[get administrator_team]
    F --> G{team found?}
    G -- No --> H[return early]
    G -- Yes --> I[query orphaned sites\nis_standby=0, team=admin,\naccount_request IS NULL,\nsignup_time IS NULL,\nmodified < now-15min]
    I --> J{orphans found?}
    J -- No --> K[skip]
    J -- Yes --> L[set_value is_standby=1 for all orphans]
    L --> M[frappe.db.commit]
    K --> N[get_available_clusters]
    M --> N
    N --> O[for each cluster: create_standby_sites]
    O --> P{success?}
    P -- Yes --> Q[frappe.db.commit]
    P -- No --> R[log_error + frappe.db.rollback]
    Q --> S[next cluster]
    R --> S
Loading

Reviews (3): Last reviewed commit: "Merge branch 'develop' into restore_orph..." | Re-trigger Greptile

Comment thread press/saas/doctype/product_trial/product_trial.py Outdated
Comment thread press/saas/doctype/product_trial/product_trial.py Outdated
Comment thread press/saas/doctype/product_trial/product_trial.py
@codecov

codecov Bot commented May 25, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 10.00000% with 9 lines in your changes missing coverage. Please review.
✅ Project coverage is 50.12%. Comparing base (16722c7) to head (d3c7429).
⚠️ Report is 10 commits behind head on develop.

Files with missing lines Patch % Lines
press/saas/doctype/product_trial/product_trial.py 10.00% 9 Missing ⚠️

❌ Your patch check has failed because the patch coverage (10.00%) is below the target coverage (75.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #6507      +/-   ##
===========================================
- Coverage    50.13%   50.12%   -0.01%     
===========================================
  Files          956      956              
  Lines        79331    79341      +10     
  Branches       377      375       -2     
===========================================
- Hits         39771    39770       -1     
- Misses       39532    39543      +11     
  Partials        28       28              
Flag Coverage Δ
dashboard 61.02% <ø> (-0.02%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@Bowrna Bowrna changed the title Restore orphaned standby site fix(signup): Restore orphaned standby site May 25, 2026
@Bowrna Bowrna marked this pull request as draft May 25, 2026 10:04
@Bowrna Bowrna marked this pull request as ready for review May 25, 2026 10:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant