fix(deploy): anchor rsync excludes to transfer root to prevent exit 23#3787
Conversation
The three artifact-sync / rollback rsyncs used unanchored excludes (--exclude='logs/' --exclude='run/'), which match a directory of that name at ANY depth. rsync protects excluded paths on the receiver from --delete, so deep node_modules dirs named logs/ (e.g. node_modules/@sentry/core/build/esm/logs) shielded stale nested @sentry trees from deletion. rsync could then not remove the now-non-empty parents and aborted with exit 23 before the PM2 restart — the new bundle synced to disk but the old code kept serving. Anchor all three excludes with a leading '/' so they protect ONLY the top-level runtime state (/.env, /logs/, /run/) and let --delete prune stale nested node_modules. Verified on the production host (rsync 3.2.7): the unanchored pattern leaves the nested @sentry tree undeletable ("cannot delete non-empty directory"); the anchored pattern removes it while preserving the top-level logs/ directory.
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughThe deployment workflow's Changesrsync exclude pattern anchoring
Estimated code review effort🎯 2 (Simple) | ⏱️ ~8 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
Problem
Every push-to-
maindeploy was aborting at the artifact-syncrsyncstep with exit 23 (cannot delete non-empty directory: node_modules/@sentry/.../node_modules/@sentry/core/build/...), before thepm2 restart. Net effect of a "failed" deploy: the new bundle (server_build/server/index.js) was synced to disk andprisma migrate deploywas applied, but PM2 kept running the old code. Surfaced on the PR #3786 deploy after a Dependabot@sentrybump.Root cause
All three deploy rsyncs used unanchored excludes:
An unanchored pattern like
logs/matches a directory of that name at any depth, and rsync protects excluded paths on the receiver from--delete. The hoisted node_modules tree contains deep dirs literally namedlogs/:When a version bump hoists
@sentry/coreto the top level, the stale nested@sentry/*/node_modules/@sentry/core/trees should be deleted — but theirbuild/*/logs/subdirs were protected by--exclude='logs/', leaving the parents non-empty. rsync cannot remove a non-empty directory → exit 23 →fail_deployment→ restart skipped.Fix
Anchor every exclude to the transfer root with a leading
/, on all three rsyncs (artifact sync + both rollback rsyncs):/logs/now protects only"$remote_dir"/logs/(the intended top-level runtime dir), not deepnode_modules/.../logs/, so--deletecan prune stale nested node_modules and the sync completes through topm2 restart. Top-level/.env,/logs/,/run/stay protected exactly as before (preserves the production.env, PM2 logs, and the live mysqld socket dir — the CLAUDE.md "must not rm -rf the top-level run/ tree" gotcha still holds).Verification (empirical, on the production host — rsync 3.2.7)
Reproduced the exact failing structure in
/tmp(touches no real paths):@sentrytreelogs/--exclude='logs/'(old)cannot delete non-empty directory: @sentry/core/build/esm--exclude='/logs/'(new)No such file or directoryapp.logkept)This proves both that the unanchored exclude is the cause and that anchoring preserves the intended top-level protection. (CI does not exercise the deploy SSH path, so this host-level repro is the real test, not green CI.)
Notes
@sentry/coredirs (dated before the bump) automatically — no manual production cleanup needed.scripts/deploy(manual path) is unaffected — its rsyncs use neither--deletenor these excludes and never syncnode_modules. The only other rsync in the repo (scripts/backup) already uses an anchored--exclude='/tmp/'.--force/--delete-after. With the protect rule corrected there are no un-deletable dirs left; adding those flags would only mask a future protect-rule regression instead of surfacing it as a loud exit 23.Risk
Low — CI-workflow change only. Production is untouched until the next deploy, which this change makes complete through
pm2 restart.Summary by CodeRabbit