fix(skstore): purge DeletedDir tombstones at end of update transaction#1245
Open
zaccharyfavere wants to merge 1 commit into
Open
fix(skstore): purge DeletedDir tombstones at end of update transaction#1245zaccharyfavere wants to merge 1 commit into
zaccharyfavere wants to merge 1 commit into
Conversation
Contributor
|
Hi, thanks for your contribution! |
mbouaziz
added a commit
that referenced
this pull request
May 27, 2026
gen2 CircleCI machines enforce virtual memory limits at the cgroup level. The Skip runtime's default SKIP_CAPACITY of 16 GB (palloc.c) fails to mmap on any class with <16 GB RAM, producing "ERROR (MAP FAILED): Cannot allocate memory" early in the job. Observed on skipruntime/large.gen2 in PR #1245 and on an earlier skdb-wasm/large.gen2 run that died at 61 s. gen1 tolerated the same 16 GB virtual mmap because only RSS was enforced and the mmap is lazy. Set SKIP_CAPACITY explicitly per gen2 job, leaving ~25% RAM headroom for OS and tooling: check-ts medium.gen2 (4 GB) -> 3G skdb large.gen2 (8 GB) -> 6G skdb-wasm large.gen2 (8 GB) -> 6G skipruntime large.gen2 (8 GB)* -> 6G compiler xlarge.gen2 (16 GB) -> 12G * postgres + kafka sidecars get their own RAM allocation Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
4 tasks
mbouaziz
added a commit
that referenced
this pull request
May 27, 2026
## Summary Gen2 CircleCI machines enforce virtual memory limits at the cgroup level. The Skip runtime's default `SKIP_CAPACITY` of 16 GB (`skiplang/prelude/runtime/palloc.c:449-451`) fails to `mmap` on any class with <16 GB RAM, producing `ERROR (MAP FAILED): Cannot allocate memory` early in the job. Observed on `skipruntime` / `large.gen2` in PR #1245 (job 16682) and on an earlier `skdb-wasm` / `large.gen2` run that died at 61 s (workflow `f78dd673`, 2026-05-27 09:37). Gen1 tolerated the same 16 GB virtual mmap because only RSS was enforced and the mmap is lazy — gen2 evidently does not. Set `SKIP_CAPACITY` explicitly per gen2 job, leaving ~25% RAM headroom for OS + tooling: | Job | Class | RAM | `SKIP_CAPACITY` | |---|---|---|---| | check-ts | medium.gen2 | 4 GB | `3G` | | skdb | large.gen2 | 8 GB | `6G` | | skdb-wasm | large.gen2 | 8 GB | `6G` | | skipruntime | large.gen2 (+ pg/kafka sidecars) | 8 GB primary | `6G` | | compiler | xlarge.gen2 | 16 GB | `12G` | Why also setting the ones that haven't (yet) failed: same `mmap` happens on every Skip-toolchain invocation, so any gen2 job using `skiplabs/skip*` images is one coin flip away from the same failure. `skipruntime` on `large.gen2` succeeded 3× in a row on the Phase 1 PR before failing on #1245 — the variance is real. `check-examples` is on `large.gen2` but uses `cimg/base` and only runs the Skip toolchain inside docker-compose containers (separate cgroups), so it's not affected. ## Test plan - [ ] Land this, then trigger a PR that exercises each gen2 workflow (touch `skiplang/prelude/src/foo` for compiler + skdb + skdb-wasm + skipruntime, and a ts workspace file for check-ts). - [ ] Confirm `skipruntime` no longer dies with `MAP FAILED` at `skargo build`. - [ ] Confirm `compiler` still passes — 12 G cap is well above measured peak (~10 GB). - [ ] Watch `skdb` and `skdb-wasm` over the next few PRs for any new failures specifically at allocation time; if they appear, drop the cap further. 🤖 Generated with [Claude Code](https://claude.com/claude-code)
Author
|
Hi, alright thanks for your return ! |
DeletedDir entries were created in Dirs.state by removeDir() to mark directories as deleted, but were never cleaned up. They accumulated indefinitely, causing a persistent memory leak proportional to the number of subdirectories created and destroyed over time. The fix adds a purgeDeletedDirs() method on the Dirs class and calls it at the end of updateWithStatus(), right after the cascade finishes and fromContext is reset. At that point the tombstones have served their purpose and can safely be removed. This is the natural cleanup point: it runs after each merge of changes into the context state, matching the lifecycle of the tombstones themselves.
eb65f41 to
a04d30d
Compare
Contributor
|
Sorry, still fixing the CI... |
Author
|
No worries, I just re pushed because I realised there was a format error still in my code. That should be fixed now since it made one of the CI pass |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
DeletedDir entries were created in Dirs.state by removeDir() to mark directories as deleted, but were never cleaned up. They accumulated indefinitely, causing a persistent memory leak proportional to the number of subdirectories created and destroyed over time.
The fix adds a purgeDeletedDirs() method on the Dirs class and calls it at the end of updateWithStatus(), right after the cascade finishes and fromContext is reset. At that point the tombstones have served their purpose and can safely be removed.
This is the natural cleanup point: it runs after each merge of changes into the context state, matching the lifecycle of the tombstones themselves.
cc @skiplabsdaniel