fix(skstore): purge DeletedDir tombstones at end of update transaction by zaccharyfavere · Pull Request #1245 · SkipLabs/skip

zaccharyfavere · 2026-05-27T13:52:31Z

DeletedDir entries were created in Dirs.state by removeDir() to mark directories as deleted, but were never cleaned up. They accumulated indefinitely, causing a persistent memory leak proportional to the number of subdirectories created and destroyed over time.

The fix adds a purgeDeletedDirs() method on the Dirs class and calls it at the end of updateWithStatus(), right after the cascade finishes and fromContext is reset. At that point the tombstones have served their purpose and can safely be removed.

This is the natural cleanup point: it runs after each merge of changes into the context state, matching the lifecycle of the tombstones themselves.

cc @skiplabsdaniel

CLAassistant · 2026-05-27T13:52:49Z

All committers have signed the CLA.

mbouaziz · 2026-05-27T14:05:44Z

Hi, thanks for your contribution!
Sorry, CI failures might be because I switched jobs to use gen 2 circle ci instances but they may be counting memory differently

gen2 CircleCI machines enforce virtual memory limits at the cgroup level. The Skip runtime's default SKIP_CAPACITY of 16 GB (palloc.c) fails to mmap on any class with <16 GB RAM, producing "ERROR (MAP FAILED): Cannot allocate memory" early in the job. Observed on skipruntime/large.gen2 in PR #1245 and on an earlier skdb-wasm/large.gen2 run that died at 61 s. gen1 tolerated the same 16 GB virtual mmap because only RSS was enforced and the mmap is lazy. Set SKIP_CAPACITY explicitly per gen2 job, leaving ~25% RAM headroom for OS and tooling: check-ts medium.gen2 (4 GB) -> 3G skdb large.gen2 (8 GB) -> 6G skdb-wasm large.gen2 (8 GB) -> 6G skipruntime large.gen2 (8 GB)* -> 6G compiler xlarge.gen2 (16 GB) -> 12G * postgres + kafka sidecars get their own RAM allocation Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

## Summary Gen2 CircleCI machines enforce virtual memory limits at the cgroup level. The Skip runtime's default `SKIP_CAPACITY` of 16 GB (`skiplang/prelude/runtime/palloc.c:449-451`) fails to `mmap` on any class with <16 GB RAM, producing `ERROR (MAP FAILED): Cannot allocate memory` early in the job. Observed on `skipruntime` / `large.gen2` in PR #1245 (job 16682) and on an earlier `skdb-wasm` / `large.gen2` run that died at 61 s (workflow `f78dd673`, 2026-05-27 09:37). Gen1 tolerated the same 16 GB virtual mmap because only RSS was enforced and the mmap is lazy — gen2 evidently does not. Set `SKIP_CAPACITY` explicitly per gen2 job, leaving ~25% RAM headroom for OS + tooling: | Job | Class | RAM | `SKIP_CAPACITY` | |---|---|---|---| | check-ts | medium.gen2 | 4 GB | `3G` | | skdb | large.gen2 | 8 GB | `6G` | | skdb-wasm | large.gen2 | 8 GB | `6G` | | skipruntime | large.gen2 (+ pg/kafka sidecars) | 8 GB primary | `6G` | | compiler | xlarge.gen2 | 16 GB | `12G` | Why also setting the ones that haven't (yet) failed: same `mmap` happens on every Skip-toolchain invocation, so any gen2 job using `skiplabs/skip*` images is one coin flip away from the same failure. `skipruntime` on `large.gen2` succeeded 3× in a row on the Phase 1 PR before failing on #1245 — the variance is real. `check-examples` is on `large.gen2` but uses `cimg/base` and only runs the Skip toolchain inside docker-compose containers (separate cgroups), so it's not affected. ## Test plan - [ ] Land this, then trigger a PR that exercises each gen2 workflow (touch `skiplang/prelude/src/foo` for compiler + skdb + skdb-wasm + skipruntime, and a ts workspace file for check-ts). - [ ] Confirm `skipruntime` no longer dies with `MAP FAILED` at `skargo build`. - [ ] Confirm `compiler` still passes — 12 G cap is well above measured peak (~10 GB). - [ ] Watch `skdb` and `skdb-wasm` over the next few PRs for any new failures specifically at allocation time; if they appear, drop the cap further. 🤖 Generated with [Claude Code](https://claude.com/claude-code)

zaccharyfavere · 2026-05-27T14:25:58Z

Hi, alright thanks for your return !

DeletedDir entries were created in Dirs.state by removeDir() to mark directories as deleted, but were never cleaned up. They accumulated indefinitely, causing a persistent memory leak proportional to the number of subdirectories created and destroyed over time. The fix adds a purgeDeletedDirs() method on the Dirs class and calls it at the end of updateWithStatus(), right after the cascade finishes and fromContext is reset. At that point the tombstones have served their purpose and can safely be removed. This is the natural cleanup point: it runs after each merge of changes into the context state, matching the lifecycle of the tombstones themselves.

mbouaziz · 2026-05-27T16:34:05Z

Sorry, still fixing the CI...
Very bad timing for a broken CI, we haven't had any external contribution for months 😆

zaccharyfavere · 2026-05-27T16:51:24Z

No worries, I just re pushed because I realised there was a format error still in my code. That should be fixed now since it made one of the CI pass

mbouaziz mentioned this pull request May 27, 2026

ci: set SKIP_CAPACITY on gen2 jobs to fit class RAM #1246

Merged

4 tasks

zaccharyfavere force-pushed the fix-deleteddir-leak branch from eb65f41 to a04d30d Compare May 27, 2026 16:28

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(skstore): purge DeletedDir tombstones at end of update transaction#1245

fix(skstore): purge DeletedDir tombstones at end of update transaction#1245
zaccharyfavere wants to merge 1 commit into
SkipLabs:mainfrom
zaccharyfavere:fix-deleteddir-leak

zaccharyfavere commented May 27, 2026

Uh oh!

CLAassistant commented May 27, 2026 •

edited

Loading

Uh oh!

mbouaziz commented May 27, 2026

Uh oh!

zaccharyfavere commented May 27, 2026

Uh oh!

mbouaziz commented May 27, 2026

Uh oh!

zaccharyfavere commented May 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

zaccharyfavere commented May 27, 2026

Uh oh!

CLAassistant commented May 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mbouaziz commented May 27, 2026

Uh oh!

zaccharyfavere commented May 27, 2026

Uh oh!

mbouaziz commented May 27, 2026

Uh oh!

zaccharyfavere commented May 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

CLAassistant commented May 27, 2026 •

edited

Loading