feat: add bounded ZSTD encoder/decoder pool with memory management #327
aron-muon wants to merge 11 commits into buildbarn:main
Conversation
- Use a shared ZSTD encoder/decoder pool instead of creating new instances per request
- Add NewCASBlobAccessWithPool for custom pool configurations
- Decoders and encoders are acquired from the pool with backpressure (blocks when at capacity)
- Proper cleanup with defer to ensure encoders/decoders return to the pool
- Memory usage now bounded by pool configuration (default ~320MB peak)
42c50e4 to 92b8cd9
- Fix SetDefaultZstdPool bug where alreadyInit was never set to true, so the panic on double initialization never fired
- Replace the max64 helper with the built-in max (Go 1.21+)
- Fix nil error sent to channel in the concurrent test when data mismatches
- Fix BenchmarkZstdNoPool to use zstd.NewWriter directly for a fair comparison
- Add TestSetDefaultZstdPoolAfterInitPanics to verify the bug fix
- Document the context.Background() limitation in the chunk reader's Read()
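The first fix in this list can be illustrated with a minimal sketch of a set-once guard. Only the function name comes from the commit message; the surrounding types are stand-ins. The bug was that the flag was never set, so a second call silently replaced the pool instead of panicking.

```go
package main

import (
	"fmt"
	"sync"
)

var (
	mu          sync.Mutex
	defaultPool *int // stand-in for *BoundedZstdPool
	alreadyInit bool
)

// SetDefaultZstdPool may be called at most once. Before the fix,
// alreadyInit was never assigned, so the guard below was dead code.
func SetDefaultZstdPool(p *int) {
	mu.Lock()
	defer mu.Unlock()
	if alreadyInit {
		panic("SetDefaultZstdPool: pool already initialized")
	}
	defaultPool = p
	alreadyInit = true // the fix: actually record initialization
}

func main() {
	v := 1
	SetDefaultZstdPool(&v)
	defer func() {
		if recover() != nil {
			fmt.Println("second call panicked as expected")
		}
	}()
	SetDefaultZstdPool(&v) // now fires the double-initialization panic
}
```

(A later commit in this PR removes the global default pool entirely, which makes this guard moot.)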
EdSchouten
left a comment
I know you're trying to make the best out of the entire situation. But let me use this opportunity to go on record and say this:
This is a clear demonstration of why I so absolutely hate the way the Remote API working group added support for compression to REv2. Having all of this bloat in bb_storage is just awful.
```go
// ZstdPoolConfig holds validated configuration for a BoundedZstdPool.
type ZstdPoolConfig struct {
```
What's the point in adding this entire structure if we don't provide any facilities for controlling these values?
Removed ZstdPoolConfig entirely. The pool parameters are now passed directly via the protobuf configuration fields.
- Remove finalizers (we own the code, no need for safety nets)
- Delete zstd_config.go (a config struct without protobuf integration is pointless)
- Remove global state (defaultZstdPool, sync.Once, SetDefaultZstdPool)
- Remove the null-defaulting pattern (callers must pass the pool explicitly)
- Consolidate to a single NewCASBlobAccess constructor with a required pool parameter
- Remove unused exports (TryAcquire*, AcquireDecoderAsReadCloser, PooledReadCloser)
- Create the pool at the configuration site when compression is enabled
I can easily understand that. The compression support in REv2 adds a lot of plumbing. Happy to keep the implementation as lean as possible.
Zstd memory management
Replace the hardcoded encoder/decoder pool limits with a ZstdCompressionConfiguration protobuf message on GrpcBlobAccessConfiguration. The presence of the message enables compression; its absence disables it entirely with no pool allocated. This addresses maintainer feedback that there should be no default values, as deployments range from Raspberry Pi to 128-core EC2 instances.
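The presence-based switch described above can be sketched in Go. The message and field names below are assumptions standing in for the generated protobuf types, not the PR's actual schema.

```go
package main

import "fmt"

// Hypothetical mirror of the ZstdCompressionConfiguration message;
// field names are illustrative.
type ZstdCompressionConfiguration struct {
	MaximumEncoders int
	MaximumDecoders int
}

type GrpcBlobAccessConfiguration struct {
	// nil means compression is disabled and no pool is allocated.
	Zstd *ZstdCompressionConfiguration
}

// describe shows the presence check: only a non-nil message causes a
// pool to be sized and created, with no built-in defaults.
func describe(cfg *GrpcBlobAccessConfiguration) string {
	if cfg.Zstd == nil {
		return "compression disabled"
	}
	return fmt.Sprintf("pool: %d encoders, %d decoders",
		cfg.Zstd.MaximumEncoders, cfg.Zstd.MaximumDecoders)
}

func main() {
	fmt.Println(describe(&GrpcBlobAccessConfiguration{}))
	fmt.Println(describe(&GrpcBlobAccessConfiguration{
		Zstd: &ZstdCompressionConfiguration{MaximumEncoders: 16, MaximumDecoders: 32},
	}))
	// Output:
	// compression disabled
	// pool: 16 encoders, 32 decoders
}
```

Making absence mean "no pool at all" sidesteps the default-sizing debate: a Raspberry Pi deployment simply omits the message, while a 128-core machine sets limits that fit its memory budget.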
|
Hey @EdSchouten, mind letting the workflows run?

Summary

The existing CAS gRPC client creates a new `zstd.Encoder` and `zstd.Decoder` for every ByteStream request. Under high concurrency (e.g., large Bazel builds with many parallel actions), this leads to unbounded memory growth, GC pressure, and potential OOM: each encoder allocates ~4MB and each decoder ~8MB, and the `klauspost/compress/zstd` library spawns internal goroutines that leak without explicit `Close()` calls (klauspost/compress#264).

This PR introduces a `BoundedZstdPool` that replaces per-request allocation with a memory-bounded, concurrency-limited pool, and integrates it into the CAS blob access client.

Approach

- `BoundedZstdPool`, a new pool type that:
  - reuses instances via `sync.Pool` to avoid per-request allocation
  - limits concurrency with `golang.org/x/sync/semaphore`, bounding peak memory to a predictable budget (~320MB with defaults: 16 encoders + 32 decoders)
  - returns `codes.ResourceExhausted` on timeout
- CAS client integration, via changes to `cas_blob_access.go`:
  - `Get` path: `zstdByteStreamChunkReader` now acquires a decoder from the pool and releases it on `Close()`, instead of creating a new `zstd.Reader` per read
  - `Put` path: acquires a pooled encoder with `defer ReleaseEncoder()`, replacing inline `zstd.NewWriter` calls
  - a `NewCASBlobAccessWithPool` constructor for injecting a custom pool (useful for testing and per-client tuning); the existing `NewCASBlobAccess` continues to work via a lazily-initialized default pool

Testing

- `BoundedZstdPool`: acquire/release, concurrency limits, context cancellation, `TryAcquire`, `PooledReadCloser`, nil-safety, and a concurrent stress test
- `expectGetCapabilitiesWithZSTD` test helper to reduce duplication