perf: intern DataFile fields/column_indices to reduce manifest memory #6477
Merged
jackye1995 merged 3 commits into lance-format:main on Apr 13, 2026
Change DataFile.fields and DataFile.column_indices from Vec<i32> to Arc<[i32]> and add a DataFileFieldInterner that deduplicates identical slices during manifest deserialization. In homogeneous tables every fragment carries the same field list, so at 20M fragments the interning saves ~2.4 GB of redundant heap allocations (~1.2 GB for fields + ~1.2 GB for column_indices).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
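The deduplication described in the commit message can be sketched as follows. This is a minimal illustration, not the PR's actual `DataFileFieldInterner`: the `FieldInterner` type, its `HashSet` backing store, and the `unique_len` helper are assumptions made for the example. The idea is that each distinct slice is allocated once and every later occurrence receives a cheap `Arc` clone of the same allocation.

```rust
use std::collections::HashSet;
use std::sync::Arc;

/// Hypothetical interner (not the PR's type): hands out one shared
/// `Arc<[i32]>` per distinct slice seen so far.
#[derive(Default)]
struct FieldInterner {
    seen: HashSet<Arc<[i32]>>,
}

impl FieldInterner {
    /// Return the shared copy of `values`, allocating only the first time
    /// a given slice is seen.
    fn intern(&mut self, values: &[i32]) -> Arc<[i32]> {
        // `Arc<[i32]>` implements `Borrow<[i32]>`, so the lookup works on
        // the borrowed slice without allocating.
        if let Some(existing) = self.seen.get(values) {
            return Arc::clone(existing);
        }
        let shared: Arc<[i32]> = Arc::from(values);
        self.seen.insert(Arc::clone(&shared));
        shared
    }

    /// Number of distinct slices currently stored.
    fn unique_len(&self) -> usize {
        self.seen.len()
    }
}

fn main() {
    let mut interner = FieldInterner::default();

    // Simulate deserializing 1,000 fragments that all carry the same field list.
    let fragments: Vec<Arc<[i32]>> = (0..1000)
        .map(|_| interner.intern(&[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]))
        .collect();

    // Every fragment points at the same heap allocation.
    assert!(Arc::ptr_eq(&fragments[0], &fragments[999]));
    assert_eq!(interner.unique_len(), 1);
    println!(
        "unique lists: {}, handles to the first one: {}",
        interner.unique_len(),
        Arc::strong_count(&fragments[0])
    );
}
```

As a rough check on the figures quoted above: on a 64-bit target a `Vec<i32>` handle is 24 bytes and ten `i32` values occupy 40 bytes, so each un-interned list costs roughly 64 bytes before allocator overhead. 20M such lists come to about 1.28 × 10^9 bytes, in line with the ~1.2 GB per field in the commit message.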
933f46a to 90fa6d7
jackye1995 approved these changes on Apr 13, 2026

jackye1995 (Contributor) left a comment:

thanks for the change, looks good to me!
jackye1995 pushed a commit that referenced this pull request on Apr 14, 2026
…mory (#6499)

## Summary

- Change `RowDatasetVersionMeta::Inline` from `Vec<u8>` to `Arc<[u8]>` so that fragments with identical version metadata share a single heap allocation
- Extend `DataFileFieldInterner` to deduplicate these inline byte payloads during manifest deserialization
- Introduce `InternCache<T>`: a hybrid cache that uses Vec linear scan for ≤16 entries and upgrades to HashMap for larger caches
- Add custom `Serialize`/`Deserialize` impls for `RowDatasetVersionMeta` to handle `Arc<[u8]>` transparently

## Motivation

Follow-up to #6477 (interning `DataFile.fields`/`column_indices`). After a compaction, all fragments are stamped with the same version metadata (both `last_updated_at_version_meta` and `created_at_version_meta`), but each fragment previously owned its own `Vec<u8>` copy.

### Per-fragment memory breakdown (before)

| Field | Size per fragment |
|-------|------------------|
| `last_updated_at_version_meta: Inline(Vec<u8>)` | ~24 bytes + payload |
| `created_at_version_meta: Inline(Vec<u8>)` | ~24 bytes + payload |
| **Total redundant at 20M fragments** | **~480 MB+** |

### After this change

With interning, all 20M fragments share a single `Arc<[u8]>` allocation per unique payload.

## Benchmark results

Microbenchmark at 100K fragments (10 fields per fragment):

| Scenario | No interning | With interning | Delta |
|----------|-------------|----------------|-------|
| **Uniform (1 unique version)** | 24.5 ms | 17.9 ms | **27% faster** |
| **Diverse (10 unique)** | 25.7 ms | 19.7 ms | **23% faster** |
| **Diverse (100 unique)** | 26.0 ms | 23.4 ms | **10% faster** |
| **Diverse (500 unique)** | 26.0 ms | 22.8 ms | **12% faster** |

| Memory (100K fragments) | No interning | With interning | Savings |
|------------------------|-------------|----------------|---------|
| **10 fields** | 39.47 MB | 29.74 MB | **24.6%** |
| **50 fields** | 69.99 MB | 29.74 MB | **57.5%** |

Both memory and speed improve across all scenarios. The hybrid `InternCache` uses fast Vec scan for the common case (1-3 unique values) and upgrades to HashMap when diversity exceeds 16 entries.

Run with: `cargo bench -p lance-table --bench manifest_intern`

## Changes

- **`rust/lance-table/src/rowids/version.rs`**: `Inline(Vec<u8>)` → `Inline(Arc<[u8]>)`, custom serde impls, updated protobuf conversions
- **`rust/lance-table/src/format/fragment.rs`**: `InternCache<T>` (Vec/HashMap hybrid), extended `DataFileFieldInterner` with version meta interning
- **`rust/lance-table/benches/manifest_intern.rs`**: Microbenchmark covering uniform and diverse scenarios

## Compatibility

- No format change: protobuf schema is unchanged
- Serde JSON output is identical (custom impl serializes `Arc<[u8]>` as `[u8]`)
- `from_sequence()` still works as before (converts internally)

## Test plan

- [x] `cargo check --workspace --tests` passes
- [x] `cargo clippy -p lance-table -p lance -- -D warnings` passes
- [x] All 88 `lance-table` tests pass
- [x] `cargo fmt --all -- --check` passes
- [x] Microbenchmark validates performance across uniform and diverse scenarios
- [ ] CI

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
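The hybrid cache described in this commit message can be sketched roughly as below. This is an assumption-laden illustration rather than the code merged in #6499: the struct layout, the `LINEAR_SCAN_LIMIT` constant name, the use of a `HashSet` in place of the `HashMap` mentioned above, and the `is_hashed`/`len` helpers are all hypothetical. It only demonstrates the stated behaviour: linear scan while the cache holds at most 16 entries, then a one-time upgrade to hashing.

```rust
use std::collections::HashSet;
use std::hash::Hash;
use std::sync::Arc;

/// Assumed threshold, taken from the "≤16 entries" wording above.
const LINEAR_SCAN_LIMIT: usize = 16;

/// Hypothetical hybrid intern cache: a small Vec scanned linearly,
/// upgraded to a hash set once it grows past the limit.
struct InternCache<T> {
    small: Vec<Arc<[T]>>,
    large: Option<HashSet<Arc<[T]>>>,
}

impl<T: Eq + Hash + Clone> InternCache<T> {
    fn new() -> Self {
        Self { small: Vec::new(), large: None }
    }

    fn intern(&mut self, values: &[T]) -> Arc<[T]> {
        // Hashed path: used only after the upgrade has happened.
        if let Some(set) = self.large.as_mut() {
            if let Some(hit) = set.get(values) {
                return Arc::clone(hit);
            }
            let shared: Arc<[T]> = Arc::from(values);
            set.insert(Arc::clone(&shared));
            return shared;
        }

        // Linear-scan path: cheap when there are only a few unique values.
        for entry in &self.small {
            if **entry == *values {
                return Arc::clone(entry);
            }
        }
        let shared: Arc<[T]> = Arc::from(values);
        self.small.push(Arc::clone(&shared));

        // Upgrade once diversity exceeds the limit; existing Arcs are moved,
        // so handles given out earlier stay pointer-equal to the cached ones.
        if self.small.len() > LINEAR_SCAN_LIMIT {
            self.large = Some(self.small.drain(..).collect());
        }
        shared
    }

    /// True once the cache has switched to hashing.
    fn is_hashed(&self) -> bool {
        self.large.is_some()
    }

    /// Number of unique entries held.
    fn len(&self) -> usize {
        self.small.len() + self.large.as_ref().map_or(0, |s| s.len())
    }
}

fn main() {
    let mut cache: InternCache<u8> = InternCache::new();
    let first = cache.intern(&[7, 7]);
    let again = cache.intern(&[7, 7]);
    assert!(Arc::ptr_eq(&first, &again));
    println!("entries: {}, hashed: {}", cache.len(), cache.is_hashed());
}
```

The design trade-off is that a linear scan over a handful of short slices avoids hashing every payload, which matters when nearly all fragments share one or two values, while the hash-based path keeps lookups bounded when a manifest is more diverse.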
## Summary

- Change `DataFile.fields` and `DataFile.column_indices` from `Vec<i32>` to `Arc<[i32]>` so that fragments with identical field lists share a single heap allocation
- Add a `DataFileFieldInterner` that deduplicates these slices during manifest deserialization

## Motivation
When dataset manifests grow large (>1 GB with millions of fragments), opening the dataset becomes very expensive in terms of memory. Each `DataFile` previously owned its own `Vec<i32>` for `fields` and `column_indices`, even though in most tables every fragment has the exact same field list. This PR deduplicates those allocations at deserialization time.

### Per-fragment memory breakdown (before)
- `fields: Vec<i32>` (10 fields)
- `column_indices: Vec<i32>` (10 cols)

### After this change
With interning, all 20M fragments share a single `Arc<[i32]>` allocation (~80 bytes total instead of 2.4 GB).

## Changes
- `lance-table/src/format/fragment.rs`: Core struct change (`Vec<i32>` → `Arc<[i32]>`), custom `Serialize`/`Deserialize` impls, and `DataFileFieldInterner`
- `lance-table/src/format/manifest.rs`: Use interner during manifest deserialization
- `lance/src/dataset/fragment.rs`, `merge_insert.rs`, `io/commit.rs`: Tombstoning and field-remapping rebuilt as new `Arc<[i32]>` instead of in-place mutation
- `python/src/fragment.rs`, `java/lance-jni/src/fragment.rs`: FFI boundary conversions

## Compatibility
- […] `Arc<[i32]>` as `[i32]`)
- […] `Vec<i32>` (e.g., `DataFile::new()`, `Fragment::add_file()`) still accept `Vec<i32>` and convert internally

## Test plan
- `cargo check --workspace --tests` passes
- `cargo clippy -p lance-table -p lance -- -D warnings` passes
- `lance-table` tests pass

🤖 Generated with Claude Code
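Two points from the description above can be illustrated in a small sketch: constructors that keep accepting `Vec<i32>` and convert internally, and tombstoning that builds a new `Arc<[i32]>` rather than mutating in place. The `DataFileSketch` struct, the `with_tombstoned` method, and the `TOMBSTONE` value are hypothetical stand-ins chosen for the example, not the crate's real definitions.

```rust
use std::sync::Arc;

/// Illustrative sentinel only; the value used by the real code is an
/// assumption here.
const TOMBSTONE: i32 = -2;

/// Hypothetical, trimmed-down stand-in for `DataFile`.
#[derive(Debug, Clone)]
struct DataFileSketch {
    fields: Arc<[i32]>,
    column_indices: Arc<[i32]>,
}

impl DataFileSketch {
    /// The constructor still takes `Vec<i32>` and converts internally,
    /// so existing callers do not need to change.
    fn new(fields: Vec<i32>, column_indices: Vec<i32>) -> Self {
        Self {
            fields: fields.into(),
            column_indices: column_indices.into(),
        }
    }

    /// Shared slices cannot be edited in place without affecting other
    /// holders, so a tombstoned copy is rebuilt as a fresh `Arc<[i32]>`.
    fn with_tombstoned(&self, dropped: &[i32]) -> Self {
        let fields: Arc<[i32]> = self
            .fields
            .iter()
            .map(|id| if dropped.contains(id) { TOMBSTONE } else { *id })
            .collect();
        Self {
            fields,
            // Unchanged data keeps sharing the existing allocation.
            column_indices: Arc::clone(&self.column_indices),
        }
    }
}

fn main() {
    let original = DataFileSketch::new(vec![0, 1, 2], vec![0, 1, 2]);
    let updated = original.with_tombstoned(&[1]);

    // The original is untouched; only the rebuilt copy carries the tombstone.
    assert_eq!(original.fields.to_vec(), vec![0, 1, 2]);
    assert_eq!(updated.fields.to_vec(), vec![0, TOMBSTONE, 2]);
    assert!(Arc::ptr_eq(&original.column_indices, &updated.column_indices));
    println!("{:?}", updated);
}
```

Rebuilding costs one allocation per modified file, which is a reasonable price when modifications are rare compared with the number of fragments that merely read the shared list.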