Design: Unify FTS into the Global Segment Model #6301
For 1, I think it's fine to have a metadata file for each segment, as they are small, but we can have a logical index metadata in the long term to make it cleaner.
For 2, I don't worry about this too much, now we have
For 3, a bloom filter may help; the others may not. I don't think any of this can noticeably improve queries: FTS already just returns empty results if all tokens are missed, and checking token existence is not expensive.
For 4, it may be fine to do in-segment scoring; this is also how other systems handle segments. We can add a param to do global scoring and just let people know that it would be slower.
Background
Lance is unifying its index system around a framework-level segment architecture so that build, commit, query fan-out, compact, and GC all operate on the same physical unit. FTS needs to align with this model.
How FTS Works Today
On-Disk Layout
An FTS index lives in a single root directory:
metadata.lance + part_<id>_* form a stable FTS root format that the builder, loader, and distributed finalize all depend on.
Build
LANCE_FTS_PARTITION_SIZE), spill to part_<id>_* files.
metadata.lance.
Spilled part files are the final payload; there is no separate intermediate-to-final format conversion.
Distributed Build
part_<id>_tokens/docs/invert.lance + part_<id>_metadata.lance.
metadata.lance.
Query
BM25 scoring depends on two kinds of global information:
num_docs, avg_doc_length (aggregated from DocSet).
Append
metadata.lance.
Append cost grows linearly with the number of existing partitions.
Delete
Lazy: dead rows remain in posting lists and are filtered at query time. Physical cleanup depends on optimize.
Problems
The FTS root is currently the only physical management unit. This causes:
Design
Core Decisions
partitions: [0]).
Legacy multi-part segments remain readable.
Metadata
No metadata schema changes in the initial phases. The existing manifest-level IndexMetadata (uuid, name, fields, fragment_bitmap, etc.) and segment-local metadata.lance (params, partitions, token_set_format, etc.) are sufficient to support the segment control plane.
Future work may promote segment-local fields (token_set_format, posting_tail_codec, etc.) to manifest metadata and add statistical fields such as num_docs_raw and total_tokens_raw so the planner can make decisions without opening payload files. This is not a prerequisite for segmentation.
Global BM25 Scoring
BM25 requires two kinds of global information:
num_docs, avg_doc_length): aggregated from each segment's DocSet.
PostingListReader::lengths[token_id].
Phase 1 (dfs_query_then_search):
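As a rough illustration of the two-pass idea, here is a minimal Python sketch, not Lance code. It assumes a toy in-memory `ToySegment` (hypothetical) in place of real segment files, a Lucene-style BM25 idf, the usual `k1 = 1.2` and `b = 0.75` defaults, and row ids that are unique across segments. Pass 1 aggregates corpus stats and per-token document frequency across segments; pass 2 scores every segment with those shared numbers.

```python
import math
from dataclasses import dataclass


@dataclass
class ToySegment:
    """Hypothetical in-memory stand-in for one FTS segment."""
    docs: dict[int, list[str]]  # row_id -> tokens of that document

    def num_docs(self) -> int:
        return len(self.docs)

    def total_tokens(self) -> int:
        return sum(len(tokens) for tokens in self.docs.values())

    def doc_freq(self, token: str) -> int:
        # Stands in for one token lookup plus one posting-list length read.
        return sum(1 for tokens in self.docs.values() if token in tokens)


def collect_global_stats(segments, query_tokens):
    """Pass 1: aggregate corpus stats and per-token df across all segments."""
    num_docs = sum(s.num_docs() for s in segments)
    total_tokens = sum(s.total_tokens() for s in segments)
    avg_doc_length = total_tokens / num_docs if num_docs else 0.0
    # One probe per (segment, token) pair.
    df = {t: sum(s.doc_freq(t) for s in segments) for t in query_tokens}
    return num_docs, avg_doc_length, df


def bm25_search(segments, query_tokens, k1=1.2, b=0.75):
    """Pass 2: score each segment using the same global statistics."""
    num_docs, avgdl, df = collect_global_stats(segments, query_tokens)
    scores = {}
    for seg in segments:
        for row_id, tokens in seg.docs.items():
            score = 0.0
            for t in query_tokens:
                tf = tokens.count(t)
                if tf == 0:
                    continue
                idf = math.log(1 + (num_docs - df[t] + 0.5) / (df[t] + 0.5))
                norm = tf + k1 * (1 - b + b * len(tokens) / avgdl)
                score += idf * tf * (k1 + 1) / norm
            if score > 0:
                scores[row_id] = score
    # Highest score first.
    return sorted(scores.items(), key=lambda kv: -kv[1])
```

Because idf and the length normalization come from the aggregated stats rather than from each segment alone, a document's score does not depend on which segment it happens to live in.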
Cost: candidate_segments × query_tokens probes. For 100 segments and 3 query tokens this is 300 FST lookups + 300 length reads.
Phase 2 (metadata acceleration):
Write lightweight df summaries at segment build time to reduce query-time probes:
Build
Workers produce complete FTS segment roots directly.
In distributed builds each worker outputs independent segments; the coordinator performs a logical commit to the manifest. No cross-worker rename/finalize step is needed.
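The worker/coordinator split described above can be sketched as follows. This is a minimal Python sketch under stated assumptions, not the Lance API: `SegmentEntry`, `Manifest`, `build_segment`, and `commit_segments` are hypothetical names, payload writing is omitted, and the overlap check on fragment coverage is an assumed safety rule rather than something the design specifies.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class SegmentEntry:
    """Hypothetical manifest record for one committed FTS segment."""
    segment_id: str
    fragment_ids: frozenset


@dataclass(frozen=True)
class Manifest:
    """Hypothetical manifest holding the committed segment list."""
    version: int = 0
    segments: tuple = ()


def build_segment(fragment_ids):
    """Worker side: produce one complete segment for the given fragments.

    Writing the payload files is omitted; only the entry is returned.
    """
    return SegmentEntry(uuid.uuid4().hex, frozenset(fragment_ids))


def commit_segments(manifest, new_entries):
    """Coordinator side: publish all worker outputs in one new manifest version.

    Existing entries are carried over untouched; nothing is renamed or merged.
    """
    covered = set()
    for entry in manifest.segments:
        covered |= entry.fragment_ids
    for entry in new_entries:
        overlap = covered & entry.fragment_ids
        if overlap:  # assumed rule: a fragment belongs to at most one segment
            raise ValueError(f"fragments already indexed: {sorted(overlap)}")
        covered |= entry.fragment_ids
    return Manifest(manifest.version + 1, manifest.segments + tuple(new_entries))
```

The commit only appends entries to the segment list, so the coordinator never touches the files a worker wrote.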
Append
New data produces a new segment. No files from existing segments are copied.
Compact
Select a set of old segments → re-scan source data by fragment coverage and live rows → rebuild into fewer new segments using the existing builder.
Goals: reduce segment fan-out, apply delete cleanup, produce more compact payload.
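A minimal Python sketch of that flow, under stated assumptions: `SegmentRef` is a hypothetical record, `read_live_rows` and `build_segment` are hypothetical callbacks standing in for the source scan and the existing builder, and the selected segments are rebuilt into a single new segment for brevity (the design allows several).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SegmentRef:
    """Hypothetical record: a segment id plus the fragments it covers."""
    segment_id: str
    fragment_ids: frozenset


def compact(segments, selected_ids, read_live_rows, build_segment):
    """Replace the selected segments with one segment rebuilt from source data."""
    selected = [s for s in segments if s.segment_id in selected_ids]
    kept = [s for s in segments if s.segment_id not in selected_ids]
    # Union of fragment coverage across the segments being replaced.
    coverage = frozenset().union(*(s.fragment_ids for s in selected))
    # Re-scan source data; deleted rows are skipped here, which is the cleanup.
    rows = [row for frag in sorted(coverage) for row in read_live_rows(frag)]
    rebuilt = build_segment(rows, coverage)
    return kept + [rebuilt]
```

Old posting lists are never read in this sketch; only their fragment coverage is used to decide what to re-scan.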
Cost is O(source_data), not O(index_size). The trade-off is simplicity: compact reuses the full build path and avoids cross-segment posting-list merge.
Delete
Segments are immutable. Deleted rows are lazily filtered at query time. Physical cleanup is performed by compact.
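Query-time filtering can be pictured with a small Python sketch. It assumes a posting list modeled as a tuple of `(row_id, term_frequency)` pairs and a set of deleted row ids; both shapes and the `live_hits` name are hypothetical, not the Lance on-disk format.

```python
def live_hits(posting_list, deleted_row_ids):
    """Return hits whose rows are still live; the posting list is not modified."""
    return [(row_id, tf) for row_id, tf in posting_list
            if row_id not in deleted_row_ids]


# The segment payload stays as written; only the query result changes.
posting_list = ((1, 2), (4, 1), (7, 3))
deleted = {4}
hits = live_hits(posting_list, deleted)
```

The dead entry for row 4 still occupies space in the segment until a compact rebuilds it.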
Open Questions
LANCE_FTS_PARTITION_SIZE (default 2 GiB) controls worker memory limits and spill cadence. Under the single-part-per-segment model, a worker that spills multiple times needs either a post-spill merge or to emit multiple segments. The sizing policy and the spill-to-segment mapping need to be co-designed.