Make perThreadHardLimitMB configurable above 2GB #15296

@punAhuja

Description

The parameter perThreadHardLimitMB must be less than 2GB (2048MB). A single indexing thread is therefore force-flushed before its in-memory buffer reaches 2GB, so it cannot write segments larger than that.
Reference: https://lucene.apache.org/core/9_9_0/core/org/apache/lucene/index/IndexWriterConfig.html#setRAMPerThreadHardLimitMB(int)
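To make the current restriction concrete, here is a minimal, self-contained sketch in plain Java (no Lucene dependency). It mirrors the documented contract of `setRAMPerThreadHardLimitMB(int)`, where the value must be below 2048MB. The class, constant, and method names are hypothetical, the positive lower bound is an assumption, and so is the explanation that the cap exists because 2048MB no longer fits in a signed 32-bit byte count.

```java
public class HardLimitSketch {
    // Hypothetical name for the documented 2048 MB cap; not copied from Lucene source.
    static final int MAX_HARD_LIMIT_MB = 2048;

    // Mirrors the documented contract: the value must be below 2048 MB.
    // The "> 0" lower bound is an assumption for this sketch.
    static boolean isAccepted(int perThreadHardLimitMB) {
        return perThreadHardLimitMB > 0 && perThreadHardLimitMB < MAX_HARD_LIMIT_MB;
    }

    // Byte count computed in 64-bit arithmetic so the multiplication cannot overflow.
    static long toBytes(int mb) {
        return mb * 1024L * 1024L;
    }

    // True if the byte count is still representable as a signed 32-bit int.
    static boolean fitsInInt(int mb) {
        return toBytes(mb) <= Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        for (int mb : new int[] {1945, 2047, 2048, 4096}) {
            System.out.println(mb + " MB: accepted=" + isAccepted(mb)
                + ", bytes=" + toBytes(mb)
                + ", fitsInInt=" + fitsInInt(mb));
        }
    }
}
```

If the cap does stem from 32-bit byte accounting, lifting it would likely require more than relaxing the argument check, since any per-thread counters or offsets held in an `int` would also need to be widened.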

This issue proposes making this parameter configurable above the 2GB limit, so that each thread can write larger segments.

When indexing high-dimensional vector data, each segment has its own HNSW graph, so more segments mean more graphs to search per shard and more graph rebuild work during merges. With this change, a single indexing thread can flush fewer, larger segments, which is generally more resource-efficient for vector-heavy workloads.
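As a rough illustration of the "fewer, larger segments" argument, the sketch below estimates how many flushes one thread would perform for a given amount of buffered data. It is a deliberately simplified model: it assumes the per-thread hard limit is the only flush trigger and that on-heap buffer size maps directly to flushed segment count, neither of which holds exactly in practice. The class and method names, and the 8192MB and 4096MB figures, are hypothetical.

```java
public class FlushCountSketch {
    // Number of flushes needed if a thread flushes each time it fills
    // hardLimitMB of buffer (ceiling division). Simplified model only.
    static long flushCount(long totalBufferedMB, long hardLimitMB) {
        if (totalBufferedMB < 0 || hardLimitMB <= 0) {
            throw new IllegalArgumentException("sizes must be non-negative and limit positive");
        }
        return (totalBufferedMB + hardLimitMB - 1) / hardLimitMB;
    }

    public static void main(String[] args) {
        long totalMB = 8192; // hypothetical amount of vector data handled by one thread
        System.out.println("limit 2047 MB -> " + flushCount(totalMB, 2047) + " flushes");
        System.out.println("limit 4096 MB -> " + flushCount(totalMB, 4096) + " flushes");
    }
}
```

Under this model each flush produces one segment with its own HNSW graph, so a higher limit directly reduces the number of graphs that later need to be searched or merged.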
