0ctopus13prime
Collaborator

Description

RFC : #2924

Enable optimistic search for memory-optimized search and deprecate MultiLeafKnnCollector, which has early-termination logic.

This PR has three big changes:

  1. Now, when memory-optimized search is enabled, all queries use NativeEngineKnnVectorQuery.
    KnnQuery only provides a ScorerSupplier and performs search within a single leaf segment (with the resulting Scorer being consumed by an external BulkScorer under the standard Lucene search flow). Optimistic search, however, requires coordination across segments: it needs to run an initial (first-phase) search, then identify and revisit only the segments likely to contain promising results.
    To support this coordinated two-phase process, NativeEngineKnnVectorQuery is a more suitable entry point than KnnQuery.

  2. Backported Lucene components required for optimistic search, specifically:
    2.1. ReentrantKnnCollectorManager
    2.2. SeededMappedDISI
    2.3. SeededTopDocsDISI
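
To illustrate the two-phase flow described in item 1, here is a minimal sketch in plain Java. It is not the code in this PR: segments are modeled as lists of scores, and `OptimisticSearchSketch`, `searchSegment`, `mergeTopK`, and the `perSegmentK` budget are hypothetical names. The rule used to pick segments for the second phase (the segment used its whole budget and its weakest hit still reaches the global k-th score) is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical, simplified model: a "segment" is just a list of similarity scores.
final class OptimisticSearchSketch {

    // Stand-in for a per-segment search: returns the best `limit` scores, highest first.
    static List<Float> searchSegment(List<Float> segment, int limit) {
        List<Float> sorted = new ArrayList<>(segment);
        sorted.sort(Comparator.reverseOrder());
        return new ArrayList<>(sorted.subList(0, Math.min(limit, sorted.size())));
    }

    // Merges per-segment results and keeps the best k scores overall, highest first.
    static List<Float> mergeTopK(List<List<Float>> perSegment, int k) {
        List<Float> all = new ArrayList<>();
        for (List<Float> hits : perSegment) {
            all.addAll(hits);
        }
        all.sort(Comparator.reverseOrder());
        return new ArrayList<>(all.subList(0, Math.min(k, all.size())));
    }

    static List<Float> optimisticSearch(List<List<Float>> segments, int k, int perSegmentK) {
        if (k <= 0) {
            return new ArrayList<>();
        }
        // Phase 1: search every segment with a reduced per-segment budget.
        List<List<Float>> results = new ArrayList<>();
        for (List<Float> segment : segments) {
            results.add(searchSegment(segment, perSegmentK));
        }
        List<Float> merged = mergeTopK(results, k);
        // Score a hit must reach to stay in the global top k (no bound if fewer than k hits).
        float threshold = merged.size() < k ? Float.NEGATIVE_INFINITY : merged.get(k - 1);

        // Phase 2: revisit only segments that used their whole budget and whose
        // weakest returned hit is still competitive (assumed selection rule).
        for (int i = 0; i < segments.size(); i++) {
            List<Float> hits = results.get(i);
            boolean usedWholeBudget = hits.size() == perSegmentK;
            if (usedWholeBudget && !hits.isEmpty() && hits.get(hits.size() - 1) >= threshold) {
                results.set(i, searchSegment(segments.get(i), k));
            }
        }
        return mergeTopK(results, k);
    }
}
```

With segments `[0.9, 0.8, 0.7, 0.6]` and `[0.5, 0.4, 0.3, 0.2]`, `k = 3`, and `perSegmentK = 2`, the first phase alone would merge to `[0.9, 0.8, 0.5]`; only the first segment is revisited, which recovers `0.7`. This kind of cross-segment decision is what a per-leaf ScorerSupplier cannot make on its own.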

Related Issues

Resolves #[Issue number to be closed when this PR is merged]
#2924

Check List

  • [O] New functionality includes testing.
  • [O] New functionality has been documented.
  • [O] API changes companion pull request created.
  • [O] Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@0ctopus13prime
Collaborator Author

@Vikasht34 @shatejas
All CI passed! Could you take a look at this when you have time?
Thank you

@Vikasht34
Collaborator

Will look into this PR tomorrow morning.

0ctopus13prime force-pushed the optimistic-srch branch 2 times, most recently from 5b006af to 1e09bf0 on October 9, 2025 05:06
);
}

/*
Collaborator Author

This logic has been moved to approximateSearch

* An immutable, empty {@link BitSet} implementation used to represent
* the absence of filter bits without incurring null checks or allocations.
*/
public static final BitSet MATCH_ALL_BIT_SET = new BitSet() {
Collaborator

Curious why we are not using Lucene's MatchAllBits or MatchNoBits?

Collaborator

+1

Collaborator Author

I could not use either, as they only implement Bits, whereas we need a BitSet here. 😵‍💫
Since optimistic search will call approximateSearch twice, we need to keep the BitSet so it can be reused.

}

@Override
public int length() {
Collaborator

Are we doing any iteration on the BitSet? This is going to break if we are.

Collaborator Author

The only place using this is when we're getting siblings in the nested case.
We don't iterate over it there.

@shatejas
Collaborator

shatejas commented Oct 13, 2025

when memory-optimized search is enabled, all queries use NativeEngineKnnVectorQuery.

There are caveats to doing this,

  1. This will completely bypass the slicing logic, consuming more CPU for concurrent segment search. The concurrency will use all threads available for search. Allowing a shard to use all CPU cores impacts other parts of the system and deviates from existing behavior for cases where rescoring is not used.
  2. This impacts the total hits count. Moving it to the shard level will always show total hits as k, whereas the segment level adds up each segment's results in the total count. This is a behavior change and can impact cases where users rely on the total hit count.
  3. This now returns k results at the shard level. While it's not a huge concern, it can affect results in the single-shard, single-segment case when k < size for non-rescoring cases.

I think 1 is a concern which needs discussion, 2 is manageable with some extra logic to keep behavior consistent, 3 isn't a big deal
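
To make caveat 2 concrete, here is a small arithmetic sketch. It is an illustration only, not code from the plugin: `TotalHitsSketch` and both method names are hypothetical, and it assumes each segment reports at most k hits and that the shard-level path caps the merged result at k.

```java
// Hypothetical illustration of how the reported total hit count differs
// between segment-level and shard-level collection.
final class TotalHitsSketch {

    // Segment level: every segment contributes up to k hits and the counts add up.
    static int segmentLevelTotalHits(int[] matchingDocsPerSegment, int k) {
        int total = 0;
        for (int docs : matchingDocsPerSegment) {
            total += Math.min(k, docs);
        }
        return total;
    }

    // Shard level: per-segment results are merged and capped at k before being reported.
    static int shardLevelTotalHits(int[] matchingDocsPerSegment, int k) {
        return Math.min(k, segmentLevelTotalHits(matchingDocsPerSegment, k));
    }
}
```

For three segments with 100, 50, and 3 matching documents and k = 10, the segment-level count is 10 + 10 + 3 = 23, while the shard-level count is 10.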

@navneet1v
Collaborator

@shatejas from a user perspective, all of this is a breaking change if we are making Lucene on Faiss the default. So this needs to be documented in the docs to clearly call out the behavior and how to mitigate it. Along with this, we should ensure that older indices stay on the same non-memory-optimized search so that upgrades are seamless.

We have already seen GH issues in the past where changes in the range of cosine scores led to issues for users. #2561

import java.util.List;

@UtilityClass
public class Optimistic2ndSearchUtils {
Collaborator

Please use a better name for this class. With 2ndSearchUtils, what is the first search utils? And why is this not part of the same class?

Collaborator Author

I think 2nd is a bit misleading. Will find a better name for that.

*/
@Log4j2
@RequiredArgsConstructor
public class ReentrantKnnCollectorManager implements KnnCollectorManager {
Collaborator

This class is copied from, or takes references from, https://github.com/apache/lucene/blob/71e822e6240878018a6ff3c28381a0d88bebdc72/lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java#L368

It would be better to mention this somewhere in this class.

Collaborator Author

sounds good, will update in the next rev.

@Setter
private KnnCollectorManager optimistic2ndKnnCollectorManager;

public static class OptimisticKnnCollectorManager implements KnnCollectorManager {
Collaborator

can we move this to a separate file?

Collaborator Author

sure, will update in the next rev.

public PerLeafResult(final Bits filterBits, final TopDocs result) {
this.filterBits = filterBits == null ? new Bits.MatchAllBits(0) : filterBits;
// Indicates whether this result was produced via exact or approximate search.
private final SearchMode searchMode;
Collaborator

what is the use of this parameter?

Collaborator Author

Optimistic search does a deep-dive HNSW search using the acquired top k results as seeds. This is not needed if the results were acquired via exact search. Hence the search mode is kept here, so the second search can be bypassed when possible.
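
A minimal sketch of that bypass, under stated assumptions: `SearchModeSketch`, `LeafResult`, `leavesToRevisit`, and the enum constants below are hypothetical stand-ins, not the actual PerLeafResult API. The sketch only shows that leaves answered by exact search are filtered out before the second, seeded search.

```java
import java.util.ArrayList;
import java.util.List;

final class SearchModeSketch {

    // Hypothetical constants; the real enum in the PR may use different names.
    enum SearchMode { EXACT, APPROXIMATE }

    // Hypothetical, reduced stand-in for a per-leaf result: only what the sketch needs.
    static final class LeafResult {
        final int leafOrd;
        final SearchMode searchMode;

        LeafResult(int leafOrd, SearchMode searchMode) {
            this.leafOrd = leafOrd;
            this.searchMode = searchMode;
        }
    }

    // Returns the ordinals of leaves that remain candidates for the second search.
    // Exact results are already complete for their leaf, so they are skipped.
    static List<Integer> leavesToRevisit(List<LeafResult> firstPhase) {
        List<Integer> candidates = new ArrayList<>();
        for (LeafResult result : firstPhase) {
            if (result.searchMode == SearchMode.APPROXIMATE) {
                candidates.add(result.leafOrd);
            }
        }
        return candidates;
    }
}
```

Given first-phase results for leaves 0 (exact), 1 (approximate), and 2 (approximate), only leaves 1 and 2 remain candidates for the second search.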

* @return a {@link TopDocs} object containing the top {@code k} approximate search results
* @throws IOException if an error occurs while reading index data or accessing vector fields
*/
public TopDocs approximateSearch(
Collaborator

Why are we making this function public?

Collaborator Author

We need this particular function in the optimistic second search. Otherwise, if we used searchLeaf, we would end up building the filter BitSet twice.

@navneet1v
Collaborator

navneet1v commented Oct 13, 2025

We are bringing in a lot of code from Lucene. Please mention the source of the code for better maintainability.

One way I can think of is to move the classes to org.opensearch.lucene to ensure that we know these classes are merely copied from or inspired by Lucene.

Collaborator

@Vikasht34 left a comment

Looks good! Clean code and very concise! Thanks

4 participants