prudhvigodithi changed the title from "[BUG] Retain the default totalHitsThreshold for approximated match_all queries" to "Retain the default totalHitsThreshold for approximated match_all queries" on May 5, 2025
prudhvigodithi changed the title from "Retain the default totalHitsThreshold for approximated match_all queries" to "Improve performance for approximated match_all sort queries" on May 6, 2025
Describe the bug
Update the shortcutTotalHitCount logic so that it identifies the query as MatchAllDocsQuery.class.
Today, with approximation, match_all is converted to a range query. As a result, the totalHitsThreshold coming from the Lucene TopFieldCollector is changed to 10k.
For match_all the threshold should be 10 (the numHits value), which comes from TopDocsCollectorContext in OpenSearch.
With totalHitsThreshold at 10k, the large threshold delays the updateCompetitiveIterator step of the Lucene NumericComparator and forces a comparison of all 10k docs. With the default of 10, the competitive iterator would update early and could eliminate some of those 10k docs.
This fixes the inconsistency because the total hit count now correctly includes all documents that would match a true match_all query, even when the query has been optimized into a range query on the sort field.
Should fix the [AUTOCUT] Gradle Check Flaky Test Report for SimpleSearchIT #16851
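To illustrate why the threshold matters, here is a minimal toy model of the effect described above. It is not Lucene's TopFieldCollector or NumericComparator code: the class `ThresholdSketch`, the method `collectedDocs`, and the pruning rule are all hypothetical stand-ins, assuming that skipping of non-competitive docs can only begin once more than totalHitsThreshold hits have been counted.

```java
import java.util.Collections;
import java.util.PriorityQueue;

/** Toy model only: NOT Lucene's TopFieldCollector or NumericComparator. */
final class ThresholdSketch {

    /**
     * Returns how many docs reach the collector for an ascending sort.
     * Assumption: pruning (a stand-in for updateCompetitiveIterator) starts
     * only after more than totalHitsThreshold hits have been counted.
     */
    static int collectedDocs(long[] sortValues, int numHits, int totalHitsThreshold) {
        // Max-heap that keeps the numHits smallest sort values seen so far.
        PriorityQueue<Long> top = new PriorityQueue<>(Collections.reverseOrder());
        int hits = 0;
        int collected = 0;
        for (long value : sortValues) {
            boolean canPrune = hits > totalHitsThreshold && top.size() == numHits;
            if (canPrune && value >= top.peek()) {
                continue; // non-competitive doc: skipped, never collected
            }
            hits++;
            collected++;
            if (top.size() < numHits) {
                top.add(value);
            } else if (value < top.peek()) {
                top.poll();      // evict the current worst of the top numHits
                top.add(value);
            }
        }
        return collected;
    }

    public static void main(String[] args) {
        long[] values = new long[100_000];
        for (int i = 0; i < values.length; i++) {
            values[i] = i; // docs already in ascending sort order
        }
        System.out.println("threshold 10:    " + collectedDocs(values, 10, 10));
        System.out.println("threshold 10000: " + collectedDocs(values, 10, 10_000));
    }
}
```

In this model the smaller threshold lets pruning start almost immediately, while the 10k threshold forces roughly the first 10k docs through the collector before any doc can be skipped. Real behavior depends on segment layout and the points index, so treat this only as intuition for the change.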
Related component
Search:Performance
To Reproduce
N/A
Expected behavior
This should improve performance for match_all queries that go through approximation, because the Lucene competitive iterator would trigger early. Benchmark results: #18189 (comment).
This change should also bring the behavior in line with what users expect when running a match_all query with sorting: the results include documents that are missing the sort field.