[Performance Optimization] Shard level search - optimize for special cases #1236

martin-gaievski · 2025-03-19T20:48:42Z

For hybrid query execution there are some special cases where we can improve performance. While this does not affect every query, in niche scenarios the impact can be big.

Some scenarios I can think of:

Multiple sub-queries that are identical after being rewritten. We can skip execution of the second query and just copy the results from the first execution. This is equivalent to retrieving results from a cache instead of executing the same query again.
Keeping track of the min score threshold: once we have `size` docs, don't take any docs with a score lower than the threshold. This optimization lets us skip adding the score to, and kicking out an element from, the min heap of collected scores, which in the case of broad queries can be a huge saving (e.g. a query with 10M potentially matching docs executing in an index with 100 shards is about 100,000 docs per shard -> saving up to roughly 100,000 min heap operations on each shard).
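To make the two scenarios concrete, here is a minimal Python sketch. It is only an illustration, not the plugin's actual code: `run_sub_queries`, `collect_top_k`, the `execute` callback, and the `(score, doc_id)` hit tuples are all hypothetical names, and it assumes that rewritten sub-queries can be compared for equality (are hashable).

```python
import heapq
from typing import Callable, Dict, Hashable, Iterable, List, Sequence, Tuple

Hit = Tuple[float, int]  # (score, doc_id); hypothetical representation of a hit


def run_sub_queries(
    rewritten: Sequence[Hashable],
    execute: Callable[[Hashable], List[Hit]],
) -> List[List[Hit]]:
    """Execute each distinct rewritten sub-query once and reuse its hits."""
    cache: Dict[Hashable, List[Hit]] = {}
    results: List[List[Hit]] = []
    for query in rewritten:
        if query not in cache:
            # First time we see this rewritten query: actually run it.
            cache[query] = execute(query)
        # Copy so each sub-query owns an independent result list.
        results.append(list(cache[query]))
    return results


def collect_top_k(hits: Iterable[Hit], size: int) -> Tuple[List[Hit], int]:
    """Keep the `size` best hits; also return how many heap updates were made."""
    if size <= 0:
        return [], 0
    heap: List[Hit] = []  # min-heap; heap[0] holds the current score threshold
    heap_ops = 0
    for score, doc_id in hits:
        if len(heap) < size:
            # Heap not full yet: every hit is collected.
            heapq.heappush(heap, (score, doc_id))
            heap_ops += 1
        elif score > heap[0][0]:
            # Competitive hit: replace the current minimum in one operation.
            heapq.heapreplace(heap, (score, doc_id))
            heap_ops += 1
        # Otherwise the score is at or below the threshold: skip the hit
        # without touching the heap at all.
    return sorted(heap, reverse=True), heap_ops
```

In this sketch a hit whose score merely ties the threshold is skipped as well; whether ties should be treated as non-competitive in the real collector is a design decision this example does not settle.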

martin-gaievski added enhancement untriaged hybrid search hybrid query performance optimization and removed enhancement labels Mar 19, 2025

martin-gaievski mentioned this issue Mar 19, 2025

[META] Advanced Optimization Techniques for Hybrid query #783

Open

heemin32 removed the untriaged label Mar 20, 2025
