@leoyu808 leoyu808 commented Aug 11, 2025

Description

Currently, Exact Search iterates serially through all documents at the segment level. Because each document's score is computed independently of the other documents in the segment, the scoring work can be parallelized to reduce the latency of Exact Search queries.
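
As a rough illustration of the idea (not the plugin's actual code), the sketch below splits a segment's documents into contiguous partitions, scores each partition on a thread pool, and merges the per-partition top-k results. The function names, the squared-L2 scorer, and the in-memory list of vectors are all assumptions made for this example.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def l2_squared(a, b):
    """Squared Euclidean distance; a stand-in for whatever scorer the index uses."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def score_partition(vectors, query, start, end, k):
    """Return the k nearest (distance, doc_id) pairs among doc ids [start, end)."""
    scored = ((l2_squared(vectors[i], query), i) for i in range(start, end))
    return heapq.nsmallest(k, scored)


def parallel_exact_search(vectors, query, k, partitions, workers=4):
    """Score each partition on a thread pool, then merge the partial top-k lists."""
    n = len(vectors)
    # Never use more partitions than documents, and always use at least one.
    partitions = max(1, min(partitions, n))
    # Evenly spaced boundaries: partition p covers [bounds[p], bounds[p + 1]).
    bounds = [n * p // partitions for p in range(partitions + 1)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(
            lambda span: score_partition(vectors, query, span[0], span[1], k),
            zip(bounds, bounds[1:]),
        )
        merged = [hit for partial in partials for hit in partial]
    # Each partition kept its own best k, so the global best k is among them.
    return heapq.nsmallest(k, merged)
```

The result matches a serial scan because every partition returns its own k best candidates before the final merge. In CPython a pure-Python scorer will not actually run faster on threads; the sketch only shows the partition-and-merge structure.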

Related Issues

Resolves #2326

Key Changes

This PR introduces a new setting, knn.search.concurrent_exact_search.enabled, which controls whether concurrent exact search is enabled. Segment partitioning is controlled by two further settings: knn.search.concurrent_exact_search.max_partition_count, the maximum number of partitions a segment can be divided into, and knn.search.concurrent_exact_search.min_document_count, the minimum number of documents in each partition.
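
The description does not spell out how the two partitioning settings combine. One plausible reading, offered here only as an assumption, is that a segment is split into as many partitions as max_partition_count allows while keeping at least min_document_count documents in each. The helper below (hypothetical name plan_partitions) sketches that reading.

```python
def plan_partitions(doc_count, max_partition_count, min_document_count):
    """Return [start, end) doc-id ranges under the assumed partitioning rule.

    Assumption (not confirmed by the PR): use as many partitions as
    max_partition_count allows, but never so many that a partition would
    hold fewer than min_document_count documents. A segment too small to
    split stays in a single partition.
    """
    if doc_count <= 0:
        return []
    # Largest partition count that still leaves min_document_count docs in each.
    by_size = doc_count // max(1, min_document_count)
    count = max(1, min(max_partition_count, by_size))
    # Evenly spaced boundaries over the doc-id range.
    bounds = [doc_count * p // count for p in range(count + 1)]
    return list(zip(bounds, bounds[1:]))
```

Under this reading, a segment of 1,000,000 documents with a maximum of 8 partitions and a minimum of 250,000 documents per partition would be split into 4 ranges, while a 100-document segment would not be split at all.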

Benchmarking

Cluster Setup

  • 3 data nodes (r5.4xlarge: 128 GB RAM, 16 vCPUs, 250 GB disk space)
  • 3 cluster manager nodes (r5.xlarge: 32 GB RAM, 4 vCPUs, 50 GB disk space)
  • 1 OpenSearch workload client (c5.4xlarge: 32 GB RAM, 16 vCPUs)
  • 1 and 4 search clients
  • exact_searcher thread pool size: 32

Dataset

Cohere-768 10 million

Force-merged Segments

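| Setting | # Clients | p50 | p90 | p99 | QPS | Average CPU | Max CPU |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Disabled | 1 | 618.522 | 645.270 | 667.310 | 1.594 | 19.576 | 12.038 |
| Disabled | 4 | 819.820 | 930.547 | 981.158 | 4.773 | 43.331 | 44.903 |
| Max Partition Count = 2 | 1 | 343.678 | 372.461 | 436.045 | 2.830 | 22.726 | 27.335 |
| Max Partition Count = 2 | 4 | 733.338 | 935.996 | 1,031.100 | 5.277 | 58.779 | 65.872 |
| Max Partition Count = 4 | 1 | 255.249 | 274.241 | 288.236 | 3.834 | 41.955 | 43.582 |
| Max Partition Count = 4 | 4 | 685.483 | 862.162 | 1,086.680 | 5.650 | 70.849 | 74.306 |
| Max Partition Count = 8 | 1 | 265.866 | 273.005 | 280.516 | 3.676 | 46.031 | 46.983 |
| Max Partition Count = 8 | 4 | 593.130 | 801.366 | 957.282 | 6.421 | 82.734 | 86.500 |
| Max Documents = 250k | 1 | 267.034 | 278.076 | 288.622 | 3.656 | 43.377 | 44.778 |
| Max Documents = 250k | 4 | 622.086 | 848.059 | 1,007.000 | 6.033 | 75.565 | 81.177 |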
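
To make the single-client numbers in the force-merged table easier to compare, the snippet below divides the baseline p50 by each configuration's p50. The values are copied from the table; the variable names are only for illustration.

```python
# Single-client p50 values copied from the force-merged table.
baseline_p50 = 618.522  # concurrent exact search disabled
p50_by_setting = {
    "Max Partition Count = 2": 343.678,
    "Max Partition Count = 4": 255.249,
    "Max Partition Count = 8": 265.866,
    "Max Documents = 250k": 267.034,
}

# Speedup = baseline p50 / p50 with the setting applied.
speedups = {name: baseline_p50 / p50 for name, p50 in p50_by_setting.items()}

for name, speedup in speedups.items():
    print(f"{name}: {speedup:.2f}x")
```

By this measure, four partitions give roughly a 2.4x reduction in single-client p50 relative to the disabled baseline, and going to eight partitions does not improve on that.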

Unmerged Segments + Concurrent Segment Search Enabled

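| Setting | # Clients | p50 | p90 | p99 | QPS | Average CPU | Max CPU |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Disabled | 1 | 291.208 | 318.081 | 333.744 | 3.359 | 38.482 | 42.375 |
| Disabled | 4 | 826.396 | 1,028.250 | 1,180.420 | 4.635 | 50.233 | 66.450 |
| Max Partition Count = 2 | 1 | 356.858 | 408.199 | 426.813 | 2.705 | 27.873 | 38.450 |
| Max Partition Count = 2 | 4 | 709.455 | 935.437 | 1,073.320 | 5.420 | 65.523 | 71.586 |
| Max Partition Count = 4 | 1 | 291.902 | 323.875 | 339.293 | 3.322 | 39.277 | 41.271 |
| Max Partition Count = 4 | 4 | 716.514 | 906.047 | 1,051.050 | 5.495 | 69.724 | 75.092 |
| Max Partition Count = 8 | 1 | 269.799 | 281.195 | 294.170 | 3.610 | 45.296 | 46.131 |
| Max Partition Count = 8 | 4 | 637.933 | 847.845 | 999.826 | 5.988 | 77.285 | 81.226 |
| Max Documents = 250k | 1 | 290.144 | 318.616 | 335.098 | 3.378 | 38.890 | 43.103 |
| Max Documents = 250k | 4 | 725.721 | 932.955 | 1,127.260 | 5.348 | 67.210 | 70.535 |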

Unmerged Segments + Concurrent Segment Search Disabled

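| Setting | # Clients | p50 | p90 | p99 | QPS | Average CPU | Max CPU |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Disabled | 1 | 611.117 | 626.083 | 638.975 | 1.616 | 12.186 | 18.609 |
| Disabled | 4 | 823.453 | 934.623 | 988.102 | 4.745 | 42.975 | 44.665 |
| Max Partition Count = 2 | 1 | 370.850 | 411.229 | 433.008 | 2.616 | 21.786 | 30.868 |
| Max Partition Count = 2 | 4 | 718.453 | 909.039 | 1,014.950 | 5.348 | 62.645 | 67.959 |
| Max Partition Count = 4 | 1 | 258.018 | 284.487 | 300.277 | 3.745 | 40.098 | 44.997 |
| Max Partition Count = 4 | 4 | 694.374 | 877.111 | 1,003.660 | 5.571 | 69.750 | 74.533 |
| Max Partition Count = 8 | 1 | 267.097 | 275.666 | 284.184 | 3.960 | 45.344 | 46.259 |
| Max Partition Count = 8 | 4 | 731.418 | 922.914 | 1,050.730 | 5.384 | 68.569 | 71.169 |
| Max Documents = 250k | 1 | 600.071 | 658.401 | 677.558 | 1.962 | 20.310 | 38.408 |
| Max Documents = 250k | 4 | 825.033 | 1,015.600 | 1,179.660 | 4.689 | 52.877 | 68.186 |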

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@shatejas
Collaborator

@leoyu808 can we switch this to feature branch instead of main

@leoyu808 leoyu808 changed the base branch from main to feature/concurrent-exact-search August 21, 2025 19:29
@leoyu808
Author

@leoyu808 can we switch this to feature branch instead of main

done

Development

Successfully merging this pull request may close these issues.

[FEATURE] Parallelize Exact Search for vector indices in a segment
