[FEATURE] Support ANN Queries in _msearch for Multi-Collection Vector Search #2570

Open
genandre opened this issue Feb 28, 2025 · 0 comments

Is your feature request related to a problem?

Currently, OpenSearch does not support ann queries inside _msearch. In other words, there is no way to run approximate nearest neighbor (ANN) searches against multiple indices in a single request.

This limitation makes it difficult to efficiently retrieve vector search results from multiple collections, requiring multiple separate queries instead. This increases query complexity, latency, and response merging overhead in applications that need multi-collection vector search.


What solution would you like?

I would like OpenSearch to support ann queries inside _msearch, so users can send multiple ann searches across different indices in a single batch request, similar to how _msearch works for traditional queries.

This feature should allow:

  • Executing multiple ann searches in one request (just like _msearch does for match, term, and script_score queries).
  • Efficiently retrieving results from multiple indices without requiring separate requests.
  • Merging results from multiple indices with minimal performance overhead.
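
To make the request concrete, here is a minimal sketch of what a batched payload could look like. The NDJSON layout (one header line naming the index, then one body line per search, with a trailing newline) is the existing _msearch format. The "ann" clause, the "embedding" field, and the index names "articles" and "images" are assumptions for illustration only; the actual query syntax would be up to the maintainers.

```python
import json


def build_msearch_body(searches):
    """Serialize (index, query_body) pairs into the NDJSON payload _msearch expects."""
    lines = []
    for index, body in searches:
        lines.append(json.dumps({"index": index}))  # header line: target index
        lines.append(json.dumps(body))              # body line: the search itself
    return "\n".join(lines) + "\n"                  # payload must end with a newline


# Hypothetical query shape: the "ann" clause and "embedding" field are
# placeholders for whatever syntax the feature would actually use.
query_vector = [0.1, 0.2, 0.3]
searches = [
    (name, {"size": 5, "query": {"ann": {"embedding": {"vector": query_vector, "k": 5}}}})
    for name in ("articles", "images")
]
payload = build_msearch_body(searches)
print(payload)
```

The resulting string would be sent as the body of a single POST to _msearch, giving one response entry per header/body pair.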

What alternatives have you considered?

Since ann is not currently supported in _msearch, the following workarounds have been explored:

  1. Sending separate ANN _search queries per index and merging results client-side.

    • 🚀 Fast, but requires extra logic in the application to merge responses manually.
    • Increases request overhead due to multiple network calls.
  2. Reindexing all collections into a single index and filtering with a type field.

    • Works well for some cases, but does not scale well if collections are large and frequently updated.
    • Requires additional storage and maintenance overhead.
  3. Using _msearch with script_score instead of ann for vector similarity.

    • Works in _msearch, but significantly slower than ann (brute-force computation instead of ANN indexing).

None of these alternatives fully solve the problem in an efficient and scalable way.
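
For reference, the client-side merge from workaround 1 looks roughly like the sketch below. It assumes each per-index response has the usual hits.hits list with _index, _id, and _score, and that scores are comparable across indices (same embedding model and similarity metric), which is not guaranteed in general. The function name merge_hits and the sample data are made up for illustration.

```python
import heapq


def merge_hits(responses, k):
    """Merge per-index hit lists into one top-k list, highest score first."""
    all_hits = (
        hit
        for response in responses
        for hit in response["hits"]["hits"]
    )
    # nlargest avoids sorting every hit when only the top k are needed
    return heapq.nlargest(k, all_hits, key=lambda hit: hit["_score"])


# Illustrative responses from two separate _search calls (made-up data)
responses = [
    {"hits": {"hits": [
        {"_index": "articles", "_id": "a1", "_score": 0.92},
        {"_index": "articles", "_id": "a2", "_score": 0.71},
    ]}},
    {"hits": {"hits": [
        {"_index": "images", "_id": "i1", "_score": 0.85},
    ]}},
]
top = merge_hits(responses, k=2)
print([(hit["_index"], hit["_id"]) for hit in top])
```

This is the kind of extra application logic that a batched request would not remove by itself, but a single round trip would at least cut the per-request network overhead.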


Do you have any additional context?

  • Elasticsearch also does not support ann in _msearch, and OpenSearch could introduce this as a unique advantage.
  • The ability to batch vector searches would be highly beneficial for multi-collection vector search use cases, such as:
    • Searching across multiple document repositories in a single request.
    • Performing multi-modal search (e.g., combining text, image, or other embeddings from different sources).
    • Improving efficiency in real-time recommendation systems that rely on fast ANN lookups.

@genandre genandre changed the title [FEATURE] Support knn Queries in _msearch for Multi-Collection Vector Search [FEATURE] Support ANN Queries in _msearch for Multi-Collection Vector Search Mar 3, 2025