
Conversation

junqiu-lei (Member) commented Aug 19, 2025

Description

This PR adds batch inference support to semantic highlighting while keeping the existing API fully backward compatible.

| Mode | Handler | When Used | ML Calls |
|------|---------|-----------|----------|
| Single Inference (default) | SemanticHighlighter | `batch_inference: false` or not specified | 1 per document |
| Batch Inference (new) | SemanticHighlightingProcessor | `batch_inference: true` + system processor enabled | 1 per batch |
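The cost difference between the two modes can be sketched as follows. This is an illustrative Python sketch, not code from the PR; the `batch_size` parameter is an assumption introduced here to show how one-call-per-batch scales, since the table only states "1 per batch".

```python
def ml_calls(num_docs: int, batch_inference: bool, batch_size: int) -> int:
    """Return the number of ML inference calls needed to highlight num_docs.

    Hypothetical model of the table above: single mode makes one call per
    document; batch mode makes one call per batch of documents.
    """
    if not batch_inference:
        return num_docs                 # 1 call per document
    return -(-num_docs // batch_size)   # ceil(num_docs / batch_size), 1 per batch

# For a 10-document page of results:
print(ml_calls(10, batch_inference=False, batch_size=10))  # prints 10
print(ml_calls(10, batch_inference=True, batch_size=10))   # prints 1
```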

Highlight API

  • Existing queries continue to work with zero changes:
  {
    "highlight": {
      "fields": {
        "content": { "type": "semantic" }
      },
      "options": {
        "model_id": "my-model"
      }
    }
  }
  • New batch mode:
  1. Enable in cluster settings (one-time setup):
  search.pipeline.enabled_system_generated_factories: ["semantic-highlighter"]

  2. Add one flag to your query:
  {
    "highlight": {
      "fields": {
        "content": { "type": "semantic" }
      },
      "options": {
        "model_id": "my-model",
        "batch_inference": true
      }
  }
  }

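The only client-side change between the two modes is the single flag shown above. A minimal Python sketch (the `highlight_request` helper is hypothetical; the field names match the PR's examples):

```python
def highlight_request(model_id: str, batch_inference: bool = False) -> dict:
    """Build the highlight portion of a search request body.

    Batch mode adds exactly one flag to the options; everything else
    is unchanged from the existing single-inference query shape.
    """
    options = {"model_id": model_id}
    if batch_inference:
        options["batch_inference"] = True
    return {
        "highlight": {
            "fields": {"content": {"type": "semantic"}},
            "options": options,
        }
    }

single = highlight_request("my-model")
batch = highlight_request("my-model", batch_inference=True)
# The two bodies differ only in the batch_inference flag.
```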
Processing Flow

  graph TD
      A[Search Request with Semantic Highlighting] --> B{batch_inference?}
      B -->|false/undefined| C[SemanticHighlighter]
      B -->|true| D{System Processor Enabled?}
      D -->|No| E[Throw Clear Error]
      D -->|Yes| F[SemanticHighlighter returns null]
      F --> G[SemanticHighlightingProcessor]
      C --> H[Single ML Inference per Doc]
      G --> I[Batch ML Inference]
      H --> J[Apply Highlights]
      I --> J
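The dispatch in the diagram above can be mirrored in a short sketch. This is hypothetical Python for illustration only (the actual plugin is Java, and these function and message names are invented here):

```python
def dispatch(batch_inference: bool, system_processor_enabled: bool) -> str:
    """Mirror the flow diagram: pick a handler for a semantic-highlight request."""
    if not batch_inference:
        # false/undefined branch: existing per-document path, unchanged.
        return "SemanticHighlighter: one ML inference per document"
    if not system_processor_enabled:
        # "Throw Clear Error" branch: batch mode was requested but the
        # semantic-highlighter system factory is not enabled in cluster settings.
        raise ValueError(
            "batch_inference is true but the semantic-highlighter "
            "system-generated processor is not enabled"
        )
    # Highlighter returns null; the response processor batches the inference.
    return "SemanticHighlightingProcessor: one ML inference per batch"
```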

Remote model integration test

This PR also introduces a TorchServe framework for remote-model integration tests. The framework is extensible to host other models (such as embedding models and LLMs) as well. An example run is linked here, and DEVELOPER_GUIDE.md has been updated accordingly.

Related Issues

Resolves #1516

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@github-actions github-actions bot added enhancement v3.3.0 Issues targeting release v3.3.0 labels Aug 19, 2025
@junqiu-lei junqiu-lei self-assigned this Aug 19, 2025
@junqiu-lei junqiu-lei force-pushed the highlighting-processor branch from 8e8c0f2 to 006d54a on August 19, 2025 at 20:11
@junqiu-lei junqiu-lei changed the title Support semantic highlighting with processor Add semantic highlighting response processor with batch inference support Aug 19, 2025
junqiu-lei (Member Author) commented:

I haven't added an integration test yet, since batch inference is only supported with remote models. We could run a remote-model integration test via one of these options:

  1. Use a locally hosted model in CI, similar to Add local Ollama setup and testing scripts for native OpenAI API compatibility search-relevance#173
  2. Use a pre-defined .env from [GitHub Request] Create github env for running remote integration tests for neural search .github#372 to run against an endpoint deployed on AWS SageMaker.

heemin32 (Collaborator) commented:

> I haven't added an integration test yet, since batch inference is only supported with remote models. We could run a remote-model integration test via one of these options:
>
>   1. Use a locally hosted model in CI, similar to Add local Ollama setup and testing scripts for native OpenAI API compatibility search-relevance#173
>   2. Use a pre-defined .env from [GitHub Request] Create github env for running remote integration tests for neural search .github#372 to run against an endpoint deployed on AWS SageMaker.

Could we still test single highlighting functionality with a local model?

junqiu-lei (Member Author) commented:

> I haven't added an integration test yet, since batch inference is only supported with remote models. We could run a remote-model integration test via one of these options:
>
>   1. Use a locally hosted model in CI, similar to Add local Ollama setup and testing scripts for native OpenAI API compatibility search-relevance#173
>   2. Use a pre-defined .env from [GitHub Request] Create github env for running remote integration tests for neural search .github#372 to run against an endpoint deployed on AWS SageMaker.

> Could we still test single highlighting functionality with a local model?

Yes, we can still test single highlighting functionality with a local model; the main new functionality is the batch capability, which requires a remote model.

…ompatibility for local and remote models

Signed-off-by: Junqiu Lei <[email protected]>
…e it on github remote model CI

Signed-off-by: Junqiu Lei <[email protected]>
vibrantvarun (Member) commented:

LGTM

@junqiu-lei junqiu-lei merged commit d2b6347 into opensearch-project:main Sep 30, 2025
71 of 72 checks passed

Labels

enhancement · Roadmap:Vector Database/GenAI (project-wide roadmap label) · semantic-highlighting · v3.3.0 (Issues targeting release v3.3.0)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[FEATURE] Batch Inference Support for Semantic Highlighting

6 participants