
Conversation

junqiu-lei (Member) commented Aug 19, 2025

Description

This PR adds batch inference support to semantic highlighting while keeping the existing API fully backward compatible.

| Mode | Handler | When Used | ML Calls |
|------|---------|-----------|----------|
| Single Inference (default) | SemanticHighlighter | `batch_inference: false` or not specified | 1 per document |
| Batch Inference (new) | SemanticHighlightingProcessor | `batch_inference: true` + system processor enabled | 1 per batch |
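The cost difference between the two modes can be sketched as follows. This is an illustrative Python sketch, not code from the PR; the `batch_size` parameter is an assumption introduced here to show how one-call-per-batch scales, since the table only states "1 per batch".

```python
def ml_calls(num_docs: int, batch_inference: bool, batch_size: int) -> int:
    """Return the number of ML inference calls needed to highlight num_docs.

    Hypothetical model of the table above: single mode makes one call per
    document; batch mode makes one call per batch of documents.
    """
    if not batch_inference:
        return num_docs                 # 1 call per document
    return -(-num_docs // batch_size)   # ceil(num_docs / batch_size), 1 per batch

# For a 10-document page of results:
print(ml_calls(10, batch_inference=False, batch_size=10))  # prints 10
print(ml_calls(10, batch_inference=True, batch_size=10))   # prints 1
```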

Highlight API

  • Existing queries continue to work with zero changes:
  {
    "highlight": {
      "fields": {
        "content": { "type": "semantic" }
      },
      "options": {
        "model_id": "my-model"
      }
    }
  }
  • New batch mode:
  1. Enable in cluster settings (one-time setup):
  search.pipeline.enabled_system_generated_factories: ["semantic-highlighter"]

  2. Add one flag to your query:
  {
    "highlight": {
      "fields": {
        "content": { "type": "semantic" }
      },
      "options": {
        "model_id": "my-model",
        "batch_inference": true
      }
  }
  }

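The only client-side change between the two modes is the single flag shown above. A minimal Python sketch (the `highlight_request` helper is hypothetical; the field names match the PR's examples):

```python
def highlight_request(model_id: str, batch_inference: bool = False) -> dict:
    """Build the highlight portion of a search request body.

    Batch mode adds exactly one flag to the options; everything else
    is unchanged from the existing single-inference query shape.
    """
    options = {"model_id": model_id}
    if batch_inference:
        options["batch_inference"] = True
    return {
        "highlight": {
            "fields": {"content": {"type": "semantic"}},
            "options": options,
        }
    }

single = highlight_request("my-model")
batch = highlight_request("my-model", batch_inference=True)
# The two bodies differ only in the batch_inference flag.
```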
Processing Flow

  graph TD
      A[Search Request with Semantic Highlighting] --> B{batch_inference?}
      B -->|false/undefined| C[SemanticHighlighter]
      B -->|true| D{System Processor Enabled?}
      D -->|No| E[Throw Clear Error]
      D -->|Yes| F[SemanticHighlighter returns null]
      F --> G[SemanticHighlightingProcessor]
      C --> H[Single ML Inference per Doc]
      G --> I[Batch ML Inference]
      H --> J[Apply Highlights]
      I --> J
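The dispatch in the diagram above can be mirrored in a short sketch. This is hypothetical Python for illustration only (the actual plugin is Java, and these function and message names are invented here):

```python
def dispatch(batch_inference: bool, system_processor_enabled: bool) -> str:
    """Mirror the flow diagram: pick a handler for a semantic-highlight request."""
    if not batch_inference:
        # false/undefined branch: existing per-document path, unchanged.
        return "SemanticHighlighter: one ML inference per document"
    if not system_processor_enabled:
        # "Throw Clear Error" branch: batch mode was requested but the
        # semantic-highlighter system factory is not enabled in cluster settings.
        raise ValueError(
            "batch_inference is true but the semantic-highlighter "
            "system-generated processor is not enabled"
        )
    # Highlighter returns null; the response processor batches the inference.
    return "SemanticHighlightingProcessor: one ML inference per batch"
```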

Remote model integration test

This PR also introduces a TorchServe framework for remote-model integration tests. The framework is extensible to host other models (such as embedding models and LLMs) as well. An example run is linked here, and DEVELOPER_GUIDE.md has been updated accordingly.

Related Issues

Resolves #1516

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@github-actions github-actions bot added enhancement v3.3.0 Issues targeting release v3.3.0 labels Aug 19, 2025
@junqiu-lei junqiu-lei self-assigned this Aug 19, 2025
@junqiu-lei junqiu-lei force-pushed the highlighting-processor branch from 8e8c0f2 to 006d54a on August 19, 2025 at 20:11
@junqiu-lei junqiu-lei changed the title Support semantic highlighting with processor Add semantic highlighting response processor with batch inference support Aug 19, 2025
junqiu-lei (Member Author) commented:

I haven't added an integration test yet, since batch inference is only supported with remote models. We could run a remote-model integration test via one of these options:

  1. Use a locally hosted model in CI, similar to Add local Ollama setup and testing scripts for native OpenAI API compatibility search-relevance#173
  2. Use a pre-defined .env from [GitHub Request] Create github env for running remote integration tests for neural search .github#372 to run against an endpoint deployed on AWS SageMaker.

heemin32 (Collaborator) commented:

> I haven't added an integration test yet, since batch inference is only supported with remote models. We could run a remote-model integration test via one of these options:
>
>   1. Use a locally hosted model in CI, similar to Add local Ollama setup and testing scripts for native OpenAI API compatibility search-relevance#173
>   2. Use a pre-defined .env from [GitHub Request] Create github env for running remote integration tests for neural search .github#372 to run against an endpoint deployed on AWS SageMaker.

Could we still test single highlighting functionality with a local model?

junqiu-lei (Member Author) commented:

> I haven't added an integration test yet, since batch inference is only supported with remote models. We could run a remote-model integration test via one of these options:
>
>   1. Use a locally hosted model in CI, similar to Add local Ollama setup and testing scripts for native OpenAI API compatibility search-relevance#173
>   2. Use a pre-defined .env from [GitHub Request] Create github env for running remote integration tests for neural search .github#372 to run against an endpoint deployed on AWS SageMaker.

> Could we still test single highlighting functionality with a local model?

Yes, we can still test single highlighting functionality with a local model; the main new functionality is the batch capability, which requires a remote model.

…ompatibility for local and remote models

Signed-off-by: Junqiu Lei <[email protected]>
…e it on github remote model CI

Signed-off-by: Junqiu Lei <[email protected]>
vibrantvarun (Member) commented:

LGTM

@junqiu-lei junqiu-lei merged commit d2b6347 into opensearch-project:main Sep 30, 2025
71 of 72 checks passed

Labels

enhancement · Roadmap:Vector Database/GenAI (project-wide roadmap label) · semantic-highlighting · v3.3.0 (Issues targeting release v3.3.0)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[FEATURE] Batch Inference Support for Semantic Highlighting

6 participants