[FEATURE] Indexing with offline batch inference #1235

Open
@heemin32

Description

Is your feature request related to a problem?

In neural search, users want to minimize the cost and time of generating embeddings for large datasets. OSI addressed this with an offline batch inference solution, which uses batch processing to improve both cost and performance. In that flow, OSI creates the input file, uploads it to S3, invokes the ML Commons API, monitors the inference job until it completes, retrieves the results from S3, and finally parses and ingests the data into OpenSearch. See opensearch-project/ml-commons#2891.

However, many customers do not use OSI during the ingestion process. To accommodate them, we can offer an offline batch inference option that does not require OSI.

I created a similar issue in ml-commons: opensearch-project/ml-commons#3428.
I am opening this one here for better visibility and to gather community feedback.

What solution would you like?

For example, customers could create an index and ingest plain-text data, and we would provide an API that generates embeddings using the offline batch inference component. The process would work as follows:

  1. The customer creates an index.
  2. The customer ingests documents containing plain text.
  3. The customer triggers an API to populate the embedding field, either as a one-time process or on a scheduled interval:
    1. Retrieve documents from the target index that lack embeddings or have outdated embeddings (possibly detected using a timestamp?).
    2. Create a file and upload it to S3.
    3. Call ML Commons to perform offline batch inference.
    4. Retrieve the processed file from S3.
    5. Populate the index with the generated embeddings.
  4. The customer sees the embeddings successfully populated in the index.
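
To make the sub-steps of step 3 more concrete, here is a minimal Python sketch of the selection, file-building, and write-back logic. Everything in it is an assumption for discussion, not a proposed implementation: the `Doc` fields (`updated_at`, `embedded_at`), the JSON Lines layout, and the function names are hypothetical, the index is a plain in-memory dict, and the S3 upload and the ML Commons batch call are deliberately left out.

```python
import json
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Doc:
    """In-memory stand-in for an indexed document (all field names are hypothetical)."""
    doc_id: str
    text: str
    updated_at: int                        # when the text last changed
    embedding: Optional[List[float]] = None
    embedded_at: Optional[int] = None      # when the embedding was generated


def needs_embedding(doc: Doc) -> bool:
    """Step 3.1: select documents with no embedding or an embedding older than the text."""
    if doc.embedding is None or doc.embedded_at is None:
        return True
    return doc.embedded_at < doc.updated_at


def build_batch_file(docs: List[Doc]) -> str:
    """Step 3.2: serialize the selected documents as JSON Lines (assumed format)."""
    return "\n".join(json.dumps({"id": d.doc_id, "text": d.text}) for d in docs)


def apply_results(index: Dict[str, Doc], results_jsonl: str, now: int) -> int:
    """Steps 3.4-3.5: parse the result file and write embeddings back to the index."""
    updated = 0
    for line in results_jsonl.splitlines():
        if not line.strip():
            continue
        record = json.loads(line)
        doc = index.get(record["id"])
        if doc is None:
            continue  # the document was deleted while the batch job was running
        doc.embedding = record["embedding"]
        doc.embedded_at = now
        updated += 1
    return updated


# Example index: one document without an embedding, one stale, one up to date.
index = {
    "a": Doc("a", "first text", updated_at=10),
    "b": Doc("b", "second text", updated_at=20, embedding=[0.1], embedded_at=15),
    "c": Doc("c", "third text", updated_at=5, embedding=[0.2], embedded_at=8),
}

pending = [d for d in index.values() if needs_embedding(d)]
batch_file = build_batch_file(pending)

# Step 3.3 would upload batch_file and run the batch job; a fake result file stands in here.
fake_results = "\n".join(
    json.dumps({"id": d.doc_id, "embedding": [0.0, 1.0]}) for d in pending
)
updated_count = apply_results(index, fake_results, now=30)
```

One open question the sketch surfaces: if a document's text changes while the batch job is running, stamping `embedded_at` with the write-back time would mark a stale embedding as fresh. Recording the time at which the document was read for the batch, rather than the write-back time, is one way to avoid that.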

What alternatives have you considered?

Using OSI for ingestion.

Do you have any additional context?

N/A
