Is your feature request related to a problem?
In neural search, users want to minimize the cost and time needed for embedding generation on large datasets. OSI (OpenSearch Ingestion) addressed this by implementing an offline batch inference solution that leverages batch processing to optimize both cost and performance. In that flow, OSI handles file creation, uploading to S3, invoking the ml-commons API, monitoring inference completion, retrieving results from S3, and finally parsing and ingesting the data into OpenSearch. opensearch-project/ml-commons#2891
However, many customers do not use OSI during the ingestion process. To accommodate them, we can offer an offline batch inference option that does not require OSI.
I created a similar issue in ml-commons: opensearch-project/ml-commons#3428
Creating another one here for better visibility and community feedback.
What solution would you like?
For example, a customer could create an index and ingest plain-text data, and we would provide an API that generates embeddings using the offline batch inference component. The process would work as follows (a rough sketch of the calls involved appears after the list):
The customer creates an index.
The customer ingests documents containing plain text.
The customer triggers an API to populate the embedding field, either as a one-time run or on a scheduled interval:
Retrieve documents from the target index that lack embeddings or have outdated embeddings (identified via a timestamp field?).
Create a file and upload it to S3.
Call ml-commons to perform offline batch inference.
Retrieve the processed file from S3.
Populate the index with the generated embeddings.
The customer sees the embeddings successfully populated in the index.
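To make the proposed flow concrete, here is a rough sketch of what the customer-facing calls could look like with opensearch-py. Only the index creation and bulk ingest use existing APIs; the endpoint path, request-body fields, and scheduling option for the embedding-backfill call are hypothetical placeholders for the proposed API. Internally, the handler would perform the steps listed above: export the un-embedded documents to a file, upload it to S3, call ml-commons for offline batch inference, retrieve the results from S3, and write the embeddings back to the index.

```python
# Sketch only: the batch-embedding endpoint, its path, and its body fields are
# hypothetical placeholders for the proposed API, not an existing OpenSearch API.
from opensearchpy import OpenSearch, helpers

client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])

# 1. Customer creates an index with a plain-text field and a knn_vector field
#    that will later hold the generated embeddings.
client.indices.create(
    index="docs",
    body={
        "settings": {"index.knn": True},
        "mappings": {
            "properties": {
                "text": {"type": "text"},
                "text_embedding": {"type": "knn_vector", "dimension": 768},
            }
        },
    },
)

# 2. Customer ingests documents containing plain text; no embeddings are
#    generated at ingest time.
helpers.bulk(
    client,
    (
        {"_index": "docs", "_id": str(i), "text": t}
        for i, t in enumerate(["first document", "second document"])
    ),
)

# 3. Customer triggers the proposed API to backfill embeddings via offline
#    batch inference. Everything below (path and body fields) is hypothetical.
response = client.transport.perform_request(
    "POST",
    "/_plugins/_neural/indices/docs/_batch_embed",  # hypothetical endpoint
    body={
        "model_id": "<offline-batch-model-id>",      # ml-commons model to use for batch predict
        "source_field": "text",
        "embedding_field": "text_embedding",
        "s3_bucket": "my-batch-inference-bucket",    # staging location for input/output files
        # "schedule": "cron(0 2 * * ? *)"            # optional: refresh on an interval
    },
)
print(response)  # presumably a task id the customer can poll for completion
```

The call would likely be asynchronous, returning a task id the customer can poll until the embeddings show up in the index (step 4 above); whether scheduling is handled by this API or by the existing job-scheduler plugin is an open design question.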
What alternatives have you considered?
Utilizing OSI
Do you have any additional context?
N/A