2 changes: 1 addition & 1 deletion _field-types/supported-field-types/derived.md
@@ -1,7 +1,7 @@
---
layout: default
title: Derived
nav_order: 62
nav_order: 63
has_children: false
parent: Supported field types
---
1 change: 1 addition & 0 deletions _field-types/supported-field-types/index.md
@@ -72,6 +72,7 @@ You can specify data types for your fields when creating a mapping. The followin
| [`knn_vector`]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/knn-vector/) | Indexes a vector for k-NN search. |
| [`semantic`]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/semantic/) | Wraps a text or binary field to simplify semantic search setup. |
| [`star_tree`]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/star-tree/) | Precomputes aggregations for faster performance using a [star-tree index](https://docs.pinot.apache.org/basics/indexing/star-tree-index). |
| [`sparse_vector`]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/sparse-vector/) | Indexes a sparse vector for [sparse ANN search]({{site.url}}{{site.baseurl}}/vector-search/ai-search/neural-sparse-ann). |

## Arrays

162 changes: 162 additions & 0 deletions _field-types/supported-field-types/sparse-vector.md
@@ -0,0 +1,162 @@
---
layout: default
title: Sparse vector
nav_order: 61
has_children: false
parent: Supported field types
---

# Sparse vector
**Introduced 3.3**
{: .label .label-purple }

The `sparse_vector` field type supports the sparse approximate nearest neighbor (ANN) search algorithm, which significantly improves search efficiency while maintaining high search relevance. A `sparse_vector` value is represented as a map in which each key is a token and each value is a positive [float]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/numeric/) indicating the token's weight.

For more information, see [sparse ANN]({{site.url}}{{site.baseurl}}/vector-search/ai-search/neural-sparse-seismic).
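
For example, the following is a minimal sparse vector containing two tokens, in which token `"1055"` has weight `1.7` and token `"2931"` has weight `2.3`:

```json
{
  "1055": 1.7,
  "2931": 2.3
}
```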

## Parameters

The `sparse_vector` field type supports the following parameters.

| Parameter | Type | Required | Description | Default | Range | Example |
|-------------------------|---------|----------|-----------------------------------------------|-----------------------|-------------|-----------|
| `name` | String | Yes | Algorithm name | - | - | `seismic` |
| `n_postings` | Integer | No | Maximum number of documents per posting list | `0.0005 * doc_count`¹ | (0, +∞) | `4000` |
| `cluster_ratio` | Float | No | Ratio to determine cluster count | `0.1` | (0, 1) | `0.15` |
| `summary_prune_ratio` | Float | No | Ratio for pruning cluster summary vectors | `0.4` | (0, 1] | `0.3` |
| `approximate_threshold` | Integer | No | Document threshold for SEISMIC activation | `1,000,000` | [0, +∞) | `500000` |
| `quantization_ceiling_search` | Float | No | Token weight ceiling used for quantization during search | `16` | (0, +∞) | `3` |
| `quantization_ceiling_ingest` | Float | No | Token weight ceiling used for quantization during ingestion | `3` | (0, +∞) | `2.5` |


¹`doc_count` represents the number of documents within the segment.
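
For example, in a segment containing 10,000,000 documents, the default `n_postings` is `0.0005 * 10,000,000 = 5,000`.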

For more information about parameter configuration, see [sparse ANN configuration]({{site.url}}{{site.baseurl}}/vector-search/ai-search/neural-sparse-ann-configuration).
{: .note }

To increase search efficiency and reduce memory consumption, the `sparse_vector` field automatically quantizes token weights. You can adjust the `quantization_ceiling_search` and `quantization_ceiling_ingest` parameters to match your token weight distribution. For doc-only queries, we recommend the default value (`16`). If you're querying in bi-encoder mode only, we recommend setting `quantization_ceiling_search` to `3`. For more information about the doc-only and bi-encoder modes, see [Generating sparse vector embeddings automatically]({{site.url}}{{site.baseurl}}/vector-search/ai-search/neural-sparse-with-pipelines/).
{: .note }
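
For example, if the token weights produced during ingestion rarely exceed `2.5`, lowering `quantization_ceiling_ingest` to `2.5` concentrates the quantization range on the weights that actually occur (this assumes, as is typical for a quantization ceiling, that weights above the ceiling are clipped to it).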

## Example

### Step 1: Index creation

Create a sparse index in which the mapping contains a `sparse_vector` field:

```json
PUT sparse-vector-index
{
  "settings": {
    "index": {
      "sparse": true
    }
  },
  "mappings": {
    "properties": {
      "sparse_embedding": {
        "type": "sparse_vector",
        "method": {
          "name": "seismic",
          "parameters": {
            "n_postings": 300,
            "cluster_ratio": 0.1,
            "summary_prune_ratio": 0.4,
            "approximate_threshold": 1000000
          }
        }
      }
    }
  }
}
```
{% include copy-curl.html %}

To use the `sparse_vector` field type, you must set the `index.sparse` index setting to `true`.
{: .note }

### Step 2: Data ingestion

Index three documents with a `sparse_vector` field:

```json
PUT sparse-vector-index/_doc/1
{
  "sparse_embedding": {
    "1000": 0.1
  }
}
```
{% include copy-curl.html %}

```json
PUT sparse-vector-index/_doc/2
{
  "sparse_embedding": {
    "2000": 0.2
  }
}
```
{% include copy-curl.html %}

```json
PUT sparse-vector-index/_doc/3
{
  "sparse_embedding": {
    "3000": 0.3
  }
}
```
{% include copy-curl.html %}

### Step 3: Query

You can use a `neural_sparse` query to search the sparse index with either raw vectors or natural language.

#### Query with raw vector

```json
GET sparse-vector-index/_search
{
  "query": {
    "neural_sparse": {
      "sparse_embedding": {
        "query_tokens": {
          "1055": 5.5
        },
        "method_parameters": {
          "heap_factor": 1.0,
          "top_n": 10,
          "k": 10
        }
      }
    }
  }
}
```
{% include copy-curl.html %}

#### Query with natural language

```json
GET sparse-vector-index/_search
{
  "query": {
    "neural_sparse": {
      "sparse_embedding": {
        "query_text": "<input text>",
        "model_id": "<model ID>",
        "method_parameters": {
          "k": 10,
          "top_n": 10,
          "heap_factor": 1.0
        }
      }
    }
  }
}
```
{% include copy-curl.html %}

For more information about querying, see [sparse ANN query]({{site.url}}{{site.baseurl}}/query-dsl/specialized/neural-sparse/#sparse-ann-query) and [sparse ANN configuration]({{site.url}}{{site.baseurl}}/vector-search/ai-search/neural-sparse-ann-configuration).
{: .note }

2 changes: 1 addition & 1 deletion _field-types/supported-field-types/star-tree.md
@@ -1,7 +1,7 @@
---
layout: default
title: Star-tree
nav_order: 61
nav_order: 62
parent: Supported field types
---

25 changes: 24 additions & 1 deletion _query-dsl/specialized/neural-sparse.md
@@ -46,6 +46,24 @@

For more information, see [Generating sparse vector embeddings automatically]({{site.url}}{{site.baseurl}}/vector-search/ai-search/neural-sparse-with-pipelines/).

## Sparse ANN query
**Introduced 3.3**
{: .label .label-purple }

You can also run a sparse ANN query against a `sparse_vector` field. A sparse ANN query supports both of the previously described options: querying with text or querying with tokens.

```json
"neural_sparse": {
"<vector_field>": {
"query_text": "<input text>",
"model_id": "<model ID>",
"method_parameters": {
"top_n": 10,
"heap_factor": 1.0,
"k": 10
}
}
}
```

## Request body fields

@@ -58,6 +76,10 @@
`model_id` | String | Optional | Used with `query_text`. The ID of the sparse encoding model (for bi-encoder mode) or tokenizer (for doc-only mode) used to generate vector embeddings from the query text. The model/tokenizer must be deployed in OpenSearch before it can be used in neural sparse search. For more information, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/using-ml-models/) and [Generating sparse vector embeddings automatically]({{site.url}}{{site.baseurl}}/vector-search/ai-search/neural-sparse-with-pipelines/). For information about setting a default model ID in a neural sparse query, see [`neural_query_enricher`]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/neural-query-enricher/). Cannot be specified at the same time as `analyzer`.
`query_tokens` | Map of token (string) to weight (float) | Optional | A raw sparse vector in the form of tokens and their weights. Used as an alternative to `query_text` for direct vector input. Either `query_text` or `query_tokens` must be specified.
`max_token_score` | Float | Optional | (Deprecated) This parameter has been deprecated since OpenSearch 2.12. It is maintained only for backward compatibility and no longer affects functionality. The parameter can still be provided in requests, but its value has no impact. Previously used as the theoretical upper bound of the score for all tokens in the vocabulary.
`method_parameters.top_n` | Integer | Optional | Specifies the number of query tokens with the highest weights to retain for approximate sparse queries.
`method_parameters.heap_factor` | Float | Optional | Controls the trade-off between recall and performance. Higher values increase recall but reduce query speed; lower values decrease recall but improve query speed.
`method_parameters.k` | Integer | Optional | Specifies the number of nearest-neighbor results (top `k`) returned by the approximate algorithm.
`method_parameters.filter` | Object | Optional | Applies filters to the query results.


#### Examples
@@ -139,4 +161,5 @@

## Next steps

- For more information about neural sparse search, see [Neural sparse search]({{site.url}}{{site.baseurl}}/vector-search/ai-search/neural-sparse-search/).
- For more information about sparse ANN search, see [Sparse ANN]({{site.url}}{{site.baseurl}}/vector-search/ai-search/neural-sparse-seismic/).
162 changes: 162 additions & 0 deletions _vector-search/ai-search/neural-sparse-ann-configuration.md
@@ -0,0 +1,162 @@
---
layout: default
title: Sparse ANN configuration
parent: Sparse ANN
grand_parent: Neural sparse search
great_grand_parent: AI search
nav_order: 10
has_math: true
---

# Sparse ANN configuration

This page provides comprehensive configuration guidance for sparse ANN in OpenSearch neural sparse search.

## Prerequisites

Before configuring sparse ANN, ensure you have:

- OpenSearch 3.3 or later with the neural-search plugin installed

## Step 1: Create a sparse ANN index

To use sparse ANN, you must enable the sparse setting at the index level by setting `index.sparse` to `true`.

You must also use the `sparse_vector` field type because sparse ANN is designed to work with sparse vectors.

Additionally, you can specify method parameters in the field mapping, such as `n_postings`, `cluster_ratio`, `summary_prune_ratio`, and `approximate_threshold`. For more information, see [Supported field types]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/index/).

### Example
```json
PUT /sparse-ann-documents
{
  "settings": {
    "index": {
      "sparse": true,
      "number_of_shards": 2,
      "number_of_replicas": 1
    }
  },
  "mappings": {
    "properties": {
      "sparse_embedding": {
        "type": "sparse_vector",
        "method": {
          "name": "seismic",
          "parameters": {
            "n_postings": 4000,
            "cluster_ratio": 0.1,
            "summary_prune_ratio": 0.4,
            "approximate_threshold": 1000000
          }
        }
      }
    }
  }
}
```
{% include copy-curl.html %}

## Step 2: Ingest data

After the sparse ANN index is created, you can ingest sparse embeddings whose tokens are provided as integers:

```json
POST _bulk
{ "create": { "_index": "sparse-ann-documents", "_id": "0" } }
{ "sparse_embedding": {"10": 0.85, "23": 1.92, "24": 0.67, "78": 2.54, "156": 0.73} }
{ "create": { "_index": "sparse-ann-documents", "_id": "1" } }
{ "sparse_embedding": {"3": 1.22, "19": 0.11, "21": 0.35, "300": 1.74, "985": 0.96} }
```
{% include copy-curl.html %}

You can also use an [ingest pipeline]({{site.url}}{{site.baseurl}}/ingest-pipelines/), which automatically converts the generated tokens into integer format.
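
For example, the following is a minimal sketch of such a pipeline using the `sparse_encoding` processor; the pipeline name, model ID, and `passage_text` source field are placeholder values:

```json
PUT /_ingest/pipeline/sparse-encoding-pipeline
{
  "description": "Generates sparse vector embeddings at ingestion time",
  "processors": [
    {
      "sparse_encoding": {
        "model_id": "<model ID>",
        "field_map": {
          "passage_text": "sparse_embedding"
        }
      }
    }
  ]
}
```
{% include copy-curl.html %}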

## Step 3: Conduct a query

You can now run a query to retrieve information from the index. Note that you should not combine sparse ANN with the [two-phase]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/neural-sparse-query-two-phase-processor/) search pipeline processor.

### Natural language query

Query sparse ANN fields using the `neural_sparse` query:

```json
GET /sparse-ann-documents/_search
{
  "query": {
    "neural_sparse": {
      "sparse_embedding": {
        "query_text": "machine learning algorithms",
        "model_id": "your_sparse_model_id",
        "method_parameters": {
          "heap_factor": 1.3,
          "top_n": 6,
          "k": 10
        }
      }
    }
  }
}
```
{% include copy-curl.html %}

### Raw vector query
You can also prepare sparse vectors in advance and send a raw vector as a query. Note that the query must use integer tokens rather than raw text.

```json
GET /sparse-ann-documents/_search
{
  "query": {
    "neural_sparse": {
      "sparse_embedding": {
        "query_tokens": {
          "1055": 1.7,
          "2931": 2.3
        },
        "method_parameters": {
          "heap_factor": 1.2,
          "top_n": 6,
          "k": 10
        }
      }
    }
  }
}
```
{% include copy-curl.html %}

## Cluster settings

### Thread pool configuration

Building the clustered inverted index structure requires intensive computation. The algorithm uses a thread pool to build clusters in parallel, and the default thread pool size is 1. You can adjust the `plugins.neural_search.sparse.algo_param.index_thread_qty` setting to increase the thread pool size so that more CPU cores are used, reducing index build time:

```json
PUT /_cluster/settings
{
  "persistent": {
    "plugins.neural_search.sparse.algo_param.index_thread_qty": 4
  }
}
```
{% include copy-curl.html %}

### Memory and caching settings

Sparse ANN includes a circuit breaker that prevents the algorithm from consuming too much memory, so it does not affect other OpenSearch functionality. The default value of `circuit_breaker.limit` is `10%`. You can set a different limit to control the total amount of memory the algorithm uses. Once memory usage reaches this limit, cache eviction occurs and the least recently used data is evicted. The following example updates the circuit breaker limit using the cluster settings API:

```json
PUT _cluster/settings
{
  "persistent": {
    "plugins.neural_search.circuit_breaker.limit": "30%"
  }
}
```
{% include copy-curl.html %}

A higher circuit breaker limit allows more memory to be used, which prevents frequent cache eviction but may impact other OpenSearch operations. A lower limit is safer but may trigger more frequent cache eviction. For more information, see [Neural Search API]({{site.url}}{{site.baseurl}}/vector-search/api/neural/).

### Monitor sparse ANN

Use the Stats API to monitor memory usage and query statistics. For more information, see [Neural Search API]({{site.url}}{{site.baseurl}}/vector-search/api/neural/).
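
For example, you can retrieve neural search statistics, including sparse ANN memory and query stats, with a request such as the following:

```json
GET /_plugins/_neural/stats
```
{% include copy-curl.html %}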

## Performance tuning

Sparse ANN lets you balance the trade-off between search result accuracy and search speed. In short, you can tune the balance between recall and latency using the parameters described on this page. For guidance, see [Sparse ANN performance tuning]({{site.url}}{{site.baseurl}}/vector-search/performance-tuning-sparse/).
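
For example, increasing `heap_factor` from `1.0` to `1.5` typically improves recall at the cost of additional query latency, while decreasing `top_n` prunes more query tokens, speeding up queries at some cost in recall.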

## Next steps

- [Sparse ANN performance tuning]({{site.url}}{{site.baseurl}}/vector-search/performance-tuning-sparse/)