172 changes: 165 additions & 7 deletions _field-types/supported-field-types/knn-vector.md
@@ -85,23 +85,30 @@ However, if you intend to use Painless scripting or a k-NN score script, you onl
}
```

## Lucene byte vector
## Byte vectors

By default, k-NN vectors are `float` vectors, where each dimension is 4 bytes. If you want to save storage space, you can use `byte` vectors with the `lucene` engine. In a `byte` vector, each dimension is a signed 8-bit integer in the [-128, 127] range.
By default, k-NN vectors are `float` vectors, in which each dimension is 4 bytes. If you want to save storage space, you can use `byte` vectors with the `faiss` or `lucene` engine. In a `byte` vector, each dimension is a signed 8-bit integer in the [-128, 127] range.

Byte vectors are supported only for the `lucene` engine. They are not supported for the `nmslib` and `faiss` engines.
Byte vectors are supported only for the `lucene` and `faiss` engines. They are not supported for the `nmslib` engine.
{: .note}

In [k-NN benchmarking tests](https://github.com/opensearch-project/k-NN/tree/main/benchmarks/perf-tool), using `byte` rather than `float` vectors significantly reduced storage and memory usage while also improving indexing throughput and reducing query latency. Additionally, recall was not greatly affected (note that recall can depend on various factors, such as the [quantization technique](#quantization-techniques) and data distribution).

When using `byte` vectors, expect some loss of precision in the recall compared to using `float` vectors. Byte vectors are useful in large-scale applications and use cases that prioritize a reduced memory footprint in exchange for a minimal loss of recall.
{: .important}


When using `byte` vectors with the `faiss` engine, we recommend using [SIMD optimization]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-index#simd-optimization-for-the-faiss-engine), which helps to significantly reduce search latencies and improve indexing throughput.
{: .important}

Introduced in k-NN plugin version 2.9, the optional `data_type` parameter defines the data type of a vector. The default value of this parameter is `float`.

To use a `byte` vector, set the `data_type` parameter to `byte` when creating mappings for an index:

### Example: HNSW

The following example creates a byte vector index with the `lucene` engine and `hnsw` algorithm:

```json
PUT test-index
{
"settings": {
@@ -132,7 +139,7 @@
```
{% include copy-curl.html %}

Then ingest documents as usual. Make sure each dimension in the vector is in the supported [-128, 127] range:
After creating the index, ingest documents as usual. Make sure each dimension in the vector is in the supported [-128, 127] range:

```json
PUT test-index/_doc/1
@@ -168,6 +175,157 @@ GET test-index/_search
```
{% include copy-curl.html %}

### Example: IVF

The `ivf` method requires a training step that creates and trains the model used to initialize the native library index during segment creation. For more information, see [Building a k-NN index from a model]({{site.url}}{{site.baseurl}}/search-plugins/knn/approximate-knn/#building-a-k-nn-index-from-a-model).

First, create an index that will contain the byte vector training data. Make sure that the `dimension` matches the dimension of the model you want to create (the `faiss` engine and `ivf` algorithm are specified later, in the training request):

```json
PUT train-index
{
"mappings": {
"properties": {
"train-field": {
"type": "knn_vector",
"dimension": 4,
"data_type": "byte"
}
}
}
}
```
{% include copy-curl.html %}

Next, ingest the training data containing byte vectors into the training index:

```json
PUT _bulk
{ "index": { "_index": "train-index", "_id": "1" } }
{ "train-field": [127, 100, 0, -120] }
{ "index": { "_index": "train-index", "_id": "2" } }
{ "train-field": [2, -128, -10, 50] }
{ "index": { "_index": "train-index", "_id": "3" } }
{ "train-field": [13, -100, 5, 126] }
{ "index": { "_index": "train-index", "_id": "4" } }
{ "train-field": [5, 100, -6, -125] }
```
{% include copy-curl.html %}

Then, create and train a model named `byte-vector-model` using the data in the `train-field` of the `train-index`. Specify the `byte` data type:

```json
POST _plugins/_knn/models/byte-vector-model/_train
{
"training_index": "train-index",
"training_field": "train-field",
"dimension": 4,
"description": "model with byte data",
"data_type": "byte",
"method": {
"name": "ivf",
"engine": "faiss",
"space_type": "l2",
"parameters": {
"nlist": 1,
"nprobes": 1
}
}
}
```
{% include copy-curl.html %}

To check the model training status, call the Get Model API:

```json
GET _plugins/_knn/models/byte-vector-model?filter_path=state
```
{% include copy-curl.html %}

Once the training is complete, the `state` changes to `created`.
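
Because the request filters on `state`, the response for a fully trained model looks similar to the following:

```json
{
  "state": "created"
}
```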

Next, create an index that will initialize its native library indexes using the trained model:

```json
PUT test-byte-ivf
{
"settings": {
"index": {
"knn": true
}
},
"mappings": {
"properties": {
"my_vector": {
"type": "knn_vector",
"model_id": "byte-vector-model"
}
}
}
}
```
{% include copy-curl.html %}

Ingest the byte vector data that you want to search into the new index:

```json
PUT _bulk?refresh=true
{"index": {"_index": "test-byte-ivf", "_id": "1"}}
{"my_vector": [7, 10, 15, -120]}
{"index": {"_index": "test-byte-ivf", "_id": "2"}}
{"my_vector": [10, -100, 120, -108]}
{"index": {"_index": "test-byte-ivf", "_id": "3"}}
{"my_vector": [1, -2, 5, -50]}
{"index": {"_index": "test-byte-ivf", "_id": "4"}}
{"my_vector": [9, -7, 45, -78]}
{"index": {"_index": "test-byte-ivf", "_id": "5"}}
{"my_vector": [80, -70, 127, -128]}
```
{% include copy-curl.html %}

Finally, search the data. Be sure to provide a byte vector in the k-NN vector field:

```json
GET test-byte-ivf/_search
{
"size": 2,
"query": {
"knn": {
"my_vector": {
"vector": [100, -120, 50, -45],
"k": 2
}
}
}
}
```
{% include copy-curl.html %}

### Memory estimation

In the best-case scenario, byte vectors require 25% of the memory required by 32-bit float vectors because each dimension is stored in 1 byte instead of 4.

#### HNSW memory estimation

The memory required for Hierarchical Navigable Small Worlds (HNSW) is estimated to be `1.1 * (dimension + 8 * m)` bytes/vector, where `m` is the maximum number of bidirectional links created for each element during the construction of the graph.

As an example, assume that you have 1 million vectors with a dimension of 256 and an `m` of 16. The memory requirement can be estimated as follows:

```r
1.1 * (256 + 8 * 16) * 1,000,000 ~= 0.39 GB
```

#### IVF memory estimation

The memory required for IVF is estimated to be `1.1 * ((dimension * num_vectors) + (4 * nlist * dimension))` bytes, where `nlist` is the number of buckets into which the vectors are partitioned.

As an example, assume that you have 1 million vectors with a dimension of 256 and an `nlist` of 128. The memory requirement can be estimated as follows:

```r
1.1 * ((256 * 1,000,000) + (4 * 128 * 256)) ~= 0.26 GB
```


### Quantization techniques

If your vectors are of the type `float`, you need to first convert them to the `byte` type before ingesting the documents. This conversion is accomplished by _quantizing the dataset_---reducing the precision of its vectors. There are many quantization techniques, such as scalar quantization or product quantization (PQ), which is used in the Faiss engine. The choice of quantization technique depends on the type of data you're using and can affect the accuracy of recall values. The following sections describe the scalar quantization algorithms that were used to quantize the [k-NN benchmarking test](https://github.com/opensearch-project/k-NN/tree/main/benchmarks/perf-tool) data for the [L2](#scalar-quantization-for-the-l2-space-type) and [cosine similarity](#scalar-quantization-for-the-cosine-similarity-space-type) space types. The provided pseudocode is for illustration purposes only.
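
As an illustration only (this is not the implementation used in the benchmarks), a minimal min-max scalar quantization sketch in Python maps float values linearly onto the signed byte range:

```python
import numpy as np

def quantize_to_byte(vectors: np.ndarray) -> np.ndarray:
    """Linearly map float vectors onto the signed byte range [-128, 127].

    Minimal min-max sketch: real pipelines typically derive the bounds
    from a training sample and clip outliers first.
    """
    v_min, v_max = vectors.min(), vectors.max()
    # Rescale [v_min, v_max] onto [-128, 127], then round to integers.
    scaled = (vectors - v_min) / (v_max - v_min) * 255.0 - 128.0
    return np.rint(scaled).astype(np.int8)

# Example: quantize a random float dataset before ingesting it as data_type: byte.
float_vectors = np.random.uniform(-300, 300, (1000, 256)).astype(np.float32)
byte_vectors = quantize_to_byte(float_vectors)
```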
@@ -269,7 +427,7 @@ return Byte(bval)
```
{% include copy.html %}

## Binary k-NN vectors
## Binary vectors

You can reduce memory costs by a factor of 32 by switching from float to binary vectors.
Using binary vector indexes can lower operational costs while maintaining high recall performance, making large-scale deployment more economical and efficient.
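
As a rough back-of-the-envelope comparison, consider 1 million 256-dimensional vectors:

```r
# 32-bit float vectors: 4 bytes per dimension
4 * 256 * 1,000,000 = 1,024,000,000 bytes ~= 0.95 GB

# Binary vectors: 1 bit per dimension
(256 / 8) * 1,000,000 = 32,000,000 bytes ~= 0.03 GB
```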
@@ -7,7 +7,7 @@ nav_order: 10

# Semantic search using byte-quantized vectors

This tutorial illustrates how to build a semantic search using the [Cohere Embed model](https://docs.cohere.com/reference/embed) and byte-quantized vectors. For more information about using byte-quantized vectors, see [Lucene byte vector]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/knn-vector/#lucene-byte-vector).
This tutorial shows you how to build a semantic search using the [Cohere Embed model](https://docs.cohere.com/reference/embed) and byte-quantized vectors. For more information about using byte-quantized vectors, see [Byte vectors]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/knn-vector/#byte-vectors).

The Cohere Embed v3 model supports several `embedding_types`. For this tutorial, you'll use the `INT8` type to encode byte-quantized vectors.

4 changes: 2 additions & 2 deletions _search-plugins/knn/approximate-knn.md
@@ -322,7 +322,7 @@ To learn more about the radial search feature, see [k-NN radial search]({{site.u

### Using approximate k-NN with binary vectors

To learn more about using binary vectors with k-NN search, see [Binary k-NN vectors]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/knn-vector#binary-k-nn-vectors).
To learn more about using binary vectors with k-NN search, see [Binary vectors]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/knn-vector#binary-vectors).

## Spaces

@@ -346,5 +346,5 @@ The cosine similarity formula does not include the `1 -` prefix. However, becaus
With cosine similarity, it is not valid to pass a zero vector (`[0, 0, ...]`) as input. This is because the magnitude of such a vector is 0, which raises a `divide by 0` exception in the corresponding formula. Requests containing the zero vector will be rejected, and a corresponding exception will be thrown.
{: .note }
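
The restriction follows directly from the cosine similarity formula, in which the vector magnitudes appear in the denominator:

$$ \cos(\mathbf{a}, \mathbf{b}) = \frac{\mathbf{a} \cdot \mathbf{b}}{\lVert \mathbf{a} \rVert \, \lVert \mathbf{b} \rVert} $$

If either magnitude is 0, the expression is undefined.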

The `hamming` space type is supported for binary vectors in OpenSearch version 2.16 and later. For more information, see [Binary k-NN vectors]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/knn-vector#binary-k-nn-vectors).
The `hamming` space type is supported for binary vectors in OpenSearch version 2.16 and later. For more information, see [Binary vectors]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/knn-vector#binary-vectors).
{: .note}
12 changes: 6 additions & 6 deletions _search-plugins/knn/knn-index.md
@@ -41,13 +41,13 @@ PUT /test-index
```
{% include copy-curl.html %}

## Lucene byte vector
## Byte vectors

Starting with k-NN plugin version 2.9, you can use `byte` vectors with the `lucene` engine to reduce the amount of storage space needed. For more information, see [Lucene byte vector]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/knn-vector#lucene-byte-vector).
Starting with k-NN plugin version 2.9, you can use `byte` vectors with the `lucene` engine, and starting with version 2.17, with the `faiss` engine, to reduce the amount of required memory and storage space. For more information, see [Byte vectors]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/knn-vector#byte-vectors).

## Binary vector
## Binary vectors

Starting with k-NN plugin version 2.16, you can use `binary` vectors with the `faiss` engine to reduce the amount of required storage space. For more information, see [Binary k-NN vectors]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/knn-vector#binary-k-nn-vectors).
Starting with k-NN plugin version 2.16, you can use `binary` vectors with the `faiss` engine to reduce the amount of required storage space. For more information, see [Binary vectors]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/knn-vector#binary-vectors).

## SIMD optimization for the Faiss engine

@@ -116,7 +116,7 @@ Method name | Requires training | Supported spaces | Description
For hnsw, "innerproduct" is not available when PQ is used.
{: .note}

The `hamming` space type is supported for binary vectors in OpenSearch version 2.16 and later. For more information, see [Binary k-NN vectors]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/knn-vector#binary-k-nn-vectors).
The `hamming` space type is supported for binary vectors in OpenSearch version 2.16 and later. For more information, see [Binary vectors]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/knn-vector#binary-vectors).
{: .note}

#### HNSW parameters
@@ -324,7 +324,7 @@ If you want to use less memory and increase indexing speed as compared to HNSW w

If memory is a concern, consider adding a PQ encoder to your HNSW or IVF index. Because PQ is a lossy encoding, query quality will drop.

You can reduce the memory footprint by a factor of 2, with a minimal loss in search quality, by using the [`fp_16` encoder]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-vector-quantization/#faiss-16-bit-scalar-quantization). If your vector dimensions are within the [-128, 127] byte range, we recommend using the [byte quantizer]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/knn-vector/#lucene-byte-vector) to reduce the memory footprint by a factor of 4. To learn more about vector quantization options, see [k-NN vector quantization]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-vector-quantization/).
You can reduce the memory footprint by a factor of 2, with a minimal loss in search quality, by using the [`fp_16` encoder]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-vector-quantization/#faiss-16-bit-scalar-quantization). If your vector dimensions are within the [-128, 127] byte range, we recommend using the [byte quantizer]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/knn-vector/#byte-vectors) to reduce the memory footprint by a factor of 4. To learn more about vector quantization options, see [k-NN vector quantization]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-vector-quantization/).
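
For illustration, a mapping that enables the `fp_16` encoder on a Faiss HNSW field might look like the following sketch (the index and field names are placeholders; see the linked quantization documentation for the authoritative parameters):

```json
PUT /test-fp16-index
{
  "settings": {
    "index": {
      "knn": true
    }
  },
  "mappings": {
    "properties": {
      "my_vector": {
        "type": "knn_vector",
        "dimension": 4,
        "method": {
          "name": "hnsw",
          "engine": "faiss",
          "space_type": "l2",
          "parameters": {
            "encoder": {
              "name": "sq",
              "parameters": {
                "type": "fp16"
              }
            }
          }
        }
      }
    }
  }
}
```
{% include copy-curl.html %}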

### Memory estimation

2 changes: 1 addition & 1 deletion _search-plugins/knn/knn-score-script.md
@@ -302,5 +302,5 @@ Cosine similarity returns a number between -1 and 1, and because OpenSearch rele
With cosine similarity, it is not valid to pass a zero vector (`[0, 0, ... ]`) as input. This is because the magnitude of such a vector is 0, which raises a `divide by 0` exception in the corresponding formula. Requests containing the zero vector will be rejected, and a corresponding exception will be thrown.
{: .note }

The `hamming` space type is supported for binary vectors in OpenSearch version 2.16 and later. For more information, see [Binary k-NN vectors]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/knn-vector#binary-k-nn-vectors).
The `hamming` space type is supported for binary vectors in OpenSearch version 2.16 and later. For more information, see [Binary vectors]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/knn-vector#binary-vectors).
{: .note}