
Commit 55259f4

Add documentation for Faiss byte vector
Signed-off-by: Naveen Tatikonda <[email protected]>
1 parent 76486a4 commit 55259f4

File tree

2 files changed: +235 −1 lines


_field-types/supported-field-types/knn-vector.md

Lines changed: 232 additions & 1 deletion
@@ -87,7 +87,7 @@ However, if you intend to use Painless scripting or a k-NN score script, you onl

By default, k-NN vectors are `float` vectors, where each dimension is 4 bytes. If you want to save storage space, you can use `byte` vectors with the `lucene` engine. In a `byte` vector, each dimension is a signed 8-bit integer in the [-128, 127] range.

- Byte vectors are supported only for the `lucene` engine. They are not supported for the `nmslib` and `faiss` engines.
+ Byte vectors are supported only for the `lucene` and `faiss` engines. They are not supported for the `nmslib` engine.
{: .note}

In [k-NN benchmarking tests](https://github.com/opensearch-project/k-NN/tree/main/benchmarks/perf-tool), the use of `byte` rather than `float` vectors resulted in a significant reduction in storage and memory usage as well as improved indexing throughput and reduced query latency. Additionally, precision on recall was not greatly affected (note that recall can depend on various factors, such as the [quantization technique](#quantization-techniques) and data distribution).
@@ -267,6 +267,237 @@ return Byte(bval)
{% include copy.html %}
## Faiss byte vector

The Faiss engine is recommended for use cases that require large-scale ingestion. However, for these workloads, the default `float` vectors consume significant memory because each dimension takes 4 bytes. To reduce these memory and storage requirements, you can use `byte` vectors with the `faiss` engine. In a `byte` vector, each dimension is a signed 8-bit integer in the [-128, 127] range.

Faiss does not directly support a byte data type for storing byte vectors. To achieve this functionality, OpenSearch uses a scalar quantizer (`SQ8_direct_signed`) that accepts float vectors in the signed 8-bit integer range and encodes them as byte-sized vectors. These quantized byte-sized vectors are stored in a k-NN index, reducing the memory footprint by a factor of 4. When used with [SIMD optimization]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-index#simd-optimization-for-the-faiss-engine), `SQ8_direct_signed` quantization can also significantly reduce search latency and improve indexing throughput.
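Conceptually, a direct signed scalar quantizer stores float values that already lie in the signed 8-bit range as 8-bit integers. The following is a minimal NumPy sketch of that idea (an illustration only, not the Faiss implementation; the function name is made up):

```python
import numpy as np

def encode_sq8_direct_signed(vectors: np.ndarray) -> np.ndarray:
    # Illustrative only: reject values outside the signed 8-bit range
    # rather than silently wrapping, then store each dimension as int8.
    if vectors.min() < -128 or vectors.max() > 127:
        raise ValueError("each dimension must be in the [-128, 127] range")
    return vectors.astype(np.int8)

float_vectors = np.array([[-126.0, 28.0], [100.0, -128.0]], dtype=np.float32)
byte_vectors = encode_sq8_direct_signed(float_vectors)

print(byte_vectors.dtype)                           # int8
print(float_vectors.nbytes // byte_vectors.nbytes)  # 4
```

Because each stored dimension shrinks from 4 bytes to 1 byte, the encoded vectors occupy one quarter of the original memory, which matches the factor-of-4 reduction described above.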
When using `byte` vectors, expect some loss of precision in recall compared to using `float` vectors. Byte vectors are useful for large-scale applications and use cases that prioritize a reduced memory footprint in exchange for a minimal loss of recall.
{: .important}

To use a `byte` vector, set the `data_type` parameter to `byte` when creating mappings for an index.

### Example: HNSW

The following example creates a byte vector index using the Faiss engine and the HNSW algorithm:
```json
PUT test-index
{
  "settings": {
    "index": {
      "knn": true
    }
  },
  "mappings": {
    "properties": {
      "my_vector": {
        "type": "knn_vector",
        "dimension": 2,
        "data_type": "byte",
        "method": {
          "name": "hnsw",
          "space_type": "l2",
          "engine": "faiss",
          "parameters": {
            "ef_construction": 128,
            "m": 24
          }
        }
      }
    }
  }
}
```
{% include copy-curl.html %}
Then ingest documents as usual, making sure that each dimension of the vector is in the supported [-128, 127] range:

```json
PUT test-index/_doc/1
{
  "my_vector": [-126, 28]
}
```
{% include copy-curl.html %}

```json
PUT test-index/_doc/2
{
  "my_vector": [100, -128]
}
```
{% include copy-curl.html %}
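Because every dimension must be an integer in the [-128, 127] range, it can be useful to validate vectors on the client side before ingestion. A minimal sketch (the helper is hypothetical, not part of any OpenSearch client):

```python
def is_valid_byte_vector(vector, dimension=2):
    """Check that a vector fits the index mapping: correct length and
    every component an integer in the signed 8-bit range [-128, 127]."""
    return (
        len(vector) == dimension
        and all(isinstance(v, int) and -128 <= v <= 127 for v in vector)
    )

print(is_valid_byte_vector([-126, 28]))   # True
print(is_valid_byte_vector([100, -129]))  # False: -129 is out of range
print(is_valid_byte_vector([0.5, 1]))     # False: components must be integers
```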
When querying, be sure to use a byte vector:

```json
GET test-index/_search
{
  "size": 2,
  "query": {
    "knn": {
      "my_vector": {
        "vector": [26, -120],
        "k": 2
      }
    }
  }
}
```
{% include copy-curl.html %}
### Example: IVF

The IVF method requires a training step that creates and trains the model used to initialize the native library index during segment creation. For more information, see [Building a k-NN index from a model]({{site.url}}{{site.baseurl}}/search-plugins/knn/approximate-knn/#building-a-k-nn-index-from-a-model).

First, create an index that will contain byte vector training data. Specify the Faiss engine and IVF algorithm and make sure that the `dimension` matches the dimension of the model you want to create:

```json
PUT train-index
{
  "mappings": {
    "properties": {
      "train-field": {
        "type": "knn_vector",
        "dimension": 4,
        "data_type": "byte"
      }
    }
  }
}
```
{% include copy-curl.html %}
Ingest training data containing byte vectors into the training index:

```json
PUT _bulk
{ "index": { "_index": "train-index", "_id": "1" } }
{ "train-field": [127, 100, 0, -120] }
{ "index": { "_index": "train-index", "_id": "2" } }
{ "train-field": [2, -128, -10, 50] }
{ "index": { "_index": "train-index", "_id": "3" } }
{ "train-field": [13, -100, 5, 126] }
{ "index": { "_index": "train-index", "_id": "4" } }
{ "train-field": [5, 100, -6, -125] }
```
{% include copy-curl.html %}
Then, create and train a model named `byte-vector-model`. The model will be trained using the training data from the `train-field` in the `train-index`. Specify the `byte` data type:

```json
POST _plugins/_knn/models/byte-vector-model/_train
{
  "training_index": "train-index",
  "training_field": "train-field",
  "dimension": 4,
  "description": "model with byte data",
  "data_type": "byte",
  "method": {
    "name": "ivf",
    "engine": "faiss",
    "space_type": "l2",
    "parameters": {
      "nlist": 1,
      "nprobes": 1
    }
  }
}
```
{% include copy-curl.html %}
To check the model training status, call the Get Model API:

```json
GET _plugins/_knn/models/byte-vector-model?filter_path=state
```
{% include copy-curl.html %}

Once training is complete, the `state` changes to `created`.
Next, create an index that will initialize its native library indexes using the trained model:

```json
PUT test-byte-ivf
{
  "settings": {
    "index": {
      "knn": true
    }
  },
  "mappings": {
    "properties": {
      "my_vector": {
        "type": "knn_vector",
        "model_id": "byte-vector-model"
      }
    }
  }
}
```
{% include copy-curl.html %}
Ingest the data containing the byte vectors that you want to search into the created index:

```json
PUT _bulk?refresh=true
{ "index": { "_index": "test-byte-ivf", "_id": "1" } }
{ "my_vector": [7, 10, 15, -120] }
{ "index": { "_index": "test-byte-ivf", "_id": "2" } }
{ "my_vector": [10, -100, 120, -108] }
{ "index": { "_index": "test-byte-ivf", "_id": "3" } }
{ "my_vector": [1, -2, 5, -50] }
{ "index": { "_index": "test-byte-ivf", "_id": "4" } }
{ "my_vector": [9, -7, 45, -78] }
{ "index": { "_index": "test-byte-ivf", "_id": "5" } }
{ "my_vector": [80, -70, 127, -128] }
```
{% include copy-curl.html %}
Finally, search the data. Be sure to provide a byte vector in the k-NN vector field:

```json
GET test-byte-ivf/_search
{
  "size": 2,
  "query": {
    "knn": {
      "my_vector": {
        "vector": [100, -120, 50, -45],
        "k": 2
      }
    }
  }
}
```
{% include copy-curl.html %}
### Memory estimation

In the best-case scenario, byte vectors require 25% of the memory that 32-bit float vectors require.

#### HNSW memory estimation

The memory required for Hierarchical Navigable Small Worlds (HNSW) is estimated to be `1.1 * (dimension + 8 * M)` bytes/vector.

As an example, assume that you have 1 million vectors with a dimension of 256 and an M of 16. The memory requirement can be estimated as follows:

```r
1.1 * (256 + 8 * 16) * 1,000,000 ~= 0.39 GB
```
#### IVF memory estimation

The memory required for IVF is estimated to be `1.1 * ((dimension * num_vectors) + (4 * nlist * dimension))` bytes.

As an example, assume that you have 1 million vectors with a dimension of 256 and an `nlist` of 128. The memory requirement can be estimated as follows:

```r
1.1 * ((256 * 1,000,000) + (4 * 128 * 256)) ~= 0.27 GB
```
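The two estimates above can be reproduced with a short script (a sketch of the formulas in this section; with 1 GB taken as 2^30 bytes, the HNSW example rounds to 0.39 GB and the IVF example to roughly 0.26 GB):

```python
def hnsw_memory_gib(dimension, m, num_vectors):
    # 1.1 * (dimension + 8 * M) bytes per vector, summed over all vectors
    return 1.1 * (dimension + 8 * m) * num_vectors / 1024**3

def ivf_memory_gib(dimension, num_vectors, nlist):
    # 1.1 * ((dimension * num_vectors) + (4 * nlist * dimension)) bytes total
    return 1.1 * ((dimension * num_vectors) + (4 * nlist * dimension)) / 1024**3

print(round(hnsw_memory_gib(256, 16, 1_000_000), 2))  # 0.39
print(round(ivf_memory_gib(256, 1_000_000, 128), 2))  # 0.26
```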
## Binary k-NN vectors

You can reduce memory costs by a factor of 32 by switching from float to binary vectors.

_search-plugins/knn/knn-vector-quantization.md

Lines changed: 3 additions & 0 deletions
@@ -13,6 +13,9 @@ By default, the k-NN plugin supports the indexing and querying of vectors of typ

OpenSearch supports many varieties of quantization. In general, the level of quantization will provide a trade-off between the accuracy of the nearest neighbor search and the size of the memory footprint consumed by the vector search. The supported types include byte vectors, 16-bit scalar quantization, and product quantization (PQ).

+ ## Faiss byte vector
+ Starting with version 2.17, the k-NN plugin supports `byte` vectors with the Faiss engine, which helps reduce memory requirements. For more information, see [Faiss byte vector]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/knn-vector#faiss-byte-vector).

## Lucene byte vector

Starting with k-NN plugin version 2.9, you can use `byte` vectors with the Lucene engine in order to reduce the amount of required memory. This requires quantizing the vectors outside of OpenSearch before ingesting them into an OpenSearch index. For more information, see [Lucene byte vector]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/knn-vector#lucene-byte-vector).
