
Commit 55259f4

Add documentation for Faiss byte vector
Signed-off-by: Naveen Tatikonda <[email protected]>
1 parent 76486a4 commit 55259f4

File tree

2 files changed: +235 −1 lines


_field-types/supported-field-types/knn-vector.md

Lines changed: 232 additions & 1 deletion
@@ -87,7 +87,7 @@ However, if you intend to use Painless scripting or a k-NN score script, you onl

By default, k-NN vectors are `float` vectors, where each dimension is 4 bytes. If you want to save storage space, you can use `byte` vectors with the `lucene` engine. In a `byte` vector, each dimension is a signed 8-bit integer in the [-128, 127] range.

- Byte vectors are supported only for the `lucene` engine. They are not supported for the `nmslib` and `faiss` engines.
+ Byte vectors are supported only for the `lucene` and `faiss` engines. They are not supported for the `nmslib` engine.
{: .note}

In [k-NN benchmarking tests](https://github.com/opensearch-project/k-NN/tree/main/benchmarks/perf-tool), the use of `byte` rather than `float` vectors resulted in a significant reduction in storage and memory usage as well as improved indexing throughput and reduced query latency. Additionally, precision on recall was not greatly affected (note that recall can depend on various factors, such as the [quantization technique](#quantization-techniques) and data distribution).
@@ -267,6 +267,237 @@ return Byte(bval)
{% include copy.html %}
## Faiss byte vector

The Faiss engine is recommended for use cases that require large-scale ingestion. However, for these workloads, the default `float` vectors consume significant memory because each dimension takes 4 bytes. To reduce these memory and storage requirements, you can use `byte` vectors with the `faiss` engine. In a `byte` vector, each dimension is a signed 8-bit integer in the [-128, 127] range.

Faiss does not directly support a byte data type for storing byte vectors. To achieve this functionality, OpenSearch uses a scalar quantizer (`SQ8_direct_signed`) that accepts float vectors in the signed 8-bit integer range and encodes them as byte-sized vectors. These quantized byte-sized vectors are stored in a k-NN index, reducing the memory footprint by a factor of 4. When used with [SIMD optimization]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-index#simd-optimization-for-the-faiss-engine), `SQ8_direct_signed` quantization can also significantly reduce search latency and improve indexing throughput.
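Conceptually, a direct signed scalar quantizer stores float values that already lie in the signed 8-bit range as 8-bit integers. The following is a minimal NumPy sketch of that idea (an illustration only, not the Faiss implementation; the function name is made up):

```python
import numpy as np

def encode_sq8_direct_signed(vectors: np.ndarray) -> np.ndarray:
    # Illustrative only: reject values outside the signed 8-bit range
    # rather than silently wrapping, then store each dimension as int8.
    if vectors.min() < -128 or vectors.max() > 127:
        raise ValueError("each dimension must be in the [-128, 127] range")
    return vectors.astype(np.int8)

float_vectors = np.array([[-126.0, 28.0], [100.0, -128.0]], dtype=np.float32)
byte_vectors = encode_sq8_direct_signed(float_vectors)

print(byte_vectors.dtype)                           # int8
print(float_vectors.nbytes // byte_vectors.nbytes)  # 4
```

Because each stored dimension shrinks from 4 bytes to 1 byte, the encoded vectors occupy one quarter of the original memory, which matches the factor-of-4 reduction described above.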
When using `byte` vectors, expect some loss of precision in recall compared to using `float` vectors. Byte vectors are useful for large-scale applications and use cases that prioritize a reduced memory footprint in exchange for a minimal loss of recall.
{: .important}

To use a `byte` vector, set the `data_type` parameter to `byte` when creating mappings for an index.

### Example: HNSW

The following example creates a byte vector index using the Faiss engine and the HNSW algorithm:
```json
PUT test-index
{
  "settings": {
    "index": {
      "knn": true
    }
  },
  "mappings": {
    "properties": {
      "my_vector": {
        "type": "knn_vector",
        "dimension": 2,
        "data_type": "byte",
        "method": {
          "name": "hnsw",
          "space_type": "l2",
          "engine": "faiss",
          "parameters": {
            "ef_construction": 128,
            "m": 24
          }
        }
      }
    }
  }
}
```
{% include copy-curl.html %}
Then ingest documents as usual, making sure that each dimension of the vector is in the supported [-128, 127] range:

```json
PUT test-index/_doc/1
{
  "my_vector": [-126, 28]
}
```
{% include copy-curl.html %}

```json
PUT test-index/_doc/2
{
  "my_vector": [100, -128]
}
```
{% include copy-curl.html %}
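Because every dimension must be an integer in the [-128, 127] range, it can be useful to validate vectors on the client side before ingestion. A minimal sketch (the helper is hypothetical, not part of any OpenSearch client):

```python
def is_valid_byte_vector(vector, dimension=2):
    """Check that a vector fits the index mapping: correct length and
    every component an integer in the signed 8-bit range [-128, 127]."""
    return (
        len(vector) == dimension
        and all(isinstance(v, int) and -128 <= v <= 127 for v in vector)
    )

print(is_valid_byte_vector([-126, 28]))   # True
print(is_valid_byte_vector([100, -129]))  # False: -129 is out of range
print(is_valid_byte_vector([0.5, 1]))     # False: components must be integers
```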
When querying, be sure to use a byte vector:

```json
GET test-index/_search
{
  "size": 2,
  "query": {
    "knn": {
      "my_vector": {
        "vector": [26, -120],
        "k": 2
      }
    }
  }
}
```
{% include copy-curl.html %}
### Example: IVF

The IVF method requires a training step that creates and trains the model used to initialize the native library index during segment creation. For more information, see [Building a k-NN index from a model]({{site.url}}{{site.baseurl}}/search-plugins/knn/approximate-knn/#building-a-k-nn-index-from-a-model).

First, create an index that will contain byte vector training data. Specify the Faiss engine and IVF algorithm and make sure that the `dimension` matches the dimension of the model you want to create:

```json
PUT train-index
{
  "mappings": {
    "properties": {
      "train-field": {
        "type": "knn_vector",
        "dimension": 4,
        "data_type": "byte"
      }
    }
  }
}
```
{% include copy-curl.html %}
Ingest training data containing byte vectors into the training index:

```json
PUT _bulk
{ "index": { "_index": "train-index", "_id": "1" } }
{ "train-field": [127, 100, 0, -120] }
{ "index": { "_index": "train-index", "_id": "2" } }
{ "train-field": [2, -128, -10, 50] }
{ "index": { "_index": "train-index", "_id": "3" } }
{ "train-field": [13, -100, 5, 126] }
{ "index": { "_index": "train-index", "_id": "4" } }
{ "train-field": [5, 100, -6, -125] }
```
{% include copy-curl.html %}
Then, create and train a model named `byte-vector-model`. The model will be trained using the training data from the `train-field` in the `train-index`. Specify the `byte` data type:

```json
POST _plugins/_knn/models/byte-vector-model/_train
{
  "training_index": "train-index",
  "training_field": "train-field",
  "dimension": 4,
  "description": "model with byte data",
  "data_type": "byte",
  "method": {
    "name": "ivf",
    "engine": "faiss",
    "space_type": "l2",
    "parameters": {
      "nlist": 1,
      "nprobes": 1
    }
  }
}
```
{% include copy-curl.html %}
To check the model training status, call the Get Model API:

```json
GET _plugins/_knn/models/byte-vector-model?filter_path=state
```
{% include copy-curl.html %}

Once training is complete, the `state` changes to `created`.
Next, create an index that will initialize its native library indexes using the trained model:

```json
PUT test-byte-ivf
{
  "settings": {
    "index": {
      "knn": true
    }
  },
  "mappings": {
    "properties": {
      "my_vector": {
        "type": "knn_vector",
        "model_id": "byte-vector-model"
      }
    }
  }
}
```
{% include copy-curl.html %}
Ingest the data containing the byte vectors that you want to search into the created index:

```json
PUT _bulk?refresh=true
{ "index": { "_index": "test-byte-ivf", "_id": "1" } }
{ "my_vector": [7, 10, 15, -120] }
{ "index": { "_index": "test-byte-ivf", "_id": "2" } }
{ "my_vector": [10, -100, 120, -108] }
{ "index": { "_index": "test-byte-ivf", "_id": "3" } }
{ "my_vector": [1, -2, 5, -50] }
{ "index": { "_index": "test-byte-ivf", "_id": "4" } }
{ "my_vector": [9, -7, 45, -78] }
{ "index": { "_index": "test-byte-ivf", "_id": "5" } }
{ "my_vector": [80, -70, 127, -128] }
```
{% include copy-curl.html %}
Finally, search the data. Be sure to provide a byte vector in the k-NN vector field:

```json
GET test-byte-ivf/_search
{
  "size": 2,
  "query": {
    "knn": {
      "my_vector": {
        "vector": [100, -120, 50, -45],
        "k": 2
      }
    }
  }
}
```
{% include copy-curl.html %}
### Memory estimation

In the best-case scenario, byte vectors require 25% of the memory that 32-bit float vectors require.

#### HNSW memory estimation

The memory required for Hierarchical Navigable Small Worlds (HNSW) is estimated to be `1.1 * (dimension + 8 * M)` bytes/vector.

As an example, assume that you have 1 million vectors with a dimension of 256 and an M of 16. The memory requirement can be estimated as follows:

```r
1.1 * (256 + 8 * 16) * 1,000,000 ~= 0.39 GB
```
#### IVF memory estimation

The memory required for IVF is estimated to be `1.1 * ((dimension * num_vectors) + (4 * nlist * dimension))` bytes.

As an example, assume that you have 1 million vectors with a dimension of 256 and an `nlist` of 128. The memory requirement can be estimated as follows:

```r
1.1 * ((256 * 1,000,000) + (4 * 128 * 256)) ~= 0.27 GB
```
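The two estimates above can be reproduced with a short script (a sketch of the formulas in this section; with 1 GB taken as 2^30 bytes, the HNSW example rounds to 0.39 GB and the IVF example to roughly 0.26 GB):

```python
def hnsw_memory_gib(dimension, m, num_vectors):
    # 1.1 * (dimension + 8 * M) bytes per vector, summed over all vectors
    return 1.1 * (dimension + 8 * m) * num_vectors / 1024**3

def ivf_memory_gib(dimension, num_vectors, nlist):
    # 1.1 * ((dimension * num_vectors) + (4 * nlist * dimension)) bytes total
    return 1.1 * ((dimension * num_vectors) + (4 * nlist * dimension)) / 1024**3

print(round(hnsw_memory_gib(256, 16, 1_000_000), 2))  # 0.39
print(round(ivf_memory_gib(256, 1_000_000, 128), 2))  # 0.26
```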
## Binary k-NN vectors

You can reduce memory costs by a factor of 32 by switching from float to binary vectors.

_search-plugins/knn/knn-vector-quantization.md

Lines changed: 3 additions & 0 deletions
@@ -13,6 +13,9 @@ By default, the k-NN plugin supports the indexing and querying of vectors of typ

OpenSearch supports many varieties of quantization. In general, the level of quantization will provide a trade-off between the accuracy of the nearest neighbor search and the size of the memory footprint consumed by the vector search. The supported types include byte vectors, 16-bit scalar quantization, and product quantization (PQ).

+ ## Faiss byte vector
+ Starting with version 2.17, the k-NN plugin supports `byte` vectors with the Faiss engine, which helps reduce memory requirements. For more information, see [Faiss byte vector]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/knn-vector#faiss-byte-vector).

## Lucene byte vector

Starting with k-NN plugin version 2.9, you can use `byte` vectors with the Lucene engine in order to reduce the amount of required memory. This requires quantizing the vectors outside of OpenSearch before ingesting them into an OpenSearch index. For more information, see [Lucene byte vector]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/knn-vector#lucene-byte-vector).
