Use IVF_PQ for GPU index build for large datasets #137126

mayya-sharipova · 2025-10-24T18:51:40Z

Use the IVF_PQ algorithm for GPU index building on large datasets (>= 1M vectors). Temporarily add a factory for calculating IVF_PQ params. Also skip the estimation of required memory when IVF_PQ is used.
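The selection rule described above can be sketched roughly as follows. This is only an illustration, not the code in this PR: the class `GraphBuildChoice`, the enum `Algo`, and the methods `chooseAlgo` and `needsMemoryEstimate` are hypothetical names, and treating NN_DESCENT as the algorithm for smaller datasets is an assumption. Only the 1M-vector threshold and the "skip memory estimation for IVF_PQ" rule come from the description.

```java
public class GraphBuildChoice {
    /** Hypothetical enum; a stand-in for whatever the GPU plugin actually uses. */
    enum Algo { NN_DESCENT, IVF_PQ }

    /** From the PR description: datasets with >= 1M vectors use IVF_PQ. */
    static final int LARGE_DATASET_THRESHOLD = 1_000_000;

    /** Assumption: NN_DESCENT remains the choice below the threshold. */
    static Algo chooseAlgo(int numVectors) {
        return numVectors >= LARGE_DATASET_THRESHOLD ? Algo.IVF_PQ : Algo.NN_DESCENT;
    }

    /** From the PR description: memory estimation is skipped when IVF_PQ is used. */
    static boolean needsMemoryEstimate(Algo algo) {
        return algo != Algo.IVF_PQ;
    }

    public static void main(String[] args) {
        for (int n : new int[] {10_000, 999_999, 1_000_000}) {
            Algo algo = chooseAlgo(n);
            System.out.println(n + " vectors -> " + algo
                + ", estimate memory: " + needsMemoryEstimate(algo));
        }
    }
}
```

Keeping the threshold and the memory-estimation rule in two small functions makes it easy to revisit either one independently once the temporary params factory is replaced.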

elasticsearchmachine · 2025-10-24T18:52:05Z

Pinging @elastic/es-search-relevance (Team:Search Relevance)

elasticsearchmachine · 2025-10-24T18:52:06Z

Hi @mayya-sharipova, I've created a changelog YAML for you.

mayya-sharipova · 2025-10-24T18:52:22Z

x-pack/plugin/gpu/src/main/java/org/elasticsearch/xpack/gpu/codec/CuVSResourceManager.java

        }

        private long estimateRequiredMemory(int numVectors, int dims, CuVSMatrix.DataType dataType) {
+            // for large vector sets, we use IVF+PQ or similar, so we don't skip blocking based on memory usage


@ldematte What do you think we should do here? I implemented it very naively.

mayya-sharipova · 2025-10-24T19:29:58Z

With these params (1M byte vectors):

@achirkin Notice how, when switching from NN_DESCENT to IVF_PQ, we got worse graph-building time and denser graphs.

gist: 1_000_000 docs; 960 dims; euclidean metric

| index_type | force_merge_time (ms) | QPS (1 seg) | recall (1 seg) |
| --- | --- | --- | --- |
| cpu | 130129 | 421 | 0.91 |
| gpu NN_DESCENT | 20643 | 467 | 0.92 |
| gpu IVF_PQ | 36536 | 149 | 1 |
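To put the numbers above in relative terms, here is a small sketch that computes the force-merge speedups and the QPS ratio from the reported values. The class name `BuildTimeRatios` and its `ratio` helper are made up for this illustration; the inputs are copied from the benchmark results and nothing else is measured.

```java
public class BuildTimeRatios {
    /** Plain ratio a / b, used for both speedups and QPS comparisons. */
    static double ratio(double a, double b) {
        return a / b;
    }

    public static void main(String[] args) {
        // force_merge_time (ms) as reported in the benchmark
        double cpuMs = 130_129, nnDescentMs = 20_643, ivfPqMs = 36_536;
        // QPS on 1 segment as reported in the benchmark
        double nnDescentQps = 467, ivfPqQps = 149;

        System.out.printf("NN_DESCENT speedup over cpu: %.2fx%n", ratio(cpuMs, nnDescentMs));
        System.out.printf("IVF_PQ speedup over cpu: %.2fx%n", ratio(cpuMs, ivfPqMs));
        System.out.printf("IVF_PQ build time relative to NN_DESCENT: %.2fx%n", ratio(ivfPqMs, nnDescentMs));
        System.out.printf("IVF_PQ QPS relative to NN_DESCENT: %.2fx%n", ratio(ivfPqQps, nnDescentQps));
    }
}
```

In this run both GPU builds are faster than the CPU build, but IVF_PQ takes longer to build than NN_DESCENT and serves fewer queries per second, which is consistent with the denser graphs noted above.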

Use IVF_PQ for GPU index build for large datasets

22d23c3

Use IVF_PQ algorithm for GPU index building for large dataset (>= 1M vectors). Temporarily add a factory for calculating IVF_PQ params. Also skip estimation of needed memory when IVF_PQ is used.

mayya-sharipova added the >enhancement, auto-backport, :Search Relevance/Vectors, v9.2.1, and v9.3.0 labels on Oct 24, 2025

elasticsearchmachine added the Team:Search Relevance label on Oct 24, 2025

Update docs/changelog/137126.yaml

931d06a


mayya-sharipova marked this pull request as draft October 24, 2025 19:17
