Commit cc1f139
Refactor AVX2 distance computations to consistently use generic_simd_op (#196)
- [x] Understand the existing code structure and identify inconsistencies
- [x] Create `L2FloatOp<8>` for AVX2 L2 distance computations (protocol sketched after this list)
- [x] Create `ConvertToFloat<8>` base class for AVX2
- [x] Refactor L2 AVX2 implementations to use `simd::generic_simd_op()`
- [x] Create `IPFloatOp<8>` for AVX2 Inner Product computations
- [x] Refactor Inner Product AVX2 implementations to use `simd::generic_simd_op()`
- [x] Create `CosineFloatOp<8>` for AVX2 Cosine Similarity computations
- [x] Add AVX2 implementations for Cosine Similarity with all type combinations
- [x] Build and test all changes
- [x] Fix compilation warnings
- [x] Address code review feedback
- [x] Optimize masked load implementation
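For context, the protocol these op structs implement so that `simd::generic_simd_op()` can drive them is `init()` / `accumulate()` / `combine()` (see the issue quoted below). The following is a minimal illustrative sketch, not the actual SVS code: the struct name, member signatures, and call shape are assumptions modeled on the `L2FloatOp` referenced in `euclidean.h`.

```cpp
#include <immintrin.h>

// Illustrative only: an 8-wide (AVX2) float L2 op in the style of
// `L2FloatOp<8>`. The real definitions live in
// include/svs/core/distance/euclidean.h and may differ.
struct L2FloatOp8 {
    // init(): start from a zeroed accumulator register.
    static __m256 init() { return _mm256_setzero_ps(); }

    // accumulate(): sum += (a - b)^2 on 8 lanes.
    // (_mm256_fmadd_ps requires FMA, which typically accompanies AVX2;
    // otherwise use _mm256_add_ps(_mm256_mul_ps(diff, diff), sum).)
    static __m256 accumulate(__m256 sum, __m256 a, __m256 b) {
        __m256 diff = _mm256_sub_ps(a, b);
        return _mm256_fmadd_ps(diff, diff, sum);
    }

    // combine(): horizontally reduce the accumulator to a scalar.
    static float combine(__m256 sum) {
        __m128 lo = _mm256_castps256_ps128(sum);
        __m128 hi = _mm256_extractf128_ps(sum, 1);
        __m128 s = _mm_add_ps(lo, hi);
        s = _mm_hadd_ps(s, s);
        s = _mm_hadd_ps(s, s);
        return _mm_cvtss_f32(s);
    }
};
```

With such a struct, each `compute()` specialization collapses to a single dispatch along the lines of `simd::generic_simd_op(L2FloatOp8{}, a, b, dim)` (exact signature assumed).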
## Recent Changes
Reverted the AVX512VL conditional specializations based on reviewer feedback. All AVX2 implementations now use a consistent blend-mask approach, with no runtime conditionals on AVX512VL.
The implementation now (see the sketch after this list):
- Uses `create_blend_mask_avx2()` helper function to create masks
- Uses `_mm256_blendv_ps` for masked loads on AVX2
- Handles masking in load operations for accumulate functions
- Maintains clean separation between AVX2 and AVX512 code paths
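A minimal sketch of the masked-load idea, assuming a helper shaped like the `create_blend_mask_avx2()` named above; the helper signature and surrounding code here are assumptions, and the actual SVS implementation may differ:

```cpp
#include <immintrin.h>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for create_blend_mask_avx2(): build a mask whose
// first n lanes are all-ones and whose remaining lanes are zero.
inline __m256i tail_mask_avx2(size_t n) {
    alignas(32) static const int32_t lane[8] = {0, 1, 2, 3, 4, 5, 6, 7};
    __m256i ids = _mm256_load_si256(reinterpret_cast<const __m256i*>(lane));
    // lane id < n  ->  all-ones lane, else zero.
    return _mm256_cmpgt_epi32(_mm256_set1_epi32(static_cast<int32_t>(n)), ids);
}

inline __m256 masked_load_avx2(const float* ptr, size_t n) {
    __m256i mask = tail_mask_avx2(n);
    // Masked load never touches lanes past the end of the buffer.
    __m256 data = _mm256_maskload_ps(ptr, mask);
    // Blend with zeros so inactive lanes cannot perturb the accumulator.
    return _mm256_blendv_ps(
        _mm256_setzero_ps(), data, _mm256_castsi256_ps(mask));
}
```

Because each mask lane is all-ones or all-zero, `_mm256_blendv_ps` (which keys on each lane's sign bit) cleanly zeroes the tail lanes before they reach `accumulate()`.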
The earlier performance regression is resolved; benchmarks confirmed performance parity on both AVX512 and AVX2 systems.
<details>
<summary>Original prompt</summary>
>
> ----
>
> *This section details the original issue you should resolve.*
>
> <issue_title>Consistently use compute ops in distance computations</issue_title>
> <issue_description>Refactor the compute operations in our [distance primitives](https://github.com/intel/ScalableVectorSearch/tree/main/include/svs/core/distance).
>
> The goal is that all `compute()` methods dispatch to `simd::generic_simd_op()`, as is done, for example, in
>
> https://github.com/intel/ScalableVectorSearch/blob/18ba515e595ee65ff39d3abae3f42a56b3f5ab2c/include/svs/core/distance/euclidean.h#L321
>
> To achieve this, the actual compute operation must be wrapped in a SIMD struct that provides the necessary protocol. One example is `L2FloatOp`:
>
> https://github.com/intel/ScalableVectorSearch/blob/18ba515e595ee65ff39d3abae3f42a56b3f5ab2c/include/svs/core/distance/euclidean.h#L240-L259
>
> That is, it is necessary to provide `init()`, `accumulate()`, and `combine()` operations.
>
> The chain `compute() -> simd::generic_simd_op() -> <actual compute>` already happens in many places, but not consistently. One example where AVX intrinsics are still used directly in `compute()` is
>
> https://github.com/intel/ScalableVectorSearch/blob/18ba515e595ee65ff39d3abae3f42a56b3f5ab2c/include/svs/core/distance/euclidean.h#L369
>
> But many other examples exist in our [distance primitives](https://github.com/intel/ScalableVectorSearch/tree/main/include/svs/core/distance).</issue_description>
>
> <agent_instructions>As a first step, it is enough to focus on consistency between L2, Inner product, and cosine similarity distances.</agent_instructions>
>
> ## Comments on the Issue (you are @copilot in this section)
>
> <comments>
> </comments>
>
</details>
Fixes #183
---------
Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: ahuber21 <[email protected]>
Co-authored-by: Andreas Huber <[email protected]>
File tree: 4 files changed in include/svs/core/distance (+264, −204 lines).