Closed
27 commits
10fa912
Adding basic generic vector profiler implementation and tests. (#2624)
oaganesh Apr 2, 2025
b955792
Extend QuantizationStateWriter to Vector Profiler
markwu-sde Apr 14, 2025
fefe7c8
Updating main implementation for serialization and aggregation for pr…
oaganesh Apr 18, 2025
49cadd5
Updating Changelog file.
oaganesh Apr 18, 2025
d844f27
Updating import changes.
oaganesh Apr 18, 2025
0b49596
Optimize first child lookup for nested docs (#2637)
jmazanec15 Apr 2, 2025
9f49e51
Add cmake policy flag (#2645)
owenhalpert Apr 2, 2025
04f3265
Add github action to run ITs against remote index builder (#2620)
anntians Apr 3, 2025
a5356f2
Enhance derived source its (#2648)
jmazanec15 Apr 6, 2025
bd9d2d1
[Remote Vector Index Build] Add metric collection (#2615)
owenhalpert Apr 7, 2025
37b3912
Update engine for version 2.19 or above (#2498)
VijayanB Apr 7, 2025
04adc36
Add multi-vector-support faiss patch to IndexHNSW::search_level_0 (#2…
anntians Apr 9, 2025
a0870a2
Combine method and lucene mappers to EngineFieldMapper (#2646)
kotwanikunal Apr 9, 2025
88774a9
3.0.0 Beta Release Notes (#2656)
Vikasht34 Apr 10, 2025
4201fb0
Fix build due to phasing off SecurityManager usage in favor of Java A…
reta Apr 14, 2025
105b970
Prevent derived source from open reader per transform (#2652)
jmazanec15 Apr 14, 2025
ac813ff
Removing redundant type conversions for script scoring for hamming sp…
kasundra07 Apr 14, 2025
eb2a30c
Explain API changes for Exact/ANN/Radial/Disk based KNN search (#2403)
neetikasinghal Apr 15, 2025
5165dfc
Fix a bug to save the best matching similarity function in meta info.…
0ctopus13prime Apr 15, 2025
9feb203
[BUGFIX] FIX nested vector query at efficient filter scenarios (#2641)
luyuncheng Apr 16, 2025
7a509a3
Fix concurrency bug to share non-thread safe graph structure. (#2663)
0ctopus13prime Apr 17, 2025
044c7d1
Switch derived to default (#2664)
jmazanec15 Apr 17, 2025
03f4e18
Change skip building vector data structure log to debug level (#2639)
owenhalpert Apr 17, 2025
d4022d3
Fix quantization cache bugs. (#2666)
0ctopus13prime Apr 18, 2025
23622da
Extend QuantizationStateWriter to Vector Profiler
markwu-sde Apr 14, 2025
b9eb1b8
Adding basic generic vector profiler implementation and tests. (#2624)
oaganesh Apr 2, 2025
9802027
Applying spotless changes.
oaganesh Apr 24, 2025
144 changes: 144 additions & 0 deletions .github/workflows/remote_index_build.yml
@@ -0,0 +1,144 @@
name: Build and Test k-NN using Remote Index Builder
on:
  schedule:
    - cron: '0 0 * * *' # every night
  push:
    branches:
      - "*"
      - "feature/**"
    paths:
      - 'build.gradle'
      - 'settings.gradle'
      - 'src/**'
      - 'build-tools/**'
      - 'buildSrc/**'
      - 'gradle/**'
      - 'jni/**'
      - '.github/workflows/remote_index_build.yml'
  pull_request:
    branches:
      - "*"
      - "feature/**"
    paths:
      - 'build.gradle'
      - 'settings.gradle'
      - 'src/**'
      - 'build-tools/**'
      - 'buildSrc/**'
      - 'gradle/**'
      - 'jni/**'
      - '.github/workflows/remote_index_build.yml'

jobs:
  Remote-Index-Build-IT-Tests:
    strategy:
      matrix:
        java: [21]

    env:
      AWS_ACCESS_KEY_ID: test
      AWS_SECRET_ACCESS_KEY: test
      AWS_SESSION_TOKEN: test

    name: Remote-Index-Build-IT-Tests on Linux
    runs-on:
      group: selfhosted-gpu-runners
      labels: g6xlarge

    steps:
      - name: Checkout k-NN
        uses: actions/checkout@v4

      # Setup git user so that patches for native libraries can be applied and committed
      - name: Setup git user
        run: |
          git config --global user.name "github-actions[bot]"
          git config --global user.email "github-actions[bot]@users.noreply.github.com"

      - name: Setup Java ${{ matrix.java }}
        uses: actions/setup-java@v4
        with:
          java-version: ${{ matrix.java }}
          distribution: 'temurin'

      - name: Install dependencies on linux
        run: |
          sudo yum install gcc g++ -y
          sudo yum install openblas openblas-devel -y
          sudo yum install -y zlib
          sudo yum install -y zlib-devel
          sudo yum install -y cmake
          sudo yum install gcc-gfortran -y

      - name: Initial cleanup
        run: |
          docker ps -aq | xargs -r docker rm -f
          docker system prune -af --volumes

      - name: Pull Remote Index Build Docker Image from Docker Hub
        run: |
          docker pull rchitale7/remote-index-build-service:api

      - name: Pull LocalStack Docker image
        run: |
          docker pull localstack/localstack:latest

      - name: Run LocalStack
        run: |
          docker run --rm -d -p 4566:4566 localstack/localstack:latest

      - name: Verify Localstack is ready
        run: |
          if ! timeout 3 bash -c 'until curl --silent --fail http://localhost:4566/_localstack/health; do sleep 1; done'; then
            echo "Localstack health check failed after 3 seconds"
            exit 1
          fi

      - name: Create S3 Bucket in LocalStack
        run: |
          aws --endpoint-url=http://localhost:4566 s3 mb s3://remote-index-build-bucket

      - name: Run Docker container
        run: |
          docker run --rm -d --name remote-index-builder-container --gpus all -p 80:80 -e INTEGRATION_TESTS=TRUE -e AWS_ACCESS_KEY_ID=${{ env.AWS_ACCESS_KEY_ID }} -e AWS_SECRET_ACCESS_KEY=${{ env.AWS_SECRET_ACCESS_KEY }} -e AWS_SESSION_TOKEN=${{ env.AWS_SESSION_TOKEN}} rchitale7/remote-index-build-service:api
          sleep 5

      - name: Run tests
        run: |
          if lscpu | grep -i avx512f | grep -i avx512cd | grep -i avx512vl | grep -i avx512dq | grep -i avx512bw
          then
            if lscpu | grep -q "GenuineIntel" && lscpu | grep -i avx512_fp16 | grep -i avx512_bf16 | grep -i avx512_vpopcntdq
            then
              echo "the system is an Intel(R) Sapphire Rapids or a newer-generation processor"
              ./gradlew :integTestRemoteIndexBuild -Ds3.enabled=true -Dtest.remoteBuild=s3.localStack -Dtest.bucket=remote-index-build-bucket -Dtest.base_path=vectors -Daccess_key=${{ env.AWS_ACCESS_KEY_ID }} -Dsecret_key=${{ env.AWS_SECRET_ACCESS_KEY }} -Dsession_token=${{ env.AWS_SESSION_TOKEN}} -Dtests.class=org.opensearch.knn.index.RemoteBuildIT -Davx512_spr.enabled=true -Dnproc.count=`nproc`
            else
              echo "avx512 available on system"
              ./gradlew :integTestRemoteIndexBuild -Ds3.enabled=true -Dtest.remoteBuild=s3.localStack -Dtest.bucket=remote-index-build-bucket -Dtest.base_path=vectors -Daccess_key=${{ env.AWS_ACCESS_KEY_ID }} -Dsecret_key=${{ env.AWS_SECRET_ACCESS_KEY }} -Dsession_token=${{ env.AWS_SESSION_TOKEN}} -Dtests.class=org.opensearch.knn.index.RemoteBuildIT -Davx512_spr.enabled=false -Dnproc.count=`nproc`
            fi
          elif lscpu | grep -i avx2
          then
            echo "avx2 available on system"
            ./gradlew :integTestRemoteIndexBuild -Ds3.enabled=true -Dtest.remoteBuild=s3.localStack -Dtest.bucket=remote-index-build-bucket -Dtest.base_path=vectors -Daccess_key=${{ env.AWS_ACCESS_KEY_ID }} -Dsecret_key=${{ env.AWS_SECRET_ACCESS_KEY }} -Dsession_token=${{ env.AWS_SESSION_TOKEN}} -Dtests.class=org.opensearch.knn.index.RemoteBuildIT -Davx512.enabled=false -Davx512_spr.enabled=false -Dnproc.count=`nproc`
          else
            echo "avx512 and avx2 not available on system"
            ./gradlew :integTestRemoteIndexBuild -Ds3.enabled=true -Dtest.remoteBuild=s3.localStack -Dtest.bucket=remote-index-build-bucket -Dtest.base_path=vectors -Daccess_key=${{ env.AWS_ACCESS_KEY_ID }} -Dsecret_key=${{ env.AWS_SECRET_ACCESS_KEY }} -Dsession_token=${{ env.AWS_SESSION_TOKEN}} -Dtests.class=org.opensearch.knn.index.RemoteBuildIT -Davx2.enabled=false -Davx512.enabled=false -Davx512_spr.enabled=false -Dnproc.count=`nproc`
          fi

      - name: Verify Remote Index Builder logs
        run: |
          if docker logs remote-index-builder-container 2>&1 | grep -q "INFO - Index built successfully!"; then
            echo "Success logs found in Remote Index Builder container"
          else
            echo "No success logs found. Full logs:"
            docker logs remote-index-builder-container
            exit 1
          fi

      - name: Final cleanup
        if: always()
        run: |
          docker ps -aq | xargs -r docker rm -f
          docker system prune -af --volumes
          docker logout
          rm -rf ${{ github.workspace }}/*
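
Review note on the readiness checks: the workflow gives LocalStack 3 seconds to report healthy and then waits a fixed `sleep 5` for the builder container. A bounded retry helper is one way to make such waits explicit. The following is a minimal sketch, not part of this PR; the function name `wait_for` and its argument order (attempts, delay in seconds, command) are assumptions made for illustration.

```shell
# Hypothetical helper (not in the workflow): run a command repeatedly until
# it succeeds, pausing DELAY seconds between tries, for at most ATTEMPTS tries.
# Usage: wait_for ATTEMPTS DELAY command [args...]
wait_for() {
  wf_attempts=$1
  wf_delay=$2
  shift 2
  wf_i=0
  while [ "$wf_i" -lt "$wf_attempts" ]; do
    # Succeed as soon as the command returns exit status 0.
    if "$@"; then
      return 0
    fi
    wf_i=$((wf_i + 1))
    sleep "$wf_delay"
  done
  # Every attempt failed.
  return 1
}
```

With this helper, the LocalStack step could be written as `wait_for 30 1 curl --silent --fail http://localhost:4566/_localstack/health`. The budget of 30 attempts is an arbitrary illustrative value, not a figure from the PR.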

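Review note on the "Run tests" step: the four branches repeat the same `./gradlew` command and differ only in the SIMD flags. The selection logic can be sketched as a small function that maps `lscpu` output to those flags. This is a sketch under assumptions, not the PR's implementation: the names `has_all` and `simd_flags` are hypothetical, and unlike the chained `grep` calls in the workflow (which require all AVX-512 flags on the same line), this version checks each flag anywhere in the text.

```shell
# Hypothetical helpers (not in the workflow).

# has_all TEXT FLAG...: succeed only if every FLAG occurs in TEXT
# (case-insensitive, fixed-string match, like `grep -i` in the workflow).
has_all() {
  ha_text=$1
  shift
  for ha_flag in "$@"; do
    printf '%s\n' "$ha_text" | grep -qiF -- "$ha_flag" || return 1
  done
  return 0
}

# simd_flags TEXT: print the SIMD-related -D flags for the given lscpu output,
# mirroring the four branches of the "Run tests" step.
simd_flags() {
  sf_cpu=$1
  if has_all "$sf_cpu" avx512f avx512cd avx512vl avx512dq avx512bw; then
    if has_all "$sf_cpu" GenuineIntel avx512_fp16 avx512_bf16 avx512_vpopcntdq; then
      # Intel Sapphire Rapids or newer
      echo "-Davx512_spr.enabled=true"
    else
      echo "-Davx512_spr.enabled=false"
    fi
  elif has_all "$sf_cpu" avx2; then
    echo "-Davx512.enabled=false -Davx512_spr.enabled=false"
  else
    echo "-Davx2.enabled=false -Davx512.enabled=false -Davx512_spr.enabled=false"
  fi
}
```

The step could then append `$(simd_flags "$(lscpu)")` to a single `./gradlew :integTestRemoteIndexBuild ...` invocation, so the shared arguments are written once.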
20 changes: 5 additions & 15 deletions CHANGELOG.md
@@ -5,28 +5,18 @@ All notable changes to this project are documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/), and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html). See the [CONTRIBUTING guide](./CONTRIBUTING.md#Changelog) for instructions on how to add changelog entries.

## [Unreleased 3.0](https://github.com/opensearch-project/k-NN/compare/2.x...HEAD)
### Features
* [Remote Vector Index Build] Client polling mechanism, encoder check, method parameter retrieval [#2576](https://github.com/opensearch-project/k-NN/pull/2576)
* [Remote Vector Index Build] Move client to separate module [#2603](https://github.com/opensearch-project/k-NN/pull/2603)
* Add filter function to KNNQueryBuilder with unit tests and integration tests [#2599](https://github.com/opensearch-project/k-NN/pull/2599)
* [Lucene On Faiss] Add a new mode, memory-optimized-search, which enables users to run vector search on a FAISS index in memory-constrained environments. [#2630](https://github.com/opensearch-project/k-NN/pull/2630)
### Enhancements
* Removing redundant type conversions for script scoring for hamming space with binary vectors [#2351](https://github.com/opensearch-project/k-NN/pull/2351)
### Bug Fixes
* Fixing bug to prevent NullPointerException while doing PUT mappings [#2556](https://github.com/opensearch-project/k-NN/issues/2556)
* Add index operation listener to update translog source [#2629](https://github.com/opensearch-project/k-NN/pull/2629)
* [Remote Vector Index Build] Fix bug to support `COSINESIMIL` space type [#2627](https://github.com/opensearch-project/k-NN/pull/2627)
### Infrastructure
### Documentation
### Maintenance
* Update minimum required CMAKE version in NMSLIB [#2635](https://github.com/opensearch-project/k-NN/pull/2635)
### Refactoring
* Switch derived source from field attributes to segment attribute [#2606](https://github.com/opensearch-project/k-NN/pull/2606)
* Migrate derived source from filter to mask [#2612](https://github.com/opensearch-project/k-NN/pull/2612)
* [BUGFIX] Fix KNN quantization state cache having an invalid weight threshold [#2666](https://github.com/opensearch-project/k-NN/pull/2666)

## [Unreleased 2.x](https://github.com/opensearch-project/k-NN/compare/2.19...2.x)
### Features
* [Vector Profiler] Adding basic generic vector profiler implementation and tests. [#2624](https://github.com/opensearch-project/k-NN/pull/2624)
* [Vector Profiler] Adding main segment implementation for API and indexing. [#2653](https://github.com/opensearch-project/k-NN/pull/2653)
### Enhancements
### Bug Fixes
* [BUGFIX] FIX nested vector query at efficient filter scenarios [#2641](https://github.com/opensearch-project/k-NN/pull/2641)
### Infrastructure
### Documentation
### Maintenance