Skip to content

Conversation

sobhu17
Copy link

@sobhu17 sobhu17 commented Aug 8, 2025

Description

This is the second PR which will make use of the changes from the first PR. It includes the C++ changes added as a patch in the KNN project.
Currently we store vectors in two files (.vec and .faiss). The two vector sets stored in each file may differ where Faiss index file (.faiss) might contains quantized vectors for fast approximate search, while the Lucene .vec file stores the full-precision vectors used during exact search.
For FP32, Byte, and Pure Binary cases, maintaining two copies of the same vectors in both .vec and .faiss files is redundant. Both approximate and exact search operations can run on a single set of vectors, making it possible to eliminate this duplication safely.

Adding the link of my local faiss repo for reference- faiss_changes

Files changed in external faiss:

  • io.h
  • index_write.cpp
  • index_read.cpp
  • index_read_utils.h
  • index_io.h

Related Issues

Resolves #2823

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

}
}

-InvertedLists* read_InvertedLists(IOReader* f, int io_flags) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why change int -> uint64_t?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use uint64_t because there’s a possibility that FAISS might introduce a flag in the future that uses the same bit position as ours.
By defining const uint64_t DEDUPE_VECTORS_OPT_DISABLED = (1ull << 63);
we ensure our io_flags value sits in a high, reserved bit, minimizing the risk of conflicting with any future FAISS flags.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the info. That makes sense. I think the best way to document this in the code is to use this constant everywhere instead of raw 1ull << 63 value and add the info you shared here as a comment above the constant.

@navneet1v
Copy link
Collaborator

Adding the link of my local faiss repo for reference- faiss_changes

what is this patch actually doing?

idxf->code_size = idxf->d * sizeof(float);
- read_xb_vector(idxf->codes, f);
+ if(dedup_applied){
+ idxf->codes.resize(idxf->ntotal * idxf->code_size);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please add comments here explaining the process?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes sure

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A small overview for the patch:
During index loading, without this optimization the .faiss file contains the flat vector section. With this patch, the .faiss file will no longer store flat vectors, instead vectors are loaded from the .vec file into the C++ side using our custom VectorReader, which streams the vectors one by one. To enable this, vector writing to the .faiss file is disabled(index_write.cpp), and the loading path is updated to fetch vectors via VectorReader(index_read.cpp) at index load time.

// Now that indexWriter is trained, we just load the bytes into an array and return
faiss::VectorIOWriter vectorIoWriter;
faiss::write_index(indexWriter.get(), &vectorIoWriter);
faiss::write_index(indexWriter.get(), &vectorIoWriter, 1ull<<63);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could the ioflags be moved into a constant with a descriptive name?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I will create constant for io_flags and use that.

env->ReleaseByteArrayElements(vector, elems, JNI_ABORT);
});

int vectorByteSize = sizeof(float) * length;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be sizeof(byte) if it's the byte vector case?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a problem with that, for byte and other quantized(IxSQ) cases faiss expects float vectors and it internally performs quantization. I tried using the byte vectors array but that failed. So what we are doing here is getting byte vectors from VectorReader and then typecasting those vectors to float, then in read_index for IxSQ case what we are doing is using faiss sq.compute_codes so that faiss can get vectors in proper format.

if (env->ExceptionCheck() || vectorReaderGlobalRef == nullptr) return false;
}

if (isFloat) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add two helper functions to reduce branching? It's hard to tell where float and byte cases differ in this code, so helper would improve readability and maintainability.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it make sense, thank you.


float* floatDest = static_cast<float*>(dest);
for (int i = 0; i < length; ++i) {
floatDest[i] = static_cast<float>(elems[i]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use memcpy

Copy link
Author

@sobhu17 sobhu17 Aug 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, Got it!

Signed-off-by: sobhu17 <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants