Add fallback support to Lucene104ScalarQuantizedVectorsFormat getFloatVectorValues when there are no full-precision vectors present #15415
benwtrent
left a comment
Looking good! I am sorry we missed this with the new format. Thank you for taking care of it.
lucene/CHANGES.txt
Outdated
    (Ben Trent)

    * GITHUB#15415: Add fallback support to Lucene104ScalarQuantizedVectorsFormat getFloatVectorValues when there are
      no full-precision vectors present
Please add your name for posterity :)
Oh I missed it. Fixed in next revision.
      }

      OffHeapScalarQuantizedFloatVectorValues(
          boolean isQuerySide,
We should never allow this "querySide" thing. Even as it is now, it wouldn't work.
    // unpack bytes
    switch (encoding) {
      case PACKED_NIBBLE ->
          OffHeapScalarQuantizedVectorValues.unpackNibbles(byteValue, unpackedByteVectorValue);
      case SINGLE_BIT_QUERY_NIBBLE ->
          OptimizedScalarQuantizer.unpackBinary(byteValue, unpackedByteVectorValue);
      case UNSIGNED_BYTE, SEVEN_BIT -> {
        deQuantize(byteValue, vectorValue, encoding.getBits(), correctiveValues, centroid);
        lastOrd = targetOrd;
        return vectorValue;
      }
    }
If this was "query side" it wouldn't work. Consequently, I think this query side thing should go away.
I think this piece is great if we always assume document quantized bits.
        byte[] quantized,
        float[] dequantized,
        byte bits,
        float[] correctiveValues,
Maybe pass the lower and upper values explicitly instead of this array?
        byte[] quantized,
        float[] dequantized,
        byte bits,
        float[] correctiveValues,
Let's name these lowerInterval and upperInterval.
Renamed in next revision.
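For readers following the thread, here is a minimal sketch of what a `deQuantize` helper with explicit `lowerInterval` and `upperInterval` parameters could look like. It is not the code from this PR: the class name, the exact signature, and the reconstruction formula (a linear map from quantization level back onto the interval, then adding the centroid) are assumptions made for illustration.

```java
// Illustrative sketch only; not the implementation from this PR.
final class DequantizeSketch {
  private DequantizeSketch() {}

  /**
   * Hypothetical helper: maps each quantized level in [0, 2^bits - 1] linearly back onto
   * [lowerInterval, upperInterval] and then adds the per-dimension centroid.
   * The formula is an assumption about how the quantizer centered and scaled the vector.
   */
  public static void deQuantize(
      byte[] quantized,
      float[] dequantized,
      int bits,
      float lowerInterval,
      float upperInterval,
      float[] centroid) {
    if (bits < 1 || bits > 8) {
      throw new IllegalArgumentException("bits must be in [1, 8], got " + bits);
    }
    if (dequantized.length < quantized.length || centroid.length < quantized.length) {
      throw new IllegalArgumentException("output and centroid must cover every dimension");
    }
    int steps = (1 << bits) - 1; // number of intervals between adjacent levels
    float stepSize = (upperInterval - lowerInterval) / steps;
    for (int i = 0; i < quantized.length; i++) {
      int level = quantized[i] & 0xFF; // treat the stored byte as unsigned
      dequantized[i] = centroid[i] + lowerInterval + level * stepSize;
    }
  }
}
```

Passing the two interval bounds as named floats, rather than indexing into a `correctiveValues` array, makes the call site self-describing, which is the point of the rename requested above.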
benwtrent
left a comment
Good stuff! Thank you
Thanks @benwtrent for such a quick review!!
Description
Add fallback support to Lucene104ScalarQuantizedVectorsFormat.getFloatVectorValues() when there are no full-precision vectors present. An earlier PR added this support to Lucene99ScalarQuantizedVectorsFormat, but it was missed in the new vector codec. This PR adds that support back.
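The fallback described above can be pictured with a small, self-contained sketch. Everything in it is hypothetical: the `FloatVectors` interface, the `floatVectorsOrFallback` method, and the "raw view is null or empty" check are stand-ins chosen for illustration and are not Lucene APIs or the logic used in this PR.

```java
// Illustrative sketch only; names and the emptiness check are assumptions.
final class FloatVectorFallbackSketch {
  private FloatVectorFallbackSketch() {}

  /** Hypothetical minimal view over per-document float vectors. */
  public interface FloatVectors {
    int size();

    float[] vectorValue(int ord);
  }

  /**
   * Returns the full-precision view when it holds any vectors; otherwise falls back to a
   * view that reconstructs approximate floats from the quantized data.
   */
  public static FloatVectors floatVectorsOrFallback(FloatVectors raw, FloatVectors dequantized) {
    if (raw != null && raw.size() > 0) {
      return raw; // full-precision vectors are present, so use them as-is
    }
    return dequantized; // no raw vectors stored: serve dequantized approximations
  }
}
```

With this shape, callers of the float accessor need no special case for segments that store only quantized vectors; they simply receive approximate values.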