Test compression segment serializer #2696
Signed-off-by: Arun Ganesh <[email protected]>
…erializer Signed-off-by: oaganesh <[email protected]>
segmentWriteState.segmentInfo.name,
segmentWriteState.segmentSuffix,
KNNConstants.QUANTIZATION_STATE_FILE_SUFFIX
// public KNN990QuantizationStateWriter(SegmentWriteState segmentWriteState) throws IOException {
We shouldn't leave commented code in the PR.
// output = segmentWriteState.directory.createOutput(quantizationStateFileName, segmentWriteState.context);
// }

public KNN990QuantizationStateWriter(SegmentWriteState segmentWriteState, String fileSuffix) throws IOException {
Should we be using the quantization state writer for this? If we're writing a different segment state, we should probably have a separate writer. But this is OK for POC purposes.
That's a good idea. When I tried adding a separate writer, it required a lot of further refactoring, as the other classes implement this.
    this(segmentWriteState, KNNConstants.QUANTIZATION_STATE_FILE_SUFFIX);
}

// public KNN990QuantizationStateWriter(SegmentWriteState segmentWriteState) throws IOException {
Let's not leave commented code here.
);
final QuantizationState quantizationState = train(field.getFieldInfo(), knnVectorValuesSupplier, totalLiveDocs);
SegmentProfilerState.profileVectors(knnVectorValuesSupplier);
profile(field.getFieldInfo(), knnVectorValuesSupplier, totalLiveDocs);
Are we correct in assuming that knnVectorValuesSupplier provides the non-compressed vectors, i.e. that profiling happens before compression?
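To make the ordering question concrete, here is a minimal sketch of why it matters whether the profiler sees raw or compressed values. Everything in it is an assumption for illustration: `RawVectorProfileSketch`, `profileMeans`, and the toy one-bit `quantize` are hypothetical names, and a plain `Supplier<float[][]>` stands in for the plugin's vector values supplier. It is not the plugin's `SegmentProfilerState` logic.

```java
import java.util.function.Supplier;

/** Hypothetical stand-in for segment profiling; not the plugin's SegmentProfilerState. */
final class RawVectorProfileSketch {

    /** Computes the per-dimension mean of the vectors handed out by the supplier. */
    static double[] profileMeans(Supplier<float[][]> rawVectors) {
        float[][] vectors = rawVectors.get();
        if (vectors.length == 0) {
            return new double[0];
        }
        double[] means = new double[vectors[0].length];
        for (float[] vector : vectors) {
            for (int d = 0; d < means.length; d++) {
                means[d] += vector[d];
            }
        }
        for (int d = 0; d < means.length; d++) {
            means[d] /= vectors.length;
        }
        return means;
    }

    /** Toy one-bit quantizer (assumption): 1 if the value exceeds the dimension mean, else 0. */
    static byte[][] quantize(float[][] vectors, double[] means) {
        byte[][] bits = new byte[vectors.length][];
        for (int i = 0; i < vectors.length; i++) {
            bits[i] = new byte[vectors[i].length];
            for (int d = 0; d < vectors[i].length; d++) {
                bits[i][d] = (byte) (vectors[i][d] > means[d] ? 1 : 0);
            }
        }
        return bits;
    }
}
```

In this toy setup, profiling the raw vectors `{1, 10}` and `{3, 30}` gives means of 2 and 20, whereas profiling the quantized output would give 0.5 for both dimensions. That is the difference the question above is trying to rule out.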
private void initSegmentStateWriterIfNecessary() throws IOException {
    if (segmentStateWriter == null) {
        segmentStateWriter = new KNN990QuantizationStateWriter(segmentWriteState, KNNConstants.SEGMENT_PROFILE_STATE_FILE_SUFFIX);
        segmentStateWriter.writeHeader(segmentWriteState);
If we're going with this approach, we'll have to refactor the QuantizationStateWriter to support generic file writing.
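One possible shape for such a refactor, sketched under stated assumptions: the writer accepts any state that can serialize itself to bytes, and records where each field's bytes landed. `SerializableSegmentState`, `GenericSegmentStateWriterSketch`, and the footer layout are all hypothetical, and a JDK `DataOutputStream` stands in for the Lucene output the real writer uses.

```java
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical abstraction: any per-segment state that can serialize itself. */
interface SerializableSegmentState {
    byte[] toByteArray() throws IOException;
}

/** Hypothetical generic writer; a sketch, not the plugin's KNN990QuantizationStateWriter. */
final class GenericSegmentStateWriterSketch implements AutoCloseable {
    private final DataOutputStream out;
    // Each entry holds {fieldNumber, position, length} for one written state.
    private final List<long[]> index = new ArrayList<>();

    GenericSegmentStateWriterSketch(OutputStream sink) {
        this.out = new DataOutputStream(sink);
    }

    /** Appends the state's bytes and remembers their position for the footer. */
    void writeState(int fieldNumber, SerializableSegmentState state) throws IOException {
        byte[] bytes = state.toByteArray();
        long position = out.size(); // bytes written so far
        out.write(bytes);
        index.add(new long[] { fieldNumber, position, bytes.length });
    }

    /** Writes the index entries, then the entry count and the offset where the index starts. */
    void writeFooter() throws IOException {
        long indexStart = out.size();
        for (long[] entry : index) {
            out.writeInt((int) entry[0]);  // field number
            out.writeLong(entry[1]);       // position of the state bytes
            out.writeInt((int) entry[2]);  // length of the state bytes
        }
        out.writeInt(index.size());
        out.writeLong(indexStart);
    }

    @Override
    public void close() throws IOException {
        out.close();
    }
}
```

With this shape, the quantization state and the segment profile state would differ only in the file suffix and in the `toByteArray` implementation they pass in, so a second writer class might not be needed.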
@Override
public void writeTo(StreamOutput streamOutput) throws IOException {
    streamOutput.writeString(shardId);
I don't think we need this override. Did we confirm that it's even getting called in the API path?
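For context on what such an override has to guarantee if it is kept, here is a small round-trip sketch. It assumes the JDK's `DataOutput` and `DataInput` as stand-ins for OpenSearch's `StreamOutput` and `StreamInput`; `ShardProfileResultSketch` and its `segmentCount` field are hypothetical, and only `shardId` comes from the diff. The point is that the reading constructor must consume fields in exactly the order `writeTo` emits them.

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

/** Hypothetical shard-level result; JDK streams stand in for StreamInput/StreamOutput. */
final class ShardProfileResultSketch {
    final String shardId;
    final int segmentCount; // hypothetical extra field, for illustration only

    ShardProfileResultSketch(String shardId, int segmentCount) {
        this.shardId = shardId;
        this.segmentCount = segmentCount;
    }

    /** Reading constructor: field order must mirror writeTo. */
    ShardProfileResultSketch(DataInput in) throws IOException {
        this.shardId = in.readUTF();
        this.segmentCount = in.readInt();
    }

    /** Serializes every field the reading constructor expects, in the same order. */
    void writeTo(DataOutput out) throws IOException {
        out.writeUTF(shardId);
        out.writeInt(segmentCount);
    }
}
```

A round-trip test of this kind (write, read back, compare fields) is one inexpensive way to check the serialization path, independent of whether the API path actually invokes it.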
@Getter
public class KNNProfileRequest extends BroadcastRequest<KNNProfileRequest> {

    private String index;
Nit: this field can be final.
}

byte[] stateBytes = readStateBytes(input, position, length);
return SegmentProfilerState.fromBytes(stateBytes);
Overall this logic looks OK to me, but we can probably clean it up later.
Description
Writes the segment profiler state to a file and profiles the different compression standards for benchmark testing.
Related Issues
Implements #2687
Check List
Commits are signed off using --signoff.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check ✔️ .