-
Notifications
You must be signed in to change notification settings - Fork 144
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Remote Vector Index Build] Add metric collection #2615
base: main
Are you sure you want to change the base?
Conversation
a795d41
to
97a5c0c
Compare
Signed-off-by: owenhalpert <[email protected]>
97a5c0c
to
595fc7c
Compare
src/main/java/org/opensearch/knn/index/codec/nativeindex/remote/RemoteIndexBuildStrategy.java
Outdated
Show resolved
Hide resolved
70ca41c
to
e102dfc
Compare
/** | ||
* Build and write index with flush flag. Default implementation calls the non-flag version. | ||
* RemoteIndexBuildStrategy will override this to use the flag. | ||
* | ||
* @param indexInfo {@link BuildIndexParams} containing information about the index to be built | ||
* @param isFlush Flag indicating if operation is from flush (vs merge) | ||
* @throws IOException if an error occurs during the build process | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can just explain that this is to differentiate the context for metrics purposes rather than explain what it is doing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There doesn't seem to be a need for this method - why can't we add context
in BuildIndexParams
? the context can be either merge or flush
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can do that, good idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shatejas I added a boolean isFlush
field in BuildIndexParams
. I considered having this be a string of either merge
or flush
but I felt that keeping it as a boolean named isFlush
is more clear for its usage.
src/main/java/org/opensearch/knn/index/codec/nativeindex/NativeIndexWriter.java
Outdated
Show resolved
Hide resolved
import static org.opensearch.knn.plugin.stats.KNNRemoteIndexBuildValue.BUILD_REQUEST_FAILURE_COUNT; | ||
import static org.opensearch.knn.plugin.stats.KNNRemoteIndexBuildValue.BUILD_REQUEST_SUCCESS_COUNT; | ||
import static org.opensearch.knn.plugin.stats.KNNRemoteIndexBuildValue.INDEX_BUILD_FAILURE_COUNT; | ||
import static org.opensearch.knn.plugin.stats.KNNRemoteIndexBuildValue.INDEX_BUILD_SUCCESS_COUNT; | ||
import static org.opensearch.knn.plugin.stats.KNNRemoteIndexBuildValue.READ_FAILURE_COUNT; | ||
import static org.opensearch.knn.plugin.stats.KNNRemoteIndexBuildValue.READ_SUCCESS_COUNT; | ||
import static org.opensearch.knn.plugin.stats.KNNRemoteIndexBuildValue.READ_TIME; | ||
import static org.opensearch.knn.plugin.stats.KNNRemoteIndexBuildValue.REMOTE_INDEX_BUILD_CURRENT_FLUSH_OPERATIONS; | ||
import static org.opensearch.knn.plugin.stats.KNNRemoteIndexBuildValue.REMOTE_INDEX_BUILD_CURRENT_FLUSH_SIZE; | ||
import static org.opensearch.knn.plugin.stats.KNNRemoteIndexBuildValue.REMOTE_INDEX_BUILD_CURRENT_MERGE_OPERATIONS; | ||
import static org.opensearch.knn.plugin.stats.KNNRemoteIndexBuildValue.REMOTE_INDEX_BUILD_CURRENT_MERGE_SIZE; | ||
import static org.opensearch.knn.plugin.stats.KNNRemoteIndexBuildValue.REMOTE_INDEX_BUILD_FLUSH_TIME; | ||
import static org.opensearch.knn.plugin.stats.KNNRemoteIndexBuildValue.REMOTE_INDEX_BUILD_MERGE_TIME; | ||
import static org.opensearch.knn.plugin.stats.KNNRemoteIndexBuildValue.WAITING_TIME; | ||
import static org.opensearch.knn.plugin.stats.KNNRemoteIndexBuildValue.WRITE_FAILURE_COUNT; | ||
import static org.opensearch.knn.plugin.stats.KNNRemoteIndexBuildValue.WRITE_SUCCESS_COUNT; | ||
import static org.opensearch.knn.plugin.stats.KNNRemoteIndexBuildValue.WRITE_TIME; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking out loud here, should we try to offload all of the stats collection to a separate class? One example could be we have a MetricsCollectingRemoteIndexBuildStrategy extends RemoteIndexBuildStrategy
which does all the metrics collection.
0841e10
to
04e1b56
Compare
Signed-off-by: owenhalpert <[email protected]>
04e1b56
to
201614b
Compare
recordBuildRequestSuccess(stopWatch.stop().totalTime().millis(), indexInfo.getFieldName()); | ||
return remoteBuildResponse; | ||
} catch (IOException e) { | ||
recordBuildRequestFailure(stopWatch.stop().totalTime().millis(), indexInfo.getFieldName(), e); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to do these in finally block. Any other exception and you are not capturing failure. another approach is to catch a generic approach which is generally not recommended. I see the pattern at other places as well.
bool success = false;
try {
success = true;
......
} finally {
recordBuildRequestMetrics(success, stopWatch.millis);
}
I see you are passing in the exception and its just used for logging. Please log in the catch block so you don't have to pass the error
Description
Adds metric collection for various metrics related to the Remote Vector Index Build service, with stats only being returned in
/stats
if the feature flag is enabled. Also adds a flag to detect whether the index build is coming from merge or flush to get granular operation-level metrics.Gist showing stats response with and without feature flag enabled: https://gist.github.com/owenhalpert/a05029924887390bed6c14d60b34bf01
New metrics (only returned if feature flag is enabled):
From @jed326's original PR (#2566):
Related Issues
Meta #2391
Check List
- [ ] New functionality has been documented.- [ ] API changes companion pull request created.--signoff
.- [ ] Public documentation issue/PR created.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.