-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added changes to enable full file cache stats #17538
base: main
Are you sure you want to change the base?
Added changes to enable full file cache stats #17538
Conversation
❌ Gradle check result for da0dcc9: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
server/src/main/java/org/opensearch/index/store/remote/filecache/FullFileCacheStats.java
Outdated
Show resolved
Hide resolved
...r/src/main/java/org/opensearch/index/store/remote/filecache/FullFileCacheStatsCollector.java
Outdated
Show resolved
Hide resolved
server/src/internalClusterTest/java/org/opensearch/remotestore/WritableWarmIT.java
Outdated
Show resolved
Hide resolved
server/src/internalClusterTest/java/org/opensearch/remotestore/WritableWarmIT.java
Outdated
Show resolved
Hide resolved
server/src/internalClusterTest/java/org/opensearch/remotestore/WritableWarmIT.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/opensearch/index/store/remote/filecache/FileCache.java
Outdated
Show resolved
Hide resolved
server/src/test/java/org/opensearch/index/store/remote/filecache/FileCacheTests.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me from the functionality point of view.
But have a few concerns related to extensibility (specially when we are going to introduce the pinning functionality)
-
We have added FullFileCacheStats to account for the stats of the full file. Will we require a similar file for accounting for stats of blocked files (and pinned files in future) ? In that case should we not have it in a single place - CacheUsage would be a good place for that since we are already maintaning the usage and activeUsage overall there.
-
Also adding to my comment on making changes in the LRUCache instead of FileCache, how about adding changes to the Node in LRUCache to check whether the IndexInput is for block or full file. We already have a weight param for Node - we can add file type, pinned status etc as per our use case to simplify the stats/usage aggregation. We would just need to make changes in the replaceNode and addNode functions to account for our needed stats/usages. Something similar to below.
static class Node<K, V> {
final K key;
V value;
long weight;
int refCount;
// Add below properties for Node
FileType fileType;
PinStatus pinned;
}
public class CacheUsage {
private final long usage;
private final long activeUsage;
// Add below properties in CacheUsage
private final long blockUsage;
private final long nonBlockUsage;
private final long pinnedUsage;
}
@gbbafna @ankitkala @mayanksharma27 thoughts on the above approach ?
server/src/main/java/org/opensearch/index/store/remote/filecache/FileCache.java
Outdated
Show resolved
Hide resolved
Signed-off-by: Mayank Sharma <[email protected]>
da0dcc9
to
38847e8
Compare
@rayshrey regarding your comment above -
|
❕ Gradle check result for 38847e8: UNSTABLE
Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure. |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #17538 +/- ##
============================================
+ Coverage 72.27% 72.44% +0.16%
- Complexity 65611 65756 +145
============================================
Files 5311 5311
Lines 304942 305099 +157
Branches 44225 44247 +22
============================================
+ Hits 220407 221034 +627
+ Misses 66448 65959 -489
- Partials 18087 18106 +19 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
} | ||
|
||
@Override | ||
public String toString() { | ||
return snapshot().toString(); | ||
} | ||
|
||
private boolean isFullFile(V value) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are using this type checking every-time we remove, get, and put value to FileCache.
Would it make sense to have this value pre-computed for FullFIleCachedIndexInput?
Maybe introducing a method in CachedIndexInput
Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is a good point @skumawat2025 . So for every get we would be calling instanceof
. We should do a benchmark for this change by reusing FileCacheBenchmark
. If we see any perf issues, we will need to solve for it .
/** | ||
* A non thread-safe {@link StatsCounter} implementation. | ||
* | ||
* @opensearch.internal | ||
*/ | ||
public class DefaultStatsCounter<K> implements StatsCounter<K> { | ||
public class DefaultStatsCounter<K, V> implements StatsCounter<K, V> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about having a different implementation of StatsCounter in case of FullFileStats? Maybe using a Factory while initialising Cache. Currently FullFileStats won't make sense in case of SearchableSnapshots right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It can make sense if we support pinning some files there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same suggestion here as well - have a base stats counter implementation and then have DefaultStatsCounter as below
DefaultStatsCounter {
StatsCounter overallStatsCounter;
StatsCounter fullFileStatsCounter;
StatsCounter blockedFileStatsCounter;
StatsCounter pinnedFileStatsCounter;
}
* Represents Stats about FullFiles in a {@link RefCountedCache}. | ||
*/ | ||
@ExperimentalApi | ||
public class FullFileStats { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can be static class?
CacheUsage cacheUsage = theCache.usage(); | ||
logger.trace("Total Usage: " + cacheUsage.usage() + " , Active Usage: " + cacheUsage.activeUsage()); | ||
long cacheUsage = theCache.usage(); | ||
logger.trace("Total Usage: " + cacheUsage); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we log the active usage here as well since it was logged previously as well.
} | ||
|
||
@Override | ||
public String toString() { | ||
return snapshot().toString(); | ||
} | ||
|
||
private boolean isFullFile(V value) { | ||
return value instanceof CachedFullFileIndexInput; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of using this instanceof check, I think it will be better to use the Path of the the file to identify it's type(block files will have block as part of their name). This will also ensure that we don't need to change the signature of StatsCounter to StatsCounter<K,V>, thus minimising the changes.
@@ -33,6 +41,7 @@ public class FileCacheStats implements Writeable, ToXContentFragment { | |||
private final long evicted; | |||
private final long hits; | |||
private final long misses; | |||
private final FullFileCacheStats fullFileCacheStats; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This approach is not extensible, in case we need to add another BlockedFileCacheStats or PinnedFileCacheStats, we would have to create multiple duplicate classes with the same fields and functionality. Let's create a base class similar to below
public class FileCacheStatsBase {
private long used;
private long total;
private long evicted;
private long hits;
private long misses;
}
Then we can create overallCacheStats, FullFileCacheStats, BlockedFileCacheStats, PinnedFileCacheStats and have our FileCacheStats as below
public class FileCacheStats {
private FileCacheStatsBase overallCacheStats;
private FileCacheStatsBase FullFileCacheStats;
private FileCacheStatsBase BlockedFileCacheStats;
private FileCacheStatsBase PinnedFileCacheStats;
// return the overall stats for currently existing methods such as getHits/getMisses etc
public long getCacheHits/getCacheMisses/...() {
overallCacheStats.hits/misses;
}
// similarly for all other stats
public long getFullFileCacheStats() {
return FullFileCacheStats;
}
}
* Represents Stats about FullFiles in a {@link RefCountedCache}. | ||
*/ | ||
@ExperimentalApi | ||
public class FullFileStats { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar recommendation as provided for FileCacheStats, let's use inheritance to make things extensible here as well (Have a base class with all common fields and then use those classes together in a single class.
/** | ||
* A non thread-safe {@link StatsCounter} implementation. | ||
* | ||
* @opensearch.internal | ||
*/ | ||
public class DefaultStatsCounter<K> implements StatsCounter<K> { | ||
public class DefaultStatsCounter<K, V> implements StatsCounter<K, V> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same suggestion here as well - have a base stats counter implementation and then have DefaultStatsCounter as below
DefaultStatsCounter {
StatsCounter overallStatsCounter;
StatsCounter fullFileStatsCounter;
StatsCounter blockedFileStatsCounter;
StatsCounter pinnedFileStatsCounter;
}
Description
Related Issues
Resolves #17479.
Meta Issue #13149
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.