Skip to content

Conversation

@stevanbz
Copy link

@stevanbz stevanbz commented May 23, 2023

Is your feature request related to a problem? Please provide an existing Issue # , or describe.
Added cpu/disk related metrics from performance-analyzer-rca project needed for distributed performance tracing.
Client (user of the commons lib) in order to be able to use the metrics, will have to add the permissions (described in src/main/resources/plugin-security.policy).
Copy-pasted code base for getting the native thread id, code for getting the diskIO, cpu and sched metrics.

Publishing to local maven is useful once you want to test the usage of the lib:
.

./gradlew clean build publishToMavenLocal

and then usage the lib as a dependency (if there is a need - exclude the clashing the lib's or maybe we should align the dependencies in this project according to the versions defined in Open search - check it out here):

api ("org.opensearch:performance-analyzer-commons:3.0.0.0-SNAPSHOT") {
    exclude (group: 'com.fasterxml.jackson.core', module: 'jackson-core')
    exclude (group: 'com.fasterxml.jackson.core', module: 'jackson-annotations')
    exclude (group: 'com.fasterxml.jackson', module: 'jackson-bom')
    exclude (group: 'com.fasterxml.jackson.core', module: 'jackson-databind')
  }

Edited a gradle task for publishing - since we had 2 tasks - one was publishing the jar to local mvn while the second one published only javadoc and sources jar.

Describe the solution you are proposing
A clear and concise description of what you want to happen.

Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.

Additional context
Add any other context or screenshots about the feature request here.

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@Tjofil
Copy link
Contributor

Tjofil commented May 24, 2023

Looks good to me.

@stevanbz stevanbz force-pushed the feature/disk-stats-metrics branch from b2c5a9a to 96affd6 Compare May 24, 2023 19:56
jTidNameMap.put(id, name);
}

static void runThreadDump(String pid, String[] args) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let us add a method javadoc highlighting: A thread dump is a snapshot of the state of all the threads of a Java process.

  1. We use HotSpotVirtualMachine to capture thread dump. VirtualMachine is type cast to HotSpotVirtualMachine to access the method at runtime.
  2. Use this method ONLY when NativeThreadId is required, for all other purpose us ThreadMxBeans API’s.
  3. Running Thread Dump is expensive and should be only judiciously done.

@@ -0,0 +1,26 @@
/*
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let us rename this file, plugin-security isn't relevant for this library.

Comment on lines 10 to 11
// This method will be called before all following get methods
// to make sure that all information exists for a thread id
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Block comments are preferred.

}
}

public synchronized void addSample() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

JavaDoc for this new method. Additionally, we need 2 service timers - to measure the time taken for parsing, calculateCPUDetails and calculatePagingActivity

kvTimestamp = System.currentTimeMillis();
for (String tid : tids) {
Map<String, Object> sample =
// (new SchemaFileParser("/proc/"+tid+"/stat",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove the comment ?

Comment on lines 167 to 188
public synchronized void addSample(String threadId) {
tids = OSGlobals.getTids();

oldtidKVMap.clear();
oldtidKVMap.putAll(tidKVMap);

tidKVMap.clear();
oldkvTimestamp = kvTimestamp;
kvTimestamp = System.currentTimeMillis();
Map<String, Object> sample =
// (new SchemaFileParser("/proc/"+tid+"/stat",
(new SchemaFileParser(
"/proc/" + pid + "/task/" + threadId + "/stat",
statKeys,
statTypes,
true))
.parse();
tidKVMap.put(threadId, sample);

calculateCPUDetails();
calculatePagingActivity();
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need repeated logic here? Update public synchronized void addSample() { for parameters with default set as All thread.

Comment on lines 77 to 117
public synchronized void addSample() {
tids = OSGlobals.getTids();

oldtidKVMap.clear();
oldtidKVMap.putAll(tidKVMap);

tidKVMap.clear();
oldkvTimestamp = kvTimestamp;
kvTimestamp = System.currentTimeMillis();
for (String tid : tids) {
Map<String, Object> sample =
(new SchemaFileParser(
"/proc/" + pid + "/task/" + tid + "/schedstat",
schedKeys,
schedTypes))
.parse();
tidKVMap.put(tid, sample);
}

calculateSchedLatency();
}

public synchronized void addSample(String threadId) {
tids = OSGlobals.getTids();

oldtidKVMap.clear();
oldtidKVMap.putAll(tidKVMap);

tidKVMap.clear();
oldkvTimestamp = kvTimestamp;
kvTimestamp = System.currentTimeMillis();
Map<String, Object> sample =
(new SchemaFileParser(
"/proc/" + pid + "/task/" + threadId + "/schedstat",
schedKeys,
schedTypes))
.parse();
tidKVMap.put(threadId, sample);

calculateSchedLatency();
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let us not repeat code, can we refactor this to:

private static final ALL_THREADS = "ALL_THREADS";

public synchronized void addSampleForThread(String threadId) {
        addSample(threadId);
}

public synchronized void addSample(String threadId = ALL_THREADS) {
        ...
}

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm I am afraid we can't provide default value for the method parameter in Java (like we can do in Kotlin for example).

Comment on lines 114 to 139
public static synchronized void addSample() {
tids = OSGlobals.getTids();
oldtidKVMap.clear();
oldtidKVMap.putAll(tidKVMap);

tidKVMap.clear();
oldkvTimestamp = kvTimestamp;
kvTimestamp = System.currentTimeMillis();
for (String tid : tids) {
addSampleTid(tid);
}
}

public static synchronized void addSample(String threadId) {
tids = OSGlobals.getTids();
oldtidKVMap.clear();
oldtidKVMap.putAll(tidKVMap);

tidKVMap.clear();
oldkvTimestamp = kvTimestamp;
kvTimestamp = System.currentTimeMillis();
addSampleTid(threadId);
for (String tid : tids) {
addSampleTid(tid);
}
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as other place, instead of replicating code, let us refactor this to use method overloading.

Also, move the Map clear and load part, timestamp update to the private static void addSampleTid(String tid)

Copy link
Author

@stevanbz stevanbz May 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah - I just copy pasted the methods without too much thinking since my focus was mostly on the other side of the app - to make it workable and to be able to log the metrics for the given threadId. In this case, instead of adding samples for all the threads this method should add the sample for the given thread id:

    public static synchronized void addSample(String threadId) {
        tids = OSGlobals.getTids();
        oldtidKVMap.clear();
        oldtidKVMap.putAll(tidKVMap);

        tidKVMap.clear();
        oldkvTimestamp = kvTimestamp;
        kvTimestamp = System.currentTimeMillis();
        addSampleTid(threadId);
    }

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this method was just copy paste it should be something like:

    public static synchronized void addSample(String threadId) {
        tids = OSGlobals.getTids();
        oldtidKVMap.clear();
        oldtidKVMap.putAll(tidKVMap);

        tidKVMap.clear();
        oldkvTimestamp = kvTimestamp;
        kvTimestamp = System.currentTimeMillis();
        addSampleTid(threadId);
    }

Comment on lines +17 to +18
private Map<String, Double> cpu;
private Map<String, Double[]> pagingActivities;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this made part of the interface CPUPagingActivityGenerator ? A more appropriate name for CPUPagingActivityGenerator is CPUPagingActivityStore.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re the first question I didn't get that. If you can give me some more context it would be good. Tnx.

Re the second - I was thinking, maybe you are right. But the thing is that metrics generators are part of the the metrics_generator package and also, beside this, we have DiskIOMetricsGenerator, SchedMetricsGenerator etc. so if we decide to rename this class I have a feeling that we should be consistent for all other classes meaning - we should rename all of the ***Generators (since they are doing similar things - gathering the metrics and storing them in some data structures), or just keep them as they are.

What do you think? Tnx in advance!

@khushbr
Copy link
Collaborator

khushbr commented May 24, 2023

Couple of high level comments:

  1. Naming Convention suggestion: Instead of using *ActivityGenerator, we should rename the interfaces to more apt *MetricStore after moving the HashMap here instead of the class.
  2. Service stats are missing here, I am working on a separate PR and will add them.
  3. There are no Unit Tests in this PR. I expect a follow-up PR to add these tests.

Comment on lines 170 to 184
oldtidKVMap.clear();
oldtidKVMap.putAll(tidKVMap);

tidKVMap.clear();
oldkvTimestamp = kvTimestamp;
kvTimestamp = System.currentTimeMillis();
Map<String, Object> sample =
// (new SchemaFileParser("/proc/"+tid+"/stat",
(new SchemaFileParser(
"/proc/" + pid + "/task/" + threadId + "/stat",
statKeys,
statTypes,
true))
.parse();
tidKVMap.put(threadId, sample);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let us not clear the whole map when adding sample single threadID

stevanbz added 3 commits May 25, 2023 22:34
Signed-off-by: Stevan Buzejic <[email protected]>
Signed-off-by: Stevan Buzejic <[email protected]>
@stevanbz
Copy link
Author

stevanbz commented Jun 1, 2023

Closing the PR in favor of
#31

We decided not to go with adding the sample per one thread. Instead, we decided to have re-usable units of code that will be responsible for getting the metrics for the threads

@stevanbz stevanbz closed this Jun 1, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants