Additional Thread Metrics #13483
Comments
Could you clarify whether the metrics |
Yeah, that is what I meant, but I now see that there's a count metric emitted per state. In that case, just CPU and user time seem to be a gap.
hi @akats7! what attributes would you propose on |
Hey @trask, I'd have to dig a bit into the internals of the runtime metric modules, but one approach could be to just support this for JMX.
@trask, can we add CPU time to runtime metrics through the thread MBean using ManagementFactory? Then we rely on MBean operations to get the time values. I get that there may be cardinality concerns, since thread name/pool name would have to be an attribute, so it can be disabled by default.
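A minimal sketch of the approach described above, reading CPU and user time for a single thread through the platform ThreadMXBean. The class name `ThreadCpuTimeProbe` and its `read` method are hypothetical names for illustration only; they are not part of the agent or of the runtime metrics modules.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

// Hypothetical helper for illustration; not part of the OpenTelemetry Java agent.
final class ThreadCpuTimeProbe {

  /**
   * Returns {cpuNanos, userNanos} for the given thread id.
   * Both entries are -1 when CPU time measurement is unsupported or disabled,
   * or when the thread is no longer alive.
   */
  static long[] read(long threadId) {
    ThreadMXBean threads = ManagementFactory.getThreadMXBean();
    // isThreadCpuTimeEnabled() throws if measurement is unsupported,
    // so check support first (short-circuit evaluation guards the second call).
    if (!threads.isThreadCpuTimeSupported() || !threads.isThreadCpuTimeEnabled()) {
      return new long[] {-1L, -1L};
    }
    return new long[] {
      threads.getThreadCpuTime(threadId),  // total CPU time (user + system), in nanoseconds
      threads.getThreadUserTime(threadId)  // user-mode CPU time, in nanoseconds
    };
  }

  public static void main(String[] args) {
    long[] t = read(Thread.currentThread().getId());
    System.out.println("cpu=" + t[0] + "ns user=" + t[1] + "ns");
  }
}
```

Both values come from MBean operations rather than MBean attributes, which is why an attribute-only scraper would presumably not be able to read them.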
@SylvainJuge @robsunday I'm hesitant for people to add new JMX metrics in the middle of your convergence effort, so I would like to defer to you here.
Thanks @trask! I do want to point out that these are rather important metrics. We've had a lot of internal requests for this from users who are migrating from vendor products that supported this out of the box.
To expand a bit on the "convergence effort" context: we are currently trying to add JVM metrics in a YAML descriptor with #13392. This YAML will NOT be used directly by instrumentation, but will in the future be used by jmx-scraper, a CLI program that replaces JMX Gatherer while using the same JMX implementation as instrumentation (and thus inheriting its YAML support). What we are currently focusing on for JVM metrics in YAML is the ability to capture them in a way that is compliant with semantic conventions, which is already done by the The I think we can add new metrics even if the current work is still in progress. I would suggest doing that in a few steps:
As a temporary workaround, if you are able to capture those with YAML configuration, you should be able to provide a YAML file for them. However, this is not a great OOTB experience and could easily break if the metric definition changes when it is added to semconv.
Hey @SylvainJuge, thanks for the context. Part of the issue is that I believe the jmx-scraper is only able to scrape attributes and cannot execute operations, which would be required for these metrics. Regarding the experimentation, I've already done this with the JMX Gatherer, since it allows you to interact directly with the MBeans when using a custom script. However, since the gatherer instruments also only allow the use of attributes, I had to rely on transformation closures to overwrite other MBeans, which is not ideal. Also, if possible, part of this ask is to be able to move away from the remote approach. I might be missing something, but is there a reason it's preferable to interact with a JMX server rather than just scraping it directly, since the javaagent runs in the same JVM?
Ideally, we should not force users to deploy an instrumentation agent to capture runtime metrics if those could be obtained externally with JMX scraper or gatherer. However, we already have some metrics that can't be captured without instrumentation and explicit code, as they can only be captured from within the JVM, either because they require advanced JMX features or because they rely on JFR events. So this is something we can do already, but it adds more constraints for users. For example, the JVM metrics are not exactly the same on Java 17 and Java 8, which could lead to user confusion or missed expectations. If I understand correctly, those metrics would fall more into the "runtime-telemetry only" category and would very likely not be supported through YAML because they need some post-processing. Is that correct? Also, could you elaborate a bit on their definitions/attributes, and on which MBean attributes they would be captured from?
Yep, that's exactly right. For example, to get cpu_time we'd likely need to get the AllThreadIds attribute and then call getThreadCpuTime, plus getThreadInfo for attributes such as name. And I understand that this utility should still exist for users who want these metrics but don't need the other functionality of the agent. But the situation we find ourselves in is that the majority of our teams that already leverage the agent for instrumentation also need these metrics, so it would be ideal not to have to configure a JMX server and an additional scraping process when the agent is already in place.
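The steps described above can be sketched roughly as follows: list all live thread ids, call getThreadCpuTime and getThreadInfo for each, and sum CPU time under a normalized thread name to keep cardinality down. The class `ThreadCpuTimeByPool`, both method names, and the name-normalization rule (dropping a trailing numeric suffix) are assumptions made for illustration; nothing in this thread specifies how names would be grouped.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration; not part of the OpenTelemetry Java agent.
final class ThreadCpuTimeByPool {

  // Assumed normalization: drop a trailing numeric suffix so that
  // "pool-1-thread-7" and "pool-1-thread-8" share the key "pool-1-thread".
  static String poolName(String threadName) {
    return threadName.replaceFirst("[-_ #]?\\d+$", "");
  }

  /** Sums CPU time (nanoseconds) of live threads, keyed by normalized thread name. */
  static Map<String, Long> cpuNanosByPool() {
    ThreadMXBean threads = ManagementFactory.getThreadMXBean();
    Map<String, Long> totals = new TreeMap<>();
    if (!threads.isThreadCpuTimeSupported() || !threads.isThreadCpuTimeEnabled()) {
      return totals; // nothing can be measured on this JVM
    }
    for (long id : threads.getAllThreadIds()) {
      ThreadInfo info = threads.getThreadInfo(id);  // null if the thread has terminated
      long cpuNanos = threads.getThreadCpuTime(id); // -1 if the thread is not alive
      if (info == null || cpuNanos < 0) {
        continue; // thread exited between listing and lookup
      }
      totals.merge(poolName(info.getThreadName()), cpuNanos, Long::sum);
    }
    return totals;
  }

  public static void main(String[] args) {
    cpuNanosByPool().forEach((pool, nanos) -> System.out.println(pool + " " + nanos + "ns"));
  }
}
```

One caveat with this sketch: only live threads are counted, so a per-name total can drop when a thread terminates. An actual metric would have to account for that before being reported as a monotonic counter.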
I agree with you @akats7. This is probably a use case that we could either document or cover with a dedicated config option: only runtime metrics (or JMX metrics) need to be captured and sent to OTLP, without any instrumentation or tracing involved. For JMX metrics that are defined in YAML, this could help provide details on JVM runtime metrics while still allowing capture of the metrics defined in YAML. For example, if you run a Kafka broker or cluster, it would be relevant to capture both by adding the agent to the JVM.
So just to clarify, is there a path forward to add these as experimental out-of-the-box JVM metrics? I'd be happy to contribute this.
If those new metrics are only captured through code, their implementation is part of In order to add or change things in semconv, we need at least an experimental implementation to validate that what is being added to semconv is correct and technically achievable. That creates a kind of chicken-and-egg problem, and you have to work on both sides at the same time. I would suggest doing the following:
@SylvainJuge That sounds like a plan to me, thanks!
Is your feature request related to a problem? Please describe.
The current scope of thread metrics appears to be limited to thread count. There are other thread-based metrics that are rather critical, such as thread CPU time and metrics based on thread state.
Describe the solution you'd like
Add additional thread metrics for:
jvm.thread.cpu_time
jvm.thread.user_time
Describe alternatives you've considered
Using the JMX Gatherer
Additional context
No response