-
Notifications
You must be signed in to change notification settings - Fork 1.1k
Time-windowed max gauge for OTLP histograms #6159
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Create gauge for Timer and DistributionSummary Signed-off-by: Ben Efrati <[email protected]>
Signed-off-by: Ben Efrati <[email protected]>
75c54f7 to
ff5e6cf
Compare
Signed-off-by: Ben Efrati <[email protected]>
|
@shakuzen , Can you pleases review? |
shakuzen
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the pull request, and sorry for the delay in reviewing. Currently, this PR is adding an OTLP Gauge for the Micrometer TimeWindowMax unconditionally for Meters that support Histograms (Timer/DistributionSummary). Based on the reasoning in my previous comment about this #3144 (comment), I wonder if it should be conditional on cumulative aggregation temporality. For delta temporality, we already have max set on the histogram.
Would users like the Micrometer TimeWindowMax separately even with delta temporality? I'd like to hear from more Micrometer OTLP users to understand what people would find helpful. @lenin-jaganathan any thoughts?
|
Ok, let me know if you want me to add a condition to prevent publishing the separate max Gauge when delta temporality is active. |
|
@BenEfrati as a user of OtlpMeterRegistry, what do you think makes the most sense? Are you using cumulative temporality and that's why you'd like this change? I do think it's a little unfortunate that the way to query the max on your metrics will change depending on whether you're using delta or cumulative aggregation, but it also makes sense because the max has different semantics in the different places. I am interested in hearing the perspective of users on this. |
|
@shakuzen Yes, that's correct: I am currently using cumulative temporality because our target backend is Prometheus. I have not personally used the deltatocumulativeprocessor when pushing to Prometheus, so I cannot comment on its specific behavior. Maybe this is useful for other backends that supports delta temporality, not sure if there is an option to query the max value. We can introduce a configuration option. This would allow users to explicitly choose whether to create the max Gauge, decoupling that choice from the |
That's an interesting idea. It would eliminate the issue or doing or not doing something people do or don't expect, but it adds some complexity. We try to avoid adding configuration options if we can avoid it, but this may be a case where it is worth it. Something to potentially consider is using a |
|
I have a hard time to get to a conclusion about what would be the best (since cumulative max is not very useful in OTLP).
We should also write docs to explain the OTLP behavior, the problems with its cumulative max and the default behavior above. |
| double max = isTimeBased ? histogramSnapshot.max(baseTimeUnit) : histogramSnapshot.max(); | ||
| long count = histogramSnapshot.count(); | ||
|
|
||
| addMaxGaugeForTimer(id, tags, max); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
addMaxGaugeForHistogramSupport?
Since this is for both Timer and DistributionSummary.
| } | ||
|
|
||
| metricBuilder.getGaugeBuilder() | ||
| .addDataPoints(NumberDataPoint.newBuilder() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems addDataPoints has an overload for NumberDataPointBuilder and NumberDataPoint too?
It seems writeGauge (which is quite similar to this) uses NumberDataPoint (calls .build()) while this one does not. Should not be the two consistent?
| // see: https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/20519 | ||
| matchesPattern("(?s)^.*test_timer_milliseconds_sum\\{.+} 123\\.0\\n.*$"), | ||
| matchesPattern("(?s)^.*test_timer_milliseconds_bucket\\{.+,le=\"\\+Inf\"} 1\\n.*$"), | ||
| matchesPattern("(?s)^.*test_timer_max_milliseconds\\{.+} 123\\.0\n.*$"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is a separate gauge, there should be two more lines, something like this should cover them:
containsString("# HELP test_timer_max_milliseconds \n"),
containsString("# TYPE test_timer_max_milliseconds gauge\n"),
I think we can just link to the OTLP docs on it. See #3144 (comment) where I link to it and quote the relevant section. |
|
The real bikeshed issue is going to be what to name the configuration method on |
Create gauge for Timer and DistributionSummary
See #3144