Performance Degradation Due to Unnecessary Tag Sorting in DefaultMeterObservationHandler #6031

Open
HYEONSEOK1 opened this issue Mar 17, 2025 · 2 comments · May be fixed by #6035
Labels
performance Issues related to general performance

Comments

@HYEONSEOK1
Contributor

Describe the bug
The DefaultMeterObservationHandler performs unnecessary tag sorting for every request, leading to potential performance overhead. This occurs during the creation of Tags objects in both the onStart and onStop methods, even when the underlying key-value pairs are the same. This results in repeated sorting operations, which are relatively expensive.

The issue lies in the way Tags objects are created. Even though the same set of low-cardinality key-value pairs is used repeatedly, a new List of KeyValue objects is created for each invocation in both onStart and onStop. This new List is then used to create a new Tags object, which triggers the sorting process.

While sorting is necessary for the internal representation of Tags, it shouldn't be performed on every request when the key-value pairs haven't changed.
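
For illustration, the per-call pattern looks roughly like this (a simplified sketch of the behavior described above, not the actual DefaultMeterObservationHandler source):

// Each invocation rebuilds the tag list and re-sorts it inside Tags.of,
// even when the low-cardinality key-values are identical to the previous call.
private Tags tagsFor(Observation.Context context) {
    List<Tag> tags = new ArrayList<>();
    for (KeyValue keyValue : context.getLowCardinalityKeyValues()) {
        tags.add(Tag.of(keyValue.getKey(), keyValue.getValue()));
    }
    return Tags.of(tags); // sorting happens here, on every onStart and onStop
}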

Proposed Solution:

Cache Tags objects: implement a caching mechanism that stores and reuses Tags objects keyed by the set of low-cardinality key-value pairs, for example:

private final ConcurrentMap<KeyValues, Tags> tagsCache = new ConcurrentHashMap<>();

private Tags getCachedTags(Observation.Context context) {
    // getLowCardinalityKeyValues() returns KeyValues (not a List<KeyValue>)
    KeyValues keyValues = context.getLowCardinalityKeyValues();
    // convert and sort only once per distinct key-value combination
    return tagsCache.computeIfAbsent(keyValues, this::createTags);
}

// converts the cached key-values to Micrometer tags; Tags.of(...) is where the sorting happens
private Tags createTags(KeyValues keyValues) {
    List<Tag> tags = new ArrayList<>();
    for (KeyValue keyValue : keyValues) {
        tags.add(Tag.of(keyValue.getKey(), keyValue.getValue()));
    }
    return Tags.of(tags);
}
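
Note that this sketch assumes the cache key provides value-based equals/hashCode. The map also grows with the number of distinct low-cardinality key-value combinations, so a bounded cache (or verifying that cardinality really stays low) may be worth considering.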

Environment

  • Micrometer version [e.g. 1.14.0]
  • Micrometer registry [e.g. prometheus]
  • OS: [e.g. macOS]
  • Java version: [e.g. output of java -version]

To Reproduce
How to reproduce the bug:

  1. Set up a Spring Boot application (or any application using Micrometer) that utilizes the DefaultMeterObservationHandler for collecting metrics.
  2. Create an endpoint or service that generates consistent low-cardinality key-value pairs in the Observation.Context.
  3. Instrument this endpoint or service with Micrometer's Observation API.
  4. Make repeated calls to the endpoint or service, ensuring the same set of key-value pairs is generated with each call (see the sketch below).
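
For example, a minimal standalone reproduction along these lines (an illustrative sketch; the registry setup and the names used are assumptions, not taken from the report above) could look like:

// Repeatedly observe work with an identical set of low-cardinality key-values,
// so DefaultMeterObservationHandler rebuilds and re-sorts the same tags on every call.
MeterRegistry meterRegistry = new SimpleMeterRegistry();
ObservationRegistry observationRegistry = ObservationRegistry.create();
observationRegistry.observationConfig()
    .observationHandler(new DefaultMeterObservationHandler(meterRegistry));

for (int i = 0; i < 1_000_000; i++) {
    Observation.createNotStarted("sample.operation", observationRegistry)
        .lowCardinalityKeyValue("method", "GET")
        .lowCardinalityKeyValue("uri", "/orders")
        .observe(() -> {
            // simulated request handling
        });
}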

Expected behavior

The DefaultMeterObservationHandler should cache and reuse Tags objects for identical sets of low-cardinality key-value pairs.
Tag sorting should only occur once for each unique set of key-value pairs.
Subsequent requests with the same key-value pairs should retrieve the cached Tags object, avoiding redundant sorting.
Performance should improve, especially under high load with repeated calls generating the same tags.


@shakuzen
Member

Thanks for the detailed issue. We can certainly consider something like you proposed, but I'm curious, did you come across this after having a performance issue, or is this coming from being proactive in optimizing this part of the code? I ask because I believe it's the first report on this since the code was introduced.

@HYEONSEOK1
Contributor Author

@shakuzen
Thank you for your response. Yes, this issue came to light after experiencing performance concerns under increased traffic. We conducted JFR profiling and observed that the tag sorting process was unexpectedly consuming a significant portion of CPU time, as evidenced in the flame graphs.

@shakuzen added the performance (Issues related to general performance) label and removed the waiting-for-triage label on Mar 17, 2025