Tag FoundationDB process metrics with process class, assigned roles #19682
What does this PR do?
This pull request adds additional tags to existing FoundationDB process metrics to help operators identify which roles the associated processes are performing.
Motivation
FoundationDB processes can take on one or more roles within a cluster. When scaling a cluster, operators generally add new processes for overloaded roles. The heuristics for "overloaded" vary by process, but basic indicators like CPU utilization often give helpful first-order clues for where to begin.
Before this change, each process reported metrics (like CPU utilization) tagged with the process ID. We can use the process ID in conjunction with additional external tools to identify the roles a process is performing, but it is much more convenient for the process metrics to be tagged by role directly. After this change, process metrics are tagged with all of the roles performed by the associated process.
This change should not affect any existing queries, but will allow operators to write new queries to show things like "average CPU load for GRV proxies" or "network throughput for log processes."
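As a rough illustration of the idea (not the code in this PR), the sketch below derives per-process tags from the `cluster.processes` section of FoundationDB's `status json` output, where each process entry carries a `class_type` and a list of `roles`. The tag names `fdb_process`, `fdb_process_class`, and `fdb_role`, and both function names, are assumptions made for this example. The second function mimics locally the kind of "average CPU by role" query that role tags make possible.

```python
def build_process_tags(process_id, process):
    """Build tags for one entry of cluster.processes from `status json`.

    Tag names here are hypothetical; the actual check may use different ones.
    """
    tags = ["fdb_process:" + process_id]

    class_type = process.get("class_type")
    if class_type:
        tags.append("fdb_process_class:" + class_type)

    # A process can hold several roles at once: emit one tag per distinct role.
    roles = sorted({r["role"] for r in process.get("roles", []) if "role" in r})
    tags.extend("fdb_role:" + role for role in roles)
    return tags


def average_cpu_by_role(processes):
    """Average cpu.usage_cores across all processes holding each role."""
    samples = {}
    for process in processes.values():
        usage = process.get("cpu", {}).get("usage_cores")
        if usage is None:
            continue
        for role in {r["role"] for r in process.get("roles", []) if "role" in r}:
            samples.setdefault(role, []).append(usage)
    return {role: sum(vals) / len(vals) for role, vals in samples.items()}
```

Because a process that holds two roles contributes to both averages, a busy multi-role process raises the reading for every role it performs. That matches the "first-order clue" use described above, but it is not a precise per-role attribution.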
Review checklist (to be filled by reviewers)
Add the qa/skip-qa label if the PR doesn't need to be tested during QA.
Add the backport/<branch-name> label to the PR and it will automatically open a backport PR once this one is merged.