Allow dropping additional labels in distributor #9711

Open
tiithansen opened this issue Oct 22, 2024 · 2 comments

Comments

@tiithansen

tiithansen commented Oct 22, 2024

Describe the feature request

We have a tiered Prometheus setup where each tier has its own responsibility. Because of this, we track HA labels differently. We have three labels in total: cluster, which is used in queries; __prometheus_type__, which indicates the tier a Prometheus belongs to; and __replica__, which indicates the replica number within the tier. Because Mimir only drops the __replica__ label, we are left with the __prometheus_type__ label, but we would like to get rid of it.

The reason for this setup is that if one tier becomes unstable, the others are unaffected.

For example:

{cluster="prod-1", __prometheus_type__="business-shard-1", __replica__="1"}
{cluster="prod-1", __prometheus_type__="business-shard-0", __replica__="0"}
{cluster="prod-1", __prometheus_type__="system-shard-0", __replica__="1"}

Describe the solution you'd like

Allow specifying in the config which additional labels the distributor should drop from received time series.

The configured labels could easily be dropped here.
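As a rough sketch of what such a per-tenant limit could look like (the option name below is purely hypothetical and does not exist in Mimir today):

```yaml
# Hypothetical per-tenant override; "ha_tracker_extra_drop_labels" is an invented
# name used only to illustrate the requested behaviour.
overrides:
  tenant-1:
    # Additional labels the distributor would drop after HA deduplication.
    ha_tracker_extra_drop_labels:
      - __prometheus_type__
```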

Alternatives

I have tried drop_labels, but it seems to run before the HA tracker, and it breaks ingestion.

@narqo
Contributor

narqo commented Oct 23, 2024

Answering the specific request:

Mimir supports metric_relabel_configs, which the distributor applies after the HA tracker. Historically, it was implemented for cortexproject/cortex#1507, but it has remained a niche experimental feature since then. There are some details on how to use it in #1809

Note that the config flag comes with a warning:

in most situations, it is more effective to use metrics relabeling directly in the Prometheus server, e.g. remote_write.write_relabel_configs.
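For illustration, a per-tenant override using it to drop the extra label could look roughly like this (the tenant ID and overrides layout are placeholders; the rule itself is standard Prometheus relabel_config syntax):

```yaml
# Sketch: drop __prometheus_type__ with the experimental metric_relabel_configs
# limit, which the distributor applies after the HA tracker.
overrides:
  tenant-1:
    metric_relabel_configs:
      - action: labeldrop          # remove any label whose name matches the regex
        regex: __prometheus_type__
```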


We have three labels in total: cluster, which is used in queries; __prometheus_type__, which indicates the tier a Prometheus belongs to; and __replica__, which indicates the replica number within the tier.

I cannot say I fully understand this setup. Do the different "prometheus_type" Prometheuses scrape the same set of metrics or not? If yes, then wouldn't removing the __prometheus_type__ label break it, no matter whether this happens before or after the HA tracker? It seems that the distributor would end up ingesting a set of duplicate metrics within one cluster label (provided __prometheus_type__ and __replica__ were removed as per your HA tracking rule).
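To illustrate with the label sets from the issue description: once both labels are dropped, the series would only be distinguishable by whatever other labels they carry, e.g.

{cluster="prod-1", __prometheus_type__="business-shard-1", __replica__="1"} -> {cluster="prod-1", ...}
{cluster="prod-1", __prometheus_type__="system-shard-0", __replica__="1"} -> {cluster="prod-1", ...}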

@tiithansen
Author

tiithansen commented Oct 23, 2024

One thing I forgot to mention is that the cluster label in the HA tracker is configured to be __prometheus_type__, but we also add the regular cluster label when we remote write to Mimir.

Prometheuses with a different __prometheus_type__ scrape different metrics from different services. For example, __prometheus_type__="system" scrapes only metrics from Kubernetes components, node exporters, and so on, while __prometheus_type__="business" scrapes metrics only from applications developed by our developers.

This way, if some business app explodes with cardinality, we will still receive all system metrics as well as metrics from the other shards of the business Prometheuses.
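For reference, the Mimir side of our setup looks roughly like the following per-tenant overrides (tenant ID is a placeholder):

```yaml
# Sketch of the HA tracker limits as described above.
overrides:
  tenant-1:
    accept_ha_samples: true
    ha_cluster_label: __prometheus_type__   # used as the HA "cluster" dimension
    ha_replica_label: __replica__           # used as the HA "replica" dimension
# The regular cluster="prod-1" label is added on the Prometheus side when remote
# writing and is the one we use in queries.
```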
