|
| 1 | +--- |
| 2 | +title: UTF-8 in Prometheus |
| 3 | +--- |
| 4 | + |
| 5 | +# Introduction |
| 6 | + |
| 7 | +Versions of Prometheus before 3.0 required that metric and label names adhere to |
| 8 | +a strict set of character requirements. With Prometheus 3.0, all UTF-8 strings |
| 9 | +are valid names, but other parts of the ecosystem need manual changes before they can introduce names that use arbitrary UTF-8 characters. |
| 10 | + |
| 11 | +There may also be circumstances where users want to enforce the legacy character |
| 12 | +set, perhaps for compatibility with an older system or one that does not yet |
| 13 | +support UTF-8. |
| 14 | + |
| 15 | +This document guides you through the UTF-8 transition details. |
| 16 | + |
| 17 | +# Go Instrumentation |
| 18 | + |
| 19 | +Currently, metrics created by the official Prometheus [client_golang library](https://github.com/prometheus/client_golang) will reject UTF-8 names |
| 20 | +by default. It is necessary to change the default validation scheme to allow |
| 21 | +UTF-8. The requirement to set this value will be removed in a future version of |
| 22 | +the common library. |
| 23 | + |
| 24 | +```golang |
| 25 | +import "github.com/prometheus/common/model" |
| 26 | + |
| 27 | +func init() { |
| 28 | +	model.NameValidationScheme = model.UTF8Validation |
| 29 | +} |
| 30 | +``` |
| 31 | + |
| 32 | +If users want to enforce the legacy character set, they can set the validation |
| 33 | +scheme to `LegacyValidation`. |
| 34 | + |
| 35 | +The validation scheme must be set before any metrics are instantiated, and it |
| 36 | +can be changed on the fly if desired. |
| 37 | + |
| 38 | +## Instrumenting in other languages |
| 39 | + |
| 40 | +Other client libraries may have similar requirements to set the validation |
| 41 | +scheme. Check the documentation for the library you are using. |
| 42 | + |
| 43 | +# Configuring Name Validation during Scraping |
| 44 | + |
| 45 | +By default, Prometheus 3.0 accepts all UTF-8 strings as valid metric and label |
| 46 | +names. It is possible to override this behavior for scraped targets and reject |
| 47 | +names that do not conform to the legacy character set. |
| 48 | + |
| 49 | +This option can be set in the Prometheus YAML file on a global basis: |
| 50 | + |
| 51 | +```yaml |
| 52 | +global: |
| 53 | +  metric_name_validation_scheme: legacy |
| 54 | +``` |
| 55 | + |
| 56 | +or on a per-scrape config basis: |
| 57 | + |
| 58 | +```yaml |
| 59 | +scrape_configs: |
| 60 | +  - job_name: prometheus |
| 61 | +    metric_name_validation_scheme: legacy |
| 62 | +``` |
| 63 | + |
| 64 | +Scrape config settings override the global setting. |
| 65 | + |
| 66 | +## Scrape Content Negotiation for UTF-8 escaping |
| 67 | + |
| 68 | +At scrape time, the scraping system **must** pass `escaping=allow-utf-8` in the |
| 69 | +Accept header in order to be served UTF-8 names. If a system being scraped does |
| 70 | +not see this header, it will automatically convert UTF-8 names to |
| 71 | +legacy-compatible names using underscore replacement. |
| 72 | + |
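As a rough illustration of the negotiation described above, the sketch below builds a scrape request that carries the `escaping=allow-utf-8` term in its Accept header. The helper name `newScrapeRequest`, the target URL, and the `text/plain;version=1.0.0` media type are assumptions made for this example; only the `escaping` term is taken from the text.

```go
package main

import (
	"fmt"
	"net/http"
)

// newScrapeRequest is a hypothetical helper that builds a GET request asking
// the target to serve UTF-8 names without escaping. The media type and
// version below are assumptions for illustration only.
func newScrapeRequest(url string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	// The escaping term travels as a parameter inside the Accept header.
	req.Header.Set("Accept", "text/plain;version=1.0.0;escaping=allow-utf-8")
	return req, nil
}

func main() {
	req, err := newScrapeRequest("http://localhost:9090/metrics")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Header.Get("Accept"))
}
```

A target that does not see this term would, per the paragraph above, fall back to underscore replacement.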
| 73 | +Scraping systems can also request a specific escaping method if desired by |
| 74 | +setting the `escaping` term in the Accept header to a different value: |
| 75 | + |
| 76 | +* `underscores`: The default. Converts legacy-invalid characters to underscores. |
| 77 | +* `dots`: Similar to `underscores`, except that dots are converted to |
| 78 | +  `_dot_` and pre-existing underscores are converted to `__`. This allows for |
| 79 | +  round-tripping of simple metric names that also contain dots. |
| 80 | +* `values`: Prepends the name with `U__` and replaces all invalid |
| 81 | +  characters with their Unicode value, surrounded by underscores. Single |
| 82 | +  underscores are replaced with double underscores. This mode allows for full |
| 83 | +  round-tripping of UTF-8 names with a legacy system. |
| 84 | + |
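To make the `underscores` and `values` modes above more concrete, here is a minimal standard-library sketch written from the descriptions in the list. It is not the implementation used by the Prometheus libraries: the function names are hypothetical, the Unicode value is assumed to be written in hexadecimal, and edge cases (for example, names that are already legacy-valid) may be handled differently in practice.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// isLegacyRune reports whether r is allowed at byte offset i of a legacy
// metric name, i.e. a name matching [a-zA-Z_:][a-zA-Z0-9_:]*.
func isLegacyRune(r rune, i int) bool {
	return (r >= 'a' && r <= 'z') || (r >= 'A' && r <= 'Z') ||
		r == '_' || r == ':' || (r >= '0' && r <= '9' && i > 0)
}

// escapeUnderscores sketches the `underscores` mode: every legacy-invalid
// character becomes a single underscore.
func escapeUnderscores(name string) string {
	var b strings.Builder
	for i, r := range name {
		if isLegacyRune(r, i) {
			b.WriteRune(r)
		} else {
			b.WriteByte('_')
		}
	}
	return b.String()
}

// escapeValues sketches the `values` mode: prefix with U__, double existing
// underscores, and replace invalid characters with _<value>_.
// Assumption: the value is the code point in lowercase hexadecimal.
func escapeValues(name string) string {
	var b strings.Builder
	b.WriteString("U__")
	for i, r := range name {
		switch {
		case r == '_':
			b.WriteString("__")
		case isLegacyRune(r, i):
			b.WriteRune(r)
		default:
			b.WriteByte('_')
			b.WriteString(strconv.FormatInt(int64(r), 16))
			b.WriteByte('_')
		}
	}
	return b.String()
}

func main() {
	fmt.Println(escapeUnderscores("my.metric")) // my_metric
	fmt.Println(escapeValues("my.metric"))      // U__my_2e_metric
}
```

The `underscores` result cannot be mapped back to the original name, whereas the `values` result keeps enough information to do so, which is why the list describes only the latter as allowing full round-tripping.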
| 85 | +## Remote Write 2.0 |
| 86 | + |
| 87 | +Remote Write 2.0 automatically accepts all UTF-8 names in Prometheus 3.0. There |
| 88 | +is no way to enforce the legacy character set validation with Remote Write 2.0. |
| 89 | + |
| 90 | +# OTLP Metrics |
| 91 | + |
| 92 | +The OTLP receiver in Prometheus 3.0 still normalizes all names to the Prometheus format by default. You can change this in the `otlp` section of the Prometheus configuration as follows: |
| 93 | + |
| 94 | + |
| 95 | +    otlp: |
| 96 | +      # Ingest OTLP data keeping UTF-8 characters in metric/label names. |
| 97 | +      translation_strategy: NoUTF8EscapingWithSuffixes |
| 98 | + |
| 99 | + |
| 100 | +See the [OpenTelemetry guide](./opentelemetry) for more details. |
| 101 | + |
| 102 | + |
| 103 | +# Querying |
| 104 | + |
| 105 | + |
| 106 | +Querying for metrics with UTF-8 names will require a slightly different syntax |
| 107 | +in PromQL. |
| 108 | + |
| 109 | +The classic query syntax will still work for legacy-compatible names: |
| 110 | + |
| 111 | +`my_metric{}` |
| 112 | + |
| 113 | +But UTF-8 names must be quoted **and** moved into the braces: |
| 114 | + |
| 115 | +`{"my.metric"}` |
| 116 | + |
| 117 | +Label names must also be quoted if they contain legacy-incompatible characters: |
| 118 | + |
| 119 | +`{"metric.name", "my.label.name"="bar"}` |
| 120 | + |
| 121 | +The metric name can appear anywhere inside the braces, but style prefers that it |
| 122 | +be the first term. |
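
The quoting rules above can be sketched as a small helper that chooses between the classic and the quoted form. This is an illustrative example only: the `selector` function and its single-matcher shape are hypothetical, and it assumes that Go-style string quoting via `strconv.Quote` is acceptable for PromQL string literals.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// Legacy character sets for metric and label names.
var (
	legacyMetric = regexp.MustCompile(`^[a-zA-Z_:][a-zA-Z0-9_:]*$`)
	legacyLabel  = regexp.MustCompile(`^[a-zA-Z_][a-zA-Z0-9_]*$`)
)

// selector builds a selector with one equality matcher, quoting the metric
// and label names only when they fall outside the legacy character set.
func selector(metric, label, value string) string {
	l := label
	if !legacyLabel.MatchString(label) {
		l = strconv.Quote(label)
	}
	matcher := l + "=" + strconv.Quote(value)
	if legacyMetric.MatchString(metric) {
		// Classic syntax: the name stays outside the braces.
		return metric + "{" + matcher + "}"
	}
	// UTF-8 syntax: the quoted name moves inside the braces, first by style.
	return "{" + strconv.Quote(metric) + ", " + matcher + "}"
}

func main() {
	fmt.Println(selector("my_metric", "job", "api"))
	fmt.Println(selector("metric.name", "my.label.name", "bar"))
}
```

With these inputs the helper yields `my_metric{job="api"}` for the legacy-compatible name and `{"metric.name", "my.label.name"="bar"}` for the UTF-8 one.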