naming: Update metric and label name restrictions and recommendations with the latest context #2626

Merged
merged 3 commits into from
Apr 24, 2025
46 changes: 27 additions & 19 deletions content/docs/concepts/data_model.md
@@ -17,42 +17,39 @@ Every time series is uniquely identified by its metric name and optional key-val

***Metric names:***

* Specify the general feature of a system that is measured (e.g. `http_requests_total` - the total number of HTTP requests received).
* Metric names may contain ASCII letters, digits, underscores, and colons. It must match the regex `[a-zA-Z_:][a-zA-Z0-9_:]*`.

Note: The colons are reserved for user defined recording rules. They should not be used by exporters or direct instrumentation.

* Metric names SHOULD specify the general feature of a system that is measured (e.g. `http_requests_total` - the total number of HTTP requests received).
* Metric names MAY use any UTF-8 characters.
* Metric names SHOULD match the regex `[a-zA-Z_:][a-zA-Z0-9_:]*` for the best experience and compatibility (see the warning below). Metric names outside of that set will require quoting e.g. when used in PromQL (see the [UTF-8 guide](../guides/utf8.md#querying)).

NOTE: Colons (':') are reserved for user-defined recording rules. They SHOULD NOT be used by exporters or direct instrumentation.

***Metric labels:***

* Enable Prometheus's dimensional data model to identify any given combination of labels for the same metric name. It identifies a particular dimensional instantiation of that metric (for example: all HTTP requests that used the method `POST` to the `/api/tracks` handler). The query language allows filtering and aggregation based on these dimensions.
* The change of any label's value, including adding or removing labels, will create a new time series.
* Labels may contain ASCII letters, numbers, as well as underscores. They must match the regex `[a-zA-Z_][a-zA-Z0-9_]*`.
* Label names beginning with `__` (two "_") are reserved for internal use.
* Label values may contain any Unicode characters.
* Labels with an empty label value are considered equivalent to labels that do not exist.
Labels let you capture different instances of the same metric name. For example: all HTTP requests that used the method `POST` to the `/api/tracks` handler. We refer to this as Prometheus's "dimensional data model". The query language allows filtering and aggregation based on these dimensions. The change of any label's value, including adding or removing labels, will create a new time series.

* Label names MAY use any UTF-8 characters.
* Label names beginning with `__` (two underscores) MUST be reserved for internal Prometheus use.
* Label names SHOULD match the regex `[a-zA-Z_][a-zA-Z0-9_]*` for the best experience and compatibility (see the warning below). Label names outside of that regex will require quoting e.g. when used in PromQL (see the [UTF-8 guide](../guides/utf8.md#querying)).
* Label values MAY contain any UTF-8 characters.
* Labels with an empty label value are considered equivalent to labels that do not exist.

WARNING: The [UTF-8](../guides/utf8.md) support for metric and label names was added relatively recently in Prometheus v3.0.0. It might take time for the wider ecosystem (downstream PromQL compatible projects and vendors, tooling, third-party instrumentation, collectors, etc.) to adopt new quoting mechanisms, relaxed validation etc. For the best compatibility it's recommended to stick to the recommended ("SHOULD") character set.

See also the [best practices for naming metrics and labels](/docs/practices/naming/).
INFO: See also the [best practices for naming metrics and labels](/docs/practices/naming/).
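
As a rough illustration of the rules above, the following Python sketch checks whether a name falls inside the recommended ("SHOULD") character set, and therefore needs no quoting. The two regexes are copied from the text; the function names are hypothetical and are not part of any Prometheus client library.

```python
import re

# Recommended ("SHOULD") patterns, copied from the data model text above.
METRIC_NAME_RE = re.compile(r"[a-zA-Z_:][a-zA-Z0-9_:]*")
LABEL_NAME_RE = re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*")


def is_recommended_metric_name(name: str) -> bool:
    """Return True if the whole name matches the recommended metric name regex."""
    return METRIC_NAME_RE.fullmatch(name) is not None


def is_recommended_label_name(name: str) -> bool:
    """Return True if the whole name matches the recommended label name regex."""
    return LABEL_NAME_RE.fullmatch(name) is not None


def is_reserved_label_name(name: str) -> bool:
    """Label names beginning with two underscores are reserved for internal use."""
    return name.startswith("__")
```

A name such as `http_requests_total` passes the metric check, while a hypothetical dotted name such as `http.requests.total` is still valid UTF-8 but falls outside the recommended set and would need quoting.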

## Samples

Samples form the actual time series data. Each sample consists of:

* a float64 value
* a millisecond-precision timestamp

NOTE: Beginning with Prometheus v2.40, there is experimental support for native
histograms. Instead of a simple float64, the sample value may now take the form
of a full histogram.
* a float64 or [native histogram](https://prometheus.io/docs/specs/native_histograms/) value
* a millisecond-precision timestamp

## Notation

Given a metric name and a set of labels, time series are frequently identified
using this notation:

<metric name>{<label name>=<label value>, ...}
<metric name>{<label name>="<label value>", ...}

For example, a time series with the metric name `api_http_requests_total` and
the labels `method="POST"` and `handler="/messages"` could be written like
@@ -61,3 +58,14 @@ this:
api_http_requests_total{method="POST", handler="/messages"}

This is the same notation that [OpenTSDB](http://opentsdb.net/) uses.

Names with UTF-8 characters outside the recommended set must be quoted, using
this notation:

{"<metric name>", <label name>="<label value>", ...}

Since metric names are internally represented as a label pair
with a special label name (`__name__="<metric name>"`), one could also use the following notation:

{__name__="<metric name>", <label name>="<label value>", ...}
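
The notations above can be sketched as a small Python formatter that picks the unquoted form when the metric name is in the recommended set and the quoted form otherwise. This is only an illustration: `format_series` and `quote` are hypothetical helpers, and the escaping in `quote` is a simplifying assumption (backslashes and double quotes only), not a full implementation of PromQL string escaping.

```python
import re

# Recommended character sets, copied from the data model text.
METRIC_NAME_RE = re.compile(r"[a-zA-Z_:][a-zA-Z0-9_:]*")
LABEL_NAME_RE = re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*")


def quote(text: str) -> str:
    """Wrap text in double quotes (assumption: only \\ and " need escaping)."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def format_series(metric: str, labels: dict) -> str:
    """Render a series identifier, quoting names outside the recommended set."""
    pairs = []
    for name, value in labels.items():
        # Label values are always quoted; label names only when required.
        key = name if LABEL_NAME_RE.fullmatch(name) else quote(name)
        pairs.append(key + "=" + quote(value))
    if METRIC_NAME_RE.fullmatch(metric):
        # <metric name>{<label name>="<label value>", ...}
        return metric + "{" + ", ".join(pairs) + "}"
    # {"<metric name>", <label name>="<label value>", ...}
    return "{" + ", ".join([quote(metric)] + pairs) + "}"
```

For example, `format_series("api_http_requests_total", {"method": "POST", "handler": "/messages"})` reproduces the identifier shown earlier, while a dotted metric name is moved inside the braces and quoted.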

38 changes: 29 additions & 9 deletions content/docs/practices/naming.md
@@ -14,8 +14,8 @@ practices, e.g. naming conventions, differently.

A metric name...

* ...must comply with the [data model](/docs/concepts/data_model/#metric-names-and-labels) for valid characters.
* ...should have a (single-word) application prefix relevant to the domain the
* ...MUST comply with the [data model](/docs/concepts/data_model/#metric-names-and-labels) for valid characters.
* ...SHOULD have a (single-word) application prefix relevant to the domain the
metric belongs to. The prefix is sometimes referred to as `namespace` by
client libraries. For metrics specific to an application, the prefix is
usually the application name itself. Sometimes, however, metrics are more
@@ -26,9 +26,9 @@ A metric name...
(exported by many client libraries)
* <code><b>http</b>\_request\_duration\_seconds</code>
(for all HTTP requests)
* ...must have a single unit (i.e. do not mix seconds with milliseconds, or seconds with bytes).
* ...should use base units (e.g. seconds, bytes, meters - not milliseconds, megabytes, kilometers).See below for a list of base units.
* ...should have a suffix describing the unit, in plural form. Note that an accumulating count has `total` as a suffix, in addition to the unit if applicable. Also note that this applies to units in the narrow sense (like the units in the table below), but not to countable things in general. For example, <code>connections</code> or <code>notifications</code> are not considered units for this rule and do not have to be at the end of the metric name. (See also examples in the next paragraph.)
* ...MUST have a single unit (i.e. do not mix seconds with milliseconds, or seconds with bytes).
* ...SHOULD use base units (e.g. seconds, bytes, meters - not milliseconds, megabytes, kilometers). See [below](#base-units) for a list of base units.
* ...SHOULD have a suffix describing the unit, in plural form. Note that an accumulating count has `total` as a suffix, in addition to the unit if applicable. Also note that this applies to units in the narrow sense (like the units in the table below), but not to countable things in general. For example, <code>connections</code> or <code>notifications</code> are not considered units for this rule and do not have to be at the end of the metric name. (See also examples in the next paragraph.)
* <code>http\_request\_duration\_<b>seconds</b></code>
* <code>node\_memory\_usage\_<b>bytes</b></code>
* <code>http\_requests\_<b>total</b></code>
@@ -39,7 +39,7 @@ A metric name...
(for a pseudo-metric that provides [metadata](https://www.robustperception.io/exposing-the-software-version-to-prometheus) about the running binary)
* <code>data\_pipeline\_last\_record\_processed\_<b>timestamp_seconds</b></code>
(for a timestamp that tracks the time of the latest record processed in a data processing pipeline)
* ...may order its name components in a way that leads to convenient grouping when a list of metric names is sorted lexicographically, as long as all the other rules are followed. The following examples have the common name components first so that all the related metrics are sorted together:
* ...MAY order its name components in a way that leads to convenient grouping when a list of metric names is sorted lexicographically, as long as all the other rules are followed. The following examples have the common name components first so that all the related metrics are sorted together:
* <code>prometheus\_tsdb\_head\_truncations\_closed\_total</code>
* <code>prometheus\_tsdb\_head\_truncations\_established\_total</code>
* <code>prometheus\_tsdb\_head\_truncations\_failed\_total</code>
@@ -49,7 +49,7 @@ A metric name...
* <code>prometheus\_tsdb\_head\_established\_truncations\_total</code>
* <code>prometheus\_tsdb\_head\_failed\_truncations\_total</code>
* <code>prometheus\_tsdb\_head\_truncations\_total</code>
* ...should represent the same logical thing-being-measured across all label
* ...SHOULD represent the same logical thing-being-measured across all label
dimensions.
* request duration
* bytes of data transfer
@@ -61,6 +61,27 @@ meaningful, split the data up into multiple metrics. For example, having the
capacity of various queues in one metric is good, while mixing the capacity of a
queue with the current number of elements in the queue is not.

### Why include unit and type suffixes in metric names?

Some metric naming conventions (e.g. OpenTelemetry) discourage or even disallow
including information about a metric's unit and type in the metric name. A common
argument is that those pieces of information are already defined somewhere else (e.g. schema,
metadata, other labels, etc.).

Prometheus strongly recommends including the unit and type in a metric name, even if you store that
information elsewhere, for the following practical reasons:

* **Metric consumption reliability and UX**: When interacting with a modern UI to
use such a metric in PromQL, it's possible to display rich information about the metric's type and unit
(autocompletion, overlays, pop-ups). Unfortunately, interactive, ad hoc querying in a powerful UI is not
the only way that users interact with metrics. The metric consumption ecosystem is vast. The majority
of consumption comes in the form of plain YAML configuration for a variety of observability tools that handle
alerting, recording, autoscaling, dashboards, analysis, processing, etc. It's **critical**, especially
during monitoring/SRE incident response, to be able to look at PromQL expressions in plain YAML and understand
the underlying metric type and unit you are working with.
* **Metric collisions**: With growing adoption and metric changes over time, there are cases where the lack
of unit and type information in the metric name will cause certain series to collide (e.g. `process_cpu` for seconds and milliseconds).
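
To make the unit rules concrete, here is a minimal Python sketch of a lint check for the "single unit" and "base units" rules. It is an assumption-laden illustration: `naming_warnings` is a hypothetical helper, and the unit lists contain only the examples named in this document (seconds, bytes, meters and their non-base counterparts), not a complete unit table.

```python
# Units taken from the examples in the text; NOT a complete list of units.
BASE_UNITS = ("seconds", "bytes", "meters")
NON_BASE_UNITS = ("milliseconds", "megabytes", "kilometers")


def naming_warnings(name: str) -> list:
    """Return human-readable warnings for a metric name (hypothetical lint)."""
    parts = name.split("_")
    warnings = []
    # SHOULD use base units (e.g. seconds, not milliseconds).
    for unit in NON_BASE_UNITS:
        if unit in parts:
            warnings.append("non-base unit in name: " + unit)
    # MUST have a single unit (do not mix seconds with bytes, etc.).
    units_found = {p for p in parts if p in BASE_UNITS + NON_BASE_UNITS}
    if len(units_found) > 1:
        warnings.append("mixed units in name: " + ", ".join(sorted(units_found)))
    return warnings
```

A name like `http_request_duration_seconds` produces no warnings, whereas a hypothetical `http_request_duration_milliseconds` is flagged for its non-base unit. A unitless name such as `process_cpu` also passes this check, which is exactly the ambiguity the collision example above describes.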

## Labels

Use labels to differentiate the characteristics of the thing that is being measured:
@@ -77,8 +98,7 @@ of data stored. Do not use labels to store dimensions with high cardinality
(many different label values), such as user IDs, email addresses, or other
unbounded sets of values.


## Base units
## Base Units

Prometheus does not have any units hard coded. For better compatibility, base
units should be used. The following lists some metric families with their base units.