| 1 | +## Feature Flags API |
| 2 | + |
| 3 | +* **Owners:** |
| 4 | + * `@roidelapluie` |
| 5 | + |
| 6 | +* **Implementation Status:** Not implemented |
| 7 | + |
| 8 | +* **Related Issues and PRs:** |
| 9 | + * [Feature request](https://github.com/prometheus/prometheus/issues/10022) |
| 10 | + * [Example use case](https://github.com/grafana/grafana/issues/33487) |
| 11 | + |
| 12 | +This design document proposes introducing a "features list" API within the Prometheus ecosystem. This API would allow Prometheus-like endpoints to advertise which features they support and have enabled. By exposing this information, clients can determine in advance what functionality is available on a given endpoint, leading to more efficient API usage, optimized PromQL queries, and clearer expectations about endpoint capabilities. |
| 13 | + |
| 14 | +The primary objectives are to create a solution that is broadly applicable across various targets, encouraging wide adoption, and to address practical needs and optimizations that arise when such capability information is easily accessible. |
| 15 | + |
| 16 | +## Why |
| 17 | + |
| 18 | +Over time, the Prometheus APIs have undergone numerous optimizations, such as supporting POST in addition to GET requests and allowing filtering on certain API endpoints. Additionally, new APIs, PromQL functions, and capabilities are regularly introduced. Some of these features are optional and can be enabled or disabled by users. |
| 19 | + |
| 20 | +Without a "features API," new advancements are often underutilized because API clients are hesitant to adopt them before widespread support exists among users. By creating an API that clearly communicates available and enabled features, clients can take advantage of new capabilities as soon as they are released. For instance, HTTP POST support was added to Prometheus in version 2.1.0 (2018) but was not adopted as the default in Grafana until version 8.0 (2021), illustrating a three-year delay caused by limited visibility of feature availability. |
| 21 | + |
| 22 | +### Pitfalls of the current solution |
| 23 | + |
| 24 | +Currently, there is no proper solution for feature discovery. While users can retrieve configs or version flags, these APIs are tightly coupled to Prometheus, not machine-friendly, and unsuitable for third-party or generic integrations. |
| 25 | + |
| 26 | +There are client-side workarounds. In Grafana, users can configure datasources like this: |
| 27 | + |
| 28 | +```yaml |
| 29 | +prometheusType: Prometheus # Options: Cortex | Mimir | Prometheus | Thanos |
| 30 | +prometheusVersion: 2.40.0 |
| 31 | +``` |
| 32 | + |
| 33 | +Grafana infers compatibility from these values and selects endpoints accordingly. For instance, all of the following support label matchers in the Labels API: |
| 34 | + |
| 35 | +- Prometheus >= 2.24.0 |
| 36 | +- Mimir >= 2.0.0 |
| 37 | +- Cortex >= 1.11.0 |
| 38 | +- Thanos >= 0.18.0 |
| 39 | + |
| 40 | +If the criteria are met, Grafana chooses more efficient label endpoints (`/api/v1/labels`, `/api/v1/label/<name>/values` with `match[]`). Otherwise, it falls back to the less efficient `/api/v1/series` for label queries. |
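The type-and-version inference described above can be sketched roughly as follows. This is an illustration only, not Grafana's actual code: the minimum versions are copied from the list above, while the table layout and the function names `parse_version` and `supports_label_matchers` are hypothetical.

```python
# Minimum versions that support label matchers in the Labels API,
# copied from the list above. The data structure itself is an assumption.
LABEL_MATCHER_MIN_VERSION = {
    "Prometheus": (2, 24, 0),
    "Mimir": (2, 0, 0),
    "Cortex": (1, 11, 0),
    "Thanos": (0, 18, 0),
}


def parse_version(version: str) -> tuple[int, ...]:
    """Turn "2.40.0" into (2, 40, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def supports_label_matchers(prometheus_type: str, prometheus_version: str) -> bool:
    """Guess feature support from the user-configured type and version."""
    minimum = LABEL_MATCHER_MIN_VERSION.get(prometheus_type)
    if minimum is None:
        # Unknown backend: the client can only assume the feature is missing.
        return False
    return parse_version(prometheus_version) >= minimum


# With the datasource configuration shown earlier:
use_label_endpoints = supports_label_matchers("Prometheus", "2.40.0")
```

Note that the answer depends entirely on what the user typed into the datasource configuration, and that a backend missing from the table is always treated as unsupported.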
| 41 | + |
| 42 | +Key limitations of this approach: |
| 43 | + |
| 44 | +1. Configuration errors (wrong type or version) can lead to incompatible or missing features. |
| 45 | +2. Backend upgrades alone do not enable new features in clients—client logic must also be updated and released. |
| 46 | +3. New Prometheus-compatible backends require explicit code changes in Grafana, slowing adoption. |
| 47 | +4. Type and version checks are coarse; they do not reflect actual enabled features, which may depend on flags or configuration. |
| 48 | + |
| 49 | +Alternatives already exist in some downstream projects and demonstrate the need for this kind of API. However, the current approach is based on extending the [`buildinfo` endpoint](https://prometheus.io/docs/prometheus/latest/querying/api/#build-information) with a [`features` field](https://github.com/grafana/mimir/blob/9fccbacdabdd236cb7ff97cf154643b409078178/pkg/util/version/info_handler.go#L11-L30), which is vendor-specific. Grafana already uses this approach for some [alertmanager features](https://github.com/grafana/grafana/blob/8863ed9d6f8395808196b5d81d436fb637a43d37/public/app/features/alerting/unified/api/buildInfo.ts#L137-L145). |
| 50 | + |
| 51 | +## Goals |
| 52 | + |
| 53 | +- Provide a machine-readable API to report enabled features. |
| 54 | +- Ensure the solution is lightweight to encourage broad adoption in the ecosystem. |
| 55 | +- Cover a comprehensive and relevant subset of Prometheus features. |
| 56 | +- Design the API to be extensible, allowing third-party projects to declare their own features. |
| 57 | +- Allow for potential future inclusion of Alertmanager, even though it is currently out of scope. |
| 58 | + |
| 59 | +### Audience |
| 60 | + |
| 61 | +The intended audience for this proposal includes: |
| 62 | + |
| 63 | +- Developers creating software that exposes the Prometheus API |
| 64 | +- Consumers of the Prometheus API |
| 65 | + |
| 66 | +## Non-Goals |
| 67 | + |
| 68 | +Implementing a unified feature gate in the code is out of scope. |
| 69 | + |
| 70 | +## How |
| 71 | + |
| 72 | +The `/api/v1/features` endpoint returns a JSON object with top-level categories inspired by Prometheus package organization. Each category key contains a map of unique feature names (strings) to `true`/`false` booleans indicating whether the feature is enabled. |
| 73 | + |
| 74 | +Initial categories: |
| 75 | + |
| 76 | +- `api` - API endpoint features and capabilities |
| 77 | +- `remote_write_receiver` - Remote write receiver features |
| 78 | +- `remote_write_sender` - Remote write sender features |
| 79 | +- `scrape` - Scraping capabilities |
| 80 | +- `tsdb` - Time series database features |
| 81 | +- `rule` - Rule evaluation features |
| 82 | +- `ui` - Web UI capabilities |
| 83 | +- `promql` - PromQL language features (syntax, modifiers, operators) |
| 84 | +- `promql_functions` - Individual PromQL functions |
| 85 | + |
| 86 | +Example response: |
| 87 | + |
| 88 | +```json |
| 89 | +{ |
| 90 | + "status": "success", |
| 91 | + "data": { |
| 92 | + "api": { |
| 93 | + "exemplars": true, |
| 94 | + "labels_matchers": true, |
| 95 | + "query_post": true |
| 96 | + }, |
| 97 | + "promql": { |
| 98 | + "negative_offset": true, |
| 99 | + "at_modifier": true, |
| 100 | + "subqueries": true |
| 101 | + }, |
| 102 | + "promql_functions": { |
| 103 | + "last_over_time": true, |
| 104 | + "limitk": true |
| 105 | + }, |
| 106 | + "prometheus": { |
| 107 | + "stringlabels": true |
| 108 | + } |
| 109 | + } |
| 110 | +} |
| 111 | +``` |
| 112 | + |
| 113 | +Naming and response conventions: |
| 114 | +- All names MUST use `snake_case` |
| 115 | +- Each category value is a map from unique feature name to a boolean |
| 116 | +- Clients MUST ignore unknown feature names and categories |
| 117 | +- The response follows standard Prometheus API conventions with `status` and `data` fields |
| 118 | +- The endpoint returns HTTP 200 OK, like other Prometheus APIs |
| 119 | +- Vendors MAY add vendor-specific categories (e.g., `prometheus`, `mimir`, `cortex`) to expose implementation-specific features such as build tags or vendor-unique capabilities |
| 120 | + |
| 121 | +Some items might exist in multiple categories. |
| 122 | + |
| 123 | +We do not differentiate between a feature that is simply disabled and one that is missing because it was not compiled in. There is no separate "build" category. Instead, if a feature depends on a compile-time flag, it will appear under its relevant category. If a feature is not compiled in or is disabled, it should be set to `false`. Implementations MAY omit features set to `false`, and clients MUST treat absent features as equivalent to `false`. |
| 124 | + |
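On the client side, the rules above (absent means `false`, unknown entries are ignored) reduce to a single lookup. The sketch below assumes the response body from `/api/v1/features` has already been fetched and decoded. The helper names `feature_enabled` and `label_values_strategy` are hypothetical, and the fallback mirrors the Grafana behavior described earlier rather than any existing client code.

```python
import json


def feature_enabled(response: dict, category: str, name: str) -> bool:
    """Return True only if the feature is present and explicitly true."""
    if response.get("status") != "success":
        return False
    categories = response.get("data")
    if not isinstance(categories, dict):
        return False
    features = categories.get(category)
    if not isinstance(features, dict):
        # Missing or malformed category: treat every feature in it as false.
        return False
    return features.get(name) is True


def label_values_strategy(response: dict) -> str:
    """Pick the label endpoints when matchers are supported, else /api/v1/series."""
    if feature_enabled(response, "api", "labels_matchers"):
        return "labels"
    return "series"


body = json.loads("""
{
  "status": "success",
  "data": {
    "api": {"labels_matchers": true, "query_post": true},
    "some_vendor": {"unknown_feature": true}
  }
}
""")
strategy = label_values_strategy(body)
has_exemplars = feature_enabled(body, "api", "exemplars")
```

Here `strategy` resolves to the label endpoints, `has_exemplars` is false because the entry is absent, and the unknown `some_vendor` category is never inspected.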
| 125 | +## Alternatives |
| 126 | + |
| 127 | +- Flat list: rejected because categories make it easier to group related entries, such as PromQL functions. |
| 128 | +- No booleans (only `true` values): rejected because clients might use `false` to hint to the user that a feature exists and could be enabled. |
| 129 | +- Richer information than booleans (limits, etc.): rejected primarily to keep things simple. |
| 130 | + |
| 131 | +## Action Plan |
| 132 | + |
| 133 | +The package will be located in the prometheus/prometheus repository. |
| 134 | + |
| 135 | +Instead of actively collecting features from other packages, this package will allow other components to register their supported features with it. |
| 136 | + |
| 137 | +For the initial launch, I plan to include a substantial set of already existing features. |