Update internal telemetry docs #6484

Open
wants to merge 3 commits into
base: main
77 changes: 43 additions & 34 deletions content/en/docs/collector/internal-telemetry.md
@@ -238,43 +238,52 @@ files in the repository.

#### `basic`-level metrics

| Metric name | Description | Type |
| ------------------------------------------------------- | --------------------------------------------------------------------------------------- | --------- |
| `otelcol_exporter_enqueue_failed_`<br>`log_records` | Number of logs that exporter(s) failed to enqueue. | Counter |
| `otelcol_exporter_enqueue_failed_`<br>`metric_points` | Number of metric points that exporter(s) failed to enqueue. | Counter |
| `otelcol_exporter_enqueue_failed_`<br>`spans` | Number of spans that exporter(s) failed to enqueue. | Counter |
| `otelcol_exporter_queue_capacity` | Fixed capacity of the sending queue, in batches. | Gauge |
| `otelcol_exporter_queue_size` | Current size of the sending queue, in batches. | Gauge |
| `otelcol_exporter_send_failed_`<br>`log_records` | Number of logs that exporter(s) failed to send to destination. | Counter |
| `otelcol_exporter_send_failed_`<br>`metric_points` | Number of metric points that exporter(s) failed to send to destination. | Counter |
| `otelcol_exporter_send_failed_`<br>`spans` | Number of spans that exporter(s) failed to send to destination. | Counter |
| `otelcol_exporter_sent_log_records` | Number of logs successfully sent to destination. | Counter |
| `otelcol_exporter_sent_metric_points` | Number of metric points successfully sent to destination. | Counter |
| `otelcol_exporter_sent_spans` | Number of spans successfully sent to destination. | Counter |
| `otelcol_process_cpu_seconds` | Total CPU user and system time in seconds. | Counter |
| `otelcol_process_memory_rss` | Total physical memory (resident set size) in bytes. | Gauge |
| `otelcol_process_runtime_heap_`<br>`alloc_bytes` | Bytes of allocated heap objects (see 'go doc runtime.MemStats.HeapAlloc'). | Gauge |
| `otelcol_process_runtime_total_`<br>`alloc_bytes` | Cumulative bytes allocated for heap objects (see 'go doc runtime.MemStats.TotalAlloc'). | Counter |
| `otelcol_process_runtime_total_`<br>`sys_memory_bytes` | Total bytes of memory obtained from the OS (see 'go doc runtime.MemStats.Sys'). | Gauge |
| `otelcol_process_uptime` | Uptime of the process in seconds. | Counter |
| `otelcol_processor_batch_batch_`<br>`send_size` | Number of units in the batch that was sent. | Histogram |
| `otelcol_processor_batch_batch_size_`<br>`trigger_send` | Number of times the batch was sent due to a size trigger. | Counter |
| `otelcol_processor_batch_metadata_`<br>`cardinality` | Number of distinct metadata value combinations being processed. | Counter |
| `otelcol_processor_batch_timeout_`<br>`trigger_send` | Number of times the batch was sent due to a timeout trigger. | Counter |
| `otelcol_processor_incoming_items` | Number of items passed to the processor. | Counter |
| `otelcol_processor_outgoing_items` | Number of items emitted from the processor. | Counter |
| `otelcol_receiver_accepted_`<br>`log_records` | Number of logs successfully ingested and pushed into the pipeline. | Counter |
| `otelcol_receiver_accepted_`<br>`metric_points` | Number of metric points successfully ingested and pushed into the pipeline. | Counter |
| `otelcol_receiver_accepted_spans` | Number of spans successfully ingested and pushed into the pipeline. | Counter |
| `otelcol_receiver_refused_`<br>`log_records` | Number of logs that could not be pushed into the pipeline. | Counter |
| `otelcol_receiver_refused_`<br>`metric_points` | Number of metric points that could not be pushed into the pipeline. | Counter |
| `otelcol_receiver_refused_spans` | Number of spans that could not be pushed into the pipeline. | Counter |
| `otelcol_scraper_errored_`<br>`metric_points` | Number of metric points the Collector failed to scrape. | Counter |
| `otelcol_scraper_scraped_`<br>`metric_points` | Number of metric points scraped by the Collector. | Counter |
| Metric name | Description | Type |
| ------------------------------------------------------ | --------------------------------------------------------------------------------------- | ------- |
| `otelcol_exporter_enqueue_failed_`<br>`log_records` | Number of logs that exporter(s) failed to enqueue. | Counter |
| `otelcol_exporter_enqueue_failed_`<br>`metric_points` | Number of metric points that exporter(s) failed to enqueue. | Counter |
| `otelcol_exporter_enqueue_failed_`<br>`spans` | Number of spans that exporter(s) failed to enqueue. | Counter |
| `otelcol_exporter_queue_capacity` | Fixed capacity of the sending queue, in batches. | Gauge |
| `otelcol_exporter_queue_size` | Current size of the sending queue, in batches. | Gauge |
| `otelcol_exporter_send_failed_`<br>`log_records` | Number of logs that exporter(s) failed to send to destination. | Counter |
| `otelcol_exporter_send_failed_`<br>`metric_points` | Number of metric points that exporter(s) failed to send to destination. | Counter |
| `otelcol_exporter_send_failed_`<br>`spans` | Number of spans that exporter(s) failed to send to destination. | Counter |
| `otelcol_exporter_sent_log_records` | Number of logs successfully sent to destination. | Counter |
| `otelcol_exporter_sent_metric_points` | Number of metric points successfully sent to destination. | Counter |
| `otelcol_exporter_sent_spans` | Number of spans successfully sent to destination. | Counter |
| `otelcol_process_cpu_seconds` | Total CPU user and system time in seconds. | Counter |
| `otelcol_process_memory_rss` | Total physical memory (resident set size) in bytes. | Gauge |
| `otelcol_process_runtime_heap_`<br>`alloc_bytes` | Bytes of allocated heap objects (see 'go doc runtime.MemStats.HeapAlloc'). | Gauge |
| `otelcol_process_runtime_total_`<br>`alloc_bytes` | Cumulative bytes allocated for heap objects (see 'go doc runtime.MemStats.TotalAlloc'). | Counter |
| `otelcol_process_runtime_total_`<br>`sys_memory_bytes` | Total bytes of memory obtained from the OS (see 'go doc runtime.MemStats.Sys'). | Gauge |
| `otelcol_process_uptime` | Uptime of the process in seconds. | Counter |
| `otelcol_processor_incoming_items` | Number of items passed to the processor. | Counter |
| `otelcol_processor_outgoing_items` | Number of items emitted from the processor. | Counter |
| `otelcol_receiver_accepted_`<br>`log_records` | Number of logs successfully ingested and pushed into the pipeline. | Counter |
| `otelcol_receiver_accepted_`<br>`metric_points` | Number of metric points successfully ingested and pushed into the pipeline. | Counter |
| `otelcol_receiver_accepted_spans` | Number of spans successfully ingested and pushed into the pipeline. | Counter |
| `otelcol_receiver_refused_`<br>`log_records` | Number of logs that could not be pushed into the pipeline. | Counter |
| `otelcol_receiver_refused_`<br>`metric_points` | Number of metric points that could not be pushed into the pipeline. | Counter |
| `otelcol_receiver_refused_spans` | Number of spans that could not be pushed into the pipeline. | Counter |
| `otelcol_scraper_errored_`<br>`metric_points` | Number of metric points the Collector failed to scrape. | Counter |
| `otelcol_scraper_scraped_`<br>`metric_points` | Number of metric points scraped by the Collector. | Counter |

#### Additional `normal`-level metrics

There are currently no metrics specific to `normal` verbosity.
| Metric name | Description | Type |
| ------------------------------------------------------- | --------------------------------------------------------------- | --------- |
| `otelcol_processor_batch_batch_`<br>`send_size` | Number of units in the batch that was sent. | Histogram |
| `otelcol_processor_batch_batch_size_`<br>`trigger_send` | Number of times the batch was sent due to a size trigger. | Counter |
| `otelcol_processor_batch_metadata_`<br>`cardinality` | Number of distinct metadata value combinations being processed. | Counter |
| `otelcol_processor_batch_timeout_`<br>`trigger_send` | Number of times the batch was sent due to a timeout trigger. | Counter |

{{% alert title="Note" color="info" %}} Aside from
`otelcol_processor_batch_batch_send_size_bytes` which has been `detailed` since
its introduction, the other batch processor metrics were `basic` until they were
switched to `normal` in Collector
[v0.99.0](https://github.com/open-telemetry/opentelemetry-collector/releases/tag/v0.99.0).
They were accidentally switched back to `basic` in v0.109.0, which was fixed in
v0.122.0. {{% /alert %}}
Comment on lines +280 to +286
Contributor

I wonder how useful this note is for the average reader. It feels like more (historical) detail than we actually need to document. WDYT @tiffany76 @theletterf @open-telemetry/docs-approvers?

If we do keep it, the text should be reworked:

Contributor Author

I don't know if all that many users are running the very latest Collector version at all times, and since we don't have "per-version" documentation, I think it could still be useful to document changes while the metrics aren't stable. But I'll leave it to the docs-approvers' judgment.

Contributor

I agree with @jade-guiton-dd that there are a lot of users working with older versions of the Collector, as evidenced by community Slack conversations. The internal telemetry page also sees a fair bit of use (it's the 58th most viewed docs page so far this year). I think it's worth being explicit about these nuances while the changes are still recent. We can remove the note later if the versions are so outdated that it's unlikely anyone is still using them.

As for the text, I personally like the way @jade-guiton-dd has written it. It's succinct and clear. But I'm open to edits as well.

Contributor

Here are my proposed edits:

Suggested change
{{% alert title="Note" color="info" %}} Aside from
`otelcol_processor_batch_batch_send_size_bytes` which has been `detailed` since
its introduction, the other batch processor metrics were `basic` until they were
switched to `normal` in Collector
[v0.99.0](https://github.com/open-telemetry/opentelemetry-collector/releases/tag/v0.99.0).
They were accidentally switched back to `basic` in v0.109.0, which was fixed in
v0.122.0. {{% /alert %}}
{{% alert title="Batch processor metrics level changes" color="info" %}}
In Collector [v0.99.0], all batch processor metrics were upgraded from `basic`
to `normal` (current level), except for
`otelcol_processor_batch_batch_send_size_bytes`, which has been `detailed` since
its introduction. Note however that these metrics were inadvertently reverted to
`basic` from v0.109.0 to v0.121.0.
[v0.99.0]:
https://github.com/open-telemetry/opentelemetry-collector/releases/tag/v0.99.0
{{% /alert %}}

@jade-guiton-dd: please preserve the spacing around the opening and closing alert tags.
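
For anyone reading this thread with an older Collector in mind, the level history described in the note can be summarized as a small lookup. This is only an illustrative sketch based on the version boundaries quoted above (v0.99.0, v0.109.0, v0.122.0); the function name `batch_metrics_level` and its parameters are hypothetical and not part of the Collector or its documentation.

```python
# Hypothetical helper: maps a Collector version to the verbosity level at
# which a batch processor metric is emitted, per the note in this PR.

# This metric is described as `detailed` since its introduction.
ALWAYS_DETAILED = "otelcol_processor_batch_batch_send_size_bytes"


def batch_metrics_level(version: str, metric: str = "") -> str:
    """Return "basic", "normal", or "detailed" for a batch processor metric.

    `version` is a string such as "v0.109.0" or "0.109.0"; only the major
    and minor components are used.
    """
    if metric == ALWAYS_DETAILED:
        return "detailed"

    parts = version.lstrip("v").split(".")
    major_minor = (int(parts[0]), int(parts[1]))

    if major_minor < (0, 99):
        return "basic"   # before the switch to `normal` in v0.99.0
    if major_minor < (0, 109):
        return "normal"  # v0.99.0 up to, not including, v0.109.0
    if major_minor < (0, 122):
        return "basic"   # inadvertent revert, v0.109.0 through v0.121.x
    return "normal"      # fixed in v0.122.0


if __name__ == "__main__":
    for v in ("v0.98.0", "v0.99.0", "v0.109.0", "v0.121.0", "v0.122.0"):
        print(v, batch_metrics_level(v))
```

The lookup shows why the note may matter to users on older releases: between v0.109.0 and v0.121.x the batch processor metrics appear at `basic` verbosity even though the table lists them under `normal`.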


#### Additional `detailed`-level metrics

4 changes: 4 additions & 0 deletions static/refcache.json
@@ -6707,6 +6707,10 @@
"StatusCode": 200,
"LastSeen": "2024-10-24T15:10:25.832305+02:00"
},
"https://github.com/open-telemetry/opentelemetry-collector/releases/tag/v0.99.0": {
"StatusCode": 206,
"LastSeen": "2025-03-07T15:19:33.345532+01:00"
},
"https://github.com/open-telemetry/opentelemetry-collector/security/advisories/GHSA-c74f-6mfw-mm4v": {
"StatusCode": 206,
"LastSeen": "2025-02-01T07:10:33.83398-05:00"