Commit 49ffa24

Merge pull request newrelic#13933 from josemore/jmore/otel-host-rev
feat: otel host monitoring updates
2 parents 1bf6055 + 8c5127a commit 49ffa24

1 file changed

+68
-128
lines changed

src/content/docs/more-integrations/open-source-telemetry-integrations/opentelemetry/collector/opentelemetry-collector-infra-hosts.mdx

@@ -7,7 +7,8 @@ tags:
77
metaDescription: The OpenTelemetry Collector is a central tool to collect, process, and export your telemetry.
88
---
99

10-
You can collect metrics about your infrastructure hosts with OpenTelemetry if you set up the host receiver in a collector. The collector is a component of OpenTelemetry that collects, processes, and exports telemetry data to New Relic (or any observability backend).
10+
You can collect metrics and logs from your infrastructure hosts with OpenTelemetry and use the same infrastructure experiences that are available with New Relic agents.
11+
The OpenTelemetry collector requires specific receivers and processors to collect and report host telemetry.
1112

1213
If you're looking for help with other collector use cases, see the [newrelic-opentelemetry-examples](https://github.com/newrelic/newrelic-opentelemetry-examples) repository.
1314

@@ -33,11 +34,13 @@ Your deployment experience might vary depending on which vendor-specific distrib
3334
To set up infrastructure monitoring, you need to install and configure components that are included in the `collector-contrib` release. For example, the host receiver is required to collect basic host metrics such as CPU, memory, disk, and network stats and is only available in the [OpenTelemetry Collector-contrib](https://github.com/open-telemetry/opentelemetry-collector-contrib) release.
3435
</Callout>
3536

36-
## Step 3: Configure host monitoring using the host receiver [#host-receiver]
37+
## Step 3: Configure host metrics and logs [#host-receiver]
3738

3839
This collector example is meant to serve as a starting point from which you can extend, customize, and validate configurations before using them in production.
3940

40-
The `collector-contrib` release provides a `hostreceiver` that generates metrics about the system scraped from various sources. Deploy the collector as an agent when you use a `hostreceiver`.
41+
The `collector-contrib` release includes:
42+
* The `hostmetrics` receiver (`hostmetricsreceiver`), which generates metrics about the host system scraped from various sources. Deploy the collector as an agent when you use this receiver.
43+
* The `filelog` receiver (`filelogreceiver`), which tails and parses logs from files.
4144

4245
When using the host receiver as part of the collector configuration, New Relic automatically detects host metrics as part of a `Host` entity and will synthesize its golden metrics providing the same experience as with the New Relic infrastructure agent. The following are the configuration requirements to enable the `Host` entity experience in New Relic UI:
4346

@@ -46,113 +49,12 @@ When using the host receiver as part of the collector configuration, New Relic a
4649

4750
Learn more about available metrics and advanced configurations from the [OpenTelemetry documentation in GitHub](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/hostmetricsreceiver).
4851

49-
Adapt the `config.yaml` with these recommended parameters:
52+
Here is a sample configuration YAML file for a Linux host. Be sure to do the following:
5053

51-
<Callout variant="important">
52-
CPU, load, memory, and disk utilization metrics require otelcol-contrib release `v0.47.0` or greater.
53-
</Callout>
54-
55-
<table>
56-
<thead>
57-
<tr>
58-
<th style={{ width: "290px" }}>
59-
Configuration
60-
</th>
61-
62-
<th>
63-
Description
64-
</th>
65-
</tr>
66-
</thead>
67-
68-
<tbody>
69-
<tr>
70-
<td>
71-
`receivers::hostmetrics`
72-
</td>
73-
74-
<td>
75-
Enable host metrics.
76-
77-
* A 20 second interval is recommended (same default as the Infrastructure agent).
78-
* It should not be greater than 60 seconds to avoid issues with “host not responding” alerts.
79-
* Process instrumentation is optional.
80-
</td>
81-
</tr>
82-
83-
<tr>
84-
<td>
85-
`processors::resourcedetection`
86-
</td>
87-
88-
<td>
89-
Keep the following in mind:
90-
91-
* `env`: Reads resource information from the `OTEL_RESOURCE_ATTRIBUTES` environment variable.
92-
* `system`: Adds `host.name` and `os.type`.
93-
* For cloud environments, configure a specific resource detection processor so that metrics are decorated with `host.id` (required to identify the host entity in New Relic). Common cloud detectors are `gce` for GCP machines, `ec2` for AWS EC2, and `azure` for Azure VMs. Additional processors are available for orchestrated environments.
94-
* For on-premises systems (or unsupported cloud environments), a `host.id` attribute is required. Use the `resource` processor to copy the `host.name` value (from `system`) as a new `host.id` attribute. Note this value should be unique across all instrumented hosts:
95-
```yaml
96-
resource:
97-
attributes:
98-
- key: host.id
99-
from_attribute: host.name
100-
action: upsert
101-
```
102-
</td>
103-
</tr>
104-
105-
<tr>
106-
<td>
107-
`processors::batch`
108-
</td>
109-
110-
<td>
111-
The [batch processor](https://github.com/open-telemetry/opentelemetry-collector/blob/main/processor/batchprocessor/README.md) accepts spans, metrics, or logs and places them into batches. Batching helps better compress the data and reduce the number of outgoing connections required to transmit the data. This processor supports both size and time based batching.
112-
</td>
113-
</tr>
114-
115-
<tr>
116-
<td>
117-
`processors::memory_limiter`
118-
</td>
119-
120-
<td>
121-
The [memory limiter](https://github.com/open-telemetry/opentelemetry-collector/tree/main/processor/memorylimiterprocessor) processor is used to prevent out of memory situations on the collector.
122-
123-
Putting checks in place is important because:
124-
125-
* The amount and type of data the collector processes is specific to its environment.
126-
* The collector's resource utilization is dependent on the configured processors.
127-
</td>
128-
</tr>
129-
130-
<tr>
131-
<td>
132-
`processors::cumulative_delta`
133-
</td>
134-
135-
<td>
136-
The [cumulative delta](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/cumulativetodeltaprocessor) processor converts cumulative sum metrics to cumulative delta. This helps you query system metric rates more easily in New Relic.
137-
</td>
138-
</tr>
139-
140-
<tr>
141-
<td>
142-
`service::pipelines::metrics`
143-
</td>
144-
145-
<td>
146-
Make sure `hostreceiver` and `resourcedetection` are included.
147-
</td>
148-
</tr>
149-
</tbody>
150-
</table>
151-
152-
Here is a sample configuration YAML file. Be sure to do the following:
153-
154-
* Replace OTLP_ENDPOINT_HERE with the appropriate [endpoint](/docs/more-integrations/open-source-telemetry-integrations/opentelemetry/get-started/opentelemetry-set-up-your-app/#review-settings).
155-
* Replace YOUR_KEY_HERE with your <InlinePopover type="licenseKey" />
54+
* Replace `OTLP_ENDPOINT_HERE` with the appropriate [endpoint](/docs/more-integrations/open-source-telemetry-integrations/opentelemetry/get-started/opentelemetry-set-up-your-app/#review-settings).
55+
* Replace `YOUR_KEY_HERE` with your <InlinePopover type="licenseKey" />.
56+
* Adjust the target log files under `include` in the `filelog` receiver section based on your requirements.
57+
* Adjust the `memory_limiter` default values based on your environment requirements.
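
As a rough illustration of the last two adjustments, the fragment below narrows the `filelog` receiver to a couple of files and sizes `memory_limiter` by percentage instead of fixed MiB values. The application log paths (`/var/log/myapp/...`) and all numeric values are hypothetical placeholders, not recommendations from this commit. Check the option names against the `collector-contrib` release you run.

```yaml
receivers:
  filelog:
    include:
      - /var/log/syslog            # taken from the sample configuration
      - /var/log/myapp/*.log       # hypothetical application log path (glob)
    exclude:
      - /var/log/myapp/debug.log   # hypothetical file to skip
    start_at: end                  # read only lines written after the collector starts

processors:
  memory_limiter:
    check_interval: 1s
    limit_percentage: 75           # illustrative value: share of total memory
    spike_limit_percentage: 15     # illustrative value
```

Percentage-based limits can be easier to reuse across hosts of different sizes, while the fixed `limit_mib` and `spike_limit_mib` values in the sample give a predictable ceiling on a known machine.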
15658

15759
```yaml
15860
extensions:
@@ -183,29 +85,51 @@ receivers:
18385
enabled: true
18486
processes:
18587
process:
88+
metrics:
89+
process.cpu.utilization:
90+
enabled: true
91+
process.cpu.time:
92+
enabled: false
93+
94+
filelog:
95+
include:
96+
- /var/log/alternatives.log
97+
- /var/log/cloud-init.log
98+
- /var/log/auth.log
99+
- /var/log/dpkg.log
100+
- /var/log/syslog
101+
- /var/log/messages
102+
- /var/log/secure
103+
- /var/log/yum.log
186104

187105
processors:
106+
107+
transform/truncate:
108+
trace_statements:
109+
- context: span
110+
statements:
111+
- truncate_all(attributes, 4095)
112+
- truncate_all(resource.attributes, 4095)
113+
log_statements:
114+
- context: log
115+
statements:
116+
- truncate_all(attributes, 4095)
117+
- truncate_all(resource.attributes, 4095)
118+
188119
memory_limiter:
189120
check_interval: 1s
190121
limit_mib: 1000
191122
spike_limit_mib: 200
123+
192124
batch:
193-
cumulativetodelta:
194-
include:
195-
metrics:
196-
- system.network.io
197-
- system.disk.operations
198-
- system.network.dropped
199-
- system.network.packets
200-
- process.cpu.time
201-
match_type: strict
202-
resource:
203-
attributes:
204-
- key: host.id
205-
from_attribute: host.name
206-
action: upsert
125+
207126
resourcedetection:
208127
detectors: [env, system]
128+
129+
resourcedetection/cloud:
130+
detectors: ["gcp", "ec2", "azure"]
131+
timeout: 2s
132+
override: false
209133

210134
exporters:
211135
otlp:
@@ -215,10 +139,21 @@ exporters:
215139

216140
service:
217141
pipelines:
142+
218143
metrics:
219144
receivers: [hostmetrics]
220-
processors: [batch, resourcedetection, resource, cumulativetodelta]
221-
exporters: [otlp]
145+
processors: [memory_limiter, resourcedetection, resourcedetection/cloud, batch]
146+
exporters: [logging, otlp]
147+
148+
traces:
149+
receivers: [otlp]
150+
processors: [memory_limiter, transform/truncate, resourcedetection, resourcedetection/cloud, batch]
151+
exporters: [logging, otlp]
152+
153+
logs:
154+
receivers: [otlp, filelog]
155+
processors: [memory_limiter, transform/truncate, resourcedetection, resourcedetection/cloud, batch]
156+
exporters: [logging, otlp]
222157

223158
extensions: [health_check]
224159
```
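
The pipelines in this hunk reference an `otlp` receiver plus `logging` and `otlp` exporters whose definitions fall outside the lines shown in the diff. Here is a minimal sketch of what those sections might look like, reusing the `OTLP_ENDPOINT_HERE` and `YOUR_KEY_HERE` placeholders from the instructions. The receiver protocol settings and the `api-key` header layout are assumptions, not content confirmed by this commit.

```yaml
receivers:
  otlp:                            # assumed: accept OTLP over both transports
    protocols:
      grpc:
      http:

exporters:
  logging:                         # prints telemetry to the collector's own output
  otlp:
    endpoint: OTLP_ENDPOINT_HERE   # placeholder from the instructions
    headers:
      api-key: YOUR_KEY_HERE       # placeholder for the license key
```

Recent collector releases replace the `logging` exporter with `debug`, so the exporter name may need adjusting depending on the version in use.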
@@ -229,11 +164,11 @@ You can view your collector data in a variety of places in the New Relic UI.
229164
230165
### Browse host data in infrastructure UI [#using-ui]
231166
232-
By using the recommended configuration for the host receiver, you can view data through the standard features in the [Infrastructure UI](/docs/infrastructure/infrastructure-ui-pages/infrastructure-ui-entities#access-new-ui) (New Host UI) experience.
167+
By using the recommended configuration in the collector, you can view data through the standard features in the [Infrastructure UI](/docs/infrastructure/infrastructure-ui-pages/infrastructure-ui-entities#access-new-ui) experience.
233168
234-
### Query host metrics [#query-host-metrics]
169+
### Query host metrics and logs [#query-host-metrics]
235170
236-
Once metrics are successfully ingested in New Relic, they are available in [metrics and events](/docs/query-your-data/explore-query-data/browse-data/introduction-data-explorer) and [query builder](/docs/query-your-data/explore-query-data/query-builder/introduction-query-builder).
171+
Once telemetry is successfully ingested in New Relic, it is available in [metrics and events](/docs/query-your-data/explore-query-data/browse-data/introduction-data-explorer) and [query builder](/docs/query-your-data/explore-query-data/query-builder/introduction-query-builder).
237172
238173
The following NRQL queries show examples to help you explore the metrics you received:
239174
@@ -255,6 +190,11 @@ The following NRQL queries show examples to help you explore the metrics you rec
255190
SELECT keyset() FROM Metric WHERE metricName = 'system.disk.operations'
256191
```
257192

193+
* Querying number of log events per host
194+
```sql
195+
SELECT count(*) FROM Log FACET host.name TIMESERIES
196+
```
197+
258198
Learn more about [querying the metric data type](/docs/data-apis/understand-data/metric-data/query-metric-data-type).
259199

260200
## What's next? [#next]
