fixing the examples in data prepper trace analytics #11183
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -49,8 +49,8 @@ OpenSearch provides a generic sink that writes data to OpenSearch as the destina | |
|
||
The sink provides specific configurations for the trace analytics feature. These configurations allow the sink to use indexes and index templates specific to trace analytics. The following OpenSearch indexes are specific to trace analytics: | ||
|
||
* otel-v1-apm-span –- The *otel-v1-apm-span* index stores the output from the [otel_traces]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/otel-traces/) processor. | ||
* otel-v1-apm-service-map –- The *otel-v1-apm-service-map* index stores the output from the [service_map]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/service-map/) processor. | ||
* otel-v1-apm-span – The *otel-v1-apm-span* index stores the output from the [otel_traces]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/otel-traces/) processor. | ||
* otel-v1-apm-service-map – The *otel-v1-apm-service-map* index stores the output from the [service_map]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/service-map/) processor. | ||
|
||
## Trace tuning | ||
|
||
|
@@ -114,9 +114,9 @@ The following sections provide examples of different types of pipelines and how | |
|
||
The following example demonstrates how to build a pipeline that supports the [OpenSearch Dashboards Observability plugin]({{site.url}}{{site.baseurl}}/observability-plugin/trace/ta-dashboards/). This pipeline takes data from the OpenTelemetry Collector and uses two other pipelines as sinks. These two separate pipelines serve two different purposes and write to different OpenSearch indexes. The first pipeline prepares trace data for OpenSearch and enriches and ingests the span documents into a span index within OpenSearch. The second pipeline aggregates traces into a service map and writes service map documents into a service map index within OpenSearch. | ||
|
||
Starting with Data Prepper version 2.0, Data Prepper no longer supports the `otel_traces_prepper` processor. The `otel_traces` processor replaces the `otel_traces_prepper` processor and supports some of Data Prepper's recent data model changes. Instead, you should use the `otel_traces` processor. See the following YAML file example: | ||
Starting with Data Prepper version 2.0, Data Prepper no longer supports the `otel_traces_prepper` processor. The `otel_traces` processor replaces the `otel_traces_prepper` processor and supports some of Data Prepper's recent data model changes. See the following YAML file example: | ||
|
||
```yml | ||
```yaml | ||
entry-pipeline: | ||
delay: "100" | ||
source: | ||
|
@@ -196,55 +196,54 @@ otel-trace-pipeline: | |
# workers is the number of threads processing data in each pipeline. | ||
# We recommend same value for all pipelines. | ||
# default value is 1, set a value based on the machine you are running Data Prepper | ||
workers: 8 | ||
workers: 8 | ||
# delay in milliseconds is how often the worker threads should process data. | ||
# Recommend not to change this config as we want the entry-pipeline to process as quick as possible | ||
# default value is 3_000 ms | ||
delay: "100" | ||
delay: "100" | ||
source: | ||
otel_trace_source: | ||
#record_type: event # Add this when using Data Prepper 1.x. This option is removed in 2.0 | ||
ssl: false # Change this to enable encryption in transit | ||
ssl: false | ||
authentication: | ||
unauthenticated: | ||
buffer: | ||
bounded_blocking: | ||
# buffer_size is the number of ExportTraceRequest from otel-collector the data prepper should hold in memory. | ||
# We recommend to keep the same buffer_size for all pipelines. | ||
# Make sure you configure sufficient heap | ||
# default value is 512 | ||
buffer_size: 512 | ||
# This is the maximum number of request each worker thread will process within the delay. | ||
# Default is 8. | ||
# Make sure buffer_size >= workers * batch_size | ||
batch_size: 8 | ||
# buffer_size is the number of ExportTraceRequest from otel-collector the data prepper should hold in memory. | ||
# We recommend to keep the same buffer_size for all pipelines. | ||
# Make sure you configure sufficient heap | ||
# default value is 512 | ||
buffer_size: 512 | ||
|
||
# This is the maximum number of request each worker thread will process within the delay. | ||
# Default is 8. | ||
# Make sure buffer_size >= workers * batch_size | ||
batch_size: 8 | ||
sink: | ||
- pipeline: | ||
name: "raw-trace-pipeline" | ||
- pipeline: | ||
name: "entry-pipeline" | ||
- pipeline: { name: "raw-trace-pipeline" } | ||
|
||
- pipeline: { name: "service-map-pipeline" } | ||
|
||
raw-trace-pipeline: | ||
# Configure same as the otel-trace-pipeline | ||
workers: 8 | ||
workers: 8 | ||
# We recommend using the default value for the raw-trace-pipeline. | ||
delay: "3000" | ||
delay: "3000" | ||
source: | ||
pipeline: | ||
name: "entry-pipeline" | ||
pipeline: { name: "otel-trace-pipeline" } | ||
buffer: | ||
bounded_blocking: | ||
# Configure the same value as in entry-pipeline | ||
# Make sure you configure sufficient heap | ||
# The default value is 512 | ||
buffer_size: 512 | ||
# The raw processor does bulk request to your OpenSearch sink, so configure the batch_size higher. | ||
# If you use the recommended otel-collector setup each ExportTraceRequest could contain max 50 spans. https://github.com/opensearch-project/data-prepper/tree/v0.7.x/deployment/aws | ||
# With 64 as batch size each worker thread could process up to 3200 spans (64 * 50) | ||
batch_size: 64 | ||
bounded_blocking: | ||
# Configure the same value as in entry-pipeline | ||
# Make sure you configure sufficient heap | ||
# The default value is 512 | ||
buffer_size: 512 | ||
# The raw processor does bulk request to your OpenSearch sink, so configure the batch_size higher. | ||
# If you use the recommended otel-collector setup each ExportTraceRequest could contain max 50 spans. https://github.com/opensearch-project/data-prepper/tree/v0.7.x/deployment/aws | ||
# With 64 as batch size each worker thread could process up to 3200 spans (64 * 50) | ||
batch_size: 64 | ||
processor: | ||
- otel_traces: | ||
- otel_traces: {} | ||
|
||
# Optional: only if you want the group-filler stage. | ||
- otel_traces_group: | ||
hosts: [ "https://localhost:9200" ] | ||
hosts: [ "https://opensearch:9200" ] | ||
# Change to your credentials | ||
username: "admin" | ||
password: "admin" | ||
|
@@ -257,7 +256,7 @@ raw-trace-pipeline: | |
#aws_region: us-east-1 | ||
sink: | ||
- opensearch: | ||
hosts: [ "https://localhost:9200" ] | ||
hosts: [ "https://opensearch:9200" ] | ||
index_type: trace-analytics-raw | ||
# Change to your credentials | ||
username: "admin" | ||
|
@@ -269,32 +268,32 @@ raw-trace-pipeline: | |
# username and password above. | ||
#aws_sigv4: true | ||
#aws_region: us-east-1 | ||
|
||
service-map-pipeline: | ||
workers: 8 | ||
delay: "100" | ||
source: | ||
pipeline: | ||
name: "entry-pipeline" | ||
pipeline: { name: "otel-trace-pipeline" } | ||
processor: | ||
- service_map: | ||
# The window duration is the maximum length of time the data prepper stores the most recent trace data to evaluate service-map relationships. | ||
# The default is 3 minutes, this means we can detect relationships between services from spans reported in last 3 minutes. | ||
# Set higher value if your applications have higher latency. | ||
window_duration: 180 | ||
buffer: | ||
bounded_blocking: | ||
# buffer_size is the number of ExportTraceRequest from otel-collector the data prepper should hold in memory. | ||
# We recommend to keep the same buffer_size for all pipelines. | ||
# Make sure you configure sufficient heap | ||
# default value is 512 | ||
buffer_size: 512 | ||
# This is the maximum number of request each worker thread will process within the delay. | ||
# Default is 8. | ||
# Make sure buffer_size >= workers * batch_size | ||
batch_size: 8 | ||
bounded_blocking: | ||
# buffer_size is the number of ExportTraceRequest from otel-collector the data prepper should hold in memory. | ||
# We recommend to keep the same buffer_size for all pipelines. | ||
# Make sure you configure sufficient heap | ||
# default value is 512 | ||
buffer_size: 512 | ||
# This is the maximum number of request each worker thread will process within the delay. | ||
# Default is 8. | ||
# Make sure buffer_size >= workers * batch_size | ||
batch_size: 8 | ||
sink: | ||
- opensearch: | ||
hosts: [ "https://localhost:9200" ] | ||
hosts: [ "https://opensearch:9200" ] | ||
index_type: trace-analytics-service-map | ||
# Change to your credentials | ||
username: "admin" | ||
|
@@ -306,6 +305,7 @@ service-map-pipeline: | |
# username and password above. | ||
#aws_sigv4: true | ||
#aws_region: us-east-1 | ||
|
||
``` | ||
|
||
You need to modify the preceding configuration for your OpenSearch cluster so that the configuration matches your environment. Note that it has two `opensearch` sinks that need to be modified. | ||
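The comments in the preceding configuration ask that `buffer_size >= workers * batch_size` hold for each pipeline. The following Python sketch shows one way to check that rule before deploying. It assumes the pipeline settings have already been loaded into plain dictionaries; the `check_buffers` helper and the dictionary layout are hypothetical and are not part of Data Prepper. The fallback values are the defaults quoted in the configuration comments.

```python
def check_buffers(pipelines):
    """Return the names of pipelines where buffer_size < workers * batch_size."""
    undersized = []
    for name, cfg in pipelines.items():
        # Fallbacks are the defaults quoted in the configuration comments.
        workers = cfg.get("workers", 1)
        buf = cfg.get("buffer", {}).get("bounded_blocking", {})
        buffer_size = buf.get("buffer_size", 512)
        batch_size = buf.get("batch_size", 8)
        if buffer_size < workers * batch_size:
            undersized.append(name)
    return undersized


# Values copied from the example configuration above.
pipelines = {
    "otel-trace-pipeline": {
        "workers": 8,
        "buffer": {"bounded_blocking": {"buffer_size": 512, "batch_size": 8}},
    },
    "raw-trace-pipeline": {
        "workers": 8,
        "buffer": {"bounded_blocking": {"buffer_size": 512, "batch_size": 64}},
    },
    "service-map-pipeline": {
        "workers": 8,
        "buffer": {"bounded_blocking": {"buffer_size": 512, "batch_size": 8}},
    },
}

print(check_buffers(pipelines))  # prints []
```

With 8 workers and a `batch_size` of 64, the `raw-trace-pipeline` sits exactly at the limit (8 * 64 = 512), so raising either value would also require a larger `buffer_size`.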
|
@@ -324,9 +324,70 @@ For other configurations available for OpenSearch sinks, see [Data Prepper OpenS | |
|
||
You need to run OpenTelemetry Collector in your service environment. Follow [Getting Started](https://opentelemetry.io/docs/collector/getting-started/#getting-started) to install an OpenTelemetry collector. Ensure that you configure the collector with an exporter configured for your Data Prepper instance. The following example `otel-collector-config.yaml` file receives data from various instrumentations and exports it to Data Prepper. | ||
|
||
### Example otel-collector-config.yaml file | ||
### Example setup using Docker Compose | ||
|
||
The following is an example configuration for OpenSearch, OpenSearch Dashboards, Data Prepper, and OpenTelemetry Collector using Docker containers. | ||
|
||
`docker-compose.yaml` | ||
|
||
```yaml | ||
version: "3.8" | ||
|
||
networks: | ||
opensearch-net: | ||
|
||
services: | ||
opensearch: | ||
image: opensearchproject/opensearch:3.2.0 | ||
environment: | ||
- discovery.type=single-node | ||
- OPENSEARCH_INITIAL_ADMIN_PASSWORD=<strong_password> | ||
- bootstrap.memory_lock=true | ||
- "OPENSEARCH_JAVA_OPTS=-Xms1g -Xmx1g" | ||
ulimits: | ||
memlock: { soft: -1, hard: -1 } | ||
nofile: { soft: 65536, hard: 65536 } | ||
ports: | ||
- "9200:9200" | ||
- "9600:9600" | ||
networks: [opensearch-net] | ||
|
||
dashboards: | ||
image: opensearchproject/opensearch-dashboards:3.2.0 | ||
environment: | ||
OPENSEARCH_HOSTS: '["https://opensearch:9200"]' | ||
OPENSEARCH_USERNAME: admin | ||
OPENSEARCH_PASSWORD: "<strong_password>" | ||
ports: | ||
- "5601:5601" | ||
depends_on: [opensearch] | ||
networks: [opensearch-net] | ||
|
||
data-prepper: | ||
image: opensearchproject/data-prepper:latest | ||
command: ["/usr/share/data-prepper/bin/data-prepper"] | ||
volumes: | ||
- ./pipelines:/usr/share/data-prepper/pipelines:ro | ||
- ./config:/usr/share/data-prepper/config:ro | ||
ports: | ||
- "4900:4900" # Data Prepper control API (HTTP) | ||
- "21890:21890" # OTLP (gRPC) for traces | ||
depends_on: [opensearch] | ||
networks: [opensearch-net] | ||
|
||
otel-collector: | ||
image: otel/opentelemetry-collector:latest | ||
command: ["--config=/etc/otelcol/otel-collector.yaml"] | ||
volumes: | ||
- ./otel-collector.yaml:/etc/otelcol/otel-collector.yaml:ro | ||
depends_on: [data-prepper] | ||
networks: [opensearch-net] | ||
ports: | ||
- "4317:4317" # OTLP gRPC | ||
- "4318:4318" # OTLP HTTP (optional) | ||
``` | ||
|
||
The following is an example `otel-collector-config.yaml` file: | ||
`otel-collector-config.yaml`: | ||
|
||
``` | ||
receivers: | ||
|
@@ -357,6 +418,119 @@ service: | |
exporters: [otlp/data-prepper] | ||
``` | ||
|
||
`pipelines/pipelines.yaml`: | ||
|
||
```yaml | ||
entry-pipeline: | ||
|
||
source: | ||
otel_trace_source: | ||
port: 21890 | ||
ssl: false | ||
authentication: | ||
unauthenticated: | ||
buffer: | ||
bounded_blocking: | ||
buffer_size: 10240 | ||
batch_size: 160 | ||
sink: | ||
- pipeline: { name: "raw-trace-pipeline" } | ||
|
||
- pipeline: { name: "service-map-pipeline" } | ||
|
||
raw-trace-pipeline: | ||
source: | ||
pipeline: { name: "entry-pipeline" } | ||
processor: | ||
- otel_traces: {} | ||
|
||
sink: | ||
- opensearch: | ||
hosts: ["https://opensearch:9200"] | ||
insecure: true | ||
username: admin | ||
password: <strong_password> | ||
index_type: trace-analytics-raw | ||
|
||
service-map-pipeline: | ||
source: | ||
pipeline: { name: "entry-pipeline" } | ||
|
||
processor: | ||
- service_map: {} | ||
sink: | ||
- opensearch: | ||
hosts: ["https://opensearch:9200"] | ||
insecure: true | ||
username: admin | ||
password: <strong_password> | ||
index_type: trace-analytics-service-map | ||
``` | ||
|
||
`config/data-prepper-config.yaml`: | ||
|
||
```yaml | ||
# Disable TLS on the Data Prepper REST API (local only) | ||
ssl: false | ||
serverPort: 4900 | ||
|
||
peer_forwarder: | ||
ssl: false | ||
discovery_mode: local_node | ||
``` | ||
|
||
`otel-collector.yaml`: | ||
|
||
```yaml | ||
receivers: | ||
otlp: | ||
protocols: | ||
grpc: | ||
endpoint: 0.0.0.0:4317 | ||
http: | ||
endpoint: 0.0.0.0:4318 | ||
|
||
exporters: | ||
otlp: | ||
endpoint: data-prepper:21890 | ||
tls: | ||
insecure: true # plaintext to Data Prepper (matches ssl: false on the otel_trace_source) | ||
|
||
|
||
# optional, useful for debugging | ||
debug: | ||
verbosity: basic | ||
|
||
processors: | ||
batch: {} | ||
|
||
extensions: | ||
health_check: {} # exposes :13133 | ||
|
||
service: | ||
extensions: [health_check] | ||
telemetry: | ||
logs: | ||
level: debug | ||
pipelines: | ||
traces: | ||
receivers: [otlp] | ||
processors: [batch] | ||
exporters: [otlp, debug] | ||
``` | ||
|
||
Start all the containers using the `docker-compose up` command. | ||
|
||
You can now use the following command to spin up `telemetrygen` and generate synthetic OpenTelemetry traces for 30 seconds (~50 spans/sec) and send them to `otel-collector:4317` over plaintext gRPC: | ||
|
||
```bash | ||
docker run --rm --network data-prepper_opensearch-net \ | ||
ghcr.io/open-telemetry/opentelemetry-collector-contrib/telemetrygen:latest \ | ||
traces \ | ||
--otlp-endpoint=otel-collector:4317 \ | ||
--otlp-insecure \ | ||
--duration=30s \ | ||
--rate=50 | ||
``` | ||
|
||
This pushes sample telemetry to the `otel-v1-apm-span` alias and stores the documents in the `otel-v1-apm-span-000001` index. | ||
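To confirm that span documents arrived, you can query the document count for the alias. The following Python sketch uses only the standard library and the OpenSearch `_count` endpoint. It assumes the demo setup above: basic authentication with the admin user and a self-signed certificate, which is why certificate verification is turned off. The `count_url` and `fetch_count` helpers are hypothetical names used for illustration, and the approach is suitable for local testing only.

```python
import base64
import json
import ssl
import urllib.request


def count_url(host, index):
    """Build the _count URL for an index or alias."""
    return f"{host.rstrip('/')}/{index}/_count"


def fetch_count(host, index, username, password):
    """Return the number of documents in the index or alias."""
    # Assumption: the cluster uses a self-signed demo certificate,
    # so verification is skipped. Do not do this outside local testing.
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE

    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    request = urllib.request.Request(
        count_url(host, index),
        headers={"Authorization": f"Basic {token}"},
    )
    with urllib.request.urlopen(request, context=context) as response:
        return json.load(response)["count"]
```

For example, calling `fetch_count("https://localhost:9200", "otel-v1-apm-span", "admin", "<strong_password>")` from the host machine should return a nonzero count once the `telemetrygen` run has finished and Data Prepper has flushed its buffers.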
|
||
After you run OpenTelemetry in your service environment, you must configure your application to use the OpenTelemetry Collector. The OpenTelemetry Collector typically runs alongside your application. | ||
|
||
## Next steps and more information | ||
|
Also, the `otel_traces` replaces the `otel_trace_raw` which customers have used before. The `otel_trace_raw` is still allowed in Data Prepper 2.0. We will remove it in 3.0.