
mTLS issue with OTLP #12396

Closed
vipinvkmenon opened this issue Feb 15, 2025 · 3 comments
Labels
bug Something isn't working

Component(s)

receiver/otlp

What happened?

Describe the bug
I am trying to connect two OTEL collectors via mTLS. However, the otlp exporter reports the following error:

2025-02-15T16:06:24.066Z        info    internal/retry_sender.go:126    Exporting failed. Will retry the request after interval.     {"kind": "exporter", "data_type": "metrics", "name": "otlp", "error": "rpc error: code = Unavailable desc = connection error: desc = \"transport: authentication handshake failed: tls: failed to verify certificate: x509: certificate is not valid for any names, but wanted to match abc.xyz.com\"", "interval": "30.981901146s"}

There are no logs on the receiver side, so I am not sure what the issue is there.
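
A possible first diagnostic, sketched under the assumption that `openssl` is available on the exporter host: as far as I know, Go's TLS stack prints "not valid for any names" when the certificate it received carries no DNS subjectAltName entries at all, so it can help to look at the SANs of the certificate that is actually presented. The helper name `cert_sans` is hypothetical; the host, port, and server name are the ones from the exporter configuration in this report.

```shell
#!/bin/sh
# cert_sans (hypothetical helper): read one PEM certificate on stdin and print
# its subjectAltName value, or a marker if the extension is absent.
cert_sans() {
  _sans=$(openssl x509 -noout -text 2>/dev/null \
    | grep -A1 'Subject Alternative Name' \
    | tail -n 1 \
    | sed 's/^[[:space:]]*//')
  echo "${_sans:-<no subjectAltName>}"
}

# Inspect the locally generated server certificate, if it is in the current directory.
if [ -f server.crt ]; then
  cert_sans < server.crt
fi

# Inspect the certificate presented on the wire (needs network access, so it is
# left commented out):
# openssl s_client -connect External-Loadbalancer-host:4317 \
#   -servername abc.xyz.com </dev/null 2>/dev/null | cert_sans
```

If the local file lists `DNS:abc.xyz.com` but the certificate on the wire does not, the exporter is probably not talking to the collector's TLS endpoint directly.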

Steps to reproduce
The following are the configurations in place:

I) Istio is enabled; however, I believe its mTLS is disabled by the following configuration:

apiVersion: "security.istio.io/v1beta1"
kind: "PeerAuthentication"
metadata:
  name: "ba-ot-peer-authentication"
  namespace: tls-ot
spec:
  mtls:
    mode: DISABLE

II) OTEL Collector (receiver) within the Kubernetes cluster:

# Receiver OTEL Collector
apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: basic-auth-otel-collector
  namespace: tls-ot
  annotations:
    sidecar.istio.io/logLevel: "debug"
spec:
  mode: deployment
  volumeMounts:
  - name: server-certificates-volume
    mountPath: /etc/pki/ca-trust/source/server-ca
    readOnly: true
  - name: client-certificates-volume
    mountPath: /etc/pki/ca-trust/source/client-ca
    readOnly: true
  volumes:
  - name: server-certificates-volume
    secret:
      secretName: otelserver-creds-new
  - name: client-certificates-volume
    secret:
      secretName: otelclient-creds-new
  image: otel/opentelemetry-collector-contrib:0.119.0
  config: |
    receivers:
      otlp:
        protocols:
          grpc:
            tls:
              cert_file: /etc/pki/ca-trust/source/server-ca/tls.crt
              key_file: /etc/pki/ca-trust/source/server-ca/tls.key
              client_ca_file: /etc/pki/ca-trust/source/client-ca/tls.crt
    exporters:
      debug:
        verbosity: detailed

    service:
      pipelines:
        metrics:
          receivers: [otlp]
          exporters: [debug]
      telemetry:
        metrics:
          level: detailed
        logs:
          level: DEBUG
          output_paths: ["stdout"]
---
apiVersion: v1
kind: Service
metadata:
  name: basic-auth-otel-collector-grpc
  namespace: tls-ot 
spec:
  selector:
    app.kubernetes.io/component: opentelemetry-collector
    app.kubernetes.io/instance: tls-ot.basic-auth-otel-collector
    app.kubernetes.io/name: basic-auth-otel-collector-collector
  type: ClusterIP
  ports:
  - name: grpc # important!
    protocol: TCP
    port: 4317

A note on the certificates created:

  1. Create the root CA certificate:
openssl req -x509 -sha256 -nodes -days 365 -newkey rsa:2048 -subj '/O=PMT/CN=root.pmt.com' -keyout root.key -out root.crt
  2. Create the signed server certificate:
openssl req -out server.csr -newkey rsa:2048 -nodes -keyout server.key -subj "/CN=abc.xyz.com/O=PM" -config san_server.cnf -extensions req_ext

openssl x509 -req -sha256 -days 365 -CA root.crt -CAkey root.key -set_serial 0 -in server.csr -out server.crt -extensions req_ext -extfile san_server.cnf

extfile: san_server.cnf

[req]
default_bits = 4096
prompt = no
default_md = sha256
req_extensions = req_ext
distinguished_name = dn
[ dn ]
CN = abc.xyz.com
[ req_ext ]
subjectAltName = @alt_names
[alt_names]
DNS.1   = abc.xyz.com

  3. Create the signed client certificate:
openssl req -out client.csr -newkey rsa:2048 -nodes -keyout client.key -subj "/CN=client.local/O=client" -extensions req_ext -config san_client.cnf

openssl x509 -req -sha256 -days 365 -CA root.crt -CAkey root.key -set_serial 1 -in client.csr -out client.crt -extensions req_ext -extfile san_client.cnf

extfile: san_client.cnf

[req]
default_bits = 4096
prompt = no
default_md = sha256
req_extensions = req_ext
distinguished_name = dn
[ dn ]
CN = client.local
[ req_ext ]
subjectAltName = @alt_names
[alt_names]
DNS.1   = client.local

Finally, I uploaded the certificates as secrets:

# CA file used was root.crt
kubectl create secret generic otelserver-creds -n tls-ot --from-file=tls.crt=server.crt --from-file=tls.key=server.key --from-file=ca.crt=root.crt

kubectl create secret generic otelclient-creds -n tls-ot --from-file=tls.crt=client.crt --from-file=tls.key=client.key --from-file=ca.crt=root.crt
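
Before uploading, it may be worth confirming that each certificate chains to root.crt and that each key belongs to its certificate. The following is a minimal sketch, assuming the files from the steps above are in the current directory; `check_cert` is a hypothetical helper name, not part of any tool mentioned here.

```shell
#!/bin/sh
# check_cert CA CERT KEY (hypothetical helper): confirm that CERT chains to CA
# and that KEY is the private key belonging to CERT.
check_cert() {
  _ca=$1; _cert=$2; _key=$3
  # Chain check: prints "<cert>: OK" on success and fails otherwise.
  openssl verify -CAfile "$_ca" "$_cert" || return 1
  # Key check: the public key embedded in the certificate must equal the
  # public key derived from the private key.
  _pub_cert=$(openssl x509 -in "$_cert" -noout -pubkey) || return 1
  _pub_key=$(openssl pkey -in "$_key" -pubout) || return 1
  [ "$_pub_cert" = "$_pub_key" ] || {
    echo "$_key does not match $_cert" >&2
    return 1
  }
}

# Run against the generated files, if they are present.
if [ -f root.crt ] && [ -f server.crt ] && [ -f client.crt ]; then
  check_cert root.crt server.crt server.key
  check_cert root.crt client.crt client.key
fi
```

A failure here would point at the certificate material itself rather than at the collector or the gateway.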

Create the Gateway and Route:

Gateway Config:

apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: tls-ot-gateway
  namespace: tls-ot 
spec:
  gatewayClassName: istio
  listeners:
  - allowedRoutes:
      namespaces:
        from: All
    hostname: abc.xyz.com
    name: otel
    port: 4317
    protocol: TLS
    tls:
      mode: Passthrough

Route Config:

apiVersion: gateway.networking.k8s.io/v1alpha2
kind: TLSRoute
metadata:
  name: tls-ot-route
  namespace: tls-ot 
spec:
  hostnames:
    - abc.xyz.com
  parentRefs:
  - group: gateway.networking.k8s.io
    kind: Gateway
    name: tls-ot-gateway
    sectionName: otel
  rules:
    - backendRefs:
      - name: basic-auth-otel-collector-grpc
        weight: 1
        port: 4317
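
With `mode: Passthrough`, the gateway is expected to forward the TLS stream based on SNI without terminating it, so the exporter should be handed the collector's own server.crt. A rough way to check that, assuming `openssl` is available and using hypothetical helper names (`cert_fingerprint`, `same_cert`), is to compare fingerprints:

```shell
#!/bin/sh
# cert_fingerprint (hypothetical helper): read one PEM certificate on stdin and
# print its SHA-256 fingerprint without the "...Fingerprint=" prefix.
cert_fingerprint() {
  openssl x509 -noout -fingerprint -sha256 | cut -d= -f2
}

# same_cert FILE_A FILE_B (hypothetical helper): succeed only if both files
# hold the same, parseable certificate.
same_cert() {
  _a=$(cert_fingerprint < "$1" 2>/dev/null)
  _b=$(cert_fingerprint < "$2" 2>/dev/null)
  [ -n "$_a" ] && [ "$_a" = "$_b" ]
}

# Intended use (needs network access, so it is left commented out): save what
# the gateway serves, then compare it with the collector's server.crt.
# openssl s_client -connect External-Loadbalancer-host:4317 \
#   -servername abc.xyz.com </dev/null 2>/dev/null | openssl x509 > served.crt
# same_cert served.crt server.crt && echo "same certificate" || echo "different certificate served"
```

A different certificate on the wire would suggest that some component in front of the collector is terminating TLS itself.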

I configured a sample external OTEL collector as follows:

# EXPORTER OTEL Collector
receivers:
  # Data sources: metrics
  hostmetrics:
    scrapers:
      cpu:
      disk:
processors:
  # Data sources: traces
  attributes:
    actions:
      - key: environment
        value: production
        action: insert
  # Data sources: traces, metrics, logs
exporters:
  otlp:
    endpoint: "External-Loadbalancer-host:4317"
    tls:
     server_name_override: "abc.xyz.com"
     ca_file: /etc/cert/ca.crt
     cert_file: /etc/cert/client.crt
     key_file: /etc/cert/client.key
     #insecure_skip_verify: true
  debug:
service:
 pipelines:
   metrics:
     receivers: [hostmetrics]
     processors: [attributes]
     exporters: [otlp,debug]

But the setup does not work. As an alternative, I also tried replacing root.crt, using ca_file: /etc/cert/server.crt in the exporter and client_ca_file: /etc/pki/ca-trust/source/client-ca/client.crt in the receiver. However, the exporter keeps throwing the error mentioned above.

P.S. With the following configuration, the setup does of course work:

  • tls configuration removed from the RECEIVER OTEL
  • Certificate settings removed from the EXPORTER OTEL (keeping only server_name_override)
  • An HTTPS Gateway configured with certificateRefs and mode: Terminate
  • PeerAuthentication set to STRICT (it worked with any setting here, by the way)

What did you expect to see?

  • mTLS works, or proper documentation with example for a successful mTLS setup
  • Proper logs from the telemetry service of the RECEIVER OTEL.

What did you see instead?
mTLS doesn't work and the exporter throws the error:

2025-02-15T16:14:17.964Z        info    internal/retry_sender.go:126    Exporting failed. Will retry the request after interval.     {"kind": "exporter", "data_type": "metrics", "name": "otlp", "error": "rpc error: code = Unavailable desc = connection error: desc = \"transport: authentication handshake failed: tls: failed to verify certificate: x509: certificate is not valid for any names, but wanted to match abc.xyz.com\"", "interval": "15.741533215s"}
  • Also, no examples or documentation show a successful mTLS setup.

Collector version

0.118.0

Environment information

Environment

Kubernetes

OpenTelemetry Collector configuration

receivers:
  # Data sources: metrics
  hostmetrics:
    scrapers:
      cpu:
      disk:
processors:
  # Data sources: traces
  attributes:
    actions:
      - key: environment
        value: production
        action: insert
  # Data sources: traces, metrics, logs
exporters:
  otlp:
    endpoint: "External-Loadbalancer-host:4317"
    tls:
     server_name_override: "abc.xyz.com"
     ca_file: /etc/cert/ca.crt
     cert_file: /etc/cert/client.crt
     key_file: /etc/cert/client.key
     #insecure_skip_verify: true
  debug:
service:
 pipelines:
   metrics:
     receivers: [hostmetrics]
     processors: [attributes]
     exporters: [otlp,debug]

Log output

2025-02-15T16:14:17.964Z        info    internal/retry_sender.go:126    Exporting failed. Will retry the request after interval.     {"kind": "exporter", "data_type": "metrics", "name": "otlp", "error": "rpc error: code = Unavailable desc = connection error: desc = \"transport: authentication handshake failed: tls: failed to verify certificate: x509: certificate is not valid for any names, but wanted to match abc.xyz.com\"", "interval": "15.741533215s"}

Additional context

No response

vipinvkmenon added the bug label Feb 15, 2025
vipinvkmenon commented Feb 15, 2025

OK, I also tried running them without Istio:

RECEIVER COLLECTOR

apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: otel-dummy
  namespace: local-mtls
  labels:
    app: otel-collector
spec:
  mode: deployment
  volumeMounts:
  - name: server-certificates-volume
    mountPath: /etc/pki/ca-trust/source/server-ca
    readOnly: true
  volumes:
  - name: server-certificates-volume
    secret:
      secretName: otelserver-creds
  config: |
    receivers:
      otlp:
        protocols:
          grpc:
            tls:
              cert_file: /etc/pki/ca-trust/source/server-ca/tls.crt
              key_file: /etc/pki/ca-trust/source/server-ca/tls.key
              client_ca_file: /etc/pki/ca-trust/source/server-ca/ca.crt
    exporters:
      # NOTE: Prior to v0.86.0 use `logging` instead of `debug`.
      debug:
    service:
      pipelines:
        metrics:
          receivers: [otlp]
          processors: []
          exporters: [debug]
      telemetry:
        metrics:
          level: detailed
        logs:
          level: debug
          output_paths: ["stdout"]
---
apiVersion: v1 # probably doesn't matter as Istio isn't present anyway
kind: Service
metadata:
  name: otel-dummy-grpc-svc
  namespace: local-mtls
spec:
  selector:
    app.kubernetes.io/component: opentelemetry-collector
    app.kubernetes.io/instance: local-mtls.otel-dummy
    app.kubernetes.io/name: otel-dummy
  type: ClusterIP
  ports:
  - name: grpc # important for istio!
    protocol: TCP
    port: 4317

EXPORTER COLLECTOR

apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: sender-otel
  namespace: local-mtls
spec:
  mode: deployment
  volumeMounts:
  - name: server-certificates-volume
    mountPath: /etc/pki/ca-trust/source/server-ca
    readOnly: true
  volumes:
  - name: server-certificates-volume
    secret:
      secretName: otelserver-creds
  config: |
    receivers:
      hostmetrics:
        scrapers:
          cpu:
    processors:
      attributes:
        actions:
          - key: a
            action: INSERT
            value: b
    exporters:
      # NOTE: Prior to v0.86.0 use `logging` instead of `debug`.
      debug:
      otlp/x:
        endpoint: otel-dummy-grpc-svc:4317
        tls:
          cert_file: /etc/pki/ca-trust/source/server-ca/tls.crt
          key_file: /etc/pki/ca-trust/source/server-ca/tls.key
          ca_file: /etc/pki/ca-trust/source/server-ca/ca.crt
    service:
      pipelines:
        metrics:
          receivers: [hostmetrics]
          processors: [attributes]
          exporters: [otlp/x,debug]
        # telemetry:
        #   metrics:
        #     level: detailed
        #   logs:
        #     level: debug
        #     output_paths: ["stdout"]

Logs from the EXPORTER COLLECTOR:

2025-02-15T17:48:06.890Z        info    internal/retry_sender.go:126    Exporting failed. Will retry the request after interval.     {"kind": "exporter", "data_type": "metrics", "name": "otlp/x", "error": "rpc error: code = Unavailable desc = connection error: desc = \"transport: authentication handshake failed: read tcp 100.64.2.29:47090->100.107.133.18:4317: read: connection reset by peer\"", "interval": "18.884191716s"}

Logs from the RECEIVER OTEL COLLECTOR:

I0215 23:21:24.358880   51695 versioner.go:58] exec plugin: invalid apiVersion "client.authentication.k8s.io/v1"
2025-02-15T17:42:48.371Z        info    [email protected]/service.go:164 Setting up own telemetry...
2025-02-15T17:42:48.371Z        warn    [email protected]/service.go:213 service::telemetry::metrics::address is being deprecated in favor of service::telemetry::metrics::readers
2025-02-15T17:42:48.371Z        info    telemetry/metrics.go:70 Serving metrics {"address": "0.0.0.0:8888", "metrics level": "Detailed"}
2025-02-15T17:42:48.372Z        info    builders/builders.go:26 Development component. May change in the future.        {"kind": "exporter", "data_type": "metrics", "name": "debug"}
2025-02-15T17:42:48.372Z        debug   builders/builders.go:24 Stable component.       {"kind": "receiver", "name": "otlp", "data_type": "metrics"}
2025-02-15T17:42:48.373Z        info    [email protected]/service.go:230 Starting otelcol-k8s... {"Version": "0.117.0", "NumCPU": 2}
2025-02-15T17:42:48.373Z        info    extensions/extensions.go:39     Starting extensions...
2025-02-15T17:42:48.374Z        info    [email protected]/server.go:685      [core] [Server #1]Server created        {"grpc_log": true}
2025-02-15T17:42:48.374Z        info    [email protected]/otlp.go:112       Starting GRPC server    {"kind": "receiver", "name": "otlp", "data_type": "metrics", "endpoint": "0.0.0.0:4317"}
2025-02-15T17:42:48.374Z        info    [email protected]/service.go:253 Everything is ready. Begin running and processing data.
2025-02-15T17:42:48.374Z        info    [email protected]/server.go:881      [core] [Server #1 ListenSocket #2]ListenSocket created       {"grpc_log": true}

Guidance on getting this setup to work would be helpful.
Thanks

vipinvkmenon commented

Also, how do I correctly configure the telemetry to emit debug logs?

vipinvkmenon commented

OK, it looks like the issue is related to Istio and not to OTEL, at least for now.
I have written a short post describing a simple mTLS setup, based on my experience:
https://dev.to/vipinvkmenon/setting-up-otel-collectors-for-mtls-4n4o
