Envoy crash using CDS with dynamic configuration from filesystem #34195

Closed
@aojea

Description

I get a crash when using CDS with UDP. It is completely reproducible on Kubernetes CI.

How to reproduce

  1. Create the cds.yaml and lds.yaml files
  2. Start Envoy with the following configuration:
node:
  cluster: cloud-provider-kind
  id: cloud-provider-kind-id

dynamic_resources:
  cds_config:
    resource_api_version: V3
    path: /home/envoy/cds.yaml
  lds_config:
    resource_api_version: V3
    path: /home/envoy/lds.yaml

admin:
  access_log_path: /dev/stdout
  address:
    socket_address:
      address: 0.0.0.0
      port_value: 9901
  3. Modify cds.yaml and lds.yaml twice with the configurations below (changing only the ports)
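
For anyone trying to reproduce the modification step, here is a minimal sketch of one way to rewrite the files. It assumes the update is done by writing a temporary file and renaming it over the target, since Envoy's filesystem-based dynamic configuration is documented to reload on a file move rather than on an in-place edit. The issue does not say how the files were actually updated, and the helper names `replace_atomically` and `with_port` and the `{port}` placeholder are hypothetical.

```python
import os
import tempfile


def replace_atomically(path: str, content: str) -> None:
    """Write content to a temp file next to path, then rename it over path.

    The temp file lives in the same directory so the rename stays on one
    filesystem and shows up as a single move event for the watcher.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-", suffix=".yaml")
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(content)
            tmp.flush()
            os.fsync(tmp.fileno())
        # os.replace overwrites the destination in one step.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def with_port(template: str, port: int) -> str:
    """Fill the hypothetical '{port}' placeholder in a YAML template string."""
    return template.replace("{port}", str(port))
```

To mimic the step above, one would call `replace_atomically("/home/envoy/cds.yaml", with_port(cds_template, port))` twice with different ports, and do the same for lds.yaml, where `cds_template` is the CDS config below with the port replaced by `{port}`.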

Details

kubernetes/kubernetes#124729

Crashdump on CDS

[2024-05-16 07:11:34.464][1][info][upstream] [source/common/upstream/cds_api_helper.cc:32] cds: add 1 cluster(s), remove 0 cluster(s)
[2024-05-16 07:11:34.465][1][info][upstream] [source/common/upstream/cds_api_helper.cc:71] cds: added/updated 1 cluster(s), skipped 0 unmodified cluster(s)
[2024-05-16 07:11:43.483][101][critical][backtrace] [./source/server/backtrace.h:127] Caught Segmentation fault, suspect faulting address 0x18
[2024-05-16 07:11:43.483][101][critical][backtrace] [./source/server/backtrace.h:111] Backtrace (use tools/stack_decode.py to get line numbers):
[2024-05-16 07:11:43.483][101][critical][backtrace] [./source/server/backtrace.h:112] Envoy version: 816188b86a0a52095b116b107f576324082c7c02/1.30.1/Clean/RELEASE/BoringSSL
[2024-05-16 07:11:43.483][101][critical][backtrace] [./source/server/backtrace.h:114] Address mapping: 5585541ca000-558556b72000 /usr/local/bin/envoy
[2024-05-16 07:11:43.483][101][critical][backtrace] [./source/server/backtrace.h:121] #0: [0x7f9a5f4e1520]
[2024-05-16 07:11:43.483][101][critical][backtrace] [./source/server/backtrace.h:121] #1: [0x5585548d4caa]
[2024-05-16 07:11:43.483][101][critical][backtrace] [./source/server/backtrace.h:121] #2: [0x5585548d2bbf]
[2024-05-16 07:11:43.483][101][critical][backtrace] [./source/server/backtrace.h:121] #3: [0x5585561ed36d]
[2024-05-16 07:11:43.484][101][critical][backtrace] [./source/server/backtrace.h:121] #4: [0x5585561f1600]
[2024-05-16 07:11:43.484][101][critical][backtrace] [./source/server/backtrace.h:121] #5: [0x55855651bcc1]
[2024-05-16 07:11:43.484][101][critical][backtrace] [./source/server/backtrace.h:121] #6: [0x55855651998f]
[2024-05-16 07:11:43.484][101][critical][backtrace] [./source/server/backtrace.h:121] #7: [0x55855651ab2f]
[2024-05-16 07:11:43.484][101][critical][backtrace] [./source/server/backtrace.h:121] #8: [0x5585561f0fa6]
[2024-05-16 07:11:43.484][101][critical][backtrace] [./source/server/backtrace.h:121] #9: [0x5585561f0d4e]
[2024-05-16 07:11:43.484][101][critical][backtrace] [./source/server/backtrace.h:121] #10: [0x5585562ea081]
[2024-05-16 07:11:43.484][101][critical][backtrace] [./source/server/backtrace.h:121] #11: [0x5585562eb62d]
[2024-05-16 07:11:43.484][101][critical][backtrace] [./source/server/backtrace.h:121] #12: [0x55855653bd40]
[2024-05-16 07:11:43.484][101][critical][backtrace] [./source/server/backtrace.h:121] #13: [0x55855653a681]
[2024-05-16 07:11:43.484][101][critical][backtrace] [./source/server/backtrace.h:121] #14: [0x558555b79f9f]
[2024-05-16 07:11:43.484][101][critical][backtrace] [./source/server/backtrace.h:121] #15: [0x5585565b7d03]
[2024-05-16 07:11:43.484][101][critical][backtrace] [./source/server/backtrace.h:121] #16: [0x7f9a5f533ac3]
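
The frames above are raw addresses. As a rough aid for reading them, the sketch below pulls the frame addresses out of the log text and converts those that fall inside the "Address mapping" range into offsets from the start of the mapping. This only illustrates the arithmetic and is not a substitute for tools/stack_decode.py; the function name `frame_offsets` is hypothetical, and treating the mapping start as the load base of the binary is an assumption.

```python
import re

# Matches frame lines such as "#1: [0x5585548d4caa]".
FRAME_RE = re.compile(r"#(\d+): \[0x([0-9a-fA-F]+)\]")
# Matches "Address mapping: 5585541ca000-558556b72000 /usr/local/bin/envoy".
MAPPING_RE = re.compile(r"Address mapping: ([0-9a-fA-F]+)-([0-9a-fA-F]+) (\S+)")


def frame_offsets(log_text):
    """Return (frame_number, address, offset) tuples for each backtrace frame.

    offset is the address minus the mapping start when the address lies
    inside the mapped range of the binary, otherwise None.
    """
    mapping = MAPPING_RE.search(log_text)
    if mapping is None:
        raise ValueError("no 'Address mapping' line found")
    start = int(mapping.group(1), 16)
    end = int(mapping.group(2), 16)

    frames = []
    for number, addr_hex in FRAME_RE.findall(log_text):
        addr = int(addr_hex, 16)
        offset = addr - start if start <= addr < end else None
        frames.append((int(number), addr, offset))
    return frames
```

Applied to the dump above, frames #0 and #16 fall outside the /usr/local/bin/envoy mapping, so they get no offset. The remaining frames yield offsets that still need the matching binary to be turned into symbols.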

Config applied
LDS

resources:
- "@type": type.googleapis.com/envoy.config.listener.v3.Listener
  name: listener_IPv4_80_UDP
  address:
    socket_address:
      address: 0.0.0.0
      port_value: 80
      protocol: UDP
  udp_listener_config:
    downstream_socket_config:
      max_rx_datagram_size: 9000
  listener_filters:
  - name: envoy.filters.udp_listener.udp_proxy
    typed_config:
      '@type': type.googleapis.com/envoy.extensions.filters.udp.udp_proxy.v3.UdpProxyConfig
      access_log:
      - name: envoy.file_access_log
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.access_loggers.stream.v3.StdoutAccessLog
      stat_prefix: udp_proxy
      matcher:
        on_no_match:
          action:
            name: route
            typed_config:
              '@type': type.googleapis.com/envoy.extensions.filters.udp.udp_proxy.v3.Route
              cluster: cluster_IPv4_80_UDP
      upstream_socket_config:
        max_rx_datagram_size: 9000

CDS

resources:
- "@type": type.googleapis.com/envoy.config.cluster.v3.Cluster
  name: cluster_IPv4_80_UDP
  connect_timeout: 5s
  type: STATIC
  lb_policy: RANDOM
  health_checks:
  - timeout: 5s
    interval: 3s
    unhealthy_threshold: 2
    healthy_threshold: 1
    no_traffic_interval: 5s
    always_log_health_check_failures: true
    always_log_health_check_success: true
    event_log_path: /dev/stdout
    http_health_check:
      path: /healthz
  load_assignment:
    cluster_name: cluster_IPv4_80_UDP
    endpoints:
      - lb_endpoints:
        - endpoint:
            health_check_config:
              port_value: 10256
            address:
              socket_address:
                address: 192.168.8.4
                port_value: 32557
                protocol: UDP
      - lb_endpoints:
        - endpoint:
            health_check_config:
              port_value: 10256
            address:
              socket_address:
                address: 192.168.8.2
                port_value: 32557
                protocol: UDP

Originally posted by @aojea in #14866 (comment)

I've tried with all the latest stable images and it still crashes.

I found #33824, which seems related, but the crash still happens with a dev image that should contain the fix:

 9dce75e79df0   envoyproxy/envoy:dev-190f9e0cfe16e779f622c16dce8e833600e5fb45   "/docker-entrypoint.…"   About a minute ago   Exited (1) About a minute ago                               kindccm-FIVC6DYE7XD6AIEWUYSGG4EXHENYBGTPO4KYLJLD

Labels: backport/review, no stalebot