[BUG] Go 1.23: Additional key exchange mechanism X25519Kyber768Draft00 causes AWS Network Firewall to drop packets #34323

Open
christiangjengedal opened this issue Feb 21, 2025 · 0 comments

The issue with Go 1.23 is described in detail in hashicorp/terraform-provider-aws#39311. This report adapts that description to datadog-agent:

datadog-agent 7.62.0 was upgraded to Go 1.23.0, which introduced a minor change to the crypto/tls standard library package:

The experimental post-quantum key exchange mechanism X25519Kyber768Draft00 is now enabled by default when Config.CurvePreferences is nil. The default can be reverted by adding tlskyber=0 to the GODEBUG environment variable.

This additional key exchange mechanism causes the length of the TLS ClientHello message to increase. The increased message length leads to AWS Network Firewall dropping the message.

AWS Network Firewall drops the message (causing the TLS handshake to time out) because its stateful rule capability currently uses Suricata version 6.0.9, and this version of Suricata is known to drop TLS packets beyond a certain length.

Test 1 using public.ecr.aws/datadog/agent:7.63.0

datadog-agent logs

2025-02-21 10:46:38 UTC | PROCESS | ERROR | (comp/forwarder/defaultforwarder/transaction/transaction.go:116 in 4) | TLS Handshake failure: net/http: TLS handshake timeout
2025-02-21 10:46:40 UTC | CORE | ERROR | (comp/forwarder/defaultforwarder/transaction/transaction.go:116 in 4) | TLS Handshake failure: net/http: TLS handshake timeout
2025-02-21 10:46:40 UTC | CORE | ERROR | (pkg/config/remote/service/service.go:593 in pollOrgStatus) | [Remote Config] Could not refresh Remote Config: failed to issue org data request: Get "https://config.datadoghq.eu/api/v0.1/status": net/http: TLS handshake timeout
2025-02-21 10:52:30 UTC | CORE | ERROR | (comp/forwarder/defaultforwarder/worker.go:222 in process) | Error while processing transaction: error while sending transaction, rescheduling it: Post "https://7-63-0-app.agent.datadoghq.eu/intake/": net/http: TLS handshake timeout

DNS lookups (from tcpdump) are similar to those of previous agent versions, so the AWS Network Firewall domain whitelist is OK:

7-63-0-app.agent.datadoghq.eu
api.datadoghq.eu.
config.datadoghq.eu
instrumentation-telemetry-intake.datadoghq.eu
process.datadoghq.eu
trace.agent.datadoghq.eu.

TLS handshakes (from tcpdump) are dropped by the firewall

12 8.515165 10.5.16.138 → 34.107.178.244 TLSv1 157 Client Hello
[...]
584 528.207660 10.5.16.138 → 34.107.178.244 TLSv1 157 Client Hello

Firewall egress alerts

{"firewall_name":"dev-egress-firewall","availability_zone":"eu-west-1b","event_timestamp":"1740143757","event":{"app_proto":"tls","src_ip":"10.5.16.138","src_port":59340,"event_type":"alert","alert":{"severity":3,"signature_id":6,"rev":0,"signature":"","action":"blocked","category":""},"flow_id":1538496652483912,"dest_ip":"34.107.178.244","proto":"TCP","verdict":{"action":"drop"},"tls":{"version":"UNDETERMINED"},"dest_port":443,"pkt_src":"geneve encapsulation","timestamp":"2025-02-21T13:15:57.099958+0000","direction":"to_server"}}
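When there are many such alert lines, it can help to filter for dropped TLS flows programmatically. The following is a rough sketch that decodes one alert line and checks the app_proto and verdict.action fields seen in the alert above. The struct, the function name isDroppedTLS, and the trimmed sampleAlert constant are illustrative assumptions; the sample keeps only a subset of the fields from the real log line.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// firewallAlert models only the fields used below; the field names are
// taken from the alert log line in this report.
type firewallAlert struct {
	FirewallName string `json:"firewall_name"`
	Event        struct {
		AppProto string `json:"app_proto"`
		SrcIP    string `json:"src_ip"`
		DestIP   string `json:"dest_ip"`
		DestPort int    `json:"dest_port"`
		Verdict  struct {
			Action string `json:"action"`
		} `json:"verdict"`
		TLS struct {
			Version string `json:"version"`
		} `json:"tls"`
	} `json:"event"`
}

// sampleAlert is a trimmed copy of the alert above (subset of fields).
const sampleAlert = `{"firewall_name":"dev-egress-firewall","event":{"app_proto":"tls","src_ip":"10.5.16.138","dest_ip":"34.107.178.244","dest_port":443,"verdict":{"action":"drop"},"tls":{"version":"UNDETERMINED"}}}`

// isDroppedTLS (hypothetical helper) reports whether a single alert line
// describes a TLS flow whose verdict was "drop".
func isDroppedTLS(line string) (bool, error) {
	var a firewallAlert
	if err := json.Unmarshal([]byte(line), &a); err != nil {
		return false, err
	}
	return a.Event.AppProto == "tls" && a.Event.Verdict.Action == "drop", nil
}

func main() {
	dropped, err := isDroppedTLS(sampleAlert)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Println("dropped TLS flow:", dropped)
}
```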

Test 2 using public.ecr.aws/datadog/agent:7.63.0 and environment variable GODEBUG="tlskyber=0"

datadog-agent communication with datadoghq.eu works OK. No firewall drops.
