@Watson1978 Watson1978 commented Nov 4, 2025

Backport #5104

Which issue(s) this PR fixes:
Fixes #4396

What this PR does / why we need it:
Adds a timeout to the establish_connection method so that it cannot loop forever when the handshake protocol gets stuck. In unstable network environments with proxy components, if the connection drops during the handshake after TLS has been established, Fluentd gets stuck in an infinite loop and logs stop being flushed. This fix uses the existing hard_timeout configuration to break out of the loop, disable problematic nodes, and keep logs flowing through healthy nodes.
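
For readers who want the idea in isolation, here is a minimal Ruby sketch of a deadline-bounded handshake loop. It is not the code from this PR: `HandshakeTimeoutError`, `establish_with_deadline`, `first_healthy_node`, the hash-based node representation, and the `:pending` / `:established` states are all hypothetical names chosen for illustration. Only the general technique (compare a monotonic clock against a deadline derived from `hard_timeout`, then disable the node that timed out) follows the description above.

```ruby
# Hypothetical sketch; these names are NOT Fluentd's actual API.
class HandshakeTimeoutError < StandardError; end

# Calls the given block repeatedly until it returns :established.
# Raises HandshakeTimeoutError once `hard_timeout` seconds have elapsed,
# so a stuck handshake can no longer spin forever.
def establish_with_deadline(hard_timeout:, poll_interval: 0.01)
  # A monotonic clock is not affected by wall-clock adjustments.
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + hard_timeout
  loop do
    state = yield
    return state if state == :established

    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      raise HandshakeTimeoutError, "handshake did not complete within #{hard_timeout}s"
    end
    sleep poll_interval
  end
end

# Tries each enabled node in order. A node whose handshake times out is
# marked disabled and the next node is tried. Returns nil if none succeed.
def first_healthy_node(nodes, hard_timeout:)
  nodes.each do |node|
    next if node[:disabled]

    begin
      establish_with_deadline(hard_timeout: hard_timeout) { node[:handshake].call }
      return node
    rescue HandshakeTimeoutError
      node[:disabled] = true
    end
  end
  nil
end

# Usage: one node never finishes its handshake, the other succeeds at once.
stuck_node   = { name: "stuck",   handshake: -> { :pending },     disabled: false }
healthy_node = { name: "healthy", handshake: -> { :established }, disabled: false }
selected = first_healthy_node([stuck_node, healthy_node], hard_timeout: 0.05)
puts "selected=#{selected[:name]} stuck_disabled=#{stuck_node[:disabled]}"
```

The sketch uses a short timeout only so that it runs quickly; in a real deployment the value would come from the plugin configuration.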

Docs Changes:
None required - uses existing hard_timeout configuration parameter.

Release Note:
Fix infinite loop in out_forward handshake protocol that could cause logs to stop being flushed in unstable network environments.

@Watson1978 Watson1978 force-pushed the backport-to-1.19/pr5104 branch from d109ffa to 8ec8a2d Compare November 4, 2025 07:22
@Watson1978 Watson1978 requested a review from daipom November 4, 2025 07:28
@daipom daipom added this to the v1.19.1 milestone Nov 4, 2025
@Watson1978 Watson1978 force-pushed the backport-to-1.19/pr5104 branch from 8ec8a2d to 13c9e52 Compare November 4, 2025 09:56
…op (#5104)

Signed-off-by: Ian Driver <[email protected]>
Co-authored-by: Ian Driver <[email protected]>
Signed-off-by: Shizuo Fujita <[email protected]>
@Watson1978 Watson1978 force-pushed the backport-to-1.19/pr5104 branch from 13c9e52 to ffff599 Compare November 6, 2025 03:01
Watson1978 commented Nov 6, 2025

Hmm, we need to investigate the CI failure in https://github.com/fluent/fluentd/actions/runs/19123383854/job/54648369661

1) Error: test: Node with security is thread-safe on multi threads(ForwardOutputTest): TypeError: wrong argument type nil (expected Data)
C:/hostedtoolcache/windows/Ruby/3.2.9/x64/lib/ruby/gems/3.2.0/gems/cool.io-1.9.0/lib/cool.io/loop.rb:88:in `run_once'
C:/hostedtoolcache/windows/Ruby/3.2.9/x64/lib/ruby/gems/3.2.0/gems/cool.io-1.9.0/lib/cool.io/loop.rb:88:in `run'
D:/a/fluentd/fluentd/lib/fluent/plugin_helper/event_loop.rb:93:in `block in start'
D:/a/fluentd/fluentd/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'

So, after discussion, we agreed that this backport will be released in 1.19.2 or later.

@Watson1978 Watson1978 marked this pull request as draft November 6, 2025 06:21
@daipom daipom modified the milestones: v1.19.1, v1.19.2 Nov 6, 2025