Commit 8ec8a2d

Ian Driver authored and committed
Fix #4396: Add timeout to establish_connection to prevent infinite loop (#5104)
**Which issue(s) this PR fixes**: Fixes #4396

**What this PR does / why we need it**: Adds a timeout mechanism to the `establish_connection` method to prevent an infinite loop when the handshake protocol gets stuck. In unstable network environments with proxy components, if the connection drops during the handshake after TLS establishment, Fluentd gets stuck in an infinite loop and logs stop being flushed. This fix uses the existing `hard_timeout` configuration to break the loop, disable the problematic node, and maintain log flow through healthy nodes.

**Docs Changes**: None required; this uses the existing `hard_timeout` configuration parameter.

**Release Note**: Fix an infinite loop in the out_forward handshake protocol that could cause logs to stop being flushed in unstable network environments.

Signed-off-by: Ian Driver <[email protected]>
Co-authored-by: Ian Driver <[email protected]>
Signed-off-by: Shizuo Fujita <[email protected]>
1 parent 9099c9f commit 8ec8a2d

2 files changed, +33 -0 lines changed

lib/fluent/plugin/out_forward.rb

Lines changed: 10 additions & 0 deletions

@@ -620,7 +620,17 @@ def verify_connection
       end

       def establish_connection(sock, ri)
+        start_time = Fluent::Clock.now
+        timeout = @sender.hard_timeout
+
         while ri.state != :established
+          # Check for timeout to prevent infinite loop
+          if Fluent::Clock.now - start_time > timeout
+            @log.warn "handshake timeout after #{timeout}s", host: @host, port: @port
+            disable!
+            break
+          end
+
           begin
             # TODO: On Ruby 2.2 or earlier, read_nonblock doesn't work expectedly.
             # We need rewrite around here using new socket/server plugin helper.
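
The timeout check added to `establish_connection` is an instance of a deadline-bounded polling loop. The following is a minimal standalone sketch of that pattern, not the Fluentd implementation: `StubSocket` and `poll_with_deadline` are hypothetical names made up for illustration, `Process.clock_gettime(Process::CLOCK_MONOTONIC)` stands in for `Fluent::Clock.now`, and the assumption that an empty read leads to a short sleep and a retry is inferred from the test in this commit rather than shown in the hunk.

```ruby
# Hypothetical stand-in for a peer that stays silent mid-handshake:
# every read succeeds but yields no data.
class StubSocket
  attr_reader :reads

  def initialize
    @reads = 0
  end

  def read_nonblock(_maxlen)
    @reads += 1
    ''
  end
end

# Hypothetical helper (not part of Fluentd). Polls the socket until the
# block returns a truthy value for some received data, or until `timeout`
# seconds have elapsed. Returns true on completion, false on timeout.
def poll_with_deadline(sock, timeout:, interval: 0.01)
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  loop do
    # Check the deadline at the top of every iteration, so a peer that
    # never sends anything cannot keep the loop alive forever.
    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
    return false if elapsed > timeout

    buf = sock.read_nonblock(512)
    if buf.empty?
      sleep interval
      next
    end
    return true if yield(buf)
  end
end

sock = StubSocket.new
finished = poll_with_deadline(sock, timeout: 0.2) { |buf| buf.include?('PONG') }
puts "finished=#{finished} reads=#{sock.reads}"
```

Using a monotonic clock for the elapsed-time measurement keeps the deadline unaffected by wall-clock adjustments. On timeout the caller decides what to do next; in this commit that step is logging a warning and calling `disable!` on the node.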

test/plugin/test_out_forward.rb

Lines changed: 23 additions & 0 deletions

@@ -1406,4 +1406,27 @@ def plugin_id_for_test?
     assert_equal 0, @d.instance.healthy_nodes_count
     assert_equal 0, @d.instance.registered_nodes_count
   end
+
+  test 'establish_connection_timeout' do
+    @d = d = create_driver(%[
+      hard_timeout 1
+      <server>
+        host #{TARGET_HOST}
+        port #{@target_port}
+      </server>
+    ])
+
+    node = d.instance.nodes.first
+    mock_sock = flexmock('socket')
+    mock_sock.should_receive(:read_nonblock).with(512).and_return('').at_least.once
+
+    ri = Fluent::Plugin::ForwardOutput::ConnectionManager::RequestInfo.new(:helo)
+
+    assert_true node.available?
+    node.establish_connection(mock_sock, ri)
+    assert_false node.available?
+
+    logs = d.logs
+    assert{ logs.any?{|log| log.include?('handshake timeout after 1.0s') } }
+  end
 end
