Harden data races between threaded input workers and the main engine#12007
Harden data races between threaded input workers and the main engine#12007erain wants to merge 4 commits into
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (5)
🚧 Files skipped from review as they are similar to previous changes (3)
📝 WalkthroughWalkthroughAdds relaxed atomic helpers and updates shared scalar reads and writes in engine, metrics, input, and supervisor paths. ChangesShared state atomics
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@include/fluent-bit/flb_atomic.h`:
- Around line 44-53: The fallback in flb_atomic.h currently maps
flb_atomic_load, flb_atomic_store, and flb_atomic_fetch_add to plain accesses,
which is unsafe for shared cross-thread state. Update the `#else` branch to use a
real atomic implementation for non-GCC/Clang compilers, or fail the build with
an explicit error if no atomic backend is available. Keep the fix scoped to the
flb_atomic_* macros so all existing call sites get proper atomic semantics.
In `@src/flb_engine.c`:
- Around line 1138-1139: The grace window update in flb_engine_started setup is
happening too late, so the main thread can observe a stale grace_input value
after FLB_ENGINE_STARTED is published. Move the flb_atomic_store for
config->grace_input to occur before calling flb_engine_started(config), keeping
the update in the startup path around the existing grace_input logic so
src/fluent-bit.c reads the correct supervisor grace window.
In `@src/fluent-bit.c`:
- Around line 1477-1479: The supervisor grace publication is only using an
atomic load in the fixed path, but the later hot-reload paths in the same
function still read ctx->config->grace_input directly and can reintroduce the
race. Update those additional grace publication sites in the function that
handles hot reloads to use flb_atomic_load on ctx->config->grace_input before
calling flb_supervisor_child_update_grace, keeping the access pattern consistent
everywhere in this flow.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: bb82bdf9-9dc4-4eb0-ab1c-9e06c68eaf41
📒 Files selected for processing (5)
include/fluent-bit/flb_atomic.hsrc/flb_engine.csrc/flb_input.csrc/flb_metrics.csrc/fluent-bit.c
flb_metrics_sum() updates a counter from the owning input or output worker thread while the metrics exporter reads the same counter from the main engine thread. With a threaded input this is a data race reported by ThreadSanitizer; it is benign on the supported hardware but undefined under the C memory model. Add a small flb_atomic.h helper with relaxed load/store/fetch_add and use it for the metric value on both the summing and the reading paths. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Yu Yi <yiyu@google.com>
flb_input_collector_fd() runs on the main engine thread but iterated the collectors of every input, including threaded ones. A threaded input initializes its collector descriptors from its own worker thread, so this races with the main thread reading them at startup. Those collectors are registered and dispatched in the input's own thread and event loop, so they are never matched here. Skipping threaded inputs removes the race and avoids needless iteration. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Yu Yi <yiyu@google.com>
config->grace_input is written once by the engine thread during startup and read concurrently by the supervisor on the main thread, which ThreadSanitizer reports as a data race. Store it with a relaxed atomic; the matching atomic read is done at the supervisor entry point. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Yu Yi <yiyu@google.com>
The supervisor entry point reads config->grace_input, which the engine thread publishes during startup. Read it with a relaxed atomic so the cross-thread access is well defined, matching the engine-side store. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Yu Yi <yiyu@google.com>
144901f to
6395b8c
Compare
Summary
Three data races between threaded input worker threads and the main
engine thread, all surfaced by ThreadSanitizer while running a threaded
tailinput (with multiline) alongside the metrics exporter. They areundefined under the C memory model and benign on the supported hardware, but
worth closing:
flb_metrics_sum()updates a counter from the owninginput/output worker thread while the metrics exporter reads the same counter
from the main thread. This one is live whenever metrics are scraped together
with a threaded input.
flb_input_collector_fd()runs on the main thread but iterated athreaded input's collectors, racing the worker thread that initializes those
collector fds at startup. A threaded input's collectors are registered and
dispatched in the input's own thread/event loop and are never matched here,
so the handler now skips threaded inputs (also a small optimization).
config->grace_inputis published by the engine threadduring startup and read by the supervisor on the main thread.
A small
include/fluent-bit/flb_atomic.hhelper is added (relaxed GCC/Clang__atomicbuiltins, plain-access fallback) and used for the metric counter andgrace_input.Testing
Built with ThreadSanitizer (
-fsanitize=thread, jemalloc disabled) and run for~45s against a threaded
tail+multiline.parser cri+nullpipeline drivenby a CRI log generator (~6M lines, 6 files):
flb_metrics_sum,flb_input_collector_fd,grace_input).Example configuration used:
Documentation
Backporting
Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.
Summary by CodeRabbit