@abhijat abhijat commented Oct 15, 2025

Once the server family completes its save-on-shutdown activity, a timer is started. If the timer is not disarmed within 20 seconds, stack traces for all fibers are printed to standard error, similar to what the test utils do.

FIXES #5905

The watchdog is implemented as a static object that is armed on creation. Arming happens only once; subsequent calls to arm are no-ops. The delay before the stack traces are printed is fixed at 20 seconds.

Note on testing

I tested manually by adding

  ThisFiber::SleepFor(60s);

right after the ArmShutdownWatchdog(service_.proactor_pool()); call in ServerFamily::Shutdown(). On shutting down the server, fiber stack traces were printed after 20 seconds, as expected.

I considered adding an environment variable or configuration option for testing, so that a test could set the wait on watchdog_done to 0 and assert that the stack trace is printed, but decided against it.

Maybe there should be a setting for the sleep duration.
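If such a setting were added later, one possible shape is a small helper that reads the duration from the environment and falls back to the current 20 seconds. This is only a sketch of the idea floated above, not something this PR implements: the helper name WatchdogTimeoutFromEnv and any variable name passed to it are hypothetical, and a command-line flag would be an equally valid choice.

```cpp
#include <cassert>
#include <cerrno>
#include <chrono>
#include <cstdlib>

// Hypothetical helper: reads a timeout in whole seconds from the environment
// variable var_name. Returns def when the variable is unset, empty,
// non-numeric, negative or out of range.
std::chrono::seconds WatchdogTimeoutFromEnv(const char* var_name, std::chrono::seconds def) {
  assert(var_name != nullptr);
  const char* raw = std::getenv(var_name);
  if (raw == nullptr || *raw == '\0') return def;

  char* end = nullptr;
  errno = 0;
  long value = std::strtol(raw, &end, 10);
  // Reject trailing garbage ("5x"), negative values and overflow.
  if (*end != '\0' || value < 0 || errno == ERANGE) return def;
  return std::chrono::seconds(value);
}
```

A test could then export the variable with a value of 0 so the watchdog fires immediately, while production runs keep the 20 second default.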

Once the server family completes its save-on-shutdown activity, a timer
is started. If the timer is not disarmed within 20 seconds, stack traces
for all fibers are printed to standard error, similar to what the test
utils do.

Signed-off-by: Abhijat Malviya <[email protected]>
@abhijat abhijat force-pushed the abhijat/feat/shutdown-watchdog branch from 1ce7b87 to 5106b6b Compare October 15, 2025 07:21
@abhijat abhijat marked this pull request as ready for review October 15, 2025 08:38
@abhijat abhijat requested review from kostasrim and romange October 15, 2025 08:38
};

ShutdownWatchdog::ShutdownWatchdog(util::ProactorPool& pp) : pool{pp} {
watchdog_fb = pool.GetNextProactor()->LaunchFiber([&] {
Collaborator

you can use LaunchFiber("name", cb) notation instead of calling util::ThisFiber::SetName

Contributor Author

changed

absl::SetFlag(&FLAGS_alsologtostderr, true);
util::fb2::Mutex m;
pool.AwaitFiberOnAll([&m](unsigned index, auto*) {
util::ThisFiber::SetName("shutdown-watchdog");
Collaborator

let's use a different name

Contributor Author

renamed to print_stack_fib_%u
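For illustration, the %u in the new name is presumably filled with the per-thread index passed to the AwaitFiberOnAll callback, so each proactor thread gets a distinct fiber name. A minimal sketch of that formatting, assuming plain snprintf rather than whatever formatting helper the actual change uses (PrintStackFiberName is a hypothetical name):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: builds the per-thread fiber name from the thread index.
std::string PrintStackFiberName(unsigned index) {
  char buf[32];  // 16 characters of prefix plus at most 10 digits fits easily.
  int n = std::snprintf(buf, sizeof(buf), "print_stack_fib_%u", index);
  assert(n > 0 && static_cast<std::size_t>(n) < sizeof(buf));
  return std::string(buf, static_cast<std::size_t>(n));
}
```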

Signed-off-by: Abhijat Malviya <[email protected]>
@abhijat abhijat enabled auto-merge (squash) October 16, 2025 10:42
@abhijat abhijat merged commit 6399e7f into main Oct 16, 2025
10 checks passed
@abhijat abhijat deleted the abhijat/feat/shutdown-watchdog branch October 16, 2025 11:46

Development

Successfully merging this pull request may close these issues.

introduce auto-deadlock detection during shutdown
