feat(server): Add shutdown watchdog #5915
Once the server family completes its save-on-shutdown activity, a timer is started. If the timer is not disarmed within 20 seconds, stack traces for all fibers are printed to standard error, similar to what is done in the test utils. Signed-off-by: Abhijat Malviya <[email protected]>
src/server/main_service.cc
Outdated
};

ShutdownWatchdog::ShutdownWatchdog(util::ProactorPool& pp) : pool{pp} {
  watchdog_fb = pool.GetNextProactor()->LaunchFiber([&] {
You can use the LaunchFiber("name", cb) notation instead of calling util::ThisFiber::SetName.
Changed.
src/server/main_service.cc
Outdated
absl::SetFlag(&FLAGS_alsologtostderr, true);
util::fb2::Mutex m;
pool.AwaitFiberOnAll([&m](unsigned index, auto*) {
  util::ThisFiber::SetName("shutdown-watchdog");
Let's use a different name.
Renamed to print_stack_fib_%u.
Signed-off-by: Abhijat Malviya <[email protected]>
Once the server family completes its save-on-shutdown activity, a timer is started. If the timer is not disarmed within 20 seconds, stack traces for all fibers are printed to standard error, similar to what is done in the test utils.
FIXES #5905
The watchdog is implemented as a static object that is armed on creation. Arming happens only once; later calls to arm are no-ops. The delay before the logs are printed is fixed at 20 seconds.
Note on testing
I tested manually by adding
right after the
ArmShutdownWatchdog(service_.proactor_pool());
call in ServerFamily::Shutdown()
and fiber stack traces were printed as expected, 20 seconds after shutting down the server. I considered adding an environment variable or configuration option for testing, to reduce the wait for
watchdog_done
to 0 and assert that the stack trace is printed, but decided against it. Maybe there should be a setting for the sleep duration.