-
Notifications
You must be signed in to change notification settings - Fork 597
Medium: Filesystem: faster stop procedure when large number of proces… #1042
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…ses using the filesystem are found
Hm, why not do the mass-kill even with fewer processes? My guess is that it's to kill the processes in a predictable order. But if that is a real concern, it seems like it could be really difficult to debug problems caused by the switch to a different method at an arbitrary limit. |
On Tue, Oct 24, 2017 at 03:47:33AM -0700, Kristoffer Grönlund wrote:
Hm, why not do the mass-kill even with fewer processes?
I guess because we want to log the processes which are holding
the filesystem. It is assumed that such a situation is unexpected
and such information could be useful to the administrators.
Otherwise, I don't know :)
My guess is that it's to kill the processes in a predictable order. But if that is a real concern, it seems like it could be really difficult to debug problems caused by the switch to a different method at an arbitrary limit.
No, we don't really care about the order. It's not possible to
establish one anyway (i.e. the pid generation can wrap).
|
If it's just for the logging, perhaps something like this would work?
and then just killing them all in a single command all the time. |
On Tue, Oct 24, 2017 at 04:35:12AM -0700, Kristoffer Grönlund wrote:
If it's just for the logging, perhaps something like this would work?
```
ps -f $pids | tail -n +2 | while read line; do
ocf_log info "sending signal $sig to: $line"
done
```
and then just killing them all in a single command all the time.
Apparently, it is ocf_log that is the bottleneck, which is
probably true. Somebody recently complained about it hence this
change.
BTW, please don't merge this yet, it's untested and I asked the
poster to do some testing.
|
Ah, I see.. yeah, that does make sense that it would be. |
Did you manage to do some further testing? This PR starting to rot a bit 😉 |
Our QE suggests simply dropping the entire for-loop. E.g.:
or
|
The latter seems to be the best way forward. Thank you for your work and thanks to the QE for providing the code! |
Sadly it doesnt seem to work with our improved stop-action logic: |
…ses using the filesystem are found