Fix "disconnected: write(m_post_fd, &buffer, 1): Broken pipe" EventLoop shutdown races. #129
Conversation
The EventLoop shutdown sequence has race conditions that could cause it to shut down right before a `removeClient` `write(m_post_fd, ...)` call is about to happen, if threads run in an unexpected order, and cause the write to fail. Cases where this can happen are described in bitcoin/bitcoin#31151 (comment), and the possible causes are that (1) `EventLoop::m_mutex` is not used to protect some EventLoop member variables that are accessed from multiple threads (particularly `m_num_clients` and `m_async_fns`), and (2) the `removeClient` method can do unnecessary `write(m_post_fd, ...)` calls before the loop is supposed to exit because it is not checking the `m_async_fns.empty()` condition, and these multiple write calls can make the event loop exit early and cause the final `write()` call to fail. In practice, only the second cause seems to actually trigger this bug, but this PR fixes both possible causes. Fixes bitcoin/bitcoin#31151 Co-authored-by: Vasil Dimov <[email protected]>
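For readers skimming the discussion below, the core of the fix can be sketched roughly as follows. This is a simplified sketch, not the literal patch: it assumes `removeClient` holds `m_mutex` via a passed-in lock like the other EventLoop methods discussed in the review, and uses the `done()` helper added by this PR.

```cpp
// Simplified sketch of the removeClient() change, not the literal diff.
void EventLoop::removeClient(std::unique_lock<std::mutex>& lock)
{
    m_num_clients -= 1;
    // Before this PR only `m_num_clients == 0` was checked here, so the pipe
    // could be written while async cleanup functions were still pending,
    // waking the loop and letting it exit slightly too early.
    if (done(lock)) { // now: m_num_clients == 0 && m_async_fns.empty()
        m_cv.notify_all();
        int post_fd{m_post_fd}; // copy taken while m_mutex is held
        Unlock(lock, [&] {
            char buffer = 0;
            KJ_SYSCALL(write(post_fd, &buffer, 1)); // wake the loop so it can exit
        });
    }
}
```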
I've been testing this by running
I'll test the rebased branch on bitcoin/bitcoin#30975 https://github.com/Sjors/libmultiprocess/tree/2025/01/disconnected
bitcoin-core/libmultiprocess#129: Fix "disconnected: write(m_post_fd, &buffer, 1): Broken pipe" EventLoop shutdown races. Using rebased commit from: https://github.com/Sjors/libmultiprocess/tree/2025/01/disconnected
From the OP:
the possible causes are that (1) EventLoop::m_mutex is not used to protect some EventLoop member variables that are accessed from multiple threads
Which variables?
if (m_post_fn) {
    Unlock(lock, *m_post_fn);
    m_post_fn = nullptr;
    m_cv.notify_all();
Previously `m_cv.notify_all()` would have been called regardless of whether `m_post_fn` was `nullptr` or not. Now it will only be called if `m_post_fn` is not `nullptr`. Is this intended?
re: #129 (comment)
Previously `m_cv.notify_all()` would have been called regardless of whether `m_post_fn` was `nullptr` or not. Now it will only be called if `m_post_fn` is not `nullptr`. Is this intended?
Yep, that's intended. The only reason to notify a condition variable is after updating some shared state, like changing a reference count or adding or removing something from a queue. The notify_all() call here is made right after setting `m_post_fn` to notify waiting threads that it changed. In the alternate case, no state is changing and there's nothing to notify about.
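To illustrate the general rule (a minimal generic example, not code from this repository): waiters always re-check shared state under the lock, so a `notify_all()` that is not paired with a state change just wakes them up to go back to sleep.

```cpp
// Generic producer/consumer illustration, not code from this repository.
#include <condition_variable>
#include <mutex>
#include <queue>

std::mutex m;
std::condition_variable cv;
std::queue<int> work; // shared state guarded by m

void produce(int item)
{
    {
        std::lock_guard<std::mutex> lock(m);
        work.push(item); // shared state changed...
    }
    cv.notify_all();     // ...so tell waiters to re-check it
}

int consume()
{
    std::unique_lock<std::mutex> lock(m);
    // The predicate is re-checked under the lock, so a notify without a
    // preceding state change would be a pointless wakeup.
    cv.wait(lock, [] { return !work.empty(); });
    int item = work.front();
    work.pop();
    return item;
}
```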
src/mp/proxy.cpp
Outdated
bool EventLoop::done(std::unique_lock<std::mutex>& lock)
{
    assert(m_num_clients >= 0);
    return m_num_clients == 0 && m_async_fns.empty();
}
What is the point of passing `lock` when it is not going to be used? Looks weird. Is this to show that `m_mutex` must be locked when calling this? If yes, then engaging the clang annotations would be more clear. Because now `lock` could be anything, it may not own a mutex or may own any mutex other than `m_mutex`. Maybe add these:
  bool EventLoop::done(std::unique_lock<std::mutex>& lock)
  {
      assert(m_num_clients >= 0);
+     assert(lock.mutex() == &m_mutex);
+     assert(lock.owns_lock());
      return m_num_clients == 0 && m_async_fns.empty();
  }
re: #129 (comment)
Maybe add these:
Thanks, added.
You are right that the code is written in this style to provide similar benefits as clang annotations, so it is a compile error for outside callers to access shared state without the needed lock. A few existing EventLoop methods already take lock parameters, and this new method follows that pattern.
For background, the EventLoop code is basically some of the first code I wrote in bitcoin/bitcoin#10102 and I'm pretty sure it was written before annotations were used much in bitcoin core. It probably would be a very good idea to add clang annotations here too, so that could be a followup, but in the meantime this convention prevents the common bug of forgetting to lock something before accessing shared state. (I think a bug where you would somehow find a different mutex and lock the wrong mutex would be more rare.)
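For concreteness, here is a hedged sketch of what such annotations could look like, checked with clang's `-Wthread-safety` flag. The class and names here are hypothetical, not the actual EventLoop declaration, and whether `std::mutex`/`std::lock_guard` themselves are recognized by the analysis depends on the standard library's own annotations, which is part of why this is left for a followup.

```cpp
// Hypothetical sketch of clang thread safety annotations, not the actual
// EventLoop declaration. Checked by compiling with -Wthread-safety.
#include <functional>
#include <list>
#include <mutex>

class AnnotatedLoopSketch
{
public:
    // Compile-time requirement that m_mutex is held by the caller, expressing
    // the same intent as passing a std::unique_lock parameter.
    bool done() const __attribute__((exclusive_locks_required(m_mutex)))
    {
        return m_num_clients == 0 && m_async_fns.empty();
    }

private:
    mutable std::mutex m_mutex;
    int m_num_clients __attribute__((guarded_by(m_mutex))) = 0;
    std::list<std::function<void()>> m_async_fns __attribute__((guarded_by(m_mutex)));
};
```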
@@ -188,29 +188,27 @@ void EventLoop::loop()
(posting in a random place coz github wouldn't allow me to comment on the destructor)
Not related to this PR, just an observation. The destructor is:
EventLoop::~EventLoop()
{
    if (m_async_thread.joinable()) m_async_thread.join();
    std::lock_guard<std::mutex> lock(m_mutex);
    KJ_ASSERT(m_post_fn == nullptr);
    KJ_ASSERT(m_async_fns.empty());
    KJ_ASSERT(m_wait_fd == -1);
    KJ_ASSERT(m_post_fd == -1);
    KJ_ASSERT(m_num_clients == 0);
}
Locking `m_mutex` shouldn't be necessary because there cannot be two threads destroying the same object concurrently. Or if there are, then this is a serious bug elsewhere that will cause a double free after the destructor completes. Also, there cannot be one thread destroying the object while another one is calling a method of that object.
re: #129 (comment)
Locking `m_mutex` shouldn't be necessary because there cannot be two threads destroying the same object concurrently. Or if there are, then this is a serious bug elsewhere that will cause a double free after the destructor completes. Also, there cannot be one thread destroying the object while another one is calling a method of that object.
I think that is basically true and the lock is probably unnecessary here.
But there might be scenarios where threadsanitizer will throw errors without a lock, if it sees a variable being written here after being read from another thread with no synchronization in between. I am also not sure if clang thread safety annotations would complain without a lock here.
Also, to be pedantic, I think "there cannot be one thread destroying the object while another one is calling a method of that object" is not true. The state in the object is still valid while the destructor method is running, and it should be fine for other threads to access it. The destructor method is not different from other methods in this respect.
Clang thread safety is aware of this:
#include <mutex>

class A
{
public:
    A()
    {
        x = 1;
    }
    ~A()
    {
        x = 2;
    }
    void f()
    {
        x = 3;
    }
    std::mutex m;
    int x __attribute__((guarded_by(m)));
};
Generates only a warning for `f()`:
t.cc:194:9: warning: writing variable 'x' requires holding mutex 'm' exclusively [-Wthread-safety-analysis]
194 | x = 3;
| ^
1 warning generated.
I assume the thread sanitizer is the same even though I couldn't demonstrate this easily.
The state in the object is still valid while the destructor method is running, and it should be fine for other threads to access it.
True, but then the problem arises that after the destructor finishes, without further synchronization, the memory that contains the object is freed. So it would be a serious read-after-free bug if another method is being executed while the destructor is being called because it could happen that the destructor finishes and the memory freed while the other method is still executing. Any mutexes within the object locked inside the destructor will be unlocked when the destructor finishes and before the memory is freed, so they cannot be used for synchronization with the other method.
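A contrived sketch of that hazard (hypothetical code, not from this library) may help make the point concrete:

```cpp
// Contrived illustration (not from this library): a member mutex cannot be
// used to synchronize a method call against the object's own destruction.
#include <mutex>

struct Widget
{
    std::mutex m;
    int value = 0;

    void set(int v)
    {
        std::lock_guard<std::mutex> lock(m); // another thread may be blocked here...
        value = v;
    }

    ~Widget()
    {
        std::lock_guard<std::mutex> lock(m);
        // The lock is released when this destructor returns, but the memory
        // holding `m` and `value` is freed right afterwards. A thread that was
        // blocked in set() then locks a destroyed mutex and writes to freed
        // memory, so the member mutex provided no real synchronization.
    }
};
```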
re: #129 (comment)
Clang thread safety is aware of this:
Thanks, this is interesting. I'm a little unsure how clang knows that no mutex is required to access the variable in the destructor. It doesn't seem like a totally safe assumption to make because I could easily think of cases where locking the mutex in a destructor would be required, and then clang wouldn't diagnose any problem. But maybe it is just assuming that typical destructors aren't blocking or doing anything complicated, and the mutex is about to be destroyed anyway, and that is enough proof of ownership.
I assume the thread sanitizer is the same even though I couldn't demonstrate this easily.
This could also be the case, but I wouldn't assume it. Thread sanitizer (as far as I know) does not treat destructors differently than other methods, and is operating at a lower level, just looking at what mutexes are held during reads and writes and what synchronization events are happening between reads and writes.
It's definitely possible you are right and dropping locks from this code is perfectly ok. So just for the sake of this PR I don't want to expand it by dropping an already existing lock, but that could be a good change to make in a followup.
The state in the object is still valid while the destructor method is running, and it should be fine for other threads to access it.
True, but then the problem arises that after the destructor finishes, without further synchronization, the memory that contains the object is freed. So it would be a serious read-after-free bug if another method is being executed while the destructor is being called because it could happen that the destructor finishes and the memory freed while the other method is still executing. Any mutexes within the object locked inside the destructor will be unlocked when the destructor finishes and before the memory is freed, so they cannot be used for synchronization with the other method.
This is all true but it is ok for a destructor to block and wait for other events and other threads. And as long as it is waiting, it is ok for other threads to access the object and call its methods.
@@ -188,29 +188,27 @@ void EventLoop::loop()
kj::Own<kj::AsyncIoStream> wait_stream{
    m_io_context.lowLevelProvider->wrapSocketFd(m_wait_fd, kj::LowLevelAsyncIoProvider::TAKE_OWNERSHIP)};
int post_fd{m_post_fd};
Why make a copy of `m_post_fd`? This code here is the only one that can change it. That is, for sure, at the end where `close(post_fd)` is called, `post_fd` will be equal to `m_post_fd`, right?
re: #129 (comment)
Why make a copy of `m_post_fd`? This code here is the only one that can change it. That is, for sure, at the end where `close(post_fd)` is called, `post_fd` will be equal to `m_post_fd`, right?
This is just saving a copy of the variable while the lock is held. There are three methods (loop, post, removeClient) where `m_post_fd` is saved to a temporary variable before the lock is released and the descriptor is used.
As you say, this should never be necessary because the code that changes this variable can only run after the code that reads it, but at least when I started debugging this issue, thread sanitizer complained that this variable was being read and written to without synchronization, so I stopped accessing it without a lock to prevent this.
Going forward it might also be clearer to access `m_post_fd` with locks, and this might make it easier to add clang annotations too (not sure).
I wouldn't object to a followup accessing `m_post_fd` directly without locks, but I would want to make sure this is compatible with the tools we use (tsan, thread safety annotations) given their limitations.
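For readers unfamiliar with the helper used in these snippets, the semantics assumed throughout this discussion are roughly the following. This is a hedged sketch named `UnlockSketch` to avoid implying it is the library's actual `Unlock` implementation:

```cpp
// Hedged sketch of the semantics assumed for the Unlock() helper used above:
// run a callback with the lock temporarily released, then re-acquire it.
#include <mutex>
#include <utility>

template <typename Lock, typename Callback>
void UnlockSketch(Lock& lock, Callback&& callback)
{
    lock.unlock();                        // release while running the callback...
    std::forward<Callback>(callback)();
    lock.lock();                          // ...then re-acquire before returning
    // A production version would also restore the lock if the callback throws.
}
```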
This is just saving a copy of the variable while the lock is held.
Hmm, but the lock is not held when making the copy at line 191?
191 int post_fd{m_post_fd};
...
213 KJ_SYSCALL(::close(post_fd));
214 std::unique_lock<std::mutex> lock(m_mutex);
re: #129 (comment)
Hmm, but the lock is not held when making the copy at line 191?
Oh, that is a good point, I didn't notice that. For consistency it would be good to just acquire the lock wherever the variable is used, even when it is not needed, or have some other more consistent rule. I think a good followup would be to add clang annotations, add the lock everywhere they say it is required, and remove it in places where it isn't necessary and the annotations don't say it is required. This would add an unnecessary lock here as you are suggesting, and remove an unnecessary lock in the destructor as you have also suggested.
Would it be possible to use `std::atomic_int` for `m_post_fd`? If it does not have to be in sync with other variables...
re: #129 (comment)
Would it be possible to use `std::atomic_int` for `m_post_fd`? If it does not have to be in sync with other variables...
Definitely possible. I think it wouldn't make a performance difference because in practice whenever m_post_fd is used some state update is being posted, and a lock is needed anyway to update the shared state. But there could be other reasons for preferring an atomic, like style reasons. Very possibly writing this code in a different style could make it clearer. I think at the very least it should be using clang thread safety annotations, which would make the intent more clear.
Yeah, I think `std::atomic` wouldn't make a difference wrt performance but would make the code more clear.
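A minimal sketch of that alternative (hypothetical, not what this PR does):

```cpp
// Hypothetical sketch of the std::atomic alternative discussed above.
#include <atomic>

class PostFdAtomicSketch
{
    std::atomic_int m_post_fd{-1};

public:
    void setPostFd(int fd) { m_post_fd.store(fd, std::memory_order_release); }
    int postFd() const { return m_post_fd.load(std::memory_order_acquire); }
    // Readers would no longer need m_mutex just to fetch the descriptor, but
    // state that must stay in sync with it (m_num_clients, m_async_fns) still
    // needs the mutex, which is why this is mostly a style choice here.
};
```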
  int post_fd{m_post_fd};
  Unlock(lock, [&] {
      char buffer = 0;
-     KJ_SYSCALL(write(m_post_fd, &buffer, 1));
+     KJ_SYSCALL(write(post_fd, &buffer, 1));
Am I right that the problem with this code was that the `loop()` method may finish just before the `write()` call and set `m_post_fd` to `-1`, and this may end up calling `write(-1, ...)`? If yes, then this change is not sufficient - if the same happens, then it will call `write(n, ...)` where `n` is not `-1` but is some number, a stale file descriptor, that has already been closed. Even worse, `n` may represent a newly opened file if the file descriptor numbers are reused by the OS.
re: #129 (comment)
Am I right that the problem with this code was [...]
This was actually not the problem. The problem wasn't that the code was trying to `write(-1, ...)`, although that could have been a possible outcome of the bug. The problem was not even that it was trying to write to a closed file descriptor, although that could have been another possible outcome of the bug. Instead, the actual problem, whenever it happened, was that the write() call failed because it was writing to a pipe that was closed on the reading side.
The cause of all these problems (theoretical and actual) was that, before this PR, `removeClient` was only checking the `m_num_clients == 0` condition to see if it should write to the pipe, instead of checking the `m_num_clients == 0 && m_async_fns.empty()` condition like it is doing after this PR. So previously, it could write to the pipe multiple times before the loop was actually supposed to exit, and depending on thread scheduling this could occasionally cause the loop to exit slightly too early, immediately before the final write() call was made, causing the write() call to fail.
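To make the failure mode concrete, here is a tiny standalone demo (hypothetical, not from the PR) of what "Broken pipe" means in this context: writing to a pipe whose read end has already been closed fails with EPIPE, which is the error KJ_SYSCALL surfaced in the original report.

```cpp
// Standalone demo, not from the PR: write() to a pipe whose read end has been
// closed fails with EPIPE ("Broken pipe"), mirroring the shutdown race where
// the event loop exits and closes its end before the final write.
#include <cerrno>
#include <csignal>
#include <cstdio>
#include <cstring>
#include <unistd.h>

int main()
{
    std::signal(SIGPIPE, SIG_IGN); // otherwise the process would just be terminated by SIGPIPE
    int fds[2];
    if (pipe(fds) != 0) return 1;
    close(fds[0]); // the reading side goes away first, like the loop exiting early
    char buffer = 0;
    if (write(fds[1], &buffer, 1) == -1) {
        std::printf("write failed: %s\n", std::strerror(errno)); // prints "Broken pipe"
    }
    close(fds[1]);
    return 0;
}
```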
Ok, so there is another problem in `master` which is not addressed in this PR:
1. `EventLoop::post()` is executing in thread `T1`
2. `EventLoop::loop()` is executing in thread `T2`
3. Let's say `m_post_fd` is `5`
4. `post()`/`T1` makes a copy of `m_post_fd`, `post_fd = 5`, and unlocks `m_mutex`
5. `loop()`/`T2` does `close(5)` and assigns `-1` to `m_post_fd`
6. `post()`/`T1` calls `write(post_fd, ...)` with `post_fd` being `5`. At this point `5` does not refer to an opened file so the call fails, or worse, another file has been opened by the process and the number `5` has been reused for it, and now `5` refers to an opened file which is different from the file associated with `5` when the copy `post_fd` was made (in 4. above)
Right? Or do I miss some high level context here (I am new to the internals of this library)?
re: #129 (comment)
Right? Or do I miss some high level context here (I am new to the internals of this library)?
Yes, the high level context that prevents what you described from happening is the addClient/removeClient reference counting. If you call post() without calling addClient(), the race condition you described will be present, but that is a higher level bug. The eventloop code is assuming that addClient and removeClient are being used whenever post() is used.
The bug being fixed in this PR is a lower-level bug that only happens during shutdown, when the reference count is 0 and, at the very last second, right as the very last removeClient call in the program is about to call `write(m_post_fd)`, the event loop wakes up because of previous spurious writes and closes the read end of the pipe (before closing m_post_fd) as well.
The cause of the spurious writes is that previously removeClient was only checking the `m_num_clients == 0` condition, so it was doing spurious write(m_post_fd) calls which could cause the event loop to wake up too early and exit slightly before it was supposed to exit. Now `removeClient` is checking `m_num_clients == 0 && m_async_fns.empty()` so it will not do spurious writes.
Ok, let's conclude this. I have another question. Posting here because it is related:
This pattern, used in `post()` and `removeClient()`:

1. lock the mutex that protects `m_post_fd`
2. make a local copy of `m_post_fd` in `post_fd`
3. unlock
4. use the local copy: `write(post_fd)`

looks odd to me. Assuming the usual semantics that the mutex is protecting the variable - it is not allowed to change while holding the mutex and can change when not holding the mutex. This means that at the time of 4. `m_post_fd` could have changed after the unlock in 3. Above you said that higher-level logic implies that `loop()`, which is the only one which changes `m_post_fd`, will not finish while `post()` is executing. Ok, but then that means `m_post_fd` does not need any protection. Is this just trying to silence some false alarm by the thread sanitizer?
re: #129 (comment)
Is this just trying to silence some false alarm by the thread sanitizer?
Not exactly. I think you keep basically asking the question again and again "Why is m_post_fd accessed with a lock? It doesn't need to be locked." And sorry I haven't given a more direct answer. It just has to do with the behavior of thread sanitizer with earlier versions of this code. When I first enabled thread sanitizer to debug this issue, I fixed the immediate tsan errors that happened running the code by restructuring the loop in the way it is currently written, but without the post_fd variable. But then when I ran the test in a loop, after around 20-30 minutes eventually there were tsan errors complaining both about the `m_post_fd` variable and the underlying file descriptor having conflicting read/write accesses from multiple threads without synchronization. So I added the int post_fd = m_post_fd variables, and that fixed the errors about the m_post_fd variable, but not the errors about the underlying file descriptor. Eventually I figured out the underlying problem, which was spurious writes in removeClient, and fixed that issue too, but I did not go back and remove the post_fd = m_post_fd variables.
I did not feel any need to remove these variables because I was thinking of m_post_fd, m_num_clients, and m_async_fns as a unit and wanted to follow a simple rule of using a single lock to protect them. You're completely right though that a lock is probably not needed to access m_post_fd, although I still think clang thread annotations might complain if it is accessed without a lock outside of the destructor, and I am not 100% sure that tsan would not complain if I removed the post_fd variables, but I am 99% sure it would be ok, so that change could definitely be a followup.
Thanks for asking this and feel free to ask about anything else that isn't clear or could be improved.
Force-pushed 5a6c9e0 to 0e4f88d (compare)
Thanks for the detailed review!
Updated 5a6c9e0 -> 0e4f88d (`pr/disconnected.1` -> `pr/disconnected.2`, compare) adding suggested asserts and updating some comments.
re: #129 (review)
From the OP:
the possible causes are that (1) EventLoop::m_mutex is not used to protect some EventLoop member variables that are accessed from multiple threads
Which variables?
Updated the description with this information, but checking the previous version of the code, it looks like the variables accessed from multiple threads and used without a lock were `m_num_clients`, `m_async_fns`, and `m_post_fd`.
Also, to be clear, the intermittent failure that's being fixed here was not caused by lack of locking as far as I can tell. The failure was caused by a race between the startAsyncThread and removeClient calls at the end of the ~Connection destructor as described in bitcoin/bitcoin#31151 (comment)
Thanks for reviewing and keeping discussion going on this. The bug is pretty complicated and hard to explain, but I want to try to make everything as clear as possible so the change is understood.
I am probably about to merge this PR shortly, because the bug is not straightforward to reproduce and I am pretty confident this change fixes it, and 100% confident this change improves locking in general and fixes a bunch of immediate -fsanitize=thread bugs. I also want to bump the libmultiprocess version in bitcoin core to be able to support building libmultiprocess as a git subtree, and would like this fix to be included to avoid any potential issues in CI in that PR and other multiprocess bitcoin core PRs.
I would like to continue to improve and discuss this code and talk about it here, or in a followup PR that adds clang thread safety annotations, which could be another place to make related improvements.
If you do need to make further changes here, can you rebase it? That makes it easier to update bitcoin/bitcoin#30975.
063ff18 fix startAsyncThread comment (Ryan Ofsky) Pull request description: While implementing #129 I noticed startAsyncThread comment wasn't totally accurate so this should fix that. Top commit has no ACKs. Tree-SHA512: b3ff8c53752c65c3d58af8dddd13bc718d18c230adb0b40c674b381e47629ed0780650e6bb017a3238093c4e12a7b5b02244ad90871b005a9a562330b5ea4ba7
bitcoin-core/libmultiprocess#121: ProxyClientBase: avoid static_cast to partially constructed object bitcoin-core/libmultiprocess#120: proxy-types.h: add static_assert to detect int/enum size mismatch bitcoin-core/libmultiprocess#127: ProxyClientBase: avoid static_cast to partially destructed object bitcoin-core/libmultiprocess#129: Fix "disconnected: write(m_post_fd, &buffer, 1): Broken pipe" EventLoop shutdown races. bitcoin-core/libmultiprocess#130: refactor: Add CleanupRun function to dedup clean list code
bitcoin-core/libmultiprocess#121: ProxyClientBase: avoid static_cast to partially constructed object bitcoin-core/libmultiprocess#120: proxy-types.h: add static_assert to detect int/enum size mismatch bitcoin-core/libmultiprocess#127: ProxyClientBase: avoid static_cast to partially destructed object bitcoin-core/libmultiprocess#129: Fix "disconnected: write(m_post_fd, &buffer, 1): Broken pipe" EventLoop shutdown races. bitcoin-core/libmultiprocess#130: refactor: Add CleanupRun function to dedup clean list code bitcoin-core/libmultiprocess#131: doc: fix startAsyncThread comment bitcoin-core/libmultiprocess#133: Fix debian "libatomic not found" error in downstream builds
Post merge ACK, the change looks good and the discussions above drifted to things that are out of the scope of this PR.
This should be the final update to the libmultiprocess package via the depends system. It brings in the libmultiprocess cmake changes from bitcoin-core/libmultiprocess#136 needed to support building as subtree. After this, a followup PR will add libmultiprocess as a git subtree and depends will just use the git subtree instead of hardcoding its own version hash. Since there have been libmultiprocess API changes since the last update, this commit also updates bitcoin code to be compatible with them. This update brings in the following changes: bitcoin-core/libmultiprocess#121 ProxyClientBase: avoid static_cast to partially constructed object bitcoin-core/libmultiprocess#120 proxy-types.h: add static_assert to detect int/enum size mismatch bitcoin-core/libmultiprocess#127 ProxyClientBase: avoid static_cast to partially destructed object bitcoin-core/libmultiprocess#129 Fix "disconnected: write(m_post_fd, &buffer, 1): Broken pipe" EventLoop shutdown races. bitcoin-core/libmultiprocess#130 refactor: Add CleanupRun function to dedup clean list code bitcoin-core/libmultiprocess#131 doc: fix startAsyncThread comment bitcoin-core/libmultiprocess#133 Fix debian "libatomic not found" error in downstream builds bitcoin-core/libmultiprocess#94 c++ 20 cleanups bitcoin-core/libmultiprocess#135 refactor: proxy-types.h API cleanup bitcoin-core/libmultiprocess#136 cmake: Support being included with add_subdirectory
This should be the final update to the libmultiprocess package via the depends system. It brings in the libmultiprocess cmake changes from bitcoin-core/libmultiprocess#136 needed to support building as subtree. After this, a followup PR will add libmultiprocess as a git subtree and depends will just use the git subtree instead of hardcoding its own version hash. Since there have been libmultiprocess API changes since the last update, this commit also updates bitcoin code to be compatible with them. This update brings in the following changes: bitcoin-core/libmultiprocess#121 ProxyClientBase: avoid static_cast to partially constructed object bitcoin-core/libmultiprocess#120 proxy-types.h: add static_assert to detect int/enum size mismatch bitcoin-core/libmultiprocess#127 ProxyClientBase: avoid static_cast to partially destructed object bitcoin-core/libmultiprocess#129 Fix "disconnected: write(m_post_fd, &buffer, 1): Broken pipe" EventLoop shutdown races. bitcoin-core/libmultiprocess#130 refactor: Add CleanupRun function to dedup clean list code bitcoin-core/libmultiprocess#131 doc: fix startAsyncThread comment bitcoin-core/libmultiprocess#133 Fix debian "libatomic not found" error in downstream builds bitcoin-core/libmultiprocess#94 c++ 20 cleanups bitcoin-core/libmultiprocess#135 refactor: proxy-types.h API cleanup bitcoin-core/libmultiprocess#136 cmake: Support being included with add_subdirectory bitcoin-core/libmultiprocess#137 doc: Fix broken markdown links
…ng to subtree 4e0aa18 test: Add test for IPC serialization bug (Ryan Ofsky) 2221c88 depends: Update libmultiprocess library before converting to subtree (Ryan Ofsky) Pull request description: This should be the final update to the libmultiprocess package via the depends system. It brings in the libmultiprocess cmake changes from bitcoin-core/libmultiprocess#136 needed to support building as subtree. After this, followup PR #31741 will add libmultiprocess as a git subtree and depends will just use the git subtree instead of hardcoding its own version hash. Since there have been libmultiprocess API changes since the last update, this commit also updates bitcoin code to be compatible with them. This update has the following new changes since previous update #31105: bitcoin-core/libmultiprocess#121 ProxyClientBase: avoid static_cast to partially constructed object bitcoin-core/libmultiprocess#120 proxy-types.h: add static_assert to detect int/enum size mismatch bitcoin-core/libmultiprocess#127 ProxyClientBase: avoid static_cast to partially destructed object bitcoin-core/libmultiprocess#129 Fix "disconnected: write(m_post_fd, &buffer, 1): Broken pipe" EventLoop shutdown races. bitcoin-core/libmultiprocess#130 refactor: Add CleanupRun function to dedup clean list code bitcoin-core/libmultiprocess#131 doc: fix startAsyncThread comment bitcoin-core/libmultiprocess#133 Fix debian "libatomic not found" error in downstream builds bitcoin-core/libmultiprocess#94 c++ 20 cleanups bitcoin-core/libmultiprocess#135 refactor: proxy-types.h API cleanup bitcoin-core/libmultiprocess#136 cmake: Support being included with add_subdirectory bitcoin-core/libmultiprocess#137 doc: Fix broken markdown links ACKs for top commit: Sjors: ACK 4e0aa18 vasild: ACK 4e0aa18 TheCharlatan: ACK 4e0aa18 Tree-SHA512: 6d81cdf7f44762c7f476212295f6224054fd0a61315bb54786bc7758a2b33e5a2fce925c71e36f7bda320049aa14e7218a458ceb03dacbb869632c466c4789b0
Add basic thread safety annotations to EventLoop. Use can be expanded to other functions and deepened, but this puts basic functionality in place. Use of annotations was discussed in bitcoin-core#129 (comment)
To follow up on all the discussion about clang annotations above, some basic ones are now added in #160