refactor: EventLoop locking cleanups + client disconnect exception #160

Open
wants to merge 9 commits into
base: master
from

Conversation

ryanofsky
Collaborator

@ryanofsky ryanofsky commented Feb 10, 2025

This PR introduces RAII reference counting to the EventLoop class to simplify a recent bugfix in #159. It also deletes the DestructorCatcher and AsyncCallable helper classes, as discussed in #144 (comment), and adds clang thread safety annotations, as discussed in #129.

This also cleans up disconnect exceptions to help resolve #123 by letting clients easily detect when they are trying to call a remote method after a connection has been closed. This change is part of the same PR because it depends on adding a ProxyContext::loop struct member, which is done in an earlier commit.
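
To illustrate the RAII reference counting idea described above, here is a minimal sketch. The `Loop` and `LoopRef` classes are hypothetical stand-ins written for this example, not the actual `EventLoop` and `EventLoopRef` implementations from this PR. The point is only that the count is incremented in a constructor and decremented in a destructor (or `reset()`), so callers cannot forget the decrement or skip it when an exception is thrown.

```cpp
#include <cassert>
#include <mutex>
#include <utility>

// Hypothetical stand-in for an event loop that tracks how many clients use it.
class Loop
{
public:
    int clients()
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_num_clients;
    }

private:
    friend class LoopRef;
    std::mutex m_mutex;
    int m_num_clients{0};
};

// Hypothetical RAII handle: holds one reference to a Loop for its lifetime.
class LoopRef
{
public:
    explicit LoopRef(Loop& loop) : m_loop(&loop)
    {
        std::lock_guard<std::mutex> lock(m_loop->m_mutex);
        ++m_loop->m_num_clients;
    }
    LoopRef(const LoopRef&) = delete;
    LoopRef& operator=(const LoopRef&) = delete;
    // Moving transfers the reference without changing the count.
    LoopRef(LoopRef&& other) noexcept : m_loop(std::exchange(other.m_loop, nullptr)) {}
    ~LoopRef() { reset(); }

    // Release the reference early; safe to call more than once.
    void reset()
    {
        if (Loop* loop = std::exchange(m_loop, nullptr)) {
            std::lock_guard<std::mutex> lock(loop->m_mutex);
            --loop->m_num_clients;
        }
    }

private:
    Loop* m_loop;
};
```

With a handle like this, the manual lock, addClient and removeClient sequence collapses to constructing and resetting one object, which is the same shape as the `m_loop_ref.emplace(*m_loop)` and `m_loop_ref.reset()` calls in the diff further down.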

@ryanofsky
Collaborator Author

Because this change removes the EventLoop addClient and removeClient methods, it will require an update to Bitcoin Core if it is merged.

--- a/src/ipc/capnp/protocol.cpp
+++ b/src/ipc/capnp/protocol.cpp
@@ -41,10 +41,7 @@ class CapnpProtocol : public Protocol
 public:
     ~CapnpProtocol() noexcept(true)
     {
-        if (m_loop) {
-            std::unique_lock<std::mutex> lock(m_loop->m_mutex);
-            m_loop->removeClient(lock);
-        }
+        m_loop_ref.reset();
         if (m_loop_thread.joinable()) m_loop_thread.join();
         assert(!m_loop);
     };
@@ -83,10 +80,7 @@ public:
         m_loop_thread = std::thread([&] {
             util::ThreadRename("capnp-loop");
             m_loop.emplace(exe_name, &IpcLogFn, &m_context);
-            {
-                std::unique_lock<std::mutex> lock(m_loop->m_mutex);
-                m_loop->addClient(lock);
-            }
+            m_loop_ref.emplace(*m_loop);
             promise.set_value();
             m_loop->loop();
             m_loop.reset();
@@ -96,6 +90,7 @@ public:
     Context m_context;
     std::thread m_loop_thread;
     std::optional<mp::EventLoop> m_loop;
+    std::optional<mp::EventLoopRef> m_loop_ref;
 };
 } // namespace

ryanofsky and others added 9 commits March 31, 2025 13:00
This commit makes mechanical changes needed to simplify an upcoming commit
which replaces EventLoop* with an EventLoopRef.

This change also happens to be useful on its own, so clientInvoke can
detect disconnections in a non-racy way
(bitcoin-core#123 (comment))
by seeing if the client Connection pointer is null while holding the event loop
mutex.
Use EventLoopRef to avoid reference counting bugs and be more exception safe
Now that the addClient and removeClient methods are only called in one place,
by the EventLoopRef class, they can be inlined.
Use kj::Function instead of std::function to avoid the need for AsyncCallable
and DestructorCatcher classes, which were used to work around the requirement
that std::function objects need to be copyable. kj::Function does not have this
requirement.

Change is from bitcoin-core#144 (comment)

Co-authored-by: Ryan Ofsky <[email protected]>
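
The copyability requirement mentioned in the commit message above can be illustrated with a small sketch. `MoveOnlyTask` is a hypothetical, simplified type-erased wrapper written for this example; it is not kj::Function and does not reproduce its interface. It only shows why a move-only function type can hold a lambda that captures a `std::unique_ptr`, something `std::function` cannot store because it requires a copyable callable.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical move-only wrapper for callables taking no arguments.
class MoveOnlyTask
{
    struct Base {
        virtual ~Base() = default;
        virtual void call() = 0;
    };
    template <typename F>
    struct Impl : Base {
        F f;
        explicit Impl(F&& fn) : f(std::move(fn)) {}
        void call() override { f(); }
    };
    std::unique_ptr<Base> m_impl;

public:
    // Takes the callable by value and moves it into heap storage, so the
    // callable itself only needs to be movable, not copyable.
    template <typename F>
    MoveOnlyTask(F f) : m_impl(std::make_unique<Impl<F>>(std::move(f))) {}

    void operator()() { m_impl->call(); }
};

// Usage: the lambda owns a unique_ptr, so it is movable but not copyable.
inline int runMoveOnlyExample()
{
    auto value = std::make_unique<int>(41);
    int result = 0;
    MoveOnlyTask task([v = std::move(value), &result]() { result = *v + 1; });
    MoveOnlyTask moved = std::move(task); // ownership transfers with the task
    moved();
    return result;
}
```

Replacing `MoveOnlyTask` with `std::function<void()>` in `runMoveOnlyExample` would fail to compile, which is the kind of restriction that helper classes such as AsyncCallable and DestructorCatcher were working around.
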
Add basic thread safety annotations to EventLoop. Use can be expanded and
deepened, but this puts basic functionality in place.

Use of annotations was discussed in
bitcoin-core#129 (comment)
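
As a rough illustration of what clang thread safety annotations look like, the sketch below annotates a small counter class. All names here (`EX_*` macros, `Mutex`, `Lock`, `Counter`) are hypothetical and chosen for this example; they are not the macros or classes used in this PR. The attributes expand to nothing on compilers other than clang, and the static checks only run when clang is invoked with `-Wthread-safety`.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical macro names; the attributes themselves are clang's.
#if defined(__clang__)
#define EX_TS_ATTR(x) __attribute__((x))
#else
#define EX_TS_ATTR(x) // no-op on other compilers
#endif
#define EX_CAPABILITY(x) EX_TS_ATTR(capability(x))
#define EX_SCOPED_CAPABILITY EX_TS_ATTR(scoped_lockable)
#define EX_GUARDED_BY(x) EX_TS_ATTR(guarded_by(x))
#define EX_REQUIRES(...) EX_TS_ATTR(requires_capability(__VA_ARGS__))
#define EX_ACQUIRE(...) EX_TS_ATTR(acquire_capability(__VA_ARGS__))
#define EX_RELEASE(...) EX_TS_ATTR(release_capability(__VA_ARGS__))
#define EX_NO_TSA EX_TS_ATTR(no_thread_safety_analysis)

// Annotated mutex wrapper, so the analysis does not depend on whether the
// standard library's std::mutex is itself annotated.
class EX_CAPABILITY("mutex") Mutex
{
public:
    void lock() EX_ACQUIRE() EX_NO_TSA { m_mutex.lock(); }
    void unlock() EX_RELEASE() EX_NO_TSA { m_mutex.unlock(); }

private:
    std::mutex m_mutex;
};

// Scoped lock that the analysis understands as holding the mutex.
class EX_SCOPED_CAPABILITY Lock
{
public:
    explicit Lock(Mutex& mutex) EX_ACQUIRE(mutex) EX_NO_TSA : m_mutex(mutex) { m_mutex.lock(); }
    ~Lock() EX_RELEASE() EX_NO_TSA { m_mutex.unlock(); }

private:
    Mutex& m_mutex;
};

class Counter
{
public:
    void increment()
    {
        Lock lock(m_mutex);
        incrementLocked();
    }
    int get()
    {
        Lock lock(m_mutex);
        return m_value;
    }

private:
    Mutex m_mutex;
    // Reading or writing m_value without holding m_mutex is flagged by clang.
    int m_value EX_GUARDED_BY(m_mutex){0};
    // Callers must already hold m_mutex; calling it unlocked is flagged.
    void incrementLocked() EX_REQUIRES(m_mutex) { ++m_value; }
};
```

The benefit is that locking rules which would otherwise live only in comments become compiler-checked, at the cost of needing annotated lock types.
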
Improve clientInvoke exceptions so IPC clients can more reliably detect when
they are calling a remote method after the connection is closed. Before this
change, different exceptions were thrown, which made this condition difficult to
detect and handle.
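
The disconnect detection described in these commit messages (checking whether the client Connection pointer is null while holding the event loop mutex, then throwing a single recognizable exception) could be sketched as follows. Every name in this block is hypothetical: `IpcError`, `ClientContext`, `invoke` and `disconnect` are written for illustration and are not the library's clientInvoke, ProxyContext, or ipc::Exception.

```cpp
#include <cassert>
#include <mutex>
#include <stdexcept>

// Hypothetical exception type that callers can catch to detect a disconnect.
struct IpcError : std::runtime_error {
    using std::runtime_error::runtime_error;
};

struct Connection {};

// Hypothetical per-client state: the pointer is cleared on disconnect and is
// only read or written while the mutex is held.
struct ClientContext {
    std::mutex mutex;
    Connection* connection{nullptr};
};

// Called by whatever notices that the connection has closed.
inline void disconnect(ClientContext& ctx)
{
    std::lock_guard<std::mutex> lock(ctx.mutex);
    ctx.connection = nullptr;
}

// Stand-in for a remote call. The null check happens under the same mutex
// that disconnect() takes, so the check cannot race with a disconnect.
inline int invoke(ClientContext& ctx)
{
    std::lock_guard<std::mutex> lock(ctx.mutex);
    if (!ctx.connection) {
        throw IpcError("IPC client method called after disconnect.");
    }
    return 1; // placeholder for the result of the remote call
}
```

A caller that wants to tolerate a dead peer, for example during shutdown, can then wrap the call in `try { invoke(ctx); } catch (const IpcError&) { /* peer is gone */ }` instead of matching on several unrelated exception types.
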
@ryanofsky ryanofsky changed the title refactor: EventLoop locking cleanups refactor: EventLoop locking cleanups and client Disconnect errors Apr 24, 2025
@ryanofsky ryanofsky changed the title refactor: EventLoop locking cleanups and client Disconnect errors refactor: EventLoop locking cleanups + client disconnect exceptions Apr 24, 2025
@ryanofsky ryanofsky changed the title refactor: EventLoop locking cleanups + client disconnect exceptions refactor: EventLoop locking cleanups + client disconnect exception Apr 24, 2025
@ryanofsky
Collaborator Author

ryanofsky commented Apr 24, 2025

Updated ce4814f -> b47ea9f (pr/eventlock.1 -> pr/eventlock.2, compare), making test and comment fixes, splitting commits, and adding connection disconnect tests and exceptions to help resolve #123

ryanofsky added a commit to ryanofsky/bitcoin that referenced this pull request Apr 24, 2025
Use EventLoopRef to avoid reference counting bugs and be more exception safe
and deal with removal of addClient/removeClient methods in
bitcoin-core/libmultiprocess#160

A test update is also required due to
bitcoin-core/libmultiprocess#160 to deal with changed
reference count semantics. In IpcPipeTest(), it is now necessary to destroy
the client Proxy object instead of just the client Connection object to
decrease the event loop reference count and allow the loop to exit so the test
does not hang on shutdown.
ryanofsky added a commit to ryanofsky/bitcoin that referenced this pull request Apr 24, 2025
This fixes an error reported by Antoine Poinsot <[email protected]> in
bitcoin-core/libmultiprocess#123 that does not happen
in master, but does happen with bitcoin#10102
applied, where if the child bitcoin-wallet process is killed (either by an
external signal or by Ctrl-C as reported in the issue) the bitcoin-node process
will not shut down cleanly after that because chain client flush()
calls will fail.

This change fixes the problem by handling ipc::Exception errors thrown during
the flush() calls, and it relies on the fixes to disconnect detection
implemented in bitcoin-core/libmultiprocess#160 to work
effectively.
@ryanofsky
Collaborator Author

Rebased b47ea9f -> f15ef6c (pr/eventlock.2 -> pr/eventlock.3, compare) so this can be merged cleanly in bitcoin PR bitcoin/bitcoin#32345. Also dropped the DisconnectError exception type so the original ipc::Exception can be used instead.

Development

Successfully merging this pull request may close these issues.

bitcoin-node segfaults when interrupted
2 participants