RFC 2: Add Fiber::ExecutionContext::MultiThreaded
#15517
Conversation
Introduces the second EC scheduler that runs in multiple threads. Uses the thread-safe queues (Runnables, GlobalQueue). Unlike the ST scheduler, the MT scheduler needs to actively park the thread in addition to waiting on the event loop, because only one thread is allowed to run the event loop.
f5e466e to 7bcff33
private def start_schedulers(hijack)
  @size.end.times do |index|
    @schedulers << Scheduler.new(self, "#{@name}-#{index}")
  end
end
question: Why do we initialize all schedulers from the start? We only start the minimal number of threads, and we might never actually need all schedulers when the number of threads doesn't reach the max.
Could we lazily initialize?
To avoid parallelism issues? For example in relation to stealing: we can expect the schedulers to exist, whether or not they're used. Maybe we could lazily initialize; that will need some experimentation.
Steal is bounded by @schedulers.size and the array should never shrink, so accessing should be fine. The worst that can happen is missing out on stealing from a scheduler if it was just added after we took the size. But that should not be an issue.
Either way, we should document why the code works like it does so we can be reminded the next time we look at it.
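For illustration, a hedged sketch of what such a documented steal loop could look like. Every name in it (steal, Scheduler, @schedulers, runnables.shift?) is an assumption made for this example, not the actual implementation:

# Try to grab a runnable fiber from another scheduler of the same context.
#
# Why reading `@schedulers` without the mutex is OK: the iteration is bounded
# by the size read once at the top of the loop, and the array only ever
# grows. At worst we miss a scheduler that was appended after we read the
# size, which merely delays the steal until the next attempt.
protected def steal(current : Scheduler) : Fiber?
  size = @schedulers.size
  size.times do |i|
    scheduler = @schedulers[i]
    next if scheduler.same?(current)
    # runnables is assumed here to be the scheduler's thread-safe queue
    if fiber = scheduler.runnables.shift?
      return fiber
    end
  end
  nil
end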
We'd only be mutating the array during #initialize, then later inside a mutex, so it's safe. But we're accessing @schedulers outside of a mutex when stealing (for obvious reasons), so there's no telling how the compiler or a weakly ordered CPU like ARM will reorder the writes when pushing to the array (the size could be updated before the value is written). For example:
- T1: mutex.lock (ensures following writes can't happen before)
- T1: schedulers.@size += 1 (happens first)
- T2: execution_context.steal
- T2: schedulers.last => 💥
- T1: schedulers.@buffer[size] = scheduler (too late)
- T1: mutex.unlock (ensures above writes did happen)
Yes, I'm paranoid, but if something can happen, it will happen. The above shouldn't happen on x86 unless the LLVM optimizers decide to reorder, but it will happen on ARM.
To be safe, we'd need an array-like object that uses an atomic or fence to update the size after the value has been set. It could be used for both @schedulers and @threads.
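A minimal sketch of such an array-like object, assuming Crystal's Atomic(Int32) with explicit memory orderings; the class name, the fixed capacity, and the single-writer assumption are illustration-only choices, not the actual implementation:

# Append-only array that publishes its size with release semantics: the value
# is written into the buffer *before* the size is bumped, so a reader that
# loads the size (acquire) is guaranteed to see fully written values.
class AtomicSizeArray(T)
  def initialize(capacity : Int32)
    @capacity = capacity
    @buffer = Pointer(T).malloc(capacity)
    @size = Atomic(Int32).new(0)
  end

  # Writers are expected to hold the context's mutex (single writer at a time).
  def push(value : T) : Nil
    size = @size.get(:relaxed)
    raise IndexError.new if size >= @capacity
    @buffer[size] = value          # 1. write the value
    @size.set(size + 1, :release)  # 2. then publish the new size
  end

  def size : Int32
    @size.get(:acquire)
  end

  # Safe without the mutex for any index below a previously read `size`.
  def unsafe_fetch(index : Int32) : T
    @buffer[index]
  end
end

The only essential part is the ordering of the two steps in push; whether it wraps a fixed buffer or reallocates is a separate concern.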
Note: @threads isn't affected because calls to @threads.size are purely informational; we don't use it to iterate or access a thread, we only do that inside the mutex.
I sanitized the range handling, fixed the exclusive range/capacity handling, and wrote a few specs to assert the behavior. I also dropped a number of unused args.
The fixed range normalization and capacity checks should now be correct 🤞
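For context, a hedged sketch of what such a normalization could look like, assuming the context accepts either a fixed Int32 or a Range for its size; the method name and error messages are assumptions for illustration, not the actual code:

# A fixed Int32 becomes an inclusive single-value range.
private def normalize_size(size : Int32) : Range(Int32, Int32)
  raise ArgumentError.new("invalid size: #{size}") if size < 1
  size..size
end

# An exclusive range is converted so that the range's end really is the
# capacity (the number of schedulers to preallocate).
private def normalize_size(size : Range(Int32, Int32)) : Range(Int32, Int32)
  last = size.excludes_end? ? size.end - 1 : size.end
  raise ArgumentError.new("invalid size: #{size.inspect}") if size.begin < 1 || last < size.begin
  size.begin..last
end

# normalize_size(4)     # => 4..4
# normalize_size(1...9) # => 1..8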
Fiber::ExecutionContext::MultiThreaded
Introduces the last EC scheduler, which runs in multiple threads, with work stealing so any thread can resume any runnable fiber in the context (no more starving threads).
Unlike the ST scheduler, the MT scheduler needs to actively park threads since only one thread in the context can run the event loop (no parallel runs).
Having a single event loop for the whole context, instead of one per thread, avoids situations where fibers wait in an event loop but aren't processed because that particular thread happens to be busy, causing delays. With a single event loop, as soon as a thread is starving it can check the event loop and enqueue the runnable fibers, which can be immediately resumed (and stolen).
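A hedged sketch of that flow from a scheduler's point of view; all names here (@runnables, @execution_context.steal, @event_loop, the Atomic::Flag guard, the run(blocking:) call) are assumptions for illustration, not the actual API:

# What a starving thread roughly does before parking: drain the local queue,
# try to steal, then, if no other thread is already doing so, run the
# context's single event loop and pick up whatever it made runnable.
private def find_next_runnable : Fiber?
  # 1. local queue first
  if fiber = @runnables.shift?
    return fiber
  end

  # 2. steal from sibling schedulers in the same context
  if fiber = @execution_context.steal(self)
    return fiber
  end

  # 3. only one thread at a time may run the event loop
  if @event_loop_flag.test_and_set
    begin
      @event_loop.run(blocking: true)
    ensure
      @event_loop_flag.clear
    end
    # fibers woken by the event loop were enqueued and may already have been
    # stolen by other threads; take one if it's still there
    return @runnables.shift?
  end

  # 4. nothing to do: the caller parks the thread until it gets woken up
  nil
end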
NOTE: we can start running the specs in this context, though they can sometimes segfault. Maybe because of issues in spec helpers that used to assume fibers wouldn't switch, or maybe issues in the stdlib for the same reason (for example libxml).
Kept in draft until #15511 and #15513 are merged.
Refs #15342