Mutex is not safe on multi-core systems #12
Comments
13: Add note about mutex unsafety on multi-core systems r=japaric a=adamgreig See #12. Co-authored-by: Adam Greig <[email protected]>
Spinlocks require a CAS operation so it's not possible to provide this on ARMv6-M. I don't really see a Cargo feature as an option. Enabling a Cargo feature should not break code so neither of these are valid / proper uses:
My proposal would be to remove the |
I agree that removing Mutex entirely is the best option but I think we should wait until rust-embedded/wg#294 is at least somewhat resolved. |
Since this API is still not stable, how about renaming the current Mutex<> to something that makes it clear it is safe only on single-core systems? As you said, multi-core systems are a minority, so depending on the app, this could be enough.
If our APIs have a clear safety boundary, we can have different implementations on different systems, and some might even be missing; I would consider that acceptable. |
I’ve been doing a bit of research into this topic this morning. Generally speaking, multi-core synchronization requires hardware support, but perhaps one of these software implementations is feasible? I am fairly convinced that an implementation of Mutex does not belong in this hardware-agnostic crate, but perhaps having a trait here, to be implemented in hardware-specific crates, might make sense. However, in that case, I would argue that a |
Yes, something like a spinlock can be implemented without CAS if you know the number of competing parties beforehand. That's what I did with The drawback is that spinlocks can easily lead to deadlock when used across interrupts (
Agreed. Right now we have one in |
After further research on the software implementations, all of them do require a memory barrier. I checked the Cortex-M0, which does not have hardware mutex support, and it does have a memory barrier instruction. I don’t know if that’s something we can expect from every MCU though.
While that is a phenomenal property of a mutex, it’s not one I expect to be there. In general, mutexes do not guarantee that deadlocks can’t happen, and neither does Rust. The guarantee is that race conditions cannot occur. I was unaware of the That still leaves the question of what to do with the mutex in this crate. I’m a bit concerned about the impact of removing it before another alternative is widely available. Even once another is available, the books will need to be updated accordingly. |
Every MCU in scope of embedded Rust has a memory barrier implementation. I haven't checked those implementations but I'd be very surprised if you would not need a CAS. Typically CAS free algorithms assume the absence of hardware interrupts.
The problem is that in the presence of interrupt handlers, deadlocks through the use of e.g. spinlocks are much more likely and cannot be prevented or checked at compile time. This is in stark contrast to a regular operating system, so it is a somewhat important property. |
I’m not sure what this means. Isn’t “every MCU in scope of embedded Rust” simply every MCU? Just because there isn’t support now doesn’t mean there won’t be support in the future.
It may be worthwhile to research. The first software implementation listed, Dekker’s algorithm, indicates that a spin lock can be implemented without even a test-and-set instruction, let alone a compare-and-swap. That algorithm has some serious limitations (it only works for 2 processes), but it does seem worthwhile to look into how a software implementation may be provided as a fallback for MCUs that don’t have mutex primitive instructions. Much like the But I digress. There seems to be a fair bit of agreement that |
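To make the Dekker's algorithm suggestion concrete, here is a minimal sketch of a two-party lock built only from atomic loads and stores, with no test-and-set or compare-and-swap. It is an illustration under stated assumptions, not a proposal for this crate: the names `DekkerLock`, `Shared` and `run_dekker_demo` are hypothetical, the demo uses host threads from `std` rather than MCU cores, and every access uses sequentially consistent ordering, which stands in for the memory barriers mentioned earlier in the thread.

```rust
use std::cell::UnsafeCell;
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

/// Dekker's algorithm for exactly two participants with ids 0 and 1.
/// Hypothetical type for illustration; only atomic loads and stores are used.
pub struct DekkerLock {
    wants: [AtomicBool; 2], // wants[i] is true while participant i wants the lock
    turn: AtomicUsize,      // which participant wins when both want the lock
}

impl DekkerLock {
    pub const fn new() -> Self {
        DekkerLock {
            wants: [AtomicBool::new(false), AtomicBool::new(false)],
            turn: AtomicUsize::new(0),
        }
    }

    /// Acquire the lock as participant `me` (must be 0 or 1).
    pub fn lock(&self, me: usize) {
        let other = 1 - me;
        self.wants[me].store(true, Ordering::SeqCst);
        while self.wants[other].load(Ordering::SeqCst) {
            if self.turn.load(Ordering::SeqCst) != me {
                // Not our turn: back off until the other side hands over.
                self.wants[me].store(false, Ordering::SeqCst);
                while self.turn.load(Ordering::SeqCst) != me {
                    std::hint::spin_loop();
                }
                self.wants[me].store(true, Ordering::SeqCst);
            }
            std::hint::spin_loop();
        }
    }

    /// Release the lock as participant `me` and give the other side priority.
    pub fn unlock(&self, me: usize) {
        self.turn.store(1 - me, Ordering::SeqCst);
        self.wants[me].store(false, Ordering::SeqCst);
    }
}

/// A plain (non-atomic) counter that is only touched while the lock is held.
struct Shared {
    lock: DekkerLock,
    counter: UnsafeCell<u64>,
}

// Sound only because every access to `counter` happens under `lock`.
unsafe impl Sync for Shared {}

/// Two threads each increment the counter `iterations` times under the lock.
pub fn run_dekker_demo(iterations: u64) -> u64 {
    let shared = Arc::new(Shared {
        lock: DekkerLock::new(),
        counter: UnsafeCell::new(0),
    });
    let handles: Vec<_> = (0..2usize)
        .map(|me| {
            let shared = Arc::clone(&shared);
            thread::spawn(move || {
                for _ in 0..iterations {
                    shared.lock.lock(me);
                    unsafe { *shared.counter.get() += 1 };
                    shared.lock.unlock(me);
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    unsafe { *shared.counter.get() }
}
```

The limitations raised in the thread still apply to this sketch: it supports exactly two participants whose ids are fixed in advance, it busy-waits, and taking it from an interrupt handler that preempts the current holder on the same core would spin forever.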
It means exactly what I said: every currently supported MCU can do memory barriers. There may be some which are problematic in that respect, but I don't know which ones. I doubt those can be supported in Rust, just as I doubt some will be supported even if it is technically possible, but you're right that this is speculation.
We cannot remove it unless we have an established and working and supported replacement. This Mutex is used pretty much everywhere. |
Let's not be too theatrical here. https://github.com/search?l=Rust&q=bare_metal%3A%3AMutex&type=Code Out of the 44 repositories returned in the search above, very few are actually using I'd also like to point out that I don't think anyone here is talking about outright deleting this Mutex implementation. I admit that "remove" was a poor choice of words on my part. "Remove" is in the context of "remove it from this crate". I would expect that this "good enough for many single core use cases" implementation would move to its own crate ( |
If you're going to argue with me about essentials and call me theatrical please get at least your data straight: |
There's no reason to be upset. Let's just take a breath here. You're absolutely correct. There are significantly more usages of This could still be easily handled by creating a new crate with the critical section mutex that, with the exception of you @therealprof, people don't seem to believe belongs in this crate. Of course, none of this solves the fact that this Mutex is not safe on multi-core systems, but there also seems to be some consensus that it's dubious at best to think that a reasonable multi-core safe Mutex can be implemented without hardware support. |
Only a small fraction of people actually chimed in here, so it's a bit early to make such statements. Indeed I don't have any issues with the We still don't have the ability to do something like a crater run to ensure that we're not accidentally causing major damage to the ecosystem, so I'd rather we tread with extreme caution.
Indeed. |
@therealprof I was looking into this again this morning. It is possible to implement a different mechanism to provide the https://gist.github.com/rubberduck203/20415cb0bdc0726b2ebf0903e7193665 |
Just to be clear, the lock I linked to isn’t sound either, it should use a compare and swap, not an exchange, but is just to prove out that sound methods of providing a lock for the existing mutex can be implemented. |
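For reference, a spinlock built on compare-and-swap could look roughly like the sketch below. This is not the code from the linked gist; `SpinLock` and its methods are hypothetical names, and the only real API used is `AtomicBool::compare_exchange` from `core::sync::atomic`. It assumes a target that actually provides a CAS operation.

```rust
use core::sync::atomic::{AtomicBool, Ordering};

/// Minimal CAS-based spinlock (hypothetical type, for illustration only).
pub struct SpinLock {
    locked: AtomicBool,
}

impl SpinLock {
    pub const fn new() -> Self {
        SpinLock {
            locked: AtomicBool::new(false),
        }
    }

    /// Try to take the lock once; returns true if this caller now holds it.
    pub fn try_lock(&self) -> bool {
        // Succeeds only if the flag was `false` and we flipped it to `true`.
        self.locked
            .compare_exchange(false, true, Ordering::Acquire, Ordering::Relaxed)
            .is_ok()
    }

    /// Spin until the lock is acquired.
    pub fn lock(&self) {
        while !self.try_lock() {
            core::hint::spin_loop();
        }
    }

    /// Release the lock. The caller must currently hold it.
    pub fn unlock(&self) {
        self.locked.store(false, Ordering::Release);
    }
}
```

Calling `lock` from an interrupt handler while the interrupted code on the same core holds the lock spins forever, which is the deadlock hazard mentioned earlier in the thread.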
Yeah, this has been discussed back and forth. The problem is that spinlocks are not ideal either, for other reasons, and this implementation will not work on e.g. any Cortex-M0 or M0+ because they don't have CAS instructions, so it's not a universally applicable approach. |
I think that’s the point. There is no universal approach, but the existing Since the |
There seems to be some confusion about what
I'll improve the docs of |
That’s a good idea @jonas-schievink. It took me quite a minute to completely grok how the two interact, and the guarantee that |
I am of the opinion that we're still thinking in terms of ARM cores, or even just ARM Cortex-M cores. An SoC with hybrid cores could mix Cortex-A and Cortex-M cores, or even non-ARM cores such as RISC-V, so I expect there must be a hardware peripheral that can properly implement synchronization across cores. As I suggested in my 2019 Oxidize Conf presentation (https://www.youtube.com/watch?v=IKXrNlXXfL4#t=29m11s), a trait for such hardware-enabled mechanisms is desirable. |
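As a rough illustration of what such a trait might look like, here is a sketch in which a hardware-agnostic crate defines the interface and a hardware-specific crate implements it on top of its synchronization peripheral. Everything here is hypothetical: `HardwareMutex`, `with_lock` and `MockSemaphore` are invented names, and the mock uses an `AtomicBool` purely as a stand-in for a real hardware semaphore register so the example can run on a host.

```rust
use core::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical trait a hardware-agnostic crate could define.
/// Hardware-specific crates would implement it using their own peripheral.
pub trait HardwareMutex {
    /// Attempt to take the lock once; returns true on success.
    fn try_lock(&self) -> bool;

    /// Release the lock. The caller must currently hold it.
    fn unlock(&self);

    /// Run `f` while holding the lock, spinning until it becomes available.
    fn with_lock<R>(&self, f: impl FnOnce() -> R) -> R {
        while !self.try_lock() {
            core::hint::spin_loop();
        }
        let result = f();
        self.unlock();
        result
    }
}

/// Stand-in for a hardware semaphore; a real implementation would read and
/// write a peripheral register instead of an `AtomicBool`.
pub struct MockSemaphore {
    taken: AtomicBool,
}

impl MockSemaphore {
    pub const fn new() -> Self {
        MockSemaphore {
            taken: AtomicBool::new(false),
        }
    }
}

impl HardwareMutex for MockSemaphore {
    fn try_lock(&self) -> bool {
        // `swap` returns the previous value: false means we acquired it.
        !self.taken.swap(true, Ordering::Acquire)
    }

    fn unlock(&self) {
        self.taken.store(false, Ordering::Release);
    }
}
```

Keeping only `try_lock` and `unlock` as required methods leaves the choice of mechanism (hardware semaphore, exclusive load/store, mailbox) to the implementing crate.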
That would be by |
Actually you might not even need this if the cores have their own peripherals and a shared mutex peripheral. |
I’d like to be clear, I referenced the ARM paper, but the problem & solution are the same for any platform. |
This issue was effectively closed by rust-embedded/wg#419; the Mutex in bare-metal is only considered sound on single-core systems and some other abstraction will be required for multi-core systems. |
On a multi-core system, disabling interrupts does not prevent the other core from operating, and so values protected by a bare_metal::Mutex will be incorrectly marked Sync.
Since the overwhelming majority of embedded use cases are single-core, I propose putting a prominent warning in the Mutex docstring for now, and working to develop a safe multi-core extension to the Mutex which can be enabled with a feature gate. Probably something using an atomic to implement a spinlock on top of requiring a CriticalSection.
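The problem can be seen in a simplified model of the types involved. The sketch below is written from memory of the crate's general shape and is an assumption, not a copy of its source: a `CriticalSection` token stands for "interrupts are disabled on this core", and `Mutex::borrow` hands out a shared reference to anyone holding such a token.

```rust
use core::cell::UnsafeCell;

/// Token meaning "interrupts are disabled on the current core".
/// Simplified model for illustration, not the crate's actual source.
pub struct CriticalSection {
    _private: (),
}

impl CriticalSection {
    /// # Safety
    /// The caller must guarantee interrupts are disabled on this core
    /// for as long as the token exists.
    pub unsafe fn new() -> Self {
        CriticalSection { _private: () }
    }
}

/// Grants access to its contents to anyone holding a `CriticalSection`.
pub struct Mutex<T> {
    inner: UnsafeCell<T>,
}

impl<T> Mutex<T> {
    pub const fn new(value: T) -> Self {
        Mutex {
            inner: UnsafeCell::new(value),
        }
    }

    /// Borrow the data for the lifetime of the critical section token.
    pub fn borrow<'cs>(&'cs self, _cs: &'cs CriticalSection) -> &'cs T {
        unsafe { &*self.inner.get() }
    }
}

// This is the impl in question. It is justified on a single core, where
// holding a token means nothing else can run. On a multi-core system each
// core can hold its own token at the same time, so two cores may borrow a
// non-Sync `T` (for example a `Cell`) concurrently.
unsafe impl<T: Send> Sync for Mutex<T> {}
```

In this model the token only says something about the core that created it, which is why a second core with its own token can reach the same data at the same time.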