cpu/powerpc: implement reservation for lwarx and stwcx instruction pair#15082
gm-matthew wants to merge 2 commits into mamedev:master from
I guess: |
|
Isn’t this flawed because you’re operating on logical addresses? These operations are used when multiple devices have access to the same memory. Other devices won’t be aware of the CPU’s address translation setup. Since the main purpose of this pair of instructions is to implement atomic operations in a system where multiple devices can access the same memory, how do you plan to support this? I imagine this would be important for fixing seemingly random issues in systems with multiple CPUs and/or bus mastering PCI devices. Also, there should be a way for implementation classes to set the cache line size. |
At the moment this commit works in the event of an interrupt occurring between the There are two alternative approaches I can think of:
This is probably a good idea and should not be too difficult as long as we can find out the cache block size (not cache line, my mistake) for each PowerPC device currently emulated. |
|
Out of curiosity, isn't listening for modifications to a particular address-space range precisely what memory taps are useful for? https://docs.mamedev.org/techspecs/memory.html#taps Install a tap upon |
That won’t work, because in MAME the other CPU will be accessing the memory through its own address space, not your address space (e.g. the same RAM installed in both CPUs’ address spaces with |
|
fvipers2 and vs298 can enter test mode; swtrilgy can't, but that appears to be the case even without the patch, so this patch changes nothing in that regard. Neither of them exits test mode cleanly |
|
Oh, and BTW there is only a |
Sorry I meant |
|
those are unchanged by this PR, fvipers2 crashes MAME, vs298 shows an onscreen error. |
In the model3 and the powermac cases I don't think there's any shared RAM with other CPUs or devices; in practice it's more a protection against interrupts, including preemption. So in that specific case a tap would do quite OK. |
|
FWIW, this is a slightly more strict version of what both Supermodel and DingusPPC do right now, except Dingus does at least translate the address to physical. (Neither of them mask the cache line size off the address, but that's apparently sufficient to boot OS X). So I wouldn't be mad about making this do the logical to physical translation, adding some sufficiently scary comments, and checking this in more or less as-is as long as there's a nice TODO. The rough sketch in my head of doing this correctly and MAME-idiomatically is to have a device that maintains a list of active reserved physical memory blocks using your favorite fast-lookup-on-random-key data structure. The device is configured with the block size and masks incoming addresses accordingly. The API should just be reserve a block, remove the reservation on a block, and check if a block is reserved. CPUs that have multiprocessor bus primitives (68K, PPC, and i960 off the top of my head, but there are likely more) would have a configurable |
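To make the reserve / remove / check API described above a bit more concrete, here is a minimal sketch. It is not MAME code: the class name `reservation_table`, its method names, and the choice of `std::unordered_set` as the fast-lookup structure are all assumptions for illustration, and it assumes the block size is a power of two and that callers pass physical addresses.

```cpp
#include <cstdint>
#include <unordered_set>

// Hypothetical reservation tracker (not an existing MAME device).
// Assumes block_size is a power of two and addresses are physical.
class reservation_table
{
public:
    explicit reservation_table(std::uint32_t block_size)
        : m_mask(~(block_size - 1))
    {
    }

    // Record a reservation on the block containing phys.
    void reserve(std::uint32_t phys) { m_blocks.insert(phys & m_mask); }

    // Drop the reservation on the block containing phys, if any.
    void release(std::uint32_t phys) { m_blocks.erase(phys & m_mask); }

    // True if the block containing phys is currently reserved.
    bool is_reserved(std::uint32_t phys) const
    {
        return m_blocks.count(phys & m_mask) != 0;
    }

private:
    std::uint32_t m_mask;                       // masks off the offset within a block
    std::unordered_set<std::uint32_t> m_blocks; // block base addresses with a live reservation
};
```

With a 32-byte block size, `reserve(0x1004)` followed by `is_reserved(0x101f)` would report true, while `is_reserved(0x1020)` would not, because the incoming addresses are masked to the block base before lookup.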
|
(Disclaimer: my PowerPCjutsu isn’t as sharp as it once was, due to atrophy from disuse, but I’m still pretty confident on most of this.) The reservation granularity for The CPUs support a single reservation, which operates on physical addressing (i.e. the address after address translation). A misaligned Using
The reservation is cancelled when any of the following occur:
Well-behaved interrupt handlers and process switching code will explicitly clear any reservation before returning (usually by executing a See, for example, the recommendations in the 603 user manual section 4.4:
Or footnote 2 of this blog post by Raymond Chen:
(Note that that blog post incorrectly states that, “If you attempt to store back to a location different from the most recent preceding Simply writing to another location within the reservation granularity from the same CPU will not cancel the reservation. If it did, it would be impossible to write future-proof code, as future architecture implementations can increase the size of the reservation granularity (and have done so). |
|
I decided to whip out my PowerBook 1400cs so I could test the Are there any drivers in MAME using a PowerPC that have other devices that share RAM with the PowerPC? |
|
konami/konamim2.cpp is a dual PPC602 SMP setup. apple/macpdm.cpp supports bus mastering NuBus cards with a PPC601 in principle. konami/cobra.cpp has a grab bag of PowerPC CPUs (PPC603e main CPU, PPC403 system controller, PPC604 T&L processor) with plenty of DMA capabilities, but emulation is very incomplete. be/bebox.cpp is a dual PPC603 SMP setup, but I don’t know how complete emulation is. |
|
Bebox is very incomplete, apart from using legacy PCI & SCSI. For starters it doesn't hook up the ISA bus and the Super I/O, which means it can't even output to the debug port. Then AFAIK none of the boot CDs actually boots. Not sure if it was supposed to do more, but TBH it's in a kind of limbo where rewriting from scratch is probably for the best (note: it reuses the MPC105, as does Sega Model 3) |
…t reserved it does not cancel the reservation
|
I've made it so that writing to a reserved block from the same CPU that reserved it no longer cancels the reservation, as per my test on an actual PowerPC. The Out of curiosity I looked at the code for Xenia which has to emulate multiple PowerPC cores, and its approach for I think attempting to implement it so that any other device writing to a reserved memory block cancels the reservation would be a lot of effort for not much benefit; it would be much easier to either force the PowerPC to keep executing until the reservation is cleared (thus ensuring that nothing else can touch the reserved block), or make it so that |
|
Look, I don’t think this implementation can work:
It’s similar for |
|
Something is unclear to me. The CPU does not snoop every write to the address, only another stwcx can invalidate the reservation? |
If a particular CPU uses |
|
And if another cpu or a dma does it? |
Then the cache block containing the reserved value is invalidated, resulting in the reservation also being invalidated. |
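A rough sketch of that behaviour, as discussed in this thread: a write from another CPU or a DMA agent to the reserved block drops the reservation, while a write from the CPU that holds it does not. Everything here (`shared_bus`, `cpu_reservation`, the writer-index convention) is hypothetical and only models the bookkeeping, not real cache coherency or anything in MAME.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-CPU reservation state.
struct cpu_reservation
{
    bool valid = false;
    std::uint32_t block = 0;
};

// Hypothetical bus model: every write is tagged with the index of the agent
// that issued it. Indices >= cpu_count stand for non-CPU agents such as DMA.
class shared_bus
{
public:
    shared_bus(std::size_t cpu_count, std::uint32_t block_size)
        : m_mask(~(block_size - 1)), m_res(cpu_count)
    {
    }

    // lwarx side: each CPU holds at most one reservation.
    void reserve(std::size_t cpu, std::uint32_t phys)
    {
        m_res[cpu].valid = true;
        m_res[cpu].block = phys & m_mask;
    }

    // Any write: cancels matching reservations held by *other* agents only.
    void write(std::size_t writer, std::uint32_t phys)
    {
        for (std::size_t i = 0; i < m_res.size(); i++)
            if (i != writer && m_res[i].valid && m_res[i].block == (phys & m_mask))
                m_res[i].valid = false;
    }

    // stwcx side: succeeds only if this CPU's reservation survived; the
    // reservation is consumed either way, and a successful store is itself a
    // write that other CPUs observe.
    bool store_conditional(std::size_t cpu, std::uint32_t phys)
    {
        const bool ok = m_res[cpu].valid;
        m_res[cpu].valid = false;
        if (ok)
            write(cpu, phys);
        return ok;
    }

private:
    std::uint32_t m_mask;
    std::vector<cpu_reservation> m_res;
};
```

In this model, if two CPUs reserve the same block, the first successful `store_conditional` cancels the other CPU's reservation, so the second one fails and would have to retry its load-modify-store loop.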
Yes, I'm currently working on moving the logic over to the |
PowerPC has the instructions lwarx and stwcx which are used to perform atomic update (load-modify-store) operations. lwarx loads a word and creates a reservation for a specific memory block; any write to this block will cause the reservation to be lost. stwcx checks if the reservation is still valid, and if it is, it writes the updated memory value, clears the reservation and sets the EQ bit of cr0 to signify success. If it is not valid, it clears the EQ bit to signify failure and no write is performed.
The size of the reserved memory block is implementation-dependent, generally matching the cache line size. For many early PowerPC models, including most of those emulated in MAME, the reservation size is 32 bytes so that is what I have implemented.
This commit fixes a bug in several Model 3 games that causes them to freeze as a result of the previous incorrect implementation of the lwarx and stwcx instructions.
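
For readers who want the semantics in code form, here is a toy model of the description above. It is only a sketch and not the code in this PR: the struct `ppc_reservation_model`, its member names, and the map-backed memory are made up for illustration. It follows the description literally (any write to the reserved 32-byte block drops the reservation), and it deliberately leaves open which agents' writes should count and whether stwcx should also compare addresses, since both points are discussed in the conversation.

```cpp
#include <cstdint>
#include <unordered_map>

// Hypothetical single-CPU model of the lwarx/stwcx pair (illustrative only).
struct ppc_reservation_model
{
    // 32-byte reservation block, as stated in the description.
    static constexpr std::uint32_t BLOCK_MASK = ~std::uint32_t(31);

    std::unordered_map<std::uint32_t, std::uint32_t> memory; // toy word store
    bool reserved = false;
    std::uint32_t reserved_block = 0;
    bool cr0_eq = false;

    // lwarx: load a word and reserve the block that contains it.
    std::uint32_t lwarx(std::uint32_t addr)
    {
        reserved = true;
        reserved_block = addr & BLOCK_MASK;
        return memory[addr];
    }

    // Ordinary write: a write into the reserved block loses the reservation.
    void write_block(std::uint32_t addr, std::uint32_t data)
    {
        if (reserved && (addr & BLOCK_MASK) == reserved_block)
            reserved = false;
        memory[addr] = data;
    }

    // stwcx: store only if the reservation is still valid, set or clear
    // cr0[EQ] accordingly, and clear the reservation in either case.
    void stwcx(std::uint32_t addr, std::uint32_t data)
    {
        if (reserved)
        {
            memory[addr] = data;
            cr0_eq = true;
        }
        else
        {
            cr0_eq = false;
        }
        reserved = false;
    }
};
```

A typical atomic increment would call `lwarx`, add one, call `stwcx`, and loop back to the `lwarx` whenever `cr0_eq` comes back false.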