|
| 1 | +# Multiplayer Locks |
| 2 | + |
| 3 | +## Summary |
| 4 | +The game thread and network thread access shared data structures without proper synchronization. |
| 5 | +These race conditions can cause operations to disappear, and that causes the game to desync. |
| 6 | + |
| 7 | +## Details |
| 8 | +Despite locks and even lock-free concurrent data structures being [well-understood in the mid 1980s](https://en.wikipedia.org/wiki/Treiber_stack), P3 does not use correct locks anywhere. |
| 9 | + |
| 10 | +### Execute Current Operations |
| 11 | +The `execute_operations` function at `0x00546870` is not locking the current operations properly. |
| 12 | + |
| 13 | +The basic block at `0x005468B3` attempts to lock the current operations: |
| 14 | + |
| 15 | +This is not how locks work. |
| 16 | + |
| 17 | +### Insert Pending Operations |
| 18 | +The `insert_into_pending_operations_warpper` function at `0054AA70` is not locking the pending operations properly. |
| 19 | + |
| 20 | +The basic blocks at `0054AA79` attempt to lock the pending operations: |
| 21 | + |
| 22 | +This is not now locks work. |
| 23 | + |
| 24 | +### Client Ingress Queue |
| 25 | +The function at `0x0054B080` which moves operations from the ingress queue and the socket into the current operations is not locking the current operations properly at two locations. |
| 26 | + |
| 27 | +The basic block at `0x0054B13F` attempts to try-lock the current operations: |
| 28 | + |
| 29 | +This is not how locks work. |
| 30 | + |
| 31 | +The basic block at `0x0054B200` attempts to try-lock the current operations: |
| 32 | + |
| 33 | +This is not how locks work. |
| 34 | + |
| 35 | +### Client Pending Operations |
| 36 | +The function at `0x0054AFA0` which sends operations from the pending operations to the host is not locking the current operations properly: |
| 37 | + |
| 38 | +This is certainly not how locks work. |
| 39 | + |
| 40 | +### Host Egress Queue |
| 41 | +The function at `0x0054B670` which moves operations from the host's pending operations and the client sockets into the egress queue is not locking the pending operations properly: |
| 42 | + |
| 43 | +This is not how locks work. |
| 44 | + |
| 45 | +### Host Ingress Queue |
| 46 | +The function at `0x0054B960` which moves operations from the host's egress and ingress queues into the current operations is not locking the current operations properly: |
| 47 | + |
| 48 | +This is not how locks work. |
| 49 | + |
| 50 | +## Fix |
| 51 | +All bugs are fixed by the [fix-multiplayer-locks](https://github.com/P3Modding/p3-lib/tree/master/mod-fix-multiplayer-locks) mod. |
| 52 | + |
| 53 | +### Execute Current Operations |
| 54 | +To fix the problem at `0x005468B3` the following changes have to be made: |
| 55 | +- The "locking" basic blocks at `0x005468B3` must correctly lock the current operations. |
| 56 | +This can be achieved by inserting a call to a proper lock function. |
| 57 | +- The "unlocking" basic block at `0x00547254` must correctly unlock the current operations. |
| 58 | +This can be achieved by inserting a call to a proper unlock function. |
| 59 | + |
| 60 | +### Insert Pending Operations |
| 61 | +To fix the problem at `0x0054AA79` the following changes have to be made: |
| 62 | +- The "locking" basic blocks at `0x0054AA79` must correctly lock the pending operations. |
| 63 | +This can be achieved by inserting a call to a proper lock function. |
| 64 | +- The "unlocking basic block at `0x0054AAC2` must correctly unlock the pending operations. |
| 65 | +This can be achieved by inserting a call to a proper unlock function. |
| 66 | + |
| 67 | +### Client Ingress Queue |
| 68 | +To fix the problem at `0x0054B13F` the following changes have to be made: |
| 69 | +- The "try-locking" basic block at `0x0054B13F` must correctly try-lock the current operations, and continue into the basic block at `0x0054B14F` only if the lock was acquired. |
| 70 | +This can be achieved by replacing the two `mov` instructions with a call instruction to a proper try-lock function which returns the result. |
| 71 | +- The "unlocking" basic block at `0x0054B198` must unlock the current operations only if they were locked by the basic block at `0x0054B13F`. |
| 72 | +This can be achieved by replacing the `mov` instruction with a call instruction to a proper unlock function and making the `jnz` instruction target the next instruction after the call. |
| 73 | +The `cmp` instruction above it must be moved below it to ensure it always happens, so `jnz` must point to the moved `cmp`. |
| 74 | + |
| 75 | +To fix the problem at `0x0054B200` the following changes have to be made: |
| 76 | +- The "try-locking" basic block at `0x0054B200` must correctly try-lock the current operations, and continue into the basic block at `0x0054B210` only if the lock was acquired. |
| 77 | +This can be achieved by replacing the two `mov` instructions with a call instruction to a proper try-lock function which returns the result. |
| 78 | +- The "unlocking" basic block at `0x0054B21C` must unlock the current operations only if they were locked by the basic block at `0x0054B200`. |
| 79 | +This can be achieved by replacing the `mov` instruction with a call instruction to a proper unlock function and making the `jnz` instruction target the next instruction after the call. |
| 80 | + |
| 81 | +### Client Pending Operations |
| 82 | +To fix the problem at `0x0054AFB7` the following changes have to be made: |
| 83 | +- The "locking" basic block at `0x0054AFB7` must correctly lock the pending operations. |
| 84 | +This can be achieved by replacing the entire block and its successor with a call instruction to a proper lock function. |
| 85 | +- The two "unlocking" branches at `0x0054B049` and `0x0054B063` must unlock the pending operations. |
| 86 | +This can be achieved by replacing the respective `mov` instruction with a call instruction to a proper unlock function. |
| 87 | + |
| 88 | +### Host Egress Queue |
| 89 | +To fix the problem at `0x0054B90D` the following changes have to be made: |
| 90 | +- The "locking" basic block at `0x0054B90D` must correctly lock the pending operations. |
| 91 | +This can be achieved by inserting a call to a proper lock function. |
| 92 | +- The "unlocking" instruction at `0x0x0054B949` must correctly unlock the pending operations. |
| 93 | +This can be achieved by inserting a call to a proper unlock function. |
| 94 | + |
| 95 | +### Host Ingress Queue |
| 96 | +To fix the problem at `0x0054BCCB` the following changes have to be made: |
| 97 | +- The "try-locking" basic block at `0x0054BCCB` must correctly try-lock or lock the current operations, and continue into the basic block at `0x0054BCD9` only if the lock was acquired. |
| 98 | +This can be achieved by inserting a call to a proper lock function. |
| 99 | +- The "unlocking" instruction at `0x0054BD2C` must correctly unlock the current operations if they were locked. |
| 100 | +If the "try-lock" was replaced with a lock, this can be achieved by inserting a call to a proper unlock function. |
0 commit comments