-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FFICallbackTrampoline has unsafe handling of SP/CSP #60285
Comments
We ran into this when we recently refactored our system to instead of running FFICallbackTrampolineStub in the simulator, instead run it on the CPU (so it could do the temporary isolate creation) before transitioning to the simulator to run _FFICallbackFoo (per callback stubs generated by the frontend afaict). But the only things we copy into the simulator are THR (for the case of a temporary isolate), the argument registers, and the top frame from the CSP. The only things we copy out are the argument/return registers. This was then crashing for us in EnterSafePointStub in the --use-slow-path case, since it seemed to be assuming that SP was set up (in our case it was 0 since we hadn't copied it out of the simulator since the simulator has a different stack and typically subroutines don't depend on temp variables from other subroutines as this one is). |
Conceptually this sounds reasonable. 👍 cc @liamappelbe who last touched the trampolines.
@rmacnak-google All the FFI callback tests pass on |
I think the only case where this type of code fails is if we go slow case and before we update CSP we hit a signal and the distance between SP value and (stale) CSP is larger than 16 words (which is documented size of the redzone). So you need to hit the signal precisely here: __ EnterFrame(0);
__ PushRegisters(all_registers);
// vvv
__ mov(CALLEE_SAVED_TEMP, CSP);
__ mov(CALLEE_SAVED_TEMP2, SP);
__ ReserveAlignedFrameSpace(0);
// ^^^ ...
__ mov(CSP, SP); Which I guess almost never happens? |
Wouldn't one be in danger the moment the EnterFrame(0) happens? That's when SP has then moved further than CSP since at the start of this function SP == CSP (at least when called from the FFICallbackTrampoline). [edit] nm, I guess the "redzone" implies that those 16 bytes are safe, but rather pushing all the registers is what actually gets us in trouble? |
My implementation uses a different stub that does not run under the simulator.
Where is this documented? The Procedure Call Standard for ARM 64-bit Architecture gives the stack constraint
which I read as specifying there is no redzone. |
Conflicting documentation is the best form of documentation :) (Yeah, its only Apple - I should have said that. I have not checked Android or Linux ARM64 or Windows ARM64) |
FWIW https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst is what I use for the abi standard (I welcome links to better docs!) |
The Dart profiler on Mac/iOS doesn't use SIGPROF anymore (lldb is unreasonably slow handling signals), so the signal would need to be something caused by the embedder. |
sdk/runtime/vm/compiler/stub_code_compiler_arm64.cc
Line 608 in 0e556f6
This BLR calls out int the (generated) _FFICallbackFoo... which sets up Dart SP, marshalls arguments from C, etc.
But then at the end, it unsets SP, reverting back to CSP being controlling and SP just being a normal temp.
Example:
However at the very next line after returning from _FfiCallbackFoo , this code (FfiCallbackTrampolineStub) calls into EnterFullSafepoint, which, (at least in the --use-slow-path) case then calls back into the VM via EnterSafepointStub, after having assumed that SP was set up:
sdk/runtime/vm/compiler/assembler/assembler_arm64.cc
Line 1549 in 0e556f6
https://github.com/dart-lang/sdk/blob/main/runtime/vm/compiler/stub_code_compiler_arm64.cc#L339
My guess is that this code is unsafe in the
--use-slow-path
case, particularly against signals (e.g. the profiler). Since it seems to assume that SP is set up (by dumping registers to it), but doesn't bump CSP accordingly, and then calls into the VM I'm not entirely sure how the--use-slow-path --stacktrace-every=100
tests are passing?My recommendation would be that (somewhat unrelatedly) FFICallbackTrampoline should move off of using SP directly and instead stick with CSP (since it's just a normal C function anyway) and then explicitly set up/tear down SP around the call to EnterFullSafepoint. I think the only reason that FFICallbackTrampoline is using SP at all is just because that's what Assembler has easy methods for. If those methods were either forked for CSP handling (or could taken extra argument to control SP/CSP) a bunch of FFICallbackTrampoline would be cleaned up (and maybe this bug would be more obvious).
The text was updated successfully, but these errors were encountered: