Skip to content

Commit 9d730ce

Browse files
committed
Debug: add some infrastructure for catching traps, and handle traps properly in Pulley.
This is a followup to #11895 where I had disabled a test that failed to emit a debug event for a hostcall-generated trap on a divide-by-zero in Pulley. This PR allows that test to pass, and brings Pulley back to parity with native Cranelift in debug support currently. This was a bit of a "start to pull the thread and the entire finished mechanism materializes" PR; happy to consider ways to split it up if needed. In short, disabling signal-based traps on a Pulley configuration still relies on Pulley opcodes (e.g., divide) actually trapping, in a way that looks more like a "native ISA trap"; so I had to start to build out the actual trap-handling mechanisms. In any case, this will all be needed for followup work soon that will handle traps on native platforms (redirecting from signals by injecting calls), so this is not a distraction. This PR includes, ranked in decreasing order of "may scare other Wasmtime maintainers" score: - A raw `NonNull<dyn VMStore>` in the `CallThreadState`, with a long comment about provenance and mut-borrow exclusivity. This is needed right now to allow the interpreter to invoke the debug event handler, but will soon be needed when injecting hostcalls on signals, because a signal context also has no state available from the Wasm code other than what is in TLS. Hence, we need a way to get the store back from the Wasm when we do something that is "morally a hostcall" at a trapping instruction. I do believe this is sound, or at least close to it if not (please scrutinize carefully!); the basic idea is that the Wasm acts as an opaque blob in the middle, and the pointer comes out of it one way or another (the normal way, as the first arg to a hostcall, or the weird way, via TLS and the CallThreadState during a trap). Exclusive ownership is still clear at any given point and only one `&mut` ever exists in the current frame at a time. That said, I haven't tested with miri yet. This does require careful thought about the Wasm compilation, too; we need the moral equivalent of a `&mut self` reborrow as-if we were making a hostcall on each trapping instruction. It turns out that we already treat them as memory-fence instructions, so nothing loaded from the store can be moved or cached across them, and I've added a comment now about how this is load-bearing. - Updates to `CallThreadState`'s "exit state", normally set by the exit trampoline, that we now also set when we invoke a debug event handler during a trap context[^1] so that `Store::debug_frames` properly sees the current activation. This is a little more awkward than it could be because we store the *trampoline* FP, not last Wasm FP, and there is no trampoline frame in this case, so I've added a flag and some conditionals. I'm happy to refactor instead to go (back) to storing the last Wasm FP instead, with the extra load in the exit trampoline to compute that. - A whole bunch of plumbing, creating a large but mechanical diff, in the code translator to actually add debug tags on all traps and calls to `raise`. It turns out that once I got all of the above working in Pulley, the test disagreed about current Wasm PC between native and Pulley, and Pulley was right; native was getting it wrong because the `raise` libcall was sunk to the bottom in a cold block and, without tags, we scanned backward to pick up the last Wasm PC in the function. This new plumbing and addition of tags in all the appropriate places fixes that. [^1]: I keep saying "during a trap context" here, but to avoid any signal-safety scares, note that when this is done for native signals in a followup PR, we will inject a hostcall by modifying stack/register state and returning from the actual signal context, so it really is as-if we did a hostcall from a trapping instruction.
1 parent dbc21cd commit 9d730ce

File tree

18 files changed

+674
-228
lines changed

18 files changed

+674
-228
lines changed

cranelift/codegen/src/inst_predicates.rs

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -150,6 +150,12 @@ pub fn has_memory_fence_semantics(op: Opcode) -> bool {
150150
| Opcode::Debugtrap
151151
| Opcode::SequencePoint => true,
152152
Opcode::Call | Opcode::CallIndirect | Opcode::TryCall | Opcode::TryCallIndirect => true,
153+
// N.B.: this is *load-bearing for borrow safety and
154+
// provenance in Wasmtime*. A trapping op can potentially
155+
// cause an implicit hostcall, and that hostcall implicitly
156+
// mutably borrows Wasmtime's Store. So we can't allow alias
157+
// anslysis to cross trapping opcodes; they are implicitly
158+
// as-if they called the host.
153159
op if op.can_trap() => true,
154160
_ => false,
155161
}

crates/cranelift/src/bounds_checks.rs

Lines changed: 19 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -22,7 +22,7 @@
2222
use crate::{
2323
Reachability,
2424
func_environ::FuncEnvironment,
25-
translate::{HeapData, TargetEnvironment},
25+
translate::{FuncTranslationStacks, HeapData, TargetEnvironment},
2626
};
2727
use Reachability::*;
2828
use cranelift_codegen::{
@@ -84,12 +84,15 @@ pub fn bounds_check_and_compute_addr(
8484
index: ir::Value,
8585
bounds_check: BoundsCheck,
8686
trap: ir::TrapCode,
87+
stacks: &FuncTranslationStacks,
8788
) -> Reachability<ir::Value> {
8889
match bounds_check {
8990
BoundsCheck::StaticOffset {
9091
offset,
9192
access_size,
92-
} => bounds_check_field_access(builder, env, heap, index, offset, access_size, trap),
93+
} => {
94+
bounds_check_field_access(builder, env, heap, index, offset, access_size, trap, stacks)
95+
}
9396

9497
#[cfg(feature = "gc")]
9598
BoundsCheck::StaticObjectField {
@@ -113,6 +116,7 @@ pub fn bounds_check_and_compute_addr(
113116
0,
114117
object_size,
115118
trap,
119+
stacks,
116120
) {
117121
Reachable(v) => v,
118122
u @ Unreachable => return u,
@@ -123,7 +127,7 @@ pub fn bounds_check_and_compute_addr(
123127
}
124128

125129
// Otherwise, bounds check just this one field's access.
126-
bounds_check_field_access(builder, env, heap, index, offset, access_size, trap)
130+
bounds_check_field_access(builder, env, heap, index, offset, access_size, trap, stacks)
127131
}
128132

129133
// Compute the index of the end of the object, bounds check that and get
@@ -148,6 +152,7 @@ pub fn bounds_check_and_compute_addr(
148152
0,
149153
0,
150154
trap,
155+
stacks,
151156
) {
152157
Reachable(v) => v,
153158
u @ Unreachable => return u,
@@ -177,6 +182,7 @@ fn bounds_check_field_access(
177182
offset: u32,
178183
access_size: u8,
179184
trap: ir::TrapCode,
185+
stacks: &FuncTranslationStacks,
180186
) -> Reachability<ir::Value> {
181187
let pointer_bit_width = u16::try_from(env.pointer_type().bits()).unwrap();
182188
let bound_gv = heap.bound;
@@ -298,7 +304,7 @@ fn bounds_check_field_access(
298304
// max_memory_size`, since we will end up being out-of-bounds regardless
299305
// of the given `index`.
300306
env.before_unconditionally_trapping_memory_access(builder);
301-
env.trap(builder, trap);
307+
env.trap(builder, trap, stacks);
302308
return Unreachable;
303309
}
304310

@@ -308,7 +314,7 @@ fn bounds_check_field_access(
308314
// native pointer type anyway, so this is an unconditional trap.
309315
if pointer_bit_width < 64 && offset_and_size >= (1 << pointer_bit_width) {
310316
env.before_unconditionally_trapping_memory_access(builder);
311-
env.trap(builder, trap);
317+
env.trap(builder, trap, stacks);
312318
return Unreachable;
313319
}
314320

@@ -430,6 +436,7 @@ fn bounds_check_field_access(
430436
AddrPcc::static32(heap.pcc_memory_type, memory_reservation),
431437
oob,
432438
trap,
439+
stacks,
433440
));
434441
}
435442

@@ -464,6 +471,7 @@ fn bounds_check_field_access(
464471
AddrPcc::dynamic(heap.pcc_memory_type, bound_gv),
465472
oob,
466473
trap,
474+
stacks,
467475
));
468476
}
469477

@@ -513,6 +521,7 @@ fn bounds_check_field_access(
513521
AddrPcc::dynamic(heap.pcc_memory_type, bound_gv),
514522
oob,
515523
trap,
524+
stacks,
516525
));
517526
}
518527

@@ -558,6 +567,7 @@ fn bounds_check_field_access(
558567
AddrPcc::dynamic(heap.pcc_memory_type, bound_gv),
559568
oob,
560569
trap,
570+
stacks,
561571
));
562572
}
563573

@@ -575,7 +585,7 @@ fn bounds_check_field_access(
575585
builder.func.dfg.facts[access_size_val] =
576586
Some(Fact::constant(pointer_bit_width, offset_and_size));
577587
}
578-
let adjusted_index = env.uadd_overflow_trap(builder, index, access_size_val, trap);
588+
let adjusted_index = env.uadd_overflow_trap(builder, index, access_size_val, trap, stacks);
579589
if pcc {
580590
builder.func.dfg.facts[adjusted_index] = Some(Fact::value_offset(
581591
pointer_bit_width,
@@ -603,6 +613,7 @@ fn bounds_check_field_access(
603613
AddrPcc::dynamic(heap.pcc_memory_type, bound_gv),
604614
oob,
605615
trap,
616+
stacks,
606617
))
607618
}
608619

@@ -756,9 +767,10 @@ fn explicit_check_oob_condition_and_compute_addr(
756767
// in bounds (and therefore we can proceed).
757768
oob_condition: ir::Value,
758769
trap: ir::TrapCode,
770+
stacks: &FuncTranslationStacks,
759771
) -> ir::Value {
760772
if let OobBehavior::ExplicitTrap = oob_behavior {
761-
env.trapnz(builder, oob_condition, trap);
773+
env.trapnz(builder, oob_condition, trap, stacks);
762774
}
763775
let addr_ty = env.pointer_type();
764776

0 commit comments

Comments
 (0)