Commit e4190de

Debugging: add a debugger callback mechanism to handle debug events. (#11895)
* Debugging: add a debugger callback mechanism to handle debug events.

This PR adds a notion of "debug events", and a mechanism in Wasmtime to associate a "debug handler" with a store such that the handler is invoked as-if it were an async hostcall on each event. The async handler owns the store while its future exists, so the whole "world" (within the store) is frozen and the handler can examine any state it likes with a `StoreContextMut`.

Note that this callback-based scheme is a compromise: eventually, we would like to have a native async API that produces a stream of events, as sketched in #11826 and in [this branch]. However, the async approach implemented naively (that is, with manual fiber suspends and with state passed on the store) suffers from unsoundness in the presence of dropped futures. Alex, Nick and I discussed this extensively and agreed that the `Accessor` mechanism is the right way to allow for a debugger to have "timesliced"/"shared" access to a store (only when polled/when an event is delivered), but we will defer that for now, because it requires additional work (mainly, converting existing async yield points in the runtime to "give up" the store with the `run_concurrent` mechanism). I'll file a followup issue to track that. The idea is that we can eventually build that when ready, but the API we provide to a debugger component can remain unchanged; only this plumbing and the glue to the debugger component will be reworked.

With this scheme based on callbacks, we expect that one should be able to implement a debugger using async channels to communicate with the callback. The idea is that there would be a protocol where the callback sends a debug event to the debugger main loop elsewhere in the executor (e.g., over a Tokio channel or other async channel mechanism), and when the debugger wants to allow execution to continue, it sends a "continue" message back. In the meantime, while the world is paused, the debugger can send messages to the callback to query the `StoreContextMut` it has and read out state. This indirection/proxying of Store access is necessary for soundness: again, teleporting the Store out may look like it almost works ("it is like a mutable reborrow on a hostcall") except in the presence of dropped futures with sandwiched Wasm->host->Wasm situations.

This PR implements debug events for a few cases that can be caught directly in the runtime, e.g., exceptions and traps raised just before re-entry to Wasm. Other kinds of traps, such as those normally implemented by host signals, require additional work (as in #11826) to implement "hostcall injection" on signal reception; and breakpoints will be built on top of that. The point of this PR is only to get the initial plumbing in place for events.

[this branch]: https://github.com/cfallin/wasmtime/tree/wasmtime-debug-async

* Add some more tests.

* Review feedback: comment updates, and make `debug` feature depend on `async`.

* Review feedback: debug-hook setter requires guest debugging to be enabled.

* Review feedback: ThrownException event; handle block_on errors; explicitly list UnwindState cases.

* Add comment about load-bearing Send requirement.

* Fix no-unwind build.

* Review feedback: pass in hostcall error messages while keeping the trait object-safe.

Co-authored-by: Alex Crichton <[email protected]>

* Ignore divide-trapping test on Pulley for now.

---------

Co-authored-by: Alex Crichton <[email protected]>
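
The event/continue handshake described in the commit message can be sketched without Wasmtime at all. The following is a minimal illustration only, assuming blocking `std::sync::mpsc` channels and a plain thread in place of the async (e.g. Tokio) channels the message mentions. `ToDebugger`, `ToCallback`, `FrozenState`, `handle_event`, and `run_handshake` are hypothetical names and are not part of this commit.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::thread;

/// Hypothetical messages sent from the callback to the debugger main loop.
#[derive(Debug)]
enum ToDebugger {
    Event(String),
    QueryResult(u64),
}

/// Hypothetical messages sent from the debugger main loop to the callback.
#[derive(Debug)]
enum ToCallback {
    ReadLocal(usize),
    Continue,
}

/// Stand-in for the state the callback can reach through `StoreContextMut`.
struct FrozenState {
    locals: Vec<u64>,
}

/// Callback side: report the event, then serve queries until "continue".
fn handle_event(
    state: &mut FrozenState,
    event: &str,
    to_debugger: &Sender<ToDebugger>,
    from_debugger: &Receiver<ToCallback>,
) {
    to_debugger.send(ToDebugger::Event(event.to_string())).unwrap();
    loop {
        match from_debugger.recv().unwrap() {
            // The debugger never touches the state directly; it asks the
            // callback, which still holds the (frozen) store.
            ToCallback::ReadLocal(i) => {
                let value = state.locals[i];
                to_debugger.send(ToDebugger::QueryResult(value)).unwrap();
            }
            ToCallback::Continue => return,
        }
    }
}

/// Runs one event/query/continue round trip and returns what the
/// debugger observed.
fn run_handshake() -> (String, u64) {
    let (to_debugger, debugger_rx) = channel();
    let (to_callback, callback_rx) = channel();

    // Debugger main loop, running elsewhere (here: on another thread).
    let debugger = thread::spawn(move || {
        let event = match debugger_rx.recv().unwrap() {
            ToDebugger::Event(e) => e,
            other => panic!("unexpected message: {:?}", other),
        };
        to_callback.send(ToCallback::ReadLocal(1)).unwrap();
        let value = match debugger_rx.recv().unwrap() {
            ToDebugger::QueryResult(v) => v,
            other => panic!("unexpected message: {:?}", other),
        };
        to_callback.send(ToCallback::Continue).unwrap();
        (event, value)
    });

    let mut state = FrozenState { locals: vec![10, 20, 30] };
    handle_event(&mut state, "trap", &to_debugger, &callback_rx);
    debugger.join().unwrap()
}
```

Calling `run_handshake()` performs one round trip: the callback announces a `"trap"` event, the debugger reads local 1 by proxy, and then releases the callback with `Continue`. A real handler would presumably `await` on async channels inside `DebugHandler::handle` rather than block a thread.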
1 parent 51e5294 commit e4190de

6 files changed: +537 -15 lines changed

crates/wasmtime/Cargo.toml

Lines changed: 4 additions & 1 deletion
@@ -415,5 +415,8 @@ component-model-async-bytes = [
 ]

 # Enables support for guest debugging.
-debug = ['runtime']
+debug = [
+  'runtime',
+  'async'
+]

crates/wasmtime/src/runtime/debug.rs

Lines changed: 75 additions & 3 deletions
@@ -1,8 +1,8 @@
 //! Debugging API.

 use crate::{
-    AnyRef, AsContext, AsContextMut, ExnRef, ExternRef, Func, Instance, Module, StoreContext,
-    StoreContextMut, Val,
+    AnyRef, AsContext, AsContextMut, ExnRef, ExternRef, Func, Instance, Module, OwnedRooted,
+    StoreContext, StoreContextMut, Val,
     store::{AutoAssertNoGc, StoreOpaque},
     vm::{CurrentActivationBacktrace, VMContext},
 };
@@ -13,7 +13,7 @@ use core::{ffi::c_void, ptr::NonNull};
 use wasmtime_environ::FrameTable;
 use wasmtime_environ::{
     DefinedFuncIndex, FrameInstPos, FrameStackShape, FrameStateSlot, FrameStateSlotOffset,
-    FrameTableDescriptorIndex, FrameValType, FuncKey,
+    FrameTableDescriptorIndex, FrameValType, FuncKey, Trap,
 };
 use wasmtime_unwinder::Frame;

@@ -445,3 +445,75 @@ impl<'a, T: 'static> AsContextMut for DebugFrameCursor<'a, T> {
         StoreContextMut(self.iter.store.0)
     }
 }
+
+/// One debug event that occurs when running Wasm code on a store with
+/// a debug handler attached.
+#[derive(Debug)]
+pub enum DebugEvent<'a> {
+    /// An `anyhow::Error` was raised by a hostcall.
+    HostcallError(&'a anyhow::Error),
+    /// An exception is thrown and caught by Wasm. The current state
+    /// is at the throw-point.
+    CaughtExceptionThrown(OwnedRooted<ExnRef>),
+    /// An exception was not caught and is escaping to the host.
+    UncaughtExceptionThrown(OwnedRooted<ExnRef>),
+    /// A Wasm trap occurred.
+    Trap(Trap),
+}
+
+/// A handler for debug events.
+///
+/// This is an async callback that is invoked directly within the
+/// context of a debug event that occurs, i.e., with the Wasm code
+/// still on the stack. The callback can thus observe that stack, up
+/// to the most recent entry to Wasm.[^1]
+///
+/// Because this callback receives a `StoreContextMut`, it has full
+/// access to any state that any other hostcall has, including the
+/// `T`. In that way, it is like an epoch-deadline callback or a
+/// call-hook callback. It also "freezes" the entire store for the
+/// duration of the debugger callback future.
+///
+/// In the future, we expect to provide an "externally async" API on
+/// the `Store` that allows receiving a stream of debug events and
+/// accessing the store mutably while frozen; that will need to
+/// integrate with [`Store::run_concurrent`] to properly timeslice and
+/// scope the mutable access to the store, and has not been built
+/// yet. In the meantime, it should be possible to build a fully
+/// functional debugger with this async-callback API by channeling
+/// debug events out, and requests to read the store back in, over
+/// message-passing channels between the callback and an external
+/// debugger main loop.
+///
+/// Note that the `handle` hook may use its mutable store access to
+/// invoke more Wasm code. Debug events will also be caught and will
+/// cause further `handle` invocations during this recursive
+/// invocation. It is up to the debugger to handle any implications of
+/// this reentrancy (e.g., implications on a duplex channel protocol
+/// with an event/continue handshake) if it does so.
+///
+/// Note also that this trait has `Clone` as a supertrait, and the
+/// handler is cloned at every invocation as an artifact of the
+/// internal ownership structure of Wasmtime: the handler itself is
+/// owned by the store, but also receives a mutable borrow to the
+/// whole store, so we need to clone it out to invoke it. It is
+/// recommended that this trait be implemented by a type that is cheap
+/// to clone: for example, a single `Arc` handle to debugger state.
+///
+/// [^1]: Providing visibility further than the most recent entry to
+///       Wasm is not directly possible because it could see into
+///       another async stack, and the stack that polls the future
+///       running a particular Wasm invocation could change after each
+///       suspend point in the handler.
+pub trait DebugHandler: Clone + Send + Sync + 'static {
+    /// The data expected on the store that this handler is attached
+    /// to.
+    type Data;
+
+    /// Handle a debug event.
+    fn handle(
+        &self,
+        store: StoreContextMut<'_, Self::Data>,
+        event: DebugEvent<'_>,
+    ) -> impl Future<Output = ()> + Send;
+}
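
The `DebugHandler` documentation recommends a handler that is cheap to clone, such as a single `Arc` handle to debugger state. A minimal sketch of that shape follows, using only the standard library. `DebuggerState`, `SharedHandler`, and `deliver` are hypothetical names, and `SharedHandler::handle` is a synchronous stand-in for the real async `DebugHandler::handle`.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical debugger state shared by every clone of the handler.
#[derive(Default)]
struct DebuggerState {
    events_seen: Vec<String>,
}

/// A handler that is a single `Arc` handle, so `clone` only bumps a
/// reference count and every clone observes the same state.
#[derive(Clone, Default)]
struct SharedHandler {
    state: Arc<Mutex<DebuggerState>>,
}

impl SharedHandler {
    /// Synchronous stand-in for `DebugHandler::handle`: record the event.
    fn handle(&self, event: &str) {
        self.state.lock().unwrap().events_seen.push(event.to_string());
    }

    /// Snapshot of everything recorded so far, across all clones.
    fn events(&self) -> Vec<String> {
        self.state.lock().unwrap().events_seen.clone()
    }
}

/// Mimics a runtime that clones the handler at every invocation.
fn deliver(handler: &SharedHandler, event: &str) {
    let per_call_clone = handler.clone();
    per_call_clone.handle(event);
}
```

Because each per-call clone points at the same `Arc`, events recorded through a clone remain visible through the original handle, which is the property a debugger front end would rely on.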

crates/wasmtime/src/runtime/store.rs

Lines changed: 128 additions & 9 deletions
@@ -76,6 +76,10 @@
 //! contents of `StoreOpaque`. This is an invariant that we, as the authors of
 //! `wasmtime`, must uphold for the public interface to be safe.

+#[cfg(feature = "debug")]
+use crate::DebugHandler;
+#[cfg(all(feature = "gc", feature = "debug"))]
+use crate::OwnedRooted;
 use crate::RootSet;
 #[cfg(feature = "gc")]
 use crate::ThrownException;
@@ -259,6 +263,50 @@ pub struct StoreInner<T: 'static> {
     ///
     /// For comments about `ManuallyDrop`, see `Store::into_data`.
     data_no_provenance: ManuallyDrop<T>,
+
+    /// The user's debug handler, if any. See [`crate::DebugHandler`]
+    /// for more documentation.
+    ///
+    /// This is a boxed trait object that we take out of this slot and
+    /// clone on each invocation, because the handler itself takes
+    /// `&self` and also the whole Store mutably (via
+    /// `StoreContextMut`); so we need to hold a separate copy of
+    /// it while invoking it.
+    #[cfg(feature = "debug")]
+    debug_handler: Option<Box<dyn StoreDebugHandler<T>>>,
+}
+
+/// Adapter around `DebugHandler` that gets monomorphized into an
+/// object-safe dyn trait to place in `store.debug_handler`.
+#[cfg(feature = "debug")]
+trait StoreDebugHandler<T: 'static>: Send + Sync {
+    fn handle<'a>(
+        self: Box<Self>,
+        store: StoreContextMut<'a, T>,
+        event: crate::DebugEvent<'a>,
+    ) -> Box<dyn Future<Output = ()> + Send + 'a>;
+}
+
+#[cfg(feature = "debug")]
+impl<D> StoreDebugHandler<D::Data> for D
+where
+    D: DebugHandler,
+    D::Data: Send,
+{
+    fn handle<'a>(
+        self: Box<Self>,
+        store: StoreContextMut<'a, D::Data>,
+        event: crate::DebugEvent<'a>,
+    ) -> Box<dyn Future<Output = ()> + Send + 'a> {
+        // Clone the underlying `DebugHandler` (the trait requires
+        // Clone as a supertrait), not the Box. The clone happens here
+        // rather than at the callsite because `Clone::clone` is not
+        // object-safe so needs to be in a monomorphized context.
+        let handler: D = (*self).clone();
+        // Since we temporarily took `self` off the store at the
+        // callsite, put it back now that we've cloned it.
+        store.0.debug_handler = Some(self);
+        Box::new(async move { handler.handle(store, event).await })
+    }
 }

 enum ResourceLimiterInner<T> {
@@ -716,6 +764,8 @@ impl<T> Store<T> {
             #[cfg(target_has_atomic = "64")]
             epoch_deadline_behavior: None,
             data_no_provenance: ManuallyDrop::new(data),
+            #[cfg(feature = "debug")]
+            debug_handler: None,
         });

         let store_data =
@@ -1188,21 +1238,61 @@ impl<T> Store<T> {
     /// VM-level values (locals and operand stack), when debugging is
     /// enabled.
     ///
-    /// This object views all activations for the current store that
-    /// are on the stack. An activation is a contiguous sequence of
-    /// Wasm frames (called functions) that were called from host code
-    /// and called back out to host code. If there are activations
-    /// from multiple stores on the stack, for example if Wasm code in
-    /// one store calls out to host code which invokes another Wasm
-    /// function in another store, then the other stores are "opaque"
-    /// to our view here in the same way that host code is.
+    /// This object views the frames from the most recent Wasm entry
+    /// onward (up to the exit that allows this host code to run). Any
+    /// Wasm stack frames upward from the most recent entry to Wasm
+    /// are not visible to this cursor.
     ///
     /// Returns `None` if debug instrumentation is not enabled for
     /// the engine containing this store.
     #[cfg(feature = "debug")]
     pub fn debug_frames(&mut self) -> Option<crate::DebugFrameCursor<'_, T>> {
         self.as_context_mut().debug_frames()
     }
+
+    /// Set the debug callback on this store.
+    ///
+    /// See [`crate::DebugHandler`] for more documentation.
+    ///
+    /// # Panics
+    ///
+    /// - Will panic if this store is not configured for async
+    ///   support.
+    /// - Will panic if guest-debug support was not enabled via
+    ///   [`crate::Config::guest_debug`].
+    #[cfg(feature = "debug")]
+    pub fn set_debug_handler(&mut self, handler: impl DebugHandler<Data = T>)
+    where
+        // We require `Send` here because the debug handler becomes
+        // referenced from a future: when `DebugHandler::handle` is
+        // invoked, its `self` references the `handler` with the
+        // user's state. Note that we are careful to keep this bound
+        // constrained to debug-handler-related code only and not
+        // propagate it outward to the store in general. The presence
+        // of the trait implementation serves as a witness that `T:
+        // Send`. This is required in particular because we will have
+        // a `&mut dyn VMStore` on the stack when we pause a fiber
+        // with `block_on` to run a debugger hook; that `VMStore` must
+        // be a `Store<T> where T: Send`.
+        T: Send,
+    {
+        assert!(
+            self.inner.async_support(),
+            "debug hooks rely on async support"
+        );
+        assert!(
+            self.engine().tunables().debug_guest,
+            "debug hooks require guest debugging to be enabled"
+        );
+        self.inner.debug_handler = Some(Box::new(handler));
+    }
+
+    /// Clear the debug handler on this store. If any existed, it will
+    /// be dropped.
+    #[cfg(feature = "debug")]
+    pub fn clear_debug_handler(&mut self) {
+        self.inner.debug_handler = None;
+    }
 }

 impl<'a, T> StoreContext<'a, T> {
@@ -1320,7 +1410,6 @@ impl<'a, T> StoreContextMut<'a, T> {

     /// Tests whether there is a pending exception.
     ///
-    ///
     /// See [`Store::has_pending_exception`] for more details.
     #[cfg(feature = "gc")]
     pub fn has_pending_exception(&self) -> bool {
@@ -2543,13 +2632,31 @@ at https://bytecodealliance.org/security.
         self.pending_exception.take()
     }

+    /// Tests whether there is a pending exception.
+    #[cfg(feature = "gc")]
+    pub fn has_pending_exception(&self) -> bool {
+        self.pending_exception.is_some()
+    }
+
     #[cfg(feature = "gc")]
     fn take_pending_exception_rooted(&mut self) -> Option<Rooted<ExnRef>> {
         let vmexnref = self.take_pending_exception()?;
         let mut nogc = AutoAssertNoGc::new(self);
         Some(Rooted::new(&mut nogc, vmexnref.into()))
     }

+    /// Get an owned rooted reference to the pending exception,
+    /// without taking it off the store.
+    #[cfg(all(feature = "gc", feature = "debug"))]
+    pub(crate) fn pending_exception_owned_rooted(&mut self) -> Option<OwnedRooted<ExnRef>> {
+        let mut nogc = AutoAssertNoGc::new(self);
+        nogc.pending_exception.take().map(|vmexnref| {
+            let cloned = nogc.clone_gc_ref(vmexnref.as_gc_ref());
+            nogc.pending_exception = Some(cloned.into_exnref_unchecked());
+            OwnedRooted::new(&mut nogc, vmexnref.into())
+        })
+    }
+
     #[cfg(feature = "gc")]
     fn throw_impl(&mut self, exception: Rooted<ExnRef>) {
         let mut nogc = AutoAssertNoGc::new(self);
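
`pending_exception_owned_rooted` in the hunk above returns an owned value while leaving the slot populated: it takes the value out, stores a clone back, and hands the original to the caller. A minimal sketch of that take/clone/restore shape, with a `String` standing in for the GC reference and with the hypothetical names `Slot` and `peek_owned`:

```rust
/// Hypothetical stand-in for the store's pending-exception slot.
struct Slot {
    pending: Option<String>,
}

impl Slot {
    /// Returns an owned copy of the pending value while leaving the
    /// slot populated for later code.
    fn peek_owned(&mut self) -> Option<String> {
        self.pending.take().map(|value| {
            // Put a clone back so the pending value is not lost.
            self.pending = Some(value.clone());
            value
        })
    }
}
```

For a plain `String`, `self.pending.clone()` would do the same job. The explicit take-then-restore form mirrors the hunk, where cloning the reference goes through a method on the store (`clone_gc_ref`) rather than through `Clone`.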
@@ -2645,6 +2752,18 @@ unsafe impl<T> VMStore for StoreInner<T> {
     fn component_calls(&mut self) -> &mut vm::component::CallContexts {
         &mut self.component_calls
     }
+
+    #[cfg(feature = "debug")]
+    fn block_on_debug_handler(&mut self, event: crate::DebugEvent<'_>) -> anyhow::Result<()> {
+        if let Some(handler) = self.debug_handler.take() {
+            log::trace!("about to raise debug event {event:?}");
+            StoreContextMut(self).with_blocking(|store, cx| {
+                cx.block_on(Pin::from(handler.handle(store, event)).as_mut())
+            })
+        } else {
+            Ok(())
+        }
+    }
 }

 impl<T> StoreInner<T> {
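
The ownership dance in this file (take the boxed handler off the store, clone the concrete handler inside a monomorphized adapter, put the box back, then run the clone with the store borrowed mutably) can be illustrated in isolation. The sketch below is synchronous and uses only the standard library. `MiniStore`, `ErasedHandler`, `Handler`, `Recorder`, and `demo` are hypothetical stand-ins for `StoreInner`, `StoreDebugHandler`, `DebugHandler`, and a user handler, and are not Wasmtime APIs.

```rust
/// Hypothetical stand-in for the store; it owns the boxed handler.
struct MiniStore {
    handler: Option<Box<dyn ErasedHandler>>,
    log: Vec<String>,
}

/// Object-safe adapter: no `Clone` bound, so it can live behind `dyn`.
trait ErasedHandler {
    fn handle(self: Box<Self>, store: &mut MiniStore, event: &str);
}

/// User-facing trait; `Clone` as a supertrait rules out `dyn Handler`.
trait Handler: Clone + 'static {
    fn handle(&self, store: &mut MiniStore, event: &str);
}

impl<H: Handler> ErasedHandler for H {
    fn handle(self: Box<Self>, store: &mut MiniStore, event: &str) {
        // Clone the concrete handler here, where its type is known.
        let handler: H = (*self).clone();
        // The callsite took the box off the store; put it back.
        store.handler = Some(self);
        // Run the clone with full mutable access to the store.
        Handler::handle(&handler, store, event);
    }
}

impl MiniStore {
    /// Deliver an event to the handler, if one is installed.
    fn raise(&mut self, event: &str) {
        // Take the handler out so `self` can be borrowed mutably.
        if let Some(handler) = self.handler.take() {
            handler.handle(self, event);
        }
    }
}

/// Example handler that appends to the store's log.
#[derive(Clone)]
struct Recorder {
    prefix: &'static str,
}

impl Handler for Recorder {
    fn handle(&self, store: &mut MiniStore, event: &str) {
        store.log.push(format!("{}{}", self.prefix, event));
    }
}

/// Raises two events and reports the log plus whether the handler is
/// still installed afterwards.
fn demo() -> (Vec<String>, bool) {
    let mut store = MiniStore {
        handler: Some(Box::new(Recorder { prefix: "saw " })),
        log: Vec::new(),
    };
    store.raise("trap");
    store.raise("exception");
    let still_installed = store.handler.is_some();
    (store.log, still_installed)
}
```

Because the box is restored before the clone runs, the handler is still installed if the clone itself triggers a nested `raise`, which matches the reentrancy the `DebugHandler` documentation describes.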

crates/wasmtime/src/runtime/vm.rs

Lines changed: 4 additions & 0 deletions
@@ -228,6 +228,10 @@ pub unsafe trait VMStore: 'static {
     fn component_async_store(
         &mut self,
     ) -> &mut dyn crate::runtime::component::VMComponentAsyncStore;
+
+    /// Invoke a debug handler, if present, at a debug event.
+    #[cfg(feature = "debug")]
+    fn block_on_debug_handler(&mut self, event: crate::DebugEvent) -> anyhow::Result<()>;
 }

 impl Deref for dyn VMStore + '_ {
