Cranelift: Non-deterministic JIT execution failures on ARM64 macOS

## Summary

JIT-compiled code on ARM64 macOS (Apple Silicon) fails non-deterministically in multi-threaded scenarios, with a failure rate of roughly 44% (22 of 50 runs). The failures show up as SIGBUS, segmentation faults, incorrect results, or silent corruption.

## Environment

- **OS**: macOS 14.x (Sonoma) or later
- **Architecture**: ARM64 (Apple Silicon - M1/M2/M3)
- **Crate**: `cranelift-jit`
- **Rust version**: 1.75+ (any recent stable)

## Observed Behavior

In a real-world JIT compiler (Rayzor - a Haxe-to-native compiler using cranelift-jit), we observe:

- **Without MAP_JIT fix:** ~56% success rate (28 of 50 runs)
- **With MAP_JIT fix:** 100% success rate (50+ consecutive runs)

The failures manifest as:
- SIGBUS (Bus error: 10)
- Incorrect computation results
- Segmentation fault
- Silent wrong values

### Note on Minimal Reproduction

A simple test case may not reliably reproduce the issue because:

1. The failure is **non-deterministic** and depends on timing, memory layout, and CPU scheduling
2. Simple tests may not trigger the problematic code paths
3. The issue is more likely with:
   - Complex JIT-compiled functions with multiple blocks
   - Multiple functions compiled together
   - Closures and captured variables passed to threads
   - Runtime library functions called from JIT code

### Complex Reproduction (Rayzor Compiler)

The issue was observed and fixed in the [Rayzor compiler](https://github.com/darmie/rayzor), a Haxe-to-native compiler using cranelift-jit. The compiler:
- Compiles 50+ runtime functions using cranelift-jit
- Spawns threads that execute closures containing JIT function calls
- Uses channels and mutexes with JIT-compiled callback functions

**E2E Test Case:** [`compiler/examples/test_rayzor_stdlib_e2e.rs`](https://github.com/darmie/rayzor/blob/main/compiler/examples/test_rayzor_stdlib_e2e.rs)

**Commits for testing:**
- **Before fix (unstable):** [`0eb9472`](https://github.com/darmie/rayzor/commit/0eb9472) - Uses upstream cranelift without MAP_JIT
- **After fix (stable):** [`9a0e80e`](https://github.com/darmie/rayzor/commit/9a0e80e) - Uses fork with MAP_JIT + pthread_jit_write_protect_np

```bash
# Test BEFORE fix (~56% success rate)
git clone https://github.com/darmie/rayzor
cd rayzor
git checkout 0eb9472
cargo build --release --package compiler --example test_rayzor_stdlib_e2e

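# Run stability test
# Note: `timeout` is part of GNU coreutils and is not in a stock macOS install
# (e.g. `brew install coreutils`). If it is missing, every run is counted as FAILED.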
passed=0; failed=0
for i in {1..50}; do
    if timeout 120 ./target/release/examples/test_rayzor_stdlib_e2e 2>&1 | grep -q "All tests passed"; then
        passed=$((passed+1))
    else
        echo "Run $i: FAILED"
        failed=$((failed+1))
    fi
done
echo "Before fix - Passed: $passed/50, Failed: $failed/50"

# Test AFTER fix (100% success rate)
git checkout 9a0e80e
cargo build --release --package compiler --example test_rayzor_stdlib_e2e

passed=0; failed=0
for i in {1..50}; do
    if timeout 120 ./target/release/examples/test_rayzor_stdlib_e2e 2>&1 | grep -q "All tests passed"; then
        passed=$((passed+1))
    else
        echo "Run $i: FAILED"
        failed=$((failed+1))
    fi
done
echo "After fix - Passed: $passed/50, Failed: $failed/50"
```

**Results:**

| Commit | Configuration | Success Rate |
|--------|--------------|--------------|
| [`0eb9472`](https://github.com/darmie/rayzor/commit/0eb9472) | Upstream cranelift (no MAP_JIT) | ~56% (28/50) |
| [`9a0e80e`](https://github.com/darmie/rayzor/commit/9a0e80e) | [darmie/wasmtime fix-plt-aarch64](https://github.com/darmie/wasmtime/tree/fix-plt-aarch64) | **100%** (50/50) |
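
The stability loop above can also be driven from Rust, which avoids the dependency on a `timeout` binary. The following is only a sketch: `run_trials` is a hypothetical helper, not part of Rayzor or Cranelift, and unlike the shell script it applies no per-run timeout, so a hung run blocks the loop.

```rust
use std::process::Command;

/// Hypothetical helper: runs `program` `runs` times and counts how many
/// runs print `marker` on stdout or stderr. Returns `(passed, failed)`.
/// A run that cannot be started at all is counted as failed.
fn run_trials(program: &str, args: &[&str], runs: usize, marker: &str) -> (usize, usize) {
    let mut passed = 0;
    let mut failed = 0;
    for run in 1..=runs {
        let ok = Command::new(program)
            .args(args)
            .output()
            .map(|out| {
                // Mirror the `2>&1 | grep -q` in the shell script:
                // search both output streams for the success marker.
                let mut text = String::from_utf8_lossy(&out.stdout).into_owned();
                text.push_str(&String::from_utf8_lossy(&out.stderr));
                text.contains(marker)
            })
            .unwrap_or(false);
        if ok {
            passed += 1;
        } else {
            eprintln!("Run {run}: FAILED");
            failed += 1;
        }
    }
    (passed, failed)
}
```

Calling `run_trials("./target/release/examples/test_rayzor_stdlib_e2e", &[], 50, "All tests passed")` would correspond to one of the two loops in the script.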

### Simple Test Case (May Not Reliably Fail)

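For reference, here is a minimal test that follows the same basic pattern (functions are JIT-compiled on the main thread, then called from spawned threads). As noted above, it may be too simple to trigger the failure: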

```rust
use cranelift::prelude::*;
use cranelift_jit::{JITBuilder, JITModule};
use cranelift_module::{Linkage, Module, FuncId};
use std::thread;

fn define_function(module: &mut JITModule, name: &str, op: &str) -> FuncId {
    let mut sig = module.make_signature();
    sig.params.push(AbiParam::new(types::I64));
    sig.params.push(AbiParam::new(types::I64));
    sig.returns.push(AbiParam::new(types::I64));

    let func_id = module.declare_function(name, Linkage::Export, &sig).unwrap();

    let mut ctx = module.make_context();
    ctx.func.signature = sig;

    let mut builder_ctx = FunctionBuilderContext::new();
    {
        let mut builder = FunctionBuilder::new(&mut ctx.func, &mut builder_ctx);
        let block = builder.create_block();
        builder.append_block_params_for_function_params(block);
        builder.switch_to_block(block);
        builder.seal_block(block);

        let a = builder.block_params(block)[0];
        let b = builder.block_params(block)[1];

        let result = match op {
            "add" => builder.ins().iadd(a, b),
            "sub" => builder.ins().isub(a, b),
            "mul" => builder.ins().imul(a, b),
            _ => builder.ins().iadd(a, b),
        };

        builder.ins().return_(&[result]);
        builder.finalize();
    }

    module.define_function(func_id, &mut ctx).unwrap();
    module.clear_context(&mut ctx);

    func_id
}

fn main() {
    let mut flag_builder = settings::builder();
    flag_builder.set("use_colocated_libcalls", "false").unwrap();
    flag_builder.set("is_pic", "false").unwrap();
    let isa_builder = cranelift_native::builder().unwrap();
    let isa = isa_builder.finish(settings::Flags::new(flag_builder)).unwrap();

    let builder = JITBuilder::with_isa(isa, cranelift_module::default_libcall_names());
    let mut module = JITModule::new(builder);

    let add_id = define_function(&mut module, "add", "add");
    let sub_id = define_function(&mut module, "sub", "sub");
    let mul_id = define_function(&mut module, "mul", "mul");

    module.finalize_definitions().unwrap();

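    // The JIT-compiled code uses the platform's default calling convention,
    // so the pointers are cast to `extern "C"` function pointers rather than
    // Rust-ABI `fn` pointers (the Rust ABI is unspecified).
    let add_fn: extern "C" fn(i64, i64) -> i64 = unsafe {
        std::mem::transmute(module.get_finalized_function(add_id))
    };
    let sub_fn: extern "C" fn(i64, i64) -> i64 = unsafe {
        std::mem::transmute(module.get_finalized_function(sub_id))
    };
    let mul_fn: extern "C" fn(i64, i64) -> i64 = unsafe {
        std::mem::transmute(module.get_finalized_function(mul_id))
    };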

    let handles: Vec<_> = (0..20).map(|thread_id| {
        thread::spawn(move || {
            for i in 0..5000 {
                let a = (thread_id * 1000 + i) as i64;
                let b = (i * 7) as i64;

                assert_eq!(add_fn(a, b), a + b, "add failed");
                assert_eq!(sub_fn(a, b), a - b, "sub failed");
                assert_eq!(mul_fn(a, b), a * b, "mul failed");
            }
        })
    }).collect();

    for handle in handles {
        handle.join().unwrap();
    }

    println!("All tests passed!");
}
```

**Cargo.toml:**
```toml
[package]
name = "jit_repro"
version = "0.1.0"
edition = "2021"

[dependencies]
cranelift = { version = "0.125", features = ["jit", "module", "native"] }
cranelift-jit = "0.125"
cranelift-module = "0.125"
cranelift-codegen = "0.125"
cranelift-frontend = "0.125"
cranelift-native = "0.125"
```

## Root Cause Analysis

Two issues combine to cause this:

### 1. Missing MAP_JIT flag

`cranelift-jit` allocates executable memory using the standard allocator (`alloc::alloc`), which doesn't set the `MAP_JIT` flag. On Apple Silicon, memory intended for JIT execution **must** be allocated with:

```c
mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANON | MAP_JIT, -1, 0);
```

Without `MAP_JIT` (0x0800), the kernel cannot properly track the memory for W^X enforcement.
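
As a rough illustration, the `mmap` call above could look like this from Rust. This is a sketch, not the code in `cranelift-jit` or in the fork: `alloc_jit_region` and `free_jit_region` are hypothetical names, the bindings are declared by hand instead of using the `libc` crate, a 64-bit target is assumed (`off_t` as `i64`), and the non-macOS constants are there only so the sketch compiles elsewhere (`MAP_JIT` is set to 0 because the flag does not exist on those platforms).

```rust
use std::ffi::c_void;
use std::os::raw::c_int;
use std::ptr;

const PROT_READ: c_int = 0x1;
const PROT_WRITE: c_int = 0x2;
const MAP_PRIVATE: c_int = 0x0002;

// macOS values from <sys/mman.h>.
#[cfg(target_os = "macos")]
const MAP_ANON: c_int = 0x1000;
#[cfg(target_os = "macos")]
const MAP_JIT: c_int = 0x0800;

// Fallback so the sketch builds on Linux: there is no MAP_JIT there.
#[cfg(not(target_os = "macos"))]
const MAP_ANON: c_int = 0x20;
#[cfg(not(target_os = "macos"))]
const MAP_JIT: c_int = 0;

// `unsafe extern` needs Rust 1.82+; on older toolchains (edition 2021 or
// earlier) drop the leading `unsafe`.
unsafe extern "C" {
    fn mmap(addr: *mut c_void, len: usize, prot: c_int, flags: c_int,
            fd: c_int, offset: i64) -> *mut c_void;
    fn munmap(addr: *mut c_void, len: usize) -> c_int;
}

/// Hypothetical helper: maps `size` bytes of anonymous read/write memory,
/// adding MAP_JIT where the platform defines it. Returns `None` on failure.
fn alloc_jit_region(size: usize) -> Option<*mut u8> {
    let p = unsafe {
        mmap(ptr::null_mut(), size, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANON | MAP_JIT, -1, 0)
    };
    // MAP_FAILED is (void*)-1.
    if p as isize == -1 { None } else { Some(p as *mut u8) }
}

/// Hypothetical helper: unmaps a region returned by `alloc_jit_region`.
fn free_jit_region(region: *mut u8, size: usize) -> bool {
    unsafe { munmap(region as *mut c_void, size) == 0 }
}
```

The region would still need to be made executable (and the thread switched to execute mode) before any code in it is called.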

### 2. Missing W^X mode switch for spawned threads

Apple Silicon enforces W^X (Write XOR Execute) at the hardware level. Each thread has an independent write/execute mode:

- `pthread_jit_write_protect_np(0)` = write mode (can write JIT memory, cannot execute)
- `pthread_jit_write_protect_np(1)` = execute mode (can execute JIT code, cannot write)

**Threads inherit write mode by default.** The current implementation doesn't switch spawned threads to execute mode before calling JIT code, causing crashes.
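
A minimal sketch of the switch described above, assuming the caller controls thread creation. `enter_jit_execute_mode` and `spawn_jit_thread` are hypothetical names, and the binding to `pthread_jit_write_protect_np` is declared by hand. On targets other than ARM64 macOS the switch compiles to a no-op.

```rust
use std::thread;

// `unsafe extern` needs Rust 1.82+; on older toolchains drop the `unsafe`.
#[cfg(all(target_os = "macos", target_arch = "aarch64"))]
unsafe extern "C" {
    // Declared in <pthread.h> on macOS 11+; affects only the calling thread.
    fn pthread_jit_write_protect_np(enabled: i32);
}

/// Puts the *current* thread into execute mode for MAP_JIT memory.
#[cfg(all(target_os = "macos", target_arch = "aarch64"))]
fn enter_jit_execute_mode() {
    unsafe { pthread_jit_write_protect_np(1) }
}

/// No-op on platforms without per-thread JIT write protection.
#[cfg(not(all(target_os = "macos", target_arch = "aarch64")))]
fn enter_jit_execute_mode() {}

/// Hypothetical wrapper around `thread::spawn` that switches the new thread
/// to execute mode before running `body`, which may then call JIT code.
fn spawn_jit_thread<F, T>(body: F) -> thread::JoinHandle<T>
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    thread::spawn(move || {
        enter_jit_execute_mode();
        body()
    })
}
```

In the simple test case, `thread::spawn(move || { ... })` would become `spawn_jit_thread(move || { ... })`. Threads created by code the JIT embedder does not control would need the same call made some other way.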

## Proposed Solution

1. **Use `mmap` with `MAP_JIT`** for memory allocation on ARM64 macOS instead of the standard allocator
2. **Call `pthread_jit_write_protect_np(1)`** after making memory executable to switch to execute mode
3. **Add memory barriers** (DSB SY + ISB SY) for proper icache coherency on Apple Silicon's heterogeneous cores
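
The barrier step (item 3) might be sketched as below. `jit_publish_barrier` and `write_then_barrier` are hypothetical names, and the non-AArch64 fallback is only a stand-in so the sketch builds on other architectures. Note that this emits the barriers only. Invalidating the instruction cache for the written range is a separate operation (macOS exposes `sys_icache_invalidate` for that), and whether the fork does both is not shown here.

```rust
use std::sync::atomic::{fence, Ordering};

/// Issues `DSB SY` followed by `ISB SY` on AArch64.
#[cfg(target_arch = "aarch64")]
fn jit_publish_barrier() {
    // No `nomem` option: the compiler must treat this as touching memory.
    unsafe {
        core::arch::asm!("dsb sy", "isb sy", options(nostack, preserves_flags));
    }
}

/// Stand-in for other architectures: a sequentially consistent fence.
#[cfg(not(target_arch = "aarch64"))]
fn jit_publish_barrier() {
    fence(Ordering::SeqCst);
}

/// Hypothetical helper: copies machine code into a writable buffer and then
/// issues the barrier. Returns the number of bytes written, or `None` if
/// the buffer is too small.
fn write_then_barrier(buf: &mut [u8], code: &[u8]) -> Option<usize> {
    if code.len() > buf.len() {
        return None;
    }
    buf[..code.len()].copy_from_slice(code);
    jit_publish_barrier();
    Some(code.len())
}
```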

## Technical References

- [Apple: Writing ARM64 Code for Apple Platforms](https://developer.apple.com/documentation/xcode/writing_arm64_code_for_apple_platforms)
- [Porting Just-In-Time Compilers to Apple Silicon](https://developer.apple.com/documentation/apple-silicon/porting-just-in-time-compilers-to-apple-silicon)
- Apple Silicon has independent instruction caches per core (P-cores and E-cores), requiring explicit barriers

## Related Issues

- #2735 - Support PLT entries in `cranelift-jit` crate on aarch64
- #8852 - Cranelift: JIT assertion failure when using `ArgumentPurpose::StructArgument` on macOS (A64)
- #4000 - Cranelift: JIT relocations depend on system allocator behaviour

