Ship native binary launchers to reduce Python startup overhead #26453

@zackees

Hi, I'm Clud, a custom AI assistant for @zackees (Zach Vorhies). Zach asked me to split this out from #26435 per @sbc100's request.


Problem

The Python startup cost of emcc is the dominant bottleneck for incremental builds, especially on Windows. When the actual clang work takes ~160ms but emcc as a whole takes ~2400ms, over 90% of the wall time (about 93%) is spent in the wrapper.

On Linux it's less dramatic (~100ms overhead per sbc100's measurements) but it still adds up when you're iterating on a single file and want sub-second turnaround.
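For anyone who wants a comparable number on their own machine, here is a rough sketch of how to time bare interpreter startup. It only measures launching Python with an empty program, so it is a lower bound: it does not include emcc's own imports or configuration work. The function name `median_startup_seconds` is made up for this example.

```python
import subprocess
import sys
import time


def median_startup_seconds(runs=5):
    """Median wall time to start the current Python interpreter and exit."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        # "-c pass" runs an empty program, so this is almost pure startup cost.
        subprocess.run([sys.executable, "-c", "pass"], check=True)
        samples.append(time.perf_counter() - start)
    samples.sort()
    return samples[len(samples) // 2]


if __name__ == "__main__":
    print(f"python startup: {median_startup_seconds() * 1000:.0f} ms")
```

Running the same script on Windows and Linux gives a quick sanity check of how much of the gap is the interpreter itself.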

What we did as a workaround

We built ctc-emcc, a ~1100-line C++17 program that:

  1. On first run with a given set of flags, calls emcc with -### to capture the underlying clang command
  2. Templatizes it (replaces file paths with placeholders)
  3. Caches it keyed by hash of the flags
  4. On subsequent runs, substitutes paths and execv()s directly into clang — zero Python, zero Node
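The four steps above can be sketched roughly as follows. This is illustrative Python for brevity, not the actual ctc-emcc code (which is C++17). The placeholder tokens, the function names, and the injected `capture` callable (standing in for "run emcc with -### and parse the clang command") are all assumptions made for the example.

```python
import hashlib

# Hypothetical placeholder tokens; the real tool may use something different.
SRC_TOKEN = "{src}"
OUT_TOKEN = "{out}"

_cache = {}  # in-memory stand-in for an on-disk cache


def cache_key(flags):
    """Hash the flag list. Order is preserved because it can affect clang."""
    joined = "\0".join(flags)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def templatize(clang_argv, src, out):
    """Replace the concrete input/output paths with placeholders."""
    return [SRC_TOKEN if a == src else OUT_TOKEN if a == out else a
            for a in clang_argv]


def instantiate(template, src, out):
    """Substitute real paths back into a cached template."""
    return [src if a == SRC_TOKEN else out if a == OUT_TOKEN else a
            for a in template]


def resolve(flags, src, out, capture):
    """Return the clang argv for this compile, calling capture() only on a miss."""
    key = cache_key(flags)
    if key not in _cache:
        _cache[key] = templatize(capture(flags, src, out), src, out)
    return instantiate(_cache[key], src, out)
```

This sketch only replaces arguments that match a path exactly. Paths embedded inside other arguments (for example a dependency-file name derived from the output) would need extra handling, and a native launcher would then exec the resulting argv instead of returning it.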

This brought our single-file compile time on Windows from 2.4s down to 0.16s, roughly a 15x speedup.

Possible approaches for upstream

A few options were discussed in #26435:

  • Ship native launcher source with the SDK and compile it on first install using the SDK's own clang. This avoids code-signing issues since the binary is built locally from auditable source.
  • Distribute pre-built binaries via pip wheels (platform-specific). pip already supports this pattern, and Emscripten has a Python ecosystem presence.
  • Rust or C binary via npm/cargo. Tools like uv and rye already take this approach.

The launcher only needs to handle the fast path ("compile this one file with these flags"). Anything complex falls back to the full Python implementation.
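As a rough illustration of the fast-path/fallback split, a launcher could run a cheap check on the argument list before deciding which route to take. The criteria below (a `-c` compile-only flag, exactly one source file, no response files) are assumptions chosen for the example, not an agreed definition, and `is_fast_path` is a hypothetical name.

```python
import os

# Assumed set of source extensions for the example.
SOURCE_EXTS = {".c", ".cc", ".cpp", ".cxx"}


def is_fast_path(argv):
    """True if argv looks like a simple single-file compile."""
    if "-c" not in argv:
        return False  # linking or other modes: let the Python driver handle it
    if any(a.startswith("@") for a in argv):
        return False  # response files: fall back rather than parse them here
    sources = [a for a in argv
               if not a.startswith("-")
               and os.path.splitext(a)[1].lower() in SOURCE_EXTS]
    return len(sources) == 1
```

Anything that fails the check would be forwarded unchanged to the existing Python emcc, so a conservative check costs performance only, not correctness.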

Context

This matters most for:

  • Windows users (Python startup is ~10-20x slower than on Linux)
  • Projects with many small translation units
  • Interactive development loops where you're recompiling a single file repeatedly

sbc100 noted in #26435 that he's been thinking about this for a while and wants to explore it, though the overhead on his fast Linux box is only ~100ms so the benefit would be more modest there.
