Skip to content

Upgrade napi-rs 2.x → 3 to fix node bindings CI worker_threads segfault#161

Draft
lfarrel6 wants to merge 2 commits into
mainfrom
claude/stoic-albattani-ueOtk
Draft

Upgrade napi-rs 2.x → 3 to fix node bindings CI worker_threads segfault#161
lfarrel6 wants to merge 2 commits into
mainfrom
claude/stoic-albattani-ueOtk

Conversation

@lfarrel6
Copy link
Copy Markdown
Member

Why

The lint_and_test_node job has been failing for a long time with yarn test reporting all tests pass and then the process exiting with code 129 (under yarn) or 139 (under ava directly). It looked like a yarn problem because yarn is the visible top-level command, but it isn't.

Root cause: napi-rs 2.x is not fully context-aware with respect to Node's worker_threads. AVA 4+ runs each test file in a Worker by default. When the worker's V8 isolate is torn down, napi-rs 2.x leaves dangling references (thread_local! REFERENCE_MAP, threadsafe-function plumbing, env-cleanup hooks not deregistered) and the teardown SIGSEGVs. The crash propagates back through ava/yarn as 129 (SIGHUP, yarn rewraps the child signal) or 139 (raw SIGSEGV) depending on how the process tree dies.

Verified the package manager is irrelevant — same crash reproduces under pnpm with a fresh project consuming the prebuilt .node. The minimal repro is new Worker(... import('./index.js')) followed by worker exit — no AVA, no yarn, no libfaketime needed. So migrating to pnpm would not have fixed it.

The relevant fixes shipped in napi-rs 3.x:

What changed

  • node-attestation-bindings/Cargo.toml: napi 2.10.6 → 3, napi-derive 2.9.4 → 3, napi-build =2.0.12. Kept the napi4 feature flag (still maps to Node-API v4).
  • node-attestation-bindings/src/lib.rs: napi::JsBuffer doesn't exist in v3. Switched to napi::bindgen_prelude::Buffer, which implements AsRef<[u8]> directly, so the cert.into_value()? .as_ref() plumbing collapses to cert.as_ref(). Two match blocks (and their dead error branches) drop out — net simplification.
  • node-attestation-bindings/package.json:
    • @napi-rs/cli ^2.14.3^3 (resolves to 3.6.2).
    • Migrated napi config to v3 schema: namebinaryName, triples.additional → flat targets array. The v3 targets list must include the host triples that were previously implicit defaults, so x86_64-apple-darwin, x86_64-pc-windows-msvc, and x86_64-unknown-linux-gnu are added explicitly (publish surface unchanged).
  • node-attestation-bindings/index.js / index.d.ts: regenerated by napi build v3. The TS signature now correctly types the params as Buffer and the JS loader file is the modern v3 layout.
  • node-attestation-bindings/yarn.lock and Cargo.lock: refreshed.

The build / prepublishOnly / artifacts / universal / version scripts in package.json were left as-is — all flags (napi build --platform --release, napi prepublish -t npm, napi universal, napi artifacts) are still valid CLI surface in v3.

What I deliberately did not touch

  • .github/workflows/build-and-deploy-node-bindings.yml: the on: triggers are entirely commented out, so this workflow doesn't run today. The matrix targets and the napi script invocations are still compatible with v3, but reactivating the publish flow on v3 deserves its own PR/verification (especially the per-arch docker images and the napi artifacts/napi universal step output paths). Out of scope here.
  • .github/workflows/lint-and-test.yml: no changes needed — cargo make ci calls yarn install and yarn test, both of which now succeed.
  • AVA configuration: no workerThreads: false workaround added. The point of this PR is to fix it at the root so worker_threads stay enabled.
  • pnpm migration: independent question; punted.

Verification

Reproduced and fixed on Node 18 (the version CI uses) and Node 22:

Scenario Before After
new Worker() that only imports the addon and exits 139 SIGSEGV 0
node node_modules/.bin/ava (worker_threads default) 139 0
yarn test (the exact CI command, with libfaketime) 129 0
pnpm test against the prebuilt .node (control) 139 n/a

Core dump from the original crash showed the segfault deep in V8/N-API teardown threads with no addon symbols on the stack — consistent with the napi-rs context-awareness issue rather than anything in our Rust code.

Test plan

  • lint_and_test_node workflow goes green
  • lint_and_test_rust (clippy/fmt over the workspace, which now resolves to napi 3) goes green
  • Optional: someone with macOS sanity-checks yarn build locally before we re-enable the publish workflow

Follow-ups (not in this PR)

  • Re-enable build-and-deploy-node-bindings.yml once we're ready to publish on v3 — needs a separate pass to revalidate the docker matrix and the artifact pipeline.
  • Decide whether to standardize on pnpm. With this fix in, that becomes a pure hygiene/standardization call rather than a CI fix.

Refs:


Generated by Claude Code

The node bindings test workflow appeared to pass tests but failed with
exit 129 (yarn) / 139 (raw ava). Root cause: napi-rs 2.x is not fully
context-aware on worker_thread shutdown. AVA 4+ runs each test file in
a Worker by default; when the worker exits, napi-rs 2.x leaves dangling
references (thread_local REFERENCE_MAP, threadsafe-function plumbing)
and the V8 isolate teardown SIGSEGVs. The crash propagates back through
yarn/ava as 129 or 139 depending on signal handling.

This is fixed upstream in napi-rs 3 (napi-rs/napi-rs#2469, #2470, #3026).

Changes:
- Cargo.toml: napi 2.10.6 -> 3, napi-derive 2.9.4 -> 3, napi-build 2.0.1 -> 2
- src/lib.rs: JsBuffer -> bindgen_prelude::Buffer; into_value() plumbing
  no longer needed (Buffer impls AsRef<[u8]> directly)
- package.json: @napi-rs/cli ^2 -> ^3, migrate napi config from
  triples.additional -> targets per v3 schema, name -> binaryName
- index.js / index.d.ts: regenerated by napi build v3
- yarn.lock / Cargo.lock: refreshed

Verified locally on Node 18 (CI's version) and Node 22 that the
minimal worker_threads + native-addon repro, direct ava invocation,
and full `yarn test` all exit 0.
@socket-security
Copy link
Copy Markdown

Review the following changes in direct dependencies. Learn more about Socket for GitHub.

Diff Package Supply Chain
Security
Vulnerability Quality Maintenance License
Updatednpm/​@​types/​node@​20.11.5 ⏵ 20.19.411001008196100
Updatednpm/​@​napi-rs/​cli@​2.14.3 ⏵ 3.6.292 -810085 -1493 -4100
Updatednpm/​ava@​5.2.0 ⏵ 5.3.198 +110010092100

View full report

@socket-security
Copy link
Copy Markdown

Warning

Review the following alerts detected in dependencies.

According to your organization's Security Policy, it is recommended to resolve "Warn" alerts. Learn more about Socket for GitHub.

Action Severity Alert  (click "▶" to expand/collapse)
Warn High
License policy violation: npm argparse under Python-2.0.1

License: Python-2.0.1 - The applicable license policy does not permit this license (5) (package/LICENSE)

From: ?npm/@napi-rs/cli@3.6.2npm/argparse@2.0.1

ℹ Read more on: This package | This alert | What is a license policy violation?

Next steps: Take a moment to review the security alert above. Review the linked package source code to understand the potential risk. Ensure the package is not malicious before proceeding. If you're unsure how to proceed, reach out to your security team or ask the Socket team for help at support@socket.dev.

Suggestion: Find a package that does not violate your license policy or adjust your policy to allow this package's license.

Mark the package as acceptable risk. To ignore this alert only in this pull request, reply with the comment @SocketSecurity ignore npm/argparse@2.0.1. You can also ignore all packages with @SocketSecurity ignore-all. To ignore an alert for all future pull requests, use Socket's Dashboard to change the triage state of this alert.

View full report

@napi-rs/cli v3 (pulled in by the napi-rs 3 upgrade) depends on
@inquirer/core, which imports node:util.styleText — added in
Node 20.12. On Node 18 the CLI SyntaxErrors at startup before
napi build can compile anything, producing exit 105 from
cargo make.

CI build environment only; package's engines.node (>= 10)
remains unchanged for consumers.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

2 participants