Skip to content

fix(pampa): detect parse errors without the per-lex tree-sitter logger (bd-b7eb7)#246

Closed
cscheid wants to merge 2 commits into
mainfrom
bugfix/bd-b7eb7-tree-sitter-logger-locale-lock
Closed

fix(pampa): detect parse errors without the per-lex tree-sitter logger (bd-b7eb7)#246
cscheid wants to merge 2 commits into
mainfrom
bugfix/bd-b7eb7-tree-sitter-logger-locale-lock

Conversation

@cscheid
Copy link
Copy Markdown
Member

@cscheid cscheid commented May 31, 2026

Summary

The qmd reader (crates/pampa/src/readers/qmd.rs::read) attached a tree-sitter set_logger callback on every parse. When any logger is set, tree-sitter formats a debug string with snprintf for every lex action — and on macOS snprintf locks the global locale (localeconv_l). Under multithreaded project renders (rayon, ~16 workers) that locale lock serializes all parser threads.

How it was found

Profiled a large healthy project (401 docs / 8.2 MB, no errors) — the case we most want fast. A fresh multithreaded render spent ~75% of samples in __ulock_wait2/__ulock_wake/os_unfair_lock, call chain:

ts_lexer__advance → snprintf → __vfprintf → localeconv_l → os_unfair_lock

16 threads bought only ~1.2× wall-clock; sys time ~37 s. It is not the allocator — swapping mimalloc both as #[global_allocator] and via tree_sitter::set_allocator changed nothing. The locale-lock culprit was found with macOS sample (which symbolicates system libraries, unlike samply's sidecar).

Fix

Parse the happy path with no logger, and detect whether tree-sitter entered error-correction mode via its own public signal — the parse_with_options progress callback's ParseState::has_error — combined with the final tree's Node::has_error (which backstops an error in the last few ops before parse end). The logger is attached only for the diagnostic re-parse, which runs only when the document has an error. Encapsulated in MarkdownTree::had_parse_error. The previous, ineffective TreeSitterLogObserverFast (its cheap callback didn't help — the snprintf is inside tree-sitter, before the callback) is removed.

Why has_error is reliable here: tree-sitter-qmd declares no conflicts:, so the parser is deterministic LR and GLR multiple-stack speculation only occurs during error recovery — has_error therefore means a genuine error, not a speculative branch that will be pruned. This invariant is now documented in grammar.js and parser.rs.

Results (large healthy project, release-perf)

metric before after
16-thread render 8.04 s 3.71 s (2.2×)
16-thread sys 36.83 s 1.37 s (27×)
pass-1 @ 16 threads 2935 ms 426 ms (scales ~6.4× on 16 cores)
single-thread render 9.71 s 6.0 s (1.6×)

Correctness

  • Error-corpus snapshot tests (json / text / ariadne / locations) pass.
  • The real 565-file qmd-plans corpus reports the same 140 error files — exact parity with the previous log-based detection.
  • Full cargo xtask verify (Rust workspace + WASM/hub build + hub tests) passes.

Notes

  • Also documents the profiling lesson (use macOS sample for system-lib frames; a Rust #[global_allocator] doesn't cover a vendored C library's libc malloc) in claude-notes/instructions/performance-profiling.md.
  • Full investigation: claude-notes/research/2026-05-31-q2-render-perf-qmd-plans.md (lands separately with the website experiment).

🤖 Generated with Claude Code

cscheid and others added 2 commits May 31, 2026 09:31
…r (bd-b7eb7)

The qmd reader attached a tree-sitter `set_logger` callback on every parse. When
any logger is set, tree-sitter formats a debug string with `snprintf` for every
lex action; on macOS `snprintf` locks the global locale (`localeconv_l`). Under
multithreaded project renders (rayon, ~16 workers) that locale lock serializes
all parser threads.

Profiled on a large *healthy* project (401 docs / 8.2 MB, no errors): a fresh
multithreaded render spent ~75% of samples in __ulock_wait2/__ulock_wake/
os_unfair_lock — chain ts_lexer__advance -> snprintf -> __vfprintf ->
localeconv_l. 16 threads bought only ~1.2x wall-clock; sys ~37s. (It is NOT the
allocator: mimalloc, both as #[global_allocator] and via ts_set_allocator,
changed nothing. Identified with macOS `sample`, which symbolicates system libs
unlike samply's sidecar.)

Fix: parse the happy path with no logger, and detect whether tree-sitter entered
error-correction mode via its own public signal — the `parse_with_options`
progress callback's `ParseState::has_error` — combined with the final tree's
`Node::has_error` (the latter backstops an error in the last few ops before the
parse ends). The logger is attached only for the diagnostic re-parse, which runs
only when the document has an error. `MarkdownTree::had_parse_error` encapsulates
this; `TreeSitterLogObserverFast` (the previous, ineffective optimization) is
removed.

Why has_error is reliable here: tree-sitter-qmd declares no `conflicts:`, so the
parser is deterministic LR and GLR multiple-stack speculation only happens during
error recovery — has_error therefore means a genuine error, not a speculative
branch. Documented this invariant in grammar.js and parser.rs so it isn't
regressed.

Measured (large healthy project, release-perf): 16-thread render 8.04 -> 3.71s
(2.2x); sys 36.83 -> 1.37s (27x); pass-1 2935 -> 426ms (now scales ~6.4x on 16
cores); single-thread 1.6x faster. No diagnostic regression: error-corpus
snapshot tests pass and the real 565-file corpus reports the same 140 error
files. Full `cargo xtask verify` (incl. WASM/hub) passes.

Research notes: claude-notes/research/2026-05-31-q2-render-perf-qmd-plans.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…b7eb7)

samply's syms sidecar doesn't symbolicate system libraries, so hot/lock-holding
frames inside libsystem show up as bare addresses. Document using macOS `sample`
to resolve them, with the bd-b7eb7 worked example (guessed malloc, was wrong;
mimalloc swap as a decisive negative result; the real cause was snprintf ->
localeconv_l, the locale lock). Also note that #[global_allocator] doesn't cover
a vendored C library's direct libc malloc.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@cscheid
Copy link
Copy Markdown
Member Author

cscheid commented May 31, 2026

Superseded by #247 — same commits, rebranched to perf/tree-sitter-error-logging.

@cscheid cscheid closed this May 31, 2026
@cscheid cscheid deleted the bugfix/bd-b7eb7-tree-sitter-logger-locale-lock branch May 31, 2026 14:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant