chore(message-parser): pre-screen, LRU cache, tldParse memo, async generator#40568

Draft
ggazzo wants to merge 1 commit into develop from feat/message-parser-perf

ggazzo (Member) commented May 15, 2026

Summary

Performance optimizations for @rocket.chat/message-parser without changing grammar behavior.

  • Pre-screen: bypass Peggy for inputs with no markdown triggers; return a single plaintext paragraph immediately. Catches the bulk of trivial chat messages.
  • LRU cache (512 entries, inputs ≤ 4KB) on parse(input, options). Composite key encodes relevant options. Repeated messages (system notifications, presets, common phrases) hit the cache instead of re-parsing.
  • tldts memo (256 entries each) around tldParse calls in autoLink and autoEmail. Same URL or email reused across messages skips the tldts work.
  • parseStream async generator: top-level block splitter (BlockSplitter) feeds Peggy one block at a time and yields the event loop between blocks. Long inputs no longer block the main thread for the full parse duration.
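
The caching bullet can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the PR's code: `LruCache`, `SketchOptions`, `cacheKey`, and `cachedParse` are hypothetical names, the option fields are placeholders, and the 4KB limit is approximated by string length.

```typescript
// Minimal LRU built on Map insertion order (a sketch, not the PR's implementation).
class LruCache<V> {
  private readonly entries = new Map<string, V>();
  private readonly capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  get(key: string): V | undefined {
    const value = this.entries.get(key);
    if (value === undefined) return undefined;
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: string, value: V): void {
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.capacity) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next().value as string;
      this.entries.delete(oldest);
    }
  }
}

// Hypothetical option shape; the real Options type has its own fields.
type SketchOptions = { emoticons?: boolean; katex?: boolean };

// Stands in for "inputs <= 4KB"; measured in UTF-16 code units here (assumption).
const MAX_CACHED_INPUT_LENGTH = 4096;
const cache = new LruCache<unknown>(512);

// Composite key: fixed-width flags for output-affecting options, then the input.
const cacheKey = (input: string, options: SketchOptions = {}): string =>
  `${options.emoticons ? 1 : 0}${options.katex ? 1 : 0}:${input}`;

function cachedParse<T>(
  input: string,
  options: SketchOptions | undefined,
  slowParse: (input: string, options?: SketchOptions) => T,
): T {
  // Oversized inputs bypass the cache entirely.
  if (input.length > MAX_CACHED_INPUT_LENGTH) return slowParse(input, options);
  const key = cacheKey(input, options);
  const hit = cache.get(key) as T | undefined;
  if (hit !== undefined) return hit; // returned by reference, so callers must not mutate it
  const result = slowParse(input, options);
  cache.set(key, result);
  return result;
}
```

Because a hit returns the same object that was stored, two calls with the same input and options yield reference-equal results.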

Notes

  • parse() API and AST output are unchanged. Cached entries are returned by reference — callers should not mutate the AST (matches existing assumptions; plain(...) objects were already shared internally).
  • parseStream is opt-in (new export). Falls back to whole-input parse when the input is short or contains constructs that span blocks (|| block spoiler).
  • For true off-main-thread parsing, callers should run parse/parseStream inside a Web Worker. The async generator only yields to the event loop between blocks; the parsing itself still runs on the calling thread.
  • BlockSplitter is no longer dead code; parseStream now consumes it.
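
The parseStream notes above describe a pattern that might look like the sketch below. Everything here is an assumption for illustration: `parseBlock` stands in for the Peggy grammar, `splitBlocks` is a naive blank-line splitter rather than the PR's BlockSplitter, and the length threshold is made up.

```typescript
type Block = { type: 'PARAGRAPH'; value: string };

// Hypothetical stand-in for the grammar: one block of text in, one node out.
const parseBlock = (text: string): Block => ({ type: 'PARAGRAPH', value: text });

// Naive splitter: blank lines separate top-level blocks.
const splitBlocks = (input: string): string[] =>
  input.split(/\n{2,}/).filter((block) => block.length > 0);

// Let pending timers and I/O callbacks run before continuing.
const yieldToEventLoop = (): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, 0));

const SHORT_INPUT_LENGTH = 256; // illustrative threshold, not taken from the PR

async function* parseStreamSketch(input: string): AsyncGenerator<Block> {
  // Fallback: short inputs, or inputs containing `||` (a construct that may
  // span blocks), are parsed in one go.
  if (input.length < SHORT_INPUT_LENGTH || input.includes('||')) {
    yield parseBlock(input);
    return;
  }
  for (const block of splitBlocks(input)) {
    yield parseBlock(block);
    // Parsing still runs on this thread; this only lets other tasks interleave.
    await yieldToEventLoop();
  }
}
```

A caller would consume it with `for await (const node of parseStreamSketch(text)) { ... }`, handling each node as it arrives.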

Bench

Benchmarks reuse the same input per task, so the LRU cache dominates after the first iteration. Numbers reflect cache-hit / pre-screen behaviour, not cold parsing speed:

| Case | Before (ops/s) | After (ops/s) |
| --- | --- | --- |
| Plain text "short" | 32 | 23,542,690 |
| `**Hello world**` | 31 | 7,248,097 |
| repeated specials (worst-case) | 1.6 | 4,178,167 |
| URL with path | 26 | 3,976,463 |
| Realistic chat message | 18 | 2,740,404 |

Cold parsing of new inputs is unchanged. Real-world gains depend on how often the workload repeats the same input.
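
To make the caveat concrete, the sketch below counts how often a memoized parser would have to do real work for a repeated input versus a stream of distinct inputs. It is a hypothetical illustration: `countMisses` is not part of the package, and it uses an unbounded map rather than the 512-entry LRU.

```typescript
// Counts cache misses, i.e. how many times the real parser would have to run.
function countMisses(inputs: string[]): number {
  const seen = new Set<string>();
  let misses = 0;
  for (const input of inputs) {
    if (!seen.has(input)) {
      misses += 1; // cold path: the full parse would happen here
      seen.add(input);
    }
  }
  return misses;
}

// Benchmark-style workload: the same input on every iteration.
const repeated = Array.from({ length: 1000 }, () => '**Hello world**');
// Workload where every message is different.
const unique = Array.from({ length: 1000 }, (_, i) => `**Hello world ${i}**`);

const repeatedMisses = countMisses(repeated); // 1
const uniqueMisses = countMisses(unique); // 1000
```

A benchmark that wants to measure cold parsing speed would need inputs like `unique`, or a way to bypass the cache.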

Test plan

  • yarn test from packages/message-parser/ — 34 suites, 634 tests passing
  • Run app, open a busy channel, confirm rendering visually unchanged
  • Verify no AST mutation issues downstream (renderer, search highlight, notifications)
  • Optional: hook parseStream into composer preview / large thread render path

- Skip Peggy for inputs with no markdown triggers (trivial text fast path).
- LRU cache (512 entries, <=4KB inputs) for repeated parse(input, options).
- LRU memo (256 entries) around tldts parse() in autoLink/autoEmail.
- parseStream async generator: split top-level blocks via BlockSplitter and
  yield event loop between blocks for long inputs.

dionisio-bot (Bot, Contributor) commented May 15, 2026

Looks like this PR is not ready to merge, because of the following issues:

  • This PR is missing the 'stat: QA assured' label
  • This PR is missing the required milestone or project

Please fix the issues and try again

If you have any trouble, please check the PR guidelines

changeset-bot (Bot) commented May 15, 2026

⚠️ No Changeset found

Latest commit: cac97b5

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

coderabbitai (Bot, Contributor) commented May 15, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 7eef6c97-3bfb-4f60-987d-508b86a32113

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

ggazzo changed the title from "perf(message-parser): pre-screen, LRU cache, tldParse memo, async generator" to "chore(message-parser): pre-screen, LRU cache, tldParse memo, async generator" on May 15, 2026
};

export const parse = (input: string, options?: Options): Root => grammar.parse(input, options);
const MARKDOWN_TRIGGER = /[\r\n*_~`@#:<>|!+$\\[\]()\-]|\.[A-Za-z]|^\d+\.|[⌀-➿☀-⛿\uD800-\uDBFF]/;
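
A rough sketch of how a pre-screen built on this pattern could work is shown below. The regex is transcribed from the line above with the literal symbol ranges written as `\u` escapes. The node shapes (`PARAGRAPH`, `PLAIN_TEXT`), the `parseWithPreScreen` name, and sending empty input to the full parser are assumptions made for illustration, not confirmed details of the PR.

```typescript
// Assumed AST node shapes for a single plaintext paragraph.
type Plain = { type: 'PLAIN_TEXT'; value: string };
type Paragraph = { type: 'PARAGRAPH'; value: Plain[] };

// Any character or sequence that could start markdown, a link, a list, or an emoji.
const MARKDOWN_TRIGGER =
  /[\r\n*_~`@#:<>|!+$\\[\]()\-]|\.[A-Za-z]|^\d+\.|[\u2300-\u27BF\u2600-\u26FF\uD800-\uDBFF]/;

function parseWithPreScreen<T>(
  input: string,
  fullParse: (input: string) => T,
): T | Paragraph[] {
  // Fast path: nothing in the input can trigger a grammar rule.
  if (input.length > 0 && !MARKDOWN_TRIGGER.test(input)) {
    return [{ type: 'PARAGRAPH', value: [{ type: 'PLAIN_TEXT', value: input }] }];
  }
  // Slow path: hand the input to the real parser (passed in here as a callback).
  return fullParse(input);
}
```

With this pattern, `hello world` and `ok.` take the fast path, while `**bold**`, `see example.com`, `1. item`, and any input containing an emoji surrogate pair go to the full parser.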

codecov (Bot) commented May 15, 2026

Codecov Report

❌ Patch coverage is 89.77273% with 9 lines in your changes missing coverage. Please review.
✅ Project coverage is 69.61%. Comparing base (83f1191) to head (cac97b5).
⚠️ Report is 18 commits behind head on develop.

@@             Coverage Diff             @@
##           develop   #40568      +/-   ##
===========================================
+ Coverage    69.60%   69.61%   +0.01%     
===========================================
  Files         3323     3326       +3     
  Lines       122610   122904     +294     
  Branches     21853    21940      +87     
===========================================
+ Hits         85338    85563     +225     
- Misses       33939    33973      +34     
- Partials      3333     3368      +35     

| Flag | Coverage Δ |
| --- | --- |
| e2e | 59.14% <ø> (+0.08%) ⬆️ |
| e2e-api | 47.15% <ø> (+0.89%) ⬆️ |
| unit | 70.30% <89.77%> (-0.05%) ⬇️ |
