perf: cache compiled regexes in TriggerMatcher #109

Merged
matt1398 merged 1 commit into matt1398:main from MintCollector:fix/regex-caching on Mar 11, 2026

MintCollector (Contributor) commented Mar 10, 2026

Summary

  • Adds a bounded LRU-style cache (max 500 entries) for compiled RegExp objects in TriggerMatcher
  • matchesPattern() and matchesIgnorePatterns() now use getCachedRegex() instead of recompiling via createSafeRegExp() on every call
  • Cache key uses null-byte separator to avoid pattern/flag collisions
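
A short sketch of why a separator matters when the key is built by string concatenation. Only the `${pattern}\0${flags}` form comes from this PR; the helper names `naiveKey` and `separatedKey` and the sample patterns are made up for illustration.

```typescript
// Hypothetical helpers; only the `${pattern}\0${flags}` key shape is from the PR.
function naiveKey(pattern: string, flags: string): string {
  return pattern + flags;
}

function separatedKey(pattern: string, flags: string): string {
  return `${pattern}\0${flags}`;
}

// Two different (pattern, flags) pairs that concatenate to the same string.
const a: [string, string] = ["errori", ""];
const b: [string, string] = ["error", "i"];

console.log(naiveKey(...a) === naiveKey(...b)); // true: both are "errori"
console.log(separatedKey(...a) === separatedKey(...b)); // false
```

Valid RegExp flags are letters only, so a NUL character in the key can never be mistaken for part of the flags.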

Addresses #95

Test plan

  • All 653 tests pass
  • Typecheck clean
  • Lint clean
  • Production build succeeds
  • Cache is bounded — evicts oldest entry when full

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Refactor
    • Optimized pattern matching operations for improved performance and efficiency.

Add bounded (500-entry) module-level cache for compiled RegExp objects.
Eliminates ~10,000 redundant regex compilations per session when checking
1000 messages against 10 trigger patterns.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
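
The figure in the commit message follows from 1000 × 10 = 10,000 matcher calls, each of which previously compiled a pattern. A rough sketch that counts compilations with and without a cache; `compile`, `compileCount`, and the sample patterns and messages are invented for illustration, and plain `new RegExp` stands in for `createSafeRegExp`.

```typescript
let compileCount = 0;

// Stand-in for createSafeRegExp (assumption: the real helper also validates patterns).
function compile(pattern: string, flags: string): RegExp {
  compileCount++;
  return new RegExp(pattern, flags);
}

const patterns = Array.from({ length: 10 }, (_, i) => `error-${i}`);
const messages = Array.from({ length: 1000 }, (_, i) => `message ${i}`);

// Without a cache: one compilation per (message, pattern) pair.
for (const message of messages) {
  for (const p of patterns) compile(p, "i").test(message);
}
const uncachedCompiles = compileCount;

// With a cache: one compilation per distinct pattern/flags pair.
compileCount = 0;
const cache = new Map<string, RegExp>();
for (const message of messages) {
  for (const p of patterns) {
    const key = `${p}\0i`;
    let re = cache.get(key);
    if (!re) {
      re = compile(p, "i");
      cache.set(key, re);
    }
    re.test(message);
  }
}
const cachedCompiles = compileCount;

console.log(uncachedCompiles, cachedCompiles); // 10000 10
```

Under these assumptions the cache saves 9,990 of the 10,000 compilations, which matches the "~10,000" estimate.
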
@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a significant performance optimization by implementing a caching mechanism for compiled regular expressions. By storing and reusing RegExp objects, it aims to reduce the overhead associated with recompiling patterns on every invocation of matching functions, thereby improving the efficiency of pattern-based operations within the TriggerMatcher service.

Highlights

  • Regex Caching: Implemented a bounded LRU-style cache (max 500 entries) for compiled RegExp objects within TriggerMatcher.
  • Performance Improvement: Functions matchesPattern() and matchesIgnorePatterns() now leverage getCachedRegex() to avoid repeated recompilation of regular expressions.
  • Cache Key Strategy: Utilized a null-byte separator in the cache key (${pattern}\0${flags}) to prevent potential collisions between different patterns or flags.

Changelog

  • src/main/services/error/TriggerMatcher.ts
    • Introduced a module-level regexCache (Map) and a MAX_CACHE_SIZE constant.
    • Added a getCachedRegex function to manage the cache, handling compilation, retrieval, and LRU-style eviction.
    • Modified matchesPattern to use getCachedRegex instead of createSafeRegExp directly.
    • Modified matchesIgnorePatterns to use getCachedRegex instead of createSafeRegExp directly.
    • Updated JSDoc comments for matchesPattern and matchesIgnorePatterns to reflect the caching.

Activity

  • All 653 tests passed.
  • Typecheck was clean.
  • Lint was clean.
  • Production build succeeded.
  • The cache was verified to be bounded and correctly evict oldest entries when full.
  • The pull request was generated with Claude Code.

coderabbitai bot added the "feature request" label Mar 10, 2026

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request aims to improve performance by caching compiled regular expressions in TriggerMatcher using a Map as a bounded cache. However, the current implementation is vulnerable to memory exhaustion, potentially leading to a Denial of Service (DoS), as it caches invalid patterns of arbitrary length. Additionally, the caching strategy could be refined from FIFO to a true LRU policy for better effectiveness.

Comment on lines +32 to +49
function getCachedRegex(pattern: string, flags: string): RegExp | null {
  const key = `${pattern}\0${flags}`;
  if (regexCache.has(key)) {
    return regexCache.get(key) ?? null;
  }

  // Evict oldest entries when cache is full
  if (regexCache.size >= MAX_CACHE_SIZE) {
    const firstKey = regexCache.keys().next().value;
    if (firstKey !== undefined) {
      regexCache.delete(firstKey);
    }
  }

  const regex = createSafeRegExp(pattern, flags);
  regexCache.set(key, regex);
  return regex;
}

security-medium medium

The getCachedRegex function is vulnerable to Denial of Service (DoS) via memory exhaustion. It creates cache keys from arbitrarily long pattern strings before validating their length. Even if createSafeRegExp rejects long patterns, the long string is cached as a key with a null value, allowing an attacker to fill the cache with large keys and consume significant memory. A length check should be performed before caching. Furthermore, the current cache implementation acts as a FIFO queue; a true LRU (Least Recently Used) strategy would be more effective for retaining frequently used items.

function getCachedRegex(pattern: string, flags: string): RegExp | null {
  if (pattern.length > 100) {
    return null;
  }
  const key = `${pattern}\0${flags}`;
  if (regexCache.has(key)) {
    return regexCache.get(key) ?? null;
  }

  // Evict oldest entries when cache is full
  if (regexCache.size >= MAX_CACHE_SIZE) {
    const firstKey = regexCache.keys().next().value;
    if (firstKey !== undefined) {
      regexCache.delete(firstKey);
    }
  }

  const regex = createSafeRegExp(pattern, flags);
  regexCache.set(key, regex);
  return regex;
}
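
To illustrate the concern, a minimal sketch comparing what ends up in the cache with and without the guard. It assumes, as the comment above hypothesizes, that `createSafeRegExp` returns null for over-long patterns. The stand-in `createSafeRegExpStub`, the 100-character limit, and the helpers `lookup` and `keyChars` are assumptions for illustration, not the project's code, and eviction is left out to keep the example short.

```typescript
const MAX_PATTERN_LENGTH = 100; // assumed limit, mirroring the suggestion above

// Stand-in for createSafeRegExp: returns null for long or invalid patterns (assumption).
function createSafeRegExpStub(pattern: string, flags: string): RegExp | null {
  if (pattern.length > MAX_PATTERN_LENGTH) return null;
  try {
    return new RegExp(pattern, flags);
  } catch {
    return null;
  }
}

function lookup(
  cache: Map<string, RegExp | null>,
  pattern: string,
  flags: string,
  guard: boolean
): RegExp | null {
  // With the guard, over-long patterns never become cache keys.
  if (guard && pattern.length > MAX_PATTERN_LENGTH) return null;
  const key = `${pattern}\0${flags}`;
  if (cache.has(key)) return cache.get(key) ?? null;
  const regex = createSafeRegExpStub(pattern, flags);
  cache.set(key, regex); // a rejected pattern is stored as key -> null
  return regex;
}

// Total number of characters held by the cache's keys.
function keyChars(cache: Map<string, RegExp | null>): number {
  let total = 0;
  for (const key of cache.keys()) total += key.length;
  return total;
}

const unguarded = new Map<string, RegExp | null>();
const guarded = new Map<string, RegExp | null>();
const longPattern = "a".repeat(10000);

lookup(unguarded, longPattern, "i", false);
lookup(guarded, longPattern, "i", true);

console.log(keyChars(unguarded)); // 10002 (pattern + NUL + flag)
console.log(keyChars(guarded)); // 0
```

Because the entry count is bounded but key length is not, the worst-case memory held by keys scales with the longest pattern accepted as a key, which is what the length check caps.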

coderabbitai bot commented Mar 10, 2026

Walkthrough

Introduces a module-level RegExp cache in TriggerMatcher.ts with a bounded size of 500 entries. A getCachedRegex helper function compiles and caches RegExp objects, evicting the oldest entry when full. The matchesPattern and matchesIgnorePatterns methods are updated to utilize this cache instead of directly calling createSafeRegExp.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| RegExp Caching: src/main/services/error/TriggerMatcher.ts | Adds module-level RegExp cache with MAX_CACHE_SIZE of 500, implements getCachedRegex helper for pattern compilation and caching, and updates matchesPattern and matchesIgnorePatterns to use cached RegExp objects. |

Suggested labels

feature request

@coderabbitai coderabbitai bot left a comment

🧹 Nitpick comments (1)
src/main/services/error/TriggerMatcher.ts (1)

34-42: Implement true LRU eviction by refreshing cache hits.

The current implementation evicts the oldest inserted entry, not the least recently used one. Map.prototype.get() does not change iteration order, so frequently accessed patterns will still be evicted under heavy workload churn with >500 distinct patterns. To implement LRU with a Map, delete and re-insert the key on cache hits.
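
The ordering behavior described here can be checked with a small standalone sketch. The map contents and variable names are illustrative only and unrelated to the project's `regexCache`.

```typescript
const order = new Map<string, number>([
  ["a", 1],
  ["b", 2],
  ["c", 3],
]);

// Reading a key leaves insertion order untouched.
order.get("a");
const afterGet = Array.from(order.keys()); // ["a", "b", "c"]

// Deleting and re-inserting moves the key to the end (most recently used).
const value = order.get("a");
if (value !== undefined) {
  order.delete("a");
  order.set("a", value);
}
const afterRefresh = Array.from(order.keys()); // ["b", "c", "a"]

// The first key in iteration order is now the least recently used entry.
const evictionCandidate = order.keys().next().value; // "b"

console.log(afterGet, afterRefresh, evictionCandidate);
```

Calling `set()` on a key that is already present updates the value but keeps its original position, which is why the `delete()` is needed before re-inserting.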

♻️ Suggested change
 function getCachedRegex(pattern: string, flags: string): RegExp | null {
   const key = `${pattern}\0${flags}`;
   if (regexCache.has(key)) {
-    return regexCache.get(key) ?? null;
+    const cached = regexCache.get(key) ?? null;
+    regexCache.delete(key);
+    regexCache.set(key, cached);
+    return cached;
   }

   // Evict oldest entries when cache is full
   if (regexCache.size >= MAX_CACHE_SIZE) {
     const firstKey = regexCache.keys().next().value;
     if (firstKey !== undefined) {
       regexCache.delete(firstKey);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/main/services/error/TriggerMatcher.ts` around lines 34 - 42, regexCache
does not refresh an entry's position on a cache hit, so eviction is FIFO rather
than LRU. Implement true LRU in getCachedRegex: on a hit (regexCache.has(key)),
read the value with regexCache.get(key), refresh its recency by deleting and
re-inserting the same key/value pair (delete(key); set(key, value)), then
return it. Keep the existing eviction logic, which removes
regexCache.keys().next().value once the cache size reaches MAX_CACHE_SIZE.
ℹ️ Review info
📥 Commits

Reviewing files that changed from the base of the PR and between f23c581 and e51c1fd.

📒 Files selected for processing (1)
  • src/main/services/error/TriggerMatcher.ts

@matt1398 matt1398 merged commit f69493c into matt1398:main Mar 11, 2026
4 checks passed
Labels

feature request New feature or request

2 participants