Skip to content

Fix bug in PHP syntax highlighting#203

Merged
javiereguiluz merged 1 commit intosymfony-tools:mainfrom
wouterj:fix-php-highlight
Mar 31, 2026
Merged

Fix bug in PHP syntax highlighting#203
javiereguiluz merged 1 commit intosymfony-tools:mainfrom
wouterj:fix-php-highlight

Conversation

@wouterj
Copy link
Copy Markdown
Contributor

@wouterj wouterj commented Mar 22, 2026

Don't ask me why and how. I don't see why it breaks in this UX example, but doesn't in e.g. https://symfony.com/doc/current/messenger.html#creating-a-message-handler And I also don't see how this change fixes it. But it fixes it, and it doesn't appear to break any other PHP attributes in the test suite of this library (which is quite extensive).

@javiereguiluz
Copy link
Copy Markdown
Collaborator

Wouter, thanks for fixing this.

FYI, below I reproduce the analysis made by Claude Code:


Root Cause: Catastrophic Backtracking

The bug is caused by catastrophic backtracking in PCRE, triggered by the interaction between the case_insensitive: true flag in php.json and the nested quantifiers in the attribute regex.

The regex in question

Inside the meta mode (PHP attributes #[...]), there are two variants:

  1. Variant 1 (for attributes with parens): \s*(\\?[A-Z][A-Za-z0-9_\x7f-\xff]+)+(?=\()
  2. Variant 2 (for attributes without parens): \s*(\\?[A-Z][A-Za-z0-9_\x7f-\xff]+)+(?!\()

These are combined into a single mega-regex alternation by the highlight engine.

Why case_insensitive: true is the catalyst

The php.json grammar has case_insensitive: true, which adds the i flag to ALL compiled regexes. This makes [A-Z] match both uppercase AND lowercase letters.

Without i flag: The group (\\?[A-Z][A-Za-z0-9_]+)+ can only start a new repetition at uppercase letters. For AsNativeConfigurationProvider, only positions A(0), N(2), C(8), P(21) are valid split points → ~2^3 = 8 backtracking attempts.

With i flag: [A-Z] matches any letter, so EVERY character position can start a new group repetition → ~2^28 ≈ 268 million backtracking attempts.

The chain of failure

  1. The mega-regex tries variant 1 first: (\\?[A-Z]...)+(?=\(). It greedily matches AsNativeConfigurationProvider, then the lookahead (?=\() fails (next char is ], not ().
  2. PCRE backtracks, trying every possible way to split the 29-character name among group repetitions. With the i flag, this is exponential.
  3. For names ≥28 characters, this exceeds PHP's pcre.backtrack_limit (default 1,000,000). preg_match returns FALSE.
  4. The highlight engine's RegEx::exec() sees an empty result and returns null. The Terminators::exec() returns null. The main highlight loop breaks out — but $this->top is still the meta mode!
  5. The cleanup code at the end of highlight() closes any open spans, producing </span> at the very end of the output — which is why the entire rest of the file gets swallowed into the meta span.

Why shorter names work

  • AsMessageHandler (16 chars) → backtracking fits within the limit
  • AsNativeConfiguration (21 chars) → still fits
  • AsNativeConfigurationProvider (29 chars) → exceeds the limit

The threshold is around 28 characters.

Why removing (?=\() fixes it

Without the lookahead, variant 1's greedy match succeeds immediately — there's nothing to fail and trigger backtracking. The contains sub-rule (begin: "\\(") then handles the optional parentheses: if ( follows, it matches; if not, the meta mode's ] end simply closes the attribute. No exponential behavior at all.

@Kocal
Copy link
Copy Markdown

Kocal commented Mar 23, 2026

Damn 😅 Thanks for the explanation!

@wouterj
Copy link
Copy Markdown
Contributor Author

wouterj commented Mar 23, 2026

Oh that makes sense, then we would need another fix and can probably simplify the variants into 1.

@wouterj wouterj force-pushed the fix-php-highlight branch from 53d984c to ebaae6a Compare March 23, 2026 22:21
@wouterj
Copy link
Copy Markdown
Contributor Author

wouterj commented Mar 23, 2026

Now with a nicer fix :)

Copy link
Copy Markdown
Collaborator

@javiereguiluz javiereguiluz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! Thanks a lot Wouter.

@javiereguiluz
Copy link
Copy Markdown
Collaborator

It's merged now! Thanks.

@javiereguiluz javiereguiluz merged commit 3d534e5 into symfony-tools:main Mar 31, 2026
3 checks passed
@wouterj wouterj deleted the fix-php-highlight branch April 1, 2026 12:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

4 participants