Avoid leaking extra-pattern matches in scrub reasons by Whning0513 · Pull Request #2048 · pydantic/logfire

Whning0513 · 2026-07-01T13:16:03Z

Summary

- Compile scrub patterns with per-pattern groups so we can identify which configured pattern matched.
- Keep the existing default scrub reasons, but use the configured extra pattern string instead of the matched secret substring.
- Add a regression test covering URL credential scrubbing via `extra_patterns`.
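
The per-pattern grouping described in the summary can be sketched with the standard `re` module alone. This is a minimal illustration, not the code in `logfire/_internal/scrubbing.py`: the names `DEFAULT_PATTERNS`, `compile_with_groups`, and the `p{i}` group naming are assumptions made up for the example, and the default pattern list is a placeholder.

```python
import re

# Hypothetical stand-ins; the real default patterns and helper names may differ.
DEFAULT_PATTERNS = ["password", "secret"]


def compile_with_groups(extra_patterns):
    """Join all patterns into one regex, wrapping each in its own named group."""
    parts = []
    source_by_group = {}
    for i, pattern in enumerate([*DEFAULT_PATTERNS, *extra_patterns]):
        group = f"p{i}"  # group names must be valid Python identifiers
        parts.append(f"(?P<{group}>{pattern})")
        source_by_group[group] = pattern
    combined = re.compile("|".join(parts), re.IGNORECASE | re.DOTALL)
    return combined, source_by_group


regex, source_by_group = compile_with_groups([r"postgres://\S+"])
match = regex.search("dsn=postgres://user:hunter2@db/app")

# lastgroup names the wrapping group, which maps back to the configured pattern.
matched_pattern = source_by_group[match.lastgroup]
```

Because each alternative is wrapped in exactly one outer named group, `match.lastgroup` identifies the configured pattern without inspecting the matched text. One caveat of this approach is that wrapping shifts group numbering, so a user pattern relying on numbered backreferences would need separate handling.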

Testing

python -m pytest -q tests/test_secret_scrubbing.py -k extra_pattern_redaction_reason_does_not_echo_secret
python -m pytest -q tests/test_secret_scrubbing.py
python -m pytest -q tests/test_print.py -k instrument_print

Fixes #1909

Copilot

Pull request overview

This PR addresses a sensitive-data leak in Logfire’s scrubbing system where scrub markers could echo the matched secret substring when using ScrubbingOptions(extra_patterns=...), and adds regression coverage to prevent reintroducing the issue.

Changes:

- Compiles scrubbing regexes with per-pattern named groups to attribute a match to a specific configured pattern.
- Updates redaction “reason” generation to use the configured `extra_patterns` regex string (instead of the matched substring) while preserving existing behavior for default patterns.
- Adds a regression test for URL-credential scrubbing to ensure the scrub marker and `logfire.scrubbed` metadata do not contain the secret.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| `logfire/_internal/scrubbing.py` | Changes scrubber regex compilation and redaction reason selection to avoid leaking matched secrets for `extra_patterns`. |
| `tests/test_secret_scrubbing.py` | Adds a regression test ensuring scrub reasons don’t echo URL credentials matched via `extra_patterns`. |

        matched_substring = match.pattern_match.group(0)
-        self.scrubbed.append(ScrubbedNote(path=match.path, matched_substring=matched_substring))
-        return f'[Scrubbed due to {matched_substring!r}]'
+        reason = self._pattern_reason_by_group.get(match.pattern_match.lastgroup or '', matched_substring) or matched_substring
+        self.scrubbed.append(ScrubbedNote(path=match.path, matched_substring=reason))
+        return f'[Scrubbed due to {reason!r}]'
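
The lookup expression in the added line uses two fallbacks, and a small standalone sketch may make their roles clearer. The function name `choose_reason` and the example mapping are hypothetical; only the expression itself is taken from the diff above.

```python
def choose_reason(reason_by_group, lastgroup, matched_substring):
    """Same expression as the diff: mapped pattern string if set, else the matched text."""
    return reason_by_group.get(lastgroup or "", matched_substring) or matched_substring


# Hypothetical mapping: default patterns map to None, extra patterns to their source string.
reason_by_group = {"p0": None, "p1": r"postgres://\S+"}

# Default pattern: the mapped value is None, so the trailing `or` falls back to the match.
default_reason = choose_reason(reason_by_group, "p0", "password")

# Extra pattern: the configured pattern string is used, not the matched secret.
extra_reason = choose_reason(reason_by_group, "p1", "postgres://user:hunter2@db/app")

# No named group matched: the `.get` default falls back to the match.
unnamed_reason = choose_reason(reason_by_group, None, "secret")
```

The `.get` default covers a missing key, while the trailing `or` covers a key that is present but mapped to `None`, which is how default patterns keep their existing reasons.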


cubic-dev-ai

No issues found across 2 files

Confidence score: 5/5

Automated review surfaced no issues in the provided summaries.
No files require special attention.

codecov · 2026-07-01T13:24:43Z

Codecov Report

✅ All modified and coverable lines are covered by tests.

coderabbitai · 2026-07-02T07:38:49Z

📝 Walkthrough

The scrubbing regex compilation in `logfire/_internal/scrubbing.py` now wraps each configured pattern in a uniquely named capture group and builds a mapping from group name to either `None` (default patterns) or the originating pattern string (extra patterns). `SpanScrubber` copies this mapping. The `_redact` function now derives the stored `ScrubbedNote` reason from the matched group via this mapping instead of always using the raw matched substring, preventing extra-pattern redaction markers from echoing sensitive matched text. A new test verifies this behavior for a PostgreSQL connection URL pattern.
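
The before and after behavior for a PostgreSQL connection URL can be illustrated without Logfire at all. This is a hedged sketch using only `re`: the credential, the pattern, and the group name `extra_0` are invented for the example, and the marker strings merely follow the `[Scrubbed due to ...]` format shown in the diff.

```python
import re

SECRET = "hunter2"  # made-up credential for illustration
EXTRA_PATTERN = r"postgres(?:ql)?://\S+"  # hypothetical extra pattern
VALUE = f"postgresql://app:{SECRET}@db.example.com:5432/app"

# Wrap the pattern in a named group, as the walkthrough describes.
match = re.search(f"(?P<extra_0>{EXTRA_PATTERN})", VALUE)

# Previous behavior: the marker is built from the matched text and echoes the secret.
old_marker = f"[Scrubbed due to {match.group(0)!r}]"

# Behavior after this change: the marker is built from the configured pattern string.
new_marker = f"[Scrubbed due to {EXTRA_PATTERN!r}]"

leaks_before = SECRET in old_marker
leaks_after = SECRET in new_marker
```

A check of this shape, asserting that the secret is absent from the marker, is the kind of guard the regression test is described as adding.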

Changes

| Area | Change |
| --- | --- |
| `logfire/_internal/scrubbing.py` | Named capture groups per pattern; reason lookup map; `_redact` uses reason instead of raw matched text for extra patterns |
| `tests/test_secret_scrubbing.py` | New test asserting scrub reason does not leak the matched secret for `extra_patterns` |

Sequence Diagram(s)

sequenceDiagram
  participant Span as SpanScrubber
  participant Redact as _redact
  participant Map as _pattern_reason_by_group
  Span->>Redact: regex match with lastgroup
  Redact->>Map: lookup reason by group name
  Map-->>Redact: None (default) or pattern string (extra)
  Redact->>Redact: choose matched_substring or pattern string as reason
  Redact-->>Span: ScrubbedNote(reason) recorded safely

Related issues: #1909 — fixes the scrub-message leak where `extra_patterns` matches exposed the matched credential substring in the `[Scrubbed due to '...']` marker.

Suggested labels: bug, security, scrubbing

Suggested reviewers: alexmojaki, Kludex

Poem:
A rabbit hopped through regex dens,
Named each group with careful pens,
No more secrets in the reason shown,
Just patterns marked, the leak now gone,
Hop, hop, hooray — the burrow's safe again! 🐇🔒

🚥 Pre-merge checks | ✅ 4

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly summarizes the main fix: preventing scrub reasons from leaking extra-pattern matches. |
| Description check | ✅ Passed | The description matches the implemented changes and regression test for `extra_patterns` scrubbing. |
| Linked Issues check | ✅ Passed | The change removes matched secret text from extra-pattern scrub reasons, which addresses issue `#1909`. |
| Out of Scope Changes check | ✅ Passed | The diff stays focused on scrubbing logic and a regression test, with no obvious unrelated changes. |

coderabbitai

🧹 Nitpick comments (1)

tests/test_secret_scrubbing.py (1)
340-358: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Prefer the inline_snapshot pattern for span assertions.

The rest of this module (e.g. test_scrubbing_config) asserts via exporter.exported_spans_as_dict(...) == snapshot(...). This test hand-picks attributes instead. Keep the secret not in ... guards (they document intent well), but add a snapshot() assertion so drift in the full span is caught.

As per coding guidelines: "Tests that create spans should use TestExporter and inline_snapshot with the pattern... assert with exporter.exported_spans_as_dict(parse_json_attributes=True) == snapshot()".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_secret_scrubbing.py` around lines 340 - 358, The span assertion in
test_extra_pattern_redaction_reason_does_not_echo_secret only checks selected
attributes, so it can miss unrelated drift in the emitted span. Keep the
existing secret-not-in guards, but update the assertion to use
exporter.exported_spans_as_dict(parse_json_attributes=True) with inline
snapshot() like the other tests in this module (for example
test_scrubbing_config) so the full span shape is verified while preserving the
redaction checks.
Source: Coding guidelines

📥 Commits

Reviewing files that changed from the base of the PR and between ef5c776 and 5560f63.

📒 Files selected for processing (2)

logfire/_internal/scrubbing.py
tests/test_secret_scrubbing.py

Avoid leaking extra-pattern matches in scrub reasons (00be5c1)

Copilot AI review requested due to automatic review settings, July 1, 2026 13:16

Copilot started reviewing on behalf of Whning0513, July 1, 2026 13:16

Copilot AI reviewed Jul 1, 2026

cubic-dev-ai Bot reviewed Jul 1, 2026

Format extra-pattern scrubber follow-up (5560f63)

hramezani requested a review from alexmojaki, July 2, 2026 07:36

coderabbitai Bot reviewed Jul 2, 2026

coderabbitai Bot approved these changes Jul 2, 2026

Whning0513 wants to merge 2 commits into pydantic:main from Whning0513:fix-extra-pattern-scrub-reason-1909
