
feat: URL hardening consolidation - plugin framework + shared helpers + test vectors (Issues #4434, #4435)#4601

Open
MohanLaksh wants to merge 8 commits into main from feat/plugin-url-hardening-and-shared-vectors

MohanLaksh (Collaborator) commented May 5, 2026

Summary

This PR consolidates URL percent-encoding hardening across the gateway and plugin framework validators, addressing two related issues: #4434 and #4435.

Background & Motivation

The gateway's SecurityValidator.validate_url() had comprehensive URL percent-encoding hardening (from PR #4335), but the plugin framework's validator lacked these protections. Additionally, both validators had duplicate helper implementations, and test vectors for encoding edge cases were embedded in a single test file.

What Was Done (3 Phases)

Phase 1: Plugin Framework Hardening (Prerequisite)

Problem: Plugin framework SecurityValidator.validate_url() was vulnerable to bypasses that the gateway already blocked.

Changes (mcpgateway/plugins/framework/validators.py):

  • Added URL decoding via _unquote_if_needed() helper
  • Added _decode_strict() for strict UTF-8 validation
  • Added regex patterns: _PERCENT_U_ESCAPE_RE (IIS %uXXXX), _JS_ESCAPE_RE (JS \uXXXX/\xXX)
  • Updated validate_url() to decode URLs before pattern checks
  • Blocks: IIS-style %uXXXX escapes, JS-style unicode escapes, invalid UTF-8 (U+FFFD), C0 control characters, protocol-relative URLs (//evil.com)
  • Updated existing plugin tests to match new error messages
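
The decode-before-check flow above can be sketched roughly as follows. This is a minimal illustration, not the code in validators.py: the regex bodies, the function name `harden_url`, and the error messages are assumptions, since the PR description names `_PERCENT_U_ESCAPE_RE` and `_JS_ESCAPE_RE` without showing their definitions.

```python
import re
from urllib.parse import unquote

# Assumed pattern bodies; the PR only names the patterns, not their contents.
PERCENT_U_ESCAPE_RE = re.compile(r"%u[0-9a-fA-F]{4}")  # IIS-style %uXXXX
JS_ESCAPE_RE = re.compile(r"\\u[0-9a-fA-F]{4}|\\x[0-9a-fA-F]{2}")  # JS \uXXXX / \xXX
C0_CONTROL_RE = re.compile(r"[\x00-\x1f]")  # C0 control characters


def harden_url(url: str) -> str:
    """Hypothetical helper: decode once, then run the checks on the decoded form."""
    if url.startswith("//"):
        raise ValueError("protocol-relative URLs are not allowed")

    # unquote() decodes as UTF-8 with errors="replace" by default, so invalid
    # byte sequences show up as U+FFFD in the decoded string.
    decoded = unquote(url) if "%" in url else url

    # Check both forms so an escape cannot hide behind one layer of encoding.
    for candidate in (url, decoded):
        if PERCENT_U_ESCAPE_RE.search(candidate):
            raise ValueError("IIS-style %uXXXX escapes are not allowed")
        if JS_ESCAPE_RE.search(candidate):
            raise ValueError("JS-style unicode escapes are not allowed")

    if "\ufffd" in decoded:
        raise ValueError("URL contains invalid UTF-8 percent-encoding")
    if C0_CONTROL_RE.search(decoded):
        raise ValueError("URL contains control characters")
    return decoded
```

In this sketch a legitimately encoded URL such as `https://example.com/a%20b` passes and comes back decoded, while an overlong sequence such as `%c0%af` is rejected because it decodes to U+FFFD.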

Phase 2: Extract Shared Helpers (#4434)

Problem: Both gateway and plugin validators had copy-pasted URL hardening logic.

Changes:

  • Created mcpgateway/common/_url_hardening.py with 4 coarse-grained helpers:
    • _unquote_if_needed() - decode percent-encoding only when % present (hot-path optimization)
    • _decode_and_check_encoding() - single-pass decode + double-encoding detection + IIS/JS-escape + U+FFFD rejection
    • _check_structural_forbidden_chars() - IPv6 brackets, control chars, spaces, protocol-relative URLs
    • _check_netloc() - decode netloc, check spaces/credentials
  • Refactored mcpgateway/common/validators.py to import from shared module
  • Refactored mcpgateway/plugins/framework/validators.py to import from shared module
  • Removed duplicate helper definitions from both files
  • Fixed TestUrlHardeningHelpers to import from correct location
  • Fixed test_ipv6_double_check_netloc_brackets (added netloc bracket check)
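
For reviewers unfamiliar with the helpers, here is a rough sketch of the hot-path decode and one possible double-encoding check. The function bodies are assumptions for illustration; in particular, the rule "a percent escape that survives one decode pass means double encoding" is a guess at the approach, not a copy of `_decode_and_check_encoding()`.

```python
import re
from urllib.parse import unquote

# A %XX escape still present after one decode pass suggests the input was
# encoded twice (for example "%2520" decodes to "%20").
_PCT_ENCODED_RE = re.compile(r"%[0-9a-fA-F]{2}")


def unquote_if_needed(value: str) -> str:
    """Skip the decode entirely when there is nothing to decode."""
    if "%" not in value:
        return value
    return unquote(value)


def decode_and_check_double_encoding(value: str) -> str:
    """Hypothetical helper: decode once and reject leftover percent escapes."""
    decoded = unquote_if_needed(value)
    if decoded != value and _PCT_ENCODED_RE.search(decoded):
        raise ValueError("URL contains double-encoded characters")
    return decoded
```

One trade-off of this rule is that a literal percent sign followed by two hex digits (for example `100%25ab`) is also rejected; whether the shared module accepts that trade-off is not stated in this PR.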

Design constraint: _url_hardening.py is stdlib-only (no mcpgateway.config.settings dependency) to keep plugin framework self-contained.
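
To illustrate what stdlib-only means in practice, a netloc check in the spirit of `_check_netloc()` needs nothing beyond `urllib.parse`. The sketch below is an assumption about its behavior (reject whitespace and embedded credentials after decoding); the name `check_netloc` and the error messages are illustrative only.

```python
from urllib.parse import unquote, urlsplit


def check_netloc(url: str) -> str:
    """Hypothetical helper: decode the authority and reject risky content."""
    netloc = urlsplit(url).netloc
    decoded = unquote(netloc) if "%" in netloc else netloc

    # Whitespace in the authority, including an encoded %20, is never legitimate.
    if any(ch.isspace() for ch in decoded):
        raise ValueError("URL authority contains whitespace")

    # "@" marks userinfo; checking the decoded form also catches an encoded %40.
    if "@" in decoded:
        raise ValueError("URL authority contains embedded credentials")
    return decoded
```

Because it imports nothing from the gateway, a function like this can be reused by the plugin framework without pulling in mcpgateway.config.settings.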

Phase 3: Shared Test Vectors (#4435)

Problem: Percent-encoding test vectors were embedded only in test_validators_advanced.py, not shared with plugin tests.

Changes:

  • Created tests/helpers/url_encoding_vectors.py with 10 vector lists following _VECTORS naming:
    • ENCODED_CRLF_VECTORS, ENCODED_HTML_TAG_VECTORS, ENCODED_DANGEROUS_PROTOCOL_VECTORS
    • ENCODED_IPV6_BRACKET_VECTORS, ENCODED_WHITESPACE_AUTHORITY_VECTORS
    • DOUBLE_ENCODED_VECTORS, IIS_UNICODE_ESCAPE_VECTORS, JS_UNICODE_ESCAPE_VECTORS
    • UTF8_OVERLONG_VECTORS, LEGITIMATE_ENCODED_ACCEPTED_VECTORS
  • Refactored TestValidateUrlPercentEncoding in test_validators_advanced.py to use shared vectors
  • Added TestSecurityValidatorPercentEncoding in plugin test_validators.py using shared vectors
  • 64 new tests added (32 per suite) exercising identical encoding edge cases
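
The idea behind the shared vectors can be sketched as below. The vector entries, `toy_validate`, and `run_vectors` are all hypothetical stand-ins: the PR names the lists but this description does not show their contents, and the real suites feed the lists to pytest parametrize decorators rather than to a loop like this one.

```python
from typing import Callable
from urllib.parse import unquote

# Hypothetical entries; only the list names come from the PR.
ENCODED_CRLF_VECTORS = [
    "https://example.com/%0d%0aSet-Cookie:x",
    "https://example.com/%0A",
]
LEGITIMATE_ENCODED_ACCEPTED_VECTORS = [
    "https://example.com/a%20b",
    "https://example.com/q?x=%C3%A9",
]


def toy_validate(url: str) -> str:
    """Stand-in validator that only rejects decoded line breaks."""
    decoded = unquote(url)
    if "\r" in decoded or "\n" in decoded:
        raise ValueError("URL contains line breaks")
    return url


def run_vectors(
    validate: Callable[[str], str],
    rejected: list[str],
    accepted: list[str],
) -> list[str]:
    """Run one validator over the shared vectors and collect mismatches."""
    failures = []
    for url in rejected:
        try:
            validate(url)
        except ValueError:
            continue
        failures.append(f"accepted but should be rejected: {url}")
    for url in accepted:
        try:
            validate(url)
        except ValueError:
            failures.append(f"rejected but should be accepted: {url}")
    return failures
```

Running the same lists against both the gateway and plugin validators is what keeps the two suites from drifting apart: a vector added once is exercised in both places.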

Verification

  • ✅ All 303 validator tests pass (3 skipped)
  • make verify - 10/10 rating, package ready
  • make bandit - no security issues
  • make interrogate - 100% docstring coverage
  • make pylint - 10.00/10 rating
  • make pre-commit - all hooks pass
  • make ruff - no linting errors

Note: make doctest shows 5 pre-existing failures in tool_service.py and content_security.py (unrelated to this PR - verified on base branch f855e54d5).

Commits

  1. fb255a6d8 - fix(security): add URL percent-encoding hardening to plugin framework validator
  2. 0090ac24f - refactor: extract shared URL-hardening helpers into _url_hardening.py (#4434)
  3. ea98a4f7b - feat(validation): share percent-encoding test vectors (Phase 3)
  4. cd4a69dd5 - fix(lint): remove unused imports and variables in validators
  5. dc57b37da - fix(lint): resolve ruff errors in _url_hardening.py and test_validators_advanced.py
  6. 579b260cb - chore: update .secrets.baseline with current timestamps

Files Changed

New files:

  • mcpgateway/common/_url_hardening.py - shared URL-hardening helpers
  • tests/helpers/url_encoding_vectors.py - shared test vectors

Modified:

  • mcpgateway/common/validators.py - use shared helpers
  • mcpgateway/plugins/framework/validators.py - use shared helpers + Phase 1 hardening
  • tests/unit/mcpgateway/validation/test_validators_advanced.py - use shared vectors
  • tests/unit/mcpgateway/plugins/framework/test_validators.py - add Phase 3 tests + Phase 1 hardening

Closes

Closes #4434
Closes #4435

MohanLaksh added 6 commits May 5, 2026 15:37
… validator

Add URL decoding and security checks to plugin framework's SecurityValidator.validate_url()
to mirror the gateway's hardening from PR #4335.

Changes:
- Add urllib.parse.unquote import
- Add _unquote_if_needed() and _decode_strict() helpers
- Add _PERCENT_U_ESCAPE_RE and _JS_ESCAPE_RE regex patterns
- Update validate_url() to decode URLs before pattern checks
- Block IIS-style %uXXXX escapes, JS-style \uXXXX/\xXX escapes
- Block invalid UTF-8 sequences (U+FFFD detection)
- Block C0 control characters in decoded values
- Block protocol-relative URLs
- Update existing tests to match new error messages

Part of #4434 and #4435. Verified: all 61 plugin framework tests pass.

Signed-off-by: Mohan Lakshmaiah <mohan.economist@gmail.com>

…#4434)

Extract URL percent-encoding helpers from gateway and plugin validators
into a shared stdlib-only module for maintainability.

Changes:
- Create mcpgateway/common/_url_hardening.py with 4 coarse-grained helpers:
  * _unquote_if_needed() - decode percent-encoding when needed
  * _decode_and_check_encoding() - single-pass decode + double-encoding +
    IIS/JS-escape + U+FFFD rejection
  * _check_structural_forbidden_chars() - IPv6 brackets, control chars,
    spaces, protocol-relative URLs
  * _check_netloc() - decode netloc, check spaces/credentials
- Update gateway validators.py to import and use shared helpers
- Update plugin framework validators.py to import and use shared helpers
- Remove duplicate helper definitions from both validator files
- Fix TestUrlHardeningHelpers to import from correct location
- Fix test_ipv6_double_check_netloc_brackets for netloc bracket check

No behavior change - all 271 validator tests pass (3 skipped).

Signed-off-by: Mohan Lakshmaiah <mohan.economist@gmail.com>

Extract URL percent-encoding test vectors into shared tests/helpers/url_encoding_vectors.py
following issue #4435 spec with _VECTORS naming convention.

- Create tests/helpers/url_encoding_vectors.py with 10 vector lists:
  ENCODED_CRLF_VECTORS, ENCODED_HTML_TAG_VECTORS, ENCODED_DANGEROUS_PROTOCOL_VECTORS,
  ENCODED_IPV6_BRACKET_VECTORS, ENCODED_WHITESPACE_AUTHORITY_VECTORS,
  DOUBLE_ENCODED_VECTORS, IIS_UNICODE_ESCAPE_VECTORS, JS_UNICODE_ESCAPE_VECTORS,
  UTF8_OVERLONG_VECTORS, LEGITIMATE_ENCODED_ACCEPTED_VECTORS
- Refactor TestValidateUrlPercentEncoding in test_validators_advanced.py to use shared vectors
- Add TestSecurityValidatorPercentEncoding in plugin test_validators.py using shared vectors
- Both test classes now use identical parametrize decorators from shared location
- 64 new tests added (32 per suite) exercising the same encoding edge cases
- All 303 validator tests pass (3 skipped)

Signed-off-by: Mohan Lakshmaiah <mohan.economist@gmail.com>

- Remove unused urllib.parse.unquote import from gateway validators.py
- Remove unused decoded_netloc variable assignment in gateway validators.py
- Remove unused _check_netloc import from plugin framework validators.py

Ruff F401/F841 errors resolved.

Signed-off-by: Mohan Lakshmaiah <mohan.economist@gmail.com>

…rs_advanced.py

- Add ParseResult import to fix F821 undefined name error
- Fix None comparison to use 'is None' (E711)
- Replace 'import mcpgateway.common.validators as validators' with
  'from mcpgateway.common import validators' (PLR0402)

Ruff F821, E711, PLR0402 errors resolved.

Signed-off-by: Mohan Lakshmaiah <mohan.economist@gmail.com>

Pre-commit hook detect-secrets updated line numbers and timestamp
in the baseline file.

Signed-off-by: Mohan Lakshmaiah <mohan.economist@gmail.com>
@MohanLaksh
Collaborator Author

@jonpspri,

Could you help review this, please?

@araujof
Member

araujof commented May 5, 2026

DO NOT MERGE before #3754 is merged.

MohanLaksh added 2 commits May 6, 2026 11:42
The string type hint "ParseResult" doesn't require the actual import.
Vulture flagged the import as unused (90% confidence).

Fixes CI vulture check error.

Signed-off-by: Mohan Lakshmaiah <mohan.economist@gmail.com>