feat: URL hardening consolidation - plugin framework + shared helpers + test vectors (Issues #4434, #4435)#4601
Open
MohanLaksh wants to merge 8 commits intomainfrom
Open
Conversation
… validator Add URL decoding and security checks to plugin framework's SecurityValidator.validate_url() to mirror the gateway's hardening from PR #4335. Changes: - Add urllib.parse.unquote import - Add _unquote_if_needed() and _decode_strict() helpers - Add _PERCENT_U_ESCAPE_RE and _JS_ESCAPE_RE regex patterns - Update validate_url() to decode URLs before pattern checks - Block IIS-style %uXXXX escapes, JS-style \uXXXX/\xXX escapes - Block invalid UTF-8 sequences (U+FFFD detection) - Block C0 control characters in decoded values - Block protocol-relative URLs - Update existing tests to match new error messages Part of #4434 and #4435. Verified: all 61 plugin framework tests pass. Signed-off-by: Mohan Lakshmaiah <mohan.economist@gmail.com>
…#4434) Extract URL percent-encoding helpers from gateway and plugin validators into a shared stdlib-only module for maintainability. Changes: - Create mcpgateway/common/_url_hardening.py with 4 coarse-grained helpers: * _unquote_if_needed() - decode percent-encoding when needed * _decode_and_check_encoding() - single-pass decode + double-encoding + IIS/JS-escape + U+FFFD rejection * _check_structural_forbidden_chars() - IPv6 brackets, control chars, spaces, protocol-relative URLs * _check_netloc() - decode netloc, check spaces/credentials - Update gateway validators.py to import and use shared helpers - Update plugin framework validators.py to import and use shared helpers - Remove duplicate helper definitions from both validator files - Fix TestUrlHardeningHelpers to import from correct location - Fix test_ipv6_double_check_netloc_brackets for netloc bracket check No behavior change - all 271 validator tests pass (3 skipped). Signed-off-by: Mohan Lakshmaiah <mohan.economist@gmail.com>
Extract URL percent-encoding test vectors into shared tests/helpers/url_encoding_vectors.py following issue #4435 spec with _VECTORS naming convention. - Create tests/helpers/url_encoding_vectors.py with 10 vector lists: ENCODED_CRLF_VECTORS, ENCODED_HTML_TAG_VECTORS, ENCODED_DANGEROUS_PROTOCOL_VECTORS, ENCODED_IPV6_BRACKET_VECTORS, ENCODED_WHITESPACE_AUTHORITY_VECTORS, DOUBLE_ENCODED_VECTORS, IIS_UNICODE_ESCAPE_VECTORS, JS_UNICODE_ESCAPE_VECTORS, UTF8_OVERLONG_VECTORS, LEGITIMATE_ENCODED_ACCEPTED_VECTORS - Refactor TestValidateUrlPercentEncoding in test_validators_advanced.py to use shared vectors - Add TestSecurityValidatorPercentEncoding in plugin test_validators.py using shared vectors - Both test classes now use identical parametrize decorators from shared location - 64 new tests added (32 per suite) exercising the same encoding edge cases - All 303 validator tests pass (3 skipped) Signed-off-by: Mohan Lakshmaiah <mohan.economist@gmail.com>
- Remove unused urllib.parse.unquote import from gateway validators.py - Remove unused decoded_netloc variable assignment in gateway validators.py - Remove unused _check_netloc import from plugin framework validators.py Ruff F401/F841 errors resolved. Signed-off-by: Mohan Lakshmaiah <mohan.economist@gmail.com>
…rs_advanced.py - Add ParseResult import to fix F821 undefined name error - Fix None comparison to use 'is None' (E711) - Replace 'import mcpgateway.common.validators as validators' with 'from mcpgateway.common import validators' (PLR0402) Ruff F821, E711, PLR0402 errors resolved. Signed-off-by: Mohan Lakshmaiah <mohan.economist@gmail.com>
Pre-commit hook detect-secrets updated line numbers and timestamp in the baseline file. Signed-off-by: Mohan Lakshmaiah <mohan.economist@gmail.com>
Collaborator
Author
|
Can you help me in reviewing this please? |
Member
|
DO NOT MERGE before #3754 is merged. |
The string type hint "ParseResult" doesn't require the actual import. Vulture flagged the import as unused (90% confidence). Fixes CI vulture check error. Signed-off-by: Mohan Lakshmaiah <mohan.economist@gmail.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
This PR implements URL percent-encoding hardening consolidation across the gateway and plugin framework validators, addressing two related issues:
mcpgateway/common/_url_hardening.pytests/helpers/url_encoding_vectors.pyBackground & Motivation
The gateway's
SecurityValidator.validate_url()had comprehensive URL percent-encoding hardening (from PR #4335), but the plugin framework's validator lacked these protections. Additionally, both validators had duplicate helper implementations, and test vectors for encoding edge cases were embedded in a single test file.What Was Done (3 Phases)
Phase 1: Plugin Framework Hardening (Prerequisite)
Problem: Plugin framework
SecurityValidator.validate_url()was vulnerable to bypasses that the gateway already blocked.Changes (
mcpgateway/plugins/framework/validators.py):_unquote_if_needed()helper_decode_strict()for strict UTF-8 validation_PERCENT_U_ESCAPE_RE(IIS%uXXXX),_JS_ESCAPE_RE(JS\uXXXX/\xXX)validate_url()to decode URLs before pattern checks%uXXXXescapes, JS-style unicode escapes, invalid UTF-8 (U+FFFD), C0 control characters, protocol-relative URLs (//evil.com)Phase 2: Extract Shared Helpers (#4434)
Problem: Both gateway and plugin validators had copy-pasted URL hardening logic.
Changes:
mcpgateway/common/_url_hardening.pywith 4 coarse-grained helpers:_unquote_if_needed()- decode percent-encoding only when%present (hot-path optimization)_decode_and_check_encoding()- single-pass decode + double-encoding detection + IIS/JS-escape + U+FFFD rejection_check_structural_forbidden_chars()- IPv6 brackets, control chars, spaces, protocol-relative URLs_check_netloc()- decode netloc, check spaces/credentialsmcpgateway/common/validators.pyto import from shared modulemcpgateway/plugins/framework/validators.pyto import from shared moduleTestUrlHardeningHelpersto import from correct locationtest_ipv6_double_check_netloc_brackets(added netloc bracket check)Design constraint:
_url_hardening.pyis stdlib-only (nomcpgateway.config.settingsdependency) to keep plugin framework self-contained.Phase 3: Shared Test Vectors (#4435)
Problem: Percent-encoding test vectors were embedded only in
test_validators_advanced.py, not shared with plugin tests.Changes:
tests/helpers/url_encoding_vectors.pywith 10 vector lists following_VECTORSnaming:ENCODED_CRLF_VECTORS,ENCODED_HTML_TAG_VECTORS,ENCODED_DANGEROUS_PROTOCOL_VECTORSENCODED_IPV6_BRACKET_VECTORS,ENCODED_WHITESPACE_AUTHORITY_VECTORSDOUBLE_ENCODED_VECTORS,IIS_UNICODE_ESCAPE_VECTORS,JS_UNICODE_ESCAPE_VECTORSUTF8_OVERLONG_VECTORS,LEGITIMATE_ENCODED_ACCEPTED_VECTORSTestValidateUrlPercentEncodingintest_validators_advanced.pyto use shared vectorsTestSecurityValidatorPercentEncodingin plugintest_validators.pyusing shared vectorsVerification
make verify- 10/10 rating, package readymake bandit- no security issuesmake interrogate- 100% docstring coveragemake pylint- 10.00/10 ratingmake pre-commit- all hooks passmake ruff- no linting errorsNote:
make doctestshows 5 pre-existing failures intool_service.pyandcontent_security.py(unrelated to this PR - verified on base branchf855e54d5).Commits
fb255a6d8- fix(security): add URL percent-encoding hardening to plugin framework validator0090ac24f- refactor: extract shared URL-hardening helpers into _url_hardening.py ([CHORE][REFACTOR]: Extract shared URL-hardening helpers into mcpgateway/common/_url_hardening.py #4434)ea98a4f7b- feat(validation): share percent-encoding test vectors (Phase 3)cd4a69dd5- fix(lint): remove unused imports and variables in validatorsdc57b37da- fix(lint): resolve ruff errors in _url_hardening.py and test_validators_advanced.py579b260cb- chore: update .secrets.baseline with current timestampsFiles Changed
New files:
mcpgateway/common/_url_hardening.py- shared URL-hardening helperstests/helpers/url_encoding_vectors.py- shared test vectorsModified:
mcpgateway/common/validators.py- use shared helpersmcpgateway/plugins/framework/validators.py- use shared helpers + Phase 1 hardeningtests/unit/mcpgateway/validation/test_validators_advanced.py- use shared vectorstests/unit/mcpgateway/plugins/framework/test_validators.py- add Phase 3 tests + Phase 1 hardeningCloses
Closes #4434
Closes #4435