@mjkim610
Contributor

@mjkim610 mjkim610 commented Jul 23, 2025

According to RFC 1866 and RFC 3986, the SP character may be encoded as + only in application/x-www-form-urlencoded values, whereas encoding it as %20 is always correct. This PR therefore prefers %20.

Related nuclei issue: projectdiscovery/nuclei#6162

I tested it locally with a simple case, but much more thorough testing is required.

❯ go build ; ./nuclei -u http://127.0.0.1:8000/ -dast -t ../../../fuzzing-test.yaml -vv -debug

                     __     _
   ____  __  _______/ /__  (_)
  / __ \/ / / / ___/ / _ \/ /
 / / / / /_/ / /__/ /  __/ /
/_/ /_/\__,_/\___/_/\___/_/   v3.4.7

                projectdiscovery.io

[INF] Current nuclei version: v3.4.7 (latest)
[INF] Current nuclei-templates version: v10.2.5 (latest)
[WRN] Scan results upload to cloud is disabled.
[INF] New templates added in latest release: 75
[INF] Templates loaded for current scan: 1
[WRN] Loading 1 unsigned templates for scan. Use with caution.
[INF] Targets loaded for current scan: 1
[fuzzing-test] Some fuzzing test (@mj) [low]
[DBG] mjkim - path - key: 1 , value: (select 1 from sleep(5)
[DBG] mjkim - key: 1 , value: (select%201%20from%20sleep(5)
[INF] [fuzzing-test] Dumped HTTP request for http://127.0.0.1:8000/(select%201%20from%20sleep(5)

GET /%28select%25201%2520from%2520sleep%285%29 HTTP/1.1
Host: 127.0.0.1:8000
User-Agent: Mozilla/5.0 (ZZ; Linux i686; rv:123.0) Gecko/20100101 Firefox/123.0
Accept-Encoding: gzip

[DBG] [fuzzing-test] Dumped HTTP response http://127.0.0.1:8000/(select%201%20from%20sleep(5)

HTTP/1.0 404 File not found
Content-Length: 335
Connection: close
Content-Type: text/html;charset=utf-8
Date: Wed, 23 Jul 2025 02:28:35 GMT
Server: SimpleHTTP/0.6 Python/3.12.4

<!DOCTYPE HTML>
<html lang="en">
    <head>
        <meta charset="utf-8">
        <title>Error response</title>
    </head>
    <body>
        <h1>Error response</h1>
        <p>Error code: 404</p>
        <p>Message: File not found.</p>
        <p>Error code explanation: 404 - Nothing matches the given URI.</p>
    </body>
</html>
[INF] Scan completed in 4.32725ms. No results found.

@mjkim610
Contributor Author

@Mzack9999 PTAL :)

@mjkim610
Contributor Author

Hi @ehsandeep @dogancanbakir
PTAL or let me know if there's some process I must follow :)

@ehsandeep ehsandeep requested a review from tarunKoyalwar July 25, 2025 23:17
@mjkim610
Contributor Author

Updated test cases and verified TC passes on my machine.

❯ go test ./...
ok      github.com/projectdiscovery/utils/async 2.349s
ok      github.com/projectdiscovery/utils/auth/pdcp     0.264s
ok      github.com/projectdiscovery/utils/batcher       3.170s
?       github.com/projectdiscovery/utils/buffer        [no test files]
ok      github.com/projectdiscovery/utils/channelutil   0.747s
?       github.com/projectdiscovery/utils/conn/connpool [no test files]
?       github.com/projectdiscovery/utils/consts        [no test files]
ok      github.com/projectdiscovery/utils/context       7.175s
ok      github.com/projectdiscovery/utils/conversion    0.474s
ok      github.com/projectdiscovery/utils/crypto        1.394s
?       github.com/projectdiscovery/utils/crypto/jarm   [no test files]
ok      github.com/projectdiscovery/utils/dedupe        1.452s
ok      github.com/projectdiscovery/utils/dns   1.851s
ok      github.com/projectdiscovery/utils/env   1.620s
ok      github.com/projectdiscovery/utils/errkit        1.858s
ok      github.com/projectdiscovery/utils/errors        1.904s
ok      github.com/projectdiscovery/utils/exec  1.923s
ok      github.com/projectdiscovery/utils/file  22.131s
ok      github.com/projectdiscovery/utils/folder        1.648s
ok      github.com/projectdiscovery/utils/generic       1.645s
?       github.com/projectdiscovery/utils/global        [no test files]
ok      github.com/projectdiscovery/utils/healthcheck   2.262s
ok      github.com/projectdiscovery/utils/http  1.740s
ok      github.com/projectdiscovery/utils/io    1.949s
ok      github.com/projectdiscovery/utils/ip    2.287s
ok      github.com/projectdiscovery/utils/log   1.951s
ok      github.com/projectdiscovery/utils/maps  1.647s
?       github.com/projectdiscovery/utils/memguardian   [no test files]
ok      github.com/projectdiscovery/utils/memoize       11.330s
?       github.com/projectdiscovery/utils/memoize/cmd   [no test files]
?       github.com/projectdiscovery/utils/memoize/gen/generic   [no test files]
?       github.com/projectdiscovery/utils/memoize/simpleflight  [no test files]
?       github.com/projectdiscovery/utils/memoize/tests [no test files]
?       github.com/projectdiscovery/utils/ml    [no test files]
?       github.com/projectdiscovery/utils/ml/metrics    [no test files]
?       github.com/projectdiscovery/utils/ml/model_selection    [no test files]
ok      github.com/projectdiscovery/utils/ml/naive_bayes        0.892s
ok      github.com/projectdiscovery/utils/net   1.111s
?       github.com/projectdiscovery/utils/os    [no test files]
ok      github.com/projectdiscovery/utils/patterns      1.309s
ok      github.com/projectdiscovery/utils/permission    1.383s
ok      github.com/projectdiscovery/utils/ports 1.452s
?       github.com/projectdiscovery/utils/pprof [no test files]
?       github.com/projectdiscovery/utils/process       [no test files]
?       github.com/projectdiscovery/utils/proxy [no test files]
ok      github.com/projectdiscovery/utils/ptr   1.357s
?       github.com/projectdiscovery/utils/race  [no test files]
ok      github.com/projectdiscovery/utils/rand  1.451s
ok      github.com/projectdiscovery/utils/reader        18.569s
?       github.com/projectdiscovery/utils/reader/examples/keypress/buffered     [no test files]
?       github.com/projectdiscovery/utils/reader/examples/keypress/raw  [no test files]
?       github.com/projectdiscovery/utils/reader/rawmode        [no test files]
ok      github.com/projectdiscovery/utils/reflect       1.383s
?       github.com/projectdiscovery/utils/reflect/tests [no test files]
ok      github.com/projectdiscovery/utils/regexp        0.450s
?       github.com/projectdiscovery/utils/routing       [no test files]
ok      github.com/projectdiscovery/utils/scripts/versionbump   3.290s
ok      github.com/projectdiscovery/utils/slice 0.958s
ok      github.com/projectdiscovery/utils/strings       1.154s
ok      github.com/projectdiscovery/utils/structs       0.925s
ok      github.com/projectdiscovery/utils/sync  5.760s
?       github.com/projectdiscovery/utils/sync/semaphore        [no test files]
ok      github.com/projectdiscovery/utils/sync/sizedpool        4.467s
ok      github.com/projectdiscovery/utils/syscallutil   1.340s
ok      github.com/projectdiscovery/utils/sysutil       1.232s
ok      github.com/projectdiscovery/utils/time  (cached)
ok      github.com/projectdiscovery/utils/trace 4.343s
?       github.com/projectdiscovery/utils/unit  [no test files]
ok      github.com/projectdiscovery/utils/update        0.509s
ok      github.com/projectdiscovery/utils/url   (cached)

@4FunAndProfit

Hello @mjkim610, thanks a lot! This should indeed resolve part of my issue too (#6342).

However, for text/plain, shouldn't a space be left as a literal space? I don't see that in the changes; I think the code may have missed this case.

@4FunAndProfit

4FunAndProfit commented Jul 26, 2025

Maybe we could add an option like encoding_type to cover all the cases (plain, url, www-form, ...)?
But I think nuclei should automatically encode according to the expected content type when it has that information (through the OpenAPI spec, for example).
[Also, I think you changed the default encoding to URL encoding, which is correct for query parameters but not for the body in many cases. I suspect the code in the url folder is also used for body parameters, etc.]

Member

@tarunKoyalwar tarunKoyalwar left a comment

Hey @mjkim610, thanks for the PR. However, the URL utils in this repo are not meant to follow any RFC; they are implemented case by case, based on the requirements of the PoC being written. (Apart from that, we have many other optimizations that are not part of any RFC: https://github.com/projectdiscovery/utils/tree/main/url)

orderedparams is used for URL encoding throughout nuclei (in the request body, URL, etc.), so any change here will most likely break existing nuclei-templates and integration tests.

Can you share an exploit or PoC where you can exploit a particular vulnerability via %20 but not +? If so, we can plan changes that support it without breaking existing templates.

@4FunAndProfit

4FunAndProfit commented Jul 26, 2025

@tarunKoyalwar see projectdiscovery/nuclei#6342 for a case where the encoding is wrong; consider an XXE payload there, for example.
Couldn't we have an option like space_encoding that defaults to + and can be changed? That would not break existing templates.

@4FunAndProfit

Or maybe just allow unsafe: true in fuzzing? That would resolve a big part of it.

@mjkim610
Contributor Author

mjkim610 commented Jul 28, 2025

@tarunKoyalwar thanks for the feedback.

To answer your question regarding POCs that require %20 instead of +:

Regarding your concern about breaking integration tests:

  • For URLs, the decoded URL will be exactly the same with this change. So if any tests fail because of it, the expected value in the test case most likely needs to be updated.
  • For other data, the encoding should depend on the data type (e.g. text/plain should not be encoded at all). That issue is independent of this PR, and I think it should be handled in a separate issue/PR.
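
The "same decoded URL" point can be checked with a small sketch, assuming Go's net/url decoding rules (other servers and frameworks may decode differently). decodeBoth is a hypothetical helper written only for this illustration.

```go
package main

import (
	"fmt"
	"net/url"
)

// decodeBoth is a hypothetical helper for illustration only. It decodes the
// same string once as a query component and once as a path segment.
func decodeBoth(encoded string) (query string, path string, err error) {
	query, err = url.QueryUnescape(encoded)
	if err != nil {
		return "", "", err
	}
	path, err = url.PathUnescape(encoded)
	if err != nil {
		return "", "", err
	}
	return query, path, nil
}

func main() {
	for _, encoded := range []string{"select+1", "select%201"} {
		query, path, err := decodeBoth(encoded)
		if err != nil {
			fmt.Println("decode error:", err)
			continue
		}
		fmt.Printf("%-12s query=%q path=%q\n", encoded, query, path)
	}
}
```

In query decoding both spellings yield a space, while in path decoding only %20 does; Go leaves a "+" in a path segment untouched.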

Regarding your concern about breaking existing templates:

  • I agree this would be a problem. Can you provide some examples so that I can get a better understanding?
  • The %20 encoding could be set as a new option, as @4FunAndProfit suggested. This means maintaining compatibility while giving an option for new templates.

@mjkim610
Contributor Author

mjkim610 commented Aug 8, 2025

@tarunKoyalwar could you provide feedback? :)

I think the best course of action would be to set %20 encoding as a new option in the short term, then support all other encoding types in the long term.

@Mzack9999
Member

@mjkim610 Unfortunately we can't change the default behavior for edge cases. I've added an environment variable SPACE_ENCODING that you can set to percent in order to have the wanted behavior:

SPACE_ENCODING=percent nuclei -t xxx ...
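
For readers wondering what such a switch might look like, here is a rough sketch. It is not the merged implementation: encodeValue is a hypothetical function, and only the variable name SPACE_ENCODING and the value percent come from this thread.

```go
package main

import (
	"fmt"
	"net/url"
	"os"
	"strings"
)

// encodeValue is a hypothetical helper, not the code merged in this PR.
// With mode "percent" it emits spaces as "%20"; with any other mode it keeps
// the form-style "+" that url.QueryEscape produces.
func encodeValue(value, mode string) string {
	encoded := url.QueryEscape(value)
	if strings.EqualFold(mode, "percent") {
		// Literal "+" characters are already "%2B" here, so this only
		// rewrites encoded spaces.
		return strings.ReplaceAll(encoded, "+", "%20")
	}
	return encoded
}

func main() {
	// Assumption: the mode is read from the environment at call time.
	mode := os.Getenv("SPACE_ENCODING")
	fmt.Println(encodeValue("select 1 from dual", mode))
}
```

Keeping "+" as the fallback mirrors the stated goal of leaving the default behavior unchanged for existing templates.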

@Mzack9999 Mzack9999 merged commit bc0b0d6 into projectdiscovery:main Aug 15, 2025
7 checks passed
@mjkim610 mjkim610 deleted the url-encoding branch August 15, 2025 15:47
@mjkim610
Contributor Author

> @mjkim610 Unfortunately we can't change the default behavior for edge cases. I've added an environment variable SPACE_ENCODING that you can set to percent in order to have the wanted behavior:
>
> SPACE_ENCODING=percent nuclei -t xxx ...

Thanks for providing a way to change the space encoding! Although I have some reservations about using env vars, I understand the choice, given that implementing it another way (e.g. as a flag) would require a lot of planning and large changes.

I will keep an eye out for any updates regarding these encoding issues.
