Commit 2fd0fda

b-c-ds authored and bcaller committed
Update README again
1 parent f614040 commit 2fd0fda

2 files changed: +22 -18 lines changed
README.md

Lines changed: 20 additions & 16 deletions
@@ -19,20 +19,20 @@ For explotability, cubic complexity or higher is typically required unless truly
 
 ## Example
 
-Run `regexploit` and enter the regular expression `abc*[a-z]+c+$` at the command line.
+Run `regexploit` and enter the regular expression `v\w*_\w*_\w*$` at the command line.
 
 ```
 $ regexploit
-abc*[a-z]+c+$
-Pattern: abc*[a-z]+c+$
+v\w*_\w*_\w*$
+Pattern: v\w*_\w*_\w*$
 ---
-Worst-case complexity: 3 ⭐⭐⭐
-Repeated character: [c]
-Final character to cause backtracking: [^[a-z]]
-Example: 'ab' + 'c' * 3456 + '0'
+Worst-case complexity: 3 ⭐⭐⭐ (cubic)
+Repeated character: [5f:_]
+Final character to cause backtracking: [^WORD]
+Example: 'v' + '_' * 3456 + '!'
 ```
 
-The part `c*[a-z]+c+` contains three overlapping repeating groups. As showed in the line `Repeated character: [c]`, a long string of `c` will match this section in many different ways. The worst-case complexity is 3 as there are 3 infinitely repeating groups. An example to cause ReDoS is given: it consists of the required prefix `ab`, a long string of `c` and then a `0` to cause backtracking. Not all ReDoSes require a particular character at the end, but in this case, a long string of `c` will match the regex successfully and won't backtrack. The line `Final character to cause backtracking: [^[a-z]]` shows that a non-matching character not in the range `[a-z]` is required at the end to prevent matching and cause ReDoS.
+The part `\w*_\w*_\w*` contains three overlapping repeating groups (\w matches letters, digits *and underscores*). As showed in the line `Repeated character: [5f:_]`, a long string of `_` (0x5f) will match this section in many different ways. The worst-case complexity is 3 as there are 3 infinitely repeating groups. An example to cause ReDoS is given: it consists of the required prefix `v`, a long string of `_` and then a `!` (non-word character) to cause backtracking. Not all ReDoSes require a particular character at the end, but in this case, a long string of `_` will match the regex successfully and won't backtrack. The line `Final character to cause backtracking: [^WORD]` shows that a non-matching character (not a word character) is required at the end to prevent matching and cause ReDoS.
 
 As another example, install a module version vulnerable to ReDoS such as `pip install ua-parser==0.9.0`.
 To scan the installed python modules run `regexploit-python-env`.
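
Reviewer note: the new example in the hunk above can be checked with a few lines of Python. This is a minimal sketch, not part of the commit. The helper names `attack_string` and `count_splits` are hypothetical, and `count_splits` is a simplified model of how many `(a, b, c)` length assignments a backtracking engine may try for the three `\w*` groups before giving up; real engines apply optimisations, so actual step counts will differ.

```python
import re
import time
from math import comb

# The pattern from the README example in this commit.
PATTERN = re.compile(r"v\w*_\w*_\w*$")


def attack_string(n: int) -> str:
    """Required prefix 'v', n underscores, then a non-word character."""
    return "v" + "_" * n + "!"


def count_splits(n: int) -> int:
    """Brute-force count of group lengths (a, b, c) for the three \\w* groups.

    Each assignment must leave room for the two literal underscores:
    a + 1 + b + 1 + c <= n. This is a simplified model, not engine internals.
    """
    total = 0
    for a in range(n + 1):
        for b in range(n + 1):
            for c in range(n + 1):
                if a + b + c + 2 <= n:
                    total += 1
    return total


if __name__ == "__main__":
    # The closed form C(n + 1, 3) grows cubically with n.
    for n in (50, 100, 200):
        start = time.perf_counter()
        PATTERN.search(attack_string(n))
        elapsed = time.perf_counter() - start
        print(f"n={n} modelled attempts={comb(n + 1, 3)} time={elapsed:.4f}s")
```

Without the trailing `!`, the same string of underscores matches on the first attempt, which is why the final non-word character is needed to force backtracking.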
@@ -82,7 +82,7 @@ pip install regexploit
 
 # Usage
 
-## Regex list
+## Regexploit with a list of regexes
 
 Enter regular expressions via stdin (one per line) into `regexploit`.

@@ -95,17 +95,21 @@ or via a file
 ```bash
 cat myregexes.txt | regexploit
 ```
-## Python code
+
+## Extract regexes automatically
+
+There is built-in support for parsing regexes out of Python, JavaScript, TypeScript, C#, YAML and JSON.
+### Python code
 
 Parses Python code (without executing it) via the AST to find regexes. The regexes are then analysed for ReDoS.
 
 ```bash
 regexploit-py my-project/
 regexploit-py "my-project/**/*.py" --glob
 ```
-## Javascript / Typescript
+### Javascript / Typescript
 
-This will use the bundled NodeJS package in `regexploit/bin/javascript` which parses your javascript as an AST with [eslint](https://github.com/typescript-eslint/typescript-eslint/tree/master/packages/parser) and prints out all regexes.
+This will use the bundled NodeJS package in `regexploit/bin/javascript` which parses your JavaScript as an AST with [eslint](https://github.com/typescript-eslint/typescript-eslint/tree/master/packages/parser) and prints out all regexes.
 
 Those regexes are fed into the python ReDoS finder.
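
Reviewer note: the "Python code" section in the hunk above says regexes are found via the AST without executing the code. The sketch below illustrates that general idea using only the standard `ast` module. It is an assumption-laden illustration, not regexploit's implementation: the function name `find_regex_literals` and the `RE_FUNCTIONS` set are hypothetical, and it only handles string literals passed directly to calls written as `re.<function>(...)`.

```python
import ast

# Hypothetical list of `re` functions whose first argument is a pattern.
RE_FUNCTIONS = {
    "compile", "search", "match", "fullmatch",
    "findall", "finditer", "sub", "subn", "split",
}


def find_regex_literals(source: str) -> list:
    """Return sorted (line number, pattern) pairs for literal patterns.

    The source is parsed, never executed, so nothing in it runs.
    """
    found = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        is_re_call = (
            isinstance(func, ast.Attribute)
            and isinstance(func.value, ast.Name)
            and func.value.id == "re"
            and func.attr in RE_FUNCTIONS
        )
        if not is_re_call or not node.args:
            continue
        first = node.args[0]
        # Only plain string literals are collected in this sketch.
        if isinstance(first, ast.Constant) and isinstance(first.value, str):
            found.append((node.lineno, first.value))
    return sorted(found)
```

A pattern assembled at runtime from smaller strings would be missed by this approach, which is the gap the README's import-based scan is described as covering.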

@@ -116,7 +120,7 @@ regexploit-js "my-project/node_modules/**/*.js" --glob
 
 N.B. there are differences between javascript and python regex parsing so there may be some errors. I'm [not sure I want](https://hackernoon.com/the-madness-of-parsing-real-world-javascript-regexps-d9ee336df983) to write a JS regex AST!
 
-## Python imports
+### Python imports
 
 Search for regexes in all the python modules currently installed in your path / env. This means you can `pip install` whatever modules you are interested in and they will be analysed. Cpython code is included.

@@ -126,15 +130,15 @@ regexploit-python-env
 
 N.B. this doesn't parse the python code to an AST and will only find regexes compiled automatically on module import. Modules are actually imported, **so code in the modules will be executed**. This is helpful for finding regexes which are built up from smaller strings on load e.g. [CVE-2021-25292 in Pillow](https://github.com/python-pillow/Pillow/commit/3bce145966374dd39ce58a6fc0083f8d1890719c)
 
-## JSON / YAML
+### JSON / YAML
 
-Yaml requires pyyaml, which can be installed with `pip install regexploit[yaml]`.
+Yaml support requires pyyaml, which can be installed with `pip install regexploit[yaml]`.
 
 ```bash
 regexploit-json *.json
 regexploit-yaml *.yaml
 ```
-## C# (.NET)
+### C# (.NET)
 
 ```bash
 regexploit-csharp something.cs

regexploit/bin/regexploit_js.py

Lines changed: 2 additions & 2 deletions
@@ -76,8 +76,8 @@ def main():
         os.path.join(os.path.split(__file__)[0], "javascript", "node_modules")
     ):
         path = os.path.join(os.path.split(__file__)[0], "javascript")
-        print("The javascript & typescript parsers requires some node modules.\n")
-        print(f"Go to {path} and run 'npm install'")
+        print("The JavaScript & TypeScript parsers require some node modules.\n")
+        print(f"Run (cd {path}; npm install)")
         sys.exit(1)
     with warnings.catch_warnings():
         warnings.simplefilter(
