README.md: 20 additions & 16 deletions
@@ -19,20 +19,20 @@ For exploitability, cubic complexity or higher is typically required unless truly
## Example

-Run `regexploit` and enter the regular expression `abc*[a-z]+c+$` at the command line.
+Run `regexploit` and enter the regular expression `v\w*_\w*_\w*$` at the command line.

```
$ regexploit
-abc*[a-z]+c+$
-Pattern: abc*[a-z]+c+$
+v\w*_\w*_\w*$
+Pattern: v\w*_\w*_\w*$
---
-Worst-case complexity: 3 ⭐⭐⭐
-Repeated character: [c]
-Final character to cause backtracking: [^[a-z]]
-Example: 'ab' + 'c' * 3456 + '0'
+Worst-case complexity: 3 ⭐⭐⭐ (cubic)
+Repeated character: [5f:_]
+Final character to cause backtracking: [^WORD]
+Example: 'v' + '_' * 3456 + '!'
```

-The part `c*[a-z]+c+` contains three overlapping repeating groups. As shown in the line `Repeated character: [c]`, a long string of `c` will match this section in many different ways. The worst-case complexity is 3 as there are 3 infinitely repeating groups. An example to cause ReDoS is given: it consists of the required prefix `ab`, a long string of `c` and then a `0` to cause backtracking. Not all ReDoSes require a particular character at the end, but in this case, a long string of `c` will match the regex successfully and won't backtrack. The line `Final character to cause backtracking: [^[a-z]]` shows that a non-matching character not in the range `[a-z]` is required at the end to prevent matching and cause ReDoS.
+The part `\w*_\w*_\w*` contains three overlapping repeating groups (`\w` matches letters, digits *and underscores*). As shown in the line `Repeated character: [5f:_]`, a long string of `_` (0x5f) will match this section in many different ways. The worst-case complexity is 3 as there are 3 infinitely repeating groups. An example to cause ReDoS is given: it consists of the required prefix `v`, a long string of `_` and then a `!` (non-word character) to cause backtracking. Not all ReDoSes require a particular character at the end, but in this case, a long string of `_` will match the regex successfully and won't backtrack. The line `Final character to cause backtracking: [^WORD]` shows that a non-matching character (not a word character) is required at the end to prevent matching and cause ReDoS.

As another example, install a module version vulnerable to ReDoS such as `pip install ua-parser==0.9.0`.
To scan the installed python modules run `regexploit-python-env`.
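
The backtracking behaviour described for `v\w*_\w*_\w*$` can be observed with a minimal sketch that uses only Python's standard `re` module (it does not call regexploit itself). The helper names `attack_string` and `time_match` and the input sizes are assumptions chosen for illustration; the sizes are deliberately much smaller than the 3456 in the tool output so the script finishes quickly. If the complexity is cubic, doubling `n` should make the match roughly eight times slower, although exact timings depend on the machine.

```python
import re
import time

# The pattern from the example above.
PATTERN = re.compile(r"v\w*_\w*_\w*$")


def attack_string(n: int) -> str:
    """Required prefix 'v', then n underscores, then a non-word character."""
    return "v" + "_" * n + "!"


def time_match(n: int) -> float:
    """Return the seconds spent trying (and failing) to match the attack string."""
    text = attack_string(n)
    start = time.perf_counter()
    result = PATTERN.match(text)
    elapsed = time.perf_counter() - start
    # The trailing '!' prevents a match, which forces the engine to backtrack.
    assert result is None
    return elapsed


if __name__ == "__main__":
    # Without the trailing '!', the same input matches immediately.
    assert PATTERN.match("v" + "_" * 200) is not None
    for n in (50, 100, 200):
        print(f"n={n}: {time_match(n):.6f}s")
```

Removing the final `!` from the input lets the first attempt succeed, which is why the tool reports a final character as necessary for this pattern.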
@@ -82,7 +82,7 @@ pip install regexploit
# Usage

-## Regex list
+## Regexploit with a list of regexes

Enter regular expressions via stdin (one per line) into `regexploit`.
@@ -95,17 +95,21 @@ or via a file
```bash
cat myregexes.txt | regexploit
```
-## Python code
+
+## Extract regexes automatically
+
+There is built-in support for parsing regexes out of Python, JavaScript, TypeScript, C#, YAML and JSON.
+### Python code

Parses Python code (without executing it) via the AST to find regexes. The regexes are then analysed for ReDoS.
```bash
regexploit-py my-project/
regexploit-py "my-project/**/*.py" --glob
```
-## Javascript / Typescript
+### JavaScript / TypeScript

-This will use the bundled NodeJS package in `regexploit/bin/javascript` which parses your javascript as an AST with [eslint](https://github.com/typescript-eslint/typescript-eslint/tree/master/packages/parser) and prints out all regexes.
+This will use the bundled NodeJS package in `regexploit/bin/javascript` which parses your JavaScript as an AST with [eslint](https://github.com/typescript-eslint/typescript-eslint/tree/master/packages/parser) and prints out all regexes.

Those regexes are fed into the Python ReDoS finder.
N.B. there are differences between JavaScript and Python regex parsing, so there may be some errors. I'm [not sure I want](https://hackernoon.com/the-madness-of-parsing-real-world-javascript-regexps-d9ee336df983) to write a JS regex AST!

-## Python imports
+### Python imports

Search for regexes in all the Python modules currently installed in your path / env. This means you can `pip install` whatever modules you are interested in and they will be analysed. CPython code is included.
@@ -126,15 +130,15 @@ regexploit-python-env
N.B. this doesn't parse the Python code to an AST and will only find regexes compiled automatically on module import. Modules are actually imported, **so code in the modules will be executed**. This is helpful for finding regexes which are built up from smaller strings on load, e.g. [CVE-2021-25292 in Pillow](https://github.com/python-pillow/Pillow/commit/3bce145966374dd39ce58a6fc0083f8d1890719c).
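
To illustrate what "built up from smaller strings on load" can look like, here is a minimal, hypothetical module. The names `WORD`, `SEPARATOR` and `VERSION_RE` are invented for this sketch and are not taken from Pillow or from regexploit. The complete pattern never appears as a single string literal in the source; it only exists once the module-level code has run, which is the situation the import-based scan is described as helping with.

```python
import re

# Hypothetical fragments, invented for illustration (not taken from Pillow).
WORD = r"\w*"
SEPARATOR = "_"

# The full pattern is assembled at import time from the fragments above,
# so it never appears as one literal in the source file.
VERSION_RE = re.compile("v" + SEPARATOR.join([WORD] * 3) + "$")

# After import, the compiled object exposes the assembled pattern.
print(VERSION_RE.pattern)  # prints: v\w*_\w*_\w*$
```

The assembled pattern is the same one analysed in the Example section, so importing a module like this would compile a regex with three overlapping `\w*` groups.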

-## JSON / YAML
+### JSON / YAML

-Yaml requires pyyaml, which can be installed with `pip install regexploit[yaml]`.
+YAML support requires PyYAML, which can be installed with `pip install regexploit[yaml]`.