constant.numeric.dec.python only has 2 capture groups #198
actually, on second thought -- I should probably check the rest of the regexen too 🤔
ok there are more, but I am having a hard time wrapping my head around the include scheme.

here's the script I used to find them (can probably be improved a little bit to have more/better output):

import argparse
import re
import plistlib
from typing import Any
from typing import Dict

import onigurumacffi

_BACKREF_RE = re.compile(r'((?<!\\)(?:\\\\)*)\\([0-9]+)')


def _fix_end(s: str) -> str:
    """end can have backreferences"""
    return _BACKREF_RE.sub('ZZZ', s)


def _visit_captures(reg: str, captures: Dict[str, Dict[str, Any]]) -> None:
    max_n = onigurumacffi.compile(reg).number_of_captures()
    for k, v in captures.items():
        if int(k) > max_n:
            print(f'{k} > {max_n}: {reg!r} {v}')
        _visit_rule(v)


def _visit_rule(rule: Dict[str, Any]) -> None:
    if 'match' in rule and 'captures' in rule:
        _visit_captures(rule['match'], rule['captures'])
    if 'begin' in rule:
        if 'captures' in rule:
            _visit_captures(rule['begin'], rule['captures'])
            _visit_captures(_fix_end(rule['end']), rule['captures'])
        if 'beginCaptures' in rule:
            _visit_captures(rule['begin'], rule['beginCaptures'])
        if 'endCaptures' in rule:
            _visit_captures(_fix_end(rule['end']), rule['endCaptures'])
    for sub_rule in rule.get('patterns', ()):
        _visit_rule(sub_rule)
    for sub_rule in rule.get('repository', {}).values():
        _visit_rule(sub_rule)


def main() -> int:
    parser = argparse.ArgumentParser()
    parser.add_argument('filename')
    args = parser.parse_args()

    with open(args.filename, 'rb') as f:
        contents = plistlib.load(f)

    _visit_rule(contents)
    return 0


if __name__ == '__main__':
    exit(main())

$ python3 t.py grammars/MagicPython.tmLanguage
2 > 1: "(\\]|(?=\\'\\'\\'))" {'name': 'invalid.illegal.newline.python'}
2 > 1: "(\\)|(?=\\'\\'\\'))" {'name': 'invalid.illegal.newline.python'}
2 > 1: "(\\)|(?=\\'\\'\\'))" {'name': 'invalid.illegal.newline.python'}
2 > 1: "(\\)|(?=\\'\\'\\'))" {'name': 'invalid.illegal.newline.python'}
2 > 1: "(\\)|(?=\\'\\'\\'))" {'name': 'invalid.illegal.newline.python'}
2 > 1: "(\\)|(?=\\'\\'\\'))" {'name': 'invalid.illegal.newline.python'}
2 > 1: "(\\)|(?=\\'\\'\\'))" {'name': 'invalid.illegal.newline.python'}
2 > 1: "(\\)|(?=\\'\\'\\'))" {'name': 'invalid.illegal.newline.python'}
2 > 1: "(\\)|(?=\\'\\'\\'))" {'name': 'invalid.illegal.newline.python'}
2 > 1: "(\\)|(?=\\'\\'\\'))" {'name': 'invalid.illegal.newline.python'}
2 > 1: '(\\]|(?="""))' {'name': 'invalid.illegal.newline.python'}
2 > 1: '(\\)|(?="""))' {'name': 'invalid.illegal.newline.python'}
2 > 1: '(\\)|(?="""))' {'name': 'invalid.illegal.newline.python'}
2 > 1: '(\\)|(?="""))' {'name': 'invalid.illegal.newline.python'}
2 > 1: '(\\)|(?="""))' {'name': 'invalid.illegal.newline.python'}
2 > 1: '(\\)|(?="""))' {'name': 'invalid.illegal.newline.python'}
2 > 1: '(\\)|(?="""))' {'name': 'invalid.illegal.newline.python'}
2 > 1: '(\\)|(?="""))' {'name': 'invalid.illegal.newline.python'}
2 > 1: '(\\)|(?="""))' {'name': 'invalid.illegal.newline.python'}
2 > 1: '(\\)|(?="""))' {'name': 'invalid.illegal.newline.python'}
2 > 1: "(\\'\\'\\')" {'name': 'invalid.illegal.newline.python'}
2 > 1: '(""")' {'name': 'invalid.illegal.newline.python'}
2 > 1: "(\\)|(?=\\'\\'\\'))" {'name': 'invalid.illegal.newline.python'}
2 > 1: "(\\)|(?=\\'\\'\\'))" {'name': 'invalid.illegal.newline.python'}
2 > 1: "(\\)|(?=\\'\\'\\'))" {'name': 'invalid.illegal.newline.python'}
2 > 1: "(\\)|(?=\\'\\'\\'))" {'name': 'invalid.illegal.newline.python'}
2 > 1: "(\\)|(?=\\'\\'\\'))" {'name': 'invalid.illegal.newline.python'}
2 > 1: "(\\)|(?=\\'\\'\\'))" {'name': 'invalid.illegal.newline.python'}
2 > 1: "(\\)|(?=\\'\\'\\'))" {'name': 'invalid.illegal.newline.python'}
2 > 1: "(\\)|(?=\\'\\'\\'))" {'name': 'invalid.illegal.newline.python'}
2 > 1: '(\\)|(?="""))' {'name': 'invalid.illegal.newline.python'}
2 > 1: '(\\)|(?="""))' {'name': 'invalid.illegal.newline.python'}
2 > 1: '(\\)|(?="""))' {'name': 'invalid.illegal.newline.python'}
2 > 1: '(\\)|(?="""))' {'name': 'invalid.illegal.newline.python'}
2 > 1: '(\\)|(?="""))' {'name': 'invalid.illegal.newline.python'}
2 > 1: '(\\)|(?="""))' {'name': 'invalid.illegal.newline.python'}
2 > 1: '(\\)|(?="""))' {'name': 'invalid.illegal.newline.python'}
2 > 1: '(\\)|(?="""))' {'name': 'invalid.illegal.newline.python'}
2 > 1: "(\\'\\'\\')" {'name': 'invalid.illegal.newline.python'}
2 > 1: '(""")' {'name': 'invalid.illegal.newline.python'}
2 > 1: '(ZZZ)' {'name': 'invalid.illegal.newline.python'}
2 > 1: '(ZZZ)' {'name': 'invalid.illegal.newline.python'}
2 > 1: '(ZZZ)' {'name': 'invalid.illegal.newline.python'}
2 > 1: '(ZZZ)' {'name': 'invalid.illegal.newline.python'}
2 > 1: '(ZZZ)' {'name': 'invalid.illegal.newline.python'}
2 > 1: '(ZZZ)' {'name': 'invalid.illegal.newline.python'}
2 > 1: '(ZZZ)' {'name': 'invalid.illegal.newline.python'}
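For anyone reading the output: each line means a rule names capture `2` while its regex only defines one group, and `(ZZZ)` is what an `end` pattern looks like once its backreference has been swapped for a placeholder. Below is a minimal sketch of those two steps. It assumes the stdlib `re` module as a stand-in for Oniguruma, so it only works for patterns both engines accept. The names `strip_backrefs` and `out_of_range_captures` and the scope `example.first.group` are made up for illustration.

```python
import re
from typing import Dict, List

# Same pattern as in the script above: a backslash followed by digits,
# unless that backslash is itself escaped.
BACKREF_RE = re.compile(r'((?<!\\)(?:\\\\)*)\\([0-9]+)')


def strip_backrefs(pattern: str) -> str:
    """Replace backreferences such as \\1 with a literal placeholder."""
    return BACKREF_RE.sub('ZZZ', pattern)


def out_of_range_captures(pattern: str, captures: Dict[str, dict]) -> List[str]:
    """Return the capture keys that exceed the number of groups in `pattern`.

    Assumption: stdlib `re` is used here instead of Oniguruma.
    """
    max_n = re.compile(pattern).groups
    return [k for k in captures if int(k) > max_n]


# An `end` pattern that refers back to a group captured by `begin`.
end = strip_backrefs(r'(\1)')
print(end)  # (ZZZ)

# A rule that names capture 2 although the regex only has group 1.
captures = {
    '1': {'name': 'example.first.group'},  # hypothetical scope name
    '2': {'name': 'invalid.illegal.newline.python'},
}
print(out_of_range_captures(end, captures))  # ['2']
```

The placeholder is needed because a backreference to a group that only exists in `begin` would not compile on its own. Replacing it with literal text leaves the group count of `end` unchanged.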
@elprans @vpetrovykh anything left to do here?
@1st1 maybe?
OK, the change looks good. As for the other hits, all of the
Sorry for the delay in merging this.
also validated this against oniguruma via onigurumacffi