Version and Specifier accept (erroneously) some non-ASCII letters in the *local version* segment #469

@zuo

Reproducing the behavior concerning packaging.version.Version:

Python 3.9.7 (default, Oct  4 2021, 18:09:29) 
[...]
>>> import packaging.version
>>> packaging.version.Version('1.2+\u0130\u0131\u017f\u212a')
<Version('1.2+i̇ıſk')>

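The cause is that packaging.version.VERSION_PATTERN uses a-z character ranges together with re.IGNORECASE and re.UNICODE (the latter is implicit for str patterns in Python 3.x). With that combination, a-z matches not only the 52 ASCII letters but also four non-ASCII ones: U+0130, U+0131, U+017F and U+212A (see the 2nd paragraph of this fragment: https://docs.python.org/3/library/re.html#re.IGNORECASE).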
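The interaction can be reproduced with the standard re module alone. The sketch below uses a simplified stand-in for the local version segment (the name LOCAL_SEGMENT and the pattern are assumptions for illustration, not the actual VERSION_PATTERN):

```python
import re

# Simplified stand-in for the local-version character class;
# this is NOT the real packaging.version.VERSION_PATTERN.
LOCAL_SEGMENT = r"[a-z0-9]+"

# The four non-ASCII letters from the reproduction above.
sample = "\u0130\u0131\u017f\u212a"

# str patterns are Unicode-aware by default, so IGNORECASE applies
# Unicode case folding to the a-z range.
unicode_ci = re.compile(LOCAL_SEGMENT, re.IGNORECASE)

# Adding re.ASCII restricts case-insensitive matching to ASCII letters.
ascii_ci = re.compile(LOCAL_SEGMENT, re.IGNORECASE | re.ASCII)

unicode_match = unicode_ci.fullmatch(sample) is not None
ascii_match = ascii_ci.fullmatch(sample) is not None

print(unicode_match)  # True: all four letters are accepted
print(ascii_match)    # False: none of them is accepted
```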

It can be fixed in one of the following two ways:

  • either by adding re.ASCII to flags (but then both occurrences of \s* in the actual regex will be restricted to match ASCII-only whitespace characters!);
  • or by removing re.IGNORECASE from flags and replacing (in VERSION_PATTERN) both occurrences of a-z with A-Za-z plus adding suitable upper-case alternatives in the pre_l, post_l and dev_l regex groups, e.g., [aA][lL][pP][hH][aA] in place of alpha (quite cumbersome...).
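Both options can be sketched on a simplified pattern. Everything below is illustrative: the character class stands in for the real VERSION_PATTERN, and any_case is a hypothetical helper, not part of packaging.

```python
import re

SAMPLE = "\u0130\u0131\u017f\u212a"  # the four problematic letters
NBSP = "\u00a0"                      # a non-ASCII whitespace character

# Option 1: keep IGNORECASE and add ASCII.
opt1_local = re.compile(r"[a-z0-9]+", re.IGNORECASE | re.ASCII)
opt1_rejects_sample = opt1_local.fullmatch(SAMPLE) is None

# Side effect of option 1: \s stops matching non-ASCII whitespace.
ws_unicode = re.compile(r"\s*").fullmatch(NBSP) is not None
ws_ascii = re.compile(r"\s*", re.ASCII).fullmatch(NBSP) is not None


# Option 2: drop IGNORECASE and spell out both cases explicitly.
def any_case(word):
    """Hypothetical helper: 'alpha' -> '[aA][lL][pP][hH][aA]'."""
    return "".join(f"[{c.lower()}{c.upper()}]" for c in word)


opt2_local = re.compile(r"[A-Za-z0-9]+")
opt2_pre = re.compile(any_case("alpha"))
opt2_rejects_sample = opt2_local.fullmatch(SAMPLE) is None

print(opt1_rejects_sample, opt2_rejects_sample)  # True True
print(ws_unicode, ws_ascii)                      # True False
print(any_case("alpha"))                         # [aA][lL][pP][hH][aA]
```

Option 2 leaves \s* untouched, at the cost of a more verbose pattern; generating the bracketed alternatives with a helper like the one above would keep that verbosity out of the source.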
