REVAMP core to FIX #20 reuse of instantiated plugin digesters #21
Open
ankostis wants to merge 10 commits into Miserlou:master
Currently plugins install digester instances when initialized. The SAME digesters are then reused multiple times, corrupting hashes for all but the 1st input!

- Installed TCs to detect this bug.
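The failure mode can be reproduced with plain `hashlib`, independent of this project's plugin code. The following is a minimal sketch, with made-up inputs, in which one hasher instance is shared across two inputs:

```python
import hashlib

inputs = [b"first input", b"second input"]

# Buggy pattern: one digester instance is created up front and reused.
shared = hashlib.sha1()
buggy = []
for data in inputs:
    shared.update(data)  # state from earlier inputs is still inside
    buggy.append(shared.hexdigest())

# Correct pattern: a fresh digester per input.
correct = [hashlib.sha1(data).hexdigest() for data in inputs]

print(buggy[0] == correct[0])  # True: the 1st input still hashes correctly
print(buggy[1] == correct[1])  # False: the 2nd hash covers both inputs
```

The second "buggy" value is really the hash of both inputs concatenated, which is why only the first input comes out right.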
MAJOR REVAMP to solve the bug discovered in a21bf38 about reusing plugin digesters.

In the previous version, each "digester" was a 2-tuple `(digester-instance, final-hash-func)`. Solving the reuse bug without refactoring would have required re-initializing the plugins for each input.

In this revision, each registered *digester* is actually a `factory_function(fsize: int)` that creates a *digester* object with 2 methods:

- `update(bytes)`
- `hexdigest() -> str` (lowercase)

NOTE that the factory function takes `fsize` as its argument. This is necessary so that git-digesters do not always have to slurp bytes, particularly for URL resources; as a consequence, all other digesters must use a "special" factory that ignores the `fsize` arg.

Other changes, impossible to separate into commits:

- FIX plugins: multiple problems were preventing them from running; added travis TCs to detect them.
- Add `-x family` option to exclude families.
- The inclusion/exclusion logic is implemented within a class.
- Do not git-slurp if the URL provides Content-Length.
- Avoid needless instantiation of excluded digesters.
- Avoid some top-level imports, to speed up cmd-line launch for help-msg.
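As a rough illustration of the factory design described in this commit message, here is a minimal sketch, not the code in this PR. The names `ignoring_fsize`, `GitBlobDigester`, `DIGESTERS` and `hash_chunks` are hypothetical. The only outside fact it relies on is that git hashes a blob as SHA-1 over a `blob <size>\0` header followed by the content, which is why a git digester wants the size before it sees any bytes:

```python
import hashlib


def ignoring_fsize(hasher_ctor):
    """Hypothetical helper: adapt a hashlib constructor to factory(fsize)."""
    def factory(fsize):
        # fsize is deliberately ignored; hashlib objects already
        # provide update(bytes) and hexdigest().
        return hasher_ctor()
    return factory


class GitBlobDigester:
    """Hashes content as git does for blobs: header first, then the bytes."""

    def __init__(self, fsize):
        # Knowing the size up front lets the content be streamed afterwards.
        self._sha1 = hashlib.sha1(b"blob %d\0" % fsize)

    def update(self, data):
        self._sha1.update(data)

    def hexdigest(self):
        return self._sha1.hexdigest()


# Hypothetical registry: name -> factory(fsize)
DIGESTERS = {
    "sha256": ignoring_fsize(hashlib.sha256),
    "git-blob": GitBlobDigester,
}


def hash_chunks(chunks, fsize):
    """Create fresh digesters for one input and feed it chunk by chunk."""
    digesters = {name: make(fsize) for name, make in DIGESTERS.items()}
    for chunk in chunks:
        for digester in digesters.values():
            digester.update(chunk)
    return {name: d.hexdigest() for name, d in digesters.items()}


result = hash_chunks([b"hello ", b"world\n"], fsize=12)
print(result["git-blob"])
```

Because the registry holds factories instead of instances, every call to `hash_chunks` starts from clean state, so hashing the same input twice yields the same digests.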
Working across PY-versions is easier when a module does not shadow its package name and relative imports are used. So move code:

- omniparse.omniparse.py --> omniparse.__init__.py
- project coords --> omniparse._version.py
+ NOW ALL TCs OK (except PY36-dev).
After SHAttered, this utility might become a bit popular. Would you check this PR?
HASHING REVAMP
Had to revamp to solve the bug in #20 about reusing instantiated plugin-digesters.
In the previous version, each "digester" was a "convoluted" 2-tuple
`(digester-instance, final-hash-func)`. Solving the reuse bug without refactoring would have required
re-initializing the plugins for each input.
In this PR, registered digesters are actually
`factory_function(fsize: int)` callables that create a digester object with just 2 methods:

- `update(bytes)`
- `hexdigest() -> str` (lowercase)
NOTE: the factory-function takes
`fsize` as its argument. This is useful for git-digesters, to avoid always slurping files with known size;
that is also handy for URL-resources. But as a trade-off,
all other digesters must use a "special" factory just to ignore the
`fsize` arg.

MODULES REVAMP
As explained in #20, enabling plugins in Travis across PY-versions revealed
structural module issues.
In general it is easier if module names do not shadow their package name,
and using relative imports is helpful. So I had to move:

- omniparse.omniparse.py --> omniparse.__init__.py
- project coords --> omniparse._version.py
Other changes
(most changes are in commit e15bef5, impossible to separate, sorry)
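- FIX plugins: multiple problems were preventing them from running; added travis TCs to detect them.
- Add `-x family` option to exclude families (that was easy) :-).
- Avoid needless instantiation of excluded digesters.
- Avoid some top-level imports, to speed up cmd-line launch for help-msg.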
TODO:
- `__main__.py` file, and do the trick I described in #10 (Rework cmd-line interface with Inclusions/eXclusions) with 2 cmds (oh, & omnihash).