@fnothaft

This follows from the work done in #142 and #149. It refactors BGZFSplitGuesser to reuse the logic in the BaseSplitGuesser class, which requires making BaseSplitGuesser public. Additionally, it replaces the logic that BaseSplitGuesser uses to identify the start of a BGZF block with logic that doesn't require copying bytes from the input stream into a buffer. This improves the performance of identifying the start of a BGZF block; for reasons I haven't pinned down, the copy into the buffer and the method calls on the buffer are expensive.
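
To illustrate the idea of finding a block start without an intermediate buffer, here is a minimal sketch. It is not the code in this PR: the class name `BgzfMagicScan` and the method `findMagic` are hypothetical, and the sketch only matches the four leading bytes that every BGZF block header begins with (`0x1f 0x8b 0x08 0x04`), reading them one at a time from the stream.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper for illustration; not part of Hadoop-BAM.
final class BgzfMagicScan {
    // Leading bytes of a BGZF block header: gzip ID1, ID2, CM (deflate), FLG (FEXTRA set).
    private static final int[] MAGIC = {0x1f, 0x8b, 0x08, 0x04};

    private BgzfMagicScan() {}

    /**
     * Scans the stream byte by byte and returns the offset (relative to where
     * the stream was positioned on entry) of the first occurrence of MAGIC,
     * or -1 if the stream ends first. No bytes are copied into a buffer.
     */
    static long findMagic(InputStream in) throws IOException {
        long read = 0;    // number of bytes consumed so far
        int matched = 0;  // how many MAGIC bytes the most recent input matches
        int b;
        while ((b = in.read()) != -1) {
            read++;
            if (b == MAGIC[matched]) {
                matched++;
                if (matched == MAGIC.length) {
                    return read - MAGIC.length;
                }
            } else {
                // All MAGIC bytes are distinct, so after a mismatch the only
                // possible partial match is a fresh first byte.
                matched = (b == MAGIC[0]) ? 1 : 0;
            }
        }
        return -1;
    }
}
```

These four bytes can also occur by chance inside compressed data, so a hit from `findMagic` is only a candidate; a real split guesser would still need to validate the rest of the header and the block that follows.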

This needs some code style TLC and I'll be back with performance benchmarks.

@coveralls

Coverage Status

Coverage decreased (-0.1%) to 63.238% when pulling c2ae54e on fnothaft:bgzf-split-on-base-split into e4224ab on HadoopGenomics:master.
