fix off-by-ones, and implement variable fmt uniwig#230
|
I have some hesitation about these changes because the outputs in the test files have changed. Last August I spent some time reviewing the counting function, making tweaks and confirming the output: 0f7ce42. Could you explain why this implementation is better for the output's intended use case? Or give a specific example where the original implementation was insufficient (or inaccurate)? |
|
Right. The reason the test files changed is that I adjusted the 0-based output to 1-based to match the wiggle file format. |
donaldcampbelljr
left a comment
Ok, these changes seem reasonable.
|
More explanation on the +1 coordinate fix in gtars uniwig:

Quick Summary
The gtars uniwig tool accepts two input formats: BED files (0-based coordinates) and BAM files (1-based coordinates). The

The Two Input Paths

BED Input Path
When the user provides
The relevant code in
chromosome.starts.push((parsed_start + 1, default_score)); // Convert 0-based BED to 1-based
Without this fix, a BED start of 19 would be treated as position 19 internally, producing WIG output shifted one position too low.

BAM Input Path
When the user provides
The relevant code in
first_record.alignment_start().unwrap().unwrap().get() as i32
The SAM/BAM specification defines the POS field as 1-based. The noodles

Why PEPATAC Is Unaffected
PEPATAC calls uniwig with
The |
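To make the two input paths concrete, here is a minimal sketch of the coordinate normalization described above. The function names `bed_start_to_one_based` and `bam_start_to_one_based` are hypothetical and are not taken from the gtars source; the sketch only assumes what the comment states, namely that BED starts are 0-based, SAM/BAM POS is 1-based, and the internal representation should be 1-based to match wiggle output.

```rust
/// Hypothetical helper: convert a 0-based BED start into the 1-based
/// position used internally (and in the WIG output).
fn bed_start_to_one_based(bed_start: i32) -> i32 {
    // BED starts are 0-based, so shift by one.
    bed_start + 1
}

/// Hypothetical helper: a SAM/BAM POS value is already 1-based,
/// so it passes through unchanged.
fn bam_start_to_one_based(alignment_start: i32) -> i32 {
    alignment_start
}
```

Under these assumptions, a BED start of 19 and a BAM alignment start of 20 describe the same base and should both map to position 20.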
|
For the edge clamping error, here's the explanation:

Input
single.bed (BED 0-based: position 3 in 1-based)
chr1 2 3
chrom.sizes
chr1 20

Command
gtars uniwig \
--file single.bed \
--chromref chrom.sizes \
--smoothsize 5 \
--stepsize 1 \
--fileheader output \
--outputtype wig \
--counttype start \
--filetype bed

Output Comparison

The Bug
A read at position 3 with smoothsize=5 should produce a window from 3-5 to 3+5, i.e., positions -2 to 8. Clamped to chromosome bounds: positions
OLD algorithm (broken):
NEW algorithm (fixed):
The fix: calculate the end position from the original read position, not the clamped start. |
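The difference between the two approaches can be sketched as follows. This is an illustration under stated assumptions, not the gtars implementation: the function names are hypothetical, the window is treated as inclusive on both ends, the lower bound is clamped to position 1, the upper bound is clamped to the chromosome size, and the "old" variant is only a reconstruction of what deriving the end from the clamped start would look like.

```rust
/// Hypothetical reconstruction of the broken approach: the end is derived
/// from the already-clamped start, so clamping at the left edge pushes the
/// window too far to the right.
fn window_from_clamped_start(pos: i32, smoothsize: i32, chrom_size: i32) -> (i32, i32) {
    let start = (pos - smoothsize).max(1); // clamp to the first base (assumed 1-based)
    let end = (start + 2 * smoothsize).min(chrom_size); // end depends on the clamped start
    (start, end)
}

/// Sketch of the fix: the end is derived from the original read position,
/// so clamping the start does not move the end.
fn window_from_original_pos(pos: i32, smoothsize: i32, chrom_size: i32) -> (i32, i32) {
    let start = (pos - smoothsize).max(1); // clamp to the first base (assumed 1-based)
    let end = (pos + smoothsize).min(chrom_size); // end depends only on the read position
    (start, end)
}
```

With the inputs from the example (position 3, smoothsize 5, chromosome size 20), the first function yields a window ending at 11, while the second ends at 8, matching the expected 3+5. Away from the chromosome edge, the two functions agree.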
|
Great. Thank you for the follow up explanations. |