Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cram: Record counter field description is possibly off-by-one #807

Open
zaeleus opened this issue Jan 31, 2025 · 0 comments
Open

cram: Record counter field description is possibly off-by-one #807

zaeleus opened this issue Jan 31, 2025 · 0 comments

Comments

@zaeleus
Copy link

zaeleus commented Jan 31, 2025

This is in regard to CRAM format specification (version 3.1) (2024-09-04).

The record counter field in both the container (§ 7 "Container header structure") and slice headers (§ 8.5 "Slice header block") is described to be a "1-based sequential index of records in the file/stream."

However, current implementations write 0-based values, e.g., htslib:

$ samtools --version | head -2
samtools 1.21
Using htslib 1.21

$ (
samtools view --output-fmt cram --output-fmt-option seqs_per_slice=2 <<EOF
@HD	VN:1.6
r1	4	*	0	255	*	*	0	0	NNNN	!!!!
r2	4	*	0	255	*	*	0	0	NNNN	!!!!
r3	4	*	0	255	*	*	0	0	NNNN	!!!!
EOF
) | cram_dump - | grep "Rec counter"
    Rec counter:       0
        Rec counter      0
    Rec counter:       2
        Rec counter      2
    Rec counter:       0

I would expect the record counters to be 1 and 3 if they were to be 1-based. Is the description in the spec incorrect, or is the field missing a more thorough explanation on how its supposed to be tracked?

@zaeleus zaeleus changed the title cram: Questionable record counter field description cram: Record counter field description is possibly off-by-one Feb 4, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant