cram-codecs: rANS 4x8 order-1 frequency table example does not follow description #817

Open
zaeleus opened this issue Mar 5, 2025 · 0 comments

This is in regard to the CRAM codec specification (version 3.1) (2023-03-15).

The example in § 2.1 "Frequency table: Order-1 encoding" splits the following input

abracadabraabracadabraabracadabraabracadabr

into

abracadabra abracadabra abracadabra abracadabr

It's not obvious why the example is split this way. The note directs you to § 2.2.2 "rANS entropy encoding: Interleaving", which says

We therefore split the input data into 4 approximately equal sized fragments[5] starting at 0, ⌊len/4⌋, ⌊len/4⌋ × 2 and ⌊len/4⌋ × 3.

I.e.,

abracadabr aabracadab raabracada braabracad abr

Splitting the input this way invalidates the observed frequencies in the example, as well as footnote 5:

[5] This was why the \0a context in the example above had a frequency of 4 instead of 1.
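
To make the discrepancy concrete, here is a minimal Python sketch of the split as described in § 2.2.2. The helper names `fragments` and `initial_context_counts` are hypothetical and are not taken from the specification or any implementation. The sketch also assumes that the fourth fragment absorbs the remainder, which the quoted sentence does not state explicitly. The first symbol of each fragment does not depend on that assumption.

```python
from collections import Counter

DATA = "abracadabra" * 3 + "abracadabr"  # the 43-byte example input


def fragments(data):
    """Split data at 0, len//4, 2*(len//4) and 3*(len//4).

    Assumption: the last fragment runs to the end of the input, so it
    also carries the remainder when len is not a multiple of 4.
    """
    q = len(data) // 4
    starts = [0, q, 2 * q, 3 * q]
    ends = [q, 2 * q, 3 * q, len(data)]
    return [data[s:e] for s, e in zip(starts, ends)]


def initial_context_counts(data):
    """Count the first symbol of each fragment, i.e. the symbols that
    would be observed in the \\0 context."""
    return Counter(frag[0] for frag in fragments(data))


print(fragments(DATA))
print(initial_context_counts(DATA))
```

Under these assumptions the four fragments start with `a`, `a`, `r` and `b`, so the \0a context would have a frequency of 2, not the 4 stated in the example and in footnote 5.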
