@mkitti
Contributor

@mkitti mkitti commented Oct 15, 2025

This pull request adds the conditional (formerly optional) codec as an extension.

Abstract

The conditional codec is a meta-codec that enables or disables an encapsulated sequence of other codecs on a per-chunk basis. It achieves this by wrapping a user-defined list of codecs in its configuration and prepending a bitfield header to the byte stream of each chunk. Each bit in the header corresponds to a codec in the encapsulated list, indicating whether it should be applied or skipped. This allows for dynamic, data-dependent optimization of the codec pipeline, such as disabling compression when it provides no benefit.
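A minimal sketch of the mechanism the abstract describes, not the layout defined by the extension itself. It assumes a single header byte in which bit `i` is set when codec `i` was applied, and it uses `zlib` as a stand-in bytes-to-bytes codec. The names `conditional_encode`, `conditional_decode`, and `ZLIB` are hypothetical, and the "keep the codec only if it shrinks the data" rule is just one possible writer-side condition.

```python
import zlib
from typing import Callable, List, Tuple

# A hypothetical bytes-to-bytes codec: an (encode, decode) pair of functions.
Codec = Tuple[Callable[[bytes], bytes], Callable[[bytes], bytes]]


def conditional_encode(chunk: bytes, codecs: List[Codec]) -> bytes:
    """Apply each codec only if it shrinks the data; record the choices in a header byte."""
    if len(codecs) > 8:
        raise ValueError("this sketch assumes a single header byte (at most 8 codecs)")
    header = 0
    data = chunk
    for i, (encode, _decode) in enumerate(codecs):
        candidate = encode(data)
        if len(candidate) < len(data):
            header |= 1 << i  # bit i set: codec i was applied
            data = candidate
    return bytes([header]) + data


def conditional_decode(stored: bytes, codecs: List[Codec]) -> bytes:
    """Read the header byte and undo the enabled codecs in reverse order."""
    header, data = stored[0], stored[1:]
    for i in reversed(range(len(codecs))):
        if header & (1 << i):
            data = codecs[i][1](data)
    return data


# zlib stands in for any bytes-to-bytes codec.
ZLIB: Codec = (zlib.compress, zlib.decompress)
```

With this sketch, a highly compressible chunk is stored with the header bit set, while a chunk that zlib cannot shrink is stored verbatim after a zero header byte. In both cases `conditional_decode` recovers the original bytes without knowing why the writer made its choice.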

@mkitti
Contributor Author

mkitti commented Oct 15, 2025

Draft on HackMD: https://hackmd.io/@zarr/rJRh51apex

Clarify the optional codec's upper bound mechanism and its benefits for sharded stores. Include detailed calculations for chunk offsets and emphasize performance improvements in write operations.
Corrected code block syntax highlighting and fixed a typo in the example.

Use `python` instead of `python=` for code blocks.
Clarify the impact of padded shard layout on disk size and explain the shard compaction process for long-term storage optimization.

* **Parallel I/O Within a Shard:** With predictable offsets, multiple threads or processes can issue concurrent write operations to different chunks within the same shard file using overlapped I/O.

This transforms a shard from a monolithic object that must be written sequentially into a parallel-access container. It significantly boosts write throughput in high-performance computing (HPC) and other concurrent data processing environments where multiple workers need to write to the same dataset simultaneously.
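As a rough illustration of why offsets become predictable, here is a sketch that assumes every chunk occupies a fixed-size slot made of the conditional header plus an upper bound on the encoded size. The function name and the `header_nbytes` and `data_start` parameters are hypothetical; the actual shard layout is whatever the sharding codec and the draft define.

```python
def chunk_slot_offset(
    chunk_index: int,
    max_nbytes: int,
    header_nbytes: int = 1,
    data_start: int = 0,
) -> int:
    """Byte offset of a chunk's slot in a padded shard (assumed fixed-size slots)."""
    if chunk_index < 0:
        raise ValueError("chunk_index must be non-negative")
    # Each slot reserves room for the header and the largest possible payload.
    slot_nbytes = header_nbytes + max_nbytes
    return data_start + chunk_index * slot_nbytes
```

Because the offset depends only on the chunk index and the configured bound, each writer can compute its own slot without waiting for the encoded sizes of earlier chunks.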
Contributor Author

@jakirkham I included this section based on our discussion of "padding" in shards. Is this along the lines of what you had in mind?

@normanrz
Member

Thanks for writing this up. Especially appreciate all the motivating examples.

Initially, my mental model was that this would work similarly to an "optional" data type in various programming languages. The difference from your model is that it would only hold a single codec that can be toggled. Have you considered that?

@normanrz
Member

Another note, by toggling codecs off selectively, it could become possible to create invalid codec pipelines. Primarily, if the array-to-bytes codec gets toggled off. I think it would be good to have a normative section about what codecs this higher-order codec can be applied to.

@mkitti
Contributor Author

mkitti commented Oct 16, 2025

Another note, by toggling codecs off selectively, it could become possible to create invalid codec pipelines. Primarily, if the array-to-bytes codec gets toggled off. I think it would be good to have a normative section about what codecs this higher-order codec can be applied to.

It only survived in the Discussion section, but I only meant to define a bytes-to-bytes codec here. That means all encapsulated codecs must also be bytes-to-bytes codecs.

An array-to-array optional codec would work pretty similarly though in that one can toggle codecs arbitrarily. However, there is an additional validity issue in that certain array-to-array codecs may have constraints on the dimensionality of their array inputs and outputs.

An array-to-bytes optional codec would need to be mutually exclusive. In that case, I would argue that we may want an array-to-bytes optional codec to work differently. The header might be interpreted as an integer index rather than a bitfield. A single byte could select among 256 array-to-bytes codecs.
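To make the integer-index idea concrete, here is a sketch in which a single header byte selects exactly one array-to-bytes encoding. Everything in it is hypothetical: the candidate encodings (little-endian int32 and int16 via `struct`), the names `CANDIDATES`, `select_encode`, and `select_decode`, and the use of plain Python lists in place of arrays.

```python
import struct
from typing import Callable, List, Sequence, Tuple

# A hypothetical array-to-bytes codec: (encode, decode) over a flat list of ints.
ArrayToBytes = Tuple[Callable[[Sequence[int]], bytes], Callable[[bytes], List[int]]]


def _packed(fmt_char: str) -> ArrayToBytes:
    """Build a little-endian fixed-width integer encoding from a struct format character."""
    item_size = struct.calcsize(f"<{fmt_char}")

    def encode(values: Sequence[int]) -> bytes:
        return struct.pack(f"<{len(values)}{fmt_char}", *values)

    def decode(data: bytes) -> List[int]:
        count = len(data) // item_size
        return list(struct.unpack(f"<{count}{fmt_char}", data))

    return encode, decode


# Index 0: int32, index 1: int16. A one-byte header could address up to 256 entries.
CANDIDATES: List[ArrayToBytes] = [_packed("i"), _packed("h")]


def select_encode(values: Sequence[int], index: int) -> bytes:
    """Encode with exactly one candidate and store its index in the header byte."""
    if not 0 <= index < min(len(CANDIDATES), 256):
        raise ValueError("index does not name a configured codec")
    return bytes([index]) + CANDIDATES[index][0](values)


def select_decode(stored: bytes) -> List[int]:
    """Read the header byte as an index and decode with the selected candidate."""
    index = stored[0]
    return CANDIDATES[index][1](stored[1:])
```

Unlike the bitfield, the index makes the choices mutually exclusive by construction, which is the property an array-to-bytes variant would need.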

In summary, only a bytes-to-bytes optional codec is intended to be defined here. The encapsulated pipeline thus can only contain other bytes-to-bytes codecs.

Would it even make sense to have a single codec that could be an array-to-array codec, an array-to-bytes codec, or a bytes-to-bytes codec? I believe these should be three separate codecs. Could those three codecs share a name?

@jbms
Contributor

jbms commented Oct 16, 2025

Other related work to consider: OpenZL (https://openzl.org/) from Facebook.

In regards to future expansion: the current design allows future expansion at the end of the codec list but not at the beginning or middle. For example it would not be possible to add an additional pre-filter like shuffle afterwards. I suppose this can be mitigated by duplicating codecs in the list as needed.

@mkitti
Contributor Author

mkitti commented Oct 16, 2025

In regards to future expansion: the current design allows future expansion at the end of the codec list but not at the beginning or middle. For example it would not be possible to add an additional pre-filter like shuffle afterwards. I suppose this can be mitigated by duplicating codecs in the list as needed

We currently lack an identity bytes-to-bytes codec that could serve as a placeholder. That would be a codec whose output is exactly the input byte sequence or stream.

The optional codec could serve as an identity codec if it encapsulated no codecs and had header_bits set or default to 0 as in the following JSON.

{
    "name": "optional",
    "configuration": {
        "codecs": []
    }
}

If we want to reserve space to prepend a codec, we could start the encapsulated codec chain with an optional codec configured as an identity codec.

{
    "name": "optional",
    "configuration": {
        "codecs": [
            {
                "name": "optional",
                "configuration": {
                    "codecs": []
                }
            },
            // additional codecs ...
        ]
    }
}

@jbms
Contributor

jbms commented Oct 16, 2025

In regards to future expansion: the current design allows future expansion at the end of the codec list but not at the beginning or middle. For example it would not be possible to add an additional pre-filter like shuffle afterwards. I suppose this can be mitigated by duplicating codecs in the list as needed

We currently lack an identity bytes-to-bytes codec that could serve as a placeholder. That would be a codec whose output is exactly the input byte sequence or stream.

The optional codec could serve as an identity codec if it encapsulated no codecs and had header_bits set or default to 0 as in the following JSON.

{
    "name": "optional",
    "configuration": {
        "codecs": []
    }
}

If we want to reserve space to prepend a codec, we could start the encapsulated codec chain with an optional codec configured as an identity codec.

That is true, but using null to explicitly mean "reserved for future use" may be better, because then you would fail if you encountered it.

@mkitti
Contributor Author

mkitti commented Oct 17, 2025

Do you mean that the element of the codec list should itself be null and that null is the identity codec?

{
    "name": "optional",
    "configuration": {
        "codecs": [
            null,
            // additional codecs ...
        ]
    }
}

or do you mean that the codec list itself should be null?

{
    "name": "optional",
    "configuration": {
        "codecs": [
            {
                "name": "optional",
                "configuration": {
                    "codecs": null
                }
            },
            // additional codecs ...
        ]
    }
}

@mkitti
Contributor Author

mkitti commented Oct 17, 2025

I think you mean that an element of the codec list should be null such that if the "null codec" were enabled then there would be an error. null does not mean the identity codec but explicitly one that is invalid and would generate an error.

@mkitti
Contributor Author

mkitti commented Oct 17, 2025

It has been proposed to change the name of this codec to conditional.

@mkitti mkitti changed the title from "Add optional codec" to "feat: add conditional codec" Oct 17, 2025
@jbms
Contributor

jbms commented Oct 17, 2025

I think you mean that an element of the codec list should be null such that if the "null codec" were enabled then there would be an error. null does not mean the identity codec but explicitly one that is invalid and would generate an error.

Yes that's what I meant.

@mkitti
Contributor Author

mkitti commented Oct 18, 2025

After thinking about this more, perhaps we should add an explicit ErrorCodec that always raises an error if enabled and can carry a specific error message.

Depending on null to do this sounds like a continuation of the "billion dollar mistake".
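A sketch of what such a placeholder could look like, assuming codecs expose `encode` and `decode` methods and that the header is a single byte with one bit per codec. The class body, the default message, and the `decode_with_header` helper are all hypothetical; nothing here is defined by the extension.

```python
from typing import List


class ErrorCodec:
    """Hypothetical placeholder codec that fails whenever it is actually applied."""

    def __init__(self, message: str = "reserved codec slot must not be enabled") -> None:
        self.message = message

    def encode(self, data: bytes) -> bytes:
        raise RuntimeError(self.message)

    def decode(self, data: bytes) -> bytes:
        raise RuntimeError(self.message)


def decode_with_header(stored: bytes, codecs: List[ErrorCodec]) -> bytes:
    """Undo the codecs whose header bit is set, last codec first."""
    header, data = stored[0], stored[1:]
    for i in reversed(range(len(codecs))):
        if header & (1 << i):
            data = codecs[i].decode(data)
    return data
```

A reader that sees the reserved bit cleared passes the data through untouched. A reader that sees it set fails with the configured message instead of silently misinterpreting the chunk.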

Member

@normanrz normanrz left a comment

This PR meets the requirements for being merged. Let me know when you're ready to.

@mkitti
Contributor Author

mkitti commented Oct 20, 2025

The file needs to be moved from the optional directory to the conditional directory.

edit: Resolved in 5528471

@mkitti
Contributor Author

mkitti commented Oct 20, 2025

In anticipation of a future array-to-array or array-to-bytes conditional codec, is there any provision that we would want to make here?

For example, we could make from_type and to_type parameters where the only currently valid value is "bytes".

Alternatively, the AA or AB codecs could just have a different name. The AB codec would probably be more like a selection codec where the header contains an integer to select exactly one or zero codecs.

@jbms
Contributor

jbms commented Oct 20, 2025

An array to array version of this has the issue of how to encode the extra bits, since its only output is an array. I think it would need to be an array to bytes codec, where the sub-codecs can be any kind but the selected subset must be in a valid order and include exactly one array to bytes codec. Alternatively there could be a way to specify a non-conditional array to bytes codec to use.

@mkitti
Contributor Author

mkitti commented Oct 20, 2025

For now, I am thinking about how to distinguish this bytes-to-bytes codec from another conceptually similar codec of a similar name that may be implemented in the future.

That said, I am now convinced that an array-to-x analog of this would be different enough to require a completely different name.

An array-to-array version might require a per-chunk metadata facility that we do not have yet.

@mkitti
Contributor Author

mkitti commented Oct 20, 2025

Should the condition be serialized somehow?

  • The condition does not need to be serialized. It is not needed to read the data. Also, another implementation does not need to use the same condition when writing data.
  • The condition does need to be serialized in order to round-trip the encoding of the array.

While I do not think we should require the condition to be serialized, would it be worthwhile to have a common way of describing a few conditions as anticipated here? For example, we could have an optional condition field. Potential conditions could be:

  • "if_available" - use the encapsulated codec chain if it is available
  • "disabled" - do not apply the encapsulated codec chain (reserve it for later use?)
  • "if_smaller_size" - only encode if the number of output bytes is smaller than the number of input bytes.
  • "if_smaller_or_equal_size" - only encode if the number of output bytes is smaller than or equal to the number of input bytes
  • { "max_nbytes": N } - the size of the encoded chunk, not including the conditional header, must be less than or equal to N bytes, where N is an integer.

The alternative is that this all could be additional codecs in and of themselves.
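The conditions listed above could be evaluated by a writer roughly as follows. This is a sketch under assumptions: the function name `should_apply` and its parameters are hypothetical, and `max_nbytes` is read here as "apply the chain only if the encoded output fits within N bytes", which is one possible interpretation.

```python
from typing import Union

# A condition is either one of the proposed string names or {"max_nbytes": N}.
Condition = Union[str, dict]


def should_apply(
    condition: Condition,
    input_nbytes: int,
    output_nbytes: int,
    available: bool = True,
) -> bool:
    """Decide whether a writer keeps the output of the encapsulated codec chain."""
    if not available:
        return False  # a chain that is not installed can never be applied
    if condition == "if_available":
        return True
    if condition == "disabled":
        return False
    if condition == "if_smaller_size":
        return output_nbytes < input_nbytes
    if condition == "if_smaller_or_equal_size":
        return output_nbytes <= input_nbytes
    if isinstance(condition, dict) and "max_nbytes" in condition:
        # The bound excludes the conditional header (assumed interpretation).
        return output_nbytes <= condition["max_nbytes"]
    raise ValueError(f"unknown condition: {condition!r}")
```

Only the writer evaluates the condition. A reader needs nothing but the header bits, which is why serializing the condition would be optional.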

@d-v-b
Contributor

d-v-b commented Oct 20, 2025

The condition does need to be serialized in order to round-trip the encoding of the array.

When reading, you can copy the bitmask, so I don't think you need the original decision-making procedure to re-generate the same bytes. But I also don't think byte-identical round-tripping is important here. Ensuring encoded data can be decoded is probably a better objective. I think the reason for choosing to use a codec or not is akin to the order in which sub-chunks are written -- basically a runtime thing that readers don't need to know about.

@mkitti
Contributor Author

mkitti commented Oct 23, 2025

I'm good for this to be merged now as is.
