Implement scale/offset codec in the EOPF data model #134

@emmanuelmathot

Context

The scale/offset codec extension has been merged in the zarr-extensions repository, providing a standardized Zarr V3 array-to-array codec for scale/offset transformation. This closes the specification gap documented in zarr-developers/zarr-extensions#42.

The next step is implementation in zarr-python, which would make the codec transparently available through xarray and our downstream stack (data pipeline, TiTiler).

What this enables

With a standardized and implemented scale/offset codec, our GeoZarr data arrays can:

  • Store reflectance data efficiently as uint16 (0–10000) while transparently presenting float32 values (0.0–1.0) to users
  • Follow CF conventions for physical quantities without requiring users to manually apply scale_factor / add_offset attributes
  • Decouple storage encoding from data presentation, which is the Zarr V3 design intent

This directly addresses the reflectance data convention discussion with EOPF CPM, where both sides agree on integer storage with codec-based float presentation.
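As a rough illustration of the uint16-to-float32 mapping described above, here is a minimal NumPy sketch of CF-style packing and unpacking (`unpacked = packed * scale_factor + add_offset`). It is not the zarr-extensions codec or any zarr-python API; the function names and the constants `SCALE_FACTOR = 1e-4` and `ADD_OFFSET = 0.0` are assumptions chosen to match the 0–10000 reflectance example.

```python
import numpy as np

# Assumed values for illustration only: reflectance stored as 0-10000.
SCALE_FACTOR = np.float32(1e-4)
ADD_OFFSET = np.float32(0.0)


def unpack(stored, scale_factor=SCALE_FACTOR, add_offset=ADD_OFFSET):
    """CF-style unpacking: stored integers -> float32 physical values."""
    return stored.astype(np.float32) * scale_factor + add_offset


def pack(values, scale_factor=SCALE_FACTOR, add_offset=ADD_OFFSET):
    """Inverse mapping: physical values -> uint16, rounded to nearest."""
    scaled = np.rint((values.astype(np.float64) - add_offset) / scale_factor)
    # Clip to the uint16 range so out-of-range inputs do not wrap around.
    return np.clip(scaled, 0, np.iinfo(np.uint16).max).astype(np.uint16)


stored = np.array([0, 2500, 10000], dtype=np.uint16)
reflectance = unpack(stored)   # float32 values in the 0.0-1.0 range
roundtrip = pack(reflectance)  # back to the original uint16 values
```

With a codec in place, this arithmetic would live in the array's codec pipeline rather than in user code, which is the point of the bullets above.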

Tasks

  • Confirm zarr-python implementation status. @d-v-b: is the scale/offset codec from zarr-extensions#43 being implemented in zarr-python? If so, this is an activity we can do in this project.
  • Use the codec in the converter and the data pipeline @lhoupert
  • Validate downstream compatibility: confirm that TiTiler correctly decodes data through the codec without additional handling @vincentsarago
  • Document the convention in the data model spec @emmanuelmathot

Metadata

Labels

enhancement: New feature or request
