TypeError: Expected a BytesBytesCodec. Got <class 'numcodecs.blosc.Blosc'> instead. #10032

Open
leoniewgnr opened this issue Feb 6, 2025 · 7 comments
Labels
bug, topic-zarr (Related to zarr storage library)

Comments

@leoniewgnr

This code runs without any problems with zarr2, but gives the following error when run with zarr3:

import pandas as pd
import numpy as np
import xarray as xr
from numcodecs.blosc import Blosc

ds = xr.Dataset(
    {"foo": (("x", "y"), np.random.rand(4, 5))},
    coords={
        "x": [10, 20, 30, 40],
        "y": pd.date_range("2000-01-01", periods=5),
        "z": ("x", list("abcd")),
    },
)

tmp_path = 'tmp.zarr'

# this works
ds.to_zarr(tmp_path, mode="w")
print('Saved to tmp.zarr')

# this does not work 
compressor = Blosc(cname="zstd", clevel=3, shuffle=2)
ds.to_zarr(tmp_path, encoding={"foo": {"compressor": compressor}}, mode="w")
print('Saved to tmp.zarr')

The error message is: TypeError: Expected a BytesBytesCodec. Got <class 'numcodecs.blosc.Blosc'> instead.
The same error occurs in the documentation: https://docs.xarray.dev/en/stable/user-guide/io.html#zarr-compressors-and-filters



TomNicholas added the bug and topic-zarr (Related to zarr storage library) labels Feb 6, 2025
@TomNicholas
Member

Thanks for raising this, @leoniewgnr! We're still hunting down all the bugs that the move to zarr 3 created.

> The same error occurs in the documentation:

That's particularly weird - errors in the documentation examples are supposed to lead to errors in the CI...

@keewis
Collaborator

keewis commented Feb 6, 2025

See also #9987

@FedeMPouzols

I think this example needs to be updated for zarr-python 3. Something like this works for me:

diff --git a/doc/user-guide/io.rst b/doc/user-guide/io.rst
index 986d43ce..7f5d6e2b 100644
--- a/doc/user-guide/io.rst
+++ b/doc/user-guide/io.rst
@@ -829,10 +829,10 @@ For example:
     :okwarning:
 
     import zarr
-    from numcodecs.blosc import Blosc
+    from zarr.codecs import BloscCodec
 
-    compressor = Blosc(cname="zstd", clevel=3, shuffle=2)
-    ds.to_zarr("foo.zarr", encoding={"foo": {"compressor": compressor}})
+    compressor = BloscCodec(cname="zstd", clevel=3, shuffle="shuffle")
+    ds.to_zarr("foo.zarr", encoding={"foo": {"compressors": (compressor,)}})
 
 .. note::

(This is my best guess based on what I see in the backend tests and in some Zarr v3 related PRs. In this particular case, {"compressor": compressor} (without the tuple) also seems to work.)
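
For anyone adapting the diff above, here is a small sketch of how the old numcodecs arguments might be translated for zarr-python 3. As far as I can tell, the numcodecs integer constants are NOSHUFFLE = 0, SHUFFLE = 1 and BITSHUFFLE = 2, so shuffle=2 in the original snippet means bit-shuffle, while the diff passes "shuffle" (byte shuffle). The helper names `blosc_kwargs_for_zarr3` and `zarr3_encoding` are hypothetical, not part of xarray or zarr, and whether the resulting encoding is accepted still depends on the installed xarray and zarr versions.

```python
# Mapping from numcodecs' integer shuffle constants to the string names
# that zarr.codecs.BloscCodec accepts. AUTOSHUFFLE (-1) is left out on
# purpose because this sketch assumes no direct equivalent exists.
_SHUFFLE_NAMES = {0: "noshuffle", 1: "shuffle", 2: "bitshuffle"}


def blosc_kwargs_for_zarr3(cname="zstd", clevel=3, shuffle=2):
    """Translate numcodecs-style Blosc arguments (hypothetical helper)."""
    if shuffle not in _SHUFFLE_NAMES:
        raise ValueError(f"No zarr v3 equivalent known for shuffle={shuffle!r}")
    return {"cname": cname, "clevel": clevel, "shuffle": _SHUFFLE_NAMES[shuffle]}


def zarr3_encoding(var_name, **blosc_kwargs):
    """Build a to_zarr encoding dict for one variable (hypothetical helper).

    The import is done lazily so that the translation helper above can be
    used even where zarr-python 3 is not installed.
    """
    from zarr.codecs import BloscCodec  # requires zarr-python >= 3

    codec = BloscCodec(**blosc_kwargs_for_zarr3(**blosc_kwargs))
    # Tuple form, following the diff above.
    return {var_name: {"compressors": (codec,)}}
```

With zarr-python 3 installed, this would be used as ds.to_zarr("foo.zarr", encoding=zarr3_encoding("foo"), mode="w").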

Perhaps @d-v-b can confirm whether this is now the proper way to specify encoders, or help with this?

@d-v-b
Contributor

d-v-b commented Feb 9, 2025

That looks right, although I'm not too familiar with what ds.to_zarr is doing under the hood. The basic idea in zarr v3 is that multiple codecs can transform an array after it has been flattened to a byte stream (these are alternately called "compressors" or "BytesBytesCodec"), hence the tuple. We also accept a single codec, which we wrap in a tuple.
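
The wrapping behaviour described above can be sketched in plain Python. This is not zarr-python's actual implementation: `BytesBytesCodec` below is a stand-in class, and `FakeBlosc`, `LegacyBlosc` and `normalize_compressors` are hypothetical names. It only illustrates why a tuple is expected, why a single codec is also accepted, and where a TypeError like the one in this issue can come from.

```python
class BytesBytesCodec:
    """Stand-in base class for bytes-to-bytes codecs (hypothetical)."""


class FakeBlosc(BytesBytesCodec):
    """Stand-in for a v3-style compressor."""


class LegacyBlosc:
    """Stand-in for a v2-style codec that does not subclass BytesBytesCodec."""


def normalize_compressors(compressors):
    """Return the compressors as a tuple, wrapping a single codec if needed."""
    if compressors is None:
        return ()
    # A single codec is accepted and wrapped in a one-element tuple.
    if not isinstance(compressors, (list, tuple)):
        compressors = (compressors,)
    # Every entry must be a bytes-to-bytes codec; anything else is rejected.
    for codec in compressors:
        if not isinstance(codec, BytesBytesCodec):
            raise TypeError(
                f"Expected a BytesBytesCodec. Got {type(codec)} instead."
            )
    return tuple(compressors)
```

In this sketch, passing `LegacyBlosc()` raises the TypeError while `FakeBlosc()` and `(FakeBlosc(),)` normalize to the same one-element tuple.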

@fowlerovski

My situation with numcodecs 0.15.1 and Zarr 3.0.3 mirrors this: BytesBytesCodec is unavailable in numcodecs.abc, and even numcodecs.Blosc is rejected with TypeError: Expected a BytesBytesCodec.

@roansong

roansong commented Feb 20, 2025

I'm running into this as well, even when using numcodecs.zarr3.Blosc or zarr.codecs.BloscCodec.
