Description
Is your feature request related to a problem?
Hi xarray community.
When I try to pass a UPath to xr.open_zarr, it fails. I haven't checked the other open_* functions.
import xarray as xr
from upath import UPath
up = UPath('az://[email protected]/mystore.zarr', anon=False)
print(up.storage_options) # {'account_name': 'myaccount', 'anon': False}
# up.fs works fine
xr.open_zarr(up)
ValueError: unable to connect to account for Must provide either a connection_string or account_name with credentials!!
When I debugged it, the issue seems to be that xarray coerces the UPath into a string URL (in this case just 'az://mycontainer/mystore.zarr') and then calls fsspec.url_to_fs(url, **{'asynchronous': True}), so all storage options bound to the UPath are lost.
Describe the solution you'd like
I think it would be a very nice addition if xarray added special handling for UPath, because a UPath seems like the perfect way to keep the zarr store destination in a variable without performing any work.
I don't think the implementation can simply use the UPath.fs attribute as the filesystem when it encounters a UPath, because zarr needs a filesystem created with asynchronous=True. So the solution might be to merge UPath.storage_options into the storage_options dict passed to xr.open_zarr.
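To make the idea concrete, here is a minimal sketch of that merge. It is not xarray code: `resolve_store` and `FakeUPath` are hypothetical names I made up for illustration, and `FakeUPath` only mimics the two things the sketch relies on (a `storage_options` attribute and a `str()` that yields the URL) so the example runs without cloud credentials. I also assume that options passed explicitly by the caller should win over the ones bound to the UPath; that precedence is a design choice that would need discussion.

```python
from __future__ import annotations

from typing import Any, Mapping


def resolve_store(
    store: Any, storage_options: Mapping[str, Any] | None = None
) -> tuple[Any, dict[str, Any] | None]:
    """Hypothetical helper: turn a UPath-like object into (url, storage_options)."""
    bound = getattr(store, "storage_options", None)
    if not isinstance(bound, Mapping):
        # Not UPath-like (e.g. a plain str): pass everything through unchanged.
        opts = dict(storage_options) if storage_options is not None else None
        return store, opts
    # Assumption: explicitly passed options override the ones bound to the path.
    merged = {**bound, **(storage_options or {})}
    return str(store), merged


class FakeUPath:
    """Stand-in for upath.UPath, used only so this sketch is self-contained."""

    def __init__(self, url: str, **storage_options: Any) -> None:
        self._url = url
        self.storage_options = storage_options

    def __str__(self) -> str:
        return self._url


up = FakeUPath("az://mycontainer/mystore.zarr", account_name="myaccount", anon=False)

# The options bound to the path survive the coercion to a string URL.
url, opts = resolve_store(up)

# Explicit options are merged on top of the bound ones.
_, overridden = resolve_store(up, {"anon": True})

# A plain string is left untouched.
plain_url, plain_opts = resolve_store("az://mycontainer/mystore.zarr")
```

With something like this in place, the result could be forwarded as xr.open_zarr(url, storage_options=opts), which is the call that already works today when the URL and options are given separately.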
I could put up a proof-of-concept PR as a first contribution if you are interested.
Describe alternatives you've considered
Everything works well if I pass the URL and storage options separately, but it is less convenient.
Additional context
UPath is a project by the fsspec community, and it seems to have quite good adoption in various Python libraries.
It extends the pathlib.Path interface to fsspec filesystems, such as S3. It can be used to bundle a path, protocol and storage options in a single object, which is more convenient than passing around a tuple[str, dict[str, Any]].
https://github.com/fsspec/universal_pathlib