Can't amend a writable snapshot following a rearrange snapshot #2121

@tomwhite

What happened?

I'm using amend to avoid keeping history, and I want to be able to mix data changes (in writable sessions) and structural modifications (in rearrange sessions). If I do a commit in a writable session and then an amend in a rearrange session, the snapshot is modified (so there is still only one). But if I then try another amend in a writable session, it fails.

What did you expect to happen?

I expected amend to always amend the previous snapshot.
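
To make the expectation concrete, here is a toy model of the behaviour I had in mind. It is only a sketch: `ToyBranch` and its methods are hypothetical names for illustration, not icechunk's API or implementation. The one assumption it encodes is that an amend replaces the tip snapshot no matter which kind of session produced it.

```python
from dataclasses import dataclass, field


@dataclass
class ToyBranch:
    """Hypothetical stand-in for a branch's snapshot messages, newest first."""

    messages: list[str] = field(default_factory=lambda: ["Repository initialized"])

    def commit(self, message: str) -> None:
        # A commit adds a new snapshot on top of the current tip.
        self.messages.insert(0, message)

    def amend(self, message: str) -> None:
        # An amend replaces the tip snapshot instead of adding one.
        # The kind of session that produced the tip plays no role here.
        self.messages[0] = message


branch = ToyBranch()
branch.commit("create")  # writable session
branch.amend("move")     # rearrange session
branch.amend("update")   # writable session (the step that fails in this report)
print(branch.messages)   # ['update', 'Repository initialized']
```

In other words, after the three steps in the example below I expected the ancestry to contain a single 'update' snapshot on top of 'Repository initialized'.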

Minimal Complete Verifiable Example

# /// script
# requires-python = ">=3.12"
# dependencies = [
#     "icechunk>=2.0.4",
# ]
# ///

import icechunk as ic
import zarr

ic.print_debug_info()

def main() -> None:
    # create a repo with an array
    repo = ic.Repository.create(ic.in_memory_storage())
    session = repo.writable_session("main")
    root = zarr.group(session.store)
    root.create_group("group")
    root.create_array("group/a", shape=(10,), dtype="f4")
    session.commit("create")
    print("\n".join([str((a.id, a.message)) for a in repo.ancestry(branch="main")]))

    # rename the array
    session = repo.rearrange_session("main")
    session.move("/group/a", "/group/b")
    session.amend("move")
    print("\n".join([str((a.id, a.message)) for a in repo.ancestry(branch="main")]))

    # update the array
    session = repo.writable_session("main")
    root = zarr.open_group(session.store, mode="r+")
    group = root["group"]
    arr = group["b"]
    arr[:] = -1
    session.amend("update")
    print("\n".join([str((a.id, a.message)) for a in repo.ancestry(branch="main")]))

if __name__ == "__main__":
    main()

MVCE confirmation

  • Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in icechunk.
  • Complete example — the example is self-contained, including all data and the text of any traceback.
  • Verifiable example — the example copy & pastes into an IPython prompt, returning the result.
  • New issue — a search of GitHub Issues suggests this is not a duplicate.
  • Recent environment — the issue occurs with the latest version of icechunk and its dependencies.

Relevant log output

('SKK7R5EYET9DC0G2VEX0', 'create')
('1CECHNKREP0F1RSTCMT0', 'Repository initialized')
('V5D2PPQD7DAEXP4QXCRG', 'move')
('1CECHNKREP0F1RSTCMT0', 'Repository initialized')
Traceback (most recent call last):
  File "/Users/tom/workspace/vczstore/repro-ic.py", line 39, in <module>
    main()
  File "/Users/tom/workspace/vczstore/repro-ic.py", line 35, in main
    session.amend("update")
  File "/Users/tom/.cache/uv/environments-v2/repro-ic-ffba12c202125ccc/lib/python3.12/site-packages/icechunk/session.py", line 537, in amend
    return self._session.amend(message, metadata, allow_empty=allow_empty)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
icechunk.IcechunkError:   × session error: This session was created to rearrange the hierarchy, other write operations cannot be executed. Commit or abandon the sessions and create a regular writable session
  │ 
  │ context:
  │    0: icechunk::session::commit_inner
  │       with update max_concurrent_nodes=1 rewrite_manifests=false commit_method=Amend allow_empty=false
  │       at icechunk/src/session.rs:1598

Anything else we need to know?

This came up in some code I'm writing to do a simple rechunk. Ideally I'd like to be able to do a rechunk operation in a single commit. See sgkit-dev/vczstore#98

Environment

platform: macOS-14.7-arm64-arm-64bit
python: 3.12.0
icechunk: 2.0.4
zarr: 3.2.1
numcodecs: 0.16.5
