Milestones

  • ## Problem Statement

Lance's current transaction system uses a monolithic `Operation` enum with 15 variants (Append, Delete, Overwrite, CreateIndex, Rewrite, Merge, etc.). Each variant captures the full state change for a single kind of operation. This creates several problems:

1. **Opaque transaction history.** Users cannot easily diff two versions or understand what changed between them. The operation captures end-state, not a semantic diff.
2. **No compound transactions.** Operations are one-per-transaction. You cannot atomically append data and create an index, or append data and update config, in a single commit. This leads to performance cliffs (e.g. #5952) and split-brain windows.
3. **Difficult cleanup of cancelled transactions.** It's hard to determine the set of new files introduced by a transaction for garbage collection purposes.
4. **Duplicated logic.** Each Operation variant has its own `build_manifest` and conflict resolution logic, leading to large match arms that are hard to maintain.

Based on earlier proposal #3734 and discussion #5960.

## Solution

Replace the monolithic `Operation` with a composable `UserOperation` containing an ordered list of granular `Action` messages. Each Action represents a single, well-defined change to the manifest (add fragments, remove fragments, update a fragment, add fields, add an index, etc.).

The transaction top level becomes:

```
Transaction {
  oneof kind {
    UserOperation user_operation  // Normal composable operations
    Restore restore               // Full manifest restore (special case)
    Clone clone                   // Dataset clone (special case)
  }
}
```

`UserOperation` contains an ordered list of `UserAction`s (per Weston's suggestion in #5960), each with a description and a list of `Action`s. This supports compound transactions with per-step descriptions (e.g. "append batch" + "rebuild index").

At the internal level, old `Operation` variants are translated into `Action` lists at deserialization time, so there is a single code path for applying changes to the manifest and for conflict resolution.

## User Stories

1. As a data engineer, I want to see a human-readable diff of what changed between two table versions, so that I can debug data quality issues.
2. As a data engineer, I want to atomically append data and create/update an index in a single transaction, so that I don't have a window where the index is stale.
3. As a data engineer, I want to atomically append data and update table config in a single transaction, so that metadata stays consistent with the data.
4. As a platform engineer, I want to identify all files introduced by a cancelled transaction, so that garbage collection can clean them up reliably.
5. As a library developer, I want a single code path for applying operations to the manifest, so that I don't have to maintain 15 separate `build_manifest` match arms.
6. As a library developer, I want a single code path for conflict resolution, so that adding a new action type doesn't require updating every conflict-checking function.
7. As a user of the Python/Java SDK, I want to construct compound transactions programmatically, so that I can express complex workflows as atomic commits.
8. As a platform engineer, I want older library versions to reject tables using the new transaction format via feature flags, so that they don't silently misinterpret transactions.
9. As a library developer, I want old `Operation`-style transactions to be automatically translated to the new `Action`-based representation at deserialization, so that internal logic only needs one code path.
10. As a data engineer, I want `Restore` and `Clone` operations to continue working as before, so that existing workflows are not broken.
11. As a library developer, I want to be able to round-trip old-format transactions (`OldOp → Actions → OldOp`) losslessly, so that we can use the new internal code path without changing the serialization format during the transition period.
12. As a platform engineer, I want a writer feature flag for the new transaction format, so that the cutover can be controlled and rolled back.

## Implementation Decisions

### Transaction Structure

The protobuf `Transaction` message will have a top-level `oneof` with three variants:

- **`UserOperation`**: The normal composable operation. Contains a UUID, read_version, and an ordered list of `UserAction`s.
- **`Restore`**: Special case that loads an entire historical manifest. Cannot be expressed as actions.
- **`Clone`**: Special case for a dataset-level copy with branching/base-path semantics. Cannot be expressed as actions.

### UserAction Middle Layer

Per Weston's suggestion, there is a `UserAction` between `UserOperation` and `Action`:

```
UserOperation → repeated UserAction → repeated Action
```

Each `UserAction` has a human-readable description and a list of Actions. For applying to the manifest, the UserAction lists are flattened.
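
The three-level structure and the flattening step might look roughly like the Rust sketch below. This is only an illustration under assumptions: the payload fields (`num_fragments`, `name`, `key`, `value`, `version`, `source`), the `flattened_actions` method, and the `example_operation` helper are hypothetical names introduced here, and only three of the 14 action types are shown.

```rust
/// Hypothetical, simplified action payloads; the real payloads are not
/// specified in this proposal.
#[derive(Debug, Clone, PartialEq)]
enum Action {
    AddFragments { num_fragments: usize },
    AddIndex { name: String },
    UpdateConfig { key: String, value: String },
}

/// One described step: a human-readable description plus its actions.
#[allow(dead_code)]
#[derive(Debug)]
struct UserAction {
    description: String,
    actions: Vec<Action>,
}

/// The normal composable operation.
#[allow(dead_code)]
#[derive(Debug)]
struct UserOperation {
    uuid: String,
    read_version: u64,
    user_actions: Vec<UserAction>,
}

/// Top-level transaction: a Rust enum standing in for the protobuf `oneof`.
#[allow(dead_code)]
#[derive(Debug)]
enum Transaction {
    UserOperation(UserOperation),
    Restore { version: u64 },  // assumed payload
    Clone { source: String },  // assumed payload
}

impl UserOperation {
    /// Flatten the per-step action lists into one ordered list,
    /// preserving the order in which the steps were declared.
    fn flattened_actions(&self) -> Vec<&Action> {
        self.user_actions
            .iter()
            .flat_map(|ua| ua.actions.iter())
            .collect()
    }
}

/// Build a compound "append + index + config" operation for illustration.
fn example_operation() -> UserOperation {
    UserOperation {
        uuid: "example-uuid".to_string(),
        read_version: 7,
        user_actions: vec![
            UserAction {
                description: "append batch".to_string(),
                actions: vec![Action::AddFragments { num_fragments: 2 }],
            },
            UserAction {
                description: "rebuild index".to_string(),
                actions: vec![
                    Action::AddIndex { name: "vec_idx".to_string() },
                    Action::UpdateConfig {
                        key: "lance.example".to_string(),
                        value: "1".to_string(),
                    },
                ],
            },
        ],
    }
}

fn main() {
    let op = example_operation();
    for action in op.flattened_actions() {
        println!("{:?}", action);
    }
}
```

Keeping the descriptions on `UserAction` while applying only the flattened list means the apply path never needs to look at the middle layer.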
### Action Types (14 total)

| Action | Covers |
|--------|--------|
| `AddFragments` | Append, the "add new rows" part of Overwrite, new fragments from compaction |
| `RemoveFragments` | The "remove old" part of Overwrite, removing compacted fragments |
| `UpdateFragment` | Delete (new deletion file), Update (new data files), Merge/AddColumns (new data files for new fields), DropColumns (tombstone fields), DataReplacement (swap data files) |
| `AddFields` | AddColumns, Merge, adding fields during Overwrite |
| `DropFields` | DropColumns (schema-level; data tombstoned via UpdateFragment) |
| `UpdateSchemaMetadata` | Schema metadata changes without field changes |
| `AddIndex` | Add or replace an index |
| `RemoveIndex` | Remove an index by UUID |
| `UpdateConfig` | Table configuration (lance.* keys) |
| `UpdateTableMetadata` | Arbitrary user key-value pairs |
| `SetDataFormat` | Data storage format changes |
| `SetFeatureFlags` | Enable or clear reader/writer feature flags |
| `ReserveFragmentIds` | Increment max_fragment_id without adding fragments |
| `UpdateMemWalIndex` | Update the MemWAL system index with merged generation state |

### Fragment ID Assignment

`AddFragments` does **not** contain pre-assigned fragment IDs. IDs are assigned at apply time (after conflict resolution/rebasing), using the current `max_fragment_id` high-water mark.

### Feature Flag

A new writer feature flag (`FLAG_ACTION_TRANSACTIONS`, next bit: `64`) will signal that the table uses the new transaction format. Older libraries that don't recognize this flag will reject writes. The flag value 64 is currently `FLAG_UNKNOWN`, so existing libraries already reject it.

### Old Operation Translation

At deserialization, old `Operation` variants are translated into equivalent `Action` lists. Metadata like `Delete.predicate` maps to `UserAction.description`. Internal logic (apply, conflict resolution) only has one code path.
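
As a rough illustration of this translation, the Rust sketch below maps two old-style operations onto a `UserAction`. Everything here is a simplified assumption: `OldOperation`, its fields, the `translate` function, the `"append"` description, and the choice to emit `RemoveFragments` for fully deleted fragments are hypothetical and not taken from the Lance codebase. The only behavior drawn from the proposal is that a Delete becomes `UpdateFragment` actions and that its predicate becomes the description.

```rust
/// Hypothetical, simplified stand-in for two of the old `Operation` variants.
#[derive(Debug, Clone, PartialEq)]
enum OldOperation {
    Append { data_files: Vec<String> },
    Delete {
        updated_fragment_ids: Vec<u64>,
        deleted_fragment_ids: Vec<u64>,
        predicate: String,
    },
}

/// Hypothetical, simplified action payloads.
#[derive(Debug, Clone, PartialEq)]
enum Action {
    // No fragment IDs here: they are assigned at apply time.
    AddFragments { data_files: Vec<String> },
    UpdateFragment { fragment_id: u64 },
    RemoveFragments { fragment_ids: Vec<u64> },
}

#[derive(Debug, Clone, PartialEq)]
struct UserAction {
    description: String,
    actions: Vec<Action>,
}

/// Translate one old-style operation into a single described step.
fn translate(op: &OldOperation) -> UserAction {
    match op {
        OldOperation::Append { data_files } => UserAction {
            // Assumed description; the proposal does not specify one for Append.
            description: "append".to_string(),
            actions: vec![Action::AddFragments {
                data_files: data_files.clone(),
            }],
        },
        OldOperation::Delete {
            updated_fragment_ids,
            deleted_fragment_ids,
            predicate,
        } => {
            // Each fragment that gains a deletion file becomes an UpdateFragment.
            let mut actions: Vec<Action> = updated_fragment_ids
                .iter()
                .map(|&fragment_id| Action::UpdateFragment { fragment_id })
                .collect();
            // Assumption: fully deleted fragments map to RemoveFragments.
            if !deleted_fragment_ids.is_empty() {
                actions.push(Action::RemoveFragments {
                    fragment_ids: deleted_fragment_ids.clone(),
                });
            }
            UserAction {
                // The predicate is carried over as the description.
                description: predicate.clone(),
                actions,
            }
        }
    }
}

fn main() {
    let delete = OldOperation::Delete {
        updated_fragment_ids: vec![3, 5],
        deleted_fragment_ids: vec![9],
        predicate: "id > 100".to_string(),
    };
    println!("{:?}", translate(&delete));
}
```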

The reverse translation (`Actions → old Operation`) is supported for the subset of action lists that map 1:1 to old operations, enabling the transition period.

### Phases

**Phase 1: Internal refactor.** Define the Rust `Action` types (not yet as protobuf). Translate old `Operation` variants into `Action` lists in memory. Rewrite `build_manifest` as a loop over `action.apply(manifest)`. Rewrite conflict resolution to work on Actions. Remove old per-Operation apply/conflict logic. Still serialized as old format; actions round-trip through old Operations. This phase must support arbitrary action combinations (not just 1:1 old-Operation mappings) so that the internal logic is ready before the new format ships.

**Phase 2: New protobuf format.** Define new proto messages (`UserOperation`, `UserAction`, `Action` variants). Implement serialization and deserialization. Gate writing behind writer feature flag. Support reading both old and new formats. Because the internal logic from Phase 1 already handles compound actions, readers can correctly process any valid action list from day one.

**Phase 3: Compound transactions and user-facing API.** Expose action-based transaction construction in Rust, Python, and Java. Enable users to build compound transactions (e.g. append + create index). Requires new serialization format from Phase 2.

### Conflict Resolution

Conflict resolution will be redesigned to work at the Action level. A spike is needed to determine:

- Whether conflicts can be detected purely from Action types and payloads, without knowing higher-level semantic intent.
- How to handle rebasing when actions within a transaction interact (e.g. AddFragments followed by AddIndex referencing those fragments).
- Whether certain action combinations need special-case conflict rules.
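
To make the first spike question concrete, here is a minimal Rust sketch of conflict detection that looks only at action types and payloads. It is an assumption-laden illustration, not the planned design: the `conflicts` function, the `is_schema_change` helper, and the reduced `Action` enum are hypothetical. It encodes just two candidate rules, namely that two `UpdateFragment`s on the same fragment ID conflict and that `AddFragments` conflicts with a schema change.

```rust
use std::collections::HashSet;

/// Hypothetical, reduced set of actions; payloads are simplified.
#[derive(Debug, Clone, PartialEq)]
enum Action {
    AddFragments,
    UpdateFragment { fragment_id: u64 },
    AddFields,
    DropFields,
}

impl Action {
    /// Assumption: only field additions and removals count as schema changes.
    fn is_schema_change(&self) -> bool {
        matches!(self, Action::AddFields | Action::DropFields)
    }
}

/// Return true if our flattened action list conflicts with a concurrently
/// committed one, judged purely from action types and payloads.
fn conflicts(ours: &[Action], theirs: &[Action]) -> bool {
    // Fragment IDs touched by the other transaction's UpdateFragment actions.
    let their_updated: HashSet<u64> = theirs
        .iter()
        .filter_map(|a| match a {
            Action::UpdateFragment { fragment_id } => Some(*fragment_id),
            _ => None,
        })
        .collect();
    let their_schema_change = theirs.iter().any(|a| a.is_schema_change());
    let their_adds = theirs.iter().any(|a| matches!(a, Action::AddFragments));

    ours.iter().any(|a| match a {
        // Rule 1: two updates to the same fragment conflict.
        Action::UpdateFragment { fragment_id } => their_updated.contains(fragment_id),
        // Rule 2: appending conflicts with a concurrent schema change...
        Action::AddFragments => their_schema_change,
        // ...and the rule is applied symmetrically.
        Action::AddFields | Action::DropFields => their_adds,
    })
}

fn main() {
    let ours = vec![Action::UpdateFragment { fragment_id: 1 }];
    let theirs = vec![Action::UpdateFragment { fragment_id: 2 }];
    // Different fragments, so these two transactions do not conflict.
    println!("conflict: {}", conflicts(&ours, &theirs));
}
```

A sketch like this says nothing about rebasing or about interactions between actions inside one transaction, which are the other two spike questions.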

Initial analysis suggests feasibility: `AddFragments` only conflicts with schema changes; `UpdateFragment` conflicts with other `UpdateFragment`s on the same fragment ID; `AddIndex`/`RemoveIndex` conflicts with `DataReplacement` on indexed fields; etc.

### Tags

Tag mutations (`AddTag`/`RemoveTag`) are **out of scope**. Tags currently live as separate JSON files under `_refs/tags/` and are decoupled from the manifest commit.

## Testing Decisions

### What Makes a Good Test

Tests should verify external behavior: given a set of Actions, does the resulting manifest have the correct fragments, schema, indices, and metadata? Round-trip serialization tests should verify that `old_op → actions → old_op` produces identical protobuf.

### Modules to Test

- **Action apply logic**: Each Action type gets unit tests. Edge cases: empty fragment lists, duplicate field IDs, tombstoning nonexistent fields, etc.
- **Old Operation → Action translation**: Each of the 15 Operation variants gets a round-trip test.
- **Conflict resolution**: Once the spike lands, integration tests for concurrent transactions with various action combinations.
- **Serialization/deserialization**: Round-trip tests for new protobuf format. Backwards compat tests reading old-format transaction files.
- **Feature flag gating**: Verify old libraries reject tables with new flag, and flag controls serialization format.

### Prior Art

- Conflict resolution tests in `rust/lance/src/dataset/transaction.rs`
- Round-trip serialization tests for `Transaction` ↔ protobuf
- Feature flag tests in `rust/lance-table/src/feature_flags.rs`

## Out of Scope

- **Tag mutations as Actions.** Tags live outside the manifest and need separate design.
- **Isolation levels.** The action-based model is a prerequisite for more rigorous isolation semantics, but defining them is a separate project.
- **User-facing transaction history UI.** This enables better diffs but does not define a specific UI/CLI for viewing them.
- **Deprecating old transaction format.** The old format will coexist indefinitely; deprecation is a future decision.

## Further Notes

- Discussion: #5960
- Earlier proposal: #3734
- Related performance issue: #5952
- The conflict resolution spike (Phase 1) is on the critical path. Schedule early so findings can inform Action type design.
- `Restore` and `Clone` remain as top-level transaction variants, not composed of Actions.
- `UpdateMemWalIndex` is kept as its own type rather than folded into `AddIndex` because its merge semantics (keep higher generation per shard) are specialized. It can be promoted later.

    No due date
    0/12 issues closed
  • No due date
    0/6 issues closed
  • This is a tracking issue for version 2.2 of the Lance file format.

    Overdue by 1 month(s)
    Due by February 28, 2026
    8/8 issues closed
  • Make merge insert more useful or easier to use.

    No due date
    4/9 issues closed
  • We are migrating the implementation of merge insert to be more efficient, particularly with memory. During the refactor, we are replacing the implementation that manually manipulates streams with one that uses DataFusion to generate and optimize the whole plan.

## PRD: Retire v1 Code Path

Three categories of operations still fall back to v1:

1. `WhenMatched::DoNothing` (find-or-create pattern)
2. Partial schema upserts (source has subset of target columns)
3. Upserts when a scalar index exists on the join key

**Goal:** Migrate all three to v2, then delete v1. This is a correctness/parity migration, not a performance optimization pass.

### Vertical Slices

1. **DoNothing on v2**: Add `DoNothing` to `can_use_create_plan` eligibility. Simplest slice; establishes the pattern.
2. **Partial schema upsert on v2**: Fill missing columns via projection in the logical plan. Write full rows. Does NOT replicate v1's column-rewrite optimization (future ticket).
3. **Research spike: indexed join strategy**: Time-boxed evaluation of `ScalarIndexJoinExec` (#3480) vs finger search (#4648) vs wrapping v1 logic. Output: design for slice 4.
4. **Scalar-indexed joins on v2**: Implement approach from spike. Remove scalar index fallback. Retain `use_index` as escape hatch.
5. **Remove v1 code**: Delete `Merger`, v1 branch in `execute_uncommitted_impl`, `can_use_create_plan`, v1-only helpers.

### Existing Issue Disposition

| Issue | Action |
|---|---|
| #4194 (optimize merge insert with delete) | Split into two post-retirement tickets: `WhenMatched::Delete` opt and `WhenNotMatchedBySource::Delete` opt |
| #4193 (partial schema / TakeExec) | Reframe as follow-up optimization after slice 2 |
| #4266 (TakeExec logical node) | Deferred until #4193 optimization |
| #3480 (indexed merge insert) | Spike (slice 3) updates description; slice 4 implements |
| #4583 (streaming vs materialized) | Out of scope |
| #4648 (finger search) | Feeds into slice 3 research |

### Out of Scope

- Ergonomics improvements (milestone 8)
- Performance optimizations: TakeExec column-rewrite avoidance, streaming vs materialized inputs, join order optimization

    No due date
    4/15 issues closed
  • Support creating an index while creating a table, then adding data. Read more at: https://github.com/lancedb/lance/issues/3674

    No due date
    1/5 issues closed
  • The goal is to eliminate most cases where users hit avoidable commit conflicts. For example, users should be able to update or delete different rows in the same Lance file, or run an update query without it failing because a background compaction job just ran.

    No due date
    5/8 issues closed
  • No due date
    1/6 issues closed
  • Urgent bugs or other blockers to adoption we should consider fixing immediately.

    No due date
    5/8 issues closed