Scylla Wasm UDFs to decode RootKey partition keys (deferred) #6298

@ndr-ds

Tracking issue for in-cqlsh decoding of RootKey partition keys via ScyllaDB Wasm UDFs. Deferred from #6297; revisit when one of the trigger conditions below is met.

Background

linera-storage-decode-key decodes partition-key blobs offline. It works but requires copying hex out of cqlsh and piping it through the binary. Moving the decoder into ScyllaDB as a Wasm UDF would let engineers run queries like:

SELECT decode_root_key(partition_key) AS what, partition_size
FROM system.large_partitions
WHERE keyspace_name = 'linera'
ORDER BY partition_size DESC
LIMIT 20;

and

SELECT decode_root_key(root_key), length(k), length(v)
FROM linera."ns_42"
WHERE root_key IN (
  encode_root_key('ChainState',       '7a3f...'),
  encode_root_key('BlockByHeight',    '7a3f...'),
  encode_root_key('Event',            '7a3f...'),
  encode_root_key('EventBlockHeight', '7a3f...')
);

Why deferred

ScyllaDB Wasm UDFs require experimental_features: [udf] in scylla.yaml, a cluster-wide flag that enables arbitrary Wasm and Lua UDF registration on every node. For a debugging convenience whose offline equivalent already exists, that blast radius is not worth it on validators.

Hard data driving the wait:

  • Wasm UDFs landed in Scylla 5.2 (2022). Still labeled experimental in current docs ("insufficient testing").
  • scylla-udf v0.1.0 (the Rust helper) was last published 2023-02-15. No newer release has appeared in the roughly 3 years since.
  • No public Scylla roadmap milestone for Wasm UDF GA.

Revisit when ANY of these is true

  1. ScyllaDB Wasm UDFs go GA (no longer require experimental_features).
  2. A debug-only Scylla replica is in place so the experimental flag can stay off validators.
  3. The CLI offline pipe becomes a real bottleneck during an investigation.

Existing prototype

Branch ndr-ds/scylla-bcs-udf has a working implementation. Closed via #6297. To revive it, run git checkout ndr-ds/scylla-bcs-udf and rebase on main. Highlights:

  • New workspace member at linera-storage/udf/ (excluded from default-members, targets wasm32-wasip1).
  • Two UDFs: decode_root_key(blob) -> text and encode_root_key(text, text) -> blob. Both transparently handle the leading 0x00 tag byte that linera_views::backends::scylla_db::get_big_root_key prepends.
  • RootKey is mirrored in the UDF crate (compiling the real linera-storage crate to wasm would drag in the validator dependency tree). Drift-detection test at linera-storage/tests/root_key_drift.rs asserts canonical and mirrored enums BCS-encode identically.
  • Custom [profile.wasm-udf] (strip, LTO, opt-level z, panic abort) brings the binary from ~54 MB to ~145 KB.
  • Build script at linera-storage/udf/build.sh runs cargo build, then wasm-strip, then wasm2wat, and produces a .wat file ready to embed in a CQL CREATE FUNCTION statement.
  • All native tests pass (cargo test -p linera-storage-udf, 11 unit + 1 drift).
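
The tag-byte handling mentioned above can be illustrated with a small std-only sketch. This is not the code on the branch: ROOT_KEY_TAG, strip_tag, prepend_tag, and parse_hex are hypothetical names, and rejecting untagged input (rather than passing it through) is an assumption made for the example. The BCS decoding step itself is omitted.

```rust
/// Tag byte that, per the issue text, `get_big_root_key` prepends to stored keys.
const ROOT_KEY_TAG: u8 = 0x00;

/// Hypothetical helper: return the bytes after the leading tag byte,
/// or `None` if the input is empty or does not start with the tag.
fn strip_tag(stored: &[u8]) -> Option<&[u8]> {
    match stored.split_first() {
        Some((first, rest)) if *first == ROOT_KEY_TAG => Some(rest),
        _ => None,
    }
}

/// Hypothetical helper: prepend the tag byte to already-encoded key bytes.
fn prepend_tag(encoded: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(encoded.len() + 1);
    out.push(ROOT_KEY_TAG);
    out.extend_from_slice(encoded);
    out
}

/// Hypothetical helper: parse blob hex as copied from cqlsh.
/// An optional "0x" prefix is accepted; odd lengths and non-hex characters are rejected.
fn parse_hex(text: &str) -> Option<Vec<u8>> {
    let digits = text.strip_prefix("0x").unwrap_or(text);
    if digits.len() % 2 != 0 || !digits.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    // All characters are ASCII hex digits here, so byte-index slicing is safe.
    (0..digits.len())
        .step_by(2)
        .map(|i| u8::from_str_radix(&digits[i..i + 2], 16).ok())
        .collect()
}
```

In the offline workflow, parse_hex followed by strip_tag would yield the bytes to hand to the BCS decoder; prepend_tag is the inverse step an encoder would apply before a value is compared against a stored partition key.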

Validation gap to close on revival

The prototype was validated natively only. Before merging on revival:

  1. Run a local Scylla container with experimental_features: [udf] enabled.
  2. Register both UDFs via cqlsh.
  3. Insert a row with a real RootKey-encoded partition key.
  4. Confirm SELECT decode_root_key(root_key) returns the expected Debug representation.
  5. Confirm SELECT * FROM ... WHERE root_key = encode_root_key(...) returns the right row.

The native unit tests bypass the #[scylla_udf::export_udf] macro entirely (they call private *_impl helpers), so the macro-generated Wasm ABI wrappers have zero coverage in the existing branch.
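
The coverage gap can be pictured with a reduced sketch of the helper-plus-wrapper layout. The function body is a stand-in (it only hex-renders the bytes), and the wrapper signature in the comment is an assumption about how the branch maps blob and text to Rust types, not a copy of it.

```rust
/// Hypothetical private helper in the `*_impl` style described above.
/// Stand-in body: the real helper would BCS-decode a RootKey and return
/// its Debug representation; this one only renders the bytes as hex.
fn decode_root_key_impl(blob: &[u8]) -> String {
    blob.iter().map(|b| format!("{b:02x}")).collect()
}

// The exported wrapper would exist only in the Wasm build, roughly:
//
//     #[scylla_udf::export_udf]
//     fn decode_root_key(blob: Vec<u8>) -> String {
//         decode_root_key_impl(&blob)
//     }
//
// The macro generates the glue that moves CQL values across the Wasm
// boundary. Nothing in the native test below compiles or runs that glue.

#[cfg(test)]
mod tests {
    use super::decode_root_key_impl;

    #[test]
    fn native_test_calls_helper_directly() {
        // Passes without the exported wrapper ever being exercised.
        assert_eq!(decode_root_key_impl(&[0x00, 0xab]), "00ab");
    }
}
```

A test of this shape can stay green even if the wrapper's argument or return handling is wrong, which is why the in-cluster steps listed above are needed before merging.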

RootKey divergence note

RootKey differs between main and testnet_conway (variant order, names, and Placeholder variant). The branch above targets main. A testnet_conway version needs its own port; the drift test mechanism applies on both branches independently.
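
Why variant order matters can be shown with a std-only sketch. BCS writes an enum's variant index, taken from declaration order and ULEB128-encoded, ahead of any fields, so reordering or inserting variants changes the leading bytes of every encoded key. The two enums below are hypothetical and fieldless: the variant names are borrowed from this issue, but their order is invented for illustration and does not reflect either branch.

```rust
/// Hypothetical stand-in for one branch's RootKey (order invented).
#[derive(Clone, Copy, Debug)]
enum RootKeyA {
    ChainState,
    BlockByHeight,
    Event,
}

/// Hypothetical stand-in for the other branch's RootKey (order invented).
#[derive(Clone, Copy, Debug)]
enum RootKeyB {
    Placeholder,
    ChainState,
    BlockByHeight,
}

/// ULEB128 encoding of a u32, the format BCS uses for enum variant indices.
fn uleb128(mut value: u32) -> Vec<u8> {
    let mut out = Vec::new();
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            out.push(byte);
            return out;
        }
        // More bytes follow, so set the continuation bit.
        out.push(byte | 0x80);
    }
}

/// Leading bytes a BCS encoder would emit for a variant of `RootKeyA`.
fn prefix_a(key: RootKeyA) -> Vec<u8> {
    uleb128(key as u32)
}

/// Leading bytes a BCS encoder would emit for a variant of `RootKeyB`.
fn prefix_b(key: RootKeyB) -> Vec<u8> {
    uleb128(key as u32)
}
```

With these definitions, ChainState encodes with a leading 0x00 under RootKeyA and 0x01 under RootKeyB, so a decoder built against one layout would misread keys written by the other. That is the failure the per-branch drift test is meant to catch.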
