@beinan @Xuanwo @jackye1995 would appreciate some early feedback while polishing the API.
Many OLAP engines and data lakes use deletion vectors to mark deleted row IDs in a data file. Is it possible to add a method for passing excluded row IDs to the Lance scanner?
A common use case for the C binding is letting existing C/C++ systems (especially robotics) ingest data into Lance tables efficiently. Exposing a minimal local fragment-creation API with an explicit schema would suffice, since a sidecar process in either Rust or Python can upload and finalize the ingestion asynchronously. cf. lance-format/lance-c#4
RFC: Lance C++ Library (liblance)
Summary
This RFC proposes creating a first-class C/C++ library for Lance (liblance) that exposes the full Lance feature set through a stable C-ABI with idiomatic C++ wrapper headers. This enables native integration with C++ query engines (Velox, DuckDB), ML frameworks, and any language with C FFI capabilities.
Motivation
Lance currently provides bindings for Python (via PyO3) and Java (via JNI). Both are mature and production-grade, but the lack of a C/C++ interface creates several gaps:
Query engine integration: Velox, DuckDB, DataFusion (C++), and other C++ query engines cannot natively read Lance datasets. The lance-duckdb extension already implements a project-specific C-ABI to bridge this gap, but it would be better to have a unified interface for all applications.
Ecosystem reach: A C-ABI library is the universal FFI target. Languages like Go, Ruby, Swift, Zig, and others can bind to it without project-specific bridges.
ML infrastructure: Many ML serving systems (TensorRT, ONNX Runtime, TFServing) and feature stores are C++-native. A Lance C++ library enables direct integration without Python overhead.
Velox integration: Velox is Meta's open-source C++ vectorized execution engine. A Lance C++ library would enable building a Velox connector for Lance, unlocking Lance as a data source for Velox-powered query workloads — including vector search capabilities that Parquet/ORC cannot provide.
Design Principles
C-ABI first, C++ wrappers second. The core interface is extern "C" functions callable from any language. Idiomatic C++ wrappers (RAII, smart pointers, iterators) are a thin layer on top.
Arrow C Data Interface for data exchange. All data crosses the FFI boundary as ArrowArray/ArrowSchema/ArrowArrayStream structs, the industry-standard zero-copy interchange format. No custom serialization.
Opaque handle pattern. All Rust objects are exposed as void* opaque pointers with explicit lifecycle functions (lance_*_open/lance_*_close). No struct layout leaks across the boundary.
Feature parity with Python/Java. The C API should expose the same capabilities available in the Python and Java bindings: dataset CRUD, scanning with pushdowns, vector/scalar indexing, versioning, and merge-insert.
Panic safety. The Rust crate compiles with panic = "abort" to prevent undefined behavior from unwinding across FFI boundaries.
Dual threading model. Expose both synchronous (block_on) and non-blocking (poll + waker) APIs. Simple integrators and scripts use the blocking API; high-performance query engines (Velox, Presto) use the poll-based API to avoid thread starvation.
Architecture
Crate Structure
A new rust/lance-c/ crate is added to the workspace:
Build Outputs
liblance.so / liblance.dylib / lance.dll (shared library)
liblance.a / lance.lib (static library)
lance.h (C header)
lance.hpp (C++ wrapper header)
lance.pc (pkg-config file)
Proposed API
Error Handling
Following the pattern proven in lance-duckdb, errors use thread-local storage:
Dataset Operations
Scanning / Reading
Writing
Data Modification
Indexing
Versioning
Schema Evolution
Optimization
C++ Wrapper Design (lance.hpp)
The C++ wrapper is a header-only library providing RAII semantics, iterators, and builder patterns:
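As a rough illustration of the RAII part only, the sketch below wraps an opaque handle in a move-only C++ class. The C functions here are mocked so the example is self-contained. lance_dataset_open and lance_dataset_close follow the lance_*_open/lance_*_close naming from the design principles, while lance_dataset_count_rows, the mock struct body, and the live-handle counter are assumptions for the demo rather than the proposed API.

```cpp
#include <cstdint>
#include <memory>
#include <stdexcept>
#include <string>

// --- Mocked C API (hypothetical; real callers would only see an opaque type) ---
struct LanceDataset { uint64_t rows; };
static int g_live_handles = 0;  // lets the demo observe open/close pairing

extern "C" {
LanceDataset* lance_dataset_open(const char* uri) {
    if (uri == nullptr || *uri == '\0') return nullptr;
    ++g_live_handles;
    return new LanceDataset{42};
}
void lance_dataset_close(LanceDataset* ds) {
    if (ds != nullptr) { --g_live_handles; delete ds; }
}
uint64_t lance_dataset_count_rows(const LanceDataset* ds) { return ds->rows; }
}

// --- Header-only wrapper in the style lance.hpp might use ---
namespace lance {
class Dataset {
public:
    // Throws instead of returning a null handle (exception-based errors).
    static Dataset open(const std::string& uri) {
        LanceDataset* raw = lance_dataset_open(uri.c_str());
        if (raw == nullptr) throw std::runtime_error("failed to open dataset: " + uri);
        return Dataset(raw);
    }
    uint64_t count_rows() const { return lance_dataset_count_rows(handle_.get()); }

private:
    // Custom deleter: the handle is closed exactly once, when the wrapper dies.
    struct Closer {
        void operator()(LanceDataset* p) const { lance_dataset_close(p); }
    };
    explicit Dataset(LanceDataset* raw) : handle_(raw) {}
    std::unique_ptr<LanceDataset, Closer> handle_;
};
}  // namespace lance
```

Because the handle lives in a std::unique_ptr, the wrapper is movable but not copyable, and the close call runs on every exit path, including exceptions thrown later in the caller's scope.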
Threading Model
The Problem with block_on()
Both the Java JNI bindings and the lance-duckdb extension use a simple block_on() approach: a global Tokio runtime is initialized once, and every async Lance operation is called via runtime.block_on(future), which blocks the calling thread until the I/O completes.
This works acceptably for DuckDB because it achieves parallelism at the fragment level: each DuckDB worker thread gets its own stream over a different fragment, so multiple block_on() calls run concurrently on different threads. However, within a single stream, the calling thread sits idle waiting for I/O.
For high-performance engines like Velox or Presto, this creates thread starvation: engine worker threads block on Lance I/O instead of yielding to process other tasks. This wastes thread pool capacity and can cascade into pipeline stalls.
Dual API: Sync + Poll
The C library exposes both APIs. The right choice depends on the integrator:
lance_scanner_next() (blocking)
lance_scanner_poll_next() (non-blocking)
lance_scanner_to_arrow_stream() (blocking)
The blocking API is a thin wrapper over the poll API internally; both share the same Tokio-backed stream. This avoids maintaining two separate code paths.
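To make the "blocking is a wrapper over poll" idea concrete, here is a small C++ model of it. This is a sketch under stated assumptions and not the Rust implementation: MockScanner and mock_poll_next stand in for the real scanner, LANCE_POLL_READY and LANCE_POLL_FINISHED are assumed status names alongside LANCE_POLL_PENDING, and a plain int stands in for an Arrow batch. The blocking call polls, and when the result is pending it sleeps on a condition variable until the waker fires, then polls again.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

enum LancePollStatus { LANCE_POLL_READY, LANCE_POLL_PENDING, LANCE_POLL_FINISHED };
typedef void (*LanceWaker)(void* ctx);

// Mock scanner: each batch needs one simulated background I/O before it is ready.
struct MockScanner {
    std::mutex mu;
    int produced = 0;      // batches handed out so far
    int total = 3;         // batches in the stream
    bool ready = false;    // true once the I/O for the next batch has finished
    std::vector<std::thread> io;
    ~MockScanner() { for (auto& t : io) t.join(); }
};

LancePollStatus mock_poll_next(MockScanner* s, LanceWaker waker, void* ctx, int* out) {
    std::lock_guard<std::mutex> lock(s->mu);
    if (s->produced == s->total) return LANCE_POLL_FINISHED;
    if (s->ready) {
        s->ready = false;
        *out = s->produced++;
        return LANCE_POLL_READY;
    }
    // Start the "I/O"; the waker fires once, from another thread, when it is done.
    s->io.emplace_back([s, waker, ctx] {
        { std::lock_guard<std::mutex> l(s->mu); s->ready = true; }
        waker(ctx);
    });
    return LANCE_POLL_PENDING;
}

struct WakeFlag {
    std::mutex mu;
    std::condition_variable cv;
    bool woken = false;
};

// Waker: only sets a flag and notifies. It never calls back into the scanner.
void wake(void* ctx) {
    auto* f = static_cast<WakeFlag*>(ctx);
    std::lock_guard<std::mutex> lock(f->mu);
    f->woken = true;
    f->cv.notify_one();  // notify under the lock so the flag outlives this call
}

// Blocking API built on the poll API: true with *out set, or false at end of stream.
bool blocking_next(MockScanner* s, int* out) {
    for (;;) {
        WakeFlag flag;  // fresh waker context for every poll
        LancePollStatus st = mock_poll_next(s, wake, &flag, out);
        if (st == LANCE_POLL_READY) return true;
        if (st == LANCE_POLL_FINISHED) return false;
        std::unique_lock<std::mutex> lock(flag.mu);
        flag.cv.wait(lock, [&] { return flag.woken; });
    }
}
```

An engine with its own scheduler would call the poll function directly and let the waker reschedule the task, instead of parking a thread the way blocking_next does.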
How Poll + Waker Works
The poll-based API mirrors Rust's Future::poll semantics across the FFI boundary:
Internal Design
On the Rust side, lance_scanner_poll_next wraps a Tokio RecordBatchStream behind a two-state machine:
The waker notification uses a shared signaling mechanism (e.g., Arc<Notify> or oneshot channel) between the I/O task and a lightweight waker task. When the I/O task completes, it signals the waker task, which calls the C function pointer. This avoids the mistake of trying to share a JoinHandle across two tasks.
Waker Contract
The waker callback has strict thread-safety requirements that must be documented in lance.h:
Thread-safety: The waker may be called from any thread (a Tokio worker thread, or potentially inline during poll_next if data is immediately available). The callback implementation must be thread-safe.
Single-use: Each poll_next call that returns LANCE_POLL_PENDING will fire the waker exactly once. The caller must provide a fresh waker on each poll_next call.
No reentrancy: The waker callback must NOT call back into any lance_scanner_* function. Doing so will deadlock. Typical implementations should signal a promise, post to an event loop, or set an atomic flag.
Stale waker safety: If the scanner is closed while a waker is pending, the waker may still fire. The callback must tolerate being called after the scanner has been destroyed (e.g., by checking a validity flag in the context).
Spurious wake tolerance: The caller must tolerate re-polling after a waker fires and receiving LANCE_POLL_PENDING again. This can happen if the Tokio task was cancelled or if the I/O result was consumed by another path. The correct response is simply to re-register a new waker.
Implementation Plan
Phase 1: Core Read Path + C++ Wrappers (MVP)
Goal: A usable C++ library for reading Lance datasets, sufficient to build a Velox connector or integrate with any C++ application.
lance-c crate scaffold
lance.h
lance_scanner_next() for simple integrations
lance_scanner_poll_next() for async engines (Velox, Presto)
lance.hpp header-only lib
Wrapper classes (Dataset, Scanner), exception-based error handling
.columns().filter().limit().nearest() API
arrow::RecordBatch ↔ C Data Interface
Deliverable: liblance.so + lance.h + lance.hpp + lance.pc. A C++ developer can open a dataset, scan with filters/projections, and consume Arrow batches. Everything builds with cargo build.
Phase 2: Vector Search & Indexing
nearest() with metric, k, nprobes
Dataset::create_vector_index(), Dataset::create_scalar_index()
Deliverable: Vector search and index management from C++. Enables Velox IndexSource for ANN lookups.
Phase 3: Write Path & Mutations
Writer RAII class
Deliverable: Full read-write C++ library. Enables Velox DataSink.
Phase 4: Advanced Features
Deliverable: Production-ready library with optimization and distribution support.
Comparison with Existing Bindings
Lessons from lance-duckdb
The lance-duckdb extension validates the core architectural decisions:
Static library linking works well. DuckDB bundles liblance_duckdb_ffi.a directly. We should support both static and shared library builds.
Arrow C Data Interface is the right data exchange format. Zero-copy, no custom serialization, works with both DuckDB and Velox.
Thread-local error handling is simple and effective. Avoids complex return types while remaining thread-safe.
Custom binary IR for filter pushdown is powerful but complex. lance-duckdb serializes DuckDB expressions into an LFT1 binary format and deserializes them into DataFusion expressions on the Rust side. For the general-purpose C library, we should start with SQL string filters (simpler, already supported by Lance) and add binary IR as an optional optimization later.
block_on() works but has limitations. Both lance-duckdb and the Java bindings use a global Tokio runtime with OnceLock and block_on(). This is simple and correct, and DuckDB mitigates the blocking cost through fragment-level parallelism (each worker thread gets its own stream). However, for cooperative async engines like Velox, blocking is a thread starvation risk. The C library should support both block_on() (for simplicity) and a poll-based API (for performance).
panic = "abort" is mandatory. Prevents undefined behavior from Rust panics unwinding into C/C++ frames.
Lessons from Python and Java Bindings
The existing bindings inform several design decisions:
Builder pattern for scanners. Both Python (ScannerBuilder) and Java (ScanOptions) use builders. The C API follows suit with lance_scanner_new() + lance_scanner_set_*() functions.
block_on() is a starting point, not a long-term design. Java's BlockingDataset and BlockingScanner were implemented as expedient shortcuts. For a general-purpose C library targeting high-performance engines, we go further with a poll + waker API that avoids blocking engine threads on I/O.
Arrow for data exchange. Python uses PyArrow arrays; Java uses Arrow C Data Interface structs. The C library should exclusively use Arrow C Data Interface.
Comprehensive feature surface. Both bindings expose dataset CRUD, scanning, indexing, versioning, merge-insert, schema evolution, and compaction. The C library should aim for parity.
Storage options as key-value pairs. Both bindings accept HashMap<String, String> for cloud storage configuration. The C API uses NULL-terminated const char** arrays.
Alternatives Considered
1. Wrap lance-duckdb's FFI
lance-duckdb already has a Rust FFI layer. We could extract and generalize it.
Rejected because: lance-duckdb's FFI is tightly coupled to DuckDB's execution model (custom IR formats, DuckDB-specific scan patterns). A general-purpose library needs a cleaner, more universal API.
2. Use Substrait for filter pushdown
Substrait is the emerging standard for cross-engine expression exchange.
Deferred: Substrait support can be added later as an alternative filter format alongside SQL strings. SQL strings are simpler for initial adoption and already supported by Lance's scanner.
Open Questions
Header generation tooling: cbindgen vs hand-written headers? cbindgen is automated but sometimes produces suboptimal output. lance-duckdb uses hand-written headers.
Thread safety guarantees: Should a single LanceDataset* be safe to use from multiple threads? The Rust Dataset is Send + Sync, so this is feasible but needs documentation.
Waker thread safety contract: Resolved; see "Waker Contract" in the Threading Model section. The waker is documented as single-use, callable from any thread, no-reentrancy, and must tolerate stale/spurious invocations.