Start porting Rust code back to C - Phase 1 #454
This commit begins reverting the Rust rewrite back to clean C code. The Rust FFI overhead, memory allocator hacks, and complexity are being replaced with straightforward SQLite C extension code.
Completed in this phase:
1. **triggers.c/h** (98 Rust lines → 182 C lines)
   - Port trigger SQL generation for INSERT/UPDATE/DELETE
   - Clean string formatting with sqlite3_mprintf
   - No more format! macro overhead
2. **str_util.c/h** (New utility module)
   - String escaping for SQL identifiers and values
   - Identifier list building (e.g., "col1", "col2")
   - Simple C string manipulation, no String allocations
3. **bootstrap.c/h** (235 Rust lines → 349 C lines)
   - UUID v4 generation using sqlite3_randomness
   - Site ID initialization and persistence
   - Database version migration logic
   - Clock table creation
   - No Result<T> wrappers, just SQLITE_OK/ERROR
4. **tableinfo.h** (Data structures)
   - Define crsql_ColumnInfo and crsql_TableInfo structs
   - Foundation for table metadata management
5. **consts.h** (Updated)
   - Add version constants (CRSQLITE_VERSION, etc.)
   - Fix TBL_SITE_ID to match current schema
All code is pure C with no FFI overhead. Memory management uses sqlite3_malloc/sqlite3_free. Error handling uses standard SQLite codes.
Next phases will port: db_version, tableinfo, changes_vtab, local_writes, and fractional indexing modules.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
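The commit message mentions UUID v4 generation from sqlite3_randomness but does not show it. As a rough sketch only: a version-4 UUID is 16 random bytes with the version nibble and the variant bits overwritten. The helper name `example_make_uuid_v4` is hypothetical and is not taken from bootstrap.c; in the extension the 16 bytes would presumably come from `sqlite3_randomness(16, bytes)`, while here the caller supplies them so the snippet stays free of SQLite headers.

```c
#include <stdint.h>

/* Hypothetical helper (not from the PR): turn 16 random bytes into a
 * version-4 UUID by overwriting the version and variant bits in place.
 * The random bytes themselves are assumed to be filled by the caller,
 * e.g. with sqlite3_randomness(16, bytes) inside the extension. */
void example_make_uuid_v4(uint8_t bytes[16]) {
    /* High nibble of byte 6 carries the version: 0100 = version 4. */
    bytes[6] = (uint8_t)((bytes[6] & 0x0F) | 0x40);
    /* Top two bits of byte 8 carry the variant: 10xxxxxx. */
    bytes[8] = (uint8_t)((bytes[8] & 0x3F) | 0x80);
}
```

All other bytes are left untouched, so 122 of the 128 bits stay random.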
Pull Request Overview
This PR introduces new trigger creation functionality for CRR (Conflict-free Replicated Relations) tables and refactors bootstrap logic. It adds utilities for SQL identifier escaping and list generation, along with version updates to 0.16.3 and table name changes.
- Implements trigger creation functions for INSERT, UPDATE, and DELETE operations on CRR tables
- Adds string utility functions for SQL escaping and identifier list generation
- Refactors bootstrap logic including site_id management, peer tracking, and clock table creation
Reviewed Changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| core/src/triggers.h | Header file declaring trigger creation functions for CRR tables |
| core/src/triggers.c | Implementation of trigger creation logic with proper error handling and memory management |
| core/src/tableinfo.h | Data structures for table and column metadata used in trigger creation |
| core/src/str_util.h | Header for SQL string escaping and identifier list utilities |
| core/src/str_util.c | Implementation of SQL escaping functions and identifier list builder |
| core/src/consts.h | Updated version to 0.16.3 and changed TBL_SITE_ID constant from "site_id" to "crsql_site_id" |
| core/src/bootstrap.h | Header for database bootstrap functions including site_id initialization and table creation |
| core/src/bootstrap.c | Implementation of database initialization, migration, and clock table creation logic |
    // Build comma-separated identifier list
    // Example: columns=["id", "name"], prefix="NEW." -> 'NEW."id", NEW."name"'
    char *crsql_as_identifier_list(crsql_ColumnInfo **columns, int columns_len, const char *prefix) {
        if (!columns || columns_len == 0) return sqlite3_malloc(1); // Return empty string
Copilot
AI
Oct 30, 2025
When returning an empty string, the allocated memory is not null-terminated. The function should allocate 1 byte and set it to '\0' before returning, otherwise the returned pointer will contain uninitialized memory.
Suggested change:
    - if (!columns || columns_len == 0) return sqlite3_malloc(1); // Return empty string
    + if (!columns || columns_len == 0) {
    +     char *empty_str = sqlite3_malloc(1);
    +     if (empty_str) empty_str[0] = '\0';
    +     return empty_str; // Return empty string
    + }
    // prefix + " + escaped_name + " + , (if not last)
    total_size += prefix_len + 1 + name_len + quote_count + 1;
    if (i < columns_len - 1) total_size += 2; // ", "
Copilot
AI
Oct 30, 2025
The size calculation incorrectly adds separator bytes for all non-last elements, but the loop at lines 69 and 86 uses 'continue' to skip null entries. If a null entry is not the last element, the size calculation will be incorrect. The separator should only be added when there's a next valid column, not based on array index.
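One way to address this, sketched under assumptions: count and emit the separator before every valid entry except the first, so the sizing pass and the writing pass follow the same rule regardless of where NULL entries sit. The function below is a hypothetical stand-in for crsql_as_identifier_list; it takes plain name strings instead of crsql_ColumnInfo pointers and uses malloc instead of sqlite3_malloc so that it compiles on its own.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for crsql_as_identifier_list (assumption: plain
 * name strings, malloc instead of sqlite3_malloc). NULL entries are
 * skipped, and a separator is counted only between two valid entries. */
char *example_identifier_list(const char **names, int n, const char *prefix) {
    size_t prefix_len = prefix ? strlen(prefix) : 0;
    size_t total = 1; /* terminating NUL */
    int valid = 0;

    /* Pass 1: size. Same skip and separator rules as pass 2. */
    for (int i = 0; i < n; i++) {
        if (!names || !names[i]) continue;
        if (valid > 0) total += 2;        /* ", " before every entry but the first */
        total += prefix_len + 2;          /* prefix plus opening and closing quote */
        for (const char *p = names[i]; *p; p++) total += (*p == '"') ? 2 : 1;
        valid++;
    }

    char *out = malloc(total);
    if (!out) return NULL;

    /* Pass 2: write. An empty input yields a properly terminated "". */
    char *w = out;
    int written = 0;
    for (int i = 0; i < n; i++) {
        if (!names || !names[i]) continue;
        if (written > 0) { *w++ = ','; *w++ = ' '; }
        if (prefix_len) { memcpy(w, prefix, prefix_len); w += prefix_len; }
        *w++ = '"';
        for (const char *p = names[i]; *p; p++) {
            if (*p == '"') *w++ = '"';    /* double embedded quotes */
            *w++ = *p;
        }
        *w++ = '"';
        written++;
    }
    *w = '\0';
    return out;
}
```

Because both passes key the separator off the number of valid entries seen so far, a NULL in the middle or at the end of the array cannot make the two passes disagree.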
    // Create primary key table
    sql = sqlite3_mprintf(
        "CREATE TABLE IF NOT EXISTS \"%s__crsql_pks\" (__crsql_key INTEGER PRIMARY KEY, %s)",
        table_info->tbl_name, pk_list
Copilot
AI
Oct 30, 2025
Inconsistent use of escaped vs raw table name in SQL generation. Line 272 uses 'escaped_tbl' with %s format, while this line uses 'table_info->tbl_name' directly with %s. Consider using escaped_tbl here for consistency, or if the raw name is intentional, add a comment explaining the difference.
Suggested change:
    - table_info->tbl_name, pk_list
    + escaped_tbl, pk_list
    sql = sqlite3_mprintf(
        "CREATE UNIQUE INDEX IF NOT EXISTS \"%s__crsql_pks_pks\" "
        "ON \"%s__crsql_pks\" (%s)",
        table_info->tbl_name, table_info->tbl_name, pk_list
Copilot
AI
Oct 30, 2025
Inconsistent use of escaped vs raw table name in SQL generation. The code uses 'table_info->tbl_name' directly with %s format, while earlier in the function (line 272) 'escaped_tbl' is used. Consider using escaped_tbl here for consistency, or if the raw name is intentional, add a comment explaining the difference.
Suggested change:
    - table_info->tbl_name, table_info->tbl_name, pk_list
    + escaped_tbl, escaped_tbl, pk_list
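To illustrate why the raw name matters here: a table name containing a double quote would terminate the quoted identifier early unless each embedded quote is doubled. The sketch below is not the PR's code. `example_escape_ident` and `example_pks_table_sql` are hypothetical names, and plain snprintf and malloc stand in for sqlite3_mprintf so the snippet builds without SQLite. Within SQLite itself, sqlite3_mprintf also offers a `%w` conversion that doubles double-quote characters, which may be a simpler route than a separate escaped copy.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy `name`, doubling each embedded double quote. */
char *example_escape_ident(const char *name) {
    size_t len = 0;
    for (const char *p = name; *p; p++) len += (*p == '"') ? 2 : 1;
    char *out = malloc(len + 1);
    if (!out) return NULL;
    char *w = out;
    for (const char *p = name; *p; p++) {
        if (*p == '"') *w++ = '"';
        *w++ = *p;
    }
    *w = '\0';
    return out;
}

/* Build the pks-table DDL quoted in the review thread, using the escaped
 * table name. Assumption: pk_list is already an escaped identifier list. */
char *example_pks_table_sql(const char *tbl_name, const char *pk_list) {
    static const char fmt[] =
        "CREATE TABLE IF NOT EXISTS \"%s__crsql_pks\" "
        "(__crsql_key INTEGER PRIMARY KEY, %s)";
    char *escaped = example_escape_ident(tbl_name);
    if (!escaped) return NULL;
    /* First call measures, second call writes. */
    int need = snprintf(NULL, 0, fmt, escaped, pk_list);
    char *sql = need < 0 ? NULL : malloc((size_t)need + 1);
    if (sql) snprintf(sql, (size_t)need + 1, fmt, escaped, pk_list);
    free(escaped);
    return sql;
}
```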
Continued reverting Rust code back to C. This phase completes all utility functions and database version management.
Completed in this phase:
1. **str_util.c/h** - Extended with full utility functions
   - crsql_where_list() - Build WHERE clauses with IS ? operators
   - crsql_binding_list() - Generate parameter placeholders
   - crsql_get_dflt_value() - Query column default values via pragma
   - crsql_get_db_version_union_query() - Build UNION for max db_version
   - crsql_slab_rowid() - Calculate slab-based rowids
2. **db_version.c/h** (150 Rust lines → 117 C lines)
   - crsql_fill_db_version_if_needed() - Lazy-load db version
   - crsql_next_db_version() - Get next version for transactions
   - fetch_db_version_from_storage() - Read from clock tables
   - Handles schema changes, merging versions, clean databases
3. **ext-data.c** - Extended with version statement management
   - crsql_recreate_db_version_stmt() - Rebuild union query stmt
   - Dynamically queries all clock tables for max version
   - Proper memory management with realloc for growing arrays
All implementations use:
- Standard SQLite error codes (SQLITE_OK, SQLITE_ERROR, etc.)
- sqlite3_mprintf() for safe string formatting
- Proper statement lifecycle (prepare, step, reset, finalize)
- No Result<T> wrappers or FFI overhead
Total ported so far: ~483 lines of Rust → ~1,100 lines of clean C
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
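The WHERE-clause helper described above is not shown in the thread. A minimal sketch of the idea, assuming the output shape `"a" IS ? AND "b" IS ?` and using hypothetical names throughout: `IS` is used rather than `=` because `NULL IS NULL` is true in SQLite, so rows with NULL key columns still match. The real crsql_where_list may take different arguments and use sqlite3_malloc.

```c
#include <stdlib.h>
#include <string.h>

/* Bytes needed for `name` wrapped in double quotes, embedded quotes doubled. */
static size_t example_quoted_len(const char *name) {
    size_t n = 2;
    for (; *name; name++) n += (*name == '"') ? 2 : 1;
    return n;
}

/* Write the quoted form of `name` at `w`; return the new write position. */
static char *example_write_quoted(char *w, const char *name) {
    *w++ = '"';
    for (; *name; name++) {
        if (*name == '"') *w++ = '"';
        *w++ = *name;
    }
    *w++ = '"';
    return w;
}

/* Hypothetical sketch: build `"a" IS ? AND "b" IS ?` for n column names.
 * Assumption: every names[i] is non-NULL. */
char *example_where_list(const char **names, int n) {
    static const char op[] = " IS ?";
    static const char sep[] = " AND ";
    size_t total = 1; /* terminating NUL */
    for (int i = 0; i < n; i++) {
        if (i > 0) total += sizeof sep - 1;
        total += example_quoted_len(names[i]) + (sizeof op - 1);
    }
    char *out = malloc(total);
    if (!out) return NULL;
    char *w = out;
    for (int i = 0; i < n; i++) {
        if (i > 0) { memcpy(w, sep, sizeof sep - 1); w += sizeof sep - 1; }
        w = example_write_quoted(w, names[i]);
        memcpy(w, op, sizeof op - 1);
        w += sizeof op - 1;
    }
    *w = '\0';
    return out;
}
```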
Ported three standalone utility modules to C.
Modules ported:
1. **is_crr.c/h** (27 Rust lines → 38 C lines)
   - crsql_is_crr() - Check if table is CRR by trigger existence
   - Simple SQL query, no complex logic
2. **compare_values.c/h** (64 Rust lines → 63 C lines)
   - crsql_compare_sqlite_values() - Compare two SQLite values
   - crsql_any_value_changed() - Check if arrays differ
   - Handles all SQLite types (NULL, INTEGER, FLOAT, TEXT, BLOB)
   - NULL is treated as less than all other values
3. **config.c/h** (86 Rust lines → 71 C lines)
   - crsql_config_set() - Set configuration values
   - crsql_config_get() - Get configuration values
   - Persists to crsql_master table
   - Manages mergeEqualValues setting
All implementations use standard SQLite APIs with no FFI overhead.
Total ported so far: ~740 Rust lines eliminated
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
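The "NULL is treated as less than all other values" rule can be sketched with a small tagged value type. This is an illustration only: the real functions operate on sqlite3_value pointers, the `ex_` names are hypothetical, only three types are modelled, and ordering mismatched non-NULL types by type tag is an assumption that the commit message does not confirm.

```c
#include <string.h>

/* Hypothetical tagged value standing in for sqlite3_value. */
typedef enum { EX_NULL, EX_INTEGER, EX_TEXT } ex_type;
typedef struct {
    ex_type type;
    long long i;    /* used when type == EX_INTEGER */
    const char *s;  /* used when type == EX_TEXT */
} ex_value;

/* Returns -1, 0, or 1. NULL sorts below every non-NULL value. */
int example_compare(const ex_value *a, const ex_value *b) {
    if (a->type == EX_NULL || b->type == EX_NULL) {
        /* both NULL -> 0, only a NULL -> -1, only b NULL -> 1 */
        return (a->type != EX_NULL) - (b->type != EX_NULL);
    }
    /* Assumption: different non-NULL types are ordered by type tag. */
    if (a->type != b->type) return a->type < b->type ? -1 : 1;
    if (a->type == EX_INTEGER) return (a->i > b->i) - (a->i < b->i);
    int c = strcmp(a->s, b->s);
    return (c > 0) - (c < 0);
}

/* Returns 1 if any position differs between the two arrays, else 0. */
int example_any_value_changed(const ex_value *left, const ex_value *right, int n) {
    for (int i = 0; i < n; i++) {
        if (example_compare(&left[i], &right[i]) != 0) return 1;
    }
    return 0;
}
```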
Implemented the core tableinfo module - the foundation for table metadata management. This is a simplified but functional version of the 1,001-line Rust tableinfo.rs module.
Implemented:
1. **tableinfo.c/h** (~1,001 Rust lines → 288 C lines core)
   - crsql_extract_table_info() - Extract schema via pragma_table_info
   - crsql_free_table_info() - Complete memory cleanup
   - crsql_is_table_compatible() - Validate table has primary key
   - crsql_get_or_create_key() - Manage __crsql_pks lookaside table
   - Struct with 15 cached statement pointers (2 implemented, rest stubbed)
Key features:
- Extracts columns from SQLite schema
- Separates primary keys from non-PK columns
- Lazy-loading statement cache (select_key, insert_key)
- Proper memory management with realloc for dynamic arrays
- All cached statements properly finalized on cleanup
This provides enough functionality for other modules to compile and link. Additional cached statements can be added during integration as needed.
Total ported: ~1,000+ Rust lines eliminated
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
Ported utility modules for CRR creation and teardown.
Modules ported:
1. **stmt_cache.c/h** (33 Rust lines → 13 C lines)
   - crsql_reset_cached_stmt() - Clear bindings and reset statement
   - Simple wrapper for statement lifecycle management
2. **create_crr.c/h** (49 Rust lines → 73 C lines)
   - crsql_create_crr() - Convert regular table to CRR
   - Creates clock tables, triggers, and backfills data
   - Orchestrates full CRR setup process
3. **teardown.c/h** (55 Rust lines → 129 C lines)
   - crsql_remove_crr_clock_table_if_exists() - Drop clock tables
   - crsql_remove_crr_triggers_if_exist() - Drop all triggers
   - Handles per-PK-column trigger cleanup
   - Safe cleanup with IF EXISTS checks
All implementations use straightforward SQL execution. create_crr has forward declaration for backfill (to be ported).
Total ported: ~1,140+ Rust lines eliminated
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
Port backfill.rs (~238 Rust lines) to backfill.c/h (~380 C lines)
Key functionality:
- Backfills clock tables with entries for existing rows
- Handles rows not yet tracked in __crsql_pks
- Supports schema evolution (missing columns)
- Transaction safety with savepoints
- Lazy key creation in lookaside tables
Functions:
- crsql_backfill_table: Main orchestrator
- create_clock_rows_from_stmt: Creates clock entries
- get_or_create_key: Manages __crsql_pks entries
- backfill_missing_columns: Handles schema changes
- fill_column: Backfills individual columns
Eliminates FFI overhead, uses pure SQLite C API.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
Port pack_columns.rs (~208 Rust lines) to pack_columns.c/h (~350 C lines)
Port alter.rs (~192 Rust lines) to alter.c/h (~284 C lines)
pack_columns functionality:
- Binary packing format for efficient column storage
- Minimal bytes encoding for integers (1-8 bytes as needed)
- Support for all SQLite types (NULL, INTEGER, FLOAT, TEXT, BLOB)
- Unpack and bind utilities for statement preparation
Format: [count:u8, [(type+len:u8), len?:i32, data:u8[]]...]
alter functionality:
- Compact clock tables after schema changes
- Detect PK changes (drop/recreate clock tables)
- Remove obsolete column entries
- Remove orphaned row entries (preserve tombstones)
- Remove orphaned pk lookaside entries
- Save pre_compact_dbversion for migration tracking
Eliminates Rust bytes crate, Vec allocations, FFI overhead.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
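The "minimal bytes encoding for integers" idea can be sketched as follows. The byte order (big-endian) and the choice to sign-extend on decode are assumptions made for this illustration; the commit message does not state how pack_columns.c handles either, and the `example_` names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Smallest n in 1..8 such that v fits in an n-byte two's-complement integer. */
int example_int_bytes_needed(int64_t v) {
    for (int n = 1; n < 8; n++) {
        int64_t max = ((int64_t)1 << (8 * n - 1)) - 1;
        int64_t min = -max - 1;
        if (v >= min && v <= max) return n;
    }
    return 8;
}

/* Write the low n bytes of v, most significant byte first (assumed order). */
void example_int_encode(int64_t v, int n, uint8_t *out) {
    uint64_t u = (uint64_t)v;
    for (int i = 0; i < n; i++) {
        out[i] = (uint8_t)(u >> (8 * (n - 1 - i)));
    }
}

/* Read n bytes back, sign-extending from the top bit of the first byte. */
int64_t example_int_decode(const uint8_t *in, int n) {
    uint64_t u = (in[0] & 0x80) ? ~(uint64_t)0 : 0;
    for (int i = 0; i < n; i++) {
        u = (u << 8) | in[i];
    }
    int64_t v;
    memcpy(&v, &u, sizeof v); /* reinterpret the bits without a lossy cast */
    return v;
}
```

With this scheme small values such as row counters take one byte of payload, while only values near the 64-bit limits need all eight.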
Port automigrate.rs (~453 Rust lines) to automigrate.c/h (~773 C lines)
Key functionality:
- Automatic schema migration from current to desired state
- Opens in-memory DB with desired schema for comparison
- Drops removed tables
- Modifies existing tables (add/drop columns, update indices)
- Handles CRR tables with begin/commit alter
- Preserves data during migrations
- Strips crsql_as_crr statements for temp DB
Migration operations:
- Table comparison and removal
- Column addition/removal
- Index creation/recreation
- PK change detection (rejects unsupported PK additions)
Eliminates Rust BTreeSet, String allocations, FFI overhead. Uses StringSet (dynamic array) for simple set operations.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
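The StringSet mentioned above is not shown in the thread. A minimal sketch of a dynamic-array set, assuming linear-search membership and doubling growth, could look like this; the `ex_` names are hypothetical, and the actual structure in automigrate.c may differ (for example by using sqlite3_malloc).

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical dynamic-array string set; initialise with {0}. */
typedef struct {
    char **items;
    size_t len;
    size_t cap;
} ex_string_set;

/* Linear search: adequate for the handful of table or column names in a schema. */
int ex_set_contains(const ex_string_set *s, const char *v) {
    for (size_t i = 0; i < s->len; i++) {
        if (strcmp(s->items[i], v) == 0) return 1;
    }
    return 0;
}

/* Returns 1 if added, 0 if already present, -1 on allocation failure. */
int ex_set_add(ex_string_set *s, const char *v) {
    if (ex_set_contains(s, v)) return 0;
    if (s->len == s->cap) {
        size_t cap = s->cap ? s->cap * 2 : 4;
        char **grown = realloc(s->items, cap * sizeof *grown);
        if (!grown) return -1; /* the old array is still valid */
        s->items = grown;
        s->cap = cap;
    }
    size_t n = strlen(v) + 1;
    char *copy = malloc(n);
    if (!copy) return -1;
    memcpy(copy, v, n);
    s->items[s->len++] = copy;
    return 1;
}

void ex_set_free(ex_string_set *s) {
    for (size_t i = 0; i < s->len; i++) free(s->items[i]);
    free(s->items);
    s->items = NULL;
    s->len = s->cap = 0;
}
```

For the "drops removed tables" step, one set would hold the table names of the desired schema, and each current table whose name is not contained in it would be a candidate for removal.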
Port local_writes module (~484 Rust lines) to local_writes.c/h (~398 C lines)
Key functionality:
- Trigger handlers called from INSERT/UPDATE/DELETE triggers
- Updates clock tables with version and sequence tracking
- Handles CRDT metadata for conflict resolution
Functions:
- crsql_after_insert: Creates clock entries for new rows
- crsql_after_update: Updates clock entries for changed columns
  - Handles PK changes as delete+insert
  - Moves non-sentinel entries when PK changes
- crsql_after_delete: Marks rows as deleted (tombstones)
Helper functions:
- mark_new_pk_row_created: Creates sentinel entry
- mark_locally_updated: Updates column clock entry
- bump_seq: Increments sequence counter
- step_trigger_stmt: Steps and resets cached statements
Eliminates Rust ManuallyDrop, Result types, FFI overhead.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
Port unpack_columns_vtab.rs (~269 Rust lines) to unpack_columns_vtab.c/h (~215 C lines)
Key functionality:
- Read-only virtual table to unpack binary column packages
- Schema: CREATE TABLE x(cell ANY, package BLOB hidden)
- Usage: SELECT cell FROM crsql_unpack_columns WHERE package = ?
Virtual table callbacks:
- connect/disconnect: Table lifecycle
- best_index: Requires package column constraint
- open/close: Cursor lifecycle
- filter: Unpacks binary blob into column values
- next/eof: Iteration through unpacked values
- column: Returns unpacked cell value
- rowid: Returns cursor position
Works with pack_columns module to provide efficient storage/retrieval of packed column values.
Eliminates Rust Box, Vec, virtual table FFI overhead.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
Port create_cl_set_vtab.rs (~270 Rust lines) to create_cl_set_vtab.c/h (~320 C lines)
Key functionality:
- Virtual table module for creating causal length set tables
- Table name must end with "_schema"
- Automatically creates base table and converts to CRR
- Schema: CREATE TABLE x(alteration TEXT HIDDEN, schema TEXT HIDDEN)
Virtual table callbacks:
- create: Creates base storage table and converts to CRR
- connect: Connects to existing virtual table
- destroy: Drops base table and all clock tables
- disconnect: Frees virtual table structure
- open/close: Cursor lifecycle (minimal, no reading)
- Other callbacks: No-ops (write-only management table)
Example:
CREATE VIRTUAL TABLE foo_schema USING clset(
id INTEGER PRIMARY KEY,
value TEXT
);
-- Creates: foo table (as CRR) and foo_schema virtual table
Eliminates Rust Box, String allocations, FFI overhead.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
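The "_schema" naming rule above implies deriving the base table name from the virtual table name (foo_schema → foo). A small sketch of that check follows; `example_base_table_name` is a hypothetical name, and rejecting a bare "_schema" with an empty base name is an assumption rather than something the commit message states.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a malloc'd copy of vtab_name without its
 * "_schema" suffix, or NULL if the suffix is missing, the base name would
 * be empty (assumption), or allocation fails. Caller frees the result. */
char *example_base_table_name(const char *vtab_name) {
    static const char suffix[] = "_schema";
    size_t suffix_len = sizeof suffix - 1;
    size_t len = strlen(vtab_name);
    if (len <= suffix_len) return NULL;
    if (strcmp(vtab_name + len - suffix_len, suffix) != 0) return NULL;
    size_t base_len = len - suffix_len;
    char *base = malloc(base_len + 1);
    if (!base) return NULL;
    memcpy(base, vtab_name, base_len);
    base[base_len] = '\0';
    return base;
}
```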
Port changes_vtab_read.rs (~67 Rust lines) to changes_vtab_read.c/h (~116 C lines)
Key functionality:
- Builds SQL queries to read changes from all CRR tables
- Creates UNION ALL of all table change queries
- Each query selects: table, packed PKs, column, versions, site_id, seq, causal length
Query structure per table:
- Joins clock table with pks table to get primary keys
- LEFT JOINs site_id table to resolve site IDs
- LEFT JOINs for causal length (sentinel) optimization
- Uses crsql_pack_columns to efficiently pack PK values
Used by changes virtual table to provide unified view of all changes across all CRR tables in the database.
Eliminates Rust Vec, String allocations, FFI overhead.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>