@sinkingsugar

This commit begins reverting the Rust rewrite back to clean C code. The Rust FFI overhead, memory allocator hacks, and complexity are being replaced with straightforward SQLite C extension code.

Completed in this phase:

1. **triggers.c/h** (98 Rust lines → 182 C lines)
   - Port trigger SQL generation for INSERT/UPDATE/DELETE
   - Clean string formatting with sqlite3_mprintf
   - No more format! macro overhead

2. **str_util.c/h** (New utility module)
   - String escaping for SQL identifiers and values
   - Identifier list building (e.g., "col1", "col2")
   - Simple C string manipulation, no String allocations

3. **bootstrap.c/h** (235 Rust lines → 349 C lines)
   - UUID v4 generation using sqlite3_randomness
   - Site ID initialization and persistence
   - Database version migration logic
   - Clock table creation
   - No Result<T> wrappers, just SQLITE_OK/ERROR

4. **tableinfo.h** (Data structures)
   - Define crsql_ColumnInfo and crsql_TableInfo structs
   - Foundation for table metadata management

5. **consts.h** (Updated)
   - Add version constants (CRSQLITE_VERSION, etc.)
   - Fix TBL_SITE_ID to match current schema

All code is pure C with no FFI overhead. Memory management uses sqlite3_malloc/sqlite3_free. Error handling uses standard SQLite codes.

Next phases will port: db_version, tableinfo, changes_vtab, local_writes, and fractional indexing modules.

🤖 Generated with Claude Code

Co-Authored-By: Claude <[email protected]>
Copilot AI review requested due to automatic review settings October 30, 2025 07:36

Copilot AI left a comment

Pull Request Overview

This PR introduces new trigger creation functionality for CRR (Conflict-free Replicated Relations) tables and refactors bootstrap logic. It adds utilities for SQL identifier escaping and list generation, along with version updates to 0.16.3 and table name changes.

  • Implements trigger creation functions for INSERT, UPDATE, and DELETE operations on CRR tables
  • Adds string utility functions for SQL escaping and identifier list generation
  • Refactors bootstrap logic including site_id management, peer tracking, and clock table creation

Reviewed Changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| core/src/triggers.h | Header file declaring trigger creation functions for CRR tables |
| core/src/triggers.c | Implementation of trigger creation logic with proper error handling and memory management |
| core/src/tableinfo.h | Data structures for table and column metadata used in trigger creation |
| core/src/str_util.h | Header for SQL string escaping and identifier list utilities |
| core/src/str_util.c | Implementation of SQL escaping functions and identifier list builder |
| core/src/consts.h | Updated version to 0.16.3 and changed TBL_SITE_ID constant from "site_id" to "crsql_site_id" |
| core/src/bootstrap.h | Header for database bootstrap functions including site_id initialization and table creation |
| core/src/bootstrap.c | Implementation of database initialization, migration, and clock table creation logic |

// Build comma-separated identifier list
// Example: columns=["id", "name"], prefix="NEW." -> 'NEW."id", NEW."name"'
char *crsql_as_identifier_list(crsql_ColumnInfo **columns, int columns_len, const char *prefix) {
if (!columns || columns_len == 0) return sqlite3_malloc(1); // Return empty string

Copilot AI Oct 30, 2025

When returning an empty string, the allocated memory is not null-terminated. The function should allocate 1 byte and set it to '\0' before returning, otherwise the returned pointer will contain uninitialized memory.

Suggested change
- if (!columns || columns_len == 0) return sqlite3_malloc(1); // Return empty string
+ if (!columns || columns_len == 0) {
+   char *empty_str = sqlite3_malloc(1);
+   if (empty_str) empty_str[0] = '\0';
+   return empty_str; // Return empty string
+ }

// prefix + " + escaped_name + " + , (if not last)
total_size += prefix_len + 1 + name_len + quote_count + 1;
if (i < columns_len - 1) total_size += 2; // ", "

Copilot AI Oct 30, 2025

The size calculation incorrectly adds separator bytes for all non-last elements, but the loop at lines 69 and 86 uses 'continue' to skip null entries. If a null entry is not the last element, the size calculation will be incorrect. The separator should only be added when there's a next valid column, not based on array index.
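
One way to address this is to apply the same skip rule in the sizing pass and the writing pass, and to add the separator only once an entry has already been emitted. The following is a minimal sketch under stated assumptions, not the code in this PR: `ColumnSketch` and `identifier_list_sketch` are hypothetical names, and plain `malloc` stands in for `sqlite3_malloc` so the example compiles without SQLite.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for crsql_ColumnInfo; only the name matters here. */
typedef struct {
  const char *name;
} ColumnSketch;

/* Bytes needed for "name" in double quotes with every embedded '"' doubled. */
static size_t quoted_len(const char *name) {
  size_t n = 2; /* opening and closing quote */
  for (const char *p = name; *p; ++p) n += (*p == '"') ? 2 : 1;
  return n;
}

/* Builds e.g. NEW."id", NEW."name". NULL entries are skipped, and the ", "
 * separator is counted and written only between emitted entries.
 * Returns a malloc'd string (empty if nothing is emitted) or NULL on OOM. */
char *identifier_list_sketch(ColumnSketch **cols, int n, const char *prefix) {
  size_t prefix_len = prefix ? strlen(prefix) : 0;
  size_t total = 1; /* NUL terminator */
  int valid = 0;

  /* Pass 1: size only the entries that pass 2 will actually emit. */
  for (int i = 0; cols && i < n; ++i) {
    if (!cols[i] || !cols[i]->name) continue;
    if (valid++ > 0) total += 2; /* ", " */
    total += prefix_len + quoted_len(cols[i]->name);
  }

  char *out = malloc(total);
  if (!out) return NULL;

  /* Pass 2: same skip rule, same separator rule. */
  char *w = out;
  int emitted = 0;
  for (int i = 0; cols && i < n; ++i) {
    if (!cols[i] || !cols[i]->name) continue;
    if (emitted++ > 0) { *w++ = ','; *w++ = ' '; }
    if (prefix_len) { memcpy(w, prefix, prefix_len); w += prefix_len; }
    *w++ = '"';
    for (const char *p = cols[i]->name; *p; ++p) {
      if (*p == '"') *w++ = '"'; /* double embedded quotes */
      *w++ = *p;
    }
    *w++ = '"';
  }
  *w++ = '\0';
  assert((size_t)(w - out) == total); /* both passes agree on the size */
  return out;
}
```

Because the empty case falls through to the final terminator write, this shape would also return a properly terminated empty string when `columns_len` is 0.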

// Create primary key table
sql = sqlite3_mprintf(
"CREATE TABLE IF NOT EXISTS \"%s__crsql_pks\" (__crsql_key INTEGER PRIMARY KEY, %s)",
table_info->tbl_name, pk_list

Copilot AI Oct 30, 2025

Inconsistent use of escaped vs raw table name in SQL generation. Line 272 uses 'escaped_tbl' with %s format, while this line uses 'table_info->tbl_name' directly with %s. Consider using escaped_tbl here for consistency, or if the raw name is intentional, add a comment explaining the difference.

Suggested change
- table_info->tbl_name, pk_list
+ escaped_tbl, pk_list

sql = sqlite3_mprintf(
"CREATE UNIQUE INDEX IF NOT EXISTS \"%s__crsql_pks_pks\" "
"ON \"%s__crsql_pks\" (%s)",
table_info->tbl_name, table_info->tbl_name, pk_list

Copilot AI Oct 30, 2025

Inconsistent use of escaped vs raw table name in SQL generation. The code uses 'table_info->tbl_name' directly with %s format, while earlier in the function (line 272) 'escaped_tbl' is used. Consider using escaped_tbl here for consistency, or if the raw name is intentional, add a comment explaining the difference.

Suggested change
- table_info->tbl_name, table_info->tbl_name, pk_list
+ escaped_tbl, escaped_tbl, pk_list

claude added 11 commits October 30, 2025 08:57
Continued reverting Rust code back to C. This phase completes all
utility functions and database version management.

Completed in this phase:

1. **str_util.c/h** - Extended with full utility functions
   - crsql_where_list() - Build WHERE clauses with IS ? operators
   - crsql_binding_list() - Generate parameter placeholders
   - crsql_get_dflt_value() - Query column default values via pragma
   - crsql_get_db_version_union_query() - Build UNION for max db_version
   - crsql_slab_rowid() - Calculate slab-based rowids

2. **db_version.c/h** (150 Rust lines → 117 C lines)
   - crsql_fill_db_version_if_needed() - Lazy-load db version
   - crsql_next_db_version() - Get next version for transactions
   - fetch_db_version_from_storage() - Read from clock tables
   - Handles schema changes, merging versions, clean databases

3. **ext-data.c** - Extended with version statement management
   - crsql_recreate_db_version_stmt() - Rebuild union query stmt
   - Dynamically queries all clock tables for max version
   - Proper memory management with realloc for growing arrays

All implementations use:
- Standard SQLite error codes (SQLITE_OK, SQLITE_ERROR, etc.)
- sqlite3_mprintf() for safe string formatting
- Proper statement lifecycle (prepare, step, reset, finalize)
- No Result<T> wrappers or FFI overhead

Total ported so far: ~483 lines of Rust → ~1,100 lines of clean C

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Ported three standalone utility modules to C.

Modules ported:

1. **is_crr.c/h** (27 Rust lines → 38 C lines)
   - crsql_is_crr() - Check if table is CRR by trigger existence
   - Simple SQL query, no complex logic

2. **compare_values.c/h** (64 Rust lines → 63 C lines)
   - crsql_compare_sqlite_values() - Compare two SQLite values
   - crsql_any_value_changed() - Check if arrays differ
   - Handles all SQLite types (NULL, INTEGER, FLOAT, TEXT, BLOB)
   - NULL is treated as less than all other values

3. **config.c/h** (86 Rust lines → 71 C lines)
   - crsql_config_set() - Set configuration values
   - crsql_config_get() - Get configuration values
   - Persists to crsql_master table
   - Manages mergeEqualValues setting
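
To illustrate the comparison rules listed for compare_values, here is a rough sketch over a hypothetical tagged value type. The real module operates on `sqlite3_value*`; `ValueSketch` and both function names are assumptions made for this example, as is the rule that values of different types order by their type tag (which is what places NULL below everything else).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical tagged value; declaration order doubles as type ordering. */
typedef enum { V_NULL, V_INTEGER, V_FLOAT, V_TEXT, V_BLOB } ValueType;

typedef struct {
  ValueType type;
  int64_t i;         /* used when type == V_INTEGER */
  double f;          /* used when type == V_FLOAT */
  const void *bytes; /* used when type == V_TEXT or V_BLOB */
  size_t len;        /* byte length for V_TEXT / V_BLOB */
} ValueSketch;

/* Returns <0, 0 or >0. Assumption: differing types order by type tag. */
int compare_values_sketch(const ValueSketch *a, const ValueSketch *b) {
  assert(a && b);
  if (a->type != b->type) return a->type < b->type ? -1 : 1;
  switch (a->type) {
    case V_NULL:
      return 0;
    case V_INTEGER:
      return (a->i > b->i) - (a->i < b->i);
    case V_FLOAT:
      return (a->f > b->f) - (a->f < b->f);
    case V_TEXT:
    case V_BLOB: {
      /* Compare the common prefix bytewise, then fall back to length. */
      size_t n = a->len < b->len ? a->len : b->len;
      int c = n ? memcmp(a->bytes, b->bytes, n) : 0;
      if (c != 0) return c < 0 ? -1 : 1;
      return (a->len > b->len) - (a->len < b->len);
    }
  }
  return 0;
}

/* Returns 1 when any position differs between two arrays of length n. */
int any_value_changed_sketch(const ValueSketch *a, const ValueSketch *b,
                             size_t n) {
  for (size_t i = 0; i < n; ++i)
    if (compare_values_sketch(&a[i], &b[i]) != 0) return 1;
  return 0;
}
```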

All implementations use standard SQLite APIs with no FFI overhead.

Total ported so far: ~740 Rust lines eliminated

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Implemented the core tableinfo module - the foundation for table
metadata management. This is a simplified but functional version
of the 1,001-line Rust tableinfo.rs module.

Implemented:

1. **tableinfo.c/h** (~1,001 Rust lines → 288 C lines core)
   - crsql_extract_table_info() - Extract schema via pragma_table_info
   - crsql_free_table_info() - Complete memory cleanup
   - crsql_is_table_compatible() - Validate table has primary key
   - crsql_get_or_create_key() - Manage __crsql_pks lookaside table
   - Struct with 15 cached statement pointers (2 implemented, rest stubbed)

Key features:
- Extracts columns from SQLite schema
- Separates primary keys from non-PK columns
- Lazy-loading statement cache (select_key, insert_key)
- Proper memory management with realloc for dynamic arrays
- All cached statements properly finalized on cleanup

This provides enough functionality for other modules to compile and link.
Additional cached statements can be added during integration as needed.
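
As an illustration of the realloc-based dynamic arrays and the PK / non-PK split mentioned above, the sketch below grows two vectors while partitioning a list of column rows. It is only a sketch: `ColumnRow`, `ColumnVec`, and `split_columns_sketch` are hypothetical, the row shape is loosely modelled on `pragma_table_info` output (where `pk` is 0 for non-key columns), and plain `realloc` stands in for the SQLite allocator.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical row: pk is 0 for non-key columns, otherwise the 1-based
 * position of the column within the primary key. */
typedef struct {
  int cid;
  const char *name;
  int pk;
} ColumnRow;

typedef struct {
  ColumnRow *items;
  size_t len;
  size_t cap;
} ColumnVec;

/* Appends one row, doubling capacity as needed. Returns 0 on success,
 * -1 on allocation failure (the vector is left unchanged). */
static int column_vec_push(ColumnVec *v, ColumnRow row) {
  if (v->len == v->cap) {
    size_t new_cap = v->cap ? v->cap * 2 : 4;
    ColumnRow *grown = realloc(v->items, new_cap * sizeof *grown);
    if (!grown) return -1; /* v->items is still valid and still owned */
    v->items = grown;
    v->cap = new_cap;
  }
  v->items[v->len++] = row;
  return 0;
}

/* Splits rows into key and non-key vectors, preserving input order. */
int split_columns_sketch(const ColumnRow *rows, size_t n, ColumnVec *pks,
                         ColumnVec *non_pks) {
  assert(pks && non_pks);
  for (size_t i = 0; i < n; ++i) {
    ColumnVec *dst = rows[i].pk > 0 ? pks : non_pks;
    if (column_vec_push(dst, rows[i]) != 0) return -1;
  }
  return 0;
}
```

Assigning the `realloc` result to a temporary first keeps the original buffer reachable on failure, so a cleanup routine can still free it.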

Total ported: ~1,000+ Rust lines eliminated

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Ported utility modules for CRR creation and teardown.

Modules ported:

1. **stmt_cache.c/h** (33 Rust lines → 13 C lines)
   - crsql_reset_cached_stmt() - Clear bindings and reset statement
   - Simple wrapper for statement lifecycle management

2. **create_crr.c/h** (49 Rust lines → 73 C lines)
   - crsql_create_crr() - Convert regular table to CRR
   - Creates clock tables, triggers, and backfills data
   - Orchestrates full CRR setup process

3. **teardown.c/h** (55 Rust lines → 129 C lines)
   - crsql_remove_crr_clock_table_if_exists() - Drop clock tables
   - crsql_remove_crr_triggers_if_exist() - Drop all triggers
   - Handles per-PK-column trigger cleanup
   - Safe cleanup with IF EXISTS checks

All implementations use straightforward SQL execution.
create_crr has forward declaration for backfill (to be ported).

Total ported: ~1,140+ Rust lines eliminated

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Port backfill.rs (~238 Rust lines) to backfill.c/h (~380 C lines)

Key functionality:
- Backfills clock tables with entries for existing rows
- Handles rows not yet tracked in __crsql_pks
- Supports schema evolution (missing columns)
- Transaction safety with savepoints
- Lazy key creation in lookaside tables

Functions:
- crsql_backfill_table: Main orchestrator
- create_clock_rows_from_stmt: Creates clock entries
- get_or_create_key: Manages __crsql_pks entries
- backfill_missing_columns: Handles schema changes
- fill_column: Backfills individual columns

Eliminates FFI overhead, uses pure SQLite C API.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Port pack_columns.rs (~208 Rust lines) to pack_columns.c/h (~350 C lines)
Port alter.rs (~192 Rust lines) to alter.c/h (~284 C lines)

pack_columns functionality:
- Binary packing format for efficient column storage
- Minimal bytes encoding for integers (1-8 bytes as needed)
- Support for all SQLite types (NULL, INTEGER, FLOAT, TEXT, BLOB)
- Unpack and bind utilities for statement preparation

Format: [count:u8, [(type+len:u8), len?:i32, data:u8[]]...]
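
The "minimal bytes" idea for integers can be sketched as follows. This is one possible scheme and not necessarily the byte layout used by pack_columns: the header layout `(byte_count << 3) | type_tag`, the big-endian payload, the signed minimal width, and the function names are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Smallest number of bytes (1-8) that holds v in two's complement. */
static int bytes_needed_i64(int64_t v) {
  for (int n = 1; n < 8; ++n) {
    int64_t hi = ((int64_t)1 << (8 * n - 1)) - 1;
    int64_t lo = -hi - 1;
    if (v >= lo && v <= hi) return n;
  }
  return 8;
}

/* Writes a header byte followed by the big-endian payload.
 * Assumed header layout: (byte_count << 3) | type_tag.
 * Returns the number of bytes written (at most 9). */
size_t pack_int_sketch(int64_t v, uint8_t type_tag, uint8_t *out) {
  assert(type_tag < 8);
  int n = bytes_needed_i64(v);
  uint64_t u;
  memcpy(&u, &v, sizeof u); /* reinterpret bits, avoiding signed shifts */
  out[0] = (uint8_t)((n << 3) | type_tag);
  for (int i = 0; i < n; ++i)
    out[1 + i] = (uint8_t)(u >> (8 * (n - 1 - i)));
  return (size_t)n + 1;
}

/* Reads what pack_int_sketch wrote and sign-extends short payloads. */
int64_t unpack_int_sketch(const uint8_t *in, uint8_t *type_tag) {
  int n = in[0] >> 3;
  assert(n >= 1 && n <= 8);
  if (type_tag) *type_tag = in[0] & 0x07;
  /* Pre-fill with the sign so the untouched high bytes are correct. */
  uint64_t u = (in[1] & 0x80) ? ~(uint64_t)0 : 0;
  for (int i = 0; i < n; ++i) u = (u << 8) | in[1 + i];
  int64_t v;
  memcpy(&v, &u, sizeof v);
  return v;
}
```

Under this scheme small values such as row counts or flags cost two bytes in total, while only values near the 64-bit limits pay the full nine.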

alter functionality:
- Compact clock tables after schema changes
- Detect PK changes (drop/recreate clock tables)
- Remove obsolete column entries
- Remove orphaned row entries (preserve tombstones)
- Remove orphaned pk lookaside entries
- Save pre_compact_dbversion for migration tracking

Eliminates Rust bytes crate, Vec allocations, FFI overhead.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Port automigrate.rs (~453 Rust lines) to automigrate.c/h (~773 C lines)

Key functionality:
- Automatic schema migration from current to desired state
- Opens in-memory DB with desired schema for comparison
- Drops removed tables
- Modifies existing tables (add/drop columns, update indices)
- Handles CRR tables with begin/commit alter
- Preserves data during migrations
- Strips crsql_as_crr statements for temp DB

Migration operations:
- Table comparison and removal
- Column addition/removal
- Index creation/recreation
- PK change detection (rejects unsupported PK additions)

Eliminates Rust BTreeSet, String allocations, FFI overhead.
Uses StringSet (dynamic array) for simple set operations.
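
A dynamic-array string set of the kind described might look roughly like the sketch below. The `StringSetSketch` type and its functions are hypothetical and are not taken from automigrate.c; the set borrows its strings rather than copying them, and lookups are linear, which is assumed to be acceptable for schema-sized inputs.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical set of borrowed strings backed by a dynamic array. */
typedef struct {
  const char **items;
  size_t len;
  size_t cap;
} StringSetSketch;

/* Linear membership test. */
int string_set_contains(const StringSetSketch *s, const char *key) {
  for (size_t i = 0; i < s->len; ++i)
    if (strcmp(s->items[i], key) == 0) return 1;
  return 0;
}

/* Adds key if absent. Returns 0 on success, -1 on allocation failure. */
int string_set_add(StringSetSketch *s, const char *key) {
  assert(s && key);
  if (string_set_contains(s, key)) return 0;
  if (s->len == s->cap) {
    size_t cap = s->cap ? s->cap * 2 : 8;
    const char **grown = realloc(s->items, cap * sizeof *grown);
    if (!grown) return -1;
    s->items = grown;
    s->cap = cap;
  }
  s->items[s->len++] = key;
  return 0;
}

/* Appends to out every member of a that is not in b, e.g. tables that
 * exist now but are absent from the desired schema. */
int string_set_difference(const StringSetSketch *a, const StringSetSketch *b,
                          StringSetSketch *out) {
  for (size_t i = 0; i < a->len; ++i)
    if (!string_set_contains(b, a->items[i]) &&
        string_set_add(out, a->items[i]) != 0)
      return -1;
  return 0;
}
```

Unlike a BTreeSet this keeps insertion order rather than sorted order, so any caller that relies on deterministic ordering would need to sort explicitly.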

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Port local_writes module (~484 Rust lines) to local_writes.c/h (~398 C lines)

Key functionality:
- Trigger handlers called from INSERT/UPDATE/DELETE triggers
- Updates clock tables with version and sequence tracking
- Handles CRDT metadata for conflict resolution

Functions:
- crsql_after_insert: Creates clock entries for new rows
- crsql_after_update: Updates clock entries for changed columns
  - Handles PK changes as delete+insert
  - Moves non-sentinel entries when PK changes
- crsql_after_delete: Marks rows as deleted (tombstones)

Helper functions:
- mark_new_pk_row_created: Creates sentinel entry
- mark_locally_updated: Updates column clock entry
- bump_seq: Increments sequence counter
- step_trigger_stmt: Steps and resets cached statements

Eliminates Rust ManuallyDrop, Result types, FFI overhead.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Port unpack_columns_vtab.rs (~269 Rust lines) to unpack_columns_vtab.c/h (~215 C lines)

Key functionality:
- Read-only virtual table to unpack binary column packages
- Schema: CREATE TABLE x(cell ANY, package BLOB hidden)
- Usage: SELECT cell FROM crsql_unpack_columns WHERE package = ?

Virtual table callbacks:
- connect/disconnect: Table lifecycle
- best_index: Requires package column constraint
- open/close: Cursor lifecycle
- filter: Unpacks binary blob into column values
- next/eof: Iteration through unpacked values
- column: Returns unpacked cell value
- rowid: Returns cursor position

Works with pack_columns module to provide efficient storage/retrieval
of packed column values.

Eliminates Rust Box, Vec, virtual table FFI overhead.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Port create_cl_set_vtab.rs (~270 Rust lines) to create_cl_set_vtab.c/h (~320 C lines)

Key functionality:
- Virtual table module for creating causal length set tables
- Table name must end with "_schema"
- Automatically creates base table and converts to CRR
- Schema: CREATE TABLE x(alteration TEXT HIDDEN, schema TEXT HIDDEN)

Virtual table callbacks:
- create: Creates base storage table and converts to CRR
- connect: Connects to existing virtual table
- destroy: Drops base table and all clock tables
- disconnect: Frees virtual table structure
- open/close: Cursor lifecycle (minimal, no reading)
- Other callbacks: No-ops (write-only management table)

Example:
  CREATE VIRTUAL TABLE foo_schema USING clset(
    id INTEGER PRIMARY KEY,
    value TEXT
  );
  -- Creates: foo table (as CRR) and foo_schema virtual table
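
The "_schema" naming rule in the example above implies a suffix check that also yields the base table name. A minimal sketch, assuming a hypothetical helper `base_table_name_sketch` and plain `malloc` in place of the SQLite allocator:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SCHEMA_SUFFIX "_schema"

/* Returns a malloc'd copy of vtab_name without the trailing "_schema",
 * or NULL when the suffix is missing, nothing precedes it, or on OOM. */
char *base_table_name_sketch(const char *vtab_name) {
  assert(vtab_name);
  size_t len = strlen(vtab_name);
  size_t suffix_len = sizeof(SCHEMA_SUFFIX) - 1;
  if (len <= suffix_len) return NULL;
  if (strcmp(vtab_name + len - suffix_len, SCHEMA_SUFFIX) != 0) return NULL;

  size_t base_len = len - suffix_len;
  char *base = malloc(base_len + 1);
  if (!base) return NULL;
  memcpy(base, vtab_name, base_len);
  base[base_len] = '\0';
  return base;
}
```

With this helper, "foo_schema" maps to "foo", while names such as "foo" or a bare "_schema" are rejected.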

Eliminates Rust Box, String allocations, FFI overhead.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Port changes_vtab_read.rs (~67 Rust lines) to changes_vtab_read.c/h (~116 C lines)

Key functionality:
- Builds SQL queries to read changes from all CRR tables
- Creates UNION ALL of all table change queries
- Each query selects: table, packed PKs, column, versions, site_id, seq, causal length

Query structure per table:
- Joins clock table with pks table to get primary keys
- LEFT JOINs site_id table to resolve site IDs
- LEFT JOINs for causal length (sentinel) optimization
- Uses crsql_pack_columns to efficiently pack PK values

Used by changes virtual table to provide unified view of all changes
across all CRR tables in the database.

Eliminates Rust Vec, String allocations, FFI overhead.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>