Skip to content

Add SQLite PRAGMA user_version + application_id as structural schema-version guard #2577

@thedavidmeister

Description

@thedavidmeister

Surfaced from review of #2526 (rename + DB schema v2→v3 bump).

Current defense

Schema version is tracked in exactly one place: a row in the db_metadata table read at bootstrap. crates/common/src/local_db/pipeline/adapters/bootstrap.rs:60-67 reads it and returns SchemaVersionMismatch on mismatch; the client adapter catches that and calls reset_db to wipe + recreate.

That works for the happy path, but it is brittle:

  • The check only runs once per bootstrap. Long-lived processes that reopen the SQLite file later have no way to notice if the file changed underneath them.
  • A user-shared / synced / partially-restored DB file can present a valid db_metadata row while the actual tables are stale — the row is just regular data, not a structural label.
  • Errors at query time look like no such column: raindex_address instead of db schema is wrong version — bad UX and slower to diagnose.

Proposed structural defense

SQLite has two PRAGMAs purpose-built for this:

  1. PRAGMA application_id — 32-bit integer label stamped into the file header. Identifies the file as a raindex local-db, not some other SQLite file that happens to be in the same directory.
  2. PRAGMA user_version — 32-bit schema-version number, also in the file header. Cheap to read; survives table corruption.

Set both atomically when creating tables (already an atomic batch in create_tables_stmt):

```sql
PRAGMA application_id = ;
PRAGMA user_version = 3;
```

Verify both on every connection open (not just at bootstrap):

```rust
let app_id: i32 = conn.query_row("PRAGMA application_id", [], |r| r.get(0))?;
let user_ver: i32 = conn.query_row("PRAGMA user_version", [], |r| r.get(0))?;
if app_id != RAINDEX_APPLICATION_ID { return Err(NotARaindexDb); }
if user_ver != DB_SCHEMA_VERSION { return Err(SchemaVersionMismatch { … }); }
```

Why this is worth doing now, not later

The v2→v3 cutover is the first time we've actually shipped a schema version bump. The next bump will hit the same brittleness from a real user base, with more data at stake. Adding the structural guard now lets every subsequent bump rely on the SQLite header rather than convention.

Scope

  • Pick a stable application_id (4-byte ASCII, e.g. "RIDX" → 0x52494458). Document it.
  • Add the two PRAGMA statements to create_tables/query.sql.
  • Add a connection-open hook that checks both and errors cleanly. Variant of the existing BootstrapPipeline::ensure_schema, but called from LocalDbQueryExecutor::open rather than bootstrap.
  • Update bootstrap.rs to keep the db_metadata row check too — belt and braces; the row is what user-facing migration logic queries, the PRAGMA is what code that doesn't know about migrations sees.
  • Existing dump-import code should NOT clobber application_id / user_version — confirm or fix in dump apply.

Out of scope

  • Changing the on-disk format
  • Adding migration steps (the PR's auto-reset on mismatch is fine and stays)

🤖 Generated with Claude Code

Metadata

Metadata

Labels

No labels
No labels

Type

No type
No fields configured for issues without a type.

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions