Add Raindex Event Sourcing Sync Specification V1 #41
base: main
Conversation
Added Raindex Event Sourcing Sync Specification V1 documentation detailing the architecture, design goals, synchronization flow, error handling, and remote server API for a local-first event-sourced system.
Walkthrough

Introduces a comprehensive Raindex Event Sourcing Sync Specification V1 document defining a local-first, event-sourced system for synchronizing blockchain data into SQLite. The specification covers RPC-based live synchronization, remote bootstrap, multi-network and multi-orderbook support, database schema, sync coordination flows, and a Remote Server API.

Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~12 minutes

Pre-merge checks and finishing touches
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
> 1. **Sync State Tables**: Track synchronization progress per orderbook
> 2. **Event Tables**: Raw logs and processed events (orders, trades, deposits, withdrawals)
> 3. **State Tables**: Current state (vaults, vault balances)
> 4. **Side Effect Tables**: Token info, interpreter bytecode, store Set events
> 5. **Tracking Tables**: e.g. Which stores are being monitored per orderbook
>
> ## Multi-Orderbook Coordination
>
> ### Independent Sync
>
> Each orderbook syncs independently and maintains its own state:
>
> - **Separate sync_state**: Each orderbook has its own record tracking block number, block hash, and sync status
> - **Independent error handling**: One orderbook failing doesn't affect others
> - **Different sync speeds**: Networks with different block times sync at different rates
> - **Flexible configuration**: Each orderbook can use different remotes or sync purely from RPC
> - **Per-orderbook tracking**: Each orderbook tracks its own stores via tracked_stores table
> - **Concurrent syncing**: Multiple orderbooks can sync simultaneously
>
> **Benefits of independent sync:**
> - Add/remove orderbooks without affecting existing syncs
> - Each orderbook's data is always internally consistent
> - Different networks progress at their natural pace
> - Errors are isolated and don't cascade
>
> ### Shared Side Effects
>
> While orderbooks sync independently, certain data is shared across all orderbooks:
>
> #### Token Info (token_info table)
> - **Scope**: One record per network:token (not per orderbook)
> - **Rationale**: Token metadata is network-level data, not orderbook-specific
> - **Benefit**: Fetching token info for one orderbook benefits all orderbooks on same network
>
> #### Interpreter Bytecode (interpreter_bytecode table)
> - **Scope**: One record per network:interpreter
> - **Rationale**: Bytecode is immutable and not orderbook-specific
> - **Benefit**: Multiple orderbooks can reference same interpreter without duplication
>
> #### Store Set Events (store_set_events table)
> - **Scope**: Filtered by orderbook context, but stores can be shared
> - **Rationale**: Stores are shared contracts that multiple orderbooks may use
> - **Benefit**: If multiple orderbooks use the same store, Set events are available to all
>
> **Implications of shared side effects:**
> - Side effect tables use INSERT OR IGNORE pattern during merge
> - Cannot simply delete and replace side effect data during remote merge
> - Store Set events must be carefully filtered by orderbook transaction context
Resolve the contradictory guidance on store_set_events.
This section classifies store_set_events as a shared side-effect table that must be preserved with INSERT OR IGNORE semantics, yet the merge policy later lists it among orderbook-specific tables to delete and replace. The reorg rollback procedure also deletes it while asserting side-effect tables are not rolled back. Implementers cannot act on these conflicting instructions, and following the delete/replace path would violate the stated side-effect guarantees. Please consolidate the specification so store_set_events has a single, consistent lifecycle (either fully shared side effect or fully orderbook-scoped) and update all affected sections accordingly.
Also applies to: 408-419
🤖 Prompt for AI Agents
In i9r/raindex-local-db around lines 78 to 125 (and also apply same change to
lines 408–419), the spec gives contradictory lifecycle rules for
store_set_events by calling it both a shared side-effect table (preserved with
INSERT OR IGNORE and not rolled back) and an orderbook-specific table that is
deleted/replaced during remote merges and reorg rollbacks; consolidate to one
clear policy: decide whether store_set_events is shared (network-scoped) or
orderbook-scoped, then update every mention to match that choice — if shared,
mark it as a side-effect table, remove it from delete/replace and rollback
lists, and add INSERT OR IGNORE merge semantics and filtering notes; if
orderbook-scoped, remove it from the side-effect section, document
delete-and-replace merge behavior and include it in rollback/delete lists; apply
the change consistently in both locations and any related reorg/merge procedure
descriptions.
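For reference, here is a minimal SQLite sketch of the per-orderbook sync_state record the quoted excerpt keeps referring to. The column names are illustrative assumptions, not taken from the PR's actual schema:

```sql
-- Hypothetical sketch of a per-orderbook sync_state row; the (network,
-- orderbook_address) primary key is what makes sync state independent per orderbook.
CREATE TABLE IF NOT EXISTS sync_state (
    network                TEXT    NOT NULL,
    orderbook_address      TEXT    NOT NULL,
    last_synced_block      INTEGER NOT NULL,
    last_synced_block_hash TEXT    NOT NULL,
    status                 TEXT    NOT NULL DEFAULT 'idle',  -- assumed status column
    updated_at             INTEGER NOT NULL,
    PRIMARY KEY (network, orderbook_address)
);
```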
> - **Replace on hash mismatch**: If block numbers match but hashes differ (reorg), trust remote
> - **Preserve side effects**: Never overwrite existing side effect data
> - **Atomic operation**: All merges happen within a transaction
>
> #### Transaction Boundaries
>
> Remote dump merges are atomic operations:
> ```
> BEGIN TRANSACTION
>   - Validate schema version compatibility
>   - For each orderbook-specific table: delete and replace if criteria met
>   - For each side effect table: insert new records only (preserve existing)
>   - Update sync_state with remote's block number and hash
> COMMIT TRANSACTION
> ```
>
> If any step fails during merge, the entire transaction rolls back and the local database remains in its previous consistent state.
>
> #### Detailed Merge Rules
>
> **For orderbook-specific tables** (orders, trades, vaults, vault_balance_changes, raw_logs, tracked_stores, store_set_events):
> ```
> IF remote.last_synced_block > local.last_synced_block
>    OR (remote.last_synced_block == local.last_synced_block
>        AND remote.block_hash != local.block_hash):
>
>     DELETE FROM <table> WHERE network = ? AND orderbook_address = ?
>     INSERT all records from dump
>
> ELSE:
>     Skip (local is current or ahead)
> ```
>
> **For side effect tables** (token_info, interpreter_bytecode):
> ```
> For each record in dump:
>     IF NOT EXISTS (SELECT 1 FROM <table> WHERE id = dump_record.id):
>         INSERT dump_record
>     ELSE:
>         SKIP (preserve existing local data)
> ```
Allow remote dumps to upgrade side-effect records.
The merge rule “insert only when missing, otherwise skip” prevents remote dumps from improving existing side-effect rows. If a local client previously recorded token_info.fetch_succeeded = 0 (or stale metadata) and the remote has a successful, up-to-date record, the insert-or-ignore policy will retain the failed local row indefinitely. That negates the benefit of remote bootstraps delivering richer metadata. Please adjust the spec to permit safe updates (e.g., compare timestamps or prefer successful remote fetches) so remote side-effect data can replace incomplete local state.
🤖 Prompt for AI Agents
i9r/raindex-local-db around lines 388-428: the current spec forbids updating
existing side-effect rows (token_info, interpreter_bytecode), preventing remote
dumps from improving stale or failed local records; change the merge rules to
allow safe updates by replacing a local row only when the remote row is
demonstrably newer or better—for example, if remote.last_updated >
local.last_updated OR (remote.fetch_succeeded = 1 AND local.fetch_succeeded =
0), otherwise skip; apply these comparisons per-record inside the same
transaction (preserve atomicity), and ensure updates are limited to fields
provided by the dump (don’t clobber unrelated local fields) to avoid regressing
local improvements.
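As a concrete illustration of the upsert the reviewer is asking for, one possible SQLite statement is sketched below. It assumes token_info has a unique key on (network, token_address) plus fetch_succeeded and last_updated columns; none of these are confirmed by the spec as quoted:

```sql
-- Sketch: allow a remote dump record to replace a local token_info row only when
-- it is demonstrably newer or strictly better. Requires a UNIQUE index on
-- (network, token_address) for the ON CONFLICT target (an assumption).
INSERT INTO token_info (network, token_address, name, symbol, decimals,
                        fetch_succeeded, last_updated)
VALUES (:network, :token_address, :name, :symbol, :decimals,
        :fetch_succeeded, :last_updated)
ON CONFLICT (network, token_address) DO UPDATE SET
    name            = excluded.name,
    symbol          = excluded.symbol,
    decimals        = excluded.decimals,
    fetch_succeeded = excluded.fetch_succeeded,
    last_updated    = excluded.last_updated
WHERE excluded.last_updated > token_info.last_updated
   OR (excluded.fetch_succeeded = 1 AND token_info.fetch_succeeded = 0);
```

Because the update only fires when the incoming row wins the comparison, re-applying the same dump is idempotent and never regresses a successful local fetch.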
Removed unnecessary RPC sync configuration and clarified remote checking process.
Actionable comments posted: 10
♻️ Duplicate comments (2)
i9r/raindex-local-db (2)
115-124: Unify lifecycle and merge/rollback rules for store_set_events (contradictions).
- It’s called a shared side-effect (Lines 115-124), but is deleted/replaced on remote merges (Lines 406-419) and rolled back on reorg (Lines 258-264). The dump section also treats it as orderbook-scoped (Lines 641-646). This is internally inconsistent and not implementable.
- Additionally, “shared across orderbooks” conflicts with “filter Set events to only those in orderbook transaction hashes” (orderbook-scoped by context).
Choose one model and apply it consistently across all sections:
- Option A (recommended for correctness): Treat store_set_events as orderbook-scoped artifacts derived by filtering global store Set logs by orderbook tx context. Remove it from “Shared Side Effects,” keep it in delete/replace and reorg rollback lists, and ensure dumps are orderbook-filtered.
- Option B: Treat store_set_events as globally shared side-effects. Then remove it from delete/replace and rollback lists, keep only INSERT/UPSERT semantics, and clarify how cross‑orderbook Set events affect each orderbook’s state.
Provide a single authoritative definition and update “Implications,” “Reorg Recovery,” “Merge Policy,” and “Dump Generation” accordingly.
Also applies to: 258-264, 406-419, 641-646
419-426: Permit safe upgrades of side‑effect rows (token_info, interpreter_bytecode).
“Insert if missing, otherwise skip” blocks remote improvements (e.g., upgrading failed or stale token_info). Change to UPSERT with safe precedence, e.g.:
- Prefer remote when remote.last_updated > local.last_updated, or remote.fetch_succeeded = 1 and local.fetch_succeeded = 0.
- Limit updates to provided columns to avoid clobbering unrelated local fields.
Also applies to: 120-124
> networks:
>   arbitrum:
>     rpcs:
>       - https://arbitrum-one-rpc.publicnode.com
>     chain-id: 42161
>     network-id: 42161
>     currency: ETH
🧹 Nitpick | 🔵 Trivial
Clarify chain-id vs network-id to avoid ambiguity.
Having both chain-id and network-id invites misconfiguration. Prefer chain-id (EIP‑155) as the canonical identifier; drop or clearly define network-id usage.
If network-id is required by any component, please point to that dependency so the spec can document the exact need and mapping.
🤖 Prompt for AI Agents
In i9r/raindex-local-db around lines 36 to 43, the config currently includes
both chain-id and network-id which can cause ambiguity; update the spec to use
chain-id (EIP-155) as the canonical identifier by removing the network-id field
(or, if some legacy component requires it, rename it to network-id-legacy and
add a comment linking to the dependent component and its exact requirement); if
you must keep network-id, add a precise comment documenting the dependency (with
repo/package and version) and the mapping between network-id and chain-id so the
spec is unambiguous.
i9r/raindex-local-db
Outdated
> 6. For each batch (with max_concurrent_batches parallelism):
>    a. REORG CHECK: Fetch block_hash for last_synced_block from RPC
>       - Compare with last_synced_block_hash in sync_state
>       - If hashes don't match: REORG DETECTED → execute reorg recovery
>
>    b. Fetch logs via eth_getLogs filtered by orderbook address
Per‑orderbook batch concurrency risks atomicity; enforce single in‑flight DB batch per orderbook.
“max_concurrent_batches” within the same orderbook can race on last_synced_block, commit out of order, undermine reorg checks (6a), and violate “complete, consistent state per block.” Allow parallel RPC fetching, but serialize DB transactions and sync_state updates per orderbook (single in‑flight batch). Explicitly require a per‑orderbook mutex/advisory lock and deterministic commit order.
Also applies to: 218-221
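If the per-orderbook lock also needs to hold across processes (not just threads), one option is to back it with a small lock table in SQLite. The table and statements below are a sketch under that assumption; the spec does not define any such table:

```sql
-- Hypothetical DB-backed per-orderbook write lock; a batch writer proceeds only
-- if its acquire UPDATE changed a row (changes() = 1).
CREATE TABLE IF NOT EXISTS sync_locks (
    network           TEXT NOT NULL,
    orderbook_address TEXT NOT NULL,
    holder            TEXT,               -- NULL when the lock is free
    acquired_at       INTEGER,
    PRIMARY KEY (network, orderbook_address)
);

-- Try to acquire: succeeds only when no other batch holds the lock.
UPDATE sync_locks
SET holder = :worker_id, acquired_at = strftime('%s','now')
WHERE network = :network
  AND orderbook_address = :orderbook
  AND holder IS NULL;

-- Release after the batch COMMIT/ROLLBACK.
UPDATE sync_locks
SET holder = NULL, acquired_at = NULL
WHERE network = :network
  AND orderbook_address = :orderbook
  AND holder = :worker_id;
```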
> d. BEGIN TRANSACTION
>
> e. Insert raw logs into raw_logs table
>
> f. Extract transaction hashes from raw_logs
>
> g. Parse AddOrder logs to identify new store addresses
Define canonical log ordering and idempotency constraints.
Add explicit processing order by (block_number ASC, transaction_index ASC, log_index ASC). Specify unique keys and UPSERTs for raw_logs and derived tables to ensure idempotency under retries/merges. Example constraints:
- raw_logs: UNIQUE(network, block_number, transaction_hash, log_index)
- orders/trades/vault_balance_changes/store_set_events: deterministic primary keys and ON CONFLICT DO NOTHING/UPSERT as appropriate.
Also applies to: 206-214
🤖 Prompt for AI Agents
In i9r/raindex-local-db around lines 190-196 (and also apply same changes to
lines 206-214), the processing lacks a canonical deterministic log ordering and
idempotency guarantees; enforce processing order by sorting by (block_number
ASC, transaction_index ASC, log_index ASC) when selecting/iterating logs, add a
UNIQUE constraint on raw_logs like (network, block_number, transaction_hash,
log_index), and change inserts into raw_logs and derived tables (orders, trades,
vault_balance_changes, store_set_events) to use deterministic primary keys and
UPSERT semantics (ON CONFLICT DO NOTHING or ON CONFLICT (...) DO UPDATE SET ...
with deterministic columns) so repeated runs/merges are idempotent; ensure
derived-table key choices are stable (e.g., composed from network +
transaction_hash + log_index or an actual domain PK) and document the
ordering/constraints in the code comments where the steps (d–g) are implemented.
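A hedged SQLite sketch of the uniqueness and ordering constraints suggested above; table and column names follow the quoted excerpt but may not match the final schema (transaction_index in particular is an assumption):

```sql
-- Identity constraint so re-running a batch or merging a dump cannot duplicate rows.
CREATE UNIQUE INDEX IF NOT EXISTS idx_raw_logs_identity
    ON raw_logs (network, block_number, transaction_hash, log_index);

-- Idempotent insert against that index.
INSERT INTO raw_logs (network, block_number, transaction_index,
                      transaction_hash, log_index, address, data)
VALUES (:network, :block_number, :transaction_index,
        :transaction_hash, :log_index, :address, :data)
ON CONFLICT (network, block_number, transaction_hash, log_index) DO NOTHING;

-- Canonical processing order when deriving orders/trades/etc. from raw logs.
SELECT *
FROM raw_logs
WHERE network = :network
  AND block_number BETWEEN :from_block AND :to_block
ORDER BY block_number ASC, transaction_index ASC, log_index ASC;
```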
i9r/raindex-local-db
Outdated
> h. Fetch Store Set events:
>    - Query tracked_stores for this orderbook to get all store addresses
>    - For each tracked store:
>      * Fetch Set events via eth_getLogs for batch block range
>      * Filter Set events to only those in orderbook transaction hashes
>      * Insert Set event raw_logs and store_set_events records
🧹 Nitpick | 🔵 Trivial
Reduce RPC load when fetching Set events across many stores.
Clarify implementation to:
- Use topics[0] filter for Set events and pass multiple addresses per eth_getLogs call (chunked to provider limits).
- Cache negative results per (store, block_range).
- Bound per‑batch address fan‑out to avoid provider timeouts.
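If negative-result caching is adopted, it could be persisted in a small table like the following sketch (entirely hypothetical; the spec does not define such a table). The tracked_stores columns follow the excerpt's description but are assumptions:

```sql
-- Hypothetical cache of already-scanned (store, block range) windows so empty
-- ranges are not re-queried on the next pass.
CREATE TABLE IF NOT EXISTS store_scan_cache (
    network       TEXT    NOT NULL,
    store_address TEXT    NOT NULL,
    from_block    INTEGER NOT NULL,
    to_block      INTEGER NOT NULL,
    events_found  INTEGER NOT NULL DEFAULT 0,
    PRIMARY KEY (network, store_address, from_block, to_block)
);

-- Before calling eth_getLogs, skip stores whose window is already known to be empty.
SELECT store_address
FROM tracked_stores
WHERE network = :network AND orderbook_address = :orderbook
  AND store_address NOT IN (
      SELECT store_address FROM store_scan_cache
      WHERE network = :network
        AND from_block <= :from_block AND to_block >= :to_block
        AND events_found = 0
  );
```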
> Note: Side effect tables (token_info, interpreter_bytecode) are NOT rolled back
> as they represent immutable on-chain data. tracked_stores is also not rolled back
> as stores remain relevant even after a reorg.
tracked_stores not rolled back can cause stale scans; add validation/pruning.
If a reorg eliminates an AddOrder that introduced a store, keeping tracked_stores forever causes unnecessary Set‑event scans. Add a mechanism to:
- Mark entries “suspect” when rolling back past first_seen_block and re‑validate, or
- Maintain last_seen_block and periodically prune stores not referenced by any canonical AddOrder up to last_synced_block.
Also applies to: 197-205
🤖 Prompt for AI Agents
In i9r/raindex-local-db around lines 277-279 (also apply same fix to lines
197-205), tracked_stores is never rolled back which can cause stale Set-event
scans after reorgs; implement a two-part fix: when rolling back past a store's
first_seen_block mark that tracked_store entry as “suspect” (add a suspect flag
and timestamp) so the next scan will re-validate the store against canonical
AddOrder data, and additionally maintain/track last_seen_block for each store
and run a periodic pruning job that, up to last_synced_block, removes stores not
referenced by any canonical AddOrder (or marks them stale) to avoid perpetual
scanning of orphaned stores. Ensure rollback code sets suspect=true when
appropriate and the periodic job queries canonical AddOrders up to
last_synced_block to prune or clear suspect flags accordingly.
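A possible SQL shape for the suspect/prune mechanism described above; the suspect, suspect_at, and first_seen_block columns on tracked_stores, and the store_address column on orders, are assumptions used purely for illustration:

```sql
-- 1) During reorg rollback past a store's first sighting, mark it suspect.
UPDATE tracked_stores
SET suspect = 1, suspect_at = strftime('%s','now')
WHERE network = :network
  AND orderbook_address = :orderbook
  AND first_seen_block > :common_ancestor_block;

-- 2) Periodic prune: drop suspect stores that no canonical AddOrder references
--    up to the current last_synced_block.
DELETE FROM tracked_stores
WHERE network = :network
  AND orderbook_address = :orderbook
  AND suspect = 1
  AND store_address NOT IN (
      SELECT store_address FROM orders
      WHERE network = :network
        AND orderbook_address = :orderbook
        AND added_block <= (SELECT last_synced_block FROM sync_state
                            WHERE network = :network
                              AND orderbook_address = :orderbook)
  );
```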
> ## Remote Bootstrap
>
> ### Remote Discovery
>
> When local client determines it needs to bootstrap from remote:
> ```
> 1. Fetch remote metadata: GET {remote_url}/metadata
>    Response: {
>      schema_version: 1,
>      orderbooks: [
>        {
>          network: "mainnet",
>          address: "0x...",
>          last_block: 18500000,
>          block_hash: "0xabcd...",
>          timestamp: 1704067200
>        }
>      ]
>    }
>
> 2. Validate schema_version matches local
>
> 3. For each configured orderbook that exists in remote:
>    a. Compare remote's last_block and block_hash with local sync_state
>    b. If remote is ahead OR block_hashes differ at same height:
>       - Fetch dump: GET {remote_url}/dump/{network}/{orderbook}
>    c. Otherwise skip (local is current)
> ```
Serialize remote merges with RPC sync for the same orderbook.
Remote merge “within a transaction” is not sufficient if RPC sync is concurrently running on that orderbook. Specify an exclusive per‑orderbook lock so remote merge pauses RPC processing (and vice versa) to avoid interleaved writes and torn sync_state.
Also applies to: 390-404
🤖 Prompt for AI Agents
In i9r/raindex-local-db around lines 351 to 379 (also apply same change at
390-404), remote merge logic currently runs "within a transaction" but does not
prevent concurrent RPC sync tasks for the same orderbook from interleaving
writes; add an exclusive per-orderbook lock (e.g., keyed mutex by
network+orderbook) that must be acquired by both the remote-merge path and the
RPC-sync path before performing any dump/apply or sync_state updates, hold the
lock for the duration of the merge/sync (including reading/writing sync_state),
and ensure lock acquisition is timeout- or cancel-aware and released in all
error paths so remote merges and RPC syncs are fully serialized for the same
orderbook.
> **Resolution**: Trust the remote dump. Rationale:
> - Remote servers typically have better infrastructure and connectivity
> - Remote servers are more likely to be on the canonical chain
> - Local can detect its own reorgs via block hash checks during sync
> - If remote is wrong, local will detect mismatch on next RPC sync and self-correct
> ```
> Action when hash mismatch detected:
> 1. Log warning about potential reorg
> 2. Replace local data with remote dump (per merge policy)
> 3. Resume RPC sync from remote's last_synced_block
> 4. RPC sync will validate chain continuity and detect if remote was on wrong fork
> ```
Don’t blindly “trust remote” on hash mismatch; add canonicality checks + finality guard.
Before replacing local state:
- Verify remote.block_hash with at least one independent RPC for the same height and chainId.
- Require remote.last_block ≤ (current_block − finality_depth) to avoid importing non‑final blocks.
- If verification fails, skip merge and continue RPC sync.
i9r/raindex-local-db
Outdated
> c. Include side effect tables (full tables, not filtered):
>    - token_info (all networks)
>    - interpreter_bytecode (all networks)
🧹 Nitpick | 🔵 Trivial
Scope side‑effect tables in dumps to relevant networks to keep artifacts small.
Including all networks’ side‑effects in every dump inflates size and bandwidth. Restrict token_info/interpreter_bytecode in dumps to the network(s) referenced by the orderbook, or make inclusion configurable.
Also applies to: 641-646
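In practice the dump generator would only need per-network selects such as the following sketch (assuming both side-effect tables carry a network column, which the excerpt implies but does not show):

```sql
-- Export only the rows relevant to the networks covered by this dump.
SELECT * FROM token_info           WHERE network = :network;
SELECT * FROM interpreter_bytecode WHERE network = :network;
```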
> Queries MUST filter by `block_number <= last_synced_block` to ensure they never see partial state.
>
> Example:
> ```sql
> SELECT * FROM orders
> WHERE network = ?
>   AND orderbook_address = ?
>   AND added_block <= (
>     SELECT last_synced_block
>     FROM sync_state
>     WHERE network = ? AND orderbook_address = ?
>   )
> ```
🧹 Nitpick | 🔵 Trivial
Generalize “safe query” guidance for multi-table reads.
Extend the example to joins and derived tables: every participating table must be filtered to rows with block_number/added_block ≤ last_synced_block for the same (network, orderbook). Consider providing SQL views or a helper that injects these predicates to avoid foot‑guns.
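One way to make this rule hard to get wrong is to expose views that already carry the bound. The sketch below assumes orders has removed_block and order_hash columns and trades has block_number and order_hash; none of these are shown in the quoted excerpt:

```sql
-- View that applies the last_synced_block bound (and removal bound) once.
CREATE VIEW IF NOT EXISTS safe_orders AS
SELECT o.*
FROM orders o
JOIN sync_state s
  ON s.network = o.network AND s.orderbook_address = o.orderbook_address
WHERE o.added_block <= s.last_synced_block
  AND (o.removed_block IS NULL OR o.removed_block > s.last_synced_block);

-- Multi-table read: every participating table is bounded by the same block.
SELECT t.*
FROM trades t
JOIN safe_orders o
  ON o.network = t.network
 AND o.orderbook_address = t.orderbook_address
 AND o.order_hash = t.order_hash
JOIN sync_state s
  ON s.network = t.network AND s.orderbook_address = t.orderbook_address
WHERE t.block_number <= s.last_synced_block;
```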
> Returns metadata about available dumps.
> ```json
> {
>   "schema_version": 1,
>   "dumps": [
>     {
>       "network": "mainnet",
>       "orderbook": "0x1234...",
>       "last_block": 18500000,
>       "block_hash": "0xabcd...",
>       "timestamp": 1704067200,
>       "size_bytes": 52428800,
>       "url": "/dump/mainnet/0x1234..."
>     }
>   ]
> }
> ```
Strengthen metadata and artifact integrity.
Augment metadata with:
- chainId and genesisBlockHash (not just “network” string).
- sha256 (or stronger) of the gz dump, size_bytes, and content schema hash.
- Optionally a signature to authenticate the dump origin.
Clients must verify these before merge.
Also applies to: 357-370
🤖 Prompt for AI Agents
In i9r/raindex-local-db around lines 615 to 631 (and also apply same changes to
lines 357 to 370), the dump metadata is too weak — replace/augment the current
"network" string with explicit chainId and genesisBlockHash fields, add a sha256
checksum for the gzipped dump, ensure size_bytes is accurate, add a
content_schema_hash (hash of the dump's JSON schema or canonicalized content)
and an optional signature field (base64) to authenticate origin; update the JSON
schema_version if needed and ensure any code that produces or reads these dumps
computes and validates sha256, content_schema_hash and signature before
accepting a dump.
Actionable comments posted: 2
♻️ Duplicate comments (12)
i9r/raindex-local-db (12)
36-43: Clarify chainId vs network-id; prefer chainId (EIP-155).
Having both invites misconfiguration. Keep chainId canonical; remove or clearly mark network-id as legacy.

```diff
 networks:
   arbitrum:
     rpcs:
       - https://arbitrum-one-rpc.publicnode.com
-    chain-id: 42161
-    network-id: 42161
+    chain-id: 42161  # Canonical EIP-155 chainId
+    # network-id removed to avoid ambiguity; if a legacy component needs it,
+    # name it `network-id-legacy` and document the dependency and mapping.
     currency: ETH
```
115-124: Contradictory lifecycle for store_set_events; pick one model or split tables.
You mark store_set_events “shared side effects” but also delete/replace on merge and roll them back on reorg. This is internally inconsistent and unimplementable.
Two viable fixes (choose one and apply consistently):
- Option A (split, recommended):
  - Keep a network-scoped side-effect table storing all Set events per store (store_set_events_raw) — never rolled back.
  - Materialize per-orderbook filtered rows into ob_store_set_events — delete/replace on merge and roll back on reorg.
- Option B (orderbook-scoped only):
  - Treat store_set_events as strictly orderbook-scoped everywhere; remove it from “shared side effects.”
Suggested edits:

```diff
-#### Store Set Events (store_set_events table)
-- **Scope**: Filtered by orderbook context, but stores can be shared
-- **Rationale**: Stores are shared contracts that multiple orderbooks may use
-- **Benefit**: If multiple orderbooks use the same store, Set events are available to all
+#### Store Set Events
+- Side-effect (network-scoped): `store_set_events_raw` — all Set events per (network, store_address); never rolled back.
+- Orderbook-scoped derived: `ob_store_set_events` — subset where tx_hash belongs to the orderbook's txs; rolled back on reorg and replaced on remote merge.
```

```diff
-**For orderbook-specific tables** (..., tracked_stores, store_set_events):
+**For orderbook-specific tables** (..., tracked_stores, ob_store_set_events):
```

```diff
 - Delete all records where block_number > common_ancestor_block:
   * raw_logs
   * orders
   ...
   * vault_balance_changes
   * trades
-  * store_set_events
+  * ob_store_set_events
```

```diff
-Note: Side effect tables (token_info, interpreter_bytecode) are NOT rolled back
-as they represent immutable on-chain data. tracked_stores is also not rolled back
-as stores remain relevant even after a reorg.
+Note: Side-effect tables (token_info, interpreter_bytecode, store_set_events_raw) are NOT rolled back.
+Orderbook-scoped derived tables (e.g., ob_store_set_events) ARE rolled back.
```

Also applies to: 258-265, 277-279, 406-417
181-186: Serialize DB batches per orderbook; allow parallel fetch only.
Parallel in-orderbook DB transactions race on last_synced_block and can commit out of order, breaking atomicity/reorg guarantees. Use a per-orderbook mutex; one in-flight DB batch at a time. Parallelize RPC fetches only.

```diff
-6. For each batch (with max_concurrent_batches parallelism):
+6. Fetch multiple ranges in parallel if desired, but:
+   - Acquire a per-(network, orderbook) mutex before BEGIN TRANSACTION.
+   - At most one DB batch may be in-flight per orderbook (writes serialized).
+   - Release the mutex after COMMIT/ROLLBACK.
```

```diff
-2. Transaction Boundaries: Every batch is atomic. All operations (...) happen within a single transaction.
+2. Transaction Boundaries: Every batch is atomic. DB writes are serialized per orderbook (single in-flight batch). RPC fetches may run in parallel, but their results must queue behind the mutex before writing.
```

```diff
-local-db-sync:
-  batch-size: 2000
-  max-concurrent-batches: 5
+local-db-sync:
+  batch-size: 2000
+  max-concurrent-fetches: 5    # fetch concurrency
+  serialize-db-batches: true   # enforce single writer per orderbook
```

Also applies to: 218-221, 239-241
192-205: Add canonical log ordering and idempotent constraints/UPSERTs.
Specify sort by (block_number ASC, transaction_index ASC, log_index ASC) and enforce UNIQUE keys to make retries/merges safe.

```diff
+Processing order: strictly sort logs by (block_number ASC, transaction_index ASC, log_index ASC).
```

Constraints (documentation snippet):

```sql
-- raw_logs UNIQUE(network, block_number, transaction_hash, log_index);
-- orders/trades/vault_balance_changes/ob_store_set_events:
--   choose deterministic PKs (e.g., network + transaction_hash + log_index [+ role])
--   and use UPSERTs: INSERT INTO ... ON CONFLICT (...) DO UPDATE SET ...;
--   or DO NOTHING where correct
```

Also applies to: 206-214, 230-238
199-205: Reduce RPC load for Set events (batch addresses + topics).
Batch store addresses per eth_getLogs call using topics[0] = Set signature, chunked to provider limits; cache negatives per (store, range).

```diff
-h. Fetch Store Set events:
-   - Query tracked_stores ... get all store addresses
-   - For each tracked store:
-     * Fetch Set events via eth_getLogs for batch block range
+h. Fetch Store Set events:
+   - Query tracked_stores to list store addresses.
+   - Call eth_getLogs with topics[0]=Set(...) and batched address arrays (chunked to provider limits) for the batch range.
+   - Cache negative results per (store, block_range) to avoid repeat calls.
```
277-279: tracked_stores not rolled back → stale scans; add suspect/pruning.
Reorgs that remove AddOrder can orphan stores. Mark suspect when rolling back past first_seen_block and prune if not referenced by any canonical AddOrder up to last_synced_block.

```diff
-tracked_stores is also not rolled back as stores remain relevant even after a reorg.
+tracked_stores is not rolled back, but:
+- If rollback passes a store's first_seen_block, set tracked_stores.suspect = 1, suspect_at = now().
+- A periodic job clears suspect or deletes the entry if no canonical AddOrder references the store up to last_synced_block.
```
390-404: Serialize remote merges with RPC sync for the same orderbook.
A DB transaction alone doesn’t prevent interleaving with RPC sync. Require an exclusive per-orderbook lock for both paths.

```diff
-Remote dump merges are atomic operations:
+Remote dump merges are atomic and mutually exclusive with RPC sync:
 BEGIN TRANSACTION
-  ...
+  - Acquire per-(network, orderbook) exclusive lock/mutex (timeout/cancel-aware).
+  - Pause/skip RPC sync while the lock is held.
   ...
 COMMIT TRANSACTION
```

Also applies to: 351-370
384-389: Permit safe upgrades to side-effect rows during merge.
Insert-only blocks improvements (e.g., failed token_info retained forever). Allow updates when the dump is newer/better.

```diff
-**Preserve side effects**: Never overwrite existing side effect data
+**Preserve side effects, but allow safe upgrades**:
+- Update when remote.last_updated > local.last_updated OR
+  (remote.fetch_succeeded = 1 AND local.fetch_succeeded = 0).
+- Only update fields present in the dump; don't clobber unrelated local fields.
```

```diff
-For each record in dump:
-    IF NOT EXISTS (...):
-        INSERT
-    ELSE:
-        SKIP
+For each record in dump:
+    INSERT ... ON CONFLICT(id) DO UPDATE SET
+        ...  -- only if the record is newer or strictly better (per rule above)
```

Also applies to: 419-426
438-457: Don’t blindly “trust remote” on hash mismatch; verify canonicality + finality.
Before replacing local state, independently verify remote.block_hash and ensure the dump is at least finality_depth behind head.

```diff
-**Resolution**: Trust the remote dump.
+**Resolution**: Verify the remote dump then merge:
+- Fetch the block hash for (chainId, remote.last_synced_block) from at least one independent RPC and match remote.block_hash.
+- Require remote.last_synced_block ≤ (current_block − finality_depth).
+- If verification fails, skip merge and continue RPC sync.
```
478-481: Scope side-effect tables in dumps to relevant networks.
Including all networks’ side-effects bloats artifacts. Restrict to the dump’s networks or make inclusion configurable.

```diff
-c. Include side effect tables (full tables, not filtered):
-   - token_info (all networks)
-   - interpreter_bytecode (all networks)
+c. Include side-effect tables for the relevant network(s) only (or make configurable):
+   - token_info (only networks present in this dump)
+   - interpreter_bytecode (only networks present in this dump)
```

```diff
-The dump includes:
-  - Complete side effect tables (token_info, interpreter_bytecode)
+The dump includes:
+  - Side-effect tables limited to the orderbook's network(s) (token_info, interpreter_bytecode)
```

Also applies to: 641-646
549-563: Generalize “safe querying” for joins/derived tables and removed rows.
When joining multiple tables, each must be bounded by last_synced_block for the same (network, orderbook). Also account for removed_block.

```sql
-- Example for live orders:
SELECT * FROM orders o
WHERE o.network = ? AND o.orderbook_address = ?
  AND o.added_block <= (SELECT last_synced_block FROM sync_state
                        WHERE network = ? AND orderbook_address = ?)
  AND (o.removed_block IS NULL
       OR o.removed_block > (SELECT last_synced_block FROM sync_state
                             WHERE network = ? AND orderbook_address = ?));
```

Add a note: “For multi-table queries, apply the same last_synced_block predicate to every participating table (joins/CTEs/views). Consider providing SQL views/helpers that inject these predicates.”
357-370: Strengthen dump metadata and integrity.
Add chainId and genesisBlockHash, sha256 of the gz, content schema hash, and optional signature; require clients to verify before merge.

```diff
 {
   schema_version: 1,
   orderbooks: [
     {
       network: "mainnet",
+      chain_id: 1,
+      genesis_block_hash: "0x...",
       address: "0x...",
       last_block: 18500000,
       block_hash: "0xabcd...",
       timestamp: 1704067200
     }
   ]
 }
```

```diff
 {
   "schema_version": 1,
   "dumps": [
     {
       "network": "mainnet",
+      "chain_id": 1,
+      "genesis_block_hash": "0x...",
       "orderbook": "0x1234...",
       "last_block": 18500000,
       "block_hash": "0xabcd...",
       "timestamp": 1704067200,
       "size_bytes": 52428800,
+      "sha256": "base16-hex...",
+      "content_schema_hash": "base16-hex...",
+      "signature": "base64...",  // optional
       "url": "/dump/mainnet/0x1234..."
     }
   ]
 }
```

Also applies to: 615-631
> ## Database Schema
>
> The database consists of the following key table categories:
>
> 1. **Sync State Tables**: Track synchronization progress per orderbook
> 2. **Event Tables**: Raw logs and processed events (orders, trades, deposits, withdrawals)
> 3. **State Tables**: Current state (vaults, vault balances)
> 4. **Side Effect Tables**: Token info, interpreter bytecode, store Set events
> 5. **Tracking Tables**: e.g. Which stores are being monitored per orderbook
Add a canonical blocks table to persist headers for reorg checks.
You fetch block data but don’t persist it. Store (block_number, hash, parent_hash, timestamp, safe/finalized flags) to support reorg depth scans and continuity checks.
```diff
 1. **Sync State Tables** ...
 2. **Event Tables** ...
 3. **State Tables** ...
-4. **Side Effect Tables** ...
-5. **Tracking Tables** ...
+4. **Blocks Table**: canonical headers per (network, block_number) with hash, parent_hash, timestamp, safe/finalized.
+5. **Side Effect Tables** ...
+6. **Tracking Tables** ...
```

```diff
-3. Get current_block from RPC
+3. Get current_block from RPC
+   - Persist headers for encountered blocks into `blocks` and use them for reorg detection.
```

Also applies to: 188-191
🤖 Prompt for AI Agents
In i9r/raindex-local-db around lines 72 to 81 (and also apply the same change
for lines 188 to 191), the schema lacks a canonical blocks table to persist
fetched block headers; add a new table definition that stores block_number
(bigint, primary key), hash (text), parent_hash (text), timestamp (bigint or
timestamptz), and flags for safe/finalized (booleans) and include indices on
block_number and hash; update any DB migration/schema file and README comment to
describe its purpose (reorg depth scans and continuity checks) and ensure code
paths that fetch blocks persist rows (on insert or upsert) so headers are
available for reorg checks.
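A minimal SQLite sketch of such a blocks table, with illustrative names only (the spec as quoted does not define this table):

```sql
-- Canonical block headers per network, used for reorg depth scans and
-- parent-hash continuity checks.
CREATE TABLE IF NOT EXISTS blocks (
    network      TEXT    NOT NULL,
    block_number INTEGER NOT NULL,
    hash         TEXT    NOT NULL,
    parent_hash  TEXT    NOT NULL,
    timestamp    INTEGER NOT NULL,
    is_finalized INTEGER NOT NULL DEFAULT 0,
    PRIMARY KEY (network, block_number)
);

CREATE INDEX IF NOT EXISTS idx_blocks_hash ON blocks (network, hash);
```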
> j. Perform side effects:
>    - Token info: fetch name/symbol/decimals for new tokens (non-blocking)
>    - Interpreter bytecode: fetch bytecode for new interpreters (blocking)
Don’t perform network side‑effects inside the DB transaction; use an outbox.
eth_call/eth_getCode during the transaction hold SQLite write locks, risking timeouts and torn state under retries. Stage side‑effect jobs in an outbox, commit the batch, then process jobs asynchronously/idempotently.
```diff
-j. Perform side effects:
-   - Token info ... (non-blocking)
-   - Interpreter bytecode ... (blocking)
+j. Enqueue side-effect jobs into `side_effect_jobs` (token_info fetches, interpreter bytecode).
+   - No external RPC calls inside the DB transaction.
 ...
-2. Transaction Boundaries: ... All operations (raw logs, Set events, event processing, side effects, sync_state update) happen within a single transaction.
+2. Transaction Boundaries: ... All DB writes for the batch happen within one transaction; side-effect jobs are queued and processed after COMMIT.
```

Outbox processing (new subsection):

```diff
+### Side-Effect Outbox
+- Table: `side_effect_jobs(job_id, kind, network, address, created_at, attempts, last_attempt_at, status, payload)`
+- Worker: reads pending jobs, performs eth_call/eth_getCode, UPSERTs into token_info/interpreter_bytecode, marks success/failure.
+- Idempotent: multiple executions safe; updates only when remote/local record improves.
```

Also applies to: 239-241, 329-350, 515-529
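For concreteness, the proposed outbox table could look like the following SQLite sketch; the column set mirrors the reviewer's suggestion and is not an existing schema:

```sql
-- Side-effect outbox: rows are written inside the batch transaction, then a
-- worker performs the RPC calls after COMMIT and records the outcome.
CREATE TABLE IF NOT EXISTS side_effect_jobs (
    job_id          INTEGER PRIMARY KEY AUTOINCREMENT,
    kind            TEXT    NOT NULL,           -- e.g. 'token_info' or 'interpreter_bytecode'
    network         TEXT    NOT NULL,
    address         TEXT    NOT NULL,
    payload         TEXT,                       -- optional JSON payload
    status          TEXT    NOT NULL DEFAULT 'pending',
    attempts        INTEGER NOT NULL DEFAULT 0,
    created_at      INTEGER NOT NULL,
    last_attempt_at INTEGER,
    UNIQUE (kind, network, address)             -- makes enqueueing idempotent
);
```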
Motivation
We need a clear spec and approach to validate our implementation against.
Solution
Added Raindex Event Sourcing Sync Specification V1 documentation detailing the architecture, design goals, synchronization flow, error handling, and remote server API for a local-first event-sourced system.
Checks
By submitting this for review, I'm confirming I've done the following: