A high-performance, distributed event store designed for scalable event sourcing applications
SierraDB is a modern, horizontally-scalable database specifically built for event sourcing workloads. It combines the simplicity of Redis protocol compatibility with the distributed architecture principles of Cassandra/ScyllaDB, providing developers with a powerful foundation for building event-driven systems.
Want to understand how SierraDB works under the hood? Read the detailed blog post: Building SierraDB: A Distributed Event Store Built in Rust
The post covers the architecture decisions, watermark system for consistency, distributed consensus, and the journey of building an event store from scratch.
Get SierraDB running with Docker and start storing events immediately:
# Start SierraDB
docker run -p 9090:9090 tqwewe/sierradb
# In another terminal, use redis-cli to interact with SierraDB
redis-cli -p 9090
# Append some events to a user stream
EAPPEND user-123 UserRegistered PAYLOAD '{"email":"[email protected]","name":"Alice"}'
EAPPEND user-123 EmailVerified PAYLOAD '{"timestamp":"2024-10-18T10:30:00Z"}'
EAPPEND user-123 ProfileUpdated PAYLOAD '{"bio":"Software engineer"}'
# Read all events from the stream
ESCAN user-123 - +
# Get the current version of the stream
ESVER user-123
# Subscribe to new events (will stream in real-time)
ESUB user-123 FROM LATESTThat's it! You're now running a distributed event store and can append/read events using any Redis client.
# Run with persistent data
docker run -p 9090:9090 -v ./data:/app/data tqwewe/sierradbcargo install sierradb-server
sierradb --dir ./data --client-address 0.0.0.0:9090# Connect with any Redis client
redis-cli -p 9090
# Append events to streams
EAPPEND orders order-456 OrderCreated PAYLOAD '{"total":99.99,"items":["laptop"]}'
# Read events back
ESCAN orders - + COUNT 100
# Subscribe to real-time events
ESUB orders FROM LATESTSince SierraDB uses the RESP3 protocol, you can also connect with any Redis-compatible client:
import redis
# Connect to SierraDB
client = redis.Redis(host='localhost', port=9090, protocol=3)
# Append an event
result = client.execute_command('EAPPEND', 'user-123', 'UserCreated',
'PAYLOAD', '{"name":"John","email":"[email protected]"}')
# Read events from stream
events = client.execute_command('ESCAN', 'user-123', '-', '+', 'COUNT', '100')- Redis Protocol Compatible - Uses RESP3 protocol, making it compatible with existing Redis clients and tools
- Horizontally Scalable - Distributed architecture that scales across multiple nodes with configurable replication
- High Performance - Optimized for event sourcing with consistent write performance regardless of database size
- Real-time Subscriptions - Seamless transition from historical events to live streaming for projections and event handlers
- Event Sourcing Optimized - Purpose-built for append-only event patterns with strong ordering guarantees
- Data Integrity - CRC32C checksums ensure corruption detection with automatic recovery capabilities
- Actively Developed - Under active development with extensive testing and stability improvements
SierraDB organizes data using a three-tier hierarchy designed for optimal performance and scalability:
- Default Configuration: 4 buckets containing 32 partitions (8 partitions per bucket)
- Scalability: Can be scaled up across many nodes for virtually unlimited horizontal performance
- Distribution: Events are distributed across partitions using consistent hashing of partition keys
- Segment Files: Events are written to segment files that are sealed at 256MB (configurable)
- Consistent Performance: New segments ensure write performance remains constant regardless of database size
- Parallel Processing: Writes within a partition are sequential, but parallel across different partitions
- Partition Sequences: Gapless monotonic sequence numbers within each partition
- Stream Versions: Gapless monotonic version numbers within each stream
- Consistency: Events within the same partition maintain strict ordering while allowing parallelism across partitions
SierraDB is optimized for three primary read patterns:
- Event Lookup by ID - Fast UUID-based event retrieval using event indexes
- Stream Scanning - Efficient sequential reading of events within a stream using stream indexes
- Partition Scanning - High-throughput scanning of events within a partition using partition indexes
Each segment file is accompanied by specialized index files and bloom filters for maximum query performance.
SierraDB operates as a distributed cluster similar to Cassandra/ScyllaDB:
- Cluster Coordination - Automatic node discovery and coordination using libp2p and optional mDNS
- Replication Factor - Configurable data replication (defaults to min(node_count, 3))
- Fault Tolerance - Automatic failover and data recovery across cluster nodes
- Consistent Hashing - Automatic data distribution and rebalancing
One of SierraDB's most powerful features is its subscription system, designed specifically for event sourcing projections:
- Historical + Live: Subscriptions start from any point in history and automatically transition to live events
- Guaranteed Delivery: Events are delivered in order with acknowledgment tracking
- Multiple Stream Types: Subscribe to individual streams, partitions, or combinations
- Backpressure Handling: Configurable windowing prevents overwhelmed consumers
Perfect for building:
- Event sourcing projections that need to catch up from the beginning
- Real-time event handlers that process new events as they arrive
- Read models that require both historical rebuild and live updates
SierraDB offers extensive configuration options. Key settings include:
[bucket]
count = 4 # Number of buckets
[partition]
count = 32 # Number of partitions (8 per bucket by default)
[segment]
size_bytes = 268435456 # Segment size in bytes (256MB default)
[replication]
factor = 3 # Data replication factor
buffer_size = 1000 # Replication buffer size
buffer_timeout_ms = 8000 # Buffer timeout
[network]
client_address = "0.0.0.0:9090" # Client connection address
cluster_address = "/ip4/0.0.0.0/tcp/0" # Inter-node communication
[threads]
read = 8 # Read thread pool size
write = 4 # Write thread pool sizeSee crates/sierradb-server/src/config.rs for the complete configuration reference.
SierraDB implements a comprehensive set of RESP3 commands for event operations:
Append an event to a stream with optional expected version control.
Syntax:
EAPPEND <stream_id> <event_name> [EVENT_ID <event_id>] [PARTITION_KEY <partition_key>] [EXPECTED_VERSION <version>] [TIMESTAMP <timestamp>] [PAYLOAD <payload>] [METADATA <metadata>]
Parameters:
stream_id- Stream identifier to append the event toevent_name- Name/type of the eventevent_id(optional) - UUID for the event (auto-generated if not provided)partition_key(optional) - UUID to determine event partitioningexpected_version(optional) - Expected stream version (number, "any", "exists", "empty")timestamp(optional) - Event timestamp in millisecondspayload(optional) - Event payload datametadata(optional) - Event metadata
Examples:
EAPPEND my-stream UserCreated PAYLOAD '{"name":"john"}' METADATA '{"source":"api"}'
EAPPEND orders OrderPlaced EVENT_ID 550e8400-e29b-41d4-a716-446655440000 EXPECTED_VERSION empty
Append multiple events across different streams within the same partition atomically.
Syntax:
EMAPPEND <partition_key> <stream_id1> <event_name1> [EVENT_ID <event_id1>] [EXPECTED_VERSION <version1>] [TIMESTAMP <timestamp>] [PAYLOAD <payload1>] [METADATA <metadata1>] [<stream_id2> <event_name2> ...]
Parameters:
partition_key- UUID that determines which partition all events will be written to- For each event:
stream_id,event_name, and optionalEVENT_ID,EXPECTED_VERSION,TIMESTAMP,PAYLOAD,METADATA
Examples:
EMAPPEND 550e8400-e29b-41d4-a716-446655440000 stream1 EventA PAYLOAD '{"data":"value1"}' stream2 EventB PAYLOAD '{"data":"value2"}'
Note: All events succeed or fail together as a single atomic transaction.
Retrieve an event by its unique identifier.
Syntax:
EGET <event_id>
Parameters:
event_id- UUID of the event to retrieve
Examples:
EGET 550e8400-e29b-41d4-a716-446655440000
Scan events in a stream by version range.
Syntax:
ESCAN <stream_id> <start_version> <end_version> [PARTITION_KEY <partition_key>] [COUNT <count>]
Parameters:
stream_id- Stream identifier to scanstart_version- Starting version number (use "-" for beginning)end_version- Ending version number (use "+" for end, or specific number)partition_key(optional) - UUID to scan specific partitioncount(optional) - Maximum number of events to return
Examples:
ESCAN my-stream 0 100 COUNT 50
ESCAN my-stream - + PARTITION_KEY 550e8400-e29b-41d4-a716-446655440000
Get the current version number for a stream.
Syntax:
ESVER <stream_id> [PARTITION_KEY <partition_key>]
Parameters:
stream_id- Stream identifier to get version forpartition_key(optional) - UUID to check specific partition
Examples:
ESVER my-stream
ESVER my-stream PARTITION_KEY 550e8400-e29b-41d4-a716-446655440000
Subscribe to events from one or more streams with real-time delivery.
Syntax:
# Single stream
ESUB <stream_id> [PARTITION_KEY <partition_key>] [FROM <version>] [WINDOW <size>]
# Multiple streams
ESUB <stream_id_1> [PARTITION_KEY <pk_1>] <stream_id_2> [PARTITION_KEY <pk_2>] ... [FROM LATEST | FROM <version> | FROM MAP <stream>=<ver>...] [WINDOW <size>]
Parameters:
stream_id- Stream identifier(s) to subscribe topartition_key(optional) - UUID for specific partitionFROM(optional) - Starting position (LATEST, version number, or MAP for per-stream positions)WINDOW(optional) - Maximum number of unacknowledged events
Examples:
ESUB user-123 # Single stream, latest, no window
ESUB user-123 WINDOW 100 # Single stream, latest, window 100
ESUB user-123 FROM 50 WINDOW 100 # Single stream, from version 50
ESUB user-1 user-2 user-3 FROM MAP user-1=10 user-2=20 user-3=30 WINDOW 50
Note: Establishes a persistent connection for real-time event delivery.
Scan events in a partition by sequence number range.
Syntax:
EPSCAN <partition> <start_sequence> <end_sequence> [COUNT <count>]
Parameters:
partition- Partition selector (partition ID 0-65535 or UUID key)start_sequence- Starting sequence number (use "-" for beginning)end_sequence- Ending sequence number (use "+" for end, or specific number)count(optional) - Maximum number of events to return
Examples:
EPSCAN 42 100 200 COUNT 50
EPSCAN 550e8400-e29b-41d4-a716-446655440000 - + COUNT 100
Get the current sequence number for a partition.
Syntax:
EPSEQ <partition>
Parameters:
partition- Partition selector (partition ID 0-65535 or UUID key)
Examples:
EPSEQ 42
EPSEQ 550e8400-e29b-41d4-a716-446655440000
Subscribe to events from one or more partitions with real-time delivery.
Syntax:
# All partitions
EPSUB * [FROM LATEST | FROM <sequence> | FROM MAP <p1>=<s1> <p2>=<s2>... [DEFAULT <seq>]] [WINDOW <size>]
# Single partition
EPSUB <partition_id> [FROM <sequence>] [WINDOW <size>]
# Multiple partitions
EPSUB <p1>,<p2>,<p3> [FROM LATEST | FROM <sequence> | FROM MAP <p1>=<s1> <p2>=<s2>... [DEFAULT <seq>]] [WINDOW <size>]
Parameters:
partition_id- Partition identifier(s) (* for all, comma-separated list for multiple)FROM(optional) - Starting position (LATEST, sequence number, or MAP for per-partition positions)WINDOW(optional) - Maximum number of unacknowledged events
Examples:
EPSUB * # All partitions, latest
EPSUB * WINDOW 100 # All partitions, latest, window 100
EPSUB * FROM 1000 WINDOW 100 # All partitions, from seq 1000
EPSUB 5 FROM 100 WINDOW 50 # Partition 5, from seq 100
EPSUB 1,2,3 FROM MAP 1=100 2=200 DEFAULT 0 WINDOW 500
Note: Establishes a persistent connection for real-time event delivery.
Acknowledge events up to a cursor for a subscription.
Syntax:
EACK <subscription_id> <sequence_or_version>
Parameters:
subscription_id- UUID of the subscriptionsequence_or_version- Sequence or version number to acknowledge up to
Examples:
EACK 550e8400-e29b-41d4-a716-446655440000 1000
Note: Tells SierraDB that events up to the specified position have been processed successfully.
Perform protocol handshake and retrieve server information.
Syntax:
HELLO <version>
Parameters:
version- Protocol version (currently only 3 is supported)
Examples:
HELLO 3
Returns:
server- Server name ("sierradb")version- Server versionpeer_id- Cluster peer IDnum_partitions- Number of partitions
Simple health check and connectivity test.
Syntax:
PING
Returns: PONG
Examples:
PING
SierraDB ensures data reliability through multiple mechanisms:
- CRC32C Checksums: Every event includes integrity checksums to detect corruption
- Atomic Transactions: Events within the same partition can be written transactionally
- Corruption Recovery: Automatic detection and recovery from data corruption scenarios
- Graceful Shutdown Handling: Proper recovery even after sudden system shutdowns
For distributed deployments across multiple nodes:
# Node 1
sierradb --dir ./data1 --node-index 0 --node-count 3 --client-address 0.0.0.0:9090
# Node 2
sierradb --dir ./data2 --node-index 1 --node-count 3 --client-address 0.0.0.0:9091
# Node 3
sierradb --dir ./data3 --node-index 2 --node-count 3 --client-address 0.0.0.0:9092For Rust applications, SierraDB can be embedded directly using the core sierradb crate:
use std::time::{SystemTime, UNIX_EPOCH};
use sierradb::{Database, Transaction, NewEvent, StreamId};
use sierradb::id::{NAMESPACE_PARTITION_KEY, uuid_to_partition_hash, uuid_v7_with_partition_hash};
use sierradb_protocol::ExpectedVersion;
use uuid::Uuid;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
// Open database directly
let db = Database::open("./data")?;
// Create and append events
let stream_id = StreamId::new("user-123")?;
let partition_key = Uuid::new_v5(&NAMESPACE_PARTITION_KEY, stream_id.as_bytes());
let partition_hash = uuid_to_partition_hash(partition_key);
let partition_id = uuid_v7_with_partition_hash(partition_hash);
let timestamp = SystemTime::now()
.duration_since(UNIX_EPOCH)?
.as_nanos() as u64;
let transaction = Transaction::new(
partition_key,
partition_id,
vec![NewEvent {
event_id: Uuid::new_v4(),
stream_id,
stream_version: ExpectedVersion::Any,
event_name: "UserCreated".into(),
timestamp,
payload: br#"{"name":"john"}"#.to_vec(),
metadata: vec![],
}]
)?;
let result = db.append_events(transaction).await?;
println!("Event appended with sequence: {}", result.first_partition_sequence);
Ok(())
}SierraDB provides a native Rust client built on top of the redis crate with full type support for SierraDB commands and subscriptions:
use sierradb_client::{SierraClient, SierraAsyncClientExt, SierraMessage};
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let client = SierraClient::connect("redis://localhost:9090").await?;
// Append an event with type safety
let options = EAppendOptions::new()
.payload(br#"{"name":"john"}"#)
.metadata(br#"{"source":"api"}"#);
let result = client.eappend("user-123", "UserCreated", options)?;
// Scan events from stream
let events = client.escan("user-123", "-", None, Some(50))
.execute()
.await?;
// Subscribe to real-time events
let mut manager = client.subscription_manager().await?;
let mut subscription = manager
.subscribe_to_stream_from_version("user-123", 0)
.await?;
while let Some(msg) = subscription.next_message().await? {
println!("Received event: {msg:?}");
}
Ok(())
}SierraDB is under active development and testing:
- Stability Testing: Extensive fuzzing campaigns ensure robustness under edge conditions
- Performance Validation: Consistent performance characteristics validated across various workloads
- Current Usage: Being used to build systems in development and testing environments
- Data Safety: Comprehensive corruption detection and recovery mechanisms implemented
Note: While SierraDB demonstrates strong stability in testing, it is not yet recommended for production use. The project is actively working toward production readiness.
SierraDB is being prepared for public release. Contribution guidelines and development setup will be available once the repository is made public.
SierraDB is licensed under the Apache License 2.0. See the LICENSE file for details.
Ready to scale your event sourcing architecture? SierraDB provides the performance, reliability, and developer experience you need to build robust event-driven systems.