Architecture | Communication | Integrity | AI Traces | AI Artifacts | Constitution | Configuration | Guide | Evaluations | Release
transit is a fresh take on message streaming: a Rust-first, object-storage-native append-only log with native tiered storage, stream lineage, and explicit branch-and-merge semantics.
The project thesis is simple:
- the same engine should run embedded in-process or as a networked server
- object storage should be a first-class persistence layer, not an archival afterthought
- branching and merging streams should be primitives, not application hacks
- append-only history should stay immutable while new branches diverge cheaply
- verifiable lineage should attach to segments, manifests, and checkpoints without bloating every append
- AI agents, model harnesses, and human communication systems should be first-class workloads
Most streaming systems make a strong trade:
- they are excellent at ordered append and fan-out
- they are weak at representing divergence, experimentation, and conversational lineage
- they rarely treat merge and reconciliation as first-class dataflow operations
- they treat object storage as backup or offload instead of as part of the normal storage plan
transit aims at a different center:
- low-latency local append and tail for the hot path
- immutable segments persisted into object storage as part of the normal lifecycle
- stream branches and merges that reuse history without copying bytes
- one storage model that works for embedded runtimes, servers, agents, processors, and operator tools
record: immutable bytes plus headers and timestampsstream: an ordered append-only sequence of recordsbranch: a child stream created from a parent stream at a specific offsetmerge: an explicit reconciliation of two or more stream heads under a declared merge policylineage: the DAG formed by branch and merge relationshipssegment: an immutable block of ordered recordsmanifest: metadata that maps streams and branches to their segments across local and remote storagecheckpoint: a proof-bearing envelope that binds a stream head or derived state to immutable history
In transit, a branch is not a filtered consumer view. It is its own stream head with explicit ancestry.
In transit, a merge should also be explicit. It should create new lineage state with declared parents and merge policy, not silently rewrite history behind the scenes.
The initial target use cases are direct:
- AI model harnesses that need replayable traces, retries, forks, and evaluation provenance
- agent runtimes where one interaction can branch into parallel tool-use or planning paths
- a Slack-like communication system where channels are root streams and threads are native branches
- systems that need to merge branch results back into a mainline without losing provenance
- stream processing and incremental materialization over branching and merging event histories
- classifier-driven auto-threading, where a model can fork a new branch when a conversation diverges
- remote restore and audit flows that can verify immutable history instead of trusting remote storage implicitly
That auto-threading path is a core design motivator. A classifier should be able to observe a root stream, identify a new thread boundary, and create a child branch anchored to the triggering record without rewriting history.
The deeper thesis is not just "logs that can fork." It is "logs that can branch, merge, and feed deterministic derived state."
- Branches let a system diverge cheaply for retries, thread splits, alternate plans, or hypothetical work.
- Merges let those paths reconcile explicitly instead of forcing the application to pretend divergence never happened.
- Materialization lets processors build durable derived state, indexes, views, and caches from that lineage-rich history.
That suggests a product direction beyond a flat append-only log:
- the core engine should own append, branch, merge, lineage, and tiered storage
- materializers and processors may start as an adjacent first-party layer, but they should use the same manifests, checkpoints, and lineage model
- branch and merge semantics should make incremental recompute and branch-local derived state practical instead of expensive
- integrity should bind immutable segments and manifests, then grow into checkpoints and proofs without contaminating the hottest append path
- Embedded-first core with a server mode layered on the same engine
- Native tiered storage with explicit local-head and remote-object responsibilities
- O(1)-style branch creation relative to ancestor history size
- explicit, inspectable merge operations with deterministic merge policies
- Immutable acknowledged history with no silent rewrites
- staged verifiable lineage from checksums to manifest roots to checkpoints
- incremental materialization over ordered, branching, and merging histories
- Clear durability modes so latency claims and safety claims are comparable
- Benchmarkable behavior for append, replay, cold restore, tailing, and branch-heavy workloads
transit is not trying to be a general mutable database, a hidden background compactor that rewrites acknowledged history, or a queue that destroys provenance once a consumer advances.
This repository is at the bootstrap stage.
Today it contains:
- README.md
- ARCHITECTURE.md
- CONSTITUTION.md
- CONFIGURATION.md
- GUIDE.md
- EVALUATIONS.md
- RELEASE.md
- AGENTS.md
- a Rust workspace with
transit-coreandtransit-cli - a Nix flake and Rust toolchain bootstrap
- a
Justfilewith a human-facingjust screenverification path for local-engine proof, tiered publication/restore proof, networked single-node server proof, object-store probing, and the current Keel board view - a local durable engine that can append, replay, branch, merge, recover from trailing uncommitted active-head bytes, publish rolled immutable segments to object storage, and cold-restore published history from remote manifests
- an initial shared-engine server bootstrap that can open the same local engine, bind a daemon listener, shut down deterministically, and serve provisional remote root creation, append/read/tail, branch/merge, and lineage-inspection operations through a framed request/response envelope with correlation IDs, explicit acknowledgement and error semantics, and logical tail sessions with credit-based delivery, without introducing a second storage path
- a first CLI client surface for remote root creation, append, read, branch, merge, lineage inspection, and logical tail-session workflows
- a first networked mission proof path that validates the live single-node server and keeps the
transitprotocol explicitly distinct from optional secure underlays such as WireGuard - an initial
object_storeintegration with a filesystem probe command
The implementation work now has a real scaffold to grow from instead of needing to reverse-engineer direction later.
The first canonical AI workload contract now lives in AI_TRACES.md.
The first canonical communication workload contract now lives in COMMUNICATION.md.
The first verifiable-lineage contract now lives in INTEGRITY.md.
The intended surface area is:
- an embedded library for in-process append, read, tail, and branch operations
- a server daemon exposing the same semantics over a network API, starting from the shared-engine bootstrap and provisional remote root creation, append/read/tail, branch/merge, and lineage-inspection support with explicit request correlation, acknowledgement envelopes, and logical tail-session control
- a client library and CLI for operators, application runtimes, and benchmarks
The server protocol remains an application-layer contract. It can run over ordinary transports, and secure meshes such as WireGuard are optional deployment underlays rather than protocol replacements.
If a future design choice conflicts with one of these, the docs should be updated explicitly before code drifts:
- The embedded and server products share one storage engine.
- Tiered storage is a default architecture, not a premium add-on.
- Stream lineage is a product primitive.
- Durability, consistency, and benchmark scope must be explicit.
- AI and communication workloads are reference workloads, not edge cases.