I work on backend and distributed systems where failure has real cost.
I care about what happens after something goes wrong: preventing state corruption, containing blast radius, and designing recovery paths so systems fail safely instead of catastrophically.
I like boring systems, clear invariants, and mornings without incident alerts.
failure-first-job-queue
A reliability-first job processing system built to expose failure modes, invariants, and recovery boundaries.
Status: defining core invariants and implementing FM-001 (duplicate execution under retry).
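For readers unfamiliar with the failure mode named in FM-001, here is a minimal illustrative sketch. It is hypothetical and not taken from the repository: the `IdempotentExecutor` class, the `job-42` key, and the `charge_card` function are all assumed names. It shows one common mitigation, an idempotency key, where a retried job with a key that has already been seen returns the recorded result instead of running its side effect again.

```python
from typing import Callable, Dict


class IdempotentExecutor:
    """Hypothetical in-memory sketch: run a job at most once per idempotency key."""

    def __init__(self) -> None:
        # Maps idempotency key -> recorded result of the first successful run.
        self._results: Dict[str, object] = {}

    def run(self, key: str, job: Callable[[], object]) -> object:
        # A retry with a known key returns the stored result without re-executing.
        if key in self._results:
            return self._results[key]
        result = job()
        self._results[key] = result
        return result


charges = []  # stands in for an external side effect, e.g. charging a card


def charge_card() -> int:
    charges.append("charge")
    return len(charges)


executor = IdempotentExecutor()
first = executor.run("job-42", charge_card)
retry = executor.run("job-42", charge_card)  # simulated retry after a lost ack
```

In this sketch the side effect runs once even though the job is submitted twice. A production queue would also need the key record to be durable and written atomically with the side effect, which an in-memory dict does not provide.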
- Your backend mostly works… until it doesn’t
- You’re shipping something new and want to sleep after launch
- You need one person to take a system from idea → production, and keep it calm afterward
Roles focused on reliability, platform, infrastructure, or correctness-critical backend systems.
If failure would be expensive, irreversible, or public, I’m interested.
Talk: https://cal.com/miladtsx/intro
Selected results
- Launch: shipped investor-ready MVPs in days to weeks
- Scale: grew real-time backends from ~100 → 2,000+ concurrent users
- Stability: improved crash-free sessions from ~65% → 92%
- Cost: reduced AWS spend by ~60–70% and on-chain transaction costs by ~99% (zk batching)
How I usually work
- Turn vague ideas into concrete scope and architecture
- Design for failure modes, not just happy paths
- Prefer boring, understandable solutions over clever ones
- Keep security in mind from day one (money, data, users)