This is a rewrite of the Claw Code project aimed at significantly reducing token usage. In our initial experience, we observed up to 74% token savings, with around 30% on average, without sacrificing quality.
It was adapted from `ultraworkers/claw-code` and is still an early project under active iteration.
There are two practical ways to get started here.
Start with Python if you want to inspect the porting workspace, manifest, and parity-oriented utilities:
```sh
python3 -m src.main summary
python3 -m src.main manifest
python3 -m src.main parity-audit
```

Start with Rust if you want to build and run the interactive CLI:

```sh
cd rust
cargo build --release
cargo run --bin claw --
```

Once the CLI is running, use `/status`, `/cost`, and `/compact` to inspect the current prompt state and token-saving behavior.
The optimization work is mainly focused on repeated input cost, not just one-shot prompt length.
- Smaller system prompt
  - workspace/config/git context is summarized instead of replayed raw
  - instruction files such as `CLAW.md` are converted into compact digests
  - static prompt rules were merged and shortened
- Smaller tool surface
  - normal turns expose only a minimal default tool set
  - heavier tools are unlocked later through `ToolSearch` or explicit intent
  - prompt-facing tool schemas use shorter compatible field names such as `q`, `cmd`, `text`, `old`, and `new`
- Smaller replay
  - tool inputs/results are compacted before being replayed to the model
  - old assistant/user/system text is clipped more aggressively
  - only the latest preview-worthy read/search/shell result keeps rich preview text
- Earlier compaction
  - long sessions auto-compact based on replay cost
  - compacted summaries are shorter and more structured
- Lower helper overhead
  - helper flows and sub-agents use lighter prompts
  - Anthropic-compatible requests enable prompt caching by default
- Better measurement
  - `claw token-audit` reports prompt, tool, and replay cost
  - `claw token-audit --example-suite` runs the built-in benchmark corpus
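The replay-compaction idea above can be sketched roughly as follows. This is a minimal illustration, not the project's actual implementation: the function names (`clip`, `compact_replay`) and the clipping limits are invented for the example.

```python
def clip(text: str, limit: int) -> str:
    """Clip long text, leaving a marker so the model knows content was elided."""
    return text if len(text) <= limit else text[:limit] + " …[clipped]"


def compact_replay(messages: list[dict], old_limit: int = 200,
                   preview_limit: int = 2000) -> list[dict]:
    """Compact a session transcript before replaying it to the model.

    Older turns are clipped aggressively; only the latest tool result
    keeps its rich preview text, mirroring the "only the latest
    preview-worthy result keeps rich preview text" rule.
    """
    # Find the most recent tool result; it alone keeps the generous limit.
    last_tool_idx = max(
        (i for i, m in enumerate(messages) if m["role"] == "tool"),
        default=None,
    )
    compacted = []
    for i, msg in enumerate(messages):
        limit = preview_limit if i == last_tool_idx else old_limit
        compacted.append({**msg, "content": clip(msg["content"], limit)})
    return compacted
```

Because this runs on every turn's replayed history, even modest per-message clipping compounds into large repeated-input savings over a long session.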
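Prompt caching on Anthropic-compatible requests works by marking stable prefix blocks as cacheable, so repeated turns reuse the cached prefix instead of paying full input price for it each time. A minimal sketch of such a payload follows; the model name and prompt text are placeholders, not values from this project.

```python
# Sketch of an Anthropic-style Messages API payload with prompt caching.
# The stable system prompt is tagged with cache_control so that subsequent
# requests sharing this prefix can hit the cache.
payload = {
    "model": "claude-sonnet-4-5",  # placeholder model name
    "max_tokens": 1024,
    "system": [
        {
            "type": "text",
            "text": "You are a compact coding assistant.",  # stable prefix
            "cache_control": {"type": "ephemeral"},  # mark block cacheable
        }
    ],
    "messages": [
        {"role": "user", "content": "Summarize src/main.rs"},
    ],
}
```

Only the per-turn `messages` vary between requests; keeping the cacheable prefix byte-identical is what makes the cache effective.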
Use these commands to inspect the effect:
```sh
cargo run --bin claw -- system-prompt
cargo run --bin claw -- token-audit
cargo run --bin claw -- token-audit --session session.json --output-format json
cargo run --bin claw -- token-audit --example-suite
cargo run --bin claw --   # then use /status, /cost, /compact
```

- Example benchmark spread ranges from about `24%` to `79%` token savings across different workloads.
- One representative early case in this snapshot is `6427 -> 1646` tokens, or about `74.39%` lower than the original reference baseline.
- These numbers are a current benchmark snapshot from an actively evolving project, not a final claim about every real-world workload.
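The savings figures are plain relative reductions against the origin baseline; the representative case above works out as:

```python
def savings_pct(origin_tokens: int, current_tokens: int) -> float:
    """Percent token reduction relative to the origin baseline."""
    return round((origin_tokens - current_tokens) / origin_tokens * 100, 2)


print(savings_pct(6427, 1646))  # -> 74.39
```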
Illustrative `claw token-audit --example-suite` workloads (2026-03-31):
| Case | Description | Origin Tokens | Current Tokens | Savings | Pass |
|---|---|---|---|---|---|
| `repo-code-summarizer` | A repository code summarization workflow that condenses the structure, key modules, and implementation patterns of a codebase. | 6427 | 1646 | 74.39% | Yes |
| `shopify-theme-checkout` | A medium-sized storefront task covering cart, checkout, and theme files in a live e-commerce codebase. | 1842 | 648 | 64.82% | Yes |
| `marketplace-monorepo-launch` | A very large multi-turn engineering task across web app, API, auth, billing, and admin surfaces in one monorepo. | 52041 | 27487 | 47.18% | Yes |
| `warehouse-job-queue-backend` | A follow-up backend task for a warehouse operations system handling robot job queues, retries, and status updates. | 7032 | 4667 | 33.64% | Yes |
| `cli-env-toggle` | A tiny configuration tweak for a command-line tool with minimal surrounding context. | 687 | 521 | 24.16% | Yes |
The Origin Tokens column shows illustrative workload sizes chosen to demonstrate the relative savings range across different task types.
The optimization work was achieved through a combination of Codex, Grok, and the IterX code optimization tool. Together, they were used to iterate on prompt reduction, tool-surface minimization, replay compaction, and measurement-driven refinement across the harness.
This project is based on Claw Code's CLI/runtime pattern, especially around session history, prompt construction, tool orchestration, tool replay, and context compaction.