
Tokenless Claw Code

This is a rewrite of the Claw Code project aimed at significantly reducing token usage. In early testing, we observed token savings of up to 74%, averaging around 30%, without sacrificing quality.

It was adapted from ultraworkers/claw-code.

It is still an early project and under active iteration.

Getting Started

There are two practical ways to get started.

Start with Python if you want to inspect the porting workspace, manifest, and parity-oriented utilities:

python3 -m src.main summary
python3 -m src.main manifest
python3 -m src.main parity-audit

Start with Rust if you want to build and run the interactive CLI:

cd rust
cargo build --release
cargo run --bin claw --

Once the CLI is running, use /status, /cost, and /compact to inspect the current prompt state and token-saving behavior.

Discovered Strategies for Saving Tokens

The optimization work focuses mainly on repeated input cost, not just one-shot prompt length.

  • Smaller system prompt
    • workspace/config/git context is summarized instead of replayed raw
    • instruction files such as CLAW.md are converted into compact digests
    • static prompt rules were merged and shortened
  • Smaller tool surface
    • normal turns expose only a minimal default tool set
    • heavier tools are unlocked later through ToolSearch or explicit intent
    • prompt-facing tool schemas use shorter compatible field names such as q, cmd, text, old, and new
  • Smaller replay
    • tool inputs/results are compacted before being replayed to the model
    • old assistant/user/system text is clipped more aggressively
    • only the latest preview-worthy read/search/shell result keeps rich preview text
  • Earlier compaction
    • long sessions auto-compact based on replay cost
    • compacted summaries are shorter and more structured
  • Lower helper overhead
    • helper flows and sub-agents use lighter prompts
    • Anthropic-compatible requests enable prompt caching by default
  • Better measurement
    • claw token-audit reports prompt, tool, and replay cost
    • claw token-audit --example-suite runs the built-in benchmark corpus

Use these commands to inspect the effect:

cargo run --bin claw -- system-prompt
cargo run --bin claw -- token-audit
cargo run --bin claw -- token-audit --session session.json --output-format json
cargo run --bin claw -- token-audit --example-suite
cargo run --bin claw --           # then use /status, /cost, /compact

Results

  • Across the example benchmarks, savings range from about 24% to 79% depending on workload.
  • One representative early case in this snapshot is 6427 -> 1646 tokens, or about 74.39% lower than the original reference baseline.
  • These numbers are a current benchmark snapshot from an actively evolving project, not a final claim about every real-world workload.

Illustrative claw token-audit --example-suite workloads (2026-03-31):

| Case | Description | Origin Tokens | Current Tokens | Savings | Pass |
| --- | --- | --- | --- | --- | --- |
| repo-code-summarizer | A repository code summarization workflow that condenses the structure, key modules, and implementation patterns of a codebase. | 6427 | 1646 | 74.39% | Yes |
| shopify-theme-checkout | A medium-sized storefront task covering cart, checkout, and theme files in a live e-commerce codebase. | 1842 | 648 | 64.82% | Yes |
| marketplace-monorepo-launch | A very large multi-turn engineering task across web app, API, auth, billing, and admin surfaces in one monorepo. | 52041 | 27487 | 47.18% | Yes |
| warehouse-job-queue-backend | A follow-up backend task for a warehouse operations system handling robot job queues, retries, and status updates. | 7032 | 4667 | 33.64% | Yes |
| cli-env-toggle | A tiny configuration tweak for a command-line tool with minimal surrounding context. | 687 | 521 | 24.16% | Yes |

The Origin Tokens column shows illustrative workload sizes chosen to demonstrate the relative savings range across different task types.
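The Savings column follows directly from the two token counts. As a quick check on the headline case:

```python
# Percent savings relative to the origin baseline.

def savings_pct(origin_tokens, current_tokens):
    """Token savings as a percentage of the origin token count."""
    return (origin_tokens - current_tokens) / origin_tokens * 100

print(round(savings_pct(6427, 1646), 2))  # repo-code-summarizer case -> 74.39
```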

How The Optimization Was Achieved

The optimization work was achieved through a combination of Codex, Grok, and the IterX code optimization tool. Together, they were used to iterate on prompt reduction, tool-surface minimization, replay compaction, and measurement-driven refinement across the harness.

Acknowledgement

This project is based on Claw Code's CLI/runtime pattern, especially around session history, prompt construction, tool orchestration, tool replay, and context compaction.
