Routing log never captures input/output tokens — dream cost-aware tuning is blind #452

## Summary

`tier-routing-log.jsonl` entries are written with `inputTokens` / `outputTokens` **always null**, so the dream's tier-routing review has no per-entry cost data. This silently disables the cost-aware machinery added in #451 (the High-tier cost floor) and the `lowOutputAtHigh` flagged-cluster rule — both depend on per-entry tokens that are never present.

## Evidence

On the live cluster, **0 of 261** entries in `/data/agent/tier-routing-log.jsonl` have `inputTokens`/`outputTokens` populated. Subagent entries *do* carry `latencyMs` (e.g. `485678`) but leave tokens null, which is what tipped this off.

Because the tokens are null, `TierRoutingAnalyzer.BuildScan` skips every entry in its cost-delta loop (`if (e.InputTokens is not long it || e.OutputTokens is not long ot) continue;`), so `projectedCostDelta` is `null` on every threshold scan — even after #451 correctly wired `tierModelMap` through. The ceiling is currently climbing back off its floor purely on the LLM's *quality* judgment (the directive's "≥5 flips **OR** quality reason" path), not on the cost signal #451 was built to provide. So the cost floor has never actually executed against a real number in production.
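To make the failure mode concrete, here is a minimal sketch of a cost-delta loop that uses the same guard quoted above. It is not the actual `BuildScan` code: `RoutingEntry`, `CostDeltaSketch`, the per-million-token prices, and the "return null when nothing was counted" rule are all assumptions chosen to mirror the observed behaviour.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a routing-log entry; the real RockBot type may differ.
public sealed record RoutingEntry(string Tier, long? InputTokens, long? OutputTokens);

public static class CostDeltaSketch
{
    // Assumed behaviour: the delta is null when no entry carries token counts.
    // Prices are illustrative flat rates per million tokens, not real pricing.
    public static decimal? ProjectedCostDelta(
        IEnumerable<RoutingEntry> entries,
        decimal currentPricePerMTok,
        decimal candidatePricePerMTok)
    {
        decimal delta = 0m;
        int counted = 0;

        foreach (var e in entries)
        {
            // Same guard as the one quoted from BuildScan: no tokens, no contribution.
            if (e.InputTokens is not long it || e.OutputTokens is not long ot) continue;

            delta += (it + ot) * (candidatePricePerMTok - currentPricePerMTok) / 1_000_000m;
            counted++;
        }

        // With every entry skipped there is nothing to project.
        return counted == 0 ? (decimal?)null : delta;
    }
}
```

Under these assumptions, a log where every entry has null tokens always yields `null`, however the price map is wired.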

## Root cause — three write sites, none persist usable tokens

- **`src/RockBot.Subagent/SubagentRunner.cs:311`** — sets `LatencyMs` but **omits** `InputTokens`/`OutputTokens` entirely. This is the dominant source of High-tier routing decisions, so it's the most important to fix.
- **`src/RockBot.Agent/UserMessageHandler.cs:461`** — omits `InputTokens`/`OutputTokens` (and `LatencyMs`) entirely.
- **`src/RockBot.Agent/UserMessageHandler.cs:248`** — *attempts* `InputTokens = firstResponse.Usage?.InputTokenCount` / `OutputTokens = firstResponse.Usage?.OutputTokenCount`, but both land null, which suggests `firstResponse.Usage` isn't populated on the response object the handler holds at that point.

## Suggested direction

`AgentLoopRunner` already aggregates per-iteration usage internally (`src/RockBot.Host/AgentLoopRunner.cs:893,985` sum `response.Usage?.InputTokenCount` across the loop). The cleanest fix is to **surface that aggregate on the loop result** and have all three write sites persist it, rather than relying on a single `firstResponse.Usage` that may be empty. Threading the aggregated `UsageDetails` through is also what lets `ModelId`-based pricing joins work end-to-end.
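A rough sketch of what surfacing the aggregate could look like. The types below (`IterationUsage`, `LoopUsageTotals`, `UsageAggregationSketch`) are hypothetical stand-ins, not the existing `AgentLoopRunner` or `UsageDetails` API. The one design choice worth noting is that totals stay null when no iteration reported usage, so a log entry can still distinguish "provider returned nothing" from "zero tokens".

```csharp
using System;
using System.Collections.Generic;

// Hypothetical per-iteration usage, shaped like the Usage fields named in this issue.
public sealed record IterationUsage(long? InputTokenCount, long? OutputTokenCount);

// Hypothetical aggregate that a loop result could expose to the three write sites.
public sealed record LoopUsageTotals(long? InputTokens, long? OutputTokens);

public static class UsageAggregationSketch
{
    public static LoopUsageTotals Sum(IEnumerable<IterationUsage> perIteration)
    {
        long? input = null;
        long? output = null;

        foreach (var u in perIteration)
        {
            // An iteration may carry no usage object at all.
            if (u is null) continue;

            // Only start counting once a real number shows up, so "no data" stays null.
            if (u.InputTokenCount is long i) input = (input ?? 0) + i;
            if (u.OutputTokenCount is long o) output = (output ?? 0) + o;
        }

        return new LoopUsageTotals(input, output);
    }
}
```

Each write site would then copy `InputTokens` / `OutputTokens` from the totals onto the log entry instead of reading a single response's `Usage`.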

## Impact / why it matters

- #451's High-tier cost floor and the `tierModelMap` wiring are inert until this lands — the dream can't weigh cost against quality, which was the original driver of the balancedCeiling ratchet-to-floor.
- This matters most for the *future* case where High stops sharing Balanced's model and becomes a genuinely premium tier: the dream would still be blind to the cost difference and could over-route to High again, with no cost signal to correct it.
- The `lowOutputAtHigh` detection rule (over-routing signal) also can't fire without output tokens.

## Acceptance criteria

- [ ] New `tier-routing-log.jsonl` entries from subagent and user-message paths carry non-null `inputTokens`/`outputTokens` when the provider returns usage.
- [ ] `TierRoutingAnalyzer` threshold scans produce a non-null `projectedCostDelta` once tokenized entries exist.
- [ ] On the live cluster, the dream's routing notes begin citing cost deltas (not just flip counts).
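The first criterion can be spot-checked with a small audit over the log file. This is only a sketch: `RoutingLogAudit` is a hypothetical helper, and it assumes each line is a JSON object using the `inputTokens` / `outputTokens` field names quoted above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

public static class RoutingLogAudit
{
    // Counts all entries and those where both token fields are JSON numbers.
    public static (int Total, int WithTokens) Count(IEnumerable<string> jsonLines)
    {
        int total = 0, withTokens = 0;

        foreach (var line in jsonLines)
        {
            if (string.IsNullOrWhiteSpace(line)) continue;

            using var doc = JsonDocument.Parse(line);
            var root = doc.RootElement;
            total++;

            if (IsNumber(root, "inputTokens") && IsNumber(root, "outputTokens"))
                withTokens++;
        }

        return (total, withTokens);
    }

    // A missing property and an explicit JSON null both count as "not populated".
    private static bool IsNumber(JsonElement obj, string name) =>
        obj.ValueKind == JsonValueKind.Object
        && obj.TryGetProperty(name, out var value)
        && value.ValueKind == JsonValueKind.Number;
}
```

For example, `RoutingLogAudit.Count(System.IO.File.ReadLines("/data/agent/tier-routing-log.jsonl"))` returns the total entry count alongside the number of tokenized entries, which can be compared before and after the fix.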

Related: #451 (cost floor), and the routing-tuning observability gap.
