Feature Request: shadow traffic mirroring for AIGatewayRoute with per-mirror model rewriting #2137

# Summary

- What: Add a mirrors field to AIGatewayRouteRule that sends a fire-and-forget copy of each matched request to one or more shadow backends, with the AI Gateway's upstream ExtProc pipeline (model rewrite, header/body mutation) applied to the mirror leg.
- Why: Evaluate a new model, provider, or region against live production traffic without affecting client responses — a common ask for canary, regression, and A/B-eval workflows.
- Where: New field on AIGatewayRouteRule in api/v1beta1, plus follow-through in the controller and extension server.
- Impact: Backward compatible (opt-in field). The proposal suppresses cost emission on the mirror leg so shadow traffic does not double-count in LLMRequestCosts; the exact policy is open question 2.

# Problem

Shadow-traffic evaluation is the standard way to qualify a new LLM backend before flipping production routing. Today, an operator can technically add an HTTPRequestMirrorFilter by editing the controller-generated HTTPRoute out-of-band, but:

1. The change is overwritten on the next reconcile — AIGatewayRoute has no surface for it.
2. The mirror cluster does not receive the AI Gateway's upstream ExtProc filter chain, so model rewriting (x-ai-eg-model, body model field) and header/body mutations do not run. The shadow backend sees a request shaped for the primary, not for itself.
3. LLMRequestCosts dynamic metadata is emitted for the mirror leg, double-counting tokens in access logs and downstream billing pipelines.

# Proposal

Add a mirrors field to AIGatewayRouteRule. Each entry wraps an AIGatewayRouteRuleBackendRef (so it inherits modelNameOverride, headerMutation, bodyMutation) plus an HTTPRequestMirrorFilter-style percent for sampling:

```go
type AIGatewayRouteRule struct {
    // ...existing fields...
    // +optional
    // +kubebuilder:validation:MaxItems=16
    Mirrors []AIGatewayRouteRuleMirror `json:"mirrors,omitempty"`
}

type AIGatewayRouteRuleMirror struct {
    AIGatewayRouteRuleBackendRef `json:",inline"`
    // +optional
    Percent *gwapiv1.Fraction `json:"percent,omitempty"`
}
```

Semantics: responses from mirror backends are always discarded; only the primary backendRefs respond to the client. Each entry maps to one Gateway API HTTPRequestMirrorFilter.
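The one-entry-to-one-filter mapping described above could look roughly like the sketch below. It is illustrative only: `Fraction`, `Mirror`, and `MirrorFilter` are local stand-ins rather than the real gwapiv1 or AI Gateway types, `SampleRate` and `ToMirrorFilters` are hypothetical helper names, and the sketch assumes that an unset percent means every matched request is mirrored and that an unset denominator means 100.

```go
package main

import "fmt"

// Fraction is a local stand-in for gwapiv1.Fraction so the sketch compiles
// without the Gateway API module. Assumption: a nil Denominator means 100.
type Fraction struct {
	Numerator   int32
	Denominator *int32
}

// Mirror is a simplified, hypothetical view of AIGatewayRouteRuleMirror.
type Mirror struct {
	BackendName string
	// ModelNameOverride is not part of the generated filter; in the proposal
	// it is applied by the upstream ExtProc pipeline on the mirror leg.
	ModelNameOverride string
	Percent           *Fraction
}

// MirrorFilter is a simplified stand-in for one HTTPRequestMirrorFilter.
type MirrorFilter struct {
	BackendName string
	Fraction    *Fraction // nil: assumed to mirror every matched request
}

// SampleRate returns the share of requests to mirror, in the range [0, 1],
// and rejects fractions that are negative or greater than one.
func SampleRate(f *Fraction) (float64, error) {
	if f == nil {
		return 1, nil
	}
	den := int32(100)
	if f.Denominator != nil {
		den = *f.Denominator
	}
	if den <= 0 || f.Numerator < 0 || f.Numerator > den {
		return 0, fmt.Errorf("invalid fraction %d/%d", f.Numerator, den)
	}
	return float64(f.Numerator) / float64(den), nil
}

// ToMirrorFilters emits exactly one filter per mirror entry, in order.
func ToMirrorFilters(mirrors []Mirror) ([]MirrorFilter, error) {
	out := make([]MirrorFilter, 0, len(mirrors))
	for i, m := range mirrors {
		if m.BackendName == "" {
			return nil, fmt.Errorf("mirrors[%d]: backend name is required", i)
		}
		if _, err := SampleRate(m.Percent); err != nil {
			return nil, fmt.Errorf("mirrors[%d]: %w", i, err)
		}
		out = append(out, MirrorFilter{BackendName: m.BackendName, Fraction: m.Percent})
	}
	return out, nil
}
```

Validating the fraction in the controller, rather than relying only on CRD validation, would let a bad entry surface as a route status condition instead of a rejected HTTPRoute; whether that is desirable is a design choice this sketch does not settle.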

# What this enables

- Side-by-side evaluation of a new model or provider on real traffic.
- Region-failover dry runs (mirror to a candidate region, compare error rates).
- Regression testing of modelNameOverride / translation changes against production payloads.
- Per-tenant audit of model behavior without double-counting the shadow traffic in that tenant's LLMRequestCosts.

# Open questions

1. Per-mirror translation shape: should mirrors carry full modelNameOverride / headerMutation / bodyMutation, or reference an AIServiceBackend only and inherit its config? Inheriting is cleaner but forces operators to create a dedicated AIServiceBackend per shadow target.
2. Cost emission policy: always-suppress LLMRequestCosts on mirrors, opt-in via a field on the mirror entry, or controlled at BackendSecurityPolicy / GatewayConfig level?
3. Mirror-cluster naming contract: the extension server needs to identify mirror clusters in the xDS push to install (or skip) the upstream filter chain. Envoy Gateway currently emits names of the form httproute/<ns>/<name>/rule/<ruleIdx>-mirror-<mirrorIdx> with 1-based mirror indices. Is it acceptable to depend on that wire format, or should AI Gateway negotiate a stable contract / metadata-based marker with Envoy Gateway?
4. Failure handling: should mirror-leg ExtProc failures be silently absorbed (current Envoy mirror semantics) or surface as gen_ai.* error metrics tagged is_mirror=true?
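To make the dependency in question 3 concrete, here is a minimal sketch of what parsing that cluster-name format might involve. It assumes only the name shape quoted above; `ParseMirrorClusterName` and `MirrorClusterRef` are hypothetical names, not existing AI Gateway or Envoy Gateway APIs, and the rule index is passed through as-is because its base is not stated here.

```go
package main

import (
	"regexp"
	"strconv"
)

// mirrorClusterRE matches the format quoted in question 3:
// httproute/<ns>/<name>/rule/<ruleIdx>-mirror-<mirrorIdx>
var mirrorClusterRE = regexp.MustCompile(
	`^httproute/([^/]+)/([^/]+)/rule/(\d+)-mirror-(\d+)$`)

// MirrorClusterRef is a hypothetical parsed form of a mirror cluster name.
type MirrorClusterRef struct {
	Namespace   string
	RouteName   string
	RuleIndex   int // copied from the cluster name unchanged
	MirrorIndex int // converted to a 0-based index into the rule's mirrors
}

// ParseMirrorClusterName reports whether name identifies a mirror cluster
// and, if so, which route, rule, and mirror entry it belongs to.
func ParseMirrorClusterName(name string) (MirrorClusterRef, bool) {
	m := mirrorClusterRE.FindStringSubmatch(name)
	if m == nil {
		return MirrorClusterRef{}, false // primary cluster or unknown format
	}
	rule, err := strconv.Atoi(m[3])
	if err != nil {
		return MirrorClusterRef{}, false
	}
	mirror, err := strconv.Atoi(m[4])
	if err != nil || mirror < 1 {
		// Mirror indices are 1-based on the wire, so 0 is not valid.
		return MirrorClusterRef{}, false
	}
	return MirrorClusterRef{
		Namespace:   m[1],
		RouteName:   m[2],
		RuleIndex:   rule,
		MirrorIndex: mirror - 1,
	}, true
}
```

Any change to the emitted name would make this parser silently classify mirror clusters as primaries, which is the fragility a metadata-based marker would avoid.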

# Related

- Gateway API HTTPRequestMirrorFilter spec.
- #2135: maybeModifyCluster cluster-name parsing assumptions; overlaps with question 3 above.

We have a working PoC on a fork with e2e coverage and would be happy to upstream it once the design questions above are resolved.

