
feat: rtk discover --share / --issue — Community filter prioritization #481

@FlorianBruniaux

Description

Context

rtk discover scans Claude Code sessions locally. This issue tracks adding the ability for users to share their discover results back to the RTK project, so we can prioritize filter development based on real-world usage data rather than guessing.

Scope: four Rust phases (Phases 1-4), a Cloudflare Worker backend (Phase 5), and an agent-assisted filter pipeline (Phase 6), shipped together.


Phase 1: Anonymization structs + logic

File: src/discover/report.rs

Add new structs for sharing (no args, no paths, no examples):

#[derive(Debug, Serialize, Deserialize, Clone)]
pub struct AnonymizedEntry {
    pub command: String,  // "curl", "terraform", "helm"
    pub count: usize,
}

#[derive(Debug, Serialize, Deserialize, Clone)]
pub struct AnonymizedMissedEntry {
    pub command: String,         // "git status", "cargo test"
    pub count: usize,
    pub rtk_equivalent: String,  // "rtk git", "rtk cargo"
}

#[derive(Debug, Serialize, Deserialize)]
pub struct AnonymizedReport {
    pub sessions_scanned: usize,
    pub total_commands: usize,
    pub already_rtk: usize,
    pub unhandled: Vec<AnonymizedEntry>,
    pub missed: Vec<AnonymizedMissedEntry>,
}

Add anonymize_report(report: &DiscoverReport) -> AnonymizedReport:

  • unsupported → unhandled: keep base_command (first word only), drop example
  • supported → missed: keep first 1-2 words of command, drop token estimates
  • Reuse existing truncate_command() from mod.rs (make it pub(crate))
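The two truncation rules above can be sketched as pure helpers. These are illustrative, not the shipped `truncate_command()`; the real structs also carry serde derives:

```rust
// Keep only the first word, stripping any leading path component:
// "/usr/bin/grep -r foo" → "grep"
fn base_command(cmd: &str) -> String {
    cmd.split_whitespace()
        .next()
        .map(|w| w.rsplit('/').next().unwrap_or(w))
        .unwrap_or("")
        .to_string()
}

// Keep the first 1-2 words: "git log --oneline -20" → "git log".
// A second word survives only if it looks like a subcommand (no leading '-').
fn short_command(cmd: &str) -> String {
    let mut words = cmd.split_whitespace();
    let first = words.next().unwrap_or("").to_string();
    match words.next() {
        Some(w) if !w.starts_with('-') => format!("{first} {w}"),
        _ => first,
    }
}
```

Arguments, flags, and paths never survive either helper, which is what makes the report safe to share.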

Add format_github_issue(report: &AnonymizedReport, version, os, arch) -> String.


Phase 2: rtk discover --issue

Files: src/main.rs, src/discover/mod.rs

Add --issue flag to Commands::Discover. When set: anonymize → render markdown → print to stdout → exit.

Output format:

## RTK Discover Report

**Version**: 0.28.0 | **OS**: macos/aarch64 | **Sessions**: 23 | **Commands**: 412

### Top Unhandled Commands (no RTK filter yet)
| Command | Count |
|---------|-------|
| curl | 45 |
| terraform | 32 |

### Missed Savings (RTK already supports these)
| Command | Count | RTK Equivalent |
|---------|-------|----------------|
| git status | 67 | rtk git |
| cargo test | 23 | rtk cargo |

---
*Generated by `rtk discover --issue` v0.28.0*

Users can then:

rtk discover --issue | pbcopy                    # paste manually
rtk discover --issue | gh issue create --title "Discover report" --body-file -
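The rendering itself is plain string building; a minimal sketch, assuming a hypothetical `Entry` type rather than the final `AnonymizedReport` fields:

```rust
struct Entry {
    command: String,
    count: usize,
}

// Render the issue body in the format shown above.
fn render_issue(
    version: &str,
    os: &str,
    arch: &str,
    sessions: usize,
    commands: usize,
    unhandled: &[Entry],
) -> String {
    let mut out = String::from("## RTK Discover Report\n\n");
    out.push_str(&format!(
        "**Version**: {version} | **OS**: {os}/{arch} | **Sessions**: {sessions} | **Commands**: {commands}\n\n"
    ));
    out.push_str("### Top Unhandled Commands (no RTK filter yet)\n");
    out.push_str("| Command | Count |\n|---------|-------|\n");
    for e in unhandled {
        out.push_str(&format!("| {} | {} |\n", e.command, e.count));
    }
    out.push_str(&format!(
        "\n---\n*Generated by `rtk discover --issue` v{version}*\n"
    ));
    out
}
```

The missed-savings table would follow the same pattern with a third column.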

Phase 3: rtk discover --share

New file: src/discover/share.rs
Files: src/main.rs, src/discover/mod.rs, src/telemetry.rs

Add --share flag. Flow:

  1. Run discover pipeline
  2. Anonymize report
  3. Show preview of exactly what will be sent
  4. Ask [y/N] confirmation
  5. POST to RTK_SHARE_URL (compile-time env, same pattern as telemetry)

Preview shows:

RTK Discover -- Sharing Report
========================================
This will send anonymized usage data to help prioritize RTK filters.

DATA TO BE SENT:
  Device:     a1b2c3d4e5f6 [pseudonymous hash]
  Version:    0.28.0
  OS/Arch:    macos/aarch64
  Sessions:   23
  Commands:   412

  Unhandled (top 8):
    curl                     45
    terraform                32

  Missed savings (top 5):
    git status               67  → rtk git
    cargo test               23  → rtk cargo

WHAT IS NOT SENT:
  - No file paths or directory names
  - No command arguments or flags
  - No example commands or output

Payload JSON:

{
  "device_hash": "a1b2c3...",
  "version": "0.28.0",
  "os": "macos",
  "arch": "aarch64",
  "report": { /* AnonymizedReport */ }
}

Also: make generate_device_hash() in telemetry.rs pub (reuse for share payload).
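The [y/N] gate in step 4 defaults to "No"; a sketch as a pure function so it can be tested without a TTY (the real code would read a line from stdin and pass it here):

```rust
// Only an explicit yes sends the report; anything else (including
// an empty line, i.e. just pressing Enter) aborts.
fn confirmed(answer: &str) -> bool {
    matches!(answer.trim().to_ascii_lowercase().as_str(), "y" | "yes")
}
```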


Phase 4: Telemetry aggregate

File: src/telemetry.rs

Enrich the existing daily ping with lightweight discover aggregate:

{
  "discover": {
    "unhandled_top10": ["curl", "terraform", "helm"],
    "missed_top10": ["git status", "cargo test"],
    "sessions_scanned": 23
  }
}

Implementation:

  • get_discover_summary() runs classify-only scan (last 7 days, current project, no output_len analysis)
  • Returns serde_json::Value::Null on any error — never breaks the ping
  • Caches result in .discover_cache marker file (skip if <23h old)
  • Runs in the existing background thread
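The 23h cache check reduces to comparing the marker file's mtime against now; a sketch with the comparison isolated from filesystem access (names are illustrative):

```rust
use std::time::{Duration, SystemTime};

const MAX_AGE: Duration = Duration::from_secs(23 * 60 * 60);

// The summary is recomputed only when the .discover_cache marker
// is older than 23 hours.
fn cache_is_fresh(mtime: SystemTime, now: SystemTime) -> bool {
    match now.duration_since(mtime) {
        Ok(age) => age < MAX_AGE,
        // mtime in the future (clock skew): treat as fresh, don't rescan
        Err(_) => true,
    }
}
```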

Phase 5: Backend — Cloudflare Worker + D1

New repo/subdirectory: rtk-share-worker/

D1 Schema

CREATE TABLE discover_reports (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  device_hash TEXT NOT NULL,
  version TEXT NOT NULL,
  os TEXT NOT NULL,
  arch TEXT NOT NULL,
  sessions_scanned INTEGER,
  total_commands INTEGER,
  already_rtk INTEGER,
  report_json TEXT NOT NULL,
  created_at TEXT DEFAULT (datetime('now'))
);

CREATE INDEX idx_device_hash ON discover_reports(device_hash);
CREATE INDEX idx_created_at ON discover_reports(created_at);

CREATE TABLE command_rankings (
  command TEXT PRIMARY KEY,
  total_count INTEGER DEFAULT 0,
  unique_devices INTEGER DEFAULT 0,
  score REAL DEFAULT 0,   -- total_count * log(unique_devices + 1)
  category TEXT DEFAULT 'unhandled',
  last_seen TEXT
);

Endpoints

  • POST /api/v1/discover-share — upsert report, update rankings, dedup by device_hash per day
  • GET /api/v1/community-stats — top 30 unhandled + top 30 missed, public, cached 1h

Scoring formula

score = total_count × log(unique_devices + 1)

Balances "one user runs curl 500x" vs "50 users each run terraform 5x" — the latter scores higher.
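In code (assuming natural log; the base only rescales all scores by a constant factor, so rankings are unchanged):

```rust
// score = total_count × log(unique_devices + 1)
// Wide adoption beats one heavy user: the log dampens raw counts
// from a single device, while device breadth multiplies them.
fn score(total_count: u64, unique_devices: u64) -> f64 {
    total_count as f64 * ((unique_devices + 1) as f64).ln()
}
```

For the example above: one device running curl 500x scores roughly 500 × ln 2 ≈ 347, while 50 devices running terraform 5x each scores roughly 250 × ln 51 ≈ 983.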


Phase 6: Agent-Assisted Filter Creation Pipeline

Closes the loop between "we know what's missing" (discover) and "someone writes the filter". Three components let a contributor go from raw command output to a merged PR with minimal friction.

6.1 — rtk analyze <cmd>

Purpose: Analyze command output and recommend a filter implementation approach (TOML stages vs Rust module), without blindly executing anything.

Input modes (safety-first):

  • stdin: helm list | rtk analyze helm
  • Fixture file: rtk analyze helm --fixture tests/fixtures/helm_list_raw.txt
  • Explicit opt-in: rtk analyze helm --run (executes the command and captures output)

Default is stdin/fixture — --run is an explicit opt-in, never automatic.

What it does:

  1. Detects output format (JSON, NDJSON, tabular, free-form text, mixed)
  2. Measures repetition ratio (high repetition → strong TOML candidate)
  3. Checks for structured fields vs unstructured prose
  4. Recommends TOML or Rust with a justification sentence
  5. If TOML: suggests which stages apply (strip_lines, keep_sections, truncate, etc.)
  6. If Rust: notes which existing module is the closest template

Decision tree (TOML vs Rust):

Output format?
├── JSON or NDJSON → Rust (structured parsing, serde)
├── Tabular (columns, headers) → TOML likely enough
│   ├── Static columns → TOML (strip_lines + keep_columns)
│   └── Dynamic/nested → Rust
├── Free-form text with sections → TOML (keep_sections + strip_lines)
└── Mixed (text + JSON blobs) → Rust

Repetition ratio?
├── >60% repeated lines → TOML (dedup stage)
└── <60% → Rust (logic needed)

Token savings estimate?
├── TOML stages alone achieve ≥60% → recommend TOML
└── Below 60% threshold → recommend Rust
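A rough cut of the tree's first checks; thresholds and helper names are illustrative, not the shipped heuristics:

```rust
use std::collections::HashSet;

#[derive(Debug, PartialEq)]
enum Rec {
    Toml,
    Rust,
}

// JSON/NDJSON detection: does the output start with a JSON value?
fn detect_json(output: &str) -> bool {
    let t = output.trim_start();
    t.starts_with('{') || t.starts_with('[')
}

// Fraction of lines that are exact repeats of an earlier line.
fn repetition_ratio(output: &str) -> f64 {
    let lines: Vec<&str> = output.lines().collect();
    if lines.is_empty() {
        return 0.0;
    }
    let unique: HashSet<&str> = lines.iter().copied().collect();
    1.0 - unique.len() as f64 / lines.len() as f64
}

// Tabular: every non-empty line splits into the same number (≥2) of columns.
fn looks_tabular(output: &str) -> bool {
    let counts: Vec<usize> = output
        .lines()
        .filter(|l| !l.trim().is_empty())
        .map(|l| l.split_whitespace().count())
        .collect();
    match counts.first() {
        Some(&n) if n >= 2 => counts.iter().all(|&c| c == n),
        _ => false,
    }
}

fn recommend(output: &str) -> Rec {
    if detect_json(output) {
        return Rec::Rust; // structured parsing via serde
    }
    if repetition_ratio(output) > 0.6 {
        return Rec::Toml; // dedup stage handles it
    }
    if looks_tabular(output) {
        return Rec::Toml; // strip_lines + keep_columns
    }
    Rec::Rust // mixed or free-form: logic needed
}
```

The real `rtk analyze` would layer the savings estimate on top of this before committing to a recommendation.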

Flags:

  • --run — execute the command and capture output (opt-in)
  • --fixture <path> — analyze output from a fixture file
  • --save-fixture <path> — save captured output to a fixture file
  • --json — machine-readable output (for agent consumption)
  • --verbose — show line-by-line analysis details

Example output:

rtk analyze helm list

Input: 847 tokens (stdin)
Format: tabular (7 columns detected)
Repetition: 12% (low)
Estimated savings with TOML: 71%

Recommendation: TOML filter
Rationale: Tabular output with static columns. TOML stages sufficient.

Suggested stages:
  - strip_lines: [regex for empty/separator lines]
  - keep_columns: [NAME, NAMESPACE, STATUS, CHART]
  - truncate: max_lines=50

Next step: rtk analyze helm list --save-fixture tests/fixtures/helm_list_raw.txt

6.2 — /create-filter slash command (Claude Code)

File: .claude/commands/create-filter.md

A Claude Code slash command contributors run in their local clone of RTK. Takes a command name, orchestrates the full filter creation loop.

Flow:

  1. Runs rtk analyze <cmd> (or prompts for a fixture if no stdin)
  2. Scaffolds the filter: TOML file in .rtk/filters/ or Rust module in src/<cmd>_cmd.rs
  3. Creates a test fixture if not already present
  4. Writes unit tests (snapshot + token savings assertion)
  5. Runs cargo fmt && cargo clippy && cargo test — up to 3 retry loops if failures
  6. On success: commits, detects the contributor's fork remote, pushes, opens a PR toward rtk-ai/rtk:develop

Fork-aware PR creation:

  • Detects origin vs upstream remotes (standard fork setup)
  • Pushes to origin (contributor's fork), opens PR toward upstream (rtk-ai/rtk)
  • If only one remote: assumes it's the fork, warns if it looks like the main repo
  • PR title follows RTK convention: feat: add <cmd> filter (X% token savings)

Retry loop (max 3 iterations):

cargo test fails
  → Claude reads error output
  → Fixes the filter or test
  → Re-runs cargo test
  → If still failing after 3 attempts: stops, reports what's stuck

Contributor requirements: local Rust toolchain + Claude Code + their own API credits. No special RTK infrastructure needed.


6.3 — rtk discover --suggest

Purpose: Enrich the unhandled commands table with a "Recommendation" column, reusing heuristics from analyze on already-captured output — no re-execution of commands.

Output (new column in the existing unhandled table):

| Command | Sessions | Count | Recommendation |
|---------|----------|-------|----------------|
| helm | 8 | 234 | TOML (tabular, 71% est.) |
| terraform | 5 | 156 | Rust (JSON output) |
| kubectl logs | 3 | 89 | TOML (repetitive lines, 80% est.) |

Implementation:

  • Reuses output_content already stored by discover (no re-execution)
  • Applies the TOML vs Rust decision tree from analyze heuristics
  • Falls back to "Unknown" if no output was captured for that command
  • Adds savings estimate when output sample is available

Files added/modified (Phase 6)

| File | Change |
|------|--------|
| src/analyze_cmd.rs | New — rtk analyze implementation |
| src/main.rs | Add Commands::Analyze variant |
| src/discover/mod.rs | Add --suggest recommendation column |
| .claude/commands/create-filter.md | New — slash command for contributors |

Files modified (Phases 1-5)

| File | Change |
|------|--------|
| src/discover/report.rs | AnonymizedReport structs + anonymize_report() + format_github_issue() |
| src/discover/share.rs | New — HTTP share logic, preview, confirmation |
| src/discover/mod.rs | mod share, route --issue/--share, truncate_command → pub(crate) |
| src/main.rs | --issue and --share flags on Commands::Discover |
| src/telemetry.rs | generate_device_hash() → pub, discover field in ping payload |
| rtk-share-worker/ | New — Cloudflare Worker + D1 backend |

Reused functions

| Function | File | Reused for |
|----------|------|------------|
| generate_device_hash() | src/telemetry.rs:82 | Device ID in share payload |
| truncate_command() | src/discover/mod.rs:230 | Base command extraction |
| classify_command() | src/discover/registry.rs | Telemetry aggregate |
| split_command_chain() | src/discover/registry.rs | Telemetry aggregate |
| ClaudeProvider | src/discover/provider.rs | Session scanning |

Privacy controls

| Control | Effect |
|---------|--------|
| telemetry.enabled = false in config.toml | Disables Phase 4 telemetry aggregate |
| RTK_TELEMETRY_DISABLED=1 | Disables Phase 4 |
| --share always requires interactive y/N | Phase 3 always explicit |
| Preview shows exact payload before sending | Full transparency |
| No args/paths/examples ever sent | All phases |
| rtk analyze never executes commands without --run | Phase 6 safety |

Unit tests to write

// report.rs
test_anonymize_strips_examples()       // UnsupportedEntry.example dropped
test_anonymize_strips_args()            // "git log --oneline -20" → "git log"
test_anonymize_strips_paths()           // "/usr/bin/grep -r foo" → "grep"
test_anonymize_preserves_counts()       // counts unchanged
test_format_github_issue_valid_md()     // output is valid markdown table
test_format_github_issue_no_paths()     // no "/" in output except markdown syntax

// share.rs
test_share_payload_serializes()         // SharePayload → valid JSON
test_share_no_url_returns_ok()          // graceful when RTK_SHARE_URL not set

// telemetry.rs
test_discover_summary_null_on_error()   // returns null, doesn't panic

// analyze_cmd.rs
test_analyze_detects_json_format()      // JSON input → Rust recommendation
test_analyze_detects_tabular_format()   // tabular input → TOML recommendation
test_analyze_high_repetition_toml()     // >60% repetition → TOML recommendation
test_analyze_json_output_flag()         // --json produces valid JSON
test_analyze_stdin_no_run()             // no --run = no command execution

Verification checklist

  • cargo fmt --all && cargo clippy --all-targets && cargo test
  • rtk discover --issue → valid markdown, no paths/args in output
  • rtk discover --share → preview shown, y/N works, POST succeeds
  • rtk discover --suggest → Recommendation column appears, no command re-execution
  • helm list | rtk analyze helm → recommendation + suggested TOML stages
  • rtk analyze helm --run → executes helm, captures output, analyzes
  • /create-filter helm in Claude Code → scaffolds filter, tests pass, PR opened
  • hyperfine 'rtk discover' before/after (no regression)
  • wrangler dev → POST payload → GET community-stats → rankings correct
