Add public docs page for hosted scorers (preview) by dmontagu · Pull Request #2049 · pydantic/logfire

dmontagu · 2026-07-02T03:44:01Z

What

Adds docs/guides/web-ui/scorers.md — the public docs page for the hosted-scorers preview — plus the mkdocs nav entries (site nav + llmstxt), placed right after Live Evaluations.

The page covers:

- What scorers are (platform-run LLM-as-judge over live agent runs; counterpart to Pydantic AI online evals, but with no evaluator code and no judge API key of your own).
- How the select → judge → write-back loop works, and that scores land as `gen_ai.evaluation.result` OTel log events.
- Creating a scorer from the agent detail page's Scorers tab (field list matches the live UI form: Score name / Rubric / Sample rate / Enabled).
- The dry-run-before-enable loop (Dry-run on recent runs; nothing written back, no quota spent).
- Preview quota: free, hard cap of 10,000 scores per project per month.
- Transparency note: written-back scores are ordinary ingested telemetry and count toward ingest usage.
- Viewing scores in Live Evaluations, the trace view, and via SQL in Explore.
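
For reviewers who want a mental model of the loop the page describes, here is a minimal Python sketch of judge and write-back with the dry-run and quota rules applied. It is illustrative only and not the platform's implementation: `Run`, `Verdict`, `score_runs`, the stub judge, and the dictionary keys are hypothetical names. The only details taken from this PR are the event name `gen_ai.evaluation.result`, the 0-1 score with a reason, the rule that a dry run writes nothing and spends no quota, and the cap of 10,000 scores per project per month.

```python
from dataclasses import dataclass
from typing import Callable

MONTHLY_CAP = 10_000  # preview quota stated in this PR
EVENT_NAME = "gen_ai.evaluation.result"  # event name stated in this PR


@dataclass
class Run:
    """Hypothetical stand-in for one selected live agent run."""
    trace_id: str
    output: str


@dataclass
class Verdict:
    """What the judge returns: a score in [0, 1] plus a reason."""
    score: float
    reason: str


def score_runs(
    runs: list[Run],
    judge: Callable[[Run], Verdict],
    used_this_month: int = 0,
    dry_run: bool = False,
) -> tuple[list[dict], list[Verdict], int]:
    """Judge each run and return (written events, dry-run previews, quota used)."""
    written: list[dict] = []
    previews: list[Verdict] = []
    for run in runs:
        # Hard cap: stop writing scores once the monthly quota is spent.
        if not dry_run and used_this_month >= MONTHLY_CAP:
            break
        verdict = judge(run)
        if dry_run:
            # Dry run: show the verdict, write nothing, spend no quota.
            previews.append(verdict)
            continue
        # Write-back: the score becomes an ordinary telemetry event.
        # The key names below are illustrative, not the real attribute names.
        written.append(
            {
                "event_name": EVENT_NAME,
                "trace_id": run.trace_id,
                "score": verdict.score,
                "reason": verdict.reason,
            }
        )
        used_this_month += 1
    return written, previews, used_this_month
```

Calling `score_runs(..., dry_run=True)` first and only then switching to `dry_run=False` mirrors the dry-run-before-enable workflow described above.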

Verification

Drafted and then verified end-to-end against a running platform stack with the feature enabled: UI labels and form fields checked against the real Scorers tab (one inaccuracy fixed in the second commit — the form has no separate "Name" field), and the example SQL query was run verbatim and returned real score rows.

Ships together with

- pydantic/unified-docs#53: a one-line addition to the Evaluate section's `include:` allow-list in `src/config/libraries.ts`. Without it this page never renders on the docs site (mkdocs nav only controls ordering there). This PR must merge first; the include references the page it adds.
- The platform-side feature is in preview behind the `hosted_scorers` flag (platform draft PR #25540 and follow-ups). This PR should merge as part of opening the public preview, not before.

https://claude.ai/code/session_01JsVLds2HfEKkcBU9t37H71

Document the preview "Scorers" feature: platform-run LLM-judge evaluators that continuously score an agent's live runs server-side, with no evaluator code or judge API key of your own. The page mirrors the internal walkthrough at `src/walkthroughs/hosted-evaluators/` in the platform repo and covers: creating a scorer from the agent's Scorers tab (name, rubric, score name, sample rate, enable toggle), the dry-run → refine → enable loop, the free 10,000-scores/project/month preview quota, the transparency note that scores are written back as `gen_ai.evaluation.result` telemetry and count toward ingest usage, and how to view scores in Live Evaluations, the trace view, and SQL.

Added to the "Evaluate" nav in `mkdocs.yml` (both the site nav and the llmstxt sections), alongside the Live Evaluations guide.

Claude-Session: https://claude.ai/code/session_01JsVLds2HfEKkcBU9t37H71

Verified against a running stack: the form exposes Score name / Rubric / Sample rate / Enabled (no separate "Name" field), sample rate is a percentage with deterministic sampling, and the judge returns a 0-1 score with a reason.

Claude-Session: https://claude.ai/code/session_01JsVLds2HfEKkcBU9t37H71
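
The commit message does not say how the deterministic sampling is implemented. One common way to get that property, sketched here as an assumption rather than the platform's actual method, is to hash a stable identifier such as the trace ID into a bucket and compare the bucket with the configured percentage. `is_sampled` and its parameters are hypothetical names.

```python
import hashlib


def is_sampled(trace_id: str, sample_rate_percent: float) -> bool:
    """Decide whether a run is scored, given a sample rate in percent.

    The decision depends only on trace_id and the rate, so the same run
    always gets the same answer (assumed approach, not confirmed by the PR).
    """
    if not 0 <= sample_rate_percent <= 100:
        raise ValueError("sample_rate_percent must be between 0 and 100")
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a bucket in [0, 10000),
    # which gives 0.01% granularity.
    bucket = int.from_bytes(digest[:8], "big") % 10_000
    return bucket < sample_rate_percent * 100
```

A useful side effect of bucketing this way is that raising the rate only adds runs: any run sampled at 10% is still sampled at 50%.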

dmontagu added 2 commits July 1, 2026 17:15

dmontagu self-assigned this Jul 2, 2026

dmontagu wants to merge 2 commits into main from dm/hosted-scorers-docs
