Add public docs page for hosted scorers (preview) #2049
Draft
dmontagu wants to merge 2 commits into
Document the preview "Scorers" feature: platform-run LLM-judge evaluators that continuously score an agent's live runs server-side, with no evaluator code or judge API key of your own. The page mirrors the internal walkthrough at `src/walkthroughs/hosted-evaluators/` in the platform repo and covers: creating a scorer from the agent's Scorers tab (name, rubric, score name, sample rate, enable toggle), the dry-run → refine → enable loop, the free 10,000-scores/project/month preview quota, the transparency note that scores are written back as `gen_ai.evaluation.result` telemetry and count toward ingest usage, and how to view scores in Live Evaluations, the trace view, and SQL. Added to the "Evaluate" nav in `mkdocs.yml` (both the site nav and the llmstxt sections), alongside the Live Evaluations guide.
Claude-Session: https://claude.ai/code/session_01JsVLds2HfEKkcBU9t37H71
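For reviewers who want to sanity-check how the sample rate interacts with the 10,000-scores/project/month preview quota, here is a rough back-of-the-envelope sketch. It assumes each sampled run produces exactly one score per enabled scorer, which this PR does not state; the function names are hypothetical and are not part of the platform.

```python
# Hypothetical helpers; none of these names come from the platform.
PREVIEW_QUOTA = 10_000  # free scores per project per month, from the PR text


def expected_monthly_scores(
    runs_per_month: int, sample_rate_percent: float, scorers: int = 1
) -> float:
    """Expected scores per month, ASSUMING one score per sampled run per scorer."""
    return runs_per_month * sample_rate_percent / 100 * scorers


def max_sample_rate_percent(
    runs_per_month: int, scorers: int = 1, quota: int = PREVIEW_QUOTA
) -> float:
    """Largest sample rate (as a percentage) whose expected volume fits the quota."""
    if runs_per_month <= 0 or scorers <= 0:
        return 100.0  # nothing to score, so any rate fits
    return min(100.0, 100 * quota / (runs_per_month * scorers))


if __name__ == "__main__":
    # An agent with 200,000 runs a month and a single scorer.
    print(expected_monthly_scores(200_000, 5))
    print(max_sample_rate_percent(200_000))
```

Under that assumption, an agent with 200,000 runs a month and one scorer would use the whole free quota at a 5% sample rate. What happens once the quota is exhausted is not covered here.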
Verified against a running stack: the form exposes Score name / Rubric / Sample rate / Enabled (no separate "Name" field), sample rate is a percentage with deterministic sampling, and the judge returns a 0-1 score with a reason.
Claude-Session: https://claude.ai/code/session_01JsVLds2HfEKkcBU9t37H71
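To make "a percentage with deterministic sampling" concrete, the sketch below shows one common way to get that behavior: hash a stable identifier for the run and compare the resulting bucket to the configured percentage. This is an illustration only. The hashing scheme, the use of a trace ID as the key, and the `should_score` name are all assumptions, not the platform's actual implementation.

```python
import hashlib


def should_score(trace_id: str, sample_rate_percent: float) -> bool:
    """Decide whether a run is sampled; the same trace_id always gives the same answer.

    ASSUMPTION: the run is keyed by a trace ID string and bucketed with SHA-256.
    The real platform may use a different key or hash.
    """
    if not 0 <= sample_rate_percent <= 100:
        raise ValueError("sample_rate_percent must be between 0 and 100")
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash onto 10,000 buckets (0.01% resolution).
    bucket = int.from_bytes(digest[:8], "big") % 10_000
    return bucket < sample_rate_percent * 100


if __name__ == "__main__":
    ids = [f"trace-{i}" for i in range(10_000)]
    sampled = sum(should_score(t, 25) for t in ids)
    print(f"sampled {sampled} of {len(ids)} runs at 25%")
```

Because the decision depends only on the identifier and the rate, re-evaluating the same run never flips its sampling decision, and raising the rate only adds runs to the sampled set rather than reshuffling it.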
What
Adds `docs/guides/web-ui/scorers.md`, the public docs page for the hosted-scorers preview, plus the mkdocs nav entries (site nav + llmstxt), placed right after Live Evaluations.
The page covers:
`gen_ai.evaluation.result` OTel log events.
Verification
Drafted and then verified end-to-end against a running platform stack with the feature enabled: UI labels and form fields checked against the real Scorers tab (one inaccuracy fixed in the second commit — the form has no separate "Name" field), and the example SQL query was run verbatim and returned real score rows.
Ships together with
`include:` allow-list in `src/config/libraries.ts`. Without it this page never renders on the docs site (mkdocs nav only controls ordering there). This PR must merge first; the include references the page it adds.
`hosted_scorers` flag (platform draft PR #25540 and follow-ups). This PR should merge as part of opening the public preview, not before.
https://claude.ai/code/session_01JsVLds2HfEKkcBU9t37H71