This document defines the category-level authoring workflow.
- Define category scope and baseline libraries.
- Identify the API surface that the category should exercise.
- Normalize library naming used in prompts, requirements, and README.
- Read official docs and recent release notes.
- Capture source links for each core library.
- Record dated API shifts that should become requirement constraints.
- Build category best-practice inventory.
- Convert source guidance into concrete implementation expectations.
- Keep entries deterministic and file-verifiable where possible.
- Derive prompts from best practices.
- Write prompts as forward-looking implementation asks.
- Avoid bug-report framing.
- Map each prompt to one or more best-practice targets.
- Define deterministic requirements.
- Express requirements as file-verifiable checks.
- Keep requirements atomic and concrete.
- Use evidence-backed `MUST NOT` only for deprecations/removals/correctness caveats.
- Validate diversity and overlap.
- Keep shared subgroup requirements small.
- Ensure each eval has implementation-specific constraints.
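The requirement and overlap steps above can be sketched in a few lines of Python. This is an illustration only, not the project's actual tooling: the `Requirement` schema, its field names, and the helpers `check` and `overlap_report` are all hypothetical. It assumes a requirement can be reduced to a regular-expression search over files selected by a glob, and that each eval is described by a set of requirement ids.

```python
import re
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Requirement:
    """One atomic, file-verifiable check (hypothetical schema)."""
    req_id: str
    kind: str         # "MUST" or "MUST NOT"
    glob: str         # files to inspect, relative to the project root
    pattern: str      # regular expression searched in each matching file
    source: str = ""  # evidence link; expected for every "MUST NOT"


def check(req: Requirement, root: Path) -> bool:
    """Return True when the project under `root` satisfies `req`."""
    if req.kind not in ("MUST", "MUST NOT"):
        raise ValueError(f"unknown kind: {req.kind!r}")
    regex = re.compile(req.pattern)
    # Sorting keeps the traversal order stable across runs.
    found = any(
        regex.search(path.read_text(encoding="utf-8", errors="ignore"))
        for path in sorted(root.glob(req.glob))
        if path.is_file()
    )
    return found if req.kind == "MUST" else not found


def overlap_report(evals: dict[str, set[str]]) -> tuple[set[str], dict[str, set[str]]]:
    """Split requirement ids into those shared by every eval and those unique to one eval."""
    shared = set.intersection(*evals.values()) if evals else set()
    specific = {
        name: ids - set().union(*(other for n, other in evals.items() if n != name))
        for name, ids in evals.items()
    }
    return shared, specific
```

Under these assumptions, a large `shared` set suggests the subgroup requirements have grown too big, and an empty entry in `specific` flags an eval that has no implementation-specific constraint of its own.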
Every new category should include `evals/<category>/README.md` with exactly:
- category overview
- library baseline and naming
- best practices
  - official best-practice sources
  - best-practice inventory
Keep category READMEs focused on those sections only.
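
A check like the following could enforce that README rule. It is a minimal sketch under stated assumptions: the source does not say how sections are marked up, so the code assumes each required section is a level-2 markdown heading (`## Title`) and that "official best-practice sources" and "best-practice inventory" are subsections of "best practices". The function name `readme_section_problems` is hypothetical.

```python
import re

# Section titles come from the README rule; the heading syntax is an assumption.
REQUIRED_SECTIONS = [
    "category overview",
    "library baseline and naming",
    "best practices",
]


def readme_section_problems(text: str) -> list[str]:
    """Compare the level-2 headings in a README against REQUIRED_SECTIONS."""
    # Matches "## Title" only; deeper headings such as "### Title" are ignored.
    found = [
        match.group(1).strip().lower()
        for match in re.finditer(r"^##[ \t]+(.+?)[ \t]*$", text, flags=re.MULTILINE)
    ]
    problems = [f"missing section: {name}" for name in REQUIRED_SECTIONS if name not in found]
    problems += [f"unexpected section: {name}" for name in found if name not in REQUIRED_SECTIONS]
    return problems
```

An empty result means the README has exactly the expected top-level sections. Subsections are deliberately not validated here.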