@github-actions
Contributor

Summary

This PR adds documentation that walks a new user through setting up linkml-reference-validator from scratch, addressing issue #29.

What's New

1. Setup Guide (docs/setup-guide.md)

A complete guide covering:

  • Installation: pip, uv, and development setups with verification steps
  • Configuration: NCBI API key setup, cache directory configuration
  • Quick Start: Working examples that use real PMIDs
  • Real-World Example: Complete gene annotation validation project
  • Advanced Configuration: YAML config files, environment variables
  • Integration: Pre-commit hooks, GitHub Actions, Makefiles
  • Verification Checklist: Ensure everything is working

2. Complete Workflow Tutorial (docs/tutorials/complete-workflow.md)

A 30-45 minute hands-on tutorial that walks through:

  • Building a gene annotation validation system from scratch
  • Designing a LinkML schema with validation markers
  • Creating sample data with multiple evidence items
  • Validating and repairing common errors
  • Setting up CI/CD pipelines
  • Writing tests and documentation
  • Production-ready examples with templates

Includes real-world examples using:

  • TP53, BRCA1, EGFR, JAK1 genes
  • Mixed reference types (PMID, PMC, DOI, file, URL)
  • Error scenarios and how to fix them
  • Complete project structure with Makefile and tests

3. Troubleshooting Guide (docs/troubleshooting.md)

Comprehensive troubleshooting covering:

  • Installation Issues: Command not found, version conflicts, import errors
  • Reference Fetching: Network problems, rate limiting, missing content
  • Validation Issues: Text not found, title mismatches, normalization
  • Schema Issues: Missing markers, field configuration
  • Data Format Issues: YAML parsing, invalid reference IDs
  • Performance Issues: Slow validation, cache management
  • Common Error Messages: Detailed explanations and solutions
  • Quick Diagnostic Checklist: Step-by-step debugging

Documentation Structure

Updated mkdocs.yml to include:

nav:
  - Home: index.md
  - Setup Guide: setup-guide.md  # NEW
  - Quickstart: quickstart.md
  - Tutorials:
      - Complete Workflow: tutorials/complete-workflow.md  # NEW
      - Getting Started (CLI): notebooks/01_getting_started.ipynb
      ...
  - Troubleshooting: troubleshooting.md  # NEW
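
A nav like the one above can be checked against the files on disk. The following is a minimal sketch, not part of this PR: `iter_nav_paths` and `missing_nav_files` are hypothetical helper names, the nav is written out as already-parsed Python objects (loading it from mkdocs.yml with a YAML parser is assumed and not shown), the elided `...` entries are left out, and external URLs in the nav are not handled.

```python
from pathlib import Path
from typing import Iterator


def iter_nav_paths(nav: list) -> Iterator[str]:
    """Yield every file path in an MkDocs-style nav tree.

    Each nav item is either a bare path string or a one-key dict whose
    value is a path string or a nested list of further items.
    """
    for item in nav:
        if isinstance(item, str):
            yield item
        elif isinstance(item, dict):
            for value in item.values():
                if isinstance(value, str):
                    yield value
                elif isinstance(value, list):
                    yield from iter_nav_paths(value)


def missing_nav_files(nav: list, docs_dir: Path) -> list[str]:
    """Return nav paths that do not exist as files under docs_dir."""
    return [p for p in iter_nav_paths(nav) if not (docs_dir / p).is_file()]


# The nav from this PR as parsed Python objects (hypothetical stand-in for
# the result of parsing mkdocs.yml; elided entries are omitted).
NAV = [
    {"Home": "index.md"},
    {"Setup Guide": "setup-guide.md"},
    {"Quickstart": "quickstart.md"},
    {"Tutorials": [
        {"Complete Workflow": "tutorials/complete-workflow.md"},
        {"Getting Started (CLI)": "notebooks/01_getting_started.ipynb"},
    ]},
    {"Troubleshooting": "troubleshooting.md"},
]

if __name__ == "__main__":
    # Assumes the script is run from the repository root.
    for path in missing_nav_files(NAV, Path("docs")):
        print(f"missing: docs/{path}")
```

Run from the repository root, this prints one line per nav entry whose file is absent and nothing when every entry resolves.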

Key Features

  • Illustrative Examples: Every concept includes real, working examples
  • Step-by-Step: Clear progression from installation to production use
  • Copy-Paste Ready: All code examples can be used directly
  • Real-World Focus: Uses actual PMIDs and realistic scenarios
  • Troubleshooting First: Anticipates common problems with solutions
  • Multiple Learning Paths: Quick start, tutorial, and reference approaches

Testing

  • ✅ All documentation files are valid Markdown
  • ✅ Navigation structure verified in mkdocs.yml
  • ✅ Code examples are syntactically correct
  • ✅ Links between documents are valid
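
The link check could be reproduced with a short script along these lines. This is a sketch under stated assumptions, not the procedure actually used for this PR: `broken_links` is a hypothetical helper, it only recognizes inline Markdown links of the form `[text](target)`, it skips external URLs and same-page anchors, and it does not verify that `#fragment` anchors exist in the target file.

```python
import re
from pathlib import Path

# Matches inline Markdown links and images: [text](target)
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")

# Targets that are not files on disk and are therefore skipped.
SKIP_PREFIXES = ("http://", "https://", "mailto:", "#")


def broken_links(docs_dir: Path) -> list[tuple[str, str]]:
    """Return (source file, link target) pairs whose target file is missing.

    Relative targets are resolved against the directory of the file that
    contains the link; any "#fragment" suffix is ignored.
    """
    broken = []
    for md in sorted(docs_dir.rglob("*.md")):
        text = md.read_text(encoding="utf-8")
        for target in LINK_RE.findall(text):
            if target.startswith(SKIP_PREFIXES):
                continue
            rel = target.split("#", 1)[0]
            if not (md.parent / rel).exists():
                broken.append((str(md.relative_to(docs_dir)), target))
    return broken


if __name__ == "__main__":
    # Assumes the script is run from the repository root.
    for source, target in broken_links(Path("docs")):
        print(f"{source}: broken link -> {target}")
```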

Related Issues

Closes #29

Checklist

  • Added comprehensive setup guide
  • Added complete workflow tutorial
  • Added troubleshooting guide
  • Updated mkdocs.yml navigation
  • All examples are clear and illustrative
  • Documentation is beginner-friendly
  • Covers common setup scenarios

This commit adds three major documentation enhancements to make the
system completely clear to new users setting up linkml-reference-validator:

1. **Setup Guide (docs/setup-guide.md)**
   - Complete installation instructions for pip, uv, and development setup
   - Initial configuration including NCBI API key setup
   - Quick start examples with real PMIDs
   - Real-world example: validating gene functions
   - Advanced configuration with YAML config files
   - Integration with pre-commit hooks, CI/CD, and Makefiles
   - Verification checklist and troubleshooting quick fixes

2. **Complete Workflow Tutorial (docs/tutorials/complete-workflow.md)**
   - Step-by-step 30-45 minute tutorial building a gene annotation system
   - Covers installation, schema design, data creation, validation, and repair
   - Includes real-world examples with TP53, BRCA1, EGFR, and JAK1
   - Shows integration with Git, GitHub Actions, and testing frameworks
   - Provides templates and boilerplate code for quick starts
   - Production-ready examples with Makefiles and test suites

3. **Troubleshooting Guide (docs/troubleshooting.md)**
   - Comprehensive solutions for installation issues
   - Reference fetching problems (PMIDs, network, rate limiting)
   - Validation errors with detailed explanations and fixes
   - Schema and data format issues
   - Performance optimization tips
   - Common error messages with causes and solutions
   - Quick diagnostic checklist

Also updated mkdocs.yml navigation to include the new guides in logical
positions for discoverability.

These guides provide clear, illustrative examples for someone setting up
the system from scratch, addressing issue #29.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>