Skip to content

Conversation

@caidanw
Copy link
Contributor

@caidanw caidanw commented Dec 6, 2025

Summary

Replaces the single validationErrors column with a flexible reasons array that supports multiple validation failure types. Implements a validator strategy pattern to collect all validation failures (NSID format, DID authority, rkey mismatch, schema errors) with detailed error messages for developer debugging.

Key Changes

Database Schema

  • Replace validation_errors column with reasons JSONB array
  • Add CHECK constraint ensuring reasons array is not empty
  • Add GIN index on reasons for efficient querying

Validation Strategy Pattern

  • New pluggable validator system in src/app/api/ingest/validation.ts
  • Runs all validators and collects all failures
  • Easy to extend with new validation types

Four Validation Types

  1. NSID format validation - Uses @atproto/syntax for detailed error messages
  2. DID authority matching - DNS-based verification moved into validator
  3. Rkey matching NSID - Ensures rkey equals NSID for lexicon records
  4. Lexicon schema validation - Zod-based validation with error details

Type Safety

  • BaseCommit interface for raw Nexus data (record: unknown)
  • Commit interface for validated data (record: LexiconSchemaRecord)
  • Type guard isLexiconCommit for proper TypeScript narrowing

Benefits

  • Extensible: Add new validators without touching route logic
  • Comprehensive: Catches all validation issues in one pass
  • Debuggable: Detailed error messages help developers fix issues faster
  • Maintainable: Clear separation of concerns (types, validation, reasons)

Example Output

{
  "reasons": [
    {
      "type": "rkey_mismatch",
      "expected": "com.example.lexicon",
      "actual": "my-lexicon"
    },
    {
      "type": "invalid_nsid_format",
      "nsid": "com.example_lexicon",
      "message": "Disallowed characters in NSID (ASCII letters, digits, dashes, periods only)"
    }
  ]
}

Breaking Changes

  • Database schema change: validation_errorsreasons (acceptable for pre-production)
  • Requires running db:push to apply migration

…lid_lexicons

Renames all references to the validLexicons and invalidLexicons tables and types to valid_lexicons and invalid_lexicons for consistency with database naming conventions. Updates imports, queries, and type definitions across API routes and schema.
- Move InvalidLexiconReason and related reason interfaces to a new reasons.ts module
- Move isZodError helper to types.ts for reuse
- Simplify ValidationResult type and validation logic
- Improve type safety and separation of concerns in validation code
… updating Commit type

- Remove unnecessary destructuring of commit object in ingest route
- Use validationResult.lexiconDoc and commit.record directly for DB inserts and logging
- Update Commit type to allow record to be any object, not just LexiconSchemaRecord
- Minor import cleanup
Copilot AI review requested due to automatic review settings December 6, 2025 02:04
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request refactors the lexicon validation system to implement a flexible, extensible validator strategy pattern. The key change replaces the single validationErrors column with a typed reasons array that can store multiple types of validation failures with detailed error messages.

Key Changes:

  • Implements a pluggable validator system with four validation types: NSID format, DID authority matching, rkey matching, and lexicon schema validation
  • Replaces database validation_errors JSONB column with a strongly-typed reasons JSONB array with CHECK constraint
  • Refactors table naming from camelCase to snake_case (validLexicons → valid_lexicons)

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
src/util/lexicon.ts Removed - types moved to src/app/api/ingest/types.ts
src/db/schema.ts Renamed tables to snake_case, replaced validationErrors with reasons array, added CHECK constraint and GIN index
src/app/api/stats/route.ts Updated table references from camelCase to snake_case
src/app/api/repos/[repoDid]/route.ts Updated table references from camelCase to snake_case
src/app/api/lexicons/route.ts Updated table references from camelCase to snake_case
src/app/api/lexicons/resolve/route.ts Updated table references from camelCase to snake_case
src/app/api/lexicons/[nsid]/route.ts Updated table references from camelCase to snake_case
src/app/api/ingest/validation.ts New file implementing validator strategy pattern with four validation types
src/app/api/ingest/types.ts New file containing type definitions for commits, events, and type guards (moved from lexicon.ts)
src/app/api/ingest/route.ts Refactored to use new validation system, simplified logic by delegating to validators
src/app/api/ingest/reasons.ts New file defining typed validation failure reason interfaces
drizzle/meta/_journal.json Updated migration timestamp (future date issue)
drizzle/meta/0000_snapshot.json Updated schema snapshot with new table structure
drizzle/0000_init.sql Updated initial migration SQL with new schema

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@caidanw caidanw merged commit 30f25bb into main Dec 10, 2025
@caidanw caidanw deleted the caidanw/atm-209/add-reason-to-invalid_lexicons branch December 10, 2025 20:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants