Prevent production API breaks by validating data contracts between your data pipelines and API frameworks
Ever deployed a DBT model change only to break your FastAPI in production? This tool prevents that by validating data contracts between your data pipelines and APIs before deployment.
DBT Models             Contract Validator     FastAPI Models
(what data produces)   VALIDATES              (what APIs expect)
        ↓                      ↓                      ↓
Schema Extraction      Finds Mismatches       Schema Extraction
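To make the flow in the diagram concrete, here is a minimal sketch of the comparison step, assuming both sides have already been extracted into plain dictionaries. The `Issue` class, the `compare_schemas` function, and the dictionary layout are hypothetical illustrations, not the tool's actual internals; the severity rules mirror the `fail_on` / `warn_on` categories that appear in the configuration file.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical layout: {table: {column: {"type": str, "required": bool}}}
Schema = Dict[str, Dict[str, dict]]


@dataclass
class Issue:
    severity: str  # "critical" or "warning"
    table: str
    column: str
    message: str


def compare_schemas(source: Schema, target: Schema) -> List[Issue]:
    """Report everything the target (API) expects that the source (DBT) does not satisfy."""
    issues: List[Issue] = []
    for table, target_cols in target.items():
        source_cols = source.get(table)
        if source_cols is None:
            issues.append(Issue("critical", table, "", "table missing from source"))
            continue
        for column, spec in target_cols.items():
            if column not in source_cols:
                # A missing required column breaks the API; an optional one only warns.
                severity = "critical" if spec["required"] else "warning"
                issues.append(Issue(severity, table, column, "column missing from source"))
            elif source_cols[column]["type"] != spec["type"]:
                found, expected = source_cols[column]["type"], spec["type"]
                issues.append(Issue("warning", table, column,
                                    f"type mismatch: source '{found}' vs target '{expected}'"))
    return issues


source = {"user_analytics": {
    "user_id": {"type": "varchar"},
    "email": {"type": "varchar"},
    "revenue": {"type": "float"},
    "age": {"type": "varchar"},
}}
target = {"user_analytics": {
    "user_id": {"type": "varchar", "required": True},
    "email": {"type": "varchar", "required": True},
    "total_orders": {"type": "integer", "required": True},
    "revenue": {"type": "float", "required": True},
    "age": {"type": "integer", "required": False},
}}

issues = compare_schemas(source, target)
for issue in issues:
    print(f"{issue.severity}: {issue.table}.{issue.column}: {issue.message}")
```

Only the target side drives the loop: extra columns in the DBT model are harmless to the API, so they are not reported.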
pip install data-contract-validator

# 1. Initialize in your project
contract-validator init --interactive
# 2. Test setup
contract-validator test
# 3. Validate contracts
contract-validator validate
# 4. Commit and push - you're protected! 🛡️

# Validate local DBT project against FastAPI models
contract-validator validate \
--dbt-project ./my-dbt-project \
--fastapi-local ./my-api/models.py
# Validate across repositories (microservices)
contract-validator validate \
--dbt-project . \
--fastapi-repo "my-org/my-api-repo" \
--fastapi-path "app/models.py"

Actual output from a production analytics project:
$ contract-validator validate
Starting contract validation...
Extracting source schemas...
✅ Found 14 DBT models (user_analytics_summary: 54 columns)
🎯 Extracting target schemas...
✅ Found 3 FastAPI models
Validating schema compatibility...
🛡️ Results:
✅ PASSED - 0 critical issues (no production breaks!)
⚠️ 42 warnings (type mismatches to review)
Issues caught:
⚠️ user_analytics_summary.age_years: source 'varchar' vs target 'integer'
⚠️ user_analytics_summary.is_verified: source 'varchar' vs target 'boolean'
⚠️ user_analytics_summary.user_created_at: source 'varchar' vs target 'timestamp'
Your API contracts are protected!

-- Analytics team changes DBT model
select
    user_id,
    email,
    -- total_orders,  ← REMOVED this column
    revenue
from users

# API team's FastAPI model (unchanged)
from pydantic import BaseModel

class UserAnalytics(BaseModel):
    user_id: str
    email: str
    total_orders: int  # ← Still expects this!
    revenue: float

Result: 💥 Production API breaks, angry customers, 2AM debugging
$ git push
❌ VALIDATION FAILED
💥 user_analytics.total_orders: FastAPI REQUIRES column but DBT removed it
🔧 Fix: Add 'total_orders' back to DBT model or update FastAPI model
# Push blocked until fixed

Result: 🛡️ Production protected, issues caught in CI/CD
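The failure above comes down to a set difference between the fields the API requires and the columns the DBT model still selects. The sketch below illustrates that idea using only the standard library: `UserAnalytics` here is a plain annotated class standing in for the Pydantic model, and `required_fields` and `dbt_columns` are hypothetical names, not part of the tool. Treating `Optional[...]` annotations as non-required is a simplification for illustration.

```python
from typing import Set, Union, get_args, get_origin, get_type_hints


class UserAnalytics:
    """Stand-in for the API model; only the annotations matter here."""
    user_id: str
    email: str
    total_orders: int
    revenue: float


def required_fields(model: type) -> Set[str]:
    """Return annotated field names whose type is not Optional[...]."""
    required = set()
    for name, hint in get_type_hints(model).items():
        is_optional = get_origin(hint) is Union and type(None) in get_args(hint)
        if not is_optional:
            required.add(name)
    return required


# Columns left in the DBT model after total_orders was removed.
dbt_columns = {"user_id", "email", "revenue"}

missing = sorted(required_fields(UserAnalytics) - dbt_columns)
for column in missing:
    print(f"user_analytics.{column}: required by the API but not produced by the DBT model")
```

A non-empty `missing` list is the condition that would block the push in the scenario above.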
# Initialize with pre-commit support
contract-validator init --interactive
contract-validator setup-precommit --install-hooks
# Now every commit validates contracts automatically! 🛡️

If you prefer manual setup:
- Install pre-commit:
  pip install pre-commit
- Add to .pre-commit-config.yaml:
  repos:
    - repo: https://github.com/OGsiji/data-contract-validator
      rev: v1.0.0
      hooks:
        - id: contract-validation
          name: Validate Data Contracts
          files: '^(.*models.*\.(sql|py)|\.retl-validator\.yml|dbt_project\.yml)$'
- Install hooks:
  pre-commit install
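The `files` entry in the hook configuration is a Python regular expression that pre-commit applies to each staged path, relative to the repository root. If you want to check which paths would trigger the hook, a quick sketch like the following can help; the `triggers_hook` helper and the sample paths are illustrative only.

```python
import re

# Same pattern as the `files` key in .pre-commit-config.yaml.
FILES_PATTERN = re.compile(
    r"^(.*models.*\.(sql|py)|\.retl-validator\.yml|dbt_project\.yml)$"
)


def triggers_hook(path: str) -> bool:
    """Return True if a staged path would cause the contract hook to run."""
    return FILES_PATTERN.search(path) is not None


sample_paths = [
    "models/user_analytics.sql",
    "app/models.py",
    "dbt_project.yml",
    ".retl-validator.yml",
    "macros/helpers.sql",
    "config/dbt_project.yml",
    "README.md",
]
for path in sample_paths:
    print(f"{path}: {'runs' if triggers_hook(path) else 'skipped'}")
```

Note that the pattern is anchored, so only a root-level `dbt_project.yml` matches, and SQL files outside any path containing `models` are skipped.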
$ git add models/user_analytics.sql
$ git commit -m "update user analytics model"
# Pre-commit automatically runs:
Validating Data Contracts...
✅ Contract validation passed
[main abc1234] update user analytics model

$ git commit -m "remove important column"
Validating Data Contracts...
❌ CRITICAL: user_analytics.total_revenue missing
💡 Fix the issue before committing
# Commit blocked until fixed! 🛡️

# Only for emergencies!
git commit -m "emergency fix" --no-verify

- Catches issues before they reach CI/CD
- Faster feedback loop (seconds, not minutes)
- No broken commits in your git history
- Team protection - everyone gets validation
- Zero configuration after setup
Add this to .github/workflows/validate-contracts.yml:
name: 🛡️ Data Contract Validation
on:
  pull_request:
    paths:
      - 'models/**/*.sql'
      - 'dbt_project.yml'
      - '**/*models*.py'
jobs:
  validate-contracts:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v4
        with:
          python-version: '3.9'
      - name: Validate contracts
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: |
          pip install data-contract-validator
          contract-validator validate

Auto-generated when you run contract-validator init!
version: '1.0'
name: 'my-project-contracts'
source:
  dbt:
    project_path: '.'
    auto_compile: true
target:
  fastapi:
    # For GitHub repos
    type: "github"
    repo: "my-org/my-api"
    path: "app/models.py"
    # For local files
    # type: "local"
    # path: "../my-api/models.py"
validation:
  fail_on: ['missing_tables', 'missing_required_columns']
  warn_on: ['type_mismatches', 'missing_optional_columns']

# --dbt-project:  DBT project path
# --fastapi-repo: GitHub repo
# --fastapi-path: path to models
# --github-token: for private repos
# --output:       json, terminal, or github
contract-validator validate \
  --dbt-project ./dbt-project \
  --fastapi-repo "org/repo" \
  --fastapi-path "app/models.py" \
  --github-token "$GITHUB_TOKEN" \
  --output json

- DBT (all adapters: Snowflake, BigQuery, Redshift, etc.)
- FastAPI (Pydantic + SQLModel)
- Django, Flask-SQLAlchemy
- Databricks, Airflow
- Request other frameworks
🛡️ Data Contract Validation Results:
Status: ✅ PASSED
Critical: 0 | Warnings: 5
⚠️ Warnings:
user_analytics.age: Type mismatch (varchar vs integer)
user_analytics.country: Type mismatch (integer vs varchar)
Your API contracts are protected!

{
"success": true,
"critical_issues": 0,
"warnings": 5,
"issues": [
{
"severity": "warning",
"table": "user_analytics",
"column": "age",
"message": "Type mismatch: source 'varchar' vs target 'integer'",
"suggested_fix": "Update target to expect 'varchar' or fix source type"
}
]
}

::warning::user_analytics.age: Type mismatch detected
✅ Contract validation passed - no critical issues

from data_contract_validator import ContractValidator, DBTExtractor, FastAPIExtractor
# Initialize extractors
dbt = DBTExtractor(project_path='./dbt-project')
fastapi = FastAPIExtractor.from_github_repo('my-org/my-api', 'app/models.py')
# Run validation
validator = ContractValidator(source=dbt, target=fastapi)
result = validator.validate()
if not result.success:
    print(f"❌ {len(result.critical_issues)} critical issues found")
    for issue in result.critical_issues:
        print(f"💥 {issue.table}.{issue.column}: {issue.message}")

# Interactive setup
contract-validator init --interactive
# Test configuration
contract-validator test
# Run validation
contract-validator validate
# Setup pre-commit hooks
contract-validator setup-precommit --install-hooks
# Multiple output formats
contract-validator validate --output json

# Team workflow with automated validation
git clone your-dbt-project
cd your-dbt-project
# One-time setup for new team members
contract-validator init --interactive
contract-validator setup-precommit --install-hooks
# Protected development workflow:
# 1. Make changes to DBT models
# 2. git add models/my_model.sql
# 3. git commit -m "update model"  # ← Validation runs here automatically
# 4. If validation passes → commit succeeds
# 5. If validation fails → fix issues first
# 6. git push  # ← CI/CD validation as backup

# Traditional workflow
# 1. Make changes
# 2. contract-validator validate # Manual validation
# 3. git commit
# 4. git push

We welcome contributions! This tool is actively used in production.
git clone https://github.com/OGsiji/data-contract-validator
cd data-contract-validator
pip install -e ".[dev]"
pytest

from typing import Dict

from retl_validator.extractors import BaseExtractor

class MyFrameworkExtractor(BaseExtractor):
    def extract_schemas(self) -> Dict[str, Schema]:
        schemas: Dict[str, Schema] = {}
        # Your implementation
        return schemas

- Bugs: GitHub Issues
- 💡 Features: GitHub Discussions
- Quick Start Guide - Get running in 2 minutes
- Configuration Reference - All config options
- GitHub Actions Setup - CI/CD integration
- Examples - Real-world usage
- Pre-commit Integration - Automated validation
This tool is actively preventing production incidents in:
- Analytics pipelines with 50+ DBT models
- Microservices architectures with multiple APIs
- Data engineering teams using Snowflake, BigQuery, Redshift
- Cross-repository validation in large organizations
Proven to catch:
- Type mismatches (varchar vs integer)
- Missing columns (API expects columns DBT doesn't provide)
- Schema drift (gradual model changes)
- Breaking changes before they reach production
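Type mismatches such as `varchar` vs `integer` can only be detected once warehouse type names and API type names are mapped onto a common vocabulary. The sketch below shows one way that normalization could work; the `TYPE_FAMILIES` table and both helper functions are hypothetical and deliberately small, and the tool's real mapping may differ.

```python
from typing import Optional

# Hypothetical mapping from warehouse / Python type names to a shared family.
TYPE_FAMILIES = {
    "varchar": "string", "text": "string", "string": "string", "str": "string",
    "int": "integer", "integer": "integer", "bigint": "integer",
    "float": "number", "double": "number", "numeric": "number",
    "bool": "boolean", "boolean": "boolean",
    "timestamp": "timestamp", "datetime": "timestamp",
}


def type_family(type_name: str) -> Optional[str]:
    """Normalize a type name, ignoring case and length/precision such as VARCHAR(255)."""
    base = type_name.strip().lower().split("(")[0]
    return TYPE_FAMILIES.get(base)


def types_compatible(source_type: str, target_type: str) -> bool:
    """Two types are compatible only if both are known and share a family."""
    source_family = type_family(source_type)
    target_family = type_family(target_type)
    return source_family is not None and source_family == target_family


for source_type, target_type in [("VARCHAR(255)", "str"), ("varchar", "integer"), ("bigint", "int")]:
    verdict = "ok" if types_compatible(source_type, target_type) else "type mismatch"
    print(f"source '{source_type}' vs target '{target_type}': {verdict}")
```

Unknown type names are treated as incompatible here, which errs on the side of surfacing a warning rather than silently passing.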
- Pre-commit hooks: Immediate feedback (fastest)
- CI/CD validation: Team protection (backup)
- Manual validation: Development testing
- Configuration files: Team standards
This creates a comprehensive safety net for your data contracts.
MIT License - see LICENSE for details.
- Issues: GitHub Issues
- 💬 Discussions: GitHub Discussions
- 📧 Email: ogunniransiji@gmail.com
If this tool helps you prevent production incidents, please ⭐ star the repository!
🛡️ Built by data engineers, for data engineers. Stop breaking production with data changes!
pip install data-contract-validator
contract-validator init --interactive
contract-validator setup-precommit --install-hooks
# 2 minutes to production protection with automated validation!