
[FEATURE] Add MCP (Model Context Protocol) Support #20

@armalite


[FEATURE] Add MCP (Model Context Protocol) Support - Composite Data Quality Hub

Overview

Add Model Context Protocol (MCP) server support to dbt-ai, transforming it into a hostable composite data quality hub that integrates with AI coding agents and aggregates insights from multiple data tools.

Core MCP Tools

Expose dbt-ai's existing functionality as MCP tools:

  1. analyze_dbt_model(model_name): Analyze a single dbt model for quality issues
  2. check_metadata_coverage(): Check which models are missing metadata
  3. get_project_lineage(): Get model dependencies and relationships
  4. assess_data_product_quality(model_name): Generate comprehensive quality assessment
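All four tools are specified (below) to return structured JSON. As a sketch, here is a plain-Python shape the `analyze_dbt_model` response might take; the field names and severity levels are assumptions for illustration, not a settled schema:

```python
import json

def analyze_dbt_model(model_name: str) -> dict:
    """Hypothetical result shape for the analyze_dbt_model MCP tool.

    The field names below are illustrative assumptions; the real
    schema would be defined when the tools are implemented.
    """
    return {
        "model": model_name,
        "issues": [
            {
                "severity": "warning",          # e.g. "info" | "warning" | "error"
                "rule": "missing_description",  # hypothetical quality rule that fired
                "message": f"Model '{model_name}' has no description in schema.yml",
            }
        ],
        "suggestions": ["Add a description block to schema.yml"],
    }

# MCP tool results are serialized for transport, so the dict becomes JSON:
print(json.dumps(analyze_dbt_model("stg_orders"), indent=2))
```

Keeping every tool on one response envelope (`model`, `issues`, `suggestions`) would let agents handle all four tools with the same parsing code.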

Deployment Architecture

Local Development

Existing CLI (unchanged)

dbt-ai -f ./project --output json

Local MCP server (stdio)

dbt-ai --mcp-server -f ./project

Hostable MCP server (network accessible)

dbt-ai serve -f ./project --mcp-host 0.0.0.0 --mcp-port 8080

Production Deployment

Docker container

docker run -p 8080:8080 dbt-ai:latest serve --mcp-host 0.0.0.0

Kubernetes with Helm

helm install dbt-ai ./charts/dbt-ai/

Composite MCP Architecture

Beyond basic dbt analysis, integrate with other MCP servers for enhanced context:

Phase 2: Git Integration

  • Connect to official Git MCP server
  • Enhanced tool: analyze_dbt_model_with_git_context(model_name)
  • Provides: commit history, blame info, recent changes for models

Phase 3: Future Data Tool Integrations

  • Monte Carlo MCP: Data quality metrics and alerts
  • DataHub MCP: Data catalog and lineage information
  • Snowflake MCP: Query performance and usage analytics

Value Proposition: a single MCP endpoint that aggregates insights from your entire data stack.
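The aggregation step itself can be sketched as a merge of per-source findings into one composite assessment. Source names and field shapes here are assumptions; in the real server each entry would come from a downstream MCP call (dbt, Git, Monte Carlo, ...) rather than being passed in directly:

```python
def aggregate_quality_insights(model_name: str, sources: dict) -> dict:
    """Merge per-tool findings into a single composite assessment.

    `sources` maps a tool name (e.g. "dbt", "git") to the list of
    findings that tool returned for the model. Passing them in keeps
    this sketch self-contained and independent of any MCP client.
    """
    return {
        "model": model_name,
        "sources": sorted(sources),  # which tools contributed
        "findings": [
            {"source": name, **finding}
            for name, findings in sorted(sources.items())
            for finding in findings
        ],
    }

report = aggregate_quality_insights(
    "stg_orders",
    {
        "dbt": [{"severity": "warning", "message": "missing description"}],
        "git": [{"severity": "info", "message": "changed in last commit"}],
    },
)
print(len(report["findings"]))  # 2
```

Tagging each finding with its `source` lets an agent ask one endpoint a single question and still see which tool in the stack produced each insight.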

Implementation Approach

Technical Details

  • Framework: FastMCP 2.0+ (Prefect's implementation)
  • Architecture: Composite server (client + server capabilities)
  • Backwards Compatibility: Existing CLI interface completely preserved
  • Response Format: Structured JSON for all tools
  • Transport: Support both stdio and HTTP/network modes
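In stdio mode the server exchanges JSON-RPC 2.0 messages over stdin/stdout, as defined by the MCP specification. A sketch of the `tools/call` request a client such as Claude Code would send to invoke one of the tools above (`tools/call` is the method name from the MCP spec; the argument name is an assumption):

```python
import json

# JSON-RPC 2.0 request invoking an MCP tool by name.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "analyze_dbt_model",
        "arguments": {"model_name": "stg_orders"},  # assumed parameter name
    },
}

# In stdio transport this serialized message is written to the
# server's stdin; in HTTP mode the same payload goes over the network.
wire = json.dumps(request)
print(wire)
```

Because the message format is identical in both transports, supporting stdio and HTTP is a question of how bytes move, not of two different protocols.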

Deployment Strategy

  • Containerization: Docker with multi-stage build
  • Kubernetes: Production-ready deployment manifests
  • Helm Chart: Configurable K8s deployment with secrets management
  • Scaling: Stateless design for horizontal scaling
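A minimal multi-stage Dockerfile consistent with the bullets above; the base image, install steps, and entrypoint flags are assumptions to illustrate the idea, not a committed design:

```dockerfile
# Build stage: install dbt-ai and its dependencies into a clean prefix.
FROM python:3.11-slim AS build
WORKDIR /app
COPY . .
RUN pip install --no-cache-dir --prefix=/install .

# Runtime stage: copy only the installed packages, keeping the image small.
FROM python:3.11-slim
COPY --from=build /install /usr/local
EXPOSE 8080
# "serve" mode with network transport, matching the CLI proposed above.
ENTRYPOINT ["dbt-ai", "serve", "--mcp-host", "0.0.0.0", "--mcp-port", "8080"]
```

The stateless ENTRYPOINT means replicas are interchangeable, which is what makes the horizontal-scaling bullet above straightforward under Kubernetes.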

Implementation Phases

  • Phase 1: Basic MCP server (4 core tools) - COMPLETE ✅
  • Phase 2: Hostable server + containerization
  • Phase 3: Git MCP integration (composite architecture)
  • Phase 4: Container orchestration (Docker + K8s + Helm)
  • Phase 5: Additional MCP server integrations (Monte Carlo, DataHub, etc.)

Expected Benefits

For AI Agents

  • Instant compatibility with Claude Code, Cursor, and other MCP-enabled agents
  • Comprehensive data context from single endpoint (dbt + git + quality + performance)
  • Production-ready deployment for team/enterprise use

For Data Teams

  • Programmatic access for CI/CD pipelines and monitoring
  • Centralized data quality hub accessible from anywhere
  • Agentic workflow enablement for data platform automation

For Platform Teams

  • Kubernetes-native deployment with standard DevOps practices
  • Horizontal scaling for enterprise workloads
  • Integration point for broader data platform observability

Competitive Positioning

  • First composite dbt MCP server that aggregates multiple data tools
  • Production hosting capability (not just local development)
