Examples for deploying, configuring, and using the Inference Gateway across different environments and use cases.
⚠️ Early Development Notice: This project is in its early stages of development. Some deployment examples may not work as expected or are still drafts. We are actively completing and testing all examples using design-first principles, so please be patient as the documentation and examples improve.
Note: The examples in each individual project repository are more accurate and more likely to work. This monolithic examples repository is still a work in progress (WIP). For the most reliable examples, check the individual project repositories.
If you encounter issues, feel free to open an issue or contribute to help us improve.
Choose your preferred deployment method:
cd quickstart/docker
task
cd quickstart/kubernetes
task setup # Interactive setup with automatic cluster creation
Don't have a cluster? Use task setup for interactive cluster creation with k3d.
From the main examples directory? Use task quickstart:k8s as a shortcut.
examples/
├── quickstart/              # Get started quickly
│   ├── docker/              # Docker deployment
│   └── kubernetes/          # Kubernetes deployment
├── sdks/                    # SDK usage examples
│   ├── go/                  # Go SDK examples
│   ├── python/              # Python SDK examples
│   ├── typescript/          # TypeScript SDK examples
│   └── rust/                # Rust SDK examples
├── a2a/                     # Agent-to-Agent examples
│   ├── getting-started/     # Basic A2A setup
│   ├── google-calendar/     # Calendar agent example
│   └── custom-agents/       # Custom agent examples
├── mcp/                     # Model Context Protocol examples
│   ├── servers/             # MCP server implementations
│   └── clients/             # MCP client examples
├── monitoring/              # Observability examples
│   ├── prometheus/          # Metrics collection
│   ├── grafana/             # Dashboards
│   └── tracing/             # Distributed tracing
├── security/                # Security configurations
│   ├── authentication/      # Auth setups
│   ├── tls/                 # TLS/SSL examples
│   └── rbac/                # Role-based access
└── tutorials/               # Step-by-step guides
    ├── first-request/       # Your first API call
    ├── model-routing/       # Advanced routing
    ├── rate-limiting/       # Rate limiting setup
    └── custom-providers/    # Adding new providers
→ Go to quickstart/ for Docker or Kubernetes deployment
→ Check sdks/ for language-specific examples
→ Visit a2a/ for Agent-to-Agent examples
→ See mcp/ for Model Context Protocol examples
→ Review deployment/ for advanced deployment scenarios
→ Check monitoring/ for observability setups
→ Explore security/ for security configurations
→ Follow tutorials/ for guided learning
- Task: Install Task for running examples
- Git: For cloning repositories
- curl: For testing API endpoints
- Docker: For container-based examples
- kubectl: For Kubernetes examples
- Node.js: For UI and TypeScript examples
- Go: For Go SDK examples
- Python: For Python SDK examples
- Rust: For Rust SDK examples
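To see which of these tools are already installed, a small script can look each one up on PATH. This is a minimal sketch, not part of the repository: the executable names (for example `node` for Node.js, `python3` for Python, `cargo` for Rust) are assumptions and may differ on your system.

```python
import shutil

# Executable name -> what the list above says it is used for.
# The executable names are assumptions; adjust them to your installation.
TOOLS = {
    "task": "running examples",
    "git": "cloning repositories",
    "curl": "testing API endpoints",
    "docker": "container-based examples",
    "kubectl": "Kubernetes examples",
    "node": "UI and TypeScript examples",
    "go": "Go SDK examples",
    "python3": "Python SDK examples",
    "cargo": "Rust SDK examples",
}


def missing_tools(tools, which=shutil.which):
    """Return the names from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    for tool in missing_tools(TOOLS):
        print(f"{tool} not found (needed for {TOOLS[tool]})")
```

Only the tools for the examples you plan to run need to be present, so a missing entry is not necessarily a problem.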
You'll need at least one API key from supported providers:
- OpenAI (OPENAI_API_KEY)
- Anthropic (ANTHROPIC_API_KEY)
- Groq (GROQ_API_KEY)
- DeepSeek (DEEPSEEK_API_KEY)
- Cohere (COHERE_API_KEY)
- or local Ollama (OLLAMA_API_URL)
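Before starting a deployment, it can help to confirm that at least one of these variables is set. The sketch below only inspects the current process environment; it assumes the variables are exported in your shell and does not read any .env file the examples may use.

```python
import os

# Provider -> environment variable, as listed above.
PROVIDER_ENV_VARS = {
    "OpenAI": "OPENAI_API_KEY",
    "Anthropic": "ANTHROPIC_API_KEY",
    "Groq": "GROQ_API_KEY",
    "DeepSeek": "DEEPSEEK_API_KEY",
    "Cohere": "COHERE_API_KEY",
    "Ollama": "OLLAMA_API_URL",
}


def configured_providers(env=None):
    """Return the providers whose variable is set to a non-empty value."""
    env = os.environ if env is None else env
    return [
        name
        for name, var in PROVIDER_ENV_VARS.items()
        if env.get(var, "").strip()
    ]


if __name__ == "__main__":
    found = configured_providers()
    if found:
        print("Configured providers:", ", ".join(found))
    else:
        print("No provider configured; set at least one of the variables above.")
```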
Perfect for getting up and running immediately:
- Docker: Single command deployment
- Kubernetes: Production-ready cluster deployment
Language-specific client implementations:
- Go: Native Go applications
- Python: Data science and AI workflows
- TypeScript: Web applications and Node.js services
- Rust: High-performance applications
Agent-to-Agent protocol implementations:
- Getting Started: Basic agent setup
- Google Calendar: Real-world integration
- Custom Agents: Build your own agents
Model Context Protocol integrations:
- Servers: Custom MCP server implementations
- Clients: MCP client integrations
Production deployment scenarios:
- Helm: Kubernetes package management
- Terraform: Infrastructure as Code
- Docker Compose: Multi-service orchestration
- Cloud: AWS, GCP, Azure specific deployments
Observability and monitoring:
- Prometheus: Metrics collection
- Grafana: Visualization dashboards
- Tracing: Request tracing setup
Security and authentication:
- Authentication: User authentication setups
- TLS: SSL/TLS configuration
- RBAC: Role-based access control
Guided learning experiences:
- First Request: Making your first API call
- Model Routing: Advanced request routing
- Rate Limiting: Implementing rate limits
- Custom Providers: Adding new AI providers
# From the main examples directory
task quickstart:k8s # Kubernetes with interactive setup
task quickstart:docker # Docker deployment
- Choose your deployment method:
  # For development
  cd quickstart/docker && task
  # For production
  cd quickstart/kubernetes && task setup
- Test your deployment:
  # Health check
  curl http://localhost:8080/health
  # List models
  curl http://localhost:8080/v1/models
- Explore examples:
  # Try SDK examples
  cd sdks/python && task
  # Build a UI
  cd ui/nextjs && task
  # Create agents
  cd a2a/getting-started && task
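The two test requests above can also be scripted, which is convenient in CI or when checking a deployment repeatedly. The following is a rough sketch using only the Python standard library. The base URL and paths come from the curl commands above; the function names are hypothetical, and the assumption that /v1/models returns JSON is not confirmed here.

```python
import json
import urllib.request

# Base URL taken from the curl commands above; change it if your gateway listens elsewhere.
BASE_URL = "http://localhost:8080"


def http_get(url, timeout=5.0):
    """GET `url` and return (status_code, body_text). Raises on connection or HTTP errors."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8")


def check_deployment(base_url=BASE_URL, fetch=http_get):
    """Call /health and /v1/models and return their status codes plus the parsed models body."""
    health_status, _ = fetch(f"{base_url}/health")
    models_status, models_body = fetch(f"{base_url}/v1/models")
    # Assumption: /v1/models responds with JSON; its exact shape is not specified here.
    return {
        "health": health_status,
        "models_status": models_status,
        "models": json.loads(models_body),
    }
```

With the gateway running, `print(check_deployment())` shows both status codes and the models payload. The `fetch` parameter exists only so the function can be exercised without a live gateway.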
Found an issue or want to add an example? We welcome contributions!
- Fork the repository
- Create a feature branch
- Add your example with proper documentation
- Follow the existing structure and Taskfile patterns
- Submit a pull request
- Use Taskfile.yml for automation (no shell scripts)
- Include comprehensive README.md
- Add .env.example for configuration
- Follow language-specific best practices
- Include error handling and troubleshooting
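Contributors may want to check a new example directory against these guidelines before opening a pull request. The sketch below is one possible check, not a tool shipped with the repository: it looks for the three files named above and flags any `*.sh` files, on the assumption that "no shell scripts" applies to the whole example directory.

```python
from pathlib import Path

# File names come from the guidelines above.
REQUIRED_FILES = ("Taskfile.yml", "README.md", ".env.example")


def missing_files(example_dir):
    """Return the required files that are absent from `example_dir`."""
    root = Path(example_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


def shell_scripts(example_dir):
    """Return relative paths of any *.sh files found under `example_dir`."""
    root = Path(example_dir)
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*.sh"))
```

For example, `missing_files("sdks/python")` returns an empty list when all three files are present.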
- Documentation - Official documentation
- API Reference - API documentation
- GitHub - Source code
- Discord - Community support
This project is licensed under the MIT License - see the LICENSE file for details.