diff --git a/01_getting_started/01_hello_world/README.md b/01_getting_started/01_hello_world/README.md
index b75d0dc..a72e3b0 100644
--- a/01_getting_started/01_hello_world/README.md
+++ b/01_getting_started/01_hello_world/README.md
@@ -7,23 +7,21 @@ Simple example demonstrating GPU-based serverless workers with automatic scaling
 
 ### 1. Install Dependencies
 
 ```bash
-pip install -r requirements.txt
+uv sync
 ```
 
-### 2. Configure Environment
-
-Create `.env` file:
+### 2. Authenticate
 
 ```bash
-RUNPOD_API_KEY=your_api_key_here
+uv run flash login
 ```
 
-Get your API key from [Runpod Settings](https://www.runpod.io/console/user/settings).
+Or create a `.env` file with `RUNPOD_API_KEY=your_api_key_here`.
 
 ### 3. Run Locally
 
 ```bash
-flash run
+uv run flash run
 ```
 
 Server starts at **http://localhost:8888**
diff --git a/01_getting_started/02_cpu_worker/README.md b/01_getting_started/02_cpu_worker/README.md
index a242690..4d5fb88 100644
--- a/01_getting_started/02_cpu_worker/README.md
+++ b/01_getting_started/02_cpu_worker/README.md
@@ -7,23 +7,21 @@ Simple example demonstrating CPU-based serverless workers with automatic scaling
 
 ### 1. Install Dependencies
 
 ```bash
-pip install -r requirements.txt
+uv sync
 ```
 
-### 2. Configure Environment
-
-Create `.env` file:
+### 2. Authenticate
 
 ```bash
-RUNPOD_API_KEY=your_api_key_here
+uv run flash login
 ```
 
-Get your API key from [Runpod Settings](https://www.runpod.io/console/user/settings).
+Or create a `.env` file with `RUNPOD_API_KEY=your_api_key_here`.
 
 ### 3. Run Locally
 
 ```bash
-flash run
+uv run flash run
 ```
 
 Server starts at **http://localhost:8888**
diff --git a/01_getting_started/03_mixed_workers/README.md b/01_getting_started/03_mixed_workers/README.md
index ae466af..78e8562 100644
--- a/01_getting_started/03_mixed_workers/README.md
+++ b/01_getting_started/03_mixed_workers/README.md
@@ -53,15 +53,14 @@ If you haven't run the repository-wide setup:
 
 ```bash
 # Install dependencies
-pip install -r requirements.txt
+uv sync
 
-# Set API key (choose one):
-export RUNPOD_API_KEY=your_api_key_here
-# OR create .env file:
-echo "RUNPOD_API_KEY=your_api_key_here" > .env
+# Authenticate
+uv run flash login
+# Or create .env file with RUNPOD_API_KEY=your_api_key_here
 
 # Run
-flash run
+uv run flash run
 ```
 
 Server starts at http://localhost:8888
diff --git a/01_getting_started/04_dependencies/README.md b/01_getting_started/04_dependencies/README.md
index d3f251e..5ea27fe 100644
--- a/01_getting_started/04_dependencies/README.md
+++ b/01_getting_started/04_dependencies/README.md
@@ -28,15 +28,14 @@ If you haven't run the repository-wide setup:
 
 ```bash
 # Install dependencies
-pip install -r requirements.txt
+uv sync
 
-# Set API key (choose one):
-export RUNPOD_API_KEY=your_api_key_here
-# OR create .env file:
-echo "RUNPOD_API_KEY=your_api_key_here" > .env
+# Authenticate
+uv run flash login
+# Or create .env file with RUNPOD_API_KEY=your_api_key_here
 
 # Run
-flash run
+uv run flash run
 ```
 
 ## Dependency Types
diff --git a/01_getting_started/README.md b/01_getting_started/README.md
index 8b20778..4a18ac3 100644
--- a/01_getting_started/README.md
+++ b/01_getting_started/README.md
@@ -60,8 +60,8 @@ Managing Python packages and system dependencies.
 
 1. Start with **01_hello_world** to understand the basics
 2. Explore **03_mixed_workers** for cost optimization and validation patterns
-3. Move to **02_cpu_worker** to learn CPU-only patterns _(coming soon)_
-4. Master **04_dependencies** for production readiness _(coming soon)_
+3. Move to **02_cpu_worker** to learn CPU-only patterns
+4. Master **04_dependencies** for production readiness
 
 ## Next Steps
diff --git a/02_ml_inference/01_text_to_speech/README.md b/02_ml_inference/01_text_to_speech/README.md
index 30be7bd..4b89a47 100644
--- a/02_ml_inference/01_text_to_speech/README.md
+++ b/02_ml_inference/01_text_to_speech/README.md
@@ -24,15 +24,16 @@ This example demonstrates running a 1.7B parameter TTS model on serverless GPU i
 
 ```bash
 cd 02_ml_inference/01_text_to_speech
-pip install -r requirements.txt
-cp .env.example .env
-# Add your RUNPOD_API_KEY to .env
+uv sync
+uv run flash login
 ```
 
+Or create a `.env` file with `RUNPOD_API_KEY=your_api_key_here`.
+
 ### Run
 
 ```bash
-flash run
+uv run flash run
 ```
 
 First run provisions the endpoint (~1 min). Server starts at http://localhost:8888
diff --git a/02_ml_inference/README.md b/02_ml_inference/README.md
index cd41fed..be41df1 100644
--- a/02_ml_inference/README.md
+++ b/02_ml_inference/README.md
@@ -4,6 +4,15 @@ Deploy machine learning models as production-ready APIs. Learn how to serve LLMs
 
 ## Examples
 
+### [01_text_to_speech](./01_text_to_speech/)
+Text-to-Speech API using Qwen3-TTS.
+
+**What you'll learn:**
+- Running HuggingFace models with `@remote` on GPU workers
+- Returning binary audio data (WAV) from API endpoints
+- Using `bfloat16` precision for memory-efficient inference
+- Input validation inside self-contained `@remote` functions
+
 ### 01_text_generation _(coming soon)_
 LLM inference API with streaming support.
diff --git a/03_advanced_workers/05_load_balancer/README.md b/03_advanced_workers/05_load_balancer/README.md
index c3f45d7..f3071ac 100644
--- a/03_advanced_workers/05_load_balancer/README.md
+++ b/03_advanced_workers/05_load_balancer/README.md
@@ -26,22 +26,21 @@ Load-balanced endpoints use direct HTTP routing to serverless workers, providing
 
 ### 1. Install Dependencies
 
 ```bash
-pip install -r requirements.txt
+uv sync
 ```
 
-### 2. Configure Environment
+### 2. Authenticate
 
 ```bash
-cp .env.example .env
-# Add your RUNPOD_API_KEY to .env
+uv run flash login
 ```
 
-Get your API key from [Runpod Settings](https://www.runpod.io/console/user/settings).
+Or create a `.env` file with `RUNPOD_API_KEY=your_api_key_here`.
 
 ### 3. Run Locally (from repository root)
 
 ```bash
-flash run
+uv run flash run
 ```
 
 Visit **http://localhost:8888/docs** for interactive API documentation (unified app with all examples).
diff --git a/04_scaling_performance/01_autoscaling/README.md b/04_scaling_performance/01_autoscaling/README.md
index ddb76de..2d8101a 100644
--- a/04_scaling_performance/01_autoscaling/README.md
+++ b/04_scaling_performance/01_autoscaling/README.md
@@ -4,7 +4,7 @@ Configure Flash worker autoscaling for different workload patterns. This example
 
 ## Quick Start
 
-**Prerequisites**: Complete the [repository setup](../../README.md#quick-start) first (clone, `make dev`, set API key).
+**Prerequisites**: Complete the [repository setup](../../README.md#quick-start) first, or run `flash login` to authenticate.
 
 ```bash
 cd 04_scaling_performance/01_autoscaling
diff --git a/05_data_workflows/01_network_volumes/README.md b/05_data_workflows/01_network_volumes/README.md
index d8d325c..bd9cf24 100644
--- a/05_data_workflows/01_network_volumes/README.md
+++ b/05_data_workflows/01_network_volumes/README.md
@@ -11,23 +11,21 @@ The GPU worker generates images with Stable Diffusion and writes them to a Runpo
 
 ### 1. Install Dependencies
 
 ```bash
-pip install -r requirements.txt
+uv sync
 ```
 
-### 2. Configure Environment
-
-Create `.env`:
+### 2. Authenticate
 
 ```bash
-RUNPOD_API_KEY=your_api_key_here
+uv run flash login
 ```
 
-Get your API key from [Runpod Settings](https://www.runpod.io/console/user/settings).
+Or create a `.env` file with `RUNPOD_API_KEY=your_api_key_here`.
 
 ### 3. Run Locally
 
 ```bash
-flash run
+uv run flash run
 ```
 
 Server starts at `http://localhost:8888`
diff --git a/05_data_workflows/README.md b/05_data_workflows/README.md
index 7b1623b..87be705 100644
--- a/05_data_workflows/README.md
+++ b/05_data_workflows/README.md
@@ -4,7 +4,7 @@ Handle data storage, processing, and pipelines in Flash applications. Learn pers
 
 ## Examples
 
-### 01_network_volumes _(coming soon)_
+### [01_network_volumes](./01_network_volumes/)
 Persistent storage with Runpod network volumes.
 
 **What you'll learn:**
diff --git a/CLI-REFERENCE.md b/CLI-REFERENCE.md
index a058cf3..142beba 100644
--- a/CLI-REFERENCE.md
+++ b/CLI-REFERENCE.md
@@ -21,6 +21,7 @@ flash --help # Show help for specific command
 
 | Command | Purpose |
 |---------|---------|
+| [`flash login`](#flash-login) | Authenticate with Runpod |
 | [`flash init`](#flash-init) | Create new Flash project |
 | [`flash run`](#flash-run) | Run development server |
 | [`flash build`](#flash-build) | Build application package |
@@ -60,6 +61,51 @@ flash --help # Show help for specific command
 
 ---
 
+## flash login
+
+Authenticate with Runpod. Opens a browser for authentication and saves credentials locally.
+
+### Syntax
+
+```bash
+flash login
+```
+
+### What It Does
+
+1. Opens your default browser to Runpod's authentication page
+2. After you authenticate, saves credentials securely
+3. Credentials persist across sessions
+
+### Examples
+
+**Authenticate with Runpod:**
+```bash
+flash login
+# Opens browser for authentication
+```
+
+### Alternative: Manual Configuration
+
+If `flash login` doesn't work in your environment, you can set the API key manually:
+
+```bash
+# Set environment variable
+export RUNPOD_API_KEY=your-key-here
+
+# Or add to .env file
+echo "RUNPOD_API_KEY=your-key-here" > .env
+```
+
+Get your API key from [Runpod Settings](https://www.runpod.io/console/user/settings).
+
+### Related Commands
+
+- [`flash run`](#flash-run) - Run development server (requires authentication)
+- [`flash deploy`](#flash-deploy) - Deploy to Runpod (requires authentication)
+
+---
+
 ## flash init
 
 Create a new Flash project with the correct structure and boilerplate code.
diff --git a/DEVELOPMENT.md b/DEVELOPMENT.md
index 9efc3f5..984e29a 100644
--- a/DEVELOPMENT.md
+++ b/DEVELOPMENT.md
@@ -155,6 +155,45 @@ Virtual Environment: Active (.venv exists)
 Python Version: Python 3.12.10
 ```
 
+### Running Flash with Different Package Managers
+
+After `make setup` completes, you can run Flash in two ways:
+
+**Option A: Using Package Manager Prefix (recommended for uv/poetry/pipenv/conda)**
+
+Run commands with the package manager prefix without activation:
+
+```bash
+# With uv
+uv run flash run
+
+# With poetry
+poetry run flash run
+
+# With pipenv
+pipenv run flash run
+
+# With conda
+conda run -p ./.venv flash run
+```
+
+**Option B: Activate Virtual Environment (works with all managers)**
+
+Alternatively, activate the virtual environment first:
+
+```bash
+# Unix/macOS
+source .venv/bin/activate
+
+# Windows
+.venv\Scripts\activate
+
+# Then run normally
+flash run
+```
+
+Once activated, you can run Flash and other commands directly without a prefix.
+
 ## Makefile Commands
 
 ### Help
@@ -217,6 +256,14 @@ Shows all available commands with your detected environment manager.
 
 The Flash CLI provides commands for local development, building, and deployment.
 
+### Authentication
+
+```bash
+flash login # Authenticate with Runpod (opens browser)
+```
+
+Or set `RUNPOD_API_KEY` in your `.env` file.
+
 ### Development Commands
 
 ```bash
@@ -495,6 +542,66 @@ make clean-venv
 make setup
 ```
 
+## Unified App Architecture
+
+The root directory provides a programmatic discovery system that automatically finds and loads all examples when you run `flash run` from the project root.
+
+### Discovery Process
+
+1. Scans all example category directories (`01_getting_started/`, `02_ml_inference/`, etc.)
+2. Detects `@remote` decorated functions in `.py` files via AST parsing
+3. Dynamically generates routes with unique prefixes (e.g., `/01_hello_world/gpu/`)
+4. Generates metadata and documentation automatically
+
+### Benefits
+
+- Add new examples without modifying any central configuration
+- Consistent routing across all examples
+- Single entry point for exploring all functionality
+- Automatic discovery eliminates manual registration
+
+### Example Structure
+
+Each example follows this flat-file pattern:
+
+```
+example_name/
+├── README.md # Documentation and deployment guide
+├── gpu_worker.py # @remote decorated functions (GPU)
+├── cpu_worker.py # @remote decorated functions (CPU, optional)
+├── pyproject.toml # Project dependencies
+└── .gitignore # Git ignore patterns
+```
+
+### Example Dependency Syncing
+
+The unified app requires all example dependencies to be installed in the root environment for discovery to work.
+
+**Automatic Dependency Syncing:**
+
+When adding a new example with additional dependencies:
+
+1. Define dependencies in your example's `pyproject.toml`
+2. Run `make sync-example-deps` to update the root dependencies
+3. Run `uv sync --all-groups` (or `make setup`) to install new packages
+4. Run `make requirements.txt` to regenerate the lockfile
+
+The sync script automatically:
+- Scans all example directories for `pyproject.toml` files
+- Merges dependencies into the root configuration
+- Filters out transitive dependencies already provided by `runpod-flash`
+- Detects version conflicts and uses the most permissive constraint
+- Preserves essential root dependencies (numpy, torch, runpod-flash)
+
+**Example:**
+
+```bash
+# After adding a new example with pillow and structlog dependencies
+make sync-example-deps # Syncs example deps to root pyproject.toml
+uv sync --all-groups # Installs new dependencies
+make requirements.txt # Regenerates lockfile
+```
+
 ## Troubleshooting
 
 ### Package Manager Not Detected
diff --git a/README.md b/README.md
index 155773d..b0099f0 100644
--- a/README.md
+++ b/README.md
@@ -4,342 +4,74 @@ A collection of example applications showcasing Runpod Flash - a framework for b
 
 ## What is Flash?
 
-Flash is a CLI tool and framework from the `runpod_flash` package that enables you to build FastAPI applications with workers that run on Runpod's serverless infrastructure. Write your code locally, and Flash handles deployment, scaling, and resource management.
+Flash is a Python framework that lets you run functions on Runpod's Serverless infrastructure with a single decorator. Write code locally, deploy globally—Flash handles provisioning, scaling, and routing automatically.
 
-## Prerequisites
-
-- Python 3.10+ (3.12 recommended)
-- [Runpod Account](https://console.runpod.io/signup)
-- [Runpod API Key](https://docs.runpod.io/get-started/api-keys)
-- Flash CLI: `pip install runpod_flash`
-
-## Quick Start
-
-```bash
-# Clone the repository
-git clone https://github.com/runpod/flash-examples.git
-cd flash-examples
-
-# Setup development environment (auto-detects package manager)
-make setup
-
-# The setup will:
-# - Create virtual environment
-# - Install all dependencies
-# - Create .env file from template
-# - Verify your setup
-
-# Add your API key to .env file
-# Get your key from: https://www.runpod.io/console/user/settings
-nano .env # or use your preferred editor
-
-# Run all examples from the unified app (recommended)
-flash run
-
-# Visit http://localhost:8888/docs
-```
+```python
+from runpod_flash import Endpoint, GpuType
 
-**Verification:**
-After setup, verify your environment is correct:
-```bash
-make verify-setup
+@Endpoint(name="image-gen", gpu=GpuType.NVIDIA_GEFORCE_RTX_4090, dependencies=["torch", "diffusers"])
+async def generate_image(prompt: str) -> bytes:
+    # This runs on a cloud GPU, not your laptop
+    ...
 ```
 
-This checks:
-- Python version (3.10+ required)
-- Virtual environment exists
-- Flash CLI is installed
-- API key is configured
+**Key features:**
+- **`@Endpoint` decorator**: Mark any async function to run on serverless infrastructure
+- **Auto-scaling**: Scale to zero when idle, scale up under load
+- **Local development**: `flash run` starts a local server with hot reload
+- **One-command deploy**: `flash deploy` packages and ships your code
 
-**Alternative Setup Methods:**
-- **With Makefile**: `make setup` (auto-detects your package manager)
-- **With uv**: `uv sync && uv pip install -e .`
-- **With pip**: `pip install -e .`
-- **With poetry**: `poetry install`
-
-**Using Different Package Managers:**
+## Prerequisites
 
-The setup automatically detects and uses your available package manager in this order:
-1. uv
-2. poetry
-3. pipenv
-4. pip (standard Python package manager)
-5. conda
+- **Python 3.10+**
+- **uv**: Install with `curl -LsSf https://astral.sh/uv/install.sh | sh`
+- **Runpod account**: [Sign up here](https://runpod.io/console/signup)
 
-To explicitly use a specific package manager:
+## Quick Start
 
 ```bash
-# Force pip usage
-PKG_MANAGER=pip make setup
-
-# Force poetry
-PKG_MANAGER=poetry make setup
-
-# Force conda
-PKG_MANAGER=conda make setup
-```
-
-**Running Flash Examples:**
-
-After `make setup` completes, you can run Flash in two ways:
-
-**Option A: Using Package Manager Prefix (recommended for uv/poetry/pipenv/conda)**
+# Clone and install
+git clone https://github.com/runpod/flash-examples.git
+cd flash-examples
+uv sync && uv pip install -e .
-With uv, poetry, pipenv, or conda, run commands with the package manager prefix without activation:
+# Authenticate with Runpod
+uv run flash login
 
-```bash
-# With uv
+# Run all examples locally
 uv run flash run
-
-# With poetry
-poetry run flash run
-
-# With pipenv
-pipenv run flash run
-
-# With conda
-conda run -p ./.venv flash run
-```
-
-**Option B: Activate Virtual Environment (works with all managers)**
-
-Alternatively, activate the virtual environment first:
-
-```bash
-# Unix/macOS
-source .venv/bin/activate
-
-# Windows
-.venv\Scripts\activate
-
-# Then run normally
-flash run
 ```
 
-Once activated, you can run Flash and other commands directly without a prefix.
-
-**Note**: After running `make setup`, all example dependencies are installed. You can navigate to any example directory and run `flash run` immediately. The `make setup` command will show you the recommended next steps based on your detected package manager.
-
-For detailed development instructions, see [DEVELOPMENT.md](./DEVELOPMENT.md).
-
-## Examples by Category
-
-### 01 - Getting Started
-Learn the fundamentals of Flash applications.
+Open **http://localhost:8888/docs** to explore all endpoints.
 
-- **[01_hello_world](./01_getting_started/01_hello_world/)** - The simplest Flash application with GPU workers
-- **[02_cpu_worker](./01_getting_started/02_cpu_worker/)** - CPU-only worker example
-- **[03_mixed_workers](./01_getting_started/03_mixed_workers/)** - Mixed GPU/CPU workers with cost optimization and validation
-- **[04_dependencies](./01_getting_started/04_dependencies/)** - Managing Python and system dependencies
+> **Using pip, poetry, or conda?** See [DEVELOPMENT.md](./DEVELOPMENT.md) for alternative setups.
 
-### 02 - ML Inference
-Deploy machine learning models as APIs.
+## Examples
 
-- 01_text_generation - LLM inference (Llama, Mistral, etc.) _(coming soon)_
-- 02_image_generation - Stable Diffusion image generation _(coming soon)_
-- 03_embeddings - Text embeddings API _(coming soon)_
-- 04_multimodal - Vision-language models _(coming soon)_
+| Category | Example | Description |
+|----------|---------|-------------|
+| **Getting Started** | [01_hello_world](./01_getting_started/01_hello_world/) | Basic GPU worker |
+| | [02_cpu_worker](./01_getting_started/02_cpu_worker/) | CPU-only worker |
+| | [03_mixed_workers](./01_getting_started/03_mixed_workers/) | GPU + CPU pipeline |
+| | [04_dependencies](./01_getting_started/04_dependencies/) | Dependency management |
+| **ML Inference** | [01_text_to_speech](./02_ml_inference/01_text_to_speech/) | Qwen3-TTS model serving |
+| **Advanced** | [05_load_balancer](./03_advanced_workers/05_load_balancer/) | HTTP routing with load balancer |
+| **Scaling** | [01_autoscaling](./04_scaling_performance/01_autoscaling/) | Worker autoscaling configuration |
+| **Data** | [01_network_volumes](./05_data_workflows/01_network_volumes/) | Persistent storage with network volumes |
 
-### 03 - Advanced Workers
-Production-ready worker patterns.
+More examples coming soon in each category.
 
-- 01_streaming - Streaming responses (SSE/WebSocket) _(coming soon)_
-- 02_batch_processing - Batch inference optimization _(coming soon)_
-- 03_caching - Model and result caching strategies _(coming soon)_
-- 04_custom_images - Custom Docker images _(coming soon)_
-- **[05_load_balancer](./03_advanced_workers/05_load_balancer/)** - Load-balancer endpoints with custom HTTP routes
-
-### 04 - Scaling & Performance
-Optimize for production workloads.
-
-- 01_autoscaling - Worker autoscaling configuration _(coming soon)_
-- 02_gpu_optimization - GPU memory management _(coming soon)_
-- 03_concurrency - Async patterns and concurrency _(coming soon)_
-- 04_monitoring - Logging, metrics, and observability _(coming soon)_
-
-### 05 - Data Workflows
-Handle data storage and processing.
-
-- 01_network_volumes - Persistent storage with network volumes _(coming soon)_
-- 02_file_upload - Handling file uploads _(coming soon)_
-- 03_data_pipelines - ETL workflows _(coming soon)_
-- 04_s3_integration - Cloud storage integration _(coming soon)_
-
-### 06 - Real World Applications
-Complete production-ready applications.
-
-- 01_chatbot_api - Production chatbot service _(coming soon)_
-- 02_image_api - Image processing service _(coming soon)_
-- 03_audio_transcription - Whisper transcription service _(coming soon)_
-- 04_multimodel_pipeline - Complex multi-stage workflows _(coming soon)_
-
-## Learning Path
-
-**New to Flash?** Start here:
-1. [01_getting_started/01_hello_world](./01_getting_started/01_hello_world/) - Understand the basics
-2. [01_getting_started/03_mixed_workers](./01_getting_started/03_mixed_workers/) - Learn cost optimization and validation patterns
-3. 02_ml_inference/01_text_generation - Deploy your first model
-
-**Coming from Modal?**
-Flash is FastAPI-centric for building production applications, while Modal focuses on standalone functions. Flash provides structured application development with built-in routing and deployment management.
-
-**Production Deployment?**
-1. Review examples in `04_scaling_performance/`
-2. Study `06_real_world/` for complete architectures
-3. Check deployment docs in each example's README
-
-## Flash CLI Commands
-
-The Flash CLI provides a complete toolkit for building, testing, and deploying distributed inference applications.
-
-### Core Commands
+## CLI Commands
 
 ```bash
-flash init [project] # Create new Flash project
-flash run # Run development server (localhost:8888)
-flash build # Build deployment package
-flash deploy --env <env> # Build and deploy to environment
-flash undeploy <endpoint> # Delete deployed endpoint
+flash login # Authenticate with Runpod (opens browser)
+flash run # Run development server (localhost:8888)
+flash build # Build deployment package
+flash deploy --env <env> # Build and deploy to environment
+flash undeploy <endpoint> # Delete deployed endpoint
 ```
 
-### Environment Management
-
-```bash
-flash env list # Show all environments
-flash env create <name> # Create new environment (dev, staging, prod)
-flash env get <name> # Show environment details
-flash env delete <name> # Delete environment
-```
-
-### Application Management
-
-```bash
-flash app list # List all Flash apps
-flash app create <name> # Create new app
-flash app get <name> # Show app details
-flash app delete --app <name> # Delete app and all resources
-```
-
-### Command Options
-
-Many commands support additional options for customization:
-
-```bash
-flash run --host 0.0.0.0 --port 9000 # Custom host and port
-flash build --exclude torch,torchvision # Exclude packages (reduce size)
-flash deploy --env prod --preview # Local preview before deploying
-flash undeploy --all --force # Remove all endpoints
-```
-
-### Full Documentation
-
-**[Complete CLI Reference](CLI-REFERENCE.md)** - Comprehensive guide with all commands, options, and examples
-
-**Step-by-Step Guides:**
-- [Getting Started (5 minutes)](docs/cli/getting-started.md) - Your first Flash project
-- [Command Reference](docs/cli/commands.md) - Exhaustive documentation for all commands
-- [Workflows](docs/cli/workflows.md) - Common development workflows
-- [Troubleshooting](docs/cli/troubleshooting.md) - Solutions to common problems
-
-### Quick Start
-
-```bash
-# 1. Create project
-flash init my-api && cd my-api
-
-# 2. Run locally
-flash run
-# Visit http://localhost:8888/docs
-
-# 3. Create environment and deploy
-flash env create production
-flash deploy --env production
-```
-
-## Testing Your Application
-
-After running `flash run`, you can test your API in two ways:
-
-**Option A: Using the Interactive UI**
-
-Visit **http://localhost:8888/docs** to use FastAPI's built-in Swagger UI where you can:
-- See all available endpoints
-- Test requests directly in your browser
-- View request/response schemas
-
-**Option B: Using curl**
-
-Test endpoints from the command line:
-```bash
-curl -X POST http://localhost:8888/endpoint \
-  -H "Content-Type: application/json" \
-  -d '{"key": "value"}'
-```
-
-See individual example READMEs for specific endpoint examples.
-
-## Unified App Architecture
-
-The root [main.py](main.py) provides a programmatic discovery system that automatically finds and loads all examples:
-
-**Discovery Process**:
-1. Scans all example category directories (`01_getting_started/`, `02_ml_inference/`, etc.)
-2. Detects two patterns:
-   - **Queue-based workers**: `@Endpoint(...)` decorated functions
-   - **Load-balanced workers**: `Endpoint` instances with `.get()/.post()` route decorators
-3. Dynamically imports and registers all endpoints with unique prefixes (e.g., `/01_hello_world/gpu/`)
-4. Generates metadata and documentation automatically
-
-**Benefits**:
-- Add new examples without modifying the unified app
-- Consistent routing across all examples
-- Single entry point for exploring all functionality
-- Automatic discovery eliminates manual registration
-
-## Example Structure
-
-Each example follows this structure:
-
-```
-example_name/
-├── README.md # Documentation and deployment guide
-├── gpu_worker.py # GPU worker with @Endpoint decorator
-├── cpu_worker.py # CPU worker with @Endpoint decorator
-├── requirements.txt # Python dependencies
-├── pyproject.toml # Project configuration
-└── .env.example # Environment variable template
-```
-
-### Dependency Management
-
-The unified app automatically discovers and loads all examples at runtime, which requires all example dependencies to be installed in the root environment.
-
-**Automatic Dependency Syncing**:
-
-When adding a new example with additional dependencies:
-
-1. Define dependencies in your example's `pyproject.toml`
-2. Run `make sync-example-deps` to update the root dependencies
-3. Run `uv sync --all-groups` to install the new packages
-4. Run `make requirements.txt` to regenerate the lockfile
-
-The sync script automatically:
-- Scans all example directories for `pyproject.toml` files
-- Merges dependencies into the root configuration
-- Filters out transitive dependencies already provided by `runpod-flash`
-- Detects version conflicts and uses the most permissive constraint
-- Preserves essential root dependencies (numpy, torch, runpod-flash)
-
-**Example**:
-
-```bash
-# After adding a new example with pillow and structlog dependencies
-make sync-example-deps # Syncs example deps to root pyproject.toml
-uv sync --all-groups # Installs new dependencies
-make requirements.txt # Regenerates lockfile
-```
-
-This automation ensures that `flash run` from the root directory always has access to all required dependencies for dynamic example loading.
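The discovery mechanism referenced throughout this diff detects decorated functions statically, without importing the module first. A minimal sketch of AST-based detection, assuming a decorator named `remote`; this is illustrative only and does not reflect Flash's actual implementation:

```python
import ast
from pathlib import Path

def find_decorated_functions(path: Path, decorator: str = "remote") -> list[str]:
    """Return names of functions decorated with @<decorator> or @<decorator>(...)."""
    tree = ast.parse(path.read_text())
    names = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            for dec in node.decorator_list:
                # @remote(...) parses as a Call; @remote as a bare Name
                target = dec.func if isinstance(dec, ast.Call) else dec
                if isinstance(target, ast.Name) and target.id == decorator:
                    names.append(node.name)
    return names
```

Because the file is parsed rather than executed, workers can be enumerated even when their heavy dependencies are not installed locally.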
+See **[CLI-REFERENCE.md](./CLI-REFERENCE.md)** for complete documentation.
 
 ## Key Concepts
 
@@ -350,9 +82,9 @@ The `Endpoint` class configures functions for execution on Runpod's serverless i
 
 **Queue-based (one function = one endpoint):**
 ```python
-from runpod_flash import Endpoint, GpuGroup
+from runpod_flash import Endpoint, GpuType
 
-@Endpoint(name="my-worker", gpu=GpuGroup.ADA_24, workers=(0, 3), dependencies=["torch"])
+@Endpoint(name="my-worker", gpu=GpuType.NVIDIA_GEFORCE_RTX_4090, workers=(0, 3), dependencies=["torch"])
 async def process(data: dict) -> dict:
     import torch
     # this code runs on Runpod GPUs
@@ -389,14 +121,18 @@ print(job.output)
 
 ### Resource Types
 
 **GPU Workers** (`gpu=`):
-- `GpuGroup.ADA_24` - RTX 4090 (24GB)
-- `GpuGroup.ADA_48_PRO` - RTX 6000 Ada, L40 (48GB)
-- `GpuGroup.AMPERE_80` - A100 (80GB)
+| Type | Use Case |
+|------|----------|
+| `GpuType.NVIDIA_GEFORCE_RTX_4090` | RTX 4090 (24GB) |
+| `GpuType.NVIDIA_RTX_6000_ADA_GENERATION` | RTX 6000 Ada (48GB) |
+| `GpuType.NVIDIA_A100_80GB_PCIe` | A100 (80GB) |
 
 **CPU Workers** (`cpu=`):
-- `CpuInstanceType.CPU3G_2_8` - 2 vCPU, 8GB RAM
-- `CpuInstanceType.CPU3C_4_8` - 4 vCPU, 8GB RAM (Compute)
-- `CpuInstanceType.CPU5G_4_16` - 4 vCPU, 16GB RAM (Latest)
+| Type | Specs |
+|------|-------|
+| `cpu3g-2-8` | 2 vCPU, 8GB RAM |
+| `cpu3c-4-8` | 4 vCPU, 8GB RAM (Compute) |
+| `cpu5c-4-16` | 4 vCPU, 16GB RAM (Latest) |
 
 ### Auto-Scaling
 
@@ -405,30 +141,16 @@ Workers automatically scale based on demand:
 
 - `workers=(1, 5)` - Keep 1 warm, scale up to 5
 - `idle_timeout=5` - Minutes before scaling down
 
-## Contributing
-
-We welcome contributions! See [CONTRIBUTING.md](./CONTRIBUTING.md) for contribution guidelines and [DEVELOPMENT.md](./DEVELOPMENT.md) for development setup.
-
-To add a new example:
-1. Follow the standard example structure
-2. Include comprehensive README with deployment steps
-3. Add tests for critical functionality
-4. Ensure all dependencies are pinned in requirements.txt
-5. Run `make quality-check` before committing
-6. Test deployment with `flash deploy`
-
 ## Resources
 
-- [Flash CLI Documentation](https://github.com/runpod/runpod-flash)
+- [Flash Documentation](https://docs.runpod.io)
 - [Runpod Serverless Docs](https://docs.runpod.io/serverless/overview)
-- [Flash SDK Reference](https://github.com/runpod/runpod-flash)
 - [Community Discord](https://discord.gg/runpod)
 
-## Testing
+## Contributing
 
-All examples are continuously tested against Python 3.10-3.14 to ensure compatibility across all supported versions. See [.github/workflows/](./.github/workflows/) for CI configuration.
+See [CONTRIBUTING.md](./CONTRIBUTING.md) for contribution guidelines and [DEVELOPMENT.md](./DEVELOPMENT.md) for development setup.
 
 ## License
 
 MIT License - see [LICENSE](./LICENSE) for details.
-
diff --git a/docs/cli/commands.md b/docs/cli/commands.md
index ae02f64..9c309f0 100644
--- a/docs/cli/commands.md
+++ b/docs/cli/commands.md
@@ -4,6 +4,7 @@ Exhaustive documentation for all Flash CLI commands. This guide covers every opt
 
 ## Table of Contents
 
+- [flash login](#flash-login) - Authenticate with Runpod
 - [flash init](#flash-init) - Create new Flash project
 - [flash run](#flash-run) - Run development server
 - [flash build](#flash-build) - Build deployment package
@@ -22,6 +23,46 @@ Exhaustive documentation for all Flash CLI commands. This guide covers every opt
 
 ---
 
+## flash login
+
+Authenticate with Runpod by opening a browser for OAuth authentication.
+
+### Synopsis
+
+```bash
+flash login
+```
+
+### Description
+
+Authenticates with Runpod by opening your default browser to the Runpod authentication page. After successful authentication, credentials are saved locally and persist across terminal sessions.
+
+This is the recommended authentication method for interactive use. For CI/CD or automated environments, use the `RUNPOD_API_KEY` environment variable instead.
+
+### Examples
+
+**Authenticate with Runpod:**
+```bash
+flash login
+```
+
+**Alternative: Manual API key configuration:**
+```bash
+# Set environment variable
+export RUNPOD_API_KEY=your-key-here
+
+# Or add to .env file
+echo "RUNPOD_API_KEY=your-key-here" > .env
+```
+
+### Notes
+
+- Requires a browser for OAuth flow
+- Credentials are stored securely on your local machine
+- For headless environments, use `RUNPOD_API_KEY` environment variable
+
+---
+
 ## flash init
 
 Create a new Flash project with the correct structure, boilerplate code, and configuration files.
@@ -179,12 +220,12 @@ The generated `.gitignore` already includes necessary patterns.
 
 2. **Install Dependencies**
 
    ```bash
-   pip install -e .
+   uv sync && uv pip install -e .
    ```
 
 3. **Run Locally**
 
    ```bash
-   flash run
+   uv run flash run
    ```
 
 4. **View Documentation**
diff --git a/docs/cli/getting-started.md b/docs/cli/getting-started.md
index 940b8f0..04f3a79 100644
--- a/docs/cli/getting-started.md
+++ b/docs/cli/getting-started.md
@@ -7,30 +7,34 @@ Complete your first Flash project in under 10 minutes. This guide walks you thro
 
 Before starting, ensure you have:
 
 - **Python 3.10 or higher** - Check with `python --version`
+- **uv** - Install with `curl -LsSf https://astral.sh/uv/install.sh | sh`
 - **Runpod API Key** - Get from https://runpod.io/console/user/settings
-- **Flash installed** - Install with `pip install runpod-flash`
 
 ### Verify Installation
 
 ```bash
-flash --version
+uv run flash --version
 # Should output: flash, version X.Y.Z
 ```
 
-### Configure API Key
+### Authenticate
 
-Set your Runpod API key as an environment variable:
+The easiest way to authenticate is with `flash login`:
 
 ```bash
-export RUNPOD_API_KEY=your-key-here
+uv run flash login
 ```
 
-Or add to `.env` file:
+This opens a browser for authentication and saves your credentials.
+
+**Alternative:** Set your Runpod API key manually:
 
 ```bash
+export RUNPOD_API_KEY=your-key-here
+# Or add to .env file:
 echo "RUNPOD_API_KEY=your-key-here" > .env
 ```
 
-**Checkpoint:** Running `echo $RUNPOD_API_KEY` should display your key.
+**Checkpoint:** Running `uv run flash login` or `echo $RUNPOD_API_KEY` confirms authentication.
 
 ---
 
@@ -41,7 +45,7 @@ echo "RUNPOD_API_KEY=your-key-here" > .env
 
 Create a new Flash project with the CLI:
 
 ```bash
-flash init hello-flash
+uv run flash init hello-flash
 cd hello-flash
 ```
 
@@ -88,7 +92,7 @@ async def process_request(payload: dict) -> dict:
 
 Start the development server:
 
 ```bash
-flash run
+uv run flash run
 ```
 
 **Expected output:**
@@ -152,7 +156,7 @@ curl -X POST http://localhost:8888/process \
 
 Stop the development server (Ctrl+C), then create a deployment environment:
 
 ```bash
-flash env create dev
+uv run flash env create dev
 ```
 
 **Expected output:**
@@ -164,7 +168,7 @@ Environment 'dev' created successfully
 
 - Created a deployment target named "dev"
 - This environment will contain your deployed endpoints
 
-**Checkpoint:** Run `flash env list` and verify "dev" appears.
+**Checkpoint:** Run `uv run flash env list` and verify "dev" appears.
 
 ---
 
@@ -173,7 +177,7 @@ Environment 'dev' created successfully
 
 Package your application for deployment:
 
 ```bash
-flash build
+uv run flash build
 ```
 
 **Expected output:**
@@ -200,7 +204,7 @@ Creating archive...
 
 Deploy your application to the "dev" environment:
 
 ```bash
-flash deploy --env dev
+uv run flash deploy --env dev
 ```
 
 **Expected output:**
@@ -289,7 +293,8 @@ The flash-examples repository contains production-ready examples:
 
 ```bash
 git clone https://github.com/runpod/flash-examples.git
 cd flash-examples
-flash run
+uv sync && uv pip install -e .
+uv run flash run
 # Visit http://localhost:8888/docs to explore all examples
 ```
 
@@ -331,9 +336,11 @@ flash run
 
 **Solution:**
 
 ```bash
-pip install runpod-flash
+# Install uv if needed
+curl -LsSf https://astral.sh/uv/install.sh | sh
+
 # Verify installation
-flash --version
+uv run flash --version
 ```
 
 ### Port Already in Use
 
@@ -343,7 +350,7 @@ flash --version
 
 **Solution:**
 ```bash
 # Use different port
-flash run --port 9000
+uv run flash run --port 9000
 
 # Or find and kill process using port 8888
 lsof -ti:8888 | xargs kill -9
@@ -367,7 +374,7 @@ echo $RUNPOD_API_KEY
 
 **Solution:**
 ```bash
 # Exclude packages present in Runpod base image
-flash build --exclude torch,torchvision,torchaudio
+uv run flash build --exclude torch,torchvision,torchaudio
 ```
 
 See [Troubleshooting Guide](troubleshooting.md) for more solutions.
 
@@ -378,13 +385,13 @@ See [Troubleshooting Guide](troubleshooting.md) for more solutions.
 
 | Command | Purpose |
 |---------|---------|
-| `flash init <project-name>` | Create new project |
-| `flash run` | Run development server |
-| `flash build` | Build deployment package |
-| `flash deploy --env <env>` | Deploy to environment |
-| `flash env create <env>` | Create environment |
-| `flash env list` | List environments |
-| `flash undeploy <endpoint-id>` | Delete endpoint |
+| `uv run flash init <project-name>` | Create new project |
+| `uv run flash run` | Run development server |
+| `uv run flash build` | Build deployment package |
+| `uv run flash deploy --env <env>` | Deploy to environment |
+| `uv run flash env create <env>` | Create environment |
+| `uv run flash env list` | List environments |
+| `uv run flash undeploy <endpoint-id>` | Delete endpoint |
 
 ---
 
diff --git a/docs/cli/troubleshooting.md b/docs/cli/troubleshooting.md
index 44ac31a..9f0cc58 100644
--- a/docs/cli/troubleshooting.md
+++ b/docs/cli/troubleshooting.md
@@ -29,7 +29,16 @@ bash: flash: command not found
 
 **Solutions:**
 
-**1. Install Flash:**
+**1. Install with uv (recommended):**
+```bash
+# Install uv if needed
+curl -LsSf https://astral.sh/uv/install.sh | sh
+
+# Verify installation
+uv run flash --version
+```
+
+**2. Install with pip (alternative):**
 ```bash
 pip install runpod-flash
 
@@ -37,7 +46,7 @@ pip install runpod-flash
 flash --version
 ```
 
-**2. Check PATH:**
+**3. Check PATH:**
 ```bash
 # Find where flash is installed
 which flash
diff --git a/docs/cli/workflows.md b/docs/cli/workflows.md
index 7ecc93e..3283969 100644
--- a/docs/cli/workflows.md
+++ b/docs/cli/workflows.md
@@ -46,19 +46,21 @@ cd my-api
 
 #### 2. Set Up Environment
 
 ```bash
-# Create virtual environment (if needed)
+# Install dependencies with uv (recommended)
+uv sync && uv pip install -e .
+```
+
+**Alternative with pip:**
+```bash
 python -m venv .venv
 source .venv/bin/activate # macOS/Linux
-# .venv\Scripts\activate # Windows
-
-# Install dependencies
 pip install -e .
 ```
 
 **Validation:**
 ```bash
 python --version # Should show 3.10+
-pip list | grep runpod-flash # Should show runpod-flash
+uv run flash --version # Should show flash version
 ```
 
 #### 3. Configure Environment Variables