
Flash Examples

Auto-generated by /analyze-repos on 2026-02-22. Manual edits will be overwritten on next analysis.

Project Overview

Production-ready examples demonstrating Flash framework capabilities. Flat-file pattern: each worker is a standalone .py file with @Endpoint decorator, auto-discovered by flash run. 6 categories, 18 worker files. Root pyproject.toml declares only runpod-flash dependency; runtime deps declared inline via Endpoint(dependencies=[...]).

Architecture

Key Abstractions

  1. @Endpoint decorator (QB) -- Core pattern. async def marked with @Endpoint(name=..., gpu=..., ...) for queue-based remote execution.
  2. Endpoint routes (LB) -- Load-balanced pattern. api = Endpoint(...) with @api.get()/@api.post() route decorators for HTTP endpoints.
  3. @Endpoint decorator (class) -- Used on SimpleSD class (05_data_workflows). Class-based pattern for stateful workers.
  4. Cross-worker orchestration -- Pipeline files import from QB workers, chain with await. LB endpoint orchestrates QB workers.
  5. Flat-file discovery -- No FastAPI boilerplate, no routers, no main.py. flash run auto-generates routes from decorated functions.
  6. In-function imports -- Heavy libs (torch, transformers, etc.) imported inside @Endpoint body, only runpod_flash at module level.
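
The queue-based and load-balanced patterns are shown under Worker File Patterns below; the class-based pattern (abstraction 3) has no snippet in this document, so here is a hedged sketch. The `Endpoint` stand-in defined below, the `generate` method name, and the config fields are illustrative assumptions, not the real runpod_flash API:

```python
# Hypothetical sketch of the class-based pattern (abstraction 3). A stand-in
# Endpoint decorator is defined here so the sketch is self-contained; the
# real runpod_flash decorator and SimpleSD class may differ.
def Endpoint(**config):
    """Stand-in: attach the endpoint config to the decorated class."""
    def wrap(target):
        target._endpoint_config = config
        return target
    return wrap

@Endpoint(name="simple-sd", gpu="NVIDIA_GEFORCE_RTX_4090")  # assumed fields
class SimpleSD:
    """Stateful worker: model loaded once, reused across requests."""

    def __init__(self):
        self.model = None  # a real worker would lazy-load a heavy model here

    async def generate(self, payload: dict) -> dict:
        if self.model is None:
            self.model = "loaded"  # placeholder for the real model load
        return {"status": "success", "prompt": payload.get("prompt")}
```

The point of the class form is that `__init__` state survives across requests on the same worker, which the flat function form cannot express.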

Entry Points

All worker files across 6 categories. Each file is an independent entry point discovered by flash run.

Module Structure

```
01_getting_started/          # Fundamentals
  01_hello_world/            # Basic GPU worker
  02_cpu_worker/             # CPU-only worker
  03_mixed_workers/          # Cross-worker orchestration (CPU -> GPU -> LB)
  04_dependencies/           # Runtime dependency declaration
02_ml_inference/             # ML deployment
  01_text_to_speech/         # Qwen3-TTS model serving
03_advanced_workers/         # Advanced patterns
  05_load_balancer/          # LB endpoints with custom HTTP routes
04_scaling_performance/      # Autoscaling
  01_autoscaling/            # Scaling strategy examples
05_data_workflows/           # Data pipelines
  01_network_volumes/        # Network volume usage with @Endpoint class
06_real_world/               # Placeholder for production patterns
```

Worker File Patterns

Queue-based (function decorator):

```python
from runpod_flash import Endpoint, GpuType

@Endpoint(
    name="my-worker",
    gpu=GpuType.NVIDIA_GEFORCE_RTX_4090,
)
async def my_function(payload: dict) -> dict:
    """All runtime imports inside the function body."""
    import torch
    return {"status": "success"}
```

Load-balanced (route decorators):

```python
from runpod_flash import Endpoint

api = Endpoint(name="my-api", cpu="cpu3c-1-2", workers=(1, 3))

@api.post("/process")
async def process(data: dict) -> dict:
    return {"result": data}

@api.get("/health")
async def health() -> dict:
    return {"status": "ok"}
```

Resource Configuration

GPU vs CPU is a parameter, not a class choice:

| Config | Syntax | Use Case |
| --- | --- | --- |
| GPU endpoint | `@Endpoint(name=..., gpu=GpuType.NVIDIA_GEFORCE_RTX_4090)` | GPU workers |
| CPU endpoint | `@Endpoint(name=..., cpu="cpu3c-1-2")` | CPU workers |
| GPU LB | `api = Endpoint(name=..., gpu=GpuType.NVIDIA_GEFORCE_RTX_4090); @api.post(...)` | GPU LB endpoints |
| CPU LB | `api = Endpoint(name=..., cpu="cpu3c-1-2"); @api.post(...)` | CPU LB endpoints |

Cross-Worker Orchestration

Pipeline files import functions from other workers and chain them:

```python
from cpu_worker import preprocess_text
from gpu_worker import gpu_inference
from runpod_flash import Endpoint

pipeline = Endpoint(name="pipeline", cpu="cpu3c-1-2", workers=(1, 3))

@pipeline.post("/classify")
async def classify(text: str) -> dict:
    result = await preprocess_text({"text": text})
    return await gpu_inference(result)
```

Public API Surface

All examples import from runpod_flash. Import frequency by symbol:

| Symbol | Files Using It | Breakage Risk |
| --- | --- | --- |
| Endpoint | 18 | ALL examples break |
| GpuType | 7 | GPU config breaks |
| CpuInstanceType | 4 | CPU config breaks |
| NetworkVolume | 2 | Volume examples break |
| ServerlessScalerType | 1 | Scaling example breaks |

Cross-Repo Dependencies

Depends On

  • flash (runpod_flash package) -- all files import from it. Any breaking change to Endpoint constructor, enum values, or route decorator signature breaks examples at import time.

Depended On By

  • None. This is a leaf repo (documentation/examples only).

Interface Contracts

  • Endpoint(name=..., gpu=..., cpu=..., workers=...) constructor -- parameter rename/removal breaks all files
  • .get()/.post()/.put()/.delete()/.patch() route decorator signatures
  • GpuGroup, GpuType, CpuInstanceType enum values -- value removals break GPU/CPU configs
  • NetworkVolume constructor -- field changes break volume examples

Dependency Chain

flash-examples --> flash (runpod_flash) --> runpod-python (runpod)

Known Drift

  • No automated tests -- changes caught only at import time or flash run
  • No CI that validates examples against current flash version
  • Python version: inherits from flash (3.10+)

Development Commands

Setup

```bash
uv venv && source .venv/bin/activate
uv sync --all-groups
```

Testing

```bash
flash run                     # Start local dev server (localhost:8888)
# Visit http://localhost:8888/docs for interactive API docs
python gpu_worker.py          # Test a single worker directly (if __name__ == "__main__" block)
```

Quality

```bash
make quality-check            # REQUIRED BEFORE ALL COMMITS
make lint                     # Ruff linter
make format                   # Ruff formatter
make format-check             # Check formatting
```

Build and Deploy

```bash
flash build                   # Package build artifacts
flash deploy                  # Build + upload + provision endpoints
flash deploy --preview        # Local Docker Compose preview
flash build --use-local-flash # Use local flash library instead of PyPI
```

Code Health

High Severity

  • No test infrastructure at all. No conftest.py, no tests/ directory, no pytest config. Only if __name__ == "__main__" blocks for manual testing. Any flash API change is caught only at import time.

Medium Severity

  • Broad except Exception catches in 4 files -- they swallow specific errors and make debugging harder
  • Duplicated GPU inference logic in 04_scaling_performance -- 3 near-identical functions that should be extracted
  • No CI validation that examples work against the current flash version
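
The broad-catch issue above can be narrowed in plain Python; the handler names and exception choices here are illustrative, not taken from the example files:

```python
# Sketch of narrowing a broad exception handler. run_bad mirrors the
# pattern flagged above; run_good catches only failures the worker can
# meaningfully report and keeps the detail for debugging.
def run_bad(fn, payload):
    try:
        return fn(payload)
    except Exception:  # BAD: also masks typos such as NameError
        return {"status": "error"}

def run_good(fn, payload):
    try:
        return fn(payload)
    except (ValueError, KeyError) as exc:  # expected input failures only
        return {"status": "error", "detail": f"{type(exc).__name__}: {exc}"}
```

Anything outside the expected set (e.g. RuntimeError) then propagates and surfaces in logs instead of being silently converted to a generic error.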

Low Severity

  • Duplicated speakers/languages lists in 02_ml_inference/01_text_to_speech
  • Missing input validation in some workers (accepts arbitrary dict without schema)

Testing

Structure

No formal test infrastructure exists. Each worker has an optional if __name__ == "__main__" block for manual execution.
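
That optional block can look like the following sketch. The `@Endpoint` decorator is omitted here; the sketch assumes the decorated coroutine remains directly awaitable in-process, which may not hold for every Flash version:

```python
# worker.py -- sketch of the optional manual-test block.
import asyncio

async def my_function(payload: dict) -> dict:
    # heavy imports would go here in a real worker
    return {"status": "success", "echo": payload}

if __name__ == "__main__":
    # Invoke the worker locally without flash run
    print(asyncio.run(my_function({"text": "hello"})))
```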

Coverage Gaps

  • 100% uncovered -- no test framework, no conftest, no pytest config
  • No smoke tests that verify examples import successfully
  • No integration tests that run flash run against examples

Patterns

To test manually:

cd 01_getting_started/01_hello_world
flash run                    # Starts dev server, auto-discovers workers
# Use http://localhost:8888/docs to invoke endpoints

Recommended Test Strategy

  1. Add tests/test_imports.py that imports every worker file (catches Endpoint signature drift)
  2. Add tests/test_configs.py that validates all resource configs construct without error
  3. Add CI job that runs flash run --check (dry-run mode) against each example category
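
Recommendation 1 could look like the following sketch; the repo-relative glob and directory layout are assumptions based on the category tree under Module Structure:

```python
"""Sketch of a tests/test_imports.py smoke test: import every worker
file so any drift in the runpod_flash Endpoint signature fails fast."""
import importlib.util
import pathlib

def iter_worker_files(root: pathlib.Path):
    # Worker files live under the numbered category directories,
    # e.g. 01_getting_started/01_hello_world/*.py (assumed layout)
    yield from sorted(root.glob("0[1-6]_*/**/*.py"))

def import_worker(path: pathlib.Path):
    # Importing executes the @Endpoint decorator, so drift in the
    # Endpoint constructor or enum values raises right here.
    spec = importlib.util.spec_from_file_location(path.stem, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module

def test_all_workers_import(repo_root: pathlib.Path = pathlib.Path(".")):
    failures = []
    for path in iter_worker_files(repo_root):
        try:
            import_worker(path)
        except Exception as exc:  # collect everything, report at once
            failures.append(f"{path}: {exc!r}")
    assert not failures, "\n".join(failures)
```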

Common Mistakes

  1. Accessing external scope in @Endpoint functions -- only local variables, parameters, and internal imports work. The function body is serialized and sent to a remote worker.
  2. Module-level imports of heavy libraries -- import torch, numpy, transformers, etc. inside the function body, not at module level.
  3. Missing if __name__ == "__main__" test block -- each worker should be independently testable.
  4. Mutable default arguments -- use None and initialize in function body.
  5. Importing from flash instead of runpod_flash -- the package name is runpod_flash.
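
Mistake 4 is plain Python behavior, so it can be shown without any Flash API; the function names here are illustrative:

```python
# A mutable default is evaluated once at definition time and shared
# across every call -- and, for serialized workers, potentially across
# requests on the same worker.
def bad_collect(item, bucket=[]):      # BAD: one shared list
    bucket.append(item)
    return bucket

def good_collect(item, bucket=None):   # GOOD: fresh list per call
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket
```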

Last analyzed: 2026-02-22