Auto-generated by /analyze-repos on 2026-02-22. Manual edits will be overwritten on next analysis.
Production-ready examples demonstrating Flash framework capabilities. Flat-file pattern: each worker is a standalone `.py` file with an `@Endpoint` decorator, auto-discovered by `flash run`. 6 categories, 18 worker files. The root `pyproject.toml` declares only the `runpod-flash` dependency; runtime deps are declared inline via `Endpoint(dependencies=[...])`.
- `@Endpoint` decorator (QB) -- Core pattern. `async def` marked with `@Endpoint(name=..., gpu=..., ...)` for queue-based remote execution.
- Endpoint routes (LB) -- Load-balanced pattern. `api = Endpoint(...)` with `@api.get()`/`@api.post()` route decorators for HTTP endpoints.
- `@Endpoint` decorator (class) -- Used on the `SimpleSD` class (05_data_workflows). Class-based pattern for stateful workers.
- Cross-worker orchestration -- Pipeline files import from QB workers and chain them with `await`. An LB endpoint orchestrates QB workers.
- Flat-file discovery -- No FastAPI boilerplate, no routers, no `main.py`. `flash run` auto-generates routes from decorated functions.
- In-function imports -- Heavy libraries (torch, transformers, etc.) are imported inside the `@Endpoint` body; only `runpod_flash` is imported at module level.
All worker files across 6 categories. Each file is an independent entry point discovered by flash run.
```
01_getting_started/        # Fundamentals
  01_hello_world/          # Basic GPU worker
  02_cpu_worker/           # CPU-only worker
  03_mixed_workers/        # Cross-worker orchestration (CPU -> GPU -> LB)
  04_dependencies/         # Runtime dependency declaration
02_ml_inference/           # ML deployment
  01_text_to_speech/       # Qwen3-TTS model serving
03_advanced_workers/       # Advanced patterns
  05_load_balancer/        # LB endpoints with custom HTTP routes
04_scaling_performance/    # Autoscaling
  01_autoscaling/          # Scaling strategy examples
05_data_workflows/         # Data pipelines
  01_network_volumes/      # Network volume usage with @Endpoint class
06_real_world/             # Placeholder for production patterns
```
Queue-based (function decorator):

```python
from runpod_flash import Endpoint, GpuType


@Endpoint(
    name="my-worker",
    gpu=GpuType.NVIDIA_GEFORCE_RTX_4090,
)
async def my_function(payload: dict) -> dict:
    """All runtime imports go inside the function body."""
    import torch

    return {"status": "success"}
```

Load-balanced (route decorators):

```python
from runpod_flash import Endpoint

api = Endpoint(name="my-api", cpu="cpu3c-1-2", workers=(1, 3))


@api.post("/process")
async def process(data: dict) -> dict:
    return {"result": data}


@api.get("/health")
async def health() -> dict:
    return {"status": "ok"}
```

GPU vs CPU is a parameter, not a class choice:
| Config | Syntax | Use Case |
|---|---|---|
| GPU endpoint | `@Endpoint(name=..., gpu=GpuType.NVIDIA_GEFORCE_RTX_4090)` | GPU workers |
| CPU endpoint | `@Endpoint(name=..., cpu="cpu3c-1-2")` | CPU workers |
| GPU LB | `api = Endpoint(name=..., gpu=GpuType.NVIDIA_GEFORCE_RTX_4090); @api.post(...)` | GPU LB endpoints |
| CPU LB | `api = Endpoint(name=..., cpu="cpu3c-1-2"); @api.post(...)` | CPU LB endpoints |
Pipeline files import functions from other workers and chain them:

```python
from cpu_worker import preprocess_text
from gpu_worker import gpu_inference
from runpod_flash import Endpoint

pipeline = Endpoint(name="pipeline", cpu="cpu3c-1-2", workers=(1, 3))


@pipeline.post("/classify")
async def classify(text: str) -> dict:
    result = await preprocess_text({"text": text})
    return await gpu_inference(result)
```

All examples import from `runpod_flash`. Import frequency by symbol:
| Symbol | Files Using It | Breakage Risk |
|---|---|---|
| `Endpoint` | 18 | ALL examples break |
| `GpuType` | 7 | GPU config breaks |
| `CpuInstanceType` | 4 | CPU config breaks |
| `NetworkVolume` | 2 | Volume examples break |
| `ServerlessScalerType` | 1 | Scaling example breaks |
- flash (`runpod_flash` package) -- all files import from it. Any breaking change to the `Endpoint` constructor, enum values, or route decorator signatures breaks examples at import time.
- None. This is a leaf repo (documentation/examples only).
- `Endpoint(name=..., gpu=..., cpu=..., workers=...)` constructor -- parameter rename/removal breaks all files
- `.get()`/`.post()`/`.put()`/`.delete()`/`.patch()` route decorator signatures
- `GpuGroup`, `GpuType`, `CpuInstanceType` enum values -- value removals break GPU/CPU configs
- `NetworkVolume` constructor -- field changes break volume examples
flash-examples --> flash (runpod_flash) --> runpod-python (runpod)
- No automated tests -- changes are caught only at import time or by `flash run`
- No CI that validates examples against the current flash version
- Python version: inherits from flash (3.10+)
```
uv venv && source .venv/bin/activate
uv sync --all-groups
```

```
flash run              # Start local dev server (localhost:8888)
# Visit http://localhost:8888/docs for interactive API docs
python gpu_worker.py   # Test a single worker directly (if __name__ == "__main__" block)
```

```
make quality-check   # REQUIRED BEFORE ALL COMMITS
make lint            # Ruff linter
make format          # Ruff formatter
make format-check    # Check formatting
```

```
flash build                    # Package build artifacts
flash deploy                   # Build + upload + provision endpoints
flash deploy --preview         # Local Docker Compose preview
flash build --use-local-flash  # Use local flash library instead of PyPI
```

- No test infrastructure at all. No `conftest.py`, no `tests/` directory, no pytest config. Only `if __name__ == "__main__"` blocks for manual testing. Any flash API change is caught only at import time.
- Broad `except Exception` catches in 4 files -- swallows specific errors and makes debugging harder
- Duplicated GPU inference logic in `04_scaling_performance` -- 3 near-identical functions that should be extracted
- No CI validation that examples work against the current flash version
- Duplicated `speakers`/`languages` lists in `02_ml_inference/01_text_to_speech`
- Missing input validation in some workers (arbitrary dicts accepted without a schema)
No formal test infrastructure exists. Each worker has an optional `if __name__ == "__main__"` block for manual execution.
- 100% uncovered -- no test framework, no conftest, no pytest config
- No smoke tests that verify examples import successfully
- No integration tests that run `flash run` against the examples
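The optional `if __name__ == "__main__"` block mentioned above typically looks like the following sketch; `my_worker` is a stand-in name, and a real worker would carry an `@Endpoint` decorator and do actual inference:

```python
# Sketch of the manual-test tail a worker file can carry.
import asyncio


async def my_worker(payload: dict) -> dict:
    # Placeholder logic; a real worker imports its heavy libs here.
    text = payload.get("text", "")
    return {"status": "success", "echo": text.upper()}


if __name__ == "__main__":
    # `python my_worker.py` runs the function once without a dev server.
    print(asyncio.run(my_worker({"text": "hello"})))
```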
To test manually:

```
cd 01_getting_started/01_hello_world
flash run   # Starts dev server, auto-discovers workers
# Use http://localhost:8888/docs to invoke endpoints
```

- Add `tests/test_imports.py` that imports every worker file (catches `Endpoint` signature drift)
- Add `tests/test_configs.py` that validates all resource configs construct without error
- Add a CI job that runs `flash run --check` (dry-run mode) against each example category
- Accessing external scope in @Endpoint functions -- only local variables, parameters, and internal imports work. The function body is serialized and sent to a remote worker.
- Module-level imports of heavy libraries -- import torch, numpy, transformers, etc. inside the function body, not at module level.
- Missing `if __name__ == "__main__"` test block -- each worker should be independently testable.
- Mutable default arguments -- use `None` and initialize inside the function body.
- Importing from `flash` instead of `runpod_flash` -- the package name is `runpod_flash`.
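The mutable-default-argument gotcha is easy to reproduce in plain Python; the function names here are illustrative:

```python
# Demonstrates why mutable defaults leak state between calls.
def bad_append(item, bucket=[]):  # the default list is created once and shared
    bucket.append(item)
    return bucket


def good_append(item, bucket=None):  # fresh list on every call
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


bad_append("a")
print(bad_append("b"))   # ['a', 'b'] -- state leaked from the first call
good_append("a")
print(good_append("b"))  # ['b'] -- calls stay independent
```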
Last analyzed: 2026-02-22