Conversation
There was a problem hiding this comment.
Pull request overview
This PR consolidates all Flash local state under the .flash/ directory, eliminating the previous split paradigm where some state was stored in .runpod/ and other state in .flash/. This is a clean refactoring that improves consistency and simplifies the directory structure for the project.
Changes:
- Updated all source code references from
.runpod/to.flash/for resource state files and container archives - Removed
.runpod/from ignore patterns across all configuration files and templates - Renamed test fixtures and updated test paths to reflect the new directory structure
- Updated all documentation references to use
.flash/resources.pklinstead of.runpod/resources.pkl
Reviewed changes
Copilot reviewed 13 out of 14 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
src/runpod_flash/core/resources/resource_manager.py |
Changed RUNPOD_FLASH_DIR constant from .runpod to .flash |
src/runpod_flash/cli/commands/run.py |
Updated _RESOURCE_STATE_FILE path from .runpod/resources.pkl to .flash/resources.pkl |
src/runpod_flash/cli/commands/preview.py |
Changed CONTAINER_ARCHIVE_PATH from /root/.runpod/artifact.tar.gz to /root/.flash/artifact.tar.gz |
src/runpod_flash/cli/utils/ignore.py |
Removed .runpod/ from the always_ignore patterns list |
src/runpod_flash/cli/commands/undeploy.py |
Updated docstring reference from .runpod/resources.pkl to .flash/resources.pkl |
.gitignore |
Consolidated ignore patterns to only use .flash/ instead of separate .runpod/ and .flash/logs/ entries |
src/runpod_flash/cli/utils/skeleton_template/.gitignore |
Removed .runpod/ from template ignore patterns |
src/runpod_flash/cli/utils/skeleton_template/.flashignore |
Changed .runpod/ to .flash/ in flashignore patterns |
tests/conftest.py |
Renamed worker_runpod_dir fixture to worker_flash_dir and updated all path references |
tests/unit/resources/test_resource_manager.py |
Updated mock resource file path from .runpod/resources.pkl to .flash/resources.pkl |
tests/unit/cli/commands/build_utils/test_scanner.py |
Removed test for .runpod directory exclusion (now redundant with existing .flash test) |
src/runpod_flash/cli/docs/flash-undeploy.md |
Updated all documentation references from .runpod/resources.pkl to .flash/resources.pkl |
src/runpod_flash/cli/docs/flash-run.md |
Updated documentation references from .runpod/resources.pkl to .flash/resources.pkl |
docs/Flash_Deploy_Guide.md |
Updated all technical documentation references from .runpod/resources.pkl to .flash/resources.pkl |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
93543f4 to
3fa85d8
Compare
QA ReportStatus: PASS Targeted Test Results
Full Suite Results
All 12 failures are in files NOT touched by this PR and reproduce on the base branch. CI passes across all 5 Python versions (3.10-3.14). Migration Completeness
PR Diff AnalysisFiles changed: 14 (7 source, 3 docs, 2 templates, 2 test files) What changed:
Backward compatibility note: There is no migration logic for users who have an existing
This is acceptable for a development tool where re-deploy is cheap, but worth noting in release notes. RecommendationMERGE — Clean migration with no regressions. All directory path references are consistently updated. The lack of backward compatibility for Generated by flash-qa agent |
QA Report — PR #221: Migrate
|
| Check | Result |
|---|---|
ruff format --check |
PASS (1290 files formatted) |
ruff check |
PASS (all checks passed) |
Test Results
| Suite | Passed | Failed | Skipped | Result |
|---|---|---|---|---|
Full (tests/ -n 4) |
1239 | 14 | 1 | See below |
| Changed files only | 60 | 0 | 0 | PASS |
All 14 failures are pre-existing (not introduced by this PR):
| Test File | Failures | Status |
|---|---|---|
test_class_execution_integration.py |
2 | P4 known — _class_type / _constructor_args AttributeError |
test_remote_concurrency.py |
10 | P4 known — asyncio.gather unhashable/not-awaitable TypeError |
test_file_locking.py |
2 | Pre-existing flaky — platform-dependent file lock timing |
Zero new failures introduced by this PR.
Flaky Test Check
test_resource_manager.py — 3/3 runs passed (19/19 tests each). No flakiness detected.
Coverage
| Metric | Value |
|---|---|
| Overall coverage | 72.27% |
| Threshold | 65% |
| Status | PASS |
No coverage regression from this PR.
PR Diff Analysis
Completeness Checklist
| Item | Status | Notes |
|---|---|---|
All .runpod/ dir refs replaced in src/ |
PASS | Zero remaining .runpod/ path refs in source |
All .runpod/ dir refs replaced in tests/ |
PASS | Zero remaining .runpod/ path refs in tests |
All .runpod/ dir refs replaced in docs |
PASS | Flash_Deploy_Guide.md updated correctly |
.gitignore updated |
PASS | .runpod/ removed, .flash/ covers everything |
| Skeleton templates updated | PASS | Both .flashignore and .gitignore templates updated |
ignore.py always-ignore list updated |
PASS | .runpod/ entry removed (.flash/ already present) |
resource_manager.py constant updated |
PASS | RUNPOD_FLASH_DIR = Path(".flash") |
run.py state file path updated |
PASS | _RESOURCE_STATE_FILE = Path(".flash") / "resources.pkl" |
preview.py container archive path updated |
PASS | CONTAINER_ARCHIVE_PATH = "/root/.flash/artifact.tar.gz" |
undeploy.py docstring updated |
PASS | References .flash/resources.pkl |
conftest.py fixture names/paths updated |
PASS | worker_runpod_dir -> worker_flash_dir |
| Scanner test updated | PASS | test_exclude_runpod_directory removed (redundant with test_exclude_flash_directory) |
| Test resource manager paths updated | PASS | tmp_path / ".flash" / "resources.pkl" |
Findings
-
No migration/backward compatibility: The PR explicitly states "No legacy/migration handling." Users with existing
.runpod/resources.pklwill lose their cached resource state. This is acceptable for a development-time cache (resources can be re-provisioned), but could cause a one-time re-deployment for users upgrading. This should be noted in release notes. -
Container archive path change (
preview.py): Changed from/root/.runpod/artifact.tar.gzto/root/.flash/artifact.tar.gz. This is only used byflash deploy --preview(Docker Compose local preview). The flash-worker Docker images that handle production deployment do NOT reference this path — they useFLASH_*environment variables. No cross-repo impact. -
Remaining
.runpodstrings are all correct: All remaining matches are RunPod API URLs (api.runpod.ai,console.runpod.io,docs.runpod.io) or Python module imports (from runpod_flash.core.api.runpod). These are unrelated to the.runpod/directory. -
Test removed (
test_exclude_runpod_directory): This test verified that the scanner excluded.runpod/directories. Since.runpod/is no longer a project directory, this test is correctly removed. The equivalenttest_exclude_flash_directorytest remains and passes. -
No anti-patterns detected: No bare except, no print statements, no mutable defaults, no hardcoded secrets, no missing awaits.
Known Issue Check
| Known Issue | Affected? |
|---|---|
P4: test_class_execution_integration.py failures |
Pre-existing, not related to PR |
P4: test_remote_concurrency.py failures |
Pre-existing, not related to PR |
test_file_locking.py flaky tests |
Pre-existing, not related to PR |
Recommendation
APPROVE — Clean migration with complete coverage of all .runpod/ directory references. All changed tests pass consistently. No new failures, no coverage regression, no lint/format issues. The only consideration is documenting the lack of backward migration in release notes for users with existing .runpod/ state directories.
d6e172d to
af7dffc
Compare
runpod-Henrik
left a comment
There was a problem hiding this comment.
Code Review — PR #221: .runpod/ → .flash/ migration
Large refactor with significant overlap with PRs #220 and #235. Found three high-severity issues:
Bug 1 (HIGH): Authorization header sent to presigned S3 URLs — breaks upload AND download
app.py — upload_build() and download_tarball()
Both methods switched from hand-crafted requests calls (no Authorization) to get_authenticated_requests_session(), which injects Authorization: Bearer <key> as a default session header.
# OLD — no auth sent to S3 presigned URL
headers = {"User-Agent": get_user_agent(), "Content-Type": TARBALL_CONTENT_TYPE}
resp = requests.put(url, data=fh, headers=headers)
# NEW — auth header sent to S3 presigned URL
with get_authenticated_requests_session() as session:
session.headers["Content-Type"] = TARBALL_CONTENT_TYPE
resp = session.put(url, data=fh) # sends Authorization: Bearer ... to S3AWS S3 rejects requests with both query-string signature AND Authorization header → HTTP 400 InvalidArgument: "Only one auth mechanism allowed". Build uploads and artifact downloads will fail for all users.
Fix: Use a plain session (no auth) for presigned URL requests, or add include_auth=False parameter to get_authenticated_requests_session().
Bug 2 (HIGH): No migration for existing .runpod/resources.pkl — users lose all endpoint state
resource_manager.py
RUNPOD_FLASH_DIR = Path(".flash") # was Path(".runpod")No migration code. Existing .runpod/resources.pkl with tracked endpoints is silently orphaned. flash undeploy lists nothing. Endpoints keep running and billing on RunPod with no CLI management.
Fix: Add migration on first startup:
_LEGACY_STATE_FILE = Path(".runpod") / "resources.pkl"
if not RESOURCE_STATE_FILE.exists() and _LEGACY_STATE_FILE.exists():
RUNPOD_FLASH_DIR.mkdir(parents=True, exist_ok=True)
shutil.copy2(_LEGACY_STATE_FILE, RESOURCE_STATE_FILE)
log.info("Migrated state from .runpod/ to .flash/")Bug 3 (HIGH): Per-request API key propagation from LB to workers silently removed
Deleted: api_key_context.py, extract_api_key_middleware in lb_handler.py
The middleware extracted the caller's API key from Authorization: Bearer header and propagated it to downstream QB worker calls via ContextVar. Now removed — worker calls fall back to the server-process RUNPOD_API_KEY env var only.
In multi-tenant or API-key-scoped deployments, worker calls use the wrong key. When the server process has no RUNPOD_API_KEY, worker calls fail with 401 even if the client sent a valid key.
Fix: Either preserve the per-request key propagation, or document that LB now requires RUNPOD_API_KEY set server-side.
Bug 4 (MEDIUM): Container archive path renamed but Docker images not updated
preview.py
CONTAINER_ARCHIVE_PATH = "/root/.flash/artifact.tar.gz" # was /root/.runpod/If Flash worker Docker images still look for /root/.runpod/artifact.tar.gz, flash deploy --preview will silently fail — containers start but find no user code.
Fix: Verify and update Docker image entrypoints in sync with this PR.
7a4cc55 to
48b0653
Compare
Rename all references from .runpod to .flash across the codebase: - Update CLI commands, docs, and skeleton templates - Update .gitignore and .flashignore patterns - Update ResourceManager config directory path - Update test fixtures and conftest helpers - Remove obsolete scanner tests for deleted .runpod patterns
Rename all references from .runpod to .flash across the codebase: - Update CLI commands, docs, and skeleton templates - Update .gitignore and .flashignore patterns - Update ResourceManager config directory path - Update test fixtures and conftest helpers - Remove obsolete scanner tests for deleted .runpod patterns
48b0653 to
3ba9612
Compare
Summary
.flash/directory, eliminating the split.runpod/vs.flash/paradigmresources.pkl), container archive path, ignore patterns, skeleton templates, tests, and docsCloses AE-2257
Changed files
Source (4 files):
resource_manager.py,run.py,preview.py,ignore.pyConfig/templates (3 files):
.gitignore,skeleton_template/.gitignore,skeleton_template/.flashignoreTests (3 files):
conftest.py,test_resource_manager.py,test_scanner.pyDocs (4 files):
Flash_Deploy_Guide.md,flash-run.md,flash-undeploy.md,undeploy.pydocstringTest plan
make quality-checkpasses (format, lint, typecheck, tests, coverage).runpoddirectory references in codebase (verified via grep)