Fix BO mode wired as 'none' in non-wavelet SURE samplers #486
Open
pycms-nube wants to merge 106 commits into
Add diffusers support for UNet and LoRA. Claude can now also handle complex usage.
And we are now ready for some fun features.
The MPS backend has some updates now, so let's bump it.
This commit fixes all NoneType and other issues caused by a weird loading problem, and adds typing. The diffusers pipeline can now compile with LoRA. This commit also introduces the first version of Forge auto offloading (maximum-fit offloading by layers) using the diffusers automatic device-map inference.
The diffusers pipeline can now load models through diffusers. This allows advanced model support and better model loading. Assisted by Claude Opus.
This probably fixes a problem in the original DoRA support, though I am not sure. Claude says the original is wrong, but at this point both the reForge pipeline and the diffusers pipeline support it. Restore the original multi-chunk CLIP support for diffusers.
This commit introduces a fix related to not using the pipeline. It also creates a stub for future optimization.
This should solve the problem of SDP being inefficient for non-technical users, and it makes it easier to check whether we hit the performance maximum.
On macOS, memory footprint is critical. This commit uses autocast so we can run in bf16 or, in the extreme case, fp16.
This commit supports mapping against diffusers schedulers and adds a diffusers scheduler. Expand the LRU cache to high-cost diffusers functions.
…webui-reForge into reforge_upstream
Reforge upstream
Fix noise scale on EDM for Euler A2. Also add DC-sampler, SURE, a trajectory sampler. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This fix introduces some performance improvements around the SURE sampler. A proper preheat is added, along with variants of DPM++ 2M/2S a. I suggest you use DPM++ 2S a SURE: it is the best one, matching the paper without introducing weird artifacts. For SURE, it's sure now.
Reimplement SURE after noticing that noise is added back later. We can now also use model metadata to infer what is right.
This fix allows SURE to actually run under the sampling assumption. The problem is that sampling needs high sigma while SURE does not like it. We also injected noise incorrectly, which blew something up. Sure, this is a SURE fix ;)
Though mostly you should not change this... Yeah, why not?
Replace the fixed sigma-scaled SGD step in _sure_correct_x0 with an optional Adam/AdamW optimiser whose state (m, v, t) persists across diffusion steps, mapping each denoising step to one optimizer iteration.
Adam normalises each pixel's gradient by its historical variance so that alpha becomes a true scale-invariant learning rate rather than a raw gradient magnitude knob. The sigma-scaling heuristic (alpha / (1 + sigma_t)) is therefore dropped when Adam is active: it is redundant and would double-suppress early steps that Adam already handles via its second moment.
AdamW adds decoupled weight decay applied directly to x0_hat, pulling the corrected estimate toward zero each step without contaminating the moment estimates.
Four new UI options are exposed under Settings:
- sure_adam_mode: none / adam / adamw (Radio)
- sure_adam_beta1: first-moment decay (Slider, default 0.9)
- sure_adam_beta2: second-moment decay (Slider, default 0.999)
- sure_adam_wd: AdamW weight decay (Slider, default 0.01)
All eight SURE samplers receive the new parameters via the existing signature-inspection path in sd_samplers_common.py.
Diagnostic logging now reports eff_grad_rms and adam_ratio (eff/raw) so the per-pixel adaptation can be verified at runtime. Empirically the adam_ratio and eff_grad_rms trajectories are nearly identical across a 5× range of alpha values, confirming that Adam has absorbed the scale sensitivity that previously required manual tuning.
Signed-off-by: PYCMS <zenghongyi2004@gmail.com>
Co-developed-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Claude Sonnet 4.6 <noreply@anthropic.com>
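The persistent-state update described in this commit can be illustrated with a minimal pure-Python sketch. It is not the code in `_sure_correct_x0`: the function name `adam_correct`, the flat-list representation, the `eps` constant, and the choice to scale the decoupled decay by `alpha` are all assumptions made for illustration.

```python
import math


def adam_correct(x0_hat, grad, state, alpha=0.1, mode="adam",
                 beta1=0.9, beta2=0.999, weight_decay=0.01, eps=1e-8):
    """Apply one Adam/AdamW step to a flat list of values and return the result.

    `state` is a plain dict that the caller keeps alive across diffusion steps,
    so each denoising step counts as one optimizer iteration (hypothetical layout).
    """
    if "t" not in state:
        state["m"] = [0.0] * len(x0_hat)
        state["v"] = [0.0] * len(x0_hat)
        state["t"] = 0
    state["t"] += 1
    t = state["t"]
    corrected = []
    for i, (x, g) in enumerate(zip(x0_hat, grad)):
        # Exponential moving averages of the gradient and its square.
        m = beta1 * state["m"][i] + (1.0 - beta1) * g
        v = beta2 * state["v"][i] + (1.0 - beta2) * g * g
        state["m"][i], state["v"][i] = m, v
        # Bias correction for the zero-initialised moments.
        m_hat = m / (1.0 - beta1 ** t)
        v_hat = v / (1.0 - beta2 ** t)
        if mode == "adamw":
            # Decoupled decay: shrink x directly, leaving m and v untouched.
            x = x * (1.0 - alpha * weight_decay)
        corrected.append(x - alpha * m_hat / (math.sqrt(v_hat) + eps))
    return corrected
```

On the first iteration the bias-corrected step is roughly `alpha * sign(g)` for every element, whatever the gradient magnitude, which is the scale invariance the commit message refers to.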
Now we can track what can be upgraded and what needs an urgent upgrade.
Bumps [actions/setup-python](https://github.com/actions/setup-python) from 5 to 6. - [Release notes](https://github.com/actions/setup-python/releases) - [Commits](actions/setup-python@v5...v6) --- updated-dependencies: - dependency-name: actions/setup-python dependency-version: '6' dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com>
Bumps [actions/dependency-review-action](https://github.com/actions/dependency-review-action) from 4 to 5. - [Release notes](https://github.com/actions/dependency-review-action/releases) - [Commits](actions/dependency-review-action@v4...v5) --- updated-dependencies: - dependency-name: actions/dependency-review-action dependency-version: '5' dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com>
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3 to 4. - [Release notes](https://github.com/github/codeql-action/releases) - [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md) - [Commits](github/codeql-action@v3...v4) --- updated-dependencies: - dependency-name: github/codeql-action dependency-version: '4' dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com>
Bumps [actions/cache](https://github.com/actions/cache) from 4 to 5. - [Release notes](https://github.com/actions/cache/releases) - [Changelog](https://github.com/actions/cache/blob/main/RELEASES.md) - [Commits](actions/cache@v4...v5) --- updated-dependencies: - dependency-name: actions/cache dependency-version: '5' dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com>
Bumps [actions/setup-node](https://github.com/actions/setup-node) from 4 to 6. - [Release notes](https://github.com/actions/setup-node/releases) - [Commits](actions/setup-node@v4...v6) --- updated-dependencies: - dependency-name: actions/setup-node dependency-version: '6' dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com>
1. open_clip: GitHub archive ZIP for commit bb6e834 contains duplicate entries with conflicting content for ViT-g-14.json, which uv (stricter than pip) rejects with exit code 2. Switch the default OPENCLIP_PACKAGE to the PyPI wheel `open_clip_torch>=2.24.0`, which installs cleanly via uv and provides the same `open_clip` import. 2. SDXL CPU/MPS tensor mismatch in get_learned_conditioning: the spatial conditioning tensors (original_size_as_tuple, crop_coords_top_left, target_size_as_tuple, aesthetic_score) were created on `clip.patcher.model.device`, which is always the *offload* device (CPU) in ldm_patched's lazy-load scheme — even after move_clip_to_gpu() has dispatched the model to MPS. The CLIP text-encoder outputs (both the standard torch path and the MLX hook path) land on MPS because tokens are always created on `devices.device`. The subsequent torch.cat of the pooled CLIP-G vector with the CPU spatial-cond embeddings therefore raised "Passed CPU tensor to MPS op". Fix: create all spatial conditioning tensors on `devices.device` (the authoritative compute device, always MPS on Apple Silicon), which matches where the text-encoder embeddings land regardless of VRAM state, separate-TE usage, or whether the MLX pipeline is active. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Bumps [open-clip-torch](https://github.com/mlfoundations/open_clip) from 2.30.0 to 3.3.0. - [Release notes](https://github.com/mlfoundations/open_clip/releases) - [Changelog](https://github.com/mlfoundations/open_clip/blob/main/HISTORY.md) - [Commits](mlfoundations/open_clip@v2.30.0...v3.3.0) --- updated-dependencies: - dependency-name: open-clip-torch dependency-version: 3.3.0 dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com>
Bumps [protobuf](https://github.com/protocolbuffers/protobuf) from 3.20.0 to 7.35.0. - [Release notes](https://github.com/protocolbuffers/protobuf/releases) - [Commits](https://github.com/protocolbuffers/protobuf/commits) --- updated-dependencies: - dependency-name: protobuf dependency-version: 7.35.0 dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com>
Updates the requirements on [pytest-cov](https://github.com/pytest-dev/pytest-cov) to permit the latest version. - [Changelog](https://github.com/pytest-dev/pytest-cov/blob/master/CHANGELOG.rst) - [Commits](pytest-dev/pytest-cov@v4.0.0...v7.1.0) --- updated-dependencies: - dependency-name: pytest-cov dependency-version: 7.1.0 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com>
- Bump CI to Python 3.14 (allow-prereleases) in run_tests and linter - Update README: Python 3.14 is the target, brew install python@3.14 - Incorporate all pending dependabot updates: pillow 10→12.2.0, accelerate 0.21→1.13.0, psutil 5.9→7.2.2, pytorch-lightning 1.9→2.6.5, albumentations 1.4→2.0.8, pytest 7.3→9.0, pydantic custom-wheel→2.13.4, diffusers 0.32→0.37.1, huggingface-hub 0.25→0.36.2, fastapi 0.94→0.136.3, starlette pinned 1.2.1, timm 1.0.17→1.0.27 - Fix Pydantic v2 compat: Optional fields, ConfigDict, model_fields, model_dump; remove _OptionsModelProxy, use lazy _build_options_model() - Add debug_image_probe.py instrumentation module Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The debug endpoint /debug/last-output served arbitrary local files as inline base64 and the /file= override bypassed Gradio's security gates. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
allow-licenses and deny-licenses are mutually exclusive in actions/dependency-review-action@v5. Keep only deny-licenses. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Fixes two high-severity CVEs in diffusers 0.37.1: - CVE-2026-44513 / GHSA-98h9-4798-4q5v: trust_remote_code bypass via custom_pipeline and local custom components - CVE-2026-45804 / GHSA-7wx4-6vff-v64p: TOCTOU race condition enabling silent RCE in DiffusionPipeline.from_pretrained() Both are patched in diffusers 0.38.0, which requires safetensors>=0.8.0-rc.0. Accepting the pre-release safetensors intentionally — security takes priority over stability here. Added inline warnings in requirements_versions.txt to document the trade-off and flag for revisit once 0.8.0 stable ships. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
zoom.js: - Remove trailing whitespace on blank lines (8 occurrences) - Add space after `if` keyword (lines 520, 818) extraNetworks.js: - Remove trailing whitespace on blank lines (7 occurrences) - Remove duplicate `extraNetworksSearchButton` declaration (no-redeclare) - Add missing `extraNetworksTreeProcessFileClick` stub (no-undef) - Add trailing newline (eol-last) hires_fix.js: - Remove trailing whitespace (line 36) - Add trailing newline (eol-last) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
pyproject.toml: - Bump target-version py39 → py310 (match statement support) - Move exclude to [tool.ruff] (was silently ignored under [tool.ruff.lint]) - Add per-file-ignores for upstream k-diff, mlx optional imports, type stubs, star-import preprocessor_compiled, late-import patterns - Add fastapi.Body to extend-immutable-calls Code fixes across diff_pipeline/, cst-auto/, mlx_pipeline/, scripts/, extensions-builtin/ (Lora, forge_legacy_preprocessors, sd_forge_controlnet, sd_forge_ipadapter, sd_forge_latent_modifier, sd_forge_multidiffusion, sd_forge_neveroom, sd_webui_random_resolutions, reForge-*, mahiro_reforge): - E741: rename ambiguous `l` variables - E701: split multi-statement if-colon lines - B006: mutable default args → None + guard - F841: remove unused variable assignments - E722: bare except → except Exception - B007: unused loop vars → _ - B904: raise in except → raise ... from err - B011: assert False → raise AssertionError - C416: unnecessary dict/list comprehensions → dict()/list() - B005: multi-char strip fix - F821: add missing imports (typing.Any, traceback, devices) - W291/W293: trailing whitespace Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
blendmodes 2024.1.1 caps pillow<11 which conflicts with pillow==12.2.0. blendmodes==2025 requires only pillow>=10.4.0 with no upper bound. CI: increase wait-for-it timeout 20s→60s to give the server more time to boot on slow runners. Make kill-server step non-fatal with || true so a connection-refused (server never started) does not fail the job. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
numpy 2.0+ is required by both scikit-image==0.25.1 and blendmodes==2025. numpy 1.26.4 was already incompatible with our current scikit-image pin; this resolves the full dependency chain. torch 2.11.0 supports numpy 2.x. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
modules/launch_utils.py:
- Add _select_requirements_file(): auto-selects requirements_versions.txt (Python 3.11–3.13) or requirements_versions_py314.txt (Python 3.14+) based on sys.version_info; overridable via REQS_FILE env var
- Update check_python_version(): supported range is now 3.11–3.14; drop 3.10 (PyWavelets>=1.9.0 requires >=3.11)
requirements_versions.txt (Python 3.11–3.13):
- Revert pillow 12.2.0 → 10.4.0 (gradio 3.41.2 caps pillow<11)
- Revert numpy 2.2.6 → 1.26.4 (gradio 3.41.2 caps numpy<2)
- Revert blendmodes 2025 → 2024.1.1 (blendmodes 2025 needs numpy>=2)
- Bump scikit-image 0.25.1 → 0.25.2, kornia 0.8.0 → 0.8.2
- Retain security pins: diffusers 0.38.0, safetensors 0.8.0rc0
requirements_versions_py314.txt (Python 3.14+, new file):
- Same baseline with 3.14-specific uv resolutions: scikit-image 0.26.0, kornia 0.8.3, PyWavelets 1.9.0
- Marks where gradio/numpy/pillow constraints can be lifted when gradio is eventually upgraded on this branch
pyproject.toml: bump ruff target-version py310 → py311
README: update minimum Python 3.10+ → 3.11+
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
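The selection rule for the requirements file can be sketched as follows. This is an illustrative stand-in for `_select_requirements_file()`, not its actual body; the function name without the leading underscore and the injectable `version_info` and `environ` parameters are assumptions added so the logic can be exercised without changing the interpreter or the process environment.

```python
import os
import sys


def select_requirements_file(version_info=None, environ=None):
    """Pick a requirements file from the Python version, unless REQS_FILE overrides it."""
    version_info = sys.version_info if version_info is None else version_info
    environ = os.environ if environ is None else environ

    # An explicit override always wins.
    override = environ.get("REQS_FILE")
    if override:
        return override

    # Python 3.14+ gets its own pin set; 3.11 to 3.13 share the default file.
    if tuple(version_info[:2]) >= (3, 14):
        return "requirements_versions_py314.txt"
    return "requirements_versions.txt"
```

For example, `select_requirements_file((3, 12, 0), {})` resolves to the default file, while any non-empty `REQS_FILE` value is returned unchanged.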
webui.py: inject pytorch_lightning.utilities.distributed compatibility shim before any module that triggers ldm imports. pytorch-lightning 2.x removed that module; ldm/ddpm.py and generative-models still import rank_zero_only from the old path. Falls back through pytorch_lightning.utilities.rank_zero → lightning_fabric.utilities.rank_zero so it works across 2.x versions. requirements_versions*.txt: add audioop-lts to both requirements files. audioop was removed in Python 3.13 (PEP 594); pydub (pulled in by gradio) imports it and crashes on 3.13+ without this backport. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
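The compatibility shim can be pictured as registering a stand-in module under the removed import path. The sketch below is an assumption about how such a shim might look, not the code in webui.py; `install_rank_zero_shim` is a hypothetical name, and the module paths are the ones listed in the commit message.

```python
import importlib
import sys
import types


def install_rank_zero_shim(
    legacy_name="pytorch_lightning.utilities.distributed",
    fallbacks=("pytorch_lightning.utilities.rank_zero",
               "lightning_fabric.utilities.rank_zero"),
):
    """Expose rank_zero_only under the legacy module path if that path is gone.

    Returns True when the legacy path is importable afterwards, False otherwise.
    """
    if legacy_name in sys.modules:
        return True
    try:
        importlib.import_module(legacy_name)
        return True  # The old module still exists; nothing to do.
    except ImportError:
        pass

    for name in fallbacks:
        try:
            source = importlib.import_module(name)
        except ImportError:
            continue
        if hasattr(source, "rank_zero_only"):
            # Register a small stand-in module under the old dotted name.
            shim = types.ModuleType(legacy_name)
            shim.rank_zero_only = source.rank_zero_only
            sys.modules[legacy_name] = shim
            return True
    return False
```

Because the import system consults `sys.modules` first, a later `from pytorch_lightning.utilities.distributed import rank_zero_only` would resolve to the stand-in, provided the shim is installed before the importing module loads.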
preprocessor_compiled.py: add explicit 'import functools'. The module was previously available via star-import re-export from preprocessor.py, but the ruff lint pass removed the unused import there, silently breaking the re-export. functools.partial is used directly in this file so the import must be explicit. launch_utils.py: add _apply_repository_patches() called after all git_clone steps. First patch fixes CPU/MPS tensor mismatch in repositories/generative-models/sgm/modules/encoders/modules.py: emb.to(output[out_key].device) before torch.cat so all embedder outputs are on the same device on Apple Silicon MPS. repositories/ is gitignored so the patch is re-applied idempotently on every fresh clone instead of being committed directly. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When the MLX pipeline fails to load on an MPS-capable machine, emit a visible boxed warning so the user knows they are running on the slower MPS backend and how to install MLX to fix it. Previously the failure was silently swallowed at DEBUG log level, leaving users confused about why generation was slow or hitting device-placement errors (like the CPU/MPS tensor mismatch in generative-models that requires the post-clone patch). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Previously the warning only fired in the except block, but maybe_activate() returns False silently (no exception) when mlx is not installed or Metal is unavailable — the most common case. Now check the return value: if is_apple_silicon() is True but maybe_activate() returned False, set the reason string and print the warning. Exception path still sets the reason from the error message, keeping both cases covered. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Addresses two "Uncontrolled data used in path expression" findings reported by CodeQL in modules/ui_extensions.py.
save_config_state (line ~70):
- os.path.basename() alone only prevents directory traversal but leaves characters and length unconstrained; an attacker with UI access could write semi-controlled filenames into config_states_dir.
- Added: re.sub whitelist (word chars / spaces / hyphens / dots only), 64-char length cap, and a realpath() containment check that rejects the path if it resolves outside config_states_dir after all steps.
apply_and_restart (line ~29-34):
- JSON list elements were only checked to be a list, not to be strings. Non-string elements could reach extension-name comparisons that derive filesystem paths.
- Added: assert all(isinstance(x, str) for x in ...) on both disable_list and update_list after json.loads.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
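The three sanitisation steps listed for save_config_state (basename, character whitelist, length cap) plus the containment check can be sketched as below. The helper names `sanitise_config_name` and `is_inside` are hypothetical, the exact regular expression is an assumption based on the description, and `os.path.commonpath` is used here as one way to express containment rather than as the method the commit uses.

```python
import os
import re


def sanitise_config_name(raw_name, max_len=64):
    """Reduce a user-supplied name to word chars, spaces, hyphens and dots."""
    name = os.path.basename(raw_name)          # drop any directory components
    name = re.sub(r"[^\w \-.]", "", name)      # whitelist of allowed characters
    return name[:max_len]                      # enforce the length cap


def is_inside(base_dir, candidate):
    """Return True if `candidate` resolves to a location under `base_dir`."""
    base = os.path.realpath(base_dir)
    target = os.path.realpath(candidate)
    try:
        return os.path.commonpath([base, target]) == base
    except ValueError:
        # Raised, for example, when the two paths are on different drives.
        return False
```

A name such as `../../etc/pa$$wd` is reduced to `pawd`, and a path that climbs out of the base directory with `..` fails the containment check.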
Previous fix sanitised 'name' but still used it in os.path.join(),
which CodeQL's taint analysis correctly flags — any user-derived byte
in a path expression is a finding regardless of sanitisation guards.
config_states.list_config_states() reads the display name from JSON
content (cs.get('name')), NOT from the filename, so the filename
needs zero user input.
New approach:
- Display name: sanitised (basename + whitelist + 64-char cap) and
stored only inside the JSON body as 'name'.
- File path: timestamp + uuid4 hex suffix — entirely server-generated,
no taint from user input whatsoever.
This closes the CodeQL 'Uncontrolled data used in path expression'
finding at line 70 definitively.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
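The split between a server-generated file path and a user-visible display name stored in the JSON body can be sketched as follows. The function name `make_config_state`, the timestamp format, and the exact filename layout are assumptions for illustration; only the timestamp-plus-uuid4-hex idea and the 'name' key come from the commit message.

```python
import os
import time
import uuid


def make_config_state(config_states_dir, display_name, payload):
    """Return (path, body) where the path contains no user-derived bytes."""
    # Filename: timestamp plus a random hex suffix, entirely server-generated.
    stamp = time.strftime("%Y_%m_%d-%H_%M_%S")
    filename = f"{stamp}_{uuid.uuid4().hex}.json"
    path = os.path.join(config_states_dir, filename)

    # The (already sanitised) display name lives only inside the JSON body.
    body = dict(payload)
    body["name"] = display_name
    return path, body
```

Since listing code reads the display name from the `name` field rather than from the filename, the user still sees the label they chose.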
CodeQL flags 'Uncontrolled data used in path expression' at line 29 because Path(filename) is constructed directly from user-supplied data. check_tmp_file is a security gatekeeper that checks whether a user-requested file lives inside an allowed temp dir, so the filename must be handled — but Path(user_input) is unnecessary. Fix: canonicalise both sides to plain strings via os.path.realpath + os.path.abspath, then do a string startswith() containment check. This is functionally equivalent (both resolve symlinks and check whether the file is within an allowed directory) but avoids constructing any Path object from user-controlled data. Explicit try/except rejects malformed inputs cleanly. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
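A string-based containment check of this kind might look like the sketch below. It is not the body of check_tmp_file; `is_in_allowed_dirs` is a hypothetical name. One detail worth noting is that a bare `startswith()` on the directory string would also accept sibling directories that share a prefix, so this sketch appends a path separator before comparing, which is an addition of this example.

```python
import os


def is_in_allowed_dirs(filename, allowed_dirs):
    """Return True if `filename` resolves to a path inside one of `allowed_dirs`."""
    try:
        target = os.path.realpath(os.path.abspath(filename))
    except (TypeError, ValueError):
        # Malformed input (wrong type, embedded NUL, ...) is rejected outright.
        return False

    for allowed in allowed_dirs:
        root = os.path.realpath(os.path.abspath(allowed))
        # Compare against "root/" so that "/tmp/app_evil" does not match "/tmp/app".
        prefix = root if root.endswith(os.sep) else root + os.sep
        if target.startswith(prefix):
            return True
    return False
```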
Introduces `sample_sure_restart` ("Restart SURE"), a Heun-based sampler
that uses the SURE residual ratio as an adaptive restart signal instead
of triggering restarts blindly at predetermined sigma thresholds.
The SURE residual ratio is computed for free at every step:
r_ratio = sqrt(mean((x − Dθ(x, σ))²)) / σ
For a healthy trajectory r_ratio ≈ 1 (denoiser residual ≈ injected
noise). When the trajectory drifts off-manifold r_ratio rises above
`sure_threshold` (default 1.5) and the sampler:
1. Re-injects noise to reach σ_restart = min(σ_cur × scale, σ_max)
2. Runs `restart_steps` Heun sub-steps on a Karras schedule back to σ_cur
3. Refreshes the denoised estimate and continues the main step
Mathematical backing: the stop-gradient SURE approximation gives
SURE_approx/(n·σ²) = r_ratio² − 1
so r_ratio > T ↔ SURE_approx/(n·σ²) > T²−1. The Lean theorem
`cond4_step_closer` (SURE_verification.lean) confirms that positive
SURE_approx signals correctable denoising error, making the restart
productive. No extra model calls are incurred on steps where no
restart is triggered (2 NFE/step, same as standard Heun).
Parameters: sure_threshold=1.5, restart_max_times=2, restart_steps=9,
restart_scale=2.0, sure_preheat_steps=3.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
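The restart signal from this commit message can be written out as a small pure-Python sketch. It only covers the ratio, the normalised SURE approximation, and the choice of the restart sigma; the Heun sub-steps and noise re-injection are omitted. The function names and the flat-list tensors are assumptions for illustration, while the default values are the ones quoted above.

```python
import math


def sure_residual_ratio(x, denoised, sigma):
    """r_ratio = sqrt(mean((x - D(x, sigma))^2)) / sigma."""
    mse = sum((xi - di) ** 2 for xi, di in zip(x, denoised)) / len(x)
    return math.sqrt(mse) / sigma


def normalised_sure_approx(r_ratio):
    """SURE_approx / (n * sigma^2) = r_ratio^2 - 1."""
    return r_ratio * r_ratio - 1.0


def restart_sigma(r_ratio, sigma_cur, sigma_max,
                  sure_threshold=1.5, restart_scale=2.0):
    """Return the sigma to restart from, or None if the trajectory looks healthy."""
    if r_ratio <= sure_threshold:
        return None
    return min(sigma_cur * restart_scale, sigma_max)
```

With the default threshold of 1.5, a restart corresponds to a normalised SURE approximation above 1.5² − 1 = 1.25.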
What changed
All 8 non-wavelet SURE samplers (`sample_sure`, `sample_dpmpp_2s_a_sure`, `sample_dpmpp_2s_a_sure_adaptive`, `sample_dpmpp_2m_sure`, `sample_dpmpp_2m_sde_sure`, `sample_dpmpp_3m_sde_sure`, `sample_dpmpp_2m_sde_sure_adaptive`, `sample_sure_adaptive`) were initialising `_adam_state` in a way that allocated a real dict when `sure_adam_mode='bo'` and passed it as a non-None `adam_state` to `_sure_correct_x0`.
Why it was wrong
`_sure_correct_x0` only activates Adam when `adam_mode in ('adam', 'adamw')`. When `adam_mode='bo'`, the Adam block is skipped and the function falls through to plain SGD, so the non-None dict is silently ignored. BO mode in the non-wavelet path therefore ran as plain SGD, wasting an allocation on each call and masking the mismatch.
The wavelet sampler `sample_sure_wavelet` already had the correct check (`sure_adam_mode in ('adam', 'adamw')`), so `_adam_state=None` for BO mode.
Fix
Changed all 8 non-wavelet samplers to match the wavelet pattern, so `_adam_state` is only allocated when `sure_adam_mode in ('adam', 'adamw')`.
Test plan
- `sure_adam_mode='bo'`: confirm no runtime errors
- `sure_adam_mode='adam'` and `'adamw'` still initialise `_adam_state` correctly (non-None)
- `sure_adam_mode='none'` keeps `_adam_state=None`
🤖 Generated with Claude Code
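The wavelet-style check that the fix applies to every sampler, together with the three cases in the test plan, can be sketched as below. The helper `init_adam_state` and the dict layout are hypothetical; the PR itself only states the condition `sure_adam_mode in ('adam', 'adamw')`.

```python
def init_adam_state(sure_adam_mode):
    """Allocate optimizer state only for modes that _sure_correct_x0 treats as Adam."""
    if sure_adam_mode in ("adam", "adamw"):
        # Hypothetical layout: moments are filled in lazily on the first step.
        return {"m": None, "v": None, "t": 0}
    # 'none', 'bo' and any other mode get no Adam state at all.
    return None


if __name__ == "__main__":
    for mode in ("none", "bo", "adam", "adamw"):
        print(mode, init_adam_state(mode) is not None)
```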