Gradio app for Stable Diffusion 1.5 featuring:
- txt2img and img2img pipelines with IP-Adapter
- ControlNet with Canny edge detection
- FastNegative textual inversion
- Real-ESRGAN upscaling up to 8x
- Compel prompt weighting support
- Multiple samplers with Karras scheduling
- DeepCache available for faster inference
```sh
uv venv
uv pip install -r requirements.txt
uv run app.py
```
Enter a prompt or roll the 🎲, then press **Generate**.
Positive and negative prompts are embedded by Compel. See syntax features to learn more.
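For example, here's a minimal sketch of how weighted prompts can be turned into embeddings with Compel (the checkpoint ID and prompt are illustrative):

```python
from compel import Compel
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5"
)

# Compel converts weighted prompt syntax into embeddings the pipeline accepts.
compel = Compel(tokenizer=pipe.tokenizer, text_encoder=pipe.text_encoder)

# "(term)1.3" upweights a term to 130%; a "--" suffix downweights one.
prompt_embeds = compel("a cinematic photo of a (castle)1.3 at sunset--")
image = pipe(prompt_embeds=prompt_embeds).images[0]
```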
Some models require specific parameters to get the best results, so check each model's link for more information (a loading sketch follows the list):
- cyberdelia/CyberRealistic_V5
- fluently/Fluently-v4
- Lykon/dreamshaper-8
- s6yx/ReV_Animated
- SG161222/Realistic_Vision_V5
- stable-diffusion-v1-5/stable-diffusion-v1-5
- XpucT/Deliberate_v6
- XpucT/Reliberate_v3 (default)
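For reference, any of these checkpoints can be swapped in by its Hugging Face ID; a minimal sketch with diffusers, assuming the checkpoint is published in diffusers format:

```python
import torch
from diffusers import StableDiffusionPipeline

# Shown with the base model; substitute any checkpoint ID from the list above.
pipe = StableDiffusionPipeline.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")
```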
Upscale up to 8x using Real-ESRGAN with weights from ai-forever.
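A minimal sketch of the upscaling step, assuming the `RealESRGAN` package from the ai-forever repo and its published weights:

```python
import torch
from PIL import Image
from RealESRGAN import RealESRGAN

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Scale can be 2, 4, or 8, matching the weights file that is loaded.
model = RealESRGAN(device, scale=4)
model.load_weights("weights/RealESRGAN_x4.pth", download=True)

image = Image.open("input.png").convert("RGB")
upscaled = model.predict(image)  # returns a PIL image at 4x resolution
upscaled.save("output_4x.png")
```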
The **Image-to-Image** settings allow you to provide input images for the initial latent, ControlNet, and IP-Adapter.
Initial image strength (known as denoising strength) is essentially how much the generation will differ from the input image. A value of `0` will be identical to the original, while `1` will be a completely new image. You may also want to increase the number of inference steps.

Note that denoising strength only applies to the **Initial Image** input; it doesn't affect ControlNet or IP-Adapter.
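In diffusers terms, strength maps directly onto the img2img call; a minimal sketch (file names and prompt are illustrative):

```python
import torch
from diffusers import StableDiffusionImg2ImgPipeline
from PIL import Image

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

init_image = Image.open("input.png").convert("RGB")

# strength=0.6 keeps the composition but allows significant change.
# diffusers only runs about num_inference_steps * strength actual steps,
# which is why raising the step count can help.
image = pipe(
    prompt="a watercolor painting of a harbor",
    image=init_image,
    strength=0.6,
    num_inference_steps=40,
).images[0]
```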
In ControlNet, the input image is used to get a feature map from an annotator. These are computer vision models used for tasks like edge detection and pose estimation. ControlNet models are trained to understand these feature maps. Read the docs to learn more.
Currently, the only annotator available is Canny (edge detection).
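A minimal sketch of the Canny annotator feeding a ControlNet pipeline, assuming the public lllyasviel/sd-controlnet-canny weights:

```python
import cv2
import numpy as np
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from PIL import Image

# Run the annotator: a Canny edge map is the feature map ControlNet conditions on.
image = np.array(Image.open("input.png").convert("RGB"))
edges = cv2.Canny(image, 100, 200)
control_image = Image.fromarray(np.stack([edges] * 3, axis=-1))

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

result = pipe("a futuristic city at night", image=control_image).images[0]
```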
In an image-to-image pipeline, the input image is used as the initial latent representation. With IP-Adapter, the image is processed by a separate image encoder and the encoded features are used as conditioning along with the text prompt.
For capturing faces, enable **IP-Adapter Face** to use the full-face model. Use a high-quality input image that is mostly a face.
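A minimal sketch of IP-Adapter conditioning via diffusers, assuming the weights published under h94/IP-Adapter:

```python
import torch
from diffusers import StableDiffusionPipeline
from diffusers.utils import load_image

pipe = StableDiffusionPipeline.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Standard IP-Adapter; swap weight_name for "ip-adapter-full-face_sd15.bin"
# when the Face option is enabled.
pipe.load_ip_adapter(
    "h94/IP-Adapter", subfolder="models", weight_name="ip-adapter_sd15.bin"
)
pipe.set_ip_adapter_scale(0.6)  # how strongly the image steers the output

reference = load_image("reference.png")
image = pipe("a portrait in oil paint", ip_adapter_image=reference).images[0]
```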
Add `<fast_negative>` anywhere in your negative prompt to apply the FastNegative v2 textual inversion embedding. Read An Image is Worth One Word to learn more.

💡 Wrap it in parens to weight the embedding, like `(<fast_negative>)0.8`.
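Under the hood, the embedding just needs to be registered with the pipeline so the token resolves; a minimal sketch (the file path is illustrative; point it at wherever FastNegativeV2 is downloaded):

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Path is illustrative; use the local FastNegativeV2 embedding file.
pipe.load_textual_inversion(
    "embeddings/FastNegativeV2.pt", token="<fast_negative>"
)

image = pipe(
    prompt="a photo of a corgi on the beach",
    negative_prompt="<fast_negative>, lowres",
).images[0]
```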
DeepCache caches lower UNet layers and reuses them every *n* steps, trading quality for speed (a sketch follows the list):
- 1: no caching (default)
- 2: more quality
- 3: balanced
- 4: more speed
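A minimal sketch of enabling it, assuming the `DeepCache` package:

```python
import torch
from DeepCache import DeepCacheSDHelper
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# cache_interval matches the setting above: reuse cached features every n steps.
helper = DeepCacheSDHelper(pipe=pipe)
helper.set_params(cache_interval=3, cache_branch_id=0)
helper.enable()

image = pipe("a lighthouse in a storm").images[0]
helper.disable()  # restore normal, uncached inference
```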