Commit 0ca7b68

[PEFT / docs] Add a note about torch.compile (#6864)
* Update using_peft_for_inference.md
* add more explanation
1 parent 3cf4f9c commit 0ca7b68

File tree

1 file changed

+19
-0
lines changed


docs/source/en/tutorials/using_peft_for_inference.md

@@ -165,6 +165,25 @@ list_adapters_component_wise
 {"text_encoder": ["toy", "pixel"], "unet": ["toy", "pixel"], "text_encoder_2": ["toy", "pixel"]}
 ```
 
+## Compatibility with `torch.compile`
+
+If you want to compile your model with `torch.compile`, make sure to first fuse the LoRA weights into the base model and then unload them.
+
+```py
+pipe.load_lora_weights("nerijs/pixel-art-xl", weight_name="pixel-art-xl.safetensors", adapter_name="pixel")
+pipe.load_lora_weights("CiroN2022/toy-face", weight_name="toy_face_sdxl.safetensors", adapter_name="toy")
+
+pipe.set_adapters(["pixel", "toy"], adapter_weights=[0.5, 1.0])
+# Fuses the LoRAs into the UNet
+pipe.fuse_lora()
+pipe.unload_lora_weights()
+
+pipe = torch.compile(pipe)
+
+prompt = "toy_face of a hacker with a hoodie, pixel art"
+image = pipe(prompt, num_inference_steps=30, generator=torch.manual_seed(0)).images[0]
+```
+
 ## Fusing adapters into the model
 
 You can use PEFT to easily fuse/unfuse multiple adapters directly into the model weights (both UNet and text encoder) using the [`~diffusers.loaders.LoraLoaderMixin.fuse_lora`] method, which can lead to a speed-up in inference and lower VRAM usage.
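Under the hood, fusing a LoRA means folding the low-rank update into the base weight matrix (W ← W + scale · B·A), so inference runs a single matmul with no extra adapter branch — which is also why a fused model is friendlier to `torch.compile`, since there are no runtime adapter lookups left in the graph. A minimal plain-Python sketch of that arithmetic (illustrative only, not the diffusers/PEFT implementation):

```python
# Illustrative sketch of what "fusing" a LoRA does: W' = W + scale * (B @ A).
# Plain-Python matrices; the real diffusers/PEFT code operates on torch tensors.

def matmul(X, Y):
    rows, inner, cols = len(X), len(Y), len(Y[0])
    return [[sum(X[i][k] * Y[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(rows)]

def fuse_lora(W, A, B, scale=1.0):
    """Fold the low-rank update into the base weight in place of a side branch."""
    BA = matmul(B, A)
    return [[W[i][j] + scale * BA[i][j] for j in range(len(W[0]))]
            for i in range(len(W))]

# Toy 2x2 base weight with a rank-1 LoRA (B: 2x1, A: 1x2).
W = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0], [2.0]]
A = [[0.5, 0.5]]

W_fused = fuse_lora(W, A, B, scale=1.0)

# Applying the fused weight to an input matches base path + LoRA side path.
x = [[3.0], [4.0]]
fused_out = matmul(W_fused, x)
side_out = matmul(matmul(B, A), x)
unfused_out = [[matmul(W, x)[i][0] + side_out[i][0]] for i in range(2)]
assert fused_out == unfused_out
```

Because the update is folded in ahead of time, the fused forward pass costs exactly one matmul, and `unfuse_lora` can recover the base weight by subtracting the same scaled product.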

0 commit comments