Request to add support for nunchaku-flux.1-kontext-dev

Nunchaku is a high-speed inference engine for 4-bit quantized models, built on the SVDQuant quantization method.

https://github.com/mit-han-lab/nunchaku 

Quoting the project's own description: "SVDQuant reduces the 12B FLUX.1 model size by 3.6× and cuts the 16-bit model's memory usage by 3.5×. With Nunchaku, our INT4 model runs 3.0× faster than the NF4 W4A16 baseline on both desktop and laptop NVIDIA RTX 4090 GPUs. Notably, on the laptop 4090, it achieves a total 10.1× speedup by eliminating CPU offloading. Our NVFP4 model is also 3.1× faster than both BF16 and NF4 on the RTX 5090 GPU."
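
To put the quoted 3.6× size reduction in concrete terms, here is a rough back-of-the-envelope estimate. It assumes "12B" means exactly 12e9 parameters stored at 2 bytes each in 16-bit precision; the real checkpoint sizes will differ somewhat, so treat the output as an approximation and not as a measured figure.

```python
# Rough weight-size estimate from the figures quoted above.
# Assumptions (not from the project): exactly 12e9 parameters,
# 2 bytes per 16-bit weight, decimal gigabytes (1 GB = 1e9 bytes).
PARAMS = 12e9
BYTES_PER_16BIT_WEIGHT = 2
SIZE_REDUCTION = 3.6  # size reduction factor quoted for SVDQuant


def model_size_gb(params: float, bytes_per_param: float) -> float:
    """Return weight storage in decimal gigabytes."""
    return params * bytes_per_param / 1e9


bf16_gb = model_size_gb(PARAMS, BYTES_PER_16BIT_WEIGHT)
quantized_gb = bf16_gb / SIZE_REDUCTION

print(f"16-bit weights:   {bf16_gb:.1f} GB")
print(f"SVDQuant weights: {quantized_gb:.1f} GB")
```

Under these assumptions the weights shrink from roughly 24 GB to under 7 GB, which is consistent with the quoted point that the quantized model can avoid CPU offloading on a laptop GPU.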

https://huggingface.co/mit-han-lab/nunchaku-flux.1-kontext-dev



Request to add support for nunchaku-flux.1-kontext-dev #711
