
Efficient Diffusion Models: A Survey

Efficient Diffusion Models: A Survey [arXiv] (Version 1: 02/03/2025, camera-ready version for Transactions on Machine Learning Research)

Hui Shen1, Jingxuan Zhang2, Boning Xiong3, Rui Hu4, Shoufa Chen1, Zhongwei Wan1, Xin Wang1, Yu Zhang5, Zixuan Gong5, Guangyin Bao5, Chaofan Tao6, Yongfeng Huang7, Ye Yuan8, Mi Zhang1

1The Ohio State University, 2Indiana University, 3Fudan University, 4Hangzhou City University, 5Tongji University, 6The University of Hong Kong, 7The Chinese University of Hong Kong, 8Peking University.

If you like our project, please give us a star ⭐ on GitHub to follow the latest updates.

Diffusion models have kickstarted a new era in the field of AI-generated content (AIGC). This repo is a curated list of papers on the latest advancements in efficient diffusion models. It is actively updated, so please stay tuned!

⚡News: Our survey has been officially accepted by Transactions on Machine Learning Research (TMLR), April 2025. The arXiv version is available at [arXiv].

@article{shen2025efficient,
  title={Efficient Diffusion Models: A Survey},
  author={Shen, Hui and Zhang, Jingxuan and Xiong, Boning and Hu, Rui and Chen, Shoufa and Wan, Zhongwei and Wang, Xin and Zhang, Yu and Gong, Zixuan and Bao, Guangyin and others},
  journal={arXiv preprint arXiv:2502.06805},
  year={2025}
}

❤️ Community Support

We will actively maintain this repository and incorporate new research as it emerges. If you have any suggestions regarding our taxonomy, notice a missing paper, or would like to update the venue of an arXiv preprint that has since been accepted, feel free to send us an email or submit a pull request using the markdown format shown below.
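
As an illustration, entries throughout this list follow the format below, where the venue, title, and links are placeholders:

  • [Venue Year] Paper Title. [Paper] [Code]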

📖 Table of Contents

Algorithm

Efficient Training

  • Latent Diffusion
    • [CVPR 2023] Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models. [Paper]
    • [Arxiv 2022.11] MagicVideo: Efficient Video Generation With Latent Diffusion Models. [Paper]
    • [CVPR 2023] Video Probabilistic Diffusion Models in Projected Latent Space. [Paper]
    • [Arxiv 2023.03] Latent Video Diffusion Models for High-Fidelity Long Video Generation. [Paper]
    • [ICLR 2024] Latent 3D Graph Diffusion. [Paper] [Code]
    • [CVPR 2023] Executing your Commands via Motion Diffusion in Latent Space. [Paper] [Code]
    • [ICML 2023] AudioLDM: Text-to-Audio Generation with Latent Diffusion Models. [Paper] [Code]
    • [Arxiv 2024.10] L3DG: Latent 3D Gaussian Diffusion. [Paper]
    • [ICLR 2024] Mixed-Type Tabular Data Synthesis with Score-Based Diffusion in Latent Space. [Paper] [Code]
    • [NeurIPS 2023] Generating Behaviorally Diverse Policies with Latent Diffusion Models. [Paper]
    • [Arxiv 2024.09] Latent Diffusion Models for Controllable RNA Sequence Generation. [Paper]
  • Loss Formulation
    • [JMLR 2005] Estimation of Non-Normalized Statistical Models by Score Matching. [Paper]
    • [UAI 2019] Sliced Score Matching: A Scalable Approach to Density and Score Estimation. [Paper] [Code]
    • [NeurIPS 2019] Generative Modeling by Estimating Gradients of the Data Distribution. [Paper] [Code]
    • [NeurIPS 2021] Maximum Likelihood Training of Score-Based Diffusion Models. [Paper] [Code]
    • [ICLR 2022] Score-Based Generative Modeling with Critically-Damped Langevin Diffusion. [Paper] [Code]
    • [Arxiv 2022.09] Flow Straight and Fast: Learning to Generate and Transfer Data with Rectified Flow. [Paper] [Code]
    • [Arxiv 2022.09] Rectified Flow: A Marginal Preserving Approach to Optimal Transport. [Paper]
    • [ICLR 2024] InstaFlow: One Step is Enough for High-Quality Diffusion-Based Text-to-Image Generation. [Paper] [Code]
    • [Arxiv 2024.09] PeRFlow: Piecewise Rectified Flow as Universal Plug-and-Play Accelerator. [Paper] [Code]
    • [Arxiv 2024.10] Improving the Training of Rectified Flows. [Paper] [Code]
    • [Arxiv 2024.02] SlimFlow: Training Smaller One-Step Diffusion Models with Rectified Flow. [Paper] [Code]
  • Training Tricks
    • [Arxiv 2022.02] PriorGrad: Improving Conditional Denoising Diffusion Models with Data-Dependent Adaptive Prior. [Paper]
    • [Arxiv 2023.05] DiGress: Discrete Denoising Diffusion for Graph Generation. [Paper] [Code]
    • [Arxiv 2024.02] DecompDiff: Diffusion Models with Decomposed Priors for Structure-Based Drug Design. [Paper] [Code]
    • [CVPR 2023] Leapfrog Diffusion Model for Stochastic Trajectory Prediction. [Paper] [Code]
    • [ICLR 2025] Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think. [Paper] [Code]
    • [Arxiv 2025.01] Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models. [Paper] [Code]
    • [NeurIPS 2020] Denoising Diffusion Probabilistic Models. [Paper] [Code]
    • [ICML 2021] Improved Denoising Diffusion Probabilistic Models. [Paper] [Code]
    • [Arxiv 2024.07] Improved Noise Schedule for Diffusion Training. [Paper]
    • [EMNLP 2023] A Cheaper and Better Diffusion Language Model with Soft-Masked Noise. [Paper] [Code]
    • [ICLR 2021] Denoising Diffusion Implicit Models. [Paper]
    • [NeurIPS 2024] ResShift: Efficient Diffusion Model for Image Super-resolution by Residual Shifting. [Paper] [Code]
    • [Arxiv 2024.06] Immiscible Diffusion: Accelerating Diffusion Training with Noise Assignment. [Paper] [Code]
    • [ICLR 2023] DiffuSeq: Sequence to Sequence Text Generation with Diffusion Models. [Paper] [Code]
    • [ACL 2024] Text Diffusion Model with Encoder-Decoder Transformers for Sequence-to-Sequence Generation. [Paper]

Efficient Fine-tuning

  • Low-Rank Adaptation
    • [ICLR 2022] LoRA: Low-Rank Adaptation of Large Language Models. [Paper] [Code]
    • [ECCV 2024] Concept Sliders: LoRA Adaptors for Precise Control in Diffusion Models. [Paper] [Code]
    • [ECCV 2024] LCM-LoRA: A Universal Stable-Diffusion Acceleration Module. [Paper] [Code]
    • [Arxiv 2024.07] LoRA-Composer: Leveraging Low-Rank Adaptation for Multi-Concept Customization in Training-Free Diffusion Models. [Paper] [Code]
    • [Arxiv 2024.10] Simple Drop-in LoRA Conditioning on Attention Layers Will Improve Your Diffusion Model. [Paper]
  • Adapter
    • [AAAI 2024] T2I-Adapter: Learning Adapters to Dig Out More Controllable Ability for Text-to-Image Diffusion Models. [Paper] [Code]
    • [Arxiv 2023.08] IP-Adapter: Text Compatible Image Prompt Adapter for Text-to-Image Diffusion Models. [Paper] [Code]
    • [ICML 2024] Accelerating Parallel Sampling of Diffusion Models. [Paper] [Code]
    • [Arxiv 2024.05] Ctrl-Adapter: An Efficient and Versatile Framework for Adapting Diverse Controls to Any Diffusion Model. [Paper] [Code]
    • [CVPR 2024] SimDA: Simple Diffusion Adapter for Efficient Video Generation. [Paper] [Code]
  • ControlNet
    • [ICCV 2023] Adding Conditional Control to Text-to-Image Diffusion Models. [Paper] [Code]
    • [ECCV 2024] ControlNet++: Improving Conditional Controls with Efficient Consistency Feedback. [Paper] [Code]
    • [Arxiv 2023.12] ControlNet-XS: Rethinking the Control of Text-to-Image Diffusion Models as Feedback-Control Systems. [Paper] [Code]
    • [Arxiv 2024.08] ControlNeXt: Powerful and Efficient Control for Image and Video Generation. [Paper] [Code]
    • [NeurIPS 2023] Uni-ControlNet: All-in-One Control to Text-to-Image Diffusion Models. [Paper] [Code]
    • [NeurIPS 2023] UniControl: A Unified Diffusion Model for Controllable Visual Generation in the Wild. [Paper] [Code]

Efficient Sampling

  • Solver
    • [NeurIPS 2021] Diffusion Normalizing Flow. [Paper]
    • [Arxiv 2023.09] Diffusion Models with Deterministic Normalizing Flow Priors. [Paper] [Code]
    • [Arxiv 2021.05] Gotta Go Fast When Generating Data with Score-Based Models. [Paper] [Code]
    • [NeurIPS 2023] Gaussian Mixture Solvers for Diffusion Models. [Paper] [Code]
    • [ICML 2024] Unifying Bayesian Flow Networks and Diffusion Models through Stochastic Differential Equations. [Paper] [Code]
    • [NeurIPS 2023] SA-Solver: Stochastic Adams Solver for Fast Sampling of Diffusion Models. [Paper]
    • [NeurIPS 2023] Restart Sampling for Improving Generative Processes. [Paper] [Code]
    • [ICLR 2021] Denoising Diffusion Implicit Models. [Paper]
    • [NeurIPS 2022] DPM-Solver: A Fast ODE Solver for Diffusion Probabilistic Model Sampling in Around 10 Steps. [Paper] [Code]
    • [ICLR 2023] Fast Sampling of Diffusion Models with Exponential Integrator. [Paper] [Code]
    • [ICML 2023] Denoising MCMC for Accelerating Diffusion-Based Generative Models. [Paper] [Code]
    • [ICML 2023] Improved Techniques for Maximum Likelihood Estimation for Diffusion ODEs. [Paper] [Code]
  • Sampling Scheduling
    • [NeurIPS 2022] Deep Equilibrium Approaches to Diffusion Models. [Paper] [Code]
    • [NeurIPS 2023] Parallel Sampling of Diffusion Models. [Paper] [Code]
    • [ICML 2024] Accelerating Parallel Sampling of Diffusion Models. [Paper] [Code]
    • [Arxiv 2023.12] StreamDiffusion: A Pipeline-level Solution for Real-time Interactive Generation. [Paper] [Code]
    • [Arxiv 2021.06] On Fast Sampling of Diffusion Probabilistic Models. [Paper] [Code]
    • [Arxiv 2021.06] Learning to Efficiently Sample from Diffusion Probabilistic Models. [Paper]
    • [ICML 2024] Align Your Steps: Optimizing Sampling Schedules in Diffusion Models. [Paper]
  • Truncated Sampling
    • [EMNLP 2021] Consistent Accelerated Inference via Confident Adaptive Transformers. [Paper] [Code]
    • [ICML 2024] A Simple Early Exiting Framework for Accelerated Sampling in Diffusion Models. [Paper] [Code]
    • [Arxiv 2022.04] Semi-Parametric Neural Image Synthesis. [Paper] [Code]
    • [ICLR 2023] kNN-Diffusion: Image Generation via Large-Scale Retrieval. [Paper]
    • [ICLR 2023] Re-Imagen: Retrieval-Augmented Text-to-Image Generator. [Paper]
    • [ICML 2023] ReDi: Efficient Learning-Free Diffusion Inference via Trajectory Retrieval. [Paper] [Code]
  • Knowledge Distillation
    • [Arxiv 2021.01] Knowledge Distillation in Iterative Generative Models for Improved Sampling Speed. [Paper] [Code]
    • [ICLR 2022] Progressive Distillation for Fast Sampling of Diffusion Models. [Paper] [Code]
    • [CVPR 2023] On Distillation of Guided Diffusion Models. [Paper]
    • [Arxiv 2025.03] SANA-Sprint: One-Step Diffusion with Continuous-Time Consistency Distillation. [Paper] [Code]
    • [CVPR 2024] One-step Diffusion with Distribution Matching Distillation. [Paper]
    • [ICLR 2023] DreamFusion: Text-to-3D using 2D Diffusion. [Paper] [Project]
    • [NeurIPS 2023] ProlificDreamer: High-Fidelity and Diverse Text-to-3D Generation with Variational Score Distillation. [Paper] [Code]
    • [NeurIPS 2023] Diff-Instruct: A Universal Approach for Transferring Knowledge From Pre-trained Diffusion Models. [Paper] [Code]
    • [CVPR 2024] 3D Paintbrush: Local Stylization of 3D Shapes with Cascaded Score Distillation. [Paper] [Code]
    • [CVPR 2025] MoFlow: One-Step Flow Matching for Human Trajectory Forecasting via Implicit Maximum Likelihood Estimation-Based Distillation. [Paper] [Code]

Compression

  • Quantization
    • [CVPR 2023] Post-training Quantization on Diffusion Models. [Paper] [Code]
    • [ICCV 2023] Q-Diffusion: Quantizing Diffusion Models. [Paper] [Code]
    • [ICLR 2021] BRECQ: Pushing the Limit of Post-Training Quantization by Block Reconstruction. [Paper] [Code]
    • [NeurIPS 2023] Leveraging Early-Stage Robustness in Diffusion Models for Efficient and High-Quality Image Synthesis. [Paper]
    • [NeurIPS 2023] PTQD: Accurate Post-Training Quantization for Diffusion Models. [Paper] [Code]
    • [NeurIPS 2023] Temporal Dynamic Quantization for Diffusion Models. [Paper]
    • [ICLR 2024] EfficientDM: Efficient Quantization-Aware Fine-Tuning of Low-Bit Diffusion Models. [Paper] [Code]
    • [ICLR 2020] Learned Step Size Quantization. [Paper]
    • [NeurIPS 2024] BitsFusion: 1.99 bits Weight Quantization of Diffusion Model. [Paper]
  • Pruning
    • [NeurIPS 2023] Structural Pruning for Diffusion Models. [Paper] [Code]
    • [CVPRW 2024] LD-Pruner: Efficient Pruning of Latent Diffusion Models using Task-Agnostic Insights. [Paper]
    • [ICML 2024] LayerMerge: Neural Network Depth Compression through Layer Pruning and Merging. [Paper] [Code]
    • [Arxiv 2024.04] LAPTOP-Diff: Layer Pruning and Normalized Distillation for Compressing Diffusion Models. [Paper]

System

Hardware-Software Co-Design

  • [CVPRW 2023] Speed Is All You Need: On-Device Acceleration of Large Diffusion Models via GPU-Aware Optimizations. [Paper] [Project]
  • [FPL 2024] SDA: Low-Bit Stable Diffusion Acceleration on Edge FPGAs. [Paper] [Code]
  • [ISCAS 2024] A 28.6 mJ/iter Stable Diffusion Processor for Text-to-Image Generation with Patch Similarity-based Sparsity Augmentation and Text-based Mixed-Precision. [Paper]

Parallel Computing

  • [CVPR 2024] DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models. [Paper] [Code]
  • [Arxiv 2024.05] PipeFusion: Displaced Patch Pipeline Parallelism for Inference of Diffusion Transformer Models. [Paper] [Code]
  • [Arxiv 2024.07] SwiftDiffusion: Efficient Diffusion Model Serving with Add-on Modules. [Paper]
  • [MLSys 2024] DiffusionPipe: Training Large Diffusion Models with Efficient Pipelines. [Paper]

Caching Technique

  • [NSDI 2024] Approximate Caching for Efficiently Serving Text-to-Image Diffusion Models. [Paper] [Code]
  • [CVPR 2024] DeepCache: Accelerating Diffusion Models for Free. [Paper] [Code]
  • [CVPR 2024] Cache Me if You Can: Accelerating Diffusion Models through Block Caching. [Paper] [Project]
  • [Arxiv 2024.06] Learning-to-Cache: Accelerating Diffusion Transformer via Layer Caching. [Paper] [Code]
  • [Arxiv 2024.07] FORA: Fast-Forward Caching in Diffusion Transformer Acceleration. [Paper] [Code]
  • [NeurIPS 2024] MD-DiT: Step-aware Mixture-of-Depths for Efficient Diffusion Transformers. [Paper]
  • [Arxiv 2025.02] Δ-DiT: Accelerating Diffusion Transformers without Training via Denoising Property Alignment. [Paper]
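
A common thread in the caching papers above is that features from the expensive parts of the denoiser change slowly across adjacent denoising steps, so they can be computed once and reused for several steps. The toy sketch below illustrates this feature-reuse pattern only; it is not the actual implementation of any listed method, and `deep_blocks` / `shallow_blocks` are hypothetical stand-ins for the expensive and cheap parts of the network.

```python
# Toy sketch of the feature-reuse idea behind caching methods (illustrative only,
# not any paper's actual code): deep-block features are recomputed every
# `cache_interval` denoising steps and reused in between.

def denoise_with_feature_cache(deep_blocks, shallow_blocks, x, timesteps, cache_interval=3):
    """deep_blocks / shallow_blocks are hypothetical callables standing in for the
    expensive and cheap parts of the denoising network."""
    cached = None
    for i, t in enumerate(timesteps):
        if cached is None or i % cache_interval == 0:
            cached = deep_blocks(x, t)             # full, expensive forward pass
        noise_pred = shallow_blocks(x, t, cached)  # cheap pass that reuses the cached features
        x = x - 0.1 * noise_pred                   # placeholder update, not a real scheduler step
    return x

# Dummy usage with stand-in callables and a scalar "sample":
x0 = denoise_with_feature_cache(
    deep_blocks=lambda x, t: 0.5 * x,
    shallow_blocks=lambda x, t, feats: feats + 0.01 * t,
    x=1.0,
    timesteps=range(50, 0, -1),
)
```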

Frameworks

| Framework | Key Features |
| --- | --- |
| FlashAttention [Code] | Highly efficient attention computation for Diffusion Transformers (DiT) |
| xFormers [Code] | Memory-efficient attention and modular ops tailored for diffusion Transformer speedups |
| DeepSpeed [Code] | Scalable distributed training and inference optimizations for large diffusion models |
| OneFlow [Code] | Compiler-optimized pipeline for faster diffusion model training and sampling |
| Stable-Fast [Code] | Fast inference optimization for Diffusers with CUDA kernels and operator fusion |
| Onediff [Code] | Diffusion-specific acceleration with DeepCache and quantization |
| DeepCache [Code] | Reuses cached diffusion features to speed up inference iterations |
| TGATE [Code] | Temporal gating to streamline cross-attention in diffusion inference |
| xDiT [Code] | Parallel inference engine for Diffusion Transformers |
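
As a concrete illustration of how such frameworks plug into a standard diffusion pipeline, here is a minimal sketch assuming the Hugging Face Diffusers API; the checkpoint ID, prompt, and step count are placeholders, and xFormers must be installed separately.

```python
# Minimal sketch (assuming the Hugging Face Diffusers API): enabling xFormers'
# memory-efficient attention on a Stable Diffusion pipeline. The checkpoint ID
# and prompt below are illustrative placeholders.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # placeholder checkpoint
    torch_dtype=torch.float16,
).to("cuda")

# Swap the default attention for xFormers' memory-efficient kernels
# (requires `pip install xformers`); this lowers memory use and often speeds up sampling.
pipe.enable_xformers_memory_efficient_attention()

image = pipe("a photo of an astronaut riding a horse", num_inference_steps=25).images[0]
image.save("sample.png")
```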