LightGen: Efficient Image Generation through Knowledge Distillation and Direct Preference Optimization
Official PyTorch Implementation

`HF Checkpoint 🚀` | `Technical Report 📝`

Xianfeng Wu¹,²,# · Yajing Bai¹,²,# · Haoze Zheng¹,²,# · Harold (Haodong) Chen¹,²,# · Yexin Liu¹,²,# · Zihao Wang¹,² · Xuran Ma¹,² · Wenjie Shu¹,² · Xianzu Wu¹,² · Harry Yang¹,²,* · Sernam Lim²,³,*

¹ HKUST AMC, ² Everlyn AI, ³ UCF CS, # Equal contribution, * Corresponding author

This is a PyTorch/GPU implementation of LightGen. This repo aims to provide an efficient pre-training pipeline for text-to-image generation based on Fluid/MAR.

🦉 ToDo List

DPO Post-processing Code Released
Release Complete Checkpoint.
Add Accelerate Module.

Env

conda create -n everlyn_video python=3.10
conda activate everlyn_video
pip install torch==2.2.2 torchvision==0.17.2 torchaudio==2.2.2 --index-url https://download.pytorch.org/whl/cu121
# pip install -U xformers==0.0.26 --index-url https://download.pytorch.org/whl/cu121
pip install -r requierments.txt
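
After installation, it can help to confirm that the pinned PyTorch packages actually landed in the active environment. The following is a minimal sketch, not part of this repo: the helper name `check_versions` is hypothetical, and the pins are copied from the `pip install` line above. It only uses the standard library, so it runs even if the install failed.

```python
from importlib import metadata

# Pins copied from the pip install command in the Env section.
EXPECTED = {"torch": "2.2.2", "torchvision": "0.17.2", "torchaudio": "2.2.2"}


def check_versions(expected=EXPECTED):
    """Return {package: (installed_version_or_None, matches_pin)}."""
    report = {}
    for name, pin in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Wheels from the cu121 index may carry a local suffix such as "+cu121",
        # so compare only the part before the "+".
        base = installed.split("+")[0] if installed else None
        report[name] = (installed, base == pin)
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check_versions().items():
        print(f"{name}: installed={installed} pinned={EXPECTED[name]} ok={ok}")
```

A package reported as `installed=None` is missing from the environment; `ok=False` with a version shown means a different release than the pinned one is installed.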

Prepare stage

huggingface-cli download --token hf_ur_token --resume-download stabilityai/stable-diffusion-3.5-large --local-dir stable-diffusion-3.5-large # Image VAE
huggingface-cli download --resume-download google/flan-t5-xxl --local-dir google/flan-t5-xxl # Text Encoder
huggingface-cli download --repo-type dataset --resume-download jackyhate/text-to-image-2M --local-dir text-to-image-2M # Dataset
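
The same three downloads can also be scripted from Python with `huggingface_hub.snapshot_download`, which may be convenient on a cluster where the CLI is awkward to use. This is a sketch under assumptions: it requires `pip install huggingface_hub`, the helper name `download_all` is hypothetical, and the repo IDs and local directories are copied from the commands above.

```python
import os

# One entry per CLI command above: VAE source, text encoder, dataset.
DOWNLOADS = [
    {"repo_id": "stabilityai/stable-diffusion-3.5-large", "repo_type": "model",
     "local_dir": "stable-diffusion-3.5-large"},
    {"repo_id": "google/flan-t5-xxl", "repo_type": "model",
     "local_dir": "google/flan-t5-xxl"},
    {"repo_id": "jackyhate/text-to-image-2M", "repo_type": "dataset",
     "local_dir": "text-to-image-2M"},
]


def download_all(token=None, downloads=DOWNLOADS):
    """Download every repo in `downloads` and return the local paths."""
    # Imported lazily so this module can be inspected without the dependency.
    from huggingface_hub import snapshot_download

    paths = []
    for spec in downloads:
        # With token=None the library falls back to a locally saved login, if any.
        paths.append(snapshot_download(token=token, **spec))
    return paths


# Example call (starts large downloads, so it is left commented out):
# download_all(token=os.environ.get("HF_TOKEN"))
```

An access token is passed because the first command above supplies one for the Stable Diffusion 3.5 repo; reading it from the `HF_TOKEN` environment variable avoids hard-coding it in a script.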

Untar script for text-to-image-2M:

#!/bin/bash

# Check if the 'untar' directory exists, and create it if it does not
mkdir -p untar

# Loop through all .tar files
for tar_file in *.tar; do
    # Extract the numeric part, for example 00001, 00002, ...
    dir_name=$(basename "$tar_file" .tar)
    
    # Create the corresponding directory
    mkdir -p "untar/$dir_name"
    
    # Extract the tar file to the corresponding directory
    tar -xvf "$tar_file" -C "untar/$dir_name"
    
    echo "Extraction completed: $tar_file to untar/$dir_name"
done

echo "All files have been extracted."

The dataset is large, so reading it through a normal dataset class can take a long time. To speed this up, first generate a JSON file: modify scripts/generate_txt.py, then run it.

python generate_json.py
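
To illustrate what such an index step can look like, here is a minimal sketch. It is not the repo's script. It assumes each extracted shard under `untar/` holds `.jpg` images next to same-named `.json` files, and that the caption is stored under a `prompt` key; the function name `build_index`, the output name `index.json`, and the record fields are all hypothetical.

```python
import json
from pathlib import Path


def build_index(root="untar", out_path="index.json",
                image_ext=".jpg", caption_key="prompt"):
    """Scan `root` once and write a list of {image, caption} records as JSON."""
    records = []
    for image_path in sorted(Path(root).rglob(f"*{image_ext}")):
        meta_path = image_path.with_suffix(".json")
        if not meta_path.is_file():
            continue  # skip images that have no caption file next to them
        with meta_path.open(encoding="utf-8") as f:
            meta = json.load(f)
        caption = meta.get(caption_key)
        if caption is None:
            continue  # skip metadata without the assumed caption key
        records.append({"image": str(image_path), "caption": caption})
    with open(out_path, "w", encoding="utf-8") as f:
        json.dump(records, f, ensure_ascii=False)
    return len(records)
```

With an index like this, a dataset class can load one JSON file at start-up instead of walking millions of files on every run.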

Training

Script for the default setting; you can modify some settings in scripts/run.sh:

sh run.sh

Inference

Script for the default setting:

python pipeline_image.py

Acknowledgements

A large portion of the code in this repo is based on MAR.

Cite

@article{wu2025lightgen,
  title={LightGen: Efficient Image Generation through Knowledge Distillation and Direct Preference Optimization},
  author={Wu, Xianfeng and Bai, Yajing and Zheng, Haoze and Chen, Harold Haodong and Liu, Yexin and Wang, Zihao and Ma, Xuran and Shu, Wen-Jie and Wu, Xianzu and Yang, Harry and others},
  journal={arXiv preprint arXiv:2503.08619},
  year={2025}
}
