[ICLR26] UniCalli: A Unified Diffusion Framework for Column-Level Generation and Recognition of Chinese Calligraphy

Overview

UniCalli is a unified diffusion framework for column-level generation and recognition of Chinese calligraphy. Existing methods either generate isolated characters or sacrifice calligraphic correctness for page-level synthesis. UniCalli instead handles recognition and generation in a single model, achieving better stylistic fidelity and structural accuracy.

Key Features

Unified Architecture: First framework to unify column-level calligraphy generation and recognition
Multi-Master Styles: Supports diverse calligraphic styles, including Wang Xizhi, Yan Zhenqing, Ouyang Xun, etc.
Densely Annotated Data: Trained on a large-scale calligraphy dataset with detailed annotations

Licence

This model is for academic research and non-commercial use only.

TODO List

Getting Started

Installation

git clone https://github.com/EnVision-Research/UniCalli.git
cd UniCalli
pip install -r requirements.txt

Download Model

Download the complete model package (includes checkpoint, InternVL embedding, and font) from Hugging Face:

# Using huggingface-cli (recommended)
huggingface-cli download TSXu/UniCalli-base --local-dir ./checkpoints

Or from ModelScope:

# Using modelscope
pip install modelscope
python -c "from modelscope import snapshot_download; snapshot_download('tianshuo/UniCalli-base', local_dir='./checkpoints')"
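Before running the examples below, it can help to confirm that the download landed where the code expects it. The following is a minimal sketch, not part of the repository: the helper name `missing_checkpoint_files` is hypothetical, and the two expected entries are simply the paths used in the usage examples in this README, so the real package may contain more files (for example the font).

```python
from pathlib import Path

# Entries referenced by the usage examples in this README.
# Assumption: they sit directly under the download directory.
EXPECTED = ("internvl_embedding", "unicalli-base_cleaned.bin")


def missing_checkpoint_files(root="./checkpoints", expected=EXPECTED):
    """Return the expected entries that do not exist under `root`."""
    root = Path(root)
    return [name for name in expected if not (root / name).exists()]
```

Calling `missing_checkpoint_files()` after the download returns an empty list when both entries are present; otherwise it lists what is still missing.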

Usage

Run Demo (Gradio Interface)

python app.py

4-bit Quantization (GPU Memory < 18GB)

⚠️ Note: 4-bit quantization significantly reduces VRAM usage but may result in slightly degraded output quality.

pip install optimum-quanto

from inference import CalligraphyGenerator

generator = CalligraphyGenerator(
    model_name="flux-dev",
    device="cuda",
    offload=False,
    intern_vlm_path="./checkpoints/internvl_embedding",  # Path to downloaded embedding
    checkpoint_path="./checkpoints/unicalli-base_cleaned.bin",
    font_descriptions_path='dataset/chirography.json',
    author_descriptions_path='dataset/calligraphy_styles_en.json',
    use_deepspeed=False,
    use_4bit_quantization=True,  # Enable 4-bit quantization
)

image, cond_img = generator.generate(
    text="生日快乐喵",  # Must be 5 characters
    font_style="草",    # 楷(Regular)/草(Cursive)/行(Running)
    author="黄庭坚",    # Or None to use synthetic style
    save_path="output.png",
    num_steps=25,
    seed=42,
)

Using DeepSpeed for Memory Optimization (GPU Memory < 40GB)

from inference import CalligraphyGenerator

generator = CalligraphyGenerator(
    model_name="flux-dev",
    device="cuda",
    offload=False,  # DeepSpeed manages memory
    intern_vlm_path="./checkpoints/internvl_embedding",  # Path to downloaded embedding
    checkpoint_path="./checkpoints/unicalli-base_cleaned.bin",
    font_descriptions_path='dataset/chirography.json',
    author_descriptions_path='dataset/calligraphy_styles_en.json',
    use_deepspeed=True,
    use_4bit_quantization=False,
    deepspeed_config="ds_config_zero2.json"
)

image, cond_img = generator.generate(
    text="生日快乐喵",  # Must be 5 characters
    font_style="楷",    # 楷(Regular)/草(Cursive)/行(Running)
    author="赵佶",    # Or None to use synthetic style
    save_path="output.png",
    num_steps=39,
    seed=1128293374,
)
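The two configurations above differ only in a few keyword arguments, so the choice can be driven by available GPU memory. The sketch below is one reading of the section headings (4-bit below 18 GB, DeepSpeed below 40 GB, neither otherwise); the helper `pick_memory_flags` is hypothetical, and running with both options disabled on larger GPUs is an assumption rather than something this README states.

```python
def pick_memory_flags(vram_gb):
    """Map available GPU memory (in GB) to CalligraphyGenerator keyword flags.

    Thresholds follow the section headings in this README; treat them as
    rough guidance, not hard limits.
    """
    if vram_gb < 18:
        # Smallest footprint, possibly at some cost in output quality.
        return {"use_4bit_quantization": True, "use_deepspeed": False}
    if vram_gb < 40:
        return {
            "use_4bit_quantization": False,
            "use_deepspeed": True,
            "deepspeed_config": "ds_config_zero2.json",
        }
    # Assumption: enough memory to run without either optimization.
    return {"use_4bit_quantization": False, "use_deepspeed": False}


# With PyTorch installed, total VRAM in GB could be obtained as:
#   torch.cuda.get_device_properties(0).total_memory / 1024**3
```

The returned dictionary can be unpacked into the constructor alongside the other arguments, for example `CalligraphyGenerator(model_name="flux-dev", ..., **pick_memory_flags(24))`.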

Supported Font Styles

楷 (Regular Script / Kaishu): Standard, block-style characters
行 (Running Script / Xingshu): Semi-cursive, flowing style
草 (Cursive Script / Caoshu): Highly cursive, artistic style
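Because `generate` expects one of these three single-character style keys and a text of exactly five characters, a small pre-check can catch mistakes before the model is loaded. This is an illustrative sketch: `check_request` and `FONT_STYLES` are hypothetical names, and how the library itself reacts to invalid input is not documented here.

```python
# Style keys and their English names, as listed in this README.
FONT_STYLES = {
    "楷": "Regular (Kaishu)",
    "行": "Running (Xingshu)",
    "草": "Cursive (Caoshu)",
}


def check_request(text, font_style, length=5):
    """Validate inputs for generate() and return the English style name."""
    if len(text) != length:
        raise ValueError(f"text must be {length} characters, got {len(text)}")
    if font_style not in FONT_STYLES:
        raise ValueError(f"font_style must be one of {list(FONT_STYLES)}")
    return FONT_STYLES[font_style]
```

For instance, `check_request("生日快乐喵", "草")` passes, while a four-character text or an unlisted style raises `ValueError`.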

Supported Calligraphy Masters

The model supports various historical calligraphy masters including:

王羲之 (Wang Xizhi) - "Sage of Calligraphy"
颜真卿 (Yan Zhenqing) - Tang Dynasty master
欧阳询 (Ouyang Xun) - One of the Four Great Masters of regular script
赵佶 (Emperor Huizong) - Song Dynasty emperor and calligrapher
And many more...

You can also use author=None to generate in a synthetic, averaged style.

Model Details

Base Architecture: FLUX diffusion model
Model Size: ~23GB
Input: Text (5 characters), font style, author style
Output: Column-level calligraphy image
Training Data: Large-scale Chinese calligraphy dataset with dense annotations
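Since the input is fixed at five characters, longer passages have to be broken up before generation. The sketch below shows one possible way to do that; `split_into_columns` is a hypothetical helper, and generating each chunk as a separate column is an assumption, as this README does not describe a multi-column workflow.

```python
def split_into_columns(text, size=5):
    """Split text into consecutive chunks of `size` characters.

    Whitespace is dropped first. A ValueError is raised when the remaining
    length is not a positive multiple of `size`, because how a shorter final
    column should be handled is not specified.
    """
    chars = [c for c in text if not c.isspace()]
    if not chars or len(chars) % size != 0:
        raise ValueError(f"text length must be a positive multiple of {size}")
    return ["".join(chars[i:i + size]) for i in range(0, len(chars), size)]
```

Each returned chunk could then be passed as `text` to a separate `generator.generate(...)` call with its own `save_path`.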

Citation

If you find UniCalli useful in your research, please consider citing:

@article{xu2025unicalli,
  title={UniCalli: A Unified Diffusion Framework for Column-Level Generation and Recognition of Chinese Calligraphy},
  author={Xu, Tianshuo and Wang, Kai and Chen, Zhifei and Wu, Leyi and Wen, Tianshui and Chao, Fei and Chen, Ying-Cong},
  journal={arXiv preprint arXiv:2510.13745},
  year={2025}
}

Acknowledgments

This work builds upon the FLUX architecture and benefits from the rich heritage of Chinese calligraphy. We thank the calligraphy masters whose works made this research possible.
