Tomoyuki Suzuki¹
Kang-Jun Liu²
Naoto Inoue¹
Kota Yamaguchi¹
¹CyberAgent, ²Tohoku University
This repository is the official implementation of the paper "LayerD: Decomposing Raster Graphic Designs into Layers".
- Released the weights for high-resolution inference and set them as the default (2025-10-22)
We have verified reproducibility under the following environment.
- Ubuntu 22.04
- Python 3.12.3
- CUDA 12.8 (optional)
- uv 0.8.17
LayerD uses uv to manage the environment and dependencies. You can install this project with the following command:
uv sync
You can decompose an image into layers using the following minimal example:
from PIL import Image
from layerd import LayerD
image = Image.open("./data/test_image_2.png")
layerd = LayerD(matting_hf_card="cyberagent/layerd-birefnet").to("cpu")
layers = layerd.decompose(image)
The output layers is a list of PIL Image objects in RGBA format.
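For instance, a minimal sketch of saving the decomposed layers to PNG files (the output directory and file names below are illustrative):

import os

os.makedirs("./output_layers", exist_ok=True)
for i, layer in enumerate(layers):
    # Each layer is an RGBA PIL image; PNG preserves the alpha channel.
    layer.save(os.path.join("./output_layers", f"{i:04d}.png"))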
We provide some test images in the data/ directory.
Note
We recommend PNG images as input to avoid compression artifacts (especially around text edges) that may degrade the inpainting quality. You may mitigate this issue by setting a higher kernel_scale value (default: 0.015) when initializing LayerD.
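For example, a minimal sketch of initializing LayerD with a larger kernel_scale (the value 0.03 below is purely illustrative, not a recommended setting):

from layerd import LayerD

# kernel_scale defaults to 0.015; a larger value may reduce artifacts around text edges.
layerd = LayerD(matting_hf_card="cyberagent/layerd-birefnet", kernel_scale=0.03).to("cpu")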
Note
Building LayerD involves downloading two pre-trained models: the top-layer matting module from the Hugging Face repository cyberagent/layerd-birefnet (~1GB) and the inpainting model from eneshahin/simple-lama-inpainting (~200MB). Please ensure you have a stable internet connection during the first run.
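If you prefer to warm the cache beforehand, here is a hedged sketch using huggingface_hub; it assumes the matting weights are resolved through the standard Hugging Face cache (the inpainting model is downloaded on first use):

from huggingface_hub import snapshot_download

# Pre-fetch the top-layer matting weights (~1GB) into the local Hugging Face cache.
snapshot_download("cyberagent/layerd-birefnet")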
We provide a script to run inference on a dataset.
uv run python ./tools/infer.py \
--input </path/to/input> \
--output-dir </path/to/output> \
--device <device> # e.g., cuda or cpu
--input can be a file, a directory, or a glob pattern. You can also specify multiple input files like --input img1.png img2.png ....
--matting-weight-path can be used to specify the path to the trained weights of the top-layer matting module. If not specified, it uses the model from cyberagent/layerd-birefnet by default.
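For example, a sketch of running inference on all PNG files in the data/ directory on a GPU (the paths are illustrative):

uv run python ./tools/infer.py \
    --input "./data/*.png" \
    --output-dir ./outputs \
    --device cuda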
We provide code for fine-tuning the top-layer matting part of LayerD on the Crello dataset.
You can convert the Crello dataset for top-layer matting training.
uv run python ./tools/generate_crello_matting.py --output-dir </path/to/dataset> --inpainting --save-layers
Note
This script downloads the Crello dataset (<20GB) from Hugging Face. Please ensure you have a stable internet connection and sufficient disk space for the first run.
This will create a dataset with the following structure:
</path/to/save/dataset>
├── train
│ ├── im/ # Input images (full composite or intermediate composite images)
│ ├── gt/ # Ground-truth (top-layer alpha mattes)
│ ├── composite/ # Full composite images (not used for training, but for evaluation)
│ └── layers/ # Ground-truth layers (RGBA) (not used for training, but for evaluation)
├── validation
└── test
You can fine-tune the top-layer matting module on the generated dataset.
We reorganized the training code for this study based on the original BiRefNet, which is the backbone of the top-layer matting module. The training configuration is managed with Hydra, as training involves many hyperparameters.
Below is an example command to start training with a specified configuration file.
uv run python ./tools/train.py \
config_path=./src/layerd/configs/train.yaml \
data_root=</path/to/dataset> \
out_dir=</path/to/output> \
device=<device> # e.g., cuda or cpu
data_root is the dataset root path (like </path/to/dataset> above) and out_dir is the output directory path; both are mandatory fields in the configuration file that must be specified at runtime.
You can also override other hyperparameters in the configuration file by specifying them as command-line arguments.
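For example, a sketch of such an override (the field name batch_size is hypothetical; use the keys actually defined in train.yaml):

uv run python ./tools/train.py \
    config_path=./src/layerd/configs/train.yaml \
    data_root=</path/to/dataset> \
    out_dir=</path/to/output> \
    batch_size=4 # hypothetical key, overridden via Hydra's key=value syntax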
Training supports distributed mode using both torch.distributed and Hugging Face Accelerate.
To use torch.distributed, launch the training script with torchrun as follows:
CUDA_VISIBLE_DEVICES=0,1 uv run torchrun --standalone --nproc_per_node 2 \
./tools/train.py \
config_path=./src/layerd/configs/train.yaml \
data_root=</path/to/dataset> \
out_dir=</path/to/output> \
dist=true
For Hugging Face Accelerate, set use_accelerate=true in the command line arguments.
You can also set the mixed_precision parameter (options: no, fp16, bf16).
CUDA_VISIBLE_DEVICES=0,1 uv run torchrun --standalone --nproc_per_node 2 \
./tools/train.py \
config_path=./src/layerd/configs/train.yaml \
data_root=</path/to/dataset> \
out_dir=</path/to/output> \
use_accelerate=true \
mixed_precision=bf16
Note
We observe that training takes about 40 hours on 4x A100 40GB GPUs with use_accelerate=true, mixed_precision=bf16, and the default configuration.
We thank the authors of BiRefNet for releasing their code, which we used as a basis for our matting backbone.
You can calculate the proposed evaluation metrics using the following minimal example:
from layerd.evaluation import LayersEditDist
metric = LayersEditDist()
# Both layers_pred and layers_gt are lists of PIL.Image (RGBA)
result = metric(layers_pred, layers_gt)
We also provide a script to run dataset-level evaluation.
uv run python ./tools/evaluate.py \
--pred-dir </path/to/predictions> \
--gt-dir </path/to/groundtruth> \
--output-dir </path/to/output> \
--max-edits 5
--pred-dir and --gt-dir need to follow the structure below.
The dataset prepared by the script in Dataset preparation has a layers/ directory (not gt/) that
follows this structure and is ready for evaluation (e.g., --gt-dir </path/to/crello-matting>/layers).
</path/to/predictions or groundtruth>
├── {sample_id}
│ ├── 0000.png
│ ├── 0001.png
│ └── ...
└── {sample_id}
├── 0000.png
├── 0001.png
└── ...
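As an illustration, a minimal sketch of loading one sample's layers from this layout and computing the metric (the directory paths and sample id are illustrative):

from pathlib import Path
from PIL import Image
from layerd.evaluation import LayersEditDist

def load_layers(sample_dir):
    # Layer files are named 0000.png, 0001.png, ...; sorting restores the layer order.
    return [Image.open(p).convert("RGBA") for p in sorted(Path(sample_dir).glob("*.png"))]

metric = LayersEditDist()
layers_pred = load_layers("/path/to/predictions/sample_0001")
layers_gt = load_layers("/path/to/groundtruth/sample_0001")
result = metric(layers_pred, layers_gt)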
This project is licensed under the Apache-2.0 License. See the LICENSE file for details.
This project makes use of several third-party libraries, each of which has its own license.
If you find this project useful in your work, please cite our paper.
@inproceedings{suzuki2025layerd,
title={LayerD: Decomposing Raster Graphic Designs into Layers},
author={Suzuki, Tomoyuki and Liu, Kang-Jun and Inoue, Naoto and Yamaguchi, Kota},
booktitle={ICCV},
year={2025}
}