We generated images from several typical prompts and inspected them visually, showing that compilation acceleration with onediff is almost lossless.
We calculated the Frechet Inception Distance (FID), Inception Score (IS), CLIP Score, and Aesthetic Score based on the coco-30-val-2014 dataset.
The Human Preference Score v2 (HPS v2) was evaluated on the Human Preference Dataset v2 (HPD v2).
These metrics demonstrate that the acceleration by onediff does not result in quality loss.
In addition, we calculated the average Structural Similarity (SSIM) on coco-30-val-2014 between images generated by the original PyTorch HF diffusers baseline and images generated after acceleration with onediff, showing that the image structure does not change significantly.
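
As a minimal sketch of how such an SSIM comparison could be computed (the actual evaluation scripts are not shown here; the use of torchmetrics and the tensor layout are assumptions):

```python
import torch
from torchmetrics.image import StructuralSimilarityIndexMeasure

# Assumption: baseline and accelerated images were generated from the same
# prompts and seeds, and are loaded as float tensors in [0, 1] with shape (N, 3, H, W).
def average_ssim(baseline: torch.Tensor, accelerated: torch.Tensor) -> float:
    ssim = StructuralSimilarityIndexMeasure(data_range=1.0)
    return ssim(accelerated, baseline).item()
```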
| Metric | Reference Value (Original PyTorch) | OneFlow Backend | NexFort Backend |
| --- | --- | --- | --- |
| FID ⬇️ | 25.022 | 24.924 | 24.904 |
| IS ⬆️ | 37.472 ± 0.474 | 37.575 ± 0.559 | 37.415 ± 0.723 |
| CLIP Score ⬆️ | 0.305 | 0.305 | 0.305 |
| Aesthetic Score ⬆️ | 6.098 | 6.097 | 6.100 |
| HPS v2 ⬆️ | 31.47 | 31.47 | 31.58 |
| SSIM ⬆️ | - | 0.858 | 0.810 |
Note: ⬇️ indicates that a lower value is better; ⬆️ indicates that a higher value is better.
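
For reference, the FID and CLIP Score in the table could be reproduced with off-the-shelf metric implementations. The sketch below uses torchmetrics as an assumed tooling choice (the actual evaluation code may differ); note that torchmetrics reports CLIPScore on a 0-100 scale, so it is divided by 100 to match the 0-1 scale used above:

```python
import torch
from torchmetrics.image.fid import FrechetInceptionDistance
from torchmetrics.multimodal.clip_score import CLIPScore

# Assumption: `real_images` and `fake_images` are uint8 tensors of shape
# (N, 3, H, W); `prompts` is the list of COCO captions used for generation.
def eval_fid_and_clip(real_images, fake_images, prompts):
    fid = FrechetInceptionDistance(feature=2048)
    fid.update(real_images, real=True)
    fid.update(fake_images, real=False)

    clip = CLIPScore(model_name_or_path="openai/clip-vit-base-patch16")
    clip.update(fake_images, prompts)

    # CLIPScore is reported on a 0-100 scale; rescale to 0-1 for comparison.
    return fid.compute().item(), clip.compute().item() / 100.0
```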