[PRCV 25] Towards Real-World Document Specular Highlight Removal: The DocHighlight Dataset and DocSHRNet Method

SCUT-DLVCLab/DocHighlight

DocHighlight: A Real-World Dataset for Document Specular Highlight Removal

📖 Introduction

DocHighlight is a large-scale, high-resolution dataset for document specular highlight removal, captured with a polarization-based acquisition pipeline across diverse real-world scenarios. The dataset is detailed in "Towards Real-World Document Specular Highlight Removal: The DocHighlight Dataset and DocSHRNet Method" (PRCV 2025), and the reference implementation DocSHRNet is available at 👉 https://github.com/shallweiwei/DocSHRNet.


📥 Download

The dataset is available via the following links:


📌 Key Features

  • 2,201 rigorously aligned highlight vs. highlight-free image pairs
  • Average resolution of 2924 × 3672 (range: 1034 × 737 – 3468 × 4624)
  • Covers books, magazines, multilingual text, and graphical content
  • Captures real-world variation in document pose and illumination across three camera devices
  • Combines polarization imaging with manual quality verification for reliable ground truth
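
Because the dataset is organized as aligned highlight and highlight-free pairs, a loader typically needs to match each highlight image to its ground truth. The sketch below shows one way to do that by file stem. The folder names `highlight` and `highlight_free`, the file extensions, and the function `list_pairs` are all assumptions made for illustration; this README does not document the on-disk layout, so adjust them to the actual download.

```python
from pathlib import Path

# Hypothetical layout, NOT confirmed by this README:
#   <root>/highlight/<name>.<ext>        image with specular highlights
#   <root>/highlight_free/<name>.<ext>   aligned highlight-free ground truth
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def list_pairs(root, highlight_dir="highlight", clean_dir="highlight_free"):
    """Return (highlight_path, clean_path) tuples matched by file stem."""
    root = Path(root)
    # Index the ground-truth images by stem so the lookup is O(1) per file.
    clean_by_stem = {
        p.stem: p
        for p in (root / clean_dir).iterdir()
        if p.suffix.lower() in IMAGE_SUFFIXES
    }
    pairs = []
    # Sort for a deterministic order; skip images without a matching partner.
    for p in sorted((root / highlight_dir).iterdir()):
        if p.suffix.lower() in IMAGE_SUFFIXES and p.stem in clean_by_stem:
            pairs.append((p, clean_by_stem[p.stem]))
    return pairs
```

If the assumed layout matched the real one, `len(list_pairs(root))` over the full dataset would be expected to equal the 2,201 pairs listed above; a smaller count would point to a naming mismatch or an incomplete download.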

📝 Usage Notes


📚 Citation

If you find this dataset useful in your research, please consider citing our paper.
