HRCenterNet (Official PyTorch Implementation)

Chinese Character Detection in Historical Documents

HRCenterNet: An Anchorless Approach to Chinese Character Segmentation in Historical Documents
Chia-Wei Tang, Chao-Lin Liu, Po-Sen Chiu
IEEE Big Data 2020 Workshops, Computational Archival Science: digital records in the age of big data
IEEE Xplore (10.1109/BigData50022.2020.9378051)
arXiv technical report (arXiv 2012.05739)

Contact: [email protected]. Questions and discussion are welcome!

Installation

git clone https://github.com/Tverous/HRCenterNet.git
cd HRCenterNet/
pip install -r requirements.txt
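
After installing, it can help to confirm that PyTorch is visible to the interpreter before training. The snippet below is a minimal sketch, not part of this repository; it assumes that requirements.txt installs PyTorch under the module name `torch`, and the helper name `is_installed` is hypothetical.

```python
# Sanity check after `pip install -r requirements.txt`.
# Assumption: PyTorch is installed under the module name `torch`.
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    if is_installed("torch"):
        import torch

        # Report the installed version and whether a CUDA device is usable.
        print("PyTorch", torch.__version__, "| CUDA available:", torch.cuda.is_available())
    else:
        print("PyTorch not found; re-run the pip install step.")
```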

Download the dataset and the pretrained weights

Google Drive

OneDrive

How to use

Training:

python train.py --train_csv_path data/train.csv --train_data_dir data/images \
                --val_csv_path data/val.csv --val_data_dir data/images/ --val \
                --batch_size 8 --epoch 80
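
The command above expects separate training and validation CSV files. If you only have a single annotation file, the following sketch shows one way to split it. It is not part of this repository: the function name `split_csv`, the 10% validation fraction, and the file names in the usage note are illustrative assumptions, and it assumes the first line is a header and that each remaining line describes exactly one image.

```python
import random
from pathlib import Path


def split_csv(src, train_dst, val_dst, val_fraction=0.1, seed=0):
    """Split an annotation file into train/val files, repeating the header in both.

    Assumes line 1 is the header and every other non-empty line is one image.
    Returns (number of training rows, number of validation rows).
    """
    lines = Path(src).read_text(encoding="utf-8").splitlines()
    header = lines[0]
    rows = [ln for ln in lines[1:] if ln.strip()]  # drop blank lines

    # Shuffle with a fixed seed so the split is reproducible.
    random.Random(seed).shuffle(rows)

    n_val = max(1, int(len(rows) * val_fraction))  # keep at least one validation row
    val_rows, train_rows = rows[:n_val], rows[n_val:]

    Path(train_dst).write_text("\n".join([header] + train_rows) + "\n", encoding="utf-8")
    Path(val_dst).write_text("\n".join([header] + val_rows) + "\n", encoding="utf-8")
    return len(train_rows), len(val_rows)
```

For example, `split_csv("data/all.csv", "data/train.csv", "data/val.csv")` would write the two files passed to `--train_csv_path` and `--val_csv_path`; `data/all.csv` is a hypothetical file name.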

Evaluation:

python evaluate.py --csv_path data/val.csv --data_dir data/images/ --log_dir weights/HRCenterNet.pth.tar

Test with unseen images:

python test.py --data_dir /path/to/images --log_dir /path/to/pretrained --output_dir /path/to/save/outputs

Training on Your Own Dataset

Prepare your CSV files in the following format:

image_id              labels
file_name_1           obj_id_1 topleft_x topleft_y width height obj_id_2 topleft_x topleft_y width height ...
file_name_2           obj_id_1 topleft_x topleft_y width height obj_id_2 topleft_x topleft_y width height ...
    .                 .
    .                 .
    .                 .
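
Each `labels` entry in the format above is a flat sequence of five values per object: an object id followed by the top-left corner, width, and height of its box. The sketch below shows one way to turn such a string into structured boxes, for example to validate a CSV before training. It is an illustration rather than the repository's own loader: the names `Box` and `parse_labels` are hypothetical, and it assumes the values are separated by whitespace and that ids can be kept as strings.

```python
from typing import List, NamedTuple


class Box(NamedTuple):
    obj_id: str    # object id, kept as a string (assumption)
    x: float       # top-left x
    y: float       # top-left y
    width: float
    height: float


def parse_labels(labels: str) -> List[Box]:
    """Parse 'obj_id x y w h obj_id x y w h ...' into a list of Box tuples."""
    tokens = labels.split()
    if len(tokens) % 5 != 0:
        raise ValueError(
            f"expected groups of 5 values per object, got {len(tokens)} tokens"
        )
    boxes = []
    for i in range(0, len(tokens), 5):
        obj_id = tokens[i]
        x, y, w, h = (float(t) for t in tokens[i + 1 : i + 5])
        boxes.append(Box(obj_id, x, y, w, h))
    return boxes
```

A row whose token count is not a multiple of five raises a `ValueError`, which catches truncated or mis-pasted annotations early.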

Results

Citation

Use this BibTeX entry to cite this repository:

@INPROCEEDINGS{
  9378051,  
  author={C. -W. {Tang} and C. -L. {Liu} and P. -S. {Chiu}},  
  booktitle={2020 IEEE International Conference on Big Data (Big Data)},   
  title={HRCenterNet: An Anchorless Approach to Chinese Character Segmentation in Historical Documents},   
  year={2020},  
  volume={},  
  number={},  
  pages={1924-1930},  
  doi={10.1109/BigData50022.2020.9378051}
}

