Huizhou University, University of Macau, Shanghai Jiao Tong University, SIAT CAS, Shenzhen Polytechnic University
In IEEE/CVF Winter Conference on Applications of Computer Vision 2025 (WACV 2025)
StainDoc is the first large-scale high-resolution dataset that includes ground truth data specifically for the task of document stain removal.
StainDoc_mark and StainDoc_seal were generated following the process described in DocDiff.
Download the dataset first, then set TRAIN_DIR, VAL_DIR and SAVE_DIR in the TRAINING section of config.yml.
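Before launching a run, it can help to confirm that the configured paths are usable. The following is a minimal sketch, not part of this repository: it assumes the relevant section of config.yml has already been loaded into a plain dict (for example with PyYAML's yaml.safe_load), and the helper name find_path_problems and the example paths are hypothetical. It also assumes SAVE_DIR may be created later, so only the two input folders are checked for existence.

```python
import os

# Assumption: these key names mirror the ones mentioned above;
# the real config.yml may contain additional settings.
REQUIRED_KEYS = ("TRAIN_DIR", "VAL_DIR", "SAVE_DIR")


def find_path_problems(section):
    """Return a list of human-readable problems for one config section (a dict)."""
    problems = []
    for key in REQUIRED_KEYS:
        value = section.get(key)
        if not value:
            problems.append(f"{key} is not set")
        elif key != "SAVE_DIR" and not os.path.isdir(value):
            # Input folders must already exist; SAVE_DIR is assumed to be creatable later.
            problems.append(f"{key} does not point to an existing directory: {value}")
    return problems


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    example = {"TRAIN_DIR": "./data/train", "VAL_DIR": "./data/val", "SAVE_DIR": "./runs"}
    for problem in find_path_problems(example):
        print(problem)
```

An empty list means all three entries are set and both input folders exist; anything else is worth fixing before starting a long training job.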
For single-GPU training:
python train.py
For multi-GPU training:
accelerate config
accelerate launch train.py
If you have difficulty using accelerate, please refer to Accelerate.
First set TRAIN_DIR, VAL_DIR and SAVE_DIR in the TESTING section of config.yml.
python infer.py
@inproceedings{li2025high,
title={High-fidelity document stain removal via a large-scale real-world dataset and a memory-augmented transformer},
author={Li, Mingxian and Sun, Hao and Lei, Yingtie and Zhang, Xiaofeng and Dong, Yihang and Zhou, Yilin and Li, Zimeng and Chen, Xuhang},
booktitle={2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
pages={7614--7624},
year={2025},
organization={IEEE}
}