Unofficial implementation of DocMAE: Document Image Rectification via Self-supervised Representation Learning
https://arxiv.org/abs/2304.10341
- Document background segmentation network using U²-Net
- Synthetic data generation for self-supervised pre-training
- Pre-training
- Fine-tuning for document rectification (In progress)
- Evaluation
- Code clean up and documentation
- Model release
A Jupyter notebook demonstrating background segmentation is available at demo/background_segmentation.ipynb
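For orientation, here is a minimal sketch of the segmentation step, assuming the reference U²-Net implementation (`model.U2NET` from https://github.com/xuebinqin/U-2-Net) and a local checkpoint; the notebook is the authoritative version, and details here may differ from this repo's code:

```python
# Minimal background-segmentation sketch. Assumes the reference U^2-Net
# implementation and a local checkpoint; see demo/background_segmentation.ipynb
# for the real pipeline.
import torch
import torch.nn.functional as F
from PIL import Image
from torchvision import transforms

from model import U2NET  # import path from the U^2-Net repo (assumption)

def segment_document(image_path: str, checkpoint: str = "u2net.pth") -> torch.Tensor:
    """Return a soft foreground mask (H, W) in [0, 1] for the document."""
    net = U2NET(3, 1)
    net.load_state_dict(torch.load(checkpoint, map_location="cpu"))
    net.eval()

    image = Image.open(image_path).convert("RGB")
    tf = transforms.Compose([
        transforms.Resize((320, 320)),  # U^2-Net's usual input size
        transforms.ToTensor(),
    ])
    x = tf(image).unsqueeze(0)

    with torch.no_grad():
        d0, *_ = net(x)  # first side output is the fused prediction

    mask = d0[:, 0]  # (1, 320, 320); sigmoid is applied inside the reference model
    mask = (mask - mask.min()) / (mask.max() - mask.min() + 1e-8)
    # Resize back to the original (H, W) resolution
    mask = F.interpolate(mask.unsqueeze(0), size=image.size[::-1], mode="bilinear")[0, 0]
    return mask
```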
- 3,411,482 pages from ~1M documents in the DocILE dataset (https://github.com/rossumai/docile)
- Rendered with the doc3D renderer (https://github.com/Dawars/doc3D-renderer)
- 558 HDR environment lighting maps from https://hdri-haven.com/
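The on-disk layout produced by the renderer is not documented here; the loader below is a hypothetical sketch that assumes paired RGB renders and backward maps in an `img/` + `bm/` layout similar to the original doc3D dataset, with backward maps stored as `.npy`:

```python
# Hypothetical loader for the rendered data. The img/ + bm/ layout, file
# naming, and .npy storage format are assumptions, not this repo's API.
from pathlib import Path

import numpy as np
import torch
from PIL import Image
from torch.utils.data import Dataset

class RenderedDocDataset(Dataset):
    def __init__(self, root: str):
        self.images = sorted(Path(root, "img").glob("*.png"))

    def __len__(self) -> int:
        return len(self.images)

    def __getitem__(self, idx: int):
        img_path = self.images[idx]
        image = torch.from_numpy(
            np.asarray(Image.open(img_path).convert("RGB"), dtype=np.float32) / 255.0
        ).permute(2, 0, 1)
        # Backward map: per-pixel (x, y) sampling coordinates into the flat
        # document, used as the rectification target during fine-tuning.
        bm_path = img_path.parent.parent / "bm" / (img_path.stem + ".npy")
        bm = np.load(bm_path)
        return image, torch.from_numpy(bm.astype(np.float32))
```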
Pre-training on 200k documents:
`python pretrain.py -c config/config.json`
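The real training loop lives in `pretrain.py` and is driven by `config/config.json`. As a rough sketch of what one MAE pre-training step involves, here is a minimal loop built on Hugging Face's ViTMAE (the implementation the visualization notebook below also relies on); the hyperparameters and dummy data are assumptions, not values from the config:

```python
# Sketch of MAE pre-training with Hugging Face's ViTMAE. Hyperparameters
# below are assumptions; the real settings live in config/config.json.
import torch
from torch.utils.data import DataLoader, TensorDataset
from transformers import ViTMAEConfig, ViTMAEForPreTraining

config = ViTMAEConfig(mask_ratio=0.75)  # assumed masking ratio
model = ViTMAEForPreTraining(config)
optimizer = torch.optim.AdamW(model.parameters(), lr=1.5e-4)

# Stand-in data: in the real script this would be the rendered documents.
dataset = TensorDataset(torch.rand(8, 3, 224, 224))
loader = DataLoader(dataset, batch_size=4, shuffle=True)

model.train()
for (images,) in loader:
    outputs = model(pixel_values=images)  # masks patches, reconstructs pixels
    outputs.loss.backward()               # MSE on the masked patches only
    optimizer.step()
    optimizer.zero_grad()
```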
Visualize the trained model with the ViT-MAE visualization demo: https://github.com/NielsRogge/Transformers-Tutorials/blob/master/ViTMAE/ViT_MAE_visualization_demo.ipynb
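Condensed from that notebook, the core of the visualization is a single forward pass followed by un-patchifying the decoder output (the checkpoint path below is a placeholder):

```python
# Reconstruct a masked input with a pre-trained MAE, following the linked
# ViT_MAE_visualization_demo notebook. Checkpoint path is a placeholder.
import torch
from transformers import ViTMAEForPreTraining

model = ViTMAEForPreTraining.from_pretrained("path/to/checkpoint")  # placeholder
model.eval()

pixel_values = torch.rand(1, 3, 224, 224)  # stand-in for a preprocessed page
with torch.no_grad():
    outputs = model(pixel_values=pixel_values)

reconstruction = model.unpatchify(outputs.logits)  # back to (1, 3, H, W) pixels
mask = outputs.mask  # 1 where a patch was masked, 0 where it was visible
```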
Test documents come from the DIR300 dataset (https://github.com/fh2019ustc/DocGeoNet).
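On DIR300, rectification quality is commonly reported as MS-SSIM and Local Distortion. As a rough sanity check while evaluation is in progress, MS-SSIM between a rectified page and its flat reference can be computed with the third-party `pytorch_msssim` package (an assumption; the eventual evaluation script may compute it differently):

```python
# Minimal MS-SSIM check between a rectified page and its ground-truth scan,
# using the third-party pytorch_msssim package (an assumption; this repo's
# evaluation code may differ). Local Distortion is not covered here.
import numpy as np
import torch
from PIL import Image
from pytorch_msssim import ms_ssim

def load_gray(path: str, size=(598, 688)) -> torch.Tensor:
    """Load an image as a (1, 1, H, W) grayscale tensor in [0, 255]."""
    img = Image.open(path).convert("L").resize(size)
    return torch.from_numpy(np.asarray(img, dtype=np.float32))[None, None]

rectified = load_gray("results/sample_rectified.png")  # placeholder paths
reference = load_gray("DIR300/gt/sample.png")
score = ms_ssim(rectified, reference, data_range=255)
print(f"MS-SSIM: {score.item():.4f}")
```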