OCR-related works published in top conferences and journals.
-
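Frontiers of Intelligent Document Analysis and Recognition: Review and Prospects, Cheng-Lin Liu, Lianwen Jin, Xiang Bai, Xiaohui Li, Fei Yin, Journal of Image and Graphics (China) [Paper]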
-
Visual Text Meets Low-level Vision: A Comprehensive Survey on Visual Text Processing, Yan Shu, Weichao Zeng, Zhenhang Li, Fangmin Zhao, Yu Zhou [Paper]
-
[arXiv:2407.19889] Self-Supervised Learning for Text Recognition: A Critical Survey, Carlos Penarrubia, Jose J. Valero-Mas, Jorge Calvo-Zaragoza [Paper]
-
[arXiv:2506.07112] EdgeSpotter: Multi-Scale Dense Text Spotting for Industrial Panel Monitoring, Changhong Fu, Hua Lin, Haobo Zuo, Liangliang Yao, Liguo Zhang [Paper]
-
Arbitrary Reading Order Scene Text Spotter with Local Semantics Guidance, Jiahao Lyu, Wei Wang, Dongbao Yang, Jinwen Zhong, Yu Zhou [Paper]
-
OCRBench v2: An Improved Benchmark for Evaluating Large Multimodal Models on Visual Text Localization and Reasoning, Ling Fu, Biao Yang, Zhebin Kuang, Jiajun Song, Yuzhe Li, Linghao Zhu, Qidi Luo, Xinyu Wang, Hao Lu, Mingxin Huang, Zhang Li, Guozhi Tang, Bin Shan, Chunhui Lin, Qi Liu, Binghong Wu, Hao Feng, Hao Liu, Can Huang, Jingqun Tang, Wei Chen, Lianwen Jin, Yuliang Liu, Xiang Bai [Paper] [Project Page]
-
[arXiv:2409.01704] General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model, Haoran Wei, Chenglong Liu, Jinyue Chen [Paper] [Code]
-
[IJCAI 2024] Focus on the Whole Character: Discriminative Character Modeling for Scene Text Recognition, arXiv:2407.05562, Bangbang Zhou et al. [Paper] [Code]
-
[ECCV 2024] WAS: Dataset and Methods for Artistic Text Segmentation, Xudong Xie, Yuzhe Li, Yang Liu, Zhifei Zhang, Zhaowen Wang, Wei Xiong, Xiang Bai [Paper] [Code]
-
[arXiv:2407.17020] EAFormer: Scene Text Segmentation with Edge-Aware Transformers, Haiyang Yu, Teng Fu, Bin Li, Xiangyang Xue [Paper]
-
[arXiv:2406.19101] DocKylin: A Large Multimodal Model for Visual Document Understanding with Efficient Visual Slimming, Jiaxin Zhang, Wentao Yang, Songxuan Lai, Zecheng Xie, Lianwen Jin [Paper]
-
[IJCAI 2024] Self-Supervised Pre-training with Symmetric Superimposition Modeling for Scene Text Recognition, arXiv:2405.05841, Zuan Gao, Yuxin Wang, Yadong Qu, Boqiang Zhang, Zixiao Wang, Jianjun Xu, Hongtao Xie [Paper]
-
Visually Guided Generative Text-Layout Pre-training for Document Intelligence, NAACL 2024, Zhiming Mao, Haoli Bai, Lu Hou, Jiansheng Wei, Xin Jiang, Qun Liu, Kam-Fai Wong [NAACL 2024]
-
TextMonkey: An OCR-Free Large Multimodal Model for Understanding Document, Yuliang Liu, Biao Yang, Qiang Liu, Zhang Li, Zhiyin Ma, Shuo Zhang, Xiang Bai [Paper] [Code]
-
Efficiently Leveraging Linguistic Priors for Scene Text Spotting, Nguyen Nguyen, Yapeng Tian, Chenliang Xu [Paper]
-
Enhancing Visual Document Understanding with Contrastive Learning in Large Visual-Language Models, Xin Li, Yunfei Wu, Xinghua Jiang, Zhihao Guo, Mingming Gong, Haoyu Cao, Yinsong Liu, Deqiang Jiang, Xing Sun [Paper]
-
[PR-2024] Class-Aware Mask-Guided Feature Refinement for Scene Text Recognition, Mingkun Yang, Biao Yang, Minghui Liao, Yingying Zhu, Xiang Bai [Paper] [Code]
-
STEP - Towards Structured Scene-Text Spotting, Sergi Garcia-Bordils, Dimosthenis Karatzas, Marçal Rusiñol [Paper] [Code]
- Self-supervised Implicit Glyph Attention for Text Recognition, Tongkun Guan, Chaochen Gu, Jingzheng Tu, Xue Yang, Qi Feng, Yudi Zhao, Wei Shen [Paper] [Code]
- Self-supervised Character-to-Character Distillation for Text Recognition, Tongkun Guan, Wei Shen, Xue Yang, Qi Feng, Zekun Jiang, Xiaokang Yang [Paper] [Code]
-
Relational Contrastive Learning for Scene Text Recognition, Jinglei Zhang, Tiancheng Lin, Yi Xu, Kai Chen, Rui Zhang [Paper] [Code]
-
Context Perception Parallel Decoder for Scene Text Recognition, Yongkun Du, Zhineng Chen, Caiyan Jia, Xiaoting Yin, Chenxia Li, Yuning Du, Yu-Gang Jiang [Paper] [Code]
-
Revisiting Scene Text Recognition: A Data Perspective, Qing Jiang, Jiapeng Wang, Dezhi Peng, Chongyu Liu, Lianwen Jin [Paper] [Code]
-
Enhancing Table Recognition with Vision LLMs: A Benchmark and Neighbor-Guided Toolchain Reasoner, Yitong Zhou, Mingyue Cheng, Qingyang Mao, Qi Liu, Feiyang Xu, Xin Li, Enhong Chen [Paper]