Deep Learning Based Document Layout Analysis on Historical Documents

Ravichandra, Sriram; Siva Sathya, S.; Lourdu Marie Sophie, S.

doi:10.1007/978-981-19-1018-0_23

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 427)

Abstract

Mining historical documents provides important information that helps scientists and researchers gain new insights into their domains. Document Layout Analysis is the first step in automatically capturing information from documents. Layout recognition helps establish relationships between components in the document, which in turn allows information to be extracted. This paper presents a method to visually segment critical regions of historical handwritten documents using an object detection technique augmented with contextual features. Layout analysis of historical documents is complicated by complex document layouts and by age-related degradation. The object detection technique YOLOv3 is adopted for document layout analysis: features are extracted at different levels by down-sampling the input and are combined using a Feature Pyramid Network, which helps detect smaller regions and improves region detection performance. Experimental results on two distinct datasets demonstrate the method’s potential and show competitive results with respect to state-of-the-art approaches.
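
The multi-level feature extraction described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`downsample`, `upsample`, `pyramid_features`) are hypothetical, average pooling stands in for the strided convolution stages of a real backbone, and the top-down merge uses plain element-wise addition in the style of a Feature Pyramid Network with the 1×1 lateral convolutions omitted. YOLOv3 itself merges scales by concatenation, so treat this only as a picture of the general idea.

```python
import numpy as np

def downsample(x, factor=2):
    """Average-pool a (C, H, W) map by `factor`.

    Assumption: stands in for a strided convolution stage; H and W
    must be divisible by `factor`.
    """
    c, h, w = x.shape
    return x.reshape(c, h // factor, factor, w // factor, factor).mean(axis=(2, 4))

def upsample(x, factor=2):
    """Nearest-neighbour upsampling of a (C, H, W) map."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

def pyramid_features(image, levels=3):
    """Return merged feature maps, finest resolution first.

    Bottom-up: repeatedly down-sample the input.
    Top-down: start from the coarsest map and add its upsampled
    version to each finer map (simplified FPN-style merge).
    """
    bottom_up = [image]
    for _ in range(levels - 1):
        bottom_up.append(downsample(bottom_up[-1]))

    merged = [bottom_up[-1]]  # coarsest level is used as-is
    for fmap in reversed(bottom_up[:-1]):
        merged.append(fmap + upsample(merged[-1]))
    return merged[::-1]

if __name__ == "__main__":
    page = np.ones((4, 32, 32))  # hypothetical 4-channel page features
    for level in pyramid_features(page, levels=3):
        print(level.shape)
```

The finest merged map keeps the full spatial resolution while carrying context from the coarser levels, which is the property the abstract relies on for detecting smaller regions.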


Author information

Authors and Affiliations

Department of Computer Science, Pondicherry University, Kalapet, India
Sriram Ravichandra, S. Siva Sathya & S. Lourdu Marie Sophie

Corresponding author

Correspondence to Sriram Ravichandra.

Editor information

Editors and Affiliations

Department of Computer Science and Engineering, National Institute of Technology, Warangal, India
Rashmi Ranjan Rout
Department of Computer Science and Engineering, Indian Institute of Technology, Kharagpur, India
Soumya Kanti Ghosh
Department of Computer Science and Engineering, Indian Institute of Technology (ISM) Dhanbad, Dhanbad, India
Prasanta K. Jana
School of IT and Engineering (SITE), Vellore Institute of Technology, Vellore, India
Asis Kumar Tripathy
Department of Computer Science and Information Technology, Institute of Technical Education and Research (ITER), Siksha ‘O’ Anusandhan Deemed to be University, Bhubaneswar, India
Jyoti Prakash Sahoo
Department of Computer Science and Information Engineering (CSIE), Providence University, Taichung, Taiwan
Kuan-Ching Li

About this paper

Cite this paper

Ravichandra, S., Siva Sathya, S., Lourdu Marie Sophie, S. (2022). Deep Learning Based Document Layout Analysis on Historical Documents. In: Rout, R.R., Ghosh, S.K., Jana, P.K., Tripathy, A.K., Sahoo, J.P., Li, KC. (eds) Advances in Distributed Computing and Machine Learning. Lecture Notes in Networks and Systems, vol 427. Springer, Singapore. https://doi.org/10.1007/978-981-19-1018-0_23

DOI: https://doi.org/10.1007/978-981-19-1018-0_23
Published: 28 July 2022
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-1017-3
Online ISBN: 978-981-19-1018-0
eBook Packages: Intelligent Technologies and Robotics (R0)
