Abstract
Mining historical documents provides important information that helps scientists and researchers gain new insights into their domains. Document layout analysis is the first step in automatically capturing information from documents: recognizing the layout establishes relationships between the components of a document, which in turn enables information extraction. This paper presents a method to visually segment critical regions of historical handwritten documents using an object-detection technique augmented with contextual features. Layout analysis of historical documents is complicated by complex layouts and by age-related degradation. The object-detection technique YOLOv3 is adopted for document layout analysis: features are extracted at different levels by down-sampling the input and are combined using a Feature Pyramid Network, so that smaller regions are detected and region detection performance improves. Experimental results on two distinct datasets demonstrate the method's potential and show results competitive with state-of-the-art approaches.
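The multi-scale idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the stock YOLOv3 settings (a 416x416 input, detection strides of 32, 16 and 8, and three anchor boxes per grid cell), none of which the abstract confirms for this paper. The helper names `grid_sizes`, `upsample2x`, `merge_top_down` and `total_boxes` are hypothetical, and the merge uses the element-wise addition of the original Feature Pyramid Network on single-channel maps, whereas YOLOv3 itself concatenates channels.

```python
from typing import List, Sequence

Grid = List[List[float]]


def grid_sizes(input_size: int, strides: Sequence[int] = (32, 16, 8)) -> List[int]:
    """Side length of the prediction grid at each detection scale.

    The strides are the stock YOLOv3 values (an assumption here).
    """
    return [input_size // s for s in strides]


def upsample2x(fmap: Grid) -> Grid:
    """Nearest-neighbour 2x upsampling of a single-channel feature map."""
    out: Grid = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat each column twice
        out.append(wide)
        out.append(list(wide))                     # repeat each row twice
    return out


def merge_top_down(coarse: Grid, fine: Grid) -> Grid:
    """FPN-style merge: upsample the coarse map and add it to the fine map."""
    up = upsample2x(coarse)
    if len(up) != len(fine) or len(up[0]) != len(fine[0]):
        raise ValueError("fine map must be exactly twice the size of the coarse map")
    return [[u + f for u, f in zip(u_row, f_row)] for u_row, f_row in zip(up, fine)]


def total_boxes(input_size: int, anchors_per_cell: int = 3) -> int:
    """Number of candidate boxes predicted across all three scales."""
    return sum(g * g for g in grid_sizes(input_size)) * anchors_per_cell


if __name__ == "__main__":
    print(grid_sizes(416))   # grid side per scale, coarse to fine
    print(total_boxes(416))  # candidate boxes before non-maximum suppression
    print(merge_top_down([[1.0]], [[0.5, 0.5], [0.5, 0.5]]))
```

Under these assumptions the finest grid has 52x52 cells, each covering an 8x8-pixel patch of the input. This is why combining the coarse, context-rich maps with finer ones helps to detect small regions such as marginal notes.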
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Ravichandra, S., Siva Sathya, S., Lourdu Marie Sophie, S. (2022). Deep Learning Based Document Layout Analysis on Historical Documents. In: Rout, R.R., Ghosh, S.K., Jana, P.K., Tripathy, A.K., Sahoo, J.P., Li, KC. (eds) Advances in Distributed Computing and Machine Learning. Lecture Notes in Networks and Systems, vol 427. Springer, Singapore. https://doi.org/10.1007/978-981-19-1018-0_23
DOI: https://doi.org/10.1007/978-981-19-1018-0_23
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-1017-3
Online ISBN: 978-981-19-1018-0
eBook Packages: Intelligent Technologies and Robotics (R0)