Deep Learning Based Document Layout Analysis on Historical Documents

  • Conference paper
Advances in Distributed Computing and Machine Learning

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 427)

Abstract

Mining historical documents provides important information that helps scientists and researchers gain new insights into their domains. Document Layout Analysis is the first step in automatically capturing information from documents. Layout recognition helps establish relationships between the components of a document, which in turn allows information to be extracted. This paper presents a method to visually segment critical regions of historical handwritten documents using an object detection technique augmented with contextual features. Layout analysis of historical documents is complicated by complex document layouts and by age-related degradation. The object detection technique YOLOv3 is adapted for document layout analysis: features are extracted at different levels by down-sampling the input and are combined using a Feature Pyramid Network, which helps detect smaller regions and thereby improves region detection performance. Experimental results on two distinct datasets demonstrate the method's potential and show results that are competitive with state-of-the-art approaches.
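The multi-level feature idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: plain Python lists stand in for feature maps, 2x2 average pooling stands in for the learned down-sampling layers, and the function names (`downsample2x`, `upsample2x`, `build_pyramid`, `top_down_fuse`) are hypothetical. The fusion shown is the element-wise addition used in Feature Pyramid Networks, without the lateral convolutions; YOLOv3 itself concatenates the upsampled map with the finer one, so treat this as a sketch of the general pattern only.

```python
# Toy illustration only: nested lists act as single-channel feature maps.
# Real detectors use learned convolutions instead of pooling and repetition.


def downsample2x(fmap):
    """Halve height and width by averaging each 2x2 block."""
    h, w = len(fmap), len(fmap[0])
    return [
        [
            (fmap[i][j] + fmap[i][j + 1] + fmap[i + 1][j] + fmap[i + 1][j + 1]) / 4.0
            for j in range(0, w - 1, 2)
        ]
        for i in range(0, h - 1, 2)
    ]


def upsample2x(fmap):
    """Double height and width by repeating each value (nearest neighbour)."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def add_maps(a, b):
    """Element-wise sum of two equally sized maps."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def build_pyramid(image, levels=3):
    """Bottom-up pass: successively coarser maps, finest first."""
    pyramid = [image]
    for _ in range(levels - 1):
        pyramid.append(downsample2x(pyramid[-1]))
    return pyramid


def top_down_fuse(pyramid):
    """Top-down pass: upsample each coarser fused map and add it to the next finer level."""
    fused = [pyramid[-1]]
    for finer in reversed(pyramid[:-1]):
        fused.append(add_maps(finer, upsample2x(fused[-1])))
    return fused[::-1]  # finest first, matching the input order


if __name__ == "__main__":
    # Hypothetical 8x8 input; sizes must be divisible by 2 at every level.
    image = [[float(i * 8 + j) for j in range(8)] for i in range(8)]
    fused = top_down_fuse(build_pyramid(image, levels=3))
    print([len(level) for level in fused])  # side length of each fused map
```

After fusion, the finest map carries context from the coarser levels while keeping its resolution, which is the property that makes small regions easier to detect.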

Author information

Corresponding author

Correspondence to Sriram Ravichandra.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

Cite this paper

Ravichandra, S., Siva Sathya, S., Lourdu Marie Sophie, S. (2022). Deep Learning Based Document Layout Analysis on Historical Documents. In: Rout, R.R., Ghosh, S.K., Jana, P.K., Tripathy, A.K., Sahoo, J.P., Li, KC. (eds) Advances in Distributed Computing and Machine Learning. Lecture Notes in Networks and Systems, vol 427. Springer, Singapore. https://doi.org/10.1007/978-981-19-1018-0_23