Machine Learning for Reading Order Detection in Document Image Understanding

Malerba, Donato; Ceci, Michelangelo; Berardi, Margherita

doi:10.1007/978-3-540-76280-5_3

Part of the book series: Studies in Computational Intelligence (SCI, volume 90)

Document image understanding refers to the logical and semantic analysis of document images in order to extract information understandable to humans and codify it into machine-readable form. Most studies on document image understanding have targeted the specific problem of associating layout components with logical labels, while less attention has been paid to the problem of extracting relationships between logical components, such as cross-references. In this chapter, we investigate the problem of detecting the reading order relationship between components of a logical structure. The domain-specific knowledge required for this task is automatically acquired from a set of training examples by applying a machine learning method. The input to the learning method is the description of “chains” of layout components defined by the user. The output is a logical theory which defines two predicates, first_to_read/1 and succ_in_reading/2, useful for consistently reconstructing all chains in the training set. Only spatial information on the page layout is exploited for both single and multiple chain reconstruction. The proposed approach has been evaluated on a set of document images processed by the WISDOM++ system.
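
To make the role of the two predicates concrete, the following is a minimal sketch of how chains might be rebuilt once facts for first_to_read/1 and succ_in_reading/2 are available. It is not the chapter's algorithm: the function name `reconstruct_chains`, the component names, and the assumption that each component has at most one successor are all hypothetical and chosen for illustration only.

```python
from typing import Dict, Iterable, List, Set, Tuple


def reconstruct_chains(
    first_to_read: Iterable[str],
    succ_in_reading: Iterable[Tuple[str, str]],
) -> List[List[str]]:
    """Follow successor links from each starting component to rebuild chains."""
    # Map each component to its successor. Assumption of this sketch: at most
    # one successor per component, which a learned theory may not guarantee.
    successor: Dict[str, str] = {}
    for a, b in succ_in_reading:
        if a in successor and successor[a] != b:
            raise ValueError(f"component {a!r} has more than one successor")
        successor[a] = b

    chains: List[List[str]] = []
    for start in first_to_read:
        chain = [start]
        seen: Set[str] = {start}
        current = start
        while current in successor:
            current = successor[current]
            if current in seen:  # guard against cycles in the given facts
                raise ValueError(f"cycle detected at component {current!r}")
            seen.add(current)
            chain.append(current)
        chains.append(chain)
    return chains


# Hypothetical layout components on one page (names are illustrative only).
first = ["title"]
succ = [("title", "authors"), ("authors", "abstract"), ("abstract", "body_col1")]
print(reconstruct_chains(first, succ))
```

Supplying several starting components yields one chain per start, which corresponds to the multiple chain case mentioned above; the explicit checks for conflicting successors and cycles are a design choice of this sketch, so that inconsistent facts fail loudly instead of producing a misleading order.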

Author information

Authors and Affiliations

Dipartimento di Informatica, Università degli Studi di Bari, via Orabona 4, 70126, Bari, Italy
Donato Malerba, Michelangelo Ceci & Margherita Berardi

Editor information

Editors and Affiliations

Dipartimento di Sistemi e Informatica, University of Florence, Via S. Marta, 3, 50139, Firenze, Italy
Simone Marinai
Hitachi Central Research Laboratory, 1-280, Higashi-Koigakubo, Kokubunji-shi, Tokyo, 185-8601, Japan
Hiromichi Fujisawa

About this chapter

Cite this chapter

Malerba, D., Ceci, M., Berardi, M. (2008). Machine Learning for Reading Order Detection in Document Image Understanding. In: Marinai, S., Fujisawa, H. (eds) Machine Learning in Document Analysis and Recognition. Studies in Computational Intelligence, vol 90. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-76280-5_3

DOI: https://doi.org/10.1007/978-3-540-76280-5_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-76279-9
Online ISBN: 978-3-540-76280-5
eBook Packages: Engineering, Engineering (R0)
