Document image understanding refers to the logical and semantic analysis of document images in order to extract information understandable to humans and codify it into machine-readable form. Most studies on document image understanding have targeted the specific problem of associating layout components with logical labels, while less attention has been paid to extracting relationships between logical components, such as cross-references. In this chapter, we investigate the problem of detecting the reading order relationship between components of a logical structure. The domain-specific knowledge required for this task is automatically acquired from a set of training examples by applying a machine learning method. The input to the learning method is the description of “chains” of layout components defined by the user. The output is a logical theory which defines two predicates, first_to_read/1 and succ_in_reading/2, that are used to consistently reconstruct all chains in the training set. Only spatial information on the page layout is exploited for both single and multiple chain reconstruction. The proposed approach has been evaluated on a set of document images processed by the WISDOM++ system.
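To make the role of the two predicates concrete, the following is a minimal sketch of how reading chains could be rebuilt once first_to_read/1 and succ_in_reading/2 have been evaluated on a page. It assumes the predicates are already available as plain facts (a list of starting blocks and a list of successor pairs); in the chapter they are instead defined by a learned logical theory over spatial layout descriptions. The function name `reconstruct_chains` and the block identifiers are hypothetical and used only for illustration.

```python
from typing import Dict, Hashable, Iterable, List, Set, Tuple


def reconstruct_chains(
    first_to_read: Iterable[Hashable],
    succ_in_reading: Iterable[Tuple[Hashable, Hashable]],
) -> List[List[Hashable]]:
    """Rebuild reading chains from facts for the two predicates.

    Assumption: each block has at most one successor and chains are acyclic;
    facts that violate either condition raise ValueError.
    """
    # Index the succ_in_reading/2 facts as block -> next block.
    successor: Dict[Hashable, Hashable] = {}
    for current, following in succ_in_reading:
        if current in successor and successor[current] != following:
            raise ValueError(f"block {current!r} has more than one successor")
        successor[current] = following

    chains: List[List[Hashable]] = []
    # Each first_to_read/1 fact starts one chain (single or multiple chains).
    for start in first_to_read:
        chain = [start]
        seen: Set[Hashable] = {start}
        block = start
        while block in successor:
            block = successor[block]
            if block in seen:
                raise ValueError(f"cycle detected at block {block!r}")
            seen.add(block)
            chain.append(block)
        chains.append(chain)
    return chains


if __name__ == "__main__":
    # Hypothetical page with two independent chains.
    firsts = ["title", "figure"]
    pairs = [("title", "authors"), ("authors", "abstract"), ("figure", "caption")]
    print(reconstruct_chains(firsts, pairs))
```

Rejecting blocks with several successors and cyclic facts is a design choice of this sketch, made so that every reconstructed chain is a simple sequence; it is not taken from the chapter.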
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Malerba, D., Ceci, M., Berardi, M. (2008). Machine Learning for Reading Order Detection in Document Image Understanding. In: Marinai, S., Fujisawa, H. (eds) Machine Learning in Document Analysis and Recognition. Studies in Computational Intelligence, vol 90. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-76280-5_3
DOI: https://doi.org/10.1007/978-3-540-76280-5_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-76279-9
Online ISBN: 978-3-540-76280-5
eBook Packages: Engineering, Engineering (R0)