Abstract
Developing a generic method for document recognition requires a generic approach to handling noise. Images contain a great deal of noise, and a recognizer must find the relevant information amid that noise in order to recognize anything. In this paper we describe the parser we developed for DMOS, a generic method for structured document recognition. The method uses EPF, a grammatical language for describing documents. From an EPF description, a new recognition system is built automatically by compilation. DMOS has been used successfully to recognize musical scores, mathematical formulae, table structures, and old forms (tested on 60,000 documents).
To illustrate how noise is handled and to show how easily a grammatical description can be defined in EPF, we present a real and complete grammar defined to detect tennis courts in videos. Although this application does not deal directly with documents, the tennis court is a good illustrative example and raises the same kinds of problems as those found in structured documents.
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Coüasnon, B. (2004). Dealing with Noise in DMOS, a Generic Method for Structured Document Recognition: An Example on a Complete Grammar. In: Lladós, J., Kwon, YB. (eds) Graphics Recognition. Recent Advances and Perspectives. GREC 2003. Lecture Notes in Computer Science, vol 3088. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-25977-0_4
DOI: https://doi.org/10.1007/978-3-540-25977-0_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22478-5
Online ISBN: 978-3-540-25977-0
eBook Packages: Springer Book Archive