Abstract
We present a system that recognizes tables in archival documents. Much work has been carried out on table recognition, but very little on tables in historical documents. These are difficult to analyze because they are often damaged by age and by the conditions in which they were kept. We therefore have to introduce knowledge to compensate for missing information and noise in these documents. Because there are very large numbers of documents of the same type, the cost of introducing this explicit knowledge is not significant. We also want to minimize the cost of adapting the system to a given document type. The precision of the knowledge the user must provide depends on the quality of the document: the more damaged the document, the more precise the specification has to be. In this article we show how minimal external knowledge can be sufficient for an efficient recognition system for tables in archival documents.
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
Cite this paper
Martinat, I., Coüasnon, B. (2006). A Minimal and Sufficient Way of Introducing External Knowledge for Table Recognition in Archival Documents. In: Liu, W., Lladós, J. (eds) Graphics Recognition. Ten Years Review and Future Perspectives. GREC 2005. Lecture Notes in Computer Science, vol 3926. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11767978_19
DOI: https://doi.org/10.1007/11767978_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-34711-8
Online ISBN: 978-3-540-34712-5
eBook Packages: Computer Science (R0)