Matching document images with ground truth

Hobby, John D.

doi:10.1007/s100320050006

Original paper
Published: February 1998

Volume 1, pages 52–61, (1998)

International Journal on Document Analysis and Recognition


Abstract.

Since optical character recognition systems often require very large amounts of training data for optimum performance, it is important to automate the process of finding ground truth character identities for document images. This is done by finding a transformation that matches a scanned image to the machine-readable document description that was used to print the original. Rather than depend on finding feature points, a more robust procedure is to follow up by using an optimization algorithm to refine the transformation. The function to optimize can be based on the character bounding boxes – it is not necessary to have access to the actual character shapes used when printing the original.
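As a rough illustration of the refinement step described in the abstract, the sketch below starts from an approximate transformation and improves it with a simplified Nelder-Mead search (the algorithm named in the key words). Everything specific here is an assumption made for illustration and is not taken from the paper: the four-parameter similarity transform, the synthetic boxes, the starting estimate, and in particular the cost function, which sums the squared distance from each observed component center to the nearest transformed ground-truth box center. Only bounding boxes are used, never character shapes.

```python
import math

def transform(params, point):
    """Apply scale s, rotation theta, and translation (tx, ty) to a point."""
    s, theta, tx, ty = params
    x, y = point
    c, sn = math.cos(theta), math.sin(theta)
    return (s * (c * x - sn * y) + tx, s * (sn * x + c * y) + ty)

def center(box):
    """Center of a bounding box given as (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = box
    return ((x0 + x1) / 2.0, (y0 + y1) / 2.0)

def mismatch(params, truth_boxes, observed_centers):
    """Assumed cost: squared distance from each observed center to the
    nearest transformed ground-truth box center (no correspondences needed)."""
    mapped = [transform(params, center(b)) for b in truth_boxes]
    total = 0.0
    for ox, oy in observed_centers:
        total += min((ox - mx) ** 2 + (oy - my) ** 2 for mx, my in mapped)
    return total

def nelder_mead(f, x0, steps, iterations=400):
    """Simplified Nelder-Mead: reflect, expand, contract inside, or shrink."""
    n = len(x0)
    simplex = [list(x0)]
    for i in range(n):
        vertex = list(x0)
        vertex[i] += steps[i]
        simplex.append(vertex)
    values = [f(v) for v in simplex]
    for _ in range(iterations):
        # Sort vertices from best (lowest cost) to worst.
        order = sorted(range(n + 1), key=lambda i: values[i])
        simplex = [simplex[i] for i in order]
        values = [values[i] for i in order]
        worst = simplex[-1]
        centroid = [sum(v[j] for v in simplex[:-1]) / n for j in range(n)]
        reflected = [c + (c - w) for c, w in zip(centroid, worst)]
        fr = f(reflected)
        if fr < values[0]:
            expanded = [c + 2.0 * (c - w) for c, w in zip(centroid, worst)]
            fe = f(expanded)
            if fe < fr:
                simplex[-1], values[-1] = expanded, fe
            else:
                simplex[-1], values[-1] = reflected, fr
        elif fr < values[-2]:
            simplex[-1], values[-1] = reflected, fr
        else:
            contracted = [c + 0.5 * (w - c) for c, w in zip(centroid, worst)]
            fc = f(contracted)
            if fc < values[-1]:
                simplex[-1], values[-1] = contracted, fc
            else:
                # Shrink every vertex halfway toward the best one.
                best = simplex[0]
                simplex = [best] + [[b + 0.5 * (x - b) for b, x in zip(best, v)]
                                    for v in simplex[1:]]
                values = [values[0]] + [f(v) for v in simplex[1:]]
    best_index = min(range(n + 1), key=lambda i: values[i])
    return simplex[best_index], values[best_index]

# Synthetic ground truth: 3 text lines of 5 character boxes (document units).
truth_boxes = [(12 * i, 20 * j, 12 * i + 10, 20 * j + 14)
               for j in range(3) for i in range(5)]
# Hypothetical "scan": the same boxes under an unknown similarity transform.
true_params = (2.0, 0.02, 15.0, -8.0)
observed_centers = [transform(true_params, center(b)) for b in truth_boxes]

# Rough starting estimate, standing in for one obtained from feature points.
initial = [1.95, 0.0, 13.0, -6.0]
cost = lambda p: mismatch(p, truth_boxes, observed_centers)
initial_cost = cost(initial)
refined, refined_cost = nelder_mead(cost, initial, steps=[0.05, 0.02, 2.0, 2.0])
print("cost before:", initial_cost, "after:", refined_cost)
```

A derivative-free method suits this kind of objective because a nearest-box cost is only piecewise smooth. With real scans the cost would be evaluated on measured connected components or pixels rather than on synthetic centers, and the transformation model may need more parameters than the four assumed here.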


Author information

Authors and Affiliations

Bell Laboratories, Lucent Technologies, Murray Hill, NJ, USA; e-mail: hobby@bell-labs.com
John D. Hobby


Additional information

Received: 25 June 1997 / Revised: 20 August 1997


About this article

Cite this article

Hobby, J. Matching document images with ground truth. IJDAR 1, 52–61 (1998). https://doi.org/10.1007/s100320050006


Issue Date: February 1998
DOI: https://doi.org/10.1007/s100320050006

Key words: Optical character recognition – Ground truth – Nelder-Mead algorithm
