Script and language identification for handwritten document images

Hochberg, Judith; Bowers, Kevin; Cannon, Michael; Kelly, Patrick

doi:10.1007/s100320050036

Original papers
Published: December 1999

Volume 2, pages 45–52, (1999)

International Journal on Document Analysis and Recognition

Judith Hochberg¹, Kevin Bowers², Michael Cannon¹ & Patrick Kelly¹

Abstract.

A system for automatically identifying the script used in a handwritten document image is described. The system was developed using a 496-document dataset representing six scripts, eight languages, and 279 writers. Documents were characterized by the mean, standard deviation, and skew of five connected component features. Linear discriminant analysis was used to classify new documents, and the classifier was tested using writer-sensitive cross-validation. Classification accuracy averaged 88% across the six scripts. The same method, applied within the Roman subcorpus, discriminated English and German documents with 85% accuracy.
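The document-level characterization in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not name the five connected component features, so each component is simply a tuple of numbers here. The function names (`summarize`, `document_vector`, `writer_folds`) are hypothetical, the skew is computed as the population skewness, and "writer-sensitive" is read as an assumption to mean that no writer contributes documents to both the training and the test set. The discriminant classifier itself is left out.

```python
import math
from collections import defaultdict

def summarize(values):
    """Return (mean, standard deviation, skew) of one feature over a document's components."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)  # population std
    # Population skewness; taken as 0 when the feature is constant (assumption).
    skew = 0.0 if std == 0 else sum((v - mean) ** 3 for v in values) / (n * std ** 3)
    return mean, std, skew

def document_vector(components):
    """Build one document descriptor from per-component feature tuples.

    components: list of equal-length tuples, one per connected component.
    Returns 3 numbers per feature (15 numbers for five features).
    """
    vector = []
    for feature_values in zip(*components):
        vector.extend(summarize(feature_values))
    return vector

def writer_folds(writer_ids):
    """Yield (train_indices, test_indices), holding out one writer at a time.

    Leave-one-writer-out is an assumed reading of "writer-sensitive".
    """
    by_writer = defaultdict(list)
    for index, writer in enumerate(writer_ids):
        by_writer[writer].append(index)
    for test in by_writer.values():
        held_out = set(test)
        train = [i for i in range(len(writer_ids)) if i not in held_out]
        yield train, test
```

Each training document's vector would then be passed, with its script label, to a linear discriminant classifier, and accuracy would be measured only on documents from held-out writers.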

Author information

Authors and Affiliations

Computer Research and Applications Group (CIC-3), Mail Stop B265, Los Alamos National Laboratory, Los Alamos, NM 87545, USA; e-mail: {judithh,tmc,kelly}@lanl.gov
Judith Hochberg, Michael Cannon & Patrick Kelly
Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, CA 94720, USA
Kevin Bowers

Additional information

Received December 1, 1998 / Revised April 5, 1999

About this article

Cite this article

Hochberg, J., Bowers, K., Cannon, M. et al. Script and language identification for handwritten document images. IJDAR 2, 45–52 (1999). https://doi.org/10.1007/s100320050036

Issue Date: December 1999
DOI: https://doi.org/10.1007/s100320050036

Key words: Script – Language – Handwriting – Discrimination – Features
