Abstract
An integrated OCR system for Farsi text is proposed. The system draws on several knowledge sources (KSs) and coordinates them through a blackboard architecture. Some KSs, such as classifiers, are acquired a priori through offline training, while others, such as statistical features, are extracted online during recognition. An arbiter controls the interactions between the solution blackboard and the KSs. The system was tested on 20 real-life scanned documents set in ten popular Farsi fonts and achieved recognition rates of 97.05% at the word level and 99.03% at the character level.
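The blackboard pattern named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class names (Blackboard, KnowledgeSource, Arbiter), the three toy knowledge sources, and the OFFLINE_MODEL lookup table are all hypothetical, and the "image" is a string of glyph codes rather than pixels. It only shows the control flow: knowledge sources read from and write to a shared store, and an arbiter decides which one runs next.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Blackboard:
    """Shared solution store that every knowledge source reads and writes."""
    data: Dict[str, object] = field(default_factory=dict)


@dataclass
class KnowledgeSource:
    """A hypothetical KS: runs once its inputs are on the blackboard."""
    name: str
    needs: List[str]                      # keys that must exist before running
    provides: str                         # key this KS writes
    run: Callable[[Blackboard], object]

    def can_run(self, bb: Blackboard) -> bool:
        has_inputs = all(key in bb.data for key in self.needs)
        return has_inputs and self.provides not in bb.data


class Arbiter:
    """Picks the first runnable KS until the goal key appears."""

    def __init__(self, sources: List[KnowledgeSource]):
        self.sources = list(sources)

    def solve(self, bb: Blackboard, goal: str) -> List[str]:
        trace = []
        while goal not in bb.data:
            ready = next((ks for ks in self.sources if ks.can_run(bb)), None)
            if ready is None:
                raise RuntimeError("no knowledge source can make progress")
            bb.data[ready.provides] = ready.run(bb)
            trace.append(ready.name)
        return trace


# Assumption: stands in for a classifier acquired a priori by offline training.
OFFLINE_MODEL = {"g1": "a", "g2": "b", "g3": "c"}


def segment(bb: Blackboard) -> List[str]:
    return bb.data["image"].split()


def line_stats(bb: Blackboard) -> Dict[str, int]:
    # Stands in for statistical features extracted online.
    return {"glyph_count": len(bb.data["glyphs"])}


def classify(bb: Blackboard) -> str:
    return "".join(OFFLINE_MODEL.get(g, "?") for g in bb.data["glyphs"])


def demo():
    bb = Blackboard({"image": "g1 g2 g3"})
    sources = [
        KnowledgeSource("classifier", ["glyphs", "stats"], "text", classify),
        KnowledgeSource("statistics", ["glyphs"], "stats", line_stats),
        KnowledgeSource("segmenter", ["image"], "glyphs", segment),
    ]
    trace = Arbiter(sources).solve(bb, goal="text")
    return bb.data["text"], trace
```

Calling demo() runs the segmenter, then the statistics source, then the classifier, even though they are registered in the opposite order. The order emerges from what is on the blackboard rather than from a fixed pipeline, which is the property a blackboard design is usually chosen for.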
Additional information
An erratum to this article can be found at http://dx.doi.org/10.1007/s10032-009-0087-7
About this article
Cite this article
Khosravi, H., Kabir, E. A blackboard approach towards integrated Farsi OCR system. IJDAR 12, 21–32 (2009). https://doi.org/10.1007/s10032-009-0079-7
DOI: https://doi.org/10.1007/s10032-009-0079-7