Abstract
Optical character recognition (OCR) is the process of recognising handwritten or printed text in an image and converting it into editable text. OCR for Telugu is difficult because the consonants and vowels of the language can be combined in many ways to form different words. The two major problems in Telugu OCR are character-level segmentation and the training of compound characters. To address segmentation of handwritten styles, we propose an algorithm that segments characters so that most of the features of each character are preserved. To address training, each character is converted into a numerical value; this eliminates the problems that arise when compound characters are trained using Telugu Unicode, which treats a compound character as two or more characters. For feature extraction, we use Inception, which captures small variations in the characters and thereby increases the recognition rate. We achieve a character-level recognition rate of 83% and a word-level recognition rate of 70%.
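The numerical encoding mentioned in the abstract can be illustrated with a minimal sketch. It assumes, hypothetically, that every glyph class (simple or compound) is assigned a single integer id; the three sample glyphs and the names `build_label_map` and `decode` are illustrative and are not taken from the chapter.

```python
# Sample Telugu glyphs written as explicit Unicode code points.
# This class list is hypothetical; the chapter's actual label set is not given here.
KA = "\u0c15"                # simple consonant, 1 code point
KAA = "\u0c15\u0c3e"         # consonant + vowel sign, 2 code points
KSHA = "\u0c15\u0c4d\u0c37"  # consonant + virama + consonant, 3 code points

CLASSES = [KA, KAA, KSHA]

# A compound glyph is one visual unit but several Unicode code points,
# so a per-code-point labelling would split it into multiple targets.
codepoint_counts = [len(g) for g in CLASSES]


def build_label_map(classes):
    """Assign one integer id per glyph class, in first-seen order."""
    label_map = {}
    for glyph in classes:
        if glyph not in label_map:
            label_map[glyph] = len(label_map)
    return label_map


label_map = build_label_map(CLASSES)
inverse_map = {idx: glyph for glyph, idx in label_map.items()}


def decode(ids):
    """Turn predicted class ids back into a Unicode string."""
    return "".join(inverse_map[i] for i in ids)
```

With this mapping a classifier predicts one id per segmented glyph, and `decode` restores the Unicode text only after recognition, so the multi-code-point structure never appears in the training targets.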
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Nagendra Kumar, S., Revathi, B., Vidya, P.N.K.A.V.V.S., Neha, S.C. (2022). Optical Character Recognition of Telugu Text Using Inception Model. In: Pandian, A.P., Palanisamy, R., Narayanan, M., Senjyu, T. (eds) Proceedings of Third International Conference on Intelligent Computing, Information and Control Systems. Advances in Intelligent Systems and Computing, vol 1415. Springer, Singapore. https://doi.org/10.1007/978-981-16-7330-6_74
DOI: https://doi.org/10.1007/978-981-16-7330-6_74
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-7329-0
Online ISBN: 978-981-16-7330-6
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)