Abstract
Recognition of handwritten documents is essential today for tasks such as analysis and data extraction. Recognizing handwriting in Indian regional languages such as Tamil is difficult because characters vary in size, style, and orientation angle. The limited availability of handwritten character datasets also makes it hard to achieve high accuracy for every character in the language. In this paper, we propose a deep learning approach to recognize and classify handwritten Tamil characters. The proposed system has five phases: dataset expansion and data augmentation using Generative Adversarial Networks (GAN), preprocessing of the input image to reduce noise, segmentation of the characters, feature extraction, and classification of each character using Convolutional Neural Networks (CNN). We achieved an accuracy of 94.03%, which rose to 97% with GAN-based data augmentation. In addition to recognizing characters, the proposed system converts the image of the handwritten document into an editable Word document or a PDF file.
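The preprocessing and segmentation phases named above can be illustrated with a minimal sketch. The abstract does not say which methods the authors use, so everything here is an assumption: the midpoint threshold, the blank-column (vertical projection) segmentation, and the function names `binarize`, `segment_columns`, and `extract_characters` are hypothetical and chosen only for illustration.

```python
import numpy as np


def binarize(gray, threshold=None):
    """Return a boolean ink mask (True = ink) for dark ink on light paper.

    Assumption: a simple global threshold at the midpoint of the intensity
    range; the paper's actual noise-reduction step is not described here.
    """
    gray = np.asarray(gray, dtype=float)
    if threshold is None:
        threshold = (gray.min() + gray.max()) / 2.0
    return gray < threshold


def segment_columns(ink):
    """Split an ink mask into (start, end) column spans separated by blank columns.

    Assumption: characters on a line do not touch, so any column without ink
    marks a boundary. `end` is exclusive.
    """
    has_ink = ink.any(axis=0)
    spans = []
    start = None
    for x, filled in enumerate(has_ink):
        if filled and start is None:
            start = x  # a character begins at the first inked column
        elif not filled and start is not None:
            spans.append((start, x))  # a blank column closes the character
            start = None
    if start is not None:
        spans.append((start, len(has_ink)))  # character runs to the right edge
    return spans


def extract_characters(gray):
    """Binarize a line image and return one ink mask per segmented character."""
    ink = binarize(gray)
    return [ink[:, a:b] for a, b in segment_columns(ink)]


# Synthetic "line" with two dark blobs standing in for two characters.
page = np.full((8, 12), 255.0)
page[2:6, 1:4] = 0
page[1:7, 6:10] = 0
chars = extract_characters(page)
print([c.shape for c in chars])  # [(8, 3), (8, 4)]
```

In a full pipeline, each cropped mask would then be resized to a fixed input size and passed to the CNN classifier; that step is left out here because the abstract gives no architecture details.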
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Murugesh, V., Parthasarathy, A., Gopinath, G.P., Khade, A. (2022). Tamil Language Handwritten Document Digitization and Analysis of the Impact of Data Augmentation Using Generative Adversarial Networks (GANs) on the Accuracy of CNN Model. In: Chen, J.IZ., Wang, H., Du, KL., Suma, V. (eds) Machine Learning and Autonomous Systems. Smart Innovation, Systems and Technologies, vol 269. Springer, Singapore. https://doi.org/10.1007/978-981-16-7996-4_12
DOI: https://doi.org/10.1007/978-981-16-7996-4_12
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-7995-7
Online ISBN: 978-981-16-7996-4
eBook Packages: Intelligent Technologies and Robotics (R0)