Tamil Language Handwritten Document Digitization and Analysis of the Impact of Data Augmentation Using Generative Adversarial Networks (GANs) on the Accuracy of CNN Model

  • Conference paper
Machine Learning and Autonomous Systems

Part of the book series: Smart Innovation, Systems and Technologies (SIST, volume 269)

Abstract

Recognition of handwritten documents is essential for tasks such as analysis and data extraction. Recognizing handwriting in Indian regional languages such as Tamil is difficult because characters vary in size, style, and orientation angle. The limited availability of handwritten character datasets makes it hard to achieve high accuracy for every character in the language. In this paper, we propose a deep learning approach to recognize and classify handwritten characters of the Tamil language. The proposed system has five phases: expansion of the dataset through data augmentation with Generative Adversarial Networks (GANs), preprocessing of the input image to reduce noise, segmentation of the characters, feature extraction, and classification of the characters using a Convolutional Neural Network (CNN). We achieved an accuracy of 94.03%, and with data augmentation using GANs, an accuracy of 97%. In addition to recognizing characters, the proposed system converts the image of the handwritten document into an editable Word document or a PDF file.
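The preprocessing and segmentation phases named in the abstract can be illustrated with a minimal sketch. The abstract does not say which methods the authors use, so everything here is an assumption: fixed-threshold binarization stands in for noise reduction, and a vertical projection profile stands in for character segmentation. The names `binarize`, `segment_columns`, and `crop` are hypothetical helpers, not part of the paper.

```python
from typing import List, Tuple

# Grayscale image as rows of pixel values: 0 = black ink, 255 = white paper.
Image = List[List[int]]


def binarize(image: Image, threshold: int = 128) -> Image:
    """Map each pixel to 1 (ink) or 0 (background) using a fixed threshold.

    Assumption: a fixed threshold is used only for illustration.
    """
    return [[1 if pixel < threshold else 0 for pixel in row] for row in image]


def segment_columns(binary: Image) -> List[Tuple[int, int]]:
    """Return (start, end) column ranges, end exclusive, that contain ink.

    The vertical projection profile counts ink pixels per column. Runs of
    non-empty columns separated by empty columns are treated as characters.
    """
    if not binary:
        return []
    width = len(binary[0])
    profile = [sum(row[x] for row in binary) for x in range(width)]
    segments: List[Tuple[int, int]] = []
    start = None
    for x, count in enumerate(profile):
        if count > 0 and start is None:
            start = x  # a new character begins
        elif count == 0 and start is not None:
            segments.append((start, x))  # the character ended at column x
            start = None
    if start is not None:
        segments.append((start, width))  # character touches the right edge
    return segments


def crop(binary: Image, span: Tuple[int, int]) -> Image:
    """Cut one character out of the binarized line image."""
    start, end = span
    return [row[start:end] for row in binary]


# Toy 3x7 "line image" holding two strokes separated by blank columns.
W, K = 255, 0
page = [
    [W, K, K, W, W, K, W],
    [W, K, W, W, W, K, W],
    [W, K, K, W, W, K, W],
]
binary = binarize(page)
spans = segment_columns(binary)
glyphs = [crop(binary, span) for span in spans]
print(spans)
```

In a full pipeline, each cropped glyph would be resized to a fixed input size and passed to the CNN classifier. Projection profiles only separate characters that do not touch or overlap, so connected handwriting would need a different segmentation strategy.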

Author information

Corresponding author

Correspondence to Venkatesh Murugesh.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Murugesh, V., Parthasarathy, A., Gopinath, G.P., Khade, A. (2022). Tamil Language Handwritten Document Digitization and Analysis of the Impact of Data Augmentation Using Generative Adversarial Networks (GANs) on the Accuracy of CNN Model. In: Chen, J.IZ., Wang, H., Du, KL., Suma, V. (eds) Machine Learning and Autonomous Systems. Smart Innovation, Systems and Technologies, vol 269. Springer, Singapore. https://doi.org/10.1007/978-981-16-7996-4_12
