Abstract
Handwriting recognition remains an unsolved problem. Advances in artificial intelligence and machine learning have made Optical Character Recognition (OCR) systems more effective, yet no serious commercial OCR system is available for many low-resource languages, such as Bangla. Bangla presents additional challenges: vowels and consonants in the middle of words are often abbreviated and replaced with notations called diacritics, and multiple letters can be combined into shorthand representations called compound characters. Compound characters can themselves carry diacritics, which makes the recognition task extremely complex. A successful commercial OCR must therefore model not only individual characters but also these diacritics and compound characters, which leads us to propose a grapheme-based holistic recognition approach. Borno is the first multiclass convolutional neural network-based deep learning model that can recognize Bangla handwritten characters with graphemes. The proposed model was trained on a dataset of 1,069,132 images covering 50 basic characters, 10 numerals, 146 compound characters, 10 modifiers, and 6 consonant diacritics classes. The trained Borno model achieves an average character recognition accuracy of 92.61% on the validation set.
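To make the size of the label space concrete, the following minimal Python sketch adds up the five class groups listed in the abstract and maps a flat class index back to its group. The group counts come from the abstract; the contiguous index ordering, the identifier names, and the helper functions are illustrative assumptions, not the label encoding used by the authors.

```python
# Class groups and their sizes, as listed in the abstract.
# The key names and the ordering of the groups are assumptions for illustration.
CLASS_GROUPS = {
    "basic_character": 50,
    "numeral": 10,
    "compound_character": 146,
    "modifier": 10,
    "consonant_diacritic": 6,
}


def build_label_ranges(groups):
    """Give each group a contiguous half-open range [start, stop) of class indices."""
    ranges = {}
    start = 0
    for name, count in groups.items():
        ranges[name] = (start, start + count)
        start += count
    return ranges


def group_of(index, ranges):
    """Return the name of the group that a flat class index falls into."""
    for name, (start, stop) in ranges.items():
        if start <= index < stop:
            return name
    raise ValueError(f"class index {index} is outside the label space")


LABEL_RANGES = build_label_ranges(CLASS_GROUPS)

# Size of the output layer that a single multiclass classifier would need
# under this (assumed) flat encoding.
NUM_CLASSES = sum(CLASS_GROUPS.values())
```

Under this assumed encoding, a single softmax output layer would span all five groups at once, which is one way to read the "multiclass" framing of the model.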
Acknowledgement
The authors would like to acknowledge the encouragement and funding from the “Enhancement of Bangla Language in ICT through Research & Development (EBLICT)” project, under the Ministry of ICT, the Government of Bangladesh.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Rabby, A.S.A., Islam, M.M., Hasan, N., Nahar, J., Rahman, F. (2021). Borno: Bangla Handwritten Character Recognition Using a Multiclass Convolutional Neural Network. In: Arai, K., Kapoor, S., Bhatia, R. (eds) Proceedings of the Future Technologies Conference (FTC) 2020, Volume 1. FTC 2020. Advances in Intelligent Systems and Computing, vol 1288. Springer, Cham. https://doi.org/10.1007/978-3-030-63128-4_35
DOI: https://doi.org/10.1007/978-3-030-63128-4_35
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-63127-7
Online ISBN: 978-3-030-63128-4
eBook Packages: Intelligent Technologies and Robotics (R0)