Abstract
Word sense disambiguation has recently gained increased attention from NLP practitioners because of its many potential applications in language technology. This paper proposes a Naïve Bayes classifier for resolving the lexical ambiguity of Bangla words with the help of a Bangla sense-annotated corpus. In the initial stage, a Bangla sense-annotated corpus is generated from a raw text corpus to serve as the training dataset. For a given input Bangla sentence, the ambiguous words are detected first, and Bayes' theorem is then applied to calculate the posterior probability that an ambiguous word belongs to a particular sense class. The posterior probabilities of the candidate senses of the detected ambiguous word are finally used by the Naïve Bayes classifier to select the closest sense. Experimental results show that the proposed method outperforms existing techniques, achieving the highest F1-score of \(90\%\) on the test data.
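The posterior-based sense selection described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `NaiveBayesWSD`, the add-one (Laplace) smoothing, the bag-of-words context features, and the English toy data standing in for a sense-annotated corpus are all assumptions made for illustration. For each candidate sense, the sketch scores log P(sense) plus the sum of log P(word | sense) over the context words, which is proportional to the log posterior, and returns the highest-scoring sense.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesWSD:
    """Hypothetical toy Naive Bayes sense classifier (add-one smoothing)."""

    def __init__(self):
        self.sense_counts = Counter()            # number of training examples per sense
        self.word_counts = defaultdict(Counter)  # context-word counts per sense
        self.vocab = set()                       # all context words seen in training

    def fit(self, examples):
        """examples: iterable of (context_words, sense_label) pairs."""
        for context, sense in examples:
            self.sense_counts[sense] += 1
            for word in context:
                self.word_counts[sense][word] += 1
                self.vocab.add(word)

    def log_posteriors(self, context):
        """Return unnormalised log posteriors: log prior + log likelihood."""
        total = sum(self.sense_counts.values())
        vocab_size = len(self.vocab)
        scores = {}
        for sense, n in self.sense_counts.items():
            # Denominator of the smoothed estimate of P(word | sense).
            denom = sum(self.word_counts[sense].values()) + vocab_size
            score = math.log(n / total)
            for word in context:
                score += math.log((self.word_counts[sense][word] + 1) / denom)
            scores[sense] = score
        return scores

    def predict(self, context):
        """Pick the sense with the highest posterior score."""
        scores = self.log_posteriors(context)
        return max(scores, key=scores.get)


# Toy stand-in for a sense-annotated corpus: context words around the
# ambiguous English word "bank", each labelled with a sense.
train = [
    (["river", "water", "shore"], "river"),
    (["money", "loan", "account"], "finance"),
    (["deposit", "money", "interest"], "finance"),
    (["fishing", "river", "boat"], "river"),
]

clf = NaiveBayesWSD()
clf.fit(train)
print(clf.predict(["loan", "interest"]))
print(clf.predict(["river", "boat"]))
```

Working in log space avoids numerical underflow when many context words are multiplied together, and the smoothing keeps an unseen context word from driving a sense's probability to zero.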
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Biswas, M., Sharif, O., Hoque, M.M. (2022). An Empirical Framework for Bangla Word Sense Disambiguation Using Statistical Approach. In: Misra, R., Shyamasundar, R.K., Chaturvedi, A., Omer, R. (eds) Machine Learning and Big Data Analytics (Proceedings of International Conference on Machine Learning and Big Data Analytics (ICMLBDA) 2021). ICMLBDA 2021. Lecture Notes in Networks and Systems, vol 256. Springer, Cham. https://doi.org/10.1007/978-3-030-82469-3_3
DOI: https://doi.org/10.1007/978-3-030-82469-3_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-82468-6
Online ISBN: 978-3-030-82469-3
eBook Packages: Intelligent Technologies and Robotics (R0)