Application of Expectation–Maximization Algorithm to Solve Lexical Divergence in Bangla–Odia Machine Translation

Das, Bishwa Ranjan; Maringanti, Hima Bindu; Dash, Niladri Sekhar

doi:10.1007/978-981-16-8739-6_39

Part of the book series: Smart Innovation, Systems and Technologies (SIST, volume 271)

Abstract

This paper presents word alignment between Bangla and Odia using the expectation–maximization (EM) algorithm, which yields high-accuracy output. The full mathematical calculation is worked out on a set of example Bangla–Odia sentences. The EM algorithm finds the maximum-likelihood probability values and, together with the argmax function, determines the mapping between words of the source-language and target-language sentences. Once these values are computed, they indicate which word of the target language is aligned with which word of the source language. Because EM is an iterative process, the word relationships between source and target languages are found by repeatedly recalculating probability values under maximum likelihood estimation (MLE). EM is used to find the MLE or maximum a posteriori (MAP) estimates of the parameters of a probability model that depends on unobserved latent variables. Lexical alignment has for years been one of the toughest challenges in translation, because the process involves several machine learning algorithms and mathematical modeling.

With these issues in mind, we describe the nature of the lexical problems that arise when analyzing bilingual translated texts with Bangla as the source language and Odia as the target language. In word alignment, handling the 'word divergence' or 'lexical divergence' problem is the main issue and a challenging task. It is not solved by the EM algorithm; it can be handled only through a bilingual dictionary, also called a lexical database, an approach that is examined experimentally and tested only mathematically. Problems of word divergence are normally addressed at the phrase level using bilingual dictionaries or lexical databases. The basic challenge lies in identifying the single-word units of the source text that are converted into multiword units in the target text.
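The iterative estimate-then-argmax procedure described in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact model, so the code below assumes an IBM Model 1 style lexical translation table t(target | source) with no NULL word. The function names (train_em, align) and the placeholder tokens (B_a, O_a, and so on) are hypothetical stand-ins for Bangla and Odia words, not data from the paper.

```python
from collections import defaultdict


def train_em(pairs, iterations=10):
    """Estimate t(target_word | source_word) with EM (IBM Model 1 style, assumed)."""
    src_vocab = {s for src, _ in pairs for s in src}
    tgt_vocab = {w for _, tgt in pairs for w in tgt}
    # Start from a uniform distribution over target words for every source word.
    t = {(tw, sw): 1.0 / len(tgt_vocab) for sw in src_vocab for tw in tgt_vocab}

    for _ in range(iterations):
        count = defaultdict(float)  # expected count of (target, source) links
        total = defaultdict(float)  # expected count of each source word

        # E-step: distribute each target word fractionally over the source words
        # of its sentence, in proportion to the current probabilities.
        for src, tgt in pairs:
            for tw in tgt:
                norm = sum(t[(tw, sw)] for sw in src)
                for sw in src:
                    frac = t[(tw, sw)] / norm
                    count[(tw, sw)] += frac
                    total[sw] += frac

        # M-step: re-estimate probabilities from the expected counts (MLE).
        for tw, sw in t:
            t[(tw, sw)] = count[(tw, sw)] / total[sw]
    return t


def align(src, tgt, t):
    """For each target word, pick the source word with the highest probability (argmax)."""
    return [(tw, max(src, key=lambda sw: t[(tw, sw)])) for tw in tgt]


# Hypothetical toy parallel corpus: B_* are source tokens, O_* are target tokens.
corpus = [
    (["B_a", "B_b"], ["O_a", "O_b"]),
    (["B_a", "B_c"], ["O_a", "O_c"]),
    (["B_d", "B_c"], ["O_d", "O_c"]),
]
t = train_em(corpus)
links = align(["B_a", "B_b"], ["O_a", "O_b"], t)
print(links)
```

Each target word is linked to exactly one source word in this sketch, so it cannot by itself represent a single source word that corresponds to a multiword unit in the target text, which is the divergence case the abstract says requires a bilingual dictionary or lexical database.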

Author information

Authors and Affiliations

Department of Computer Application, Maharaja Sriram Chandra Bhanja Deo University, Baripada, India
Bishwa Ranjan Das & Hima Bindu Maringanti
Linguistic Research Unit, Indian Statistical Institute, Kolkata, India
Niladri Sekhar Dash

Editor information

Editors and Affiliations

Fakir Mohan University, Balasore, Odisha, India
Satchidananda Dehuri
KIIT Deemed to be University, Bhubaneswar, Odisha, India
Bhabani Shankar Prasad Mishra
KIIT Deemed to be University, Bhubaneswar, India
Pradeep Kumar Mallick
Yonsei University, Seoul, Korea (Republic of)
Sung-Bae Cho

About this paper

Cite this paper

Das, B.R., Maringanti, H.B., Dash, N.S. (2022). Application of Expectation–Maximization Algorithm to Solve Lexical Divergence in Bangla–Odia Machine Translation. In: Dehuri, S., Prasad Mishra, B.S., Mallick, P.K., Cho, SB. (eds) Biologically Inspired Techniques in Many Criteria Decision Making. Smart Innovation, Systems and Technologies, vol 271. Springer, Singapore. https://doi.org/10.1007/978-981-16-8739-6_39

DOI: https://doi.org/10.1007/978-981-16-8739-6_39
Published: 04 June 2022
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-8738-9
Online ISBN: 978-981-16-8739-6
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)
