Abstract
Statistical machine translation systems are usually trained on large amounts of bilingual text (used to learn a translation model) and on large amounts of monolingual text in the target language (used to train a language model). In this article we explore semi-supervised model adaptation methods that make effective use of monolingual data from the source language to improve translation quality. We propose several algorithms with this aim and present the strengths and weaknesses of each. We present detailed experimental evaluations on the French–English EuroParl data set and on data from the NIST Chinese–English large-data track, and show a significant improvement in translation quality on both tasks.
Cite this article
Ueffing, N., Haffari, G. & Sarkar, A. Semi-supervised model adaptation for statistical machine translation. Machine Translation 21, 77–94 (2007). https://doi.org/10.1007/s10590-008-9036-3