Abstract
We propose ATEC, a novel metric for automatic MT evaluation based on explicit assessment of word choice and word order in an MT output in comparison with its reference translation(s). These are the two most fundamental factors in the construction of meaning for a sentence. Word choice is assessed by matching word forms at several linguistic levels (surface form, stem, sound and sense) and by weighting each word according to its informativeness. Word order is quantified in terms of the discordance of word position and word sequence between a translation candidate and its reference. In evaluations on the MetricsMATR08 data set and the LDC MTC2 and MTC4 corpora, ATEC shows a strong positive correlation with human judgments at the segment level, highly comparable to that of the few state-of-the-art evaluation metrics.
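To make the two ingredients named in the abstract concrete, the following is a minimal sketch and not the ATEC formulation itself, whose formulas the abstract does not give. It assumes a greedy two-stage word match (surface form, then a crude suffix-stripping stand-in for a real stemmer) and a position-discordance penalty defined here as the mean absolute difference of normalized word positions. The names `crude_stem`, `match_words`, `position_discordance`, `toy_score` and the parameter `order_weight` are all hypothetical. Sound matching, sense matching, informativeness weighting and word-sequence discordance are omitted.

```python
def crude_stem(word):
    """Hypothetical stand-in for a real stemmer: strip a few common suffixes."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def match_words(candidate, reference):
    """Greedily align candidate tokens to reference tokens.

    Stage 1 matches on lowercased surface form, stage 2 on the crude stem.
    Returns a sorted list of (candidate_index, reference_index) pairs.
    """
    used_ref = set()
    pairs = []
    for key in (str.lower, lambda w: crude_stem(w.lower())):
        matched_cand = {c for c, _ in pairs}
        for i, cand_word in enumerate(candidate):
            if i in matched_cand:
                continue
            for j, ref_word in enumerate(reference):
                if j not in used_ref and key(cand_word) == key(ref_word):
                    pairs.append((i, j))
                    used_ref.add(j)
                    break
    return sorted(pairs)


def position_discordance(pairs, cand_len, ref_len):
    """Mean absolute difference of normalized positions of matched words."""
    if not pairs:
        return 0.0
    return sum(abs(i / cand_len - j / ref_len) for i, j in pairs) / len(pairs)


def toy_score(candidate, reference, order_weight=0.5):
    """Word-choice F-measure, discounted by an assumed word-order penalty."""
    if not candidate or not reference:
        return 0.0
    pairs = match_words(candidate, reference)
    if not pairs:
        return 0.0
    precision = len(pairs) / len(candidate)
    recall = len(pairs) / len(reference)
    f_measure = 2 * precision * recall / (precision + recall)
    penalty = order_weight * position_discordance(
        pairs, len(candidate), len(reference)
    )
    return f_measure * (1 - penalty)


if __name__ == "__main__":
    ref = "the cat sat on the mat".split()
    print(toy_score("the cat sat on the mat".split(), ref))
    print(toy_score("on the mat the cat sat".split(), ref))
```

In this sketch an identical candidate scores 1.0, and a candidate with the same words in a different order scores lower because only the order penalty changes. The split into a word-choice term and a word-order discount mirrors the abstract's description; how the published metric combines them is not stated in the abstract.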
Cite this article
Wong, B., Kit, C. ATEC: automatic evaluation of machine translation via word choice and word order. Machine Translation 23, 141–155 (2009). https://doi.org/10.1007/s10590-009-9061-x
DOI: https://doi.org/10.1007/s10590-009-9061-x