Abstract
This paper describes an algorithm that is one of the few linguistics-based systems for word-to-word alignment. Most systems are purely statistical and rely on assumptions about the structure of texts that often do not hold. Our approach combines statistical methods with positional and linguistic ones so that it can be applied successfully to any kind of bitext, regardless of the internal structure of the texts. The linguistic part uses shallow parsing by regular expressions and relies on very general linguistic principles. However, a component of language-specific methods can be developed to improve results. Our word-alignment system was evaluated on a Romanian-English bitext.
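To illustrate what shallow parsing by regular expressions can look like, here is a minimal sketch that groups a part-of-speech-tagged sentence into noun-phrase chunks. The abstract does not give the system's actual patterns or tag set, so the one-letter tag codes, the `NP_PATTERN` expression, and the `chunk_noun_phrases` helper are all assumptions made for this example only.

```python
import re

# Hypothetical one-letter tag codes (not taken from the paper):
# D = determiner, A = adjective, N = noun, V = verb
# A noun phrase is assumed to be: optional determiner, any adjectives, one or more nouns.
NP_PATTERN = re.compile(r"D?A*N+")


def chunk_noun_phrases(tagged):
    """Return noun-phrase chunks as lists of words.

    `tagged` is a list of (word, tag) pairs where every tag is exactly one
    character, so positions in the tag string map directly to token indices.
    """
    tags = "".join(tag for _, tag in tagged)
    chunks = []
    for match in NP_PATTERN.finditer(tags):
        words = [word for word, _ in tagged[match.start():match.end()]]
        chunks.append(words)
    return chunks


sentence = [
    ("the", "D"), ("old", "A"), ("house", "N"),
    ("has", "V"),
    ("a", "D"), ("garden", "N"),
]
print(chunk_noun_phrases(sentence))
# prints [['the', 'old', 'house'], ['a', 'garden']]
```

Encoding each tag as a single character keeps the regular expression over tags aligned with token positions. Chunks found this way on both sides of a bitext could then serve as units within which word links are sought, though how the described system uses its chunks is not stated in the abstract.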
© 2004 Springer-Verlag Berlin Heidelberg
Cite this paper
Barbu, A.M. (2004). A Positional Linguistics-Based System for Word Alignment. In: Sojka, P., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2004. Lecture Notes in Computer Science, vol 3206. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30120-2_4
DOI: https://doi.org/10.1007/978-3-540-30120-2_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23049-6
Online ISBN: 978-3-540-30120-2
eBook Packages: Springer Book Archive