Abstract
This paper describes our participation in the Cross-Language Evaluation Forum (CLEF). Our objectives for this experiment were three-fold. Firstly, we evaluated the coverage of the Thai bilingual dictionary when translating queries. Secondly, we investigated whether the segmentation process affected cross-language information retrieval (CLIR). Lastly, we examined query formation techniques. Since this was our first international CLIR experiment, our approach used dictionary-based techniques to translate Thai queries into English queries. Four runs were submitted to CLEF: (a) single mapping translation with manual segmentation, (b) multiple mapping translation with manual segmentation, (c) single mapping translation with automatic segmentation and (d) single mapping translation with query enhancement using words from our Thai thesaurus. The retrieval effectiveness was worse than we expected. The simple dictionary mapping technique did not achieve good retrieval effectiveness, even though the dictionary lookup mapped a very high percentage of the query words. The dictionary lookup does not return specific terms; instead, each word is mapped to a definition or meaning of that term. Furthermore, Thai stopwords, stemmed words and word separation reduced the effectiveness of Thai CLIR.
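The difference between the single mapping and multiple mapping runs, and the notion of dictionary coverage, can be illustrated with a minimal sketch. It assumes the Thai query has already been segmented into words (manually or automatically). The dictionary entries, the function names `translate` and `coverage`, and the rule of taking the first listed sense for single mapping are all assumptions made for illustration; they are not taken from the paper or from the dictionary it used.

```python
from typing import Dict, List

# Toy bilingual dictionary: illustrative entries only, NOT from the paper's dictionary.
TOY_DICT: Dict[str, List[str]] = {
    "บ้าน": ["house", "home"],
    "น้ำ": ["water", "liquid"],
    "แมว": ["cat"],
}


def translate(tokens: List[str], dictionary: Dict[str, List[str]],
              multiple: bool = False) -> List[str]:
    """Translate pre-segmented Thai tokens into English query terms.

    Single mapping keeps only the first listed sense of each word
    (an assumed selection rule); multiple mapping keeps every sense.
    Words missing from the dictionary are dropped.
    """
    query: List[str] = []
    for tok in tokens:
        senses = dictionary.get(tok, [])
        if not senses:
            continue  # out-of-dictionary word: no translation available
        query.extend(senses if multiple else senses[:1])
    return query


def coverage(tokens: List[str], dictionary: Dict[str, List[str]]) -> float:
    """Fraction of query tokens that have at least one dictionary entry."""
    if not tokens:
        return 0.0
    return sum(1 for t in tokens if t in dictionary) / len(tokens)


# A segmented query; the last word is deliberately absent from TOY_DICT.
segmented = ["แมว", "บ้าน", "ช้าง"]

single_query = translate(segmented, TOY_DICT)                   # one term per word
multiple_query = translate(segmented, TOY_DICT, multiple=True)  # all senses
query_coverage = coverage(segmented, TOY_DICT)
```

In this sketch, high coverage only means that most words found some entry. It says nothing about whether the chosen English term is the specific one the query needs, which is the gap between lookup rate and retrieval effectiveness noted in the abstract.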
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Chuleerat, J. (2002). Dictionary-Based Thai CLIR: An Experimental Survey of Thai CLIR. In: Peters, C., Braschler, M., Gonzalo, J., Kluck, M. (eds) Evaluation of Cross-Language Information Retrieval Systems. CLEF 2001. Lecture Notes in Computer Science, vol 2406. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45691-0_18
DOI: https://doi.org/10.1007/3-540-45691-0_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44042-0
Online ISBN: 978-3-540-45691-9
eBook Packages: Springer Book Archive