Abstract
Automatic query expansion is a well-known method for improving the performance of information retrieval systems. In this paper we propose information-theoretic measures to improve the effectiveness of co-occurrence-based automatic query expansion. We use a local approach based on pseudo-relevance feedback. Candidate expansion terms are selected from the top N retrieved documents using a co-occurrence-based approach and are then ranked using two information-theoretic measures. The first is the standard Kullback-Leibler divergence (KLD); the second is a variant of KLD. Experiments were performed on the TREC-1 dataset. The results suggest that there is scope for improving co-occurrence-based query expansion by using information-theoretic measures. Extensive experiments were carried out to select two important parameters: the number of top-ranked documents (N) used for feedback and the number of terms used for expansion.
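The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract gives no formulas, so the scoring used here is the commonly cited KLD term score, p_R(t) * log(p_R(t) / p_C(t)), where p_R is the term's relative frequency in the top N feedback documents and p_C its relative frequency in the whole collection. The co-occurrence step is simplified to "shares a feedback document with a query term", and the KLD variant mentioned in the abstract is not shown. All function names and parameters (cooccurring_terms, kld_scores, expand_query, top_n, num_terms) are hypothetical.

```python
import math
from collections import Counter


def cooccurring_terms(query_terms, feedback_docs):
    """Candidate terms: non-query terms that share a feedback document
    with at least one query term (a simplified co-occurrence criterion,
    assumed here for illustration)."""
    query = set(query_terms)
    candidates = set()
    for doc in feedback_docs:
        tokens = set(doc)
        if query & tokens:
            candidates.update(tokens - query)
    return candidates


def kld_scores(feedback_docs, collection_docs, candidates):
    """Score each candidate with p_R(t) * log(p_R(t) / p_C(t)).

    p_R is the relative frequency in the feedback documents and p_C the
    relative frequency in the whole collection (assumed formulation).
    """
    fb_counts = Counter(tok for doc in feedback_docs for tok in doc)
    coll_counts = Counter(tok for doc in collection_docs for tok in doc)
    fb_total = sum(fb_counts.values())
    coll_total = sum(coll_counts.values())
    scores = {}
    for term in candidates:
        p_r = fb_counts[term] / fb_total
        p_c = coll_counts[term] / coll_total
        # Skip terms missing from either distribution to avoid log(0).
        if p_r > 0 and p_c > 0:
            scores[term] = p_r * math.log(p_r / p_c)
    return scores


def expand_query(query_terms, ranked_docs, collection_docs, top_n=10, num_terms=5):
    """Append the num_terms best-scoring candidates to the query.

    ranked_docs is the initial retrieval result, best first; only its
    first top_n documents are treated as pseudo-relevant.
    """
    feedback = ranked_docs[:top_n]
    candidates = cooccurring_terms(query_terms, feedback)
    scores = kld_scores(feedback, collection_docs, candidates)
    # Sort by descending score, breaking ties alphabetically.
    best = sorted(scores, key=lambda t: (-scores[t], t))[:num_terms]
    return list(query_terms) + best
```

Documents are assumed to be pre-tokenized lists of terms. The two parameters the abstract says were tuned experimentally correspond to top_n and num_terms in this sketch; the default values are placeholders, not values reported in the paper.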
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Imran, H., Sharan, A. (2010). Improving Effectiveness of Query Expansion Using Information Theoretic Approach. In: García-Pedrajas, N., Herrera, F., Fyfe, C., Benítez, J.M., Ali, M. (eds) Trends in Applied Intelligent Systems. IEA/AIE 2010. Lecture Notes in Computer Science, vol 6097. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13025-0_1
DOI: https://doi.org/10.1007/978-3-642-13025-0_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-13024-3
Online ISBN: 978-3-642-13025-0
eBook Packages: Computer Science, Computer Science (R0)