Abstract
This paper addresses the problem of semantic-aware search in textual (structured, semi-structured, NoSQL) databases. This problem has emerged as a necessary extension of the standard containment-based keyword query to meet user needs in textual databases and IR applications. We propose a new approach, called SemIndex, that extends the standard inverted index by constructing a tightly coupled inverted index graph that combines two main resources: a general-purpose semantic network and a standard inverted index on a collection of textual data. We also provide an extended query model and related processing algorithms that rely on SemIndex. We evaluate the performance of SemIndex experimentally; preliminary results demonstrate the effectiveness, scalability and optimality of our approach.
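As a rough illustration of the idea of coupling an inverted index with a semantic network in a single graph, consider the minimal sketch below. It is not the SemIndex structure or its query algorithms, which the abstract does not detail. The document collection, the synonym pairs, the node encoding and the function names (`build_inverted_index`, `build_graph`, `semantic_search`) are all assumptions made for this example.

```python
from collections import defaultdict

# Toy document collection (hypothetical data).
DOCS = {
    1: "the car stopped near the river",
    2: "an automobile needs fuel",
    3: "the bank approved the loan",
}

# Toy semantic network: undirected synonym links (hypothetical; a real
# system would load a general-purpose resource instead).
SYNONYMS = [("car", "automobile"), ("car", "auto")]


def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def build_graph(index, synonyms):
    """Merge both resources into one adjacency map.

    Nodes are ("term", t) or ("doc", d). Term-to-doc edges come from the
    inverted index; term-to-term edges come from the semantic network.
    """
    graph = defaultdict(set)
    for term, doc_ids in index.items():
        for doc_id in doc_ids:
            graph[("term", term)].add(("doc", doc_id))
    for a, b in synonyms:
        graph[("term", a)].add(("term", b))
        graph[("term", b)].add(("term", a))
    return graph


def semantic_search(graph, keyword, max_hops=1):
    """Return ids of documents reachable within max_hops term-to-term links."""
    frontier = {("term", keyword.lower())}
    seen = set(frontier)
    results = set()
    for _ in range(max_hops + 1):
        next_frontier = set()
        for node in frontier:
            for neighbour in graph.get(node, ()):
                kind, value = neighbour
                if kind == "doc":
                    results.add(value)
                elif neighbour not in seen:
                    seen.add(neighbour)
                    next_frontier.add(neighbour)
        frontier = next_frontier
    return results


index = build_inverted_index(DOCS)
graph = build_graph(index, SYNONYMS)
exact = semantic_search(graph, "car", max_hops=0)     # containment-only query
expanded = semantic_search(graph, "car", max_hops=1)  # follows synonym links
```

With `max_hops=0` the search behaves like a plain containment query over the inverted index. Raising `max_hops` lets it follow semantic links, so a query for "car" in this toy setting also reaches the document that mentions only "automobile".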
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Chbeir, R. et al. (2014). SemIndex: Semantic-Aware Inverted Index. In: Manolopoulos, Y., Trajcevski, G., Kon-Popovska, M. (eds) Advances in Databases and Information Systems. ADBIS 2014. Lecture Notes in Computer Science, vol 8716. Springer, Cham. https://doi.org/10.1007/978-3-319-10933-6_22
DOI: https://doi.org/10.1007/978-3-319-10933-6_22
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10932-9
Online ISBN: 978-3-319-10933-6
eBook Packages: Computer Science, Computer Science (R0)