Abstract
A huge amount of digital data is now stored in repositories that are queried through keyword-based search interfaces, so retrieving information from these repositories has become an important issue. Organizations usually implement architectures based on relational databases that do not take the syntax and semantics of the data into account. Solving this problem requires complex Extract, Transform and Load (ETL) processes from relational repositories to triple stores. However, most organizations do not carry out this migration due to a lack of time, money, and knowledge.
In this paper, we present a methodology that performs automatic query expansion based on natural language processing and semantics to improve information retrieval from relational database repositories. We have integrated it into an existing system in a real media group organization and tested it to analyze its effectiveness. The results obtained are promising and show the interest of the proposal.
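As a rough illustration of the general idea of keyword query expansion over a relational table, the following minimal sketch adds related terms to a keyword query and matches them with SQL `LIKE`. It is not the method of the paper: the `SYNONYMS` dictionary, the `articles` table, its `body` column, and the function names are all hypothetical, and a real system would obtain related terms from NLP tools and lexical resources rather than from a hand-written table.

```python
import sqlite3

# Hypothetical synonym table, hand-written for illustration only.
SYNONYMS = {
    "car": ["automobile", "vehicle"],
    "film": ["movie"],
}


def expand_query(keywords):
    """Return the keywords plus any known related terms, without duplicates."""
    expanded = []
    for word in keywords:
        for term in [word] + SYNONYMS.get(word, []):
            if term not in expanded:
                expanded.append(term)
    return expanded


def search(conn, keywords):
    """Return ids of rows whose (assumed) body column contains any expanded term."""
    terms = expand_query(keywords)
    if not terms:
        return []
    # One parameterized LIKE clause per term, joined with OR.
    where = " OR ".join("body LIKE ?" for _ in terms)
    params = [f"%{term}%" for term in terms]
    rows = conn.execute(
        f"SELECT id FROM articles WHERE {where} ORDER BY id", params
    )
    return [row[0] for row in rows]


# Small in-memory example database (hypothetical schema and data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany(
    "INSERT INTO articles (body) VALUES (?)",
    [
        ("A new automobile factory opens",),
        ("Local film festival",),
        ("Weather report",),
    ],
)
```

With this toy data, a search for `car` finds the first article only because the query was expanded with `automobile`; a plain keyword match would have returned nothing.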
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Buey, M.G., Garrido, Á.L., Ilarri, S. (2014). An Approach for Automatic Query Expansion Based on NLP and Semantics. In: Decker, H., Lhotská, L., Link, S., Spies, M., Wagner, R.R. (eds) Database and Expert Systems Applications. DEXA 2014. Lecture Notes in Computer Science, vol 8645. Springer, Cham. https://doi.org/10.1007/978-3-319-10085-2_32
DOI: https://doi.org/10.1007/978-3-319-10085-2_32
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10084-5
Online ISBN: 978-3-319-10085-2
eBook Packages: Computer Science (R0)