Abstract
The main reason for adopting Semantic Web technology in information retrieval is to improve retrieval performance. A semantic search system locates web content that is semantically related to the query’s concepts rather than relying on exact matching with the query’s keywords. Interest in Arabic web content is growing worldwide because of its cultural, political, strategic, and economic importance. Arabic is linguistically rich at all levels, which makes effective search of Arabic text a challenge. In the literature, research on searching Arabic web content using Semantic Web technology is still insufficient relative to the importance of Arabic as a language. In this research, we propose an Arabic semantic search approach and apply it to Arabic web content. The approach is based on the Vector Space Model (VSM), which has proved successful and whose traditional version has been the target of much research aimed at improving it. Our approach uses the Universal WordNet to build a rich concept-space index in place of the traditional term-space index; this index enables Semantic VSM capabilities. Moreover, we introduce a new incidence measurement that calculates the semantic significance degree of a concept in a document and fits our model better than the traditional term frequency. Furthermore, to determine the semantic similarity of two vectors, we introduce a new formula for calculating the semantic weight of a concept. Because documents are indexed by their topics and classified semantically, we were able to search Arabic documents effectively. The experimental results show that Precision, Recall, and F-measure changed from 77%, 56%, and 63% to 71%, 96%, and 81%, respectively: Recall and F-measure improved substantially, while Precision decreased slightly.
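To make the contrast between a term-space and a concept-space index concrete, the following is a minimal sketch in Python. It is not the approach proposed in the paper: the lookup table `TERM_TO_CONCEPT` is a hypothetical stand-in for a lexical resource such as the Universal WordNet, the English toy words replace Arabic text for readability, and plain occurrence counts with cosine similarity replace the paper's incidence measurement and semantic weight formula, which the abstract does not specify.

```python
import math
from collections import Counter

# Hypothetical term-to-concept lookup (an assumption for illustration only);
# a real system would consult a lexical resource instead of a fixed dict.
TERM_TO_CONCEPT = {
    "car": "VEHICLE",
    "automobile": "VEHICLE",
    "vehicle": "VEHICLE",
    "doctor": "PHYSICIAN",
    "physician": "PHYSICIAN",
}


def term_vector(text):
    """Traditional term-space vector: raw counts of surface tokens."""
    return Counter(text.lower().split())


def concept_vector(text):
    """Concept-space vector: tokens are mapped to concepts before counting.

    Tokens without a known concept are kept as they are.
    """
    tokens = text.lower().split()
    return Counter(TERM_TO_CONCEPT.get(tok, tok) for tok in tokens)


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


query = "car"
doc = "automobile repair"

# The query and the document share no surface term, so the term-space score is 0.
term_score = cosine(term_vector(query), term_vector(doc))

# Both "car" and "automobile" map to the same concept, so the scores now overlap.
concept_score = cosine(concept_vector(query), concept_vector(doc))
```

In this toy setting the document is invisible to the term-space model but is retrieved by the concept-space model, which is the kind of recall gain the abstract reports.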
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Al-Zoghby, A.M., Shaalan, K. (2015). Conceptual Search for Arabic Web Content. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2015. Lecture Notes in Computer Science, vol 9042. Springer, Cham. https://doi.org/10.1007/978-3-319-18117-2_30
DOI: https://doi.org/10.1007/978-3-319-18117-2_30
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-18116-5
Online ISBN: 978-3-319-18117-2
eBook Packages: Computer Science, Computer Science (R0)