Abstract
This paper presents an approach based on Heuristic Semantic Walk (HSW), in which semantic proximity measures among concepts serve as heuristics that guide the search for concept chains in the collaborative network of Wikipedia, encoding problem-specific knowledge in a problem-independent way. Collaborative information and multimedia repositories on the Web are a domain of increasing relevance, since users cooperatively add tags, labels, comments and hyperlinks to objects, reflecting the semantic relationships among those objects, with or without an underlying structure. As with so-called Big Data, path-finding methods for collaborative web repositories must cope with large size, a high degree of connectivity and the dynamic evolution of online networks, all of which make classical approaches ineffective. Experiments on a range of semantic measures show that HSW leads to better results than state-of-the-art search methods, and point out the features that make a proximity measure suitable for the Wikipedia concept network. The extracted semantic paths have many relevant applications, such as query expansion, synthesis of explanatory arguments, and simulation of user navigation.
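The general idea of using a semantic proximity measure as a search heuristic can be illustrated with a minimal sketch. This is not the authors' HSW implementation: it is a plain greedy best-first search over a tiny hand-made link graph, and the graph, the tag sets, the Jaccard-based `toy_distance`, and the function name `heuristic_walk` are all hypothetical choices made for illustration only. In a real setting the distance would be one of the proximity measures studied in the paper, and the graph would be Wikipedia's link structure.

```python
import heapq
from typing import Callable, Dict, List, Optional, Set

# Hypothetical toy link graph: concept -> concepts it links to.
GRAPH: Dict[str, List[str]] = {
    "Dog": ["Mammal", "Pet", "Wolf"],
    "Pet": ["Cat", "Dog"],
    "Mammal": ["Animal", "Whale"],
    "Wolf": ["Mammal", "Forest"],
    "Cat": ["Mammal", "Pet", "Mouse"],
}

# Hypothetical tag sets standing in for collaborative annotations.
TAGS: Dict[str, Set[str]] = {
    "Dog": {"animal", "domestic", "canine"},
    "Pet": {"animal", "domestic"},
    "Cat": {"animal", "domestic", "feline", "small"},
    "Mouse": {"animal", "rodent", "small"},
    "Mammal": {"animal"},
    "Animal": {"animal"},
    "Wolf": {"animal", "canine", "wild"},
    "Whale": {"animal", "marine"},
    "Forest": {"wild", "plant"},
}


def toy_distance(a: str, b: str) -> float:
    """Jaccard distance between tag sets (a stand-in proximity measure)."""
    ta, tb = TAGS.get(a, set()), TAGS.get(b, set())
    union = ta | tb
    if not union:
        return 1.0
    return 1.0 - len(ta & tb) / len(union)


def heuristic_walk(
    graph: Dict[str, List[str]],
    start: str,
    goal: str,
    distance: Callable[[str, str], float],
    max_expansions: int = 1000,
) -> Optional[List[str]]:
    """Greedy best-first search: expand the frontier concept closest to the goal."""
    frontier = [(distance(start, goal), start)]
    parent: Dict[str, Optional[str]] = {start: None}
    expansions = 0
    while frontier and expansions < max_expansions:
        _, node = heapq.heappop(frontier)
        if node == goal:
            # Rebuild the concept chain by following parent links back to start.
            path: List[str] = []
            current: Optional[str] = node
            while current is not None:
                path.append(current)
                current = parent[current]
            return path[::-1]
        expansions += 1
        for neighbour in graph.get(node, []):
            if neighbour not in parent:
                parent[neighbour] = node
                heapq.heappush(frontier, (distance(neighbour, goal), neighbour))
    return None  # goal not reached within the expansion budget


path = heuristic_walk(GRAPH, "Dog", "Mouse", toy_distance)
print(path)
```

The expansion budget (`max_expansions`) reflects the concern raised above about very large, highly connected networks: an uninformed search may need to visit a large part of the graph, whereas a heuristic can direct the walk toward semantically closer concepts. A greedy search of this kind does not guarantee the shortest chain.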
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Franzoni, V., Mencacci, M., Mengoni, P., Milani, A. (2014). Heuristics for Semantic Path Search in Wikipedia. In: Murgante, B., et al. Computational Science and Its Applications – ICCSA 2014. ICCSA 2014. Lecture Notes in Computer Science, vol 8584. Springer, Cham. https://doi.org/10.1007/978-3-319-09153-2_25
DOI: https://doi.org/10.1007/978-3-319-09153-2_25
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-09152-5
Online ISBN: 978-3-319-09153-2
eBook Packages: Computer Science (R0)