Abstract
The volume of information on the web grows daily, and managing it is a difficult task. Users struggle to find the information they need and spend most of their time framing a suitable query and filtering the resulting web pages. Search engines play a major role in filtering information and ranking the desired results, yet retrieving accurate information remains an open problem. This paper presents an approach that tries to optimize the ranking algorithm by employing document clustering and similarity measures. We outline different ranking algorithms and propose an approach in which the PageRank algorithm is optimized using document clustering. The approach combines content mining with structure mining, which helps reduce the computational complexity of the algorithm and thereby the time needed to rank web pages.
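The abstract does not specify the algorithm, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a content step that keeps pages whose term-frequency cosine similarity to the query exceeds a threshold, followed by a structure step that runs standard power-iteration PageRank on the subgraph induced by those pages. The function names, the threshold, the damping factor, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_cluster(docs, query, threshold=0.1):
    """Content step (assumed): keep pages whose text is similar to the query."""
    q = Counter(query.lower().split())
    return {
        page
        for page, text in docs.items()
        if cosine(Counter(text.lower().split()), q) >= threshold
    }


def pagerank(links, nodes, damping=0.85, iters=50):
    """Structure step: PageRank on the subgraph induced by `nodes`."""
    members = set(nodes)
    if not members:
        return {}
    n = len(members)
    # Keep only links that stay inside the selected cluster.
    out = {p: [q for q in links.get(p, []) if q in members] for p in members}
    rank = {p: 1.0 / n for p in members}
    for _ in range(iters):
        # Rank held by pages with no outgoing links is spread uniformly.
        dangling = sum(rank[p] for p in members if not out[p])
        new = {p: (1.0 - damping) / n + damping * dangling / n for p in members}
        for p in members:
            for q in out[p]:
                new[q] += damping * rank[p] / len(out[p])
        rank = new
    return rank


# Hypothetical toy corpus and link graph.
docs = {
    "a": "pagerank ranking of web pages",
    "b": "web pages ranking with link analysis",
    "c": "cooking recipes for pasta",
    "d": "ranking web search results",
}
links = {"a": ["b", "c"], "b": ["a"], "c": ["a"], "d": ["a"]}

cluster = select_cluster(docs, "ranking web pages")
ranks = pagerank(links, cluster)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 4))
```

In this sketch the saving comes from iterating over the query-relevant cluster rather than the whole graph; whether the paper obtains its reduction in the same way cannot be determined from the abstract alone.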
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Irfan, S., Ghosh, S. (2020). Efficient Ranking Framework for Information Retrieval Using Similarity Measure. In: Smys, S., Tavares, J., Balas, V., Iliyasu, A. (eds) Computational Vision and Bio-Inspired Computing. ICCVBIC 2019. Advances in Intelligent Systems and Computing, vol 1108. Springer, Cham. https://doi.org/10.1007/978-3-030-37218-7_141
DOI: https://doi.org/10.1007/978-3-030-37218-7_141
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-37217-0
Online ISBN: 978-3-030-37218-7
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)