Abstract
Predicting the future impact of academic publications has many important applications. In this paper, we propose methods for predicting future article impact that leverage digital libraries of academic publications containing citation information. Using a sequence of past yearly impact scores, obtained through graph-ranking algorithms such as PageRank, we study how the impact of publications evolves over time, and we learn regression models that predict either the future PageRank scores or the future number of downloads. Results obtained over a DBLP citation dataset, covering papers published up to 2011, show that the impact predictions are highly accurate for all experimental setups. The best results were obtained by a model based on regression trees, using features derived from PageRank scores, PageRank change rates, author PageRank scores, and term occurrence frequencies in the abstracts and titles of the publications, computed over citation graphs from the three previous years.
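To make the feature construction concrete, the following is a minimal sketch, not the authors' implementation. It computes PageRank by power iteration over yearly snapshots of a toy citation graph and turns three successive yearly scores into a feature vector of scores plus change rates. The damping factor of 0.85, the iteration count, the toy data, and all function names are assumptions made for illustration. Author PageRank scores, term frequencies, and the regression model itself are left out.

```python
def pagerank(edges, nodes, damping=0.85, iterations=50):
    """Power-iteration PageRank over a citation graph.

    edges: (citing, cited) pairs; nodes: list of paper ids.
    Papers that cite nothing spread their score uniformly (dangling nodes).
    """
    n = len(nodes)
    out_links = {v: [] for v in nodes}
    for citing, cited in edges:
        out_links[citing].append(cited)
    score = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        dangling = sum(score[v] for v in nodes if not out_links[v])
        base = (1.0 - damping) / n + damping * dangling / n
        nxt = {v: base for v in nodes}
        for v in nodes:
            if out_links[v]:
                share = damping * score[v] / len(out_links[v])
                for w in out_links[v]:
                    nxt[w] += share
        score = nxt
    return score


def yearly_scores(pub_year, citations, years):
    """One PageRank vector per year, using only papers published up to that year."""
    result = {}
    for y in years:
        nodes = [p for p, py in pub_year.items() if py <= y]
        edges = [(a, b) for a, b in citations
                 if pub_year[a] <= y and pub_year[b] <= y]
        result[y] = pagerank(edges, nodes)
    return result


def features(paper, scores, years):
    """Past yearly scores followed by year-over-year change rates."""
    past = [scores[y].get(paper, 0.0) for y in years]
    rates = [(b - a) / a if a > 0 else 0.0 for a, b in zip(past, past[1:])]
    return past + rates


# Toy data (hypothetical): publication years and (citing, cited) pairs.
pub_year = {"A": 2007, "B": 2008, "C": 2009, "D": 2010}
citations = [("B", "A"), ("C", "A"), ("C", "B"), ("D", "A")]
years = [2008, 2009, 2010]  # the three previous years

scores = yearly_scores(pub_year, citations, years)
x_a = features("A", scores, years)  # 3 scores + 2 change rates
```

In a full pipeline, a vector such as `x_a` would be one training row for a regression model, with the paper's PageRank score in the following year as the target.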
This work was partially supported by Fundação para a Ciência e a Tecnologia (FCT), through project grants with references UTA-EST/MAI/0006/2009 (REACTION) and PTDC/EIA-EIA/109840/2009 (SInteliGIS), as well as through PEst-OE/EEI/LA0021/2013 (INESC-ID plurianual funding).
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bento, C., Martins, B., Calado, P. (2013). Predicting the Future Impact of Academic Publications. In: Correia, L., Reis, L.P., Cascalho, J. (eds) Progress in Artificial Intelligence. EPIA 2013. Lecture Notes in Computer Science, vol 8154. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40669-0_32
DOI: https://doi.org/10.1007/978-3-642-40669-0_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40668-3
Online ISBN: 978-3-642-40669-0
eBook Packages: Computer Science, Computer Science (R0)