Abstract
A common task in many applications is to find persons who are knowledgeable about a given topic (i.e., expert finding). In this paper, we propose and develop a general probabilistic framework for studying the expert finding problem and derive two families of generative models (candidate generation models and topic generation models) from the framework. These models subsume most existing language models proposed for expert finding. We further propose several techniques to improve the estimation of the proposed models, including incorporating topic expansion, using a mixture model to model candidate mentions in the supporting documents, and defining an email count-based prior in the topic generation model. Our experiments show that the proposed estimation strategies are all effective in improving retrieval accuracy.
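As a rough illustration of what a topic generation model with a count-based prior might look like, the sketch below ranks candidates by log p(e) + Σ log p(t | e) over the topic terms t. It is a toy under stated assumptions, not the estimation procedure of the paper: the function name `rank_candidates`, the Jelinek-Mercer smoothing weight `lam`, the add-one prior on email counts, and the input layout are all hypothetical choices made for this example.

```python
import math
from collections import Counter


def rank_candidates(query_terms, candidate_docs, email_counts, lam=0.5):
    """Rank candidates by a toy topic generation score.

    candidate_docs: dict mapping candidate -> list of tokenized supporting documents.
    email_counts:   dict mapping candidate -> number of emails (used for the prior).
    lam:            assumed interpolation weight on the background model.
    """
    # Background (collection) model pooled over all supporting documents.
    collection = Counter()
    for docs in candidate_docs.values():
        for doc in docs:
            collection.update(doc)
    coll_total = sum(collection.values())

    scores = {}
    for cand, docs in candidate_docs.items():
        # Term frequencies over this candidate's supporting documents.
        tf = Counter()
        for doc in docs:
            tf.update(doc)
        length = sum(tf.values())

        # Assumed prior: unnormalized, proportional to (email count + 1).
        log_score = math.log(email_counts.get(cand, 0) + 1)

        for term in query_terms:
            if collection[term] == 0:
                # Terms unseen in the whole collection carry no evidence; skip them.
                continue
            p_cand = tf[term] / length if length else 0.0
            p_coll = collection[term] / coll_total
            # Jelinek-Mercer smoothing keeps the probability strictly positive.
            log_score += math.log((1 - lam) * p_cand + lam * p_coll)
        scores[cand] = log_score

    # Highest score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    docs = {
        "alice": [["language", "model", "retrieval"], ["retrieval", "smoothing"]],
        "bob": [["network", "routing"], ["routing", "protocol"]],
    }
    emails = {"alice": 2, "bob": 2}
    for name, score in rank_candidates(["retrieval", "model"], docs, emails):
        print(name, round(score, 3))
```

A candidate generation model would instead estimate p(e | q) directly, for example from candidate mentions in documents retrieved for the topic; that direction is not shown here.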
Copyright information
© 2007 Springer Berlin Heidelberg
About this paper
Cite this paper
Fang, H., Zhai, C. (2007). Probabilistic Models for Expert Finding. In: Amati, G., Carpineto, C., Romano, G. (eds) Advances in Information Retrieval. ECIR 2007. Lecture Notes in Computer Science, vol 4425. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71496-5_38
DOI: https://doi.org/10.1007/978-3-540-71496-5_38
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-71494-1
Online ISBN: 978-3-540-71496-5
eBook Packages: Computer Science, Computer Science (R0)