Abstract
In this paper, we present a novel semantic-aware clustering approach for partitioning experts who are represented by lists of keywords. First, a common set of distinct keywords is formed by pooling the keywords of all expert profiles. The semantic distance between each pair of keywords is then calculated, and the keywords are partitioned using a clustering algorithm. Each expert is subsequently represented by a vector of the expert's membership degrees in the different keyword clusters. Finally, the Euclidean distance between each pair of vectors is calculated, and the experts are clustered by applying a suitable partitioning algorithm.
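The pipeline outlined in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the expert profiles are toy data, `semantic_distance` is a hypothetical lookup standing in for a real semantic measure, the membership degree is taken to be the fraction of an expert's keywords that fall in each cluster, and a simple single-link threshold grouping stands in for whichever clustering and partitioning algorithms the authors actually use.

```python
from itertools import combinations
from math import dist

# Toy expert profiles (hypothetical data, for illustration only).
experts = {
    "alice": ["clustering", "classification", "data mining"],
    "bob": ["classification", "clustering"],
    "carol": ["genomics", "proteomics"],
    "dave": ["proteomics", "genomics", "data mining"],
}

# Step 1: pool the distinct keywords of all expert profiles.
keywords = sorted({kw for kws in experts.values() for kw in kws})

# Step 2: placeholder semantic distance in [0, 1] (assumption: a real
# system would derive this from a lexical resource, not a lookup table).
TOPIC = {
    "clustering": "ml", "classification": "ml", "data mining": "ml",
    "genomics": "bio", "proteomics": "bio",
}

def semantic_distance(a, b):
    if a == b:
        return 0.0
    return 0.2 if TOPIC[a] == TOPIC[b] else 0.9

# Steps 3 and 6: stand-in clustering. Items are grouped into the connected
# components of the graph that links every pair with distance <= threshold.
def threshold_clusters(items, distance, threshold):
    label = {item: i for i, item in enumerate(items)}
    for a, b in combinations(items, 2):
        if distance(a, b) <= threshold and label[a] != label[b]:
            old, new = label[b], label[a]
            for item in items:
                if label[item] == old:
                    label[item] = new
    groups = {}
    for item in items:
        groups.setdefault(label[item], []).append(item)
    return list(groups.values())

keyword_clusters = threshold_clusters(keywords, semantic_distance, 0.5)

# Step 4: membership vector of an expert (assumption: the share of the
# expert's keywords that belong to each keyword cluster).
def membership_vector(profile):
    return [sum(kw in cluster for kw in profile) / len(profile)
            for cluster in keyword_clusters]

vectors = {name: membership_vector(kws) for name, kws in experts.items()}

# Step 5: Euclidean distance between two experts' membership vectors.
def expert_distance(a, b):
    return dist(vectors[a], vectors[b])

# Step 6: partition the experts using the same stand-in clustering.
expert_clusters = threshold_clusters(list(experts), expert_distance, 0.5)
print(expert_clusters)
```

With these toy inputs the keywords fall into two clusters, and the experts are grouped according to which cluster dominates their profiles. Both thresholds (0.5) are arbitrary values chosen for the toy data.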
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Boeva, V., Boneva, L., Tsiporkova, E. (2014). Semantic-Aware Expert Partitioning. In: Agre, G., Hitzler, P., Krisnadhi, A.A., Kuznetsov, S.O. (eds) Artificial Intelligence: Methodology, Systems, and Applications. AIMSA 2014. Lecture Notes in Computer Science, vol 8722. Springer, Cham. https://doi.org/10.1007/978-3-319-10554-3_2
DOI: https://doi.org/10.1007/978-3-319-10554-3_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10553-6
Online ISBN: 978-3-319-10554-3
eBook Packages: Computer Science, Computer Science (R0)