Abstract
This work presents a kernel method for clustering the nodes of a weighted, undirected, graph. The algorithm is based on a two-step procedure. First, the sigmoid commute-time kernel (\(\mathbf{K}_{\text{CT}}\)), providing a similarity measure between any couple of nodes by taking the indirect links into account, is computed from the adjacency matrix of the graph. Then, the nodes of the graph are clustered by performing a kernel k-means or fuzzy k-means on this CT kernel matrix. For this purpose, a new, simple, version of the kernel k-means and the kernel fuzzy k-means is introduced. The joint use of the CT kernel matrix and kernel clustering appears to be quite successful. Indeed, it provides good results on a document clustering problem involving the newsgroups database.
Access provided by Autonomous University of Puebla. Download to read the full chapter text
Chapter PDF
Similar content being viewed by others
References
Berry, M.W.: Survey of Text Mining. Springer, New York (2003)
Dhillon, I.S., Guan, Y., Kulis, B.: Kernel k-means, spectral clustering and normalized cuts. In: Proceedings of the 2004 ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 551–556. ACM Press, New York (2004)
Dhillon, I.S., Modha, D.S.: Concept decompositions for large sparse text data using clustering. Machine Learning 42(1), 143–175 (2001), citeseer.ist.psu.edu/article/dhillon01concept.html
Ding, C., He, X.: Linearized cluster assignment via spectral ordering. In: ICML ’04: Proceedings of the twenty-first international conference on Machine learning, Banff, Alberta, Canada, p. 30. ACM Press, New York (2004)
Drineas, P., et al.: Clustering large graphs via the singular value decomposition. Machine Learning 56(1-3), 9–33 (2004)
Everitt, B.S., Landau, S., Leese, M.: Cluster Analysis. Arnold Publishers, New Delhi (2001)
Flake, G.W., Tarjan, R.E., Tsioutsiouliklis, K.: Graph clustering and minimum cut trees. Internet Math. 1(4), 385–408 (2003)
Fouss, F., et al.: Random-walk computation of similarities between nodes of a graph, with application to collaborative recommendation. IEEE Transactions on Knowledge and Data Engineering 19(3), 355–369 (2007)
Girolami, M.: Mercer kernel-based clustering in feature space. IEEE Transactions on Neural Networks 13(3), 780–784 (2002)
Kim, D.-W., et al.: Evaluation of the performance of clustering algorithms in kernel-induced feature space. Pattern Recognition 38(4), 607–611 (2005)
Newman, M.E.J.: Detecting community structure in networks. The European Physical Journal B 38, 321–330 (2004)
Ng, A.Y., Jordan, M.I., Weiss, Y.: On spectral clustering: Analysis and an algorithm. In: Dietterich, T., Becker, S., Ghahramani, Z. (eds.) Advances in Neural Information Processiong Systems, vol. 14, Vancouver, Canada, pp. 849–856. MIT Press, Cambridge (2001)
Porter, M.F.: An algorithm for suffix stripping. Program 14(3), 130–137 (1980)
Saerens, M., et al.: The principal components analysis of a graph, and its relationships to spectral clustering. In: Boulicaut, J.-F., et al. (eds.) ECML 2004. LNCS (LNAI), vol. 3201, pp. 371–383. Springer, Heidelberg (2004)
Scholkopf, B., Smola, A.: Learning with kernels. MIT Press, Cambridge (2002)
Shawe-Taylor, J., Cristianini, N.: Kernel Methods for Pattern Analysis. Cambridge University Press, Cambridge (2004)
van Dongen, S.: Graph Clustering by Flow Simulation. PhD thesis, University of Utrecht (2000)
Weiss, S., et al.: Text Mining: Predictive Methods for Analyzing Unstructured Information. Springer, Heidelberg (2004)
White, S., Smyth, P.: A spectral clustering approach to finding communities in graph. In: SDM (2005)
Wu, Z.-D., Xie, W.-X., Yu, J.-P.: Fuzzy c-means clustering algorithm based on kernel method. In: ICCIMA ’03: Proceedings of the 5th International Conference on Computational Intelligence and Multimedia Applications, Washington, DC, USA, p. 49. IEEE Computer Society Press, Los Alamitos (2003)
Zha, H., et al.: Bipartite graph partitioning and data clustering. In: Proc. of ACM 10th Int’l Conf. Information and Knowledge Management (CIKM 2001), pp. 25–32. ACM Press, New York (2001)
Zhang, D.-Q., Chen, S.-C.: Fuzzy clustering using kernel method. In: Proceedings of the 2002 International Conference on Control and Automation, ICCA, pp. 162–163 (2002)
Zhang, D.-Q., Chen, S.-C.: A novel kernelized fuzzy c-means algorithm with application in medical image segmentation. Artificial Intelligence in Medicine 32(1), 37–50 (2004)
Author information
Authors and Affiliations
Editor information
Rights and permissions
Copyright information
© 2007 Springer Berlin Heidelberg
About this paper
Cite this paper
Yen, L., Fouss, F., Decaestecker, C., Francq, P., Saerens, M. (2007). Graph Nodes Clustering Based on the Commute-Time Kernel. In: Zhou, ZH., Li, H., Yang, Q. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2007. Lecture Notes in Computer Science(), vol 4426. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71701-0_117
Download citation
DOI: https://doi.org/10.1007/978-3-540-71701-0_117
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-71700-3
Online ISBN: 978-3-540-71701-0
eBook Packages: Computer ScienceComputer Science (R0)