Abstract
The co-association (CA) matrix was previously introduced to combine multiple partitions. In this paper, we analyze the CA matrix and examine how it differs from a similarity matrix based on Euclidean distance. We also explore how to choose a suitable algorithm for deriving the final partition from the CA matrix. To obtain more robust and reasonable clustering ensemble results, we propose a new hierarchical clustering algorithm built on a novel concept, normalized edges, which measures the similarity between clusters. We compare the proposed approach experimentally with single runs of several well-known clustering algorithms and with other ensemble methods; the comparison demonstrates the effectiveness of our algorithm.
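For readers unfamiliar with the CA matrix, the following is a minimal sketch of how it is commonly built, assuming the usual evidence-accumulation definition in which entry (i, j) is the fraction of base partitions that place objects i and j in the same cluster. The function name `co_association` and the toy partitions are illustrative assumptions, not taken from the paper, and the sketch does not implement the normalized-edges algorithm itself.

```python
import numpy as np


def co_association(partitions):
    """Return the n x n co-association matrix for a list of label vectors.

    Assumed definition: entry (i, j) is the fraction of partitions in
    which objects i and j receive the same cluster label.
    """
    labels = np.asarray(partitions)  # shape: (num_partitions, n)
    # same[p, i, j] is True when partition p puts i and j in one cluster.
    same = labels[:, :, None] == labels[:, None, :]
    return same.mean(axis=0)


# Three hypothetical base partitions of four objects.
partitions = [
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
]
ca = co_association(partitions)
print(ca)
```

Only co-membership matters, so the label values need not agree across partitions: the third partition uses swapped labels relative to the first yet contributes the same evidence. The resulting matrix can then be handed to any similarity-based clustering method to extract a final partition.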
This work was partially supported by the National Natural Science Foundation under Grant No. 60303014, the Fok Ying Tung Education Foundation under Grant No. 101068, the Specialized Research Fund for the Doctoral Program of Higher Education of China under Grant No. 20050004008, and the Foundation for the Authors of National Excellent Doctoral Dissertation, China, under Grant No. 200038.
© 2007 Springer Berlin Heidelberg
About this paper
Cite this paper
Li, Y., Yu, J., Hao, P., Li, Z. (2007). Clustering Ensembles Based on Normalized Edges. In: Zhou, ZH., Li, H., Yang, Q. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2007. Lecture Notes in Computer Science, vol 4426. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71701-0_71
DOI: https://doi.org/10.1007/978-3-540-71701-0_71
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-71700-3
Online ISBN: 978-3-540-71701-0
eBook Packages: Computer Science, Computer Science (R0)