Clustering Ensembles Based on Normalized Edges

Li, Yan; Yu, Jian; Hao, Pengwei; Li, Zhulin

doi:10.1007/978-3-540-71701-0_71

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4426)

Included in the following conference series: Pacific-Asia Conference on Knowledge Discovery and Data Mining

Abstract

The co-association (CA) matrix was previously introduced to combine multiple partitions. In this paper, we analyze the CA matrix and discuss how it differs from a similarity matrix based on Euclidean distance. We also explore how to choose a more suitable algorithm for obtaining the final partition from the CA matrix. To obtain more robust and reasonable clustering ensemble results, we propose a new hierarchical clustering algorithm built on a novel concept, normalized edges, which measures the similarity between clusters. We compare the proposed approach experimentally with single runs of several well-known clustering algorithms and with other ensemble methods; the comparison demonstrates the effectiveness of our algorithm.
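
As a rough illustration of the CA matrix mentioned above, the sketch below builds one from a small set of partitions, assuming the commonly used definition in which entry (i, j) is the fraction of partitions that place objects i and j in the same cluster. The function name `co_association` and the toy label vectors are hypothetical and are not taken from the paper; the paper's normalized-edges algorithm itself is not reproduced here.

```python
import numpy as np

def co_association(partitions):
    """Return the n x n co-association matrix for a list of label vectors.

    Assumed definition: entry (i, j) is the fraction of partitions in which
    objects i and j receive the same cluster label.
    """
    labels = np.asarray(partitions)  # shape: (num_partitions, n)
    # same[p, i, j] is True when partition p puts objects i and j together
    same = labels[:, :, None] == labels[:, None, :]
    return same.mean(axis=0)

# Three hypothetical partitions of four objects (label values are arbitrary).
partitions = [
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
]
ca = co_association(partitions)
print(ca)
```

Because each entry counts co-occurrences across partitions rather than measuring a geometric distance, the resulting matrix does not depend on how the cluster labels are numbered in each partition.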

This work was partially supported by the National Natural Science Foundation under Grant No. 60303014, the Fok Ying Tung Education Foundation under Grant No. 101068, the Specialized Research Fund of Doctoral Program of Higher Education of China under Grant No. 20050004008, and the Foundation for the Authors of National Excellent Doctoral Dissertation, China, under Grant No. 200038.

Author information

Authors and Affiliations

Center for Information Science, Peking University, Beijing, 100871, China
Yan Li, Pengwei Hao & Zhulin Li
Inst. of Computer Science & Engineering, Beijing Jiaotong Univ., Beijing, 100044, China
Jian Yu
Dept. of Computer Science, Queen Mary, Univ. of London, London, E1 4NS, UK
Pengwei Hao

Editor information

Zhi-Hua Zhou Hang Li Qiang Yang

About this paper

Cite this paper

Li, Y., Yu, J., Hao, P., Li, Z. (2007). Clustering Ensembles Based on Normalized Edges. In: Zhou, ZH., Li, H., Yang, Q. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2007. Lecture Notes in Computer Science, vol 4426. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71701-0_71

DOI: https://doi.org/10.1007/978-3-540-71701-0_71
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-71700-3
Online ISBN: 978-3-540-71701-0
eBook Packages: Computer Science, Computer Science (R0)
