Solving Non-Uniqueness in Agglomerative Hierarchical Clustering Using Multidendrograms

Fernández, Alberto; Gómez, Sergio

doi:10.1007/s00357-008-9004-x

Published: 26 June 2008

Volume 25, pages 43–65, (2008)

Journal of Classification

Abstract

In agglomerative hierarchical clustering, pair-group methods suffer from non-uniqueness when two or more distances between different clusters coincide during the amalgamation process. The traditional remedy has been to break ties between distances by some arbitrary criterion, which yields different hierarchical classifications depending on the criterion chosen. In this article we propose a variable-group algorithm that groups more than two clusters at the same time when ties occur. We give a tree representation for the results of the algorithm, which we call a multidendrogram, as well as a generalization of Lance and Williams’ formula that enables the algorithm to be implemented recursively.
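
The variable-group idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes complete linkage, recomputes cluster distances directly from the original matrix instead of using the generalized Lance and Williams formula, and records a single merge height per step. The function name `variable_group_clustering` and the tolerance `tol` are hypothetical names chosen for illustration.

```python
from itertools import combinations


def variable_group_clustering(dist, tol=1e-12):
    """Agglomerate clusters, merging all tied groups in the same step.

    dist is a symmetric matrix (list of lists) of pairwise distances.
    Returns a list of (height, groups) steps, where each group lists
    the clusters that were merged together at that height.
    """
    clusters = [frozenset([i]) for i in range(len(dist))]

    def linkage(a, b):
        # Assumption: complete linkage (largest pairwise distance).
        return max(dist[i][j] for i in a for j in b)

    steps = []
    while len(clusters) > 1:
        pairs = {(a, b): linkage(a, b) for a, b in combinations(clusters, 2)}
        dmin = min(pairs.values())
        # Every pair whose distance equals the minimum (within tol) is tied.
        tied = [p for p, d in pairs.items() if d - dmin <= tol]

        # Union-find: clusters connected by tied pairs form one group.
        parent = {c: c for c in clusters}

        def find(c):
            while parent[c] != c:
                c = parent[c]
            return c

        for a, b in tied:
            parent[find(a)] = find(b)

        groups = {}
        for c in clusters:
            groups.setdefault(find(c), []).append(c)

        merged = [g for g in groups.values() if len(g) > 1]
        steps.append((dmin, [sorted(sorted(c) for c in g) for g in merged]))
        # Untouched clusters pass through as single-element groups.
        clusters = [frozenset().union(*g) for g in groups.values()]
    return steps


# Three points where d(0,1) and d(1,2) tie: a pair-group method must
# pick one pair arbitrarily, whereas here all three merge at once.
d = [[0, 1, 2],
     [1, 0, 1],
     [2, 1, 0]]
print(variable_group_clustering(d))  # [(1, [[[0], [1], [2]]])]
```

With the tie between d(0,1) and d(1,2), a pair-group method would produce either ((0,1),2) or (0,(1,2)) depending on the tie-breaking rule, while the sketch returns a single result in which the three points join in one step.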

References

ARNAU, V., MARS, S., and MARÍN, I. (2005), “Iterative Cluster Analysis of Protein Interaction Data,” Bioinformatics, 21(3), 364–378.
BACKELJAU, T., DE BRUYN, L., DE WOLF, H., JORDAENS, K., VAN DONGEN, S., and WINNEPENNINCKX, B. (1996), “Multiple UPGMA and Neighbor-Joining Trees and the Performance of Some Computer Packages,” Molecular Biology and Evolution, 13(2), 309–313.
CORMACK, R.M. (1971), “A Review of Classification” (with discussion), Journal of the Royal Statistical Society, Ser. A, 134, 321–367.
GORDON, A.D. (1999), Classification (2nd ed.), London/Boca Raton, FL: Chapman & Hall/CRC.
HART, G. (1983), “The Occurrence of Multiple UPGMA Phenograms,” in Numerical Taxonomy, ed. J. Felsenstein, Berlin Heidelberg: Springer-Verlag, pp. 254–258.
LANCE, G.N., and WILLIAMS, W.T. (1966), “A Generalized Sorting Strategy for Computer Classifications,” Nature, 212, 218.
MACCUISH, J., NICOLAOU, C., and MACCUISH, N.E. (2001), “Ties in Proximity and Clustering Compounds,” Journal of Chemical Information and Computer Sciences, 41, 134–146.
MORGAN, B.J.T., and RAY, A.P.G. (1995), “Non-uniqueness and Inversions in Cluster Analysis,” Applied Statistics, 44(1), 117–134.
SNEATH, P.H.A., and SOKAL, R.R. (1973), Numerical Taxonomy: The Principles and Practice of Numerical Classification, San Francisco: W. H. Freeman and Company.
SZÉKELY, G.J., and RIZZO, M.L. (2005), “Hierarchical Clustering via Joint Between-Within Distances: Extending Ward’s Minimum Variance Method,” Journal of Classification, 22, 151–183.
VAN DER KLOOT, W.A., SPAANS, A.M.J., and HEISER, W.J. (2005), “Instability of Hierarchical Cluster Analysis Due to Input Order of the Data: The PermuCLUSTER Solution,” Psychological Methods, 10(4), 468–476.
WARD, J.H., Jr. (1963), “Hierarchical Grouping to Optimize an Objective Function,” Journal of the American Statistical Association, 58, 236–244.

Author information

Authors and Affiliations

Universitat Rovira i Virgili, Tarragona, Spain
Alberto Fernández & Sergio Gómez
Departament d’Enginyeria Informàtica i Matemàtiques, Universitat Rovira i Virgili, Campus Sescelades, Avinguda dels Països Catalans 26, E–43007, Tarragona, Spain
Sergio Gómez

Corresponding author

Correspondence to Sergio Gómez.

Additional information

The authors thank A. Arenas for discussion and helpful comments. This work was partially supported by DGES of the Spanish Government Project No. FIS2006–13321–C02–02 and by a grant of Universitat Rovira i Virgili.

About this article

Cite this article

Fernández, A., Gómez, S. Solving Non-Uniqueness in Agglomerative Hierarchical Clustering Using Multidendrograms. J Classif 25, 43–65 (2008). https://doi.org/10.1007/s00357-008-9004-x

Issue Date: June 2008
