Optimal algorithms for comparing trees with labeled leaves

Day, William H. E.

doi:10.1007/BF01908061

Published: December 1985

Volume 2, pages 7–28, (1985)

Journal of Classification

Abstract

Let R_n denote the set of rooted trees with n leaves in which: the leaves are labeled by the integers in {1, ..., n}; and among interior vertices only the root may have degree two. Associated with each interior vertex v in such a tree is the subset, or cluster, of leaf labels in the subtree rooted at v. Cluster {1, ..., n} is called trivial. Clusters are used in quantitative measures of similarity, dissimilarity and consensus among trees. For any k trees in R_n, the strict consensus tree C(T₁, ..., T_k) is that tree in R_n containing exactly those clusters common to every one of the k trees. Similarity between trees T₁ and T₂ in R_n is measured by the number S(T₁, T₂) of nontrivial clusters in both T₁ and T₂; dissimilarity, by the number D(T₁, T₂) of clusters in T₁ or T₂ but not in both. Algorithms are known to compute C(T₁, ..., T_k) in O(kn²) time, and S(T₁, T₂) and D(T₁, T₂) in O(n²) time. I propose a special representation of the clusters of any tree T ∈ R_n, one that permits testing in constant time whether a given cluster exists in T. I describe algorithms that exploit this representation to compute C(T₁, ..., T_k) in O(kn) time, and S(T₁, T₂) and D(T₁, T₂) in O(n) time. These algorithms are optimal in a technical sense. They enable well-known indices of consensus between two trees to be computed in O(n) time. All these results apply as well to comparable problems involving unrooted trees with labeled leaves.

Author information

Authors and Affiliations

Department of Computer Science, Memorial University of Newfoundland, A1C 5S7, St. John's, Newfoundland, Canada
William H. E. Day

Additional information

The Natural Sciences and Engineering Research Council of Canada partially supported this work with grant A-4142.

About this article

Cite this article

Day, W.H.E. Optimal algorithms for comparing trees with labeled leaves. Journal of Classification 2, 7–28 (1985). https://doi.org/10.1007/BF01908061

Issue Date: December 1985
DOI: https://doi.org/10.1007/BF01908061
