Evidence for a Relationship Between Algorithmic Scheme and Shape of Inferred Trees

  • Chapter
Data Analysis

Abstract

Agglomeration and addition are the two main algorithmic schemes for constructing a tree distance from a dissimilarity matrix. Agglomeration iteratively merges pairs of leaves or clusters into larger and larger clusters, while addition inserts objects one at a time into a growing tree. A third approach, exchange, improves the global fit of an initial tree by swapping subtrees. This article suggests that the shape of inferred trees depends in part on the chosen algorithmic scheme: agglomeration tends to produce compact, bushy trees, while addition and exchange favor sparse, chain-like trees. This phenomenon is explained by the difference between the a priori probability distributions induced by each scheme. An illustration is provided with the Mitochondrial Eve data set (Vigilant et al. 1991), and the practical impacts are discussed.
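
The contrast between the two schemes can be made concrete with a small simulation. The sketch below is not the chapter's procedure: it assumes that, on uninformative data, agglomeration merges a uniformly random pair of clusters and addition inserts each new leaf on a uniformly random edge of the growing (rooted) tree. The function names `random_agglomeration`, `random_addition`, `count_cherries`, and `mean_cherries` are hypothetical, and the number of cherries (internal nodes whose two children are both leaves, the statistic studied by McKenzie and Steel 2000) is used here as a rough proxy for bushiness.

```python
import random


def random_agglomeration(n, rng):
    """Merge two uniformly chosen clusters until one rooted tree remains.

    Assumption: uniform choice stands in for uninformative dissimilarities.
    Trees are nested tuples; leaves are the integers 0..n-1.
    """
    clusters = list(range(n))
    while len(clusters) > 1:
        i, j = rng.sample(range(len(clusters)), 2)
        merged = (clusters[i], clusters[j])
        # Remove the higher index first so the lower index stays valid.
        for k in sorted((i, j), reverse=True):
            clusters.pop(k)
        clusters.append(merged)
    return clusters[0]


def _to_nested(node, children):
    """Convert the id-based tree built by random_addition to nested tuples."""
    if node not in children:
        return node
    a, b = children[node]
    return (_to_nested(a, children), _to_nested(b, children))


def random_addition(n, rng):
    """Add leaves 1..n-1 one at a time on a uniformly chosen edge.

    Each node owns the edge above it; the root owns a stem edge, so a tree
    with k leaves offers 2k - 1 insertion points.
    """
    children = {}          # internal node id -> [child, child]
    parent = {0: None}     # node id -> parent id (None for the root)
    root = 0
    next_id = n            # internal node ids start after the leaf ids
    for leaf in range(1, n):
        v = rng.choice(list(parent))
        w = next_id
        next_id += 1
        p = parent[v]
        children[w] = [v, leaf]
        parent[w] = p
        parent[v] = w
        parent[leaf] = w
        if p is None:
            root = w
        else:
            children[p][children[p].index(v)] = w
    return _to_nested(root, children)


def count_cherries(tree):
    """Count internal nodes whose two children are both leaves."""
    if not isinstance(tree, tuple):
        return 0
    a, b = tree
    if not isinstance(a, tuple) and not isinstance(b, tuple):
        return 1
    return count_cherries(a) + count_cherries(b)


def mean_cherries(builder, n, trials, seed):
    """Average cherry count over `trials` random trees with n leaves."""
    rng = random.Random(seed)
    return sum(count_cherries(builder(n, rng)) for _ in range(trials)) / trials
```

Calling `mean_cherries(random_agglomeration, 30, 2000, 1)` and `mean_cherries(random_addition, 30, 2000, 1)` gives one average per scheme. Under these assumptions the agglomerated trees should show more cherries on average, which is in line with the compact versus chain-like contrast described in the abstract.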

References

  • BARTHÉLEMY, J.P. and A. GUÉNOCHE (1991): Trees and proximity representations. Wiley, Chichester.

  • BOCK, H.H. (1996): Probabilistic models in cluster analysis. Computational Statistics and Data Analysis, 23, 5–28.

  • BROWN, J.K.M. (1994): Probabilities of evolutionary trees. Systematic Biology, 43, 78–91.

  • DAY, W.H.E. (1987): Computational Complexity of Inferring Phylogenies from Dissimilarity Matrices. Bulletin of Mathematical Biology, 49, 461–467.

  • EDWARDS, A.W.F. (1970): Estimation of branch points of a branching diffusion process. J. of the Royal Statistical Society B, 32, 155–174.

  • ERDŐS, P.L., M. STEEL, L.A. SZÉKELY, and T.J. WARNOW (1999): A few logs suffice to build (almost) all trees: Part II. Theoretical Computer Science, 221, 77–118.

  • FARRIS, J.S. (1970): Methods for computing Wagner trees. Systematic Zoology, 34, 21–34.

  • FELSENSTEIN, J. (1993): PHYLIP (Phylogeny Inference Package), version 3.5c. Distributed by the author.

  • FELSENSTEIN, J. (1997): An alternating least-squares approach to inferring phylogenies from pairwise distances. Systematic Biology, 46, 101–111.

  • FLOREK, K., J. LUKASZEWICZ, J. PERKAL, H. STEINHAUS, and S. ZUBRZYCKI (1951): Sur la liaison et la division des points d'un ensemble fini. Colloquium Mathematicum, 2, 282–285.

  • GASCUEL, O. (1997a): BIONJ: an improved version of the NJ algorithm based on a simple model of sequence data. Molecular Biology and Evolution, 14, 685–695.

  • GUÉNOCHE, A. and P. PRÉA (1998): Counting and selecting at random phylogenetic topologies to compare reconstruction methods. Proc. of the Conf. of the International Federation of Classification Societies (IFCS'98), short papers volume, 242–245.

  • HARDING, E.F. (1971): The probabilities of rooted-tree shapes generated by random bifurcation. Advances in Applied Probability, 3, 44–77.

  • MCKENZIE, A. and M. STEEL (2000): Distributions of cherries for two models of trees. Mathematical Biosciences, 164, 81–92.

  • MOOERS, A. and S.B. HEARD (1997): Inferring evolutionary process from phylogenetic tree shape. The Quarterly Review of Biology, 72, 31–54.

  • PAGE, R.D.M. (1991): Random dendrograms and null hypotheses in cladistic biogeography. Systematic Zoology, 40, 54–62.

  • RZHETSKY, A. and M. NEI (1993): Theoretical Foundation of the Minimum-Evolution Method of Phylogenetic Inference. Molecular Biology and Evolution, 10, 1073–1095.

  • SAITOU, N. (1988): Property and efficiency of the maximum likelihood method for molecular phylogeny. Journal of Molecular Evolution, 27, 261–273.

  • SAITOU, N. and M. NEI (1987): The neighbor-joining method: a new method for reconstructing phylogenetic trees. Molecular Biology and Evolution, 4, 406–425.

  • SATTATH, S. and A. TVERSKY (1977): Additive similarity trees. Psychometrika, 42, 319–345.

  • SORENSEN, T. (1948): A method of establishing groups of equal amplitude in plant sociology based on similarity of species content and its application to analyses of the vegetation on Danish commons. Biologiske Skrifter, 5, 1–34.

  • SWOFFORD, D.L., G.J. OLSEN, P.J. WADDELL, and D.M. HILLIS (1996): Phylogenetic Inference. In D.M. Hillis, C. Moritz and B.K. Mable (eds.): Molecular Systematics, Sinauer, Sunderland (MA), 402–514.

  • VIGILANT, L., M. STONEKING, H. HARPENDING, K. HAWKES, and A.C. WILSON (1991): African populations and the evolution of human mitochondrial DNA. Science, 253, 1503–1507.

  • YULE, G.U. (1924): A mathematical theory of evolution, based on the conclusions of Dr. J.C. Willis, F.R.S. Philosophical Transactions of the Royal Society of London B, 213, 21–87.

Copyright information

© 2000 Springer-Verlag Berlin · Heidelberg

About this chapter

Cite this chapter

Gascuel, O. (2000). Evidence for a Relationship Between Algorithmic Scheme and Shape of Inferred Trees. In: Gaul, W., Opitz, O., Schader, M. (eds) Data Analysis. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-58250-9_13

  • DOI: https://doi.org/10.1007/978-3-642-58250-9_13

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-67731-4

  • Online ISBN: 978-3-642-58250-9

  • eBook Packages: Springer Book Archive
