Abstract
We show that the neighbor-joining algorithm is a robust quartet method for constructing trees from distances. This leads to a new performance guarantee that contains Atteson’s optimal radius bound as a special case and explains many cases where neighbor-joining is successful even when Atteson’s criterion is not satisfied. We also provide a proof for Atteson’s conjecture on the optimal edge radius of the neighbor-joining algorithm. The strong performance guarantees we provide also hold for the quadratic time fast neighbor-joining algorithm, thus providing a theoretical basis for inferring very large phylogenies with neighbor-joining.
Article PDF
Similar content being viewed by others
Avoid common mistakes on your manuscript.
References
Atteson, K.: The performance of neighbor-joining methods of phylogenetic reconstruction. Algorithmica 25, 251–278 (1999)
Bruno, W.J., Socci, N.D., Halpern, A.L.: Weighted neighbor-joining: a likelihood-based approach to distance-based phylogeny reconstruction. Mol. Biol. Evol. 17(1), 189–197 (2000)
Bryant, D.: On the uniqueness of the selection criterion in neighbor-joining. J. Classif. 22(1), 3–15 (2005)
Dai, W., Xu, Y., Zhu, B.: On the edge l ∞ radius of Saitou and Nei’s method for phylogenetic reconstruction. Theor. Comput. Sci. 369(1–3), 448–455 (2006)
Desper, R., Gascuel, O.: The minimum evolution distance-based approach to phylogenetic inference. In: Gascuel, O. (ed.) Mathematics of Evolution and Phylogeny. Oxford University Press, Oxford (2005)
Elias, I., Lagergren, J.: Fast neighbor joining. In: Proceedings of the International Colloquium on Automata, Languages and Programming (ICALP’05) (2005)
Erdös, P.L., Steel, M.A., Székely, L.A., Warnow, T.J.: A few logs suffice to build (almost) all trees, I. Random Struct. Algorithms 14(2), 153–184 (1999)
Farris, J.S., Albert, V.A., Källersjö, M., Lipscomb, D., Kluge, A.G.: Parsimony jackknifing outperforms neighbor-joining. Cladistics 12, 99–124 (1996)
Felsenstein, J.: PHYLIP (phylogeny inference package) version 3.5c. Tech. report, Department of Genetics, University of Washington, Seattle (1993)
Gascuel, O.: A note on Sattath and Tversky’s, Saitou and Nei’s, and Studier and Keppler’s algorithms for inferring phylogenies from evolutionary distances. Mol. Biol. Evol. 11(6), 961–963 (1994)
Gascuel, O.: BIONJ: an improved version of the NJ algorithm based on a simple model of sequence data. Mol. Biol. Evol. 14(7), 685–695 (1997)
Gascuel, O., Steel, M.: Neighbor-joining revealed. Mol. Biol. Evol. 23(11), 1997–2000 (2006)
Hall, B.G.: Comparison of the accuracies of several phylogenetic methods using protein and DNA sequences. Mol. Biol. Evol. 22(3), 792–802 (2005)
Huelsenbeck, J., Hillis, D.: Success of phylogenetic methods in the four-taxon case. Syst. Biol. 42(3), 247–264 (1993)
John, K.St., Warnow, T., Moret, B., Vawter, L.: Performance study of phylogenetic methods: (unweighted) quartet methods and neighbor joining. J. Algorithms 48, 174–193 (2003)
Kuhner, M.K., Felsenstein, J.: A simulation comparison of phylogeny algorithms under equal and unequal evolutionary rates. Mol. Biol. Evol. 11, 459–468 (1994)
Kumar, S., Gadagker, S.R.: Efficiency of the neighbor-joining method in reconstructing evolutionary relationships in large phylogenies. J. Mol. Evol. 51, 544–553 (2000)
Levy, D., Yoshida, R., Pachter, L.: Beyond pairwise distances: neighbor joining with phylogenetic diversity estimates. Mol. Biol. Evol. 23, 491–498 (2006)
Olsen, G.J., Matsuda, H., Hagstrom, R., Overbeek, R.: fastDNAml: a tool for construction of phylogenetic trees of DNA sequences using maximum likelihood. Comput. Appl. Biosci. 10, 41–48 (1994)
Ota, S., Li, W.H.: NJML: a hybrid algorithm for the neighbor-joining and maximum likelihood methods. Mol. Biol. Evol. 17(9), 1401–1409 (2000)
Pauplin, Y.: Direct calculation of tree length using a distance matrix. J. Mol. Evol. 51, 41–47 (2000)
Rambaut, A., Grassly, N.C.: Seq-Gen: an application for the Monte Carlo simulation of DNA sequences evolution along phylogenetic trees. Comput. Appl. Biosci. 13, 235–238 (1997)
Ranwez, V., Gascuel, O.: Improvement of distance-based phylogenetic methods by a local maximum likelihood approach using triplets. Mol. Biol. Evol. 19(11), 1952–1963 (2002)
Saitou, N., Nei, M.: The neighbor joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4(4), 406–425 (1987)
Sattath, S., Tversky, A.: Additive similarity trees. Psychometrika 42(6), 319–345 (1977)
Semple, C., Steel, M.: Phylogenetics. Graduate Series in Mathematics and its Applications. Oxford University Press, Oxford (2003)
Strimmer, K., von Haeseler, A.: Quartet puzzling: a quartet maximum likelihood method for reconstructing tree topologies. Mol. Biol. Evol. 13, 964–969 (1996)
Studier, J.A., Keppler, K.J.: A note on the neighbor-joining method of Saitou and Nei. Mol. Biol. Evol. 5, 729–731 (1988)
Tamura, K., Nei, M., Kumar, S.: Prospects for inferring very large phylogenies by using the neighbor-joining method. Proc. Natl. Acad. Sci. 101, 11030–11035 (2004)
Xu, Y., Dai, W., Zhu, B.: A lower bound on the edge l ∞ radius of Saitou and Nei’s method for phylogenetic reconstruction. Inf. Process. Lett. 94, 225–230 (2005)
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Mihaescu, R., Levy, D. & Pachter, L. Why Neighbor-Joining Works. Algorithmica 54, 1–24 (2009). https://doi.org/10.1007/s00453-007-9116-4
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00453-007-9116-4