Abstract
DNA sequences are increasingly used for large-scale biodiversity inventories. Because these genetic data avoid the time-consuming initial sorting of specimens based on their phenotypic attributes, they have recently been incorporated into taxonomic workflows for overlooked and diverse taxa. Major statistical developments have accompanied this new practice, and several models have been proposed to delimit species with single-locus DNA sequences. However, the approaches proposed to date make different assumptions regarding taxon lineage history, leading to strong discordance whenever methods are compared. Distance-based methods, such as Automatic Barcode Gap Discovery (ABGD) and Assemble Species by Automatic Partitioning (ASAP), rely on the detection of a barcode gap (i.e., the lack of overlap between the distributions of intraspecific and interspecific genetic distances) and the associated threshold in genetic distances. Network-based methods, as exemplified by the REfined Single Linkage (RESL) algorithm for the generation of Barcode Index Numbers (BINs), use connectivity statistics to hierarchically cluster related haplotypes into molecular operational taxonomic units (MOTUs), which serve as species proxies. Tree-based methods, including Poisson Tree Processes (PTP) and the General Mixed Yule Coalescent (GMYC), fit statistical models to phylogenetic trees under maximum likelihood or Bayesian frameworks.
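The barcode gap idea described above can be illustrated with a minimal Python sketch. It is not the ABGD or ASAP algorithm, both of which use more elaborate procedures; it only computes uncorrected pairwise p-distances on an alignment and places a threshold at the midpoint of the widest gap between consecutive sorted distances. The function names, the `MISSING` set, and the four 20 bp toy sequences are hypothetical and introduced purely for illustration.

```python
from itertools import combinations

# Characters treated as missing data (an assumption of this sketch).
MISSING = {"-", "N", "?"}


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    # Compare only sites where both sequences carry an unambiguous base.
    sites = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x not in MISSING and y not in MISSING]
    if not sites:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in sites) / len(sites)


def widest_gap_threshold(seqs):
    """Return (threshold, sorted distances) for a dict of aligned sequences.

    The threshold is the midpoint of the widest gap between consecutive
    sorted pairwise distances: a crude stand-in for a barcode gap.
    """
    dists = sorted(p_distance(a, b)
                   for a, b in combinations(seqs.values(), 2))
    if len(dists) < 2:
        raise ValueError("need at least three sequences")
    width, lower, upper = max((hi - lo, lo, hi)
                              for lo, hi in zip(dists, dists[1:]))
    return (lower + upper) / 2, dists


# Hypothetical toy alignment: two pairs of closely related haplotypes.
toy = {
    "s1": "ACGTACGTACGTACGTACGT",
    "s2": "ACGTACGTACGTACGTACGA",
    "s3": "TCGAACGAACGTTCGTACCT",
    "s4": "TCGAACGAACGTTCGTACCA",
}

threshold, distances = widest_gap_threshold(toy)
print(f"candidate threshold: {threshold:.3f}")
```

On this toy alignment the within-pair distances are 0.05 and the between-pair distances are 0.25 or 0.30, so the widest gap falls between 0.05 and 0.25 and the candidate threshold is about 0.15. Real datasets rarely show such a clean separation, which is one reason the dedicated methods exist.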
Multiple webservers and stand-alone versions of these methods are now available, complicating the choice of the most appropriate approach for a given taxon of interest. For instance, tree-based methods require an initial phylogenetic reconstruction, for which multiple options, such as RAxML and BEAST, are now available. Across all examined species delimitation methods, judicious parameter setting is paramount, as different model parameterizations can lead to differing conclusions. The objective of this chapter is to guide users step-by-step through the procedures involved in each of these methods, while aggregating the information required to conduct these analyses. The Materials section details how to prepare and format input files, including options to align sequences and conduct tree reconstruction with maximum likelihood and Bayesian inference. The Methods section presents the procedures and options available to conduct species delimitation analyses, including distance-, network-, and tree-based models. Finally, limits and future developments are discussed in the Notes section. Most importantly, the species delimitation methods discussed herein are categorized based on five indicators: reliability, availability, scalability, understandability, and usability, all of which are fundamental properties needed for any approach to gain unanimous adoption within the DNA barcoding community moving forward.
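The sensitivity to parameter settings noted above can be made concrete with a small Python sketch. It applies plain single-linkage clustering to a table of pairwise distances at two different thresholds and shows that the number of MOTUs changes. This is a simplified illustration under stated assumptions, not RESL or any other published method: the haplotype names, the distance values, and the two thresholds are all hypothetical.

```python
from itertools import combinations


def single_linkage_clusters(names, dist, threshold):
    """Cluster names so that any pair closer than `threshold` is linked.

    `dist` maps frozenset({a, b}) to the distance between a and b.
    Clusters are the connected components of the resulting graph,
    tracked here with a small union-find structure.
    """
    parent = {n: n for n in names}

    def find(n):
        # Follow parent pointers to the root, compressing the path.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in combinations(names, 2):
        if dist[frozenset((a, b))] < threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for n in names:
        clusters.setdefault(find(n), set()).add(n)
    return sorted(sorted(c) for c in clusters.values())


# Hypothetical pairwise distances among five haplotypes.
names = ["h1", "h2", "h3", "h4", "h5"]
raw = {
    ("h1", "h2"): 0.004, ("h1", "h3"): 0.018, ("h2", "h3"): 0.020,
    ("h1", "h4"): 0.090, ("h2", "h4"): 0.092, ("h3", "h4"): 0.095,
    ("h1", "h5"): 0.091, ("h2", "h5"): 0.093, ("h3", "h5"): 0.094,
    ("h4", "h5"): 0.006,
}
dist = {frozenset(pair): d for pair, d in raw.items()}

# Two hypothetical threshold settings applied to the same data.
partitions = {t: single_linkage_clusters(names, dist, t)
              for t in (0.01, 0.03)}
for t, clusters in partitions.items():
    print(t, len(clusters), clusters)
```

With the stricter threshold (0.01), haplotype h3 forms its own MOTU and three clusters are recovered; with the looser one (0.03), h3 joins h1 and h2 and only two remain. The same data therefore support different species hypotheses depending on a single parameter.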
Copyright information
© 2024 The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature
About this protocol
Cite this protocol
Hubert, N., Phillips, J.D., Hanner, R.H. (2024). Delimiting Species with Single-Locus DNA Sequences. In: DeSalle, R. (eds) DNA Barcoding. Methods in Molecular Biology, vol 2744. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-3581-0_3
DOI: https://doi.org/10.1007/978-1-0716-3581-0_3
Published:
Publisher Name: Humana, New York, NY
Print ISBN: 978-1-0716-3580-3
Online ISBN: 978-1-0716-3581-0
eBook Packages: Springer Protocols