Introduction

Loss of biodiversity is one of the greatest problems facing modern society. This environmental crisis is increasingly evidenced by the loss or deterioration of biodiversity-related resources and services, as well as recent and long-standing attempts to highlight and address the issue at the highest international levels (e.g. the Convention on Biological Diversity (2005), 2010 being designated the year of Biodiversity, and the recent Nagoya Biodiversity Summit; van Kooten et al. 2001; Djoghlaf 2010; Fay 2010). Successful limitation of biodiversity loss is an ongoing challenge due to two main factors. First, there is a lack of knowledge of both global (and in many cases regional or local) levels of biodiversity. Second, the rate of loss, particularly in relation to background levels is unknown. Though estimates of decline vary according to different indicators and between taxa (see Butchart et al. 2010; Magurran et al. 2010), that the rate is anthropogenically inflated and is therefore cause for action is the general consensus (Butchart et al. 2010). Within this context, attempts to preserve biodiversity also remain uncertain in their efficacy, since in many cases the fundamental baseline data is not available at sufficient resolution, spatial scale or over the necessary timescales to confirm conservation success, other than for some specific and well-studied cases (e.g. Magurran et al. 2010; cf Hoffmann et al. 2011).

Despite 250 years of taxonomic studies, the vast majority of biological diversity remains under-described or undiscovered, partly due to the dwindling number of taxonomists and partly due to the tedious nature of the research (Wilson 2000; Scotland et al. 2003). Some attempts to quantify the extent of unknown biodiversity have been made; for instance, Mora et al. (2011) suggest that 86 % of species on Earth and 91 % of species in the oceans are yet to be discovered. While these estimates are highly speculative, they provide a template for strategising the discovery of new species. This, alongside the current biodiversity crisis, provides an impetus for rapid action and for the application of standardised, replicable techniques to improve data acquisition and quality.

DNA barcoding provides an opportunity to identify, inventory, and study specimens in order to (i) understand the diversity of species within an ecosystem, and (ii) evaluate the genetic variability within species. More specifically, DNA barcodes provide useful information for different stakeholders in conservation: researchers could identify species more quickly; taxonomists could determine groups of species that require more detailed studies; and policymakers could use barcode data to determine the appropriate scales for conservation (Francis et al. 2010). This type of information can provide useful inputs to prioritise conservation efforts (cf. Ardura et al. 2010). It is in this context that DNA barcoding is recommended as a cheap, quick and objective method (e.g. Yao et al. 2009; Dentinger et al. 2010; Li and Dao 2011) to aid conservation efforts, though it has encouraged much debate regarding its potential (Moritz and Cicero 2004; Smith et al. 2005; Hebert and Gregory 2005; Rubinoff et al. 2006; Pilgrim et al. 2011).

The purpose of this comment is to provide a critical overview of the latest literature to determine to what extent DNA barcoding can realistically contribute to a practical understanding of global biodiversity and successful conservation efforts. The paper first addresses the definitional shortcomings of DNA barcoding, arguing that a lack of consensus has led to confusion about its aims and applications. It then examines the underlying assumptions of using mitochondrial DNA as a standardised gene and argues that the efficiency of DNA barcoding depends on the use of integrative taxonomy. Subsequently, the potential role of barcoding in conservation efforts is analysed, noting particularly that the most significant contributions of DNA barcoding are in the conservation of known biodiversity. The paper concludes by discussing the role of DNA barcoding in enhancing taxonomic research and conservation efforts but also suggests that genetic barcoding should not be used to replace traditional taxonomic research, which is crucial for conservation.

DNA barcoding: epistemological and molecular issues

In the years following the publication of the findings by Hebert et al. (2003), a limitation in debates about DNA barcoding related to the lack of agreement about what it is and its aims (Moritz and Cicero 2004; Rubinoff et al. 2006). In recent years, however, DNA barcoding has been increasingly defined in operational terms, as proposed by the Barcode of Life Initiative (BOLI 2012a) as a genetic method for taxonomical research whereby (i) a single gene consisting of approximately 600 base pairs of cytochrome c oxidase I (COI) in the 5’ end of mitochondrial DNA (mtDNA) is sequenced and used as the barcode; (ii) the same gene and region are used in a standardised manner; and consequently (iii) gene sequences are analysed with distance methods to classify specimens (specifically neighbour-joining) and their taxa accordingly (Savolainen et al. 2005; cf. Rubinoff et al. 2006).

However, the concept of standard DNA barcodes is being reviewed as the techniques and technologies are validated with different samples. Recent studies have moved away from the original barcoding paradigm to include alternative standardised genes for animals (Vences et al. 2005; Ivanova et al. 2007), fungi (Seifert 2009) and plants (Hollingsworth 2011), multiple genes (e.g. Kress et al. 2005), different analytical frameworks (e.g. Matz and Nielsen 2005), metrics other than distance methods (Zou et al. 2011) and specimen classification based on operational taxonomic units rather than traditional species concepts (Creer et al. 2010). For the debates to advance, it is imperative that the specific aspects of barcoding methodology are defined. Neither the official website of the Consortium for the Barcode of Life [CBOL] (2011) (http://barcoding.si.edu/index.html) nor that of the International Barcode of Life [iBOL] (2011) (http://www.dnabarcoding.org) proposes an explicit definition of DNA barcoding, and thus its meaning can be manipulated to fit different barcoding paradigms (Rubinoff et al. 2006). Nevertheless, recent attempts to provide a working definition have been consolidated by the application of specific criteria for inclusion in databases such as BOLD (www.boldsystems.org) and GenBank (http://www.ncbi.nlm.nih.gov/genbank/) with a “barcode” tag. These criteria include, among others, a peer-reviewed publication of key results and submission of the nucleotide sequences, primers, source modifiers, and protein sequences used for the analysis. Additional criteria might be useful in determining what barcoding is and what it is not.

Further confusion regarding the potential roles of DNA barcoding relates to the taxonomic objectives to which it is applied. This is arguably attributable to the ambitious goals set out in the original paper by Hebert et al. (2003); accordingly, barcoding aims (i) to facilitate identification of species already defined by morphological criteria (thus speeding up global biodiversity assessments), and, more contentiously (ii) to describe new species by interpreting species diversity in terms of DNA diversity (Dasmahapatra and Mallet 2006). It can therefore be argued that the first aim seeks to prove that genetic research methods can complement morphological analyses (Gregory 2005). The latter, more ambitiously, seeks to prove that DNA barcoding is a novel concept that can be applied to the vast amount of undiscovered biodiversity, and thus provide a first step towards solving the biodiversity crisis (cf. McKinney 1999).

The term ‘DNA barcode’ suggests that standardised DNA sequences can identify taxa in the same way that 11-digit universal product codes identify retail products (Neigel et al. 2007). Consequently, a central element to this concept is the use of homologous gene regions that can be amplified by polymerase chain reaction and distinguish between species. Accordingly, the mitochondrial genome of animals is preferred over other genomes (such as those of the nucleus) because mitochondrial DNA recombination is rare—thus preventing the formation of DNA sequences that would not normally occur. Similarly, mtDNA is haploid and usually maternally inherited, so the problem of sequencing heterozygous organisms is prevented (Hebert et al. 2003; Neigel et al. 2007; Swartz et al. 2008). Further, the use of the cytochrome c oxidase I gene is justified in two important ways: firstly, these primers are very robust, enabling analysis for most animal organisms; and secondly, it has a high rate of evolution compared to other DNA sequences, allowing for the discrimination of species, including those that are closely related (Stoeckle and Hebert 2008). Therefore, as Rubinoff (2006: 1028) explains, “mitochondrial DNA can be a powerful tool in the effort to identify species, their relationships to each other, and threatened or endangered populations with divergent genetic patterns worthy of conservation attention.” Indeed, the use of mtDNA for species identification has been claimed to have high rates of success—most studies have shown relatively small failure rates of below 5 % (Waugh 2007). For example, Hebert et al. (2003) claim a 100 % success rate in identifying lepidopterans, while Hubert et al. (2008) report a 93 % success rate in identifying Canadian freshwater fish.

There are some exceptions, however: for example, Meyer and Paulay (2005) failed to identify 17 % of cowrie species using barcoding analysis based on mtDNA, and some authors have deemed mtDNA an inadequate source of species-defining data (Rubinoff et al. 2006). Phenomena such as introgression due to hybridisation have been identified as a major limitation to mtDNA use (Rubinoff 2006). Introgression, the transfer of genetic material from one species to another through hybridisation, creates confusion about species boundaries between evolutionary lineages (phylogenies) that would normally be distinct (Fig. 1). In a meta-analysis of phylogenetic studies, Funk and Omland (2003) show that over 20 % of the studied lineages present problems of mtDNA introgression, suggesting that this may be a significant limitation on the use of barcodes based on mtDNA.

Fig. 1

Hypothetical phylogenies for four species (A, B, C, D). In this example, species A and B share the same mitochondrial DNA due to introgression, but an analysis of their nuclear genomes reveals that they are distinct species, and not even sister species. In mtDNA barcode analysis, species A and B would be incorrectly treated as the same conservation unit. Such misidentifications may be common when mtDNA is used as the sole source of species-identification data. [Adapted from Rubinoff (2006)]

An additional problem relates to the inheritance of mtDNA: because the effects of maternal inheritance on the rate of molecular divergence cannot be accurately predicted, the failure rate of mtDNA barcoding cannot be estimated (Rubinoff et al. 2006). For one, not all mtDNA is maternally inherited—an example is that of bivalve molluscs which display doubly uniparental inheritance (Hoeh et al. 2002). Increasingly, paternal inheritance of mtDNA is being registered in different taxa (Zhao et al. 2004). In the infrequent cases where mitochondrial DNA is paternally inherited, barcoding efforts may not have the desired effect (Rubinoff et al. 2006). Therefore, by using mitochondrial genes as a sole source of data, several evolutionary processes which affect genders differently are not accounted for, and information becomes skewed based on whether maternal or paternal inheritance data is selected (Shaw 2002; Trewick 2008; Rubinoff and Sperling 2004).

Furthermore, although the rarity of mtDNA recombination has been advanced as one of the strengths of DNA barcoding, recombination does occur and can produce sequence variations that violate barcoding protocols (Rubinoff et al. 2006). Despite the argument that mitochondrial recombination is impossible (e.g. Avise 2000), several empirical studies show that it is a common phenomenon in molluscs (Burzynski et al. 2003) and humans (Kraytsberg et al. 2004), and more instances of recombination are likely to be found as large-scale mtDNA barcoding takes place. Although not an obstacle to DNA-based species classification in general, recombination may render mtDNA barcodes uninformative (Rubinoff et al. 2006).

Another potential problem with barcoding is heteroplasmy: namely the presence of nucleotide differences within a single specimen. Heteroplasmy has been reported in bats (Petri et al. 1996), fish (Hilsdorf and Krieger 2004), insects (Nardi et al. 2001), and nematodes (Tsang and Lemire 2002). Variability in nucleotides emerges due to both sequence variability and length heteroplasmies due to insertions/deletions (indels) within coding genes—a phenomenon which has not been adequately addressed by proponents of barcoding. Heteroplasmy also means that the mitochondria of an individual could represent a sample of the alleles within a population, like any other gene, therefore requiring other genetic markers (Rubinoff et al. 2006).

A related issue concerns nuclear pseudogenes of mitochondrial origin (numts): non-functional copies of mtDNA in the nucleus that are common in major clades of eukaryotes and that can be easily amplified, causing problems for DNA barcoding. In species where individuals have numts of the COI gene, barcoding analysis using distance-based methods of classification overestimates the number of unique species (Song et al. 2008). Although removal of numts can reduce the number of incorrect inferences of species number, even rigorous quality control fails to identify some numts. Removal of numts is a long process that requires examination of sequence characteristics, indels, and nucleotide composition; the additional work undermines the “cheap and quick” character of DNA barcoding. The limitations set out above call for the use of several sources of genetic data in order to provide more accurate assessments of biodiversity.

Is DNA barcoding useful in biodiversity conservation?

One of the arguments for promoting DNA barcoding initiatives is that genetic data have the potential to enhance conservation strategies, and indeed molecular tools are used in conservation biology to inform at different levels of analysis (Haig 1998; DeSalle and Amato 2004). For conservation biology, genetic tools can aid in research as diverse as fine-scale management of coral reefs (Neigel et al. 2007), identification of cryptic and invasive species (Ball and Armstrong 2006; Stoeckle 2008) and regional management of fisheries (Schander and Willassen 2005; Swartz et al. 2008). DNA barcoding, in particular, can contribute to conservation policy in two important ways: by speeding up local biodiversity assessments to prioritise conservation areas or evaluate the success of conservation actions, and by providing information about evolutionary histories and phylogenetic diversity.

Quicker biodiversity assessments

A central argument for the use of DNA barcoding is its efficiency: barcoding has the potential to identify species quickly and cheaply (Frézal and Leblois 2008). Hebert et al. (2010) argue that, in under a decade of research, barcodes for 0.1 % of all described animal species have been collected. It is estimated that sequencing can take place in under 90 min at a cost of $2–5 per species, compared to several months of field work and a cost of at least $100 per specimen with morphological analyses (Hebert and Gregory 2005; Stoeckle and Hebert 2008). This estimate assumes that a purely genetic analysis can be utilised, but barcoding papers have notably included traditional taxonomic research methods, for example in identifying differences in the larval appearance and food habits of cryptic species of skipper butterflies (Hebert et al. 2004a). This suggests that taxonomical research is becoming increasingly integrative, combining traditional morphological analyses with innovative technological developments, all of which require resource investment (cf. Ebach and Holdrege 2005a, b; Smith et al. 2005; Caesar et al. 2006). However, the argument can also be made that, once the initial investment has been made to develop a reliable database, barcoding can be used to quickly identify newly collected specimens. Further research is necessary to test the effectiveness of genetic barcoding in such cases, but recent improvements in genetic sequencing techniques and technologies suggest that genetic tools for biodiversity assessment are becoming increasingly cheap (Shendure and Ji 2008).

Rapid biodiversity assessment is a research priority given the ongoing species extinction/extirpation cascade, and the most critical contribution of barcoding to biodiversity conservation is facilitating biodiversity assessment cheaply and quickly where financial resources are limited. This is especially important because the vast majority of described biodiversity is in developing countries, where resources for comprehensive biodiversity assessments are lacking. There have been several notable conservation successes using barcoding, from the identification of endemism and intra-specific population structure among mammals in Southeast Asia (Francis et al. 2010) and the rapid classification of 99 % of 210 chondrichthyan species from 36 families in Australia (Ward et al. 2008) to the identification of larval and juvenile organisms for which morphological data is non-existent (Neigel et al. 2007), all of which have accelerated conservation priorities and policy response.

A particular strength of barcoding is that it can potentially identify species from small or incomplete samples, such as stomach contents or faecal remains (Valentini et al. 2008), making the use of this type of data more effective. Barcoding can also act as a tool to actively prevent biodiversity impacts or punish environmental criminals, for example by providing forensic evidence to prosecute illegal trading and poaching of endangered species, even after organisms have been processed. One such example is the illegal shark-fin trade, which is a significant threat to biodiversity in many coastal African countries: barcoding can identify different shark species, including those that are critically endangered and protected under international legislation (Swartz et al. 2008). As such, DNA barcoding in conjunction with a reference database can potentially play a key role in environmental law enforcement. DNA barcoding therefore plays an important role in the conservation of species that are already studied (Soltis and Gitzendanner 1999; Ardura et al. 2010; Francis et al. 2010; Williams et al. 2012).

A further significant contribution of barcoding is in identifying species where morphology is insufficient, because morphological keys are only applicable to particular life stages or sexes (Hennig 1976). This is particularly true in the case of arthropods, a group for which current taxonomic understanding is considered insufficient (May and Harvey 2009). DNA barcoding has played a significant role in identifying the larval stages of species, including mayflies (Ball et al. 2005), stomatopods (Barber and Boyce 2006), and butterflies (Gossner and Hausmann 2009), and in classifying species that have a variety of life stages, including aphids (Foottit et al. 2009).

Barcodes have also complemented morphological studies in cases where taxa are easily misidentified due to the existence of cryptic species (Rubinoff et al. 2006; Ward et al. 2008). In an attempt to prove the utility of DNA barcoding in identifying new (or overlooked) species, considerable attention has been given to the discovery of cryptic diversity, for example, in the skipper butterfly (Astraptes fulgerator) (Hebert et al. 2004a), in neotropical Lepidoptera (Hajibabaei et al. 2006), in amphipods (Witt et al. 2006), in Diplostomoidea parasites of freshwater fish (Locke et al. 2010), in Oriental flies (Pramual et al. 2011) and in marine metazoans (Bucklin et al. 2011). While the majority of these studies have focused on arthropods (Taylor and Harris 2012), the fact that barcodes have also been used to identify cryptic diversity in metazoans and other complex groups highlights the potential of barcoding to enhance understanding of species diversity.

However, as Rubinoff et al. (2006: 583) explain, “cryptic species identifications are contingent upon a preexisting understanding of species in those groups based on other sources of data and are therefore not representative of the unknown biodiversity we are challenged to identify”. This is not to say that the merits of cryptic species identifications ought to be dismissed, but the usefulness of discovering cryptic species should be analysed critically. In the case of microbes, the majority of species are cryptic, so barcoding can provide useful information to compare ecological sequences and determine conservation priorities (Begerow et al. 2010). In the case of metazoans, however, while cryptic diversity can be used as an argument for prioritising diversity conservation, it does not reveal enough information about the rarity or value of species. Furthermore, a barcode may not be sufficient to justify conservation of species that do not show morphological or ecological differences, both of which are critically important in determining the value of species (Rubinoff and Sperling 2004).

DNA barcoding is also intended to facilitate the identification of new species (Hebert et al. 2004a; Gomez et al. 2007; Johnson et al. 2008; Ragupathy et al. 2009; Pauls et al. 2010). Given that the vast majority of species remain undiscovered (Wilson 1994), the potential for discovering new species using an objective, cheap, and standardised method such as DNA barcoding has received considerable attention. However, this is one of the most controversial uses of barcodes. Evidence suggests that it is simplistic to describe a new species purely on the basis of DNA sequences (DeSalle 2006) and that barcoding should be used in conjunction with traditional taxonomic methods in order to adequately describe species (Prendini 2005). Barcodes on their own can only help propose hypotheses about new species, rather than discover them (Goldstein and DeSalle 2011).

A challenge associated with barcodes is that even the same authors use inconsistent species concepts. For instance, in a study of North American bird diversity, Hebert et al. (2004b) find that inter-species diversity is greater than intra-species diversity by a factor of at least 10 (i.e. species are defined in relative terms), while in a study of skipper butterflies (Hebert et al. 2004a) species are defined according to thresholds. Like other DNA analyses, barcoding studies suggest a threshold approach as an objective and quantitative way of delineating species diversity: a ≥2 % difference in the DNA barcode of mammals implies that the samples are from different species; similarly, the threshold for delineating insect species is ≥3 % (Stoeckle and Hebert 2008). Obvious problems arise from such an approach: in some cases, <2 % mtDNA divergence may be ecologically critical (Holmes 2004).
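
The threshold approach described above can be illustrated with a minimal Python sketch. It assumes two pre-aligned sequences of equal length and uses the uncorrected proportion of differing sites (p-distance) as the divergence measure; the function names and the toy sequences are hypothetical, and published barcoding studies often use corrected distance models instead. Only the 2 % and 3 % cut-offs are taken from the text.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Compare only sites where both sequences carry an unambiguous base.
    valid = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
             if a in "ACGT" and b in "ACGT"]
    if not valid:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in valid) / len(valid)


# Thresholds quoted in the text (Stoeckle and Hebert 2008), as fractions.
THRESHOLDS = {"mammal": 0.02, "insect": 0.03}


def same_species_call(seq_a: str, seq_b: str, group: str) -> bool:
    """True if the divergence falls below the threshold assumed for `group`."""
    return p_distance(seq_a, seq_b) < THRESHOLDS[group]


# Toy data (invented for illustration): an 80-base sequence and a copy
# carrying two substitutions, i.e. a divergence of 2/80 = 2.5 %.
ref = "ACGT" * 20
query = "TG" + ref[2:]

print(p_distance(ref, query))                    # 0.025
print(same_species_call(ref, query, "mammal"))   # False: 2.5 % >= 2 %
print(same_species_call(ref, query, "insect"))   # True:  2.5 % <  3 %
```

The same pair of sequences is assigned to different species under the mammal threshold and to one species under the insect threshold, which is the kind of sensitivity to the chosen cut-off that the critics cited here object to.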

For policymakers, this implies that, for example, mammal species that diverge by 1.8 % should not be prioritised from a conservation or management perspective (Rubinoff 2006). This is a significant problem because rates of change within species are dynamic, and many divergences will not equal 2 %. For instance, when a wide ranging species complex is studied, divergences among adjacent populations are likely to be low (Rubinoff et al. 2006). Therefore, in instances where mtDNA varies geographically (a characteristic particularly noticeable in widespread species such as domestic cats and even humans), individuals from separate populations of the same species will be divergent, but they are genetically contiguous (Rubinoff et al. 2006). An example can be made of the brown bear, which renders the polar bear genetically redundant (Talbot and Shields 1996)—were it not for the obvious differences in marine ecology, feeding patterns and morphology, the polar bear would not receive conservation priority.

Given that barcode thresholds are often inconsistent between taxonomic groups, different results may be obtained from the same dataset. For example, of the ten cryptic species in Astraptes fulgerator identified by Hebert et al. (2004a), only two diverge by more than 3 %, the threshold advocated by Hebert et al. (2003). If, however, a ≥ 3 % divergence is selected as the threshold, A. fulgerator would be classified as four species. Furthermore, if the minimum level of acceptable genetic divergence is applied (0.32 %), a total of fourteen species can be described. The final claim that A. fulgerator consists of ten cryptic species was supported by morphological and ecological data (Hebert et al. 2004a). Even within the same analysis, divergences can be variable: in a study of butterfly sister species from the subfamily Ithominae by Whinnett et al. (2005), some divergences were as low as 0.23 % and others were as high as 6.40 %. These results indicate the need for “flexible” thresholds in order to identify previously described and accepted species, and again highlight that barcoding as a stand-alone technique currently has notable limitations. Based on the flexibility of thresholds, barcoding analyses can then be manipulated for political purposes. On the one hand, enthusiasts may argue that cryptic diversity merits conservation priority (Witt et al. 2006). However, where obvious differences cannot be perceived, it is questionable whether conservation based on mtDNA barcodes alone would receive public support (Rubinoff 2006). On the other hand, organisations or individuals with important stakes for developing a potential area of conservation can find a “suitable” threshold to justify development (Rubinoff et al. 2006).
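
The dependence of species counts on the chosen threshold can be sketched in Python as follows. The distance matrix is invented for illustration and is not the Astraptes fulgerator data; the grouping rule (single linkage, where any two specimens closer than the threshold end up in the same group) is likewise an assumption and not necessarily the procedure used in the studies cited above.

```python
def count_clusters(dist, threshold):
    """Number of groups when specimens closer than `threshold` are merged
    (single-linkage clustering implemented with a small union-find)."""
    n = len(dist)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the group representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if dist[i][j] < threshold:
                parent[find(i)] = find(j)
    return len({find(i) for i in range(n)})


# Hypothetical pairwise divergences (in %) among five specimens.
D = [
    [0.0, 0.5, 2.5, 2.5, 5.0],
    [0.5, 0.0, 2.5, 2.5, 5.0],
    [2.5, 2.5, 0.0, 1.0, 5.0],
    [2.5, 2.5, 1.0, 0.0, 5.0],
    [5.0, 5.0, 5.0, 5.0, 0.0],
]

for t in (0.32, 2.0, 3.0):
    print(f"threshold {t} %: {count_clusters(D, t)} putative species")
# threshold 0.32 %: 5 putative species
# threshold 2.0 %: 3 putative species
# threshold 3.0 %: 2 putative species
```

One dataset therefore yields five, three or two putative species depending only on the cut-off, which is why a threshold chosen after the fact can be used to support almost any conclusion.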

Although the attribution of a taxonomic rank based on genetic analyses is not straightforward, it is increasingly recognised that genetic distances can be a useful tool for determination of species boundaries given that genetic divergences are ordinarily lower among individuals of a species than between closely related species (cf. del Prado et al. 2010). A number of studies have illustrated the potential for DNA barcoding to identify intra- and interspecific variability (Smith et al. 2005; Zemlak et al. 2009; Lakra et al. 2011). Such studies have highlighted that it is possible to establish robust thresholds of intra- and interspecific variability for specific taxonomic groups (Lefebure et al. 2006). Despite the difficulty of establishing a standard threshold for all species, where a taxonomic group has been well studied and variability thresholds robustly determined, the complexity of species (including cryptic diversity) can be detected more adequately (Zemlak et al. 2009).

This indicates that the contributions of DNA barcoding to biodiversity conservation are most noticeable and least controversial where species have already been well studied and described by other criteria. In the context of undiscovered species, barcoding might be more controversial. However, it can provide initial genetic information that can be used to complement additional morphological and ecological studies to investigate the potential conservation status of such species.

DNA barcoding for assessments of phylogenetic diversity

Reliance on indicators of species diversity, value and richness is central to designing and prioritising conservation areas (Carter 2007). One such indicator is phylogenetic diversity, which measures taxonomic divergence between species and is calculated as “the sum of the lengths of all the branches that are members of the corresponding minimum spanning path” (Faith 1992: 4). An index of phylogenetic diversity can inform conservation strategies by ignoring tedious species counts and using evolutionary lineages (phylogenies) to boost predictions about biodiversity patterns (Mitchell 2008). Indeed, Faith and Williams (2005) argue that the most significant contribution of DNA barcoding to conservation efforts is its role in improving and speeding up phylogenetic diversity assessments.
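
The definition of phylogenetic diversity quoted above can be made concrete with a short Python sketch. The tree, its branch lengths and the function names are hypothetical; the sketch takes the "minimum spanning path" to be the smallest subtree connecting the chosen taxa, so a single taxon scores zero. Other conventions also count the path down to the root.

```python
from itertools import combinations

# Hypothetical rooted tree, stored as child -> (parent, branch length).
TREE = {
    "A": ("n1", 1.0),
    "B": ("n1", 1.0),
    "n1": ("root", 3.0),   # deep branch shared only by sister taxa A and B
    "C": ("n2", 2.0),
    "D": ("n2", 2.0),
    "n2": ("root", 0.5),
}


def edges_to_root(tree, node):
    """Edges (named by their child node) on the path from `node` to the root."""
    edges = set()
    while node in tree:
        edges.add(node)
        node = tree[node][0]
    return edges


def phylogenetic_diversity(tree, taxa):
    """Sum of branch lengths on the minimal subtree connecting `taxa`."""
    spanning = set()
    for a, b in combinations(taxa, 2):
        # The path between a and b consists of the edges that lie on exactly
        # one of the two root paths (symmetric difference).
        spanning |= edges_to_root(tree, a) ^ edges_to_root(tree, b)
    return sum(tree[edge][1] for edge in spanning)


print(phylogenetic_diversity(TREE, ["A", "B", "C", "D"]))  # 9.5
print(phylogenetic_diversity(TREE, ["B", "C", "D"]))       # 8.5
print(phylogenetic_diversity(TREE, ["C", "D"]))            # 4.0
```

In this toy tree, losing taxon A alone reduces the total by 1.0, but losing both A and B reduces it by 5.5 because the deep branch they share disappears as well. This is the sense in which a phylogenetic diversity index weights sister lineages on long branches more heavily than a simple species count would.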

Faith and Baker (2006) used DNA barcodes as a way of assessing phylogenetic diversity (PD) to quantify biodiversity value in New South Wales, Australia. They used these to identify genetic patterns for economically valuable aquatic invertebrates, with the purpose of devising a phylogenetic diversity index that can inform conservation policy. Barcoding revealed divergent lineages as well as large genetic (and possible cryptic) diversity in the spiny crayfish (Euastacus spp.). The PD assessment exposed five lineages; it further indicated that lineage A, which is endemic to the upper Gorges River, was the phylogenetic ‘sister’ of lineage B (Fig. 2). This assessment has been used to prioritise conservation and management in the upper Gorges River. The location is threatened by mining activities in the region, and also holds an endemic lineage (A) of Euastacus spp. which is the sister lineage of the already threatened lineage (B)—if both lineages were to be extirpated, a deep phylogenetic branch would be irreversibly lost. Thus the PD value of conserving lineage A is greater than that of the other four identified lineages.

Fig. 2

Phylogenetic diversity of “spiny crayfish” (Euastacus spp.) [Adapted from Faith and Baker 2006]

This example outlines a potential role for DNA barcoding in PD assessments for biodiversity conservation strategies, placing less emphasis on decisions about the importance of cryptic species, and emphasising the importance of rare, deep phylogenetic branches and lineages. Further, the example highlights the role of PD assessments in predicting biodiversity patterns: for instance, the geographic distribution of Euastacus spp. reveals general historical relationships and predicts distribution patterns for other freshwater species (Baker et al. 2004). Therefore, PD assessments based on one group may generate predictions about abundance and richness of species, as well as the overall phylogenetic value of a region (Faith and Baker 2006; Cadotte et al. 2010).

The value of PD assessments in conservation planning suggests a role for an integration of web-based barcoding databases (such as iBOL and CBOL) and PD analysis tools in order to produce robust phylogenetic calculations that can inform conservation policy (Faith and Baker 2006). Such an approach involves the integration of mtDNA barcode-based phylogenetic data with the broader phylogenetic databases (Faith and Baker 2006). This remains one of the greatest challenges for PD analyses—phylogenetic studies are needed for many more taxa in order to produce accurate predictions for global biodiversity patterns (Soltis and Gitzendanner 1999).

Emerging perspectives

The relative successes of DNA barcoding techniques in animals have encouraged research into potential barcoding applications in other kingdoms that are harder to classify with genetic tools, especially plants (cf. Hollingsworth 2011). The process of barcoding plants has been problematic because land plant phylogenetic markers seem to have too little variation to determine species limits (Kress et al. 2005). Some studies suggest strategies based on a single chloroplast region (Lahaye et al. 2008) or a combination of different regions (Kress and Erickson 2007; Chase et al. 2007; Hollingsworth et al. 2009). Overall, it appears that different genetic markers can be applied to plants with differing degrees of success: trnH-psbA, rp136-rpf8, and trnL-F have been used with relative success to classify angiosperms (Kress et al. 2005); the ITS2 region has been more useful than other genetic markers in identifying a range of medicinal plants (Chen et al. 2010); the ITS region has also been used successfully to classify angiosperms (Li et al. 2011); and rbcL and trnL-F have been suggested as a two-locus DNA barcode for ferns (de Groot et al. 2011). More recently, a combination of plastid markers including rbcL, matK, and trnH-psbA has been used with success in 87.1–92.7 % of cases (Li et al. 2011). The recent research highlighted here shows high potential for the application of barcoding techniques based on a combination of genetic markers for plants. Advancements in the understanding of phylogenetic diversity might provide useful inputs for the standardisation of barcodes for plants and potentially other complex taxa.

Efforts have also focused on identifying barcodes for microorganisms, particularly fungi (Begerow et al. 2010), resulting in a dedicated project to barcode this kingdom (www.fungalbarcoding.org). The fragment suggested by early barcoding efforts, the COI marker, has been only partly effective in identifying fungal species (Chen et al. 2009; Seifert 2009). However, barcoding studies based on markers found in the ITS region have been successful at identifying species of both aquatic (Kelly et al. 2011) and terrestrial fungi (Stockinger et al. 2010). As with plants, additional research is needed to identify a gene (or group of genes) that can be used for the quick identification of fungal species.

Evidence indicates that conservation planning informed by DNA barcoding techniques is relatively easy to replicate and cost-efficient, with minimal field work (Dahlberg and Mueller 2011). Automated sequencers can provide over 1,000 sequences of 1,000 base pairs every day, and even non-geneticists have access to sequencing via private companies that offer the service at a competitive price (Valentini et al. 2008). Initial research shows that barcoding could be applied more quickly using shorter regions, or mini-barcodes, which also enable species identification where DNA has been degraded, as in archival specimens and processed biological material, so that fragments longer than 200 base pairs cannot be recovered. An experimental study of 1,566 specimens derived from 691 species of mammals, birds, and insects shows that as few as 100 base pairs (with a 90 % identification success rate) or 250 base pairs (with a 95 % success rate) can be used to classify species, compared to a 97 % success rate for full-length barcodes (Meusnier et al. 2008). Mini-barcodes can be particularly helpful where a rough picture, rather than a comprehensive analysis, of species diversity is needed.

As the technology to collect and interpret DNA barcodes becomes increasingly accessible and affordable, non-experts can potentially use barcoding for purposes beyond taxonomy. For instance, in forensics, DNA samples can be used for drug authentication efforts (Chase et al. 2005). In biosecurity, barcoding can provide a fast and reliable method to identify exotic species: one such example is the identification of the invasive North American bullfrog (Rana catesbeiana) in France by using short mitochondrial fragments found in water samples from ponds (Ficetola et al. 2008). In the context of biomedical research, technological improvements through next-generation DNA sequencing have the potential to accelerate research by making the comprehensive analysis of DNA barcodes, genomes, transcriptomes and interactomes less expensive and more mainstream (Shendure and Ji 2008). Moreover, non-experts can also contribute to barcoding initiatives by providing samples, geo-coding them and uploading the information onto standardised databases such as BOLD (http://www.barcodinglife.org) for further taxonomic interpretation. This information can be used for forensic or biosecurity purposes as described above and can enhance local participation in these matters. Besides significantly reducing fieldwork and costs associated with collecting samples, non-expert participation in barcoding efforts could enhance “citizen science” as well as local conservation strategies, leading to measurable impacts on biodiversity (cf. Cooper et al. 2007).

DNA barcoding studies are also redefining traditional species concepts. Barcoding has been integrated into metagenetic approaches (e.g. Hajibabaei et al. 2006; Porazinska et al. 2010). These techniques focus on operational taxonomic units (OTUs; Blaxter et al. 2005) rather than traditional concepts of morphological species. Such approaches are useful in determining the ecological value of eukaryote microbiota, where high abundance, diversity, minute size, and unclear morphological characteristics present challenges for correct identification (Porazinska et al. 2010). Metagenetic studies based on OTUs are also being advocated as potentially useful strategies for meiofaunal biodiversity assessments. In an analysis of meiofaunal biodiversity using second-generation sequencing methods, Creer et al. (2010) found that environmental metagenetic analyses differentiate between OTUs across the eukaryote domain at a fraction of the time and cost of traditional morphological approaches. Such studies highlight the role of DNA barcoding in enhancing biodiversity assessment, and therefore conservation efforts.
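
To illustrate how sequences can be grouped into OTUs without reference to morphology, the Python sketch below assigns each read to the first existing cluster whose seed sequence it matches at or above a chosen identity, and otherwise starts a new cluster. This greedy procedure, the function names, the toy reads and the identity cut-offs are all assumptions made for illustration; it is not the method used in the studies cited above, and it assumes reads that are already aligned to equal length.

```python
def identity(seq_a, seq_b):
    """Fraction of matching sites between two aligned, equal-length reads."""
    if len(seq_a) != len(seq_b):
        raise ValueError("reads must be aligned to the same length")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return matches / len(seq_a)


def greedy_otus(reads, min_identity):
    """Group reads into OTUs, each represented by its first (seed) read."""
    otus = []  # list of (seed, members) pairs
    for read in reads:
        for seed, members in otus:
            if identity(read, seed) >= min_identity:
                members.append(read)
                break
        else:
            # No existing seed is similar enough, so this read founds a new OTU.
            otus.append((read, [read]))
    return otus


# Hypothetical 20-bp reads: two near-identical pairs from two distinct lineages.
toy_reads = [
    "ACGTACGTACGTACGTACGT",
    "ACGTACGTACGAACGTACGT",
    "TTTTGGGGCCCCAAAATTTT",
    "TTTTGGGGCCCCAAAATTTA",
]

relaxed = greedy_otus(toy_reads, 0.9)
strict = greedy_otus(toy_reads, 1.0)
print(len(relaxed), len(strict))
```

The number of OTUs recovered from the same reads changes with the identity cut-off, which is why OTU counts are best read as operational units rather than as direct counts of species.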

Recognising the challenge of species identification with distance-based methods, additional methods based on character-based barcodes have been proposed (Damm et al. 2010). Character-based barcoding focuses on identifying characteristic attributes for every clade at each branching node that is initially produced from a given dataset. Characteristic attributes are diagnostic character states (genes, amino acids, base pairs or even morphological, ecological or behavioural attributes) that are found in only one clade and not in the other groups diverging from the same node. As such, the character-based DNA barcode method provides a molecular data set that can be combined with classical taxonomic data, so that new species are discovered within an analytical framework that draws on multiple sources of data and can more easily discriminate between closely related species. This method has been successfully used to classify Drosophila (Yassin et al. 2010), odonates (Rach et al. 2008) and nematodes (Zou et al. 2011), and is particularly promising in cases where there is significant overlap between intraspecific and interspecific divergences in the markers that are used for the classification.
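
The idea of a diagnostic character state can be sketched in a few lines of Python. The example below scans aligned sequences site by site and reports, for each group, the sites at which all of its members share a single base that no other group carries. The function name, group labels and toy sequences are hypothetical, and only simple single-site nucleotide characters are considered here.

```python
def diagnostic_characters(groups):
    """Find (position, state) pairs fixed in one group and absent from the rest.

    groups maps a group name to a list of aligned, equal-length sequences.
    """
    lengths = {len(seq) for seqs in groups.values() for seq in seqs}
    if len(lengths) != 1:
        raise ValueError("all sequences must be aligned to the same length")
    n_sites = lengths.pop()

    result = {name: [] for name in groups}
    for pos in range(n_sites):
        # States observed at this site, per group.
        states = {name: {seq[pos] for seq in seqs} for name, seqs in groups.items()}
        for name, own in states.items():
            if len(own) != 1:
                continue  # the site varies within the group, so it is not fixed
            state = next(iter(own))
            others = set().union(
                *(st for other, st in states.items() if other != name)
            )
            if state not in others:
                result[name].append((pos, state))
    return result


# Hypothetical aligned fragments for three clades diverging from one node.
toy_groups = {
    "clade_A": ["ACGTAC", "ACGTAT"],
    "clade_B": ["ATGTAC", "ATGTAC"],
    "clade_C": ["ACGAAC", "ACGAAC"],
}

diagnostics = diagnostic_characters(toy_groups)
print(diagnostics)
```

In this toy alignment two clades each have one diagnostic site while the third has none, showing that a group can lack a simple single-site diagnostic and may need combinations of characters or other sources of data.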

DNA barcoding reflects policymakers' increasing reliance on technology, and indeed technological advances can improve both scientific and societal understanding of biodiversity. The former is important in producing reliable data to inform policy, while the latter is necessary to generate awareness and to draw public attention to biodiversity. Since the publication of the controversial paper by Hebert et al. (2003), barcoding has received significant media attention globally: over 20,000 articles have been published on DNA barcoding as of March 2012 (SCOPUS: “DNA barcod*”) and at least seven dozen major peer-reviewed journals and popular scientific magazines have featured articles on the topic, including Molecular Ecology Resources, Public Library of Science and Mitochondrial DNA, where most of the barcoding papers are published (BOLI 2012b). The term ‘DNA barcode’ is a fashionable one, as it is intended to capture popular imagination (Holmes 2004; Rubinoff 2006; Mitchell 2008); as such, it appeals to both the general public and potential funders.

Discussion

There is considerable optimism about the use of DNA barcoding to provide quick, cheap and reliable information about species that can help in conservation strategies (cf. Valentini et al. 2008). Despite successful uses of genetic barcoding in classifying taxa, it is perhaps premature to suggest that DNA barcoding can offer a replacement paradigm for assessing and understanding biodiversity. Instead, it is an additional tool that incorporates genetic data into existing studies. DNA barcodes have been particularly helpful in identifying cryptic diversity within species, especially when used in conjunction with other taxonomic criteria such as morphology and ecology. Consequently, it can be argued that DNA barcodes can facilitate an integrative approach to species identification, since the most significant successes of DNA barcoding have been those that involve traditional taxonomic studies. Such validation is likely to be a consistent requirement of the application of DNA barcoding. The role of DNA barcodes in identifying undescribed species is ambiguous: barcodes can help in speeding up biodiversity assessments, but cannot provide information in the same way that complex species definitions can. Integrative taxonomic methods are required to address the ‘taxonomic impediment’ (Wilson 2000).

In particular, there are three main reasons why genetic barcoding should be used cautiously. First, no common distance threshold has been identified for species delimitation, and thresholds vary between taxonomic groups: some studies suggest a 1 % divergence (Ratnasingham and Hebert 2007) while others suggest a 3 % divergence (Hebert et al. 2003), but neither has proved useful for all taxa, especially where only a few specimens are available for analysis (Yassin et al. 2010). Establishing robust thresholds for species delimitation is a key component of the barcoding process: only after establishing a threshold in a target group is it possible to identify potential cryptic and overlooked species. Second, it is difficult to discriminate closely related species by using a species definition based on genetic distance approaches (Meier et al. 2006). Third, no single classification technique can be applied universally for species identification (cf. Taylor and Harris 2012); this is especially the case in the context of species discovery, which requires corroboration from morphology, geography, ecology, behaviour and molecular information (Hickerson et al. 2006). These limitations can increase uncertainty in species concepts and can obfuscate conservation efforts.
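
The sensitivity of distance-based delimitation to the chosen threshold can be shown with a short Python sketch. It computes the uncorrected proportion of differing sites (p-distance) between aligned sequences and flags the pairs that exceed a given threshold. The specimen names and sequences are invented toy data, the helper functions are hypothetical, and the uncorrected p-distance is an assumed simplification, since barcoding studies may use other distance models.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of differing sites, ignoring gaps and ambiguous bases."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a, seq_b):
        if a in "-N" or b in "-N":
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def pairs_above_threshold(barcodes, threshold):
    """Return (name_a, name_b, distance) for pairs diverging beyond threshold."""
    flagged = []
    for (name_a, seq_a), (name_b, seq_b) in combinations(sorted(barcodes.items()), 2):
        distance = p_distance(seq_a, seq_b)
        if distance > threshold:
            flagged.append((name_a, name_b, distance))
    return flagged


def mutate(seq, positions):
    """Build toy data by substituting the bases at the listed positions."""
    swap = {"A": "G", "G": "A", "C": "T", "T": "C"}
    chars = list(seq)
    for i in positions:
        chars[i] = swap[chars[i]]
    return "".join(chars)


# Hypothetical 100-bp fragments differing from the first specimen by 2 % and 5 %.
base = "ACGT" * 25
barcodes = {
    "specimen_1": base,
    "specimen_2": mutate(base, [3, 50]),
    "specimen_3": mutate(base, [10, 20, 30, 40, 60]),
}

at_one_percent = pairs_above_threshold(barcodes, 0.01)
at_three_percent = pairs_above_threshold(barcodes, 0.03)
print(len(at_one_percent), len(at_three_percent))
```

With these toy sequences a 1 % threshold separates all three specimens, whereas a 3 % threshold leaves the first two together, so the same data support different species counts depending on the cut-off adopted.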

DNA barcoding, therefore, seems to require integration into other analyses. Used in conjunction with ecological, morphological and other genetic studies (e.g. Smith et al. 2006; Creer et al. 2010), DNA barcoding has proved useful in enhancing data availability for existing research problems (Rubinoff et al. 2006). However, Meyer and Paulay (2005) point out that barcoding research has met with limited success when identifying new species in unstudied groups. For instance, barcoding fails to identify recently isolated populations and reproductively isolated lineages (Hickerson et al. 2006), fish species that have undergone recent speciation (Swartz et al. 2008) and complex groups that are sexually compatible with several species, such as wild potatoes (Solanum sect. Petota) (Spooner 2009). DNA barcoding therefore cannot be used on its own to enhance knowledge about biodiversity: a genetic study can only show that biodiversity is complex, without necessarily providing essential information about ecology (Holmes 2004).

It is likely that ecologists will encourage the DNA barcoding approach in various cases, because it is an objective, easy and replicable method to identify species. This trend will continue as reliable databases are expanded, and as barcoding techniques become cheaper and quicker. However, although DNA barcoding has helped raise the profile of taxonomic research, conservation decisions depend on a number of factors and are not solely contingent on scientific information (cf. Hambler 2004). Moreover, conservation funding is highly dependent on public support (Rubinoff 2006). Communities may oppose conservation strategies based on mtDNA barcodes since such conservation is based on something that they cannot appreciate because they cannot see or understand it (DeSalle 2005, 2006). Additionally, only a fraction of diversity may correspond to mtDNA divisions, and only that fraction of biodiversity could be conserved with DNA barcodes alone (Rubinoff 2006). Therefore, a challenge for DNA barcoding will be to integrate genetic data into the wider context of scientific, social, economic and political factors for effective conservation of biodiversity.