Introduction

At present, DNA is commonly used as a source of evidence in comparative biology. It has brought important changes to molecular biology, molecular systematics, genomics, bioinformatics and even to alpha-taxonomy (Miller, 2007; Goldstein & DeSalle, 2011). Nucleotide sequences are also applied to identify organisms (Hebert et al., 2003a). This use is known as DNA barcoding, as a nod to the application of unique combinations of variable-width vertical marks to identify products in commercial activities (Hebert et al., 2003b; Stoeckle & Hebert, 2008; reviewed by Brower, 2006 and Meier, 2008; Cräutlein et al., 2011). The DNA barcode is intended not only for taxonomic work to identify overlooked species, but also for agricultural, clinical, ecological, forensic, illegal trade-related and even recreational applications (Janzen, 2004; Hebert & Gregory, 2005; Schindel & Miller, 2005; Stoeckle & Hebert, 2008; Alacs et al., 2010).

DNA barcoding in land plants is based on standardized regions of the chloroplast genome. Following a debate and a long period of testing, a two-locus core barcode consisting of the coding regions matK and rbcL was proposed by the Consortium for the Barcode of Life (CBOL) as the core barcode for land plants (CBOL Plant Working Group, 2009). To reach this conclusion, the Plant Working Group compared the performance of seven leading candidate plastid DNA regions: atpF-atpH spacer, matK gene, rbcL gene, rpoB gene, rpoC1 gene, psbK-psbI spacer, and trnH-psbA spacer (CBOL Plant Working Group, 2009). Both coding regions had previously been widely used in plant phylogenetic studies (reviewed by Palmer et al., 2004).

Among the applications of DNA barcoding for plant conservation is the identification of illegally traded endangered species from small samples or vegetative specimens. DNA barcoding offers phytosanitary authorities an important tool to identify species in groups such as bamboos, cacti, cycads and orchids, all of which fetch high prices in the horticultural trade and in many cases are regulated under the CITES agreement on the international trade of endangered species.

Orchids are highly sought after by collectors and nurseries worldwide. In Mexico, 188 species of Orchidaceae are classified as endangered; 60 % are critically endangered and, moreover, Laelia gouldiana Rchb.f. is considered extinct (SEMARNAT, 2001). Populations are being decimated by habitat destruction and by the extraction of plants from natural habitats for illegal export and sale in local markets (Flores-Palacios & Valencia-Díaz, 2007; Chow, 2010). Another plant group whose species are heavily collected from native vegetation is the woody bamboos (Bambusoideae, Poaceae). These plants have large, showy stems, which are used for constructing buildings and for making handicrafts. At present, bamboo cane is a good substitute for timber and an increasing number of nurseries are becoming involved in its trade (Lobovikov, 2003). Of the 41 bamboo species found in Mexico, 38 are endemic and restricted to a few localities; most of the species in two genera, Olmeca Soderstr. and Otatea (McClure & W.W. Sm.) C.E. Calderón & Soderstr., are endangered. In particular, Otatea glauca L.G. Clark & G. Cortés, O. ximenae Ruiz-Sanchez & L.G. Clark and Olmeca zapotecorum Ruiz-Sanchez, Sosa & Mejía-Saulés are critically endangered (Ruiz-Sanchez et al., 2008, 2011).

In this paper we set out to create a DNA barcode library for 20 endangered Orchidaceae species and 36 species of bamboo distributed in Mexico. We apply several metrics to evaluate the efficiency of the matK and rbcL barcodes and, for the bamboos, that of the plastid spacer psbI-K, which was one of the candidate barcode loci evaluated by CBOL.

Materials and Methods

Sampling

A total of 20 endangered species of Orchidaceae were included in the study. They were selected from the Mexican Red List (SEMARNAT, 2001); moreover, they are found in the cloud forests of the mountains of the Gulf of Mexico, one of the most endangered habitats in the country. Tissues were obtained mostly from the living collection of the Botanical Garden Clavijero or in the field (Appendix 1). Of the 41 species of bamboo found in Mexico, 36 were analysed. They were collected in the field as part of the project to establish the collection of Mexican native bamboos at the Botanical Garden Clavijero. To determine the variation of matK and rbcL among individuals of the same species, we obtained nucleotide sequences for four or five individuals of Arthrostylidium excelsum, Olmeca fulgor, Olmeca reflexa and Rhipidocladum martinezii (Table 1). For orchids we were not able to obtain several individuals of the same species. Vouchers are indicated in Appendix 1.

Table 1 DNA sequence statistics for species studied. Length = length in base pairs (bp), P = polymorphic sites, S = singleton variable sites, K = average number of nucleotide differences, π = nucleotide diversity, PI = parsimony informative sites

DNA Sequencing

DNA was isolated using either the modified 2× CTAB method (Rogers & Bendich, 1985; Doyle & Doyle, 1987; Cota-Sánchez et al., 2006) or the DNeasy Plant Mini Kit (Qiagen, Valencia, California), following the manufacturer’s instructions. For bamboos, the rbcL and matK genes were amplified and sequenced using the primer pairs rbcLa_f - rbcLajf634R for the former and matK Xf - matK 5r for the latter, following the protocols outlined by Erickson et al. (2008), CBOL Plant Working Group (2009), and Dunning & Savolainen (2010). The psbK-psbI spacer was amplified and sequenced using the primers and protocols of Lahaye et al. (2008). For orchids, the primers rbcLa_f (Erickson et al., 2008; CBOL Plant Working Group, 2009) and rbcLajf-634R (Fazekas et al., 2008; CBOL Plant Working Group, 2009) were used to amplify rbcL. For matK, the primer pair Xf - 5r was used (Ford et al., 2009; CBOL Plant Working Group, 2009). For both the bamboo and the orchid sequences, amplified double-stranded DNA fragments were purified using QIAquick columns (Qiagen) following the manufacturer’s protocols and were subsequently sequenced at a commercial facility (Macrogen Inc., Seoul, Korea). GenBank accession numbers are recorded in Appendix 1. Electropherograms were edited and assembled using Sequencher 4.1 (Gene Codes, Ann Arbor, MI). Sequences were aligned manually using the program Se-Al v. 2.0a11 (Rambaut, 2002).

Evaluation of Barcodes

Analyses of nucleotide polymorphisms were performed on the aligned matrix of every barcode with the program DnaSP (DNA Sequence Polymorphism, v. 3.99, Rozas et al., 2003). Inter- and intra-specific genetic divergences were calculated for every DNA barcode locus. Average pairwise distances were estimated between all species, as well as among the four or five accessions of the selected bamboo species. For each single barcode and for each combination, pairwise distances were calculated with the simple K2P (Kimura two-parameter) model following Lahaye et al. (2008), using the software MEGA 5.0 (Tamura et al., 2011). The use of this model also follows the CBOL recommendations for distance calculations (barcoding.si.edu/). Following Lahaye et al. (2008), phylogenetic analyses were carried out to evaluate whether each barcode retrieved species or genera as monophyletic and was thus able to identify them. Parsimony analyses were run in TNT (Goloboff et al., 2003), under a traditional search, with equal weights and 10,000 iterations saving 100 trees per run. Clade support was estimated by bootstrap (BS), with 1,000 replicates. In addition, UPGMA trees were inferred from K2P distances with MEGA 5.0 (Tamura et al., 2011) and evaluated by bootstrap.
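As an illustration of the distance measure used here, the K2P distance between two aligned sequences can be written as d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of sites showing transitions and transversions, respectively. The following Python sketch is not the MEGA implementation; the function name k2p_distance is hypothetical, and the treatment of gaps and ambiguity codes (pairwise deletion) is an assumption made for the example.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura two-parameter distance between two aligned DNA sequences.

    Assumption for this sketch: sites where either sequence has a gap or
    an ambiguity code are skipped (pairwise deletion).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # gap or ambiguous base: site not compared
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; every other change is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared    # proportion of transitional differences
    q = transversions / compared  # proportion of transversional differences
    if 1 - 2 * q <= 0 or 1 - 2 * p - q <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

Applying the function to every pair of sequences in an alignment yields a pairwise distance matrix of the kind from which average distances and UPGMA trees are derived.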

Results

Molecular Characteristics

Amplification was successful for each barcode tested, for bamboos as well as for orchids; nucleotide sequences were therefore obtained for all taxa. Table 1 includes DNA sequence statistics for every DNA barcode and all possible combinations. In the orchids, matK was the plastid region with the highest number of polymorphic sites, while for the bamboos it was psbI-K. The second barcode locus proposed by CBOL, rbcL, had the fewest polymorphic sites. For bamboos, the combination of matK + psbI-K retrieved more polymorphic sites than the combination of matK + rbcL.
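The statistics reported in Table 1 can be illustrated with a short Python sketch. It is not the DnaSP implementation: the function name polymorphism_stats is hypothetical, alignment columns containing gaps or ambiguity codes are excluded (an assumption), and singleton variable sites are taken to be polymorphic sites that are not parsimony informative.

```python
from collections import Counter
from itertools import combinations


def polymorphism_stats(alignment):
    """Summary statistics for a list of aligned DNA sequences.

    Assumption for this sketch: columns with a gap or ambiguity code in
    any sequence are removed before counting (complete deletion).
    """
    seqs = [s.upper() for s in alignment]
    if len(seqs) < 2 or len({len(s) for s in seqs}) != 1:
        raise ValueError("need at least two sequences of equal length")

    columns = [col for col in zip(*seqs) if all(b in "ACGT" for b in col)]
    if not columns:
        raise ValueError("no gap-free columns to analyse")

    polymorphic = informative = 0
    for col in columns:
        counts = Counter(col)
        if len(counts) < 2:
            continue  # monomorphic site
        polymorphic += 1
        # Parsimony informative: at least two bases, each present at least twice.
        if sum(1 for c in counts.values() if c >= 2) >= 2:
            informative += 1

    # K: mean number of differences over all pairs of sequences.
    cleaned = list(zip(*columns))  # sequences restricted to retained columns
    diffs = [sum(x != y for x, y in zip(a, b))
             for a, b in combinations(cleaned, 2)]
    k = sum(diffs) / len(diffs)

    return {
        "length": len(columns),
        "P": polymorphic,
        "S": polymorphic - informative,  # singleton variable sites
        "PI": informative,
        "K": k,
        "pi": k / len(columns),          # nucleotide diversity per site
    }
```

For example, polymorphism_stats(["AAAAA", "AAATA", "ACATA", "ACAAG"]) counts three polymorphic sites, of which two are parsimony informative and one is a singleton.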

Inter- and Intra-Specific Diversity

The performance of each barcode and of all their combinations was assessed by means of genetic distances from the K2P (Kimura two-parameter) pairwise distance matrices (Table 2, Fig. 1). The highest inter-specific diversity was obtained with matK for Orchidaceae and with the combination of matK and psbI-K for Bambusoideae; matK also showed the highest intra-specific variation in bamboos. For both orchids and bamboos (intra- and inter-specific variation), rbcL had the lowest distances.
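The comparison of intra- and inter-specific distances can be sketched as follows. The function name mean_intra_inter and the input layout (one species label per sequence plus a square, symmetric distance matrix in the same order) are assumptions made for this example, not part of the MEGA workflow.

```python
from itertools import combinations


def mean_intra_inter(labels, dist):
    """Mean intra- and inter-specific distance from a pairwise matrix.

    labels: species name for each sequence (hypothetical input layout).
    dist:   square, symmetric matrix of pairwise distances, same order.
    Returns (mean_intra, mean_inter); a value is None when no pair of
    that kind exists, e.g. when every species has a single accession.
    """
    if len(dist) != len(labels) or any(len(row) != len(labels) for row in dist):
        raise ValueError("matrix size must match the number of labels")

    intra, inter = [], []
    for i, j in combinations(range(len(labels)), 2):
        if labels[i] == labels[j]:
            intra.append(dist[i][j])  # two accessions of the same species
        else:
            inter.append(dist[i][j])  # accessions of different species

    def mean(values):
        return sum(values) / len(values) if values else None

    return mean(intra), mean(inter)
```

With species sampled only once, as for the orchids in this study, the first value is None and only the inter-specific mean is available.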

Table 2 Measures of inter-specific K2P distances for barcodes.
Fig. 1 Evaluation of barcodes. P = % polymorphic sites, S = % singleton variable sites, K = average number of nucleotide differences. Parsimony = % monophyletic groups retrieved by parsimony. UPGMA = % monophyletic groups retrieved by UPGMA

Species Identification

The performance of each DNA barcode in identifying and delineating species was assessed by the percentage of monophyletic groups recovered by the parsimony and UPGMA analyses (Table 3, Fig. 1). UPGMA reconstruction yielded higher values of species monophyly than parsimony reconstruction. One hundred percent monophyly was retrieved with matrices based on matK alone for orchids, for individuals of the same species in bamboos, and for inter-specific monophyly of genera in bamboos. The percentage of monophyly retrieved from inter-specific matrices with matK was identical to that retrieved with the combination of matK + psbI-K for species of bamboos. The lowest percentage of monophyly in both bamboos and orchids was obtained with rbcL.
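The percentage of monophyletic groups can be computed from a tree as in the following Python sketch. Several simplifications are assumed: the tree is rooted and given as nested tuples of tip names, the mapping from tips to groups (species or genera) is supplied by the user, groups with a single tip are not counted, and bootstrap support is ignored, whereas Table 3 only counts groups with BS > 50 %. The function names are hypothetical.

```python
def tip_sets(tree, clades):
    """Return the tips below `tree`, adding each internal clade to `clades`."""
    if isinstance(tree, str):
        return frozenset([tree])  # a leaf is a single tip name
    tips = frozenset().union(*(tip_sets(child, clades) for child in tree))
    clades.add(tips)
    return tips


def percent_monophyletic(tree, group_of):
    """Percentage of groups whose tips form exactly one clade of the tree.

    tree:     rooted tree as nested tuples, e.g. (("a1", "a2"), "b1").
    group_of: dict mapping each tip name to its species or genus.
    Only groups with at least two tips are evaluated (an assumption).
    """
    clades = set()
    tip_sets(tree, clades)

    members = {}
    for tip, group in group_of.items():
        members.setdefault(group, set()).add(tip)

    testable = [frozenset(t) for t in members.values() if len(t) >= 2]
    if not testable:
        raise ValueError("no group has two or more tips")

    hits = sum(1 for tips in testable if tips in clades)
    return 100.0 * hits / len(testable)
```

For the tree ((("a1", "a2"), "b1"), ("b2", ("c1", "c2"))), groups a and c are each recovered as a clade while group b is split, so two of the three groups are monophyletic.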

Table 3 Proportion (%) of monophyletic groups (genera or species) (with BS > 50 %) recovered by the Parsimony/UPGMA analyses

Discussion

For orchids, our results are concordant with those reported by Asahina et al. (2010), in that matK alone identified more taxa than rbcL. These authors obtained sequences of the two markers for five medicinal species in Dendrobium Sw. Our results also match those obtained by Gigot et al. (2007), who found small genetic distances among Mesoamerican orchid species (intra- and inter-specific variation) using plastid regions proposed in phase 2 of the DNA barcode protocol for plants, including rbcL and matK. The Mesoamerican orchids we studied can be identified to at least the genus level by sequencing matK alone. Lahaye et al. (2009) also concluded that matK is the most variable DNA barcode for a broader array of species in several families found at Kruger National Park, South Africa. By contrast, Kress et al. (2009, 2010) used a three-locus barcode to identify tropical plant species in Panama and tropical tree species in Puerto Rico.

There is still no DNA barcoding project for species in subfamily Bambusoideae. The only record for Poaceae is the detection of a cryptic species in Tripogon Roem. & Schult. (subfamily Chloridoideae) based on matK and trnH-psbA sequences (Ragupathy et al., 2009). In our study we further estimated the variation of matK, psbI-K and rbcL in four or five individuals of the same bamboo species. We found that this variation is very low; thus, with the combination of the three barcodes, it is possible to identify a species by sequencing a single sample.

In Mesoamerica, the cycads comprise another group of plants that is highly threatened and vulnerable to illegal trade. Of the approximately 55 species of cycads from Mexico, 80 % are endangered (Vovides, 2000; SEMARNAT, 2001). Nicolalde-Morejón et al. (2011) examined the performance, as barcodes for the cycad genera Ceratozamia, Dioon and Zamia, of seven plastid DNA regions considered in phase 2 by CBOL and of the nuclear ITS region. They concluded that the successful identification of species would require sequencing four or five markers. Thus, in comparison with bamboos and orchids, in which a two-locus barcode is useful for identification at the species level, cycads require two or three additional barcodes to recognize genera.

Our results indicate that endangered Mesoamerican orchids can be identified at the generic level using only matK. Moreover, the failure to identify species of bamboos with the two-locus core barcode indicates that the proposed barcode combination needs further refinement, as indicated by a number of previous studies (reviewed by Vijayan & Tsou, 2011). However, it is noteworthy that the 20 orchid species included here are not as closely related as the studied bamboos. We sampled species from most of the Orchidaceae clades, whereas for the bamboos our sampling was restricted to subfamily Bambusoideae. Bamboos in Mesoamerica are closely related and belong to the subtribes Guaduinae, Arthrostylidiinae and Chusqueinae, grouped in this subfamily (Ruiz-Sanchez et al., 2008, 2011).

Our results agree with those of other barcoding projects, in which matK retrieved more variation in land plant groups. In orchids, more species were identified using matK alone. For bamboos, we found that the spacer psbI-K retrieved more polymorphic sites and that, in combination with matK, it allowed us to identify bamboos to at least the generic level.