1 Reptiles: A Fundamental Component of Biodiversity

Reptiles are a group of vertebrate animals that comprises snakes, lizards, crocodiles, turtles and their relatives. The group originated around 310–320 million years ago, in the late Carboniferous period (Laurin and Reisz 1995) (http://www.ucmp.berkeley.edu/carboniferous/carboniferous.php). Reptiles either have four limbs or, like snakes, are descended from four-limbed ancestors. Unlike amphibians, reptiles do not have an aquatic larval stage (Sander 2012). Reptiles play an important role in the food webs of their ecosystems, acting as both predators and prey. They have also long been hunted and traded, particularly for food, traditional medicines, leather and decorative materials (http://www.endangeredspeciesinternational.org/reptiles3.html). Among modern-day reptiles, Squamata is the most diverse order, with more than 9600 species (Sander 2012).

Saudi Arabia

Saudi Arabia occupies most of the Arabian Peninsula, with the Red Sea and the Gulf of Aqaba to the west and the Persian Gulf to the east (Figure 1). It has a land area of 2,149,690 sq. km (http://www.factmonster.com/country/saudi-arabia.html) and contains the world’s largest continuous sand desert, the Rub Al-Khali or Empty Quarter. The desert has a subtropical, hot and arid climate throughout the year, very similar to that of the Sahara Desert, of which it is effectively an extension over the Arabian Peninsula. Temperatures swing between very high daytime heat and seasonal night-time freezes. The desert nevertheless provides reptiles with an excellent refuge from these extremes of climate, because even a few inches of sand offer effective insulation against heat and cold.

Fig. 1 Study site (Saudi Arabia) (http://www.operationworld.org/saud)

(http://www.saudiaramcoworld.com/issue/196805/the.toadhead.from.najad.and.other.reptiles.htm).

DNA Barcoding and Species Identification

The ability to accurately identify and describe species is indispensable for any biological research, but traditional morphology-based taxonomic approaches have managed to describe only 1–1.5 million species over the past 250 years (Chapple and Ritchie 2013; Mora et al. 2011), which is only around 10% of the Earth’s predicted eukaryotic diversity (Mora et al. 2011). At this pace, such laborious approaches are not expected to deliver a comprehensive inventory of the world’s biodiversity in the foreseeable future (Chapple and Ritchie 2013; Packer et al. 2009), and the task may take even longer given the sharp decline in the number of specialist taxonomists (Rodman and Cody 2003; Wheeler et al. 2004). The DNA barcoding approach was initiated in 2003 by Paul Hebert and his team at the University of Guelph, Ontario, as a way to overcome this taxonomic ‘impediment’ (Hebert et al. 2003). DNA barcoding is a promising tool for the rapid and accurate identification of species and for inventorying species diversity (Hebert et al. 2003; Dawnay et al. 2007), and it has been instrumental both in identifying existing species and in discovering new ones. It uses sequence variation in a short (648-bp) region of the mitochondrial cytochrome c oxidase I (COI) gene to aid species identification and discovery in large assemblages of life (Hebert et al. 2003; Savolainen et al. 2005). The approach works for species diagnosis because sequence divergences are generally much lower among individuals of the same species than between species (Hebert et al. 2003). This separation between intra- and inter-specific divergences, termed the ‘barcoding gap’ (Meyer and Paulay 2005), enables unknown sequences to be assigned to an existing species or flagged as a suspected new species.
A significant advantage of the DNA barcoding approach is that it works in situations where morphological approaches become confounding (Armstrong and Ball 2005; Chapple and Ritchie 2013), such as species with multiple life stages (Hebert et al. 2004), sexual dimorphism, or variable or plastic morphology (Smith et al. 2006, 2007; Burns et al. 2008). DNA barcoding is not only a powerful tool for species identification but can also play a vital role in wildlife forensics and conservation genetics (Wolinsky 2012). Cryptic species, which are morphologically similar but genetically distinct, are relatively common in nature, and DNA barcoding can be a very effective tool for detecting them (Hebert et al. 2004). It can also be very effective for molecular phylogenetic studies (Ajmal Ali et al. 2014).

2 Identification of Reptiles from Tabuk Region of Saudi Arabia through DNA Barcoding: A Case Study

2.1 BLAST Result Analysis

A total of 21 reptile specimens from the order Squamata were collected from the Tabuk Region of Saudi Arabia and sequenced. The BLAST search results for these sequences are detailed in Table 1. A Neighbour Joining (NJ) tree was constructed using the developed sequences together with the downloaded BLAST hits for each sequence. Only the highest-scoring BLAST hits with an E-value close to 0 were considered. Only eight of the developed sequences had conspecific sequences available in the database. The remaining sequences matched the closest available relative in the database, such as a congeneric or confamilial species. In some rare cases, in the absence of a true phylogenetic relative in the database, the closest hit was to a species from a completely different taxon, such as Aves or Anguilliformes. These hits had comparatively high E-values, which marks them as false positives for identification purposes. For example, in the absence of a conspecific sequence, Acanthodactylus opheodurus produced a hit with 98% query coverage but only 82% similarity to Conger sequences, which belong to the order Anguilliformes. The E-value of this match (1.00E-125), although small in absolute terms, was high relative to the near-zero values required above, and at 82% identity the hit reflects only the general conservation of COI, not a close relationship. The taxonomic details of the BLAST hits are given in Table 2.

Table 1 Similarity match with GenBank sequences using nucleotide BLAST. The results show the closest match among the available database sequences. Similarity is expressed as percentage identity, together with the E-value
Table 2 Taxonomic details of the BLAST hit results in NCBI
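
The hit-selection rule described in this section (keep the highest-scoring hit per query, and only if its E-value is close to 0) can be sketched in Python as follows. This is a minimal illustration, not the workflow actually used in the case study. It assumes BLAST results saved in the standard 12-column tabular layout (`-outfmt 6`). The E-value cut-off of 1e-50 is an arbitrary assumption, and the query and subject identifiers in the sample rows are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    query: str
    subject: str
    identity: float  # percent identity
    evalue: float
    bitscore: float


def parse_outfmt6(text):
    """Parse BLAST tabular output (12 default columns of -outfmt 6)."""
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        f = line.split("\t")
        # Columns: qseqid sseqid pident length mismatch gapopen
        #          qstart qend sstart send evalue bitscore
        hits.append(Hit(f[0], f[1], float(f[2]), float(f[10]), float(f[11])))
    return hits


def best_hits(hits, max_evalue=1e-50):
    """Keep the highest-bitscore hit per query among hits passing the cut-off.

    max_evalue is an assumed threshold chosen only for this sketch.
    """
    best = {}
    for h in hits:
        if h.evalue > max_evalue:
            continue
        if h.query not in best or h.bitscore > best[h.query].bitscore:
            best[h.query] = h
    return best


# Hypothetical sample rows; identifiers and values are invented.
rows = [
    ("q1", "subjA", "99.5", "650", "3", "0", "1", "650", "1", "650", "0.0", "1180"),
    ("q1", "subjB", "88.0", "640", "70", "2", "1", "640", "5", "644", "1e-150", "700"),
    ("q2", "subjC", "82.0", "630", "110", "3", "5", "634", "1", "630", "1e-125", "480"),
    ("q3", "subjD", "75.0", "120", "30", "0", "1", "120", "1", "120", "1e-10", "90"),
]
sample = "\n".join("\t".join(r) for r in rows)

for query, hit in sorted(best_hits(parse_outfmt6(sample)).items()):
    print(query, hit.subject, hit.identity, hit.evalue)
```

Note that the hypothetical `q2` hit passes the E-value filter despite only 82% identity, which is why percent identity should be inspected alongside the E-value before a hit is treated as a species-level match.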

3 Species Identification Using BOLD

The BOLD Identification System (IDS) was used to establish the species identity of the developed sequences. For COI, this system accepts sequences from the 5′ region of the mitochondrial cytochrome c oxidase subunit I gene and returns a species-level identification when one is possible. We searched both the private and published data in BOLD through the “All Barcode Records” search engine. This search returns every COI barcode record on BOLD with a minimum sequence length of 500 bp, including records from the unvalidated library and records without species-level identification. It also includes many species represented by only one or two specimens, as well as all species with interim taxonomy. In addition, the “Species Level Barcode Records” search was used to extract a list of the nearest matches, together with a probability of placement to a taxon.

Among the twenty-one COI barcode sequences developed in the lab, species status could be confirmed for only five using the BOLD identification system. For most of the remaining sequences, conspecific sequences were not available in the BOLD database. Table 3 gives a detailed description of the similarity matches obtained with the BOLD identification system; the top five matches from the “All Barcode Records” search are shown for each sequence. For 001(F) Chamaeleo chamaeleon, fifteen conspecific COI sequences were available in the BOLD database, but the top five matches did not show close identity with any of them. Instead, the sequence showed 99.81% similarity with Diplometopon zarudnyi, and IDS identified it as Diplometopon zarudnyi. Such incongruence may be due to hybrid or mislabelled sequences. Conspecific sequences for 2R(F) Chalcides ocellatus were not available in the BOLD database. 3R(F) Scincus mitranus showed 95.4% similarity with a congeneric Scincus scincus sequence available in a private database. Three sequences of Eurylepis taeniolatus were found in the early-release section; however, they showed an average match of only 87% with 5R(F) Eurylepis taeniolatus. Four sequences of Stellagama stellio were present in the database. They showed 88%–96% similarity with 7(F) Stellagama stellio, and IDS did not identify this sequence to species level. However, 8(F) Stellagama stellio was identified to species level, as it showed 98.5% similarity with a conspecific database sequence. The developed sequences of Pseudotrapelus aqabensis (9(F) and 10(F)) showed 99% similarity with database sequences and were identified correctly by IDS. 12(F) Diplometopon zarudnyi showed 99% similarity with a database sequence and was also identified correctly to species level.
Rhagerhis moilensis and Mesalina brevirostris did not have any conspecific sequences available in the database. However, Mesalina brevirostris showed 98% similarity with Acanthodactylus boskianus and was therefore identified by IDS as that species. Cyrtopodion scabrum has a conspecific sequence available in the database, but the developed sequence did not show significant similarity with it. 21(F) Stenodactylus doriae showed 81–89% similarity with the available database sequences, while 22(F) Stenodactylus doriae showed 91% similarity. 63(F) Stellagama stellio did not match any of the available database sequences.

Table 3 Species identification using the BOLD-IDS (Barcode of Life Data System Identification System) search engine. The developed sequences of the specimens were checked for similarity matches in the Public Record Barcode Database of BOLD-IDS for comprehensive species identification
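
The pattern in these results, where sequences at about 98% similarity or higher received a species name and those at 96% or lower did not, can be mimicked with a simple similarity threshold. The Python sketch below is only an illustration of that idea and is not the algorithm used by BOLD. The 97% cut-off and the function name `call_species` are assumptions, and the similarity values are taken from the text above.

```python
def call_species(matches, threshold=97.0):
    """Decide on a species call from a list of (species, similarity %) matches.

    Returns a (status, species) pair:
      ("species", name)    exactly one species reaches the threshold
      ("ambiguous", None)  two or more species reach the threshold
      ("unresolved", None) no match reaches the threshold
    The default threshold is an assumed value, not a BOLD setting.
    """
    above = {species for species, similarity in matches if similarity >= threshold}
    if not above:
        return ("unresolved", None)
    if len(above) > 1:
        return ("ambiguous", None)
    return ("species", next(iter(above)))


# Similarity values as reported in the text for three developed sequences.
print(call_species([("Diplometopon zarudnyi", 99.81)]))
print(call_species([("Scincus scincus", 95.4)]))
print(call_species([("Stellagama stellio", 96.0), ("Stellagama stellio", 88.0)]))
```

A threshold rule of this kind is only as reliable as the reference library. As the Mesalina brevirostris result shows, a query with no conspecific reference can still exceed the threshold against a related species and receive the wrong name.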

4 Neighbour-Joining (NJ) Clustering

The Neighbour Joining tree of all the species in this study is shown in Fig. 2. The phylogenetic reconstruction used the Kimura 2-parameter (K2P) distance model, as per the standard DNA barcoding protocol. In this tree, 001F_Chamaeleo chamaeleon clusters with three Diplometopon zarudnyi sequences, of which two (12(F) and 60(F)) were generated in the lab and one, AY605474, was taken from the database. Such clustering could result from a mislabelled or misidentified sequence, or possibly from introgression between species. 2R(F) Chalcides ocellatus clusters separately, as no conspecific sequence is available in the database. However, it lies close to Sceloporus virgatus (KU985908, KU985944), which belongs to the same order (Squamata) but to a different family (Phrynosomatidae). 3RF_Scincus mitranus clusters separately but close to three confamilial database sequences of Oligosoma maccanni (KC349720, KC349736, KC349722). Eurylepis taeniolatus also forms a distinct branch near three sequences of Myrophis platyrhynchus (GU224956, GU224963-64), which belong to the Anguilliformes. 7R and 8R Stellagama stellio cluster together with a database sequence (KF691700) of the same species. However, 63R Stellagama stellio clusters separately, close to 16(F) Cerastes gasperettii. 009F and 10R Pseudotrapelus aqabensis cluster together with the conspecific database sequence KP994947. Moreover, four database sequences (KP979760, KP994949, KP979759, KP994946) from three congeneric species of Pseudotrapelus form a distinct cluster under the same node. As no conspecific sequences are present in the database, 13(F) Rhagerhis moilensis shows its closest hit with Mimophis mahfalensis, which belongs to the same family. In the NJ tree, too, the two sequences form a close cluster distinct from other families.
19(F) Cyrtopodion scabrum forms a subcluster with three sequences of the genus Hemidactylus: two sequences (KU567377, KU567474) from two species were taken from the database, and one, 30(F) Hemidactylus flaviviridis, was developed in the lab. Both genera belong to the family Gekkonidae. 21(F) and 22(F) Stenodactylus doriae cluster together with other sequences of the family Gekkonidae. The Lacertidae species 25(F) Mesalina brevirostris and 26R(F) and 29(F) Acanthodactylus opheodurus form a distinct cluster. However, 27(F) Phoenicolacerta kulzeri ssp. khazaliensis forms a separate cluster together with a conspecific database sequence, FJ460596.
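
The K2P distance that underlies the NJ tree corrects the observed differences between two sequences by treating transitions (A↔G, C↔T) and transversions separately: d = −½ ln(1 − 2P − Q) − ¼ ln(1 − 2Q), where P and Q are the proportions of transitional and transversional differences. The Python sketch below is a minimal illustration of this formula, not the software used in the case study. It assumes two pre-aligned sequences of equal length, skips sites with gaps or ambiguous bases, and uses invented toy sequences.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(a, b):
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for x, y in zip(a.upper(), b.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue  # skip gaps and ambiguous bases
        sites += 1
        if x == y:
            continue
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1    # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1  # purine<->pyrimidine
    if sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / sites
    q = transversions / sites
    w1 = 1 - 2 * p - q
    w2 = 1 - 2 * q
    if w1 <= 0 or w2 <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(w1) - 0.25 * math.log(w2)


# Invented toy sequences: a single transition (A -> G) in 10 sites.
print(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"))
```

Computing this distance for every pair of sequences gives the distance matrix from which a Neighbour Joining tree is built. For closely related barcodes the K2P value is only slightly larger than the raw proportion of differing sites.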

Fig. 2 Neighbour Joining tree of COI sequences of all the reptile species from Tabuk Region of Saudi Arabia along with the other database sequences as study replicates

This case study demonstrates the effectiveness of COI barcodes in discriminating reptile species from Saudi Arabia that were recognized through prior taxonomic work, and it contributes to the growing library of DNA barcodes of the world’s animal species. The study shows that the partial COI gene enables accurate species identification where adequate reference sequence data exist. Some species groups with overlapping barcodes identified in this study are good candidates for further studies of phylogeography and speciation processes. Further phylogenetic work on these species will reveal which of these highly divergent and geographically separated populations should be treated as belonging to the same species and which as sister species.