Introduction

South Asia is known for its rich biodiversity. Many of its species are endemic to the Indian subcontinent, and the majority are threatened with extinction because of changes in habitat quality and quantity [1]. The Indian antelope or Blackbuck (Antilope cervicapra) is one of the most charismatic ungulate species endemic to the Indian subcontinent. The Blackbuck is protected under Schedule I of the Wildlife (Protection) Act, 1972 of India, and is also protected under CITES, which prohibits trade in its parts and products. However, the IUCN lists the Blackbuck as a ‘Least Concern’ species.

The Blackbuck is an important ecological indicator species for grassland ecosystems; however, it is also considered a pest in many regions of India, where it feeds mainly on crops [2]. The Blackbuck was earlier distributed throughout India and forms an important prey base for large carnivores of dry tropical habitats, but its occurrence has recently declined dramatically due to poaching, anthropogenic factors and habitat loss [3,4,5]. Blackbuck meat samples seized in offence cases may be cooked, partially raw, fresh or degraded. Skins are sometimes seized whole or in pieces, while other parts traded in India include horns and their products, which are used in carved trophies and other decorative items. To improve our ability to detect, monitor and control the trade in wildlife and wildlife products, samples must be processed and analyzed using robust molecular biology techniques in cases where morphology-based approaches have limited application.

Species have often been identified across various taxa using different mitochondrial genes, viz. Cytochrome b (Cyt b), Cytochrome Oxidase I (COI), 12S rRNA, 16S rRNA, the control region and other markers [6,7,8,9,10,11,12,13,14,15]. Among the approaches commonly deployed in wildlife forensics, “Forensically Informative Nucleotide Sequencing” (FINS) is used for species assignment based on specifically informative nucleotide sequences [16]. Over the years, FINS applied to different genes of the mitochondrial genome has been used to detect substitution in seafood, marine species and canned products [17,18,19] and to identify lizards and Indian civets [20, 21]. The variable sites in these genes carry enough information to differentiate species at the inter-familial level.

So far, only a few studies have documented the use of mitochondrial genes for species identification of Blackbuck in forensic cases [22, 23], and these were restricted to samples from a particular geographic location in India. Therefore, in the present paper we discuss, for the first time, a multi-gene approach that minimizes the possibility of false-positive species identification from wildlife parts, and we establish the barcode region for identification of the Blackbuck population in India. We also report intra-species variation in the Blackbuck population across its geographic distribution range in India using FINS, and compare its affinity with other closely related Bovidae species.

Materials and methods

Collection of samples

Biological samples of Blackbuck (n = 60) used in the present study were collected from 2012 to 2014 (Fig. 1). The base map was drawn with reference to previously published data [24]. Most of the samples were procured from the reference repository at the Wildlife Institute of India (WII), which has been established with support from various forest departments, through research projects and from parts confiscated in wildlife offences across India. Confiscated samples were included only if their geographic origin was known from the case history and they had been seized within a specific distribution range in a protected area rather than in transit. A few samples were collected during field transects from carcasses of individuals found dead of accidental or natural causes in protected areas, or from animals held at rescue centers across India. The samples comprised tissues (n = 55), hairs (n = 2) and horns (n = 3). Because the samples were collected under extreme field conditions, most of the tissue samples showed varying degrees of decay.

Fig. 1 Blackbuck samples used, shown with respect to the distribution of Blackbuck as proposed by Meena and Saran [24]. Samples outside the distribution range were seized under wildlife offences

DNA extraction and PCR amplification

DNA was extracted from the biological samples using a method suited to each sample type. Prior to extraction, all tissue samples were rinsed with absolute alcohol and washed repeatedly with Milli-Q water to remove any surface contamination before the DNA extraction process. Total genomic DNA was extracted from the tissue samples (n = 55) using the DNeasy Blood & Tissue Kit™ (Qiagen, Valencia, CA, USA) with minor modifications to the standard protocol: we increased the incubation time to 12 h and the temperature to 60 °C, and increased or decreased the quantity of lysis buffer according to sample quantity and requirements. DNA was isolated from horn (n = 3) and hair (n = 2) samples using the commercially available Merck GeNei Hair and Bone/Horn DNA isolation kit. All extracts were run on a 0.8% agarose gel in 1x TAE buffer and observed under a UV transilluminator to check the concentration and quantity of the DNA yield. DNA concentration varied with sample condition and type; the templates were therefore diluted to an approximately uniform concentration to ease amplification.

Partial fragments of the mitochondrial genes Cytochrome c oxidase subunit I (COI; 650 bp) [7], Cyt b (381 bp) and 16S rRNA (550 bp) [9, 25] were amplified from the DNA templates (Supplementary Table S1). Each gene was amplified individually in a 10 µl reaction volume containing 1 µl of 1x PCR buffer, 0.5 µl of 10 mM dNTPs, 0.5 µl of 25 mM MgCl2, 0.4 µl of BSA, 0.5 U of Taq DNA polymerase and 2 µl of ~ 20 ng genomic DNA. A negative control and a positive control were set up alongside the reaction mixture to check for contamination. Thermal cycling conditions varied with the primer pair used. For the Cyt b and 16S rRNA genes, the conditions were: initial denaturation at 94 °C for 5 min; 35 cycles of denaturation at 94 °C for 40 s, annealing at 53 °C for 45 s and extension at 72 °C for 40 s; and a final extension at 72 °C for 10 min. For the COI gene, annealing was performed at 45 °C for 30 s over 40 cycles, with denaturation and extension the same as for the other genes. Amplification success was checked under a UV transilluminator after running the amplicons on 2% agarose gels in 1x TAE buffer. To remove residual primers and dNTPs, the PCR amplicons were treated with Exonuclease-I and Shrimp Alkaline Phosphatase (Thermo Scientific Inc.) at 37 °C and 85 °C for 15 min each. The purified products were then cycle sequenced using the Big Dye Terminator v3.1 Cycle Sequencing Kit (Applied Biosystems, CA) with their respective primers in standard proportions, and the fragments were sequenced directly on an Applied Biosystems Genetic Analyzer 3130. Sequences were generated with both forward and reverse primers, and a single consensus sequence was used for each sample.
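The two thermal profiles described above can be written down as data, which makes the differences between the Cyt b/16S rRNA and COI programs easy to compare. The following is only an illustrative sketch: the class and function names (`Step`, `Program`, `make_program`) are hypothetical, and the computed time counts programmed hold times only, ignoring ramping between temperatures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    seconds: int


@dataclass(frozen=True)
class Program:
    initial: Step
    cycle: tuple  # steps repeated in every cycle
    n_cycles: int
    final: Step

    def hold_seconds(self) -> int:
        """Sum of programmed hold times (ramping time is not included)."""
        per_cycle = sum(step.seconds for step in self.cycle)
        return self.initial.seconds + self.n_cycles * per_cycle + self.final.seconds


def make_program(anneal_c: float, anneal_s: int, n_cycles: int) -> Program:
    # Denaturation and extension steps are shared by all three genes;
    # only the annealing step and the cycle count differ.
    return Program(
        initial=Step("initial denaturation", 94, 5 * 60),
        cycle=(
            Step("denaturation", 94, 40),
            Step("annealing", anneal_c, anneal_s),
            Step("extension", 72, 40),
        ),
        n_cycles=n_cycles,
        final=Step("final extension", 72, 10 * 60),
    )


PROGRAMS = {
    "Cyt b / 16S rRNA": make_program(anneal_c=53, anneal_s=45, n_cycles=35),
    "COI": make_program(anneal_c=45, anneal_s=30, n_cycles=40),
}

for gene, program in PROGRAMS.items():
    minutes = program.hold_seconds() / 60
    print(f"{gene}: {minutes:.1f} min of programmed hold time")
```

Under these assumptions both programs run for a little under 90 min of hold time, since the shorter annealing step for COI roughly offsets its five additional cycles.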

Data analysis

The raw DNA sequence data of Blackbuck for all three genes were first checked and edited in Sequencher (v.3.1, Gene Codes Corporation). All sequences were then inspected visually and trimmed to identical length in BioEdit v 7.0.9.0 [26]. In total, 119 Blackbuck sequences were generated in the present study from the 60 samples used for each gene: COI (n = 36), Cyt b (n = 34) and 16S rRNA (n = 49) (Supplementary Table S1). The variation in sample size was mainly due to amplification failure in a few samples, which can be attributed to the use of diluted DNA templates over a brief period of time or to unexplained factors. Further, most of the tissue samples yielded DNA sequences for all three mitochondrial gene fragments, whereas problematic sample types such as hairs and horns failed to amplify in a few cases.

In addition, we downloaded complete mitochondrial genome sequences for Antilope cervicapra (n = 3), Boselaphus tragocamelus (n = 2), Tetracerus quadricornis (n = 2), Gazella bennettii (n = 2), Naemorhedus baileyi (n = 3), Naemorhedus griseus (n = 4), Naemorhedus goral (n = 3) and Naemorhedus caudatus (n = 3) from the National Center for Biotechnology Information (NCBI, USA). We also compared our results with previously published Blackbuck sequences from different sources; for the COI gene, we used the data of Kumar et al. 2017 [23]. These sequences were used to assess the inter-species sequence divergence of closely related species with respect to Blackbuck. Multiple sequence alignment was performed using the ClustalW algorithm with default settings, followed by manual adjustment, as implemented in BioEdit. All sequences were aligned and trimmed to identical lengths for the respective genes, with no INDELs or gaps. The final fragments used in the study were COI (575 bp), Cyt b (324 bp) and 16S rRNA (423 bp). We also included reference sequences of conspecific and other genetically closely related species along with our datasets. The phylogenetic tree was constructed in BEAST [27], using a model of nucleotide substitution estimated in Modeltest 3.6 [28]. An HKY model with gamma-distributed rates and invariant sites was used, and the final search was run for 5 × 10^6 MCMC generations, sampled every 1000 steps (10% discarded as burn-in). Tree Annotator v.1.7.4 [21] was used to compute the tree, and the final results were visualized in FigTree v.1.3.1 (http://tree.bio.ed.ac.uk/software/figtree/). Further, pairwise evolutionary sequence divergence between haplotypes (H) was calculated using the Kimura 2-parameter distance in MEGA 6.0 [29].

To assess the level of genetic diversity within the Blackbuck population across India, nucleotide diversity (π), haplotype diversity (Hd) and the mismatch distribution of pairwise differences were calculated using DnaSP v. 5 [30]. The number of polymorphic sites and of transition and transversion events was determined, along with the composition of purine and pyrimidine bases and the presence of any INDELs.
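For readers unfamiliar with the two diversity indices, the sketch below shows how they are commonly defined: haplotype diversity as Hd = n/(n − 1) × (1 − Σ p_i²), where p_i is the frequency of haplotype i among n sequences, and nucleotide diversity π as the mean number of pairwise differences per site. This is a minimal illustration on made-up sequences, assuming gap-free alignments of equal length; it is not the DnaSP implementation, and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """Hd = n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    if n < 2:
        raise ValueError("at least two sequences are required")
    freqs = [count / n for count in Counter(seqs).values()]
    return n / (n - 1) * (1 - sum(p * p for p in freqs))


def nucleotide_diversity(seqs):
    """Mean number of pairwise differences per site (pi)."""
    n = len(seqs)
    if n < 2:
        raise ValueError("at least two sequences are required")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned, non-empty and of equal length")
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in combinations(seqs, 2)
    )
    n_pairs = n * (n - 1) // 2
    return total_diffs / (n_pairs * length)


# Hypothetical toy alignment (not data from this study).
toy = ["ACGT", "ACGT", "ACGA", "TCGA"]
print(f"Hd = {haplotype_diversity(toy):.4f}")
print(f"pi = {nucleotide_diversity(toy):.4f}")
```

In the toy alignment there are three haplotypes among four sequences, so Hd is high even though the haplotypes differ at only one or two sites; this is the same pattern of high Hd with low π that the indices are meant to reveal.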

Results and discussion

Recently, the use of multiple genes and diverse analytical methods has become popular for robust species identification, both to avoid false-positive identification in wildlife forensics and to support a better understanding of evolutionary processes than reliance on a single gene for phylogeny allows [20, 31, 32]. In the present study, inferences based on the different mitochondrial genes (COI, Cyt b and 16S rRNA) were drawn independently, and the inferences in the following sections are therefore independent for each gene. For the few samples from the same locations for which data were available for all three genes, a generalized interpretation was made, which did not differ drastically in its overall result. The observed nucleotide composition of the different regions of the mtDNA genome was: COI (A = 25.0%; T = 30.9%; C = 28.5% and G = 15.6%); Cyt b (A = 29.4%; T = 26.7%; C = 34.1% and G = 9.9%) and 16S rRNA (A = 33.5%; T = 25.5%; C = 20.8% and G = 20.2%) (Table 1). The average nucleotide frequencies were in a range similar to that of other ungulate species. Blackbuck exhibited an AT-rich nucleotide composition, in agreement with previously published studies on sympatric species such as Tibetan antelope [33], Oryx [34], Swamp deer [35] and Hangul [36], whereas the GC content of the mitochondrial genes was congruent with the range expected in other mammalian species [22, 37].
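The AT-richness noted above follows directly from the reported base frequencies. The short sketch below, with hypothetical function names, shows one way to compute base composition from aligned sequences and then sums the A and T percentages reported in the text for each gene.

```python
from collections import Counter


def base_composition(seqs):
    """Percentage of A, T, C and G across all sequences (other symbols ignored)."""
    tally = Counter()
    for seq in seqs:
        tally.update(base for base in seq.upper() if base in "ACGT")
    total = sum(tally.values())
    if total == 0:
        raise ValueError("no A/C/G/T bases found")
    return {base: 100 * tally[base] / total for base in "ATCG"}


def at_content(composition):
    """AT content in percent from a composition dictionary."""
    return composition["A"] + composition["T"]


# Base frequencies (%) as reported in the text for each gene.
REPORTED = {
    "COI": {"A": 25.0, "T": 30.9, "C": 28.5, "G": 15.6},
    "Cyt b": {"A": 29.4, "T": 26.7, "C": 34.1, "G": 9.9},
    "16S rRNA": {"A": 33.5, "T": 25.5, "C": 20.8, "G": 20.2},
}

for gene, comp in REPORTED.items():
    gc = comp["G"] + comp["C"]
    print(f"{gene}: AT = {at_content(comp):.1f}%, GC = {gc:.1f}%")
```

All three fragments come out above 50% AT, with 16S rRNA the most AT-rich of the three.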

Table 1 Nucleotide composition and genetic diversity indices using mitochondrial genes in Blackbuck (Indian antelope)

Informative nucleotide sites in Blackbuck population

With respect to the reference sequence of Blackbuck (NCBI GenBank Accession no. JN632598), the partial fragment of Cyt b (324 bp) showed 17 variable sites, of which 12 were parsimony informative and 5 were singletons. In the COI gene (575 bp), 18 variable sites, 8 parsimony informative sites and 3 singleton sites were observed. In 16S rRNA (423 bp), there were 12 variable sites, 12 parsimony informative sites and 1 singleton site (Table 2). Because the three mitochondrial regions differ in length, all parameters are reported independently for each region. As evident from the present data, the COI gene exhibited more variable nucleotides than the partial fragments of the other two mitochondrial genes, i.e. Cyt b and 16S rRNA. In total, we observed 18 variable and 12 parsimony informative sites in the COI gene, compared to the 12 variable and 5 parsimony informative sites observed by Kumar et al. (2017). However, their samples were collected from the state of Haryana only, and all clustered in a single clade (Haplotypes 10–16) along with samples collected from Rajasthan (Fig. 2). We also found that the sequences generated by Kumar et al. (2017) represent unique haplotypes that, surprisingly, were not shared with any haplotype observed over the larger distribution range of Blackbuck, which may be due to ambiguous sequences (Joshi et al. 2018). In contrast, the present study found extensive sharing of haplotypes among different locations (Fig. 2).
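The site categories used above follow the usual convention: a site is variable if it shows at least two different nucleotides, parsimony informative if at least two of those nucleotides each occur in two or more sequences, and a singleton if it is variable but not parsimony informative. The sketch below illustrates this classification on a made-up alignment; the function name is hypothetical, and skipping columns that contain gaps or ambiguity codes is an assumption of this example rather than a statement about the software used in the study.

```python
from collections import Counter


def classify_sites(seqs):
    """Count variable, parsimony-informative and singleton columns in an alignment."""
    if not seqs or any(len(s) != len(seqs[0]) for s in seqs):
        raise ValueError("sequences must be aligned and of equal length")
    counts = {"variable": 0, "parsimony_informative": 0, "singleton": 0}
    for column in zip(*(s.upper() for s in seqs)):
        # Assumption: columns with gaps or ambiguity codes are skipped.
        if any(base not in "ACGT" for base in column):
            continue
        tally = Counter(column)
        if len(tally) < 2:
            continue  # invariant site
        counts["variable"] += 1
        states_seen_twice = sum(1 for c in tally.values() if c >= 2)
        if states_seen_twice >= 2:
            counts["parsimony_informative"] += 1
        else:
            counts["singleton"] += 1
    return counts


# Hypothetical toy alignment (not data from this study).
toy_alignment = ["AAGC", "AAGT", "ATAC", "ATAC"]
print(classify_sites(toy_alignment))
```

Under this definition every variable site is either parsimony informative or a singleton, so the two categories sum to the number of variable sites.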

Table 2 Forensically informative nucleotide sequences (FINS) observed in Blackbuck (Indian antelope) population using three genes of mitochondrial genome

Fig. 2 Bayesian-based phylogenetic tree generated for the Blackbuck population in India using the mitochondrial genes and the haplotypes observed, with reference to other wild bovids

Inter and intra-specific genetic distance

To evaluate the competence of DNA barcodes in delineating closely related ungulate species in India, a distance matrix was built using the Blackbuck DNA sequences generated during the present study together with reference sequences retrieved from the NCBI database. The mitochondrial genes Cyt b and COI showed significantly higher intra- and interspecific divergence than the 16S rRNA gene (p < 0.05). Intraspecific sequence divergence (Kimura 2-parameter distance) in Blackbuck was highest for COI (1.6%; 0.016), followed by Cyt b (0.7%; 0.007) and 16S rRNA (0.4%; 0.004). Mean K2P sequence divergence between the selected ungulate species ranged from 0.044 to 0.114 for 16S rRNA, 0.050 to 0.196 for COI and 0.096 to 0.21 for Cyt b (Table 3a–c). Comparative analysis of the three mitochondrial genes for inter-species divergence revealed that Cyt b was the more robust marker for species identification in cryptic animals like Blackbuck.
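The Kimura 2-parameter (K2P) distance reported here corrects the observed proportion of differences for multiple substitutions while treating transitions and transversions separately: d = −½ ln[(1 − 2P − Q) √(1 − 2Q)], where P and Q are the proportions of transitional and transversional differences. The sketch below is a minimal pairwise implementation on made-up sequences, not the MEGA implementation; the function name is hypothetical, and positions containing gaps or ambiguity codes are simply skipped as an assumption of this example.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned and of equal length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        # Assumption: positions with gaps or ambiguity codes are skipped.
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    w1 = 1 - 2 * p - q
    w2 = 1 - 2 * q
    if w1 <= 0 or w2 <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    # Equivalent to -0.5 * ln(w1 * sqrt(w2)).
    return -0.5 * math.log(w1) - 0.25 * math.log(w2)


# Hypothetical example: one transition and one transversion over ten sites.
d = k2p_distance("AAAAAAAAAA", "GAAAAAAAAC")
print(f"K2P distance = {d:.4f} (uncorrected p-distance = 0.2000)")
```

The corrected distance is always at least as large as the uncorrected proportion of differences, and the two converge for closely related sequences such as the intraspecific comparisons above.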

Table 3 Pairwise sequence divergence calculated using Kimura-2 parameter distances in closely related ungulate species using partial fragments of 16S rRNA, COI, Cytochrome b genes of mitochondrial genome, (a) 16S rRNA (423 bp), (b) Cytochrome Oxidase I (COI gene) (575 bp), (c) Cytochrome b (324 bp)

To delineate species, a ‘10 × rule’ has been proposed, under which a divergence of 10 times the mean intraspecific variation is taken to indicate a putative species [38]; this threshold varies between groups, however, and should be used conservatively. The threshold values we observed for species delimitation were high, 5.6% for COI and about 7.5% for Cyt b; however, intraspecific variation in COI was well within the maximum limit (< 2%) reported for mitochondrial COI in mammals [39]. Hence, COI can be a useful gene for investigating intra-specific and inter-specific distances in closely related species. Intraspecific variability was lower than the divergence between species, and the success rate of species identification based on query coverage was 90–100%.

Genetic diversity indices

We report 11 haplotypes for the Cyt b gene, 17 for the COI gene and 8 for the 16S rRNA gene (NCBI GenBank Accession Nos. Cyt b: MK951753–MK951763; COI: MK951764–MK951772; 16S rRNA: MK934773–MK934779) among the Blackbuck sequences generated in the present study and those retrieved from NCBI. Haplotype diversity (Hd) in Blackbuck was highest for the COI gene (0.9138), whereas nucleotide diversity was highest for the Cyt b gene (0.00704) (Table 1). Overall, the Blackbuck population exhibited low nucleotide diversity (0.005) and high haplotype diversity (0.7826) compared with other ungulate species, viz. Oryx gazella, Naemorhedus baileyi and Cervus elaphus hanglu [22, 36, 40, 41]. These results imply that the Blackbuck population might have experienced a recent population expansion. The Bayesian phylogenetic tree based on all three mitochondrial genes revealed shallow intra-species divergence with a strong posterior probability support value (0.98) and supported the monophyly of Blackbuck relative to other closely related bovid antelope species (Fig. 2). Despite the high haplotype diversity across Blackbuck populations, the low nucleotide diversity indicates that the haplotypes differ from one another only marginally. This is corroborated by the presence of a single clade containing samples from different geographic locations, with high posterior support (Fig. 2). It may also indicate historically high maternal gene flow across the distribution range, resulting in poor geographic differentiation or population sub-structuring in spite of contemporary habitat fragmentation.

Conclusion

We found that intra-species sequence divergence between haplotypes (H), calculated using Kimura 2-parameter genetic distances, was higher overall in the COI gene than in Cyt b and 16S rRNA in Blackbuck, whereas inter-species sequence divergence was higher in Cyt b than in COI. In addition, the sequence divergence (genetic distance) and coalescent tree-based approaches to species recognition revealed significant segregation between Blackbuck and other closely related species, with high support values. Our findings corroborate previous work on FINS and its application to wildlife offences; however, those studies were restricted to a small geographic region, whereas we sampled a comparatively larger part of the distribution range of Blackbuck in India. The present study also assessed genetic diversity and the suitability of FINS for delineating Blackbuck using samples from an extensive geographic range. The COI and Cyt b genes offered a better means of identifying Blackbuck because of their high genetic divergence indices, compared to 16S rRNA, which showed low genetic divergence. Because false-positive sequences are present in public databases, we suggest that exclusive reliance on BLAST searches for species assignment should be avoided. We therefore recommend using FINS and coalescent tree-based approaches for species assignment of wildlife parts.