Introduction

Fishes of the genera Tor, Neolissochilus, and Naziritor (family: Cyprinidae; subfamily: Cyprininae), commonly known as mahseers, are economically valued as sport fish, in capture fisheries, and for aquaculture. Mahseers inhabit both rivers and lakes, ascending to rapid streams with rocky bottoms (Desai 2003) and are distributed from the Middle East to South and Southeast Asia (Roberts 1999; Silas et al. 2005). This group of large-scaled carps is endemic to Asia. Mahseers originated in southwest China and spread westward during the Pliocene and thereafter spread south to the Malay Archipelago and southwestward to the Indian peninsula and Sri Lanka (Shrestha 1997). The iconic mahseers are listed among the 20 mega fishes of the world (Stone 2007). However, overfishing, habitat degradation, and several anthropogenic factors are responsible for the decline of their natural population (Pinder and Raghavan 2012). Of 18 mahseer species assessed under the IUCN criteria, 10 are listed as threatened, namely under the “endangered” and “vulnerable” categories (IUCN 2013). To improve the natural population distribution of mahseers, comprehensive conservation strategies, including habitat management, protection, and artificial propagation are needed. Taxonomic stability of an organism is the essential platform before developing conservation action plans (Morrison et al. 2009) which unfortunately is likely to be a large constraint for mahseer conservation.

Mahseer systematics has been a subject of considerable interest to taxonomists; however, contradictions in nomenclature, possibly due to plasticity of characters (Mohindra et al. 2007), are leading to taxonomic instability in this group of species. Eschmeyer et al. (2013) catalogued a total of 59 currently valid species belonging to the mahseer group, with 36 assigned to the genus Tor, 22 to the genus Neolissochilus, and one to Naziritor. However, these numbers vary because of the taxonomic uncertainties regarding this group (Siraj et al. 2007; Froese and Pauly 2013). Naziritor (Mirza and Javed 1985) was proposed to accommodate a mahseer species found in the Zhob River (Naziritor zhobensis) in West Pakistan and was represented by this single species until Menon (1999) classified Tor chelynoides (Talwar and Jhingran 1991) as Naziritor chelynoides. Currently, however, this species is recognized as Puntius chelynoides (Dahanukar 2010; Froese and Pauly 2013).

Across their distribution in Asia, species of Tor are among the most diversified of the cyprinids, and taxonomists have conflicting opinions regarding their nomenclature because of the morphological variations that they exhibit (Silas et al. 2005). Tor putitora, the golden mahseer, is the most widely distributed in the trans-Himalayan rivers across south Asia, including Myanmar, Bhutan, Bangladesh, India, China, Nepal, and Pakistan (Bhatt et al. 2004). In portions of its known range, new species were split from T. putitora on the basis of certain discriminating morphological characters, such as Tor macrolepis from the Indus river system in Pakistan (Mirza and Awan 1976; Mirza 2004), based on the hypertrophied lip structure, and Tor yingjiangensis from the upper Irrawadi basin in Yunnan, China. T. yingjiangensis was distinguished from T. putitora by a combination of characters (Chen and Yang 2004) such as 3–3.5 (vs. 2.5) scales from lateral line to pelvic fin origin, shorter caudal peduncle length (13.0 % vs. 17.2 % of standard length), lesser body depth (26.4 % vs. 24.0 % of standard length) and greater caudal peduncle depth (12.0 % vs. 10.9 % of standard length).

In southern Asia, the Indian subcontinent harbors the highest species richness for Tor; seven species are distributed across the Himalayas and the central and Deccan plateaus (Desai 2003; Jayaram 2005). These are T. putitora (Hamilton), T. tor (Hamilton), Tor khudree (Sykes), Tor progeneius (McClelland), Tor mosal (Sykes), Tor mussullah (Sykes), and Tor kulkarnii (Menon). In a recent review on Indian mahseers, Dinesh et al. (2010) accepted only five valid species based on morphological descriptions and ignored T. mosal and T. kulkarnii, as well as some of the newly reported species such as Tor moyarensis and Tor remadevi. T. mosal has been considered a synonym of T. putitora (Froese and Pauly 2013; Eschmeyer et al. 2013). The Deccan mahseers, T. khudree and T. mussullah, are exclusive to the rivers of the Deccan plateau and peninsular India (Jayaram 2005). Three subspecies, T. mosal mahanadicus (David 1953a, b), T. khudree malabaricus (Jerdon) from peninsular India, and T. khudree longispinus, are considered valid but were not confirmed by evidence from recent molecular studies. T. khudree malabaricus has been characterized as a separate species, T. malabaricus (Silas et al. 2005). T. mosal mahanadicus (David 1953a, b), which is endemic to the Mahanadi River that originates from the central plateau, has also been described as T. khudree mahanadicus (Menon 1992), as T. tor mahanadicus (Sugunan 1995), and as a synonym of T. khudree. However, a RAPD profile of T. mosal mahanadicus found it to be more similar to T. putitora than to any other Tor species (Mohindra et al. 2007). Nguyen et al. (2006) concluded that T. khudree longispinus could not be distinguished from T. khudree on the basis of three mitochondrial genes.

Mahseers are generally present in the so-called “Tor zone” (600–1,200 m) of the glacier-fed Himalayan rivers (Singh and Kumar 2000) and extend much farther into the lower reaches of the peninsular Indian rivers (Ajithkumar et al. 1999). The golden mahseer, T. putitora, is the largest in size, is widely distributed in all Himalayan rivers, and is reported to migrate up to an altitude of 2,000 m above mean sea level (Raina et al. 1999). Currently, only T. tor is known to exist both in the rivers of the Himalayas and in those flowing through the central plateau, such as the Narmada, Tapti, and Tons (Desai 2003; Dinesh et al. 2010). Lal et al. (2013) described the extended distribution of T. tor in the Godavari and Krishna peninsular rivers.

The established morphological criteria used to classify the mahseer species, such as the proportion of head length to body depth (Talwar and Jhingran 1991; Jayaram 2005; Laskar et al. 2013) or lip and median lobe structure (Zhou and Cui 1996), are a source of disagreement among taxonomists (Nguyen et al. 2006). Molecular markers are pivotal to understanding phylogenetic relationships and to complementing taxonomic knowledge of species needing conservation (Morrison et al. 2009; Chakrabarty 2010). Among molecular markers, the partial cytochrome c oxidase I (COI) sequence has potential for species discrimination and has been proposed as a DNA barcoding marker for animals (Hubert et al. 2008; Lakra et al. 2011; Young et al. 2013). The mtDNA D-loop region has been reported to be a useful marker for cyprinid phylogeny (Gilles et al. 2001), and its variable ETAS domain is particularly useful for lower level group relationships (Liu and Chen 2003).

The present study aims to critically assess the current taxonomic framework for mahseer species and morphological outliers across different river basins in India by using a DNA barcoding approach. Study objectives are to assess the concordance of species discrimination using morphological keys and molecular analyses, to subsequently make taxonomic recommendations based on the findings, and interpret species distribution with respect to drainage patterns and earth history in the region.

Materials and methods

Samples for genetic analyses consisted of 86 individuals for nine mahseer species and subspecies, belonging to three genera, Tor (six species T. putitora, T. macrolepis, T. tor, T. khudree, T. mussullah, T. mosal; one subspecies T. mosal mahanadicus), Neolissochilus (Neolissochilus hexagonolepis) and Naziritor (N. chelynoides), as well as some morphological outliers (which could not be assigned distinctly to any of the species) of genus Tor. The representative specimens for these species were collected through explorations of different rivers across India (Table 1, Fig. 1) based on the previous knowledge of their distribution (Shrestha 1997; Desai 2003; Jayaram 2005). The best-fit individuals conforming to the morpho-meristic descriptions (Appendix 1 and 2; Electronic Supplementary Material) from the literature were identified as representative of the respective species. The specimens that were found to be morphological or distribution outliers were not considered as representative of the species and considered as non-descript types (morphotype I and II; Tor of the Sank and Tons rivers) and their molecular data was compared with that generated from representative species to identify their lineage (Figs. 2, 3, 4, 5, 6, 7, 8, 9 and 10).

Table 1 Species list, river system and their tributary rivers, figures and photograph, location with latitude and longitude and sample genbank accession number (*genseq-4 D-loop and genseq-4 COI) of specimens used for DNA analysis from different rivers of India. Genseq nomenclature follows recommendation of Chakrabarty et al. 2013
Fig. 1

Map showing major drainages and mahseer collection localities across India. 1 River Tawi, Jammu. 2 River Beas, Pong. 3 River Beas, Pathankot. 4 River Satluj, Nangal. 5 River Yamuna, Yamuna nagar. 6 River Ganga, Rishekesh. 7 River Bhagirathi. 8 River Kosi, Ramnagar. 9 River Sharda, Tanakpur. 10 River Gerua, Katarnia Ghat. 11 River Tista, Mal/Udalabadi. 12 River Jaldhaka, Bindhu. 13 River Ziyabharali, Bhlukpong. 14 River Dikrong, Doimukh. 15 River Sank, Gwalior. 16 River Tons, Chackghat. 17 River Tawa, Itarsi. 18 River Mahanadi, Sambhalpur. 19 River Krishna. 20 River Chaliyaar, Nilambur. 21 River Chalakudy, Puzha

Fig. 2

A Lateral view of the yellow-finned mahseer, T. putitora (Hamilton) (GQ469809) and morphotypes. Ai Side ventral view of T. putitora (GQ469822). Aii Lateral view of head showing normal lip structure of T. putitora (GQ469822). Bi Side ventral view of morphotype I of T. putitora (GQ469815). Bii Lateral view of lips and head showing small median lobe in lower lip of T. putitora (GQ469815). Ci Side ventral view of morphotype II of T. putitora (GQ469824). Cii Lateral view of head showing thick lips and lower lip with thick median lobe in morphotype II of T. putitora (GQ469824)

Fig. 3

a Lateral view of thick-lipped Mahseer, T. macrolepis (GQ469830), b Side ventral and lateral view of head part of a normal specimen of T. macrolepis (GQ469830)

Fig. 4

a Lateral view of the red finned Mahseer, T. tor shows hypertrophied lip structures (EU714120). b Ventral and Lateral view of head and anterior part of body of T. tor (EU714120)

Fig. 5

A Lateral view of the Mahanadi Mahseer, T. mosal mahanadicus (GQ469783). Ai Ventral and lateral view of head and anterior part of body of T. mosal mahanadicus (GQ469783). B Lateral view of the Mahseer, T. mosal (EU714108). Bi Lateral view of head and anterior part of body of T. mosal (EU714108)

Fig. 6

A Lateral view of the Deccan Mahseer, T. khudree (GQ469788). Ai Ventral and lateral view of head and anterior part of body of T. khudree (GQ469788). B Lateral view of the Hump Mahseer, T. mussullah (GQ469798), Bi Lateral view of head and anterior part of body of T. mussullah (GQ469798)

Fig. 7

A Lateral view of the chocolate mahseer, N. hexagonolepis (EU714096). Ai Ventral and lateral view of head and anterior part of body of N. hexagonolepis (EU714096). B Lateral view of the black Mahseer, N. chelynoides (EU714101). Bi Lateral view of head and anterior part of body of N. chelynoides (EU714101)

Fig. 8

A Lateral view of the Tor from River Tons (HQ609728). Ai Ventral and lateral view of head and anterior part of body of Tor from River Tons (HQ609728). B Lateral view of the Tor from River Sank. (HQ609726), Bi Lateral view of head and anterior part of body of Tor from River Sank (HQ609726)

Fig. 9

a Lateral view of the Tor from River Krishna (EU714115). b Ventral and lateral view of head and anterior part of body of Tor from River Krishna (EU714115)

Fig. 10

Maximum likelihood (ML) tree developed based on Dloop sequences of mahseers. Numbers represent node supports inferred from ML bootstrap analyses (only values above 50 are shown), ML tree using Hasegawa Kishino Yano with Gamma (HKY + G); α = 1.1750, I = 0.2787, −lnL = 1958.8831. Bootstraps estimated are derived from 500 replications

DNA isolation

Genomic DNA was extracted from blood or muscle fixed in 95 % ethanol, following the procedure of Ruzzante et al. (1996) with minor modifications. Briefly, 50 μl of ethanol-fixed blood cells were washed twice with High TE buffer (100 mM Tris–HCl, 40 mM EDTA, pH 8.0) and incubated overnight at 37 °C in 0.5 ml of lysis buffer (10 mM Tris–HCl, 1 mM EDTA, 400 mM NaCl, pH 8.0) containing 1 % sodium dodecyl sulfate and 0.5 mg/ml proteinase K, followed by phenol–chloroform extraction and purification with ethanol.

PCR amplification and mtDNA sequencing

The 5′ regions of COI were amplified using the primer pairs suggested for DNA barcoding of fish (Ward et al. 2005). For the D-loop, different combinations of the primer pairs L15998-pro and H CSBDH (Alvarado et al. 1995), L15926 and H16498 (Kocher et al. 1989), L16378 and H16578 (Faber and Stepien 1998) and H503 (Titus and Larson 1995) were used, which are located on tRNA proline (tRNApro) and a conserved sequence D-block, respectively (Alvarado et al. 1995). PCR was performed in a 25-μl reaction volume, and each sample reaction contained 1X PCR buffer (10 mM Tris–HCl, pH 9.0; 50 mM KCl; 0.01 % gelatin), 1.5 mM MgCl2, 0.2 mM of each dNTP, 5 pmol of primer, 1.5 units of Taq DNA polymerase (Genei, Bangalore) and 50 ng of genomic DNA. PCR amplifications were performed in a thermal cycler (MJ Research, PTC-200) with the following cycling parameters: one cycle of pre-denaturation at 94 °C for 5 min, 30–40 cycles of 94 °C for 1 min, 50 °C for 1 min and 72 °C for 2–2.5 min, with a final elongation at 72 °C for 4 min. After amplification, the PCR products were sequenced bidirectionally using an automated capillary sequencer (MegaBACE 1000, GE Healthcare, Hong Kong) following the manufacturer’s instructions. All the sequences generated, along with their specimen voucher numbers, were deposited in GenBank (accession numbers listed in Table 1), and the nomenclature suggested by Chakrabarty et al. (2013) was followed.
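As a rough planning aid, the cycling parameters above can be turned into an estimate of the total programmed run time. The following is a minimal sketch only: the function name `pcr_run_minutes` is hypothetical, and ramp times between temperature steps, which depend on the instrument, are ignored.

```python
def pcr_run_minutes(cycles, extension_min,
                    initial_denaturation_min=5.0,
                    denaturation_min=1.0,
                    annealing_min=1.0,
                    final_elongation_min=4.0):
    """Programmed hold time of a PCR run in minutes (ramp times ignored)."""
    # One cycle = denaturation (94 C) + annealing (50 C) + extension (72 C).
    per_cycle = denaturation_min + annealing_min + extension_min
    return initial_denaturation_min + cycles * per_cycle + final_elongation_min


# Bounds implied by the stated ranges: 30-40 cycles, 2-2.5 min extension.
shortest = pcr_run_minutes(cycles=30, extension_min=2.0)
longest = pcr_run_minutes(cycles=40, extension_min=2.5)
print(f"Programmed run time: {shortest:.0f} to {longest:.0f} min")
```

Under these assumptions the protocol occupies the thermal cycler for roughly two to three hours per run, before any ramping overhead.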

Sequence analysis

Sequences were aligned using ClustalW and adjusted manually, based on the descriptions by Baker and Marshall (1997). The base compositional frequencies and pairwise nucleotide substitutions were determined using PAUP* (Swofford 2002). PAUP* was also used to generate random trees (n = 1,000) to examine the phylogenetic signal (Hillis and Huelsenbeck 1992). Phylogenetic analyses were performed using maximum parsimony (MP), neighbor joining (NJ) and maximum likelihood (ML) methods as implemented in PAUP*. The MP analysis was performed using heuristic searches with 20 random-addition-sequence replicates and tree-bisection-reconnection (TBR) branch swapping. Likelihood ratio tests (Goldman 1993a, b; Huelsenbeck and Crandall 1997), as implemented in MODELTEST 3.06 (Posada and Crandall 1998), were employed to select substitution models for the model-based methods (NJ and ML), without branch length parameters, under the standard Akaike Information Criterion (AIC). Hasegawa Kishino Yano with gamma (HKY + G) was selected as the model for the D-loop, and Tamura-Nei with proportion of invariable sites and gamma (TrN + I + G) for COI. Likelihood analyses were performed with TBR branch swapping, using the MULTREES option, and random stepwise additions under the heuristic search algorithm. Quartet puzzling (with 1,000 pseudo-replications) was used to obtain the phylogenetic trees in PAUP*, and confidence in the analyses was estimated by 500 bootstrapping iterations (Felsenstein 1985), whereas NJ was inferred using 1,000 bootstrapping iterations. All the trees were rooted with the outgroup species Clarias batrachus (family Clariidae; order Siluriformes). The sequences of the carps Cyprinus carpio, Carassius carassius, Carassius auratus, Puntius denisonii and Puntius chalakkudiensis were downloaded from NCBI and analyzed with the mahseers. The online version of Automatic Barcode Gap Discovery, ABGD (Puillandre et al. 2012), was used to determine barcode gap occurrence and to partition the sequences into putative groups or species. COI alignments were analyzed at the default settings (Pmin = 0.001, Pmax = 0.1, Steps = 10, X (relative gap width) = 1.5, Nb bins = 20) and with K2P distances.
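The K2P (Kimura two-parameter) distance used for the barcode gap analysis can be illustrated with a short sketch. This is not the ABGD or PAUP* implementation; it is a minimal, self-contained version of the standard formula d = −½ ln[(1 − 2P − Q)√(1 − 2Q)], where P and Q are the proportions of transitional and transversional differences between two aligned sequences. The function name `k2p_distance` and the toy sequences are illustrative assumptions.

```python
import math

# Transitions are purine<->purine (A<->G) or pyrimidine<->pyrimidine (C<->T).
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    compared = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        if (a, b) in TRANSITIONS:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared    # proportion of transitions
    q = transversions / compared  # proportion of transversions
    # math.log raises ValueError if the sequences are too divergent
    # for the correction to be defined.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Toy example: one transition (A->G) across ten aligned sites.
print(round(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"), 4))
```

For closely related sequences, such as those within the genus Tor, the corrected distance stays close to the raw proportion of differing sites.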

Results

Characteristics of mtDNA sequences studied in mahseer species

COI

Out of the 654 bp analyzed, 434 bp (66.36 %) were conserved, 220 bp (33.64 %) were variable and 178 bp (27.22 %) were parsimony informative across all the species analyzed. By codon position, the third base was the most informative, with 84 parsimony informative characters. The average nucleotide composition was 30.4 % (T), 27.0 % (C), 26.1 % (A) and 16.5 % (G). The estimated transition/transversion ratio ranged from 3.48 to 9.02 (Table 2).
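Percentages of this kind can be recomputed directly from the site counts, which is a quick consistency check on the reported values. The sketch below uses only the counts given in the text; the helper name `percent_of_sites` is hypothetical.

```python
def percent_of_sites(count, total):
    """Share of aligned sites in a class, as a percentage to two decimals."""
    return round(100.0 * count / total, 2)


TOTAL_SITES = 654  # aligned COI length in bp
site_counts = {
    "conserved": 434,
    "variable": 220,
    "parsimony informative": 178,
}

for label, count in site_counts.items():
    print(f"{label}: {count} bp ({percent_of_sites(count, TOTAL_SITES)} %)")

# Conserved and variable sites should partition the alignment.
assert site_counts["conserved"] + site_counts["variable"] == TOTAL_SITES
```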

Table 2 Nucleotide parameters in D-loop and COI regions of Mahseer species studied; all frequencies are averages (rounded) over all taxa

D-loop

The size of the amplified fragment of the D-loop region was approximately 450 bp. Both heavy- and light-strand sequences yielded 411 bp after alignment and reliability checking. The average nucleotide base composition was 32.9 % thymine (T), 16.2 % cytosine (C), 37.8 % adenine (A), and 13.1 % guanine (G), with a total T and A content (70.7 %) higher than that of C and G (29.3 %). Out of a total of 411 sites, 174 sites were variable, 222 conserved and 111 were parsimony informative. The ratio of transitions to transversions was found to be 1.0. In some individuals of T. putitora, T. tor, T. mosal mahanadicus, T. mussullah and in all the individuals of T. macrolepis, Tons River Tor, Sank River Tor, Krishna River Tor, N. hexagonolepis and N. chelynoides, the amplified product did not conform to the expected mitochondrial D-loop sequence; hence, it was considered to be a pseudogene. Repeated isolation of DNA from alternative tissue samples of such individuals and amplification with all possible combinations of the primer sets failed to resolve the amplification of putative pseudogenes. Therefore, D-loop data were available for only five Tor species (Table 3).

Table 3 Genetic distance, between six Mahseer species from D-loop region analysis (distance method: Hasegawa Kishino Yano with gamma)

Model of evolution

The models of evolution selected were those that gave the best likelihood score for each dataset. For the COI gene, base frequencies of A = 0.2938, C = 0.2834, G = 0.1438, T = 0.2791 were recorded, and for the D-loop, base frequencies of A = 0.3759, C = 0.1569, G = 0.1422, T = 0.3251 were recorded. For the COI gene, the substitution model incorporated the rate matrix [A-C] = 1.0000, [A-G] = 14.6471, [A-T] = 1.0000, [C-G] = 1.0000, [C-T] = 8.9889, [G-T] = 1.0000, a proportion of invariable sites of I = 0.6121, and a shape parameter of the discrete gamma distribution of G = 3.1570. For the D-loop region, the substitution model incorporated a Ti/tv ratio of 1.3010, and the shape parameter of the discrete gamma distribution was G = 1.1723. The models of evolution of the individual gene regions were thus TrN + I + G for the COI gene and HKY + G for the D-loop region. These models were used to determine the number of substitution types and the inclusion of gamma rate distribution and/or proportion of invariable sites in the analyses.
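To make the COI parameters concrete, the sketch below assembles an instantaneous rate matrix Q from the reported base frequencies and relative rates, using the usual time-reversible construction (off-diagonal Q_ij = r_ij × π_j, rows summing to zero, scaled to one expected substitution per unit time). It is an illustration under those standard assumptions, not a reproduction of the PAUP* or MODELTEST internals; the function name `rate_matrix` is hypothetical, and among-site rate variation (I and gamma) is deliberately left out.

```python
BASES = "ACGT"

# COI estimates reported in the text (TrN: two transition rates, one transversion rate).
COI_FREQS = {"A": 0.2938, "C": 0.2834, "G": 0.1438, "T": 0.2791}
COI_RATES = {
    frozenset("AC"): 1.0000, frozenset("AG"): 14.6471, frozenset("AT"): 1.0000,
    frozenset("CG"): 1.0000, frozenset("CT"): 8.9889, frozenset("GT"): 1.0000,
}


def rate_matrix(freqs, rates):
    """Build a normalized reversible rate matrix Q (rows and columns in ACGT order)."""
    total = sum(freqs.values())
    pi = {b: freqs[b] / total for b in BASES}  # renormalize rounded frequencies
    q = [[0.0] * 4 for _ in range(4)]
    for i, x in enumerate(BASES):
        for j, y in enumerate(BASES):
            if i != j:
                q[i][j] = rates[frozenset((x, y))] * pi[y]
        q[i][i] = -sum(q[i])  # diagonal makes each row sum to zero
    # Scale so that the expected substitution rate at equilibrium equals 1.
    mean_rate = -sum(pi[x] * q[i][i] for i, x in enumerate(BASES))
    return [[value / mean_rate for value in row] for row in q]


Q = rate_matrix(COI_FREQS, COI_RATES)
for base, row in zip(BASES, Q):
    print(base, " ".join(f"{value:8.4f}" for value in row))
```

The large A-G and C-T entries in the resulting matrix reflect the strong transition bias implied by the reported relative rates.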

Divergence between mahseer species

COI

Genetic distances (Tamura-Nei with proportion of invariable sites and gamma correction) between mahseer species are given in Table 4, and the sequence divergence is provided in Table 5. The genera Naziritor (N. chelynoides) and Neolissochilus (N. hexagonolepis) exhibited high nucleotide diversity (0.095) with each other. N. hexagonolepis displays high genetic distance from the genus Tor (0.064) and from the other outgroup species. N. chelynoides displays high genetic distance and sequence divergence from P. chalakkudiensis (0.196, 0.171) and Puntius denisonii (0.207, 0.170) and comparatively low genetic distance and sequence divergence from the genus Tor (0.092, 0.080) and the genus Neolissochilus (0.106, 0.095).

Table 4 Genetic distance, between Mahseer species and other outgroups used in COI analysis (distance method: Tamura-Nei with proportion of invariation site and Gamma)
Table 5 Sequence divergence (COI) between different mahseer species and other outgroup species

The genetic distance between the species of the genus Tor was found to range from 0.000 to 0.037. T. mosal had the maximum genetic distance from all the other Tor species (0.022 to 0.037). A short genetic distance was observed between the three species T. putitora, T. mosal mahanadicus, and T. macrolepis (0.000). Low sequence divergence (0.00 to 0.004) values were also observed between them. These three species also did not exhibit any detectable difference. The two morphological outliers (morphotype I and morphotype II) collected from the associated rivers of the Indus and Ganga system also did not exhibit any significant nucleotide divergence and genetic distance from these three species (T. putitora, T. mosal mahanadicus, and T. macrolepis).

The two morphological variants of Tor from the Sank River and Tons River had high genetic distance (0.022 to 0.025) from T. tor. However, they were found to be closer to T. putitora, T. mosal mahanadicus, and T. macrolepis than to T. tor, with a short genetic distance (0.002 to 0.006) but significant differentiation. Sequence divergence between these two variants and the group that includes T. putitora, T. mosal mahanadicus, and T. macrolepis was also low (0.002 to 0.007). The COI sequences also revealed small (0.006) but significant genetic differentiation between the two Tor variants from the Sank and Tons rivers.

The genetic distance of T. tor from other Tor species was found to be significant, ranging from 0.019 (T. putitora) to 0.031 (T. mosal). T. tor displayed low values of sequence divergence (0.004) and genetic distance (0.002) with Tor of the Krishna River. The two Deccan mahseers, T. khudree and T. mussullah, exhibited high genetic distance (0.022) and nucleotide diversity (0.0025) between them, and with other species, their genetic distance ranged from 0.019 to 0.039.

D-loop

Genetic distances between different mahseer species based on D-loop analysis are given in Table 3. T. putitora and T. mosal mahanadicus exhibited low genetic distance (0.005). T. tor, T. khudree, and T. mussullah exhibited high genetic distance (0.093, 0.110, and 0.207, respectively) from T. putitora. The Deccan mahseer species, T. khudree and T. mussullah, were separated by a high genetic distance from each other (0.111) and from T. tor (0.113 and 0.199, respectively).

Genetic relationship

COI

The analysis of COI sequences was performed using nine species of mahseer and two morphotypes, together with other carp species and C. batrachus as the outgroup species. Neighbor joining, maximum parsimony and maximum likelihood trees revealed similar topology (Fig. 11). Three major clades within the mahseer group were identified. Clade 1 contained the genus Tor (T. putitora, morphotypes I and II, T. mosal mahanadicus, T. macrolepis, Tor from the Sank River and Tons River, T. tor, Tor from the Krishna River, T. khudree, T. mussullah, and T. mosal). Clades 2 and 3 contained the genus Neolissochilus (N. hexagonolepis) and the genus Naziritor (N. chelynoides), respectively. It is noteworthy that the genus Naziritor is clearly discriminated from the genus Puntius. Within clade 1, five groups were observed. Group 1 contained three subgroups. Subgroup 1 was formed of T. putitora, T. putitora morphotypes I and II, T. mosal mahanadicus, and T. macrolepis. Subgroups 2 and 3 within group 1 were formed by Tor of the Sank River and Tor of the Tons River, with low bootstrap values. Group 2 of clade 1 contained two subgroups that accommodate T. tor and Tor of the Krishna River. The other groups are T. khudree in group 3, T. mussullah in group 4 and T. mosal in group 5 (T. putitora and others) and group 2 (T. tor). The genus Naziritor was separated from both the genus Tor and the genus Neolissochilus. All the nodes were supported by strong bootstrap values.

Fig. 11

a Maximum likelihood (ML) tree developed based on COI sequences of mahseers. Numbers represent node supports inferred from ML bootstrap analyses (only values above 40 are shown), ML tree using Tamura-Nei with proportion of Invariation site and Gamma (TrN + I + G) model; α = 3.1570, I = 0.6121, −lnL = 2954.66, bootstraps estimated are derived from 500 replications. The group number given with the species is based on the membership of sequences obtained from partitioning with ABGD. b Maximum likelihood (ML) tree developed based on COI sequences (629 bp), between species of genus Tor and Hypselobarbus using pairwise genetic Tamura-Nei model + G model

A neighbor joining tree (Fig. 11b) was computed to find out whether the genus Hypselobarbus is distinct from the genus Tor. The 629 bp COI gene sequences available in NCBI GenBank for three species, Hypselobarbus micropogon (KC445464.1), Hypselobarbus kurali (KC445463.1), and Hypselobarbus periyarensis (KF113559.1, KC445465.1), were downloaded and analyzed with samples of the other mahseer species used in this study. The genus Hypselobarbus clearly separated as a distinct clade from the genus Tor, with genetic distance from T. mussullah ranging from 0.106 to 0.172.

The ABGD analysis detected a barcode gap between intraspecific and interspecific distances (Fig. 12). The species delimitation was congruent with the clade formation obtained from the NJ, MP, and ML analyses (Fig. 11). The sequences partitioned into 10 groups at each of the prior maximal distances P = 0.0016, 0.0027, and 0.0046 (Fig. 12c).
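The three prior values quoted here are consistent with a log-spaced grid of priors between the Pmin and Pmax settings given in the methods. The sketch below assumes that ABGD spaces its priors geometrically over `steps` values; this spacing is an assumption made for illustration, not something stated in the source, and the function name `abgd_prior_grid` is hypothetical.

```python
def abgd_prior_grid(p_min=0.001, p_max=0.1, steps=10):
    """Geometrically spaced prior intraspecific divergences (assumed spacing)."""
    ratio = (p_max / p_min) ** (1.0 / (steps - 1))
    return [p_min * ratio ** i for i in range(steps)]


for index, prior in enumerate(abgd_prior_grid(), start=1):
    print(f"prior {index:2d}: P = {prior:.4f}")
```

Under this assumption, the second to fourth priors on the grid are approximately 0.0017, 0.0028 and 0.0046, which agree with the reported values up to rounding or truncation of the last digit.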

Fig. 12

Barcode gap analysis of mahseer species as generated by Automatic Barcode Gap Discovery (Puillandre et al. 2012). Distribution of K2P distances between each pair of specimens for the COI gene: histogram of distances (a), ranked distances (b) and number of PSHs obtained for each prior intraspecific divergence (c)

D-loop

All methods (NJ, ML, and MP) used for analysis of the D-loop data yielded similar topology (Fig. 10). The tree revealed two main clusters, one containing the genus Tor and a second cluster with Cyprinus carpio as an outgroup. Within the genus Tor, there were two main clusters, one containing T. putitora, and T. mahanadicus clustered with T. tor. The second main cluster was formed by T. khudree and T. mussullah. The nodes in the trees obtained from the NJ, ML and MP methods were supported by significant bootstrap values. The genetic distances between the groups are given in Table 3. The pair T. putitora and T. mosal mahanadicus exhibited a small distance value (0.005). T. khudree and T. mussullah displayed high pairwise distance (0.111). The pairwise genetic distance ranged from 0.005 to 0.111.

Discussion

The study represented a comprehensive effort to characterize genetic divergence between mahseers found in India through use of DNA sequence polymorphism in the COI and D-loop regions.

Relative performance of D-loop and COI sequences in species delineation

COI sequences were found to be more useful than the D-loop region in discriminating mahseer species. Although D-loop sequences have been reported as effective markers for cyprinid phylogeny (Gilles et al. 2001; Liu and Chen 2003), they could not provide useful data for all the mahseer species studied. The sequences amplified with D-loop primers in some of the mahseer species, and in some individuals within a species, did not conform to the sequence of the fish mitochondrial D-loop region. This is likely due to sequencing errors or possibly to pseudogenes being amplified instead of the target region. In all such cases, repeated attempts were made, but none amplified sequences that conformed to the fish mitochondrial D-loop sequence. Agostinho and Ramos (2005) discovered a large number of previously unrecognized mitochondrial pseudogenes in fish genomes. Pseudogenes co-amplify with, or in some cases instead of, the desired mtDNA target (Collura and Stewart 1995), possibly because of heteroplasmy, “leakage” from the paternal lineage and nuclear duplications (Zhang and Hewitt 1996). The presence of pseudogenes or the observation of multiple copies of an mtDNA gene can be due to four possible reasons: (1) contamination by other organisms (e.g., parasitic or ingested organisms) or by exogenous DNA in the laboratory, (2) heteroplasmy, (3) paternal leakage, and (4) duplication of the mtDNA gene (Williams and Knowlton 2001). To confirm that the present results were not affected by contamination, the whole procedure from DNA isolation to sequencing was repeated with utmost care for those individuals in which pseudogenes were amplified. Therefore, the D-loop region data could be used for only four Tor species, T. putitora, T. mosal mahanadicus, T. khudree, and T. mussullah, and the sequences that did not conform to the D-loop sequence were discarded to avoid confounding the data.
The mtDNA COI region did not pose any such problem and was correctly sequenced with the universal primers (Ward et al. 2005) in all the specimens of studied mahseer species.

Taxonomic status of mahseers

The neighbor joining, maximum parsimony, and maximum likelihood analysis, based on the D-loop and COI sequence data revealed more or less similar tree topologies. The COI mtDNA sequence analysis revealed the monophyly of the genera Neolissochilus and Naziritor as separate sister-groups of the genus Tor. The results supported the placement of species chelynoides under a separate genus as it can be considered a member neither of the genus Tor nor of genus Puntius (Dahanukar 2010). The mitochondrial data support the placement of this species in the genus Naziritor (Talwar and Jhingran 1991) along with another species, N. zhobensis, which is found in the Zhob River in Pakistan (Mirza and Javed 1985).

The results offered some interesting insights into the distribution and genetic relatedness of species of the genus Tor. The genus Tor was composed of four distinct sister clades: 1, T. khudree and T. mussullah; 2, T. mosal; 3, T. putitora (including morphotypes I and II, T. mosal mahanadicus, T. macrolepis, Tor of the Tons River and Tor of the Sank River) and 4, T. tor (including Tor of the Krishna River). These clades were supported by high bootstrap values at the separating node. Out of the seven species of Tor described in India, six species were analyzed based on COI phylogeny. Samples of Tor progeneius, which has a restricted distribution in Northeast India, were not available to us, and the species has recently been reported to be a synonym of T. putitora (Laskar et al. 2013).

The two Deccan mahseers, T. khudree and T. mussullah, were found to be genetically closer to each other than to other Tor species. Nevertheless, the significant separation between the two species, with high genetic distance, did not favor the argument of Menon (1992) that the two species could be synonyms. The treatment of T. khudree and T. mussullah as different species, inferred from the data in this study, agrees with conclusions from RAPD analysis (Mohindra et al. 2007), meristic, non-meristic and osteological data (Jayaram 2005), and karyomorphology (Kushwaha et al. 2001; Indra Mani et al. 2010). COI analysis in our earlier study (Lal et al. 2013) also confirmed the extended distribution of T. tor in the peninsular rivers, namely the Godavari and Krishna River systems, previously known to harbor only the Deccan mahseer T. khudree. The findings further indicated the separate, valid species status of T. mussullah, and its membership of a subclade shared with T. khudree within the Tor clade indicates that this species is a Tor congener rather than a member of Hypselobarbus (Dahanukar and Raghavan 2011). Analysis of Hypselobarbus COI sequences (629 bp) with the Tor sequences clearly supported the inference that inclusion of T. mussullah in the genus Hypselobarbus is not justified.

There is strong genetic evidence for retaining T. mosal as a species distinct from T. putitora; therefore, the findings disagree with the use of T. mosal as a synonym of T. putitora (Menon 1999; Eschmeyer et al. 2013). The species was described by Hamilton (1822) from the Kosi, Ramnagar, and Uttarakhand rivers as Cyprinus mosal and was later treated as T. mosal. However, several authors (Dinesh et al. 2010) have questioned the validity of this species from Himalayan rivers and favored it as a synonym of T. putitora (Menon 1992, 1999). Desai (2003) listed T. mosal as a valid species, but largely as a Myanmar species or as restricted to some rivers in the northeastern regions of India (Kundu 2000). In the present study, we discovered specimens of T. mosal from its type locality (the Kosi River) and from the Yamuna River, both mid-Himalayan rivers in northern India. The specimens conformed to the morpho-meristic descriptions (Jayaram 2005) and could be clearly distinguished from T. putitora.

The mtDNA sequences failed to discriminate T. macrolepis and the subspecies T. mosal mahanadicus, both considered valid earlier, from T. putitora. To ensure the reliability and consistency of the T. putitora COI sequences, 24 samples from six distant rivers belonging to three different river basins, the Indus, Ganga, and Brahmaputra, were examined for sequence comparisons with T. macrolepis and T. mosal mahanadicus. T. macrolepis, a mahseer species with a hypertrophied lip, was originally described from the Indus River system (Heckel 1838) as Labeobarbus macrolepis with a thick lip structure. Since then, the taxonomic status of this species has been the subject of several contradictory suggestions: placement in the genus Barbus, as a synonym of T. putitora (Silas 1960), as a subspecies of T. putitora (Day 1871; 1878; Mirza and Awan 1976; Mirza and Bhatti 1996) and as a separate species, T. macrolepis (Mirza 2004). Previously, T. macrolepis was described only from rivers of the Indus basin (Mirza 2004; Nguyen et al. 2009); however, the present explorations revealed its sympatric co-existence with T. putitora in some of the rivers of the Ganga River system (Yamuna and Kosi, Sharda and Garua) as well. The inferences based on the mtDNA COI region in the present study and three other mtDNA genes (Nguyen et al. 2009) conclusively fail to discriminate T. macrolepis as a separate species from T. putitora and thereby support the observations of Silas (1960) that T. macrolepis could only be a synonym of T. putitora. T. macrolepis can be morphologically distinguished from T. putitora only by the fleshy structure of the lower lip with a prominent median lobe; its other morpho-meristic characters are similar (Appendix 1, 2 Electronic Supplementary Material). These combined insights suggest that it is more appropriate to consider T. macrolepis a thick-lipped morphotype of T. putitora rather than a separate species.
The specimens recorded with different intermediary variations of the median lobe (morphotype I and II) but with COI sequences similar to T. putitora possibly indicate the occurrence of interbreeding between these morphotypes. The hypertrophied lips could be produced by environmental factors, and their development depends on the type of stream in which the mahseer lives and the degree of adherence required to live in swift current (Hora 1940). Such variation of lip structure had earlier been reported in T. tor, irrespective of size and sex (Desai 1982). We found T. mosal co-existing with T. putitora, T. macrolepis and the morphotypes (morphotype I and II), as these were captured at the same fishing grounds in both the Kosi and Yamuna rivers. However, it needs to be mentioned here that COI, being a mitochondrial gene, traces only maternal lineages. Hence, to conclude decisively on the species limits of T. putitora and T. macrolepis, or on whether the two forms co-occur with random mixing, analysis of conserved nuclear markers will be a crucial input.

T. mosal mahanadicus, or Mahanadi mahseer, was originally described by David (1953a, b); however, it was later reported as a subspecies of T. khudree (Menon 1992), and based on the meristic characters (head length larger than body depth) it was categorized as a subspecies of T. mosal (Shrestha 1997) and of T. tor (Sugunan 1995; Froese and Pauly 2013). The present study (both D-loop and COI regions) clearly demonstrated that T. mosal mahanadicus is genetically closer to T. putitora than to any of the other Tor species. The clustering of T. mosal mahanadicus individuals within the T. putitora cluster with an insignificant genetic divergence indicated a shared ancestral lineage. Therefore, there is no evidence to suggest that T. mosal mahanadicus is a subspecies of T. mosal. However, the results also raise the question of whether T. mosal mahanadicus could be considered a subspecies or a differentiated genetic stock of T. putitora. We attempted to address the question in the context of the evidence available in the literature, both published and unpublished, for these two mahseers. The morpho-meristic characters of the two significantly overlap (Desai 2003). A study that assessed the population structure of T. putitora and simultaneously analyzed T. mosal mahanadicus, using microsatellite and allozyme genotyping (Ranjana 2005), presented an interesting scenario. Out of 33 alleles from 10 polymorphic allozyme loci, T. mosal mahanadicus shared all the alleles with T. putitora except two private alleles. Out of eight microsatellite loci amplified in T. putitora samples from ten geographical locations (n = 411), seven loci amplified in T. mosal mahanadicus (n = 64). Furthermore, out of a total of 82 alleles generated from these seven loci in all the samples, 76 alleles were found in T. mosal mahanadicus, and 54 alleles were common with T. putitora. Moreover, the pairwise Fst values between T. mosal mahanadicus and T. putitora from 10 geographical localities ranged from 4 to 14 % (allozyme) and 15 to 25 % (microsatellites). Such variable Fst values and shared alleles would be unlikely if these two mahseers were different species. The comparative RAPD profiles also did not yield any loci that discriminated T. mosal mahanadicus from T. putitora (Mohindra et al. 2007). Therefore, both molecular and genetic results support the systematic status of T. mosal mahanadicus as a synonym of T. putitora. T. mosal mahanadicus (22 m + 12sm + 22st + 44 t) and T. putitora (12 m + 22sm + 14st + 52 t) do have different karyomorphology but the same fundamental arm number (NF) 134 (Indra Mani et al. 2010); however, such differences have been reported between fish conspecifics inhabiting different localities (Singh et al. 2013). In all likelihood, T. mosal mahanadicus is a subpopulation or genetic stock of T. putitora.
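To make the pairwise Fst comparison concrete, the following minimal Python sketch computes a two-population Fst at a single locus from allele frequencies, as (Ht - Hs) / Ht with equal population weights. This is an illustrative textbook formulation and not necessarily the estimator used in the cited genotyping study; the function names and the allele frequencies in the usage note are hypothetical and are not data from any mahseer population.

```python
from typing import Dict


def expected_heterozygosity(freqs: Dict[str, float]) -> float:
    """Expected heterozygosity, 1 - sum(p_i^2), for one population at one locus."""
    return 1.0 - sum(p * p for p in freqs.values())


def pairwise_fst(pop_a: Dict[str, float], pop_b: Dict[str, float]) -> float:
    """Two-population Fst = (Ht - Hs) / Ht, weighting both populations equally.

    pop_a and pop_b map allele names to frequencies (each summing to 1).
    Alleles missing from one population are treated as frequency 0 there.
    """
    alleles = set(pop_a) | set(pop_b)
    # Hs: mean within-population expected heterozygosity.
    h_s = (expected_heterozygosity(pop_a) + expected_heterozygosity(pop_b)) / 2.0
    # Ht: expected heterozygosity of the pooled (averaged) allele frequencies.
    pooled = {a: (pop_a.get(a, 0.0) + pop_b.get(a, 0.0)) / 2.0 for a in alleles}
    h_t = expected_heterozygosity(pooled)
    if h_t == 0.0:
        # Both populations are fixed for the same allele: no differentiation.
        return 0.0
    return (h_t - h_s) / h_t
```

For example, with hypothetical frequencies `{"A": 0.8, "B": 0.2}` and `{"A": 0.6, "B": 0.4}` the function returns about 0.048 (roughly 5 %), whereas two populations fixed for different alleles give 1.0. Values are usually averaged over loci in practice, which this single-locus sketch does not do.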

Similarly, Tor specimens collected from the Sank and Tons Rivers (flowing northward to the Ganges from the central plateau) formed two independent subgroups within the T. putitora group. However, this inference from COI sequences exhibited an interesting discordance with the morpho-meristic characters. For these specimens from the Sank and Tons rivers, meristic characters such as lateral line scale count and LTr count were similar to those of T. putitora; however, a body depth greater than the head length and a fleshy anal fin indicated morphological proximity to T. tor. Such variations could be environmentally induced, resulting from adaptive selection pressure, or could possibly result from evolution of these morphotypes after interspecific breeding between T. tor and T. putitora. Data from the combined use of nuclear and mitochondrial markers will be useful to derive insight into the evolution of such morphotypes.

Biogeographical history of genus Tor in India

The new taxonomic clarity on Tor species based on molecular data reveals species distributions that contrast with previous information. The COI and D-loop data (for some species) established the extended distribution of the T. putitora clade, both as an original species and as its possible diverging lineages, in the rivers originating in and flowing through the central and Deccan plateaus. Therefore, it is clear that the two Himalayan mahseers, T. tor and T. putitora (including T. mosal mahanadicus and T. macrolepis), have a wide geographic distribution but as fragmented populations, not only across the whole of the southern trans-Himalayan rivers extending to the Hindukush ranges in the west and the northeast of the Indian peninsula (Petr 2002) but also in various rivers flowing through the central plateau and Eastern Ghats in India. The discontinuous distribution pattern of the two species is evident: on the northern side they are restricted to the longitudinal rivers on the southern face of the Himalaya, flowing into three major systems, the Indus, Ganga and Brahmaputra (Sinclair and Jaffey 2001), and they are completely absent from the river portions flowing through the alluvial Indo-Gangetic plains. The two species are again distributed in the rivers of peninsular India on the southern side of the Ganga river system. Nevertheless, it is likely that the spread of these torrential hill-stream Himalayan carps to the central plateau happened by circumventing the Gangetic plains, in agreement with the Satpura hypothesis postulated by Hora (1951).

According to the Satpura hypothesis, migration of torrential fish species from the Assam Himalayas to peninsular India happened during the Pleistocene (less than 2.0 mya), across the Garo-Rajmahal gap along the Satpura-Vindhya ranges, with further spread due to excessive runoff in streams like the Narmada-Tapti. Dispersal of fishes was enabled by (a) river capture, (b) longitudinal river valleys, and (c) tilting of mountain blocks. The river courses of the central plateau as seen today are a result of the rise of the plateau and the tilt of the peninsula that happened during the middle Pleistocene, around 0.78 to 0.126 mya (Berg et al. 1969; Briggs 2003), an event possibly responsible for vicariant fragmentation of populations and their further dispersal to rivers of peninsular India. The Mahanadi and Godavari rivers, which had flowed westward as tributaries of the Narmada River, changed their course to flow as independent river systems southward through the Eastern Ghats (Menon 1951). This explains the presence of the T. putitora lineage in the Mahanadi River inferred in this study. The finding of T. tor (Lal et al. 2013) in tributaries of the Godavari river system also supports this inference. Silas (1952) opined that Tor species were among the early migrants (pre-tilt forms) during the lower Pleistocene and subsequently spread up to Sri Lanka, which was linked to the mainland intermittently (until 10,000 years ago), and diversified through speciation, possibly giving rise to the two Deccan mahseers found today, T. khudree and T. mussullah, confirmed as two distinct but closely related species. T. khudree is widely distributed in peninsular India, from the Godavari River southward and throughout the Western Ghats. The endemic variety from Sri Lanka, T. khudree longispinus, is reported to be actually T. khudree only (Nguyen et al. 2009). Based on the description of T. mosal mahanadicus (David 1953a, b) as an endemic variant of T. mosal, Silas (1952) hypothesized that T. mosal was among the last migrants during the upper Pleistocene (later than 0.126 mya) and could not migrate beyond the Mahanadi River. However, the genetic evidence did not support this hypothesis, as T. mosal mahanadicus was not found to be related to T. mosal.

Implications for conservation

Research on mahseer conservation is considered a priority area (Siraj et al. 2007; Nautiyal 2012). However, the present findings indicate that conservation programs are at risk of flawed planning because of species misidentification and, consequently, could lead to genetic introgression into natural gene pools (Avise 2000). It is possible that many of the biological (Dwivedi and Nautiyal 2012) and ex situ conservation (Goswami et al. 2012) findings could be addressing the wrong species. Achieving precise knowledge of species identification, taxonomy and biology has a strong bearing on the success of conservation planning (Zaccara et al. 2004). Splitting taxa can lead to effort being spent on extra protection; conversely, a distinct valid species, when classified as a synonym, may become extinct because of a lack of recognition in conservation efforts (Morrison et al. 2009). These taxonomic implications are equally applicable to mahseer conservation in the whole of South Asia because, except for the Deccan mahseers, other species are spread across the trans-Himalayan region and may also contain cryptic diversity or confused taxonomy.

Currently, in south Asia including India, most of the efforts on biological research, breeding and conservation planning are focused on T. putitora (Shrestha 1997; Ogale 2002; Sarma et al. 2010; Nautiyal 2012), followed by T. tor (Desai 2003), T. khudree and T. mussullah (Ogale 2002). However, our findings emphasize the need for extending conservation efforts to lesser known mahseers such as T. mosal, which might be under serious threat, as even their presence in some of the localities is not recognized (synonymized with T. putitora), and their biology and population structure are poorly understood. Even mahseer breeding programs for conservation (Sarma et al. 2010) mostly supported through wild broodstock collection can be confounded inadvertently because of the co-inhabitation of T. mosal with T. putitora, unless brood fish are screened with caution.

Fine-scale explorations are envisaged for rivers flowing through the central and Deccan plateaus in India to map the species distribution of mahseers (in addition to T. tor), morphotypes with characters of adaptive significance, their taxonomic status, and their life history traits. This is of utmost importance, as all the descriptions of the biology and life history traits of mahseers from rivers of this region relate to the distribution of only one species, T. tor (Desai 2003; Dinesh et al. 2010; Dwivedi and Nautiyal 2012). Badapanda (1996) reported a 48.8 % decline in Mahanadi mahseer from 1977 to 1992. The morphotypes of T. putitora found in the Tons and Sank Rivers, as well as the Mahanadi mahseer, are also of conservation importance because of the adaptations they are likely to have acquired through exposure to a tropical climate, in contrast to temperate Himalayan streams. These rivers could be harboring fragmented and reproductively isolated populations of these Himalayan species, which have undergone significant genetic differentiation to adapt to environmental challenges. Animals with such adaptive variations are of great significance in view of ongoing climate change, especially global warming, and hence are useful research material for genomic studies. Therefore, in situ and ex situ conservation of such organisms through the establishment of species-specific live germplasm resource centers is important. It will be judicious to use molecular markers to resolve conflicts arising from phenotypic overlaps while raising broodstock for species-specific propagation in conservation programs, to avoid the risk of long-term genetic contamination of the wild gene pool. The reference species-specific sequences (COI) from the present study and the corresponding images of accessions will be of specific use for species identification.
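As an illustration of how reference COI sequences might be used to screen brood fish, the following minimal Python sketch assigns a query sequence to the closest reference by uncorrected p-distance (the proportion of differing sites in an alignment). It rests on several assumptions: the sequences are already aligned to equal length, p-distance stands in for whatever distance model a real barcoding workflow would use, and the reference names and 12-bp sequences are hypothetical placeholders, not sequences from this study.

```python
from typing import Dict, Tuple

# Hypothetical, already-aligned toy reference sequences (not real COI data).
TOY_REFERENCES: Dict[str, str] = {
    "ref_A": "ACGTACGTACGT",
    "ref_B": "ACGTTCGTTCGA",
}


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base
    (anything other than A, C, G, T) are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    diffs = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a != b:
            diffs += 1
    if compared == 0:
        raise ValueError("no comparable sites between the two sequences")
    return diffs / compared


def closest_reference(query: str, references: Dict[str, str]) -> Tuple[str, float]:
    """Return the name of the nearest reference and its p-distance to the query."""
    distances = {name: p_distance(query, seq) for name, seq in references.items()}
    best = min(distances, key=distances.get)
    return best, distances[best]
```

Calling `closest_reference("ACGTTCGTTCGT", TOY_REFERENCES)` returns `"ref_B"` with a distance of 1/12. In practice a specimen would only be accepted for a species-specific broodstock if its distance to the nearest reference fell below a threshold chosen from observed within-species divergence, and, because COI traces only maternal lineages, such screening would not detect hybrids on its own.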

Conclusions

Out of the six species studied, the present results recovered only five distinct valid species: T. putitora, T. tor, T. mosal, T. khudree, and T. mussullah. Molecular evidence from this study and previous studies (Silas et al. 2005; Nguyen et al. 2006; Mohindra et al. 2007) could not discriminate any of the three subspecies, T. mosal mahanadicus, T. khudree malabaricus and T. khudree longispinus, reported from India and Sri Lanka. The Naziritor genus more appropriately accommodates N. chelynoides than the genus Puntius. Similarly, T. mussullah must be retained in the genus Tor rather than being placed in the genus Hypselobarbus. The present data also flagged the occurrence of morphological outlier specimens of the genus Tor in nature; however, future use of conserved nuclear genome sequences will be essential to obtain biparental information and thereby gain insight into the evolution of these animals and their precise systematic affinity. Our results strongly indicate an urgent need to revisit the systematics of species of the genus Tor distributed across the whole of South and Southeast Asia. This will need a network program involving multiple countries and disciplines to address the holistic objective of deciphering taxonomy, biology, biogeography and intraspecific genetic divergence. In such an endeavor, molecular markers, both nuclear and matrilineal, must complement morphological, biological and geological data to contribute to the knowledge base for management and conservation of these iconic fish species.