Introduction

Studies on the phylogeography of organisms endemic to the Tibetan Plateau and its surrounding areas have increased rapidly in recent years. Much of this interest stems from the dramatic geological and climatic changes around the Tibetan Plateau, such as mountain uplift, repeated river capture and glaciation events, which greatly influenced the population genetic structure of the biota (Harrison et al. 1992). The Tibetan Plateau is the largest and highest plateau in the world, with a surface area of about 2.3 million km² and a mean altitude of over 4500 m above sea level (Herzschuh et al. 2010), and it has some of the most extreme environments (Wang et al. 2015). The rise of the Tibetan Plateau began about 50 million years ago (Ma), and large-scale uplift of this region is considered to have continued until 10–8 Ma, or more recently (Harrison et al. 1992; Molnar et al. 1993; An et al. 2001). The Qilian Mountains, located on the northeastern edge of the Tibetan Plateau, originated in stages during the uplift of the plateau (Li 1963; Li et al. 2001). Views differ on the timing of the formation of the Qilian Mountains: it has been dated to between 20 Ma and 10 Ma (Li 2000; George et al. 2001; Wang et al. 2015) or to have begun at about 8.3 Ma (Yue et al. 2001; Fang 2005). The Tibetan Plateau and its surrounding areas contain the largest number of glaciers outside the polar regions, and these glaciers are the source of many important Asian rivers (Yao et al. 2007, 2012); repeated glacial events have thus helped to shape a complex geographical environment. The rise of the Tibetan Plateau since the Pliocene therefore strongly affected the surrounding hydrological systems, which provide unique habitats for important freshwater fish species. Rivers around the Qilian Mountains display a radial distribution. 
The three main endorheic river systems that originate on the northeastern side of the Qilian Mountains, the Shiyang, Heihe (called the Ruo-Shui in prevailing usage during the 1970s–1980s) and Shule Rivers, flow north into the Hexi Corridor and finally disappear in the desert. The Hexi Corridor originated as a result of intermittent uplift of the Qilian Mountains, but the formation of its modern hydrological system was affected by repeated river capture events (Zhu 2004; Chu et al. 2015). East of the Qilian Mountains, the Datong River flows into the Yellow River system. The headwater area of the Yellow River existed as a series of ancient lake basins around 3.0–1.6 Ma. Subsequently, around 1.6–0.15 Ma, many lakes disappeared following a sharp uplift of the Tibetan Plateau and drained into the ancient Yellow River. Finally, the modern Yellow River formed around 0.15–0 Ma (Li et al. 1998). On the southwestern side of the Qilian Mountains lies the endorheic river system of the Qaidam Basin. As the elevational difference between the Qilian Mountains and the Qaidam Basin increased, lakes in the Qaidam Basin gradually developed into rivers (Li et al. 1999). Meanwhile, dynamic geological processes in this region had major impacts on the evolution of the river systems, with repeated connection and separation events. For example, Qinghai Lake may have been the headwater of the ancient Yellow River (Chen 1988), while Chaka Salt Lake was once an outflow lake that flowed eastward into the Gonghe Basin and later into the Yellow River (Zhang et al. 2011). Similarly, the Heihe River was connected with the Shule River in ancient times (Feng 1981; Chen 1988).

So far, few phylogenetic studies have been conducted on fishes living in the Tibetan Plateau and its surrounding areas (He et al. 2001; Zhao et al. 2008). Phylogenetic and biogeographical studies of glyptosternoid fishes distributed in this area showed that speciation in this group is directly related to the intermittent uplift of the Tibetan Plateau, during which the water systems of the region separated gradually into discrete basins (He et al. 2001). Phylogenetic analysis of one endemic species, Gymnocypris chilianensis, suggested a gradual westward expansion from the Shiyang River population along the northeastern edge of the Tibetan Plateau, with allopatric divergence at about 0.37 Ma and 0.05 Ma, correlated with at least two glacial maxima (Zhao et al. 2011). Hou et al. (2012) examined the freshwater fish Triplophysa orientalis collected from the upper Yangtze River and associated plateau lakes. Molecular-clock dating of intraspecific divergence suggested that the genetic structure of T. orientalis was shaped by the uplift of the Tibetan Plateau, and demographic analysis further indicated bottlenecks or expansions in three lake populations, suggesting that the isolation of lakes has a potential impact on population differentiation.

Triplophysa leptosoma (Herzenstein 1888) (Cobitidae: Cypriniformes) is an endemic species restricted to the Tibetan Plateau and adjacent high-altitude areas. The current distribution of T. leptosoma includes the upper reaches of the Yangtze River and the Yellow River, northern Tibet, the Qaidam Basin, Qinghai Lake and the Hexi Corridor in China. This species inhabits areas close to river banks with slowly flowing water and a gravel bottom where submerged waterweeds are absent. These are small fishes with a conical, flattened head and a thick, cylindrical body; the caudal peduncle is long, not deep, and weakly compressed laterally; scales are absent and the lateral line is complete (Zhu 1989). The lips are strongly furrowed or papillated, the anus is positioned at some distance from the origin of the anal fin, and the intestine is short and thin and bends in a zigzag. There are 7 branched dorsal fin rays, 6–7 branched pelvic fin rays, 10–11 branched pectoral fin rays, 5 branched anal fin rays and 7–8 branched caudal fin rays, and the vertebral count is 4 + 38–42 (Chen 1998). The background color is light or grayish brown from above, with small brownish mottles scattered on the upper surface of the body along the lateral line and on the dorsal and caudal fins; the peritoneum is silvery (Prokofiev 2007). T. leptosoma starts to reproduce as soon as the ice in the river channel begins to melt. Chironomidae larvae and other insects are its primary food (Zhu 1989).

Current knowledge of the genus Triplophysa is largely restricted to descriptions of morphology and life history and to the discovery of new species (Zheng et al. 2009; Li et al. 2015); there are very few studies of the phylogeny or population genetic structure of these species (Hou et al. 2012). In particular, the phylogeny of Triplophysa species distributed around the Qilian Mountains has barely been studied. We collected samples from the river systems around the Qilian Mountains, including the Shiyang River, Heihe River and Shule River in the Hexi Corridor, the Qaidam Basin river system and the Yellow River system. The sampling sites represent almost the entire geographical range of T. leptosoma except the upper reaches of the Yangtze River, but we focused on the Qilian Mountains area along the northeastern edge of the Tibetan Plateau in order to investigate the effects of geological processes and environmental factors on the evolutionary history of the fish fauna in this region. In this study, we analyzed mitochondrial DNA (mtDNA) sequences with two major goals: (1) to test whether the phylogenetic divergence of the mtDNA clades of T. leptosoma corresponds to the intermittent uplift of the Qilian Mountains and related geological events, and (2) to investigate whether population expansion and diversification occurred during the repeated climate changes of the Quaternary period.

Materials and methods

Study area and sample collection

A total of 215 samples from 15 sites (Fig. 1 and Table 1) were collected using gill nets between July and October, 2010 and 2012. Sampling sites were chosen to represent the different river systems around the Qilian Mountains at altitudes between 1000 and 4000 m. For each collection site we recorded geographic coordinates (latitude/longitude) and altitude using a GPS unit, and measured river width, depth, flow velocity, turbidity and pH to characterize the habitat. All samples were immediately transferred to collection tubes with 100% ethanol and stored at −20 °C until DNA extraction in the laboratory. Voucher specimens were deposited in the College of Life Sciences, Lanzhou University (voucher number: LZUTL10001 ~ LZUTL10089, LZUTL11001 ~ LZUTL10104, LZUTL12001 ~ LZUTL12022).

Fig. 1 Geographic location of the investigated populations of T. leptosoma. Shown are all river systems around the Qilian Mountains (center). Sampling localities are color-coded according to river system

Table 1 Geographic location, population codes, elevation (m), coordinates (eastern longitude/northern latitude), number of specimens (N) and river basin for sampling sites for T. leptosoma. Haplotype refers to the mitochondrial DNA type found among the samples at each location

Extraction, amplification and sequencing

Total genomic DNA was extracted from muscle tissue or fin clips using the TIANamp Marine Animals DNA Kit (Tiangen Biotech Co., Ltd.). The mitochondrial cytochrome b gene (Cytb) was amplified using primers L14724 (Pääbo 1990) and H15915 (Irwin et al. 1991). A fragment of the cytochrome oxidase I gene (Cox1) was amplified using the primers FishF1 and FishR1 (Ward et al. 2005).

The PCR mixture for Cytb contained 1.0 μL of template DNA, 1 μL of each primer (10 μM), 2.5 μL of 10X reaction buffer, 2 μL of dNTPs (150 μM) and 0.2 μL of Taq DNA polymerase in a total volume of 25 μL. The PCR mixture for Cox1 contained 2.0 μL of template DNA, 0.8 μL of each primer (10 μM), 2.5 μL of 10X reaction buffer, 2 μL of dNTPs (150 μM) and 0.15 μL of Taq DNA polymerase in a total volume of 25 μL. The PCR conditions were the same for both genes except for the annealing temperature: an initial denaturation at 94 °C for 5 min, followed by 35 cycles of denaturation at 94 °C for 30 s, annealing at 60 °C (Cytb) or 55 °C (Cox1) for 45 s, and extension at 72 °C for 1 min, followed by a final extension at 72 °C for 7 min. The amplified DNA was checked by electrophoresis in 0.8% low-melting agarose gels and sequenced in both directions using Sanger dideoxy sequencing by HuaDa Biotechnology Company (Beijing, China).
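The reagent lists above do not state what makes up the remainder of each 25 μL reaction. The following minimal sketch assumes, as is common practice but not stated in the text, that nuclease-free water fills the remaining volume, and computes that volume for each gene. The dictionary keys and the function name `water_volume` are illustrative only.

```python
# Reagent volumes (microlitres) per reaction, as listed in the text.
# ASSUMPTION: water makes up the rest of the 25 uL reaction; the text
# does not say this explicitly.
TOTAL_UL = 25.0

RECIPES = {
    "Cytb": {"template": 1.0, "primer_fwd": 1.0, "primer_rev": 1.0,
             "buffer_10x": 2.5, "dntps": 2.0, "taq": 0.2},
    "Cox1": {"template": 2.0, "primer_fwd": 0.8, "primer_rev": 0.8,
             "buffer_10x": 2.5, "dntps": 2.0, "taq": 0.15},
}


def water_volume(recipe, total_ul=TOTAL_UL):
    """Return the volume (uL) left over after all listed reagents."""
    return round(total_ul - sum(recipe.values()), 2)


for gene, recipe in RECIPES.items():
    print(gene, "water (uL):", water_volume(recipe))
```

Under this assumption the listed reagents account for 7.7 μL (Cytb) and 8.25 μL (Cox1) of each reaction.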

Statistical analyses

DNA sequences obtained from Cytb and Cox1 amplicons were assembled and aligned with MEGA v.5 (Tamura et al. 2011) and ClustalX v.1.8 (Thompson et al. 1997) and inspected by eye to verify the open reading frame. Subsequently, both fragments were concatenated for analysis after a partition homogeneity test as implemented in PAUP* v.4b.10 (Swofford 2002). Diversity of the population samples was described by haplotype number (H), haplotype diversity (Hd), nucleotide diversity (Pi) and the number of polymorphic sites of the combined genes, all estimated with DnaSP v.5.10 (Librado and Rozas 2009). Hierarchical analysis of molecular variance (AMOVA) (Excoffier et al. 1992) was used to quantify divergence among the five phylogenetic clades (Fig. 4) and among populations within each clade, and pairwise differentiation (Fst values) among the phylogenetic clades was calculated, using Arlequin v. 3.5 (Excoffier and Lischer 2010).
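To make the two diversity indices concrete, the sketch below computes haplotype diversity (Hd) and nucleotide diversity (Pi) from a list of aligned sequences using the standard estimators of Nei. It is an illustration only: the function names and the toy alignment are hypothetical, gaps and missing data are not handled, and it is not a reimplementation of DnaSP.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """Hd = n/(n-1) * (1 - sum(p_i^2)), with p_i the haplotype frequencies."""
    n = len(seqs)
    freqs = [count / n for count in Counter(seqs).values()]
    return n / (n - 1) * (1 - sum(p * p for p in freqs))


def nucleotide_diversity(seqs):
    """Pi = mean number of pairwise differences per site."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    total_diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return total_diffs / len(pairs) / length


# Toy alignment (hypothetical data, four sequences of four sites).
toy = ["AAAA", "AAAA", "AAAT", "AATT"]
print(haplotype_diversity(toy), nucleotide_diversity(toy))
```

Because Hd depends only on haplotype frequencies while Pi also depends on how different the haplotypes are, a sample can show high Hd together with low Pi when many closely related haplotypes are present.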

Evolutionary neutrality tests and mismatch analysis were used to examine population demography. Mismatch distributions test for demographic signatures of population expansion in mtDNA lineages (Rogers 1995). Tajima’s D (Tajima 1989) and Fu’s Fs (Fu 1997) were calculated with Arlequin v. 3.5 (Excoffier and Lischer 2010) to test whether the populations were at equilibrium; both statistics are expected to have large negative values under demographic expansion. The validity of the expansion model was tested by the sum of squared deviations (SSD) between the observed and expected mismatches, as implemented in Arlequin v. 3.5 (Excoffier and Lischer 2010). We also estimated population size changes with DnaSP v.5.10 (Librado and Rozas 2009). In addition, the time of expansion of the different populations was estimated as t = Tau/2u, with u = mμ and μ = λg (Rogers and Harpending 1992), where Tau is the mode of the mismatch distribution expressed in units of evolutionary time, u is the mutation rate for the whole sequence, m is the length of the sequence, μ is the mutation rate per site per generation, λ is the nucleotide substitution rate, and g is the generation time. In this study, the generation time (g) was assumed to be 1 year, the length of the sequence (m) was 1635 bp, the nucleotide substitution rate (λ) was based on the Cobitis molecular divergence rate of 0.68% per MY (a nucleotide substitution rate of 0.34% per MY) (Doadrio and Perdices 2005), and the mismatch value (Tau) was computed with Arlequin v. 3.5 (Excoffier and Lischer 2010).
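As a worked illustration of the expansion-time formula, the sketch below evaluates t = Tau/2u with the parameter values given in the text (m = 1635 bp, λ = 0.34% per site per MY, g = 1 year). The function name is hypothetical, and the Tau value of 1.0 is purely illustrative; it is not taken from Table 5.

```python
def expansion_time_years(tau, seq_len_bp=1635,
                         subs_rate_per_site_per_myr=0.0034,
                         generation_time_years=1.0):
    """Time since expansion, t = tau / (2u), returned in years.

    u = m * mu is the mutation rate for the whole sequence per generation,
    and mu = lambda * g is the per-site rate per generation.
    """
    lam_per_year = subs_rate_per_site_per_myr / 1e6   # per site per year
    mu = lam_per_year * generation_time_years         # per site per generation
    u = seq_len_bp * mu                               # per sequence per generation
    t_generations = tau / (2 * u)
    return t_generations * generation_time_years


# Illustrative tau only (ASSUMPTION, not a value from the study).
print(expansion_time_years(1.0))
```

With these parameters u is about 5.56 × 10⁻⁶ substitutions per sequence per year, so each unit of Tau corresponds to roughly 0.09 MY.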

To reconstruct the historical biogeography of the T. leptosoma group, we used Statistical Dispersal-Vicariance Analysis (S-DIVA) (Yu et al. 2010) to infer the ranges of all ancestral nodes of the phylogeny. This method is widely used in population genetic and phylogeographic analyses (Ali et al. 2012; Liao et al. 2012; Wang et al. 2014). We ran the S-DIVA analysis for the mtDNA haplotypes as implemented in RASP v.3.1 (Yu et al. 2015). The maximum number of reconstructions was set to 1000 and the maximum number of areas at each node was kept at 2. Six areas were defined, five river systems plus the outgroup: (A) Yellow River; (B) Qaidam Basin; (C) Shiyang River; (D) Shule River; (E) Heihe River; (F) Outgroup distribution.

Phylogenetic analyses and divergence estimates

A haplotype phylogeny was estimated using partitioned maximum likelihood (ML) and partitioned Bayesian inference (BI). We chose three congeneric species, Triplophysa rosa (NC_019587.1), Triplophysa robusta (NC_025632.1) and Triplophysa pappenheimi (KT213600.1), as outgroups. The best-fit partitioning scheme across genes was determined for each data set under the Bayesian information criterion using PartitionFinder v 1.1.0 (Lanfear et al. 2012). The partitioned Bayesian analysis using mixed models was performed with MrBayes v.3.2.3 (Ronquist et al. 2012), with parameters sampled every 1000 generations from a total of 10,000,000 Markov Chain Monte Carlo (MCMC) generations. The samples were used to generate a majority-rule consensus tree after the first 2500 trees were discarded as burn-in. Convergence between the runs of the BI analyses was assessed by an average standard deviation of split frequencies below 0.01 and by potential scale reduction factors (PSRF) approaching 1.0 for all parameters. Support for the BI tree was estimated by posterior probabilities. We also reconstructed a maximum likelihood tree with RAxML v8.2.X (Stamatakis 2014) under the GTRGAMMA model. Bootstrap support values for the ML tree were obtained from 1000 bootstrap replicates. In addition, a median-joining network of all T. leptosoma haplotypes was constructed with Network v.4.6 (Bandelt et al. 1999) to visualize the relationships among haplotypes within each lineage.

We estimated nodal ages for the mtDNA phylogeny using a molecular clock method implemented in BEAST v.2.1.3 (Drummond and Rambaut 2007). Because fossil data to calibrate a species-specific molecular clock are lacking, we used an approximate molecular divergence rate of 0.68% per MY estimated for mtDNA divergences among Cobitis fishes (Doadrio and Perdices 2005). We implemented the GTR substitution model and a relaxed uncorrelated lognormal clock model. A Coalescent Constant Population tree prior was used, and we ran a Markov Chain Monte Carlo (MCMC) simulation for 10,000,000 steps, sampling every 1000 generations. Stationarity of the posterior distributions was assessed with effective sample sizes (ESS; requiring values >200, as recommended by the software developers). TreeAnnotator v.2.1 (Beast package) was used to summarize the information from the posterior distribution of trees by majority-rule consensus, with the first 40% of samples discarded as burn-in. The consensus tree was visualized and graphed using FigTree v.1.4 (Beast package).
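For orientation, a pairwise divergence rate of 0.68% per MY corresponds to 0.34% per MY along each of two diverging lineages. The sketch below shows the back-of-envelope strict-clock conversion from pairwise sequence divergence to time. It is a simplification for intuition only: the function name is hypothetical, the example distance is illustrative, and the analysis described above used a relaxed clock in BEAST rather than this calculation.

```python
PAIRWISE_RATE_PER_MYR = 0.0068                    # 0.68% divergence per MY
PER_LINEAGE_RATE_PER_MYR = PAIRWISE_RATE_PER_MYR / 2   # 0.34% per MY


def divergence_time_myr(pairwise_distance,
                        pairwise_rate=PAIRWISE_RATE_PER_MYR):
    """Strict-clock split time (MY) from an uncorrected pairwise distance."""
    return pairwise_distance / pairwise_rate


# Illustrative distance (ASSUMPTION, not a value from the study):
# two haplotypes differing at 6.8% of sites.
print(divergence_time_myr(0.068))
```

Under this simplification, every 1% of pairwise divergence corresponds to about 1.47 MY since the split.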

Results

Sequence characteristics

The total length of the concatenated mtDNA fragments was 1635 bp (1000 bp of Cytb and 635 bp of Cox1), obtained for 215 individuals sampled from 15 geographic locations. Among the 1635 aligned nucleotide sites, 335 were variable and 322 were parsimony informative. A total of 48 distinct haplotypes (concatenated Cytb and Cox1) were identified among the mtDNA sequences obtained from T. leptosoma samples. DNA sequences of the 39 Cytb haplotypes have been submitted to GenBank under Accession Nos. KX213634–KX213672, and sequences of the 20 Cox1 haplotypes under Accession Nos. KX213673–KX213692.

Genetic diversity and demographic scenarios within and among phylogenetic clades

Estimates of polymorphism and genetic diversity of mitochondrial haplotypes in each phylogenetic clade are given in Table 2. The highest values of nucleotide diversity and haplotype diversity were both found in clade I-3 (Fig. 4; Pi = 0.0154 and Hd = 0.900), all haplotypes of which belong to the Shiyang River system. The haplotype diversity value may be inflated by the low sample size (N = 5) available from the Shiyang River. Both indices were lowest in clade I-4 (Fig. 4; Pi = 0.0005 and Hd = 0.643), all haplotypes of which belong to the Shule River system.

Table 2 Genetic diversity and polymorphism among Cytb and Cox1 haplotypes in populations of T. leptosoma from five phylogenetic clades. Hd: haplotype diversity; Pi: nucleotide diversity

Genetic differentiation among phylogenetic clades (pairwise Fst, Table 3) was generally high and significant; the lowest value was found between clade I-1 and clade I-3. Results of AMOVA (Table 4) were consistent with the observed genetic differentiation between the two main population groups: 53% of the total variation was distributed between clade I and clade II (Fig. 4), with only 3.55% of the variation observed within populations. Likewise, AMOVA indicated high levels of differentiation at the nucleotide level among populations within groups (FSC = 0.9245, P < 0.001) and among populations relative to the total (FST = 0.9645, P < 0.001). Differentiation among groups was lower (FCT = 0.5300, P < 0.01).
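The three fixation indices reported above are linked by the standard hierarchical identity (1 − FST) = (1 − FSC)(1 − FCT), and they follow directly from the percentages of variation. The sketch below checks that the reported values are mutually consistent. The function names are illustrative, and the among-populations-within-groups component (43.45%) is obtained here by subtraction from the two percentages given in the text rather than read from Table 4.

```python
def fst_from_hierarchy(fsc, fct):
    """Overall FST implied by FSC and FCT: (1-FST) = (1-FSC)(1-FCT)."""
    return 1 - (1 - fsc) * (1 - fct)


def f_statistics(among_groups, among_pops_within_groups, within_pops):
    """Fixation indices from AMOVA variance components (any common unit)."""
    total = among_groups + among_pops_within_groups + within_pops
    fct = among_groups / total
    fsc = among_pops_within_groups / (among_pops_within_groups + within_pops)
    fst = (among_groups + among_pops_within_groups) / total
    return fct, fsc, fst


# Percentages from the text; the middle component is derived by subtraction.
among_groups, within_pops = 53.0, 3.55
among_pops_within_groups = 100.0 - among_groups - within_pops

print(fst_from_hierarchy(0.9245, 0.5300))
print(f_statistics(among_groups, among_pops_within_groups, within_pops))
```

Both routes reproduce the reported indices to four decimal places, which also shows why only 3.55% of the variation remains within populations.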

Table 3 Fst values among the phylogenetic clades of T. leptosoma. P = 0.00
Table 4 Results of analyses of molecular variance (AMOVA) of haplotype frequencies for phylogenetic groups and populations of T. leptosoma

We calculated Fu’s Fs and Tajima’s D statistics for each phylogenetic clade (Table 5). Significant negative values of Fs and Tajima’s D indicated that the populations had undergone demographic expansion, and the raggedness index of the mismatch distribution also supported a sudden-expansion model.
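To illustrate why a negative Tajima’s D is read as a signal of expansion, the sketch below implements the standard formula of Tajima (1989) for a small aligned sample. It is a teaching sketch with a hypothetical function name and toy data, it ignores gaps and multiple hits, and it does not perform the significance testing that Arlequin provides.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for equal-length aligned sequences (needs n >= 4)."""
    n = len(seqs)
    if n < 4:
        raise ValueError("at least four sequences are required")
    # S: number of segregating (polymorphic) sites.
    s = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    if s == 0:
        return float("nan")  # D is undefined without polymorphism
    # k: mean number of pairwise differences.
    pairs = list(combinations(seqs, 2))
    k = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n ** 2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (k - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Toy star-like sample: one central haplotype plus five singletons.
star = ["AAAAA", "TAAAA", "ATAAA", "AATAA", "AAATA", "AAAAT"]
print(tajimas_d(star))  # negative: excess of rare variants
```

A star-like sample contains many rare variants, so the number of segregating sites is large relative to the mean pairwise difference and D becomes negative; a sample split evenly between two divergent haplotypes gives a positive value instead.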

Table 5 Neutrality tests and mismatch analysis of the populations of T. leptosoma in the five phylogenetic clades

The optimal reconstruction from the S-DIVA analysis is shown in Fig. 2, and the diagram of biogeographical events is shown in Fig. 3. The reconstruction requires 5 dispersal and 7 vicariance events to explain the present distribution pattern of T. leptosoma. Moreover, frequent dispersal and vicariance events occurred within clade I-1 and clade II. The diagram (Fig. 3) also shows peaks of dispersal and vicariance events within the populations. In addition, S-DIVA suggests possible alternative ancestral ranges for nodes 98, 81, 60, 96 and 95 (with frequency of occurrence).

Fig. 2 Graphical results of reconstruction of ancestral state in phylogenies with time scale output from S-DIVA (exported from RASP). Biogeographical regions: (A) Yellow River. (B) Qaidam Basin. (C) Shiyang River. (D) Shule River. (E) Heihe River. (F) Outgroup distribution. The pie charts represent the relative probability of ancestral areas reconstructed for each node. Biogeographic events are highlighted in the top left corner of each node. Colors refer to possible ancestral ranges at different nodes; black with an asterisk represents other ancestral ranges

Fig. 3 Frequency of biogeographic events within the T. leptosoma populations

Genetic structure and divergence dates

Both phylogenetic inference methods (ML and BI) converged on highly similar results with high bootstrap support values and Bayesian posterior probabilities (Fig. 4). Phylogenetic relationships among the 48 haplotypes of T. leptosoma revealed two major clades with strong geographic structure (Fig. 4). While clade II has shallow phylogenetic structure, four well-supported haplogroups were resolved within clade I as follows: clade I-1 has haplotypes from the Yellow River and the Qaidam Basin and one haplotype (H25) from the Heihe River; clade I-2 contains haplotypes found only in the Qaidam Basin; clade I-3 is exclusive to the Shiyang River; and clade I-4 is exclusive to the Shule River. Clade II is composed of haplotypes from the Heihe River and two from the Yellow River. Therefore, haplotypes found in the Yellow River belong to both mitochondrial clades and share close affinities with haplotypes obtained in the Qaidam Basin and the Heihe River. In contrast, haplotypes from the Shiyang and Shule Rivers form exclusive phylogenetic clades.

Fig. 4 The phylogenetic trees obtained for Cytb and Cox1 haplotypes (1635 bp) using Bayesian inference and RAxML methods. Numbers on the branches correspond to posterior probabilities (> 0.5) from the Bayesian analyses and bootstrap support (>50%) from the ML analyses. Colors refer to the different river systems. Scale bar at the bottom represents 0.03 substitutions per site

The haplotype network (Fig. 5) is consistent with the five phylogenetic clades shown in Fig. 4, except that group I-1 is not supported. A star-like structure of haplotype lineages radiating from a central common haplotype is well defined for clade II (Heihe River, H15 central) and to a lesser extent in the Qaidam Basin clade I-2 (H32 and H33 in central position), a pattern that suggests recent expansion.

Fig. 5 The median-joining network for the 48 haplotypes based on the combined sequence data (Cytb plus Cox1) for T. leptosoma. Circle sizes are proportional to the number of individuals (scale below the figure). The black dots are the nucleotide substitutions inferred on each branch. The number of mutational steps joining the clades is indicated along the connecting branches. Colors represent the different river systems

The divergence time between mitochondrial clades I and II within T. leptosoma was estimated at approximately 10 Ma (95% HPD = 9.05–13.82 Ma; see Fig. 2; HPD, highest posterior density), whereas the separation of this species from the outgroup species dates to about 15.84 Ma (95% HPD = 11.43–22.26 Ma). Within clade I, separation of the four haplotype groups was estimated to have occurred between approximately 7 and 3 Ma (Fig. 2).

Discussion

Phylogenetic divergence of T. leptosoma and the uplift of the Qilian Mountains

This is the first phylogeographic study of T. leptosoma in the Qilian Mountains area. The phylogenetic clades of T. leptosoma may correspond to the intermittent uplift of the Qilian Mountains and related historical geological events, including mountain uplift and repeated river capture. A major split in mtDNA diversity in this species (between clades I and II) was inferred to have occurred around 10 Ma, corresponding to the separation of the Heihe River population (clade II, which apart from Heihe River haplotypes contains only H8 and H3 from the Yellow River) from the populations of the other river systems. This divergence may have been caused by associated geographic events in this region during the late Cenozoic. Following the intensive uplift of the Qilian Mountains on the northeastern margin of the Tibetan Plateau, the upper reaches of the Heihe River became torrential flows running along the South Corridor Mountain, resulting in deep canyons (Feng 1988). The S-DIVA reconstruction also revealed one vicariance event at node 98 at about 10 Ma (Fig. 2), which corresponds to the separation of clades I and II. Furthermore, the intensity of uplift of the Qilian Mountains differed from east to west, and the eastern part is lower than the western part (Li 1963). In this process, the Heihe River population was therefore gradually separated from those of the other rivers in this region during the uplift of the Qilian Mountains.

The inferred dates of divergence within the major mtDNA clades suggest that clade I-4 (Shule River population) separated from the remaining populations of clade I at about 6.78 Ma (95% HPD = 4.88–9.38 Ma), a date that is consistent with high-resolution paleomagnetic dating of the Shule River formation at 8.3–4.9 Ma (Fang et al. 2004). Clade I-3 (Shiyang River population) separated from the clade I-1 and clade I-2 populations at about 3.71 Ma (95% HPD = 2.55–5.10 Ma), and previous research indicated that the separation and subsequent flow reversal of the Shiyang River could have resulted from intermittent uplift of the Qilian Mountains (Feng 1981). The divergences of clade I-4 and clade I-3 also correspond to the vicariance events at nodes 81 and 77 in the S-DIVA analysis (Fig. 2). Clade I-1 (Yellow River, Heihe River and Qaidam Basin populations) and clade I-2 (Qaidam Basin population) split from each other at about 2.58 Ma (95% HPD = 1.57–3.72 Ma), and the Yellow River population within clade I-1 separated from the remaining populations at about 0.43 Ma. The divergence of the Yellow River population is also close in time to the formation of the Yellow River from a series of ancient lake basins around 3.0–1.6 Ma. More recently, many lakes disappeared as a consequence of the sharp uplift of the Tibetan Plateau, and the ancient Yellow River emerged around 1.6–0.15 Ma. Overall, our data and analyses indicate that the divergence times of the five clades correspond to the rise of the Qilian Mountains, which formed physical barriers between lineages. The main processes giving rise to the current landscape of the Qilian Mountains are still not well established, and views differ on the timing of their formation. Apatite fission-track and vitrinite-reflectance data place it between 20 Ma and 10 Ma (George et al. 2001). 
Dating of sedimentary strata indicates that uplift began at 18–14 Ma (Wang 1997; Yue et al. 2001). Moreover, changes in lithology and lithofacies indicate that uplift began at about 8.3 Ma, accelerated slightly at about 6.6 Ma, and became rapid at about 3.66 Ma (Li 1963; Fang 2005). Compared with all of these viewpoints, our results show that the divergence times among the five clades coincide with the most recent major uplift of the Qilian Mountains, as suggested by Fang (2005) and Li (1963).

Exchanges between the Heihe River population and the Yellow River population happened more recently. The evidence for this is the presence of haplotypes H8 and H3 in the Yellow River population, which are most closely related to haplotypes from the Heihe River population, especially the very similar H21. The estimated divergence time between the Yellow River haplogroup (H3 and H8) and the Heihe River haplotype H21 was about 0.17 Ma. We therefore infer that the T. leptosoma carrying these haplotypes moved from the Heihe River to the Yellow River at about 0.17 Ma. We also note that the BB locality (where H21 was found; Table 1) is close to the headwaters of the DT River (where H3 and H8 were found; Table 1), so the exchange could have happened between the BB and DT localities through a river capture event. Moreover, exchanges in the opposite direction (from the Yellow River population to the Heihe River population) also happened, as shown by the placement of H25, found at the GZ locality (Heihe River), in clade I-1. This clade contains all the other Yellow River haplotypes besides H3 and H8, and the divergence time between the Heihe River haplotype H25 and the Yellow River population is about 0.6 Ma. Previous studies have also pointed out that the sources of the Heihe River and the Datong River were connected in ancient times (Feng 1988). Thus, when river capture happened during the Quaternary glacial cycles, fish from both river basins were exchanged, carrying with them the haplotypes characteristic of their original river basin. We also found that the populations from the Qaidam Basin river system and the Yellow River are closely related. Clade I-1 has haplotypes from both river systems, and clade I-2 (all other Qaidam Basin haplotypes) is sister to clade I-1. This suggests that the two river systems were interconnected until recently. 
Although the sample sites in the Qaidam Basin are geographically far from the Yellow River system, their populations are phylogenetically very close. The explanation for this result is that, according to stratigraphic evidence, the headwater area of the Yellow River existed as a series of ancient lake basins prior to the development of the Yellow River on the Tibetan Plateau (Zhang et al. 2003), and Qinghai Lake may have been the headwater of the ancient Yellow River. In addition, Chaka Salt Lake was once an outflow lake that flowed eastward into the Gonghe Basin and finally into the Yellow River (Zhang et al. 2011). Meanwhile, the frequent dispersal and vicariance events that occurred within clade I-1 and clade II (Fig. 2) may have been vital in the exchanges between the Heihe River population and the Yellow River population. Thus, the past and present connections among the rivers and lakes have had a far-reaching influence on the major patterns of intraspecific diversification and population structure of T. leptosoma. Individuals with shared haplotypes were observed in samples taken from the same locality and also from different geographic localities within the same river system. Fish populations living in different reaches or tributaries of these river systems were interconnected in ancient times, with extensive gene flow within rivers, and gradually began to separate following the uplift of the western part of the Qilian Mountains.

Demographic processes and diversification of T. leptosoma in the Quaternary period

Population expansion occurred during the repeated climate changes of the Quaternary period. The median-joining haplotype network of T. leptosoma suggested different demographic histories within haplogroups. We found two star-like structures of haplotype lineages radiating from central common haplotypes (Fig. 5): one centered on H15 in clade II, and one with H32 and H33 in central positions in clade I-2. These star-like haplogroups could reflect a recent (post-glacial) expansion within each clade. In the first star-like haplogroup, in clade II, sample sites YN, BQ, BB and GZ shared the same haplotype, H15. These sites are in the interconnected headwaters of the Heihe River system, located on the northern slope of the Qilian Mountains. Two glacial periods have been documented in the Qilian Mountains. Some scholars believe that the entire range of the northern slope experienced two ice ages: the first was relatively extensive, and the second was relatively small and restricted to high-altitude mountain areas (Zhou et al. 2002). According to our field observations, sample sites YN, BQ, BB and GZ are at an altitude of about 3000 m, located in valleys of the Qilian Mountains with a moist climate, high vegetation coverage and abundant river flow supplied by mountain glaciers. During glacial periods, the temperature fell sharply, the glacier area increased, and the fish populations contracted into the remaining habitat. During interglacial periods, with rising temperatures and melting glaciers, the fish populations expanded. In addition, the neutrality test for clade II (Tajima’s D = −1.00) also showed signs of expansion, estimated at about 0.009 Ma, which coincided with the climatic oscillations of the Pleistocene associated with the large uplift events of the Tibetan Plateau (Li et al. 2001). Meanwhile, a dispersal event inferred by the S-DIVA analysis (node 97, Fig. 2) supported the inferred contraction and expansion of clade II. In the second star-like haplogroup, in clade I-2, sample sites TL, GE and NM, all located in the Qaidam Basin system, shared haplotype H32, and sample sites GE and NM shared haplotype H33. This clade experienced a population expansion at about 0.049 Ma.

Furthermore, we noted that the Shule River and Shiyang River haplotypes are well isolated from the rest of clade I in the phylogenetic tree. This genealogy and the diversity indices suggest a bottleneck or recent erosion of genetic diversity in clade I-4 (Shule River). The climate of this area is relatively harsh compared with that of the other river systems, and some rivers dried up or became isolated from each other because of low precipitation (Feng 1981). In addition, the neutrality test for clade I-4 (FS = −0.19) also pointed to population expansion in this region. Moreover, the S-DIVA analysis showed a dispersal event (node 81, Fig. 2) that corresponded to the population expansion in clade I-4. Therefore, we hypothesize that the population of this region recently experienced a sharp reduction due to the dramatic climate changes that accompanied the uplift of the Qilian Mountains.

Genetic diversity, measured as nucleotide diversity (Pi), showed a clear geographic pattern, decreasing from the eastern clade I-3 (Shiyang River, Pi = 0.0154) towards the western clade I-4 (Shule River, Pi = 0.0004). The environment of the Qilian Mountains becomes increasingly harsh from east to west, with less rainfall, lower vegetation coverage and a sparser river network, and this is likely to reduce survival and lower population genetic diversity along the same gradient.

In short, our data and analyses revealed the major patterns of intraspecific diversification and population structure of T. leptosoma. The phylogenetic clades may correspond to the intermittent uplift of the Qilian Mountains and related historical geological events. The climatic cycles of the Quaternary played integral roles in the differentiation of clades and the genetic diversity of T. leptosoma, and population expansion also occurred during the repeated glacial events of the Quaternary.