Introduction

The geological history of China has shaped the evolutionary dynamics of endemic species in a way that differs from other well-studied regions of the world, such as Europe and North America (e.g., Kou et al. 2015; Meng et al. 2017). Topographic changes, such as the uplift of the Qinghai-Tibetan Plateau (QTP), and climatic shifts, such as the intensification of East Asian monsoons, and aridification in Central Asia, have exerted a significant influence on the evolution of the flora of China (Chen et al. 2017; Yu et al. 2017; Lu et al. 2018). Several critical periods marked by a close relationship between geological/palaeoclimatic events and intraspecific/interspecific evolutionary events have been identified, e.g., the middle Miocene (Kou et al. 2015; Bai et al. 2016; Meng et al. 2017), the late Pliocene to the early Pleistocene (Chen et al. 2015; Wang et al. 2017; Ye et al. 2018; Zhao et al. 2018), shortly after the mid-Pleistocene transition (Tian et al. 2015; Ye et al. 2018), the Last Interglacial and the Last Glacial Maximum (Tian et al. 2015; Xu et al. 2016; Fan et al. 2018). However, most recent phylogeographic investigations were carried out based on species that inhabit a separate climate region or geographic unit of East Asia, e.g., subtropical China, Qinghai-Tibetan Plateau, and northern China. Only a few focused on the impact of palaeoenvironmental changes on the intraspecific divergence and range dynamics of widespread species across multiple climatic zones or floristic regions (e.g., Chen et al. 2012; Guo et al. 2014; Ye et al. 2017). Hence, more studies at a larger geographic scale are expected to provide new insight into the diversity of evolutionary patterns amid the complex topography and climatic heterogeneity in East Asia.

Oaks (Quercus spp.) are one of the most widely distributed forest-tree genera in the Northern Hemisphere (Hubert et al. 2014) and are considered to be models to integrate ecology, evolution, and genomics (Petit et al. 2013). The evolutionary histories of oaks are strongly influenced by palaeoenvironmental changes in different ways, such as historical introgression/hybridization triggered by range fluctuations (Bagnoli et al. 2015; Ortego et al. 2017), interspecific divergence along with formation of geographical barriers (Cavender-Bares et al. 2015; Rodríguez-Correa et al. 2017), and lineage diversification caused by local adaptation to diverging climates (Meng et al. 2017; Hipp et al. 2018). Climatic forces and environmental heterogeneity have been found to be key factors driving the intraspecific genetic divergence of oaks (Ortego et al. 2012; Gugger et al. 2013; Riordan et al. 2016). Meanwhile, intraspecific admixture pattern of oaks is also associated with the stability of the local habitat. Recent studies have shown that higher stability may contribute to genetic connectivity across different climatic periods, whereas lower habitat stability may promote the genetic admixture among regions (Ortego et al. 2015).

Quercus acutissima Carruthers, one of the members of section Cerris, is a dominant species in warm temperate deciduous forests of East Asia (Fujiwara and Harada 2015). As an ecologically important and economically valuable tree species, it has been listed as one of the precious wood species by the government. In China, its modern distribution range extends from 18°N to 41°N latitude and from 91°E to 123°E longitude, across seven temperature zones and three moisture regions of China (Zhang et al. 2014a, 2015). The historical distribution of Q. acutissima remains unknown, but leaf and fruit fossils of morphologically closely related species from section Cerris were observed in the Neogene floras of southwestern China and northern China (Zhou 1993; Song et al. 2000), suggesting a probable widespread palaeodistribution of Q. acutissima in this region.

A previous study based on chloroplast markers indicated that Q. acutissima may have had two localized glacial refugia in Central China and Southwest China, but the phylogeographic structure was weak (Zhang et al. 2015). Population genetic analysis of 28 provenances implied that nuclear genetic differentiation may exist between East China and West China (Zhang et al. 2013a). Other researchers have also described the genetic variation of Q. acutissima in China, South Korea, and Japan (Chung et al. 2002; Choi et al. 2005; Ye and Zeng 2009; Zhang et al. 2013b; Saito et al. 2017). However, the information on the evolutionary history of the species was limited by small sample sizes and a narrow sampling range.

Due to the widespread distribution and the dominant position in warm temperate deciduous forests of East Asia, Quercus acutissima provides a useful model to evaluate the impact of the stepped geomorphology in China on the intraspecific genetic structure and demographic history of a tree species at a larger geographic scale. Here, we used ten nuclear microsatellite (nSSR) markers to assess the genetic variation of Q. acutissima based on 30 natural populations sampled throughout the distribution in China. The specific goals of this work were to (1) examine range-wide patterns of intraspecific genetic diversity, (2) investigate whether a distinct genetic structure exists, (3) infer the intraspecific evolutionary history and evaluate the effects of geological/palaeoclimatic events, and (4) reveal possible locations of glacial refugia.

Materials and methods

Population sampling

A total of 707 individuals were sampled from 30 natural populations across the entire distribution of Q. acutissima in China during the spring and summer of 2014 (Fig. 1). In each population, eight to ten fresh leaves per tree were collected from 10 to 30 adult individuals at least 30 m apart from each other. Leaf tissues were quickly dried with silica gel and stored at room temperature in the laboratory. Spatially explicit information of each sampled population, including latitude, longitude, and elevation, was recorded using a handheld GPS unit (Magellan, USA) (Tables 1 and S1).

Fig. 1

Bayesian cluster analysis for 30 Quercus acutissima populations. Area of the pie is proportional to the probability of membership to the West China group (green) or the East China group (red). Small dots indicate the subgroup each population belongs to when we divided all the populations into seven subgroups. The bottom right map shows ranges of Southwest China (SW), Central China (C), and East China (E) groups. Light blue dashed line represents the boundary between the second step region and the third step region of China’s terrain. Population codes are described in Tables 1 and S1

Table 1 Geographic information and genetic statistics of 30 Quercus acutissima populations

DNA extraction and microsatellite genotyping

Total genomic DNA was extracted from 30 mg silica gel-dried leaf tissue of each individual using a Plant Genomic DNA Kit (Tiangen, Beijing, China). The quality and concentration of the genomic DNA were evaluated by electrophoresis on 1% agarose gels and with a one-drop spectrophotometer (OD-1000, Shanghai Cytoeasy Biotech Co., Ltd., China), respectively. DNA samples were diluted to 20 ng/μL and stored at −20 °C for PCR amplification.

All the 707 DNA samples were genotyped at 12 nSSR loci using primers developed for Q. acutissima or other oak tree species, including quru-GA-0M05 and quru-GA-0M07 (Aldrich et al. 2002); MSQ16 (Dow and Ashley 1996); QM67-3M1 (Isagi and Suhandono 1997); ssrQrZAG4, ssrQrZAG7, ssrQrZAG11, ssrQrZAG20, and ssrQrZAG112 (Kampfer et al. 1998); ssrQpZAG9 and ssrQpZAG110 (Steinkellner et al. 1997); and EE743802 (Zhang et al. 2013a). The forward primers were labeled with fluorescent dye, 6-FAM, or HEX (Applied Biosystems, USA). Polymerase chain reactions (PCRs) were performed using a Mastercycler pro Thermal Cycler (Eppendorf, Germany) in 20-μL volumes containing 2× Taq PCR MasterMix (Tiangen, Beijing, China), 10 μM of each primer, 20–40 ng of template DNA and ddH2O. The PCR protocols were designed as an initial denaturing of 4 min at 94 °C, followed by 30 cycles of 45 s at 94 °C, 45 s of annealing at a primer-specific temperature (Table 2), 45 s of elongation at 72 °C, and ending with a final extension of 8 min at 72 °C. PCR products were separated on an ABI3730xl automated Genetic Analyzer using ROX-500 as an internal standard (Applied Biosystems, USA). Allele sizes were determined manually using Genemarker version 2.2.0 (Applied Biosystems, USA).

Table 2 Genetic statistics of ten nuclear microsatellite loci used in this study

Data analysis

Null alleles

The frequency of null alleles at each locus in each population was estimated using INEST version 2.2 (Chybicki and Burczyk 2009) based on an individual inbreeding model (IIM) simultaneously including three parameters, i.e., genotyping failures (b), inbreeding coefficients (f), and null alleles (n). This approach was implemented with 500,000 iterations, discarding the first 50,000 iterations as burn-in, and sampling every 50th update. We compared deviance information criterion (DIC) values between models with null alleles (nfb) and models without null alleles (fb) to evaluate the significance of null alleles in each population. A DICnfb smaller than DICfb indicated that the model with null alleles was favored, i.e., that null alleles contributed significantly to the model. Using this method, we found that the mean null allele frequencies of quru-GA-0M05 and EE743802 across all 30 populations were 0.21 and 0.13, respectively, both greater than the threshold of 0.10. These two loci were therefore excluded from all subsequent analyses.
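The two decision rules described above (prefer the model with the smaller DIC; drop loci whose mean null allele frequency exceeds 0.10) can be sketched as follows. This is a minimal illustration, not INEST code: the function names are hypothetical, and the third dictionary entry is a placeholder rather than a measured value.

```python
def favored_model(dic_nfb: float, dic_fb: float) -> str:
    """Return the label of the model with the smaller DIC ('nfb' or 'fb')."""
    return "nfb" if dic_nfb < dic_fb else "fb"


def loci_to_exclude(mean_null_freq: dict, threshold: float = 0.10) -> list:
    """List loci whose mean null allele frequency across populations exceeds the threshold."""
    return sorted(locus for locus, freq in mean_null_freq.items() if freq > threshold)


# The first two values are those reported in the text; the third entry is a
# hypothetical placeholder standing in for a retained locus.
example = {"quru-GA-0M05": 0.21, "EE743802": 0.13, "hypothetical_locus": 0.05}
excluded = loci_to_exclude(example)
```

With the two reported frequencies, both loci are flagged for exclusion, leaving ten of the 12 genotyped loci.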

Genetic diversity

Genetic diversity statistics were estimated using POPGENE version 1.32 (number of alleles, NA; effective number of alleles, NE; observed heterozygosity, HO; Yeh et al. 1999) and FSTAT version 2.9.3.2 (allelic richness, AR; genetic diversity within populations, HS; inbreeding coefficient, FIS; Goudet 2001) for each locus and each population. Linkage disequilibrium (LD) for all locus pairs in each population and significant deviations from Hardy-Weinberg equilibrium (HWE), as indicated by deviations of FIS from zero, were tested by randomization using FSTAT. The obtained P values were adjusted using a sequential Bonferroni correction (Rice 1989). Since the presence of null alleles may affect the estimation of FIS, a corrected inbreeding coefficient (FIS′) was also evaluated for each population with INEST version 2.2 using the full model (nfb). The significance of FIS′ was assessed by comparing DIC values between the full model and the model without inbreeding (nb).
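For readers unfamiliar with the sequential Bonferroni correction (Rice 1989), the following sketch shows the step-down rule: the i-th smallest of k P values is compared with α/(k − i + 1), and testing stops at the first non-significant result. The function name is an assumption made for illustration; this is not the FSTAT implementation.

```python
def sequential_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans marking which tests remain significant table-wide."""
    k = len(p_values)
    # Indices of the tests, ordered from the smallest to the largest P value.
    order = sorted(range(k), key=lambda i: p_values[i])
    significant = [False] * k
    for rank, idx in enumerate(order):
        # The P value at 0-based rank `rank` is compared with alpha / (k - rank).
        if p_values[idx] <= alpha / (k - rank):
            significant[idx] = True
        else:
            break  # every larger P value is declared non-significant
    return significant
```

For example, with P values of 0.001, 0.04, 0.03, and 0.2, only the first test stays significant, because 0.03 exceeds 0.05/3.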

To estimate the effect of reduction in population size on genetic diversity due to bottlenecks, the Wilcoxon signed-rank test was implemented in Bottleneck version 1.2.02 (Cornuet and Luikart 1996) under two mutation models: the stepwise mutation model (SMM) and the two-phase mutation model (TPM) with 70% stepwise mutations and 30% multistep mutations. We performed 10,000 simulations for each population.

Genetic differentiation

We estimated both global FST (Weir and Cockerham 1984) and the standardized measure GST (Hedrick 2005) for each locus and over all loci in MSA version 4.05 (Dieringer and Schlotterer 2003), with the significance tested by 10,000 permutations. To evaluate the impact of null alleles on estimation of genetic differentiation among populations, we also calculated FST with a correction using the ‘exclusion null alleles’ (ENA) method (cFST) in FreeNA (Chapuis and Estoup 2007). The 95% confidence intervals (CIs) of both global FST and cFST across loci were obtained through bootstrap re-sampling in the same program. In addition, we applied the allele size permutation test (Hardy et al. 2003) in SPAGeDi version 1.5 (Hardy and Vekemans 2002) to detect the significant influence of stepwise-like mutations on genetic structuring, as indicated by RST > FST, where RST is an FST analog based on allele size variance (Slatkin 1995).

We tested the significance of differences in genetic statistics AR, HO, HS, and FST among geographic regions identified by STRUCTURE using FSTAT version 2.9.3.2. The two-sided P values were obtained after 10,000 permutations. A hierarchical analysis of molecular variance (AMOVA) was performed in ARLEQUIN version 3.5 (Excoffier and Lischer 2010) to examine the pattern of genetic variance partitioned among regions (sub-regions), among populations within regions (sub-regions), and within populations. The significance of fixation indices (FCT, FSC, and FST) was tested with 10,000 permutations. To test isolation by distance (IBD), a Mantel test with 10,000 random permutations was also carried out using GENALEX 6.5 (Peakall and Smouse 2012) between the matrix of pairwise FST/(1-FST) and the matrix of logarithm of the geographic distances.
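The isolation-by-distance test described above can be illustrated with a small permutation Mantel test between linearized FST, FST/(1 − FST), and log-transformed geographic distance. This is a standard-library sketch under stated assumptions (Pearson correlation, a one-sided test, and the observed value counted among the permutations); it is not the GENALEX implementation, and all function names are hypothetical.

```python
import math
import random


def _pearson(x, y):
    """Pearson correlation between two equally long lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def _upper(matrix, order):
    """Upper-triangle entries of a square matrix after reordering rows and columns."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def linearize_fst(fst):
    """Transform pairwise FST into FST / (1 - FST)."""
    return fst / (1.0 - fst)


def mantel_test(gen, geo, permutations=10000, seed=0):
    """One-sided Mantel test; returns (r_observed, P value)."""
    rng = random.Random(seed)
    identity = list(range(len(gen)))
    y = _upper(geo, identity)
    r_obs = _pearson(_upper(gen, identity), y)
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        perm = identity[:]
        rng.shuffle(perm)  # permute populations (rows and columns together)
        if _pearson(_upper(gen, perm), y) >= r_obs:
            hits += 1
    return r_obs, hits / (permutations + 1)
```

Here `gen` would hold linearized pairwise FST values and `geo` the logarithm of pairwise geographic distances, both as square symmetric matrices in the same population order.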

Genetic structure

We used Bayesian cluster analysis to detect the genetic structure of Q. acutissima using STRUCTURE version 2.3.4 (Pritchard et al. 2000). A LOCPRIOR model was applied to use sampling locations as prior information (Hubisz et al. 2009). Twenty independent runs for each K (from 1 to 12) were performed with a 500,000 burn-in period and 500,000 repetitions. The optimal number of clusters (K) was determined through the delta K method as described in Evanno et al. (2005). Values of membership coefficient (q) were post-processed over the 20 runs using CLUMPP version 1.1.2 (Jakobsson and Rosenberg 2007) and displayed with DISTRUCT version 1.1 (Rosenberg 2004). Complementary to Bayesian cluster analysis, we also performed principal coordinate analysis (PCoA) based on pairwise population FST values using GENALEX 6.5 (Peakall and Smouse 2012) and conducted UPGMA cluster analysis in NTSYS-pc version 2.1 (Rohlf 1999) based on Nei’s unbiased genetic distances (Nei 1978) calculated by POPGENE version 1.32 (Yeh et al. 1999).
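The delta K statistic of Evanno et al. (2005) can be sketched as below. The sketch assumes the common formulation in which the absolute second-order difference of the mean ln P(D) across replicate runs is divided by the standard deviation of ln P(D) at that K; the function name and input layout are assumptions, and no STRUCTURE output is parsed here.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Compute delta K from replicate STRUCTURE log-likelihoods.

    lnp_by_k maps each K to a list of ln P(D) values, one per replicate run.
    Returns a dict K -> delta K for every K that has both neighbours K-1 and K+1.
    """
    means = {k: mean(values) for k, values in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 in means and k + 1 in means:
            # Absolute second-order rate of change of the mean log-likelihood.
            second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
            # stdev raises an error for fewer than two runs and the ratio is
            # undefined when all runs at this K return the same value.
            result[k] = second_diff / stdev(lnp_by_k[k])
    return result
```

Because the statistic needs both neighbours, runs for K = 1 to 12 yield delta K values only for K = 2 to 11; the K with the largest delta K is taken as optimal.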

Gene flow among regions

We used a maximum likelihood procedure and a Brownian motion microsatellite model as implemented in MIGRATE-N version 3.6.11 (Beerli 2006) to estimate the mutation-scaled population size (θ) and immigration rates (M) among the three geographic regions identified by STRUCTURE. Three independent replicates were run with ten short chains (10,000 sampled trees for each chain), five long chains (100,000 sampled trees for each chain), and the first 10,000 sampled trees discarded as burn-in. The adaptive heating scheme at four temperatures (1.0, 1.5, 3.0, and 6.0) was used to efficiently search the genealogy space.

Demographic history

The approximate Bayesian computation (ABC) procedure was carried out in DIYABC version 2.1.0 (Cornuet et al. 2014) to infer the probable demographic history of the three genetic groups identified in this study. Specifically, the Southwest China group (SW) comprises 61 individuals from Southwest China and the Hainan Island, the Central China group (C) comprises 258 individuals from Central China and the Loess Plateau, and the East China group (E) comprises 388 individuals from East China and the Tianmu Mountain. Eight scenarios were compared to test different hypotheses about glacial refugia: three refugia in Southwest China, Central China, and East China (scenario 1); two refugia in Southwest China and Central China (scenario 2); and one refugium in East China (scenario 3), Central China (scenario 4), or Southwest China (scenarios 5–8) (Fig. 2 and Table S2). The analysis was run five times independently, with one million simulations for each scenario in each replicate. The prior distributions of historical parameters are shown in Table S3. Eighteen summary statistics including mean number of alleles, mean genic diversity for three regions, FST, classification index, and shared allele distance between pairs of regions were defined in the ABC analysis. The 1% simulated data closest to the observed data was used to estimate the posterior distributions of historical parameters and the relative posterior probabilities of each scenario via a logistic regression approach. The model checking process was performed to evaluate the goodness of fit for the preferred scenario.
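The rejection step of the ABC procedure (retaining the 1% of simulations closest to the observed data) can be illustrated with the sketch below. Note the simplification: it returns the "direct" estimate, i.e., the proportion of each scenario among the retained simulations, whereas the posterior probabilities reported here were obtained through logistic regression. The function name, the input layout, and the scaling of each statistic by its standard deviation are assumptions made for illustration.

```python
import math
from collections import Counter


def scenario_support(observed, simulations, keep_fraction=0.01):
    """Proportion of each scenario among the simulations closest to the observed data.

    observed: list of observed summary statistics.
    simulations: list of (scenario_id, summary_statistics) tuples.
    """
    n_stats = len(observed)
    # Scale each statistic by its standard deviation across all simulations.
    scales = []
    for j in range(n_stats):
        column = [stats[j] for _, stats in simulations]
        mu = sum(column) / len(column)
        sd = math.sqrt(sum((x - mu) ** 2 for x in column) / len(column))
        scales.append(sd if sd > 0 else 1.0)

    def distance(stats):
        # Euclidean distance between scaled simulated and observed statistics.
        return math.sqrt(sum(((s - o) / c) ** 2 for s, o, c in zip(stats, observed, scales)))

    ranked = sorted(simulations, key=lambda sim: distance(sim[1]))
    n_keep = max(1, int(len(ranked) * keep_fraction))
    counts = Counter(scenario for scenario, _ in ranked[:n_keep])
    return {scenario: count / n_keep for scenario, count in counts.items()}
```

In the setting described above, `observed` would hold the 18 summary statistics and `simulations` the simulated data sets of all eight scenarios.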

Fig. 2

Eight demographic scenarios tested by approximate Bayesian computation (ABC). NA, NSW, NC, and NE are effective population sizes of the common ancestor (A), the Southwest China group (SW), the Central China group (C), and the East China group (E). Parameters t1, t2, and t3 are times of occurrence of various events including population size variation, divergence, and admixture, which are counted in generations and described in Table S2. Parameters rA and 1-rA are the admixture rates from the Southwest China group to the Central China group and from the East China group to the Central China group

Results

Genetic diversity of microsatellite loci and populations

Mean null allele frequency at each of the ten loci was estimated to be lower than the threshold of 0.10 across all 30 populations (Table 2). Comparison of DIC values showed that the model with null alleles was slightly favored in 22 populations of Q. acutissima (Tables 1 and S4), although the estimated frequency averaged across loci was relatively low (range 0.02–0.07; Table 1). Ten populations (TC, LN, TE, MX, AK, TB, NJ, LYG, QHD, and WTM) had an inbreeding coefficient that significantly deviated from zero (P < 0.05 after Bonferroni correction), but these departures from HWE disappeared after correcting for the bias introduced by null alleles (Tables 1 and S4). No significant LD was observed for any pair of loci in any population (at the 0.05 level after Bonferroni correction), indicating that all loci were inherited independently and thus can be used for investigating the genetic variation of Q. acutissima.

At the population level, the mean effective number of alleles (NE) ranged from 2.7 to 4.7 (mean 4.0), allelic richness (AR) from 4.1 to 6.5 (mean 5.7), and genetic diversity within populations (HS) from 0.572 to 0.763 (mean 0.709; Table 1). Higher intrapopulation genetic diversity was observed in East China than in West China (P = 0.007 for AR; P < 0.001 for HS; Table 3). Central populations between 26°N and 35°N also presented higher genetic diversity than southern populations (P = 0.014 for AR; P = 0.022 for HS) and northern populations (P = 0.044 for AR, although the difference in HS was not significant, P = 0.117; Table S5). Additionally, a significant quadratic regression relationship was detected between latitude and genetic diversity statistics for all populations (R = 0.500, P = 0.020 for AR; R = 0.515, P = 0.016 for HS), and in both Southwest-Central China (R = 0.659, P = 0.043 for AR; R = 0.789, P = 0.005 for HS) and East China (R = 0.686, P = 0.016 for AR, although not for HS, P = 0.082; Fig. S1), suggesting that central populations have higher genetic diversity than southern and northern populations. No signal of a recent bottleneck was detected in any population under either the stepwise mutation model (SMM) or the two-phase mutation model (TPM), except in YF under the SMM (P = 0.016; Table S6).

Table 3 Comparison of genetic statistics among different geographic regions of Quercus acutissima

Genetic differentiation among populations and regions

The impact of null alleles on the estimation of population differentiation was very mild, as indicated by the almost equal values and 95% CIs for both global FST (0.056, 95% CI: 0.049–0.063) and cFST (0.056, 95% CI: 0.049–0.062). Genetic differentiation at each locus was significant (P < 0.001) but relatively low (Table 2). The observed value of multilocus RST (0.058) lay between the lower and upper limits of the 95% CI of the null distribution of the permuted RST (0.034–0.075), and the difference was not statistically significant (P = 0.710). None of the global RST values for individual loci was significantly larger than the permuted RST (P > 0.05; Table 2), suggesting that RST did not differ significantly from FST and that stepwise-like mutations have not contributed to the genetic divergence of Q. acutissima.

Results of AMOVA indicated significant genetic differentiation between West China and East China (FCT = 0.019, P < 0.0001), although only 1.85% of the genetic variation was partitioned among groups (Table 4). At the population level, a lower FST was detected in East China than in West China (P = 0.005) or in Southwest China (P = 0.022; Table 3), implying stronger genetic differentiation among populations within the West China region. Additionally, a significant pattern of isolation by geographic distance was observed for all 30 populations (R = 0.346, P < 0.001), West China populations (R = 0.548, P < 0.001), and Central China populations (R = 0.292, P = 0.039), but not for East China populations (P > 0.05; Fig. S2).

Table 4 Analysis of molecular variance (AMOVA) among different geographic regions (sub-regions) of Quercus acutissima

Genetic structure and gene flow

Bayesian cluster analysis showed that the optimal number of genetic groups (K) was two (Fig. 3a). All the 30 populations of Q. acutissima were divided into the West China group and the East China group, across a well-known boundary of the stepped geomorphology of China, which separates the second step region characterized by plateaus with higher elevation of 1000–2000 m, and the third step region characterized by plains with lower elevation < 500 m (Fig. 1 and Table 1). A significant decline in the probability of membership to the West China cluster (qW) in each population was detected with increasing longitude, latitude, and decreasing elevation (all P values < 0.001; Fig. 3d–f). According to this trend, we then subdivided all the populations into three geographic regions: (1) Southwest China with qW > 0.75, (2) East China with qW < 0.25, and (3) Central China, an admixed cluster of the former two regions, with 0.25 ≤ qW ≤ 0.75 (Figs. 1 and 3g; Table 1). The Central China region can further be subdivided into the West Central China sub-region with 0.50 < qW ≤ 0.75 and the East Central China sub-region with 0.25 ≤ qW ≤ 0.50. The STRUCTURE result of K = 4 is shown in Figs. 1 and 3g. Three of the four genetic clusters were found to be dominant only in a single population at the distribution edge, i.e., cluster I in the southern-most population CJ (qI = 0.948), cluster II in the northwestern-most population HL (qII = 0.856), and cluster III in the southeastern-most population WTM (qIII = 0.826) (Table S7). Cluster IV was not only dominant in the northeastern-most populations such as QHD (qIV = 0.922), DL (qIV = 0.892), and ZH (qIV = 0.891), but also in other populations in East China (Table S7). Significant linear correlations were observed between latitude and the probability of membership to the genetic cluster I (qI) and II (qII) in each population from West China (all P values < 0.01; Fig. 3b, c). 
Based on the STRUCTURE result of K = 4, we further divided all the 30 populations into seven sub-regions (Tables 1 and 5, Fig. 1). Both PCoA and UPGMA cluster analysis showed good consistency with STRUCTURE, except for populations TB and NS nested within the East China cluster in the UPGMA dendrogram (Figs. 4 and S3).
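The qW thresholds used above to assign populations to the three regions, and to split Central China into two sub-regions, can be written as a small classifier. The function names are hypothetical; the thresholds are those stated in the text for K = 2, and the seven sub-regions derived from K = 4 are not covered by this sketch.

```python
def classify_region(q_w):
    """Assign a population to a region from its membership to the West China cluster."""
    if not 0.0 <= q_w <= 1.0:
        raise ValueError("q_w must lie between 0 and 1")
    if q_w > 0.75:
        return "Southwest China"
    if q_w < 0.25:
        return "East China"
    return "Central China"  # admixed: 0.25 <= q_w <= 0.75


def classify_central_subregion(q_w):
    """Split Central China populations into West and East Central China."""
    if 0.50 < q_w <= 0.75:
        return "West Central China"
    if 0.25 <= q_w <= 0.50:
        return "East Central China"
    return None  # not a Central China population
```

A population with qW = 0.50, for instance, falls in the East Central China sub-region, because the lower sub-region includes its upper bound.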

Fig. 3

STRUCTURE results for 707 Quercus acutissima individuals from 30 populations. a Delta K statistics calculated according to Evanno et al. (2005). b, c Linear correlations between latitude and the probability of membership (q) to the cluster I (qI) and the cluster II (qII) in each West China population when K = 4. d–f Linear correlations between geographic information and the probability of membership to the West China cluster (qW) in each population when K = 2. g Histogram of individual assignments when K = 2, 3, and 4. Each vertical bar represents one individual. Roman numerals show genetic clusters when K = 4. Colors of dots in b–f and of horizontal bars in g represent different geographic regions (sub-regions) as shown in Fig. 1. Population and region codes are shown in Tables 1 and S1

Table 5 Seven sub-regions defined by STRUCTURE result of K = 4

Fig. 4

UPGMA cluster analysis based on the Nei’s unbiased genetic distances. Colors of vertical bars represent seven geographic sub-regions as shown in Fig. 1. Population codes are shown in Tables 1 and S1

Taking the three main regions identified by STRUCTURE into account, all six pairwise maximum likelihood estimates of mutation-scaled immigration rate were significant with no 95% CIs overlapping zero (Table 6). The evidence of distinctly asymmetric historical gene flow was detected among three region-pairs, with migration direction from Southwest China to Central China (27.31 vs. 11.58), from East China to Central China (39.88 vs. 27.00), as well as from Southwest China to East China (20.15 vs. 13.86).

Table 6 Estimates of mutation-scaled effective population size (θ) and mutation-scaled effective immigration rate (M) for the three geographic regions of Quercus acutissima using MIGRATE-N

Demographic history

ABC analysis showed that scenario 8 (ancient divergence with recent admixture model, Fig. 2) had the highest posterior probability (0.491, 95% CI: 0.473–0.510) across all five runs. Its 95% CI did not overlap with those of the other seven scenarios (Table S2). Model checking indicated a good fit to the observed data, with 16 of the 18 simulated summary statistics not significantly different from the observed values. The mean median values of t1 (the admixture time between SW and E), t2 (the divergence time between SW and E), and t3 (the time of ancestral effective population size variation) across the five runs were 5508 (95% CI: 309–23,300), 23,680 (95% CI: 4400–47,700), and 106,400 (95% CI: 24,900–195,000) generations, respectively (Table S8). Similar admixture rates, rA, were detected between SW/E and C (SW → C = 0.511; E → C = 0.489; Table S8). The mean median values of the effective population sizes NSW, NC, NE, and NA were 254,400, 535,600, 617,400, and 300,000, respectively (Table S8).

Discussion

Genetic east-west differentiation revealed by nuclear markers and discordant pattern at chloroplast genome

Topographic differences between the mountainous West China and the lowland East China may have affected the intraspecific differentiation patterns of extant plant species in this region (Shi et al. 2014). Previous studies have observed an obvious phylogeographic break between West China and East China, such as in Juglans cathayensis (Bai et al. 2014), Castanopsis eyrei (Shi et al. 2014), Castanopsis fargesii (Li et al. 2014; Sun et al. 2014), and Castanopsis carlesii (Sun et al. 2016). However, most recent studies are restricted to subtropical China. Our investigation of Q. acutissima confirms that a significant nuclear east-west divergence occurs at a much larger geographic scale, consistent with Zhang et al. (2013a), who examined the genetic structure of 28 provenances of Q. acutissima at a regional scale. Such a finding strongly supports the hypothesis that the boundary between the second step region and the third step region of China’s terrain plays an important role in shaping the genetic structure of widespread tree species in East Asia. Furthermore, the two sides of the boundary showed distinct genetic patterns in Q. acutissima. The West China group, characterized by habitats with more complex topography, exhibited a lower level of genetic diversity, a higher level of population differentiation, and a stronger pattern of isolation by distance (Table 3 and Fig. S2), implying that topographic heterogeneity in West China has exerted a significant influence on the intraspecific genetic structure of Q. acutissima, probably through long-term isolation with restricted gene flow among populations.

In contrast to the above, Zhang et al. (2015) did not detect a clear longitudinal divergence of Q. acutissima at the chloroplast genome level. Instead, the most common chlorotype (H1) was identified in 23 of 30 populations in China, occurring at six of seven sub-regions defined in this study. The diagnostic power of intraspecific structure was thus reduced for chloroplast markers. Similar cases were also reported in plants of subtropical China, such as Castanopsis eyrei (Shi et al. 2014) and Sargentodoxa cuneata (Tian et al. 2015). Indeed, the cytoplasmic-nuclear discordance is more common across closely related species (e.g., Qi et al. 2012; Jose-Maldia et al. 2017; Zhou et al. 2017). Markers associated with the farthest dispersing sex may better delimit species (Petit and Excoffier 2009). Previous studies have shown that in conifers, variation in paternally inherited chloroplast DNA was more species-specific than variation in maternally inherited mitochondrial DNA, and biparentally inherited nuclear DNA exhibited an intermediate pattern (Zhou et al. 2010; Zhou et al. 2017). In oaks, two hypotheses are usually used to explain the origin of shared chlorotypes among species, i.e., chloroplast capture (Acosta and Premoli 2010; Premoli et al. 2012), and incomplete lineage sorting (ILS; Simeone et al. 2016; Vitelli et al. 2017).

In our case at the intraspecific level, we integrate multiple clues to argue for the range expansions and incomplete lineage sorting. First, the star-like pattern of the chlorotype lineages implied that Q. acutissima may have experienced ancient range expansions during the interglacial periods (Tian et al. 2015). This process would have provided an opportunity for ancestral haplotypes, such as the most common haplotype (H1) that occupied the interior position in the network, to disperse to different parts of the distribution, which was a necessary condition for the formation of shared variation between groups. Second, recent studies pointed out that high intraspecific gene flow and high mutation rates may accelerate lineage sorting between species (Zhou et al. 2010). In oaks, gene flow through pollen was estimated to be 200 times higher than that through seeds (Ennos 1994; Mohammad-Panah et al. 2017). Thus, biparentally inherited nuclear markers may have experienced a faster lineage sorting between two geographic groups, whereas maternally inherited chloroplast markers likely retained ancestral polymorphisms for a long time. Finally, the current random distribution pattern of ancestral haplotypes throughout the whole range exactly coincides with the expectation of the ILS scenario (Zhou et al. 2017). If the extensive shared haplotypes arose from the genetic admixture between groups or chloroplast capture between species, it would be expected to be concentrated geographically in contact zones or adjacent areas (Zhou et al. 2010; Yang et al. 2016). Therefore, we conclude that the widespread distribution of ancestral chlorotypes was more likely a consequence of ancient range expansions combined with incomplete lineage sorting. Furthermore, recent repeated local range expansions (Tian et al. 2015; Fan et al. 2018; Ye et al. 2018) may also promote the haplotype sharing in Q. acutissima. 
Together, these factors limit the efficiency of chloroplast markers in detecting intraspecific genetic structure.

Ancient divergence and recent admixture triggered by topographic changes and climatic fluctuations

Based on the STRUCTURE result, we subdivided all the populations of Q. acutissima into three genetic groups, namely Southwest China, Central China, and East China. The scenario of ancient divergence and recent admixture was strongly supported by ABC analysis. Assuming an average generation time of 100–150 years for Q. acutissima (Cavender-Bares et al. 2011; Zeng et al. 2015), the split between the East China group and the Southwest China group was estimated to have occurred during the late Pliocene to the early Pleistocene (3.55–2.37 Ma, mean 2.96 Ma). Compared to other oak species in China, this initial divergence between major clades was slightly earlier than that of other deciduous oaks, e.g., Q. variabilis (1.45 Ma, Chen et al. 2012) and Q. mongolica (1.38 Ma, Zeng et al. 2015), but much later than that of evergreen oaks, e.g., Q. glauca (9.07 Ma, Xu et al. 2015), Q. arbutifolia (10.25 Ma, Xu et al. 2016), and Q. aquifolioides (8.60 Ma, Du et al. 2017). This finding supports the hypothesis that deciduous oaks may have appeared in China later than evergreen oaks, which was also inferred from the fossil history of Chinese oaks (Zhou 1993).

Our findings also indicated that the intraspecific divergence of Q. acutissima was likely associated with the recent uplift of the QTP and climate changes during the late Pliocene to the early Pleistocene in East Asia. Several lines of evidence have shown that the accelerating uplift of the QTP during 3.4–2.6 Ma increased tectonic activity within the adjacent mountains and regions, e.g., uplift of the West Qinling Mountain and the formation of fault basins on the Yunnan-Guizhou Plateau (An et al. 1999). The increasing topographic differences between the second step region and the third step region may have deepened the divergence between the eastern and western populations. Furthermore, East Asia experienced a significant change in climate around the Pliocene/Pleistocene boundary. Previous studies have shown that both the East Asian summer monsoon and the Indian summer monsoon intensified during 3.6–2.6 Ma (An et al. 2001; Chang et al. 2010), whereas the Asian interior was under the control of enhanced aridification (Zhang et al. 2014b), with the most easterly record observed at the eastern edge of the Loess Plateau (Li et al. 2015). Although the link between the rapid uplift of the QTP and climate changes is considered controversial (Wang et al. 2005), it is possible that increased climate heterogeneity and topographic differences jointly drove the intraspecific divergence of Q. acutissima.

Our analysis detected a recent intraspecific admixture event of Q. acutissima between the Southwest China group and the East China group, which was estimated to have occurred 0.83–0.55 Ma (mean 0.69 Ma) under the assumption of an average generation time of 100–150 years. This period comprises several interglacial intervals in the middle Pleistocene, including marine isotope stages (MISs) 13, 15, 17, and 19. An extra-long interglacial may have persisted throughout MISs 15–13 (0.62–0.48 Ma), associated with the limited extent of Arctic ice sheets in glacial MIS 14 (Hao et al. 2015). Notably, an unusually warm and wet climate accompanied by an extremely strong East Asian summer monsoon may have existed in China during this time. Specifically, in the southern Loess Plateau, Guo et al. (1998) recognized a period of greatest warmth over the last 1.2 million years, corresponding to MIS 13 and 15. According to their estimates, this area was likely under a subtropical semi-humid climate in MIS 13, with the annual mean temperature at least 4–6 °C higher and the annual rainfall 200–300 mm higher than at present. In the eastern Tibetan Plateau, the warmest interglacial over the last 0.8 million years was also related to MIS 13, and the appearance of Quercus pollen distinguished this time from other interglacials (Chen et al. 1999). Similar cases were also recorded in the paleosol sediments of southern China (Yin and Guo 2007). Interestingly, all the studied sites mentioned here were around the distribution of the Central China group, implying that a regional warm and humid climate may have occurred, at least in Central China. Under such favorable conditions, we infer that Q. acutissima may have expanded from the Southwest China and East China refugia into the mountainous area of Central China, which then triggered genetic admixture during recolonization. Our findings highlight that Central China was not only a potential glacial refugium (Zhang et al. 2015), but also likely a ‘melting pot’ of genetic diversity for Q. acutissima.

Genetic evidence for central-marginal hypothesis and multiple cryptic refugia at the distribution edges

For Q. acutissima, a gradual decline of intraspecific genetic diversity from center to margin along latitudinal gradients was observed at both the whole-distribution level and the regional level (Fig. S1). Populations between 26°N and 35°N presented higher allelic richness and within-population genetic diversity than southern populations (< 26°N) and northern populations (> 35°N) (Table S5). This result is consistent with the expectation of the central-marginal (C-M) hypothesis, which predicts that marginal populations have lower within-population genetic diversity due to small population size and increased spatial isolation (Micheletti and Storfer 2015). In East Asia, this prediction has been verified in Euptelea pleiospermum (Wei et al. 2016). However, our study did not detect a C-M pattern along the longitudinal gradient. Instead, significantly higher genetic diversity was observed in East China than in West China, again indicating the profound influence of the topographic difference between the second step region and the third step region of China’s terrain.

According to the STRUCTURE result of K = 4, our study also suggests the existence of multiple cryptic refugia in situ at the distribution edges of Q. acutissima. Populations at the southern-most point (CJ), northwestern-most point (HL), and southeastern-most point (WTM) constituted three distinct single-population clusters (I–III). Three other populations (TC, KM, and FN) in the Southwest China sub-region were geographically close to CJ, with relatively high qI (0.6–0.8). Finally, although cluster IV is common in both the East China and Central China regions, populations dominated by this cluster, with qIV larger than or approximately equal to 0.9, were only found in the northeastern-most part, e.g., TS, QHD, ZH, and DL (Table S7). Such a finding provides a strong signal for the existence of multiple refugia at the southwestern, southeastern, northwestern, and northeastern edges of the range of Q. acutissima.
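
The assignment rule used above, in which a population counts as dominated by a cluster when its membership coefficient q is at or near 0.9, can be sketched as follows. The population names, q values, tolerance, and function name are hypothetical placeholders for illustration only; the actual coefficients are those in Table S7.

```python
def dominant_cluster(q_values, threshold=0.9, tolerance=0.01):
    """Return the label of the cluster dominating a population, or None.

    q_values maps cluster labels to mean membership coefficients.
    The tolerance (an assumption) stands in for "approximately equal to".
    """
    label, q = max(q_values.items(), key=lambda item: item[1])
    return label if q >= threshold - tolerance else None


# Hypothetical coefficients, NOT taken from Table S7.
example_q = {
    "pop_A": {"I": 0.02, "II": 0.03, "III": 0.01, "IV": 0.94},
    "pop_B": {"I": 0.70, "II": 0.10, "III": 0.05, "IV": 0.15},
}

assignments = {pop: dominant_cluster(q) for pop, q in example_q.items()}
print(assignments)
```

Under this rule a population such as the hypothetical pop_B, with a relatively high but not dominant q for cluster I, is left unassigned rather than forced into a cluster.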

Based on maternally inherited chloroplast markers, Zhang et al. (2015) also uncovered a close genetic relationship among these four refugia. In their study, four haplotypes from the same lineage were specific to the Southwest China refugium (H2 in FN and TC), Southeast China refugium (H4 in QY), Northwest China refugium (H14 in HL), and Northeast China refugium (H9 in DL), presenting a distribution pattern similar to the genetic structure revealed by nSSR markers. Moreover, H2, the ancestral haplotype occupying the interior position of the network, gave rise to H4, H9, and H14 through one to two single-site mutations, implying an ancient common origin in Southwest China and subsequent range expansions or long-distance dispersal events from Southwest China to the other three refugia during interglacials. This inference is consistent with the result of the ABC analysis, which also provides evidence of an ancient common origin of the three main genetic groups in Southwest China.

Another explanation, although unlikely, for the close genetic relationship among these four refugia is human-mediated historical seed dispersal. A recent study found that human activities resulted in a homogeneous genetic structure and lower genetic diversity of Q. acutissima in Japan (Saito et al. 2017). However, our study detected a distinct genetic structure (Fig. 1) and a central-marginal pattern of genetic diversity along latitude (Fig. S1). Additionally, a clear latitudinal gradient in the probability of membership in genetic clusters I and II was evident in West China (Fig. 3b, c). Significant patterns of isolation by geographic distance were also observed for populations across the entire distribution and within West China (Fig. S2). The co-occurrence of these genetic imprints is better explained by natural processes such as range expansions than by human-mediated seed flow. Even if the formation of putative refugial populations was associated with human-mediated gene flow, a subsequent gradual recolonization would have been required to shape the genetic structure of Q. acutissima. This process may have required at least 10,000 years according to the estimation method in Saito et al. (2017), which is much longer than the history of artificial cultivation of oaks in China (You et al. 2017). Therefore, the existence of multiple refugia was most likely a consequence of naturally occurring range expansions or long-distance seed dispersal events, not human-mediated historical seed transfer.
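
Isolation by distance is commonly assessed with a Mantel test, which correlates a matrix of pairwise geographic distances with a matrix of pairwise genetic distances and judges significance by permuting population labels. The passage above does not state which test was used, so the sketch below is only one plausible way to run such a check; the population positions and distances are hypothetical, and the function names are illustrative.

```python
import math
import random


def upper_triangle(matrix):
    """Flatten the strict upper triangle of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def mantel_test(geo, gen, permutations=999, seed=0):
    """One-sided Mantel test: returns (r, p) for a positive correlation."""
    rng = random.Random(seed)
    n = len(geo)
    geo_vec = upper_triangle(geo)
    observed = pearson(geo_vec, upper_triangle(gen))
    hits = 0
    order = list(range(n))
    for _ in range(permutations):
        # Permute rows and columns of the genetic matrix together.
        rng.shuffle(order)
        permuted = [[gen[order[i]][order[j]] for j in range(n)]
                    for i in range(n)]
        if pearson(geo_vec, upper_triangle(permuted)) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical populations along a transect (arbitrary distance units).
positions = [0, 1, 2, 4, 7]
geo = [[abs(a - b) for b in positions] for a in positions]
# Hypothetical genetic distances that increase linearly with distance.
gen = [[0.01 * d for d in row] for row in geo]

r, p_value = mantel_test(geo, gen)
print(round(r, 3), p_value)
```

A gradual, distance-dependent structure of this kind yields a strong positive correlation, whereas large-scale human-mediated seed transfer would be expected to weaken it.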

Implications for conservation

Q. acutissima is able to regenerate naturally; however, long-term exploitation and habitat fragmentation have caused a considerable decrease in its distribution across China. Field surveys found that in some provinces, such as Sichuan, Chongqing, Guizhou, and Hainan, the habitats of several wild populations have been destroyed by farmland reclamation, tourism development, and commercial plantation forestry (e.g., Hevea brasiliensis and Eucalyptus robusta). Therefore, it is essential to take effective conservation measures for this species.

One of the important goals of species conservation is to maintain genetic diversity and evolutionary potential (Mingeot et al. 2016). Our study reveals that Q. acutissima has high genetic diversity at the species level, with genetic variation mainly partitioned within populations. Considering that natural populations of the species are not yet endangered, we suggest that in situ conservation should be the primary management activity in the future. Priority for conservation should be given to populations with higher genetic diversity (e.g., populations in Central China) or with unique alleles (e.g., populations in putative refugia). When funds are limited, the populations with the highest genetic diversity among those with close genetic relationships should be selected first for protection. Zhang et al. (2013a) observed a similar level of genetic diversity in provenances of Q. acutissima, indicating that it is feasible to preserve the genetic diversity of this species by constructing germplasm banks. According to the genetic structure identified in this study, we suggest that in practice at least two germplasm banks should be set up, corresponding to the West China group and the East China group.