Introduction

The northern Neotropical region comprises southern Mexico and Central America. This region is important for the study of evolutionary processes such as migration, extinction, and diversification of different organismal groups, in the context of the geological and climatic heterogeneity that characterizes it (Gutiérrez-García and Vázquez-Domínguez 2013). Even though consensus patterns have not yet emerged, several studies agree on the importance of geological elements such as the Isthmus of Tehuantepec (IT), the Polochic-Motagua-Jocotán fault system (PMJ), the Nicaraguan Depression (ND), and the Isthmus of Panama (IP) as promoters of phylogeographic structure in different organisms (see Table 1 in Gutiérrez-García and Vázquez-Domínguez 2013), and also in the geographical organization of entire lineages such as Quercus (Rodríguez-Correa et al. 2015) and ferns (Ramírez-Barahona and Luna-Vega 2015). On the other hand, a lack of phylogeographic structure related to these geological elements has also been observed, mainly (but not exclusively) in lowland species (see Table 1 in Gutiérrez-García and Vázquez-Domínguez 2013).

More interesting is the fact that even when the phylogeographic structure of the species appears to be determined by these barriers, intraspecific divergence seems to have taken place at different time periods, suggesting differential responses to the geological dynamics of the region and, therefore, lineage-specific evolutionary histories (Ornelas et al. 2013). Historical environmental fluctuations have also been considered important in structuring the genetic diversity of Neotropical species. However, available evidence (reviewed in Ramírez-Barahona and Eguiarte 2013) indicates differential responses to climate changes according to the distribution, ecology, and phylogenetic relationships of species. Therefore, identifying common patterns and how these relate to the species’ characteristics requires further studies of codistributed species that integrate palaeoclimatic, ecological, and molecular data (Ramírez-Barahona and Eguiarte 2013).

In northern latitudes, several Quercus species have been studied using phylogeographic approaches (Dumolin-Lapègue et al. 1997; Fineschi et al. 2002; Cottrell et al. 2002; Csaikl et al. 2002; Olalde et al. 2002; Petit et al. 2002a, b; Grivet et al. 2006; López de Heredia et al. 2007; Chen et al. 2012; Liu et al. 2013; Alexander and Woeste 2014), and the results have clarified the historical migration processes related to climatic fluctuations during the last glacial cycle, particularly in the European continent. Interestingly, oak species situated in the Neotropics exhibit contrasting patterns with respect to their Nearctic and Palearctic relatives. For example, Mexican oak species show higher levels of within-population variation and lower among-population differentiation than northern latitude species (Tovar-Sánchez et al. 2008; Ramos-Ortiz et al. 2016), probably because at lower latitudes the effects of glaciations were less severe and oak populations were not isolated in small refugia but remained large and maintained relatively stable geographic ranges (González-Rodríguez et al. 2004; Cavender-Bares et al. 2011). Instead, the most important effects of the Pleistocene climatic changes on oak species in the Neotropics seem to have been related to changes in their altitudinal distribution (Hooghiemstra and van der Hammen 2004; Rodríguez-Correa 2015).

Recently, it has been suggested that areas such as the IT, the PMJ, the ND, and the IP have acted as important barriers to the dispersal of oak species, influencing biogeographic patterns, species diversity, and niche divergence processes (Torres-Miranda et al. 2013; Rodríguez-Correa et al. 2015). However, a few oak species (Quercus corrugata, Quercus insignis, Quercus oleoides, and Quercus sapotifolia, and perhaps also Quercus benthamii, Quercus cortesii, and Quercus salicifolia) (Valencia-A 2004; Morales 2010) have geographical distributions that extend across almost all these barriers (except the IP), eliciting questions about the timing and process of dispersal and the influence of species’ traits and ecological niche on dispersal success. Also, an evident hypothesis is that the mentioned barriers probably correspond to phylogeographic breaks within these species. However, the lowland Q. oleoides (distributed from northeastern Mexico to northern Costa Rica) is the only species so far analyzed (Cavender-Bares et al. 2011, 2015), using traditional genetic markers and next-generation sequence data. No genetic discontinuities across the IT and the PMJ were found, but there was strong differentiation across the ND, estimated at about 1.9 my, implicating the formation of the Nicaraguan Depression and associated volcanic activity as the cause of this pattern (Cavender-Bares et al. 2015).

In this study, we used chloroplast DNA microsatellite data to reconstruct phylogeographic patterns of two Neotropical oak species (Q. insignis and Q. sapotifolia) distributed from southern Mexico to northern Panama. Both species are cataloged as threatened in Mexico (Q. insignis as critically endangered and Q. sapotifolia as vulnerable; González-Espinosa et al. 2011). Q. insignis is a white oak (section Quercus) and Q. sapotifolia is a red oak (section Lobatae), but both have roughly similar geographical distributions and are mainly montane species. The specific questions for the study were as follows: (i) What are the phylogeographic patterns in the two oak species, Q. insignis and Q. sapotifolia? (ii) How did the major geological features of the northern Neotropics influence the evolutionary history of these two species? (iii) Did recent Pleistocene climatic changes affect the geographic distribution and connectivity patterns in these oaks across their range?

Methods

Studied species and sampled populations

The two oak species were sampled across their whole distribution range (Fig. 1). Q. insignis M. Martens & Galeotti is found in the Mexican states of Jalisco, Guerrero, Oaxaca, Chiapas, and Veracruz, and in Belize, Guatemala, Honduras, Nicaragua, Costa Rica, and Panama in Central America. The altitudinal range of Q. insignis varies from approximately 1500 to 2000 m (Valencia-A 2004). The species is emblematic for producing very large acorns (7–8 cm in diameter; Montes-Hernández and López-Herrera 2013), probably the largest of all oak species. Q. sapotifolia Liebm. is distributed in the states of Hidalgo, Oaxaca, Chiapas, and Veracruz in Mexico, and in Guatemala, Honduras, El Salvador, Nicaragua, Costa Rica, and Panama in Central America, with an altitudinal range between 250 and 2000 m (Valencia-A 2004). This species is characterized by considerable morphological variation across its distribution. Despite their wide ranges, both species are rare, and populations are difficult to locate because of their low individual density, which is also reflected in the uneven population sample sizes of both species (see “Results”). In total, 13 populations of Q. insignis and 15 populations of Q. sapotifolia were sampled (Table 1).

Fig. 1

Quercus insignis and Q. sapotifolia distribution represented by white and black triangles, respectively. Elevation is represented in a grayscale from lowlands (light gray) to highlands (black). SMO Sierra Madre Oriental, SMS Sierra Madre del Sur, IT Isthmus of Tehuantepec, SMC Sierra Madre de Chiapas, PMJ Polochic-Motagua-Jocotán fault system, CAVA (white dotted line) Central America volcanic arc, ND Nicaraguan Depression, CRM Costa Rica mountains. Details of Middle Mesoamerica (MM) and Lower Mesoamerica (LM) are represented inside dotted and solid rectangles as follows: MH Maya highlands, CRa Chuacus range, LM Las Minas range, PMJ (white crossed line) Polochic-Motagua-Jocotán fault system, wCH western rifted Chortis highlands, CH central Chortis highlands, CG Cordillera de Guanacaste, VC Valle Central, CC Cordillera Central, CG Cordillera de Talamanca, CVF (solid line) Chorotega volcanic front

Table 1 Geographic information summary for Quercus insignis and Q. sapotifolia populations and genetic diversity and historical demography estimates

DNA isolation and microsatellite amplification

Fresh leaf tissue was collected in the field and stored in silica gel until processing in the laboratory. Dried tissue was disrupted using liquid nitrogen, and DNA isolation was carried out using the DNeasy plant mini kit (Qiagen), following the manufacturer’s instructions. A set of nine chloroplast DNA (cpDNA) microsatellite loci (cmcs2, cmcs3, cmcs4, cmcs5, cmcs6, cmcs7, cmcs10, cmcs12, and cmcs14) previously characterized for Fagaceae species by Sebastiani et al. (2004) was tested for polymorphism in both Q. insignis and Q. sapotifolia. Two groups of primers (cmcs3, cmcs4, cmcs5, and cmcs6 in the first group and cmcs2, cmcs7, cmcs10, cmcs12, and cmcs14 in the second group), with different combinations of fluorescent dyes and expected sizes, were employed in multiplexed polymerase chain reactions (PCRs). These were performed using the multiplex PCR kit (Qiagen) in a final volume of 5 μL containing 1× multiplex PCR master mix, 0.25 mM of each primer, dH2O, and 20 ng of template DNA. Amplification consisted of an initial denaturation step of 15 min at 95 °C, followed by 35 cycles of 30 s at 95 °C, 1.5 min at 55 °C, and 1 min at 72 °C, and a final extension step of 30 min at 60 °C. PCR products were run in an ABI-PRISM 3300 Avant sequencer (Applied Biosystems), and allele size was determined using a size standard (GeneScan-600 LIZ) with the Peak Scanner program version 2.0 (Applied Biosystems).

Genetic analysis

Genetic diversity and genetic structure

Each unique combination of size variants for the nine loci was defined as a different haplotype. Genetic diversity was described in terms of rarefied haplotype richness (AR), haplotype diversity with unordered alleles (h sensu Pons and Petit 1996), nonstandardized haplotype diversity with ordered alleles (v sensu Pons and Petit 1996), and the pairwise genetic distance among individuals within a population under a stepwise mutation model (D²SH; Goldstein et al. 1995) using SPAGeDi version 1.1 (Hardy and Vekemans 2002). These estimators were calculated at both the population and region levels. The regions were defined as follows: (1) populations located to the north/east of the IT belonged to the Upper Mesoamerica (UM) region, (2) populations found between the IT and the ND belonged to the Middle Mesoamerica (MM) region, and (3) populations to the south of the ND formed the Lower Mesoamerica (LM) region. Population i14 of Q. insignis (from Chiapas, Mexico) had only one individual; it was included for graphical purposes but excluded from the population structure and historical demography analyses.
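The haplotype definition and the unordered-allele diversity described above can be illustrated with a minimal sketch. It assumes the usual unbiased estimator h = n/(n − 1) × (1 − Σ p_i²) for a single population; the allele sizes and the three-locus layout are hypothetical, and the function names are illustrative rather than part of SPAGeDi.

```python
from collections import Counter

def haplotype_counts(individuals):
    """Count haplotypes; a haplotype is the tuple of allele sizes across loci."""
    return Counter(tuple(ind) for ind in individuals)

def haplotype_diversity(individuals):
    """Unbiased haplotype diversity: h = n/(n-1) * (1 - sum(p_i^2))."""
    n = len(individuals)
    if n < 2:
        raise ValueError("need at least two individuals")
    counts = haplotype_counts(individuals)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)

# Hypothetical population: allele sizes (bp) at three loci for four individuals.
population = [
    (101, 85, 120),
    (101, 85, 120),
    (102, 85, 120),
    (101, 86, 121),
]

print(len(haplotype_counts(population)))       # number of distinct haplotypes
print(round(haplotype_diversity(population), 3))
```

A population with a single individual, such as i14, cannot yield an estimate under this formula, which is consistent with its exclusion from the structure analyses.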

To infer haplotype relationships, a minimum spanning network was computed using Network version 4.6 (available at www.fluxus-engineering.com) with the median-joining method (Bandelt et al. 1999) and a maximum parsimony search (Polzin and Daneshmand 2003). Genetic differentiation and phylogeographic structure were analyzed by calculating GST and NST (a GST analogue that takes into account the genetic distances among haplotypes) with SPAGeDi version 1.1 (Hardy and Vekemans 2002). A permutation test implemented in this program was used to test whether NST was significantly greater than GST, which indicates phylogeographic structure among the populations (Pons and Petit 1996). The FS statistic (Fu 1997) was estimated for each population and region to test for demographic expansion. Calculations were carried out using Arlequin version 3.5 (Excoffier et al. 2005), with cpSSR data coded in binary form as suggested by Navascués et al. (2009).
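As a rough illustration of what binary coding of cpSSR data can look like, the sketch below represents each allele as a run of 1s whose length is its size above the shortest allele at that locus, padded with 0s. This is an assumption about the general idea, not a transcription of the procedure in Navascués et al. (2009); it further assumes alleles differ in 1-bp steps, and all sizes and names are hypothetical.

```python
def binary_code(haplotype, min_sizes, max_sizes):
    """Code each locus as (size - min) ones followed by zeros up to (max - min) bits."""
    bits = []
    for size, lo, hi in zip(haplotype, min_sizes, max_sizes):
        ones = size - lo  # steps above the shortest allele at this locus
        bits.extend([1] * ones + [0] * (hi - lo - ones))
    return bits

def hamming(a, b):
    """Number of positions at which two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))

# Hypothetical two-locus haplotypes (allele sizes in bp).
hap_a = (101, 85)
hap_b = (103, 84)
min_sizes = (101, 84)  # shortest allele observed per locus (assumed)
max_sizes = (103, 86)  # longest allele observed per locus (assumed)

code_a = binary_code(hap_a, min_sizes, max_sizes)
code_b = binary_code(hap_b, min_sizes, max_sizes)
print(code_a, code_b, hamming(code_a, code_b))
```

Under this coding, the number of pairwise differences between two haplotypes equals the summed absolute difference in allele size across loci, which is why sequence-based statistics can be applied to length data.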

The partitioning of the genetic variation among regions, among populations within regions, and within populations was described using a hierarchical analysis of molecular variance (AMOVA) implemented in Arlequin version 3.5 (Excoffier et al. 2005). The grouping of populations in regions was the same as explained above. AMOVA was calculated using both FST (based on the infinite alleles mutation model, IAM) and RST (based on the stepwise mutation model, SMM), with 10,000 permutations to determine the significance of the estimates (Excoffier et al. 2005). To further understand the geographic distribution of genetic variation, a spatial analysis of molecular variance was implemented using SAMOVA version 1.0 (Dupanloup et al. 2002). The number of groups (K) was evaluated from 2 to 10, and the value that maximized the genetic differentiation among groups (ФCT) was selected. Consistency among runs was checked by running each value of K twice.

Also, the geographical location of the most important genetic discontinuities among populations was determined using Monmonier’s maximum difference algorithm implemented in BARRIER ver. 2.2 (Manni et al. 2004). To provide bootstrap support for the observed genetic barriers, a set of 100 pairwise matrices of average square genetic distance (ASD; Goldstein et al. 1995; Slatkin 1995) among populations was generated by manual resampling of individuals in the original matrix. Finally, genetic diversity indexes (AR, h, v, D²SH) were correlated with the altitude and latitude of the sampled populations and also compared among regions using a Kruskal-Wallis rank sum test to determine whether genetic diversity follows geographical patterns. These tests were implemented in R ver. 3.0.2.
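The resampling step can be sketched as follows. This is a minimal illustration, assuming that individuals are resampled with replacement within each population and that ASD is the squared difference in allele size averaged over between-population pairs of individuals and over loci; the text does not specify these details, and the data and function names are hypothetical.

```python
import random

def asd(pop_a, pop_b):
    """Average squared allele-size difference between two populations,
    averaged over all between-population pairs and over loci."""
    n_loci = len(pop_a[0])
    total = 0.0
    for a in pop_a:
        for b in pop_b:
            total += sum((x - y) ** 2 for x, y in zip(a, b))
    return total / (len(pop_a) * len(pop_b) * n_loci)

def bootstrap_asd_matrices(populations, n_replicates=100, seed=0):
    """Return one pairwise ASD matrix (as a dict) per bootstrap replicate."""
    rng = random.Random(seed)
    names = sorted(populations)
    matrices = []
    for _ in range(n_replicates):
        # Resample individuals with replacement inside each population (assumed scheme).
        resampled = {
            name: [rng.choice(populations[name]) for _ in populations[name]]
            for name in names
        }
        matrix = {
            (p, q): asd(resampled[p], resampled[q])
            for i, p in enumerate(names)
            for q in names[i + 1:]
        }
        matrices.append(matrix)
    return matrices

# Hypothetical two-locus data for three populations.
populations = {
    "p1": [(100, 90), (100, 90)],
    "p2": [(102, 90), (102, 91)],
    "p3": [(105, 93), (104, 93), (105, 92)],
}

print(asd(populations["p1"], populations["p2"]))
matrices = bootstrap_asd_matrices(populations)
print(len(matrices), sorted(matrices[0]))
```

Each replicate matrix would then be supplied to the barrier analysis, and the support for a barrier is the proportion of replicates in which it is recovered.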

Ecological niche modelling

Ecological niche models (ENMs) were estimated using a maximum entropy approach implemented in MAXENT version 3.3.3a (Phillips et al. 2006) in order to determine the climatically suitable areas for both Q. insignis and Q. sapotifolia during the Last Interglacial (LIG; ∼120 ka BP), the Last Glacial Maximum (LGM; ∼21 ka BP), and the present-day (PD) periods using the most common and therefore comparable climatic layers reported in previous phylogeographic studies for Neotropical species (e.g., Ramírez-Barahona and Eguiarte 2014; see below). In order to build the models, occurrence data were downloaded from a public repository (Global Biodiversity Information Facility; http://www.gbif.org). The data were filtered using as reference the reported distribution and altitudinal range of the species (Valencia-A 2004).

Climatic information used to run the models was based on the 19 bioclimatic variables proposed by Hijmans et al. (2005; www.worldclim.com) at a spatial resolution of 30 arc-seconds (∼1 km). However, in order to avoid correlation among variables, a correlation matrix was calculated, and from each pair of highly correlated variables (r > 0.7), the more specific variable was discarded. To decrease possible effects of spatial autocorrelation due to the aggregation of records, we only used points separated by more than 0.1 decimal degrees from the nearest neighbor. ENMs were estimated after 100 replicates using the bootstrap resampling method in MAXENT. Thirty percent of the presence records were used to calculate estimators of quality, and the remaining 70% was used to run the models. The ENMs were projected onto past climate scenarios (LGM and LIG), with two models for the LGM, the Community Climate System Model (CCSM; Collins et al. 2004) and the Model for Interdisciplinary Research on Climate (MIROC; Hasumi and Emori 2004). Both models simulate climatic conditions as they are calculated to have been for the LGM, with a stronger temperature decrease assumed in CCSM compared to MIROC (Otto-Bliesner et al. 2006). The original LGM data were downloaded from the PMIP2 website (http://www.pmip2.cnrs-gif.fr/) (Braconnot et al. 2007) and LIG from www.worldclim.com.
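The two filtering steps described above (dropping one variable from each highly correlated pair and thinning clustered records) can be sketched as below. The sketch makes several assumptions not stated in the text: correlation is Pearson's r evaluated in absolute value, the choice of which variable to keep follows a user-supplied priority order, distance is measured directly in decimal degrees, and thinning is greedy in input order. All values are hypothetical.

```python
import math

def thin_records(points, min_dist=0.1):
    """Greedily keep points farther than min_dist (decimal degrees) from every kept point."""
    kept = []
    for lon, lat in points:
        if all(math.hypot(lon - k_lon, lat - k_lat) > min_dist for k_lon, k_lat in kept):
            kept.append((lon, lat))
    return kept

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def drop_correlated(variables, priority, threshold=0.7):
    """Keep variables in priority order, skipping any with |r| > threshold
    against a variable that has already been kept."""
    kept = []
    for name in priority:
        if all(abs(pearson(variables[name], variables[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical occurrence records (longitude, latitude).
records = [(-92.0, 15.0), (-92.05, 15.0), (-91.5, 15.3)]
print(thin_records(records))

# Hypothetical values of three bioclimatic variables at four sites.
variables = {
    "bio1": [1.0, 2.0, 3.0, 4.0],
    "bio5": [2.0, 4.0, 6.0, 8.5],
    "bio12": [5.0, 1.0, 4.0, 2.0],
}
print(drop_correlated(variables, priority=["bio1", "bio5", "bio12"]))
```

Because the thinning is greedy, the retained set depends on the order of the records; a different ordering can keep a different but equally valid subset.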

Model quality was determined with a threshold-independent method, the area under the receiver operating characteristic (ROC) curve (AUC; Fielding and Bell 1997), and the fixed cumulative value 1 logistic threshold (FCV) was applied to obtain binary (presence/absence) output. FCV was used because it has proved to be an appropriate threshold in other Neotropical species analyzed with palaeoecological methods involving ENM (Rodríguez-Correa 2015). Finally, in the models, we determined the areas where the distribution of both species remained stable during the LIG, LGM, and PD periods. The location of the main phylogeographic breaks obtained with the molecular data was compared with the discontinuities in climatically suitable areas obtained in the MAXENT models to evaluate the possible role of present-day and historical climatic conditions in structuring the genetic diversity of the two species.
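The thresholding and stability overlay can be illustrated with a small sketch: each continuous suitability grid is converted to presence/absence with a threshold, and cells predicted as present in every period are treated as stable. The grids, the threshold value of 0.1, and the function names are hypothetical; the actual FCV threshold is model specific and is reported by MAXENT.

```python
def binarize(suitability, threshold):
    """Convert a suitability grid to 1 (suitable) / 0 (unsuitable)."""
    return [[1 if v >= threshold else 0 for v in row] for row in suitability]

def agreement(binary_maps):
    """Count, for each cell, how many models predict presence."""
    rows, cols = len(binary_maps[0]), len(binary_maps[0][0])
    return [
        [sum(m[r][c] for m in binary_maps) for c in range(cols)]
        for r in range(rows)
    ]

def stable_cells(binary_maps):
    """Mark cells predicted as suitable by every model."""
    n = len(binary_maps)
    return [[1 if v == n else 0 for v in row] for row in agreement(binary_maps)]

# Hypothetical 2 x 2 suitability grids for the four scenarios.
pd_map = [[0.8, 0.05], [0.4, 0.0]]
ccsm = [[0.9, 0.3], [0.2, 0.0]]
miroc = [[0.7, 0.2], [0.05, 0.0]]
lig = [[0.6, 0.0], [0.3, 0.02]]

threshold = 0.1  # assumed value for illustration only
maps = [binarize(m, threshold) for m in (pd_map, ccsm, miroc, lig)]
print(agreement(maps))
print(stable_cells(maps))
```

The per-cell count from `agreement` also supports graded maps, in which cells supported by all models are distinguished from cells supported by only some of them.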

Results

Genetic diversity, genetic structure, and historical demography

A total of 76 individuals from 14 populations of Q. insignis and 139 individuals from 15 populations of Q. sapotifolia were characterized (Table 1; Fig. 1). For Q. insignis, the total number of haplotypes was 28, 14 of which were singletons. The number of haplotypes per population ranged from 1 to 6. At the region level, 2 haplotypes were observed in UM, 16 haplotypes (9 singletons) in MM, and 12 haplotypes (5 singletons) in LM. Within-population haplotype diversity (hS) ranged between 0.4 and 1. Mean (SE) hS and total diversity (hT) were 0.65 (0.05) and 0.95 (0.03), respectively. Genetic differentiation among populations was 0.315 (0.05) for unordered alleles (GST) and 0.716 (0.07) for ordered alleles (NST). GST and NST values were significantly different (P < 0.001), indicating the presence of phylogeographic structure. Fu’s FS values for all populations and geographic regions were nonsignificant (P > 0.02; Table 1).

Q. sapotifolia exhibited 34 haplotypes, with 17 singletons. The number of haplotypes per population ranged from 1 to 7. At the regional level, haplotype richness was 7 (4 singletons) in UM, 22 (11 singletons) in MM, and 5 (2 singletons) in LM. Within-population genetic diversity (hS) ranged between 0.4 and 0.95. Mean (SE) hS and total haplotype diversity (hT) were 0.51 (0.06) and 0.96 (0.01), respectively. Genetic differentiation among populations was 0.471 (0.06) for unordered alleles (GST) and 0.752 (0.05) for ordered alleles (NST). A significant difference between GST and NST (P < 0.001) was observed, indicating the presence of phylogeographic structure. Fu’s FS values for all populations and geographic regions were not significant (P > 0.02; Table 1).
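As a quick consistency check on the values above, differentiation for unordered alleles can be approximated from the reported diversities using the standard definition of GST as the proportion of total diversity found among populations. This is a back-of-the-envelope calculation from rounded means, not the estimator as implemented in SPAGeDi, so small discrepancies with the reported values (0.315 and 0.471) are expected.

```latex
G_{ST} = \frac{h_T - h_S}{h_T}
\qquad
\text{Q. insignis: } \frac{0.95 - 0.65}{0.95} \approx 0.316
\qquad
\text{Q. sapotifolia: } \frac{0.96 - 0.51}{0.96} \approx 0.469
```

NST follows the same form with the ordered-allele diversities (vT and vS) in place of hT and hS, which is why NST exceeding GST indicates that closely related haplotypes tend to occur in the same populations.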

The partitioning of genetic variation in Q. insignis according to the AMOVA (Table 2) showed that for both FST and RST, the differences among regions explained a considerable amount of variation (20 and 30%, respectively; P < 0.001). Meanwhile, most of the genetic variation was distributed among populations within regions under both mutation models, 42.22% (P < 0.001) for IAM and 45.18% (P < 0.001) for SMM. The haplotype network (Fig. 2a) indicated a complex scenario. Despite the presence of haplotypes with high frequencies such as H27 (found in 15 individuals), no haplotype is widely distributed across the whole Q. insignis range. The distribution of haplotypes in MM and LM suggested that several haplotypes are more closely related between regions than within regions (e.g., H10 and H5, H6 and H7). In Q. sapotifolia, partitioning of the genetic variation according to the AMOVA results (Table 2) showed that for FST, differences among regions explained 21% (P < 0.001) of the variation. However, for RST, the differentiation value was very small and nonsignificant (2%; P > 0.05). The observed differences between FST and RST are probably related to the effect of high mutation rates and of the mutation model assumed when estimating population structure (for a complete discussion, see Rousset 1996). On the other hand, most of the genetic variation was distributed among populations within regions under both mutation models (68%, P < 0.001 for IAM and 64%, P < 0.001 for SMM). The haplotype network (Fig. 3a) showed that the majority of haplotypes are found in the MM region. Interestingly, the haplotypes found within the UM region seem to belong to two lineages independently derived from haplotypes H34 and H3, and a similar pattern seems to hold for the haplotypes present in the LM region (Fig. 3a).

Table 2 Hierarchical analysis of molecular variance (AMOVA) using FST and RST for Quercus insignis and Q. sapotifolia
Fig. 2

a Haplotype distribution and haplotype network for Quercus insignis. In the network, circle diameter is proportional to haplotype frequency. On the map, circle diameter is proportional to sample size. The Northern Neotropics area is divided in upper Mesoamerica (UM; red line), middle Mesoamerica (MM; green line), and lower Mesoamerica (LM; yellow line). The same colors are used in the haplotype network to indicate the presence of haplotypes in each region. On the map, haplotype composition was represented by different colors reflecting richness and shared haplotypes. Gray and white haplotypes correspond to unique haplotypes in MM and LM, respectively. b Geographical location of the most important genetic discontinuities (gray lines) for Q. insignis populations according to the BARRIER analysis. Also, the different colors of the triangles indicate the grouping of the populations according to SAMOVA. SMO Sierra Madre Oriental, SMS Sierra Madre del Sur, IT Isthmus of Tehuantepec, SMC Sierra Madre de Chiapas, PMJ Polochic-Motagua-Jocotán fault system, CAVA Central America volcanic arc, ND Nicaraguan Depression, CRM Costa Rica mountains. Elevation is represented in a grayscale from lowlands (light gray) to highlands (black)

Fig. 3

a Haplotype distribution and haplotype network for Quercus sapotifolia. In the network, circle diameter is proportional to haplotype frequency. On the map, circle diameter is proportional to sample size. The Northern Neotropics area is divided in upper Mesoamerica (UM; red line), middle Mesoamerica (MM; green line), and lower Mesoamerica (LM; yellow line). The same colors are used in the haplotype network to indicate the presence of haplotypes in each region. On the map, haplotype composition was represented by different colors reflecting richness and shared-haplotypes. Dark gray, light gray, and white haplotypes correspond to unique haplotypes in UM, MM, and LM, respectively. b Geographical location of the most important genetic discontinuities (gray lines) for Q. sapotifolia populations according to the BARRIER analysis. Also, the different colors of the triangles indicate the grouping of the populations according to SAMOVA. SMO Sierra Madre Oriental, SMS Sierra Madre del Sur, IT Isthmus of Tehuantepec, SMC Sierra Madre de Chiapas, PMJ Polochic-Motagua-Jocotán fault system, CAVA Central America volcanic arc, ND Nicaraguan Depression, CRM Costa Rica mountains. Elevation is represented in a grayscale from lowlands (light gray) to highlands (black)

Geographic distribution of the genetic variation

The SAMOVA results for Q. insignis (Fig. 2b) indicated that the maximum value of genetic differentiation among groups of populations corresponded to four groups (ФCT = 0.72; P < 0.001). According to this result, the first group was formed by three populations from Veracruz, Mexico (i13, i12, i11) and three populations from Honduras (i10, i8, and i6), thus including populations from both UM and MM. A second group consisted of two populations from Honduras (MM; i9 and i7) and two populations from Costa Rica (LM; i4 and i2). A third group included two populations from Costa Rica (i5 and i3), and the fourth group included only the southernmost population (i1). The SAMOVA results for Q. sapotifolia (Fig. 3b) showed that genetic differentiation among groups was maximal when K = 5 (ФCT = 0.72; P < 0.001). According to this result, the grouping of populations was as follows: group 1 was formed by UM and northern MM populations (s13, s14, and s15), groups 2 and 3 were each formed by one population from MM (s10 and s12, respectively), group 4 consisted of eight MM populations and one LM population (s2, s3, s4, s5, s6, s7, s8, s9, and s11), and group 5 was formed by population s1 (LM).

The analysis of genetic discontinuities along the Q. insignis distribution (Fig. 2b) suggests that the IT and the ND are important barriers for the species (94 and 95% bootstrap support). In Honduras, the two southernmost populations i7 and i10 were separated from the northern populations (95% bootstrap support). In Costa Rica, the heterogeneous composition of populations was indicated by the presence of barriers separating populations (95% bootstrap support) into three groups corresponding to northern, central, and southern Costa Rica (Fig. 2b). Finally, none of the correlations of genetic diversity statistics with elevation and latitude was significant (P > 0.05), nor were the comparisons among regions (P > 0.05). The analysis of Q. sapotifolia genetic discontinuities (Fig. 3b) showed several well-supported barriers that define the geographic configuration of the genetic differentiation. From north to south, the first barrier separates the Mexican populations from the remaining populations (100% bootstrap support), and it is located at the SMC close to the PMJ. Adjacent to this barrier, a second barrier separates population s10 from the remaining groups (98% bootstrap support). In the middle portion of MM, a third barrier separates Guatemalan from Honduran populations (90% bootstrap support). Within Honduras, population s7 is differentiated from the rest of the populations. LM populations were also separated into two groups (100% bootstrap support), similar to those found in Q. insignis: northern and central Costa Rica. Correlations of genetic diversity measures with elevation and latitude, as well as comparisons among geographic regions, did not show any significant result (P > 0.05).

Ecological niche modelling

The ENMs of Q. insignis for the present-day period (Fig. 4a) showed a good performance according to the AUC value (0.991, SD 0.005). This model showed two main distribution areas, the first one located from the Sierra Madre de Chiapas (SMC) through the Central America volcanic arc (CAVA) in Guatemala, Honduras, and northern Nicaragua. The second distribution area is located in the Costa Rica mountains (CRM) (Fig. 4a). The northernmost part of the species’ range (i.e., southern Mexico) is characterized by a fragmented distribution. ENMs also suggested an increase in the area available for Q. insignis during the LGM (Fig. 4b, c) and an important reduction in the available climatic niche during the LIG (Fig. 4d). The increase in available area during the LGM could have favored the contact or the proximity of populations across barriers such as the IT and ND, suggesting the possibility of intermittent gene flow across these barriers at some time periods. Finally, it can be observed that distribution areas that have remained stable are located in SMS, SMC, CAVA (including an important portion of the Nicaraguan southern region), and CRM (Fig. 4e).

Fig. 4

a Present-day (1950–2000). b, c Last Glacial Maximum (∼21 ka BP; CCSM and MIROC models, respectively). d Last Interglacial (∼120 ka BP) ecological niche models for Quercus insignis. e Q. insignis stable areas of distribution since the Last Interglacial (∼120 ka BP). Areas in black for a–d correspond to suitable climatic conditions for Q. insignis. In e, solid black corresponds to the areas where the distribution coincides in the four models used (PD, CCSM, MIROC, and LIG; see text for details). Dark gray corresponds to the areas where models for the PD and LGM coincide, and light gray corresponds to the areas where only one of the models predicted the presence of the species

The Q. sapotifolia ENM for the present-day period (Fig. 5a) exhibited good performance according to the AUC value (0.90, SD 0.001). This model predicted two main distribution areas, as was also observed for Q. insignis. These areas are formed by (i) the MM region, including the Sierra Madre de Chiapas (SMC) and the trans-isthmian mountains (CAVA) in Guatemala, Honduras, and northern Nicaragua, and (ii) the CRM in LM (Fig. 5a). The first area showed a more continuous distribution than that observed in Q. insignis, particularly in Honduras and northern Nicaragua (Fig. 5a). The Q. sapotifolia distribution in UM is almost restricted to the southern SMS and SMO. For the LGM, both models (CCSM and MIROC) suggested an increase in the connectivity among areas across the IT region, but a gap between the SMS and CAVA, at least in the CCSM model (Fig. 5b, c). At the ND, both models showed an important gap, with a shallow connection between Nicaragua and northwestern Costa Rica according to the MIROC model (Fig. 5c). During the LIG, the climatic niche distribution is restricted to the SMS and northern CAVA, the southern Honduras and northern Nicaragua portions of the CAVA, and the CRM (Fig. 5d). When all models were considered together, it was observed that, as in Q. insignis, most stable areas were located between the SMS and the CAVA, as well as at the CRM (Fig. 5e). In contrast to the climatic niche extension in Q. insignis, Q. sapotifolia has more restricted suitable and stable areas in southern Nicaragua.

Fig. 5

a Present-day (1950–2000). b, c Last Glacial Maximum (∼21 ka BP; CCSM and MIROC models, respectively). d Last Interglacial (∼120 ka BP) ecological niche models for Quercus sapotifolia. e Q. sapotifolia stable areas of distribution since the Last Interglacial (∼120 ka BP). Areas in black for a–d correspond to suitable climatic conditions for Q. sapotifolia. For e, solid black corresponds to the areas where the distribution coincides in the four models used (PD, CCSM, MIROC, and LIG; see text for details). Dark gray corresponds to the areas where models for the PD and LGM coincide, and light gray corresponds to the areas where only one of the models predicted the presence of the species

Discussion

Phylogeography in Neotropical oak species

Recent studies of the Central American biota have focused on describing evolutionary processes such as speciation, extinction, and diversification of flora and fauna in light of the intricate geologic history of this region, its habitat diversity, palaeoclimatic dynamics, and tectonic history (Gutiérrez-García and Vázquez-Domínguez 2013). Gutiérrez-García and Vázquez-Domínguez (2013) proposed the presence of at least three evolutionary groups in the northern Neotropics based on the coincidence of genetic differentiation patterns: the Mayan group (located between the IT and the PMJ), the Mid-Central group (between the PMJ and the Hess escarpment, HE), and the Panamanian group (from the HE to the Andes). Within this area, the one other oak species studied so far has shown strong differentiation between Costa Rica and Honduras associated with the formation of the ND (Q. oleoides; Cavender-Bares et al. 2011, 2015). However, little is known about the effects of past geological and climatic events on the population history of other oak species in the same area that differ from Q. oleoides in climatic niche and habitat.

Q. insignis and Q. sapotifolia exhibited higher haplotype richness (28 and 34 haplotypes, respectively) in comparison to higher latitude oak species that have been characterized with comparable chloroplast microsatellite molecular markers, such as the European white oak complex (from France and the Iberian Peninsula; Quercus robur, Q. petraea, Q. canariensis, Q. faginea, and Q. pyrenaica) which showed 11 haplotypes (Grivet et al. 2006), Q. suber in the western Mediterranean with five haplotypes (Magri et al. 2007), and Q. garryana in the Pacific northwestern region of North America with six haplotypes (Marsico et al. 2009). However, haplotype richness was lower than in Mexican oak species such as Q. castanea (Peñaloza-Ramírez 2011) which had 90 haplotypes and similar to haplotype richness in the Q. crassifolia × Q. crassipes hybrid complex that showed 26 haplotypes (Tovar-Sánchez et al. 2008) and the Q. affinis × Q. laurina hybrid complex that had 35 haplotypes (Ramos-Ortiz et al. 2016).

Also, the values were similar to haplotype richness in Q. lobata in California, with 39 haplotypes (Grivet et al. 2006). Within-population diversity (hS) and total diversity (hT) observed in Q. insignis (0.65 and 0.95, respectively) and Q. sapotifolia (0.51 and 0.96) were also higher than in most of the abovementioned mid- and high-latitude oak species (0.28 and 0.97 in Q. lobata; Grivet et al. 2006; 0.11 and 0.75 in the European white oak complex; Grivet et al. 2006; and 0.08 and 0.67 in Q. garryana; Marsico et al. 2009) and comparable to those of the Mexican species (0.73 and 0.98 for Q. castanea; 0.9 and 0.99 for the Q. affinis × Q. laurina hybrid complex; Ramos-Ortiz et al. 2016). Although diversity values seem to be clearly associated with the different geographic regions (higher values within the Neotropics and lower values in the temperate zone), these comparisons should be treated with caution because both the set and the number of cpSSR loci used coincide only partially among studies.
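To make the diversity statistics compared above concrete, the following minimal Python sketch computes within-population diversity (hS), total diversity (hT), and the derived differentiation measure GST = (hT - hS) / hT from haplotype lists. It is an illustration under stated assumptions, not the estimation procedure of the present study: the function names (gene_diversity, diversity_summary) and the toy haplotype labels are hypothetical, and hT is simplified to the diversity of the pooled sample, whereas published estimators weight populations and correct for sample size differently.

```python
from collections import Counter


def gene_diversity(sample):
    """Unbiased gene diversity: n / (n - 1) * (1 - sum of squared frequencies)."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two individuals per sample")
    sum_sq = sum((count / n) ** 2 for count in Counter(sample).values())
    return n / (n - 1) * (1.0 - sum_sq)


def diversity_summary(populations):
    """Return (h_S, h_T, G_ST) for a dict mapping population name to haplotypes."""
    per_pop = [gene_diversity(sample) for sample in populations.values()]
    h_s = sum(per_pop) / len(per_pop)  # mean within-population diversity
    # Simplifying assumption: total diversity is taken from the pooled sample.
    pooled = [hap for sample in populations.values() for hap in sample]
    h_t = gene_diversity(pooled)
    g_st = (h_t - h_s) / h_t  # share of total diversity found among populations
    return h_s, h_t, g_st


# Hypothetical haplotype labels; these are NOT data from the study.
toy = {
    "pop1": ["A", "A", "B", "C"],
    "pop2": ["D", "D", "D", "E"],
    "pop3": ["A", "F", "G", "G"],
}

h_s, h_t, g_st = diversity_summary(toy)
print(f"hS = {h_s:.3f}, hT = {h_t:.3f}, GST = {g_st:.3f}")
```

As a rough reading of the reported means only, applying the same ratio to the values given above yields about 0.32 for Q. insignis, (0.95 - 0.65) / 0.95, and about 0.47 for Q. sapotifolia, (0.96 - 0.51) / 0.96. These back-of-the-envelope figures are not the GST estimates of the study, which depend on the estimator used.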

In terms of the total number of haplotypes, genetic diversity at the regional level was higher in the MM region than in the UM and LM regions for both Q. insignis and Q. sapotifolia (Table 1). However, the mean values of within-population diversity did not differ significantly among regions according to the Kruskal-Wallis test (p > 0.05). Several authors have attributed high levels of chloroplast DNA diversity to processes such as introgression and the persistence of large populations through different periods (Lumaret et al. 2002; Jiménez et al. 2004). For both Neotropical oak species (particularly Q. sapotifolia), it is probable that the high genetic diversity observed was driven not only by processes such as hybridization but also by the persistence of large areas with suitable climatic niches at least since the LGM, which promoted large effective population sizes through time. This interpretation is also supported by the absence of a signal of recent demographic expansion (and therefore by population equilibrium) for both species, at both the population and regional levels.

Geography and geology effects on the species phylogeographic structure

Both oak species exhibited considerable genetic structure. Moreover, significant differences between GST and NST (p < 0.001) indicated a well-defined phylogeographical structure. The AMOVA showed that 20.57–30.41% of the genetic variation in Q. insignis (depending on the mutation model, IAM and SMM, respectively), and 21.21–2.79% in Q. sapotifolia, is explained by differences among the three predefined geographical regions (UM, MM, and LM). However, a SAMOVA maximizing the genetic variance among population groups indicated that, in the case of Q. insignis, the IT and ND have not been effective barriers. Meanwhile, the BARRIER analysis suggested that the same areas (IT and ND) represent zones where genetic structure differs more than would be expected by chance. The discrepancy between the SAMOVA and BARRIER results is probably due to methodological differences. SAMOVA searches for groups of populations that are geographically contiguous and maximally differentiated from each other, whereas BARRIER identifies boundaries between points or study sites based on differences in an attribute distance matrix. The lack of a concordant pattern of geographic structure between the results of BARRIER and SAMOVA may also reflect a complex evolutionary history, characterized by multiple colonization and recolonization events. In particular, the haplotype network suggests one colonization event of LM from Honduras, at least two migration events in the opposite direction (LM to MM), and finally another colonization from MM to LM.
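The logic behind comparing GST and NST can be sketched in a few lines of Python. GST treats all haplotypes as equally different, whereas NST weights pairs of haplotypes by the distance between them; NST exceeds GST when closely related haplotypes tend to occur in the same populations, which is what a phylogeographical structure means. The sketch below is a simplified illustration and not the procedure used in this study: the function names, the toy samples, and the step distances are hypothetical, no small-sample correction is applied, and significance is assessed by randomly reassigning haplotype identities in the distance matrix.

```python
import random
from collections import Counter


def frequencies(sample, labels):
    """Relative frequency of each haplotype label in one population sample."""
    counts = Counter(sample)
    return [counts[h] / len(sample) for h in labels]


def weighted_diversity(p, dist):
    """Sum of d_ij * p_i * p_j; equals 1 - sum(p_i^2) if all d_ij (i != j) are 1."""
    k = len(p)
    return sum(dist[i][j] * p[i] * p[j] for i in range(k) for j in range(k))


def differentiation(samples, labels, dist):
    """(v_T - v_S) / v_T without small-sample correction (simplifying assumption)."""
    freqs = [frequencies(s, labels) for s in samples]
    v_s = sum(weighted_diversity(p, dist) for p in freqs) / len(freqs)
    mean_p = [sum(p[i] for p in freqs) / len(freqs) for i in range(len(labels))]
    v_t = weighted_diversity(mean_p, dist)
    return (v_t - v_s) / v_t


def permutation_test(samples, labels, dist, n_perm=1000, seed=1):
    """Observed N_ST and the share of permuted values at least as large."""
    rng = random.Random(seed)
    observed = differentiation(samples, labels, dist)
    k = len(labels)
    hits = 0
    for _ in range(n_perm):
        order = list(range(k))
        rng.shuffle(order)  # reassign haplotype identities at random
        shuffled = [[dist[order[i]][order[j]] for j in range(k)] for i in range(k)]
        # Small tolerance so that ties are not lost to floating-point rounding.
        if differentiation(samples, labels, shuffled) >= observed - 1e-12:
            hits += 1
    return observed, hits / n_perm


# Hypothetical data: three populations, each holding one pair of close haplotypes.
labels = list("ABCDEF")
close_pairs = {frozenset("AB"), frozenset("CD"), frozenset("EF")}


def steps(a, b):
    """Hypothetical mutational distance: 1 step within a pair, 4 steps otherwise."""
    if a == b:
        return 0
    return 1 if frozenset((a, b)) in close_pairs else 4


dist = [[steps(a, b) for b in labels] for a in labels]
unit = [[0 if a == b else 1 for b in labels] for a in labels]
samples = [list("AABB"), list("CCDD"), list("EEFF")]

g_st = differentiation(samples, labels, unit)  # unordered haplotypes
n_st, p_value = permutation_test(samples, labels, dist)
print(f"GST = {g_st:.3f}, NST = {n_st:.3f}, permutation p = {p_value:.3f}")
```

In this toy case NST is larger than GST because each population contains only closely related haplotypes. With so few haplotypes the permutation test has little power, so the p value here is illustrative only.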

The BARRIER analysis also showed important differentiation of populations within MM and LM. In the LM region in particular, there was a substructure dividing the populations into northern, central, and southern groups. Both the MM and LM regions are also characterized by high levels of haplotype diversity and a considerable number of singletons. In addition, ENMs suggested that precisely these areas (MM and LM) have exhibited stable climatic niche suitability since the LIG and that, although geographic barriers (particularly the IT and ND) seem to interrupt the climatically suitable areas for the species, there have been episodes through time when connectivity may have been possible.

Another important factor that may have influenced processes of colonization and recolonization between MM and LM is the unstable geological dynamics of the region, characterized by the persistence of intense volcanic activity (as most of southern Central America consists of a Neogene-Quaternary volcanic belt) and by tectonic activity related to the Nicaraguan volcanic front, the Los Gatusos and San Carlos lowlands, and the Chorotega volcanic front described by Marshall (2007), which together configure a highly dynamic and heterogeneous geographic area. Although it has been shown that human activity can have a major impact on forest species, and particularly on historical demographic expansions, extinctions, and bottlenecks of oak species (de Nascimento et al. 2009), recent studies have indicated that the most likely explanations for the genetic structure of the lowland Q. oleoides in Costa Rica are volcanic activity and geologic events such as the formation of the Guanacaste mountains (Cavender-Bares et al. 2011).

In the case of Q. insignis, a similar effect of the complex geological history that characterizes the CRM can be observed. Populations in Costa Rica are distributed in the Cordillera de Guanacaste (CG; i3; Figs. 1 and 2), Cordillera Central (CC; i2 and i4; Figs. 1 and 2), and Cordillera de Talamanca (CT; i1; Figs. 1 and 2). This distribution includes a Quaternary chain of shield-like stratovolcanoes at the CG; composite shield volcanoes and strong climate gradients at the CC; and the Central Valley (between the CG, CC, and CT), which consists of a low-relief upland surface with deeply incised river canyons, active faulting, and a thick accumulation of andesitic to dacitic lavas, pyroclastic rocks, and lacustrine sediments deposited throughout the Quaternary. Meanwhile, the CT is characterized by a suite of Neogene-Quaternary intrusive and extrusive rocks, rapid Quaternary uplift, and peaks that were glaciated during the Pleistocene (Marshall 2007). Under such a geologically diverse scenario, mainly characterized by high volcanic activity and heterogeneous mountain systems, the differentiation of oak populations would be expected, even more so considering the limited seed dispersal capability of oak species. Q. insignis probably has the largest acorns of any oak in the world (between 7 and 8 cm in diameter; Montes-Hernández and López-Herrera 2013), which could have restricted seed movement.

For Q. sapotifolia, both AMOVA and SAMOVA results suggested a pattern of genetic differentiation poorly congruent with our a priori regionalization based on the presence of the IT and ND. SAMOVA suggested that the Mexican populations (s13, s14, and s15) formed a single group, and results from BARRIER also showed a lack of differentiation associated with the IT. Three populations (s1, s10, and s12) did not group with other populations, and all the remaining sites formed a single group including MM and LM samples. Again, the BARRIER analysis did not support significant differences across the ND. Most of the differentiation observed in the BARRIER analysis was concentrated in the MM region, particularly near the PMJ. The first main barrier was located at the eastern portion of the IT in the SMC, followed by a second barrier separating the Guatemalan populations from the Honduran populations. The MM is an area of convergence among the Maya highlands, the Motagua fault zone, and the Chortis highlands. The potential distribution area of Q. sapotifolia in Guatemala and Honduras is interrupted by the Chuacus range and Las Minas range (in the Motagua fault zone) and the western rifted Chortis highlands. Differentiation among populations indicated by the BARRIER analysis could be attributed to prominent river valleys (such as the Motagua and Polochic valleys) and the major structural depressions associated with these valleys, as described by Marshall (2007).

The haplotype network of Q. sapotifolia indicates multiple colonization events from MM to UM and LM. Interestingly, the data suggested that the major barriers (IT and ND) were not an important factor defining the genetic structure of the species, whereas barriers that were not expected to be strong drivers of differentiation, such as the CRM, showed an important effect on the distribution of genetic variation. In the case of the CRM, this unexpected pattern may reflect the importance of the recent geological, volcanic, and climatic heterogeneity of the Chortis block. Two characteristics of Q. sapotifolia should also be considered. First, the species has a wide altitudinal distribution (from 250 up to 2000 m), which suggests that it could have dispersed through the lowlands more successfully than other oak species such as Q. insignis, which is distributed above 1500 m. Second, seed dispersal in Q. sapotifolia should be easier because of the smaller size of its acorns (∼1.5 cm in diameter; Morales 2010). Therefore, even under a scenario of few climatically suitable areas connecting MM with UM and LM, long-distance dispersal (e.g., via birds such as passenger pigeons, as hypothesized by Nixon (1985)) is a plausible mechanism of dispersal across the main barriers. Finally, similarly to what was observed for Q. insignis in LM, Q. sapotifolia populations located in the CC are genetically differentiated from the population of the CT, suggesting that the Central Valley and the characteristic geology of the CRM described above may have played an important role in the genetic differentiation of several Quercus species, including Q. sapotifolia and Q. insignis.

Palaeoclimate effects on the species phylogeographic structure

Although important genetic barriers in both species were located in continuous areas of climatically suitable conditions during the present-day, LGM, and LIG periods (such as the northern portions of the CAVA), for Q. insignis the presence of genetic discontinuities at the ND coincides with the lack of climatically suitable areas in the same region under the entire set of modeled climatic scenarios (Fig. 4a–d). Similarly, for Q. sapotifolia, the lack of genetic discontinuities at the ND was associated with the presence of continuous climatically suitable areas according to the MIROC and combined models (Fig. 5c, e). Such differences between the observed genetic differentiation patterns and the results of the ecological niche modelling suggest that the climatic oscillations during the LGM were apparently not major drivers of the intraspecific differentiation of the species, contrary to the patterns observed in cloud forest species (Ramírez-Barahona and Eguiarte 2014). Contrasting patterns of genetic structure between Q. insignis and Q. sapotifolia probably resulted from differences in the way the two species responded to environmental changes, as has been observed in other Quercus species (Koehler et al. 2012). Therefore, patterns of genetic diversity should be interpreted in species-specific contexts, as suggested by Ramírez-Barahona and Eguiarte (2013) for the Neotropical cloud forests. Despite this apparently limited effect of palaeoclimatic changes, it is important to mention that our ecological niche modelling did reflect higher and stable environmental suitability in the MM region, which we consider a possible area of origin of the species analyzed.

Conclusions

Understanding the effect of historical processes such as tectonics, volcanism, and climate fluctuations on the evolutionary history of the biotas distributed in the northern Neotropics still requires considerable effort. In the present study, two Quercus species exemplify the diversity of possible evolutionary histories that can be observed and reconstructed in a common region. In the case of Q. insignis, the phylogeographic structure of the species coincided with two previously proposed barriers that may have determined it, the IT and the ND. The influence of both barriers has been observed in different biological groups such as mammals (Arellano et al. 2005), reptiles (Castoe et al. 2003; Hasbún et al. 2005; Daza et al. 2010), birds (DaCosta and Klicka 2008; Vázquez-Miranda et al. 2009), amphibians (Mulcahy et al. 2006), and plants (Cavender-Bares et al. 2011). All of these studies attributed a role as barriers to the IT and ND based on different geological attributes (such as volcanism, tectonics, and physiography) and/or palaeoclimatic events.

On the other hand, a lack of influence of the IT and ND, combined with significant levels of population differentiation due to the geologic heterogeneity of areas such as the PMJ and the Central American volcanic arc, has also been reported for different organismal lineages such as plants (Novick et al. 2003; Jardón-Barbolla et al. 2011) and mammals (Eizirik et al. 1998, 2001; Ordóñez-Garza et al. 2010). Once again, these studies considered not only different geological features but also palaeoclimatic information to explain the patterns they observed. The findings of the present study suggest a complex evolutionary history of the oak populations in the northern Neotropics. The most striking indirect finding is the possibility of an MM origin of the species analyzed; this is particularly clear for Q. sapotifolia. This conclusion is based on the higher genetic diversity in this region and the evidence of multiple colonizations from MM to both UM and LM, and even recolonizations from LM to MM. A possible Central American origin of this species should be evaluated in future studies, particularly considering that the northern Neotropics are the meeting point between Nearctic and Neotropical biotas under an extremely complex geologic and climatic scenario.