Introduction

The Quaternary period, which started 2.58 Mya with an extensive glaciation, was marked by large eustatic changes and by the alternation of about 35 glacial and interglacial periods since 1.8 Mya. Many species migrated to refuges, especially towards the cold maximum of the Late Pleistocene, a progression punctuated by brief warm interglacial periods (Ogg et al. 2016). These geological and climatic changes left their imprint on the colonization history and current geographical distribution of freshwater fish species (Wilkens 1982).

The karstic nature of the Yucatan Peninsula (YP) has impeded the formation and persistence of surface rivers, but it has favored the formation of cenotes (sinkholes). The most isolated inland cenotes have been colonized by only a few fish species (Wilkens 1982).

Two alternative mechanisms, vicariance and dispersal, have been proposed to explain the current and past distribution of species (Matamoros et al. 2014). Dispersal is likely to be the principal mechanism by which cichlids colonized YP from South America, because the northern part of YP emerged only around 1.8 Mya, whereas the southern half was already available for colonization 5.5 Mya (Fig. 1; Bautista et al. 2011). Three waves of dispersal across the isthmian bridge of Panama during the Quaternary, as part of the Great American Biotic Interchange, have been hypothesized to explain the contemporary distribution of primary fishes in Central America (Martin and Bermingham 1998; Woodburne 2010). Cichlid fishes colonized Middle America from South America during the Oligocene, 29-24 Mya, via the Greater Antilles–Aves Ridge land bridge (GAARlandia) (Říčan et al. 2013).

Fig. 1

Tissue-sampled localities of M. urophthalmus for the phylogeographic study, together with geologic time, the Ticul hill range, and the ancient coastlines of the Yucatan Peninsula. The coastlines are based on Urrutia-Fucugauchi et al. (2008) and the geological ages of the YP on Bautista et al. (2011). The map is coupled with the topology of haplotype relationships inferred with NETWORK and statistical parsimony from two mitochondrial protein-coding gene fragments (CYTB and COI, 1677 bp), showing the northern and southern components. The haplotype network is adapted from Barrientos-Villalobos et al. (2018). Circles represent the frequency of each haplotype and color denotes locality

The Mayan Cichlid (Mayaheros urophthalmus Günther) is a Neotropical fish that is common and abundant in brackish and freshwater environments, but also tolerant of marine salinity (Miller et al. 2009), which facilitates its dispersal along the coast. It ranges from southern Veracruz, Mexico, to Nicaragua, inhabiting rivers, lakes, ponds, and karstic sinkholes (cenotes) in the YP, including coastal lagoons on the islands of Contoy and Mujeres. It has been introduced to southern Florida (Harrison et al. 2014) and southeastern Asia (Nico et al. 2007). The species is often consumed locally and even cultured as an alternative to tilapia (Martínez-Palacios and Ross 1988). Although its commercial importance is modest, as a predator (an omnivore with carnivorous tendency: Miller et al. 2009) it is ecologically very relevant in most water bodies of the region, especially in the faunistically impoverished inland cenotes of the YP (Schmitter-Soto et al. 2002).

The species displays such a variety of body shape and coloration along its distribution range in YP, that Hubbs (1936) was able to describe eight subspecies. Barrientos-Medina (2005) provided partial diagnoses for them based solely on morphometrics and proposed raising these taxa to species level; however, the validity of these has been questioned (Barrientos-Villalobos et al. 2018), the phenotypic variability being ascribed to the type of aquatic environment in which the fish live, i.e., river, cenote or lagoon; this conclusion was based on a molecular analysis of two concatenated fragments of mitochondrial DNA (mtDNA), cytochrome b (CYTB) and cytochrome oxidase subunit I (COI).

Although Barrientos-Villalobos et al. (2018) concluded that “the patterns of the body shape of M. urophthalmus are more consistent with ecophenotypic variation than with genetic differentiation due to geographic isolation by distance”, they also found low genetic divergence among populations and elevated gene flow, which has diluted any speciation process. In another analysis, based on CYTB as well as on nuclear microsatellites, Harrison et al. (2014) found genetic subdivision in YP and explained it through “pre-Columbian human transportation,” because their results showed that northern YP was more similar to the Guatemalan Petén than to southern YP. Part of the same dataset was interpreted earlier by Razo-Mendívil et al. (2013) as having “two groups with low divergence and with no correspondence with geographical distribution,” concluding, nevertheless, that “historical long-distance dispersal and drought periods during the Holocene, with significant population size reductions and fragmentations, are factors that could have shaped the genetic structure of the Mayan cichlid.”

Following the geological reconstruction of the past coastlines of YP by Urrutia-Fucugauchi et al. (2008; Fig. 1) and the geological timing inferred for the YP by Bautista et al. (2011), we reanalyze here the data from Barrientos-Villalobos et al. (2018) to explore, by means of a phylogeographic analysis, a scenario for the colonization of the YP by the Mayan Cichlid. Our hypothesis (argued previously by Schmitter-Soto 2002, among others) was that the genetic structure of the Mayan Cichlid would reflect a north-south pattern explainable by Pleistocene marine transgressions and regressions, which alternately allowed and hindered the colonization of freshwaters in YP.

Methods

To test our hypothesis, we based the phylogeographic analyses on the genetic descriptors, AMOVA, and Bayesian phylogeny described by Barrientos-Villalobos et al. (2018), who used partial sequences of two mitochondrial protein-coding genes, CYTB and COI. Because CYTB contains both fast-evolving and more conserved regions, it is used at very different levels, from populations to deep phylogeny (Farias et al. 2001), whereas COI is much more conserved and has therefore been employed in DNA barcoding (Ward et al. 2009). Both genes are widely used in phylogeographic studies to address how species reached their present genetic and geographic configuration (Avise 2000; Lovejoy and de Araújo 2000).

Mayan cichlids were sampled at 15 sites within their distribution range (Fig. 1, Appendix 1) in the Mexican and Belizean parts of the YP. Sample size varied from nine individuals at Cenote Popolvuh and Ría Celestún to just one individual at Silvituc Lake, where the species is scarce, probably because of the introduction of the alien tilapia Oreochromis niloticus (Linnaeus). Dorsal fins were clipped from fish caught with cast nets and double-cone traps, and the clips were stored in 90% ethanol for transport to the laboratory. A subsample of the captured fishes was vouchered at the ichthyological collection of El Colegio de la Frontera Sur, Chetumal, Mexico (ECO-CH) (Appendix 1).

Molecular data

For total genomic DNA extraction, a small piece of dorsal-fin tissue (approximately 2 mm³) was ground and processed with the DNeasy Blood & Tissue Kit (QIAGEN, Cat. No. 69504) following the manufacturer's protocol. The CYTB mtDNA oligonucleotides used for amplification were described by Irwin et al. (1991); Farias et al. (2001) and Chakrabarty (2006) described the primers for COI.

DNA amplification was conducted in Peltier-effect thermocyclers (MultiGene OptiMax) under the following parameters: one initial cycle at 95 °C for 120 s, followed by 35 cycles of 95 °C for 20 s, 50 °C for 20 s, 72 °C for 60 s, with one final cycle at 72 °C for 240 s. All PCR reactions were conducted along with positive and negative controls.
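As a quick check on the thermocycling program above, the programmed hold times can be summed. The sketch below is illustrative only: the step labels (denaturation, annealing, extension) are the conventional names for PCR steps at these temperatures and are an assumption, since the text gives only temperatures and times. Ramping time between temperatures is ignored.

```python
# Thermocycling program as reported in the text: (label, temperature in °C, seconds).
# The step labels are conventional PCR names and are an assumption, not from the text.
INITIAL_STEP = ("initial denaturation", 95, 120)
CYCLE_STEPS = [
    ("denaturation", 95, 20),
    ("annealing", 50, 20),
    ("extension", 72, 60),
]
N_CYCLES = 35
FINAL_STEP = ("final extension", 72, 240)


def total_hold_seconds():
    """Sum of programmed hold times, ignoring ramping between temperatures."""
    per_cycle = sum(seconds for _, _, seconds in CYCLE_STEPS)
    return INITIAL_STEP[2] + N_CYCLES * per_cycle + FINAL_STEP[2]


if __name__ == "__main__":
    seconds = total_hold_seconds()
    print(f"{seconds} s of programmed holds (about {seconds / 60:.1f} min)")
```

Under these settings the holds add up to 3860 s, a little over an hour per run before ramping is taken into account.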

PCR products were visualized on 2% agarose gels stained with ethidium bromide, and successful amplifications were purified using the Wizard® SV Gel and PCR Clean-Up System (Promega). Purified PCR products were sent to Macrogen (South Korea) for sequencing in both directions. Sequence files were analyzed with BioEdit Sequence Alignment Editor v7.0.9 (Hall 1999). All sequences were deposited in GenBank under accession numbers MF741939 to MF741974 and MF776666 to MF776701.

Molecular analyses

The best-fit model of molecular evolution for our concatenated sequence data (CYTB and COI) was determined under the Akaike information criterion (AIC) with jModelTest v2.1.10 (Guindon and Gascuel 2003; Darriba et al. 2012). A Bayesian analysis was run in MrBayes 3.2 (Ronquist and Huelsenbeck 2003; Ronquist et al. 2012) to infer the evolutionary relationships among haplotypes, with the following settings: Lset Nst = 6, revmatpr = (0.3049, 7.4737, 1.0000, 0.3049, 2.5344, 1), statefreqpr = (0.2395, 0.3215, 0.1525, 0.2866), and pinvarpr = (0.9620), as suggested by jModelTest. The search ran for ten million generations with four parallel chains, sampling every 1000th generation and discarding the first 25% as burn-in; posterior probabilities for each node were then estimated from a majority-rule consensus tree. The phylogeny was rooted under the outgroup criterion with Petenia splendida (Günther) and Darienheros calobrensis (Meek & Hildebrand) (GenBank accession numbers AF370679, EU751899, AY843381, GU817255). In addition, the haplotypes of M. urophthalmus were not constrained to be monophyletic.
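To make the substitution-model settings above more concrete, the following sketch assembles a time-reversible instantaneous rate matrix from the reported exchange rates and base frequencies. It is a minimal illustration, not the MrBayes implementation. It assumes the six rates are listed in the order AC, AG, AT, CG, CT, GT and the four frequencies in the order A, C, G, T; function and variable names are hypothetical. The reported frequencies sum to 1.0001, so they are renormalized here.

```python
BASES = "ACGT"

# Exchange rates from the text, ASSUMED to be ordered AC, AG, AT, CG, CT, GT.
EXCHANGE = {
    ("A", "C"): 0.3049, ("A", "G"): 7.4737, ("A", "T"): 1.0,
    ("C", "G"): 0.3049, ("C", "T"): 2.5344, ("G", "T"): 1.0,
}
# Base frequencies from the text, ASSUMED to be ordered A, C, G, T.
RAW_FREQS = (0.2395, 0.3215, 0.1525, 0.2866)
# Proportion of invariable sites; it applies across sites and is not part of Q.
P_INVARIANT = 0.9620


def exchange_rate(i, j):
    """Symmetric exchangeability between two different bases."""
    return EXCHANGE[(i, j)] if (i, j) in EXCHANGE else EXCHANGE[(j, i)]


def build_rate_matrix():
    """Return (pi, Q), with Q scaled to one expected substitution per unit time."""
    total = sum(RAW_FREQS)  # reported values sum to 1.0001, so renormalize
    pi = {b: f / total for b, f in zip(BASES, RAW_FREQS)}
    q = {i: {j: exchange_rate(i, j) * pi[j] for j in BASES if j != i}
         for i in BASES}
    for i in BASES:
        q[i][i] = -sum(q[i].values())  # each row must sum to zero
    mean_rate = -sum(pi[i] * q[i][i] for i in BASES)
    for i in BASES:
        for j in BASES:
            q[i][j] /= mean_rate
    return pi, q


if __name__ == "__main__":
    pi, q = build_rate_matrix()
    for i in BASES:
        print(i, "  ".join(f"{q[i][j]: .4f}" for j in BASES))
    print("proportion of invariable sites:", P_INVARIANT)
```

Under the assumed ordering, the AC and CG rates are equal and the AT and GT rates are equal, which is the constraint pattern of the TIM3 model reported in the Results.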

A haplotype network based on statistical parsimony was inferred (Templeton et al. 1992). This unrooted network was obtained with TCS v1.21 under a 95% connection probability limit, treating gaps as missing data (Clement et al. 2000). Loops were resolved according to the coalescent-theory criteria proposed by Pfenninger et al. (2002). Alternatively, a median-joining (MJ) network based on genetic distances was calculated with an epsilon value of 10 in NETWORK v5.0.0 (Bandelt et al. 1999); the maximum parsimony (MP) criterion was then applied as a post-processing step to purge superfluous links and median vectors from the network (Polzin and Daneshmand 2003).
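Both network methods start from the same raw material: identical sequences are collapsed into haplotypes, and haplotypes are then compared by the number of sites at which they differ. The sketch below illustrates only that preliminary step, on invented toy sequences; it does not reproduce the TCS or NETWORK algorithms, and all names and data in it are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def collapse_haplotypes(sequences):
    """Group individuals that share an identical aligned sequence."""
    groups = defaultdict(list)
    for name, seq in sequences.items():
        groups[seq.upper()].append(name)
    return dict(groups)


def pairwise_differences(a, b):
    """Number of aligned sites at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def singleton_count(haplotypes):
    """Number of haplotypes carried by a single individual."""
    return sum(1 for members in haplotypes.values() if len(members) == 1)


if __name__ == "__main__":
    # Toy alignment, invented for illustration (not data from this study).
    toy = {
        "ind1": "ACGTACGTAC",
        "ind2": "ACGTACGTAC",
        "ind3": "ACGTACGTAT",
        "ind4": "ACATACGTAT",
    }
    haps = collapse_haplotypes(toy)
    print(len(haps), "haplotypes,", singleton_count(haps), "singletons")
    for h1, h2 in combinations(haps, 2):
        print(haps[h1], haps[h2], pairwise_differences(h1, h2))
```

A network with many singletons, each one or two steps from a common haplotype, is the star-like pattern discussed later in the text.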

To detect probable genetic barriers in geographic space, a spatial analysis of molecular variance (SAMOVA) using the Monmonier algorithm was run in SAMOVA v2.0 (Monmonier 1973; Stenico et al. 1998; Simoni et al. 1999). This analysis defines geographically and genetically homogeneous groups of populations by maximizing the proportion of total genetic variance due to differences between groups of populations, based on FST genetic distances. A simulated-annealing meta-heuristic search was used to find the globally optimal configuration, with no groups defined a priori. Transitions, transversions, and deletions were weighted equally (weight = 1), letting the software identify the best genetic structure without geographic constraint, with 10,000 steps in each process. Finally, an AMOVA was recalculated on the best genetic-geographic configuration identified by SAMOVA, based on pairwise FST differences with 1000 permutations.

Finally, the putative barriers were dated by secondary calibration under a Bayesian uncorrelated relaxed lognormal clock, which does not require rates to be heritable and therefore allows lineage-specific rate heterogeneity (Thorne et al. 1998; Aris-Brosou and Yang 2002). Two independent Markov chain Monte Carlo (MCMC) analyses of 40 million generations each, sampling every 1000th generation, were run in BEAST-BEAUti v1.8.2 (Drummond et al. 2012). Tracer v1.6 (Rambaut et al. 2014) was then used to inspect the run statistics and the convergence of model parameter values, such as the effective sample size (ESS), and the estimated posterior distributions of node heights.

For secondary dating, we used two calibration points based on estimates from Říčan et al. (2013), applied as priors with normal distributions as recommended by Ho and Phillips (2009): (1) the divergence time between D. calobrensis and P. splendida (mean 12.9 Mya, s.d. 1.5), and (2) the divergence time between M. urophthalmus and P. splendida (mean 6.6 Mya, s.d. 1.5). The tree and log files from the runs were combined with LogCombiner v1.8.2 (Rambaut et al. 2014). A maximum-credibility tree representing the maximum a posteriori topology, with mean node heights and 95% highest posterior density (HPD) intervals of the age estimates, was obtained from the combined output, after removing burn-in trees (10%), with TreeAnnotator v1.8.2; the final topology was visualized and edited in FigTree v1.4.3 (Rambaut 2016).
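The two calibration priors and the MCMC settings can be translated into numbers that are easier to picture. The sketch below computes the central 95% interval implied by each normal prior and counts the posterior samples retained after burn-in. It is a back-of-the-envelope aid rather than part of the BEAST workflow; the helper names are hypothetical, and the sample count ignores the initial state of each chain.

```python
from statistics import NormalDist

# Normal calibration priors as given in the text (mean and s.d. in Mya).
CALIBRATIONS = {
    "D. calobrensis / P. splendida": NormalDist(mu=12.9, sigma=1.5),
    "M. urophthalmus / P. splendida": NormalDist(mu=6.6, sigma=1.5),
}


def central_interval(dist, mass=0.95):
    """Lower and upper bounds of the central interval holding `mass` probability."""
    tail = (1.0 - mass) / 2.0
    return dist.inv_cdf(tail), dist.inv_cdf(1.0 - tail)


def retained_samples(generations, sample_every, burnin_fraction, runs=1):
    """Samples kept after burn-in, summed over independent runs."""
    per_run = generations // sample_every
    return runs * (per_run - round(per_run * burnin_fraction))


if __name__ == "__main__":
    for label, prior in CALIBRATIONS.items():
        low, high = central_interval(prior)
        print(f"{label}: {low:.2f} to {high:.2f} Mya")
    print(retained_samples(40_000_000, 1000, 0.10, runs=2), "samples retained")
```

With a standard deviation of 1.5, each prior spans roughly 2.9 million years on either side of its mean, so the calibrations constrain the node ages only loosely.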

Results

The concatenated matrix included 1677 aligned positions from 81 individuals from 15 natural populations of M. urophthalmus (Fig. 1, Appendix 1); 1053 bp corresponded to CYTB and 618 bp to COI. A subsequent Basic Local Alignment Search Tool (BLAST) search corroborated that the fragments of both genes employed in this study correspond to the species. The best-fit model selected under the AIC and used as prior for the analyses was TIM3 + I (p-inv = 0.9620).

The number of polymorphic sites observed in the concatenated mitochondrial sequences was 31, of which 24 were transitions and 7 were transversions; all mutations were substitutions, and no indels were observed. A total of 36 haplotypes (H) were recovered for the 15 populations, of which 28 were singletons. H12 was the most widespread haplotype and H6 the most common (Fig. 1). As commonly observed in fishes, the nucleotide composition of the CYTB and COI fragments showed an anti-guanine bias: cytosine, 33.6% and 30.12%; thymine, 28.3% and 29.33%; adenine, 24.3% and 22.4%; and guanine, 13.7% and 18%, respectively.
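The descriptors in this paragraph rest on two simple operations: classifying each substitution as a transition or a transversion, and tallying base composition. The following sketch shows both on a toy sequence; the function names and the example sequence are hypothetical and are not taken from the study's pipeline.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(a, b):
    """Transition = purine<->purine or pyrimidine<->pyrimidine; otherwise transversion."""
    a, b = a.upper(), b.upper()
    if a == b or not {a, b} <= PURINES | PYRIMIDINES:
        raise ValueError("expected two different bases from A, C, G, T")
    if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
        return "transition"
    return "transversion"


def base_composition(seq):
    """Percentage of each base among the unambiguous A, C, G, T positions."""
    seq = [base for base in seq.upper() if base in "ACGT"]
    return {base: 100.0 * seq.count(base) / len(seq) for base in "ACGT"}


if __name__ == "__main__":
    print(classify_substitution("C", "T"))   # pyrimidine to pyrimidine
    print(classify_substitution("A", "C"))   # purine to pyrimidine
    print(base_composition("CCCTTTAAGC"))    # toy sequence, not study data
    # Counts reported in the text: 24 transitions and 7 transversions.
    print(f"transition/transversion ratio: {24 / 7:.2f}")
```

The reported counts give a transition to transversion ratio of about 3.4, in line with the transition bias typical of mitochondrial protein-coding genes.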

An excess of rare mutations, together with large negative Fs values in the neutrality tests, suggests population expansion in the populations of Cobá (Fs = −1.937, p = 0.011), Río Palizada (Fs = −2.517, p = 0.017), and Sabancuy (Fs = −1.812, p = 0.02). According to the AMOVA, the largest proportion of molecular variance occurred within populations (68.11%), whereas the fixation index was intermediate to low (FST = 0.318). However, the absolute number of migrants per generation (m) was sufficient to dilute any speciation process.
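The link between the AMOVA percentages, FST, and the number of migrants can be written out explicitly. In a non-hierarchical AMOVA, FST is the share of variance found among populations, that is, one minus the within-population share. The migrant estimate below assumes Wright's island model at equilibrium for a haploid, maternally inherited marker, Nm = (1 − FST) / (2 FST). Whether the original analysis used exactly this estimator is an assumption, and the function names are hypothetical.

```python
def fst_from_within_share(within_percent):
    """FST in a non-hierarchical AMOVA: share of variance among populations."""
    if not 0.0 <= within_percent <= 100.0:
        raise ValueError("percentage must lie between 0 and 100")
    return 1.0 - within_percent / 100.0


def migrants_per_generation_haploid(fst):
    """Island-model estimate for a haploid marker (an ASSUMED estimator)."""
    if not 0.0 < fst < 1.0:
        raise ValueError("FST must lie strictly between 0 and 1")
    return (1.0 - fst) / (2.0 * fst)


if __name__ == "__main__":
    fst = fst_from_within_share(68.11)  # within-population share from the text
    print(f"FST implied by the AMOVA: {fst:.4f}")
    print(f"Nm at the reported FST = 0.318: {migrants_per_generation_haploid(0.318):.2f}")
```

The within-population share of 68.11% implies FST of about 0.319, which agrees with the reported value of 0.318; under the assumed estimator this corresponds to roughly one migrant per generation overall.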

Phylogeographic interrelationships

The Bayesian tree supports the monophyly of the haplotypes of M. urophthalmus with high posterior probability, whereas the internal nodes had low support values and the topology was poorly resolved. However, a recurrent ordering of populations was observed on the map, corresponding to a north-south pattern reflected in both the network and the Bayesian phylogeny (Figs. 1, 2).

Fig. 2

The Bayesian 50% majority-rule tree of M. urophthalmus core based on two mitochondrial protein-coding gene fragments (CYTB and COI, 1677 bp). Bayesian posterior probabilities (PPB) are shown at the base of internodes. The figure is adapted from Barrientos-Villalobos et al. (2018). The principal nodes were collapsed for clearer visualization. N corresponds to the number of individuals that share the same haplotype

The Bayesian topology was also highly consistent with the haplotype networks inferred by statistical parsimony in TCS (data not shown) and by median joining. These two networks were identical, with a high proportion of singletons, which means that most coalescent events occurred near the root and few occurred later (Fig. 1). A north-south distribution pattern was revealed in the haplotype network reconstructions and in the phylogeny, although the Mantel test between linearized FST values and geographic distances was not significant, ruling out an effect of isolation by distance among populations (r² = −0.0028, p = 0.523).
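For readers unfamiliar with the Mantel test, the sketch below shows the general idea: correlate the off-diagonal entries of a genetic and a geographic distance matrix, then judge significance by permuting the population labels of one matrix. It is a generic, one-tailed permutation version written for illustration, not the routine used in the original analysis. The linearization FST / (1 − FST) follows Slatkin's usual definition, and the toy matrices and all names are invented.

```python
import random
from math import sqrt


def linearize(fst):
    """Slatkin's linearized FST."""
    return fst / (1.0 - fst)


def upper_triangle(matrix, order=None):
    """Off-diagonal upper-triangle entries, optionally after relabeling populations."""
    n = len(matrix)
    idx = list(range(n)) if order is None else order
    return [matrix[idx[i]][idx[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mantel(genetic, geographic, permutations=999, seed=0):
    """Return (r, one-tailed p) from a label-permutation Mantel test."""
    rng = random.Random(seed)
    geo = upper_triangle(geographic)
    r_obs = pearson(upper_triangle(genetic), geo)
    hits = 0
    for _ in range(permutations):
        order = list(range(len(genetic)))
        rng.shuffle(order)
        if pearson(upper_triangle(genetic, order), geo) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)


if __name__ == "__main__":
    # Toy pairwise FST and distances (km) for four populations, invented.
    fst = [[0, .10, .30, .20], [.10, 0, .25, .40], [.30, .25, 0, .15], [.20, .40, .15, 0]]
    km = [[0, 50, 120, 200], [50, 0, 80, 160], [120, 80, 0, 90], [200, 160, 90, 0]]
    lin = [[linearize(v) for v in row] for row in fst]
    print(mantel(lin, km))
```

A correlation near zero with a large p-value, as reported above, means that genetic differentiation does not increase with geographic distance.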

The spatial analyses with no groups defined a priori suggested the existence of two population groups (K = 2) separated by a putative geographic barrier, a configuration for which SAMOVA gave the highest FCT value (0.43, p = 0.0001). This pattern represents the best genetic-geographic configuration and agrees with the Bayesian and network analyses. The groups correspond to northern (1, 2, 3, 7, 8, 10, and 13) vs. southern (4, 5, 6, 9, 11, 12, 14, and 15) populations (Figs. 1, 2).

The a posteriori hierarchical AMOVA on the new configuration determined by SAMOVA revealed that 43.8% of the variance was due to differences between these two groups (the northern versus the southern populations), 24.7% was due to differences among populations within groups, and 31.4% was due to variance within populations. However, when K is increased a priori to 3, 4, and 5 groups, these two groups break up and the new groups of subpopulations identified by SAMOVA do not clearly correspond to any geographic barrier, while the fixation indices FSC and FST decline and FCT values increase. Thus, the recognition by SAMOVA of two groups (the northern versus the southern populations) and of a barrier that separates them represents the statistically most robust configuration.
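The three variance percentages of the hierarchical AMOVA determine the standard fixation indices directly, which makes it easy to check them against the FCT value reported by SAMOVA. The sketch below uses the usual AMOVA definitions (FCT: among groups over total; FSC: among populations within groups over the sum of that component and the within-population component; FST: one minus the within-population share). Function and variable names are hypothetical, and the reported percentages sum to 99.9 because of rounding.

```python
def hierarchical_f_statistics(among_groups, among_pops_within_groups, within_pops):
    """Fixation indices from the three AMOVA variance components (any common unit)."""
    total = among_groups + among_pops_within_groups + within_pops
    fct = among_groups / total
    fsc = among_pops_within_groups / (among_pops_within_groups + within_pops)
    fst = (among_groups + among_pops_within_groups) / total
    return {"FCT": fct, "FSC": fsc, "FST": fst}


if __name__ == "__main__":
    # Percentages of variance reported in the text for K = 2.
    stats = hierarchical_f_statistics(43.8, 24.7, 31.4)
    for name, value in stats.items():
        print(f"{name} = {value:.3f}")
    # Consistency check: (1 - FST) = (1 - FSC) * (1 - FCT)
    print(abs((1 - stats["FST"]) - (1 - stats["FSC"]) * (1 - stats["FCT"])) < 1e-12)
```

The resulting FCT of about 0.44 matches, within rounding, the value of 0.43 reported for the two-group configuration.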

Dating the incipient north-south pattern

We attempted to date this incipient but highly consistent pattern, which was recognized by both the Bayesian and network analyses and by the putative barrier that SAMOVA placed between two groups of populations (north-south distribution pattern).

A coalescent tree prior with population growth was used because the general pattern suggested by Fu’s test for M. urophthalmus was demographic expansion (Barrientos-Villalobos et al. 2018). The other priors were: a general time reversible (GTR) model, Lset Nst = 5, revmatpr = fixed (0.3049, 7.4737, 0.3049, 2.5344, 10,000), statefreqpr = fixed (0.2395, 0.3215, 0.1525, 0.2866), and pinvarpr = fixed (0.9620), as suggested by jModelTest. The MCMC analyses showed acceptable mixing and convergence of parameter values.

Our dating places the incipient divergence between the northern and southern components in the Pleistocene (mean 1 Mya; 95% HPD interval 1.7-0.063 Mya), with high support (Bayesian posterior probability 0.9) (Fig. 3). The colonization of northern YP therefore falls within the Quaternary, when the last eustatic changes of the Caribbean Sea occurred in step with glaciations alternating with warm periods. In addition, the emergence of northern YP has been dated to around 1.8 Mya, which leaves dispersal from the south of the peninsula during the last eustatic changes as the only possible route of colonization of the north.

Fig. 3

Relative time of divergence between the northern and southern subclades of Mayaheros urophthalmus based on two mtDNA gene fragments. Blue bars represent the 95% highest posterior density (HPD) intervals of divergence times. Above the branches: Bayesian posterior probabilities / 95% HPD intervals of the divergence time estimates, in million years before present. Below the branches: mean posterior estimates of absolute divergence times. At the bottom of the graph, the blue curve shows the eustatic sea-level changes based on Haq et al. (1987) and the red line represents the geologic changes in temperature based on Condamine et al. (2016); the relative times of epochs are based on the International Chronostratigraphic Chart v2018-08 (www.stratigraphy.org)

Discussion

The gene flow and number of migrants per generation are congruent with the haplotype diversity and female effective population size observed in lentic aquatic systems relative to others for these molecular markers, as reported by Barrientos-Villalobos et al. (2018, Table 1). On the other hand, the high to moderate gene flow among most of the other populations of M. urophthalmus and the genetic differentiation that we observed (FST = 0.31) are similar to the 0.4% (in CYTB) reported by Razo-Mendívil et al. (2013). Furthermore, the observed pattern of the network, with many singletons, may indicate a population that recently began to increase from a few founders, or a bottleneck preceding population expansion (Slatkin and Hudson 1991); this star-like pattern was also found by Razo-Mendívil et al. (2013).

Table 1 Conventional population pairwise FST values from haplotype frequencies (below the diagonal) and the absolute number of migrants per generation (m; above the diagonal); ∞ = infinite. Significant differences (*) at p < 0.05. The table was modified from Barrientos-Villalobos et al. (2018) by adding the absolute number of migrants per generation above the diagonal

A biogeographic scenario for the colonization of the YP

Two alternative mechanisms, vicariance and dispersal, have been proposed to explain biological distributions. Vicariance takes place when a barrier emerges that divides a species, after which divergence occurs; in dispersal, a subgroup of individuals of a species colonizes new areas, or the distribution range simply expands (Brown and Lomolino 1988). Dispersal has marked the major biogeographic events in the history of the Central American cichlid fishes, since these migrated from South America during the Oligocene (between 29 and 24 Mya) through the GAARlandia land bridge (Říčan et al. 2013). As few as two to four independent dispersal events to Middle America occurred between 24 and 16 Mya from Central America by secondary and halotolerant fishes, such as poeciliids, rivulids, and cichlids (Concheiro Pérez et al. 2007). They colonized new areas along the coasts relatively easily: M. urophthalmus and Gambusia yucatana (Regan) are the most euryoecious fishes present in northern and inland YP, and both are derived from the ichthyofauna of coastal lagoons (Wilkens 1982). This hypothesis is also supported by the geologic history of YP: the peninsula emerged gradually from at least the Oligocene onwards, with the youngest part, in the north, emerging around 1.8 Mya (Fig. 1; Lugo-Hubp et al. 1992; Vázquez-Domínguez and Arita 2010).

The divergence between M. urophthalmus and P. splendida has been estimated at around 6.6 Mya by Říčan et al. (2013); our dating of the incipient north-south colonization event of M. urophthalmus (1 Mya, 95% HPD interval 1.7-0.063 Mya) corresponds to the Quaternary and is in accord with the geologic history of northern YP. Populations of M. urophthalmus could have emerged relatively recently, from the late Miocene to the Quaternary (3.6 Mya-1.8 kya, Figs. 1, 3; Gondwe et al. 2010). The Quaternary started with a continental glaciation ∼2.58 Mya, although the major glacial/interglacial oscillations took place around 1.4-0.4 Mya, with climatic cycles of ∼41 kyr that lengthened to ∼100 kyr in the Middle to Late Pleistocene (Head and Gibbard 2015). These cycles induced sea-level fluctuations between 50 m above and 125 m below current sea level (Haq et al. 1987; Rohling et al. 2014), so the shorelines also changed (Urrutia-Fucugauchi et al. 2008), especially in the shallow north of YP (see old coastlines, Fig. 1).

It has long been recognized that the Pliocene-Pleistocene interval was characterized by an alternation of glacial and interglacial episodes, with several droughts during the most intense glacial periods, enhanced primary production, lowered sea level, and more vigorous atmospheric circulation (Delmonte et al. 2007). Advances and retreats of glaciers provoked contrasting climatic changes, which helped to shape the distribution of species (Hulsey and López-Fernández 2011). According to oxygen isotope records, about 35 major glaciations have occurred since 1.8 Mya (Montaggioni and Braithwaite 2009). These changes in sea level, together with the karstic nature of the terrain, are responsible for the lack of rivers and the abundance of isolated sinkholes in northern YP (Humphries and Miller 1981; Hulsey and López-Fernández 2011); the open sinkholes were colonized during the Pliocene-Pleistocene, probably in association with marine regressions (Wilkens 1982; Wilkens and Strecker 2017).

The “sierrita” (“small hill range”) of Ticul has been considered a limit between northern and southern YP; this limit corresponds to the division into northern and southern groups recognized by SAMOVA. Northern YP is characterized by a low diversity of freshwater fishes, which increases closer to the coast (Wilkens 1982). Furthermore, fish associations in the south can be explained by ecological factors, such as salinity, temperature, and distance from the coastline (Schmitter-Soto and Gamboa-Pérez 1996), whereas in the north associations can be due to both historical and ecological factors (Schmitter-Soto 1999).

Except for some endemics, the vertebrate fauna of northern YP is mostly a subset of the fauna at the base of YP, including Petén in Guatemala, which suggests that the northern faunas originated simply by dispersal from the south (Arita 1997; Vázquez-Domínguez and Arita 2010). This dispersal presumably occurred in post-Pliocene times, because marine transgressions submerged the north of YP several times (Morrone 2005).

This north-south difference and the decrease in species diversity towards the northern tip of YP, a “peninsula effect” (Simpson 1964), have been invoked to establish two biogeographical (sub)provinces, the Peten province in the south and the Yucatan province in the north (Udvardy 1975; Espadas Manrique et al. 2003).

To complement this biogeographic scenario based on M. urophthalmus, additional phylogeographic analyses and divergence dating of other species will be needed to make the already suggestive patterns in the historical biogeography of the YP and the Caribbean Sea more conclusive.

Conclusions

It is clear to us that dispersal was the only possible mechanism to explain the current distribution of M. urophthalmus. This dispersal could have occurred from the south through movements along the old coastlines. The phylogeographic analyses, the SAMOVA, and the geologic history of YP suggest so; in addition, the dating of the dispersal of some individuals from southern to northern YP makes sense geologically.

On the other hand, the establishment of populations of M. urophthalmus in northern YP probably occurred in post-Pliocene times, because marine transgressions submerged the north of YP several times, preventing earlier colonization.

All of the above agrees with the biogeographic history observed in other vertebrate groups of YP.