Introduction

The Caribbean coast of the Yucatan Peninsula, Mexico (Fig. 1), is known to contain an important karst system, which includes several of the longest underwater caves in the world (Gulden and Coke 2011). It has been explored and mapped by cave divers for decades (Fig. 2a). The aquifer consists of a freshwater lens of thicknesses between <10 and 100 m on the top of a saline-water intrusion, which reaches several tens of kilometers inland (Bauer-Gottwein et al. 2011). The relief is low and surface runoff is absent from the peninsula (Beddows 2004). The karst system appears to be very well developed: the deepest explored cave lies at 119 m depth (Smart et al. 2006) and the largest conduit diameters reach some 70 m.

Fig. 1

a Location of the study area (modified after Google Earth 2011). b Geographical situation of the Yucatan Peninsula. Modified after NASA (2010)

Fig. 2

Location of the a modeled area and b available data

Because the freshwater lens is a limited resource and demand is growing, water supply in the Yucatan Peninsula is becoming problematic (Bauer-Gottwein et al. 2011), especially in the area of Tulum (Quintana Roo State) where environmental concerns are growing because of the planned urban development that includes the building of a hotel complex (Supper et al. 2009), an airport and a highway (SCT 2010). Moreover, it is common practice in the Yucatan Peninsula to reinject wastewater into the aquifer without previous treatment (Marin et al. 2000). This pollution risk is a major threat for the nearby Sian Ka’an Biosphere Reserve and the large coral reefs located near the shore (Fig. 1b), which are host to ecosystems that are highly dependent on the karst aquifer (Gondwe 2010). To assess the vulnerability of the aquifer, a finite-element flow model has been built (Fig. 2a). As the network geometry is a major constraint on flow paths and velocities in karstic aquifers (Worthington and Smart 2003), it has to be incorporated in the model, which is done by using a finite-element method accounting for both the matrix and the conduits. The conduits are included as one-dimensional (1D) pipes in which flow is modeled by the Manning-Strickler formula, which allows turbulent flow to be represented. Attempts to calibrate this model are made using high-resolution global positioning system (GPS) water-level measurements.

However, the main focus of this paper is not on the modeling of this specific site, but rather on the extension of a new pseudo-genetic method to model the geometry of the karstic system. Indeed, even if the karst conduits have been extensively mapped in this region of the world, they are not known exhaustively. The same situation occurs in most karstic systems in the world, which are only partially explored. For modeling groundwater flow and transport, it is therefore necessary to construct a realistic model of the unexplored parts, which is why, in recent years, several techniques have been developed to build stochastic karstic network models (Collon-Drouaillet et al. 2012; Fournillon et al. 2010; Henrion et al. 2008; Jaquet et al. 2004; Pardo-Igúzquiza et al. 2012). Here, the method of Borghi et al. (2012) is extended to account for geophysical data when available. The application of this new method is illustrated using airborne electromagnetic measurements collected by the Geological Survey of Austria (Supper et al. 2009), revealing the presence of underwater karstic conduits thanks to the strong electrical conductivity contrast between water-filled caves and the limestone matrix. Several equiprobable karstic network geometries are constructed with this method. It is then proposed to consider that the radius of the conduits follows a power law relating the radius with the order of the conduit. This assumption is based on an analogy with Horton’s law for rivers and provides a simple model relating the flow properties with the geometry. The parameters involved in this formulation are the targets for the calibration of the flow model.

Overall, the aim of this paper is to present this new methodology and to test whether it is applicable to a real data set. The resulting model is considered as a preliminary step allowing a better understanding of how such systems could be modeled in the future.

Description of the study site

The Yucatan Peninsula aquifer

The Yucatan Peninsula is a 300-km-wide carbonate platform located between the Gulf of Mexico and the Caribbean Sea (Fig. 1). The overall topography is relatively flat with maximum elevations reaching 300 m above sea level (asl). The mean annual temperature is 26 °C and annual precipitation is between 1,000 and 1,400 mm y−1, the major part falling during the wet season from May to October (Héraud-Piña 1996). Surface runoff is absent (Beddows 2004) and, according to Gondwe et al. (2010), in the southeastern peninsula, 17 % of the precipitation recharges the aquifer, while the remainder undergoes evapotranspiration.

The aquifer is density stratified: its upper part consists of freshwater flowing toward the sea, while the lower part consists of warmer saline water (Beddows et al. 2007). General groundwater circulation is organized in a concentric flow from the center of the peninsula towards the coast. The regional hydraulic gradient is relatively low with estimates lying between 1 and 10 cm km−1 for coastal plains (Bauer-Gottwein et al. 2011).

The groundwater hydrodynamics are characteristic of strongly karstified aquifers. According to the review of Bauer-Gottwein et al. (2011), estimates of the hydraulic conductivity in the Yucatan Peninsula aquifer vary widely depending on the scale of interest. Testing of core samples gives values of 10^−6 to 5·10^−2 m s−1, while calibration of flow models (at a scale of hundreds of kilometers) yields effective hydraulic conductivity values in the range of 10^−1 to 10^2 m s−1. Those very high equivalent conductivities are related to the presence of large karst conduits that were not described explicitly in the calibrated model. As for groundwater velocity, Moore et al. (1992) measured values in fractures increasing coastward from 1 to 12 cm s−1, while in the limestone matrix they obtained values of approximately 10^−2 cm s−1. Beddows (2004) provides measurements for two conduits that were monitored for several months. These data reveal an important and rapid effect of sea level variations on the conduit flow velocity. Most of the recorded values are in the range of a few centimeters per second, with a maximum of ∼20 cm s−1 at a coastal site. Cave divers report that the flow is sometimes too strong to swim against, which suggests that velocities can reach several tens of centimeters per second.

The freshwater lens constituting the Yucatan Peninsula aquifer is relatively thin: its maximum thickness is approximately 100 m (Bauer-Gottwein et al. 2011). The depth of the interface between the freshwater lens and the saline water increases inland following a linear trend rather than the Dupuit-Ghyben-Herzberg model, possibly because of the influence of the karstic network (Beddows 2004). In southern Quintana Roo, the freshwater lens discharges to the sea at a rate of 0.27–0.73 m3 s−1 per km of coastline, depending on estimations (Bauer-Gottwein et al. 2011).

Setting of the karst network

The Yucatan Peninsula consists of Mesozoic and Cenozoic sediments on top of a Paleozoic basement. According to seven boreholes drilled in the northern peninsula to depths between 1.5 and 3.5 km, there is no low-solubility geological formation that could constrain the formation of karst at large scale (Ward et al. 1995). The upper hundreds of meters are sub-horizontally oriented limestones and dolomites (SGM 2007) that show a high solubility (Marin et al. 2000). These carbonates are indeed highly karstified, containing one of the most developed cave systems in the world. Karstification may have commenced as early as the late Eocene, when the peninsula emerged (Iturralde-Vinent and MacPhee 1999).

Smart et al. (2006) presented a study of conduit geometry and distribution on the Caribbean coast of the peninsula, as well as assumptions about the processes ruling cave development in this area. According to them, the Yucatan cave system represents an intermediate type between telogenetic and eogenetic karst, following the classification proposed by Vacher and Mylroie (2002). The first type includes typical continental karst, where speleogenesis proceeds under the influence of water flowing from infiltration points toward base level. Cave development is mainly influenced by recharge through secondary porosity, which creates preferential flow paths. In contrast, eogenetic karst development, also known as the flank-margin model, is typical of small carbonate islands. In this case, speleogenesis occurs in diagenetically immature sediments presenting a high primary porosity. Mixing corrosion at the interface between fresh and saline water is the major process ruling carbonate dissolution. The resulting karst system consists of isolated chambers developing along the coast.

Following the observations of Smart et al. (2006), mixing corrosion has a major influence on the development of the Quintana Roo caves. Their study of conduit depths shows a correlation with the halocline position. Many of the conduit cross sections are enlarged at the freshwater/saline-water interface, which is in some locations very sharp (Beddows et al. 2007). Caves located above or below the interface often show speleothems or recrystallization features that suggest that dissolution is not active there anymore. Moreover, the authors were able to link these paleo-karst horizons to previous sea level low or high stands. However, the flank-margin model cannot explain on its own the wide extent of the Yucatan karst system. Caves are organized into large anastomotic systems discharging to the sea, extending up to 8–12 km inland. On the one hand, this morphology suggests the action of an important discharge toward the sea in the process of cave development; and, on the other hand, the lack of preferential orientation in conduit direction suggests their development in a high-porosity matrix rather than in a heterogeneous fractured matrix.

Karst network modeling

Method

The modeling of the karst network geometry is based on the stochastic pseudo-genetic method proposed by Borghi et al. (2012). The method proceeds in four main steps: (1) modeling the three-dimensional (3D) geology of the site (folds, thrusts, faults, etc.) to define the location of the karstifiable formations; (2) modeling the internal heterogeneity of the karstifiable formations (fractures, bedding planes, etc.); (3) mapping or modeling (with a stochastic point process) the locations of inlets such as sinkholes or dolines and outlets such as springs; (4) using a fast marching algorithm (Sethian 1996) to compute the fastest path between the inlets and outlets. This procedure is repeated iteratively to generate a hierarchical network: one simply updates the heterogeneity described in step 2 with the result of step 4 and repeats steps 3 and 4. Other variations and improvements are proposed in Borghi et al. (2012) to, for example, account for the unsaturated zone.

At the heart of the method, the fast marching algorithm allows a very efficient computation of the paths between inlets and outlets. This algorithm requires as input a 3D map of maximum velocities. The velocities represent a characteristic of the medium and are used as a parameter to control the contrast between the different geological heterogeneities and their ability to get karstified. It is thus a representation of potential locations for preferential speleogenesis. For typical continental karst systems, this field is built by means of a geological model with higher velocities assigned to soluble formations, faults and/or fractures. Other input parameters are the inlet and outlet points of the network. They can be positioned either deterministically or stochastically. If the precise location of springs is unknown, it is possible to select a diffuse spring zone. In this case, the algorithm selects the most probable outlet points in the given area. The last main parameter to set is the number of iterations and the number of generated conduits per iteration.
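The path search and the iterative update of the velocity field can be illustrated with a minimal sketch. It is not the fast marching implementation of Borghi et al. (2012): Dijkstra's algorithm on a 4-connected grid is swapped in as a simpler stand-in, and the function name `fastest_path`, the toy velocity values and the reinforcement factor of 10 are assumptions made for illustration only.

```python
import heapq

def fastest_path(velocity, start, targets):
    """Minimum travel-time path from `start` to the first reached cell of `targets`.

    Dijkstra on a 4-connected grid (a stand-in for fast marching). The travel
    time across an edge is the mean slowness (1 / velocity) of its two cells.
    """
    n_rows, n_cols = len(velocity), len(velocity[0])
    targets = set(targets)
    best = {start: 0.0}
    parent = {}
    heap = [(0.0, start)]
    while heap:
        t, cell = heapq.heappop(heap)
        if t > best.get(cell, float("inf")):
            continue  # stale heap entry
        if cell in targets:
            path = [cell]
            while cell in parent:  # walk back to the inlet
                cell = parent[cell]
                path.append(cell)
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < n_rows and 0 <= nc < n_cols:
                step = 0.5 * (1.0 / velocity[r][c] + 1.0 / velocity[nr][nc])
                if t + step < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = t + step
                    parent[(nr, nc)] = cell
                    heapq.heappush(heap, (t + step, (nr, nc)))
    return []

# Toy 5 x 5 field: slow matrix (1.0) with one fast column (10.0) at column 3.
velocity = [[10.0 if c == 3 else 1.0 for c in range(5)] for r in range(5)]
outlets = [(4, c) for c in range(5)]  # diffuse spring zone: whole bottom row
path = fastest_path(velocity, (0, 0), outlets)

# Hierarchical iteration: cells of the new conduit become faster for the next pass
# (the factor 10 is hypothetical).
for r, c in path:
    velocity[r][c] *= 10.0
```

In this toy case the path leaves the inlet, joins the fast column and follows it to the spring zone, which is the behavior expected from a high-velocity heterogeneity. Conduits simulated in a later pass would tend to reuse the reinforced cells, producing the hierarchy described above.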

For the Yucatan coastal system that was described in the previous section, the caves develop mainly toward the sea in a sub-horizontal plane. Thus, there is no need to build a complex 3D geological model; furthermore, there is no evidence for a clear control by small-scale fracturing or bedding. Steps 1 and 2 of Borghi et al. (2012) must, therefore, be adapted to account for the specificity of the site. In addition, two sources of data are available at that site and must be considered. First, the area has been extensively explored by two airborne geophysical campaigns. Supper et al. (2009) and Ottowitz (2009) already mentioned the ability of these electromagnetic measurements to reveal karstic conduits. This source of information needs to be used to control the simulation of the karstic conduits. Here, the proposal is to directly rescale the geophysical anomaly and use it as one component of the input heterogeneity field resulting from step 2 of the algorithm. Second, extensive maps of conduits are available and they should constrain the karst network model; again, that information will be added during step 2 of the algorithm. The following sections present how the pseudo-genetic karst simulator algorithm has been modified and how these new data have been processed to obtain an ensemble of realistic network models.

Processing the geophysical data

Method and data

The airborne geophysical measurements were carried out by the Geological Survey of Austria during two campaigns in 2007 and 2008. They cover an area of approximately 140 km2 around the town of Tulum and are distributed along flight paths oriented N22 with a spacing of 20–100 m (Fig. 2b). They were obtained by means of an active frequency-domain electromagnetic method. In principle, the generated primary field induces eddy currents in the subsurface, which themselves create a secondary magnetic field. The amplitude and phase of the secondary field depend on the electrical resistivity of the subsurface. Thanks to a strong contrast in resistivity between water-filled conduits and the surrounding limestone matrix, anomalous responses are expected where conduits are located. This method is thought to be particularly efficient in the Tulum area because of a very flat topography, a thin soil cover and the relatively shallow position of the main karst features (Supper et al. 2009).

The main part of the measurement system consists of a modified GEOTECH-“Bird” of 5.6 m length and 140 kg weight (Motschka 2001). It is towed on a cable 30 m below the helicopter, and transmitter coils inside the probe generate primary electromagnetic fields of four frequencies (340, 3,200, 7,190 and 28,850 Hz). The resultant secondary field is recorded by the corresponding receiver coils. For every frequency there are two measurement values, the in-phase (no phase shift between primary and secondary field) and the out-phase component (90° phase shift). Results are given in parts per million (ppm), which is the ratio of the secondary field amplitude over the primary field amplitude. The data from the two field campaigns that give the best insight into the karstic conduits are the out-phase (north) and in-phase (south) components of the measured signal for a primary alternating field of the third frequency (7,190 Hz), which were chosen on the basis of a visual analysis. Indeed, a comparison with the map of explored conduits (Fig. 3) shows a potentially good agreement between conduits and anomalous electromagnetic responses: positive anomalies are measured around known conduits. The electromagnetic map also reveals potential unexplored conduits.

Fig. 3

Measured amplitude (after altitude correction) of the field induced by a frequency of 7,190 Hz. In the northern area (2008 survey, see Fig. 2b), the measured out-phase component is shown; in the southern area (2007 survey), the in-phase component is shown. Color scale is the equalized histogram, ranging from 2 to 20 ppm for the northern area and from 14 to 57 ppm for the southern area

The variation of the signal amplitude is induced by several factors. It is influenced by groundwater salinity, which varies across the study area. Thus, a global rise in conductivity toward the coast is observed, which is linked to the shallower depth of the halocline. Other identified sources of noise are a circular lagoon west of Tulum, the town of Tulum itself, and some shift in the values between the flight paths (Fig. 3). In addition, the sharpness of a conduit-induced anomaly depends strongly on the conduit size and depth, and on the resistivity contrast between the water filling the conduit and the surrounding matrix. For these reasons, the available inversion algorithms do not properly map the small anomalies of caves: they are mostly 1D algorithms (assuming a horizontally layered Earth), whereas cave structures are at least 2D (if not 3D). It was therefore decided to use the geophysical signal after some processing to guide the simulation of the karstic network in a statistical manner. The details of these steps are described in the following paragraphs.

Processing the data

The initial data consist of the value of the electromagnetic signal at each measurement point of the 2007 and 2008 field campaigns. The first steps were to apply (1) an altitude correction (method of Huang 2008) and (2) a median filter. The correction reduces the influence of altitude variations of the bird, while the median filter enhances conduit-induced anomalies. Figure 4 shows the histograms of the base 10 logarithm of the filtered signal. The distributions of the signal differ significantly between the two surveys, and both contain a significant proportion of extreme values, so that neither distribution is Gaussian. The extreme values are not correlated with the presence of conduits and tend to mask the information contained in the rest of the data (Fig. 5a). These points were therefore considered as outliers: values lying more than three standard deviations from the mean were removed from the data sets (Fig. 5b). To combine the 2007 and 2008 data sets into a single one, a normal score transform (see e.g. Chilès and Delfiner 1999) was applied separately to the two distributions after eliminating the extreme values. The resulting data set is depicted in Fig. 5c. The next step, interpolation, was carried out using the Isatis software (Geovariances 2010). First, the experimental variogram of the processed variable was computed for twenty distance lags of 20 m each, and a variogram model was fitted to it: an exponential model with a range of 70 m and a sill of 0.95. The variable was then interpolated by kriging on a regular grid of 20 × 20-m cells. The origin of the grid (southern vertex) is positioned at (451,100 m; 2,225,280 m) in the UTM Zone 16 N coordinate system (datum WGS84). Its dimensions are 11.8 × 14.8 km and its orientation is N22 (Fig. 2).
Some residual flight path noise still appears in the interpolated field. To reduce it, a moving average filter with a window of 100 × 20 m was applied, with the longer axis of the ellipsoid oriented perpendicular to the flight paths (i.e. N68). The explored cave data were taken into account for the interpolation. The final map of normalized electromagnetic anomalies is shown in Fig. 5d.
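The outlier removal, the normal score transform and the fitted variogram model can be sketched as follows. This is a simplified illustration and not the Isatis workflow: the function names, the rank-based plotting position and the synthetic input values are assumptions, and the factor 3 in the exponential model assumes that the quoted range of 70 m is a practical range.

```python
import math
from statistics import NormalDist, mean, stdev

def remove_outliers(values, n_sigma=3.0):
    """Keep only values within n_sigma standard deviations of the mean."""
    m, s = mean(values), stdev(values)
    return [v for v in values if abs(v - m) <= n_sigma * s]

def normal_score(values):
    """Map each value to the standard normal quantile of its rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0.0] * n
    for rank, i in enumerate(order):
        # (rank + 0.5) / n is an assumed plotting position; ties keep input order.
        scores[i] = NormalDist().inv_cdf((rank + 0.5) / n)
    return scores

def exponential_variogram(h, sill=0.95, practical_range=70.0):
    """Exponential model; reaches about 95 % of the sill at the practical range."""
    return sill * (1.0 - math.exp(-3.0 * h / practical_range))

# Synthetic log10 amplitudes, for illustration only (not survey data).
survey = [1.0 + 0.1 * i for i in range(20)] + [50.0]
cleaned = remove_outliers(survey)   # the extreme value 50.0 is discarded
scores = normal_score(cleaned)      # one transform per survey, then the sets are merged
```

Applying the transform separately to each survey puts both data sets on the same standard normal scale, which is what allows them to be merged before variogram fitting.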

Fig. 4

Histograms of the base 10 logarithm values of the electromagnetic signal of both surveys after the median filter

Fig. 5

a Logarithm of the magnitude of the out-phase (north) and in-phase (south) component of 7,190 Hz after altitude correction and median filter; b after removal of extreme values; c after normal score transform on both data fields; d after kriging and moving average. Coordinate system is UTM Zone 16 N, WGS84

Completing the electromagnetic map

A final step in processing the geophysical data is to complete the southwestern part of the model domain. Indeed, this zone has not yet been surveyed by geophysical measurements, and maps of the geophysical anomalies should be provided over the complete domain. For that purpose, an ensemble of equiprobable maps of geophysical anomalies has been simulated in the region using a geostatistical technique. To guide the geostatistical simulation in this area, available cave maps have been used (Fig. 2). Furthermore, the anomalous signal caused by a lagoon northwest of Tulum has also been replaced by simulated values.

The simulation of the geophysical anomaly in the parts where the signal is missing has been achieved using the direct sampling multiple-point algorithm (Mariethoz et al. 2010). The reconstruction method is presented in detail in Mariethoz and Renard (2010). The simulation grid is the same as defined in the preceding section. The available cave map allows the realization of a bivariate simulation, variable 1 being the geophysical anomaly and variable 2 the presence of conduits (categorical variable, 1 = conduit is present; 0 = conduit is absent). By doing so, the simulation of the geophysical signal accounts for the presence of known caves. The area of the Ox Bel Ha system was chosen as a training image, for it has been extensively surveyed by cave divers and by the 2007 geophysical campaign. It was assumed that, in this zone, all the caves were known, i.e. the conduits are considered as absent where no information is provided. This is a reasonable assumption at the resolution of the model since the density of the mapped caves is very high in this small area. The conditioning data consist, for variable 1, of the whole geophysical data set (Fig. 6a). Variable 2 was built with the cave map: it is equal to 1 where a known cave is located. The rest of the map was set as being not informed (Fig. 6b). The search radius has been set to 50 m and the distance threshold to 0.01. Twice as much weight has been given to variable 1 as to variable 2 for the distance calculation. One realization is illustrated in Fig. 6c, and the mean simulated signal for 100 equiprobable realizations is shown in Fig. 6d. A positive anomaly is calculated around known conduits, while the remainder of the map shows strong variability, reflecting the uncertainty on the data in this area.
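The role of the variable weights and of the distance threshold can be illustrated with a minimal sketch of a bivariate pattern distance. It is not the direct sampling code of Mariethoz et al. (2010): the function name `pattern_distance`, the normalization of the continuous variable by an assumed value span and the mismatch fraction used for the categorical variable are simplifying assumptions.

```python
def pattern_distance(event, pattern, em_span, w_em=2.0, w_cave=1.0):
    """Weighted distance between a conditioning data event and a training pattern.

    `event` and `pattern` are lists of (em_value, cave_flag) pairs taken at the
    same lags. None in `event` marks a non-informed value, which is skipped.
    """
    em_diffs, cave_diffs = [], []
    for (e_em, e_cave), (p_em, p_cave) in zip(event, pattern):
        if e_em is not None:
            em_diffs.append(abs(e_em - p_em) / em_span)       # continuous variable
        if e_cave is not None:
            cave_diffs.append(0.0 if e_cave == p_cave else 1.0)  # categorical variable
    d_em = sum(em_diffs) / len(em_diffs) if em_diffs else 0.0
    d_cave = sum(cave_diffs) / len(cave_diffs) if cave_diffs else 0.0
    # Variable 1 (anomaly) weighs twice as much as variable 2 (conduit presence).
    return (w_em * d_em + w_cave * d_cave) / (w_em + w_cave)

THRESHOLD = 0.01  # distance threshold quoted in the text

# Toy data event with one non-informed node and one node without cave information.
event = [(0.5, 1), (None, None), (1.0, None)]
matching = [(0.5, 1), (2.0, 0), (1.0, 0)]
different = [(1.5, 0), (2.0, 0), (1.0, 0)]

d_match = pattern_distance(event, matching, em_span=4.0)
d_diff = pattern_distance(event, different, em_span=4.0)
accepted = d_match <= THRESHOLD
```

In a direct sampling scan, the training image would be searched until a pattern with a distance below the threshold is found, and the value at its central node would then be pasted into the simulation grid.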

Fig. 6

a, b Conditioning data for the extrapolation with direct sampling. The training image is the lower-left quarter of the simulation grid. c One random realization of the geophysical anomaly and d mean values of 100 realizations. The area is located in the grey box in Fig. 2a

Simulating the karst

Building the heterogeneity field

Two pieces of information have been used to create the velocity field describing the geological heterogeneity for the karst simulator: the processed electromagnetic anomaly map E_m(x) and the map of explored caves I_c(x).

For the electromagnetic map, the first step was to shift the map by the absolute value of the minimum taken by E_m(x), so that all values are non-negative:

$$ {N_{{\text{m}}}}\left( {\bf{x}} \right) = {E_{{\text{m}}}}\left( {\bf{x}} \right) + \left| {\min \left\{ {{E_{{\text{m}}}}\left( {\bf{x}} \right)} \right\}} \right|. $$
(1)

A first trial of karst simulation on a small portion of the simulation area using N_m(x) as a velocity field is shown in Fig. 7a. A diffuse spring zone has been set at the lower border of the model. Conduit starting points were picked randomly on the map. It appears that the velocity contrast has to be strengthened for the electromagnetic map to control the karst simulation properly. This has been achieved using an exponential relationship between the velocity and the electromagnetic anomaly.

Fig. 7

Two karst simulation attempts on a small portion of the simulation area. In a, the electromagnetic component of the velocity medium is the electromagnetic map with positively shifted values (spatial variable e). In b, the electromagnetic control on the karst simulation is enhanced by setting its value to 3^e. This area is located in the red box in Fig. 2a

$$ V\left( {\bf{x}} \right) = {b^{{{N_{{\text{m}}}}\left( {\bf{x}} \right)}}} $$
(2)

After some trial and error, a value b = 3 was chosen as it provides a good constraint on conduit generation (Fig. 7b).

For the explored caves, the map I_c(x) is a Boolean indicator variable constructed by the projection of the cave maps on the simulation grid. It is equal to 1 where caves are present and 0 where no information is provided:

$$ {I_{{\text{c}}}}\left( {\bf{x}} \right) = \left\{ {\begin{array}{*{20}c} {1\;{\text{if}}\,{\text{a}}\,{\text{cave}}\,{\text{is}}\,{\text{present}}\,{\text{at}}\,{\text{location}}\,{\bf{x}}} \\ {0\;{\text{otherwise}}} \\\end{array}.} \right. $$
(3)

A constant high velocity V_k has been assigned to points where I_c(x) = 1. Finally, the velocity field representing the heterogeneity of the medium is described as:

$$ V\left( {\bf{x}} \right) = {I_{{\text{c}}}}\left( {\bf{x}} \right) \cdot {V_{{\text{k}}}} + \left[ {1 - {I_{{\text{c}}}}\left( {\bf{x}} \right)} \right] \cdot {3^{{{N_{{\text{m}}}}\left( {\bf{x}} \right)}}} $$
(4)
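Equations 1 to 4 can be combined in a few lines. The following sketch assumes that the shift of Eq. 1 brings the minimum of the map to zero; the function name `velocity_field` and the cave velocity of 1,000 are hypothetical, since the value of V_k is not specified here.

```python
def velocity_field(em_map, cave_map, base=3.0, v_cave=1000.0):
    """Build V(x) = I_c(x) * V_k + [1 - I_c(x)] * base ** N_m(x) on a 2D grid."""
    e_min = min(min(row) for row in em_map)
    field = []
    for em_row, cave_row in zip(em_map, cave_map):
        row = []
        for e, is_cave in zip(em_row, cave_row):
            n = e - e_min  # Eq. 1, read as a shift that sets the minimum to zero
            row.append(v_cave if is_cave else base ** n)  # Eqs. 2 to 4
        field.append(row)
    return field

# Toy 2 x 2 example: normal-score anomalies and one explored cave cell.
em_map = [[-1.0, 0.0], [1.0, 2.0]]
cave_map = [[0, 0], [0, 1]]
field = velocity_field(em_map, cave_map)
```

Because the path search only depends on velocity ratios between cells, the choice of the base b controls the contrast between strong and weak anomalies, while V_k only needs to be large compared with the highest value of b^N_m for explored caves to be followed preferentially.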

Additional settings of the karst simulator

The next step is the determination of the inlet and outlet points of the simulated karst system. Little information about the conduit inlets was available, as the numerous cenotes present on the surface are not consistently linked with the underground network (Beddows 2004). Inlets were, thus, picked randomly for each conduit simulation in high velocity zones, i.e. where the presence of a conduit is suggested by cave exploration or a geophysical anomaly. The chosen threshold value for the inlet points is the third quartile of the velocity values, so that a quarter of the grid points can potentially be picked as a conduit inlet. An additional constraint was applied to determine the location of the starting points at the first iteration: they must be located on the upper border of the model. This is done in order to ease flow simulation, as a water inflow is located at this specific border (see section Hydrogeological modeling).
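The inlet selection rule can be sketched as follows. The function name `pick_inlets`, the use of row 0 as the upper border of the model and the toy velocity values are assumptions made for illustration.

```python
import random
from statistics import quantiles

def pick_inlets(velocity, n_inlets, rng, first_iteration=False):
    """Randomly pick inlet cells among those at or above the third quartile of velocity.

    At the first iteration, candidates are restricted to the upper border,
    taken here to be row 0 of the grid.
    """
    flat = [v for row in velocity for v in row]
    q3 = quantiles(flat, n=4)[2]  # third quartile of all velocity values
    candidates = [
        (r, c)
        for r, row in enumerate(velocity)
        for c, v in enumerate(row)
        if v >= q3 and (not first_iteration or r == 0)
    ]
    return rng.sample(candidates, min(n_inlets, len(candidates)))

# Toy 4 x 4 velocity field whose highest values lie on the upper border.
velocity = [[float(15 - (4 * r + c)) for c in range(4)] for r in range(4)]
rng = random.Random(0)  # fixed seed so that the sketch is reproducible
first_inlets = pick_inlets(velocity, 2, rng, first_iteration=True)
later_inlets = pick_inlets(velocity, 10, rng)
```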

As for the outlets of the system, it has been established in section Description of the study site that the aquifer discharges to the sea. Without more precise information on the location of submarine springs, the option of a diffuse spring zone located on the coast was chosen. Two attempts for the configuration of the spring zone were made: (1) all the coast is selected as a spring zone; (2) alternation between both halves of the coast at each iteration. This latter option is expected to increase reconnections between the conduits, in order to reproduce the anastomotic patterns that are observed.

Figure 8 illustrates the map of conduit occurrence probability after 1,000 realizations for both spring zone options. The simulation grid has the same extent as previously defined, but it is discretized in 40 × 40-m squares. Note that this resolution is used to represent the overall field and will be used later to solve the flow in the matrix, but the conduits are represented by discrete 1D elements having a radius that can be much smaller than the cell size. Three iterations were computed, with 20, 100 and 1,000 generated conduits per iteration, respectively. It appears that the proposed velocity field is a good constraint on the simulation, as some conduits have an occurrence probability of over 0.8. As expected, network branches are more connected with an alternating spring zone (Fig. 8b). Also, conduit orientations show much variability. With a uniform outlet area, conduits are mainly oriented straight coastward and are organized in dendritic patterns (Fig. 8a). Conduit distribution simulated with the second option is, thus, in better agreement with field observations. Furthermore, it can be observed that conduit paths have a wider variability with option 2 (Fig. 8b). With option 1, the most probable conduits (probability of occurrence of conduits P_cond > 0.8) are mainly located on the same paths (Fig. 8a). This is another argument to prefer option 2, as it reflects the high uncertainty arising from this cave mapping method. The following simulations presented in this work were thus realized with an alternating spring zone.
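The occurrence probability map is simply the cell-wise frequency of conduits over the ensemble of realizations. A minimal sketch, with a hypothetical function name and four toy realizations instead of 1,000, is given below.

```python
def occurrence_probability(realizations):
    """Cell-wise fraction of realizations in which a conduit is present (P_cond)."""
    n = len(realizations)
    n_rows, n_cols = len(realizations[0]), len(realizations[0][0])
    return [
        [sum(real[r][c] for real in realizations) / n for c in range(n_cols)]
        for r in range(n_rows)
    ]

# Four toy realizations on a 2 x 2 grid (1 = conduit present, 0 = absent).
realizations = [
    [[1, 0], [1, 1]],
    [[1, 0], [0, 1]],
    [[1, 0], [0, 0]],
    [[1, 1], [0, 0]],
]
p_cond = occurrence_probability(realizations)
```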

Fig. 8

Map of the probability of occurrence of conduits (P_cond) computed on the basis of 1,000 realizations. In a, a diffuse spring zone is defined along the coast (lower border of the simulation grid). In b, one half of the coastline is defined as a diffuse spring zone for the first iteration. The other half is considered for the next iteration, and so on. The area is located in the grey box in Fig. 2a

The remaining input parameters to set are the number of iterations and the number of conduits generated at each iteration. The hierarchy of the conduits directly arises from the number of computed iterations: when a conduit is simulated on the same path as a conduit resulting from a previous iteration, its order increases by one unit. This is illustrated in Fig. 9. The maximum order is, thus, the total number of iterations. As a matter of simplification, the network has been organized in three orders; therefore, three iterations were computed. The known network was then included in the model. If no conduit was simulated on a known conduit, its order is 1; otherwise, the ordering obtained by the iterative process of simulation is kept. The number of generated conduits determines the overall density of the network. As this parameter is unknown, three network densities were generated and tested in the flow model (see section Hydrogeological modeling). Their parameters are described in Table 1 and they are illustrated in Fig. 10.
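One possible reading of the ordering rule is that the order of a cell equals the number of iterations in which a conduit passes through it, with explored caves that were never simulated receiving order 1. The sketch below implements this reading; the function name and the toy masks are assumptions.

```python
def conduit_order(iteration_masks, known_caves):
    """Order map: number of iterations whose conduits pass through each cell.

    Explored cave cells on which no conduit was simulated get order 1.
    """
    n_rows, n_cols = len(known_caves), len(known_caves[0])
    order = [[0] * n_cols for _ in range(n_rows)]
    for mask in iteration_masks:
        for r in range(n_rows):
            for c in range(n_cols):
                if mask[r][c]:
                    order[r][c] += 1
    for r in range(n_rows):
        for c in range(n_cols):
            if known_caves[r][c] and order[r][c] == 0:
                order[r][c] = 1
    return order

# Three toy iterations on a 2 x 3 grid, plus one explored cave cell.
masks = [
    [[1, 1, 0], [0, 0, 0]],
    [[1, 1, 1], [0, 0, 0]],
    [[1, 0, 1], [0, 0, 0]],
]
known_caves = [[0, 0, 0], [0, 1, 0]]
order = conduit_order(masks, known_caves)
```

With three iterations the maximum order is 3, which is reached only by cells that are reused in every iteration.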

Table 1 Input parameters of the three simulated networks

Fig. 9

Simulation of a three-iteration network. Each time a conduit is simulated on the same path as a conduit generated in a previous iteration, its order increases by one unit. The area is located in the grey box in Fig. 2a

Fig. 10

Three simulated networks that were used in the flow model. Conduit orders are assigned according to the number of times a conduit is generated on the same path. Input parameters of these network realizations are described in Table 1. The area is located in the grey box in Fig. 2a

Hydrogeological modeling

Building the flow model

The hydrogeological modeling was achieved using the finite-element code Ground Water (Cornaton 2007). For the purpose of this preliminary study, the model was restricted to the freshwater lens, which is represented as a two-dimensional (2D) confined aquifer of constant thickness and in steady state. This assumption is justified by the fact that all the simulated conduits are fully saturated. Because only the steady-state long-term behavior of the system is modeled, the variation in saturated thickness in the carbonate matrix is expected to have only a minor effect on the values of the fluxes within the conduits, which is why it is neglected at this stage of the work. The model is discretized in 40 × 40 m squares (4-node elements) and the simulated karstic network is integrated in the mesh as a set of 1D elements (2-node elements) (Fig. 11).

Fig. 11

Simplified scheme representing the finite-element flow model with boundary conditions: red nodes represent an inflow boundary condition and blue ones represent a constant-head boundary condition. The bold black lines are the 1D elements representing karstic conduits; the white cells are the limestone matrix elements, and in gray are the matrix elements with a high hydraulic conductivity

The groundwater flow in matrix elements is represented using Darcy’s law, while groundwater flow in the 1D conduit elements is represented using the non-linear Manning-Strickler formula for turbulent flow in pipes:

$$ {\bf{v}} = - {\Phi _{{\text{s}}}} \cdot \sqrt[3]{{{R^{2}}}}\frac{{\nabla h}}{{\sqrt {{\left\| {\nabla h} \right\|}} }} $$
(5)

where v is the velocity, Φs is the friction coefficient [L^(1/3) T−1] characteristic of the pipe, ∇h is the hydraulic gradient and R [L] is the hydraulic radius of the pipe. A homogeneous value of Φs = 50 m^(1/3) s−1 was chosen for the whole network. This value is proposed by Lauterjung and Schmidt (1989) for irregular concrete surfaces. In the model, the pipes are assumed to be saturated, so the hydraulic radius is equal to the physical radius of the conduit. The identification of this parameter is described in the following paragraphs.
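As a quick plausibility check, Eq. 5 can be evaluated for a single pipe. The sketch below is only an illustration of the formula, not the flow code itself; the function name is hypothetical, and the radius of 3 m and the use of the regional mean gradient of 2.74 cm km−1 as the local gradient are assumptions made for the example.

```python
import math

def manning_strickler_velocity(phi_s, hydraulic_radius, grad_h):
    """Velocity vector (m/s) from Eq. 5 for a 2D hydraulic gradient (dimensionless).

    v = -phi_s * R**(2/3) * grad_h / sqrt(|grad_h|)
    """
    norm = math.hypot(grad_h[0], grad_h[1])
    if norm == 0.0:
        # No head gradient, no flow (also avoids dividing by zero).
        return (0.0, 0.0)
    factor = -phi_s * hydraulic_radius ** (2.0 / 3.0) / math.sqrt(norm)
    return (factor * grad_h[0], factor * grad_h[1])

# Assumed inputs: phi_s = 50 m^(1/3)/s as in the text, R = 3 m (illustrative),
# gradient 2.74 cm/km = 2.74e-5 along x (head rising inland).
v = manning_strickler_velocity(50.0, 3.0, (2.74e-5, 0.0))
speed = math.hypot(v[0], v[1])
```

With these inputs the speed comes out at roughly 0.5 m s−1, directed down-gradient. Because the velocity scales with the square root of the gradient, doubling the gradient raises the velocity by only about 41 %.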

A constant-head boundary condition is applied at the downstream border of the model in order to represent the sea boundary. Its value has thus been set to 0 m. The upstream border is defined as an inflow boundary. Direct recharge by rainfall over the site is considered negligible compared with the upstream flow; it was therefore assumed that the observed coastal outflow comes entirely from the upstream limit of the model. In their review, Bauer-Gottwein et al. (2011) state that coastal outflow estimates for southern Quintana Roo range from 0.27 to 0.73 m3 s−1 per kilometer of coastline. After comparing these values by means of numerical flow models, they suggest an outflow of 0.3–0.4 m3 s−1 km−1. The upstream influx was thus set to 0.35 m3 s−1 km−1. A row of very high conductivity elements (10^5 m s−1) was added at the upstream border, so that the inflow is distributed between the matrix and the conduits (Fig. 11).

The parameters that are expected to have a significant influence on the flow simulation are: (1) the matrix hydraulic conductivity, (2) the karstic network density and (3) the karstic conduit diameters. In order to calibrate the model, output flow fields were compared with 43 GPS groundwater-level measurements, which were taken in cenotes and boreholes over the modeled area during February 2011 (Fig. 2b). Six of those points were monitored during the 10 previous months with pressure sensors at a rate of one measurement every 30 min. Monitoring data reveal a tidal influence on the piezometric level in the range of 10 cm, whereas the maximum observed head is 37 cm with respect to the averaged measured sea level (six measurements). In addition, the estimated measurement uncertainty is approximately 5 cm. These data are thus highly uncertain relative to the signal they are meant to capture. Considering this fact and the simplicity of the steady-state model, the simulations were calibrated with the aim of reproducing the overall mean gradient rather than the individual point values. Ignoring tidal fluctuations, minimum and maximum gradients were computed by a linear regression of the observed point heads versus the distance to the coast. The 95 % confidence interval of the gradient is 2.74 ± 0.5 cm km−1, which is illustrated in Fig. 12.

Fig. 12
figure 12

Measured hydraulic heads versus distance to the coast. Error bars on the water-table monitoring points represent their standard deviation
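The gradient estimate described above amounts to an ordinary least-squares slope with a confidence interval. The following sketch shows one way this could be computed; it is not the authors' procedure. The function name is hypothetical, the interval uses a normal approximation (a Student t quantile would be slightly wider for 43 points), and the sample values are placeholders, not the field measurements.

```python
import math
from statistics import NormalDist

def gradient_with_ci(distance_km, head_cm, level=0.95):
    """Least-squares slope of head vs. distance and the half-width of its CI.

    Returns (slope, half_width) in cm/km. Normal approximation (assumption).
    """
    n = len(distance_km)
    mean_x = sum(distance_km) / n
    mean_y = sum(head_cm) / n
    sxx = sum((x - mean_x) ** 2 for x in distance_km)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(distance_km, head_cm))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares and standard error of the slope.
    sse = sum((y - (intercept + slope * x)) ** 2
              for x, y in zip(distance_km, head_cm))
    std_err = math.sqrt(sse / (n - 2) / sxx)
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return slope, z * std_err

# Placeholder values for illustration only (NOT the measured heads).
distance_km = [1.0, 3.0, 5.0, 7.0, 9.0, 11.0]
head_cm = [4.0, 7.0, 15.0, 18.0, 26.0, 29.0]
slope, half_width = gradient_with_ci(distance_km, head_cm)
```

The pair `(slope, half_width)` corresponds to the form in which the gradient is reported in the text (central value ± half-width, in cm km−1).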

Sensitivity test

To assess the influence of matrix transmissivity T_mat and conduit radius r on the simulated hydraulic heads, a series of simulations was carried out. The same karstic network was used for each simulation. As the hydraulic conductivity contrast between matrix elements and conduit elements increases from one simulation to the next, the output flow field of one simulation was used as the initial condition for the next one, which makes the solution easier for the program to compute.

Figure 13 shows the maximum simulated head for T_mat ranging from 10^−6 to 10^−1.5 m2 s−1 and r from 1 to 10 m. It appears that T_mat has a smaller influence on the simulated flow field than conduit radius, which suggests that most of the flow is drained by the conduit network. In the range of transmissivity values that yield realistic hydraulic heads (i.e. less than 1 m), T_mat variations have almost no influence on the simulated flow field. Since this parameter has little impact on the evaluation of the model uncertainty, a fixed transmissivity of 10^−4 m2 s−1 was selected for the following simulations.

Fig. 13
figure 13

Maximum simulated heads resulting from a series of simulations testing the influence of the matrix transmissivity and conduit radius

Calibration of karst network parameters

Since only very few measurements were available, a simple model of conduit radius is proposed. The karst network is regarded as an ordered and hierarchical system (Fig. 10). An analogy was then made with river systems, which assumes that the radius of a conduit can be related to its order via an exponential law:

$$ r=\alpha \cdot {e^{{\beta u}}} $$
(6)

where r is the conduit hydraulic radius and u the conduit order. α and β are parameters that characterize the system. This type of law is observed in various complex hierarchical systems: for example, they relate many features of river channels such as drainage area, segment length or width, with their order (Horton 1945). Here, it is considered that such a relation can provide a reasonable model in the case of a karst system. It would have to be tested further against field data but for the preliminary model, it provides a simple way to control the variability of the hydraulic radius of the network with only two parameters.
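To make Eq. 6 concrete, the short sketch below evaluates the radius of each conduit order for one parameter pair. The pair α = 1.5, β = 0.7 is the one quoted in the caption of Fig. 16; the function name is hypothetical.

```python
import math

def conduit_radius(order, alpha, beta):
    """Conduit hydraulic radius (m) for a given order u, following Eq. 6."""
    return alpha * math.exp(beta * order)

# Radii of the three conduit orders for alpha = 1.5, beta = 0.7.
radii = {u: conduit_radius(u, 1.5, 0.7) for u in (1, 2, 3)}
```

For this pair the radii are about 3.0, 6.1 and 12.2 m for orders 1, 2 and 3. Each increase of one order multiplies the radius by the constant factor exp(β), which is the geometric progression characteristic of Horton-type laws.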

A series of simulations with varying α and β was run for three increasing network densities (Table 1; Fig. 10). Both α and β range from 0.1 to 1.6. An overview of the maximum head of each simulated flow field is shown in Fig. 14a. It appears that α and β exert a strong control on the flow simulation; however, the three conduit densities yield similar results. Numerous model configurations with large conduit radii could not be solved numerically (in gray in Fig. 14a), probably because the high conductivity contrast between matrix elements and conduits prevents the solution from converging.

Fig. 14
figure 14

a Maximum simulated hydraulic head (in meters, on a logarithmic scale) for a set of simulations. In gray are the models for which the simulator could not compute a solution. The varying parameters are α and β, empirical parameters defining the radius of each conduit order, and the network density (increasing from network 1 to 3, see Fig. 10). b Models that yield a maximum head matching the observed gradient

Based on the calculated gradient of 2.74 ± 0.5 cm km−1, hydraulic heads at the upstream limit of the model (11.8 km from the coast) should lie between 26 and 38 cm (Fig. 12). Simulations that yield maximum heads in this range are shown in Fig. 14b. They are considered to be good approximations of the system. A parameter summary of each simulation is presented in Table 2.
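The acceptance window follows directly from the gradient interval and the length of the model. The sketch below reproduces that arithmetic; the function names are hypothetical, and the assumption is that head rises linearly from 0 at the coast.

```python
def upstream_head_range_cm(gradient, half_width, distance_km):
    """Head range (cm) at a given distance from the coast.

    gradient and half_width are in cm/km; head is assumed to be 0 at the coast.
    """
    low = (gradient - half_width) * distance_km
    high = (gradient + half_width) * distance_km
    return low, high

def is_realistic(max_head_cm, window):
    """True if a simulated maximum head falls inside the acceptance window."""
    low, high = window
    return low <= max_head_cm <= high

# Values from the text: 2.74 +/- 0.5 cm/km over 11.8 km.
window = upstream_head_range_cm(2.74, 0.5, 11.8)
```

This gives a window of about 26.4 to 38.2 cm, which rounds to the 26 to 38 cm range used to select the simulations.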

Table 2 Summary of the parameters of the simulations yielding realistic hydraulic gradient (measured gradient: 2.74 ± 0.5 cm km−1). Referred network densities are shown in Fig. 14

The three network densities can provide equally good fits. The corresponding conduit radii range from 2.4 to 3.2 m for conduit order 1, from 6.1 to 10.8 m for order 2, and from 12.2 to 40.0 m for order 3. Figure 14b shows that α and β are correlated, which is illustrated in Fig. 15 for all the network types together. A second-order least-squares fit of β versus α yields the relation:

$$ \beta = 0.37{\alpha^2} - 1.60\alpha + 2.27. $$
(7)

Fig. 15
figure 15

Second order least-squares fit of the combinations of α and β that yield realistic flow fields

The available data set is therefore not sufficient to constrain the radius model and to provide a uniquely defined set of parameters. This implies a large source of uncertainty and indicates that additional data must be acquired in the future to better constrain that part of the model.
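Equation 7 can be evaluated to see which β the fitted trend associates with a given α. The sketch below is illustrative only: the function name is hypothetical, and the fit should not be trusted outside the sampled range of 0.1 to 1.6 (an assumption based on the tested parameter range, not on the fitted points themselves).

```python
def beta_on_trend(alpha):
    """Beta predicted by the second-order fit of Eq. 7 for a given alpha."""
    return 0.37 * alpha ** 2 - 1.60 * alpha + 2.27

# Cross-check with the configuration shown in Fig. 16 (alpha = 1.5, beta = 0.7).
beta = beta_on_trend(1.5)
```

For α = 1.5 the fit returns β of about 0.70, consistent with the parameter pair of the simulation shown in Fig. 16. The trend is decreasing over this range, so a larger base radius α is compensated by a weaker growth of the radius with order.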

The maximum conduit velocities for the selected simulations are presented in Table 2 and are in the range of 0.6–0.9 m s−1, which is in agreement with the in-situ observation by divers of strong currents, but has yet to be confirmed by proper measurements. The output flow and velocity fields of one of these simulations are shown in Fig. 16. The heterogeneity of the piezometric and velocity maps illustrates the strong influence of the conduits on the flow simulation. The velocity map shows that water flows at approximately 1 m y−1 in the matrix, while in major conduits, velocities are higher than 1 cm s−1, indicating that a large proportion of the flow occurs in the conduits.

Fig. 16
figure 16

Piezometric and velocity maps simulated with network 1, α = 1.5 and β = 0.7. The area is located in the grey box in Fig. 2a

Discussion and conclusion

This study presents the development of a finite-element flow model of the karstic aquifer located on the eastern coast of the Yucatan Peninsula. In a first step, the known karstic network, mapped by divers, is extrapolated using a modification of the stochastic pseudo-genetic karst simulator (Borghi et al. 2012), which makes it possible to account for electromagnetic data resulting from extensive airborne geophysical surveys. The resulting conduit networks are modeled as 1D pipes, accounting for turbulent flow, embedded in a porous matrix.

However, the resulting hydrogeological models are subject to much uncertainty. Out of more than 700 tested models, only 15 parameter configurations yield hydraulic fields that match the hydraulic head observations. In addition, a wide set of stochastic karst networks can be generated with the chosen input parameters. It can, however, be observed that the satisfactory combinations of parameters α and β tend to follow a specific trend for each of the three proposed network densities.

The uncertainty of the model arises on the one hand from the lack of conditioning data and on the other hand from the karstic nature of the aquifer. The one-off water-level measurements are difficult to handle because the measurement accuracy is 5 cm, whereas the hydraulic gradient is only a few centimeters per kilometer. In addition, tidal fluctuations observed in boreholes (∼10 cm) have not been taken into account for the calculation of the hydraulic gradient. An in-depth study of the tidal wave propagation in the aquifer (delay, amplitude decay) is necessary in order to interpret more precisely the piezometric levels at that scale. Besides, water tables in karstic aquifers are strongly influenced by the position of conduits. Thus, a regional interpretation of localized measurements requires consideration of the karst network configuration. Finally, the models do not allow unconnected karst features, which might affect the flow model because they have an important storage capacity but low transmissivity.

Furthermore, a high level of uncertainty is linked with the conduit radius estimates. As no detailed measurements were available, the network was oversimplified in three radius classes. The application of an analogy with Horton laws for radius estimates is questionable, as the ordering does not respect the standard Horton or Strahler ordering systems for river systems. Again, further research is needed here to check the model against data and to elaborate a specific relation for karstic systems.

The necessity to include highly conductive flow channels in karst aquifer models has been pointed out by many authors, for example Kiraly (1998) and Worthington (2009). This cannot go without a high uncertainty since a thorough exploration of the caves is unrealistic, even in this study case where extensive cave maps were available. The method that is proposed here allows for building reasonable models of the cave conduits and exploring the uncertainty. In the specific situation of the Yucatan system, the available electromagnetic measurements constitute a major clue for the completion of the network. Their potential in revealing karstic conduits has been established in previous studies (Ottowitz 2009; Supper et al. 2009).

Although the exact position of the conduits is uncertain, the flow models including the stochastic conduits yield more realistic simulated water tables and flow paths than any homogeneous or distributed equivalent porous medium model. Indeed, groundwater flow is largely drained by the cave network. This is in agreement with the estimation provided by Worthington et al. (2000) that 99.7 % of the flow in this aquifer occurs in the conduits. Moreover, the simulation of turbulent flow in conduits allows realistic velocity and travel time estimates even if the whole complexity of the system was not accounted for and even if data were missing to calibrate precisely the distribution of the radius of the conduits. The proposed model gives a global insight into the aquifer behavior. Should it be used to address questions of flow paths and travel times at a more local scale, the output would be very different from one realization to another and thus would allow one to estimate uncertainty in those forecasts. This could be used in an iterative procedure to define where and what type of information is required to better constrain the model and optimally reduce the uncertainty.

Regarding pollution risk, the results of these preliminary flow simulations suggest that the vulnerability of the aquifer is extremely high. Indeed, computed velocities in conduits are on the order of tens of cm s−1, which is unusually rapid for groundwater flow and implies that wastewater travel times from the injection point to the outlets are short. Pollutant decay and/or absorption processes that could reduce water toxicity thus have little time to act.

Tulum wastewater is likely to travel to the sea at a very rapid rate. Indeed, cave maps indicate the presence of three explored conduits just beneath the town (Fig. 2a). Based on the proposed conduit probability map (Fig. 8b), it is highly probable that a straight connection exists between Tulum and the sea. According to the velocities computed by flow simulations, water could travel this 3-km distance very rapidly. This is alarming considering the major coral reef that lies near the coast and the lack of wastewater treatment in this area, but needs to be confirmed by in-situ velocity measurements. Regarding hydrologic linkages between Tulum and the Sian Ka’an Biosphere Reserve, a direct karstic connection is possible but has not been proven. More likely, pollution could travel from the sea to Sian Ka’an lagoons, which induces a major pollution risk for the protected ecosystem hosted in this area. To quantify this risk, a better understanding of the karst network and a study of the potential contaminant sources are required.
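To give an order of magnitude for the travel time discussed above, the sketch below divides the 3-km distance by a constant conduit velocity. This is a rough illustration, not a model result: it assumes a straight conduit with uniform velocity, and it borrows the range of maximum simulated velocities (0.6–0.9 m s−1) reported for the selected simulations, which does not necessarily apply to the conduits beneath Tulum. The function name is hypothetical.

```python
def travel_time_hours(distance_m, velocity_m_per_s):
    """Advective travel time (hours) along a path at constant velocity."""
    if velocity_m_per_s <= 0:
        raise ValueError("velocity must be positive")
    return distance_m / velocity_m_per_s / 3600.0

# 3 km between Tulum and the sea, at the bounds of the maximum simulated velocities.
fastest = travel_time_hours(3000.0, 0.9)
slowest = travel_time_hours(3000.0, 0.6)
```

Under these assumptions the transit would take on the order of one to one and a half hours. At the lower velocity of 1 cm s−1 also mentioned for major conduits, the same distance would take about three and a half days, which illustrates how strongly the estimate depends on the in-situ velocities that remain to be measured.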