Introduction

Modelling surface–groundwater interactions is essential to sectors such as ecology, agriculture, and, particularly, water resources management (Dages et al. 2012). It is also deemed to be one of the biggest challenges among modelling tasks (Irawan et al. 2015), as surface water and groundwater interplay in such a way that changes on one side affect the quantity or quality of the other (Fleckenstein et al. 2010; Zhang et al. 2015). The spatio-temporal variability of surface–groundwater interaction is usually analysed with the purpose of (1) investigating the tolerance of streamflow discharge, groundwater storage, and aquatic species to ongoing climate change (Waibel et al. 2013), groundwater abstraction, or land use/land cover change (Bailey et al. 2016); (2) improving the conjunctive use of surface water and groundwater (Sophocleous 2002); (3) locating places prone to contamination in order to set up protection plans, such as the elimination of pollutants in groundwater, for instance in riparian zones; and (4) assessing the contamination risk of surface water by adverse groundwater constituents such as nitrate and phosphorus.

Over the past decades, several hydrological models have been developed for the coupled simulation of surface and groundwater flows (Sulis et al. 2010), most of which are based on the interconnection of models developed for one or the other compartment of the hydrological cycle (Bejranonda et al. 2013). Conventionally, the integration and coupling of surface and subsurface hydrological models have mostly been assessed against streamflow discharge, groundwater heads, and appropriate surface–groundwater exchange components as the calibration targets (Bejranonda et al. 2007; Kalbus et al. 2006). The Soil and Water Assessment Tool (SWAT) (Arnold et al. 1998) and the Modular Three-Dimensional Finite-Difference Groundwater Flow model (MODFLOW) (McDonald et al. 1988) are among the most widely used surface and groundwater models, respectively, for simulating the joint interactions of surface and subsurface hydrological processes (Guzman et al. 2015).

Because SWAT is a quasi-distributed model with its own simplified groundwater module (Arnold et al. 1993), the most important parameters, such as recharge, hydraulic conductivity and/or transmissivity, specific storage, specific yield, and effective porosity, are assigned at the hydrologic response unit (HRU) level. Depending on the sizes of an HRU and a MODFLOW grid cell, a single HRU can therefore cover a large number of grid cells. Correspondingly, when SWAT HRUs are intersected with MODFLOW grid cells, the spatial resolution of the exchanged parameters, including the hydraulic parameters, is improved. The outputs of the coupled model, particularly groundwater heads, can then be better simulated owing to the higher resolution and better parameterization of recharge and the hydraulic parameters (Kim et al. 2008). Hence, a fully distributed groundwater flow model such as MODFLOW can resolve this shortcoming of the SWAT model by greatly enhancing the resolution of the input hydraulic and recharge parameters and, subsequently, improving the simulated output (e.g. groundwater head).

Concerning the proper stress-driving of a groundwater model, such as MODFLOW, the recharge rate is one of the most indispensable input parameters. As the latter is estimated in pre-processing calculations that are fraught with uncertainty, it should, therefore, usually be calibrated with other model parameters, namely the hydraulic conductivity, although a typical trade-off between the two is often encountered (Arlai et al. 2013). All these considerations lead to some uncertainty in the simulated groundwater flow results.

Although the SWAT model has proven rather reliable in mimicking total streamflow, this is less the case for baseflow, because SWAT implements a quite simplified groundwater module in which the groundwater system is divided into a shallow and a deep aquifer (Neitsch et al. 2011). More specifically, the SWAT groundwater module is based on a linear-reservoir baseflow model, in which the groundwater discharge/baseflow is proportional to the groundwater storage (Arnold and Fohrer 2005; Wang and Brubaker 2014). In spite of this deficiency, there have been numerous, generally satisfactory, simulations of groundwater discharge by SWAT for perennial streamflow conditions (Aizen et al. 2000; Arnold et al. 2000). However, this has been less the case for groundwater discharge during dry seasons/periods (Kalin and Hantush Mohamed 2006; Srivastava et al. 2006), most likely because of the simple shallow and deep aquifer partitioning in SWAT. In fact, it is the shallow aquifer that sustains the groundwater discharge (baseflow), while the deep aquifer contributes water to rivers outside the watershed and is thus not properly accounted for in the SWAT hydrological budget (Arnold et al. 1993).
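
The linear-reservoir idea can be illustrated with a minimal sketch. It is a generic single linear reservoir, not SWAT's source code; the function name, the recession constant `alpha`, and all numbers used with it are hypothetical and chosen only for illustration.

```python
import math


def linear_reservoir(storage0, recharge, alpha, dt=1.0):
    """Route recharge through one linear reservoir, where outflow = alpha * storage.

    storage0 : initial groundwater storage (mm)
    recharge : list of recharge rates, one per time step (mm/day)
    alpha    : recession constant (1/day), a hypothetical value
    dt       : time step length (days)
    Returns the mean baseflow rate of each time step (mm/day).
    """
    decay = math.exp(-alpha * dt)
    storage = storage0
    baseflow = []
    for r in recharge:
        # Exact solution of dS/dt = r - alpha * S over one step with constant r
        new_storage = storage * decay + (r / alpha) * (1.0 - decay)
        # Water balance over the step: outflow = inflow - change in storage
        baseflow.append((r * dt - (new_storage - storage)) / dt)
        storage = new_storage
    return baseflow
```

With zero recharge, successive baseflow values in this sketch shrink by the constant factor exp(-alpha * dt), so a single reservoir can only produce an exponential recession during dry periods.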

MODFLOW simulates flow processes in the saturated zone, treated as a continuum and represented by a three-dimensional grid (the groundwater domain) with associated hydrogeological properties. MODFLOW solves the groundwater flow differential equation using a finite-difference scheme and links the groundwater system to other hydrological compartments and processes (e.g. vadose zone, surface drainage, transport phenomena) by means of “packages” implemented on the gridded spatial discretization. Nevertheless, it cannot directly take into account hydrologic processes that take place at the land surface or within the root zone. As a result, a prevailing approach is to assume lumped percolation fluxes as a fraction of precipitation and then to optimize this fraction during the calibration process. Even though a groundwater model calibrated for recharge in this way can satisfactorily reproduce groundwater heads, such a result may be the right answer for the wrong reasons (Kirchner 2006), because this method cannot account for the spatial variability of recharge rates due to varying land use, irrigation, and agricultural measures over the land surface domain. Although attempts have been made to enhance recharge estimation in MODFLOW by incorporating unsaturated zone processes, e.g. the Unsaturated-Zone Flow (UZF1) package (Niswonger et al. 2006), hydro-climatic processes still cannot be properly captured. Thus, assessing the impacts of land management practices and climate change on groundwater and surface–groundwater interactions faces considerable uncertainty even with this advancement, because MODFLOW does not simulate surface processes such as land–atmosphere interactions, infiltration and surface runoff, plant growth, and the impacts of management practices on agricultural systems.

In addition, from a qualitative point of view, this method may not properly model the nutrients transport to the groundwater domain for the same reasons. Therefore, an integrated SWAT and MODFLOW is required to better spatially represent feedback fluxes within the land surface, unsaturated, and aquifer systems (Guzman et al. 2015).

The advantages of SWAT and MODFLOW can be exploited, and their drawbacks mitigated, when the two models are coupled in such a manner that flux exchanges between the surface and subsurface hydrological domains are plausibly characterized (Ke 2014; Kim et al. 2008). Under such an approach, the functionality of the coupled SWAT-MODFLOW model is not only greatly enhanced but also extended in comparison with the independent use of either model (Guzman et al. 2015). A successful coupling approach makes a wide range of applications possible, such as (1) assessing the impacts of climate change and variability on all water compartments of a basin (Brown and Funk 2008); (2) enhancing irrigation system analysis (Playan and Mateos 2006); (3) identifying effective spatial planning (Scanlon et al. 2005); (4) advancing groundwater fate and transport modelling; and (5) characterizing and quantifying the surface–groundwater flux exchange, which is the aim of the present study.

To the authors’ best knowledge, previous developments and applications of coupled SWAT-MODFLOW models have focused mainly on the representation of flux exchanges in basins where groundwater discharge and recharge occur under a perennial drainage network regime, while the applicability of such a coupled model has not been assessed for basins where an intermittent/seasonal drainage network dictates how surface water and groundwater interact. In a seasonal/intermittent river system, the interconnection of surface water and groundwater is highly sporadic and complex. Characterizing and quantifying the water components of such basins, which prevail in arid and semi-arid regions, with a solid coupled hydrological model can therefore support an effective water resources management strategy. Furthermore, as watersheds in arid and semi-arid regions are highly vulnerable to recurrent and prolonged drought events as well as to ongoing climate change impacts, a well-coupled and tested hydrological model makes it possible to undertake assessment studies for devising effective adaptation strategies under a broad and plausible range of scenarios.

In that regard, the newest versions of SWAT and MODFLOW-NWT, as coupled by Bailey et al. (2016), have been considered here. MODFLOW-NWT is a Newton–Raphson formulation of MODFLOW-2005 that is particularly suitable for the solution of unconfined groundwater-flow problems where drying and rewetting of upper aquifer layers play a role (Niswonger et al. 2011).

This new integrated model developed by Bailey et al. (2016) is superior to the previously developed coupled SWAT-MODFLOW model versions of Perkins and Sophocleous (1999); Sophocleous and Perkins (2000); Conan et al. (2003); Menking et al. (2003); Galbiati et al. (2006); Bejranonda et al. (2007); Kim et al. (2008); Chung et al. (2010); Luo and Sophocleous (2011); and Guzman et al. (2015) because (1) it uses ‘Disaggregated Hydrologic Response Units’ (DHRUs) instead of HRUs; (2) it has a well-structured HRU–grid mapping; (3) it is applicable to watersheds and groundwater aquifers that have different spatial sizes; and (4) it uses MODFLOW-NWT for solving problems involving drying and rewetting nonlinearities of the unconfined groundwater-flow equation, instead of the conventional groundwater flow model, MODFLOW-2005, which is known to perform poorly in such cases.

In addition, the calibration of coupled hydrological models, particularly of previously developed SWAT-MODFLOW models, is typically started only once the two models have been coupled. As a result, the calibration becomes much more time-consuming and fraught with difficulties. To reduce the computational cost and to facilitate the calibration process, in this study the SWAT and MODFLOW models are first calibrated individually, and an add-on recalibration of the coupled model is performed afterwards. Moreover, the present study examines the implications of separately calibrating SWAT and MODFLOW for screening the most suitable parameter(s) for the subsequent final calibration of the coupled model, which has not yet been documented in the literature.

The balance between water supply and demand has been lost in the Gharehsoo River Basin (GRB) and its enclosed aquifer, the Ardabil aquifer in north-western Iran, mainly because of population growth, the intensification of irrigated agriculture, recurrent droughts, and ongoing climate change impacts (Ardabil Regional Water Authority 2013). Properly characterizing and quantifying the surface–groundwater interactions would therefore be useful to support the future water resource management plans considered by policy makers and the corresponding authorities. Moreover, official statistics provided by the Ardabil Regional Water Authority (2013) indicate that groundwater utilization in the Ardabil aquifer rose considerably from 35 million cubic metres (MCM) in 1978 to 160 MCM in 2012 and that the depth to water has varied widely as a result. The reciprocal flux exchanges between the surface water and groundwater systems are therefore likely to have become highly complex in recent years.

Based on these premises, the specific research objectives of the present study are: (1) assessing the performance of the coupled SWAT-MODFLOW model in reproducing the observed streamflow discharges and groundwater heads in the study region; (2) analysing the spatial and temporal variability of the groundwater recharge and discharge from losing (influent) and gaining (effluent) streams, respectively, in the GRB, where an intermittent/ephemeral river drainage regime prevails; (3) identifying the importance of separately calibrating SWAT and MODFLOW-NWT for screening suitable parameter(s) for the subsequent final coupled model calibration; and (4) appraising the current water resource management status of the basin in terms of groundwater storage and water yield.

Study area

Geography

The Gharehsoo River Basin (GRB), located in north-west Iran, covers an area of 4193 km² and extends between latitudes 37°46ʹ–38°36ʹN and longitudes 47°46ʹ–48°42ʹE. An alluvial aquifer, the so-called Ardabil plain/aquifer, with an area of 1073 km², occupies about one-fourth of the basin (Fig. 1) and supplies the major fraction of the water demand in the region. The average annual precipitation of the GRB is only 300 mm (Nourani et al. 2015), which falls mainly in the winter months from November until April. A solid coupled hydrological model is of great importance for the GRB for two reasons. First, since the study area is located in a mountainous region, it is expected to be more vulnerable to global climate change, as is the case for many mountainous areas of the world (e.g. the Andes and the Himalayas) (Hu et al. 2013). Second, as stated by the IPCC (2013), mean precipitation will most likely decline in many mid-latitude and subtropical dry regions such as the GRB, which would have enormous adverse impacts on water availability in the basin. For these reasons, characterizing and identifying the spatio-temporal variations of surface–groundwater interactions can help to devise effective water resource management plans to cope with both the current increase in water consumption due to population growth and the future natural and man-made (e.g. climate change) impacts on the water components of the basin.

Fig. 1 Geographical location of Gharehsoo River Basin (GRB) and landuse map of the basin imported to the SWAT model along with the locations of the hydrometric stations/outlets (denoted by “O”)

Hydrogeology

Geologically, the Ardabil aquifer is composed of Quaternary alluvial deposits, namely gravel, sand, and a small amount of clay (Fig. 2a). It is a one-layer unconfined aquifer (Fig. 3) that originated from the erosion and alteration of the surrounding mountains. According to geophysical explorations and drilling logs, the aquifer thickness varies from 10 m to over 200 m. The estimated aquifer thickness, together with the results of pumping tests, indicates that the aquifer transmissivities range from 50 to 2200 m² day⁻¹, whereas the specific yields are between 0.02 and 0.14. The dominant direction of groundwater flow follows the topography, i.e. it is towards the north-west (Fig. 2b) (Kord and Asghari Moghaddam 2014).

Fig. 2 Ardabil aquifer geological formation map and position of pumping wells (a), and groundwater level contour map with the assigned aquifer boundaries used in the later groundwater flow model (b)

Fig. 3 Spatial discretization of the modelled area along with pumping wells, springs, and qanats that extract water from the aquifer, and positions of the top and bottom of the layer and the initial head (with tenfold vertical exaggeration), illustrated by two insets representing vertical cross-sections along transects A1–A2 and B1–B2

Since 1980, the rapid expansion of Ardabil city, intensive irrigated agriculture, and growing industrial activity have put much more stress on the Ardabil aquifer. The Ardabil Regional Water Authority (2013) reported that the average water consumption of the drinking, industrial, and agricultural sectors in the region was 26, 4, and 177 million m³ year⁻¹, respectively, with 89% of the water supplied by groundwater through 2622 active pumping wells, 36 qanats, and 77 springs operating in the Ardabil aquifer (Kord and Asghari Moghaddam 2014).

Materials and methods

Development of the SWAT-MODFLOW coupled model

The coupled SWAT-MODFLOW model developed by Bailey et al. (2016), in which the source codes of the two models, SWAT and MODFLOW-NWT, are united in a single FORTRAN executable, is used here. This variant of coupled SWAT-MODFLOW appears to represent the river–groundwater interchanges better than others, such as the recent SWATmf integrated modelling framework of Guzman et al. (2015). In the present coupled model, MODFLOW-NWT is called as a sub-module of SWAT: SWAT simulates the land surface, soil and vadose zone, and in-stream processes, while MODFLOW-NWT models the three-dimensional groundwater flow, including all sources (surface and stream recharge) and sinks (pumping and stream discharge).

As proposed by Bailey et al. (2016), the coupling of MODFLOW-NWT to SWAT is conducted through the permanent exchange of three variables, namely (1) deep percolation (water leaving the root zone considered as recharge) from SWAT to MODFLOW-NWT; (2) river head from SWAT to MODFLOW-NWT; and (3) groundwater discharge (base flow) from MODFLOW-NWT to SWAT. The most important steps of the coupled model are sketched in the flowchart of Fig. 4 and are described as follows:

Fig. 4 Flowchart of the modelling of the surface–groundwater interactions in SWAT-MODFLOW-NWT

  1. Because SWAT simulates the surface water processes at the (larger) HRU scale and MODFLOW-NWT at the (smaller) grid cell scale, the HRUs were disaggregated (called here DHRUs) using GIS pre-processing operations in such a manner that each part of a specific HRU has a unique geographic location. These DHRUs are then intersected with the underlying MODFLOW grid cells to make variable/flux exchanges possible between SWAT and MODFLOW-NWT.

  2. The SWAT-simulated deep percolation (recharge) is mapped onto the grid cells that fall within each DHRU and is then used as the driving recharge input to MODFLOW-NWT for computing the groundwater heads. It should be noted that the simplifications assumed for vadose/unsaturated zone processes by most hydrological models are resolved to a reasonable extent by SWAT. The percolation estimated by SWAT represents net infiltration/recharge, because the evapotranspiration that occurs within the unsaturated zone (denoted by “REVAP” in SWAT) is subtracted from the total percolation computed by SWAT. In other words, some portion of the total percolation/gross recharge is lost to unsaturated zone evapotranspiration and some fraction of the percolating water is retained by the matric potential of the soil, so that only the water leaving the root zone towards the groundwater table is recognized as net infiltration, which then feeds the MODFLOW-NWT model. The SWAT stream network is also intersected with the MODFLOW grid cells. Based on Darcy’s law, the hydraulic conductance and the difference between the groundwater head and the river head dictate the direction and magnitude of the flow simulated by MODFLOW-NWT, i.e. groundwater discharge to or recharge from the river network. When groundwater discharge occurs, its amount is added to the SWAT-simulated streamflow along the corresponding stream reaches, whereas groundwater recharge (streamflow loss) is subtracted accordingly.

  3. Although SWAT is also able to compute the evapotranspiration from the shallow groundwater system, this option was not used in this study because of the oversimplified partitioning of the aquifer system into a shallow and a deep aquifer in SWAT. To that end, the two parameters used to estimate evapotranspiration from the shallow groundwater system in SWAT, namely REVAPMN.gw and GW_REVAP.gw (see Table 2), were set so as to minimize the evapotranspiration estimated by SWAT from the groundwater. In the present study, the evapotranspiration from the groundwater was therefore computed by the MODFLOW EVT package. Overall, however, owing to the over-utilization of groundwater in the GRB, the depth to water has increased markedly, which reduces the influence of groundwater evapotranspiration on the total water balance, particularly in the last decade of the simulation. Moreover, the groundwater evapotranspiration estimated by the Ardabil Regional Water Authority (2013) (Table 3) was used to check that the model output did not fall within an unreliable range.
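
The mapping of DHRU percolation onto grid cells described in the steps above can be sketched as follows. This is only an illustration under stated assumptions: the area-weighted averaging, the function name, and the data layout are hypothetical and are not taken from the coupled model's FORTRAN source.

```python
from collections import defaultdict


def map_recharge_to_cells(dhru_percolation, overlaps, cell_area):
    """Area-weighted transfer of DHRU deep percolation to grid cells.

    dhru_percolation : {dhru_id: deep percolation depth (mm/day)}
    overlaps         : list of (dhru_id, cell_id, overlap_area_m2) from a GIS intersection
    cell_area        : area of one grid cell (m2)
    Returns {cell_id: recharge rate (m/day)}.
    """
    volume = defaultdict(float)  # recharge volume per cell (m3/day)
    for dhru_id, cell_id, area in overlaps:
        depth_m = dhru_percolation[dhru_id] / 1000.0  # mm -> m
        volume[cell_id] += depth_m * area
    # Convert the accumulated volume back to a rate over the whole cell
    return {cell: vol / cell_area for cell, vol in volume.items()}
```

For example, a 200 × 200 m cell that is covered for one quarter by a DHRU percolating 2 mm/day and for three quarters by a DHRU percolating 4 mm/day would receive 3.5 mm/day in this sketch.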

Although time scales for groundwater flow processes are usually much larger than those for surface water, for which the time step is usually taken as one day in SWAT, the time step of the MODFLOW module is also forced to be one day in the coupled SWAT-MODFLOW model to guarantee proper and instantaneous coupling of surface water and groundwater. Because of this small and apparently redundant time step, and because MODFLOW is an implicit backward-in-time integration code, overall execution times of the present SWAT-MODFLOW model can be prohibitive for routine modelling of this kind. For this reason, the automatic calibration procedures available for SWAT, such as the Latin hypercube sampling methodology of the SWAT-CUP stochastic calibration procedure (Abbaspour et al. 2004, 2007), could only be applied at the expense of huge computational costs in the present application.

Because of these computational difficulties with the fully automatic calibration and validation of the integrated SWAT-MODFLOW-NWT model, these tasks are carried out for the GRB in three steps: first, SWAT is calibrated and validated on its own; second, MODFLOW-NWT is calibrated and validated independently of SWAT; and third, the integrated SWAT-MODFLOW-NWT model is calibrated and validated, employing information from the first two steps. Details of this consecutive calibration/validation exercise are provided in the following sub-sections, together with the input data required for the construction and calibration of the SWAT, MODFLOW, and coupled models.

SWAT-model calibration and validation

The SWAT model of the GRB was constructed and simulated for 1978–2012, considering the first 3 years as a warm-up period which is the time required to get a reasonable initial hydrological state (Daggupati et al. 2015). The main input data required for the SWAT model consist of climate data, topography, soil, and landuse maps. The climate data consisting of precipitation and minimum and maximum temperature were collected on a daily basis from both synoptic meteorology stations of the Iran Meteorological Organization and climatology stations of the Iran Ministry of Energy (Ardabil Regional Water Authority 2013). Soil, conditioned and corrected digital elevation model (DEM), and land use maps were obtained from the global soil map of the Food and Agriculture Organization of the United Nations (FAO 1995), and U.S. Geological Survey (USGS) (Lehner et al. 2008), and Iran Forests, Range and Watershed Management Organization, respectively. By overlaying these three layers, i.e. DEM, soil, and land use, 1778 HRUs were generated which yield 124 sub-basins according to the DEM and the locations of the five outlets (see Table 1).

Table 1 Characteristics of outlets considered for the calibration/validation scheme

As is common in calibration/validation exercises, the total available time period was partitioned into two intervals, namely 1988–2012 for calibration and 1978–1987 for validation. With regard to the available historical data records of the five outlets (see Fig. 1), as listed in Table 1, the calibration was conducted for all of them, whereas the validation could only be performed for two outlets.

Following recommendations in the literature (e.g. Abbaspour et al. 2007; Faramarzi et al. 2009; Daggupati et al. 2015), 17 parameters, particularly groundwater variables (Table 2), were chosen for automatic calibration by means of the sequential uncertainty fitting (SUFI-2) algorithm, a semi-automated inverse modelling technique for calibration as well as sensitivity and uncertainty analysis. Details of the methodology can be found in Abbaspour et al. (2004). In SUFI-2, the uncertainty is quantified by a set of simulations (here 500) that use different parameter values drawn from a set of calibrated parameter ranges. The output range capturing 95% of all simulations represents the uncertainty and is denoted as the 95% prediction uncertainty band (95PPU) (Andersson et al. 2012; Abbaspour et al. 2004). The 95PPU is computed at the 2.5% and 97.5% levels of the cumulative distribution of an output variable obtained through Latin hypercube sampling. The coefficient of determination (\({R^2}\)) (Eq. 1) was employed as the objective function, and the Nash–Sutcliffe efficiency (NS) (Eq. 2) was additionally computed for the simulations obtained with \({R^2}\) as the objective function.
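
How a 95PPU band follows from an ensemble of simulations can be sketched in a few lines. This is a minimal illustration, not SWAT-CUP code: the function names are hypothetical, and the linear interpolation between ranked values is an assumption about how the 2.5% and 97.5% levels are read off the cumulative distribution.

```python
def percentile(sorted_values, q):
    """Linearly interpolated percentile (q in 0..100) of an ascending list."""
    pos = (len(sorted_values) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1.0 - frac) + sorted_values[hi] * frac


def ppu95(simulations):
    """Lower and upper 95PPU bounds for each time step.

    simulations : list of runs (one per sampled parameter set),
                  each run being a list of outputs per time step.
    """
    lower, upper = [], []
    for step_values in zip(*simulations):  # all runs at one time step
        ordered = sorted(step_values)
        lower.append(percentile(ordered, 2.5))
        upper.append(percentile(ordered, 97.5))
    return lower, upper
```

In the setting described above, `simulations` would hold the 500 simulated discharge series, and the two returned lists would bound the band at every time step.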

Table 2 SWAT-model parameters considered for the calibration process
$${R^2}={\left( {\frac{{\mathop \sum \nolimits_{{i=1}}^{n} ({O_i} - \bar {O})({P_i} - \bar {P})}}{{\sqrt {\mathop \sum \nolimits_{{i=1}}^{n} {{\left( {{O_i} - \bar {O}} \right)}^2}} ~\sqrt {\mathop \sum \nolimits_{{i=1}}^{n} {{\left( {{P_i} - \bar {P}} \right)}^2}} }}} \right)^2},$$
(1)
$${\text{NS}}=1 - \left( {\frac{{\mathop \sum \nolimits_{{i=1}}^{n} {{\left( {{O_i} - {P_i}} \right)}^2}}}{{\mathop \sum \nolimits_{{i=1}}^{n} {{\left( {{O_i} - \bar {O}} \right)}^2}}}} \right).$$
(2)

where \({O_i}\) is the observed discharge/groundwater level at time i, \(\bar {O}\) is the average of the observed discharge/groundwater level over the period considered, \({P_i}\) is the simulated discharge/groundwater level at time i, and \(\bar {P}\) is the corresponding average of the simulated values. \({R^2}\) ranges from 0 to 1 (perfect fit) and indicates how much of the observed dispersion is explained by the simulation. \({\text{NS}}\) varies from \(-\infty\) to 1.0 (perfect fit), where \({\text{NS}}\) < 0 indicates that the mean of the observed series is a better predictor than the model (Nash and Sutcliffe 1970).
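
Equations 1 and 2 translate directly into code. The sketch below is a plain transcription of the two formulas; the function names and the sample series in the usage note are hypothetical.

```python
import math


def r_squared(obs, sim):
    """Coefficient of determination as in Eq. 1."""
    o_mean = sum(obs) / len(obs)
    p_mean = sum(sim) / len(sim)
    cov = sum((o - o_mean) * (p - p_mean) for o, p in zip(obs, sim))
    ss_o = sum((o - o_mean) ** 2 for o in obs)
    ss_p = sum((p - p_mean) ** 2 for p in sim)
    return (cov / (math.sqrt(ss_o) * math.sqrt(ss_p))) ** 2


def nash_sutcliffe(obs, sim):
    """Nash-Sutcliffe efficiency as in Eq. 2."""
    o_mean = sum(obs) / len(obs)
    residual = sum((o - p) ** 2 for o, p in zip(obs, sim))
    ss_o = sum((o - o_mean) ** 2 for o in obs)
    return 1.0 - residual / ss_o
```

A simulation that doubles every observed value, e.g. observations `[1, 2, 3, 4]` against simulations `[2, 4, 6, 8]`, is perfectly correlated (R² of 1) but strongly biased (NS of −5), which illustrates why NS is a useful complement to an R²-based objective function.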

More specifically, following a suggestion of Abbaspour et al. (2015), the calibration proceeded in two stages. First, the basin parameters (eight parameters denoted by the extension "bsn"), which are mostly recognized as water-generating parameters (see Table 2), were calibrated and then fixed in the model. Second, the other nine variables, namely the groundwater variables (.gw), were parameterized, calibrated, and validated for each of the five sub-basins (outlets) in order to facilitate the subsequent SWAT-MODFLOW calibration; for the validation, only two outlet stations could be used.

As indicated in Table 1, three outlets, namely O 48, O 76, and O 105, drain ephemeral/intermittent tributaries, and the other two, O 12 and O 95, behave more or less as perennial rivers, even though in the past decade (2000–2012) they have also become ephemeral during the non-rainy seasons, owing to an overall decrease in groundwater discharge, i.e. baseflow.

MODFLOW-NWT groundwater flow modelling

Model-setup and specification of boundary conditions

To construct the MODFLOW-NWT model of the Ardabil aquifer, the study area (see Figs. 1, 2, 3) was discretized into a grid of 276 rows and 232 columns in the north–south and west–east directions, respectively, with a cell size of 200 × 200 m. In the vertical direction, only one layer representing the unconfined aquifer was modelled (Fig. 3). Inasmuch as the main stresses, including discharge and recharge periods, act on the Ardabil aquifer on a seasonal time scale, a stress period of 3 months (one season) was assumed. Each stress period was further divided into six time steps, each with a length of 0.5 months, following the suggestion of Anderson et al. (2015), who indicate that when a total pumping period is divided into six time steps, the numerical solution agrees satisfactorily with the analytical solution.
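
The size of this discretization can be checked with a short calculation. The sketch below uses only the grid and stress-period figures quoted above; the function names are hypothetical, and applying it to the years 1988–2012 is an assumption made for the usage note.

```python
def grid_summary(n_rows, n_cols, cell_size_m):
    """Return (number of cells, north-south extent in km, west-east extent in km)."""
    n_cells = n_rows * n_cols
    ns_extent_km = n_rows * cell_size_m / 1000.0
    we_extent_km = n_cols * cell_size_m / 1000.0
    return n_cells, ns_extent_km, we_extent_km


def stress_periods(start_year, end_year, months_per_period, steps_per_period):
    """Count stress periods and time steps over whole years (both years inclusive).

    Returns (number of stress periods, number of time steps, step length in months).
    """
    n_months = (end_year - start_year + 1) * 12
    n_periods = n_months // months_per_period
    step_length = months_per_period / steps_per_period
    return n_periods, n_periods * steps_per_period, step_length
```

Calling `grid_summary(276, 232, 200.0)` gives 64,032 cells in a rectangle of 55.2 km by 46.4 km, and `stress_periods(1988, 2012, 3, 6)` gives 100 seasonal stress periods with 600 time steps of 0.5 months each.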

Boundary conditions of the model were assigned on the basis of the geological formations encompassing the aquifer and the observed groundwater contour pattern (see Fig. 2). Flow/head boundary (FHB) conditions were specified along most sections of the Ardabil aquifer boundary by means of the FHB package, which allows the general attribution of no-flow, inflow, and outflow boundaries, as drawn in Figs. 2b and 3. More specifically, the groundwater inflow, outflow, and no-flow sections of the boundary were identified from the shape of the groundwater contour map drawn for the initial condition. Afterwards, the groundwater inflow and outflow were computed by means of Darcy’s law from the head difference between two successive groundwater contours crossing the boundary and the hydraulic conductivity of the porous material (see Fig. 2a). As the simulated groundwater heads are updated within each stress period, the calculated groundwater inflow and outflow are updated accordingly. In this respect, the north-west boundary of the aquifer was assigned as an outflow boundary, where not only is all surface water drained out through the main outlet of the basin (O 12) (Fig. 1), but groundwater outflow also takes place. Parts of the aquifer boundary that are in contact with impervious formations were treated as no-flow boundaries, while the pattern of the groundwater contours along other parts (see Fig. 2b) hints at some groundwater inflow into the aquifer. In addition to the specified flow boundaries, time-variant specified heads (CHD package), i.e. Dirichlet conditions, were defined for boundary cells where historical observed groundwater heads are available, e.g. P18.
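
The Darcy-type boundary flux described above can be written compactly. The symbols below are introduced here only for illustration and are not the authors' notation:

$$Q_b = K\,b\,w\,\frac{h_1 - h_2}{\Delta l},$$

where \(Q_b\) is the flow across a boundary segment, \(K\) the hydraulic conductivity, \(b\) the saturated thickness, \(w\) the width of the segment, \(h_1\) and \(h_2\) the heads of two successive contours, and \(\Delta l\) the distance between them. With purely hypothetical values of \(K = 10\) m day⁻¹, \(b = 50\) m, \(w = 200\) m, \(h_1 - h_2 = 5\) m, and \(\Delta l = 1000\) m, the segment would carry \(Q_b = 500\) m³ day⁻¹, counted as inflow when the head decreases towards the interior of the aquifer and as outflow otherwise.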

As mentioned in Sect. “Study area”, the largest external stress on the Ardabil aquifer originates from groundwater pumping by the huge number of farming wells as well as qanats, which has led to groundwater overdraft, mainly in the dry summer season, and which increased from 35 MCM in 1978 to 160 MCM in 2012. This expansion in pumping was implemented in the groundwater flow model by increasing the number of pumping wells from 600 at the beginning to 2400 at the end of the simulation period.

Based on these specifications of boundary conditions and stresses, various packages of the MODFLOW-core model were enabled in the present study, such as the basic, time-variant specified head (CHD), flow and head boundary (FHB), river, well (well-pumping, qanats and springs), recharge, evapotranspiration, head observation, upstream weighting (UPW), and Newton solver (NWT).

The average groundwater heads for winter 1988 (January–March) were employed to represent the initial condition and the model was calibrated for that period to ensure a steady-state situation.

To construct the groundwater flow model and in order to later prepare the input files required for the coupled model, the graphical user interface ModelMuse (Winston 2009) was used.

MODFLOW calibration and validation

Reliable and physically sensible initial parameter values can noticeably reduce the computational cost of calibration (Hill and Tiedeman 2007). Thus, initial hydraulic parameters of the aquifer, i.e. hydraulic conductivity, transmissivity, and storativity, were extracted and interpreted from around 50 pumping tests (Ardabil Regional Water Authority 2013) distributed evenly across the Ardabil aquifer. The values for the dominant sources and sinks of the Ardabil aquifer (Table 3), obtained from the detailed investigations by the Ardabil Regional Water Authority (2013) and Abshileh Engineering Consulting Firm (http://www.abshileh.com/), were treated as initial inputs for the model set-up and were then refined within a hydrologically reasonable range during the subsequent trial-and-error calibration procedure.

Table 3 Values representing prevailing sources and sinks active in Ardabil Aquifer used as the initial inputs to set-up and calibrate MODFLOW-NWT model (Ardabil Regional Water Authority 2013)

The MODFLOW-NWT calibration/validation periods were identical to those of SWAT, and transient head observations of 38 and 35 piezometers were used as calibration and validation targets, respectively. Since groundwater flow modelling with MODFLOW-NWT is not the central objective of the present study, details of the calibration/validation methodology, which was basically a trial-and-error approach, are omitted here.

SWAT-MODFLOW calibration and validation of river–groundwater interchanges

Once both SWAT and MODFLOW-NWT were individually calibrated and validated, the coupled SWAT-MODFLOW model for the Ardabil aquifer (Fig. 2) was set up, following the procedure described earlier and outlined in the flowchart of Fig. 4. Because the two-way flux exchange between the river network and the adjacent Ardabil aquifer takes place through the beds and banks of the gaining and losing streams, the success of the coupled model calibration depends largely on the proper quantification of this interaction. The direction and quantity of the flux are mainly determined by the difference between groundwater head and river stage and also depend on the hydraulic conductance of the porous materials between the groundwater and river network domains.

As this process is largely determined by the value of the hydraulic conductance \({\text{CRI}}{{\text{V}}_n}\) in the river-aquifer boundary layer (specified in the MODFLOW river package), calibration of this parameter is the major task here. To that end, the rivers/streams incised across the modeled area were intersected with the MODFLOW grid cells, resulting in a total of 2581 reaches.

The basis of the river-aquifer bed conductance calibration is the fundamental equation (Darcy’s law) that describes the volumetric flow \({\text{QRI}}{{\text{V}}_n}\) (\({{\text{L}}^{\text{3}}}{{\text{T}}^{ - 1}}\)) between a river section and the adjacent groundwater aquifer (McDonald et al. 1988):

$${\text{QRI}}{{\text{V}}_n}={\text{CRI}}{{\text{V}}_n}({\text{HRI}}{{\text{V}}_n} - {h_{i,~j,~k}}),$$
(3)

where \({\text{CRI}}{{\text{V}}_n}\), the hydraulic conductance of the river-aquifer interconnection bed (\({{\text{L}}^2}{{\text{T}}^{ - 1}}\)), is defined as follows:

$${\text{CRI}}{{\text{V}}_n}=\frac{{{{\text{K}}_n}{{\text{L}}_n}{{\text{W}}_n}}}{{{{\text{M}}_n}}},$$
(4)

where \({\text{HRI}}{{\text{V}}_n}\) represents the water head in the river at stream section n; \({h_{i,~j,~k}}\) denotes the groundwater head in the grid element (i, j, k) in the adjacent aquifer; \({{\text{L}}_n}\) is the length of the reach through the grid cell (i, j, k); \({{\text{W}}_n}\) is the river width; \({{\text{M}}_n}\) is the thickness of the riverbed layer; and \({{\text{K}}_n}\) is the hydraulic conductivity of the riverbed material.
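Equations (3) and (4) can be sketched in a few lines of Python. This is a minimal illustration of the two formulas as written here, not the MODFLOW river package itself; the function names and the reach values are hypothetical.

```python
def river_conductance(k_bed, reach_length, river_width, bed_thickness):
    """Eq. (4): CRIV_n = K_n * L_n * W_n / M_n, in L^2 T^-1."""
    return k_bed * reach_length * river_width / bed_thickness


def river_aquifer_flux(conductance, river_head, aquifer_head):
    """Eq. (3): QRIV_n = CRIV_n * (HRIV_n - h_ijk), in L^3 T^-1.

    Positive: the river loses water to the aquifer (losing reach).
    Negative: the aquifer discharges to the river (gaining reach).
    """
    return conductance * (river_head - aquifer_head)


# Hypothetical reach: K = 0.5 m/day, L = 200 m, W = 4 m, M = 1 m.
criv = river_conductance(k_bed=0.5, reach_length=200.0,
                         river_width=4.0, bed_thickness=1.0)
# Groundwater head 0.5 m above the river stage, so the reach is gaining.
qriv = river_aquifer_flux(criv, river_head=1350.0, aquifer_head=1350.5)
print(criv, qriv)  # 400.0 -200.0
```

Since the flux scales linearly with the conductance, adjusting \({\text{CRI}}{{\text{V}}_n}\) during calibration rescales the exchange for a given head difference without changing its direction.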

As the conductance is adjusted during the calibration process, the groundwater exchange with the stream (GWQ) is altered. GWQ is taken as positive for groundwater discharge to the stream, which is the normal surface–groundwater interchange process in the semi-arid study area, and as negative for groundwater recharge from the stream. This change must be accounted for in the surface water balance, so that the water yield of the SWAT-model is modified in the SWAT-MODFLOW model to:

$${\text{Water yield}}={\text{SURQ}}+{\text{LATQ}}+{\text{TILEQ}}+{\text{GWQ,}}$$
(5)

where SURQ is the surface runoff, LATQ is the lateral flow, TILEQ is the tile drain flow, and GWQ is the groundwater discharge/recharge. This changing water yield affects the average river head, \({\text{HRI}}{{\text{V}}_n},\) along a reach section in Eq. (3), computed in SWAT, which in turn alters the groundwater-stream discharge/recharge.
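A minimal Python sketch of Eq. (5) shows how the sign of GWQ changes the water yield. The function name and the numbers are hypothetical, and the components are assumed to be expressed in the same depth unit (e.g. mm).

```python
def water_yield(surq, latq, tileq, gwq):
    """Eq. (5): water yield = SURQ + LATQ + TILEQ + GWQ.

    gwq > 0: groundwater discharge to the stream adds to the yield.
    gwq < 0: stream seepage to the aquifer reduces the yield.
    """
    return surq + latq + tileq + gwq


# Hypothetical monthly values (mm).
gaining = water_yield(surq=12.0, latq=3.0, tileq=0.0, gwq=5.0)
losing = water_yield(surq=12.0, latq=3.0, tileq=0.0, gwq=-0.5)
print(gaining, losing)  # 20.0 14.5
```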

Finally, as groundwater is (most likely) lost to the streams and extracted by the pumping wells, qanats, and springs, its storage (GW) in the aquifer is reduced. Because of the importance of this quantity, GW is also computed in the coupled model by multiplying the water-saturated volume of the unconfined aquifer layer by its specific yield.
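The storage calculation can be sketched as a sum over grid cells. The cell-wise formulation, the uniform cell area and specific yield, and all values below are assumptions for illustration; the source only states that the saturated volume is multiplied by the specific yield.

```python
def groundwater_storage(heads, bottoms, cell_area, specific_yield):
    """Drainable storage of an unconfined layer (m3).

    heads, bottoms  simulated head and layer bottom elevation per cell (m)
    cell_area       plan area of one grid cell (m2), assumed uniform
    specific_yield  dimensionless, assumed uniform
    """
    storage = 0.0
    for head, bottom in zip(heads, bottoms):
        # A cell whose head lies below the layer bottom holds no drainable water.
        saturated_thickness = max(head - bottom, 0.0)
        storage += saturated_thickness * cell_area * specific_yield
    return storage


# Three hypothetical cells; the third one is dry.
gw = groundwater_storage(heads=[1340.0, 1320.0, 1295.0],
                         bottoms=[1300.0, 1300.0, 1300.0],
                         cell_area=1000.0, specific_yield=0.25)
print(gw)  # 15000.0
```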

Results and discussion

SWAT parameterization, calibration, and validation

The final values of the basin parameters and the ranges of the parameterized-calibrated values as based on the 95PPU-criterion implemented in SUFI2 (Abbaspour et al. 2004) are listed in Table 4. It may be noted that while the calibrated CN-parameters for the sub-basins shown are close to the reference values, some of the calibrated groundwater parameters are further away from their default values, indicating again the inherent limitation of SWAT to properly model the groundwater component of the hydrological cycle (Srivastava et al. 2006).

Table 4 Values and ranges of the calibrated parameters in the basin and sub-basin levels

The calibration of the SWAT-model turned out to be a considerable challenge for a small basin with ephemeral rivers, because many processes, e.g. agricultural and urban water consumption, should be taken into consideration in order to capture the uncertainties that may exist in the form of process simplification, processes not accounted for by the model, and processes in the watershed that are unknown to the modeler (Abbaspour et al. 2007). Thus, the calibration of the ephemeral rivers (outlets O 48, O 76, O 105; < 0.5 m³ s⁻¹) and also of the perennial rivers with small magnitudes of discharge (outlets O 12 and O 95) was difficult to achieve. Nevertheless, as listed in Table 5, coefficients of determination \({R^2}\) > 0.5 and Nash–Sutcliffe efficiencies \({\text{NS}}\) ≥ 0.5 could still be obtained for the calibration periods.
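For reference, the two goodness-of-fit measures can be computed as in the following minimal Python sketch. It uses the standard textbook definitions, which are assumed (not confirmed by the text) to be the ones behind Table 5; the function names and the sample series are hypothetical.

```python
def nash_sutcliffe(observed, simulated):
    """NS = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


def r_squared(observed, simulated):
    """Squared Pearson correlation between observed and simulated values."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_sim = sum(simulated) / n
    cov = sum((o - mean_obs) * (s - mean_sim)
              for o, s in zip(observed, simulated))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_sim = sum((s - mean_sim) ** 2 for s in simulated)
    return cov ** 2 / (var_obs * var_sim)


# Hypothetical discharges (m3/s): the simulation is biased high by a constant.
obs = [1.0, 2.0, 3.0, 4.0]
sim = [2.0, 3.0, 4.0, 5.0]
print(round(nash_sutcliffe(obs, sim), 3), r_squared(obs, sim))  # 0.2 1.0
```

The example illustrates why both measures are reported: a constant bias leaves \({R^2}\) at 1 but lowers \({\text{NS}}\) markedly.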

Table 5 Nash–Sutcliffe efficiency (\({\text{NS}}\)) and coefficients of determination (\({R^2}\)) for SWAT and SWAT-MODFLOW calibration of discharge at five outlets and validation at two stations

It is interesting to note from Table 5 that the SWAT-modeled streamflows, especially, for the main outlet of the basin (O 12, see Fig. 1), agree even better with the observed ones for the validation period.

This may be attributed to the fact that the early period of the streamflow time series set aside for the validation (1978–1987) had not yet been much affected by man-made activities, such as land use changes, groundwater over-utilization, and increasing water consumption in the agricultural, industrial, and domestic sectors. In the more recent calibration period (1988–2012), these activities have reduced the groundwater discharge over time, irrespective of the precipitation situation. Indeed, the observed and modelled streamflow hydrographs (Fig. 5) indicate that the SWAT-model is not always able to capture all the streamflow variability over time, particularly for the small streamflow magnitudes in the ephemeral rivers, most likely owing to the non-availability of data on the mentioned anthropogenic impacts on the natural regime of the river basin.

Fig. 5

Observed and simulated river discharge using SWAT and SWAT-MODFLOW models in calibration and validation steps for the five outlets, which are distributed across the SWAT-MODFLOW modelled area, i.e. the aquifer area

SWAT-MODFLOW calibration and validation

The SWAT-MODFLOW calibration was carried out by comparing simulated discharges and groundwater heads with observed ones and subsequently adjusting the hydraulic conductances of the corresponding stream reaches. The initial results of the coupled model showed that the simulated groundwater heads quickly became satisfactory and that the simulated streamflow reacted strongly to changes of groundwater discharge, i.e. of the river bed conductance, thus allowing for a good calibration of the latter. Therefore, the calibration of SWAT-MODFLOW was achieved using adjustments made only to the hydraulic conductances, whereas the parameters that contributed to the independent calibration of SWAT and MODFLOW-NWT were left unchanged.

Thus, it could be inferred that the individual calibration of such models can speed up the calibration process of a coupled model by adjusting fewer parameters, which in turn leads to less uncertainty. On the other hand, if a quite poor performance of a coupled model, assessed against streamflow and groundwater heads, is obtained during the first simulations, it demonstrates the necessity of a recalibration of each model independently before being coupled.

Both the \({R^2}\) and \({\text{NS}}\) values (Table 5) and the hydrographs (Fig. 5) indicate that, despite its higher complexity, the SWAT-MODFLOW integrated model did not improve the simulation of the streamflow. Many of the tributaries of this basin are ephemeral and intermittent streams (Table 1), which means that only a small fraction of the total streamflow is sustained by groundwater discharge (return flow or base flow). Thus, from the point of view of the river regime, the coupled SWAT-MODFLOW model does not have an appreciable impact on the simulation of the total streamflow at the various outlets of this basin.

In contrast, other coupled SWAT-MODFLOW studies, e.g. Bejranonda et al. (2007); Kim et al. (2008); Chung et al. (2010); Luo and Sophocleous (2011); Guzman et al. (2015) and Bailey et al. (2016), have shown an improvement of the modelled streamflow relative to using SWAT alone. This discrepancy can be attributed to the areas where these studies were conducted: they were undertaken in basins with perennial rivers in which a major fraction of the streamflow is contributed by groundwater discharge/baseflow, whereas in the present application area a large portion of the streamflow is supplied by direct surface runoff.

However, the skill of the SWAT-MODFLOW integrated model becomes obvious when the modelled groundwater heads are considered, as indicated by the various panels of Fig. 6 and the calculated \({R^2}\) and \({\text{NS}}\), where the 68 observed and simulated groundwater heads are plotted against each other for both the calibration (1988–2012) and validation (1978–1987) periods using either MODFLOW-NWT alone or the coupled SWAT-MODFLOW model. Figure 6 illustrates that both models could simulate the groundwater heads quite satisfactorily (\({R^2}\) and \({\text{NS}}\) > 0.9), surprisingly even better in the validation period.

Fig. 6

Comparison of observed and simulated groundwater levels for: (a) the MODFLOW-NWT model during calibration; (b) SWAT-MODFLOW-NWT during calibration; (c) MODFLOW-NWT during validation; and (d) SWAT-MODFLOW-NWT during validation

The advantage of the use of the coupled model over the separate models becomes even clearer from the groundwater head time-series plotted in Fig. 7 for a few piezometers. One can notice that whereas the groundwater heads for piezometer P18 have been simulated quite satisfactorily by both the MODFLOW-NWT and the SWAT-MODFLOW models, this is not the case for P23, where the standalone MODFLOW-NWT model performs poorly when compared with the coupled model. In the former case, this may be due to the fact that P18 is located at the aquifer boundary (see Fig. 2) and is used to define the CHD (time-dependent) boundary condition over that section of the aquifer boundary, whereas heads for P23, located in the center of the aquifer (Fig. 2), are not affected by CHD.

Fig. 7

Observed and MODFLOW-NWT- and SWAT-MODFLOW-simulated groundwater heads for two piezometers for the calibration period

Even so, because the highly spatialized SWAT-estimated recharge and the river heads are passed to MODFLOW-NWT, and because the hydraulic conductance is calibrated, the observed groundwater heads at piezometer P23 are well simulated by SWAT-MODFLOW. The standalone MODFLOW-NWT model, in contrast, is able to mimic the fluctuations and the trend of the observed groundwater heads, but systematically underestimates them. This discrepancy for MODFLOW-NWT most likely resulted from (1) applying a fixed percentage of the precipitation (7%) as recharge evenly distributed over the aquifer (lumped-estimated recharge) (Ardabil Regional Water Authority 2013) and (2) using a fixed value for the river head in all stream reaches incised into the aquifer, owing to the non-availability of measured river head data in the GRB.

To conclude this section, the simulated groundwater heads, unlike the streamflow, have been notably improved by using the coupled SWAT-MODFLOW model, which is most likely due to the more accurate, spatially distributed SWAT-modeled recharge feeding the MODFLOW-NWT model, as has also been indicated in a similar study conducted by Chung et al. (2010).

Spatial and temporal variability of surface–groundwater interactions

The SWAT-MODFLOW-simulated surface–groundwater interactions are presented in Fig. 8 in terms of the gaining (groundwater discharge/baseflow) and losing (groundwater recharge) stream reaches, marked by red and blue bars, respectively. Owing to the extreme difference between the total discharge to the streams (groundwater discharge) and the total recharge from the streams (groundwater recharge), the maximum of the red bars is 623 m³ day⁻¹, but only 0.5 m³ day⁻¹ for the blue ones.

Fig. 8

Gaining (red bars, negative, groundwater discharge) and losing (blue bars, positive, groundwater recharge) river sections in the SWAT-MODFLOW model area with average daily rates for the calibration time period 1988–2012. The two ovals show the stream sections with the highest gains (red) and losses (blue). Note the large differences of the scales for the river gains (max = 623 m³ day⁻¹) and losses (max = 0.37 m³ day⁻¹)

Figure 8 illustrates that most tributaries in the modelled aquifer area serve as losing (influent) rivers (blue bars), where groundwater is recharged by seepage of water through the beds and banks of the streams. However, the total amount of groundwater recharge towards the aquifer in these sections is still very low, unlike the groundwater discharge from the gaining reaches. The latter occurs mainly in the central part of the river network and, particularly, near the main outlet of the basin, where the topography is nearly flat and groundwater heads are high and close to the land surface. Here the river bed conductances are also very high, which in turn leads to more groundwater discharge towards the streams. A similar behaviour has been found by Baalousha (2012) in the flat-plain Ruataniwha basin, New Zealand, i.e. in such a terrain, rivers gain much more water from the groundwater aquifer (effluent rivers) than they lose to the latter (influent rivers). When all gaining and losing stream reaches are summed up separately along the stream network, a total average daily groundwater discharge of 63,416 m³ day⁻¹ and a total groundwater recharge of 81 m³ day⁻¹ are obtained.
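The separate summation of gaining and losing reaches can be sketched as follows. The sketch assumes the sign convention of Eq. (3) and the caption of Fig. 8 (negative for gaining, positive for losing); the function name and the reach fluxes are hypothetical and only mimic the pattern of many weakly losing reaches and a few strongly gaining ones.

```python
def summarize_exchange(reach_fluxes):
    """Split per-reach river-aquifer fluxes (m3/day) into gains and losses.

    Negative flux: gaining reach (groundwater discharge to the stream).
    Positive flux: losing reach (groundwater recharge from the stream).
    """
    gaining = [q for q in reach_fluxes if q < 0]
    losing = [q for q in reach_fluxes if q > 0]
    return {
        "n_gaining": len(gaining),
        "n_losing": len(losing),
        "total_discharge": -sum(gaining),  # reported as a positive volume
        "total_recharge": sum(losing),
    }


# Hypothetical fluxes: three weakly losing reaches, two strongly gaining ones.
summary = summarize_exchange([-500.0, -20.0, 0.25, 0.5, 0.25])
print(summary)
# {'n_gaining': 2, 'n_losing': 3, 'total_discharge': 520.0, 'total_recharge': 1.0}
```

Counting reaches and summing volumes separately makes explicit that a majority of losing reaches is compatible with discharge dominating the overall exchange.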

The annual accumulated computed groundwater discharges and recharges are illustrated together with the precipitation over the calibration period 1988–2012 in Fig. 9. One notices that while the groundwater discharges follow a smoother, more long-term behaviour, the recharges from the stream reaches oscillate more and nearly follow the variations of the annual precipitation pattern across the basin. The groundwater discharge towards the river network increased steadily from the troughs witnessed during the 1988–1991 drought period and reached its maximum about 2 years after the precipitation had returned to normal conditions and even peaked in 1993. Interestingly, after that time, groundwater discharge has been steadily decreasing, irrespective of the precipitation amount. This is most likely due to the ongoing over-exploitation of the groundwater over the past three decades, from 35 MCM in 1978 to 160 MCM in 2012 (Ardabil Regional Water Authority 2013), which has resulted in concomitant strong head drawdowns (see Fig. 7). This overall decrease of the SWAT-MODFLOW simulated groundwater discharge is also confirmed by both the observed and simulated streamflow at the five outlets, notably at the main basin outlet O 12 (Fig. 5). Therefore, it can be noted that while the river network still behaved as a perennial system in the initial years of the simulation period, it has been converted more or less to an intermittent/ephemeral river system over the years, owing to the reasons given above.

Fig. 9

(a) Annual precipitation with long-year average for the Ghrehsoo River Basin; (b) SWAT-MODFLOW simulated total annual groundwater discharge/recharge to/from the river network. Note the different scales for discharge and recharge

As mentioned earlier, the simulated groundwater recharge from the stream reaches in Fig. 9b is more oscillatory and somewhat in phase, although delayed, with the precipitation variability (Fig. 9a). For instance, groundwater recharge reached its minimum in 1995 when a severe drought occurred. On the other hand, in the relatively wet years 2000 and 2007, groundwater recharge increased again.

The three panels of Fig. 10 show the simulated variations of groundwater discharge, groundwater storage, and water yield on a monthly scale. For the groundwater discharge (Fig. 10a), one can clearly notice the seasonal, somewhat sinusoidal pattern embedded in the long-term decreasing trend after 1994, as already mentioned. Because of the well-known inertia of the groundwater aquifer (Markovic and Koch 2015), the computed groundwater storage (Fig. 10b) reacts as a low-pass filter on a longer (annual) time scale within the earlier-indicated long-term decreasing trend. This, in turn, should decrease the groundwater discharge to the stream network of the basin and thus lead to a long-term drop of the water yield of the basin. However, the time series of the water yield in Fig. 10c does not exhibit any particular trend in that regard. This may be because the reduction in groundwater discharge witnessed over the last decade has been counterbalanced by sufficient volumes of direct surface runoff and lateral flow generated under normal precipitation conditions. It should be noted, however, that this compensation is possible because only a small fraction of the total streamflow is supplied by baseflow/groundwater discharge in this region.

Fig. 10

SWAT-MODFLOW-simulated monthly oscillations of the groundwater discharge (a), storage (b), and water yield (c)

Conclusions

In the current study, surface and groundwater interactions in the GRB were analysed by an integration/coupling of the SWAT surface hydrological model with the MODFLOW-NWT groundwater flow model. By doing so, the first three goals raised in the introduction section could be achieved.

First, the results indicate that the coupled SWAT-MODFLOW model delivers more accurate groundwater heads than the MODFLOW-NWT standalone model. However, this is not necessarily true for the simulated streamflow where a better performance of SWAT-MODFLOW cannot be detected.

Second, it can be concluded that the independent calibration of each model (SWAT and MODFLOW) can expedite the calibration of the coupled model. This can be regarded as a convenient approach to overcome the computational difficulties with the automatic calibration of the coupled model.

Third, the analysis of the various simulated components of the hydrological cycle at the surface/subsurface interface shows that the groundwater discharge towards the stream network (gaining streams) is the dominant surface–groundwater interaction process in the GRB, whereas groundwater recharge from the rivers (losing streams) is up to three orders of magnitude lower. This is due to the fact that the modelled area is located in a relatively flat plain where the groundwater heads are generally close to the land surface.

Moreover, the observed data indicate a sharp drop of the groundwater heads over the simulation period 1991–2012, which hints at an extreme over-exploitation of groundwater over the past two decades; abstraction grew from 35 MCM in 1978 to 160 MCM in 2012. This has led to a concomitant steady decrease of the simulated groundwater storage over the years, which in turn has somewhat reduced the inflow to the streams during the named period. However, the effect on the SWAT-MODFLOW-simulated streamflow has overall still been minor, as surface water (runoff) to the stream network continued to be relatively high over the past decade, due to sufficient precipitation. Nevertheless, the seasonal hydrological behaviour of the river network has shifted from a formerly mostly perennial to an intermittent/ephemeral river system. As most of the irrigation scheduling in the GRB and, specifically, the Ardabil aquifer operates within the non-rainy seasons, namely spring and summer, sustaining the streamflow during these non-rainy periods is crucial for both environmental and agricultural needs.

In conclusion, the present investigation indicates that, notwithstanding the added intricacies of calibrating the coupled surface–groundwater flow model SWAT-MODFLOW, the latter is able to provide a higher spatio-temporal resolution of the major surface–subsurface hydrological processes, which control the availability of water resources in a basin, than is possible by running either one of the two models alone. Therefore, the provided SWAT-MODFLOW model is able to assess the impacts of a wide variety of stresses acting on the surface–subsurface interface of the hydrological cycle, including climate and land use change and groundwater over-draft, all of which exert uncertain and sometimes enormous pressures on the sustainability of water resources. Under these circumstances, the present coupled surface–groundwater hydrological model can be an effective tool for selecting appropriate proactive and reactive measures to mitigate such adverse impacts on the water resources in a basin, by providing a better management of the latter.

One of the limitations of this study is the calculation of flux interactions in the coupled model, which is quite time-consuming, especially for long simulation periods such as the one in our study (25 years). In this study, SWAT-MODFLOW operated on a daily basis and required 17 h for a single simulation. This made it impossible to apply an automatic calibration based on an iterative procedure. We tried to partially overcome this problem by calibrating each model independently. However, given the importance of high temporal resolution (daily) output in computationally expensive integrated modelling, the validity of the coupled model and the uncertainty of its parameters should be assessed and discussed in more detail by taking advantage of uncertainty-based optimization algorithms. Future studies can embed automatic calibration algorithms in the framework of SWAT-MODFLOW, with specific features and possibilities to execute parallel simulations on cluster servers to accelerate the calibration process.
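The runtime constraint can be made concrete with a rough estimate. In the Python sketch below, only the 17 h per simulation comes from this study; the number of runs, the number of parallel workers, and the assumption that runs within a batch are fully independent are hypothetical.

```python
import math


def calibration_wall_time_hours(n_runs, hours_per_run=17.0, n_workers=1):
    """Rough wall-clock time for n_runs simulations of equal length.

    Assumes runs can be distributed over n_workers without overhead,
    so the time is the number of sequential batches times the run time.
    """
    batches = math.ceil(n_runs / n_workers)
    return batches * hours_per_run


# Hypothetical automatic calibration with 500 model runs.
serial = calibration_wall_time_hours(500)                  # 8500.0 h
parallel = calibration_wall_time_hours(500, n_workers=50)  # 170.0 h
print(serial / 24.0, parallel / 24.0)  # wall-clock time in days
```

Under these assumptions a serial calibration would take roughly a year of computing time, whereas 50 parallel workers would reduce it to about a week, which illustrates why parallel execution is suggested for future work.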