1 Introduction

Physically-based spatially-distributed (PBSD) hydrological models are increasingly used because of their capacity for evaluating the impacts of future climate and land-use changes on river basins (Bovolo et al. 2009; Bekele and Knapp 2010; Birkinshaw et al. 2011; Shi et al. 2013). One of the major difficulties with these models is determining the values of the most important parameters needed to represent a particular basin. Theoretically, these parameters should be obtainable from catchment data; in practice, however, this is not the case because of unaffordable costs, experimental constraints or scaling problems (Beven et al. 1980). Calibration is therefore necessary for river basin planning and management studies. The calibration of a PBSD model is complex and expensive because of the sophisticated model structure, heavy computational requirements and large number of calibration parameters (Blasone et al. 2007). Successful manual calibration requires rigorous and purposeful parameterisation (Refsgaard 1997) and a well-trained modeller. It is subjective, tedious and very time-consuming, which makes an extensive analysis of the model calibration quite difficult. This paper therefore proposes the use of an automatic method (based on Shuffled Complex Evolution) to calibrate the SHETRAN model.

Ewen and Parkin (1996) proposed a “blind” model validation procedure for this model, with no calibration allowed, to quantify the uncertainty of predicted features for a particular application. In practice, there are various approximations in the model design which degrade its physical basis, so that some level of adjustment of the model parameters is required. The SHETRAN model is mostly calibrated manually by adjusting the principal calibration parameters on the basis of physical reasoning (Mourato 2010; Bathurst et al. 2011; Birkinshaw et al. 2011). This can be handled easily in small basins with homogeneous characteristics, such as elevation, slope, land-use and soil type, but it is much more complicated for large basins with more heterogeneous characteristics.

Studies have shown that the Shuffled Complex Evolution (SCE-UA) algorithm, developed by Duan et al. (1992), is an effective and efficient global optimization method for the calibration of PBSD models like SWAT (Eckhardt and Arnold 2001), MIKE SHE (Madsen 2003), WESP (Santos et al. 2003) and GW (Blasone et al. 2007). The SCE-UA method has a great potential to solve the problems accompanying the automatic calibration of PBSD models, due to its robustness in the presence of different parameter sensitivities and parameter interdependence and its capacity for handling high-parameter dimensionality. Santos et al. (2003) introduced new evolution steps in SCE-UA, which speed up the parameter searching processes. They also demonstrated that the final results from the Modified Shuffled Complex Evolution (MSCE) are independent of the initial parameter values, which facilitates its application.

This paper aims to demonstrate the applicability and efficiency of the MSCE in calibrating the SHETRAN model for a semi-arid, middle-sized basin in an area of active desertification. The work is carried out in the context of the anti-desertification effort in southern Portugal. The results will be used in predicting the impacts of climate and land-use changes on basin hydrology and soil erosion in areas undergoing desertification.

2 Cobres Basin

This study is carried out on the part of the Cobres river basin situated above the Monte da Ponte gauging station. The basin is semi-arid and middle-sized, with an area of 705 km2, and is located in the Alentejo province of southern Portugal (37°28′N–37°57′N, 8°10′W–7°51′W, Fig. 1), an area suffering from desertification (Bathurst et al. 1996). It is a region of relatively low relief, with the elevation varying from 103 to 308 m above sea level. Nine types of soil are identified, of which the main types are red or yellow Mediterranean soil of Schist origin (Vx soil), brown Mediterranean soil of Schist or Greywacke origin (Px soil) and lithosols from semi-arid and sub-humid climate of Schist or Greywacke origin (Ex soil), occupying respectively 20 %, 45 % and 26 % of the basin area. The soils are thin, with depth varying from 10 to 50 cm. Four types of land-use are identified, of which the predominant types are crop (70 %) and agroforestry (27 %). The climate in this region is characteristically Mediterranean and Continental, with moderate winters, hot and dry summers, a high daily temperature range, and a weak and irregular precipitation regime; mean annual precipitation at rain gauge stations in the region varies between 400 and 900 mm, with around 50 to 80 rainy days per year (Ramos and Reis 2001). The mean annual potential evapotranspiration (PET) is around 1300 mm.

Fig. 1 Location map, SHETRAN grid network and channel system (heavy blue lines, representing all channel links, and light blue lines, representing the links used to extract simulated discharges at basin outlet and internal gauging stations) for the Cobres basin, showing the rain gauges and gauging stations (the blue triangles at outlet, northern and central parts of the basin, are respectively Monte da Ponte, Albernoa and Entradas gauging stations). The grid squares have dimensions of 2 km × 2 km

3 Methods and Data

3.1 SHETRAN Modelling System

SHETRAN (http://research.ncl.ac.uk/shetran/) is a PBSD modelling system for water flow and sediment and contaminant transport in river catchments (Ewen et al. 2000; Birkinshaw et al. 2010). The physical processes are modelled by finite difference representations of the partial differential equations of mass and energy conservation or by empirical equations. The basin is discretized horizontally by an orthogonal grid network and vertically by a column of layers at each grid square; the river network is simplified as links that run along the edges of the grid squares.

Herein, calibration is considered only for the water flow component of SHETRAN (v4.4.0). The model represents the physical processes of the hydrological cycle through: (1) interception, calculated from the modified Rutter model; (2) actual evapotranspiration (AET), calculated from FAO Penman-Monteith PET and a prescribed AET/PET ratio as a function of soil water potential; (3) overland and channel flow, based on the diffusive wave approximation of the Saint-Venant equations; (4) subsurface flow, calculated from the 3D variably saturated flow equation; (5) river-aquifer interaction, calculated from the Darcy equation.

3.2 Calibration Parameters

Model parameterisation and choice of calibration parameters are based on model structure and previous studies. Bathurst (1986) carried out sensitivity analysis of the SHE model, SHETRAN’s precursor, for an upland catchment in mid-Wales and found out that soil and Strickler overland flow resistance coefficients are the parameters to which the results are most sensitive. Studies by Parkin et al. (1996), Bathurst et al. (2004, 2011), Mourato (2010) and Birkinshaw et al. (2011) have indicated that parameters such as Strickler overland flow resistance coefficient, AET/PET ratio and soil parameters namely top soil depth, saturated hydraulic conductivity, soil water retention and hydraulic conductivity functions are the key parameters required to be specified using field or calibrated data for flow simulations.

3.3 SHETRAN Model Set Up

The input data comprise rainfall and PET, whilst the model parameters comprise rainfall station distribution, ground surface elevations, land-use and soil type distributions as well as river links with associated cross-section information. Hourly precipitation data and basin runoff are available at the Portuguese Water Resources Information System (SNIRH) for the stations indicated in Fig. 1. Daily FAO Penman-Monteith PET from Quinta da Saúde meteorological station (38°02′15″N, 07°53′06″W) at Beja is provided by the Agrometeorological System for the Management of Irrigation in the Alentejo/Irrigation Technology and Operative Center (SAGRA/COTR). Hourly PET is also available for Vale de Camelos station (37°48′43″N, 07°52′11″W) from SNIRH for the study period; however its annual PET is around 1000 mm, which seems around 200–300 mm lower than the literature values for the region (Bathurst et al. 1996). Preliminary analysis has indicated that the lower annual PET might have resulted from the higher relative humidity and the lower wind velocity measurements. Since hourly distribution of PET during the day is mainly influenced by solar radiation in the semi-arid southern Portugal region, hourly PET proportion during the day from Vale de Camelos station may not have been affected much, and it is assumed to be the same for stations under the same climate condition. Therefore, the daily PET from Beja is disaggregated into hourly intervals, according to the variation of the corresponding hourly PET from Vale de Camelos, to serve as input. A comprehensive geospatial dataset is available including topographic data with a scale of 1:25000 at 10 m intervals, digital maps of land-use type (Caetano et al. 2009) with a scale of 1:100000 and soil type (from Institute of Hydraulics, Rural Engineering and Environment, IHERA) with a scale of 1:25000. 
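The disaggregation of daily PET described above can be illustrated with a minimal sketch. It assumes that each daily total is split in proportion to the 24 hourly values of the reference station for the same day; the function name, the even-split fallback for a day with no reference signal, and the sample numbers are illustrative assumptions, not part of the study.

```python
from typing import List, Sequence


def disaggregate_daily_pet(daily_pet_mm: float,
                           hourly_reference: Sequence[float]) -> List[float]:
    """Split a daily PET total into 24 hourly values.

    The hourly shape is taken from a reference station (here standing in
    for Vale de Camelos); only its proportions are used, not its magnitude.
    """
    if len(hourly_reference) != 24:
        raise ValueError("expected 24 hourly reference values")
    total = sum(hourly_reference)
    if total <= 0.0:
        # Assumption: with no usable reference signal, spread the total evenly.
        return [daily_pet_mm / 24.0] * 24
    return [daily_pet_mm * h / total for h in hourly_reference]


if __name__ == "__main__":
    # Hypothetical reference day: PET only during daylight hours.
    reference = [0.0] * 6 + [0.1, 0.2, 0.3, 0.4, 0.5, 0.6,
                             0.6, 0.5, 0.4, 0.3, 0.2, 0.1] + [0.0] * 6
    hourly = disaggregate_daily_pet(5.0, reference)
    print(round(sum(hourly), 6))  # the daily total is preserved
```

Because only the proportions of the reference series are used, a systematic bias in its annual total does not propagate to the disaggregated input.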
Here, model calibration and validation are carried out respectively from October 1st 2004 to September 30th 2006 and from October 1st 2006 to September 30th 2008. The calibration excludes the first 10 months as a warm-up period; the validation excludes the period from November 4th 2006 to November 8th 2006 because of missing data. SHETRAN is applied to the study basin with a spatial resolution of 2 km and a temporal resolution of 1 h.

To reduce the number of calibration parameters effectively, the key parameters are calibrated only for the two main types of land-use and the three main types of soil, while those for the other types of land-use and soil keep their baseline values. AET is determined by PET, crop characteristics and soil water stress conditions (Allen et al. 1998). The AET/PET ratio is considered to be maximal at soil field capacity, declining linearly with increasing soil suction. The AET/PET ratio at soil field capacity and the Strickler overland flow resistance coefficient are to be calibrated for the main types of land-use. Anisotropy of soil physical properties is not considered, so the vertical saturated conductivity is assumed to be the same as the lateral saturated conductivity. The soil water retention and hydraulic conductivity functions are defined following van Genuchten et al. (1991). The saturated hydraulic conductivity, saturated water content, residual water content, van Genuchten n and α parameters, and top soil depth are to be calibrated for the main types of soil. Consequently, twenty-two parameters (two for each of the two main land-use types and six for each of the three main soil types) are to be calibrated by the MSCE algorithm.
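As an illustration of how the calibrated soil parameters enter the retention function, the sketch below evaluates the standard van Genuchten water retention curve with the common constraint m = 1 − 1/n. The function name, the units of α (1/m, matching suction in metres) and the sample parameter values are assumptions made for this example; they are not the calibrated values of Table 2.

```python
def van_genuchten_theta(psi_m: float, theta_r: float, theta_s: float,
                        alpha_per_m: float, n: float) -> float:
    """Volumetric water content at soil water potential psi_m (m, negative when unsaturated).

    theta(psi) = theta_r + (theta_s - theta_r) / [1 + (alpha * |psi|)**n]**m,  m = 1 - 1/n
    """
    if n <= 1.0:
        raise ValueError("van Genuchten n must be greater than 1")
    if psi_m >= 0.0:
        return theta_s  # saturated conditions
    m = 1.0 - 1.0 / n
    return theta_r + (theta_s - theta_r) / (1.0 + (alpha_per_m * abs(psi_m)) ** n) ** m


if __name__ == "__main__":
    # Hypothetical parameter set, for illustration only.
    params = dict(theta_r=0.05, theta_s=0.40, alpha_per_m=1.0, n=1.5)
    for psi in (-3.3, -10.0, -150.0):  # field capacity, intermediate, wilting point
        print(psi, round(van_genuchten_theta(psi, **params), 3))
```

Water content falls monotonically from the saturated value towards the residual value as suction increases, which is why α and n control how quickly the soil dries between storms.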

As automatic calibration does not use physical reasoning, the parameter values are constrained within physically realistic ranges, according to field measurements and literature data, to produce results that can be justified on physical grounds. The measured and estimated soil parameters are shown in Table 1. The key parameters for automatic calibration of the Cobres basin, with a spatial resolution of 2 km and a temporal resolution of 1 h, are listed in Table 2, with specified ranges and baseline values based on literature (Cardoso 1965; Bathurst et al. 1996, 2002; Saxton and Rawls 2006), sensitivity analysis and personal communication with Dr. Birkinshaw at Newcastle University. According to Allen et al. (1998), the AET/PET ratio at field capacity is considered to be in the range of [0.5, 0.9] for crop and [0.6, 0.8] for agroforestry; it is set to 0.6 for crop and 0.7 for agroforestry in the baseline simulation. Ramos and Santos (2009) found that the AET/PET ratio is around 0.7 at field capacity for an olive orchard in southern Portugal, which supports our AET/PET ratio setting. Based on Engman (1986) and Bathurst et al. (1996, 2002), the Strickler overland flow resistance coefficient is set to be in the ranges of [2.5, 10] and [0.5, 5.0] m1/3/s respectively for crop and agroforestry; it is set to 5.0 and 2.0 m1/3/s respectively for crop and agroforestry in the baseline simulation. Based on Chow (1959), the Strickler channel flow resistance coefficient is set to 30 m1/3/s. Sensitivity analysis is carried out on the 22 parameters in terms of model outputs such as total runoff and NSE. It shows that the results are most sensitive to van Genuchten α; sensitive to the AET/PET ratio, Strickler overland flow resistance coefficient, top soil depth, van Genuchten n, saturated water content and residual water content; and less sensitive to saturated hydraulic conductivity.

Table 1 Soil parameters based on Cardoso (1965) for the main soil types in the Cobres basin
Table 2 The SHETRAN calibration parameters’ description, feasible ranges, baseline setting (in bracket) and values derived from manual and MSCE calibrations

To compare manual and automatic calibrations, two scenarios are defined: scenario I calibrates only the Strickler overland flow resistance coefficient for the two main types of land-use (two parameters), and scenario II calibrates the Strickler overland flow resistance coefficient and the AET/PET ratio at field capacity for the two main types of land-use (four parameters). MSCE calibration schemes with different parameterizations are also compared: scenarios I and II; scenario III, which considers the key parameters for the two main types of land-use and the Px soil (ten parameters); and scenario IV (the MSCE calibration of 22 parameters proposed above).
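The parameter counts of the four scenarios follow directly from the choices described above, as the short sketch below shows. The identifiers are illustrative labels chosen for this example and are not SHETRAN input names.

```python
LAND_USE_TYPES = ("crop", "agroforestry")

# Six soil parameters calibrated per soil type (labels are illustrative).
SOIL_PARAMS = ("ksat", "theta_sat", "theta_res", "vg_n", "vg_alpha", "top_soil_depth")

# scenario -> (land-use parameters calibrated, soil types calibrated)
SCENARIOS = {
    "I": (("strickler_overland",), ()),
    "II": (("strickler_overland", "aet_pet_ratio_fc"), ()),
    "III": (("strickler_overland", "aet_pet_ratio_fc"), ("Px",)),
    "IV": (("strickler_overland", "aet_pet_ratio_fc"), ("Vx", "Px", "Ex")),
}


def parameter_names(scenario: str) -> list:
    """List every calibration parameter of a scenario as 'type:parameter'."""
    land_use_params, soils = SCENARIOS[scenario]
    names = [f"{lu}:{p}" for lu in LAND_USE_TYPES for p in land_use_params]
    names += [f"{soil}:{p}" for soil in soils for p in SOIL_PARAMS]
    return names


if __name__ == "__main__":
    for name in SCENARIOS:
        print(name, len(parameter_names(name)))  # 2, 4, 10 and 22 parameters
```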

3.4 The MSCE Optimization Algorithm

The SCE-UA method, proposed by Duan et al. (1992), is an effective and efficient global optimization method in calibration of lumped and distributed models (Madsen 2000, 2003; Eckhardt and Arnold 2001; Blasone et al. 2007). It is based on the simplex downhill search scheme (Nelder and Mead 1965). Santos et al. (2003) introduced new evolution steps to improve its efficiency by making the simplex expand in a direction of more favourable conditions, or contract if a move was taken in a direction of less favourable conditions. The MSCE optimization algorithm was tested successfully for calibration of the physically-based erosion model WESP in a semi-arid watershed in Brazil (Santos et al. 2003).

Following Santos et al. (2003), the number of complexes is set to 2, considering the long run time of a single SHETRAN simulation (4 min). The number of points in each complex is set to 2NOPT+1, where NOPT is the number of optimization parameters; the number of points in a subcomplex is set to NOPT+1; and the number of evolution steps taken before the complexes are shuffled is set to 2NOPT+1. The initial parameter values are selected randomly from the feasible hypercube search space. The optimization is terminated when the model has been run 10,000 times, when the change in the best function value over 10 shuffling loops is less than 0.01 %, or when the normalized geometric mean of the parameter ranges is less than 0.001.
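The settings and stopping rules above can be written down compactly. The sketch below is one plausible reading, not the code used in the study: the class and function names are hypothetical, the percentage change is taken relative to the older best value, and the normalized geometric mean is computed as the geometric mean, over parameters, of the population spread divided by the feasible range.

```python
import math
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class MsceSettings:
    n_params: int          # NOPT
    n_complexes: int = 2

    @property
    def points_per_complex(self) -> int:
        return 2 * self.n_params + 1

    @property
    def points_per_subcomplex(self) -> int:
        return self.n_params + 1

    @property
    def evolution_steps(self) -> int:
        return 2 * self.n_params + 1


def normalized_geometric_range(population: Sequence[Sequence[float]],
                               lower: Sequence[float],
                               upper: Sequence[float]) -> float:
    """Geometric mean of (population spread / feasible range) over all parameters."""
    logs = []
    for j, (lo, hi) in enumerate(zip(lower, upper)):
        values = [point[j] for point in population]
        spread = max(max(values) - min(values), 1e-300)  # guard against log(0)
        logs.append(math.log(spread / (hi - lo)))
    return math.exp(sum(logs) / len(logs))


def should_stop(n_runs: int, best_history: Sequence[float], geo_range: float,
                max_runs: int = 10_000, window: int = 10,
                pct_tol: float = 0.01, range_tol: float = 0.001) -> bool:
    """Apply the three termination criteria; best_history holds one value per shuffling loop."""
    if n_runs >= max_runs or geo_range < range_tol:
        return True
    if len(best_history) > window:
        old, new = best_history[-(window + 1)], best_history[-1]
        scale = abs(old) if old != 0.0 else 1.0  # assumption: change relative to old value
        if abs(old - new) / scale * 100.0 < pct_tol:
            return True
    return False


if __name__ == "__main__":
    s = MsceSettings(n_params=22)
    print(s.points_per_complex, s.points_per_subcomplex, s.evolution_steps)  # 45 23 45
```

For scenario IV (NOPT = 22) this gives 45 points per complex, 23 points per subcomplex and 45 evolution steps between shuffles. At about 4 min per simulation, the 10,000-run cap corresponds to an upper bound of roughly 28 days of serial computation.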

3.5 The Objective Function

The objective function minimised in the calibration of the SHETRAN model, and evaluated again in validation, is the root mean square error (RMSE) between observed and simulated hourly discharge at the basin outlet. Other functions, namely the LOG transformed Error (LOGE) (Bekele and Nicklow 2007), the Nash-Sutcliffe efficiency (NSE) (Nash and Sutcliffe 1970), the Pearson product-moment correlation coefficient (PMCC) (Rodgers and Nicewander 1988) and the index of agreement (IOA) (Willmott 1981), are also calculated to evaluate model performance comprehensively. In addition, visual fitting of hydrographs is performed in manual calibration. RMSE emphasizes the fitting of higher or peak discharges, because squaring amplifies errors greater than 1.0, whereas LOGE is designed to emphasize the fitting of lower discharges through the use of logarithms. Both range between 0 (perfect match) and +∞. NSE is a measure of goodness-of-fit and is independent of the flow magnitude. It ranges from −∞ to 1 (perfect fit). PMCC measures the strength of the linear relationship between observed and simulated flow. It ranges from −1 (fully negative correlation) to 1 (fully positive correlation). IOA allows cross-comparisons between models or model performances and varies between 0 and 1 (perfect fit).
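For reference, four of these measures can be computed as in the sketch below, using their standard textbook definitions. LOGE is omitted because its exact form is not given here, and the function names and sample series are illustrative.

```python
import math
from typing import Sequence


def rmse(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Root mean square error."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def nse(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Nash-Sutcliffe efficiency: 1 - SSE / variance of observations about their mean."""
    mean_o = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    return 1.0 - sse / sum((o - mean_o) ** 2 for o in obs)


def pmcc(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Pearson product-moment correlation coefficient."""
    mean_o = sum(obs) / len(obs)
    mean_s = sum(sim) / len(sim)
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(obs, sim))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_s = sum((s - mean_s) ** 2 for s in sim)
    return cov / math.sqrt(var_o * var_s)


def ioa(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Index of agreement (Willmott 1981)."""
    mean_o = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    potential = sum((abs(s - mean_o) + abs(o - mean_o)) ** 2 for o, s in zip(obs, sim))
    return 1.0 - sse / potential


if __name__ == "__main__":
    observed = [1.0, 2.0, 3.0, 4.0]    # hypothetical discharges
    simulated = [2.0, 3.0, 4.0, 5.0]   # same shape, constant bias of +1
    print(rmse(observed, simulated), nse(observed, simulated),
          pmcc(observed, simulated), ioa(observed, simulated))
```

The sample series shows why several measures are reported together: a simulation with a constant bias is perfectly correlated with the observations, yet its NSE is far below 1.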

4 Results and Discussion

4.1 MSCE Calibration of SHETRAN Model (Scenario IV)

Scenario IV provides the best set of parameters (Table 2). The parameter values are consistent with literature data. Bathurst et al. (1996) carried out a SHETRAN simulation of the Cobres basin for the period from 1977 to 1985; they characterized the basin land-use as crop (at least 90 % occupation) and the soil type as a thin, poor quality, red Mediterranean soil overlying schists (corresponding to the Vx soil of this study), with measured saturated hydraulic conductivity values between 0.03 and 0.4 m/day and a combined thickness of the A and B horizons between 13 and 33 cm. Their calibration gave a soil depth of 0.4 m, a saturated hydraulic conductivity of 0.05 m/day and a Strickler overland flow resistance coefficient of 6 m1/3/s. Here, we carried out hydrological simulation for the period from 2004 to 2008, and characterized the basin by two main types of land-use (crop and agroforestry) and three main types of soil (Vx, Px and Ex soil). Scenario IV gave a soil depth of 0.30 m and a saturated hydraulic conductivity of 0.168 m/day for Vx soil, which is in agreement with Bathurst et al. (1996). The Strickler overland flow resistance coefficient for crop is 10 m1/3/s, which is larger than that derived by Bathurst et al. (1996) and at the upper limit of its physically realistic range. An experiment with scenario IV at a spatial resolution of 1 km suggests a value of 7.0 m1/3/s, which indicates that with a finer grid the calibrated Strickler overland flow resistance coefficient may fall below the upper limit of its physically realistic range. However, further studies are required to clarify this point.

The calibrated AET/PET ratio, prescribed as a function of soil water potential, can also be interpreted by physical reasoning. Scenario IV suggests values of 0.50 and 0.60 respectively for crop and agroforestry at field capacity. The AET/PET ratio was assigned to decline linearly with increasing soil suction, reaching 0 at wilting point. Specifically, we assumed −3.3 m as field capacity and −150.0 m as wilting point; the AET/PET ratios for crop and agroforestry at a soil water potential of −10.0 m are then respectively 0.165 and 0.198. Taking the Px soil as an example, the calibrated soil water retention curve indicates that the soil water contents at field capacity, at a soil water potential of −10.0 m and at wilting point are respectively 0.298, 0.228 and 0.122 m3/m3. The available water at field capacity and at a soil water potential of −10.0 m is respectively 0.176 and 0.106 m3/m3. To access the available water, plants need to exert 3.3 and 10.0 m of soil suction respectively at field capacity and at a soil water potential of −10.0 m. Consequently, the AET/PET ratio at a soil water potential of −10.0 m is 0.33 times that at field capacity.
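The figures quoted in this paragraph can be checked with a few lines of arithmetic. In the sketch below, the factor of 0.33 is read as the ratio of the two suctions (3.3 m / 10.0 m); this reading is an interpretation of the text that reproduces the reported values, not a confirmed statement of how SHETRAN evaluates the ratio, and the function names are hypothetical.

```python
def available_water(theta: float, theta_wilting: float) -> float:
    """Water content held above wilting point (m3/m3)."""
    return theta - theta_wilting


def scaled_aet_pet_ratio(ratio_fc: float, suction_m: float,
                         suction_fc_m: float = 3.3) -> float:
    """AET/PET ratio at a given suction, scaled by suction_fc / suction (assumed reading)."""
    return ratio_fc * suction_fc_m / suction_m


if __name__ == "__main__":
    # Px soil water contents reported in the text (m3/m3).
    theta_fc, theta_10, theta_wp = 0.298, 0.228, 0.122
    print(round(available_water(theta_fc, theta_wp), 3))   # 0.176
    print(round(available_water(theta_10, theta_wp), 3))   # 0.106

    # Calibrated ratios at field capacity: crop 0.50, agroforestry 0.60.
    print(round(scaled_aet_pet_ratio(0.50, 10.0), 3))      # 0.165
    print(round(scaled_aet_pet_ratio(0.60, 10.0), 3))      # 0.198
```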

The model performance of scenario IV is shown in Table 3, and its annual mass balance analysis is shown in Table 4 for the basin outlet and internal gauging stations. At the basin outlet, the NSE is 0.86 for calibration and 0.74 for validation; at the internal gauging stations Albernoa and Entradas, the NSE is respectively 0.65 and 0.82 for calibration and 0.69 and 0.63 for validation. The simulation underestimated annual runoff at the basin outlet by around 11 % (year 2007) to 35 % (year 2006). The graphical comparison between observed and simulated discharges at the basin outlet, displayed in Fig. 2a−b for the main runoff periods of the calibration and validation phases, indicates that the model could not capture the peak discharge well for most of the storm events.

Table 3 Comparison of model performances from manual with MSCE calibrations at basin outlet (Monte da Ponte gauging station)
Table 4 Statistics for the MSCE calibration scenario IV at Cobres basin
Fig. 2 Comparison of observed and simulated discharges from MSCE calibration scenario IV for the Cobres basin with spatial resolution of 2 km grid and temporal resolution of 1 h, for main periods of (a) calibration and (b) validation processes

To investigate the cause, we plotted the monthly water balance components of the simulation in Fig. 3. It shows that, during the entire period, (1) rainfall was mainly concentrated in the period from October to May of the following year; (2) runoff mainly occurred in 4 months, namely November 2005, October 2006, November 2006 and December 2006. The two main runoff generation periods were preceded by 12 and 6 months of drought respectively. The runoff underestimation may therefore be explained by reduced soil infiltration resulting from surface sealing and crust formation, physical processes that are not represented in the SHETRAN model and that are favoured by factors such as dry initial soil moisture content, gentle basin slope, Px and Ex soils (loam and sandy loam) and moderate rainfall intensity. Studies conducted in this region (Silva 2006; Pires et al. 2007) have shown that Mediterranean soils are characterized by crust formation problems and low infiltration capacity. Soil sealing and crusting are recognized as common processes in cultivated soils of semi-arid and arid regions. Since the study basin is mainly occupied by crops, crust formation might have been very important in this region. However, crust formation is not considered in this study because of the lack of information for quantifying how much infiltration would be reduced by the soil crust, given the nature of the rain and the physical and chemical properties of the soils of the Cobres basin during the study period. Experiments show that the overall model performance would not be improved by arbitrarily reducing saturated hydraulic conductivity for the whole simulation period.

Fig. 3 Water balance analysis of MSCE calibration scenario IV for the calibration and validation periods; P, precipitation; AET, actual evapotranspiration; ΔS, change of subsurface water storage; R, total runoff

Figure 4a−d gives a clear impression of SHETRAN’s ability to reproduce storm events preceded by a long period of drought. Storms No.1 and No.4 are the largest storm events during the calibration and validation periods respectively. Figure 4a−b compares observed and simulated hydrographs for Storms No.1 and No.4 at the basin outlet; Figure 4c−d compares observed and simulated hydrographs for Storm No.4 at the internal gauging stations Albernoa and Entradas respectively. The NSE is 0.87 and 0.64 respectively for Storms No.1 and No.4 at the basin outlet; it is 0.69 and 0.65 for Storm No.4 respectively at Albernoa and Entradas. For both storm events, the SHETRAN model reproduced well the qualitative evolution of the hydrographs at the basin outlet, as well as at the two internal gauging stations; however, it greatly underestimated the peak discharges, and the simulated hydrographs are much less flashy than the observed ones.

Fig. 4 Comparison of observed and simulated discharges from MSCE calibration scenario IV for the Cobres basin with spatial resolution of 2 km grid and temporal resolution of 1 h: (a) Storm No.1 at basin outlet; (b) Storm No.4 at basin outlet; (c) Storm No.4 at internal gauging station Albernoa; (d) Storm No.4 at internal gauging station Entradas

4.2 Comparison of Manual and MSCE Calibrations

To compare manual calibration with MSCE calibration, scenario I considers the most frequently used calibration parameters, namely the Strickler overland flow resistance coefficients; building on scenario I, scenario II also considers the parameters that control the water balance, namely the AET/PET ratios at field capacity. As shown in Tables 2 and 3, manual calibration can achieve the same parameter setting and model performance as MSCE calibration for scenarios I and II. The success of manual calibration may be attributed to: (1) the rigorous and deliberate parameterization; (2) the narrow ranges of parameters set in this study; (3) the small number of calibration parameters involved. For these two scenarios, the MSCE calibrations do not clearly outperform the manual calibrations in terms of model performance. Section 4.1 shows that scenario IV, which considers 22 parameters, obtains satisfactory results in terms of calibration parameters and model performance. For scenario IV, we did not attempt manual calibration because of its complexity and limitations. In summary, the advantages of MSCE calibration are that it can take a large number of parameters into consideration, is objective, excludes modellers’ subjective interference and releases them from monotonous, laborious work.

4.3 Comparison of MSCE Calibrations

Scenarios I, II, III and IV involve respectively 2, 4, 10 and 22 calibration parameters; Table 2 shows that, for the majority of calibration parameters, similar or even equal values are obtained in all the scenarios considered. This circumstance requires further investigation, which is beyond the scope of this paper. Table 3 shows that NSE is 0.81 and 0.60 respectively for calibration and validation of scenario I; NSE is around 0.85 and 0.65 respectively for calibration and validation of scenarios II and III; NSE is 0.86 and 0.74 respectively for calibration and validation of scenario IV. The model performance of scenario IV is better than that of the other three scenarios. As the number of key parameters increases, the MSCE calibration does not always improve; the best performance is obtained only when all the key parameters are considered.

5 Conclusions

The MSCE optimization algorithm, introduced by Santos et al. (2003) based on the SCE-UA developed by Duan et al. (1992), is successfully applied to calibrate the SHETRAN model in the semi-arid Cobres basin with a spatial resolution of 2 km and a temporal resolution of 1 h. Twenty-two parameters are calibrated based on the two main types of land-use and the three main types of soil, and no initial parameter setting is provided. The calibrated parameters are within the measured ranges of Cardoso (1965), consistent with the previous work of Bathurst et al. (1996) and well explained by physical reasoning. The results are very satisfactory. NSE is 0.86 for calibration and 0.74 for validation at the basin outlet; at the internal gauging stations Albernoa and Entradas, NSE is respectively 0.65 and 0.82 for calibration and 0.69 and 0.63 for validation. As for storm events, NSE is 0.87 and 0.64 respectively for Storms No.1 (during the calibration period) and No.4 (during the validation period) at the basin outlet; it is 0.69 and 0.65 for Storm No.4 respectively at Albernoa and Entradas. In confirmation of the study of Santos et al. (2003), the MSCE optimization algorithm is able to converge to the global optimal values.

For the SHETRAN model, manual calibration can be successful if rigorous and deliberate parameterization has been carried out and only a few parameters are involved. MSCE is recommended for the following advantages: it can take a large number of parameters into consideration, it is objective, it excludes modellers’ subjective interference and it frees them for other, more important activities. To get the best model performance, all key parameters should be considered in MSCE calibration. Future studies should include other automatic calibration techniques, such as simulated annealing (Santos et al. 2012), and consider the influence of catchment discretization (Santos et al. 2011), especially when applying GIS and remote sensing techniques (Silva et al. 2012).