1 Introduction

Hydrological models are essential to watershed planning and management. They are increasingly applied by hydrologists and water resource managers to understand the natural processes that affect watershed systems. Many integrated watershed management systems (Ratha and Agrawal 2014, 2015) and Geographical Information System (GIS) based hydrological models exist for water resources management activities (Rao and Kumar 2004; Bhalla et al. 2011). These hydrological models can contain parameters that cannot be measured directly because of measurement and scaling issues (Zhang et al. 2008). The prediction capability of a model depends on the correct selection of its parameter values. Some parameter values can be physically measured, but others are difficult to measure on spatial and temporal scales. Parameters that cannot be measured, or whose values need to be found at runtime, need calibration so that the model predictions are close to the observed values.

For flood forecasting, individual storm events must be modeled at the catchment scale (Bates and Ganeshanandam 1990; Zarriello 1998; Moussa et al. 2002; Jain and Indurthy 2003; Reddy et al. 2008, 2011). The first important challenge for the modeler in this task is to choose a rainfall runoff model and to calibrate a set of parameters that can accurately simulate a number of flood events and the related hydrograph shapes (Moussa and Chahinian 2009). Rosenbrock (1960), Duan et al. (1992), Gan and Biftu (1996), Yapo et al. (1998) and Vrugt et al. (2003) studied various calibration algorithms and procedures. Although these differ in the way they seek the optimal value, they all aim at minimizing or maximizing objective functions.

Several traditional methods are available to calibrate hydrological model parameters. Recently, computing based optimization methods have proven to be efficient and robust. When calibrating hydrological models, one or more objectives are often used to measure the agreement between observed and simulated values. Seibert (2000) proposed an algorithm for single and multi-criteria calibration of the Hydrologiska Byråns Vattenbalansavdelning (HBV) model. The results of that study indicate that the genetic algorithm is capable of optimizing the parameters of a conceptual runoff model. Henrik (2003) presented the use of a calibration framework for parameter estimation in the MIKE SHE integrated and distributed hydrological modeling system. The results showed that the balanced Pareto optimum solution provides a better simulation of the runoff. Muleta and Nicklow (2005) described an automatic approach for calibrating daily stream flow and daily sediment concentration values estimated using the Soil and Water Assessment Tool (SWAT) and a Genetic Algorithm (GA). Reca and Martinez (2006) developed a computer model called Genetic Algorithm Pipe Network Optimization Model (GENOME), which aims to optimize the design of new looped irrigation water distribution networks. Wang et al. (2006) proposed an Interval Fuzzy Multi-Objective Programming Lake Watershed System (IFMOPLWS) method to solve an integrated watershed management problem. They concluded that IFMOPLWS is a powerful tool for integrated watershed management planning and can provide a solid base for sustainable watershed management. Zhang et al. (2008) developed single objective and multi-objective optimization algorithms, which were applied to optimize the parameters of SWAT using observed stream flow data. Their results demonstrated the advantages and disadvantages of single objective and multi-objective parameter estimation methods. Xuesong et al. (2009) presented the application of GA and Bayesian Model Averaging (BMA) to simultaneously conduct calibration and uncertainty analysis using SWAT. Yang and Fan (2010) presented a new sensitivity analysis scheme for the NAM/MIKE 11 model. They achieved a sufficiently accurate Pareto set and a good diversity in the obtained front with the use of Non-dominated Sorting Differential Evolution (NSDE) for the model calibration. Sahoo et al. (2010) examined the use of a loosely coupled GA as an auto calibration tool for optimizing the model parameters of the Hydrologic Simulation Program-Fortran (HSPF). The objective function was optimized by minimizing the mean absolute error between corresponding observed and simulated average daily stream flow in the San Antonio River watershed. Kamali and Mousavi (2013) presented single and multi-objective optimization algorithms for automatic calibration of the Hydrologic Engineering Center-Hydrologic Modeling System (HEC_HMS) rainfall runoff model, together with a fuzzy optimal model to combine different criteria.

From the above studies, it is observed that a robust optimization model is needed for the calibration of event based rainfall runoff models. Many continuous hydrological models have automatic calibration methods that use different soft computing techniques. However, for watershed management activities and for the design of hydraulic structures, an event based rainfall runoff model with a proper automatic calibration methodology is needed. The present study focuses on an event based rainfall runoff model and its integration with single and multi-objective GAs for automatic calibration of parameters.

2 Materials and Methods

2.1 Governing Equations

Hydrological processes such as infiltration, overland flow and channel flow are considered for flow simulation in the watershed. The infiltration model formulation is given in Reddy et al. (2007). The full formulation of overland flow and channel flow is given in Reddy et al. (2011). The Green-Ampt Mein-Larson (GAML) model is used to simulate the infiltration process. The equation for infiltration rate, given by Green and Ampt in 1911 (Mein and Larson 1973), is as follows.

$$ {f}_p={K}_s\left[1+\frac{M{S}_c}{F}\right] $$
(1)

where \( f_p \) is infiltration capacity (cm/h), \( K_s \) is saturated hydraulic conductivity (cm/h), \( M \) is initial moisture deficit, \( S_c \) is capillary suction at the wetting front (cm), and \( F \) is cumulative infiltration (cm). The initial moisture deficit \( M \) can be expressed as follows:

$$ M={\theta}_s-{\theta}_i $$
(2)

where \( \theta_s \) is saturated water content and \( \theta_i \) is initial water content. The continuity and momentum equations for the kinematic wave in one dimension are given as follows:

$$ \frac{\partial q}{\partial x}+\frac{\partial h}{\partial t}={r}_e $$
(3)
$$ {S}_o={S}_f $$
(4)

where \( S_o \) is the slope of the overland flow plane and \( S_f \) is the friction slope of the flow plane. The final form of the FEM equation for the kinematic wave equation, which is used to simulate the overland flow, is as follows (Reddy et al. 2011):

$$ \left[C\right]{\left\{h\right\}}^{t+\varDelta t}=\left[C\right]{\left\{h\right\}}^t-\varDelta t\left[B\right]\left\{\left(1-\omega \right){q}^t+\omega {q}^{t+\varDelta t}\right\}+\varDelta t\left\{f\right\}\left(\left(1-\omega \right){\left({r}_e\right)}^t+\omega {\left({r}_e\right)}^{t+\varDelta t}\right) $$
(5)

where \( h \) is the depth of flow (m), \( q \) is the flow per unit width (m²/s), \( r_e \) is the excess rainfall rate (m/s), and the superscripts \( t \) and \( t+\varDelta t \) indicate the variables at the previous time step and the current time step. \( \omega \) is the factor that determines the type of finite difference scheme involved. The Crank–Nicolson scheme with \( \omega = 0.5 \) is used in this study. [C], [B] and {f} are global matrices.

The continuity and momentum equations for the one-dimensional kinematic wave in channel flow are given as follows:

$$ \frac{\partial Q}{\partial x}+\frac{\partial A}{\partial t}-q=0 $$
(6)
$$ S={S}_{fc} $$
(7)

where \( A \) is the area of flow in the channel (m²) and \( Q \) is the discharge in the channel (m³/s). The final matrix form of the FEM equation, which is used for the simulation of the channel flow, is as follows (Reddy et al. 2011):

$$ \left[C\right]{\left\{A\right\}}^{t+\varDelta t}=\left[C\right]{\left\{A\right\}}^t-\varDelta t\left[B\right]\left\{\left(1-\omega \right){Q}^t+\omega {Q}^{t+\varDelta t}\right\}+\varDelta t\left\{f\right\}\left(\left(1-\omega \right){q}^t+\omega {q}^{t+\varDelta t}\right) $$
(8)

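The infiltration capacity of Eqs. (1) and (2) can be evaluated directly once the four infiltration parameters are known. The following is a minimal Python sketch of that calculation, not the authors' C implementation; the function names and the numerical values are illustrative assumptions.

```python
def moisture_deficit(theta_s, theta_i):
    """Initial moisture deficit M = theta_s - theta_i (Eq. 2)."""
    if theta_i >= theta_s:
        raise ValueError("initial water content must be below saturated water content")
    return theta_s - theta_i


def infiltration_capacity(k_s, s_c, theta_s, theta_i, cum_infiltration):
    """Infiltration capacity f_p (cm/h) from Eq. (1).

    k_s              saturated hydraulic conductivity (cm/h)
    s_c              capillary suction at the wetting front (cm)
    cum_infiltration cumulative infiltration F (cm), must be positive
    """
    if cum_infiltration <= 0:
        raise ValueError("Eq. (1) is undefined for F <= 0")
    m = moisture_deficit(theta_s, theta_i)
    return k_s * (1.0 + m * s_c / cum_infiltration)


# Illustrative values only (not taken from the study).
f_p = infiltration_capacity(k_s=1.0, s_c=10.0, theta_s=0.45,
                            theta_i=0.25, cum_infiltration=2.0)
print(f"f_p = {f_p:.3f} cm/h")
```

As the cumulative infiltration F grows, the second term in Eq. (1) shrinks and the infiltration capacity approaches the saturated hydraulic conductivity.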

2.2 Genetic Algorithm Formulation

The main difference between genetic algorithms and most traditional optimization methods is that a GA uses a population of points at one time, in contrast to the single point approach of traditional optimization methods (Rajasekaran and Vijayalakshmi 2007). In a GA, the optimization is carried out by either maximizing or minimizing the value of the objective function (fitness function). In this study, Single-objective GA (SGA) and Multi-objective GA (MGA) models are integrated with the FEM based rainfall-runoff model of Reddy et al. (2011). In SGA, the Nash-Sutcliffe Efficiency (NSE) is used as the objective function. In MGA, the correlation coefficient (r) and NSE are used as objective functions. The NSE and r are calculated as follows:

$$ NSE=1-\frac{{\displaystyle \sum {\left(y-f\right)}^2}}{{\displaystyle \sum {\left(y-\overline{y}\right)}^2}} $$
(9)
$$ r=\frac{N{\displaystyle \sum (fy)}-{\displaystyle \sum (f)}{\displaystyle \sum (y)}}{\sqrt{\left(N{\displaystyle \sum \left({f}^2\right)-{\left({\displaystyle \sum f}\right)}^2}\right)}\sqrt{\left(N{\displaystyle \sum \left({y}^2\right)-{\left({\displaystyle \sum y}\right)}^2}\right)}} $$
(10)

where \( f \) is the model simulated runoff value, \( y \) is the observed runoff value, \( N \) is the number of values, and \( \overline{y} \) is the mean of the observed runoff values over the entire evaluation period. For the optimization of the four infiltration parameters, two constraints are considered: (i) the initial water content should be less than the saturated water content; (ii) each parameter value should lie within its specified bounds.
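Equations (9) and (10) and the two constraints translate almost line by line into code. The following Python sketch is given for illustration under the assumption that observed and simulated runoff are available as equal-length sequences; the function names and the parameter dictionary layout are hypothetical.

```python
import math


def nse(observed, simulated):
    """Nash-Sutcliffe Efficiency, Eq. (9). Undefined if observed is constant."""
    y_mean = sum(observed) / len(observed)
    residual = sum((y - f) ** 2 for y, f in zip(observed, simulated))
    spread = sum((y - y_mean) ** 2 for y in observed)
    return 1.0 - residual / spread


def correlation(observed, simulated):
    """Correlation coefficient r, Eq. (10)."""
    n = len(observed)
    sum_f, sum_y = sum(simulated), sum(observed)
    sum_fy = sum(f * y for f, y in zip(simulated, observed))
    sum_ff = sum(f * f for f in simulated)
    sum_yy = sum(y * y for y in observed)
    numerator = n * sum_fy - sum_f * sum_y
    denominator = (math.sqrt(n * sum_ff - sum_f ** 2)
                   * math.sqrt(n * sum_yy - sum_y ** 2))
    return numerator / denominator


def satisfies_constraints(params, bounds):
    """Constraint (i): theta_i < theta_s. Constraint (ii): all values in bounds.

    params and bounds are dicts keyed by parameter name (hypothetical layout).
    """
    in_bounds = all(bounds[k][0] <= v <= bounds[k][1] for k, v in params.items())
    return in_bounds and params["theta_i"] < params["theta_s"]
```

A simulation that reproduces the observations exactly gives NSE = 1, while a simulation that always returns the observed mean gives NSE = 0, which is why negative NSE values indicate a model that performs worse than the mean.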

Infiltration and flow resistance parameters are inputs to an event based rainfall-runoff model. However, it is very difficult to obtain field values of these parameters for a given rainfall event. Hence, the best parameters for a given rainfall event must be found from the possible range of parameter values. The GA helps to find the optimal parameter values and substantially reduces the burden of manual calibration. In the present study, the C code of the GA developed by Prof. Kalyanmoy Deb (http://www.iitk.ac.in/kangal/index.shtml) was downloaded and modified for the present problem (Srinivas and Deb 1994; Deb 2001). It is loosely coupled with the C code of the Reddy et al. (2011) runoff model. Loose external coupling keeps the two algorithms independent of each other, and interaction occurs only through a common external file. The common file transfers the decoded parameters to the runoff model, which uses them as input and simulates runoff. The simulated runoff is then transferred to the GA for fitness evaluation. The process is repeated until the best fitness value is obtained.
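The file-based loose coupling described above can be sketched as follows. This is an illustrative Python stand-in, not the C code used in the study: the file names, the JSON format and the toy runoff response inside `model_run` are all assumptions made only to show the exchange pattern.

```python
import json
import os
import tempfile

PARAM_FILE = "params.json"    # hypothetical name for the common exchange file
RUNOFF_FILE = "runoff.json"   # hypothetical name for the simulated-runoff file


def ga_write_parameters(workdir, params):
    """GA side: write the decoded parameters to the common file."""
    with open(os.path.join(workdir, PARAM_FILE), "w") as fh:
        json.dump(params, fh)


def model_run(workdir, rainfall):
    """Stand-in for the runoff model: read parameters, write simulated runoff."""
    with open(os.path.join(workdir, PARAM_FILE)) as fh:
        params = json.load(fh)
    # Toy response only: rainfall in excess of a constant loss rate k_s.
    runoff = [max(0.0, r - params["k_s"]) for r in rainfall]
    with open(os.path.join(workdir, RUNOFF_FILE), "w") as fh:
        json.dump(runoff, fh)


def ga_read_runoff(workdir):
    """GA side: read the simulated runoff back for fitness evaluation."""
    with open(os.path.join(workdir, RUNOFF_FILE)) as fh:
        return json.load(fh)


def evaluate(params, rainfall):
    """One GA-to-model round trip through the exchange files."""
    with tempfile.TemporaryDirectory() as workdir:
        ga_write_parameters(workdir, params)
        model_run(workdir, rainfall)
        return ga_read_runoff(workdir)


simulated = evaluate({"k_s": 1.0}, [0.5, 2.0, 3.5])
```

Because the two sides share only files, either program can be replaced or recompiled without touching the other, which is the independence that loose coupling is meant to provide.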

2.3 SGA Integrated Runoff Model

The methodology flowchart for the runoff model coupled with SGA is shown in Fig. 1. The general steps in the SGA integrated runoff model for calibration of the four parameters are as follows:

Fig. 1 Flowchart for SGA integrated runoff model

  1. Step 1:

    Initially, the required GA and runoff model inputs are given to the model. The input datasets for the runoff model (FEM grid map, Land Use (LU)/Land Cover (LC) map, soil map and drainage map) are prepared using remote sensing and geographical data. The preparation of the maps and detailed information about the parameters are given in Reddy et al. (2011). The SGA inputs are the number of generations, population size, variable types, upper and lower bounds of the parameters, selection method, crossover operator, crossover probability and mutation probability. For this study, the number of generations and the population size are fixed at 100 and 8, respectively. In SGA, the NSE is used as the objective function. The upper and lower bounds of the parameters are fixed for each rainfall event using the available information (Keefer et al. 2008; Reddy et al. 2011). Roulette wheel selection and uniform crossover are used, with a crossover probability of 0.9 and a mutation probability of 0.1.

  2. Step 2:

    For generation zero, an initial population (8 sets of 4 variables) is created in SGA based on the given inputs, in particular the lower and upper bounds of the variables. Each set of variables is transferred to the rainfall runoff model, which, together with the other inputs, simulates the runoff. Once fitness values have been evaluated for all members of the population, the GA stores the best-ever member for generation zero. Based on this best-ever member, the GA creates a new population using the crossover and mutation operators.

  3. Step 3:

    The members of the new population are sent to the rainfall runoff model one by one, and the fitness function value is evaluated for each. Based on the fitness values, the GA stores the best-ever member and the corresponding fitness value. If the best fitness value satisfies the convergence criteria or the SGA reaches generation 100, the process is stopped and runoff is simulated for the best-ever solution.
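The three steps above can be sketched as a short loop. The following Python sketch is an illustration only, not the modified C code of the study: the fitness shift used for roulette wheel selection, the handling of infeasible children, the bounds and the toy fitness surface are all assumptions, while the population size, number of generations and operator probabilities follow Step 1.

```python
import random

# Assumed order of the four calibrated parameters: K_s, S_av, theta_i, theta_s.
THETA_I, THETA_S = 2, 3


def feasible(ind, bounds):
    in_bounds = all(lo <= x <= hi for x, (lo, hi) in zip(ind, bounds))
    return in_bounds and ind[THETA_I] < ind[THETA_S]


def random_individual(bounds, rng):
    while True:  # redraw until both constraints hold
        ind = [rng.uniform(lo, hi) for lo, hi in bounds]
        if feasible(ind, bounds):
            return ind


def roulette_pick(population, fitness, rng):
    # NSE can be negative, so fitness is shifted to positive weights (assumption).
    low = min(fitness)
    weights = [f - low + 1e-9 for f in fitness]
    return rng.choices(population, weights=weights, k=1)[0]


def make_child(p1, p2, bounds, rng, p_cross, p_mut):
    child = list(p1)
    if rng.random() < p_cross:
        # Uniform crossover: each gene comes from either parent with equal chance.
        child = [a if rng.random() < 0.5 else b for a, b in zip(p1, p2)]
    for i, (lo, hi) in enumerate(bounds):
        if rng.random() < p_mut:
            child[i] = rng.uniform(lo, hi)  # mutation: redraw gene within bounds
    # Infeasible children are replaced by a copy of the first parent (assumption).
    return child if feasible(child, bounds) else list(p1)


def run_sga(fitness_fn, bounds, pop_size=8, generations=100,
            p_cross=0.9, p_mut=0.1, target=None, seed=0):
    rng = random.Random(seed)
    population = [random_individual(bounds, rng) for _ in range(pop_size)]
    best, best_fit, history = None, float("-inf"), []
    for _ in range(generations + 1):  # generation zero plus up to 100 more
        fitness = [fitness_fn(ind) for ind in population]
        for ind, fit in zip(population, fitness):
            if fit > best_fit:  # keep the best-ever solution
                best, best_fit = list(ind), fit
        history.append(best_fit)
        if target is not None and best_fit >= target:
            break  # convergence criterion met
        population = [make_child(roulette_pick(population, fitness, rng),
                                 roulette_pick(population, fitness, rng),
                                 bounds, rng, p_cross, p_mut)
                      for _ in range(pop_size)]
    return best, best_fit, history


# Hypothetical bounds and a toy fitness surface, for illustration only.
BOUNDS = [(0.1, 3.0), (1.0, 20.0), (0.05, 0.30), (0.25, 0.50)]
TARGET = [1.2, 8.0, 0.20, 0.45]


def toy_fitness(ind):
    # Peaks at 1.0 when ind equals TARGET, mimicking the upper limit of NSE.
    return 1.0 - sum(((x - t) / (hi - lo)) ** 2
                     for x, t, (lo, hi) in zip(ind, TARGET, BOUNDS))


best, best_fit, history = run_sga(toy_fitness, BOUNDS)
```

In the actual coupling, `fitness_fn` would run the runoff model for one parameter set and return the NSE between simulated and observed runoff; here a toy function stands in so the loop can run on its own.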

2.4 MGA Integrated Runoff Model

The methodology flowchart for the MGA integrated model is shown in Fig. 2. The following general steps are involved in the MGA integrated model.

Fig. 2 Flowchart for MGA integrated runoff model

  1. Step 1:

    The GA and runoff model inputs are the same as for SGA. In MGA, the correlation coefficient (r) and NSE are used as objective functions. The upper and lower bounds of the parameters, the GA operators and their probabilities are the same as in the integrated SGA model.

  2. Step 2:

    For generation zero, an initial population (8 sets of 4 variables) is created in MGA based on the given inputs, and fitness values are evaluated for all members of the population. Pareto optimal solutions are identified using the nondominated sorting method.

  3. Step 3:

    In the MGA scenario, an additional process is added to the GA to select the optimal solutions. All acceptable (Pareto) solutions are assigned rank one using the nondominated sorting GA method. The objective values of these solutions are then combined using the weighted sum of objectives method.

  4. Step 4:

    If the combined value satisfies the convergence criteria or the MGA reaches generation 100, the process is stopped and runoff is simulated for the best-ever solution.
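The selection logic of Steps 2 and 3 can be illustrated with a small sketch: identify the rank-one (nondominated) solutions, then pick one of them by a weighted sum of the two objectives. The Python code below is one possible reading of these steps rather than the study's implementation; the 60 % NSE and 40 % correlation coefficient weights are those reported in the results section, and the objective values in the example are made up.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and better in at least one.

    Both objectives (NSE, r) are maximized.
    """
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def rank_one(objectives):
    """Indices of the nondominated (Pareto, rank-one) solutions."""
    return [i for i, a in enumerate(objectives)
            if not any(dominates(b, a)
                       for j, b in enumerate(objectives) if j != i)]


def weighted_sum(objective, weights=(0.6, 0.4)):
    """Combined fitness: 60 % NSE plus 40 % correlation coefficient."""
    return sum(w * v for w, v in zip(weights, objective))


def select_best(objectives, weights=(0.6, 0.4)):
    """Index of the rank-one solution with the highest weighted sum."""
    front = rank_one(objectives)
    return max(front, key=lambda i: weighted_sum(objectives[i], weights))


# Illustrative (NSE, r) pairs for four candidate parameter sets.
candidates = [(0.60, 0.80), (0.50, 0.90), (0.40, 0.70), (0.65, 0.75)]
front = rank_one(candidates)         # the third candidate is dominated
best_index = select_best(candidates)
```

With these weights the candidate with the highest NSE is favoured unless another rank-one candidate has a clearly higher correlation coefficient, which reflects the larger weight given to NSE.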

2.5 Study Area Description

The Walnut Gulch Experimental Watershed, located in the state of Arizona, United States of America (USA) (Fig. 3), and the Harsul watershed, located in Nashik district, Maharashtra, India (Fig. 4), are chosen as study watersheds. The model is applied to two watersheds to test the validity of the proposed integrated runoff model for providing accurate hydrology prediction and uncertainty intervals. The finite element formulation of the event based rainfall runoff model and the input layer preparation for the Harsul watershed are explained in Reddy et al. (2011). The soils of the Walnut Gulch Experimental Watershed are sandy gravelly loams, and the major watershed vegetation includes grass and shrub species. The Walnut Gulch Experimental Watershed consists of 42 major sub watersheds. The sub watershed selected for model application has an area of 7.8 km². The Digital Elevation Model (DEM), LU/LC, soil and rainfall data were downloaded from the online data access facility of the USDA-ARS (http://www.tucson.ars.ag.gov/dap//). In this study, the runoff parameters for the simulation model are optimized using SGA and MGA.

Fig. 3 Location map of Walnut Gulch watershed

Fig. 4 Location map of Harsul watershed (Reddy et al. 2011)

3 Results and Discussions

In the present study, the GA integrated runoff model is auto calibrated for four infiltration parameters simultaneously for each rainfall event, namely saturated hydraulic conductivity (\( K_s \)), average capillary suction at the wetting front (\( S_{av} \)), initial water content (\( \theta_i \)) and saturated water content (\( \theta_s \)). Twelve rainfall events in Harsul and four rainfall events in Walnut Gulch are simulated using the integrated model. The parameters are optimized using SGA and MGA. The simulated results are compared with the results simulated by HEC_HMS and by the Reddy et al. (2011) model for the same rainfall events. Roulette wheel selection and uniform crossover operators are used in SGA and MGA, with a crossover probability of 0.9 and a mutation probability of 0.1. In SGA, the NSE between simulated and observed runoff is taken as the fitness value of each member of the population. The selection operator is set to maximize the fitness function, since NSE ranges from −∞ to 1 and a member whose NSE is closer to 1 has less error. For the twelve rainfall events of Harsul watershed, the maximum and minimum fitness values observed are 0.65 and −0.57, respectively. To improve the performance of the model, the correlation coefficient is taken as a second objective function, in addition to NSE, in MGA. In MGA, the selection operator is set to maximize these two objective functions. The best member is selected by combining the Pareto optimal solutions using the weighted sum of objectives method, with a weight of 60 % for NSE and 40 % for the correlation coefficient. The maximum and minimum total fitness values observed are 0.85 and −0.16, respectively. In both the single and multi-objective GA models, these fitness values are achieved within 100 generations with a population size of eight.

The upper and lower limits of the parameters for the Harsul and Walnut Gulch watersheds are given in Table 1, together with the optimized parameters obtained using single and multi-objective GA. The observed and simulated hydrographs for the Harsul and Walnut Gulch watersheds are shown in Figs. 5 and 6, respectively. The model simulation results for the Harsul and Walnut Gulch watersheds are shown in Table 2.

Table 1 Upper and lower bounds of the auto-calibration parameters and optimized parameters obtained from SGA and MGA for rainfall events of Harsul and Walnut Gulch watersheds

Fig. 5 Observed and simulated hydrographs generated by the GA integrated runoff model (single and multi-objective), the Reddy et al. (2011) model and the HEC_HMS model for Harsul watershed: a Aug 22, 1997; b Aug 04, 1997; c Jul 27, 1997; d Jul 28, 1997; e Jul 30, 1997; f Sep 26, 1997

Fig. 6 Observed and simulated hydrographs generated by the GA integrated runoff model (single and multi-objective) and the HEC_HMS model for Walnut Gulch watershed: a Jul 20, 2007; b Aug 23, 2009; c Aug 28, 2008; d Aug 28, 2010

Table 2 Model simulated results for Harsul and Walnut Gulch watersheds

Due to the random nature of GA, the best solution found is not guaranteed to be the global optimum. Thus, the solution is cross validated by evaluating each parameter over its parameter space. The validation for the rainfall event of Aug 22, 1997 is shown in Fig. 7, and it is clear from the figure that the solution found by the GA is the optimum solution for that event.

Fig. 7 The convexity of the NSE on the parametric space for the event Aug 22, 1997
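One way to carry out such a check is to vary one parameter at a time across its range while holding the others at the GA solution, and to confirm that no sampled value gives a higher fitness. The Python sketch below illustrates this under the assumption of a one-at-a-time sweep; the source does not state exactly how the parameter space was sampled, and the function names, step count and toy fitness function are hypothetical.

```python
def sweep_parameter(fitness_fn, best, index, lo, hi, steps=21):
    """Fitness along one parameter axis, with the other parameters held at best."""
    results = []
    for k in range(steps):
        value = lo + (hi - lo) * k / (steps - 1)
        trial = list(best)
        trial[index] = value
        results.append((value, fitness_fn(trial)))
    return results


def is_optimum_along_axes(fitness_fn, best, bounds, steps=21, tol=1e-6):
    """True if no sampled point on any axis beats the fitness of best."""
    best_fit = fitness_fn(best)
    for index, (lo, hi) in enumerate(bounds):
        for _, fit in sweep_parameter(fitness_fn, best, index, lo, hi, steps):
            if fit > best_fit + tol:
                return False
    return True


# Toy two-parameter fitness with a single peak at (1.0, 0.3), for illustration.
def toy_fitness(params):
    return 1.0 - (params[0] - 1.0) ** 2 - (params[1] - 0.3) ** 2


toy_bounds = [(0.0, 2.0), (0.0, 1.0)]
at_peak = is_optimum_along_axes(toy_fitness, [1.0, 0.3], toy_bounds)
off_peak = is_optimum_along_axes(toy_fitness, [0.5, 0.3], toy_bounds)
```

A sweep of this kind only tests the axes through the candidate solution, so it supports rather than proves global optimality unless the fitness surface is convex, as Fig. 7 suggests for this event.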

From the simulation results for Harsul watershed with the integrated SGA model, the volume of runoff is simulated within a variation of 12.3 to 75 %, peak runoff within a variation of 2.25 to 50.3 %, and time to peak within a variation of 0.05 to 69.49 %. For the integrated MGA model, the volume of runoff is simulated within a variation of 4.2 to 65.1 %, peak runoff within a variation of 0.85 to 48.65 %, and time to peak within a variation of 0.05 to 69.87 %. From the hydrographs, it is observed that the GA effectively optimized the calibrated parameters. The overall shapes of the hydrographs are well captured by the integrated model.

From the simulation results for the Walnut Gulch watershed with the integrated SGA, the volume of runoff is simulated within a variation of 4.30 to 46.00 %. Peak runoff is simulated within a variation of 0.86 to 16 %. Time to peak is simulated within a variation of 18 to 32 %, except for the event on Aug 28, 2008, for which the variation in time to peak is 74.69 %. For the integrated MGA, the volume of runoff is simulated within a variation of 1.08 to 46.50 %. Peak runoff is simulated within a variation of 4.70 to 14.41 %. Time to peak is simulated within a variation of 18 to 34 %, except for the event on Aug 28, 2008, for which the variation in time to peak is 74.07 %.

The average percentage error for a criterion is the sum of the absolute percentage errors of all events divided by the total number of events in the watershed. From the simulation results for Harsul watershed with the integrated SGA, the average percentage errors for volume of runoff, peak runoff and time to peak over the twelve rainfall events are 44.76, 22.25 and 25.5 %, respectively. For the integrated MGA, the average percentage errors for volume of runoff, peak runoff and time to peak are 39.42, 17.96 and 25.19 %, respectively. For the four rainfall events of the Walnut Gulch watershed with the integrated SGA, the average percentage errors for volume of runoff, peak runoff and time to peak are 30.85, 10.61 and 36.69 %, respectively. For the integrated MGA, the average percentage errors for volume of runoff, peak runoff and time to peak are 30.17, 9.02 and 36.52 %, respectively. For the eleven rainfall events of Harsul watershed, the percentage errors for volume of runoff, peak runoff and time to peak for the Reddy et al. (2011) model are 57.89, 28.30 and 24.27 %, respectively. It is observed that, for all the rainfall events, the integrated model with SGA identified the parameter set better than the runoff model of Reddy et al. (2011). However, because only a single objective function is used, the percentage errors in volume of runoff, peak runoff and time to peak remain comparatively high, and these errors are further reduced with MGA. From the results it is evident that MGA performed better than SGA and the runoff model of Reddy et al. (2011).
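The average percentage error defined above can be computed as in the following Python sketch. The per-event percentage error is assumed here to be the absolute difference between simulated and observed values relative to the observed value, which the text does not state explicitly; the function names and the example values are illustrative only and are not taken from Table 2.

```python
def percentage_error(observed, simulated):
    """Absolute percentage error of one event, relative to the observed value (assumed)."""
    return abs(simulated - observed) / abs(observed) * 100.0


def average_percentage_error(observed_values, simulated_values):
    """Sum of absolute percentage errors over all events divided by the event count."""
    errors = [percentage_error(o, s)
              for o, s in zip(observed_values, simulated_values)]
    return sum(errors) / len(errors)


# Made-up peak runoff values for three events.
observed_peaks = [10.0, 20.0, 40.0]
simulated_peaks = [9.0, 25.0, 40.0]
avg_error = average_percentage_error(observed_peaks, simulated_peaks)
print(f"average percentage error = {avg_error:.2f} %")
```

Because absolute values are averaged, overestimates and underestimates do not cancel, so the measure reflects the typical size of the error rather than its direction.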

The performance of the integrated runoff models (SGA and MGA), the Reddy et al. (2011) model and the HEC_HMS model is evaluated using NSE and correlation coefficient values. The NSE and correlation coefficient values for all the models are shown in Table 3. It is seen that the integrated runoff models (SGA and MGA) perform better than the Reddy et al. (2011) model. The ranges of NSE values obtained for Harsul watershed with the integrated SGA, integrated MGA, HEC_HMS and Reddy et al. (2011) models are [−0.12, 0.65], [−0.61, 0.79], [−3.37, 0.95] and [−5.78, 0.53], respectively. The ranges of correlation coefficient values for Harsul watershed with the integrated SGA, integrated MGA, HEC_HMS and Reddy et al. (2011) models are [0.01, 0.86], [0.19, 0.95], [−0.18, 0.97] and [−0.12, 0.86], respectively. For the rainfall events of the Walnut Gulch watershed, the ranges of NSE values for the integrated SGA, integrated MGA and HEC_HMS models are [−0.52, 0.74], [0.49, 0.65] and [−0.13, 0.82], respectively, and the ranges of correlation coefficient values are [−0.55, 0.90], [0.18, 0.90] and [0.54, 0.92], respectively.

Table 3 Nash-Sutcliffe Efficiency (NSE) and correlation coefficient (r) values for the simulation events

Furthermore, to validate the model accuracy, the model efficiency values obtained in the present study are compared with the available research studies. Lafdani et al. (2013) evaluated the efficiency of Adaptive Neuro-Fuzzy Inference System (ANFIS) based daily runoff simulation and it is reported that the maximum correlation coefficient and NSE values are 0.86 and 0.79 respectively. Haghizadeh et al. (2014) integrated ANN model and Watershed Modeling System (WMS) for estimating the infiltration parameter. The performance of the model was evaluated using correlation coefficient and the maximum value is 0.90. It is observed that the maximum correlation coefficient and NSE values obtained from the present GA integrated runoff model are 0.95 and 0.79 respectively.

From Table 2, it is observed that the HEC_HMS runoff model performs better than the integrated runoff models (SGA and MGA) for all rainfall events of the Walnut Gulch watershed and for six rainfall events of the Harsul watershed. For these rainfall events, the performance of the GA integrated model relative to HEC_HMS may be improved by increasing the number of generations and by enlarging the search space. It should also be noted that the HEC_HMS model involves manual optimization and requires detailed information about the parameters of each sub-basin.

4 Conclusions

The present paper focuses on the applicability of GA based single and multi-objective optimization algorithms for automatic calibration of an event based rainfall runoff model. The integrated GA runoff model has been applied to the Harsul watershed, India, and the Walnut Gulch watershed, USA. For comparison of the simulation results, the same rainfall events have been calibrated and validated with the HEC_HMS model. The model performance has been tested using NSE and the correlation coefficient. A set of parameters for the runoff model that results in a good fit with measured stream flow data was obtained using SGA and MGA. The integrated GA runoff model structure reduces the calibration time compared with manual calibration. From the simulation results, it is observed that the model predicts the peak runoff and time to peak reasonably well when compared with the observed data and with the simulation results generated by HEC_HMS. However, the volume of runoff was underestimated to some extent in all twelve data sets. The developed GA integrated model can be useful in real time flow simulation models and for simulating the flow parameters of data sparse watersheds.