
4.1 Introduction

A model is “a small object, usually built to scale, that represents in detail another, often larger object.” In water modelling, the model is not physically built; rather, mathematical relations are applied in order to simulate reality (Chapra 1997). Single-process modelling has been applied in hydrology and hydraulics since the 1950s (Hashemi and O’Connell 2010a). Predicting peak discharge from rainfall (Shaw 1994; O’Connell 1991; Singh and Woolhiser 2002), and the use of the Sherman unit hydrograph (Sherman 1932), are examples of important attempts by scientists to explain and quantify hydrological phenomena. The Stanford Watershed Model was the first comprehensive digital model created after the emergence of computers in the late 1950s (Hashemi and O’Connell 2010b). Physically based, spatially distributed models were the next generation of hydrological models (Freeze and Harlan 1969), an example of which is the Système Hydrologique Européen (SHE) modelling system (Abbott et al. 1986; Bathurst 1986). SHE was later developed into two separate models: MIKE SHE (Refsgaard and Storm 1995) and SHETRAN (Ewen et al. 2000).

Third generation models were mathematical, developed to simulate watershed hydrological processes, in addition to sediment transport and water quality (Singh and Woolhiser 2002; Fakhri et al. 2014). Geographic Information System (GIS) and remote sensing development provided the opportunity for further application of the abovementioned models, by adding spatial dimensions to the outputs.

To illustrate the growth of modelling development up to 1991, an inventory of more than 60 watershed hydrological models was reported (Dzurik 2003; Singh and Woolhiser 2002). Development of water resources systems modelling and optimization progressed, with numerous simulation and optimization examples, such as reservoir system simulations, hydrological flood forecasting, and water quality models (Biswas 1974). Between the mid-1960s and early 1980s, more than 39 major projects were recorded around the globe which used hybrid or integrated modelling techniques in their assessments, and linked different components of water resources systems (Loucks et al. 1985; Dzurik 2003). Wurbs (1997) listed a number of generalized water resources simulation models in the following categories: watershed, river hydraulics, river and reservoir water quality, reservoir/river system operation, groundwater, water distribution system hydraulics, and demand forecasting.

With the development of new models, and the subsequent increase in the number of models available, selecting an appropriate model became an ever more important issue. Research has been conducted to compare models’ abilities and limitations. Kovács (2004) compared the results of SWAT and MONERIS, and found that SWAT is weak in estimating phosphorus loads, as it does not account for inorganic phosphorus attached to sediment. SWAT has also been compared to HSPF (Singh et al. 2005), where it was shown that SWAT is more powerful in the simulation of low flows. The reason posited was a potential underestimation of evapotranspiration, which was confirmed by Saleh and Du (2004).

The aim of this chapter is to classify models based on their applications and the structures on which they are developed, assisting with selecting the desired model in different scenarios.

4.2 Water Systems Modelling for Quantity and Quality

Existing water systems models simulate water quality, water quantity, or both. Integrating these discrete and continuous aspects of water systems, together with socioeconomic parameters, into one model helps users analyze a water system holistically. Water quality models differ in various ways, and since the early twentieth century their evolution has progressed in step with societal concerns and available computational abilities.

Early water quality models, such as the Streeter–Phelps model (1925) and that of Velz (1947), focused on the quantification of dissolved oxygen in streams and estuaries. Subsequently, O’Connor (1967) provided a model with respiratory and spatial bacterial simulation capabilities. These early models were, however, limited to linear kinetics, simple geometries, and steady-state receiving waters, due to the absence of advanced computational tools. Following the development of computers in the 1960s, models underwent considerable improvement, particularly in numerical expressions of their analytical frameworks (Thomann 1963). Two-dimensional systems were new improvements during this period, with models being used to simulate activities and processes within watersheds. Operational research was added to models’ abilities, in order to generate more cost-effective treatment alternatives (Thomann and Sobel 1964; Deininger 1965; Ravelle et al. 1967). In the 1970s, eutrophication was one of the main water quality problems that attracted attention within nutrient/food chain models (Chen 1970; Chen and Orlob 1975), which employed nonlinear kinetics equations. Subsequent advancements in modelling included environmental issues such as solute transport and the fate of toxicants (Chapra 1997). In the last decade, advancement in computer hardware and software has led to a revolution in modelling; two- or three-dimensional models with highly mechanistic kinetics are now readily available with graphical user interfaces at reasonable costs.

In this section, examples of well-known models are given, with a brief background on their development and use. The models discussed are: AGNPS and AnnAGNPS, ANSWERS and ANSWERS-Continuous, CASC2D, MIKESHE, DWSM, KINEROS, HSPF, SWAT, PRMS, HEC-HMS, HEC-RAS, and WEAP.

4.2.1 AGNPS

AGNPS (Agricultural Non-Point Source pollution model) is an event-based model developed at the USDA-ARS North Central Soil Conservation Research Laboratory in Morris, Minnesota, and designed to analyze the impact of non-point source pollutants from predominantly agricultural watersheds on the environment (Young et al. 1987).

The model components include transport of sediment, nitrogen (N), phosphorus (P), and chemical oxygen demand (COD), with user interfaces for data input and analysis, among other capabilities. Revision of this model was undertaken by the USDA-ARS National Sedimentation Laboratory (NSL) in Oxford, Mississippi, and led to its upgrading to the Annualized Agricultural Non-Point Source model (AnnAGNPS) (Bingner and Theurer 2001). This upgraded model is practical for continuous simulation of hydrology and soil erosion, as well as transport of sediment, nutrients, and pesticides. The model has source accounting capabilities and user-interactive programs, including TopAGNPS, which generates cells and stream networks from Digital Elevation Models (Borah and Bera 2003).

4.2.2 ANSWERS

ANSWERS (Areal Non-point Source Watershed Environment Response Simulation) was developed at Purdue University in West Lafayette, Indiana. ANSWERS considers various processes of runoff, infiltration, subsurface drainage, and erosion for single-event storms. The model has two major components: hydrology (with the conceptual basis adapted from Huggins and Monke (1966)), and upland erosion response (with the conceptual basis adapted from Foster and Meyer (1972)). ANSWERS-Continuous is an upgraded version of ANSWERS, developed at the Virginia Polytechnic Institute and State University in Blacksburg, Virginia; the upgrade is a one-dimensional model which uses square grids with similar hydrological characteristics. Although it is not able to simulate sediment transport, nitrogen and phosphorus transport and transformation are possible. ANSWERS-Continuous was also improved and expanded to include upland nutrient transport and losses, drawing on newer models such as GLEAMS and EPIC (Williams et al. 1984; Leonard et al. 1987).

4.2.3 CASC2D

CASC2D (Cascade of planes in two dimensions) is a physically based model with single-event and long-term continuous simulation components, capable of simulating water and sediment in two-dimensional overland grids and one-dimensional channels (Ogden and Julien 2002). Development of CASC2D occurred in two phases, initially at Colorado State University in Fort Collins, Colorado, and then at the University of Connecticut in Storrs, Connecticut.

4.2.4 MIKESHE

MIKESHE is a physically based model, founded on the European Hydrological System (SHE). It was developed by a European consortium of three organizations: the French consulting firm SOGREAH, the UK Institute of Hydrology, and the Danish Hydraulic Institute (Abbott et al. 1986). The model performs simulations of water, sediment, and water quality parameters in two-dimensional overland grids, one-dimensional channels, one-dimensional unsaturated flow layers, and three-dimensional saturated flow layers. MIKESHE has both continuous long-term and single-event simulation capabilities (Borah and Bera 2003). The model can also simulate dissolved conservative solutes in surface water, soil, and groundwater by applying a numerical solution of the advection–dispersion equation for each regime.

4.2.5 DWSM

The DWSM (Dynamic Watershed Simulation Model) was developed by the Illinois State Water Survey (ISWS) in Champaign, Illinois. This event-based model can simulate distributed surface and subsurface storm-water runoff, erosion, sediment, and agrochemical transport in agricultural and rural watersheds during single rainfall events. It simulates nutrients and pesticides, soil and water temperature, dissolved oxygen, carbon dioxide, nitrate, ammonia, organic N, phosphate, organic P, and pesticides in dissolved, adsorbed, and crystallized forms (Borah and Bera 2003).

4.2.6 KINEROS

The KINematic runoff and EROSion model was developed between the 1960s and 1980s at the USDA-ARS in Fort Collins, Colorado. It is a single-event model, with the ability to simulate rainfall excess, overland flow, surface erosion and sediment transport, channel erosion and sediment transport, and flow, sediment, and channel routing (Smith et al. 1995).

4.2.7 HSPF

The Hydrological Simulation Program-Fortran was initially developed by the U.S. Environmental Protection Agency (USEPA) in 1980. A combination of the Stanford Watershed Model (SWM), the Agricultural Runoff Management (ARM) model, the Non-point Source Runoff (NPS) model, and the Hydrologic Simulation Program (HSP), including HSP Quality, formed the basis of HSPF (Donigian Jr and Crawford 1979). The U.S. Geological Survey (USGS) has added various software tools to the model for better interaction among its capabilities, from model input and data storage to input–output analyses and calibration; as a result, different versions of the model have been released, for instance Version 8 in 1984 and Version 10 in 1993 (Bicknell et al. 1993). HSPF is a continuous watershed simulation model, with components for runoff and water quality constituents on pervious and impervious land areas, and for the movement of water and constituents in stream channels and mixed reservoirs (Borah and Bera 2003).

4.2.8 SWAT

The SWAT (Soil and Water Assessment Tool) was developed at the USDA-ARS Grassland, Soil, and Water Research Laboratory in Temple, Texas. Development was geared towards creating a means of predicting the impact of management on water, sediment, and agricultural chemical yields in watersheds or river basins, and the model is able to account for parameters such as hydrology, weather, sedimentation, soil temperature, crop growth, nutrients, pesticides, and agricultural management. SWAT is a continuous long-term model, based on a daily time step, but recent improvements allow for the use of rainfall input at any time increment, and channel routing at an hourly time step (Arnold 2002). Similar to HSPF, SWAT is incorporated into the USEPA’s BASINS for non-point source simulations on agricultural lands. SWAT simulates nitrate-N based on water volume and average concentration; runoff P based on partitioning factors; daily organic N and sediment-adsorbed P losses using loading functions; crop N and P use from supply and demand; and pesticides based on plant leaf area index, application efficiency, wash-off fraction, organic carbon adsorption coefficient, and exponential decay according to half-lives (Bazrkar and Sarang 2011).

4.2.9 PRMS

The Precipitation-Runoff Modelling System was developed at the USGS in Lakewood, Colorado. PRMS has both long-term and single-storm modes. The “long-term” mode of PRMS is a hydrological model, while the “single-storm event” mode has hydrology and surface runoff, channel flow, channel reservoir flow, soil erosion, and overland sediment transport components. In addition, it is linked to the USGS data management program ANNIE for formatting input data and analyzing simulated results (Borah and Bera 2003).

4.2.10 HEC-HMS

The Hydrologic Modelling System was initially created by the Hydrologic Engineering Center within the U.S. Army Corps of Engineers in 1998 as a single-event model to replace an older standard model for hydrologic simulation, HEC-1. The Hydrologic Modelling System provides a variety of options for simulating precipitation-runoff processes, and components that cover a wide range of hydrologic features, such as precipitation, potential evapotranspiration, snowmelt, infiltration, surface runoff, base flow, channel routing, and channel seepage (Xuefeng and Alan 2009).

HEC-HMS, in addition to HEC-1’s capabilities, provides a number of features, such as continuous simulation and grid cell surface hydrology. In 2005, the U.S. Army Corps of Engineers added two different soil moisture models for continuous modelling: one with five layers and another with a single layer. This model also includes advanced numerical analysis and graphical user interfaces which make it simpler and more efficient than its predecessor.

4.2.11 HEC-RAS

The Hydrologic Engineering Center River Analysis System (HEC-RAS) was developed and released in 1995 by the U.S. Department of Defense, Army Corps of Engineers, with the aim of performing one-dimensional hydraulic calculations for a full network of natural and constructed channels. The components of this model for one-dimensional river analysis can be divided into four categories: steady flow water surface profile computations, unsteady flow simulation, movable boundary sediment transport computations, and water quality analysis (Hicks and Peacock 2005).

4.2.12 WEAP

The Water Evaluation and Planning system was initially developed in 1988 and continued by the U.S. Center of the Stockholm Environment Institute, a nonprofit research institute based at Tufts University in Somerville, Massachusetts. Different components of this model, including hydrology, climate, land use, technology, and socioeconomic factors, offer a wide variety of simulation capabilities. Water demand and supply, runoff, evapotranspiration, infiltration, crop irrigation requirements, instream flow requirements, ecosystem services, groundwater and surface storage, reservoir operations, and pollution generation, treatment, discharge, and instream water quality are all parameters considered in this model (Sieber 2011).

4.3 Time and Space Scale

Scaling refers to an increase or reduction in size and can be defined mathematically as a function as follows:

(4.1)

Where g is a small-scale function of state variables s, parameters θ, and inputs I. G is the corresponding large-scale function.

Hydrological models can be divided into two categories: predictive and investigative. Predictive models are applied to determine the answer to a particular problem, while investigative models are developed based on our perception of hydrological processes. Both types, however, are similar in that they involve the following: (1) data collection and analysis; (2) development of a conceptual model; (3) development of the mathematical model from the conceptual model; and (4) calibration and validation of the mathematical model. Some of the steps have to be repeated until validation is satisfactory. Unfortunately, the conditions for prediction are often different in space or time scale from those of the modelling data set. The time scale, in this instance, refers to a characteristic time (or length) of a process, observation, or model and can range from seconds (e.g., flash floods of several minutes’ duration) to hundreds of years or more (such as flow in aquifers). Hydrological processes occur across a wide range of space scales, from small (a 1 m soil profile) to large (such as floods in river systems with catchments millions of square kilometers in area). “Scaling,” within this context, is defined as a transfer of information across scales by extrapolation or interpolation. The limitations of measurement techniques and logistics define the observation scale. The observation scale is related to the finite number of samples and can be defined by the spatial or temporal extent (coverage) of a data set, the spacing (resolution) between samples, or the integration volume (time) of a sample. Based on the nature of the process, space scales can be defined as spatial extent or integral scale; the integral scale in this instance refers to the average distance or time over which a property is correlated.

Ideally, the observation scale should be equal to the process scale. This is seldom possible, due to the cost and other limitations associated with observation instruments. If a process lasts longer than the coverage of a data set, it will appear as a trend in the data; conversely, a process shorter than the resolution (spacing) appears as “noise.” The modelling scale is defined based on the process characteristics and the applications of the hydrological model. Some typical model scales are categorized in Table 4.1 below.

Table 4.1 Time and space scale

The gap between scaling and modelling can be bridged using one of three methods: upscaling, downscaling, and regionalization. Upscaling entails transferring data to a larger scale and consists of two steps: (1) distributing and (2) aggregating. To illustrate, assume estimation of rainfall in a catchment is carried out using a small number of rain gauges. In step (1), the small-scale precipitation is distributed over the catchment, for example as a function of topography. Interpolation is the distribution of information over space and time; since hydrological measurements are much more coarsely spaced in space than in time, most interpolation refers to the space domain. The isohyetal method, optimum interpolation/kriging, spline interpolation, moving polynomials, and inverse distance weighting are all methods of solving this classical problem in hydrology. In step (2), the spatial distribution of rainfall is aggregated into one single value, as sketched below. In contrast, disaggregating and singling out are the two steps involved in downscaling. Transferring data and information from one catchment to another is defined as regionalization, and can feasibly be carried out if the catchments are similar. The difficulty in scaling, however, derives from the heterogeneity of catchments and the variability of hydrological processes. The term heterogeneity typically refers to variations in space and is related to media properties, while the term variability here refers to differences in space, time, or both, and is often used for fluxes (e.g., runoff).
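To make the two upscaling steps concrete, the following minimal sketch (in Python, assuming NumPy is available) distributes hypothetical rain-gauge values over a catchment grid by inverse distance weighting, one of the interpolation methods listed above, and then aggregates the resulting field into a single catchment value. The gauge locations, rainfall amounts, grid, and power parameter are illustrative assumptions, not data from any particular catchment.

```python
import numpy as np

def idw_interpolate(gauge_xy, gauge_rain, grid_xy, power=2.0):
    """Step 1 (distributing): spread point rainfall over the catchment grid
    by inverse distance weighting."""
    # Distances from every grid cell to every gauge, shape (n_cells, n_gauges)
    d = np.linalg.norm(grid_xy[:, None, :] - gauge_xy[None, :, :], axis=2)
    d = np.maximum(d, 1e-6)            # avoid division by zero at gauge locations
    w = 1.0 / d**power                 # inverse-distance weights
    return (w * gauge_rain).sum(axis=1) / w.sum(axis=1)

def aggregate(cell_values):
    """Step 2 (aggregating): collapse the spatial field to one catchment value."""
    return cell_values.mean()

if __name__ == "__main__":
    # Hypothetical gauges (x, y in km) and observed event rainfall (mm)
    gauges = np.array([[2.0, 3.0], [8.0, 7.0], [5.0, 9.0]])
    rain = np.array([12.0, 20.0, 16.0])
    # Regular 1-km grid covering an assumed 10 km x 10 km catchment
    xs, ys = np.meshgrid(np.arange(0.5, 10.5), np.arange(0.5, 10.5))
    grid = np.column_stack([xs.ravel(), ys.ravel()])
    field = idw_interpolate(gauges, rain, grid)
    print(f"catchment-average rainfall: {aggregate(field):.1f} mm")
```

Kriging or spline interpolation could be substituted for the `idw_interpolate` step without changing the aggregation step.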

Hydrological processes may display one or more of the following properties: discontinuity, periodicity, and randomness. Intermittency of rainfall events within discrete zones is referred to as discontinuity; within these zones, properties are relatively uniform and predictable. Periodicity is shown in the annual cycle of runoff and is predictable. Randomness, while not predictable in detail, is predictable in terms of statistical properties such as the probability density function (PDF). Statistical calculations can be applied when the property of randomness is observed in a data set (Blöschl and Sivapalan 1995).

4.3.1 Time Scales in Modelling

Models use various ranges of time steps, from seconds to years, based on their applications. Temporal scale is one of the most important factors in modelling, as time step length remains constant throughout the model run. Choosing a time step requires special care; a time step that is too short will require unnecessary computing power, while a time step that is too long will create model instability and simulation failure (Todd 2007). To illustrate, applying a second as a time step in modelling may waste time; on the other hand, using a year as a time step cannot simulate an ephemeral event. Table 4.2 shows the temporal scale for the models discussed in this chapter.

Table 4.2 Models’ temporal scale

4.3.1.1 Event-Based Models

Some models are developed to simulate a particular event. Event modelling shows the response of a watershed or basin to an individual event; HEC-HMS, AGNPS, ANSWERS, CASC2D, DWSM, KINEROS, and PRMS in storm mode are all models within this category. AGNPS runs for the storm duration, while the temporal scales of ANSWERS, CASC2D, and KINEROS vary based on numerical stability. DWSM is variable in the number of time steps required. In an HEC-HMS simulation covering a single event, the time scale can be specified as several days (Xuefeng and Alan 2009).

4.3.1.2 Continuous Models

Continuous modelling synthesizes processes and phenomena over a longer period than event-based models (Xuefeng and Alan 2009). For example, AnnAGNPS runs as a long-term model in daily or sub-daily steps, while ANSWERS-Continuous can simulate in dual time steps: daily for dry days and 30 s for rainy days. HSPF can run hourly as a long-term model, while SWAT, another long-term model, runs with a daily time step. MIKESHE is both a long-term and a storm-event model, and its time step varies depending on numerical stability. HEC-HMS can span multiple decades for period-of-record simulations. The discussed models are categorized in Table 4.3 based on their types.

Table 4.3 Types of water system models

4.3.2 Space Scale in Modelling

Distributed parameter models subdivide the catchment into a number of units to quantify the hydrological variability that occurs at a range of scales. These units are called HRUs (hydrological response units), subcatchments, hillslopes, contour-based elements, or square grid elements. The representation of a process within a unit (element) involves local (site) scale descriptions, and some assumptions on the variability within the unit. Distributed parameter hydrological models often represent local phenomena in considerable detail, while the variability within a unit is often neglected. To drive the models, input variables need to be estimated for each element by some sort of interpolation between observations. Unfortunately, distributed models are limited by the extreme heterogeneity of catchments, which makes accurately defining element-to-element variations and subgrid variability difficult; with a large number of model parameters, model calibration and evaluation also become difficult (Blöschl and Sivapalan 1995).

4.3.3 Mathematical Bases for the Selected Models

Hydrological processes are related to many aspects of the atmosphere, water, and soil, making them complex to formulate numerically. Fortunately, statistical and mathematical methods can help hydrologists to improve existing models and develop new ones. In this section, the hydrological models discussed above are reviewed in terms of the mathematical bases which reflect model performance and application.

The dynamic wave (St. Venant, or shallow water wave) equations are the governing equations (continuity and momentum) for gradually varied unsteady flow, as follows (Singh 1996):

$$ \frac{\partial h}{\partial t}+\frac{\partial Q}{\partial x}=0 $$
(4.2)
$$ \frac{\partial u}{\partial t}+u\frac{\partial u}{\partial x}+g\frac{\partial h}{\partial x}=g\left({S}_0-{S}_{\mathrm{f}}\right) $$
(4.3)

where

h = flow depth (m)

Q = flow per unit width (m3 s−1 m−1)

u = water velocity (m s−1)

g = gravitational acceleration (m s−2)

S0 = bed slope (m m−1)

Sf = energy gradient (m m−1)

t = time (s)

x = longitudinal distance (m)

CASC2D is the only one of these watershed models that uses the dynamic wave equations, and only on a limited basis, due to their computationally intensive numerical solutions.

The diffusive wave equations consist of the continuity equation and a simplified momentum equation, and are used in some models (Singh 1996):

$$ \frac{\partial h}{\partial t}+\frac{\partial Q}{\partial x}=q $$
(4.4)
$$ \frac{\partial h}{\partial x}=\left({S}_0-{S}_f\right) $$
(4.5)

where

q is the lateral inflow per unit width and per unit length (m3 s−1 m−1 m−1).

CASC2D and MIKESHE use approximate numerical solutions of these equations for routing surface runoff over overland planes and through channel segments. In order to compute flow, Manning’s formula is used as follows (Ogden and Julien 2002):

$$ Q=\frac{1}{n}A{R}^{\frac{2}{3}}{S}_{\mathrm{f}}^{\frac{1}{2}} $$
(4.6)

where

n = Manning’s roughness coefficient

A = flow cross-sectional area per unit width (m2 m−1)

R = hydraulic radius (m)

$$ {S}_0={S}_{\mathrm{f}} $$
(4.7)

Equations (4.4) and (4.7) illustrate the kinematic wave equations that are well-accepted tools for modelling a variety of hydrological processes (Singh 1996). In the momentum equation, the energy gradient is equal to the bed slope. In order to express this equation as a parametric function of stream hydraulic parameters, a suitable law of flow resistance can be used (Borah 1989).

$$ Q=\alpha {h}^m $$
(4.8)

where

α is the kinematic wave parameter

m is the kinematic wave exponent

and α and m are related to channel or plane roughness and geometry.

Equations (4.4) and (4.8) are the kinematic wave equations, which have the advantage of yielding an analytical solution. These equations generate only one system of characteristics; in other words, waves traveling upstream, as is the case with backwater flow, cannot be represented. Singh (2002) suggested that kinematic wave solutions provide accurate results for many problems of hydrological significance. An approximate numerical solution of the kinematic wave equations is used in KINEROS and PRMS, whereas analytical and approximate shock-fitting solutions are applied in DWSM.
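The sketch below illustrates the kinematic wave formulation of Eqs. (4.4) and (4.8) with a simple explicit upwind finite-difference scheme for overland flow, taking α and m from Manning’s formula for a wide plane (α = √S0/n, m = 5/3). The plane geometry, roughness, rainfall excess, and time step are assumed values chosen to satisfy the stability requirement dt ≤ dx/c (c being the kinematic celerity); the models cited above use their own, more sophisticated numerical and analytical solutions.

```python
import numpy as np

def kinematic_wave(q_lat, dx, dt, n_manning=0.05, slope=0.01, m=5.0 / 3.0):
    """Explicit upwind solution of dh/dt + dQ/dx = q with Q = alpha*h**m
    (Eqs. 4.4 and 4.8). q_lat is rainfall excess (m s^-1) with shape
    (n_steps, n_nodes). Returns outlet discharge per unit width."""
    alpha = np.sqrt(slope) / n_manning        # Manning-based kinematic parameter
    n_steps, n_nodes = q_lat.shape
    h = np.zeros(n_nodes)                     # flow depth (m)
    q_out = np.zeros(n_steps)                 # outlet discharge (m^2 s^-1)
    for k in range(n_steps):
        Q = alpha * h**m                      # discharge per unit width at nodes
        # Upwind difference of Q; the kinematic celerity controls stability
        dQdx = np.empty(n_nodes)
        dQdx[0] = Q[0] / dx                   # upstream boundary: no inflow assumed
        dQdx[1:] = (Q[1:] - Q[:-1]) / dx
        h = np.maximum(h + dt * (q_lat[k] - dQdx), 0.0)
        q_out[k] = alpha * h[-1]**m
    return q_out

if __name__ == "__main__":
    dx, dt = 10.0, 5.0                        # 10 m nodes, 5 s time step (assumed)
    n_nodes, n_steps = 20, 1200               # 200 m plane, 100 min simulation
    q = np.zeros((n_steps, n_nodes))
    q[:360, :] = 20e-3 / 3600.0               # 20 mm/h rainfall excess for 30 min
    hydrograph = kinematic_wave(q, dx, dt)
    print(f"peak outlet flow: {hydrograph.max():.4e} m^2/s")
```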

For flow routing, the simple storage-based (nonlinear reservoir) equations (4.6 and 4.9) are used in ANSWERS, ANSWERS-Continuous, and HSPF (Borah and Bera 2003):

$$ \frac{ds}{dt}=I-O $$
(4.9)

where

s = storage volume of water (m3)

I = inflow rate (m3 s−1)

O = outflow rate (m3 s−1)
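A minimal sketch of this storage-based routing is given below: Eq. (4.9) is integrated with an explicit Euler step and closed with an assumed power-law rating O = k·s^p, standing in for whichever outflow relation (e.g., Manning-based, as in Eq. (4.6)) a particular model actually uses. The rating coefficients and the inflow pulse are illustrative only.

```python
def nonlinear_reservoir(inflow, dt, k=3e-7, p=1.5, s0=0.0):
    """Integrate ds/dt = I - O (Eq. 4.9) with an assumed rating O = k * s**p.
    inflow: sequence of inflow rates (m^3 s^-1); dt: time step (s)."""
    s, outflow = s0, []
    for i in inflow:
        o = k * s**p                      # outflow computed from current storage
        s = max(s + dt * (i - o), 0.0)    # explicit Euler update of storage
        outflow.append(o)
    return outflow

# Example: a triangular inflow pulse routed (and attenuated) through the reservoir
hydrograph = [0, 2, 5, 8, 5, 2, 1, 0, 0, 0]
print([round(o, 2) for o in nonlinear_reservoir(hydrograph, dt=3600.0)])
```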

SWAT, AGNPS, and AnnAGNPS do not route water by means of mass conservation-based continuity equations. In order to compute runoff volumes, these models apply the USDA Soil Conservation Service runoff curve number method (SCS 1972), while other empirical relations similar to the rational formula are used to compute peak flows. The empirical procedure is used in SWAT in order to route water in channels. The SCS runoff curve number method, in addition to an interception-infiltration alternative procedure, is used in DWSM to estimate rainfall excess rates at discrete time intervals, while the interception-infiltration routine is used in the following models: ANSWERS, ANSWERS-Continuous, CASC2D, HSPF, KINEROS, MIKESHE, and PRMS.

$$ {Q}_{\mathrm{r}}=\frac{{\left(P-0.2{S}_{\mathrm{r}}\right)}^2}{P+0.8{S}_{\mathrm{r}}} $$
(4.10)
$$ {S}_{\mathrm{r}}=\frac{25,400}{\mathrm{CN}}-254 $$
(4.11)
$$ {Q}_{\mathrm{p}}=0.0028CiA $$
(4.12)

where

Qr = direct runoff (mm)

P = cumulative rainfall (mm)

Sr = potential difference between rainfall and direct runoff, i.e., the potential maximum retention (mm)

CN = curve number representing the runoff potential of a soil cover complex

Qp = peak runoff rate (m3 s−1)

C = the runoff coefficient

i = rainfall intensity (mm h−1)

A = watershed area (ha)
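As a short worked illustration of Eqs. (4.10)–(4.12), the sketch below computes direct runoff from the SCS curve number method and a peak flow estimate from the rational formula; the curve number, storm depth, runoff coefficient, rainfall intensity, and watershed area are arbitrary example values.

```python
def scs_runoff(p_mm, cn):
    """Direct runoff Qr (mm) from cumulative rainfall P (mm), Eqs. (4.10)-(4.11)."""
    s_r = 25_400.0 / cn - 254.0          # potential maximum retention (mm)
    if p_mm <= 0.2 * s_r:                # rainfall below the initial abstraction
        return 0.0
    return (p_mm - 0.2 * s_r) ** 2 / (p_mm + 0.8 * s_r)

def rational_peak(c, i_mm_h, area_ha):
    """Peak runoff rate Qp (m^3 s^-1) from the rational formula, Eq. (4.12)."""
    return 0.0028 * c * i_mm_h * area_ha

# Example: a 60 mm storm on a CN = 78 watershed of 250 ha with C = 0.35, i = 25 mm/h
print(f"direct runoff: {scs_runoff(60.0, 78):.1f} mm")
print(f"peak flow:     {rational_peak(0.35, 25.0, 250.0):.1f} m^3/s")
```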

4.4 Model Calibration and Verification

A model’s performance needs to be evaluated to provide: (1) a quantitative estimate of the model’s ability to reproduce historic and future watershed behavior; (2) a means for evaluating improvements to the modelling approach through adjustment of model parameter values and model structural modifications, the inclusion of additional observational information, and the representation of important spatial and temporal characteristics of the watershed; and (3) a comparison of current modelling efforts with previous study results. Calibration makes a model useful and applicable to a specific watershed, and in this section the various methods of calibration are introduced.

In addition, in order to study scenario effects on a watershed using a calibrated model, the model must be verified. Verification is defined as the “examination of the numerical technique in the computer code to ascertain that it truly represents the conceptual model and there are no inherent numerical problems” (Reckhow 1990), while validation is the comparison of model results with an independent data set (without further adjustment).

The most fundamental approach to assessing model performance is visual inspection of the simulated and observed hydrographs. Calibration is defined as model testing with known input and output, used to adjust or estimate model factors. The key factors influencing model calibration are the calibration parameters, the length of the calibration period, and the calibration criterion (objective function). Objective assessment generally requires a mathematical estimate of the error between simulated and observed hydrological variables.

Calibration parameters are selected with the specific characteristics of the model and watershed playing a key role; each series of measured variables included in the objective function is important in reducing the problem of nonuniqueness. Sensitivity analysis is a well-known method of choosing calibration parameters. Sensitivity analysis in model calibration is optional, but highly recommended for all parameters in the early stages of calibration; it is conducted by keeping all parameters constant at realistic values while varying each parameter in turn within an assigned range.
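The one-at-a-time procedure just described can be sketched as follows: every parameter is held at a realistic base value while one parameter at a time is varied across its assigned range, and the resulting spread of the objective function is recorded. The `run_model` callable, the parameter names, and the ranges below are placeholders for whatever model and objective function are being calibrated.

```python
import numpy as np

def oat_sensitivity(run_model, base_params, param_ranges, n_points=5):
    """One-at-a-time sensitivity analysis.
    run_model(params_dict) -> value of the objective function (e.g. RMSE);
    base_params: realistic base value for every parameter;
    param_ranges: assigned (min, max) range for each parameter."""
    base_value = run_model(base_params)
    spread = {}
    for name, (lo, hi) in param_ranges.items():
        values = []
        for x in np.linspace(lo, hi, n_points):
            trial = dict(base_params)      # all other parameters stay at base values
            trial[name] = x
            values.append(run_model(trial))
        spread[name] = max(values) - min(values)   # range attributable to this parameter
    # Parameters with the largest spread are the most sensitive candidates for calibration
    return base_value, dict(sorted(spread.items(), key=lambda kv: -kv[1]))

# Toy example: the objective reacts strongly to 'cn' and weakly to 'k'
def toy_objective(p):
    return (p["cn"] - 75.0) ** 2 + 0.1 * p["k"]

print(oat_sensitivity(toy_objective,
                      base_params={"cn": 80.0, "k": 0.5},
                      param_ranges={"cn": (60.0, 95.0), "k": (0.1, 1.0)}))
```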

After choosing an objective function, physically meaningful absolute minimum and maximum ranges of parameters are selected. Lack of information, however, may cause the model user to assume a uniform distribution of all parameters within this range. Parameters’ ranges have to be as large as possible, due to their constraining role in model calibration.

$$ {b}_j:\kern0.5em {b}_{j,\mathrm{abs}\_\min}\le {b}_j\le {b}_{j,\mathrm{abs}\_\max},\kern1em j=1,\dots, m $$
(4.13)

where

bj is the jth parameter, and m is the number of parameters to be estimated (Abbaspour 2008).

A decrease or increase in the calibration period affects the calibration results, and there are various ways of defining the simulation, calibration, and verification periods (Abbaspour 2008). Defining an objective function is a crucial step in model calibration, and different formulations have been reviewed (Legates and McCabe 1999; Gupta et al. 1998). Each formulation yields a different result, so the range of final parameters in the model ultimately depends on the objective function. In order to achieve a multi-criteria formulation, various types of objective functions (root mean square error, absolute difference, logarithm of differences, R2, chi-square, Nash-Sutcliffe, etc.) have been combined. These well-known objective functions are covered in the subsequent sections of the chapter.

4.4.1 Root Mean Square Error (RMSE)

RMSE-based objective functions are presented in two forms: multiplicative and summation (Green and Stephenson 1986). The summation form is written as:

$$ g={w}_1{\sum}_i{\left({Q}_m-{Q}_s\right)}_i^2+{w}_2{\sum}_i{\left({S}_m-{S}_s\right)}_i^2+{w}_3{\sum}_i{\left({N}_m-{N}_s\right)}_i^2+\dots, \kern2em {w}_i=\frac{1}{n_i{\sigma}_i^2} $$
(4.14)

where

σi2 is the variance of the ith measured variable and ni is the number of observations of that variable. Alternatively, the weights can be set as

$$ {w}_1=1,\kern1em {w}_2={\overline{Q}}_m/{\overline{S}}_m,\kern1em {w}_3={\overline{Q}}_m/{\overline{N}}_m $$
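A minimal sketch of this multi-variable objective function: squared errors are accumulated per measured variable (e.g., discharge, sediment, and nutrient series) and weighted, here with the inverse-variance weights wi = 1/(ni σi2) shown in Eq. (4.14); the alternative ratio-of-means weights could be substituted in the same place.

```python
import numpy as np

def weighted_sse(series_pairs):
    """Weighted sum-of-squares objective over several measured variables
    (Eq. 4.14). series_pairs: list of (measured, simulated) array pairs,
    e.g. [(Q_m, Q_s), (S_m, S_s), (N_m, N_s)]."""
    g = 0.0
    for measured, simulated in series_pairs:
        measured = np.asarray(measured, dtype=float)
        simulated = np.asarray(simulated, dtype=float)
        w = 1.0 / (measured.size * measured.var())   # w_i = 1 / (n_i * sigma_i^2)
        g += w * np.sum((measured - simulated) ** 2)
    return g
```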

4.4.2 Coefficient of Determination R2

The coefficient of determination R2 is defined as the squared value of the coefficient of correlation (Pearson 1896). It is calculated as:

$$ {R}^2=\frac{{\left[{\sum}_i\left({Q}_{m,i}-{\overline{Q}}_m\right)\left({Q}_{s,i}-{\overline{Q}}_s\right)\right]}^2}{{\sum}_i{\left({Q}_{m,i}-{\overline{Q}}_m\right)}^2\ {\sum}_i{\left({Q}_{s,i}-{\overline{Q}}_s\right)}^2} $$
(4.15)

The R2 range lies between 0 and 1; a result nearer 1 indicates a better, more comparable simulation. R2 can also be expressed as the squared ratio between the covariance and the multiplied standard deviations of the observed and predicted values. Therefore, it estimates the combined dispersion against the single dispersions of the observed and predicted series. R2 thus describes how much of the observed dispersion is explained by the prediction. A value of zero means no correlation at all, whereas a value of 1 means that the dispersion of the prediction is equal to that of the observation. The fact that only the dispersion is quantified is one of the major drawbacks of R2 if it is considered in isolation: a model which systematically over- or under-predicts will still display R2 values close to 1 even if all the predictions are wrong. If R2 is used for model validation, it is therefore advisable to take into account additional information which can cope with this problem. Such information is provided by the gradient (b) and the intercept (a) of the regression on which R2 is based. Ideally, the intercept a should be close to zero, which means that an observed runoff of zero would also result in a prediction near zero, and the gradient b should be close to one.

For proper model assessment, the gradient b should always be discussed in tandem with R2. To do this in a more operational way, the two parameters can be combined to provide a weighted version (wR2) of R2. Such a weighting can be performed as follows (Abbaspour 2008):

$$ w{R}^2=\left\{\begin{array}{ll}\left|b\right|\cdot {R}^2 & \mathrm{for}\kern0.5em b\le 1\\ {}{\left|b\right|}^{-1}\cdot {R}^2 & \mathrm{for}\kern0.5em b>1\end{array}\right. $$
(4.16)

By weighting R2, under- or over-predictions are quantified together with the dynamics, which results in a more comprehensive reflection of model results.
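Following Eqs. (4.15) and (4.16), the sketch below computes R2, the regression gradient b of simulated on observed values, and the weighted form wR2. The synthetic example of a uniform 30 % under-prediction illustrates the point made above: R2 remains at (essentially) 1.0 while wR2 drops to about 0.7.

```python
import numpy as np

def r_squared(obs, sim):
    """Coefficient of determination (Eq. 4.15): squared Pearson correlation."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return np.corrcoef(obs, sim)[0, 1] ** 2

def weighted_r_squared(obs, sim):
    """wR2 (Eq. 4.16): R2 scaled by the regression gradient b of sim on obs."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    r2 = r_squared(obs, sim)
    b = np.polyfit(obs, sim, 1)[0]          # gradient of the regression line
    return abs(b) * r2 if b <= 1 else r2 / abs(b)

# A uniform 30 % under-prediction keeps R2 near 1.0 but lowers wR2 to about 0.7
obs = np.array([1.0, 2.0, 4.0, 8.0, 5.0, 2.0])
print(r_squared(obs, 0.7 * obs), weighted_r_squared(obs, 0.7 * obs))
```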

4.4.3 Chi-square

Chi-square is a statistical method of assessing the goodness of fit between a set of observed values and those expected theoretically and is calculated as follows (Mann and Wald 1942):

$$ {\chi}^2=\frac{{\sum}_i{\left({Q}_m-{Q}_s\right)}_i^2}{\sigma_Q^2} $$
(4.17)

4.4.4 Nash-Sutcliffe Coefficient

The efficiency E, proposed by Nash and Sutcliffe (1970), is defined as one minus the sum of the squared differences between the predicted and observed values, normalized by the variance of the observed values during the period under investigation. It is calculated as:

$$ \mathrm{NS}=1-\frac{{\sum}_i{\left({Q}_m-{Q}_s\right)}_i^2}{{\sum}_i{\left({Q}_{m,i}-{\overline{Q}}_m\right)}^2} $$
(4.18)

The normalization by the variance of the observation series results in relatively higher values of E in catchments with higher dynamics and lower values of E in catchments with lower dynamics. To obtain comparable values of E in a catchment with lower dynamics, the prediction has to be better than in a basin with high dynamics. The range of E lies between 1 (perfect fit) and −∞. An efficiency lower than zero indicates that the mean value of the observed time series would have been a better predictor than the model. The greatest disadvantage of the Nash-Sutcliffe efficiency is the fact that the differences between the observed and predicted values are calculated as squared values. As a result, larger values in a time series are strongly overestimated, whereas lower values are neglected. For the quantification of runoff predictions, this leads to an overestimation of the model performance during peak flows and an underestimation during low flow conditions.

Similar to R2, the Nash-Sutcliffe efficiency is not very sensitive to systematic model over- or under-prediction, especially during low flow periods. If NS is closer to 1, the results of the simulation have higher validity and less error. NS values between 0 and 1 are generally considered acceptable, while a negative NS indicates that the mean of the observed values is a better predictor than the simulated values, i.e., unacceptable model performance (Moriasi et al. 2007). For instance, negative NS coefficients were reported by Saleh and Du (2004) when estimating daily sediment, and in another study estimating monthly discharge (Sudheer et al. 2007).
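A direct transcription of Eq. (4.18) as a short function is given below; the observed and simulated hydrographs are made-up example values. Because the benchmark in the denominator is the observed mean, NS ≤ 0 means the model is no better than simply predicting that mean.

```python
import numpy as np

def nash_sutcliffe(obs, sim):
    """Nash-Sutcliffe efficiency (Eq. 4.18). 1 = perfect fit; <= 0 means the
    observed mean would have been at least as good a predictor as the model."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

obs = np.array([1.0, 2.0, 6.0, 14.0, 7.0, 3.0, 1.5])   # observed hydrograph
sim = np.array([1.2, 2.5, 5.0, 12.0, 8.0, 3.5, 1.0])   # simulated hydrograph
print(f"NS = {nash_sutcliffe(obs, sim):.2f}")
```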

4.4.5 Index of Agreement d

The index of agreement d was proposed to overcome the insensitivity of E and R2 to differences in the observed and predicted means and variances (Willmott 1981). The index of agreement represents the ratio of the mean square error to the potential error and is defined as:

$$ d=1-\frac{{\sum}_{i=1}^n{\left({O}_i-{P}_i\right)}^2}{{\sum}_{i=1}^n{\left(\left|{P}_i-\overline{O}\right|+\left|{O}_i-\overline{O}\right|\right)}^2} $$
(4.19)

The potential error in the denominator represents the largest value that the squared difference of each pair can attain. With the mean square error in the numerator, d is also very sensitive to peak flows and insensitive to low flow conditions, as is E. The range of d is similar to that of R2 and lies between 0 (no correlation) and 1 (perfect fit). Practical applications of d show that it has some disadvantages: (1) relatively high values (more than 0.65) of d may be obtained even for poor model fits, leaving only a narrow range for model calibration; and (2) d is not sensitive to systematic model over- or under-prediction.
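Equation (4.19) can be transcribed in the same way; as above, the arrays are simply whatever observed and predicted series are being compared.

```python
import numpy as np

def index_of_agreement(obs, sim):
    """Index of agreement d (Eq. 4.19): 0 = no agreement, 1 = perfect fit."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    potential_error = np.sum((np.abs(sim - obs.mean()) + np.abs(obs - obs.mean())) ** 2)
    return 1.0 - np.sum((obs - sim) ** 2) / potential_error
```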

4.4.6 Nash-Sutcliffe Efficiency with Logarithmic Values ln E

To reduce the problem of the squared differences and the resulting sensitivity to extreme values, the Nash-Sutcliffe efficiency E is often calculated with logarithmic values of O and P. Through the logarithmic transformation of the runoff values, the peaks are flattened, and the low flows are kept more or less at the same level. As a result, the influence of the low flow values is increased in comparison to the flood peaks, resulting in an increased sensitivity of lnE to systematic model over- or under-prediction (Krause et al. 2005).

4.4.7 Modified Forms of E and d

The logarithmic form of E is widely used to overcome the oversensitivity to extreme values induced by the mean square error in the Nash-Sutcliffe efficiency and the index of agreement and to increase the sensitivity for lower values (Krause et al. 2005). In addition to this modification, a general form of the two equations can be used for the same purpose:

$$ {E}_j=1-\frac{{\displaystyle {\sum}_{i=1}^n\left|{O}_i-{P}_i\right|{}^j}}{{\displaystyle {\sum}_{i\kern0.5em =1}^n\left|{O}_i-\overline{O}\right|{}^j}}\mathrm{with}\;j\in N $$
(4.20)
$$ {d}_j=1-\frac{{\sum}_{i=1}^n\left|{O}_i-{P}_i\right|{}^j}{{\sum}_{i=1}^n{\left(\left|{P}_i-\overline{O}\right|+\left|{O}_i-\overline{O}\right|\right)}^j}\mathrm{with}\;j\in N $$
(4.21)

In particular, when j = 1, overestimation of flood peaks is reduced significantly, resulting in a better overall evaluation. Based on this result, it can be expected that the modified forms are more sensitive to significant over- or under-prediction than the squared forms. In addition, the modified forms with j = 1 always produce lower values than the forms with squared parameters. This behavior can be viewed in two ways: (1) The lower values leave a broader range for model calibration and optimization, or (2) the lower values might be interpreted as a worse model result when compared to the squared forms. A further increase in the value of j results in an increase in the sensitivity to high flows; thus, it is used when only the high flows are of interest, e.g., for flood prediction.

4.4.8 Relative Efficiency Criteria Erel and drel

All the criteria described above quantify the difference between observation and prediction by absolute values. As a result, an over- or under-prediction of higher values has a greater influence than that of lower values. To counteract this, efficiency measures based on relative deviations (Krause et al. 2005) can be derived from E and d as follows:

$$ {E}_{\mathrm{rel}}=1-\frac{{\displaystyle {\sum}_{i=1}^n{\left(\frac{O_i-{P}_i}{O_i}\right)}^2}}{{\displaystyle {\sum}_{i=1}^n{\left(\frac{O_i-\overline{O}}{\overline{O}}\right)}^2}} $$
(4.22)
$$ {d}_{\mathrm{rel}}=1-\frac{{\displaystyle {\sum}_{i=1}^n{\left(\frac{O_i-{P}_i}{O_i}\right)}^2}}{\kern0.5em {\displaystyle {\sum}_{i=1}^n{\left(\frac{\left|{P}_i-\overline{O}\left|+\right|{O}_i-\overline{O}\right|}{\overline{O}}\right)}^2}} $$
(4.23)

Through this modification, the differences between the observed and predicted values are quantified as relative deviations which reduce the influence of the absolute differences significantly during high flows. On the other hand, the influence of the absolute lower differences during low flow periods is enhanced, as it is significant if looked at relatively. As a result, it can be expected that the relative forms are more sensitive to systematic over- or under-prediction, in particular during low flow conditions.
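The variants of Sects. 4.4.6–4.4.8 can be collected in a few short functions, sketched below: lnE (the Nash-Sutcliffe efficiency on log-transformed flows, here with a small assumed offset eps to keep zero flows finite), the modified forms Ej and dj of Eqs. (4.20) and (4.21), and the relative forms Erel and drel of Eqs. (4.22) and (4.23). The relative forms assume strictly positive observed values, since the deviations are divided by Oi.

```python
import numpy as np

def ln_nse(obs, sim, eps=1e-6):
    """lnE: Nash-Sutcliffe efficiency on log-transformed flows (Sect. 4.4.6).
    eps is an assumed offset to keep zero flows finite."""
    o, s = np.log(np.asarray(obs, float) + eps), np.log(np.asarray(sim, float) + eps)
    return 1.0 - np.sum((o - s) ** 2) / np.sum((o - o.mean()) ** 2)

def modified_e(obs, sim, j=1):
    """E_j (Eq. 4.20): absolute differences raised to the power j."""
    o, s = np.asarray(obs, float), np.asarray(sim, float)
    return 1.0 - np.sum(np.abs(o - s) ** j) / np.sum(np.abs(o - o.mean()) ** j)

def modified_d(obs, sim, j=1):
    """d_j (Eq. 4.21): index of agreement with power j instead of squares."""
    o, s = np.asarray(obs, float), np.asarray(sim, float)
    den = np.sum((np.abs(s - o.mean()) + np.abs(o - o.mean())) ** j)
    return 1.0 - np.sum(np.abs(o - s) ** j) / den

def relative_e(obs, sim):
    """E_rel (Eq. 4.22): deviations taken relative to the observed values."""
    o, s = np.asarray(obs, float), np.asarray(sim, float)
    return 1.0 - np.sum(((o - s) / o) ** 2) / np.sum(((o - o.mean()) / o.mean()) ** 2)

def relative_d(obs, sim):
    """d_rel (Eq. 4.23): index of agreement with relative deviations."""
    o, s = np.asarray(obs, float), np.asarray(sim, float)
    den = np.sum(((np.abs(s - o.mean()) + np.abs(o - o.mean())) / o.mean()) ** 2)
    return 1.0 - np.sum(((o - s) / o) ** 2) / den
```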

4.4.9 Measures of Efficiency

Krause et al. (2005) investigated nine different efficiency measures for the evaluation of model performance, using three different examples.

In the first example, efficiency values were calculated for a systematically under-predicted runoff hydrograph. The systematic error was not reflected by most of the measures, which produced values between 1.0 (R2) and 0.81 (lnE). Only the weighted form wR2 and the modified form E1 produced lower values of 0.7 and 0.62, and therefore proved to be more sensitive to the model error in this example. Since most of the criteria investigated are primarily focused on reproducing the dynamics rather than the volume of the hydrograph, it is advisable to quantify volume errors with additional measures, such as the absolute or relative volume error or the mean squared error, for a thorough model evaluation.

In the second experiment of Krause et al. (2005), 10,000 random predictions were created by modifying the values of an observed hydrograph in order to compare the behavior of the different efficiency measures against one another. It was found that E and R2 were not correlated. To improve the sensitivity of R2, the weighted form wR2 was proposed, which takes the deviation of the gradient from 1.0 into account. With wR2, a good positive correlation with E was found, highlighting the improved applicability of wR2 over R2 for model evaluation. In this case, the comparison of the index of agreement d with E revealed that only the “ideal” values for both measures were found in the same model realizations; in the range of lower values, an increasing amount of scatter occurred. From these comparisons, and the fact that E, R2, wR2, and d are based on squared differences, it is fair to say that these efficiency measures are primarily focused on the peaks and high flows of the hydrograph, at the expense of improvements to the low flow predictions. For better quantification of the error in fitting low flows, the logarithmic Nash-Sutcliffe efficiency (lnE) was tested. The comparison of lnE with E and d showed almost no correlation, which is evidence that lnE is sensitive to other components of the model results. The findings of example 3 in Krause’s work showed that lnE reacts less to peak flows and more to low flows than E. To further increase the sensitivity of efficiency measures to low flow conditions, the relative forms of E and d were proposed. The results from the three examples showed that neither Erel nor drel was able to reflect the systematic under-prediction of example 1. The comparison in example 2 demonstrated that the correlation of Erel with E was similar to that of lnE with E. This was underpinned by the comparison of Erel with lnE, which showed a linear trend but also a considerable amount of scatter. In example 3, the scatter was explained by the fact that Erel showed virtually no reaction to model enhancement during peak flow, being mostly sensitive to better model realization during low flow conditions. A more suitable measure of the quality of the model results over the entire period was found in the two modified forms E1 and d1. Both showed linear correlations not only with E and d, but also with lnE. These findings were underpinned by the evolution of E1 and d1 during example 3, where they showed average values between the extremes of E and d on one side and lnE, Erel, and drel on the other.

Overall, it can be stated that none of the efficiency criteria described and tested performed ideally. Each of the criteria has specific pros and cons which have to be taken into account during model calibration and evaluation. The most frequently used Nash-Sutcliffe efficiency and the coefficient of determination are very sensitive to peak flows, at the expense of performance during low flow conditions. This is also true for the index of agreement, as all three measures are based on squared differences between observation and prediction. Additionally, it was shown that R2 alone should not be used for model evaluation, as it is based solely on correlation and can produce high values for very poor model results. To counteract this, the weighted form wR2 was proposed, which integrates the gradient b into the evaluation.

The Nash-Sutcliffe efficiency calculated with logarithmic values is more sensitive to low flows, but still reacts to peak flows. This reaction can be suppressed by the derivation of the relative form Erel, which proved to be sensitive solely to low flows and showed no reaction to peak flows. Based on this behavior, Erel could be suitable for calibrating model parameters which are responsible for low flow conditions; the use of E or R2 for such a task often results in the statement that the parameter under consideration is not sensitive.

As for more global measures, the modified forms E1 and d1 were identified as a kind of middle ground between the squared and relative forms. One drawback associated with these criteria is that it is more difficult to achieve high values, which makes them less attractive at first glance.

We conclude here that in scientifically sound model calibration and validation, a combination of different efficiency criteria is recommended and should be complemented by assessment of the absolute or relative volume error. The selection of the best efficiency measures should reflect the intended use of the model and should concern model quantities which are deemed relevant for the study at hand. The goal should be to provide good values for a set of measures, even if they are lower than single best realizations, to include the full set of dynamics of the model results (Krause et al. 2005).

4.5 Discussion

Due to the multitude of models available for the simulation of water resources and systems, the selection process requires special attention. Tables 4.4 and 4.5 summarize each model’s characteristics, abilities, weaknesses, and limitations.

Table 4.4 Models’ abilities and components
Table 4.5 Models’ weaknesses and limitations

4.6 Selecting a Model for Estimating Nutrient Yield and Transportation During Flash Floods and Wet Seasons

In order to exemplify the selection process between available models, we have selected a problem within the Chamgordalan Reservoir watershed in Iran. Located between latitudes 33° 23′ 53″ and 33° 38′ 56″ N and longitudes 46° 20′ 25″ and 48° 36′ 58″ E, the Chamgordalan Reservoir watershed contains three rivers: the Golgol, Chaviz, and Ama. The watershed has an area of 471.6 km2, is heavily forested and, as a mountainous watershed, has an average land slope of approximately 34 %. The absolute maximum and minimum annual temperatures are 40.6 °C and −13.6 °C, respectively, and the average annual rainfall, recorded at the Ilam synoptic station, is 616 mm.

The watershed’s topography is characterized by high mountains, steep slopes and deep valleys, making it highly vulnerable to flooding. Soil is exposed to erosion, and sediment as well as pollutants are transported downstream by common flash floods in the region. Pollutants accumulate in the reservoir, resulting in an increase in unusable volume, and a subsequent reduction in water quality. In September 2008, one of these flash floods occurred, leading to a critical water quality situation. The decision makers in Ilam province had no way of preventing the flow of this water into the urban potable water network, and this led to a crisis in public health and hygiene lasting several days (Bazrkar and Sarang 2011).

A model with the capability of simulating runoff and chemical parameters during the flood event had to be selected. After preparation of the data, selection of a suitable model from the pool of available models was undertaken using the flowchart in Fig. 4.1, in which the abovementioned criteria were considered. The inverted pyramids on the left-hand side of the figure were then applied in order to select the model. In this case, DWSM was selected as an event-based model for the simulation of nutrient transport during floods.

Fig. 4.1 Model selection flowchart in modelling a flood event

4.7 Selecting a Model for Estimating Nutrient Yield and Transportation During Regular Flow

Data scarcity at the local scale for flood simulation (a lack of observation data at minute and hourly resolution) means that event-based models could not be applied in the case of the Chamgordalan Reservoir watershed; selection must then be carried out from among the available continuous models. Figure 4.1 illustrates that the SWAT model was selected in order to simulate nutrient yield and transport during regular flow; the inverted pyramids on the right-hand side of Fig. 4.1 relate to the selection process during regular flow.

4.8 Summary and Concluding Remarks

Each model has its most appropriate applications, and choosing an inappropriate model may result in unexpected errors. Understanding a model’s capabilities, advantages, and disadvantages will help the user to achieve the desired objectives. Using models as “clear boxes” with regard to their underlying assumptions leads to more valuable results, while application as “black boxes” increases the possibility of undesirable results. The aim of this chapter was not to rank models, but rather to present their unique capabilities and applications. This chapter has attempted to shed light on the “black boxes” of the most well-known models and helps users to select an appropriate model that is best suited to their unique situation.