Reservoir Inflow Forecasting Using Ensemble Models Based on Neural Networks, Wavelet Analysis and Bootstrap Method

Kumar, Sanjeet; Tiwari, Mukesh Kumar; Chatterjee, Chandranath; Mishra, Ashok

doi:10.1007/s11269-015-1095-7

Published: 13 August 2015

Volume 29, pages 4863–4883, (2015)

Water Resources Management

Sanjeet Kumar¹, Mukesh Kumar Tiwari², Chandranath Chatterjee¹ & Ashok Mishra¹

Abstract

Accurate and reliable forecasting of reservoir inflow is necessary for efficient and effective water resources planning and management. The aim of this study is to develop an ensemble modeling approach based on wavelet analysis, bootstrap resampling and neural networks (BWANN) for reservoir inflow forecasting. The performance of the BWANN model is compared with that of wavelet based ANN (WANN), wavelet based MLR (WMLR), bootstrap and wavelet analysis based multiple linear regression (BWMLR), standard ANN, and standard multiple linear regression (MLR) models. Robust ANN and WANN models are ensured by following state of the art methodologies in the field. To develop the WANN models, the original time series data are first decomposed using wavelet transformation, and the resulting wavelet sub-time series are used as inputs instead of the standard data used for the ANN model. To ensure a robust WANN model, different types of wavelet functions are utilized. Further, a comparative analysis is carried out among different approaches to WANN model development using wavelet sub-time series. Seven years of daily reservoir inflow data, outflow data from two upstream reservoirs in the Damodar catchment, and rainfall data from five upstream rain gauge stations are considered. Of the 7 years of daily data, 5 years are used for training, 1 year for cross-validation and the remaining 1 year to evaluate the performance of the developed models. Different performance indices indicated better performance of the WANN model in comparison with the WMLR, ANN and MLR models for inflow forecasting. This study demonstrated the effectiveness of proper selection of wavelet functions and of an appropriate methodology for wavelet based model development. Moreover, the BWANN models performed better than the BWMLR model for uncertainty assessment, and it is found that a forecast range, rather than a point prediction, is more reliable and accurate and can be very helpful for operational inflow forecasting.

1 Introduction

Accurate reservoir inflow forecasting is necessary for planning and management of available water resources. Inflow forecasts can be applied to reservoir operation and management, flood control, drought management, water supply for irrigation, industrial and domestic uses, and hydro-power generation. This calls for a model that is accurate and can be easily applied to operational reservoir inflow forecasting. Several approaches have been applied to map the non-linear relationship between rainfall and runoff, such as empirical, conceptual, physically based and data-driven models (Verma et al. 2010; Paudel et al. 2011; Adamowski et al. 2013; Gad 2013). Data-driven models have been applied in different fields of water resources with promising results (Jain et al. 2001; Mehta and Jain 2009; Mukerjee et al. 2009; Tiwari and Chatterjee 2010; Tiwari et al. 2013; Rath et al. 2013). In recent years, the complex nature of water resource variables has brought increased attention to the potential of soft computing techniques, including fuzzy logic and genetic programming (Kant et al. 2013), support vector regression (Herrera et al. 2010), and artificial neural networks (ANN) (Adamowski 2008; Tiwari et al. 2013). Neural networks are information processing systems capable of mimicking the functioning of the human brain, and they have been widely applied as an effective method for modeling highly non-linear phenomena in hydrological processes (Abrahart et al. 2012).

One of the earliest uses of ANNs in reservoir inflow forecasting was reported by Coulibaly et al. (2000). This research was followed by numerous studies on reservoir inflow forecasting (Jothiprakash and Magar 2012; Okkan 2012; Krishna 2014). Jothiprakash and Magar (2012) used artificial intelligence (AI) techniques such as artificial neural network (ANN), adaptive neuro-fuzzy inference system (ANFIS) and linear genetic programming (LGP) to predict daily and hourly multi-time-step ahead intermittent reservoir inflow of the Koyna river watershed in Maharashtra, India. Similarly, Valipour et al. (2013) developed ANN and ARIMA models for forecasting the inflow of the Dez dam reservoir using monthly discharges from 1960 to 2007 and showed better performance of static and dynamic autoregressive artificial neural networks in forecasting the inflow to dam reservoirs. Among previous studies, the feed forward back propagation neural network (FFBP-NN), commonly referred to simply as ANN, is the most widely used method.

Although ANN methods have been used extensively as useful tools for prediction of hydrological variables, they have some limitations in dealing with non-stationary data (Cannas et al. 2006; Partal 2009). A non-stationary time series has a mean and variance that change over time, whereas a stationary time series reverts to a constant long-term mean and exhibits a constant variance independent of time. Daily flow time series are often non-linear and non-stationary (Rao et al. 2003; Wang et al. 2006). Non-stationarity such as seasonal variations and trends significantly affects modeling of time series and generally leads to poor predictability in practical applications (Francesco and Bernd 2000). Since hydrological time series include several frequency components and have non-linear relationships, hybrid model approaches, which combine different data pre-processing and combination techniques, have been used to raise the prediction performance of neural networks. Wavelet analysis has emerged as an effective tool to simplify the non-stationarity in a dataset and has been widely coupled with neural networks for forecasting water resource variables (Zhou et al. 2008; Makwana and Tiwari 2014; Kisi and Shiri 2012; Sahay and Sehgal 2013; Sahay and Srivastava 2014; Sehgal et al. 2014a, b). To develop a WANN model, wavelet sub-time series generated using the discrete wavelet transformation (DWT) are used as inputs to the ANN models. The DWT decomposes the original time series into several components, each of which has a distinct role in the original data. The low-frequency component, or approximation, generally reflects the identity (periodicity and trend) of the original data, whereas the high-frequency components (i.e. details) uncover sharp fluctuations (Kucuk and Oglu 2006).
There are several applications of WANN models to water resource variables in the literature, including successful applications to inflow forecasting (Okkan 2012; Krishna 2014; Sehgal et al. 2014a, b). More recently, Krishna (2014) demonstrated the potential of wavelet analysis and moving average (MA) methods in conjunction with two types of neural networks, i.e. feed forward neural network and radial basis (RB) neural network, and with multiple linear regression (MLR) models for predicting the daily inflow of the Malaprabha reservoir, Belgaum, India. The results showed that the WANN model performs better than the ANN and MLR models in forecasting the inflow hydrograph. The author suggested undertaking further studies using data from upstream gauging stations to strengthen the findings.

ANN models, including WANN models, are prone to uncertainty because the arrangement of input data in training and testing leads to different optimal model structures and parameter values (Abrahart 2003; Arhami et al. 2013). The bootstrap resampling method, which generates different realizations of a dataset by resampling with replacement, has been found to be efficient, simple and less complex than the Bayesian method for assessing uncertainty in forecasts (Sharma and Tiwari 2009; Hinsbergen et al. 2009). Ensemble forecasting using the bootstrap method has improved model performance compared with a single ANN model and has provided an assessment of the uncertainty associated with the forecasts, making it easier to implement in practice (Tiwari and Chatterjee 2011; Wang et al. 2013).

Even though wavelet analysis has significantly improved the performance of ANN models in different fields of water resources, the way wavelet sub-time series are used to develop WANN models differs between studies. Some previous work used all the components for WANN model development (Wang and Ding 2003; Zhou et al. 2008; Nourani et al. 2009; Adamowski and Sun 2010; Maheswaran and Khosa 2012). Other studies removed the d1 component, treating it as noise because its correlation with the original data was very low, and generated a new time series by summing all the components of a particular time series except d1 (Partal and Kisi 2007; Kişi 2009; Rajaee et al. 2010; Kisi and Cimen 2011). In some studies, a new wavelet time series was developed by summing the effective wavelet components identified through correlation analysis (Tiwari and Chatterjee 2010, 2011). In this study, instead of adopting one selective method to generate a new wavelet time series, all the combinations reported in previous studies along with some new approaches are tested. Moreover, when WANN models are developed for forecasting water resources variables, generally one particular wavelet transformation function and one level of decomposition are applied. However, each wavelet function has its own strengths in capturing the different characteristics and physical structure of hydrological processes, and relying completely on a model based on one particular wavelet function therefore often leads to predictions that capture some phenomena at the expense of others (Rathinasamy et al. 2013).

Therefore, in this study a combined bootstrap and wavelet based ANN model is proposed for the first time for inflow forecasting. The study aims to (i) use wavelet analysis to extract non-stationarity from the dataset, (ii) combine the strengths of wavelet analysis and bootstrap resampling on a single platform to produce a model that is accurate and reliable, (iii) evaluate different approaches to combining discrete wavelet components, (iv) evaluate the performance of different wavelet functions, and (v) produce inflow forecasts with confidence bands, assessing the uncertainty associated with these forecasts.

2 Materials and Methods

2.1 Study Area

The Damodar catchment, a part of the lower Ganges River, is located in the upper reaches of the Damodar river basin in Jharkhand state of India (Fig. 1). The area lies between 23° 34′ and 24° 09′ North latitude and 84° 42′ and 86° 46′ East longitude, with elevation varying between 122 and 1340 m above mean sea level. The catchment has a sub-tropical climate, and daily mean relative humidity varies from 40 to 95 % with alternating dry and wet periods. The daily mean temperature of the area ranges from 4 to 43 °C, with an average annual rainfall of 1390 mm, most of which occurs from July to September. The study area comprises mixed forest, mainly deciduous and tropical moist forest with many thorny bushes, and agricultural land. The Damodar catchment contains three reservoirs, Konar, Tenughat and Panchet, constructed on the Damodar river in 1955, 1972 and 1959, respectively, for hydropower generation, water supply for irrigation, industrial and domestic uses, and flood control. The catchment is further divided into three sub-catchments, Konar, Tenughat and Panchet, with areas of 997, 4480 and 5401 km², respectively. The total area of the catchment is about 10878 km², and the main stream, 350 km long, drains into the Panchet reservoir.

2.2 Dataset Used in the Study

Daily rainfall data for five stations covering the study area (i.e. R₁, R₂, R₃, R₄ and R₅) were collected from the India Meteorological Department (IMD), Pune. Daily reservoir outflow data for the Konar (K_O) and Tenughat (T_O) reservoirs and daily inflow data for the Panchet reservoir (P_I) were collected for 7 years, from 01 January 2001 to 31 December 2007, from the Reservoir Operation Department, Damodar Valley Corporation (DVC), Maithon, Jharkhand. From the available dataset, the first 5 years (01 Jan. 2001 to 31 Dec. 2005) were used for training, the data for 2006 for cross-validation, and the data for 2007 for evaluation of the developed standard and combined ANN models (Table 1).
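As a quick check on the partition sizes, the sketch below counts the daily records in each subset. It assumes that the three subsets follow calendar-year boundaries (2001 to 2005, 2006, 2007); the helper `daily_dates` and the dictionary `splits` are illustrative names, not part of the study.

```python
from datetime import date, timedelta

def daily_dates(start, end):
    """Return every calendar day from start to end, inclusive."""
    n_days = (end - start).days + 1
    return [start + timedelta(days=k) for k in range(n_days)]

# Assumed calendar-year boundaries for the three subsets.
splits = {
    "training": daily_dates(date(2001, 1, 1), date(2005, 12, 31)),
    "cross_validation": daily_dates(date(2006, 1, 1), date(2006, 12, 31)),
    "testing": daily_dates(date(2007, 1, 1), date(2007, 12, 31)),
}

for name, days in splits.items():
    print(name, len(days))
```

Under this assumption the training subset holds 1826 daily patterns (2004 is a leap year), which matches the value of N used later for the decomposition level.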

Table 1 Partitioning of data for ANN model development

2.3 Artificial Neural Networks

An artificial neural network (ANN) is a powerful mathematical approach, based on imitation of human brain functioning, that forms a model structure capable of mapping the complex non-linear relationships and processes inherent among several variables. In simpler terms, a feed forward neural network is a network of computational nodes arranged in layers: an input layer, one or more hidden layers, and an output layer. As this approach is fast and efficient in highly complex and noisy environments and can solve a wide range of problems, ANNs have been applied in numerous real-world applications, including time series prediction (Abrahart et al. 2012). For details on the general properties of ANNs and their applications in water resource engineering, interested readers are referred to Bishop (1995), Haykin (1999), Maier and Dandy (2010) and Abrahart et al. (2012).

2.4 Wavelet Analysis

Wavelet analysis utilizes a wavelet function known as the mother wavelet, ψ(t), which satisfies $ \int_{-\infty}^{+\infty}\psi(t)\,dt=0 $, and successive wavelets can be derived as (Mallat 1989)

$$ \begin{array}{ll}{\psi}_{a,b}(t)={\left|a\right|}^{-\frac{1}{2}}\psi \left(\frac{t-b}{a}\right)\hfill & b\in R,\;a\in R,\kern0.62em a\ne 0\hfill \end{array} $$

(1)

where a and b are the scale and time factor, respectively, and R is the domain of real numbers.

For a time series with a finite energy signal f(t) ∈ L²(R), the continuous wavelet transform is defined as (Kisi 2010)

$$ {W}_f\left(a,b\right)={\left|a\right|}^{-\frac{1}{2}}{\displaystyle \underset{R}{\int }f(t){\psi}^{*}\left(\frac{t-b}{a}\right)dt} $$

(2)

where W_f(a, b) is the matrix of wavelet coefficients (its contour map is known as a scalogram) and ψ* denotes the complex conjugate of ψ.

To avoid generating a large number of coefficients, the discrete wavelet transform (DWT) is applied, as it is convenient and very useful for solving practical problems. The DWT is obtained by constraining the wavelet dilation (a) and translation (b) parameters to discrete values and is defined as (Mallat 1989)

$$ {\psi}_{m,n}\left(\frac{t-b}{a}\right)={a}_0^{-m/2}{\psi}^{*}\left(\frac{t-n{b}_0{a}_0^m}{a_0^m}\right) $$

(3)

where the integers m and n determine the magnitude of wavelet dilation and translation, respectively, a₀ represents a specified dilation step greater than 1 (most commonly a₀ = 2), and b₀ represents the location parameter, which must be greater than zero (most commonly b₀ = 1).
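To make the dyadic case (a₀ = 2, b₀ = 1) concrete, the following is a minimal sketch of a three-level decomposition into additive sub-series using the Haar wavelet, for which the level-j approximation is simply the mean over blocks of 2^j samples. The function name `haar_components` and the sample series are hypothetical, the series length is assumed to be a multiple of 2³, and other wavelet families would require a wavelet library rather than this shortcut.

```python
def haar_components(x, levels=3):
    """Split x into additive Haar sub-series [A_levels, d_1, ..., d_levels].

    Assumes len(x) is a multiple of 2**levels. A_j is the mean of x over
    consecutive blocks of length 2**j, and d_j = A_(j-1) - A_j with A_0 = x.
    """
    n = len(x)
    if n % (2 ** levels) != 0:
        raise ValueError("length must be a multiple of 2**levels")
    approx_prev = list(x)              # A_0 is the series itself
    details = []
    for j in range(1, levels + 1):
        block = 2 ** j
        approx = []
        for start in range(0, n, block):
            block_mean = sum(x[start:start + block]) / block
            approx.extend([block_mean] * block)
        # The detail at level j is what the coarser approximation loses.
        details.append([p - a for p, a in zip(approx_prev, approx)])
        approx_prev = approx
    return [approx_prev] + details

series = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]   # hypothetical values
a3, d1, d2, d3 = haar_components(series, levels=3)
rebuilt = [a + b + c + d for a, b, c, d in zip(a3, d1, d2, d3)]
```

Because the sub-series are additive, summing the approximation and all details recovers the original series, which is the property that allows selected components to be added into a new input series.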

2.5 Bootstrap Technique

Three major sources of uncertainty, namely parameter uncertainty, sub-optimal training and insufficient input variables, significantly affect the output of ANN and WANN models. Bootstrapping, a computational, data-driven simulation method, has been used widely to assess this uncertainty by measuring the variance $ \sigma_s^2 $. The bootstrap samples are generated through intensive resampling of the data with replacement, and these resamples or realizations of the data provide a better understanding of the average and variability of the original, unknown distribution or process, which helps to assess the uncertainty associated with the estimate (Efron and Tibshirani 1993). Consider a population T with unknown probability distribution F, from which each sample t_i = (x_i, y_i), where x_i is an input vector and y_i is the corresponding output vector, is drawn independently and identically distributed (i.i.d.). A bootstrap resample T_n = {(x₁, y₁), (x₂, y₂), …, (x_n, y_n)}, where n is the size of the original dataset, is obtained from the empirical distribution function $ \widehat{F} $ by putting a mass of 1/n on each of t₁, t₂, …, t_n. Similarly, a set of bootstrap samples T¹, T², …, T^s, …, T^S can be produced, where S is the total number of bootstrap samples, generally ranging from 50 to 200 (Efron and Tibshirani 1993).
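The resampling step can be sketched as follows. This is a minimal illustration rather than the tool used in the study: the function name `bootstrap_resample`, the fixed seed, and the choice to also return the observations that were never drawn (the out-of-bag set) are assumptions made here.

```python
import random

def bootstrap_resample(n, rng):
    """Draw n indices with replacement, each with probability 1/n.

    Returns (in_bag, out_of_bag): the drawn indices and the indices
    that were never drawn in this realization.
    """
    in_bag = [rng.randrange(n) for _ in range(n)]
    drawn = set(in_bag)
    out_of_bag = [i for i in range(n) if i not in drawn]
    return in_bag, out_of_bag

rng = random.Random(42)      # fixed seed, only for reproducibility
n, S = 1826, 100             # illustrative dataset size and number of resamples
resamples = [bootstrap_resample(n, rng) for _ in range(S)]
in_bag, out_of_bag = resamples[0]
print(len(in_bag), len(out_of_bag))
```

Each realization has the same size as the original dataset but repeats some observations and omits others, so the omitted observations are available for evaluating the model trained on that realization.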

In this study, several bootstrap resamples are generated and used to train several different ANN and WANN models, and the resulting ensemble forecasts are named BANN and BWANN, respectively (Tiwari and Chatterjee 2010, 2011). For each T^s, an ANN and a WANN model are developed and trained using all n observations, and the ANN and WANN outputs, f_ANN(x_i, w_s/T^s) and f_WANN(x_i, w_s/T^s), are then evaluated using a set A_s of observation pairs t_i = (x_i, y_i) that are not included in the bootstrap resample. The performance of the ANNs and WANNs in these validation tests is subsequently averaged. This average represents the generalization error of the ANN models relative to T_n, denoted E₀, which can be estimated for ANN as (Twomey and Smith 1998)

$$ {\widehat{E}}_0=\frac{\displaystyle \sum_{s=1}^S\sum_{i\in {A}_s}{\left({y}_i-{f}_{ANN}\left({x}_i,{w}_s/{T}^s\right)\right)}^2}{\displaystyle \sum_{s=1}^S\left|{A}_s\right|} $$

(4)

where f_ANN(x_i, w_s/T^s) is the output of the ANN developed from the bootstrap sample T^s, in which x_i is a particular input vector and w_s is the weight vector, and |A_s| is the number of observation pairs in A_s. Subsequently, the BANN estimate ŷ(x) of all developed ANNs is given by the average of the S bootstrapped estimates (Jia and Culver 2006; Tiwari and Chatterjee 2010, 2011) as

$$ \widehat{y}(x)=\frac{1}{S}{\displaystyle \sum_{s=1}^S{f}_{ANN}\left(x,{w}_s\right)} $$

(5)

and the variance is estimated as

$$ {\sigma}^2(x)=\frac{\displaystyle \sum_{s=1}^S\sum_{i\in {A}_s}{\left[{y}_i-{f}_{ANN}\left({x}_i,{w}_s\right)\right]}^2}{S-1} $$

(6)

The forecasts obtained from the ANN and WANN models trained using several realizations of the training dataset are then used to generate the confidence band or confidence interval (CI) at the significance level α. The CI indicates how frequently the interval contains the true value in repeated simulation and is denoted as 100 × (1 − α) %; α is generally taken as 0.05, which corresponds to 95 % confidence bands. The CIs covering the ensemble inflow ŷ(x) are estimated as (Efron and Tibshirani 1993)

$$ CI=\left(UB,LB\right)=\left[\widehat{y}(x)+{t}_{n-p}^{\alpha /2}\sigma (x),\widehat{y}(x)-{t}_{n-p}^{\alpha /2}\sigma (x)\right] $$

(7)

where UB and LB represent the upper and lower band, respectively, σ(x) represents the standard deviation of the S forecasts, and $ t_{n-p}^{\alpha /2} $ is the α/2 percentile of the t distribution with n − p degrees of freedom. Here n and p are the number of inflow data patterns and the total number of parameters, respectively, in the ANN and WANN models.
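As an illustration of Eqs. (5) and (7), the sketch below forms the ensemble mean and the confidence band for a single time step from a set of member forecasts. The function name `ensemble_band`, the sample forecasts, and the default critical value of 1.96 (the large-sample approximation to the t percentile for α = 0.05) are assumptions of this sketch; the exact percentile for n − p degrees of freedom would be supplied through `t_crit`.

```python
from statistics import mean, stdev

def ensemble_band(forecasts, t_crit=1.96):
    """Return (lower, ensemble mean, upper) for one time step.

    forecasts: the S bootstrap-model outputs for the same input x.
    t_crit: critical value of the t distribution; 1.96 is only the
            large-sample approximation for alpha = 0.05.
    """
    y_hat = mean(forecasts)          # ensemble forecast, Eq. (5)
    sigma = stdev(forecasts)         # standard deviation of the S forecasts
    return y_hat - t_crit * sigma, y_hat, y_hat + t_crit * sigma

members = [210.0, 198.0, 225.0, 205.0, 212.0]   # hypothetical forecasts, m^3/s
lower, y_hat, upper = ensemble_band(members)
print(lower, y_hat, upper)
```

The band is symmetric about the ensemble forecast and widens as the member forecasts disagree more, which is how the spread of the bootstrap models is translated into forecast uncertainty.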

2.6 Development of ANN Models

To develop a robust ANN model, different ANN structures were developed using different combinations of input variables together with 1–10 hidden neurons, and the structure with the minimum generalization error was selected; the learning coefficient was 0.2 and the momentum was 0.9. The data were standardized and normalized to the range 0–1. The Levenberg–Marquardt method, considered the most efficient and fastest second order training method, was used to minimize the mean squared error between the forecast and observed reservoir inflows. To forecast the inflow of the Panchet reservoir at 1 day lead time (t), the inflow of the Panchet reservoir at the previous time step (i.e. t-1) was considered first, and other input variables were then added, starting from a lag of one day, until the performance of the model started to deteriorate.
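The input preparation described above can be sketched as follows: each series is scaled to 0–1 and lagged values are assembled into input patterns for a 1 day ahead forecast. The function names, the sample values and the particular lags are hypothetical, and scaling with the minimum and maximum of the whole sample is a simplification (in practice the scaling limits would normally come from the training subset only).

```python
def min_max_scale(values):
    """Scale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("constant series cannot be scaled")
    return [(v - lo) / (hi - lo) for v in values]

def lagged_patterns(inputs, target, lags):
    """Build (input vector, target) pairs for a 1-day-ahead forecast.

    inputs: dict of name -> series; lags: dict of name -> list of lags in days.
    The pattern for day t holds inputs[name][t - lag] for every requested lag.
    """
    max_lag = max(lag for lag_list in lags.values() for lag in lag_list)
    patterns = []
    for t in range(max_lag, len(target)):
        x = [inputs[name][t - lag] for name in lags for lag in lags[name]]
        patterns.append((x, target[t]))
    return patterns

inflow = [120.0, 150.0, 400.0, 310.0, 220.0, 180.0]   # hypothetical P_I values
rain = [0.0, 12.0, 30.0, 5.0, 0.0, 2.0]               # hypothetical R1 values
scaled = {"P_I": min_max_scale(inflow), "R1": min_max_scale(rain)}
patterns = lagged_patterns(scaled, scaled["P_I"], {"P_I": [1], "R1": [1, 2]})
```

Adding a further lag or variable only changes the `lags` dictionary, which makes it straightforward to test input combinations one at a time until performance stops improving.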

Selection of a suitable wavelet function, called the mother wavelet, and of a suitable level of wavelet decomposition is a crucial issue, as no study has shown that a particular wavelet function or decomposition level consistently gives the best model performance. Another important property of a wavelet function is its number of vanishing moments, which limits the wavelet's ability to represent the polynomial behavior of the signal. It can be expected that wavelets with more vanishing moments capture the variations more effectively and efficiently and improve the forecasting ability of the model. Therefore, in this study different wavelet functions with different vanishing moments, namely db2, db5, db10, db20, Bior1.1, Bior3.3, Bior6.8, Haar, Coif1, Coif3 and Coif5, were considered to develop WANN models.

The optimum number of decomposition levels for the DWT of the time series is estimated using the following formula (Nourani et al. 2008):

$$ \mathrm{L}=\mathrm{i}\mathrm{n}\mathrm{t}\left[ \log \left(\mathrm{N}\right)\right] $$

(8)

where L is the decomposition level, N is the number of time series data points, and the logarithm is taken to base 10.

This study uses N = 1826, which produces L = 3.
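The arithmetic behind this value can be checked with a short sketch. It assumes that the logarithm in Eq. (8) is base 10, which is the reading that reproduces L = 3 for N = 1826; the function name `decomposition_level` is illustrative.

```python
import math

def decomposition_level(n_points):
    """Decomposition level L = int[log10(N)] (base 10 is an assumption)."""
    return int(math.log10(n_points))

print(decomposition_level(1826))   # log10(1826) is about 3.26, so L = 3
```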

In this way, for each wavelet function a three-level decomposition is carried out, giving one approximation (A3) and three details (d1, d2 and d3). Inflow data at the Panchet reservoir, outflow data from the Konar and Tenughat reservoirs and rainfall data at the five gauging stations were initially decomposed into approximation (A3) and details (d1, d2 and d3) using db5, one of the more frequently used mother wavelets. As an illustration, only the wavelet sub-time series of the rainfall at station DVC R2 and of the inflow to the Panchet reservoir are shown, from 01 Jan. 2001 to 01 Jan 2007, in Fig. 2. The time series were decomposed separately for the training, cross-validation and testing datasets, and are shown together only for representation. Table 2 shows the correlation between the different wavelet components of each time series and the inflow of the Panchet dam. Some of the components have very good correlation and some have no correlation. Instead of adopting one selective method, as discussed above, to generate a new wavelet time series, all the combinations reported in previous studies along with some new approaches were tested in this study.

Table 2 Correlation between inflow of Panchet reservoir and different wavelet components of input variables

For several water resource variables, wavelet analysis has significantly improved ANN model performance, but there is no specific method for selecting appropriate wavelet components for WANN model development, and different studies have applied different approaches. In this study, a comprehensive examination is carried out to compare and select the best approach for inflow forecasting. The five approaches tested are:
(i) Approach 1: all four components (i.e. A₃, d₁, d₂ and d₃) of the 8 input variables (i.e. R₁, R₂, R₃, R₄, R₅, K_O, T_O, P_I) with lag 1.
(ii) Approach 2: all four components of the input variables and lags found best for the ANN model (i.e. A₃, d₁, d₂ and d₃ of P_I with lag 1; A₃, d₁, d₂ and d₃ of T_O with lag 1; A₃, d₁, d₂ and d₃ of R₁ with lags 1, 2 and 4; A₃, d₁, d₂ and d₃ of R₂ with lags 1 and 2).
(iii) Approach 3: all significant wavelet components, taken separately, having correlation >0.10, with lag 1 (A₃ of R₁, R₂, R₃, R₄, R₅, K_O, T_O, P_I; d₁ of K_O, P_I; d₂ of K_O, T_O, P_I; d₃ of K_O, T_O, P_I).
(iv) Approach 4: newly constructed time series obtained by adding the significant components of each parameter excluding the d₁ component, with lagged inputs as in the best ANN (A₃, d₂ and d₃ of P_I with lag 1; A₃, d₂ and d₃ of T_O with lag 1; A₃, d₂ and d₃ of R₁ with lags 1, 2 and 4; A₃, d₂ and d₃ of R₂ with lags 1 and 2).
(v) Approach 5: newly constructed time series obtained by adding the significant components of each variable having correlation >0.10 (A₃ of R₁, R₂, R₃, R₄, R₅, K_O, T_O, P_I; A₃ + d₁ + d₂ + d₃ of K_O; A₃ + d₂ + d₃ of T_O; A₃ + d₁ + d₂ + d₃ of P_I).
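The two ways of constructing a new series (dropping d₁, or keeping only components whose correlation exceeds 0.10) can be sketched as below. All names and values are hypothetical, the correlation is taken as the signed Pearson coefficient against the unlagged target for simplicity, and the sub-series are assumed to be additive.

```python
from math import sqrt

def pearson(u, v):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    return cov / sqrt(var_u * var_v)

def significant_components(components, target, threshold=0.10):
    """Names of sub-series whose correlation with target exceeds threshold."""
    return [name for name, series in components.items()
            if pearson(series, target) > threshold]

def add_components(components, names):
    """Element-wise sum of the named sub-series (a new wavelet time series)."""
    return [sum(values) for values in zip(*(components[k] for k in names))]

# Hypothetical sub-series of one input variable and a hypothetical target.
components = {
    "A3": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "d1": [0.5, -0.5, 0.5, -0.5, 0.5, -0.5],
    "d2": [-0.2, -0.1, 0.0, 0.0, 0.1, 0.2],
    "d3": [-0.1, -0.1, 0.0, 0.0, 0.1, 0.1],
}
target = [1.1, 2.0, 3.2, 3.9, 5.3, 6.1]

selected = significant_components(components, target)
approach4_series = add_components(components, ["A3", "d2", "d3"])   # d1 excluded
approach5_series = add_components(components, selected)             # correlation > 0.10
```

In this toy case the rapidly alternating d1 component is the only one that fails the threshold, so both constructions give the same series; with real data the two selections can differ from variable to variable.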

In this study, MLR models are also developed for comparison with the different ANN models. To further evaluate the effectiveness of wavelet analysis, WMLR models analogous to the WANN models are also developed and evaluated for forecasting the 1 day lead time reservoir inflow of the Panchet dam. Next, BWANN and BWMLR models are developed by combining several WANN and WMLR models trained using different realizations of the dataset generated by bootstrap resampling. In this way the BWANN and BWMLR hybrid models combine the capabilities of wavelet analysis and the bootstrap resampling technique. Different realizations of the data patterns were generated using Bootstrap.xla, an Excel add-in (Barreto and Howland 2006). The WANN models were developed using all the approaches discussed above. The BWANN and BWMLR models are ensembles of WANN and WMLR models, respectively, trained using 100 realizations of the training dataset generated by bootstrap resampling. A flow chart showing the development of the different models is shown in Fig. 3.
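The ensemble construction can be outlined in a few lines. This is only a sketch of the idea: a single-predictor least-squares line stands in for the WANN or WMLR member models, the data are hypothetical, and the function names `fit_line` and `bootstrap_ensemble` are illustrative rather than taken from the study.

```python
import random

def fit_line(x, y):
    """Ordinary least squares for y = a + b*x with one predictor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def bootstrap_ensemble(x, y, x_new, n_models=100, seed=0):
    """Train one model per bootstrap realization and average the forecasts."""
    rng = random.Random(seed)
    n = len(x)
    forecasts = []
    while len(forecasts) < n_models:
        idx = [rng.randrange(n) for _ in range(n)]   # resample with replacement
        xs = [x[i] for i in idx]
        if max(xs) == min(xs):                       # degenerate draw, redraw
            continue
        a, b = fit_line(xs, [y[i] for i in idx])
        forecasts.append(a + b * x_new)
    return sum(forecasts) / len(forecasts), forecasts

rain = [2.0, 5.0, 9.0, 14.0, 20.0, 26.0, 31.0, 40.0]              # hypothetical predictor
inflow = [55.0, 70.0, 98.0, 120.0, 160.0, 185.0, 210.0, 262.0]    # hypothetical target
ensemble, members = bootstrap_ensemble(rain, inflow, x_new=35.0)
```

Every member contributes to the ensemble forecast, and the spread of `members` is what an uncertainty band would be built from.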

2.7 Performance Indices

The performance of the developed ANN, WANN, MLR, WMLR, BWANN and BWMLR models was evaluated using the five performance indices defined below:

(i)
The coefficient of determination (R ²):
$$ {R}^2={\left(\frac{{\displaystyle \sum_{i=1}^n\left({O}_i-\overline{O}\right)\left({P}_i-\overline{P}\right)}}{\sqrt{{\displaystyle \sum_{i=1}^n{\left({O}_i-\overline{O}\right)}^2{\displaystyle \sum_{i=1}^n{\left({P}_i-\overline{P}\right)}^2}}}}\right)}^2 $$
(9)
where O_i and P_i are the observed and forecasted inflow, respectively, $ \overline{O} $ and $ \overline{P} $ are the means of the observed and forecasted inflow, respectively, and n is the number of data patterns. R² ranges from 0 to 1, with 1 representing a perfect forecasting model.
(ii)
The Nash-Sutcliffe coefficient (E) is defined as:
$$ E=1-\frac{\displaystyle \sum_{i=1}^n{\left({O}_i-{P}_i\right)}^2}{\displaystyle \sum_{i=1}^n{\left({O}_i-\overline{O}\right)}^2} $$
(10)

The Nash–Sutcliffe efficiency varies from −∞ to 1. A value of 1 indicates a perfect model.
(iii)
Root mean square error (RMSE):
$$ RMSE=\sqrt{\frac{1}{n}{\displaystyle \sum_{i=1}^n{\left({O}_i-{P}_i\right)}^2}} $$
(11)

RMSE is always non-negative; a value of 0 means the model fits the data perfectly.
(iv)
Percentage deviation in peak (P_dv):
$$ {P}_{dv}=\frac{P_p-{O}_p}{O_p}\times 100 $$
(12)
where O_p and P_p represent the peak values of observed and forecasted inflow, respectively.
(v)
Mean absolute error (MAE):
$$ MAE=\frac{1}{n}{\displaystyle \sum_{i=1}^n\left|{O}_i-{P}_i\right|} $$
(13)

MAE is always non-negative, with its minimum value of 0 representing a perfect model.
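The five indices can be computed together as in the sketch below. The function name `performance_indices` and the sample values are hypothetical, and the peak values are taken as the maxima of the observed and forecasted series, which is an assumption about how O_p and P_p are picked.

```python
from math import sqrt

def performance_indices(observed, forecast):
    """Return R2, E, RMSE, Pdv (%) and MAE for paired series."""
    n = len(observed)
    o_mean = sum(observed) / n
    p_mean = sum(forecast) / n
    s_op = sum((o - o_mean) * (p - p_mean) for o, p in zip(observed, forecast))
    s_oo = sum((o - o_mean) ** 2 for o in observed)
    s_pp = sum((p - p_mean) ** 2 for p in forecast)
    sq_err = sum((o - p) ** 2 for o, p in zip(observed, forecast))
    abs_err = sum(abs(o - p) for o, p in zip(observed, forecast))
    o_peak, p_peak = max(observed), max(forecast)   # assumed definition of peaks
    return {
        "R2": (s_op / sqrt(s_oo * s_pp)) ** 2,      # Eq. (9)
        "E": 1 - sq_err / s_oo,                     # Eq. (10)
        "RMSE": sqrt(sq_err / n),                   # Eq. (11)
        "Pdv": (p_peak - o_peak) / o_peak * 100,    # Eq. (12)
        "MAE": abs_err / n,                         # Eq. (13)
    }

observed = [100.0, 200.0, 400.0, 300.0]   # hypothetical inflows, m^3/s
forecast = [110.0, 190.0, 380.0, 310.0]
indices = performance_indices(observed, forecast)
print(indices)
```

Because RMSE squares the errors before averaging, it is at least as large as MAE and grows faster when a few large errors (typically at high inflows) dominate.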

3 Results and Discussion

3.1 Performance of ANN Models

The performance of the ANN model for different input variables and for the optimum number of hidden neurons, in terms of the different performance indices, is shown in Table 3. Considering all the performance indices, model #13 was found to be the best. The hydrograph and the scatter plot of observed against forecasted values are shown in Fig. 4. The simulated values follow the general behavior of the observed values, even though the model does not perform very well for medium and high inflows. For the best ANN model, the outflow of Tenughat, the inflow of the Panchet reservoir and the rainfall at stations R₁ and R₂ are found effective, whereas the outflow of Konar and the rainfall at stations R₃, R₄ and R₅ are not found effective in forecasting inflow at the Panchet reservoir. This may be because these data are highly collinear with other hydro-climatic data and add negligible information for improving daily inflow forecasts at the Panchet reservoir. With R² = 0.90, E = 78.32 %, RMSE = 206.60 m³/s, P_dv = 15.91 % and MAE = 68.12 m³/s, the performance of the best ANN model can be considered satisfactory, but the high value of RMSE compared to MAE shows that the model is not able to simulate high inflow values accurately. This may be because ANN models are not able to extract non-stationarity from the training dataset.

Table 3 Best ANN model structure for different input variables for testing dataset

3.2 Performance of WANN Models

WANN models were initially developed using all the wavelet sub-time series components of the input variables, with the same lagged information for all components as was found best for the ANN model (Model #13), i.e. Approach 2. Subsequently, the performance of several wavelet functions with different vanishing moments was compared to select an appropriate wavelet function. The performance of the WANN models using different wavelet functions and vanishing moments is shown in Table 4. The db5 wavelet function performed best among all the wavelet functions considered. The WANN model developed using the db5 wavelet function, with 5 vanishing moments, also performed better than the standard ANN model in terms of the different performance indices.

Table 4 Performance of WANN models with different mother wavelets using all the wavelet components of best found ANN model #13

Once the WANN model was found to perform better than the ANN model, the next issue addressed was the appropriate selection of wavelet components, or wavelet sub-time series, to develop a robust WANN model. The performance of the WANN models developed using the different approaches is presented in Table 5. The WANN model developed using a newly constructed time series (i.e. Model 4, corresponding to Approach 4), obtained by adding the significant components of each variable excluding the d1 component and using the same lagged information as found best for the ANN, gave improved performance.

Table 5 Performance of WANN models developed using different approaches for 1 day lead inflow forecasting

The performance of the best WANN modeling approach is also shown as a hydrograph and scatter plot in Fig. 5. The observed values are simulated very well, and performance is considerably improved for peak values in particular, as these lie very close to the 1:1 line. Wavelet analysis with proper selection of the wavelet function and vanishing moments, along with a suitable method for selecting wavelet components, can therefore significantly improve model performance.

3.3 Performance of MLR and WMLR Models

As in several previous studies, the developed ANN and WANN models are compared with an MLR model to benchmark their performance. In addition to MLR models, wavelet based MLR (WMLR) models are developed in this study and their performance is compared with the ANN and WANN models. The MLR and WMLR models are developed using the same input variables as used for the best ANN and WANN models, respectively, and the performance of both models is shown in Table 6. The WMLR models perform better than the MLR models, but the WANN model performs better than both, as the MLR and WMLR models are not able to simulate medium and high inflow values satisfactorily (Fig. 6). This is also indicated by the RMSE values being higher than the MAE values for both models, since RMSE penalizes large errors more heavily than MAE.
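The point about RMSE versus MAE can be illustrated with a short sketch. The numbers below are invented for illustration and are not from the study: two hypothetical forecasts have the same MAE, but the one that misses a single peak has a much larger RMSE.

```python
import math


def rmse(observed, simulated):
    """Root mean square error."""
    n = len(observed)
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(observed, simulated)) / n)


def mae(observed, simulated):
    """Mean absolute error."""
    n = len(observed)
    return sum(abs(o - s) for o, s in zip(observed, simulated)) / n


# Illustrative inflow values with one peak (not study data).
observed = [10.0, 12.0, 11.0, 90.0, 13.0, 12.0]
uniform_error = [14.0, 16.0, 15.0, 94.0, 17.0, 16.0]  # off by 4 everywhere
peak_missed = [10.0, 12.0, 11.0, 66.0, 13.0, 12.0]    # exact except the peak

for name, sim in (("uniform", uniform_error), ("peak missed", peak_missed)):
    print(name, "MAE =", mae(observed, sim), "RMSE =", round(rmse(observed, sim), 3))
```

A large gap between RMSE and MAE therefore suggests that the error is concentrated in a few values, such as poorly simulated high inflows.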

Table 6 Performance of MLR and WMLR models for 1 day lead inflow forecasting

3.4 Performance of BWANN and BWMLR Models

The performance of the BWANN and BWMLR models is shown in Table 7 in terms of different performance indices, and the hydrographs and scatter plots for the (a) BWMLR and (b) BWANN models are shown in Fig. 7. The ensemble forecasts obtained using the BWANN models are better than those of the BWMLR models. The performance of the BWANN models is not as good as that of the WANN models, but BWANN models can be considered more robust and accurate because they are ensembles of several WANN models trained on different realizations of the training dataset. Each individual WANN model is trained on one of 100 different realizations of the training dataset, and the ensemble of all these models is used to produce a forecast for a new dataset. For a new testing or validation dataset, the ensemble first generates 100 forecasts, one from each trained model. Although the performance of the ensemble BWANN models could have been improved by combining only the better performing models, the forecasts reported here are ensembles of all 100 forecasts, without omitting any poorly performing model. It is therefore argued that the BWANN model is preferable: because its forecasts are ensembles of 100 forecasts, they are more reliable and accurate even when the new dataset has different complexity and variability from the data used to train the ensemble.
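The ensemble procedure described above (resample the training set, train one model per realization, combine all forecasts) can be sketched as follows. This is a minimal sketch under stated assumptions: a one-predictor least-squares line is swapped in for the WANN model, resampling is done on input-output pairs with replacement, the function names are hypothetical, and the data are synthetic.

```python
import random


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0.0:  # degenerate resample: every x identical
        return 0.0, mean_y
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x


def bootstrap_ensemble(xs, ys, n_models=100, seed=0):
    """Train one stand-in model per bootstrap realization of the training pairs."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        models.append(fit_line([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def ensemble_forecast(models, x_new):
    """Return the ensemble mean and every member forecast for one input."""
    members = [slope * x_new + intercept for slope, intercept in models]
    return sum(members) / len(members), members


# Synthetic training pairs (not study data): y = 2x + 1 plus a small wiggle.
train_x = [float(i) for i in range(1, 21)]
train_y = [2.0 * x + 1.0 + ((int(x) * 7) % 5 - 2) * 0.5 for x in train_x]

models = bootstrap_ensemble(train_x, train_y, n_models=100, seed=42)
mean_forecast, members = ensemble_forecast(models, x_new=10.0)
print(round(mean_forecast, 3), round(min(members), 3), round(max(members), 3))
```

All 100 members are kept in the average, mirroring the choice described in the text of not discarding poorly performing models.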

Table 7 Performance of BWMLR and BWANN models for 1 day lead inflow forecasting

Despite the better performance of the WANN models, their reliability in generating the same forecast for a new event or dataset is questionable. Moreover, it cannot be guaranteed that their forecasting performance will remain the same if the training and testing datasets are changed or interchanged. Assessing the uncertainty associated with the forecast is therefore very important for judging the reliability of the models and for implementing them in operational forecasting. Another advantage of the BWANN model is that the different forecasts can be used to construct uncertainty or confidence bands that show the uncertainty associated with the forecasts. Figure 8 depicts the 95 % forecast confidence bands and the corresponding observed values. The confidence bands show not only the general behavior of the observed time series but also that the values are simulated very well. Higher inflow values carry higher uncertainty, whereas lower inflow values have low uncertainty. Moreover, although the ensemble forecasts underestimate higher values (Fig. 7b), the BWANN models are able to assess the uncertainty associated with these forecasts. This shows that the bootstrap technique is capable of uncertainty assessment and increases the reliability of WANN model forecasts.
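One common way to turn a set of ensemble member forecasts into a 95 % band is to take the 2.5th and 97.5th percentiles at each time step. The sketch below assumes that percentile construction; the source does not state how its bands were computed, and the helper names and numbers are hypothetical.

```python
def percentile(sorted_values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a sorted list."""
    if not sorted_values:
        raise ValueError("need at least one value")
    pos = (len(sorted_values) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] * (1.0 - frac) + sorted_values[upper] * frac


def confidence_band(member_forecasts, level=95.0):
    """Lower and upper band limits from the ensemble members of one time step."""
    tail = (100.0 - level) / 2.0
    ordered = sorted(member_forecasts)
    return percentile(ordered, tail), percentile(ordered, 100.0 - tail)


def coverage(observed, bands):
    """Fraction of observed values that fall inside their band."""
    inside = sum(1 for o, (lo, hi) in zip(observed, bands) if lo <= o <= hi)
    return inside / len(observed)


# Illustrative members for a single time step (not study data).
members = [float(v) for v in range(1, 101)]
band = confidence_band(members, level=95.0)
print(band)

# Two illustrative observations checked against the same band.
print(coverage([50.0, 200.0], [band, band]))
```

The width of the band at each time step is a direct measure of forecast uncertainty, and the coverage fraction gives a simple check of how often observations fall inside it.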

4 Conclusions

A bootstrap wavelet based ANN model (BWANN) is developed in this study for inflow forecasting of the Panchet reservoir in India, and its performance is compared with standard ANN, wavelet based ANN (WANN), bootstrap wavelet based MLR (BWMLR) and wavelet based MLR (WMLR) models. A robust ANN model was developed by considering several combinations of parameters such as input variables, optimization parameters, training algorithms and hidden neurons. WANN models were developed by considering different wavelet functions and approaches to ensure an efficient WANN model. Based on this study, the following conclusions are drawn:

Wavelet analysis with proper selection of the wavelet function and vanishing moments, along with a suitable wavelet component selection method, can significantly improve WANN model performance.
The WANN model performs better than the standard ANN, MLR and WMLR models for inflow forecasting.
The optimum inputs selected for ANN model development are also best for WANN model development when their wavelet sub-time series, excluding the d1 components, are used.
Out of several wavelet functions and vanishing moments, the db5 wavelet function with 5 vanishing moments provides the best WANN model for inflow forecasting.
The WANN model not only simulates all the observed values very well, it also simulates peak inflow values far better than the remaining models.
Performance of ANN model is improved significantly by including reservoir outflow and rainfall information in the upstream and nearby areas.
Selection of significant input is very crucial as inclusion of randomly selected input variables may significantly reduce the model performance.
The best WANN model is obtained by using the wavelet components, excluding the d1 components, of the wavelet sub-time series of the input variables found best for the ANN model. The second best approach is to use all the wavelet sub-time series of the raw input variables found best for the ANN model.
The WANN model can deal with non-stationary datasets effectively and can be used as a suitable tool for inflow forecasting. It has the potential to perform well in forecasting other non-stationary water resource variables.
Forecasts obtained using BWANN models are not as good as WANN models, but they are more stable and consistent in case of change in training data pattern.
BWANN models are very effective to assess the uncertainty associated with the inflow forecasts and have high applicability in operational inflow forecasting.
Inflow forecasts can be improved by considering discharge releases from upstream reservoirs, and rainfall values in upstream and nearby locations in the upstream boundary.
Ensemble forecasts provide not only a quantitative point estimate but also probabilistic forecasts in the form of confidence bands, which would help flood control and reservoir management authorities in decision making.

References

Abrahart RJ (2003) Neural network rainfall–runoff forecasting based on continuous resampling. J Hydroinf 5(1):51–61
Google Scholar
Abrahart RJ, Anctil F, Coulibaly P, Dawson CW, Mount NJ, See LM, Shamseldin AY, Solomatine DP, Toth E, Wilby RL (2012) Two decades of anarchy? Emerging themes and outstanding challenges for neural network river forecasting. Prog Phys Geogr 36(4):480–513
Article Google Scholar
Adamowski JF (2008) Peak daily water demand forecast modeling using artificial neural networks. J Water Resour Plan Manag 134(2):119–128
Article Google Scholar
Adamowski J, Sun K (2010) Development of a coupled wavelet transform and neural network method for flow forecasting of non-perennial rivers in semi-arid watersheds. J Hydrol 390(1):85–91
Article Google Scholar
Adamowski J, Adamowski K, Prokoph A (2013) A spectral analysis based methodology to detect climatological influences on daily urban water demand. Math Geosci 45(1):49–68
Article Google Scholar
Arhami M, Kamali N, Rajabi MM (2013) Predicting hourly air pollutant levels using artificial neural networks coupled with uncertainty analysis by Monte Carlo simulations. Environ Sci Pollut Res 20(7):4777–4789
Article Google Scholar
Barreto H, Howland FM (2006) Introductory econometrics: using Monte Carlo simulation with microsoft excel. Cambridge University Press, Cambridge
Google Scholar
Bishop CM (1995) Neural networks for pattern recognition. Clarendon, Oxford
Google Scholar
Cannas B, Fanni A, See L, Sias G (2006) Data preprocessing for river flow forecasting using neural networks: wavelet transforms and data partitioning. Phys Chem Earth 31(18):1164–1171
Article Google Scholar
Coulibaly P, Anctil F, Bobee B (2000) Daily reservoir inflow forecasting using artificial neural networks with stopped training approach. J Hydrol 230(3–4):244–257
Article Google Scholar
Efron B, Tibshirani RJ (1993) An introduction to the bootstrap. Chapman and Hall, London
Book Google Scholar
Francesco V, Bernd F (2000) Nonstationarity and data preprocessing for neural network predictions of an economic time series. Proc Int Joint Conf Neural Netw 5:129–134
Google Scholar
Gad MA (2013) A useful automated rainfall-runoff model for engineering applications in semi-arid regions. Comput Geosci 52:443–452
Article Google Scholar
Haykin S (1999) Neural networks: a comprehensive foundation, 2nd edn. Prentice Hall, Englewood Cliffs
Google Scholar
Herrera M, Torgo L, Izquierdo J, Perez-Garcıa R (2010) Predictive models for forecasting hourly urban water demand. J Hydrol 387(1–2):141–150
Article Google Scholar
Hinsbergen CPI, Lint JWC, Zuylen HJ (2009) Bayesian committee of neural networks to predict travel times with confidence intervals. Transp Res C 17(5):498–509
Article Google Scholar
Jain A, Varshney K, Joshi UC (2001) Short-term water demand forecast modeling at iit kanpur using artificial neural networks. Water Resour Manag 15(1):299–321
Article Google Scholar
Jia Y, Culver TB (2006) Bootstrapped artificial neural networks for synthetic flow generation with a small data sample. J Hydrol 331(3–4):580–590
Article Google Scholar
Jothiprakash V, Magar RB (2012) Multi-time-step ahead daily and hourly intermittent reservoir inflow prediction by artificial intelligent techniques using lumped and distributed data. J Hydrol 450–451:293–307
Article Google Scholar
Kant A, Suman PK, Giri BK, Tiwari MK, Chatterjee C, Nayak PC, Kumar S (2013) Comparison of multi-objective evolutionary neural network, adaptive neuro-fuzzy inference system and bootstrap-based neural network for flood forecasting. Neural Comput Applic 23(1):231–246
Article Google Scholar
Kişi O (2009) Neural networks and wavelet conjunction model for intermittent streamflow forecasting. J Hydrol Eng 14(8):773–782
Article Google Scholar
Kisi O (2010) Wavelet regression model for short-term streamflow forecasting. J Hydrol 389(3–4):344–353
Article Google Scholar
Kisi O, Cimen M (2011) A wavelet-support vector machine conjunction model for monthly streamflow forecasting. J Hydrol 399(1–2):132–140
Article Google Scholar
Kisi O, Shiri J (2012) Reply to discussion of “Precipitation Forecasting Using Wavelet-Genetic Programming and Wavelet-Neuro-Fuzzy Conjunction Models”. Water Resour Manag 26(12):3663–3665
Article Google Scholar
Krishna B (2014) Comparison of wavelet based ANN and regression models for reservoir inflow forecasting. J Hydrol Eng 19(7):1385–1400
Article Google Scholar
Kucuk M, Oglu NA (2006) Wavelet regression technique for stream flow prediction. J Appl Stat 33(9):943–960
Article Google Scholar
Maheswaran R, Khosa R (2012) Comparative study of different wavelets for hydrologic forecasting. Comput Geosci 46:284–295
Article Google Scholar
Maier HR, Dandy GC (2010) Neural networks for the prediction and forecasting of water resources variables: a review of modeling issues and applications. Environ Model Softw 15:101–124
Article Google Scholar
Makwana JJ, Tiwari MK (2014) Intermittent streamflow forecasting and extreme event modelling using wavelet based artificial neural networks. Water Resour Manag 28:4857–4873
Article Google Scholar
Mallat SG (1989) A theory for multi resolution signal decomposition: the wavelet representation. IEEE Trans Pattern Anal Mach Intell 11(7):674–693
Article Google Scholar
Mehta R, Jain SK (2009) Optimal operation of a multi-purpose reservoir using neuro-fuzzy technique. Water Resour Manag 23:509–529
Article Google Scholar
Mukerjee A, Chatterjee C, Raghuwanshi NS (2009) Flood forecasting using ANN, neuro-fuzzy, and neuro-GA models. J Hydrol Eng 14(6):647–652
Article Google Scholar
Nourani V, Alami MT, Aminfar MH (2008) A combined neural wavelet model for prediction of watershed precipitation, Ligvanchai, Iran. J Environ Hydrol 16(2):1–12
Google Scholar
Nourani V, Komasi M, Mano A (2009) A multivariate ANN-wavelet approach for rainfall–runoff modeling. Water Resour Manag 23(14):2877–2894
Article Google Scholar
Okkan U (2012) Wavelet neural network model for reservoir inflow prediction. Sci Iran 19(6):1445–1455
Article Google Scholar
Partal T (2009) Modeling evapotranspiration using discrete wavelet transform and neural networks. Hydrol Process 23(25):3545–3555
Article Google Scholar
Partal T, Kisi O (2007) Wavelet and neuro-fuzzy conjunction model for precipitation forecasting. J Hydrol 342(1–2):199–212
Article Google Scholar
Paudel M, Nelson EJ, Downer CW, Hotchkiss R (2011) Comparing the capability of distributed and lumped hydrologic models for analyzing the effects of land use change. J Hydroinf 13(3):461–473
Article Google Scholar
Rajaee T, Mirbagheri SA, Nourani V, Alikhani A (2010) Prediction of daily suspended sediment load using wavelet and neuro-fuzzy combined model. Int J Environ Sci Technol 7(1):93–110
Article Google Scholar
Rao AR, Hamed KH, Chen HL (2003) Nonstationarities in hydrologic and environmental time series. Kluwer, Dordrecht
Google Scholar
Rath S, Nayak PC, Chatterjee C (2013) Hierarchical neurofuzzy model for real-time flood forecasting. Int J River Basin Manag 11(3):253–268
Article Google Scholar
Rathinasamy M, Adamowski J, Khosa R (2013) Multiscale streamflow forecasting using a new Bayesian model average based ensemble multi-wavelet volterra nonlinear method. J Hydrol 507:186–200
Article Google Scholar
Sahay RR, Sehgal V (2013) Wavelet regression models for predicting flood stages in rivers: a case study in Eastern India. J Flood Risk Manag 6(2):146–155
Article Google Scholar
Sahay RR, Srivastava A (2014) Predicting monsoon floods in rivers embedding wavelet transform, genetic algorithm and neural network. Water Resour Manag 28:301–317
Article Google Scholar
Sehgal V, Sahay RR, Chatterjee C (2014a) Effect of utilization of discrete wavelet components on flood forecasting performance of wavelet based ANFIS models. Water Resour Manag 28(6):1733–1749
Article Google Scholar
Sehgal V, Tiwari MK, Chatterjee C (2014b) Wavelet bootstrap multiple linear regression based hybrid modeling for daily river discharge forecasting. Water Resour Manag 28(10):2793–2811
Article Google Scholar
Sharma SK, Tiwari KN (2009) Bootstrap based artificial neural network (BANN) analysis for hierarchical prediction of monthly runoff in Upper Damodar Valley catchment. J Hydrol 374(3–4):209–222
Article Google Scholar
Tiwari MK, Chatterjee C (2010) Development of an accurate and reliable hourly flood forecasting model using wavelet-bootstrap-ANN hybrid approach. J Hydrol 394:458–470
Article Google Scholar
Tiwari MK, Chatterjee C (2011) A new wavelet-bootstrap-ANN hybrid model for daily discharge forecasting. J Hydroinf 13(3):500–519
Article Google Scholar
Tiwari MK, Song KY, Chatterjee C, Gupta MM (2013) Improving reliability of river flow forecasting using neural networks, wavelets and self-organizing maps. J Hydroinf 15(2):486–502
Article Google Scholar
Twomey J, Smith A (1998) Bias and variance of validation methods for function approximation neural networks under conditions of sparse data. IEEE Trans Syst Man Cybern Part C Appl Rev 28(3):417–430
Article Google Scholar
Valipour M, Banihabib ME, Behbahani SMR (2013) Comparison of the ARMA, ARIMA, and the autoregressive artificial neural network models in forecasting the monthly inflow of Dez dam reservoir. J Hydrol 476(7):433–441
Article Google Scholar
Verma AK, Jha MK, Mahana RK (2010) Evaluation of HEC-HMS and WEPP for simulating watershed runoff using remote sensing and geographical information system. Paddy Water Environ 8(2):131–144
Article Google Scholar
Wang D, Ding J (2003) Wavelet network model and its application to the prediction of hydrology. Nat Sci 1(1):67–71
Google Scholar
Wang W, Vrijling JK, Van Gelder PHAJM, Ma J (2006) Testing for nonlinearity of streamflow processes at different timescales. J Hydrol 322(1–4):247–268
Article Google Scholar
Wang Y, Zheng T, Zhao Y, Jiang J, Wang Y, Guo L, Wang P (2013) Monthly water quality forecasting and uncertainty assessment via bootstrapped wavelet neural networks under missing data for Harbin, China. Environ Sci Pollut Res 20(12):8909–8923
Article Google Scholar
Zhou HC, Peng Y, Liang GH (2008) The research of monthly discharge predictor-corrector model based on wavelet decomposition. Water Resour Manag 22(2):217–227
Article Google Scholar

Download references

Author information

Authors and Affiliations

Department of Agricultural and Food Engineering, Indian Institute of Technology, Kharagpur, 721302, India
Sanjeet Kumar, Chandranath Chatterjee & Ashok Mishra
Department of Soil and Water Engineering, College of Agricultural Engineering and Technology, Anand Agricultural University, Godhra, India, 389001
Mukesh Kumar Tiwari

Corresponding author

Correspondence to Sanjeet Kumar.

About this article

Cite this article

Kumar, S., Tiwari, M.K., Chatterjee, C. et al. Reservoir Inflow Forecasting Using Ensemble Models Based on Neural Networks, Wavelet Analysis and Bootstrap Method. Water Resour Manage 29, 4863–4883 (2015). https://doi.org/10.1007/s11269-015-1095-7

Received: 01 September 2014
Accepted: 02 August 2015
Published: 13 August 2015
Issue Date: October 2015
DOI: https://doi.org/10.1007/s11269-015-1095-7

1 Introduction