1 Introduction

Rainfall forecasts are essential for the planning and management of water resources, especially in an agriculture-based country like India. About 65 % of the total cultivated land in India is under rainfed agriculture (Swaminathan and 1998). In particular, monthly and seasonal rainfall forecasts provide useful information for water resource management, agricultural planning, and associated crop insurance applications (Garbrecht et al. 2010) in regions that depend completely on rainfall for agriculture. Reliable rainfall forecasts help farmers decide which crops to cultivate and how to manage resources. Forecasting the temporal evolution of the monsoon is therefore a major scientific issue in hydrology. Given this importance, rainfall forecasting has been the subject of intense research, which has produced numerous approaches. Over the past decades, various approaches have been used to study and forecast monthly and seasonal rainfall. These methods can be broadly classified into two categories: numerical and empirical. Numerical models, such as general circulation models (GCMs), are based on the laws of physics and have been used to forecast climate. Empirical models, on the other hand, are based on observed relationships between the predictand and various predictors. Quantitative rainfall prediction has been studied using numerical models (Bustamante et al. 1999; Olsson et al. 2004) and physical models (Georgakakos and Bras 1984). However, these models have not been successful enough in forecasting rainfall (Olson et al. 1995) because of inaccurate initial conditions, the parameterization schemes of subscale phenomena, and limited spatial resolution (Valverde Ramírez et al. 2005).

Owing to the low forecasting skill and/or complexity of numerical models, empirical methods are still the most widely used approaches for seasonal precipitation forecasts in agricultural planning (Meinke et al. 2007). The empirical methods include statistical models (Mutai and Ward 2000; Immerzeel et al. 2010; Prasad et al. 2010) and artificial neural networks (French et al. 1992; Navone and Ceccatto 1994; Sivapragasam et al. 2001; Freiwan and Cigizoglu 2005; Marzano et al. 2006; Moustris et al. 2011; Abbot and Marohasy 2012; Jeong et al. 2012). Most of these studies use only a set of climate-related variables or only historical rainfall data as input. Past studies (Ropelewski and Halpert 1986; Kurtzman and Scanlon 2007; Ishak et al. 2013; Khedun et al. 2014) have shown the presence of tele-connections between rainfall and large-scale climate signals such as the Southern Oscillation Index (SOI), Pacific Decadal Oscillation (PDO), Indian Ocean Dipole (IOD), and sea surface temperature (SST). There have been relatively few investigations (Abbot and Marohasy 2012; Khedun et al. 2014) in which rainfall forecast models have used a combination of historical rainfall data and other climatic attributes. Many studies, such as Murphy and Timbal (2008), Shukla and Paolino (1983), and Chattopadhyay et al. (2010), have tried to establish the relationship between rainfall and climate modes, but few have used these climate indices for rainfall prediction. Although the reasonable correlations between these large-scale climate indices and rainfall have motivated scientists to use them for rainfall forecasting, their usefulness for this purpose remains uncertain. This may be because the signals are highly non-stationary and the processes related to rainfall operate across a wide range of temporal scales.
To uncover the underlying relationships across different scales, researchers have turned to wavelet analysis. Wavelet analysis is a mathematical tool that provides a time–frequency representation of a signal analysed in the time domain (Daubechies 1992; Percival and Walden 2000); it is a multiresolution decomposition in the time and frequency domains. Anctil and Coulibaly (2004) proposed a wavelet-based approach to describe local interannual variability in streamflow and to identify plausible climatic tele-connections that could explain these local variations. Rivera et al. (2012) presented a self-organizing map approach using sea surface temperature (SST) filtered by the wavelet transform (WT) for forecasting monthly precipitation in central Chile. More recently, He et al. (2013) used a wavelet-based linear regression model that exploits different climate indices to forecast rainfall 1 month ahead. However, the physical processes that induce rainfall are generally considered to be nonlinear (Sivakumar 2001; Dhanya and Nagesh Kumar 2011), and linear models cannot capture the underlying nonlinear dynamics.

Therefore, as an improvement over the linear approach, this article proposes a more effective rainfall forecast model that uses past rainfall data and climate signals by combining wavelet analysis with a multivariate second-order nonlinear model. A review of the recent literature suggests that no work has been reported so far on a wavelet-based multivariate nonlinear model using climate indices for rainfall forecasting. The main contributions of this work are as follows.

  1. Development of a wavelet-based multivariate nonlinear model for rainfall forecasting using climate indices.

  2. Comparison of the proposed approach with other methods, namely an Artificial Neural Network model and the wavelet-based multiple linear regression approach.

The rest of this article is organized as follows. Section 2 briefly describes wavelet analysis. The study area and details of the climate indices are provided in Section 3. Section 4 presents the necessary mathematical methods and the wavelet-based multivariate nonlinear model. In Section 5, the proposed method is applied to monthly rainfall forecasting at three rainfall stations across India and compared with a linear regression model based on data with and without the wavelet transform (WT). Conclusions are drawn in Section 6.

2 Wavelet Analysis

Wavelet analysis, initially formalized by Grossmann and Morlet (1984), overcomes the main shortcoming of the Fourier transform, which identifies the frequencies present in a signal but not when they occur. It yields a time–frequency (or time–scale) representation of a signal. Instead of decomposing a signal into constituent harmonic functions as in Fourier analysis, wavelet analysis transforms it into scaled and translated versions of an original (mother) wavelet. The wavelet transform defined by Eq. (1) (Daubechies 1992) is called the continuous wavelet transform (CWT) because the scale and time parameters, a and τ, assume continuous values.

$$ W\left(a,\tau \right)=\frac{1}{\sqrt{\left|a\right|}}{\displaystyle \underset{-\infty }{\overset{\infty }{\int }}f(t)\psi \left(\frac{t-\tau }{a}\right)dt} $$
(1)

The CWT provides a redundant representation of a signal, because the CWT of a function at scale a and location τ can be obtained from the continuous wavelet transform of the same function at other scales and locations. Since the CWT behaves like an orthonormal basis decomposition, it can be shown to be isometric: it preserves the overall energy content of the signal and thereby allows the function f(t) to be recovered from its transform using the reconstruction formula given by Daubechies (1992) in Eq. (2)

$$ f(t)=\frac{1}{C_{\psi }}{\displaystyle \underset{-\infty }{\overset{\infty }{\int }}}{\displaystyle \underset{0}{\overset{\infty }{\int }}}{a}^{-2}W\left(a,\tau \right){\psi}_{a,\tau }(t)da\ d\tau $$
(2)

where \( {C}_{\psi } \) is a constant that depends on the choice of wavelet. Eq. (2) suggests that the function f(t) may be seen as a superposition of signals at different scales, obtained by varying the scale parameter a.

Further, the energy of the signal f(t) can be represented scale-wise, as given by Daubechies (1992) in Eq. (3)

$$ {\displaystyle \underset{-\infty }{\overset{\infty }{\int }}}{f}^2(t)dt=\frac{1}{C_{\psi }}{\displaystyle \underset{0}{\overset{\infty }{\int }}}\left[{\displaystyle \underset{-\infty }{\overset{\infty }{\int }}}{\left|W\left(a,\tau \right)\right|}^2d\tau \right]\frac{da}{a^2} $$
(3)

The left-hand side of Eq. (3) is called the ‘energy’ in the signal f(t) (it is, however, not energy in the physical sense unless f(t) has the proper units). We can thus interpret \( {\left|W\left(a,\tau \right)\right|}^2 \) as being proportional to an energy density function that decomposes the energy in f(t) across different scales and times. If \( {\left|W\left(a,\tau \right)\right|}^2 \) is large (small), there is an important (insignificant) contribution to the energy in f(t) at scale a and time τ.

Flandrin (1988) proposed calling the function \( {\left|W\left(a,\tau \right)\right|}^2 \) a scalogram and, in general, for two different functions f(t) and g(t), the product of \( {W}_f\left(a,\tau \right) \) and \( {W}_g\left(a,\tau \right) \) may be called a cross scalogram. A scalogram unfolds the characteristics of a process in the scale-space plane, whereas a cross scalogram provides a similar unfolding of the possible interactions between two processes. Both measures can be quite revealing about the structure of a particular process or about the interaction between different processes.
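The definitions in Eq. (1) and of the scalogram can be illustrated numerically. The following is a minimal sketch only: it assumes a Mexican hat mother wavelet, unit sampling interval and a rectangle-rule approximation of the integral, none of which is prescribed by the text, and the function names are hypothetical. The series is real-valued, so \( {\left|W\right|}^2 \) reduces to \( {W}^2 \).

```python
import math

def mexican_hat(x):
    """Mexican hat (Ricker) mother wavelet; an illustrative choice only."""
    return (1.0 - x * x) * math.exp(-0.5 * x * x)

def cwt_coefficient(signal, a, tau, dt=1.0, psi=mexican_hat):
    """Approximate W(a, tau) of Eq. (1) with a rectangle-rule sum."""
    if a == 0:
        raise ValueError("scale a must be non-zero")
    total = sum(value * psi((k * dt - tau) / a) for k, value in enumerate(signal))
    return total * dt / math.sqrt(abs(a))

def cross_scalogram(f, g, scales, shifts):
    """Grid of W_f(a, tau) * W_g(a, tau); rows follow scales, columns follow shifts."""
    return [[cwt_coefficient(f, a, tau) * cwt_coefficient(g, a, tau)
             for tau in shifts] for a in scales]

def scalogram(f, scales, shifts):
    """Grid of |W_f(a, tau)|^2, i.e. the cross scalogram of f with itself."""
    return cross_scalogram(f, f, scales, shifts)

# Synthetic stand-in for a monthly series: a cosine with a 12-sample period.
series = [math.cos(2.0 * math.pi * k / 12.0) for k in range(120)]
scales = [1.0, 3.0, 6.0]
energy = scalogram(series, scales, shifts=[60.0])
dominant_scale = scales[max(range(len(scales)), key=lambda i: energy[i][0])]
```

For this synthetic series the scalogram concentrates its energy at the intermediate scale, which is the kind of scale-wise localisation that the text describes.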

As can be seen, the CWT offers a promising platform for understanding a given dynamic process and facilitates its objective characterization in terms of the time series of observations available on the process. In hydrology in particular, wavelet analysis has already been shown in several applications to be a credible technique for understanding these natural processes.

For practical applications, the hydrologist has access only to a discrete time signal rather than to a continuous time signal. A discretization of the continuous wavelet transform (CWT) produces \( {N}^2 \) coefficients from a data set of length N; hence redundant information is locked up within the coefficients, which may or may not be a desirable property (Addison 2002). To overcome this redundancy, logarithmic uniform spacing can be used for the discretization of the scale a, with a correspondingly coarser resolution of the locations b, which allows N transform coefficients to completely describe a signal of length N. Such a discrete wavelet has the form (Mallat 1989):

$$ {\psi}_{m,n}(t)=\frac{1}{\sqrt{a_0^m}}\psi \left(\frac{t-n{b}_0{a}_0^m}{a_0^m}\right) $$
(4)

where m and n are integers that control the wavelet dilation and translation, respectively; \( {b}_0 \) is the location parameter, which must be greater than zero; and \( {a}_0 \) is a specified fixed dilation step greater than 1. The most common and simplest choice of parameters is \( {a}_0=2 \) and \( {b}_0=1 \). This power-of-two logarithmic scaling of the translation and dilation is known as the dyadic grid arrangement (Szilagyi et al. 1996). The dyadic wavelet can be written in more compact notation as:

$$ {\psi}_{m,n}(t)={2}^{-m/2}\psi \left({2}^{-m}t-n\right) $$
(5)

Discrete dyadic wavelets of this form are usually chosen to be orthonormal. This allows the original signal to be completely regenerated as a linear combination of translated and dilated orthonormal wavelets. For a discrete time series \( {x}_i \), the dyadic wavelet transform becomes:

$$ {T}_{m,n}={2}^{-m/2}{\displaystyle \sum_{i=0}^{N-1}\psi}\left({2}^{-m}i-n\right){x}_i $$
(6)

where \( {T}_{m,n} \) is the wavelet coefficient for the discrete wavelet of scale \( a={2}^m \) and location \( b={2}^mn \). Eq. (6) considers a finite time series \( {x}_i \), i = 0, 1, 2, …, N − 1, where N is an integer power of 2: \( N={2}^M \). This gives the ranges of m and n as \( 1\le m\le M \) and \( 0\le n\le {2}^{M-m}-1 \), respectively. At the largest wavelet scale (i.e., \( {2}^m \) where m = M), just one wavelet is needed to cover the time interval and only one coefficient is created. At the next scale (\( {2}^{m-1} \)), two wavelets cover the time interval, so two coefficients are created, and so on down to m = 1. At m = 1, the a scale is \( {2}^1 \), and \( {2}^{M-1} \) or N/2 coefficients are needed to describe the signal at this scale. The total number of wavelet coefficients for a discrete time series of length \( N={2}^M \) is then \( 1+2+4+8+\dots +{2}^{M-1}=N-1 \). A smoothed signal component, \( \overline{T}(t) \), is left, which is the signal mean. Therefore, a time series of length N is broken into N components, i.e., with zero redundancy. The inverse discrete transform is given by:

$$ f(t)=\overline{T}(t)+{\displaystyle \sum_{m=1}^M{W}_m}(t) $$
(7)

where \( \overline{T}(t) \) is called the approximation subsignal at level M, and \( {W}_m(t) \) are the detail subsignals at levels m = 1, 2, …, M. The wavelet coefficients \( {W}_m(t) \), m = 1, 2, …, M, provide the detail signals, which can capture small features of interpretational value in the data; the residual term, \( \overline{T}(t) \), represents the background information of the data.
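The counting argument above (N − 1 wavelet coefficients plus one smoothed component, with exact recovery of the signal) can be checked with a short sketch. Note that it swaps in the Haar wavelet, the simplest orthonormal dyadic wavelet, rather than the db2 wavelet used in this study, and that the function names are hypothetical.

```python
import math

def haar_dwt(x):
    """Full dyadic Haar decomposition of a series whose length is a power of 2.

    Returns (smooth, details): details[m - 1] holds the coefficients T_{m,n}
    at level m = 1..M, and smooth is the single remaining coefficient,
    equal to sum(x) / sqrt(N), i.e. proportional to the signal mean.
    """
    n = len(x)
    if n < 2 or n & (n - 1):
        raise ValueError("length must be a power of 2 and at least 2")
    approx = list(x)
    details = []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.append([(a - b) / math.sqrt(2.0) for a, b in pairs])
        approx = [(a + b) / math.sqrt(2.0) for a, b in pairs]
    return approx[0], details

def haar_idwt(smooth, details):
    """Invert haar_dwt, rebuilding the series from the coarsest level down."""
    approx = [smooth]
    for level in reversed(details):
        rebuilt = []
        for a, d in zip(approx, level):
            rebuilt.append((a + d) / math.sqrt(2.0))
            rebuilt.append((a - d) / math.sqrt(2.0))
        approx = rebuilt
    return approx

x = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]   # N = 8 = 2^3, so M = 3
smooth, details = haar_dwt(x)
counts = [len(level) for level in details]        # coefficients per level
reconstructed = haar_idwt(smooth, details)
```

Here the three levels hold 4, 2 and 1 coefficients, i.e. N − 1 = 7 in total, and together with the smoothed component they reproduce the original eight values.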

In this study, wavelet decomposition is performed using the db2 orthogonal discrete wavelet function, as suggested by Maheswaran and Khosa (2012b), with care taken to handle boundary distortion. Four levels of decomposition are implemented. For a sampling period of 1 month, the time scales of the wavelet decomposition are 2, 4, 8, 16 and 32 months for the resolution levels j = 1, 2, 3, 4 and 5, respectively. These decomposition levels allow the usefulness of signals at a range of time scales in rainfall forecasting to be examined.

3 Study Area and Data

To test the proposed method, monthly rainfall data from two sub-basins of the Cauvery Basin, India, and from Gurgaon district in the Delhi NCT (National Capital Territory), India, were considered.

The Cauvery basin receives an average annual rainfall of 1,129 mm, of which about 50 % is received during the south-west monsoon (June–September), 33 % during the north-east monsoon (October–January) and the rest in the summer months (February–March). The mean daily maximum temperature ranges from 19.5 to 33.7 °C, whereas the mean daily minimum varies from 9.1 to 25.2 °C.

The climate of Gurgaon district can be classified as tropical steppe, semi-arid and hot. It is mainly characterized by extreme dryness of the air except during the monsoon months, intensely hot summers and cold winters. During the 3 months of the south-west monsoon, from the last week of June to September, moist air of oceanic origin penetrates into the district and causes high humidity, cloudiness and monsoon rainfall. The normal annual rainfall in Gurgaon district is about 596 mm, spread over 28 days. The south-west monsoon sets in during the last week of June, withdraws towards the end of September, and contributes about 85 % of the annual rainfall.

3.1 Rainfall Data

Long-term monthly rainfall data were obtained from the IMD monthly rainfall dataset. The data were then interpolated to obtain the basin-wise average rainfall. The resulting basin-wise average rainfall data were available for the period 1916 to 1989. Of this data set, 70 % was used for calibration and the remainder for validation.

3.2 Climate Data

To forecast monthly rainfall, this study uses large-scale climate signals that different researchers have identified as influencing Indian rainfall (Polaski et al. 2013; He et al. 2013). The following large-scale climate indices, which have been found to have a significant influence on rainfall over the Indian subcontinent, were selected:

  a) The Southern Oscillation Index (SOI), which is used as a common index for ENSO, was chosen as a potential predictor of rainfall because it has the longest period of record (1876 to present) and has been used successfully in previous rainfall forecasting studies (Meinke and Stone 2005; Abbot and Marohasy 2012). Monthly SOI values were obtained from the Australian Bureau of Meteorology at http://www.bom.gov.au/climate/glossary/soi.shtml.

  b) The IOD has widespread effects on rainfall in East Africa, India, Indonesia, and western and southern Australia (Webster et al. 1999; Ummenhofer et al. 2009). Variability in the Indian Ocean is associated with rainfall variability in the Indian subcontinent, and Indian Ocean variability is reported to be the key driver of climate in India. The IOD index is represented by the anomalous SST gradient between the western equatorial Indian Ocean (50°E–70°E and 10°S–10°N) and the southeastern equatorial Indian Ocean (90°E–110°E and 10°S–0°N).

  c) The PDO is a pattern of climate variability with a similar expression to El Niño, but acting on a longer timescale and with a pattern most clearly expressed in the North Pacific (Mantua et al. 1997; Mantua and Hare 2002). The PDO index is based on a projection of SST anomalies onto a pattern defined by the leading principal component of monthly SST anomalies in the North Pacific poleward of 20°N. The PDO index is available from http://jisao.washington.edu/pdo/PDO.latest

Therefore, in our study these three climate indices were taken as the input variables for the nonlinear regression model.

3.3 Identification of Significant Components

It is assumed that rainfall responds to large-scale signals with a time lag (He et al. 2013). The cross correlation function (CCF) is a common method for evaluating the lag relationship between two variables. In this study, all monthly time series are decomposed into a certain number of subseries components at different temporal scales using a specific mother wavelet. The mother wavelet and the depth of decomposition are chosen based on the previous study by Maheswaran and Khosa (2012b). The cross correlation function is then used to identify lag relationships between the rainfall subseries and each potential predictor subseries. The lag correlation coefficient between the two sets of subseries is used for this purpose.
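The lag analysis described above can be sketched as follows. This is an illustrative sketch that assumes the Pearson correlation between the predictor at time t − lag and the target at time t; the function names, the search range `max_lag` and the synthetic series are hypothetical and do not come from the study.

```python
import math
import random

def lag_correlation(predictor, target, lag):
    """Pearson correlation between predictor(t - lag) and target(t), lag >= 0."""
    if len(predictor) != len(target):
        raise ValueError("series must have equal length")
    if not 0 <= lag < len(target) - 1:
        raise ValueError("lag out of range")
    x = predictor[:len(predictor) - lag]
    y = target[lag:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0  # a constant series carries no correlation information
    return sxy / math.sqrt(sxx * syy)

def max_lag_correlation(predictor, target, max_lag):
    """Return (lag, coefficient) with the largest |correlation| for lags 1..max_lag."""
    best = max(range(1, max_lag + 1),
               key=lambda k: abs(lag_correlation(predictor, target, k)))
    return best, lag_correlation(predictor, target, best)

# Synthetic check: the target is the predictor delayed by 5 steps.
rng = random.Random(0)
predictor = [rng.gauss(0.0, 1.0) for _ in range(200)]
target = [0.0] * 5 + predictor[:-5]
best_lag, best_coefficient = max_lag_correlation(predictor, target, max_lag=30)
```

In the synthetic check the search recovers the imposed delay of 5 steps with a coefficient close to 1; applied to wavelet subseries, the same search would be run once per predictor component.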

4 Methods Used

4.1 Multiple Input Wavelet Volterra Coupled (MWVC) Nonlinear Model

The decomposed time series of the various climate indices and the past rainfall data form the input variables for the model. From these input variables, those that have a significant lagged cross correlation with the rainfall time series were identified and then integrated using the second-order Multiple Input Single Output (MISO) Volterra model to provide the forecast at the next step. Fig. 1 shows the proposed model scheme.

Fig. 1 Wavelet Nonlinear Model – Scheme I

Here, \( {D}_i^X \), i = 1, 2, …, J, denotes the detail components of the wavelet decomposition of a certain input variable X, and \( {A}_J^X \) denotes the approximation component of the wavelet decomposition of the same input variable X.

From the wavelet coefficients of the different input time series, the significant wavelet coefficients are selected based on their lag correlation with the observed rainfall time series. For example, let some of these be denoted by \( {D}_i^{Rainfall}(t),{D}_i^{IOD}(t),\dots, {D}_i^{PDO}(t) \), and similarly let the significant scaling coefficients at decomposition level J of the different input series be denoted by \( {A}_J^{Rainfall}(t),{A}_J^{IOD}(t),\dots, {A}_J^{PDO}(t) \), where i denotes the decomposition level, which varies from 1 to J.

The significant wavelet coefficients and scaling coefficients of the different input series are then nonlinearly convolved using the second-order Volterra representation within a multiple-input single-output framework. For simplicity of notation, let these different series be denoted by \( {u}_1,{u}_2,\dots, {u}_L \), where L is the total number of inputs.

If y(t) denotes the rainfall time series, L the number of input variables, N the length of the time series, m the memory of each input variable up to which there is a significant lag relationship with the rainfall time series, and \( {\xi}_t \) the model noise, including modelling errors and unobservable disturbances, the multiscale nonlinear relationship may be written as

$$ \left.\begin{array}{l}y(t)={\displaystyle \sum_{n=1}^L{\displaystyle \sum_{\tau =1}^m{h}_1^{(n)}}}\left(\tau \right){u}_n\left(t-\tau \right)+\\ {}\kern1.8em {\displaystyle \sum_{n=1}^L{\displaystyle \sum_{\tau_1=1}^m{\displaystyle \sum_{\tau_2=1}^m{h}_{2s}^{(n)}}}}\left({\tau}_1,{\tau}_2\right){u}_n\left(t-{\tau}_1\right){u}_n\left(t-{\tau}_2\right)+\\ {}\kern1.8em {\displaystyle \sum_{n_1=1}^L{\displaystyle \sum_{n_2=1}^{n_1-1}{\displaystyle \sum_{\tau_1=1}^m{\displaystyle \sum_{\tau_2=1}^m{h}_{2x}^{\left({n}_1,{n}_2\right)}\left({\tau}_1,{\tau}_2\right){u}_{n_1}\left(t-{\tau}_1\right){u}_{n_2}\left(t-{\tau}_2\right)}}}}\kern0.6em +{\xi}_t\kern2.76em \end{array}\right\} $$
(8)

First-order kernels \( {h}_1^{(n)} \) describe the linear relationship between the nth input \( {u}_n \) and y; the second-order self-kernels \( {h}_{2s}^{(n)} \) describe the second-order nonlinear relation between the nth input \( {u}_n \) and y; and the second-order cross-kernels \( {h}_{2x}^{\left({n}_1,{n}_2\right)} \) describe the second-order nonlinear interactions between each unique pair of inputs (\( {u}_{n_1} \) and \( {u}_{n_2} \)) as they affect y.

Eq. (8) can be simplified by combining the last two terms to yield Eq. (9), and it then remains to estimate the kernels \( {h}_1 \) and \( {h}_2 \).

$$ \begin{array}{l}y(t)={\displaystyle \sum_{n=1}^L{\displaystyle \sum_{\tau =1}^m{h}_1^{(n)}}}\left(\tau \right){u}_n\left(t-\tau \right)+\\ {}\kern1.68em {\displaystyle \sum_{n_1=1}^L{\displaystyle \sum_{n_2=1}^L{\displaystyle \sum_{\tau_1=1}^m{\displaystyle \sum_{\tau_2=1}^m{h}_2^{\left({n}_1,{n}_2\right)}\left({\tau}_1,{\tau}_2\right){u}_{n_1}\left(t-{\tau}_1\right){u}_{n_2}\left(t-{\tau}_2\right)}}}}+{\xi}_t\kern4.2em \end{array} $$
(9)

The representation of Eq. (9) can be further simplified by treating each of the lagged variables \( {u}_1\left(t-1\right),\dots, {u}_1\left(t-\tau \right),\dots, {u}_2\left(t-1\right),\dots, {u}_2\left(t-\tau \right),\dots \) as separate variables \( {d}_1(t),{d}_2(t),{d}_3(t),\dots, {d}_{N_l}(t) \); Eq. (9) can then be written as

$$ y(t)={\displaystyle \sum_{l=1}^{N_l}{h}_1(l)}{d}_l(t)+{\displaystyle \sum_{l_1=1}^{N_l}{\displaystyle \sum_{l_2=1}^{N_l}{h}_2\left({l}_1,{l}_2\right)}}{d}_{l_1}(t){d}_{l_2}(t) $$
(10)

More clearly,

$$ \begin{array}{l}{d}_l(t)={u}_k(t),\kern1em 1\le k\le L;\kern0.5em 1\le l\le L\\ {}{d}_l(t)={u}_k\left(t-\tau \right),\kern1em 1\le k\le L;\kern0.5em L<l\le {N}_l;\kern0.5em \tau =1,2,3,\dots, m\\ {}\tau =\text{index of the lagged value}\\ {}L=\text{total number of predictor time series}\\ {}{N}_l=\text{total number of lagged variables}\end{array} $$
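To make the structure of Eq. (10) concrete, the sketch below assembles the candidate regressors for a single time step from the lagged variables. It assumes lags τ = 1, …, m as in Eq. (9), and it merges the symmetric products \( {d}_{l_1}{d}_{l_2} \) and \( {d}_{l_2}{d}_{l_1} \) into one term, which is a bookkeeping choice of this sketch; the function names are hypothetical.

```python
from itertools import combinations_with_replacement

def lagged_variables(inputs, memory, t):
    """Return d_l(t) = u_n(t - tau) for every input n and lag tau = 1..memory."""
    if t < memory:
        raise ValueError("not enough history before time t")
    return [u[t - tau] for u in inputs for tau in range(1, memory + 1)]

def volterra_regressors(inputs, memory, t):
    """Linear terms d_l followed by the unique second-order products d_l1 * d_l2."""
    d = lagged_variables(inputs, memory, t)
    quadratic = [d[i] * d[j]
                 for i, j in combinations_with_replacement(range(len(d)), 2)]
    return d + quadratic

# Two toy inputs with memory m = 2 give N_l = 4 lagged variables.
u1 = [1.0, 2.0, 3.0, 4.0]
u2 = [10.0, 20.0, 30.0, 40.0]
terms = volterra_regressors([u1, u2], memory=2, t=3)
```

With \( {N}_l \) lagged variables this construction yields \( {N}_l+{N}_l\left({N}_l+1\right)/2 \) candidate terms (14 in the toy case), which shows how quickly the candidate set grows and why a term selection step is needed.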

Using the Orthogonal Least Squares–Error Reduction Ratio (OLS-ERR) method of Chen and Billings (1989), the significant regressor terms were selected and the corresponding kernels were estimated. The complete mathematical derivation of the wavelet Volterra coupled model can be found in Maheswaran and Khosa (2012a). The programs were coded and executed in MATLAB 7.6.0.
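The selection step can be sketched as a forward orthogonal regression in which, at each step, the candidate with the largest error reduction ratio (the fraction of the output energy it explains after orthogonalisation against the terms already chosen) is added. This is a simplified illustration of the general idea, not the authors' MATLAB implementation; it uses classical Gram-Schmidt orthogonalisation, omits the final kernel estimation, and the function name and thresholds are hypothetical.

```python
def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def ols_err_select(candidates, y, max_terms, min_err=1e-6):
    """Forward selection by error reduction ratio (ERR).

    candidates: list of regressor series (each the same length as y).
    Returns a list of (candidate_index, err) in order of selection.
    """
    yy = _dot(y, y)
    if yy == 0.0:
        raise ValueError("target series has zero energy")
    basis, chosen = [], []
    remaining = list(range(len(candidates)))
    while remaining and len(chosen) < max_terms:
        best = None
        for i in remaining:
            p = candidates[i]
            w = list(p)
            for q in basis:  # remove the part explained by chosen terms
                c = _dot(q, p) / _dot(q, q)
                w = [wi - c * qi for wi, qi in zip(w, q)]
            ww = _dot(w, w)
            if ww <= 1e-12 * _dot(p, p):
                continue  # numerically dependent on terms already chosen
            err = _dot(w, y) ** 2 / (ww * yy)
            if best is None or err > best[1]:
                best = (i, err, w)
        if best is None or best[1] < min_err:
            break  # no remaining candidate adds a meaningful contribution
        chosen.append(best[:2])
        basis.append(best[2])
        remaining.remove(best[0])
    return chosen

# Toy example: candidate 3 duplicates candidate 2 and candidate 1 is irrelevant.
candidates = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 2.0, 0.0]]
y = [3.0, 0.0, 4.0, 0.0]
selected = ols_err_select(candidates, y, max_terms=4)
```

In the toy example only two terms are retained, and their ERR values sum to 1, meaning that together they account for all of the output energy.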

4.2 Multivariate Wavelet Linear Regression Model

The Multivariate Wavelet-based Linear Regression (MWLR) model is constructed by combining two methods: linear regression and the wavelet transform (Kisi 2009; He et al. 2013). The details and approximations of the different input variables are combined using linear regression to predict future rainfall. For the MWLR model inputs, each of the original rainfall and climate index time series is decomposed into a certain number of subseries components, \( {A}_J \) and the D components, using the wavelet transform.

The forecast value y(t), i.e. the rainfall at time t, can then be obtained using the multiple linear regression formulation given below:

$$ y(t)={\displaystyle \sum_{n=1}^L{\displaystyle \sum_{\tau =1}^m{a}_n}}\left(\tau \right){u}_n\left(t-\tau \right) $$
(11)

where \( {u}_n(t) \) denotes the wavelet decompositions of the different input predictor variables, \( {a}_n \) denotes the regression parameters, which are obtained from the calibration period data, and τ denotes the lag time between the rainfall time series and the predictor time series. The programs were coded and executed in MATLAB 7.6.0.
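The text does not state how the regression parameters in Eq. (11) were estimated. The sketch below assumes ordinary least squares solved through the normal equations, with no intercept term as in Eq. (11); the function names and the toy data are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular or nearly singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_linear_regression(rows, targets):
    """Least-squares coefficients for targets ~ sum_k coeff[k] * rows[.][k]."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)

# Each row holds the lagged subseries values u_n(t - tau) for one time step.
rows = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
targets = [2.0, -3.0, -1.0, 1.0]   # generated with coefficients (2, -3)
coefficients = fit_linear_regression(rows, targets)
```

Here the fit recovers the coefficients used to generate the toy targets; in practice the rows would be built from the calibration period only.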

4.3 Single Scale Neural Network Model

Neural networks have been applied extensively for forecasting purposes, and the results have been very encouraging. In this study, a neural network model is used as a baseline for comparison with the proposed wavelet-based approach. The basics of neural networks are widely covered in the literature (Thirumalaiah and Deo 1998; Adamowski and Chan 2011). The choice of the ‘most appropriate’ network training algorithm is usually resolved by trial and error, and opinions understandably differ on the use of a specific algorithm. In the present study, ANN models were implemented with various training algorithms, their performance was assessed through a comparative evaluation, and the corresponding ‘best’ results are presented for clarity. The input variables for the ANN model are selected based on the cross correlation between the predictor variables and the predictand. The programs were run using the Neural Network Toolbox in MATLAB 7.6.0.

4.4 Model Performance Measures

To evaluate the performance of the predictions, the following statistical measures of error are considered:

  1. Mean absolute error (MAE)

    $$ MAE=\frac{{\displaystyle \sum_{i=1}^N\left| Observed(i)- Forecast(i)\right|}}{N} $$
    (12)

  2. Root mean square error (RMSE)

    $$ RMSE=\sqrt{\frac{{\displaystyle \sum_{i=1}^N{\left( Observed(i)- Forecast(i)\right)}^2}}{N-1}} $$
    (13)

  3. Nash Sutcliffe Criteria (NSC)

    $$ NSC=E=1-\frac{{\displaystyle \sum_{i=1}^N{\left( Observed(i)- Forecast(i)\right)}^2}}{{\displaystyle \sum_{i=1}^N{\left( Observed(i)-\overline{Observed}\right)}^2}} $$
    (14)

To compare the performance of the different models, the present study uses the root mean square error (RMSE), the mean absolute error (MAE) and the NSC. Karunanithi et al. (1994) suggested that RMSE is a good measure of goodness of fit. In general, RMSE ≥ MAE, and the degree to which RMSE exceeds MAE indicates the extent to which large outliers (large deviations between the observed and the forecast values) exist in the evaluation set. The lower the RMSE, the better the model performance.

Equations (12), (13) and (14) as given above have been used to estimate these performance measures.
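Eqs. (12) to (14) translate directly into code. The sketch below follows the formulas as written, including the N − 1 denominator in the RMSE; the function names and the toy values are hypothetical.

```python
import math

def mae(observed, forecast):
    """Mean absolute error, Eq. (12)."""
    return sum(abs(o - f) for o, f in zip(observed, forecast)) / len(observed)

def rmse(observed, forecast):
    """Root mean square error with the N - 1 denominator of Eq. (13)."""
    squared = sum((o - f) ** 2 for o, f in zip(observed, forecast))
    return math.sqrt(squared / (len(observed) - 1))

def nsc(observed, forecast):
    """Nash Sutcliffe criterion, Eq. (14); 1 indicates a perfect forecast."""
    mean_observed = sum(observed) / len(observed)
    residual = sum((o - f) ** 2 for o, f in zip(observed, forecast))
    spread = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - residual / spread

observed = [1.0, 2.0, 3.0, 4.0]
forecast = [1.0, 2.0, 3.0, 6.0]   # a single large miss on the last value
scores = (mae(observed, forecast), rmse(observed, forecast), nsc(observed, forecast))
```

In this toy case a single large miss pushes the RMSE well above the MAE, which is the outlier sensitivity noted above.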

5 Model Application

5.1 Selection of Input Components

As a first step, the wavelet decompositions of the input variables were performed. Figure 2 shows an example of wavelet decomposition using the ‘db2’ wavelet with four resolution levels for the monthly rainfall series at MH Halli. Similarly, Fig. 3 shows the wavelet decomposition of the SOI series. These figures show how the wavelet transform decomposes the original series into its detail (D) and approximation (A4) subseries.

Fig. 2 Wavelet decomposition of the observed rainfall time series at MH Halli using db2 wavelet

Fig. 3 Wavelet decomposition of the SOI time series using db2 wavelet

The input wavelet components were selected by estimating the cross lag correlation coefficients. For each potential predictor subseries, a monthly lag within the interval [1, 30] is identified, and the corresponding maximum lag correlation coefficient (MLCC) between this subseries and the rainfall time series is found. Some sample cross correlation plots are shown in Fig. 4. Table 1 summarises the MLCC between the various subseries and the rainfall time series for all three stations under investigation. The components D1, D2, D3, D5 and A5 of the IOD have a significant correlation with the rainfall anomalies. In addition, the D4 and A4 components of the SOI and IOD time series have a significant cross lag correlation with the original rainfall time series. The D4 and A4 components of the PDO also correlate well with the rainfall time series.

Fig. 4 Cross correlation plots between the rainfall time series and the wavelet decomposition of the SOI time series

Table 1 Summary of the cross correlation analysis between the wavelet decomposition of the input variables and the observed rainfall time series
Table 2 Selection of best MWVC model for one month ahead forecasting at MH Halli

5.2 MWVC Model

The model rests on the assumption that a time series of observations on a natural process results from an amalgamation of various sub-processes or phenomena that individually operate at various scales, and that each of these sub-processes has the attribute of memory, albeit of different spans. For the rainfall time series of MH Halli, Table 1 indicates that the optimal components that contribute significantly to variability in the observed rainfall process, X(t), are D1 of the IOD; D4 and A4 of the SOI; D1, D3 and A4 of the PDO; and D1, D2, D4 and A4 of the rainfall. The lags at which these variables have a significant correlation at the 95 % confidence level were also estimated. From this group of variables, a total of 24 significant input variables were selected on the basis of their lagged cross correlations. These input variables were regressed using the second-order Volterra model, which required the estimation of a total of 356 Volterra kernels. Using the OLS-ERR algorithm, however, only 40 kernels were found to be significant. Figure 5 plots the NSC against the number of significant kernels selected; there is no significant change in the NSC values beyond 40 terms, leading to the inference that the MWVC model comprises just these 40 terms. Notably, the model has only 5 linear terms, while the remaining 35 are nonlinear terms. Figure 6 shows the second-order kernel values of the derived MWVC model for the MH Halli station. The model validation results are compared with the observed rainfall time series of the test data in Fig. 7. Several input combinations were tried, and the results are tabulated in Table 2.

Fig. 5 NSC vs. number of significant terms used, MH Halli

Fig. 6 Second-order kernel for the MWVC model, MH Halli

Fig. 7 Observed and forecast rainfall values for MH Halli station using MWVC models

A similar approach was followed for the remaining two stations, and the results are presented in a later section.

5.3 Wavelet Multivariate Linear Regression Model

Selection of input variables was based on the strength of the cross correlation between individual wavelet decompositions of rainfall and of the different climate indices. There were 26 significant input variables. Figure 8 plots the NSC against the number of significant terms selected; there is no significant change in the NSC values beyond 7 terms, leading to the inference that the MWLR model comprises just these 7 terms. In other words, only 7 of the 26 inputs contribute significantly to the rainfall. The scatter plot of the results for MH Halli is shown in Fig. 9. The results of the MWLR for the other stations are summarised later in this section. The explicit forecasting equation for the MWLR is given by Eq. (15).

Fig. 8 NSC vs. number of significant terms used for the MWLR model, MH Halli

Fig. 9 Observed and forecast rainfall values for MH Halli station using MWLR models

$$ \begin{array}{l}{R}_{t+1}=0.21D{3}_t^R+1.06D{4}_t^R+0.29D{2}_t^{SOI}+0.43D{3}_{t-6}^{SOI}+\\ {}5.57A{4}_t^R-5.10A{4}_{t-1}^R+0.52A{4}_{t-12}^R\end{array} $$
(15)
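As a reading aid, Eq. (15) can be evaluated directly once the wavelet components are available. The sketch below assumes each component is stored as a monthly sequence indexed by t; the function name, the argument names and the placeholder values are hypothetical and only demonstrate how the lags t - 1, t - 6 and t - 12 enter the forecast.

```python
def wmlr_forecast(t, d3_r, d4_r, a4_r, d2_soi, d3_soi):
    """Evaluate Eq. (15): rainfall forecast for month t + 1 from components up to month t."""
    if t < 12:
        raise ValueError("Eq. (15) needs 12 months of history (A4 of rainfall at t - 12)")
    return (0.21 * d3_r[t] + 1.06 * d4_r[t]
            + 0.29 * d2_soi[t] + 0.43 * d3_soi[t - 6]
            + 5.57 * a4_r[t] - 5.10 * a4_r[t - 1] + 0.52 * a4_r[t - 12])

# Placeholder components (all ones), so the forecast equals the sum of the coefficients.
ones = [1.0] * 13
forecast = wmlr_forecast(12, d3_r=ones, d4_r=ones, a4_r=ones, d2_soi=ones, d3_soi=ones)
print(round(forecast, 2))
```

The large, nearly cancelling coefficients on the A4 rainfall component at t and t - 1 mean that the forecast responds mainly to the month-to-month change in the rainfall approximation rather than to its level.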

5.4 Neural Network Model

Recently, Meknaik et al. (2013) used neural networks to forecast rainfall from climate indices such as SST and IOD, taking ENSO, IOD, SOI and Nino 3.4 as inputs to the ANN models. In this work, a similar approach has been taken to forecast the rainfall from past rainfall values and past climate indices. Table 3 summarises the prediction skills of the different models that were tested in the study. The scatter plot of the results from the best model for MH Halli is shown in Fig. 10.

Table 3 Selection of the best ANN model for one-month-ahead forecasting at MH Halli (after normalizing)

Fig. 10 Observed and forecast rainfall values for MH Halli station using ANN models

A similar approach was used to obtain the best model for the other two sub-basins, viz. Kudige and Gurgoan. The best results from each category of model are reported in Table 4.

Table 4 Result statistics for one-month-ahead forecasting of total monthly rainfall at MH Halli and Gurgoan using different models

6 Results and Discussions

The performance statistics of the wavelet nonlinear model, the wavelet linear model and the neural networks are shown in Table 4 for the test period 1970–1989 for all three stations. The NSC for the wavelet nonlinear model ranges from 0.74 to 0.78, while that for the WMLR varies from 0.50 to 0.62. The wavelet nonlinear model thus increases the forecast NSC by up to almost 50 %. This clearly indicates that the wavelet nonlinear model provides significantly improved accuracy relative to WMLR for monthly rainfall forecasts.

Similarly, in terms of RMSE, the multivariate wavelet Volterra model performs better than the other models (WMLR and ANN) by 16 % and 25 %, respectively. Therefore, from both the RMSE and NSC viewpoints, the proposed wavelet nonlinear model performs better than the WMLR model for monthly rainfall forecasting. This may be because the proposed nonlinear wavelet model can capture the nonlinear relationships between the predictor variables and the rainfall at different time scales, whereas WMLR cannot. Although the number of parameters to be estimated may be larger for the wavelet based nonlinear model than for the WMLR model (Eq. 15), the original number of input predictor variables is the same.
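For reference, the two skill scores compared above can be computed as in the following sketch. It assumes that NSC denotes the Nash–Sutcliffe coefficient in its standard form and that RMSE is the usual root mean square error; the function names and the toy series are illustrative and are not values from Table 4.

```python
import math

def nsc(observed, forecast):
    """Nash-Sutcliffe coefficient: 1 minus squared error over variance about the observed mean."""
    mean_obs = sum(observed) / len(observed)
    num = sum((o - f) ** 2 for o, f in zip(observed, forecast))
    den = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - num / den

def rmse(observed, forecast):
    """Root mean square error between observed and forecast values."""
    sq = sum((o - f) ** 2 for o, f in zip(observed, forecast))
    return math.sqrt(sq / len(observed))

# Toy series (not study data).
obs = [1.0, 2.0, 3.0, 4.0, 5.0]
sim = [1.5, 2.0, 2.5, 4.0, 5.5]
print(round(nsc(obs, sim), 3), round(rmse(obs, sim), 4))
```

An NSC of 1 corresponds to a perfect forecast, while an NSC of 0 means the model does no better than always forecasting the observed mean, which is why the gap between the model ranges reported above is substantial.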

On the other hand, the WMLR performs better than the NN model because the wavelet decomposition can unravel the hidden relationships between the predictor variables and the rainfall time series. To inspect how the forecast models perform for dry and wet months, the forecast results are plotted for MH Halli in Fig. 11 for the test period from January 1985 to December 1987. At MH Halli, both MWVC and MWLR perform similarly for months with normal rainfall, but for some extreme months the MWVC model provides much better predictions than MWLR. For 1985 and 1986, the MWVC model produced very close forecasts, whereas the MWLR and ANN models underestimate the rainfall. For 1987, however, both the MWVC and MWLR models overestimate the wet-month rainfall, whereas the ANN model underestimates it. For one extreme wet month, the MWVC forecasts 960.23 mm against the observed 1020.2 mm, an underestimation of 5.88 %, whereas the MWLR forecasts 521.26 mm, an underestimation of 48.91 %.
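The underestimation percentages quoted for the extreme wet month follow from the relative error with respect to the observed value. The short check below uses only the three rainfall figures given in the text; the function name is an illustrative choice.

```python
def underestimation_pct(observed, forecast):
    """Shortfall of the forecast relative to the observed value, in percent."""
    return 100.0 * (observed - forecast) / observed

observed_mm = 1020.2
mwvc_pct = underestimation_pct(observed_mm, 960.23)
mwlr_pct = underestimation_pct(observed_mm, 521.26)
print(f"MWVC: {mwvc_pct:.2f} %  MWLR: {mwlr_pct:.2f} %")
```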

Fig. 11 Model results for the three-year period (1985–1987)

In the case of the summer rainfall also, the MWVC yields better results, whereas the other two models either overestimate or underestimate the rainfall. For example, for June 1987 the MWVC gives a value of 108.34 mm against the observed rainfall of 123.56 mm. For the same month, the MWLR and ANN models provide forecasts of 97.67 mm and 88.56 mm, respectively.

The models were compared for their overall capability to forecast the rainfall. Figure 12 compares the forecast values with the observed rainfall during the months of June, July, August and September over a period of 5 years. The analysis of the results indicates that the MWVC performs well in picking up the extreme rainfall events on a monthly time scale.

Fig. 12 Average forecast performance for MH Halli

Figure 13 shows the percentage error of each model with reference to the observed values for the monsoon periods. The plot indicates that the MWVC performed better during the extreme months. Similar results were obtained for Gurgoan also.

Fig. 13 Percentage of error in the different models during the monsoon seasons for (a) MH Halli and (b) Gurgoan

6.1 Analysis of the Influence of the Climate Indices in Driving the Rainfall

In order to evaluate the influence of the climate indices on the rainfall, a WVC model was developed without climate indices and compared with the MWVC model. The results of the WVC model without climate indices are tabulated in Table 4. In all three cases, including the climate indices as model input variables markedly improved the model performance. In the case of Kudige, including the climate indices increased the NSC from 0.71 to 0.77. These figures imply that about 15 % of the total variance is explained by the climate indices.

The orthogonal least squares analysis showed that the SOI and IOD indices are the major regressors when compared with the PDO index. More research is needed to explain this behaviour.

7 Conclusions

A wavelet based nonlinear method is presented, tested and discussed for forecasting monthly rainfall. The proposed wavelet based multivariate nonlinear model combines the wavelet decompositions of the candidate predictor variables (historical rainfall and climate indices) using the MISO 2nd order Volterra model. The proposed method is compared with two other competing models, namely the MWLR and ANN models. The analysis of the results reveals that the MWVC models perform better than the other models. In fact, the linear nature of the MWLR model estimators makes it inadequate to provide good prognostics for a variable characterized by highly nonlinear physics. The ANN model, on the other hand, although equipped in principle to pick up nonlinear features, performed poorly because it failed to capture the nonlinearity in the rainfall dynamics.

The proposed model was trained with 53 years of data, tested with the remaining 20 years of data, and compared with the traditional ANN model based on the original time series. The MWVC forecast skill appears to be significantly better than that of the ANN and MWLR models. The MWVC model reduced the mean absolute error by 16 % and increased the NSC by 26 % in comparison with the WMLR model. The improvement is more significant for the extreme wet months; for the dry months there is no significant improvement over the MWLR and ANN models. These results indicate that the MWVC can capture the nonlinear impacts of predictor variables on the rainfall series at different time scales. Also, the analysis clearly shows that including the climate variables in the forecasting model significantly improves the model performance (a 15 % increase).