1 Introduction

Precipitation is one of the most important variables in hydrologic and meteorological studies. Intense precipitation events can cause major floods that result in property and economic damage or loss of life. On the other hand, drought is a common problem at local and regional scales. Accurate precipitation prediction is therefore very important for hydrologists and meteorologists. However, it is difficult because the physical processes involved are complex and depend strongly on small-scale processes (Kulligowski and Barros 1998). Meteorologists have used numerical weather forecasting for rainfall estimation for many years (Bustamante et al. 1999; Olson et al. 1995, 2004). However, such forecasts are limited by inaccurate initial conditions, parameterization schemes of subscale phenomena, and limited spatial resolution (Ramirez et al. 2005).

Artificial neural networks (ANNs) are a useful tool for prediction problems. ANNs have been used successfully in the hydrological sciences in recent years (Applequist et al. 2002; Silverman and Dracup 2000; Cigizoglu 2003; Kumar et al. 2005; Ramirez et al. 2005; Freiwan and Cigizoglu 2005; Kisi 2006; Jain and Kumar 2007; Sreekanth et al. 2009; Gao et al. 2009). The mentioned studies employed the feed forward back propagation (FFBP) neural network, which is the most popular ANN architecture. However, the radial basis function (RBF) network and the generalized regression neural network (GRNN) have comparatively fewer applications in water resources problems (Sudheer et al. 2002; Cigizoglu and Alp 2004, 2006; Jayawardena and Fernando 1998). Cigizoglu (2005) found that the GRNN performed better than the FFBP for daily river-flow forecasting. Kermania et al. (2013) studied the performance of RBF and feed forward neural networks for daily runoff prediction. They found that the feed forward neural network model trained with the Levenberg–Marquardt algorithm (LMNN) was superior to the RBF network for base and high flows, whereas the RBF model was superior to the LMNN model for simulating flood events.

Wavelet transformation provides considerable information about the structure of the physical process to be modeled, so combining wavelets with neural networks offers considerable advantages. Hybrid models that combine wavelet transformation and neural networks have been developed for prediction in recent years (Kim and Valdes 2003; Wang and Ding 2003; Anctil and Tape 2004; Partal 2009; Adamovski and Chan 2011; Kisi 2011; Mishra et al. 2011). Ramana et al. (2013) predicted monthly rainfall series using a wavelet neural network (WNN) model with the back propagation algorithm. Their results indicate that the WNN models are more effective than classical neural network models. Partal and Cigizoglu (2009) predicted daily precipitation from meteorological data of Turkey using the wavelet–neural network method, which combines the discrete wavelet transformation and feed forward neural networks. In their results, the WNN model noticeably improved the performance evaluation criteria. Shoaib et al. (2014) compared two neural network models (RBF and multilayer perceptron) with different mother wavelets for rainfall–runoff modeling. They found that the discrete wavelet transform multilayer perceptron and the discrete wavelet transform RBF models with the db8 mother wavelet function had the best performance.

The objective of this research is to study the potential of wavelets combined with different neural network structures for modeling daily precipitation from meteorological data. The performance of the wavelet-FFBP is compared with that of the wavelet-GRNN and the wavelet-RBF. Some drawbacks of the FFBP algorithm necessitate the investigation of other ANN algorithms. For example, the training performance of the FFBP depends on the random weights assigned at the beginning of each training simulation. Therefore, many FFBP simulations are needed to select the best FFBP performance. The RBF network learns faster than FFBP networks and has fewer parameters (Jayawardena and Fernando 1998). Unlike the back propagation algorithm, the RBF network has the nonlinearity embedded in the basis functions of its hidden layer neurons, which makes the optimization of the tunable parameters a linear search (Sudheer and Jain 2003). Besides, the wavelet-RBF and the wavelet-GRNN have comparatively fewer applications in precipitation prediction. This paper therefore investigates the performance of three different neural network algorithms combined with wavelet transformation. The performance of the linear regression model and of conventional neural networks is also compared with that of the WNN methods. The global wavelet spectra (GWS) of the precipitation data were studied to investigate the effective periodic characteristics of the observed data. The daily meteorological data were taken from Turkish meteorological stations (selected as randomly distributed throughout the country). These data were found to be homogeneous for the study period (Partal and Cigizoglu 2009).

2 Description of the data

The prediction methods described herein were applied to stations of the DMI (Turkish State Meteorological Service) distributed all over Turkey (Fig. 1). The stations are assumed to reflect the different regional hydro-climatic conditions of Turkey. Turkey’s climate is generally characterized by the Mediterranean regime. However, because of its geographical location, Turkey is also affected by polar and tropical weather regimes. A general description of the climate conditions prevailing over Turkey is available in Unal et al. (2004) and Tatlı et al. (2004).

Fig. 1 The meteorological stations used

Data quality was checked for homogeneity to ensure that the station records are of good quality. The record length is 5479 days, covering the period between January 1987 and December 2001. The meteorological variables considered to influence the precipitation process are the daily mean temperature (\( T_{mean} \)), daily maximum temperature (\( T_{max} \)), daily minimum temperature (\( T_{min} \)), daily total specific humidity (H), daily total evaporation (E), and daily total precipitation (P).

Some statistical parameters of the data are presented in Table 1. The highest maximum daily precipitation was observed at the Muğla station (\( x_{max} \) = 155.6 mm, Table 1). The lowest maximum daily precipitation was measured at the Afyon station (60.3 mm). The precipitation series have quite high skewness values (\( c_{sx} \) = 5.48 for the Muğla station; Table 1). This is worth noting because high skewness decreases the estimation accuracy of neural networks (Altun et al. 2007). The lag-1 autocorrelation of the precipitation records is significant, whereas the lag-2 and lag-3 autocorrelations are close to zero. The lag-0 and lag-1 cross-correlations between the meteorological data and the precipitation data were also computed and are presented in Table 2. Table 2 shows that the correlations between the relative humidity and the precipitation data are quite significant.
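The statistics named above (mean, maximum, skewness, lag autocorrelations and lagged cross-correlations) can be computed along the following lines. This is a minimal sketch and not the authors' code: the function names `skewness` and `lag_correlation` are hypothetical, the moment-based skewness estimator is an assumption, and the gamma-distributed series is only a synthetic stand-in for a station record.

```python
import numpy as np

def skewness(x):
    """Moment-based (biased) sample skewness coefficient."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    return np.mean(d ** 3) / np.mean(d ** 2) ** 1.5

def lag_correlation(x, y, lag):
    """Pearson correlation between x[t - lag] and y[t], for lag >= 0.

    With x is y this is the lag autocorrelation; with a meteorological
    variable as x and precipitation as y it is a lagged cross-correlation.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if lag > 0:
        x, y = x[:-lag], y[lag:]
    return np.corrcoef(x, y)[0, 1]

# Synthetic, strongly skewed stand-in for a daily precipitation record
# (5479 values, the record length quoted in the text). Not the study's data.
rng = np.random.default_rng(0)
precip = rng.gamma(shape=0.3, scale=8.0, size=5479)

stats = {
    "x_mean": precip.mean(),
    "x_max": precip.max(),
    "c_sx": skewness(precip),
    "r_1": lag_correlation(precip, precip, 1),
    "r_2": lag_correlation(precip, precip, 2),
    "r_3": lag_correlation(precip, precip, 3),
}
print(stats)
```

Because the synthetic series is independent from day to day, its lag autocorrelations are all near zero; a real precipitation record would be expected to show the lag-1 persistence reported in Table 1.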

Table 1 Station information and some statistical parameters of the daily precipitation data (\( x_{mean} \), \( x_{max} \), \( c_{sx} \), \( r_{1} \), \( r_{2} \), \( r_{3} \) denote the mean, maximum, skewness, and lag-1, lag-2 and lag-3 autocorrelation coefficients, respectively)
Table 2 Lag 0 and Lag 1 cross-correlations between the meteorological data and the precipitation data

3 Methods

3.1 The feed forward back propagation

The FFBP is the best known and most widely used ANN method in the water resources literature. An FFBP network has one input layer, one output layer and at least one hidden layer with hidden neurons. The neurons in different layers are connected through adjustable weights. Each neuron is connected only with neurons in the following layer (Cigizoglu 2004). Each neuron sums its inputs and then produces its output through an activation function. In this study, the tangent sigmoid function is used as the neuron transfer function. The number of hidden layer nodes of each model was determined after trying various network structures. To avoid overfitting in the ANNs, the “early stopping” technique was considered, and the training of the FFBP networks was stopped after 200 iterations.

Predicted output values always differ from observed values. The connection weights are modified based on the differences between the computed and observed values at the output layer. This is the back-propagation process. The feed forward pass is then repeated until a target total error or a prescribed number of iterations is reached (Kisi 2006). The performance of the FFBP algorithm is very sensitive to the proper setting of the learning rate. Here the learning rate is made responsive to the complexity of the local error surface. At each epoch, new weights and biases are calculated using the current learning rate, and new outputs and errors are then calculated. If the new error is less than the old error, the learning rate is increased. In this way, the network can learn without large error increases. More details on neural networks can be found in Cigizoglu (2003).
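The training loop described in this section can be illustrated with a small NumPy sketch. It is not the MATLAB code used in the study. The network has one tangent sigmoid hidden layer and a linear output neuron, and it is trained by batch gradient descent for 200 epochs. The increase and decrease factors (`lr_inc`, `lr_dec`), the rule of discarding a step that does not reduce the error, the weight initialization and the demonstration data are all assumptions made for illustration.

```python
import numpy as np

def train_ffbp(X, y, n_hidden=5, epochs=200, lr=0.01,
               lr_inc=1.05, lr_dec=0.7, seed=0):
    """One-hidden-layer tanh network, batch gradient descent with an
    adaptive learning rate. lr_inc and lr_dec are assumed values."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.5, size=(X.shape[1], n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(scale=0.5, size=n_hidden)
    b2 = 0.0

    def forward(W1, b1, W2, b2):
        h = np.tanh(X @ W1 + b1)          # hidden layer, tangent sigmoid
        return h, h @ W2 + b2             # linear output neuron

    h, out = forward(W1, b1, W2, b2)
    err = np.mean((out - y) ** 2)
    history = [err]
    for _ in range(epochs):
        # Back-propagate the output error to obtain the MSE gradients.
        d_out = 2.0 * (out - y) / len(y)
        d_hid = np.outer(d_out, W2) * (1.0 - h ** 2)
        trial = (W1 - lr * (X.T @ d_hid), b1 - lr * d_hid.sum(axis=0),
                 W2 - lr * (h.T @ d_out), b2 - lr * d_out.sum())
        h_new, out_new = forward(*trial)
        new_err = np.mean((out_new - y) ** 2)
        if new_err < err:                 # accept the step, raise the rate
            (W1, b1, W2, b2), h, out, err = trial, h_new, out_new, new_err
            lr *= lr_inc
        else:                             # reject the step, lower the rate
            lr *= lr_dec
        history.append(err)
    return (W1, b1, W2, b2), history

# Synthetic demonstration data (not the study's inputs).
rng = np.random.default_rng(1)
X_demo = rng.uniform(-1.0, 1.0, size=(200, 2))
y_demo = np.sin(X_demo[:, 0]) + 0.5 * X_demo[:, 1]
weights, history = train_ffbp(X_demo, y_demo)
print(f"MSE: {history[0]:.4f} -> {history[-1]:.4f}")
```

Because a step is kept only when it lowers the error, the training error in this sketch never increases, which mirrors the statement that the network can learn without large error increases.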

3.2 The radial basis function-based neural networks

RBF neural networks were first developed by Bromhead and Lowe (1988). The RBF neural network model is inspired by the locally tuned response observed in biological neurons. In theory, the RBF network is similar to the FFBP network, but it uses a radially symmetric transfer function in the hidden layer. This transfer function is defined by center (μ) and spread (σ) parameters. Synaptic weights (\( w_{ij} \)) exist only between the hidden and output layers (Sudheer and Jain 2003). For an input pattern X, the response of the jth node in the hidden layer (\( z_{j} \)) is

$$ z_{j} = \exp \left\{ { - \frac{{\left\| {X - \mu_{j} } \right\|^{2} }}{{\sigma_{j}^{2} }}} \right\} $$
(1)

where \( \left\| \cdot \right\| \) is the Euclidean norm. The output of the network at the ith output node is given by

$$ y_{i} = \sum\limits_{j = 1}^{L} {z_{j} w_{ij} } $$
(2)

where L is the number of hidden nodes. Different spread constants were tried in the study. The theoretical basis of the RBF approach lies in the field of interpolation of multivariate functions (Cigizoglu 2004). The solution of the exact interpolating RBF mapping passes through every data point. More details on neural networks can be found in Cigizoglu (2004).
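A short sketch may help to show why the output weights of an RBF network can be found by a linear solve and why the exact interpolating mapping passes through every data point. It assumes the Gaussian response exp(−‖X − μ_j‖²/σ²) with a single spread shared by all hidden nodes, and it places one center on each training pattern. The function names and the sine-curve data are hypothetical and are not taken from the study.

```python
import numpy as np

def rbf_hidden(X, centers, spread):
    """Gaussian hidden-layer responses, one column per center."""
    sq_dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / spread ** 2)

def fit_rbf_weights(X, y, centers, spread):
    """Hidden-to-output weights by linear least squares.

    The nonlinearity lives in the hidden layer, so the output weights
    solve a linear problem once centers and spread are fixed.
    """
    Z = rbf_hidden(X, centers, spread)
    w, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return w

# Synthetic one-dimensional example; every training pattern is a center.
X_train = np.linspace(0.0, 1.0, 15)[:, None]
y_train = np.sin(2.0 * np.pi * X_train[:, 0])
spread = 0.1                      # assumed value, normally found by trial

w = fit_rbf_weights(X_train, y_train, X_train, spread)
y_fit = rbf_hidden(X_train, X_train, spread) @ w
print("largest training residual:", np.max(np.abs(y_fit - y_train)))
```

With fewer centers than training patterns the same least-squares call gives an approximating rather than an interpolating network, and the spread then controls how smooth the fitted surface is.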

3.3 The generalized regression neural networks

The GRNN does not need an iterative training procedure as in the back-propagation method. It has four layers: the input layer, pattern layer, summation layer and output layer. The first layer holds the input parameters and is fully connected to the second layer, the pattern layer. The pattern units are connected to the summation layer. The spread parameter (s) of the transfer function is determined by trial and error (Cigizoglu and Alp 2004). The GRNN can approximate any arbitrary function between the input and output nodes. Like standard regression techniques, it is most useful for the estimation of continuous variables. It is based on a standard statistical technique called kernel regression (Cigizoglu 2005). The performance of the GRNN models was evaluated as a function of the spread parameter. The larger the spread parameter, the smoother the estimated function (Cigizoglu and Alp 2004). Therefore, different spread parameters were evaluated. More details on GRNN can be found in Cigizoglu (2005).
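The four-layer structure can be written compactly as kernel regression. The sketch below assumes a Gaussian kernel of the form exp(−d²/(2s²)), which is a common GRNN choice but is not stated in the text, and uses a synthetic quadratic data set. The name `grnn_predict` is hypothetical.

```python
import numpy as np

def grnn_predict(X_train, y_train, X_query, spread):
    """GRNN estimate as a kernel-weighted average of the training targets."""
    sq_dist = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    weights = np.exp(-sq_dist / (2.0 * spread ** 2))  # pattern layer
    numerator = weights @ y_train                     # summation layer
    denominator = weights.sum(axis=1)                 # summation layer
    return numerator / denominator                    # output layer

# Synthetic example: the stored patterns are the whole "training".
X_train = np.linspace(0.0, 1.0, 11)[:, None]
y_train = X_train[:, 0] ** 2

narrow = grnn_predict(X_train, y_train, X_train, spread=0.01)
wide = grnn_predict(X_train, y_train, X_train, spread=100.0)
print("narrow spread:", np.round(narrow, 3))
print("wide spread:  ", np.round(wide, 3))
```

A very small spread reproduces the stored targets almost exactly, while a very large spread flattens the estimate toward the mean of the targets, which illustrates why the spread has to be tuned by trial and error.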

3.4 Discrete wavelet transform (DWT)

The decomposition of data into periodic components reveals the dominant modes of variability (Coulibaly and Burn 2004). This can be done using the DWT. The wavelet transform is a powerful mathematical tool that provides a time–frequency representation of a signal analyzed in the time domain (Smith et al. 1998; Dabechies 1990).

Assuming a continuous time series x(t), t ∈ (−∞, ∞), a wavelet function ψ(η) that depends on a non-dimensional time parameter η can be written as

$$ \psi \left( \eta \right) = \psi \left( {\tau ,s} \right) = s^{ - 1/2} \psi \left( {\frac{t - \tau }{s}} \right) $$
(2)

where t stands for time; τ for the time step in which the window function is iterated; and s ∈ (0, ∞) for the wavelet scale. ψ(η) must have zero mean and be localized in both time and Fourier space (Meyer 1993).

Computing the wavelet coefficients at every possible scale is a fair amount of work, and it generates a lot of data. If scales and positions are chosen based on powers of two (dyadic scales and positions), the analysis is much more efficient and just as accurate. This transform is called the DWT and has the form

$$ \psi_{m,n} \left( {\frac{t - \tau }{s}} \right) = s_{0}^{ - m/2} \psi \left( {\frac{{t - n\tau_{0} s_{0}^{m} }}{{s_{0}^{m} }}} \right) $$
(3)

where m and n are integers that control the wavelet dilation (scale) and the translation (time), respectively; \( s_{0} \) is a specified fixed dilation step greater than 1; and \( \tau_{0} \) is the location parameter and must be greater than zero. From this equation, it can be seen that the translation step, \( n\tau_{0} s_{0}^{m} \), depends on the dilation, \( s_{0}^{m} \). The most common (and simplest) choice for the parameters \( s_{0} \) and \( \tau_{0} \) is 2 and 1 (time steps), respectively. This power-of-two logarithmic scaling of the translations and dilations is known as the dyadic grid arrangement and is the simplest and most efficient case for practical purposes (Mallat 1989). For a discrete time series \( x_{i} \), where \( x_{i} \) occurs at discrete time i (i.e., integer time steps are used here), the DWT becomes

$$ W_{m,n} = 2^{ - m/2} \sum\limits_{i = 0}^{N - 1} {x_{i} \,\psi \left( {2^{ - m} i - n} \right)} $$
(4)

where \( W_{m,n} \) is the wavelet coefficient for the discrete wavelet of scale \( s = 2^{m} \) and location \( \tau = 2^{m} n \). The Haar wavelet was selected as the mother wavelet in this study. It is one of the most suitable mother wavelets for hydrological forecasting applications (Belayneh et al. 2014).
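For the Haar wavelet, the dyadic decomposition can be sketched with nothing more than block averages, because the Haar approximation at level m is the mean of the signal over consecutive blocks of \( 2^{m} \) samples. The sketch below returns detail series of the same length as the input, so that the original signal equals the final approximation plus the sum of the details. It is an illustration under stated assumptions, not the study's MATLAB procedure: the function name `haar_mra` is hypothetical, the series length must be a multiple of \( 2^{levels} \) (a real 5479-day record would need padding or a library routine), and the test signal is synthetic.

```python
import numpy as np

def haar_mra(x, levels):
    """Additive Haar decomposition: x = approximation + D1 + ... + D_levels.

    Every returned series has the same length as x.
    len(x) must be a multiple of 2**levels.
    """
    x = np.asarray(x, dtype=float)
    if len(x) % (2 ** levels) != 0:
        raise ValueError("len(x) must be a multiple of 2**levels")
    details = []
    approx = x
    for m in range(1, levels + 1):
        block = 2 ** m
        # Haar approximation at level m: mean over dyadic blocks of 2**m samples.
        coarser = np.repeat(x.reshape(-1, block).mean(axis=1), block)
        # Detail at level m: what is lost when moving to the coarser level.
        details.append(approx - coarser)
        approx = coarser
    return approx, details

# Synthetic series of 1024 "days" with an 8-day cycle plus noise.
rng = np.random.default_rng(0)
t = np.arange(1024)
signal = np.sin(2.0 * np.pi * t / 8.0) + 0.2 * rng.standard_normal(t.size)

approx, details = haar_mra(signal, levels=10)
energy = [float(np.mean(d ** 2)) for d in details]
print("level with the largest detail energy: D%d" % (int(np.argmax(energy)) + 1))
```

In this convention the detail \( D_{m} \) carries the variability at the scale of \( 2^{m} \) time steps, which matches the way the text associates D8 with the 256-day (nearly annual) mode.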

3.5 The global wavelet spectrum

Considering a vertical slice through a wavelet plot as a measure of the local spectrum, the wavelet spectrum averaged over time, that is, over all the local wavelet spectra, is expressed as

$$ \overline{{W^{2} }} \left( s \right) = \frac{1}{T}\sum\limits_{t = 0}^{T - 1} {\left| {W_{t} \left( s \right)} \right|}^{2} $$
(5)

where T is the number of points in the time series. The time-averaged wavelet spectrum is generally called the GWS (Torrence and Compo 1997). The smoothed Fourier spectrum approaches the GWS when the amount of necessary smoothing decreases with increasing scale. Hence, the GWS provides an unbiased and consistent estimate of the true power spectrum and is a useful tool for the analysis of non-stationary time series. The global spectrum is compatible with a power (Fourier) spectrum: whereas spectral components are ordered by frequency in a power spectrum, periodic components are ordered by period scale in a GWS. A global spectrum is computed via the continuous spectrum; therefore, the initial and final times of the periodic components can also be determined.
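The time average in the GWS definition is a one-line reduction once the wavelet coefficients are arranged as an array with one row per scale and one column per time step. The sketch below applies that average to a small made-up coefficient array. The array values and the function name `global_wavelet_spectrum` are hypothetical and serve only to show the arithmetic.

```python
import numpy as np

def global_wavelet_spectrum(W):
    """Time average of |W_t(s)|^2 for each scale; W has shape (n_scales, T)."""
    W = np.asarray(W)
    return np.mean(np.abs(W) ** 2, axis=1)

# Hypothetical coefficients: 3 scales (rows) by 4 time steps (columns).
W = np.array([[1.0, -1.0, 1.0, -1.0],
              [2.0, 0.0, -2.0, 0.0],
              [0.5, 0.5, 0.5, 0.5]])

gws = global_wavelet_spectrum(W)
dominant_scale_index = int(np.argmax(gws))
print(gws, dominant_scale_index)
```

The scale with the largest time-averaged power is the dominant periodicity in the sense used in the following section.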

4 Wavelet decomposition and global wavelet spectrum of the time series

The DWT provides decomposed components at determined scales, which makes it possible to study the components at different scales or periods. For this purpose, the time series is decomposed into an approximation and a series of details (D) following Mallat’s algorithm. The process consists of a number of successive filtering steps. The original signal is first decomposed into an approximation and an accompanying detail. The decomposition process is then iterated, with successive approximations being decomposed in turn, so that the original signal is broken down into many lower-resolution components (Mallat 1989). In this way, the wavelet coefficients of the meteorological data were obtained by the DWT. The time series were decomposed into an approximation and ten detail components. The decomposed wavelet components for the Balıkesir precipitation data are presented in Fig. 2. The decomposed components show variations at different periodic scales. For example, the D8 component shows the nearly annual mode of the daily precipitation series. The observed extreme precipitation events (especially in 1990) are clearly seen in the periodic components.

Fig. 2 The wavelet component series of the Balıkesir precipitation data

Figure 3 presents the GWS of the Balıkesir meteorological data. The GWS provides useful information for the selection of the model inputs and shows the dominant D components, which helps to determine the effective periodic components. The annual periodicities (256–512 days) of the meteorological variables other than precipitation have strong magnitudes (Fig. 3). In addition, short-term periodicities, such as the 2-, 4-, 8- and 16-day modes, can be seen clearly for the mean and minimum temperatures in Fig. 3. The precipitation series shows high magnitudes at the annual and shorter periods. That is, according to the GWS, the annual mode (corresponding to D8) is significant for all the meteorological data, while both the annual and shorter periods are significant for the precipitation data. The GWS presents the general periodic structure of the data and thereby gives information about the physical structure of the observed data, which helps in selecting the D components for the model inputs. However, the GWS alone is not enough to determine the effective components. For this reason, the correlations between the components and the observed precipitation were computed and are presented in Table 3 for the Balıkesir station. For temperature and evaporation, the correlation with D8, which is nearly the annual component, has the highest magnitude. This indicates that the dominant annual periodicity of temperature and evaporation is the characteristic that most influences precipitation. In addition, the D7 components show slightly higher correlations than the other D components. On the other hand, the correlations between the D2 and D3 components (4- and 8-day modes) of the humidity and the observed precipitation are higher than those of the annual components. This points to dominant short-term periodicities in the relative humidity data. The correlations between the wavelet series of the 1-day preceding precipitation and the observed precipitation are presented in Table 4.
The results show that the correlations for D2–D8 are higher than those for the remaining D components. In the present study, the correlations at long-term periods such as the 512- and 1024-day modes are insignificant. The correlations determined here provide information for identifying the wavelet components that are effective in precipitation prediction and for selecting the model inputs. The results of the correlation analysis are consistent with those of the GWS. The suitable components were selected as model inputs according to the correlation analysis. The number of selected components actually depends on the user’s preference; however, setting a limit correlation value is quite helpful for this purpose. The limit correlation value for selecting the dominant D components was taken as 0.10. As a result, D1, D7 and D8 for the mean temperature; D2, D7 and D8 for the maximum temperature; D8 for the minimum temperature; D7 and D8 for the evaporation; and D2–D4, D7 and D8 for the relative humidity were selected as the input components of the prediction model at the Balıkesir station (Table 3). D2–D8 were selected for the precipitation of the previous day (Table 4). Instead of using each D component individually, it is more convenient and useful to use the summed D components. Using the wavelet components separately as inputs is not optimal for a black-box estimation model such as a neural network. Therefore, new summed series, obtained by adding the selected D components together, were used as inputs for the hybrid models. This step has the most significant effect on the prediction performance of the network. The selected wavelet components are presented in Tables 3 and 4, where the components with correlations higher than the limit value are shown in bold.
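The selection rule described above (keep the D components whose correlation with the observed precipitation reaches the limit value of 0.10, then add them together) can be sketched as follows. Comparing the absolute value of the correlation with the limit is an assumption, made because the text speaks of the magnitude of the correlations. The function name `select_and_sum` and the three synthetic component series are hypothetical.

```python
import numpy as np

def select_and_sum(details, target, threshold=0.10):
    """Sum the detail series whose |correlation| with the target
    reaches the threshold.

    details: dict mapping a label such as "D8" to a 1-D array.
    Returns (selected_labels, summed_series).
    """
    target = np.asarray(target, dtype=float)
    selected = [
        name for name, d in details.items()
        if abs(np.corrcoef(d, target)[0, 1]) >= threshold
    ]
    summed = np.zeros_like(target)
    for name in selected:
        summed = summed + details[name]
    return selected, summed

# Synthetic components: D1 follows the target, D2 is uncorrelated with it,
# and D3 is negatively correlated with it.
t = np.arange(64)
base = np.sin(2.0 * np.pi * t / 8.0)
components = {
    "D1": base,
    "D2": np.cos(2.0 * np.pi * t / 8.0),
    "D3": -0.5 * base,
}
selected, summed = select_and_sum(components, target=base)
print(selected)
```

The summed series, rather than the individual components, would then be passed to the network as a single input for that variable.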

Fig. 3 Global wavelet spectra of the Balıkesir meteorological data

Table 3 The correlation coefficients between the periodic components and the original precipitation data for the Balıkesir station
Table 4 Correlations between the D series of the precipitation at time t − i (i = 1, 2, …, 7 days) and the observed precipitation at time t for the Balıkesir station

5 The results of the hybrid models

The WNN hybrid model aims to estimate daily precipitation using the wavelet components of the meteorological series. In the previous section, the meteorological series were decomposed by the DWT into various periodicities, and the suitable components were determined as input nodes for the prediction model. The new summed series were employed, instead of the original data, as inputs of the wavelet networks.

The input and output data are divided into two parts, the training and the testing periods. The first 4383 values (1.1.1987–31.12.1998) were used for training the ANN models. The last 1096 values (1.1.1999–31.12.2001) were used for testing. Before application, the selected input data were normalized to the range [0, 1] using their extreme values. The best spread parameter of each RBF and GRNN simulation was found by trial and error. MATLAB codes were written for the three ANN methods (FFBP, RBF, GRNN).
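The split sizes and the normalization step can be checked with a few lines of NumPy. This sketch is not the MATLAB code mentioned above. It builds the daily calendar for the study period, separates the training and testing days by date, and rescales each input column to [0, 1] with its minimum and maximum. Taking the extremes over the whole record is an assumption about what "extreme values" means here, and the input array is random placeholder data.

```python
import numpy as np

# Daily calendar from 1 January 1987 to 31 December 2001 (end is exclusive).
dates = np.arange("1987-01-01", "2002-01-01", dtype="datetime64[D]")
is_train = dates < np.datetime64("1999-01-01")

# Placeholder input matrix with three columns (not the study's data).
rng = np.random.default_rng(0)
inputs = rng.gamma(shape=2.0, scale=3.0, size=(dates.size, 3))

# Min-max normalization of each column to [0, 1] by its extreme values.
lo = inputs.min(axis=0)
hi = inputs.max(axis=0)
scaled = (inputs - lo) / (hi - lo)

train, test = scaled[is_train], scaled[~is_train]
print(dates.size, train.shape[0], test.shape[0])
```

The calendar reproduces the counts given in the text: 5479 days in total, 4383 for training and 1096 for testing.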

The wavelet-neural network structures providing the best performance criteria values for different input combinations in terms of MSE and \( R^{2} \) are presented in Tables 5, 6 and 7. For the Balıkesir station, the wavelet-FFBP structure (12,5,1), consisting of 12 input nodes (\( T_{mean} \), \( T_{max} \), \( T_{min} \), E, H, \( H_{t-1} \), \( P_{t-1} \), \( P_{t-2} \), \( P_{t-3} \), \( P_{t-4} \), \( P_{t-5} \), \( P_{t-6} \)) and five hidden nodes, has the best performance criteria (MSE = 5.91 mm²; \( R^{2} \) = 0.762). The wavelet-RBF model (12,1, s = 0.85) with the same 12 inputs showed the best RBF performance (\( R^{2} \) = 0.603; MSE = 9.86 mm²) for this station; here the best spread parameter was 0.85. While the \( R^{2} \) value obtained by the wavelet-RBF method is 0.603, the wavelet-FFBP model increases it to 0.762 for this station. For the wavelet-GRNN model, the determination coefficient was found to be 0.494 (model structure: 12,1, s = 0.08) at the Balıkesir station. Generally, the GRNN models have the lowest \( R^{2} \) and the highest mean square error values. For instance, at the Muğla station the wavelet-GRNN model gives an \( R^{2} \) of 0.571, while the wavelet-RBF model gives 0.587. Table 5 shows that the best wavelet-FFBP model in terms of the performance criteria was found at the Siirt station (MSE = 2.40 mm²; \( R^{2} \) = 0.896). At the same station, the wavelet-GRNN model gives an \( R^{2} \) of 0.512 and the wavelet-RBF model 0.740. The wavelet-FFBP model thus clearly shows the best performance in terms of the evaluation criteria for precipitation prediction, although it has some drawbacks. The success of the FFBP algorithm depends on the proper setting of the learning rate in the back propagation process.
As a consequence of the non-linearity in the operation of the back propagation, the feed forward neural network can deal successfully with complex, undefined relations between the inputs and the output. On the other hand, the RBF and GRNN techniques learn in one pass through the data and can generalize from examples as soon as they are stored (Cigizoglu 2005). It is worth noting that the models with all of the meteorological variables in the input layer provided better performance.

Table 5 The wavelet-FFBP network structures in terms of the best performance criteria
Table 6 The wavelet-RBF network structures in terms of the best performance criteria
Table 7 The wavelet-GRNN network structures in terms of the best performance criteria

The results were also compared with the multiple linear regression (MLR) model and the conventional ANN methods. The results for the testing stage are summarized in Table 8 in terms of MSE and \( R^{2} \). Table 8 indicates that the WNN model performs much better than the conventional ANN and the MLR models according to these performance criteria. The \( R^{2} \) values computed by the ANN method are in the range 0.2–0.4, whereas the WNN model with the wavelet components as inputs provided noticeably higher values in the range 0.6–0.9. That is, prediction with the new summed series has a significantly positive effect on model performance.
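The two performance criteria used in the comparison can be written out explicitly. The text does not state which definition of the determination coefficient was used. The sketch below assumes \( R^{2} = 1 - SSE/SST \), where SSE is the sum of squared errors and SST is the sum of squared deviations of the observations from their mean. The four-value example is made up for the arithmetic and does not reproduce any table entry.

```python
import numpy as np

def mse(obs, pred):
    """Mean square error between observed and predicted values."""
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    return float(np.mean((obs - pred) ** 2))

def r_squared(obs, pred):
    """Determination coefficient, assumed here to be 1 - SSE/SST."""
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    sse = np.sum((obs - pred) ** 2)
    sst = np.sum((obs - obs.mean()) ** 2)
    return float(1.0 - sse / sst)

# Hypothetical observed and predicted daily totals in mm.
observed = [0.0, 2.0, 4.0, 6.0]
predicted = [1.0, 2.0, 4.0, 5.0]
print(mse(observed, predicted), r_squared(observed, predicted))
```

For this example the squared errors are 1, 0, 0 and 1, so the MSE is 0.5 mm² and, with SST = 20, the determination coefficient is 0.9.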

Table 8 The results of the conventional ANN methods and the MLR in terms of the best performance criteria for the testing periods

Figures 4, 5 and 6 present the hydrographs and scatter plots for the Balıkesir and Siirt stations, which lie in the western and southeastern parts of Turkey, respectively. The model estimates approximate the general behavior of the observed data. For hydrologists, forecasting extreme precipitation is important because it is the main cause of floods. The dry days and the extremes in the testing period were estimated satisfactorily by the wavelet-FFBP. The performance of the wavelet-FFBP model in estimating the extreme values is significantly superior to that of the wavelet-RBF and the wavelet-GRNN models. On the other hand, the MLR forecasts do not approximate the general behavior of the observed data, and the MLR model could not estimate the extreme precipitation closely (Fig. 7).

Fig. 4 Daily precipitation estimations by the wavelet-FFBP network model for the Balıkesir station (a) and the Siirt station (b) for the testing period

Fig. 5 Daily precipitation estimations by the wavelet-RBF network model for the Balıkesir station (a) and the Siirt station (b) for the testing period

Fig. 6 Daily precipitation estimations by the wavelet-GRNN model for the Balıkesir station (a) and the Siirt station (b) for the testing period

Fig. 7 Daily precipitation estimations by the MLR model for the Balıkesir station (a) and the Siirt station (b) for the testing period

6 Conclusion

The aim of this paper was to compare the estimation performance of wavelet-ANN models. First, the meteorological series were decomposed into periodic series by the DWT. Then, the GWS and the correlations between the wavelet components and the observed precipitation were evaluated as criteria for selecting appropriate components. The GWS provides some knowledge about the physics of the data, such as the periodicities and magnitudes of the time series. This study therefore offers a new view in the literature on the contribution of the physics of the data to the ANN structure. Generally, the correlations at the short-term modes (2, 4, 8 and 16 days) and at the annual mode (256 days) are significant, whereas the correlations at long-term periods such as the 512- and 1024-day modes are insignificant. Next, the new summed series obtained by adding the selected wavelet components were employed as inputs of the hybrid models. The results clearly show that the wavelet-FFBP model performed best in terms of the determination coefficient and also in the estimation of extreme precipitation. The prediction ability of the WNN models was also tested and compared with the ANN and MLR models. The \( R^{2} \) values obtained by the ANN are within the interval 0.2–0.4, whereas the WNN models provided values in the range 0.6–0.8. That is, using the wavelet series improves the estimation ability. The MLR models have the lowest \( R^{2} \) (within the interval 0.1–0.2) and the highest mean square error values. Finally, the results clearly show that the FFBP algorithm performs better than the RBF and GRNN algorithms for daily precipitation prediction.

In the present study, the GWS of the observed data contributed considerably to the understanding of the structure of the physical process to be modeled. With wavelets, some properties of the decomposed series, such as its daily, monthly and annual periods, can be seen more clearly than in the original signal. The results show that the wavelet feed forward neural network is more efficient and accurate than the wavelet radial basis neural network and the wavelet GRNN for precipitation prediction. The wavelet-FFBP model is therefore more suitable for practical applications and can guide the design of WNNs.