1 Introduction

The prediction of monsoon rainfall has long been a major focus of scientific research. Long-range forecasting of rainfall requires sustained effort and long-term planning using a variety of approaches. Most of these approaches use either a dynamical prediction model or statistical models. A dynamical model, being deterministic, does not require information about a particular situation beyond the initial and boundary conditions. However, in spite of considerable progress in dynamical modeling and numerical weather prediction, the prediction of precipitation patterns by dynamical methods is still much below the desired level of accuracy.

In the statistical method, one has neither a dynamical relation between the cause and effect of the system nor a definite notion of the relative roles of the various processes that govern a phenomenon. The performance of statistical prediction is often poor, especially when the irregular behavior of the observed data is a result of low-dimensional chaos [1]. Nevertheless, statistical prediction is an active area of research, and the use of non-linear techniques promises new developments. Many studies on long-range prediction of the monsoon are aimed at finding suitable predictor parameters. Because these studies use limited data sets and treat the monsoon rainfall time series essentially as a stochastic process, the resulting forecast equations are regression relations with one or more predictors. The year 2002 turned out to be an all-India severe drought year, and the entire monsoon season had many intriguing features [2], especially the fact that in July the country received only half of the month’s normal rainfall. This situation was not foreseen by any operational statistical or dynamical model [2].

In recent years, another modeling technique, known as artificial neural networks (ANN), has become popular as an alternative for modeling and predicting complicated time series of weather and climate variables. In this paper, ANNs with different architectures are used to model the problem, and the results, comparisons and predictions are discussed. The comparisons also indicate the effectiveness and reliability of the present model. Moreover, the proposed methodology successfully predicted the drought of 2002, which was not predicted by other investigators.

2 Neural network and design of the model

A neural network is a parallel, distributed information processing structure consisting of processing elements called neurons, which are interconnected by unidirectional signal channels called connections. Each processing element branches into as many output connections as desired, and these carry the neuron output signal. The neuron output signal can be of any mathematical type desired. In other words, it is a mathematical representation of a biological neural network. In terms of implementation, it is basically a coupled input–output map constructed through an iterative procedure. Recently, Goswami and Srividya [3] and Goswami and Kumar [4] used a composite-neuron (CN) network for predicting rainfall patterns. They used a generalised structure, viz. a composite network, and compared the performance of the CN with that of a conventional NN. In the present study, the network has been trained using the error back propagation algorithm [7] with different numbers of hidden layers. Four network configurations, designated N1, N2, N3 and N4, with different architectural parameters have been considered. These architectures are described in Table 1. After exhaustive simulation, the authors identified the architecture that gives better training and prediction than the results reported earlier by other researchers.

Table 1 Description of the four network configurations
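
As a rough illustration of the "coupled input–output map" described above, the following sketch propagates one input vector through a small feed-forward network. The layer sizes, the logistic activation and the helper names (`init_network`, `forward`) are assumptions made for this example only; they are not the N1–N4 configurations of Table 1.

```python
import numpy as np

def sigmoid(x):
    """Logistic activation; an assumed choice, the text does not fix one."""
    return 1.0 / (1.0 + np.exp(-x))

def init_network(layer_sizes, seed=0):
    """Create small random weights and zero biases for each pair of consecutive layers."""
    rng = np.random.default_rng(seed)
    return [
        (rng.uniform(-0.5, 0.5, size=(n_in, n_out)), np.zeros(n_out))
        for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
    ]

def forward(network, x):
    """Propagate one input vector through every layer and return the output signal."""
    activation = np.asarray(x, dtype=float)
    for weights, bias in network:
        activation = sigmoid(activation @ weights + bias)
    return activation

# Hypothetical layout: 5 inputs, two hidden layers of 4 and 3 neurons, 1 output.
network = init_network([5, 4, 3, 1])
output = forward(network, [0.2, 0.4, 0.1, 0.9, 0.5])
```

Changing the list passed to `init_network` changes the number of hidden layers and neurons, which is the kind of variation that distinguishes one configuration from another.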

Before training the network, we need a measure of how close the network output has come to the desired value. This measure is the distance between the actual and the desired response; it serves as an error measure or error tolerance and is used to correct the network parameters externally. Since supervised training is considered here, the desired value is known for the given training set. For the error back propagation training algorithm, the error measure used is known as the root mean square error (RMSE). Any continuously differentiable error function can be used, but the choice of another error function adds complexity and should be approached with caution. The error for a single training vector, on which the RMSE is based, is defined as follows:

$$ E_{p} = \frac{1} {2}{\sum\limits_{j = 1}^N {(d_{{pj}} - o_{{pj}} )^{2} } } $$

where $E_{p}$ is the error for the $p$th training vector, $d_{pj}$ is the desired value for the $j$th output neuron (i.e. the training set value), $o_{pj}$ is the calculated output of the $j$th output neuron and $N$ is the number of output neurons.
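
The error measure above can be written as a short function. This is a minimal sketch: `pattern_error` follows the formula for a single training vector, while `rmse` is one common way of turning squared errors into a root mean square value and is an assumption about how the stopping tolerance might be evaluated. The numbers are illustrative and are not taken from the rainfall data.

```python
import numpy as np

def pattern_error(desired, output):
    """E_p = 1/2 * sum_j (d_pj - o_pj)^2 for one training vector."""
    desired = np.asarray(desired, dtype=float)
    output = np.asarray(output, dtype=float)
    return 0.5 * np.sum((desired - output) ** 2)

def rmse(desired, output):
    """Root mean square error over all supplied desired/output values."""
    desired = np.asarray(desired, dtype=float)
    output = np.asarray(output, dtype=float)
    return np.sqrt(np.mean((desired - output) ** 2))

# Illustrative values only (not from the paper's data).
d = [0.8, 0.3]
o = [0.6, 0.5]
e_p = pattern_error(d, o)  # 0.5 * (0.2**2 + 0.2**2)
e_rms = rmse(d, o)         # sqrt of the mean squared difference
```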

3 Data set

The data set used in the four configurations for training and prediction covers 38 years (1958–1995) and is taken from Rajeevan et al. [5]; the inputs are the previous years’ rainfall. This data set is used to predict the rainfall 6 years in advance. The five categories of rainfall defined in Rajeevan et al. [5] are considered here and are given in Table 2 for the sake of completeness.
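
One plausible way to turn a yearly series into training pairs, when the inputs are the previous years' rainfall, is a sliding window. The sketch below assumes a window of five years and uses a synthetic placeholder series; neither the window length nor the values come from the paper, and `make_windows` is a hypothetical helper.

```python
import numpy as np

def make_windows(series, n_lags):
    """Pair each run of n_lags consecutive values with the value that follows it."""
    series = np.asarray(series, dtype=float)
    inputs = np.array(
        [series[i:i + n_lags] for i in range(len(series) - n_lags)]
    )
    targets = series[n_lags:]
    return inputs, targets

# Placeholder series of 38 values standing in for 1958-1995; NOT real rainfall data.
series = np.linspace(90.0, 110.0, 38)
X, y = make_windows(series, n_lags=5)
```

With 38 values and a window of five, this gives 33 input vectors, each paired with the rainfall of the following year.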

Table 2 Category of monsoon rainfall for forecast [5]

4 Training and testing

Training is the way a neural network learns. Training may be supervised or unsupervised. In this study supervised training has been used, in which the network is provided with the desired response. In the training process, the input data are first presented to the network; the network then adjusts the weights of the neurons so as to predict the next point in the input data with the desired accuracy. When this procedure is carried out with a large sample of the input data (the training set), the neural network ‘learns’ the relationship between input and output data, and the trained network can then be used to make a prediction for a point immediately outside the training set. This process of predicting a point outside the training set is called testing.

The learning parameter (β) and momentum (α) used in the four models are 0.1 and 0.4, respectively; these values were fixed after a number of simulations. In practice, momentum is added to speed up convergence. The momentum constant, α, has the effect of smoothing the error surface in weight space by filtering out high-frequency variations.
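
The role of β and α can be seen in the usual gradient-descent-with-momentum update, in which each weight change is the negative gradient scaled by β plus α times the previous change. The sketch below applies it to a single linear neuron and one training pattern; this toy problem and the function name `momentum_step` are assumptions for illustration, not the networks used in the study.

```python
import numpy as np

BETA = 0.1   # learning parameter quoted in the text
ALPHA = 0.4  # momentum constant quoted in the text

def momentum_step(weights, gradient, previous_delta, beta=BETA, alpha=ALPHA):
    """Return the updated weights and the weight change that was applied."""
    delta = -beta * gradient + alpha * previous_delta
    return weights + delta, delta

# Toy problem (assumed): fit one linear neuron o = w . x to a single pattern
# by minimising E_p = 1/2 * (d - o)^2.
x = np.array([0.5, -0.2, 0.1])
d = 0.3
w = np.zeros(3)
delta = np.zeros(3)
errors = []
for _ in range(200):
    o = w @ x
    errors.append(0.5 * (d - o) ** 2)
    gradient = -(d - o) * x  # dE_p/dw for the linear neuron
    w, delta = momentum_step(w, gradient, delta)
```

Because part of the previous step is carried over, successive updates that point in the same direction reinforce each other, while updates that alternate in sign partly cancel.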

5 Percentage error and evaluation parameter

In the present models, the percentage error $E_{r}$ in prediction and an evaluation parameter denoted by $\sigma_{r}$ are evaluated. For a good prediction $\sigma_{r} < 1$, i.e. the error in prediction should be less than the standard deviation of the training data. In all four models, $\sigma_{r}$ comes out to be less than 1. The percentage error is defined as follows [3]:

$$ {E} _{r} = {v_{{\text{o}}} - v_{{\text{p}}}} $$

and the evaluation parameter is defined as in Goswami and Srividya [3]

$$ \sigma _{r} = \frac{{|v_{{\text{o}}} - v_{{\text{p}}} |}} {{\sigma _{{\text{o}}} }}, $$

where $v_{\text{p}}$ and $v_{\text{o}}$ are the predicted and actual values, respectively, and $\sigma_{\text{o}}$ is the standard deviation of the training data set.
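
As a worked example, the two quantities can be computed for 2002 from the figures quoted in Sect. 6: actual rainfall 82%, the present model's forecast 81.61%, the forecast of Rajeevan et al. [5] 97%, and a training-set standard deviation of 11.3004. The function names are hypothetical, and because the inputs are the rounded percentages given in the text, the values need not match Table 5 exactly.

```python
def percentage_error(actual, predicted):
    """E_r = v_o - v_p (actual minus predicted, in percent)."""
    return actual - predicted

def evaluation_parameter(actual, predicted, sigma_train):
    """sigma_r = |v_o - v_p| / sigma_o, with sigma_o from the training set."""
    return abs(actual - predicted) / sigma_train

SIGMA_TRAIN = 11.3004  # standard deviation of the training set, as quoted in Sect. 6

# 2002 values quoted in Sect. 6.
e_model = percentage_error(82.0, 81.61)                   # about 0.39
s_model = evaluation_parameter(82.0, 81.61, SIGMA_TRAIN)  # about 0.035
s_ref = evaluation_parameter(82.0, 97.0, SIGMA_TRAIN)     # about 1.33
```

On these figures the present model's forecast satisfies the criterion of an evaluation parameter below 1, whereas the 97% forecast does not.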

6 Results and conclusions

The data set was trained using a variety of network configurations, of which only four, viz. N1, N2, N3 and N4, are reported here; the corresponding results after training are given in Table 3. The four panels in Fig. 1 show, for the period 1958–1995, the rainfall and the percentage error in rainfall during learning for the four configurations N1–N4. The second column of Table 3 gives the actual percentage rainfall. The last column shows the results of Rajeevan et al. [5] from a statistical model, viz. the 16-parameter model, for the 38 years (1958–1995); they also compared two other models, viz. the 8- and 10-parameter models. The last row of Table 3 gives the correlation coefficients (r) between actual and predicted values for the four networks, along with those of Rajeevan et al. [5].
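
The correlation coefficient r between actual and predicted values is taken here to be the usual Pearson coefficient, which is an assumption since the text does not define it. A minimal sketch follows; the percentages are made up for illustration and are not values from Table 3.

```python
import numpy as np

def correlation_coefficient(actual, predicted):
    """Pearson correlation between two equally long sequences."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    a = actual - actual.mean()
    p = predicted - predicted.mean()
    return float(np.sum(a * p) / np.sqrt(np.sum(a ** 2) * np.sum(p ** 2)))

# Made-up percentages for illustration; these are NOT values from Table 3.
actual = [100.0, 92.0, 105.0, 88.0, 97.0]
predicted = [98.0, 94.0, 103.0, 90.0, 99.0]
r = correlation_coefficient(actual, predicted)
```

The same value is returned by `np.corrcoef(actual, predicted)[0, 1]`, which can serve as a cross-check.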

Table 3 Comparison of performance of N1, N2, N3 and N4 network models with Rajeevan et al. [5] (16 parameter model)
Fig. 1 Comparison of performance in learning of the four configurations N1–N4

Training with each of the four network configurations is stopped as soon as the network satisfies the given RMSE value. The weights are then stored and used for prediction. The predicted results from all four network configurations are compared with the desired results. When the desired and predicted results outside the training data are compared, the best result is given by network configuration N2. Table 4 compares the actual results and the errors of the forecasts made by the four network configurations with those of Rajeevan et al. [5], Goswami and Srividya [3], Goswami and Kumar [4] (published as a scientific correspondence) and Report [6]. The last row of Table 4 gives the correlation coefficient (r) between actual and predicted values for the four models and compares it with the models of Rajeevan et al. [5], Goswami and Srividya [3] and Goswami and Kumar [4]. Among the four models, the best result is given by the N2 model. The forecasts of the four network configurations and of Rajeevan et al. [5], Goswami and Srividya [3] and Goswami and Kumar [4] are compared in terms of error percentage (error percentage = actual rainfall % − forecast %).

It is interesting to note that the actual rainfall in 2002 was 82% (drought). However, Rajeevan et al. [5] predicted the rainfall for this year as 97% (near normal), and Goswami and Srividya [3] and Goswami and Kumar [4] predicted 99% (near normal). The present study predicted the rainfall for 2002 as 81.61%, which is very close to the actual value (82%). Further evidence of the reliability of the present model is seen for the year 2000. In this year the actual rainfall was 92%, but Rajeevan et al. [5] and Goswami and Srividya [3], Goswami and Kumar [4] predicted 98 and 89%, respectively. Again, the proposed model forecast the rainfall for 2000 as around 92%. Moreover, the Report [6] predicted rainfall for 1995 as 98%, whereas the result obtained by the present model is very close to the actual one. The results for the actual rainfall, the N2 model and Rajeevan et al. [5] are depicted in Fig. 2. The above discussion and results show the efficiency and reliability of the present model for predicting monsoon rainfall.

As discussed, the evaluation parameter ($\sigma_{r}$) indicates a good prediction if it is less than 1. The evaluation parameters for the years 1996–2002 (outside the training set) are therefore computed and given in Table 5, where the standard deviation of the training set is 11.3004. This table also gives the evaluation parameter for Rajeevan et al. [5]. The evaluation parameters indicate that the N2 model is better than the other models. The predicted results obtained from the four models are depicted and compared with the actual values and with Rajeevan et al. [5], Goswami and Srividya [3] and Goswami and Kumar [4] in Fig. 3. Table 6 presents the prediction of the rainfall 6 years in advance; this prediction is made using the N2 model, as it gives better results than the other three models. The predictions of the present ANN model can, however, be checked against future years’ observations, and the comparison will certainly be interesting.

Table 4 Comparison of the errors for the four (N1, N2, N3, N4) network configurations with Rajeevan et al. [5], Goswami and Srividya [3], Goswami and Kumar [4] and Report [6]
Fig. 2 Performance of N2 configuration and its comparison with the Rajeevan et al. [5] and actual results

Table 5 Comparison of evaluation parameter (σ r ) for the four (N1, N2, N3, N4) network configurations with Rajeevan et al. [5]
Fig. 3 Comparison of performance of N1, N2, N3 and N4 Models with actual, Rajeevan et al. [5], Goswami and Srividya [3], Goswami and Kumar [4] and Report [6]

Table 6 Six years prediction in advance