1 Introduction

Climate change is a matter of serious concern, and changing rainfall patterns are among the most important parameters affecting climate (Thomas et al. 2007). It poses a severe problem for today’s scientific community and deserves attention. In the Asian region, climate change has directly influenced streamflow volume and its temporal distribution, which has increased the stress on water resources (White et al. 2014; Lu et al. 2024). The distribution of rainfall is also influenced by the relief, or topography, of a country (Oettli and Camberlin 2005; Kushwaha et al. 2021; Li et al. 2020). The north-eastern region of India receives more than 500 mm of annual rainfall (Kripalani et al. 1996; Wu et al. 2022), while desert areas such as Rajasthan receive less than 10 cm (Singh et al. 1974). Rainfall pattern forecasting is therefore necessary for making informed decisions across various sectors. It enhances the ability of governments, communities, and industries to plan for and respond effectively to the dynamic and sometimes unpredictable nature of weather patterns, thereby reducing risks and promoting resilience (Zhang et al. 2021, 2023; Lin et al. 2023). Analyzing rainfall patterns at the state level is highly recommended for effective resource management, and it addresses several challenges associated with this task (Akhter et al. 2017; Liu et al. 2023; Zhou et al. 2023a, b). Understanding rainfall distribution patterns is thus a fundamental step in assessing and managing the risks associated with both floods and droughts (Tarhule 2005; Trenberth 2011; Zho et al. 2022).

Seasonal rainfall knowledge is especially crucial in the context of climate change, where shifts in precipitation patterns can exacerbate the frequency and intensity of both floods and droughts. In essence, understanding the pattern of rainfall distribution is fundamental for making informed decisions in various fields, ranging from water resource management and infrastructure design to climate change adaptation and agricultural planning (Ziervogel et al. 2010; Bhave et al. 2018; Markuna et al. 2023). Rainfall frequency analysis and stochastic modeling are powerful tools that enable researchers and professionals to quantify and predict the complex nature of rainfall patterns (Du and Wang 2014; Gao et al. 2018; Zhao et al. 2022). The rainfall distribution also affects evapotranspiration processes (Li 2014; Kumar et al. 2016; Kushwaha et al. 2021; Markuna et al. 2023) and groundwater storage, which may degrade groundwater quality and increase its remediation cost (Kumar et al. 2013; Asoka et al. 2017; Vishwakarma et al. 2018; Saroughi et al. 2023). Excess rainfall can cause floods, whereas deficient rainfall can cause drought. For this reason, accurate rainfall prediction can help in managing these hazards and in planning hydraulic structures to control floods and droughts.

In hydrological modeling, ANN techniques were first applied by French et al. (1992). Since then, several modeling approaches have been used successfully to improve models of rainfall forecasting (Ghamariadyan and Imteaz 2021c, b, a; Markuna et al. 2023), rainfall-runoff (Makwana and Tiwari 2014; Asadi et al. 2019; Dumka and Kumar 2021), streamflow (Chen et al. 2014; Makwana and Tiwari 2017; Shukla et al. 2021; Sammen et al. 2021; Vishwakarma et al. 2023b), sediment discharge (Nourani 2014; Li et al. 2022; Chauhan et al. 2022), water quality (Barzegar et al. 2018; Kumar et al. 2023), groundwater (Ch and Mathur 2012; Samantaray et al. 2022; Saroughi et al. 2023), and hydraulic conductivity (Sihag 2018; Singh et al. 2019, 2022). Many researchers have studied rainfall modeling (Zhang and Dong 2001; Tan et al. 2020; Ridwan et al. 2021; Khan et al. 2021). Wu et al. (2010) developed a model that forecasts rainfall on both a monthly and a daily basis. The study demonstrated the applicability of Moving Average (MA), Principal Component Analysis (PCA) and Singular Spectrum Analysis (SSA), and of forecasting models such as Linear Regression (LR), K-Nearest-Neighbors (K-NN), Artificial Neural Network (ANN) and Modular Artificial Neural Network (MANN), for modelling monthly rainfall. The results revealed that the SSA technique was better than the MA and PCA techniques, and that MANN gave good results for daily rainfall forecasting when coupled with SSA. Ridwan et al. (2021) applied four machine learning models, namely Bayesian Linear Regression (BLR), Boosted Decision Tree Regression (BDTR), Decision Forest Regression (DFR) and Neural Network Regression (NNR), for forecasting rainfall in Tasik Kenyir, Terengganu.
The BDTR model outperformed the others in terms of both the autocorrelation function and the projected error. Khan et al. (2021) performed a comparative study of single decision tree (SDT), tree boost (TB), decision tree forest (DTF), multilayer perceptron (MLP), and gene expression programming (GEP) for rainfall-runoff modelling in the Soan River basin, Pakistan. The study revealed that the maximal overlap discrete wavelet transformation (MODWT) based DTF model has high efficacy for rainfall modelling in the studied river basin. Smith et al. (1998) used the discrete wavelet transform to detect the characteristics and features of streamflow. The wavelet transform was applied to daily river discharge (Ahmadi et al. 2022; Pande et al. 2023). The daily river discharge was recorded for 91 rivers in the US. The results suggested that the wavelet transform method can be used to classify river flow into different hydro-climatic categories. Nakken (1999) applied wavelet theory to identify the temporal variation in rainfall and runoff and to develop a relationship between them. The Morlet wavelet was used for developing the relationship between rainfall and runoff with respect to time. A dominant frequency has been evident since the 1950s. Further, Krishna et al. (2011) used a wavelet neural network to develop a time series model for the daily flow of the Malaprabha River, Karnataka. The time series was decomposed into a number of sub-series using the discrete wavelet transform. Chen et al. (2014) developed a model for rainfall-runoff simulation by coupling copula entropy (CE) with ANN in the south-western part of China. In this study, the copula entropy technique was used to select the inputs of the ANN model, and three models (Multilayer Feed-Forward (MLF), Radial Basis Function and Linear Regression Neural Network) were used to forecast the streamflow.
The study showed a significant improvement in forecasting performance for the Jinsha River at the Pingshan gauge station when the inputs selected by the copula entropy (CE) method were used instead of those chosen by traditional linear correlation analysis. The MLF ANN model with the inputs selected by the CE method also gave the best results. Kang and Lin (2007) applied wavelet theory to water quality and hydrological signals for an agricultural watershed. Three signals, namely precipitation, well water level and streamflow, were considered over three periods: 15 years, a hydrologic year and 3 years. The results suggested that the wavelet transform is a useful tool for analyzing the temporal patterns of hydrologic and water quality signals at different temporal scales.

Santos and Freire (2012) analyzed rainfall data from 1901 to 2010 for the nine states of the northeast region of Brazil, using the wavelet transform and wavelet spectra to study the variability in monthly rainfall. The results suggested that a high concentration of rainfall was observed using the wavelet spectra technique. Chattopadhyay and Chattopadhyay (2010) developed a univariate model to forecast rainfall in India from data for 1871 to 1999 using Auto-Regressive Integrated Moving Average (ARIMA) and Auto-Regressive Neural Network (ARNN) models. The study forecast the summer monsoon, with the summer monsoon period taken as June to August. Li et al. (2013) examined the atmospheric moisture budget and the regulation of summer precipitation variation over the south-eastern United States during 1948–2007. The interannual variation in the region was explained using empirical orthogonal functions. Wavelet analysis identified an increase at the 2–4-year scale over 30 years. The atmospheric moisture budget showed an increasing trend in precipitation because of moisture transport. Roy et al. (2021) constructed and evaluated an integrated EO-ELM model [Equilibrium Optimizer (EO) and extreme learning machine (ELM)] and a deep neural network (DNN) for rainfall-runoff modelling at two stations in the UK, namely Glanteifi and Fal at Tregony. The results showed that EO-ELM and DNN can be applied efficiently to rainfall-runoff modelling. A similar attempt was made by Adnan et al. (2021), who compared the performance of ANFIS-PSO, ANFIS-FCM, MARS and M5Tree, together with Multi Model Simple Averaging (MM-SA), for rainfall-runoff modelling on an hourly basis. The results showed that the ML methods generally outperformed EBA4SUB and provided better accuracy than M5Tree and MARS in some cases.

To the authors’ knowledge, no study in the literature has applied a hybrid Wavelet-ANN (WANN) model for seasonal rainfall modelling at the stations in the southern part of Uttarakhand (namely Almora, Kashipur, Lansdown and Mukteswar) considered in the present research. Furthermore, the developed methodology, which models rainfall with a hybrid algorithm trained on surrounding stations, is in line with practical needs. Therefore, in the present study, a hybrid model using Wavelet-ANN techniques has been developed for predicting the seasonal rainfall in the southern part of Uttarakhand.

2 Material and Methods

2.1  Study Area

Uttarakhand is a hilly state of India consisting of 13 districts, with a geographical area of 53,483 km² and two subdivisions, i.e., the Kumaun and Garhwal regions. The state is located at 30.066° N latitude and 79.019° E longitude, with elevation ranging from 210 to 7817 m. Four stations in the southern districts of Uttarakhand, namely Almora, Mukteswar, Lansdown and Kashipur, have been selected for model development, and the study area is shown in Fig. 1.

Fig. 1 Study area map

Monthly rainfall data for the period 1901 to 2016 were obtained from the India Meteorological Department (IMD). Seasonal months were defined according to the IMD; the seasonal months and the details of data collection for the study area are given in Table 1 and Table 2, respectively. The basic formulae of the statistical parameters used in this study are listed in Table 3.

Table 1 Seasonal months according to IMD
Table 2 Detail of selected area of Uttarakhand, India
Table 3 Statistical parameters used in this study

where \(\overline{x }\) is the sample mean; n is the sample size; SD is the standard deviation; and \({x}_{i}\) is the ith observed value.
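Because Table 3 is not reproduced here, the following is only a minimal sketch of how the six statistics could be computed for one seasonal series. It assumes the usual sample formulas (SD with n − 1 in the denominator, CV as 100·SD/mean, and the adjusted Fisher–Pearson skewness); the function name `seasonal_stats` and these formula choices are illustrative assumptions, not taken from the study.

```python
import numpy as np


def seasonal_stats(x):
    """Basic statistics of one seasonal rainfall series (values in mm)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    mean = x.mean()
    sd = x.std(ddof=1)                      # sample standard deviation (assumed n - 1)
    cv = 100.0 * sd / mean                  # coefficient of variation in percent
    # Adjusted Fisher-Pearson coefficient of skewness (assumed definition)
    sc = n / ((n - 1) * (n - 2)) * np.sum(((x - mean) / sd) ** 3)
    return {"mean": mean, "max": x.max(), "min": x.min(),
            "SD": sd, "CV": cv, "SC": sc}
```

Calling `seasonal_stats` on each station's seasonal series would give one row of a summary table of the kind described above.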

2.2  Artificial Neural Network (ANN)

In general, a neural network transforms the original signals into meaningful signals. The concept of the ANN was introduced by McCulloch and Pitts in 1943 and is based on the biological nervous system (McCulloch and Pitts 1943). ANNs have been widely used in modeling and forecasting non-linear hydrologic series (Shukla et al. 2021; Elbeltagi et al. 2022a, b; Saroughi et al. 2023). The ANN works as a black box, which is why its application does not need any prior information about the underlying process (Tzeng and Ma 2005). It is a data-driven approach and offers a powerful solution for multifaceted systems. The main characteristic of the ANN is its capacity to learn quickly the relationship between the input and output signals (Zhang et al. 2017).

In the present study, a ‘Feed Forward Back Propagation’ (FFBP) network has been used to model the seasonal rainfall data. The FFBP network consisted of three layers, viz. an input layer, a hidden layer and an output layer, and was trained with the Levenberg–Marquardt algorithm. The training function (TRAINLM) and learning function (LEARNGDM) were used for calibrating the network on the input and output data. Furthermore, of the three available transfer functions, viz. log-sigmoid, tan-sigmoid and purelin, the tan-sigmoid (Tansig) function was used as the activation function.
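To make the network structure concrete, the sketch below fits a small three-layer feed-forward network with a tanh (tan-sigmoid) hidden layer by Levenberg–Marquardt least squares. It is an illustration only and not the toolbox implementation used in the study: the linear output layer, the random initialisation, the helper names (`unpack`, `forward`, `train_lm`) and the use of SciPy's `least_squares` with `method="lm"` are all assumptions. Note that this solver needs at least as many training samples as network parameters.

```python
import numpy as np
from scipy.optimize import least_squares


def unpack(theta, n_in, n_hid):
    """Split the flat parameter vector into layer weights and biases."""
    w1 = theta[: n_in * n_hid].reshape(n_in, n_hid)
    b1 = theta[n_in * n_hid: n_in * n_hid + n_hid]
    w2 = theta[n_in * n_hid + n_hid: n_in * n_hid + 2 * n_hid]
    b2 = theta[-1]
    return w1, b1, w2, b2


def forward(theta, X, n_hid):
    """Input layer -> tanh hidden layer -> single linear output (assumed)."""
    w1, b1, w2, b2 = unpack(theta, X.shape[1], n_hid)
    hidden = np.tanh(X @ w1 + b1)
    return hidden @ w2 + b2


def train_lm(X, y, n_hid, seed=0, max_nfev=5000):
    """Fit all weights by Levenberg-Marquardt on the residuals (pred - y)."""
    rng = np.random.default_rng(seed)
    n_params = X.shape[1] * n_hid + 2 * n_hid + 1
    theta0 = rng.normal(scale=0.5, size=n_params)   # hypothetical initialisation
    fit = least_squares(lambda th: forward(th, X, n_hid) - y,
                        theta0, method="lm", max_nfev=max_nfev)
    return fit.x
```

For a network with four inputs and nine hidden neurons this gives 4·9 + 9 + 9 + 1 = 55 adjustable parameters, which is one reason the hidden layer is kept small for short seasonal records.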

2.3  Wavelet Transform

Wavelet transform analysis is an advanced signal-processing technique that gained attention following its theoretical development by Grossmann and Morlet (1984). It is used as an alternative to the Fourier transform and is superior to classical spectral analysis because it allows different scales to be used for analysing temporal variations; its main advantage is that it does not require a stationary series (Smith et al. 1998). Thus, it is appropriate for analysing irregularly distributed events and time series that contain non-stationary power at many different frequencies. The formula of the discrete wavelet transform is given in Eq. (1).

$${\varphi }_{j}^{m}={a}_{0}^{-m/2}\,\varphi \left(\frac{y-j{a}_{0}^{m}{b}_{0}}{{a}_{0}^{m}}\right)$$
(1)

where \(\varphi\) is the mother wavelet, m is the scale variable, \({b}_{0}\) is the translation length, j is the position index and \({a}_{0}\) is the base dilation. Furthermore, the continuous form of the wavelet transform is described in Eq. (2).

$$WT\left(b,a\right)= \frac{1}{\sqrt{a}}\int g\left(\frac{t-b}{a}\right)s\left(t\right)\,dt$$
(2)

where a is the scaling factor; b is the translation in the time domain; s(t) is the signal at time t; and g(t) is the mother wavelet, i.e. the wavelet for a = 1 and b = 0.

Although the use of wavelets is not yet common in hydrology, the wavelet transform preserves both time and frequency localization: it transforms a one-dimensional time series into a two-dimensional time–frequency image, which gives the researcher information about the amplitude of any periodic signal within the series and about how this amplitude varies with time.

The rainfall time series for each station has been decomposed using wavelet analysis to obtain the approximate and detailed signals. Thus, using wavelet analysis, the following relation has been obtained:

$$Rainfall\;time\;series= f\left(Approximate\;Signal, Detailed\;signal\right)$$
(3)
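The decomposition behind Eq. (3) can be sketched with the Haar wavelet, the simplest case, in which each level replaces neighbouring pairs of values by their scaled sum (approximation) and scaled difference (detail). The code below is a minimal illustration under stated assumptions: the helper names are hypothetical, an odd-length series is padded by repeating its last value, and the orthonormal 1/√2 scaling is assumed; the study does not state these implementation details.

```python
import numpy as np


def haar_step(x):
    """One Haar level: pairwise scaled sums (approximation) and differences (detail)."""
    x = np.asarray(x, dtype=float)
    if x.size % 2:                       # assumed padding rule for odd lengths
        x = np.append(x, x[-1])
    approx = (x[0::2] + x[1::2]) / np.sqrt(2.0)
    detail = (x[0::2] - x[1::2]) / np.sqrt(2.0)
    return approx, detail


def haar_decompose(x, level=2):
    """Return the final approximation and the details [d1, d2, ...]."""
    approx = np.asarray(x, dtype=float)
    details = []
    for _ in range(level):
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details
```

For the series 4, 6, 10, 12, 8, 6, 5, 5, two levels give the approximation coefficients 16 and 12 and the level-2 details −6 and 2. Each level halves the number of coefficients, so a practical WANN workflow would still need to bring the components back to the length of the original series before using them as network inputs.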

2.4 Wavelet Artificial Neural Network (WANN) Model

The primary objective of this analysis is to determine the frequencies present in the signals and their variation, which are then used for analysing the data. It provides information related to time, signal frequency and location, and it helps to transform a signal into a set of sub-signals. The WANN model was first introduced by Aussem (1998) for forecasting financial time series. Wavelet transform analysis is a more appropriate tool than the Fourier transform for studying non-stationary signals (Partal and Kişi 2007). Wavelet-based hybrid models have been developed for a number of hydrological processes and have been applied effectively to studies of water resources with excellent results (Sahay and Srivastava 2014; Seo et al. 2015; Kumar et al. 2015, 2020, 2021; Djerbouai and Souag-Gamane 2016; Araghi et al. 2017; Kisi and Alizamir 2018; Shukla et al. 2021; Drisya et al. 2021; Dumka and Kumar 2021; Bajirao et al. 2021).

In the present study, the Haar wavelet (at level 2) has been used to decompose the seasonal rainfall data into several sub-signal series to obtain temporal information about the signal. These sub-signals are classified into detailed (d1 and d2) and approximation (A1 and A2) coefficients, as described in Fig. 2. Thus,

$${I}_{a}\left(t\right){W}_{a}+{I}_{d1}\left(t\right){W}_{1}+{I}_{d2}\left(t\right){W}_{2}+{I}_{d3}\left(t\right){W}_{3}=I\left(t+1\right)$$
(4)

where \({W}_{i}\) are the weights adjusted by the ANN; \({I}_{a}\left(t\right)\) and \({I}_{d}\left(t\right)\) are the approximated and detailed signals of rainfall at time \(t\); and \(I\left(t+1\right)\) is the rainfall one time step ahead of \(t\). The number of neurons in the hidden layer was determined using the 2N + 1 criterion, where N is the number of input neurons (Mishra and Desai 2006).
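As a small worked example of the 2N + 1 criterion: with N = 4 input neurons the upper limit is 2·4 + 1 = 9 hidden neurons. The sketch below lists candidate architectures up to that limit; the helper names, and the assumption that the candidates are the four consecutive hidden-layer sizes ending at 2N + 1, are illustrative and are inferred only from the network labels reported in the results.

```python
def max_hidden_neurons(n_inputs):
    """Upper limit on hidden neurons from the 2N + 1 criterion."""
    return 2 * n_inputs + 1


def candidate_architectures(n_inputs, n_candidates=4, n_outputs=1):
    """Label the networks whose hidden sizes end at 2N + 1 (assumed selection)."""
    top = max_hidden_neurons(n_inputs)
    sizes = range(top - n_candidates + 1, top + 1)
    return [f"{n_inputs}-{h}-{n_outputs}" for h in sizes]
```

For four inputs, `candidate_architectures(4)` returns the labels 4-6-1, 4-7-1, 4-8-1 and 4-9-1.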

Fig. 2 Schematic diagram of WANN model

To construct the network architecture, the dataset was divided into two portions: 70 percent of the rainfall data were used for training and the remaining 30 percent were used for testing the WANN model. The level of decomposition was computed using Eq. (5) (Nourani et al. 2009).

$$L=\mathrm{int}\left(\log N\right)$$
(5)

where L is the decomposition level and N is the total length of the dataset.
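As a worked example of Eq. (5), and of the 70/30 division of the data, the sketch below assumes a base-10 logarithm and one seasonal value per year for 1901 to 2016 (N = 116); both are assumptions, as are the helper names and the choice of a chronological (unshuffled) split. Under these assumptions log10(116) ≈ 2.06, so L = 2.

```python
import math


def decomposition_level(n_samples):
    """Eq. (5): L = int(log N), with a base-10 logarithm assumed."""
    return int(math.log10(n_samples))


def chronological_split(series, train_fraction=0.7):
    """Split a series in time order into training and testing parts."""
    n_train = int(round(train_fraction * len(series)))
    return series[:n_train], series[n_train:]
```

With 116 seasonal values this gives 81 values for training and 35 for testing.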

The models developed for testing the performance of rainfall prediction are shown in Table 4. The inputs for these models were the approximated and detailed signals at time t (as shown in Eq. 4), and the output was the signal one time step ahead.
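The pairing of inputs at time t with the target at time t + 1 can be sketched as follows. The function name `one_step_ahead_pairs` is hypothetical, and the sketch assumes that every wavelet component has already been brought to the same length as the rainfall series.

```python
import numpy as np


def one_step_ahead_pairs(components, rainfall):
    """Build (X, y) so that row t of X predicts rainfall at t + 1."""
    X = np.column_stack([np.asarray(c, dtype=float) for c in components])
    y = np.asarray(rainfall, dtype=float)
    if X.shape[0] != y.size:
        raise ValueError("components and rainfall must have equal length")
    # Drop the last input row (no target) and the first target (no input)
    return X[:-1], y[1:]
```

Each row of `X` then holds the component values at one time step, and the matching entry of `y` is the rainfall of the following time step.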

Table 4 Developed WANN Models for rainfall predictions

2.5  Evaluating Criteria for Model Performance

The WANN models were evaluated using performance indices, and the network with the best performance index was selected for the simulation. The Root Mean Square Error (RMSE), coefficient of determination (R2) and coefficient of efficiency (CE) were selected for evaluating the designed models; they are given in Eqs. 6–8.

$$RMSE={\left(\frac{1}{n}\sum_{i=1}^{n}{\left({O}_{i}-{P}_{i}\right)}^{2}\right)}^{0.5}\left(0\le RMSE<\infty \right)$$
(6)
$$CE= 1-\left(\frac{\sum_{i=1}^{N}{\left({O}_{i}-{P}_{i}\right)}^{2}}{\sum_{i=1}^{N}{\left({O}_{i}-\overline{O }\right)}^{2}}\right) \left(-\infty <CE\le 1\right)$$
(7)
$${R}^{2}= {\left(\frac{\sum_{i=1}^{N}\left({O}_{i}-\overline{O }\right)\left({P}_{i}-\overline{P }\right)}{\left(\sqrt{\left(\sum_{i=1}^{N}{\left({O}_{i}-\overline{O }\right)}^{2}\right)\left(\sum_{i=1}^{N}{\left({P}_{i}-\overline{P }\right)}^{2}\right)}\right)}\right)}^{2}\left(0<{R}^{2}<1\right)$$
(8)

where \({O}_{i}\) is the observed rainfall and \({P}_{i}\) is the predicted rainfall for the ith value of the time series; n (written N in Eqs. 7 and 8) is the total length of the time series; and \(\overline{O }\) and \(\overline{P }\) are the average values of the observed and predicted rainfall, respectively.

R2 is an index of the degree of linear relationship between observed and predicted data. CE measures how well the plot of predicted against observed values fits the 1:1 line. Model performance must also be evaluated with at least one absolute error measure (e.g. RMSE) to ensure that the model is as accurate as possible. The model with the lowest RMSE and with CE and R2 values closest to 1 is considered the best (Saroughi et al. 2023; Vishwakarma et al. 2023a, b; Mirzania et al. 2023; Kumar et al. 2023).
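The three indices can be computed as in the sketch below. The function names are illustrative; CE is written in the Nash–Sutcliffe form, with the deviations of the observations from their mean in the denominator, and R2 is the squared Pearson correlation between observed and predicted values.

```python
import numpy as np


def rmse(obs, pred):
    """Root mean square error, in the units of the data."""
    obs, pred = np.asarray(obs, dtype=float), np.asarray(pred, dtype=float)
    return float(np.sqrt(np.mean((obs - pred) ** 2)))


def coefficient_of_efficiency(obs, pred):
    """CE = 1 - SSE / sum of squared deviations of obs from its mean."""
    obs, pred = np.asarray(obs, dtype=float), np.asarray(pred, dtype=float)
    sse = np.sum((obs - pred) ** 2)
    return float(1.0 - sse / np.sum((obs - obs.mean()) ** 2))


def r_squared(obs, pred):
    """Squared Pearson correlation between observed and predicted values."""
    obs, pred = np.asarray(obs, dtype=float), np.asarray(pred, dtype=float)
    d_obs, d_pred = obs - obs.mean(), pred - pred.mean()
    return float(np.sum(d_obs * d_pred) ** 2
                 / (np.sum(d_obs ** 2) * np.sum(d_pred ** 2)))
```

A prediction that is offset from the observations by a constant illustrates why the indices are used together: for observations 1, 2, 3, 4 and predictions 2, 3, 4, 5, R2 equals 1 although RMSE is 1 and CE is only 0.2.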

3 Results

3.1 Statistical Analysis of Seasonal Rainfall

The long-term rainfall statistics of Almora, Lansdown, Kashipur and Mukteswar are shown in Tables 5 to 8. At Almora, the lowest and highest seasonal rainfall were zero and 1187.20 mm, respectively (Table 5). At Lansdown station, the highest measured rainfall was 2826.70 mm and the lowest was zero (Table 6). At Kashipur station, the mean rainfall varied from 48.45 mm to 1060.53 mm; the maximum rainfall among all the seasons, 2716.40 mm, was measured in the monsoon season, whereas the lowest rainfall was zero in all seasons except the monsoon (Table 7). At Mukteswar station, the mean rainfall varied from 85.04 mm to 964.43 mm and the minimum rainfall was zero in the post-monsoon season. The maximum, SD, CV and SC of the seasonal rainfall data ranged from 336.90 mm to 1839 mm, 65.34 mm to 256.22 mm, 26.57% to 111.34% and 0.65 to 2.29, respectively (Table 8).

Table 5 Statistical analysis of seasonal rainfall (mm) data for Almora
Table 6 Statistical analysis of seasonal rainfall (mm) data for Lansdown

Pimentel-Gomes (2023) classified the CV as follows: low, lower than 10%; average, 10–20%; high, 20–30%; very high, higher than 30%. The results in Tables 5, 6, 7 and 8, and the visual interpretation in Fig. 3, show that the level of variability was very high at all stations, except at Almora and Mukteswar in the monsoon season.
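The classification above can be written as a small lookup. The function name `classify_cv` is hypothetical, and the treatment of the shared boundaries (exactly 20% counted as average, exactly 30% as high) is an assumption, since the quoted classes overlap at those values.

```python
def classify_cv(cv_percent):
    """Map a coefficient of variation (in percent) to its variability class."""
    if cv_percent < 10:
        return "low"
    if cv_percent <= 20:        # boundary handling is an assumption
        return "average"
    if cv_percent <= 30:
        return "high"
    return "very high"
```

Applied to the CV range reported for Mukteswar, 26.57% falls in the high class and 111.34% in the very high class.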

Table 7 Statistical analysis of seasonal rainfall (mm) data for Kashipur
Table 8 Statistical analysis of seasonal rainfall (mm) data for Mukteswar
Fig. 3 Variability of rainfall at Almora, Lansdown, Kashipur and Mukteswar

3.2  Forecasting Seasonal Rainfall Using WANN-Model

3.2.1 Model selection

The models were selected on the basis of the training results, using a low Root Mean Square Error (RMSE) and high values of the coefficient of determination (R2) and coefficient of efficiency (CE) as criteria. The training, testing and overall values of R2 for seasonal rainfall were computed for the selected stations, namely Kashipur, Lansdown, Almora and Mukteswar. Furthermore, four WANN networks were designed for each season by varying the number of neurons in the hidden layer and, based on model performance, one WANN model was finalized for each season.

3.2.2 WANN-model for Lansdown station

The prediction accuracy of the different WANN models for the training, testing and overall data sets for Lansdown station is given in Table 9. The overall R2, RMSE and CE values ranged from 0.665 to 0.857 (mean = 0.744), 32.192 to 303.682 mm/month (mean = 106.437 mm/month), and 0.636 to 0.846 (mean = 0.722), respectively. For the monsoon season, the WANN-02 model (4-7-1 network) outperformed the other designed networks, with overall R2, RMSE and CE values of 0.815, 232.736 mm/month and 0.805, respectively. Of the four designed networks, WANN-08 (4-9-1 network) was found best for the winter season, with the highest overall R2 and CE (R2 = 0.740, CE = 0.732) and the lowest RMSE (49.073 mm/month) (Table 9). Similarly, for the pre-monsoon season, the WANN-10 model (4-7-1 network) was found superior, as it achieved the highest R2 and CE (0.857 and 0.846, respectively) and the lowest RMSE (32.192 mm/month) of the four networks designed for that season. For the post-monsoon season, the WANN-16 model (4-9-1 network) outperformed the other designed networks, with overall R2, RMSE and CE values of 0.806, 47.739 mm/month and 0.796, respectively (Table 9).

Table 9 Selected models of all the seasons for Lansdown station

Furthermore, scatter plots of observed versus predicted rainfall for all seasons are shown in Fig. 4. Across the seasons, model WANN-10 (4-7-1 network) achieved the best overall performance in predicting rainfall for the Lansdown station data set, with R2, RMSE and CE values of 0.857, 32.192 mm/month and 0.846, respectively. It therefore has the highest accuracy, as the points lie very close to the 1:1 line for the pre-monsoon season.

Fig. 4 Scatter plot between observed and predicted rainfall for Lansdown station

3.2.3 WANN-model for Almora station

The prediction accuracy of the different WANN models for the training, testing and overall data sets for Almora station is given in Table 10. The overall R2, RMSE and CE values ranged from 0.684 to 0.822 (mean = 0.735), 24.804 to 96.906 mm/month (mean = 47.512 mm/month), and 0.680 to 0.810 (mean = 0.726), respectively. For the monsoon season, the WANN-02 model (4-7-1 network) outperformed the other designed networks, with overall R2, RMSE and CE values of 0.781, 88.011 mm/month and 0.776, respectively. Of the four designed networks, WANN-08 (4-9-1 network) was found best for the winter season, with the highest overall R2 and CE (R2 = 0.822, CE = 0.810) and the lowest RMSE (24.804 mm/month) (Table 10). Similarly, for the pre-monsoon season, the WANN-09 model (4-6-1 network) was found superior, as it achieved the highest R2 and CE (0.704 and 0.701, respectively) and the lowest RMSE (32.865 mm/month) of the four networks designed for that season. For the post-monsoon season, the WANN-15 model (4-8-1 network) outperformed the other designed networks, with overall R2, RMSE and CE values of 0.732, 34.771 mm/month and 0.725, respectively (Table 10).

Table 10 Models of all the seasons for Almora station

Furthermore, scatter plots of observed versus predicted rainfall for all seasons are shown in Fig. 5. Across the seasons, model WANN-08 (4-9-1 network) achieved the best overall performance in predicting rainfall for the Almora station data set, with R2, RMSE and CE values of 0.822, 24.804 mm/month and 0.810, respectively. It therefore has the highest accuracy, as the points lie very close to the 1:1 line for the winter season.

Fig. 5 Scatter plot between observed and predicted rainfall for Almora station

3.2.4 WANN-model for Mukteswar station

The prediction accuracy of the different WANN models for the training, testing and overall data sets for Mukteswar station is given in Table 11. The overall R2, RMSE and CE values ranged from 0.661 to 0.867 (mean = 0.745), 32.575 to 134.340 mm/month (mean = 61.380 mm/month), and 0.566 to 0.864 (mean = 0.732), respectively. For the monsoon season, the WANN-03 model (4-8-1 network) outperformed the other designed networks, with overall R2, RMSE and CE values of 0.867, 94.317 mm/month and 0.864, respectively. Of the four designed networks, WANN-05 (4-6-1 network) was found best for the winter season, with the highest overall R2 and CE (R2 = 0.748, CE = 0.748) and the lowest RMSE (32.575 mm/month) (Table 11). Similarly, for the pre-monsoon season, the WANN-10 model (4-7-1 network) was found superior, as it achieved the highest R2 and CE (0.780 and 0.738, respectively) and the lowest RMSE (37.232 mm/month) of the four networks designed for that season. For the post-monsoon season, the WANN-16 model (4-9-1 network) outperformed the other designed networks, with overall R2, RMSE and CE values of 0.817, 41.631 mm/month and 0.811, respectively (Table 11).

Table 11 Selected models of all the seasons for Mukteswar station

Furthermore, scatter plots of observed versus predicted rainfall for all seasons are shown in Fig. 6. Across the seasons, model WANN-03 (4-8-1 network) achieved the best overall performance in predicting rainfall for the Mukteswar station data set, with the highest R2 and CE values (0.867 and 0.864, respectively) and an RMSE of 94.317 mm/month. It therefore has the highest accuracy, as the points lie very close to the 1:1 line for the monsoon season.

Fig. 6 Scatter plot between observed and predicted rainfall for Mukteswar station

3.2.5 WANN-model for Kashipur station

The prediction accuracy of the different WANN models for the training, testing and overall data sets for Kashipur station is given in Table 12. The overall R2, RMSE and CE values ranged from 0.700 to 0.998 (mean = 0.808), 1.240 to 269.357 mm/month (mean = 81.338 mm/month), and 0.660 to 0.999 (mean = 0.793), respectively. For the monsoon season, the WANN-01 model (4-6-1 network) outperformed the other designed networks, with overall R2, RMSE and CE values of 0.771, 239.490 mm/month and 0.731, respectively. Of the four designed networks, WANN-08 (4-9-1 network) was found best for the winter season, with the highest overall R2 and CE (R2 = 0.786, CE = 0.782) and the lowest RMSE (24.785 mm/month) (Table 12). Similarly, for the pre-monsoon season, the WANN-10 model (4-7-1 network) was found superior, as it achieved the highest R2 and CE (0.998 and 0.999, respectively) and the lowest RMSE (1.240 mm/month) of the four networks designed for that season. For the post-monsoon season, the WANN-15 model (4-8-1 network) outperformed the other designed networks, with overall R2, RMSE and CE values of 0.865, 34.118 mm/month and 0.839, respectively (Table 12).

Table 12 Selected models of all the seasons for Kashipur station

Furthermore, scatter plots of observed versus predicted rainfall for all seasons are shown in Fig. 7. Across the seasons, model WANN-10 (4-7-1 network) achieved the best overall performance in predicting rainfall for the Kashipur station data set, with R2, RMSE and CE values of 0.998, 1.240 mm/month and 0.999, respectively. It therefore has the highest accuracy, as the points lie very close to the 1:1 line for the pre-monsoon season. It was also noticed that, for the Kashipur station, the training performance of all models was higher than at the other stations.

Fig. 7 Scatter plot between observed and predicted rainfall for Kashipur station

4 Discussion

Because of the geographical location of Uttarakhand, the climate indices at Almora, Kashipur, Lansdown, and Mukteswar have a strong impact on rainfall variability in this region. Because climate drivers interact with rainfall in complex ways, it is sometimes impossible to predict rainfall with high accuracy from an individual climate driver alone. A total of 16 different models (WANN-01 to WANN-16) were selected to compare the effect of climate indices on Uttarakhand rainfall on a monthly basis in this study. In the present study, the discrete wavelet transform coupled with ANN was employed to enhance the accuracy of the seasonal rainfall prediction. Across all four regions, the performances of the WANN-03 (4-8-1 network), WANN-10 (4-7-1 network), WANN-10 (4-7-1 network) and WANN-15 (4-8-1 network) models for the Mukteswar, Lansdown, Kashipur and Almora regions, respectively, are considerably better than those of the other WANN models in terms of the R2, RMSE, and CE values. The results of this study demonstrate that introducing easily estimated input variables into WANN models is a very useful way of improving precipitation predictions, especially when no long-term datasets are available that can provide a good estimate of future precipitation amounts. Wavelet models coupled with ANN have likewise performed better in rainfall prediction (Ray et al. 2020; Ghamariadyan and Imteaz 2021c; Tiwari et al. 2023), while other studies have simulated streamflow using specific ANN-based algorithms such as LM, SDG and BR-ANN (Rautela et al. 2022). In addition, our results showed lower RMSE values than those reported by Jiang and Wu (2013), who used evolutionary models in a study of ten stations in Guilin (China).
Considering the efficiency of the models, the mean CE values indicate that they all have good levels of efficiency, and these values are significantly higher than those reported by Kalteh (2017) in Iran, where ANNs were used to predict monthly precipitation from 30-year series. Before being used as input to the ANN, the discrete signals (representing seasonal rainfall) were first decomposed into smaller signals. During the analysis, some models performed very well with fewer neurons in the hidden layer, while others performed badly even with more neurons in the hidden layer (Zhang and Morris 1998; Ke and Liu 2008; Sheela and Deepa 2013; Shukla et al. 2021; Rachmatullah et al. 2021). Therefore, the relationship between the number of hidden-layer neurons and the best-performing model should be regarded as context-specific, and it may also depend on the size of the dataset. Local farmers and policymakers in the studied regions may find this study valuable in mitigating water-related issues. The future scope of this study is as follows. In this research, the number of rain gauge stations was limited by data availability. Consequently, it is recommended that future studies include a greater number of stations. Additionally, the prediction of rainfall could be extended to other time scales, such as daily and monthly intervals. Furthermore, it is advised to integrate diverse models employing remote sensing and deep learning approaches.

5 Conclusion

In the present study, four stations of Uttarakhand state, namely Almora, Kashipur, Lansdown, and Mukteswar, were considered for analyzing the statistical parameters and for developing a hybrid WANN model. These stations were selected on the basis of data availability. Six statistical parameters, namely mean, maximum, minimum, coefficient of variation, coefficient of skewness and standard deviation, were employed for analyzing the seasonal rainfall data. A hybrid model of wavelet coupled with ANN was developed to model seasonal rainfall. The Root Mean Square Error (RMSE), coefficient of determination (R2) and coefficient of efficiency (CE) were used for evaluating the performance of the designed models. The highest mean rainfall was observed in the winter season for the Lansdown station. Lansdown station received the highest rainfall of all the stations. In the analysis based on pre-monsoon and post-monsoon readings, Mukteswar had the highest mean rainfall. Similarly, for the monsoon, Lansdown had the highest value. The maximum rainfall for pre-monsoon, post-monsoon, and winter in the southwest monsoon was seen in the Lansdown region. For Mukteswar, Lansdown, Kashipur and Almora, the models WANN-03 (4-8-1 network), WANN-10 (4-7-1 network), WANN-10 (4-7-1 network) and WANN-15 (4-8-1 network), respectively, were found to be the most efficient, as the R2 value was high and the RMSE obtained was low. The overall value of R2 was found to be high in these models. The selection of these models for each region was based on comparisons between the models and their seasons. Among all the models investigated in the study, the hybrid WANN-10 model with the 4-7-1 network was found to be the best model for the Lansdown station.