1 Introduction

Rainfall plays a crucial role in the terrestrial water cycle, as global warming leads to increased climate instability, extreme rainfall and drought events are occurring frequently, introducing numerous uncertainties into the sustainable growth of the local economy and the well-being of its inhabitants. (Aydin and Sevgi 2022; Fatimi et al. 2023; Zhuang et al. 2023). Therefore, scientific and accurate rainfall prediction based on hydrological characteristics is important for effective hydrological planning and development of strategies to manage rainfall risks.

The hydrological processes, such as rainfall, are characterized by complexity and fluctuation, making their precise forecasting a key issue in contemporary research (Moraga et al. 2022; Li et al. 2012; Zhou et al. 2017). During the past few years, scholars have extensively researched rainfall predicting and attained remarkable outcomes (Wu et al. 2021; Yang et al. 2022). Currently, the common practice for rainfall prediction is analyzing available rainfall data and constructing corresponding prediction models (Chen et al. 2022). The GP, SVR, ANN and other classical time series modeling methods are now effectively used in hydrological forecasting (Chadalawada et al. 2017; Kajewska Szkudlarek 2020; Devi et al. 2016). However, owing to the randomness of rainfall and the influence of climate, the actual rainfall may exhibit significant fluctuations within a single year, rendering the traditional statistical methods inadequate for accurate quantitative prediction of rainfall (Olmo and Bettolli 2022). As computing power increases, deep learning neural networks have found widespread applications in various fields (Han et al. 2022; Afshari Nia et al. 2023; Chen and Wang 2022; Hewamalage et al. 2021). Byun et al. (2023) analysed cloud image data using the CNN algorithm for regional rainfall prediction. Zhang et al. (2021) predicted the daily runoff of a river using the RNN model. However, due to its algorithmic reasons, the RNN model was prone to gradient explosion. In response to this issue, the Long Short-Term Memory (LSTM) model was proposed (Gao et al. 2020). Ni et al. (2020) confirmed the effectiveness of LSTM in predicting time series by examining precipitation data. Chen et al. (2021) conducted an analysis of the rainfall and runoff processes in the Shishengchuan basin and discovered that the LSTM model had the capability to produce highly accurate rainfall predictions. Zhu et al. (2023) combined high-resolution radar rainfall data to propose an LSTM-based approach to radar rainfall nearing prediction.

Rainfall time series are always volatile and nonlinear, which seriously affects the performance of prediction models (Chen et al. 2022; Nourali 2023; Ehteram et al. 2023). Therefore, the choosing of techniques capable of handling non-stationary time series is the essential foundation for the subsequent enhancement of rainfall prediction accuracy. EMD algorithm is a traditional signal feature extraction method, but it is prone to mode mixing (Johny et al. 2022; Zheng et al. 2021). As technology evolves and modernizes, methods such as ESMD, EWT, and SVD have been proposed and effectively used in the feature recognition process of prediction data. Gao et al. (2023) applied ESMD to decompose denoised wind speed time series adaptively and proposed FO-BSO-LSSVM for multistep wind speed prediction. Karijadi et al. (2023) utilized the EWT method for secondary processing of high-frequency IMFs and combined it with the LSTM method to achieve wind power prediction. Zhang and Chen (2022) extracted the features of wind energy data by CEEMDAN-SVD and performed wind speed prediction.

The efficiency of ESMD, EWT, SVD, and LSTM methods in data processing and prediction has been verified in practical engineering, however, these methods are less applied in the field of hydrology and the practical effects of the coupled algorithms have not been fully explored. Therefore, for the practical needs of rainfall prediction, a coupled model based on EMD-EWT-SVD-LSTM is proposed. Firstly, for the rainfall observation sequence of Kaifeng City, Henan Province, ESMD-EWT is applied to perform quadratic decomposition to simplify the rainfall sequence features and obtain the characteristic IMFs; Secondly, the high-frequency IMFs obtained after quadratic decomposition are subjected to SVD noise reduction to obtain the main features; Thirdly, the final predicted rainfall sequence is obtained by predicting and integrating by the LSTM model for all the IMFs; Finally, the results of ESMD-EWT-SVD-LSTM prediction are juxtaposed with alternative models to validate applicability and accuracy of the model. This study explores a new regional rainfall prediction method, which can improve the accuracy effectively and offer a reliable scientific basis for regional water resources planning and disaster mitigation.

2 Method and Methodology

2.1 Extreme-Point Symmetric Mode Decomposition

Wang and Li (2013) proposed ESMD based on the EMD algorithm, which is applicable across various engineering fields involving data processing, such as information science, atmospheric science, and economics.

The ESMD algorithm works as follows:

Step 1: Find all maxima and minima in the data X and denote them as Ei(i = 1, 2, 3…, n) in order.

Step 2: Connect neighboring extreme points with the help of a straight line and mark the midpoint of each section of the line as Fi(i = 1, 2, 3⋯n).

Step 3: Complement the midpoints F0, Fn of the left and right boundaries by linear interpolation.

Step 4: Construct the interpolation lines L1, …Lp (p ≥ 1) for the obtained midpoints n + 1 and calculate the average value of the interpolation lines according to L ∗  = (L1 + … + Lp)/p.

Step 5: Repeat the above steps for X − L until |L| ≤ ε, where ε is a predetermined tolerance error, usually taken as ε = 0.001δ0. δ0 is as follows:

$${\delta}_0^2=\frac{1}{N}\sum \limits_{i=1}^N\left({x}_i-\frac{1}{N}\sum \limits_{i=1}^N{x}_i\right)$$
(1)

where: N is length of data.

In addition to the allowable error ε, the maximum number of decompositions K is also an adjustable parameter. When the decomposition process makes |L| ≤ ε or reaches the predetermined maximum number of decompositions K, the first empirical mode M1 is decomposed.

Step 6: Repeat the above steps for X − M1, obtain M2, M3… in order, and finally the remaining extreme points are composed as residual component R.

Step 7: Calculate the variance ratio δ/δ0 to obtain the optimal number of decompositions K, where δ is the relative standard deviation of X − R and δ0 is the standard deviation of the original series X.

Step 8: In the integer interval [Kmin, Kmax], find out the maximum number of filtering times K corresponding to the minimum variance ratio δ/δ0 as an optimal number of decompositions, and accordingly repeat the first to sixth steps to output the signal decomposition results.

2.2 Empirical Wavelet Transform

Gilles integrated wavelet transform theory with the EMD method and introduced the empirical wavelet transform (EWT) method, which can effectively avoid the phenomenon of mode mixing (Gilles 2013; Peng et al. 2022; Liu et al. 2022).

The Eq.(2) is established to distinguish the optimal decomposition order N.

$$\left\{\begin{array}{l}{E}_N^l=\sqrt{\frac{\sum \limits_{i=1}^n{x}_l^2(i)}{n}},l=1,2,\cdots N\\ {}{E}_N=\sum \limits_{l=1}^N{E}_N^l\\ {}{\theta}_{N,N-1}=\frac{\left|{E}_N-{E}_{N-1}\right|}{E_{N-1}}\end{array}\right.$$
(2)

where: \({E}_N^l\) is the energy of each IMF, EN is the sum of the energy of the N IMFs obtained by decomposition, and θN, N − 1 represents the energy difference of the IMF under different N values.

When θN, N − 1 suddenly becomes larger, it shows that the EWT decomposition process appears as the phenomenon of over-decomposition, resulting in false components. At this time, the corresponding decomposition time N − 1 is the optimal decomposition order of EWT.

2.3 Singular Value Decomposition

SVD, as a traditional and effective method for reducing noise, has been extensively utilized in the domain of extracting feature information from complex signals (Zhang et al. 2022; Zhang et al. 2018).

Define the optimal singular value decomposition order as r, this paper selects the order at which the singular entropy increment stabilizes as the optimal singular value decomposition order (Zhang et al. 2018).

2.4 Long Short-Term Memory Networks

In view of the fact that RNN is difficult to deal with long sequence data (Elsaraiti and Merabet 2021), Hochreiter and Schmidhuber (1997) proposed the LSTM model. Assuming σ stands for the sigmoid activation function; tanh represents the hyperbolic tangent function. The formulae for σ and tanh are as follows (Bakhshi Ostadkalayeh et al. 2023; Johny et al. 2022):

$$\sigma (x)=\frac{1}{1+{e}^{-x}}$$
(3)
$$\tanh (x)=\frac{e^x-{e}^{-x}}{e^x+{e}^{-x}}$$
(4)

The signal undergoes a series of transformations from its input to its output as follows:

$${f}_t=\sigma \left({W}_f\cdot \left[{h}_{t-1},{x}_t\right]+{b}_f\right)$$
(5)
$${i}_t=\sigma \left({W}_i\cdot \left[{h}_{t-1},{x}_t\right]+{b}_i\right)$$
(6)
$${o}_t=\sigma \left({W}_o\cdot \left[{h}_{t-1},{x}_t\right]+{b}_o\right)$$
(7)
$${\tilde{c}}_t=\tanh \left({W}_c\cdot \left[{h}_{t-1},{x}_t\right]+{b}_c\right)$$
(8)
$${c}_t={f}_t\cdot {c}_{t-1}+{i}_t\cdot {\tilde{c}}_t$$
(9)
$${h}_t={o}_t\cdot \tanh \left({c}_t\right)$$
(10)

where: Wf, Wi, Wo, Wc indicates the respective weight vectors of the forgotten gate, the input gate, the export gate, and the memory cell; ‘.’ denotes the dot product operation between two vectors; xt denotes the input to the network at time t; bf, bi, bo, bc represent the bias vectors of the forgotten gate, the input gate, the export gate, and the memory cell, respectively; ht − 1 denotes the hidden state at the moment t − 1; ct − 1 indicates the cellular state of t − 1; xt denotes the input at the moment of t; ft denotes the forgetting gate; it denotes the input gate; ot denotes output gate.

3 ESMD-EWT-SVD-LSTM Coupled Prediction Model

3.1 Model Steps

Taking into account the non-stationary of rainfall series and the benefits of different methods, this research suggests a coupled ESMD-EWT-SVD-LSTM model for predicting rainfall. The flowchart is illustrated in Fig. 1, and the crucial procedures are detailed underneath.

Fig. 1
figure 1

The flowchart of ESMD-EWT-SVD-LSTM coupled prediction model

1) ESMD decomposition. The ESMD method is employed to break down the initial series of rainfall, leading to a finite number of IMFs consisting of components M and a residual component R.

2) EWT quadratic decomposition. For the M1 with complex frequency among the IMFs of ESMD decomposition, quadratic decomposition by EWT algorithm is applied, and the main information of the IMFs is further extracted. The quadratically decomposed EM of the M1 is obtained.

3) Noise reduction. For the high-frequency IMFs that remain after the ESMD decomposition, the SVD method is employed to minimize noise, while the low-frequency IMFs are directly incorporated into both training and prediction samples.

4) Prediction. All the IMFs after quadratic decomposition and SVD noise reduction are segregated into training sample and prediction sample, and the LSTM method is used in this paper to predict and reconstruct the rainfall series.

By combining quadratic decomposition with noise reduction in signal processing, it is possible to significantly simplify and extract key features from monthly rainfall data, thus enhancing the precision of rainfall predictions.

3.2 Model Verification

To further assess the predictive accuracy of the ESMD-EWT-SVD-LSTM model, we employ MAE, RMSE, and SI as evaluation metrics (Johny et al. 2022; Zhang et al. 2023). The specific equations are as follows:

$$\textrm{MAE}=\frac{1}{N}\sum \limits_{i=1}^N\left|{y}_i-\hat{y_i}\right|$$
(11)
$$\textrm{RMSE}=\sqrt{\frac{\sum \limits_{i=1}^N{\left(\hat{y_i}-{y}_i\right)}^2}{N}}$$
(12)
$$\textrm{SI}=\frac{1}{\mu}\sqrt{\frac{\sum \limits_{i=1}^N{\left(\hat{y_i}-{y}_i\right)}^2}{N}}$$
(13)

where: \(\hat{y_i}\) is the predicted value at time i; yi is the actual observed value at time i; μ is the average of the observations; N is data length.

4 Example Analysis

4.1 Area Profile

Kaifeng, located in the central region of China and the eastern part of Henan Province, is renowned as the ancient capital of eight dynasties, making it an essential city for protection and development in China. Rainfall serves as the primary water resource for Kaifeng, exerting a direct influence on both the local ecological environment and the daily lives of its residents. Therefore, predicting monthly rainfall is crucial for water resources planning and sustainable development in Kaifeng.

This research utilizes the recorded monthly rainfall data from five counties and four districts in Kaifeng spanning 2009 to 2020 as foundational data (Data source: the Water Resources Bulletin of Henan Province, https://slt.henan.gov.cn/bmzl/szygl/szygb/).

4.2 Data Preprocessing

Before the prediction, it is essential to carry out the division of training samples and prediction samples. Here, 120 monthly rainfall series from 2009 to 2018 in Kaifeng City are used as training samples, and 24 monthly rainfall series from 2019 to 2020 are used as prediction samples to compare and validate the prediction results (Data source: the Water Resources Bulletin of Henan Province, https://slt.henan.gov.cn/bmzl/szygl/szygb/). According to the ESMD decomposition steps, the rainfall series is shown in Fig. 2.

Fig. 2
figure 2

Rainfall series graph

From Fig. 2(a), it can be observed that due to the influence of the temperate monsoon climate, the peak and trough values of Kaifeng’s rainfall exhibit obvious periodicity and uncertainty, the annual variation in rainfall is significant with complex characteristics. From Fig. 2(b), we can clearly observe that the ESMD effectively decomposes the original monthly rainfall series into M components representing information at different scales and a R component representing the trend. From the M1 to R, the frequency decrease gradually and the wavelength become longer. The highest frequency M1 still contains rich frequency information, which is not favorable for the next prediction. Consequently, the EWT method is utilized for the quadratic decomposition of the M1 to further simplify the component features. The EWT decomposition process is illustrated in Fig. 3.

Fig. 3
figure 3

The EWT decomposition process

From Fig. 3(a), it can be seen that when N > 4, the IMF energy difference θN, N − 1 starts to increase, therefore, N = 4 is chosen as the optimal decomposition order for the EWT algorithm. From Fig. 3(b), it’s evident that the EWT decomposition effectively decomposes the M1 into feature EMs which include distinct frequency information from the frequency domain perspective and ensures that each feature EM aligns with the frequency information present in M1. Therefore, through the EWT decomposition, several EM components with simpler frequency characteristics are obtained, which can provide basic data for the subsequent prediction analyses.

The monthly rainfall time series is characterized by its high volatility, non-linearity, and non-stationary, the obtained IMFs also exhibit significant volatility, which is not conducive to prediction and analysis. To address this, we utilize SVD denoising on the IMFs to refine the precision of the prediction model. The process is demonstrated using M2 as an example and the SVD noise reduction results is shown in Fig. 4.

Fig. 4
figure 4

The SVD noise reduction results

From Fig. 4(a), the optimal decomposition order of the M2 is determined to be r = 10 which corresponds to retaining 90% of the singular value. In Fig. 4(b), the blue curve illustrates the data before SVD processing, while the red curve illustrates the data after SVD processing. Comparing the red and blue curves, it can be found that the fluctuation peaks of the M2 are reduced in some months after SVD processing and the overall fluctuation curves are more homogeneous. By applying SVD noise reduction, the original signal characteristics are represented more comprehensively, and some detailed fluctuations are more accurately represented.

4.3 Rainfall Prediction

This study utilizes the monthly rainfall data of Kaifeng City from 2009 to 2018 as a training sample for the LSTM model and predicts the monthly rainfall for the period 2019 to 2020 (Data source: the Water Resources Bulletin of Henan Province, https://slt.henan.gov.cn/bmzl/szygl/szygb/). The predictions of IMFs are generated using LSTM model, the corresponding results are presented in Fig. 5.

Fig. 5
figure 5

The prediction results of LSTM

From Fig. 5, although there are still a small number of points in each IMF that do not match completely, the proportion of these points is very low and does not affect the final prediction results, which indicates that the model exhibits a high degree of predictive accuracy.

The relative error is calculated between the predicted and actual values, as detailed in Table 1.

Table 1 The relative error of the IMFs

According to Table 1, the M1 and M2 have the largest prediction errors with the average relative positive error ranging from 40 to 50% and the average relative negative error ranging from −50% to −30%. This suggests that the non-smoothing of the M1 and M2 is high, the sequence’s fluctuations in complex components can notably reduce prediction accuracy and cause a significant relative error. Figure 6 shows the prediction of rainfall data.

Fig. 6
figure 6

The prediction of rainfall data

From Fig. 6, the predicted rainfall series can correspond to the actual observation series with small relative errors, and the prediction results are accurate without lag. This indicates that the coupled rainfall prediction model based on ESMD-EWT-SVD-LSTM has a good ability to process data and prediction, the prediction results are accurate and can be effectively used in practical engineering.

5 Discussion

In this study, the ESMD-EWT-SVD-LSTM coupled model has achieved good application in the prediction of rainfall. To further validate the feasibility and superiority of the ESMD-EWT-SVD-LSTM coupled model, ESMD-SVD-LSTM, SVD-LSTM, and LSTM are applied to compare the prediction ability respectively. The indicators such as MAE, RMSE, SI, average relative error are introduced to quantitatively evaluate the prediction results and the comparison are shown in Table 2.

Table 2 Comparison table of indicators of each model

Table 2 demonstrates that the ESMD-EWT-SVD-LSTM model yields superior predictive outcomes and boasts a higher accuracy, with the MAE of 4.96, the RMSE of 6.13, the SI of 0.12, and the average relative error less than 1%. Taking the MAE as an example for analysis, comparing the prediction results of ESMD-EWT-SVD-LSTM and ESMD-SVD-LSTM, it can be found that after the quadratic decomposition of ESMD-EWT, the MAE is reduced by 56.53%, which indicates that reducing the complexity of the original rainfall sequence through multiple decompositions enhances the accuracy of LSTM. Besides, comparing the SVD-LSTM and LSTM prediction results, it can be found that although the SVD can further extract the signal features, the application of the SVD in the case of complex original rainfall sequence will lead to the interference or loss of the data features, which results in an increase of the MAE by 31.83% of the prediction results. The comparison of the model prediction results is provided in Fig. 7.

Fig. 7
figure 7

Comparison of model prediction results

As can be seen from Fig. 7, the trend and period of the ESMD-EWT-SVD-LSTM prediction results can effectively coincide with the actual observation data, and the prediction accuracy exceeds that of the other three models, which has a great potential in rainfall prediction.

Comparing the current research in the field of rainfall prediction (Byun et al. 2023; Chen et al. 2022; Johny et al. 2022), all the technique proposed by these researchers can be effectively applied to actual projects. However, in the context of the randomness, volatility, and non-linearity inherent in rainfall data, the central challenge for further investigation in the field of rainfall prediction becomes how to effectively diminish the complexity of rainfall data and enhance the precision of predictive outcomes. On the basis of previous studies and fully considering the particularity of rainfall data, we put forward ESMD-EWT-SVD-LSTM coupled model. Different from the traditional direct prediction or prediction after a single decomposition, the coupled model addresses the characteristics of rainfall data by performing a quadratic decomposition with ESMD-EWT and denoising with SVD, continuously reducing data complexity and enhancing predictability, ultimately achieving accurate predicting results. In this study, we apply the ESMD-EWT-SVD-LSTM coupled model to predict the monthly rainfall data in Kaifeng City and assess the validity of the model by comparing it with alternative models. The results show that the ESMD-EWT-SVD-LSTM coupled prediction model has better theoretical advantages and accuracy in monthly rainfall prediction, and can be effectively applied to practical projects. The results show that the ESMD-EWT-SVD-LSTM coupled prediction model has better theoretical advantages and accuracy in monthly rainfall prediction, and can be effectively applied to practical projects.

Kaifeng, a city in China, exemplifies a warm temperate continental monsoon climate with complex and seasonal variations in monthly rainfall. The ESMD-EWT-SVD-LSTM coupled model has been effectively applied to the prediction of monthly rainfall in Kaifeng, indicating that this model has a strong ability to analyze and predict complex rainfall series and can be applied to rainfall predicting in regions with similar climates in China and around the world. It is particularly worth mentioning that, based on the theoretical advantages of the ESMD-EWT-SVD-LSTM coupled model, it is feasible to perform rainfall prediction tasks well in complex rainfall areas different from Kaifeng, but the specific practical application effects require further research and exploration.

6 Conclusion

In this study, we present the ESMD-EWT-SVD-LSTM coupled prediction model to address the challenge of monthly rainfall prediction. By applying this approach to the analysis of monthly rainfall in Kaifeng City from 2009 to 2020, the results demonstrate that the predicted rainfall series closely aligns with the actual observation series, exhibiting small relative errors and without any lag. Comparing the predictive outcomes of the ESMD-EWT-SVD-LSTM model against those from the ESMD-SVD-LSTM, SVD-LSTM, and LSTM models, it is found that the ESMD-EWT-SVD-LSTM has the best prediction results, with MAE of 4.96, RMSE of 6.13, SI of 0.12, and average relative error less than 1%. This suggests that the coupled ESMD-EWT-SVD-LSTM monthly rainfall prediction model possesses strong predictive capabilities and can be effectively applied to regional water resources scheduling.

Although the proposed ESMD-EWT-SVD-LSTM model has demonstrated efficacy in monthly rainfall prediction, the current model solely focuses on the rainfall time series during the prediction process and overlooks the physical mechanisms that contribute to changes in rainfall. How to explore the connection between predicting rainfall and other pertinent factors will be a crucial focus for future studies.