Introduction

Precipitation is the most important component of the global hydrologic cycle, recharging renewable surface and groundwater resources (Brutsaert 2005; Jin et al. 2021; Sospedra‐Alfonso et al. 2015). As such, it is of paramount importance for the domestic, industrial, and agricultural sectors, particularly under a changing climate (Medina et al. 2019; Myers et al. 2017; Qin et al. 2020). Precipitation anomalies, which can manifest as floods or droughts, can strongly affect quality of life (Dai 2011; Petrucci 2022). Therefore, precipitation forecasts hold significant value in informing strategies aimed at adapting to droughts and floods, ensuring food and water security, and promoting human health (Medina et al. 2019; Nguyen-Huy et al. 2017; Nouri et al. 2017a). Precipitation anomalies occur in association with local-scale microclimate anomalies and the phases of global teleconnections. Climate change influences not only precipitation magnitude but also its seasonality and type (Nouri and Homaee 2020, 2021b; Qin et al. 2020) by altering storm track activity and atmosphere–ocean teleconnection signals (Cai et al. 2015; Evans 2009; Trenberth 2020; Trenberth et al. 2013).

Teleconnection fluctuations and their correlation with precipitation provide a promising approach for forecasting precipitation amounts and anomalies as well as climate change impacts on precipitation. There are plenty of studies addressing the association between teleconnection signals and precipitation-related variables, e.g., dry/wet days, rainfall intensity, precipitation amount, precipitation extremity, precipitation duration, and precipitation type, on the globe (Casanueva et al. 2014; He and Guan 2013; Irannezhad et al. 2021; Nazemosadat and Cordery 2000; Nazemosadat and Ghasemi 2004; Nouri and Homaee 2021b; Skeeter et al. 2019; Helali et al. 2022; Xiao et al. 2017). Moreover, the precipitation behavior during different atmospheric teleconnection events (for instance the El Niño-Southern Oscillation, ENSO) has been leveraged to evaluate the variabilities of hydrological and biometeorological variables such as runoff and streamflow (Niu et al. 2014; Wang et al. 2022b), soil moisture (Nicolai‐Shaw et al. 2016; Niu et al. 2014), groundwater level (Rust et al. 2018), heat waves (Choi et al. 2020; Jacques‐Coper et al. 2021), evapotranspiration (Chai et al. 2018; Helali and Asadi Oskouei 2021; Zhao et al. 2020), land–atmosphere coupling strength (Holmes et al. 2017; Nouri and Homaee 2021a), and crop yield (Bannayan et al. 2011; Nouri et al. 2017a). The teleconnection signals have been also used to project the future changes in hydroclimatological variables (Rust et al. 2019; Yoon et al. 2015).

Spatial inconsistency in the relationship between precipitation and large-scale teleconnections refers to the phenomenon in which the relationship is significant at certain points or regions but not in neighboring areas (Brown and Comrie 2004). This discrepancy is a potential pitfall for understanding the complex interactions between precipitation and large-scale atmospheric phenomena. It can be explained by local impacts (particularly for point-scale studies) and/or the coupled effects exerted by other large-scale teleconnection patterns. Brown and Comrie (2004) attributed spatial inconsistencies in the relationship between ENSO and winter precipitation over the western United States to phase shifts of the Pacific Decadal Oscillation (PDO). Therefore, the phase of one teleconnection driver may modulate the impacts of another, affecting forecasting skill (Dannenberg et al. 2018; Theobald et al. 2018). This implies that forecasts of precipitation based on a single teleconnection driver are highly uncertain, since they do not consider the coupled effects of teleconnections.

The investigation of the relationship between precipitation and teleconnections has been extended to regional scales, such as catchment-level studies, to overcome the spatial inconsistency observed in point-scale investigations (Räsänen and Kummu 2013; Sigaroodi et al. 2014; Xiao et al. 2015; Zhang et al. 2013, 2016a). However, conflicting results may still arise on regional scales. The relationships and forecasting models represent the average precipitation condition of a basin and might not be valid for each local site within that basin. This can be ascribed to the microclimate which can influence local precipitation regimes in a highly variable and non-uniform manner. Therefore, a sophisticated analytical approach is required to consider the microclimate impacts in different teleconnection events. Clustering techniques identify distinct groups of microclimate conditions associated with specific precipitation patterns (Roque-Malo and Kumar 2017). They can be particularly useful in reducing the uncertainty of precipitation forecasts, by providing a more detailed and nuanced understanding of the local factors that contribute to precipitation variability (Di et al. 2015; Gibson et al. 2021; Liu et al. 2020; Roque-Malo and Kumar 2017).

Various modeling approaches have been adopted to analyze precipitation anomalies during different teleconnection events and to forecast precipitation. It is worth noting that the modeling approach should physically explain the precipitation response to teleconnections. Linear approaches, such as linear regression, have been widely utilized in previous studies to associate precipitation-related variables with teleconnection indices (Hu et al. 2005; Kim et al. 2012; Wang et al. 2017, 2006a). This modeling approach is valid when precipitation is linearly associated with teleconnection anomalies. However, it has been argued that linear approaches are unsuitable when precipitation responds non-linearly to atmospheric teleconnection patterns (Chung and Power 2017; Kinouchi et al. 2018; Krishnaswamy et al. 2014). With recent advances in computational and data sciences, artificial intelligence has emerged as a promising tool for analyzing complex relationships with high inherent uncertainty in different fields (Boukabara et al. 2021; Shouval et al. 2021). Machine learning (ML) is a sub-domain of artificial intelligence that offers deep insights into complex non-linear data structures (Shouval et al. 2021). The use of machine learning is growing rapidly in the weather forecasting and hydroclimatology literature (Arcomano et al. 2020; Ashley et al. 2019; Böhm et al. 2021; Chakraborty et al. 2021; Scher and Messori 2018). Recently, ML alternatives have been applied to forecast precipitation based on teleconnection signals (He et al. 2021). This study aimed to forecast the lagged responses of seasonal and annual precipitation to a wide range of teleconnection signals across different clustered precipitation regimes in Iran using sophisticated machine learning algorithms.

Methodology

Study area and data

Iran, situated in the Middle East, is a vast country characterized by a diverse range of climatic regimes, ranging from hyper-arid to humid. This remarkable climatic variety can largely be attributed to the presence of the Alborz Mountains in the north and the Zagros ranges in the west. The precipitating air masses blowing from the Caspian Sea are trapped by the Alborz, yielding sub-humid and humid conditions in northern Iran (Nouri and Homaee 2021b). The eastward-moving humid Mediterranean air masses are also blocked by the Zagros Mountains, resulting in semi-arid conditions in the western half of Iran. Arid and hyper-arid climates prevail in central, eastern, and southern Iran, where annual precipitation over 250 mm is infrequent (Nouri and Homaee 2021b). As a result, different seasonal and annual precipitation regimes can be observed in our study area. We clustered seasonal and annual precipitation based on percentiles to portray different precipitation regimes and to consider the microclimate impacts across the studied sites (Fig. 1 and Table 1).

Fig. 1 Location of the studied sites (listed in Table 1 of supplementary material) classified in different precipitation percentiles

Table 1 Clustering criterion (Cr), the number of stations located in each cluster (SN), standard deviation (SD), and mean and coefficient of variation (CV) of seasonal and annual precipitation

The Iran Ministry of Energy records precipitation data from more than 4360 locations (https://stu.wrm.ir/login.asp), and the Iran Meteorological Organization (IRIMO) collects precipitation data from more than 720 synoptic and climatology stations (https://data.irimo.ir/login/login.aspx) (Nouri and Homaee 2018, 2022; Saemian et al. 2021, 2022). We employed three main selection criteria: the dataset length, the presence of missing data, and the identification of outliers. The clustering facilitated the differentiation between outliers and anomalies. Specifically, precipitation spikes were observed in the data from some arid southern sites that initially appeared to be outliers but were, in fact, typical behavior within specific clusters of these areas. Therefore, we classified such spikes as anomalies and retained the sites. In contrast, spikes that appeared unusual for a given cluster were deemed outliers, and the corresponding sites were excluded from the analysis. Finally, monthly precipitation data were gathered from 117 synoptic sites and 600 rain gauges with no missing data or outliers for the period from 1987 to 2015 (Table 1 in supplementary material).

The teleconnection indices are also listed in Table 2. These indices were retrieved from different sources given in Table 2 of supplementary material. For further details on the indices, one can refer to Hanley et al. (2003), Baldwin et al. (2001), Wang and Enfield (2001), Overland et al. (1999), Kutiel and Benaroch (2002), Barnston and Livezey (1987), and Wallace and Gutzler (1981). The datasets used to derive indices and their spatial resolution, as well as the latitude/longitude range of teleconnection patterns are listed in Table 3 of supplementary material. It is noteworthy that most indices were already computed by different sources, viz. National Center for Environmental Information (NCEI), Physical Sciences Laboratory (PSL), Climate Prediction Center (CPC) of National Oceanic and Atmospheric Administration (NOAA), National Center for Atmospheric Research (NCAR), Climatic Research Unit of University of East Anglia, Australian Bureau of Meteorology, and the Centre d’Estudis Ambientals del Mediterrani (CEAM). We, however, calculated Caspian Sea Surface Temperature (CSST), Indian Ocean Basin Sea Surface Temperature (IOBSST), Pacific Ocean Index (POI), Persian Gulf Sea Surface Temperature (PSST), and Western Pacific Sea Surface Temperature (WPSST) by applying gridded NOAA Optimum Interpolation SST v2 (https://psl.noaa.gov/data/gridded/data.noaa.oisst.v2.html). These indices were equal to the average SST for the corresponding areas given in Table 3 of supplementary material.
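The box-averaging step used for the self-computed SST indices can be illustrated with a minimal sketch. It assumes a single monthly SST field stored as a nested list with `None` for land or missing cells; the function name `box_mean_sst`, the toy grid, and the box bounds are hypothetical and are not taken from Table 3 of the supplementary material. The sketch uses a plain (unweighted) mean, since the text only states that the indices equal the average SST of the corresponding area.

```python
def box_mean_sst(sst, lats, lons, lat_range, lon_range):
    """Unweighted mean of the valid SST cells inside a lat/lon box."""
    lat_min, lat_max = lat_range
    lon_min, lon_max = lon_range
    values = [
        sst[i][j]
        for i, lat in enumerate(lats)
        for j, lon in enumerate(lons)
        if lat_min <= lat <= lat_max
        and lon_min <= lon <= lon_max
        and sst[i][j] is not None  # None marks land or missing cells
    ]
    if not values:
        raise ValueError("no valid SST cells inside the box")
    return sum(values) / len(values)


# Toy 3 x 3 grid (degrees C); grid and box bounds are made up for illustration.
lats = [24.5, 25.5, 26.5]
lons = [50.5, 51.5, 52.5]
sst = [
    [28.0, 28.4, None],
    [27.6, 28.0, 28.2],
    [27.0, 27.4, 27.8],
]
index_value = box_mean_sst(sst, lats, lons, (25.0, 27.0), (50.0, 52.0))
```

Repeating this for every month of the record would give a monthly index series for the box.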

Table 2 List of teleconnection indices

In the current study, the aridity index map provided by Nouri and Homaee (2018) was utilized to classify the climates (Fig. 1). Accordingly, the sites corresponding to the 10–40th precipitation percentiles are mainly situated in central and eastern Iran and along the coastal strips of the Persian Gulf and the Gulf of Oman in the south of the country, with arid and hyper-arid climatic regimes (Fig. 1). The sites corresponding to the 40–60th percentiles are mostly located in northwestern semi-arid environments and southern and northeastern arid regions. The 60–80th percentiles predominantly comprise the northwestern semi-arid locations and the arid areas lying near the boundary between the semi-arid and arid climates in southern Iran. The 80–100th percentiles encompass semi-arid, sub-humid, and humid areas in the central and northern Zagros and the sub-humid and humid areas located on the northern flanks of the Alborz. Accordingly, the precipitation percentiles are scattered across a broad range of climatic and topographic settings, illustrating that different precipitation regimes, and consequently microclimates, were included in our investigation (Fig. 1). Hence, geographic and climatic indices alone do not appear to be promising criteria for explaining precipitation regimes. For instance, two sites, namely Arhan (No. 50 in Table 1 of supplementary material) and Saroo (No. 605 in Table 1 of supplementary material), both classified under the 70th percentile with an annual precipitation of approximately 360 mm, exhibited significant differences in their climatic and geographic characteristics. Arhan is a cold semi-arid mountainous area situated at an elevation of 2028 m a.s.l., while Saroo is an arid region located at an elevation of 1347 m a.s.l. Despite these dissimilarities, the two sites were classified under the same precipitation regime, highlighting the limitations of relying solely on geographic and climatic indices for explaining precipitation patterns.

Summer precipitation was not considered in the current study, as it accounts for less than 10% of annual precipitation at 80% of the investigated sites. Given the high evapotranspiration rate in summertime, forecasting summer precipitation appears to be of limited importance in Iran.

Modeling framework

Pre-processing

The modeling approach consisted of three primary stages: pre-processing, forecasting, and statistical evaluation (Fig. 2). In the first stage, the correlations between seasonal and annual precipitation (as predictands) and 40 teleconnection indices (as candidate predictors in Table 2) were examined for 13 clusters. This was done at lags of 1 to 6 months, resulting in matrices that encompassed a total of 240 associations. The predictands were autumn (October–November–December), winter (January–February–March), spring (April–May–June), and annual precipitation. It should be noted that the hydrologic year in Iran spans from 1 October to 30 September of the subsequent year (Nouri and Homaee 2020). Pearson’s correlation coefficient was used to evaluate the relationship between teleconnection indices and precipitation on seasonal and annual scales:

$$R = \frac{\sum\limits_{i = 1}^{n} \left( x_{i} - \overline{x} \right)\left( y_{i} - \overline{y} \right)}{\sqrt{\sum\limits_{i = 1}^{n} \left( x_{i} - \overline{x} \right)^{2} \sum\limits_{i = 1}^{n} \left( y_{i} - \overline{y} \right)^{2}}},$$
(1)

where \(x_{i}\) is the independent variable (i.e., teleconnection indices), \(y_{i}\) is the dependent variable (precipitation), and \(\overline{y}\) and \(\overline{x}\) are the averages of the dependent and independent variables, respectively.

Fig. 2 The forecasting framework and stages based on the machine learning (ML) algorithms of the Generalized Regression Neural Network, GRNN, the Multi-Layer Perceptron, MLP, the Multi-Linear Regression, MLR, and the Least Squares Support Vector Machine, LSSVM. (The nRMSE, PBIAS and NSE denote normalized Root Mean Square Error, Percent Bias, and Nash–Sutcliffe Efficiency, respectively, explained in “Quantitative evaluation”.)

The correlation matrix established in 1- to 6-month lead times is as follows:

$$R_{i,j} = \left[ \begin{array}{ccccc} R_{1,1} & R_{1,2} & R_{1,3} & \cdots & R_{1,13} \\ R_{2,1} & R_{2,2} & R_{2,3} & \cdots & R_{2,13} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ R_{40,1} & R_{40,2} & R_{40,3} & \cdots & R_{40,13} \end{array} \right],$$
(2)

where i and j index the teleconnection indices (i = 1, …, 40; Table 2) and the precipitation percentile clusters (j = 1, …, 13; Table 1), respectively.
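As an illustration of how one such matrix could be assembled for a single lag, the following minimal sketch computes Pearson's R between every index series and every cluster's precipitation series. It assumes that, for the chosen lag, each index has already been reduced to one value per year aligned with the precipitation years; the function names and the toy numbers are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)  # undefined for a constant series


def correlation_matrix(index_series, cluster_precip):
    """R[i][j]: correlation of index i with precipitation of cluster j (one lag)."""
    return [[pearson_r(idx, p) for p in cluster_precip] for idx in index_series]


# Toy data: 2 hypothetical indices x 2 hypothetical clusters, 5 years each.
index_series = [[0.1, 0.4, -0.2, 0.8, 0.3], [1.0, 2.0, 3.0, 4.0, 5.0]]
cluster_precip = [[10.0, 20.0, 30.0, 40.0, 50.0], [50.0, 40.0, 30.0, 20.0, 10.0]]
R = correlation_matrix(index_series, cluster_precip)
```

Calling `correlation_matrix` once per lag (1 to 6 months) would give one matrix per lead time, with rows for indices and columns for clusters.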

After constructing the correlation matrices, a three-step technique known as forward selection was employed to identify the most significant predictors. This approach has been extensively utilized in the literature (Khan et al. 2007; Modaresi et al. 2018; Wang et al. 2006b). First, the predictor with the highest correlation coefficient was selected to model the predictand. Subsequently, other predictors were gradually introduced into the model in descending order of their correlation coefficient. Finally, the three predictors that produced the minimum error were deemed the most appropriate combination (Table 3). Our experiments indicated that incorporating more than three predictors might elevate the risk of overfitting.
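One possible reading of this forward selection is sketched below; it is not necessarily the exact procedure used in the study. The sketch assumes that the first predictor is the one with the largest absolute correlation, that each further predictor is the remaining candidate giving the lowest error, and that the error is the RMSE of an ordinary least-squares fit. The error measure, the helper names, and the toy series are all assumptions.

```python
import math


def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_rmse(columns, y):
    """RMSE of a least-squares fit of y on the given predictor columns plus an intercept."""
    rows = [[1.0] + [col[t] for col in columns] for t in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yt for r, yt in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    resid = [yt - sum(b * v for b, v in zip(beta, r)) for r, yt in zip(rows, y)]
    return math.sqrt(sum(e * e for e in resid) / len(y))


def forward_select(candidates, y, max_predictors=3):
    """Greedy selection: start from the highest |R|, then add whichever lowers RMSE most."""
    ranked = sorted(candidates, key=lambda k: abs(pearson_r(candidates[k], y)), reverse=True)
    chosen = [ranked[0]]
    while len(chosen) < min(max_predictors, len(ranked)):
        remaining = [k for k in ranked if k not in chosen]
        best = min(remaining, key=lambda k: fit_rmse([candidates[c] for c in chosen + [k]], y))
        chosen.append(best)
    return chosen


# Toy candidates (hypothetical names); y is built as 2*A + B so A and B should be picked first.
candidates = {
    "A": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "B": [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0],
    "C": [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0],
    "D": [5.0, 3.0, 8.0, 1.0, 9.0, 2.0, 6.0, 4.0],
}
y = [2 * a + b for a, b in zip(candidates["A"], candidates["B"])]
chosen = forward_select(candidates, y)
```

In practice the error would more sensibly be measured on held-out data than on the fitting sample, which is consistent with the overfitting concern raised for larger predictor sets.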

Table 3 The most important teleconnection predictors applied to forecast seasonal and annual precipitation across different percentiles

Forecasting

In the second step, seasonal and annual precipitation was forecasted using four ML algorithms: the Generalized Regression Neural Network (GRNN), the Multi-Layer Perceptron (MLP), the Multi-Linear Regression (MLR), and the Least Squares Support Vector Machine (LSSVM). The MLP is a common artificial neural network (ANN) of the feed-forward class, comprising at least three layers: input, hidden, and output (Muni Kumar and Manjula 2019). It is composed of multiple interconnected nodes or neurons, arranged in a layered structure, where each neuron receives inputs from the previous layer and produces an output, which is transmitted to the next layer. The hidden layer of the MLP contains multiple neurons that are assigned weights and biases to optimize the model performance (Fig. 3a).
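The layer-by-layer computation described above can be sketched as a single forward pass through a one-hidden-layer MLP. This is only an illustration: the `tanh` activation, the linear output neuron, and all weights below are assumptions, since the text does not specify the activation function or the network size.

```python
import math


def mlp_forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass: inputs -> tanh hidden layer -> single linear output neuron."""
    hidden = [
        math.tanh(sum(w * v for w, v in zip(weights, x)) + bias)
        for weights, bias in zip(w_hidden, b_hidden)
    ]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


# Three inputs (e.g., three lagged teleconnection predictors) and two hidden neurons.
x = [0.5, -1.2, 0.3]
w_hidden = [[1.0, 0.0, 0.0], [0.0, 0.5, -0.5]]  # hypothetical weights
b_hidden = [0.0, 0.1]
w_out = [2.0, -1.0]
b_out = 1.0
forecast = mlp_forward(x, w_hidden, b_hidden, w_out, b_out)
```

Training would adjust the weights and biases to reduce the forecast error on the training set; that step is omitted here.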

Fig. 3 The architecture of the machine learning algorithms (∑ denotes the sum of weights for the output layer in MLP, the sum of weights for output functions in LSSVM, and the sum of weights for variable coefficients in MLR. S1 to Sn are the summation units, and D stands for the division unit in GRNN. The weights (w) are given to layers in MLP and GRNN, to kernel functions in LSSVM, and to variable coefficients in MLR. SV denotes a support vector in LSSVM. P(t-1) to P(t-6) are the precipitation at 1- to 6-month time lags, respectively. α (x, xi) is the variable coefficient of xi in MLR, and k (x, xi) is the kernel function of xi in LSSVM.)

The GRNN is a kernel-based three-layer ANN in which the number of neurons in input (output) layers is equivalent to the input (output) vector dimensions (Lee and Resdi 2014). One of the key features of the GRNN is the use of a Radial Basis Function (RBF) in the pattern layer. The RBF is a type of activation function often employed in kernel-based methods. It is characterized by a center and a width, and it assigns a value to each input based on its distance from the center. The Gaussian kernel function is a common choice for the RBF due to its smoothness and symmetry. In the GRNN, the number of neurons in the input and output layers is fixed and equal to the dimension of the input and output vectors, respectively (Antanasijević et al. 2014; Antanasijevic et al. 2013). The number of neurons in the middle layer, however, is not fixed and is instead defined by observed data applied for calibration and validation (Fig. 3b).
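A GRNN forecast can be written as a Gaussian-kernel weighted average of the training targets, with one pattern neuron per training sample. The sketch below follows that general formulation; the smoothing width `sigma` and the toy data are assumptions, as the text does not report the kernel width used.

```python
import math


def grnn_predict(x, train_x, train_y, sigma):
    """GRNN output: kernel-weighted average of training targets."""
    # Pattern layer: one Gaussian RBF neuron per training sample.
    weights = [
        math.exp(-sum((a - b) ** 2 for a, b in zip(x, xi)) / (2.0 * sigma ** 2))
        for xi in train_x
    ]
    numerator = sum(w * y for w, y in zip(weights, train_y))  # summation unit
    denominator = sum(weights)                                # summation unit
    return numerator / denominator                            # division unit D


# Toy one-dimensional example with three training samples.
train_x = [[0.0], [1.0], [2.0]]
train_y = [10.0, 20.0, 30.0]
at_sample = grnn_predict([1.0], train_x, train_y, sigma=0.1)
midway = grnn_predict([0.5], train_x, train_y, sigma=0.1)
```

A small `sigma` makes the forecast follow the nearest training samples closely, while a large `sigma` pulls it toward the overall mean of the training targets. If a query lies far from every training sample, the denominator can underflow to zero, so a practical implementation would guard against that.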

The support vector machine (SVM) is a supervised ML method that applies the Structural Risk Minimization approach to minimize model error, whereas other methods such as artificial neural networks (ANN) use Empirical Risk Minimization principles (Cao et al. 2009; Kazem et al. 2013). The LSSVM is a variant of SVM that relies on solving linear equations in its forecasting algorithm. The LSSVM has been shown to achieve acceptable performance by applying an effective kernel function (Guo et al. 2012). The kernel function maps the input data into a higher-dimensional feature space, rendering LSSVM suitable for nonlinear problems (Fig. 3c).
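To make the "linear equations" aspect concrete, the following minimal sketch fits an LSSVM regressor by solving the standard LS-SVM linear system with a Gaussian (RBF) kernel. The regularization parameter `gamma`, the kernel width `sigma`, and the toy data are assumptions; the text does not report the kernel or hyperparameters used.

```python
import math


def rbf(a, b, sigma):
    """Gaussian kernel k(a, b)."""
    return math.exp(-sum((p - q) ** 2 for p, q in zip(a, b)) / (2.0 * sigma ** 2))


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def lssvm_fit(train_x, train_y, gamma, sigma):
    """Solve [[0, 1^T], [1, K + I/gamma]] [bias; alphas] = [0; y]."""
    n = len(train_x)
    system = [[0.0] + [1.0] * n]
    for i in range(n):
        row = [1.0]
        for j in range(n):
            k = rbf(train_x[i], train_x[j], sigma)
            row.append(k + (1.0 / gamma if i == j else 0.0))
        system.append(row)
    solution = solve(system, [0.0] + list(train_y))
    return solution[0], solution[1:]


def lssvm_predict(x, train_x, bias, alphas, sigma):
    """Forecast: bias plus the kernel-weighted sum over all training samples."""
    return bias + sum(a * rbf(x, xi, sigma) for a, xi in zip(alphas, train_x))


# Toy data; a very large gamma makes the model nearly interpolate the training set.
train_x = [[0.0], [1.0], [2.0], [3.0]]
train_y = [1.0, 3.0, 2.0, 5.0]
bias, alphas = lssvm_fit(train_x, train_y, gamma=1e6, sigma=1.0)
fitted = [lssvm_predict(x, train_x, bias, alphas, sigma=1.0) for x in train_x]
```

With a large `gamma` the fitted values reproduce the training targets almost exactly, which illustrates how an LSSVM can score very well in training while generalizing poorly if its hyperparameters are not tuned on independent data.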

The MLR is a supervised learning technique forecasting a continuous variable based on several independent variables. This algorithm uses the training dataset to estimate the coefficients of the linear equation that best fits the data (Jose et al. 2022; Najafi et al. 2011). The resulting model can then be applied to forecast the value of the dependent variable in unseen datasets. In the current work, the MLR algorithm using three independent variables was formulated to forecast seasonal and annual precipitation (Fig. 3d).

The data were split randomly into two subsets of training and testing. The training set comprised 70% of the data, while the testing set contained the remaining 30%. Two final outputs, one for each seasonal and annual timeframe, were generated for each model by averaging the results of the last 10 out of 30 iterations.

Quantitative evaluation

Three metrics of normalized Root Mean Square Error (nRMSE), Percent Bias (PBIAS), and Nash–Sutcliffe Efficiency (NSE) were employed to evaluate the forecasting skill of ML alternatives. The mathematical expressions of these statistics are

$$n{\text{RMSE}} = 100 \times \frac{{\sqrt {\frac{1}{n}\sum\limits_{i = 1}^{n} {\left( {o_{i} - f_{i} } \right)^{2} } } }}{{\overline{o} }},$$
(3)
$${\text{PBIAS}} = \frac{100}{{\overline{o} }} \times \frac{{\sum\nolimits_{i = 1}^{n} {\left( {f_{i} - o_{i} } \right)} }}{n},$$
(4)
$${\text{NSE}} = 1 - \frac{{\sum\nolimits_{i = 1}^{n} {\left( {o_{i} - f_{i} } \right)^{2} } }}{{\sum\nolimits_{i = 1}^{n} {\left( {o_{i} - \overline{o} } \right)^{2} } }},$$
(5)

where \(f_{i}\) and \(o_{i}\) are, respectively, forecasts and observations, \(\overline{o}\) denotes the average of the observed data, and n stands for the number of comparisons.

The nRMSE is often applied to quantify the absolute error of estimates. The performance of a model is unsatisfactory if nRMSE exceeds 30% (Dettori et al. 2011). The PBIAS quantifies the model bias or systematic error. A positive (negative) PBIAS indicates the tendency of a modeling algorithm to overestimate (underestimate). The model forecast can be judged as satisfactory for absolute PBIAS values below 25% (Moriasi et al. 2007). The NSE is used to evaluate the relative error, and a value lower than 0.5 suggests unreliable model performance (Moriasi et al. 2007). When forecasts match observations exactly, nRMSE and PBIAS are equal to 0.0% and NSE is 1.0. In this study, forecasts with nRMSE < 30%, NSE > 0.5, and absolute PBIAS < 25% were deemed reliable.
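The three metrics in Eqs. (3) to (5) and the reliability thresholds stated above can be sketched as follows. The function names and the toy observation and forecast values are hypothetical; the formulas follow the equations as written.

```python
import math


def nrmse(obs, fc):
    """Normalized RMSE in percent, Eq. (3)."""
    n = len(obs)
    mean_obs = sum(obs) / n
    return 100.0 * math.sqrt(sum((o - f) ** 2 for o, f in zip(obs, fc)) / n) / mean_obs


def pbias(obs, fc):
    """Percent bias, Eq. (4); positive values indicate overestimation."""
    n = len(obs)
    mean_obs = sum(obs) / n
    return 100.0 / mean_obs * sum(f - o for o, f in zip(obs, fc)) / n


def nse(obs, fc):
    """Nash-Sutcliffe Efficiency, Eq. (5)."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - f) ** 2 for o, f in zip(obs, fc))
    return 1.0 - sse / sum((o - mean_obs) ** 2 for o in obs)


def is_reliable(obs, fc):
    """Thresholds used in this study: nRMSE < 30%, |PBIAS| < 25%, NSE > 0.5."""
    return nrmse(obs, fc) < 30.0 and abs(pbias(obs, fc)) < 25.0 and nse(obs, fc) > 0.5


# Toy precipitation totals in mm (made-up values).
observed = [100.0, 200.0, 300.0, 400.0]
forecast = [110.0, 190.0, 310.0, 390.0]
scores = (nrmse(observed, forecast), pbias(observed, forecast), nse(observed, forecast))
```

Note that a forecast can have a PBIAS near zero while still having a large nRMSE, because positive and negative errors cancel in Eq. (4); this is why the three metrics are evaluated jointly.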

Results and discussion

Predictors selection

Autumnal precipitation had a higher correlation with teleconnection indices in 1- to 3-months lagged periods (Fig. 4). Except for two indices, i.e., Tropical North Atlantic (TNA) and Atlantic Multidecadal Oscillation (AMO), the correlation coefficient was found to be insignificant for the 4- to 6-months lagged associations. Autumn precipitation showed a stronger association with the ENSO indices, such as the Southern Oscillation Index (SOI), Niño3.4, Sea Surface Temperature in four Niño regions (SSTas), Multivariate El Niño-Southern Oscillation Index (MEI), as well as Pacific Ocean Index (POI) in lead periods of 1 to 3 months (Fig. 4). Table 3 also displays that the indices quantifying ENSO activities, such as SOI, Niño3.4 and SSTas, were applied as three main teleconnection predictors to forecast autumn precipitation. The ENSO phasing has been shown to exert a strong impact on autumnal precipitation, particularly in western Iran (Helali et al. 2020; Nazemosadat and Cordery 2000; Nazemosadat and Ghasemi 2004; Nouri et al. 2017a). Winter precipitation was associated insignificantly (p > 0.05) with most teleconnection signals (Fig. 5). Nonetheless, winter precipitation showed a higher correlation with indices of Pacific Decadal Oscillation (PDO), and Mediterranean Sea Surface Temperature (MSST) in 4- to 6-months lead periods, as well as indices of Tropical South Atlantic (TSA), East Atlantic pattern (EA), SOI and Polar/Eurasia pattern (POL) in lagged times of 1–3 months. MSST and Tropical North and South Atlantic (TNA-TSA) were often considered as the main predictors of winter precipitation (Table 3). Unlike autumnal rainfall, sharp dry/wet epochs in different phases of ENSO are not anticipated in most regions in Iran, because ENSO impacts on winter precipitation are modulated by other teleconnections (Ghasemi and Khalili 2008; Nazemosadat and Ghasemi 2004).

Fig. 4 The correlation coefficient between autumn precipitation and teleconnection indices for 1- to 6-month lead times (the asterisk indicates a significant correlation coefficient at the 95% confidence level)

Fig. 5 The correlation coefficient between winter precipitation and teleconnection indices for 1- to 6-month lead times (the asterisk indicates a significant correlation coefficient at the 95% confidence level)

There was a significant association between spring precipitation and Pacific Ocean Index (POI), WPSST, Tripole Index for the Interdecadal Pacific Oscillation (TPI), and ENSO-related indices, including SOI, Niño4, Sea Surface Temperature in Niño4 region (SST4), and MEI (Fig. 6). Similar to autumnal precipitation, ENSO phasing seems to cause precipitation perturbations during spring, except for the 10–30th percentiles. The La Niña-triggered droughts are expected during springtime in western Iran (Ahmadi et al. 2019; Helali et al. 2021; Nouri and Homaee 2021a). However, ENSO anomalies do not explain precipitation variations occurring in hyper-arid/arid southern, eastern, and central Iran, the regions mostly classified in the 10–40th percentiles (Fig. 1). Table 3 indicates that the ENSO-related indices, such as SOI, Niño4, and SST4, are among three main teleconnection indices applied to forecast spring precipitation. For most clusters, annual precipitation was associated significantly with ENSO-related indicators, POI, TPI, and WPSST (Table 3 and Fig. 7). Overall, ENSO seems to be one of the key teleconnections affecting precipitation variabilities across the country (Table 3, and Figs. 4, 6, and 7).

Fig. 6 The correlation coefficient between spring precipitation and teleconnection indices for 1- to 6-month lead times (the asterisk indicates a significant correlation coefficient at the 95% confidence level)

Fig. 7 The correlation coefficient between annual precipitation and teleconnection indices for 1- to 6-month lead times (the asterisk indicates a significant correlation coefficient at the 95% confidence level)

Figure 8 shows Pearson’s correlation coefficient obtained for the ML algorithms across different clusters in the training and testing steps. The GRNN, MLP, LSSVM, and MLR had an average Pearson’s correlation coefficient of 0.85, 0.92, 0.99, and 0.82 in the training phase, and 0.65, 0.74, 0.48, and 0.74 in the testing step, respectively. The results show that while LSSVM had superior performance in the training phase (Fig. 8a, c, e, and g), its forecasting skill was severely impaired in the testing phase (Fig. 8b, d, f, and h). This discrepancy indicates a pronounced overfitting problem of LSSVM, despite a reasonable number of predictors included for training. In other words, LSSVM was too narrowly adjusted to the training dataset and could not effectively capture the underlying patterns in the unseen dataset. This has also been argued in the ML literature (Peng and Wang 2009; Wei et al. 2008).

Fig. 8 The Pearson’s correlation coefficient of the Generalized Regression Neural Network, Multi-Layer Perceptron, Multi-Linear Regression, and Least Squares Support Vector Machine algorithms obtained for different precipitation percentiles in training and testing sets

Machine learning modeling performance

The MLP and LSSVM had a reasonable forecasting skill for autumn precipitation in all clusters (Fig. 9b and c). However, the nRMSE of autumnal precipitation forecasted by GRNN and MLR exceeded 30% for the 10th and 40th percentiles (Fig. 9c). Consequently, GRNN and MLR did not forecast autumnal precipitation reliably for the regions located in the 10th and 40th percentiles. The GRNN algorithm exhibited poor performance in forecasting wintertime precipitation, as evidenced by a NSE value of less than 0.5. However, MLP, LSSVM, and MLR algorithms demonstrated reliable performance across all percentiles for forecasting winter precipitation, with nRMSE values below 30% and NSE values above 0.5 (Fig. 9e and f). The GRNN and MLR forecasted spring precipitation based on teleconnection indices unreliably (i.e., nRMSE > 30%) for the sites grouped in the 10th and 40th percentiles (Fig. 9i). However, MLP and LSSVM provided satisfactory forecasting results for springtime precipitation (i.e., nRMSE < 30% and NSE > 0.5) in all percentiles, except for the 10th percentile (Fig. 9h and i). The average nRMSE of annual precipitation forecasted by GRNN, MLP, LSSVM, and MLR was 15.8%, 8.0%, 6.7%, and 12.6%, respectively. However, the NSE values for GRNN-forecasted annual precipitation were found to be less than 0.5 for some clusters, indicating poor model performance (Fig. 9k). Therefore, MLP, LSSVM, and MLR performed acceptably for all percentiles on annual scale (Fig. 9k and l).

Fig. 9 The nRMSE, PBIAS, and NSE values of the Generalized Regression Neural Network (GRNN), the Multi-Layer Perceptron (MLP), the Multi-Linear Regression (MLR), and the Least Squares Support Vector Machine (LSSVM) for seasonal and annual precipitation across different clusters

The GRNN and MLR did not show a clear tendency to overestimate or underestimate seasonal and annual precipitation (Fig. 9a, d, g and j). The MLP algorithm, however, underestimated autumnal precipitation and overestimated spring and annual precipitation in most percentiles (Fig. 9a, g and j). The LSSVM also tended to overestimate annual and wintertime precipitation for the majority of clusters (Fig. 9d and j). It is noteworthy that the absolute values of PBIAS did not exceed 25% in any case (Fig. 9a, d, g and j), indicating acceptable bias errors of the studied ML alternatives in forecasting seasonal and annual precipitation.

Overall, LSSVM outperformed the other ML options in forecasting seasonal and annual precipitation. Except for spring precipitation of the areas in the 10th percentile, MLP and LSSVM gave reliable seasonal and annual precipitation forecasts (i.e., nRMSE < 30%, absolute PBIAS < 25%, and NSE > 0.5). Therefore, these algorithms are best suited to forecast precipitation using appropriate teleconnection signals listed in Table 3. The LSSVM has been identified as a robust algorithm for forecasting precipitation (Alizadeh and Farajzadeh 2018; Choubin et al. 2016; Tao et al. 2017). However, as discussed earlier, the LSSVM algorithm may experience overfitting, resulting in unsatisfactory precipitation modeling for regions not included in the training dataset. Therefore, caution should be exercised when applying LSSVM to forecast precipitation for unseen data. However, MLP provided a more balanced performance for training and testing steps (Fig. 8). This demonstrates that the MLP is less susceptible to overfitting, which is a critical consideration when selecting an appropriate forecasting model. Therefore, in our study area, MLP is a preferable choice for precipitation forecasting compared to LSSVM.

The performance of ML models is generally inferior in the regions corresponding to the 10–40th percentiles, as compared to the sites grouped in the 40–100th percentiles (Fig. 9). For instance, the average nRMSE of autumn precipitation forecasts provided by GRNN, MLP, LSSVM, and MLR was 37.5%, 23.4%, 20.0%, and 34.9% in the 10–40th percentiles, and 15.3%, 15.8%, 13.4%, and 21.4% in the 40–100th percentiles, respectively. As stated previously, the regions corresponding to the 10–40th percentiles are mostly situated in eastern and southern hyper-arid and arid areas (Fig. 1). These results indicate that the forecasting skill of the ML options is relatively lower in hyper-arid and arid areas. Precipitation modeling in hyper-arid/arid areas seems to be a challenging task. This can be ascribed to the nature of the precipitation process in these environments, which is erratic, uneven and sparsely distributed (Al-Rawas and Valeo 2009; Altwegg and Anderson 2009; Attum et al. 2014). This renders these water-limited regions susceptible to flood and drought risks (Dai 2011). Moreover, precipitation may occur aloft without being detected by rain gauges. This can be attributed to sub-cloud evaporation, i.e., the evaporation of raindrops before they reach the land surface owing to the high atmospheric evaporative power in hyper-arid/arid areas (Dinku et al. 2011; Salamalikis et al. 2016; Wang et al. 2022a). This imposes high uncertainties on remotely-sensed precipitation products, and also on teleconnection-based forecasts (Chen et al. 2020; Dinku et al. 2011). Given the relatively high forecasting skill of LSSVM and MLP in the 10–40th percentiles, these modeling approaches seem to be of much use for drought/flood risk management in the hyper-arid/arid areas studied.

The strength of the correlation between precipitation and teleconnections may change over time (Douville et al. 2017; Kamil et al. 2019; Nouri and Homaee 2020). For instance, Nouri and Homaee (2020) and Kamil et al. (2019) reported a sudden increase in the correlation coefficient between precipitation and ENSO from the 1980s onward in central southwest Asia. As a result, ENSO-triggered droughts occurred more frequently in the twenty-first century than in the mid-twentieth century. It is worth noting that three devastating dry episodes in Iran occurred in the La Niña years of 1999–2001, 2007–2009, and 2010–2012 (Nouri and Homaee 2020; Trigo et al. 2010). Therefore, a change in the study period might affect the performance of the ML algorithms by altering the strength of the association between precipitation and teleconnections.
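One simple way to inspect whether such an association drifts over time is a moving-window correlation. The sketch below is illustrative only: the function name, the window length, and the synthetic series are assumptions, and it is not the procedure used in the studies cited above.

```python
import numpy as np

def sliding_correlation(index, precip, window):
    """Pearson correlation between two series in each consecutive window."""
    index = np.asarray(index, float)
    precip = np.asarray(precip, float)
    if index.shape != precip.shape:
        raise ValueError("series must have the same length")
    if window < 3 or window > index.size:
        raise ValueError("window must be between 3 and the series length")
    out = np.empty(index.size - window + 1)
    for start in range(out.size):
        sl = slice(start, start + window)
        out[start] = np.corrcoef(index[sl], precip[sl])[0, 1]
    return out

# Synthetic example: the sign of the relationship flips halfway through.
teleconnection = np.sin(np.arange(40))
precipitation = np.concatenate([-teleconnection[:20], teleconnection[20:]])
r = sliding_correlation(teleconnection, precipitation, window=20)
print(r[0], r[-1])  # correlation in the first and in the last window
```

A marked change in the windowed coefficient across the record would suggest that a model trained on one period may not transfer well to another.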

Precipitation is a complex phenomenon influenced by several environmental factors, such as atmospheric conditions, geography, and topography (Kumari et al. 2016). Precipitation is a conditional variable, meaning that it hinges on certain circumstances being met before it occurs (Zhang et al. 2016b). This adds an extra layer of complexity to precipitation forecasting, particularly in complex terrains. The ML algorithms can learn these relationships from historical data to forecast future precipitation. However, the above-mentioned complexities can impede the ability of ML algorithms to capture all the relevant information. Clustering identifies the underlying patterns and relationships that may be obscured in unstructured data (Ghorayeb et al. 2022; Kömüşcü et al. 2022; Mateo et al. 2013; Yang et al. 2022). In the current study, this technique facilitated the grouping of similar precipitation data into different subsets, enabling the identification of regions with similar precipitation patterns as recommended in the literature (Awan et al. 2015; Dehghan et al. 2018; Kömüşcü et al. 2022; Kumari et al. 2016). In addition, clustering can be applied to identify outliers or anomalies (Krleža et al. 2020; Mateo et al. 2013). In particular, we observed that some sites located in the southern and southeastern regions exhibited sudden spikes in springtime precipitation. However, since such patterns are typical for the corresponding clusters, we considered these data as anomalies rather than outliers. On the other hand, we also identified some stations with precipitation spikes that were not typical for a given cluster. Consequently, we considered these spikes as outliers and removed such stations from our study area. Overall, the application of clustering can significantly enhance the forecasting skill of ML models by reducing data complexity and by detecting patterns and outliers/anomalies.
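The idea of grouping sites by precipitation percentile can be sketched as follows. This is a hypothetical illustration: the function name, the class boundaries (10th and 40th percentiles), and the use of a single precipitation value per station are assumptions made for the example and do not reproduce the clustering procedure of this study.

```python
import numpy as np

def percentile_groups(values, percentiles=(10, 40)):
    """Assign each station to a class bounded by the given percentiles.

    Returns 0 for values at or below the first boundary, 1 for values
    between the first and second boundary, and so on.
    """
    values = np.asarray(values, float)
    edges = np.percentile(values, percentiles)  # class boundaries in data units
    return np.digitize(values, edges, right=True)

# Illustrative station values only (e.g., mean annual precipitation in mm).
station_precip = np.arange(1, 101, dtype=float)
groups = percentile_groups(station_precip)
print(np.bincount(groups))  # number of stations in each class
```

Forecasting models could then be trained and evaluated separately for each class, so that dry and wet regimes are not mixed in a single sample.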

Precipitation forecasting plays a crucial role in various fields, including agriculture. The availability of rainfall during the autumn and spring seasons is of utmost importance for dry farming in Iran. According to Nouri et al. (2017a), precipitation shortage during October to December (OND) and March to May (MAM) can cause crop failure in dry farming in Iran. Specifically, autumnal dry spells jeopardize crop establishment and ultimately result in crop failure under rainfed conditions (Nouri et al. 2017a, b). Precipitation forecasts can also help the government to make well-informed decisions on food security, such as importing cereal crops, expanding croplands, and providing agricultural insurance. Overall, the forecasting methods described in this study can help decision-makers adopt proactive agricultural risk management in Iran. Furthermore, as climate change affects precipitation patterns by altering global teleconnection patterns, the forecasting frameworks can be used to design adaptation strategies in our study area.

As for water resources management, the results can provide valuable insights for policymakers seeking to ensure equitable distribution of water resources, particularly in densely populated water-scarce areas in Iran. By utilizing precipitation forecasts, water managers can make informed decisions on how to manage water resources, including increasing water releases from reservoirs to create additional storage capacity based on expected inflow, or implementing water conservation measures to preserve water supplies during drought (Pattanaik and Das 2015; Ziervogel et al. 2010). The absence of reliable precipitation forecasts can have severe implications, as demonstrated by the disastrous flood in southwestern Iran during the spring of 2019 (Dezfuli 2020; Khosravi et al. 2020). In this event, the water held behind large dams constructed on the Dez and Karkheh rivers was not released in a timely manner, leading to severe flooding that inflicted substantial damage to the environment, infrastructure, and agriculture. Thus, precipitation forecasts can facilitate the development of early warning systems for floods and droughts and inform water management strategies to mitigate weather-related disasters. It is noteworthy that robust flood early warning systems require forecasts of low-frequency precipitation events on hourly and daily scales, as well as rare extreme events such as atmospheric rivers (Dezfuli 2020). These forecasts can aid decision-makers in identifying and mapping flood-prone areas, and can ultimately help to mitigate the impacts of floods. Therefore, we recommend that future research efforts be directed towards improving the low-frequency precipitation forecasts across Iran. This may involve developing new ML algorithms, as well as enhancing our understanding of the large-scale atmospheric anomalies.

In the present study, point data were used for precipitation forecasting. However, in areas with limited data availability, gridded precipitation products offer several advantages, such as spatial continuity, long-term coverage, and access to a broader range of precipitation characteristics (Baatz et al. 2021; Nouri 2023; Valmassoi et al. 2022). Therefore, we recommend analyzing the relationship between precipitation and teleconnections using gridded products. We also suggest further investigations on forecasting other precipitation characteristics, such as precipitation type (e.g., snowfall), seasonality, and extremes, using teleconnection signals in our study area.

Conclusions

We evaluated the precipitation forecasting skills of four machine learning (ML) approaches, i.e., the Generalized Regression Neural Network (GRNN), the Multi-Layer Perceptron (MLP), the Multi-Linear Regression (MLR), and the Least Squares Support Vector Machine (LSSVM), using a large number of teleconnection indices across different precipitation regimes in Iran. Precipitation percentiles were defined to cluster the precipitation regimes. The El Niño-Southern Oscillation (ENSO) indices were selected as predictors in most cases, indicating the influence of ENSO phase on precipitation patterns in Iran, particularly during autumn and spring. The LSSVM and MLP provided more reliable seasonal and annual precipitation forecasts than MLR and GRNN. However, as LSSVM was more susceptible to overfitting, MLP seems better suited to forecasting precipitation in the surveyed areas. Nonetheless, all ML algorithms showed weaker performance in the 10th and 40th percentiles, encompassing the southern and eastern hyper-arid and arid sites. Our results indicate that clustering precipitation regimes is a necessary step to overcome the spatial inconsistency frequently observed when investigating the association between precipitation and teleconnection anomalies. The findings can be of much use for developing proactive drought/flood risk management plans, which are essential for maintaining food and water security in Iran.