1 Introduction

Drought is a natural phenomenon that affects climatic, hydrological, and environmental systems, all of which have socioeconomic impacts (Vicente-Serrano et al. 2020). It is generally caused by meteorological irregularities, such as spells of below-normal rainfall, that result in water shortages at particular points of the water cycle or throughout the cycle (McKee et al. 1993). The world's demand for scarce water resources has been rising rapidly, challenging their availability for food production and other vital purposes and putting sustainable development at risk (Kushwaha et al. 2016; Abd-Elaty et al. 2022a). As one of the primary water consumers upon which a growing population depends, agriculture competes with household, industrial, and environmental uses for these scarce water resources (Suwal et al. 2020; Kuriqi et al. 2020b; Kushwaha et al. 2022a). Terrestrial evapotranspiration is essential to a wide range of atmospheric, hydrospheric, and biospheric processes because it links the water, energy, and carbon cycles. These processes play an important role in the hydrologic cycle, flora and fauna community dynamics, and many biochemical processes (Kuriqi et al. 2020a; Abd-Elaty et al. 2022b). Several aspects of sustainable water resources management depend upon the precise estimation of terrestrial evapotranspiration. Even though the relative importance of natural environmental change and anthropogenic activities differs among regions (Anand et al. 2018; Elbeltagi et al. 2022a), the impact of climate change on the hydrologic cycle has been characterized and recognized across most global regions (Kushwaha et al. 2022b). It is widely accepted among scientific communities and policymakers that climate change significantly influences water availability, especially in arid and semi-arid regions (Mao et al. 2015; Elbeltagi et al. 2020, 2021).
Many models have shown that terrestrial evapotranspiration varies in magnitude at both temporal and spatial scales (Chen et al. 1996; Wang and Dickinson 2012). In general, all satellite-based products show a significant increase in terrestrial evapotranspiration over the past three decades (Rodell et al. 2009; Zeng et al. 2018). Empirical methods such as the Penman-Monteith equation are the most commonly applied in practice, mainly because they are simple to apply.

Such empirical approaches are, however, accurate only when the vegetation is not water-stressed and the net radiation and stomatal resistance are available (Tao et al. 2018; Elbeltagi et al. 2022b). Various other approaches make use of recent satellite data. Indeed, among the many terrestrial evapotranspiration estimation methods, satellite remote sensing has proved to be one of the most promising ways of mapping terrestrial evapotranspiration over larger areas (Yang et al. 2013). However, both approaches, i.e., empirical and satellite-based methods, fail to monitor long-term global terrestrial evapotranspiration (Wang et al. 2010; Anand et al. 2018). Therefore, robust estimation of the long-term variability of terrestrial evapotranspiration requires standard meteorological data supplemented with high-resolution satellite data (Chen et al. 1996; Wang et al. 2010). Despite improvements in the spatial resolution of meteorological observations, it is worth exploring methods that estimate terrestrial evapotranspiration not only from remotely sensed information but also from data-driven techniques combined with remote sensing data (Yang et al. 2013). This approach would considerably reduce the input effort and increase the robustness of satellite-based terrestrial evapotranspiration models while helping policymakers establish sustainable water resources management strategies. Several studies have estimated evapotranspiration using data-driven techniques (Chen et al. 1996; Muhammad Adnan et al. 2020; Alizamir et al. 2020). Nevertheless, to the best of the author's knowledge, there is a lack of studies that combine data-driven techniques with satellite-based observations to estimate terrestrial evapotranspiration over complex river basins similar to the one considered in this study.
India has a large number of underprivileged, drought-prone, and economically backward regions but no effective system for monitoring and assessing drought to support better management strategies and policies, and existing studies overstate meteorological trends in space and time. An effective adaptation and mitigation strategy for drought in a country like India will depend on identifying a reliable methodology for calculating the drought index.

The SPI may be calculated over any period between 1 and 72 months. Based on statistical evidence, the best practical range of application is 1–24 months (Poornima and Pushpalatha 2019). This 24-month cutoff is based on Guttman’s recommendation of having around 50–60 years of data available. Without 80–100 years of data, statistical analysis of the tails (both wet and dry extremes) is not reliable. An SPI over periods shorter than one month, i.e., daily or weekly, can be calculated in theory, but doing so in practice is not recommended: even in non-arid climates, one is likely to encounter many dry days (0.00 rainfall) that cause the SPI to behave very erratically (Thomas et al. 2015). An averaging period of at least four weeks is therefore highly recommended. Nonetheless, updating the SPI daily or weekly over a moving window of 1 to 24 months is acceptable. Accordingly, we used the 3-, 6- and 12-month SPI in this study.

In light of the above facts, to the author's knowledge, no study has been carried out to utilize the capability of the Hybrid Random SubSpace model with other machine learning algorithms for predicting extreme droughts in the studied area in India. The primary objective of this work was to develop a new model for predicting extreme droughts based on the SPI. SPI is used in this study because it combines meteorological variables [i.e., precipitation (P) and potential evapotranspiration (PET)] and shows drought better than any other index (Ali et al. 2019).

The present study was conducted to examine the feasibility and effectiveness of the Random Subspace (RSS) method for estimating the 3-, 6- and 12-month SPI in India during 2000–2019. Several candidate inputs were constructed, and best subset regression was applied to select the most effective variables as inputs to the developed models. The study aims to create hybrid data-driven models coupled with RSS using the stacking hybridization technique and to evaluate their performance. The proposed drought prediction models can help in agricultural, hydrological, and meteorological applications such as irrigation scheduling, crop simulation, water budgeting, reservoir operations, and weather forecasting. The main objectives of this study are as follows: (1) to develop machine learning models based on 20 years of rain gauge data; (2) to evaluate the performance of ML algorithms for drought prediction under arid climatic conditions; and (3) to compare the ML models and provide recommendations for drought prediction. The remaining sections of the manuscript are organized as follows: Sect. 2 presents introductory information about the study site, data collection and methodology; Sect. 3 summarizes the main findings of the study; Sect. 4 discusses the practical implications of the main findings; and Sect. 5 summarizes the main conclusions of the study.

2 Materials and methods

2.1 Study area and data acquisition

The study site, Jaisalmer, is the largest district of Rajasthan (India), situated at a latitude of 26°55′ N and a longitude of 70°55′ E (Fig. 1). Jaisalmer is also nicknamed “The Golden City”. The study site lies at the heart of the Thar Desert (the Great Indian Desert). It falls within an arid region and is prone to temperature extremes. The temperature varies greatly from day to night in both summer and winter. The maximum summer temperature is around 49 °C, while the minimum is 25 °C. Winter temperatures vary from 23.6 to 5 °C. Rainfall is scanty, averaging 201 mm per year over about 12 rainy days. In the present study, 20 years (2000–2019) of rainfall data were used for SPI evaluation. The SPI estimation models used 70 per cent of the data for training and 30 per cent for testing.
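The 70/30 partition of the record can be sketched as follows. This is a minimal illustration that assumes the split is chronological (earlier months for training, later months for testing), which the text does not state; the function name `split_series` and the placeholder series are hypothetical.

```python
def split_series(values, train_fraction=0.7):
    """Split a time-ordered sequence into training and testing parts."""
    cut = int(round(len(values) * train_fraction))
    return values[:cut], values[cut:]

# 20 years of monthly values (2000-2019) give 240 records.
monthly_spi = [0.0] * 240          # placeholder values, not real data
train, test = split_series(monthly_spi)
print(len(train), len(test))       # 168 72
```

A chronological split avoids using information from later months when predicting earlier ones; a random split would be a different design choice.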

Fig. 1 Location map of the study station

2.2 Methodology

2.2.1 SPI description and calculation

Various indices may be used to analyse and monitor droughts across a large area, each with its own strengths and weaknesses. The SPI, the most widely used and accepted index, is based on the premise that a decrease in precipitation relative to normal precipitation is the major predictor of droughts (McKee et al. 1993). The SPI is usually calculated over timescales of 1, 3, 6, 9, and 12 months, and the drought intensity is defined based on the calculated SPI values.
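As a brief illustration of the timescales mentioned above, the sketch below accumulates monthly precipitation over a moving k-month window before any distribution is fitted. It assumes the usual convention that a k-month SPI is based on the total of the current month and the preceding k − 1 months; the function name `rolling_totals` and the sample values are hypothetical.

```python
def rolling_totals(monthly_precip, k):
    """Return k-month running totals; the first k-1 months have no value."""
    totals = []
    for i in range(len(monthly_precip)):
        if i + 1 < k:
            totals.append(None)                       # window not yet full
        else:
            totals.append(sum(monthly_precip[i + 1 - k:i + 1]))
    return totals

rain = [0.0, 5.0, 60.0, 90.0, 40.0, 2.0]   # illustrative monthly totals in mm
print(rolling_totals(rain, 3))   # [None, None, 65.0, 155.0, 190.0, 132.0]
```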

The probability density function of the total precipitation is used to calculate the SPI. This is done separately for every month and every location. The resulting probabilities are then transformed into the standard normal distribution. The gamma distribution is expressed by the following probability density function:

$$g\left( x \right) = \frac{1}{{\beta ^{\alpha } \Gamma \left( \alpha \right)}}x^{{\alpha - 1}} e^{{ - x/\beta }}$$
(1)

Here, α is the shape parameter, β is the scale parameter, x is the amount of rainfall, and \(\Gamma \left(\mathrm{\alpha }\right)\) is the gamma function. Both α and β are greater than zero. The gamma function \(\Gamma \left(\mathrm{\alpha }\right)\) can be expressed as follows:

$$\Gamma \left(\mathrm{\alpha }\right)= {\int }_{0}^{\infty }{y}^{\alpha -1}{e}^{-y}dy$$
(2)

The parameters α and β must be estimated to fit the gamma distribution. They are obtained from the following maximum likelihood solutions:

$$\widehat{\mathrm{\alpha }}= \frac{1}{4A}\left(1+\sqrt{1+\frac{4A}{3}}\right)$$
(3)
$$\widehat{\upbeta }= \frac{\overline{X}}{\widehat{\mathrm{\alpha }} }$$
(4)

where

$$\mathrm{A}=\mathrm{ln}\left(\overline{X }\right)-\frac{\sum \mathrm{ln}\left(X\right)}{n}$$
(5)
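The calculation described by Eqs. (1) to (5) can be sketched in a few lines. This is an illustrative sketch rather than the procedure implemented in any particular software: it assumes that zero totals are handled with the common mixed distribution H(x) = q + (1 − q)G(x), where q is the share of zero totals (an assumption not stated in the text), and the function names and sample values are hypothetical.

```python
import math
from statistics import NormalDist, fmean

def fit_gamma(precip):
    """Estimate shape (alpha) and scale (beta) from Eqs. (3)-(5)."""
    positive = [p for p in precip if p > 0]        # ln() needs positive totals
    mean = fmean(positive)
    a = math.log(mean) - fmean([math.log(p) for p in positive])   # Eq. (5)
    alpha = (1.0 + math.sqrt(1.0 + 4.0 * a / 3.0)) / (4.0 * a)    # Eq. (3)
    beta = mean / alpha                                           # Eq. (4)
    return alpha, beta

def gamma_cdf(x, alpha, beta):
    """Cumulative gamma probability via the series for the incomplete gamma function."""
    if x <= 0:
        return 0.0
    z = x / beta
    term = 1.0 / alpha
    total = term
    n = 1
    while term > 1e-15 * total:
        term *= z / (alpha + n)
        total += term
        n += 1
    return total * math.exp(alpha * math.log(z) - z - math.lgamma(alpha))

def spi(precip):
    """Transform accumulated precipitation totals into SPI values."""
    alpha, beta = fit_gamma(precip)
    q = sum(1 for p in precip if p <= 0) / len(precip)   # assumed zero-rain share
    values = []
    for p in precip:
        h = q + (1.0 - q) * gamma_cdf(p, alpha, beta)
        h = min(max(h, 1e-6), 1.0 - 1e-6)                # keep inside (0, 1)
        values.append(NormalDist().inv_cdf(h))           # standard normal quantile
    return values

# Illustrative totals (mm) for one calendar month across ten years.
totals = [12.0, 48.5, 0.0, 95.2, 30.1, 61.7, 7.4, 150.3, 22.8, 40.0]
print([round(v, 2) for v in spi(totals)])
```

Negative values indicate drier-than-usual conditions and positive values wetter-than-usual conditions for that month and location.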

The SPI has several strengths: it is flexible, as it can be computed for multiple timescales; shorter-timescale SPIs, for example 1-, 2- or 3-month SPIs, can provide early warning of drought and help assess drought severity; it is spatially consistent, allowing comparisons between different locations in different climates; and its probabilistic nature gives it historical context, which is well suited for decision-making. It also has limitations: it is based only on precipitation and has no soil water-balance component, so no ratios of evapotranspiration to potential evapotranspiration (ET/PET) can be calculated.

The present study applied DrinC (Drought Indices Calculator) to calculate the SPI (3, 6 and 12 months), and the calculated SPI was used as the reference value for assessing the performance of the hybrid metaheuristic algorithms, i.e., Random Subspace (RSS), M5P, Random Forest (RF) and Random Tree (RT). DrinC was developed at a laboratory of the National Technical University of Athens (Tigkas et al. 2015).

2.2.2 Machine learning models

2.2.2.1 Random subspace

The random subspace method (RSM) is an ensemble learning technique introduced by Ho (1998) to enhance the performance of weak learners and improve on the accuracy of individual learners (Pham et al. 2018). It builds a set of feature subspaces by random sampling and then trains the base learners on them. The models are trained in parallel, and their individual outputs are aggregated into the final result (Dong et al. 2020). The most important parameters of this model are the number of seeds and the number of iterations, whose optimal values are usually found by trial and error (Mosavi et al. 2020). RSM modifies the training data, but the modification is applied in the feature space. When the data contain many redundant features, a better model can be found in the random subspaces than in the original feature space. The random subspace method performs best when there are a large number of features and the discriminative information is spread across them. On the other hand, when there are few informative features and the data are noisy, the method tends to underperform (Sammut and Webb 2017). Figure 2 shows the block diagram of the random subspace model.

Fig. 2 Block diagram of random subspace model
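The random subspace idea can be sketched as follows: each ensemble member sees only a randomly drawn subset of the input features, and the member predictions are averaged. This is a minimal sketch, not the configuration used in this study; the k-nearest-neighbour base learner, the function names, and all parameter values are assumptions chosen to keep the example short.

```python
import random
from statistics import fmean

def knn_predict(train_X, train_y, x, features, k=3):
    """Average the targets of the k nearest rows, measured on `features` only."""
    ranked = sorted(
        (sum((row[j] - x[j]) ** 2 for j in features), y)
        for row, y in zip(train_X, train_y)
    )
    return fmean([y for _, y in ranked[:k]])

def draw_subspaces(n_features, n_learners=10, subspace_fraction=0.5, seed=1):
    """Draw one random feature subset per ensemble member."""
    rng = random.Random(seed)
    size = max(1, round(subspace_fraction * n_features))
    return [sorted(rng.sample(range(n_features), size)) for _ in range(n_learners)]

def predict_random_subspace(subspaces, train_X, train_y, x):
    """Aggregate (average) the predictions of all members."""
    return fmean([knn_predict(train_X, train_y, x, feats) for feats in subspaces])

# Illustrative data: four input features, target driven by the first one.
train_X = [[float(i), float(i % 3), float((i * 7) % 5), float((i * 3) % 4)]
           for i in range(20)]
train_y = [2.0 * i for i in range(20)]
subspaces = draw_subspaces(n_features=4)
print(predict_random_subspace(subspaces, train_X, train_y, [10.0, 1.0, 0.0, 2.0]))
```

Members whose subspace omits the informative feature predict poorly on their own, which is why averaging over many subspaces matters.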

2.2.2.2 M5P

The M5P algorithm is a reconstruction of Quinlan's M5 algorithm (1992). In M5P, linear regression functions are incorporated into the leaf nodes of a conventional decision tree. To construct the tree, M5P uses a decision tree induction algorithm whose splitting condition minimizes the intra-subset variation on each branch. The algorithm has four main stages. In the first stage, M5P splits the input space into multiple subspaces, which are used to build the tree. The splitting criterion is the standard deviation reduction (SDR) factor, which minimizes the intra-subspace variability; choosing the split with the largest SDR maximizes the expected error reduction at the node (Wang and Witten 1996). In the second stage, a linear regression model is developed for each subspace using the corresponding sub-dataset. In the third stage, the tree's nodes are pruned to avoid overfitting, and in the final stage the discontinuities induced by the pruning process are removed using a smoothing procedure (Melesse et al. 2020). The key advantage of M5P is that it handles large, high-dimensional datasets efficiently. M5P has also been found to be robust when dealing with missing data (Behnood et al. 2017).
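The splitting criterion can be illustrated with a short sketch. It uses the standard M5 definition, SDR = sd(T) − Σ(|Tᵢ|/|T|)·sd(Tᵢ), where T is the set of target values reaching a node and Tᵢ are the subsets produced by a candidate split; the function names and sample values are hypothetical, and the sketch covers only the choice of a single split, not the regression, pruning, or smoothing stages.

```python
from statistics import pstdev

def sdr(parent, subsets):
    """Standard deviation reduction achieved by splitting `parent` into `subsets`."""
    n = len(parent)
    return pstdev(parent) - sum(len(s) / n * pstdev(s) for s in subsets)

def best_split(x, y):
    """Scan the candidate thresholds of one attribute and keep the largest SDR."""
    pairs = sorted(zip(x, y))
    best_threshold, best_gain = None, float("-inf")
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue                                  # no threshold between equal values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [t for _, t in pairs[:i]]
        right = [t for _, t in pairs[i:]]
        gain = sdr(y, [left, right])
        if gain > best_gain:
            best_threshold, best_gain = threshold, gain
    return best_threshold, best_gain

x = [1, 2, 3, 10, 11, 12]                # illustrative attribute values
y = [1.0, 1.1, 0.9, 5.0, 5.2, 4.8]       # illustrative targets
print(best_split(x, y))
```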

2.2.2.3 Random Forest

Random forest (RF) is an ensemble method that uses several decision trees in parallel with the bagging (bootstrapping followed by aggregation) approach. RF was introduced by Breiman (2001) and has been used in numerous studies. Bootstrapping means that the individual decision trees are trained in parallel on different subsets of the input training dataset. This reduces the overall variance of the model and produces accurate results. To make the final decision in RF, the decisions of the individual trees are aggregated, ensuring better generalization (Misra and Li 2020). In general, deep decision trees suffer from overfitting, and random forest prevents this by generating random subsets of features and building smaller trees using those subsets. A random forest's generalization error depends on the strength of the individual trees and the correlation between them. Several studies have demonstrated that random forest models can effectively handle small sample sizes and complex data in both classification and regression problems (Biau and Scornet 2015). One of the key characteristics of random forest is its resilience to overfitting: provided there are enough trees, the model generalizes well. One major downside is that the algorithm becomes slower when the number of trees is very high. The steps involved in solving a classification/regression problem using random forest are shown in Fig. 3.

Fig. 3 Steps involved in solving a classification/regression problem using random forest
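The bagging steps (bootstrap sampling, random feature subsets, aggregation) can be sketched as follows. To stay short, this sketch replaces full decision trees with one-split regression stumps, which is a deliberate simplification and not how a real random forest grows its trees; the function names, parameter values, and data are hypothetical.

```python
import random
from statistics import fmean

def fit_stump(X, y, features):
    """Best single split (lowest squared error) over the given features."""
    best = None
    for j in features:
        for t in sorted({row[j] for row in X})[:-1]:   # keep both sides non-empty
            left = [yi for row, yi in zip(X, y) if row[j] <= t]
            right = [yi for row, yi in zip(X, y) if row[j] > t]
            ml, mr = fmean(left), fmean(right)
            err = sum((v - ml) ** 2 for v in left) + sum((v - mr) ** 2 for v in right)
            if best is None or err < best[0]:
                best = (err, j, t, ml, mr)
    if best is None:                                   # no valid split: predict the mean
        m = fmean(y)
        return lambda row: m
    _, j, t, ml, mr = best
    return lambda row: ml if row[j] <= t else mr

def fit_forest(X, y, n_trees=25, n_sub_features=2, seed=0):
    """Train stumps on bootstrap samples with random feature subsets."""
    rng = random.Random(seed)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]    # bootstrap sample
        feats = rng.sample(range(len(X[0])), n_sub_features)    # random features
        trees.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return lambda row: fmean([tree(row) for tree in trees])     # aggregation

# Illustrative data: the target steps from 0 to 10 along the first feature.
X = [[float(i), float(i % 3), float((i * 7) % 5)] for i in range(30)]
y = [0.0 if i < 15 else 10.0 for i in range(30)]
forest = fit_forest(X, y)
print(forest([2.0, 0.0, 0.0]), forest([28.0, 0.0, 0.0]))
```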

2.2.2.4 Random Tree

Random tree (RT) is a machine learning algorithm that combines ideas from single model trees and random forests. RT uses a bagging approach to produce random sets of data, so that many individual learners are available to handle regression and classification problems (Kalmegh 2015). Since RT combines bagging and RF methods, it generates a final prediction by aggregating the predictions of multiple individual trees. The RT algorithm starts by splitting the dataset into subspaces and fitting a constant to each subspace. In classification, the algorithm takes the input feature vector and classifies it with each tree in the forest. A single tree model tends to perform poorly, but bagged RT shows high accuracy. RTs offer increased flexibility and enhanced training capability in a number of real-world problems (Nhu et al. 2020). Additionally, RT maintains accuracy even when the model complexity changes. The linear models at the single leaves of the trees can be optimized via replacement methods or by splitting the attributes at every node to identify the best way to split a subset during optimization (Khosravi et al. 2019).

2.3 Hybridization of machine learning algorithms using stacked generalization

The SPI was predicted using stacking hybrid algorithms in this study. The stacking technique was proposed by Wolpert (1992). During training, it provides a framework for ensemble algorithms that combine two or more algorithms. According to previous studies (Healey et al. 2018; Rahman et al. 2021), stacking hybrid algorithms can improve predictive performance. The idea behind stacked generalization is to train first-level learners on the training dataset and use them to generate predictions. The predictions of the first-level learners are then combined to create a new training dataset for the meta-learner. Sikora et al. (2015) and Zhou (2009) provide more details on stacked generalization.
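The two-level structure of stacked generalization can be sketched as follows. This is an illustrative sketch under several assumptions: the two first-level learners (a straight-line fit and a nearest-neighbour lookup on a single input), the out-of-fold scheme, and the one-weight blending meta-learner are all hypothetical choices made for brevity and are not the learners used in this study.

```python
from statistics import fmean

def fit_line(xs, ys):
    """First-level learner 1: least-squares straight line."""
    mx, my = fmean(xs), fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return lambda x: my + slope * (x - mx)

def fit_nearest(xs, ys):
    """First-level learner 2: value of the nearest training point."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

def out_of_fold(fit, xs, ys, folds=5):
    """Predict every sample with a model that never saw it during fitting."""
    preds = [0.0] * len(xs)
    for f in range(folds):
        train = [i for i in range(len(xs)) if i % folds != f]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        for i in range(f, len(xs), folds):
            preds[i] = model(xs[i])
    return preds

def fit_stack(xs, ys, base_fits=(fit_line, fit_nearest)):
    """Meta-learner: least-squares blending weight on the out-of-fold predictions."""
    a, b = (out_of_fold(fit, xs, ys) for fit in base_fits)
    den = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    num = sum((ai - bi) * (yi - bi) for ai, bi, yi in zip(a, b, ys))
    w = min(1.0, max(0.0, num / den)) if den else 0.5
    models = [fit(xs, ys) for fit in base_fits]        # refit on all training data
    return lambda x: w * models[0](x) + (1.0 - w) * models[1](x)

xs = [float(i) for i in range(20)]      # illustrative input
ys = [2.0 * x + 1.0 for x in xs]        # illustrative target
stacked = fit_stack(xs, ys)
print(stacked(25.0))
```

Using out-of-fold predictions keeps the meta-learner from simply rewarding the first-level learner that memorizes the training data best.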

2.4 Best subset regression and sensitivity analysis

2.4.1 Input selection using best subset model for the SPI 3, 6, and 12 months of a selected station

Best subsets regression is an exploratory model-building regression analysis. It compares all possible models that can be created from an identified set of predictors and aims to find a small subset of predictors for which the resulting linear model has the most desirable prediction accuracy. Selecting the best subset combination of input data is one of the most crucial steps in developing a predictive model of the multi-step-ahead SPI drought index. Based on ten inputs, Table 1 shows the results for (a) SPI-3, (b) SPI-6, and (c) SPI-12 for predicting the lag-time SPI. Seven statistical criteria were used to select the best combination of inputs, i.e., MSE, coefficient of determination (R2), adjusted R2, Mallows' Cp, Akaike's AIC, Schwarz's SBC, and Amemiya's PC. The best input combination is displayed in the bold blue row; it provides the lowest values of MSE, Mallows' Cp, Akaike's AIC, Schwarz's SBC, and Amemiya's PC, and the highest values of R2 and adjusted R2. For SPI-3, the best input combination consists of 5 variables, i.e., SPI-1, SPI-3, SPI-4, SPI-8, and SPI-9. It provides MSE of 0.490, R2 of 0.443, adjusted R2 of 0.430, Mallows' Cp of 1.885, Akaike's AIC of − 155.168, Schwarz's SBC of − 134.645, and Amemiya's PC of 0.582. For SPI-6, the best input combination consists of 5 variables, i.e., SPI-1, SPI-3, SPI-6, SPI-7, and SPI-9. It gives MSE of 0.384, R2 of 0.619, adjusted R2 of 0.610, Mallows' Cp of 2.557, Akaike's AIC of − 207.598, Schwarz's SBC of − 187.155, and Amemiya's PC of 0.398. For SPI-12, the best input combination consists of 3 variables, i.e., SPI-1, SPI-2, and SPI-10. It has MSE of 0.148, R2 of 0.855, adjusted R2 of 0.853, Mallows' Cp of − 1.745, Akaike's AIC of − 411.117, Schwarz's SBC of − 397.597, and Amemiya's PC of 0.149.

Table 1 The best subset regression analysis for determining the best input combinations
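The exhaustive search behind best subsets regression can be sketched as follows. The sketch fits an ordinary least-squares model for every combination of candidate inputs and ranks them by AIC; it assumes the common least-squares forms AIC = n·ln(SSE/n) + 2p and SBC = n·ln(SSE/n) + p·ln(n), with p counting the intercept, which may differ from the definitions used by the software behind Table 1. All function names and data are hypothetical.

```python
import math
from itertools import combinations

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def ols_sse(X, y, cols):
    """Sum of squared errors of a linear model (with intercept) on columns `cols`."""
    rows = [[1.0] + [r[c] for c in cols] for r in X]
    p = len(cols) + 1
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    coef = solve(XtX, Xty)
    return sum((yi - sum(c * v for c, v in zip(coef, r))) ** 2 for r, yi in zip(rows, y))

def best_subsets(X, y, max_size=3):
    """Return (AIC, SBC, columns) for every subset, sorted by AIC."""
    n = len(y)
    results = []
    for size in range(1, max_size + 1):
        for cols in combinations(range(len(X[0])), size):
            sse = max(ols_sse(X, y, cols), 1e-12)      # guard against log(0)
            p = size + 1                               # parameters incl. intercept
            aic = n * math.log(sse / n) + 2 * p
            sbc = n * math.log(sse / n) + p * math.log(n)
            results.append((aic, sbc, cols))
    return sorted(results)

# Illustrative data: the target depends on columns 0 and 2 only.
X = [[float(i % 7), float((3 * i) % 5), float((5 * i) % 11)] for i in range(30)]
y = [2.0 * r[0] - r[2] + ((i * 37) % 13 - 6) / 100.0 for i, r in enumerate(X)]
print(best_subsets(X, y)[0])
```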

2.4.2 Sensitivity analysis

The combination of input variables strongly influences the models' performance. Some variables contribute positively to the accuracy of the selected model, while others may contribute negatively. Sensitivity analysis was used to choose the most influential variables in order to optimize model performance in predicting the SPI drought index. A regression analysis was performed at Jaisalmer to identify the most effective parameter sets. Figure 4 plots the standardized coefficients of the input variables for (a) SPI-3, (b) SPI-6, and (c) SPI-12. For SPI-3, the regression analysis identified SPI-1, SPI-3, SPI-9, SPI-4, and SPI-8, with standardized coefficients of 0.711, − 0.215, 0.093, 0.092, and − 0.072, respectively, as the main input parameters influencing SPI estimation (Table 2). For SPI-6, the influential input parameters, in order of priority, were SPI-1, SPI-6, SPI-7, SPI-3, and SPI-9, with standardized coefficients of 0.742, − 0.242, 0.161, 0.110, and − 0.093, respectively. For SPI-12, the influential input parameters were prioritized as SPI-1, SPI-2, and SPI-10, with standardized coefficients of 1.062, − 0.165, and − 0.104, respectively.

Fig. 4 Standardized coefficients of the input variables from the sensitivity analysis at the selected meteorological station for a SPI-3, b SPI-6, and c SPI-12

Table 2 The regression analysis for identifying the most effective parameters at Jaisalmer station

2.5 Performance metrics and evaluation

During the study period, observed values were compared with modeled values. The following statistical indicators were used to evaluate the accuracy of the developed hybrid Random SubSpace models: root mean square error (RMSE), coefficient of determination (R2), relative absolute error (RAE), root relative squared error (RRSE) and mean absolute error (MAE) (Kushwaha et al. 2021; Elbeltagi et al. 2022b). The statistical indicators are defined as:

  1.

    Root mean square error (RMSE)

    $$\mathrm{RMSE }=\sqrt{\frac{1}{\mathrm{N}}{\sum }_{i=1}^{N}{{(SPI}_{A}^{i}-{SPI}_{P}^{i})}^{2}}$$
    (6)
  2.

    Coefficient of determination (R2)

    $${\mathrm{R}}^{2}= {\left[\frac{{\sum }_{i=1}^{N}{(SPI}_{A}^{i}-{\overline{SPI} }_{A}){(SPI}_{P}^{i}-{\overline{SPI} }_{P})}{\sqrt{{\sum }_{i=1}^{N}{{(SPI}_{A}^{i}-{\overline{SPI} }_{A})}^{2}}\sqrt{{\sum }_{i=1}^{N}{{(SPI}_{P}^{i}-{\overline{SPI} }_{P})}^{2}}}\right]}^{2}$$
    (7)
  3.

    Mean absolute error (MAE)

    $$\mathrm{MAE}=\frac{1}{\mathrm{N}}{\sum }_{i=1}^{N}{|SPI}_{P}^{i}-{SPI}_{A}^{i}|$$
    (8)
  4.

    Relative absolute error (RAE)

    The RAE normalizes the total absolute error by dividing it by the simple predictor's total absolute error.

    $$\mathrm{RAE}=\frac{{\sum }_{i=1}^{N}\left|{SPI}_{P}^{i}-{SPI}_{A}^{i}\right|}{{\sum }_{i=1}^{N}\left|{SPI}_{A}^{i}-{\overline{SPI} }_{A}\right|}\times 100$$
    (9)
  5.

    Root relative squared error (RRSE)

    The RRSE normalizes the total squared error by dividing it by the total squared error of the simple predictor. Taking the square root of this relative squared error reduces the error to the same dimensions as the quantity being predicted.

    $$\mathrm{RRSE}=\frac{\sqrt{{\sum }_{i=1}^{N}{{(SPI}_{P}^{i}-{SPI}_{A}^{i})}^{2}}}{\sqrt{{\sum }_{i=1}^{N}{{(SPI}_{A}^{i}-{\overline{SPI} }_{A})}^{2}}}$$
    (10)

In which, \({SPI}_{A}^{i}\) is an observed or actual value, \({SPI}_{P}^{i}\) is simulated or forecasted value, \(\overline{{SPI }_{A}}\) and \(\overline{{SPI }_{P}}\) are the mean values of observed and forecasted samples, and N is the total number of data points.
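The five indicators can be computed with a few short functions, sketched below. The sketch assumes that the "simple predictor" in the RAE and RRSE definitions is the mean of the observed values, that RAE is reported in percent, and that RRSE is returned as a ratio (multiply by 100 for percent); the function names and sample values are illustrative.

```python
import math
from statistics import fmean

def rmse(obs, pred):
    """Root mean square error."""
    return math.sqrt(fmean([(a - p) ** 2 for a, p in zip(obs, pred)]))

def mae(obs, pred):
    """Mean absolute error."""
    return fmean([abs(p - a) for a, p in zip(obs, pred)])

def r2(obs, pred):
    """Squared Pearson correlation between observed and predicted values."""
    ma, mp = fmean(obs), fmean(pred)
    cov = sum((a - ma) * (p - mp) for a, p in zip(obs, pred))
    va = sum((a - ma) ** 2 for a in obs)
    vp = sum((p - mp) ** 2 for p in pred)
    return (cov / math.sqrt(va * vp)) ** 2

def rae(obs, pred):
    """Total absolute error relative to that of the mean predictor, in percent."""
    ma = fmean(obs)
    return 100.0 * sum(abs(p - a) for a, p in zip(obs, pred)) / sum(abs(a - ma) for a in obs)

def rrse(obs, pred):
    """Root of the squared error relative to that of the mean predictor."""
    ma = fmean(obs)
    num = sum((p - a) ** 2 for a, p in zip(obs, pred))
    den = sum((a - ma) ** 2 for a in obs)
    return math.sqrt(num / den)

observed = [1.0, 2.0, 3.0, 4.0]      # illustrative values, not study data
predicted = [1.1, 1.9, 3.2, 3.8]
print(mae(observed, predicted), rmse(observed, predicted), rae(observed, predicted))
```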

3 Results

3.1 Evaluation of machine learning models based on the best-selected subset models

Four machine learning methods (i.e., RSS, RSS-M5P, RSS-RF, and RSS-RT) were used to forecast the SPI at 3, 6, and 12 months in Jaisalmer district, Rajasthan, India. The performance of the employed algorithms was evaluated and compared using MAE, RMSE, RAE, RRSE, and R2. The model with MAE, RMSE, RAE, and RRSE nearest zero and R2 nearest one is deemed the most accurate in estimating the SPI. Table 3 shows the performance indices of the machine learning models during the training and testing periods. The best machine learning algorithm for each SPI timescale (i.e., SPI-3, SPI-6, and SPI-12) is displayed in the blue row.

Table 3 MAE, RMSE, RAE, RRSE, and R2 for machine learning algorithm-based models during the training and testing span

Results indicated that the RSS-RF model outperformed the other algorithms during the training period for forecasting the SPI at 3, 6, and 12 months. It provided MAE = 0.367, RMSE = 0.484, RAE = 48.34, RRSE = 50.25, and R2 = 0.879 for SPI-3; MAE = 0.305, RMSE = 0.425, RAE = 37.36, RRSE = 41.25, and R2 = 0.919 for SPI-6; and MAE = 0.209, RMSE = 0.332, RAE = 25.59, RRSE = 31.08, and R2 = 0.951 for SPI-12. It was followed by the RSS-M5P, RSS-RT, and RSS algorithms, respectively. Among the three SPI timescales, SPI-12 had the highest performance, followed by SPI-6 and SPI-3, respectively, for the training period. During testing, the RSS-M5P algorithm outperformed the other implemented algorithms and should therefore be considered the best SPI prediction model. It provided MAE = 0.497, RMSE = 0.682, RAE = 81.88, RRSE = 87.22, and R2 = 0.507 for SPI-3; MAE = 0.452, RMSE = 0.717, RAE = 69.76, RRSE = 85.24, and R2 = 0.402 for SPI-6; and MAE = 0.294, RMSE = 0.377, RAE = 55.79, RRSE = 59.57, and R2 = 0.783 for SPI-12. Again, among the three timescales, SPI-12 had the highest performance, followed by SPI-6 and SPI-3, respectively, for the testing period. Figures 5, 6 and 7 present the predicted and calculated SPI-3, SPI-6, and SPI-12 values of the four machine learning algorithms during the testing phase as (a) time series and (b) scatter plots. The models were further compared using the Taylor diagram (Fig. 8). In terms of standard deviation, correlation, and RMSE, the RSS-M5P model lay closest to the observed point, while the RSS, RSS-RF, and RSS-RT models lay farther from it. In this analysis, RSS-M5P was the most effective of the chosen models because it generalized better than the RSS, RSS-RF, and RSS-RT algorithms. All the developed models tended to underestimate SPI-3 and SPI-6 and to overestimate SPI-12.

Fig. 5 Predicted and calculated SPI-3 values by RSS, RSS-M5P, RSS-RF and RSS-RT algorithms during testing phases a time series, and b scenario-scatter plot

Fig. 6 Predicted and calculated SPI-6 values by RSS, RSS-M5P, RSS-RF and RSS-RT algorithms during testing phases a time series, and b scenario-scatter plot

Fig. 7 Predicted and calculated SPI-12 values by RSS, RSS-M5P, RSS-RF and RSS-RT algorithms during testing phases a time series, and b scenario-scatter plot

Fig. 8 Taylor diagrams of RSS, RSS-M5P, RSS-RF and RSS-RT during testing span at selected station for a SPI-3, b SPI-6, and c SPI-12
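The three statistics summarized in a Taylor diagram can be computed as sketched below. The sketch assumes the standard construction, in which the plotted error is the centered (bias-removed) RMSE E′ and the quantities satisfy E′² = σ_o² + σ_p² − 2σ_oσ_pR; the function name and sample values are illustrative and are not the study data.

```python
import math
from statistics import fmean, pstdev

def taylor_statistics(obs, pred):
    """Return (sd of observations, sd of predictions, correlation, centered RMSE)."""
    mo, mp = fmean(obs), fmean(pred)
    sd_o, sd_p = pstdev(obs), pstdev(pred)
    corr = fmean([(o - mo) * (p - mp) for o, p in zip(obs, pred)]) / (sd_o * sd_p)
    crmse = math.sqrt(fmean([((p - mp) - (o - mo)) ** 2 for o, p in zip(obs, pred)]))
    return sd_o, sd_p, corr, crmse

observed = [-1.2, -0.4, 0.3, 1.1, 0.6, -0.8]     # illustrative SPI values
predicted = [-0.9, -0.5, 0.1, 0.8, 0.9, -0.4]
sd_o, sd_p, corr, crmse = taylor_statistics(observed, predicted)
print(round(sd_o, 3), round(sd_p, 3), round(corr, 3), round(crmse, 3))
```

A model plots closer to the observed point when its standard deviation matches that of the observations and its correlation is high, which together imply a small centered RMSE.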

4 Discussion

The performance of the hybrid meta-heuristic algorithms, i.e., RSS, RSS-M5P, RSS-RF and RSS-RT, was assessed for the multiscale prediction of SPI (i.e., 3, 6 and 12 months). The results highlight the potential of these algorithms for predicting monthly SPI, especially SPI-12. Figures 5, 6 and 7 show the temporal variation of predicted and calculated SPI values and their scatter plots for SPI-3, SPI-6, and SPI-12. In the scatter plots, the regression line gave a high coefficient of determination (R2) for the RSS-M5P additive regression model under both scenarios for SPI-3, SPI-6, and SPI-12. As seen from the figures, the hybrid meta-heuristic algorithms performed better at longer timescales. Further comparison between algorithms using MAE and RMSE (Table 3) showed that the M5P algorithm improved the performance of the RSS structure, as it gave lower values of MAE and RMSE. The Taylor diagram (Fig. 8) gives a more comparable depiction of model performance in predicting monthly SPI values. Based on the standard deviation, correlation, and RMSE, the RSS-RF model was located furthest from the observed point and the RSS-M5P model nearest to it for all SPI timescales. This shows that the RSS-M5P algorithm predicts monthly SPI more accurately than RSS alone and the other hybrid algorithms.

Our findings were also compared with other recent studies conducted across different regions, such as Bangladesh, Ethiopia, India, and Iran. Considering both training and testing periods, the long-term SPI predictive model makes more accurate predictions than the short-term SPI predictive models. This finding agrees with previous studies (Aghelpour and Varshavian 2021; Malik et al. 2021; Yaseen et al. 2021). It reflects that long-term precipitation patterns vary less than short- and medium-term precipitation patterns (Belayneh and Adamowski 2013). Furthermore, monsoon months are more vulnerable than other seasons, with June showing greater susceptibility to severe drought and September to extreme drought. The RSS-RF model gave the best performance among the selected models for each SPI timescale (i.e., SPI-3, SPI-6 and SPI-12) during the training period but not during the testing period. This is in line with the study by Ditthakit et al. (2021), who applied the RF method for estimating GR2M model parameters in an ungauged basin. Yaseen et al. (2021) investigated the capability of the machine learning (ML) models random forest (RF), minimax probability machine regression (MPMR), M5 Tree (M5tree), extreme learning machine (ELM), and online sequential ELM (OSELM) in predicting the SPI at four timescales (i.e., 1, 3, 6 and 12 months) in Bangladesh. They found that ELM was the best model for predicting the 3-, 6-, and 12-month SPI, while RF showed the best performance for 1-month SPI prediction. Belayneh et al. (2016) applied three machine learning techniques, i.e., artificial neural networks (ANNs), support vector regression (SVR), and coupled wavelet-ANNs (WA-ANN). They concluded that WA-ANN gave the best model performance for forecasting SPI 3 (3-month SPI) and SPI 6 (6-month SPI) in the Awash River Basin in Ethiopia.
Aghelpour and Varshavian (2021) applied a hybrid model of an MLP neural network and the Imperialistic Competitive Algorithm (MLP-ICA) for forecasting the Multivariate Standardized Precipitation Index (MSPI) in Iran. They pointed out that the proposed models forecast the MSPI better for longer time horizons (i.e., 12–24 and 24–48 month MSPI) than for shorter ones (i.e., 3–6, 6–12, and 3–12 month MSPI). The present study highlighted the potential of hybrid meta-heuristic algorithms in the prediction of multi-scale SPI droughts. However, uncertainties remain that are related to datasets, methods, and scenarios, and for models with a large number of algorithm parameters the search space grows and the optimization process becomes more difficult. Therefore, further research should focus on minimizing these uncertainties and improving optimization performance. Moreover, since only one station was selected in this study, the applicability of the proposed RSS-based hybrid models should be validated at different locations with varying agro-climatic conditions and under different scenarios to draw a generalized conclusion.

5 Conclusion

Drought assessment is one of the most important tasks at present, as drought has several adverse effects on the soil–water–atmosphere cycles of the earth system across the different climates of the world. Several techniques and machine learning methods exist in the literature for quantifying drought. However, this is one of the few SPI-based studies that compares and contrasts meta-heuristic models such as RSS alone and its hybridization with other algorithms. Drought is among the costliest global threats to ecosystems, especially in regions with diverse climatic patterns. It is a spatio-temporal phenomenon that occurs when precipitation falls below average for a period long enough to cause environmental risks. In the present study, four machine learning models (i.e., RSS, RSS-M5P, RSS-RF, and RSS-RT) were applied to the evaluation of the SPI (Standardized Precipitation Index) under the climatic conditions of Rajasthan to create a probabilistic framework for drought situations. The maximum drought periods and the corresponding durations have been identified at the study location, and the results confirm that there have been some severe drought events in the past. The different SPI timescales (SPI-3, SPI-6, and SPI-12) present distinct drought periods and intensities that are vital to a seasonal drought analysis. In the testing stage, the RSS-M5P model had the highest efficiency, followed by the RSS, RSS-RF, and RSS-RT models. According to the SPI-3 results, monsoon months are more vulnerable than other seasons, with June showing greater susceptibility to severe drought and September to extreme drought. Among the models developed for the three timescales, the SPI-12 model exhibited the highest performance, followed by SPI-6 and SPI-3.
In general, all the models underestimated SPI-3 and SPI-6 while overestimating SPI-12. The RSS-M5P model performed well in capturing the monthly trend of the SPI during the testing period, with coefficients of determination of 0.402–0.783, MAE of 0.294–0.497, and RMSE of 0.377–0.717 for the prediction of multi-scale SPI (i.e., 3, 6, and 12 months). The occurrence of drought is affected by low precipitation, high fluctuations in the average rainfall, and climate change, particularly as a result of regional and global warming. As such, it is essential to develop drought management policies and to implement them effectively with the assistance and support of the government and private organizations. The results of our study could be used to anticipate water availability for the entire year with the help of three months of SPI data, which in turn could support an efficient climate-smart agriculture strategy with global economic implications.