1 Introduction

Drought is one of the climatic hazards that affect large areas of the Earth’s surface (Dice and Rodziewicz 2020). When it persists over a long period, this natural phenomenon causes serious economic damage, especially in agriculture (Lopez-Nicolas et al. 2017; Pande et al. 2023b, c, d). Drought poses a serious threat to the economies of many developing countries, especially in Africa and Asia (Kilimani et al. 2018). Global climate change, which is almost unanimously recognized by the scientific community (Wang et al. 2017), will lead to an increase in the frequency and duration of droughts in many regions of the globe (Liu et al. 2021). Moreover, drought can trigger other natural hazards such as forest fires (Aragão et al. 2018). The phenomenon is not confined to arid and hyper-arid areas; it also occurs periodically in other types of ecosystems (Bahrami et al. 2019). Given the scale of the problems that this natural hazard causes around the globe, the study of the impact of drought on society, and of its specific indicators, is an important task for today’s researchers (Finn et al. 2018; Tong et al. 2018; Webber et al. 2018).

In the literature, the characterization of drought phenomena is carried out using several indices like the following: Palmer Drought Severity Index (PDSI) (Yang et al. 2020; Yu et al. 2019), Surface Water Supply Index (SWSI) (Duan et al. 2018; Jang et al. 2017), Crop Moisture Index (CMI) (Carrão et al. 2016; Juhasz and Kornfield 1978), Crop Specific Drought Index (CSDI) (Hubbard and Wu 2005; Meyer et al. 1993), Soil Moisture Drought Index (SMDI) (Sohrabi et al. 2015; Xu et al. 2020), Rainfall Anomaly Index (RAI) (Hänsel et al. 2016; Moron 1994), Reclamation Drought Index (RDI) (Weghorst 1996), Effective Precipitation Index (EPI) (Ebrahimpour et al. 2015; Peng-cheng et al. 2016), Bhalme and Mooley Drought Index (BMDI) (Domenikiotis et al. 2004; Ntale and Gan 2003), Effective Drought Index (EDI) (Malik et al. 2021a, b), and Standardized Precipitation Index (SPI) (Wu et al. 2005; Mohamadi et al. 2020).

Of all the previously mentioned indices, one of the most widely used in the study of climatological drought is the SPI. This index is closely related to soil moisture and to the groundwater reserve (Spennemann et al. 2015). Its wide use in research is explained by the simplicity of the computation method and by its flexibility across different temporal scales of analysis (Choubin et al. 2016). However, the accurate prediction of SPI values remains a challenge. To obtain the highest possible accuracy, the following machine learning models have been applied so far: artificial neural network (Ibrahimi and Baali 2018; Poornima and Pushpalatha 2019; Soh et al. 2018), adaptive neuro-fuzzy inference system (Aghelpour et al. 2020; Ali et al. 2018; Mokhtarzad et al. 2017), support vector regression (Komasi et al. 2018; Roodposhti et al. 2017), support vector machine (Belayneh and Adamowski 2012; Shamshirband et al. 2020), and random forest (Yaseen et al. 2021).

In this context, the present research paper aims to (1) enrich the state of knowledge on the use of machine learning models in SPI prediction by applying four algorithms: additive regression, random subspace, M5P, and bagging; and (2) select the best of the developed ML models by evaluating and validating the results with a series of statistical indicators such as the correlation coefficient, mean absolute error, root mean squared error, relative absolute error, and root relative squared error.

2 Methodology

2.1 Study area

The Upper Godavari River basin is located in the state of Maharashtra, India. The basin covers 152,199 km2 and supplies approximately 65% of the water used in the state. It is located at an elevation of 1067 m, about 80 km from the Arabian Sea. The river rises at Trimbakeshwar in the Nashik District of Maharashtra and ultimately discharges into the Bay of Bengal through an extensive tributary network. This river basin is the second largest in India. Daily precipitation was observed at each station in the area, and the data were collected from the Prediction of Worldwide Energy Resources project of NASA. Data from three weather stations were selected and used to combine data-driven models with best subset regression for predicting the standardized precipitation index (SPI). The basin is important for agricultural development, industrial activities, and drinking water supply. The study is set in an agricultural river basin, where 30 years of daily precipitation time series were analyzed to understand climate change by relating it only to dry spell and wet spell frequencies and to discussions with farmers, without further assessment. Sustainable development under rapidly increasing urbanization is difficult to achieve without serious environmental consequences in the area, including pollution, forest instability, and land-use changes in surface soil cover (Pramudya et al. 2016). Under these circumstances, applying a well-suited drought index to the observed climate conditions was considered necessary. Daily weather data, including rainfall and temperature, were obtained for the three stations for the period 1989–2019. The location map of the Upper Godavari River basin is presented in Fig. 1.

Fig. 1 Location map of the study area

2.2 Standardized precipitation index

The SPI and the related standardized precipitation evapotranspiration index (SPEI) are reported to be accurate tools for studying and monitoring meteorological drought in real time under warming conditions, and numerous researchers have used SPI analysis to forecast future meteorological drought events. In this paper, the SPI was computed at 3-, 6-, and 12-month time scales from the daily precipitation data for the years 1989–2019 (30 years of data). Based on the SPI or SPEI value, drought is classified as mild if the value lies between 0 and −1, moderate from −1 to −1.5, severe between −1.5 and −2, and extreme when less than −2. The SPEI classification is identical to that of the SPI because both indices are computed from a comparable probability distribution approach (Tan et al. 2015). When the index was created, three observation sites in different parts of the globe, covering tropical, monsoon, arid, semi-arid, continental, cold, and oceanic climates, were chosen to develop it (Vicente-Serrano et al. 2010). The SPI values were estimated using the SPI package in the R software. SPI predictions were then computed with machine learning models: past SPI values from 1989 to 2019 were supplied to the additive regression, random subspace, M5P, and bagging models, which forecast the SPI for the test period from 2013 to 2019.
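The drought classes described above can be expressed as a small lookup. The following is a minimal sketch, not the code used in the study: the function name `classify_spi` is hypothetical, and the handling of values exactly on a class boundary (and of non-negative values, labelled "no drought" here) is an assumption, since the text does not specify it.

```python
def classify_spi(spi: float) -> str:
    """Map an SPI (or SPEI) value to a drought class.

    Thresholds follow the text: mild (0 to -1), moderate (-1 to -1.5),
    severe (-1.5 to -2), extreme (below -2).
    Assumption: boundary values fall into the drier class, except -2,
    which stays in "severe" because "extreme" is strictly below -2.
    """
    if spi >= 0:
        return "no drought"  # assumption: the text only classifies negative values
    if spi > -1.0:
        return "mild"
    if spi > -1.5:
        return "moderate"
    if spi >= -2.0:
        return "severe"
    return "extreme"


if __name__ == "__main__":
    for value in (0.4, -0.5, -1.2, -1.7, -2.3):
        print(value, classify_spi(value))
```

Applied to a monthly SPI series, such a function yields the sequence of drought classes from which dry-spell frequencies can be counted.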

2.3 Machine learning models

2.3.1 Additive regression model

The additive regression model was introduced in the literature by Stone (1985). In this type of model, a dependent variable Yi (i = 1, 2, …, n) is expressed as the sum of functions of the independent variables Xi1, Xi2, …, Xip. The additive regression model has the following form:

$${Y}_{i}=\sum_{j=1}^{p}{f}_{j}\left({x}_{ij}\right)+{u}_{i},\quad {u}_{i}\sim \mathrm{iid}\left(0,{\sigma }^{2}\right)$$
(1)

where \({f}_{j}\left({x}_{ij}\right)\) is a nonparametric function that can be estimated with a nonparametric regression algorithm. Further, if we assume \(E\left({f}_{j}\right)=0\) (j = 1, 2, …, p), the additive regression model becomes:

$$E\left({Y}_{i}|{x}_{i1},{x}_{i2},\dots , {x}_{ip}\right)=\sum_{j=1}^{p}{f}_{j}\left({x}_{ij}\right)$$
(2)

According to Eq. 2, the additive regression model generalizes linear models (Xu and Lin 2015): the explanatory variables enter through the more general form \({f}_{j}\left({x}_{ij}\right)\) rather than the linear form \({\beta }_{j}{x}_{ij}\).

2.3.2 Random subspace

Random subspace, first proposed by Ho (1998), is an ensemble algorithm that trains each individual classifier on a randomly selected subset of features and then combines their outputs through a voting procedure (Pham et al. 2020). The ensemble thereby improves on the performance of the weak individual classifiers.

Consider a training set P = (P1, P2, …, Pn) of size n, in which each training object Pi (i = 1, 2, …, n) is a q-dimensional vector Pi = (Pi1, Pi2, …, Piq) described by q features. If r < q features are selected at random, they define an r-dimensional random subspace of the original q-dimensional feature space. Each classifier in the ensemble is trained on the objects of P restricted to one such subspace. The random subspace algorithm can be mathematically expressed as follows (Dai et al. 2002):

$$\gamma \left(s\right)=\underset{y\in \left\{-1,1\right\}}{\mathrm{argmax}}\sum_{d}{\delta }_{\mathrm{sgn}\left({E}^{d}\left(s\right)\right),y}$$
(3)

where \({\delta }_{i,j}\) represents the Kronecker symbol, \(y\in \left\{-1,1\right\}\) represents a class label, and \({E}^{d}\left(s\right)\) (d = 1, 2, …, D) is the dth classifier of the ensemble.
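The voting rule of Eq. 3 can be illustrated in a few lines. The sketch below is an assumption-laden illustration rather than the implementation used in the study: the helper names (`make_subspace_scorer`, `vote`), the equal feature weights, and the tie-breaking rule are all hypothetical. Each base scorer sees only r of the q features, and the ensemble returns the class label in {−1, 1} that receives the most votes.

```python
import random
from typing import Callable, List, Sequence


def sign(value: float) -> int:
    """Return the class label in {-1, 1} given by the sign of a score."""
    return 1 if value >= 0 else -1


def make_subspace_scorer(weights: Sequence[float],
                         feature_idx: Sequence[int]) -> Callable[[Sequence[float]], float]:
    """Hypothetical base classifier: a linear score over a subset of features."""
    def score(s: Sequence[float]) -> float:
        return sum(weights[k] * s[k] for k in feature_idx)
    return score


def vote(classifiers: List[Callable[[Sequence[float]], float]],
         s: Sequence[float]) -> int:
    """Majority vote: pick the label y maximizing the count of sgn(E_d(s)) == y."""
    counts = {-1: 0, 1: 0}
    for clf in classifiers:
        counts[sign(clf(s))] += 1
    # Assumption: a tie is resolved in favour of -1 (first key in the dict).
    return max(counts, key=counts.get)


if __name__ == "__main__":
    rng = random.Random(0)
    q, r, D = 5, 3, 7                      # q features, r per subspace, D classifiers
    weights = [1.0] * q                    # illustrative weights, not fitted
    ensemble = [make_subspace_scorer(weights, rng.sample(range(q), r))
                for _ in range(D)]
    print(vote(ensemble, [1, 2, 3, 4, 5]))        # all scores positive
    print(vote(ensemble, [-1, -1, -1, -1, -1]))   # all scores negative
```

For a continuous target such as the SPI, the vote would typically be replaced by an average of the base learners' predictions; that variant is not shown here.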

2.3.3 M5P model

The M5P algorithm is a tree-based method with linear models at its leaves, used for the prediction of continuous variables (Khosravi et al. 2020). Because M5P combines many multivariate linear models, the algorithm is highly flexible (Zhan et al. 2011). The M5P model is constructed in three stages: (i) tree construction; (ii) tree pruning; (iii) tree smoothing. The tree-growing process seeks to maximize the standard deviation reduction (SDR) at each split. The SDR formula can be written as follows (Khosravi et al. 2020):

$$SDR=SD\left(E\right)-\sum_{i}\frac{\left|{E}_{i}\right|}{\left|E\right|}\times SD\left({E}_{i}\right)$$
(4)

where E is a set of cases, Ei represents the ith subset of cases that is obtained following the tree splitting, SD(E) is the standard deviation associated with E, and SD(Ei) represents the standard deviation of Ei.
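As a worked illustration of Eq. 4, the sketch below computes the SDR of a candidate split. The function name `sdr` is hypothetical, and the use of the population standard deviation is an assumption, since the text does not state which estimator is used.

```python
from statistics import pstdev
from typing import List, Sequence


def sdr(parent: Sequence[float], subsets: List[Sequence[float]]) -> float:
    """Standard deviation reduction of splitting `parent` into `subsets`.

    SDR = SD(E) - sum_i |E_i| / |E| * SD(E_i)
    Assumption: SD is the population standard deviation.
    """
    n = len(parent)
    weighted = sum(len(subset) / n * pstdev(subset) for subset in subsets)
    return pstdev(parent) - weighted


if __name__ == "__main__":
    targets = [1.0, 1.0, 5.0, 5.0]
    # A split that separates low from high values removes all within-node spread.
    print(sdr(targets, [[1.0, 1.0], [5.0, 5.0]]))
    # A split that mixes low and high values in each node gains nothing.
    print(sdr(targets, [[1.0, 5.0], [1.0, 5.0]]))
```

During tree construction, the split with the largest SDR among the candidates would be chosen at each node.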

2.3.4 Bagging model

The bagging model, proposed by Breiman (1996), is an ensemble algorithm that obtains M learners by creating additional datasets during the training phase (Yariyan et al. 2020). The M training datasets are generated by random sampling with replacement from the initial dataset. The M models are then trained on these M subsets and combined into the final model. Each model in the ensemble is trained independently, without taking into account the precision of the previous models (Yariyan et al. 2020).
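The bagging procedure can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's implementation: the base learner is a one-variable least-squares line chosen only for brevity, and the names `fit_line`, `bagging_fit`, and `bagging_predict` are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Least-squares line y = slope * x + intercept (illustrative base learner)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:  # degenerate resample: all x identical, fall back to the mean
        return 0.0, mean_y
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x


def bagging_fit(xs: Sequence[float], ys: Sequence[float],
                n_models: int, seed: int = 0) -> List[Tuple[float, float]]:
    """Train one base learner per bootstrap sample (sampling with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_line([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def bagging_predict(models: List[Tuple[float, float]], x: float) -> float:
    """Average the predictions of all base learners."""
    return sum(slope * x + intercept for slope, intercept in models) / len(models)


if __name__ == "__main__":
    xs = [float(i) for i in range(10)]
    ys = [2.0 * x + 1.0 for x in xs]
    ensemble = bagging_fit(xs, ys, n_models=25)
    print(bagging_predict(ensemble, 5.0))
```

Because every learner is fitted to its own resample and the learners never see one another's errors, the M fits are independent of each other and could run in parallel.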

2.3.5 Best combination selection procedure

Feature selection is one of the stages in building a soft computing model to forecast engineering phenomena when there are many input variables. Several approaches can be used to identify the best among all possible input combinations, including best subset regression, mutual information, and forward stepwise selection. In the current study, best subset regression analysis was performed to determine the best input combinations for the SPI models. For this purpose, six statistical criteria were computed: MSE, the coefficient of determination (R2), adjusted R2, Mallows’ Cp, Akaike’s AIC, and Amemiya’s PC. Lagged SPI values from the 1st lag (SPI-1) to the 15th lag (SPI-15) were prepared as candidate inputs, and best subset regression was applied to select the best input variables for the SPI-3, SPI-6, and SPI-12 models. The full dataset was randomly divided into training and testing subsets: 75% of the data were allocated for training the models and the remaining 25% for validating the developed models.
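The preparation of lagged inputs and the 75/25 split can be sketched as below. This is an illustration under assumptions, not the procedure as implemented in the study: the function names `make_lagged` and `split_rows` are hypothetical, the lag set in the usage example is only one possible combination, and the random seed is arbitrary.

```python
import random
from typing import List, Sequence, Tuple

Row = Tuple[List[float], float]


def make_lagged(series: Sequence[float], lags: Sequence[int]) -> List[Row]:
    """Build (inputs, target) rows where inputs are SPI(t - k) for each lag k."""
    max_lag = max(lags)
    rows = []
    for t in range(max_lag, len(series)):
        rows.append(([series[t - k] for k in lags], series[t]))
    return rows


def split_rows(rows: List[Row], train_fraction: float = 0.75,
               seed: int = 0) -> Tuple[List[Row], List[Row]]:
    """Randomly divide rows into training and testing subsets."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


if __name__ == "__main__":
    spi = [float(i) for i in range(100)]          # placeholder series, not real SPI
    rows = make_lagged(spi, lags=[1, 12, 13, 15])  # illustrative lag combination
    train, test = split_rows(rows)
    print(len(rows), len(train), len(test))
    print(rows[0])
```

The first 15 values of the series produce no rows because the largest lag needs 15 preceding observations.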

2.4 Performance metrics for the evaluation of the models

The correlation coefficient (C.C), mean absolute error (MAE), relative absolute error (RAE), root mean squared error (RMSE), and root relative squared error (RRSE), together with the index of agreement (d) and the Nash–Sutcliffe efficiency (NSE), were used to evaluate the applied machine learning models (Eqs. 5 to 11). These performance metrics are defined as:

$$C.C=\sqrt{1- \frac{{\sum }_{i=1}^{n}{\left({Z}_{i}-{\widehat{Z}}_{i}\right)}^{2}}{{\sum }_{i=1}^{n}{{Z}_{i}}^{2}}}$$
(5)

where \({Z}_{i}\) and \({\widehat{Z}}_{i}\) are the measured and estimated values, and n is the number of values used in the model.

$$MAE=\frac{{\sum }_{i=1}^{n}\left|{y}_{i}-{x}_{i}\right|}{n}$$
(6)
$$RAE=\frac{\sum_{i=1}^{n}\left|{p}_{i}-{a}_{i}\right|}{\sum_{i=1}^{n}\left|\overline{a }-{a}_{i}\right|}$$
(7)
$$RMSE=\sqrt{\frac{{\sum }_{i=1}^{n}{\left({y}_{i}-{x}_{i}\right)}^{2}}{n}}$$
(8)
$$RRSE=\sqrt{\frac{\sum_{j=1}^{n}{\left({P}_{ij}-{T}_{j}\right)}^{2}}{\sum_{j=1}^{n}{\left({T}_{j}-\overline{T }\right)}^{2}}}$$
(9)
$$d=1- \frac{\sum_{j=1}^{n}{\left({P}_{ij}-{T}_{j}\right)}^{2}}{\sum_{j=1}^{n}{\left(\left|{P}_{ij}-\overline{T }\right|+\left|{T}_{j}-\overline{T }\right|\right)}^{2}}$$
(10)
$$NSE =1- \frac{\sum_{j=1}^{n}{\left({P}_{ij}-{T}_{j}\right)}^{2}}{\sum_{j=1}^{n}{\left({T}_{j}-\overline{T }\right)}^{2}}$$
(11)

where \({y}_{i}\) and \({x}_{i}\) are the paired predicted and observed values; \({p}_{i}\) and \({a}_{i}\) are the predicted and actual values, with \(\overline{a }\) the mean of the actual values; \({P}_{ij}\) is the value predicted by the single algorithm i for record j (out of n records); \({T}_{j}\) is the target value for record j; and \(\overline{T }\) is the mean of the target values.
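The error metrics can be computed together from paired observed and predicted values. The sketch below is illustrative only: the function name `evaluate` is hypothetical, the correlation coefficient is left out, and d and NSE are written in their standard forms using the mean of the observed values, which is an assumption about the intended definitions.

```python
import math
from typing import Dict, Sequence


def evaluate(observed: Sequence[float], predicted: Sequence[float]) -> Dict[str, float]:
    """Compute MAE, RMSE, RAE, RRSE, NSE, and the index of agreement d.

    Assumption: relative metrics, NSE, and d use the mean of the observed values.
    """
    n = len(observed)
    mean_obs = sum(observed) / n
    pairs = list(zip(predicted, observed))
    abs_err = sum(abs(p - t) for p, t in pairs)
    sq_err = sum((p - t) ** 2 for p, t in pairs)
    abs_dev = sum(abs(t - mean_obs) for t in observed)
    sq_dev = sum((t - mean_obs) ** 2 for t in observed)
    agreement_den = sum((abs(p - mean_obs) + abs(t - mean_obs)) ** 2 for p, t in pairs)
    return {
        "MAE": abs_err / n,
        "RMSE": math.sqrt(sq_err / n),
        "RAE": abs_err / abs_dev,
        "RRSE": math.sqrt(sq_err / sq_dev),
        "NSE": 1.0 - sq_err / sq_dev,
        "d": 1.0 - sq_err / agreement_den,
    }


if __name__ == "__main__":
    scores = evaluate(observed=[1.0, 2.0, 3.0, 4.0], predicted=[1.0, 2.0, 3.0, 5.0])
    for name, value in scores.items():
        print(f"{name}: {value:.4f}")
```

RAE and RRSE are returned as fractions; multiplying by 100 gives the percentage form in which they are reported.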

3 Results and discussion

In this paper, the three climate stations are referred to as stations 1, 2, and 3. The Upper Godavari River basin in India was chosen to develop the SPI at time scales of 3, 6, and 12 months. Most villages in the study area have faced problems related to meteorological drought and changes in climate parameters, so prediction of the SPI is considered essential for forecasting meteorological drought conditions there. Therefore, four machine learning models (additive regression, random subspace, M5P, and bagging) were adopted for the prediction of the standardized precipitation index for 3 months (SPI-3), 6 months (SPI-6), and 12 months (SPI-12).

3.1 Input selection using the best subset model

For almost every machine learning model, input variable selection is essential to obtaining the optimum regression model. Different techniques can be used for this purpose. One commonly used approach is the model-free (filter) method, which relies on a statistical analysis of model performance. Accordingly, several combinations of input variables based on past SPI values (t-1, t-2, t-3, …, t-15) were used to predict the SPI. Several statistical indices, namely MSE, R2, adjusted R2, Mallows’ Cp, Akaike’s AIC, Schwarz’s SBC, and Amemiya’s PC, were calculated in order to obtain the best combination of input variables.

According to the results in Table 1 (A), the best subset for the SPI-3 model at station 1 contains seven input variables (1st/6th/7th/11th/12th/13th/15th). The performance statistics of this model are MSE = 0.540, R2 = 0.701, adjusted R2 = 0.692, Mallows’ Cp = 2.023, Akaike’s AIC = −133.164, Schwarz’s SBC = −105.694, and Amemiya’s PC = 0.318. For the SPI-6 model, the best subset contains four input variables (1st/12th/13th/15th), with MSE = 0.453, R2 = 0.802, adjusted R2 = 0.799, Mallows’ Cp = 0.354, Akaike’s AIC = −176.295, Schwarz’s SBC = −159.127, and Amemiya’s PC = 0.205, as shown in Table 1 (B). For the SPI-12 model (Table 1 (C)), the seven-variable subset gives the best accuracy at station 1. These seven input variables (1st/2nd/3rd/7th/13th/14th/15th) yielded MSE = 0.152, R2 = 0.944, adjusted R2 = 0.942, Mallows’ Cp = 2.913, Akaike’s AIC = −422.907, Schwarz’s SBC = −395.438, and Amemiya’s PC = 0.059.

Table 1 The best subset regression analysis for determining the best input combinations to model

Table 2 (A) shows that, for station 2, the best input combination for the SPI-3 model contains seven variables (1st/4th/6th/7th/11th/13th/15th). The performance statistics of this model are MSE = 0.528, R2 = 0.609, adjusted R2 = 0.597, Mallows’ Cp = 4.988, Akaike’s AIC = −133.164, Schwarz’s SBC = −111.105, and Amemiya’s PC = 0.415. For the SPI-6 model at station 2, the best subset contains seven input variables (1st/5th/6th/10th/12th/13th/15th), with MSE = 0.478, R2 = 0.802, adjusted R2 = 0.796, Mallows’ Cp = 3.313, Akaike’s AIC = −161.079, Schwarz’s SBC = −133.609, and Amemiya’s PC = 0.210, as shown in Table 2 (B). For the SPI-12 model, the six-variable subset (1st/7th/12th/13th/14th/15th) gives the best accuracy at station 2, with MSE = 0.166, R2 = 0.943, adjusted R2 = 0.942, Mallows’ Cp = 0.200, Akaike’s AIC = −404.098, Schwarz’s SBC = −380.062, and Amemiya’s PC = 0.060, as shown in Table 2 (C).

Table 2 The best subset regression analysis for determining the best input combinations to model

Table 3 (A) shows the best subset regression results for the SPI-3 model at station 3. The best model contains nine input variables (1st/4th/6th/7th/11th/12th/13th/14th/15th), with MSE = 0.525, R2 = 0.617, adjusted R2 = 0.602, Mallows’ Cp = 5.367, Akaike’s AIC = −137.766, Schwarz’s SBC = −103.429, and Amemiya’s PC = 0.414. For the SPI-6 model (Table 3 (B)), the best subset contains four variables (1st/12th/13th/15th), with MSE = 0.455, R2 = 0.802, adjusted R2 = 0.799, Mallows’ Cp = 0.119, Akaike’s AIC = −175.530, Schwarz’s SBC = −158.361, and Amemiya’s PC = 0.205. For the SPI-12 model (Table 3 (C)), the five-variable subset (1st/6th/12th/13th/15th) gives the best accuracy at station 3, with MSE = 0.150, R2 = 0.944, adjusted R2 = 0.943, Mallows’ Cp = −1.078, Akaike’s AIC = −428.007, Schwarz’s SBC = −407.404, and Amemiya’s PC = 0.058.

Table 3 The best subset regression analysis for determining the best input combinations to model

3.2 Evaluation of machine learning models

In this paper, the SPI at time scales of 3, 6, and 12 months for the Upper Godavari River basin in India was predicted with four machine learning models, namely additive regression, random subspace, M5P, and bagging. Different input combinations from three meteorological stations were used, and the best model (with the best input combination) was adopted according to the statistical index analysis. Past SPI values (t-1, t-2, t-3, …, t-15) were used as input variables in order to predict future SPI values. Meteorological data of 20 years from 2000 to 2019 were collected and used to build the predictive models. The performance of the machine learning models was evaluated with the statistical indices C.C, NSE, IW, MAE, RMSE, RAE (%), and RRSE (%). The predictive models were run repeatedly in order to obtain steady and reliable results.

3.2.1 Evaluation of SPI for station 1

The statistical analysis of the performance of the predictive models on the testing dataset for station 1 is given in Table 4. The M5P model (NSE = 0.64–0.95 and RMSE = 0.36–0.97) and the bagging model (NSE = 0.66–0.93 and RMSE = 0.4–0.95) were found to be the best models for SPI prediction. The additive regression and random subspace models showed the lowest performance for all time scales and at the different stations. The suitability of the M5P technique has also been demonstrated in other studies, such as Sattari and Sureh (2019).

Table 4 Statistical analysis of model performance for predicting SPI-3, SPI-6, and SPI-12 in station 1

In addition, the scatter plots of observed versus predicted values (SPI-3, SPI-6, and SPI-12) for the testing models of station 1 are shown in Figs. 2, 3, and 4. The results indicate that the M5P and bagging predictions are highly correlated with the observations, while the additive regression and random subspace predictions have the lowest correlation with the observed SPI, especially for SPI-3 and SPI-6. The best correlation for SPI-12 (R2 = 0.95) was obtained with the M5P model, while the lowest (R2 = 0.52) was obtained with the additive regression model. However, there was no significant difference between the results obtained with the M5P and bagging models (Table 4).

Fig. 2 Observed SPI-3 versus estimated SPI-3. a Additive regression. b Random subspace. c M5P. d Bagging models during the testing period for station 1

Fig. 3 Observed SPI-6 versus estimated SPI-6. a Additive regression. b Random subspace. c M5P. d Bagging models during the testing period for station 1

Fig. 4 Observed SPI-12 versus estimated SPI-12. a Additive regression. b Random subspace. c M5P. d Bagging models during the testing period for station 1

In order to evaluate the uncertainty in SPI prediction for the Upper Godavari River basin, a box plot was used, as shown in Fig. 5. The box plot includes the first, second, and third quartiles of all the predictive models and of the observed SPI. Fig. 5 shows that M5P is the best predictive model for SPI prediction, followed by the bagging model. In addition, the distributions of the additive regression and random subspace predictions were far from the range of the observed SPI. Hence, it can be concluded that M5P is more suitable for the prediction of SPI at different time scales in station 1.

Fig. 5 Box plot presentation of the performance of the predictive models. a SPI-3. b SPI-6. c SPI-12 for station 1

3.2.2 Evaluation of SPI for station 2

The statistical analysis of the performance of the predictive models on the testing dataset for station 2 is given in Table 5. As for station 1, the M5P model (NSE = 0.69–0.97 and RMSE = 0.27–0.72) and the bagging model (NSE = 0.72–0.94 and RMSE = 0.4–0.6) were found to be the best models for SPI prediction at station 2. The additive regression and random subspace models again showed the lowest performance for all time scales. The suitability of the M5P technique has also been demonstrated in other studies, such as Sattari and Sureh (2019).

Table 5 Statistical analysis of model performance for predicting SPI-3, SPI-6, and SPI-12 in station 2

The scatter plots of observed versus predicted values (SPI-3, SPI-6, and SPI-12) for the testing models of station 2 are shown in Figs. 6, 7, and 8. The results indicate that the M5P and bagging predictions are highly correlated with the observations, while the additive regression and random subspace predictions have the lowest correlation with the observed SPI, especially for SPI-3 and SPI-6. The best correlation (R2 = 0.98) was obtained with the M5P model for SPI-12 estimation, while the lowest (R2 = 0.47) was obtained with the additive regression model for SPI-6 estimation. However, the additive regression and random subspace models performed better for SPI-6 estimation than for SPI-3 estimation.

Fig. 6 Observed SPI-3 versus estimated SPI-3. a Additive regression. b Random subspace. c M5P. d Bagging models during the testing period for station 2

Fig. 7 Observed SPI-6 versus estimated SPI-6. a Additive regression. b Random subspace. c M5P. d Bagging models during the testing period for station 2

Fig. 8 Observed SPI-12 versus estimated SPI-12. a Additive regression. b Random subspace. c M5P. d Bagging models during the testing period for station 2

Figure 9 presents the box plot of the predicted and observed SPI values for station 2. It shows that M5P is the best predictive model for SPI prediction, followed by the bagging model. In addition, the distributions of the additive regression and random subspace predictions were far from the range of the observed SPI. Hence, it can be concluded that the additive regression and random subspace models are not suitable for the prediction of SPI at different time scales in station 2.

Fig. 9 Box plot presentation of the performance of the predictive models. a SPI-3. b SPI-6. c SPI-12 for station 2

3.2.3 Evaluation of SPI for station 3

The statistical analysis of the performance of the predictive models on the testing dataset for station 3 is given in Table 6. As for stations 1 and 2, the M5P model (NSE = 0.69–0.97 and RMSE = 0.28–0.66) and the bagging model (NSE = 0.60–0.92 and RMSE = 0.45–0.7) were found to be the best models for SPI prediction at station 3. The additive regression and random subspace models showed the lowest performance for all time scales and at the different stations. The suitability of the M5P technique has also been demonstrated in other studies, such as Sattari and Sureh (2019).

Table 6 Statistical analysis of model performance for predicting SPI-3, SPI-6, and SPI-12 in station 3

As for stations 1 and 2, the scatter plots of observed versus predicted values (SPI-3, SPI-6, and SPI-12) for the testing models of station 3 are shown in Figs. 10, 11, and 12. The results indicate that the M5P and bagging predictions are highly correlated with the observations, while the additive regression and random subspace predictions have the lowest correlation with the observed SPI, especially for SPI-3 and SPI-6. The best correlation (R2 = 0.98) was obtained with the M5P model for SPI-12 estimation, while the lowest (R2 = 0.45) was obtained with the additive regression model. However, there was no significant difference between the results obtained with the M5P and bagging models for SPI-6 estimation. Figure 13 presents the box plot of the predicted and observed SPI values for station 3. It shows that M5P is the best predictive model for SPI prediction, followed by the bagging model. In addition, the distributions of the additive regression and random subspace predictions were far from the range of the observed SPI. Hence, it can be concluded that the additive regression and random subspace models are not suitable for the prediction of SPI at different time scales in station 3, whereas M5P is considered suitable. Overall, the results revealed that all of the machine learning techniques used in this study could predict the SPI at the longest time scale (SPI-12) with acceptable accuracy, which agrees with the conclusion of Yaseen et al. (2021).

Fig. 10 Observed SPI-3 versus estimated SPI-3. a Additive regression. b Random subspace. c M5P. d Bagging models during the testing period for station 3

Fig. 11 Observed SPI-6 versus estimated SPI-6. a Additive regression. b Random subspace. c M5P. d Bagging models during the testing period for station 3

Fig. 12 Observed SPI-12 versus estimated SPI-12. a Additive regression. b Random subspace. c M5P. d Bagging models during the testing period for station 3

Fig. 13 Box plot presentation of the performance of the predictive models. a SPI-3. b SPI-6. c SPI-12 for station 3

4 Discussion

In India, droughts regularly affect farming and farmers’ lives. Reliable drought prediction is crucial to lessen the effects of drought in the area (Orimoloye et al. 2022). The majority of meteorological stations in India lack the dependable rainfall and temperature data required for drought research and prediction over longer time periods (Shelar et al. 2022; Kumar Gautam et al. 2022). ML techniques were used in this work to overcome the limitations of the climatic data (Elbeltagi et al. 2023a). The current study showed that ML models can accurately predict the SPI, the most popular drought index (DI), over multi-month horizons (i.e., 3, 6, and 12 months). Best subset regression analysis was used to optimize the inputs of the SPI-3, SPI-6, and SPI-12 models. Based on the statistical performance metrics, the best models (bagging and M5P) gave acceptable mid-term drought forecasts based on SPI-3, SPI-6, and SPI-12 for three stations in the Upper Godavari Basin in India. The temporal variability of the SPI could likewise be reproduced for other regions of India. These models could help decision-makers and experts in the water sector make informed choices (Pande et al. 2022, 2023a).

The examined models predicted SPIs more accurately for mid-term drought conditions, including when the training and testing periods were taken into account. Our results were compared with those of recent research carried out in other places, including Bangladesh, Ethiopia, India, and Iran, and they support the findings of Malik et al. (2021a) and Yaseen et al. (2021). Additionally, monsoon months are more prone to severe drought than other times of the year, with June exhibiting the greatest vulnerability. Serious droughts are more likely to occur in September. The bagging model proved to be superior among the chosen models during the training and testing phases for each SPI time scale (i.e., SPI-3, SPI-6, and SPI-12), which agrees with the findings of Ditthakit et al. (2021). In Bangladesh, Yaseen et al. (2021) examined the effectiveness of machine learning (ML) techniques such as random forest (RF), bagging, M5P tree, extreme learning machine (ELM), and online sequential-ELM (OSELM) in predicting SPI at 4-month horizons (i.e., 1, 3, and 12). According to that study, bagging and M5P provided the most accurate predictions for the 3-, 6-, and 12-month SPI. Belayneh et al. (2016) used three machine learning techniques: artificial neural networks (ANNs), support vector regression (SVR), and M5P. They concluded that M5P provided the best model performance for SPI-3 (3-month SPI) and SPI-6 (6-month SPI) when forecasting the multi-scale drought index (Pande et al. 2022, 2023a).

This is expected, since the SPI time series becomes smoother and less erratic as the time scale gets longer. As was found in the current work, more linear data enhance the performance of machine learning models. Predictions of shorter-scale SPI by M5P were nevertheless accurate. M5P and bagging can therefore capture a highly non-linear process.
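The bagging idea referred to above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the configuration used in this study: the base learner is assumed to be a regression stump, the data are a synthetic step function, and all function names (fit_stump, fit_bagging, predict_bagging) are hypothetical.

```python
import random


def fit_stump(xs, ys):
    """Regression stump: one threshold with a constant (mean) prediction per side."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical x values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        mean_l = sum(left) / len(left)
        mean_r = sum(right) / len(right)
        err = sum((y - mean_l) ** 2 for y in left) + sum((y - mean_r) ** 2 for y in right)
        if best is None or err < best[0]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (err, threshold, mean_l, mean_r)
    if best is None:
        # Degenerate bootstrap sample (all x equal): predict the overall mean.
        mean_all = sum(ys) / len(ys)
        return (float("inf"), mean_all, mean_all)
    return best[1:]


def predict_stump(stump, x):
    threshold, mean_l, mean_r = stump
    return mean_l if x <= threshold else mean_r


def fit_bagging(xs, ys, n_models=25, seed=0):
    """Fit one stump per bootstrap sample (sampling with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def predict_bagging(models, x):
    """Aggregate by averaging the predictions of all base learners."""
    return sum(predict_stump(m, x) for m in models) / len(models)


# Synthetic, hypothetical data: a step that a single straight line fits poorly.
xs = [float(i) for i in range(20)]
ys = [0.0 if x < 10 else 1.0 for x in xs]
models = fit_bagging(xs, ys)
```

Because each stump sees a different bootstrap sample, the thresholds differ slightly between models, and the averaged prediction near the step is smoother than that of any single stump.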

The M5P model is a model tree that combines a decision tree with multivariate linear regression. Each leaf of the tree holds a linear regression model. This structure partitions the data into segments and matches each segment with a suitable regression model (Heddam and Kisi 2018). Because of this decomposition capacity, it can fit multiple local models to diverse non-linear datasets. M5P was able to simulate every data point in a data series, which improved its capacity to predict phenomena with linear models. This property significantly enhanced M5P’s capacity to rapidly learn and model high-dimensional data.
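The idea of a tree whose leaves hold linear models can be illustrated with a deliberately reduced sketch. It is not M5P itself: it assumes a single input, a single split chosen by the total squared error of the two leaf fits, and no pruning or smoothing. The data are synthetic and the function names (fit_line, fit_model_stump, predict) are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0.0:
        return mean_y, 0.0  # constant x: fall back to the mean
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def sse(xs, ys, model):
    """Sum of squared errors of a fitted line on the given points."""
    a, b = model
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


def fit_model_stump(xs, ys, min_leaf=2):
    """Try every split; keep the one whose two leaf lines give the lowest total SSE."""
    pairs = sorted(zip(xs, ys))
    sx = [p[0] for p in pairs]
    sy = [p[1] for p in pairs]
    best = None
    for i in range(min_leaf, len(sx) - min_leaf + 1):
        if sx[i - 1] == sx[i]:
            continue  # cannot split between identical x values
        left = fit_line(sx[:i], sy[:i])
        right = fit_line(sx[i:], sy[i:])
        err = sse(sx[:i], sy[:i], left) + sse(sx[i:], sy[i:], right)
        if best is None or err < best[0]:
            threshold = (sx[i - 1] + sx[i]) / 2
            best = (err, threshold, left, right)
    if best is None:
        raise ValueError("not enough distinct points to split")
    return best[1:]


def predict(tree, x):
    """Route x to a leaf, then apply that leaf's linear model."""
    threshold, left, right = tree
    a, b = left if x <= threshold else right
    return a + b * x


# Synthetic, hypothetical data: a "V" shape that no single line can fit.
xs = [float(i) for i in range(-5, 6)]
ys = [abs(x) for x in xs]
tree = fit_model_stump(xs, ys)
```

On this V-shaped series a single regression line leaves a large residual, whereas the two leaf models reproduce the data, which is the sense in which a tree of linear models can represent a non-linear relationship.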

Stochastic (time-series) models and nature-inspired algorithms have been used to forecast a number of drought indices (DIs). The outcomes of these models were compared with those of M5P. Meteorological drought in Ankara, Turkey, was forecast with regression and random subspace models using lagged SPI data (Mehdizadeh et al. 2020). The prediction accuracy obtained in the present investigation was higher than that of those regression and random subspace models. Droughts in eastern Australia were predicted using the least-squares support vector machine (LSSVM), multivariate adaptive regression splines (MARS), and M5P tree models (Deo et al. 2017). The M5P tree technique was reported to have better prediction accuracy. The ANFIS, M5P, M11, and M13 models were among the ML models employed by Nguyen et al. (2015) to forecast SPI in the Cai River basin in Vietnam. According to them, the highest performing model was M5P, followed by M11 and M13. Stepwise linear regression, genetic programming, and M5P approaches were used by Adarsh and Janga Reddy (2019) to forecast standardized precipitation indices for various areas of India. They observed that M5P performed better than expected in predicting droughts across all cases. Shamshirband et al. (2020) predicted SPI using support vector regression, bagging, and M5P models and reported improved results with the M5P and bagging models. Barzkar et al. (2022) predicted SPIs for various climatic conditions using three ML models: gene expression programming (GEP), M5P, and MARS. They demonstrated that the M5P model outperformed the others in every instance (Elbeltagi et al. 2023b).

The studies cited above demonstrate the potential of ML models to forecast droughts in various meteorological contexts. According to the current study, the ML models, particularly M5P and bagging, were better able to predict meteorological droughts over a wide range of time scales. The effects of droughts are disastrous for both society and the economy. The results of this study suggest that drought forecasting models could be deployed as an early-warning tool to lessen the consequences of drought in India’s eastern areas, which are prone to them.

5 Conclusions

In this study, four machine learning models, namely additive regression, random subspace, M5P, and bagging, were selected to predict future SPI-3, SPI-6, and SPI-12 values in the Upper Godavari Basin, India. The input data series used to develop the four models were pre-processed with machine learning to enhance model performance. Based on the statistical performance metrics, bagging was the best model for predicting SPI-3 and SPI-6 and M5P was the best for SPI-12 at station 1, whereas at stations 12 and 13 M5P was the best for SPI-3 and SPI-12 and bagging was the best for SPI-6. Overall, the best models gave acceptable mid-term drought forecasts based on SPI-3, SPI-6, and SPI-12 for three stations in the Upper Godavari Basin in India. Finally, these best machine learning models predict drought phenomena well based on the standardized precipitation index (SPI), are not limited by the range of the training inputs, and give precise forecasts for short-term and mid-term drought situations. The results for the study area can be useful for drought-related policy and planning, water resources management, crop water requirement estimation, and irrigation planning in the semi-arid region.