Introduction

Nations in arid and semiarid regions have long faced severe water scarcity and the challenge of securing a reliable water supply. Groundwater is one such reliable, natural water resource. Ancient Iranians invented the Qanat to supply water for drinking and agricultural purposes in cities and villages. The Qanat system was subsequently employed mainly in the Middle East, North Africa, southern Spain, and other arid and semiarid regions. Qanats extract groundwater by gravity, without energy consumption (Azari Rad et al. 2018; Naghibi et al. 2018; Samani et al. 2023a; Sedghi and Zhan 2024). Beyond their economic benefits, Qanats act as groundwater regulators that lower the water table in areas with high groundwater levels. Compared to water well drilling, Qanats have fewer limitations in shallow groundwater and in aquifers with low hydraulic conductivity. However, Qanats are drying up due to extreme climate change and groundwater overexploitation (Sedghi and Zhan 2020). Despite their importance in the water cycle and in people’s lives in the Middle East, little scientific research has explored the hydrological and hydrogeological conditions of Qanats and, more specifically, modeling methods to predict Qanat discharge (QD), which was the main motivation for the current study.

In groundwater research, machine learning methods are promising tools for quantitatively evaluating and predicting hydrogeological conditions and groundwater level fluctuations (Cui et al. 2022; Dehghani and Torabi Poudeh 2022). For instance, the main goal of modeling could be to predict Qanat discharge under different hydrologic scenarios and stresses and to evaluate the hydrological effects of nearby Qanats on each other (Naghibi et al. 2018). Unavailable or insufficient data could be the most significant limitation of mathematical models, particularly the comprehensive, complex datasets needed to simulate anthropogenic factors. Based on prior studies, machine learning methods could provide a considerable advantage in modeling the patterns of input-output datasets (Antonopoulos and Gianniou 2022; Pham et al. 2022; Poursaeid et al. 2022; Sun et al. 2022). The nonlinear, time-variant behavior of hydrogeological systems is too complicated to be solved with standard statistical methods (Najafabadipour et al. 2022). Researchers have therefore tried various machine learning methods, such as artificial neural networks (ANNs), gene expression programming (GEP), group method of data handling (GMDH), adaptive neuro-fuzzy inference system (ANFIS), and least-square support vector machine (LSSVM), for groundwater level (GWL) prediction (Samani et al. 2013; Bahmani and Ouarda 2021; Ghazi et al. 2021a; Kiyani et al. 2022; Mozaffari et al. 2022; Poursaeid et al. 2022; Tao et al. 2022). Overall, the prior studies revealed the ability of machine learning models to cope with the complicated behavior of hydrogeological systems and to provide satisfactory results with fewer data requirements.

The ANN is a prominent machine learning (ML) model that handles difficult and complex hydrological problems based on statistical algorithms. ANN is the most widely applied model for predicting the quantitative characteristics and conditions of groundwater, owing to the simplicity and availability of its architecture and code (Iqbal et al. 2020; Banadkooki et al. 2020; Ahmadi et al. 2022; Samani et al. 2022). Based on the literature review, studies focused on GWL modeling using ANFIS methods have also increased. The ANFIS model could provide simulated time series with high accuracy and precision for hydrological time series with different time steps, i.e., monthly, daily, or weekly (Moravej et al. 2020; Sridharam et al. 2021; Samantaray et al. 2022; Tao et al. 2022). The GMDH model is a promising tool for finding nonlinear relationships in hydrological systems. GMDH has been extensively applied in various areas of hydrological studies, such as river flow prediction, management, and soil and sediment studies (Lin et al. 2020; Khodakhah et al. 2022; Jaafari et al. 2022; Mulashani et al. 2022; Nadiri et al. 2022). In addition, various studies have applied the GMDH approach to groundwater level prediction (Moghaddam et al. 2021; Arya Azar et al. 2022; Tao et al. 2022; Samani et al. 2023b). LSSVM has also been implemented for predicting GWL, and it has been reported to improve accuracy compared to ANN (Miraki et al. 2019; Guzman et al. 2019; Khedri et al. 2020).

Several works have been reported in the literature regarding the application of ANN, ANFIS, GMDH and LSSVM in hydrogeological issues, civil engineering, soil science and water quality (Tayebi et al. 2019; Lin et al. 2020; Tao et al. 2022; Samantaray and Sahoo 2023; Samantaray et al. 2023a, b; Samani 2024; Tao et al. 2024). The GEP model has been widely used in groundwater modeling (Ghazi et al. 2021b) and in other fields such as leak detection in water distribution networks (Tijani and Zayed 2022), modeling soil enzymes (Ebrahimi et al. 2021), simulation of subgouge soil deformation in sand (Azimi and Shiri 2020), and prediction of daily dew point temperature (Mehdizadeh et al. 2017).

Despite the critical role of Qanats in providing water in arid and semi-arid regions, scientific research has not extensively explored the hydrological and hydrogeological dynamics specific to Qanats. This gap is significant given the pressing challenges posed by climate change and groundwater overexploitation, which threaten the sustainability of these ancient systems. Furthermore, while machine learning methods have been successfully applied to predict groundwater levels in various contexts, their application to Qanat discharge prediction remains limited. This study aims to bridge these gaps by (1) Evaluating the effectiveness of various machine learning models in predicting Qanat discharge, which has been less studied compared to other groundwater systems; (2) Identifying the most influential hydrological and hydrogeological parameters affecting Qanat discharge, an area not thoroughly investigated in previous Qanat studies; (3) Demonstrating the interplay between Qanats and adjacent hydrological features, which has not been sufficiently modeled or understood in the context of Qanat sustainability and efficiency. By focusing on these research gaps, this study seeks to advance the understanding of Qanat systems and improve the predictive capabilities of hydrological models in arid environments, ultimately contributing to better water resource management in these vulnerable regions. Figure 1 illustrates the procedural outline of the implemented ML methods.

Fig. 1 The methodological procedure of the QD models

Methods

The selection of machine learning methods for predicting Qanat discharge was based on several criteria aimed at capturing the complex, non-linear relationships inherent in hydrological data. The chosen methods are known for their robustness in handling various types of input data and their ability to model complex systems effectively. Here are the specific reasons for selecting each method:

Artificial Neural Network (ANN): ANN was chosen due to its proven ability in numerous hydrological studies to model complex and non-linear relationships between inputs and outputs. ANNs are flexible and can be trained to learn patterns from historical data, making them suitable for predicting Qanat discharge which is influenced by a variety of hydrological variables.

Adaptive Neuro-Fuzzy Inference System (ANFIS): ANFIS combines the learning capabilities of neural networks with the linguistic rule presentation of fuzzy logic. This hybrid approach allows it to effectively handle uncertainty and model the nonlinear relationships between hydrological parameters. Its success in previous studies involving groundwater level prediction under uncertain conditions justified its selection.

Gene Expression Programming (GEP): GEP was selected for its ability to create models that can evolve over time, allowing for better adaptation to changing patterns in data. This feature is particularly useful in the context of Qanats, where hydrological inputs may change due to environmental factors or human activities.

Group Method of Data Handling (GMDH): GMDH is known for its self-organizing capabilities, making it suitable for modeling complex systems where the relationships between parameters are not fully understood. It was chosen for its ability to generate a model with optimal complexity, reducing the risk of overfitting which is crucial in accurately predicting Qanat discharge.

Least Square Support Vector Machine (LSSVM): LSSVM was included for its robustness in dealing with small datasets and its effectiveness in solving non-linear regression and classification problems. Given the limited data availability for Qanats, LSSVM’s capability to provide high accuracy with fewer data points is highly advantageous.

These methods were also chosen because of their complementary strengths, providing a comprehensive approach to the modeling process. By comparing their performance, we aim to identify the most effective model(s) for QD prediction, contributing to more reliable water resource management strategies.

Artificial neural network (ANN)

The ANN model, which is inspired by the behavior of the brain, operates in two stages: (i) acquiring knowledge from the environment via a learning procedure and (ii) using the interneuron connections to store the acquired knowledge (Haykin 2004; Patel et al. 2022; Sreelakshmi and Shaji 2022). The whole process consists of five phases: choosing the input dataset, choosing a proper framework, designing the neural network, training and testing, and lastly, model evaluation (Sahoo and Jha 2013). As the well-accepted ANN model for simulating hydrological phenomena, the multilayer perceptron (MLP) was applied in the present work (McGarry et al. 1999). The MLP model consists of three main layers: input, hidden (middle) and output. The number of layers and neurons is crucial to obtaining an optimal model framework. In the present study, one hidden layer was utilized since it could be adequate for QD prediction based on earlier similar studies. The Levenberg-Marquardt (LM) algorithm was applied for the MLP training. The MATLAB® (Mathworks 2014) software package was used to create the ML models in the present study. The general structures/procedures of the developed models are given in Fig. 2a-e.

Adaptive neuro-fuzzy inference system (ANFIS)

The ANFIS, as a particular ML method, could take advantage of both the ANN and the fuzzy inference system (Jang 1993). The ANFIS is a flexible statistical method that could recognize complex nonlinear and uncertain patterns, accounting for ambiguity between variables, without complete knowledge of the nature of the problem. The Sugeno system is used in the ANFIS, and generally, the ANFIS structure is composed of five layers (Fig. 2b): fuzzy membership, fuzzification, normalization, defuzzification, and output (Jang 1993; Wee et al. 2021). Subtractive clustering (SC) is utilized in the present study to partition the n-dimensional input space into distinct clusters. The cluster radius, which sets the range of influence of each cluster center, could be selected from 0 to 1. Identifying the optimum cluster radius is vital because it determines the number of clusters; it was obtained by a trial-and-error procedure in the current paper.

Gene expression programming (GEP)

Koza (1992) proposed Genetic Programming (GP) as a generalization of the Genetic Algorithm (GA) (Goldberg 1989), which utilizes a ‘‘parse tree’’ form to search the solution space, and GEP uses the benefits of both GA and GP. The GEP model, an intelligent evolutionary model, was first introduced by Ferreira (2001). The initial phase of the GEP algorithm is the generation of the initial population, which could be done randomly or with some information about the problem. Next, the chromosomes are expressed to create the expression trees. Then, the model results are assessed based on a fitness function to define the degree of satisfaction. If the stopping criterion (a desired threshold of the fitness function or number of generations) is met, the evolution procedure is stopped, and the best results achieved at this step are reported. Otherwise, the best result of the current generation is retained, and the modeling procedure is repeated for several generations until a solution is found (Ferreira 2001). The structure, input and target variable, and position function of the GEP model differ from other ML methods, such as ANN. The optimum structure of the GEP model and its coefficients are defined during the training procedure. The nature of the GEP provides further flexibility to the model. The overall structure of the GEP method is given in Fig. 2c.

Group method of data handling (GMDH)

Ivakhnenko (1968) introduced the GMDH method, which employs a self-organizing model (SOM) to address intricate and nonlinear problems, including prediction, classification, and various other challenges. In the GMDH method, the main inputs, the number of hidden layers and neurons, and the network structure are essentially specified by the method itself. In other respects, the GMDH, as a polynomial neural network, resembles ANN models. Mueller et al. (1998) stated that statistical analysis and ANN are deductive methods that could not uncover complex objects because they need a lot of a priori information. Instead, the GMDH model, as a regression-based method, could combine the advantages of both methods (Lemke 1997). Therefore, GMDH could cover the deficits of ANN, while statistical neural networks could only partly solve them, and all model structures in the GMDH model could be specified by default. Nariman-Zadeh et al. (2002) provided more detailed information about the GMDH model. The framework of the GMDH model is presented in Fig. 2d.
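To make the self-organizing idea concrete, the following minimal sketch fits one candidate GMDH neuron per pair of inputs and keeps the pair with the lowest validation error. It assumes the commonly used two-input quadratic (Ivakhnenko) polynomial as the neuron and a held-out mean squared error as the external selection criterion; neither choice, nor the data, function names, or the pure-Python solver, is taken from the present study, which built its models in MATLAB.

```python
import random
from itertools import combinations

def quad_features(u, v):
    """Terms of the two-input quadratic (Ivakhnenko) polynomial."""
    return [1.0, u, v, u * v, u * u, v * v]

def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def fit_neuron(u_col, v_col, target):
    """Least-squares coefficients of one neuron via the normal equations."""
    rows = [quad_features(u, v) for u, v in zip(u_col, v_col)]
    k = len(rows[0])
    ata = [[sum(r[p] * r[q] for r in rows) for q in range(k)] for p in range(k)]
    aty = [sum(r[p] * t for r, t in zip(rows, target)) for p in range(k)]
    return solve_linear(ata, aty)

def neuron_output(coef, u, v):
    return sum(c * f for c, f in zip(coef, quad_features(u, v)))

def best_neuron(train_cols, y_train, valid_cols, y_valid):
    """Fit one neuron per input pair and rank by validation MSE."""
    scored = []
    for i, j in combinations(range(len(train_cols)), 2):
        coef = fit_neuron(train_cols[i], train_cols[j], y_train)
        errors = [(neuron_output(coef, u, v) - t) ** 2
                  for u, v, t in zip(valid_cols[i], valid_cols[j], y_valid)]
        scored.append((sum(errors) / len(errors), (i, j), coef))
    return min(scored, key=lambda item: item[0])

# Hypothetical data: three inputs; the target depends only on inputs 0 and 1.
rng = random.Random(0)
cols = [[rng.random() for _ in range(30)] for _ in range(3)]
y = [1.0 + 2.0 * a * b for a, b in zip(cols[0], cols[1])]
train_cols = [c[:20] for c in cols]
valid_cols = [c[20:] for c in cols]
mse, pair, coef = best_neuron(train_cols, y[:20], valid_cols, y[20:])
print("selected input pair:", pair, "validation MSE:", mse)
```

In a full GMDH network, the outputs of the best few neurons would become the inputs of the next layer, and layers would be added until the validation error stops improving.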

Least square support vector machine (LSSVM)

Vapnik (1998) introduced the theory and basic concepts of the SVM. The SVM’s formulation is considered preferable to that of the ANN since it is based on structural risk minimization, whereas the ANN employs empirical risk minimization. The SVM’s main process consists of selecting the support vectors that underpin the model structure and identifying their weights. A comprehensive mathematical framework of the SVM was given by Vapnik (1998). Then, Suykens and Vandewalle (1999) proposed the LSSVM based on the SVM. LSSVM is a robust method for function estimation, nonlinear classification, and density estimation in hydrological problems. By changing the inequality constraints of the SVM to equality constraints, LSSVM reduces the problem to solving a system of linear equations (Kumar and Kar 2009). Also, the LSSVM is faster than the SVM (Fels and Ghorfi 2022; Gu et al. 2010).

Different algorithms have been proposed to solve the dual optimization problem of SVM models. Sequential Minimal Optimization (SMO), a learning algorithm for SVM that solves its sub-problems analytically (Platt 1999), was employed in this work. SMO can solve the SVM problem directly, without a numerical quadratic optimizer or additional matrix storage. The result of LSSVM depends strongly on the appropriate selection of the kernel function and on tuning the parameters C and γ. The polynomial kernel function was used in this study for LSSVM because of its good performance in QD prediction for the study area’s dataset. Also, the optimum parameters of the LSSVM model were obtained by a trial-and-error process (Suryanarayana et al. 2014). The LSSVM was applied using codes of the LIBSVM library introduced by Chang and Lin (2011). The overall structure of the LSSVM model is given in Fig. 2e.
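As an illustration of how equality constraints turn the training problem into a linear system, the sketch below solves the standard LSSVM regression dual directly. It is not the LIBSVM/SMO implementation used in the study: the solver, the RBF kernel, the toy data, and all function names are assumptions, and the roles given to C (regularization constant) and γ (kernel coefficient) follow LIBSVM's usual convention rather than anything stated in the text.

```python
import math

def rbf(a, b, gamma):
    """RBF kernel exp(-gamma * ||a - b||^2)."""
    return math.exp(-gamma * sum((p - q) ** 2 for p, q in zip(a, b)))

def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def lssvm_fit(X, y, c=1.0, gamma=5.0):
    """Solve [[0, 1^T], [1, K + I/c]] [b; alpha] = [0; y]."""
    n = len(y)
    system = [[0.0] + [1.0] * n]
    for i in range(n):
        row = [1.0] + [rbf(X[i], X[j], gamma) for j in range(n)]
        row[i + 1] += 1.0 / c          # ridge term from the squared-error loss
        system.append(row)
    solution = solve_linear(system, [0.0] + list(y))
    return solution[0], solution[1:]   # bias b, dual weights alpha

def lssvm_predict(x, X, b, alpha, gamma=5.0):
    return b + sum(a * rbf(x, xi, gamma) for a, xi in zip(alpha, X))

# Hypothetical one-dimensional training set (not data from the study).
X = [[0.0], [0.25], [0.5], [0.75], [1.0]]
y = [0.0, 0.1, 0.4, 0.9, 1.6]
b, alpha = lssvm_fit(X, y)
print([round(lssvm_predict(x, X, b, alpha), 3) for x in X])
```

Because every training point receives a nonzero dual weight, the LSSVM solution is not sparse, which is the usual trade-off for replacing the quadratic program with a single linear solve.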

Fig. 2 The general structure of ML models: ANN (a), ANFIS (b), GEP (c), GMDH (d) and LSSVM (e)

Qanat system

The Qanat system represents a traditional means of water supply and agricultural water use and is a characteristic feature of Iran’s climate and landscape. However, in many areas, Qanats are now rapidly being replaced by deep wells.

Qanats have been built by hand by experienced laborers with extensive knowledge of geology and engineering. As shown in Fig. 3, a Qanat system comprises the following four main elements:

  1. Mother well: a well dug down to the groundwater level. Qanats conduct groundwater by gravity, and the mother well is typically built on a sedimentary formation at the foot of the mountains. The deepest mother well, about 300 m, belongs to a 2,700-year-old Qanat in Gonabad in Iran (Boustani 2008).

  2. Outlet: the area where water emerges at the surface. There are usually several candidate locations for the groundwater outlet. The final position is determined by several factors, such as the proximity to the places of water use and the gallery’s slope.

  3. Gallery: once the locations of the outlet and mother well are determined, construction of a slightly sloped tunnel begins from the outlet in the direction of the mother well. The choice of the gallery’s slope is a trade-off between erosion and sedimentation; the slope of Qanats is usually about 0.5%. Gallery lengths range from a hundred meters to kilometers, and the longest Qanat gallery, about 120 km, belongs to Zarach in Iran (Molle et al. 2004).

  4. Shafts: a string of vertical wells constructed along the Qanat’s gallery between the outlet and the mother well, at 20–50 m intervals, to assist sediment removal and to provide air circulation and access for laborers while constructing the gallery and, after construction, for monitoring and maintaining the Qanat.

To simplify the subject, a Qanat could be defined as a practically horizontal tunnel with a series of shaft wells that transmits groundwater from the aquifer to the surface. Groundwater seeps into the tunnel in the saturated zone and then flows down the tunnel to the outlet.

Also, Qanats can be considered a drainage system that drains groundwater to supply agricultural and drinking usage (Boustani 2008; Ganjeizadeh Rohani et al. 2024; Mohajerani et al. 2024; Nasiri and Mafakheri 2015; Yazdi and Khaneiki 2016). A cross-section of a typical Qanat in an unconfined aquifer and two intersecting Qanats near the town of Meybod, Yazd Province, Iran is given in Fig. 3. It should be mentioned that, in a Qanat, the discharge is a function of the specific storage of the aquifer, the length of the gallery, and the groundwater level; consequently, groundwater fluctuations can directly affect the Qanat’s discharge. Although the Qanat has provided water for Iranians for over two thousand years, the difficulty of controlling its discharge has led to the view that the Qanat is now unsuitable for the optimum usage of water resources.

Fig. 3 Cross section of typical Qanat in an unconfined aquifer

Study area

The Razan-Ghahavand study area is located in Hamedan province and covers an area of 3084 square kilometers. Water requirements for residents of this area are supplied by 1788 wells, 104 springs, and 96 Qanats. The groundwater system is under significant stress due to an imbalance between extraction and infiltration, resulting in excessive exploitation of groundwater resources. A detailed location of the study area is given in Fig. 4. The elevation of the study area ranges from 1581 to 2741 m above mean sea level, and the mean annual temperature and precipitation of the study region are 10 °C and 245 mm, respectively.

Fig. 4 The geographical map of the study area

Data and preprocessing

This study utilizes time series data collected from the Razan-Ghahavand Aquifer over a period of 18 years, from 2003 to 2021. The data, provided by the Hamedan Regional Water Authority, include monthly measurements of Qanat discharge (QD), groundwater levels (GWL), and meteorological parameters such as temperature (T), precipitation (P), and evapotranspiration (ET). Each selected variable plays a specific role in the hydrological dynamics of Qanats: the discharge of the adjacent Qanat (QD1 and QD2) reflects the effect of the neighboring Qanat’s discharge on each Qanat; the groundwater level (GWL) indicates the aquifer’s state, influencing QD through hydraulic connectivity; and the meteorological data (temperature, precipitation, and evapotranspiration) directly affect evaporation and recharge rates, hence influencing QD. The interplay of these variables determines the aquifer’s response to natural and anthropogenic changes, making them critical for accurate QD predictions. The output variable of the models is the Qanat discharge (QD), predicted at one-, two-, and five-month lead times.

For accurate modeling of Qanat discharge (QD) using groundwater levels (GWL) and hydrological time series data, the optimal number of lagged inputs that significantly influence the current values must be determined. The Autocorrelation Function (ACF) and Partial Autocorrelation Function (PACF) analyses were employed to identify these effective lags. The ACF measures the correlation between time series observations at different time lags, indicating how past values are related to future values without distinguishing between direct and indirect effects. Conversely, the PACF measures the correlation between observations at different lags while controlling for the values of the intervening observations, thereby isolating the direct effects. The ACF and PACF plots showed significant autocorrelations up to lag 3 for both Qanat 1 and Qanat 2, as well as for GWL. Based on this observation, lags 1, 2, and 3 of QD and GWL were included as predictors in the machine learning models.
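The lag screening described above can be sketched as follows. This is a minimal illustration, assuming the biased sample autocorrelation, the Durbin-Levinson recursion for the PACF, and an approximate 95% significance band of ±1.96/√n; the synthetic series and all function names are hypothetical and stand in for the monthly QD and GWL records, which are not reproduced here.

```python
import math
import random

def acf(series, max_lag):
    """Sample autocorrelations r_0..r_max_lag (biased estimator)."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t + k] for t in range(n - k)) / c0
            for k in range(max_lag + 1)]

def pacf(series, max_lag):
    """Partial autocorrelations for lags 1..max_lag (Durbin-Levinson)."""
    r = acf(series, max_lag)
    phi_prev = []   # coefficients phi_{k-1,1} .. phi_{k-1,k-1}
    out = []
    for k in range(1, max_lag + 1):
        num = r[k] - sum(phi_prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den
        phi_prev = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j]
                    for j in range(k - 1)] + [phi_kk]
        out.append(phi_kk)
    return out

def significant_lags(values, n_obs, first_lag):
    """Lags whose coefficient lies outside the approximate 95% band."""
    bound = 1.96 / math.sqrt(n_obs)
    return [first_lag + i for i, v in enumerate(values) if abs(v) > bound]

# Hypothetical monthly series (synthetic AR(1) process, 216 values).
rng = random.Random(42)
series = [0.0]
for _ in range(215):
    series.append(0.7 * series[-1] + rng.gauss(0.0, 1.0))

r = acf(series, 6)
p = pacf(series, 6)
print("significant ACF lags:", significant_lags(r[1:], len(series), 1))
print("significant PACF lags:", significant_lags(p, len(series), 1))
```

Lags that are significant in the PACF are the natural candidates for lagged predictors, since they carry information not already explained by shorter lags.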

Model development

Monthly local QD and GWL at one, two and three lag-times (QDt−1, QDt−2, QDt−3 and GWLt−1, GWLt−2, GWLt−3), temperature (T), precipitation (P), evapotranspiration (ET), and the QD of the neighboring Qanat were considered as inputs for one-, two-, and five-month ahead QD predictions for two adjacent Qanats in the Razan-Ghahavand Aquifer. This study explored GWL fluctuations based on a representative hydrograph. The Hamedan Regional Water Authority published the Qanat discharge and monthly GWLs for 18 years from 2003 to 2021. To evaluate the models’ efficiency in predicting QD, the input-output time series was split into two sections: 70% for the training phase and 30% for the testing phase.
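The construction of lagged samples and the 70/30 split can be sketched as below. Two points are assumptions rather than details given in the text: the target for a lead time of L months is taken as QD at t + L − 1 when the most recent input is at t − 1, and the split is chronological (earliest 70% for training). The toy series and function names are hypothetical, and the meteorological and neighboring-Qanat inputs are omitted for brevity.

```python
def make_lagged_samples(qd, gwl, lags=(1, 2, 3), lead=1):
    """Pair lagged QD and GWL values with the QD value `lead` months ahead."""
    max_lag = max(lags)
    inputs, targets = [], []
    for t in range(max_lag, len(qd) - (lead - 1)):
        row = [qd[t - k] for k in lags] + [gwl[t - k] for k in lags]
        inputs.append(row)
        targets.append(qd[t + lead - 1])   # assumed target convention
    return inputs, targets

def chronological_split(inputs, targets, train_fraction=0.7):
    """Keep the earliest samples for training and the latest for testing."""
    n_train = int(len(inputs) * train_fraction)
    return ((inputs[:n_train], targets[:n_train]),
            (inputs[n_train:], targets[n_train:]))

# Hypothetical short series, only to show the indexing.
qd = [float(i) for i in range(20)]
gwl = [100.0 + i for i in range(20)]

X1, y1 = make_lagged_samples(qd, gwl, lead=1)
X5, y5 = make_lagged_samples(qd, gwl, lead=5)
(train_X, train_y), (test_X, test_y) = chronological_split(X1, y1)
print(X1[0], y1[0], len(train_X), len(test_X))
```

A chronological split avoids testing on months that precede the training data, which would otherwise leak future information into the model.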

Model implementation

Selecting the most appropriate combination of input datasets is an essential step in ML modeling.

Including unrelated datasets with time lags as inputs in the scenarios reduces the reliability of the results and complicates the models’ structure. The general correlation analysis shows that QD in both Q1 and Q2 Qanats in the Razan-Ghahavand is strongly correlated with the QD time series of the neighboring Qanat (Q1 and Q2) and with GWL fluctuations. Moreover, meteorological datasets were additionally used for QD prediction.

Different input combinations were evaluated using the predictive input parameters with different lag time steps from one-month “QDt−1” to three-month prior “QDt−3” to predict QD with various lead times (one- to five-month ahead) (Table 1).

Table 1 Different combinations for both models

Results and discussion

This research focuses on utilizing machine learning methods to accurately predict QD and identify the most effective model for practical application. The study employs several machine learning techniques, including GEP, ANN, GMDH, LSSVM, ANFIS. These methods are used to forecast QD one, two, and five months in advance within an unconfined aquifer.

Designing an optimum structure for each model is an important phase of the modeling process since an inappropriate model structure could trigger over- or under-fitting issues. In the present study, a three-layered ANN model was adopted for QD prediction; initial outcomes revealed that one hidden layer was sufficient to capture the relationship between the predictor inputs and QD. A trial-and-error procedure was applied to identify the number of neurons in the hidden layer. This information is provided in the third column of Tables 2, 3, 4, 5, 6 and 7 for Qanat 1 and 2. Optimizing the cluster radius is a substantial concern for an efficient ANFIS structure. Smaller radii produce numerous small clusters and, subsequently, many rules, whereas large radii yield a few large clusters and thus fewer rules (Sanikhani and Kisi 2012). The optimal LSSVM parameters were determined as C = [0.2, 1] and γ = 5 by a trial-and-error procedure. RBF is chosen as the appropriate kernel function.

Model comparison according to computational effort and run times

Computational cost is often a substantial constraint of prediction models. In the present study, the computation times of the ANN, ANFIS, GEP, GMDH, and LSSVM models for the first combination and one-month ahead prediction were recorded as 4.70, 3.90, 6.17, 1.32, and 0.48 s, respectively. The outcomes reveal that the LSSVM is faster than the other implemented ML models. Moreover, an appropriate iteration (epoch) number, an important factor for ML models, could improve the model accuracy in both the training and validation stages and could prevent overtraining. After adjusting models of various forms, it is apparent that 100–200 iterations are enough for calibrating all developed models in predicting QD in both Qanats. The epoch number for all developed ANN models was less than 100.

Comparison of the implemented models

This study considers the application of various machine learning models (ANN, ANFIS, GEP, GMDH, and LSSVM) to predict QD up to five months beyond the data records. In the first part, various input combinations were examined with the applied models.

It should be noted that the combinations are categorized into three groups in the present application (Table 1): (1) hydrogeological-based, employing only Qanat discharge data (QDt−1, QDt−2, QDt−3) to predict QD (combination 1); (2) a mixture of hydroclimatic and hydrogeological-based, employing monthly local QD and QD of the neighboring Qanat, local GWL, temperature (T), precipitation (P), and evapotranspiration (ET) to predict QD (combinations 2, 3, 4, and 6); and (3) hydrological and hydrogeological-based combinations that predict QD without using previous QD values (combinations 5 and 7).

Models for QD prediction could be categorized into various groups based on NSE (Nash-Sutcliffe Efficiency) as follows: very good (0.85 < NSE ≤ 1.00), good (0.70 < NSE ≤ 0.85), satisfactory (0.55 < NSE ≤ 0.70), and unsatisfactory (NSE ≤ 0.55) (Moriasi et al. 2015). All models except the GEP model in the training phase for combinations 5 and 7 show satisfactory and good results according to NSE results (0.55 < NSE ≤ 0.70 and 0.70 < NSE ≤ 0.85). Tables 2 and 5 show that the GMDH model gives the best results among other applied models for combinations 5 and 7 for the training and testing phases of both Qanats. For the one month ahead QD prediction in Qanat 1, the NSE values in the training phase are 0.84 (combination 5) and 0.85 (combination 7), and in Qanat 2, the corresponding values are 0.82 and 0.83. Hence, GMDH can get good estimation without using previous QD values, and this model can be efficient for predicting QD when monthly QD time series data are unavailable.
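For reference, the evaluation statistics and the NSE rating bands quoted above can be computed as in the following sketch. The NSE, RMSE, MAE, and Pearson R formulas are the standard ones; normalizing the RMSE by the observed range is an assumption, since the text does not state which normalization was used, and the sample values and function names are hypothetical.

```python
import math

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 - SSE / variance of observations."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    return 1.0 - sse / sum((o - mean_obs) ** 2 for o in obs)

def rmse(obs, sim):
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))

def mae(obs, sim):
    return sum(abs(o - s) for o, s in zip(obs, sim)) / len(obs)

def nrmse(obs, sim):
    """RMSE divided by the observed range (assumed normalization)."""
    return rmse(obs, sim) / (max(obs) - min(obs))

def pearson_r(obs, sim):
    mo, ms = sum(obs) / len(obs), sum(sim) / len(sim)
    cov = sum((o - mo) * (s - ms) for o, s in zip(obs, sim))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_s = sum((s - ms) ** 2 for s in sim)
    return cov / math.sqrt(var_o * var_s)

def nse_class(value):
    """Rating bands quoted in the text (Moriasi et al. 2015)."""
    if value > 0.85:
        return "very good"
    if value > 0.70:
        return "good"
    if value > 0.55:
        return "satisfactory"
    return "unsatisfactory"

# Hypothetical observed and simulated discharges.
obs = [10.0, 20.0, 30.0, 40.0]
sim = [12.0, 18.0, 33.0, 39.0]
score = nse(obs, sim)
print(round(score, 3), nse_class(score), round(rmse(obs, sim), 3),
      round(nrmse(obs, sim), 3), mae(obs, sim), round(pearson_r(obs, sim), 3))
```

Note that the band edges are exclusive on the lower side, so an NSE of exactly 0.85 is rated "good" rather than "very good".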

Also, the results reveal that including the QD of the adjacent Qanat as a model input (comparing combination 4 with 6 and 5 with 7) improves the accuracy of all models. For example, in the training phase of LSSVM for the prediction of one-month ahead QD in Qanat 1, the NSE improves from 0.84 to 0.86 from combination 4 to combination 6 and from 0.82 to 0.85 from combination 5 to combination 7; for the prediction of one-month ahead QD in Qanat 2, the NSE improves from 0.76 to 0.80 from combination 4 to combination 6 and from 0.75 to 0.79 from combination 5 to combination 7.

It is essential to point out that the results of all models for combination 1, where only monthly QD values (QDt−1, QDt−2, QDt−3) are used as inputs, are not satisfactory for either Qanat. Considering all combinations and ML models such as ANN, ANFIS, GEP, and LSSVM, it can be seen that combination 6 (the hydroclimatic and hydrogeological-based combination) can offer accurate predictions for one-month ahead QD. Based on Tables 2 and 5, it is evident that the ANFIS, GMDH, and LSSVM models offer good precision in the training phase for the one-month ahead QD prediction for both Qanats, with NSE higher than 0.55. Moreover, the GMDH method performs well, with NSE values greater than 0.7 for five months ahead in the training phase for both Qanats. Considering the RMSE, NRMSE and R values, the GMDH model gives precise results, with low RMSE and NRMSE and high R values. The LSSVM can be ranked as the second-best model. According to the NSE analysis, the GEP model is the worst in estimating QD. The outcomes of the training phase show that the GMDH model with input combination 6 outperformed the other models for both Qanats: for one-month ahead QD prediction in Qanat 1, R = 0.93, RMSE = 5.28, NRMSE = 0.07, MAE = 3.63 and NSE = 0.87; for two-month ahead QD prediction in Qanat 1, R = 0.91, RMSE = 5.90, NRMSE = 0.08, MAE = 4.10 and NSE = 0.81; for five-month ahead QD prediction in Qanat 1, R = 0.86, RMSE = 7.27, NRMSE = 0.10, MAE = 5.36 and NSE = 0.73; for one-month ahead QD prediction in Qanat 2, R = 0.93, RMSE = 6.56, NRMSE = 0.09, MAE = 5.13 and NSE = 0.86; for two-month ahead QD prediction in Qanat 2, R = 0.89, RMSE = 7.71, NRMSE = 0.10, MAE = 6.28 and NSE = 0.79; and for five-month ahead QD prediction in Qanat 2, R = 0.86, RMSE = 9.11, NRMSE = 0.11, MAE = 7.51 and NSE = 0.73.

The accuracies of the models for combination 6 are further compared in Figs. 5 and 6 via time-variation graphs and scatterplots for both Qanats. The graphs in Fig. 5 show in detail the differences between the observed values and the predictions provided by the models. The scatterplots in Fig. 6 show how the simulations of all models are scattered, and the R values indicate how well the simulated data fit the observations for both Qanats. Analysis of the hydrographs and scatterplots reveals that the time series forecasted by GMDH closely aligns with the observed QD values and exhibits less dispersion than those of the other four models. In contrast, noticeable disparities between the predicted and observed time series are apparent for the GEP model. Nevertheless, the GMDH model could not capture the extreme QD values. This can be explained by the smaller number of samples for QD peak values. Without a large number of examples, ML models cannot adequately learn extreme events.

Table 2 The results of the different combinations in predicting one-month ahead QD for Qanat 1
Table 3 The results of the different combinations in predicting two-months ahead QD for Qanat 1
Table 4 The results of the different combinations in predicting five-months ahead QD for Qanat 1
Table 5 The results of the different combinations in predicting one-month ahead QD for Qanat 2
Table 6 The results of the different combinations in predicting two-months ahead QD for Qanat 2
Table 7 The results of the different combinations in predicting five-months ahead QD for Qanat 2
Fig. 5 The observed and simulated one month ahead QD time series using the ANN, ANFIS, GEP, GMDH and LSSVM models for the combination 6: Qanat 1 (a) and Qanat 2 (b)

Fig. 6 Scatterplots of the observed and predicted one month ahead QD by different models for the combination 6: Qanat 1 (left panel) and Qanat 2 (right panel)

The Taylor diagrams for Qanat 1 and Qanat 2, shown in Fig. 7, provide a visual comparison of the machine learning models based on three statistical performance metrics: standard deviation, correlation, and root mean square error. In each diagram, the GMDH model lies closest to the reference point, indicating that its predictions closely match the variability (standard deviation) and pattern (correlation) of the observed data. Since the RMSE contours of a Taylor diagram are centered on the reference point, this proximity also reflects the lower RMSE of the GMDH model, signifying more accurate predictions than those of the other models. These results underscore the effectiveness of the GMDH model in capturing the complex dynamics of Qanat discharge, providing reliable predictions that are crucial for water resource management in regions dependent on Qanats. The consistency in model performance across Qanat 1 and Qanat 2 further reinforces the robustness of the GMDH approach.
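The three quantities summarized in a Taylor diagram are linked geometrically, which is why distance from the reference point can be read as an error measure. The sketch below, with the hypothetical helper `taylor_stats` and illustrative values only, computes the statistics and checks the law-of-cosines relation between them. Note that the distance in a Taylor diagram is the centered RMS difference (means removed), so it does not capture a constant bias.

```python
import numpy as np

def taylor_stats(obs, sim):
    """Return (sd_obs, sd_sim, R, centered RMS difference) for two series."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    sd_obs = obs.std()  # population standard deviation (ddof=0)
    sd_sim = sim.std()
    r = np.corrcoef(obs, sim)[0, 1]
    # Centered RMS difference: both series have their means removed first
    crmsd = np.sqrt(np.mean(((sim - sim.mean()) - (obs - obs.mean())) ** 2))
    return sd_obs, sd_sim, r, crmsd

# Hypothetical monthly discharge values, for illustration only
observed = [52.0, 48.0, 45.0, 50.0, 61.0, 70.0]
simulated = [50.0, 49.0, 47.0, 51.0, 58.0, 66.0]
sd_obs, sd_sim, r, crmsd = taylor_stats(observed, simulated)

# Law of cosines underlying the diagram:
# crmsd^2 = sd_obs^2 + sd_sim^2 - 2 * sd_obs * sd_sim * R
identity_gap = crmsd ** 2 - (sd_obs ** 2 + sd_sim ** 2 - 2 * sd_obs * sd_sim * r)
print(f"sd_obs={sd_obs:.3f} sd_sim={sd_sim:.3f} R={r:.3f} CRMSD={crmsd:.3f}")
```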

Overall, the current study indicates that the GMDH method is superior to the other applied ML methods for the employed dataset. These findings are consistent with results reported in the existing literature for different areas (Aghelpour and Varshavian 2020; Li et al. 2020; Moghadam et al. 2021; Kamali et al. 2022).

Fig. 7 Taylor diagrams of different models: Qanat 1 (a) and Qanat 2 (b)

Figures 8 and 9 illustrate the impact of various hydrological, meteorological, and neighboring Qanat features on the discharge predictions of Qanats 1 and 2 using the GMDH model as the best model.

Figure 8 provides a global view, showing how different features influence the model output across all instances. This global perspective is crucial for understanding which features generally drive the model’s predictions and should therefore be monitored closely in management practices. QD1 and QD2 (the discharge of the neighboring Qanat) and groundwater level (GWL) emerge as significant drivers, with high variability in their SHAP values, indicating that these features have a substantial impact on the model’s output across all data points. Temperature (T), evapotranspiration (ET) and precipitation (P) show less variability and a smaller range of SHAP values, suggesting a lower but consistent influence.
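To illustrate how per-instance SHAP values are aggregated into a global ranking, the sketch below computes exact Shapley values by enumerating feature subsets and then summarizes them by mean absolute value and range. It rests on several assumptions: a linear stand-in model replaces GMDH, the data are synthetic, the feature names are hypothetical, and absent features are filled from a background sample. It does not reproduce the explainer used in this study.

```python
from itertools import combinations
from math import factorial

import numpy as np

def shapley_values(predict, x, background):
    """Exact Shapley values of predict at x, averaging over background rows."""
    n = len(x)

    def value(subset):
        # Features in `subset` take their value from x; the rest keep background values
        data = background.copy()
        cols = list(subset)
        if cols:
            data[:, cols] = x[cols]
        return predict(data).mean()

    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                phi[i] += weight * (value(subset + (i,)) - value(subset))
    return phi

# Synthetic inputs and a linear stand-in model (both assumptions)
names = ["QD_neighbor", "GWL", "T", "P"]  # hypothetical feature names
rng = np.random.default_rng(0)
background = rng.normal(size=(50, 4))
weights = np.array([2.0, 1.5, 0.3, 0.2])

def predict(data):
    return data @ weights

phis = np.array([shapley_values(predict, row, background) for row in background[:10]])
global_importance = np.abs(phis).mean(axis=0)      # mean |SHAP| per feature
shap_range = phis.max(axis=0) - phis.min(axis=0)   # spread of SHAP values per feature
for name, imp, spread in zip(names, global_importance, shap_range):
    print(f"{name}: mean|SHAP|={imp:.3f}, range={spread:.3f}")
```

Exhaustive enumeration is only practical for a handful of features; for larger input sets, approximate explainers are normally used instead.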

Figure 9 examines local interactions, revealing how specific feature combinations affect discharge outcomes under particular conditions. This analysis is crucial for recognizing circumstances where interactions between features such as temperature and groundwater level critically alter predictions, offering deeper insight into the model’s operation under diverse scenarios. According to this figure, for both Qanats the importance of temperature and its significant local interactions with precipitation and GWL may indicate scenarios where climatic conditions jointly affect QD. The interaction values for discharge from neighboring Qanats also underscore the model’s sensitivity to neighboring discharge, which is critical for precise discharge predictions.

Fig. 8 Feature impact analysis on Qanat discharge predictions for the GMDH model: Qanat 1 (a) and Qanat 2 (b)

Fig. 9 SHAP interaction value analysis for Qanat discharge predictions using the GMDH model: Qanat 1 (a) and Qanat 2 (b)

The practical implications of the presented method for groundwater resource management in arid regions

The findings from this study, demonstrating the efficacy of various machine learning models in predicting Qanat discharge, have significant practical implications for groundwater resource management, particularly in arid regions where water scarcity is a pressing issue. Implementing these models can enable local water authorities and stakeholders to forecast water availability more accurately, facilitating proactive management strategies. For instance, the predictive capabilities of these models can be integrated into water management systems to optimize the allocation and use of water from Qanats, ensuring that water supply meets the demands of agriculture and domestic use without overexploiting the aquifer. By predicting periods of low discharge, water managers can plan alternative water supply strategies or implement water-saving measures in advance, thus avoiding crises. Moreover, the ability of these models to account for various hydrological and climatic inputs means they can also be used to assess the impact of climate change on Qanat performance. This is crucial for long-term water resource planning, allowing for the adaptation of water infrastructure and policies in response to predicted changes in groundwater levels and Qanat discharge patterns. Incorporating machine learning models into groundwater monitoring systems could also enhance the efficiency of these systems, providing continuous, real-time data analysis. This would allow for immediate responses to detected changes in water levels, potentially preventing over-extraction and its associated negative impacts on the aquifer.

Overall, the application of these models in groundwater resource management not only promises to enhance the sustainability of water resources in arid regions but also supports the preservation of Qanats, which are vital to the cultural and ecological landscapes of these areas. By fostering a more data-driven approach to water management, stakeholders can ensure the resilience and efficiency of water use in the face of increasing variability due to human and environmental changes.

Also, these machine learning predictions help in mitigating adverse environmental effects. The environmental impacts of Qanat decline or alterations in their discharge patterns are profound, especially in arid and semi-arid regions where they serve as critical water sources. One significant impact is on local ecosystems, which rely on consistent and adequate water supply from Qanats. A decrease in water availability can lead to habitat degradation, loss of biodiversity, and reduced agricultural productivity due to increased soil salinity and desertification. This alteration in land use and ecosystem health can, in turn, reduce the land’s natural resilience to environmental changes.

Machine learning models, as detailed in the paper, can significantly mitigate these adverse effects through enhanced predictive capabilities. By accurately forecasting Qanat discharge under various scenarios, these models allow for better water resource management. They enable authorities to implement strategic water allocation and conservation practices, ensuring that water usage is sustainable and does not exceed the natural replenishment rates of aquifers. Additionally, these predictions can facilitate the development of early warning systems for water scarcity, allowing communities and farmers to implement water-saving measures in advance, thereby safeguarding against the ecological and economic impacts of drought.

Future research directions for enhancing Qanat discharge predictions using machine learning

For enhancing the accuracy of ML predictions in Qanat discharge studies, future research can explore several innovative approaches. Developing hybrid models that combine multiple ML techniques could leverage the strengths of various algorithms to handle complex datasets more effectively. Additionally, advanced ensemble techniques like stacked generalization could further refine predictions by integrating outputs from multiple models, exploiting the unique strengths of each to improve overall accuracy. Enhancing feature engineering and selection is also crucial, as more precise input features related to hydrological, meteorological, and geological data could be pivotal in improving model outcomes. Exploring deep learning architectures such as Convolutional Neural Networks (CNNs) for spatial data and Long Short-Term Memory (LSTM) networks for temporal data could offer significant advancements in processing complex time series and spatial patterns. Incorporating uncertainty analysis through Bayesian networks or Monte Carlo methods could provide deeper insights into prediction confidence and the influence of various parameters, which is vital for robust water management strategies. Additionally, expanding data sources to include remote sensing data could provide more dynamic and high-frequency updates, enhancing the model’s responsiveness to environmental changes.

Conclusions

Qanats are a unique way to extract groundwater from the aquifer, especially in arid and semiarid countries. Simulation of Qanat discharge is essential for aquifer and demand management, and machine learning methods are promising tools to simulate the complex behavior of Qanats. The present study utilized a range of well-accepted machine learning methods for predicting QD. Monthly local QD (QDt−1, QDt−2, QDt−3), QD of the neighboring Qanat, local GWL for one, two, and three lag times (GWLt−1, GWLt−2, GWLt−3), temperature (T), precipitation (P), and evapotranspiration (ET) data were considered as inputs to predict one-, two-, and five-month ahead QD for two adjacent Qanats in the Razan-Ghahavand Aquifer. Seven input combinations were categorized into three groups based on expert knowledge and investigated for QD prediction with different lead times (one to five months ahead) using monthly QD data from the Razan-Ghahavand Aquifer. Combination 1 in group 1, with hydrogeology-based inputs, includes only Qanat discharge. Combinations 2, 3, 4 and 6 in group 2, with a mixture of hydroclimatic and hydrogeological based inputs, employ monthly QD, QD of the neighboring Qanat, GWL, temperature (T), precipitation (P), and evapotranspiration (ET) to predict QD; and combinations 5 and 7 in group 3, with hydroclimatic and hydrogeological based inputs, do not consider previous QD values. The results of the different models were assessed with the statistical error indices R, RMSE, NRMSE, MAE, and NSE to identify the superior model that could predict QD with satisfactory results. According to the NSE, the GMDH model gives the best results among the applied models for combinations 5 and 7 in group 3 for both Qanats. Therefore, this model can efficiently predict QD when monthly QD time series data are unavailable. It is worth mentioning that the results of all models for combination 1 (QDt−1, QDt−2, and QDt−3) were not satisfactory for either Qanat.
Furthermore, the overall results indicate that including the QD of the adjacent Qanat as an input to the developed models (comparing combination 4 with 6 and 5 with 7) generally improves the quality of the models’ predictions. In general, the GMDH model gave the best results for predicting QD in the present study, followed by the LSSVM, ANFIS, and ANN, in that order, while the GEP model was not satisfactory for the simulation of QD.
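The lagged input structure summarized above (for example QDt−1, QDt−2, QDt−3 in combination 1) can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used in this study: the function name `make_supervised` is hypothetical, and the target for an h-month lead is assumed to be the value h − 1 steps after time t, so that one-month ahead prediction targets QDt.

```python
import numpy as np

def make_supervised(series, lags=(1, 2, 3), lead=1):
    """Build lagged inputs and lead-time targets from a monthly series.

    Each row holds [s(t-1), s(t-2), s(t-3)]; the target is s(t + lead - 1).
    The lead-time convention is an assumption made for this sketch.
    """
    series = np.asarray(series, dtype=float)
    max_lag = max(lags)
    rows, targets = [], []
    for t in range(max_lag, len(series) - lead + 1):
        rows.append([series[t - k] for k in lags])
        targets.append(series[t + lead - 1])
    return np.array(rows), np.array(targets)

# Hypothetical ten-month series (values 0..9) to make the indexing visible
qd = list(range(10))
X1, y1 = make_supervised(qd, lead=1)  # one-month ahead
X5, y5 = make_supervised(qd, lead=5)  # five-month ahead
print(X1[0], y1[0])
print(X5.shape, y5)
```

Longer lead times leave fewer usable samples from the same record, which is worth keeping in mind when comparing scores across lead times.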