1 Introduction

The operation of dam reservoirs is of great importance because the growing world population increases the need for development and optimal utilization of water resources. Access to accurate information is therefore important for predicting inflow to the reservoir and for planning and controlling the rule curve of dams (Bae et al., 2007). Short-term (e.g. monthly) and long-term (e.g. annual) prediction is not straightforward, because the inflow pattern is complex. Therefore, the development of a model that can take this complexity into account is essential to provide accurate predictions (Allawi et al., 2018). Accurate short-term predictions are necessary for flood prevention and water supply, while long-term predictions are necessary for water resources planning (Awan and Bae, 2014).

According to the literature, there are many different approaches to predicting dam inflow. Awan and Bae (2013) developed a model based on the Adaptive Network-based Fuzzy Inference System (ANFIS) to predict inflow to three dam reservoirs in South Korea. With this model, they predicted the next month's inflow to the dams using precipitation, temperature and dam inflow as predictors. Kumar et al. (2015) used a Bootstrap wavelet-based ANN model (BWANN) to predict the daily inflow of the Panchut dam in India. The results of this method were compared with several models such as wavelet-based multiple linear regression (WMLR) and Bootstrap analysis.

Atiquzzaman and Kandasamy (2016) studied the accuracy of Genetic Programming in long-term prediction of inflow to a dam in Australia. In this study, precipitation and inflow to the dam in previous time steps were used in predictions. Li et al. (2016) introduced the Deep Restricted Boltzmann Machine-based Neural Networks (DRBM-NN) and Stack Auto Encoder-based Neural Networks (SAE-NN) models to predict the daily inflow of two dams in China. The results of this study were compared with those obtained from ARIMA and a Feed Forward Neural Network. For the operation of the Ubonratana Dam reservoir in Thailand, Chiamsathit et al. (2016) predicted inflow to the dam reservoir using a Multilayer Perceptron Artificial Neural Network. Simulations were performed under the rule curve of the dam. Esmaeilzadeh et al. (2017) used different combinations of precipitation time series, evaporation and upstream discharge of the river at different time steps to predict inflow to Sattarkhan Dam in Iran. They compared the performance of Artificial Neural Networks (ANN), Support Vector Regression (SVR), Wavelet Neural Networks (WANNs) and M5 tree models.

Different models with different parameters are being used in predictions. However, methods that can reduce uncertainty are more reliable. Bayesian Networks (BNs) are among the efficient probabilistic models in this regard. Bayesian models have been increasingly developed due to their high processing speed, graphical representation, lack of limitation on the number of variables and parameters, ability to combine different data sources and management of uncertainties.

According to the literature, this model is a powerful tool for solving complex problems and can effectively describe the relationships among their components (Leu and Bui, 2016). This model has been used in various areas of water resource management such as water allocation (Ahmadi et al., 2010; Xue et al., 2016), irrigation water management (Rahman et al., 2016; Sherafatpour et al., 2019), supply and demand management (Phan et al., 2016; Asadilour et al., 2012), groundwater management (Mohajerani et al., 2017; Roozbahani et al., 2018), water quality management (Liu et al., 2018; Couture et al., 2018), integrated water resources management (Molina et al., 2010; Xue et al., 2017), urban water management (Anbari et al., 2017; Tabesh et al., 2018) and many other fields of study. Meanwhile, this efficient approach has recently received much attention in water resources prediction. For example, it has been used in drought prediction (Madadgar and Moradkhani, 2014; Bae et al., 2017), water consumption prediction (Froelich, 2015; Magiera and Froelich, 2015), runoff prediction (Nagarajan et al., 2010; Humphrey et al., 2016), water environmental risk prediction (Sharifahmadian and Latifi, 2013), water pollution prediction (Hall and Le, 2017; Nodoushan, 2018), pipe failure prediction (Kabir et al., 2015), prediction of pipe leakage (Leu and Bui, 2016), flood prediction (Sikorska and Seibert, 2016; Goodarzi et al., 2019), etc.

According to the literature, this approach has rarely been used for problems related to dam reservoirs, in particular the prediction of inflow to dams. For instance, BNs have been used in predicting the optimal utilization of a dam reservoir (Mediero et al., 2007), reservoir water dynamics (Das et al., 2017), the water level in reservoirs (Das et al., 2016), changes in reservoir fullness (Ropero et al., 2017) and the seasonal prediction of dam inflow (Kim et al., 2018).

Due to the random and uncertain nature of dams’ inflow, the BN is used in the current research. The possibility to enter classified variables numerically is among prominent features of the BNs. Monthly and annual inflow predictions as well as the prediction of inflow range are introduced in this study for the first time. The model is tested for the Zayandehrud Dam as one of the most important dams in Central Iran. The real data were used directly in magnitude predictions and the clustered data were used for predicting the inflow range. The results of this study can be used to help decision-makers in allocating water to various uses or other goals such as long-term water sales contracts, hydraulic power and drought preparedness with the highest degree of reliability.

2 Methodology

2.1 Study Area

The Zayandehrud multi-purpose dam on the Zayandehrud River in Central Iran is located 110 km west of Isfahan. Water is supplied through natural runoff of the Zayandehrud River and tunnels that are used for the transfer of inter-basin water, including the first, second and third tunnels of Koohrang and Cheshme-Langan. It is noteworthy that the third Koohrang tunnel has been launched, but is not yet in operation. According to the statistics, the average inflow to the Zayandehrud Dam during 1971–72 to 2014–15 was about 44 m3/s. The basin upstream of Zayandehrud Dam, with an area of 4265 km2, is located between the northern latitudes of 32° 18′ to 33° 10′ and eastern longitudes of 50° 03′ to 50° 40′. The dam was constructed for hydroelectric power generation, seasonal flood control, and supplying the agricultural, industrial, drinking and environmental water demands of downstream lands and cities. Figure 1 shows the study area in Iran.

Fig. 1 Study area

2.2 Bayesian Networks (BNs)

Bayesian networks (BNs), also referred to as belief networks or Bayesian belief networks, were introduced by Pearl (1988). A BN is a graphical model representing probabilistic relationships among the different factors in a case study (Pearl, 1988). Probabilistic relationships in this method are estimated according to Bayes' theorem (Roozbahani et al., 2018). If E and F are two events such that P(E) ≠ 0 and P(F) ≠ 0, then we have:

$$ \mathrm{P}\left(\mathrm{E}\mid \mathrm{F}\right)=\frac{\mathrm{P}\left(\mathrm{F}\mid \mathrm{E}\right)\ \mathrm{P}\left(\mathrm{E}\right)}{\mathrm{P}\left(\mathrm{F}\right)} $$
(1)

Similarly, for n mutually exclusive and exhaustive events E1, E2, ..., En, we have:

$$ P\left({E}_i|F\right)=\frac{P\left(F|{E}_i\right)P\left({E}_i\right)}{P\left(F|{E}_1\right)P\left({E}_1\right)+P\left(F|{E}_2\right)P\left({E}_2\right)+\dots +P\left(F|{E}_n\right)P\left({E}_n\right)} $$
(2)

where P(Ei) is the probability of event Ei; P(F) is the probability of event F; and P(Ei|F) and P(F|Ei) are the conditional probabilities of Ei given F and of F given Ei, respectively.
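As an illustration of Eq. (2), the following minimal Python sketch computes the posterior probabilities P(Ei|F) from priors and likelihoods. The three inflow classes and all probability values are hypothetical and are not taken from the Zayandehrud data.

```python
def posterior(priors, likelihoods):
    """Return [P(E_i | F)] for mutually exclusive, exhaustive events E_1..E_n (Eq. 2)."""
    # Denominator of Eq. 2: P(F) = sum_i P(F | E_i) * P(E_i)
    evidence = sum(l * p for l, p in zip(likelihoods, priors))
    if evidence == 0:
        raise ValueError("P(F) must be non-zero")
    return [l * p / evidence for l, p in zip(likelihoods, priors)]


# Hypothetical example: E_i = last year's inflow class (low, medium, high),
# F = "this year's inflow is high".
priors = [0.5, 0.3, 0.2]        # P(E_i), assumed values
likelihoods = [0.1, 0.4, 0.7]   # P(F | E_i), assumed values
post = posterior(priors, likelihoods)
```

Here P(F) = 0.05 + 0.12 + 0.14 = 0.31, so the posterior of the third class is 0.14 / 0.31, which is about 0.45.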

A BN consists of two main components, one qualitative and one quantitative. The qualitative component is a directed acyclic graph in which each node represents a system variable and each edge represents a causal relationship between variables of the network (Abebe et al., 2018). The quantitative component is represented by a set of probabilistic relationships or probability distributions for each network node. A node without any parent (no arc toward the node) has a marginal probability table. A node with one or more parents (one or more arcs toward the node) has a conditional probability table (Hugin Expert A/S, 2017). Basically, a BN can contain discrete nodes, continuous nodes or both (hybrid). For discrete nodes, the probability table contains a probability distribution over the states, and for continuous nodes, the table contains a Gaussian density function (given through mean and variance parameters) for the variables it represents. Once constructed, the network can be used to enter observational data in nodes with known specific conditions to obtain probabilities in other nodes. If a BN consists only of discrete nodes, it is called a discrete BN, and if it contains continuous nodes, it is called a continuous BN.
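The roles of marginal and conditional probability tables can be sketched with a very small discrete network. The example below is only an illustration under assumed values: two parentless nodes (rainfall and snow) with marginal tables and one child node (inflow) with a conditional probability table. The node names, states and probabilities are hypothetical, and the inference is plain enumeration rather than the algorithm used by any particular software.

```python
from itertools import product

# Marginal probability tables of the parentless nodes (hypothetical values).
p_rain = {"low": 0.6, "high": 0.4}
p_snow = {"low": 0.7, "high": 0.3}

# Conditional probability table of the child node: P(inflow | rain, snow).
cpt_inflow = {
    ("low", "low"): {"low": 0.9, "high": 0.1},
    ("low", "high"): {"low": 0.6, "high": 0.4},
    ("high", "low"): {"low": 0.5, "high": 0.5},
    ("high", "high"): {"low": 0.2, "high": 0.8},
}


def query_inflow(evidence=None):
    """Return P(inflow | evidence on rain and/or snow) by enumeration."""
    evidence = evidence or {}
    scores = {"low": 0.0, "high": 0.0}
    for rain, snow in product(p_rain, p_snow):
        # Skip parent configurations that contradict the entered evidence.
        if evidence.get("rain", rain) != rain or evidence.get("snow", snow) != snow:
            continue
        weight = p_rain[rain] * p_snow[snow]
        for state, p in cpt_inflow[(rain, snow)].items():
            scores[state] += weight * p
    total = sum(scores.values())
    return {state: value / total for state, value in scores.items()}


prior_inflow = query_inflow()                      # no evidence entered
inflow_given_rain = query_inflow({"rain": "high"})  # evidence: high rainfall
```

With these assumed tables, the probability of high inflow rises from 0.35 without evidence to 0.59 once high rainfall is entered.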

The structure and probabilistic relationships of the BN are unknown in many cases. They are then learned from available observational data, a process referred to as BN learning. This involves two steps: network structure learning and network parameter learning. Network structure learning determines the dependent and independent variables and finds possible relationships between variables whose causal relationships can be detected from observational data. Parameter learning means the calculation of the conditional probabilities of each node in the network. Among the advantages of the BN model are risk and uncertainty analysis with greater accuracy than other models, management of missing values in input data, the ability to combine quantitative and qualitative data, and the provision of approximate solutions using simulation-based estimation techniques in cases where an exact solution is not available (Roozbahani et al., 2018). One of the main advantages of BNs is that the network can be developed with incomplete data (Anbari et al., 2017). This is helpful in this study because the time series of some parameters, such as snow, are short and cannot be extended.

2.3 Data Clustering

One of the important steps in discrete BN modelling for predicting the inflow range is to provide appropriate numerical intervals for the model parameters, which play a significant role in the final results of the model. To this end, clustering was used to divide the monthly and annual predictor and predictand data into proper classes. Most previous studies have determined the classes manually, which cannot guarantee the best results. Clustering is an unsupervised process during which objects are classified into different groups so that objects in a cluster are most similar to each other. The K-means method, proposed by MacQueen (1967), is one of the most practical clustering methods. This method uses an algorithm to classify objects so that the sum of squared distances between the data and the corresponding cluster center is minimized. The K-means clustering algorithm can be summarized as follows:

i: First, an arbitrary value is considered for the number of clusters (K).
ii: K points are selected in the space of the objects which are in fact the set of primary centers.
iii: Each object is assigned to a group with the shortest distance to its center.
iv: When all objects are assigned to clusters, the location of the K centers is recalculated by calculating the average of each cluster’s data.
v: The steps (iii) and (iv) are repeated until the center of the cluster does not change.
vi: At the last step, the objects are divided into separate groups with least error.
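The steps above can be sketched for one-dimensional data (for example, a single inflow series) as follows. This is a minimal illustration, not the implementation used in this study; the function name, the random initialization and the sample values are assumptions.

```python
import random


def kmeans_1d(values, k, seed=0, max_iter=100):
    """Cluster a list of numbers into k groups following steps (i) to (vi)."""
    rng = random.Random(seed)
    centers = rng.sample(values, k)              # step ii: initial centers
    clusters = [[] for _ in range(k)]
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for v in values:                         # step iii: nearest center
            nearest = min(range(k), key=lambda j: abs(v - centers[j]))
            clusters[nearest].append(v)
        # step iv: recompute each center as the mean of its cluster
        # (an empty cluster keeps its previous center)
        new_centers = [
            sum(c) / len(c) if c else centers[j] for j, c in enumerate(clusters)
        ]
        if new_centers == centers:               # step v: stop when unchanged
            break
        centers = new_centers
    return centers, clusters                     # step vi: final groups


# Hypothetical inflow values (m3/s) with two obvious groups.
sample = [1.0, 1.2, 0.8, 10.0, 10.5, 9.5]
centers, clusters = kmeans_1d(sample, k=2)
```

Because the initial centers are drawn at random, different seeds can lead to different local optima on less clearly separated data, which is the sensitivity discussed in the following paragraph.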

This is one of the most popular clustering techniques, but its reliability is influenced by the choice of initial centers, because the algorithm may stop at a local optimum in some cases (Javadi et al., 2017). To obtain a suitable number of clusters (K), clustering validation methods such as the Davies-Bouldin Index, Silhouette Width and the newer Gap method (Albalate and Suendermann, 2009; Rendón et al., 2011) have been used in this paper.
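As an example of one of these validation indices, the sketch below computes the Davies-Bouldin index for one-dimensional clusters using its standard definition (average over clusters of the worst ratio of within-cluster scatter to between-center distance). Lower values indicate more compact, better separated clusters. The sample partitions are hypothetical.

```python
def davies_bouldin(clusters):
    """Davies-Bouldin index for a list of non-empty 1-D clusters (k >= 2).

    Assumes that no two clusters share the same center.
    """
    centers = [sum(c) / len(c) for c in clusters]
    # Scatter: mean absolute distance of the members to their cluster center.
    scatter = [
        sum(abs(v - m) for v in c) / len(c) for c, m in zip(clusters, centers)
    ]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        # Worst-case similarity of cluster i to any other cluster.
        total += max(
            (scatter[i] + scatter[j]) / abs(centers[i] - centers[j])
            for j in range(k)
            if j != i
        )
    return total / k


# Hypothetical partitions of the same six values.
good = [[0.8, 1.0, 1.2], [9.5, 10.0, 10.5]]
bad = [[0.8, 1.0, 9.5], [1.2, 10.0, 10.5]]
db_good = davies_bouldin(good)
db_bad = davies_bouldin(bad)
```

In practice the index would be evaluated for several candidate values of K and the K with the lowest value retained, subject to agreement with the other indices.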

2.4 The Structure of the Proposed Bayesian Model

Choosing suitable and effective input variables improves the performance of smart models. Therefore, it is important to identify the parameters affecting inflow to the dam reservoir. In Bayesian Network modelling, correlation analysis is not necessary, because the relations between inputs and outputs are extracted through conditional probabilities. According to the available statistics and information, the effective predictors in the model are: the discharge to the basin by the first and second tunnels of Koohrang (Q1, Q2) and the Cheshme-Langan tunnel (Q3); the natural discharge of the Zayandehrud River (Qz), estimated by subtracting the discharge of the water transfer tunnels from the total dam inflow; the discharge of two important hydrometric stations, Qaleh-Shahrokh (Q4) and Eskandari (Q5), which measure the inflow to the dam reservoir from the south and north, respectively, and were chosen for their suitable position and long statistical period; the average rainfall in the basin (R); the average snow height in the basin (S); and the total dam inflow (Qd) with a reasonable time delay (monthly/yearly). In this research all possible predictors have been used, and no other variable can be incorporated in the prediction model due to the lack of data in this region. Nevertheless, before applying the Bayesian Network, correlation coefficients between the predictors and dam inflow were estimated. As the coefficients were relatively low, applying cause-effect and probabilistic models such as BNs is reasonable.

Table 1 shows the basic information of the parameters affecting the inflow into the dam (i.e. years of data, mean annual values and % of missing data). These parameters were identified based on their role in the calculations. For most predictors, the correlation with dam inflow was not considerable, which is one of the main reasons that the BN was chosen for prediction.

Table 1 Basic information of parameters

Accordingly, four scenarios were designed after introducing effective variables in predictions:

A) Scenario 1: Prediction of annual inflow magnitude
B) Scenario 2: Prediction of the annual inflow range
C) Scenario 3: Prediction of monthly inflow magnitude
D) Scenario 4: Prediction of monthly inflow range

The magnitude of inflow is a real value (e.g. 2 m3/s), whereas the inflow range is a class of inflow bounded by minimum and maximum values (e.g. 2–3 m3/s). According to the modelling scenarios, learning and validation of the proposed Bayesian model in Scenarios 1 and 3 were performed on numerical predictor data to predict the inflow to the dam. In Scenarios 2 and 4, predictor and predicted variables were divided into appropriate intervals with K-means clustering and the validation indices mentioned in the methodology section to predict the range of inflow changes. In addition, dam inflow is predicted for the next month and the next year in the monthly and annual prediction models, respectively. Figure 2 shows the modelling flowchart for the four designed scenarios.

Fig. 2 Flowchart of the proposed model

Considering different effects of predictor variables in the modelling structure, different patterns of BN were identified for entering data to identify the best learning structure. Accordingly, 44 patterns were designed for modelling as listed in Table 2.

Table 2 Patterns defined in the Bayesian network to enter the variables

Three groups were used to define these patterns. In the first group (the first 25 patterns), inflow to the Zayandehrud Dam was predicted using the average rainfall in the basin, average snow height, discharge of the first and second tunnels of Koohrang and Cheshme-Langan, natural discharge of the Zayandehrud River and inflow to the dam with a time delay (monthly/annual).

To define the patterns in the second group (14 patterns), the average rainfall in the basin, the discharge of the first and second tunnels of Koohrang and Cheshme-Langan, the discharge of Qaleh-Shahrokh station in the south of the basin and Eskandari station in the north of the basin, as well as inflow to the dam with a time delay (monthly/yearly) were used. In these two groups, data from the previous step (last year/month) were used to predict the inflow into the dam at the current time step. Finally, in the third group (5 patterns), predictions were based only on the time series of the inflow into the dam in the last one, two and three time steps (months or years). For the monthly and annual prediction scenarios, the time step of prediction (t) is a month and a year, respectively.
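The third group of patterns relies on lagged values of a single series. A minimal sketch of how such predictor and target pairs could be assembled is given below; the helper name and the sample series are hypothetical, and the actual data preparation in this study may differ.

```python
def lagged_samples(series, lags=(1, 2, 3)):
    """Pair each value at step t with the values at steps t - lag.

    Returns a list of (predictors, target) tuples, starting at the first
    step for which all requested lags are available.
    """
    max_lag = max(lags)
    samples = []
    for t in range(max_lag, len(series)):
        predictors = [series[t - lag] for lag in lags]
        samples.append((predictors, series[t]))
    return samples


# Hypothetical inflow series (m3/s) at five consecutive time steps.
inflow = [10, 20, 30, 40, 50]
samples = lagged_samples(inflow)
```

For the five-value series above, only the last two steps have all three lags available, so two samples are produced.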

Hugin Lite is one of the most powerful commercial software packages for simulating and analyzing BNs. It provides a clear and user-friendly interface with practical tools (Phan et al., 2016). Due to these advantages and its ease of use, Hugin Lite V. 8.5 was used in this study for modelling the BN to predict annual and monthly inflow into the dam both numerically and as intervals (Hugin Expert A/S, 2017).

2.5 Model Evaluation Indicators

The choice of statistical indicators for evaluating the results depends on the type of prediction outcome. After the learning of the network, the coefficient of determination (r2), Nash-Sutcliffe coefficient (NS), Mean Absolute Percentage Error (MAPE) and Root Mean Square Error (RMSE) were used to investigate the accuracy of predictions of the inflow magnitude (Ghordoyee Milan et al., 2018; Sherafatpour et al., 2019). The Reliability Percent index (RP) was used to measure the prediction accuracy of the inflow range. In this case, the probability of being located in each interval was calculated and the interval with the highest probability was selected and compared with the observational data. The RP index is calculated by dividing the number of correctly predicted years (or months) by the total number of years (or months).
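The indicators named above can be sketched in Python as follows. The formulas are the commonly used textbook definitions and are an assumption here, since the exact expressions are not given in the text; in particular, r2 is taken as the squared Pearson correlation. All sample values are hypothetical.

```python
import math


def rmse(obs, pred):
    """Root Mean Square Error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mape(obs, pred):
    """Mean Absolute Percentage Error in percent (observations must be non-zero)."""
    return 100.0 * sum(abs(o - p) / abs(o) for o, p in zip(obs, pred)) / len(obs)


def nash_sutcliffe(obs, pred):
    """Nash-Sutcliffe coefficient: 1 - SSE / variance of the observations."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst


def r_squared(obs, pred):
    """Squared Pearson correlation between observations and predictions."""
    mo, mp = sum(obs) / len(obs), sum(pred) / len(pred)
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_p = sum((p - mp) ** 2 for p in pred)
    return cov ** 2 / (var_o * var_p)


def reliability_percent(obs_classes, pred_classes):
    """Share of time steps whose predicted class equals the observed class."""
    hits = sum(1 for o, p in zip(obs_classes, pred_classes) if o == p)
    return 100.0 * hits / len(obs_classes)


# Hypothetical observed and predicted inflows (m3/s) and inflow classes.
obs = [10.0, 20.0, 30.0, 40.0]
pred = [12.0, 18.0, 33.0, 39.0]
rp = reliability_percent([1, 2, 2, 3], [1, 2, 3, 3])
```

For the sample classes, three of the four steps are predicted correctly, so RP is 75%.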

3 Results and Discussion

3.1 Learning Period and Model Validation

The time series of inflow to the Zayandehrud Dam is available from 1971 to 2014, which includes 44 years of data for annual modelling and 536 months of data for monthly modelling. Therefore, this period was chosen as the modelling period and the time series of other parameters (except for the snow parameter that cannot be extended) were reconstructed as needed. 80% of the data were used for calibration or learning of the BN and the remaining 20% were used for validation to verify the accuracy of the trained network. Figure 3 shows the long-term average of the time series of inflow to the Zayandehrud Dam.

Fig. 3 The time series of inflow to the Zayandehrud Dam

As shown, the last 20% of the time series falls in a hydrological dry period. A validation period consisting entirely of dry years cannot express the accuracy of the model, especially for future wet periods. Using a moving average of discharge, the 20% of the series ending in 2010–2011 (2002–2003 to 2010–2011), which is distributed over wet and dry periods, was selected as the validation period. In other words, to train and test the annual Bayesian model, 36 years were allocated to calibration and 8 years to validation. The calibration period must contain proportional wet and dry hydrological periods to increase the reliability of future forecast results. Similarly, in the monthly Bayesian model, 432 months were allocated for calibration and 104 months were used to evaluate the accuracy of the trained Bayesian Network.

3.2 Determining the Optimal Number of Clusters for Modelling and Data Clustering

To find suitable numerical intervals in this study, the proper number of clusters for all monthly and yearly predictor and predicted parameters was first calculated with cluster validation indices. In the annual approach, the Davies-Bouldin and Silhouette Width indices give the same optimal cluster number. However, due to the significant difference between these two validation indices for monthly data and uncertainty about which of them yields better results, the Gap index was used to confirm the results and select the appropriate number of clusters. Finally, the number of clusters supported by at least two indices was taken as the optimal cluster number. The results on the validation of annual and monthly clusters are given in Tables 3 and 4. The numerical intervals (ranges) obtained for each of the parameters in both annual and monthly approaches are shown in Table 5.

Table 3 The optimal number of clusters of annual data
Table 4 The optimal number of clusters of monthly data
Table 5 The intervals for predictor and predicted variables

3.3 Bayesian Network Learning

BN learning consists of two steps: structure learning and parameter learning. The structure of the network refers to the causal relationships between variables. The Necessary Path Condition (NPC) algorithm is the best-known algorithm for this purpose. Because the network structure, in terms of causal relationships between the parameters, is already known in the 44 learning patterns, only the model parameters were learned. Parameter learning means finding the conditional probabilities of the nodes, here using the Expectation-Maximization (EM) algorithm.
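To make parameter learning concrete, the sketch below estimates a conditional probability table from fully observed records by relative frequencies. This is a deliberate simplification: with complete data, the EM algorithm reduces to this counting step, whereas with missing values the counts would be replaced by expected counts over several iterations. It is not a description of the Hugin implementation, and the variable names and records are hypothetical.

```python
from collections import Counter, defaultdict


def learn_cpt(records, parents, child):
    """Estimate P(child | parents) from complete records by relative frequency."""
    counts = defaultdict(Counter)
    for rec in records:
        key = tuple(rec[p] for p in parents)   # observed parent configuration
        counts[key][rec[child]] += 1
    return {
        key: {state: n / sum(c.values()) for state, n in c.items()}
        for key, c in counts.items()
    }


# Hypothetical clustered observations (one record per time step).
records = [
    {"rain": "high", "inflow": "high"},
    {"rain": "high", "inflow": "high"},
    {"rain": "high", "inflow": "low"},
    {"rain": "low", "inflow": "low"},
]
cpt = learn_cpt(records, parents=["rain"], child="inflow")
```

Parent configurations that never occur in the records receive no entry in this sketch; a practical implementation would need a prior or smoothing for such cases.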

3.4 Model Validation and Results

Upon completing the BN learning, the patterns were validated under different scenarios.

A) Scenario 1: Table 6 shows the validation results of the first scenario (prediction of annual inflow magnitude). In this scenario, all predictor and predicted parameters were used numerically to predict the magnitude of annual inflow. The best results in terms of statistical indicators were obtained from patterns 15 and 44. Of these two, pattern 15 was the best in terms of the mean absolute percentage error, while pattern 44 was the best in terms of the other statistical indices. After analyzing the results, pattern 15-b was defined as a specific pattern. Its predictors are the same as those of pattern 15, namely the first Koohrang tunnel, the Cheshme-Langan tunnel, Zayandehrud natural discharge and rainfall, but with a lag time of two years applied to the predictors to evaluate the effect on the accuracy of inflow prediction. The evaluation indices of this model (Table 7) show a relatively good improvement in r2, NS and RMSE compared to pattern 15. Thus, this pattern can be described as the top model in this scenario.

Table 6 Results of different prediction patterns in the first scenario (magnitude prediction of annual inflow)
Table 7 Validation of Scenario 1 under a specific pattern

B) Scenario 2: The BN was modeled in Scenario 2 with an annual approach to predict the range of inflow changes. In this scenario, the range of inflow variation was predicted by clustering the predictor and predicted variables. The results in Table 8 indicate that patterns 22 and 23 provide a higher reliability percent (75%) than the other network structure patterns, where the reliability percent is the ratio of the number of correctly predicted intervals to the total number of predicted intervals.

Table 8 Results of different prediction patterns in the second scenario (prediction of annual inflow intervals)

To analyze the difference between the performances of the two top patterns in this scenario, validation results in each year were evaluated in terms of reliability. Since the results obtained each year are based on the probability of being located in each cluster, four categories were considered for the results. Three categories describe correct predictions according to their probability: a probability of less than 50% was considered a low-reliability prediction, 50–100% a high-reliability prediction and 100% a decisive prediction. The fourth category covers incorrect predictions. The results obtained from the analysis of patterns 22 and 23 are presented in Fig. 4. The prediction results for the first and third clusters were similar in both patterns, but pattern 22 provides highly reliable predictions in cluster 2. Analysis of the parameters of these two patterns indicates the significant role of rainfall in selecting pattern 22. The predictors in this pattern include rainfall, natural discharge of Zayandehrud and runoff into the dam with a one-year delay. Figure 5 shows the modelling results in the calibration and validation periods. As seen, 61.1% and 75% of calibration and validation data are correctly predicted, respectively.
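The four result categories described above can be expressed as a small decision rule. The sketch below is one possible reading of that rule; the function name, the category labels and the example probabilities are hypothetical.

```python
import math


def classify_prediction(posterior, observed):
    """Assign one of the four categories to a single range prediction.

    posterior: dict mapping each cluster label to its predicted probability.
    observed:  the cluster label that actually occurred.
    """
    predicted = max(posterior, key=posterior.get)   # most probable cluster
    if predicted != observed:
        return "incorrect"
    p = posterior[predicted]
    if math.isclose(p, 1.0):
        return "decisive"            # P = 100%
    if p >= 0.5:
        return "high reliability"    # 50% <= P < 100%
    return "low reliability"         # P < 50%


# Hypothetical posterior probabilities over three inflow clusters.
case_a = classify_prediction({"c1": 1.0, "c2": 0.0, "c3": 0.0}, "c1")
case_b = classify_prediction({"c1": 0.2, "c2": 0.7, "c3": 0.1}, "c2")
case_c = classify_prediction({"c1": 0.4, "c2": 0.3, "c3": 0.3}, "c1")
case_d = classify_prediction({"c1": 0.6, "c2": 0.3, "c3": 0.1}, "c3")
```

Counting how often each category occurs per cluster gives the kind of breakdown that is compared between the top patterns.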

Fig. 4 The performance of the Bayesian model in two top patterns of the scenario 2

Fig. 5 Results of inflow range prediction for the best pattern in scenario 2

C) Scenario 3: In this scenario, the monthly inflow magnitude to the dam was predicted. The statistical indices for different cases are presented in Table 9. As seen, the highest accuracy in this scenario is observed for pattern 15, whose predictors are the discharge of the first Koohrang tunnel and the Cheshme-Langan tunnel, Zayandehrud natural discharge and rainfall. This pattern leads to a mean absolute percentage error of 49%, a Nash-Sutcliffe coefficient of 0.7, an RMSE of 21.82 m3/s and a coefficient of determination of 0.71.

Table 9 Results of different prediction patterns in the third scenario (magnitude prediction of monthly inflow)

As the best result in scenario 1 was obtained from the specifically defined pattern, a specific pattern (pattern 15-b) was also considered in this scenario. Its predictors are the same as those of pattern 15, except that the predictor parameters have a two-month delay. The statistical indices of this model (Table 10) do not show an improvement in the accuracy of the BN compared to pattern 15.

Table 10 Validation of Scenario 3 under a specific pattern

D) Scenario 4: This scenario determines the reliability percent of the monthly inflow range prediction by defined patterns. Table 11 lists the validation results of this scenario. As seen, the patterns 10, 11, 19, 23 and 24 show the highest reliability percent. The constant parameter in the network structure of these 5 patterns is natural runoff of Zayandehrud. This indicates the importance of this parameter in prediction with more accuracy.

Table 11 Results of different prediction patterns in the fourth scenario (prediction of monthly inflow ranges)

Like scenario 2, the results of the top patterns in this scenario were analyzed with regard to the reliability index (Fig. 6). The data in the fourth clustering group are not available in the validation period and thus will not affect the selection of the top model.

Fig. 6 The performance of the Bayesian model in two top patterns of the scenario 4

Comparing the results of the top patterns in Fig. 6, one can see that the first cluster in all 5 cases is predicted with a probability of 50 ≤ P ≤ 100 or p = 100. So this cluster will have the least effect on choosing the best pattern, because the forecast is accurate with a good confidence rate in all relevant months. However, the least reliable predictions are seen for the third cluster. So this cluster has the most impact on the selection of best pattern. It seems necessary to select a pattern giving acceptable results from cluster 3. Given that the patterns 11 and 24 were not able to accurately predict the cluster 3 even within a month in the first step, they are removed from the list of top models. In other words, the confidence level of correct prediction in these two patterns is 0%. In the next step, the pattern 10 is removed from three remaining patterns because of the lowest confidence in the prediction of the third cluster.

In the third step, the remaining two patterns, namely 19 and 23, are compared. The confidence level of the correct prediction of the third cluster is equal in these patterns. So the decisive factor in this step is the confidence level obtained from the second cluster. As seen, the probability of correct prediction is higher in pattern 23. Consequently, pattern 23, with the predictor parameters of natural discharge of Zayandehrud with a one-month delay and inflow to the dam with a one-month delay, can be introduced as the top pattern for predicting the monthly inflow to the dam. Figure 7 shows the prediction results of the calibration and validation periods for this pattern.

Fig. 7 Results of inflow range prediction for the best pattern in scenario 4

Considering the results, 341 out of 432 months (79%) for calibration period and 80 out of 96 months (83%) for validation period, have been correctly predicted. By implementation of 4 scenarios under 44 patterns, one can conclude that the BN model has been able to predict the interval of inflow to the dam with a reasonable accuracy. Predictor parameters of Zayandehrud natural discharge and rainfall are the most important parameters in these four scenarios. Figure 8 shows the preferred pattern in each scenario in the software environment.

Fig. 8 The top BN patterns of scenarios examined in Hugin software

4 Conclusion

Due to the importance of inflow prediction in the operational planning and management of reservoirs, the performance of Bayesian Networks in predicting the range and magnitude of monthly and annual inflows was investigated. Generally, handling of incomplete data sets, facilitating the combination of domain knowledge and available data, and probabilistic learning about causal networks are the main benefits of BN modelling. The proposed algorithm for each scenario includes four stages: data preparation, BN learning, BN validation and model prediction. To verify the proposed model, Zayandehrud Dam, one of the most important multi-purpose dams in Central Iran, was selected as the study area. Modelling was performed in each scenario under 44 different patterns of the network structure to find the best combination of predictors. According to the results, the inflow ranges predicted by the model are more realistic and trustworthy in terms of uncertainty consideration. Analysis of the results showed that the Bayesian model has been able to predict the annual inflow range. The reliability percent of inflow range predictions was 75% and 83% for annual and monthly scenarios, respectively. Comparing the results of this study with the limited other research conducted in the study area, such as Nasri (2010) and Gholamzadeh et al. (2011), shows that the proposed BN model has higher accuracy in predicting the dam inflow. This model can be used as part of decision support systems (DSS) for reservoir operation, considering the importance of inflow in updating and developing suitable rule curves. In fact, in both BN structures (discrete or continuous variables), uncertainty can be modeled and considered through the probabilistic relations between input and output variables.
When the operational system of a dam uses this approach, the developed model can readily provide operators with acceptable and reliable inflow predictions for a month or a year ahead, which can lead to better management and planning for the different downstream water users.

Since this research is one of the first attempts to apply BNs to dam inflow prediction, it is recommended to compare it with other popular machine learning models in this field. In addition, employing the proposed model for other dams with different predictors and clustering methods is suggested. It is also worth noting that, due to the availability of long-term data (44 years), the effects of climate and consumption conditions have been incorporated implicitly in the BN analysis through the calibration and validation phases. Nevertheless, applying climate change and different human disturbance scenarios is suggested for future research.