1 Introduction

Literature and observations show that a significant portion of lake water is lost through evaporation in arid and semi-arid areas (Chattopadhyay et al. 2009; Biazar et al. 2019). Evaporation from the free water surface has been measured by direct and indirect methods. Direct methods include Eddy covariance (EC) method, Bowen ratio energy balance, and measurement of water loss with lysimeters or mas-balance methods (Allen et al. 2011). Indirect methods include experimental and theoretical equations (Meyer 1915; Penman 1948; Harbeck 1955; Priestly and Taylor 1972; Brutsaert and Stricker 1979; De Bruin and Keijman 1979), and artificial intelligence (Piri et al. 2009; Kisi et al. 2015; Vaheddoost and Koack 2019; Ashrafzadeh et al. 2019) and satellite-based methods (Zhao and Gao 2019; Holmes 2019). Direct measurement of evaporation is expensive, cumbersome, and has different sources of error, and becomes even more difficult for evaporation from saline water with different concentrations. Climatic variables, influencing saline water evaporation, not only vary from station to station, but also vary based on the degree of salinity. Therefore, different models can be derived for saline water evaporation. The rate of evaporation from saline water surface is usually less than that from fresh water (Asmar and Ergenzinger 1999). Many of the world’s saline lakes are shrinking, with the result that bird habitats are undermined, people living around the lakes suffer with economic losses, and dusty winds are being caused which threaten people’s health (Wurtsbaugh et al. 2017).

Many investigators have estimated evaporation from lake water surfaces (Gianniou and Antonopoulos 2007; Piri et al. 2009; Hamdani et al. 2018; Wang et al. 2018a, b, Mor et al. 2018; Ashrafzadeh et al. 2019). The mechanisms of evaporation from saline water are somewhat different from those from fresh water. The complexity of the process of evaporation from water bodies (e.g., lakes, ponds, dams, and natural reservoirs) and the lack of sufficient and reliable information are the main obstacles (Sandler 1999; Hamdani et al. 2018). Additionally, the selection of meteorological variables that influence saline water evaporation is a difficult task.

This study attempts to find the main important meteorological variables that affect the rate of daily evaporation from saline water. This task was performed using three models which were (1) procrustes approach (PA) (Dinpashoh et al. 2004), Gamma Test (GT) (Ashrafzadeh et al. 2018), and entropy based method (EnT) (Ahmadi et al. 2009). Once the variables were selected, the artificial neural network (ANN) method was applied to model daily evaporation rate using daily meteorological time series.

Lee (1927) experimentally compared evaporation from pure water and from brine lake in Nevada, USA, for different densities and found that the brine water evaporation in comparison with pure water reduced by 0.01% for a 1% increase in water specific weight. Young (1947) compared evaporation from both saline water (with different concentrations of sodium chloride) and that from fresh water. Results of this study did not differ significantly from the findings of Lee (1927). Kokya and Kokya (2008) analyzed the effect of salinity of water on evaporation rate from a water pan in an experimental study near the Urmia Lake by using Meyer and Harbeck methods, and they developed these methods. The results showed that the developed methods could estimate evaporation with higher accuracy than classical methods. Al-Khalifat (2008) reported that the average volume of evaporation losses from the Dead Sea was between 2 and 4 billion m3/year using the mass and energy balance during 1800–2000. He also showed that the decrease in the lake level over time had resulted in reducing evaporation rate. Furthermore, the evaporation rate had decreased due to increasing salinity during the periods with significantly lower lake water level. Piri et al. (2009) investigated the evaporation of free water in hot and dry regions of Iran. In their research, they used an ANN model to estimate evaporation and compared the results with two experimental models (Linacre and Marciano). They used a GT to select the best input combination for the ANN. Wind speed, saturation vapor pressure, and mean relative humidity were selected as the most effective input variables in estimating evaporation. Results showed the ANN-GT model outperformed the experimental models. Moghaddamnia et al. (2009) estimated, sing ANN and adaptive neuro fuzzy inference system (ANFIS), evaporation from the free water surface in Iran. They used data on air temperature, wind speed, saturation vapor pressure, and mean relative humidity. The GT was also used to select the effective input variables. In this study, variables such as wind speed, relative humidity, and saturation vapor pressure were selected as the most effective variables by the GT. They used three empirical models of Hefner, Linkier, and Marciano to compare the model performances. Results showed that the intelligence models performed better than the experimental models. Goyal et al. (2014) modeled tropical evaporation using ANNs, least squares-support vector regression (LS-SVR), fuzzy logic, and ANFIS. They used precipitation data, maximum and minimum temperatures, maximum and minimum relative humidity, and sunshine hours. They also used the GT to select the effective inputs to improve model performance. Results were compared with the experimental models of Hargreaves and Samani and Sphere Stewart, and showed that the intelligence models performed better than the experimental models. Kisi et al. (2015) predicted the Urmia Lake water level with the support vector machine (SVM) model by applying a novel method based on firefly algorithm (FA). The output of the SVM–FA model was compared to the Genetic Programming (GP) and ANNs models. Results showed that the SVM-FA model was superior to others. Shiri et al. (2016) predicted the water level in the Urmia Lake using the Extreme Learning Machine (ELM) method. In their study the ELM method was compared with ANN and GP. Results showed that the ELM method was better than others. Wang et al. (2018a, b) evaluated different approaches for estimating evaporation in a small high-elevation lake on the Tibetan Plateau (TP). They conducted the evaluation by using EC observation-based reference datasets. They used Bowen ratio energy budget, Penman, Priestley–Taylor, Brutsaert–Stricker and DeBruin–Keijman, Dalton and Ryan–Harleman, Jensen–Haise, and Makkink methods. All methods were significantly improved after parameter optimization, with better efficiency by the former than the latter. The deBruin method yielded the largest error due to the poor relationship between evaporation and the drying power of the air. The good efficiency of the Makkink approach, with no considerable differences before and after optimization, indicated the significance of solar radiation and air temperature in the estimation of lake evaporation. The Makkink method was used for long-term evaporation estimation due to the lack of water temperature observations in lakes on the TP. Wang et al. (2018a, b) carried out research about global lake evaporation accelerated by changes in surface energy allocation in a warmer climate. They reported simulations with a numerical model of lake surface fluxes, with input data based on a high-emission climate change scenario (Representative Concentration Pathway 8.5). In their simulations, the global annual lake evaporation increased by 16% by the end of the century, despite little change in incoming solar radiation at the surface. At the current work, we investigated the effect of variables on evaporation in different concentrations (fresh water to high saline water) separately. Seifi and Riahi (2018) estimated daily reference evapotranspiration using the least square support vector machine (LS-SVM)-GT, ANN-GT and ANFIS-GT models in Iran. Results of GT revealed that three climate variables which were minimum air temperature, maximum air temperature, and wind speed were the most important climatic variables, and the LSSVM model performed better than did ANFIS and ANN for the same meteorological input variables. Hamdani et al. (2018) assessed seasonal and daily evaporation of the deep and brine Dead Sea. They provided observations in two consecutive years using the eddy covariance system, meteorological stations, and a floating station (which measured the water profile temperature). They found that the peak of evaporation rate occurring in summer was related to solar radiation. The winter peak evaporation was also related to the thermal storage of the lake, and evaporation due to high vapor pressure combined with wind and resulting thermal static. They compared several models of evaporation with direct measurements and showed that the mass transfer model was more reliable than others. Nozari and Azadi (2019) predicted the salinity of drainage and groundwater at various drain depths and spaces using ANN. Results revealed that ANN had a reasonable accuracy in the simulation of temporal shallow groundwater and drainage of water salinities at different drain depths and drain spaces. Guo et al. (2019) conducted research on long-term changes in evaporation over Siling Co Lake on the Tibetan Plateau and its impact on recent rapid lake expansion. In this study, long-term evaporation over Lake Siling Co was simulated using a single-layer lake evaporation model, and simulated results were verified by observation from an EC https://www.sciencedirect.com/topics/earth-and-planetary-sciences/eddy-covariance system in the lake. Results showed that the single-layer lake evaporation model was capable of accurately simulating lake evaporation on a daily scale. Ashrafzadeh et al. (2019) estimated evaporation of free water in northern Iran using intelligent models, ANN and ANN-krill herd optimization algorithm (ANN-KHA). Maximum temperature, maximum and minimum relative humidity, rainfall, wind speed, and sunshine were used. They also used the GT to select the effective inputs and results showed that the ANN-KHA model performed better than did the ANN model.

Since it is difficult to answer which variables are useful as inputs for modeling (Ahmadi et al. 2009; Isazadeh et al. 2017; Ashrafzadeh et al. 2019), there have been insufficient reports on comparing different techniques for selecting the most effective inputs. Although much research has been done on evaporation estimation, most of it does not consider effective input selection for evaporation modeling. As a result, the main challenge lies in the evaluation of existing data and assessing the adequacy of data. Although modeling improves performance by adding information, observations show that adding information improves model performance only slightly. Modeling accuracy can be reduced by increasing information. This is because additional information causes the model to be overfitted. The overfitted model also performs well in training but has a very poor performance in testing. Overfitting also occurs when multivariate models have too much input data. It is therefore important to know which inputs are effective in modeling and which are not (Ahmadi et al. 2009). Remesan et al. (2008) used GT in determining the effective inputs for estimating solar radiation using local linear regression (LLR) and ANN in the Brue Catchment in southwest England. Ahmadi et al. (2009) applied GT, Entropy Theory (EnT), Akaike Information Criterion (AIC), and Bayesian Information Criterion (BIC) in the determination of effective inputs for solar radiation estimation in the Brue Catchment in England. They estimated solar radiation with LLR and ANN and found that the GT had the best performance. A limited number of studies have used the procrustes analysis (PA) for the selection of model inputs. Dinpashoh et al. (2004) used this approach in selecting variables in a precipitation climate study in Iran. They found that among the 57 candidate variables, only 12 variables were important in the regionalization of precipitation records. Nam et al. (2015) used the PA in the selection of 33 rainfall-related and geographical variables among the total 42 candidate variables for the delineation of rainfall in South Korea.

In this study, the PA approach was used in the determination of evaporation rate from saline water. Results were compared with those of the GT and EnT schemes. The main objectives therefore were to: (1) identify the best combination of climatic variables for modeling evaporation rate of saline water, (2) identify the best model among the three candidate models which were PA, GT, and EnT using performance criteria.

2 Materials and Methods

2.1 Study Area and Data

Lake Urmia is one of the most significant saline lakes of the world and Iran's largest saline lake, which is at risk of drying up because of excessive agricultural development, climate change, and irrational construction of dams. Various reports have been presented regarding evaporation from the lake surface, with the values generally estimated to be in the range of 890–1360 mm/year (more than 50% difference). However, just one centimeter of error in estimating the evaporation height leads to 30 million cubic meters of error in calculating the lake water balance, considering the average lake area of 3000 km2. This shows the necessity of accurately estimating the evaporation rate from Lake Urmia using physically-based approaches and accurate observed data.

For modeling the evaporation rate from saline water having different salt concentrations, an experimental study was carried out at a location near the Urmia Lake. Recorded daily data of meteorological variables used in modeling were: (1) surface water temperature, (2) mean air temperature (Tmean), (3) mean sky cloudiness, (4) sunshine hours, (5) mean relative humidity (RHmean), (6) mean wind speed, (7) mean air pressure, (8) maximum temperature, (9) minimum temperature, and (10) precipitation. Measurements at daily scale were used for evaporation rate of saline water with different concentration levels. Nine different samples (with salt and water) were made with different concentrations. The sample concentrations were 0.2 g/L (as a fresh water), 5 g/L, 10 g/L, 20 g/L, 50 g/L, 100 g/L, 200 g/L, 300 g/L, and 500 g/L, exposed to free evaporation in the field. This was accomplished using a sensitive electrical conductivity meter (Lide 2004). Also, the total dissolved solids (TDS) was kept unchanged during measurement. The evaporation rate was measured at daily scale from March 1, 2019 to June 30, 2019. In order to keep the water surface clear from algae and other thin films the water surface was cleaned from time to time, since any dust precipitation over the water surface would retard evaporation (El-Dessouky et al. 2002). The observed evaporation rate was recorded along with the corresponding climatic data.

First, a place closest to Lake Urmia where a station could be built was selected for the study (We used Basmenj weather station). Then, the station was equipped with Pan Evaporation, and waters with different concentrations were monitored to measure the evaporation. Urmia lake salt and distilled water were used for conducting this experiment, because the distilled water concentration was known, and it was possible to easily control the concentration of Pan Evaporation with electrical conductivity meter. The water level in all Pan Evaporation was kept near the Pan Evaporation surface. A separate scale was prepared for each Pan Evaporation, and the numbers on them were read every day. Distilled water was added to the pan every few days to reach the initial level of water and salinity (El-Dessouky et al. 2002; Lide 2004). After adding water to the Pan Evaporation, they were stirred with separate plastic tubes designed for each Pan Evaporation, and then the tubes were washed using distilled water and the water used to wash them was poured into the Pan Evaporation again, because it was intended to control the Pan Evaporation concentrations (On the days when distilled water was added to the barrels, the scales were read once before adding the water and once after adding the water and stirring).

2.2 Theory of Entropy (EnT)

Entropy theory was developed by Shannon (1948) which states that information reduces uncertainty and vice versa. Entropy can be used as an index of quantification of lack of knowledge of system characteristics. In general, four types of entropy have been introduced in the quantification of information: marginal entropy, joint entropy, conditional entropy, and information transfer entropy. In order to select the suitable set of inputs for modeling evaporation from saline water, information transfer entropy was used. The marginal entropy, called H(x), based on a discrete random variable, x, can be defined as (Shannon and Weaver 1949):

$$H(x) = - K\sum\limits_{i = 1}^{n} {P(x{}_{i})\log P(x_{i} )}$$
(1)

where the value of K is equal to 1, if in Eq. (1) the logarithm is based on 2.7182; and n is the number of events with p(xi); (i = 1, …, N).

The entropy for the sum of the two random independent variables, x and y, can be estimated as (Ahmadi et al. 2009; Saha and Chattopadhyay 2020):

$$H(x,y) = H(x) + H(y)$$
(2)

In the case of two dependent random variables, x and y, the joint entropy is less than the sum of the two distinct entropies (i.e. H(x) + H(y)).

The information transfer entropy specifies the mutual relationship between x and y. The information transfer entropy [denoted by T(x, y)] for the two dependent variables (i.e. x and y) can be obtained as (Ahmadi et al. 2009):

$$T(x,y) = H(x) + H(y) - H(x,y)$$
(3)

which can also be estimated as:

$$T(x,y) = H(x) - H(x\left| y \right.) = H(y) - H(x\left| y \right.)$$
(4)

where H(x|y) denotes the conditional entropy. The term H(x|y) expresses the residual uncertainty of x provided that y is known and vice versa.

The calculation of entropy for the M distinct variables can be generalized in a similar manner (Harmancioglu and Alpaslan 1992; Singh 2011). In this case the total entropy for the M independent variables Xm (m = 1, 2, …M) can be obtained as:

$$H(x_{1} ,x_{2} ,etc.,x_{M} ) = \sum\limits_{m = 1}^{M} {H(x_{m} )}$$
(5)

For dependent variables, the joint entropy can be computed as (Harmancioglu and Alpaslan 1992; Singh 2011; Chattopadhyay et al. 2018):

$$H(x_{1} ,x_{2} ,etc.,x_{M} ) = H(x_{1} ) + \sum\limits_{m = 1}^{M} {H(x_{m} \left| {x_{1} ,etc.,x_{m - 1} } \right.)}$$
(6)

In order to select the most suitable variables, the value of T(x, y) was calculated for all the 14 variables. Then, one of the variables was removed from the set of x, and T(x, y) was evaluated for this reduced set. This process was repeated for the other 13 variables. The computed values of T(x, y) were tabulated for different concentrations of saline water (i.e. y). For a distinct concentration (for example, 5 mg/L), those variables, for which T(x, y) was more than that of the corresponding T(x, y) obtained using the total 14 variables, the redundant variables were selected and then were removed from the set. Thus, the most appropriate variables were selected in modeling saline water evaporation.

2.3 Gamma Test (GT)

GT is a nonlinear approach which supposes that the observation set is defined by the following relationship (Durrant 2001):

$$\{ (x_{i} - y_{i} )\} ,1 \le i \le M\}$$
(7)

where xi is the input observation vector (here, surface water temperature, air temperature, …), and yi is the output of the GT (here, evaporation rate from a distinct concentration), and M is the total number of observations (here, total days for which evaporation was measured). The relationship between the inputs and output can be expressed as:

$$y = f\left( x \right) + r$$
(8)

where f(x) is the smooth variable and r is the error term. It is assumed that the mean of the distribution fitted to r is zero and its variance is limited. The gamma statistic denoted by \((\Gamma )\) expresses the variance of those observations, which the model is incapable in its determination. For a distinct input vector xi, the set N[i, k], for (1 ≤ k ≤ p), is called the set of closest neighbors.

The gamma test is based on this set (i.e. N [i,k]). The term X [I,k] is the closest neighbor for ith x (i.e. xi), such that (1 ≤ k ≤ p), and(1 ≤ i ≤ m). Also p is the maximum number of neighbors that are usually assumed to be between 10 and 50. In order to estimate \((\Gamma )\) the values of \((\delta_{M} \left( k \right))\) should be calculated according to the input data (Evans and Jones 2002).

$${\updelta }_{{\text{M}}} \left( {\text{k}} \right) = \frac{1}{{\text{M}}}\mathop \sum \limits_{{{\text{i}} = 1}}^{{\text{M}}} \left| {{\text{x}}_{{{\text{N}}\left[ {\text{i,k}} \right]}} - {\text{x}}_{{\text{i}}} } \right|^{2} ,\quad 1 \le {\text{k}} \le p$$
(9)

where \(\left| \ldots \right|\) denotes the Euclidean distance between vector \({\text{x}}_{{{\text{N}}\left[ {\text{i,k}} \right]}}\) and its neighbors. Also, the value of \(\gamma_{M} \left( k \right)\) can be estimated using the output data from the following relationship:

$${\upgamma }_{{\text{M}}} \left( {\text{k}} \right) = \frac{1}{{2{\text{M}}}}\mathop \sum \limits_{{{\text{i}} = 1}}^{{\text{M}}} \left| {{\text{y}}_{{{\text{N}}\left[ {\text{i,k}} \right]}} - {\text{y}}_{{\text{i}}} } \right|^{2} ,\quad 1 \le {\text{k}} \le p$$
(10)

where \({\text{y}}_{{{\text{N}}\left[ {\text{i, k}} \right]}} { }\) is the value of output corresponding to the kth neighborhood of xi vector. In this manner the p values for \({\updelta }_{{\text{M}}} \left( {\text{k}} \right)\) and the p values for \({\upgamma }_{{\text{M}}} \left( {\text{k}} \right)\) can be calculated. Then a relationship between \({{\{ \delta }}_{{\text{M}}} \left( {\text{k}} \right),{\upgamma }_{{\text{M}}} \left( {\text{k}} \right)\}\) would exist as:

$${\upgamma } = {\text{A }}\delta + {\Gamma }$$
(11)

The value of \({\Gamma }\) statistic is indeed the intercept of the above mentioned regression model. Also, A is the slope of the line which shows the complexity of model derived from data (Isazadeh et al. 2017; Ashrafzadeh et al. 2018).

2.4 Principal Components Analysis (PCA)

PCA is one of the oldest multivariate methods that was introduced by Pearson in 1901 and modified later by Hotelling in 1933 (Hotelling 1933; Jolliffe 1986). The main idea in PCA is the reduction of dimensions of data that feature a relatively high correlation (in terms of modulus). The reduction of dimensions by converting data into a few new variables, namely PCs. These new variables are independent of each other. The first PC can be known as the first independent variable which has the largest variance compared with the other PCs. The second PC incorporates the largest variance following the PC1 among others. In the same role the third PC has a large variance after PC1 and PC2 among others. The sum of variances of all the PCs is 100%. In PCA the first few PCS usually incorporated a large variance for data which led to easy interpretation of data.

As stated by Johnson and Hanson (1995), instead of directly using the input variables, which are usually dependent on each other, the variables can be converted to the few independent PCs that can be subsequently applied for data interpretation. In this method, only negligible information would be lost after converting the original variables into the PCS (Johnson and Hanson 1995).

2.5 Procrustes Analysis (PA)

Assuming p to be the number of total variables, and k the number of the dominant PCs in PCA, the PA selects a set of important variables which show the feature of nearly the whole number of PCs. At the same time the number of selected variables (q) is much less than p (total candidate variables) and is less than k (selected PCs). This property of PA which at last selects less variables (say q) helps with interpretation of the whole data. The retained q variables incorporate the feature of nearly all candidate variables. This method was introduced by Krzanowski (1987) and then applied in hydrology by Dinpashoh et al. (2004).

In this method, it is necessary to select q as a number of important variables to be much less than total candidate variables (i.e. p), that is, q <  < p. By the implementation of PCA to original variables the number of important PCs was selected to be k, in which k < p. In PA, the selection of q variables was done by the minimization of the objective function denoted by M2. This function can be defined as follows:

$$M^{2} = Trace\left\{ {YY^{\prime} + ZZ^{\prime} - 2ZQ^{\prime}Y^{\prime}} \right\}$$
(12)

Figure 2 illustrates the general stages of procrustes analysis.

Fig. 1
figure 1

Study area

Fig. 2
figure 2

The cyclone of variable selection using PA method (Dinpashoh et al. 2004)

Fig. 3
figure 3

Where Pans were divided into three parts: UCZ (Upper Convective Zone), NCZ (No-Convective Zone) and LCZ (Low Convective Zone)

In Fig. 2, X is the matrix of the standardized data and \(\underline {X}\) is the reduced matrix (standardized), i.e., after removing a most redundant variable from X. Y is the output of PCA conducted on X. and Z is the output of PCA carried out for \(\underline {X}\). Y is the real feature of original data, but Z is called the approximate feature of data. PA evaluates the differences between the two matrices (Y and \(\underline {X}\)) by means of differences between the sums of squares of the corresponding points in the two mentioned arrangements. If the variables were selected precisely, the similarity between the two arrangements would be large, therefore, the differences between the approximate and real configuration would be less. In Eq. (12), Q is obtained as:

$$Q = VU^{\prime}$$
(13)

where matrices U and V can be obtained using the singular value decomposition (SVD) of the matrix Z′Y having dimension of k × k. The SVD relationship is as follows:

$$Z^{\prime}Y = U\sum V^{\prime}$$
(14)

where \(UU^{\prime} = I_{k}\), \(V^{\prime}V = VV^{\prime} = I_{k}\), and \(\sum = diag \left( {\sigma_{1} ,\sigma_{2} , \ldots ,\sigma_{k} } \right)\). \(I_{k}\) is the square identity matrix with dimensions k × k.

M2 was determined for any of the subsets pertaining to the set of variables with at least q members. The best subset of the variables’ reference set was selected in a repeated manner by which at last the lowest amount of M2 was obtained and the retained variables were defined to be the most important variables (Krzanowski 1987).

3 Artificial Neural Network (ANN)

ANNs are considered as one of the data processing methods, including input/output processing and generally one or more hidden layers as their main components. Even though there are various neural network architectures, about 90% of them are the feed-forward type (Coulibaly et al. 2000; Aghelpour et al. 2019). A feed-forward ANN can have one or more hidden layers which their nodes are called as hidden units or nodes. The ANN can solve high ordered statistical problems by including hidden layers. The ANN can have several outputs, however, in the presented study one output was utilized. The perceptron, the very basic form of an artificial neural network, is a binary classifier and can be described using the following equation:

$$g\left( z \right) = \left\{ {\begin{array}{*{20}c} 1 & { if\; z = w \times X + b > 0} \\ 0 & {otherwise} \\ \end{array} } \right.$$
(15)

where g (z) is the Heaviside step function (i.e., a limited activation function), w is the weight vector, b is the perceptron parameter deviation, and X is the input vector. Multilayer perceptron (MLP) networks learn to simulate the treatment of a wrapped, nonlinear system through learning algorithms and observed data. There are different kinds of learning algorithms, the most famous of which is back-propagation (Rumelhart et al. 1988), delta-bar-delta (Jacobs 1988), quick-prop (Fombellida and Destiné 1992), conjugate gradient (Charalambous 1992), and Levenberg–Marquardt (Hagan and Menhaj 1994; Deo et al. 2018; Ashrafzadeh et al. 2019). These learning algorithms are generally used to detect the optimal set of MLP model parameters. This study used the MLP model along with learning algorithm Levenberg–Marquardt learning algorithm. 1 to 20 neurons in the hidden layer are employed to evaluate the effects of network structure on its performance in RS simulation. The “Trial and Error” method has been used to obtain the optimal neuron. The sigmoid tangent function is applied to map information from the input layer to the hidden layer and from the hidden layer to the output layer (Lagos-Avid and Bonilla 2017; Naganna et al. 2019; Ashrafzadeh et al. 2020). Moreover, this study used 70% of the data as the training set and the remaining 30% as the test set.

3.1 Model Performances

Three metrics were applied to assess the model performances, including the (1) coefficient of correlation (CC), (2) root mean square error (RMSE), and (3) Nash–Sutcliffe model efficiency coefficient (NS) as you know One of the most popular evaluation indices is the Nash Sutcliffe Index, whose range varies from 1 to negative infinity. The intervals of 0.75 1, 0.36–0.75 and less than 0.36 for this index in a simulation, respectively, show very good, satisfactory, and poor performance (Nash and Sutcliffe 1970; Isazadeh et al. 2017; Ashrafzadeh et al. 2018).

$$CC = \sqrt {\frac{{(\mathop \sum \nolimits_{i = 1}^{N} \left( {x_{i} - \overline{x}} \right)\left( {y_{i} - \overline{y}} \right)^{2} )}}{{(\mathop \sum \nolimits_{i = 1}^{N} \left( {x_{i} - \overline{x}} \right)^{2} \mathop \sum \nolimits_{i = 1}^{N} \left( {y_{i} - \overline{y}} \right)^{2} )}}}$$
(16)
$$RMSE = \sqrt {\frac{{(\mathop \sum \nolimits_{i = 1}^{N} \left( {x_{i} - \overline{y}} \right)^{2} )}}{N}}$$
(17)
$$NS = 1 - \sqrt {\frac{{(\mathop \sum \nolimits_{i = 1}^{N} \left( {x_{i} - \overline{y}} \right)^{2} )}}{{(\mathop \sum \nolimits_{i = 1}^{N} \left( {x_{i} - \overline{x}} \right)^{2} }}}$$
(18)

where \(x_{i}\) is the ith observed value, \(\overline{x}\) is the mean of observations, \(y_{i}\) is the ith value estimated from the model, \(\overline{y}\) is the mean of the estimated values, and N is the number of observations.

3.2 Uncertainty Analysis

In this study we used an approach presented by Abbaspour et al. (2007) to analyze uncertainty in the predict evaporation from saline water. In this method, the percentage of measured data bracketed by 95 percent of predicted uncertainties (95PPU) calculated by the 2.5th (Xl) and 97.5th (Xu) percentiles of normal distribution function obtained from n (in this study 1000) times of the simulation results as follows

$${\text{Bracketed by }}95{\text{PPU}} = \frac{1}{{\text{k}}}{\text{count }}\left( {{\text{j|X}}_{{\text{L}}}^{{\text{l}}} \le {\text{ X}}_{{{\text{reg}}}}^{{\text{l}}} \le {\text{X}}_{{\text{U}}}^{{\text{l}}} } \right) \times 100$$
(19)

where l is the item number from one to k, \({\text{X}}_{{{\text{reg}}}}^{{\text{l}}}\) is the observed value on day l, and j is the numerator parameter of the number of observed values placed on the 95% prediction uncertainties (95PPU) band. If all values are within the confidence band of uncertainty, they are then bracketed by 95PPU = 100 (Abbaspour et al. 2007; Isazadeh et al. 2017; Biazar et al. 2019, 2020).

Also, 95 percent prediction uncertainties (95PPU or P-factor) and d-factor coefficients were presented to quantify the authority of calibration and uncertainty analyses (Abbaspour et al. 2007). Equation (20) was used to determine the average width of the band (d-factor) index:

$$d - factor = \frac{{{{\overline{\text{d}} x}}}}{{{\sigma x}}}$$
(20)

where σx is the normal deflection of the observed data and \({\overline{\text{d}} x}\) is the average width of the confidence interval that can be achieved by

$${\overline{\text{d}}\text{x}} = \frac{1}{{\text{k}}}\mathop \sum \limits_{{{\text{t}} = 1}}^{{\text{K}}} ({\text{X}}_{{\text{U}}} - {\text{X }}_{{\text{L}}} )$$
(21)

where t = 1,…, K is the number of observed data, XU is the 97.5th percentile of model output, XL is the 2.5th percentile of model output, and K is the number of observation data.

4 Results

The experiment was conducted in an environment close to Lake Urmia and the environmental conditions were the same for all pans. In this study, three methods were used separately to determine the input variable or select the appropriate input composition and to estimate evaporation using an ANN. Finally, results were compared. To estimate the evaporation by the ANN model, the input variables were selected based on the performance characteristics of each method. The performance and specificity of each of the preprocessing methods are described in Sect. 2. This means that each of the three methods, depending on their performance characteristics, had determined the input variables or appropriate input composition for estimating evaporation at different concentrations. Therefore, the variables in the input composition selected by each of the methods were different for each concentration in the same way as Asmar and Ergenzinger (1999).

4.1 Entropy Theory (EnT)

In the entropy section, the effective input variables were used to estimate evaporation for each pan separately. Thus, the entropy value was first calculated for all selected input variables (row 1 in Table 2). Then, among all input variables one variable was eliminated and the Entropy Test was conducted for the remainder of variables. In this process all input variables were examined by removing the variables one by one. Then, the output of the entropy method was included (row 2 to 15 in Table 2). The first combination (row 1) containing all the variables was designated as the base, meaning that the subsequent combinations were compared with the first combination containing the most information (all variables). If the entropy value of the next compound (2–15) was lower than the first compound, the omitted variable contained information that was effective in estimating the evaporation, thus it was known as the effective variable (bold variable).

Table 1 Characteristics of station sued in this study
Table 2 Summary of entropy values obtained for different inputs of the model in estimating evaporation from saline water with different concentrations

The entropy values of the information transfer for different variables in line with estimating water evaporation with different salinities are presented in Table 2. In this table, the input data were: water surface temperature, air temperature, degree of cloudiness, hours of solar radiation, relative humidity, wind speed, station pressure, maximum temperature, minimum temperature, precipitation, air vapor pressure, solar radiation, maximum wind speed, and the square root of maximum and minimum temperature (SQRT (Tmaximum − Tminimum)) in Table 2, As can be seen the three most important variables for evaporation from saline water using the entropy method were water surface temperature, wind speed and rainfall. Air temperature, solar radiation, and relative humidity were important for only two different concentrations identified as 10 and 500 g/L for air temperature, 5 and 500 for solar radiation, and 5 and 300 for relative humidity). On the other hand, the variables of cloudiness degree, solar radiation, maximum temperature, and the square root of maximum and minimum temperature differences, at any of the concentrations, were not selected as effective input variables by the information transfer entropy method. Other variables were also identified as effective input variables in 3 or 4 compositions by the entropy method.

In this section, the effect of salinity can be generally investigated by initially selecting water surface temperature, wind speed, and precipitation as the primary variables for low concentrations according to the entropy method. As the concentration increased, the station pressure and air temperature variables were added. At the concentration of 50 g/L, the mean station vapor pressure variable was also added to the initial variables according to entropy in the same way as Anderson (1936).

4.2 Gamma Test (GT)

The GT for each pan was used to determine the effective input variables to estimate evaporation. Therefore, first, the GT was performed for all data (row 1) and then this test was performed based on the elimination of each of the variables from all available variables (row 2–15), which is shown in Table 3. In the first-row GT method, in which all the variables in the composition were assigned as bases and the next row or combinations were compared with the first combination, if the gamma value of the next compound (2–15) was greater than the first compound, the omitted variable was known as the effective variable (bold variable in the table).

Table 3 Summary of the output of GT for estimating daily evaporation using the normalized option

Using GT, various combinations of input variables were identified for different Pans for improving the evaporation estimation. The important variables identified by GT were surface water temperature, air temperature, mean relative humidity, mean wind speed, mean station pressure, minimum temperature, precipitation, mean station vapor pressure, and solar radiation. This result was obtained in the case of six (or more) out of the eight concentrations (input parameter) used in this study (Table 3). The use of GT led to the elimination of mean wind speed for all the densities (except 20 (g/L) and 500 (g/L) concentrations). The parameter denoted by the square root of maximum and minimum temperature (SQRT (Tmaximum − Tminimum)) eliminated by GT for densities of 10, 20, 50 (g/L) and fresh water. Results showed that maximum temperature had been eliminated just for three densities, namely 10, 20 (g/L) and fresh water (Table 3).

As can be seen, the main variables at low concentrations included the mean relative humidity, mean wind speed, and air temperature which were almost the same as Piri et al. (2009). As concentration increased, surface water temperature, sunshine hours, station pressure, mean station vapor pressure, and solar radiation were added to the initial variables. Wind speed, temperature, and relative humidity were among the factors that contributed to evaporation (Moghaddamnia et al. 2009; Biazar et al. 2019). As the concentration of the solution increased, the salt in the solution absorbed the sun's radiation energy and broke its bonds. Therefore, solar radiation, sunshine hours, and surface water temperature were added to the initial variables. Also, at higher concentration, the soluble salts decreased the free energy of the water molecules and hence the lowered the saturated vapor pressure above the saline surface. As the concentration increased toward supersaturated concentration, the maximum temperature parameter and the SQRT (Tmaximum − Tminimum) were added to the input variables. This indicated that as the concentration increased, the role of temperature increased.

4.3 Procrustes Analysis (PA)

The PA method was used for detecting the most effective variables in saline water evaporation. The PCA model should be applied for data in an iterative manner. The number of dominant components was detected by eigenvalue calculation. Those eigenvalues, whose values were more than one, were selected as the most dominant PCs. Results showed that the first three components had eigenvalues more than 1. Therefore, the first three components were selected to be the most dominant components here. Also, the number of most important variables in the estimation of saline water evaporation was chosen to be three. Bearing in mind that the number of dominant effective PCs and the number of effective variables were three, the PA analysis was carried out for data and the M2 index was calculated for the selection of important variables. Table 3 shows the most important variables which were detected by PA to be effective in saline water evaporation with different concentrations.

From Table 4, it can be seen that, for example, in the 20 (g/L) concentration, the three important variables in the estimation of saline water were air temperature, sunshine hours, and mean wind speed same as Wang et al. (2018a, b), respectively. This result is valid for the other five concentrations, namely 50, 100, 300, 500 g/L, and “Fresh Water” too. Similarly, sunshine hours, mean wind speed, and maximum temperature were the effective variables in evaporation from water with concentration of 10 (g/L). The best combinations of variables for the estimation of saline water evaporation with different concentrations are represented in Table 4.

Table 4 Optimal input variables selected, based on PA for each saline water concentration

Depending on the dataset used, the PA method first classifies the data and then selects the variable that contains the most variance in each data set. Therefore, the selected variable from each category is most similar to its members, thus using the selected variable can represent other variables in each category.

As can be seen from the results, the wind speed variable was recognized as the effective input variable by almost all three methods, the same as Mor et al. (2018). This may be due to the fact that the station studied was near the lake and humidity was high in the area and because of high wind speed in the area, this variable was able to show its effect well. The physical effect of wind speed is to reduce the moisture content in the region and increase the evaporation rate, so the wind speed is also a good representative of maximum wind speed, relative humidity, and saturated vapor pressure. Another variable that can be considered as an effective input variable in evaporation modeling is the air temperature and surface water temperature which were selected by the GT and EnT at most concentrations. However, the PA’ method identified air temperature as an effective input variable rather than surface water temperature. The reason is that depending on the physical conditions and the variance of data, air temperature can be representative of surface water temperature. It should be noted that the most important energy source for warming the environment is solar radiation (Dinpashoh et al. 2019; Biazar and Ferdosi 2020), and in fact, the physical effect of solar radiation is the increase in ambient temperature. The temperature itself is also an effective factor in evaporation. Thus, the temperature can somehow reflect the physical effect of solar radiation and represent variable solar radiation. Sunshine hours can also be a good representative of the variable cloudy and rainy weather. In fact, the longer the sunshine hours, the lower the cloudiness, and the lower cloudiness degree will also bring about a decrease in humidity.

In saline water, stratification is carried out at high concentrations and the lower part has the highest concentration possible. It should be noted that the sunlight penetrating the pan will pass through the first and second layers and fall into the third layer at high concentrations, which is why the temperature was very high in the pan with high concentrations-the same as Rabl and Nielsen (1975), Hull et al. (1988), and El-Sebaii et al. (2011). In fact, salts or high concentrations of the solution absorb and store the sun's energy. The second layer acts as an insulator for the third layer, which reduces the water surface temperature (the first layer) and increases the temperature at the bottom of the pan (the third layer) at high concentration, the same as Kurt et al. (2000), Suárez et al. (2010), and Ruskowitz et al. (2014). This will reduce the release of water molecules into the saline water. In saline waters, the first layer has the lowest concentration compared to the other two layers, but the soluble salts in this layer decrease the free energy of water molecules. According to the second law of thermodynamics, the increase in ionic activity due to the presence of ions in the solvent and the chemical potential of the solvent will cause the evaporation of saline waters, especially the saline waters, not following the usual models of evaporation. As a result, it reduces the rate of conversion of water molecules from liquid to gas, and reduces the vapor pressure, as is known, evaporation takes place whenever there is a deficit between a water surface and the overlying atmosphere and sufficient energy is available. This will be neutralized as the concentration decreases, and the energy from sunlight is being distributed more evenly throughout the pan. Also, the concentration in the first layer decreases from super salty to freshwater, respectively, which increases the temperature in the surface layer of water, thereby increasing evaporation from saline to freshwater. As mentioned earlier, the climate was the same for all pans. Therefore, according to the above discussion, the physical effect of concentration of the solutions was mainly on the temperature changes, so that the surface temperature of the saline water was observed to be lower than the freshwater surface temperature for the above-mentioned reasons. Evaporation measurements in this study were performed for each concentration separately so that the effect of the concentration factor was considered on the volatility values themselves.

4.4 Modeling Results

The best input combinations were identified by GT, EnT, and PA, and then ANN was used for evaporation estimation. Table 5 shows the results of performance measures of the intelligent models with different scenarios, namely, ANN-GT, ANN-EnT and ANN-PA. As seen from Table 5, the ANN-GT model showed better performance in the estimation of daily saline water evaporation as compared with the other two models. This result was obtained, based on the performance measures (RMSE, CC, and NS), which is valid for all salt concentrations [except 20 (g/L)]. One of the uncertainties of statistical and intelligent models is the use of iterations in calculations. The coefficients such as the d-factor and p-factor can be calculated based on the results of each iteration, each representing part of the uncertainty of each model. Therefore, to determine the uncertainty band after a thousand times iteration in each model, a 95% probability band was obtained for each pan. Although the model ANN-GT revealed better performance than ANN-EnT and ANN-PA, however, ANN-PA showed lower uncertainty than the other two, because the values of the d-factor measure obtained for ANN-GT and ANN-EnT were larger than ANN-PA. However, as can be seen from Table 5 and Fig. 4, the values obtained here for the d-factor and 95% PPU (i.e. p-factor) indicated the lower uncertainty of ANN-PA model in all the tested concentrations [except 20 and 5 (g/L)] (Table 5 and Fig. 4). Results of uncertainty analysis for different salt concentrations are shown in Fig. 4. As can be seen from Fig. 4, ANN-GT and ANN-EnT had a wider 95% PPU band than ANN-PA. This implies that the obtained values of d-factor for the ANN-GT and ANN-EnT models were greater than for ANN-PA. According to Fig. 4, almost all models showed that the higher the bandwidth of 95 PPU, the greater the d-factor, and the lower the bandwidth, the lower the d-factor, the same as Abbaspour et al. (2007), and Noori et al. (2011). The lowest bandwidth also belonged to the ANN-PA model for concentrations of 50 g/L with a P-factor of 0.45 and a d-factor of 0.66. The highest d-factor belonged to the ANN-GT model for a concentration of 500 g/L with a d-factor of 1.88 and a p-factor of 0.86. One reason for the high uncertainty in this model may be due to the number of input variables selected by the gamma test-the same as Isazadeh et al. (2017). As mentioned in the GT section, for the concentration of 500 g/L all variables except the precipitation variable were selected as inputs. Comparative diagrams of calculated and observed values are given in Fig. 4, which shows the proper overlap between the estimated and observed diagrams. Besides, according to the NS index, given the high values of this index, which in most models were in a very good range, it can be concluded that the performance of the models was acceptable. Generally, results with lower uncertainty are more reliable in the estimation of hydrologic variables (Khaledian et al. 2020; Isazadeh et al. 2017; Ghorbani et al. 2016; Noori et al. 2011; Yang et al. 2008; Abbaspour et al. 2007).

Table 5 Test results and uncertainty analysis of ANNs models
Fig. 4
figure 4figure 4

Uncertainty of ANN-GT, ANN-EnT and ANN-PA in estimation of evaporation from saline water

5 Conclusion

Model input selection is a complicated process, especially for non-linear dynamic models. This study introduced a new input selection method, PA, for the estimation of daily evaporation from saline water with different concentrations. This work is conducted for daily estimation of saline water evaporation using three intelligent models (with different scenarios) which are ANN-GT, ANN-EnT and ANN-PA. The outputs of these models were compared with the corresponding measured evaporation. The amount of evaporation was measured separately for each pan (each pan had different concentrations). As in the Kokya and Kokya (2008) study, the concentration was kept constant throughout the experiment. It should be noted that the amount of concentration affected the amount of evaporation and evaporation increased from the high-concentration pan toward the low-concentration pan (Al-Khlaifat 2008). The effect of concentration is included in the evaporation values themselves. And the purpose of this study was to determine the best input variables. Moreover, the selection of effective Inputs for saline water evaporation using PA, GT and EnT can be regarded to be the novelty of this study. The most important variables identified by the GT were surface water temperature, air temperature, mean relative humidity, mean wind speed, mean station pressure, minimum temperature, precipitation, mean station vapor pressure, and solar radiation. Also, as can be seen the most important variables for evaporation from saline water using the EnT method were water surface temperature, wind speed, and precipitation. The three important variables in the estimation of saline water selected by the PA method were air temperature, sunshine hours, and mean wind speed. As the concentration increased, the mean station vapor pressure and temperature variables had the most influence on saline water evaporation. The performances of models were evaluated by using the RMSE, CC, and NS measures. The limitations of these measures are in their inability to assess uncertainty. Therefore, uncertainty of the models used was evaluated using the p-factor (95% PPU) and d-factor. Although ANN-GT was found to be a powerful model in the estimation of evaporation, however, it failed to select the optimal input combinations with lowest uncertainties (Noori et al. 2011). Having lower uncertainty is reported to be the primary criterion in the selection of most suitable model in hydrologic practical applications. This study employed the mentioned two measures (p-factor and d-factor) in uncertainty analysis. Results of this study are interpreted, based on the evaluation of uncertainty in addition to model performance measures. Although there are numerous questions that need to be addressed yet, this work will hopefully stimulate more investigation in input selection methods in model development for hydrological processes, climate simulation, and evapotranspiration estimation. The authors will suggest to develop different scenarios for ANN models to remove bias and increase sufficiency in other researches. Generally, according to the results the PA method was able to demonstrate acceptable performance over the other two methods, with the least number of input variables, in addition to reducing the uncertainty of the ANN model. It can be concluded that by reducing the number of variables as well as selecting the input combination that is a good representative of the effective variables in the evaporation estimation, the model complexity can be reduced and a good estimate of saline water evaporation can be made.