Introduction

Accurate estimation of pan evaporation (E p) is needed to solve various hydrological and water resources related problems. Several meteorological parameters [air temperature (T avg), wind speed (W s), relative humidity (RHavg), and solar radiation (S ra)] influence E p, and the nonlinear nature of their influence makes the modeling of evaporation more complex. To deal with this complexity and nonlinearity, artificial neural networks (ANNs) are used in E p modeling. Depending upon the order of synaptic operation in a hidden neuron, ANNs are classified as either first order or higher order (second, third, or Nth) (Gupta et al. 2003). The ‘first-order neural network’ or ‘linear neural network (LNN)’ models are synonymous with feed-forward neural network (FNN) models.

Han and Felker (1997) demonstrated the application of ANNs to the estimation of E p. The authors implemented a radial basis function (RBF) neural network to model daily soil water evaporation using RHavg, T avg, W s, and soil water content data as input. Bruton et al. (2000) developed FNNs to estimate daily E p with different combinations of rainfall, T avg, RHavg, S ra, and W s as input. Sudheer et al. (2002) proposed ANN models with the back propagation (BP) training algorithm for the prediction of Class A pan evaporation with different combinations of climate data as input. Keskin and Terzi (2006) evaluated the potential of ANNs to estimate daily E p from measured meteorological data, viz. water temperature (T w), T avg, sunshine hours (n), S ra, air pressure (P a), RHavg, and W s. Kisi (2009) investigated the abilities of multi-layer perceptron (MLP), RBF, and generalized regression neural network (GRNN) models to estimate daily E p using climatic variables (T avg, S ra, W s, RHavg, and P a). Moghaddamnia et al. (2009) explored ANN and adaptive neuro-fuzzy inference system (ANFIS)-based E p estimation methods with the aid of the Gamma test. Rahimikhoob (2009) considered MLP models for estimating daily E p by using maximum and minimum air temperatures (T max and T min) and the extraterrestrial radiation (R a) as input. Shirsath and Singh (2010) presented the application of ANN and multi-linear regression (MLR) models comprising various combinations of T max, T min, n, W s, and maximum and minimum relative humidity (RHmax and RHmin) to estimate daily E p.

Tabari et al. (2010) aimed to estimate daily E p using ANN and multivariate nonlinear regression methods with varying input (T avg, precipitation, S ra, W s, and RHavg) combinations and various training algorithms. Shirgure and Rajput (2011) thoroughly reviewed the studies on E p modeling using ANNs. Shiri and Kisi (2011) illustrated the abilities of ANN, genetic programming, and ANFIS models to improve the accuracy of daily E p estimation by using various climatic variables (T avg, n, S ra, W s, and RHavg). Kalifa et al. (2012) developed ANN-based models to predict E p from various combinations of T w, T avg, W s, RHavg, and S ra as input. Kim et al. (2012) demonstrated the accuracy of two types of ANNs, i.e., MLP and the co-active neuro-fuzzy inference system model, for estimating daily E p. Kumar et al. (2012) developed ANN and ANFIS models to forecast monthly potential E p based on different combinations of four explanatory climatic factors (RHavg, S ra, T avg, and W s). Nourani and Fard (2012) examined the potential of MLP, RBF, and Elman networks for estimating daily E p using measured climatic (RHavg, S ra, T avg, P a, and W s) data. Arunkumar and Jothiprakash (2013) developed ANN, model tree, and genetic programming-based models with varying input data (reservoir evaporation values with different lags) to predict reservoir evaporation. Chang et al. (2013) proposed a hybrid model to estimate E p that combined BP ANNs and dynamic factor analysis. Kim et al. (2013) developed MLP, GRNN, and ANFIS-based models for estimating daily E p using T avg, S ra, n, and merged input combinations under lag-time patterns. Kim et al. (2014) developed soft computing models, namely MLP, a self-organizing map ANN model, and gene expression programming, to predict daily E p.

All of the above-cited studies used the FNN or LNN to model E p. These neural networks are able to extract the first-order or linear correlations that exist between inputs and the synaptic weight vectors. However, the climatic variables associated with E p exhibit high nonlinearity, and LNN models fail to extract the complete nonlinearity present in the data because of their linear synaptic operation. These limitations of the conventional E p methods encourage researchers to develop higher-order neural network (HNN) models. The HNN is a polynomial model in which the weighted sum of the products of its inputs is passed to a computational neuron instead of just a weighted sum of its inputs, as in the case of conventional ANNs. This property gives HNNs superior performance over conventional ANNs. HNNs have been widely used in various fields such as pattern recognition and financial time series forecasting. Further, HNN models have been successfully applied in hydrology to a limited extent, e.g., characterizing soil moisture dynamics (Elshorbagy and Parasuraman 2008), forecasting river discharge (Tiwari et al. 2012), and estimating reference crop evapotranspiration (Adamala et al. 2014a, b, 2015a). However, the application of HNNs to E p estimation has not yet been reported.

One limitation associated with the FNN and HNN models is their lack of generalizing capability, because they are applicable only to data from the locations that are used in training or model development (these locations are indicated as ‘model development locations’). When new location data, i.e., data from locations that were not used during the model development (these locations are represented as ‘model test locations’), are introduced to the developed network, the network fails to provide good performance, indicating poor generalizing capacity. This limitation can be overcome by developing generalized FNN (GFNN) and generalized HNN (GHNN) models which perform well not only for model development locations but also for model test locations. This can be achieved by using, during model development, pooled climatic data from various locations that together cover both spatial and altitudinal variations. Therefore, this study aims to develop GHNN and GFNN models for the estimation of E p for different agro-ecological regions (AERs) of India and to test their generalizing capabilities for both the model development and testing locations.

Materials and Methods

Study Area and Climate Data

The climatic data for this study were collected from All India Coordinated Research Project on Agro-meteorology (AICRPAM), Central Research Institute for Dryland Agriculture (CRIDA), Hyderabad, Telangana, India. Data (T max, T min, S ra, RHmax, RHmin, W s, and E p) for 25 climatic stations distributed over the following four AERs: semi-arid, arid, sub-humid, and humid (Fig. 1) were collected.

Fig. 1 Geographical locations of study sites in India

Table 1 presents information related to altitude, observation periods, and statistical summary of the climate data and measured E p for the chosen locations. The altitude of selected stations varies from 10 m above msl at Mohanpur to 1600 m above msl at Ranichauri. The mean T min and T max range from 9.66 °C at Ranichauri to 23.38 °C at Thrissur and 20.08 °C at Ranichauri to 35.11 °C at Kovilpatti, respectively. The mean RHmin and RHmax range from 33.91% at Anantapur to 75.27% at Jorhat and 64.27% at Akola to 96.18% at Mohanpur, respectively. The mean W s and S ra range from 1.27 km h−1 at Mohanpur to 9.64 km h−1 at Anantapur and 3.46 MJ m−2 day−1 at Ranchi to 23.30 MJ m−2 day−1 at Akola, respectively. The mean E p ranges from 2.30 mm day−1 at Jorhat in a humid region to 8.38 mm day−1 at Anantapur in an arid region.

Table 1 Characteristics of daily climate data and E p for the study locations

Daily climate data (T avg, RHavg, W s, and S ra) of 15 locations were used to develop GHNN-based E p models, whereas the remaining 10 locations were used to test the developed models. The correlation coefficients of the climatic variables with E p for the four AERs are shown in Table 2. Of the four climatic variables, three (T avg, W s, and S ra) show a positive correlation with E p and one (RHavg) shows a negative correlation. The degree of correlation of these climatic variables with E p indicates their sensitivity in estimating E p.

Table 2 Correlation coefficients of climate variables with E p

Artificial Neural Network (ANN) Models

ANNs are represented as parallel distributed units with a crucial ability of learning and adaptation. The processing of information in any biological or artificial neural models involves two distinct operations: (a) synaptic operation and (b) somatic operation. In synaptic operation, different weights are assigned to each input matrix based on past experience or knowledge with an addition of bias or threshold (Fig. 2). In somatic operation, the synaptic output is applied to a nonlinear activation function (\( \phi \)) (Tiwari et al. 2012). Mathematical representation of synaptic and somatic operations in a neural network is shown in Eqs. (1) and (2), respectively.

Fig. 2 Architecture of generalized synaptic neural network models (Tiwari et al. 2012)

$$ y = \sum\limits_{i = 0}^{n} {w_{i} } x_{i} = w_{0} x_{0} + w_{1} x_{1} + \cdots + w_{n} x_{n} $$
(1)
$$ z = \phi \left[ y \right] $$
(2)

where y = neural synaptic output; z = neural somatic output; w 0 = threshold weight; x 0 = constant bias (=1); \( x_{i} \) = neural inputs at the ith step; \( w_{i} \) = synaptic weights at the ith step; and \( \phi \) = activation function (sigmoid); n = number of elements in the input vector.
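Equations (1) and (2) can be illustrated with a minimal Python sketch of a single first-order neuron. The function names, the example inputs, and the example weights are hypothetical, and the logistic form of the sigmoid is an assumption, since the text only states that \( \phi \) is a sigmoid.

```python
import math


def sigmoid(y):
    """Logistic sigmoid, phi(y) = 1 / (1 + exp(-y)) (assumed form of phi)."""
    return 1.0 / (1.0 + math.exp(-y))


def first_order_neuron(inputs, weights):
    """Return (y, z) for one neuron, following Eqs. (1) and (2).

    inputs  : the n neural inputs x_1..x_n (the bias x_0 = 1 is added here)
    weights : the n + 1 synaptic weights w_0..w_n (w_0 is the threshold weight)
    """
    x = [1.0] + list(inputs)  # prepend the constant bias x_0 = 1
    if len(x) != len(weights):
        raise ValueError("expected one weight per input plus a threshold weight")
    # Synaptic operation, Eq. (1): weighted sum over i = 0..n
    y = sum(w_i * x_i for w_i, x_i in zip(weights, x))
    # Somatic operation, Eq. (2): apply the activation function
    z = sigmoid(y)
    return y, z


# Hypothetical example: two inputs with illustrative weights.
y, z = first_order_neuron([0.5, -1.0], [0.1, 0.4, 0.3])
```

In this example the weighted sum is 0.1 + 0.2 − 0.3, so the synaptic output is zero up to rounding and the somatic output sits at the midpoint of the sigmoid.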

Generalized Feed-Forward Neural Network (GFNN) Model

The GFNN model provides the neural output as a nonlinear function of the weighted linear combination of the neural inputs. In GFNN model, the synaptic operation is of the first order which means that only first-order correlations exist between the inputs and the synaptic weights of the model. Let N and n be the order and the number of inputs to the neuron, respectively. For N = 1, according to Redlapalli (2004) the mathematical expression of GFNN model is given as:

$$ \left( z \right)_{N = 1} = \phi \left( {\sum\limits_{{i_{1} = 0}}^{n} {w_{{i_{1} }} } x_{{i_{1} }} } \right) $$
(3)

where \( x_{{i_{1} }} \) = neural inputs at the \( i_{1}^{\text{th}} \) step; \( w_{{i_{1} }} \) = synaptic weights at the \( i_{1}^{\text{th}} \) step.

Generalized Higher-Order Neural Network (GHNN) Model

The architecture of the GHNN model captures the higher-order association as well as the linear association between the elements of the input patterns. Higher-order weighted combinations of the inputs yield higher neural performance, as they require fewer training passes and a smaller training set to achieve generalization over the input domain. In the GHNN model, the synaptic operation in a neural unit or node is of the second order, which means that it embraces both the first- and second-order combinations of the neural inputs with the synaptic weights. For N = 2, the mathematical model of GHNN is represented as:

$$ \left( z \right)_{N = 2} = \phi \left( {\sum\limits_{{i_{1} = 0}}^{n} {\sum\limits_{{i_{2} = i_{1} }}^{n} {w_{{i_{1} i_{2} }} } } x_{{i_{1} }} x_{{i_{2} }} } \right) $$
(4)

where \( x_{{i_{2} }} \) = neural inputs at the \( i_{2}^{\text{th}} \) step; \( w_{{i_{1} i_{2} }} \) = synaptic weights at the \( (i_{1} i_{2})^{\text{th}} \) step (Redlapalli 2004).
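The double sum in Eq. (4) can be sketched in Python as follows. Because \( x_{0} = 1 \), the terms with \( i_{1} = 0 \) reproduce the threshold and the first-order terms of Eq. (3), while the remaining terms are the second-order products. The function names and the dictionary layout for the weights are hypothetical choices made for illustration, and the logistic sigmoid is again an assumed form of \( \phi \).

```python
import math


def second_order_neuron(inputs, weights):
    """Somatic output z of one second-order neuron, following Eq. (4).

    inputs  : the n neural inputs x_1..x_n (the bias x_0 = 1 is added here)
    weights : dict mapping (i1, i2) with 0 <= i1 <= i2 <= n to w_{i1 i2};
              pairs that are absent are treated as zero weights
    """
    x = [1.0] + list(inputs)
    n = len(x) - 1
    y = 0.0
    for i1 in range(n + 1):
        # i2 starts at i1, so each product x_i1 * x_i2 appears exactly once
        for i2 in range(i1, n + 1):
            y += weights.get((i1, i2), 0.0) * x[i1] * x[i2]
    return 1.0 / (1.0 + math.exp(-y))  # logistic sigmoid (assumed form of phi)


def count_second_order_weights(n):
    """Number of weights w_{i1 i2} in Eq. (4) for n inputs: (n + 1)(n + 2) / 2."""
    return (n + 1) * (n + 2) // 2
```

Under this indexing, a neuron with four inputs has 15 synaptic weights (one threshold, four first-order, and ten second-order terms), compared with five for the first-order neuron of Eq. (3).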

Data Preparation

For the development of GFNN and GHNN models for different AERs, locations having daily data for the period of 2001–2005 were chosen. The data were divided into training sets (denoted as Tr and used to adjust the weights and biases during learning), validation sets (denoted as V and used to avoid overfitting), and testing sets (denoted as Ts and used to predict with new data). The locations with the ‘Tr, V, Ts’ role (Table 1) were used to develop the GFNN and GHNN models (model development locations). These locations were selected for model development because a larger set of data was available for them during the study period than for the other locations. A standard hold-out strategy was used to divide the data, as is common practice in hydrological modeling (Adamala et al. 2015b). For these locations, 70 and 30% of the data for the period 2001–2004 were used for training and validation, respectively. Using a different year of data for each location would be more complicated. Therefore, data from the same year (2005) were used to test the performance of the developed models at all locations. The 2005 data nevertheless differ from location to location because the locations lie in different agro-climatic zones.

To develop GFNN and GHNN models for the semi-arid, arid, sub-humid, and humid regions, the data in Table 1 were pooled, respectively, as follows: (i) Parbhani, Solapur, Bangalore, Kovilpatti, and Udaipur; (ii) Anantapur and Hissar; (iii) Raipur, Faizabad, Ludhiana, and Ranichauri; and (iv) Palampur, Jorhat, Mohanpur, and Dapoli. To test the generalizing capability of the developed models (either for practical application or ….), these models were applied to data from the locations that were not used during model development. The locations with only the ‘Ts’ role (Table 1) were used to test the generalizing capability of the developed models (model testing locations). As an example, for the locations that lie in the semi-arid region (Parbhani, Solapur, Bangalore, Kovilpatti, and Udaipur), the pooled data of 2001–2004 were used to train (including validation) the GFNN and GHNN models, while the data of 2005 were used to test these models. The generalizing capability of the GFNN and GHNN models was then tested using data from Kanpur, Anantapur, and Akola, which were not included during model development for the semi-arid region. In a similar way, different GFNN and GHNN models were developed and tested for their generalization capabilities in the arid, sub-humid, and humid regions.
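The pooling and hold-out scheme described above can be sketched as follows. This is only an illustration under stated assumptions: the record layout `(year, inputs, target)`, the function name, and the random (seeded) draw of the 70/30 split are hypothetical, since the text does not say how the 70% and 30% portions of the 2001–2004 data were selected.

```python
import random


def split_pooled_records(records_by_location, train_fraction=0.7, seed=0):
    """Pool daily records from several locations and split them by year.

    records_by_location : dict mapping a location name to a list of
                          (year, inputs, target) tuples
    Returns (training, validation, testing) lists of (inputs, target) pairs.
    """
    development, testing = [], []
    for records in records_by_location.values():
        for year, inputs, target in records:
            if 2001 <= year <= 2004:      # pooled model development period
                development.append((inputs, target))
            elif year == 2005:            # common testing year
                testing.append((inputs, target))
    # Assumption: the 70/30 split is drawn at random from the pooled data.
    rng = random.Random(seed)
    rng.shuffle(development)
    cut = round(train_fraction * len(development))
    return development[:cut], development[cut:], testing
```

For one region, `records_by_location` would hold the pooled model development locations (for example, the five semi-arid locations), while data from the model testing locations would be kept entirely outside this split.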

Criteria for Preprocessing and Estimation of Parameters

As a first step in developing the GFNN and GHNN models, the data were normalized before being presented as input to the network and denormalized after the optimum network was developed. Both steps used the Matlab built-in function ‘mapstd’, which rescales data so that their mean and standard deviation become equal to 0 and 1, respectively. The inputs for developing the GFNN and GHNN models were T avg, RHavg, S ra, and W s. This study examined eight combinations of these inputs to both models. Thus, the sensitivity of E p to each of these variables was evaluated. The target consists of the daily values of measured E p. Only one hidden layer was used in both the GFNN and GHNN models, as it is enough to represent the nonlinear relationship between the climate variables and E p. The important parameters for network training are the learning rate, which tends toward a fast, steepest-descent convergence, and the momentum, a long-range function that prevents the solution from being trapped in local minima. The other parameters are the activation function, error function, learning rule, and initial weight distribution (i.e., initialization of weights). A variation in the GHNN parameters had a negligible effect on the performance of these models for estimating E p (Adamala et al. 2014b). Therefore, results concerning the model parameters are not discussed. A sigmoidal activation function was employed in the output layer neurons. For developing the GHNN-based daily E p models, the code was written in the Matlab 7.0 programming language.
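The normalization and denormalization step can be illustrated with a short Python sketch of zero-mean, unit-variance scaling. It is not the Matlab ‘mapstd’ implementation; the function names are hypothetical, and the use of the sample standard deviation (divisor n − 1) is an assumption.

```python
import statistics


def fit_standardizer(values):
    """Return the (mean, std) of one variable, to be reused for scaling."""
    mean = statistics.mean(values)
    std = statistics.stdev(values)  # sample standard deviation (assumption)
    if std == 0:
        raise ValueError("cannot standardize a constant series")
    return mean, std


def standardize(values, mean, std):
    """Rescale values so that the fitted series has mean 0 and std 1."""
    return [(v - mean) / std for v in values]


def destandardize(values, mean, std):
    """Map standardized values (e.g., network outputs) back to original units."""
    return [v * std + mean for v in values]


# Hypothetical daily E p values in mm/day, for illustration only.
ep = [2.0, 4.0, 6.0, 8.0]
mean, std = fit_standardizer(ep)
scaled = standardize(ep, mean, std)
restored = destandardize(scaled, mean, std)
```

Each input variable and the target would be scaled separately in this way, and the stored mean and standard deviation of the target would be used to convert the network output back to mm day−1.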

Performance Evaluation

The performance of all the developed models was evaluated for the training, validation, and testing periods in order to examine their effectiveness in simulating E p. The performance indices used for evaluating the models were the root mean squared error (RMSE, mm day−1), the ratio of average output to average target E p values (R ratio), and the coefficient of determination (R 2, dimensionless). These indices are defined below.

$$ {\text{RMSE}} = \sqrt {\frac{1}{n}\sum\limits_{i = 1}^{n} {\left( {T_{i} - O_{i} } \right)^{2} } } $$
(8)
$$ R^{2} = \frac{{\left[ {\sum\nolimits_{i = 1}^{n} {\left( {O_{i} - \overline{O} } \right)\left( {T_{i} - \overline{T} } \right)} } \right]^{2} \, }}{{\sum\nolimits_{i = 1}^{n} {\left( {O_{i} - \overline{O} } \right)^{2} \sum\limits_{i = 1}^{n} {\left( {T_{i} - \overline{T} } \right)^{2} } } }} $$
(9)
$$ R_{\text{ratio}} = \frac{{\overline{O} }}{{\overline{T} }} $$
(10)

where T i and O i  = target (E p) and output (E p predictions of the GFNN and GHNN models) values at the ith step, respectively; n = number of data points; \( \overline{T} \) and \( \overline{O} \) = average of target and output values, respectively.
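The three indices in Eqs. (8)–(10) can be computed as in the following Python sketch. The function names and the sample values are hypothetical and serve only to show the arithmetic.

```python
import math


def rmse(targets, outputs):
    """Root mean squared error, Eq. (8)."""
    n = len(targets)
    return math.sqrt(sum((t - o) ** 2 for t, o in zip(targets, outputs)) / n)


def r_squared(targets, outputs):
    """Coefficient of determination as the squared correlation, Eq. (9)."""
    t_mean = sum(targets) / len(targets)
    o_mean = sum(outputs) / len(outputs)
    cross = sum((o - o_mean) * (t - t_mean) for t, o in zip(targets, outputs))
    o_var = sum((o - o_mean) ** 2 for o in outputs)
    t_var = sum((t - t_mean) ** 2 for t in targets)
    return cross ** 2 / (o_var * t_var)


def r_ratio(targets, outputs):
    """Ratio of average output to average target, Eq. (10)."""
    return (sum(outputs) / len(outputs)) / (sum(targets) / len(targets))


# Hypothetical measured (target) and estimated (output) E p values, mm/day.
measured = [2.0, 4.0, 6.0]
estimated = [3.0, 5.0, 7.0]
print(rmse(measured, estimated))       # 1.0
print(r_squared(measured, estimated))  # 1.0
print(r_ratio(measured, estimated))    # 1.25
```

The example shows why the indices are used together: a constant overestimation of 1 mm day−1 leaves R 2 at 1 but is visible in the RMSE and in an R ratio above one.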

Results and Discussion

Evaporation estimation requires nonlinear mapping of different climate variables. The main advantages of using GHNN models are their flexibility and their ability to model nonlinear relationships. An important aspect of this study is to develop models for prediction of E p using available weather data. Sudheer et al. (2002) concluded that using daily mean values of temperature and relative humidity instead of minimum and maximum values of both the parameters would not significantly reduce the performance. This observation may help to reduce drastically the data requirement for estimating the evaporation from climatic variables.

Performance of E p Based GHNN Models

Keeping the findings of Sudheer et al. (2002) in view, in the present study the GHNN models were developed with various combinations of T avg, RHavg, W s, and S ra instead of T max, T min, RHmax, RHmin, W s, and S ra as inputs to evaluate the effect of each of these variables on estimated E p. A total of eight different input combinations were tried in this study. These included (i) T avg, RHavg, W s, and S ra; (ii) T avg, RHavg, and W s; (iii) T avg, RHavg, and S ra; (iv) T avg and RHavg; (v) T avg; (vi) RHavg; (vii) W s; and (viii) S ra. Due to the poor performance of the GHNN models developed with a single input variable (input combinations v–viii), the performance results pertaining to these are not presented here. The GHNN models were compared with the GFNN models to test the relative performance of higher-order over linear (first-order) neural models. Further, the developed GHNN models were compared with generalized multiple linear regression (GMLR) models to evaluate the accuracy of the former.

The optimum GHNN, GFNN, and GMLR structures were determined for each input combination, and their performance statistics during testing are presented in Table 3. The comparison of the GHNN models with the GFNN models confirms the superiority of the GHNN models in terms of the various performance criteria (RMSE, R 2, and R ratio) for all four input combinations under the four AERs (Table 3). The reason for this is probably the capability of GHNN models to capture nonlinearity, as these models use nonlinear approximation functions with second-order polynomials (Eq. 4). The GMLR models with various input combinations showed the poorest performance, with the highest RMSE and lowest R 2 values, for all AERs except the arid region.

Table 3 Performance of GMLR-, GFNN-, and GHNN-based E p models with different input combinations

Among all combinations, the GHNN(i) model, whose inputs are T avg, RHavg, W s, and S ra (input combination i), gave the highest accuracy with the smallest RMSE (mm day−1) values (1.389 for semi-arid, 1.079 for sub-humid, 0.99 for humid) for all AERs except the arid region, where GMLR(i) resulted in the minimum RMSE of 1.429 mm day−1. This shows the strong correlation of the T avg, RHavg, W s, and S ra variables with the measured E p values. This superior performance might be due to the inclusion of all climatic data as inputs, which may have a great influence on generalized models, as these were developed using data from different locations.

Removing S ra (input combination ii) as an input variable increased the RMSE (mm day−1) to 1.451, 1.117, and 1.035 for the semi-arid, sub-humid, and humid regions, respectively. Further decreasing the number of input variables in the GHNN model continued to decrease its accuracy. Removing the climate variable W s (input combination iii) as an input variable increased the RMSE (mm day−1) to 1.510, 1.167, and 1.224 for the semi-arid, sub-humid, and humid regions, respectively, and decreased the RMSE (mm day−1) to 1.625 for the arid region. The GHNN(iv) model, which considers only two inputs, further increased the RMSE (mm day−1) to 1.559, 1.632, 1.198, and 1.281 for the semi-arid, arid, sub-humid, and humid regions, respectively. A similar trend in the performance of the GHNN models was also observed with the R 2 (high) statistical index. These results suggest that the performance of the developed E p models decreased as the number of input variables decreased.

Due to the superior performance of the GHNN models over the GMLR and GFNN models, only the scatter plots pertaining to the GHNN models with input combinations (i) to (iv) are shown in Fig. 3, which confirms the statistics given in Table 3. The results in Fig. 3 illustrate the agreement between the E p predictions of the GHNN models and the measured E p values for all regions. Although the GHNN(i) to (iv) models resulted in acceptable R 2 values for all regions except the humid region, their estimates are far from the exact 1:1 fit line. This can be clearly observed from the coefficients of their fitted equations (y = a 0 x + a 1), where the values of the a 0 and a 1 coefficients are far from one and zero, respectively.
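The fitted equation y = a 0 x + a 1 used to judge closeness to the 1:1 line can be obtained by ordinary least squares, as in the sketch below. The function name and the sample values are hypothetical; the source does not state how the fitted equations in Fig. 3 were computed, so least squares is an assumption.

```python
def fit_line(measured, estimated):
    """Least-squares slope a0 and intercept a1 of estimated = a0 * measured + a1."""
    n = len(measured)
    x_mean = sum(measured) / n
    y_mean = sum(estimated) / n
    sxx = sum((x - x_mean) ** 2 for x in measured)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(measured, estimated))
    a0 = sxy / sxx               # slope; 1 for a perfect 1:1 fit
    a1 = y_mean - a0 * x_mean    # intercept; 0 for a perfect 1:1 fit
    return a0, a1


# Hypothetical values: estimates that are perfectly correlated with the
# measurements (R^2 = 1) but compressed toward the mean.
a0, a1 = fit_line([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 2.5, 3.0])
print(a0, a1)  # 0.5 1.0
```

The example mirrors the situation described above: a high R 2 can coexist with a slope well below one and an intercept well above zero, so the scatter lies away from the 1:1 line.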

Fig. 3 Scatter plots of GHNN(i), GHNN(ii), GHNN(iii), and GHNN(iv) models estimated E p (mm day−1) with respect to measured E p (mm day−1) for four AERs

Application of GHNN Models for E p Estimation

The best-performing GHNN(i) models were applied to the 15 model development and 10 model testing locations in the four AERs to test their generalizing capability. The performance indices of the GHNN(i) models for the model development locations are shown in Table 4. The R ratio values (Table 4) suggest that the GHNN(i) model overestimated E p values for the semi-arid and humid regions and underestimated them for the arid and sub-humid regions. The RMSE (mm day−1) values for this model ranged from 0.673 (at Ranichauri) to 3.227 (at Hissar). The performance indices of the GHNN(i) models for the model testing locations are shown in Table 5. The RMSE (mm day−1) values for this model ranged from 1.098 (at Samastipur) to 1.830 (at Bijapur). This indicates that the GHNN models generalize well when estimating E p for locations that were not used in the model development.

Table 4 Performance of GHNN(i)-based E p model for model development locations
Table 5 Performance of GHNN(i)-based E p model for model testing locations

Conclusions

This paper studied the ability of GHNN models, developed for different locations in four AERs of India, to estimate E p. The results illustrated that the GHNN and GFNN models performed much better than the GMLR models, and that the GHNN and GFNN models with input combination (i), which includes all variables as input, performed better than those with the other combinations (ii, iii, and iv) for all AERs. During testing of the generalizing capability for the model development and testing locations, the GHNN models performed better than the GFNN models in all cases. The performance of the generalized models improved as the number of input variables used in E p modeling increased. Overall, the better performance of the GHNN models in comparison to the GFNN and GMLR models in different AERs of India showed that these models not only have better potential but also have good generalizing capability. It may be noted that the main focus of this study was to evaluate the generalizing capability of higher-order neural networks in E p modeling. This study does not intend to replace well-established models. Further studies are required to test the generalizing capability of GHNN models with limited climate data for different climatic regions of other countries.