Introduction

Evapotranspiration is an important issue for the water balance of an irrigated area. Irrigation engineers want to know how much of the supplied irrigation water is consumed by the crops; only then can they estimate, or calculate, the remaining components of the water balance. On the other hand, agriculturists want to know the specific water requirements of a crop, so that they can obtain a satisfactory yield. They also want to know whether these specific water requirements are being met under the prevailing irrigation practices (Kisi 2008).

As described by Brutsaert (1982) and Jensen et al. (1990), numerous methods have been proposed for estimating evapotranspiration. The combination of energy balance/aerodynamic equations generally “provides the most accurate results as a result of their foundation in physics and basis on rational relationships” (Jensen et al. 1990). The Food and Agriculture Organization of the United Nations (FAO) adopted the ET definition of Smith et al. (1997) and accepted the FAO Penman–Monteith equation as the standard for estimating ET (Allen et al. 1998; Naoum and Tsanis 2003; Kisi 2007a).

In recent decades, the application of artificial intelligence techniques (e.g., artificial neural networks) to modeling reference evapotranspiration (ET0) has received much attention (Kumar et al. 2002, 2008, 2009, 2011; Sudheer et al. 2003; Trajkovic et al. 2003; Trajkovic 2005; Kisi and Yildirim 2005a, b; Kisi 2006a, b, 2007a, 2008, 2011a, b; Kisi and Ozturk 2007; Kim and Kim 2008; Jain et al. 2008; Khoob 2008a, b; Landeras et al. 2009; Marti et al. 2011a, b). Kumar et al. (2002) used artificial neural networks (ANN) to estimate ET0. Sudheer et al. (2003) developed a radial basis ANN for predicting ET0 from limited climatic data. Trajkovic et al. (2003) developed a radial basis ANN for forecasting ET0. Trajkovic (2005) applied a temperature-based radial basis ANN to estimate FAO-56 PM ET0. Kisi (2006a) estimated daily ET0 using an ANN and compared the ANN test results with those of the Penman and Hargreaves empirical models. Kisi (2006b) modeled ET0 using generalized regression neural network (GRNN) models. Kisi (2007a) estimated daily ET0 using an ANN, compared the results with those of the Penman, Hargreaves and Turc models, and found that the ANN model performed better than the empirical models. Kisi and Ozturk (2007) modeled FAO-56 PM ET0 with neuro-fuzzy and ANN models using observed climatic variables. Kisi (2008) investigated the accuracy of different ANN methods in modeling ET0. Kim and Kim (2008) estimated alfalfa ET0 using a GRNN model combined with a genetic algorithm. Jain et al. (2008) modeled ET0 using ANN and suggested a procedure for evaluating the effects of the input variables on the output variable from the connection weights of the ANN models. Khoob (2008a) developed an ANN model for estimating monthly ET0 in the Khuzestan plain, Iran, and compared it with the Hargreaves method. The results indicated that the Hargreaves method underestimates and overestimates the FAO-56 PM monthly ET0 values by up to 20 and 37 %, respectively. Khoob (2008b) estimated ET0 from pan evaporation using ANN in a semi-arid environment and reported that the Hargreaves method is poor for regional estimation of ET0. Kumar et al. (2008) developed different ANN models for estimating daily ET0, compared them with conventional methods, and found that the ANN models performed better than the respective conventional methods in estimating FAO-56 PM ET0. Kumar et al. (2009) used ANN to predict ET0 under arid conditions. Landeras et al. (2009) forecasted weekly FAO-56 PM ET0 using ANN and ARIMA models. Kisi (2011a) modeled daily ET0 using a wavelet regression model and compared it with empirical models. Kisi (2011b) used evolutionary feed-forward neural networks for estimating ET0. Marti et al. (2011a) assessed a four-input ANN model for ET0 estimation through data set scanning procedures. Marti et al. (2011b) estimated daily FAO-56 PM ET0 using ANN without local climatic data. Kumar et al. (2011) reviewed the studies on the application of ANN to ET0 modeling. These studies indicated that ANN models estimate ET0 more accurately than the conventional methods. In the present study, the least square support vector machine (LSSVM) is proposed as an alternative to the ANN model for estimating daily FAO-56 PM ET0. To the best of the author's knowledge, no published work has examined the input–output mapping capability of the LSSVM technique in modeling daily ET0.

The main purpose of this study is to investigate the accuracy of the LSSVM approach in modeling daily ET0. The performance of the LSSVM models was compared with that of ANN models and of the commonly used Priestley–Taylor, Hargreaves and Ritchie empirical methods. To the author's knowledge, this study is the first application of LSSVM models to daily ET0 modeling.

Methodology

Least square support vector machine (LSSVM)

The least square support vector machine (LSSVM), first proposed by Suykens and Vandewalle (1999), originated from the support vector machine (SVM) and is a powerful methodology for solving problems in nonlinear classification, function estimation and density estimation (Kumar and Kar 2009). Figure 1 illustrates the procedure of the LSSVM regression algorithm. The LSSVM replaces the inequality constraints of an SVM with a set of equality constraints and uses the sum of squared errors (SSE) as the empirical loss function on the training set. In this way, training reduces to solving a set of linear equations (Zhao et al. 2009; Xiaohui and Xiaoping 2010).

Fig. 1 LSSVM model of evapotranspiration

Consider a time series of given inputs \( x_{i} \) (solar radiation, air temperature, relative humidity and wind speed) and output \( y_{i} \) (evapotranspiration). According to the LSSVM method, the nonlinear function can be represented as

$$ f(x) = w^{T} \varphi (x) + b $$
(1)

where f indicates the relationship between the climatic variables and ET0, w is the m-dimensional weight vector, φ is the mapping function that maps x into the m-dimensional feature vector, and b is the bias term (Shu-gang et al. 2008).

Considering both the complexity of the function and the fitting error, the regression problem can be expressed according to the structural risk minimization principle as

$$ \min J(w,e) = \frac{1}{2}w^{T} w + \frac{\gamma }{2}\sum\limits_{i = 1}^{m} {e_{i}^{2} } $$
(2)

that has the following constraints

$$ y_{i} = w^{T} \varphi (x_{i} ) + b + e_{i} \quad \left( {i = 1,\,2,\, \ldots ,m} \right) $$
(3)

where γ is the margin (regularization) parameter and \( e_{i} \) is the slack variable for \( x_{i} \).

To solve the optimization problem given in Eqs. (2) and (3), the constrained problem is converted into an unconstrained one by introducing the Lagrange multipliers \( \alpha_{i} \), which gives the objective function

$$ L(w,b,e,\alpha ) = J\left( {w,e} \right) - \sum\limits_{i = 1}^{m} {\alpha_{i} \left\{ {w^{T} \varphi (x_{i} ) + b + e_{i} - y_{i} } \right\}} $$
(4)

According to the Karush–Kuhn–Tucker (KKT) conditions, the optimality conditions are obtained by setting the partial derivatives of Eq. (4) with respect to w, b, e and α, respectively, to zero:

$$ \left\{ \begin{gathered} w = \sum\nolimits_{i = 1}^{m} {\alpha_{i} \varphi (x_{i} )} \hfill \\ \sum\nolimits_{i = 1}^{m} {\alpha_{i} } = 0 \hfill \\ \alpha_{i} = \gamma e_{i} \hfill \\ w^{T} \varphi (x_{i} ) + b + e_{i} - y_{i} = 0 \hfill \\ \end{gathered} \right. $$
(5)

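Substituting the first and third conditions of Eq. (5) into the fourth eliminates w and \( e_{i} \) and gives \( \sum\nolimits_{j = 1}^{m} {\alpha_{j} \varphi (x_{j} )^{T} \varphi (x_{i} )} + b + \alpha_{i} /\gamma = y_{i} \). Together with the second condition, the linear equations are obtained as

$$ \left[ {\begin{array}{*{20}c} 0 & {\mathbf{1}^{T} } \\ {\mathbf{1}} & {\Omega + I/\gamma } \\ \end{array} } \right]\left[ {\begin{array}{*{20}c} b \\ \alpha \\ \end{array} } \right] = \left[ {\begin{array}{*{20}c} 0 \\ Y \\ \end{array} } \right] $$
(6)

where \( Y = [y_{1} , \ldots ,y_{m} ]^{T} \), \( \mathbf{1} = [1, \ldots ,1]^{T} \), \( \alpha = [\alpha_{1} , \ldots ,\alpha_{m} ]^{T} \), I is the m × m identity matrix, and Ω is the m × m matrix with entries \( \Omega_{ij} = \varphi (x_{i} )^{T} \varphi (x_{j} ) \).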

By defining kernel function \( K(x,x_{i} ) = \varphi (x)^{T} \varphi (x_{i} ) \), i = 1,…,m, the LSSVM regression becomes

$$ f(x) = \sum\limits_{i = 1}^{m} {\alpha_{i} K(x,x_{i} )} + b $$
(7)

The RBF kernel is a common type of kernel in regression problems and was used in this study. With kernel width σ, it can be expressed as

$$ K(x,x_{i} ) = \exp \left( { - \frac{{\| {x - x_{i} } \|^{2} }}{{2\sigma^{2} }}} \right) $$
(8)
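
As an illustration of Eqs. (5), (7) and (8), the following Python sketch trains an LSSVM regressor by eliminating w and the slack variables from the optimality conditions and solving the resulting linear system for b and α. It assumes NumPy is available; the function names (`rbf_kernel`, `lssvm_fit`, `lssvm_predict`), the parameter values and the synthetic data are illustrative assumptions and not the software or data used in the study.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """RBF kernel matrix, Eq. (8): K[i, j] = exp(-||a_i - b_j||^2 / (2 sigma^2))."""
    sq_dist = (np.sum(A**2, axis=1)[:, None]
               + np.sum(B**2, axis=1)[None, :]
               - 2.0 * A @ B.T)
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * sigma**2))

def lssvm_fit(X, y, gamma, sigma):
    """Solve the linear system implied by Eq. (5) for the bias b and multipliers alpha."""
    m = X.shape[0]
    A = np.zeros((m + 1, m + 1))
    A[0, 1:] = 1.0                                   # row enforcing sum_i alpha_i = 0
    A[1:, 0] = 1.0                                   # column multiplying the bias b
    A[1:, 1:] = rbf_kernel(X, X, sigma) + np.eye(m) / gamma
    rhs = np.concatenate(([0.0], y))
    solution = np.linalg.solve(A, rhs)
    return solution[0], solution[1:]                 # b, alpha

def lssvm_predict(X_train, b, alpha, X_new, sigma):
    """Evaluate Eq. (7): f(x) = sum_i alpha_i K(x, x_i) + b."""
    return rbf_kernel(X_new, X_train, sigma) @ alpha + b

# Synthetic two-input example (hypothetical data, not CIMIS records).
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(50, 2))
y = np.sin(2.0 * np.pi * X[:, 0]) + X[:, 1]
b, alpha = lssvm_fit(X, y, gamma=100.0, sigma=0.5)
y_hat = lssvm_predict(X, b, alpha, X, sigma=0.5)
```

On the training points, the residuals `y - y_hat` equal `alpha / gamma`, which is the third condition of Eq. (5) and offers a quick consistency check on an implementation.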

Case study

In this study, daily climatic data from two automated weather stations operated by the California Irrigation Management Information System (CIMIS) were used: Glendale (latitude 34°11′59″N, longitude 118°13′56″W) and Oxnard (latitude 34°14′01″N, longitude 119°11′49″W). The locations of the stations are shown in Fig. 2. The Glendale Station is located in the Los Angeles Basin Region, Los Angeles County. The Oxnard Station is located in the Central Coast Valleys Region, Ventura County. The elevations are 1,111 and 48 ft for the Glendale and Oxnard stations, respectively. The daily weather data used in the present study are solar radiation, air temperature, relative humidity and wind speed. The total incoming solar radiation is measured using pyranometers at a height of 2.0 m above the ground. Air temperature is measured at a height of 1.5 m above the ground using a thermistor. Relative humidity, expressed as a percentage, is the ratio of the actual amount of water vapor in the atmosphere to the amount the atmosphere can potentially hold at the given air temperature. The relative humidity sensor is sheltered in the same enclosure as the air temperature sensor at 1.5 m above the ground. Wind speed is measured using three-cup anemometers at 2.0 m above the ground (Kisi 2006b). These measured daily climatic data were downloaded from the CIMIS web server (http://wwwcimis.water.ca.gov/cimis/data.jsp).

Fig. 2 The location of the Glendale and Oxnard (CIMIS No: 133) and Oakville (CIMIS No: 156) stations in California

The data sample covers 12 years (1997–2008) and 8 years (2003–2010) of daily records of solar radiation (\( R_{s} \)), air temperature (T), relative humidity (RH) and wind speed (\( U_{2} \)) for the Glendale and Oxnard stations, respectively. For the Glendale Station, the first 6 years of data (1997–2002) were used to train the LSSVM models, the following 3 years (2003–2005) were used for testing, and the remaining data were used for validation. Daily mean temperature records were obtained by averaging the daily maximum and minimum temperature records, that is, a linear variation between the daily minimum and maximum temperatures was assumed. Missing data were removed from the whole data set. The daily statistical parameters of each variable for the entire time series are given in Table 1. Table 1 shows that the statistical properties of the data differ between the Glendale and Oxnard stations; the intention was to compare the models at stations with different climatic conditions. In Table 1, \( x_{\text{mean}} \), \( x_{\min } \), \( x_{\max } \), \( S_{x} \), \( C_{v} \) and \( C_{sx} \) denote the mean, minimum, maximum, standard deviation, coefficient of variation and skewness, respectively. Wind speed shows a skewed distribution at each station (see the \( C_{sx} \) values in Table 1). As the correlation coefficients between \( R_{s} \) and ET0 in Table 1 show, solar radiation is closely correlated with evapotranspiration at each station. Air temperature appears to be the second most closely correlated variable.
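
The summary statistics and the daily mean temperature described above can be sketched in a few lines of Python. This is a minimal illustration that assumes NumPy; the helper names `describe` and `daily_mean_temperature` are hypothetical, and the choice of the sample standard deviation and the moment-based skewness estimator is an assumption, since the text does not state which estimators were used for Table 1.

```python
import numpy as np

def describe(x):
    """Mean, min, max, standard deviation, coefficient of variation and skewness."""
    x = np.asarray(x, dtype=float)
    mean = x.mean()
    std = x.std(ddof=1)                      # sample standard deviation (assumed)
    centered = x - mean
    # Moment-based skewness: third central moment over variance^(3/2) (assumed).
    skew = np.mean(centered**3) / np.mean(centered**2) ** 1.5
    return {"mean": mean, "min": x.min(), "max": x.max(),
            "std": std, "cv": std / mean, "skew": skew}

def daily_mean_temperature(t_max, t_min):
    """Daily mean temperature as the average of the daily maximum and minimum."""
    return (np.asarray(t_max, dtype=float) + np.asarray(t_min, dtype=float)) / 2.0
```

A symmetric series such as `[1, 2, 3, 4, 5]` gives zero skewness, whereas a variable with occasional large values, as reported here for wind speed, gives a positive value.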

Table 1 The daily statistical parameters of each data set for the entire time series

Application and results

The first part of the study focused on the comparison of the LSSVM models with the Priestley–Taylor, Hargreaves and Ritchie empirical models. First, the ET0 values of the Glendale and Oxnard stations were calculated using the FAO-56 PM method as described in Allen et al. (1998):

$$ {\text{ET}}_{0} = \frac{{0.408\Delta (R_{n} - G) + \gamma \frac{900}{T + 273}U_{2} (e_{a} - e_{d} )}}{{\Delta + \gamma (1 + 0.34U_{2} )}} $$
(9)

where ET0 = reference evapotranspiration (mm day⁻¹), Δ = slope of the saturation vapor pressure function (kPa °C⁻¹), \( R_{n} \) = net radiation (MJ m⁻² day⁻¹), G = soil heat flux density (MJ m⁻² day⁻¹), γ = psychrometric constant (kPa °C⁻¹), T = mean air temperature (°C), \( U_{2} \) = average 24-h wind speed at 2 m height (m s⁻¹), \( e_{a} \) = saturation vapor pressure (kPa), and \( e_{d} \) = actual vapor pressure (kPa).
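
Equation (9) can be evaluated directly once the intermediate quantities are known. The short Python sketch below does only that; the function name `fao56_pm_et0` and the input values are illustrative assumptions (not records from the Glendale or Oxnard stations), and the computation of Δ, the net radiation and the vapor pressures from raw weather data is left out.

```python
def fao56_pm_et0(delta, r_n, g, gamma, t_mean, u2, e_a, e_d):
    """FAO-56 Penman-Monteith reference evapotranspiration, Eq. (9), in mm/day.

    delta  : slope of the saturation vapor pressure function (kPa/degC)
    r_n, g : net radiation and soil heat flux density (MJ/m2/day)
    gamma  : psychrometric constant (kPa/degC)
    t_mean : mean air temperature (degC)
    u2     : wind speed at 2 m height (m/s)
    e_a    : saturation vapor pressure (kPa), e_d : actual vapor pressure (kPa)
    """
    radiation_term = 0.408 * delta * (r_n - g)
    aerodynamic_term = gamma * 900.0 / (t_mean + 273.0) * u2 * (e_a - e_d)
    return (radiation_term + aerodynamic_term) / (delta + gamma * (1.0 + 0.34 * u2))

# Illustrative inputs for a warm, humid day (assumed values).
et0 = fao56_pm_et0(delta=0.246, r_n=14.33, g=0.14, gamma=0.0674,
                   t_mean=30.2, u2=2.0, e_a=4.42, e_d=2.85)
# et0 is roughly 5.7 mm/day for these inputs
```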

Then, the inputs \( R_{s} \), T, RH and \( U_{2} \) and the output ET0 values obtained using the FAO-56 PM method were used to calibrate the LSSVM models. The root mean square error (RMSE), mean absolute error (MAE) and squared correlation coefficient (\( R^{2} \)) statistics were used to evaluate the models. They are defined as

$$ {\text{RMSE}} = \sqrt {\frac{1}{N}\sum\limits_{i = 1}^{N} {(x_{i} - y_{i} )^{2} } } $$
(10)

$$ {\text{MAE}} = \frac{1}{N}\sum\limits_{i = 1}^{N} {|x_{i} - y_{i} |} $$
(11)

$$ R^{2} = \frac{{\left( {\sum\nolimits_{i = 1}^{N} {(x_{i} - \bar{x})(y_{i} - \bar{y})} } \right)^{2} }}{{\sum\nolimits_{i = 1}^{N} {(x_{i} - \bar{x})^{2} } \sum\nolimits_{i = 1}^{N} {(y_{i} - \bar{y})^{2} } }} $$
(12)

in which N is the number of data, the overbar denotes the mean of a variable, and \( x_{i} \) and \( y_{i} \) are the predicted and FAO-56 PM ET0 values, respectively.
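
The three statistics in Eqs. (10)–(12) translate directly into code. The following Python sketch assumes NumPy; the function names are illustrative, with `x` holding the predicted values and `y` the FAO-56 PM values as defined above.

```python
import numpy as np

def rmse(x, y):
    """Root mean square error, Eq. (10)."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return float(np.sqrt(np.mean((x - y) ** 2)))

def mae(x, y):
    """Mean absolute error, Eq. (11)."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return float(np.mean(np.abs(x - y)))

def r_squared(x, y):
    """Squared correlation coefficient, Eq. (12)."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    dx, dy = x - x.mean(), y - y.mean()
    return float(np.sum(dx * dy) ** 2 / (np.sum(dx ** 2) * np.sum(dy ** 2)))
```

Because Eq. (12) is a squared correlation, it is insensitive to a constant bias or scale error in the predictions, which is one reason to read it together with the RMSE and MAE.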

In the study, two different LSSVM models were employed. The LSSVM1 model uses four inputs, \( R_{s} \), T, RH and \( U_{2} \). The LSSVM2 model, with two inputs, T and \( R_{s} \), was developed for a valid comparison with the two-parameter Priestley–Taylor, Hargreaves and Ritchie models. The optimum parameters of the LSSVM were determined by minimizing the objective function (the RMSE between the calculated and FAO-56 PM ET0 values). The optimum parameters of the LSSVM models for the Glendale and Oxnard stations are given in Table 2. In this table, LSSVM(100,10) denotes a model with regularization constant = 100 and RBF kernel width = 10.
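
The text does not state how the parameter space was searched. As one possible reading, the sketch below runs a plain grid search over candidate values of the regularization constant and the RBF kernel width and keeps the pair with the lowest RMSE on a held-out portion of the data. The grid values, the helper names (`rbf`, `fit_predict`, `grid_search`), the synthetic data and the use of a held-out set are all assumptions made for illustration; NumPy is assumed to be available.

```python
import itertools
import numpy as np

def rbf(A, B, sigma):
    """RBF kernel matrix between the rows of A and B."""
    sq_dist = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))

def fit_predict(X_train, y_train, X_eval, gamma, sigma):
    """Train an LSSVM regressor and return its predictions on X_eval."""
    m = len(y_train)
    A = np.zeros((m + 1, m + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf(X_train, X_train, sigma) + np.eye(m) / gamma
    solution = np.linalg.solve(A, np.concatenate(([0.0], y_train)))
    b, alpha = solution[0], solution[1:]
    return rbf(X_eval, X_train, sigma) @ alpha + b

def grid_search(X_train, y_train, X_eval, y_eval, gammas, sigmas):
    """Return (rmse, gamma, sigma) for the pair with the lowest evaluation RMSE."""
    best = None
    for gamma, sigma in itertools.product(gammas, sigmas):
        pred = fit_predict(X_train, y_train, X_eval, gamma, sigma)
        err = float(np.sqrt(np.mean((pred - y_eval) ** 2)))
        if best is None or err < best[0]:
            best = (err, gamma, sigma)
    return best

# Synthetic data standing in for (inputs, FAO-56 PM ET0); not station records.
rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, size=(80, 2))
y = 3.0 * X[:, 0] + np.sin(4.0 * X[:, 1])
best_rmse, best_gamma, best_sigma = grid_search(
    X[:60], y[:60], X[60:], y[60:],
    gammas=[1.0, 10.0, 100.0], sigmas=[0.1, 1.0, 10.0])
```

A grid is easy to reproduce, but its cost grows with the number of candidate pairs, and each pair requires solving a linear system whose size equals the number of training samples.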

Table 2 The performance statistics of the models in validation period

The LSSVM models were compared with the Priestley–Taylor, Hargreaves and Ritchie models. The Priestley and Taylor (1972) equation for computing ET0 value is expressed as:

$$ {\text{ET}}_{0} = \frac{\alpha }{\lambda }\frac{\Delta }{\Delta + \gamma }(R_{n} - G) $$
(13)

where ET0 = reference evapotranspiration (mm day⁻¹), α = 1.26, λ = latent heat of vaporization (MJ kg⁻¹), and the other symbols are as defined for Eq. (9) (Gibson et al. 1994).
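
A minimal Python sketch of Eq. (13) follows. The function name and the example inputs are illustrative assumptions. The default λ = 2.45 MJ kg⁻¹ is the value commonly used for temperatures near 20 °C and is also an assumption, since the value applied in the study is not stated.

```python
def priestley_taylor_et0(delta, gamma, r_n, g, alpha=1.26, lam=2.45):
    """Priestley-Taylor reference evapotranspiration, Eq. (13), in mm/day.

    delta : slope of the saturation vapor pressure function (kPa/degC)
    gamma : psychrometric constant (kPa/degC)
    r_n, g: net radiation and soil heat flux density (MJ/m2/day)
    lam   : latent heat of vaporization (MJ/kg); 2.45 is an assumed default
    """
    return (alpha / lam) * (delta / (delta + gamma)) * (r_n - g)

# Illustrative inputs (assumed values, not station data).
et0_pt = priestley_taylor_et0(delta=0.246, gamma=0.0674, r_n=14.33, g=0.14)
```

The equation needs no wind speed or humidity data, which is why it is grouped with the other simple empirical methods in the comparison.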

The Hargreaves formula is one of the simplest equations used for estimating ET0. It is expressed as (Hargreaves and Samani 1985)

$$ {\text{ET}}_{0} = 0.0023R_{a} \left( {\frac{{T_{\max } + T_{\min } }}{2} + 17.8} \right)(T_{\max } - T_{\min } )^{0.5} $$
(14)

where ET0 = reference evapotranspiration (mm day⁻¹), \( T_{\max } \) and \( T_{\min } \) = maximum and minimum temperature (°C), and \( R_{a} \) = extraterrestrial radiation (MJ m⁻² day⁻¹).
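
Equation (14) is short enough to code in one line, but the units of the extraterrestrial radiation need care. In the FAO-56 presentation of this formula, the radiation is expressed as equivalent evaporation (mm day⁻¹), obtained from MJ m⁻² day⁻¹ by multiplying by 0.408. Whether such a conversion was applied here is not stated, so the sketch below takes the radiation value as supplied and leaves the unit choice to the caller. The function name and example values are illustrative assumptions.

```python
import math

def hargreaves_et0(r_a, t_max, t_min):
    """Hargreaves reference evapotranspiration, Eq. (14).

    r_a is used exactly as supplied; pass it in evaporation-equivalent units
    (0.408 * MJ/m2/day) if a result in mm/day is wanted (assumption, see text).
    Requires t_max >= t_min, otherwise math.sqrt raises ValueError.
    """
    t_mean = (t_max + t_min) / 2.0
    return 0.0023 * r_a * (t_mean + 17.8) * math.sqrt(t_max - t_min)

# Illustrative values: r_a assumed to be already in mm/day equivalent.
et0_h = hargreaves_et0(r_a=15.0, t_max=30.0, t_min=14.0)
```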

The Ritchie equation (Jones and Ritchie 1990) is:

$$ {\text{ET}}_{0} = \alpha_{1} \left[ {3.87 \times 10^{ - 3} R_{s} \left( {0.6T_{\max } + 0.4T_{\min } + 29} \right)} \right] $$
(15)

where ET0 = reference evapotranspiration (mm day⁻¹), \( T_{\max } \) and \( T_{\min } \) = maximum and minimum temperature (°C), and \( R_{s} \) = solar radiation (MJ m⁻² day⁻¹). The coefficient \( \alpha_{1} \) depends on \( T_{\max } \) as follows:

$$ \begin{gathered} 5 < T_{\max } \le 35\,^{\circ }{\text{C}}:\quad \alpha_{1} = 1.1 \hfill \\ T_{\max } > 35\,^{\circ }{\text{C}}:\quad \alpha_{1} = 1.1 + 0.05(T_{\max } - 35) \hfill \\ T_{\max } < 5\,^{\circ }{\text{C}}:\quad \alpha_{1} = 0.01\exp [0.18(T_{\max } + 20)] \hfill \\ \end{gathered} $$
(16)
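
The piecewise coefficient in Eq. (16) and the Ritchie estimate in Eq. (15) can be sketched in Python as follows. The function names and example values are illustrative assumptions. Eq. (16) does not cover a maximum temperature of exactly 5 °C; the sketch assigns that case to the middle branch, which is also an assumption.

```python
import math

def ritchie_alpha1(t_max):
    """Temperature-dependent coefficient alpha_1 of Eq. (16)."""
    if t_max > 35.0:
        return 1.1 + 0.05 * (t_max - 35.0)
    if t_max < 5.0:
        return 0.01 * math.exp(0.18 * (t_max + 20.0))
    return 1.1  # 5 <= t_max <= 35 (t_max == 5 assigned here by assumption)

def ritchie_et0(r_s, t_max, t_min):
    """Ritchie reference evapotranspiration, Eq. (15), in mm/day.

    r_s : solar radiation (MJ/m2/day); t_max, t_min in degC.
    """
    return ritchie_alpha1(t_max) * 3.87e-3 * r_s * (0.6 * t_max + 0.4 * t_min + 29.0)

# Illustrative values (assumed, not station data).
et0_r = ritchie_et0(r_s=20.0, t_max=30.0, t_min=14.0)
```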

The LSSVM models were also compared with conventional feed-forward ANN models for daily ET0 estimation. The conjugate gradient algorithm was used to adjust the weights of the ANN models because this technique is more powerful and faster than the conventional gradient descent technique (Kisi and Uncuoglu 2005; Kisi 2007b). Sigmoid activation functions were used for the hidden and output nodes. Because no theory yet indicates how many hidden units are needed to approximate a given function, the optimum number of hidden layer nodes was found for each station by trying various network structures. Training was stopped after 250 iterations, following the suggestion of Kisi and Uncuoglu (2005) and Kisi (2007b). The optimal ANN models for the Glendale and Oxnard stations are given in Table 2. In this table, ANN1(4,5,1) denotes an ANN model comprising 4 input, 5 hidden and 1 output nodes.

The LSSVM models are compared with the ANN, Priestley–Taylor, Hargreaves and Ritchie methods in terms of the RMSE, MAE and \( R^{2} \) statistics for the Glendale and Oxnard stations in Table 2, which also lists the inputs of each model. The LSSVM2, ANN2, Priestley–Taylor, Hargreaves and Ritchie models use the same input variables. Table 2 shows that the LSSVM1 model outperformed all other models in terms of the RMSE, MAE and \( R^{2} \) criteria. The LSSVM2, ANN2, Priestley–Taylor, Hargreaves and Ritchie models are rather simple and consider only T and \( R_{s} \) data. Among the two-parameter models, the LSSVM2 and ANN2 models have almost the same accuracy, and they performed better than the empirical models. Only a slight difference exists between the Hargreaves and Ritchie models, and both perform better than the Priestley–Taylor model. These results agree with those of Karimaldini et al. (2012) and Shiri et al. (2011), who also found that the Hargreaves method is more accurate than the Priestley–Taylor method.

The estimates of each model for the Glendale and Oxnard stations are shown as scatterplots in Figs. 3 and 4. The scatterplots show that the four-input LSSVM1 estimates are closer to the corresponding FAO-56 PM ET0 values than those of the other models. The fit line equations (of the form y = ax + b) and \( R^{2} \) values indicate that the LSSVM1 model performs better than the ANN1 model for both stations. The a and b coefficients of the four-input LSSVM1 model are closer to 1 and 0, respectively, and its \( R^{2} \) value is higher than that of the four-input ANN1 model. The LSSVM2 model also performs better than the two-parameter ANN2 and empirical models. The Priestley–Taylor model has the worst accuracy for both the Glendale and Oxnard stations. This is also confirmed by the RMSE, MAE and \( R^{2} \) values in Table 2.
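
The fit line coefficients and the \( R^{2} \) value quoted for each scatterplot can be obtained with an ordinary least-squares line. The Python sketch below assumes NumPy and assumes that the FAO-56 PM values are on the horizontal axis and the model estimates on the vertical axis, which the text does not state; the function name `fit_line` is hypothetical.

```python
import numpy as np

def fit_line(reference, estimate):
    """Least-squares line estimate = a * reference + b, with its R^2."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    a, b = np.polyfit(reference, estimate, 1)        # slope, intercept
    r = np.corrcoef(reference, estimate)[0, 1]       # Pearson correlation
    return float(a), float(b), float(r ** 2)
```

A model whose estimates match the reference values exactly would give a = 1, b = 0 and \( R^{2} \) = 1, which is why coefficients closer to 1 and 0 are read as better agreement.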

Fig. 3 The FAO-56 PM and estimated ET0 values of the Glendale Station in the validation period

Fig. 4 The FAO-56 PM and estimated ET0 values of the Oxnard Station in the validation period

The total ET0 estimates of the models are compared in Table 3 because of the importance of total ET0 in irrigation management. For the Glendale Station, the LSSVM1 model gave the estimate closest to the total FAO-56 PM ET0 value. The ANN1 and Hargreaves models gave the same estimate and ranked second. For the Oxnard Station, the Ritchie model gave the closest estimate of the total FAO-56 PM ET0 value. The LSSVM1 and ANN1 models gave the same estimate and ranked second.

Table 3 Total estimated evapotranspirations in validation period

For the Glendale Station, the LSSVM1 model gave 980 estimates with a relative error lower than 10 % in the validation period, while the ANN1, ANN2, LSSVM2, Priestley–Taylor, Hargreaves and Ritchie models gave 930, 472, 457, 105, 372 and 361 such estimates, respectively. Furthermore, the LSSVM1 model gave 768 estimates with a relative error lower than 5 %, while the ANN1, ANN2, LSSVM2, Priestley–Taylor, Hargreaves and Ritchie models gave 690, 241, 226, 45, 182 and 180 such estimates, respectively. For the Oxnard Station, the LSSVM1, ANN1, LSSVM2, ANN2, Priestley–Taylor, Hargreaves and Ritchie models gave 618, 580, 320, 307, 193, 203 and 230 estimates with a relative error lower than 10 % in the validation period, and 381, 322, 182, 176, 96, 111 and 119 estimates with a relative error lower than 5 %, respectively. The LSSVM1 model therefore performs better than the other models from the relative error viewpoint. Of the two-parameter models, the ANN2 and LSSVM2 models performed best for the Glendale and Oxnard stations, respectively. For both stations, the Priestley–Taylor model provided worse estimates than the other models from the relative error viewpoint.
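
Counts of this kind can be reproduced with a one-line comparison. The Python sketch below assumes NumPy and assumes that the relative error is the absolute difference between the estimate and the FAO-56 PM value divided by the FAO-56 PM value, a definition the text does not spell out; the function name `count_within` is hypothetical.

```python
import numpy as np

def count_within(estimate, reference, threshold):
    """Number of estimates whose relative error is below the threshold.

    Relative error is taken as |estimate - reference| / |reference| (assumed);
    reference values are assumed to be nonzero.
    """
    estimate = np.asarray(estimate, dtype=float)
    reference = np.asarray(reference, dtype=float)
    relative_error = np.abs(estimate - reference) / np.abs(reference)
    return int(np.sum(relative_error < threshold))

# Small made-up example: relative errors are 2.5 %, 12 %, 8.3 % and 25 %.
reference = [4.0, 5.0, 6.0, 2.0]
estimate = [4.1, 5.6, 6.5, 2.5]
n_below_10 = count_within(estimate, reference, 0.10)
n_below_5 = count_within(estimate, reference, 0.05)
```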

Concluding remarks

The accuracy of the LSSVM method for estimating reference evapotranspiration from climatic variables was investigated in the present study. LSSVM models were tested and validated using daily climatic data from the Glendale and Oxnard stations to estimate the ET0 obtained using the FAO-56 Penman–Monteith equation. In the first part of the study, the LSSVM models were compared with the Priestley–Taylor, Hargreaves and Ritchie empirical methods. The LSSVM1 model, whose inputs are \( R_{s} \), T, RH and \( U_{2} \), was found to perform better than the empirical models in estimating FAO-56 PM ET0. LSSVM2 models with only two inputs, \( R_{s} \) and T, were also developed and compared with the two-parameter Priestley–Taylor, Hargreaves and Ritchie models because, in some areas (e.g., developing countries), solar radiation and air temperature may be the only available data owing to the difficulty of obtaining relative humidity and wind speed data. The comparison revealed that the LSSVM2 models were superior to the two-parameter empirical models. In total ET0 estimation, the LSSVM1 model performed better than the Priestley–Taylor, Hargreaves and Ritchie models for the Glendale Station. For the Oxnard Station, however, the Ritchie method gave the closest estimate of the total FAO-56 PM ET0 value, and the LSSVM1 model ranked second. The LSSVM2 models can be used to estimate FAO-56 PM ET0 where only \( R_{s} \) and T data exist. Of the two-parameter empirical models, the Priestley–Taylor model was found to perform worse than the Hargreaves and Ritchie models. In the second part of the study, the LSSVM models were compared with conventional feed-forward ANN models. The comparison revealed that the LSSVM models were superior to the ANN models in estimating FAO-56 PM ET0.

It should be noted that the LSSVM models used in the present study are site-specific because the selected stations are located in southern California. Researchers should base all calculations on their local conditions.