Comparison between M5 Model Tree and Neural Networks for Estimating Reference Evapotranspiration in an Arid Environment

Rahimikhoob, Ali

doi:10.1007/s11269-013-0506-x

Published: 07 January 2014

Volume 28, pages 657–669, (2014)

Water Resources Management

Abstract

This paper describes a detailed evaluation of the performance and characteristic behaviour of a feed-forward artificial neural network (ANN) and an M5 model tree for estimating reference evapotranspiration (ET₀) at four meteorological sites in an arid climate. The input variables for these models were the maximum and minimum air temperature, air humidity and extraterrestrial radiation. The FAO-56 Penman–Monteith (FAO PM) model was used as the reference model for assessing the performance of the two approaches. The results showed that the ANN estimated ET₀ better than the M5 model tree, but both models performed well for the study area and yielded results close to the FAO PM method. The root mean square error and R² for the comparison between reference and estimated ET₀ on the test data were 5.6 % and 0.98, respectively, for the proposed ANN model, and 8.9 % and 0.98, respectively, for the M5 model tree. The overall results are of significant practical use because the temperature- and humidity-based model can be used when radiation and wind speed data are not available.

1 Introduction

Reference evapotranspiration (ET₀) is a key parameter in water resources management and environmental assessment in general, and in irrigation scheduling in particular. A large number of methods have been developed for assessing ET₀ from meteorological data. The Food and Agriculture Organization (FAO) recommends the FAO-56 Penman–Monteith (FAO PM) equation as the sole method for estimating ET₀ wherever the required input data are available (Allen et al. 1998; Droogers and Allen 2002). This method is physically based and has been shown, against lysimeter data from a wide range of climate conditions, to estimate ET₀ accurately (Allen et al. 1994; Itenfisu et al. 2000). It requires measurements of air temperature, relative humidity, solar radiation and wind speed. However, these climatic variables are not always measured at weather stations. Although temperature and humidity are routinely measured, solar radiation and wind speed data are rarely available in many parts of the world, even in developed countries. Under these conditions, simplified models that do not require solar radiation and wind speed data should be considered. Estimating ET₀ is a complex nonlinear problem because ET₀ depends on several interacting climatic factors. More recently, alternative approaches based on artificial neural networks (ANNs) and M5 model trees have been suggested as reliable estimation models for various engineering applications. The main advantage of these approaches over conventional methods is that they do not need detailed information on the physical processes of the system.

ANNs are effective tools for modelling nonlinear systems and systems that are difficult to formalize. In recent years, neural network methods have been employed to estimate ET₀ as a function of climatic variables. Some studies used the same climatic data required by the FAO PM method (Odhiambo et al. 2001; Kumar et al. 2002; Trajkovic et al. 2003). These researchers reported that the ANN can predict ET₀ even better than the conventional FAO PM method. Sudheer et al. (2003) and Zanetti et al. (2007) simplified the input variables and estimated ET₀ as a function of air temperature, extraterrestrial solar radiation and daylight hours, with satisfactory results. Chauhan and Shrivastava (2009) compared the performance of four climate-based methods and ANNs for estimating ET₀ in India when the available climatic data are insufficient to apply the FAO PM method. They concluded that the ANN models performed better than the climate-based methods. In another study, Rahimikhoob (2010) applied the ANN technique to estimate ET₀ from air temperature data under humid subtropical conditions on the southern coast of the Caspian Sea in the north of Iran. He showed that the ANN successfully estimated daily ET₀ and simulated ET₀ better than the conventional Hargreaves equation.

Recently, M5 model trees have been used successfully for flood forecasting (Solomatine and Xue 2004), water level–discharge relationships (Bhattacharya and Solomatine 2005), rainfall-runoff modelling (Solomatine and Dulal 2003), sedimentation modelling (Bhattacharya and Solomatine 2006), and estimation of ET₀ (Pal and Deswal 2009). Pal and Deswal (2009) investigated the potential of an M5 model tree based regression approach to model daily ET₀ using four inputs: solar radiation, average air temperature, average relative humidity, and average wind speed. Their results suggested that the M5 model tree could successfully be employed in modelling ET₀. In other research, Sattari et al. (2013a) compared the performance of an M5 model tree and a support vector machine in predicting daily stream flows in the River Sohu, located within the municipal borders of Ankara, Turkey. They found that the M5 model tree performed better than the support vector machine. Two further studies compared the ANN and M5 model tree techniques for the assessment of ET₀ in two different countries, the first (Sattari et al. 2013b) in Ankara (Turkey) and the second (Sattari et al. 2013c) in Bonab (northwestern Iran). In both cases, the ANN model estimated ET₀ better than the M5 model tree. The M5 model tree was nevertheless considered appropriate because it provides simple linear relations.

The purpose of the research reported in this article was to compare an ANN model and an M5 model tree for estimating monthly ET₀ in an arid environment of Iran. Since maximum and minimum air temperature and relative humidity records are more readily available around the globe, these records, together with extraterrestrial radiation, were used as inputs to both models. Extraterrestrial radiation reflects the seasonality of ET₀ and can be calculated theoretically as a function of the local latitude and the Julian day, according to the equations presented by Allen et al. (1998). Thus, for the models proposed in this study, temperature and relative humidity are the only parameters that require observation. Here, the FAO PM method was used as a substitute for measured ET₀ data, as this is the standard procedure when no measured lysimeter data are available (Irmak et al. 2003; Utset et al. 2004). Although in practice the best way to test the above-mentioned methods would be to compare their estimates against lysimeter-measured data, this type of data set is not available in the study area.

2 Materials and Methods

2.1 Study Area and Climate Dataset

The area under study was Sistan and Baluchestan province, which lies between latitudes 25.0°N and 31.5°N and between longitudes 58.8°E and 63.3°E. The province is in the south-east of Iran, borders Pakistan, Afghanistan and the Oman Sea, and covers an area of 181,578 km². On the basis of the Köppen climate classification, the climate is arid, with an average annual precipitation of about 112 mm.

2.2 Data Description

Monthly meteorological data were obtained for January 1998 through December 2007 (10 years, 120 months) from four weather stations in the study area with varying latitudes, longitudes and elevations. The annual average weather data of the meteorological stations are presented in Table 1. The stations belong to the Iran Meteorological Organization, and their spatial distribution within the province is shown in Fig. 1. Five monthly meteorological variables were recorded: (1) mean maximum air temperature (T_x, °C); (2) mean minimum air temperature (T_n, °C); (3) mean wind speed (U, m s⁻¹); (4) mean relative humidity (RH, %); and (5) bright sunshine hours (n, h). Measurements were made at a height of 2 m (air temperature and relative humidity) and 10 m (wind speed) above the soil surface. Wind speed data at 2 m (U₂) were obtained from those taken at 10 m using the log-wind profile equation. All measurements were made daily according to Iran Meteorological Organization procedures, with monthly data averaged from daily data as appropriate. Mean measured monthly T_x, T_n, RH and U₂ for the four meteorological stations used in the study, over 10 years, are presented in Figs. 2, 3, and 4.
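The height adjustment of wind speed can be sketched as follows. The text only names "the log-wind profile equation"; the specific form used here (the logarithmic profile given in FAO-56, u₂ = u_z · 4.87 / ln(67.8 z − 5.42)) is an assumption, as are the function name and the sample input.

```python
import math


def wind_speed_at_2m(u_z: float, z: float = 10.0) -> float:
    """Convert wind speed u_z (m/s) measured at height z (m) to 2 m.

    Assumption: FAO-56 logarithmic wind profile; the source does not
    state which log-profile form was applied.
    """
    return u_z * 4.87 / math.log(67.8 * z - 5.42)


# Hypothetical monthly mean wind speed measured at 10 m (m/s).
u2 = wind_speed_at_2m(3.2)
print(f"U2 = {u2:.2f} m/s")
```

With this form, a 10 m reading is scaled by a factor of roughly 0.75 to obtain the 2 m value.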

Table 1 Mean annual meteorological parameters averaged over 10 years for weather stations used in this study

To train the ANN and the M5 model tree, the whole data set of the four stations (480 patterns, from 1998 to 2007) was pooled into one group to produce a model with a higher regional capacity that could be applied to estimate ET₀ at different locations in Sistan and Baluchestan province. This data set was divided into two parts: the first part (336 patterns, from 1998 to 2004) was used for training, and the second part (144 patterns, from 2005 to 2007) was used for testing the trained model.
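The pattern counts follow directly from the pooling scheme (4 stations × 12 months × number of years). A minimal sketch of the chronological split, using placeholder station labels rather than the real station names:

```python
from itertools import product

# Placeholder labels; the actual station names are not used here.
STATIONS = ["station_1", "station_2", "station_3", "station_4"]
YEARS = range(1998, 2008)   # 1998..2007 inclusive
MONTHS = range(1, 13)

# One pattern per (station, year, month) record.
patterns = list(product(STATIONS, YEARS, MONTHS))

# Chronological split: 1998-2004 for training, 2005-2007 for testing.
train = [p for p in patterns if p[1] <= 2004]
test = [p for p in patterns if p[1] >= 2005]

print(len(patterns), len(train), len(test))  # 480 336 144
```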

2.3 The FAO PM Method

The following equation was applied for the FAO PM (Allen et al. 1998):

$$ {\mathrm{ET}}_0\mathrm{PM}=\frac{0.408\varDelta \left({\mathrm{R}}_{\mathrm{n}}-\mathrm{G}\right)+\gamma \frac{900}{{\mathrm{T}}_{\mathrm{a}}+273}{\mathrm{U}}_2\left({\mathrm{e}}_{\mathrm{s}}-{\mathrm{e}}_{\mathrm{a}}\right)}{\varDelta +\gamma \left(1+0.34{\mathrm{U}}_2\right)} $$

(1)

where ET₀ PM is the reference crop evapotranspiration calculated using the FAO PM method (mm d⁻¹), R_n is the daily net radiation (MJ m⁻² d⁻¹), G is the daily soil heat flux (MJ m⁻² d⁻¹), T_a is the mean daily air temperature at a height of 2 m (°C), U₂ is the daily mean wind speed at a height of 2 m (m s⁻¹), e_s is the saturation vapor pressure (kPa), e_a is the actual vapor pressure (kPa), ∆ is the slope of the saturation vapor pressure versus air temperature curve (kPa °C⁻¹), and γ is the psychrometric constant (kPa °C⁻¹). The two terms in the numerator on the right-hand side of the equation are the radiation term and the aerodynamic term, respectively.
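As a minimal sketch, Eq. 1 can be written as a single function once the intermediate quantities (∆, γ, R_n, e_s, e_a) are known. The function name and the numeric inputs below are illustrative assumptions and are not taken from the study's data.

```python
def et0_penman_monteith(delta: float, r_n: float, g: float, gamma: float,
                        t_a: float, u_2: float, e_s: float, e_a: float) -> float:
    """FAO-56 Penman-Monteith reference evapotranspiration (mm/day), Eq. 1.

    delta : slope of saturation vapor pressure curve (kPa/degC)
    r_n   : net radiation (MJ m^-2 d^-1)
    g     : soil heat flux (MJ m^-2 d^-1)
    gamma : psychrometric constant (kPa/degC)
    t_a   : mean air temperature at 2 m (degC)
    u_2   : wind speed at 2 m (m/s)
    e_s, e_a : saturation and actual vapor pressure (kPa)
    """
    radiation_term = 0.408 * delta * (r_n - g)
    aerodynamic_term = gamma * 900.0 / (t_a + 273.0) * u_2 * (e_s - e_a)
    return (radiation_term + aerodynamic_term) / (delta + gamma * (1.0 + 0.34 * u_2))


# Illustrative inputs only (not from the paper); G is set to zero as in the text.
et0 = et0_penman_monteith(delta=0.122, r_n=13.28, g=0.0, gamma=0.0666,
                          t_a=16.9, u_2=2.078, e_s=1.997, e_a=1.409)
print(f"ET0 = {et0:.2f} mm/day")
```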

In this study, the daily values of ∆, R_n, e_s and e_a were calculated using the equations given by Allen et al. (1998). For R_n, an albedo of 0.23 (green vegetation surface) was used. Since G is usually small compared with R_n and is difficult to measure, it was assumed to be zero over the calculation time step period (daily and monthly) (Allen et al. 1998). The measured RH, T_x and T_n values were used to calculate e_a and e_s. The daily solar or shortwave radiation (R_s) was calculated using the Angstrom formula, which relates solar radiation to extraterrestrial radiation (R_a) and relative sunshine duration. Eq. (39) in Allen et al. (1998) was used to calculate the net outgoing longwave radiation. R_a (MJ m⁻² d⁻¹), was calculated from the following equation (Allen et al. 1998):

$$ {\mathrm{R}}_{\mathrm{a}}=\frac{24(60)}{\pi }{G}_{SC}{d}_r\left[{\omega}_S \sin \left(\varphi \right) \sin \left(\delta \right)+ \cos \left(\varphi \right) \cos \left(\delta \right) \sin \left({\omega}_S\right)\right] $$

(2)

where G_SC is the solar constant (0.0820 MJ m⁻² min⁻¹), d_r is the inverse relative distance between Earth and Sun (Eq. 3), ω_s is the sunset hour angle (Eq. 4; radians), φ is the latitude of the site (radians), and δ is the solar declination (Eq. 5; radians).

$$ {\mathrm{d}}_{\mathrm{r}}=1+0.033\kern0.5em \cos \left(2\pi /365\times \mathrm{J}\right) $$

(3)

$$ {\omega}_{\mathrm{s}}=\arccos \left[- \tan \left(\varphi \right) \tan \left(\delta \right)\right] $$

(4)

$$ \delta =0.409\kern0.5em \sin \left\{\left(2\pi /365\times \mathrm{J}\right)-1.39\right\} $$

(5)

where J is the number of the day in the year (1 for 1 January to 365 or 366 for 31 December).
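Equations 2 to 5 need only the latitude and the day of year, so R_a can be computed without any measurement. The sketch below implements them directly; the Angstrom step for R_s is added for context, and its coefficients (a_s = 0.25, b_s = 0.50, the default values suggested in FAO-56) are an assumption because the text does not state which coefficients were used. Function names and the sample site are hypothetical.

```python
import math

G_SC = 0.0820  # solar constant, MJ m^-2 min^-1


def extraterrestrial_radiation(latitude_deg: float, day_of_year: int) -> float:
    """Daily extraterrestrial radiation R_a (MJ m^-2 d^-1), Eqs. 2-5."""
    phi = math.radians(latitude_deg)                   # latitude in radians
    angle = 2.0 * math.pi / 365.0 * day_of_year
    d_r = 1.0 + 0.033 * math.cos(angle)                # Eq. 3
    delta = 0.409 * math.sin(angle - 1.39)             # Eq. 5
    omega_s = math.acos(-math.tan(phi) * math.tan(delta))  # Eq. 4
    return (24.0 * 60.0 / math.pi) * G_SC * d_r * (
        omega_s * math.sin(phi) * math.sin(delta)
        + math.cos(phi) * math.cos(delta) * math.sin(omega_s)
    )                                                  # Eq. 2


def solar_radiation(sunshine_hours: float, daylight_hours: float, r_a: float,
                    a_s: float = 0.25, b_s: float = 0.50) -> float:
    """Angstrom formula: R_s from relative sunshine duration n/N.

    a_s and b_s are assumed default coefficients, not values from the study.
    """
    return (a_s + b_s * sunshine_hours / daylight_hours) * r_a


# Hypothetical site at 29.5 deg N (inside the province's latitude range), mid-July.
r_a = extraterrestrial_radiation(29.5, 196)
print(f"Ra = {r_a:.1f} MJ m^-2 d^-1")
```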

2.4 Artificial Neural Network (ANN)

In this study, an ANN of the multilayer perceptron (MLP) type with one input layer, one hidden layer and one output layer was used for estimating ET₀ from the temperature, humidity and extraterrestrial radiation data. MLP networks consist of units (neurons) arranged in layers (input, hidden and output layer) with only forward connections to units in subsequent layers. The number of nodes in the input and the output layers depends on the number of input and output variables, respectively. The performance of the ANN depends on the number of nodes in the hidden layer. Because no specific guidelines exist for choosing the optimum number of hidden nodes for a given problem, this network parameter is often optimized using a combination of empirical rules and trial and error. Figure 2 shows the general layout of a three-layer neural network used in this study. In this structure, there are four neurons in the input layer (representing the T_x, T_n, RH and R_a variables), i neurons in a single hidden layer, and one neuron in the output layer (representing the ET₀).

A neural network performs a particular function by adjusting the weights of the connections between its elements. Each connection has a corresponding weight. The processing element consists of two parts. The first part simply sums the weighted inputs and the bias; the second part is essentially a nonlinear filter, usually called the transfer function or activation function. The activation function acts as a squashing function, such that the output of a neuron lies between certain values (usually 0 and 1, or −1 and 1). This process is illustrated in Fig. 3. In this paper, the log-sigmoid activation function is used for both the hidden layer and the output layer. It is the most commonly used activation function: a continuous function that varies gradually between two asymptotic values, 0 and 1, and is defined as follows:

$$ {\mathrm{y}}_{\mathrm{k}}=\frac{1}{1+ \exp \left(-{\nu}_{\mathrm{k}}\right)} $$

(6)

where ν_k and y_k denote the weighted sum of inputs to the kth hidden neuron and the output from that neuron, respectively. Training an MLP network involves finding the values of the connection weights and biases that minimize an error function between the actual network output and the corresponding target values in the training set. In this study, a backpropagation (BP) algorithm was employed to train the MLP neural network. Levenberg–Marquardt (LM), a second-order nonlinear optimization technique, was chosen from the various BP training algorithms available. The LM algorithm is widely applied in many different domains and has been reported to be faster and to produce better results than other training methods (Hagan and Menhaj 1994; Tan and van Cauwenberghe 1999). In some cases, however, the BP algorithm may become trapped in a local minimum, and the initial values of the weights affect whether this happens. Therefore, in this research the weights were reinitialized and the networks retrained several times to increase the likelihood of reaching the global minimum.
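The forward pass of such a network can be sketched in a few lines. The example below assumes a 4-6-1 layout (four inputs, six hidden nodes, one output) with the log-sigmoid of Eq. 6 in both layers; the weights are random placeholders rather than trained values, the input vector is hypothetical, and Levenberg–Marquardt training itself is not shown.

```python
import math
import random


def log_sigmoid(v: float) -> float:
    """Log-sigmoid activation of Eq. 6; output lies in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def mlp_forward(x, w_hidden, b_hidden, w_out, b_out):
    """One forward pass through a single-hidden-layer MLP."""
    # Each hidden neuron: weighted sum of inputs plus bias, then activation.
    hidden = [
        log_sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
        for weights, bias in zip(w_hidden, b_hidden)
    ]
    # Output neuron applies the same activation to the hidden outputs.
    return log_sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)


random.seed(0)
n_inputs, n_hidden = 4, 6  # assumed layout: T_x, T_n, RH, R_a -> ET0

# Placeholder (untrained) weights and biases.
w_hidden = [[random.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(n_hidden)]
b_hidden = [random.uniform(-1, 1) for _ in range(n_hidden)]
w_out = [random.uniform(-1, 1) for _ in range(n_hidden)]
b_out = random.uniform(-1, 1)

x = [0.8, 0.6, 0.2, 0.9]  # hypothetical inputs already scaled to 0-1
y = mlp_forward(x, w_hidden, b_hidden, w_out, b_out)
print(f"normalized ET0 estimate: {y:.3f}")
```

Because the output activation is bounded by 0 and 1, the target values must be scaled to that range before training and scaled back afterwards.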

Generalization is the quality of neural networks that is sought following supervised learning. It is the ability to provide accurate output values for input variables that have not been seen by the network (Atkinson and Tatnall 1997). Lack of generalization is caused by overfitting: the network has memorized the training examples but has not learned to generalize to new situations. The most common technique to circumvent overfitting is an early stopping criterion that halts training before convergence (Sarle 1995; Prechelt 1998). Here, the LM algorithm was used with an early stopping criterion to improve the network training speed and efficiency. The accuracy of the networks was evaluated at each training epoch through the mean squared error. For the criterion, all the data were divided into three sets (Coulibaly et al. 2000). The first set is the training set for determining the weights and biases of the network. The second set is the validation set for evaluating the weights and biases and for deciding when to stop training. The validation error normally decreases at the beginning of the training process. When the network starts to overfit the data, the validation error begins to increase. Training is stopped at that point, and the weights and biases at the minimum validation error are retained. The last data set is for testing the weights and biases to verify the effectiveness of the stopping criterion and to estimate the expected network performance on new data.
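The stopping rule can be illustrated with a small helper that scans a sequence of per-epoch validation errors and returns the epoch whose weights would be kept. The `patience` parameter (how many non-improving epochs to tolerate) and the error values are illustrative assumptions; the text does not specify the tolerance used.

```python
def early_stopping_epoch(val_errors, patience: int = 1) -> int:
    """Return the index of the epoch with the retained (best) weights.

    Training stops once the validation error has failed to improve for
    `patience` consecutive epochs; the best epoch seen so far is returned.
    """
    best_epoch, best_error, waited = 0, float("inf"), 0
    for epoch, error in enumerate(val_errors):
        if error < best_error:
            best_epoch, best_error, waited = epoch, error, 0
        else:
            waited += 1
            if waited >= patience:
                break  # validation error is rising: stop training
    return best_epoch


# Hypothetical validation MSE per epoch: falls, then rises as overfitting starts.
val_mse = [0.90, 0.52, 0.31, 0.24, 0.22, 0.23, 0.27, 0.35]
print(early_stopping_epoch(val_mse))  # 4
```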

To reflect the seasonality of ET₀, extraterrestrial radiation was selected as an input variable to the ANN. Therefore, maximum and minimum air temperature and relative humidity, together with extraterrestrial radiation, were employed as input variables. R_a was calculated as a function of the local latitude and the Julian day, according to the equations presented by Allen et al. (1998). Thus, the proposed model only needs the measured values of maximum and minimum air temperature and relative humidity for estimating ET₀. The data from 1998 to 2004 for each station were pooled into one set of 336 patterns to train the network. This set was divided at random, with 70 % reserved to train the ANN and 30 % used to validate the training. After the training process, the remaining data for each station (2005 to 2007) were used to test the network. The test data set had a total of 144 patterns that were not used for training. As the purpose of this study was the estimation of ET₀, the ANN has only one output variable. The computed daily ET₀ values from Eq. 1 were used as the target output.

The performance of the ANN depends on the number of hidden layers and the number of nodes in each hidden layer. In general, neural networks with one hidden layer containing a sufficiently large number of hidden nodes have been shown to be capable of providing accurate approximations to any continuous nonlinear function (Hornik et al. 1989).

However, neural networks with a large number of hidden nodes may lead to overfitting of data, resulting in network models with poor predictive capability. It is thus of great importance to select an appropriate number of hidden nodes. Because no specific guidelines exist for choosing the optimum number of hidden nodes for a given problem, this network parameter is often optimized according to some empirical rules combined with trial and error.

For consistency with the range of the activation function, all source data were normalized to the range 0.0–1.0 and then converted back to their original scale after the simulation, using:

$$ {\mathrm{X}}_{\mathrm{norm}}=\frac{\mathrm{X}-{\mathrm{X}}_{\min }}{{\mathrm{X}}_{\max }-{\mathrm{X}}_{\min }} $$

(7)

where X_norm is the normalized value, X is the original value, and X_min and X_max are the minimum and maximum of the original values, respectively.
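A minimal sketch of Eq. 7 and its inverse is shown below. The function names and the sample temperatures are hypothetical; in practice X_min and X_max would be taken from the data set used to build the model.

```python
def normalize(x: float, x_min: float, x_max: float) -> float:
    """Eq. 7: scale x to the range 0-1."""
    return (x - x_min) / (x_max - x_min)


def denormalize(x_norm: float, x_min: float, x_max: float) -> float:
    """Inverse of Eq. 7: return a normalized value to its original scale."""
    return x_norm * (x_max - x_min) + x_min


# Hypothetical monthly maximum temperatures (degC).
t_x = [18.5, 24.0, 31.2, 38.7, 42.1]
t_min, t_max = min(t_x), max(t_x)

scaled = [normalize(v, t_min, t_max) for v in t_x]
restored = [denormalize(v, t_min, t_max) for v in scaled]
print(scaled[0], scaled[-1])  # 0.0 1.0
```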

2.5 M5 Model Tree

The other method used in this study to estimate ET₀ from the temperature and relative humidity data is the M5 model tree, first presented by Quinlan (1992). The model is based on a binary decision tree with linear regression functions at the terminal (leaf) nodes, which develops a relationship between independent and dependent variables. Unlike a conventional decision tree, which is used for categorical data, a model tree can also be used for quantitative data (Quinlan 1992; Mitchell 1997). Generating an M5 model tree requires two stages (Quinlan 1992; Solomatine and Xue 2004). The first stage involves splitting the data into subsets to create a decision tree. The splitting criterion treats the standard deviation of the class values that reach a node as a measure of the error at that node, and calculates the expected reduction in this error as a result of testing each attribute at that node. The standard deviation reduction (SDR) is computed as follows (Pal and Deswal 2009):

$$ \mathrm{SDR}=\mathrm{sd}\left(\mathrm{T}\right)-{\displaystyle \sum \frac{\left|{\ \mathrm{T}}_{\mathrm{i}}\right|}{\left|\ \mathrm{T}\ \right|}\mathrm{sd}\left({\mathrm{T}}_{\mathrm{i}}\right)} $$

(8)

where T denotes the set of examples that reaches the node, T_i denotes the subset of examples that have the ith outcome of the potential split, and sd denotes the standard deviation (Wang and Witten 1997). As a result of the splitting process, the standard deviation of the data in the child (lower) nodes is less than that at the parent node. After all possible splits have been examined, the one that maximizes the expected error reduction is chosen. However, this division often produces a large tree-like structure, which may cause overfitting or poor generalization. To overcome this problem, in the second stage the overgrown tree is pruned and the pruned sub-trees are replaced with linear regression functions. This technique of generating the model tree substantially increases the accuracy of estimation (Quinlan 1992). Figure 5a shows the splitting of the input space X1 × X2 (independent variables) into six subspaces (leaves) by the M5 model tree algorithm. A linear regression function is built at each leaf, labeled LM1 through LM6. Figure 5b shows the same relations in the form of a tree diagram, in which LM1 to LM6 are at the leaf level. Further details of the M5 model tree can be found in Quinlan (1992).
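The split selection based on Eq. 8 can be sketched for a single attribute as follows. This is an illustration under stated assumptions, not the Weka implementation: it uses the population standard deviation, places candidate thresholds midway between consecutive sorted attribute values, and works on toy data.

```python
from statistics import pstdev


def sdr(parent, subsets) -> float:
    """Standard deviation reduction of Eq. 8 for one candidate split."""
    n = len(parent)
    return pstdev(parent) - sum(len(s) / n * pstdev(s) for s in subsets)


def best_split(xs, ys):
    """Return (threshold, SDR) of the binary split on xs that maximizes SDR."""
    pairs = sorted(zip(xs, ys))
    best_threshold, best_gain = None, float("-inf")
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate identical attribute values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2.0
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        gain = sdr(ys, [left, right])
        if gain > best_gain:
            best_threshold, best_gain = threshold, gain
    return best_threshold, best_gain


# Toy data: the target jumps between x = 3 and x = 4.
xs = [1, 2, 3, 4, 5, 6]
ys = [1, 2, 3, 10, 11, 12]
threshold, gain = best_split(xs, ys)
print(threshold, round(gain, 3))
```

For these toy values the split falls between the two clusters of target values, which is where the within-subset standard deviations are smallest.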

To compare the ANN and M5 model tree methods, the same climatic data used for the ANN method were selected as input variables of the M5 model tree. Therefore, the maximum and minimum air temperatures, relative humidity and extraterrestrial radiation were adopted as input variables for the M5 model. The data used to train and test the neural network were also used to create and test the M5 model tree. Thus, the ET₀ estimates of the M5 model tree (ET₀ M5) can be compared with those of the neural network (ET₀ ANN). The Weka software (Witten and Frank 2005) was used to create the M5 model tree from the training data set.

The performance of the ANN and M5 models was checked with three statistical indices: determination coefficient (R²), mean bias error (MBE) and root mean square error (RMSE). To ease the comparison, both MBE and RMSE indices are normalized and expressed as percentages of the mean observed ET₀ (calculated with the FAO PM method) value. These indices were defined as follows:

$$ {\mathrm{R}}^2=\frac{{\left[{\displaystyle \sum_{\mathrm{i}=1}^{\mathrm{N}}\left({\mathrm{P}}_{\mathrm{i}}-\overline{\mathrm{P}}\right)\left({\mathrm{O}}_{\mathrm{i}}-\overline{\mathrm{O}}\right)}\right]}^2}{{\displaystyle \sum_{\mathrm{i}=1}^{\mathrm{N}}{\left({\mathrm{P}}_{\mathrm{i}}-\overline{\mathrm{P}}\right)}^2{\displaystyle \sum_{\mathrm{i}=1}^{\mathrm{N}}{\left({\mathrm{O}}_{\mathrm{i}}-\overline{O}\right)}^2}}} $$

(9)

$$ \mathrm{MBE}=\frac{{\displaystyle \sum_{\mathrm{i}=1}^{\mathrm{N}}\left({\mathrm{P}}_{\mathrm{i}}-{\mathrm{O}}_{\mathrm{i}}\right)}}{\mathrm{N}\overline{\mathrm{O}}}\times 100 $$

(10)

$$ \mathrm{RMSE}=\frac{\sqrt{\frac{1}{\mathrm{N}}{\displaystyle \sum_{\mathrm{i}=1}^{\mathrm{N}}{\left({\mathrm{P}}_{\mathrm{i}}-{\mathrm{O}}_{\mathrm{i}}\right)}^2}}}{\overline{O}}\times 100 $$

(11)

where N is the number of observations, P_i is the estimated ET₀ (using the ANN and M5 methods), O_i is the observed ET₀, and $ \overline{\mathrm{P}} $ and $ \overline{\mathrm{O}} $ are the average values of P_i and O_i, respectively.
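Equations 9 to 11 translate directly into code. The sketch below uses hypothetical ET₀ values in which every estimate is 10 % above the reference, so the bias is easy to check by hand; the function names are illustrative.

```python
import math


def r_squared(pred, obs) -> float:
    """Eq. 9: squared correlation between estimated and observed values."""
    p_bar = sum(pred) / len(pred)
    o_bar = sum(obs) / len(obs)
    cov = sum((p - p_bar) * (o - o_bar) for p, o in zip(pred, obs))
    ss_p = sum((p - p_bar) ** 2 for p in pred)
    ss_o = sum((o - o_bar) ** 2 for o in obs)
    return cov ** 2 / (ss_p * ss_o)


def mbe_percent(pred, obs) -> float:
    """Eq. 10: mean bias error as a percentage of the mean observed value."""
    o_bar = sum(obs) / len(obs)
    return sum(p - o for p, o in zip(pred, obs)) / (len(obs) * o_bar) * 100.0


def rmse_percent(pred, obs) -> float:
    """Eq. 11: root mean square error as a percentage of the mean observed value."""
    o_bar = sum(obs) / len(obs)
    mse = sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs)
    return math.sqrt(mse) / o_bar * 100.0


# Hypothetical values (mm/day): estimates are uniformly 10 % too high.
observed = [2.0, 4.0, 6.0]
estimated = [2.2, 4.4, 6.6]
print(f"R2 = {r_squared(estimated, observed):.2f}")
print(f"MBE = {mbe_percent(estimated, observed):.1f} %")
print(f"RMSE = {rmse_percent(estimated, observed):.1f} %")
```

The example also shows why R² alone is not sufficient: a uniformly biased estimate still has a perfect correlation with the reference, while MBE and RMSE expose the offset.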

3 Results and Discussion

The weather parameters considered for the ANN models with four inputs were the monthly mean daily T_x, T_n, RH and R_a. The output was the monthly mean daily ET₀ calculated using the FAO PM method. The optimal number of nodes in the hidden layer of the network was determined by trial and error, considering the MBE, RMSE and R² values from a test sample. Ten ANNs were trained with one to 10 nodes in the hidden layer, and the aforementioned statistical parameters were calculated on the whole test data set after each training run. Based on the three statistical indices, the network with six nodes in the hidden layer provided the best results, with MBE, RMSE and R² values of 0.7 %, 5.3 % and 0.99, respectively, for the testing data.

The data used for training the neural network were also used to create the M5 model tree. The generated model tree has only two rules:

$$ \begin{array}{l}\mathrm{Rule}\kern0.5em 1:\mathrm{If}\kern0.5em \mathrm{Ra}<=33.163\kern0.5em \mathrm{then}\kern0.5em \mathrm{LM}1\hfill \\ {}\mathrm{Rule}\kern0.5em 2:\mathrm{If}\kern0.5em \mathrm{Ra}>33.163\kern0.5em \mathrm{then}\kern0.5em \mathrm{LM}2\hfill \end{array} $$

LM1 and LM2 are linear models provided by M5 model tree with train data set:

$$ \begin{array}{l}\mathrm{LM}1:{\mathrm{ET}}_0=0.0601*{\mathrm{T}}_{\mathrm{n}}-0.0108*{\mathrm{T}}_{\mathrm{x}}-0.0481*\mathrm{RH}+0.1528*\mathrm{Ra}+1.3661\hfill \\ {}\mathrm{LM}2:{\mathrm{ET}}_0=0.0907*{\mathrm{T}}_{\mathrm{n}}-0.0108*{\mathrm{T}}_{\mathrm{x}}-0.0959*\mathrm{RH}+0.2279*\mathrm{Ra}-0.418\hfill \end{array} $$
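The two rules and their linear models are simple enough to apply by hand or in a few lines of code. The sketch below transcribes the coefficients as printed above; it assumes the inputs are in their original units (T_n and T_x in °C, RH in %, R_a in MJ m⁻² d⁻¹), which the text does not state explicitly, and the sample inputs are hypothetical.

```python
RA_SPLIT = 33.163  # split value on R_a from Rule 1 / Rule 2


def et0_m5(t_n: float, t_x: float, rh: float, ra: float) -> float:
    """ET0 from the two-leaf M5 model tree (coefficients as printed above).

    Assumption: inputs are in original units, not normalized values.
    """
    if ra <= RA_SPLIT:
        # Rule 1 -> LM1
        return 0.0601 * t_n - 0.0108 * t_x - 0.0481 * rh + 0.1528 * ra + 1.3661
    # Rule 2 -> LM2
    return 0.0907 * t_n - 0.0108 * t_x - 0.0959 * rh + 0.2279 * ra - 0.418


# Hypothetical cool-season and warm-season months.
cool = et0_m5(t_n=10.0, t_x=25.0, rh=30.0, ra=30.0)
warm = et0_m5(t_n=20.0, t_x=38.0, rh=20.0, ra=40.0)
print(f"LM1: {cool:.2f}  LM2: {warm:.2f}")
```

This is the practical appeal of the model tree noted in the conclusions: once the tree is built, an estimate needs only one comparison and one linear expression.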

The developed ANN and M5 model tree were applied to the test data set, and the statistical summary of the ET₀ estimates for all locations is presented in Table 2. It is clear from Table 2 that the difference between the two models is quite small. The RMSEs for both methods are generally low, indicating that the estimation error of either method is small. From Table 2, the RMSE has a maximum of 7.8 and 9.4 % for the ANN and M5 models, respectively. The RMSE varies between 0.5 and 7.8 % for the ANN model and between 8.9 and 9.4 % for the M5 model tree. In general, Table 2 shows that the ANN model offered an advantage over the M5 model tree for the data in the study area; however, the differences between the two approaches are small. The selected ANN model showed very good performance when compared with the values estimated by the FAO PM method. This model, with an R² of 0.98, an RMSE of 5.6 % and an MBE of 0.8 %, produces a small overestimation. The M5 model tree also performs well compared with the FAO PM estimates, with an overestimation of 2.1 %, an RMSE of 8.9 % and an R² of 0.98.

Table 2 Statistical summary of ET₀ estimates for four locations in Sistan and Baluchestan province

The ET₀ estimates of the developed ANN and M5 model tree at the four weather stations for the test data set are illustrated in Fig. 6 in the form of scatterplots. In both cases, all ET₀ data appear to be well distributed along the 1:1 line. A good correlation was observed for all sites in both cases, with R² higher than 0.97. The selected ANN and M5 model tree perform very well when compared with the FAO PM estimates. The slopes of the fitted straight lines for both models are close to one, and no marked overestimation or underestimation is produced in the range of values studied. This supports the use of the models to estimate ET₀ values for different days.

Figure 7 shows the comparison between the monthly mean daily ET₀ values estimated by the FAO PM method and those calculated by the selected ANN model and the M5 model tree during the test period. It can be seen that neither model has a significant MBE. For both models the evolution over time is similar, and each line is practically superimposed on the FAO PM line.

4 Conclusions

The results showed that both the neural network and M5 models agree well with the ET₀ obtained by the FAO PM method. They gave reliable estimates at all locations. The study demonstrated that modelling ET₀ with the ANN technique gave better estimates than the M5 model tree; however, the differences from the M5 model tree are small. The advantage of the M5 model tree over the ANN is that it is simple to compute, so the M5 model is recommended for estimating ET₀. The overall results are of significant practical use because the temperature- and humidity-based model can be used when radiation and wind speed data are not available.

The results of this study are similar to those reported by Sattari et al. (2013b, 2013c) when comparing ANN and M5 model tree approaches at different locations. Those results also suggested a better performance by the ANN approach, but the M5 model tree, being analogous to piecewise linear functions, provides a simple linear relation. On that basis, those studies likewise recommended using the M5 model tree to estimate ET₀.

References

Allen RG, Smith M, Perrier A, Pereira LS (1994) An update for the definition of reference evapotranspiration. ICID Bulletin 43:1–34
Google Scholar
Allen RG, Pereira LS, Raes D, Smith M (1998) Crop evapotranspiration: guidelines for computing crop requirements. FAOIrrigation and Drainage Paper No. 56. FAO, Rome, Italy
Atkinson PM, Tatnall ARL (1997) Introduction neural networks in remote sensing. Int J Remote Sens 18:699–709
Article Google Scholar
Bhattacharya B, Solomatine DP (2005) Neural networks and M5 model trees in modeling water level–discharge relationship. Neurocomputing 63:381–396
Article Google Scholar
Bhattacharya B, Solomatine DP (2006) Machine learning in sedimentation modelling. Neural Netw 19(2):208–214
Article Google Scholar
Chauhan S, Shrivastava RK (2009) Performance evaluation of reference evapotranspiration estimation using climate based methods and artificial neural networks. Water Resour Manag 23(5):825–837
Article Google Scholar
Coulibaly P, Anctil F, Bobee B (2000) Daily reservoir inflow forecasting using artificial neural networks with stopped training approach. J Hydrol 230(3–4):244–257
Article Google Scholar
Droogers P, Allen RG (2002) Estimating reference evapotranspiration under inaccurate data conditions. Irrig Drain Syst 16:33–45
Article Google Scholar
Hagan MT, Menhaj MB (1994) Training feedforward networks with the Marquardt algorithm. IEEE Trans Neural Netw 5:989–993
Article Google Scholar
Hornik K, Stinchcombe M, White H (1989) Multilayer feedforward networks are universal approximators. Neural Netw 2:359–366
Article Google Scholar
Irmak S, Allen RG, Whitty EB (2003) Daily grass and alfalfa reference evapotranspiration estimates and alfalfa-to-grass evapotranspiration ratios in Florida. J Irrig Drain Eng 129(5):360–370
Article Google Scholar
Itenfisu D, Elliott RL, Allen RG, Walter IA (2000) Comparison of reference evapotranspiration calculations across a range of climates. Proceedings of the 4th National Irrigation Symposium. ASAE: Phoenix, AZ
Kumar M, Raghuwanshi NS, Singh R, Wallender WW, Pruitt WO (2002) Estimating evapotranspiration using artificial neural network. J Irrig Drain Eng 128(4):224–233
Article Google Scholar
Mitchell TM (1997) Machine learning. The McGraw-Hill Comp. Press.
Odhiambo LO, Yoder RE, Hines JW (2001) Optimization of fuzzy evapotranspiration model through neural training with input–output examples. Trans ASAE 44(6):1625–1633
Pal M, Deswal S (2009) M5 model tree based modelling of reference evapotranspiration. Hydrol Process 23:1437–1443
Prechelt L (1998) Automatic early stopping using cross validation: quantifying the criteria. Neural Netw 11:761–767
Quinlan JR (1992) Learning with continuous classes. In: Proceedings of the Fifth Australian Joint Conference on Artificial Intelligence, Hobart, Australia, 16–18 November. World Scientific, Singapore, pp 343–348
Rahimikhoob A (2010) Estimation of evapotranspiration based on only air temperature data using artificial neural networks for a subtropical climate in Iran. Theor Appl Climatol 101(1–2):83–91
Sarle WS (1995) Stopped training and other remedies for overfitting. In: Proceedings of the 27th Symposium on the Interface of Computing Science and Statistics
Sattari MT, Pal M, Apaydin H, Ozturk F (2013a) M5 model tree application in daily river flow forecasting in Sohu stream, Turkey. Water Resources 40(3):233–242
Sattari MT, Pal M, Yurekli K, Ünlukara A (2013b) M5 model trees and neural network based modelling of ET0 in Ankara, Turkey. Turk J Eng Environ Sci 37:211–219
Sattari MT, Nahrein F, Azimi V (2013c) M5 model trees and neural networks based prediction of daily ET0 (Case Study: Bonab Station). Iranian Journal of Irrigation and Drainage 7(1):104–113 (In Farsi)
Solomatine DP, Dulal KN (2003) Model trees as an alternative to neural networks in rainfall-runoff modelling. Hydrol Sci J 48(3):399–411
Solomatine DP, Xue Y (2004) M5 model trees compared to neural networks: application to flood forecasting in the upper reach of the Huai River in China. J Hydr Engrg 9(6):491–501
Sudheer KP, Gosain AK, Ramasastri KS (2003) Estimating actual evapotranspiration from limited climatic data using neural computing technique. J Irrig Drain Eng 129(3):214–218
Tan Y, Van Cauwenberghe A (1999) Neural-network-based d-step-ahead predictors for nonlinear systems with time delay. Eng Appl Artif Intell 12:21–25
Trajkovic S, Todorovic B, Stankovic M (2003) Forecasting of reference evapotranspiration by artificial neural networks. J Irrig Drain Eng 129(6):454–457
Utset A, Farre I, Martinez-Cob A, Cavero J (2004) Comparing Penman–Monteith and Priestley–Taylor approaches as reference evapotranspiration inputs for modeling maize water use under Mediterranean conditions. Agric Water Manage 66(3):205–219
Wang Y, Witten IH (1997) Induction of model trees for predicting continuous classes. In: Proceedings of the Poster Papers of the European Conference on Machine Learning. University of Economics, Faculty of Informatics and Statistics, Prague
Witten IH, Frank E (2005) Data mining: practical machine learning tools and techniques. Morgan Kaufmann Publishers, San Francisco
Zanetti SS, Sousa EF, Oliveira VPS, Almeida FT, Bernardo S (2007) Estimating evapotranspiration using artificial neural network and minimum climatological data. J Irrig Drain Eng 133(2):83–89

Acknowledgments

This study forms part of the work carried out under Project No. WR1-1389-631, supported by the Sistan and Baluchestan Regional Water Corporation, and was conducted in the Department of Irrigation and Drainage Engineering, Abouraihan Campus, University of Tehran.

Author information

Authors and Affiliations

Department of Irrigation and Drainage Engineering, Abouraihan Campus, University of Tehran, Tehran, Iran
Ali Rahimikhoob

Corresponding author

Correspondence to Ali Rahimikhoob.

About this article

Cite this article

Rahimikhoob, A. Comparison between M5 Model Tree and Neural Networks for Estimating Reference Evapotranspiration in an Arid Environment. Water Resour Manage 28, 657–669 (2014). https://doi.org/10.1007/s11269-013-0506-x

Received: 08 February 2013
Accepted: 29 December 2013
Published: 07 January 2014
Issue Date: February 2014
DOI: https://doi.org/10.1007/s11269-013-0506-x

1 Introduction