Abstract
In the current study, the ability of a novel regression-based method to model daily reference evapotranspiration (ET0), an important issue in water resources management and planning, is evaluated. The method was developed by hybridizing the radial basis function and the M5 model tree and is called the radial basis M5 model tree (RM5Tree). The results of the new model were compared with those of the traditional M5 model tree (M5Tree), the response surface method (RSM), and two neural networks (multi-layer perceptron neural network, MLPNN, and radial basis function neural network, RBFNN) with respect to several statistical indices. Daily climatic data (relative humidity, RH; solar radiation, SR; wind speed, W; air temperature, T) recorded at three stations in the Mediterranean Region of Turkey were used. The effect of each weather variable on ET0 was also investigated using three input scenarios with various combinations of input variables. On the whole, the RM5Tree provided the best results (Nash and Sutcliffe efficiency, NES > 0.997), followed by the MLPNN (NES > 0.990) and the M5Tree (NES > 0.945), in modeling daily ET0. SR was found to be the most effective input parameter for ET0, followed by T and RH. However, the findings of the third modeling scenario revealed that taking all variables into account considerably increases the models’ accuracy at the three stations.
1 Introduction
Evapotranspiration (ET) is one of the main components of the hydrological cycle, and its accurate estimation is vitally important in the design and management of irrigation systems, water resources studies, and similar applications. Knowing the ET rate of a plant helps determine the amount of water required for irrigation, which subsequently leads to increased productivity. Failure to determine the ET rate accurately may lead to overestimation of plants’ water requirement, which in turn causes adverse effects such as waterlogged land, washout of soil nutrients, and contamination of groundwater resources. On the other hand, underestimation of plants’ water requirement imposes moisture stress on them, followed by a reduced crop yield. In this regard, equations such as FAO Penman, FAO Penman-Monteith, and Blaney-Criddle can be used to estimate the reference evapotranspiration (ET0) (Feng et al. 2017a, b; Chen et al. 2019a, b). However, despite their acceptable performance in most cases, these methods require a large amount of input data, which is not always available under the conditions of various regions.
In such conditions, indirect approaches such as soft computing techniques and artificial intelligence-based (AI-based) models can be a suitable modeling alternative (Zounemat-Kermani et al. 2020). Many researchers have applied AI-based models, including artificial neural networks (e.g., MLPNN and RBFNN), fuzzy logic concepts (e.g., ANFIS), regression and classification tree models (e.g., M5 tree and CHAID), and machine learning approaches (support vector machine, SVM, and support vector regression, SVR), for simulating and modeling ET0 (Kişi and Öztürk 2007; Shiri et al. 2014, 2015; Gocić et al. 2015; Yin et al. 2017; Mehdizadeh et al. 2017; Dou and Yang 2018; Zounemat-Kermani et al. 2019; Chia et al. 2020; Chen et al. 2020; Yamaç and Todorovic 2020; Adnan et al. 2021). In the following paragraphs, some of the most recent pertinent studies on modeling ET0 using AI-based models are presented. Shiri et al. (2015) compared neuro-fuzzy and artificial neural network (ANN) models for modeling ET0 with two input combinations to select suitable input data, and they reported that local training can be applied to validate the alternative AI-based modeling. Kisi (2016) used three different models, namely multivariate adaptive regression splines (MARS), M5Tree, and least square SVR, for approximating ET0. In general, the M5Tree was found to perform better than the other modeling approaches applied. Rahimikhoob (2016) compared the ability of an artificial neural network and M5Tree to estimate ET0 in an arid area. The study did not report a complete superiority of either of the utilized models.
Khoshravesh et al. (2017) analyzed the performance of a multivariate fractional polynomial model, Bayesian regression, and robust regression in estimating ET0 in arid climates. The outcomes of the study showed that the multivariate fractional polynomial model was more accurate than the other two models. Feng et al. (2017a, b) investigated daily ET0 predictions using the extreme learning machine (ELM) and generalized regression neural network (GRNN) with temperature input data at 6 stations in China. It was concluded that the ELM is a robust and accurate model compared to the GRNN. Wang et al. (2017) applied three different AI-based models (ANFIS-GP, fuzzy genetic model, and M5Tree) to modeling monthly ET0 values. It was reported that the fuzzy genetic model performed better than the ANFIS-GP and M5Tree models. Yin et al. (2017) used a hybrid of the genetic algorithm (GA) and SVM for simulating daily ET0 in a semi-arid environment in northwest China and compared it with SVM and ANN based on eight different combinations of climatic input data. Based on their results, the SVR combined with the GA had superior performance. Antonopoulos and Antonopoulos (2017) compared an ANN-based multi-layer feed-forward back-propagation model with several empirical models for calculating daily ET0 in northern Greece and showed that the ANN with a sigmoid transfer function in the hidden layer can provide more accurate predictions than the empirical models.
In another study, Keshtegar et al. (2018) explored the ability of subset ANFIS, ANFIS, ANNs, and M5Tree. They claimed that the subset ANFIS is superior to the other applied methods in modeling daily ET0. Gavili et al. (2018) compared the ability of ANN, ANFIS, and gene expression programming (GEP) in modeling daily ET0. The results of the AI-based models were compared with those of empirical models, and the AI-based models were found to provide better accuracy. Dou and Yang (2018) investigated and compared the feasibility and ability of four AI-based models, ELM, ANFIS, ANN, and SVR, for predicting daily ET0 at four sites in China. They concluded that the ANFIS and ELM models can produce better performance than the ANN and SVM, while the ELM model considerably reduced the computational time of the modeling process. Keshtegar et al. (2019) compared the ability of RSM and polynomial chaos expansion (PCE) in modeling ET0. They reported that the PCE model was the more accurate approach for estimating daily ET0. Saggi and Jain (2019) compared the abilities of four learning algorithms, a multilayer perceptron-based deep feed-forward ANN, a gradient-boosting machine, random forest regression (RF) using the M5Tree model, and a generalized linear model, for ET0 estimation at stations in Punjab, Northern India. According to the results of that study, the deep feed-forward ANN performed better than the other models. Sanikhani et al. (2019) employed six AI-based models, namely multilayer perceptron ANN, GRNN, radial basis neural networks, integrated ANFIS with subtractive clustering and grid partitioning, and GEP, for modeling ET0 with a small number of input climatic data. In general, it was reported that all the applied models have highly practical and reliable performance for the investigated stations.
Heddam et al. (2018) applied and compared three evolving connectionist (ECoS) models, namely (i) the offline-based dynamic evolving neural-fuzzy inference system (DENFIS-OF), (ii) the online-based dynamic evolving neural-fuzzy inference system (DENFIS-ON), and (iii) the evolving fuzzy neural network (EFuNN), for modeling daily ET0 in the northern region of Algeria. According to the obtained results, the best accuracy was obtained using the DENFIS-OF model. Tao et al. (2018) proposed a hybrid model combining the adaptive neuro-fuzzy inference system (ANFIS) with the firefly algorithm (ANFIS-FA) for predicting daily ET0 in Burkina Faso and reported that the hybrid ANFIS-FA provides higher accuracy than the standard ANFIS. Karbasi (2018) employed Gaussian process regression (GPR) for forecasting daily ET0 and demonstrated that wavelet decomposition significantly improved the performance of the models. Fang et al. (2018) employed RF, SVM, and MLR for predicting monthly ET0 in China and demonstrated that the SVM was more accurate. The accuracy of ET0 prediction models remains one of the challenges in hydrology for managing irrigation systems and water resources.
More recently, Zhu et al. (2020) employed a hybrid of the extreme learning machine (ELM) and particle swarm optimization (PSO) for daily ET0 prediction. They claimed that the PSO-ELM model offered the best accuracy among the applied models, which included ANN and random forest (RF) models. Nagappan et al. (2020) attempted to predict ET0 for irrigation scheduling using machine learning methods such as the deep learning neural network (DLNN) and RBNN. It was found that the DLNN model performed better in the prediction process.
In a similar study, Ferreira and da Cunha (2020) showed that deep learning performed slightly better than ANN and RF in predicting ET0. Salam and Islam (2020) compared various data-driven models in ET0 prediction. They utilized the standard SVM model as well as ensemble learning models, including the bagging random tree (RT), RF, and random subspace (RS) models. The findings showed that the RT model performed best, followed by the RF, RS, and SVM.
Generally, AI methods, as machine learning approaches, are used to provide accurate predictions of ET0 owing to their flexibility in capturing nonlinear relations. However, AI approaches have some limitations: (i) the input variables strongly affect their predictions, (ii) some control parameters are required to train the models, and (iii) the training process is time consuming. Regression-based models are efficient modeling approaches with a simple regression process. However, regression-based data-driven approaches also have some drawbacks: (i) the choice of the regressed function is important for accurate predictions, (ii) the high nonlinearity of the input data is neglected in the regression process, and (iii) a linear cross-correlation between the input and output data is used in the modeling process. Consequently, an efficient and accurate regression-based modeling approach that is free from a complex training process should be developed.
A nonlinear mapping with an efficient regression region can provide a flexible nonlinear response with an efficient modeling process for machine learning models. The input data for training M5Tree models can be controlled through nonlinear maps based on the radial basis function. Thus, the nonlinearity between the input variables and the response can be considered by applying the radial map, which can improve the nonlinear functions used for accurate ET0 predictions with the M5Tree. In this study, nonlinear forms of the input data are used to improve the M5Tree-based regression model. In this way, the input data are better controlled by providing a nonlinear cross-correlation between the input and output data sets. The proposed model was tested using data from three climate stations in Turkey (Isparta, Antalya, and Adana). Afterwards, the performance of the RM5Tree model was compared with two neural network methods (multi-layer perceptron and radial basis function neural network) and two regression methods (response surface method and M5Tree). The results showed that the proposed model has a more flexible ability to represent nonlinear responses than the M5Tree models and that, with increasing input data, the accuracy of the RM5Tree models improved significantly compared to the other studied models.
2 Materials and methodology
2.1 Case study
The study used daily weather data comprising relative humidity (RH), solar radiation (SR), air temperature (T), and wind speed (W) from the Adana (longitude: 35° 19′ E, latitude: 37° 00′ N, altitude: 27 m), Antalya (latitude: 36° 42′ N, longitude: 30° 44′ E, altitude: 47 m), and Isparta (longitude: 30° 34′ E, latitude: 37° 47′ N, altitude: 997 m) stations in the Mediterranean Region, Turkey (Fig. 1). The data cover the period from 01 January 1972 to 31 December 2002 for Adana station, from 01 January 1973 to 31 October 2002 for Antalya station, and from 01 September 1978 to 31 October 2002 for Isparta station. There are no missing values in the data used. The data were obtained from the Turkish State Meteorological Service (TSMS). Table 1 shows the statistical characteristics of the dataset in terms of minimum (Min), maximum (Max), standard deviation (Std.), mean, coefficient of variation (CV), and correlation between the input parameters and the output parameter (ET0). Table 1 implies that the Adana and Antalya stations are more similar in terms of temperature ranges. Isparta is the only station that recorded sub-zero temperatures in the dataset used in this study. In addition, solar radiation is the parameter most correlated with ET0, followed by air temperature, at all three stations. In the applications, a 65–35% data splitting rule was applied to train and test the studied models.
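The 65–35% splitting rule can be illustrated with a minimal sketch. The text does not say whether the split is chronological or random, so the version below simply assumes that the first 65% of the daily records are used for training and the remainder for testing; the function name `split_train_test` is hypothetical.

```python
def split_train_test(records, train_fraction=0.65):
    """Split an ordered list of daily records into training and testing parts.

    Assumption: the split is chronological (first 65% for training);
    the paper does not state how the 65-35% rule was applied.
    """
    cut = int(round(len(records) * train_fraction))
    return records[:cut], records[cut:]


# Example with 100 placeholder records (not real station data).
train, test = split_train_test(list(range(100)))
```

With a chronological split, the test period always follows the training period, which avoids using future observations to predict past ones.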
2.2 Modeling approaches
2.2.1 Response surface method
The RSM is commonly implemented for modeling nonlinear relations as below (Hill and Hunter 1966):
where \( E{\hat{T}}_o \) is the predicted ET0, and NV is the number of input variables x, which include the mean temperature, Tmean (°C), solar radiation, SR (langley), mean relative humidity, RHmean (%), and wind speed, W (m/s). a0, ai, and aij are the unknown coefficients of the polynomial terms of Eq. (2). Generally, the unknown coefficients are calibrated using the ordinary least squares estimator as follows (Keshtegar and Kisi 2017; Ahmadi et al. 2020):
where P(X) is the polynomial basis function, which is determined from the input data in the training stage (65% of the total data). More details can be found in Keshtegar and Heddam (2017), Keshtegar et al. (2021), and Lu et al. (2020).
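As an illustration of the calibration described above, the following sketch fits a second-order response surface by ordinary least squares. It assumes the common quadratic form \( E{\hat{T}}_o = a_0 + \sum_i a_i x_i + \sum_i \sum_{j \ge i} a_{ij} x_i x_j \), which is consistent with the coefficients a0, ai, and aij named in the text but is not confirmed by it; the function names are hypothetical.

```python
import numpy as np


def quadratic_basis(X):
    """Build P(X) with columns [1, x_i, x_i * x_j for i <= j] (assumed form)."""
    X = np.asarray(X, dtype=float)
    n, nv = X.shape
    cols = [np.ones(n)]
    cols += [X[:, i] for i in range(nv)]
    cols += [X[:, i] * X[:, j] for i in range(nv) for j in range(i, nv)]
    return np.column_stack(cols)


def fit_rsm(X_train, y_train):
    """Ordinary least squares estimate of the polynomial coefficients."""
    P = quadratic_basis(X_train)
    coef, *_ = np.linalg.lstsq(P, np.asarray(y_train, dtype=float), rcond=None)
    return coef


def predict_rsm(coef, X):
    """Evaluate the fitted response surface at new inputs."""
    return quadratic_basis(X) @ coef
```

For NV = 4 inputs (Tmean, SR, RHmean, W), this basis has 1 + 4 + 10 = 15 columns, so the calibration reduces to a single linear least squares solve with no iterative training.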
2.2.2 Multilayer perceptron artificial neural networks
Artificial neural networks (ANN) are black-box models capable of producing a suitable response to an external stimulus, and they are composed of two items: the neurons and the weights (Fei et al. 2020; Sayari et al. 2021). ANN models are constructed in two distinct phases, the forward and the backward phases, which are carried out successively during the backpropagation training algorithm. The information available in the predictor variables is transferred from the input neurons to the hidden neurons via the weights and then summed to obtain the total stimulus of each hidden unit (Ozonoh et al. 2020). The hidden neurons send the collected information to the output neuron through an activation function, generally the sigmoid (Shahabinejad et al. 2020). Finally, the output neuron provides a response, which is compared to the desired value, and the error is calculated. A multilayer perceptron artificial neural network (MLPNN) with one input, one hidden, and one output layer is a well-known ANN architecture (Fig. 2). Such an ANN model was employed in the present study and trained with the supervised Levenberg-Marquardt (LM) learning algorithm. According to Fig. 2, the relationship between the N possible input variables (xi: climatic variables) and the single output variable (ET0) is expressed as follows (Zhu et al. 2021):
where xi is an input variable, wij is the weight between input i and hidden neuron j, βj is the bias of hidden neuron j, ϕ1 is the sigmoid activation function, wjk is the weight of the connection from neuron j in the hidden layer to the single neuron k in the output layer, and β0 is the bias of the output neuron k (Wang et al. 2020).
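The forward phase described above can be sketched in a few lines. The sketch assumes a sigmoid hidden layer and a linear output neuron; the output activation is not stated in the text, so that choice, along with the function names, is an assumption made for illustration only.

```python
import math


def sigmoid(z):
    """Logistic sigmoid activation used in the hidden layer."""
    return 1.0 / (1.0 + math.exp(-z))


def mlp_forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass of a one-hidden-layer MLP with a single output.

    x        : list of N input values (climatic variables)
    w_hidden : w_hidden[j][i] is the weight from input i to hidden neuron j
    b_hidden : bias of each hidden neuron
    w_out    : weight from each hidden neuron to the output neuron
    b_out    : bias of the output neuron
    Assumption: the output neuron is linear.
    """
    hidden = [
        sigmoid(sum(w * xi for w, xi in zip(w_j, x)) + b_j)
        for w_j, b_j in zip(w_hidden, b_hidden)
    ]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out
```

Training with the Levenberg-Marquardt algorithm would adjust the weights and biases passed to `mlp_forward`; only the forward computation is shown here.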
2.2.3 Radial basis function neural network
Radial basis function neural network (RBFNN) belongs to the category of feedforward neural networks (FFNN). Contrary to the well-known MLPNN, the RBFNN possesses only one hidden layer with a large number of neurons, each of which implements a radial basis function, generally the Gaussian function (Tenenbaum et al. 2020). The input layer transfers the predictor variables directly to the hidden layer, and the single output neuron linearly combines the weighted results of all hidden units (Fig. 3). The Gaussian activation function can be expressed as follows (Chen et al. 2019a, b; Pham et al. 2020):
where μi = [μi1, μi2,…, μiM] is the center of the Gaussian function, xk = [xk1, xk2,…, xkN] is a training sample, and σ2 is the width of the RBFNN neuron, also called the spread. During the training process, the optimal number of RBFNN neurons, the values of the centers, the weights, and the biases were determined by minimizing the mean squared error between the observed and modeled values of ET0 (Zhou et al. 2012; Bonanno et al. 2012).
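A minimal sketch of the RBFNN output follows. It assumes the widely used Gaussian form \( \exp\left(-{\left\Vert x-\mu \right\Vert}^2/\left(2{\sigma}^2\right)\right) \); the exact scaling used in the paper cannot be recovered from the text, and the function names are hypothetical.

```python
import math


def gaussian_rbf(x, center, sigma):
    """Gaussian activation of one hidden neuron (assumed form)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, center))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def rbfnn_output(x, centers, sigma, weights, bias):
    """Linear combination of the hidden-unit activations plus a bias."""
    return bias + sum(
        w * gaussian_rbf(x, c, sigma) for w, c in zip(weights, centers)
    )
```

Each hidden neuron responds most strongly (activation 1) when the sample coincides with its center, and the response decays with distance at a rate set by the spread.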
2.2.4 M5 tree model
The M5 tree model is a subset-based data mining and machine learning method. Tree-based methods are part of data mining methods whose output, obtained from the input and output data, is a model with a tree structure (Solomatine and Xue 2004; Zahiri et al. 2020). Tree models are fundamentally based on the divide-and-conquer method. The M5 model places linear regression equations at the nodes, which makes it capable of predicting or estimating numerical variables. The structure of a decision tree resembles a tree composed of a root, branches, nodes, and leaves. A tree model is built in two steps. In the first step, the decision tree is grown by splitting the data. The split criterion in the M5 model is to maximize the standard deviation reduction (SDR) of the data at the offspring node. When no further reduction of the standard deviation at the offspring node is possible, the parent node is not split and becomes an end node, or leaf. The following formula is used to calculate the SDR:
where T represents the set of samples reaching the node, Ti represents the subset of samples with the ith outcome of the potential test, and Sd is the standard deviation of the data, which is calculated as follows:
where Yi is the numerical value of the target attribute of sample i and N is the number of data. Since the data at the offspring nodes have a lower standard deviation than those at the parent nodes, they are more homogeneous and give more accurate results. Once all the possible splits are examined, the M5 model selects the one with the minimum expected error. The second step in designing a tree model is to shrink the overgrown and overfitted tree by pruning the branches and replacing them with linear regression functions (Rahimikhoob 2016).
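The split criterion can be sketched as follows, assuming the usual M5 definition \( SDR= Sd(T)-\sum_i \frac{\left|{T}_i\right|}{\left|T\right|} Sd\left({T}_i\right) \) and a population standard deviation (divisor N); both are assumptions, since the displayed formulas are not available in the text, and the function name is hypothetical.

```python
import statistics


def sdr(parent, subsets):
    """Standard deviation reduction achieved by splitting `parent` into `subsets`.

    parent  : target values of the samples reaching the node
    subsets : list of target-value lists, one per outcome of the candidate test
    Assumption: population standard deviation (divisor N).
    """
    n = len(parent)
    weighted = sum(len(s) / n * statistics.pstdev(s) for s in subsets)
    return statistics.pstdev(parent) - weighted


# A split that separates low and high targets removes all spread,
# while an uninformative split gives no reduction.
good = sdr([1, 1, 5, 5], [[1, 1], [5, 5]])
poor = sdr([1, 1, 5, 5], [[1, 5], [1, 5]])
```

When growing the tree, the candidate test with the largest SDR at a node would be chosen as the split.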
3 Radial M5 model tree
To enhance the accuracy of ET0 estimations, the radial basis M5Tree (RM5Tree) is introduced. The input data set is controlled by applying the radial basis function (RBF) in feature space. By transferring the input data from the original space to the radial map, the RBF is applied in the RM5Tree as follows (Chen et al. 1991; Xiao et al. 2020; Zhang et al. 2020a):
where RF is the number of radial sets with shape factor ε, and C represents the center of the RBF. Z is the normalized map, which is given as follows (Zhang et al. 2019; Keshtegar and Kisi 2018):
where μx and σx are the mean and standard deviation (SD) of the input data x, respectively. The radial transformation given by Eq. (7) with C = 0 and ε = 0.5 is shown in Fig. 4, which illustrates the nonlinear map. Thus, a new data set can be used to train a model by transforming the original data set from NV (X-space) to RF (radial-space).
The two RBF parameters, the shape factor and the locations of the center points, are selected as ε = 0.5 and C ∈ [Xmin, Xmax], with the centers drawn randomly from the domain of the input dataset; RF is set to 5, 10, 20, and 50 in this study. The schematic structure of the RM5Tree is presented in Fig. 5. The model involves three layers: the input, transferring, and modeling layers. Using Eq. (7), the input dataset is normalized in the first (input) layer, while the RF-dataset is provided by transferring the data to the second layer as follows:
1) Create RF center points randomly from the domain of each input.
2) Transfer the input data set in layer 1 into the radial space by using the RBF given in Eq. (7) based on the RF center points as follows:
where N is the number of data in the training stage (65% of the total data), NV is the number of input variables, RF is the number of radial input data, and Kij (i = 1, 2, ..., N; j = 1, 2, ..., RF) denotes the radial input data. The radial input data are used in the training phase of the M5Tree models. Therefore, the nonlinear map based on the Gaussian function and the number of center points improve the accuracy of the M5Tree models (Keshtegar et al. 2018).
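The two transfer steps can be sketched as below. The exact form of Eq. (7) is not recoverable from the text, so the sketch assumes a Gaussian map \( {K}_{ij}=\exp \left(-\varepsilon {\left\Vert {Z}_i-{C}_j\right\Vert}^2\right) \) applied to the normalized inputs, with centers drawn uniformly within the normalized range of each input. Both choices and the function names are assumptions, not the authors' implementation.

```python
import numpy as np


def fit_radial_map(X_train, n_centers, seed=0):
    """Estimate normalization statistics and draw RF random center points."""
    X_train = np.asarray(X_train, dtype=float)
    mu, sd = X_train.mean(axis=0), X_train.std(axis=0)
    Z = (X_train - mu) / sd  # normalized map
    rng = np.random.default_rng(seed)
    # Assumption: centers are uniform within the normalized range of each input.
    centers = rng.uniform(Z.min(axis=0), Z.max(axis=0),
                          size=(n_centers, Z.shape[1]))
    return mu, sd, centers


def radial_transform(X, mu, sd, centers, eps=0.5):
    """Map N x NV inputs to N x RF radial features (assumed Gaussian form)."""
    Z = (np.asarray(X, dtype=float) - mu) / sd
    sq_dist = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-eps * sq_dist)
```

The resulting N × RF matrix would replace the original N × NV inputs when training the M5Tree; the statistics and centers fitted on the training data would then be reused unchanged to transform the test data.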
4 Application of the models
4.1 Modeling scenarios
Based on the correlation coefficients in Table 1 between the independent variables (T, RH, SR, and W) and the dependent variable (ET0), three different modeling scenarios are considered for constructing the machine learning methods (M5Tree, RM5Tree, MLPNN, RBFNN, and RSM). These scenarios are tabulated in Table 2. In the first scenario, just one input parameter is considered for modeling ET0: (i) Tmean; (ii) W; (iii) SR; (iv) RH. The second scenario takes into account the most correlated parameters: (v) Tmean, SR and (vi) Tmean, SR, RH. Finally, the third scenario includes all of the independent parameters: (vii) Tmean, SR, RH, W.
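For illustration, the seven input combinations can be written as a small lookup structure. The sketch assumes that scenario II comprises combinations (v) and (vi) and scenario III comprises combination (vii); the variable names used as keys are hypothetical column labels.

```python
# Input combinations (i)-(vii) grouped by scenario; keys are hypothetical labels.
SCENARIOS = {
    "I": {"i": ["Tmean"], "ii": ["W"], "iii": ["SR"], "iv": ["RH"]},
    "II": {"v": ["Tmean", "SR"], "vi": ["Tmean", "SR", "RH"]},
    "III": {"vii": ["Tmean", "SR", "RH", "W"]},
}


def select_inputs(record, combination):
    """Pick the inputs of one combination from a daily record (dict by name)."""
    for combos in SCENARIOS.values():
        if combination in combos:
            return [record[name] for name in combos[combination]]
    raise KeyError(combination)


# Example with made-up values, not station data.
day = {"Tmean": 20.0, "SR": 450.0, "RH": 60.0, "W": 2.0}
inputs_v = select_inputs(day, "v")
```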
4.2 Evaluation of the models
The models’ accuracies were compared according to the mean absolute error (MAE), determination coefficient (R2), root mean square error (RMSE), agreement index (d), and Nash and Sutcliffe efficiency (NES) statistics (Xiao et al. 2019; Zhang et al. 2020b).
in which N is the number of data, and ET0, ETp, \( {\overline{ET}}_0 \), and \( {\overline{ET}}_p \) are the FAO-56 PM ET0, predicted ET0, mean ET0, and mean predicted ET0, respectively.
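The five statistics can be computed as in the sketch below, which assumes their standard definitions: R2 as the squared Pearson correlation, d as Willmott's agreement index, and NES as one minus the ratio of the squared error to the variance of the observations. These definitions are assumptions where the text does not spell them out, and the function name is hypothetical.

```python
import math


def evaluate(obs, pred):
    """Return MAE, RMSE, R2, d, and NES for observed and predicted ET0."""
    n = len(obs)
    mean_o, mean_p = sum(obs) / n, sum(pred) / n
    errors = [o - p for o, p in zip(obs, pred)]
    sse = sum(e * e for e in errors)                      # sum of squared errors
    sst = sum((o - mean_o) ** 2 for o in obs)             # spread of observations
    spp = sum((p - mean_p) ** 2 for p in pred)            # spread of predictions
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    pot = sum((abs(p - mean_o) + abs(o - mean_o)) ** 2 for o, p in zip(obs, pred))
    return {
        "MAE": sum(abs(e) for e in errors) / n,
        "RMSE": math.sqrt(sse / n),
        "R2": cov ** 2 / (sst * spp),
        "d": 1.0 - sse / pot,
        "NES": 1.0 - sse / sst,
    }
```

MAE and RMSE carry the units of ET0 (mm), whereas R2, d, and NES are dimensionless, with values closer to 1 indicating better agreement.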
5 Results and discussion
5.1 Result analysis for the Isparta Station
The training and testing results of the investigated AI-based models (RSM, M5Tree, and RM5Tree) for Isparta Station are given in Table 3. The RM5Tree model performs better than the M5Tree and RSM models with respect to the various criteria in all input combinations (Scenario III). In the testing phase, the proposed RM5Tree improves the RMSE (d), as the accuracy (tendency) factor, by about 42% (6%) and 15% (2%) for Scenario I, 75% (15%) and 60% (3%) for Scenario II, and 105% (1%) and 90% (1%) for Scenario III compared to the M5Tree and RSM models, respectively. Scenario (I) implies that, among the single input variables, SR is the most effective parameter for ET0, followed by Tmean and RH, while W has the least effect. This result was expected from the correlation coefficients in Table 1. In the second scenario (II), combining the SR (Tmean) parameter with Tmean (SR) considerably improves the models’ accuracy. For example, it improved the MAE, RMSE, and NES of the RM5Tree by 42% (13%), 43% (19%), and 34% (7%), respectively. Adding the RH parameter to the Tmean and SR inputs also increases the accuracy of the employed models. For example, the MAE and RMSE of the RM5Tree decreased from 0.455 and 0.551 mm to 0.215 and 0.32 mm, by 52% and 42%, respectively. In scenario III, even though W appears to be the least effective parameter in the first four input combinations, adding the W parameter to the other three inputs considerably improves the MAE and RMSE of the models (the MAE and RMSE of the RM5Tree decreased by 82% and 84%, respectively). According to the results of scenario (III), the RM5Tree improved on the accuracy of the M5Tree model by 43% and 51% with respect to MAE and RMSE, respectively.
5.2 Result analysis for the Antalya Station
Table 4 reports the comparative results of the models in estimating ET0 at Antalya Station. Similar to Isparta, the RM5Tree model outperforms the other models. From the first scenario (combinations i to iv), the effective parameters in modeling ET0 are, from most to least, SR, Tmean, RH, and W. The accuracy of the RM5Tree with respect to MAE, RMSE, and NES is improved by up to 10% (15%), 48% (39%), and 50% (27%), respectively, by adding the SR input. Similarly, adding RH to the Tmean and SR inputs decreases the MAE and RMSE of the RM5Tree by 73% and 42%, respectively. Moreover, including the W input in the Tmean, SR, and RH combination considerably increases the RM5Tree accuracy (MAE and RMSE decrease by 87% and 89%) in scenario (III).
5.3 Result analysis for the Adana Station
Table 5 compares the training and testing statistics of the three methods for Adana Station. Similar to Isparta, the RM5Tree model gave the best accuracy in modeling ET0 at this station with respect to various evaluation statistics. According to the single-input combinations in scenario (I), the most effective variable for ET0 is SR, followed by Tmean and RH. Using the SR (Tmean) parameter with the Tmean (SR) input improves the RM5Tree accuracy with respect to MAE, RMSE, and NES by 35% (11%), 40% (12%), and 43% (5%) in the test period, respectively. Including the RH variable as an input to the RM5Tree comprising the Tmean and SR inputs decreases the MAE and RMSE of the model from 0.553 and 0.671 mm to 0.167 and 0.325 mm, by 70% and 50%, respectively. Similarly, adding the W parameter to the three inputs (Tmean, SR, and RH) in the third scenario considerably increases the RM5Tree accuracy, reducing the MAE from 0.167 to 0.038 mm and the RMSE from 0.325 to 0.122 mm.
5.4 Discussion
The ET0 estimates of the M5Tree, RSM, and RM5Tree models are compared in Figs. 6, 7, 8, 9, 10, 11, and 12 for each station and each input combination. The effect of each variable on ET0 can be seen visually in the figures. Comparison of Figs. 6 and 7 indicates that the ranking of the variables by effectiveness (from the most to the least) is SR, Tmean, RH, and W. Jain et al. (2008) reported the same ranking of the effective parameters (SR, Tmean, RH, W, and lastly dew point temperature) using hourly ET0 data from a few stations in the Reynolds Creek Experimental Watershed in south-western Idaho, USA. The effect of each variable on ET0 can also be observed in Figs. 8 and 9. Comparison of Figs. 10 and 11 shows the considerable effect of the W variable, even though this cannot be seen when W is used alone as input; a single-input model cannot capture the relationship between W and ET0. All of this indicates the necessity of this variable for accurate modeling of ET0. It should be noted that the M5Tree model estimates are less accurate in Adana than at the other stations and with the other methods. The reason might be that the relationship between the inputs and the output is more nonlinear in Adana than at the other stations, and the M5Tree model, being linear in nature, might not adequately map this highly nonlinear relationship. Table 6 compares the results of the best RM5Tree model with two of the most prevalent AI-based models, MLPNN and RBNN (multi-layer perceptron neural network and radial basis neural network). It can be concluded that all the AI-based models performed better when all the input variables were considered in scenario III (with the exception of the RBNN at Adana Station). Although the MLPNN model gave better results than the RBNN models, it could not surpass the performance of the proposed RM5Tree model.
For a better diagnostic analysis of the efficiency of all the AI-based models (M5Tree, RM5Tree, RSM, MLPNN, and RBNN), the results of the best input combination in scenarios I, II, and III in terms of RMSE (mm) are shown in Fig. 13 using radar charts. The smaller the star, and hence the lower the RMSE values, the better the performance of the model. It can easily be seen that involving all the variables (T, SR, RH, and W in scenario III) results in lower RMSE values (with the exception of the RBNN at Adana Station). This major finding is supported by the outcomes of different AI-based models in a similar study by Kisi (2006). Further evaluation was carried out using the Taylor diagram to check the performance of the models, as presented in Fig. 14. At all stations, the RM5Tree performs better than the other models; Fig. 14 provides clear evidence that the proposed approach improves the accuracy of the M5Tree model. Finally, to further compare the accuracy of the models, the results obtained with the best input combination for each model are shown as box plots in Fig. 15. The box plots corresponding to the test data in Fig. 15 clearly show that the accuracy of the RM5Tree model was higher than that of the other models.
6 Conclusion
In the presented work, the applicability of a new method which is developed by combining radial basis function and M5Tree methods was investigated in modeling ET0. The new method was compared with standard M5Tree, RSM, MLPNN, and RBNN using daily climatic data from three stations located in Turkey. Various input combinations of available data were tried to see the effect of each input variable on ET0. The following conclusions were derived from the applications.
i- The comparison of methods revealed that the newly proposed method, RM5Tree, provided better ET0 estimates than the MLPNN, RBNN, M5Tree, and RSM. The accuracy of the M5Tree models was considerably improved (by more than 30% with respect to MAE and RMSE) by using the RM5Tree.
ii- The results obtained with different input combinations indicated that the most effective variable for the models’ accuracy in estimating ET0 was solar radiation, followed by air temperature, relative humidity, and wind speed. However, it was also observed that using wind speed together with the other three inputs considerably increases the models’ efficiency (by more than 80% with respect to MAE and RMSE).
iii- The study showed that the proposed RM5Tree model could be utilized as a better alternative to the M5Tree in modeling daily ET0.
iv- The ability of this method can be evaluated at other stations, or the method can be applied to other hydrological problems in future.
Data availability
The daily weather data are from the Adana, Antalya, and Isparta stations, Turkey. The datasets analyzed during the current study are available from the corresponding author on reasonable request.
Code availability
The codes of the modeling methods applied in the current study are available from the corresponding author on reasonable request.
References
Adnan RM, Heddam S, Yaseen ZM, Shahid S, Kisi O, Li B (2021) Prediction of potential evapotranspiration using temperature-based heuristic approaches. Sustainability 13(1):297
Ahmadi AA, Arabbeiki M, Ali HM, Goodarzi M, Safaei MR (2020) Configuration and optimization of a minichannel using water–alumina nanofluid by non-dominated sorting genetic algorithm and response surface method. Nanomaterials 10(5):901
Antonopoulos VZ, Antonopoulos AV (2017) Daily reference evapotranspiration estimates by artificial neural networks technique and empirical equations using limited input climate variables. Comput Electron Agric 132:86–96
Bonanno F, Capizzi G, Graditi G, Napoli C, Tina GM (2012) A radial basis function neural network based approach for the electrical characteristics estimation of a photovoltaic module. Appl Energy 97:956–961
Chen S, Cowan CF, Grant PM (1991) Orthogonal least squares learning algorithm for radial basis function networks. IEEE Trans Neural Netw 2:302–309
Chen X, Li FW, Wang YX, Feng P, Yang RZ (2019a) Evolution properties between meteorological, agricultural and hydrological droughts and their related driving factors in the Luanhe River basin, China. Hydrol Res 50:1096–1119. https://doi.org/10.2166/nh.2019.141
Chen Z, Yang X, Liu X (2019b) RBFNN-based non-singular fast terminal sliding mode control for robotic manipulators including actuator dynamics. Neurocomputing 362:72–82
Chen Z, Zhu Z, Jiang H, Sun S (2020) Estimating daily reference evapotranspiration based on limited meteorological data using deep learning and classical machine learning methods. J Hydrol 591:125286
Chia MY, Huang YF, Koo CH (2020) Support vector machine enhanced empirical reference evapotranspiration estimation with limited meteorological parameters. Comput Electron Agric 175:105577
Dou X, Yang Y (2018) Evapotranspiration estimation using four different machine learning approaches in different terrestrial ecosystems. Comput Electron Agric 148:95–106
Fang W, Huang S, Huang Q, Huang G, Meng E, Luan J (2018) Reference evapotranspiration forecasting based on local meteorological and global climate information screened by partial mutual information. J Hydrol 561:764–779
Fei CW, Li H, Liu HT, Lu C, An LQ, Han L, Zhao YJ (2020) Enhanced network learning model with intelligent operator for the motion reliability evaluation of flexible mechanism. Aerosp Sci Technol 107:106342
Feng Y, Gong D, Mei X, Cui N (2017a) Estimation of maize evapotranspiration using extreme learning machine and generalized regression neural network on the China Loess Plateau. Hydrol Res 48(4):1156–1168
Feng Y, Peng Y, Cui N, Gong D, Zhang K (2017b) Modeling reference evapotranspiration using extreme learning machine and generalized regression neural network only with temperature data. Comput Electron Agric 136:71–78
Ferreira LB, da Cunha FF (2020) Multi-step ahead forecasting of daily reference evapotranspiration using deep learning. Comput Electron Agric 178:105728
Gavili S, Sanikhani H, Kisi O, Mahmoudi MH (2018) Evaluation of several soft computing methods in monthly evapotranspiration modelling. Meteorol Appl 25:128–138
Gocić M, Motamedi S, Shamshirband S, Petković D, Ch S, Hashim R, Arif M (2015) Soft computing approaches for forecasting reference evapotranspiration. Comput Electron Agric 113:164–173
Heddam S, Watts MJ, Houichi L, Djemili L, Sebbar A (2018) Evolving connectionist systems (ECoSs): a new approach for modeling daily reference evapotranspiration (ET0). Environ Monit Assess 190(9):516. https://doi.org/10.1007/s10661-018-6903-0
Hill WJ, Hunter WG (1966) A review of response surface methodology: a literature survey. Technometrics 8(4):571–590
Jain SK, Nayak PC, Sudheer KP (2008) Models for estimating evapotranspiration using artificial neural networks, and their physical interpretation. Hydrol Process 22(13):2225–2234
Karbasi M (2018) Forecasting of multi-step ahead reference evapotranspiration using Wavelet-Gaussian Process Regression Model. Water Resour Manag 32(3):1035–1052
Keshtegar B, Heddam S (2018) Modeling daily dissolved oxygen concentration using modified response surface method and artificial neural network: a comparative study. Neural Comput Appl 30:2995–3006
Keshtegar B, Kisi O (2017) Modified response-surface method: new approach for modeling pan evaporation. J Hydrol Eng 22(10):1–14
Keshtegar B, Kisi O (2018) RM5Tree: radial basis M5 model tree for accurate structural reliability analysis. Reliab Eng Syst Saf 180:49–61
Keshtegar B, Kisi O, Arab HG, Zounemat-Kermani M (2018) Subset modeling basis ANFIS for prediction of the reference evapotranspiration. Water Resour Manag 32(1):1101–1116
Keshtegar B, Kisi O, Zounemat-Kermani M (2019) Polynomial chaos expansion and response surface method for nonlinear modelling of reference evapotranspiration. Hydrol Sci J 64(6):720–730
Keshtegar B, Bagheri M, Fei C-W, Lu C, Taylan O, Thai D-K (2021) Multi-extremum-modified response basis model for nonlinear response prediction of dynamic turbine blisk. Eng Comput. https://doi.org/10.1007/s00366-020-01273-8
Khoshravesh M, Sefidkouhi MAG, Valipour M (2017) Estimation of reference evapotranspiration using multivariate fractional polynomial, Bayesian regression, and robust regression models in three arid environments. Appl Water Sci 7(4):1911–1922
Kisi O (2006) Generalized regression neural networks for evapotranspiration modelling. Hydrol Sci J 51(6):1092–1105
Kisi O (2016) Modeling reference evapotranspiration using three different heuristic regression approaches. Agric Water Manag 169:162–172
Kişi Ö, Öztürk Ö (2007) Adaptive Neurofuzzy computing technique for evapotranspiration estimation. J Irrig Drain Eng 133(4):368–379
Lu C, Fei CW, Liu HT, Li H, An LQ (2020) Moving extremum surrogate modeling strategy for dynamic reliability estimation of turbine blisk with multi-physics fields. Aerosp Sci Technol 106:106112
Mehdizadeh S, Behmanesh J, Khalili K (2017) Using MARS, SVM, GEP and empirical equations for estimation of monthly mean reference evapotranspiration. Comput Electron Agric 139:103–114
Nagappan M, Gopalakrishnan V, Alagappan M (2020) Prediction of reference evapotranspiration for irrigation scheduling using machine learning. Hydrol Sci J 65(16):2669–2677
Ozonoh M, Oboirien BO, Higginson A, Daramola MO (2020) Performance evaluation of gasification system efficiency using artificial neural network. Renew Energy 145:2253–2270. https://doi.org/10.1016/j.renene.2019.07.136
Pham BT, Phong TV, Nguyen HD, Qi C, Al-Ansari N, Amini A et al (2020) A comparative study of kernel logistic regression, radial basis function classifier, multinomial naïve bayes, and logistic model tree for flash flood susceptibility mapping. Water 12(1):239
Rahimikhoob A (2016) Comparison of M5 model tree and artificial neural network’s methodologies in modelling daily reference evapotranspiration from NOAA satellite images. Water Resour Manag 30(9):3063–3075
Saggi MK, Jain S (2019) Reference evapotranspiration estimation and modeling of the Punjab Northern India using deep learning. Comput Electron Agric 156:387–398
Salam R, Islam ARMT (2020) Potential of RT, Bagging and RS ensemble learning algorithms for reference evapotranspiration prediction using climatic data-limited humid region in Bangladesh. J Hydrol 590:125241. https://doi.org/10.1016/j.jhydrol.2020.125241
Sanikhani H, Kisi O, Maroufpoor E, Yaseen ZM (2019) Temperature-based modeling of reference evapotranspiration using several artificial intelligence models: application of different modeling scenarios. Theor Appl Climatol 135(1–2):449–462
Sayari S, Mahdavi-Meymand A, Zounemat-Kermani M (2021) Irrigation water infiltration modeling using machine learning. Comput Electron Agric 180:105921
Shahabinejad H, Vosoughi N, Saheli F (2020) Matrix effects corrections in prompt gamma-ray spectra of a PGNAA online analyzer system using artificial neural network. Prog Nucl Energy 118:103146. https://doi.org/10.1016/j.pnucene.2019.103146
Shiri J, Sadraddini AA, Nazemi AH, Kisi O, Landeras G, Fard AF, Marti P (2014) Generalizability of gene expression programming-based approaches for estimating daily reference evapotranspiration in coastal stations of Iran. J Hydrol 508:1–11
Shiri J, Marti P, Nazemi AH, Sadraddini AA, Kisi O, Landeras G, Fakheri Fard A (2015) Local vs. external training of neuro-fuzzy and neural networks models for estimating reference evapotranspiration assessed through k-fold testing. Hydrol Res 46(1):72–88
Solomatine DP, Xue Y (2004) M5 model trees and neural networks: application to flood forecasting in the upper reach of the Huai River in China. J Hydrol Eng 9(6):491–501
Tao H, Diop L, Bodian A, Djaman K, Ndiaye PM, Yaseen ZM (2018) Reference evapotranspiration prediction using hybridized fuzzy model with firefly algorithm: regional case study in Burkina Faso. Agric Water Manag 208:140–151. https://doi.org/10.1016/j.agwat.2018.06.018
Tenenbaum RA, Taminato FO, Melo VS (2020) Fast auralization using radial basis functions type of artificial neural network techniques. Appl Acoust 157:106993. https://doi.org/10.1016/j.apacoust.2019.07.041
Wang L, Kisi O, Hu B, Bilal M, Zounemat-Kermani M, Li H (2017) Evaporation modelling using different machine learning techniques. Int J Climatol 37(S1):1076–1092
Wang Y, Liu H, Yu Z, Tu L (2020) An improved artificial neural network based on human-behaviour particle swarm optimization and cellular automata. Expert Syst Appl 140:112862. https://doi.org/10.1016/j.eswa.2019.112862
Xiao M, Zhang J, Gao L, Lee S, Eshghi AT (2019) An efficient Kriging-based subset simulation method for hybrid reliability analysis under random and interval variables with small failure probability. Struct Multidiscip Optim 59:2077–2092
Xiao M, Zhang J, Gao L (2020) A system active learning Kriging method for system reliability-based design optimization with a multiple response model. Reliab Eng Syst Saf 199:106935
Yamaç SS, Todorovic M (2020) Estimation of daily potato crop evapotranspiration using three different machine learning algorithms and four scenarios of available meteorological data. Agric Water Manag 228:105875
Yin Z, Wen X, Feng Q, He Z, Zou S, Yang L (2017) Integrating genetic algorithm and support vector machine for modeling daily reference evapotranspiration in a semi-arid mountain area. Hydrol Res 48(5):1177–1191
Zahiri J, Mollaee Z, Ansari MR (2020) Estimation of suspended sediment concentration by M5 model tree based on hydrological and moderate resolution imaging spectroradiometer (MODIS) data. Water Resour Manag 34(12):3725–3737
Zhang J, Xiao M, Gao L, Chu S (2019) A combined projection-outline-based active learning Kriging and adaptive importance sampling method for hybrid reliability analysis with small failure probabilities. Comput Methods Appl Mech Eng 344:13–33
Zhang J, Gao L, Xiao M (2020a) A new hybrid reliability-based design optimization method under random and interval uncertainties. Int J Numer Methods Eng 121:4435–4457
Zhang Y, Gao L, Xiao M (2020b) Maximizing natural frequencies of inhomogeneous cellular structures by Kriging-assisted multiscale topology optimization. Comput Struct 230:106197
Zhou Q, Ren P, Tan YL (2012) Soft measurement of paper smoothness based on time-frequency analysis of paper quantization noise. Measurement 45(3):493–499. https://doi.org/10.1016/j.measurement.2011.10.023
Zhu B, Feng Y, Gong D, Jiang S, Zhao L, Cui N (2020) Hybrid particle swarm optimization with extreme learning machine for daily reference evapotranspiration prediction from limited climatic data. Comput Electron Agric 173:105430
Zhu S-P, Keshtegar B, Tian K, Trung N-T (2021) Optimization of load-carrying hierarchical stiffened shells: comparative survey and applications of six hybrid heuristic models. Arch Comput Methods Eng 1–12. https://doi.org/10.1007/s11831-021-09528-3
Zounemat-Kermani M, Kisi O, Piri J, Mahdavi-Meymand A (2019) Assessment of artificial intelligence–based models and metaheuristic algorithms in modeling evaporation. J Hydrol Eng 24(10):04019033
Zounemat-Kermani M, Matta E, Cominola A, Xia X, Zhang Q, Liang Q, Hinkelmann R (2020) Neurocomputing in surface water hydrology and hydraulics: a review of two decades retrospective, current status and future prospects. J Hydrol 588:125085. https://doi.org/10.1016/j.jhydrol.2020.125085
Acknowledgements
The authors would like to thank the Turkish State Meteorological Service (TSMS) for providing the data for this study.
Funding
This work was supported by the University of Zabol under Grant Nos. UOZ-GR-9618-1 and UOZ-GR-9719-1 and by the Iran National Science Foundation (INSF) under Project No. 97023031.
Author information
Contributions
Ozgur Kisi: Conceptualization, Investigation, Writing - original draft, Writing - review & editing. Behrooz Keshtegar: Conceptualization, Investigation, Writing - original draft, Writing - review & editing. Mohammad Zounemat-Kermani: Conceptualization, Investigation, Writing - review & editing. Salim Heddam: Conceptualization, Writing - original draft & editing. Nguyen-Thoi Trung: Conceptualization, Writing - review & editing.
Ethics declarations
Ethics approval
The authors take responsibility for the methodology, novelty, research, ethical standards, and professional guidelines in this contribution.
Consent to participate
We, the research team of this contribution, have voluntarily agreed to participate in this research study.
Consent for publication
We give consent for the publication of identifiable details, including text, materials and methods, figures, and tables, in the Journal.
Conflict of interest
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Kisi, O., Keshtegar, B., Zounemat-Kermani, M. et al. Modeling reference evapotranspiration using a novel regression-based method: radial basis M5 model tree. Theor Appl Climatol 145, 639–659 (2021). https://doi.org/10.1007/s00704-021-03645-6
DOI: https://doi.org/10.1007/s00704-021-03645-6