1 Introduction

Evapotranspiration (ET) is one of the main components of the hydrological cycle, and its accurate estimation is vital for the design and management of irrigation systems, water resources studies, and similar applications. Knowing the ET rate of a plant helps determine the amount of water required for irrigation, which in turn leads to increased productivity. Overestimating the ET rate leads to overestimation of the plants’ water requirement, with adverse effects such as waterlogged land, washout of soil nutrients, and contamination of groundwater resources. Underestimating the water requirement, on the other hand, subjects plants to moisture stress and consequently reduces crop yield. In this regard, equations such as FAO Penman, FAO Penman-Monteith, and Blaney-Criddle can be used to study the reference evapotranspiration (ET0) of plants (Feng et al. 2017a, b; Chen et al. 2019a, b). However, despite their acceptable performance in most cases, these methods require a large amount of input data, which is not always available in every region.

In such conditions, indirect approaches such as soft computing techniques and artificial intelligence-based (AI-based) models can be a suitable alternative (Zounemat-Kermani et al. 2020). Many researchers have applied AI-based models, including artificial neural networks (e.g., MLPNN and RBFNN), fuzzy logic concepts (e.g., ANFIS), regression and classification tree models (e.g., M5 tree and CHAID), and machine learning approaches (support vector machine, SVM, and support vector regression, SVR), for simulating and modeling ET0 (Kişi and Öztürk 2007; Shiri et al. 2014, 2015; Gocić et al. 2015; Yin et al. 2017; Mehdizadeh et al. 2017; Dou and Yang 2018; Zounemat-Kermani et al. 2019; Chia et al. 2020; Chen et al. 2020; Yamaç and Todorovic 2020; Adnan et al. 2021). In the following paragraphs, some of the most recent pertinent studies on modeling ET0 using AI-based models are presented. Shiri et al. (2015) compared neuro-fuzzy and artificial neural network (ANN) models for modeling ET0 with two input combinations in order to select suitable input data, and reported that local training can be applied to validate the alternative AI-based modeling. Kisi (2016) used three different models, namely multivariate adaptive regression splines (MARS), M5Tree, and least square SVR, for approximating ET0. In general, it was found that the M5Tree performed better than the other modeling approaches applied. Rahimikhoob (2016) compared the ability of ANN and M5Tree for estimating ET0 in an arid area. The study did not report a complete superiority of either model.

Khoshravesh et al. (2017) analyzed the performance of a multivariate fractional polynomial model and of Bayesian and robust regressions for estimating ET0 in arid climates. The outcomes showed that the multivariate fractional polynomial model was more accurate than the other two models. Feng et al. (2017a, b) investigated daily ET0 prediction using extreme learning machine (ELM) and generalized regression neural network (GRNN) models with temperature input data at 6 stations in China. It was concluded that ELM is a robust and accurate model compared to GRNN. Wang et al. (2017) used three different AI-based models (ANFIS-GP, fuzzy genetic model, and M5Tree) for modeling monthly ET0 values. It was reported that the fuzzy genetic model performed better than the ANFIS-GP and M5Tree models. Yin et al. (2017) used a hybrid of genetic algorithm (GA) and SVM for simulating daily ET0 in a semi-arid environment in northwest China and compared it with SVM and ANN based on eight different combinations of climatic input data. Based on their results, the GA-based hybrid model performed best. Antonopoulos and Antonopoulos (2017) compared an ANN-based multi-layer feed-forward back-propagation model with several empirical models for calculating daily ET0 in northern Greece and showed that the ANN with a sigmoid transfer function in the hidden layer can provide more accurate predictions than the empirical models.

In another study, Keshtegar et al. (2018) explored the ability of subset ANFIS, ANFIS, ANNs, and M5Tree, and reported that the subset ANFIS was superior to the other applied methods in modeling daily ET0. Gavili et al. (2018) compared the ability of ANN, ANFIS, and gene expression programming (GEP) in modeling daily ET0. The results of the AI-based models were compared with those of empirical models, and the AI-based models provided better accuracy. Dou and Yang (2018) investigated and compared the feasibility of four AI-based models (ELM, ANFIS, ANN, and SVR) for predicting daily ET0 at four sites in China. It was concluded that the ANFIS and ELM models performed better than the ANN and SVM models, while the ELM model considerably reduced the computational time of the modeling process. Keshtegar et al. (2019) compared the ability of RSM and polynomial chaos expansion (PCE) in modeling ET0 and reported that the PCE model was the more accurate approach for estimating daily ET0. Saggi and Jain (2019) compared four learning algorithms, namely a multilayer perceptron-based deep feed-forward ANN, a gradient-boosting machine, random forest regression (RF) using the M5Tree model, and a generalized linear model, for ET0 estimation at stations in Punjab, Northern India. According to their results, the deep feed-forward ANN performed better than the other models. Sanikhani et al. (2019) employed six AI-based models, namely multilayer perceptron ANN, GRNN, radial basis neural networks, integrated ANFIS with subtractive clustering and grid partitioning, and GEP, for modeling ET0 with a small number of climatic inputs. In general, it was reported that all the applied models performed practically and reliably at the investigated stations. Heddam et al. (2018) applied and compared three evolving connectionist systems (ECoS) models, namely (i) the offline-based dynamic evolving neural-fuzzy inference system (DENFIS-OF), (ii) the online-based dynamic evolving neural-fuzzy inference system (DENFIS-ON), and (iii) the evolving fuzzy neural network (EFuNN), for modeling daily ET0 in the northern region of Algeria. According to the obtained results, the best accuracy was obtained using the DENFIS-OF model. Tao et al. (2018) proposed a hybrid model combining the adaptive neuro-fuzzy inference system (ANFIS) with the firefly algorithm (ANFIS-FA) for predicting daily ET0 in Burkina Faso and reported that the hybrid ANFIS-FA provides higher accuracy than the standard ANFIS. Karbasi (2018) employed Gaussian process regression (GPR) for forecasting daily ET0 and demonstrated that wavelet decomposition significantly improved the performance of the models. Fang et al. (2018) employed RF, SVM, and MLR for predicting monthly ET0 in China and demonstrated that the SVM was more accurate. The accuracy of ET0 prediction models thus remains one of the challenges in hydrology for managing irrigation systems and water resources.

More recently, Zhu et al. (2020) employed a hybrid of the extreme learning machine (ELM) and particle swarm optimization (PSO) for daily ET0 prediction. They claimed that the PSO-ELM model offered the best accuracy among the applied models, which included ANN and random forest (RF) models. Nagappan et al. (2020) attempted to predict ET0 for irrigation scheduling using machine learning methods such as a deep learning neural network (DLNN) and RBNN. It was found that the DLNN model performed better in the prediction process.

In a similar study, Ferreira and da Cunha (2020) showed that deep learning performed slightly better than ANN and RF in predicting ET0. Salam and Islam (2020) compared various data-driven models in ET0 prediction. They used the standard SVM model as well as ensemble learning models, including the bagging random tree (RT), RF, and random subspace (RS) models. The findings showed that the RT model performed best, followed by the RF, RS, and SVM models.

Generally, AI methods are used as machine learning approaches to provide accurate predictions of ET0 owing to their flexibility in representing nonlinear relations. However, AI approaches have some limitations: (i) the input variables strongly affect their predictions, (ii) some control parameters are required to train the models, and (iii) the training process is time consuming. Regression-based models are efficient modeling approaches with a simple regression process. However, regression-based data-driven approaches also have drawbacks: (i) the choice of the regression function is important for accurate predictions, (ii) the high nonlinearity of the input data is neglected in the regression process, and (iii) a linear cross-correlation between the input and output data is used in the modeling process. Consequently, an efficient and accurate regression-based modeling approach that is free from a complex training process should be developed.

A nonlinear mapping combined with an efficient regression can provide a flexible nonlinear response with an efficient modeling process for machine learning models. The input data for training the M5Tree models can be controlled through nonlinear maps based on the radial basis function. Thus, the nonlinearity between the input variables and the response can be taken into account by applying the radial map, which can improve the nonlinear functions used by M5Tree for accurate ET0 predictions. In this study, nonlinear forms of the input data are used to improve the M5Tree-based regression model. In this way, the input data are better controlled by providing a nonlinear cross-correlation between the input and output data sets. The proposed model, the radial M5Tree (RM5Tree), was tested using data from three climate stations in Turkey (Isparta, Antalya, and Adana). The performance of the RM5Tree model was also compared with two neural network methods, the multi-layer perceptron and the radial basis function neural network, and two regression methods, the response surface method and M5Tree. The results showed that the proposed model has a more flexible ability to represent nonlinear responses than the M5Tree model and that, as more input data were included, the accuracy of the RM5Tree models improved significantly compared to the other studied models.

2 Materials and methodology

2.1 Case study

The study used daily weather data comprising relative humidity (RH), solar radiation (SR), air temperature (T), and wind speed (W) from the Adana (longitude 35° 19′ E, latitude 37° 00′ N, altitude 27 m), Antalya (longitude 30° 44′ E, latitude 36° 42′ N, altitude 47 m), and Isparta (longitude 30° 34′ E, latitude 37° 47′ N, altitude 997 m) stations in the Mediterranean Region of Turkey (Fig. 1). The data cover the period from 01 January 1972 to 31 December 2002 for Adana, from 01 January 1973 to 31 October 2002 for Antalya, and from 01 September 1978 to 31 October 2002 for Isparta. There are no missing values in the data, which were obtained from the Turkish State Meteorological Service (TSMS). Table 1 shows the statistical characteristics of the dataset in terms of minimum (Min), maximum (Max), standard deviation (Std.), mean, coefficient of variation (CV), and correlation between the input parameters and the output parameter (ET0). Table 1 indicates that the Adana and Antalya stations are more similar in terms of temperature ranges. Isparta is the only station that recorded sub-zero temperatures in the dataset used in this study. In addition, solar radiation is the parameter most strongly correlated with ET0, followed by air temperature, at all three stations. In the applications, a 65–35% data split was applied to train and test the studied models.

Fig. 1 The location of the stations in the Mediterranean region of Turkey

Table 1 Daily statistical characteristics of the independent (T, W, SR & RH) and dependent (ET0) parameters at the three meteorological stations in Turkey

2.2 Modeling approaches

2.2.1 Response surface method

The RSM is commonly implemented for modeling nonlinear relations as below (Hill and Hunter 1966):

$$ E{\hat{T}}_o={a}_0+\sum \limits_{i=1}^{NV}{a}_i{x}_i+\sum \limits_{i=1}^{NV}\sum \limits_{j=i}^{NV}{a}_{ij}{x}_i{x}_j $$
(1)

where \( E{\hat{T}}_o \) is the predicted ET0 and NV is the number of input variables x, which include the mean temperature Tmean (°C), solar radiation SR (langley), mean relative humidity RHmean (%), and wind speed W (m/s). a0, ai, and aij are the unknown coefficients of the polynomial terms of Eq. (1). Generally, the unknown coefficients are calibrated using the ordinary least squares estimator as follows (Keshtegar and Kisi 2017; Ahmadi et al. 2020):

$$ a={\left[P{(X)}^TP(X)\right]}^{-1}\;\left[P{(X)}^TE{T}_o\right] $$
(2)

where P(X) is the matrix of polynomial basis functions, which is determined from the input data in the training stage (65% of the total data). More details can be found in Keshtegar and Heddam (2017), Keshtegar et al. (2021), and Lu et al. (2020).
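As an illustration of Eqs. (1) and (2), the following minimal Python sketch builds the second-order polynomial basis and calibrates the coefficients by least squares. The function names (`quadratic_basis`, `fit_rsm`, `predict_rsm`) and the synthetic demonstration data are assumptions introduced here for illustration only; they are not the implementation used in this study.

```python
import numpy as np

def quadratic_basis(X):
    """Build P(X) for Eq. (1): a constant, the linear terms x_i,
    and all products x_i * x_j with j >= i."""
    X = np.asarray(X, dtype=float)
    n_samples, n_vars = X.shape
    columns = [np.ones(n_samples)]
    columns += [X[:, i] for i in range(n_vars)]
    columns += [X[:, i] * X[:, j] for i in range(n_vars) for j in range(i, n_vars)]
    return np.column_stack(columns)

def fit_rsm(X_train, et0_train):
    """Least-squares estimate of the coefficient vector a in Eq. (2)."""
    P = quadratic_basis(X_train)
    # lstsq returns the same solution as Eq. (2) when P has full column rank,
    # without forming the matrix inverse explicitly.
    a, *_ = np.linalg.lstsq(P, np.asarray(et0_train, dtype=float), rcond=None)
    return a

def predict_rsm(a, X):
    """Evaluate the fitted response surface of Eq. (1)."""
    return quadratic_basis(X) @ a

# Synthetic demonstration data (hypothetical, not the station records)
rng = np.random.default_rng(0)
X_demo = rng.uniform(0.0, 1.0, size=(200, 2))
y_demo = 1.0 + 2.0 * X_demo[:, 0] - X_demo[:, 1] + 0.5 * X_demo[:, 0] * X_demo[:, 1]
a_demo = fit_rsm(X_demo, y_demo)
```

With the four inputs used in this study (NV = 4), the basis has 1 + 4 + 10 = 15 columns, so 15 coefficients are calibrated.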

2.2.2 Multilayer perceptron artificial neural networks

Artificial neural networks (ANN) are black-box models capable of producing a suitable response to an external stimulus, and they are composed of two elements: the neurons and the weights (Fei et al. 2020; Sayari et al. 2021). ANN models are constructed in two distinct phases, the forward and the backward phases, which are carried out successively during the backpropagation training algorithm. The information in the predictor variables is transferred from the input neurons to the hidden neurons via the weights and then summed to obtain the total stimulus of each hidden unit (Ozonoh et al. 2020). The hidden neurons send the collected information to the output neuron through an activation function, generally the sigmoid (Shahabinejad et al. 2020). Finally, the output neuron provides a response, which is compared to the desired value, and the error is calculated. A multilayer perceptron artificial neural network (MLPNN) with one input, one hidden, and one output layer is a well-known ANN architecture (Fig. 2). Such an ANN model was employed in the present study and trained with the supervised Levenberg-Marquardt (LM) learning algorithm. According to Fig. 2, the relationship between N possible input variables (xi: climatic variables) and one output variable (ET0) was created as follows (Zhu et al. 2021):

$$ Y={\varphi}_2\left[{\beta}_0+\sum \limits_{j=1}^n{w}_{jk}\,{\varphi}_1\left(\sum \limits_{i=1}^N{x}_i{w}_{ij}+{\beta}_j\right)\right] $$
(3)

where xi is an input variable, N is the number of input variables, n is the number of hidden neurons, wij is the weight between input i and hidden neuron j, βj is the bias of hidden neuron j, ϕ1 is the sigmoid activation function of the hidden layer, wjk is the weight of the connection from neuron j in the hidden layer to the single neuron k in the output layer, β0 is the bias of the output neuron k, and ϕ2 is the activation function of the output neuron (Wang et al. 2020).
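The forward pass of Eq. (3) can be sketched in a few lines of Python. The sketch below assumes a linear (identity) output activation ϕ2, which is not stated in the text, and uses placeholder weights. The function name `mlp_forward` is hypothetical, and the Levenberg-Marquardt training step is not shown.

```python
import numpy as np

def sigmoid(z):
    """Sigmoid activation used for the hidden layer (phi_1 in Eq. 3)."""
    return 1.0 / (1.0 + np.exp(-z))

def mlp_forward(x, W_ih, b_h, w_ho, b_o, out_activation=lambda z: z):
    """Forward pass of a one-hidden-layer MLPNN, following Eq. (3).

    x    : (N,)   input variables x_i
    W_ih : (N, n) weights w_ij between input i and hidden neuron j
    b_h  : (n,)   hidden biases beta_j
    w_ho : (n,)   weights w_jk from hidden neuron j to the output neuron
    b_o  : float  output bias beta_0
    out_activation : phi_2; the identity default is an assumption.
    """
    hidden = sigmoid(np.asarray(x, dtype=float) @ W_ih + b_h)
    return out_activation(hidden @ w_ho + b_o)

# Placeholder example: four inputs (e.g., Tmean, SR, RH, W) and three hidden neurons
x_demo = np.zeros(4)
W_demo = np.ones((4, 3))
b_h_demo = np.zeros(3)
w_o_demo = np.ones(3)
y_demo = mlp_forward(x_demo, W_demo, b_h_demo, w_o_demo, b_o=0.25)
```

In practice the weights and biases would come from training on the 65% calibration subset rather than being set by hand.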

Fig. 2 General structure of the MLPNN model

2.2.3 Radial basis function neural network

The radial basis function neural network (RBFNN) belongs to the category of feedforward neural networks (FFNN). Unlike the well-known MLPNN, the RBFNN possesses only one hidden layer with a large number of neurons, each of which implements a radial basis function, generally the Gaussian function (Tenenbaum et al. 2020). The input layer transfers the predictor variables directly to the hidden layer, and the single output neuron linearly combines the weighted results of all hidden units (Fig. 3). The Gaussian activation function can be expressed as follows (Chen et al. 2019a, b; Pham et al. 2020):

$$ \varphi \left({x}_k,{\mu}_i\right)=\exp \left(-\frac{1}{2{\sigma}_i^2}\sum \limits_{m=1}^M{\left({x}_{km}-{\mu}_{im}\right)}^2\right) $$
(4)

where μi = [μi1, μi2,…, μiM] is the center of the ith Gaussian function, xk = [xk1, xk2,…, xkM] is a training sample, and σi is the width of the ith RBFNN neuron, also called the spread. During the training process, the optimal number of RBFNN neurons, the values of the centers, the weights, and the biases were determined by minimizing the mean squared error between the observed and modeled values of ET0 (Zhou et al. 2012; Bonanno et al. 2012).
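A minimal Python sketch of the Gaussian activation and of the linear output combination described above is given below. It uses the standard squared-distance form of the Gaussian kernel. The function names and the example centers, spread, and weights are hypothetical, and the selection of centers and weights by training is not shown.

```python
import numpy as np

def gaussian_rbf(x, centers, sigma):
    """Gaussian activation of each hidden neuron for one sample x.

    x       : (M,)   input sample
    centers : (n, M) centers mu_i of the n hidden neurons
    sigma   : float or (n,) spread of each neuron
    """
    x = np.asarray(x, dtype=float)
    centers = np.atleast_2d(np.asarray(centers, dtype=float))
    sq_dist = np.sum((centers - x) ** 2, axis=1)
    return np.exp(-sq_dist / (2.0 * np.asarray(sigma, dtype=float) ** 2))

def rbfnn_output(x, centers, sigma, weights, bias=0.0):
    """Single output neuron: weighted linear combination of the hidden activations."""
    return float(gaussian_rbf(x, centers, sigma) @ np.asarray(weights, dtype=float) + bias)

# Placeholder example with two hidden neurons in a two-dimensional input space
centers_demo = [[0.0, 0.0], [1.0, 1.0]]
phi_demo = gaussian_rbf([0.0, 0.0], centers_demo, sigma=1.0)
out_demo = rbfnn_output([0.0, 0.0], centers_demo, 1.0, weights=[2.0, 0.0], bias=0.5)
```

A sample that coincides with a center gives an activation of exactly 1 for that neuron, and the activation decays smoothly with distance at a rate set by the spread.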

Fig. 3 Architecture of the established radial basis function neural network (RBFNN)

2.2.4 M5 tree model

The M5 tree model is a subset-based data mining and machine learning method. Tree-based methods are a family of data mining methods whose output, obtained from the input and output data, is a model with a tree structure (Solomatine and Xue 2004; Zahiri et al. 2020). Tree models are fundamentally based on the divide-and-conquer method. The M5 model fits linear regression equations at the nodes, which makes it capable of predicting or estimating numerical variables. The structure of a decision tree resembles a tree consisting of a root, branches, nodes, and leaves. A tree model is built in two steps. In the first step, the decision tree is constructed by splitting the data. The split criterion in the M5 model is to maximize the standard deviation reduction (SDR) of the data at the offspring nodes. When no further reduction of the standard deviation at the offspring nodes is possible, the parent node is not split and becomes an end node, or leaf. The following formula is used to calculate the SDR:

$$ SDR= Sd(T)-\sum \limits_i\frac{\left|{T}_i\right|}{\left|T\right|} Sd\left({T}_i\right) $$
(5)

where T represents the set of samples entering a node, Ti represents the subset of samples with the ith outcome of the potential test, |·| denotes the number of samples in a set, and Sd is the standard deviation of the target values in a set, which can be calculated as follows:

$$ Sd(T)=\sqrt{\frac{1}{N}\left(\sum \limits_{i=1}^N{y}_i^2-\frac{1}{N}{\left(\sum \limits_{i=1}^N{y}_i\right)}^2\right)} $$
(6)

where yi is the numerical value of the target attribute of sample i and N is the number of samples in T. Because the offspring nodes produced by branching have a lower standard deviation than their parent nodes, they are more homogeneous and give more accurate results. Once all the possible splits have been examined, the M5 model selects the one with the minimum expected error. The second step in building a tree model is to shrink the overgrown and overfitted tree by pruning branches and replacing them with linear regression functions (Rahimikhoob 2016).
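The split criterion of Eqs. (5) and (6) can be illustrated with the following Python sketch, which scans candidate thresholds on a single input and keeps the one with the largest SDR. It is a simplified illustration under stated assumptions (one input, binary splits at midpoints between sorted values) and omits the leaf regression and pruning steps of the full M5 algorithm. The function names are hypothetical.

```python
import numpy as np

def sd(y):
    """Population standard deviation of the target values, as in Eq. (6)."""
    y = np.asarray(y, dtype=float)
    n = y.size
    variance = (np.sum(y ** 2) - np.sum(y) ** 2 / n) / n
    return np.sqrt(max(variance, 0.0))  # guard against tiny negative round-off

def sdr(y_parent, subsets):
    """Standard deviation reduction of Eq. (5) for a candidate split."""
    total = len(y_parent)
    return sd(y_parent) - sum(len(s) / total * sd(s) for s in subsets)

def best_split(x, y):
    """Return (threshold, SDR) of the best binary split on one input variable."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    values = np.unique(x)
    best_threshold, best_gain = None, 0.0
    for lo, hi in zip(values[:-1], values[1:]):
        threshold = 0.5 * (lo + hi)
        gain = sdr(y, [y[x <= threshold], y[x > threshold]])
        if gain > best_gain:
            best_threshold, best_gain = threshold, gain
    return best_threshold, best_gain

# Toy example: the targets form two homogeneous groups separated at x = 1.5
threshold_demo, gain_demo = best_split([0.0, 1.0, 2.0, 3.0], [1.0, 1.0, 5.0, 5.0])
```

In the toy example the parent node has a standard deviation of 2 and both offspring nodes have a standard deviation of 0, so the split between the two groups removes all of the spread.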

3 Radial M5 model tree

To enhance the accuracy of ET0 estimations, radial basis M5Tree (RM5Tree) is introduced. The input data set is controlled by applying the radial basis function (RBF) in feature space. By transferring input data from original to radial map, the RBF is applied in RM5Tree as follows (Chen et al. 1991; Xiao et al. 2020; Zhang et al. 2020a):

$$ {K}_{ij}=\varphi \left(\left\Vert {Z}_i-{C}_j\right\Vert, \varepsilon \right)=\exp \left(-\varepsilon {\left\Vert {Z}_i-{C}_j\right\Vert}^2\right),\kern0.36em i=1,\dots, N,\kern0.36em j=1,\dots, RF $$
(7)

where RF is the number of radial sets with shape factor ε, and C represents the center of the RBF. Z is the normalized map, which is given as follows (Zhang et al. 2019; Keshtegar and Kisi 2018):

$$ Z=\frac{X-{\mu}_x}{\sigma_x} $$
(8)

where μx and σx are the mean and standard deviation (SD) of the input data x, respectively. The radial transformation given by Eq. (7) with C=0 and ε=0.5 is shown in Fig. 4, which illustrates the nonlinear map. Thus, a new data set can be used to train a model by transforming the original data set from NV dimensions (X-space) to RF dimensions (radial space).

Fig. 4 Schematic view of RBF (K) for C=0 and ε=0.5

The two RBF parameters, the shape factor and the center locations, are selected as ε=0.5 and C ∈ [Xmin, Xmax], where the centers are drawn randomly from the domain of the input dataset, with RF = 5, 10, 20, and 50 in this study. The schematic structure of RM5Tree is presented in Fig. 5. The model involves three layers: the input, transferring, and modeling layers. Using Eq. (8), the input dataset is normalized in the first (input) layer, while the RF-dataset is provided by transferring the data to the second layer as follows:

  1) Create the RF center points randomly from the domain of each input variable.

  2) Transfer the input data set in layer 1 into radial space by using the RBF given in Eq. (7), based on the RF center points, as follows:

$$ Z=\left[\begin{array}{cccc}{z}_{1,1}& {z}_{1,2}& \cdots & {z}_{1, NV}\\ {}{z}_{2,1}& {z}_{2,2}& \cdots & {z}_{2, NV}\\ {}\vdots & \vdots & \ddots & \vdots \\ {}{z}_{N,1}& {z}_{N,2}& \cdots & {z}_{N, NV}\end{array}\right]\to K=\left[\begin{array}{cccc}{K}_{1,1}& {K}_{1,2}& \cdots & {K}_{1, RF}\\ {}{K}_{2,1}& {K}_{2,2}& \cdots & {K}_{2, RF}\\ {}\vdots & \vdots & \ddots & \vdots \\ {}{K}_{N,1}& {K}_{N,2}& \cdots & {K}_{N, RF}\end{array}\right]\kern0.24em $$
(9)

Fig. 5 Schematic structure of RM5Tree model

where N is the number of data in the training stage (65% of the total data), NV is the number of input variables, RF is the number of radial input data, and Kij (i = 1, 2, ..., N; j = 1, 2, ..., RF) are the radial inputs. The radial input data are used in the training phase of the M5Tree models. Therefore, the nonlinear map applied using the Gaussian function and the number of center points improve the accuracy of M5Tree models (Keshtegar et al. 2018).
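The input and transferring layers described above (Eqs. 7 to 9) can be sketched in Python as follows. The sketch assumes that the centers are drawn uniformly between the minimum and maximum of each normalized input, which is one possible reading of the random selection described in the text. The names `fit_radial_map` and `radial_transform` are hypothetical. The modeling layer, i.e., an M5Tree trained on K, is not shown and would come from a separate model-tree implementation.

```python
import numpy as np

def fit_radial_map(X_train, n_centers, eps=0.5, seed=0):
    """Layer 1 and center selection: normalize with Eq. (8), then draw RF centers.

    Drawing the centers uniformly in the normalized input domain is an assumption.
    """
    X_train = np.asarray(X_train, dtype=float)
    mu = X_train.mean(axis=0)
    sigma = X_train.std(axis=0)
    Z = (X_train - mu) / sigma
    rng = np.random.default_rng(seed)
    centers = rng.uniform(Z.min(axis=0), Z.max(axis=0), size=(n_centers, Z.shape[1]))
    return {"mu": mu, "sigma": sigma, "centers": centers, "eps": eps}

def radial_transform(X, radial_map):
    """Layer 2: map an (N, NV) data set to the (N, RF) radial matrix K of Eq. (9)."""
    Z = (np.asarray(X, dtype=float) - radial_map["mu"]) / radial_map["sigma"]
    diff = Z[:, None, :] - radial_map["centers"][None, :, :]
    sq_dist = np.sum(diff ** 2, axis=2)
    return np.exp(-radial_map["eps"] * sq_dist)  # Eq. (7)

# Synthetic demonstration: 100 samples, NV = 4 inputs, RF = 10 radial features
rng_demo = np.random.default_rng(1)
X_demo = rng_demo.normal(size=(100, 4))
map_demo = fit_radial_map(X_demo, n_centers=10)
K_demo = radial_transform(X_demo, map_demo)
```

The same mean, standard deviation, and centers fitted on the training subset would be reused to transform the test subset, so that both subsets share one radial space.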

4 Application of the models

4.1 Modeling scenarios

Based on the correlation coefficients in Table 1 between the independent variables (T, RH, SR, and W) and the dependent variable (ET0), three different modeling scenarios are considered for constructing the models (M5Tree, RM5Tree, MLPNN, RBFNN, and RSM). These scenarios are tabulated in Table 2. In the first scenario, only one input parameter is considered for modeling ET0: (i) Tmean; (ii) W; (iii) SR; (iv) RH. The second scenario combines the most correlated parameters: (v) Tmean, SR and (vi) Tmean, SR, RH. Finally, the third scenario uses all of the independent parameters: (vii) Tmean, SR, RH, W.

Table 2 Modeling scenarios based on the input variables for the applied models

4.2 Evaluation of the models

The models’ accuracies were compared according to the mean absolute error (MAE), determination coefficient (R2), root mean square error (RMSE), agreement index (d), and Nash and Sutcliffe efficiency (NES) statistics (Xiao et al. 2019; Zhang et al. 2020b).

$$ MAE=\frac{1}{N}\sum \limits_{i=1}^N\mid {\left(E{T}_0\right)}_i-{\left(E{T}_p\right)}_i\mid $$
(10)
$$ RMSE=\sqrt{\frac{1}{N}\sum \limits_{i=1}^N{\left[{\left(E{T}_0\right)}_i-{\left(E{T}_p\right)}_i\right]}^2} $$
(11)
$$ d=1-\frac{\sum \limits_{i=1}^N{\left[{\left(E{T}_0\right)}_i-{\left(E{T}_p\right)}_i\right]}^2}{\sum \limits_{i=1}^N{\left[|{\left(E{T}_0\right)}_i-{\overline{ET}}_0|+|{\left(E{T}_p\right)}_i-{\overline{ET}}_0|\right]}^2},\kern0.36em 0<d\le 1 $$
(12)
$$ NES=1-\frac{\sum \limits_{i=1}^N{\left[{\left(E{T}_0\right)}_i-{\left(E{T}_p\right)}_i\right]}^2}{\sum \limits_{i=1}^N{\left[{\left(E{T}_0\right)}_i-{\overline{ET}}_0\right]}^2},\kern0.36em -\infty < NES\le 1 $$
(13)
$$ {R}^2={\left[\frac{\frac{1}{N}\sum \limits_{i=1}^N\left({\left(E{T}_0\right)}_i-{\overline{ET}}_0\right)\ \left({\left(E{T}_p\right)}_i-{\overline{ET}}_p\right)}{\sqrt{\frac{1}{N}\sum \limits_{i=1}^N{\left({\left(E{T}_0\right)}_i-{\overline{ET}}_0\right)}^2}\sqrt{\frac{1}{N}\sum \limits_{i=1}^N{\left({\left(E{T}_p\right)}_i-{\overline{ET}}_p\right)}^2}}\right]}^2,\kern0.36em 0<{R}^2\le 1 $$
(14)

in which N is the number of data, and ET0, ETp, \( {\overline{ET}}_0 \), and \( {\overline{ET}}_p \) are the FAO-56 PM ET0, the predicted ET0, the mean FAO-56 PM ET0, and the mean predicted ET0, respectively.
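For reference, the five statistics can be computed as in the following Python sketch. The RMSE is taken here as the square root of the mean squared error, which is the standard definition, and R2 is computed as the squared Pearson correlation. The function name `evaluate` and the example values are hypothetical.

```python
import numpy as np

def evaluate(et_obs, et_pred):
    """Return MAE, RMSE, agreement index d, NES, and R2 for paired ET0 series."""
    o = np.asarray(et_obs, dtype=float)   # FAO-56 PM ET0 (reference)
    p = np.asarray(et_pred, dtype=float)  # model-predicted ET0
    err = o - p
    o_mean = o.mean()
    sse = np.sum(err ** 2)
    return {
        "MAE": np.mean(np.abs(err)),
        "RMSE": np.sqrt(np.mean(err ** 2)),
        "d": 1.0 - sse / np.sum((np.abs(o - o_mean) + np.abs(p - o_mean)) ** 2),
        "NES": 1.0 - sse / np.sum((o - o_mean) ** 2),
        "R2": np.corrcoef(o, p)[0, 1] ** 2,
    }

# Toy example: predictions offset from the reference by a constant 0.5 mm
stats_demo = evaluate([1.0, 2.0, 3.0, 4.0], [1.5, 2.5, 3.5, 4.5])
```

The toy example shows why several statistics are reported together: a constant bias leaves R2 equal to 1 while MAE, RMSE, d, and NES all register the error.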

5 Results and discussion

5.1 Result analysis for the Isparta Station

The training and testing results of the investigated models (RSM, M5Tree, and RM5Tree) for Isparta Station are given in Table 3. The RM5Tree model performs better than the M5Tree and RSM models with respect to the various criteria in all input combinations (Scenarios I to III). In the testing phase, the proposed RM5Tree improves the RMSE (d), as accuracy (tendency) factors, by about 42% (6%) and 15% (2%) for Scenario I, 75% (15%) and 60% (3%) for Scenario II, and 105% (1%) and 90% (1%) for Scenario III compared to the M5Tree and RSM models, respectively. Scenario I implies that, among the single input variables, SR has the strongest effect on ET0, followed by Tmean and RH, while W has the least effect. This result was expected given the correlation coefficients in Table 1. In Scenario II, combining SR (Tmean) with Tmean (SR) considerably improves the models’ accuracy. For example, it improved the MAE, RMSE, and NES of RM5Tree by 42% (13%), 43% (19%), and 34% (7%), respectively. Adding RH to the Tmean and SR inputs also increases the accuracy of the employed models. For example, the MAE and RMSE of RM5Tree decreased from 0.455 and 0.551 mm to 0.215 and 0.32 mm, i.e., by 52% and 42%, respectively. In Scenario III, even though W appears to be the least effective parameter among the first four input combinations, adding W to the other three inputs considerably decreases the MAE and RMSE of the models (the MAE and RMSE of the RM5Tree decreased by 82% and 84%, respectively). According to the results of Scenario III, the accuracy of the M5Tree model with respect to MAE and RMSE was improved by 43% and 51%, respectively, by using RM5Tree.

Table 3 Comparative results of different modeling approaches for Isparta Station in training and testing phases

5.2 Result analysis for the Antalya Station

Table 4 reports the comparative results of the models in estimating ET0 at Antalya Station. As for Isparta, the RM5Tree model outperforms the other models. From the first scenario (input combinations i to iv), the effective parameters in modeling ET0 are, from most to least, SR, Tmean, RH, and W. The accuracy of the RM5Tree with respect to MAE, RMSE, and NES is improved by up to 10% (15%), 48% (39%), and 50% (27%), respectively, by adding the SR input. Similarly, adding RH to the Tmean and SR inputs decreases the MAE and RMSE of RM5Tree by 73% and 42%, respectively. Moreover, including the W input with the Tmean, SR, and RH combination in Scenario III considerably increases the RM5Tree accuracy (the MAE and RMSE decrease by 87% and 89%).

Table 4 Comparative results of different modeling approaches for Antalya station in training and testing phases

5.3 Result analysis for the Adana Station

Table 5 compares the training and testing statistics of the three methods for Adana Station. As for Isparta, the RM5Tree model gave the best accuracy in modeling ET0 at this station with respect to the various evaluation statistics. According to the single input combinations in Scenario I, the most effective variable on ET0 is SR, followed by Tmean and RH. Using SR (Tmean) together with the Tmean (SR) input improves the RM5Tree accuracy with respect to MAE, RMSE, and NES by 35% (11%), 40% (12%), and 43% (5%), respectively, in the test period. Including RH as an input to the RM5Tree with Tmean and SR inputs decreases the MAE and RMSE of the model from 0.553 and 0.671 mm to 0.167 and 0.325 mm, i.e., by 70% and 50%, respectively. Similarly, adding W to the three inputs (Tmean, SR, and RH) in the third scenario considerably increases the RM5Tree accuracy, reducing the MAE from 0.167 mm to 0.038 mm and the RMSE from 0.325 mm to 0.122 mm.

Table 5 Comparative results of different modeling approaches for Adana station in training and testing phases

5.4 Discussion

The ET0 estimates of the M5Tree, RSM, and RM5Tree models are compared in Figs. 6, 7, 8, 9, 10, 11, and 12 for each station and each input combination. The effect of each variable on ET0 can be seen visually from the figures. Comparison of Figs. 6 and 7 indicates that the effective ranks of the variables (from the most to the least) are SR, Tmean, RH, and W. Jain et al. (2008) also reported the same ranking of the effective parameters (SR, Tmean, RH, W, and lastly dew point temperature) using hourly ET0 data for a few stations in the Reynolds Creek Experimental Watershed in south-western Idaho, USA. In addition, the effect of each variable on ET0 can also be observed from Figs. 8 and 9. Comparison of Figs. 11 and 12 shows the considerable effect of the W variable, even though this cannot be seen when W is used as the only input. A single-input model cannot capture the relationship between W and ET0. All these indicate the necessity of this variable for accurate modeling of ET0. It should be noted that the M5Tree model estimates are not accurate in Adana compared to other stations and methods. The reason might be that the relationship between the inputs and the output is more nonlinear in Adana than at the other stations, and the M5Tree model, having a linear nature, might not adequately map this highly nonlinear relationship. Table 6 compares the results of the best RM5Tree model with two of the most prevalent AI-based models, MLPNN and RBNN (multi-layer perceptron neural network and radial basis neural network). It can be concluded that all the AI-based models performed better when all the input variables were considered (Scenario III), with the exception of the RBNN at Adana Station. Although the MLPNN model gave better results than the RBNN model, it could not surpass the performance of the proposed RM5Tree model.
To provide a better diagnostic analysis of the efficiency of all the AI-based models (M5Tree, RM5Tree, RSM, MLPNN, and RBNN), the results of the best input category in Scenarios I, II, and III in terms of RMSE (mm) are shown in Fig. 13 using radar charts. A smaller star, with lower RMSE values, indicates better model performance. It can easily be seen that involving all the variables (T, SR, RH, and W in Scenario III) results in lower RMSE values (with the exception of the RBNN at Adana Station). This major finding is supported by the outcomes of different AI-based models in a similar study by Kisi (2006). Further evaluation was carried out using the Taylor diagram to check the performance of the models, as presented in Fig. 14. At all stations, RM5Tree performs better than the other models; the results in Fig. 14 provide clear evidence that the proposed approach improves the accuracy of the M5Tree model. Finally, to further compare the accuracy of the models, the results using the best input combination for each model are shown as box plots in Fig. 15. The box plots for the test data in Fig. 15 clearly show that the accuracy of the RM5Tree model was higher than that of the other models.

Fig. 6 Scatterplot of the M5Tree, RSM, and RM5Tree models based on the input data of mean temperature (T) in test (35% from all data) period for Isparta, Antalya, and Adana stations (RF: number of radial function)

Fig. 7 Scatterplot of the M5Tree, RSM, and RM5Tree models based on the input data of wind speed (W) in test (35% from all data) period for Isparta, Antalya, and Adana stations (RF: number of radial function)

Fig. 8 Scatterplot of the M5Tree, RSM, and RM5Tree models based on the input data of solar radiation (SR) in test (35% from all data) period for Isparta, Antalya, and Adana stations (RF: number of radial function)

Fig. 9 Scatterplot of the M5Tree, RSM, and RM5Tree models based on the input data of relative humidity (RH) in test (35% from all data) period for Isparta, Antalya, and Adana stations (RF: number of radial function)

Fig. 10 Scatterplot of the M5Tree, RSM, and RM5Tree models based on the input data of mean temperature (T) and solar radiation (SR) in test (35% from all data) period for Isparta, Antalya, and Adana stations (RF: number of radial function)

Fig. 11 Scatterplot of the M5Tree, RSM, and RM5Tree models based on the input data of mean temperature (T), solar radiation (SR) and relative humidity (RH) in test (35% from all data) period for Isparta, Antalya, and Adana stations (RF: number of radial function)

Fig. 12 Scatterplot of the M5Tree, RSM, and RM5Tree models based on the input data of mean temperature (T), solar radiation (SR), relative humidity (RH) and wind speed (W) in test (35% from all data) period for Isparta, Antalya, and Adana stations (RF: number of radial function)

Table 6 Comparing the results of the best RM5Tree model with the best structures of the well-known radial basis function neural network (RBFNN) and multi-layer perceptron neural network (MLPNN) in the testing phase for the three meteorological stations

Fig. 13 Radar chart for the best calculated values of RMSE (mm) for the applied models using the three input scenarios

Fig. 14 Taylor diagram displaying a statistical comparison of the proposed models with FAO-56 PM (mm). The green circles correspond to circumferences of equal centered normalized root-mean-square (NRMS) difference between measured and calculated ET0, the blue lines correspond to lines of equal correlation coefficients, and the dotted red circles correspond to circumferences of equal standard deviations

Fig. 15 Box plots of FAO-56 PM and calculated values of ET0 in the test phase of all stations. The box stretches from the 25th percentile to the 75th percentile. The median is shown as a red line, and the whiskers correspond to the most extreme data points

6 Conclusion

In the presented work, the applicability of a new method which is developed by combining radial basis function and M5Tree methods was investigated in modeling ET0. The new method was compared with standard M5Tree, RSM, MLPNN, and RBNN using daily climatic data from three stations located in Turkey. Various input combinations of available data were tried to see the effect of each input variable on ET0. The following conclusions were derived from the applications.

  i- The comparison of methods revealed that the new proposed method, RM5Tree, provided better ET0 estimates than the MLPNN, RBNN, M5Tree, and RSM. The accuracy of M5Tree models was considerably improved (by more than 30% with respect to MAE and RMSE) by using RM5Tree.

  ii- The results obtained with different input combinations indicated that the most effective variable on the models’ accuracy in estimating ET0 was solar radiation, followed by air temperature, relative humidity, and wind speed. However, it was also observed that using wind speed together with the other three inputs considerably increases the models’ efficiency (by more than 80% with respect to MAE and RMSE).

  iii- The study showed that the proposed RM5Tree model could be used as a better alternative to the M5Tree in modeling daily ET0.

  iv- The ability of this method can be tested at other stations, and the method can be applied to other hydrological problems in the future.