Introduction

Optimization models improve decision-making when the best strategy must be chosen among several alternatives. This is particularly true when decision-making must consider multi-conditional situations (e.g., Bozorg-Haddad et al. 2006). Several meta-heuristic and evolutionary algorithms have been developed for different water resources problems (Bozorg-Haddad et al. 2017). For example, Solgi et al. (2017) modified the HBMO algorithm into the Enhanced HBMO (EHBMO) to solve several mathematical benchmark problems and a real multi-reservoir problem. The results demonstrated the efficiency of the EHBMO compared with other optimization algorithms. Shokri et al. (2014) extracted the best water-quality management decisions to minimize damages due to the sudden release of pollution to the Karaj dam (Iran) by implementing the multi-objective NSGAII-ALANN algorithm (a combination of the NSGAII algorithm with a multi-layer perceptron (MLP)). Li et al. (2018) applied the multi-objective moth-flame optimization algorithm (MOMFA) to a multi-reservoir system with multiple objectives in the Lushui River basin (China). The superiority of the proposed MOMFA was verified by comparison with the results of other algorithms.

Genetic programming (GP) identifies functions that relate a system’s inputs to its outputs without capturing the processes that lead from inputs to outputs. GP searches for optimal mathematical relations between observed and calculated data (Langdon and Chen 2018). There are numerous reported applications of GP. Fernandez and Evett (1998) introduced and investigated the numeric mutation approach in genetic programming. Golubski (2002) worked on symbolic fuzzy regression problems using GP. Sheng-wu and Wei-wu (2003) introduced a point-tree data structure genetic programming (PTGP) method. Morales and Vázquez (2004) studied symbolic regression problems by means of GP. Searson et al. (2011) introduced GPTIPS, a genetic programming and symbolic regression toolbox for MATLAB, for solving symbolic regression problems. This tool was specifically designed to develop mathematical models from data by multigene symbolic regression.

Among the GP applications in hydroinformatics, we cite Savic et al. (1999) and Babovic and Keijzer (2002), who worked on rainfall-runoff modeling based on GP. Giustolisi (2004) determined the Chézy resistance coefficient in corrugated channels using GP. Liong et al. (2007) applied GP as a flow forecasting tool. Fallah-Mehdipour et al. (2013) predicted and simulated monthly groundwater levels with GP. Havlíček et al. (2013) improved rainfall-runoff forecasts by combining GP with hydrological modeling concepts. They developed the SORD! program in the R programming language (R Development Core Team 2011). Ashofteh et al. (2015) developed and evaluated rule curves of reservoir operation and compared them for baseline and future periods. The rules were calculated with GP. Danandeh Mehr et al. (2018) presented a comprehensive review of recent progress and applications of GP in water resources engineering (WRE). The representative papers were classified as having hydrologic, hydraulic, or hydro-climatological emphasis.

A survey of the archival literature indicates a wide range of applications of GP. That range is widened herein by adding multi-conditional mathematical operators, logical operators, and logical functions to GP to improve its predictive skill. This paper presents a logical genetic programming (LGP) approach that extends the current state of the art in genetic programming. Thus, the purpose of the present study is to improve the predictive skill of GP by adding conditional functions to it (LGP). The LGP performance is tested with the calculation of multi-conditional mathematical relations and of the standard operating policy (SOP) rule in water resources. The results obtained from LGP are compared with those obtained with GP.

Methodology

This section includes a brief review of the GP approach, a description of the LGP to improve the performance of the GP, a proof of the LGP efficiency in multi-conditional mathematical problem solving, and evaluation of the performance of the LGP through reconstruction of operating rules with the SOP rule. A flowchart of this paper’s methodology is shown in Fig. 1.

Fig. 1 Flowchart of the proposed methodology

Development of the logical genetic programming approach

One of the improvements that can be made to GP is the development and integration of multiple logical operators and functions into its capabilities. The development of the LGP by integrating logical operators and functions with GP is described next.

The GP process starts with the generation of a random initial population. This population consists of a set of trees built from functions (all operators or mathematical relations) and terminals (all independent parameters and constants). Every tree in the population must be evaluated in terms of its performance in solving the stated problem. This evaluation is carried out by the fitness function (an objective function plus penalties that are added to satisfy constraints). The fittest trees of the current population are selected, and genetic operators (crossover and mutation, for example) are applied to them to create a new, offspring population of solutions (Kramer and Zhang 2000). The user selects the crossover and mutation operators before the calculations begin. Members of a population of solutions are represented as trees of mathematical statements when solving a regression or optimization problem with GP. Sets of functions and terminals are used to create population members in the tree structure. The combination of these two sets enables GP to build potential solutions to a problem.

Koza (1992) first showed how GP can be applied to solve regression problems. This requires identifying the model structure and optimizing the associated numerical parameters to achieve the best possible match between observed and calculated data. Therefore, GP can be employed to simultaneously optimize the functional form of the model and estimate the associated numerical parameters. This process is known as symbolic regression. Prior knowledge of the problem helps in the development of algorithms to be added to GP.

New functions can be integrated with the GPLAB toolbox, which runs in MATLAB 9.0, by editing the “availableparams” m-file. That file is composed of three main components: the first introduces the parameters, the second specifies the possible values of each parameter, and the third determines the default value of each parameter (Silva 2007). Multi-conditional functions, logical operators (≤, ≥, <, and >), and Boolean functions are defined and integrated with GP to produce the LGP in this study. LGP can be applied to problems with various characteristics and conditions. Figure 2 depicts the LGP approach in parametric form.

Fig. 2 The LGP approach in parametric form: a function expressed with five segments
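The tree representation and the added logical functions can be illustrated with a minimal sketch. It is not GPLAB code: the nested-tuple tree encoding and the names `FUNCTIONS` and `evaluate` are assumptions made only for illustration, comparisons are assumed to return 1.0 or 0.0 so they can be mixed with arithmetic, and division and power are omitted for brevity.

```python
import operator

# Hypothetical function set: arithmetic operators plus logical additions.
# Each entry maps a symbol to (arity, implementation).
FUNCTIONS = {
    "+": (2, operator.add),
    "-": (2, operator.sub),
    "*": (2, operator.mul),
    "<": (2, lambda a, b: float(a < b)),
    "<=": (2, lambda a, b: float(a <= b)),
    ">": (2, lambda a, b: float(a > b)),
    ">=": (2, lambda a, b: float(a >= b)),
    "and": (2, lambda a, b: float(bool(a) and bool(b))),
    "if": (3, lambda cond, then, other: then if cond else other),
}


def evaluate(tree, variables):
    """Recursively evaluate a tree given as nested tuples (symbol, child, ...)."""
    if isinstance(tree, tuple):
        name, *children = tree
        arity, func = FUNCTIONS[name]
        if len(children) != arity:
            raise ValueError(f"{name!r} expects {arity} arguments")
        return func(*(evaluate(child, variables) for child in children))
    if isinstance(tree, str):
        return variables[tree]  # terminal: independent variable
    return float(tree)          # terminal: numeric constant


# Tree for: if (-3 <= x and x < 0) then (0 - x) else x
two_segment = ("if", ("and", ("<=", -3, "x"), ("<", "x", 0)), ("-", 0, "x"), "x")
print(evaluate(two_segment, {"x": -2}))  # 2.0
```

With only arithmetic functions available, a search would have to approximate the switch between the two segments with a smooth expression; the `if` node lets a single tree express it directly.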

The LGP approach applied to a multi-conditional mathematical problem

The solving capability of the LGP approach is verified in this paper with the following optimization problem featuring multi-conditional statements expressed by Eq. (1):

$$ y=f(x)=\Big\{{\displaystyle \begin{array}{ll}-{\left(x+4\right)}^2-3& -6\le x<-3\\ {}-x& -3\le x<0\\ {}x& 0\le x<3\\ {}{\left(x-4\right)}^2+6& 3\le x<6\\ {}6& 6\le x<8\end{array}} $$
(1)
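For reference, Eq. (1) can be written as a short function that generates the observed data. This is a sketch only; the function name `f` follows the equation, and raising an error outside −6 ≤ x < 8 is an assumption, since Eq. (1) is not defined there.

```python
def f(x):
    """Piecewise target function of Eq. (1), defined for -6 <= x < 8."""
    if -6 <= x < -3:
        return -(x + 4) ** 2 - 3
    if -3 <= x < 0:
        return -x
    if 0 <= x < 3:
        return x
    if 3 <= x < 6:
        return (x - 4) ** 2 + 6
    if 6 <= x < 8:
        return 6
    raise ValueError("x is outside the interval -6 <= x < 8")


print(f(-4), f(-1), f(2), f(4), f(7))  # -3 1 2 6 6
```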

The objective function minimizes the mean absolute error (MAE) over the range of x. Therefore, the objective function is written according to Eq. (2):

$$ \mathrm{Minimize}\; MA{E}_q=\frac{1}{n}\sum \limits_{i=1}^n\left|{y}_i-{y}_{qi}\right|\quad q=1,2 $$
(2)

in which MAEq is the mean absolute error of approach q, q denotes the approach, namely GP (q = 1) or LGP (q = 2), yi is the observed data, yqi is the data calculated by approach q, and n is the number of observed data in the desired range (for this problem, −6 ≤ x < 8).
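The objective of Eq. (2) is simple to compute. The following sketch assumes the observed and calculated values are held in two equal-length sequences; the function name is hypothetical.

```python
def mean_absolute_error(observed, calculated):
    """Eq. (2): mean of |y_i - y_qi| over the n observed data."""
    if len(observed) != len(calculated) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(y - yq) for y, yq in zip(observed, calculated)) / len(observed)


print(mean_absolute_error([1, 2, 3], [1, 3, 5]))  # 1.0
```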

The LGP approach applied to the reservoir operation rule (SOP rule)

Reservoir operating rules specify the desired release from reservoir storage. The simplest reservoir operation policy is the SOP (Loucks et al. 1981). The SOP rule has been applied by various researchers, such as Raman and Chandramouli (1996), Cancelliere et al. (1998), and Ashofteh et al. (2013a). It is a multi-conditional rule that uses the amount of available water as a threshold for determining the reservoir release (Fig. 3).

Fig. 3 Schematic of the SOP rule. rspt is the release during period t

The available water is defined as reservoir storage volume plus inflow to reservoir minus evaporation, according to Eq. (3):

$$ A{W}_t={S}_t+{Q}_t-{E}_t\;t=1,2,\dots, T $$
(3)

in which AWt is the available water volume during period t, St is the storage volume of reservoir at the beginning of period t, Qt is the inflow volume to reservoir during period t, T is the duration of the operation interval, and Et is the volume of evaporation during period t. Et is determined with Eq. (4):

$$ {E}_t=\left[{e}_t\times \left(a{S}_t+b\right)\right]/1,000 $$
(4)

in which et is the evaporation depth during period t; and a and b are the constants of the reservoir area as a function of storage (Loáiciga 2002).
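Equations (3) and (4) translate directly into code. In this sketch the function names are hypothetical, the values a = 0.03 and b = 0.8 are those reported for the case study, and the storage, inflow, and evaporation depth in the example are made-up numbers used only to show the arithmetic.

```python
def evaporation_volume(e_t, s_t, a, b):
    """Eq. (4): evaporation depth times reservoir area (a*S_t + b), divided by 1,000."""
    return e_t * (a * s_t + b) / 1000.0


def available_water(s_t, q_t, e_t, a, b):
    """Eq. (3): storage plus inflow minus evaporation volume."""
    return s_t + q_t - evaporation_volume(e_t, s_t, a, b)


# Illustrative (not observed) values: S_t = 100, Q_t = 10, e_t = 200
aw = available_water(s_t=100.0, q_t=10.0, e_t=200.0, a=0.03, b=0.8)
print(round(aw, 2))  # 109.24
```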

The release and spill during period t of the SOP rule is given by Eq. (5):

$$ rs{p}_t=\left\{\begin{array}{ll}A{W}_t& A{W}_t\le D\\ {}D& D<A{W}_t\le D+{S}_{\max}\\ {}A{W}_t-{S}_{\max}& D+{S}_{\max }<A{W}_t\end{array}\right.\quad t=1,2,\dots, T $$
(5)

in which rspt is the release from the SOP (observed) during period t, D is the average water demand calculated over the entire planning time interval, and Smax is the maximum volume of the reservoir.
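A minimal sketch of the SOP rule as a three-branch conditional follows. The function name is hypothetical, the demand (11.97) and reservoir capacity (145.7) are the case-study values in 10⁶ m³, and the rule is coded exactly as written in Eq. (5), so it makes no separate allowance for dead storage.

```python
def sop_release(aw, demand, s_max):
    """Release plus spill under the SOP rule of Eq. (5)."""
    if aw <= demand:
        return aw              # not enough water: release everything available
    if aw <= demand + s_max:
        return demand          # demand is met and the surplus is stored
    return aw - s_max          # reservoir is full: the excess is released or spilled


D, S_MAX = 11.97, 145.7
print(sop_release(5.0, D, S_MAX), sop_release(50.0, D, S_MAX))  # 5.0 11.97
```

The three branches are the reason the SOP is a natural test case for conditional functions: the rule is piecewise linear in the available water, with breakpoints at D and D + Smax.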

The objective function minimizes the MAE during the interval of operation of the reservoir for the SOP rule, according to Eq. (6):

$$ \mathrm{Minimize}\; MA{E}_q=\frac{1}{T}\sum \limits_{t=1}^T\left| rs{p}_t- RS{P}_{qt}\right|\quad t=1,2,\dots, T;\; q=1,2 $$
(6)

in which MAEq is the mean absolute error based on approach q and RSPqt is the calculated release during period t with approach q.

Performance criteria

The correlation coefficient (R), root mean square error (RMSE), and Nash-Sutcliffe efficiency (NSE), written in Eqs. (7)–(9) (Hu et al. 2001; Moriasi et al. 2007; Moghadam et al. 2019), respectively, are applied to compare the GP and LGP approaches:

$$ {R}_q=\frac{\sum \limits_{t=1}^T\left({z}_t-\overline{z}\right)\left({Z}_{qt}-{\overline{Z}}_q\right)}{\sqrt{\sum \limits_{t=1}^T{\left({z}_t-\overline{z}\right)}^2\cdot \sum \limits_{t=1}^T{\left({Z}_{qt}-{\overline{Z}}_q\right)}^2}}\quad t=1,2,\dots, T;\; q=1,2 $$
(7)
$$ RMS{E}_q=\sqrt{\frac{1}{T}\sum \limits_{t=1}^T{\left({Z}_{qt}-{z}_t\right)}^2}\quad t=1,2,\dots, T;\; q=1,2 $$
(8)
$$ NS{E}_q=1-\frac{\sum \limits_{t=1}^T{\left({z}_t-{Z}_{qt}\right)}^2}{\sum \limits_{t=1}^T{\left({z}_t-\overline{z}\right)}^2}\quad t=1,2,\dots, T;\; q=1,2 $$
(9)

in which zt is the observed (or additional release) data during period t, \( \overline{z} \) is the average of the observed data over the entire time interval, Zqt is the data calculated during period t with approach q, and \( {\overline{Z}}_q \) is the average of the data calculated with approach q over the entire time interval.
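The three criteria of Eqs. (7)–(9) can be sketched in plain Python as follows. The function names are hypothetical, and the example data are made up to show how the criteria respond differently to the same error.

```python
import math


def correlation(observed, calculated):
    """Eq. (7): correlation coefficient R between observed and calculated data."""
    z_bar = sum(observed) / len(observed)
    zq_bar = sum(calculated) / len(calculated)
    numerator = sum((z - z_bar) * (zq - zq_bar) for z, zq in zip(observed, calculated))
    denominator = math.sqrt(
        sum((z - z_bar) ** 2 for z in observed)
        * sum((zq - zq_bar) ** 2 for zq in calculated)
    )
    return numerator / denominator


def rmse(observed, calculated):
    """Eq. (8): root mean square error."""
    squared = sum((zq - z) ** 2 for z, zq in zip(observed, calculated))
    return math.sqrt(squared / len(observed))


def nse(observed, calculated):
    """Eq. (9): Nash-Sutcliffe efficiency."""
    z_bar = sum(observed) / len(observed)
    residual = sum((z - zq) ** 2 for z, zq in zip(observed, calculated))
    variance = sum((z - z_bar) ** 2 for z in observed)
    return 1 - residual / variance


observed = [1.0, 2.0, 3.0]
calculated = [2.0, 3.0, 4.0]  # observed data shifted by a constant bias of 1
print(correlation(observed, calculated), rmse(observed, calculated), nse(observed, calculated))
# 1.0 1.0 -0.5
```

In this example a constant bias leaves R at 1 while the RMSE and NSE both penalize it, which is one reason to report the three criteria together rather than R alone.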

The reservoir system and pertinent information

The studied reservoir system is within the Aidoghmoush basin, which is located in East Azerbaijan province (northwestern Iran) (Fig. 4) (Ashofteh et al. 2013b). The basin area is approximately 1800 km². The annual river discharge and the river length are 190 × 10⁶ m³ and 80 km, respectively. The normal level of the Aidoghmoush reservoir is 1341.5 m above sea level. The reservoir capacity is 145.7 × 10⁶ m³ and the reservoir’s dead volume is 8.7 × 10⁶ m³. The constants a and b of the reservoir area vs. storage curve equal 0.03 and 0.8, respectively. This study relies on baseline inflow data spanning T = 14 years (1987–2000). Also, the average demand over the entire planning period equals 11.97 × 10⁶ m³, as shown in Fig. 5 together with the evaporation depth.

Fig. 4 Location of the Aidoghmoush reservoir system

Fig. 5 Inflow to reservoir, average demand, and evaporation depth during the planning interval

Parameters and stopping criterion

GPLAB is a GP toolbox for the MATLAB 9.0 software (Silva 2007). Five arithmetic operators (+, −, /, ×, ^) and six developed logical operators and functions (≤, ≥, <, >, if, and) were used as the set of functions. An evolutionary search process converges to a value that is close to the optimal solution. The values of the GP and LGP parameters are presented in Table 1 for the multi-conditional mathematical problem and for the operating rule of the reservoir (extracted from the SOP).

Table 1 Parameters of the GP and LGP for the mathematical problem and the SOP rule

Results and discussion

The results obtained for the multi-conditional mathematical problem with the GP and the LGP are presented in Fig. 6a–f. Figure 6a, b shows the trends of the objective function during the extraction of the multi-conditional mathematical function by the GP and LGP, respectively. Figure 6c, d displays the functional forms, and Fig. 6e, f compares the data calculated with the GP and LGP, respectively, with the observed data.

Fig. 6 Comparison of the results of mathematical relation extraction by the GP and LGP approaches: a and b the related objective functions, c and d functional form, and e and f comparison of calculated and observed values

Figure 6a, b shows that the LGP converges better than the GP: the optimal objective value obtained with the LGP equals 0.303, which is lower (better, under minimization) than the value of 1.177 obtained with the GP. In other words, the LGP approach improves the objective function by 74% relative to the GP approach in the multi-conditional mathematical example. Also, Fig. 6e, f compares the data calculated with the GP and LGP approaches, respectively, with the observed data. It shows that the LGP approach, with a determination coefficient of 99%, performs better than the GP approach, with a determination coefficient of 84%.
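The reported improvement follows from the two objective values as a relative reduction. The helper below is a hypothetical name used only to show the arithmetic.

```python
def relative_improvement(baseline, improved):
    """Percentage reduction of a minimized objective relative to the baseline."""
    return 100.0 * (baseline - improved) / baseline


# GP objective 1.177 versus LGP objective 0.303 (mathematical problem)
print(round(relative_improvement(1.177, 0.303)))  # 74
```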

Equation (10) was calculated with GP (from solving the optimization problem of Eq. (2) for q = 1), and Eq. (11) was calculated with LGP (from solving the optimization problem of Eq. (2) for q = 2). They are depicted in Fig. 6c, d, respectively, in which the spot markers were calculated with Eq. (1). Equations (10) and (11) are as follows:

$$ {y}_{qn}\left(q=1\right)=\frac{x_n}{1.02+4.32\times {10}^{-7\left(7.39{x}_n+2.36{x}_n^2\right)}} $$
(10)
$$ {y}_{qn}\left(q=2\right)=\Big\{{\displaystyle \begin{array}{ll}0.25{x}_n+4.49& 5.86\le {x}_n\\ {}1.18{x}_n+1.04& 3.85\le {x}_n<5.86\\ {}-1.51{x}_n+11.41& 2.92\le {x}_n<3.85\\ {}{x}_n& 0\le {x}_n<2.92\\ {}-{x}_n& -3.1\le {x}_n<0\\ {}0.92{x}_n+0.44& -5.45\le {x}_n<-3.1\\ {}0.92{x}_n-0.48& {x}_n<-5.45\end{array}} $$
(11)
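Read as code, the relation of Eq. (11) is a chain of threshold tests, which is the form the added logical functions make available to the search. The sketch below transcribes the coefficients and breakpoints of Eq. (11); the function name is hypothetical.

```python
def lgp_relation(x):
    """Piecewise-linear relation of Eq. (11), tested from the highest breakpoint down."""
    if x >= 5.86:
        return 0.25 * x + 4.49
    if x >= 3.85:
        return 1.18 * x + 1.04
    if x >= 2.92:
        return -1.51 * x + 11.41
    if x >= 0:
        return x
    if x >= -3.1:
        return -x
    if x >= -5.45:
        return 0.92 * x + 0.44
    return 0.92 * x - 0.48


print(lgp_relation(1), lgp_relation(-2))  # 1 2
```

The two middle segments reproduce the x and −x branches of Eq. (1) exactly, while the quadratic branches are approximated by linear pieces.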

The results of the SOP rule extraction for the Aidoghmoush single-reservoir system with GP and LGP are shown in Fig. 7a–f. Figure 7a, b depicts the trends of the objective function for the SOP rule. Figure 7c, d displays the functional forms, and Fig. 7e, f compares the observed data with the data calculated with GP and LGP, respectively. It can be concluded from Fig. 7a, b that the LGP converges better than the GP: the LGP approach, with an objective function value of 0.3, performs better than the GP approach, with an objective function value of 0.575. In other words, the LGP approach improves the objective function by 42% relative to the GP approach insofar as optimizing the SOP rule is concerned. The data calculated with the GP and the LGP approaches are compared with the observed data in Fig. 7e, f, respectively, where it is seen that the LGP approach, with a determination coefficient of 99%, performs better than GP, with a determination coefficient of 95%. This means that the LGP approach, which is capable of incorporating logical functions, leads to better curve fitting than GP.

Fig. 7 Comparison of the results of the SOP rule extraction by the GP and LGP approaches: a and b the related objective functions, c and d the SOP curve, and e and f comparison of calculated and observed values

The rule calculated with the GP approach (from solving the optimization problem of Eq. (6) for q = 1) is given by Eq. (12) and plotted in Fig. 7c, and the rule calculated with the LGP approach (from solving the optimization problem of Eq. (6) for q = 2) is given by Eq. (13) and plotted in Fig. 7d. The SOP markers in Fig. 7c, d were calculated with Eq. (5):

$$ RS{P}_t=16.04+\frac{3.151}{0.081\cdot A{W}_t-14.73}+\frac{\left(9.258\times {10}^{-18}\times A{W_t}^{9.089}\right)-130.888}{A{W}_t-0.0247\cdot A{W}_t} $$
(12)
$$ RS{P}_t=\Big\{{\displaystyle \begin{array}{ll}0.89A{W}_t-126.38& 157.02\le A{W}_t\\ {}12.05& 17.74\le A{W}_t<157.02\\ {}1.01A{W}_t-8.66& 0.12\le A{W}_t<17.74\\ {}0& A{W}_t<0.12\end{array}} $$
(13)
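Because every segment of Eq. (13) is linear in AWt, the rule can also be stored as a table of breakpoints and coefficients and evaluated with a binary search. This is only one possible encoding, sketched with hypothetical names; the breakpoints and coefficients are transcribed from Eq. (13).

```python
from bisect import bisect_right

# Breakpoints of Eq. (13) in ascending order, and (slope, intercept) per segment.
BREAKPOINTS = [0.12, 17.74, 157.02]
SEGMENTS = [(0.0, 0.0), (1.01, -8.66), (0.0, 12.05), (0.89, -126.38)]


def lgp_release(aw):
    """Release given by the LGP rule of Eq. (13) for available water aw."""
    # bisect_right places a value equal to a breakpoint in the upper segment,
    # matching the "breakpoint <= AW_t" convention of Eq. (13).
    slope, intercept = SEGMENTS[bisect_right(BREAKPOINTS, aw)]
    return slope * aw + intercept


print(lgp_release(0.05), lgp_release(50.0))  # 0.0 12.05
```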

The GP and LGP results were compared to further evaluate the efficiency of the LGP approach relative to GP for both the multi-conditional mathematical problem and the SOP rule. The results are listed in Table 2, and they indicate that the LGP approach performs better than GP in both cases. Specifically, solving the mathematical problem with the LGP approach decreases the RMSE by 78% and increases the NSE by 18% relative to GP. Also, using the LGP approach in the reconstruction of the SOP rule decreases the RMSE by 22% and increases the NSE by 1% relative to GP.

Table 2 Comparison of the LGP approach to the GP approach using performance criteria

Concluding remarks

Logical operators and Boolean functions were developed and added to GP to create the LGP approach, seeking to improve the performance of GP in solving multi-conditional problems. The capability of the LGP was verified and evaluated with one multi-conditional mathematical problem and one water resources problem (the SOP rule). The results showed that the LGP approach improves the objective function by 74% and 42% relative to GP in the mathematical and SOP problems, respectively.

In the mathematical problem, the LGP approach decreased the RMSE by about 78%, increased the NSE by about 18%, and increased R by about 8.5% relative to GP. In the calculation of the SOP rule, the LGP approach decreased the RMSE by about 22% and increased R by about 2% relative to GP.