Abstract
Genetic programming (GP) is a variant of evolutionary algorithms (EAs), which are general-purpose search algorithms. Yet, GP does not solve multi-conditional problems satisfactorily. This study improves GP’s predictive skill by developing mathematical logical operators and functions and integrating them into it. The proposed improvement is herein named logical genetic programming (LGP), and its performance is compared with that of GP using examples from the fields of mathematics and water resources. The objective functions minimize the mean absolute error (MAE). LGP performs better in both examples, improving the objective function by 74% in the mathematical example and by 42% in the water resources example relative to GP. A comparison of the LGP and GP results with alternative performance criteria demonstrates the better capability of LGP in solving multi-conditional problems.
Introduction
Decision-making for choosing the best among several possible strategies is improved by the use of optimization models. This is particularly true when decision-making must consider multi-conditional situations (e.g., Bozorg-Haddad et al. 2006). Several meta-heuristic and evolutionary algorithms are being developed for different water resources issues (Bozorg-Haddad et al. 2017). For example, Solgi et al. (2017) modified the honey-bee mating optimization (HBMO) algorithm into an enhanced HBMO (EHBMO) to solve several mathematical benchmark problems and a real multi-reservoir problem. The results demonstrated the efficiency of the EHBMO compared with other optimization algorithms. Shokri et al. (2014) extracted the best water-quality management decisions to minimize damages due to the sudden release of pollution to Karaj dam (in Iran) by the implementation of the multi-objective NSGAII-ALANN algorithm (a combination of the NSGAII algorithm with a multi-layer perceptron (MLP)). Li et al. (2018) applied the multi-objective moth-flame optimization algorithm (MOMFA) to a multi-reservoir system with multiple objectives in the Lushui River basin (China). The superiority of the MOMFA was verified by comparison with the results of other algorithms.
GP identifies functions that relate a system’s inputs to its outputs without capturing the processes that lead from inputs to outputs. GP searches for optimal mathematical relations between observed and calculated data (Langdon and Chen 2018). There are numerous reported applications of GP. Fernandez and Evett (1998) introduced and investigated the numeric mutation approach in genetic programming. Golubski (2002) worked on symbolic fuzzy regression problems using GP. Sheng-wu and Wei-wu (2003) introduced a point-tree data structure genetic programming (PTGP) method. Morales and Vázquez (2004) studied symbolic regression problems by means of GP. Searson et al. (2011) introduced genetic programming and symbolic regression for the MATLAB (GPTIPS) software for solving symbolic regression problems. This tool was specifically designed to develop mathematical models for data that are multi-genetic in nature.
Among the GP applications in hydroinformatics, we cite Savic et al. (1999) and Babovic and Keijzer (2002), who worked on rainfall-runoff modeling based on GP. Giustolisi (2004) determined the Chèzy resistance coefficient in corrugated channels using GP. Liong et al. (2007) applied GP as a flow forecasting tool. Fallah-Mehdipour et al. (2013) predicted and simulated monthly groundwater levels with GP. Havlíček et al. (2013) improved rainfall-runoff forecasts by combining GP with hydrological modeling concepts. They developed the SORD! program in the R programming language (R Development Core Team 2011). Ashofteh et al. (2015) developed and evaluated rule curves of reservoir operation and compared them for baseline and future periods. The rules were calculated with GP. Danandeh Mehr et al. (2018) presented a comprehensive review of recent progress and applications of GP in water resources engineering (WRE). The representative papers were classified as having hydrologic, hydraulic, and hydro-climatological emphasis.
A survey of the archival literature indicates a wide range of GP applications. That range is herein widened by adding multi-conditional mathematical operators, logical operators, and logical functions to GP to improve its predictive skill. This paper presents a logical genetic programming (LGP) approach that advances the current state of the art in genetic programming. The purpose of the present study is thus to improve GP’s predictive skill by adding conditional functions to it. The LGP performance is tested with the calculation of multi-conditional mathematical relations and of the standard operating policy (SOP) rule in water resources. The results obtained from LGP are compared with those obtained with GP.
Methodology
This section includes a brief review of the GP approach, a description of the LGP to improve the performance of the GP, a proof of the LGP efficiency in multi-conditional mathematical problem solving, and evaluation of the performance of the LGP through reconstruction of operating rules with the SOP rule. A flowchart of this paper’s methodology is shown in Fig. 1.
Development of the logical genetic programming approach
One of the improvements that can be made to GP is the development and integration of multiple mathematical logical operators and functions to its capabilities. The development of the LGP by integration of logical operators and functions to GP is described next.
The GP process starts with the generation of a random initial population. This random initial population consists of a set of trees built from functions (all operators or mathematical relations) and terminals (all independent parameters and constants). All trees in the population must be evaluated in terms of their performance in solving a stated problem. This evaluation is carried out by the fitness function (an objective function plus penalties that are added to satisfy constraints). The fittest trees of the current population are selected, and genetic operators (crossover and mutation, for example) are applied to them to create a new, offspring population of solutions (Kramer and Zhang 2000). The user selects the crossover and mutation operators at the beginning of the calculations. Members of a population of solutions are shown as a tree of mathematical statements when solving a regression or optimization problem with GP. Sets of functions and terminals are used to create population members in the tree structure. The combination of these two sets enables GP to build potential solutions to the problem. Koza (1992) first showed how GP can be applied to solve regression problems. This requires identifying the model structure and optimizing the associated numerical parameters to achieve the best possible match between observed and calculated data. Therefore, GP can be simultaneously employed to optimize the model functional form and to estimate the associated numerical parameters. This process is known as symbolic regression. Prior knowledge of the problem helps in the development of algorithms to be added to GP.
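The tree representation and fitness evaluation described above can be illustrated with a minimal Python sketch. It is not the GPLAB implementation used in this study: the nested-tuple encoding, the names `FUNCTIONS`, `TERMINALS`, `random_tree`, `evaluate`, and `mae_fitness`, the protected division, and the 0.3 early-stop probability are all assumptions made for illustration, and constraint penalties are omitted.

```python
import random

# Function set: name -> (arity, implementation). Division is "protected"
# (returns 1.0 on a zero denominator), a common GP convention assumed here.
FUNCTIONS = {
    "add": (2, lambda a, b: a + b),
    "sub": (2, lambda a, b: a - b),
    "mul": (2, lambda a, b: a * b),
    "div": (2, lambda a, b: a / b if b != 0 else 1.0),
}
# Terminal set: the independent variable and two illustrative constants.
TERMINALS = ["x", 1.0, 2.0]


def random_tree(depth, rng):
    """Grow a random tree encoded as nested tuples: (function_name, child, ...)."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    name = rng.choice(sorted(FUNCTIONS))
    arity, _ = FUNCTIONS[name]
    return (name,) + tuple(random_tree(depth - 1, rng) for _ in range(arity))


def evaluate(tree, x):
    """Recursively evaluate a tree at the input value x."""
    if isinstance(tree, tuple):
        _, fn = FUNCTIONS[tree[0]]
        return fn(*(evaluate(child, x) for child in tree[1:]))
    if tree == "x":
        return x
    return tree  # numeric constant


def mae_fitness(tree, xs, ys):
    """Mean absolute error between the observed ys and the tree's outputs."""
    return sum(abs(y - evaluate(tree, x)) for x, y in zip(xs, ys)) / len(xs)
```

For data generated by y = 2x + 1, the tree `("add", ("mul", 2.0, "x"), 1.0)` reproduces the observations exactly, so `mae_fitness` returns zero for it. A full GP run would repeatedly select low-error trees from a population built with `random_tree` and recombine them by crossover and mutation, which this sketch leaves out.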
Integration of new functions with the GPLAB toolbox is possible by editing the “availableparams” m-file. The GPLAB toolbox is available in MATLAB 9.0. It is composed of three main components: the first introduces parameters, the second specifies possible values for each parameter, and the third determines the default values for each parameter (Silva 2007). Multi-conditional functions, logical operators (≤, ≥, <, and >), and Boolean functions are defined and integrated with GP to produce the LGP in this study. LGP can be applied to problems with various characteristics and conditions. Figure 2 depicts the LGP approach in parametric form.
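To make the role of the added primitives concrete, the following Python sketch shows one way comparison operators, a Boolean `and`, and a three-argument `if` could sit in a function set next to arithmetic operators. It does not reproduce the GPLAB m-files: the primitive names, the encoding of true and false as 1.0 and 0.0, and the two-branch target function are assumptions chosen only for illustration (the target is not Eq. (1) of this paper).

```python
# Arithmetic primitives: name -> (arity, implementation).
ARITHMETIC = {
    "add": (2, lambda a, b: a + b),
    "sub": (2, lambda a, b: a - b),
    "mul": (2, lambda a, b: a * b),
}
# Logical primitives return 1.0 (true) or 0.0 (false) so that their outputs
# can feed other nodes; this numeric encoding is an assumption of the sketch.
LOGICAL = {
    "le": (2, lambda a, b: 1.0 if a <= b else 0.0),
    "ge": (2, lambda a, b: 1.0 if a >= b else 0.0),
    "lt": (2, lambda a, b: 1.0 if a < b else 0.0),
    "gt": (2, lambda a, b: 1.0 if a > b else 0.0),
    "and": (2, lambda a, b: 1.0 if a != 0 and b != 0 else 0.0),
    "if": (3, lambda cond, a, b: a if cond != 0 else b),
}
FUNCTIONS = {**ARITHMETIC, **LOGICAL}


def evaluate(tree, x):
    """Evaluate a nested-tuple tree at x (both branches of "if" are computed)."""
    if isinstance(tree, tuple):
        _, fn = FUNCTIONS[tree[0]]
        return fn(*(evaluate(child, x) for child in tree[1:]))
    if tree == "x":
        return x
    return tree  # numeric constant


# Hypothetical two-branch target: x^2 for x < 0, otherwise 2x + 1.
PIECEWISE_TREE = (
    "if",
    ("lt", "x", 0.0),
    ("mul", "x", "x"),
    ("add", ("mul", 2.0, "x"), 1.0),
)
```

With the `if` and `lt` primitives available, a single small tree represents both branches of the hypothetical target exactly. A purely arithmetic function set has no node that switches between expressions, so it can only approximate such a function with one smooth formula.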
The LGP approach applied to a multi-conditional mathematical problem
The solving capability of the LGP approach is verified in this paper with the following optimization problem featuring multi-conditional statements expressed by Eq. (1):
The objective function minimizes the MAE (mean absolute error) in each interval x. Therefore, the objective function is written according to Eq. (2):
in which MAEq is the mean absolute error related to approach q, q denotes the approach (q = 1 for GP and q = 2 for LGP), yi is the observed data, yqi is the data calculated by approach q, and n is the number of observed data in the desired range (for this problem the interval is −6 ≤ x < 8).
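Read together, these definitions correspond to the standard form of the mean absolute error. Under that reading, and writing yi and yqi with explicit subscripts, the objective function of Eq. (2) would be:

\[ \text{Minimize} \quad \mathrm{MAE}_q = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - y_{qi} \right|, \qquad q = 1, 2 \]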
The LGP approach applied to the reservoir operation rule (SOP rule)
Reservoir operating rules specify the desired release from reservoir storage. The simplest reservoir operation policy is the SOP (Loucks et al. 1981). The SOP rule has been applied by various researchers (such as Raman and Chandramouli (1996), Cancelliere et al. (1998), and Ashofteh et al. (2013a)). A multi-conditional rule considers the amount of available water as a threshold for determining the reservoir release (Fig. 3).
The available water is defined as reservoir storage volume plus inflow to reservoir minus evaporation, according to Eq. (3):
in which AWt is the available water volume during period t, St is the storage volume of reservoir at the beginning of period t, Qt is the inflow volume to reservoir during period t, T is the duration of the operation interval, and Et is the volume of evaporation during period t. Et is determined with Eq. (4):
in which et is the evaporation depth during period t; and a and b are the constants of the reservoir area as a function of storage (Loáiciga 2002).
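A short Python sketch of the water balance described above follows. The available-water relation comes directly from the text (storage plus inflow minus evaporation). The evaporation function assumes that surface area is a linear function of storage, area = a × storage + b, which is only one plausible reading of "constants of the reservoir area as a function of storage". The function names and the unit handling are likewise assumptions.

```python
def evaporation_volume(evap_depth, storage, a, b):
    """Evaporation volume as depth times surface area.

    ASSUMPTION: surface area is taken as linear in storage (a * storage + b);
    units must be chosen so that depth * area is a volume.
    """
    return evap_depth * (a * storage + b)


def available_water(storage, inflow, evap_volume):
    """Available water AW_t = S_t + Q_t - E_t, as stated in the text."""
    return storage + inflow - evap_volume
```

For example, a storage of 50, an inflow of 10, and an evaporation volume of 2 (all in the same volume units) give an available water of 58.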
The release and spill during period t of the SOP rule are given by Eq. (5):
in which rspt is the release from the SOP (observed) during period t, D is the average water demand calculated over the entire planning time interval, and Smax is the maximum volume of the reservoir.
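The multi-conditional character of the SOP can be sketched in Python. The three branches below follow the textbook form of the standard operating policy (release everything during a shortage, meet the demand otherwise, and spill the excess once the reservoir is full). They are written here as an assumption about the shape of Eq. (5), with dead storage ignored and the function name `sop_release` chosen for illustration.

```python
def sop_release(available_water, demand, s_max):
    """Release plus spill under a textbook standard operating policy (SOP).

    ASSUMPTION: three-branch form with thresholds at the demand and at the
    demand plus the maximum storage; dead storage is not considered.
    """
    if available_water < demand:
        # Shortage: all available water is released.
        return available_water
    if available_water <= demand + s_max:
        # Normal operation: the demand is met and the remainder is stored.
        return demand
    # Reservoir full: the demand is met and the excess is spilled.
    return available_water - s_max
```

With a demand of 10 and a maximum storage of 100, available water of 5, 50, and 120 yields release-plus-spill values of 5, 10, and 20, respectively. The two thresholds are what make the rule multi-conditional and hard to fit with a single arithmetic expression.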
The objective function minimizes the MAE during the interval of operation of the reservoir for the SOP rule, according to Eq. (6):
in which MAEq is the mean absolute error based on approach q and RSPqt is the calculated release during period t with approach q.
Performance criteria
The correlation coefficient (R), root mean square error (RMSE), and Nash-Sutcliffe efficiency (NSE), written in Eqs. (7)–(9) (Hu et al. 2001; Moriasi et al. 2007; Moghadam et al. 2019), respectively, are applied to compare the GP and LGP approaches:
in which zt is the observed (or additional release) data during period t, \( \overline{z} \) is the average of observed data over the entire time interval, Zqt is the calculated data during period t and based on approach q, and \( {\overline{Z}}_{qt} \) is the average of calculated data over the entire time interval based on approach q.
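The three criteria can be computed as in the Python sketch below, which uses their standard definitions (Pearson correlation coefficient, root mean square error, and Nash-Sutcliffe efficiency). The function names are illustrative, and the code assumes equal-length series with non-constant observed and calculated values.

```python
import math


def correlation_coefficient(observed, calculated):
    """Pearson correlation coefficient R between observed and calculated data."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_calc = sum(calculated) / n
    cov = sum((o - mean_obs) * (c - mean_calc) for o, c in zip(observed, calculated))
    spread_obs = math.sqrt(sum((o - mean_obs) ** 2 for o in observed))
    spread_calc = math.sqrt(sum((c - mean_calc) ** 2 for c in calculated))
    return cov / (spread_obs * spread_calc)


def rmse(observed, calculated):
    """Root mean square error; zero indicates a perfect fit."""
    n = len(observed)
    return math.sqrt(sum((o - c) ** 2 for o, c in zip(observed, calculated)) / n)


def nse(observed, calculated):
    """Nash-Sutcliffe efficiency; one indicates a perfect fit."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - c) ** 2 for o, c in zip(observed, calculated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance
```

The criteria are complementary. A calculated series that is offset from the observations by a constant still has R equal to one, whereas the RMSE and the NSE both penalize the offset, which is why the three are reported together.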
The reservoir system and pertinent information
The studied reservoir system is within the Aidoghmoush basin, which is located in Eastern Azerbaijan province (northwestern Iran) (Fig. 4) (Ashofteh et al. 2013b). The basin size is approximately 1800 km². The annual river discharge and the river length are 190 × 10⁶ m³ and 80 km, respectively. The normal level of the Aidoghmoush reservoir is 1341.5 m above sea level. The reservoir capacity is 145.7 × 10⁶ m³ and the reservoir’s dead volume is equal to 8.7 × 10⁶ m³. The constants a and b of the reservoir area vs. storage curve equal 0.03 and 0.8, respectively. This study relies on baseline inflow data covering T = 14 years (1987–2000). Also, the average demand over the entire planning period is equal to 11.97 × 10⁶ m³, as shown in Fig. 5 with related information about the evaporation depth.
Parameters and stopping criterion
GPLAB is the GP toolbox for the MATLAB 9.0 software (Silva 2007). Five arithmetic operators (+, − , /, × , ^) and six developed mathematical or operator functions (≤, ≥, < , > , if, and) were used as the set of GP functions. An evolutionary search process converges to a value that is close to the optimal solution. The magnitudes of the GP and LGP parameters are presented in Table 1 for the multi-conditional mathematical problem and for the operating rule of the reservoir (extracted from the SOP).
Results and discussion
The results obtained by the LGP and the GP for the multi-conditional mathematical problem are presented in Fig. 6a–f. Figure 6a, b shows the trends of the objective function for the extraction of multi-conditional mathematical functions by the GP and LGP, respectively. Figure 6c, d displays the functional forms, and Fig. 6e, f compares the data calculated with the GP and LGP, respectively, with the observed data.
Figure 6a, b establishes that the LGP converges better than the GP: the optimal objective value obtained by the LGP equals 0.303, which is lower (under minimization) than the optimal value of 1.177 obtained by the GP. In other words, the LGP approach improves the objective function by 74% relative to the GP approach in the multi-conditional mathematical example. Also, Fig. 6e, f compares the data calculated with the GP and LGP approaches, respectively, with the observed data, and establishes that the LGP approach, with a determination coefficient of 99%, performs better than the GP approach, with a determination coefficient of 84%.
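The reported improvement follows from the two optimal objective values as a relative reduction of the MAE, taking the GP value (q = 1) as the reference:

\[ \frac{\mathrm{MAE}_1 - \mathrm{MAE}_2}{\mathrm{MAE}_1} = \frac{1.177 - 0.303}{1.177} \approx 0.74 \]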
Equation (10) was calculated with GP (from solving the optimization problem for q = 1 in Eq. (2)), and Eq. (11) was calculated with LGP (from solving the optimization problem for q = 2 in Eq. (2)); both are depicted in Fig. 6c, d. The SPOT markers in Fig. 6c, d were calculated with Eq. (1). Equations (10) and (11) are as follows:
The results of the SOP rule extraction for the Aidoghmoush one-reservoir system with LGP and GP are shown in Fig. 7a–f. Figure 7a, b depicts the trends of the objective function for the SOP rule. Figure 7c, d displays the functional forms, and Fig. 7e, f compares the observed data with the data calculated with GP and LGP, respectively. It can be concluded from Fig. 7a, b that LGP converges better than GP, so the LGP approach, with an objective function equal to 0.3, performs better than the GP approach, with an objective function equal to 0.575. In other words, the LGP approach improves the objective function by 42% relative to the GP approach insofar as optimizing the SOP rule is concerned. The comparison of the data calculated with the GP and LGP approaches with the observed data is shown in Fig. 7e, f, respectively, where it is seen that the LGP approach, with a determination coefficient of 99%, performs better than GP, with a determination coefficient of 95%. This means that the LGP approach, which is capable of incorporating logical functions, leads to better curve fitting than GP.
The rule calculated with the GP approach (from solving the optimization problem for q = 1 in Eq. (6)) is given by Eq. (12) and plotted in Fig. 7c, and the rule calculated with the LGP approach (from solving the optimization problem for q = 2 in Eq. (6)) is presented in Eq. (13) and plotted in Fig. 7d. Meanwhile, the SOP markers in Fig. 7c, d were calculated with Eq. (5):
The GP and LGP results were compared in order to further evaluate the LGP approach’s efficiency relative to GP for both the multi-conditional mathematical problem and the SOP rule, with the obtained results listed in Table 2. The results of Table 2 indicate that the LGP approach performs better than GP for both the multi-conditional mathematical problem and the SOP rule. Specifically, solving the mathematical problem with the LGP approach decreases the RMSE by 78% and increases the NSE by 18% relative to GP. Also, using the LGP approach in the reconstruction of the SOP rule decreases the RMSE by 22% and increases the NSE by 1% relative to GP.
Concluding remarks
Logical operators and Boolean functions were developed and added to GP to create the LGP approach, seeking to improve GP’s performance in solving multi-conditional problems. The superior capability of the LGP was verified and evaluated with one multi-conditional mathematical problem and one water resources problem (the SOP rule). The results showed that the LGP approach improves the objective function by 74% and 42% relative to GP in the mathematical and SOP problems, respectively.
In the mathematical problem, the LGP approach decreased the RMSE by about 78%, increased the NSE by about 18%, and increased the R by about 8.5% relative to GP. In the calculation of the SOP rule, the LGP approach decreased the RMSE by about 22% and increased the R by about 2% relative to GP.
References
Ashofteh, P.-S., Bozorg-Haddad, O., & Mariño, M. A. (2013a). Climate change impact on reservoir performance indices in agricultural water supply. Journal of Irrigation and Drainage Engineering, 139(2), 85–97.
Ashofteh, P.-S., Bozorg-Haddad, O., & Mariño, M. A. (2013b). Scenario assessment of streamflow simulation and its transition probability in future periods under climate change. Water Resources Management, 27(1), 255–274.
Ashofteh, P.-S., Bozorg-Haddad, O., Akbari-Alashti, H., & Mariño, M. A. (2015). Determination of irrigation allocation policy under climate change by genetic programming. Journal of Irrigation and Drainage Engineering, 141(4), 04014059. https://doi.org/10.1061/(ASCE)IR.1943-4774.0000807.
Babovic, V., & Keijzer, M. (2002). Rainfall runoff modeling based on genetic programming. Nordic Hydrology, 33(5), 331–346.
Bozorg-Haddad, O., Afshar, A., & Mariño, M. A. (2006). Honey-bees mating optimization (HBMO) algorithm: a new heuristic approach for water resources optimization. Water Resources Management, 20(5), 661–680.
Bozorg-Haddad, O., Solgi, M., & Loáiciga, H. A. (2017). Meta-heuristic and evolutionary algorithms for engineering optimization. Hoboken: John Wiley & Sons.
Cancelliere, A., Ancarani, A., & Rossi, G. (1998). Susceptibility of water supply reservoirs to drought conditions. Journal of Hydrologic Engineering, 3(2), 140–148.
Danandeh Mehr, A., Nourani, V., Kahya, E., Hrnjica, B., Sattar, A. M. A., & Mundher Yaseen, Z. (2018). Genetic programming in water resources engineering: a state-of-the-art review. Journal of Hydrology, 566, 643–667. https://doi.org/10.1016/j.jhydrol.2018.09.043.
Fallah-Mehdipour, E., Bozorg-Haddad, O., & Mariño, M. A. (2013). Prediction and simulation of monthly groundwater levels by genetic programming. Journal of Hydro-Environment Research, 7(4), 253–260.
Fernandez, T., & Evett, M. (1998). Numeric mutation as an improvement to symbolic regression in genetic programming. Evolutionary Programming VII, Lecture Notes in Computer Science, Springer Verlag KG, 1447, 251–260.
Giustolisi, O. (2004). Using genetic programming to determine Chèzy resistance coefficient in corrugated channels. Journal of Hydroinformatics, 6(3), 157–173.
Golubski, W. (2002). New results on fuzzy regression by using genetic programming. Genetic Programming, Lecture Notes in Computer Science, Kinsale, Ireland, 2278, 308–315.
Havlíček, V., Hanel, M., Máca, P., Kuráž, M., & Pech, P. (2013). Incorporating basic hydrological concepts into genetic programming for rainfall-runoff forecasting. Computing, 95(1), 363–380.
Hu, T. S., Lam, K. C., & Ng, S. T. (2001). River flow time series prediction with a range dependent neural network. Hydrological Sciences Journal, 46(5), 729–745.
Langdon, W. B., & Chen, T. (2018). Genetic programming bibliography. http://www.cs.bham.ac.uk/~wbl/biblio/.
Li, W. K., Wang, W. L., & Li, L. (2018). Optimization of water resources utilization by multi-objective moth-flame algorithm. Water Resources Management, 32(10), 3303–3316.
Liong, S.-Y., Gautam, T. R., Khu, S. T., Babovic, V., Keijzer, M., & Muttil, N. (2007). Genetic programming: a new paradigm in rainfall runoff modeling. Journal of the American Water Resources Association, 38(3), 705–718.
Koza, J. R. (1992). Genetic programming: on the programming of computers by means of natural selection (p. 819). Cambridge, Massachusetts, London: MIT Press.
Kramer, M. D., & Zhang, D. (2000). GAPS: a genetic programming system. The Twenty-Fourth Annual International Computer Software and Applications Conference, Taipei, 25–27 October, pp. 614–619.
Loáiciga, H. A. (2002). Reservoir design and operation with variable lake hydrology. Journal of Water Resources Planning and Management, 128(6), 399–405.
Loucks, D. P., Stedinger, J. R., & Haith, D. A. (1981). Water resources systems planning and analysis (p. 559). Englewood Cliffs, N.J.: Prentice-Hall.
Moghadam, S. H., Ashofteh, P.-S., & Loáiciga, H. A. (2019). Application of climate projections and Monte Carlo approach for the assessment of future river flow: case study of the Khorramabad River basin, Iran. Journal of Hydrologic Engineering, 24(7), 05019014. https://doi.org/10.1061/(ASCE)HE.1943-5584.0001801.
Morales, C. O., & Vázquez, K. R. (2004). Symbolic regression problems by genetic programming with multi-branches. Advances in Artificial Intelligence, Lecture Notes in Computer Science, Springer-Verlag, Mexico City, Mexico, 26–30 April, 2972, 717–726.
Moriasi, D. N., Arnold, J. G., Van Liew, M. W., Bingner, R. L., Harmel, R. D., & Veith, T. L. (2007). Model evaluation guidelines for systematic quantification of accuracy in watershed simulations. Transactions of the ASABE, 50(3), 885–900.
R Development Core Team (2011). R: a language and environment for statistical computing. Vienna: R Foundation for Statistical Computing. http://www.R-project.org.
Raman, H., & Chandramouli, V. (1996). Deriving a general operating policy for reservoirs using neural network. Journal of Water Resources Planning and Management, 122(5), 342–347.
Savic, D. A., Walters, G. A., & Davidson, J. W. (1999). A genetic programming approach to rainfall-runoff modelling. Water Resources Management, 13(3), 219–231.
Searson, D. P., Leahy, D. E., & Willis, M. J. (2011). Predicting the toxicity of chemical compounds using GPTIPS: a free genetic programming toolbox for MATLAB. Intelligent Control and Computer Engineering, Lecture Notes in Electrical Engineering, Springer, 70, 83–93.
Sheng-Wu, X., & Wei-Wu, W. (2003). Point-tree structure genetic programming method for discontinuous function’s regression. Wuhan University Journal of Natural Sciences, 8(1), 323–326.
Shokri, A., Bozorg-Haddad, O., & Mariño, M. A. (2014). Multi-objective quantity–quality reservoir operation in sudden pollution. Water Resources Management, 28(2), 567–586. https://doi.org/10.1007/s11269-013-0504-z.
Silva, S. (2007). GPLAB: a genetic programming toolbox for Matlab, version 3 (pp. 13–15). ECOS-Evolutionary and Complex Systems Group: University of Coimbra, Portugal.
Solgi, M., Bozorg-Haddad, O., & Loáiciga, H. A. (2017). The enhanced honey bee mating optimization algorithm for water resources optimization. Water Resources Management, 31(3).
Acknowledgements
The authors thank Iran’s National Science Foundation (INSF) for its financial support on this research.
Ethics declarations
Conflict of interests
The authors declare that they have no conflict of interest.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Ashofteh, PS., Bozorg-Haddad, O. & Loáiciga, H.A. Logical genetic programming (LGP) application to water resources management. Environ Monit Assess 192, 34 (2020). https://doi.org/10.1007/s10661-019-8014-y
DOI: https://doi.org/10.1007/s10661-019-8014-y