1 Introduction

Artificial intelligence (AI) emulates complex biological processes such as learning, reasoning and self-correction to explore solutions for engineering problems. Unlike traditional statistical methods, AI can provide solutions without prior knowledge of the nature of the relationship between the dependent and independent variables. Therefore, AI techniques can provide alternative solutions to determine engineering design parameters. This can be particularly important when it is not possible to carry out laboratory or field testing. In general, AI can be classified into several sub-categories such as reasoning, programming, artificial life, belief revision, data mining, distributed AI, expert systems, evolutionary computation, systems, knowledge representation, machine learning, natural language understanding, neural networks, theorem proving, constraint satisfaction, and theory of computation (Kushchu 2002; Zhang and Rockett 2007). Figure 1 illustrates the interrelation between AI and major data science techniques including machine learning (ML) and data mining (DM). These techniques have been extensively applied to the development of reliable empirical, analytical and numerical models for complex engineering problems (Salehi and Burgueño 2018; Araujo 2007; Peker 2016; Wang et al. 2019). ML deals with designing computer models that can learn from a set of data and identify patterns. Biological learning is the main source of inspiration for many ML methods (Salehi and Burgueño 2018). The main components of ML are presented in Fig. 2. Some of the widely used ML methods are artificial neural networks (ANN), fuzzy inference systems (FIS), adaptive neuro-fuzzy inference systems (ANFIS), and support vector machines (SVM).

Fig. 1 Relationship between AI, ML and other data mining techniques

Fig. 2 ML and its main components

The reliability of AI algorithms has often been criticized because they do not operate like classical methods based on the laws of physics. In fact, the problem is that AI methods generate complex models that are not easy for users to understand (Ghiringhelli et al. 2015). AI-based characterization and design of new materials would be doubtful if the connection between features and prediction is not well-defined. Therefore, two key issues in interpreting AI-based models are the transparency of the models and the knowledge extraction process. Figure 3 shows a pictorial representation of the classification of modelling techniques (Giustolisi et al. 2007). Referring to this figure, white-, black- and grey-box models are the three main categories of mathematical modelling. If the variables and parameters are known and the model is based on first principles (e.g. laws of physics), then it is possible to explain the underlying physical relationships of the system. Such models are classified as white-box models. Black-box methods explore the relationships between the input and output data without providing a feasible structure of the model. Conceptual methods that not only identify the existing patterns in the data but also provide a mathematical structure of the model belong to the grey-box category. ANN is perhaps the most widely used ML method in the engineering arena (Hamedi 2005; Yalpir 2016; Bui et al. 2018). It is a computing system inspired by biological neural networks. Despite its acceptable accuracy, ANN belongs to the class of black-box modelling techniques. That is, the ANN solution is represented in the form of weights and biases that do not provide details about the nature of the derived relationships. Genetic programming (GP) is a fairly new ML method based on Darwin’s theory of evolution (Koza 1992; Brameier and Banzhaf 2007). In contrast to the black-box methods, GP is classified as a grey-box technique with the ability to develop explicit prediction functions (Giustolisi et al. 2007). GP creates predictive models through a complicated evolutionary process. Model transparency is a notable advantage of GP over nearly all other ML methods, especially the black-box ANN and deep learning (DL) methods. Mathematical structures derived by GP can readily be used to gain important information about the system performance.

Fig. 3 Classification of the modelling techniques (Adapted from Giustolisi et al. 2007)

To improve the performance of standard GP in various aspects, several GP branches have been developed, such as linear-based GP, Cartesian GP, grammatical GP and stack GP. Gene expression programming (GEP), linear genetic programming (LGP), multi-expression programming (MEP), and multigene genetic programming (MGGP) are the most widely used GP variants in the engineering domain (Alavi and Gandomi 2011; Li et al. 2019; Hadianfard and Jafari 2016; Mansouri and Farzampour 2018; Arruda et al. 2014; Aval et al. 2017; Jafari and Mahini 2017). Although GP and its variants are not as popular as ANNs, a number of researchers have discovered their significant potential in solving civil engineering problems. This paper intends to provide a comprehensive survey of applications of GP and its variants in civil engineering. Future avenues for applications of GP and remaining challenges are further discussed. To perform the literature review of genetic programming applications in civil engineering, the content analysis approach developed by Krippendorff (2018) is used. Samples are collected through a systematic search within major academic databases such as Scopus, Web of Science, Google Scholar, Wiley Online Library, Science Direct, Sage, ASCE Library, IEEE Xplore, ACM, and Emerald. Various keywords, e.g., “genetic programming”, “civil engineering”, “LGP”, “GEP”, “MGGP”, “evolutionary algorithm”, “concrete”, “structural”, “geotechnical”, etc., are used to ensure all the related studies are included in the survey. The time period under review is from 2010 to 2019, which yields approximately 327 candidate articles. The articles are filtered using a two-round article selection technique: 1) checking titles, abstracts, and keywords of the articles, and 2) retaining only the papers closely related to the objective of this review paper. Finally, 31 papers are selected for the survey. This review paper is organized as follows. Section 2 introduces GP and its variants. The most relevant works addressing the applications of GP for solving civil engineering problems are discussed in Sect. 3. Section 4 reviews the development and improvement of GP and highlights the trends for future applications of GP. Finally, conclusions are provided in Sect. 5.

2 Genetic programming and its variants

2.1 Tree-based genetic programming

GP is a symbolic regression technique for evolving computer programs following the principles of natural selection. GP was first introduced by Koza (1992). Candidate solutions generated by GP are represented as abstract syntax trees expressed in a functional programming language. That is why classical GP is called tree-based GP (TGP) or canonical GP. The TGP algorithm starts by creating a random initial population and then applies genetic operations (e.g. crossover, mutation and reproduction) until an optimal or at least a suitable model is found. A simplified flow chart of the GP process is illustrated in Fig. 4. Population members in TGP consist of functions and terminals (see Fig. 5). Functions can be basic arithmetic operations, Boolean logic operations and any other mathematical functions. Terminals contain the arguments for functions, which can be variables or numerical and logical constants.

Fig. 4 Flowchart of the GP approach

Fig. 5 Example of a tree GP program
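The function-and-terminal structure described above can be illustrated with a minimal Python sketch. The nested-tuple encoding, the `FUNCTIONS` table and the example individual are assumptions made for illustration only; they are not taken from Koza (1992) or from any of the reviewed studies.

```python
import operator

# Hypothetical function set: symbol -> (callable, arity).
FUNCTIONS = {
    "+": (operator.add, 2),
    "-": (operator.sub, 2),
    "*": (operator.mul, 2),
}

def evaluate(node, variables):
    """Recursively evaluate a tree given as a nested tuple or a terminal."""
    if isinstance(node, tuple):          # internal node: (function, child, ...)
        func, arity = FUNCTIONS[node[0]]
        args = [evaluate(child, variables) for child in node[1:]]
        if len(args) != arity:
            raise ValueError("wrong number of children for " + node[0])
        return func(*args)
    if isinstance(node, str):            # terminal: input variable
        return variables[node]
    return node                          # terminal: numeric constant

# Illustrative individual encoding (x1 + 2) * (x2 - x1).
tree = ("*", ("+", "x1", 2.0), ("-", "x2", "x1"))
print(evaluate(tree, {"x1": 3.0, "x2": 5.0}))  # 10.0
```

In a full TGP run, crossover would swap subtrees between two such individuals and mutation would replace a subtree with a randomly generated one; those operators are omitted here to keep the sketch short.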

2.2 Linear genetic programming

LGP is a linear variant of classical GP developed by Brameier and Banzhaf (2007). It uses a specific linear representation of individuals or computer programs. In LGP, classical tree-like expressions are substituted by programs of an imperative language (like C or C++) (Gandomi et al. 2014; Guven 2009). Compared to the rigid data flow determined by the tree-structured programs in classical GP, LGP can use the content of a register multiple times, which generates a data flow graph. A sequence of instructions in C represents an LGP program. The instruction set consists of arithmetic operations, logical functions and other mathematical functions. Terminal sets include variables and constants (Gandomi et al. 2014). As a result, LGP can run faster than tree-based GP and is easier to interpret. An example of LGP is shown in Fig. 6. As illustrated in Fig. 7, the evolution process in LGP is similar to that of TGP. Further details about LGP can be found in Brameier and Banzhaf (2007).

Fig. 6 Example of LGP

Fig. 7 Typical operations in LGP
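As a rough illustration of the register-based representation, the following Python sketch interprets a linear program as a list of instructions acting on registers. The instruction format `(dest, op, src1, src2)`, the number of registers and the convention that register 0 holds the output are assumptions chosen for this example, not details taken from Brameier and Banzhaf (2007).

```python
import operator

# Hypothetical instruction set.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def run_linear_program(program, inputs, n_registers=4):
    """Execute a list of (dest, op, src1, src2) register instructions."""
    registers = [0.0] * n_registers
    registers[:len(inputs)] = inputs     # inputs are loaded into the first registers
    for dest, op, src1, src2 in program:
        registers[dest] = OPS[op](registers[src1], registers[src2])
    return registers[0]                  # assumed convention: r0 holds the output

# Illustrative program: r2 is computed once and then reused twice,
# which a pure tree would have to express by duplicating the subtree.
program = [
    (2, "+", 0, 1),   # r2 = r0 + r1
    (3, "*", 2, 2),   # r3 = r2 * r2
    (0, "-", 3, 2),   # r0 = r3 - r2
]
print(run_linear_program(program, [2.0, 3.0]))  # 20.0
```

The reuse of `r2` in the second and third instructions is what turns the data flow into a graph rather than a tree.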

2.3 Gene expression programming

GEP is a natural extension of GP introduced by Ferreira (Ferreira 2001; Özcan 2012). A schematic representation of the GEP algorithm is presented in Fig. 8. The main difference between GEP and classical GP is that candidate solutions generated by GEP are represented by fixed-length character strings, which are later expressed as parse trees of different sizes and shapes when fitness is evaluated. Chromosomes and expression trees (ETs) are the two main elements of GEP. The translation process decodes information from chromosomes to ETs based on a set of rules. The genetic code and rules are very simple and are presented in two languages (the language of genes and the language of ETs). A typical GEP expression tree is shown in Fig. 9. Further details about GEP can be found in Ferreira (2001).

Fig. 8 Flowchart of the GEP algorithm

Fig. 9 Example of a GEP program
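The translation from a fixed-length string to an expression tree can be sketched in Python as follows. The sketch assumes a single well-formed gene that is read level by level (breadth-first), with `Q` standing for the square root; the symbol set, the list-based node layout and the example gene are illustrative assumptions.

```python
import math
from collections import deque

# Hypothetical symbol set: functions with their arity; anything else is a terminal.
ARITY = {"+": 2, "-": 2, "*": 2, "Q": 1}   # "Q" denotes the square root

def decode(gene):
    """Build an expression tree from a gene string, filling it level by level."""
    symbols = iter(gene)
    root = [next(symbols)]               # node layout: [symbol, child, child, ...]
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for _ in range(ARITY.get(node[0], 0)):   # terminals have arity 0
            child = [next(symbols)]
            node.append(child)
            queue.append(child)
    return root                          # unread trailing symbols are non-coding

def evaluate(node, variables):
    """Evaluate a decoded expression tree for given variable values."""
    symbol = node[0]
    args = [evaluate(child, variables) for child in node[1:]]
    if symbol == "+":
        return args[0] + args[1]
    if symbol == "-":
        return args[0] - args[1]
    if symbol == "*":
        return args[0] * args[1]
    if symbol == "Q":
        return math.sqrt(args[0])
    return variables[symbol]

# Illustrative gene of length 9; the last symbol is not used by the tree.
gene = "Q*+-abcda"
tree = decode(gene)                      # encodes sqrt((a + b) * (c - d))
print(evaluate(tree, {"a": 5, "b": 4, "c": 6, "d": 2}))  # 6.0
```

Because the string length is fixed while only a prefix is expressed, genetic operators can modify any position and still yield a valid tree, provided the tail contains enough terminals.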

2.4 Multi expression programming

MEP is another variant of GP introduced by Oltean (2004). MEP is capable of storing multiple solutions of a problem in a single chromosome. This advantage enables MEP to explore larger zones of the search space without a running-time penalty. The best of the stored solutions is determined based on its fitness. The representation of MEP is not fixed and depends on the problem type. In the simplest case, the MEP chromosomes are linear strings of instructions (Oltean 2004; Oltean and Groşan 2003; Oltean and Grosan 2004). Figure 10 shows a typical representation of the MEP solutions. Further details about this method can be found in Oltean (2004).

Fig. 10 A simple MEP program
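The idea of storing several expressions in one chromosome can be sketched in Python as below. The gene format (a terminal name, or an operator with two indices pointing to earlier genes), the error measure and the sample data are assumptions made for illustration and are not taken from Oltean (2004).

```python
import operator

# Hypothetical operator set.
OPS = {"+": operator.add, "*": operator.mul}

def evaluate_chromosome(chromosome, variables):
    """Return the value of every gene; gene k may only reference genes before k."""
    values = []
    for gene in chromosome:
        if isinstance(gene, str):            # terminal gene
            values.append(variables[gene])
        else:                                # (operator, index, index) gene
            op, i, j = gene
            values.append(OPS[op](values[i], values[j]))
    return values

def best_expression(chromosome, samples):
    """Pick the gene whose expression has the lowest sum of absolute errors."""
    errors = [0.0] * len(chromosome)
    for variables, target in samples:
        values = evaluate_chromosome(chromosome, variables)
        for k, value in enumerate(values):
            errors[k] += abs(value - target)
    best = min(range(len(chromosome)), key=errors.__getitem__)
    return best, errors[best]

# Illustrative chromosome encoding a, b, a+b, a*(a+b) and (a+b)*(a+b).
chromosome = ["a", "b", ("+", 0, 1), ("*", 0, 2), ("*", 2, 2)]
# Made-up samples generated from the target a*(a+b).
samples = [({"a": 1.0, "b": 2.0}, 3.0), ({"a": 2.0, "b": 2.0}, 8.0)]
print(best_expression(chromosome, samples))  # (3, 0.0)
```

All five expressions are evaluated in a single pass over the chromosome because each gene reuses the values of earlier genes, which is why evaluating many candidate solutions costs little extra time.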

2.5 Multigene genetic programming

MGGP is a fairly new variant of GP developed by Searson et al. (2011). In MGGP, a single GP individual is composed of a number of genes, where each gene is a tree expression (Searson et al. 2011; Gandomi and Alavi 2012). MGGP combines the multiple gene programs via a weighted linear combination. An example of an MGGP program is illustrated in Fig. 11. Further details about MGGP can be found in Searson et al. (2011).

Fig. 11 Example of an MGGP program
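The weighted linear combination of genes can be illustrated with a short Python sketch. It assumes that the gene weights and a bias term are obtained by ordinary least squares on the training data, solved here through the normal equations; the two genes and the data are hypothetical, and the plain Gaussian elimination is only meant for small, well-conditioned examples.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):                       # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_gene_weights(genes, xs, ys):
    """Least-squares weights d0..dn for y ~ d0 + sum(d_i * gene_i(x))."""
    rows = [[1.0] + [g(x) for g in genes] for x in xs]  # bias column + gene outputs
    k = len(genes) + 1
    gram = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    rhs = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve_linear_system(gram, rhs)

def predict(genes, weights, x):
    """Evaluate the multigene model at x."""
    return weights[0] + sum(w * g(x) for w, g in zip(weights[1:], genes))

# Two hypothetical evolved genes and made-up data from y = 1 + 2*x^2 - 3*x.
genes = [lambda x: x * x, lambda x: x]
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0 + 2.0 * x * x - 3.0 * x for x in xs]
weights = fit_gene_weights(genes, xs, ys)
print([round(w, 6) for w in weights])
```

In this arrangement the evolutionary search only has to discover useful gene structures, while the linear coefficients are computed deterministically for each individual.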

3 Review of application of GP and its variants in civil engineering

3.1 Tree-based GP applications

Various studies have shown the potential of TGP in solving civil engineering related problems. According to the literature, GP has been used to develop prediction models for structural behavior properties (Ferreira 2001; Özcan 2012; Oltean 2004; Oltean and Groşan 2003; Oltean and Grosan 2004; Searson et al. 2011), local scour at vertical bridge abutments (Begum 2013), and the fundamental period of vibration (Joshi et al. 2014). Tsai and Lin (2011) proposed a new GP approach, called weighted genetic programming (WGP), by introducing weight coefficients to each GP linkage in a tree. The WGP method evolves models with fully weighted connections. This method has been applied to predict high-strength concrete parameters such as compressive strength, cost and slump. Gandomi et al. (2013) proposed an empirical prediction model for the shear strength of reinforced concrete (RC) deep beams using a hybrid algorithm combining GP and simulated annealing, called genetic simulated annealing (GSA). The authors used 214 RC deep beam test results collected from the literature to develop and evaluate the GSA model. The proposed model was compared with design codes, and a parametric sensitivity analysis was conducted to further validate it. The comparison revealed that the model had a higher accuracy than existing models. Jalal et al. (2013) utilized GP and the adaptive neuro-fuzzy inference system (ANFIS) to develop models for the strength enhancement of concrete cylinders retrofitted with carbon-fiber-reinforced polymer (CFRP) composites. The developed models were compared with ANNs, multiple regression and empirical models. It was shown that the ANFIS model was the most accurate and that the accuracy of GP was higher than that of the regression, ANN and other empirical models. Castelli et al. (2013) proposed a new intelligent system based on general GP and geometric semantic genetic operators, called Geometric Semantic Genetic Programming. The proposed method was used to develop models for predicting the strength of high-performance concrete, and the results were compared to general GP and other machine learning methods (e.g. SVM, ANN). It was demonstrated that the proposed method is capable of providing significantly better results. Moreover, Gandomi et al. (2016) proposed an algorithm for complex civil engineering problems called multi-objective genetic programming (MOGP). MOGP performs model structure selection like standard GP and parameter estimation like classical regression, while optimizing both the complexity and the goodness-of-fit of the model. The proposed algorithm was validated by creating a time-dependent concrete creep model based on a large data set. Data selected from the NU-ITI data bank were used to develop the creep model, and the obtained model showed a much higher accuracy than some existing creep models. Accordingly, it was shown that the proposed MOGP algorithm is efficient in solving complex civil engineering problems.

Further, Moradi et al. (2019) proposed a model to estimate tensile stress–strain diagrams of steel fiber reinforced concrete (SFRC) using GP. To develop the model, 41 experimental results were collected from the literature. The proposed model was compared with other models using 11 specimens collected from the literature. Results of the study revealed that the proposed model had better accuracy than the other models. Begum (2013) also used GP to propose alternative equations for the prediction of local scour at vertical bridge abutments. The developed model was compared with empirical models from the literature, and it was shown that the GP model had a higher accuracy. In addition, Joshi et al. (2014) applied genetic programming to develop models for the fundamental period of vibration using 206 analyzed buildings. Models in four categories were generated with various numbers of inputs, and the obtained equations for the four categories were compared with equations recommended by other researchers. The developed models had reasonable and acceptable accuracy, but the authors noted that experimental investigation was still needed. Overall, a review of the limited existing applications of TGP in civil engineering shows acceptable performance. Table 1 summarizes the best performance reported in each study utilizing GP, from which it can be seen that the correlation coefficient (R) between model predictions and actual experimental data is above 0.9, further indicating the capability and potential of TGP in the civil engineering domain.

Table 1 Summary of the TGP model performance indexes in the reviewed studies
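For reference, the correlation coefficient R used as a performance index can be computed as in the following Python sketch. It assumes that R denotes the Pearson correlation between measured and predicted values; the reviewed studies may define their indexes slightly differently, and the numbers below are made up for illustration.

```python
import math

def correlation_coefficient(measured, predicted):
    """Pearson correlation coefficient R between two equally long sequences."""
    n = len(measured)
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_m * var_p)

# Made-up measured and predicted values, not data from any reviewed study.
measured = [20.0, 30.0, 40.0, 50.0]
predicted = [22.0, 29.0, 41.0, 48.0]
print(correlation_coefficient(measured, predicted))
```

Note that R measures linear association only: a model whose predictions are all twice the measured values would still reach R = 1, which is why error measures are usually reported alongside it.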

3.2 Linear genetic programming applications

This sub-section summarizes studies related to the application of LGP for solving civil engineering problems. Gandomi et al. (2010) proposed a new formulation of the elastic modulus for normal-strength and high-strength concrete by applying the LGP approach. The models were developed with experimental data collected from the literature. The results revealed that the LGP models were more accurate than the building codes and some other models from the literature. Moreover, Gandomi et al. (2014) proposed a prediction model for the shear strength of reinforced concrete beams without stirrups by using LGP. Accordingly, 1938 experimental test results were used to develop the model. The proposed model was compared with existing design codes, and its accuracy was shown to be higher than that of the design codes. Mollahasani et al. (2011) developed a new prediction model for the secant soil deformation modulus using LGP, where the model was developed based on a set of plate load tests conducted on different types of soil. The best model was selected among several models developed with various input parameters, and a sensitivity analysis was conducted. It was shown that the proposed model provided precise predictions. Rashed et al. (2012) developed a prediction model for the pressuremeter soil deformation modulus utilizing LGP. Babanajad et al. (2013) developed two models for predicting the strength of concrete under multiaxial compression. A total of 370 test data were used to train and test the models. The first model was developed for the prediction of the tensile-compression strength ratio, for which uniaxial compressive strength and confining pressure were selected as the effective input parameters. The second model was proposed to predict triaxial strength, for which uniaxial compressive strength and confining pressure were selected as the effective input parameters. Results confirmed that both developed LGP models had high accuracy. In addition, Rostami et al. (2018) proposed a new predictive model using LGP for predicting the confined compressive strength and the strain at confined peak stress of circular reinforced concrete columns. Comparing the results of the proposed model with models recommended by other researchers revealed that the proposed model had a significantly better performance. Sadrossadat et al. (2017) proposed a new equation for the 28-day compressive strength of high strength concrete (HSC) specimens by using LGP. The LGP model shows a good degree of accuracy and can be used for manual calculation. Table 2 lists the best performance of each method in the reviewed studies.

Table 2 Summary of the LGP performance indexes in the reviewed studies

3.3 Gene expression programming applications

Gene expression programming is a powerful linear-based GP method that has been widely used to solve civil engineering problems (Sadrossadat et al. 2020). Cevik et al. (2010) utilized GEP to develop a prediction model for the torsional strength of reinforced concrete. Accordingly, 76 sets of experimental data on rectangular RC beams obtained from existing databases were used to develop the model. The proposed model was compared with building codes, and it was concluded that the GEP model had a higher accuracy than the building codes. Özcan (2012) developed two GEP models to predict the splitting tensile strength of cylinder specimens with 150 mm diameter and 300 mm height. In total, 190 experimental results were used to develop the models. The first model, GEP-I, was a function of concrete compressive strength, while the second model, GEP-II, was a function of concrete compressive strength as well as water-cement ratio. The proposed models were compared with some existing models, and it was shown that the proposed GEP-based models can be used as promising tools for splitting tensile strength prediction. Further, Mousavi et al. (2012) used GEP to develop a model for predicting the compressive strength of high performance concrete (HPC) mixes. Linear and nonlinear least squares regression (LLSR and NLSR) models were also developed for comparison with the GEP model. For this purpose, 133 data collected from the literature were used. Results of the GEP, LLSR and NLSR models were compared, and it was revealed that the GEP model had a higher accuracy than the two regression models. Ozbek et al. (2013) developed five different GEP models to estimate the uniaxial compressive strength (UCS) of five types of rock with different characteristics. Good agreement between experimental data and predicted results was achieved. In addition, Gandomi et al. (2014) proposed a new design equation for the shear strength of slender reinforced concrete beams without stirrups, where 1942 experimental test results obtained from a published database were used to develop the model. During the development of the GEP model, several different combinations of training and test sets were considered to obtain the best GEP model with good simplicity and high accuracy. The obtained model was compared with some existing models, and it was shown that the GEP model had a higher accuracy. Also, sensitivity and parametric analyses were conducted to show the effects of each parameter. Gandomi et al. (2017) used a similar approach to derive design equations for the shear strength of slender reinforced concrete beams with stirrups, and the results confirmed the good performance and high accuracy of the model. Danish (2012) used the gene expression programming technique to develop a scour depth prediction model for bridge abutments in cohesive sediments. The model was developed based on laboratory data collected from the literature. The results showed that the GEP model provided a better prediction than a nonlinear regression model, which revealed that GEP has great potential in scour depth prediction. Hadianfard and Jafari (2016) conducted a study to estimate the compressive strength of lightweight aggregate concrete by generating suitable mathematical functions. Three types of lightweight concrete mixtures were assessed by the ultrasonic pulse velocity test. The results were used to generate compressive strength prediction functions by exponential function fitting and GEP. Three prediction functions, one for each type of lightweight concrete, and one correlation function were generated. It was revealed that the GEP functions had high accuracy for all three types, whereas the exponential function had high accuracy for only one type of material. Further, Gholampour et al. (2017) proposed new empirical models using GEP for predicting the mechanical properties of recycled aggregate concrete (RAC). A total of 650 compressive strength, 421 elastic modulus, 346 splitting tensile strength and 152 flexural strength test results from the literature were used to develop the GEP models (i.e., for compressive strength, elastic modulus, flexural strength and splitting tensile strength). The developed models were compared with some existing models. Results confirmed that the proposed GEP models had higher accuracy than the existing models.

Aval et al. (2017) conducted a study to develop a more accurate prediction model for the shear strength of SR-RCCs because of the limitations of existing simple design models. Thirty sets of lab testing data were collected, of which one third were used to calibrate a finite element (FE) model so that it accurately represented the behavior of SR-RCCs. The calibrated FE model was used to generate 53 simulations, and the results were used to develop prediction models utilizing GEP and nonlinear regression (NR) analysis. The rest of the lab testing data were used to evaluate the accuracy of the obtained GEP and NR models through comparison with the ACI and EC2 design codes. It was revealed that GEP had a much higher accuracy for shear strength prediction of SR-RCCs than the NR model, the ACI code and the EC2 code. Similarly, because of the limitations of the compressive strength and density standards proposed by ACI for lightweight concrete, Jafari and Mahini (2017) utilized GEP to generate prediction functions for the compressive strength of a specific mixture. One hundred specimens made with three types of concrete mixes were tested in the lab. Test results were compared to other models. Three prediction functions for compressive strength, with the amounts of water, sand, cement and lightweight aggregate as parameters, were developed for the three types of concrete mixtures. Results showed that the prediction accuracy was high. This study demonstrated that GEP could be used to generate prediction functions for compressive strength when there is no specific standard for lightweight concrete aggregate selection and mix proportions. In addition, Mansouri and Farzampour (2018) proposed a method to predict the buckling load of imperfect cylindrical shells. Because the classical buckling theory is insufficient for some specific applications, further research on thin-shell structures is needed. The proposed method was applied to an existing experimental data set. The performance of the proposed method was excellent on both the training and test data sets. This study showed that GEP is an efficient and accurate approach to estimate the buckling load of imperfect cylindrical shells. Sadrossadat and Basarir (2019) used GEP to develop two models for predicting the confined compressive strength and strain of RC columns with circular cross sections. The developed models perform notably better than traditional models.

As another example highlighting the applicability of GEP, Jumaa and Yousif (Arruda et al. 2014) proposed three models for predicting the shear capacity of FRP-reinforced concrete beams without stirrups, which utilized back-propagation neural networks, GEP and nonlinear regression (NLR) analysis. Accordingly, 269 test results collected from 42 different studies were used to derive the proposed models. For the ANN model training, a trial and error procedure was used to obtain the best fitting model, which consisted of thirteen neurons in a single hidden layer. To develop the best GEP model, various models were evolved through a set of genetic operators. For NLR, the proposed equation was based on an existing model proposed by Zsutty (1971). The predictions of the three proposed models were compared to the common design provisions and to seven equations proposed by other researchers. The comparison revealed that all of the proposed models had sufficient accuracy. The ANN and GEP models had a higher accuracy than the NLR model, but the NLR model was more computationally efficient than the other two models. İpek and Güneyisi (2019) deployed GEP to assess the axial strength of concrete-filled composite columns. GEP was utilized to generate the proposed model based on 103 sets of available data from the literature. The proposed model had a higher accuracy and a lower error than other models (İpek and Güneyisi 2019). The best performance of each GEP method is summarized in Table 3.

Table 3 Summary of the GEP performance indexes in the reviewed studies

3.4 Multi expression programming applications

Although MEP has not been widely used in the civil engineering area, some studies have discovered its potential. Alavi and Gandomi (2011) presented the capabilities of LGP, GEP and MEP in dealing with geotechnical engineering problems. Models were developed from reliable databases gathered from the literature. Compared to simple conventional models, the LGP, GEP and MEP models included more comprehensive parameters and had a higher prediction accuracy. Further, Alavi et al. (2010) proposed two soil classification formulations developed by MEP, and the results indicated that the MEP models achieved a high accuracy. The best performance of each study is summarized in Table 4.

Table 4 Summary of the MEP performance indexes in reviewed studies

3.5 Multigene genetic programming applications

Various studies have been conducted to assess the performance of MGGP in solving engineering problems. Gandomi and Alavi (2012) presented an MGGP-based method for modeling the behavior of structural engineering systems and applied it to four structural engineering problems. For this purpose, 1133 test results were used to develop the MGGP model to formulate the compressive strength of high-performance concrete. Also, 55 pure bending test results were utilized to develop the MGGP model for the ultimate pure bending of steel circular tubes. Further, 156 test results were used to develop the MGGP model for surface roughness in end-milling. Finally, 251 test results were used to develop the model for failure modes of beams subjected to patch loads. Additionally, Bayazidi et al. (2014) proposed an MGGP-based prediction model for the elastic modulus of both normal strength concrete (NSC) and high strength concrete (HSC). In total, 70 groups of NSC and 89 groups of HSC experimental data extracted from the literature were used to develop three MGGP models, where the compressive strength of concrete was selected as the effective input. The three models were developed with different training parameters (e.g., generation number, crossover ratio, etc.). Based on the performance of the proposed models on test data, it was revealed that they provided reliable estimates of the elastic modulus of both HSC and NSC. As another example, Li et al. (2019) proposed a method to estimate the elastic modulus of reservoir rocks. An experiment was conducted to validate the proposed method, where its predictions were compared with an empirical method and with the measured results. The proposed method had a higher accuracy than the traditional empirical method and required fewer input parameters (Li et al. 2019). The overall performance of each method is summarized in Table 5.

Table 5 Summary of the MGGP performance indexes in reviewed studies

4 Discussion and future trends

This survey provides evidence that TGP, LGP, GEP, MEP, and MGGP can be efficiently used for solving various civil engineering problems that are challenging to solve with traditional methods. Their application is not limited to classical civil engineering tasks but extends to other relevant arenas (Xu et al. 2013; Nourani et al. 2014; Garg et al. 2014; Pandey et al. 2015; Tran et al. 2016a; Suganuma et al. 2017). In most cases, the GP variants provide a comparable prediction performance. However, it has been observed that LGP, GEP and MEP can run several orders of magnitude faster than TGP and MGGP because the individuals in these variants are represented as linear strings (Alavi and Gandomi 2011). TGP and MGGP, on the other hand, evolve tree-shaped programs. Since computers do not naturally run tree-shaped programs, slow interpreters have to be used as part of the TGP and MGGP simulation process. The enhanced speed of LGP, GEP and MEP permits conducting many runs in realistic timeframes, which leads to deriving consistent, high-precision models with little customization (Brameier and Banzhaf 2007).

Comparing GP with other ML methods, it can be inferred that the major advantage of GP is the transparency of its solutions. Precisely developed GP-based prediction equations can replace the design equations in various civil engineering codes (e.g. the American Concrete Institute (ACI) Code) that are extensively used by civil engineers. These explicit models can be readily incorporated into numerical analysis tools such as finite element codes to replace the conventional (analytical) constitutive models. Another unique characteristic of GP is its built-in feature selection mechanism, which is known to help avoid over-fitting (Vanneschi et al. 2011). In the case of function fitting and regression, the overall performance of GP and other ML methods heavily depends on the complexity of the problem as well as other factors such as the quality and completeness of the training data, the error measures, etc. An ML method may show better performance on a specific problem, while on other problems it is outperformed by similar methods. However, there is more to ML than merely function approximation. Researchers are interested in methods that provide both good accuracy and model interpretability. Besides, there are other issues that need to be considered when comparing ML methods, such as the training time, computational complexity, number of hyperparameters, scaling of the complexity with the number of variables, etc. A limitation of the GP-based methods is that they are sensitive to their parameters. The performance of these algorithms can be improved by optimally controlling the parameters of the run, for example with advanced optimization methods. In addition, the GP methods are computationally intensive compared to ANNs, especially when applied to regression or classification tasks with large datasets. Exploiting highly parallel architectures in the form of Graphics Processing Units (GPUs) and Central Processing Units (CPUs) can improve their computational speed.

Applications of GP and its variants in the civil engineering area are still fairly new. Most of the studies reviewed in this paper focus on using the original versions of the GP methods. Although the obtained results are promising, improving the existing GP algorithms can strengthen their ability to solve more complex civil engineering problems and achieve more accurate results. Several attempts made to improve the existing GP methods for application in other engineering areas can also be deployed to enhance the GP practice in the civil engineering domain. This section provides an insight into some of these GP developments that can be considered for tackling more complicated civil engineering problems. As an example, extensive research has been done to improve the GP evolutionary process, i.e. initialization, selection and reproduction. For traditional GP, the initial population is randomly selected and may therefore not be sufficiently well scattered. This issue can affect the convergence speed of the evolution. To tackle this challenge, Chen et al. (2013) proposed a novel evolutionary algorithm called uniform design-aided gene expression programming (UGEP) based on traditional GEP. In UGEP, a mixed-level uniform table is used to locate uniformly scattered points across the whole search space to generate a well-distributed initial population. During the selection procedure, the diversity of the population decreases, while the similarity among individuals in the population increases. Therefore, it is important to maintain the diversity of the population to avoid premature convergence, in which the population converges even though the best fitness has not been reached. Several studies have been conducted to address this issue. Gan et al. (2007) proposed a clonal selection-based gene expression programming (CS-GEP), merging the Clonal Selection Algorithm (CSA) with general GEP to avoid the premature convergence issue. Similarly, Litvinenko et al. (2005) combined CSA with GP to overcome the premature convergence drawback by improving the diversity of the population. Guo et al. (2014) proposed an enhanced GEP inspired by the principle of minimal free energy in thermodynamics, called CTSGEP. The authors formulated the free energy component of the combination of newly created offspring and their parents. Individuals with smaller free energy components were selected for the next generation. In addition, Wang et al. (2015) proposed a similarity weight tournament selection procedure to increase the diversity of the population and improve the performance of existing GEP. In this selection procedure, each individual is assigned a selection probability. Individuals that are the historical best have a lower probability of being selected, but individuals with better fitness in the tournament are still selected for the next generation. The proposed method was applied to discover the dynamic relationship between the microstructure and physical properties of cement from images.
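The interplay between selection and diversity can be illustrated with a minimal Python sketch of plain tournament selection together with a simple diversity measure. This is a generic textbook form, not the similarity weight procedure of Wang et al. (2015) or any other method cited above. The function names, the string-encoded population, and the error values are hypothetical.

```python
import random


def tournament_select(population, fitness, k=3, rng=random):
    """Return the fittest of k individuals drawn at random (lower is better)."""
    contenders = rng.sample(range(len(population)), k)
    best = min(contenders, key=lambda i: fitness[i])
    return population[best]


def diversity(population):
    """Fraction of distinct individuals in the population (1.0 = all unique)."""
    return len(set(population)) / len(population)


rng = random.Random(0)
# Hypothetical population: chromosomes as strings, with made-up error values.
population = ["+xx", "*xx", "-x1", "+x1", "*x1", "+xx"]
fitness = [4.0, 1.5, 3.0, 2.0, 0.5, 4.0]

# Build the next generation by repeated tournaments.
next_gen = [tournament_select(population, fitness, k=3, rng=rng)
            for _ in range(len(population))]
print(diversity(population), diversity(next_gen))
```

Comparing the two printed values over repeated generations gives a rough indication of how quickly selection pressure reduces diversity. A larger tournament size `k` increases that pressure.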

The reproduction procedure of GP is highly dependent on genetic operators. To improve the search accuracy and efficiency of GP, new algorithms with novel operators have been developed. Jiang et al. (2005) proposed a new algorithm that combines GEP and Simulated Annealing (SA), called GEPSA. The proposed algorithm decreases the dependence on genetic operators without affecting the GEP performance. To improve the accuracy and efficiency of GEP, Zhong et al. (2016) proposed a self-learning GEP, named SL-GEP, in which each chromosome is assigned subfunctions that are self-learned or self-evolved during the evolutionary search. With these subfunctions, SL-GEP is able to improve the accuracy and efficiency of the search. Zeng et al. (2007) proposed an Immune Gene Expression Programming (IGEP) based on the Artificial Immune System (AIS) to address the rule mining issue. Since population similarity can be described by the edit distance between individuals, crossover is a way to create and maintain this distance. Accordingly, Qu et al. (2015) proposed two edit distance-based crossover operators for GEP to preserve the population diversity. Further, Chen et al. (2017a, b, 2019) improved the geometric semantic GP framework by introducing geometric search operators.
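To make the notion of edit distance between individuals concrete, the following Python sketch computes the Levenshtein distance between two linear chromosomes and applies a standard one-point crossover. It does not reproduce the operators of Qu et al. (2015). The function names and the string encoding of chromosomes are assumptions for illustration only.

```python
import random


def edit_distance(a, b):
    """Levenshtein distance between two sequences (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def one_point_crossover(p1, p2, rng=random):
    """Swap the tails of two equal-length chromosomes at a random cut point."""
    assert len(p1) == len(p2)
    cut = rng.randrange(1, len(p1))
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]


def mean_pairwise_distance(population):
    """Average edit distance over all unordered pairs of individuals."""
    pairs = [(a, b) for i, a in enumerate(population)
             for b in population[i + 1:]]
    return sum(edit_distance(a, b) for a, b in pairs) / len(pairs)


# Hypothetical parents encoded as fixed-length strings.
parent1, parent2 = "+*xx1x1", "-x*1xx1"
child1, child2 = one_point_crossover(parent1, parent2, rng=random.Random(0))
print(edit_distance(parent1, parent2), edit_distance(child1, child2))
```

A diversity-preserving operator could, for instance, use `mean_pairwise_distance` to monitor the population and prefer offspring that lie farther from existing individuals. This is one possible design among many.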

Another approach to improving GP is to combine it with other methods such as optimization algorithms, ANNs, feature extraction algorithms, and deep learning. Most ML methods are black-box techniques. To address this issue, studies have been performed to combine ANNs and deep neural networks (DNNs) with GP to provide more transparent models. Tran et al. (2016b) proposed a credit scoring framework by merging DNNs and the comprehensive GP. The framework, shown in Fig. 12, aimed to extract rules for building more robust and reliable DL models. The framework was validated with experimental data (Tran et al. 2016b). Further, Garg et al. (2014), Suganuma et al. (2019), Evans et al. (2018) and Assunção et al. (2019) addressed the noted issue using similar approaches. Additionally, Aslam et al. (2012) combined K-nearest neighbor (KNN) and GP to conduct modulation classification. KNN was used to evaluate the fitness of GP individuals during the training process and to deduce the performance of the best individual in the testing process. The validation results for the proposed method were better than those of other methods reported in the literature. Feature extraction algorithms have been gaining more attention in recent years. Price et al. (2019) proposed a novel GP approach to automatically generate discriminative-rich features for image classification.
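As a rough illustration of using a classifier to score GP individuals, the sketch below evaluates a candidate expression by the leave-one-out KNN accuracy of the single feature it produces. It is a simplified assumption and not the implementation of Aslam et al. (2012). The function names, the two-input samples, and the candidate expressions are hypothetical.

```python
def knn_accuracy(features, labels, k=3):
    """Leave-one-out k-nearest-neighbour accuracy on a 1-D feature."""
    correct = 0
    for i, (f, y) in enumerate(zip(features, labels)):
        # Distances to every other sample, closest first.
        neighbours = sorted(
            (abs(f - g), lab)
            for j, (g, lab) in enumerate(zip(features, labels)) if j != i
        )[:k]
        votes = [lab for _, lab in neighbours]
        predicted = max(set(votes), key=votes.count)
        correct += predicted == y
    return correct / len(labels)


def fitness(candidate, samples, labels, k=3):
    """Score a candidate expression by how well KNN separates its output."""
    features = [candidate(*s) for s in samples]
    return knn_accuracy(features, labels, k)


# Hypothetical two-class data with two raw inputs (a, b) per sample.
samples = [(1.0, 1.1), (2.0, 2.2), (3.0, 2.9),
           (1.0, 3.0), (2.0, 4.1), (3.0, 5.2)]
labels = [0, 0, 0, 1, 1, 1]

# Two hypothetical evolved expressions acting as feature generators.
def good(a, b):
    return b - a


def poor(a, b):
    return a


print(fitness(good, samples, labels), fitness(poor, samples, labels))
```

In this setting, an expression that maps the raw inputs to a well-separated feature receives a higher fitness, so the evolutionary search is steered toward discriminative features.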

Fig. 12 Credit scoring framework in Tran et al. (2016b)

As discussed in this survey, applications of GP in civil engineering are highly promising. Yet, this intelligent method could be further improved in various respects to enhance its performance. Based on the review of recent works on improving GP, two main approaches can be suggested for future research on GP for civil engineering applications: improving the GP evolutionary process or hybridizing GP with other powerful methods. Figure 13 illustrates the development trend of GP for each approach.

Fig. 13 The development trend of GP for civil engineering applications

5 Conclusion

Compared to other ML techniques, GP has rarely been used in the civil engineering domain. This comprehensive review aims at presenting the importance of GP for civil engineering applications over the last decade. The most recent works concerning the applications of GP in the civil engineering domain are discussed. It is shown that different GP branches such as TGP, LGP, GEP, MEP, and MGGP can be effectively adapted for various civil engineering applications. The survey reveals that GP and its variants are able to outperform existing models with high accuracy. GP allows solving complex problems that are difficult to solve using conventional methods. Compared to ANNs or other black-box ML methods, GP allows inclusion of experts' knowledge on the problem domain, resulting in a powerful mix of ML and human insight (Rodriguez-Coayahuitl et al. 2019). However, based on the review, improvement of GP and its variants is vital given the rapid advancement of ML methods. The survey suggests that GP can be enhanced through either improvement of GP itself or a combination with other ML methods. Based on the review, potential future research trends for GP are discussed. It is highlighted that GP and its variants can be enhanced and effectively used as pioneering methods for a variety of civil engineering applications. In particular, discovering novel materials is a trending topic that has been gaining significant attention in recent years. There has been an explosion in the number of works deploying AI/ML in the materials science domain (Liu et al. 2017). This computational paradigm change for material characterization and discovery is being further promoted by many researchers via more data-intensive approaches. While researchers are now heavily focused on revolutionizing the traditional materials science field with ML, the entire concept of ML for characterizing and designing new construction materials is still in its infancy, let alone GP. Therefore, harnessing the power of GP for material discovery and design should be considered a major future direction.