Abstract
This paper studies the time complexity of gene expression programming based on maintaining elitist (ME-GEP). Using the theory of Markov chain and the technique of artificial fitness level, the properties of transition matrices of ME-GEP are analyzed. Based on the properties, the upper and lower bounds of the average time complexity of ME-GEP are obtained. Furthermore, the upper bound is estimated, which is determined by the parameters of ME-GEP algorithm. And the theoretical results acquired in this paper are used to analyze ME-GEP for solving function modeling and clustering problem. At last, a set of experiments are performed on these problems to illustrate the effectiveness of theoretical results. The results show that the upper bound of expected first hitting time can be used to direct the algorithm design of ME-GEP.
1 Introduction
Gene expression programming (GEP) (Ferreira 2001) is a genotype/phenotype evolutionary algorithm. As a descendant of genetic algorithms (GAs) and genetic programming (GP), GEP combines their advantageous features and eliminates their main handicaps. The essential difference between GEP and GA/GP is that GEP separates the genotype (linear structure) from the phenotype (non-linear structure) of a chromosome, whereas GA and GP each have only one entity: linear strings of fixed length (chromosomes) in GA, or nonlinear entities of different sizes and shapes (parse trees) in GP.
From the history of life on earth (Smith and Szathmáry 1995), a system with a single entity, such as GA or GP, is condemned to one of two limitations (Ferreira 2001): if it is easy to manipulate genetically, it loses functional complexity (the case of GA); if it exhibits a certain amount of functional complexity, it is extremely difficult to reproduce with modification (the case of GP). In GEP, individuals are encoded as linear strings of fixed length (chromosomes) which are afterwards expressed as nonlinear entities of different sizes and shapes (the phenotype). They are not only easy to manipulate genetically but also exhibit a certain amount of functional complexity. Thus, GEP overcomes the representational limitation of GA and makes up for the complex genetic operators of GP. These excellent properties of GEP have attracted wide attention from scholars around the world and aroused their research interest. In recent years, GEP has rapidly become a powerful tool for automatic programming, and it has been widely applied in symbolic regression (Peng et al. 2005; Xin et al. 2006), time series prediction (Lopes et al. 2004), data mining (Chi et al. 2003; Bin et al. 2005; Karakasis and Stafylopatis 2008) and many other fields (Shi et al. 2010; Canakci et al. 2009; Teodorescu 2004; Teodorescu and Sherwood 2008).
Up to now, theoretical results on the computational time complexity of EAs are comparatively abundant. Representative works address specific problems (e.g. Rudolph 1996; Garnier et al. 1999) and simple single-individual EAs such as the (1 + 1) EA (e.g. Droste et al. 2002; Wegener and Witt 2005). It is to be noted that He and Yao have done a series of works on the time complexity of several kinds of EAs and different problems (He and Yao 2002, 2003, 2004; Lehre and Yao 2012). Further, more profound results about the computation time of EAs, obtained by drift analysis and other tools from stochastic process theory such as Dynkin's formula, were given by Ding and Yu (2005). A survey of the results obtained in the past decade on the time complexity of EAs was presented by Oliveto et al. (2007). After that, a simplified drift theorem was introduced by Happ et al. (2008). A further simplification of the drift theorem, particularly suited for the analysis of EAs, was presented by Oliveto and Witt (2008). By exploiting the relationship between convergence rates and expected first hitting times, Yu and Zhou (2008) gave upper bounds on the first hitting times of evolutionary algorithms based on the work of He and Kang (1999). A general idea for analyzing population-based EAs on unimodal problems using Markov chain theory and Chernoff bounds was given by Chen et al. (2009). However, how to relate the upper or lower bounds of the expected first hitting times to the parameters of EAs is still unresolved.
An individual of GEP has a linear genotype and a non-linear phenotype. At the same time, GEP not only contains all the genetic operators of traditional EAs but also introduces some new ones. All these characteristics of GEP pose challenges for time complexity analysis. At present, most studies of GEP focus on realizations, improvements and applications of the algorithm. Few studies involve theoretical analysis of GEP, especially its time complexity. To the best of our knowledge, the convergence of GEP under the smallest residual sum of squares was studied by Yuan et al. (2004, 2006). In view of these results, we (Du et al. 2009, 2010) studied the convergence and convergence rate of ME-GEP. We not only proved that the ME-GEP algorithm converges to the global optimal solution in probability but also obtained an upper bound on the convergence rate of GEP. Using the method of Jordan blocks, we showed that this upper bound depends on the second largest eigenvalue of the transition matrix of the Markov chain, and that this eigenvalue is determined by the parameters of the ME-GEP algorithm.
This paper concerns theoretical analysis of the average time complexity of ME-GEP on optimization problems with a unique global optimum. A general and easy-to-apply approach to estimating the average time complexity of the ME-GEP algorithm is presented. Using the properties of Markov chains and the artificial fitness level technique, the upper and lower bounds of the average time complexity of the ME-GEP algorithm are obtained. More importantly, we establish the relation between the upper bound and the parameters of the ME-GEP algorithm. The results obtained in this paper are general theoretical results. Moreover, the analytical methods and techniques adopted here have broad applicability and provide a useful reference for researchers in the theory of GEP. The rest of this paper is organized as follows: Sect. 2 introduces the GEP and ME-GEP algorithms. In Sect. 3, the characteristics of nonnegative matrices and the artificial fitness level technique are introduced; in addition, starting with the Markov chain model associated with the ME-GEP algorithm, some fundamental properties of the transition matrices of ME-GEP are proved. Section 4 gives the upper and lower bounds of the average time complexity of the ME-GEP algorithm and discusses the relations between the upper bound and the parameters of ME-GEP. As an application of the theoretical results, Sect. 5 explores the upper bounds of the average time complexity of ME-GEP for function modeling and clustering problems, and the results of Sect. 4 are verified. Finally, some short comments on the results obtained in this paper and future work are summarized.
2 Background
2.1 GEP algorithm
In the following, we will first introduce the genotype, phenotype and genetic operators in GEP.
The genotype is composed of a head and a tail. Let \(x\) be a genotype and set \(x = \{ \hat{x}_1 ,\hat{x}_2, \ldots ,\hat{x}_l \} \), where
$$\hat{x}_i \in \left\{ \begin{array}{ll} F \cup T, &{} 1 \le i \le h \\ T, &{} h < i \le l. \end{array} \right. $$
Here, \(l\) represents the length of the genotype. \(F\) represents the function set, which is composed of operators, primary functions or user-defined functions. \(T\) represents the terminal set, which is composed of constants and variables. Both \(F\) and \(T\) are finite. The elements in the head are from \(F\) or \(T\), and the elements in the tail are from \(T\). Variables \(h\) and \(tal\) denote the lengths of the head and tail, respectively. For any given \(h\), \(tal\) is evaluated by the formula \(tal = h \times (n_{\max }- 1) + 1 \), \(l = h + tal\), where \(n_{\max } = \max _{f \in F} \mathrm{paraNum}(f)\) and \(\mathrm{paraNum}(f)\) is the number of parameters of function \(f\).
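To make the tail-length formula concrete, here is a minimal sketch (the function name and the arity-dictionary representation are ours, not from the paper):

```python
# Hypothetical helper: compute the GEP tail length and total gene length from
# the head length h and the function arities, per tal = h*(n_max - 1) + 1 and
# l = h + tal.
def gene_lengths(h, function_set_arities):
    n_max = max(function_set_arities.values())  # n_max = max paraNum(f), f in F
    tal = h * (n_max - 1) + 1                   # tail length
    return tal, h + tal                         # (tal, l)

# With only binary operators (n_max = 2), h = 15 gives tal = 16 and l = 31,
# matching the parameter setting used later in Sect. 5.
```

This sizing rule is what guarantees that any gene, however it is mutated, still decodes to a valid expression tree.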
For a given genotype, the phenotype is inferred by the Karva language (Ferreira 2001), which reads the genotype from the start character depending on the number of arguments of each element (functions may have different numbers of arguments, whereas terminals have an arity of zero). The representation (algebraic expression) is then inferred from the phenotype by inorder traversal. The transformation relation between genotype, phenotype and representation is shown in Fig. 1.
There are three kinds of genetic operators in basic GEP (Ferreira 2001), including mutation, transposition and recombination. Each operation occurs with an active probability. Next, we show these operators adopted in GEP.
Mutation is allowed to occur anywhere in the chromosome, and each bit flips independently with probability \(p_m\) at time \(t\). In the head of the gene, a function can be flipped into another function or a terminal. In the tail, a terminal can only be flipped into another terminal.
In GEP, there are two recombination operators: one-point and two-point. They act in the same way as the one-point and two-point crossover operators in GA. In all cases of recombination, both offspring are carried into the succeeding generation.
IS transposition Any sequence in the chromosome can become an IS element. The chosen sequence copies itself to any position in the head of a chromosome except the start position. A sequence of the same length is deleted from the end of the head to maintain the structural organization of the gene. For example, consider a parent chromosome whose head length is 10 and tail length is 11.
Suppose that the sequence “bba” becomes an IS element and its target position is 7. As a result of IS transposition, the last three elements “b*a” of the head are deleted. The offspring is as follows:
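The IS transposition step can be sketched as follows; this is an illustrative implementation of ours, not the paper's code, and the admissible-position convention is simplified slightly:

```python
import random

# Sketch of IS transposition on a single gene: a random subsequence is copied
# to a random head position other than the first, and the head is truncated
# back to length h so the gene structure (head/tail lengths) is preserved.
def is_transpose(gene, h, rng=random):
    gene = list(gene)
    start = rng.randrange(len(gene))                 # first bit of the IS element
    length = rng.randrange(1, len(gene) - start + 1) # IS element length
    is_elem = gene[start:start + length]
    target = rng.randrange(1, h)                     # any head position except the start
    head = gene[:h]
    head[target:target] = is_elem                    # insert a copy of the IS element
    return ''.join(head[:h] + gene[h:])              # cut head back to h; tail unchanged
```

Whatever element is chosen, the offspring always has the same total length as the parent and an unchanged tail, which is exactly the structural invariant described above.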
2.2 ME-GEP algorithm
Although the advantages of a system like GEP are clear from nature, the most important ones should be emphasized. First, the chromosomes (genotype) are simple entities: linear, compact, relatively small and easy to manipulate genetically (mutate, recombine, transpose, etc.). Second, the Karva language was created to read and express the information encoded in the chromosome (genotype); it always guarantees that a genotype can be converted into a valid phenotype, which allows the implementation of a very powerful set of unconstrained genetic operators acting on the genetic space. Clearly, the unconstrained genetic operators greatly improve the search capability, but they also make the GEP algorithm difficult to converge. ME-GEP is a GEP algorithm with an elitist strategy. Thus, ME-GEP not only has efficient search capability but also ensures convergence, as proved in our previous work (Du et al. 2010).
ME-GEP algorithm with population size \(n(n \ge 1)\) can be described as follows:
Step 1 Initialization: generate randomly an initial subpopulation of \(n\) individuals, denote it by \(X(0) = (x_1 (0), \ldots ,x_n (0))\), where \(x_i (0) \in H\), \(i = 1, \ldots ,n\) and let \( t \leftarrow 0 \). The optimal individual in generation 0, denoted by \(x_0 (0)\), is then defined as the individual among \(x_1 (0), \ldots ,x_n (0)\) with the biggest fitness. The population in generation 0 is defined as \(\xi (0) = (x_0 (0), x_1 (0), \ldots ,x_n (0))\), where \(x_i (0) \in H\), \(i = 0,1, \ldots ,n\).
Step 2 Recombination: generate a new (intermediate) subpopulation by recombination operation based on subpopulation \(X(t)\) with probability \(p_c\)(\(0 < p_c < 1\)) and denote it as \(X{}_C(t)\).
Step 3 Transposition: generate a new (intermediate) subpopulation by transposition operation based on subpopulation \(X{}_C(t)\) with probability \(p_t\)(\(0 < p_t< 1\)) and denote it as \(X{}_T(t)\).
Step 4 Mutation: generate a new (intermediate) subpopulation by mutation operation based on subpopulation \(X{}_T(t)\) with probability \(p_m\)(\(0 < p_m< 1\)) and denote it as \(X{}_{Mu}(t)\).
Step 5 Select \(n\) individuals from subpopulation \(X{}_{Mu}(t)\) and \(X(t)\) according to certain select strategy, where the probability for each individual to be selected is \(>\)0 and then obtain the next subpopulation \(X(t+1)\).
Step 6 Find optimal individual \(x_0 (t + 1)\) from subpopulation \(X(t+1)\) and optimal individual \(x_0 (t )\).
Step 7 Let \(t \leftarrow t + 1\); then the next population \(\xi (t + 1) = (x_0 (t + 1),x_1 (t + 1), \ldots ,x_n (t + 1))\).
Step 8 If \(f(x_0 (t + 1)) = f_{\max }\), then stop; otherwise go to Step 2.
Remark In this paper, we always assume that all the genetic operations are independent of each other and that the fitness function is single-valued. The individual space is denoted by \(H\), with \(|H| = N\).
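Steps 1–8 can be summarized in a toy skeleton; this is ours, not the paper's implementation, and the genetic operators are passed in as placeholder functions:

```python
import random

# Toy skeleton of the ME-GEP loop: the operators act on the subpopulation X(t)
# while the elite x0 is kept aside and only replaced by a strictly better
# individual, so the best-so-far fitness is monotone non-decreasing.
def me_gep(init, fitness, recombine, transpose, mutate, f_max,
           n=20, max_gen=10_000, rng=random):
    X = [init(rng) for _ in range(n)]            # Step 1: initialization
    x0 = max(X, key=fitness)
    for _ in range(max_gen):
        if fitness(x0) == f_max:                 # Step 8: stopping criterion
            return x0
        Xc = recombine(X, rng)                   # Step 2: recombination
        Xt = transpose(Xc, rng)                  # Step 3: transposition
        Xm = mutate(Xt, rng)                     # Step 4: mutation
        pool = Xm + X                            # Step 5: select n from X_Mu and X,
        X = [rng.choice(pool) for _ in range(n)] #   each with probability > 0
        best = max(X, key=fitness)               # Steps 6-7: update the elite
        if fitness(best) > fitness(x0):
            x0 = best
    return x0
```

With identity recombination/transposition and simple character-flip mutation on a small string-matching toy problem, the loop reliably reaches \(f_{\max}\), while elitism guarantees the returned individual is the best ever seen.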
3 Markov chain model of ME-GEP
In Du et al. (2010), we proved that both \(\{ X(t);t \ge 0\}\) and \(\{ \xi (t);t \ge 0\}\) are homogeneous Markov chains with finite state spaces \(H^n\) and \(V = H^{n + 1}\), respectively. This is because subpopulation \(X(t)\) depends only on subpopulation \(X(t-1)\), and the optimal individual \(x_0 (t)\) depends on both subpopulation \(X(t)\) and the optimal individual \(x_0 (t-1)\); hence, population \(\xi (t)\) depends only on population \(\xi (t-1)\). Furthermore, the genetic operators adopted are the same in every generation.
Since the limit behaviors of Markov chain are determined by its corresponding transition matrices, some definitions and symbols about matrix are stated as below.
3.1 The characteristic of nonnegative matrix
Definition 1
Rudolph (1994) A square matrix \(A = [a_{ij} ]_{m \times m}\) is said to be
(1) nonnegative (\(A \ge 0\)), if \(a_{ij} \ge 0\) for all \(i,j \in \{ 1,2, \ldots ,m\}\);
(2) positive (\(A >0\)), if \(a_{ij}> 0\) for all \(i,j \in \{ 1,2, \ldots ,m\}\).
A nonnegative matrix \(A = [a_{ij} ]_{m \times m}\) is said to be
(3) primitive, if there exists a positive integer \(k\) such that \(A^k\) is positive;
(4) stochastic, if \(\sum \nolimits _{j = 1}^m {a_{ij} } = 1\) for all \(i \in \{ 1,2, \ldots ,m\}\);
(5) column-allowable, if it has at least one positive entry in each column.
Clearly, the product of stochastic matrices is again a stochastic matrix, and every positive matrix is also primitive.
Apart from providing a smaller Markov chain transition matrix, artificial fitness levels lead to simplified calculations. Next, the artificial fitness level technique will be introduced.
3.2 Artificial fitness levels
Artificial fitness levels (Oliveto et al. 2007) is a partition technique for the search space based on the fitness function. The whole search space \(S\), viewed as a set of \(|S| = n\) different states, is partitioned into \(m<n\) states \(A_1,\ldots , A_m\) such that for all points \(a \in A_i\) and \(b \in A_j\), \(f(a) < f(b)\) holds whenever \(i<j\). Thus \(A_m\) contains only optimal search points, and the Markov chain can be constructed over only \(m\) different states rather than \(n\).
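A minimal sketch of this partition (function name ours):

```python
# Sketch of the fitness-level partition: group a finite search space S into
# levels A_1 < ... < A_m by distinct fitness value, so a Markov chain over m
# levels can replace one over |S| states.
def fitness_levels(S, f):
    values = sorted({f(s) for s in S})             # m distinct fitness values
    return [[s for s in S if f(s) == v] for v in values]

# Example: S = {0,...,7} with f(s) = number of 1-bits yields m = 4 levels,
# and the top level contains only the optimum 7 (binary 111).
```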
3.3 The transition matrices of ME-GEP
Note that the evolutionary process of ME-GEP can be illustrated by the following formula:
$$X(t)\buildrel C \over \longrightarrow X_C (t)\buildrel T \over \longrightarrow X_T (t)\buildrel M \over \longrightarrow X_{Mu} (t)\buildrel S \over \longrightarrow X(t + 1)\buildrel U \over \longrightarrow \xi (t + 1),$$
where \(C\), \(T\), \(M\), \(S\) and \(U\) represent recombination, transposition, mutation, select and update operation, respectively.
Let \(P^X\) and \(P^\xi \) be the transition matrices associated with homogeneous Markov chains \(\{ X(t);t \ge 0\}\) and \(\{ \xi (t);t \ge 0\}\). Then the properties and relations between \(P^X\) and \(P^\xi \) will be analyzed in the following. First, some conclusions of \(P^X\) can be stated as below.
Lemma 1
For any two chromosomes \(x\) and \(y\), the mutation probability from \(x\) to \(y\) is
where \(k_1\) and \(k_2\) are the numbers of different bits in gene head and tail of \(x\) and \(y\) separately.
Proof
For \(\forall x,y \in H\), suppose that there are \(k_1\) bits in head and \(k_2\) bits in tail of individual \(x\) to be mutated. Each bit in head has \(|F| + |T|\) conditions and each bit in tail has \(|T|\) conditions; hence the probability that individual \(x\) becomes individual \(y\) after mutation can be aggregated to
\(\square \)
Lemma 2
Through IS transposition, the population in the last generation can generate any chromosome with probability
$$\frac{2p_t }{(h - 2) \cdot l \cdot (l + 1)},$$
Proof
Suppose individual \( X= x_1 x_2 \cdots x_l\) is selected for IS transposition with probability \(p_t\). Let the beginning bit of the IS string be \(f_j\), \(j = 1,2,\ldots ,l\). There are \(l-j+1\) possibilities for the IS string and \(h-2\) possibilities for its insertion bit (the first bit of the head and the bit itself being excluded). Thus IS transposition with beginning bit \(f_j\) can generate \((h - 2) \cdot (l-j+ 1)\) kinds of chromosomes. When IS transposition occurs on \(X\), there are
$$\sum \limits _{j = 1}^{l} (h - 2) \cdot (l - j + 1) = \frac{(h - 2) \cdot l \cdot (l + 1)}{2}$$
kinds of chromosomes. So any chromosome can be generated by IS transposition with probability \(2p_t /((h - 2) \cdot l \cdot (l + 1))\). \(\square \)
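The counting argument in the proof can be checked by brute force (helper name ours):

```python
# Brute-force check of the count used in Lemma 2: summing the (h-2)*(l-j+1)
# chromosomes over all starting bits j = 1..l gives (h-2)*l*(l+1)/2, so each
# outcome has probability 2*p_t / ((h-2)*l*(l+1)).
def is_outcome_count(h, l):
    return sum((h - 2) * (l - j + 1) for j in range(1, l + 1))

# e.g. h = 10, l = 21 (the example gene above): (10-2)*21*22/2 = 1848 outcomes
```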
Lemma 3
\(P^X\) is primitive, which can be decomposed in a natural way into a product of stochastic matrices \(P^X = C \cdot T \cdot M \cdot S\), where \(C = [c_{ij} ]_{N^n \times N^n }\), \(T = [t_{ij} ]_{N^n \times N^n }\), \(M = [m_{ij} ]_{N^n \times N^n }\) and \(S = [s_{ij} ]_{N^n \times N^n }\) are the intermediate transition matrices induced by recombination, transposition, mutation and selection, respectively.
Proof
Recombination, transposition, mutation and selection operators may be regarded as random total functions whose domain and range are \(H^n\), i.e., each state of \(H^n\) is mapped probabilistically to another state, since \(0 < p_c, p_t, p_m < 1\). Therefore, \(C\), \(T\), \(M\) and \(S\) are stochastic matrices.
According to Lemma 1, we have
where \(k_1 \le h,k_2 \le tal.\)
Denote the minimal mutation probability by \(p_{\min m} = \mathop {\min }\limits _{a,b \in H} p\left\{ a\buildrel M \over \longrightarrow b\right\} \); then
For any \(i,j \in H^n \), we have \(m_{ij} \ge p_{\min m}^n > 0\). Thus \(M\) is positive.
According to selection operation, the probability of each individual to be selected is greater than zero. Then for \(\forall i \in H^n \), we have \(s_{ii}> 0\). Thus \(S\) is column-allowable.
Let \(A = [a_{ij} ]_{N^n \times N^n } = T \cdot M\), \(Q = [q_{ij} ]_{N^n \times N^n } = C \cdot A\), \(P^X = [p_{ij}^X ]_{N^n \times N^n } = Q \cdot S\). Since \(C\) and \(T\) are stochastic matrices with at least one positive entry in each row, and \(M>0\), matrix multiplication gives \(a_{ij} = \sum \nolimits _{k = 1}^{N^n } {t_{ik} m_{kj} } > 0\) for all \(i,j \in \{ 1,2, \ldots ,N^n \}\), i.e. \(A>0\). Similarly, \(q_{ij} = \sum \nolimits _{k = 1}^{N^n } {c_{ik} a_{kj} } > 0\), and hence \(p_{ij} ^X = \sum \nolimits _{k = 1}^{N^n } {q_{ik} s_{kj} } > 0\) for all \(i,j \in \{ 1,2, \ldots ,N^n \}\), because \(S\) is column-allowable. Thus \(P^X\) is positive, and since every positive matrix is primitive, \(P^X\) is primitive. \(\square \)
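A small numerical illustration (ours) of the closure argument in Lemma 3, using hand-picked 3×3 matrices:

```python
# Multiplying stochastic matrices C and T by a positive M and a
# column-allowable S yields a positive, hence primitive, product C*T*M*S,
# mirroring the structure of the proof of Lemma 3.
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

C = [[0.5, 0.5, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]   # stochastic
T = [[1.0, 0.0, 0.0], [0.5, 0.0, 0.5], [0.0, 1.0, 0.0]]   # stochastic
M = [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8]]   # positive ("mutation")
S = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]   # column-allowable (s_ii > 0)

P = matmul(matmul(matmul(C, T), M), S)
# Every entry of P is strictly positive, so P is primitive (already with k = 1),
# and P remains stochastic as a product of stochastic matrices.
```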
In order to analyze the properties of the transition matrix \(P^\xi \), we use the artificial fitness level technique to partition the state space \(V = H^{n + 1}\). First, set \(H = \{ s_1, s_2, \ldots , s_N \} \) with \(f(s_1 ) > f(s_2 ) > \cdots > f(s_N )\); thus \(H\) is a totally ordered set, and the elements of \(H\) can be ordered as \(s_1 \succ s_2 \succ \cdots \succ s_N\). Let
$$V_i = \{ (s_i, x_1, \ldots , x_n ) : x_j \in H,\ j = 1, \ldots, n\}, \quad i = 1, \ldots ,N.$$
It is obvious that \(V_1 \cup \cdots \cup V_N = H^{n + 1}\) and \(V_1\) is the optimal state set. Then the following facts can be reached immediately:
(1) By Steps 2 to 5 of ME-GEP, the optimal individual is not affected by the genetic operators, and the extended transition matrices on \(H^{n + 1}\) for recombination \(C^{\prime }\), transposition \(T^{\prime }\), mutation \(M^{\prime }\) and selection \(S^{\prime }\) can be written as block diagonal matrices.
$$\begin{aligned} \begin{array}{ll} C^{\prime } = \left( \begin{array}{llll} C &{} &{} &{} \\ &{} C &{} &{} \\ &{} &{} \ddots &{} \\ &{} &{} &{} C \\ \end{array}\right) &{} T^{\prime } = \left( \begin{array}{llll} T &{} &{} &{} \\ &{} T &{} &{} \\ &{} &{} \ddots &{} \\ &{} &{} &{} T \\ \end{array} \right) \\ M^{\prime } = \left( \begin{array}{llll} M &{} &{} &{} \\ &{} M &{} &{} \\ &{} &{} \ddots &{} \\ &{} &{} &{} M \\ \end{array} \right) &{} S^{\prime } = \left( \begin{array}{llll} S &{} &{} &{} \\ &{} S &{} &{} \\ &{} &{} \ddots &{} \\ &{} &{} &{} S \\ \end{array} \right) \\ \end{array} \end{aligned}$$
$$\begin{aligned} C^{\prime }T^{\prime }M^{\prime }S^{\prime }&= \left( \begin{array}{llll} {CTMS} &{} &{} &{} \\ &{} {CTMS} &{} &{} \\ &{} &{} \ddots &{} \\ &{} &{} &{} {CTMS} \\ \end{array} \right) \\&= \left( \begin{array}{llll} {P^X } &{} &{} &{} \\ &{} {P^X } &{} &{} \\ &{} &{} \ddots &{} \\ &{} &{} &{} {P^X } \\ \end{array} \right) \end{aligned}$$
(2) The update operation of Step 6 only updates the optimal individual, and the transition matrix \(U\) can be written in terms of matrices \(U_{rk}\ (1 \le r,k \le N)\) as follows:
$$\begin{aligned} U = \left( \begin{array}{l@{\quad }l@{\quad }l@{\quad }l} {U_{11} } &{} {U_{12} } &{} \cdots &{} {U_{1N} } \\ {U_{21} } &{} {U_{22} } &{} \cdots &{} {U_{2N} } \\ \vdots &{} \vdots &{} \ddots &{} \vdots \\ {U_{N1} } &{} {U_{N2} } &{} \cdots &{} {U_{NN} } \\ \end{array} \right) , \end{aligned}$$
where \(U_{ij}\ (1 \le i,j \le N)\) are \(N \times N\) matrices.
(3) The transition matrix \(P^\xi = (C^{\prime }T^{\prime }M^{\prime }S^{\prime }) \cdot U\).
Some useful properties of \(P^\xi \) can be obtained by analyzing matrix \(U\). Let \(n_k = |\{ (x_1, \ldots, x_n ):\mathop {\max }\limits _{1 \le j \le n} f(x_j ) = f(s_k )\} |\); that is, \(n_k\) is the number of subpopulations whose best individual is \(s_k\). Then we have
Lemma 4
For each subspace \(V_i = \{ (s_i, x_1, \ldots , x_n ){:}\,x_j \in H,\ j = 1, \ldots, n\}\), \(i = 1, \ldots ,N\), in \(H^{n + 1}\), we have \(\sum \nolimits _{k = 1}^N {n_k } = N^n \) and \(n_k = (N - k + 1)^n - (N - k)^n\), where \(k = 1,\ldots ,N\).
Proof
According to the definition of \(V_i\), the size of each \(V_i\) equals \(N^n\); thus it is clear that
$$\sum \limits _{k = 1}^N {n_k } = N^n.$$
In subspace \(V_i\), if subpopulation \((x_1 , \ldots ,x_n )\) satisfies \(\mathop {\max }\limits _{1 \le j \le n} f(x_j ) \le f(s_k )\), then it cannot contain any of the individuals \(s_1 ,s_2 ,\ldots \), \(s_{k - 1}\). Thus, there are \((N - k + 1)^n\) states which do not contain individuals \(s_1 ,s_2 , \ldots ,\) \(s_{k - 1}\). Likewise, there are \((N - k )^n\) states that do not contain individuals \(s_1 ,s_2 , \ldots ,s_k\). So we have
$$n_k = (N - k + 1)^n - (N - k)^n.$$
\(\square \)
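Lemma 4 can also be verified by exhaustive enumeration for small \(N\) and \(n\) (helper name ours):

```python
from itertools import product

# Brute-force check of Lemma 4: with fitness ordered f(s_1) > ... > f(s_N),
# the number n_k of subpopulations in H^n whose best individual is s_k (i.e.
# whose smallest index is k) equals (N-k+1)^n - (N-k)^n, and the n_k sum to N^n.
def n_k_counts(N, n):
    counts = [0] * (N + 1)
    for pop in product(range(1, N + 1), repeat=n):   # indices of s_1..s_N
        counts[min(pop)] += 1                        # best individual = smallest index
    return counts[1:]

# e.g. N = 4, n = 3: the counts are [37, 19, 7, 1] and 37+19+7+1 = 64 = 4^3
```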
Lemma 5
Matrix \(U\) has the following properties:
(1) In each row of matrix \(U\), there is exactly one entry equal to 1 and all others are 0, and \(U_{11}\) is a unit matrix.
(2) \(U_{rk} = 0\ (r < k,\ 1 \le r < N)\).
(3) \(U_{rk}\ (2 \le r \le N,\ 1 \le k \le r - 1)\) are diagonal matrices with \(n_k \) entries equal to 1 on the main diagonal and the remaining entries 0.
(4) \(U_{rr}\ (2 \le r \le N)\) are diagonal matrices with \(\sum \nolimits _{i = r}^N {n_i }\) entries equal to 1 on the main diagonal and the remaining entries 0.
Proof
(1) The upgrade operation is represented by an upgrade matrix \(U\). Let \(b(i)\) be the best individual of the subpopulation \((x_1 , \ldots ,x_n )\) at any state \(i\), written \(b(i) = \arg \max \{ f(x_j ){:}\,j = 1, \ldots ,n\}\). Then \(u_{ij} = 1\) if \(f(x_0 ) < f(b(i))\), where \(i =( x_0 ,x_1 , \ldots ,x_n) \in V\) and \(j\mathop = \limits ^\Delta ( b(i),x_1 , \ldots ,x_n) \in V \); otherwise \(u_{ii} = 1\) and \(u_{ij} = 0\ (j \ne i) \). That is, in each row of matrix \(U\) there is exactly one entry equal to 1, and all others are 0. Notice that the subpopulation \((x_1 , \ldots ,x_n )\) is unchanged by the upgrade operation; in other words, a state either gets upgraded or remains unaltered. So each sub-matrix \(U_{ij}\ (1 \le i,j \le N)\) has its nonzero entries only on the main diagonal. In particular, the optimal individual of subspace \(V_1\) is the global optimal solution, i.e. \(f(x^ * ) \ge f(b(i))\); then \(u_{ii}= 1\) and \(u_{ij}= 0\ (j \ne i)\) by (1). So \(U_{11}\) is a unit matrix.
(2) The state of the population remains unaltered when \(r < k\), that is, when \(s_r \succ s_k\). Then \(u_{ij}= 0\ (j \ne i)\). Thus \(U_{rk}= 0\ (r < k,\ 1 \le r < N)\).
(3) We have \(s_k \succ s_r\) when \(1 \le k \le r - 1\). By Lemma 4, the total number of states in \(V_r\) with best individual \(s_k\) is \(n_k\). Hence, by (1), each diagonal matrix \(U_{rk}\ (2 \le r \le N,\ 1 \le k \le r-1)\) has \(n_k\) entries equal to 1, while the others equal 0.
(4) We have \(s_k=s_r\) when \(2 \le k =r \le N\). A state in \(V_r\) remains unaltered if its best individual is one of \(s_r\), \(s_{r+1}\), \( \ldots \), \(s_N\). Then \(U_{rr}\ (2 \le r \le N)\) are diagonal matrices with \(n_r + n_{r + 1} + \cdots + n_N \) entries equal to 1 on the main diagonal and the remaining entries 0. \(\square \)
Based on the above, the transition matrix \(P^\xi \) becomes
where
According to the properties of absorbing Markov chains (Iosifescu 1980), we have
$$\begin{aligned} P^\xi = \left( \begin{array}{ll} Y &{} 0 \\ R &{} T \\ \end{array} \right), \end{aligned}$$
where \(Y\) concerns the transitions among absorbing states, the region \(0\) consists entirely of 0's, the matrix \(R\) concerns the transitions from transient states to absorbing ones, and the matrix \(T\) concerns the transitions from transient states to transient states. The matrix \(Y\) corresponds to the fact that once the Markov chain reaches an absorbing state, it never leaves it.
Let \(P^\xi =\left( p_{ij}^\xi ;i,j \in N^{n + 1} \right) \) and \(P^X= \left( p_{ij}^X ;i,j \in N^n \right) \), where \(p_{ij}^\xi \) and \(p_{ij}^X\) are the transition probabilities from state \(i\) to state \(j\). According to Lemma 5, the matrices \(U_{rk}\ (2 \le r \le N,\ 1 \le k \le r - 1) \) and \( U_{rr}\ (2 \le r \le N)\) can be written as follows:
Then we have
The matrix \(R\) contains \(N\) sub-matrices similar to the sub-matrix \(P^X U_{21}\) above. The matrix \(T\) contains two kinds of sub-matrices: \( P^X U_{rk}\ (2 \le r \le N,\ 1 \le k \le r - 1)\) and \(P^X U_{rr}\ (2 \le r \le N)\); for example, \(P^X U_{32}\) and \(P^X U_{22}\) can be written as Eqs. (7) and (8).
4 The average time complexity of ME-GEP
In this section, the expected first hitting time (EFHT) is used to estimate the average time complexity of ME-GEP, based on the Markov chain model and transition matrices of ME-GEP shown above.
The first hitting time (FHT) of ME-GEP is the time at which ME-GEP finds the optimal solution for the first time, and the EFHT is the average time that ME-GEP requires to find the optimal solution, which reflects the average computational time complexity of ME-GEP.
The first hitting time on \(V_1\) can be defined by
$$\tau (V_1 ) = \min \{ t \ge 0 : \xi (t) \in V_1 \}.$$
Let \(E\) denote the expectation operator. The first hitting time \(\tau (V_1 )\) is a random variable, and what we are interested in is its expected value \(E[\tau (V_1 )]\).
Hence, we can obtain the following conclusion of EFHT of ME-GEP:
Theorem 1
Let \(\beta = \min \{ p^X _{ij} ;i,j \in N^n \}\); then the EFHT of ME-GEP is as follows:
Proof
Note that \(P\{ \tau (V_1 ) \ge 0\} = 1 \), \(P\{ \tau (V_1 ) \ge 1\} = P\{ \xi (0) \notin V_1 \}\).
By Markov property of \(\{ \xi (t);t \ge 0\}\), for any \(l \ge 2\), we have
Then for any \(i \notin V_1\), according to Eqs. (6)–(8), we have
So that for any \(l \ge 2\), we have
Note that \(0 < n_1 \beta < 1\), \(0 < (N^n - n_1 )\beta < 1\). Hence
So the upper bound for \( E[\tau (V_1 )]\) is
The lower bound for \( E[\tau (V_1 )]\) is
That is,
\(\square \)
In order to relate the EFHT directly to the parameters of ME-GEP, Corollary 1 is given.
Corollary 1
Let \(p_{\min t} = \mathop {\min }\limits _{\forall a,b \in H} p\left\{ a\buildrel T \over \longrightarrow b\right\} \), \(p_{\min s} = \mathop {\min }\limits _{\forall a \in H} p\left\{ a\buildrel S \over \longrightarrow a\right\} \), \( p_{\min m} = \mathop {\min }\limits _{\forall a,b \in H} p\{ a\buildrel M \over \longrightarrow b\}\) ; then we have
Proof
For any \(i,\delta ,\eta ,k,j \in H^n\), we have
Then we have
So we get
By Theorem 1, we have
\(\square \)
Theorem 1 estimates the upper and lower bounds of the EFHT of the ME-GEP algorithm, and Corollary 1 further relates the upper bound directly to the parameters of ME-GEP. Thus, Theorem 1 and Corollary 1 can be used to guide the algorithm design of ME-GEP.
5 Case study
In order to justify our theoretical results, the EFHT of ME-GEP for solving function modeling and clustering problems is analyzed below.
To guarantee the validity of the results, each execution of each algorithm on each parameter set was independently repeated 50 times.
5.1 Function modeling problem
5.1.1 The description of function modeling problem
The function modeling problem can be described as follows: given a group of data \((x_1^i ,x_2^i , \ldots ,x_m^i ,y^i ),\ i = 1,2, \ldots, k\), find a function \(\varphi (x_1 ,x_2 , \ldots ,x_m ) \) in a certain function set \(\Phi (\bar{x}) \), \(\bar{x} = (x_1 , \ldots ,x_m ) \), which satisfies
where \(g(\bar{x})\) is an arbitrary function in set \(\Phi (\bar{x}) \).
Next, the modeling of the amount of gas emitted from coalfaces (Xin et al. 2006) is used as an example. The amount of gas emitted from coalfaces is the main unsafe factor in coal mine production; however, it is hard to predict because it is affected by many factors and has complicated nonlinear relations with them. The sampling cases are given in Table 1. There are six main factors as input variables: the depth of the coalface \(x_1\), the height of the coalface \(x_2\), the amount of coalfaces \(x_3\), the layer interval between the working coalface and the neighboring coalface \(x_4\), the average daily advance of the working face \(x_5\) and the average daily output \(x_6\). The amount of gas emitted from coalfaces, \(y\), is the output variable. By ME-GEP, a function can be found that accurately models the nonlinear relations between the input variables and \(y\).
5.1.2 The EFHT of ME-GEP for solving the amount of gas emitted from coalfaces
The parameters of the ME-GEP algorithm are designed as follows: let \(n = 80\) and \(h = 15\); then \(tal = 16\) and \(l = 31\). Take \(F = \{ + , - ,*,/,\mathrm{{sqrt, exp,}}\sin ,\cos ,\ln \}\) and \(T = \{ R,x\}\), where \(R\) is a set of random constants, assumed to have 20 elements. The genetic operators adopted are one-point recombination, IS transposition, selection and mutation. In particular, two mutation operators are considered: uniform mutation, which flips each bit independently with probability \(p_m\), and one-point mutation, which flips one randomly chosen bit with probability \(p_m\). In the selection strategy, each individual is selected with the same probability. Next, the EFHT of ME-GEP with the two different mutation operators is analyzed.
The fitness (Ferreira 2001) of ME-GEP is evaluated by Eq. (10).
where \(k\) is the number of samples and \(M\) is the range of selection. For this example, let \(M= 1{,}000\) and factor \(=M/k \approx 56\); \(C_{ij}\) is the value returned by individual \(i\) for sample case \(j\), and \(y_j\) is the target value for sample case \(j\).
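Eq. (10) itself is not reproduced in this text; assuming it follows Ferreira's (2001) standard absolute-error form with range of selection \(M\), a sketch is:

```python
# Sketch of a fitness of the Ferreira (2001) absolute-error form, which we
# assume Eq. (10) follows: f_i = sum_j (M - |C_ij - y_j|). A perfect model
# scores k*M, and fitness stays non-negative while errors lie within M.
def fitness(predicted, targets, M=1000):
    return sum(M - abs(c - y) for c, y in zip(predicted, targets))

# A perfect individual on k samples scores k*M, e.g. k = 2, M = 1000 -> 2000.
```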
Applying Theorem 1 and Corollary 1, we have
Let \(p_{\min m1}\) and \(p_{\min m2}\) be the minimum probabilities when an individual \(x\) becomes individual \(y\) after uniform mutation and one-point mutation, respectively. Next, we will estimate \(N\), \(p_{\min t}\), \(p_{\min m1}\), \(p_{\min m2}\) and \(p_{\min s}\).
(1) The size of individual space \(N\).
Denote the tree height of phenotype by variable \(ht\). Because the individual length is 31, we have \(ht \le 5\). It follows that
(2) \(p_{\min m1}\)
According to Lemma 1, we have
If \(1-p_m > p_m\), that is \(p_m <{\textstyle {1 \over 2}}\), it follows that
It is easy to find that the minimum probability that any individual \(x\) mutates to another individual \(y\) is \(\left( \frac{{p_m }}{{910}}\right) ^{31}\) where \(p_m \in (0,0.5)\). Thus, we have \(p_{\min m1} \ge \left( \frac{{p_m }}{{910}}\right) ^{31}\).
(3) \(p_{\min m2}\)
There are two situations for one-point mutation: either one bit flips in the head, or one bit flips in the tail.
For the first situation according to Lemma 1, we have
If \(1-p_m > p_m\), that is \(p_m <{\textstyle {1 \over 2}}\), it follows that
For the second situation according to Lemma 1, we have
If \(1-p_m > p_m\), that is \(p_m <{\textstyle {1 \over 2}}\), it follows that
From the above analysis, the minimum probability that any individual \(x\) mutates to another individual \(y\) is \({{p_m }^{31} \over {35}}\), where \(p_m \in (0,0.5)\). Thus, we have \(p_{\min m2} \ge { {p_m }^{31}\over {35}}\).
(4) \(p_{\min t}\)
According to Lemma 2, for \(\forall x,y \in H\), we have
Thus, we have
(5) \(p_{\min s}\)
According to the selection operator, we obviously have \(p_{\min s} = {\textstyle {1 \over 2n}}\).
According to Corollary 1, the EFHT of ME-GEP with uniform mutation is as follows:
and the EFHT of ME-GEP with one-point mutation is as follows:
5.1.3 Numerical experiment
From formulas (11) and (12) above, the EFHT of ME-GEP can be decreased by increasing the mutation probability and the transposition probability.
The following experiments were performed to observe the actual average first hitting time of ME-GEP under different mutation operators:

(1) with the IS transposition probability fixed at 0.1, the uniform and one-point mutation probabilities are varied;

(2) with the uniform and one-point mutation probabilities fixed at 0.04, the IS transposition probability is varied.
The results obtained are shown in Figs. 2 and 3. The EFHT of ME-GEP decreased as the one-point and uniform mutation probabilities increased, with the IS transposition probability fixed at 0.1; and it decreased as the IS transposition probability increased, with the one-point and uniform mutation probabilities fixed at 0.04. In this example, the experimental results conform to formulas (11) and (12).
5.2 Clustering problem
5.2.1 The description of clustering problem
The clustering problem can be described as follows: given a set of observations \((P_1, \ldots , P_m)\), where each observation is a \(d\)-dimensional real vector, partition the \(m\) observations into \(q\,(\le m)\) sets \(C=\{C_1, \ldots ,C_q\}\) so as to minimize \(\sum \limits _{i = 1}^q {\sum \limits _{P_j \in C_i } {||P_j - \mu _i ||^2 } }\), where \(\mu _i\) is the mean of the points in \(C_i\).
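The clustering objective above, the within-cluster sum of squared distances, can be computed directly. A minimal sketch:

```python
def wcss(clusters):
    """Within-cluster sum of squares: sum_i sum_{P in C_i} ||P - mu_i||^2,
    where mu_i is the componentwise mean of the points in cluster C_i."""
    total = 0.0
    for points in clusters:
        d = len(points[0])
        mu = [sum(p[k] for p in points) / len(points) for k in range(d)]
        total += sum(sum((p[k] - mu[k]) ** 2 for k in range(d)) for p in points)
    return total

# Two clusters: {(0,0), (2,0)} with mean (1,0), and the singleton {(5,5)}.
print(wcss([[(0.0, 0.0), (2.0, 0.0)], [(5.0, 5.0)]]))  # -> 2.0
```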
In ME-GEP for solving clustering problems, two special clustering operators, ‘\(\cap \)’ and ‘\(\cup \)’, are introduced as binary operators defining the aggregation and segmentation of clusters, respectively.
Let \(O_1\) and \(O_2\) be the centers of cluster \(C_1\) and \(C_2\).
(1) The aggregation operator ‘\(\cap \)’ is defined as \(O_1 \cap O_2=\) centroid\((O_1, O_2)\), where centroid returns the center vector of \(O_1\) and \(O_2\). Clusters \(C_1\) and \(C_2\) are aggregated into a single cluster \(C\) with center \(O =\) centroid\((O_1, O_2)\).

(2) The segmentation operator ‘\(\cup \)’ is defined as \(O_1 \cup O_2 = \{O_1, O_2\}\). \(C_1\) and \(C_2\) remain two distinct clusters, with centers \(O_1\) and \(O_2\), respectively.
The chromosome is also composed of a head and a tail. The function set is \(F = \{ \cup , \cap \}\), and the terminal set is \(T=\{ x_1, x_2, \ldots , x_i \}\), where \(i \in (1,n)\). Here, the symbol \(x_i\) represents the center of a cluster in the data set.
Based on the above definitions, clusters are easily defined within the chromosome of the ME-GEP framework.
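The two clustering operators can be sketched as follows, assuming centroid\((O_1, O_2)\) is the componentwise mean of the two center vectors (an assumption; the source only calls it the "center vector" of \(O_1\) and \(O_2\)):

```python
def centroid(o1, o2):
    """Componentwise mean of two cluster centers (assumed form of 'center vector')."""
    return tuple((a + b) / 2 for a, b in zip(o1, o2))

def aggregate(o1, o2):
    """The aggregation operator: merge two clusters into one with a combined center."""
    return centroid(o1, o2)

def segment(o1, o2):
    """The segmentation operator: keep the two clusters distinct."""
    return [o1, o2]

print(aggregate((0.0, 0.0), (4.0, 2.0)))  # -> (2.0, 1.0)
print(segment((0.0, 0.0), (4.0, 2.0)))    # -> [(0.0, 0.0), (4.0, 2.0)]
```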
The main procedure of ME-GEP for solving the clustering problem is described as follows; details can be found in Chen et al. (2007):
Step 1 Initialize the population;
Step 2 Select and aggregate cluster centers by coding and decoding expression trees;
Step 3 Calculate the Euclidean distances between each data object and cluster centers and tag the corresponding data object;
Step 4 Update the cluster centers and compute the fitness of each individual;
Step 5 Preserve the best individual to the next generation;
Step 6 Select individuals with the same probability;
Step 7 Mutate and recombine with a certain probability;
Step 8 Generate a new population, and if the optimal solution has been found, then turn to step 9; otherwise, turn to step 2;
Step 9 Select the best individual;
Step 10 Apply the auto-clustering algorithm to the best individual;
Step 11 Output the result of clusters.
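The steps above can be sketched as a generic loop. All callables (init_population, evaluate, select, vary, found_optimum, auto_cluster) are illustrative placeholders for the operations of Steps 1-11:

```python
def me_gep_clustering(init_population, evaluate, select, vary,
                      found_optimum, auto_cluster, max_generations=1000):
    """Skeleton of the ME-GEP clustering loop (Steps 1-11)."""
    population = init_population()                 # Step 1: initialize
    best = max(population, key=evaluate)           # Steps 2-4: decode and evaluate
    for _ in range(max_generations):
        if found_optimum(best):                    # Step 8: stop if optimum found
            break
        offspring = vary(select(population))       # Steps 6-7: select, mutate/recombine
        offspring[0] = best                        # Step 5: elitism keeps the best individual
        population = offspring
        candidate = max(population, key=evaluate)  # Steps 2-4 on the new population
        if evaluate(candidate) > evaluate(best):
            best = candidate                       # Step 9: track the best individual
    return auto_cluster(best)                      # Steps 10-11: auto-cluster and output
```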
The EFHT of ME-GEP (Chen et al. 2007) for two real clustering examples, the Iris and Yeast benchmark databases (Zhou et al. 2003), with different data dimensions and sizes, is analyzed as follows:
To solve the clustering problems in these examples, the parameters of ME-GEP are designed as in Table 2. The genetic operators adopted include one-point recombination, selection with equal probability, and the uniform and one-point mutation operators.
The fitness of ME-GEP algorithm for solving the clustering problem is evaluated by Eq. (13).
where \(q\) is the number of clusters, \(C_i\) is the \(i\)th cluster, \(P_j\) is an observation point in cluster \(C_i\) and \(\mu _i\) is the center of the \(i\)th cluster. The bigger the individual fitness, the more compact and independent the clusters; that is, the corresponding individual is better.
5.2.2 The EFHT of ME-GEP for clustering the Iris database
The parameters of ME-GEP for clustering Iris database are as follows: Let \(h = 4\); then \(tal = 5\), \(l = 9\). \(T=\{x_1,x_2, x_3, x_4, x_5\}\).
Applying Theorem 1 and Corollary 1, we have
Let \(p_{\min m1}\) and \(p_{\min m2}\) be the minimum probabilities that an individual \(x\) changes to individual \(y\) after uniform mutation and one-point mutation, respectively. Next, we estimate \(N\), \(p_{\min m1}\), \(p_{\min m2}\) and \(p_{\min s}\).
(1) The size of individual space \(N\).
Denote the tree height of phenotype by variable \(ht\). Because the individual length is 9, we have \(ht \le 4\). It follows that
(2) \(p_{\min m1}\)
According to Lemma 1, we have
If \(1 - p_m > p_m\), that is \(p_m < {\textstyle {1 \over 2}}\), it follows that
From the above analysis, the minimum probability that any individual \(x\) mutates to another individual \(y\) is \(\left( \frac{{p_m }}{{10}}\right) ^{9}\) where \(p_m \in (0,0.5)\). Thus, we have \(p_{\min m1} \ge \left( \frac{{p_m }}{{10}}\right) ^{9}\).
(3) \(p_{\min m2}\)
There are two situations for one-point mutation: either one bit flips in the head, or one bit flips in the tail.
For the first situation according to Lemma 1, we have
If \(1-p_m > p_m\), that is \(p_m <{\textstyle {1 \over 2}}\), it follows that
For the second situation according to Lemma 1, we have
If \(1-p_m > p_m\), that is \(p_m <{\textstyle {1 \over 2}}\), it follows that
From the above analysis, the minimum probability that any individual \(x\) mutates to another individual \(y\) is \({{p_m }^{9} \over {5}}\), where \(p_m \in (0,0.5)\). Thus, we have \(p_{\min m2} \ge { {p_m }^{9}\over {5}}\).
(4) \(p_{\min s}\)
According to the selection operator, we obviously have \(p_{\min s} = {\textstyle {1 \over 2n}}\).
According to Corollary 1, the EFHT of ME-GEP with uniform mutation is as follows:
And the EFHT of ME-GEP with one-point mutation is as follows:
5.2.3 The EFHT of ME-GEP for clustering the Yeast database
The parameters of ME-GEP for clustering the Yeast database are as follows: Let \(h = 15\); then \(tal = 16\), \(l = 31\). \(T =\{x_1,x_2, \ldots , x_{16} \}\).
Applying Theorem 1 and Corollary 1, we have
Let \(p_{\min m1}\) and \(p_{\min m2}\) be the minimum probabilities that an individual \(x\) mutates to individual \(y\) after uniform mutation and one-point mutation, respectively. Next, we estimate \(N\), \(p_{\min m1}\), \(p_{\min m2}\) and \(p_{\min s}\).
(1) The size of individual space \(N\).
Denote the tree height of phenotype by variable \(ht\). Because the individual length is 31, we have \(ht \le 5\). It follows that
(2) \(p_{\min m1}\)
According to Lemma 1, we have
If \(1 - p_m > p_m\), that is \(p_m < {\textstyle {1 \over 2}}\), it follows that
It is easy to find that the minimum probability that any individual \(x\) mutates to another individual \(y\) is \(\left( \frac{{p_m }}{{32}}\right) ^{31}\) where \(p_m \in (0,0.5)\). Thus, we have \(p_{\min m1} \ge \left( \frac{{p_m }}{{32}}\right) ^{31}\).
(3) \(p_{\min m2}\)
There are two situations for one-point mutation: either one bit flips in the head, or one bit flips in the tail.
For the first situation according to Lemma 1, we have
If \(1-p_m > p_m\), that is \(p_m <{\textstyle {1 \over 2}}\), it follows that
For the second situation according to Lemma 1, we have
If \(1-p_m > p_m\), that is \(p_m <{\textstyle {1 \over 2}}\), it follows that
From the above analysis, the minimum probability that any individual \(x\) mutates to another individual \(y\) is \({{p_m }^{31} \over {16}}\), where \(p_m \in (0,0.5)\). Thus, we have \(p_{\min m2} \ge { {p_m }^{31}\over {16}}\).
(4) \(p_{\min s}\)
According to the selection operator, we obviously have \(p_{\min s} = {\textstyle {1 \over 2n}}\).
According to Corollary 1, the EFHT of ME-GEP with uniform mutation is as follows:
And the EFHT of ME-GEP with one-point mutation is as follows:
5.2.4 Numerical experiments
From formulas (14)–(17) above, the upper bound of the EFHT of ME-GEP for clustering the Iris and Yeast databases can be decreased by increasing the mutation probability.
To verify formulas (14)–(17), the following experiments were performed to observe the actual average first hitting times of ME-GEP under different mutation operators: (1) the uniform mutation probability is varied; (2) the one-point mutation probability is varied.
The results obtained are shown in Figs. 4 and 5. The EFHT of ME-GEP decreased as the mutation probabilities increased. In the above examples, the experimental results conform to formulas (14)–(17).
6 Conclusion and future work
This paper studies the computational time complexity of ME-GEP. Using Markov chain theory and the artificial fitness levels technique, the upper and lower bounds of the expected first hitting time of ME-GEP are obtained. Furthermore, the relations between the upper bound and the parameters of ME-GEP are given, and the relations between the expected first hitting time and the algorithm parameters are verified on real examples of function modeling and clustering problems. The results obtained in this paper are general theoretical results that can be used to guide the design and implementation of GEP. Moreover, the analytical methods and techniques adopted here have broad applicability and provide a useful reference for researchers in GEP theory.
The crossover operator plays an important role in GEP. The influence of the crossover operator on the EFHT of GEP will be discussed in our future work. We will also investigate approaches to judging whether a problem belongs to the ME-GEP-easy class or the ME-GEP-hard class.
References
Bin JX, Jie TC et al (2005) Mining frequent function set based on gene expression programming. Chin J Comput 128(8):1247–1254
Canakci H, Baykasoglu A, Gullu H (2009) Quantitative structure-activity relationships studies of ccr5 inhibitors and toxicity of aromatic compounds using gene expression programming. Eur J Med Chem 18(8):1031–1041
Chen Y, Tang C et al (2007) Clustering without prior knowledge based on gene expression programming. Third Int Conf Nat Comput 3:451–455
Chen TS, He J et al (2009) A new approach for analyzing average time complexity of population-based evolutionary algorithms on unimodal problems. IEEE Trans Syst Man Cybern B Cybern 39(5):1092–1106
Chi Z, Xiao WM, Thomas MT, Peter CN (2003) Evolving accurate and compact classification rules with gene expression programming. IEEE Trans Evol Comput 7(6):519–531
Ding LX, Yu JH (2005) Some theoretical results about the computation time of evolutionary algorithms. In: Proceedings of the 7th annual conference on genetic and evolutionary computation. ACM, Washington, DC, USA, pp 1409–1415
Droste S, Jansen T, Wegener I (2002) On the analysis of the (1 + 1) evolutionary algorithms. Theor Comput Sci 276(1–2):51–81
Du X, Ding LX (2010) About the convergence rates of a class of gene expression programming. Sci China Inform Sci 53(4):715–728
Du X, Ding LX et al (2009) Convergence analysis of gene expression programming based on maintaining elitist. In: Proceedings of the first ACM/SIGEVO Summit on Genetic and Evolutionary Computation, vol 6, pp 823–826
Ferreira C (2001) Gene expression programming: a new adaptive algorithm for solving problems. Complex Syst 12(2):87–129
Garnier J, Kallel L, Schoenauer M (1999) Rigorous hitting times for binary mutations. Evol Comput 7(2):173–203
Happ E, Daniel J, Christian K, Frank N (2008) Rigorous analyses of fitness-proportional selection for optimizing linear functions. In: Proceedings of the 10th annual conference on genetic and evolutionary computation. ACM, Atlanta, Georgia, USA, pp 953–960
He J, Kang LS (1999) On the convergence rates of genetic algorithms. Theor Comput Sci 229(1–2):23–39
He J, Yao X (2002) From an individual to a population: an analysis of the first hitting time of population-based evolutionary algorithms. IEEE Trans Evol Comput 6(5):495–511
He J, Yao X (2003) Towards an analytic framework for analyzing the computation time of evolutionary algorithms. Artif Intell 145(1–2):59–97
He J, Yao X (2004) A study of drift analysis of estimating computation time of evolutionary algorithms. Nat Comput 3(1):21–35
Iosifescu M (1980) Finite Markov processes and their application. Wiley, Chichester
Karakasis V, Stafylopatis A (2008) Efficient evolution of accurate classification rules using a combination of gene expression programming and clonal selection. IEEE Tran Evol Comput 12(6):662–678
Lehre PK, Yao X (2012) On the impact of mutation-selection balance on the runtime of evolutionary algorithms. IEEE Trans Evol Comput 16(2):225–241
Lopes HS, Weinert WR (2004) A gene expression programming system for time series modeling. In: Proceedings of XXV Iberian Latin American Congress on Computational Methods in Engineering, vol 11. CILAMCE, Recife, pp 10–12
Oliveto PS, Witt C, (2008) Simplified drift analysis for proving lower bounds in evolutionary computation. In: Proceeding of the 10th International Conference on Parallel Problem Solving from Nature, pp 82–91
Oliveto PS, He J, Yao X (2007) Computational complexity analysis of evolutionary algorithms for combinatorial optimization: a decade of results. J Autom Comput 4(3):281–293
Peng J, Jie TC et al (2005) M-GEP: a new evolution algorithm based on multi-layer chromosomes gene expression programming. Chin J Comput 9(28):1459–1466
Rudolph G (1994) Convergence analysis of canonical genetic algorithm. IEEE Trans Neural Netw 5:96–101
Rudolph G (1996) How mutation and selection solve long path problems in polynomial expected time. Evol Comput 4(2):194–205
Shi W, Zhang X, Shen Q (2010) Quantitative structure-activity relationships studies of ccr5 inhibitors and toxicity of aromatic compounds using gene expression programming. IEEE Trans Evol Comput 45(1):49–54
Smith MJ, Szathmáry E (1995) The major transitions in evolution. Oxford University Press, Oxford
Teodorescu L (2004) Gene expression programming approach to event selection in high energy physics. IEEE Trans Nuclear Sci 53(6):2221–2227
Teodorescu L, Sherwood D (2008) High energy physics event selection with gene expression programming. Comput Phys Commun 178(6):409–419
Wegener I, Witt C (2005) On the analysis of a simple evolutionary algorithm on quadratic pseudo-boolean function. J Discrete Algorithm 3(1):61–78
Xin D, Qiao LY, Tong XD, Shan KL (2006) A new algorithm of automatic programming: GEDGEP. In: The 6th International Conference on Simulated Evolution and Learning, vol 10. Springer, Hefei, pp 292–301
Yu Y, Zhou ZH (2008) A new approach to estimating the expected first hitting time of evolutionary algorithms. Artif Intell 172(15):1809–1832
Yuan CA et al (2004) Function mining based on gene expression programming-convergence analysis and remnant-guided evolution algorithm. J Sichuan Univ 36(6):100–105
Yuan CA et al (2006) Convergence of genetic regression in data mining based on gene expression programming and optimized solution. Int J Comput Appl 28(4):359–366
Zhou Chi, Xiao W, Tirpak TM, Nelsom PC (2003) Evolving accurate and compact classification rules with gene expression programming. IEEE Trans Evol Comput 7(6):519–531
Acknowledgments
This work was supported by the National Natural Science Foundation of China (No. 61305079, 61305086, 61203306), the project of preeminent youth fund of Fujian province (JA12471), outstanding young teacher training fund of Fujian Normal University (No. fjsdjk2012083), the open fund of State Key Laboratory of Software Engineering (No. SKLSE2012-09-28). Finally, we thank Prof. Jinghu Yu for discussing some issues about this paper.
Communicated by U. Fiore.
Du, X., Ni, Y., Xie, D. et al. The time complexity analysis of a class of gene expression programming. Soft Comput 19, 1611–1625 (2015). https://doi.org/10.1007/s00500-014-1551-y