Keywords

1 Introduction

Cartesian Genetic Programming can be considered a well-established graph-based GP variant. Initial work towards CGP was done by Miller, Thompson, Kalganova, and Fogarty [8, 14, 15] by the introduction of a two-dimensional graph encoding model of functional nodes. CGP can be seen as an extension to the traditional tree-based GP representation model since its representation allows many graph-based applications such as digital circuit design [26], evolution of neural network topologies [16, 28] and synthesis of cryptographic Boolean functions [5, 7]. CGP has introduced over two decades ago but is still predominantly used only with a probabilistic point mutation operator. The reason for this is that various standard genotypic crossover operators failed to improve the search performance of standard CGP in the past [3, 15]. Overall, the state of knowledge about recombination in CGP appears to be weak when compared to the number of publications in tree-based GP. The role of recombination in CGP was recently surveyed by Miller [17] and is still considered to be an open issue. Even if some progress has been made in recent years, comprehensive and advanced knowledge about recombination in CGP is still missing [17]. In the field of evolutionary computation (EC), discrete recombination is a well-established form of recombination in various subfields. Discrete recombination typically selects each gene from one of the two parents with equal probability. According to Rudolph [22], this method can be therefore considered as a dynamic n-point crossover since each gene for the chromosome of the offspring is selected from the first or second parent with equal probability. In this work, we take a step forward on the issue of crossover and introduce a method for the adaption of discrete recombination in CGP. We initially evaluate our method on a set of well-known symbolic regression benchmarks. Our results demonstrate the effectiveness of our approach for these problems.

Section 2 of this work describes CGP. Related work on crossover in CGP is surveyed in Sect. 3. This section also gives a brief historical overview of discrete recombination in the field of EC. In Sect. 4, we introduce our new method. Section 5 is devoted to the description of our experiments and the presentation of our results. Our findings are discussed in Sect. 6. Finally, Sect. 7 gives a conclusion and outlines our future work.

2 Cartesian Genetic Programming

In contrast to tree-based GP, CGP represents a genetic program via genotype-phenotype mapping as an indexed, acyclic, and directed graph. In this way, CGP can be seen as an extension of the traditional tree-based GP approach. The CGP representation model is based on a rectangular grid or row of nodes. Each genetic program is encoded in the genotype of an individual and is decoded to its corresponding phenotype. A definition of a cartesian genetic program \(\mathcal {P}\) is given in Definition 1. Let \(\phi : \mathcal {P} \mapsto \varPsi \) be a decode function which maps \(\mathcal {P}\) to a phenotype \(\varPsi \). Originally, the structure of the graph was represented by a rectangular grid of \(n_\text {r}\) rows and \(n_\text {c}\) columns, but later work focused on a representation with one row. The CGP decoding procedure processes groups of genes, and each group refers to a node of the graph, except the last one, which represents the outputs of the phenotype. Each node is represented by two types of genes that index the function number in the GP function set and the node inputs. These nodes are called function nodes and execute functions on the input values. The number of input genes depends on the maximum arity \(n_\text {a}\) of the function set.

Definition 1 (Cartesian Genetic Program)

A cartesian genetic program \(\mathcal {P}\) is an element of the Cartesian product \(\mathcal {N_\text {i}} \times \mathcal {N_\text {f}} \times \mathcal {N_\text {o}} \times \mathcal {F}: \)

  • \(\mathcal {N_\text {i}}\) is a finite non-empty set of input nodes

  • \(\mathcal {N_\text {f}}\) is a finite set of function nodes

  • \(\mathcal {N_\text {o}}\) is a finite non-empty set of output nodes

  • \(\mathcal {F}\) is a finite non-empty set of functions

A backward search is conducted to decode the corresponding phenotype. The decoding itself starts at the output nodes and continues until the inputs nodes are reached. The decoding procedure is done for all output genes. The result of the decoding procedure can be described as a set of directed paths \(\varOmega \). Given the input set I and the output set O, let \(\omega = I \times \varOmega \mapsto O\) be an output function. An example of the backward search of the most popular one-row integer representation is illustrated in Fig. 1. The backward search starts from the program output and processes all nodes which are linked in the genotype. In this way, only active nodes are processed during evaluation. The genotype in Fig. 1 is grouped by its function nodes. The first (underlined) gene of each group refers to the function number in the corresponding function set. The non-underlined genes represent the input connections of the node. Inactive function nodes are shown in gray color and with dashed lines.

Fig. 1.
figure 1

Example of the decoding procedure of a CGP genotype to its corresponding phenotype. The identifiers IP1 and IP2 stand for the two input nodes with node index 0 and 1. The identifier OP stands for the output node of the graph.

The number of inputs \(n_\text {i}\), outputs \(n_\text {o}\), and the length of the genotype is fixed. Every candidate program is represented with \(n_\text {r} * n_\text {c} * (n_\text {a} +1) + n_\text {o}\) integers. Even if the length of the genotype is fixed for each candidate program, the length of the corresponding phenotype in CGP is variable, which can be considered as an advantage of the CGP representation. CGP is traditionally used with a (\(1+\lambda \)) evolutionary algorithm (EA). The (1+\(\lambda \))-EA is often used with a selection strategy called neutrality, which is based on the idea that genetic drift yields to diverse individuals having equal fitness. The genetic drift is implemented into the selection mechanism in a way that individuals which have the same fitness as the normally selected parent are determined, and one of these same-fitness individuals is returned uniformly at random. The new population in each generation consists of the best individual of the previous population and the \(\lambda \) created offspring. The breeding procedure is mostly done by a point mutation that swaps genes in the genotype of an individual in the valid range by chance. Another point mutation is the flip of the functional gene, which changes the functional behavior of the corresponding function node.

3 Related Work

3.1 Recombination in CGP

According to the reports of Clegg et al. [3], the first attempts of recombination in standard CGP included testing of various genotypic crossover techniques. For instance, the genetic material was recombined by swapping parts of the genotypes of the parent individuals or randomly exchanging selected nodes. Clegg et al. reported that all techniques failed to improve the convergence of CGP and that merely swapping the integers disrupts the search performance. In comparison to mutation only CGP, the addition of genotypic crossover techniques hindered the performance. In one of the first empirical studies about CGP, Miller [15] analyzed its computational efficiency on Boolean function problems. More precisely, Miller analyzed and studied the influence of population size on the efficiency of CGP. The key finding of his study was that extremely low populations perform most effectively for the tested problems. The experiments of this study also demonstrated that the addition of a genotypic crossover reduces the computational effort only marginally.

This was the motivation for the introduction of a real-valued representation and intermediate recombination for CGP by Clegg et al. The real-valued representation of CGP represents the directed graph as a fixed-length list of real-valued numbers in the interval [0, 1]. The genes are decoded to the integer-based representation by their normalization values (number of functions or maximum input range). The recombination of two CGP genotypes is performed by intermediate recombination with a random weighting factor. Clegg et al. demonstrated that the new representation in combination with crossover improves the convergence behavior of CGP on one of the two tested symbolic regression problems. However, for the later generations, Clegg et al. found that the use of crossover in real-valued CGP disrupts the convergence on one problem. Later work by Turner [30] presented results with intermediate recombination on three additional classes of computational problems, digital circuit synthesis, function optimization, and agent-based wall avoidance. On these problems, it was found that the real-valued representation together with the crossover operation performed worse than mutation-only CGP.

Kalkreuth et al. [10] introduced and investigated subgraph crossover in CGP which exchanges and links subgraphs of active function nodes between two selected parents and the block crossover exchanges blocks of active function genes. In recent comparative studies, its use has been found beneficial for several symbolic regression benchmarks since it led to a significant decrease in the number of fitness evaluations needed to find the ideal solution [9, 11]. Contrarily, the gain of the search performance was considerably lower for the tested Boolean function problems [9, 11]. Moreover, the results of the experiments clearly showed that the subgraph crossover failed to improve the search performance on some of the tested Boolean benchmarks when compared to the results of the traditional \(1+\lambda \) selection strategy.

Husa and Kalkreuth [6] proposed block crossover which selects active function nodes by chance in accordance with a predefined block size but without any order. The function genes of the selected active nodes are then swapped. The block crossover has been compared to mutation-only CGP on a suite of Boolean functions and symbolic regression problems. The outcome of the study gave significant evidence that the \((1+\lambda )\)-CGP cannot be considered the most efficient CGP algorithm in the Boolean function domain, although it seems to be often a good choice. The outcome of the study gave the first evidence, that it is possible for crossover operators to outperform the standard \(1+\lambda \) selection strategy.

Sivla et al. [27] introduced a form of crossover for multiple output problems. The proposed method combines the subgraphs of the best outputs of the parent individuals produce an offspring. The proposed crossover technique was applied to the synthesis of combinational logic circuits with multiple outputs. The so-called X-CGP obtained the best results when compared to single chromosome CGP representations and performed better than the multi-chromosome representation for some of the tested problems. The experiments of Siliva et al. indicate that the proposed method is promising. On the other hand, the authors concluded that more studies are needed since X-CGP performed no better than the mutation-only multi-chromosome techniques on the majority of the tested problems.

3.2 Historical Background of Discrete Recombination

Discrete recombination in EC was first described by Rechenberg [20, 21] for the simulation of the first type of a multimembered evolutionary strategy (ES) called \((\mu + 1)\) or steady-state ES. Rechenberg demonstrated that recombination can improve the speed of the evolutionary process if the measure is taken per generation rather than per function evaluation [2]. Schwefel [23, 24] later utilized discrete recombination among five types of recombination for two further versions of the multimembered ES, called \((\mu + \lambda )\)- and \((\mu , \lambda )\)-ES [1]. Schwefel [24] performed an empirical study with 50 uni- and multimodal test functions and compared ESs to the most traditional direct optimization strategies and the outcome showed good results for ESs. According to Bäck et al. [1], the best results were achieved with the use of several types of recombination. In the field of GAs, discrete recombination is commonly referred to as uniform crossover and has been found to be a useful search operator [4]. Uniform crossover was first proposed for the binary encoding model of GA by Syswerda [29] and its search performance was found superior to the one- and two-point crossover in the most cases. Uniform recombination in GA inspired the adaption in tree-based GP [18, 19] where function nodes and subtrees are exchanged between two parent individuals in accordance with a uniform rate. If the uniform rate is set to \(50\%\), this method represents the tree-based GP equivalent of the uniform crossover for binary strings.

4 The Proposed Method

We adapt discrete recombination in CGP by means of phenotypic functional variation which is performed through the exchange of function genes of active function nodes. The phenotype of a CGP individual is represented by its active function nodes which are determined before the crossover procedure. After selecting two individuals, the minimum and a maximum number of active function nodes of the two individuals is determined. The reason for this is that the size of the phenotype in CGP is not fixed and can vary among individuals. To perform the exchange of active function genes, the crossover procedure iterates over the minimum number of active nodes. A binary decision is made by chance in each iteration whether the function genes are swapped or kept. In the case that both phenotypes differ in size, our method performs a special step in the last iteration called boundary extension which extends the selection of active function genes. The idea behind this step is to include active function genes of the larger phenotype into the selection which would not be considered if the lists of active function nodes are merely interated in order. Just like the uniform crossover in GA, our method produces two offspring. The algorithmic implementation of our method is described in Algorithm 1. Exemplifications of the procedure on genotypic and phenotypic level are illustrated in Fig. 2 and 3. An implementation for the CGP extension package of the Java Evolutionary Computation Research System (ECJ) [25] is provided in the ECJ GitHub repositoryFootnote 1.

Fig. 2.
figure 2

Exemplification of discrete recombination in CGP: Active function genes of two CGP genotypes are recombined by means of discrete recombination. Function genes, which have been randomly selected for the exchange, are connected with a double-sided arrow in the figure. The active function nodes and genes of the respective parent and offspring individuals are highlighted in red and blue color. (Color figure online)

Fig. 3.
figure 3

Illustration of discrete recombination in CGP on the phenotypic level based on the genotypic exemplification presented in Fig. 2. Active function nodes of the respective parent and offspring individuals are highlighted in red and blue color. (Color figure online)

5 Experiments

5.1 Experimental Setup

We performed experiments with symbolic regression problems. We compared the traditional (\(1+\lambda \))-CGP to a canonical EA equipped with our proposed discrete recombination and tournament selection. The algorithms which we used in our experiments are listed in Table 1. To evaluate the search performance, we measured the number of fitness evaluations until the CGP algorithm terminated successfully as recommended by McDermott et al. [13]. Termination was triggered when an ideal solution was found or a predefined budget of fitness evaluation was exceeded. We defined a maximum number of \(10^8\) fitness evaluations for our experiments and calculated the success rate (SR). In addition to the mean values of the measurements, we also calculated the standard deviation (SD), median (Q2) as well as lower and upper quartile (Q1 and Q3). Meta-optimization experiments have been performed to compare the algorithms fairly and are described in more detail in the following subsection. All tested algorithms were compared on the same number of function nodes to exclude conditions, which can distort the search performance comparison. Our method was tested against the traditional (\(1+\lambda \))-CGP which we declared as the baseline algorithm for our experiments. In our experiments, we exclusively used the single-row standard integer-based representation of CGP. Since we cannot guarantee normally distributed values in our samples, we used the nonparametric two-tailed Mann-Whitney U test to evaluate statistical significance. More precisely, we tested the null hypothesis that two samples come from the same population (i.e. have the same median). We performed 100 independent runs with different random seeds. The levels back parameter l was set to \(\infty \).

figure a
Table 1. Identifiers for the tested CGP algorithms.

5.2 Benchmarks

We chose eleven symbolic regression problems from the work of McDermott et al. [13] for better GP benchmarks. The reason for our choice of these problems is the fact that we can find an ideal solution more likely on average and evaluate the search performance of the whole evolutionary process. Our set of benchmarks covers uni- as well as bivariate polynomial, trigonometric, logarithmic, and power functions. The functions of the problems are shown in Table 2. A training data set U[abc] refers to c uniform random samples drawn from a to b inclusive. We used the extended Koza function set as recommended by McDermott et al. The function set is shown in Table 3. The fitness of the individuals was represented by a cost function value. The cost function was defined by the sum of the absolute difference between the real function values and the values of an evaluated individual. Let \(T = \bigl \lbrace x_p \bigr \rbrace ^\mathcal {P}_{p=1} \) be a training dataset of \(\mathcal {P}\) random points and \(f_\text {ind}(x_p)\) the value of an evaluated individual and \(f_\text {ref}(x_p\)) the true function value. Let

$$\begin{aligned} C := \sum _{p=1}^\mathcal {P} \vert f_\text {ind}(x_p) - f_\text {ref}(x_p) \vert \end{aligned}$$

be the cost function. When the difference of all absolute values becomes less than 0.01, the algorithm is classified as converged.

5.3 Meta-optimization

We tuned relevant parameters for all tested CGP algorithms on the set of benchmark problems. Moreover, we used the meta-optimization toolkit of ECJ. The parameter space for the respective algorithms, explored by meta-optimization, is presented in Table 4. For the meta-level, we used a canonical GA equipped with intermediate recombination and point mutation. Since GP benchmark problems can be very noisy in terms of finding the ideal solution, we oriented the meta-optimization with a common approach that has been used in previous studies [6, 11, 12]. The meta-evolution process at the base level was repeated multiple times for each candidate setting and the most effective settings were compared to find the best setting. For the problems Koza 1–3 and Nguyen 4–7, we selected effective settings of certain parameters for the (\(1+\lambda \))-CGP from previous parametrization studies  [11, 12].

Table 2. List of symbolic regression benchmarks.
Table 3. Function set used for the experiments.
Table 4. Parameter space explored by meta-optimization for the \(1+\lambda \) and canonical CGP algorithm.

5.4 Results

The results of our meta-optimization and search performance evaluation are presented in Table 5 and it is clearly visible that the Canonical-CGP with discrete recombination reduces the number of fitness evaluations to termination significantly on all tested problems. Moreover, on the more complex problems, the Canonical-CGP achieves higher success rates. Violin plots are provided in Fig. 4.

Table 5. Results of the meta-optimization and search performance evaluation.
Fig. 4.
figure 4

Violin plots for all tested problems and algorithms of our experiments.

6 Discussion

The experiments presented in this work allow certain points that are worthy of discussion. Even if the initial results of our proposed method are promising we have to emphasize that more experiments are needed to achieve insight into how our method performs in other problem domains. Since former work [10] on recombination in CGP presented promising results with symbolic regression problems, we initially tested our proposed method in this problem domain. However, we have to evaluate our method in problem domains where the search space differs from the continuous search spaces of our tested symbolic regression problems. Recent work [9, 11] led to more insight into the antagonism between continuous and discrete search spaces and its implications for the success of crossover-based algorithms in CGP. For our experiments, we also did not include the comparison to other crossover operators that have been proposed for CGP. For our initial evaluation and generally as a first step we concentrated on comparisons to the most commonly used algorithm in CGP and ensuring fair conditions with meta-optimization. But since several crossover operators have been proposed in recent years, more comparative studies are needed in the field of CGP and should be addressed by future work.

Another point that should be discussed is the parametrization of our method. Based on our meta-optimization experiments, we can derive some essential generalizations for our tested problems. In our experiments, moderate to high crossover rates performed best in combination with mid-size populations. We also tested low and very high rates of crossover but obtained no further improvement in the search performance. Likewise, we also experimented with bigger and smaller populations but the size of 50 individuals turned out to be the best choice. Overall, our results give more evidence that mid-size populations can be used effectively in CGP which depicts a significant shift from the popular dogma that only very small populations can perform effectively in CGP. Moreover, our results are coherent with the work of Kalkreuth [11] on population sizes in CGP and reinforce his findings. Nevertheless, we again, have to point out that our findings are based on results that have been obtained in merely one problem category.

7 Conclusions and Future Work

In this work, we presented initial results of a method for phenotypic discrete recombination in CGP. The effectiveness of our approach has been evaluated on a diverse set of well-known symbolic regression benchmarks, covering uni- and bivariate functions. Overall, our results indicate that the use of our proposed methods can be beneficial for symbolic regression. This work primarily focused on an initial evaluation of the search performance and ensuring fair conditions through meta-optimization. The next natural following step is the evaluation of our method in other problem domains and in comparison to other crossover operators. Therefore, our future work will primarily focus on comparative studies. Another part of our future work will be devoted to analytical experiments to study the effects caused by the phenotypic discrete crossover.