
1 Introduction to Phenotype and Genotype Mappings

This paper studies the influence of direct and indirect encodings on artificial neural network design by evolutionary computation, i.e. neuroevolution. Evolutionary computation borrows the terms phenotype (the behaviour, physiology and morphology of an organism) and genotype (the genetic coding of the organism). The relationship between the two has been the subject of various investigations, including [1,2,3]. Phenotype mapping has, for example, been used to predict disease-related genes: Van Driel et al. [1] report that phenotypes can be used to predict biological interactions, i.e. the effects that two genes have on each other.

Many evolutionary computation systems, such as genetic programming (GP), work with direct encodings (the term coding is sometimes used). Program trees representing a computer program evaluated recursively are used as genotypes; they translate directly into solutions. Some hybrid systems operate on graphs, where devising direct genetic operators can be problematic. In a direct encoding [4], the genotype specifies every neuron and connection explicitly. Not all authors differentiate between direct encodings and structural encodings [5]. A structural encoding also holds information on connection weights; it is used when a genetic algorithm (GA) evolves Artificial Neural Network (ANN) parameters. In such an encoding, there are few constraints on the GA's exploration [5].

The minimal alphabet principle in GAs [6] prescribes selecting the smallest alphabet that permits a natural expression of the problem. This holds for direct encodings, but with indirect encodings the search space can be reduced [7].

1.1 Direct Encodings for ANNs

Montana [8] reviewed how GAs can be used to represent and train ANNs. The genetic representation must include the network topology, the real-valued weight associated with every link and the real-valued bias associated with every node.

Specifically, an example of a direct and structural encoding of an ANN, as found in [8], is given in Fig. 1. Typically, the layout, such as the number of nodes in each layer and the number of hidden layers, is fixed and not part of the encoding.

Fig. 1. ANN with one hidden layer

The ANN in Fig. 1 would be encoded as the following chromosome (w1,w2,w3,w4,w5,w6,b1,b2,w7,w8,w9,w10,w11,w12,w13,w14,w15,w16,w17,w18,b3,b4,b5).
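To make the decoding step concrete, the following is a minimal sketch of how such a flat chromosome can be sliced into per-layer weight matrices and bias vectors. The `decode` helper and the layer sizes in the example are illustrative assumptions, not the exact layout of Fig. 1.

```python
import numpy as np

def decode(chromosome, layer_sizes):
    """Slice a flat weight/bias chromosome into per-layer (W, b) pairs.

    The layout is fixed in advance (it is not part of the encoding):
    for each layer, first the n_in * n_out weights, then the n_out biases.
    """
    genes = np.asarray(chromosome, dtype=float)
    layers, pos = [], 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        W = genes[pos:pos + n_in * n_out].reshape(n_out, n_in)
        pos += n_in * n_out
        b = genes[pos:pos + n_out]
        pos += n_out
        layers.append((W, b))
    return layers

# Illustrative 2-3-1 network: (2*3 + 3) + (3*1 + 1) = 13 genes in total.
layers = decode(np.random.uniform(-1.0, 1.0, 13), layer_sizes=[2, 3, 1])
```

A GA then applies mutation and crossover directly to the flat gene vector, while fitness evaluation uses the decoded network.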

The convergence performance of this representation scheme degrades as the network size grows [9]. Moreover, with direct encodings it is normally not possible to represent the graphical structure of the ANN geometrically [10]; this lack of geometry restricts ANNs from evolving brain-like structures.

1.2 Indirect Encodings for ANNs

Kitano [9] devised a genetic algorithm representation for ANNs that represents connections and network topology. The existence of a connection is represented by '1' and its absence by '0'. An extension of L-systems [11], called the graph L-system, is used to generate graphs. Starting from an axiom that is rewritten into a 2 × 2 matrix, a set of deterministic rewriting rules is applied to symbols representing edges and nodes. The connectivity matrix grows at every rewriting step, and the final step yields a matrix of '1's and '0's whose upper-right triangle represents the feed-forward network connections; only the bold connections are used, the rest are discarded. An example representation for the XOR problem [12] is given in Fig. 2, where the symbol S is the starting axiom and the symbols A, B, C, D are nonterminals of the graph grammar based on the graph L-system. The details of the concept, including the rewriting rules down to the final '1's and '0's, are described in [9].

In the first line of the final matrix, '01100' represents a network in which the top node has no connection from itself and only the next two nodes connect to the top node. A toy sketch of this rewriting process is given after Fig. 2.

Fig. 2. 2-2-1 XOR Network Generations (based on an example from [9])
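As a toy illustration of the matrix-rewriting idea (not Kitano's actual XOR rules, which are given in [9]), the following sketch expands an axiom S through 2 × 2 rewriting rules until a binary connectivity matrix is obtained:

```python
# Illustrative rewriting rules: every symbol expands into a 2x2 block.
# One step produces nonterminals A-D, the next produces terminal '0'/'1's.
RULES = {
    "S": [["A", "B"], ["C", "D"]],
    "A": [["0", "1"], ["1", "0"]],
    "B": [["1", "0"], ["0", "1"]],
    "C": [["0", "0"], ["0", "0"]],
    "D": [["1", "1"], ["0", "0"]],
}

def rewrite(matrix):
    """Replace every symbol by its 2x2 expansion, doubling each dimension."""
    size = len(matrix)
    out = [[None] * (2 * size) for _ in range(2 * size)]
    for i in range(size):
        for j in range(size):
            block = RULES[matrix[i][j]]
            for di in range(2):
                for dj in range(2):
                    out[2 * i + di][2 * j + dj] = block[di][dj]
    return out

matrix = [["S"]]
for _ in range(2):           # S -> 2x2 nonterminals -> 4x4 of '0'/'1'
    matrix = rewrite(matrix)
for row in matrix:
    print("".join(row))      # upper-right triangle: feed-forward connections
```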

1.3 The Evolutionary Search Space of Indirect Encodings

Pigliucci [13] quotes Hartmann et al. [7] as having determined that indirect encoding dramatically reduces the evolutionary search space compared with classical direct encoding. On a closer look, however, Hartmann et al. do not claim this as fact in all cases and situations; they only claimed it holds for evolved digital circuits. Further research is therefore required to establish in which cases indirect encoding is beneficial. The importance of this can be seen in genetics, where [13] assumed that indirect encoding is beneficial in all cases, by analogy with the assumption that biological systems behave like simulated evolutionary models. On that unproven argument, [13] claims that the old metaphors of genetic blueprints and genetic programs are misleading or inadequate, and that developmental or indirect encoding must be a promising basis for understanding evolvability and the Genotype-Phenotype (G-P) mapping problem.

Interestingly, developmental encoding itself was inspired by biological development [14]. More recently, Clune et al. [15] showed an indirect encoding outperforming a direct encoding. By contrast, Harding and Miller [16] showed that for encoding patterns of lower complexity (in the Kolmogorov sense), direct encoding sometimes performs worse than indirect encodings.

Other studies utilising generative encodings include Clune and Lipson [17], Meli [18], Jacob and Rozenberg [11] (using Lindenmayer's grammar [19]) and Kitano [9].

Kwasnicka and Paradowski [20] found that indirect encoding gave excellent solutions within a few generations, though the evolved networks were larger than their directly encoded equivalents. Da Silva et al. [21] also demonstrated the benefit of indirect encoding: the quality of the Particle Swarm Optimisation (PSO), GA and GP-based solutions for web services using indirect encoding was higher than that of the equivalent baseline direct encoding approach for twelve out of thirteen datasets. Hotz [22], in a case study of lens shape evolution schemes, likewise showed that indirect encodings converged faster than direct ones.

In contrast to the advantages reported in the above papers, [23] found on the Tetris problem that HyperNEAT, a tool for indirect encoding of ANNs, is superior to NEAT (a tool for direct encoding of ANNs) early in evolution, but this advantage fizzles out and NEAT eventually performs better. The authors showed that HyperNEAT was better for raw features, while NEAT performed better for hand-designed features. The aim of this paper is to compare direct and indirect encoding on a common problem and analyse the resulting evolutionary performance.

2 Indirect Encoding

Encodings in evolutionary computation are typically binary, real-valued, graph-based or program code [24]. The selected representation affects the effectiveness of a genetic algorithm [25], and probably of other binary-coded evolutionary computation as well. Direct encoding has been claimed to be ineffective [14]. Indirect encodings vary; one form, developmental encoding, employs gene reuse, which in evolutionary computation is referred to as embryogeny, or Artificial Embryogeny [14]. Some exciting research in indirect encoding involves ANNs: an ANN can be represented by a grammar whose rewriting rules generate the network, as seen in [9].

In GP, one finds indirect encodings such as Cartesian GP (CGP) [26] and Grammatical Evolution [27]. They can also provide "encoding neutrality". Neutrality is the situation in which small genotype mutations (neutral mutations) have no effect on the fitness of the expressed phenotype. Miorandi et al. [28] explain that neutral mutations give a substantial advantage in state-space exploration; this can increase the evolvability of a population and provide further robustness.
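As a toy illustration of this kind of neutrality, consider a Grammatical Evolution-style mapping in which each integer codon selects a production by a modulo rule; the tiny grammar below is an assumption for the example, not the grammar of any cited system.

```python
# Each codon picks a production via `codon % number_of_productions`,
# so different codon values can select the same production (neutrality).
GRAMMAR = {"<expr>": [["<expr>", "+", "<expr>"], ["x"], ["1"]]}

def derive(genotype, symbol="<expr>", index=0, depth=0):
    """Expand `symbol` using successive codons; return (string, next index)."""
    if symbol not in GRAMMAR:      # terminal symbol
        return symbol, index
    if depth > 8:                  # crude guard against runaway recursion
        return "x", index
    options = GRAMMAR[symbol]
    choice = genotype[index % len(genotype)] % len(options)
    index += 1
    parts = []
    for s in options[choice]:
        text, index = derive(genotype, s, index, depth + 1)
        parts.append(text)
    return "".join(parts), index

# 5 % 3 == 8 % 3 == 2, so mutating the first codon from 5 to 8 is neutral:
print(derive([5, 1, 7])[0])   # -> '1'
print(derive([8, 1, 7])[0])   # -> '1' (same phenotype, hence same fitness)
```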

An indirect encoding representation can allow optimisation to occur without restrictions, since "functional constraints are subsequently enforced during the decoding step" [21].

An issue which Ronald [29] finds with the developmental form of indirect encoding, used mainly in hybrid systems, is that some of these encodings do not cover the entire search space. An alternative representation may cover the entire search space, as the Millipede representation [18] does by folding several points of the search space into one. Similarly, Della Croce et al. [30], in a GA that solves the Travelling Salesperson Problem (TSP), employ a lookahead representation based on Falkenauer and Bouffoix's Linear Order Crossover (LOX). Another GP system uses a block-oriented representation instead of the usual direct one [31]: it describes how a block diagram is built rather than encoding the structure directly.

3 Summary of Research on Indirect Encoding

Research on indirect encoding is summarised in two tables: Table 1 covers evolutionary computation with neural networks, and Table 2 covers other evolutionary computation techniques.

Table 1. Research involving indirect encoding with ANN or other neural networks
Table 2. Research involving indirect encoding with other evolutionary computation techniques

3.1 Basic Processing of HyperNEAT

HyperNEAT is also known as Hypercube-based NEAT [10], where NEAT stands for NeuroEvolution of Augmenting Topologies. The main idea of HyperNEAT is that geometric relationships can be learned when the solution is represented indirectly: the search is for a generative description of the connectivity of the ANN rather than for the connection weights of the ANN itself. Such an approach is particularly important for evolving large-scale neural networks with huge numbers of nodes and even larger numbers of connections.

HyperNEAT uses Compositional Pattern Producing Networks (CPPNs), networks whose structure can be augmented and whose nodes may use any activation function, evolved by a genetic-algorithm-style process. Figure 3 summarises the steps of the HyperNEAT algorithm.

Fig. 3. HyperNEAT processing
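The core generative step can be sketched as follows: the CPPN is queried with the coordinates of every pair of substrate nodes and returns the weight of the connection between them. The `toy_cppn` function below is a fixed stand-in assumption; in HyperNEAT proper, the CPPN's topology and activation functions are themselves evolved.

```python
import math

def toy_cppn(x1, y1, x2, y2):
    """Stand-in for an evolved CPPN: maps node coordinates to a weight."""
    return math.sin(x1 * x2) + math.cos(y1 + y2) - 1.0

def build_substrate(coords, threshold=0.2):
    """Query the CPPN for every ordered pair of substrate nodes and keep
    only the connections whose weight magnitude exceeds the threshold."""
    connections = {}
    for src in coords:
        for dst in coords:
            w = toy_cppn(src[0], src[1], dst[0], dst[1])
            if abs(w) > threshold:
                connections[(src, dst)] = w
    return connections

# A 3x3 grid of substrate nodes spanning [-1, 1] x [-1, 1].
grid = [(x - 1.0, y - 1.0) for x in range(3) for y in range(3)]
net = build_substrate(grid)
print(f"{len(net)} connections out of {len(grid) ** 2} possible")
```

Because every connection weight is a function of node geometry, regularities such as symmetry or repetition in the CPPN translate directly into regular connectivity patterns in the substrate ANN.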

4 Experiment

The experiment compared a direct and an indirect encoding algorithm for ANNs. The direct algorithm chosen was NEAT [34]; the indirect algorithm was HyperNEAT [10]. Recall that NEAT evolves ANN nodes and connections explicitly. It uses a direct encoding because extensive knowledge of how an encoding will be exploited is difficult to obtain, and an indirect search can be biased [34]. HyperNEAT extends the NEAT algorithm: it uses Compositional Pattern Producing Networks (CPPNs) to evolve ANNs using principles from NEAT. These algorithms were preferred due to the availability of easily usable software, SharpNEAT.
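For reference, NEAT's direct genome lists every node and connection explicitly, with each connection carrying an innovation number used to align genomes during crossover [34]. A minimal sketch of this structure follows; the field names are illustrative, not SharpNEAT's API.

```python
from dataclasses import dataclass

@dataclass
class NodeGene:
    node_id: int
    node_type: str               # 'input', 'hidden' or 'output'

@dataclass
class ConnectionGene:
    in_node: int
    out_node: int
    weight: float
    enabled: bool                # connections can be disabled by mutation
    innovation: int              # historical marker for aligning genomes

# A tiny two-input, one-output genome before any structural mutation.
genome = {
    "nodes": [NodeGene(0, "input"), NodeGene(1, "input"), NodeGene(2, "output")],
    "connections": [
        ConnectionGene(0, 2, 0.7, True, 1),
        ConnectionGene(1, 2, -0.5, True, 2),
    ],
}
```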

4.1 Experimental Parameters

The open-source SharpNEAT neuroevolution C#.NET implementation was compiled and used unmodified to compare the two algorithms, NEAT and HyperNEAT, applied to the 2D Walker problem [41]. Settings were taken from [39], which compared the two algorithms on the T-Maze learning problem [39, 40], although that work used HypersharpNEAT rather than SharpNEAT. The 2D Walker problem was chosen because it was the only problem available in SharpNEAT for both the NEAT and HyperNEAT algorithms. The problem was investigated by [41], who used their own implementations of NEAT, the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) and other deep reinforcement learning algorithms. The 2D Walker task involves an agent learning how to walk. [32] investigated a similar quadruped gait problem and concluded that HyperNEAT performed better than FT-NEAT, a directly encoded algorithm; they used the ODE physics simulator with a small population of 150, 1000 generations and 50 runs. The fitness function for 2D Walker is the square of the mean hip position over five trials, taken as zero when negative.
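Under one plausible reading of that description, the fitness computation can be sketched as follows (the function name and the per-trial input are assumptions for illustration):

```python
def walker_fitness(mean_hip_positions):
    """Mean hip position over the trials, clamped at zero, then squared.

    `mean_hip_positions` is assumed to hold one mean hip position per trial
    (five trials in the experiment described above).
    """
    mean_over_trials = sum(mean_hip_positions) / len(mean_hip_positions)
    return max(0.0, mean_over_trials) ** 2

print(walker_fitness([0.9, 1.1, 1.0, 0.8, 1.2]))  # -> 1.0
```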

Every run consisted of 500 generations, the population size was 500, and 10% elitism was used. 50% of sexual offspring were not mutated. The asexual offspring "had 0.94 probability of link weight mutation, 0.03 chance of link addition, and 0.02 chance of node addition" [39]. The initial connections proportion was set to 0.05, the connection weight mutation probability to 0.89 and the connection deletion probability to 0.025. The connection weight range was five.
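For clarity, these settings are collected below as a parameter dictionary; the key names are illustrative and do not match SharpNEAT's own configuration keys.

```python
params = {
    "generations": 500,
    "population_size": 500,
    "elitism_proportion": 0.10,
    "sexual_offspring_unmutated": 0.50,   # 50% of sexual offspring unmutated
    "p_link_weight_mutation": 0.94,       # asexual offspring [39]
    "p_add_link": 0.03,
    "p_add_node": 0.02,
    "initial_connections_proportion": 0.05,
    "p_mutate_connection_weights": 0.89,
    "p_delete_connection": 0.025,
    "connection_weight_range": 5.0,
}
```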

4.2 Results

Figure 4 shows the results of running the NEAT and HyperNEAT algorithms in the SharpNEAT application for 500 generations, charting mean fitness.

Fig. 4. NEAT vs HyperNEAT algorithm results

Interestingly, NEAT performed well on the task, whereas the HyperNEAT algorithm, using indirect encoding, did not achieve good mean fitness values: all its values are smaller than 0.3, while NEAT approaches a mean fitness of 1.3. By generation 200 the mean fitness for HyperNEAT is 0.239891, whereas the equivalent for NEAT is 0.96752. The largest mean fitness reached by HyperNEAT is 0.264622 at generation 396, a value it never reaches again; the largest mean fitness reached by NEAT is 1.333688, at generation 493.

Table 3 shows the mean and maximum fitnesses reached at the final generation (500), averaged across runs, for NEAT and HyperNEAT.

Table 3. Mean and maximum fitnesses at generation 500 for NEAT and HyperNEAT, averaged across runs

Since the data are not normally distributed, the Mann-Whitney U test was applied to the mean fitnesses of the final-generation runs. The difference is significant (U = 0, z-score 5.39649, p < .00001; significant at p < 0.05).
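The test can be reproduced with scipy as sketched below; the two arrays stand for the per-run mean fitnesses at the final generation and contain placeholder values, not the actual experimental data.

```python
import numpy as np
from scipy.stats import mannwhitneyu

# Placeholder per-run mean fitnesses at generation 500 (not the real data).
neat = np.array([1.33, 1.21, 0.97, 1.05, 1.18])
hyperneat = np.array([0.24, 0.26, 0.21, 0.19, 0.25])

u, p = mannwhitneyu(neat, hyperneat, alternative="two-sided")
# Completely non-overlapping samples give an extreme U statistic
# (0 or n1*n2, depending on which sample it is reported for).
print(f"U = {u}, p = {p:.5f}")
```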

The results clearly show that indirect encoding does not help in the Walker problem; indeed, it appears to have stifled evolution. Further tests were made to see whether HyperNEAT could be improved, e.g. using the larger number of generations used by [32] or interspecies crossover, but there was no improvement.

5 Conclusion

Most current research into indirect encoding involves GP, GAs or ANNs. This paper has covered issues of indirect encoding that are important aspects of evolutionary computation, and has examined the differences between indirect and direct representations. Significant theoretical and practical progress has been made, but more research on indirect encoding with neural networks and other evolutionary computation techniques is needed to yield better representations.

These negative results show that indirect encoding is not better than direct encoding for the 2D Walker problem; rather, they provide an example where indirect encoding is hard to apply successfully and is less effective, reminiscent of [23].

Further research should clarify whether indirect encoding is better than direct encoding in some areas, and which categories of problems suit each type of encoding. It certainly cannot be claimed that indirect encoding is always better. Such research would also test the veracity of Pigliucci's claim that "old metaphors of genetic blueprints and genetic programmes are misleading or inadequate" [13].