1 Evolutionary Computation

Evolutionary Computation (EC) is a machine learning technique inspired by the manner in which the biological evolutionary process operates. Populations of individuals, that is, candidate solutions, are evaluated and their performance on a particular problem scored. The population is replaced with a new one created by probabilistically recombining the best performing individuals. In this way, the population slowly evolves towards an optimal or near optimal solution.
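This generate-evaluate-recombine loop is compact enough to sketch directly. The following is a minimal Python outline of a generational evolutionary algorithm; the population-creation, evaluation, selection and variation functions are hypothetical placeholders for problem-specific components, not the API of any particular EC library.

import random

def evolve(init_population, evaluate, select, crossover, mutate,
           generations=50):
    """Minimal generational evolutionary loop (illustrative sketch)."""
    population = init_population()
    for _ in range(generations):
        # Score every candidate solution on the problem at hand.
        scored = [(evaluate(ind), ind) for ind in population]
        # Build the next generation by probabilistically recombining
        # the better-performing individuals.
        offspring = []
        while len(offspring) < len(population):
            a, b = select(scored), select(scored)
            for child in crossover(a, b):
                offspring.append(mutate(child))
        population = offspring[:len(population)]
    # Return the best individual found in the final population.
    return max(population, key=evaluate)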

Two key factors that limit the sort of problems that can be tackled by EC and, indeed, any iterated machine learning technique, are representation and fitness. Representation is concerned with the complexity of the solutions that the system can evolve and manipulate. As individuals become more complex, it becomes increasingly difficult to recombine them with each other.

Fitness is the ability to measure the quality of an individual, specifically its ability to solve the problem at hand. If no fitness evaluator exists, creating one can be prohibitively expensive; unless an evaluator is quick and accurate, it will rapidly become a bottleneck.

EC has been used with considerable success, in work stretching as far back as the seventies, in areas as varied as Bioinformatics [15, 51], Automatic Circuit Generation [42, 93] and Fluid Dynamics [4].

Evolutionary Automatic Programming is specifically focused on evolving programs and, more recently, has been referred to as the problem of program synthesis. The most commonly used approach, Genetic Programming (GP) [41,42,43,44,45], uses expression trees to represent individuals, as in Fig. 1.

Fig. 1

GP individuals represented as syntax trees. The individual on the left corresponds to (+ (* 3 Y) (* (/ 6 X) (* X Y))) while the one on the right corresponds to (+ (/ X Y) (* (* X 4) (Sin X)))

These individuals are recombined with each other using crossover, an operation that swaps subtrees from the two parent individuals. The subtrees are selected at random and placed into the corresponding location in the other parent, resulting in two offspring as in Fig. 2.

Fig. 2

The resulting offspring from crossing over the parents from Fig. 1
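To make the operation concrete, here is a minimal sketch of subtree crossover, representing individuals as nested Python lists in which the first element of each list is the operator, mirroring the s-expressions in Fig. 1. The helper names are illustrative, not drawn from any GP library.

import copy
import random

def subtree_paths(tree, path=()):
    """Yield the path (a tuple of child indices) to every subtree."""
    yield path
    if isinstance(tree, list):
        for i, child in enumerate(tree[1:], start=1):
            yield from subtree_paths(child, path + (i,))

def get_subtree(tree, path):
    for i in path:
        tree = tree[i]
    return tree

def set_subtree(tree, path, replacement):
    if not path:
        return replacement
    get_subtree(tree, path[:-1])[path[-1]] = replacement
    return tree

def subtree_crossover(parent_a, parent_b):
    """Swap randomly chosen subtrees between two parents."""
    a, b = copy.deepcopy(parent_a), copy.deepcopy(parent_b)
    path_a = random.choice(list(subtree_paths(a)))
    path_b = random.choice(list(subtree_paths(b)))
    sub_a, sub_b = get_subtree(a, path_a), get_subtree(b, path_b)
    return set_subtree(a, path_a, sub_b), set_subtree(b, path_b, sub_a)

# The two individuals from Fig. 1:
left = ['+', ['*', 3, 'Y'], ['*', ['/', 6, 'X'], ['*', 'X', 'Y']]]
right = ['+', ['/', 'X', 'Y'], ['*', ['*', 'X', 4], ['Sin', 'X']]]
child_a, child_b = subtree_crossover(left, right)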

GP has enjoyed much success and has been applied to an enormous number of problem domains. There is, however, no simple way to deal with multiple types in GP, nor to handle constraints on the manner in which programs are put together. This is because all GP individuals must obey the closure rule, that is, all functions must take and return the same type. It is possible to use Strongly Typed Genetic Programming [50], in which multiple types can be maintained, but this involves performing constrained crossover, with only nodes of the same type able to be swapped, which reduces the searching ability of the system.

This constrained crossover narrows the search space, and becomes particularly problematic when dealing with dimensionally-aware problems [37, 38]. Furthermore, it does not facilitate the passing of information down through the trees as in Attribute Grammars (AG), which is necessary to generate dynamic types such as those required in matrix multiplication.

Most users do not use GP with multiple types, however, and standard GP has achieved extraordinary success across a very wide range of domains.

2 Grammatical Evolution

2.1 Grammars and Evolutionary Computation

All evolutionary systems that produce programs use grammars of one form or another, whether explicitly [49] or implicitly [86]. Grammars describe how programs can be constructed from constituent parts, i.e. how variables and operators can be legally combined to create executable code. The sorts of languages that different kinds of grammars can produce are documented in Chomsky's well-known Hierarchy of Grammars [7,8,9]. Most EC systems use Context-Free Grammars (CFGs), that is, Type-2 grammars.

As noted above, most Evolutionary Automatic Programming (EAP) systems, including GP, generally considered one of the more advanced, exclusively use Closed Grammars [71, 72], a special, restricted form of CFG with a single type.

Sometimes these grammars are implicit, as with GP and Gene Expression Programming [23], while other systems, such as GE and G3P [94,95,96,97], make them explicit. The main trade-off between implicit and explicit grammar usage is between speed and expressiveness. We refer the interested reader to two relatively recent syntheses: of grammars in genetic programming [49] and, more broadly, of grammars in the context of developmental systems [5].

GE, on the other hand, employs simple linear strings (typically binary or integer) as genotypes, using a mapping scheme to map them onto arbitrarily complex structures. The mapping scheme takes the form of a CFG, which specifies legal relationships between terminals (items which can appear in the final structure) and non-terminals (interim values to help link terminals together). CFGs enable one to evolve considerably more complex structures than standard GP, because they permit multiple types.

GE has a modular nature, see Fig. 3, meaning that everything from the problem being tackled to the language being used, and even the search algorithm being employed, can easily be swapped out. Section 4 describes how this modular nature has led to a massive community effort in further developing GE.

Fig. 3

The modular nature of Grammatical Evolution. Everything from the fitness function to the grammar and even the search engine can be modified or replaced

3 Crash Course in Grammatical Evolution

GE traditionally uses an evolutionary algorithm comprising a variable-length linear genome encoding of a computer program. The genotype-phenotype mapping takes as input the linear genome and a grammar, and outputs a sentence in the language described by the grammar, with context-free grammars being the most often used. To drive search, each individual (that is, a sentence from the language) needs to be assigned a measure of quality.

GE individuals are usually executable entities, but can be any structure represented by a grammar; for example, Chapter 13, “Design, Architecture, and Engineering with Grammatical Evolution” in this book describes the GENR8 [30] system that uses GE and Autodesk’s Maya CAD tool to evolve digital surfaces.

When the sentence is in the form of code, it is usually embedded in some wrapper code to manage its execution. The result of the execution of the code is used as its measure of fitness.

We illustrate the mapping process using a simple example grammar to generate strings of characters, vowels and consonants. We first specify the grammar of the output language, which describes all possible sentences that can be generated.

The sentences generated by the example grammar below are of type string, each comprised of one or more letters. A letter is allowed to be one of our primitives, that is, either a vowel, consonant or character.

A convenient formal notation for grammars, often employed by GE, is Backus-Naur Form (BNF). A BNF grammar is comprised of the tuple {N, T, P, S}, where N is the set of intermediary symbols called non-terminals, which are mapped to the set of terminal symbols (T) according to P, the set of production rules. The terminal set consists of items that can actually appear in legal sentences of the grammar. The final item, S, is a special non-terminal start symbol, from which all derivation sequences begin.

For example, in this particular grammar the terminals are neatly described by three types: vowel, consonant and other character. We use the following sets for N and T.

N = {<string>, <letter>, <vowel>, <consonant>, <character>}, and

T = {a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z, ", *, ?, @, ',', ., ;, :, ' '}.

That is, the terminal set consists of letters, spaces and punctuation symbols, while the non-terminal set consists of the three types noted above, along with <string>, the start symbol, and <letter>, a non-terminal that will be used to help group the various vowels, consonants and characters together. The production rules for this grammar can be specified as follows:

<string> ::= <letter>|<letter><string>
<letter> ::= <vowel>|<consonant>|<character>
<vowel> ::= a|e|i|o|u
<consonant> ::= b|c|d|f|g|h|j|k|l|m|n|p|q|r|s|t|v|w|x|y|z
<character> ::= "|*|?|@|,|.|;|:|' '
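In code, such a BNF grammar is naturally expressed as a mapping from non-terminals to lists of productions. Below is one possible Python encoding of the example grammar; the representation (a plain dict) is illustrative, not the format of any particular GE implementation.

# Each non-terminal maps to its list of productions; a production is
# a sequence of symbols. Symbols that appear as keys are non-terminals,
# everything else is a terminal.
GRAMMAR = {
    "<string>":    [["<letter>"], ["<letter>", "<string>"]],
    "<letter>":    [["<vowel>"], ["<consonant>"], ["<character>"]],
    "<vowel>":     [[c] for c in "aeiou"],
    "<consonant>": [[c] for c in "bcdfghjklmnpqrstvwxyz"],
    "<character>": [[c] for c in '"*?@,.;:'] + [[" "]],
}
START = "<string>"  # S, the start symbol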

Thus, the above grammar contains the set of all possible primitive symbols of the sentences, and the structural rules which govern the generation of syntactically legal sentences. For example, the following sentence can be generated by this grammar, with a partial derivation tree shown in Fig. 4:

to evolve or not to evolve, that is the question.

Fig. 4

Partial derivation of the sentence “to evolve or not to evolve, that is the question.” from the example grammar

3.1 Mapping

GE individuals describe a derivation sequence through a grammar. They do so by selecting choices from the production rules at every derivation step, for example, whether to derive a vowel, consonant or character from <letter>.

The linear genome is interpreted as 8-bit codons, the smallest functional units in GE. Each time a choice needs to be made in the derivation sequence, a codon is taken and its value calculated modulo the number of available rules, which is then used to select the appropriate rule. If, for example, we were choosing a production rule for <letter>, we would mod the codon by 3, because there are three production rules.

The process continues as described in Fig. 5, consuming a codon for each choice in the derivation sequence, until the full derivation tree has been produced. If there are unconsumed codons remaining, these are said to be the tail of the individual and do not contribute to the mapping. In the event that all the codons are consumed before the individual has fully mapped, either the individual is simply abandoned and assigned the lowest possible fitness, or it is wrapped, meaning that mapping continues by reusing codons from the start of the genome. In the latter case, an upper limit is placed on the number of times an individual can be wrapped.

Fig. 5

GE generating a derivation and corresponding parse tree from a binary string. The numbers indicate the order in which the mapping was done; circled nodes labelled with letters indicate terminals

Although wrapping can lead to more successful mappings, particularly early in runs, results have been mixed [88], and the extent to which wrapping helps evolution is often dependent on the grammar being used. Many researchers have found that removing wrapping does not have a major detrimental effect.
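Putting the pieces together, the mapping loop, including the mod rule and optional wrapping, can be sketched as below. It reuses the GRAMMAR dictionary from earlier in this section; the codon width, wrapping limit and function name are illustrative choices, not fixed by GE.

def map_genome(genome, grammar, start, max_wraps=2):
    """Map a list of integer codons to a sentence of the grammar.

    Returns the sentence as a string, or None if the individual
    fails to map within the wrapping limit.
    """
    frontier = [start]   # unexpanded symbols, leftmost-first
    output, next_codon, wraps = [], 0, 0
    while frontier:
        symbol = frontier.pop(0)
        if symbol not in grammar:       # a terminal: emit it directly
            output.append(symbol)
            continue
        if next_codon == len(genome):   # codons exhausted: wrap or fail
            wraps += 1
            if wraps > max_wraps:
                return None
            next_codon = 0
        choices = grammar[symbol]
        # The codon value mod the number of productions picks the rule.
        choice = genome[next_codon] % len(choices)
        next_codon += 1
        frontier = list(choices[choice]) + frontier
    return "".join(output)

# For example, with 8-bit codons:
# map_genome([204, 103, 30, 7, 61], GRAMMAR, START)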

3.2 Alternative Grammars

With the exceptions noted here, GE and, indeed, virtually all other grammar-based systems predominantly use CFGs which, although expressive enough to make GE a very broadly applicable system [3, 24, 61, 70], are limited to regular and context-free languages.

Alternative grammars that have been employed with GE include Attribute Grammars (more details below in Sect. 3.2.1), Shape Grammars [76], L-Systems [65] and Map L-Systems [82], logic grammars [39], graph grammars [47], meta-grammars (albeit CFG-based) [62, 69] and Tree Adjoining Grammars [54, 56, 58].

3.2.1 Attribute Grammars

Attribute Grammars (AG) can be used to expand the expressive power of GE by attaching attributes (pieces of information) to the symbols in a grammar [10, 11, 35, 36, 74, 83]. Attributes are either passed down the derivation tree (inherited) or passed up (synthesized), although default attributes can also be created and passed around, as in Fig. 6. Attributes can take any form, from simple atomic values to arrays or lists, and can be used by a developing structure to pass information about various parts of the structure to other parts. AGs thus facilitate the manipulation and exploitation of contextual information, which can be about other parts of the solution or about the problem itself. For example, in the context of circuit design, attributes could be used to pass information about which input pins have already been processed.

Fig. 6

The GE mapping process augmented with AG. Individuals are mapped from simple binary strings (codons) to high level structures using arbitrarily complex grammars, including attribute grammars, which can pass contextual information around
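A minimal sketch of the two attribute flows is given below, using tree depth as a stand-in inherited attribute and subtree size as a stand-in synthesized one; a real AG application would instead propagate domain information such as processed input pins or dimensional units. The Node class and attribute names are purely illustrative.

class Node:
    """A derivation-tree node that can carry named attributes."""
    def __init__(self, symbol, children=()):
        self.symbol = symbol
        self.children = list(children)
        self.attrs = {}

def decorate(node, depth=0):
    # Inherited attribute: handed down from the parent to each child.
    node.attrs["depth"] = depth
    for child in node.children:
        decorate(child, depth + 1)
    # Synthesized attribute: computed bottom-up from the children.
    node.attrs["size"] = 1 + sum(c.attrs["size"] for c in node.children)
    return node

tree = decorate(Node("<S>", [Node("<v>", [Node("x")])]))
assert tree.attrs["size"] == 3 and tree.children[0].attrs["depth"] == 1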

4 Twenty Years of Grammatical Evolution

As shown in Fig. 3, the genotype-phenotype mapping of GE provides the advantage of a modular framework for Genetic Programming. The main components of this framework are the search engine, the mapper, the grammar, and the fitness evaluation. Activities over the past 20 years can be described in terms of these components; see Fig. 7 for an overview.

Fig. 7

Some of the key research areas in GE. New topics, especially applications, are being added all the time

Research on the search engine ranges from understanding the impact of the genome encoding [34], initialisation [59, 85], modularity [29, 33, 81, 89,90,91,92], crossover [27, 48, 68, 71] and dynamic environments [13, 78], through analysing the behaviour of the crossover and mutation operators and proposing alternative search operators [6, 17, 25, 28, 52, 75], to replacing the traditional evolutionary algorithm with alternatives such as Particle Swarm Optimisation [64, 73], Simulated Annealing [84], Differential Evolution [63] and even random search [84]. Continuing this vein of research, Chapter 7, “Geometric Semantic Grammatical Evolution” outlines a geometric semantic search operator approach to GE, and in Chapter 3, “On the Non-uniform Redundancy of Representations for Grammatical Evolution: The Influence of Grammars” we see an emphasis on analysing the locality of the GE mapping and some of its genetic search operators.

The mapping process itself has been a target for investigation, with a number of alternatives having been proposed, in part to gain a deeper understanding of the generative process and in part to make improvements, for example by complexifying the mapping to bring it closer to its biological counterpart [1, 16, 40]. Chapter 4, “Mapping in Grammatical Evolution” provides an overview and highlights key studies in this area.

At the heart of GE is the grammar and, while the majority of papers adopt CFGs, as we noted earlier in this chapter (Sect. 3.2) the variety of grammars that have been adopted with a GE mapping is impressive, ranging from shape, logic, attribute, meta, graph and tree-adjoining grammars to prefix, infix and postfix encodings [31]. This line of research continues to this day, and Chapter 2, “Understanding Grammatical Evolution: Grammar Design” provides a critical analysis of the importance of grammar design in the successful application of GE.

Part of the attraction of genetic programming algorithms such as GE is their flexibility of application. As such, GE has enjoyed application to a wide set of problem areas. Part II of this book contains a selection of chapters highlighting some of these, including Financial Modelling (Chapter 11, “Grammatical Evolution in Finance and Economics: A Survey”), Medicine and Bioinformatics (Chapter 15, “Identification of Models for Glucose Blood Values in Diabetics by Grammatical Evolution” and Chapter 16, “Grammatical Evolution Strategies for Bioinformatics and Systems Genomics”), Architecture and Design (Chapter 13, “Design, Architecture, and Engineering with Grammatical Evolution”), Business Analytics (Chapter 19, “Business Analytics and Grammatical Evolution for the Prediction of Patient Recruitment in Multicentre Clinical Trials”), Computational Creativity (Chapter 14, “Grammatical Evolution and Creativity”) and Game Artificial Intelligence (Chapter 18, “Evolving Behaviour Tree structures using Grammatical Evolution”). Other examples include communication networks [19, 20, 32], search-based software engineering [12] and program synthesis [66, 67, 79], sport analytics [80], eco-system modelling [14, 60], and animation [53, 55, 57].

Implementations of GE are available in a variety of languages, including Matlab and, more recently, Python with PonyGE2 [21, 22], with the majority of these employing an integer genome encoding as standard.

Finally, as noted in Sect. 5.4, a large number of variants of GE have appeared. These range from position-independent approaches such as Chorus and πGE [2, 16, 87], and context-sensitive approaches such as Adaptive Logic Programming [39], TAGE [56] and DTAGE [58], to the novel 3D Map L-system based GENR8 [82] and the exploitation of meta-grammars in Grammatical Evolution by Grammatical Evolution [69].

5 The State of the Art

As noted in the foreword, as the authors of the original GE paper, one of the most rewarding things we have experienced is how it has been taken up by other researchers. The state of the art in 1998 was easy to articulate; there were only three researchers, three problem domains and one system. Twenty years of evolution have had their impact on the sorts of applications that GE can be applied to. It is important to note that it isn't possible to definitively state which GE set-up is best, mainly because of the hugely broad spectrum of uses. Instead, we focus in this section on the sorts of choices that need to be considered when tackling a problem with GE, and discuss how various characteristics of problems influence these choices.

5.1 Grammars

AGs are more expressive than CFGs and can be used to enforce constraints and pass context information around derivation trees, as in Fig. 6. The key advantage of CFGs is their simplicity, and mappers using CFGs are generally faster than their AG counterparts, at the cost of expressiveness. Several chapters in this book look in detail at grammar design, and our advice is to use the most powerful grammar necessary, but no more.

5.2 Genetic Operators

The genetic operators are generally inherited from whatever underlying GA or search engine is driving GE, but early work analysing the operation of single-point crossover [71] showed that, when compared to other crossover operators, including highly tuned homologous operators, it actually performed surprisingly well, giving a very good performance-to-cost ratio. More recent work, such as Chapter 7, “Geometric Semantic Grammatical Evolution” in this book, has examined semantic crossover operators, and shows some very promising results.

For a simple GE set-up, we recommend what has become known as effective crossover, that is, simply ensuring that at least one crossover point is selected within the coding part of an individual, as described in Fig. 8. This is simple to implement and dramatically increases the probability that at least one of the offspring will be phenotypically different from the parents.

Fig. 8

The three distinct crossover regions for Grammatical Evolution. The solid area in each parent represents the coding regions, while the diagonal lines represent regions that were not used in the mapping. When each crossover point occurs within these regions, the operation will simply result in two offspring identical to the parents. When the points are in either of the other two regions, the crossover operation is said to be effective
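A sketch of this idea on linear genomes follows; used_a is the number of codons the first parent consumed during mapping (its coding region), genomes are assumed to be longer than one codon, and all names are illustrative. Forcing that parent's crossover point inside the coding region is what makes the operation effective in the sense of Fig. 8.

import random

def effective_crossover(parent_a, parent_b, used_a):
    """One-point crossover per parent, with parent_a's point forced
    into its coding region (the first `used_a` codons)."""
    point_a = random.randint(1, max(1, used_a))      # inside coding region
    point_b = random.randint(1, len(parent_b) - 1)   # anywhere
    child_a = parent_a[:point_a] + parent_b[point_b:]
    child_b = parent_b[:point_b] + parent_a[point_a:]
    return child_a, child_b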

5.2.1 Initialisation

Originally, we used random initialisation for the GE population. However, as noted in [18, 26, 85], random initialisation can lead to very heavily biased initial populations. Consider the simple grammar below:

<S> ::= <op><v><v> | <v>
<op> ::= +|-|*|/
<v> ::= x|y

Uniform random initialisation will create a population in which 50% of the individuals consist of just one item, due to the <S> ::= <v> production; of these, approximately half will be x and the rest y. Clearly this compromises the variation in the initial population, making evolution towards a useful product more difficult than it needs to be.
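This bias is easy to verify empirically. The snippet below, which reuses the map_genome sketch from Sect. 3.1, randomly initialises a population against the simple grammar and counts how many individuals collapse to a bare variable; expect roughly half.

import random

SIMPLE = {
    "<S>":  [["<op>", "<v>", "<v>"], ["<v>"]],
    "<op>": [["+"], ["-"], ["*"], ["/"]],
    "<v>":  [["x"], ["y"]],
}

population = [
    map_genome([random.randrange(256) for _ in range(10)], SIMPLE, "<S>")
    for _ in range(10_000)
]
fraction = sum(p in ("x", "y") for p in population) / len(population)
print(f"{fraction:.1%} of random individuals are a bare x or y")  # ~50%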

Thus, it is important to ensure a spread of individuals in the first generation. Sensible Initialisation [85] takes the ramped half-and-half approach often used in GP and adapts it for GE. It operates by creating a population of derivation trees of various shapes and sizes and performing an “unmod” operation on them to generate linear strings that can subsequently be processed by GE.

When creating each individual in the initial population, first a derivation tree of a particular size is generated. Figure 9 gives an example of a derivation tree of depth 3. The choice made at each step is noted; for example, the initial step used the production rule <S> ::= <op><v><v>, which is choice 0 of those available for <S>. Similarly, when mapping <op>, choice 2 is made from the available production rules, that is, <op> ::= *.

Fig. 9

Creating derivation trees in Sensible Initialisation. The production rule number at each step is noted and will subsequently be used in the following “unmod” step. Each individual has a sequence of choices associated with it; in this case the sequence is 0210

Each individual in the initial population has a list of these choices, which can be used to quickly identify duplicates. Once we are satisfied that the population consists of unique genotypes, the final “unmod” step can be performed.

Unmod produces the actual codons that will be used, essentially performing the opposite operation to mod: it returns a number that, when modded by the number of choices available for the particular non-terminal, yields the choice made. In our example, we wish to perform 2 unmod 0 for the first production rule, meaning that we require a number that, when modded by 2, will yield 0.

This means that any even number between 0 and 255 will suffice. Similarly, for the second production rule we perform 4 unmod 2, as there are four choices and our tree used choice 2. Any number in the set {2, 6, 10, …} will do.
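Unmod itself is nearly a one-liner. The sketch below returns a random codon consistent with a recorded choice; the 8-bit codon range and the function name are illustrative.

import random

def unmod(n_choices, choice, codon_range=256):
    """Return a random codon c such that c % n_choices == choice."""
    return random.choice(range(choice, codon_range, n_choices))

# The choice sequence 0, 2, 1, 0 from Fig. 9 could, for example,
# become the codons [14, 6, 201, 88]:
genome = [unmod(2, 0), unmod(4, 2), unmod(2, 1), unmod(2, 0)]
assert [c % n for c, n in zip(genome, [2, 4, 2, 2])] == [0, 2, 1, 0]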

Clearly, unmod is a stochastic operator and, while its output doesn't impact the initial generation in any way, it is crucial that it introduces variation: when individuals from the first generation are crossed over with each other, codons that end up being used for different production rules than they originally were should not bias the choices made.

More recent work on initialisation includes that of Nicolau, who demonstrated that across the problems examined in their study, a variant of Harper’s PTC2 consistently outperforms other initialisations [59], as well as the work of Lourenco [46], which is further advanced in Chapter 6, “Structured Grammatical Evolution: A Dynamic Approach”.

What is crucial, though, is to put some effort into ensuring good variation in that initial population, and to avoid simple random initialisation.

5.3 Parameter Settings

As with all EAs, GE has a number of parameter settings, such as population size, mutation rate and the like. There is a vast amount of literature in the field about how to set these parameters, but suffice it to say that population size is the most sensitive, and that more difficult problems generally require larger populations. It is important to turn this knob carefully, though, as grammars and initialisation also play a part.

5.4 Variants

As described earlier in Sect. 4, not only has there been considerable research into the use of GE and analysis of its operation, there have also been quite a number of variants. It would take a whole other book to exhaustively test these against each other on a broad enough range of problems to be able to make any sort of recommendations, but readers are encouraged to investigate these variants, particularly those that have been shown to outperform GE on problems related to their own.

6 Contents of This Book

The book is divided into two key sections, Analysis and Applications. Rather appropriately, we start the analysis section with two chapters on grammar design. In Chapter 2, “Understanding Grammatical Evolution: Grammar Design”, Nicolau and Agapitos present some domain-independent guidelines for designing grammars, and, in Chapter 3, “On the Non-uniform Redundancy of Representations for Grammatical Evolution: The Influence of Grammars”, Schweim, Thorhauer and Rothlauf present a fascinating study on the impact of grammar design and redundancy on the creation of biased trees.

These are followed up by a trio of chapters on mapping in GE. Starting with a comprehensive survey in Chapter 4, “Mapping in Grammatical Evolution” by Fagan and Murphy, we then move to a contribution by Hemberg, Chapter 5, “Theory of Disruption in GE”, in which he formalizes and analyzes the mapping process. This leads nicely into Chapter 6, “Structured Grammatical Evolution: A Dynamic Approach” by Lourenco et al., in which they further develop their Dynamic Structured Grammatical Evolution (DSGE) system, a version of GE that employs a different mapping to improve the power of the genetic operators.

Similar motivations are evident in Chapter 7, “Geometric Semantic Grammatical Evolution” and Chapter 8, “GE and Semantics”, by Moraglio et al. and Echeandia et al., respectively; the former develops a semantic crossover operator for GE, while the latter employs grammars to enable semantics, giving an excellent review of related work as it does so.

This section of the book is rounded out by two final chapters. Chapter 10, “Comparing Methods to Creating Constants in Grammatical Evolution” by Azad and Ryan tackles the issue of constant generation, highlighting the pros and cons of the better-known methods, while Dufek et al. describe a parallel implementation of GE in Chapter 9, “Multi- and Many-Threaded Heterogeneous Parallel Grammatical Evolution”, which yields hugely impressive results.

We then switch gear to applications and provide seven radically different problems that have been tackled by experts in the field. Starting with a survey of financial applications in GE in Chapter 11, “Grammatical Evolution in Finance and Economics: A Survey” by Brabazon, we move to parallel program generation in Chapter 12, “Synthesis of Parallel Programs on Multi-Cores” by Chennupati et al.

The creative side of GE is explored in the next two chapters, starting with Fenton et al. in Chapter 13, “Design, Architecture, and Engineering with Grammatical Evolution”, who use GE to evolve physical designs, and then with Loughran in Chapter 14, “Grammatical Evolution and Creativity” who, in a very philosophical paper, uses GE to evolve music. There then follow two chapters from medical domains; first Hidalgo et al. use GE to generate models for glucose blood values in Diabetics in Chapter 15, “Identification of Models for Glucose Blood Values in Diabetics by Grammatical Evolution”, while Moore and Sipper give a thorough review of the use of GE in bioinformatics and systems genomics in Chapter 16, “Grammatical Evolution Strategies for Bioinformatics and Systems Genomics”.

7 Summary

We hope this book provides useful snapshots of the research and applications in Grammatical Evolution that have taken place over the 20 years since the original work was published at EuroGP 1998, and that it presents some of the state of the art and current thinking in this field. Grammatical Evolution as a form of Genetic Programming, in particular in its application to automatic programming or program synthesis, still has a lot of open issues to address [77], and we hope to witness and be involved in the continued development of this exciting field of research for some time to come.