Deployment of parallel linear genetic programming using GPUs on PC and video game console platforms

Wilson, Garnett; Banzhaf, Wolfgang

doi:10.1007/s10710-010-9102-5


Original Paper
Published: 18 February 2010

Volume 11, pages 147–184, (2010)

Genetic Programming and Evolvable Machines


Abstract

We present a general method for deploying parallel linear genetic programming (LGP) to the PC and Xbox 360 video game console by using a publicly available common framework for the devices called XNA (for “XNA’s Not Acronymed”). By constructing the LGP within this framework, we effectively produce an LGP “game” for PC and XBox 360 that displays results as they evolve. We use the GPU of each device to parallelize fitness evaluation and the mutation operator of the LGP algorithm, thus providing a general LGP implementation suitable for parallel computation on heterogeneous devices. While parallel GP implementations on PCs are now common, both the implementation of GP on a video game console using GPU and the construction of a GP around a framework for heterogeneous devices are novel contributions. The objective of this work is to describe how to implement the parallel execution of LGP in order to use the underlying hardware (especially GPU) on the different platforms while still maintaining loyalty to the general methodology of the LGP algorithm built for the common framework. We discuss the implementation of texture-based data structures and the sequential and parallel algorithms built for their use on both CPU and GPU. Following the description of the general algorithm, the particular tailoring of the implementations for each hardware platform is described. Sequential (CPU) and parallel (GPU-based) algorithm performance is compared on both PC and video game platforms using the metrics of GP operations per second, actual time elapsed, speedup of parallel over sequential implementation, and percentage of execution time used by the GPU versus CPU.


1 Introduction

An increasingly popular means of conducting massively parallel computing is general-purpose computing on graphics processing units (GPGPU). The adoption of graphics processing units (GPUs) for parallel computing is due to both the low price point of GPU hardware compared to other options for parallel processing and the rate at which the computing power of GPU hardware brought to market increases [1]. Evolutionary computation in general, including genetic programming (GP), can usually be readily adapted to parallel computing techniques. Thus, a growing number of GP practitioners have been using GPGPU to reduce the computing time required by their applications. This work describes an algorithm for implementing linear genetic programming (LGP) that uses the GPU for the fitness evaluation and mutation sections of the algorithm. Moreover, the authors describe how to implement LGP so that it can be deployed to use the GPU on heterogeneous devices.

The toolset and runtime environment package used in this work to access the GPU, Microsoft’s XNA (recursive acronym “XNA’s Not Acronymed”) framework, is designed to allow execution on heterogeneous devices including PCs (Windows XP or Vista), video game console (Xbox 360), and even a portable digital media device (Zune).^{Footnote 1} The XNA framework allows the creation of computer games by consumer designers across these hardware platforms. The XNA framework is used with Microsoft’s C# (for CPU programming) and High Level Shader Language, or HLSL (for GPU shader-level programming). C# is a high level object-oriented language that is part of Microsoft’s current Visual Studio development environment, while HLSL is a lower level C-style language specifically useful for programming GPU shader programs. While the authors were able to produce a general algorithm that could be applied across devices, particular elements of each algorithm were changed due to hardware considerations and the XNA framework. Despite the changes that were unique to each platform, we aimed to keep the high-level parallel methodology general across PC and video game console devices. By designing the algorithm within the XNA framework for all platforms, what is effectively developed is a Linear Genetic Programming game for PC and XBox 360 that runs a sequential or parallel GP and displays results numerically and within colored textures on screen as evolution takes place.

The main objective of this paper is to describe both a general methodology and the platform-dependent requirements for implementing GP using GPGPU on PCs and video game consoles. While the general methodology is the same across the two platforms, two separate versions of the implementation were used. One goal of this work was to make the best use of GPU shader programming on each platform used for the LGP algorithm, rather than to achieve identical implementations (program instructions) across platforms at the unnecessary expense of performance on a given platform. This goal was a continual trade-off with the additional goal of maintaining uniformity of the general algorithm across the heterogeneous platforms. A description of the details of the modifications that the authors implemented to allow such deployment on the two heterogeneous platforms is the main contribution of the work. In addition, some empirical examination of the performance of parallel (CPU and GPU-based) and sequential (only CPU-based) implementations on the two platforms is provided. Due to the differences in both PC and video game console hardware and underlying operating systems, these results only provide a means of comparing the combined algorithm and hardware implementations (where they are highly interrelated in this work). That is, no conclusions can be drawn regarding the speed of the underlying hardware and software used to deploy the algorithms. However, results can be gleaned from the comparison of parallel and non-parallel implementations using the combination of LGP algorithm and targeted hardware platform.

Section 2 discusses parallel implementations using general-purpose computation on graphics hardware (GPGPU) and previous, related work involving parallel programming of GP with GPUs. Section 3 provides a brief overview of linear genetic programming (LGP) and general purpose computation on graphics processing units (GPGPU) and introduces the GPU-based data structures used for the LGP individuals. Section 4 discusses some hardware-related considerations for each platform, and describes the general methodology of programming LGP using the data structures introduced in the previous section. Section 5 details platform-dependent differences in the general algorithm for the PC and Xbox 360 deployments of the parallel and sequential fitness functions, with implementation of the parallel and sequential mutation operator described in Sect. 6. Section 7 examines performance results for the GP regression sextic polynomial problem for the parallel and sequential algorithms and hardware combinations. Discussion and conclusions follow in Sects. 8 and 9, respectively.

2 GPGPU programming and related work

GPUs have the ability to perform restricted parallel processing, while being mass produced and inexpensive; hence researchers are increasingly interested in their use for applications requiring intensive parallel computations. The type of parallel processing used by GPUs is referred to as single instruction multiple data (SIMD) processing, where all the processors on the graphics unit simultaneously execute the same set of program instructions on different data. To be specific, a GPU is responsible for simultaneously rendering the pixels it is provided on an assembly of these pixels called a “texture.”^{Footnote 2} The GPU processes the texture and outputs a vector of four floating point numbers for each pixel processed, corresponding to rgba (red, green, blue, and alpha, for transparency) components of a color or the four components of a position (x, y, z and w for an algebraic factor). The two parts of GPU architecture that a user can control are the set of vertex processors and the set of pixel (or fragment) processors.

An effect file, which is a program to control the GPU, is divided into two parts corresponding to the architecture: a pixel shader and a vertex shader. The vertex shader program transforms input vertices based on camera position, and then each set of three resulting vertices defines a triangle from which pixel (fragment) output is generated and sent to the pixel shaders. The shader program instructs the pixel shaders (processors) to produce the colors of each pixel in parallel and place the final pixel in a memory buffer prior to final output. GPGPU applications tend to take advantage of pixel shader programming rather than using the vertex shaders, mainly because there are typically more pixel than vertex shaders on a GPU and the output of the pixel shaders is fed directly to memory [2]. In contrast, vertex processors must send output through both the rasterizer and the pixel shader sections of the GPU. The general architecture of a GPU is shown in Fig. 1, at a level of detail matching the discussion above.

APIs for accessing the functionality of the GPU differ in level of abstraction. Lower level alternatives for GPU programming include DirectX and the Open Graphics Library (OpenGL). The next level of abstraction includes C-style languages including C for Graphics (Cg), Microsoft’s High Level Shader Language (HLSL), and nVidia’s Compute Unified Device Architecture (CUDA). At the highest level are libraries that are integrated with object-oriented languages such as Sh (now RapidMind) that integrates with C++ and Microsoft Research’s Accelerator [3] integrating with C#. This work uses the XNA framework, which uses C# with GPU programming done in HLSL. Both implementations described herein (PC and XBox 360) use only the classes and methods specified by the XNA framework (using C# and HLSL). Thus, the algorithms can be constructed with Visual Studio 2008 with XNA framework installed and video card with appropriate GPU. The only additional requirement to execute these algorithms on an XBox 360 is a membership in the XNA Creators Club, which has a recurring subscription fee.

A number of evolutionary computation practitioners have demonstrated significant speed-ups in computation for some time now using a number of distributed and parallel computing techniques, due to the fact that evolutionary computation-based algorithms are easily parallelizable [1, 4]. The first GPU-centered applications to use evolutionary algorithms in general naturally applied them to textures for use in image processing. The idea of applying genetic programming to evolve shaders was first suggested by Musgrave [5]. Loviscach and Meyer-Spradow [6] used genetic programming to evolve pixel shaders in OpenGL and applied them to textures with user feedback required for determination of a fitness based on aesthetic; Ebner et al. [7] implemented a similar strategy with Cg. Lindblad et al. [8] applied linear GP (LGP) with DirectX to interpret 3D images, with fitness determining the difference between target and rendered images.

Moving from GP using the GPU for more traditional image analysis, general purpose computation on GPUs (GPGPU) techniques were later tried using evolutionary algorithms. Yu et al. [9] use Cg to implement a GA on a GPU using a fine-grained parallel model where each point of a 2D grid is an individual, which itself becomes a parent with its best neighbor. The chromosome of each individual is divided sequentially into several segments that are distributed across a number of textures with the same position. Each chromosome segment consists of four genes in each of a pixel’s components, with a separate texture storing the fitness values of the pixel individuals. Their implementation incorporated fitness evaluation, selection, crossover, and mutation operators in shader programs on the GPU. For large populations, the GPU implementation was found to be faster than the one on the CPU for the regression benchmark (with this result being typical in EC-based GPGPU research). Given their hardware configuration (2005) of an AMD Athlon 2500+ CPU with 512 M RAM and an Nvidia GeForce 6800GT, the authors achieved speedups of 1.4× to 20.1× for genetic operators and 0.3× to 17.1× for fitness evaluations for populations ranging from 1,024 to 262,144 for a regression benchmark.^{Footnote 3}

Fok et al. [4, 10] implemented EP (evolutionary programming) on the GPU. The individuals in a population are represented as textures on the GPU, as are fitness, random number, and indexing requirements. They determine that it is most effective to implement mutation, reproduction, and fitness evaluation with the GPU while the CPU performs competition and selection (where GPU versions of those functionalities were also tried). The authors achieved speedups of 1.25–5.02 for populations of 800–6,400 using five regression problems with a 2.4 GHz Pentium 4 with 512 MB of RAM and a GeForce 6800 Ultra video card. The lowest population size they tried, 400, did not improve in speed through the use of the GPU. GP (particularly Cartesian GP) is implemented by Harding and Banzhaf [11] using C# and Accelerator, with Accelerator handling the compilation of GP expressions into shader programs, execution of the shader programs, and the return of textures as array data. Fitness cases were evaluated in parallel on the GPU. Using sextic polynomial regression, the authors achieve speedups ranging from 0.02 to 95.37 testing combinations of maximum expression length {10, 100, 1,000, 10,000} and number of fitness cases {10, 100, 1,000, 2,000}. Chitty [12] implements a tree-based GP system, using OpenGL to create data textures and converting tree GP individuals to Cg shader programs for evaluation on the GPU. Langdon and Banzhaf [13] created a GPU-based interpreter using RapidMind and C++ that operates on stack-based GP trees. Their goal was to map a population of different individual programs to the GPU and evaluate the population in parallel.

Wilson and Banzhaf presented the first instance of parallel linear GP (LGP) using a graphics processing unit, deployed to PC and XBox 360, in [14]. In the previous work, fitness evaluation on the CPU maintained a loop over all fitness cases. Within that loop, the CPU iterated over n instructions, where n was the number of instructions in each individual. For each iteration of the fitness case loop, the shader on the GPU processed one instruction in all individuals in parallel, with the current values in each GP individual’s registers passed in and out of the shader as single row texture. This method of evaluating the fitness allowed for the possibility of tracking the contents of registers, but was considerably more computationally expensive than the implementation described in detail in this work. The fitness function has since been optimized to include iteration over fitness cases and instructions in each individual for PC and iteration over all instructions (but not fitness cases) for the XBox 360. Furthermore, in [14] the authors did not implement fitness evaluation on the GPU of the XBox 360, but this is now accomplished in the current work (the previous work implemented only mutation using the XBox 360 GPU). Crossover is not implemented, as the authors felt that mutation provided sufficient variation to meet the goal of having implemented a working GP system on heterogeneous platforms. The authors wished to keep the algorithm as simple as possible in terms of computational time and memory overhead due to the programming of the unique hardware within the heterogeneous devices to which the implementation was to be ported.^{Footnote 4} This work describes the finalized algorithm, where the solution incorporates a new fitness function that allows for faster execution on the GPU. Once the fitness function was refined, differences in hardware prevented the authors from trivially porting the PC version of their GPU fitness function to the Xbox 360. 
To create a fitness function tailored to the Xbox 360 hardware, the authors implemented a number of changes to the fitness function component of the algorithm. Thus, both fitness and mutation can now be implemented on both the PC and XBox 360 platforms. This work represents a much more substantial explanation of the requirements for deployment on the PC and video game platforms than [15].

3 Linear genetic programming and its GPU parallel programming implementation

The first part of this section describes the general form of a linear GP (LGP) program. In particular, the instructions comprising a typical LGP individual and the genetic operator of mutation are discussed. The second part of this section provides details of how LGP is implemented for parallel processing using GPUs.

3.1 Linear genetic programming: brief overview

In Linear Genetic Programming (LGP), each individual is a sequence of instructions of an imperative programming language like C (or lower level languages like machine code). Usually, the instructions under evolution have a particular structure known to the evolutionary process: each instruction consists of an operator, a target, and two sources (each source with an associated indicator component, or “flag”). A valid linear GP instruction takes the general form

$$ \text{target} = \textit{src1} \; \textit{op} \; \textit{src2} \tag{1} $$

where it performs the operation op on values from either source registers or constant terminal inputs of the program represented by the sources (src1 and src2), and places the result in the target register (see Fig. 1). Their respective flags determine whether the two source registers src1 and src2 refer to the source registers themselves or to the program inputs. Each instruction can therefore be represented by four integer values, src1, src2, target, op. The use of linear (bit) sequences in GP has been pioneered by N. Cramer with his JB language [16], and was later applied to machine language by Nordin et al. [17]. In recent years, a large number of LGP implementations have appeared [17–19]. Figure 2 shows a typical LGP program where the function set consists of arithmetic operators.

Figure 2 shows an instantiated instruction set with each instruction consisting of a target, followed by an operation on two sources. In the first line of this instruction set, r[0] is the target of Eq. 1, “+” is the operator (op), r[5] is a value from the sixth internal register specified by src1, and 11 is a constant input value specified by src2. The value 11 is drawn from the program input for src2 rather than drawing an input from an internal register due to the value of its accompanying flag. Below, we shall apply GP to a regression benchmark problem. In that example (a sextic polynomial), the integer variable op, op = {0, 1, 2, 3}, indicates one of four operators ADD, SUB, MUL, or DIV. Following Eq. 1, the integer variable target, target = {0, 1, 2, 3}, specifies one of four target registers. The integer variables src1 and src2 indicate either data from a fitness case or a register, based on the value of the respective flag variables f1 and f2 (not shown in generic instruction), which can have one of the values {0, 1}. The regression problem in this paper does not require control flow statements (conditionals and loops), and thus they are not encoded as possible instructions. However, general linear GP does allow the use of control flow statements simply through their inclusion in the set of operators (typically in addition to arithmetic operators). The control flow operators may also create general instruction forms in addition to Eq. 1.
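The instruction form of Eq. 1 can be illustrated with a minimal interpreter sketch in Python. This is not the authors' C#/HLSL code. The tuple layout `(op, target, flag1, src1, flag2, src2)`, the convention that a flag of 1 selects a program input while 0 selects a register, and the protected division that returns 1.0 on a zero divisor are all assumptions made for illustration.

```python
# Operators indexed as in the regression example: 0=ADD, 1=SUB, 2=MUL, 3=DIV.
OPS = {
    0: lambda a, b: a + b,
    1: lambda a, b: a - b,
    2: lambda a, b: a * b,
    3: lambda a, b: a / b if b != 0 else 1.0,  # protected DIV (assumed convention)
}

def fetch(flag, src, registers, inputs):
    """Assumed flag meaning: 1 reads a program input, 0 reads a register."""
    return inputs[src] if flag == 1 else registers[src]

def run_program(program, inputs, num_registers=4):
    """Execute instructions in order; each one is target = src1 op src2."""
    registers = [0.0] * num_registers
    for op, target, flag1, src1, flag2, src2 in program:
        a = fetch(flag1, src1, registers, inputs)
        b = fetch(flag2, src2, registers, inputs)
        registers[target] = OPS[op](a, b)
    return registers

# Hypothetical three-instruction individual with a single input x = 3.0.
program = [
    (0, 0, 1, 0, 1, 0),  # r[0] = x + x
    (2, 1, 0, 0, 1, 0),  # r[1] = r[0] * x
    (1, 0, 0, 1, 0, 0),  # r[0] = r[1] - r[0]
]
result = run_program(program, inputs=[3.0])
print(result)  # [12.0, 18.0, 0.0, 0.0]
```

Because every field is a small integer, any combination of in-range values is an executable instruction, which is what makes the mutation operator described next safe.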

The genetic operation applicable to LGP individuals in this work is mutation, where an instruction (a single line of the LGP individual) is chosen using a random uniform distribution. The selected instruction then has each of its integer values changed to a random value within its acceptable range of values. There is no change in program size, as a single instruction is simply manipulated in place. Since all integer values within an instruction are mutated so that they remain within an acceptable range, no infeasible instruction strings are generated.
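A sequential version of this mutation operator might look like the following Python sketch. The field names and the source index range of four are assumptions. The operator, target, and flag ranges follow the regression example in the text.

```python
import random

# Number of legal values per field; src ranges are assumed for illustration.
RANGES = {"op": 4, "target": 4, "flag1": 2, "src1": 4, "flag2": 2, "src2": 4}
FIELDS = tuple(RANGES)

def random_instruction(rng):
    """Draw every field uniformly from its own legal range."""
    return tuple(rng.randrange(RANGES[field]) for field in FIELDS)

def mutate(individual, rng):
    """Return a copy with one uniformly chosen instruction re-randomized."""
    child = list(individual)
    position = rng.randrange(len(child))  # uniform choice of instruction
    child[position] = random_instruction(rng)
    return child

rng = random.Random(0)
parent = [random_instruction(rng) for _ in range(16)]
child = mutate(parent, rng)
```

The child always has the same length as the parent and differs from it in at most one instruction, so program size is fixed and every field stays in range by construction.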

3.2 GPGPU version of linear genetic programming

This work describes how to implement components of linear genetic programming in parallel on a graphics processing unit (GPU). To do this, a technique known as general purpose computation on GPUs (GPGPU) is used. GPUs typically consist of a number of processors that operate in parallel to perform “single instruction multiple data” (SIMD) processing where all processors on the GPU simultaneously execute the same program on different data. In particular, the GPU simultaneously processes each pixel in the texture. Each pixel consists of four components (we use xyzw as the components, but rgba is equally as effective in GPGPU programming), see Sect. 2 for details on pixel components. The shader program on a GPU processes all components on every pixel at one time and outputs a vector of four floating point numbers for each pixel. Figure 3 summarizes the execution of a pixel shader program on the GPU.

In Fig. 3, each channel (xyzw) of each pixel in the input (top) 8 × 1 pixel texture is processed at the same time by the shader program on the GPU (darkened box, middle of Fig. 3). Thus, every x channel in every pixel is multiplied by 5, every y channel is increased by 3, and so on, by the end of the execution of the shader program on the GPU. The resulting texture that is output by the GPU is shown at the bottom of Fig. 3. In our LGP GPU-based implementation, the inputs to the GPU are either the individuals themselves or associated problem data to be represented as textures. Numeric values corresponding to each segment of an instruction in an individual are stored as arrays in an XNA data type so they can be processed by the GPU as a texture. Our representation of an instruction uses four floats per pixel, one float for each of the 4 color components (xyzw) of the pixel. The fitness cases are stored on a separate (read only) texture object, and also use four floats per pixel. These texture-based data structures are described in greater detail in Sects. 5 and 6.
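The SIMD behavior described for Fig. 3 can be emulated on the CPU with a short Python sketch. The same kernel is applied to every pixel of an 8 × 1 texture. A GPU would do this for all pixels at once, whereas the list comprehension here visits them one after another. The text specifies only the x and y operations, so passing z and w through unchanged is an assumption.

```python
def shader(pixel):
    """Per-pixel kernel: x is multiplied by 5 and y is increased by 3."""
    x, y, z, w = pixel
    return (x * 5.0, y + 3.0, z, w)  # z and w left unchanged (assumption)

def run_on_texture(kernel, texture):
    """Apply one kernel to every pixel; sequential stand-in for the GPU."""
    return [kernel(pixel) for pixel in texture]

# An 8 x 1 texture whose pixel i holds (i, i, i, i).
texture = [(float(i),) * 4 for i in range(8)]
output = run_on_texture(shader, texture)
print(output[2])  # (10.0, 5.0, 2.0, 2.0)
```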

An instruction is encoded as two corresponding pixels. Two additional pieces of information are encoded on the pixels for convenience of reference when developing the program, but neither is required for execution: an integer id used to label the individual and an integer PC (program counter) to label the current instruction. There is no extra computational or space cost for adding this information, since only 6 of the 8 components across the 2 pixels used to represent an individual instruction are required for the instruction itself. Pixels of the first texture each contain the variables {operator, target, id, PC} corresponding to their four components and pixels of the second texture contain {flag1, source1, flag2, source2}.

The collection of pixel instructions makes up an LGP individual, and the collection of individuals is the population. Accordingly, two textures can collectively represent the entire GP population: a particular column in both textures represents an individual, and the two pixels in the same location in both textures represent a unique instruction from the individual. The pixel-based width of these textures is the number of individuals in the population, and the pixel-based height of the textures is the number of instructions per individual. The length of all individuals (also the height of the population textures) is fixed at 16 for these experiments; we found this length adequate to solve our chosen regression problem and did not wish to consume additional GPU texture memory. The representation of instruction, individual, and population is shown in Fig. 4.
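The two-texture layout can be sketched in Python with nested lists standing in for XNA texture objects. The function names and the `Instruction` tuple are hypothetical. The component order {operator, target, id, PC} and {flag1, source1, flag2, source2} follows the text.

```python
from collections import namedtuple

Instruction = namedtuple("Instruction", "op target flag1 src1 flag2 src2")

def pack_population(population):
    """Build two grids: column = individual, row = instruction index."""
    width = len(population)        # number of individuals
    height = len(population[0])    # instructions per individual
    first = [[None] * width for _ in range(height)]
    second = [[None] * width for _ in range(height)]
    for ind_id, individual in enumerate(population):
        for pc, ins in enumerate(individual):
            # Four floats per pixel, mirroring the xyzw components.
            first[pc][ind_id] = (float(ins.op), float(ins.target),
                                 float(ind_id), float(pc))
            second[pc][ind_id] = (float(ins.flag1), float(ins.src1),
                                  float(ins.flag2), float(ins.src2))
    return first, second

def unpack_instruction(first, second, ind_id, pc):
    """Read one instruction back from the same location in both grids."""
    op, target, _id, _pc = first[pc][ind_id]
    flag1, src1, flag2, src2 = second[pc][ind_id]
    return Instruction(int(op), int(target), int(flag1),
                       int(src1), int(flag2), int(src2))

# Three hypothetical individuals of the fixed length 16.
population = [
    [Instruction((i + pc) % 4, pc % 4, i % 2, pc % 4, (i + 1) % 2, (i * pc) % 4)
     for pc in range(16)]
    for i in range(3)
]
first_texture, second_texture = pack_population(population)
```

Only six of the eight stored components are needed to recover an instruction. The id and PC components are carried along purely as labels, which is why `unpack_instruction` discards them.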

4 General linear genetic programming method for heterogeneous devices

The previous section described texture-based data structures suitable for use in parallel processing on a GPU, while the current section describes the general algorithm built to use these structures that is universal to the CPU and the GPU. Independent of CPU and GPU-related algorithm design considerations, there are practical hardware considerations when implementing the general GP algorithm on both PC and Xbox 360 platforms. These issues are discussed in Sect. 4.1. Section 4.2 then describes the general linear genetic programming algorithm that we aim to implement on both architectures (thus the algorithm is designed for the XNA framework) independent of CPU or GPU.

4.1 Architecture issues

The design decisions surrounding the way the GPU is used by the algorithm were determined by the authors based on the nature of the underlying hardware. GPU-based GP implementations can use the GPU in one of two ways: the GPU instructions can represent the instructions within an individual of the population, or the GPU instructions can represent an interpreter that executes the instruction set from an individual that is passed to it over all fitness cases. The former approach requires dynamic compilation of shaders; that is, a shader program must be compiled and re-loaded to the GPU whenever selection occurs. In the latter approach, the GPU program is compiled and loaded only once; thus the GPU program is static (fixed) throughout program execution. Practitioners choose either approach based on problem implementation or preference. In the case of the XBox 360 implementation, however, the choice was obvious because the authors were only successful at compiling the shader components during initial compilation of the entire program. Thus, we use the interpreter approach for the GPU for both the PC and XBox 360. As the design considerations are largely hardware dependent, the limitations described in this section will likely change in the future with new devices. An issue that prompted two separate implementations for the platforms was that for the Xbox 360, there was an added restriction enforced by the microcode compiler upon compilation of the shader program: when a texture was used to hold the fitness cases, referencing it from within the inner of two nested loops did not allow the shader to compile. Thus, iteration over the instructions of an individual in a loop nested within a loop iterating over fitness cases was not possible for our application on the video game console, but was possible on the PC. To attempt to maximize the respective parallelization capabilities of the platforms’ GPU hardware, the authors allowed the general methodology to diverge in this respect. That is, to preserve the optimized PC GPU shader technique, separate implementations were required because the XBox 360 would not compile the authors’ PC shader technique that used the nested loops. While it is possible to create platform-specific programs using the XNA framework to form a single class for both platforms, the inability to compile the shader prevented that possibility for the authors’ implementation. Despite the two slightly modified shader techniques and the associated CPU fitness evaluation initiating the shader techniques, the procedure for deployment is still considerably general (see Sect. 4.2 for details).
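The difference between the two loop structures can be sketched in Python. This is an illustration under stated assumptions, not the authors' shader code. The target function is taken to be the commonly used sextic polynomial x^6 − 2x^4 + x^2, register r[0] is taken as the program output, a flag of 1 is taken to select the input x, and moving the fitness-case loop to the host for the Xbox 360 variant is an inference from the text.

```python
def target_fn(x):
    # Assumed benchmark target: the widely used sextic polynomial.
    return x**6 - 2 * x**4 + x**2

def step(registers, instruction, x):
    """Apply one (op, target, flag1, src1, flag2, src2) instruction in place."""
    op, target, f1, s1, f2, s2 = instruction
    a = x if f1 else registers[s1]
    b = x if f2 else registers[s2]
    if op == 0:
        registers[target] = a + b
    elif op == 1:
        registers[target] = a - b
    elif op == 2:
        registers[target] = a * b
    else:
        registers[target] = a / b if b != 0 else 1.0  # protected DIV (assumption)

def fitness_nested(individual, cases):
    """PC-style interpreter: both loops run inside one 'shader' call."""
    error = 0.0
    for x in cases:                     # outer loop over fitness cases
        registers = [0.0] * 4
        for instruction in individual:  # inner loop over instructions
            step(registers, instruction, x)
        error += abs(registers[0] - target_fn(x))
    return error

def case_error(individual, x):
    """Xbox-360-style pass: loops over instructions for a single case."""
    registers = [0.0] * 4
    for instruction in individual:
        step(registers, instruction, x)
    return abs(registers[0] - target_fn(x))

def fitness_split(individual, cases):
    """Host-side loop over fitness cases, one interpreter pass per case."""
    error = 0.0
    for x in cases:
        error += case_error(individual, x)
    return error

square = [(2, 0, 1, 0, 1, 0)]  # hypothetical individual: r[0] = x * x
cases = [-1.0, 0.0, 1.0]
print(fitness_nested(square, cases), fitness_split(square, cases))  # 2.0 2.0
```

Both variants interpret a fixed program representation rather than compiling each individual, so the interpreter is written once and only the data changes between generations. They compute the same fitness; what differs is how much work a single pass performs.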

Hardware architecture considerations that we discovered also affected the allowable parameters across implementations. To keep experiments more consistent, the most restrictive setting between the two platforms, as determined by querying the hardware devices, was used for both platforms. Parameterization related to population size is affected in our implementation by the size of the video card backbuffer (the memory area where the textures are drawn instead of the screen so they can be retrieved), which limits the texture size corresponding to the population. In particular, the width of the result texture is the number of individuals in the population. The XBox 360 used textures with pixel dimensions of up to 8,192 × 8,192. Preliminary trials of our implementation found that 400 individuals ran acceptably on the XBox 360, so the maximum population tested is 400. In addition to population-related GP parameters, the number of possible fitness cases is also impacted. The Xbox 360 features a maximum shader constant limitation of 256, so for the Xbox 360 implementation we restricted the number of fitness cases to 200 or under. Fitness cases were loaded into memory as an array rather than as a texture as on PC (due to the microcode compilation issues associated with nested loops mentioned earlier in this section). The number of fitness cases on the PC is limited by the number of instructions that the shader is composed of following the unrolling of the nested fitness case and individual instruction loops. Unrolling of the loops is done by the shader program (HLSL) compiler, which translates the instructions within all nested loops into a completely sequential version of those instructions to be used by the GPU. Even using shader options (such as preferring dynamic control flow) and directives to dynamically compile the shader, the authors found that the unrolling of the loops on the PC still occurred. Regardless, the preliminary tests using the XBox 360 prompted a limit of 200 fitness cases for our implementations.

4.2 General methodology

We implement linear GP on two different hardware platforms using a general methodology while adhering to the architectural requirements of both platforms mentioned in the preceding section. The framework used for programming both platforms is Microsoft’s publicly available XNA Game Studio, where this work uses version 3.0. The authors were unable to get Microsoft Research’s Accelerator, a tool for general purpose programming of the GPU, to operate with the XNA framework. The only other means of programming both platforms that the authors are aware of is Microsoft’s XBox 360 Development Kits professional developer tools, which are generally restricted to established video game development companies [20]. These professional tools require special approval and licenses. Thus, we did not use these tools. However, a recent publication described the use of these professional tools to perform scientific computing using GPGPU programming on the XBox 360 in a medical application [21].

Programming of shaders using the HLSL shader language is used to provide GPU access using the XNA framework. CPU-side programming is with C# in Microsoft’s Visual Studio 2008 development environment, where GPU shader programs are invoked using C# commands from the XNA framework. The shader programs themselves are programmed with Microsoft’s HLSL. Implementation of LGP begins by creating a project using an XNA Framework Windows (or XBox 360) Game template. Upon creating the project with either template, the user will see two C# files containing a class: Game1.cs and Program.cs. Program.cs is a wrapper class for our purposes, containing only a Main method that begins a program (a game in the context of an XNA project). The Main method begins by creating an instance of the game class and invoking the Run method. The second file is called Game1.cs and contains the Game1 class by default, which contains the methods of greatest interest to our implementation (and most other standard games). The class contains the methods Initialize, LoadContent, UnloadContent, Update, and Draw. The Initialize method is used to query services and handle non-graphics related content. The LoadContent method is automatically called only once per run and is used to initialize objects necessary to draw graphics, load effect files (files containing HLSL shader programs to be run), and load textures that will be drawn to the screen. UnloadContent is also automatically called once, and removes all loaded content. The Update method is used to check if a particular state is true during game play, including whether or not the user has pressed a certain key. The Draw method draws the current state of the game to the screen. The linear GP algorithm is programmed using the methods and functionality of the default Game class in a typical XNA framework project. Thus, in actuality, we are creating a Genetic Programming game that runs an example GP program and displays results on screen. 
The general GP method that we wish to implement in this framework is described in Table 1.
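How a GP run fits into this game lifecycle can be sketched with a Python stand-in. The class below only mirrors the method names mentioned in the text and is not the XNA API. The call order is simplified, and running one GP generation per Update call is an assumption made for illustration.

```python
class LgpGame:
    """Python stand-in mirroring the method names of an XNA game class."""

    def __init__(self, max_generations):
        self.max_generations = max_generations
        self.generation = 0
        self.log = []

    def initialize(self):
        self.log.append("Initialize")     # non-graphics setup

    def load_content(self):
        self.log.append("LoadContent")    # effect files and textures, once

    def update(self):
        self.generation += 1              # one GP generation (assumption)
        self.log.append("Update")

    def draw(self):
        self.log.append("Draw")           # show current results on screen

    def unload_content(self):
        self.log.append("UnloadContent")  # release loaded content, once

    def run(self):
        """Simplified lifecycle: set up once, loop Update/Draw, tear down."""
        self.initialize()
        self.load_content()
        while self.generation < self.max_generations:
            self.update()
            self.draw()
        self.unload_content()

game = LgpGame(max_generations=3)
game.run()
```

Placing the evolutionary step in the update phase and result display in the draw phase is what lets the program show results as they evolve.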

Table 1 General linear genetic programming (LGP) algorithm
