
1 Overview

An intelligent system and evolution are intrinsically related: it is difficult to conceive of intelligence without evolution, because intelligence cannot be static. Human beings create, adapt, and replace their own rules throughout their whole lives. The idea of applying evolution to a fuzzy system is an attempt to construct a mathematical assembly that can approximate human-like reasoning and learning mechanisms [1]. A mathematical tool that has been successfully applied to represent different forms of knowledge is fuzzy logic (FL); if-then rules are also a good way to express human knowledge, so the application of FL to a rule-based system leads to a fuzzy rule-based system (FRBS). Unfortunately, an FRBS is not able to learn by itself; the knowledge needs to be derived from an expert or generated automatically with an evolutionary algorithm (EA) such as a genetic algorithm (GA) [2].

The use of GAs to design machine learning systems constitutes the soft computing paradigm known as the genetic fuzzy system, where the goal is to incorporate learning into the system or to tune different components of the FRBS. Other proposals in the same line of work are genetic fuzzy neural networks, genetic fuzzy clustering, and fuzzy decision trees. A system with the capacity to evolve can be defined as a self-developing, self-learning, fuzzy rule-based or neuro-fuzzy system with the ability to self-adapt its parameters and structure online [3].

Figure 76.1 shows the general structure of an evolutionary FRBS (EFRBS) that can be used for tuning or learning purposes. Although it is difficult to make a clear distinction between tuning and learning, the particular aspects of each process can be summarized as follows. The tuning process is assumed to work on a predefined rule base, with the goal of finding the optimal set of parameters for the membership functions and/or scaling functions. The learning process, on the other hand, requires a more elaborate search in the space of possible rule bases, or in the whole knowledge base, as well as for the scaling functions. Since the learning approach does not depend on a predefined set of rules and knowledge, the system can change its fundamental structure with the aim of improving its performance according to some criteria. The idea of using scaling functions for input and output variables is to normalize the universe of discourse in which membership functions were defined.

Fig. 76.1

General structure of an evolutionary fuzzy rule-based system

According to De Jong [4]:

the common denominator in most learning systems is their capability of making structural changes to themselves over time with the intent of improving performance on tasks defined by the environment, discovering and subsequently exploiting interesting concepts, or improving the consistency and generality of internal knowledge structures.

Hence, it is important to have a clear understanding of the strengths and limitations of a particular learning system, to achieve a precise characterization of all the permitted structural changes and how they are going to be made.

De Jong identifies three levels of complexity at which the GA can make legal structural changes in pursuit of a goal [4]:

  1. By changing critical parameters' values

  2. By changing key data structures

  3. By changing the program itself, with the idea of achieving effective behavioral changes in a task subsystem; a prominent representative of this branch is the learning production-systems program.

One reason for the success of production systems in machine learning is that they have a representation of knowledge that can simultaneously support two kinds of activities: (1) the knowledge can be treated as data that can be manipulated according to some criteria; (2) for a particular task, the knowledge can be used as an executable entity.

The two classical approaches for working with an evolutionary FRBS (EFRBS) as a learning system are the Pittsburgh and Michigan approaches. Historically, in 1975 Holland [5] affirmed that a natural way to represent an entire rule set is to use a string, i.e., an individual; so, the population is formed by candidate rule sets, and to achieve evolution it is necessary to use selection and genetic operators to produce new generations of rule sets. This was the approach taken by De Jong at the University of Pittsburgh, hence the name Pittsburgh approach. During the same period, Holland developed a model of cognition in which the members of the population are individual rules, and the entire population constitutes the rule set; this quickly became the Michigan approach [6, 7].
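The difference between the two encodings can be made concrete with a small sketch. It is only an illustration under stated assumptions: the rule format (a tuple of antecedent label indices plus a consequent label index), the constants `N_LABELS` and `N_INPUTS`, and all function names are hypothetical and do not come from the cited works.

```python
import random

# Hypothetical encoding: a rule is (antecedent label indices, consequent label index).
N_LABELS = 3   # assumed linguistic labels, e.g. 0 = Low, 1 = Medium, 2 = High
N_INPUTS = 2   # assumed number of input variables


def random_rule(rng):
    """Build one random fuzzy rule as (antecedent, consequent)."""
    antecedent = tuple(rng.randrange(N_LABELS) for _ in range(N_INPUTS))
    consequent = rng.randrange(N_LABELS)
    return (antecedent, consequent)


def pittsburgh_population(rng, pop_size, rules_per_set):
    """Pittsburgh style: each individual is an entire candidate rule set."""
    return [[random_rule(rng) for _ in range(rules_per_set)]
            for _ in range(pop_size)]


def michigan_population(rng, pop_size):
    """Michigan style: each individual is one rule; the population is the rule set."""
    return [random_rule(rng) for _ in range(pop_size)]


rng = random.Random(0)
pitt = pittsburgh_population(rng, pop_size=4, rules_per_set=5)  # 4 rule sets of 5 rules
mich = michigan_population(rng, pop_size=5)                     # 1 rule set of 5 rules
```

In the Pittsburgh layout, fitness is assigned to a whole rule set, whereas in the Michigan layout it has to be assigned to individual rules; the sketch shows only the data layout, not the credit assignment.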

There is extensive pioneering and recent work on tuning and learning using FRBSs; most of it falls in some way within the Michigan or the Pittsburgh approach. Examples include the supervised inductive algorithm [8, 9], the iterative rule learning approach [10], coverage-based genetic induction (COGIN) [11, 12], the relational genetic algorithm learner (REGAL) system [13], the compact fuzzy classification system [14], applications to fuzzy control [15, 16], and the tuning of type-2 fuzzy controllers [17, 18, 19, 20].

The focus of this chapter is on evolving embedded fuzzy controllers. This subclassification reduces the number of related works; however, they are still numerous, since by an embedded system (ES) we understand a combination of computer hardware (HW) and software (SW) devoted to a specific control function within a larger system. Typically, the HW of an ES can be a dedicated computer system, a microcontroller, a digital signal processor, or an FPGA-based system. If the SW of the ES is fixed, it is called firmware; because there are no strict boundaries between firmware and software, and the ES can be reprogrammed, firmware can be low level or high level. Low-level firmware tells the hardware how to work and typically resides in a read-only memory (ROM) or in a programmable logic array (PLA); high-level firmware can be updated, hence it is usually stored in flash memory, and it is often considered software.

In the literature, there is extensive work on successful applications of type-1 and type-2 fuzzy systems; with regard to evolving embedded fuzzy systems, they were applied in a control mechanism for autonomous mobile robot navigation in real environments in [21]. To further limit the content of this chapter, we have focused on EFRBSs to be implemented in an FPGA HW platform, with special emphasis on type-2 FRBSs. With respect to type-1 FRBSs, the following proposals drew our attention: the development of an FPGA-based proportional-differential (PD) fuzzy look-up table controller [22]; an FPGA implementation of embedded fuzzy controllers for robotic applications [23]; a non-fixed structure fuzzy logic controller [24]; a flexible architecture to implement a fuzzy controller in an FPGA [25]; and a very simple method for tuning the input membership functions (MFs) to modify the response of the implemented FPGA controller [26]. How to test and simulate the different stages of an FRBS for future implementation in an FPGA is explained in [27, 28, 29]. On type-1 EFRBSs, there are works such as a reconfigurable hardware platform for evolving a fuzzy system by using a cooperative coevolutionary methodology [30], and the tuning of input MFs for an incremental fuzzy PD controller using a GA [31]. In the type-2 FRBS category, less work has been reported; representative work can be listed as follows: an architectural proposal of a hardware-based interval type-2 fuzzy inference engine for FPGAs is presented in [32]; the parallel HW implementation of an interval type-2 fuzzy logic controller using bespoke coprocessors handled by a soft-core processor is explored in [33]; and a high-performance interval type-2 fuzzy inference system (IT2-FIS) that can complete the four stages of fuzzification, inference, KM type reduction, and defuzzification in four clock cycles is shown in [34]; the same system is suitable for pipelined implementation, providing the complete IT2-FIS process in just one clock cycle.

This work deals with the development of evolving embedded type-1 and type-2 fuzzy controllers. In the chapter, a broad exploration of several ways to implement evolving embedded fuzzy controllers is presented. We choose to work with the Mamdani fuzzy controller proposal since it provides a highly flexible means to formulate

The organization of this chapter is as follows. In Sect. 76.2 we present the basics of T1 and T2 FL to explain how to achieve the HW implementation of an FRBS. In Sect. 76.3 a brief description of the state of the art in hosting technology for high-performance embedded systems is given.

2 Type-1 and Type-2 Fuzzy Controllers

Type-2 fuzzy sets (T2FSs) were developed with the aim of handling uncertainty better than type-1 fuzzy sets (T1FSs) do, since a T1FS has crisp grades of membership, whereas a T2FS has fuzzy grades of membership. An important point to note is that if all uncertainty disappears, a T2FS reduces to a T1FS. A type-2 membership function (T2MF) is an FS that has primary and secondary membership values; the primary MF is a representation of an FS, and serves to create a linguistic representation of some concept with linguistic and random uncertainties, with limited capabilities; the secondary MF allows capturing more about linguistic uncertainty than a T1MF.

There are two common ways to use a T2FS: the generalized T2FS (GT2) and the interval T2FS (IT2FS). The former has secondary membership grades of different values to represent the existing uncertainty more accurately; in an IT2FS, on the other hand, the secondary membership value is always 1. Unfortunately, for GT2 no one yet knows how to choose the best secondary MFs; moreover, this method introduces many computations, making it inappropriate for current application in real-time (RT) systems, even those with small time constraints; in contrast, the calculations are easy to perform in an IT2FS.

A T2MF can be represented using a 3-D figure that is not as easy to sketch as a T1MF. A more common way to visualize a T2MF is to sketch its footprint of uncertainty (FOU) on the 2-D domain of the T2FS. We illustrate this concept in Fig. 76.2, which shows a vertical slice of the FOU at the primary MF value $x$. For a GT2, in the upper right part of the figure, the secondary MF shows different height values; for an IT2FS, just below, the secondary MF has uniform values. Note that the secondary values sit on top of the FOU.

Fig. 76.2

Type-2 membership function. For the triangular MF the FOU is shown. The FOU is bounded by the upper part UMF($\tilde{A}$) and the lower part LMF($\tilde{A}$). A vertical slice at $x$ is illustrated. Right, top: secondary MF values for a generalized T2MF; bottom: secondary MF values of an IT2MF

Figure 76.3 shows the main components of a fuzzy logic system and the differences between the T1 and T2 fuzzy controllers (FCs). A T1 system has three components: the fuzzifier, the inference engine, and the defuzzifier, which is the only output processing unit. A T2 system has four components, since its output processing unit consists of the type reducer (TR) block connected to the defuzzifier.

Fig. 76.3

Type-1 and type-2 FC. The T2FC at the output processing has the type reducer block

Ordinary fuzzy sets were developed by Zadeh in 1965 [35]; they are an extension of classical set theory in which the concept of membership is extended to have various grades on the real continuous interval $[0, 1]$. The original idea was to use a fuzzy set (FS), i.e., a linguistic term, to model a word; however, after almost 10 years, Zadeh introduced the concept of type-n FS as an extension of an ordinary FS (T1FS), with the idea of blurring the degrees of membership [36].

T1FSs have been demonstrated to work efficiently in many applications; most of them use the mathematics of fuzzy sets but lose the focus on words, since the sets are mainly used to represent a function, which is more mathematical than linguistic [37].

A T1FS is a set of ordered pairs represented by (76.1) [38],

$$A = \{ (x, \mu_A(x)) \mid x \in X \}, \tag{76.1}$$

where each element is mapped to $[0, 1]$ by its MF $\mu_A$, where $[0, 1]$ means the real numbers between 0 and 1, including the values 0 and 1,

$$\mu_A(x) : X \to [0, 1]. \tag{76.2}$$
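As a minimal sketch of (76.1) and (76.2), the snippet below builds a discrete T1FS as a list of ordered pairs. The triangular shape, the helper name `tri_mf`, and the sample universe `X` are assumptions chosen for illustration; the chapter does not prescribe them at this point.

```python
def tri_mf(x, a, b, c):
    """Triangular T1 membership function with feet a and c and peak b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)   # rising edge
    return (c - x) / (c - b)       # falling edge


# Discrete universe of discourse (assumed sample points).
X = [0.0, 2.5, 5.0, 7.5, 10.0]

# The T1FS as ordered pairs (x, mu_A(x)), following (76.1).
A = [(x, tri_mf(x, 0.0, 5.0, 10.0)) for x in X]
# A == [(0.0, 0.0), (2.5, 0.5), (5.0, 1.0), (7.5, 0.5), (10.0, 0.0)]
```

Every grade in `A` is a single crisp number in $[0, 1]$, which is exactly the property that a T2FS relaxes.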

A pointwise definition of a T2FS is given as follows: $\tilde{A}$ is characterized by a T2MF $\mu_{\tilde{A}}(x, u)$, where $x \in X$ and $u \in J_x \subseteq [0, 1]$, i.e. [39],

$$\tilde{A} = \{ ((x, u), \mu_{\tilde{A}}(x, u)) \mid x \in X, \; u \in J_x \subseteq [0, 1] \}, \tag{76.3}$$

where $0 \le \mu_{\tilde{A}}(x, u) \le 1$.

Another way to express $\tilde{A}$ is

$$\tilde{A} = \int_{x \in X} \int_{u \in J_x} \mu_{\tilde{A}}(x, u) / (x, u), \quad J_x \subseteq [0, 1], \tag{76.4}$$

where $\iint$ denotes the union over all admissible input variables $x$ and $u$. For discrete universes of discourse, $\int$ is replaced by $\sum$ [39]. In fact, $J_x \subseteq [0, 1]$ represents the primary membership of $x \in X$, and $\mu_{\tilde{A}}(x, u)$ is a T1FS known as the secondary set. Hence, a T2MF can be any subset in $[0, 1]$, the primary membership, and corresponding to each primary membership there is a secondary membership (which can also be in $[0, 1]$) that defines the uncertainty for the primary membership.

When $\mu_{\tilde{A}}(x, u) = 1$, where $x \in X$ and $u \in J_x \subseteq [0, 1]$, we have the IT2MF shown in Fig. 76.2. The uniform shading of the FOU represents the entire IT2FS, and it can be described in terms of an upper membership function and a lower membership function

$$\bar{\mu}_{\tilde{A}}(x) = \overline{\mathrm{FOU}(\tilde{A})} \quad \forall x \in X, \tag{76.5}$$
$$\underline{\mu}_{\tilde{A}}(x) = \underline{\mathrm{FOU}(\tilde{A})} \quad \forall x \in X. \tag{76.6}$$
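The upper and lower membership functions of (76.5) and (76.6) can be sketched for a triangular FOU. This is one possible construction, assumed here for illustration only: the FOU is generated by widening and narrowing the base of a triangle by a hypothetical `spread` parameter around a common peak. The names `tri` and `it2_membership` are likewise not from the chapter.

```python
def tri(x, a, b, c):
    """Triangular membership function with feet a and c and peak b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def it2_membership(x, a, b, c, spread):
    """Return the primary membership J_x = [lower, upper] of an IT2 triangular MF.

    Assumed construction: the upper MF uses a base widened by `spread`,
    the lower MF a base narrowed by `spread`; both share the peak b.
    For an IT2FS the secondary grade is 1 for every u in J_x.
    """
    upper = tri(x, a - spread, b, c + spread)
    lower = tri(x, a + spread, b, c - spread)
    return lower, upper


# Vertical slice of the FOU at x = 4 for a triangle (2, 5, 8) with spread 1.
lower, upper = it2_membership(4.0, 2.0, 5.0, 8.0, 1.0)
# lower == 0.5, upper == 0.75, so J_x = [0.5, 0.75]
```

Setting `spread` to zero makes the two bounds coincide, which mirrors the remark that a T2FS reduces to a T1FS when all uncertainty disappears.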

Figure 76.2 shows an IT2MF; the shaded region is the FOU. At the points $x_1$ and $x_2$, the primary MFs $J_{x_1}$ and $J_{x_2}$ and the corresponding secondary MFs $\mu_{\tilde{A}}(x_1)$ and $\mu_{\tilde{A}}(x_2)$ are also shown.

The basics and principles of fuzzy logic do not change from T1FSs to T2FSs [37, 40, 41]; they are independent of the nature of the membership functions and, in general, will not change for any type-n. When an FIS uses at least one type-2 fuzzy set, it is a type-2 FIS.

In this chapter we base our study on IT2FSs, so the IT2 FIS can be seen as a mapping from the inputs to the output, and it can be interpreted quantitatively as $Y = f(X)$, where $X = \{x_1, x_2, \dots, x_n\}$ are the inputs to the IT2 FIS $f$, and $Y = \{y_1, y_2, \dots, y_n\}$ are the defuzzified outputs. These concepts can be represented by rules of the form

$$\text{If } x_1 \text{ is } \tilde{F}_1 \text{ and } \dots \text{ and } x_p \text{ is } \tilde{F}_p, \text{ then } y \text{ is } \tilde{G}. \tag{76.7}$$

In a T1FC, where the output sets are T1FSs, defuzzification produces a number, which is in some sense a crisp representation of the combined output sets. In the T2 case, the output sets are T2, so an extended defuzzification operation is necessary to obtain a T1FS at the output. Since this operation converts T2 output sets to a T1FS, it is called type reduction, and the resulting T1FS is called a type-reduced set, which may then be defuzzified to obtain a single crisp number.

The TR stage is the most computationally expensive stage of the T2FC; therefore, several proposals to improve this stage have been developed. One of the first proposals was the iterative procedure known as the Karnik–Mendel (KM) algorithm.
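To make the iterative nature of the KM procedure concrete, the following is a minimal sketch of KM-style type reduction for an IT2-FIS. It assumes crisp rule consequents `y` and strictly positive firing intervals `[f_lo, f_hi]`; the function names and the brute-force check are illustrative additions, not the authors' implementation.

```python
def km_endpoint(y, f_lo, f_hi, left=True, max_iter=100):
    """Iterate toward one endpoint (y_l or y_r) of the type-reduced interval."""
    # Initialize with the midpoints of the firing intervals.
    w = [(lo + hi) / 2.0 for lo, hi in zip(f_lo, f_hi)]
    y_prime = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    for _ in range(max_iter):
        if left:
            # Left endpoint: upper firing strength at or below y', lower above it.
            w = [hi if yi <= y_prime else lo for yi, lo, hi in zip(y, f_lo, f_hi)]
        else:
            # Right endpoint: lower firing strength at or below y', upper above it.
            w = [lo if yi <= y_prime else hi for yi, lo, hi in zip(y, f_lo, f_hi)]
        y_new = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
        if abs(y_new - y_prime) < 1e-12:   # switch point no longer moves
            return y_new
        y_prime = y_new
    return y_prime


def brute_force_endpoint(y, f_lo, f_hi, left=True):
    """Reference result: try every switch point (costly, for checking only)."""
    order = sorted(range(len(y)), key=lambda n: y[n])
    candidates = []
    for k in range(len(y) + 1):
        num = den = 0.0
        for rank, n in enumerate(order):
            first_part = rank < k
            use_hi = first_part if left else not first_part
            w = f_hi[n] if use_hi else f_lo[n]
            num += w * y[n]
            den += w
        candidates.append(num / den)
    return min(candidates) if left else max(candidates)


y = [1.0, 2.0, 3.0]        # assumed rule consequents
f_lo = [0.2, 0.3, 0.1]     # assumed lower firing strengths
f_hi = [0.6, 0.8, 0.5]     # assumed upper firing strengths
y_l = km_endpoint(y, f_lo, f_hi, left=True)    # about 1.5
y_r = km_endpoint(y, f_lo, f_hi, left=False)   # about 2.3
y_crisp = (y_l + y_r) / 2.0                    # defuzzified output, about 1.9
```

The loop count depends on the data, which is the reason the TR stage is hard to bound in time and why the enhancements and closed-form alternatives listed next were proposed.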

In general, all the proposals can be classified into two big groups. Group I embraces all the algorithmic improvements and Group II all the hardware improvements, as follows [42]:

  1. Improvements to software algorithms, where the dominant idea is to reduce the computational cost of the IT2-FIS based on algorithmic improvements. This group can be subdivided into three subgroups.

    (a) Enhancements to the KM TR algorithm. As the name of the class suggests, the aim is to improve the original KM TR algorithm directly, to speed it up. The best-known algorithms in this class are:

      i. Enhanced KM (EKM) algorithms. They have three improvements over the original KM algorithm. First, a better initialization is used to reduce the number of iterations. Second, the termination condition of the iterations is changed to remove one unnecessary iteration. Finally, a subtle computing technique is used to reduce the computational cost of each iteration.

      ii. The enhanced Karnik–Mendel algorithm with new initialization (EKMANI) [43]. It computes the generalized centroid of general T2FSs. It is based on the observation that for two alpha-planes close to each other, the centroids of the two resulting IT2FSs are also close to each other. So, it may be advantageous to use the switch points obtained from the previous alpha-plane to initialize the switch points in the current alpha-plane. Although EKMANI was primarily intended for computing the generalized centroid, it may also be used in the TR of an IT2-FIS, because usually the output of an IT2-FIS changes only by a small amount at each step.

      iii. The iterative algorithm with stop condition (IASC). This was proposed by Melgarejo et al. [44] and is based on an analysis of the behavior of the firing strengths.

      iv. The enhanced IASC [45], which is an improvement of the IASC.

      v. Enhanced opposite directions searching (EODS), which is a proposal to speed up KM algorithms. The aim is to search in both directions simultaneously, and in each iteration the points L and R are the switching points.

    (b) Alternative TR algorithms. Unlike iterative KM algorithms, most alternative TR algorithms have a closed-form representation. Usually, they are faster than KM algorithms. Two representative examples are:

      i. The Gorzalczany method. A polygon is built using the firing strengths $[\underline{f}^n, \bar{f}^n]$ and $[y^1, y^N]$, which can be viewed as an IT2FS. The method computes an approximate membership value for each point. Here, $\underline{y}^n = \bar{y}^n = y^n$, for $n = 1, 2, \dots, N$.

        $$\mu(y) = \frac{\underline{f} + \bar{f}}{2} \left[ 1 - \left( \bar{f} - \underline{f} \right) \right], \tag{76.8}$$

        where $\bar{f} - \underline{f}$ is called the bandwidth. Then the defuzzified output can be computed as

        $$y_G = \arg\max_y \mu(y). \tag{76.9}$$

      ii. The Wu–Tan (WT) method. It searches for an equivalent T1FS. The centroid method is applied to obtain the defuzzified output. This is the fastest method in this category.

  2. Hardware implementation. The main idea is to take advantage of the intrinsic parallelism of the hardware and/or combinations of hardware and parallel programming. We divide this group into four main approaches that embrace the existing proposals for reducing the computation time of the type reduction stage through parallelism at different levels.

    (a) The use of multiprocessor systems, including multicore systems that provide the same benefits at a reduced cost. In this category are personal and industrial computers with processors such as the Intel Pentium Core Processor family, which includes the Intel Core i3, i5, and i7; the AMD Quad-Core Opteron; the AMD Phenom X4 Quad-Core processors; and multicore microcontrollers such as the Propeller P8X32A from Parallax or the F28M35Hx of the Concerto microcontroller family from Texas Instruments. Multicore processors can also be implemented in FPGAs.

    (b) The use of general-purpose GPU (GPGPU) computing and the compute unified device architecture (CUDA). In general, the GPU provides a new way to perform high-performance computing on hardware. In particular, IT2FCs can take great advantage of this technology because of their complexity. Before the development of CUDA technology, programming was achieved by translating a computational procedure into a graphics format so that it could be executed using the standard graphics pipeline, a process known as encoding data into a texture format. The CUDA technology of NVIDIA offers a parallel programming model for GPUs that does not require the use of a graphics application programming interface (API), such as OpenGL [46].

    (c) The use of FPGAs. This approach offers the best processing speed and flexibility. One of the main advantages is that the developer can determine the desired degree of parallelism through a trade-off analysis. Moreover, this technology allows us to use the strengths of all platforms in tight integration to provide the highest performance available at the present time. It is possible to have a standalone T1/IT2FC, or to integrate the same T1/T2FC as a coprocessor within a high-performance computing system.

    (d) The use of ASICs. The T1/T2FC is factory integrated using complementary metal-oxide-semiconductor (CMOS) technology. The main advantage is that they are cheaper than FPGAs. Unlike FPGA solutions, ASIC solutions are not field reprogrammable.
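Returning to the closed-form alternatives of Group I, the Gorzalczany method of (76.8) and (76.9) can be sketched in a few lines. This is a simplified illustration under stated assumptions: the membership value is evaluated only at the rule consequent points rather than over the full polygon, and the function name and sample data are hypothetical.

```python
def gorzalczany_defuzz(y, f_lo, f_hi):
    """Approximate membership per (76.8), then select the maximizer per (76.9).

    Simplification (assumed): mu is evaluated only at the points y[n].
    """
    mu = [((lo + hi) / 2.0) * (1.0 - (hi - lo))   # midpoint times (1 - bandwidth)
          for lo, hi in zip(f_lo, f_hi)]
    best = max(range(len(y)), key=lambda n: mu[n])
    return y[best], mu


y = [1.0, 2.0, 3.0]        # assumed rule consequents
f_lo = [0.2, 0.3, 0.1]     # assumed lower firing strengths
f_hi = [0.6, 0.8, 0.5]     # assumed upper firing strengths
y_g, mu = gorzalczany_defuzz(y, f_lo, f_hi)
# mu is approximately [0.24, 0.275, 0.18], so y_g == 2.0
```

There is no iteration, so the cost is fixed and known in advance, which is what makes closed-form methods attractive for hardware with hard timing constraints.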

A system based on an FPGA platform allows us to program all the Group I algorithms, since modern FPGAs have embedded hard and/or soft processors; this kind of system can be programmed using high-level languages such as C/C++ and can also incorporate operating systems such as Linux. On the other hand, T1/T2 FC hardware implementations have the advantage of providing systems that are competitively fast in comparison to ASIC systems, as well as in-field reconfigurability.

3 Host Technology

Until the beginning of this century, general-purpose computers with a single-core processor were the systems of choice for high-performance computing (HPC) in many applications; they replaced existing large and expensive computer architectures [47]. In 2001, IBM introduced a reduced instruction set computer (RISC) microarchitecture named POWER4 (performance optimization with enhanced RISC) [48]. This was the first dual-core processor embedded in a single die, and subsequently other companies introduced different multicore microprocessor architectures to the market, such as the Arm Cortex A9 [49], Sparc64 [50], Intel and AMD quad-core processors, Intel i7 processors, and others [51]. These developments, together with the rapid development of GPUs that offer massively parallel architectures for high-performance software, are an attractive choice for professionals, scientists, and researchers interested in speeding up applications. Undoubtedly, the use of a generic computer with GPU technology has many advantages for implementing an embedded learning fuzzy system [46]; its disadvantages are mainly related to size and power consumption. A solution to these problems is the use of application-specific integrated circuit (ASIC) fuzzy processors [52, 53, 54], or reprogrammable hardware based on microcontrollers and/or FPGAs.

The orientation of this chapter is towards tuning and learning using FRBSs for embedded applications; for now, we focus on FPGA and ASIC technology [55], since they provide the best level of parallelization. Both families of devices provide characteristics for HPC that the other options cannot. Each technology has its own advantages and disadvantages, and the differences are narrowing due to recent developments. In general, ASICs are integrated circuits designed to implement a single application directly in fixed hardware; therefore, they are very specialized for solving a particular problem. The cost of ASIC implementations is reduced for high volumes; they are faster and consume less power; it is possible to implement analog circuitry as well as mixed-signal designs, but the time to market can be a year or more. Several design issues must be addressed that do not arise with FPGAs, and the development tools are very expensive. On the other hand, FPGAs can be brought to market very quickly, since the user only needs a personal computer and low-cost hardware to download the hardware description language (HDL) code to the FPGA before it is ready to work. They can be remotely updated with new software since they are field reprogrammable. They have specific dedicated hardware such as blocks of random access memory (RAM); they also provide high-speed programmable I/O, hardware multipliers for digital signal processing (DSP), intellectual property (IP) cores, and microprocessors in the form of hard cores (factory implemented), such as PowerPC and ARM for Xilinx, or soft cores (user implemented), such as MicroBlaze and Nios for Xilinx and Altera, respectively. They can have built-in analog-to-digital converters (ADCs). The synthesis process is easier. A significant point is that HDL code developed and tested for FPGAs may be used in the design process of an ASIC.

FPGAs have three main disadvantages versus ASICs: FPGA devices consume more power than ASICs; the design is limited to the resources available in the FPGA; and they are suitable mainly for low-quantity production. To overcome these disadvantages, it is very important to achieve optimized designs, which can only be attained by coding efficient algorithms.

During the last decade, there has been increasing interest in evolving hardware through evolutionary computation applied to an embedded digital system. Although different custom chips have been proposed for this purpose, the most popular device is the FPGA, because its architecture is designed for general-purpose commercial applications. New FPGAs allow part of the programmed logic to be modified, or new logic to be added, at run time. This feature is known as dynamic or active reconfiguration, and because in an FPGA we can combine a multiprocessor system and coprocessors, FPGAs are very attractive for implementing evolvable hardware algorithms. Therefore, in the next sections, we put special emphasis on multiprocessor systems and FPGAs.

4 Hardware Implementation Approaches

In this section, an overview of the three main approaches to the hardware implementation of an intelligent system is given.

4.1 Multiprocessor Systems

Multiprocessor systems consist of multiple processors residing within one system; they have been available for many years. Multicore processors offer benefits equivalent to those of multiprocessors at a lower cost; the cores are integrated in the same electronic component. At present, most modern computer systems have many processors, which can be single-core or multicore; therefore, there are three different layouts for multiprocessing: a multicore system, a multiprocessor system, and a multiprocessor/multicore system. Figure 76.4 shows a multicore system embedded in a Virtex 5 FPGA XC5VFX70; it has the capacity to integrate a distributed multicore system with a hard-processor PowerPC 440 as the master, five MicroBlaze 32-bit soft-processor slaves, coprocessors, and peripherals. The capacity of the FPGA to integrate devices is, of course, limited by its size. Figure 76.5 shows the full implementation in the program memory of the multiprocessor system.

Fig. 76.4

Multicore system embedded into an FPGA. Embedded is a hard-processor PowerPC440 and five MicroBlaze soft-processors. In this system we can process an EA using the island model

Fig. 76.5

The whole embedded evolutionary IT2FC implemented in the program memory of the multiprocessor system, similarly as in a desktop computer
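The island model mentioned in the caption of Fig. 76.4 can be illustrated with a small sequential simulation. It is a sketch under several assumptions: a toy bit-counting fitness stands in for controller evaluation, ring migration of the best individual is one of many possible migration policies, and all names and parameter values are hypothetical. On the platform of Fig. 76.4 each island would presumably run on its own soft processor, with the master coordinating migration; here the islands are simply evolved one after another.

```python
import random


def fitness(ind):
    """Toy objective (assumed): maximize the number of ones in the bit string."""
    return sum(ind)


def evolve_island(pop, rng, generations):
    """Simple GA on one island: elitism, tournament selection, one-point crossover, mutation."""
    n = len(pop[0])
    for _ in range(generations):
        elite = max(pop, key=fitness)
        new_pop = [elite[:]]                            # keep the best individual
        while len(new_pop) < len(pop):
            p1 = max(rng.sample(pop, 2), key=fitness)   # binary tournament
            p2 = max(rng.sample(pop, 2), key=fitness)
            cut = rng.randrange(1, n)
            child = p1[:cut] + p2[cut:]                 # one-point crossover
            if rng.random() < 0.2:                      # occasional bit-flip mutation
                i = rng.randrange(n)
                child[i] = 1 - child[i]
            new_pop.append(child)
        pop = new_pop
    return pop


def island_model(n_islands=5, pop_size=8, n_bits=16, epochs=5,
                 gens_per_epoch=5, seed=0):
    """Evolve several islands and migrate the best individuals around a ring."""
    rng = random.Random(seed)
    islands = [[[rng.randint(0, 1) for _ in range(n_bits)]
                for _ in range(pop_size)] for _ in range(n_islands)]
    for _ in range(epochs):
        islands = [evolve_island(pop, rng, gens_per_epoch) for pop in islands]
        bests = [max(pop, key=fitness) for pop in islands]
        for i, pop in enumerate(islands):
            # The best of the previous island replaces the worst of this one.
            worst = min(range(len(pop)), key=lambda k: fitness(pop[k]))
            pop[worst] = bests[i - 1][:]
    return max((ind for pop in islands for ind in pop), key=fitness)


best = island_model()
```

The default of five islands mirrors the five MicroBlaze slaves in the figure; the migration interval (`gens_per_epoch`) controls how much inter-processor communication the scheme would require.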

4.2 Implementations into FPGAs

The architecture of FPGAs offers massive parallelism because they are composed of a large array of configurable logic blocks (CLBs), digital signal processing blocks (DSPs), block RAM, and input/output blocks (IOBs). Similarly to a processor's arithmetic logic unit (ALU), CLBs and DSPs can be programmed to perform arithmetic and logic operations such as compare, add/subtract, multiply, and divide. In a processor, ALU architectures are fixed because they have been designed in a general-purpose manner to execute various operations. CLBs can be programmed using just the operations needed by the application, which results in increased computational efficiency. An FPGA therefore consists of a set of programmable logic cells manufactured into the device according to a connection paradigm to build an array of computing resources; the resulting arrangement can be classified into four categories: symmetrical array, row-based, hierarchy-based, and sea of gates [56]. Figure 76.6 shows a symmetrical array-based FPGA that consists of a two-dimensional array of logic blocks immersed in a set of vertical and horizontal lines; examples of FPGAs in this category are Spartan and Virtex from Xilinx, and the Atmel AT40K. In Fig. 76.6 three main parts can be identified: a set of programmable logic cells, also called logic blocks (LBs) or configurable logic blocks (CLBs); a programmable interconnection network; and a set of input and output cells around the device.

Fig. 76.6

Symmetrical array-based FPGA architecture (island style)

Embedded programmable logic devices usually integrate one or several processor cores, programmable logic, and memory on the same chip (an FPGA) [56]. FPGAs have developed remarkably in the last two decades: they have moved well beyond the tiny devices with a few thousand gates that were used in small applications such as finite state machines, glue logic for complex devices, and very limited CPUs. Over a 10-year period, a 200 % growth rate in the capacity of Xilinx FPGA devices was observed, together with a 50 % reduction rate in power consumption and a significant decrease in prices. Other FPGA vendors, such as Actel and Altera, show similar developments, and this trend continues. These developments, together with the progress in development tools that include software and low-cost evaluation boards, have boosted the acceptance of FPGAs for different technological applications.

4.2.1 Development Flow

The development flow of an FPGA-based system consists of the following major steps:

  1. Write the VHDL code that describes the system's logic; usually a combined top-down and bottom-up methodology is used. For example, to design an IT2FC, the following procedure is required:

    (a) Describe the design entity, where the designer defines the inputs and outputs of the top VHDL module. The idea is to present the complex object at different hierarchical levels of abstraction. In our example, the top design entity is FT2KM.

    (b) Once the design entity has been defined, define its architecture, which describes the design entity; in this step, we define its behavior, its structure, or a mixture of both. For the IT2 FLS, we define the system's internal behavior, which led us to a logic design formed by four interconnected modules: fuzzification, inference engine, type reduction, and defuzzification. The VHDL circuits (submodules) are described using a register transfer level (RTL) sequence, since the functionality can be divided into a sequence of steps. At each step, the circuit performs a task consisting of data transfers between registers and the evaluation of some conditions in order to go to the next step; in other words, each VHDL module (design entity) can be divided into two areas: data and control. Each of the four modules needs to be conceptualized in the same way, so we define its own design entity and, therefore, its particular architecture as well as its interconnections with internal modules. This process is complete when the last system component has been reached.

    (c) Integrate the system. A main design entity (top level) must be created that integrates the submodules and defines their interconnections. Figure 76.7 shows the integration of the four modules.

      Fig. 76.7

      IT2FC design entity (FT2KM). This top-level module contains instances of the four fuzzy controller submodules

  2. Develop the test bench in VHDL and perform RTL simulations for each submodule of the main design entity. Both timing and functional simulations are necessary to create reliable internal design entities.

  3. Perform synthesis and implementation. In the synthesis process, the software transforms the VHDL constructs into generic gate-level components, such as simple logic gates and flip-flops. The implementation process is composed of three smaller subprocesses: translate, map, and place and route. In the translate process, the multiple design files of a project are merged into a single netlist. The map process, also known as technology mapping, maps the generic gates in the netlist to the FPGA's logic cells and IOBs. The place and route process uses the physical layout inside the FPGA chip to place the cells in physical locations and determine the routes that connect the various signals. In the Xilinx flow, the static timing analysis performed at the end of the implementation process determines various timing parameters, such as the maximal clock frequency and the maximal propagation delay [57].

  4. Generate the programming file and download it to the FPGA. A configuration file is generated from the final netlist and downloaded serially to the FPGA.

  5. Test the design entity using a simulation program such as Simulink of Matlab and the Xilinx System Generator (XSG) for Xilinx devices. The idea is first to plot the control surface in order to analyze the general behavior of the design (a controller in our example), and second to integrate the design entity as a block of the system to be controlled. Although this fifth step does not appear in the current literature on logic design for FPGA implementation, it is the authors' recommendation, since we have obtained good results following this practice.

Using the design entity FT2KM.vhd, which was created and tested using the aforementioned development flow, we can integrate it into an FPGA in two ways:

  1. As a standalone system. By this we mean an independent system that does not require the support of any microprocessor to work; the system itself is a specialized circuit that produces the desired output. The IT2FC is implemented using the FPGA design flow; therefore, it is programmed using the complete development flow for a specific application.

  2. As a coprocessor. A coprocessor performs specialized functions that the main system processor cannot perform as well or as fast. When an IT2FC is programmed in a high-level language, the time it takes to produce an output for a given input is too long to achieve adequate control of many plants, even when a parallel programming paradigm is used. Since a coprocessor is a dedicated circuit designed to offload the main processor, and the FPGA offers parallelism at the circuit level, the designer of the IT2FC coprocessor has control over the controller performance. The coprocessor can be physically separate, i.e., in a different FPGA circuit (or module), or it can be part of the system, in the same FPGA circuit. In this work, we show two methods to develop a system with an IT2FC as a coprocessor. In both methods, we assume that a tested IT2FC design entity is available. In the first case, we use the FT2KM design entity to incorporate the fuzzy controller as a coprocessor of an ARM processor in a Fusion FPGA. In the second case, we create the IT2FC IP core using the Xilinx Platform Studio; the core serves as a coprocessor of the MicroBlaze processor embedded into a Spartan 6 FPGA.

5 Development of a Standalone IT2FC

Figure 76.7 shows the top-level design entity (FT2KM) of the IT2FC and its components (submodules) for FPGA implementation. The entity code of the top-level entity and its components is given in Sect. 76.5.1. All stages include the clock (clk) and reset (rst) signals. In the defuzzifier, we have included these two signals to illustrate that a full process takes only four clock cycles, one for each stage. In practice, we did not add these two signals to the defuzzifier because, when the controller is used as a coprocessor, an 8-bit data latch is added at the output in order to incorporate it into the system. For a detailed description of the IT2FC stages, consult [34].

The fuzzification stage has two input variables, x1 and x2. This module contains one fuzzifier for the upper MFs and another for the lower MFs of the IT2FC. Consider the upper part and the first input, x1: a crisp value can be fuzzified by two MFs, because it may have membership values in two contiguous T2MFs, so the linguistic terms are assigned to the VHDL variables e1up and e2up, and their upper membership values are ge1up and ge2up. For the second input, x2, the linguistic terms are assigned to the VHDL variables de1up and de2up, and gde1up and gde2up are the upper membership values. The lower part of the fuzzifier is similar; for example, for the input variable x1 the assigned VHDL variables are e1low and e2low, and their lower MF values are ge1low and ge2low, etc. The fuzzification stage entity needs only one clock cycle to perform the fuzzification. The eight variables of each part (upper and lower) are the inputs of the inference engine stage [58].
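The idea that a crisp input activates at most two contiguous membership functions can be illustrated with a small behavioral model. The sketch below is written in Python rather than VHDL and is not the chapter's circuit: the number of terms (`N_TERMS`), the uniformly spaced triangular upper MFs, and the lower MF obtained by scaling the upper one (`LOWER_SCALE`) are all assumptions made for illustration. The returned values play the roles of e1up/e2up, ge1up/ge2up, and ge1low/ge2low.

```python
from typing import List, Tuple

N_TERMS = 5                       # hypothetical number of linguistic terms (fits in 3 bits)
X_MAX = 255                       # 8-bit crisp input, as in the FT2KM ports
SPACING = X_MAX / (N_TERMS - 1)   # distance between neighbouring MF peaks
LOWER_SCALE = 0.75                # hypothetical FOU: lower MF = scaled-down upper MF


def upper_grade(x: float, term: int) -> int:
    """Triangular upper MF peaking at term * SPACING, scaled to 0..255."""
    mu = max(0.0, 1.0 - abs(x - term * SPACING) / SPACING)
    return round(255 * mu)


def fuzzify(x: int) -> Tuple[List[int], List[int], List[int]]:
    """Return (terms, upper grades, lower grades) of the two contiguous MFs hit by x."""
    first = min(int(x // SPACING), N_TERMS - 2)   # left neighbour of x
    terms = [first, first + 1]
    up = [upper_grade(x, t) for t in terms]
    low = [round(LOWER_SCALE * g) for g in up]
    return terms, up, low
```

For instance, `fuzzify(100)` selects terms 1 and 2, because 100 lies between the peaks at 63.75 and 127.5. In hardware the same selection would be done for both inputs in a single clock cycle.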

The inference engine is divided into two parallel inference engine entities: IEEup manages the upper bound of the IT2FC, and IEElow the lower bound. Each entity has eight inputs from the corresponding fuzzifier stage and eight outputs; four of the outputs are the output linguistic terms, and the other four are their firing strengths. All the inputs enter a parallel selection VHDL process, in which the circuits are placed in parallel; the degree of parallelism can be tailored by an adequate coding style. In our case, all the rules are processed in parallel, and the eight outputs of each inference engine section (upper bound and lower bound) are obtained at the same time because the clk signal synchronizes the process; hence, this stage needs only one clock cycle to perform a whole inference and provide the output to the next stage. In the upper bound, the four antecedents are formed at the same time; for the first rule, for example, the antecedent is formed using the concatenation operator &, so it looks like ante := e1 & de1. Each antecedent can address up to four rules and, depending on the combination, one of the four rules is chosen; the upper inference engine output provides the active consequents and their firing strengths. The lower bound of the inference engine is treated in the same way [59].
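The antecedent addressing described above can be mimicked in software. The following Python sketch is a behavioral model under stated assumptions, not the chapter's VHDL: the rule base `RULES` is hypothetical, five terms per input are assumed, and the firing strength is computed with the minimum t-norm, which the text does not specify. Only the concatenation of two 3-bit term codes mirrors `ante := e1 & de1`.

```python
from itertools import product
from typing import Dict, List, Sequence, Tuple

M_BITS = 3  # width of a linguistic-term code, as in std_logic_vector(3 downto 1)


def antecedent(e: int, de: int) -> int:
    """Concatenate two 3-bit term codes, mirroring `ante := e1 & de1`."""
    return (e << M_BITS) | de


# Hypothetical rule base for 5 terms per input: consequent = floor of the mean term.
RULES: Dict[int, int] = {
    antecedent(e, de): (e + de) // 2 for e, de in product(range(5), repeat=2)
}


def infer(terms_e: Sequence[int], grades_e: Sequence[int],
          terms_de: Sequence[int], grades_de: Sequence[int]) -> Tuple[List[int], List[int]]:
    """Fire the four rules addressed by two active terms per input."""
    consequents, strengths = [], []
    pairs = product(zip(terms_e, grades_e), zip(terms_de, grades_de))
    for (e, ge), (de, gde) in pairs:
        consequents.append(RULES[antecedent(e, de)])  # rule selected by the antecedent
        strengths.append(min(ge, gde))                # assumption: min t-norm
    return consequents, strengths
```

Calling `infer` once with the upper grades and once with the lower grades corresponds to the two parallel entities IEEup and IEElow; in the FPGA the four rules are evaluated concurrently rather than in a loop.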

At the input of the TR stage, we have the equivalent values of the pre-computed $y_l^i$, i.e., the linguistic terms of the active consequents (C1left, C2left, C3left, and C4left), and the upper firing strengths (gc1up, gc2up, gc3up, and gc4up), in addition to the equivalent values of the pre-computed $y_r^i$ (C1right, C2right, C3right, and C4right) and the lower firing strengths (gc1low, gc2low, gc3low, and gc4low) [60]. All these signals go to a parallel selection process that performs the KM algorithm [39]. There are parallel blocks to obtain the average of the upper and lower firing strengths for the active consequents, which is required to obtain the average of $y_r$ and $y_l$; a block to obtain the different defuzzified values of $y_r$ and $y_l$; and parallel comparator blocks to obtain the final results of $y_r$ and $y_l$ [61].

The final result of the IT2FC is obtained in the defuzzification block, which computes the average of $y_r$ and $y_l$ and produces the single output $y$.
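For readers who want to check a hardware type reducer against a software reference, the textbook Karnik-Mendel (KM) iteration followed by the averaging defuzzifier can be sketched as below. This is a sequential floating-point Python model, not the parallel fixed-point circuit of the chapter; the function names `km_endpoint` and `defuzzify` and the argument layout are assumptions introduced here.

```python
from typing import Sequence


def km_endpoint(y: Sequence[float], f_low: Sequence[float], f_up: Sequence[float],
                left: bool, max_iter: int = 100) -> float:
    """Karnik-Mendel iteration for the left (y_l) or right (y_r) end point."""
    order = sorted(range(len(y)), key=lambda i: y[i])   # KM needs ascending y
    ys = [y[i] for i in order]
    lo = [f_low[i] for i in order]
    up = [f_up[i] for i in order]
    f = [(a + b) / 2 for a, b in zip(lo, up)]           # start from the average strengths
    if sum(f) == 0:
        return 0.0                                      # no rule fired
    y_prev = sum(fi * yi for fi, yi in zip(f, ys)) / sum(f)
    for _ in range(max_iter):
        k = sum(1 for v in ys if v <= y_prev)           # switch point
        # y_l: upper strengths left of the switch point; y_r: the opposite choice
        f = up[:k] + lo[k:] if left else lo[:k] + up[k:]
        total = sum(f)
        if total == 0:
            break
        y_new = sum(fi * yi for fi, yi in zip(f, ys)) / total
        if abs(y_new - y_prev) < 1e-12:                 # converged
            return y_new
        y_prev = y_new
    return y_prev


def defuzzify(y_l: float, y_r: float) -> float:
    """Crisp output: average of the two type-reduced end points."""
    return (y_l + y_r) / 2
```

With two consequents at 0 and 10 and firing intervals [0.2, 0.8] for both, the model gives $y_l = 2$, $y_r = 8$, and a crisp output of 5.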

5.1 Development of the IT2 FT2KM Design Entity

Figure 76.8 shows the implementation of a static IT2FC that can work as a standalone system. By static, we mean that the only way to reconfigure (modify) the FC is to stop the application and upload the whole configuration bit file (bitstream). In this system, the inputs of the fuzzifier and the output of the defuzzifier are connected directly to the FPGA terminals. The terminals are assigned in accordance with the internal architecture of the chosen FPGA. Hence, it is necessary to give the Xilinx Integrated Synthesis Environment (ISE) program special instructions (constraints) to carry through the synthesis process. They are generally placed in the user constraint file (UCF), although they may also exist in the HDL code. In general, constraints are instructions given to the FPGA implementation tools to direct the mapping, placement, timing, or other guidelines that the tools follow while processing an FPGA design.

Fig. 76.8

A standalone IT2FC is embedded into an FPGA. The fuzzifier reads the inputs directly from the FPGA terminals. The defuzzifier sends the crisp output to the FPGA terminals. The system may be embedded in the static region or in the reprogrammable region

The overall design entity of the IT2FC (FT2KM), shown in Fig. 76.7, is defined as follows:

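library ieee;
use ieee.std_logic_1164.all;

-- Top-level design entity: two 8-bit crisp inputs, one 8-bit crisp output
entity FT2KM is
  port(clk, reset : in  std_logic;
       x1, x2     : in  std_logic_vector(8 downto 1);
       y          : out std_logic_vector(8 downto 1)
  );
end FT2KM;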

The architecture of FT2KM has four components, and all of them have two common input ports: clock (clk) and reset (rst). All ports in an entity are signals by default. This is important because a signal serves to pass values in and out of the circuit; a signal represents circuit interconnects (wires). A component is a piece of customized code formed by an entity and its corresponding architecture, as well as library declarations. To allow a hierarchical design, each component must be declared before being used by another circuit, and to use a component it is necessary to instantiate it first. In this approach the components are:

  1. The component labeled fuzzyUpLw. It is the T2 fuzzifier, which consists of one fuzzifier for the upper MF of the FOU and one for the lower MF. It has two input ports, x1 and x2, and 16 output ports, e1Up to gde2Low.

    -- n: data width; the connecting signals in FT2KM are 8 bits wide
    component fuzzyUpLw is
      port(clk, reset : in std_logic;
           x1, x2 : in std_logic_vector(n downto 1);
           -- linguistic terms of the two active MFs per input (upper and lower)
           e1Up, e2Up, de1Up, de2Up, e1Low, e2Low, de1Low,
           de2Low : out std_logic_vector(3 downto 1);
           -- corresponding membership grades
           ge1Up, ge2Up, gde1Up, gde2Up, ge1Low, ge2Low, gde1Low,
           gde2Low : out std_logic_vector(n downto 1)
      );
    end component;

    The instantiation of this component uses named association, and the name of this instance is fuzt2. Note that the ports clk, reset, x1, and x2 are mapped (connected) directly to the design entity FT2KM since, as explained before, all ports are signals by default, which represent wires. The code that instantiates the fuzzyUpLw component is as follows:

    fuzt2 : fuzzyUpLw port map(
      clk => clk, reset => reset, x1 => x1, x2 => x2,
      e1Up => e1upsig, e2Up => e2upsig, de1Up => de1upsig,
      de2Up => de2upsig, ge1Up => ge1upsig, ge2Up => ge2upsig,
      gde1Up => gde1upsig, gde2Up => gde2upsig, e1Low => e1lowsig,
      e2Low => e2lowsig, de1Low => de1lowsig, de2Low => de2lowsig,
      ge1Low => ge1lowsig, ge2Low => ge2lowsig, gde1Low => gde1lowsig,
      gde2Low => gde2lowsig
    );

  2. The component Infer_type_2 corresponds to the T2 inference engine of the controller. It has 16 inputs that match the 16 outputs of the fuzzification stage, and 16 outputs to be connected to the type reduction stage. The code to include this component is:

    component Infer_type_2 is
      port(rst, clk : in std_logic;
           e1, e2, de1, de2, e1_2, e2_2, de1_2, de2_2 : in STD_LOGIC_VECTOR (m downto 1);
           g_e1, g_e2, g_de1, g_de2, g_e1_2, g_e2_2,
           g_de1_2, g_de2_2 : in STD_LOGIC_VECTOR (n downto 1);
           c1, c2, c3, c4, c1_2, c2_2, c3_2, c4_2 : out STD_LOGIC_VECTOR (m downto 1);
           gc1_2, gc2_2, gc3_2, gc4_2, gc1, gc2, gc3, gc4 : out STD_LOGIC_VECTOR (n downto 1)
      );
    end component;

    This component is instantiated with the name inft2 as follows:

    inft2 : Infer_type_2 port map(
      rst => reset, clk => clk, e1 => e1upsig, e2 => e2upsig, de1 => de1upsig,
      de2 => de2upsig, g_e1 => ge1upsig, g_e2 => ge2upsig, g_de1 => gde1upsig,
      g_de2 => gde2upsig, e1_2 => e1lowsig, e2_2 => e2lowsig, de1_2 => de1lowsig,
      de2_2 => de2lowsig, g_e1_2 => ge1lowsig, g_e2_2 => ge2lowsig, g_de1_2 => gde1lowsig,
      g_de2_2 => gde2lowsig, c1 => c1sig, c2 => c2sig, c3 => c3sig, c4 => c4sig,
      gc1 => gc1sig, gc2 => gc2sig, gc3 => gc3sig, gc4 => gc4sig, c1_2 => c12sig,
      c2_2 => c22sig, c3_2 => c32sig, c4_2 => c42sig, gc1_2 => gc12sig,
      gc2_2 => gc22sig, gc3_2 => gc32sig, gc4_2 => gc42sig
    );

    To connect the instances fuzt2 and inft2 it is necessary to define some signals (wires):

    signal e1upsig, e2upsig, de1upsig, de2upsig : std_logic_vector (m-1 downto 0);
    signal ge1upsig, ge2upsig, gde1upsig, gde2upsig : std_logic_vector (7 downto 0);
    signal e1lowsig, e2lowsig, de1lowsig, de2lowsig : std_logic_vector (m-1 downto 0);
    signal ge1lowsig, ge2lowsig, gde1lowsig, gde2lowsig : std_logic_vector (7 downto 0);

  3. The component TypeRed corresponds to the type reduction stage of the T2FC. It has 16 inputs, which connect to the inference engine's outputs, and two outputs, yr and yl, which are connected to the defuzzifier through signals once both components have been instantiated. The code to include this component is:

    component TypeRed is
      port(clk, rst : in std_logic;
           c1, c2, c3, c4, c1_2, c2_2, c3_2, c4_2 : in STD_LOGIC_VECTOR (3 downto 1);
           gc1, gc2, gc3, gc4, gc1_2, gc2_2, gc3_2, gc4_2 : in STD_LOGIC_VECTOR (7 downto 0);
           yl, yr : out std_logic_vector (8 downto 1)
      );
    end component;

    This component is instantiated with the name trkm as follows:

    -- port names follow the TypeRed component declaration; the signals are
    -- those declared for the connections to inft2 and dfit2
    trkm : TypeRed port map(
      clk => clk, rst => reset,
      c1 => c1sig, c2 => c2sig, c3 => c3sig, c4 => c4sig,
      c1_2 => c12sig, c2_2 => c22sig, c3_2 => c32sig, c4_2 => c42sig,
      gc1 => gc1sig, gc2 => gc2sig, gc3 => gc3sig, gc4 => gc4sig,
      gc1_2 => gc12sig, gc2_2 => gc22sig, gc3_2 => gc32sig, gc4_2 => gc42sig,
      yl => ylsig, yr => yrsig
    );

    The signals that connect the instance inft2 to the instance trkm are

    signal c1sig, c2sig, c3sig, c4sig : std_logic_vector (m-1 downto 0);
    signal gc1sig, gc2sig, gc3sig, gc4sig : std_logic_vector (7 downto 0);
    signal c12sig, c22sig, c32sig, c42sig : std_logic_vector (m-1 downto 0);
    signal gc12sig, gc22sig, gc32sig, gc42sig : std_logic_vector (7 downto 0);

  4. The last component, defit2, corresponds to the defuzzifier stage of the T2FLC. It has two inputs and one output.

    component defit2 is
      port(yl, yr : in std_logic_vector (n-1 downto 0);
           y : out std_logic_vector (n-1 downto 0));
    end component;

    This component is instantiated with the name dfit2 as follows:

    dfit2 : defit2 port map(yl => ylsig, yr => yrsig, y => y);

    We did not define any signal for the port y, since it can be connected directly to the design entity FT2KM. The instances trkm and dfit2 are connected using the following signals:

    signal ylsig, yrsig : std_logic_vector (n-1 downto 0);

This approach to implementing an IT2FC provides the fastest response. The whole process, consisting of fuzzification, inference, type reduction, and defuzzification, is completed in four clock cycles, which for a Spartan family implementation using a 50 MHz clock represents $80 \times 10^{-9}$ s, and for a Virtex 5 FPGA-based system $40 \times 10^{-9}$ s.
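The latency figures follow directly from the four-stage structure, with one clock cycle per stage. As a quick check, using notation introduced here ($N_{\text{stages}}$, $f_{\text{clk}}$, $t_{\text{proc}}$), and noting that the 100 MHz clock for the Virtex 5 is inferred from the quoted 40 ns rather than stated in the text:

```latex
t_{\text{proc}} = \frac{N_{\text{stages}}}{f_{\text{clk}}}
               = \frac{4}{50 \times 10^{6}\,\text{Hz}}
               = 80 \times 10^{-9}\,\text{s},
\qquad
f_{\text{clk}} = \frac{N_{\text{stages}}}{t_{\text{proc}}}
               = \frac{4}{40 \times 10^{-9}\,\text{s}}
               = 100\,\text{MHz}.
```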

6 Development of IT2FC Coprocessors

Embedding an IT2FC into an FPGA is certainly the option that offers the best performance and flexibility. As we shall see, the best performance is obtained when the embedded FC is used as a standalone system. Unfortunately, this gain in performance comes with some drawbacks: the design may be hard to use for people who were not involved in the design process of the controller or who are not familiar with VHDL coding, and the code owners may simply want to keep the code secret. All these obstacles can be overcome by the use of IP cores. Next, we explain two methods of implementing IT2FCs as coprocessors.

6.1 Integrating the IT2FC Through Internal Ports

Figure 76.9 shows a control system that integrates the FT2KM design entity, embedded into the Actel Fusion FPGA [62], as a coprocessor of an ARM processor. This FPGA can incorporate the ARM Cortex soft processor, as well as other IP cores, to make a custom configuration. The embedded system contains the ARM processor, two memory blocks, timers, an interrupt controller (IC), a Universal Asynchronous Receiver/Transmitter (UART) serial port, IIC, a pulse width modulator/tachometer block, and a general-purpose input/output (GPIO) interface to the FT2KM block. All the factory embedded components are soft IP cores. The FT2KM is a VHDL module that, together with the GPIO, forms the Ft2km_core soft coprocessor (the IT2 coprocessor of the system), which is handled as an IP core; in this case, however, it is necessary to have the VHDL code. The system also includes a DC motor with a high-resolution quadrature optical encoder, the system's power supply, an H-bridge for power direction, a personal computer, and a digital display.

Fig. 76.9

A coprocessor implemented into the Actel Fusion FPGA. The system has an ARM processor, the IT2FC coprocessor implemented through the general-purpose input/output port, and some peripherals

The Ft2km_core has six inputs and two outputs. The inputs are error, c.error, ce, rst, w, and clk. The 8-bit inputs error and c.error carry the controller inputs for the error and change-of-error values. The ce input enables or disables the fuzzy controller, the rst input resets all the internal registers of the IT2FC, and the w input starts a fuzzy inference cycle. The outputs are out and IRQ/RDY; the first is the crisp output value, which is 8 bits wide. IRQ/RDY is asserted when the output data corresponding to the respective input are ready to be read. IRQ is a pulse used to request an interrupt, whereas RDY is a signal that can be programmed to be active at a high or low binary logic level, indicating that a valid output has been produced; this last signal can be used in polling mode. In Fig. 76.9 only 1 bit is used for the IRQ/RDY signal, so when designing the system the designer has to decide on one method. It is possible to use both by modifying the logic or by separating the signal and adding an extra 1-bit signal.

The GPIO IP has two 32-bit-wide ports, one for input (read bus) and one for output (write bus). The output bus connects the GPIO IP to the ARM Cortex using the 32-bit APB bus. The input bus connects the IT2FC IP to the GPIO IP. The ARM Cortex uses the Ft2km_core as a coprocessor.
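The polling-mode handshake described in this section (write the inputs, pulse w, wait for RDY, read out) can be sketched as a host-side model. Everything specific in the Python sketch below is an assumption for illustration: the bit positions inside the 32-bit GPIO words, the `StubCore` class that stands in for the Ft2km_core, and its dummy output, which is not a fuzzy inference. Only the signal names come from the text.

```python
# Hypothetical bit layout of the 32-bit GPIO write word (not from the chapter)
ERROR_SHIFT, CERROR_SHIFT = 0, 8
CE_BIT, RST_BIT, W_BIT = 16, 17, 18
# Hypothetical layout of the 32-bit GPIO read word
OUT_MASK, RDY_BIT = 0xFF, 8


def pack_write_word(error: int, c_error: int, ce: int = 1, rst: int = 0, w: int = 0) -> int:
    """Pack the 8-bit inputs and the control bits into one write word."""
    return (((error & 0xFF) << ERROR_SHIFT) | ((c_error & 0xFF) << CERROR_SHIFT)
            | (ce << CE_BIT) | (rst << RST_BIT) | (w << W_BIT))


class StubCore:
    """Stand-in for Ft2km_core: raises RDY a few polls after w is pulsed."""

    def __init__(self, latency_polls: int = 4) -> None:
        self.latency = latency_polls
        self.countdown = None
        self.out = 0

    def write(self, word: int) -> None:
        if (word >> W_BIT) & 1 and (word >> CE_BIT) & 1:
            e = (word >> ERROR_SHIFT) & 0xFF
            c = (word >> CERROR_SHIFT) & 0xFF
            self.out = (e + c) // 2          # dummy result, NOT a fuzzy inference
            self.countdown = self.latency

    def read(self) -> int:
        if self.countdown is None:
            return 0                         # no inference started yet
        if self.countdown > 0:
            self.countdown -= 1
            return 0                         # RDY still low
        return (1 << RDY_BIT) | self.out


def run_inference(core, error: int, c_error: int, max_polls: int = 1000) -> int:
    """Start one inference cycle and poll RDY until the output is valid."""
    core.write(pack_write_word(error, c_error, w=1))   # assert w to start
    core.write(pack_write_word(error, c_error, w=0))   # release w
    for _ in range(max_polls):
        word = core.read()
        if (word >> RDY_BIT) & 1:
            return word & OUT_MASK
    raise TimeoutError("RDY was never asserted")
```

In an interrupt-driven design the polling loop would be replaced by an interrupt service routine triggered by the IRQ pulse.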

6.2 Development of IP Cores

In Sect. 76.6.1, we showed how to integrate the fuzzy coprocessor through an input/output port, i.e., the GPIO IP. We also mentioned IP cores such as the UART and the timers, which are connected directly to the system bus as in any microcontroller system with integrated peripherals. In this section, we show how to connect an IT2FC to the system bus to obtain an IT2FC IP core integrated into the system architecture. The procedure is basically the same for any FPGA of the Xilinx family. We worked with the Spartan 6 and Virtex 5, so the Xilinx ISE Design Suite was used.

The whole process of starting an application that includes a microprocessor and a coprocessor can be broadly divided into the following steps:

  1. Design and implement the design entity that will be integrated as an IP core in further steps, following the development flow explained in Sect. 76.4.2. In our case, the design entity is FT2KM.

  2. Create the basic embedded microcontroller system tailored to the application. At this point we already know the kind and amount of memory needed, as well as the peripherals. We create the microprocessor system using the Base System Builder (BSB) of the Xilinx Platform Studio (XPS) software. The system contains a MicroBlaze soft core, 16 KB of local memory, the data controller bus (dlmb_cntlr), and the instruction controller bus (ilmb_cntlr).

  3. Create the IP core, which should contain the desired design entity, in our case FT2KM. This step is carried out using the Import Peripheral Wizard found in the Hardware option of XPS. The idea is to connect the FT2KM design entity to the processor local bus (PLB V4.6) through three registers: one for each of the two inputs and one for the output. Upon completion, the tool creates a synthesizable HDL file (ft2km_core) that implements the required intellectual property interface (IPIF) services, and a stub user_logic_module. These two modules are shown in Fig. 76.10. The IPIF connects the user logic module to the system bus, using either the PLB or the on-chip peripheral bus (OPB). At this stage, the ISE Project Navigator software is needed to integrate into the user_logic_module all the files that implement the FT2KM design entity. Edit the User_Logic_I.vhd file to define the FT2KM component and signals. Open the ftk2_core.vhd file and create the ftk2_core entity and user logic. Synthesize the HDL code and exit ISE. Return to XPS, add the FTK2_core IP to the embedded system, connect the new IP core to the mb_plb bus system, and generate the addresses. Figure 76.10 shows the IT2FC IP core; the IPIF consists of the PLB V4.6 bus controller, which provides the necessary signals to interface the IP core to the bus system of the embedded soft core.

    Fig. 76.10

    IP core implementation of a user-defined peripheral. The IT2FC coprocessor is implemented in the user logic module. This module communicates with the rest of the system through the PLB or the on-chip peripheral bus (OPB). For a static coprocessor, use the PLB. For an implementation in the reconfigurable region, use the OPB

  4. Design the drivers (software) to handle this design entity as a peripheral.

  5. Design the application software to use the design entity.

7 Implementing a GA in an FPGA

In essence, evolution is a two-step process of random variation and selection of a population of individuals that responds with a collection of behaviors to the environment. Selection tends to eliminate those individuals that do not demonstrate an appropriate behavior. The survivors reproduce and combine their features to obtain better offspring. In replication, random mutation always occurs, which introduces novel behavioral characteristics. The evolution process optimizes behavior, and this is a desirable characteristic for a learning system. Although the term evolutionary computation dates back to 1991, the field has decades of history, genetic algorithms being one avenue of investigation in simulated evolution [63]. GAs are a family of computational models that imitate the principles of natural evolution. For consistency, they adopt biological terminology to describe operations. There are six main steps in a GA: population initialization, evaluation of candidates using a fitness function, selection, crossover, mutation, and termination judgment, as shown in Algorithm 76.1. The first step is to decide how to encode a solution to the problem that we want to optimize; hence, each individual is represented by a chromosome that contains the parameters. Common encodings of solutions are binary, integer, and real value. In binary encoding, every chromosome is a string of bits. In real-value encoding, every chromosome is a string that can contain one or several parameters encoded as real numbers. Algorithm 76.1 starts by initializing a population with random solutions, and then each individual of the population is evaluated using a fitness function, which is selected according to the optimization goals. For example, for tuning a controller it may be enough to check whether the controller output minimizes the error between the target and the reference. However, one or more complex fitness functions can be designed in order to achieve the control goal. In steps 3 to 5 the genetic operations are applied, i.e., selection, crossover (recombination), and mutation. In step 6, the termination criteria are checked, and the procedure stops if they have been fulfilled.

Algorithm 76.1 General scheme of a GA

initialize population with random candidate solutions

evaluate each candidate

repeat

select parents

recombine pairs of parents

mutate the resulting offspring

evaluate new candidates

select individuals for the next generation

until termination condition is satisfied
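As a rough software illustration of Algorithm 76.1, the following Python sketch follows the same sequence of steps. Everything specific in it is an assumption made for the example and is not taken from the chapter: the function name `run_ga`, real-valued chromosomes in [0, 1], tournament parent selection, one-point crossover, Gaussian mutation, a fixed number of generations as the termination condition, and a fitness that is minimized.

```python
import random


def run_ga(fitness, n_genes, pop_size=20, generations=50,
           crossover_rate=0.9, mutation_rate=0.1, seed=0):
    """Minimize `fitness` over real-valued chromosomes in [0, 1]^n_genes."""
    rng = random.Random(seed)
    # initialize population with random candidate solutions
    population = [[rng.random() for _ in range(n_genes)]
                  for _ in range(pop_size)]
    # evaluate each candidate
    scores = [fitness(ind) for ind in population]

    def tournament():
        # select a parent: the better of two randomly drawn individuals
        a, b = rng.randrange(pop_size), rng.randrange(pop_size)
        return population[a] if scores[a] <= scores[b] else population[b]

    # repeat ... until the termination condition (here: a generation budget)
    for _ in range(generations):
        offspring = []
        while len(offspring) < pop_size:
            p1, p2 = tournament(), tournament()
            # recombine pairs of parents (one-point crossover)
            if n_genes > 1 and rng.random() < crossover_rate:
                cut = rng.randrange(1, n_genes)
                child = p1[:cut] + p2[cut:]
            else:
                child = list(p1)
            # mutate the resulting offspring (Gaussian step, clipped to [0, 1])
            child = [min(1.0, max(0.0, g + rng.gauss(0.0, 0.1)))
                     if rng.random() < mutation_rate else g
                     for g in child]
            offspring.append(child)
        # evaluate new candidates
        offspring_scores = [fitness(ind) for ind in offspring]
        # select individuals for the next generation (best of parents + offspring)
        merged = sorted(zip(scores + offspring_scores, population + offspring),
                        key=lambda pair: pair[0])[:pop_size]
        scores = [s for s, _ in merged]
        population = [ind for _, ind in merged]

    best = min(range(pop_size), key=scores.__getitem__)
    return population[best], scores[best]


if __name__ == "__main__":
    target = [0.2, 0.7, 0.5]  # hypothetical parameter set to recover
    error = lambda ind: sum((g - t) ** 2 for g, t in zip(ind, target))
    print(run_ga(error, n_genes=3))
```

Because the survivors are chosen from parents and offspring together, the best fitness found never gets worse from one generation to the next; other replacement schemes are equally compatible with Algorithm 76.1.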

In this work, we have chosen a GA to evolve the IT2FC. However, the ideas presented here are valid for most evolutionary and natural computing methods. There are two methods to implement any evolutionary algorithm. The first is based on executing software written in a computer language such as C/C++, as is done on a desktop computer. The second is based on designing specialized hardware using an HDL. Both have advantages and disadvantages. The first method is the easier one, since there is much information available about coding different EAs in a high-level language. However, this solution may have limitations for real-time systems, since software implementations are slower than hardware implementations by at least a factor of five. On the other hand, hardware designs based on state machines are more complex to implement and use. In this section we present a short overview of both methods.

7.1 GA Software Based Implementations

It is well known that a GA can run in parallel, taking advantage of the two known types of parallelism: data and control parallelism. Data parallelism refers to executing one process over several instances of the EA, while control parallelism works with separate

Coarse-grained parallelism and fine-grained parallelism are two methods often associated with the use of EAs in parallel. The use of both methods together is called a hybrid approach. Coarse-grained parallelism entails the EA cores working in conjunction to solve a problem. Each node swaps individuals of its population with another node running the same problem; this exchange of individuals between cores improves diversity. The amount of information exchanged, the frequency of exchange, the direction, the data pattern, etc., are factors that can affect the efficiency of this approach.

In fine-grained parallelism, the approach is to share mating partners instead of populations. The populations across the parallel cores mate their fittest members with the fittest members found in a neighboring node’s population. Then, the offspring of the selected individuals are distributed. Depending on the means of distribution, this next generation can go to one of the parents’ populations, to both parents’ populations, or to all cores’ populations.

Figure 76.4 shows a six-core architecture design for the Virtex 5. Here, we can make fine-grained or coarse-grained implementations of an EA. For example, for a coarse-grained implementation, the island model with one processor per island can be used.

7.2 GA Hardware Implementations

Figure 76.11 shows a high-level view of the architecture of a GA for hardware implementation. The system has eight basic kinds of modules: a selection module, a crossover module, a mutation module, a fitness evaluation module, a control module, an observer module, four random number generator (RNG) modules, and two random access memory (RAM) modules.

Fig. 76.11

High-level view of the structure of a GA for FPGA implementation

The control module is a Mealy state machine designed to feed all other modules with the control signals necessary to synchronize the execution of the algorithm. The selection module can implement any existing selection method, for example the roulette wheel selection algorithm. This method picks the parents from the current population, and the parents are processed to create new individuals. In the current generation, the crossover and mutation modules perform the corresponding genetic operations on the selected parents. The fitness evaluation module computes the fitness of each offspring and applies elitism to the population. The observer module determines the stopping criterion and observes its fulfilment. RNGs are indispensable to provide the randomness that EAs require. Additionally, RAM 1 is necessary to store the current population and RAM 2 to store the selected parents of each
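The roulette wheel selection mentioned above for the selection module can be sketched in software as follows. This is a minimal illustration, not the hardware design of Fig. 76.11: it assumes non-negative fitness values where higher is better, and the `rng` argument (any object with a `random()` method returning a value in [0, 1)) stands in for one of the RNG modules.

```python
import random


def roulette_wheel_select(fitnesses, rng):
    """Return an index drawn with probability proportional to its fitness."""
    total = sum(fitnesses)
    if total <= 0:
        raise ValueError("at least one fitness value must be positive")
    # Spin the wheel: a point in [0, total).
    pick = rng.random() * total
    cumulative = 0.0
    for index, value in enumerate(fitnesses):
        cumulative += value
        # The slice of the wheel owned by `index` ends at `cumulative`.
        if pick < cumulative:
            return index
    # Guard against floating-point rounding at the upper end of the wheel.
    return len(fitnesses) - 1


if __name__ == "__main__":
    rng = random.Random(42)
    fitnesses = [1.0, 3.0, 6.0]
    parents = [roulette_wheel_select(fitnesses, rng) for _ in range(2)]
    print(parents)
```

An individual with zero fitness is never selected, while fitter individuals own a larger slice of the wheel and are therefore picked as parents more often.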

8 Evolving Fuzzy Controllers

In Sect. 76.1 the general structure of an EFRBS was presented. It was mentioned that the common denominator of most learning systems is their capability of making structural changes to themselves over time to improve their performance for defined tasks. It was also mentioned that the two classical approaches for fuzzy learning systems are the Michigan and Pittsburgh approaches, and that there exist newer proposals with the same target. Although programming a learning system on a computer in a high-level language such as C/C++ requires some skill, system knowledge, and experimentation, there are no technical obstacles to achieving a system with such characteristics. This can also be true for a hardware implementation: if the EFRBS is developed in C/C++ and executed by a hard or soft processor such as PowerPC or MicroBlaze, the process is similar to what is done on a computer. How to develop a coprocessor was explained in Sect. 76.6. The coprocessor was developed in the FPGA’s static (base) region, which cannot be changed during a partial reconfiguration process. Therefore, such coprocessors cannot undergo any structural change. Achieving an EFRBS in hardware is quite different from achieving it in a high-level language, because it is more difficult to change circuitry than to modify lines of code.

FPGAs are reprogrammable devices that need a design methodology to be used successfully as reconfigurable devices. Since there are several vendors with different architectures, the methodology usually changes from vendor to vendor and from device to device. For Xilinx FPGAs the configuration memory is volatile, so the device needs to be configured every time it is powered up by uploading the configuration data, known as the bitstream. Configuring an FPGA this way is not useful for the many applications that need to change their behavior while they are still working online. A solution to overcome this limitation is to use partial reconfiguration, which splits the FPGA into two kinds of regions: static and reconfigurable. The static (base) region is the portion of the design that does not change during partial reconfiguration; it may include logic that controls the partial reconfiguration process. The reconfigurable region is the portion that can be changed. In other words, partial reconfiguration (PR) is the ability to reconfigure selected areas of an FPGA at any time after its initial configuration [64]. It can be divided into two groups: dynamic partial reconfiguration (DPR) and static partial reconfiguration (SPR). DPR is also known as active partial reconfiguration. It allows changing a part of the device while the rest of the FPGA is still running. DPR is used to allow the FPGA to adapt to changing algorithms and enhance performance, or for critical missions that cannot be disrupted while some subsystems are being redefined. On the other hand, in SPR the static section of the FPGA needs to be stopped, so auto-reconfiguration is impossible (Fig. 76.12).

Fig. 76.12

The FPGA is divided into two regions: static and reconfigurable. The soft processor and peripherals are in the static region. Different fuzzy controller architectures are in the reconfigurable region. The bus macros are fixed data paths for signals going between a reconfigurable module and another module

For Xilinx FPGAs, there are basically three ways to achieve DPR on devices that support this feature. The two basic styles are difference-based partial reconfiguration and module-based partial reconfiguration. The first one can be used to make small changes to the design; the partial bitstream only contains information about the differences between the current design structure that resides in the FPGA and the new content of the FPGA. Since the bitstream differences are usually small, the changes can be made very quickly. Module-based partial reconfiguration is useful for reconfiguring large blocks of logic using modular design concepts. The third style is also based on modular design but is more flexible and less restrictive. This style was introduced by Xilinx in 2006 and is known as early access partial reconfiguration (EAPR) [65, 66]. There are two key differences between the EAPR design flow and the module-based one. (1) In the EAPR flow, the shape and size of the partially reconfigurable regions (PRRs) can be defined by the user. Each PRR has at least one, and usually multiple, partially reconfigurable modules (PRMs) that can be loaded into it. (2) For modules that communicate with each other, a special bus macro allows signals to cross a partial reconfiguration boundary. This is an important consideration, since without this feature intermodule communication would not be feasible, as it is impossible to guarantee routing between modules. The bus macro provides a fixed bus for inter-design communication. Each time partial reconfiguration is performed, the bus macro is used to establish unchanging routing channels between modules, guaranteeing correct connections [65].

An important core that enables embedded microprocessors such as MicroBlaze and PowerPC to achieve reconfiguration at run time is the HWICAP (hardware internal configuration access point) for the OPB. The HWICAP allows the processors to read and write the FPGA configuration memory through the ICAP (internal configuration access point). Basically, it allows writing and reading the configurable logic block (CLB) look-up tables (LUTs) of the FPGA.

The process to achieve reconfigurable computing with application to the IT2FC will be explained in more detail in Sect. 76.8.2. Moreover, how to evolve an IT2FC embedded in an FPGA, whether it resides in the static or in the reconfigurable region, will also be explained there.

8.1 EAPR Flow for Changing the Controller Structure

Figure 76.12 shows the basic idea of using the EAPR flow for reconfigurable computing to change from one IT2FC structure to a different one. In this figure the MicroBlaze soft processor can evaluate each controller structure according to single-objective or multiobjective criteria. The processor communicates with a PR region using the bus macro, which provides a means of locking the routing between the PRM and the base design. The system can achieve fast reconfiguration operations since partial bitstreams are transferred between the FPGA and the compact flash (CF) memory where the bitstreams are stored.

In general, the EAPR design flow is as follows [64, 67, 68]:

  1. Hardware description language design and synthesis. The first steps in the EAPR design flow are very similar to the standard modular design flow. We can summarize this in three steps:

    (a) Top-level design. In this step, the design description must only contain black-box instantiations of lower-level modules. The top-level design must contain: I/O instantiations, clock primitive instantiations, static module instantiations, PR module instantiations, signal declarations, and bus macro instantiations, since all non-global signals between the static design and the PRMs must pass through a bus macro.

    (b) Base design. Here, the static modules of the system contain logic that will remain constant during reconfiguration. This step is very similar to the design flow explained in Sect. 76.4.2. However, the designer must consider the input and output assignment rules for PR.

    (c) PRM design. Similarly to static modules, PR modules must not include global clock signals either, but may use those from top-level modules. When designing multiple PRMs to take advantage of the same reconfigurable area, the component name and port configuration of each module must match the reconfigurable module instantiation of the top-level design.

  2. Set design constraints. In this step, we need to place constraints in the design for place and route (PAR). The constraints included are: area group, reconfiguration mode, timing constraints, and location constraints. The area group constraint specifies which modules in the top-level module are static and which are reconfigurable. Each module instantiated by the top-level module is assigned to a group. The reconfiguration mode constraint is only applied to the reconfigurable group and specifies that the group is reconfigurable. Location constraints must be set for all pins, clocking primitives, and bus macros in the top-level design. Bus macros must be located so that they straddle the boundary between the PR region and the base design.

  3. Implement base design. Before the implementation of the static modules, the top level is translated to ensure that the constraints file has been created properly. The information generated by implementing the base design is used for the PRM implementation step. Base design implementation follows three steps: translate, map, and PAR.

  4. Implement PRMs. Each of the PRMs must be implemented separately within its own directory, and follows the base design implementation steps, i.e., translate, map, and PAR.

  5. Merge. The final step in the partial reconfiguration flow is to merge the top level, base, and PRMs. During the merge step, a complete design is built from the base design and each PRM. In this step, the partial bitstreams for each PRM and the initial full bitstreams are created to configure the FPGA.

Partial dynamic reconfigurable computing allows us to achieve online reconfiguration. By selecting a certain bitstream, it is possible to change the full controller structure or any of the stages (fuzzification, inference engine, type reduction, and defuzzification), as well as any individual section of each stage, for example, different membership functions for the fuzzification stage. However, all the reconfigurable modules must be synthesized beforehand, because they are loaded using partial bitstreams. Therefore, to have the capability to evolve reconfigurable modules we need to provide them with a control register (CR) to change the desired parameters.

Next, a flexible coprocessor (FlexCo) prototype of an IT2FC (FlexCo IT2FC) that can be implemented either in the static region or in the PR is

8.2 Flexible Coprocessor Prototype of an IT2FC

Figure 76.13 illustrates the FlexCo IT2FC, which contains the four stages (fuzzification, inference engine, type reduction, and defuzzification). Depending on the target region, they are connected to the PLB or to the OPB through a 32-bit control register (CR), which is formed by four 8-bit registers named R1 to R4 (Fig. 76.14). The parameters of each stage can be changed by the programmer, since they are no longer static as they were defined previously for the FT2KM (Sect. 76.5). Now, the parameter values are held in volatile registers connected through signals. The processor (MicroBlaze) can send two kinds of commands to the CR through the PLB or the OPB: control words (CWs) and data words (DWs). The state machine of the FlexCo IT2FC interprets the command.

Fig. 76.13

Flexible coprocessor proposal of an IT2FC for the static region

Fig. 76.14

The control register is used for both styles of implementation, in the static region or in the reconfigurable region

Figure 76.14 illustrates the CR coding for the static and reconfigurable FC. This register is used to perform parameter modification in both modes, static and reconfigurable. In general, bit 7 of R4 is used to differentiate between a CW and a DW: 1 means a CW, whereas 0 means a DW. The StaGe bits (SG-bits) serve to identify the IT2FC stage that is to be modified.

  • SG-bits = 00: The fuzzification stage has been chosen; it is then necessary to set the Ant/Con bit to 1 to indicate that the antecedent MFs are going to be modified. The section bit (S-bit) indicates which part of the FOU (upper or lower) will be modified. The linguistic-variable-term/active bit (LVT/Active) indicates whether we want to modify a linguistic variable (LV) or a linguistic term (LT); the Active option is for the inference engine (IE). In accordance with the value of the LVT/Active bit, register R3 holds the number of the LV or LT that will be changed. Finally, registers R1 and R2 give the parameter value of the LV or LT, with R1 being the least significant byte.

  • SG-bits = 01: With this setting, the state machine identifies that the IE will be modified. It works in conjunction with the Ant/Con bit, the S-bit, and the registers R1, R2, and R3. Set the Ant/Con bit to 0 to change the consequent parameters of a Mamdani inference system, use the S-bit to choose the upper or lower MF, use R3 to indicate the number of the MF, and use R1 and R2 to set the corresponding value. For a static implementation, it is possible to activate and deactivate rules using the LVT/Active bit. With the dynamic change/activate-deactivate bit (DC/AD), it is possible to change the combination of antecedents and consequents of a specific rule, provided that this part has been made flexible by using registers. For an implementation in the reconfigurable region, it is possible to add or remove rules. These two features need to work in conjunction with registers R1, R2, and R3.

  • SG-bits = 10: This selection is used to modify the type reduction stage. It is possible to have more than one type reducer. By setting the DC/AD bit to 1, we indicate that we wish to change the method at run time without the need for a reconfiguration process, which would imply uploading partial bitstreams. The method can be selected using register R3. By using a DC/AD bit equal to 0 and an LVT/Active bit equal to 0, in combination with registers R1 to R3, we indicate that we wish to change the preloaded values that the KM algorithm needs to achieve the TR.

  • SG-bits = 11: Similarly to the type reduction stage, we can change the defuzzifier at run time.
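The coding described in the list above can be illustrated by packing and unpacking a 32-bit CR word in software. The text fixes only some of the layout: bit 7 of R4 distinguishes a CW (1) from a DW (0), the SG-bits select the stage, R3 carries a number, and R1 and R2 carry a value with R1 as the least significant byte. Everything else in this sketch is an assumption, because Fig. 76.14 is not reproduced here: that R4 is the most significant byte of the word, that all flag bits live in R4, and the positions chosen for the SG, Ant/Con, S, LVT/Active, and DC/AD bits. The function names are hypothetical as well.

```python
# Stage codes taken from the SG-bits list in the text.
STAGES = {
    "fuzzification": 0b00,
    "inference": 0b01,
    "type_reduction": 0b10,
    "defuzzification": 0b11,
}
STAGE_NAMES = {code: name for name, code in STAGES.items()}


def pack_cr(is_control, stage, ant_con, s_bit, lvt_active, dc_ad, index, value):
    """Build a 32-bit CR word as R4:R3:R2:R1 (byte order is an assumption)."""
    if not 0 <= index <= 0xFF:
        raise ValueError("index must fit in the 8-bit register R3")
    if not 0 <= value <= 0xFFFF:
        raise ValueError("value must fit in the 16 bits of R2:R1")
    r4 = (
        (int(bool(is_control)) << 7)      # bit 7 of R4: 1 = CW, 0 = DW (from the text)
        | (STAGES[stage] << 5)            # SG-bits, assumed at bits 6..5
        | (int(bool(ant_con)) << 4)       # Ant/Con, assumed at bit 4
        | (int(bool(s_bit)) << 3)         # S-bit, assumed at bit 3
        | (int(bool(lvt_active)) << 2)    # LVT/Active, assumed at bit 2
        | (int(bool(dc_ad)) << 1)         # DC/AD, assumed at bit 1
    )
    r3 = index
    r2, r1 = value >> 8, value & 0xFF     # R1 is the least significant byte
    return (r4 << 24) | (r3 << 16) | (r2 << 8) | r1


def unpack_cr(word):
    """Decode a CR word produced by pack_cr back into its fields."""
    r4 = (word >> 24) & 0xFF
    return {
        "is_control": bool(r4 >> 7),
        "stage": STAGE_NAMES[(r4 >> 5) & 0b11],
        "ant_con": (r4 >> 4) & 1,
        "s_bit": (r4 >> 3) & 1,
        "lvt_active": (r4 >> 2) & 1,
        "dc_ad": (r4 >> 1) & 1,
        "index": (word >> 16) & 0xFF,
        "value": word & 0xFFFF,
    }


if __name__ == "__main__":
    # Hypothetical command: antecedent MF number 3 receives the value 500.
    word = pack_cr(True, "fuzzification", 1, 1, 0, 0, index=3, value=500)
    print(hex(word), unpack_cr(word))
```

A GA running on the processor could use such a helper to translate each gene of a chromosome into the sequence of words written to the CR, although the exact sequence of CWs and DWs expected by the state machine is not specified in this section.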

With respect to the type reduction and defuzzification stages, we give the option to have more than one module. This has the advantage of making the process easier, and it makes it possible for static designs; the disadvantage is that the design will consume more macrocells, increasing the cost of the required FPGAs and boards as well as the power consumption. Next, we will explain the implementation of the FlexCo IT2FC for the static region and the reconfigurable region.

8.2.1 Implementing the FlexCo IT2FC on the Static Region

The IT2FC is connected to the PLB. Although the controller structure is static, this system can be evolved for tuning and learning because it is possible to make parametric modifications to all the IT2FC stages. Figure 76.13 shows the architecture of this system and Fig. 76.15 a conceptual model of the possible implementation.

Fig. 76.15

The static region of the FPGA contains a multiprocessor system (MPS) with an operating system. The GA resides in the program memory and is executed by the MPS. The IT2FC may be implemented in the reconfigurable region (Fig. 76.16) or in the static region (Fig. 76.13)

8.2.2 Implementing the FlexCo IT2FC on the PR

Figure 76.16 illustrates a more flexible architecture for the FlexCo IT2FC. The IT2FC is implemented in the reconfigurable region, using a partially reconfigurable region (PRR) for each stage. This is convenient since each region can have multiple modules that can be swapped in and out of the device on the fly. This is the recommended method to achieve the evolving IT2FC since it is more flexible. One disadvantage is that at run time it is slower than the static implementation because more logic circuits are incorporated.

Fig. 76.16

Flexible coprocessor proposal of an IT2FC for the reconfigurable region

Figure 76.17 shows a standalone evolving system; as mentioned, the IT2FC and the GA can be in the static or in the reconfigurable region.

Fig. 76.17

This design may be implemented in both regions to have a dynamic reconfigurable system. For a static implementation, the system must have registers for all the variable parameters to make it possible to change their values (Fig. 76.13)

8.3 Conclusion and Further Reading

FPGAs combine the best parts of ASICs and processor-based systems, since they do not require high volumes to justify making a custom design. Moreover, they also provide the flexibility of software running on a processor-based system, without being limited by the number of cores available. They are one of the best options to parallelize a system since they are parallel in nature. In an IT2FC, a typical whole T2 inference, computed using an industrial computer equipped with a quad-core processor, lasts about $18 \times 10^{-3}$ s. A whole IT2FC (fuzzification, inference, KM type reducer, and defuzzification) lasts only four clock cycles, which for a Spartan implementation using a 50 MHz clock represents $80 \times 10^{-9}$ s, and for a Virtex 5 FPGA-based system represents $40 \times 10^{-9}$ s. For the Spartan family the typical implementation speedup is 225000, whereas for the Virtex 5 it is 450000. Using a pipeline architecture, the whole IT2 process can be obtained in just one clock cycle, so using the same criteria to compare, the speedup is 900000 for Spartan and 1800000 for Virtex. Reported speedups of GAs implemented in an FPGA are at least 5 times higher than in a computer system. For all these reasons, FPGAs are suitable devices for embedding evolving fuzzy logic controllers, especially the IT2FC, since it is computationally expensive. There are some drawbacks to the use of this technology, mostly with respect to the need for a highly experienced development team because of its implementation complexity. Achieving an evolving intelligent system using reconfigurable computing is not as direct as it is using a computer system. It requires knowledge of FPGA architectures, VHDL coding, soft processor implementation, the development of coprocessors, high-level languages, and the fundamentals of reconfigurable computing. Therefore, people interested in achieving such implementations require expertise in the above fields, and further reading should focus on these topics: FPGA vendor manuals and white papers, as well as papers and books on reconfigurable computing.
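The speedup figures can be checked with a few lines of arithmetic. The sketch below uses only the timings stated in the text (18 ms per inference on the computer, four clock cycles per inference on the FPGA, a 50 MHz Spartan clock). Two values are inferred rather than stated: the 10 ns Virtex 5 clock period, which follows from 40 ns for four cycles, and the assumption that the pipelined design keeps the same clock period while delivering one result per cycle. The function name `speedup` is illustrative.

```python
PC_INFERENCE_NS = 18_000_000   # 18 ms per inference on the quad-core computer

SPARTAN_PERIOD_NS = 20         # 50 MHz clock, as stated in the text
VIRTEX5_PERIOD_NS = 10         # inferred: 40 ns for four clock cycles


def speedup(clock_period_ns, cycles_per_inference):
    """Ratio of the computer inference time to the FPGA inference time."""
    return PC_INFERENCE_NS / (clock_period_ns * cycles_per_inference)


if __name__ == "__main__":
    # Four clock cycles per inference (non-pipelined design).
    print(speedup(SPARTAN_PERIOD_NS, 4))   # 225000.0
    print(speedup(VIRTEX5_PERIOD_NS, 4))   # 450000.0
    # One clock cycle per inference (pipelined design, same clock assumed).
    print(speedup(SPARTAN_PERIOD_NS, 1))   # 900000.0
    print(speedup(VIRTEX5_PERIOD_NS, 1))   # 1800000.0
```

Under these assumptions the pipelined figures are simply four times the non-pipelined ones; a pipelined design that runs at a different clock frequency would change them accordingly.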