
1 Dynamics of Attractor and Analog Networks

An attractor is a well-known mathematical object central to nonlinear dynamical systems (NDS) theory, one of the indispensable conceptual underpinnings of complexity science. An attractor is a set towards which a variable, evolving over time according to the dictates of a nonlinear dynamical system, moves, such that points that get close enough to the attractor remain close even if they are slightly disturbed. To appreciate what an attractor is, some corresponding NDS notions, such as phase or state space, phase portraits, basins of attraction, initial conditions, transients, bifurcations, chaos, and strange attractors, are needed to tame some of the unruliness of complex systems.

Most of us have at least some inkling of what nonlinear means, which can be illustrated by the most well-known and vivid example: the butterfly effect of a chaotic, nonlinear system. It has prompted the image of tiny air currents produced by a butterfly flapping its wings in Brazil, which are then amplified to the extent that they may influence the building up of a thunderhead in Kansas. Although no one can actually claim that there is such a linkage between Brazilian lepidopterological dynamics and climatology in the Midwest of the USA, the image does serve to vividly portray nonlinearity in the extreme.

Because of both the nonlinearity and the capacity to pass through different regimes of stability and instability, the outcomes of a nonlinear dynamical system are unpredictable. These different regimes of a dynamical system are understood as different phases governed by different attractors, which means that the dynamics of each phase of a dynamical system are constrained within the circumscribed range allowed by that phase's attractor.

1.1 Phase Space and Attractors

To better grasp the idea of phase space, a time series and a phase portrait can be used to represent the data points. A time series displays changes in the values of variables on the y-axis (or the z-axis) and time on the x-axis, as in a time series chart; the phase portrait, however, plots the variables against each other and leaves time as an implicit dimension that is not explicitly plotted. Attractors can be displayed in phase portraits as the long-term stable sets of points of the dynamical system, that is, the locations in the phase portrait towards which the system's dynamics are attracted after transient phenomena have died down. To illustrate phase space and attractors, two examples are given below.

Imagine a child on a swing: a parent pulls the swing back and gives it a good push to set the child moving. Once pushed, the swing moves forward and backward as shown in Fig. 33.1, and if it is not pushed again it eventually comes to rest, as shown in the time series chart and in phase space. The time series shows an oscillation of the speed of the swing, which slows down and eventually stops, that is, it flat lines. In phase space, the swing's speed is plotted against the distance of the swing from the central point; the resting state is called a fixed point attractor since it attracts the system's dynamics in the long run.

Fig. 33.1

Schematics of an unpushed swing

The fixed point attractor in the center of Fig. 33.2 is equivalent to the flat line in Fig. 33.3. The fixed point attractor is another way to see, and say, that an unpushed swing will come to a state of rest in the long term. The curved lines with arrows spiraling down to the center point in Fig. 33.2 display what is called the basin of attraction for the unpushed swing. The basin of attraction comprises the various initial conditions of the unpushed swing, such as starting heights and initial velocities, whose trajectories all lead to the attractor.

Fig. 33.2

Phase portrait and fixed point attractor of an unpushed swing

Fig. 33.3

Time series of the unpushed swing

Now consider another, similar dynamical system; this time the swing is pushed each time it comes back to where the parent is standing. The time series chart of the pushed swing is shown in Fig. 33.4 as a continuing oscillation. This oscillation is around a zero value of y and is positive when the swing is going in one direction and negative when it is going in the other. As a phase space diagram, the states of the variables plotted against each other are shown in Fig. 33.5. The unbroken oval in Fig. 33.5 is a different kind of attractor from the fixed point one in Fig. 33.2. This attractor is well known as a limit cycle or periodic attractor of a pushed swing. It is called a limit cycle because it represents the cyclical behavior of the oscillations of the pushed swing as a limit to which the dynamical system adheres under the sway of this attractor. It is periodic because the attractor oscillates around the same values, as the swing keeps going up and down to the same heights above its lowest point. Such a dynamical system can be called periodic because it has a repeating cycle or pattern.

Fig. 33.4

Time series chart of the pushed swing

Fig. 33.5

Phase portrait and limit cycle attractor of a pushed swing (after [1])

By now, what we have learned about attractors can be summarized as follows: they are spatially displayed in the phase portrait of a dynamical system as it changes over the course of time, and they represent the long-term dynamics of the system, so that whatever the initial conditions represented as data points are, their trajectories in phase space fall within the basins of attraction and are attracted to the attractor. In spite of the wide usage of the term in mathematics and science, as Robinson points out, there is still no precise definition of an attractor, although many have been offered [2]. He therefore suggests thinking of an attractor as a phase portrait that attracts a large set of initial conditions and has some sort of minimality property, i.e., it is the smallest such set in the phase space of the system. The attractor attracts the initial conditions after any initial transient behavior has died down. The minimality requirement implies the invariance or stability of the attractor: as a minimal object, the attractor cannot be split up into smaller subsets, and it retains its role as what dominates the dynamical system during a particular phase of its evolution.

1.2 Single Attractors of Dynamical Systems

Standard methods for the study of stability of dynamical systems with a unique attractor include the Lyapunov method, the LaSalle invariance principle, and combinations thereof. Usually, given the properties of a (unique) attractor, we can realize a dynamical system with such an attractor.

Since the creation of the fundamental theorems of Lyapunov stability, many researchers have gone further and proved that most of the fundamental Lyapunov theorems are reversible, i.e., if the solution has some kind of stability, then a corresponding Lyapunov function necessarily exists. In theory, this demonstrates that these theorems are efficacious. However, the construction of an appropriate V function for determining stability remains of great interest, because the gap between existence and construction is large and there is no general rule for constructing a Lyapunov function. In practice, different researchers construct Lyapunov functions by different methods based on their experience and technique, and those who can construct a good Lyapunov function can extract more useful information to demonstrate the effectiveness of their theories. Certainly, many successful Lyapunov functions have a practical background. For example, some equations derived from physical models have a clear physical meaning, such as mechanical systems, in which the sum of the kinetic and potential energy is an appropriate V function. The linear approximation method can also be used: for a nonlinear differential equation, first find a positive definite quadratic-form V function for the corresponding linear differential equation, and then take the nonlinearity into account when constructing a similar V function for the nonlinear system.
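
As a minimal sketch of the linear approximation method (assuming NumPy and SciPy are available, and using an illustrative damped-pendulum system rather than any system from the text), the following code linearizes the system at its equilibrium, solves the matrix Lyapunov equation A^T P + P A = -Q for a quadratic candidate V(x) = x^T P x, and then checks numerically that V decreases along a trajectory of the full nonlinear system:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov
from scipy.integrate import solve_ivp

# Illustrative nonlinear system: damped pendulum x1' = x2, x2' = -sin(x1) - 0.5*x2
def f(t, x):
    return [x[1], -np.sin(x[0]) - 0.5 * x[1]]

# Linearization at the origin: A is the Jacobian of f at x = 0
A = np.array([[0.0, 1.0],
              [-1.0, -0.5]])
Q = np.eye(2)

# Solve A^T P + P A = -Q for the quadratic Lyapunov candidate V(x) = x^T P x
P = solve_continuous_lyapunov(A.T, -Q)

def V(x):
    return x @ P @ x

# Check that V is nonincreasing along a trajectory of the full nonlinear system
sol = solve_ivp(f, (0.0, 20.0), [1.0, 0.0], dense_output=True)
values = [V(sol.sol(t)) for t in np.linspace(0.0, 20.0, 200)]
print("V nonincreasing along the sampled trajectory:",
      all(v2 <= v1 + 1e-9 for v1, v2 in zip(values, values[1:])))
```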

Grossberg proposed and studied additive neural networks, so named because the nonlinear interaction terms add to the neuron activity. The additive neural network has been used for many applications since the 1960s [3, 4], including the introduction of self-organizing maps. In the past decades, neural networks, as a special kind of nonlinear system, have received considerable attention. The study of recurrent neural networks and their various generalizations has been an active research area [5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17]. The stability of recurrent neural networks is a prerequisite for almost all neural network applications. Stability analysis is primarily concerned with the existence and uniqueness of equilibrium points, and with the global asymptotic stability, global exponential stability, and global robust stability of neural networks at equilibria. In recent years, the stability analysis of recurrent neural networks with time delays has received much attention [18, 19]. Single attractors of dynamical systems are shown in Fig. 33.6.

Fig. 33.6

Single attractors of dynamical systems

1.3 Multiple Attractors of Dynamical Systems

Multistable systems have attracted extensive interest in both modeling studies and neurobiological research in recent years due to their ability to emulate and explain biological behavior [20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34]. Mathematically, multistability allows the system to have multiple fixed points and periodic orbits. As noted in [35], more than 25 years of experimental and theoretical work has indicated that the onset of oscillations in neurons and in neuron populations is characterized by multistability.

Multistability analysis differs from monostability analysis. In monostability analysis, the objective is to derive conditions that guarantee that the nonlinear system has a unique equilibrium point and that all trajectories of the neural network converge to it. In multistability analysis, by contrast, nonlinear systems are allowed to have multiple equilibrium points; stable and unstable equilibrium points, and even periodic trajectories, may coexist in a multistable system.

The methods to study the stability of dynamical systems with a unique attractor include the Lyapunov method, the LaSalle invariance principle, and the combination of the two. One unique attractor can be realized by one dynamical system, but it is much more complicated for multiple attractors to be realized by one dynamical system or by dynamical multisystems, because of the compatibility, agreement, and behavior optimization required among the systems. Generally, the usual global stability conditions are not adequately applicable to multistable systems. The latest results on multistability of neural networks can be found in [36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52]. It is shown in [45, 46] that n-neuron recurrent neural networks with a one-step piecewise linear activation function can have $2^n$ locally exponentially stable equilibrium points located in saturation regions, obtained by partitioning the state space into $2^n$ subspaces. In [47], multistability of almost periodic solutions of recurrently connected neural networks with delays is investigated. In [48], by constructing a Lyapunov functional and using matrix inequality techniques, a delay-dependent multistability criterion for recurrent neural networks is derived. In [49], neural networks with a class of nondecreasing piecewise linear activation functions with 2r corner points are considered. It is proved that the n-neuron dynamical system can have, and only have, $(2r+1)^n$ equilibria under some conditions, of which $(r+1)^n$ are locally exponentially stable and the others are unstable. In [50], some multistability properties of a class of bidirectional associative memory recurrent neural networks with unsaturating piecewise linear transfer functions are studied based on local inhibition. In [51], for two classes of general activation functions, multistability of competitive neural networks with time-varying and distributed delays is investigated by formulating parameter conditions and using inequality techniques. In [52], the existence of $2^n$ stable stationary solutions for general n-dimensional delayed neural networks with several classes of activation functions is established by formulating parameter conditions motivated by a geometrical observation. Two limit cycle attractors and $2^4$ equilibrium point attractors of dynamical systems are shown in Figs. 33.7 and 33.8, respectively.
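
As a hedged illustration of how $2^n$ coexisting stable equilibria can arise in saturation regions (in the spirit of [45, 46], but with illustrative parameters rather than those of the cited works), the sketch below simulates a 2-neuron recurrent network with a one-step piecewise linear (saturating) activation and strong self-excitation; different corner initial conditions settle onto $2^2 = 4$ distinct equilibria:

```python
import numpy as np
from scipy.integrate import solve_ivp

# One-step piecewise linear (saturating) activation
def sat(x):
    return np.clip(x, -1.0, 1.0)

# 2-neuron recurrent network dx/dt = -x + W sat(x) + u; parameters are illustrative
W = np.array([[2.0, -0.4],
              [-0.4, 2.0]])
u = np.zeros(2)

def rhs(t, x):
    return -x + W @ sat(x) + u

# Start from initial conditions near the corners of the state space
equilibria = set()
for x0 in [(2, 2), (2, -2), (-2, 2), (-2, -2)]:
    sol = solve_ivp(rhs, (0.0, 50.0), np.array(x0, float))
    equilibria.add(tuple(np.round(sol.y[:, -1], 3)))

print(f"{len(equilibria)} distinct stable equilibria found:", sorted(equilibria))
# With self-excitation stronger than 1, each corner initial condition settles
# in a different saturation region, illustrating 2^n = 4 coexisting attractors.
```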

Fig. 33.7

Two limit cycle attractors of dynamical systems

Fig. 33.8

$2^4$ equilibrium point attractors of dynamical systems

1.4 Conclusion

In this section, we briefly introduced the notion of an attractor, along with phase space and phase portraits. Furthermore, single attractors and multiple attractors of dynamical systems were also discussed.

2 Synchrony, Oscillations, and Chaos in Neural Networks

2.1 Synchronization

2.1.1 Biological Significance of Synchronization

Neurodynamics deals with dynamic changes of neural properties and behaviors in time and space at different levels of the hierarchy in neural systems. The characteristic spiking dynamics of individual neurons is of fundamental importance. In large-scale systems, such as biological neural networks and brains with billions of neurons, the interaction among the connected neural components is crucial in determining collective properties. In particular, synchronization plays a critical role in higher cognition and conscious experience [53, 54, 55, 56, 57]. Large-scale synchronization of neuronal activity arising from intrinsically asynchronous oscillations in local electrical circuitries of neurons is at the root of cognition. Synchronization at the level of neural populations is characterized by the amplitude and phase measures described in the following subsections.

There are various dynamic behaviors of potential interest for neural systems. In the simplest case, the system behavior converges to a fixed point, where all major variables remain unchanged. A more interesting dynamic behavior emerges when the system behavior periodically repeats itself with period T, which will be described first. Such periodic oscillations are common in neural networks and are often caused by the presence of inhibitory neurons and inhibitory neural populations. Another behavior emerges when the system neither converges to a fixed point nor exhibits periodic oscillations; rather, it maintains highly complex, chaotic dynamics. Chaos can be a microscopic effect at the cellular level, or a mesoscopic dynamics of neural populations or cortical regions. At the highest level of the hierarchy, chaos can emerge as the result of large-scale, macroscopic effects across cortical areas in the brain.

Considering the temporal dynamics of a system of interacting neural units, limit cycle oscillations and chaotic dynamics are of importance. Synchronization in limit cycle oscillations is considered first, as it illustrates the basic principles of synchronization. The extension to more complex (chaotic) dynamics is described in Sect. 33.2.3. Limit cycle dynamics is a cyclic repetition of the system's behavior with a given time period T. The cyclic repetition covers all characteristics of the system, e.g., microscopic currents, potentials, and dynamic variables; see, e.g., the Hodgkin–Huxley model of neurons [58]. Limit cycle oscillations can be described as a cyclic loop of the system trajectory in the space of all variables. The state of the system is given as a point on this trajectory at any given time instant. As time evolves, the point describing the system traverses the trajectory. Due to the periodic nature of the movement, the points describing the system at times t and t + T coincide fully. We can define a convenient reference system by selecting a center point of the trajectory and describing the motion as the vector pointing from the center to the actual state on the trajectory. This vector has an amplitude and a phase in a suitable coordinate system, denoted as $\xi(t)$ and $\Phi(t)$, respectively. The evolution of the phase of an isolated oscillator with frequency $\omega_0$ is given by

$\frac{d\Phi(t)}{dt} = \omega_0 \,.$
(33.1)

Several types of synchronization can be defined. The strongest synchronization takes place when two (or multiple) units have identical behaviors. Considering limit cycle dynamics, strong synchronization means that the oscillation amplitude and phase are the same for all units. This means complete synchrony. An example of two periodic oscillators is given by the clocks shown in Fig. 33.9a–c [59]. Strong synchronization means that the two pendulums are connected with a rigid object forcing them to move together. The lack of connection between the two pendulums means the absence of synchronization, i.e., they move completely independently. An intermediate level of synchrony may arise with weak coupling between the pendulums, such as a spring or a flexible band. Phase synchrony takes place when the amplitudes are not the same, but the phases of the oscillations still coincide. Figure 33.9b–d depicts the case of out-of-phase synchrony, when the phases of the two oscillators are exactly opposite.

Fig. 33.9 a–d

Synchronization in pendulums, in phase and out of phase (after [59]). Bottom plots: illustration of periodic trajectories for the in-phase (a,c) and out-of-phase (b,d) cases

2.1.2 Amplitude Measures of Synchrony

Denote by $a_j(t)$ the time signal produced by the individual units (neurons), $j = 1, \ldots, N$; the overall signal of the interacting units, $A(t)$, is determined as

$A(t) = \frac{1}{N}\sum_{j=1}^{N} a_j(t) \,.$
(33.2)

The variance of the time series $A(t)$ is given as follows

$\sigma_A^2 = \langle A^2(t)\rangle - \langle A(t)\rangle^2 \,.$
(33.3)

Here $\langle f(t)\rangle$ denotes time averaging over a given time window. After determining the variances of the individual channels $\sigma_{A_i}^2$ based on (33.3), the synchrony $\chi_N$ of a system with N components is defined as follows

$\chi_N^2 = \frac{\sigma_A^2}{\frac{1}{N}\sum_{i=1}^{N}\sigma_{A_i}^2} \,.$
(33.4)

This synchrony measure has a nonzero value in synchronized and partially synchronized systems ($0 < \chi_N \leq 1$), while $\chi_N = 0$ means the complete absence of synchrony in the neural network [60].
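
A minimal sketch of the amplitude synchrony measure of (33.2)–(33.4), computed for synthetic signals (the signal construction and all parameters below are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 50, 2000
t = np.linspace(0.0, 20.0, T)

def chi(a):
    """Synchrony measure chi_N from Eqs. (33.2)-(33.4); a has shape (N, T)."""
    A = a.mean(axis=0)                # population-averaged signal, Eq. (33.2)
    var_A = A.var()                   # variance of the mean signal, Eq. (33.3)
    mean_var = a.var(axis=1).mean()   # average variance of the individual units
    return np.sqrt(var_A / mean_var)  # Eq. (33.4)

# Nearly independent oscillators (random phases) vs. strongly synchronized ones
phases = rng.uniform(0, 2 * np.pi, size=(N, 1))
a_async = np.sin(2 * np.pi * 1.0 * t + phases)
a_sync = np.sin(2 * np.pi * 1.0 * t + 0.05 * phases)

print("chi (asynchronous):", round(chi(a_async), 3))  # small; tends to 0 as N grows
print("chi (synchronized):", round(chi(a_sync), 3))   # close to 1
```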

Fourier transform-based signal processing methods are very useful for the characterization of synchrony in time series, and they are widely used in neural network analysis. The Fourier transform makes important assumptions about the analyzed time series, including stationary or slowly changing statistical characteristics and ergodicity. In many applications these approximations are appropriate. In analyzing large-scale synchrony in brain signals, however, alternative methods are also justified. Relevant approaches include the Hilbert transform for rapidly changing brain signals [61, 62]. Here both Fourier- and Hilbert-based methods are outlined and avenues for their applications in neural networks are indicated. Define the cross correlation function (CCF) between discretely sampled time series $x_i(t)$ and $x_j(t)$, $t = 1, \ldots, T$, as follows

$\mathrm{CCF}_{ij}(\tau) = \frac{1}{T}\sum_{t=1}^{T-\tau}\left[x_i(t+\tau) - \bar{x}_i\right]\left[x_j(t) - \bar{x}_j\right] .$
(33.5)

Here $\bar{x}_i$ is the mean of the signal over period T, and it is assumed that $x_i(t)$ is normalized to unit variance. For completely correlated pairs of signals, the maximum of the cross correlation is 1; for uncorrelated signals it equals 0. The cross power spectral density $\mathrm{CPSD}_{ij}(\omega)$, cross spectrum for short, is defined as the Fourier transform of the cross correlation: $\mathrm{CPSD}_{ij}(\omega) = \mathcal{F}(\mathrm{CCF}_{ij}(\tau))$. If $i = j$, i.e., the two channels coincide, then we talk about the autocorrelation and the auto power spectral density $\mathrm{APSD}_{ii}(\omega)$; for details of Fourier analysis, see [63]. The coherence $\gamma^2$ is defined by normalizing the cross spectrum by the autospectra

$\gamma_{ij}^2(\omega) = \frac{\left|\mathrm{CPSD}_{ij}(\omega)\right|^2}{\left|\mathrm{APSD}_{ii}(\omega)\right|\,\left|\mathrm{APSD}_{jj}(\omega)\right|} \,.$
(33.6)

The coherence satisfies $0 \leq \gamma^2(\omega) \leq 1$ and it contains useful information on the frequency content of the synchronization between signals. If the coherence is close to unity at some frequencies, the two signals are closely related or synchronized at those frequencies; a coherence near zero means the absence of synchrony there. Coherence functions provide useful information on synchrony in brain signals in various frequency bands [64]. Other information-theoretical characterizations include mutual information and entropy measures.
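
As an illustration, coherence can be estimated with standard Welch-type spectral tools; the sketch below (assuming SciPy is available) generates two noisy signals sharing a 10 Hz component and evaluates the measure of (33.6) at two frequencies. The sampling rate, noise levels, and segment length are illustrative choices:

```python
import numpy as np
from scipy.signal import coherence

rng = np.random.default_rng(1)
fs = 250.0                       # sampling rate in Hz (illustrative)
t = np.arange(0, 20, 1 / fs)

# Two noisy signals sharing a common 10 Hz component (synchrony in the alpha band)
common = np.sin(2 * np.pi * 10 * t)
x = common + 0.8 * rng.standard_normal(t.size)
y = 0.7 * common + 0.8 * rng.standard_normal(t.size)

f, gamma2 = coherence(x, y, fs=fs, nperseg=512)   # Welch-based estimate of Eq. (33.6)
print("coherence near 10 Hz:", round(gamma2[np.argmin(np.abs(f - 10))], 2))
print("coherence near 40 Hz:", round(gamma2[np.argmin(np.abs(f - 40))], 2))
```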

2.1.3 Phase Synchronization

If the components of the neural network are weakly interacting, the synchrony evaluated using the amplitude measure χ in (33.4) may be low. There can still be a meaningful synchronization effect in the system, based on phase measures. Phase synchronization is defined as the global entrainment of the phases [65], which means a collective adjustment of their rhythms due to their weak interaction. At the same time, in systems with phase synchronization the amplitudes need not be synchronized. Phase synchronization is often observed in complex chaotic systems and it has been identified in biological neural networks [61, 65].

In complex systems, the trajectory of the system in the phase space is often very convoluted. The approach described in (33.1), i. e., choosing a center point for the oscillating cycle in the phase space with natural frequency  ω 0 , can be nontrivial in chaotic systems. In such cases, the Hilbert transform-based approach can provide a useful tool for the characterization of phase synchrony. Hilbert analysis determines the analytic signal and its instantaneous frequency, which can be used to describe phase synchronization effects. Considering time series  s ( t ) , its analytic signal  z ( t ) is defined as follows [62]

$z(t) = s(t) + \mathrm{i}\,\hat{s}(t) = A(t)\,e^{\mathrm{i}\Phi(t)} \,.$
(33.7)

Here $A(t)$ is the analytic amplitude and $\Phi(t)$ is the analytic phase, while $\hat{s}(t)$ is the Hilbert transform of $s(t)$, given by

$\hat{s}(t) = \frac{1}{\pi}\,\mathrm{PV}\!\int_{-\infty}^{+\infty}\frac{s(\tau)}{t-\tau}\,d\tau \,,$
(33.8)

where PV stands for the Cauchy principal value of the integral. The analytic signal and its instantaneous phase can be determined for an arbitrary broadband signal. However, the analytic signal has a clear meaning only in a narrow frequency band; therefore, a bandpass filter should precede the evaluation of the analytic signal for data with broad frequency content.

The Hilbert method of analytic signals is illustrated using actual local field potentials recorded from rabbits with an array of chronically implanted intracranial electrodes [67]. The signals have been filtered in the theta band (3–7 Hz). An example time series $s(t)$ is shown in Fig. 33.10a. The Hilbert transform $\hat{s}(t)$ is depicted in Fig. 33.10b in red, while blue shows $s(t)$. Figure 33.10c shows the analytic phase $\Phi(t)$, and Fig. 33.10d depicts the analytic signal $z(t)$ in the complex plane. Figure 33.11 shows the unwrapped instantaneous phase with bifurcating phase curves indicating desynchronization at specific time instances (−1.3 s, −0.4 s, and 1 s). The plot on the right-hand side of Fig. 33.11 depicts the evolution of the instantaneous frequency in time. The frequency is around 5 Hz most of the time, indicating phase synchronization; however, it has very large dispersion at a few specific instances (desynchronization).

Fig. 33.10 a–d

Demonstration of the Hilbert analytic signal approach on electroencephalogram (EEG) signals (after [66]); (a) signal $s(t)$; (b) Hilbert transform $\hat{s}(t)$ (red) of signal $s(t)$ (blue); (c) instantaneous phase $\Phi(t)$; and (d) analytic signal $z(t)$ in the complex plane

Fig. 33.11 a,b

Illustration of instantaneous phases; (a) unwrapped phase with bifurcating phase curves indicating desynchronization at specific time instances (−1.3 s, −0.4 s, and 1 s); (b) evolution of the instantaneous frequency in time

Synchronization between channels x and y can be measured using the phase locking value (PLV) defined as follows [61]

$\mathrm{PLV}_{xy}(t) = \left|\frac{1}{T}\int_{t-T/2}^{t+T/2} e^{\,\mathrm{i}[\Phi_x(\tau) - \Phi_y(\tau)]}\,d\tau\right| .$
(33.9)

PLV ranges from 0 to 1, where 1 indicates complete phase locking. The PLV defined in (33.9) determines an average value over a time window of length T. Note that PLV is a function of t through the sliding window, and it is also a function of the frequency band selected by the bandpass filter during the preprocessing phase. By changing the frequency band and the time, synchronization can be monitored under various conditions. This method has been applied productively in cognitive experiments [68].
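
A minimal sketch of the Hilbert-based PLV of (33.7)–(33.9) for two synthetic theta-band signals (assuming SciPy; the filter design, phase offset, and noise levels are illustrative assumptions):

```python
import numpy as np
from scipy.signal import hilbert, butter, filtfilt

rng = np.random.default_rng(2)
fs = 250.0
t = np.arange(0, 10, 1 / fs)

# Two signals sharing a theta-band (about 5 Hz) rhythm with a fixed phase offset, plus noise
x = np.sin(2 * np.pi * 5 * t) + 0.5 * rng.standard_normal(t.size)
y = np.sin(2 * np.pi * 5 * t + 0.8) + 0.5 * rng.standard_normal(t.size)

# Bandpass filter in the theta band before computing the analytic signal (Eq. 33.7)
b, a = butter(4, [3 / (fs / 2), 7 / (fs / 2)], btype="band")
phi_x = np.angle(hilbert(filtfilt(b, a, x)))
phi_y = np.angle(hilbert(filtfilt(b, a, y)))

# Phase-locking value over the whole record, a discrete version of Eq. (33.9)
plv = np.abs(np.mean(np.exp(1j * (phi_x - phi_y))))
print("PLV in the theta band:", round(plv, 2))   # close to 1 for phase-locked signals
```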

2.1.4 Synchronization–Desynchronization Transitions

Transitions between neurodynamic regimes with and without synchronization have been observed and exploited for cognitive monitoring. The Haken–Kelso–Bunz (HKB) model is one of the prominent and elegant approaches providing a theoretical framework for synchrony switching, based on observations of bimanual coordination [69]. The HKB model invokes the concepts of metastability and multistability as fundamental properties of cognition. In the experiment, the subjects were instructed to follow the rhythm of a metronome with their index fingers in an anti-phase manner. It was observed that, as the metronome frequency increased, the subjects spontaneously switched their anti-phase movement to in-phase at a certain oscillation frequency and maintained it thereafter, even if the metronome frequency was decreased again below the given threshold.

The following simple equation was introduced to describe the observed dynamics: $\mathrm{d}\Delta\Phi/\mathrm{d}t = -\sin(\Delta\Phi) - 2\varepsilon\sin(2\Delta\Phi)$. Here $\Delta\Phi = \phi_1 - \phi_2$ is the phase difference between the two finger movements, and the control parameter ε is related to the inverse of the imposed oscillation frequency. The system dynamics is illustrated in Fig. 33.12 by the potential surface V, where stable fixed points correspond to local minima. For low oscillation frequencies (high ε), there are stable equilibria at both in-phase and anti-phase conditions. As the oscillation frequency increases (low ε), the dynamics transits to a state where only the in-phase equilibrium is stable.
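
The phase-difference dynamics above is the negative gradient of the potential $V(\Delta\Phi) = -\cos\Delta\Phi - \varepsilon\cos 2\Delta\Phi$. The sketch below (with illustrative values of ε) locates the minima of V numerically and shows the loss of the anti-phase minima as ε decreases:

```python
import numpy as np

def V(dphi, eps):
    """HKB potential; its negative gradient gives d(dPhi)/dt = -sin(dPhi) - 2*eps*sin(2*dPhi)."""
    return -np.cos(dphi) - eps * np.cos(2 * dphi)

# Evaluate slightly beyond +/- pi so the anti-phase minima are interior points
dphi = np.linspace(-np.pi - 0.5, np.pi + 0.5, 2001)

for eps in (1.0, 0.2):   # large eps: slow metronome; small eps: fast metronome
    dV = np.diff(V(dphi, eps))
    is_min = np.concatenate(([False], (dV[:-1] < 0) & (dV[1:] > 0), [False]))
    print(f"eps = {eps}: stable phase differences near", np.round(dphi[is_min], 2))
# For large eps both in-phase (0) and anti-phase (+/- pi) are stable;
# for small eps only the in-phase solution survives, reproducing the observed switch.
```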

Fig. 33.12

Illustration of the potential surface (V) of the HKB system as a function of the phase difference $\Delta\Phi$ (in radians) and the inverse frequency ε. The transition from anti-phase to in-phase behavior is seen as the oscillation frequency increases (ε decreases)

Another practical example of synchrony-desynchrony transition in neural networks is given by image processing. An important basic task of neural networks is image segmentation, which is difficult to accomplish with excitatory nodes only. There is evidence that biological neural networks use inhibitory connections for completing basic pattern separation and integration tasks [70]. Synchrony between interacting neurons may indicate the recognition of an input. A typical neural network architecture implementing such a switch between synchronous and nonsynchronous states using local excitation and global inhibition is shown in Fig. 33.13. This system uses amplitude difference to measure synchronization between neighboring neurons. Phase synchronization measures have been proposed as well to accomplish the segmentation and recognition tasks [71]. Phase synchronization provides a very useful tool for learning and control of the oscillations in weakly interacting neighborhoods.

Fig. 33.13

Neural network with local excitation and a global inhibition node (black; after [70])

2.2 Oscillations in Neural Networks

2.2.1 Oscillations in Brains

The interaction between opposing tendencies in physical and biological systems can lead to the onset of oscillations. Negative feedback between the system’s components plays an important role in generating oscillations in electrical systems. Brains as large-scale bioelectrical networks consist of components oscillating at various frequencies. The competition between inhibitory and excitatory neurons is a basic ingredient of cortical oscillations. The intricate interaction between oscillators produces the amazingly rich oscillations that we experimentally observe as brain rhythms at multiple time scales [72, 73].

Oscillations occur in the brain at different time scales, from several milliseconds (high frequencies) to several seconds (low frequencies). One can distinguish between oscillatory components based on their frequency content, including the delta (1–4 Hz), theta (4–7 Hz), alpha (7–12 Hz), beta (12–30 Hz), and gamma (30–80 Hz) bands. The above separation of brain wave frequencies is somewhat arbitrary; however, it can be used as a guideline to focus on various activities. For example, higher cognitive functions are broadly assumed to be manifested in oscillations in the higher beta and gamma bands.

Brain oscillations take place in time and space. A large part of cognitive activity happens in the cortex, the convoluted six-layer cortical sheet of gyri and sulci. The spatial activity is organized on multiple scales as well, from the neuronal level (μm), through granules (mm) and cortical areas (several cm), to the hemisphere-wide level (about 20 cm). The temporal and spatial scales are not independent; rather, they delicately interact and modulate each other during cognition. Modern brain monitoring tools provide insight into these complex space–time processes [74].

2.2.2 Characterization of Oscillatory Networks

Oscillations in neural networks are synchronized activities of populations of neurons at certain well-defined frequencies. Neural systems are often modeled as the interaction of components which oscillate at specific, well-defined frequencies. Oscillatory dynamics can correspond to microscopic neurons, to mesoscopic populations of tens of thousands of neurons, or to macroscopic neural populations including billions of neurons. Oscillations at the microscopic level have been thoroughly studied using spiking neuron models, such as the Hodgkin–Huxley (HH) equations. Here we focus on populations of neurons, which have some natural oscillation frequencies. It is meaningful to assume that the natural frequencies are not identical, due to the diverse properties of populations in the cortex. Interestingly, the diversity of oscillations at the microscopic and mesoscopic levels can give rise to large-scale synchronous dynamics at higher levels. Such emergent oscillatory dynamics is the primary subject of this section.

Consider N coupled oscillators with natural frequencies $\omega_j$, $j = 1, \ldots, N$. A measure of the synchronization in such systems is given by the parameter R, which is often called the order parameter. This terminology was introduced by Haken [75] to describe the emergence of macroscopic order from disorder. The time-varying order parameter $R(t)$ is defined as [76]

$R(t) = \left|\frac{1}{N}\sum_{j=1}^{N} e^{\,\mathrm{i}\theta_j(t)}\right| .$
(33.10)

Order parameter R provides a useful synchronization measure for coupled oscillatory systems. A common approach is to consider a globally coupled system, in which all the components interact with each other. This is the broadest possible level of interaction. The local coupling model represents the other extreme, i.e., each node interacts with just a few others, which are called its direct neighbors. In a one-dimensional array, a node has two neighbors, on its left and right, respectively (assuming periodic boundary conditions). In a two-dimensional lattice, a node has four direct neighbors, and so on. The size of the neighborhood can be expanded, so that the connectivity in the network becomes denser. Of special interest are networks that have a mostly regular neighborhood structure with some further neighbors added by a selection rule from the whole network. The addition of remote or nonlocal connections is called rewiring, and networks with rewiring are small world networks. They have been extensively studied in network theory [76, 77, 78]. Figure 33.14 illustrates local (top left) and global coupling (bottom right), as well as intermediate coupling, with the bottom left plot giving an example of a network with random rewiring.

Fig. 33.14 a–d

Network architectures with various connectivity structures: (a) local, (b) and (c) are intermediate, and (d) global (mean-field) connectivity

2.2.3 The Kuramoto Model

The Kuramoto model [79] is a popular approach to describe oscillatory neural systems. It implements mean-field (global) coupling. The synchronization in this model admits an analytical solution, which helps to interpret the underlying dynamics in clear mathematical terms [76]. Let $\theta_j$ and $\omega_j$ denote the phase and the inherent frequency of the j-th oscillator. The oscillators are coupled by a nonlinear interaction term depending on their pairwise phase differences. In the Kuramoto model, the following sinusoidal coupling term has been used to model neural systems

$\frac{d\theta_j}{dt} = \omega_j + \frac{K}{N}\sum_{i=1}^{N}\sin(\theta_i - \theta_j)\,, \quad j = 1, \ldots, N .$
(33.11)

Here K denotes the coupling strength; K = 0 means no coupling. The system (33.11) and its generalizations have been studied extensively since its first introduction by Kuramoto [79]. Kuramoto assumed a Lorentzian distribution of the natural frequencies ω, $g(\omega) = \gamma/\{\pi[\gamma^2 + (\omega - \omega_0)^2]\}$. This leads, in the limits $N \to \infty$ and $t \to \infty$, to a simple analytic expression for the order parameter R

$R = \sqrt{1 - K_c/K}\ \ \text{if } K > K_c\,, \qquad R = 0\ \ \text{otherwise}.$
(33.12)

Here $K_c$ denotes the critical coupling strength, given by $K_c = 2\gamma$. There is no synchronization between the oscillators if $K \leq K_c$, and the synchronization becomes stronger as K increases under supercritical conditions $K > K_c$; see Fig. 33.15. Inputs can be used to control synchronization, i.e., a highly synchronized system can be (partially) desynchronized by input stimuli [80, 81]. Alternatively, input stimuli can induce large-scale synchrony in a system with a low level of synchrony, as evidenced by cortical observations [82].
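
A minimal mean-field simulation of the Kuramoto model (33.10)–(33.12) with Lorentzian (Cauchy) natural frequencies of unit width, so that $K_c = 2$; the oscillator count, time step, and coupling values are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)
N, dt, steps = 500, 0.01, 5000
omega = rng.standard_cauchy(N)             # Lorentzian natural frequencies, gamma = 1
theta0 = rng.uniform(0, 2 * np.pi, N)

def order_parameter(theta):
    return np.abs(np.mean(np.exp(1j * theta)))      # Eq. (33.10)

def simulate(K):
    theta = theta0.copy()
    for _ in range(steps):
        z = np.mean(np.exp(1j * theta))             # mean field
        # Eq. (33.11) rewritten through the mean field: K R sin(psi - theta_j)
        theta += dt * (omega + K * np.abs(z) * np.sin(np.angle(z) - theta))
    return order_parameter(theta)

# Critical coupling K_c = 2*gamma = 2 for the unit-width Lorentzian distribution
for K in (1.0, 2.0, 4.0):
    print(f"K = {K}: R =", round(simulate(K), 2))
```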

Fig. 33.15

Kuramoto model in the mean-field case. Dependence of the order parameter R on the coupling strength K. Below a critical value $K_c$, the order parameter is 0, indicating the absence of synchrony; synchrony emerges for K above the critical value

2.2.4 Neural Networks as Dynamical Systems

A dynamical system is defined by its equation of motion, which describes the evolution of the system state as a function of time t

$\frac{dX(t,\lambda)}{dt} = F(X)\,, \quad X \in \mathbb{R}^n .$
(33.13)

Here X is the state vector describing the state of the system in the n-dimensional Euclidean space, $X = (x_1, \ldots, x_n) \in \mathbb{R}^n$, and λ is the vector of system parameters. Proper initial conditions must be specified, and it is assumed that $F(X)$ is a sufficiently smooth nonlinear function. In neural dynamics it is often assumed that the state space is a smooth manifold, and the goal is to study the evolution of the trajectory of $X(t)$ in the state space as time varies over the interval $[t_0, T]$.

The Cohen–Grossberg (CG) equation is a general formulation of the motion of a neural network as a dynamical system with distributed time delays in the presence of inputs. The CG model has been studied thoroughly over the past decades and has served as a starting point for various other approaches. The general form of the CG model is [83]

$\frac{dx_i(t)}{dt} = -a_i(x_i(t))\Big[\,b_i(x_i(t)) - \sum_{j=1}^{N} a_{ij} f_j(x_j(t)) - \sum_{j=1}^{N} b_{ij} f_j(x_j(t-\tau_{ij})) + u_i\,\Big]\,, \quad i = 1, \ldots, N .$
(33.14)

Here $X(t) = [x_1(t), x_2(t), \ldots, x_N(t)]^{\mathrm T}$ is the state vector describing a neural network with N neurons. The function $a_i(\cdot)$ describes the amplification, $b_i(\cdot)$ denotes a properly behaved function that guarantees that the solution remains bounded, $f_i(\cdot)$ is the activation function, $u_i$ denotes the external input, $a_{ij}$ and $b_{ij}$ are components of the connection weight matrix and the delayed connection weight matrix, respectively, and $\tau_{ij}$ describes the time delays between neurons, $i, j = 1, \ldots, N$. The solution of (33.14) can be determined after specifying suitable initial conditions.

There are various approaches to guarantee the stability of the CG equation as it approaches its equilibria under specific constraints. Global convergence assuming symmetry of the connectivity matrix has been shown [83]. The symmetric version of a simplified CG model has become popular as the Hopfield or Hopfield–Tank model [84]. The dynamical properties of the CG equation have been studied extensively, including asymptotic stability, exponential stability, robust stability, and bifurcations to periodic solutions and chaos. Symmetry requirements for the connectivity matrix have been relaxed while still guaranteeing asymptotic stability [85]. CG equations can be employed to find the optimal solutions of a nonlinear optimization problem when global asymptotic stability guarantees the stability of the solution [86]. Global asymptotic stability of the CG neural network with time delay has been studied using linear matrix inequalities (LMI). The LMI approach is a fruitful way to establish global exponential stability by constructing Lyapunov functions for broad classes of neural networks.

2.2.5 Bifurcations in Neural Network Dynamics

Bifurcation theory studies the behavior of dynamical systems in the neighborhood of bifurcation points, i.e., at points where the topology of the state space changes abruptly under continuous variation of a system parameter. An example of the state space is given by the folded surface in Fig. 33.16, which illustrates a cusp bifurcation point. Here $\lambda = [a, b]$ is a two-dimensional parameter vector and $X \in \mathbb{R}^1$ [87]. As parameter b increases, the initially unfolded manifold undergoes a bifurcation through a cusp folding with three possible values of the state variable X. This is an example of a pitchfork bifurcation, in which a stable equilibrium point bifurcates into one unstable and two stable equilibria. The projection onto the a–b plane shows the cusp bifurcation folding with multiple equilibria. The presence of multiple equilibria provides the conditions for the onset of oscillatory states in neural networks. The transition from fixed point to limit cycle dynamics can be described by bifurcation theory.

Fig. 33.16

Folded surface in the state space illustrating a cusp bifurcation (after [87]). By increasing parameter b, the stable equilibrium bifurcates into two stable and one unstable equilibria

2.2.6 Neural Networks with Inhibitory Feedback

Oscillations in neural networks are typically due to delayed, negative feedback between neural populations. Mean-field models are described first, starting with Wilson–Cowan (WC) oscillators, which are capable of producing limit cycle oscillations. Next, a class of more general networks with excitatory–inhibitory feedback is described, which can generate unstable limit cycle oscillations.

The Wilson–Cowan model is based on a statistical analysis of neural populations in the mean-field limit, i.e., assuming that all components of the system fully interact [88, 89]. In the brain it may describe a single cortical column in one of the sensory cortices, which in turn interacts with other columns to generate synchronous or asynchronous oscillations, depending on the cognitive state. In its simplest manifestation, the WC model has one excitatory and one inhibitory component, with interaction weights denoted as $w_{EE}$, $w_{EI}$, $w_{IE}$, and $w_{II}$. The nonlinear function f stands for the standard sigmoid with rate constant a

$\frac{dX_E}{dt} = -X_E + f(w_{EE}X_E + w_{IE}X_I + P_E)\,,$
(33.15)
$\frac{dX_I}{dt} = -X_I + f(w_{EI}X_E + w_{II}X_I + P_I)\,,$
(33.16)
$f(x) = 1/\left[1 + e^{-ax}\right].$
(33.17)

$P_E$ and $P_I$ describe the effect of input stimuli on the excitatory and inhibitory nodes, respectively. The inhibitory weights are negative, while the excitatory ones are positive. The WC system has been extensively studied, with dynamical behaviors including fixed point and oscillatory regimes. In particular, for fixed weight values, it has been shown that the WC system undergoes a pitchfork bifurcation as the input levels $P_E$ or $P_I$ are changed. Figure 33.17 shows the schematics of the two-node system, as well as an illustration of the oscillatory states following the bifurcation with parameters $w_{EE} = 11.5$, $w_{II} = -2$, $w_{EI} = -w_{IE} = -10$, input values $P_E = 0$ and $P_I = -4$, and rate constant a = 1. Stochastic versions of the Wilson–Cowan oscillators have been extensively developed as well [90]. Coupled Wilson–Cowan oscillators have been used in learning models and have demonstrated applicability in a number of fields, including visual processing and pattern classification [91, 92, 93].
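
A hedged simulation sketch of the two-node WC system (33.15)–(33.17) using the weight magnitudes quoted above; the sign and direction assignment of the cross-couplings (excitation of I by E positive, inhibition of E by I negative) and the initial condition are assumptions made for illustration:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Weight magnitudes quoted in the text; the cross-coupling signs below are an assumption
w_EE, w_II = 11.5, -2.0
w_ItoE, w_EtoI = -10.0, 10.0
P_E, P_I, a = 0.0, -4.0, 1.0

f = lambda x: 1.0 / (1.0 + np.exp(-a * x))          # Eq. (33.17)

def wilson_cowan(t, s):
    xE, xI = s
    dxE = -xE + f(w_EE * xE + w_ItoE * xI + P_E)     # Eq. (33.15)
    dxI = -xI + f(w_EtoI * xE + w_II * xI + P_I)     # Eq. (33.16)
    return [dxE, dxI]

sol = solve_ivp(wilson_cowan, (0.0, 100.0), [0.1, 0.1], max_step=0.05)
xE = sol.y[0][sol.t > 50.0]                           # discard the initial transient
print("E activity range after the transient:", round(xE.min(), 2), "to", round(xE.max(), 2))
# A persistent, nonvanishing range indicates limit cycle oscillations rather than a fixed point.
```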

Fig. 33.17

Schematic diagram of the Wilson–Cowan oscillator with excitatory (E) and inhibitory (I) populations; solid lines show excitatory, dashed lines inhibitory connections. The right panels show the trajectory in the $X_E$–$X_I$ phase space and the time series of the oscillatory signals (after [90])

Oscillatory neural networks with interacting excitatory–inhibitory units have been developed as Freeman K sets [94]. That model uses an asymmetric sigmoid function $f(x)$, modeled on neurophysiological activation data and given as follows

$f(x) = q\left\{1 - \exp\!\left[-\,(e^{x} - 1)/q\right]\right\}.$
(33.18)

Here q is a parameter specifying the slope and maximal asymptote of the sigmoid curve. The sigmoid has unit gain at zero and maximum gain at positive x values due to its asymmetry; see (33.18). This property provides the opportunity for self-sustained oscillations without input over a wide range of parameters. Two versions of the basic oscillatory unit have been studied, with either one excitatory and one inhibitory unit, or two excitatory and two inhibitory units, as illustrated in Fig. 33.18. Stability conditions for the fixed point and limit cycle oscillations have been identified [95, 96]. The system with two E and two I units has the advantage that it avoids self-feedback, which is uncharacteristic of biological neural populations. Interestingly, the extended system has an operating regime with an unstable equilibrium and no stable equilibria. This condition leads to an inherent instability, a dynamical regime in which the system oscillates without input. Oscillations in the unstable region have been characterized and conditions for sustained unstable oscillations derived [96]. Simulations in this region confirmed the existence of limit cycles in the unstable regime with highly irregular oscillatory shapes of the cycle; see Fig. 33.18, upper plot. Regions with regular limit cycle oscillations and fixed point behavior have been identified as well; see Fig. 33.18, middle and bottom [97].

Fig. 33.18 a,b

Illustration of excitatory–inhibitory models. (a) Left: simplified model with one excitatory (E) and one inhibitory (I) node. Right: extended model with two E and two I nodes. (b) Simulations with the extended model with two E and two I nodes; $y_1$–$y_4$ show the activations of the nodes; b1: limit cycle oscillations in the unstable regime; b2: oscillations in the stable limit cycle regime; b3: fixed point regime (after [97])

2.2.7 Spatiotemporal Oscillations in Heterogeneous NNs

Neural networks describe the collective behavior of populations of neurons. It is of special interest to study populations with a large number of components having complex, nonlinear interactions. Homogeneous populations of neurons allow mathematical modeling in the mean-field approximation, leading to oscillatory models such as the Wilson–Cowan oscillators and Freeman KII sets. Field models with heterogeneous structure and dynamic variables are of great interest as well, as they are the prerequisite of associative memory functions in neural networks.

A general mathematical formulation views the neuropil, the interconnected neural tissue of the cortex, as a dynamical system evolving in the phase space; see (33.13). Consider a population of spiking neurons, each of which is modeled by a Hodgkin–Huxley equation. The state of a neuron at any time instant is determined by its depolarization potential, microscopic currents, and spike timing. Each neuron is represented by a point in the state space given by the above coordinates, comprising the vector $X(t) \in \mathbb{R}^n$, and the evolution of a neuron is given by its trajectory in the state space. Neuropils can include millions or billions of neurons; thus the phase space of the neurons contains a myriad of trajectories. Using the ensemble density approach of population modeling, the distribution of neurons in the state space at a given time t is described by a probability density function $p(X,t)$. The ensemble density approach models the evolution of the probability density in the state space [98]. One popular approach uses the Langevin formalism, given next.

2.2.8 Field Theories of Neural Networks

Consider the stochastic process  X ( t ) , which is described by the Langevin equation [99]

$dX(t) = \mu(X(t))\,dt + \sigma(X(t))\,dW(t) .$
(33.19)

Here μ and σ denote the drift and the diffusion amplitude, respectively, and $dW(t)$ is the increment of a Wiener process (Brownian motion) with normally distributed increments. The probability density $p(X,t)$ of the Langevin equation (33.19) satisfies the following form of the Fokker–Planck equation, after omitting higher-order terms

$\frac{\partial p(X,t)}{\partial t} = -\sum_{i=1}^{n}\frac{\partial}{\partial x_i}\left[\mu_i(X)\,p(X,t)\right] + \sum_{i=1}^{n}\sum_{j=1}^{n}\frac{\partial^2}{\partial x_i\,\partial x_j}\left[D_{ij}(X)\,p(X,t)\right].$
(33.20)

The Fokker–Planck equation has two components. The first is a flow term containing the drift vector $\mu_i(X)$, while the other term describes diffusion with the diffusion coefficient matrix $D_{ij}(X)$. The Fokker–Planck equation is a partial differential equation (PDE) that provides a deterministic description of macroscopic events resulting from random microscopic events. The mean-field approximation describes time-dependent, ensemble-averaged population properties, instead of keeping track of the behavior of individual neurons.
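
As a sketch of the relation between (33.19) and (33.20), the code below integrates the Langevin equation with the Euler–Maruyama scheme for an assumed Ornstein–Uhlenbeck drift and compares the ensemble variance with the stationary prediction of the corresponding Fokker–Planck equation:

```python
import numpy as np

rng = np.random.default_rng(4)
mu = lambda x: -x            # drift: Ornstein-Uhlenbeck, an illustrative choice
sigma = 0.5                  # constant diffusion amplitude
dt, steps, n = 0.01, 1000, 50000

# Euler-Maruyama integration of the Langevin equation (33.19) for an ensemble of units
X = rng.standard_normal(n)
for _ in range(steps):
    X += mu(X) * dt + sigma * np.sqrt(dt) * rng.standard_normal(n)

# The stationary Fokker-Planck solution for this drift is Gaussian with variance sigma^2 / 2
print("ensemble variance       :", round(X.var(), 3))
print("Fokker-Planck prediction:", round(sigma**2 / 2, 3))
```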

Mean-field models can be extended to describe the evolution of neural populations distributed in physical space. Considering the cortical sheet as a de facto continuum of the highly convoluted neural tissue (the neuropil), field theories of brains are developed using partial differential equations in space and time. The corresponding PDEs are wave equations. Consider a simple one-dimensional model describing the dynamics of the current density $\Phi(x,t)$ as a macroscopic variable. In the simple case of translational invariance of the connectivity function between two arbitrary points of the domain, with exponential decay, the following form of the wave equation is obtained [100]

$\frac{\partial^2\Phi}{\partial t^2} + \left(\omega_0^2 - v^2\Delta\right)\Phi + 2\omega_0\frac{\partial\Phi}{\partial t} = \left(\omega_0^2 + \omega_0\frac{\partial}{\partial t}\right)S\left[\Phi(x,t) + P(x,t)\right].$
(33.21)

Here $\Delta = \partial^2/\partial x^2$ is the Laplacian in one dimension, $S(\cdot)$ is a sigmoid transfer function for firing rates, and $P(x,t)$ describes the effect of inputs; $\omega_0 = v/\sigma$, where v is the propagation velocity along lateral axons and σ is the spatial relaxation constant of the applied exponential decay function [100]. The model can be extended to excitatory–inhibitory components. An example of simulations with a one-dimensional neural field model incorporating excitatory and inhibitory neurons is given in Fig. 33.19 [101]. The figure shows the propagation of two traveling pulses and the emergence of transient complex behavior, ultimately leading to an elevated firing rate across the whole tissue [101]. For recent developments in brain field models, see [90, 102].

Fig. 33.19

Numerical simulations of a one-dimensional neural field model showing the interaction of two traveling pulses (after [101])

2.2.9 Coupled Map Lattices for NNs

Spatiotemporal dynamics in complex systems has been modeled using coupled map lattices (CML) [103]. CMLs use a continuous state space and discrete time and space coordinates. In other words, CMLs are defined on (finite or infinite) lattices using discrete time iterations. Using periodic boundary conditions, the array can be folded into a circle in one dimension, or into a torus for lattices of dimension 2 or higher. The CML dynamics is described as follows

$x_{n+1}(i) = (1-\varepsilon)\,f(x_n(i)) + \frac{\varepsilon}{K}\sum_{k=-K/2,\,k\neq 0}^{K/2} f(x_n(i+k))\,,$
(33.22)

where $x_n(i)$ is the value of node i at iteration step n, $i = 1, \ldots, N$, and N is the size of the lattice. Note that a periodic boundary condition applies in (33.22). $f(\cdot)$ is a nonlinear mapping function used in the iterations and ε is the coupling strength, $0 \leq \varepsilon \leq 1$; ε = 0 means no coupling, while ε = 1 is maximum coupling. The CML rule defined in (33.22) has two terms. The first term on the right-hand side is an iterative update of the i-th state, while the second term describes the coupling between the units. The parameter K has a special role in coupled map lattices; it defines the size of the neighborhood. K = N describes mean-field coupling, while smaller K values correspond to smaller neighborhoods. The geometry of the system is similar to that shown in Fig. 33.14; the case of a local neighborhood is the upper left diagram in Fig. 33.14, while mean-field coupling is the lower right diagram. Similar rules have been defined for higher-dimensional lattices.

CMLs exhibit very rich dynamic behavior, including fixed points, limit cycles, and chaos, depending on the choice of the control parameters ε and K and the function $f(\cdot)$ [103, 104]. An example of the cubic transfer function

$f(x,a) = a x^3 - a x + x$

is shown in Fig. 33.20, together with the bifurcation diagram with respect to parameter a. By increasing the value of parameter a, the map exhibits bifurcations from fixed point to limit cycle, and ultimately to the chaotic regime.
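
A minimal CML sketch implementing (33.22) on a ring, using the cubic transfer function above; the neighborhood handling and the choice a = 4.0 (a value assumed to lie in the chaotic regime of the uncoupled map) are illustrative assumptions:

```python
import numpy as np

def cml_step(x, f, eps, K):
    """One iteration of Eq. (33.22) on a ring of N sites with neighborhood size K (even)."""
    fx = f(x)
    neighborhood = sum(np.roll(fx, k) for k in range(-K // 2, K // 2 + 1) if k != 0)
    return (1 - eps) * fx + eps * neighborhood / K

f = lambda x, a=4.0: a * x**3 - a * x + x      # cubic transfer function of Fig. 33.20
rng = np.random.default_rng(5)
x = rng.uniform(-1, 1, 256)                    # random initial lattice state

for _ in range(500):
    x = cml_step(x, f, eps=0.3, K=4)

print("state range:", round(x.min(), 2), "to", round(x.max(), 2),
      "| spatial std:", round(x.std(), 2))
```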

Fig. 33.20 a,b

Transfer function for the CML: (a) shape of the cubic transfer function $f(x,a) = ax^3 - ax + x$; (b) bifurcation diagram over parameter a

Complex CML dynamics has been used to design dynamic associative memory systems. In a CML, each memory is represented as a spatially coherent oscillation and is learnt by a correlational learning rule operating in the limit cycle or chaotic regimes. In such systems, both the memory capacity and the basin volume of each memory are larger in the CML than in the Hopfield model employing the same learning rule [105]. CML chaotic memories reduce the problem of spurious memories, but they are not immune to it. Spurious memories prevent the system from exploiting its memory capacity to the fullest extent.

2.2.10 Stochastic Resonance

Field models of brain networks develop deterministic PDEs (Fokker–Planck equations) for macroscopic properties based on a statistical description of the underlying stochastic dynamics of microscopic neurons. In other words, they are deterministic systems at the macroscopic level. Stochastic resonance (SR) deals with conditions under which a bistable or multistable system exhibits strong oscillations under weak periodic perturbations in the presence of random noise [106]. In a typical SR situation, the weak periodic carrier wave is insufficient to cross the potential barrier between the equilibria of a multistable system. Additive noise enables the system to surmount the barrier and exhibit oscillations as it transits between the equilibria. SR is an example of a process in which properly tuned random noise improves the performance of a nonlinear system, and it is highly relevant to neural signal processing [107, 108].
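
A classic textbook illustration of SR (not the neural model discussed next) is the periodically driven double-well system; the sketch below measures the spectral power of the response at the driving frequency for three noise levels, with all parameters chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(6)
dt, T = 5e-3, 400.0
steps = int(T / dt)
t = np.arange(steps) * dt
A, f0 = 0.2, 0.05            # weak periodic forcing: too small to cross the barrier alone

def response_power(D):
    """Euler-Maruyama simulation of dx = (x - x^3 + A sin(2 pi f0 t)) dt + sqrt(2D) dW."""
    noise = np.sqrt(2 * D * dt) * rng.standard_normal(steps)
    x = np.empty(steps)
    x[0] = 1.0
    for k in range(steps - 1):
        drift = x[k] - x[k]**3 + A * np.sin(2 * np.pi * f0 * t[k])
        x[k + 1] = x[k] + drift * dt + noise[k]
    spectrum = np.abs(np.fft.rfft(x - x.mean()))**2
    freqs = np.fft.rfftfreq(steps, dt)
    return spectrum[np.argmin(np.abs(freqs - f0))]

for D in (0.05, 0.2, 0.8):
    print(f"noise D = {D}: power at the driving frequency =", round(response_power(D)))
# The response at f0 is typically largest at an intermediate noise level (stochastic resonance).
```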

A prominent example of SR in a neural network with excitatory and inhibitory units is described in [109]. In the model developed there, the activation rates of excitatory and inhibitory neurons are described by $\mu_e$ and $\mu_i$, respectively. The ratio $\alpha = \mu_e/\mu_i$ is an important parameter of the system. The investigated neural populations exhibit a range of dynamic behaviors, including convergence to a fixed point, damped oscillations, and persistent oscillations. Figure 33.21 summarizes the main findings in the form of a phase diagram in the space of the parameter α and the noise level. The diagram contains three regions. Region I is at low noise levels and corresponds to oscillations decaying to a fixed point at an exponential rate. Region II corresponds to high noise, when the neural activity exhibits damped oscillations as it approaches the steady state. Region III, however, demonstrates sustained oscillations for an intermediate level of noise. If α is above a critical value (see the tip of Region III), the activities in the steady state undergo a first-order phase transition at a critical noise level. The intensive oscillations in Region III at an intermediate noise level show that the output of the system (oscillations) can be enhanced by an optimally selected noise level.

Fig. 33.21

Stochastic resonance in excitatory–inhibitory neural networks; α describes the relative strength of inhibition. Region I: fixed point dynamics. Region II: damped oscillatory regime. Region III: sustained periodic oscillations illustrating stochastic resonance (after [109])

The observed phase transitions may be triggered by neuronal avalanches, when the neural system is close to a critical state and the activation of a small number of neurons can generate an avalanche process of activation [110]. Neural avalanches have been described using self-organized criticality (SOC), which has been identified in neural systems [111]. There is much empirical evidence of the cortex conforming to self-stabilized, scale-free dynamics with avalanches during the existence of some quasi-stable states [112, 113]. These avalanches maintain a metastable background state of cortical activity.

Phase transitions have been studied in models with extended layers of excitatory and inhibitory neuron populations. A specific model uses random cellular neural networks to describe conditions with sustained oscillations [114]. The role of various control parameters has been studied, including noise level, inhibition, and rewiring. Rewiring describes long axonal connections, producing neural network architectures that resemble the connectivity patterns of short- and long-range axons in the neuropil. By properly tuning the parameters, the system can reside in a fixed point regime in isolation, but it will switch to persistent oscillations under the influence of learnt input patterns [115].

2.3 Chaotic Neural Networks

2.3.1 Emergence of Chaos in Neural Systems

Neural networks as dynamical systems are described by the state vector $X(t)$, which obeys the equation of motion (33.13). Dynamical systems can exhibit fixed point, periodic, and chaotic behaviors. Fixed points, periodic oscillations, and transitions from one to the other through bifurcations have been described in Sect. 33.2.2. The trajectory of a chaotic system does not converge to a fixed point or limit cycle; rather, it converges to a chaotic attractor. Chaotic attractors, or strange attractors, have the property that they define a fractal set in the state space; moreover, chaotic trajectories that are close to each other at some point diverge from each other exponentially fast as time evolves [116, 117].

An example of the chaotic Lorenz attractor is shown in Fig. 33.22. The Lorenz attractor is defined by a system of three ordinary differential equations (ODEs) with nonlinear coupling, originally derived for the description of the motion of viscous flows [118]. The time series of the variables X, Y, and Z are shown in Fig. 33.22a for parameters in the chaotic region, while the strange attractor is illustrated by the trajectory in the phase space; see Fig. 33.22b.
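
A minimal sketch integrating the Lorenz system in its classic chaotic regime and illustrating the exponential divergence of nearby trajectories (the parameter values σ = 10, ρ = 28, β = 8/3 are the standard choices):

```python
import numpy as np
from scipy.integrate import solve_ivp

# Classic Lorenz parameters in the chaotic regime
sigma, rho, beta = 10.0, 28.0, 8.0 / 3.0

def lorenz(t, s):
    x, y, z = s
    return [sigma * (y - x), x * (rho - z) - y, x * y - beta * z]

# Two trajectories starting 1e-8 apart illustrate sensitive dependence on initial conditions
sol_a = solve_ivp(lorenz, (0, 25), [1.0, 1.0, 1.0], dense_output=True, max_step=0.01)
sol_b = solve_ivp(lorenz, (0, 25), [1.0 + 1e-8, 1.0, 1.0], dense_output=True, max_step=0.01)

for t in (1.0, 10.0, 20.0):
    d = np.linalg.norm(sol_a.sol(t) - sol_b.sol(t))
    print(f"t = {t:5.1f}: separation = {d:.2e}")
```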

Fig. 33.22 a,b

Lorenz attractor in the chaotic regime; (a) time series of the variables  X , Y , and Z; (b) butterfly-winged chaotic Lorenz attractor in the phase space spanned by variables  X , Y , and Z

2.3.2 Chaotic Neuron Model

In chaotic neural networks the individual components exhibit chaotic behavior, and the goal is to study the order emerging from their interaction. Nerve membranes produce propagating action potentials in a highly nonlinear process which can generate oscillations and bifurcations to chaos. Chaos has been observed in the giant axon of the squid, and it has been used to study chaotic behavior in neurons. The Hodgkin–Huxley equations can model nonlinear dynamics in the squid giant axon with high accuracy [58]. The chaotic neuron model of Aihara et al. is an approximation of the Hodgkin–Huxley equations and it reproduces chaotic oscillations observed in the squid giant axon [119, 120]. The model uses the following simple iterative map

x ( t + 1 ) = k x ( t ) - α f ( x ( t ) ) + a ,
(33.23)

where  x ( t ) is the state of the chaotic neuron at time t, k is a decay parameter, α characterizes refractoriness, a is a combined bias term, and  f ( x ( t ) ) is a nonlinear transfer function. In the chaotic neuron model, the log-sigmoid transfer function is used, see (33.17). Equation (33.23) combined with the sigmoid produces a piecewise monotonic map, which generates chaos.
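For illustration, the iterative map (33.23) is easy to simulate directly. The following Python sketch assumes a logistic sigmoid with steepness 1/ε and illustrative parameter values (they are not taken from [119, 120]); depending on the choice of k, α, a, and ε the map produces periodic or chaotic sequences.

```python
# Minimal sketch of the chaotic neuron map (33.23) with a log-sigmoid output.
import numpy as np

def f(y, eps=0.02):
    # Logistic (sigmoid) transfer function with steepness 1/eps
    return 1.0 / (1.0 + np.exp(-y / eps))

def chaotic_neuron(x0=0.1, k=0.5, alpha=1.0, a=0.8, n_steps=1000):
    """Iterate x(t+1) = k*x(t) - alpha*f(x(t)) + a, cf. (33.23)."""
    xs = np.empty(n_steps)
    x = x0
    for t in range(n_steps):
        x = k * x - alpha * f(x) + a
        xs[t] = x
    return xs

states = chaotic_neuron()    # irregular, bounded sequence for these parameters
```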

Chaotic neural networks composed of chaotic neurons generate spatio-temporal chaos and are able to retrieve previously learnt patterns as the chaotic trajectory traverses the state space. Chaotic neural networks are used in various information processing systems with abilities of parallel distributed processing [121, 122, 123]. Note that CMLs also consist of chaotic oscillators produced by a nonlinear local iterative map, as in chaotic neural networks. CMLs define a spatial relationship among their nodes to describe spatio-temporal fluctuations. A class of cellular neural networks combines explicit spatial relationships similar to CMLs with detailed temporal dynamics using the Cohen–Grossberg model [83], and it has been used successfully in neural network applications [124, 125].

2.3.3 Collective Chaos in Neural Networks

Chaos in neural networks can be an emergent macroscopic property stemming from the interaction of nonlinear neurons, which are not necessarily chaotic in isolation. Starting from the microscopic neural level up to the macroscopic level of cognition and consciousness, chaos plays an important role in neurodynamics [126, 127, 128, 129, 82]. There are various routes to chaos in neural systems, including period-doubling bifurcations to chaos, chaotic intermittency, and collapse of a two-dimensional torus to chaos [130, 131].

Chaotic itinerancy is a special form of chaos, which lies between ordered dynamics and fully developed chaos. Chaotic itinerancy describes the trajectory of neural activity through a high-dimensional state space [132]. In chaotic itinerancy the chaotic system is destabilized to some degree, but some traces of the trajectories remain. This describes an itinerant behavior between the states of the system containing destabilized attractors or attractor ruins, which can be fixed points, limit cycles, tori, or strange attractors with unstable directions. Dynamical orbits are attracted to a certain attractor ruin, but they leave via an unstable manifold after a (short or long) stay around it and move toward another attractor ruin. This successive chaotic transition continues unless a strong input is received. A schematic diagram is shown in Fig. 33.23, where the trajectory of a chaotic itinerant system is shown visiting attractor ruins. Chaotic itinerancy is associated with perceptions and memories, the chaos between the attractor ruins is related to searches, and the itinerancy is associated with sequences in thinking, speaking, and acting.

Fig. 33.23

Schematic illustration of itinerant chaos with a trajectory visiting attractor ruins (after [132])

Frustrated chaos is a dynamical regime arising in a neural network with a global attractor structure when local connectivity patterns responsible for stable oscillatory behaviors become intertwined, leading to mutually competing attractors and unpredictable itinerancy between brief appearances of these attractors [133]. Similar to chaotic itinerancy, frustrated chaos is related to destabilization of the dynamics, and it generates itinerant, wavering oscillations between the orbits of the network, whose trajectories were stable with the original connectivity pattern. Frustrated chaos has been shown to belong to the intermittency family of chaos [134, 135].

To characterize chaotic dynamics, tools of statistical time series analysis are useful. The studies may involve the time and frequency domains. Time domain analysis includes attractor reconstruction, i.e., the attractor is depicted in the state space. Chaotic attractors have fractal dimensions, which can be evaluated using one of the available methods [136, 137, 138]. In the case of low-dimensional chaotic systems, the reconstruction can be illustrated using two- or three-dimensional plots. An example of attractor reconstruction is given in Fig. 33.22 for the Lorenz system with three variables. Attractor reconstruction of a time series can be conducted using time-delay coordinates [139].
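A minimal sketch of time-delay reconstruction, assuming an illustrative delay and embedding dimension (in practice these are chosen, e.g., from the autocorrelation function and a false-nearest-neighbor analysis):

```python
# Minimal sketch of attractor reconstruction with time-delay coordinates:
# a scalar series x(t) is embedded as vectors [x(t), x(t+tau), x(t+2*tau), ...],
# which is equivalent to the usual delayed coordinates up to a time shift.
import numpy as np

def delay_embed(x, dim=3, tau=10):
    n = len(x) - (dim - 1) * tau
    return np.column_stack([x[i * tau : i * tau + n] for i in range(dim)])

# Placeholder signal; in practice this would be a measured scalar time series
x_series = np.sin(0.05 * np.arange(5000)) + 0.1 * np.random.randn(5000)
embedded = delay_embed(x_series, dim=3, tau=10)   # rows are points in the reconstructed state space
```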

Lyapunov spectrum analysis is a key tool in identifying and describing chaotic systems. Lyapunov exponents measure the instability of orbits in different directions in the state space; they describe the rate of exponential divergence of trajectories that were once close to each other. The set of corresponding Lyapunov exponents constitutes the Lyapunov spectrum. The maximum Lyapunov exponent Λ* is of crucial importance, as a positive leading Lyapunov exponent Λ* > 0 is the hallmark of chaos. Let X(t) describe the trajectory of the system in the phase space starting from X(0) at time t = 0. Denote by X_{Δx_0}(t) the perturbed trajectory starting from [X(0) + Δx_0]. The leading Lyapunov exponent can be determined using the following relationship [140]

$$\Lambda^{*} = \lim_{t \to \infty,\ \Delta x_{0} \to 0} \frac{1}{t}\, \ln \frac{\left| X_{\Delta x_{0}}(t) - X(t) \right|}{\left| \Delta x_{0} \right|}\;,$$
(33.24)

where  Λ * < 0 corresponds to convergent behavior, Λ * = 0 indicates periodic orbits, and Λ * > 0 signifies chaos. For example, the Lorenz attractor has Λ * = 0.906, indicating strong chaos (Fig. 33.24). Equation (33.24) measures the divergence for infinitesimal perturbations in the limit of infinite time series. In practical situations, especially for short time series, it is often difficult to distinguish weak chaos from random perturbations. One must be careful with conclusions about the presence of chaos when  Λ * has a value close to zero. Lyapunov exponents are widely used in brain monitoring using electroencephalogram (EEG) analysis, and various methods are available for the characterization of normal and pathological brain conditions based on Lyapunov spectra [141, 142].
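The divergence rate in (33.24) can be estimated numerically by following two nearby trajectories and renormalizing their separation at every step. The sketch below applies this idea to the logistic map, used here purely as an illustrative chaotic system; for r = 4 the known value is Λ* = ln 2 ≈ 0.693.

```python
# Minimal sketch: leading Lyapunov exponent of a 1-D map via two nearby
# trajectories with per-step renormalization (a discrete-time analogue of (33.24)).
import numpy as np

def logistic(x, r=4.0):
    return r * x * (1.0 - x)

def leading_lyapunov(f, x0=0.3, d0=1e-8, n_steps=100000):
    x, y = x0, x0 + d0
    total = 0.0
    for _ in range(n_steps):
        x, y = f(x), f(y)
        d = abs(y - x)
        total += np.log(d / d0)
        y = x + d0 * np.sign(y - x)   # renormalize the perturbation to size d0
    return total / n_steps

print(leading_lyapunov(logistic))      # approximately ln 2 = 0.693 for r = 4
```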

Fourier analysis conducts data processing in the frequency domain, see (33.5) and (33.6). For chaotic signals, the shape of the power spectrum is of special interest. Power spectra often show  1 / f α power-law behavior in log–log coordinates, which is an indication of a scale-free system and possibly chaos. Power-law scaling in systems at SOC is suggested by a linear decrease in log power with increasing log frequency [143]. Scaling properties of criticality facilitate the coexistence of spatially coherent cortical activity patterns for durations ranging from a few milliseconds to a few seconds. Scale-free behavior characterizes chaotic brain activity both in the time and frequency domains. For completeness, we mention Hilbert analysis as an alternative to Fourier methods. The analytic signal approach based on the Hilbert transform is widely used in brain monitoring.
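A minimal sketch of testing for 1/f^α scaling: estimate the power spectrum with an FFT and fit a straight line to log power versus log frequency. The signal below is synthetic Brownian noise (expected α ≈ 2); in practice an EEG or LFP segment would be analyzed, and the sampling rate and fitting band used here are assumptions.

```python
# Minimal sketch: estimate the 1/f^alpha exponent from a log-log spectral fit.
import numpy as np

rng = np.random.default_rng(2)
fs, n = 250.0, 2 ** 14                        # assumed sampling rate and length
signal = np.cumsum(rng.standard_normal(n))    # Brownian noise, expected alpha ~ 2

freqs = np.fft.rfftfreq(n, d=1.0 / fs)
power = np.abs(np.fft.rfft(signal - signal.mean())) ** 2

mask = (freqs > 1.0) & (freqs < 50.0)         # fit over an intermediate frequency band
slope, _ = np.polyfit(np.log10(freqs[mask]), np.log10(power[mask]), 1)
print("estimated alpha:", -slope)
```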

2.3.4 Emergent Macroscopic Chaos in Neural Networks

Freeman’s K model describes spatio-temporal brain chaos using a hierarchical approach. Low-level K sets were introduced in the 1970s, named in honor of Aharon Kachalsky, an early pioneer of neural dynamics [82, 94]. K sets are multiscale models, describing an increasing complexity of structure and dynamics. K sets are mesoscopic models and represent an intermediate level between microscopic neurons and macroscopic brain structures. K sets are topological specifications of the hierarchy of connectivity in neural populations in brains. K sets describe the spatial patterns of phase and amplitude of the oscillations generated by neural populations. They model observable fields of neural activity comprising electroencephalograms (EEGs), local field potentials (LFPs), and magnetoencephalograms (MEGs) [144]. K sets form a hierarchy of cell assemblies with components ranging from K0 to KIV [145, 146].

Fig. 33.24 a–c

KIII diagram and behaviors; (a) three double-layer hierarchy of KIII with time series over each layer, exhibiting intermittent chaotic oscillations; (b) phase space reconstruction using delayed time coordinates

K0 sets represent noninteractive collections of neurons forming cortical microcolumns; a K0 set models a neuron population of 10³–10⁴ neurons. K0 models dendritic integration in average neurons and an asymmetric static sigmoid nonlinearity for axonal transmission. The K0 set is governed by a point attractor with zero output and stays at equilibrium except when perturbed. In the original K-set models, K0 sets are described by a state-dependent, linear second-order ordinary differential equation (ODE) [94]

$$a b \,\frac{\mathrm{d}^{2} X(t)}{\mathrm{d}t^{2}} + (a + b)\,\frac{\mathrm{d} X(t)}{\mathrm{d}t} + X(t) = U(t)\;.$$
(33.25)

Here a and b are biologically determined time constants, X(t) denotes the activation of the node as a function of time, and U(t) includes an asymmetric sigmoid function Q(x), see (33.18), acting on the weighted sum of activation from neighboring nodes and any external input.
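For illustration, (33.25) can be integrated by rewriting it as two first-order equations. The time constants and input pulse in the following Python sketch are assumptions chosen only to show the point-attractor behavior: the response relaxes back to zero after the input is removed.

```python
# Minimal sketch of the K0 dynamics (33.25) integrated with a simple Euler scheme.
import numpy as np

a, b = 0.22, 0.72            # assumed time constants (illustrative, not the chapter's values)
dt, n_steps = 0.001, 5000

def k0_step(x, v, u, dt):
    # a*b*x'' + (a+b)*x' + x = u  ->  x'' = (u - (a+b)*v - x) / (a*b)
    acc = (u - (a + b) * v - x) / (a * b)
    return x + dt * v, v + dt * acc

x, v = 0.0, 0.0
response = np.empty(n_steps)
for i in range(n_steps):
    u = 1.0 if i * dt < 0.5 else 0.0          # brief external pulse as input
    x, v = k0_step(x, v, u, dt)
    response[i] = x                            # relaxes back to the zero fixed point
```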

KI sets are made of interacting K0 sets, either excitatory or inhibitory, with positive feedback. The dynamics of KI is described as convergence to a nonzero fixed point. If a KI set has sufficient functional connection density, then it is able to maintain a nonzero state of background activity by mutual excitation (or inhibition). KI typically operates far from thermodynamic equilibrium. Neural interaction by stable mutual excitation (or mutual inhibition) is fundamental to understanding brain dynamics. KII sets consist of interacting excitatory and inhibitory KI sets with negative feedback. KII sets are responsible for the emergence of limit cycle oscillations due to the negative feedback between the neural populations. Transitions from a point attractor to a limit cycle attractor can be achieved through a suitable level of feedback gain or by input stimuli, see Fig. 33.18.

KIII sets are made up of multiple interacting KII sets. Examples include the sensory cortices. KIII sets generate broadband, chaotic oscillations as background activity by combined negative and positive feedback among several KII populations with incommensurate frequencies. The increase in nonlinear feedback gain driven by input results in the destabilization of the background activity and leads to the emergence of a spatial amplitude modulation (AM) pattern in KIII. KIII sets are responsible for the embodiment of meaning in AM patterns of neural activity shaped by synaptic interactions that have been modified through learning in KIII layers. The KIII model is illustrated in Fig. 33.24 with three layers of excitatory–inhibitory nodes. In Fig. 33.24a the temporal dynamics is illustrated for each layer, while Fig. 33.24b shows the phase space reconstruction of the attractor. This chaotic behavior resembles the dynamics of the Lorenz attractor in Fig. 33.22. KIV sets are made up of interacting KIII units to model intentional neurodynamics of the limbic system. KIV exhibits global phase transitions, which are manifestations of hemisphere-wide cooperation through intermittent large-scale synchronization. KIV is the domain of Gestalt formation and preafference through the convergence of external and internal sensory signals leading to intentional action [144, 146].

2.3.5 Properties of Collective Chaotic Neural Networks

KIII is an associative memory, encoding input data in spatio-temporal AM patterns [147, 148]. KIII chaotic memories have several advantages compared to convergent recurrent networks:

1. They produce robust memories based on relatively few learning examples, even in noisy environments.

2. The encoding capacity of a network with a given number of nodes is exponentially larger than that of their convergent counterparts.

3. They can recall the stored data very quickly, just as humans and animals can recognize a learnt pattern within a fraction of a second.

The recurrent Hopfield neural network can store an estimated 0.15N input patterns in stable attractors, where N is the number of neurons [84]. Exact analysis by McEliece et al. [149] shows that the memory capacity of the Hopfield network is N / (4 log N). Various generalizations provide improvements over the initial memory gain [150, 151]. It is of interest to evaluate the memory capacity of the KIII memory. The memory capacity of chaotic networks which encode the input into chaotic attractors is, in principle, exponentially increased with the number of nodes. However, the efficient recall of the stored memories is a serious challenge. The memory capacity of KIII as a chaotic associative memory device has been evaluated with noisy input patterns. The results are shown in Fig. 33.25, where the performance of Hopfield and KIII memories is compared; the top two plots are for Hopfield nets, while the lower two figures describe KIII results [152]. The light color shows a recognition rate close to 100%, while the dark color means poor recognition approaching 0. The right-hand column has higher noise levels. The Hopfield network shows the well-known linear gain curve with slope 0.15. The KIII model, on the other hand, has a drastically better performance. The boundary separating the correct and incorrect classification domains is superlinear; it has been fitted with a fifth-order polynomial.
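The Hebbian storage and recall mechanism behind the 0.15N estimate can be illustrated with a small simulation. The following sketch is a textbook Hopfield network, not the KIII model; the network size, number of patterns, and noise level are illustrative assumptions.

```python
# Minimal sketch of a Hopfield associative memory: Hebbian storage and
# asynchronous recall from a corrupted probe pattern.
import numpy as np

rng = np.random.default_rng(0)
N, P = 100, 10                                   # neurons and stored patterns (P < 0.15*N)
patterns = rng.choice([-1, 1], size=(P, N))
W = (patterns.T @ patterns) / N                  # Hebbian outer-product rule
np.fill_diagonal(W, 0.0)

def recall(probe, n_sweeps=10):
    s = probe.copy()
    for _ in range(n_sweeps):
        for i in rng.permutation(N):             # asynchronous updates
            s[i] = 1 if W[i] @ s >= 0 else -1
    return s

noisy = patterns[0].copy()
flip = rng.choice(N, size=20, replace=False)     # corrupt 20% of the bits
noisy[flip] *= -1
overlap = (recall(noisy) @ patterns[0]) / N      # close to 1.0 for successful recall
print(overlap)
```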

Fig. 33.25 a,b

Comparison of the memory capacity of (a) Hopfield and (b) KIII neural networks; the noise level is 40 % (left); 50 % (right); the lighter the color the higher the recall accuracy. Observe the linear gain for Hopfield networks and the superlinear (fifth-order) separation for KIII (after [152])

2.3.6 Cognitive Implications of Intermittent Brain Chaos

Developments in brain monitoring techniques provide increasingly detailed insights into spatio-temporal neurodynamics and the neural correlates of large-scale cognitive processing [153, 154, 155, 74]. Brains as large-scale dynamical systems have a basal state, which is a high-dimensional chaotic attractor with a dynamic trajectory wandering broadly over the attractor landscape [126, 82]. Under the influence of external stimuli, cortical dynamics is destabilized and condenses intermittently to a lower-dimensional, more organized subspace. This is the act of perception, when the subject identifies the stimulus with a meaning in the context of its previous experience. The system stays intermittently in the condensed, more coherent state, which gives rise to a spatio-temporal AM activity pattern corresponding to the stimulus in the given context. The AM pattern is meta-stable, and it disintegrates as the system returns to the high-dimensional chaotic basal state (less synchrony). Brain dynamics is thus described as a sequence of phase transitions with intermittent synchronization–desynchronization effects. The rapid emergence of synchronization can be initiated by (Hebbian) neural assemblies that lock into synchronization across widespread cortical and subcortical areas [156, 157, 82].

Intermittent oscillations in spatio-temporal neural dynamics are modeled by a neuropercolation approach. Neuropercolation is a family of probabilistic models based on the theory of probabilistic cellular automata on lattices and random graphs, and it is motivated by the structural and dynamical properties of neural populations. Neuropercolation constructs the hierarchy of interactive populations in networks as developed in the Freeman K models [144, 94], but replaces differential equations with probability distributions derived from the observed random networks that evolve in time [158]. Neuropercolation considers populations of cortical neurons which sustain their background state by mutual excitation, and their stability is guaranteed by the neural refractory periods. Neural populations transmit and receive signals from other populations by virtue of small-world effects [159, 77]. Tools of statistical physics and finite-size scaling theory are applied to describe the critical behavior of the neuropil. Neuropercolation theory provides a mathematical approach to describe phase transitions and critical phenomena in large-scale, interactive cortical networks. The existence of phase transitions has been proven in specific probabilistic cellular automata models [160, 161].
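A minimal sketch of the flavor of such models: a probabilistic cellular automaton in which each site adopts the local majority state and flips with probability ε. This is a strong simplification of the neuropercolation models in [158, 160, 161]; the lattice size and noise level are assumptions.

```python
# Minimal sketch of a majority-vote probabilistic cellular automaton with noise eps.
import numpy as np

rng = np.random.default_rng(1)
L, eps, n_steps = 64, 0.12, 200
state = rng.integers(0, 2, size=(L, L))

def step(s, eps):
    # sum of the 4 nearest neighbours plus the site itself (periodic boundaries)
    total = (s + np.roll(s, 1, 0) + np.roll(s, -1, 0)
               + np.roll(s, 1, 1) + np.roll(s, -1, 1))
    majority = (total >= 3).astype(int)
    noise = rng.random(s.shape) < eps            # random flips with probability eps
    return np.where(noise, 1 - majority, majority)

activity = []
for _ in range(n_steps):
    state = step(state, eps)
    activity.append(state.mean())                # population activity over time
```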

Simulations with neuropercolation models demonstrate the onset of large-scale synchronization–desynchronization behavior [162]. Figure 33.26 illustrates results of intermittent phase desynchronization for neuropercolation with excitatory and inhibitory populations. Three main regimes can be distinguished, separated by critical noise values ε1 > ε0. In Regime I (ε > ε1), Fig. 33.26a, the channels are not synchronous and the phase values are distributed broadly. In Regime II (ε1 > ε > ε0), Fig. 33.26b, the phase lags are drastically reduced, indicating significant synchrony over extended time periods. Regime III is observed for low noise values (ε0 > ε), when the channels demonstrate highly synchronized, frozen dynamics, see Fig. 33.26c. Similar transitions can be induced by the relative strength of inhibition, as well as by the fraction of rewiring across the network [114, 115, 163]. The probabilistic model of neural populations reproduces important properties of the spatio-temporal dynamics of cortices and is a promising approach for large-scale cognitive models.

Fig. 33.26 a–c

Phase synchronization–desynchronization with excitatory–inhibitory connections in neuropercolation with 256 granule nodes; the z-axis shows the pair-wise phase between the units. (a) No synchrony; (b) intermittent synchrony; (c) highly synchronized, frozen phase regime (after [162])

3 Memristive Neurodynamics

Sequential processing of fetch, decode, and execute instructions in classical von Neumann digital computers has resulted in less efficient machines as their ecosystems have grown increasingly complex [164]. Though modern digital computers are fast and complex enough to emulate the brain functionality of animals like spiders, mice, and cats [165, 166], the associated energy dissipation in the system grows exponentially along the hierarchy of animal intelligence. For example, to perform certain cortical simulations at the cat scale, even at an 83 times slower firing rate, the IBM team had to employ Blue Gene/P (BG/P), a supercomputer equipped with 147456 CPUs and 144 TB of main memory. On the other hand, the human brain contains more than 100 billion neurons, and each neuron has more than 20000 synapses [167]. Efficient circuit implementation of synapses, therefore, is very important for building a brain-like machine. One active branch of this research area is cellular neural networks (CNNs) [168, 169], where many multiplication circuits are utilized in a complementary metal-oxide-semiconductor (CMOS) chip. However, since shrinking the current transistor size is very difficult, introducing a more efficient approach is essential for the further development of neural network hardware.

The memristor was first postulated by Chua as the fourth basic circuit element in electrical circuits in 1971 [170]. It is based on the nonlinear characteristics of charge and flux. By supplying a voltage or current to the memristor, its resistance can be altered. In this way, the memristor remembers information. In that seminal work, Chua demonstrated that the memristance  M ( q ) relates the charge q and the flux φ in such a way that the resistance of the device changes with the applied electric field and time

$$M = \frac{\mathrm{d}\varphi}{\mathrm{d}q}\;.$$
(33.26)

The parameter M denotes the memristance of a charge-controlled memristor, measured in ohms. Thus, the memristance M can be controlled by applying a voltage or current signal across the memristor. In other words, the memristor behaves like an ordinary resistor at any given instant of time, but its resistance depends on the complete history of the device [170].

Although the device was proposed nearly four decades ago, it was not until 2008 that researchers from HP Labs showed that the devices they had fabricated were indeed two-terminal memristors [171]. Figure 33.27 shows the I–V characteristics of a generic memristor, where memristance behavior is observed for TiO2-based devices. A TiO2−x layer with oxygen vacancies is placed on a perfect TiO2 layer, and these layers are sandwiched between platinum electrodes. In metal-oxide materials, the switching from R_off to R_on and vice versa occurs as a result of ion migration, due to the enormous electric fields applied across the nanoscale structures. These memristors have been fabricated using nanoimprint lithography and were successfully integrated on a CMOS substrate in [172]. Apart from these metal-oxide memristors, memristance has also been demonstrated using magnetic materials, based on their magnetic domain wall motion and spin-torque-induced magnetization switching, in [173]. Furthermore, several different types of nonlinear memristor models have been investigated [174, 175]. One of them is the window model, in which the state equation is multiplied by a window function  F p ( ω ) , namely

$$\frac{\mathrm{d}\omega}{\mathrm{d}t} = \mu_{v}\,\frac{R_{\mathrm{on}}}{D}\, i(t)\, F_{p}(\omega)\;,$$
(33.27)

where p is an integer parameter and  F p ( ω ) is defined by

$$F_{p}(\omega) = 1 - \left(\frac{2\omega}{D} - 1\right)^{2p},$$
(33.28)

which is shown in Fig. 33.28.
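For illustration, the model (33.27)–(33.28) can be simulated by Euler integration of the state variable ω while the device is driven by a sinusoidal voltage; the resulting current–voltage pairs trace out a pinched hysteresis loop like that in Fig. 33.27. Parameter values follow the figure caption where given; the remaining choices (time step, simulation length) are assumptions.

```python
# Minimal sketch of the nonlinear memristor model (33.27)-(33.28) under a
# sinusoidal drive, producing a pinched I-V hysteresis loop.
import numpy as np

R_on, R_off, D = 100.0, 16e3, 10e-9            # ohms, ohms, metres
mu_v, p = 1e-10 * 1e-4, 10                     # 1e-10 cm^2/(s*V) converted to m^2/(s*V)
w = D * (R_off - 11e3) / (R_off - R_on)        # initial state chosen so that M = R_init = 11 kOhm

dt = 1e-4
t = np.arange(0.0, 2.0, dt)
voltage = np.sin(2 * np.pi * t)
current = np.empty_like(t)

for k, v in enumerate(voltage):
    M = R_on * (w / D) + R_off * (1.0 - w / D)     # instantaneous memristance
    i = v / M
    current[k] = i
    F = 1.0 - (2.0 * w / D - 1.0) ** (2 * p)       # window function (33.28)
    w += dt * mu_v * (R_on / D) * i * F            # state equation (33.27)
    w = min(max(w, 0.0), D)                        # keep the state within the device
# (voltage, current) traces out the pinched hysteresis loop
```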

Fig. 33.27

Typical I–V characteristic of a memristor (after [171]). The pinched hysteresis loop is due to the nonlinear relationship between the memristor's current and voltage. The parameters of the memristor are R_on = 100 Ω, R_off = 16 kΩ, R_init = 11 kΩ, D = 10 nm, μ_v = 10⁻¹⁰ cm² s⁻¹ V⁻¹, p = 10, and V_in = sin(2πt). The memristor exhibits the feature of pinched hysteresis, which means that a lag occurs between the application and the removal of a field and its subsequent effect, just like the behavior of neurons in the human brain

Fig. 33.28

Window function for different integer p

3.1 Memristor-Based Synapses

The design of simple weighting circuits for synaptic multiplication between arbitrary input signals and weights is extremely important in artificial neural systems. Some efforts have been made to build neuron-like analog neural networks [178, 179, 180]. However, this research has gained limited success so far because of the difficulty of implementing the synapses efficiently. Based on the memristor, a novel weighting circuit was proposed by Kim et al. [176, 181, 182], as shown in Fig. 33.29. The memristors provide a bridge-like switching for achieving either positive or negative weighting. Though several memristors are employed to emulate a synapse, the total area of the memristors is less than that of a single transistor.

Fig. 33.29

Memristor bridge circuit. The synaptic weight is programmable by varying the input voltage. The weighting of the input signal is also performed in this circuit (after [176])

Fig. 33.31

Memristor-based cellular neural networks cell (after [183])

To compensate for the spatial nonuniformity and nonideal response of the memristor bridge synapse, a modified chip-in-the-loop learning scheme suitable for the proposed neural network architecture was investigated [176]. In the proposed method, the initial learning is conducted in software, and the behavior of the software-trained network is learned by the hardware network by training each of the single-layered neurons of the network independently. The forward calculation of single-layered neuron learning is implemented through circuit hardware and is followed by a weight-updating phase assisted by a host computer. Unlike conventional chip-in-the-loop learning, the need to read out synaptic weights for calculating weight updates in each epoch is eliminated by virtue of the memristor bridge synapse and the proposed learning scheme.

On the other hand, spike-timing-dependent plasticity (STDP), which is a powerful learning paradigm for spiking neural systems because of its massive parallelism, potential scalability, and inherent defect, fault, and failure tolerance, can be implemented by using a crossbar memristive array combined with neurons that asynchronously generate spikes of a given shape [177, 185]. Such spikes need to be sent back through the neurons to the input terminal, as in Fig. 33.30. The shape of the spikes turns out to be very similar to the neural spikes observed in realistic biological neurons. The STDP learning function obtained by combining such neurons with memristors closely matches the one obtained from neurophysiological experiments on real synapses. Such nanoscale synapses can be combined with CMOS neurons, which makes it possible to create neuromorphic hardware several orders of magnitude denser than conventional CMOS. This method offers better control over power dissipation; fewer constraints on the design of memristive materials used for nanoscale synapses; greater freedom in learning algorithms than traditional designs of synapses, since the synaptic learning dynamics can be dynamically turned on or off; greater control over the precise form and timing of the STDP equations; the ability to implement a variety of other learning laws besides STDP; and better circuit diversity, since the approach allows different learning laws to be implemented in different areas of a single chip using the same memristive material for all synapses. Furthermore, an analog CMOS neuromorphic design utilizing STDP and memristor synapses has been investigated for use in building a multipurpose analog neuromorphic chip [186]. In order to obtain a multipurpose chip, a suitable architecture is established. Based on the IBM 90 nm CMOS9RF technology, neurons are designed to interface with Verilog-A memristor synapse models to perform the XOR operation and edge detection.
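For reference, a pair-based STDP rule of the kind such circuits aim to reproduce can be written in a few lines. The amplitudes and time constants below are generic illustrative values, not parameters of the memristive implementations cited above.

```python
# Minimal sketch of a pair-based STDP update: the weight change depends
# exponentially on the time difference between pre- and postsynaptic spikes.
import numpy as np

def stdp_dw(delta_t, a_plus=0.01, a_minus=0.012, tau_plus=20.0, tau_minus=20.0):
    """delta_t = t_post - t_pre in milliseconds."""
    if delta_t >= 0:                                   # pre before post -> potentiation
        return a_plus * np.exp(-delta_t / tau_plus)
    return -a_minus * np.exp(delta_t / tau_minus)      # post before pre -> depression

# Example: weight changes for a range of spike-timing differences
for dt_ms in (-40, -10, 0, 10, 40):
    print(dt_ms, round(stdp_dw(dt_ms), 5))
```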

Fig. 33.30

Neuromorphic memristive computer equipped with STDP (after [177])

To make neurons compatible with such new synapses, some novel training methods have been proposed. For instance, Manem et al. proposed a variation-tolerant training method to efficiently reconfigure memristive synapses in a trainable threshold gate array (TTGA) system [187]. The training process is inspired by the gradient-descent machine learning algorithm commonly used to train artificial threshold neural networks known as perceptrons. The proposed training method is not only robust to the unpredictability of CMOS and nanocircuits at decreasing technology sizes, but also provides its own randomness in its training process.

3.2 Memristor-Based Neural Networks

Employing memristor-based synapses, some results have been obtained on memristor-based neural networks [183, 184, 188]. As the template weights in memristor-based neural networks (MNNs) are usually known and need to be updated between each template in a sequence of templates, there should be a way to rapidly change the weights. Meanwhile, the MNN cells need to be modified, as the programmable couplings are implemented by memristors, which require programming circuits to isolate each other. Lehtonen and Laiho proposed a new memristor-based cellular neural network cell that can be used to program the templates. For this purpose, a global voltage is input into the cell. This voltage is used to convey the weight of one connection into the cells [183]. The level of the virtual ground and the switches are controlled so that the memristor connected to a particular neighbor is biased above the programming threshold until it reaches the desired resistance value.

Merrikh-Bayat et al. presented a new way to explain the relationships between logical circuits and artificial neural networks, logical circuits and fuzzy logic, and artificial neural networks and fuzzy inference systems, and proposed a new neuro-fuzzy computing system which can effectively be implemented via the memristor-crossbar structure [184]. A simple realization of MNNs is shown in Figs. 33.32–33.34. Figure 33.32 shows that it is possible to interpret the working procedure of a conventional artificial neural network (ANN) without changing its structure. In this figure, each row of the structure implements a simple fuzzy rule or min-term. Figure 33.33 shows how the activation function of neurons can be implemented when the activation function is modeled by a t-norm operator. Matrix multiplication is performed by the vector circuit in Fig. 33.34. This circuit consists of a simple memristor crossbar, where each of its rows is connected to the virtually grounded terminal of an operational amplifier that plays the role of a neuron with an identity activation function. The advantages of the proposed system are twofold: first, its hardware can be directly trained using the Hebbian learning rule without the need to perform any optimization; second, the system has a great ability to deal with a huge number of input-output training data without facing problems like overtraining.

Fig. 33.32

Simple realization of MNN based on fuzzy concepts (after [184])

Fig. 33.33

Implementation of the activation function of neurons (after [184])

Fig. 33.34

Memristor crossbar-based circuit (after [184])

Howard et al. proposed a spiking neuro-evolutionary system which implements memristors as plastic connections [188]. These memristors provide a learning architecture that may be beneficial to an evolutionary design process that exploits parameter self-adaptation and variable topologies, allowing the number of neurons, the connection weights, and the interneural connectivity pattern to emerge. This approach allows networks with appropriate complexity to evolve whilst exploiting the memristive properties of the connections to reduce learning time.

To investigate the dynamic behaviors of memristor-based neural networks, Zeng et al. proposed the memristor-based recurrent neural networks (MRNNs) [189, 190] shown in Fig. 33.35, where x_i(·) is the state of the i-th subsystem, f_j(·) is the amplifier, M_fij is the connection memristor between the amplifier f_j(·) and the state x_i(·), R_i and C_i are the resistor and capacitor, I_i is the external input, a_i and b_i are the outputs, and i, j = 1, 2, ..., n. The parameters in this neural network change according to the state of the system, so the network is a state-dependent switching system. The dynamic behavior of this neural network with time-varying delays was investigated based on Filippov theory and the Lyapunov method.

Fig. 33.35

Circuit of a memristor-based recurrent network (after [189])

3.3 Conclusion

Memristor-based synapses and neural networks have been investigated by many scientists for their possible applications in analog and digital information processing, and in memory and logic applications. However, how to exploit the nonvolatile memory, nanoscale size, and low power dissipation of memristors to design methods that process and store information requiring learning and memory in the synapses of memristor-based neural networks, using a more rational partitioning of the dynamical mapping space, is still an open issue. Further investigation is needed to close this gap.

4 Neurodynamic Optimization

Optimization is omnipresent in nature and society, and it is an important tool for problem-solving in science, engineering, and commerce. Optimization problems arise in a wide variety of applications, such as the design, planning, control, operation, and management of engineering systems. In many applications (e.g., online pattern recognition and in-chip signal processing in mobile devices), real-time optimization is necessary or desirable. For such applications, conventional optimization techniques may not be competent due to stringent requirements on computational time. It is computationally challenging when optimization procedures have to be performed in real time to optimize the performance of dynamical systems.

The brain is a profoundly dynamic system, and its neurons are always active from birth to death. When a decision is to be made in the brain, many of its neurons are highly activated to gather information, search memory, compare differences, and make inferences and decisions. Recurrent neural networks are brain-like nonlinear dynamic system models and can be properly designed to imitate their biological counterparts and serve as goal-seeking parallel computational models for solving optimization problems in a variety of settings. Neurodynamic optimization can be physically realized in designated hardware such as application-specific integrated circuits (ASICs), where optimization is carried out in a parallel and distributed manner and the convergence rate of the optimization process is independent of the problem dimensionality. Because of the inherent nature of parallel and distributed information processing, neurodynamic optimization can handle large-scale problems. In addition, neurodynamic optimization may be used for optimizing dynamic systems in multiple time scales with parameter-controlled convergence rates. These salient features are particularly desirable for dynamic optimization in decentralized decision-making scenarios [191, 192, 193, 194]. While population-based evolutionary approaches to optimization have emerged as prevailing heuristic and stochastic methods in recent years, neurodynamic optimization deserves great attention in its own right due to its close ties with optimization and dynamical systems theories, as well as its biological plausibility and circuit implementability with very large scale integration (VLSI) or optical technologies.

4.1 Neurodynamic Models

The past three decades witnessed the birth and growth of neurodynamic optimization. Although a couple of circuit-based optimization methods were developed earlier [195, 196, 197], it was perhaps Hopfield and Tank who spearheaded neurodynamic optimization research in the context of neural computation with their seminal work in the mid 1980s [198, 199, 200]. Since its inception, numerous neurodynamic optimization models in various forms of recurrent neural networks have been developed and analyzed, see [201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254, 255, 256], and the references therein. For example, Tank and Hopfield extended the continuous-time Hopfield network for linear programming and showed their experimental results with a circuit of operational amplifiers and other discrete components on a breadboard [200]. Kennedy and Chua developed a circuit-based recurrent neural network for nonlinear programming [201]. It was proven that the state of the neurodynamics is globally convergent to an equilibrium corresponding to an approximate optimal solution of the given optimization problem.

Over the years, neurodynamic optimization research has made significant progress with models with improved features for solving various optimization problems. Substantial improvements of neurodynamic optimization theory and models have been made in the following dimensions:

1. Solution quality: designed based on smooth penalty methods with a finite penalty parameter, the earliest neurodynamic optimization models can converge to approximate solutions only [200, 201]. Later on, better models designed based on other design principles can guarantee state or output convergence to exact optimal solutions of solvable convex and pseudoconvex optimization problems, with or without conditions [204, 205, 208, 210], etc.

2. Solvability scope: the solvability scope of neurodynamic optimization has been expanded from linear programming problems [200, 202, 208, 211, 212, 214, 215, 216, 217, 218, 219, 223, 242, 244, 251], to quadratic programming problems [202, 203, 204, 205, 206, 210, 214, 217, 218, 220, 225, 226, 229, 233, 240, 241, 242, 243, 247], to smooth convex programming problems with various constraints [201, 204, 205, 210, 214, 222, 224, 228, 230, 232, 234, 237, 245, 246, 257], to nonsmooth convex optimization problems [235, 248, 250, 251, 252, 253, 254, 255, 256], and recently to nonsmooth optimization with some nonconvex objective functions or constraints [239, 249, 254, 255, 256].

3. Convergence property: the convergence property of neurodynamic optimization models has been extended from near-optimum convergence [200, 201], to conditional exact-optimum global convergence [205, 208, 210], to guaranteed global convergence [204, 205, 214, 215, 216, 218, 219, 222, 226, 227, 228, 230, 232, 234, 240, 243, 245, 247, 250, 253, 256, 257], to faster global exponential convergence [206, 224, 225, 228, 233, 237, 239, 241, 246, 254], and to even more desirable finite-time convergence [235, 248, 249, 251, 252, 255], with increasing convergence rates.

4. Model complexity: the neurodynamic optimization models for constrained optimization are essentially multilayer due to the introduction of instrumental variables for constraint handling (e.g., Lagrange multipliers or dual variables). The architectures of later neurodynamic optimization models for solving linearly constrained optimization problems have been reduced from multilayer structures to single-layer ones with decreasing model complexity to facilitate their implementation [243, 244, 251, 252, 254, 255].

    Activation functions are a signature component of neural network models for quantifying the firing state activities of neurons. The activation functions in existing neurodynamic optimization models include smooth ones (e. g., sigmoid), as shown in Fig. 33.36a,b [200, 208, 209, 210], nonsmooth ones (e. g., piecewise-linear) as shown in Fig. 33.36c,d [203, 206], and even discontinuous ones as shown in Fig. 33.36e,f [243, 244, 251, 252, 254, 255].

    Fig. 33.36 a–f

    Three classes of activation functions in neurodynamic optimization models: smooth in (a) and (b), nonsmooth in (c) and (d), and discontinuous in (e) and (f)

4.2 Design Methods

The crux of neurodynamic optimization model design lies in the derivation of a convergent neurodynamic equation that prescribes the states of the neurodynamics. A properly derived neurodynamic equation can ensure that the states of the neurodynamics reach an equilibrium that satisfies the constraints and optimizes the objective function. Although the existing neurodynamic optimization models are highly diversified with many different features, the design methods or principles for determining their neurodynamic equations can be categorized as follows:

1. Penalty methods

2. Lagrange methods

3. Duality methods

4. Optimality methods.

4.2.1 Penalty Methods

Consider the general constrained optimization problem

$$\text{minimize } f(x) \quad \text{subject to } g(x) \geq 0\;,\ h(x) = 0\;,$$

where x ∈ R^n is the vector of decision variables, f(x) is an objective function, g(x) = [g_1(x), ..., g_m(x)]^T is a vector-valued function, and h(x) = [h_1(x), ..., h_p(x)]^T is a vector-valued function.

A penalty method starts with the formulation of a smooth or nonsmooth energy function based on the given objective function  f ( x ) and constraints  g ( x ) and  h ( x ) . It plays an important role in neurodynamic optimization. Ideally, the minimum of a formulated energy function corresponds to the optimal solution of the original optimization problem. For constrained optimization, the minimum of the energy function has to satisfy a set of constraints. Most early approaches formulate an energy function by incorporating the objective function and constraints through functional transformation and numerical weighting [198, 199, 200, 201]. Functional transformation is usually used to convert constraints to a penalty function to penalize the violation of constraints; e.g., a smooth penalty function is

$$p(x) = \frac{1}{2}\sum_{i=1}^{m}\left\{\left[-g_{i}(x)\right]^{+}\right\}^{2} + \sum_{j=1}^{p}\left[h_{j}(x)\right]^{2}\;,$$

where [ y ] + = max⁡ { 0 , y } . Numerical weighting is often used to balance constraint satisfaction and objective optimization, e. g.,

E ( x ) = f ( x ) + w p ( x ) ,

where w is a positive weight.

In smooth penalty methods, neurodynamic equations are usually derived as the negative gradient flow of the energy function in the form of a differential equation

$$\frac{\mathrm{d}x(t)}{\mathrm{d}t} \propto -\nabla E(x(t))\;.$$

If the energy function is bounded below, the stability of the neurodynamics can be ensured. Nevertheless, the major limitation is that the neurodynamics designed using a smooth penalty method with any fixed finite penalty parameter can converge to an approximate optimal solution only, as a compromise between constraint satisfaction and objective optimization. One way to remedy this approximation limitation of smooth penalty design methods is to introduce a variable penalty parameter. For example, a time-varying decaying penalty parameter (called temperature) is used in deterministic annealing networks to achieve exact optimality with a slow cooling schedule [208, 210].
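As an illustration of the smooth penalty approach, the sketch below Euler-integrates the negative gradient flow of E(x) = f(x) + w p(x) for a small toy problem (projecting a point onto the probability simplex). The problem data, penalty weight, and step size are assumptions; consistent with the discussion above, the result is only approximately feasible and optimal for a finite w.

```python
# Minimal sketch of a penalty-based neurodynamic model: Euler integration of the
# negative gradient flow dx/dt = -grad E(x), with E(x) = f(x) + w*p(x).
import numpy as np

c = np.array([0.9, 0.4, -0.3])        # objective: minimize ||x - c||^2
w, dt, n_steps = 50.0, 1e-3, 20000

def grad_E(x):
    grad_f = 2.0 * (x - c)
    grad_ineq = np.minimum(x, 0.0)                     # from 0.5*([-x_i]^+)^2, enforcing x >= 0
    grad_eq = 2.0 * (x.sum() - 1.0) * np.ones_like(x)  # from (sum(x) - 1)^2
    return grad_f + w * (grad_ineq + grad_eq)

x = np.zeros_like(c)
for _ in range(n_steps):
    x = x - dt * grad_E(x)            # Euler step of the gradient flow
print(x, x.sum())                     # approximately feasible, near-optimal solution
```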

If the objective function or penalty function is nonsmooth, the gradient has to be replaced by a generalized gradient, and the neurodynamics can be modeled using a differential inclusion [235, 248, 249, 251, 252, 255]. Two advantages of nonsmooth penalty methods over smooth ones are the possibility of exact constraint satisfaction and objective optimization with finite penalty parameters, and the finite-time convergence of the resulting neurodynamics. Needless to say, nonsmooth neurodynamics are much more difficult to analyze in order to guarantee their stability.

4.2.2 Lagrange Methods

A Lagrange method for designing a neurodynamic optimization model begins with the formulation of a Lagrange function (Lagrangian) instead of an energy function [204, 205]. A typical Lagrangian is defined as

$$L(x, \lambda, \mu) = f(x) + \sum_{i=1}^{m}\lambda_{i} g_{i}(x) + \sum_{j=1}^{p}\mu_{j} h_{j}(x)\;,$$

where λ = (λ_1, ..., λ_m)^T and μ = (μ_1, ..., μ_p)^T are the Lagrange multipliers for the inequality constraints  g ( x ) and equality constraints  h ( x ) , respectively.

According to the saddle-point theorem, the optimal solution can be determined by minimizing the Lagrangian with respect to x and maximizing it with respect to λ and μ. Therefore, neurodynamic equations can be derived in an augmented space

$$\epsilon\,\frac{\mathrm{d}x(t)}{\mathrm{d}t} = -\nabla_{x} L(x(t), \lambda(t), \mu(t))\;, \quad \epsilon\,\frac{\mathrm{d}\lambda(t)}{\mathrm{d}t} = \nabla_{\lambda} L(x(t), \lambda(t), \mu(t))\;, \quad \epsilon\,\frac{\mathrm{d}\mu(t)}{\mathrm{d}t} = \nabla_{\mu} L(x(t), \lambda(t), \mu(t))\;,$$

where ϵ is a positive time constant. The equilibria of the Lagrangian neurodynamics satisfy the Lagrange necessary optimality conditions.

4.2.3 Duality Methods

For convex optimization, the objective functions of primal and dual problems reach the same value at their optima. In view of this duality property, the duality methods for designing neurodynamic optimization models begin with the formulation of an energy function consisting of a duality gap between the primal and dual problems and a constraint-based penalty function, e. g.,

$$E(x, y) = \frac{1}{2}\left(f(x) - f_{d}(y)\right)^{2} + p(x) + p_{d}(y)\;,$$

where y is a vector of dual variables, f_d(y) is the dual objective function to be maximized, and p(x) and p_d(y) are, respectively, smooth penalty functions to penalize violations of the constraints of the primal (original) and dual problems. The corresponding neurodynamic equation can be derived, with guaranteed global stability, as the negative gradient flow of the energy function, similarly to the aforementioned smooth penalty methods [216, 218, 222, 226, 258, 259]. Neurodynamic optimization models designed using duality methods can guarantee global convergence to the exact optimal solutions of convex optimization problems without any parametric conditions.

In addition, using duality methods, dual networks and their simplified/improved versions can be designed for quadratic programming with reduced model complexity by mapping their globally convergent optimal dual state variables to optimal primal solutions via linear or piecewise-linear output functions [240, 247, 260, 261, 262, 263].

4.2.4 Optimality Methods

The neurodynamic equations of some recent models are derived based on optimality conditions (e.g., the Karush–Kuhn–Tucker conditions) and projection methods. Basically, these methods map the equilibria of the designed neurodynamic optimization models to the equivalent equalities given by the optimality conditions and projection equations (i.e., all equilibria essentially satisfy the optimality conditions) [225, 227, 228]. For several types of common geometric constraints (such as nonnegativity constraints, bound constraints, and spherical constraints), projection operators map the neuron state variables onto the convex feasible regions by using the activation functions, avoiding the use of excessive dual variables as in the dual networks, and thus lowering the model complexity. For neurodynamic optimization models designed using optimality methods, stability analysis is needed explicitly to ensure that the resulting neurodynamics are stable.
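A minimal sketch of a projection-based neurodynamic model for a bound-constrained quadratic program, dx/dt = P_Ω(x − α∇f(x)) − x, where P_Ω is the projection onto the feasible box; the problem data, gain α, and step size are assumptions. An equilibrium of these dynamics satisfies the projection equation and hence the optimality conditions.

```python
# Minimal sketch of a projection neurodynamic model for a box-constrained QP.
import numpy as np

Q = np.array([[2.0, 0.5], [0.5, 1.0]])     # positive definite
b = np.array([-1.0, 2.0])
lo, hi = np.array([0.0, 0.0]), np.array([1.0, 1.0])

def project(x):
    return np.clip(x, lo, hi)              # projection onto the box Omega

def grad_f(x):
    return Q @ x + b                        # f(x) = 0.5*x^T Q x + b^T x

alpha, dt = 0.4, 0.05
x = np.array([0.5, 0.5])
for _ in range(3000):
    x = x + dt * (project(x - alpha * grad_f(x)) - x)   # Euler step of the dynamics
print(x)                                    # approaches the constrained minimizer (about [0.5, 0.0])
```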

Once a neurodynamic equation has been derived and its stability has been proven, the next step is to determine the architecture of the neural network in terms of the neurons and connections based on the derived neurodynamic equation. The last step is usually devoted to simulation or emulation to test the performance of the neural network numerically or physically. The simulation/emulation results may reveal additional properties or characteristics for further analysis or model refinement.

4.3 Selected Applications

Over the last few decades, neurodynamic optimization has been widely applied in many fields of science, engineering, and commerce, as highlighted in the following selected nine areas.

4.3.1 Scientific Computing

Neurodynamic optimization models have been developed for solving linear equations and inequalities and for computing inverse or pseudoinverse matrices [240, 264, 265, 266, 267, 268].

4.3.2 Network Routing

Neurodynamic optimization models have been developed or applied for shortest-path routing in networks modeled by using weighted directed graphs [258, 269, 270, 271].

4.3.3 Machine Learning

Neurodynamic optimization has been applied to support vector machine learning to take advantage of its parallel computational power [272, 273, 274].

4.3.4 Data Processing

The data processing applications of neurodynamic optimization include, but are not limited to, sorting [275, 276, 277], winners-take-all selection [240, 277, 278], data fusion [279], and data reconciliation [254].

4.3.5 Signal/Image Processing

The applications of neurodynamic optimization for signal and image processing include, but are not limited to, recursive least-squares adaptive filtering, overcomplete signal representations, time delay estimation, and image restoration and reconstruction [191, 203, 204, 280, 281, 282, 283].

4.3.6 Communication Systems

The telecommunication applications of neurodynamic optimization include beamforming [284, 285] and simulations of DS-CDMA mobile communication systems [229].

4.3.7 Control Systems

Intelligent control applications of neurodynamic optimization include pole assignment for synthesizing linear control systems [286, 287, 288, 289] and model predictive control for linear/nonlinear systems [290, 291, 292].

4.3.8 Robotic Systems

The applications of neurodynamic optimization in intelligent robotic systems include real-time motion planning and control of kinematically redundant robot manipulators with torque minimization or obstacle avoidance [259, 260, 261, 262, 263, 267, 293, 294, 295, 296, 297, 298] and grasping force optimization for multifingered robotic hands [299].

4.3.9 Financial Engineering

Recently, neurodynamic optimization was also applied to real-time portfolio selection based on an equivalent probability measure to optimize the asset distribution in financial investments [255, 300].

4.4 Concluding Remarks

Neurodynamic optimization provides a parallel distributed computational model for solving many optimization problems. For convex and convex-like optimization, neurodynamic optimization models are available with guaranteed optimality, expanded applicability, improved convergence properties, and reduced model complexity. Neurodynamic optimization approaches have been demonstrated to be effective and efficient for many applications, especially those with real-time solution requirements.

The existing results can still be further improved to expand their solvability scope, increase their convergence rates, or reduce their model complexity. Since neurodynamic approaches to global optimization and discrete optimization are much more interesting and challenging, it is necessary to develop neurodynamic models for nonconvex optimization and combinatorial optimization. In addition, neurodynamic optimization approaches could be more widely applied in many other application areas in conjunction with conventional and evolutionary optimization approaches.