1 INTRODUCTION

Many experimental works provide evidence that cognitive functions are related to oscillatory neural activity and its synchronization. These functions include perception, feature binding, memory storage and retrieval, and various forms of attention. The question that has no commonly accepted answer is whether the synchronization arises from the self-organization of neural assemblies or whether there is a special source of synchronization that governs cognitive processes in the brain. The theory proposed by Cowan and Baddeley states that there is a central executive of the attention system that is responsible for segmenting individual objects, focusing attention on a particular object, and keeping this object in working memory [1–3]. The central executive is assumed to be a complex network in the forebrain at the top of the pyramid of cortical convergent zones. The mechanism of interaction between the central executive and information processing regions is presumably based on phase locking [4]. This mechanism found indirect experimental support in [5].

Our aim is to present arguments in favor of this theory in the form of mathematical models of various cognitive functions based on synchronization in an oscillatory system with a central element. As a starting point, let us consider Kuramoto phase oscillators [6], which are conventional instruments for modeling oscillatory systems in physics, chemistry, and biology [7, 8]. These oscillators are described by a single variable, the oscillation phase. Natural frequencies and connection strengths are the parameters of Kuramoto phase oscillator networks. These networks are suitable for studying synchronization conditions, but they cannot describe a resonant increase in the amplitude of oscillations induced by synchronization. To expand the capabilities of phase oscillators and better adapt them for modeling in neurobiology, we introduced generalized phase oscillators [9]. These oscillators differ from traditional Kuramoto oscillators in that a number of oscillator parameters are transformed into dynamical variables. The advantage of generalized phase oscillators is that they allow the implementation of the winner-take-all (WTA) procedure, which is one of the basic principles in neural network modeling. With this principle, we developed a number of models of brain cognitive functions that reproduce data from neurophysiological and psychophysical experiments.

The paper presents a review of our models of brain cognitive functions that are based on generalized phase oscillators. These models were published before as separate examples of representing particular cognitive functions by oscillatory neural networks. Now we collect them together to demonstrate the power of the approach and a broad range of its potential applications.

In Section 2, we will describe a network of generalized phase oscillators with a central element. In Section 3, a number of oscillatory models of cognitive functions based on networks of generalized phase oscillators will be considered. In Section 4, we discuss general questions about the role of dynamic models in understanding cognitive functions, the limitations associated with using abstract models, and the ways to overcome these limitations.

2 GENERALIZED PHASE OSCILLATORS

Systems that implement the WTA procedure are traditionally built from artificial neurons that compete to maximize their activity using lateral or recurrent inhibition. We suggest an alternative approach that is based on synchronization in a network of generalized phase oscillators with a radial architecture of connections. In such a network, there is a central oscillator (CO) and a set of peripheral oscillators (POs). The CO interacts with the POs via feedforward and feedback connections. A PO that wins the competition for synchronization with the CO resonantly increases its activity to a high level, while the activity of the “losing” POs is suppressed to a low level.

The dynamics of a system of generalized phase oscillators are described by the following equations:

$$\frac{d\theta_0}{dt} = \omega_0 + \frac{1}{n}\sum_{i=1}^{n} a_i f(\theta_i - \theta_0), \tag{1}$$
$$\frac{d\theta_i}{dt} = \omega_i + b\,g(\theta_0 - \theta_i),\quad i = 1, \ldots, n, \tag{2}$$
$$\frac{d\omega_0}{dt} = \frac{\alpha}{n}\sum_{i=1}^{n} a_i f(\theta_i - \theta_0), \tag{3}$$
$$\frac{da_i}{dt} = \beta\left(-a_i + c + \gamma h(\theta_i - \theta_0)\right),\quad i = 1, \ldots, n, \tag{4}$$

where \(\theta = (\theta_0, \theta_1, \ldots, \theta_n) \in T^{n+1}\), \(\omega_0 \in R\), and \(a = (a_1, \ldots, a_n) \in R^n\) are variables, and \(\alpha, \beta, \gamma, c \in R\) and \(b < 0\) are parameters. Without loss of generality, we can assume that \(b = -1\). Thus, a PO tries to synchronize the CO, while the CO tries to desynchronize all POs. The connection strength \(a_i > 0\) is associated with the amplitude of the \(i\)th PO.

The architecture of connections in the system is schematically shown in Fig. 1.

Fig. 1. Architecture of connections in a network of generalized phase oscillators with a central element. Red arrows show synchronizing connections, green arrows show desynchronizing connections (color online).

It is assumed that the functions f, g, h are 2π-periodic and satisfy the following conditions:

$$f(x) = - f( - x),\,\,\,\,f{\kern 1pt} '(0) > 0,\,\,\,\,f{\kern 1pt} '(\pi ) = 0,\,\,\,\,g(x) = - g( - x),\,\,\,\,g{\kern 1pt} '(0) > 0,\,\,\,\,g{\kern 1pt} '(\pi ) < 0,$$
$$h(x) = h( - x),\,\,\,\,h(0) = 1,\,\,\,\,h(\pi ) = 0,\,\,\,\,h{\kern 1pt} '(0) = h{\kern 1pt} '(\pi ) = 0.$$

Periodicity and oddness of the functions f and g imply the conditions

$$f(0) = f(\pi ) = g(0) = g(\pi ) = 0.$$

It is assumed that these functions have no other zeros.

The function h(x) monotonically increases on [–π, 0] (and, respectively, monotonically decreases on [0, π]).

Examples of f and h that were used in simulations are shown in Fig. 2. In all simulations g(x) = sin(x).

Fig. 2. Types of functions in equations (1)–(4): (a) interaction functions f(x) for different values of the parameter ν that determines the position of the extrema of the function; (b) resonance controlling functions h(x) for different values of the parameters μ and σ that determine the width of the region where h(x) is not equal to zero and the rate of decrease of h(x) when the argument deviates from 0 (color online).
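To make the conditions on f, g, and h concrete, the following minimal Python sketch defines one candidate triple and checks the stated conditions numerically. Only g(x) = sin(x) is taken from the text. The forms chosen here for f and h are illustrative assumptions; they are not the ν- and μ, σ-parameterized families shown in Fig. 2. The helper names `deriv` and `check_conditions` are hypothetical.

```python
import numpy as np

def f(x):
    # Assumed interaction function (not the family from Fig. 2):
    # odd, zeros only at 0 and pi, f'(0) > 0, f'(pi) = 0.
    return np.sin(x) * (1.0 + np.cos(x)) / 2.0

def g(x):
    # Taken from the text: g(x) = sin(x) in all simulations.
    return np.sin(x)

def h(x):
    # Assumed resonance controlling function (not the family from Fig. 2):
    # even, h(0) = 1, h(pi) = 0, h'(0) = h'(pi) = 0.
    return (1.0 + np.cos(x)) / 2.0

def deriv(func, x, eps=1e-6):
    # Central finite-difference estimate of the derivative.
    return (func(x + eps) - func(x - eps)) / (2.0 * eps)

def check_conditions(tol=1e-6):
    x = np.linspace(-np.pi, np.pi, 2001)
    inner = np.linspace(0.01, np.pi - 0.01, 500)   # open interval (0, pi)
    half = np.linspace(0.0, np.pi, 1001)
    return {
        "f_odd": bool(np.allclose(f(x), -f(-x))),
        "g_odd": bool(np.allclose(g(x), -g(-x))),
        "h_even": bool(np.allclose(h(x), h(-x))),
        "f_slope_at_0_positive": bool(deriv(f, 0.0) > 0.0),
        "f_slope_at_pi_zero": bool(abs(deriv(f, np.pi)) < tol),
        "g_slope_at_0_positive": bool(deriv(g, 0.0) > 0.0),
        "g_slope_at_pi_negative": bool(deriv(g, np.pi) < 0.0),
        "h_values": bool(abs(h(0.0) - 1.0) < tol and abs(h(np.pi)) < tol),
        "h_flat_at_0_and_pi": bool(abs(deriv(h, 0.0)) < tol
                                   and abs(deriv(h, np.pi)) < tol),
        "no_other_zeros": bool(np.all(f(inner) > 0.0) and np.all(g(inner) > 0.0)),
        "h_decreasing_on_0_pi": bool(np.all(np.diff(h(half)) <= 0.0)),
    }

results = check_conditions()
```

Calling `check_conditions()` returns a dictionary of booleans, one per condition, so a different candidate for f or h can be tested by redefining the function and rerunning the check.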

Equations (1)–(4) can be considered a generalization of the standard Kuramoto equations for phase oscillators. This generalization is obtained by adding to the phase equations (1)–(2) equation (3) for adaptation of the natural frequency of the CO and equation (4) for adaptation of the amplitudes of the POs.

The meaning of equation (3) becomes clear if it is rewritten in the form

$$\frac{{d{{\omega }_{0}}}}{{dt}} = \alpha \left( {\frac{{d{{\theta }_{0}}}}{{dt}} - {{\omega }_{0}}} \right).$$

According to this equation, the natural frequency of the CO is adapted in the direction of its current frequency. The parameter α determines the adaptation rate.
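The rewritten form follows in one step from equation (1). Writing the summation index as \(i\) throughout and solving (1) for the coupling term gives

$$\frac{1}{n}\sum_{i=1}^{n} a_i f(\theta_i - \theta_0) = \frac{d\theta_0}{dt} - \omega_0 .$$

Substituting this expression into the right-hand side of equation (3) yields \(d\omega_0/dt = \alpha\,(d\theta_0/dt - \omega_0)\). In particular, \(\omega_0\) stops changing exactly when the coupling term vanishes, that is, when the current frequency of the CO coincides with its natural frequency.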

Equation (4) describes the resonant increase of the oscillation amplitude of the ith PO to the level c + γ if this PO is synchronous with the CO; otherwise, the amplitude of this PO falls to the low level c. The parameter β determines the adaptation rate.
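As a worked illustration, consider equation (4) under two simplifying assumptions that the full system does not guarantee: \(\beta > 0\), and a phase difference \(\varphi_i = \theta_i - \theta_0\) that is held constant. Equation (4) is then linear in \(a_i\) and has the explicit solution

$$a_i(t) = a_i^{*} + \left(a_i(0) - a_i^{*}\right)e^{-\beta t},\qquad a_i^{*} = c + \gamma h(\varphi_i).$$

For an in-phase PO, \(\varphi_i = 0\) and \(h(0) = 1\), so \(a_i^{*} = c + \gamma\). For an antiphase PO, \(\varphi_i = \pi\) and \(h(\pi) = 0\), so \(a_i^{*} = c\). In both cases the amplitude relaxes to its limit with time constant \(1/\beta\).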

Under a suitable choice of parameters and functions, system (1)–(4) can work in the WTA regime when at most one PO wins the competition for the synchronization with the CO and increases its amplitude to the resonant level [9] (Fig. 3). Depending on the relationship between the desynchronizing strength b and the divergence of the natural frequencies of POs, the WTA regime can be stationary or non-stationary. Different types of dynamics in the WTA regime are shown in Fig. 4 for the case of 3 POs.

Fig. 3. Dynamics of the amplitudes of 10 POs in the WTA regime. The first PO is in the resonant state (its amplitude becomes higher than a threshold level \(H_{\text{high}}\)). The other 9 POs work with a low amplitude (below a threshold level \(H_{\text{low}}\)) (color online).

Fig. 4. The dynamics of phase differences between 3 POs and the CO in the WTA regime. The first PO is the winner: (a) stationary state, all POs are phase-locked by the CO, one is the winner, the other two are nearly in antiphase to the CO; (b) two POs are phase-locked by the CO (one in phase, the other nearly in antiphase), the phase difference of the third PO runs in the negative direction; (c) the winner is in phase with the CO, the phase differences of the other two POs run in the negative direction; (d) the winner is in phase with the CO, the phase differences of the other two POs run in opposite directions to each other [9] (color online).
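System (1)–(4) can be explored numerically with a simple explicit Euler scheme. The sketch below is an illustration under stated assumptions, not the simulation code behind Figs. 3 and 4. The forms of f and h, all default parameter values, the step size, and the function name `simulate` are hypothetical choices. Only g(x) = sin(x) and b = –1 come from the text. Whether a given parameter set actually produces the WTA regime has to be checked by inspecting the returned trajectories.

```python
import numpy as np

def f(x):
    # Assumed interaction function: odd, f'(0) > 0, f'(pi) = 0.
    return np.sin(x) * (1.0 + np.cos(x)) / 2.0

def g(x):
    # From the text: g(x) = sin(x).
    return np.sin(x)

def h(x):
    # Assumed resonance controlling function: h(0) = 1, h(pi) = 0.
    return (1.0 + np.cos(x)) / 2.0

def simulate(omega, a0, omega0=0.0, alpha=1.0, beta=1.0, gamma=4.0,
             c=1.0, b=-1.0, dt=0.01, steps=20000, seed=0):
    """Euler integration of equations (1)-(4); all defaults are assumptions."""
    rng = np.random.default_rng(seed)
    omega = np.asarray(omega, dtype=float)   # natural frequencies of the POs
    n = omega.size
    theta0 = 0.0                             # phase of the CO
    theta = rng.uniform(-np.pi, np.pi, n)    # random initial phases of the POs
    w0 = float(omega0)                       # natural frequency of the CO
    a = np.array(a0, dtype=float)            # amplitudes of the POs
    phase_diff = np.empty((steps, n))
    amplitude = np.empty((steps, n))
    for k in range(steps):
        diff = theta - theta0
        drive = np.mean(a * f(diff))             # (1/n) * sum_i a_i f(theta_i - theta_0)
        d_theta0 = w0 + drive                    # equation (1)
        d_theta = omega + b * g(-diff)           # equation (2), argument theta_0 - theta_i
        d_w0 = alpha * drive                     # equation (3)
        d_a = beta * (-a + c + gamma * h(diff))  # equation (4)
        theta0 += dt * d_theta0
        theta = theta + dt * d_theta
        w0 += dt * d_w0
        a = a + dt * d_a
        # Store the phase differences wrapped to (-pi, pi] and the amplitudes.
        phase_diff[k] = np.angle(np.exp(1j * (theta - theta0)))
        amplitude[k] = a
    return phase_diff, amplitude

# Example run with three POs and hypothetical natural frequencies.
phase_diff, amplitude = simulate(omega=[1.0, 1.5, 2.0], a0=[1.0, 1.0, 1.0], steps=2000)
```

With initial amplitudes inside [c, c + γ] and a step size satisfying β·dt < 1, each Euler update of equation (4) is a convex combination of the old amplitude and a target value in [c, c + γ], so the amplitudes stay in that interval. This gives a quick sanity check on any run.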

3 OSCILLATORY NEURAL MODELS OF COGNITIVE FUNCTIONS

3.1 Consecutive Selection of Objects in the Image

When processing visual scenes, it is usually assumed that individual objects located in the scene are fed to the recognition system one by one in a certain sequence. However, biological visual systems have to work under conditions in which visual scenes simultaneously contain several objects. To reduce the amount of information that must be processed at each moment of time, animals use two cognitive mechanisms: attention and feature binding. Feature binding allows for the separation of features of different objects that are simultaneously present in the visual scene; attention restricts information processing to the features of the single object that is currently selected into the attention focus. The attention system amplifies the activity of neurons that code an attended object and inhibits the activity of neurons that code unattended objects.

Our model of consecutive selection of objects in the attention focus is based on system (1)–(4) with the addition of local connections between POs [10]. The POs are located on the plane in the nodes of a square lattice. Each PO codes the information from a pixel of the image (Fig. 5). Thus, there is one-to-one correspondence between the pixels in the image and the POs.

Fig. 5. A scheme for modeling consecutive selection of objects. Objects are coded by assemblies of locally connected oscillators. The selection of a particular object in the attention focus is realized by the WTA procedure in a system of generalized phase oscillators with a central element. The gray arrow shows the assignment of values to the natural frequencies of POs. The black filled arrow shows synchronizing connections that are used for object representation by a synchronous assembly of POs and for synchronization of the CO with an assembly of POs. The black hollow arrow shows desynchronizing connections that are used to prevent simultaneous synchronization of the CO with several assemblies of POs.

Synchronizing local connections between POs are used to group the pixels belonging to a particular object. They lead to the formation of synchronous assemblies of POs that represent individual objects according to their spatial connectivity.

Attention is implemented in the model as a result of synchronization of the CO with an assembly of POs. It is assumed that the focus of attention is formed by those POs that work synchronously with the CO. Due to the resonance, the amplitudes of POs in the focus of attention increase, while the amplitudes of other POs decrease to a low level. A special procedure incorporated in the model prevents repeated selection of the same object. For this purpose, the oscillators that have gone through the resonant state lose the ability to resonate for a time that is sufficient for examination of the whole image. This leads to the selection of different objects in the focus of attention.

Figure 6 illustrates the principles of model performance using a simple example of a black-and-white image with the word “HELLO” that consists of 5 letters (objects). The letters were selected in a random order.

Fig. 6. Dynamics of PO amplitudes when the letters of the word HELLO are consecutively selected in the attention focus [10] (color online).

3.2 Visual Search

Visual search is a type of perceptual task that involves an active scan of the visual environment for a particular object (the target) surrounded by other objects (the distractors) [11]. The task of visual search can be of various complexities depending on the saliency of the target relative to the distractors. This is reflected in the duration of time that the observer spends performing the search task, and in the number of errors that are made.

A vast amount of experimental evidence has been obtained to characterize the mechanisms of visual selection. In the early experiments of [12] it was discovered that visual search tasks can be subdivided into several categories based on their difficulty. Though later experiments have shown that there are no strict barriers between the categories, the following three categories are considered as basic experimental paradigms [13]:

Feature Search is the case where the target and the distractors are maximally different, i.e. differentiated by a single property, such as colour, shape, orientation, or size. In this case, the reaction time (RT) is found to be short and independent of the number of objects n in the display (i.e. the graph RT(n) is nearly parallel to the n-axis).

Conjunction Search occurs when the target shares a property with the distractors, e.g. a red vertical bar should be found among vertical green bars and horizontal red bars. In this case, the search task is more difficult and RT(n) increases when n increases.

Spatial Configuration Search is even more difficult. An example of such task is the search of the number 2 among numbers 5 (2 is a reflection of 5 relative to the vertical mirror). In this case, the graph RT(n) is steeper than in the case of Conjunction Search.

Figure 7a gives some examples of visual search tasks. Figure 7b shows the graphs of average reaction times RT(n) for these tasks as a function of the number of objects in the scene.

Fig. 7. Visual search: (a) examples of visual search tasks of different complexity; (b) mean reaction times for these tasks [13]. Solid (dashed) lines correspond to the case when the image contains (does not contain) a target object (color online).

Recently there appeared a model [14] that is able to reproduce not only average RTs in visual search tasks, but also time distributions [15]. The model does not suggest a neural mechanism for its implementation. It only assumes that the process of selection consists of several selection rounds, which continue until a target is selected or the conditions for search termination are fulfilled. The model was applied to the tasks which differed by the weights assigned to the target and distractors according to their saliency. In Feature Search due to high saliency of the target, the weight of the target is radically greater than the weights of distractors. Purely mathematically, this leads to a flat slope of the RT as a function of n. In Spatial Configuration Search the weight of the target is only slightly higher than the weight of distractors, which leads to a steep slope of RT(n). Conjunction Search occupies an intermediate position between these two extreme cases.

Our model follows the same logic as the model [14], but it aims to suggest a simple neural mechanism that may underlie the probabilistic selection of objects. We model visual search by the WTA procedure in system (1)–(4) of generalized phase oscillators [16]. Each visual object is represented by a PO. The saliency of an object is described by an initial value of the amplitude of the corresponding PO: objects that are more salient have a higher value of the initial amplitude.

The model combines parallel and sequential mechanisms of information processing. The former are used to select an object in the attention focus, the latter are used for a limited item-by-item examination of selected objects to check whether the selected object is a target or a distractor. If it is a distractor, then a new attempt (that is a new run of the model) is made, and so on until the target is selected.

Figure 8a shows the probabilities \(r_n\) that a target object will be selected in the attention focus during a single run of the model. The values of \(r_n\) decrease when the number of objects n increases.

Fig. 8. Simulation results for visual search: (a) probabilities \(r_n\) as functions of n; (b) average number of attempts \(M_{n}^{1}\) needed to select a target with return; (c) average number of attempts \(M_{n}^{2}\) needed to select a target without return. Parameters: \(a_i(0) = 1\), \(i = 2, \ldots, n\). In all panels the numbering of lines corresponds to the following values of \(a_1(0)\): (1) \(a_1(0) = 12\), (2) \(a_1(0) = 8\), (3) \(a_1(0) = 6\), (4) \(a_1(0) = 4\), (5) \(a_1(0) = 3\). Straight lines in (b) and (c) are linear regression lines [16] (color online).

With a fixed initial amplitude of “distractor” POs, the values of \(r_n\) become smaller as the initial value of the amplitude of the “target” PO decreases. In terms of the model, this means that the probability of selecting a target object in the focus of attention decreases if the target object becomes less salient among distractors.

The parameter \(r_n\) is hidden: it cannot be determined in a psychophysical experiment. However, we can use \(r_n\) to compute the average number of attempts (runs of the model) M that are needed to select a target object.

Since there is no common opinion among researchers whether repeated selections are made with or without inhibition of return, we consider both possibilities. Denote the average number of selection attempts in the cases when the return is allowed or is not allowed as \(M_{n}^{1}\) and \(M_{n}^{2}\), respectively.
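The two quantities can be illustrated with a short Python sketch. For search with return, if successive runs are assumed to be independent and each selects the target with probability \(r_n\), the number of attempts is geometrically distributed and its mean is \(1/r_n\). For search without return, the sketch uses a deliberately simplified assumption that is not the oscillatory model of [16]: on each attempt an object is chosen with probability proportional to a hypothetical weight, and a selected distractor is removed from further competition. The function names and the weight parameters are illustrative.

```python
def expected_attempts_with_return(r):
    """Mean of a geometric distribution: independent runs, success probability r."""
    if not 0.0 < r <= 1.0:
        raise ValueError("r must lie in (0, 1]")
    return 1.0 / r

def expected_attempts_without_return(target_weight, n, distractor_weight=1.0):
    """Exact mean number of attempts under the assumed weight-proportional
    selection, with each selected distractor excluded from later attempts."""
    if n < 1 or target_weight <= 0.0 or distractor_weight <= 0.0:
        raise ValueError("n must be >= 1 and weights must be positive")
    expected = 0.0
    p_not_found_yet = 1.0
    for k in range(1, n + 1):
        remaining_distractors = n - k   # k - 1 distractors already removed
        p_target = target_weight / (
            target_weight + remaining_distractors * distractor_weight
        )
        expected += k * p_not_found_yet * p_target
        p_not_found_yet *= 1.0 - p_target
    return expected

# Illustrative values only: a single-run probability of 0.25, and a target
# that is as salient as each of four distractors.
m1 = expected_attempts_with_return(0.25)
m2 = expected_attempts_without_return(target_weight=1.0, n=5)
```

Under the equal-weight assumption, the target is equally likely to be found on any of the n attempts, so the mean without return is (n + 1)/2; increasing the target weight lowers this mean.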

The results of computations of \(M_{n}^{1}\) and \(M_{n}^{2}\) shown in Fig. 8b,c correspond well to experimental evidence on RTs in visual search tasks of different complexity (compare with Fig. 7).

The lines labelled 1 and 2 in Fig. 8b,c are nearly horizontal; therefore, they can be taken to correspond to Feature Search. Lines 3–4 in Fig. 8b,c correspond to cases of Conjunction Search. The lines labelled 5 correspond to the most difficult type of task: Spatial Configuration Search. Note that the search with inhibition of return is only slightly faster than the search without such inhibition. This explains why it is so difficult to experimentally distinguish between these two types of search.

The simulations show that the dependence of RT on n is well approximated by linear functions, which is in agreement with experimental data [15]. This was not planned in advance and confirms that the model correctly reflects the neural mechanisms that underlie the work of selective attention in visual search.

3.3 Multiple Object Tracking

An important experimental paradigm that is used in the study of object-oriented attention is multiple object tracking (MOT). In a standard MOT experiment [17, 18] an observer views a display with m simple identical objects (up to 10 objects, such as points, or plus signs, or circles, etc.) (Fig. 9). A certain subset of the objects (from 1 to m/2, m is supposed to be even) is briefly flashed to mark them as targets. Other objects are considered as distractors. Then all objects begin moving independently and unpredictably about the screen without passing too near to each other and without moving off the display. The observer’s task is to track the targets with the eyes fixed at the center of the screen. At various times during animation one of the objects is flashed and the observer should press a key to indicate whether this object is a target or a distractor.

Fig. 9. Multiple object tracking. Targets are flashed before moving. The observer must identify the targets using a computer mouse [19].

It has been shown that trained subjects are quite efficient in performing MOT. Though the number of errors increases with the increased number of targets, even for 5 targets the performance level was about 85% correct target identifications.

The architecture of the MOT model is shown in Fig. 10a for the case of 3 targets. In a general case, the number of layers is equal to the number of targets. The layers represent the modules of the attention system [20]. It is assumed that each module tracks a single target. The interaction between the modules is organized so that different modules are not able to track the same object.

Fig. 10. Model of MOT: (a) architecture of the model; (b) probability of errors in the model [20] and in experiments [21] (color online).

If objects move along the visual field slowly enough and do not intersect in the process of movements, the focus of attention (after it is formed) remains fairly stable due to the resonance of the POs included in the focus of attention. A resonant PO has a much stronger effect on the CO in its layer than other POs, and this prevents attention switching from one object to another. However, if the speed of moving objects becomes relatively high in comparison with the rate of synchronization and resonance, attention can spontaneously switch from one object to another. This leads to errors in the identification of target objects and distractors.

To estimate the quality of model performance, we performed 50 runs of the model for each of the values k = 2, 3, 4, 5, where k is the number of target objects. The results of the simulations are presented in Fig. 10b together with experimental data. The model gives a fairly good approximation of the experimental data. Significant deviation occurs in the case of two targets. This is probably due to the fact that the model parameters were fixed for all cases of tracking, while in reality adjustment of parameters depending on the number of objects in the image cannot be excluded.

The dependence of the number of errors in MOT on the number of targets was the reason for the researchers to associate MOT with resource-limited parallel processing [17]. Our model presents an alternative explanation of this phenomenon. Though the processing of information in our model is purely parallel, it is evident that the quality of both object selection as targets and tracking of the targets will be poorer if the number of targets increases. This is caused by the limited capacity of the phase space where several central oscillators have to operate simultaneously. Increasing the number of central oscillators will make it more and more difficult for them to avoid temporal synchronization, which may result in unpredictable jumps of attention to non-target objects.

In designing the MOT model, we intentionally avoided traditional image processing techniques, such as shape analysis, connectivity testing, and pattern recognition. Therefore, the model can work equally well when objects in the visual field are not identical or even vary in shape. This is important, e.g., if object movements take place in 3-D space with the projection of objects on the retina constantly changing.

4 CONCLUSIONS

In this research we were interested in the relationship between cognitive functions and brain dynamics. This question is important, since many existing models ignore dynamical aspects of brain functioning and, nevertheless, at the functional level obtain quite satisfactory results, sometimes even surpassing the capabilities of natural intelligence. On the other hand, experimental data show that a quick change of brain activity is necessary for flexible adaptation of the brain to different tasks. How does the brain use its dynamic modes? Are there general principles in the architectures that these modes support, and in the mechanisms that these modes implement? What types of information processing can be explained and reproduced on the basis of oscillatory activity and its synchronization? These questions stimulate many studies where dynamical systems are applied to modeling brain functions (see, e.g., reviews [22, 23]). Without giving exhaustive answers to these questions, our models provide sufficient arguments to assert that the synchronization of neural activity underlies a wide range of cognitive functions. Phase oscillators that are used as basic instruments in our modeling can be conveniently realized as optical devices, so the models may be helpful for optical implementation of artificial intelligence in the future.

Mathematically, the reduction of a system of limit cycle oscillators to a system of Kuramoto phase oscillators is rigorously justified only if the interaction between oscillators is weak [24]. This assumption is not fulfilled in our models. As for generalized phase oscillators, there is no strict mathematical justification of their use in neurophysiological modeling. The only argument in favor of our approach is that it correctly reproduces experimental data on the basis of uniform principles of the temporal correlation theory [25]. Further advancement will require a more detailed description of oscillatory processes, taking into account real frequency and phase relations, as well as the interaction of various types of oscillations in different frequency ranges. Some steps in this direction have been taken [26], but the question of how well phase models can be reproduced at the detailed level of Hodgkin–Huxley neurons remains open.

Here lies the limit of applicability of the developed models and methods. Our models are suitable for solving general questions about the role of oscillations and synchronization for various cognitive functions; however, they describe in a very general form both morphological and neurophysiological characteristics associated with brain structures when cognitive functions are performed. Reductionism and extreme simplification were inevitable for the initial steps of understanding the mechanisms of cognitive functions.

The presence of hubs in neural networks is well known from the studies of anatomical and functional connections. However, this fact is rarely taken into account due to traditional ideas about the distributed representation of information in the brain. Also there is a desire to restrict oneself to well-developed methods of statistical physics, which are most easily applied to homogeneous neural network architectures. As can be seen from our models, networks with hubs are a natural approach to the implementation of the WTA procedure, especially if this procedure is implemented on the principles of synchronization.

5 CONFLICT OF INTEREST

The authors declare that they have no conflict of interest.