1 Introduction

One of the major challenges in neurobiology is understanding the relationship between spatially structured activity states and the underlying neural circuitry that supports them.

From the geometrical point of view, the first accurate model of the functional architecture of the primary visual cortex (V1) is due to Hubel and Wiesel (1977) (see Hubel (1988) for a review of their work). Hubel and Wiesel discovered that for every point (x, y) of the retinal plane there is an entire set of cells, each one sensitive to a particular instance of a specific feature of the image: position, orientation, scale, color, curvature, velocity, stereo. They called this structure hypercolumnar organization. Horizontal connectivity is responsible for the cortico-cortical propagation of neural activity between hypercolumns. Further insights into the structure of the connectivity and the spatial arrangement of cells were provided by Blasdel (1992), Bonhoeffer and Grinvald (1991), and Bosking et al. (1997). The association fields of Field et al. (1993), discovered on a purely psychophysical basis, have been proposed as a phenomenological counterpart of the cortico-cortical connectivity. Geometric frameworks for the description of the functional architecture of V1 were proposed by Hoffman (1989), Petitot and Tondut (1999), Bressloff and Cowan (2003), Citti and Sarti (2006), Zucker (2006), and Sarti, Citti and Petitot (2008). Applications to image processing can be found in Duits and Franken (2010a, b), Duits et al. (2011), and Boscain et al. (2012).

From the dynamical point of view, the first neural field models of cortical activity are due to Wilson and Cowan (1972, 1973) and Amari (1972), and are expressed in terms of integro-differential equations. Extensions of these models were provided by Ermentrout and Cowan (1979, 1980). These mean field equations describe the activity on a 2D plane and formally express the interaction between cells through a convolution kernel. Bressloff and Cowan (2003) and Bressloff et al. (2002) proposed new models taking into account the high dimensional cortical structure, with position, orientation and scale as features. In their models the connectivity kernel satisfies the symmetry properties of the cortical space, namely SE(2) for rotation and translation and the affine group for scale, rotation and translation. In the absence of external input these models successfully account for hallucination patterns. More recently, Faugeras (2012), Faye and Faugeras (2010) and Chossat et al. (2011) modified the model in order to take into account delay and the tensorial structure of the cortex.

The scope of this paper is to provide a possible computational interpretation of cortical function by considering a mean field neural model which takes into account the neurogeometry of the cortex introduced in Citti and Sarti (2006) as well as the presence of a visual input. It is known that when stationary solutions of the equation become marginally stable, eigenmodes of the linearized operator can become unstable and grow. In the absence of a visual input the arising eigenmodes lead to the hallucination patterns proposed by Bressloff and Cowan (2003) and Bressloff et al. (2002). The main result of our study consists in showing that, in the presence of a visual input, these eigenmodes correspond to perceptual units. While in the case of hallucinations the emergence of eigenmodes is due to the use of drugs, in the case of perceptual units it is due to physiological variations of parameters during the perception process. The whole process can be interpreted as a problem of data segregation and partitioning, strongly related to the most recent results on dimensionality reduction. In particular our model can justify, on a biological basis, the results of Perona and Freeman (1998), Shi and Malik (1997), Weiss (1999), Coifman and Lafon (2006), Coifman et al., who directly faced the problem of perceptual grouping in the description of a scene by means of a kernel PCA on an affinity matrix.

While the aim of the paper is to provide a possible computational interpretation of cortical function, another motivation is to show that the proposed neural computational model unifies four different scientific areas: computational neuroscience, visual perception, computer vision and machine learning. In fact the model is able to extract perceptual units with a neurally plausible mechanism and at the same time it formally corresponds to a computer vision algorithm and to a machine learning technique. This intersection could be important to integrate different scientific communities and to share ideas and inspiration on the basis of a formal (mathematical and computational) analogy.

The paper starts by briefly recalling some results about the neurogeometry of the primary visual cortex (Section 2). The horizontal interaction between simple cells is represented by the fundamental solution of a Fokker-Planck equation, following Sanguinetti et al. (2010) and Barbieri et al. In Section 3 the classical mean field model of Ermentrout and Cowan is adapted to the SE(2) cortical symmetry group with the previously computed connectivity kernel. Stationary solutions are studied and a stability analysis is performed, varying a suitable physiological parameter. In the classical papers (Bressloff and Cowan 2003; Bressloff et al. 2002) the variability of this parameter was due to the presence of drugs. In our model, on the contrary, the variability of the same parameter is due to the physiological variability of the transfer function in different neural populations. In addition, the geometry of the problem depends both on the invariance of SE(2) and on the presence of the input. In Section 4 the mean field equation is discretized and the connectivity kernel is reduced to a matrix induced by the neurogeometry of the cortex as well as by the visual input. Marginally stable solutions are computed as eigenvectors of this matrix, and we show that they represent perceptual units present in the image. The result is very closely related to the dimensionality reduction and clustering problems of Perona and Freeman (1998), and the connectivity matrix can be interpreted as an affinity matrix. In Section 5 we present numerical simulation results. Finally, in Section 6 we discuss the model with regard to cortical function and outline the disciplinary unification.

2 The functional geometry of V1

In this section we briefly recall the structure of the functional geometry of the visual cortex. As discovered by Hubel and Wiesel (1977) the visual cortex is organized in hypercolumns of simple cells sensitive to the position (x, y) and to variables which describe different properties of the stimulus: orientation, curvature, speed, velocity, scale, disparity. We will describe in detail the structure of the family of simple cells, sensitive to position and orientation.

2.1 The SE(2) symmetry of the visual cortex

Many authors (Petitot and Tondut 1999; Citti and Sarti 2006; Zucker 2006) have represented the hypercolumnar organization as a 3-dimensional space with coordinates (x, y, θ), where each point corresponds to a specific population of cells sensitive to a stimulus positioned at (x, y) and with orientation θ. This leads to a description of the visual cortex in terms of the special Euclidean group \(SE(2) \approx \mathbb {R}^{2} \times S^{1}\), the semi-direct product of the group of translations of the plane \(\mathbb {R}^{2}\) with the group of rotations of the plane SO(2) (see Fig. 1).

Fig. 1 The ice cube model of SE(2): the basis (in gray) represents the retinal 2D plane, discretized with 4 values of the x and y variables. The hypercolumns, coded in color, represent the different orientations

This 3D model can be identified with the original ice cube model of the cortex (see Hubel and Wiesel 1977). Later on, a more realistic model of the cortex was proposed in terms of the pinwheel structure (see Fig. 2 left), which codes for position and orientation in the 2D cortical layer. The pinwheel structure of V1 has been reconstructed starting from a set of cortical activity maps acquired with optical imaging techniques in response to gratings with different orientations (see Bosking et al. (1997)). A color image is obtained from the gray valued activity maps by associating a color code to the preferred orientations. This model can be considered as the union of discrete patches, each one coding all orientations (see Fig. 2, top right). Different mathematical models of this structure have been proposed (see Berry and Dennis (2000), Durbin and Mitchison (1990), Barbieri et al. (2012)). In particular, using harmonic analysis properties in the group SE(2), Barbieri et al. (2012) expressed a model of the pinwheel structure through the choice of an angle θ(x, y) at every point (see Fig. 3). In computations we will always use the continuous SE(2) model of the cortex, since it is simpler to apply, but all computations performed in this setting can be projected onto the pinwheel structure by intersection with the graph of θ(x, y), which allows neural compatibility to be checked.

Fig. 2 The pinwheel structure of the primary visual cortex measured by in vivo optical imaging, taken from Bosking et al. (1997). Orientation maps are coded with the colorbar at the bottom. On the right we see that all the orientations are coded around a pinwheel

Fig. 3 The model of pinwheel proposed in Barbieri et al. (2012) (left). It is expressed as the 2D projection of the graph of a function θ(x, y) (right)

2.2 The output of simple cells to the visual stimulus

The receptive profile of a simple cell has been modelled as a Gabor filter or in terms of derivatives of a Gaussian function (Daugman). The whole set of simple cells ψ (x, y, θ) can be obtained by rotation and translation from the mother filter ψ (0,0,0), which amounts to saying that for every (x, y, θ) the cell at position (x, y) sensitive to the orientation θ can be represented as

$$ \psi_{(x,y,\theta)} \left(x^{\prime},y^{\prime}\right)= \psi_{(0,0,0)}\left(R_{\theta}(x^{\prime}-x, y^{\prime}-y)\right). $$
(2.1)

where \(R_{\theta}\) denotes the rotation by an angle θ. This transformation formally attests that the cortex has the structure of the group SE(2) of rotations and translations.

The response of simple cells to a visual stimulus I(x, y) can be obtained as the integral of the receptive profile (RP) against the image I:

$$ h(x,y,\theta)= \int \psi_{(x,y,\theta)}\left(x^{\prime},y^{\prime}\right) I\left(x^{\prime},y^{\prime}\right) dx^{\prime}dy^{\prime}. $$
(2.2)

Note that the action of the cells is to associate to the 2D retinal image I(x, y) a function h(x, y, θ) defined on the motion group SE(2), which describes the visual cortex.
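As an illustration of Eqs. (2.1) and (2.2), the following minimal sketch builds a receptive profile by rotating and translating a mother filter and evaluates the response h at one point of the cortical space. The odd Gabor mother profile, its parameters `sigma` and `freq`, the counterclockwise convention for \(R_{\theta}\), and the pixel-grid sum used in place of the integral are all assumptions made for the example, not choices taken from the text.

```python
import numpy as np

def mother_filter(u, v, sigma=2.0, freq=0.25):
    """Hypothetical odd Gabor profile standing in for psi_(0,0,0)."""
    envelope = np.exp(-(u**2 + v**2) / (2.0 * sigma**2))
    return envelope * np.sin(2.0 * np.pi * freq * v)

def receptive_profile(x, y, theta, xs, ys):
    """psi_(x,y,theta)(x', y') = psi_(0,0,0)(R_theta(x' - x, y' - y)), Eq. (2.1)."""
    dx, dy = xs - x, ys - y
    # R_theta taken as the counterclockwise rotation matrix (assumed convention)
    u = np.cos(theta) * dx - np.sin(theta) * dy
    v = np.sin(theta) * dx + np.cos(theta) * dy
    return mother_filter(u, v)

def simple_cell_response(image, x, y, theta):
    """Discrete version of Eq. (2.2): sum of the profile times the image over pixels."""
    ny, nx = image.shape
    ys, xs = np.mgrid[0:ny, 0:nx].astype(float)
    return float(np.sum(receptive_profile(x, y, theta, xs, ys) * image))

def lift(image, thetas):
    """Sample h(x, y, theta) on the pixel grid for a list of orientations."""
    ny, nx = image.shape
    h = np.zeros((ny, nx, len(thetas)))
    for k, th in enumerate(thetas):
        for i in range(ny):
            for j in range(nx):
                h[i, j, k] = simple_cell_response(image, j, i, th)
    return h
```

The function `lift` makes the last remark concrete: a 2D image of shape (ny, nx) is mapped to an array of shape (ny, nx, number of orientations), a sampled function on SE(2). The nested loops are kept for readability; a practical implementation would use convolutions.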

2.3 Geometry of the horizontal connectivity

Hypercolumns are connected by means of the so called horizontal connectivity. Experimental measures of this connectivity have been obtained by Bosking et al. (1997) by injecting a chemical tracer (biocytin) and observing its propagation in the cortical layer (see Fig. 4).

Fig. 4 Cortico-cortical connectivity measured by Bosking et al. (1997). The tracer is propagated through the lateral connections to the points in black. These locations are plotted together with the orientation maps

The scope of this section is to recall the geometric instruments which can describe a trajectory in the ideal SE(2) cortical space, from which we will deduce a model of the horizontal neural connectivity, to be compared with the physiological data.

In order to do so, the authors in Citti and Sarti (2006) introduced the following vector fields

$$ \vec{X_{1}} = (\cos\theta, \sin\theta, 0), \quad \vec{X_{2}} = (0, 0, 1) \ $$
(2.3)

which describe respectively the propagation in the direction of the orientation θ and the rotation.

For the reader interested in the geometric aspects of the problem, we note that these vector fields are generators of the Lie algebra associated to SE(2) (see Sugiura); this remark can be skipped, since it is not necessary for the comprehension of the rest of the paper. The points of the structure are connected by integral curves of these two vector fields:

$$c:\mathbb{R}\rightarrow SE(2), \quad c(s)=(x(s),y(s),\theta(s))$$

such that

$$ \frac{dc}{ds}(s) =\left( \vec{X_{1}} + k \vec{X_{2}}\right)(c(s)), \quad c(0) = 0. $$
(2.4)

More precisely, the cortical connectivity can be modeled as the probability of connecting two points in the cortex. Hence we need to consider the stochastic counterpart of the curves defined in Eq. (2.4):

$$ (x^{\prime}, y^{\prime}, \theta^{\prime}) = \left(\cos (\theta),\sin(\theta) ,N (0, \sigma^{2})\right) = \vec{X}_{1} + N \left(0, \sigma^{2}\right)\vec{X}_{2} $$
(2.5)

where \(N(0, \sigma^{2})\) is a normally distributed variable with zero mean and variance equal to \(\sigma^{2}\).
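The stochastic curves of Eq. (2.5) can be sampled numerically. The sketch below is a simple Euler-Maruyama discretization, under the assumption that over a step of length `ds` the orientation receives a Gaussian increment of variance `sigma**2 * ds`; normalization conventions for σ differ by constant factors between references, so the correspondence with the coefficient in the Fokker-Planck equation is only indicative. The function name and all parameters are hypothetical.

```python
import numpy as np

def sample_paths(n_paths, n_steps, ds, sigma, seed=0):
    """Sample end points of random curves starting at (0, 0, 0).

    Position moves at unit speed along (cos(theta), sin(theta)), i.e. along X_1,
    while theta performs a random walk, i.e. diffuses along X_2.
    """
    rng = np.random.default_rng(seed)
    x = np.zeros(n_paths)
    y = np.zeros(n_paths)
    theta = np.zeros(n_paths)
    for _ in range(n_steps):
        x += np.cos(theta) * ds
        y += np.sin(theta) * ds
        # assumed scaling: increment variance sigma^2 * ds
        theta += sigma * np.sqrt(ds) * rng.standard_normal(n_paths)
    return x, y, theta

# A histogram of many end points (x, y, theta) gives a Monte Carlo
# estimate of the transition probability of the process.
x, y, theta = sample_paths(n_paths=5000, n_steps=100, ds=0.01, sigma=0.5)
```

Because the planar speed is one, every sampled end point lies within distance `n_steps * ds` of the origin, and for `sigma = 0` the path reduces to the straight integral curve of \(\vec{X}_{1}\).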

This approach, first introduced by Mumford (1993) for describing the probability of co-occurrence of edges, has been further discussed by August and Zucker (2000, 2003), Williams (1995), and Sanguinetti et al. (2010), and we briefly recall it here. Let us denote by v the transition probability that the stochastic solution starting from the point (x′,y′) with orientation θ′ at the initial time reaches the point (x, y) with orientation θ at time t. This probability density satisfies a deterministic equation known in the literature as the Kolmogorov forward equation or Fokker-Planck (FP) equation:

$$ \partial_{t} v=X_{1} v + \sigma^{2}X_{22}v $$
(2.6)

where \(X_{1} = \cos(\theta)\partial_{x} + \sin(\theta)\partial_{y}\) is the directional derivative and \(X_{2} = \partial_{\theta}\), while \(X_{22} = \partial_{\theta\theta}\) is the second order derivative.

This equation has been widely used in computer vision and applied to problems related to perceptual completion. It was first used by Williams (1995) to compute stochastic completion fields, by August and Zucker (2000, 2003) to define the curve indicator random field, and more recently by Duits and Franken (2010a, b) to perform contour completion, de-noising and contour enhancement. Its stationary counterpart was proposed in Sanguinetti et al. (2008) to model the probability of co-occurrence of contours in natural images:

$$FP =X_{1} + \sigma^{2} X_{22} $$
(2.7)

This operator has a nonnegative fundamental solution Γ satisfying:

$$\begin{array}{@{}rcl@{}} &&X_{1} {\Gamma} ((x, y, \theta),(x^{\prime}, y^{\prime}, \theta^{\prime})) \\ &&{\kern.5pc}+ \sigma^{2} X_{22} {\Gamma}((x, y, \theta),(x^{\prime}, y^{\prime}, \theta^{\prime})) = \delta(x, y, \theta), \end{array} $$
(2.8)

The kernel is strongly biased in the direction \(X_{1}\) and not symmetric. Its symmetrization can be obtained as:

$$\begin{array}{@{}rcl@{}} \omega\left((x,y,\theta),(x^{\prime},y^{\prime},\theta^{\prime})\right)\! &=&\!\frac{1}{2}\left({\Gamma}((x,y,\theta),(x^{\prime},y^{\prime},\theta^{\prime}))\right. \\ &+&\!\left.{\Gamma}((x^{\prime},y^{\prime},\theta^{\prime}),(x,y,\theta))\right)\!. \end{array} $$
(2.9)

Since the fundamental solution of Eq. (2.8) is shift invariant with respect to rotation and translation, the kernel ω inherits the same property of invariance. Calling

$$T_{-(x^{\prime},y^{\prime},\theta^{\prime})}(x,y,\theta) = \left(R_{-\theta^{\prime}}(x-x^{\prime}, y-y^{\prime}), \theta - \theta^{\prime}\right) $$

the group law in SE(2), the kernel at any point can be obtained from the kernel centered at the origin by applying this transformation:

$$ \omega\!\left(\!(x,y,\theta),(x^{\prime},y^{\prime},\theta^{\prime})\!\right) \,=\, \omega\left(T_{-(x^{\prime},y^{\prime},\theta^{\prime})}(x,y,\theta),\! (0,0,0)\right)\!. $$
(2.10)
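The invariance expressed by Eq. (2.10) can be checked numerically. The sketch below implements the transformation \(T_{-(x^{\prime},y^{\prime},\theta^{\prime})}\) and evaluates a kernel at an arbitrary pair of points from a profile centered at the origin. The helper names `T_inv`, `act` and `kernel_from_origin`, and the profile passed as `omega0`, are hypothetical; in particular, any function used as `omega0` in an experiment is a placeholder and not the Fokker-Planck fundamental solution.

```python
import numpy as np

def T_inv(xi_prime, xi):
    """T_{-xi'}(xi) = (R_{-theta'}(x - x', y - y'), theta - theta')."""
    x, y, th = xi
    xp, yp, thp = xi_prime
    dx, dy = x - xp, y - yp
    c, s = np.cos(thp), np.sin(thp)
    # R_{-theta'} rotates the displacement by the angle -theta'
    return (c * dx + s * dy, -s * dx + c * dy, th - thp)

def act(g, xi):
    """Rigid motion g = (a, b, phi) applied to xi: rotate by phi, then translate by (a, b)."""
    a, b, phi = g
    x, y, th = xi
    c, s = np.cos(phi), np.sin(phi)
    return (a + c * x - s * y, b + s * x + c * y, th + phi)

def kernel_from_origin(omega0, xi, xi_prime):
    """Eq. (2.10): omega(xi, xi') = omega(T_{-xi'}(xi), (0, 0, 0))."""
    return omega0(*T_inv(xi_prime, xi))
```

With these definitions, moving both arguments by the same rigid motion `g` leaves `kernel_from_origin` unchanged, which is the invariance under rotation and translation stated above.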

We explicitly recall that the general results of Rothschild and Stein (1976) and Nagel et al. (1985) provide a local estimate of the fundamental solution. In addition, starting from the paper of Lanconelli and Pascucci, the level sets of the Fokker-Planck fundamental solutions have been used to define a distance \(d_{c}\), so that the kernel ω is estimated as follows:

$$\omega \left((x,y,\theta), (x^{\prime},y^{\prime},\theta^{\prime}) \right)\simeq e^{- d_{c}^{2}\left((x,y,\theta), (x^{\prime},y^{\prime},\theta^{\prime})\right)} . $$
(2.11)

An isosurface of the symmetrized kernel ω is visualized in Fig. 5, where its typical twisted butterfly shape is visible. In Fig. 6 the kernel is superimposed on the structure of the SE(2) group, which represents the cortical space, and in Fig. 7 it is projected onto the patchy pinwheel structure. In this image the pinwheel structure is the outcome of a simulation following Barbieri et al. (2012). In Fig. 8 (left) the kernel is visualized by means of black points generated with a probability density proportional to the value of the kernel at the point \((x, y, \tilde \theta (x, y))\). The comparison of this image with the results of Bosking et al. (1997) presented in Fig. 8 (right) shows that the kernel ω provides a good estimate of the measured cortical connectivity. Let us also recall that this model closely matches the statistical distribution of edge co-occurrence in natural images obtained in Sanguinetti et al. (2008). Fig. 9 (right) shows the probability density of edge co-occurrences measured from a large database of natural images. Its resemblance to the Fokker-Planck fundamental solution (left) is established at both a qualitative and a quantitative level in Sanguinetti et al. (2008). This argument strongly suggests that the horizontal connectivity modelled by the fundamental solution of the Fokker-Planck equation is deeply shaped by the statistical distributions of features in the environment, and that the very origin of neurogeometry has to be sought in the interaction between the embodied subject and the world.

Fig. 5 The fundamental solution of the Fokker-Planck equation is strongly biased in the direction \(X_{1}\) and not symmetric

Fig. 6 When the fundamental solution (in gray) is superimposed on the SE(2) cortical structure (in color), it tends to intersect the region with the same orientation as its pole

Fig. 7 Superimposing the fundamental solution on the patchy pinwheel structure, the solution is sampled by the pinwheel orientations, obtaining the patchy distribution of the connectivity (in black in the figure), in accordance with the distribution of Fig. 3

Fig. 8 The fundamental solution superimposed on the patchy pinwheel structure and represented in gray (left) and the connectivity map measured by Bosking et al. (1997) (right)

Fig. 9 Comparison between the fundamental solution (left) and the distribution of edge co-occurrences in natural images (right). Both images are taken from Sanguinetti et al. (2008)

We would like to note that, even though the connectivity is strongly anisotropic, if we consider it at a pinwheel point there is no orientation preference at the population level, so that the corresponding horizontal connections are isotropic. This fact can be clearly observed in the model. Indeed, for every fixed point of the cortical space we have an anisotropic Fokker-Planck kernel. However, over each point (x, y) we have a whole family of kernels, each one with a different orientation: their 2D projection gives rise to an isotropic configuration, as represented in Fig. 10.

Fig. 10 Over a point (x, y) we have a whole family of kernels, each one with a different orientation (left). The 2D projection of these kernels gives rise to an isotropic configuration (right)

Let us also mention that the functional architecture of the tree shrew is not the same as that of primates. Indeed, primates appear to have approximately isotropic horizontal connections (once ocular dominance is taken into account). An isotropic version of the previous model can be obtained by completing the basis \(X_{1}, X_{2}\) with the orthonormal vector field

$$X_{3} =-\sin(\theta) \partial_{x} + \cos(\theta) \partial_{y}.$$

Propagation in the direction of this vector field was used in Sarti (2008) to describe simple cells depending on parameters of orientation and scale and to model the perception of parallel lines. In order to model isotropic diffusion, Eq. (2.5) has to be modified as follows

$$ \left(x^{\prime}, y^{\prime}, \theta^{\prime}\right) \,=\, N \left(0, {\sigma_{1}^{2}}\right) \vec{X}_{1} + N \left(0, {\sigma_{2}^{2}}\right)\vec{X}_{2} + N \left(0, {\sigma_{3}^{2}}\right)\vec{X}_{3}, $$
(2.12)

where \(N(0, {\sigma _{i}^{2}})\) are normally distributed variables with zero mean and variance equal to \({\sigma _{i}^{2}}\) .

Consequently the associated time independent Fokker-Planck operator reduces to an elliptic differential operator:

$$ L ={\sigma_{1}^{2}} X_{11} + {\sigma_{2}^{2}} X_{22} + {\sigma_{3}^{2}} X_{33}. $$
(2.13)

The associated fundamental solution is depicted in Fig. 11 and can be considered as a model for the isotropic connectivity found for example in Angelucci et al. (2002).

Fig. 11 A map of horizontal connectivity found by Angelucci et al. (2002) on macaques (left). The connectivity pattern is almost isotropic. This pattern is modelled with the isotropic fundamental solution of Eq. (2.13) (right)

3 Mean field equation in the cortical space

The evolution of the state of a population of cells has been modelled by Wilson and Cowan (1972, 1973), by Ermentrout and Cowan (1980), and subsequently by Bressloff and Cowan (2003). Recent results are due to Faye and Faugeras (2010) and Chossat et al. (2011). The Ermentrout-Cowan mean field equation rewritten in the cortical space reads

$$\begin{array}{@{}rcl@{}} &&\frac{\mathrm{d}a(\xi,t)}{\mathrm{d}t}\,=\,-\alpha a(\xi,t)\\ &&{\kern2.5pc}+\sigma\left(\int \!\mu \omega(\xi, \xi^{\prime}) a(\xi^{\prime}, t) d\xi^{\prime} + h(\xi, t) \right)\quad \textrm{ in } \mathcal{M} \end{array} $$
(3.1)

where ξ = (x, y, θ) is a point of the cortical space \(\mathcal {M}\), the coefficient α represents the decay of activity, h is the feedforward input which coincides with the response of the simple cells in presence of a visual stimulus described by Eq. (2.2).

The function σ is the transfer function of the population, and has a piecewise linear behavior, as proposed in Kilpatrick and Bressloff (2010) (see Fig. 12).

$$ \sigma(s) = \left\{\begin{array}{rcl} 0, & s\in ]-\infty, c-\frac{1}{2\gamma}[\\ \\ \gamma(s-c)+\frac{1}{2}, & s\in [c-\frac{1}{2\gamma}, c+\frac{1}{2\gamma}]\\ \\ 1, & s\in ]c+\frac{1}{2\gamma}, +\infty[ \end{array}\right. , $$
(3.2)

where γ is a real number, which represents the slope of the linear regime and c is the half height threshold.
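The piecewise linear transfer function of Eq. (3.2) is equivalent to clamping the line \(\gamma(s-c)+\frac{1}{2}\) between 0 and 1, which gives a compact way to evaluate it. The following is a minimal sketch; the function name `transfer` is ours.

```python
import numpy as np

def transfer(s, gamma, c):
    """Piecewise linear transfer function of Eq. (3.2).

    Equals 0 below c - 1/(2*gamma), 1 above c + 1/(2*gamma),
    and gamma*(s - c) + 1/2 in between, so that transfer(c) = 1/2.
    """
    s = np.asarray(s, dtype=float)
    return np.clip(gamma * (s - c) + 0.5, 0.0, 1.0)

# Example values with slope gamma = 2 and half height threshold c = 1:
# the linear regime is the interval [0.75, 1.25].
values = transfer([0.0, 0.75, 1.0, 1.25, 3.0], gamma=2.0, c=1.0)
```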

Fig. 12 The piecewise linear transfer function, compared with the classical sigmoid

The kernel μ ω(ξ, ξ′) is the contribution of the cortico-cortical connectivity introduced in Eq. (2.9). Note in particular that the kernel is invariant with respect to the group law in SE(2), so that the equation is equivariant. It is compatible with the model of Bressloff and Cowan, who only assumed that ω is invariant with respect to rotations and translations. For the reader interested in mathematical properties, we state the following remark.

Remark 3.1

We explicitly note that using the expression Eq. (2.10) the operator associated to the kernel ω becomes

$$\mathbf{A}a (\xi) =\int \omega(\xi, \xi^{\prime}) a(\xi^{\prime}, t) d\xi^{\prime} =\int \omega(T_{-\xi^{\prime}}\xi, 0) a(\xi^{\prime}, t) d\xi^{\prime}$$

Hence it is a convolution in the group SE(2), where the Euclidean translation is replaced in the argument of ω by the group transformation. The choice of ω as a symmetrized fundamental solution ensures that it is locally integrable, so that the associated SE(2)-convolution operator is compact on square integrable functions on bounded sets. The assertion, known in the Euclidean setting (see for example Brezis (2011)), also holds in SE(2) since the properties of the convolution are the same (see for example Rothschild and Stein (1976), Proposition B, p. 265).

The parameter μ is a coefficient of short term synaptic facilitation and generally increases during the perceptual process.

We also outline the following existence result:

Remark 3.2

Existence of the solution. The solution is defined for all times and satisfies

$$|a(\xi, t)|\leq \frac{1}{\alpha} \text{ for all } \xi \in M,\ t >0.$$

See for example Faugeras et al. (2009).

3.1 Restriction to the domain defined by the external input

The main novelty of our model is to split the cortical domain M into a subdomain Ω characterized by the presence of the input, and the complementary set. We will show in the following that under suitable assumptions the activity in this complementary set is negligible and the domain of Eq. (3.1) reduces to Ω.

For simplicity we will assume that h can attain only two values, 0 and c, and we call Ω the set of points in the visual cortex activated by the presence of an input

$$ {\Omega}=\{\xi: h(\xi)=c \}. $$
(3.3)

We require that μ ω satisfies an assumption of weak connectivity, which means that when the activity is around the points 0 and c, the dynamics does not change regime due to the connectivity contribution.

Remark 3.3

Formally we will require that the integral of μ ω is sufficiently small to satisfy:

$${\int}_{M} \mu \omega(\xi, \xi^{\prime}) d\xi^{\prime} \leq \alpha \min\left( \frac{1}{2\gamma}, c - \frac{1}{2\gamma}\right). $$
(3.4)

Under this assumption, if the activity a is identically 0 at the initial time \(t_{0}\), then the activity remains identically 0 outside Ω for all \(t > t_{0}\):

$$a(\xi, t)=0 \text{ for } \xi\in M\backslash{\Omega}.$$

On the other hand, on the set Ω the argument of σ always remains in the linear regime for all \(t > t_{0}\):

$$ \int \mu \omega(\xi, \xi^{\prime}) a(\xi^{\prime})d\xi^{\prime} + c\in \!\left[\!c- \frac{1}{2\gamma}, c+ \frac{1}{2\gamma}\!\right],\!\!\!\quad\text{ for } \xi\in {\Omega}. $$
(3.5)

Proof

Let us choose ξ in M∖Ω. Using the boundedness of a asserted in Remark 3.2, and the assumption of weak connectivity Eq. (3.4) on ω, we get

$$\begin{array}{@{}rcl@{}} &&\left|\int \mu\omega(\xi, \xi^{\prime}) a(\xi^{\prime}) d\xi^{\prime} \right|\leq \alpha\max(a) \min\left( \frac{1}{2\gamma}, c - \frac{1}{2\gamma}\right)\\ &&{\kern1pc}\leq\min\left( \frac{1}{2\gamma}, c - \frac{1}{2\gamma}\right). \end{array} $$
(3.6)

It follows that

$$\int \mu\omega(\xi, \xi^{\prime}) a(\xi^{\prime}) d\xi^{\prime} \leq c - \frac{1}{2\gamma}. $$

so that, by the properties of σ, we get:

$$\sigma \left(\int \mu\omega(\xi, \xi^{\prime}) a(\xi^{\prime}) d\xi^{\prime} \right) =0, $$

if \(\xi \in M\backslash{\Omega}\). Inserting this in the right hand side of Eq. (3.1) we obtain

$$\frac{d}{dt} \left(e^{\alpha t} a(\xi, t)\right) = e^{\alpha t} a^{\prime}(\xi, t) + \alpha e^{\alpha t} a(\xi, t) $$
$$=e^{\alpha t} \sigma \left(\int \mu\omega(\xi, \xi^{\prime}) a(\xi^{\prime}) d\xi^{\prime}\right)=0, $$

This implies that

$$e^{\alpha t} a(\xi, t)$$

is constant, and since it vanishes for \(t = t_{0}\), it is identically 0 for all \(t > t_{0}\). From the bound in Eq. (3.6), which holds for every ξ, it also follows that on Ω

$$\int \mu \omega(\xi, \xi^{\prime}) a(\xi^{\prime})d\xi^{\prime} + c \leq c + \frac{1}{2\gamma} $$

and

$$\int \mu \omega(\xi, \xi^{\prime}) a(\xi^{\prime})d\xi^{\prime} + c \geq c - \frac{1}{2\gamma}. $$

Hence the mean field activity equation reduces to

$$\begin{array}{@{}rcl@{}} \frac{\mathrm{d}a(\xi,t)}{\mathrm{d}t}&=&\!-\alpha a(\xi,t)\\ &&\!+\gamma\! \left(\int\! \mu\omega(\xi, \xi^{\prime}) a(\xi^{\prime}, t) d\xi^{\prime} + c \right)\!\!\quad\text{in}~{\Omega}. \end{array} $$
(3.7)

Note that Eq. (3.7) is similar to the one in the Bressloff-Cowan model, but the Bressloff-Cowan model is defined in the whole cortical space, while Eq. (3.7) is defined on the domain Ω.
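The statement of Remark 3.3 can be checked on a small discrete example. The sketch below integrates a discretized version of Eq. (3.1) with the forward Euler method, using a random symmetric nonnegative matrix `W` in place of the kernel μω, rescaled so that its row sums satisfy the weak connectivity bound (3.4). The random matrix, the parameter values and the time step are assumptions chosen only for illustration; `W` is not the Fokker-Planck kernel.

```python
import numpy as np

def transfer(s, gamma, c):
    """Piecewise linear transfer function of Eq. (3.2)."""
    return np.clip(gamma * (s - c) + 0.5, 0.0, 1.0)

def simulate(W, h, alpha, gamma, c, dt=0.01, n_steps=2000):
    """Forward Euler integration of da/dt = -alpha*a + sigma(W a + h), with a(t0) = 0."""
    a = np.zeros_like(h)
    for _ in range(n_steps):
        a = a + dt * (-alpha * a + transfer(W @ a + h, gamma, c))
    return a

rng = np.random.default_rng(0)
n = 40
alpha, gamma, c = 1.0, 1.0, 1.0

# Omega: the first 15 sites receive the input c, the others receive 0.
in_omega = np.zeros(n, dtype=bool)
in_omega[:15] = True
h = np.where(in_omega, c, 0.0)

# Random symmetric nonnegative stand-in for mu*omega, rescaled so that
# every row sum is below alpha * min(1/(2 gamma), c - 1/(2 gamma)), as in Eq. (3.4).
K = rng.random((n, n))
K = 0.5 * (K + K.T)
bound = alpha * min(1.0 / (2.0 * gamma), c - 1.0 / (2.0 * gamma))
W = K * (0.8 * bound / K.sum(axis=1).max())

a = simulate(W, h, alpha, gamma, c)
arg_on_omega = (W @ a + h)[in_omega]   # argument of sigma on Omega
```

In this example the activity outside Ω stays at zero at every step, and the values in `arg_on_omega` remain inside the linear interval of the transfer function, in agreement with Eq. (3.5).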

3.2 Stability analysis

The stationary states \(a_{1}\) of Eq. (3.7) satisfy

$$ -\alpha a_{1}(\xi)+\gamma\left(\int \mu\omega(\xi, \xi^{\prime}) a_{1}(\xi^{\prime}) d\xi^{\prime} + c \right)=0\quad \mathrm{ in } {\Omega} $$
(3.8)

and have been studied by Faugeras et al. (2009).

In order to study their stability we need to study small perturbations around the stationary state. Hence we call \(u = a - a_{1}\) the perturbation, and obtain the equation satisfied by u by subtracting the equations for a and \(a_{1}\):

$$\begin{array}{@{}rcl@{}} \frac{\mathrm{d}(a-a_{1})(\xi,t)}{\mathrm{d}t}&=&-\alpha (a-a_{1})(\xi,t)+\gamma \\ &\times& \left(\int \mu\omega(\xi, \xi^{\prime}) (a - a_{1} ) (\xi^{\prime}, t) d\xi^{\prime} \right) \end{array} $$
(3.8)

in Ω. Note that the function u is a solution of the homogeneous equation associated to Eq. (3.7):

$$ \frac{\mathrm{d}u(\xi,t)}{\mathrm{d}t}=-\alpha u(\xi,t)+\gamma\left(\int \mu\omega(\xi, \xi^{\prime}) u(\xi^{\prime}) d\xi^{\prime} \right)\quad \textrm{ in } {\Omega} $$
(3.9)

The stability of the solution of this linear equation can be studied by means of the eigenvalues of the associated linear operator:

$$Lu = - \alpha u + \mu\gamma\int \omega(\xi, \xi^{\prime}) u(\xi^{\prime}) d\xi^{\prime} =\lambda u. $$
(3.10)

Let us note that the parameter μ increases during the perceptual process, since it is a coefficient of short term synaptic facilitation. For this reason we now study this eigenvalue problem by varying μ. The system is stable if every eigenvalue λ is negative. This condition depends on the value of μ and on the eigenvalues of the convolution operator with kernel ω. Indeed, Eq. (3.10) is equivalent to

$$\int \omega(\xi, \xi^{\prime}) u(\xi^{\prime}) d\xi^{\prime} = \frac{1}{\gamma\mu}(\lambda + \alpha) u.$$

and implies

$$\frac{\lambda + \alpha}{\gamma\mu} = \tilde \lambda$$

for an eigenvalue \(\tilde \lambda \) of ω. Imposing that λ is negative we get:

$$\lambda = - \alpha + \mu\gamma \tilde \lambda<0$$

Hence

$$\mu < \frac{\alpha}{\gamma\tilde \lambda} $$

for every eigenvalue \(\tilde \lambda \) of ω. Recall that the operator associated to ω has a sequence \(\tilde \lambda _{k}\) of eigenvalues. The condition is satisfied for all of them if

$$\mu < \frac{\alpha}{\gamma\tilde \lambda_{1}},$$

for the largest eigenvalue \(\tilde \lambda _{1}\). The uniform solution becomes marginally stable when μ reaches the critical value \( \frac {\alpha }{\gamma \tilde \lambda _{1}}\), due to the excitation of the linear eigenfunctions, solutions of

$$ \int \omega(\xi, \xi^{\prime}) u(\xi^{\prime}) d\xi^{\prime} = \tilde \lambda_{k} u. $$
(3.11)

The saturating nonlinearities of the system can stabilize the growing pattern of activity.
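The stability condition can be made concrete for a discretized kernel. The sketch below computes the growth rates \(\lambda = -\alpha + \mu\gamma\tilde\lambda_{k}\) and the critical value \(\alpha/(\gamma\tilde\lambda_{1}\)) for a symmetric matrix standing in for ω. The Gaussian matrix used as an example and the function names are assumptions made for illustration only.

```python
import numpy as np

def growth_rates(omega, mu, alpha, gamma):
    """Eigenvalues lambda = -alpha + mu*gamma*lambda_tilde_k of the linearized operator."""
    lam_tilde = np.linalg.eigvalsh(omega)   # real eigenvalues of the symmetric matrix, ascending
    return -alpha + mu * gamma * lam_tilde

def critical_mu(omega, alpha, gamma):
    """Critical value alpha / (gamma * lambda_tilde_1), with lambda_tilde_1 the largest eigenvalue."""
    lam1 = np.linalg.eigvalsh(omega)[-1]
    return alpha / (gamma * lam1)

# Stand-in symmetric kernel matrix on 8 points of a line (NOT the Fokker-Planck kernel).
pts = np.arange(8, dtype=float)
omega = np.exp(-(pts[:, None] - pts[None, :]) ** 2)

alpha, gamma = 1.0, 1.0
mu_c = critical_mu(omega, alpha, gamma)
below = growth_rates(omega, 0.9 * mu_c, alpha, gamma)   # all negative: stable
above = growth_rates(omega, 1.1 * mu_c, alpha, gamma)   # largest one positive: first eigenmode grows
```

Below the critical value all growth rates are negative; just above it only the eigenmode associated with \(\tilde\lambda_{1}\) has a positive growth rate.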

4 Patterns of activity and spectral clustering

4.1 The discrete mean field equation

Due to the discrete structure of the cortex, the input configurations consist of a finite number N of position-orientation elements, with coordinates \(\xi_{i} = (x_{i}, y_{i}, \theta_{i})\). At these points the input h takes the value c. As a consequence, the set Ω defined in Eq. (3.3) is discretized, and becomes

$$ {\Omega}_{d} =\{\xi_{i}: h(\xi_{i})=c\}. $$
(4.1)

Analogously the linear operator (3.10) reduces to:

$$ L_{d} u(\xi_{i})=-\alpha u(\xi_{i})+ \gamma\mu\sum\limits_{j=1}^{N}\omega(\xi_{i},\xi_{j})u (\xi_{j}) \quad \textrm{ in } {\Omega}_{d}. $$
(4.2)

The model of Bressloff and Cowan (2003) has been developed in the whole cortical space without an input, and the activity patterns have the symmetry of SE(2). Here the symmetry is lost due to the presence of the input, hence the activity patterns inherit the geometric properties of the domain \({\Omega}_{d}\). The eigenmodes will be defined precisely on that geometry.

In particular the kernel ω is reduced to a matrix A, whose entries i, j are:

$$ A_{ij} = \gamma\mu\omega(\xi_{i}, \xi_{j}), $$
(4.3)

and the eigenvalue problem (3.11) becomes:

$$ A u = \tilde \lambda_{k} u. $$
(4.4)

This matrix can be considered as the equivalent of the affinity matrix introduced by Perona and Freeman (1998) to perform perceptual grouping. Perona and Freeman proposed to model the affinity matrix in terms of a heuristic distance d(ξ), facilitating collinear and cocircular pairs of elements. Indeed by Eq. (2.11) we see that

$$A_{ij} \simeq e^{-{d_{c}}^{2}(\xi_{i}, \xi_{j}) }, $$

where \(d_{c}\) is the distance defined in Eq. (2.11).

4.2 Spectral clustering and dimensionality reduction

In Perona and Freeman (1998) the problem of perceptual grouping is addressed in terms of reducing the complexity of the description of a scene. The visual scene is described in terms of the affinity matrix \(A_{ij}\), with a complexity of order \(O(N^{2})\) if N discrete elements are present in the scene. The idea of Perona and Freeman is to describe the scene by approximating the matrix \(A_{ij}\) with a sum of rank-1 matrices of complexity N, each of which identifies a perceptual unit in the scene. If the number of perceptual units present in the scene is much smaller than N, this procedure reduces the dimensionality of the description. A rank-1 matrix is represented as the outer product of a vector p with itself.

The first one is computed as the best approximation of \(A_{ij}\) in the Frobenius norm, as follows:

$$p_{1}= \textit{argmin}_{\hat{p}} \sum\limits_{i,j=1}^{N}(A_{ij}-\hat{p}_{i} \hat{p}_{j})^{2}$$

where \(\hat {p}_{i} \hat {p}_{j}\) are the entries of a rank-one matrix, which has complexity of order O(N).

Perona proved that the minimizer \(p_{1}\) is the first eigenvector \(v_{1}\) of the matrix A, the one with the largest eigenvalue \(\lambda _{1}\), rescaled as \(p_{1}=\lambda _{1}^{1/2}v_{1}\).

Then the problem is repeated on the vector space orthogonal to \(p_{1}\). The minimizer corresponds to the second eigenvector, and iteratively the other eigenvectors are recovered. The process ends when the associated eigenvalue is sufficiently small. In this way, in general only n eigenvectors are selected, with n < N, leading to the dimensionality reduction.
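The rank-one approximation and the selection of the leading eigenvectors can be illustrated on a small example. The following is only a sketch under stated assumptions: the 5 × 5 matrix `A`, the tolerance `tol` and the function names are hypothetical, and the matrix is assumed symmetric with a positive largest eigenvalue.

```python
import numpy as np

def rank_one_factor(A):
    """Return p1 = sqrt(lambda_1) * v_1 for a symmetric matrix A (lambda_1 > 0 assumed)."""
    eigvals, eigvecs = np.linalg.eigh(A)  # eigenvalues in ascending order
    lam1, v1 = eigvals[-1], eigvecs[:, -1]
    return np.sqrt(lam1) * v1

def select_eigenvectors(A, tol):
    """Eigenpairs of A whose eigenvalue exceeds tol, largest eigenvalue first."""
    eigvals, eigvecs = np.linalg.eigh(A)
    order = np.argsort(eigvals)[::-1]
    keep = [k for k in order if eigvals[k] > tol]
    return eigvals[keep], eigvecs[:, keep]

# Toy affinity matrix (assumption): two groups with strong internal affinity.
A = np.array([[1.0, 0.9, 0.9, 0.0, 0.0],
              [0.9, 1.0, 0.9, 0.0, 0.0],
              [0.9, 0.9, 1.0, 0.0, 0.0],
              [0.0, 0.0, 0.0, 1.0, 0.7],
              [0.0, 0.0, 0.0, 0.7, 1.0]])

# Squared Frobenius error of the best rank-one approximation p1 p1^T.
p1 = rank_one_factor(A)
err_p1 = np.sum((A - np.outer(p1, p1)) ** 2)

# Approximation by the n eigenvectors whose eigenvalue exceeds the tolerance.
lams, vecs = select_eigenvectors(A, tol=1.0)
A_n = (vecs * lams) @ vecs.T
err_n = np.sum((A - A_n) ** 2)
print(len(lams), err_p1, err_n)
```

In this toy case two eigenvectors pass the tolerance, one per group, and the squared Frobenius error of each approximation equals the sum of the squares of the discarded eigenvalues.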

Then the problem of grouping is reduced to the spectral analysis of the affinity matrix \(A_{ij}\), where the salient objects in the scene correspond to the eigenvectors with largest eigenvalues.

We just showed in the previous paragraphs that this spectral analysis can be implemented by the neural population equation in the functional architecture of the primary visual cortex. We can now interpret eigenvectors of Eq. (4.4) as the gestalten segmenting the scene.

5 Numerical simulation and results

5.1 Numerical approximation of the kernel

We numerically evaluate the connectivity kernel ω, defined by Eq. (2.9), in a discrete (x, y, θ) volume, whose cells will be denoted \({\Omega}_{i,j,k}\). Since the kernel is invariant with respect to rotation and translation, it is computed at the point 0, and the kernel at any other point is obtained via a rigid transformation. Hence we consider the discrete fundamental solution \({\Gamma}_{d}\), as well as \(\omega_{d}\), as functions of a single variable (i, j, k). These kernels are numerically estimated with standard Markov Chain Monte Carlo (MCMC) methods (Robert and Casella 2004). This is done by generating random paths obtained from numerical solutions of the system (2.5). This system is discretized as follows

$$ \left\{ \begin{array}{lll} x_{s+{\Delta} s} - x_{s} & = & {\Delta} s\cos(\theta) \vspace{4pt}\\ y_{s+{\Delta} s} - y_{s} & = & {\Delta} s\sin(\theta)\vspace{4pt}\\ \theta_{s+{\Delta} s} - \theta_{s} & = & {\Delta} s N(\sigma,0) \vspace{4pt} \end{array} \hspace{8pt},\hspace{8pt}s \in \{0,\dots,H\} \right. $$
(5.1)

where H is the number of steps performed by the random path and N(σ, 0) is a generator of numbers drawn from a normal distribution with mean 0 and variance σ. Solving this finite difference equation n times gives n different realizations of the stochastic path. The estimated kernel \({\Gamma}_{d}(i, j, k)\) is computed by averaging their passages over discrete volume elements and smoothing the results with local weighted means. More precisely, at a fixed time value s we count the number of paths that pass through each grid cell \(M_{ijk}\). Dividing by n, this provides a distribution which, for large values of n, gives a discrete approximation of the solution of Eq. (2.6), which we denote ρ(M, i, j, k, s|0). The fundamental solution of its stationary counterpart, which approximates the connectivity kernel (2.7), is then computed by integrating in the s variable:

$${\Gamma}_{d}(i,j,k) = \frac{1}{H} \sum\limits_{s=1}^{H}\rho(M, i,j,k, s| 0). $$

We refer to (Higham), where the code for the implementation of a similar stochastic differential equation is provided.
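The estimation procedure can be sketched as follows. This is an illustrative implementation under stated assumptions, not the code used for the figures: the grid extent `half_width`, the grid sizes, the step `ds` and the function name are hypothetical choices, the smoothing with local weighted means is omitted, and the noise increment is scaled by Δs exactly as written in Eq. (5.1) (a standard Euler-Maruyama scheme would scale it by the square root of Δs instead).

```python
import numpy as np

def estimate_kernel(n_paths=3000, n_steps=100, ds=0.1, sigma=0.08,
                    half_width=12.0, n_xy=41, n_theta=16, seed=0):
    """Monte Carlo estimate of Gamma_d on an (x, y, theta) grid, following Eq. (5.1)."""
    rng = np.random.default_rng(seed)
    x = np.zeros(n_paths)
    y = np.zeros(n_paths)
    theta = np.zeros(n_paths)            # all paths start at the point 0
    gamma_d = np.zeros((n_xy, n_xy, n_theta))
    cell_xy = 2.0 * half_width / n_xy    # side of a spatial cell
    cell_th = 2.0 * np.pi / n_theta      # width of an orientation cell
    for _ in range(n_steps):
        # One step of the finite difference scheme: positions use the current theta.
        x_new = x + ds * np.cos(theta)
        y_new = y + ds * np.sin(theta)
        # sigma is a variance, so the standard deviation is sqrt(sigma).
        theta = theta + ds * rng.normal(0.0, np.sqrt(sigma), n_paths)
        x, y = x_new, y_new
        # Grid cell visited by each path at this step (theta wrapped to [-pi, pi)).
        i = np.floor((x + half_width) / cell_xy).astype(int)
        j = np.floor((y + half_width) / cell_xy).astype(int)
        k = np.floor(((theta + np.pi) % (2.0 * np.pi)) / cell_th).astype(int)
        k = np.minimum(k, n_theta - 1)   # guard against rounding at the upper edge
        inside = (i >= 0) & (i < n_xy) & (j >= 0) & (j < n_xy)
        # rho(., s | 0): fraction of paths in each cell at step s.
        np.add.at(gamma_d, (i[inside], j[inside], k[inside]), 1.0 / n_paths)
    return gamma_d / n_steps             # average over s

gamma_d = estimate_kernel(n_paths=500)
projection_xy = gamma_d.sum(axis=2)      # projection onto the (x, y) plane
print(projection_xy.shape, gamma_d.sum())
```

With these settings every path stays inside the grid, so the estimate sums to 1. Paths leaving the grid would simply be dropped from the count.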

In Fig. 13 a projection of the fundamental solution \({\Gamma}_{d}\) is visualized for different numbers of paths. In Fig. 14 a level set of the connectivity kernel ω is represented (a different level set was anticipated in Fig. 6). In Fig. 7 the connectivity kernel was superimposed on the pinwheel structure resulting from a simulation following Barbieri et al. (2012). In Fig. 9 (left) the kernel was visualized by means of black points generated with a probability density proportional to the value of the kernel at the point \((x, y, \tilde \theta (x, y))\). The comparison of the image with the results of Bosking presented in Figure 8 (right) shows that the kernel ω provides a good estimate of the measured cortical connectivity.

Fig. 13

Estimate of the fundamental solution Γ of Eq. (2.8) with the Markov Chain Monte Carlo method. The projection of \({\Gamma}_{d}\) onto the (x, y) plane is shown, with 6 random paths on the left and 3000 on the right, with σ = 0.08 and H = 100

Fig. 14

A level set of the kernel ω, obtained via the symmetrisation of the fundamental solution \({\Gamma}_{d}\)

The numerical approximation of the isotropic version of the kernel proposed by Angelucci et al. (2002), and discussed in Section 2, follows the same strategy. Equation (2.12) is approximated by

$$ \left\{ \begin{array}{lll} x_{s+{\Delta} s} - x_{s} & = & {\Delta} s N(\sigma_{1},0) \vspace{4pt}\\ y_{s+{\Delta} s} - y_{s} & = & {\Delta} s N(\sigma_{2},0) \vspace{4pt}\\ \theta_{s+{\Delta} s} - \theta_{s} & = & {\Delta} s N(\sigma_{3},0) \vspace{4pt} \end{array} \hspace{8pt},\hspace{8pt}s \in \{0,\dots,H\} \right. $$
(5.2)

where \(\sigma_{1}, \sigma_{2}, \sigma_{3}\) are the variances in the x, y, θ directions. By integrating this system the isotropic fundamental solution is computed, and the kernel visualized in Fig. 11 is obtained.
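A quick sanity check of the isotropic scheme (5.2) is sketched below. Since each of the H steps adds an independent increment Δs N(σ_i, 0), the endpoint variance in each direction should be close to H Δs² σ_i. The parameter values and the function name are illustrative assumptions.

```python
import numpy as np

def isotropic_endpoints(n_paths, n_steps, ds, sigmas, seed=0):
    """Endpoints of n_paths random walks following Eq. (5.2), started at the origin."""
    rng = np.random.default_rng(seed)
    std = np.sqrt(np.asarray(sigmas, dtype=float))  # sigmas are variances
    steps = ds * rng.normal(0.0, std, size=(n_paths, n_steps, 3))
    return steps.sum(axis=1)                        # columns: x, y, theta

sigmas = np.array([1.0, 1.0, 0.5])                  # illustrative variances
ends = isotropic_endpoints(n_paths=10000, n_steps=100, ds=0.1, sigmas=sigmas)
expected_var = 100 * 0.1 ** 2 * sigmas              # H * ds^2 * sigma_i
print(ends.var(axis=0), expected_var)
```

The empirical variances agree with the expected ones up to sampling error, which shrinks as the number of paths grows.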

5.2 Results of grouping

Field et al. (1993) tested the ability of the human visual system to detect perceptual units within a random distribution of oriented elements. Fig. 15 (left) shows the stimulus presented to the observer, from which the visual system is able to identify the perceptual unit shown on the right. In the following we test our grouping model on similar stimuli to identify the perceptual units present in the images.

Fig. 15

The experiment of Field, Hayes and Hess. The proposed stimulus (left) and the perceptual unit present in it (right) (Field et al. 1993)

In the first experiment we considered 150 position-orientation patches, with coordinates \(\xi_{i}\). A subset of elements is organized in a coherent way and the large majority is randomly chosen, in a way similar to the experiment of Field et al. (1993) (see Fig. 16, left). These points define a domain \({\Omega}_{d} = \{\xi_{i}: i = 1, \dots, n\}\) as in Eq. (4.1), and we define the input stimulus h as a function on the whole cortical space M, which attains the value c on \({\Omega}_{d}\) and 0 outside.

Fig. 16

In the image on the left a random distribution of segments and a coherent structure are present. On the right the first eigenvector of the affinity matrix is shown: the segments on which the eigenvector is greater than a given threshold are drawn in red

The connectivity among these elements is defined as in Eq. (4.3), by means of the connectivity kernel \(\gamma \mu \omega(\xi_{i}, \xi_{j})\). The entries of the associated matrix \(A_{ij}\) are visualized in Fig. 17. The quasi-block structure of the matrix is evident, with a principal block on the top left and small blocks along the quasi-diagonal. The principal block corresponds to the coherent object and the diagonal to the correlated ones. The eigenvalue problem (4.4) is solved and the eigenvalues of the associated affinity matrix are computed.

Fig. 17

On the left the affinity matrix is visualized. On the right its eigenvalues are shown

Figure 17 (right) shows the ordered distribution of eigenvalues, where a dominant eigenvalue is present. The corresponding eigenvector is visualized in Fig. 16 (right) and identifies the coherent perceptual unit. The algorithm is summarized in Table 1.

Table 1 Main steps of the algorithm.

In the second experiment a stimulus containing 2 perceptual units is considered. As before, we compute the connectivity kernel \(\gamma \mu \omega(\xi_{i}, \xi_{j})\) and the associated matrix \(A_{ij}\). The eigenvalue problem (4.4) is solved and the eigenvalues of the associated affinity matrix are computed. The first eigenvector of the affinity matrix is shown in Fig. 18 (top right). After that, the affinity matrix is updated by removing the detected perceptual unit. The first eigenvector of the updated affinity matrix is visualized in Fig. 18 (bottom left). The procedure is iterated for the next unit, which contains only two oriented elements (bottom right).
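The iterative procedure described above (leading eigenvector, thresholding, removal of the detected unit, repetition) can be sketched as follows. This is a hedged illustration rather than a reproduction of Table 1: the affinity matrix is a synthetic one with two planted groups instead of one computed from the kernel ω, and `rel_threshold`, `min_eigenvalue`, `max_units` and the function names are hypothetical.

```python
import numpy as np

def extract_units(A, min_eigenvalue, rel_threshold=0.5, max_units=10):
    """Iteratively extract groups: leading eigenvector, threshold, remove, repeat."""
    remaining = np.arange(A.shape[0])        # original indices not yet assigned
    units = []
    while remaining.size > 0 and len(units) < max_units:
        sub = A[np.ix_(remaining, remaining)]
        eigvals, eigvecs = np.linalg.eigh(sub)   # ascending eigenvalues
        if eigvals[-1] < min_eigenvalue:     # no salient structure left
            break
        v = np.abs(eigvecs[:, -1])           # the sign of an eigenvector is arbitrary
        selected = v > rel_threshold * v.max()
        units.append(sorted(remaining[selected].tolist()))
        remaining = remaining[~selected]     # update the matrix by removing the unit
    return units

def toy_affinity(n=30, seed=0):
    """Synthetic affinity (assumption): weak random background plus two planted groups."""
    rng = np.random.default_rng(seed)
    B = rng.uniform(0.0, 0.1, size=(n, n))
    A = (B + B.T) / 2.0                      # symmetric weak background
    A[:10, :10] = 0.9                        # first unit: elements 0..9
    A[10:16, 10:16] = 0.8                    # second unit: elements 10..15
    np.fill_diagonal(A, 1.0)
    return A

units = extract_units(toy_affinity(), min_eigenvalue=3.0)
print(units)
```

The stopping rule on the leading eigenvalue plays the role of the saliency criterion: once only weakly connected background elements remain, no further unit is extracted.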

Fig. 18

A stimulus containing 2 perceptual units (top left) is segmented. After the first eigenvector of the affinity matrix is computed (top right), the affinity matrix is updated by removing the detected perceptual unit. The first eigenvector of the updated affinity matrix is visualized (bottom left). The procedure is iterated for the next unit (bottom right)

Figure 19 shows the selection of the most salient structure of the Kanizsa triangle. When the model is applied, the first eigenvector corresponds to the 3 inducers of the triangle linked together, which indicates which boundaries should be completed for the perception of the triangle. The circles correspond to less salient eigenvectors.

Fig. 19

A classical Kanizsa triangle (left) and the first eigenvector in red (right). The successive eigenvectors account for the circles, showing that the triangle is more salient than the circles

Fig. 20

Segmentation with the isotropic kernel of Eq. 2.13 applied to the stimulus of Fig. 18. The first two eigenvectors of the operator are visualized, showing that the segmentation is achieved but spurious segments are generated

Finally we show the results of a numerical experiment with an isotropic kernel in the cortical space (x, y, θ), corresponding to an isotropic connectivity pattern between simple cells. An isosurface of the kernel is visualized at the top of the figure. The segmentation model is applied to the stimulus of Fig. 18. The two perceptual units are detected as principal eigenvectors (Fig. 20), but the result is noisier than in the case of the Fokker-Planck based kernel.

6 Discussion

We have shown that a mathematical model of the functional architecture of the primary visual cortex, expressed in terms of a mean field equation and of suitable horizontal connectivity kernels, gives rise to neural activation patterns that appear as perceived units. The real cortical function of V1 suggested by the model consists of 3 basic features: a) The cortex provides a space whose connectivity embeds gestalt rules (good continuation in our model). This connectivity space is learned from the statistics of natural images. b) This space is sampled by the visual input, generating a subspace that is a stimulus-dependent connectivity graph. c) Eigensolutions of the mean field equation on this subspace correspond to visual units.

A first comment concerns the nature of visual perception that is suggested by the proposed model. The symmetry breaking mechanism of the mean field equations was introduced by Bressloff and Cowan as a model for visual hallucinations produced by the use of psychotropic drugs. In that model drugs excite the entire cortical space in an unconstrained manner and hallucinations emerge as eigenvectors in the full SE(2) structure. In our model the visual input excites a subdomain of SE(2) and visual units emerge as eigenvectors constrained to the new domain. Thus the suggestive idea of Jan Koenderink that visual perception is a constrained hallucination (Koenderink and van Doorn 2008) seems to be strongly supported by the model, suggesting that a common mechanism is at the base of hallucinations and of the perception of visual units. The difference between the two is just due to the different shapes of the excited domains of V1.

We point out that the proposed model has the following properties:

1) it is compatible with the emergence of perceptual units as in the classical phenomenology of perception. In our model the connectivity kernel encodes the good continuation law of the Berliner gestalt school and the association field of Field, Hayes and Hess. Moreover, when figure-ground articulation is performed, the emergent figure corresponds to the most salient gestalt present in the visual stimulus, as in the classical theory of the phenomenology of perception (Merleau-Ponty 2012).

2) it is formally equivalent to the mechanism of grouping proposed in computer vision by basic spectral methods for figure segmentation (Perona and Freeman 1998). In spectral methods, segmentation is performed by computing eigenvectors of an affinity matrix. We show here that the solutions of the neural mean field equation correspond to eigenvectors of the horizontal connectivity operator, which plays the role of the affinity matrix in computer vision.

3) it corresponds to kernel principal component analysis of the connectivity matrix activated by the visual input, which allows the neural/perceptual process of the emergence of gestalt to be interpreted in terms of dimensionality reduction of graphs, as in machine learning.

The neural model implemented in the functional geometry of V1 performs a grand unification among four different scientific areas: computational neuroscience, visual perception, computer vision and machine learning. It is not our intention to judge whether the spectral computer vision technique is the best at performing grouping, or to evaluate its performance in comparison to others. Nor is it our interest to compare different machine learning techniques (for example PCA, independent component analysis, sparse coding). We would just like to note how the basic features 1), 2), 3) naturally emerge from a simple model of the brain in a unified and integrated setting.

As a last comment, we note that Bressloff and Cowan have shown in (Bressloff et al. 2002) that the emergence of patterns by symmetry breaking of the mean field equation is formally equivalent to the morphogenetic process introduced by Turing in his milestone paper on the chemical basis of morphogenesis (Turing 1952). We propose in this paper that the constitution of perceptual units relies on the same principle of symmetry breaking of the evolution equation, where the equation is now defined on a continuously varying domain defined by the visual input. This situation is even closer to the original paper of Turing, where the origin of the symmetry breaking is due to the deformation and growth of the domain of the equation.