Abstract
In this paper we show that the emergence of perceptual units in V1 can be explained in terms of a physical mechanism of symmetry breaking of the mean field neural equation. We consider a mean field neural model which takes into account the functional architecture of the visual cortex, modeled as a group of rotations and translations equipped with a degenerate metric. The model generalizes well known results of Bressloff and Cowan which, in the absence of input, account for hallucination patterns. The main result of our study consists in showing that in the presence of a visual input, the stable eigenmodes of the linearized operator represent perceptual units of the visual stimulus. The result is strictly related to dimensionality reduction and clustering problems.
1 Introduction
One of the major challenges in neurobiology is understanding the relationship between spatially structured activity states and the underlying neural circuitry that supports them.
From the geometrical point of view, the first accurate models of the functional architecture of the primary visual cortex (V1) are due to Hubel and Wiesel (1977) (see Hubel (1988) for a review of their work). Hubel and Wiesel discovered that for every point (x, y) of the retinal plane there is an entire set of cells, each one sensitive to a particular instance of a specific feature of the image: position, orientation, scale, color, curvature, velocity, stereo. They called this structure hypercolumnar organization. Horizontal connectivity is responsible for the cortico-cortical propagation of the neural activity between hypercolumns. Further insights on the structure of the connectivity and the spatial arrangement of cells were provided by Blasdel (1992), Bonhoeffer and Grinvald (1991), and Bosking et al. (1997). The association fields of Field et al. (1993), discovered on a purely psychophysical basis, have been proposed as a phenomenological counterpart of the cortico-cortical connectivity. Geometric frameworks for the description of the functional architecture of V1 were proposed by Hoffman (1989), Petitot and Tondut (1999), Bressloff and Cowan (2003), Citti and Sarti (2006), Zucker (2006), and Sarti, Citti, Petitot (2008). Applications to image processing can be found in Duits and Franken (2010a, b), Duits et al. (2011), and Boscain et al. (2012).
From the dynamical point of view, the first neural field models of cortical activity are due to Wilson and Cowan (1972, 1973) and Amari (1972), and are expressed in terms of integro-differential equations. Extensions of the models have been provided by Ermentrout and Cowan (1979, 1980). These mean field equations describe the activity on a 2D plane and formally express the interaction between cells as a convolution kernel. Bressloff and Cowan (2003) and Bressloff et al. (2002) proposed new models taking into account the high dimensional cortical structure, with position, orientation and scale as features. In their models the connectivity kernel satisfies the symmetry properties of the cortical space, namely SE(2) for rotations and translations, and the affine group for scale, rotations and translations. In the absence of external input these models successfully account for hallucination patterns. More recently, Faugeras (2012), Faye and Faugeras (2010) and Chossat et al. (2011) modified the model in order to take into account delays and the tensorial structure of the cortex.
The scope of this paper is to provide a possible computational interpretation of cortical function by considering a mean field neural model which takes into account the neurogeometry of the cortex introduced in Citti and Sarti (2006) as well as the presence of a visual input. It is known that when stationary solutions of the equation become marginally stable, eigenmodes of the linearized operator can become stable. In the absence of a visual input the rising eigenmodes lead to the hallucination patterns proposed by Bressloff and Cowan (2003) and Bressloff et al. (2002). The main result of our study consists in showing that in the presence of a visual input, these eigenmodes correspond to perceptual units. While in the case of hallucinations the emergence of eigenmodes is due to the use of drugs, in the case of perceptual units it is due to physiological variations of parameters during the perception process. The whole process can be interpreted as a problem of data segregation and partitioning, strongly related to the most recent results of dimensionality reduction. In particular our model can justify on a biological basis the results of Perona and Freeman (1998), Shi and Malik (1997), Weiss (1999), Coifman and Lafon (2006), and Coifman et al., who directly faced the problem of perceptual grouping in the description of a scene by means of a kernel PCA on an affinity matrix.
If the aim of the paper is to provide a possible computational interpretation of cortical function, another motivation is to show that the proposed neural computational model provides a unification of four different scientific areas: computational neuroscience, visual perception, computer vision and machine learning. In fact the model is able to extract perceptual units with a neurally plausible mechanism, and at the same time it formally corresponds to a computer vision algorithm and to a machine learning technique. This intersection could be important to integrate different scientific communities and to share ideas and inspiration on the basis of a formal (mathematical and computational) analogy.
The paper starts by briefly recalling some results about the neurogeometry of the primary visual cortex (Section 2). The horizontal interaction between simple cells is represented by the fundamental solution of a Fokker-Planck equation, following Sanguinetti et al. (2010) and Barbieri et al. In Section 3 the classical mean field model of Ermentrout and Cowan is adapted to the SE(2) cortical symmetry group with the previously computed connectivity kernel. Stationary solutions are studied and a stability analysis is performed, varying a suitable physiological parameter. In the classical papers (Bressloff and Cowan 2003; Bressloff et al. 2002) the variability of this parameter was due to the presence of drugs. In our model, on the contrary, the variability of the same parameter is due to the physiological variability of the transfer function in different neural populations. In addition, the geometry of the problem depends both on the invariance of SE(2) and on the presence of the input. In Section 4 the mean field equation is discretized and the connectivity kernel is reduced to a matrix induced by the neurogeometry of the cortex as well as by the visual input. Marginally stable solutions are computed as eigenvectors of this matrix, and we show that they represent the perceptual units present in the image. The result is very closely related to the dimensionality reduction and clustering problems of Perona and Freeman (1998), and the connectivity matrix can be interpreted as an affinity matrix. In Section 5 we present numerical simulation results. Finally in Section 6 we discuss the model with regard to cortical function and outline the disciplinary unification.
2 The functional geometry of V1
In this section we briefly recall the structure of the functional geometry of the visual cortex. As discovered by Hubel and Wiesel (1977), the visual cortex is organized in hypercolumns of simple cells sensitive to the position (x, y) and to variables which describe different properties of the stimulus: orientation, curvature, velocity, scale, disparity. We will describe in detail the structure of the family of simple cells sensitive to position and orientation.
2.1 The SE(2) symmetry of the visual cortex
Many authors (Petitot and Tondut 1999; Citti and Sarti 2006; Zucker 2006) represented the hypercolumnar organization as a 3-dimensional space with coordinates (x, y, θ), where each point corresponds to a specific population of cells sensitive to a stimulus positioned in (x, y) and with orientation θ. This leads to the description of the visual cortex in the special Euclidean group \(SE(2) \approx \mathbb {R}^{2} \times S^{1}\). This group is the semi-direct product of the group of translations of the plane \(\mathbb {R}^{2}\) with the group of rotations of the plane SO(2) (see Fig. 1).
This 3D model can be identified with the original ice cube model of the cortex (see Hubel and Wiesel 1977). A more realistic model of the cortex was later proposed in terms of the pinwheel structure (see Fig. 2 left), which codes for positions and orientations in the 2D cortical layer. The pinwheel structure of V1 has been reconstructed starting from a set of cortical activity maps acquired with optical imaging techniques in response to gratings with different orientations (see Bosking et al. (1997)). A color image has been obtained from gray valued activity maps, associating a color coded representation to preferred orientations. This model can be considered as the union of discrete patches, each one coding all orientations (see Fig. 2, top right). Different mathematical models of this structure have been proposed (see Berry and Dennis (2000), Durbin and Mitchison (1990), Barbieri et al. (2012)). In particular, using harmonic analysis properties of the group SE(2), Barbieri et al. (2012) expressed a model of the pinwheel structure through the choice of an angle θ(x, y) at every point (see Fig. 3). In computations we will always use the continuous SE(2) model of the cortex, since it is simpler to apply, but all computations operated in this setting can be projected onto the pinwheel structure by intersection with the graph of θ(x, y), allowing to check neural compatibility.
2.2 The output of simple cells to the visual stimulus
The receptive profile of a simple cell has been modelled as a Gabor filter or in terms of derivatives of a Gaussian function (Daugman). The whole set of simple cells ψ(x, y, θ) can be obtained by rotations and translations from the mother filter ψ(0,0,0), which amounts to saying that for every (x, y, θ) the cell at position (x, y) sensitive to the orientation θ can be represented as
where \(R_\theta\) represents the rotation by an angle θ. This transformation formally attests that the cortex carries the structure of the SE(2) group of rotations and translations.
The response of simple cells to a visual stimulus I(x, y) can be obtained as an integral of the receptive profile with the image I:
Note that the action of the cells is to associate to the 2D retinal image I(x, y) a function h(x, y, θ) defined on the motion group S E(2), which describes the visual cortex.
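The lift from the 2D image I(x, y) to the function h(x, y, θ) can be sketched numerically. The following is a minimal illustration, assuming a Gabor mother filter with hypothetical parameters (`sigma`, `freq`) rather than the exact receptive profiles of the paper:

```python
import numpy as np

def gabor(x, y, theta, sigma=2.0, freq=0.25):
    """Hypothetical odd-symmetric Gabor mother filter rotated by theta."""
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    return np.exp(-(xr**2 + yr**2) / (2 * sigma**2)) * np.sin(2 * np.pi * freq * yr)

def simple_cell_output(I, thetas, half=8):
    """Lift a 2D image I(x, y) to h(x, y, theta) by correlating with rotated Gabors."""
    ys, xs = np.mgrid[-half:half + 1, -half:half + 1]
    Ipad = np.pad(I, half)                       # zero padding at the boundary
    h = np.empty(I.shape + (len(thetas),))
    for k, th in enumerate(thetas):
        psi = gabor(xs, ys, th)
        # brute-force correlation; a convolution routine would do this faster
        for i in range(I.shape[0]):
            for j in range(I.shape[1]):
                h[i, j, k] = np.sum(Ipad[i:i + 2 * half + 1, j:j + 2 * half + 1] * psi)
    return h
```

The output h is a function on the discretized SE(2) space, one orientation slice per value of θ.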
2.3 Geometry of the horizontal connectivity
Hypercolumns are connected by means of the so-called horizontal connectivity. Experimental measures of this connectivity have been obtained by Bosking et al. (1997) by injecting a tracer (biocytin) and observing its propagation in the cortical layer (see Fig. 4).
The scope of this section is to recall the geometric tools which can describe a trajectory in the ideal SE(2) cortical space, from which we will deduce a model of the horizontal neural connectivity, to be compared with the physiological data.
In order to do so, the authors in Citti and Sarti (2006) introduced the following vector fields
which describe respectively the propagation in the direction of the orientation θ and the rotation.
For the reader interested in the geometric aspects of the problem we note that these vector fields are the generators of the Lie algebra associated to S E(2) (see Sugiura), but this remark can be skipped since it is not necessary for the comprehension of the rest of the paper. The points of the structure are connected by integral curves of these two vector fields:
such that
More precisely, the cortical connectivity can be modeled with the probability of connecting two points in the cortex. Hence we need to consider the stochastic counterpart of the curves defined in Eq. (2.4):
where N(0,σ 2) is a normally distributed variable with zero mean and variance equal to σ 2.
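A plausible explicit form of this stochastic system, obtained from the deterministic curves of Eq. (2.4) by replacing the angular velocity with white noise (as in Mumford 1993), is

```latex
\begin{cases}
x'(s) = \cos\theta(s),\\
y'(s) = \sin\theta(s),\\
\theta'(s) = N(0, \sigma^{2}).
\end{cases}
```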
This approach, first introduced by Mumford (1993) for describing the probability of co-occurrence of edges, has been further discussed by August and Zucker (2000, 2003), Williams (1995), and Sanguinetti et al. (2010), and we shortly recall it here. Let us denote by v the transition probability that the stochastic solution starting from the point (x′, y′) with orientation θ′ at the initial time reaches the point (x, y) with orientation θ at time s. This probability density satisfies a deterministic equation known in the literature as the Kolmogorov forward equation or Fokker-Planck equation (FP):
where \(X_1\) is the directional derivative \(\cos(\theta)\partial_x + \sin(\theta)\partial_y\), \(X_2 = \partial_\theta\), and \(X_{22} = \partial_{\theta\theta}\) is the second order derivative.
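With these operators, the Fokker-Planck equation for v can be written explicitly (a standard form, consistent with the definitions above):

```latex
\partial_s v = -X_1 v + \frac{\sigma^2}{2}\, X_{22} v
             = -\cos\theta\,\partial_x v - \sin\theta\,\partial_y v
               + \frac{\sigma^2}{2}\,\partial_{\theta\theta} v .
```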
This equation has been largely used in computer vision and applied to perceptual completion problems. It was first used by Williams (1995) to compute the stochastic completion field, by August and Zucker (2000, 2003) to define the curve indicator random field, and more recently by Duits and Franken (2010a, b) to perform contour completion, de-noising and contour enhancement. Its stationary counterpart was proposed in Sanguinetti et al. (2008) to model the probability of co-occurrence of contours in natural images:
This operator has a nonnegative fundamental solution Γ satisfying:
The kernel is strongly biased in the direction \(X_1\) and is not symmetric. Its symmetrization can be obtained as:
Since the fundamental solution of Eq. (2.8) is invariant with respect to rotations and translations, the kernel ω inherits the same invariance. Calling
the group law in SE(2), the kernel at any point can be obtained from the kernel centered at the origin by applying this transformation:
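For reference, the SE(2) group law has the standard explicit form

```latex
(x_1, y_1, \theta_1) \cdot (x_2, y_2, \theta_2)
  = \bigl(x_1 + x_2\cos\theta_1 - y_2\sin\theta_1,\;
          y_1 + x_2\sin\theta_1 + y_2\cos\theta_1,\;
          \theta_1 + \theta_2\bigr),
```

so that the invariance of the kernel can be expressed as \(\omega(\xi, \xi') = \omega(\xi'^{-1}\xi, 0)\).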
We explicitly recall that the general results of Rothschild and Stein (1976) and Nagel et al. (1985) provide a local estimate of the fundamental solution. In addition, starting from the paper of Lanconelli and Pascucci, the level sets of Fokker-Planck fundamental solutions have been used to define a distance \(d_c\), so that the kernel ω is estimated as follows:
An isosurface of the symmetrized kernel ω is visualized in Fig. 5, where its typical twisted butterfly shape is visible. In Fig. 6 the kernel is superimposed on the structure of the SE(2) group representing the cortical space, and it is projected onto the patchy pinwheel structure in Fig. 7. In this image the pinwheel structure is the outcome of a simulation following Barbieri et al. (2012). In Fig. 8 (left) the kernel is visualized by means of black points generated with a probability density proportional to the value of the kernel at the point \((x, y, \tilde \theta (x, y))\). The comparison of this image with the results of Bosking presented in Fig. 8 (right) shows that the kernel ω provides a good estimate of the measured cortical connectivity. Let us also recall that this model closely matches the statistical distribution of edge co-occurrence in natural images obtained in Sanguinetti et al. (2008). Fig. 9 (right) visualizes the probability density of edge co-occurrences measured from a large database of natural images. Its resemblance to the Fokker-Planck fundamental solution (left) is proved both at a qualitative and a quantitative level in Sanguinetti et al. (2008). This argument strongly suggests that the horizontal connectivity modelled by the fundamental solution of the Fokker-Planck equation is deeply shaped by the statistical distribution of features in the environment, and that the very origin of neurogeometry has to be discovered in the interaction between the embodied subject and the world.
We would like to note that, even though the connectivity is strongly anisotropic, if we consider it at a pinwheel point, at the population level there is no orientation preference, so that the corresponding horizontal connections are isotropic. This fact can be clearly observed in the model. Indeed for every fixed point we have an anisotropic Fokker-Planck kernel. However over each point (x, y) we have a whole family of kernels, each one with a different orientation: their 2D projection gives rise to an isotropic configuration, as represented in Fig. 10.
Let us also mention that the functional architecture of the tree shrew is not the same as that of primates. Indeed, primates appear to have approximately isotropic horizontal connections (once ocular dominance is taken into account). An isotropic version of the previous model can be obtained by completing the basis \(X_1, X_2\) with an orthonormal vector
Propagation in the direction of this vector field was used in Sarti (2008) to describe simple cells depending on parameters of orientation and scale and to model the perception of parallel lines. In order to model isotropic diffusion, Eq. (2.5) has to be modified as follows
where \(N(0, {\sigma _{i}^{2}})\) are normally distributed variables with zero mean and variance equal to \({\sigma _{i}^{2}}\) .
Consequently the associated time independent Fokker-Planck equation reduces to an elliptic differential equation:
The associated fundamental solution is depicted in Fig. 11 (left), and it can be considered as a model (Fig. 11, right) for the isotropic connectivity found, for example, in Angelucci et al. (2002).
3 Mean field equation in the cortical space
The evolution of the state of a population of cells has been modelled by Wilson and Cowan (1972, 1973), by Ermentrout and Cowan (1980), and subsequently by Bressloff and Cowan (2003). Recent results are due to Faye and Faugeras (2010) and Chossat et al. (2011). The Ermentrout-Cowan mean field equation, rewritten in the cortical space, reads
where ξ = (x, y, θ) is a point of the cortical space \(\mathcal {M}\), the coefficient α represents the decay of activity, and h is the feedforward input, which coincides with the response of the simple cells in the presence of a visual stimulus described by Eq. (2.2).
The function σ is the transfer function of the population and has a piecewise linear behavior, as proposed in Kilpatrick and Bressloff (2010) (see Fig. 12),
where γ is a real number, which represents the slope of the linear regime and c is the half height threshold.
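A minimal sketch of such a transfer function, assuming the common piecewise linear form that is zero below threshold, ramps with slope γ around the half-height point c, and saturates at 1 (the text only specifies the slope and the half-height threshold, so the ramp endpoints below are an assumption):

```python
def transfer(s, gamma=1.0, c=0.5):
    """Piecewise-linear transfer function: 0 below threshold, a linear ramp of
    slope gamma centered at the half-height point c, saturation at 1.
    The ramp width 1/gamma is an illustrative choice."""
    lo = c - 1.0 / (2 * gamma)
    hi = c + 1.0 / (2 * gamma)
    if s <= lo:
        return 0.0
    if s >= hi:
        return 1.0
    return gamma * (s - c) + 0.5
```

Within the ramp the function is exactly linear with slope γ, which is the regime used in the stability analysis below.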
The kernel μω(ξ, ξ′) is the contribution of the cortico-cortical connectivity introduced in Eq. (2.9). Note in particular that the kernel is invariant with respect to the group law in SE(2), so that the equation is equivariant. It is compatible with the model of Bressloff and Cowan, who only assumed that ω is invariant with respect to rotations and translations. For the reader interested in mathematical properties, we state the following remark.
Remark 3.1
We explicitly note that using the expression Eq. (2.10) the operator associated to the kernel ω becomes
Hence it is a convolution in the group SE(2), where the Euclidean translation in the argument of ω is replaced by the group transformation. The choice of ω as a symmetrized fundamental solution ensures that it is locally integrable, so that the associated SE(2)-convolution operator is compact on square integrable functions on bounded sets. The assertion, known in the Euclidean setting (see for example Brezis (2011)), holds also in SE(2), since the properties of the convolution are the same (see for example Rothschild and Stein (1976), Proposition B, p. 265).
The parameter μ is a coefficient of short term synaptic facilitation and generally increases during the perceptual process.
We also outline the following existence result:
Remark 3.2
Existence of the solution. The solution is defined for all times and satisfies
See for example Faugeras et al. (2009).
3.1 Restriction to the domain defined by the external input
The main novelty of our model is to split the cortical domain \(\mathcal{M}\) into a subdomain Ω, characterized by the presence of the input, and its complementary set. We will show in the following that under suitable assumptions the activity in this complementary set is negligible, and the domain of Eq. (3.1) reduces to Ω.
For simplicity we will assume that h can attain only two values, 0 and c, and we call Ω the set of points in the visual cortex activated by the presence of an input
We require that μω satisfies a weak connectivity assumption, meaning that when the activity is around the values 0 and c, the dynamics does not change regime due to the connectivity contribution.
Remark 3.3
Formally we will require that the integral of μ ω is sufficiently small to satisfy:
Under this assumption, if the activity a is identically 0 at the initial time, then the activity remains identically 0 outside Ω for all t > t 0:
On the other hand on the set Ω the argument of σ always remains in the linear regime for all t > t 0:
Proof
Let us choose ξ in \(\mathcal{M}\setminus\Omega\). Using the boundedness of a asserted in Remark 3.2 and the weak connectivity assumption Eq. (3.4) on ω, we get
It follows that
so that, by the properties of σ, we get:
if ξ∈M∖Ω. Inserting this in the right hand side of Eq. (3.1)
This implies that
is constant, and since it vanishes for t = t 0, it is identically 0 for all t > t 0. From Eq. (3.6) it also follows that
and
□
Hence the mean field activity equation reduces to
Note that Eq. (3.7) is similar to the one in the Bressloff-Cowan model, but the Bressloff-Cowan model is defined in the whole cortical space, while Eq. (3.7) is defined on the domain Ω.
3.2 Stability analysis
The stationary states a 1 of Eq. (3.7) satisfy
and have been studied by Faugeras et al. (2009).
In order to study their stability we need to study small perturbations around the stationary state. Hence we will call u = a − a₁ the perturbation, and obtain the equation satisfied by u by subtracting the equations for a and a₁:
in Ω. Note that the function u is a solution of the homogeneous equation associated to Eq. (3.7):
The stability of the solution of this linear equation can be studied by means of the eigenvalues of the associated linear operator:
Let us note that the parameter μ increases in time, since it is a short term synaptic facilitation coefficient. For this reason we now study this eigenvalue problem by varying μ. The system will be stable if λ is negative. This condition depends on the value of μ and on the eigenvalues of the convolution operator with kernel μω. Indeed condition Eq. (3.10) is equivalent to
and implies
for an eigenvalue \(\tilde \lambda \) of ω. Imposing that λ is negative we get:
Hence
for every eigenvalue \(\tilde \lambda \) of ω. Remember that the operator associated to ω has a sequence \(\tilde \lambda _{k}\) of eigenvalues. This is satisfied if
for the largest eigenvalue \(\tilde \lambda _{1}\). The uniform solution becomes marginally stable when μ increases beyond the critical value \( \frac {\alpha }{\gamma \tilde \lambda _{1}}\), due to the excitation of the linear eigenfunctions, solutions of
The saturating nonlinearities of the system can stabilize the growing pattern of activity.
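The chain of implications of this subsection can be summarized as follows: if u is an eigenfunction of the convolution operator with eigenvalue \(\tilde\lambda\), then

```latex
\lambda u = -\alpha u + \mu\gamma\,\omega * u
\;\Longrightarrow\;
\lambda = -\alpha + \mu\gamma\,\tilde\lambda,
\qquad
\lambda < 0 \iff \mu < \frac{\alpha}{\gamma\,\tilde\lambda},
```

so that marginal stability is first reached at \(\mu = \alpha/(\gamma\tilde\lambda_{1})\), for the largest eigenvalue \(\tilde\lambda_{1}\).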
4 Patterns of activity and spectral clustering
4.1 The discrete mean field equation
Due to the discrete structure of the cortex, the input configurations are constituted by a finite number N of position-orientation elements, with coordinates \(\xi_i = (x_i, y_i, \theta_i)\). On these points the input h takes the value c. As a consequence, the set Ω defined in Eq. (3.3) is discretized, and becomes
Analogously the linear operator (3.10) reduces to:
The model of Bressloff and Cowan (2003) was developed in the whole cortical space without an input, and the activity patterns have the symmetry of SE(2). Here the symmetry is lost due to the presence of the input, hence the activity patterns inherit the geometric properties of the domain \(\Omega_d\). The eigenmodes will be defined precisely on that geometry.
In particular the kernel ω is reduced to a matrix A, whose entries i, j are:
and the eigenvalue problem (3.11) becomes:
This matrix can be considered as the equivalent of the affinity matrix introduced by Perona and Freeman (1998) to perform perceptual grouping. Perona proposed to model the affinity matrix in terms of a heuristic distance d(ξ), facilitating collinear and cocircular couples of elements. Indeed by Eq. (2.11) we see that
where d c is the distance defined in Eq. (2.11).
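The construction of the discrete affinity matrix and its spectral analysis can be sketched as follows. The kernel below is an illustrative symmetric Gaussian surrogate for ω (the model itself uses the symmetrized Fokker-Planck fundamental solution), and all parameter names are hypothetical:

```python
import numpy as np

def affinity_matrix(points, kernel):
    """A_ij = kernel(xi_i, xi_j) over the lifted input points xi = (x, y, theta)."""
    n = len(points)
    A = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            A[i, j] = kernel(points[i], points[j])
    return A

def surrogate_kernel(p, q, sx=1.0, st=0.5):
    """Illustrative stand-in for omega: Gaussian in positional and angular distance."""
    dx, dy = p[0] - q[0], p[1] - q[1]
    dth = np.angle(np.exp(1j * (p[2] - q[2])))  # wrap angle difference to (-pi, pi]
    return np.exp(-(dx**2 + dy**2) / (2 * sx**2) - dth**2 / (2 * st**2))

def principal_unit(A):
    """Largest eigenvalue of A and its eigenvector, marking the most salient unit."""
    lam, V = np.linalg.eigh(A)         # eigenvalues in ascending order
    return lam[-1], np.abs(V[:, -1])   # abs fixes the eigenvector sign ambiguity
```

Elements belonging to the same perceptual unit give large mutual affinities, so the principal eigenvector concentrates on the most coherent group of elements.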
4.2 Spectral clustering and dimensionality reduction
In Perona and Freeman (1998) the problem of perceptual grouping has been faced in terms of a reduction of the complexity in the description of a scene. The visual scene is described in terms of the affinity matrix \(A_{ij}\), with a complexity of order O(N²) if N discrete elements are present in the scene. The idea of Perona and Freeman is to describe the scene by approximating the matrix \(A_{ij}\) with a sum of matrices of rank 1 and complexity N, each of which will identify a perceptual unit in the scene. If the number of perceptual units present in the scene is much smaller than N, this procedure reduces the dimensionality of the description. A rank 1 matrix will be represented as the outer product of a vector p with itself.
The first one will be computed as the best approximation of A i j minimizing the Frobenius norm as follows:
where the term \({\sum }_{i,j=1}^{N} \hat {p}_{i} \hat {p}_{j} \) is a rank one matrix with complexity order O(N).
Perona proved that the minimizer p 1 is the first eigenvector v 1 of the matrix A with largest eigenvalue \(\lambda _{1}: p_{1}=\lambda _{1}^{1/2}v_{1}.\)
Then the problem is repeated on the vector space orthogonal to p₁. The minimizer will correspond to the second eigenvector, and iteratively the other eigenvectors are recovered. The process ends when the associated eigenvalue is sufficiently small. In this way in general only n eigenvectors are selected, with n < N, leading to the dimensionality reduction.
Then the problem of grouping is reduced to the spectral analysis of the affinity matrix A i j , where the salient objects in the scene correspond to the eigenvectors with largest eigenvalues.
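The iterative rank-1 extraction described above can be sketched as follows; this is a minimal version, and the stopping fraction `eps` is a hypothetical choice:

```python
import numpy as np

def extract_units(A, eps=0.1):
    """Peel off rank-1 components p p^T from the symmetric affinity matrix A:
    each p_k = sqrt(lam_k) v_k labels one perceptual unit. Stop when the residual
    leading eigenvalue falls below eps times the initial leading eigenvalue."""
    units = []
    R = np.array(A, dtype=float)
    lam_top = np.linalg.eigvalsh(R).max()
    while True:
        lam, V = np.linalg.eigh(R)
        k = int(np.argmax(lam))
        if lam[k] <= eps * lam_top:
            break
        p = np.sqrt(lam[k]) * V[:, k]
        units.append(p)
        R = R - np.outer(p, p)   # deflation: continue in the orthogonal complement
    return units
```

Subtracting the outer product p pᵀ sets the extracted eigenvalue to zero while leaving the other eigenpairs unchanged, so each iteration works in the orthogonal complement of the units already found.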
We have shown in the previous paragraphs that this spectral analysis can be implemented by the neural population equation in the functional architecture of the primary visual cortex. We can now interpret the eigenvectors of Eq. (4.4) as the gestalten segmenting the scene.
5 Numerical simulation and results
5.1 Numerical approximation of the kernel
We numerically evaluate the connectivity kernel ω, defined by Eq. (2.9), in a discrete (x, y, θ) volume, whose cells will be denoted \(\Omega_{i,j,k}\). Since the kernel is invariant with respect to rotations and translations, it will be computed at the point 0, and the kernel at any other point will be obtained via a rigid transformation. Hence we will consider the discrete fundamental solution \(\Gamma_d\), as well as \(\omega_d\), as functions of a unique variable (i, j, k). These kernels will be numerically estimated with standard Markov chain Monte Carlo methods (MCMC) (Robert and Casella 2004). This is done by generating random paths obtained from numerical solutions of the system (2.5). This system is discretized as follows
where H is the number of steps performed by the random path and N(0, σ²) is a generator of numbers taken from a normal distribution with mean 0 and variance σ². Solving this finite difference equation n times will give n different realizations of the stochastic path. The estimated kernel \(\Gamma_d(i, j, k)\) is computed by averaging their passages over discrete volume elements and smoothing the results with local weighted means. More precisely, at a fixed time value s we count the number of paths that passed through each grid cell \(M_{ijk}\). Dividing by n provides a distribution which, for large values of n, gives a discrete approximation of the solution of Eq. (2.6), which we will denote ρ(M, i, j, k, s|0). The fundamental solution of its stationary counterpart, which approximates the connectivity kernel (2.7), will then be computed by integrating in the s variable:
We refer to Higham, where code for the implementation of a similar stochastic differential equation is provided.
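A minimal sketch of the Monte Carlo procedure, with illustrative grid and step parameters (not the settings actually used in our experiments):

```python
import numpy as np

def estimate_kernel(n_paths=20000, H=60, sigma=0.3, ds=0.2,
                    half=10, n_theta=16, seed=0):
    """Monte Carlo estimate of the (unnormalized) discrete kernel Gamma_d:
    simulate n_paths discretized stochastic paths started at the origin with
    theta = 0 and histogram their passages over an (x, y, theta) grid.
    Euler-Maruyama scaling (noise times sqrt(ds)) is assumed for the SDE step."""
    rng = np.random.default_rng(seed)
    G = np.zeros((2 * half + 1, 2 * half + 1, n_theta))
    for _ in range(n_paths):
        x = y = th = 0.0
        for _ in range(H):
            x += np.cos(th) * ds
            y += np.sin(th) * ds
            th += rng.normal(0.0, sigma) * np.sqrt(ds)
            i = int(round(x)) + half
            j = int(round(y)) + half
            k = int(round(th / (2 * np.pi / n_theta))) % n_theta
            if 0 <= i < G.shape[0] and 0 <= j < G.shape[1]:
                G[i, j, k] += 1.0          # accumulate passages over grid cells
    return G / n_paths
```

Accumulating passages over all times implements the integration in s; smoothing by local weighted means, mentioned above, is omitted from the sketch.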
In Fig. 13 a projection of the fundamental solution \(\Gamma_d\) is visualized for different numbers of paths. In Fig. 14 a level set of the connectivity kernel ω is represented (a different level set had been anticipated in Fig. 6). In Fig. 7 the connectivity kernel was superimposed on the pinwheel structure, the outcome of a simulation following Barbieri et al. (2012). In Fig. 9 (left) the kernel was visualized by means of black points generated with a probability density proportional to the value of the kernel at the point \((x, y, \tilde \theta (x, y))\). The comparison of this image with the results of Bosking presented in Fig. 8 (right) shows that the kernel ω provides a good estimate of the measured cortical connectivity.
The numerical approximation of the isotropic version of the kernel, proposed by Angelucci et al. (2002) and discussed in Section 2, follows the same strategy. Equation (2.12) is approximated by
where \(\sigma_1, \sigma_2, \sigma_3\) are the variances in the x, y, θ directions. By integrating this system the isotropic fundamental solution is computed, and the kernel visualized in Fig. 11 is obtained.
5.2 Results of grouping
Field et al. (1993) tested the ability of the human visual system to detect perceptual units in a random distribution of oriented elements. Fig. 15 (left) shows the stimulus proposed to the observer, from which the visual system is able to individuate the perceptual unit shown on the right. In the following we will test our grouping model on similar stimuli, to individuate the perceptual units present in the images.
In the first experiment we considered 150 position-orientation patches, with coordinates \(\xi_i\). A subset of the elements is organized in a coherent way and the large majority is randomly chosen, in a way similar to the experiment of Field et al. (1993) (see Fig. 16, left). These points define a domain \(\Omega_d = \{\xi_i : i = 1, \cdots, n\}\) as in Eq. (4.1), and we define the input stimulus h as a function on the whole cortical space \(\mathcal{M}\) which attains the value c on \(\Omega_d\) and 0 outside.
The connectivity among these elements is defined as in Eq. (4.3), by means of the connectivity kernel γμω(ξ_i, ξ_j). The entries of the associated matrix \(A_{ij}\) are visualized in Fig. 17. The quasi-block structure of the matrix is evident, with a principal block on the top left and small blocks along the quasi-diagonal. The principal block corresponds to the coherent object, and the small diagonal blocks to the remaining, weakly correlated elements. The eigenvalue problem (4.4) is then solved and the eigenvalues of the associated affinity matrix are computed.
Figure 17 (right) shows the ordered distribution of eigenvalues, where a dominant eigenvalue is present. The corresponding eigenvector is visualized in Fig. 16 (right) and individuates the coherent perceptual unit. The algorithm is summarized in Table 1.
In the second experiment a stimulus containing two perceptual units is presented. As before, we compute the connectivity kernel γμω(ξ_i, ξ_j) and the associated matrix \(A_{ij}\), and solve the eigenvalue problem (4.4). The first eigenvector of the affinity matrix is computed and shown in Fig. 18 (top right). After that, the affinity matrix is updated by removing the detected perceptual unit. The first eigenvector of the updated affinity matrix is visualized in Fig. 18 (bottom left). The procedure is iterated for the next unit, which only contains two oriented elements (bottom right).
Figure 19 shows the selection of the most salient structure of the Kanizsa triangle. When applying the model, the first eigenvector corresponds to the three inducers of the triangle linked together, indicating which boundaries should be completed for the perception of the triangle. The circles correspond to less salient eigenvectors.
Finally we show the results of a numerical experiment with an isotropic kernel in the cortical space (x, y, θ), corresponding to an isotropic connectivity pattern between simple cells. An isosurface of the kernel is visualized at the top of the figure. The segmentation model is applied to the stimulus of Fig. 18. The two perceptual units are detected as principal eigenvectors (Fig. 20), but the result is noisier than in the case of the Fokker-Planck based kernel.
6 Discussion
We have shown that a mathematical model of the functional architecture of the primary visual cortex, expressed in terms of a mean field equation and of suitable horizontal connectivity kernels, gives rise to neural activation patterns that appear as perceived units. The cortical function of V1 suggested by the model consists of three basic features: a) the cortex provides a space whose connectivity embeds gestalt rules (good continuation, in our model); this connectivity is learned from the statistics of natural images; b) this space is sampled by the visual input, generating a subspace that is a stimulus-dependent connectivity graph; c) eigensolutions of the mean field equations on this subspace correspond to visual units.
A first comment concerns the nature of visual perception suggested by the proposed model. The symmetry breaking mechanism of the mean field equations was introduced by Bressloff and Cowan as a model for the visual hallucinations produced by psychotropic drugs. In that model drugs excite the entire cortical space in an unconstrained manner and hallucinations emerge as eigenvectors in the full SE(2) structure. In our model the visual input excites a subdomain of SE(2) and visual units emerge as eigenvectors constrained to the new domain. The suggestive idea of Jan Koenderink that visual perception is a constrained hallucination (Koenderink and van Doorn 2008) thus seems to be strongly supported by the model, suggesting that a common mechanism lies at the base of hallucinations and of the perception of visual units. The difference between the two is due only to the different shapes of the excited domains of V1.
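This contrast between hallucination and perception can be illustrated with a toy computation. In the sketch below (our own illustration, with a generic difference-of-Gaussians kernel on a ring standing in for the cortical connectivity, not the kernel of the paper), the dominant eigenmode of the full operator is a global periodic planform, while the dominant eigenmode of the operator restricted to a stimulus-excited subdomain is localized on that subdomain:

```python
import numpy as np

n = 60
theta = 2 * np.pi * np.arange(n) / n

# difference-of-Gaussians lateral kernel on a ring: a generic
# excitation/inhibition stand-in for the cortical connectivity
def dog(d, s_e=0.3, s_i=0.9):
    return np.exp(-d**2 / (2 * s_e**2)) - 0.5 * np.exp(-d**2 / (2 * s_i**2))

d = np.angle(np.exp(1j * (theta[:, None] - theta[None, :])))
K = dog(d)

# unconstrained ("hallucination") case: the dominant eigenmode of the
# full operator is a global periodic planform over the whole ring
_, V = np.linalg.eigh(K)
full_mode = V[:, -1]

# constrained ("perception") case: the stimulus excites only a
# subdomain, and the dominant eigenmode of the restricted operator
# lives on that subdomain
support = np.arange(10, 25)
_, Vs = np.linalg.eigh(K[np.ix_(support, support)])
sub_mode = np.zeros(n)
sub_mode[support] = Vs[:, -1]

# participation ratio: how many sites a unit-norm mode spreads over
pr = lambda v: 1.0 / np.sum(v**4)
print(round(pr(full_mode)), round(pr(sub_mode)))
```

The participation ratio of the full mode is of the order of the whole domain, while the constrained mode cannot spread beyond the excited subdomain: the two regimes differ only in the shape of the excited domain.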
Let us point out that the proposed model has the following properties:
1) it is compatible with the emergence of perceptual units described by the classical phenomenology of perception. In our model the connectivity kernel encodes the good continuation law of the Berlin gestalt school and the association field of Field, Hayes and Hess. Moreover, when figure-ground articulation is performed, the emergent figure corresponds to the most salient gestalt present in the visual stimulus, as in the classical phenomenology of perception (Merleau-Ponty 2012).
2) it is formally equivalent to the grouping mechanism proposed in computer vision by basic spectral methods for figure segmentation (Perona and Freeman 1998). In spectral methods, segmentation is performed by computing eigenvectors of an affinity matrix. We show here that the solutions of the neural mean field equation correspond to eigenvectors of the horizontal connectivity operator, which plays the role of the affinity matrix in computer vision.
3) it corresponds to kernel principal component analysis of the connectivity matrix activated by the visual input, allowing us to interpret the neural/perceptual process of the emergence of gestalts in terms of dimensionality reduction on graphs, as in machine learning.
The neural model implemented in the functional geometry of V1 performs a grand unification among four different scientific areas: computational neuroscience, visual perception, computer vision and machine learning. It is not our intention to judge whether the spectral computer vision technique is the best at performing grouping, or to evaluate its performance in comparison to others. Nor is it our interest to compare different machine learning techniques (for example PCA, independent component analysis, sparse coding). We simply note how the basic features 1), 2), 3) naturally emerge from a simple model of the brain in a unified and integrated setting.
As a last comment, let us note that Bressloff and Cowan have shown (Bressloff et al. 2002) that the emergence of patterns by symmetry breaking of the mean field equation is formally equivalent to the morphogenetic process introduced by Turing in his milestone paper on the chemical basis of morphogenesis (Turing 1952). We propose in this paper that the constitution of perceptual units relies on the same principle of symmetry breaking of the evolution equation, where the equation is now defined on a continuously varying domain determined by the visual input. This situation is even closer to Turing's original paper, where the origin of the symmetry breaking is the deformation and growth of the domain of the equation.
References
Amari, S. (1972). Characteristics of random nets of analog neuron-like elements. IEEE Transactions on Systems, Man, and Cybernetics, SMC-2.
Angelucci, A., Levitt, J.B., Walton, E.J.S., Hupe, J.-M., Bullier, J., Lund, J.S. (2002). Circuits for local and global signal integration in primary visual cortex. The Journal of Neuroscience, 22 (19), 8633–8646.
August, J., & Zucker, S.W. (2000). The curve indicator random field: curve organization via edge correlation. In: Boyer, K., & Sarkar, S. (Eds.) Perceptual organization for artificial vision systems. Kluwer, Norwell, (pp. 265–288).
August, J., & Zucker, S.W. (2003). Sketches with curvature: the curve indicator random field and Markov processes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25 (4), 387–400.
Barbieri, D., Citti, G., Sanguinetti, G., Sarti, A. (2014). An uncertainty principle underlying the functional architecture of V1. Journal of Physiology Paris, 106 (5–6), 183–193.
Berry, M.V., & Dennis, M.R. (2000). Phase singularities in isotropic random waves. Proceedings of the Royal Society of London, Series A: Mathematical, Physical and Engineering Sciences, 456 (2001), 2059–2079.
Blasdel, G.G. (1992). Orientation selectivity, preference, and continuity in monkey striate cortex. Journal of Neuroscience, 12, 3139–3161.
Bonhoeffer, T., & Grinvald, A. (1991). Iso-orientation domains in cat visual cortex are arranged in pinwheel-like patterns. Nature, 353 (6343), 429–431.
Boscain, U., Duplaix, J., Gauthier, J.P., Rossi, F. (2012). Anthropomorphic image reconstruction via hypoelliptic diffusion. SIAM Journal on Control and Optimization, 50 (3), 1309–1336.
Bosking, W.H., Zhang, Y., Schofield, B., Fitzpatrick, D. (1997). Orientation selectivity and the arrangement of horizontal connections in tree shrew striate cortex. Journal of Neuroscience, 17 (6), 2112–2127.
Bressloff, P.C., & Cowan, J.D. (2003). The functional geometry of local and long-range connections in a model of V1. Journal of Physiology Paris, 97 (2-3), 221–236.
Bressloff, P.C., Cowan, J.D., Golubitsky, M., Thomas, P.J., Wiener, M.C. (2002). What geometric visual hallucinations tell us about the visual cortex. Neural Computation, 14, 473–491.
Brezis, H. (2011). Functional analysis, Sobolev spaces and PDE. New York: Springer.
Chossat, P., Faye, G., Faugeras, O. (2011). Bifurcation of hyperbolic planforms. Journal of Nonlinear Science, 21, 465–498.
Citti, G., & Sarti, A. (2006). A cortical based model of perceptual completion in the roto-translation space. Journal of Mathematical Imaging and Vision, 24 (3), 307–326.
Cocci, G., Barbieri, D., Citti, G., Sarti, A. (2014). Cortical spatio-temporal dimensionality reduction for visual grouping. Neural Computation (in press).
Coifman, R., & Lafon, S. (2006). Geometric harmonics: A novel tool for multiscale out-of-sample extension of empirical functions. Applied and Computational Harmonic Analysis, 21 (1), 31–52.
Coifman, R.R., Lafon, S., Lee, A.B., Maggioni, M., Warner, F., Zucker, S. (2005). Geometric diffusions as a tool for harmonic analysis and structure definition of data: diffusion maps. Proceedings of the National Academy of Sciences, 102 (21), 7426–7431.
Daugman, J.G. (1985). Uncertainty relation for resolution in space, spatial frequency, and orientation optimized by two-dimensional visual cortical filters. Journal of the Optical Society of America A, 2 (7), 1160–1169.
Duits, R., & Franken, E.M. (2010a). Left invariant parabolic evolution equations on SE(2) and contour enhancement via invertible orientation scores, part I: Linear left-invariant diffusion equations on SE(2). Quarterly of Applied Mathematics, 68, 255–292.
Duits, R., & Franken, E.M. (2010b). Left invariant parabolic evolution equations on SE(2) and contour enhancement via invertible orientation scores, part II: Nonlinear left-invariant diffusion equations on invertible orientation scores. Quarterly of Applied Mathematics, 68, 293–331.
Duits, R., Führ, H., Janssen, B., Bruurmijn, M., Florack, L., Van Assen, H. (2011). Evolution equations on Gabor transforms and their applications. arXiv:1110.6087.
Durbin, R., & Mitchison, G. (1990). A dimension reduction framework for understanding cortical maps. Nature, 343, 644–647.
Ermentrout, G.B., & Cowan, J.D. (1979). Temporal oscillations in neuronal nets. Journal of mathematical biology, 7 (3), 265–280.
Ermentrout, G.B., & Cowan, J.D. (1980). Large scale spatially organized activity in neural nets. SIAM. Journal of Applied Mathematics, 38 (1), 1–21.
Faugeras, O., Veltz, R., Grimbert, F. (2009). Persistent neural states: stationary localized activity patterns in nonlinear continuous n-population, q-dimensional neural networks. Neural Computation, 21, 147–187.
Faye, G., & Faugeras, O. (2010). Some theoretical and numerical results for delayed neural field equations. Physica D, 239 (9), 561–578.
Faugeras, O. (2012). Neural field models of visual areas: principles, successes, and caveats. ECCV Workshops 1, 474–479.
Field, D.J., Hayes, A., Hess, R.F. (1993). Contour integration by the human visual system: evidence for a local association field. Vision Research, 33, 173–193.
Higham, D.J. (2001). An algorithmic introduction to numerical simulation of stochastic differential equations. SIAM Review, 43 (3), 525–546.
Hoffman, W.C. (1989). The visual cortex is a contact bundle. Applied Mathematics and Computation, 32, 137–167.
Hubel, D.H. (1988). Eye, Brain and Vision. New York: Scientific American Library.
Hubel, D.H., & Wiesel, T.N. (1977). Ferrier lecture: Functional architecture of macaque monkey visual cortex. Proceedings of the Royal Society of London Series B, 198 (1130), 1–59.
Kilpatrick, Z.P., & Bressloff, P.C. (2010). Effects of synaptic depression and adaptation on spatio temporal dynamics of an excitatory neuronal network. Physica D, 239, 547–560.
Koenderink, J.J., & van Doorn, A.J. (2008). The structure of visual space. Journal of mathematical imaging and vision, 31 (2–3), 171–187.
Montgomery, R. (2001). A tour of subriemannian geometries, their geodesics and applications. American Mathematical Society.
Merleau-Ponty, M. (2012). Phenomenology of perception. New York: Routledge.
Mumford, D. (1993). Elastica and computer vision. Algebraic Geometry and its Applications, 1993, 507–518.
Nagel, A., Stein, E.M., Wainger, S. (1985). Balls and metrics defined by vector fields I: basic properties. Acta Mathematica, 155, 103–147.
Perona, P., & Freeman, W.T. (1998). A factorization approach to grouping. In: Burkhardt, H., & Neumann, B. (Eds.) Proceedings ECCV, (pp. 655–670).
Petitot, J., & Tondut, Y. (1999). Vers une neurogéométrie. Fibrations corticales, structures de contact et contours subjectifs modaux. Mathématiques, Informatique et Sciences Humaines, 145, 5–101.
Robert, C.P., & Casella, G. (2004). Monte Carlo statistical methods, 2nd Edition. New York: Springer.
Rothschild, L., & Stein, E. (1976). Hypoelliptic differential operators and nilpotent groups. Acta Mathematica, 137, 247–320.
Sachkov, Y. L. (2011). Cut locus and optimal synthesis in the sub-Riemannian problem on the group of motions of a plane. ESAIM: COCV, 17 (2), 293–321.
Sanguinetti, G., Citti, G., Sarti, A. (2008). Image completion using a diffusion driven mean curvature flow in a sub-Riemannian space. VISAPP (2)'08, 46–53.
Sanguinetti, G., Citti, G., Sarti, A. (2010). A model of natural image edge co-occurrence in the rototranslation group. Journal of Vision, 10 (14).
Sarti, A., Citti, G., Petitot, J. (2008). The symplectic structure of the primary visual cortex. Biological Cybernetics, 98, 33–48.
Shi, J., & Malik, J. (1997). Normalized cuts and image segmentation. In: Proceedings IEEE Conf. Computer Vision and Pattern Recognition, (pp. 731–737).
Sugiura, M. (1976). Unitary representations and harmonic analysis: an introduction. North-Holland/Kodansha.
Turing, A.M. (1952). The chemical basis of morphogenesis. Philosophical Transactions of the Royal Society of London, Series B: Biological Sciences, 237 (641), 37–72.
Veltz, R., & Faugeras, O. (2011). Stability of the stationary solutions of neural field equations with propagation delays. The Journal of Mathematical Neuroscience (JMN).
Williams, L.R. (1995). Stochastic completion fields. ICCV Proceedings.
Wilson, H.R., & Cowan, J.D. (1972). Excitatory and inhibitory interactions in localized populations of model neurons. Biophysical Journal, 12, 1–24.
Wilson, H.R., & Cowan, J.D. (1973). A mathematical theory of the functional dynamics of cortical and thalamic nervous tissue. Biological Cybernetics, 13 (2), 55–80.
Weiss, Y. (1999). Segmentation using eigenvectors: a unifying view. In: Proceedings of the Seventh IEEE International Conference on Computer Vision.
Zucker, S.W. (2006). Differential geometry from the Frenet point of view: boundary detection, stereo, texture and color. In: Paragios, N., Chen, Y., Faugeras, O. (Eds.) Handbook of mathematical models in computer vision. Springer, US, (pp. 357–373).
Conflict of interest
The authors declare that they have no conflict of interest.
Action Editor: Bard Ermentrout
Sarti, A., Citti, G. The constitution of visual perceptual units in the functional architecture of V1. J Comput Neurosci 38, 285–300 (2015). https://doi.org/10.1007/s10827-014-0540-6