1 Introduction

The simulator Virtual Retina (Footnote 1) is fully documented and downloadable (Footnote 2). As an alternative to downloading, the software homepage also includes a web service (Footnote 3) that allows clients to use the simulator online, through a user-friendly interface requiring no installation.

The simulator is mostly based on state-of-the-art knowledge about retinal processing, with formulations adapted to large-scale simulation. Modular XML definition files make the simulator’s parameters simple to handle. In this article, we detail the underlying model and the main features of the simulator.

Retina (and LGN) models are very numerous, ranging from detailed models of a specific physiological phenomenon to large-scale models of the whole retina. However, the category of large-scale models appears under-represented in the retinal literature. The reason is simple: On one side, experimental researchers on the retina are not very interested in large-scale models that mostly require a compilation of already well-established results. On the other side, researchers who seek a retina/LGN input for further modeling (typically, of V1) often overlook the complexity of processing in the retina, and use very simplified retina models.

Our primary goal with Virtual Retina is to provide a better retinal input to modelers of the visual cortex. Retinal processing displays a number of features which are likely fundamental for further cortical interpretation, such as band-pass filtering, gain controls, spiking synchrony, etc. (Wohrer 2008b). It is these functional implications of retinal processing that we wanted to retain in the simulator. This focus on functionality naturally makes Virtual Retina a good candidate for a second type of application: studying the nature of the retinal code itself.

By contrast, reproducing biological complexity is not our primary goal. However, because the retina is an efficient machinery, a functional model that reproduces retinal specificities must de facto have an architecture quite close to that of a real retina, to allow the same sort of filtering operations.

Amongst models of retinal processing as a whole, some focus on a detailed reproduction of retinal connectivity in successive layers, each layer being modeled with a full set of cellular and synaptic parameters (Hennig et al. 2002; Bálya et al. 2002). Other models take a more functional approach, built on a series of specific filtering stages, to produce a functionally efficient retinal output. In this group, another distinction can be made between models that aim at strong biological precision (van Hateren et al. 2002; Gazeres et al. 1998; Bonin et al. 2005), and models more oriented towards signal processing and computer vision (Hérault and Durette 2007; Delorme et al. 1999), with a consequent reduction of model parameters. Many functional models are strongly inspired by the linear–nonlinear (LN) architecture, based on three successive stages: linear filtering of the visual stimulus, static nonlinearity, and then spike generation (Chichilnisky 2001). Because of their generality and wide use, LN models have even been termed the retinal standard model by Carandini et al. (2005).

For Virtual Retina, we propose a model somewhere in between the models cited above. It is definitely a functional model, with substantial simplifications regarding the complexity of retinal physiology. Still, it aims at a relative biological precision: This is verified by reproducing different experimental recordings on real retinal ganglion cells, including experiments not accounted for by linear models.

LN models are also a strong inspiration for our model: The first stage consists of a spatio-temporal linear filter, and we make use of static nonlinearities. However, as opposed to LN models, we also incorporate a nonlinear mechanism of contrast gain control, inspired by retinal physiology and other existing models. Furthermore, whereas LN models are generally used to experimentally fit the responses of a few ganglion cells (Chichilnisky 2001; Keat et al. 2001; Baccus and Meister 2002), we propose here a functional model suitable for large-scale simulation.

More generally, the model keeps an architecture strongly related to retinal physiology, in a desire to reproduce specific effects which are functionally important, and often discarded by large-scale models:

  • Non-separability of the center-surround filtering. It is well known that most ganglion cells have a center-surround receptive field organization which makes them more sensitive to image edges. In real retinas, the surround signal is transmitted with a supplementary delay of a few milliseconds. This delay is included in our software, since it has significant effects on the perception of uniform screens and of newly appearing images. A qualitative, large-scale illustration of this fact is provided in the article.

  • Contrast gain control. This specific nonlinear dependence of retinal filtering on the mean level of contrast is modeled in an original framework, inspired by previous models, that can account simultaneously for gain controls due to the temporal and spatial structure of the stimulus. This contrast gain control model is carefully discussed, justified mathematically, and its perceptual consequences are suggested through a qualitative simulation.

  • Adaptable band-pass temporal filtering. In the retina, some ganglion cells have a long-lasting response after the appearance of a static visual stimulus (tonic cells), while others only respond with a strong and short activation wave right after stimulus onset, and return to being silent afterward (phasic cells). Virtual Retina accounts for both types of cells in the simplest way possible, by modifying the strength of a partially high-pass filter.

  • Spike generation mechanism. A possible spike generation process is proposed at the output of the software, with a model derived from experimental fitting of ganglion cell outputs (Keat et al. 2001), that yields more realistic spike trains than a Poisson process.

None of the previously cited models of retinal processing displays simultaneously all these elements. Further capabilities of the software include reproduction of Y-type cells’ spatial nonlinearity (Enroth-Cugell and Robson 1966; Enroth-Cugell and Freeman 1987) and a possible log-polar scheme modeling the large-scale organization of primate retinas.

The article is organized as follows. In Section 2, we detail the three stages of the retina model implemented by the simulator. The first stage (2.2) is a linear filter that reproduces the center-surround architecture arising from the interaction between light receptors and horizontal cells. The second stage (2.3) is a contrast gain control mechanism at the level of bipolar cells, driven by feedback conductances. The third stage (2.4) provides additional temporal shaping of the signal, and then a spike generation process. Based on this model, we present in Section 3 the software simulator, Virtual Retina, and the results that we obtain with various sequences. First, we assess the relevance of our model by comparing its spiking output with recordings of ganglion cells in different experiments (3.2). Then, we show large-scale simulations and more qualitative results (3.3). Finally, in Section 4, we discuss the nature of the software and its underlying model (4.1), and more specifically the included contrast gain control mechanism (4.2).

2 Methods

2.1 General structure of the model

2.1.1 A layered model in three stages

Figure 1 presents the global architecture of our model. The layered architecture of a retina suggests a model made of successive continuous spatio-temporal maps that progressively transmit and transform the incoming signal. The incoming light on the retina is the luminosity profile L(x,y,t) defined for every spatial point (x,y) of the retina at each time t. It can have any units in our model; for our simulations we used digitized intensities between 0 and 255. The subsequent layers of cells are modeled as spatial continuums (no discrete cells, except the output ganglion cells), driven by specific differential equations.

Fig. 1

Schematic view of the model, inspired by the layered structure of the retina. Three boxes indicate the three stages of the model. Corresponding mathematical notations are indicated in the right-hand side. Except for the last layer (ganglion cells), successive signals are modeled as spatially continuous maps

The first stage of our model deals with signal processing done in the Outer Plexiform Layer (OPL). It involves the first two layers of cells in the retina: light receptors and horizontal cells. This stage is modeled as a simple spatio-temporal linear filter based on experimental recordings (Enroth-Cugell et al. 1983; Cai et al. 1997) and previous models (Mahowald and Mead 1991; Herault 1996). We detail this OPL filter in Section 2.2. When applied to the input sequence L(x,y,t), the OPL filter defines a band-pass excitatory current \(I_\mathrm{OPL}(x,y,t)\) which is fed to bipolar cells.

The second stage of the model is an instantaneous, nonlinear contrast gain control through a variable feedback shunt conductance \(g_\mathrm{A}(x,y,t)\), applied on bipolar cells in our model. This interaction is represented by the two small arrows between bipolar cells and the ‘fast adaptation’ signal in Fig. 1.

The third stage models signal processing in the Inner Plexiform Layer (IPL) and ganglion cells. First, additional spatio-temporal shaping of the signal is provided, modeling some synaptic interactions in the IPL. It produces the excitatory current \(I_\mathrm{Gang}\), which is fed to our model ganglion cells. Second, ganglion cells themselves are modeled as a discrete set of noisy leaky integrate-and-fire (nLIF) cells paving the visual field, and generating spike trains from the input current \(I_\mathrm{Gang}\). The cells that we model (see Sections 3.2.1 and 4.1.2) can be either X-type (the blue arrow, representing a one-to-one connection from bipolar cells) or Y-type (the blue cone, representing a synaptic pooling of the excitatory current).

2.1.2 Dimensional reduction

Our model equivalents to bipolar cells (Section 2.3) and ganglion cells (Section 2.4) are based on the same, generic membrane equation for a point neuron:

$$c \frac{dV}{dt} = \sum\limits_i I_i + \sum\limits_j g_j(E_j - V).$$
(1)

V is the cell’s membrane potential in Volts, c is the membrane capacity in Farads. Synaptic inputs (and other intrinsic membrane currents) can either be modeled as currents \(I_i\) in Amperes, or more precisely as synaptic conductances \(g_j\) in Siemens associated with reversal potentials \(E_j\) in Volts.

Since our model does not focus on physiological precision, but on the functional output of the retina on a large scale, we are solely concerned with the temporal evolution induced by Eq. (1). Hence the following reduction of dimensionality:

$$V \to (V-V_R)/\Delta V,\;\;I \to I/(c\Delta V),\;\;g \to g/c,$$
(2)

where c is the membrane capacity of the neuron, \(V_R\) its resting potential, and ΔV a ‘typical’ range of variation for the neurons’ potential. Through this reduction, constant c disappears from Eq. (1), V and the \(E_j\) become dimensionless with typical values of the order of unity, and \(I_i\) and \(g_j\), expressed in Hertz, directly give scales for the temporal evolution of V.
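As a check, the substitutions of Eq. (2) can be written out explicitly. The tilde notation for reduced quantities is introduced only for this sketch and is not used elsewhere in the article. Dividing Eq. (1) by \(c\,\Delta V\) gives

$$\frac{d\tilde V}{dt} = \sum\limits_i \tilde I_i + \sum\limits_j \tilde g_j(\tilde E_j - \tilde V), \qquad \tilde V = \frac{V-V_R}{\Delta V},\;\; \tilde E_j = \frac{E_j-V_R}{\Delta V},\;\; \tilde I_i = \frac{I_i}{c\,\Delta V},\;\; \tilde g_j = \frac{g_j}{c},$$

which has the same form as Eq. (1) with c = 1, so that every reduced current and conductance is a rate in Hertz.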

Table 1 proposes a rough conversion between reduced and physical units in our model. It allows one to verify that our model parameters stay in a biological range. The given conversion uses the following constants:

$$c\equiv\textrm{ 0.1 nF},\ \ \Delta V\equiv\textrm{ 20 mV,}$$
(3)

consistent with physiological measurements in mammalian bipolar (Euler and Masland 2000) and ganglion (O’Brien et al. 2002) cells.
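The conversion implied by Eqs. (2)–(3) can be sketched in a few lines of Python. The function names are hypothetical and chosen for this illustration only; the two constants are those of Eq. (3), and the example inputs (100 Hz, 50 Hz) are arbitrary.

```python
# Constants of Eq. (3).
C_FARAD = 0.1e-9   # membrane capacity c = 0.1 nF
DELTA_V = 20e-3    # typical potential range, 20 mV


def reduced_current_to_amperes(i_hz):
    """Invert I -> I/(c*DeltaV): a reduced current in Hz becomes amperes."""
    return i_hz * C_FARAD * DELTA_V


def reduced_conductance_to_siemens(g_hz):
    """Invert g -> g/c: a reduced conductance in Hz becomes siemens."""
    return g_hz * C_FARAD


def reduced_potential_to_volts(v, v_rest=0.0):
    """Invert V -> (V - V_R)/DeltaV; v_rest is the resting potential in volts."""
    return v_rest + v * DELTA_V


# A reduced current of 100 Hz is about 200 pA, and a reduced conductance
# of 50 Hz (time constant 1/50 s = 20 ms) is about 5 nS.
current_pa = reduced_current_to_amperes(100.0) * 1e12
conductance_ns = reduced_conductance_to_siemens(50.0) * 1e9
```

Under these constants, one reduced Hertz corresponds to roughly 2 pA for currents and 0.1 nS for conductances.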

Table 1 Rough conversion between physical and reduced units in the model, based on Eqs. (2)–(3)

2.1.3 Notations for linear filters

Like most retinal models, we use linear filters at several locations of the model, to approximate signal transformations that occur in the successive layers of the retina (see Fig. 2). This section describes the different types of linear filters used in our model. They are kept as simple as possible, because our simulator is focused on functional efficiency.

Fig. 2

Linear kernels used in the model. (a): Exponential kernel \(E_\tau(t)\) (temporal, low-pass). (b): Gamma (exponential cascade) kernel \(E_{n,\tau}(t)\) (temporal, low-pass). (c): Partially high-pass kernel \(T_{w,\tau}(t)\) (temporal, high-pass). (d): Gaussian kernel (spatial, low-pass)

Low-pass temporal filters are taken as exponential filters, or possibly as an exponential cascade:

$$E_\tau (t) = \exp (-t/\tau) / \tau,$$
(4)
$$E_{n,\tau} (t) = (nt)^n \exp (-nt/\tau) / \big( (n-1)! \tau^{n+1} \big),$$
(5)

if t > 0, and zero otherwise (causal filters). These are normalized filters summing to one, meaning that they are only averaging filters, without a linear gain. The exponential cascade filter Eq. (5) peaks at time τ (not nτ), and offers more variability in the shape of the filter thanks to parameter n > 0.
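The two stated properties of these kernels (unit sum, and a peak at τ for the cascade) can be checked numerically. The sketch below is only an illustration: the values of `dt`, `tau` and `n` are arbitrary, and `math.gamma(n)` is used in place of (n−1)! so that non-integer n is also accepted.

```python
import math


def exp_kernel(t, tau):
    """Exponential low-pass kernel of Eq. (4); zero for t <= 0 (causal)."""
    return math.exp(-t / tau) / tau if t > 0 else 0.0


def cascade_kernel(t, n, tau):
    """Exponential cascade kernel of Eq. (5); math.gamma(n) equals (n-1)!."""
    if t <= 0:
        return 0.0
    return (n * t) ** n * math.exp(-n * t / tau) / (math.gamma(n) * tau ** (n + 1))


# Illustrative values only (not taken from the article).
dt, tau, n = 1e-4, 0.02, 5
times = [k * dt for k in range(1, 5001)]  # 0.1 ms to 0.5 s

# Riemann sums approximating the integral of each kernel.
area_exp = sum(exp_kernel(t, tau) for t in times) * dt
area_cascade = sum(cascade_kernel(t, n, tau) for t in times) * dt

# Sample time at which the cascade kernel is largest.
t_peak = max(times, key=lambda t: cascade_kernel(t, n, tau))
```

Both areas come out close to one, and `t_peak` falls on τ rather than nτ, as stated in the text.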

Temporal low-pass occurs throughout retinal processing, starting at the level of photoreceptors with the complicated phototransduction cascade, and continuing in subsequent layers because of synaptic delays and membrane integration of synaptic currents.

Partially high-pass temporal filters are simply taken as the difference of a Dirac, representing the original signal, and an exponential filter representing the low-pass average removed from the original signal:

$$T_{w,\tau} (t) = \delta_0(t) - w E_\tau (t).$$
(6)

Temporal high-pass behavior is present in the retina with different time scales, resulting from cellular internal mechanisms (e.g. cellular adaptation), as well as from synaptic oppositions between excitatory and inhibitory cells.
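A minimal discrete-time sketch of Eq. (6) shows how the weight w sets the balance between a sustained and a transient response to a step input. The recursive update of the low-pass state is one possible discretization, chosen here for simplicity, and the time step and time constant are illustrative. The weights 0.7 and 1 echo the values discussed for ganglion cells in Section 2.4.1.

```python
import math


def partially_high_pass(signal, w, tau, dt):
    """Apply T_{w,tau} = delta_0 - w * E_tau to a sampled signal.

    The exponential low-pass is computed recursively; this update is exact
    when the input is constant over each time step.
    """
    alpha = 1.0 - math.exp(-dt / tau)
    low = 0.0
    out = []
    for x in signal:
        low += alpha * (x - low)   # running exponential average of the input
        out.append(x - w * low)    # original signal minus weighted average
    return out


dt, tau = 1e-3, 0.02               # illustrative values
step = [1.0] * 1000                # unit step lasting 1 s

tonic_like = partially_high_pass(step, w=0.7, tau=tau, dt=dt)
phasic_like = partially_high_pass(step, w=1.0, tau=tau, dt=dt)
```

For a unit step, the output starts near 1 and relaxes to 1 − w: a residual of 0.3 for w = 0.7, and a return to zero for w = 1.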

Low-pass spatial filters are taken as normalized two-dimensional Gaussians:

$$G_{\sigma} (x,y) = \exp\left( -\left(x^2+y^2\right) / \left( 2 \sigma^2\right) \right) / \left(2 \pi \sigma^2 \right),$$
(7)

which is again a normalized filter that only performs spatial averaging on the input signal. Gaussian kernels naturally arise when modeling dendritic spread of retinal cells, and also electrical couplings between cells (see Appendix A).

All filtering kernels are applied as convolution operators on their input signals. In this article, the symbol \(\stackrel{t}{*}\) denotes temporal convolution, and the symbol \(\stackrel{x,y}{*}\) denotes spatial convolution.

2.2 Outer plexiform layer

Retinal signals display a spatial opposition between a precise center signal, and a wider surround signal providing each location of the retina with a measure of the average illumination in the neighborhood (Barlow 1953; Kuffler 1953).

Physiologically, OPL is the designation for the first layer of synapses in the retina, which is the locus of synaptic interactions between light receptors, horizontal cells and bipolar cells (Masland 2001; Kolb et al. 2001). It is often assumed that the center-surround organization arises mostly in the OPL, through competing contributions from light receptors (center, or C) and horizontal cells (surround, or S) onto bipolar cells. Indeed, there is experimental evidence that a center-surround opposition is already present at the level of bipolar cells, in non-mammalian retinas (Werblin and Dowling 1969) as well as in primate retina (Dacey et al. 2000).

In our model, the resulting current \(I_\mathrm{OPL}(x,y,t)\) received by bipolar cells from light receptors and horizontal cells is obtained as

$$C(x,y,t) = G_{\sigma_C} \stackrel{x,y}{*} T_{w_U,\tau_U} \stackrel{t}{*} E_{n_C, \tau_C} \stackrel{t}{*} L\,(x,y,t),$$
(8)
$$S(x,y,t) = G_{\sigma_S} \stackrel{x,y}{*} E_{\tau_S} \stackrel{t}{*} C\;\;(x,y,t),$$
(9)
$$I_\mathrm{OPL}(x,y,t) = \lambda_\mathrm{OPL} \big( C(x,y,t)- w_\mathrm{OPL} S(x,y,t) \big),$$
(10)

where successive convolutions are applied from right to left. L(x,y,t) is the input luminosity profile. C(x,y,t) represents the center signal, associated with light receptors, and S(x,y,t) the surround signal, associated with horizontal cells. The linear filter resulting from Eqs. (8)–(10) is formally close to the linear approximation measured for LGN cells by Cai et al. (1997) (Footnote 4). Let us comment on these three equations.

Center signal [Eq. (8)]

Temporally, the phototransduction process is modeled as a partially transient linear cascade: the exponential cascade \(E_{n_C, \tau_C}(t)\) is modulated by a partially transient filter \(T_{w_U,\tau_U}(t)\) (U standing for ‘undershoot’). Such a filter provides an impulse response close to that measured in macaque cone receptors by Schnapf et al. (1990). Spatially, \(G_{\sigma_C}(x,y)\) encompasses the spatial blur due to gap junctions between light receptors (Raviola and Gilula 1973).

Remark:

Nonlinearities inherent to phototransduction. The phototransduction process is in fact hardly linear, since it displays fast and slow adaptation to the ambient level of luminosity, mediated by different nonlinear processes (Schnapf et al. 1990; Valeton and Van Norren 1983; Polans et al. 1996). The linear approximation is used here for simplicity. It is relatively valid as long as the luminosity range of the input image remains constant (Schnapf et al. 1990), as in a room with artificial light or on a monitor screen. For this reason, our simulator should preferably take normalized sequences as input, such as movies coded between 0 and 255.

Also, phototransduction is the only place where our software uses an exponential cascade Eq. (5), because a simple exponential approximation Eq. (4) would appear too crude.

Surround signal [Eq. (9)]

The surround signal is obtained through a supplementary low-pass on the center signal. Indeed, horizontal cells receive their input from light receptors, so their signal develops with one more synapse and one more cellular integration than receptors. Spatially, horizontal cells are very low-pass (meaning big \(\sigma_S\)) because of strongly coupling gap junctions with their neighboring horizontal cells (Naka and Rushton 1967; Masland 2001; Kolb et al. 2001).

OPL signal [Eq. (10)]

Constant \(\lambda_\mathrm{OPL}\) is the overall gain of the center-surround filter, expressed in Hertz per unit of luminance. Constant \(w_\mathrm{OPL} \in [0,1]\) is the relative weight of center and surround signals, physiologically measured close to 1 in mammalian ganglion cells (Enroth-Cugell and Robson 1966) and bipolar cells (Dacey et al. 2000).

Spatially, the center-surround filter Eqs. (8)–(10) can be associated to a classical difference of Gaussians (DOG). Temporally, it is biphasic. As a result, our OPL filter acts at the same time as an edge detector and a movement detector.

However, note that the center-surround filter is not separable in time and space: It cannot be written as the product of a spatial kernel by a temporal kernel. Indeed, the surround signal is more delayed than the center signal because of \(E_{\tau_S}(t)\) in Eq. (9).

This delay, although estimated at only a few milliseconds in mammalian retinas (Enroth-Cugell et al. 1983; Benardete and Kaplan 1999), has significant perceptual consequences. As a result, the center-surround filter is able to detect temporal variations of luminosity even in a spatially uniform zone. This would not be the case for a separable filter, like a spatial DOG multiplied by a temporal difference-of-Exponentials. Indeed, the response of such a separable filter on a uniform region would always be zero because of the DOG properties, even if the luminosity does vary in time.

Also, the delay between center and surround probably implies that the first retinal spikes after onset of a new image do not code for image edges, but simply for the luminosity signal (as illustrated in Section 3.3.2).
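The argument about uniform regions can be illustrated with a reduced sketch of Eqs. (8)–(10). On a spatially uniform stimulus the normalized Gaussians leave the signal unchanged, so only the temporal filters remain. As a simplifying assumption made for this sketch only, phototransduction is replaced by a single exponential low-pass (no cascade, no undershoot), and all parameter values are hypothetical.

```python
import math


def low_pass(signal, tau, dt):
    """Recursive exponential low-pass with unit gain (discretized E_tau)."""
    alpha = 1.0 - math.exp(-dt / tau)
    state = 0.0
    out = []
    for x in signal:
        state += alpha * (x - state)
        out.append(state)
    return out


dt = 1e-3
tau_c, tau_s = 0.010, 0.004        # hypothetical center and surround time constants
lambda_opl, w_opl = 1.0, 1.0       # hypothetical gain; balanced center and surround

# Uniform screen: dark for 100 ms, then luminance 255 for 900 ms.
luminance = [0.0] * 100 + [255.0] * 900

center = low_pass(luminance, tau_c, dt)     # simplified stand-in for Eq. (8)
surround = low_pass(center, tau_s, dt)      # Eq. (9), with the spatial blur dropped
i_opl = [lambda_opl * (c - w_opl * s) for c, s in zip(center, surround)]

peak_response = max(i_opl)
final_response = i_opl[-1]
```

Even with balanced center and surround (weight 1), the extra delay of the surround produces a transient response at the onset of the uniform screen, which then decays back to zero. A separable filter with the same spatial balance would stay at zero throughout.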

Remark:

Surround inhibition from the IPL. Specific studies suggest that some amacrine cells in the IPL also contribute to generating the surround component of retinal output. However, the importance of this contribution is not well established: Depending on the species and experimental procedure, synaptic connections in the IPL have been found to make a minor (McMahon et al. 2004), moderate (Flores-Herr et al. 2001) or major (Roska and Werblin 2001; Jacobs and Werblin 1998) contribution to ganglion cells’ surround. Our model, aiming at functionality, neglects this additional contribution of the IPL.

2.3 Contrast gain control

Contrast gain control is the usual term to describe the influence of the local contrast of the scene on the transfer properties of the retina (Shapley and Victor 1978; Victor 1987; Kim and Rieke 2001; Rieke 2001; Baccus and Meister 2002). It is therefore an intrinsically nonlinear and dynamical effect. Our simulator includes a model of contrast gain control based on a nonlinear feedback loop at the level of our model bipolar cells:

$$\frac{\displaystyle {dV_\mathrm{Bip}}}{\displaystyle {dt}}(x,y,t)=I_\mathrm{OPL}(x,y,t) -g_\mathrm{A}(x,y,t) V_\mathrm{Bip}(x,y,t)$$
(11)
$$g_\mathrm{A}(x,y,t) = G_{\sigma_\mathrm{A}} \stackrel{x,y}{*} E_{\tau_\mathrm{A}} \stackrel{t}{*} Q (V_\mathrm{Bip}) \;\;(x,y,t),$$
(12)
$$Q(V_\mathrm{Bip}) = g^0_\mathrm{A} + \lambda_\mathrm{A} V_\mathrm{Bip}^2,$$
(13)

where \(g_\mathrm{A}\) represents a variable leakage (‘shunt’) term in the membranes of our model bipolar cells, which is activated through the static function Q (Fig. 3(a)). All physiological magnitudes are reduced dimensionally, as detailed in Eq. (2).

Fig. 3

Transmission functions of the model. (a): Activation function \(Q(V_\mathrm{Bip})\) for shunt conductances \(g_\mathrm{A}\) in bipolar cells. (b): Synaptic transmission \(N(V_\mathrm{Bip})\) from bipolar to ganglion cells

Since the leakage determines the gain of current integration, \(g_\mathrm{A}\) has a divisive effect on the evolution of \(V_\mathrm{Bip}\) in Eq. (11). At the same time, \(g_\mathrm{A}\) defines the ‘time constant’ (not constant, here) of Eq. (11). In our model, \(g_\mathrm{A}\) depends dynamically on the recent values taken by bipolar cells (feedback mechanism), with a typical time scale \(\tau_\mathrm{A}\) and spatial extent \(\sigma_\mathrm{A}\), as described by Eq. (12). These two parameters determine the size of the spatio-temporal neighborhood used by \(g_\mathrm{A}\) to determine a local measure of contrast.

The possible values and biological origin of parameters \(\sigma_\mathrm{A}\) and \(\tau_\mathrm{A}\) are discussed in Section 4.2. They cannot be directly fixed from experimental data, due to the hypothetical nature of the stage Eqs. (11)–(13). In the experiments of Section 3, best reproduction was obtained for a small adaptation time constant (typically, \(\tau_\mathrm{A}\) = 5 ms). As for parameter \(\sigma_\mathrm{A}\), we make no assumption on its possible values: it could be absent (\(\sigma_\mathrm{A}\) = 0), or present with the typical spatial extent of diffusion processes in the retina (e.g. through gap junctions) (see Discussion). Both cases are considered on a perceptual level in Section 3.3.2.

A static activation function \(Q(V_\mathrm{Bip})\) links values of \(V_\mathrm{Bip}\) to the activation of leak conductances \(g_\mathrm{A}\), through Eq. (13). Q is defined as an even function, so that the activation of \(g_\mathrm{A}\) depends only on the absolute value of \(V_\mathrm{Bip}\). Parameter \(g_\mathrm{A}^0\) in Eq. (13) represents the inert leaks in membrane integration Eq. (11) (because filters \(G_{\sigma_\mathrm{A}}\) and \(E_{\tau_\mathrm{A}}\) in Eq. (12) have a gain of 1). It does not depend on the mean level of \(V_\mathrm{Bip}\). By contrast, \(\lambda_\mathrm{A}\) in Eq. (13), also in Hertz, fixes the strength of the gain control feedback loop.

Furthermore, Q is assumed to have a convex shape, implying different behaviors of the system, depending on the contrast:

  • At small contrasts, the system has a quasi-linear working range. Indeed, when the input current \(I_\mathrm{OPL}\) has small variations, it translates into small variations of the bipolar potential, so that \(V_\mathrm{Bip}\) remains in the ‘central region’ of function Q, where \(V_\mathrm{Bip}^2\simeq 0\). As a result, \(g^0_\mathrm{A}\) remains the principal, constant, leaking force in Eq. (11), and integration remains quasi-linear at the level of bipolar cells.

  • At high contrasts, by contrast, as \(|V_\mathrm{Bip}|\) enters the ‘big value’ range of function Q, the leakage term \(g_\mathrm{A}\) in Eq. (11) becomes truly subject to dynamical variations. As a result, bipolar cells start responding sub-linearly to the input current \(I_\mathrm{OPL}\).

Note that the precise choice of function \(Q(V_\mathrm{Bip})\) is arbitrary in our model (a piecewise-linear version was successfully tested, but required an extra parameter, see Section 4.2). The only important constraint for our model to reproduce experimental curves was that function Q be strictly convex, in order to enhance the contrast gain control effect (see Section 3.2.2).
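The two working ranges can be sketched with a forward-Euler integration of Eqs. (11)–(13) at a single point, that is, with \(\sigma_\mathrm{A}\) = 0 so that the spatial filter drops out. This is an illustration under stated assumptions rather than the simulator's implementation: the values of `g0` and `lam` are hypothetical, only `tau_a` = 5 ms comes from the text, and the input is a constant current for simplicity.

```python
def steady_bipolar_response(i_opl, g0=5.0, lam=50.0, tau_a=0.005,
                            dt=1e-4, duration=3.0):
    """Integrate Eqs. (11)-(13) for a constant input and return the final V_Bip.

    The temporal low-pass E_tau_A of Eq. (12) is written as the equivalent
    first-order equation tau_a * dg/dt = Q(V) - g.
    """
    v, g = 0.0, g0
    for _ in range(int(duration / dt)):
        dv = i_opl - g * v                     # Eq. (11)
        dg = (g0 + lam * v * v - g) / tau_a    # Eqs. (12)-(13) with sigma_A = 0
        v += dt * dv
        g += dt * dg
    return v


# Doubling a weak input versus doubling a strong input.
ratio_low = steady_bipolar_response(0.2) / steady_bipolar_response(0.1)
ratio_high = steady_bipolar_response(100.0) / steady_bipolar_response(50.0)
```

With these values, doubling a weak input nearly doubles the bipolar response, while doubling a strong input increases it by a noticeably smaller factor, which is the sub-linear behavior described above.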

Although the gain control mechanism proposed here is close to existing models of gain control in the retina (Victor 1987), it is an original contribution, and it is justified mathematically (see Appendix B and Wohrer 2007 for more details). We refer to Section 4.2 for comparisons between this model and previous ones. In that section we will also discuss the biological relevance of the architecture, with questions such as the symmetric shape of Q, the choice of a simple leak conductance, and values given to \(\sigma_\mathrm{A}\) and \(\tau_\mathrm{A}\).

2.4 Ganglion layer

The last stage of our simulator is the generation of spike trains, modeling how retinal ganglion cells produce spikes from bipolar cells’ activities. First, bipolar signal \(V_\mathrm{Bip}\) is rectified, and possibly receives additional spatio-temporal shaping, to produce an excitatory current on ganglion cells \(I_\mathrm{Gang}\) (Section 2.4.1). From \(I_\mathrm{Gang}(x,y,t)\), an array of noisy leaky-integrate-and-fire neurons (nLIF), modeling ganglion cells, produces the sets of output spikes. The nLIF procedure is described in Section 2.4.2, while Section 2.4.3 shows how to define a whole retina as an assembly of nLIF. Two possible retinal organizations are discussed, log-polar or homogeneous.

2.4.1 Synaptic current upon ganglion cells

In real retinas, additional and complex transformations of the signal are provided by the synaptic structures in the IPL, the second layer of synapses in the retina which is the locus of synaptic interactions between bipolar cells, amacrine cells, and ganglion cells. Our simulator uses a single, empirical formula to model signal shaping in the transition from bipolar cells to classical center-surround ganglion cells (cat X and Y cells, primate parvo- and magnocellular cells):

$$I_\mathrm{Gang}(x,y,t) = G_{\sigma_\mathrm{G}} \stackrel{x,y}{*} N\big( \varepsilon\ T_{w_\mathrm{G},\tau_\mathrm{G}} \stackrel{t}{*}V_\mathrm{Bip}(x,y,t)\big),$$
(14)

with

$$N(V)= \begin{cases} \frac { \displaystyle i_\mathrm{G}^0 } { \displaystyle 1 - \lambda_\mathrm{G} (V-v_\mathrm{G}^0) /i_\mathrm{G}^0 } & \mathrm{if} \, V<v_\mathrm{G}^0,\\[0.3 cm] \displaystyle i_\mathrm{G}^0 + \lambda_\mathrm{G} (V-v_\mathrm{G}^0) & \mathrm{if} \, V>v_\mathrm{G}^0. \end{cases}$$
(15)

Equations (14)–(15) do not aim to explain the biological complexity of the IPL, but to allow functional reproduction of some ganglion cells’ specific responses. Parameters in these formulas will vary according to the subtype of ganglion cell being modeled. Equations (14)–(15) comprise four modeling elements:

Polarity

‘ON’ and ‘OFF’ ganglion cells are simulated simply by setting parameter ε in Eq. (14) to 1 or −1, respectively.

Biologically, ‘ON’ and ‘OFF’ cellular pathways diverge earlier in retinal processing, involving different types of bipolar cells. From the level of bipolar cells onward, there are physiological and anatomical disparities between ‘OFF’ and ‘ON’ cells, such as reaction time, sensitivity to contrast gain control, or density of cells. These discrepancies are not explicitly taken into account by our model, which considers a single, symmetrical signal up to bipolar cells. However, model parameters allow either population to be reproduced when required.

Rectification

Equation (14) rectifies signal \(V_\mathrm{Bip}\) through the static nonlinear function N, defined in Eq. (15) and represented in Fig. 3(b). Parameters \(\lambda_\mathrm{G}\) and \(i_\mathrm{G}^0\) have, again, the dimension of ‘reduced currents’ expressed in Hertz. \(v_\mathrm{G}^0\) is the ‘linearity threshold’ of the cell, i.e. the value after which transmission becomes linear. Note that \(N(v_\mathrm{G}^0)=i_\mathrm{G}^0\).

Such rectification is a very common feature in neural modeling and in retinal models (Carandini et al. 2005; Chichilnisky 2001; Gazeres et al. 1998). It reflects static nonlinearities observed experimentally in the retina (Chichilnisky 2001; Kim and Rieke 2001; Baccus and Meister 2002), e.g., through LN analysis of retinal cells (see Section 3.2.3 for the definition of LN analysis). Biologically, static nonlinearities in signal transmission can occur for different reasons: saturations, synaptic transmissions, etc. Eq. (15) defines a smooth rectification with a shape close to experimental curves observed when performing an LN Analysis (see Section 3.2.3).
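The shape of Eq. (15) can be sketched as a small Python function. The function name and the default parameter values are hypothetical and serve only to illustrate the formula; they are not the values used in the simulations.

```python
def synaptic_transmission(v, i0=80.0, lam=150.0, v0=0.0):
    """Smooth rectification N(V) of Eq. (15).

    Below the linearity threshold v0 the output decays hyperbolically toward
    zero while staying positive; above v0 it is linear with slope lam.
    """
    if v < v0:
        return i0 / (1.0 - lam * (v - v0) / i0)
    return i0 + lam * (v - v0)


# Value at the threshold, and one-sided slopes estimated by finite differences.
h = 1e-6
value_at_threshold = synaptic_transmission(0.0)
left_slope = (synaptic_transmission(0.0) - synaptic_transmission(-h)) / h
right_slope = (synaptic_transmission(h) - synaptic_transmission(0.0)) / h
```

Both branches meet at \(i_\mathrm{G}^0\) with the same slope \(\lambda_\mathrm{G}\), so the rectification has no kink at the threshold, and the output remains positive for strongly negative inputs.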

Additional transient

The temporal filter \(T_{w_\mathrm{G},\tau_\mathrm{G}}\) controls how phasic or tonic the simulated ganglion cells are. The tonic-phasic opposition is a general concept in physiology that can be described in simple words: “A tonic process is one that continues for some time or indefinitely after being initiated, while a phasic process is one that shuts down quickly” (Erwin 2004). For primate magnocellular cells or cat Y cells, the response to a constant stimulation shuts down quickly after one or two hundred milliseconds, requiring the use of a transient weight \(w_\mathrm{G}\) close to 1. By contrast, for our simulations of cat X cells, the supplementary transient was fixed at an intermediate balance (\(w_\mathrm{G}\) = 0.7) in order to reproduce correct Wiener kernels in Section 3.2.2, and correct responses to gratings in Section 3.2.1.

There are several plausible biological explanations for the transient properties intrinsic to ganglion cells. One likely main reason is the existence of specific amacrine cells in the IPL that were found to cut the responses of ganglion cells (Nirenberg and Meister 1997; Masland 2001; Kolb et al. 2001), either through a feedback to bipolar cells or by direct inhibition of ganglion cells.

Locating this transient before the static nonlinearity N is a convenient, empirical choice. First, it provides better reproduction of Y cells’ spatial nonlinearity (see next paragraph and Section 3.2.1) by creating fully band-pass units before the rectification and pooling. Second, it allows undershoots possibly generated by the transient filter \(T_{w_\mathrm{G},\tau_\mathrm{G}}\) to be attenuated by the compression N.

Additional pooling

The spatial filter \(G_{\sigma_\mathrm{G}}\) aims at reproducing a typical nonlinear effect observed in cat Y cells (Enroth-Cugell and Robson 1966) (an illustration can be seen in our simulations of Section 3.2.1). The simplest explanation of this spatial nonlinearity is a spatial pooling that would occur after the synaptic rectification onto these ganglion cells. This explanation, first proposed by Hochstein and Shapley (1976), is the basis of the Freeman and Enroth-Cugell model for Y-type cells (Enroth-Cugell and Freeman 1987). Furthermore, the biological basis was justified experimentally (Demb et al. 2001; Kolb et al. 2001): The spread of the dendritic tree of Y-type cells is large enough to significantly average the synaptic input from bipolar cells over a substantial spatial extent.

Remark:

Other explanations have been proposed to account for the nonlinearity of Y cells. For example, Hennig et al. (2002) find that part of the nonlinearity might be due to the spatial integration by Y cells of a temporal nonlinearity due to phototransduction. Although they propose a different model, their work also needs a wide spatial pooling at the level of Y cells.

2.4.2 Spike generation in ganglion cells

This section is about the transformation of the continuous signal I Gang(x,y,t) into discrete sets of spike trains. Let us consider N ganglion cells C n (n = 1...N) paving the retinal space (see Section 2.4.3 for their repartition and parameters) and let us denote by V n the potential of cell number n (n = 1...N) centered at position (x n ,y n ).

Our simulator simply generates the output of cell C n with a standard noisy leaky integrate-and-fire (nLIF) model:

$$\left\{\begin{array}{l}\dfrac{dV_n}{dt} = I_\mathrm{Gang}(x_n,y_n,t) - g^L V_n(t) + \eta_v (t), \\ \textrm{Spike when threshold is reached: } V_n(t_\mathrm{spk}) = 1, \\ \textrm{Refractory period: } V_n(t) = 0 \textrm{ while } t<t_\mathrm{spk} + \eta_\mathrm{refr}, \\ \textrm{and } (16) \textrm{ again}, \end{array}\right.$$
(16)

where η v (t) and η refr are two noise sources that can be added to the spike generation process in order to reproduce the trial-to-trial variability of real ganglion cells, following the experimental results of Keat et al. (2001). η v (t) is taken as a Brownian motion that has the dimension of a current. Integration of this current through Eq. (16) is equivalent to adding to V n (t) a Gaussian auto-correlated process with time constant 1/g L (typically, 20 ms), and variance σ v . The amplitude of η v (t) is chosen so that σ v is around 0.1. η refr is a stochastic absolute refractory period that is randomly drawn after each spike from a normal distribution, typically \(\mathcal{N}\) (3 ms,1 ms).
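As an illustration of Eq. (16), the following is a minimal Python sketch of one nLIF cell integrated with an Euler-Maruyama scheme. It is not the simulator's actual code. The function name `simulate_nlif`, the time step, and the reset to 0 after a spike are assumptions. The sketch also treats σ v as the standard deviation of the fluctuation added to V n, so that the white-noise amplitude is σ v √(2 g L); this reading is an assumption as well.

```python
import numpy as np

def simulate_nlif(i_gang, dt=1e-4, g_leak=50.0, sigma_v=0.1,
                  refr_mean=3e-3, refr_std=1e-3, seed=0):
    """Integrate Eq. (16) for one cell; return spike times in seconds.

    i_gang : 1-D array of input current samples (Hz), one per time step.
    """
    rng = np.random.default_rng(seed)
    # Assumption: noise scaled so that the stationary fluctuation of V
    # (an Ornstein-Uhlenbeck process) has standard deviation sigma_v.
    noise_amp = sigma_v * np.sqrt(2.0 * g_leak)
    noise = noise_amp * np.sqrt(dt) * rng.standard_normal(len(i_gang))

    v = 0.0
    refr_until = -np.inf
    spikes = []
    for k, current in enumerate(i_gang):
        t = k * dt
        if t < refr_until:          # absolute refractory period: V stays at 0
            v = 0.0
            continue
        v += dt * (current - g_leak * v) + noise[k]
        if v >= 1.0:                # threshold reached: emit a spike
            spikes.append(t)
            v = 0.0
            # stochastic refractory period, drawn anew after each spike
            refr_until = t + max(0.0, rng.normal(refr_mean, refr_std))
    return np.array(spikes)

# Usage: 2 s of constant suprathreshold input (80 Hz, with g_leak = 50 Hz)
spike_times = simulate_nlif(np.full(20000, 80.0))
```

With a constant input below the leak (for example 40 Hz with g L = 50 Hz) and no noise, the potential saturates below threshold and the sketch emits no spike.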

Note that spike generation is the only source of noise in our model, following the model of Keat et al. (2001) for trial-to-trial variability in ganglion spike trains. Recent findings (Dhingra and Smith 2004) have confirmed that the spiking mechanism makes an important contribution to the overall noise in retinal processing.

Remark:

ISI distributions – Poisson or not Poisson? Measuring inter-spike intervals (ISIs) is a common way of estimating the nature of the spike generation process. In real experiments, it is observed that retinal spike trains have less variability than a simple Poisson emission process (see e.g. Kara et al. 2000 for references), at least during the periods of high firing activity (Hartveit and Heggelund 1994). The coefficient of variation of retinal spike trains (CV, defined as the ratio between standard deviation and mean of the ISI histogram) is generally found to be smaller than one, meaning less variability than a Poisson process. As a result, the spiking procedure has often been modeled through Gamma renewal processes, which can be seen as a generalization of Poisson processes, but with a controllable CV (see e.g. Gazeres 1998 for references).

However, in Poisson as in Gamma processes, the CV is constant whatever the input current. By contrast, some studies (Hartveit and Heggelund 1994) find that the CV of retinal spike trains depends on the intensity of the spike emission: retinal spikes become more predictable (CV decreases) during periods of high spiking activity. To account for this observation, Gazeres et al. (1998) propose a model which switches dynamically between a Poisson and a Gamma procedure with smaller CV, according to the values of the generating signal. To our knowledge, however, this model has not been precisely fitted to biological data.
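The statement that the CV of a Poisson or Gamma renewal process does not depend on the firing rate can be checked numerically. The sketch below is only illustrative: the helper names `isi_cv` and `gamma_renewal_isis`, the shape value 4 and the two rates are arbitrary choices. ISIs are drawn from a Gamma law whose mean is the inverse of the rate; shape 1 is the Poisson case, and a Gamma law of shape k has CV = 1/√k.

```python
import numpy as np

def isi_cv(isis):
    """Coefficient of variation: std / mean of the inter-spike intervals."""
    isis = np.asarray(isis, dtype=float)
    return isis.std() / isis.mean()

def gamma_renewal_isis(rate_hz, shape, n, rng):
    """Draw n ISIs from a Gamma law with mean 1 / rate_hz (shape 1 = Poisson)."""
    return rng.gamma(shape, 1.0 / (shape * rate_hz), size=n)

rng = np.random.default_rng(0)
for rate in (5.0, 50.0):
    cv_poisson = isi_cv(gamma_renewal_isis(rate, 1.0, 20000, rng))
    cv_gamma = isi_cv(gamma_renewal_isis(rate, 4.0, 20000, rng))
    # cv_poisson stays close to 1 and cv_gamma close to 0.5 at both rates
    print(f"rate={rate} Hz  CV(Poisson)={cv_poisson:.2f}  CV(Gamma)={cv_gamma:.2f}")
```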

By comparison, the nLIF model that we use here has been shown experimentally to predict the occurrences of retinal spikes in different species (Keat et al. 2001), yielding typical noise parameters that we could use in our model. An nLIF model has also been used to reproduce spike variability in the LGN (Lesica and Stanley 2004). The simulated ISIs in our nLIF model resemble those of a Gamma process, with a CV that depends on the amount of noise added in the spike generation through noise sources η v and η refr (personal experimentation).

However, we have not yet fully characterized how the CV varies with input current for this nLIF model. In preliminary experiments, the CVs display some variability with input current (unlike a Gamma process), but less than in real cells. Indeed, some experimental reproductions by our simulator suggested that our emitted spikes are probably too deterministic at low contrast. The spiking nLIF model may be enhanced in the future by making the intensity of η v vary dynamically with the instantaneous intensity of the generating current.

Remark:

Spike correlations between neighboring cells. Other experiments revealed a stimulus-dependent synchrony between the spikes of neighboring cells (Neuenschwander and Singer 1996) (also, see Kenyon et al. 2004; Kenyon and Marshak 1998 for a model of this synchrony based on feedbacks from long- and short-range amacrine cells). Such input-driven synchrony is not taken into account yet by the simulator, but we consider it as an interesting future extension.

2.4.3 Ganglion cell sampling configurations

The whole model presented above holds when modeling a small region of the retina, in which the density of retinal cells can be considered uniform. In that case, all filtering scales and parameters are constant and do not depend on the spatial position of each cell. Our simulation software can easily handle such a uniform distribution of cells.

However, mammalian (and especially primate) retinas taken as a whole are not uniform at all. Density of cells and filtering scales depend on the position considered in the retina. One needs to distinguish the fovea in the center from the periphery of the visual field, where precision is lower. A simple way to do so is to define a scaling function that describes at the same time the local density of cells and the spatial scales of filtering in the different regions of the retina.

Our simulator implements a radial and isotropic density function that depends on the distance r from the center of the retina. We define a one-dimensional log-polar scaling function s(r) as

$$s(r)=\begin{cases} 1 & \mathrm{if} \; r<R_0,\\ 1 / (1 + K (r-R_0) ) & \mathrm{if} \; r \geq R_0, \end{cases}$$
(17)

where R 0 is the size of the fovea and K is the speed of density decrease outside of the fovea. When K = 1/R 0, this amounts to a traditional log-polar scaling. The density of cells in a given region of the retina at eccentricity r is then given by \(d_0\,s(r)^2\), d 0 being the 2d-density of cells in the fovea. Conversely, all spatial filtering scales of the model presented before (σ C , σ S , σ Am, σ Gang) scale with \(s(r)^{-1}\).
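The scaling rule of Eq. (17) can be written down directly. The following Python sketch is illustrative only; the function names `scaling`, `local_density` and `local_sigma` are hypothetical and do not come from the simulator. The example values R 0 = 10 deg, K = 0.2 deg\(^{-1}\) and σ C = 0.3 deg are those quoted for the large-scale simulation of Fig. 9.

```python
import numpy as np

def scaling(r, r0, k):
    """Scaling function s(r) of Eq. (17); r and r0 in degrees, k in 1/deg."""
    r = np.asarray(r, dtype=float)
    return np.where(r < r0, 1.0, 1.0 / (1.0 + k * (r - r0)))

def local_density(r, d0, r0, k):
    """Cell density at eccentricity r: d0 * s(r)^2."""
    return d0 * scaling(r, r0, k) ** 2

def local_sigma(r, sigma_fovea, r0, k):
    """Spatial filtering scale at eccentricity r: sigma_fovea / s(r)."""
    return sigma_fovea / scaling(r, r0, k)

# Usage: at r = 25 deg, s = 1 / (1 + 0.2 * 15) = 0.25, so the density is
# 16 times lower than in the fovea and the center scale is 4 times larger.
s_edge = float(scaling(25.0, 10.0, 0.2))
sigma_c_edge = float(local_sigma(25.0, 0.3, 10.0, 0.2))
```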

The choice of such a scaling function is biologically justified: Dendritic trees for primate ganglion cells have experimentally been found to scale with a positive power of r, between \(r^{0.7}\) and \(r\) according to the type of cell (Dacey and Petersen 1992).

3 Results

3.1 Virtual Retina customization

The software Virtual Retina implements the model presented in this article, with the following characteristics:

Possibility of large-scale simulations

Up to 100,000 spiking cells can be simulated in a reasonable time (speed of around 1/100 real time).

XML definition file

All parameters for the different stages of the model are defined in a single customizable XML file.

Two possible density functions

The first option is a uniform, square array of cells, associated with a uniform density function. In that case, all spatial filtering scales of the model (σ C , σ S , σ Am, σ Gang) are constant throughout the whole image, and the corresponding Gaussian filters G σ (x,y) are implemented with traditional recursive Deriche filters. The second option is a sampling of ganglion cells along concentric circles, associated with the radial scaling s(r) in Eq. (17). In that case, the Gaussian filters G σ (x,y) have different scales according to the location in the retina. They are implemented through recursive filtering with inhomogeneous recursive coefficients, an approximation proposed in Tan et al. (2003) that leads to a significant gain in computational speed.

Fixation microsaccades

Finally, the software can include a simple random microsaccade generator at the input of the retina, to account for fixational eye movements, inspired by Martinez-Conde et al. (2004).

To better match the specific level of complexity required by potential users, we wished to build modular software that allows some freedom in the choice of the underlying model. Virtual Retina needs an XML file that defines all the parameters of the retina chosen for the simulation. This XML file is customizable: Each feature, as defined in this article, corresponds to its own XML node, which can be present or not in the XML definition file. One example file is shown in Appendix C.

As a result, the output of the software can consist of spikes or continuous maps, the contrast gain control can be present or not, the retina can follow a log-polar scheme or a uniform scheme, etc. This flexibility required considerable effort in the design of the software (Wohrer 2008a).

In the on-line web service, a dedicated page assists users in customizing their own XML file.

3.2 Physiological reproductions

In this section we test our simulator on classical physiological experiments performed on single ganglion cells. These experiments, with various protocols, demonstrate that our underlying model induces linear kernels close to those measured physiologically, and can also account for two typical nonlinear effects: Contrast gain control and spatial nonlinearity of Y-type ganglion cells.

A first experiment (Section 3.2.1) is devoted to the reproduction of the physiological difference between X- and Y-type cells. The two following experiments (Sections 3.2.2 and 3.2.3) are devoted to the phenomenon of contrast gain control in the retina. We show that our gain control loop Eqs. (11)–(13), along with the rest of the model, qualitatively reproduces the dynamical changes in retinal filtering linked to the average level of contrast.

3.2.1 X and Y cell responses to grating apparitions

Description of the experiment. Grating apparitions are a classical stimulus when experimenting on the low-level visual pathway. We reproduce here one of the first recordings of that kind, made on cat ganglion cells by Enroth-Cugell and Robson (1966), which led to the distinction between cat X-type and Y-type cells.

Cat ‘OFF’ ganglion cells are presented with the alternation of a static grating and a uniform screen of the same mean luminance, at a frequency of about 0.5 Hz, and their spiking output is measured extracellularly. The experiment is repeated for different spatial phases of the grating, so as to test the summing properties of the cells’ receptive fields. Typical responses (averaged instantaneous frequencies) for an X cell and a Y cell are presented in Fig. 4.

Fig. 4

Response of cat OFF-center ganglion cells to the disappearance and reappearance of a sinusoidal grating with different spatial offsets, reproduced from Enroth-Cugell and Robson (1966). (a): typical X-type ganglion cell. (b): typical Y-type ganglion cell (see details in the text). Grating spatial frequencies of 0.13 deg\(^{-1}\) (a) and 0.16 deg\(^{-1}\) (b). Mean luminance 16 cd/m\(^2\), grating contrast of 0.32

The experiment reveals that X cells have a relatively tonic behavior, since their response to a static stimulus lasts for a long time, whereas Y cells are totally phasic, responding only for a few hundred milliseconds after stimulus onset and then falling silent again.

The experiment is also an illustration of the ‘null position’ test: For X cells, a spatial phase exists for which the cell has roughly no response to the grating (here, 90 and 270 deg), when the ‘positive’ and ‘negative’ parts of the grating exactly compensate one another thanks to linear summation. For Y cells such a position does not exist, revealing a spatial nonlinearity, as mentioned already in Section 2.4.1.

The X cell curves in Fig. 4(a) also reveal a slow, nonlinear adaptation of the cell to its own level of response. Consider the responses of the cell at the end of the ‘uniform screen’ period. A linear approximation would predict similar levels of response in the four experimental conditions, since the uniform screen has already been on for a whole second, which exceeds the latency of the cell’s linear response.

However, real cell responses at the end of the ‘uniform screen’ period are bigger in the experiment with the ‘0 deg’ grating, and lower with the ‘180 deg’ grating. This can be explained by a slow adaptation of the cell’s gain to its global level of response, which is stronger in the ‘180 deg’ experiment because the cell responds strongly when the grating is on.

Simulation with our model. Our modeled X and Y cells are tested in Fig. 5. Parameters σ C and σ S (see caption to Fig. 5) are chosen to fit the receptive field measurements reported by the authors for the original X cell in Fig. 4. Other parameters are chosen to produce a good, simultaneous fit to these experiments and the following ones (multi-sinus, LN analysis).

Fig. 5

Reproduction of the experiments in Fig. 4 by our retina model. (a): X cell model, (b): Y cell model. Average firing rates generated from 80 trials with noise in the spike generation. Test grating: Normalized mean luminance of 0.5, contrast 0.32, spatial frequency of 0.13 deg\(^{-1}\). OPL parameters: Center, σ C = 0.88 deg, τ C = 10 ms, n C = 2. Surround, σ S = 2.35 deg, τ S = 10 ms. Slow linear transient, w U = 0.8, τ U = 100 ms. Global amplification, λ OPL = 1000 Hz per normalized luminance unit. Gain control parameters: \(g_\mathrm{A}^0=5\) Hz, λ A = 50 Hz, τ A = 5 ms, σ A = 2.5 deg. X cell parameters: IPL transients, w G = 0.7, τ G = 20 ms. Synaptic transmission, σ G = 0, λ G = 150 Hz, \(v_\mathrm{G}^0=0\), \(i_\mathrm{G}^0=80\) Hz. Y cell parameters: IPL transients, w G = 1, τ G = 50 ms. Synaptic transmission, σ G = 1.8 deg, λ G = 300 Hz, \(v_\mathrm{G}^0=0\), \(i_\mathrm{G}^0=80\) Hz. Spike generation: g L = 50 Hz, σ v = 0.2, \(\eta_\mathrm{refr} \sim \mathcal{N}\)(3 ms, 1 ms)

The model reproduces the ‘null position’ typical of X cells, and the absence of such ‘null position’ for Y cells, due to the post-synaptic pooling added in Eq. (14). Reproduction of Y cell curves required that the three modeling elements in Eq. (14) be in the right order: temporal transient \(T_{w_\mathrm{G},\tau_\mathrm{G}}\), rectification N and pooling \(G_{\sigma_\mathrm{G}}\).

The slow decay of X cell responses to static stimuli is also reproduced, with the correct time scale. This is due to the added effects of the two transient filters in our retinal scheme: slow transients with \(T_{w_U,\tau_U}\) in Eq. (8), and fast transients with \(T_{w_\mathrm{G},\tau_\mathrm{G}}\) in Eq. (14).

Note however that in our model, in any of the four experimental conditions, the X cell returns to the same firing rate at the end of the ‘uniform screen’ period: its ground firing rate. Parameters \(i_\mathrm{G}^0\) in Eq. (15) and g L in Eq. (16) were fixed to obtain a ground firing rate of around 50 Hz. Our model does not encompass the slow adaptation of the cell to its own level of response, observed in Fig. 4(a) and explained in the previous paragraph. Slow adaptation is discussed in Section 4.2.
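The order of magnitude of this ground firing rate can be checked from Eq. (16) alone. For a constant input current I > g L and no noise, the potential follows V(t) = (I/g L)(1 − exp(−g L t)) and reaches the threshold 1 after a time ln(I/(I − g L))/g L, to which the refractory period is added. The Python sketch below evaluates this closed form. It assumes that the resting input current equals \(i_\mathrm{G}^0\) = 80 Hz, and it ignores the noise sources, so it is only a rough estimate; the function name `lif_rate` is hypothetical.

```python
import math

def lif_rate(current, g_leak, refr=0.0):
    """Firing rate (Hz) of the noiseless version of Eq. (16) for constant input.

    current and g_leak in Hz, refr (absolute refractory period) in seconds.
    """
    if current <= g_leak:
        return 0.0                      # V saturates at current / g_leak <= 1
    t_thresh = math.log(current / (current - g_leak)) / g_leak
    return 1.0 / (t_thresh + refr)

# Usage with the caption values of Fig. 5: I = 80 Hz, g_leak = 50 Hz,
# mean refractory period 3 ms. The result is about 44 Hz.
ground_rate = lif_rate(80.0, 50.0, refr=3e-3)
```

The noiseless estimate of about 44 Hz is of the same order as the ground firing rate of around 50 Hz quoted above.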

3.2.2 Multi-sinus experiments

To test and calibrate our contrast gain control loop Eqs. (11)–(13), we reproduced two of the Shapley and Victor (1978) multi-sinus experiments, which gave the first quantitative measures of contrast gain control in the retina. These experiments were performed on an ON-center cat X cell. Input stimulus L(x,y,t) was a static grating of fixed mean luminance \(\bar{L}=20\) cd/m\(^2\), temporally modulated by a sum of sinusoids with adjustable contrasts:

$$L(x,y,t)= \bar{L} \left( 1 + \; \mathrm{Gr}(x,y) \sum\limits_{i=1}^{8} c_i \sin(\xi_i t) \right),$$
(18)

where Gr(x,y) is a sinusoidal grating function with normalized amplitude (between −1 and 1). The ξ i are a set of eight temporal frequencies that logarithmically span the frequency range from about 0.2 to 32 Hz, respectively associated to contrast strengths c i .
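A stimulus of the form of Eq. (18) can be generated as in the following Python sketch. The exact frequencies ξ i are not listed here, so the sketch assumes an exact octave spacing anchored on the assay frequencies quoted later in this section (3.9, 7.8 and 15.6 Hz), which gives eight values from about 0.24 to 31.25 Hz. It also interprets ξ i as a frequency in Hz, hence the factor 2π. The names `XI` and `multi_sinus` are hypothetical.

```python
import numpy as np

# Assumed octave-spaced temporal frequencies (Hz): 0.244, 0.488, ..., 15.625, 31.25
XI = 15.625 / 2.0 ** np.arange(6, -2, -1)

def multi_sinus(t, grating, contrasts, mean_lum=20.0, freqs=XI):
    """Eq. (18): L(x,y,t) = mean_lum * (1 + Gr(x,y) * sum_i c_i sin(2 pi xi_i t)).

    t         : 1-D array of times (s)
    grating   : 2-D array Gr(x,y) with values in [-1, 1]
    contrasts : one contrast c_i per frequency
    Returns an array of shape (len(t), rows, cols) in cd/m^2.
    """
    t = np.asarray(t, dtype=float)
    contrasts = np.asarray(contrasts, dtype=float)
    modulation = np.sin(2 * np.pi * np.outer(t, freqs)) @ contrasts
    return mean_lum * (1.0 + modulation[:, None, None] * grating[None, :, :])

# Usage: all contrasts equal (c_i = c = 0.05), 0.2 cycles/deg grating over 5 deg
x = np.linspace(0.0, 5.0, 32)
grating = np.tile(np.sin(2 * np.pi * 0.2 * x), (8, 1))
movie = multi_sinus(np.arange(0.0, 1.0, 0.005), grating, np.full(8, 0.05))
```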

Recordings were made for different distributions of the c i . For each recording, the cell’s output firing rate was Fourier-analyzed at each of the input frequencies ξ i , thus yielding a set of eight amplitudes and eight phases. This set provided a measure for the linear kernel (first-order Wiener kernel) that best fits the cell’s response, in the given contrast conditions.
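The Fourier analysis at the input frequencies can be sketched as a projection of the firing rate onto a sine and a cosine at each ξ i. This is a generic illustration, not the analysis code of the original experiments; the function name `response_at_frequencies` is hypothetical. The projection is exact only if the record spans a whole number of periods of every frequency, which is assumed here.

```python
import numpy as np

def response_at_frequencies(rate, dt, freqs):
    """Amplitude and phase of `rate` at each frequency in `freqs` (Hz).

    The phase is measured relative to sin(2 pi f t), i.e. a component
    A sin(2 pi f t + phi) yields amplitude A and phase phi.
    """
    rate = np.asarray(rate, dtype=float)
    t = np.arange(len(rate)) * dt
    amps, phases = [], []
    for f in freqs:
        a = 2.0 * np.mean(rate * np.sin(2 * np.pi * f * t))  # in-phase part
        b = 2.0 * np.mean(rate * np.cos(2 * np.pi * f * t))  # quadrature part
        amps.append(np.hypot(a, b))
        phases.append(np.arctan2(b, a))
    return np.array(amps), np.array(phases)

# Usage: eight octave-spaced frequencies whose periods all divide 4.096 s
freqs = 2.0 ** np.arange(8) / 4.096
dt = 1e-3
t = np.arange(4096) * dt
rate = 50.0 + 3.0 * np.sin(2 * np.pi * freqs[4] * t - 0.5)
amps, phases = response_at_frequencies(rate, dt, freqs)
```

Dividing the eight amplitudes by the input contrasts c i and plotting them against ξ i gives amplitude curves of the kind discussed below.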

Influence of the mean level of contrast

Description of the experiment.

This first experiment measures how the mean level of contrast changes the best-fitting first-order Wiener kernel for the cell. The c i are all fixed at the same value c i  = c, global level of contrast for the stimulus. The experiment is repeated for four values of contrast, c being doubled each time.

Remark

When ∀ i, c i  = c, the temporal part of signal Eq. (18) is related to a pink noise stimulus (with similar power in each frequency octave). However, the spectrum of Eq. (18) is concentrated on eight discrete values, unlike a real pink noise.

The resulting amplitude and phase diagrams for the cell’s output, represented in Fig. 6(a) and (b), reveal deviations from linearity: If the ganglion cell responded linearly to its input, the modulations in its response would simply be proportional to c. Successive amplitude curves in Fig. 6(a) would be parallel, spaced by log(2) as contrast is doubled, and all phase curves in Fig. 6(b) would superimpose, since the phase portrait depends only on the nature of the linear filter.

Fig. 6

Contrast gain control in a cat ON-center X ganglion cell, reproduced from Shapley and Victor (1978). (a) and (b): response to multi-sinus stimuli of different contrasts c (sample input signal depicted in panel c-1). Amplitude curves (a) reveal under-linearity at low temporal frequencies. Phase curves (b) reveal time advance for high contrasts (see text). Successively, c was 0.0125 (○), 0.025 (□), 0.05 (△) and 0.1 (●). (d): Strength of the gain control effect depends on the dominant frequency \(\xi_{i_0}\) present in the input (stimuli with a carrier frequency \(\xi_{i_0}\), as depicted in panel c-2). The three curves represent indicators \(\phi_5(\xi_{i_0})\) (●), \(\phi_6(\xi_{i_0})\) (■) and \(\phi_7(\xi_{i_0})\) (▲) which measure the strength of the gain control (see text). Frequencies that elicit the most gain control are \(\xi_{i_0}\) = 3–10 Hz

Instead, the cell responds under-linearly to contrast at low temporal frequencies, where successive amplitude curves are spaced by less than log(2). In the phase portrait, strong contrasts induce a phase-advance of the response (phase curve shifted upwards), meaning that the cell responds faster at high contrasts. Amplitude compression at low frequencies and phase advance are the dual mark of the contrast gain control effect, as defined by Shapley and Victor. The authors found the two phenomena to be highly correlated in their experiments, probably resulting from a common mechanism.

Simulation with our model

Reproduction by our model is shown in Fig. 7(a) and (b). The model reproduces the typical time advance of ganglion responses at high contrasts (Fig. 7(b)). This is because conductance g A in Eq. (11) determines the time constant of the response of bipolar cells, and because the mean level of g A, which depends on the average of \(V_\mathrm{Bip}^2\), is an increasing function of contrast.

Fig. 7

Reproduction of the gain control effect by a model cat X cell. (a), (b) and (d) have the same meaning as in Fig. 6 (including the various curve markers). (c) represents the amplitude responses for the model cell in the ‘carrier frequency’ experiments (stimuli as in Fig. 6(c-2)). Contrast gain control is observed since the ‘perturbation’ kernels do not superimpose (see text). Test grating: mean luminance of 0.5, 0.2 cycles/deg. X Cell parameters as in Fig. 5

Similarly, the under-linearity of response amplitudes with contrast is also observed (successive curves spaced by less than log(2) in Fig. 7(a)). This is because g A in Eq. (11) increases the leak in the bipolar membrane, and thus lowers the linear gain of bipolar transmission in the case of high input contrast.

According to the preceding intuitive explanation, any shunting feedback loop necessarily implies a phase advance and an amplitude compression, and thus contrast gain control. However, reproducing the exact shape of the kernel, and how it varies with contrast, required more specific features from our model.

First, the feedback loop Eqs. (11)–(13) is globally a low-pass setting, which cannot by itself reproduce the band-pass behavior observed in Fig. 6(a). The band-pass behavior in our model arises before the feedback loop, through filter \(T_{w_U,\tau_U}\) in Eq. (8) that accounts for temporal transients in the first layers of the retina, and after the feedback loop, through filter \(T_{w_\mathrm{G},\tau_\mathrm{G}}\) in Eq. (14) that accounts for temporal transients in the IPL.

Second, we found it necessary that function Q in Eq. (13) be strictly convex with a flat zone around V Bip = 0, in order to reproduce the pronounced change in shape of the Wiener kernel between high and low contrast. If simulations are done with Q(V Bip) = λ|V Bip|, we obtain parallel amplitude curves spaced by less than log(2): Contrast gain control is thus present, but we are unable to reproduce the specific transformation of Wiener kernel shapes with contrast. The biological relevance of a strictly convex shape is discussed in Section 4.1.3.
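The intuitive argument that a shunting feedback loop yields both amplitude compression and phase advance can be illustrated on a single point of the retina. The Python sketch below is a simplified stand-in for Eqs. (11)–(13), not a transcription of them. It assumes dV Bip/dt = I OPL − g A V Bip, a quadratic activation Q(V) = g A0 + λ A V², a first-order low-pass of time constant τ A for g A, and no spatial pooling. Parameter values are those of the caption of Fig. 5; the function names, the 2 Hz test frequency and the two input amplitudes are arbitrary choices.

```python
import numpy as np

def shunting_loop(i_opl, dt, g0=5.0, lam=50.0, tau_a=5e-3):
    """Euler integration of the assumed loop; returns V_Bip at each step."""
    v, g = 0.0, g0
    out = np.empty(len(i_opl))
    for k, current in enumerate(i_opl):
        v += dt * (current - g * v)                  # leaky integration, leak g_A
        g += dt * (g0 + lam * v * v - g) / tau_a     # g_A follows Q(V) with lag tau_A
        out[k] = v
    return out

def gain_and_phase(freq, amplitude, dt=1e-4, n_periods=6):
    """Gain and phase (rad) of the fundamental response to a sine input."""
    n_per = int(round(1.0 / (freq * dt)))            # samples per period
    t = np.arange(n_periods * n_per) * dt
    v = shunting_loop(amplitude * np.sin(2 * np.pi * freq * t), dt)
    t, v = t[-2 * n_per:], v[-2 * n_per:]            # discard the initial transient
    a = 2.0 * np.mean(v * np.sin(2 * np.pi * freq * t))
    b = 2.0 * np.mean(v * np.cos(2 * np.pi * freq * t))
    return np.hypot(a, b) / amplitude, np.arctan2(b, a)

gain_low, phase_low = gain_and_phase(2.0, 1.0)       # weak input: nearly linear
gain_high, phase_high = gain_and_phase(2.0, 100.0)   # strong input: shunted
print(gain_low, phase_low, gain_high, phase_high)
```

Under these assumptions the strong input produces a smaller gain and a smaller phase lag than the weak input. The weak-input gain stays close to the linear value 1/√(g A0² + (2πf)²).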

Frequencies that induce contrast gain control

Description of the experiment.

A second experiment was crafted by Shapley and Victor (1978) to further investigate the origin of the gain control mechanism. Each input frequency \(\xi_{i_0}\) is successively chosen as a ‘carrier’ frequency with \(c_{i_0}=0.2\), while the other frequencies are added as perturbation terms: c i  = 0.0125 for i ≠ i 0 [see Eq. (18)]. Results are compared to a ‘low contrast’ test condition where c i  = 0.0125 for all i.

For each carrier frequency \(\xi_{i_0}\) three phase advance indicators \(\phi_5(\xi_{i_0})\), \(\phi_6(\xi_{i_0})\) and \(\phi_7(\xi_{i_0})\) are measured, respectively associated to assay frequencies ξ 5 = 3.9, ξ 6 = 7.8 and ξ 7 = 15.6 Hz. \(\phi_5(\xi_{i_0})\) is obtained by measuring output phase at the assay frequency ξ 5 when \(\xi_{i_0}\) is the carrier frequency, and subtracting the output phase at ξ 5 in the low-contrast test condition; similarly for \(\phi_6(\xi_{i_0})\) and \(\phi_7(\xi_{i_0})\).

Since contrast gain control can be measured by a phase advance (previous paragraph), \(\phi_5(\xi_{i_0})\), \(\phi_6(\xi_{i_0})\) and \(\phi_7(\xi_{i_0})\) provide three indicators, hopefully highly correlated, of the strength of the gain control induced by \(\xi_{i_0}\).

Figure 6(d) represents experimental measures for \(\phi_5(\xi_{i_0})\), \(\phi_6(\xi_{i_0})\) and \(\phi_7(\xi_{i_0})\). As predicted, the three indicators are highly correlated, consistent with the global time advance induced by contrast gain control. Phase advance is strongest when the carrier \(\xi_{i_0}\) is around 3–10 Hz. This reveals that the underlying mechanism for the contrast gain control measured here has a ‘band-pass’ sensitivity, being preferentially triggered by temporal variations around 3–10 Hz.

Simulation with our model.

The three measured phase differences \(\phi_5(\xi_{i_0})\), \(\phi_6(\xi_{i_0})\) and \(\phi_7(\xi_{i_0})\) for our model are represented in Fig. 7(d). We reproduce larger phase advances when the carrier frequency is in the range 1–10 Hz. This is due to the temporal band-pass filter \(T_{w_U,\tau_U}\) in Eq. (8), which enhances the contributions of frequencies in the range 1–10 Hz in the current I OPL that is fed to the gain control mechanism Eq. (11).

In Fig. 7(c) we also provide the Wiener kernels associated with the experiment. Given a carrier frequency \(\xi_{i_0}\), the interesting feature is the perturbed kernel measured at the remaining frequencies \(\xi_i \neq \xi_{i_0}\). Perturbed kernels for different carriers \(\xi_{i_0}\) do not superimpose, because not all carriers induce the same amount of gain control. Strong gain control translates into a Wiener kernel that is more attenuated at low temporal frequencies. For our set of parameters, we find that the carrier frequencies that elicit the most gain control are \(\xi_{i_0}=4\), 8 and 2 Hz, consistent with Fig. 7(d).

3.2.3 Linear–nonlinear analysis (LN)

Another way to compute Wiener kernels is LN analysis, based on reverse correlation of the output signal with a white noise stimulus (Marmarelis and Naka 1972; Chichilnisky 2001). It provides a way to measure the LN architecture (linear filtering followed by a static nonlinearity) that best fits the measured cell. This method has been recently applied to the retina to provide new measurements of contrast gain control (Kim and Rieke 2001; Rieke 2001; Baccus and Meister 2002).
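The reverse-correlation step can be illustrated on synthetic data. In the Python sketch below, a ground-truth LN cell (biphasic linear filter, exponential nonlinearity, Poisson spiking) is driven by Gaussian white noise, and the spike-triggered average (STA) is computed from its spikes. The ground-truth cell, all its parameters and the function name `spike_triggered_average` are hypothetical choices made for illustration; only the 5 ms time bin is taken from the caption of Fig. 8. For such a cell the STA is expected to be proportional to the linear filter, up to estimation noise.

```python
import numpy as np

def spike_triggered_average(stimulus, spike_counts, n_lags):
    """Average stimulus value at each lag before a spike (lag 0 = same bin)."""
    n = len(stimulus)
    sta = np.zeros(n_lags)
    for lag in range(n_lags):
        # pairs spike_counts[k] with stimulus[k - lag] for all k >= lag
        sta[lag] = np.dot(stimulus[: n - lag], spike_counts[lag:])
    return sta / spike_counts.sum()

rng = np.random.default_rng(1)
dt = 5e-3                                            # 5 ms time bins
lags = np.arange(40) * dt
# Hypothetical biphasic temporal filter of the ground-truth cell
true_filter = np.sin(2 * np.pi * lags / 0.2) * np.exp(-lags / 0.04)

stimulus = rng.standard_normal(200_000)              # Gaussian white noise
drive = np.convolve(stimulus, true_filter)[: len(stimulus)]
rate = 20.0 * np.exp(0.5 * drive / drive.std())      # static nonlinearity (Hz)
spikes = rng.poisson(rate * dt)                      # spike count per bin

sta = spike_triggered_average(stimulus, spikes, len(lags))
similarity = np.corrcoef(sta, true_filter)[0, 1]     # close to 1 if recovered
```

The static nonlinearity of the LN fit can then be estimated by binning the measured firing rate against the stimulus filtered by the STA; that step is not shown here.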

Figure 8(a) and (b) present experimental results of Baccus and Meister (2002) on a salamander ganglion cell. They performed an LN analysis with white noise at two different contrasts, to measure the change in the filtering structure of the cell. The dual mark of contrast gain control is again observed: time advance of the response (Fig. 8(a)) and decrease of the gain for high contrasts (slope of the nonlinear function in Fig. 8(b)).

Fig. 8

(a) and (b): LN analysis by Baccus and Meister (2002) on a salamander ganglion cell, revealing contrast gain control. Time advance can be observed (a), and decrease of the gain at high contrasts (b). (c) and (d): LN analysis on a model cell, based on a spike-triggered average as in Chichilnisky (2001) (time bin 5 ms). Same parameters as the model X cell in Fig. 5, except for slower time scales to account for the salamander retina: τ C  = 30 ms, τ S  = 20 ms, τ G = 50 ms, \(i_\mathrm{G}^0=0\) Hz, λ G = 300 Hz

LN analysis on a model cell is depicted in Fig. 8(c) and (d). Our simulated cell is based on the X cell model of Fig. 5, except for time scales which were fitted to salamander retina time constants. The dependence on contrast of the linear filter and nonlinear curve is qualitatively similar to the original experiment.

Remark

It is not surprising that our model performs well both on this LN protocol and on the previous multi-sinus protocol (Section 3.2.2). Indeed, both protocols derive a ‘best-fitting linear kernel’, in response to a temporal signal which spans the whole frequency domain. However, LN analysis generally uses white noise, while the multi-sinus experiment is more related to pink noise (preceding remark). As a result, the retrieved kernels slightly differ between the two protocols (personal experimentation).

Remark

Fast and slow contrast gain control. Recent work (Kim and Rieke 2001; Baccus and Meister 2002) has established that at least two contrast gain control mechanisms are present in the retina with different time scales: A fast, almost instantaneous gain control mechanism, and a slower adaptation process (see discussion in Section 4.2). The multi-sinus experiments (Shapley and Victor 1978) in Fig. 6 use a protocol which elicits both the slow and fast mechanisms. By contrast, the LN experiments of Baccus and Meister (2002) do discriminate fast from slow contrast gain control. The curves in Fig. 8(a) correspond specifically to their measure of the fast effect.

Our model only includes a fast gain control mechanism. Figure 8(c) and (d) show that our mechanism can qualitatively reproduce the fast component of contrast gain control. In turn, Fig. 7 shows that our mechanism is sufficient to qualitatively reproduce the multi-sinus experiments, suggesting that biologically, fast contrast gain control is mostly responsible for the change in shape of the kernels (a result confirmed in Baccus and Meister 2002).

3.3 Results on real images

To conclude our presentation of Virtual Retina, we present simulations on whole images and sequences. We do not study quantitatively any retinal feature in this section: The article does not aim at such a study, but at presenting a simulation tool. Rather, we illustrate qualitatively how large-scale simulations make it possible to link a model architecture with its perceptual consequences.

First, as a general illustration of the software, we present the complete simulation of a retina on a moving sequence, with spiking output and all intermediate signals involved. Second, we focus on two specific elements of the model and how they relate to a percept.

3.3.1 Large-scale simulation of the model

We show in Fig. 9 the response of our model to an input video stimulation, with all intermediate signals represented at two instants of the simulation. To showcase the capabilities of Virtual Retina, we chose a hybrid retina, with cell properties being those of a cat retina (with X and Y cells), but with a radial structure as in a primate retina: spatial precision is maximal in the fovea and decreases towards the periphery. The simulated retina had a diameter of 50 deg, corresponding to 250 pixels. The input sequence lasted 1.4 s of ‘real time’, corresponding to 56 frames (each frame shown for 25 ms). There were three ganglion layers of 30,000 spiking cells each. On average, each cell fired approximately one spike per input frame. Total processing time was around 130 s (about 2 s per input frame).

Fig. 9

Large-scale simulation with cat X and Y cells. Image size: 50 deg (250 pixels). Same parameters as in Fig. 5, except σ C = 0.3 deg, σ S = 1 deg, and for Y cells σ G = 1 deg, \(i_\mathrm{G}^0=60\) Hz. Artificial radial structure as in primate retinas: R 0 = 10 deg (50 pix), K = 0.2 deg\(^{-1}\). 90,000 spiking cells, simulation speed of around 1/100 real time (see text)

The foveated structure, ruled by sampling scheme Eq. (17), can be observed on all retinal images (except for the input light): The periphery is more blurred than the central zone.

Signals C, S and I OPL illustrate the properties of the OPL filter Eqs. (8)–(10): It is the difference of two low-passed versions of the sequence, so it takes strong values on image edges and on moving zones. Its biggest response is thus located on the edges of the walking characters.

The second column corresponds to layers I OPL, V Bip and g A which are involved in the contrast gain control scheme Eqs. (11)–(13). The contrast gain control scheme enhances the linear contrast image I OPL to produce V Bip (which better reveals the details of contrast). This perceptual effect is detailed below.

The last column presents linear reconstructions from the output spike trains, respectively from X ON and OFF cells, and Y OFF cells. Each spike simply contributes to the reconstruction by adding a circular spot, whose diameter and intensity depend on the cell density in this region of the retina, and whose temporal profile is a decreasing exponential of time constant 20 ms. Thus, a reconstructed sequence displays in each pixel a quantity close to the ‘instantaneous firing rate for a cell located at this pixel’.
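The reconstruction rule described above can be sketched as follows in Python. This is a simplified illustration, not the software's implementation: the spot radius and intensity are kept constant (as for a uniform retina) rather than depending on the local cell density, and the function name `reconstruct` and the spike format are hypothetical.

```python
import numpy as np

def reconstruct(spikes, shape, times, tau=0.02, radius=1.5, intensity=1.0):
    """Linear reconstruction of a movie from spikes.

    spikes : iterable of (t_spike, row, col), time in seconds
    shape  : (rows, cols) of the reconstructed image
    times  : instants (s) at which the reconstruction is evaluated
    Returns an array of shape (len(times), rows, cols).
    """
    rows, cols = np.mgrid[0:shape[0], 0:shape[1]]
    times = np.asarray(times, dtype=float)
    movie = np.zeros((len(times),) + tuple(shape))
    for t_spk, r, c in spikes:
        # circular spot centered on the emitting cell
        spot = ((rows - r) ** 2 + (cols - c) ** 2) <= radius ** 2
        # causal decreasing exponential of time constant tau
        elapsed = times - t_spk
        kernel = np.where(elapsed >= 0, np.exp(-np.maximum(elapsed, 0.0) / tau), 0.0)
        movie += intensity * kernel[:, None, None] * spot[None, :, :]
    return movie

# Usage: two spikes from neighboring cells, evaluated every 10 ms
frames = reconstruct([(0.010, 5, 5), (0.015, 5, 7)], (11, 11),
                     np.arange(0.0, 0.1, 0.01))
```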

The positive and negative parts of signal V Bip are coded respectively by ON and OFF ganglion cells. Y cells display a signal with less spatial precision than X cells, because of the supplementary synaptic pooling \(G_{\sigma_\mathrm{G}}\) in Eq. (14). In addition, Y cells are only sensitive to temporal changes, so they only detect the moving characters. This is obtained by making Y cells totally phasic with w G = 1 in Eq. (14), and by lowering the spontaneous firing rate of Y cells, through parameter \(i_\mathrm{G}^0\) in Eq. (15).

3.3.2 Perceptual consequences of model architecture

Spiking pattern at image onset

It has long been known that the center-surround architecture of ganglion cells produces preferential responses to spatial or temporal change. Much subsequent modeling has considered the retina – or at least its center-surround ganglion cell pathway – as an edge detector. Here we suggest there could be another effect in the first milliseconds after image onset, due to the biologically observed delay of the surround signal with respect to the center signal (Enroth-Cugell et al. 1983; Bernadete and Kaplan 1999).

Figure 10 presents a reconstruction from the spikes emitted by a square array of primate ‘parvo’-like cells, after onset of a static image (one cell per pixel, each spike adding an exponential contribution of latency 15 ms). To produce plausible initial conditions, the cells are first exposed to Gaussian white noise of the same luminance as the forthcoming image.

Fig. 10

Spiking cells respond to the onset of a static image, after exposure to noise of similar luminance. First spikes code for a luminance signal (see reconstruction 30 ms after image onset). The ‘edge detection’ signal is coded only in the following spikes (see reconstruction at 90 ms), due to the non-separability of filtering in the OPL (see text). Experimental procedure: Primate ‘parvo’-like cells with \(\sigma_\mathrm{C} = 0.03\) deg, \(\sigma_\mathrm{S} = 0.1\) deg, \(\sigma_\mathrm{A} = 0.2\) deg; \(\tau_\mathrm{S} = 4\) ms, the approximate delay measured between center and surround signals in cat (Enroth-Cugell et al. 1983) and primate (Bernadete and Kaplan 1999). Other parameters as in Fig. 5. Input image 5 deg (500 pixels). Output array: 100×100 cells, one cell per pixel in the reconstruction

The first spikes coding for the image (observable from around 20 ms after image onset) do not code for image edges, but only for the center signal, simply proportional to the input luminance: see reconstruction at time 30 ms. The following spikes progressively start coding for image edges, as the delayed surround signal [Eq. (9)] catches up. In the reconstruction of Fig. 10, the transition is partly achieved at time 50 ms, totally achieved at time 90 ms. It is surprising, but verified, that the small supplementary delay of surround (here, τ S  = 4 ms) has perceptual effects over several tens of milliseconds.

Very recent results (Gollisch and Meister 2008) have brought experimental validation to this original prediction of our model. The authors reconstructed two images from the spike trains of a salamander fast OFF ganglion cell in response to the onset of a static image: a first image based on the latency of the first emitted spike (T), and a second image based on the total number of spikes fired by the cell (N). Their observation, reproduced with our model (not shown), is that the ‘T’ image resembles the luminance input profile, while the ‘N’ image emphasizes image edges.

In our model, this subsequent ‘edge image’ is the equilibrium signal of the retina: It defines the ‘stabilized’ retinal output to the static image, after the initial luminance transient at image onset (in real retinas, ‘stabilized’ is only an approximation, due to slow adaptation effects). But the initial ‘luminance transient’ probably has its importance for studies of spike-related information transmission. If it is true that the first emitted spikes carry the most information (Berry et al. 1997; Van Rullen and Thorpe 2001), is the retina really an edge detector?

We believe that a large-scale, and relatively detailed simulator such as Virtual Retina can be of some help to theoreticians wishing to address this ‘delayed surround’ problem more quantitatively. One object of our current research is to find reconstruction procedures taking advantage of this particular structure of retinal filtering (see Conclusion).

Remark

Using the basic microsaccade generator included in Virtual Retina, we also tested what signal is coded when the image is subject to small, regular displacements of frequency around 2 Hz (Martinez-Conde et al. 2004). We found that the ‘microsaccade’-perturbed image remains perceptually close to the stabilized ‘edge image’. This further justifies our denomination of the ‘edge image’ as being ‘stabilized’, even under small fixation eye movements.

Contrast enhancement through contrast gain control

As a second illustration of the relation between models and large-scale percepts, we present in Fig. 11 how the gain control loop Eqs. (11)–(13) enhances edges nonlinearly, in ways very similar to traditional image processing techniques.

Fig. 11

Perceptual comparison between a linear output (b) and two versions of the gain control mechanism. When the gain control is purely temporal (σ A=0, panel c), it operates as a point-by-point Gamma transform on the image. When the gain control is allowed to have a spatial extent (d), it operates like a local histogram equalization on the contrast image. b to d are equilibrium responses, after the initial transient at image onset. Details and experimental procedure in the text. Flower photo courtesy of Marcello Moisan (Moisan 2007)

The test image displays a flower with strong contrast, and smaller variations of contrast in the background. We only consider here the equilibrium ‘edge image’ of the retina, rather than the initial ‘luminance transient’ (as explained in the previous paragraph).

We compare the linear output of the retina \(I_\mathrm{OPL}\) with two versions of \(V_\mathrm{Bip}\) after contrast gain control. The first version considers a purely temporal gain control loop, with no spatial extent for the measure of contrast by \(g_\mathrm{A}(x,y,t)\) (through \(\sigma_\mathrm{A} = 0\) in Eq. (12)). The second version allows this spatial extent, with \(\sigma_\mathrm{A} = 0.2\) deg, a value comparable to the extent of our surround signal, \(\sigma_\mathrm{S} = 0.1\) deg. We study these two cases separately because of the hypothetical nature of parameter \(\sigma_\mathrm{A}\) in our retinal model (see Discussion in Section 4.2).

To produce comparable results, the three resulting images (\(I_\mathrm{OPL}(x,y)\) and the two \(V_\mathrm{Bip}(x,y)\)) are normalized between −1 and 1, and passed through a dimensionless rectification N as in Eq. (15) (with \(i_0 = 0.3\), \(\lambda = 1\), \(v_0 = 0\), see Fig. 3(b)), which models ganglion rectification and spike generation.

Panel B presents the rectified version of I OPL(x,y), the linear response. In panels C and D, we present the interplay between the rectified nonlinear output V Bip(x,y) and the adapting conductance g A(x,y) that produced the nonlinear effect.

In the case of a purely temporal contrast gain control (\(\sigma_\mathrm{A} = 0\), panel C), one can observe a rebalancing of the contrast levels, as compared to the linear output (B). Intermediate contrast levels are enhanced relative to high-contrast levels (see, e.g., the small flower at the bottom left). Mathematically, when equilibrium is reached in Eqs. (11)–(13), one has the point-by-point relations \(g_\mathrm{A}(x,y)=Q(V_\mathrm{Bip}(x,y))=g_\mathrm{A}^0+\lambda_\mathrm{A} V_\mathrm{Bip}(x,y)^2\) and \(I_\mathrm{OPL}(x,y) = V_\mathrm{Bip}(x,y)\,g_\mathrm{A}(x,y)\), so that the nonlinear output can simply be understood as the static point-by-point compression

$$V_\mathrm{Bip}(x,y)=L^{-1}(I_\mathrm{OPL}(x,y)),$$

with \(L(V)=V(g_\mathrm{A}^0+\lambda_\mathrm{A} V^2)\). So in this case (\(\sigma_\mathrm{A} = 0\), on a static image), our gain control loop Eqs. (11)–(13) is close to a Gamma-transform (Gonzalez and Woods 1992) on the original linear output.
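Since \(L\) is a strictly increasing cubic, the compression \(L^{-1}\) is easy to evaluate numerically. The sketch below uses Newton's method; the function name `compress` and the parameter values (`g0 = 1`, `lam = 1`, in arbitrary units) are illustrative assumptions, not values from the model.

```python
import numpy as np

def compress(i_opl, g0=1.0, lam=1.0, n_iter=50):
    """Solve V * (g0 + lam * V**2) = i_opl for V by Newton's method."""
    i_opl = np.asarray(i_opl, dtype=float)
    v = i_opl / g0  # linear (low-contrast) solution as starting point
    for _ in range(n_iter):
        f = v * (g0 + lam * v**2) - i_opl
        df = g0 + 3.0 * lam * v**2  # strictly positive, since L is increasing
        v = v - f / df
    return v

low, high = compress(2.0), compress(10.0)
```

With these values `compress(2.0)` is 1 and `compress(10.0)` is 2, so an input ratio of 5 is reduced to an output ratio of 2. For small inputs the map is nearly linear (V ≈ I/g0), while for large inputs it approaches a cube-root law, which is the sense in which it resembles a Gamma transform.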

In the case where the contrast gain control mechanism includes a spatial extent \(\sigma_\mathrm{A}\) (panel D), the equilibrium between \(V_\mathrm{Bip}(x,y)\) and \(g_\mathrm{A}(x,y)\) becomes dependent on the spatial structure of the input image, and there is no analytical expression. Intuitively, \(g_\mathrm{A}(x,y)\) provides a divisive effect on \(V_\mathrm{Bip}(x,y)\) based on the contrast in the neighborhood of (x,y), making \(V_\mathrm{Bip}(x,y)\) a measure of contrast that is local rather than absolute. This enhancement, which can be observed in the background in D, is very close to a local histogram equalization (see Gonzalez and Woods (1992), chapter 3) on the linear contrast image.

These results are another example of the link between physiological and perceptual features that large-scale simulation makes visible. We wish to stress the qualitative nature of these perceptual results on contrast gain control. For example, one might argue that we humans do not see contrast as strongly enhanced as that displayed in Fig. 11(d). It should not be forgotten that the Midget (‘parvo’) pathway of primates, which is supposed to be our primary source of precise form analysis, is only weakly subject to contrast gain control (Bernadete et al. 1992). Besides, the problem of ‘double filtering’ (by the software, and by our visual system) raises other issues concerning what we see when looking at the reconstructions.

To conclude, note that physiological measurements such as those of Shapley and Victor (1978) (Fig. 6(a)) demonstrate that the response of cat ganglion cells is sub-linear in contrast, at least in specific spatio-temporal conditions (their experiments concerned sinusoidal stimulation, whereas here we simulate a totally static image). This necessarily implies a perceptual invariance for all cells which display contrast gain control (cat cells, primate Parasol (‘magno’) cells): at least a static compression effect as in Fig. 11(c), and possibly a local equalization as in Fig. 11(d).

4 Discussion

4.1 Customizable simulation software

4.1.1 Combining large-scale and plausibility

The first aim of this article was to present Virtual Retina, a large-scale simulation software. Before going into the details of the underlying retina model, we would like to stress the goals of this software: To achieve at the same time large-scale simulation and a relative biological plausibility, with an adaptable degree of complexity.

First of all, Virtual Retina is a large-scale simulator. It aims at providing input to neuroscientists who need this large-scale factor: Motion detection tasks in a natural scene, population coding by an assembly of spiking visual neurons, information-theoretic calculations on natural scenes. With this purpose, it is being used by several research teams of the FACETSFootnote 5 European consortium as input to detailed models of primary visual cortex (V1). It can also serve as a demonstration tool in an educational framework.

Because Virtual Retina is a large-scale simulator, we wished to reduce as much as possible the number of parameters used in the underlying model. This explains the simple form taken by our successive stages (OPL, contrast gain control, IPL and ganglion cells), which discard many effects known to occur in real retinas.

At the same time, Virtual Retina is intended to be a plausible simulator that can provide output spike trains reasonably close to those of real ganglion cells. Reproducing a number of experimental recordings in Section 3 appeared to be a necessary step to establish the plausibility of the software.

Because the simulator is meant to be plausible, we wanted it to keep specific properties of retinal processing that are often ignored by large-scale models. These include the non-separability of the OPL filter Eq. (10), which allows detection of both image edges and uniform flickering screens; the contrast gain control mechanism Eqs. (11)–(13), which provides invariance to contrast in natural scenes; and the trial-to-trial variability in the emission of spike trains.

For the same reason, Virtual Retina uses an underlying model mostly based on prior, state-of-the-art knowledge on the retina, experimental results as well as models. Our goal was to reduce this state-of-the-art knowledge to formulations as simple as possible, for inclusion in the software. As an exception, our contrast gain control mechanism based on conductances is a more original contribution, although it also strongly relates to previous work. The mechanism is discussed in Section 4.2. Note that potential users looking for purely state-of-the-art simulation can easily disconnect the contrast gain control stage, thanks to the modular nature of the software.

4.1.2 Subclasses of ganglion cells

In this section we briefly mention the different types of ganglion cells reproducible by our simulator. Names and classification of ganglion cells vary according to the species considered, and to the classification criterion (morphology or physiology) (Kolb et al. 2001; Masland 2001). The goal of this section is not to review all types, but just to give landmarks about retinal physiology and how our simulator relates to them.

In the cat retina, X and Y cells are the most studied types. Both types of cells display a strong contrast gain control (Shapley and Victor 1978), although the effect is stronger in Y cells. Y cells are more phasic, X cells more tonic. Finally, the response of Y cells cannot be modeled by linear spatial summation. Our model can account for both X and Y types of cellular response (see Section 3.2.1).

In the primate retina, Midget and Parasol cells have received the most attention. Midget cells are very precise spatially (small receptive field) and code for red-green color oppositions. They are known to display little contrast gain control (Kaplan and Bernadete 2001). They are connected to the Parvocellular pathway of the LGN, which is supposed to be in charge of precise shape detection. Parasol cells have a wider receptive field, are not sensitive to color but very sensitive to contrast, and display a strong contrast gain control effect. They are connected to the Magnocellular pathway of the LGN, which is supposed to be in charge of movement detection and broad scene analysis. Some evidence suggests that Midget cells constitute a new channel of visual information possessed only by primates, whereas Parasol cells are a common feature shared with other mammals, being close for example to cat X and Y cells (Kaplan and Bernadete 2001; Masland 2001). Following this hypothesis, Virtual Retina can efficiently reproduce primate Parasol cells. Midget cells can also be reproduced in their achromatic features, but color oppositions are not yet handled by our model. Finally, as explained already, Virtual Retina handles the foveated structure typical of primate retinas.

Table 2 summarizes plausible orders of magnitude for the model parameters that must vary between cat and primate retinas. Values for σ C and σ S are taken from the literature (Enroth-Cugell and Robson (1966) for cat, Croner and Kaplan (1995), McMahon et al. (2000) for primate). Other parameters for cat cells are plausible orders of magnitude that provided good fit to data. Other parameters for primate cells are suggestions respecting the scaling from cat to primate retina, and the characteristics of primate cells explained in the above paragraph: Parasol cells can behave like cat X or Y cells, and Midget cells display no contrast gain control.

Table 2 Model parameters that vary according to species (cat and primate) and pathway

Remark

Many other types of ganglion cells have been found in the cat retina (W cells, Q cells...) and in the primate retina (such as small bi-stratified ganglion cells which code for Blue/Yellow color opponents) with functionalities more or less known (Masland 2001; Wohrer 2008b). Some are sensitive to illumination, others are tuned to directional movement (e.g., ON-OFF DS cells in rabbit retina), etc. Our model does not intend to reproduce these other subtypes, although we believe that for some of them, modeling is relatively straightforward from what is presented in this article.

4.1.3 Temporal transients and adaptation in the model

‘Adaptation’ is a word widely used in neurophysiology, that can convey various meanings. When presenting our model, we restricted this word to the description of nonlinear effects in retinal filtering. Conversely, we rather used the word ‘transient’ to define the temporal high-pass stages that we modeled through linear filters. Using this terminology, our simulator uses two linear transient filters \(T_{w_U,\tau_U}\) and \(T_{w_G,\tau_G}\), and one contrast adaptation stage with the feedback loop Eqs. (11)–(13). We designed the feedback loop Eqs. (11)–(13) as a purely low-pass stage, by associating the adapting conductance g A to a null Nernst potential. As a result, only the two linear filters \(T_{w_U,\tau_U}\) and \(T_{w_G,\tau_G}\) are in charge of temporal shaping in the system.

Our adaptation scheme to contrast is discussed in Section 4.2. We wish to stress that this scheme only aims to reproduce fast gain control adaptation. Recent works (Baccus and Meister 2002; Kim and Rieke 2001) have revealed a second mechanism of gain control, on a much slower time scale. This secondary gain control belongs to the family of slow adaptation mechanisms in neurons, which we have neither discussed nor included in our simulator. This slow adaptation is likely present simultaneously at different biological locations, both pre- and post-synaptically to ganglion cells.

In the salamander, Kim and Rieke (2001, 2003) find that slow contrast gain control could be accounted for by a slow inactivation of Na⁺ currents due to spike generation in the ganglion cells. In mammalian retinas, some studies (Solomon et al. (2004) in primate) also suggest that a large part of slow contrast adaptation arises directly in ganglion cells, while others (Manookin and Demb (2006) in guinea pig) suggest that slow adaptation is mostly presynaptic to the ganglion cells, because currents directly injected in ganglion cells induce less slow adaptation.

For the moment, Virtual Retina ignores slow adaptation, although a very experimental spike-frequency adaptation scheme is available in the source code. Note that if a slow adaptation were implemented in the model (spike-frequency adaptation or other), it would make it possible to account for the nonlinear variations of cat X cells in response to a uniform screen, according to their mean level of activity during the current stimulation (Section 3.2.1 and Fig. 4(a)).

As for the linear filters \(T_{w_U,\tau_U}\) in Eq. (8) and \(T_{w_G,\tau_G}\) in Eq. (14), they must be seen as a functionally convenient choice to reproduce band-pass behaviors in the retina. Their respective biological locations and time scales (100 ms for \(T_{w_U,\tau_U}\) versus 10–30 ms for \(T_{w_G,\tau_G}\)) are not intended to reproduce physiology with precision.

\(T_{w_U,\tau_U}\) in Eq. (8) can be associated with different physiological meanings: First, it is a crude approximation of the intrinsic undershoot occurring in cone phototransduction on time scales of a few hundreds of milliseconds (Schnapf et al. 1990). At the same time, it provides a very simple linear equivalent to the slow cellular adaptation described in the previous paragraph. Finally, it serves as a way to correctly enhance the 1–10 Hz frequency range before application of the contrast gain control stage (see Section 3.2.2 and Fig. 6(d)).

By contrast, \(T_{w_G,\tau_G}\) in Eq. (14) is rather associated with fast transients which occur later in retinal processing due to retinal connectivity. It accounts mostly for amacrine cells that provide a strong inhibition on some bipolar cells and/or ganglion cells at the level of synapses in the IPL. \(T_{w_G,\tau_G}\) is the processing step where we can fix the balance between tonic and phasic cells, through parameter \(w_G\) (Section 2.4.1). Biologically, locating the tonic/phasic opposition at the level of the IPL is reasonable, since the IPL is the retinal location with the biggest variety of cell subtypes (Masland 2001), in contrast with the OPL, which involves only two types of light receptors (cones and rods) and three to four types of horizontal cells (Kolb et al. 2001; Masland 2001). Bipolar cells are known to exist under different subtypes with different band-pass properties (Masland 2001), possibly reflecting different interactions at their synaptic terminals in the IPL.
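The role of the weight \(w_G\) in fixing the tonic/phasic balance can be illustrated with a minimal sketch. It assumes that the transient filter has the form \(T_{w,\tau}(t) = \delta(t) - w\,E_\tau(t)\), where \(E_\tau\) is a normalized exponential low-pass kernel; this form is assumed for the purpose of the illustration. The function name `transient` and the numerical values are hypothetical.

```python
import numpy as np

def transient(x, w, tau, dt):
    """Return x - w * lowpass(x), with a first-order low-pass of time constant tau."""
    x = np.asarray(x, dtype=float)
    out = np.empty_like(x)
    low = 0.0
    a = dt / tau
    for n, xn in enumerate(x):
        low += a * (xn - low)  # forward-Euler exponential low-pass
        out[n] = xn - w * low  # subtract the weighted, delayed copy
    return out

dt = 0.001                 # 1 ms time step
step = np.ones(1000)       # sustained unit step lasting 1 s
phasic = transient(step, w=1.0, tau=0.02, dt=dt)
tonic = transient(step, w=0.5, tau=0.02, dt=dt)
```

With `w = 1` the response to a sustained step decays to zero (a totally phasic cell), whereas with `w = 0.5` half of the sustained level is retained (a partly tonic cell).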

4.2 Contrast gain control mechanism

A key feature of the present model is its detailed contrast gain control stage Eqs. (11)–(13), and how it relates to experimental observations (Section 3.2.2). Here we discuss the biological relevance of our model, and relate it to other gain-control models that have been proposed at the level of the retina.

Shunt conductances

Our model assumes that fast contrast gain control is achieved by the divisive influence of conductances in the membrane of bipolar cells. When a conductance \(g_j(t)\) opens in a cellular membrane, associated with Nernst potential \(E_j\), it creates a synaptic current \(g_j(t)(E_j - V(t))\) through the membrane, as detailed in Eq. (1). Such a current can be decomposed into a ‘linear contribution’ \(g_j(t)E_j\) and a ‘shunt contribution’ \(-g_j(t)V(t)\). If only the first term is taken into account, then \(g_j(t)\) contributes linearly to V(t). The second term, by contrast, is independent of the Nernst potential, and modifies nonlinearly the instantaneous gain of membrane integration.

At the contrast gain control stage Eqs. (11)–(13), our model focuses on the shunting impact of dynamic conductances onto bipolar cells: Only the term \(-g_\mathrm{A}(t)V(t)\) is present, as if our adapting conductance \(g_\mathrm{A}\) was associated with a null Nernst potential. We do not claim this to be the biological reality, but only a computational convenience to reduce the number of required parameters. First, it removes the need to fix a value for the Nernst potential. More importantly, it allows the whole scheme Eqs. (11)–(13) to display an ON-OFF symmetry, which provides a fair reduction in the number of parameters.

ON-OFF symmetry

In the retina, the positive and negative parts of visual signals are transmitted through two distinct pathways, ON and OFF, that involve different cells. Our model takes this distinction into account starting at Eq. (14), right before spike generation by ganglion cells. Before that, single signals (\(I_\mathrm{OPL}\), \(V_\mathrm{Bip}\)...) are used, which code symmetrically for positive and negative values. In real retinas, the distinction arises earlier, through ON and OFF bipolar cells. The former are somewhat slower (mostly, because of a metabotropic receptor required to invert the polarity of the photoreceptor signal), less subject to contrast gain control, and more numerous than OFF cells (Chichilnisky and Kalmar 2002; Rieke 2001; Kolb et al. 2001).

Our contrast gain control scheme Eqs. (11)–(13), which is totally symmetrical, is thus a substantial simplification of the biological reality, made in order to reduce model complexity. This explains the symmetric shape of function Q, not often met in biological neurons: it is intended to encompass simultaneously gain controls on ON and OFF cells. The corresponding biological architecture, if it exists, is that of two distinct pathways which display:

  • A gain control mechanism on their preferred contrast polarity (ON or OFF), similarly to Eqs. (11)–(13), except for function Q that becomes a one-sided smooth rectification (similar to function N in Eq. (15)).

  • A signal compression on their non-preferred contrast polarity.

Convex activation function

The quadratic expression chosen here for Q is the simplest function that is both symmetrical and flat around \(V_\mathrm{Bip} = 0\). This flatness around zero was found to be mandatory to reproduce the multi-sinus kernels of Section 3.2.2. For example, a similarly good reproduction was obtained when function Q was piecewise linear with an activation threshold:

$$Q(V_\mathrm{Bip})= \begin{cases} g_A^0 & \mathrm{if} \, 0<V_\mathrm{Bip}<v_0,\\ g_A^0 +\lambda_A (V_\mathrm{Bip}-v_0) & \mathrm{if} \, V_\mathrm{Bip}>v_0, \end{cases}$$

and symmetrically in the negative range. But this required a supplementary parameter, and allowed a less well-defined mathematical analysis.
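The two candidate activation functions can be written side by side. In the sketch below the parameter values are arbitrary placeholders, the function names are hypothetical, and the value at exactly \(|V_\mathrm{Bip}| = v_0\), which the strict inequalities above leave unspecified, is taken to be \(g_A^0\) by continuity.

```python
import numpy as np

def q_quadratic(v, g0=5.0, lam=50.0):
    """Symmetric quadratic activation: g0 + lam * v**2."""
    return g0 + lam * np.square(v)

def q_piecewise(v, g0=5.0, lam=50.0, v0=0.1):
    """Flat at g0 for |v| <= v0, then linear in |v| - v0, symmetric in v."""
    return g0 + lam * np.maximum(np.abs(v) - v0, 0.0)

v = np.linspace(-0.5, 0.5, 11)
quad, piece = q_quadratic(v), q_piecewise(v)
```

Both functions are symmetric and convex, and both stay at (or very near) the baseline conductance around zero; the piecewise version does so exactly, at the cost of the extra threshold parameter `v0`.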

Biologically, the convex shape required for Q in our model might reflect the existence of voltage-gated conductances in bipolar cells, whose activation becomes significant only at high potentials V Bip. Indeed, different voltage-gated inhibitory currents have been observed in bipolar cells of several species (Connaughton and Maguire 1998; Kolb et al. 2001). Mathematically, the convexity of Q is also an important property, as demonstrated in Wohrer (2007): We refer to Appendix B for the main result.

Physiological interpretation

It is natural to ask whether or not the present model of gain control can be attached to a physiological meaning. Experiments have established that the fast gain control mechanism (Baccus and Meister 2002) is already observed at the level of bipolar cells (Rieke 2001).

But what could be the biological origin of conductances \(g_\mathrm{A}\) in the membranes of bipolar cells, as in our model? Ever since the discovery of the contrast gain control effect (Shapley and Victor 1978), it was believed that the effect had a link with amacrine cells, and/or the so-called ‘sub-units’ of Hochstein and Shapley (1976). However, recent experiments have revealed that the fast contrast gain control effect is still present under physiological blockade of amacrine synapses (Rieke 2001; Beaudoin et al. 2007). These findings likely eliminate the hypothesis that \(g_\mathrm{A}\) arises from amacrine synapses.

One plausible explanation is that \(g_\mathrm{A}\) arises from voltage-gated conductances in the membranes of bipolar cells. Voltage-gated conductances have already been hypothesized to contribute to contrast gain control in mammalian bipolar cells (Mao et al. 1998). This would explain the direct dependence of \(g_\mathrm{A}\) on the recent values of \(V_\mathrm{Bip}(x,y,t)\), as well as the small time scale \(\tau_\mathrm{A}\) for the adaptation, that we found to produce the best results in our simulations (\(\tau_\mathrm{A} = 5\) ms).
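To illustrate how a voltage-dependent conductance with such a short time constant settles onto a steady state, the following sketch integrates a pointwise version of the loop. It assumes the form \(dV_\mathrm{Bip}/dt = I_\mathrm{OPL} - g_\mathrm{A} V_\mathrm{Bip}\) and \(\tau_\mathrm{A}\, dg_\mathrm{A}/dt = Q(V_\mathrm{Bip}) - g_\mathrm{A}\), chosen to be consistent with the shunt term and the equilibrium relations discussed in this article. The values of \(g_\mathrm{A}^0\), \(\lambda_\mathrm{A}\) and the input are hypothetical; only \(\tau_\mathrm{A} = 5\) ms is taken from the text.

```python
def q(v, g0=50.0, lam=50.0):
    """Quadratic activation of the adapting conductance (hypothetical values, in 1/s)."""
    return g0 + lam * v * v

def simulate(i_opl, t_end=0.5, dt=1e-4, tau_a=0.005):
    """Forward-Euler integration of the assumed pointwise gain control loop."""
    v, g = 0.0, q(0.0)  # start at rest: no signal, baseline conductance
    trace = []
    for _ in range(int(round(t_end / dt))):
        dv = i_opl - g * v        # shunting (divisive) membrane equation
        dg = (q(v) - g) / tau_a   # conductance follows Q(V) with delay tau_a
        v += dt * dv
        g += dt * dg
        trace.append(v)
    return v, g, trace

v_eq, g_eq, trace = simulate(i_opl=100.0)
```

With these values the state converges to V = 1 and g = 100, which satisfies both g = Q(V) and I = V g.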

Another possible explanation could imply calcium adaptation, often hypothesized as a source of contrast gain control in bipolar cells (Shiells and Falk 1999; Nawy 2000; Rieke 2001). Modeling calcium feedbacks often leads to systems formally close to Eqs. (11)–(13) (see, e.g., the phototransduction model of van Hateren and Lamb (2006)), but with a different physiological interpretation of the system variables. In this alternative interpretation, the ‘controlled’ variable (V Bip in Eqs. (11)–(13)) is the input dendritic current I Syn on bipolar cells, while the ‘controlling’ variable (g A in Eqs. (11)–(13)) is the concentration of some calcium-binding molecule [M], which regulates I Syn through a catalyzer equation formally similar to (11). Finally, a convex nonlinearity Q may also be introduced for different reasons: Possibly, a Hill exponent in the binding of calcium to molecule [M].

In both biological interpretations (voltage-gated conductances or calcium adaptation), the contrast gain control mechanism could have a certain spatial extent \(\sigma_\mathrm{A}\), due to a diffusion of the molecules or ions involved in the control reaction (e.g., a diffusion of Ca²⁺, in the ‘calcium’ interpretation of the model). Such diffusion is indeed possible, due to the existence of gap junctions between neighboring bipolar cells (Dacey et al. 2000; Kolb et al. 2001). We have seen in Section 3.3.2 how the presence or absence of this diffusion, through parameter \(\sigma_\mathrm{A}\), has perceptual consequences on the output of our gain control model.

Other models of gain control

How does our model compare with other gain control models in the retina? The main influence on our gain control model is the model of Victor (1987). His empirical model reproduces changes in the Wiener kernel due to contrast, thanks to a high-pass filtering stage whose time constant is a function of the recent values of contrast. His model is also based on a dynamical feedback mechanism.

The main difference in our model is that we associate contrast gain control Eqs. (11)–(13) to a low-pass scheme only. Temporal high-pass is done through purely linear filters, before (slow transient \(T_{w_U,\tau_U}\) in Eq. (8)) and after (fast transient \(T_{w_\mathrm{G},\tau_\mathrm{G}}\) in Eq. (14)) our gain control loop. Our gains by doing so are:

  • Conceptual separation of two distinct effects, since a low-pass model is simpler, and sufficient to account for the contrast gain control.

  • More biological plausibility, since our mechanism can be associated with a biological interpretation.

  • A mathematically tractable model, with available proofs concerning the behavior of the control loop (see Appendix B).

We also extended Victor’s concept of ‘local level of contrast’ to a possible spatial extent \(\sigma_\mathrm{A}\) (the Victor (1987) model is purely temporal), and observed its possible consequences on a perceptual level.

However, our gain control model remains a direct descendant of the Victor model, one of the most accurate models of contrast gain control in the retina, which served as a direct influence on more recent models, such as the Y cell model of Enroth-Cugell and Freeman (1987), or the Van Hateren model for primate Parasol cells (van Hateren et al. 2002).

Another source of inspiration for our model can be found in models that propose spatial divisive effects in the retina. This is the case of the Bonin LGN model (Bonin et al. 2005), which proposes the idea of a divisive spatial surround. Similarly, the Herault et al. retina model (Hérault and Durette 2007) allows at the same time for luminance and contrast invariance, through two successive divisive steps in retinal processing. Their model, strongly oriented towards fast image processing, displays purely analog signals (no spikes) and, unlike our model, does not focus on the temporal shaping of ganglion responses by temporal transients. Neither the Bonin nor the Herault model addresses the dynamical change in the Wiener kernel induced by contrast gain control; both consider only the spatial divisive influence of luminance and/or contrast on retinal outputs.

To summarize, two ‘trends’ of gain control models can be found: Those based on the temporal expression of the gain control and those based on its spatial expression. Here we proposed a framework where both effects can be accounted for, while bearing possible biological interpretations.

5 Conclusion

In this article we described the underlying model and main characteristics of the open-source simulation software Virtual Retina. Interestingly, this software can emulate up to around 100,000 cells with a processing speed of about 1/100 real time, which makes it a good choice for large-scale simulations of the visual cortex (and possibly LGN) that require a realistic input from the retina. The first stage of the model corresponds to the OPL modeled as a non-separable spatio-temporal linear filter. The second stage is an original implementation of contrast gain control through a shunting feedback mechanism. The third stage concerns further temporal shaping in the IPL and spike generation in ganglion cells, to form the output spike trains.

Of course, this simulator remains an abstraction of the precise retinal mechanisms. It relies on assumptions and simplifications necessary to have an efficient simulator. However, throughout this work, strong efforts were made to justify the modeling choices on the basis of the physiological literature.

Introducing nonlinearities was one focus of the simulator. For example, we proposed an original contrast gain control mechanism based on a system of ordinary differential equations, which can implement both spatial and temporal invariance to contrast, and for which mathematical results have been established. Another kind of nonlinearity modeled in the software is the spatial nonlinearity of cat Y cells.

Another goal was to build modular software, with an adaptable degree of complexity for the underlying retinal model. The structure and internal parameters of the retina models used by Virtual Retina are all defined in a single customizable XML file.

In its present state, we see three main types of applications for the simulator in the field of visual neuroscience. The first type of application is to use the software as a bio-plausible input to models of higher-level visual processing – typically, LGN or V1. As such, the simulator could advantageously replace the oversimplified retina models generally used as input. Such usage of the software has started amongst laboratories of the FACETS EC IP project.

A second type of application is to use the software as a tool to understand retinal coding itself. In this scope, we are currently developing advanced reconstruction procedures from the spiking output of a retina, exploiting the particular filtering structure with a delayed surround signal.

More generally, we have shown in this article first qualitative examples of how the constitutive elements of the retina model (nature of the linear filter, contrast gain control) have a strong influence on large-scale percepts. We are now seeking methods to quantitatively assess the perceptual quality of retinal transmission: Suppose we can compute, for a given retina, a finite number of statistical estimators {X α } measuring how efficiently this retina transmits a set {α} of ‘perceptual descriptors’ for natural scenes.Footnote 6 Then, Virtual Retina can be used to measure the evolution of the various estimators {X α } according to the parameters of the retina model. The perceptual influences of the non-separable center-surround filtering (Section 2.2, and Fig. 10), of contrast gain control (Section 2.3, and Fig. 11), or of the non-Poisson spiking statistics induced by the nLIF model (Section 2.4.2), could thus be quantified in terms of their impact on the various estimators {X α }.

Finally, a third application is to use the software for educational purposes, e.g., through its webservice interface, to efficiently illustrate the nature of retinal coding.

To conclude, we would like to mention that Virtual Retina is evolving software: Depending on users’ requirements, other features could be integrated, such as realistic eye movements (pure rotation), chromatic oppositions, stimulus-dependent synchrony between the spikes of neighboring cells, or fitting of parameters to build a feed-forward model of LGN.