Introduction

This paper presents a methodology for brain dynamics modeling called computational neurogenetic modeling (CNGM), which integrates gene regulatory networks with artificial neural network (ANN) models to model different functions of the nervous system. The properties of all cell types, including neurons, are determined by the proteins they contain (Lodish et al. 2000). In turn, the types and amounts of these proteins are determined by differential transcription of different genes in response to internal and external signals. Ultimately, the properties of neurons determine the structure and dynamics of the whole neural network they are part of. In the CNGM approach, the interaction of gene variables in neurons affects the dynamics of the whole ANN through neuronal parameters, which are no longer constant but change as a function of gene expression. Through optimization of the gene interaction network, initial gene/protein expression values and neuronal parameters, particular target states of the neural network operation can be achieved, and meaningful relationships between genes, proteins and neural functions can be extracted.

One particular instance where gene expression determines neural dynamics is the circadian rhythm. A circadian rhythm is a roughly 24-h cycle in the physiological processes of plants and animals. The circadian rhythm partly depends on external cues such as sunlight and temperature, but otherwise it is determined by periodic expression patterns of the so-called clock genes (Lee et al. 1998; Suri et al. 1999). Smolen et al. (2004) have developed a computational model to represent the regulation of core clock component genes in Drosophila (per, vri, Pdp-1, and Clk). To model the dynamics of gene expression, differential equations and first-order kinetics were employed to describe the control of genes and their products. The model illustrates the ways in which negative and positive feedback loops within the gene regulatory network (GRN) cooperate to generate oscillations of gene expression. The relative amplitudes and phases of the simulated oscillations of gene expression resemble empirical data in most of the simulated situations. The model of Smolen et al. (2004) shows that it is possible to develop detailed models of gene control of neural behavior, provided enough experimental data are available to adjust the model.

Another example of modeling genetic influence upon neural dynamics is the genetic control of neural development. Computational models have been developed for early neural development, early dendritic and axonal morphogenesis, formation of dendritic branching patterns, axonal guidance and gradient detection by growth cones, activity-dependent neurite outgrowth, etc. (van Ooyen 2003). Although these models consider the role and dynamics of proteins, they do not take into account the role and dynamics of genes. However, they can be taken one step further by linking proteins to genes. This step was performed by Marnellos and Mjolsness (Marnellos and Mjolsness 2003; Mjolsness et al. 1991) and by Storjohann and Marcus (Storjohann and Marcus 2005; Thivierge and Marcus 2006).

Mjolsness et al. (1991) and Marnellos and Mjolsness (2003) have introduced a modeling framework for the study of development, including neural development, based upon genes and their interactions. Cells in the model are represented as overlapping cylinders in a two-dimensional hexagonal lattice, where the extent of overlap determines the strength of interaction between neighboring cells. Model cells express a small number of genes corresponding to genes that are involved in differentiation. Genes in broad terms can correspond to groups of related genes, for instance proneural genes, epithelial genes, etc. Abstracting from biochemical detail, genes interact as nodes of a recurrent network. According to Marnellos and Mjolsness (2003), levels of gene products should be viewed as corresponding to gene product activities rather than actual concentrations, and gene interactions should be viewed as corresponding more to genetic rather than specific biochemical (transcriptional, etc.) interactions. The gene network allows cell transformations in the model. For instance, cells may change their state (i.e., the levels of gene products or other state variables), change their type or strength of interaction, give birth to other cells, or die. These transformations are represented by a set of grammar rules, the L-grammar as in Lindenmayer systems. Rules are triggered according to the internal state of each cell (or of other cells as well) and are of two kinds: discrete (leading to abrupt changes) and continuous (leading to smooth changes). Marnellos and Mjolsness applied this approach to modeling early neurogenesis in Drosophila and constructed models to study and make predictions about the dynamics of how neuroblasts and sensory organ precursor cells differentiate from proneural clusters (Marnellos and Mjolsness 2003). The gene interaction strengths were optimized to fit gene expression patterns described in the experimental literature. The objective function was the least-squares one and optimization was done by means of simulated annealing. The Drosophila developmental model made predictions about how the interplay of factors such as proneural cluster shape and size, gene expression levels, and strength of cell–cell signaling determines the timing and position of neuroblasts and sensory organ precursor cells. The model also made predictions about the effect of various perturbations in gene product levels on cell differentiation.

Yet another example of a neurodevelopmental process that depends on gene expression is the formation of topographic maps in the brains of vertebrates. Topographic maps transmit visual, auditory, and somatosensory information from sensory organs to cortex and between the cortical hemispheres (Kaas 1997). Experimental evidence suggests that topographic organization is maintained also in sensory neural structures where learning occurs; in other words, tactile information is stored within the spatial structure of maps (Diamond et al. 2003). It is known that topographic map formation depends on activity-independent (genetic) and activity-dependent processes (learning or activity-dependent synaptic plasticity) (Willshaw and Price 2003). To study the interplay between these processes, a novel platform called INTEGRATE is under development (Thivierge and Marcus 2006). It is similar in nature to NeuroGene, a novel computational programming system for integrated simulation of neural biochemistry, neurodevelopment and neural activity within a unifying framework of genetic control (Storjohann and Marcus 2005). NeuroGene is designed to simulate a wide range of neurodevelopmental processes, including gene regulation, protein expression, chemical signaling, neural activity and neuronal growth. Central to the system is a computational model of genes, which allows protein concentrations, neural activity and cell morphology to affect, and be affected by, gene expression. Using this system, the authors have developed a novel model for the formation of topographic projections from retina to the midbrain, including the activity-dependent developmental processes which underlie receptive field refinement and ocular dominance column formation. The authors also implemented the learning rule introduced by Elliott and Shadbolt (1999) to model the competition among presynaptic terminals for the postsynaptic protein. The learning rule is encoded entirely in simulated genes. NeuroGene simulations of activity-dependent remodeling of synapses in topographic projections produced two results in accordance with experimental data. First, retino-tectal arbors, which initially form connections to many tectal cells over a large area, become focused so that each retinal ganglion cell connects to only one or a few tectal cells. This improves the topographic ordering of the projection. Second, the tectum, which receives overlapping topographic projections from both eyes, becomes subdivided into domains (known as ocular dominance columns) which receive neural input exclusively from one or the other eye. In addition, NeuroGene successfully modeled the EphA knockin experiment in which the retinal EphA level was increased and the resulting retino-tectal projections were specifically disrupted (Brown et al. 2000). NeuroGene can be considered a neurogenetic model even though it does not include interactions between genes. Genes obey known expression profiles, and these can be changed as a consequence of mutation, gene knockout or knockin; thus, the model can be used for predictions about some neurodevelopmental disorders of the visual tract in vertebrates.

To summarize, models using the gene network framework can be formulated as optimization tasks that look for model parameters such that the model optimally fits biological data or behaves in a certain desired manner. Optimization seeks the minimum of an objective (or error) function E(p), which depends on the state variable values. An example of such an objective function is the least-squares error function, as in Marnellos and Mjolsness (2003):

$$ E(\mathbf{p}) = \sum\limits_{i,a,t} \left( p_{a\,\text{MODEL}}^{i}(t) - p_{a\,\text{DATA}}^{i}(t) \right)^{2} $$
(1)

which is the squared difference between gene product levels in the model and those in the data, summed over all cells (i), all gene products (a) and all times (t) for which data are available. The objective functions in gene network models typically have a large number of variables and parameters, are highly nonlinear, and cannot be solved analytically or readily optimized with deterministic methods. Therefore, stochastic optimization methods such as simulated annealing (Cerny 1985) or evolutionary computation (Goldberg 1989) are more appropriate. What is actually being optimized is the set of adjustable parameters of the GRN, that is, the gene interaction weights, activation thresholds, protein production and decay rates, etc., depending on the particular GRN model.
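As an illustration only, the following Python sketch computes the least-squares error of Eq. 1; the arrays are arbitrary stand-ins for model output and measured data, not values from any of the cited studies.

```python
import numpy as np

def least_squares_error(p_model, p_data):
    """Eq. 1: squared difference between model and measured gene product levels,
    summed over cells (i), gene products (a) and time points (t).

    Both arrays are assumed to have shape (n_cells, n_products, n_times);
    missing measurements could be masked out before calling this function.
    """
    return float(np.sum((p_model - p_data) ** 2))

# Toy usage with random stand-in data: 3 cells, 4 gene products, 10 time points
rng = np.random.default_rng(0)
p_data = rng.random((3, 4, 10))
p_model = rng.random((3, 4, 10))
print(least_squares_error(p_model, p_data))
```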

Optimization yields optimal values of hidden parameters, such as the interactions between genes, which constitute the main prediction of the model. Construction of the hidden gene regulatory network enables predictions about the consequences of gene mutations. After introducing the general framework for modeling brain dynamics, we illustrate the above optimization and predictions on a particular example: modeling the local field potential (LFP) in wild type and PV knockout mice.

Methods

Discrete computational neurogenetic model of neural dynamics

This methodology has been developed over the years in Kasabov and Benuskova (2004, 2006), Benuskova et al. (2006), and Benuskova and Kasabov (2007). In general, we consider two sets of genes: a set G gen that relates to proteins of general cell functions and a set G spec that codes for specific neuronal information-processing proteins (e.g. receptors, ion channels, etc.). Together the two sets form a set G = {G 1, G 2, …, G N } that constitutes a gene regulatory network (GRN) interconnected through a matrix of gene interaction weights W. Proteins that mediate general cellular or specific information-processing functions in neurons are usually complex molecules comprised of several subunits, each of them coded by a separate gene (Burnashev and Rozov 2000). We assume that the expression level of each gene g j (t + Δt) is a nonlinear function of the expression levels of all the genes in G. This relationship can be expressed in a discrete form using the sigmoid function σ (Weaver et al. 2001), i.e.:

$$ g_{j}(t + \Updelta t) = w_{j0} + \sigma \left( \sum\limits_{k = 1}^{N_{G}} w_{jk}\, g_{k}(t) \right) $$
(2)

where N G is the total number of genes in G, w j0 ≥ 0 is the basal level of expression of gene j, and w jk is the interaction weight between genes j and k. A positive interaction, w jk > 0, means that upregulation of gene k leads to upregulation of gene j. A negative interaction, w jk < 0, means that upregulation of gene k leads to downregulation of gene j. We work with normalized gene expression values in the interval g j (t) ∈ (0, 1). Initial values of gene expressions can be small random values, i.e. g j (0) ∈ (0, 0.1).
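A minimal sketch of the discrete GRN update of Eq. 2 is given below; it assumes a logistic sigmoid, zero basal expression and arbitrary array sizes, so the numbers are purely illustrative.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def grn_step(g, W, w0):
    """One discrete GRN update (Eq. 2).

    g  : vector of current gene expression levels, shape (N_G,)
    W  : gene interaction matrix, W[j, k] = w_jk, shape (N_G, N_G)
    w0 : basal expression levels w_j0 >= 0, shape (N_G,)
    """
    return w0 + sigmoid(W @ g)

# Toy usage: 5 genes, random interactions in (-1, 1), small random initial values
rng = np.random.default_rng(1)
N_G = 5
W = rng.uniform(-1.0, 1.0, size=(N_G, N_G))
w0 = np.zeros(N_G)                      # zero basal expression for simplicity
g = rng.uniform(0.0, 0.1, size=N_G)     # g_j(0) in (0, 0.1)
for _ in range(10):
    g = grn_step(g, W, w0)
print(g)
```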

In living cells, including neurons, gene expression, i.e. the transcription of DNA to messenger RNA followed by translation to protein, occurs stochastically, as a consequence of the low copy number of DNA and mRNA molecules involved. It has been shown at the single-cell level that protein production occurs in bursts, with the number of molecules per burst following an exponential distribution (Cai et al. 2006). However, in our approach we take into account the average gene expression and protein levels, taken over the whole population of cells and over the whole relevant time period.

We assume a linear relationship between protein levels and gene expression levels. The linear relationship in the next equation is based on findings that protein complexes, which have clearly defined interactions between their subunits, have levels highly correlated with mRNA expression levels (Greenbaum et al. 2003; Jansen et al. 2002). Subunits of the same protein complex show significant co-expression, both in terms of similarities of absolute mRNA levels and expression profiles, e.g., subunits of a complex have correlated patterns of expression over a time course (Jansen et al. 2002). This implies that there should be a correlation between mRNA and protein concentration, as these subunits have to be available in stoichiometric amounts for the complexes to function (Greenbaum et al. 2003). Thus, the protein level p j (t + Δt) reads

$$ p_{j}(t + \Updelta t) = z_{j0} + \sum\limits_{k = 1}^{N_{p_{j}}} z_{jk}\, g_{k}(t) $$
(3)

where N pj is the number of subunits of protein j, z j0 ≥ 0 is the basal concentration (level) of protein j and z jk ≥ 0 is the coefficient of proportionality between subunit gene k and protein j (subunit k content). The time delay Δt corresponds to the time interval over which protein expression data are gathered. Determining protein levels requires two stages of sample preparation: all proteins of interest are separated using two-dimensional electrophoresis and then identified using mass spectrometry (MacBeath and Schreiber 2000). Thus, in our current model the delays Δt represent the time points at which both gene and protein data are gathered.

Some protein levels are directly related to the values of neuronal parameters P j such that

$$ P_{j}(t) = P_{j}(0)\, p_{j}(t) $$
(4)

where P j (0) is the initial value of the neuronal parameter at time t = 0, and p j (t) is the protein level at time t. An example is the membrane conductance for Na+ ions, which is directly proportional to the concentration of voltage-gated Na+ channels in the axonal membrane; likewise, synaptic conductances are proportional to the concentrations of AMPA and NMDA receptors in the synaptic membrane. These concentrations are in turn proportional to the rate of their biosynthesis. After induction of LTP (long-term potentiation, the main mechanism of long-term memory formation), the genes for AMPA and NMDA receptors are upregulated (Abraham and Williams 2003) and the receptors are inserted into the postsynaptic membrane (Shi et al. 1999). Hence, we assume that if there is an increase in the synthesis of ion channels or postsynaptic receptors, it is because they are needed in the membrane to enhance the corresponding function; hence the linear relationship in Eq. 4. In such a way, the gene/protein dynamics is directly linked to the dynamics of the artificial neural network (ANN).
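The two-step mapping of Eqs. 3 and 4 can be sketched as follows; the subunit coefficients and parameter values are hypothetical placeholders used only to show how gene expression propagates to a protein level and then to a neuronal parameter.

```python
import numpy as np

def protein_level(g_subunits, z, z0=0.0):
    """Eq. 3: protein level as a weighted sum of its subunit gene expressions.

    g_subunits : expression levels of the subunit genes, shape (N_pj,)
    z          : non-negative proportionality coefficients z_jk, shape (N_pj,)
    z0         : basal protein level z_j0 >= 0
    """
    return z0 + float(np.dot(z, g_subunits))

def neuronal_parameter(P0, p):
    """Eq. 4: neuronal parameter scaled by the current protein level."""
    return P0 * p

# Toy usage: a receptor with two subunits (all values are illustrative only)
g_subunits = np.array([0.6, 0.4])   # subunit gene expression levels
z = np.array([0.5, 0.5])            # equal subunit contributions
p = protein_level(g_subunits, z)
print(neuronal_parameter(P0=1.0, p=p))  # e.g. a PSP amplitude scaled by p
```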

The discrete CNGM model is a general one and can be integrated with any neural network model, depending on what kind of neural activity one wants to model. In the presented model we have made several simplifying assumptions:

  • Each neuron has the same GRN, i.e. the same genes and the same gene interaction matrix W.

  • Each GRN starts from the same initial values of gene expressions.

  • There is no direct feedback from neuronal activity or any other external factors to gene expression levels or protein levels.

This generic discrete neurogenetic model can be run continuously over time in the following way (an illustrative code sketch follows the list):

  1. Choose initial expression values of the genes G, G(t = 0), in the neuron and the matrix W of the GRN, basal levels of all genes and proteins, and the initial values of the neuronal parameters P(t = 0).

  2. Calculate the next vector of expression levels of the gene set G(t + Δt) (Eq. 2).

  3. Calculate the concentration levels of the proteins that are related to the set of neuronal parameters (Eq. 3).

  4. Calculate the values of the neuronal parameters P (Eq. 4).

  5. Update the activity of the neural network based on the new values of parameters (taking into account all external inputs to the neural network).

  6. Go to step 2.
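Under the assumptions above, the six steps can be written as a simple simulation loop. The sketch below is illustrative only: the neural network update is left as a placeholder callable (ann_step), since the CNGM is agnostic about the particular ANN model, and all array sizes and values are arbitrary.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def run_cngm(W, g0, P0, Z, w0, n_steps, ann_step):
    """Run the discrete CNGM (steps 1-6).

    W        : gene interaction matrix, shape (N_G, N_G)
    g0       : initial gene expression values, shape (N_G,)
    P0       : initial neuronal parameter values, shape (N_P,)
    Z        : protein/subunit coefficients, shape (N_P, N_G) (Eq. 3 with z_j0 = 0)
    w0       : basal gene expression levels, shape (N_G,)
    ann_step : callable updating the neural network given the parameter values P
    """
    g = g0.copy()
    for _ in range(n_steps):
        g = w0 + sigmoid(W @ g)      # step 2: new gene expression levels (Eq. 2)
        p = Z @ g                    # step 3: protein levels (Eq. 3)
        P = P0 * p                   # step 4: neuronal parameters (Eq. 4)
        ann_step(P)                  # step 5: update the neural network
    return g

# Toy usage with a do-nothing network update
rng = np.random.default_rng(2)
N_G, N_P = 4, 2
g_final = run_cngm(W=rng.uniform(-1, 1, (N_G, N_G)),
                   g0=rng.uniform(0, 0.1, N_G),
                   P0=np.ones(N_P),
                   Z=rng.uniform(0, 1, (N_P, N_G)),
                   w0=np.zeros(N_G),
                   n_steps=5,
                   ann_step=lambda P: None)
print(g_final)
```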

The biggest challenge of our approach, and the key to the predictions of CNGM, is the construction of the GRN state transition matrix W, which determines the dynamics of the GRN and consequently the dynamics of the ANN. There are several ways to obtain W:

  1. Ideally, the values of the gene interaction coefficients w ij are obtained from real measurements through reverse engineering performed on microarray data using a Kalman filter and genetic algorithm (Kasabov et al. 2004), evolving connectionist systems (Chan et al. 2008), ICA (Lutter et al. 2006) or its nonlinear extension to network component analysis (Chang et al. 2008).

  2. The values of the W elements are iteratively optimized from initial random values, for instance with the use of a genetic algorithm (Whitehead et al. 2004), to obtain the desired behavior of the ANN. This behavior is used as the “fitness criterion” in the GA to stop the search for an optimal interaction matrix W.

  3. The matrix W is constructed heuristically, based on assumptions and insights into what result we want to obtain and why. For instance, we can use the theory of discrete dynamical systems to obtain a dynamic system with fixed-point attractor(s), limit cycle attractors or strange attractors (Katok and Hasselblat 1995).

  4. The matrix W is constructed from databases and literature on gene–protein interactions.

  5. The matrix W is constructed with a mix of the above or other methods.

Method 2 of obtaining the coefficients of W allows us to investigate and discover relationships between different GRNs and ANN states even when gene expression data are not available, and therefore we use it in this study. An optimization procedure to obtain this relationship can read:

  1. Generate a population of CNGMs, each with randomly generated values of coefficients for the GRN matrix W, initial gene expression values g(0), and initial values of ANN parameters P(0);

  2. For each set of parameters, run the CNGM over a period of time T and record the activity of the neurons in the associated ANN;

  3. Evaluate characteristics of the ANN behavior (e.g. connectivity, level of activity, spectral characteristics of LFP, etc.);

  4. Compare the ANN behavior characteristics to the characteristics of the desired ANN state (e.g. normal wiring, level of activity, etc.);

  5. Repeat steps (1)–(4) until a desired GRN and ANN model behavior is obtained. Keep a solution if it fulfills the criterion;

  6. Analyze all the obtained optimal solutions of the GRN and the ANN parameters for significant gene interaction patterns and parameter values that cause the target ANN model behavior.

In step 1, which is the generation of the population of CNGMs, we can apply the principles of evolutionary computation with the operations of crossover and mutation of parameter values. In this way we can simulate the process of evolution that has led to the neural GRN with the gene interactions underlying the desired ANN behavior. In the following we apply our theory to the case study of LFP generation in wild type and gene knockout mice.

Simplified computational neurogenetic model of LFP

The overall sum of the electrical activity of billions of neurons in the brain is recorded as the EEG. The EEG is the sum of many LFPs, which are in turn sums of the electrical activities of thousands of neurons summed locally (Freeman 2000). Genetic studies show that the human EEG has a strong genetic basis (Buzsaki and Draguhn 2004; Porjesz et al. 2002; van Beijsterveldt and van Baal 2002). We can assume that this feature holds also for animals, since in this study we use animal data. In the presented work, we use our CNGM method to model the dependency of neural electrical activity upon internal gene interactions, in order to account for the spectral differences in the LFP of wild type and gene knockout mice. In particular, we use data measured in the laboratory of A.E.P. Villa on gene-knockout mice that are prone to epilepsy (Schwaller et al. 2004; Villa et al. 2005) to seek the underlying genetic interactions.

Let each spiking neuron model be characterized by its instantaneous membrane potential u i (t). Then the LFP is the sum of the membrane potentials of all neurons in the spiking neural network (SNN), i.e. Φ(t) = Σ i u i (t). Each model neuron within the SNN possesses an internal model of the gene regulatory network (GRN) (Fig. 1). Genes are related to neuronal parameters like excitation and inhibition, and thus their expression levels determine the values of these parameters (Eq. 4). For simplicity, we assume all GRNs to be the same. We can optimize the GRN interactions W to match the SNN output with the real signal, i.e. the LFP. Based on the target LFP signal with particular spectral characteristics, we want to predict the underlying interactions W between a selected subset of genes for further experimental verification.

Fig. 1 Computational neurogenetic model (CNGM) as an abstract gene regulatory network (GRN) embedded in each neuron of a SNN model with particular output behaviour, for instance local field potential (LFP)

Proteins in neurons, such as receptors and ion channels, are complex proteins comprised of several subunits, each of which is coded for by a separate gene (Burnashev and Rozov 2000). These genes are expressed in a coordinated manner, so we treat them as one gene group G j with an overall normalized expression level g j (t). For simplicity, we assume that each gene group's expression level is constant over time but depends on the expression levels of all gene groups in the selected subset of genes, such that

$$ g_{j}(t) = \sigma \left( \sum\limits_{k = 1}^{n} w_{jk}(t)\, g_{k}(t) \right) $$
(5)

where σ is the sigmoid function with values between 0 and 1, g k (t) is the expression level of gene group k at time t, and w jk ∈ (−1, 1) is a coefficient of the abstract gene interaction matrix W. A positive interaction between two genes means that upregulation of one gene leads to upregulation of the other; a negative interaction means the opposite influence. The interactions are abstract and represent the whole chain of underlying molecular events.

The neuron's parameter value P j (t) is proportional to the gene expression level g j (t), such that

$$ P_{j} (t) = P_{j} (0)g_{j} (t) $$
(6)

where P j (t) is the value of parameter j at time t, P j (0) is the initial value of that parameter and g j (t) ∈ (0, 1) is the normalized expression level of the jth gene group in the model GRN. In such a way, the gene/protein dynamics is linked to the dynamics of the SNN. Neuronal parameters and their corresponding proteins are summarized in Table 1. The linear relationship in Eq. 6 is justified by the same findings discussed for Eq. 3, namely that subunits of the same protein complex show correlated mRNA and protein levels because they must be available in stoichiometric amounts for the complex to function (Greenbaum et al. 2003; Jansen et al. 2002). This is exactly the case of the proteins in our model, which are receptors and ion channels comprised of their respective ratios of subunits.
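Because Eq. 5 defines the time-invariant gene group levels self-consistently, one way to obtain them is to iterate the map until it settles and then scale the neuronal parameters according to Eq. 6. The sketch below does exactly that; the fixed-point iteration and the number of gene groups are our assumptions for illustration, not procedures or values stated in the text.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def solve_gene_levels(W, g_init, n_iter=100):
    """Iterate g = sigmoid(W @ g) towards a self-consistent solution of Eq. 5.

    Fixed-point iteration is an assumption here; the text only states the
    self-consistency condition, not how it is resolved.
    """
    g = g_init.copy()
    for _ in range(n_iter):
        g = sigmoid(W @ g)
    return g

def scale_parameters(P0, g):
    """Eq. 6: neuronal parameters proportional to their gene group levels."""
    return P0 * g

# Toy usage: a handful of gene groups, random interactions in (-1, 1)
rng = np.random.default_rng(3)
n = 7
W = rng.uniform(-1, 1, (n, n))
g = solve_gene_levels(W, g_init=rng.uniform(0, 0.1, n))
P = scale_parameters(P0=np.ones(n), g=g)
print(P)
```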

Table 1 Neuron parameters and related proteins

Spiking neural network model

Our spiking neuron model is derived from the spike response model (SRM) (Gerstner and Kistler 2002; Maass and Bishop 1999). The total somatic postsynaptic potential of a neuron i is denoted as u i (t). When u i (t) reaches the firing threshold ϑ i (t), the neuron i fires, i.e. emits a spike (see Fig. 2a). The moment of crossing the threshold ϑ i (t) defines the firing time t i of an output spike. The value of u i (t) is the weighted sum of all postsynaptic potentials (PSPs), \( {\text{PSP}}_{ij} (t - t_{j} - \Updelta_{ij}^{ax} ) \), such that:

$$ u_{i}(t) = \sum\limits_{j \in \Upgamma_{i}} \sum\limits_{t_{j} \in F_{j}} J_{ij}\, {\text{PSP}}_{ij}(t - t_{j} - \Updelta_{ij}^{ax}). $$
(7)
Fig. 2 (a) SRM model neuron; (b) SNN architecture with N = 120 neurons. 75–90% are excitatory neurons randomly positioned on the grid (white circles); the others are inhibitory (black circles). The input from the thalamus consists of random spikes with low frequencies

The weight of the synaptic connection from neuron j to neuron i is denoted by J ij . It takes positive (negative) values for excitatory (inhibitory) connections, respectively. \( \Updelta_{ij}^{ax} \) is the axonal delay between neurons i and j, which increases linearly with the Euclidean distance between them. The positive kernel expressing an individual postsynaptic potential (PSP) evoked on neuron i when a presynaptic neuron j from the pool Γ i fires at time t j has a double exponential form, i.e.

$$ {\text{PSP}}_{ij}^{\text{type}}(s) = A^{\text{type}} \left( \exp\left( -\frac{s}{\tau_{\text{decay}}^{\text{type}}} \right) - \exp\left( -\frac{s}{\tau_{\text{rise}}^{\text{type}}} \right) \right) $$
(8)

where \( \tau_{\text{decay/rise}}^{\text{type}} \) are the time constants of the fall and rise of an individual PSP, respectively, A type is the PSP's amplitude, and the index type denotes one of the following: fast_excitation, fast_inhibition, slow_excitation, slow_inhibition and late_PSP. These types of PSPs are based on neurobiological knowledge. Fast excitation is mediated through the AMPA receptor-gated ion channels for sodium (Destexhe 1998; Kleppe and Robinson 1999), slow excitation is mediated through the NMDA receptor-gated ion channels for sodium and calcium (Destexhe 1998; Kleppe and Robinson 1999), fast inhibition is mediated through the somatic GABAA receptor-gated ion channels for chloride, and slow inhibition is mediated mostly through the dendritic GABAB receptor-gated ion channels for potassium (Connors et al. 1988) and in part by the dendritic GABAA receptor-gated ion channels for chloride (Wendling et al. 2002; White et al. 2000). In our model, however, we consider only the dominant effect of GABRB for slow inhibition. In addition, when an inhibitory synapse is stimulated, a late depolarizing potential (late_PSP) is evoked, which depends on parvalbumin (PV), because its size is about one-third larger when the PV gene is knocked out (Vreugdenhil et al. 2003). Thus, when a presynaptic spike arrives at an excitatory synapse, both the fast and slow components of the excitatory PSP are activated, and when a presynaptic spike arrives at an inhibitory synapse, the fast, slow and late PSPs are activated. Table 1 links each parameter P j to the appropriate gene(s) g j , such that the expression level of that gene or genes, calculated according to Eq. 5, determines the value of parameter P j , be it the amplitude or time constants of the various types of PSPs, the parameters of the firing threshold in Eq. 9, etc. In the case of the firing threshold, we assume its parameters, i.e. ϑ 0 , k and τ decay (see below), are proportional to the levels of KCN and CLCN, and inversely proportional to the level of SCN.
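To make the PSP kernel concrete, the following sketch implements Eq. 8 and sums weighted kernels into a membrane potential as in Eq. 7. The amplitudes and time constants per PSP type are placeholders chosen only for illustration; the actual values and ranges used in the simulations are those in Table 2.

```python
import numpy as np

def psp_kernel(s, A, tau_decay, tau_rise):
    """Eq. 8: double-exponential PSP kernel; zero before the spike arrives."""
    s = np.asarray(s, dtype=float)
    kernel = A * (np.exp(-s / tau_decay) - np.exp(-s / tau_rise))
    return np.where(s >= 0.0, kernel, 0.0)

# Illustrative kernel parameters per PSP type (amplitudes and time constants in
# ms are placeholders, not the optimized values from Table 2). The kernel is
# positive; the synaptic weight J_ij carries the sign (negative for inhibition).
PSP_TYPES = {
    "fast_excitation": dict(A=1.0, tau_decay=5.0, tau_rise=1.0),
    "slow_excitation": dict(A=0.5, tau_decay=50.0, tau_rise=5.0),
    "fast_inhibition": dict(A=1.0, tau_decay=8.0, tau_rise=1.0),
    "slow_inhibition": dict(A=0.5, tau_decay=80.0, tau_rise=10.0),
    "late_PSP":        dict(A=0.3, tau_decay=100.0, tau_rise=20.0),
}

def membrane_potential(t, spikes):
    """Eq. 7: u_i(t) as a weighted sum of PSPs over presynaptic spikes.

    spikes : list of (J_ij, t_j, axonal_delay, psp_type) tuples
    """
    u = 0.0
    for J, t_j, delay, psp_type in spikes:
        u += J * psp_kernel(t - t_j - delay, **PSP_TYPES[psp_type])
    return float(u)

# Toy usage: one excitatory and one inhibitory presynaptic spike
spikes = [(0.8, 2.0, 1.0, "fast_excitation"), (-0.5, 4.0, 1.0, "fast_inhibition")]
print(membrane_potential(10.0, spikes))
```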

Immediately after firing an output spike at t i (only the last firing time is considered), the neuron's firing threshold ϑ i (t) increases k-fold and then returns to its resting value ϑ 0 in an exponential fashion:

$$ \vartheta_{i}(t - t_{i}) = k \times \vartheta_{0} \exp\left( -\frac{t - t_{i}}{\tau_{\text{decay}}^{\vartheta}} \right) $$
(9)

where \( \tau_{\text{decay}}^{\vartheta } \) is the time constant of the threshold decay. In such a way, absolute and relative refractory periods are modelled.
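A small sketch of the dynamic threshold of Eq. 9 is given below; the constants are placeholders, not the values from Table 2, and the handling of the time before the first spike is our assumption.

```python
import numpy as np

def firing_threshold(t, t_last_spike, theta0=1.0, k=4.0, tau_decay=10.0):
    """Eq. 9: after a spike the threshold jumps to k * theta0 and decays
    exponentially with time constant tau_decay.

    Before any spike (t_last_spike is None) the resting threshold theta0 is
    used; theta0, k and tau_decay are illustrative placeholders.
    """
    if t_last_spike is None:
        return theta0
    return k * theta0 * np.exp(-(t - t_last_spike) / tau_decay)
```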

Figure 2b illustrates the architecture of our SNN. Spiking neurons within the network can be either excitatory or inhibitory. About 10–25% of the neurons are inhibitory, positioned randomly on the rectangular grid of N neurons. Lateral connections between neurons, as well as input connections, have weights that decrease with distance from neuron i according to a Gaussian formula:

$$ J_{ij}({\text{dist}}(i,j)) = \frac{J_{0}^{\text{exc/inh}}}{\sigma^{\text{exc/inh}}} \exp\left( -\frac{{\text{dist}}(i,j)^{2}}{\left(\sigma^{\text{exc/inh}}\right)^{2}} \right) $$
(10)

Connections are established at random with probability 0.5. External inputs from the input layer are added to the right-hand side of Eq. 7 at each time step. Each external input has its own weight \( J_{i}^{{{\text{ext\_input}}}} \) and \( {\text{PSP}}_{i}^{{{\text{fast\_excitation}}}} (t) \), i.e.

$$ u_{i}^{\text{ext\_input}}(t) = J_{i}^{\text{ext\_input}}\, {\text{PSP}}_{i}^{\text{fast\_excitation}}(t). $$
(11)

We employed a uniformly random input to capture the low-frequency, non-periodic and non-bursting firing of thalamocortical inputs. Table 2 contains the neuron and SNN parameter values used in our simulations. These values were inspired by experimental and computational studies (Charpier et al. 1999; Deisz 1999; Destexhe 1998; Wendling et al. 2002) and were further adjusted by experimentation.
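The distance-dependent weights of Eq. 10 and the external input term of Eq. 11 can be sketched as follows. The grid positions, J_0, sigma and kernel constants below are placeholders; only the connection probability of 0.5 and the sign convention for inhibitory presynaptic neurons follow the text.

```python
import numpy as np

def lateral_weight(dist, J0, sigma):
    """Eq. 10: Gaussian decay of connection weight with Euclidean distance."""
    return (J0 / sigma) * np.exp(-dist**2 / sigma**2)

def build_lateral_weights(positions, J0_exc, sigma_exc, J0_inh, sigma_inh,
                          is_inhibitory, p_connect=0.5, rng=None):
    """Builds the lateral weight matrix J[i, j] (connection from j to i).

    positions     : neuron coordinates on the grid, shape (N, 2)
    is_inhibitory : boolean array marking inhibitory presynaptic neurons, shape (N,)
    Connections are drawn at random with probability p_connect, as in the text.
    """
    rng = rng or np.random.default_rng()
    N = len(positions)
    J = np.zeros((N, N))
    for i in range(N):
        for j in range(N):
            if i == j or rng.random() >= p_connect:
                continue
            dist = np.linalg.norm(positions[i] - positions[j])
            if is_inhibitory[j]:
                J[i, j] = -lateral_weight(dist, J0_inh, sigma_inh)
            else:
                J[i, j] = lateral_weight(dist, J0_exc, sigma_exc)
    return J

def external_input(t, t_spike, J_ext, A=1.0, tau_decay=5.0, tau_rise=1.0):
    """Eq. 11: external input contribution via a fast-excitation PSP kernel."""
    s = t - t_spike
    if s < 0:
        return 0.0
    return J_ext * A * (np.exp(-s / tau_decay) - np.exp(-s / tau_rise))
```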

Table 2 Model parameters, their value ranges and the initial value for optimized model

For optimization of the GRN model, a genetic algorithm is used, implemented as part of a neurogenetic simulator (Fig. 3). The optimization procedure consists of the following steps (a sketch of the evaluation loop follows the list):

Fig. 3 Snapshot of the CNGM simulator used in this work

  • Generate a population of N CNGMs, each with randomly generated values of coefficients for the GRN matrix W, w jk  ∈ (−1, 1), initial gene expression values g(0) ∈ (0, 0.1), initial values of parameters P(0), and other model parameters (like connectivity, input frequency, etc.), which are chosen uniformly randomly from ranges specified in Table 2;

  • Run each CNGM over a period of time T (in our case T = 5 s) and record the LFP for each associated SNN;

  • Calculate the distribution of phase-coupled frequencies in LFP using the methodology described in Villa et al. (2005), referring to Nikias and Raghuveer (1987) and Brillinger (1965);

  • Compare the characteristics of each SNN LFP to the quantitative characteristics of the target LFP signal of the wild-type PV+/+ mice. Evaluate the closeness of each SNN's LFP signal to the target LFP signal by means of the Euclidean distance between characteristic vectors consisting of the distributions of relative frequencies;

  • Find the CNGM models that match the LFP spectral characteristics better than other solutions, i.e. with Euclidean distance <0.1;

  • If no such solution is found, repeat the above steps or use the genetic algorithm operators of crossover and mutation to generate new solutions;

  • For the winner solutions that lead to the PV+/+ LFP characteristics, simulate the PV knockout by removing the PV gene from the GRN and record the changes in the LFP characteristics. Removal of the PV gene is implemented by setting the initial expression value of PV to 0.0, setting all interaction weights in the GRN to and from PV to 0.0, and increasing the amplitude of the late PSP by one-third (Vreugdenhil et al. 2003);

  • Choose those GRN solutions that match the PV−/− LFP characteristics after PV gene removal. Analyse the GRN interaction matrices W for significant patterns that lead to the desired behaviour.

  • If none of the solutions matches the PV−/− LFP characteristics, repeat the above optimization from the beginning.
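A skeleton of this evaluation loop is sketched below. The bispectral analysis itself (Villa et al. 2005) is reduced to a placeholder callable, the candidate representation is left abstract, and only the acceptance thresholds of 0.1 and 0.2 follow the text; everything else is an assumption for illustration.

```python
import numpy as np

def euclidean_distance(freq_dist_model, freq_dist_target):
    """Distance between two distributions of phase-coupled frequency bands
    (10 bands: 0-10 Hz, 10-20 Hz, ..., 90-100 Hz)."""
    return float(np.linalg.norm(np.asarray(freq_dist_model) -
                                np.asarray(freq_dist_target)))

def evaluate_population(population, simulate_lfp, bispectral_bands,
                        target_bands, keep_threshold=0.2):
    """Run each candidate CNGM, compare its LFP band distribution with the
    target (wild-type PV+/+) distribution, and keep the good solutions.

    population       : list of candidate parameter sets (GRN matrix W, g(0), P(0), ...)
    simulate_lfp     : callable returning the LFP time series for one candidate
    bispectral_bands : callable reducing an LFP to its 10-band distribution
                       (placeholder for the method of Villa et al. 2005)
    """
    kept = []
    for candidate in population:
        lfp = simulate_lfp(candidate)               # run the SNN for T = 5 s
        bands = bispectral_bands(lfp)
        d = euclidean_distance(bands, target_bands)
        if d < keep_threshold:
            kept.append((d, candidate))
    return sorted(kept, key=lambda pair: pair[0])   # best (smallest d) first
```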

Results

Results of simulations of wild-type and PV−/− mice data

The CNGM described above has been implemented in a C++ simulator (see Footnote 1) (Fig. 3). We have optimized the parameter values to match the bispectral characteristics of the mouse LFP for the wild type (PV+/+) mouse and for the PV-knockout mouse (PV−/−). One-channel experimental results (distribution of phase-coupled frequencies) used for the cost function were taken from Villa et al. (2005). We performed GA optimization with 100 solutions in the population for a maximum of 20 generations. Offspring solutions were produced by crossover, in which new values of CNGM parameters were produced by arithmetic averaging of the parameter values of parent solutions. Selection was done with replacement, meaning that the same solution could be selected more than once to become a parent. Mutation had a probability of 0.1 and consisted of adding or subtracting 10% of the offspring parameter value. If the new values fell outside the allowed range, they were moved to the closest border (the minimal or maximal range value). Parents were chosen by roulette-wheel selection. The whole generation was replaced with newly generated solutions, and all solutions with Euclidean distance <0.2 from every generation were kept. The optimization was stopped if the Euclidean distance between any of the solutions and the target solution (distribution of phase-coupled frequencies) dropped below 0.1 before the limit of 20 generations was reached. Figure 4 shows the evolution of the average and best fitness during optimization. All the stored solutions (187 in total) were tested by simulating the PV gene knockout. The PV gene knockout was simulated by removing the gene variable for PV from the GRN and by increasing the amplitude of the late depolarizing potential by one-third (Vreugdenhil et al. 2003). The gene for PV was removed by setting all its incoming and outgoing weights in the GRN to zero and by setting the gene expression of PV to zero. Then the SNN LFP was compared with the LFP recorded in PV−/− mice. Below we describe the results for the only CNGM that passed the test of simulated knockout of the gene variable for PV and produced an LFP that resembled the characteristics of the experimental LFP for PV−/−. The values of gene expressions can be found in Table 1 and the values of the neural parameters of this optimized model in Table 2. It is worth noting that these results do not depend on a particular realization of the network connectivity, when its parameters (the peak and sigma of the Gaussian distribution of weights) are as listed in Table 2.
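The genetic operators described above (arithmetic-average crossover, mutation by ±10% of a parameter value with probability 0.1, roulette-wheel selection with replacement, and clipping to the allowed range) can be sketched as follows; the parameter vectors and ranges are illustrative, and the fitness handed to the selection routine is assumed to be a positive "goodness" score.

```python
import numpy as np

rng = np.random.default_rng(0)

def roulette_select(population, fitness):
    """Roulette-wheel selection with replacement (higher fitness = more likely).
    An error such as the Euclidean distance d would need to be converted first,
    e.g. fitness = 1 / (d + eps)."""
    probs = np.asarray(fitness, dtype=float)
    probs = probs / probs.sum()
    idx = rng.choice(len(population), p=probs)
    return population[idx]

def crossover(parent_a, parent_b):
    """Offspring parameters are arithmetic averages of the parents' parameters."""
    return 0.5 * (parent_a + parent_b)

def mutate(params, low, high, p_mut=0.1):
    """With probability 0.1 per parameter, add or subtract 10% of its value,
    then clip to the allowed range [low, high]."""
    params = params.copy()
    for i in range(len(params)):
        if rng.random() < p_mut:
            sign = 1.0 if rng.random() < 0.5 else -1.0
            params[i] += sign * 0.1 * params[i]
    return np.clip(params, low, high)

# Toy usage: two parents with three parameters each, allowed range (-1, 1)
a, b = np.array([0.2, -0.5, 0.9]), np.array([0.4, 0.1, -0.3])
child = mutate(crossover(a, b), low=-1.0, high=1.0)
print(child)
```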

Fig. 4 Evolution of the average and best error in the GA optimization of CNGM parameters

In Fig. 5, we show the real LFP data from the mouse. The x axis denotes the frequency bands for phase-coupled frequencies in the following order: 0–10 Hz (0), 10–20 Hz (1), 20–30 Hz (2), …, 90–100 Hz (9). The y axis expresses the proportion of the given phase-coupled frequency band in the LFP bispectrum (Villa et al. 2005). In Fig. 6, we show the distribution of phase-coupled frequencies of the LFP generated by the model SNN, in which the neuronal parameters depend on the gene levels in the optimized GRN. Interactions within the GRN were optimized as described above, so that the SNN output has bispectral characteristics most similar to the real LFP signal recorded from mice. As we can see in the optimized model, there is a shift towards higher frequencies in the LFP spectrum when the GRN is complete (Fig. 6a), as in the wild-type mouse data (Fig. 5a). After the removal of the PV gene from the GRN (Fig. 6b) there is a shift towards lower frequencies, as in the real mouse data (Fig. 5b). However, we must note that the actual power of the different frequencies does not exactly match the real data, which is to be expected with such a simplified model.

Fig. 5 Distribution of phase-coupled frequencies of (a) the LFP from the wild type mouse, PV+/+, and (b) the LFP from the gene knock-out mouse, PV−/−. The x axis represents the frequency bands: (0) 0–10 Hz, (1) 10–20 Hz, (2) 20–30 Hz, etc.

Fig. 6 Distribution of phase-coupled frequencies of (a) the SNN LFP for the complete GRN and (b) the SNN LFP with the gene knockout PV−/− GRN. The x axis represents the frequency bands: (0) 0–10 Hz, (1) 10–20 Hz, (2) 20–30 Hz, etc.

In various other computational models of epilepsy, the desired epileptic behavior of the model is achieved by changing the values of model parameters that express neuronal inhibition (Destexhe 1998; Robinson et al. 2002; Wendling et al. 2002; Kudela et al. 2003). This approach is fully justified if there is no genetic cause involved (which is indeed the case for many epilepsies) or when there is no relevant underlying genetic interaction involved in the condition. The parameter regulated by PV in our model is the late PSP. In the next test, we examine what happens when we increase the amplitude of the late PSP by one-third but leave the underlying GRN intact, i.e. we do not take out the gene for PV. In Fig. 7, we show the resulting distribution of phase-coupled frequencies in the model LFP spectrum. If we compare this outcome with the simulated normal LFP characteristics depicted in Fig. 6a, we can see that there is not much change in the distribution. That is, a simple change in a parameter value is not enough in this model to achieve the shift in the frequency distribution that would correspond to the PV−/− data. We also need to simulate mutually interacting gene variables and optimize this interaction network so that it leads to both the normal and the altered behavior.

Fig. 7 Distribution of phase-coupled frequencies of the SNN LFP for the complete GRN with the increased amplitude of late PSP

Figure 8 shows the underlying abstract GRN that leads to the model LFP which best fits the experimental data for both PV+/+ and PV−/− mice. The found GRN constitutes the main prediction of the model for further experimental testing. Figure 8a illustrates all interactions that are stronger than 0.2 in absolute value. However, since we are interested in the consequences of taking the gene for PV out, we illustrate in Fig. 8b all outgoing interactions from the PV gene that are stronger than 0.1. As we can see in Fig. 8b, the optimized GRN solution has a negative interaction from the gene for PV to the gene for GABRB, which means that PV suppresses the expression of GABRB. GABRB is responsible for slow inhibition and also for slow waves in the brain's electrical activity (Destexhe 1998). When PV is absent in the simulated PV−/− gene knockout, then, due to the missing negative interaction from PV, the simulated expression of GABRB and consequently the magnitude of slow inhibition both increase, leading to more slow oscillations in the LFP spectrum and thus to the overall shift towards lower frequencies in the spectrum (Fig. 6b). This model prediction, albeit intuitively plausible, remains to be experimentally tested, for instance by microarray measurement of gene expression data from wild-type and PV−/− mice. To conclude, this example demonstrates the type of theoretical predictions that can be obtained with CNGM models about gene expression levels and gene-to-gene interactions.

Fig. 8 (a) Abstract optimized GRN that leads to the LFP most similar to the real data, both for the complete genome and for the PV gene knockout. Dashed lines denote negative interactions between genes and solid lines denote positive interactions. Only the interactions with an absolute value above 0.2 are shown. (b) Outgoing interactions of the gene for PV that are stronger than 0.1 in absolute value. Line thickness reflects the strength of interaction

Discussion

Complex interactions between genes and proteins in neurons affect the dynamics of neural networks in the brain. Gene expression values may change due to the internal dynamics of gene regulatory networks or due to external factors, such as hormones, electrical activity, etc. We can expect that different initial gene expression values, and even different gene interactions, can lead to the same outcome in terms of neuronal activity. However, in the diseased brain, altered initial expression values, mutated genes and/or altered interactions within the GRN lead to abnormalities in the network activity. We have presented a simplified approach to modeling these complex relationships. The presented model can be considered a first step, with room for improvement at every level, starting from the dynamics of the GRN and ending with a more sophisticated parameter optimization design. However, one has to bear in mind the huge number of degrees of freedom associated with each level of this multi-level approach. We have illustrated our extremely simplified approach on the case study of simulating the changes in LFP in PV−/− gene knockout mice. Previously, we applied the same methodology to simulate human EEG data (Benuskova et al. 2006). In the present study, we demonstrated that to obtain a match with the experimental data, it is not enough just to manipulate the value of a neuronal parameter that is related to the knockout gene; it is necessary to simulate the underlying GRN and the changes in gene expression that follow when one gene from the GRN, namely PV, is absent. Simulating the effect of a gene knockout with a neural network model in which only neuronal parameters are manipulated would be possible only if there were no underlying interactions between the knocked-out gene and other genes that influence the neural activity in question.

In real neural networks, the neuronal parameters that define the functioning of the network depend on genes and proteins in a complex way. Gene expression values change due to the internal dynamics of the gene regulatory network (which in fact involves proteins, transcription factors, regulatory RNAs, etc.), the initial conditions of the genes and external conditions. All of this may affect, gradually or quickly, the functioning of the neural network as a whole. Realistic models of gene networks within neural networks should account for these processes. Future research should be linked to real gene data obtained, for instance, from microarray experiments, in order to address the following issues:

  1. “What-if analysis”. What happens if one or a few particular genes are erased or mutated (i.e. data are collected with gene knock-out technology)? What happens if interactions within the GRN change? What happens if external factors are included? In such a way our approach can serve as a noninvasive test system.

  2. Exploration of the possibilities of modeling genetically caused brain disorders such as epilepsy, Parkinson’s disease, etc. The goal is to make predictions about gene interactions to aid experimental research on gene interactions in various states and conditions of the brain.

  3. Extension of the approach to model genetic influences upon brain cognitive functions and their disorders. Basic genetic data and some ideas are presented in the last chapter of Benuskova and Kasabov (2007).

There are successful computational models of mental diseases, for instance of impaired associative learning in schizophrenia (Diwadkar et al. 2008), sleep disturbance in autism (Matsuura et al. 2008) or other disorders (Reggia et al. 1999). However, these models should be taken one step further, since recent genetic studies show that serious mental diseases, manifested as compromised cognitive and/or affective status, are linked to gene mutations. Table 3 lists some of the brain disorders that have cognitive symptoms and are attributed mainly to underlying genetic causes. Thus, as a suggestion for further development, the presented general framework of CNGM can be extended to model the dependence of cognitive state upon genes, as presented below:

Table 3 Human brain diseases, gene mutations, brain abnormality and cognitive symptoms

Let the future state of a molecule M′ or a group of molecules (e.g. genes, proteins) be represented as a nonlinear function F m of its current state M and external signals E m

$$ M^{\prime} = F_{\text{m}}(M, E_{\text{m}}) $$
(12)

A future state N′ of a neural network is represented as a nonlinear function F n of its current state N, the state of the molecules M (e.g. genes) and external signals E n

$$ N^{\prime} = F_{\text{n}}(N, M, E_{\text{n}}) $$
(13)

A future cognitive state C′ of the neural network is represented as a nonlinear function F c of its current state C, the neuronal state N, the molecular state M and the external stimuli E c:

$$ C^{\prime} = F_{\text{c}}(C, N, M, E_{\text{c}}) $$
(14)
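The nesting of Eqs. 12–14 can be written as a small functional sketch; the transition functions are deliberately left as placeholders, since the framework does not specify them.

```python
def step(M, N, C, E_m, E_n, E_c, F_m, F_n, F_c):
    """One update of the three coupled levels (Eqs. 12-14):
    molecular state M, neural network state N, cognitive state C."""
    M_next = F_m(M, E_m)          # Eq. 12
    N_next = F_n(N, M, E_n)       # Eq. 13
    C_next = F_c(C, N, M, E_c)    # Eq. 14
    return M_next, N_next, C_next

# Toy usage with trivial placeholder transition functions
M, N, C = step(M=0.1, N=0.2, C=0.3, E_m=0.0, E_n=0.0, E_c=0.0,
               F_m=lambda M, E: M + E,
               F_n=lambda N, M, E: N + M + E,
               F_c=lambda C, N, M, E: C + N + M + E)
print(M, N, C)
```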

The above set of equations is a general one and it can be implemented differently, as mentioned in Benuskova and Kasabov (2007):

  • one gene—one neuron/brain function model;

  • multiple genes—one neuron/brain function, no interaction between genes;

  • multiple genes—multiple neuron/brain functions where genes interact in a gene regulatory network (GRN) and neurons also interact in a neural network architecture;

  • thousands of genes—complex brain/cognitive function/s where genes interact within GRN and neurons interact in several hierarchical neural networks.

These advanced scenarios will probably require the framework of modelling the large-scale dynamics of the brain (Bressler and Kelso 2001; Haken 2007; Freeman 2007; Seth and Edelman 2007), although identifying the variables that may be under genetic influence will be tricky. Alternatively, one can turn to the Blue Brain project (Markram 2006), which is underway to create a biologically accurate, detailed model of the brain using IBM's Blue Gene supercomputer. In the future, information from the genetic level is planned to be added to the algorithms that model individual neurons. The simulations can then be used to explore what happens when genetic information is altered. However, even if it is intractable and practically impossible to include all molecular influences, we believe this avenue of modeling will be further developed in the future. Clearly, the model and results presented in our paper are only a first small step towards this ambitious goal of modeling cognitive neurodynamics using the novel approach of computational neurogenetics.