10.1 Introduction

Systems biology integrates biochemical, genetic and cellular approaches to provide a more comprehensive understanding of higher level processes in living organisms. In this integrated approach, the components of a system are best understood by their relationships within the system as well as with other systems. These interconnected components are referred to as networks. Mathematical modeling of networks is an essential facet of systems biology (Dhurjati and Mahadevan 2008). When studying complex and dynamic interactions, experimental and/or mathematical approaches provide a means to explore and understand the system of networks in question (Dhurjati and Mahadevan 2008). These approaches can also highlight ways in which this system can be manipulated. Systems biology can be used to identify novel pathways implicated in different diseases and to determine the optimal ways of manipulating regulatory networks to treat these diseases.

Network analysis and pathway connectivity approaches in systems biology have provided important insights into the pathogenesis of neurodegenerative diseases (Villoslada et al. 2009). These approaches have led to the identification of novel molecular pathways affected by diseases like hereditary ataxias. Systems biology and mathematical modeling approaches can also be used in the development of therapeutic strategies for neurological diseases. Neurological diseases with a clear genetic etiology, like the pediatric-onset motor neuron disease spinal muscular atrophy (SMA), are particularly amenable to these approaches. To illustrate this application, we describe below how mathematical modeling is being used to understand more thoroughly the regulation of expression of SMN2, an endogenous modifier gene for SMA, and to identify optimal therapeutic targets for this disease.

10.2 Spinal Muscular Atrophy

Proximal SMA is an autosomal recessive early-onset neurodegenerative disease characterized by the loss of α-motor neurons in the anterior horn of the spinal cord which leads to muscle weakness and atrophy (Crawford and Pardo 1996; Kolb and Kissel 2015). Proximally innervated muscles are preferentially affected over distal muscles in SMA. It is a leading genetic cause of infant and early childhood mortality across the world with an incidence of 1 in ~10,000 live births (Pearn 1978; Cuscó et al. 2002; Sugarman et al. 2012). The carrier frequency for SMA ranges from 1:25 to 1:50 (Zaldívar et al. 2005; Labrum et al. 2007; Hendrickson et al. 2009; Ben-Shachar et al. 2011; Su et al. 2011; Sugarman et al. 2012; Lyahyai et al. 2012; Sangaré et al. 2014). While SMA is primarily a disorder affecting motor neurons, other cells are affected by this disease (Shababi et al. 2014). Arrhythmias and other cardiac abnormalities have been observed in mouse models for SMA (Heier et al. 2010; Shababi et al. 2010; Biondi et al. 2012; Shababi et al. 2012). SMA mice have also demonstrated abnormalities in the autonomic and enteric nervous systems (Bevan et al. 2010; Gombash et al. 2015). Loss of insulin-producing β-cells has been observed in the pancreas (Bowerman et al. 2012, 2014). While peripheral organ dysfunction in SMA has been described, it is not yet clear whether or not this dysfunction is a direct result of the disease or a consequence of motor neuron loss and muscle atrophy.

There is a high degree of phenotypic variability within the SMA population. As such, SMA is divided into 5 clinical grades based on age of onset and severity of the disease (Munsat and Davies 1992; Russman 2007). Patients with the more severe forms of SMA (types 0 and I) have a short lifespan and usually die because of respiratory complications arising from weakness in the intercostal muscles. Due in part to better supportive care (Wang et al. 2007), type II SMA patients generally have a life expectancy into early adulthood. Type III SMA patients usually have a normal lifespan but have difficulty walking. Adult-onset type IV SMA patients generally have a fairly benign disease progression.

Most cases of SMA, regardless of clinical grade, result from large-scale deletions within chromosome 5q13.2 and the loss of the Survival Motor Neuron 1 (SMN1) gene (Lefebvre et al. 1995). The SMN gene is duplicated in humans to give rise to SMN1 and SMN2 (Rochette et al. 2001). This duplication event is not perfect in that there are single nucleotide differences between SMN1 and SMN2. The major difference between these two SMN genes is a translationally silent C-to-T transition in exon 7 (SMN2 c.850C > T) (Lorson et al. 1999; Monani et al. 1999). This position in exon 7 lies within an exonic splicing enhancer (ESE) sequence that regulates the inclusion of exon 7 in SMN1 mRNA transcripts (Fig. 10.1). This ESE is disrupted in SMN2 so that most (about 90%) of the SMN2-derived mRNAs lack exon 7 (SMNΔ7) after splicing. The resultant SMNΔ7 protein is unstable and not fully functional (Lorson and Androphy 2000; Burnett et al. 2009; Cho and Dreyfuss 2010). Some SMN2 mRNAs (roughly 10%) contain exon 7, which results in the production of some full-length SMN (FL-SMN) protein from SMN2.

Fig. 10.1

Molecular difference between SMN1 and SMN2 and its effect on splicing. This figure is adapted from (Butchbach and Burghes 2004; Butchbach 2016)

10.3 SMN2 as an Endogenous Genetic Modifier of SMA Phenotype

Since the region of chromosome 5 containing the SMA locus is subject to unequal segmental duplication, SMN1 and SMN2 copy numbers are quite variable in the genome. Numerous studies have demonstrated an inverse relationship between SMN2 copy number and disease severity amongst patients with SMA (reviewed in Butchbach 2016). As a general rule, patients with milder forms of SMA have higher SMN2 copy numbers than severe SMA patients. There are some rare exceptions to this inverse relationship. Some type II and III SMA patients have been shown to harbor only 2 copies of SMN2 (Prior et al. 2009; Vezain et al. 2010; Bernal et al. 2010). These patients carry a rare single nucleotide variant (SNV) in exon 7 (SMN2 c.859G > C) that regulates exon 7 inclusion. SMN2 is, thus, a genetic modifier of disease progression in SMA patients.

The modifier effect of SMN2 is also observed in animal models for SMA. In zSmn (zebrafish orthologue to SMN1) mutant zebrafish, SMN2 extends the survival of mutant larvae and rescues deficits in neuromuscular junction formation in these mutant fish (Hao Le et al. 2011). Transgenic insertion of SMN2 into mSmn (murine orthologue to SMN1) nullizygous mice rescues embryonic lethality (Schrank et al. 1997; Monani et al. 2000; Hsieh-Li et al. 2000; Michaud et al. 2010). SMN2 transgene copy number dictates the severity of the SMA phenotype in these mice. In other words, SMA mice with low SMN2 copy numbers show a severe SMA phenotype (i.e. death within 8 days after birth) while high copy SMN2 SMA mice have no phenotype (Monani et al. 2000; Hsieh-Li et al. 2000; Michaud et al. 2010). SMN2 is, therefore, a major modifier of disease severity in humans as well as in animal models for SMA. These studies also show that SMN2 is an ideal endogenous molecular target for the development of therapies for SMA.

10.4 Regulation of SMN2 Expression by cAMP Signaling

Because of this phenotype-modifying property, SMN2 has been the target for numerous drug discovery strategies (Cherry et al. 2014). Targeting cyclic adenosine monophosphate (cAMP) signaling is of particular interest in developing inducers of SMN2 expression. The cAMP signaling cascade (Fig. 10.2) regulates various cellular processes including gene expression, cell growth, metabolism and stress response (Kleppe et al. 2011). The SMN2 promoter contains at least one cAMP-response element (CRE) that is able to bind to activated CRE-binding protein (phospho-CREB) (Majumder et al. 2004). The β2-adrenergic agonist salbutamol increases the amount of FL-SMN protein in SMA fibroblasts and leukocytes of SMA patients (Angelozzi et al. 2008; Tiziano et al. 2010). Forskolin, which stimulates adenylyl cyclase (AC) catalysis to produce cAMP from ATP, increases SMN2 promoter activity (Majumder et al. 2004). The synthetic analogue dibutyryl cAMP (dbcAMP), which activates cyclic AMP-dependent protein kinase (PKA), also increases SMN2 promoter activity (Majumder et al. 2004). We have recently shown that modulators of cAMP signaling significantly increase the number of gems, subnuclear foci of SMN protein (Liu and Dreyfuss 1996), in fibroblasts derived from a type II SMA patient (Mack et al. 2014). Taken together, these studies show that modulation of cAMP signaling can increase SMN levels from SMN2.

Fig. 10.2

Regulation of SMN2 gene expression by the cAMP pathway. Ligand binds to and activates its membrane-bound G protein-coupled receptor (GPCR), leading to the dissociation of the Gα,s subunit from the GPCR. Gα,s then activates adenylyl cyclase (AC), which converts intracellular ATP into cAMP. cAMP then activates cAMP-dependent protein kinase, also known as protein kinase A (PKA). The catalytic PKA subunit, now freed of its regulatory subunits, phosphorylates cAMP-response element-binding (CREB) protein. Phosphorylated CREB (phospho-CREB) binds to cAMP response elements (CREs) within the promoter region of SMN2. Cyclic nucleotide phosphodiesterases (cnPDEs) diminish cAMP signaling by breaking down cAMP into AMP. This figure is adapted from (Mack et al. 2014)

10.5 Mathematical Modeling of Gene Expression

Mathematical models use mathematical concepts and terminology to describe a biological network using a set of variables and equations to define the relationships between these variables. Mathematical models are initially generated using available experimental data and domain knowledge. Through a process involving multiple iterations, the model assumptions are revised and refined in order to develop improved models that better fit the biological network (Dhurjati and Mahadevan 2008). This adaptability of mathematical models also makes it possible to integrate multiple pathways into a network model.

There are two types of mathematical models, quantitative and logic (Le Novère 2015). Quantitative models are linear representations of quantitative variables over time and can be used to compute concentrations of biomolecules and genes as well as durations of biomolecular interactions and processes. Quantitative models are precise and provide a direct comparison with experimentally-derived quantitative measurements but a priori knowledge of initial conditions and kinetic parameters is required to generate these models. Logic models, on the other hand, use qualitative activities and define phenotypes to compute transitions between two states and stable behaviors, known as attractors. While logic models are easy to generate and to use for simulation experiments, they are not useful for making quantitative predictions and it is difficult to select between multiple attractors. Historically, mathematical models have been designed to be either quantitative or logic; however, newer models which integrate quantitative modules with logic modules are being developed (Ryll et al. 2014).
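To make the notion of an attractor concrete, the following minimal sketch enumerates the attractors of a toy logic model. The four-node wiring (ligand, cAMP, PKA, CREB as a simple linear cascade) and the synchronous update rule are assumptions made purely for illustration; they are not the published cAMP:SMN2 models.

```python
from itertools import product

# Toy logic model of a linear cascade: ligand -> cAMP -> PKA -> CREB.
# The wiring is an illustrative assumption, not a published model.
NODES = ("ligand", "camp", "pka", "creb")


def step(state):
    """Synchronous update: each node copies the state of its upstream activator."""
    ligand, camp, pka, creb = state
    # The ligand is an external input and is held fixed.
    return (ligand, ligand, camp, pka)


def find_attractor(state):
    """Follow the trajectory until a state repeats and return the repeating cycle."""
    seen = []
    while state not in seen:
        seen.append(state)
        state = step(state)
    return tuple(seen[seen.index(state):])


def all_attractors():
    """Collect the distinct attractors reached from every possible initial state."""
    found = set()
    for start in product((False, True), repeat=len(NODES)):
        cycle = find_attractor(start)
        pivot = cycle.index(min(cycle))  # rotate so identical cycles compare equal
        found.add(cycle[pivot:] + cycle[:pivot])
    return found


if __name__ == "__main__":
    for attractor in sorted(all_attractors()):
        print(attractor)
```

In this toy network every one of the 16 initial states settles into one of two fixed points, all nodes off or all nodes on, depending only on whether the ligand is present. Real logic models typically have more attractors, which is where the difficulty of selecting between them arises.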

Gene expression networks can be modeled mathematically using either thermodynamic, differential equation-based or Boolean (probabilistic) approaches (Ay and Arnosti 2011). The selection of modeling approach depends on the type of biological data available (qualitative or quantitative), the nature of the system to be modeled (static vs. dynamic), the level of detail and the scale of the model. Thermodynamic models are generated by factoring in the quality and the arrangement of binding sites for a biomolecule, for example, the binding of transcription factors to their response elements within DNA. Thermodynamic models assume that the system is at a state of equilibrium and, hence, cannot describe the dynamic nature of the system. Differential equation models focus on regulatory interactions where time, state and space are viewed as continuous variables. As a result, differential equation models readily factor in the dynamic nature of the system in question. These models use ordinary differential equations (ODEs) if only one continuous variable, like time, is being factored in or partial differential equations (PDEs) when multiple variables are being factored in. Since ODEs and PDEs can be difficult to solve analytically, differential equation-based models can be hard to implement computationally, especially for larger biological networks. Boolean models represent relationships as one of two possible states, on or off, and can combine qualitative data into a logical structure. Instead of viewing variables as continuous, Boolean models consider time, state and space as discrete variables. While Boolean models are easy to analyze and implement computationally, they can be inaccurate if the system depends on fine details.
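As a minimal sketch of the differential equation-based approach, the code below integrates a two-variable ODE model of gene expression (mRNA and protein) numerically with the forward Euler method. The model structure, the parameter names (k_tx, d_m, k_tl, d_p) and their default values are hypothetical and chosen only to keep the example small.

```python
def simulate_expression(activator, k_tx=2.0, d_m=0.5, k_tl=1.0, d_p=0.1,
                        dt=0.01, n_steps=20000):
    """Integrate a toy transcription/translation model with forward Euler.

    dm/dt = k_tx * activator - d_m * m   (mRNA synthesis and decay)
    dp/dt = k_tl * m         - d_p * p   (protein synthesis and decay)

    All parameter values are illustrative assumptions, not measured rates.
    Returns the final (mRNA, protein) levels.
    """
    m = 0.0
    p = 0.0
    for _ in range(n_steps):
        dm = k_tx * activator - d_m * m
        dp = k_tl * m - d_p * p
        m += dt * dm
        p += dt * dp
    return m, p


if __name__ == "__main__":
    mrna, protein = simulate_expression(activator=1.0)
    print(f"mRNA = {mrna:.3f}, protein = {protein:.3f}")
```

For this linear system the steady state can be written down directly (m = k_tx·activator/d_m and p = k_tl·m/d_p, i.e. 4 and 40 with the default values), which gives a convenient check on the numerical solution. For larger or stiff networks an adaptive solver would normally replace the fixed-step Euler loop.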

In most cases, there are unknown parameters within a mathematical model; as a result, these parameters need to be estimated so as to fit the proposed model to the experimental data. Parameter estimation begins with an initial estimate, and new estimates are iteratively generated so as to minimize the error between simulated and experimental data. An objective function that measures model performance, such as the sum squared error or sum of squares of the residuals between model simulations and experimental data, is used in parameter estimation. More detailed information on parameter estimation approaches can be found elsewhere (Banga and Balsa-Canto 2008; Ay and Arnosti 2011). Parameter estimation is affected by both the structure of the model and the biological system for that model (Ay and Arnosti 2011). In order to assess how the model structure can affect parameter estimation, the effects of changes in parameter inputs on model outputs need to be measured by the process of sensitivity analysis. Local sensitivity analysis focuses on a specific set of parameter values at one point in time or space while global approaches examine the entire model over a range of parameter values. Further detailed information regarding the specifics of sensitivity analysis can be found elsewhere (Ingalls 2008). Sensitivity analysis is essential for building and interpreting mathematical models.
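The two ideas in this paragraph, iterative minimization of a sum-of-squares objective and local sensitivity analysis, can be sketched as follows. The saturating dose-response model, the "true" parameter values and the synthetic data are all assumptions made for illustration, and the simple compass (pattern) search stands in for the more capable optimizers described in the cited reviews.

```python
def model(dose, vmax, k):
    """Hypothetical saturating dose-response curve."""
    return vmax * dose / (k + dose)


def sum_squared_error(params, doses, observed):
    """Objective function: sum of squared residuals between model and data."""
    vmax, k = params
    return sum((model(d, vmax, k) - y) ** 2 for d, y in zip(doses, observed))


def fit(doses, observed, start=(1.0, 1.0), step=1.0, tol=1e-6):
    """Compass search: try +/- step on each parameter, halve step when stuck."""
    params = list(start)
    best = sum_squared_error(params, doses, observed)
    while step > tol:
        improved = False
        for i in range(len(params)):
            for delta in (step, -step):
                trial = list(params)
                trial[i] += delta
                if trial[i] <= 0:  # keep parameters positive
                    continue
                err = sum_squared_error(trial, doses, observed)
                if err < best:
                    params, best, improved = trial, err, True
        if not improved:
            step /= 2
    return tuple(params), best


def local_sensitivity(dose, params, index, rel_step=1e-4):
    """Normalized local sensitivity (p / y) * dy/dp by central differences."""
    up, down = list(params), list(params)
    h = params[index] * rel_step
    up[index] += h
    down[index] -= h
    dy_dp = (model(dose, *up) - model(dose, *down)) / (2 * h)
    return dy_dp * params[index] / model(dose, *params)


# Synthetic, noise-free "data" generated from assumed parameter values.
TRUE_PARAMS = (10.0, 2.5)
DOSES = (0.5, 1.0, 2.0, 5.0, 10.0, 20.0)
OBSERVED = tuple(model(d, *TRUE_PARAMS) for d in DOSES)

if __name__ == "__main__":
    estimate, error = fit(DOSES, OBSERVED)
    print("estimated (vmax, k):", estimate, "objective:", error)
    print("sensitivity to vmax:", local_sensitivity(2.5, estimate, 0))
    print("sensitivity to k:", local_sensitivity(2.5, estimate, 1))
```

The sensitivity function here is local in the sense used above: it is evaluated at one parameter set and one dose. A global analysis would repeat the calculation across a range of parameter values.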

10.6 Overall Strategy for Building Mathematical Models of Gene Expression

Mathematical modeling is an essential component of systems biology but it can appear daunting to biologists, especially those with limited experience in mathematical biology. Figure 10.3 provides a generic workflow for generating and testing a mathematical model for gene expression and cell signaling. This workflow is designed for differential equation-based models of the regulation of gene expression and cellular signaling. When beginning the process of model generation, one of the best sources of information for building mathematical models is the primary literature. There are also some recent reviews which describe the methodological details of generating a mathematical model for gene expression and cell signaling (Zi 2012; MacLean et al. 2016).

Fig. 10.3

Strategy for generating and testing mathematical models of gene expression and cell signaling

The first step in mathematical model generation is to create a comprehensive gene regulatory pathway map using existing biological information. CellDesigner is a convenient tool to graphically represent these pathway maps (Funahashi et al. 2003). CellDesigner uses a standardized set of symbols known as Systems Biology Graphical Notation (SBGN; Hucka et al. 2003) to represent components of a biological network and their relationships (Le Novère et al. 2009). Complex Pathway Simulator (COPASI) is another platform-independent biological simulator program that can be used to generate mathematical models (Hoops et al. 2006; Mendes et al. 2009). The models can then be imported into the mathematical software MATLAB using the Systems Biology Toolbox (Schmidt and Jirstrand 2006).

Once the pathway map has been generated, the kinetics for each reaction in the regulatory pathway must be assigned. The two primary components of a mathematical model, the differential equations and the conservation equations, can now be generated. The differential equations, which generally take the form of ODEs, represent changes in the components of a reaction in response to stimulation. The conservation equations are meant to show the balance between active and inactive forms of a signaling intermediate. The values of all of the parameters within each reaction kinetics equation must also be set from either a priori knowledge or be estimated using an objective function as described in the previous section (Banga and Balsa-Canto 2008; Ay and Arnosti 2011).
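As an illustration of how these two kinds of equations fit together, consider a generic signaling intermediate X that is converted to its active form X* by an upstream signal S. The sketch below assumes simple mass-action kinetics; the symbols X, S, k_on and k_off are placeholders chosen for this example and are not taken from any published model.

```latex
\begin{align}
  \frac{d[X^{*}]}{dt} &= k_{\mathrm{on}}\,[S]\,[X] - k_{\mathrm{off}}\,[X^{*}]
      && \text{(differential equation)} \\
  [X]_{T} &= [X] + [X^{*}]
      && \text{(conservation equation)} \\
  \frac{d[X^{*}]}{dt} &= k_{\mathrm{on}}\,[S]\,\bigl([X]_{T} - [X^{*}]\bigr) - k_{\mathrm{off}}\,[X^{*}]
      && \text{(after substitution)} \\
  [X^{*}]_{\mathrm{ss}} &= [X]_{T}\,\frac{k_{\mathrm{on}}\,[S]}{k_{\mathrm{on}}\,[S] + k_{\mathrm{off}}}
      && \text{(steady state)}
\end{align}
```

The conservation equation removes the inactive form [X] as an independent variable, so only one ODE per intermediate has to be integrated, and only the rate constants and the total concentration [X]_T need to be set or estimated.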

Simulations for the mathematical models can be completed once the parameters have been set and the initial concentrations of signaling components are estimated or determined from the literature. The robustness of a biological model can be assessed with sensitivity analysis as described in the previous section (Ingalls 2008; MacLean et al. 2016). For a model to be considered robust, its outcomes must not be markedly affected by perturbations of the parameters or initial concentrations. With a robust model, the effects of altered expression of a component on the outcome, i.e. expression of the target gene, can be measured and future biological experiments can be designed with the assistance of mathematical models.
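One simple way to probe robustness in this sense is to perturb all inputs at once within a fixed range and record how far the output moves. The sketch below does this for a toy steady-state expression model; the model, the nominal values, the ±10% range and the sample count are all hypothetical choices, and what counts as "markedly affected" remains a judgment for the modeler.

```python
import random


def steady_state_protein(p):
    """Toy output: steady-state protein level of a linear expression model."""
    return p["activator"] * p["k_tx"] * p["k_tl"] / (p["d_m"] * p["d_p"])


def perturbation_scan(output_fn, nominal, fraction=0.10, n_samples=1000, seed=0):
    """Jointly perturb every input by up to +/- fraction and report the
    smallest and largest output relative to the unperturbed baseline."""
    rng = random.Random(seed)  # fixed seed so the scan is reproducible
    baseline = output_fn(nominal)
    ratios = []
    for _ in range(n_samples):
        sample = {
            name: value * rng.uniform(1 - fraction, 1 + fraction)
            for name, value in nominal.items()
        }
        ratios.append(output_fn(sample) / baseline)
    return min(ratios), max(ratios)


# Illustrative nominal values only.
NOMINAL = {"activator": 1.0, "k_tx": 2.0, "k_tl": 1.0, "d_m": 0.5, "d_p": 0.1}

if __name__ == "__main__":
    low, high = perturbation_scan(steady_state_protein, NOMINAL)
    print(f"output ranges from {low:.2f}x to {high:.2f}x of baseline")
```

Because this toy output is a pure product and quotient of its inputs, perturbations compound rather than cancel, so it is a deliberately non-robust example. Models with feedback or saturation would typically show a narrower range.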

10.7 Mathematical Modeling of SMN2 Regulation by cAMP Signaling

A systems biology approach can be used to investigate SMN2 gene regulation. We recently developed mathematical models to characterize the regulation of SMN2 expression by cAMP signaling (Mack et al. 2014). This approach is based on additive interactions between experimental data and mathematical models. We focused on SMN2 regulation by cAMP signaling because there is ample evidence in the literature showing that activation of cAMP signaling increases SMN2 expression (Majumder et al. 2004; Angelozzi et al. 2008; Tiziano et al. 2010). The experimental data for these mathematical models were obtained from gem assays (gems being a marker for SMN localization within the nucleus) in type II SMA fibroblasts because the reduction in gems correlates with SMN protein expression and SMA severity (Coovert et al. 1997). This assay has also been used in multiple studies identifying compounds which increase SMN expression (Andreassi et al. 2001, 2004; Sumner et al. 2003; Lunn et al. 2004; Grzeschik et al. 2005; Jarecki et al. 2005; Riessland et al. 2006; Mattis et al. 2006; Novoyatleva et al. 2008; Thurmond et al. 2008; Xiao et al. 2011). These gem-inducing agents were validated by immunoblot or enzyme-linked immunosorbent assays (ELISAs).

The cAMP signaling treatment data were used to generate two distinct mathematical models, the full cAMP:SMN2 and alternate cAMP:SMN2 models (Fig. 10.4) (Mack et al. 2014). The full cAMP:SMN2 model (Fig. 10.4a) factors in the effect of CREB activation on SMN2 transcription. As some groups have suggested that cAMP signaling regulates SMN2 expression post-transcriptionally by influencing FL-SMN protein stability (Burnett et al. 2009; Harahap et al. 2015), an alternate cAMP:SMN2 model (Fig. 10.4b) was generated. Both models are extensions of a cAMP signaling mathematical model in yeast (Williamson et al. 2009). The models contain ODEs as well as conservation equations. The full cAMP:SMN2 model contains seven ODEs and three conservation equations while the alternate cAMP:SMN2 model contains five ODEs and two conservation equations (Mack et al. 2014). Simulated data from both models match the experimental gem data, showing that either model is valid. When these two models were combined, however, the resultant simulated data did not fit well with the experimental data, suggesting that only one model correctly recapitulates the effect of the cAMP signaling cascade on SMN2 expression. Since the experimental data used to generate these models were fixed at one point in time, it is currently not possible to assess which mathematical model more accurately simulates cAMP signaling-dependent regulation of SMN2 expression. Future studies examining various facets of SMN2 regulation, including gem formation, will allow better refinement of and distinction between these two models.

Fig. 10.4

Mathematical models for modulation of SMN2 expression by cAMP signaling. Schematic representations of the full cAMP:SMN2 (a) and alternate cAMP:SMN2 (b) models for the regulation of SMN2 expression by cAMP signaling. This figure is adapted from (Mack et al. 2014)

10.8 Conclusions and Future Directions

Regulation of SMN2 expression by cAMP signaling can be modeled mathematically. The regulation of SMN2 by cAMP signaling is complex, multi-faceted and not completely understood. SMN is directly phosphorylated by PKA in vitro (Burnett et al. 2009; Wu et al. 2011). The interactions between SMN and other components of the core SMN macromolecular complex may be dependent upon PKA-dependent phosphorylation of SMN. PKA phosphorylation of SMN protein could not be factored into either mathematical model because the effects of PKA phosphorylation of SMN on its function and localization are not yet known. If PKA phosphorylation impacts SMN function, then this component of cAMP signaling can be factored into refined mathematical models of cAMP signaling and SMN2 expression.

Another facet of gene regulation is the impact of other signaling pathways on the target pathway. For example, numerous extracellular stimuli including activation of ionotropic glutamate receptors, exercise and inhibition of insulin-like growth factor I receptor (IGF1R) increase SMN expression in the spinal cord by AKT-mediated phosphorylation of CREB (Biondi et al. 2010, 2015; Branchu et al. 2013). CREB is regulated by the protein serine/threonine phosphatase 2A (PP2A) (Wadzinski et al. 1993). Protein serine/threonine phosphatase inhibitors like cantharidin and tautomycin have been shown to increase SMN2 expression (Novoyatleva et al. 2008; Zhang et al. 2011). These natural product inhibitors may act through suppression of CREB dephosphorylation and, as a result, activation of the SMN2 promoter. As new insights are gained as to how these other intracellular pathways affect CREB-mediated activation of SMN2 expression, the intersection of AKT and PP2A with cAMP signaling can be integrated into current mathematical models of SMN2 expression.

In addition to identifying the optimal component of the cAMP signaling pathway responsible for regulating SMN2 expression, mathematical models can be used to predict the effects of drug combinations. For example, activation of AC by forskolin can act in concert with cyclic nucleotide phosphodiesterase (cnPDE) inhibition by rolipram to additively increase gem formation, as predicted mathematically (Mack et al. 2014). Once validated experimentally, mathematical modeling can be used to design combination strategies that target different parts of a signaling cascade, in this case cAMP signaling. Furthermore, drug discovery efforts have identified numerous small molecule activators of SMN2 expression that operate either by increasing SMN2 transcription or by modulating alternative splicing to increase the proportion of FL-SMN transcripts (reviewed in Cherry et al. 2014). As the molecular targets and signaling pathways affected by these small molecules are identified, parallel mathematical models can be generated for each pathway as it relates to SMN2 gene regulation. These pathways can then be integrated so as to create a comprehensive mathematical model for SMN2 gene regulation. This comprehensive model can be used to predict which pathways could be modulated synergistically in order to maximize SMN2 upregulation, which will drive the development of combination therapeutic strategies for SMA.
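When comparing simulated or measured combination responses, it helps to state explicitly what "additive" means. The sketch below uses response additivity, one common definition assumed here for illustration rather than taken from the cited study: the combination is additive when its gain over baseline equals the sum of the single-agent gains. The function name and the numbers in the usage example are made up.

```python
def additivity_index(baseline, drug_a, drug_b, combination):
    """Ratio of the observed combination gain to the sum of single-agent gains.

    An index near 1 indicates an additive combination, above 1 a
    greater-than-additive (synergistic) one, below 1 a less-than-additive one.
    """
    expected_gain = (drug_a - baseline) + (drug_b - baseline)
    if expected_gain == 0:
        raise ValueError("single agents produce no net change; index is undefined")
    return (combination - baseline) / expected_gain


if __name__ == "__main__":
    # Made-up response values (arbitrary units) for illustration only.
    print(additivity_index(baseline=10.0, drug_a=15.0, drug_b=18.0, combination=23.0))
```

Other reference models, such as Bliss independence or Loewe additivity, define the expected combination effect differently and can give a different classification for the same data, so the choice should be stated alongside any prediction.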

In conclusion, mathematical modeling is a systems biology approach that can be used to understand how gene expression can be regulated by a signaling pathway. This approach has recently been applied to the regulation of SMN2 expression by cAMP signaling. This systems-based mathematical modeling approach can ultimately aid in the development and optimization of cAMP signaling-based therapies for SMA. A similar approach could also be used for other molecular pathways that regulate SMN2 expression. Mathematical models of these individual pathways regulating SMN2 expression can then be integrated to create a more comprehensive model of SMN2 gene regulation. Furthermore, mathematical modeling can be applied to other neurogenetic diseases wherein modifier genes, like SMN2 for SMA, have been identified.