Key words

1 Introduction

Diseases of aging, such as cancer and neurodegeneration, are complex and multifactorial, involving many genetic and life course modifiers. As more evidence becomes available, many links between different diseases of aging are becoming apparent [1], such as the roles of cell cycle proteins in cancer and neurodegeneration [2] or the contributions of Alzheimer’s disease (AD) related and cardiovascular related genes in both normal aging and neurodegeneration [3]. Systems biology, a field that aims to integrate data from diverse biological areas, is becoming an essential tool to investigate processes relating to initiation and progression in complex disease. AD is the most common form of dementia associated with aging and is increasingly being accepted as a complex multifactorial neurodegenerative syndrome. AD can be used as a case study to investigate the application of systems biology to complex molecular disease pathways and relate these to brain behavior and ultimately treatment strategies.

2 Overview of Alzheimer’s Disease (AD)

AD is characterized clinically by memory loss, cognitive impairments and dementia [4, 5]. These symptoms lead to impairments in activities of daily living with the result that individuals with AD require an increasing degree of support and care as the disease progresses. Neuropathologically, the hallmarks of AD include intracellular neurofibrillary tangles (NFT) composed of paired helical filaments of the microtubule associated protein tau, extracellular senile plaques containing aggregated amyloid-beta-protein (Aβ) and neuritic plaques and dystrophic neurites that are tau reactive and are also often associated with aggregated Aβ [6, 7].

The importance of the amyloid precursor protein (APP) proteolytic system to dementia initiation and progression in AD is highlighted by both neuropathological and genetic evidence. Various mutations within APP and the γ-secretase associated Presenilin (PS) genes, PS1 and PS2, are associated with early onset familial Alzheimer’s disease (FAD) [8]. The genetic data is further linked to disease progression by the deposition of the Aβ, a proteolytic fragment of APP, in neuritic and senile plaques. Additionally, the deposition of Aβ in the brain vasculature as congophilic amyloid angiopathy (CAA) is common in AD and may have independent effects on cognitive function [9, 10]. For late onset AD, accounting for >95 % of cases, the genetic contributions to disease are estimated to be between 48 and 79 % [11, 12] and include contributions from genes such as ApoE [13], CLU and PICALM [14] and CR1 [15] amongst others (reviewed in [16, 17]). Lifestyle modifiers that may contribute to dementia risk include education [18], exercise [19] and diet [20].

The relationship between neuropathology and cognitive status is not straight forward [21]. While considered as neuropathological hallmarks of AD, clinicopathological population studies show that the relationships between various neuropathologies, age and dementia status are complex [22] and that very few “pure” AD cases exist [23]. Population studies of the aging brain commonly find the neuropathological hallmarks of AD in cognitively normal individuals, albeit generally at lower severities, and demented individuals may show little neuropathology [21, 22, 24]. This raises questions around how these neuropathologies, and the neurochemistry associated with them, contribute to disease initiation and progression and how AD is defined both clinically and neuropathologically. If the aim is to devise treatment strategies, where some medication may alleviate or prevent the clinical manifestation of dementia, then the relationships between the human genome, (the complete set of genetic material in a cell), the transcriptome, (the entire collection of gene transcripts both destined to be expressed as proteins and as regulatory elements), the proteome, (the complete set of expressed proteins in a specific cell type), the interactome, (the complete set of molecular interactions in a cell), the functional brain connectome, (the complete set of neural and synaptic connections in the human), and the whole body within its ever-changing environment must be elucidated. Computational models can be a tool to investigate these relationships and how they change due to disease.

3 Basic Background for Biomolecular Networks

Molecular pathways are dynamic functional systems involving multiple players often with complex regulatory systems involving both direct and indirect feedback loops. Flow of biological information through these pathways can be represented as computational networks based on molecular communication theories [25]. Within a cell as a whole, the probability that an interaction or biological reaction will occur between specific molecules and not others depends on many factors including, compartmentation, relative affinity, concentration, half-life, protein modifications, the presence of co-factors, and the formation of biologically active protein complexes.

3.1 Compartmentation

A cell is divided into compartments and forms organized structures that allow cellular processes to occur in a controlled way. Organelles, such as the nucleus, endoplasmic reticulum and mitochondria, isolate specific cellular processes within semi-permeable membranes that concentrate components of a particular cellular process and increase the chance that they will combine. Compartmentation also isolates reactions that would otherwise be deleterious for the whole cell, such as lysosomal reactions involved in the breakdown of proteins tagged for destruction. Within organelles, specific compartments can be defined by further interactions between factors, such as relatively rigid cholesterol-rich lipid raft areas within a more fluid phospholipid membrane. In order to maintain cellular compartments, the cell must express all the various components in the correct place and at the appropriate time and this involves the complex process of cellular trafficking.

3.2 Relative Affinity

The relative affinity of one protein for another contributes to the probability that they will react and this affinity depends on shape and charge distribution which ultimately depend on the amino acid sequence and protein folding. Protein shape and charge distribution are altered by the protein modifications described below and by many other factors including pH, metal ion binding and interactions with other cellular molecules.

3.3 Concentration

The concentration of the active form of a protein depends on many factors including gene expression, protein synthesis, protein modification, trafficking and storage mechanisms and protein degradation amongst others. Concentration is usually tightly regulated and over- or under- expression of active proteins can be disruptive to normal cellular processes.

3.4 Half Life

The rate at which a protein is synthesized and degraded is its turnover and this is characterized by its half-life, i.e. the time it takes for half the amount of a particular protein to be degraded. The length of time a protein is active and available can contribute to the likelihood that it will be involved in a cellular reaction. The concentration of a protein with a short half-life is more easily manipulated by the cell.

3.5 Protein Modifications

After translation, proteins are often processed and/or modified before achieving an active form and more than 200 different types of modification are known [26]. Modifications can be permanent or transient. Permanent modifications include proteolytic processing, where an immature protein, such as immature PS, requires cleavage to attain its active form [27, 28]. Transient and reversible enzymatic modifications are fundamental to the regulation cellular processes and include (1) glycosylation, the addition of sugar groups, (2) phosphorylation and dephosphorylation, the addition and removal of phosphate groups and (3) acetylation and deacetylation, the addition or removal of acetyl groups. Phosphorylation and dephosphorylation in particular form a major mechanism by which cells can switch processes on or off or change the flow through a biochemical pathway. Additionally, proteins may be modified non-enzymatically by metabolites, e.g. the modification of various lysine residues by the glycolytic metabolite 1,3-bisphosphoglycerate [29].

3.6 Co-factors

Co factors are molecules or ions that are required for biological functions or reactions to occur. For many proteins, metal ions are central to their mechanism of action. For example, the N-methyl d-Aspartate (NMDA) glutamate receptor allows calcium ions into a neuron when both electrical and neurotransmitter signals are received. The Ca2+ channel is normally blocked by Mg2+. This block is removed briefly when a previous electrical signal changes the electrical potential of the membrane surrounding the NMDA glutamate receptor. If glutamate binds at this time, the calcium channel opens to allow Ca2+ ions into the cell. With no change in electrical potential, glutamate binding cannot open the channel. In effect, Mg2+ contributes mechanistically to the way the NMDA receptor senses coincidence in electric and neurotransmitter signals and this process contributes to one mechanism of synaptic plasticity. Other examples of co-factors include small molecules such as vitamins which are often involved in enzyme reactions as part of the chemical process.

3.7 Protein Complexes

The formation of tightly associated proteins within large complexes is often required for biological activity. An example of this is the endopeptidase γ secretase complex, discussed later, where at least four different proteins are required to form an active enzyme [30]. These include one of the presenilins, either PS1 (UniProt P49768) or PS2 (UniProt P49810), which forms the catalytic core and the proteins Pen-2 (UniProt Q9NZ42), nicastrin (UniProt Q92542) and APH-1 (UniProt Q96BI3) that may contribute to the activation of the protein complex and regulate how the complex interacts with its various substrates [31].

3.8 Environmental Factors

In addition to processes regulated by the cell via gene and protein expression, features such as temperature, pH or redox state associated with the cellular environment may also affect the likelihood of a reaction, for example pH modulates Aβ aggregation [32, 33] and oxidative stress may increase Aβ production and also be increased by Aβ [34].

3.9 Describing Protein Interactions

The properties of affinity and concentration for active forms of a protein in relation to its biological outcomes can be illustrated by dose response curves (Fig. 1). Further, interactions such as enzyme reactions can be described by various kinetic constants such as the affinity constant K (a), the catalytic efficiency K (cat), maximal reaction velocity V (max) and K m, an inverse measure of affinity defined as the amount of substrate at half V max. These values are calculated from experimental data using equations such as the Michaelis-Menten equation [35] and associated variations. The basic biochemical properties should be captured in any mechanistic model of a molecular pathway. Some pathways will be more complex than others but most will feature these properties in regulatory mechanisms. It must be remembered that molecules and signaling pathways in different cell types may be associated with different functions and these may also vary between species making a generally applicable model of any one molecular pathway impossible.

Fig. 1
figure 1

A generalized dose response curve. Where the concentration of an active protein is very low, the probability that it will interact with its target is also very low and any associated biological outcome will be minimal (a). As concentration increases towards a physiologically relevant range, the high affinity biological outcome will also increase (b). At a certain point the system is maximally active and any further increase in protein concentration will not increase the high affinity biological outcome as other features of the system may be rate limiting and the biological outcome reaches a steady state (c). At increasing concentrations of the active protein, other pathways may become more relevant as the chances of lower affinity reactions increase (d); other features of the lower affinity systems may be rate limiting for the relevant biological outcomes which will reach a steady state. At very high concentrations, there are increased chances of aberrant or inappropriate reactions/interactions between the active protein and other pathways with which it would not normally associate (e), and these may not be rate limited

4 Networks and Their Analysis as Tools to Investigate Complexity in Molecular Pathways

One approach to teasing apart the complexity of molecular pathways is to model molecular interactions as networks to describe and characterize the complex relationships and components within and between pathways. A molecular system can be represented as a graph in the form of a collection of nodes (objects) and edges (relationships). The functional relevance of nodes and edges can be described by assigning various attributes derived from the molecular system in question.

Nodes can be used to represent molecules and annotations can represent the various factors such as concentration, affinity and compartmentation. Edges can be either directed, specifying a source (starting point) and a target (endpoint), or non-directed. Directed edges are suitable for representing flow while non-directed edges are used to represent mutual interactions. Mixed graphs contain both directed and undirected edges and have various sets of relations.

A network of molecular relationships can be built in several ways. One way is to iteratively search literature databases using keywords relevant to the system being investigated [36]. An iterative procedure can be used to develop the search strategy, with input from clinician advisors, neuropathologists, information specialists etc. A search of PubMed (28 August 2013) for the keywords systems biology AND Alzheimer disease retrieved 183 results and the increasing number of references over time indicates that the application of systems biology to AD research is of increasing importance. Not all of these references will be relevant and manual curation will be required. A search of PubMed (28 August 2013) using the MeSH terms (“Systems Biology”[Mesh]) AND “Alzheimer Disease”[Mesh] retrieved 24 results, with some relevant references missing. A comprehensive search of several bibliographic databases as well as hand searches of key journals would also need to be undertaken to ensure all literature would be identified. All titles and abstracts should be screened by two independent reviewers and a third reviewer would resolve any disagreements about inclusion. This underlines the importance of a reliable and repeatable search strategy.

Once a collection of papers has been generated, there are various ways to filter these results to obtain only those papers of interest, involving either automated text search, human search of abstracts or both. Using this approach, networks can be built based on the information available, analyzed and then used to generate questions for further experimentation.

It must be remembered that any defined literature search, whilst being reproducible, may not retrieve all the papers of interest and a manual search of paper references may be required until no more useful references are found. Specific molecules in older literature may not be named in a standard way and in one network construction study [37], two APP interacting proteins were excluded as they could not be identified with certainty due to inconsistent naming. Additionally, only information that is published is available, leading to an unquantifiable bias in network construction due to missing information and this has important consequences for the analysis and interpretation of any resultant molecular network.

Molecular interactions can also be extracted from databases such as those listed in Table 1. While each database may be slightly different, there are now systematic ways to query such databases and extract relevant information in standard formats [38]. However, these databases are built from the existing literature and will therefore share the unquantifiable bias due to missing information. Automated methods of text searching are often used in database construction as they can be fast and repeatable. However, automation can lead to errors of misclassification and manual curation is used in most databases to minimize this. Manual curation can also lead to errors which must be repaired when found.

Table 1 Examples of molecular pathway and interaction databases

Most molecular databases are built using data from a variety of sources and are annotated with the experimental system from which the data were derived; this generally includes the species, whether in-vivo or in vitro and the exact method used, such as co-immunoprecipitation, various gene [39, 40] and protein [41, 42] expression systems or co-migration in sodium dodecyl sulfate–polyacrylamide gel electrophoresis (SDS-PAGE), all methods have strengths and weaknesses.

The studies listed in Table 2 have approached map construction in different ways, using different combinations of protein-protein interaction (PPI) databases, with different literature searching protocols and different inclusion or exclusion criteria. The networks generated in these studies do not always correspond and different studies highlight different pathways or biological processes, e.g. Fe2+ [43], apoptosis [44], or cardiovascular disease/diabetes [3]. Each study has different starting points, inclusion/exclusion criteria and network construction methods, so this lack of agreement is to be expected. It is difficult to assess the degree to which the various starting points, criteria and network construction methods bias results towards an outcome. For example, the study by Soler-Lopez et al. [45] may not represent the interactions of full length APP in the membrane adequately, as many of the extracellular matrix (ECM) proteins that might be expected to interact with APP are excluded due to difficulties involved in expressing them in the experimental microarray used. This may shift the focus of their network more towards intracellular interactions. Given the importance of the various interactions of APP with components of the ECM (see Fig. 2), any network excluding such proteins and proteoglycans could be seriously confounded and any findings would have to be interpreted carefully. Additionally, the correspondence between gene expression as mRNA and viable functional proteins within a cell is not absolute, varying from 9 to 87 % depending on which genes are investigated [46].

Table 2 Protein–Protein interaction (PPI) network studies (adapted from [36])
Fig. 2
figure 2

A simplified view of selected interactions of the APP proteolytic system (adapted from [36]) Nodes represent molecules or molecular assemblies and interactions between them as arrows. Some complex interactions have been collapsed into general processes shown in grey. Multiple sequence variants and conformations of APP and Aβ have been collapsed into a single node for each. Aβ, amyloid beta protein; ADAM, a disintegrin and metalloproteinase domain-containing protein; AICD, APP intracellular domain; APP, amyloid precursor protein; BACE, beta-site amyloid precursor protein cleaving enzyme; CD74, HLA class II histocompatibility antigen gamma chain; CTF, carboxy- terminal fragment; ECM, extracellular matrix; Fe65, Amyloid beta A4 precursor protein-binding family B member 1; LTP, long-term potentiation; PKA, protein kinase A; PKC, protein kinase C; sAPP, secreted amyloid precursor protein; Tip60, Histone acetyltransferase KAT5. With permission from BioMed Central (part of Springer Science + Business Media) under the Open Access License Agreement (http://www.biomedcentral.com/about/license)

Molecular networks built from PPI databases or literature searches do not explicitly take into account differences between cell types arising through the processes of differentiation during development which can lead to different susceptibilities of different cell types to neuropathology, such as the well-recognized difference in susceptibility to tau reactive NFT pathology of Ca4, Ca3, Ca2 and Ca1 neurons in the hippocampus as reflected in Braak Staging [47]; a widely accepted semi-quantitative measure of NFT pathology. Different cellular systems may have very different functions depending on cell type: an example of this is the way many cell cycle proteins, involved in regulating cell proliferation, are involved in synaptic plasticity in non-proliferative neurons [2]. The differences between cell types potentially undermine many of the current network approaches, especially where different experimental systems have been used to generate interaction data. Ideally there should be a database for each cell type, and for the brain this would need to include different neuron types as not all neurons necessarily share similar signaling and interaction pathways.

A major problem with all molecular map type networks is their inability to include dynamic information relating to the way molecular networks are regulated in living systems. Transient protein modifications, such as phosphorylation, regulate molecular interactions and are central to cellular function are not easily captured, for example, differential phosphorylation of the tyrosine (tyr) residues Tyr682 and/or Threonine Thr668 of the APP695 cytoplasmic domain regulates many interactions with small binding proteins and kinases [48]. Other dynamic processes that may not be fully represented include transient changes in gene expression via epigenetic mechanisms, changes in protein expression via RNA interference, responses to environmental perturbations such as infection and activity lead changes, such as the up-regulation of synaptic proteins in response to synaptic activity.

Inter-species differences in the way cellular signaling systems are organized, especially in the brain, are well recognized [49, 50] and this should be taken into account when designing animal disease models and building networks. Miller et al. [51] confirm this in the comparison between human and mouse networks, revealing an additional function of PS in oligodendrocytes and myelination in humans that is not seen in the mouse. Given the association of PS mutations in FAD, this difference is likely to impact on the suitability of the mouse as a model for AD.

The development of animal models that represent AD disease processes in humans is crucial in the search for effective therapeutic interventions. Early transgenic mouse models did not completely replicate the neuropathology associated with human disease [52] nor the more fundamental aspects of Aβ biochemistry in humans [50]. Attempts to fully represent AD in humans are on-going with the development of new animal models that can be used to investigate the links between various features of AD. Using multiply transgenic animal models allows the investigation of molecular interactions and signaling pathways involved in different aspects of the disease in a way not possible in humans. For example, the TgF344-AD rat [53] displays oligomeric Aβ species and plaque pathology, tau pathology, behavioral change and neuronal loss, combinations not always present together in other animal models, and this model can be used to study the connections between Aβ and tau pathologies. Different animal models may be used to highlight different aspects of human disease, such as the association between Aβ and cholesterol metabolism in the triple transgenic mouse model over-expressing the sterol regulatory element-binding protein-2 [54] or the relationship between age and cognitive decline in the senescence-accelerated mouse [55].

The success of all animal models depends on being comparable to disease presentation in humans, and this is where the main problems lie. The characteristics of AD in humans are constantly being updated as new disease processes and pathologies are found. Disease processes, such as hippocampal sclerosis [7], or other pathologies such as the Tar-DNA binding protein of 43 kDa (TDP-43) [56] may independently contribute to cognitive status and are yet to be fully characterized in the human population. Population studies highlight the existence of multiple pathologies including contributions from the vascular system in the development of Alzheimer- like dementia in the aging population, with relatively few cases of “pure” AD [23, 57]. Additionally, the relationship between age, neuropathology and disease is not straight-forward, with many pathologies showing an age related distribution [22].

Fresh human brain tissue that may be of use in functional studies is extremely rare and to a great extent, interaction databases rely on various animal, cell culture and in vitro based models, all of which have yet to be fully characterized with respect to the normal human system. If only animal or cell based systems are used as experimental models, functions that are human specific could be misrepresented or missed completely. The full range of pathologies associated with age and AD in humans still remains to be replicated in any animal model.

5 APP: A Dynamic and Complex Proteolytic System

A review of the complexity of the APP proteolytic system has been described [36]. In summary, APP is a member of a wider family of similar proteins that also includes the APP like proteins (APLP)1 and APLP2 that have significant functional redundancy [58] complicating investigations. It is expressed in various isoforms due to mRNA splicing, with APP695 being expressed predominantly in the brain and linked to amyloid deposition. It is a type I, single pass transmembrane protein with diverse functions including associations with cell differentiation [59], neurite outgrowth [60, 61], cell adhesion [62], synapse formation, maintenance and plasticity [62, 63] and many cell signaling pathways [36, 64, 65] including apoptosis [66]. APP is post-translationally glycosylated [67] and phosphorylated [48] at various residues and these modifications may contribute to the regulation of the various APP functions and proteolytic pathways.

Full length APP has a large N-terminal domain that interacts with various components of the ECM including heparin and other proteoglycans [68, 69], other proteins such as reelin [70], DAB1 [71] and also forms homodimers regulated by heparin and Zn2+ [72]. The transmembrane region has been implicated in the process of homodimerization and also interacts with various proteins including Notch [73]. The C-terminal domain of full length APP also interacts functionally with a variety of proteins including FE65 [74], the low density lipoprotein receptor protein (LRP) [75, 76], a variety of small binding proteins [48, 77] and several kinases [48, 78, 79] that phosphorylate the residues Y682 of the binding and signaling sequence GY682ENPTY and T668 of APP695 [48, 79, 80]. Phosphorylation regulates the interaction of the C-terminal domain with other proteins [48, 77], may modulate proteolytic processing [80] and allows cross talk between diverse cellular systems [48].

Full length APP can remain at the cell surface, be recycled via endocytosis or proteolytically processed and has a high turnover, with a half-life ranging from 30 min [76, 8183] to 4 h [8486]. Unprocessed APP is degraded or recycled via the endosomal or lysosomal pathways and may be recycled back to the membrane and processed within ~30 min [82], with perhaps one third to one half being processed via the cleavage pathways as measured by secreted sAPPα/β [82]. APP is proteolytically processed to more than 40 fragments [87]. There are two main cleavage pathways, α- and β- pathways that then converge on a shared γ-cleavage, summarized in Fig. 2. These cleavages have been well reviewed [88, 89]. Additional cleavage pathways (not shown in Fig. 2) include caspase cleavages producing an alternative C-terminal cytoplasmic fragment C31 that is associated with apoptosis [90, 91] and the alternative cleavages by β-site APP cleaving enzyme (BACE)1, 11 residues within the Aβ sequence [88, 92] leaving a membrane bound fragment C88 and BACE2 at the θ-cleavage site between the phenylalanine residues, F615 and F616 of APP695 downstream of both the Aβ and P3 cleavage sites, producing a membrane bound fragment C80 [93].

5.1  α Cleavage

α-cleavage occurs between residues Lys612 and Leu613 within the Aβ sequence of APP695, releasing the N-terminal sAPPα and leaving a membrane bound C83 C-terminal fragment [88]. α-secretase activity has been observed by several membrane-anchored zinc-dependent metalloproteinase enzymes including A Disintegrin and Metalloproteinase (ADAM)9, ADAM10, ADAM17 [9496] and possibly the matrix metalloproteinase (MMP)9 [97]. α-cleavage is both constitutive and regulated, with the various ADAMs responding in different ways depending on many factors [95, 98]. In addition to APP, α-secretases also cleave alternative substrates such as Notch [99], pro-TNF-α and the epidermal growth factor receptor [100] which may lead to competition between different pathways with consequences for many cellular processes including development, synaptic plasticity and the cell cycle and cancer [96, 100, 101]. How the balance between these alternative pathways is regulated is not known.

The soluble N-terminal fragment released by α-cleavage, sAPPα, retains two heparin binding sites and has been shown to bind heparin as a dimer [102]. The ability of sAPPα to disrupt APP dimerization at the cell surface may contribute to its neuroprotective actions [103105] and may partly explain why sAPPα is ~100× more neuroprotective against excytotoxicity, glucose deprivation and the addition of Aβ in hippocampal cultures than sAPPβ, which lacks the second C-terminal heparin binding site [104]. Additionally, neuroprotective actions of sAPPα may be mediated by its antagonism of stress signaling by the JNK stress signaling pathway [106]. Dementia status has been associated with both reduced sAPPα levels in human CSF [107] and an increased half-life of sAPPα [86] in transgenic mice, however, as yet, there has been no systematic study of the α-pathway proteolytic fragments in the human population.

5.2  β Cleavage

β cleavage occurs between residues Met596 and Asp597 of APP695 within the second heparin binding site, releasing the N-terminal sAPPβ from the membrane bound C99 C-terminal fragment [88, 92]. Two membrane bound aspartyl proteases are associated with β-cleavage, BACE 1 and to a lesser extent, BACE2 [88, 92]. Additionally, Cathepsins D and B have shown β-cleavage activity to release Aβ [108]. BACE1 and BACE2 are differentially regulated and have different functions [109]. In addition to APP, BACE1 may also cleave alternative substrates including APLP1 and APLP2 [110] and P-selectin glycoprotein ligand-1 [111]. Heparin and heparin sulfates may be involved in regulating APP cleavage by BACE1 [112]. In addition to interactions with sAPPα and APP, the large soluble sAPPβ fragment may be associated with apoptotic signaling and axonal degeneration via the death receptor DR6 and caspase6 [113], though the interactions of sAPPβ are not fully characterized and require further detailed investigation.

5.3  γ Cleavage

Cleavage of APP by the γ-secretase complex occurs within the membrane to release the variable length 38–46 residue Aβ peptide following β-cleavage, the variable length 21–29 residue P3 (Aβ17-X) fragment following α-cleavage, with both pathways releasing the APP intracellular domain, (AICD) [8, 88, 114, 115]. There is some uncertainty as to how γ-cleavage occurs; γ-secretase cleavage may occur via successive ζ and ε cleavages producing progressively shorter Aβ fragments [116118], though there may also be distinct cleavage mechanisms that may be separately modulated [119].

There are a number of alternative γ-secretase substrates, e.g. APLP1, APLP2, Notch, cadherins, LRP [120, 121], and syndecan-1 [114, 122]. In addition to γ-secretase dependent functions, some PS functions are independent of γ-secretase, so that in effect, γ-secretase may compete for presenilins with other γ-secretase independent PS functions including cell adhesion, trafficking of various proteins [123], and Ca2+ homeostasis [114]. How the γ-secretase is regulated between the different substrates is not fully understood but may involve other binding proteins such as numb [65] and Rac1 [124], regulation of PS trafficking, including a possible reciprocal interaction with APP [125] and localization of PS within specific organelles and cellular membrane compartments [126].

Aβ is produced in a range of sequence lengths [87] and can form monomers, dimers, oligomers and fibrils [8] which have been difficult to study due to their dynamic instability [127]. At physiological concentrations Aβ is associated with numerous normal cellular functions [128] and in AD progression has multiple interactions that have been described as either neuroprotective or neurotoxic [36]. It is deposited in the brain in various pathological forms including CAA, diffuse and cored senile plaques and is often associated with neuritic plaques. Different sequence lengths have different propensities to aggregate [32, 129] and aggregation is also affected by amino acid substitution in mutant forms [130, 131] and various factors such as proximity to membranes [132], and pH or metal ion availability [133]. Different sequence lengths and different aggregation states can have different functional roles [36], making investigations into the exact roles of Aβ in the brain difficult. These associations may be better approached experimentally as a matrix, where the various sequence lengths, aggregation states and mutant forms should be assessed for each interaction.

While it is likely that P3 is produced in alternative sequence lengths following γ cleavage, very little evidence can be found in the literature for the contributions of P3 to disease progression. There is currently little interest in characterizing the contributions of P3 to normal brain function or AD, even though P3 is known to aggregate [134136], has been associated with in cotton wool type amyloid plaques [137] enhances the aggregation of Aβ1-40 [138] and may have a signaling role in apoptosis via caspase activation [139].

Regulation of expression and proteolysis of APP involves multiple factors, some of which are summarized in Fig. 2 (adapted from [36]). How signals from these multiple factors in various cellular locations are integrated to produce a specific outcome in any one cell is not known. Regulation of APP proteolysis, from both outside and within the APP proteolytic system, can be in response to a wide variety of cellular signals and various modulators including glycosylation, phosphorylation, dimerization, associations with heparin glycoproteins and other binding proteins. Feedback routes can be simple and short range such as the promotion of APP expression associated with fibrillar Aβ and prion protein [140]. Indirect and complex feedback routes also exist, such as the effects of heparin on regulating β-cleavage with low concentration promoting and high concentration inhibiting the activation of BACE1 [141] and the effects of Aβ on heparin. Aβ interacts with heparins in the ECM and at high levels may prevent the catabolism of proteoglycans and promote amyloid formation [142]. Reciprocally heparins modulate many of the interactions involving Aβ such as enhancing both nucleation and elongation processes in the aggregation of Aβ [143], limiting the neurotoxic and pro-inflammatory activity of Aβ in a dose dependent manner [144] and contributing to the uptake of Aβ by a pathway shared with ApoE [145].

6 Modeling the APP Proteolytic System. Practical Considerations

As a summary of interactions, maps, such as Fig. 2, can highlight particular areas that may be of interest such as hubs or regulatory interactions that may be open to modification by medications, or may highlight areas where data are missing, leading to further research. While molecular networks involving APP can be constructed, how these relate to the actual network of molecular interactions in any one human cell type at any one stage of development cannot yet be fully assessed. As reviewed above, different criteria and network construction methods can generate different networks, each with strengths, weaknesses and different behaviors in analysis. The impact of missing data, such as interactions that have not yet been identified, is difficult to assess. For the APP network, the contributions of alternative proteolytic fragments, such as sAPPα, sAPPβ, P3 and the various longer Aβ fragments, e.g. Aβ43, Aβ45, Aβ46 and Aβ48, in various states of aggregation have yet to be fully described. It is still unclear which Aβ sequence or aggregation state is linked to disease progression [146]. These alternative fragments may yet provide further interactions that have the potential to affect network behavior as a whole, as suggested by the predisposition to form Aβ42 from γ cleavage due to the accumulation of γ secretase substrates, C99 and longer Aβ fragments [147].

There are great difficulties in representing an iterative and dynamic proteolytic system, such as APP, as a static network map of connections. One of the first questions raised is what exactly does a static network represent? If a network represents interactions, and these interactions change with protein modifications such as phosphorylation, is it best to represent each functional protein version as a separate node? Should the alternative isoforms of APP be included and if so, should they have separate nodes? How do we best represent Aβ with around 40 possible sequence lengths [87] and various states of aggregation [32, 146]? In Fig. 2, Aβ has been collapsed into a single node for clarity. How would over 40 nodes in this space with potentially different connections affect computational and analytical methods? Given the different conformations [148] and functional actions [149151] of Aβ(1–40) and Aβ(1–42), a single node for these peptides cannot fully represent the APP functional network.

If the aim is to understand the role of PS in AD, perhaps with a view to developing treatment strategies that modulate its probability to react between its various substrates, then a network of its interactions could be constructed and this could be the basis for a dynamic computational model. This dynamic model would need to include calculations of a protein’s probability of reaction, where the basic molecular features described previously, (concentration, half-life etc.), could be represented as values in a computational matrix. This approach could be developed iteratively and different versions of a network could be compared in terms of flow through the network. Experimental data relating to basic biomolecular properties that are relevant to modeling the probability that a reaction will occur can be extracted from the literature, including V max, K m, and K (cat). However, characterizing enzyme reactions in order to model the probability of reaction is not an easy task as demonstrated in the following example.

Recent studies [118, 147, 152, 153] have looked at γ secretase enzyme kinetics for a variety of PS mutations, substrates and products. Different experimental models have been used and different features of the system have been reported in different formats. Table 3 gives values for K m, and V max for human synthetic wild-type PS1 and its interaction with various substrates extracted from the associated references.

Table 3 Kinetic values for human synthetic wild-type PS1. N/A, not available

Values for K m have been given in μM or nM and values for maximum reaction rate have been given as V max (pM/min or nM/h) or maximal activity (pM/106 cells). While manual extraction from the literature could easily convert μM to nM or nM/h to pM/min, automated text based searches could introduce error due to units reported. It is not possible to convert pM/106 cells into nM/h or pM/min, making comparisons between these studies difficult. The degree to which the experimental system used affects the values gained is difficult to assess, mouse embryonic fibroblast (MEF) derived membrane cell free assays, Hek293 or HeLa cell based systems are likely to have very different environments and each system will have experimental advantages and disadvantages. None of these systems accurately represent aging in the human brain. Indeed, which values of K m and V max in Table 3 would be most representative of the situation in any human neuron? Standard reporting formats for proteomic data exist [154, 155] and are annotated by experimental system used to derive the information such as species used, etc. so that data from different studies can be integrated but it is difficult to choose those values that may best represent the human system as it has not yet been fully characterized.

Attempts to dynamically model the human cognitive system are on-going with a diversity of approaches. For example, Kasabov et al. [156, 157] have combined gene and protein expression networks with a probabilistic spiking neural network and compared this to real human electroencephalograms [158] and used this to investigate pathways involved in AD [157]. In these models, dynamic behavior is captured in the network output, represented as spiking neurons, which can be controlled by networks representing gene and protein expression data. These gene and protein networks are in turn re-modeled iteratively by the spiking neural network. While a computational model of the AD process would be very useful to investigate how the system might be perturbed by changes to gene and protein expression, their current usefulness is open to question. Connectionist network models contain unquantifiable modules, as the weights of connections between the nodes in a network are stochastically modified during the training process. The relationships between the nodes and weighted connections with any feature of the human system are not certain: the nodes do not necessarily represent real human neurons and the connections do not necessarily represent connections between neurons. Populations of trained networks will consist of individual network models, each of which will have different connection weights. The difficulty here is in relating the distributions of the weights in any network to the living human system: the extraction of potentially useful information from the structure of the network is problematic.

7 Applying Systems Biology Approaches in Other Areas

7.1 Pattern Recognition and the Early Diagnosis of AD

Various computational methods such as principle component analysis [159, 160], linear regression methods [161, 162], machine learning methods [163165] and random forests [166] are being used to investigate automated pattern recognition in magnetic resonance imaging (MRI) image analysis [161] or various imaging methods coupled with multiple biomarker analysis [160, 163, 165, 166] with some success in separating normal aging from mild cognitive impairment (MCI) and AD. Although the use of new computational methods for multiple markers for AD increases the specificity and sensitivity in categorizing normal aging, MCI or AD, there is still no combination of markers that can identify those with MCI that may convert to dementia and AD with certainty and this is an urgent requirement.

7.2 The Human Connectome Project

Beyond mapping gene and protein expression or interaction networks, the effects of the human connectome on dementia risk is another complex area that presents huge challenges. The human connectome project [167] aims to map the human connectome at the macroscopic scale, (~1 mm3) using a variety of neuroimaging methods. This project aims to create a map of healthy human connectivity. There is great inter-individual heterogeneity, both in the vascular system, that may affect certain imaging methods and in cortical folding, so any resultant map can only be an idealized reference map. Further, how this connectivity changes with progression in dementia may also be highly heterogeneous between individuals and this has yet to be fully investigated.

8 Relating the Systems Biology of APP to Normal Cognition and Disease Progression in AD

For any neuron, signals received via synapses must be integrated into dynamic responses of the cell as a whole and this requires signaling between any specific synapse on a dendrite and its nucleus, possibly located some distance from the synapse. Changes to gene and protein expression in response to synaptic signaling must be transmitted back to the synapse via protein trafficking so that receptors and signaling molecules are in the correct cellular positions. There may be different signals arriving via different pathways, both electrical and metabolic, and these must be integrated into a coherent neuronal response. There is a temporal coherence, where everything must be in the right place at the right time, as the synaptic response builds on the previous state of the synapse. These synapses are further organized within a neural network connectome of different cell types and different functional brain areas from which cognition and human behavior arise that may include inputs from the whole body as it interacts with its environment. Figure 3 illustrates the interdependence of the areas involved in normal brain function, where gene expression may be modified by behavior which in turn may change protein expression and interaction leading to further changes in behavior as the whole system iteratively and stochastically changes over time. Attempts to isolate any specific area, such as protein expression, can be undermined by this interdependence and contributions to cognitive processes may be misrepresented, simply due to the assumptions of independence in experimental design.

Fig. 3
figure 3

The interdependence between ‘omics research areas. General discrete research areas discussed in the text appear as nodes, selected feedback and feed forward relationships are shown as arrows

In order to understand this coherent system, research has necessarily had to break it into smaller parts giving rise to discrete research fields investigating all the areas involved from genomes and proteomes to interactomes and connectomes. Traditionally, the reductionist approach aims to characterize individual pathways by introducing changes that are meant to impact on specific components in potentially well understood ways. This can lead to a limited view of complex processes, for example, the amyloid cascade hypothesis suggests that Aβ, in some form, is the sole cause of AD and that therefore removal of Aβ should modify the disease course. This can be understood in terms of a more linear infection type model. However, treatments based on this model have been unsuccessful in clinical trials so far and have failed to change the course of the disease [168], questioning its validity. Population studies highlight complexity in the presentation of AD, bringing wider research areas such as aging, diet, exercise, education, the vascular system and other biochemical pathways into consideration. Few complex biological mechanisms can be reduced to simple in vitro, cell based or animal based experimental models [50] and poorly characterized or unsuitable experimental systems may lead to erroneous interpretations.

In contrast to reductionist approaches, in which molecular systems may be treated as isolated and independent mechanisms, systems biology aims to integrate evidence from diverse areas into a representation of living processes as a whole. Even simple molecular systems present enormous challenges in terms of modeling biological outcomes as theories of molecular communication, i.e. how biological information is transmitted through a molecular network, are still being developed [25] and any computational representations of biological processes are necessarily limited to the data we currently have. In complex maps of protein interactions, many pathways are possible and whether any specific interactions are central, peripheral or involved in only subtypes of disease progression cannot yet be fully assessed.

Integrating networks constructed at the level of gene expression, with networks constructed at the levels of protein expression, protein interaction and cellular behavior is currently difficult as there isn’t correspondence between them. As reviewed above, not all genes expressed as mRNA transcripts become functional proteins and not all functional proteins necessarily interact due to dynamic regulation. Additionally, while the human connectome is being mapped at ever increasing resolution [167], how information is represented and stored across the human brain as a dynamic neural system of synaptic connections and how this changes with disease progression is not known.

Given that there is no qualitative marker for AD, diagnosis has relied on various clinical [4, 5] and neuropathological [6, 7] criteria that are quantitative and involve the application of thresholds: no single measure yet defines AD. Further, biomarkers used in the diagnosis of clinical disease remain to be standardized and harmonized [169]. Aβ fragments, commonly employed as biomarkers of disease, may have both protective and aberrant behaviors associated with disease and multiple disease pathways may exist. Additionally, no Aβ fragment has been identified as the “neurotoxic” disease related species [146]. How can poorly defined neurodegenerative diseases be diagnosed at an early stage when treatment strategies could have the best chance of preserving cognitive functions? This has consequences for how we understand AD, whether it is a single process that will respond to a single intervention strategy or whether AD is a syndrome, requiring multiple different interventions depending on disease types, yet to be characterized. This is of great importance to the design of experimental investigations and clinical trials. Selection of participants and controls relies on how we understand the disease process and how any disease process is reflected in clinical markers. How do we know that in any given clinical trial, the subjects selected represent homogenous disease or non-disease groups? There may be other disease processes, such as hippocampal sclerosis [7] and other pathologies such as TDP-43 [56] that may contribute to disease pathways and are yet to be fully characterized. Additionally, individuals may vary in the degree to which cognitive reserve and compensation to neuronal injury may limit the impact of pathological changes that occur during aging to better preserve cognitive functions [170].

In terms of health care planning, given the lack of progress towards a reliable dementia treatment strategy, in the immediate future perhaps dementia prevention and dementia care are areas where progress can best be made. The association of education [18], exercise [19] and diet [20] throughout life with a lower dementia risk in old age suggests that Public Health strategies devised to promote these activities would be worthwhile. Without a cure or ameliorating treatment, we need to be able to care for dementia sufferers in the most appropriate and efficient manner to maintain an individual’s independence and quality of life for as long as possible.

While applying the systems biology approach to represent complex dynamic proteolytic systems such as APP may not yet be entirely feasible, useful perspectives can still be generated. For APP, the complexity of its interactions and regulatory features suggest that multiple initiation and progression pathways are possible: the analysis of networks to highlight those disease pathways that may be most likely to occur in humans presents major challenges. Capturing this complexity in any network model and being able to relate network behavior to real human brains is the ultimate goal. Whether it will ever be possible to build a dynamic model of the AD disease processes at all levels of consideration (genome, proteome, interactome, connectome and whole body) is not clear. There is no best way to build a network and all networks constructed so far are incomplete. Additionally, both AD and normal aging in humans have yet to be fully characterized. How this missing data impacts on the reliable prediction of events from an incomplete network cannot yet be known. However, this chapter suggests some initial steps and proposals on how we could build more sophisticated networks. The challenge to the AD biomedical research community is to iteratively integrate data generated via a variety of approaches, both reductionist and systems biology, and then to use any insights gained to integrate the information and design further experiments to generate new data. It is clear that no single approach, reductionist or systems biology can tackle this problem alone.