1.1 Introduction

Aging is the time-dependent decline of physiological function across all aspects of a biological system, which ultimately leads to death. Systems biology combines computational modeling and simulation with large-scale experiments to explore dynamic behavior in biological systems (Cassman 2005), making it an ideal approach for studying a systems-level problem like aging. In this chapter we discuss the data resources and analysis approaches used in aging systems biology.

1.2 Data Resource for Systems Biology in Aging Research

Omics data are the basic building blocks for constructing a global view of a tissue or organism during aging through systems biology approaches. The following subsections highlight the different omics data sources used in aging research and their main findings.

1.2.1 Genomics

Genome-wide association studies (GWAS) in aging research relate genetic variants, measured by single nucleotide polymorphism (SNP) chips or high-throughput genome sequencing, to chronological age-related or healthy aging-related phenotypes. A series of twin studies (Paneni et al. 2017; Ljungquist et al. 1998; Skytthe et al. 2003) have shown that 20–30% of the overall variation in human lifespan can be attributed to genetic factors, indicating that lifespan is not genetically determined to a large extent, although the genetic influence on lifespan increases after age 60 (Hjelmborg et al. 2006). Nevertheless, some genetic determinants of longevity have been identified. SNPs in APOE (Gerdes et al. 2000; Ewbank 2007; Deelen et al. 2011; Joshi et al. 2017; Sebastiani et al. 2017) and FOXO3A (Joshi et al. 2017; Willcox et al. 2008; Pawlikowska et al. 2009; Flachsbart et al. 2009; Anselmi et al. 2009) are repeatedly found to be associated with longevity in studies of centenarians versus younger controls. In contrast, a recent GWAS of healthy aging (defined in that study as people >80 years old without chronic diseases and not taking chronic medications) found that healthy aging shares no SNP loci with exceptional longevity, suggesting that the two are divergent phenomena even though they are intuitively expected to share some common features (Erikson et al. 2016). The same study found no major single contributor to healthy aging (Erikson et al. 2016).

1.2.2 Epigenomics

Genome-wide DNA methylation can be measured by chip (Illumina 450K or 850K chip) or by sequencing (whole-genome bisulfite sequencing, reduced-representation bisulfite sequencing, methylated DNA immunoprecipitation sequencing, or methyl-CpG binding domain enriched sequencing) (Bock et al. 2010; Harris et al. 2010). The global pattern of DNA methylation during aging is hypomethylation in repetitive sequences, hypermethylation in promoter regions, and higher intercell variability (Bacalini et al. 2014; Cevenini et al. 2008). A study using DNA methylation to estimate the state of aging in blood found that three CpG sites were sufficient to predict age with a mean absolute deviation from chronological age of less than 5 years (Weidner et al. 2014), providing a DNA methylation-based aging biomarker. A cross-sectional study that evaluated DNA methylation in boys aged 3–17 years found that >88% of pediatric age-associated loci trend in the same direction as in adulthood, suggesting that some age-related methylation changes take place in early life stages (Alisch et al. 2012). Aging-associated DNA methylation is shared across different tissues within the same individuals: one study found that differentially methylated regions in whole blood can be replicated in buccal cells (Rakyan et al. 2010), and another found that age-methylation correlations are well preserved between the brain and blood (Horvath et al. 2012).
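As an illustration of how a methylation-based age predictor of this kind is evaluated, the following minimal sketch applies a linear model to the methylation levels (beta values) of three CpG sites and computes the mean absolute deviation from chronological age. The CpG names, intercept, weights, and sample values are hypothetical placeholders chosen for illustration; they are not the coefficients or data published by Weidner et al. (2014).

```python
from statistics import mean

# Hypothetical intercept and per-CpG weights (assumptions for illustration,
# NOT the published model coefficients).
INTERCEPT = 40.0
WEIGHTS = {"cpg_a": -60.0, "cpg_b": 35.0, "cpg_c": 50.0}


def predict_age(beta_values):
    """Linear age estimate from methylation beta values (0..1) at three CpGs."""
    return INTERCEPT + sum(WEIGHTS[cpg] * beta_values[cpg] for cpg in WEIGHTS)


def mean_absolute_deviation(predicted, chronological):
    """Average absolute difference between predicted and chronological age."""
    return mean(abs(p - c) for p, c in zip(predicted, chronological))


# Made-up samples: (beta values, chronological age in years).
samples = [
    ({"cpg_a": 0.50, "cpg_b": 0.20, "cpg_c": 0.30}, 30),
    ({"cpg_a": 0.40, "cpg_b": 0.40, "cpg_c": 0.50}, 52),
    ({"cpg_a": 0.30, "cpg_b": 0.50, "cpg_c": 0.60}, 71),
]

predicted = [predict_age(betas) for betas, _ in samples]
mad = mean_absolute_deviation(predicted, [age for _, age in samples])
print(f"mean absolute deviation: {mad:.2f} years")
```

In practice the weights would be fitted by regression on a training cohort and the deviation reported on independent samples.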

1.2.3 Transcriptomics

The transcriptome is likewise measured by either microarrays or RNA sequencing. Aging-associated changes in the transcriptome are largely tissue-specific: most of the changes found in the brain (Lu et al. 2004; Berchtold et al. 2008), skin (Glass et al. 2013), adipose tissue (Glass et al. 2013), kidney (Rodwell et al. 2004), and blood (Peters et al. 2015) did not overlap with those in other tissues. The changes also show species specificity, as a cross-species analysis found only 73 genes consistently associated with age (de Magalhães et al. 2009). The biological functions that recurrently change during aging include increased inflammation and decreased energy metabolism, especially mitochondrial function (Zierer et al. 2015).

1.2.4 Proteomics

Current proteomic techniques based on immunoassays, protein arrays, or mass spectrometry can measure only a small fraction of the proteome (up to 1000 proteins per sample). The most comprehensive description of the human proteome across human tissues, cell lines, and body fluids to date comprises 18,097 proteins collected from 16,857 liquid chromatography tandem mass spectrometry (LC-MS/MS) experiments (Wilhelm et al. 2014). A recent study using quantitative middle-down proteomics found that the histone variant H3.3 accumulates during aging (Tvardovskiy et al. 2017), and a study in Drosophila showed that tissue-specific proteomes of long-lived mutant strains provide new insights into the insulin/IGF signaling pathway (Tain et al. 2017). A proteomics study of young and old B cells found that proteins related to mitochondrial stress management and DNA repair are significantly regulated during aging (Mayer et al. 2017). Beyond protein identification, a distinctive value of proteomics data is information on posttranslational modifications (PTMs), which cannot be measured directly by any other omics approach but can alter the biochemical properties of proteins. PTMs change significantly during aging: for example, levels of N-glycosylation correlate with familial longevity and healthy aging (Ruhaak et al. 2011), and a linear combination of only three IgG glycans explained up to 58% of the variance in age in a study of four European populations (Krištić et al. 2014). As the mass spectrometry (MS)-based proteomics field becomes more open to data sharing, this is a golden age for analyzing public proteomics data (Martens and Vizcaíno 2017). OpenMS (Röst et al. 2016) is an open-source tool available to assist such analyses.

1.2.5 Metabolomics

Metabolomics profiles the low-molecular-weight molecules in a biological sample. Like proteomics, this profiling relies on mass spectrometry, or alternatively on nuclear magnetic resonance. To date, no analytical method can identify and quantify all metabolites in a single experiment (Adamski and Suhre 2013). Since 2008, a series of metabolomics studies of human aging have been conducted (Gonzalez-Covarrubias et al. 2013; Lawton et al. 2008; Menni et al. 2013; Yu et al. 2012) in cohorts ranging from small to large. A lipidomics study in middle-aged offspring of nonagenarians found that improved antioxidant capacity and more efficient β-oxidation might be responsible for increased lifespan in women (Gonzalez-Covarrubias et al. 2013), and another study found that C-linked glycosylated tryptophan was highly correlated with age and with aging traits such as lung function, bone mineral density, and blood pressure (Menni et al. 2013). Metabolomics is now often combined with other omics layers; for example, the B cell proteomics study mentioned above used metabolomics to verify its conclusions (Mayer et al. 2017).

1.2.6 Metagenomics

The human metagenome refers to the collective genome of the microbial species hosted by the human body. Metagenomic analysis of fecal samples from older people found that the separation of microbiota composition correlated significantly with measures of frailty, markers of inflammation, and nutritional status, as well as with residential situation (Claesson et al. 2012).

1.2.7 Phenomics

Phenomics covers clinical and lifestyle traits, ranging from anthropometric measures to health and lifestyle questionnaires (Moayyeri et al. 2013). Because aging is tightly linked to lifestyle (for example, calorie restriction and exercise are repeatedly found to slow aging; Green et al. 2017), phenomics is especially valuable in aging research. The Rockwood frailty index, which is composed of symptoms, signs, diseases, and disabilities, can be used as a measure of biological age (Rockwood and Mitnitski 2007). Phenotypic traits can be interdependent; for example, faster telomere attrition and a higher inflammaging burden (measured by interleukin-1β) were associated with lower grip strength (Baylis et al. 2014). Recently, the human 3D face was also profiled for aging studies: features extracted from 3D facial images, such as eye slopes, were found to be tightly associated with age, and physical age predicted from the 3D face was found to be more consistent with health indicators than chronological age (Chen et al. 2015).
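A deficit-accumulation frailty index of this type is commonly computed as the proportion of assessed health deficits that are present in an individual. The sketch below illustrates the arithmetic under that assumption; the deficit names and the example record are hypothetical, and real indices typically use a much longer list of items.

```python
def frailty_index(deficits):
    """Proportion of assessed deficits that are present.

    `deficits` maps a deficit name to 1 (present), 0 (absent),
    or None (not assessed; excluded from the denominator).
    """
    assessed = [value for value in deficits.values() if value is not None]
    if not assessed:
        raise ValueError("no deficits were assessed")
    return sum(assessed) / len(assessed)


# Hypothetical record covering a symptom, a sign, a disease, and a disability.
person = {
    "memory_complaints": 0,    # symptom
    "impaired_mobility": 0,    # sign
    "hypertension": 1,         # disease
    "hearing_loss": 1,         # disability
    "needs_help_bathing": None,  # not assessed
}

fi = frailty_index(person)
print(f"frailty index: {fi:.2f}")
```

An index near 0 indicates few accumulated deficits, and values closer to 1 indicate greater frailty.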

1.2.8 Single-Cell/Organism Measurement

Although not necessarily omics approaches, single-cell and single-organism measurements can also be informative for aging research and suit the needs of systems biology, as such experiments often generate large datasets for downstream integrative analysis. Aging-related changes in the immune system have been investigated with a 15-color flow cytometry panel (measuring 14 proteins) in 28 T cell subpopulations in humans (Lu et al. 2016) and with single-cell RNA-seq of naïve and effector memory CD4+ T cells in mice from two divergent species (Martinez-Jimenez et al. 2017). The latter study found that aging increases cell-to-cell variation at the transcriptome level, which suggests that the transcriptomic switch driven by immunological activation is no longer controlled as tightly as in young mice (Martinez-Jimenez et al. 2017). Another single-cell RNA-seq study, of 2544 single cells from the pancreas of 8 human donors spanning six decades of life, found that older donors display increased levels of transcriptional noise and potential fate drift (Enge et al. 2017). With the development of microfluidic technology for model organisms such as the yeast S. cerevisiae (Chen et al. 2017) and the worm C. elegans (Xian et al. 2013), and of equivalent culture techniques utilizing a polyethylene glycol hydrogel and a silicone elastomer (Pittman et al. 2017), significant efforts have been made to address the long-standing puzzle of why aging differs among genetically identical individuals of the same species, a difference that reflects the stochastic nature of the aging process.

1.3 Data Analysis for Systems Biology in Aging Research

Data analysis in aging systems biology can generally be divided into two parts: the network approach and the mathematical modeling approach. The following subsections briefly discuss advances in applying these approaches to aging research and the conclusions drawn from them.

1.3.1 Network Construction

One way to integrate the results of an omics study in a systems biology context is to project the variables of interest onto known reference networks, such as protein-protein interaction (PPI) networks, gene regulatory networks (GRN), or metabolic networks. PPI networks can be obtained from the Human Protein Reference Database (Keshava Prasad et al. 2009), the MIPS mammalian protein-protein interaction database (Pagel et al. 2005), the Reactome database (Fabregat et al. 2017), and the STRING database (Szklarczyk et al. 2017). Metabolic networks are mainly drawn from the Kyoto Encyclopedia of Genes and Genomes (KEGG) (Kanehisa et al. 2017).

On such predefined networks, aging-associated proteins were found to be highly connected hubs in the PPI network (Bell et al. 2009), and, using an asymmetric closeness measure based on the PPI network, type I diabetes was found to be more tightly related to aging than type II diabetes (Wang et al. 2009). Through integration of DNA methylation and PPI data, tissue-independent age-associated hotspots were found to target stem cell differentiation pathways (West et al. 2013). When the PPI network was restricted to age-specific highly expressed genes, the global network topology did not change, but the centrality of several genes correlated with age (Faisal and Milenković 2014). A study from our laboratory analyzed the topology of an aging-related PPI subnetwork in which interacting gene pairs are transcriptionally co-expressed or anti-expressed during human brain aging and found that the PPIs connecting anti-expressed genes are enriched for lifespan regulators and for transcriptional and epigenetic regulators (Xia et al. 2006).
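To make the notion of a hub concrete, the following minimal sketch projects a set of aging-associated proteins onto a small reference network and compares their mean degree (number of interaction partners) with the network-wide mean. The edge list, the protein labels A to G, and the choice of which proteins count as aging-associated are all hypothetical; a real analysis would load interactions from a resource such as those listed above and assess significance against randomized networks.

```python
from collections import defaultdict

# Hypothetical undirected PPI edge list (toy data, not from any database).
edges = [
    ("A", "B"), ("A", "C"), ("A", "D"), ("A", "E"),
    ("B", "C"), ("D", "F"), ("E", "F"), ("F", "G"),
]


def degrees(edge_list):
    """Return the number of distinct interaction partners of each protein."""
    neighbors = defaultdict(set)
    for u, v in edge_list:
        neighbors[u].add(v)
        neighbors[v].add(u)
    return {node: len(partners) for node, partners in neighbors.items()}


deg = degrees(edges)

# Hypothetical set of aging-associated proteins projected onto the network.
aging_proteins = {"A", "F"}

network_mean = sum(deg.values()) / len(deg)
aging_mean = sum(deg[p] for p in aging_proteins) / len(aging_proteins)

print(f"mean degree, all proteins:   {network_mean:.2f}")
print(f"mean degree, aging proteins: {aging_mean:.2f}")
```

Degree is only one of several centrality measures; closeness or betweenness could be substituted without changing the overall structure of the comparison.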

Networks can also be inferred through data-driven approaches, which fall into five major classes according to the Dialogue on Reverse Engineering Assessment and Methods (DREAM) project: regression, mutual information, correlation, Bayesian networks, and others (Marbach et al. 2012). One should keep in mind that network inference is at best an indication of association and that experimental validation is always needed to demonstrate causality. The following are some examples of network construction efforts in aging research.

Weighted gene co-expression network analysis is a method for inferring gene-gene interaction networks from transcriptomics data (Zhang and Horvath 2005). By applying the method to gene expression data from 30 adult human frontal cortex samples of different ages and comparing the resulting network with a network derived from the Alzheimer's disease (AD) transcriptome, Miller and colleagues found that healthy aging of the brain and AD share features in the decline of mitochondrial activity and synaptic plasticity (Miller et al. 2008). Such co-expression- or correlation-based networks can also be used to integrate multiple layers of data. For example, in a recent effort to profile vaccine responses in young and old adults, a multiscale, multifactorial response network spanning transcriptomic and metabolomic signatures, cell populations, and cytokine levels was built and revealed striking associations between orthogonal datasets (Li et al. 2017). A similar idea can be generalized to single-cell transcriptome analysis, as in the SCENIC computational tool, which simultaneously reconstructs gene regulatory networks and identifies cellular states (Aibar et al. 2017).
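The core step of a weighted co-expression network can be sketched as follows: pairwise Pearson correlations between gene expression profiles are raised to a power β (soft thresholding), giving an unsigned adjacency a_ij = |cor(x_i, x_j)|^β that emphasizes strong correlations over weak ones. The gene names, expression values, and the choice β = 6 below are illustrative assumptions, and the sketch omits later steps of the full method such as topological overlap and module detection.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def soft_threshold_adjacency(expression, beta=6):
    """Unsigned weighted adjacency |cor|**beta for every gene pair."""
    genes = list(expression)
    adjacency = {}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            adjacency[(g, h)] = abs(pearson(expression[g], expression[h])) ** beta
    return adjacency


# Hypothetical expression profiles across five samples (toy data).
expression = {
    "gene_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "gene_b": [2.1, 3.9, 6.2, 8.0, 9.8],
    "gene_c": [5.0, 3.0, 4.0, 1.0, 2.0],
}

adjacency = soft_threshold_adjacency(expression)
for pair, weight in adjacency.items():
    print(pair, round(weight, 3))
```

Because the adjacency is unsigned, the negatively correlated pair gene_a and gene_c still receives a nonzero weight, although a much smaller one than the almost perfectly correlated pair gene_a and gene_b.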

Probabilistic graphical models are an important class of networks that can be built from high-throughput data (Friedman 2004). In a study of metabolomics data, a Gaussian graphical model (GGM) was applied to infer association networks (Krumsiek et al. 2011). GGMs have also been applied in aging research to reconstruct networks from metabolic data and identify modules (Murphy et al. 2017). A Bayesian network is a directed acyclic graph inferred from data, which can extract biologically meaningful associations without prior knowledge (Friedman et al. 2000). Recently our laboratory developed an algorithm that combines public intervention data to infer a Bayesian network (Li et al. 2013) and applied it to transcriptomic data from C. elegans during normal aging and dietary restriction (DR). This led to the finding that extensive feedback controls exist among three modules mediating DR-induced longevity, which was validated by lifespan assays (Hou et al. 2016).
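In a GGM, an edge between two variables reflects their partial correlation, that is, the correlation that remains after conditioning on all other variables. For the special case of three variables this reduces to the first-order partial correlation formula, which the minimal sketch below uses to show how an indirect association vanishes. The metabolite labels m1, m2, m3 and the correlation values are hypothetical; with more variables, partial correlations are usually obtained from the inverse of the covariance matrix instead.

```python
from math import sqrt


def partial_corr(r_xy, r_xz, r_yz):
    """Partial correlation of x and y after conditioning on z."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical pairwise Pearson correlations for a chain m1 -> m2 -> m3,
# where m1 and m3 are related only through the intermediate m2.
r_m1_m2 = 0.8
r_m2_m3 = 0.8
r_m1_m3 = 0.64

# Indirect pair: conditioning on m2 removes the association.
indirect = partial_corr(r_m1_m3, r_m1_m2, r_m2_m3)

# Direct pair: the association persists after conditioning on m3.
direct = partial_corr(r_m1_m2, r_m1_m3, r_m2_m3)

print(f"m1-m3 given m2: {indirect:.3f}")
print(f"m1-m2 given m3: {direct:.3f}")
```

A plain correlation network would link m1 and m3 with weight 0.64, whereas the partial correlation leaves only the two direct edges of the chain.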

1.3.2 Model Aging Systems

The ultimate goal of systems biology is to quantitatively model an organism, conduct in silico experiments, and generate hypotheses and predictions. While whole-organism models have been attempted in yeast (Karr et al. 2012), modeling a subsystem of an organism based on prior knowledge can also yield mechanistic insights into biological processes such as aging. A stochastic network model of cell senescence based on telomere reduction, mitochondrial damage, and nuclear somatic mutations was built, and simulations from this model were consistent with published data on intra-clonal variability in cell-doubling potential (Sozou and Kirkwood 2001). The same group also developed a mathematical model of the heat shock system that describes the influence of chaperones and of the accumulation of misfolded proteins on aging (Proctor et al. 2005). Another modeling study focused on mitochondrial fission and fusion events; its simulations were consistent with two experimental findings, and the model thus provides evidence for the age-related accumulation of mitochondrial deletion mutants (Kowald et al. 2005). An in silico model of the chronic effects of elevated cortisol on hippocampal atrophy was developed, and simulations using ordinary differential equations suggested that a chronic increase in cortisol levels leads to a faster decline in hippocampal output than acute bursts (McAuley et al. 2009). Epigenetic changes in aging stem cells were also modeled to explain why increased stem cell proliferation can lead to progeroid phenotypes (Przybilla et al. 2014). Beyond research on the biological side of aging, one interesting effort is facial aging modeling, which is useful in searching for lost children or wanted fugitives. It relies on four types of approaches (physical model-based approaches, prototyping, function-based approaches, and evaluation-targeted approaches), and the results were impressive (Suo et al. 2012).
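The idea behind stochastic simulation of cell senescence can be illustrated with a deliberately simplified toy model in which only telomere shortening is considered: each division removes a random amount of telomere, and a cell stops dividing once its telomere length falls to a threshold. This sketch is not the model of Sozou and Kirkwood (2001), which also includes mitochondrial damage and nuclear somatic mutations; all parameter values (initial length, threshold, mean and spread of the loss per division) are hypothetical.

```python
import random


def divisions_until_senescence(rng, telomere=10000.0, threshold=5000.0,
                               mean_loss=100.0, sd_loss=30.0):
    """Count divisions until telomere length drops to the threshold.

    All lengths are in hypothetical base-pair units. The loss per division
    is drawn from a normal distribution and truncated at zero so that a
    telomere never lengthens in this toy model.
    """
    divisions = 0
    while telomere > threshold:
        telomere -= max(0.0, rng.gauss(mean_loss, sd_loss))
        divisions += 1
    return divisions


# Simulate 200 cells of one clone that start from identical telomere length.
rng = random.Random(0)  # fixed seed so the run is reproducible
clone = [divisions_until_senescence(rng) for _ in range(200)]

print("fewest divisions:", min(clone))
print("most divisions:  ", max(clone))
print("mean divisions:  ", sum(clone) / len(clone))
```

Even though every simulated cell starts in the same state, random losses alone produce a spread in doubling potential, which is the kind of intra-clonal variability that such models are compared against.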

1.4 Conclusions

With the rapid development of omics mapping methods and the accumulation of big data, studying aging at the systems biology level is now not only feasible but also becoming a necessity to complement traditional one-gene-at-a-time approaches. Aging systems biology (data sources and analyses are summarized in Fig. 1.1) will bring new insights into aging both macroscopically, at the network level, and microscopically, using mathematical models. Single-cell technology will further drive aging systems biology toward the single-cell level. Linked with the big data generated at the cellular, tissue, and whole-organism levels, the time is ripe for aging systems biology to take off and bear fruit.

Fig. 1.1

Intervention of data sources and analyses in aging systems biology. In this concise sketch map, all types of data sources and analysis methods are nested in the network to show their interdependency. The network is obtained from Hou et al. (2016).