Introduction

The role of genetic factors in determination of longevity and patterns of aging is an intensively studied issue (Budovsky et al. 2007; de Magalhaes et al. 2009; Finch and Ruvkun 2001). Our knowledge in the field has come mainly from genetic interventions in the model organisms and, most recently, from RNAi screens in C. elegans (Curran and Ruvkun 2007; Lee 2006). The longevity-associated genes (LAGs) represent a diverse group of genes, among which are those that predispose to increased lifespan, whereas others cause premature aging (Budovsky et al. 2007; de Magalhaes et al. 2009). The mechanisms (independent or cooperative) by which LAGs determine lifespan and the rate of aging as well as the potential role of their neighboring partners have not yet been fully addressed. To a great extent, the latter could also be applied to hundreds of genes that have been identified as being involved in development of age-related diseases (ARDs).

The analysis of LAGs and genes involved in major human ARDs, conducted in our lab in recent years (Budovsky et al. 2007, 2009; Tacutu et al. 2010; Wolfson et al. 2009), has revealed several remarkable features: (1) LAGs are highly evolutionarily conserved (approximately 80% of LAGs established in model organisms have orthologs in humans). (2) LAG-encoded proteins have higher average connectivity than other proteins from the interactome and are also highly interconnected. (3) LAGs together with their first-order interacting partners form continuous protein–protein interaction (PPI) networks of scale-free topology, that is, a relatively small number of nodes with multiple connections (hubs) contribute disproportionately to the network connectivity. (4) The genes involved in the major ARDs (atherosclerosis, type 2 diabetes, cancer, and Alzheimer’s disease (AD)) also exhibit high connectivity and interconnectivity and form PPI networks. (5) There are numerous common nodes and PPIs between the Human Longevity Network (HLN) and the networks of the major ARDs. Collectively, these results indicate that LAGs and ARD genes may act in a cooperative manner and provide further evidence for the existence of strong evolutionary and molecular links between longevity and ARDs.

The network-based approach is becoming increasingly used as a systems biology tool to investigate aging (Chautard et al. 2010; Csermely and Soti 2006; Ferrarini et al. 2005; Promislow 2004; Wang et al. 2009; Xue et al. 2007), diseases (Goh et al. 2007; Ideker and Sharan 2008), or both (Miller et al. 2008; Simko et al. 2009; Wang et al. 2009).

The integrity and functionality of PPI networks are under tight epigenetic control. A recently discovered epigenetic mechanism of RNA interference includes miRNAs as key players, acting predominantly at the post-transcriptional level (Fire et al. 1998; Meister and Tuschl 2004). Dysregulation of epigenetic control could have a profound impact on both aging and ARDs (Budovsky et al. 2006; Feinberg 2008; Liang et al. 2009; Rattan and Singh 2009; Wang 2007). As we have recently shown (Tacutu et al. 2010; Wolfson et al. 2008), many established miRNAs have the potential to target the genes involved in the control of lifespan, in aging and ARDs. Thus, the molecular links between aging/longevity and ARDs could also be attributed to their common miRNAs.

The idea of combining miRNAs and PPI networks is only beginning to emerge. In particular, the miRNA-regulated PPI networks (human interactome) have thus far been constructed only for predicted miRNA targets (Hsu et al. 2008; Liang and Li 2007). Such an approach, however, generates many false-positive results (Lewis et al. 2003; Maragkakis et al. 2009). For this reason, in this work we have included only experimentally validated miRNA targets (see next section). We suggest that building the miRNA-regulated PPI networks could provide further insight into the mechanisms of aging/longevity and ARDs as well as be useful for the selection of potential drug targets. The need for such an integrative approach therefore led us to develop NetAge, a database containing miRNA-regulated PPI networks for longevity, ARDs and aging-associated processes.

NetAge: database content and implementation

Data sources

A full list of LAGs established thus far in model organisms was compiled from the scientific literature and manually curated. The list contains around 800 entries for the four major model organisms (S. cerevisiae, C. elegans, D. melanogaster, and M. musculus), and is accessible in GenAge (de Magalhaes et al. 2009). The LAGs from model organisms were determined based on genetic interventions (partial or full loss-of-function mutations, gene over-expression) or RNA interference-induced gene silencing, which reportedly promote longevity or cause premature aging (Budovsky et al. 2007; de Magalhaes et al. 2009). The still limited number of human LAGs includes genes either possessing longevity-predisposing polymorphisms or involved in progeroid syndromes. Presuming that the LAGs from model organisms may also play a role in the control of human lifespan (Budovsky et al. 2007), non-redundant human orthologs of model organism LAGs (n = 643) and human LAGs (n = 19) were used for the construction of the HLN.

Human genes involved in major age-related diseases (ARDs) were selected according to the following criteria: (1) mutations associated with a higher frequency of the disease, (2) consistent up- or down-regulation of gene expression, and (3) gene polymorphisms associated with greater predisposition or susceptibility to a given disease. Cancer-associated genes were collected from scientific literature (Budovsky et al. 2009 and references therein) and from publicly available databases including The Cancer Genome Anatomy Project—CGAP (Strausberg et al. 2002), NCBI OMIM (Online Mendelian Inheritance in Man) (Sayers et al. 2009), Tumor Suppressor Gene Database—TSGDB (Yang and Fu 2003). The genes associated with atherosclerosis, Alzheimer’s disease, and type 2 diabetes were also collected from the scientific literature and from publicly available databases: the Cardiovascular Bioinformatics Database—Cardio (Zhang et al. 2004), AlzGene database (Bertram et al. 2007), and Type 2 Diabetes Mellitus Database—T2D-Db (Agrawal et al. 2008), respectively. Finally, the human genes associated with oxidative stress and chronic inflammation were collected from the scientific literature only.

PPI data for each interactome were extracted from the BioGRID database (Breitkreutz et al. 2008). MicroRNAs and their experimentally verified targets were retrieved from the TarBase database (Papadopoulos et al. 2009). Orthology information was obtained from InParanoid, the database of eukaryotic ortholog groups (Berglund et al. 2008). All human genes/proteins were annotated with basic process and primary cellular localization as provided by the Human Protein Reference Database, HPRD (Keshava Prasad et al. 2009).

As expected when integrating data from several distinct databases, some name and/or identifier inconsistencies arose between the input data sets. Most of these inconsistencies were reported by our program and then manually analyzed and curated. Various names/identifiers were unified according to the data from the HUGO Gene Nomenclature Database (Eyre et al. 2006), FlyBase (Tweedie et al. 2009), WormBase (Rogers et al. 2008) and the Saccharomyces Genome Database (Hong et al. 2008).
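The unification step can be pictured with a minimal sketch. The function name `unify_identifiers`, the alias table and the sample input are hypothetical and are not taken from our software; the sketch only illustrates mapping synonyms to one official symbol while reporting names that cannot be resolved, so that they can be curated manually.

```python
def unify_identifiers(names, alias_to_official):
    """Map each input name to an official symbol; collect unresolved names."""
    unified = {}
    unresolved = []
    for name in names:
        key = name.strip().upper()          # case-insensitive lookup
        official = alias_to_official.get(key)
        if official is None:
            unresolved.append(name)         # left for manual curation
        else:
            unified[name] = official
    return unified, unresolved


# Illustrative alias table (keys upper-cased). In practice the mapping would
# be derived from nomenclature resources rather than typed by hand.
ALIASES = {"TP53": "TP53", "P53": "TP53", "SIRT1": "SIRT1", "SIR2L1": "SIRT1"}

unified, unresolved = unify_identifiers(["p53", "SIR2L1", "geneX"], ALIASES)
```

In this sketch, the unresolved list plays the role of the inconsistency report that is inspected by hand.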

Construction of miRNA-regulated PPI networks

The construction of longevity and ARDs PPI networks was described in detail elsewhere (Budovsky et al. 2007, 2009; Wolfson et al. 2009). In this work, the PPI networks associated with longevity, ARDs, and related processes (oxidative stress, chronic inflammation) have been extended to miRNA-regulated PPI networks. This was done using YABNA (Yet Another Biological Networks Analyzer), a flexible software program developed in our lab. The construction of the networks included in NetAge is described in Fig. 1.

Fig. 1

(a) Construction of longevity networks for humans and for model organisms and (b) the human networks of ARDs and aging-associated processes

The created networks for longevity, ARDs and associated processes represent sub-networks of the respective miRNA-regulated interactomes. Therefore, in order to provide the user with a wider perspective, the interactomes were also included in the NetAge database. For each species, the miRNA-regulated interactome was modeled as a mixed graph, including all the genes/proteins/miRNAs as nodes (a gene and its encoded protein are represented as one node) and all the PPIs and miRNA–gene interactions as edges. PPIs are considered undirected edges, whereas miRNAs are directed to their respective target genes. The construction of the miRNA-regulated interactome was done in two stages: (i) constructing the PPI interactome, and (ii) adding miRNA nodes with directed edges to their targets within the interactome. Only miRNAs which regulate at least one validated target from the graph were included.
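The two-stage construction described above can be sketched as follows. This is an illustrative model only, not the YABNA implementation: the class `MixedGraph`, the function `build_interactome` and the toy data are assumptions introduced for the example.

```python
from collections import defaultdict


class MixedGraph:
    """Undirected PPI edges plus directed miRNA -> target edges."""

    def __init__(self):
        self.ppi = defaultdict(set)      # gene/protein -> set of partners
        self.targets = defaultdict(set)  # miRNA -> set of target genes

    def add_ppi(self, a, b):
        # Undirected edge: store it in both directions.
        self.ppi[a].add(b)
        self.ppi[b].add(a)

    def add_mirnas(self, mirna_targets):
        # Keep only targets already present in the PPI interactome, and
        # only miRNAs that retain at least one such target.
        for mirna, genes in mirna_targets.items():
            present = {gene for gene in genes if gene in self.ppi}
            if present:
                self.targets[mirna] = present


def build_interactome(ppis, mirna_targets):
    graph = MixedGraph()
    for a, b in ppis:                # stage (i): PPI interactome
        graph.add_ppi(a, b)
    graph.add_mirnas(mirna_targets)  # stage (ii): miRNA layer
    return graph


# Toy data, invented for illustration only.
g = build_interactome(
    ppis=[("A", "B"), ("B", "C")],
    mirna_targets={"miR-x": {"A", "Z"}, "miR-y": {"Z"}},
)
```

Here the hypothetical miR-y is dropped because its only target is absent from the PPI interactome, whereas miR-x is kept with the single target that is present.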

Since the construction of the networks makes sense only in the context of the interactome, for each gene set of interest, only genes/proteins with reported PPIs were included. For example, the most studied PPI interactome, that of yeast, comprises about 5,300 genes with more than 77,000 PPIs (at the time of database creation). Accordingly, almost all yeast LAGs (85 out of 87) are present in the interactome and were included in the yeast longevity network. The human PPI interactome is still incomplete, and to date includes around 8,100 genes and 27,000 PPIs. Of the 662 LAGs (human LAGs and non-redundant human orthologs of LAGs in model organisms), 456 are found in the human interactome and were used for constructing the HLN.

In doing so, the genes/proteins of interest were used as a “core set”, to which their first-order protein partners were added. The largest interconnected sub-graph was then kept and all the miRNAs which target these genes were added to the network. It is worth stressing that in almost all cases, these sub-graphs cover more than 90% of the core set genes. Both the gene sets of interest and the constructed networks are stored in the database. The graphical output of the networks illustrated on the website was generated using Cytoscape 2.6.0 (Shannon et al. 2003).
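The procedure above (core set, first-order partners, largest interconnected sub-graph, miRNAs) can be sketched as follows. The function names and the toy data are hypothetical, and the adjacency dictionary `ppi` is assumed to be symmetric; the sketch is not the YABNA code.

```python
from collections import deque


def largest_component(adjacency):
    """Return the node set of the largest connected component (BFS)."""
    seen, best = set(), set()
    for start in adjacency:
        if start in seen:
            continue
        component, queue = {start}, deque([start])
        while queue:
            node = queue.popleft()
            for nb in adjacency[node]:
                if nb not in component:
                    component.add(nb)
                    queue.append(nb)
        seen |= component
        if len(component) > len(best):
            best = component
    return best


def build_network(core_set, ppi, mirna_targets):
    """core_set: genes of interest; ppi: dict node -> set of partners
    (assumed symmetric); mirna_targets: dict miRNA -> set of targets."""
    core = {g for g in core_set if g in ppi}          # genes with reported PPIs
    nodes = core | {p for g in core for p in ppi[g]}  # add first-order partners
    induced = {n: ppi[n] & nodes for n in nodes}      # PPIs among kept nodes
    kept = largest_component(induced)
    mirnas = {m for m, t in mirna_targets.items() if t & kept}
    coverage = len(core & kept) / len(core) if core else 0.0
    return kept, mirnas, coverage


# Toy data, invented for illustration only.
ppi = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}, "D": {"E"}, "E": {"D"}}
kept, mirnas, coverage = build_network(
    {"A", "C", "D", "Q"}, ppi, {"miR-x": {"C"}, "miR-y": {"E"}}
)
```

The returned `coverage` is the fraction of core set genes retained in the largest sub-graph, which corresponds to the coverage figure quoted above.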

Database content

NetAge hosts data for four species: S. cerevisiae, C. elegans, D. melanogaster, and H. sapiens. For each of the four species, the database contains miRNA-regulated interactomes, gene sets (from the interactome) associated with longevity and their corresponding longevity networks. For H. sapiens, the database contains additional gene sets and networks for ARDs and aging-associated processes such as oxidative stress and chronic inflammation. In order to emphasize the links between longevity, ARDs and aging-associated processes, we compared the above-mentioned human networks and, as a result of this comparison, constructed several overlap networks. Interestingly, the overlapping genes/proteins also form continuous networks. All of them are accessible through the Net-Selector module in the NetAge database. For example, we have constructed the CGS (Common Gene Signature) network by overlapping the HLN and all the major ARD networks. The CGS network is included in the “Common signature of longevity and ARDs” for the purpose of showing and characterizing the molecular links between longevity and ARDs in humans (Wolfson et al. 2009). Another example is the “Common signature of AD, oxidative stress and chronic inflammation”. Strong relations between AD, oxidative stress and chronic inflammation have long been suggested (Ashford et al. 2005; Finch and Marchalonis 1996; Reddy et al. 2009). With this in mind, we constructed the overlapping network (Fig. 2). Similar analyses were performed for pair-wise comparisons between the HLN and the networks of all major ARDs. In total, NetAge currently contains 10 gene sets (LAGs, ARDs, oxidative stress and chronic inflammation genes) and 21 networks (including the networks for longevity and ARDs as well as the overlaps between them) (see Table 1).
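One simple way to compute such an overlap is sketched below. It assumes that an overlap consists of the nodes present in every compared network together with the PPIs shared by all of them; the function `overlap_network` and the toy networks are hypothetical and may differ in detail from the procedure used to build the NetAge overlaps.

```python
def overlap_network(networks):
    """networks: list of dicts mapping node -> set of partners.
    Returns the nodes shared by all networks and the PPIs shared by all."""
    common_nodes = set.intersection(*(set(net) for net in networks))
    common_edges = set()
    for node in common_nodes:
        # Partners linked to `node` in every network and themselves shared.
        shared = set.intersection(*(net[node] for net in networks))
        shared &= common_nodes
        # frozenset makes the undirected edge (a, b) equal to (b, a).
        common_edges |= {frozenset((node, p)) for p in shared}
    return common_nodes, common_edges


# Toy networks, invented for illustration only.
net_1 = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}}
net_2 = {"B": {"C", "D"}, "C": {"B"}, "D": {"B"}}
nodes, edges = overlap_network([net_1, net_2])
```

Whether the resulting overlap is continuous can then be checked by searching for its connected components.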

Fig. 2

The common miRNA-regulated network (signature) for AD, oxidative stress and chronic inflammation. The common signature was created as an overlap between the three networks and comprises 622 proteins and 41 miRNAs. Circles in dark grey represent genes/proteins reported to be involved in AD. Light grey circles are their partners. The size of the circles in both cases is proportional to the node’s connectivity in the interactome

Table 1 Sets of genes, networks and interactomes stored in NetAge

At the time of writing, information about miRNA targets in S. cerevisiae is still lacking, and therefore the yeast interactome included in the NetAge database behaves as a regular PPI network. Also, the number of annotated PPIs for mouse (~1,100 nodes and 1,400 interactions in BioGRID) is not yet sufficient for constructing an informative longevity network. Therefore, for the time being, NetAge does not contain data on M. musculus.

Web interface

The NetAge website located at http://www.netage-project.org provides easy access to and basic tools for querying and/or surfing through the database. The web interface includes the following sections and tools:

  • The “User guide” section aims at helping first-time users become quickly familiar with the database and its scope. It contains several user cases with possible scenarios that might interest the user (for example, see Figs. 3, 4), as well as answers to FAQs.

    Fig. 3

    An example of a user case from the online guide. What is the basic information that could be obtained from NetAge regarding gene A? (1) Whether gene A or its ortholog belongs to any of the longevity networks. (2) Whether gene A is found in any of the age-related disease networks or aging-associated process networks (in case gene A is a human gene). (3) What are gene A’s partners? (4) What miRNAs regulate gene A and/or its partners? (5) What other genes could be regulated by the same miRNAs, and (6) whether these genes belong to any network in NetAge?

    Fig. 4

    Another user case: comparison between species. For example, gene A is a longevity gene in C. elegans and a user is interested in D. melanogaster. The questions that could be answered include: (1) Does gene A have orthologs in D. melanogaster and are they found in the fly interactome? (2) Is gene B (an ortholog of gene A) also a longevity gene in the D. melanogaster longevity network or is it found among partners of longevity genes? (3) Are the partners of gene A also conserved in fly, and if so, are they present in the fly longevity network?

  • The Net-Selector module provides the possibility of selecting species, analyses, and gene sets/networks. Results given by the browsing, searching and statistics sections thus depend on the selection made.

  • The Browsing module allows users to view gene sets and networks—their description, construction methods, images and lists of nodes.

  • The Search engine provides quick access to nodes (gene/miRNA) from a given network.

  • The Quick info module shows summary information about the currently selected species/analysis/gene set or network.

  • The Advanced searching module is based on a filtering system. The criteria currently available for searching include constraints regarding a node’s name, type (genes/proteins or miRNAs), connectivity and interacting partners. Moreover, the search can always be restricted to given gene sets/networks and/or species. The above criteria can also be combined using logical operators (AND, OR, etc.). This in turn considerably extends the user’s options, allowing for more complex searches. As an example, a possible scenario could be searching for genes/proteins from the Cancer Network, H. sapiens, that might be regulated by miR-124 and have at least 10 protein partners.

  • Statistics section with general statistical information and statistics for each gene set/network/interactome.

  • Glossary with basic scientific terms that are used throughout the website.
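The filtering logic of the Advanced searching module can be illustrated with a minimal sketch of composable criteria. The record layout, the function names and the toy nodes below are hypothetical and do not reflect the actual NetAge schema or query engine; only the example query (Cancer Network, miR-124, at least 10 partners) follows the scenario described in the list above.

```python
def in_network(name):
    return lambda node: name in node["networks"]


def targeted_by(mirna):
    return lambda node: mirna in node["mirnas"]


def min_partners(n):
    return lambda node: len(node["partners"]) >= n


def all_of(*criteria):  # logical AND
    return lambda node: all(c(node) for c in criteria)


def any_of(*criteria):  # logical OR
    return lambda node: any(c(node) for c in criteria)


def search(nodes, criterion):
    """Return the names of all nodes satisfying the combined criterion."""
    return [node["name"] for node in nodes if criterion(node)]


# Toy records, invented for illustration only.
NODES = [
    {"name": "GENE1", "networks": {"Cancer"}, "mirnas": {"miR-124"},
     "partners": {f"P{i}" for i in range(12)}},
    {"name": "GENE2", "networks": {"Cancer"}, "mirnas": {"miR-124"},
     "partners": {"P1", "P2"}},
    {"name": "GENE3", "networks": {"HLN"}, "mirnas": {"miR-124"},
     "partners": {f"P{i}" for i in range(15)}},
]

hits = search(
    NODES,
    all_of(in_network("Cancer"), targeted_by("miR-124"), min_partners(10)),
)
either = search(NODES, any_of(in_network("HLN"), min_partners(10)))
```

Because each criterion is an ordinary function, AND and OR combinations can be nested to any depth, which mirrors the way simple filters are combined into more complex searches.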

For each node in NetAge, the web interface provides an info page with essential information about the node, its partners (genes/proteins/miRNAs) and the miRNA-regulated networks to which it belongs. All entries are clearly annotated, and provide several external links to relevant databases. In addition, the user can get a graphical view of the local topology of the node. OrthoLinks allow the user to quickly navigate from a node which has orthologs in the other NetAge species to their info pages. The latter could in particular be important for comparative studies as well as for choosing new candidates for experimental modeling.

Availability

NetAge is freely available at http://netage-project.org. The NetAge XML data files can be freely downloaded and used according to the GNU General Public License. MySQL dump files are also available upon request.

Concluding remarks: implications and perspectives

The analysis of LAGs and genes involved in ARDs (atherosclerosis, type 2 diabetes, cancer, and AD) and in aging-associated processes (oxidative stress and chronic inflammation) showed that these genes may act in a cooperative manner via numerous direct and indirect PPIs, and eventually by forming PPI networks. Comparison between the networks highlighted the strong molecular links between aging, longevity and ARDs, indicating that the network-based approach could serve as an efficient tool in biogerontological studies. Specifically, the NetAge database could be helpful:

  1. For predicting new pro-longevity targets (genes and/or miRNAs).

  2. For the analysis of the links between the determinants of longevity, ARDs and related processes.

  3. For comparative biogerontology, in particular, for studying the public and private mechanisms of aging and longevity.

  4. For the systems biology of aging.

Since the data sources used by NetAge are continuously updated, we have developed a set of “automatic update” scripts included in the YABNA software, with the aim of maintaining and updating the NetAge database on a quarterly basis.

Future developments include the extension of the database to other species, with particular interest in M. musculus, and further integration of data sources in the NetAge database. For example, one of the important points for future investigations would be the integration of data on functional genomics of aging (Pan et al. 2007). Another point for future focus could be related to the analysis of extracellular proteins, in particular, that of cell–cell and cell–extracellular matrix interactions. As we showed recently (Wolfson et al. 2009), the common gene signature of longevity and ARDs is considerably enriched with genes from the adherens junctions and focal adhesion signaling pathways. As such, it would be interesting to extend the analytic capabilities of the NetAge database by using the data and tools provided by MatrixDB, a database of mammalian protein–protein and protein–carbohydrate interactions involving the extracellular matrix (Chautard et al. 2009).

Another interesting direction would be the integration of predicted miRNA targets into the NetAge database. However, such an approach should be pursued with care, so that the cost of false-positive results does not outweigh the information gained.

To the best of our knowledge, the NetAge database is unique in its concept and network content. By making these resources available online, we hope to provide the scientific community with a solid, new network platform for biogerontological research, and encourage greater participation in the systems biology of aging.