Introduction

Common neurodegenerative diseases such as Alzheimer’s disease (AD), Parkinson’s disease (PD), amyotrophic lateral sclerosis (ALS), and Huntington’s disease (HD) are debilitating disorders with increasing prevalence in modern ageing societies. Decades of research on individual diseases through diverse approaches have offered deep insights into the phenotypic manifestations and molecular underpinnings of each disease. Interestingly, patterns of converging features across neurodegenerative diseases such as dementia in AD, PD and ALS have been observed (Parikshak et al. 2015; Lynch et al. 2016; Santiago et al. 2017), calling for a quest to better understand the relationships between neurodegenerative diseases to reveal disease-specific mechanisms as well as potentially shared mechanisms. While disease-specific mechanisms are useful for personalized, disease-specific therapies, the shared mechanisms revealed can facilitate the development of common therapies for multiple neurodegenerative diseases and a better understanding of the co-occurrence and overall interactions among these diseases.

While disease–disease overlap may be expected due to shared environmental risk factors such as common chemical exposures between different diseases (Liu et al. 2009), recent investigations of the shared mechanisms or interactions between diseases at the levels of individual risk factors (genetic vs environmental), candidate disease genes, pathways and networks have all supported the presence of shared molecular pathways (Parikshak et al. 2015; Lynch et al. 2016; Santiago et al. 2017). These studies have provided valuable information on common dysregulation in different diseases, but they were done at various times on different subsets of diseases, and a systematic review of the most current information is needed to achieve a more comprehensive understanding of the shared mechanisms across neurodegenerative diseases.

In this review, we systematically collected the multidimensional molecular information probing the mechanisms of individual diseases at genetic, transcriptome, pathway and gene network levels. We focus on genomewide studies to offer an objective and comprehensive view, and summarize the shared mechanisms between diseases revealed at each molecular level to derive more conclusive insights. Here we focus on AD, PD, ALS and HD whenever sufficient data is available. We organize our review of the genomewide similarities among neurodegenerative diseases based on the scale of investigation, i.e. whether the shared (and at times unique) factors discussed are at the genetic, gene expression or network level.

Shared genetic factors between neurodegenerative diseases

All neurodegenerative disorders have significant genetic components, with genetic heritability for AD, PD and ALS estimated to be 60–80% (Gatz et al. 2006; Van Cauwenberghe et al. 2016), \(\sim\)40% (Hamza and Payami 2010), and \(\sim\)60% (Al-Chalabi et al. 2010), respectively. Mendelian forms of neurodegenerative diseases have been attributed to rare mutations, such as those in amyloid precursor protein (APP) and presenilin genes (PSEN1, PSEN2) for AD; \(\alpha\)-synuclein (SNCA), Parkin (PARK2), PTEN-induced putative kinase 1 (PINK1), microtubule-associated protein tau (MAPT) and leucine-rich repeat kinase 2 or dardarin (LRRK2) for PD; cytosolic Cu/Zn superoxide dismutase (SOD1), alsin (ALS2), senataxin (SETX) and synaptobrevin/VAMP (vesicle-associated membrane protein)-associated protein B (VAPB) for ALS (Bertram and Tanzi 2005; Pasinelli and Brown 2006); and the chromosome 9 open-reading frame 72 (C9ORF72) repeat expansion as a shared genetic cause for ALS and frontotemporal dementia (FTD) (van Blitterswijk et al. 2012), which was dissected recently using CRISPR-Cas9 screens (Kramer et al. 2018). However, the majority of disease cases are of non-Mendelian form and exhibit complex aetiology involving large numbers of genetic variants with moderate to subtle effects. These complex forms of neurodegenerative diseases were difficult to examine until the advent of genomewide association studies (GWAS).

Within a decade of the best known early GWA study in 2005 (Klein et al. 2005), the GWAS approach has become a staple in modern genetic research and a driving force for novel causal gene discoveries for numerous human traits and diseases including neurodegenerative diseases. Based on the recent GWAS catalog (Welter et al. 2014) (https://www.ebi.ac.uk/gwas/; downloaded on 19 Sep. 2017), a total of 43, 26 and 17 studies were conducted for AD, PD and ALS, respectively, revealing tens of single-nucleotide polymorphisms (SNPs) associated with each of these diseases at genomewide significance (\(P{<}5\times 10^{-8}\); HD was not included in the analysis because only a few loci have been identified) (figure 1). Among these, the genetic locus with the strongest effect and clinical significance for AD is the APOE locus, which has also been implicated in PD (albeit with different alleles). Direct comparison of the significant GWAS hits between diseases revealed no overlap in SNPs (figure 1a), but a few overlaps in the candidate genes mapped to the top SNPs (figure 1b) as well as in the over-represented pathways among the candidate genes (figure 1c). Specifically, HLA-DRB5 and MAPT were GWAS candidate genes for both AD and PD. At the pathway level, ‘vesicle-mediated transport’ was shared across all three diseases, and nine pathways such as synaptic signalling, neuron projection development, and proteolysis were shared between AD and PD (table 1).

Fig. 1

Overlap among GWAS genetic signals of AD, ALS and PD at SNP, candidate gene and pathway levels. Overlap among different diseases based on overlap among each disease’s (a) GWAS-associated genomewide significant (\(P{<}5{\times }10^{-8}\)) SNPs, (b) candidate genes mapped to the genomewide significant disease SNPs, and (c) pathways over-represented in the candidate disease genes. (d) Q–Q plot showing that the significant PD SNPs (\(P{\le }5{\times }10^{-8}\)) collectively demonstrate more significant association with AD in AD GWAS (based on the full summary statistics from the IGAP stage1 study (Lambert et al. 2013)) compared to random expectation (\(P=0.01\) based on the Kolmogorov–Smirnov or KS test). (e) Q–Q plot showing that the significant ALS SNPs (\(P{\le }5{\times }10^{-8}\)) do not collectively demonstrate more significant association with AD in AD GWAS (based on the full summary statistics from the IGAP stage1 study (Lambert et al. 2013)) compared to random expectation (\(P = 0.59\) based on the KS test).

Previously, Ramanan and Saykin (2013) categorized 13 and 15 GWAS candidate genes for AD and PD, respectively, based on the known functions of individual genes and compared the convergent pathways between the two diseases. The genes included represented a much smaller number of candidate genes than what was summarized above in our analysis. In the earlier analysis by Ramanan and Saykin, as long as a pathway was implicated by a single gene, the pathway was considered to be implicated in the disease biology. Using this strategy, they identified numerous shared pathways between AD and PD. These included intracellular processes (apoptosis, autophagy, mitochondrial function, oxidative damage/repair, proteasome), pathways involving local tissue environment (cell adhesion, endocytosis, neurotransmission, prions/transmissible factors), pathways related to systemic environment (inflammation/immune system, lipid/metabolic/endocrine, vascular factors), and processes relevant to development and ageing (epigenetics, neurotrophic factors, telomeres). Our current analysis based on the over-represented pathways among the updated GWAS candidates confirmed the involvement of both intracellular and intercellular processes (multicellular organismal process, transport, macromolecular complex binding, cell activation, vesicle-mediated transport, proteolysis, kinase binding), and also implicated pathways associated with the nervous system (synaptic signalling, neuron projection) (table 1). Interestingly, Ramanan and Saykin also attempted to model the interactions among the AD/PD candidate genes using transcription factor binding networks curated in the MetaCore software. They found that nine of the 13 AD GWAS genes and 10 of the 15 PD GWAS genes were tightly connected within a coherent network coordinated by transcription factors SP1 and AP-1.

The above analyses support the presence of shared GWAS genes, pathways, and networks between several neurodegenerative diseases. However, these were based on the significant GWAS loci only. Given that ample evidence supports that complex traits or diseases involve numerous genetic variants with effect sizes ranging from strong to moderate to subtle, comparisons focussed on the top loci miss the opportunity to obtain comprehensive mechanistic insights. Additionally, examining overlaps in the top loci between diseases would necessitate the assumption that the ranks of genetic association strengths for the GWAS loci are similar across diseases, which may not be true in that strong loci for one disease may only subtly perturb another disease. Therefore, it is important to compare the disease mechanisms revealed by GWAS using the full GWAS statistics. As an illustration of this concept, we took the significant GWAS SNPs for PD and ALS from the GWAS catalog and plotted their association P values with AD based on the full summary statistics from the IGAP GWAS study (Lambert et al. 2013) on AD using Q–Q plots (figure 1, d–e) to assess whether the observed distribution of P values is significantly different from the expected null distribution. A significant deviation of the AD association P values among the PD or ALS SNPs from the null distribution towards lower P values (higher \(-\log_{10}(P)\) values) would indicate that PD or ALS SNPs collectively show stronger association with AD than random SNPs, which can serve as evidence for genetic sharing. This analysis showed that top PD SNPs are collectively associated with AD but in a subtler way, as the AD GWAS P values for PD SNPs were mostly less significant than \(10^{-3}\). In contrast, the top ALS SNPs did not appear to show evidence of association with AD.
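The Q–Q comparison described above can be sketched in a few lines. The following is a minimal illustration, not the exact procedure behind figure 1: it assumes the null distribution of P values is Uniform(0, 1), uses a two-sided one-sample KS test with an asymptotic P value, and runs on made-up P values (`hypothetical_pvalues` is not real GWAS data). The function names are ours.

```python
import math

def ks_uniform(pvalues):
    """One-sample KS test of P values against Uniform(0, 1).

    Returns (D, approximate two-sided P value). The P value comes from the
    asymptotic Kolmogorov series, so it is only approximate for small n.
    """
    xs = sorted(pvalues)
    n = len(xs)
    # D is the largest gap between the empirical CDF and the uniform CDF F(x) = x.
    d = max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))
    # Small-sample correction to sqrt(n) * D before using the asymptotic series.
    lam = (math.sqrt(n) + 0.12 + 0.11 / math.sqrt(n)) * d
    series = sum((-1) ** (k - 1) * math.exp(-2 * k * k * lam * lam)
                 for k in range(1, 101))
    return d, min(1.0, max(0.0, 2 * series))

def qq_points(pvalues):
    """(expected, observed) -log10(P) pairs, in ascending order of observed P."""
    xs = sorted(pvalues)
    n = len(xs)
    return [(-math.log10((i + 1) / (n + 1)), -math.log10(x))
            for i, x in enumerate(xs)]

# Hypothetical P values of one disease's top SNPs looked up in another
# disease's summary statistics (illustrative numbers only).
hypothetical_pvalues = [0.001, 0.004, 0.01, 0.02, 0.03, 0.05, 0.08, 0.1, 0.15, 0.2]
d_stat, p_ks = ks_uniform(hypothetical_pvalues)
points = qq_points(hypothetical_pvalues)
```

A small `p_ks` together with observed \(-\log_{10}(P)\) values lying above the expected ones in `points` would correspond to the pattern reported for the PD SNPs; a set of P values close to uniform would resemble the ALS result.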

Table 1 Shared molecular pathways between neurodegenerative diseases based on genetic, gene expression, and network evidence.

Recently, several more sophisticated statistical and bioinformatics methods have been used to explore shared genetic components across diseases (Fortune et al. 2015; Brown et al. 2016; Pickrell et al. 2016; Shu et al. 2017). These methods can detect genetic sharing at SNP, gene, pathway and network levels. For example, Pickrell et al. (2016) examined the overlapping genetic architecture among 43 human traits, including AD and PD. They used a log likelihood-based model selection method and revealed genetic sharing between many diseases. In particular, their study provided supporting evidence for moderate genetic sharing between AD and PD, agreeing with our analysis above (ALS was not examined in their analysis). Future applications of these various methods will further our understanding of genetic sharing between neurodegenerative diseases.

Shared gene expression signatures

With the advent of microarrays and next-generation RNA sequencing, there has been an outpouring of studies on transcriptional profiling of many complex diseases to better understand the underlying disease mechanisms. Genomewide gene expression profiles collected from disease vs control individuals represent the largest publicly available resource in the genomic domain. As such, numerous studies have profiled the transcriptomes of different neurodegenerative diseases and explored the corresponding overlapping gene signatures to determine the shared transcriptional perturbations of neurodegenerative diseases. In addition to these individual profiling studies, there have also been efforts to boost the power of detection and find consistent gene signatures by conducting meta-analyses and comparing two or more neurodegenerative diseases to find conserved gene perturbations.

Within the field of neurodegenerative diseases, it has been well established that there is a higher probability of developing concurrent PD and AD than would be expected by random chance, making this disease comparison a prime candidate for the exploration of overlapping gene signatures. This was precisely the goal in a comparative study profiling three brain regions (hippocampus, gyrus-frontalis-medius, and cerebellum) (Grünblatt et al. 2007). This study found 12 genes that were similarly perturbed by PD and AD across all three brain regions, including synaptic vesicle genes (SYT1), Alzheimer’s-related genes (APP, SNX2), insulin genes (IRS4), and oxidative stress genes (GSTM1). In addition, the study identified four genes (CNR2, HIST1H3E, CHRNA6 and BACE1) which showed opposite regulation patterns between the two diseases. For instance, BACE1, which is involved in processing amyloid precursor protein, was found to be upregulated in PD but downregulated in AD (Grünblatt et al. 2007).

While the previous study was based on data-driven whole transcriptome profiling to find similarities between PD and AD, there have been more targeted efforts to find consistencies in transcriptomes between neurodegenerative diseases. Specifically, one study drew on prior literature evidence of inflammation and perturbation of the immune system, which is characteristic of AD, PD and Creutzfeldt–Jakob disease, to find transcriptomic overlaps. Using this approach, López González et al. (2016) found overlaps between cytokines and the mediators of the immune response in all three diseases, as well as additional overlaps between AD and PD characterized by inflammatory markers in the blood and serum.

Although there are many more examples of targeted gene expression profiling studies which serve to characterize a single neurodegenerative disease or compare between two diseases, the most comprehensive gene signature comparison between neurodegenerative diseases to date is a meta-analysis of 1270 post-mortem brain tissues from 13 patient cohorts spanning four neurodegenerative diseases (AD, PD, HD and ALS) across many different brain regions (Li et al. 2014). Sampled tissues include: hippocampus, frontal cortex, entorhinal cortex, dorsolateral prefrontal cortex and medial temporal lobe for AD; substantia nigra, dorsolateral prefrontal cortex, putamen, dorsal motor nucleus and globus pallidus interna for PD; motor cortex, ventral head of the caudate nucleus and dorsolateral prefrontal cortex for HD; and motor cortex and cervical spinal cord for ALS (table 1). This meta-analysis identified a shared gene expression signature of 243 genes, which was validated in an additional withheld dataset comprising 205 samples from 15 different cohorts (figure 2). This shared gene signature contains genes associated with bioenergetic deficits, M1-type microglial activation and gliosis, thereby supporting these processes as consistent themes across neurodegeneration (Li et al. 2014). Further, pathway enrichment of the differentially expressed genes largely overlapped with literature on known neurodegenerative disease pathways, and included functional processes such as inflammation, altered synaptic transmission, mitochondrial dysfunction, and oxidative stress.

Fig. 2

Overlap of differentially expressed genes in neurodegenerative diseases. Overlap of meta-analyses of all possible combinations of three of the following diseases: AD, ALS, HD and PD. Three hundred and twenty-two genes were consistently found across all four meta-analyses, and the enriched pathways of these conserved genes are shown. Note that these 322 conserved genes differ from the 243 consistent genes reported in the original study (Li et al. 2014) due to some differences in methodology, i.e. our approach of overlapping genes from the four meta-analyses shown in this figure (as necessitated by the lack of ready availability of individual disease gene signatures from the original study) differs from the original study’s approach of a single meta-analysis of all four diseases followed by validation in a replication cohort to derive the consistent genes.

The overlapping gene signatures of ALS, PD and AD were further explored in another review of neurodegenerative diseases. The authors found overlaps between all three diseases in the form of neuroinflammation gene signatures (consistent with the targeted findings above), with further overlaps identified in genes related to RNA splicing and protein turnover between ALS and PD, and mitochondrial dysfunction genes as a common theme between PD and AD (Cooper-Knock et al. 2012).

Given that genomewide gene expression profiling is the most abundant omics-based profiling platform, here we focussed on transcriptome-based signature comparisons between neurodegenerative diseases. However, there have been a handful of studies on other data modalities such as shared protein dysregulation (Hosp et al. 2015) and shared epigenomic patterns (Urdinguio et al. 2009; Portela and Esteller 2010; Sanchez-Mut et al. 2016). These additional data modalities can provide unique information that is not captured at the transcriptome level and, when more comprehensively investigated in the future, can be integrated into system-wide network models to better capture the shared aetiology between neurodegenerative diseases.

Shared network-level dysregulation

Genes and proteins do not work in isolation within the cells and tissues of our body, but instead interact with and regulate each other through protein–protein, protein–DNA, and other biomolecular interactions. Disease-induced perturbations can propagate through this interconnected network of genes and proteins. Evidence for this type of network-level dysregulation shared between different neurodegenerative diseases comes from several studies on transcriptional networks or protein interaction networks. These studies typically infer a transcriptional network by adding a network link between any pair of genes that show correlated expression (coexpression) in postmortem brain samples of a group of individuals (such as a group of individuals with disease, or another group of control individuals without dementia); and typically infer a protein interaction network from more direct experimental assays done on human cells (such as a yeast two-hybrid assay to detect protein–protein interactions, or a ChIP-seq experiment to detect protein–DNA interactions).

Different neurodegenerative diseases show similarity at the level of transcriptional networks. For instance, we have found (Narayanan et al. 2014) that the global transcriptional network in the human dorsolateral prefrontal cortex is drastically altered in both AD and HD in a similar fashion when compared to control individuals. This result is based on a systematic differential coexpression (DC) analysis that revealed two types of disrupted gene–gene relations based on whether the correlation strength between two genes is increased (gain of correlation or GOC) or decreased (loss of correlation or LOC) in the disease group relative to controls. The network of shared DC relations between AD and HD contained a majority of LOC relations, even when the individual DC networks (i.e., AD vs controls DC network, or HD vs controls DC network) contained predominantly GOC relations (figure 3a).
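To make the GOC/LOC distinction concrete, a toy sketch is given below. It is not the statistical procedure of the original study: here we assume that 'correlation strength' means the absolute Pearson correlation, and we call a relation GOC or LOC when that strength changes by at least a hypothetical cutoff (`min_change`). The expression vectors are invented for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length expression vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def classify_relation(disease_pair, control_pair, min_change=0.5):
    """Label one gene-gene relation as 'GOC', 'LOC' or 'unchanged'.

    Each *_pair is (expression of gene A, expression of gene B) across the
    individuals of that group. min_change is an illustrative cutoff on the
    change in absolute correlation, not a value taken from the literature.
    """
    strength_disease = abs(pearson(*disease_pair))
    strength_control = abs(pearson(*control_pair))
    change = strength_disease - strength_control
    if change >= min_change:
        return "GOC"
    if change <= -min_change:
        return "LOC"
    return "unchanged"

# Invented expression values for two genes in five individuals per group.
tightly_coupled = ([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
weakly_coupled = ([1, 2, 3, 4, 5], [3, 1, 4, 1, 5])

label_loss = classify_relation(disease_pair=weakly_coupled, control_pair=tightly_coupled)
label_gain = classify_relation(disease_pair=tightly_coupled, control_pair=weakly_coupled)
```

In practice a formal test for the difference between two correlations, with multiple-testing correction over all gene pairs, would replace the fixed cutoff used in this sketch.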

Fig. 3

Shared network-level dysregulation in AD, HD and ALS. (a) The number of transcriptional network disruptions (DC or differential coexpression gene–gene relations) that were identified in our previous study (Narayanan et al. 2014) as AD-specific, HD-specific, or shared between both diseases are shown. The shared DC network comprised 8043 DC relations as shown (involving 3021 genes), and contained a majority of LOC relations, since the proportion of LOC among all DC relations in the AD, HD and shared DC networks were 33%, 19% and 51%, respectively. (b) Overlap of two coexpression network module genes identified from blood gene expression analysis of multiple ALS cohorts (Saris et al. 2009) with the HD–AD shared DC network genes (Narayanan et al. 2014). The hypergeometric distribution based overlap P values compare the overlap of the module genes against the overlap rate of background genes (note that 1711 of the 15,462 ALS study background genes overlapped with the set of 3021 HD–AD shared DC genes).

Aligning the shared DC network between AD and HD (which comprised 8043 DC relations involving 3021 genes) with a network of 116,220 protein–protein/protein–DNA interactions from HPRD, BIND and other databases resulted in a 242-gene subnetwork. This 242-gene subnetwork was enriched for independent AD and HD signatures and for several biological processes (including neuron differentiation, gap junction trafficking, regulation of apoptosis and other processes; table 1). This subnetwork also revealed two interacting processes involving GOC of chromatin organization genes and LOC of oligodendrocyte differentiation genes as a pathological mechanism shared between AD and HD (Narayanan et al. 2014).

We were interested in comparing genes in the shared HD–AD DC network to genes in the coexpression network of other neurodegenerative diseases. The most suitable such study in terms of sufficient sample size to construct robust coexpression networks was a study on ALS (Saris et al. 2009). However, due to the lack of access to brain tissues, the ALS study was based on blood gene expression analysis. Using blood expression data from multiple ALS cohorts, this study identified and replicated significant associations of two sets of coherently regulated genes (or coexpression modules) with ALS status: a blue module enriched for genes upregulated in ALS patients compared to controls, and a yellow module enriched for genes downregulated in ALS. In a test for enrichment of the top 500 genes correlated to the activity of the ALS yellow (or blue) modules for various functional or disease gene categories, the study found significant overlap with the HD disease gene category, besides other neurological disorder categories such as atrophy of dendrites. Further, the top 500 blue or yellow module genes (i.e. most positively correlated 500 genes averaged across multiple ALS cohorts as reported in their supplementary table S4) overlapped significantly with the genes in the shared DC network between HD and AD in our study (figure 3b). These results taken together support significant sharing of coexpression network-level signatures between HD, AD and ALS.
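The hypergeometric overlap test used for this kind of comparison can be sketched as follows. The background size (15,462 genes), the number of background genes in the shared HD–AD DC network (1711) and the module list size (500) are taken from the text and the figure 3 legend; the observed overlap of 90 genes is a hypothetical value chosen only to illustrate the calculation, and the function names are ours.

```python
from math import comb

def expected_overlap(background, category, drawn):
    """Mean overlap if `drawn` genes were picked at random from the background."""
    return drawn * category / background

def hypergeom_overlap_pvalue(background, category, drawn, overlap):
    """Upper-tail hypergeometric P value, P(X >= overlap).

    background: number of genes that could have been selected
    category:   background genes that belong to the reference set
    drawn:      size of the gene list being tested
    overlap:    observed number of list genes that are in the reference set
    """
    upper = min(category, drawn)
    total = comb(background, drawn)
    tail = sum(comb(category, k) * comb(background - category, drawn - k)
               for k in range(overlap, upper + 1))
    # Exact integer arithmetic; only the final ratio is converted to a float.
    return tail / total

BACKGROUND = 15462         # ALS study background genes (from the text)
SHARED_DC = 1711           # background genes in the HD-AD shared DC network
MODULE_TOP = 500           # top module genes tested
HYPOTHETICAL_OVERLAP = 90  # illustrative value, not a reported result

mean_overlap = expected_overlap(BACKGROUND, SHARED_DC, MODULE_TOP)
p_overlap = hypergeom_overlap_pvalue(BACKGROUND, SHARED_DC, MODULE_TOP,
                                     HYPOTHETICAL_OVERLAP)
```

With these inputs roughly 55 overlapping genes would be expected by chance, so an overlap well above that yields a small P value.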

We focussed above on transcriptional coexpression network analysis of bulk tissues such as postmortem brain samples or blood samples since they were available from multiple neurodegenerative diseases. There have also been cell type specific studies such as coexpression network analysis of spinal motor neurons implicated in ALS (Ho et al. 2016) or fibroblasts affected in ALS (Kotni et al. 2016). More such work is needed to understand if different neurodegenerative diseases could affect specific cell types in a shared fashion.

To complement studies that use transcriptional coexpression networks inferred from expression data to uncover disease dysregulation, one could also use literature-based network data on protein–protein, protein–DNA and other physical interactions to dissect disease dysregulation. For instance, a study used the human protein–protein interaction network to expand an initial seed set of 10 common disease susceptibility genes shared between AD, HD and PD (e.g., ESR2, PARP1, GSK3B, UCHL1 and LRRK2, as identified from genetic association databases, literature mining or other sources) into a larger set of 1294 genes connected to these seed genes in the protein network (Li et al. 2015). Inspection of the protein network among these expanded genes revealed enrichment of metabolic pathways from the KEGG database, and modules/pathways that provide bridging interactions between the common susceptibility genes for these three neurodegenerative diseases. The discovered bridging pathways such as ‘adherens’ and ‘tight junctions’ were further validated using independent gene expression data related to these neurodegenerative diseases.
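The idea of expanding seed genes through a protein interaction network can be illustrated with a small sketch. We assume here that expansion means adding the direct interaction partners of each seed, and that a 'bridging' node is a non-seed protein that interacts with at least two different seeds; both are simplifying assumptions rather than the published procedure. The node names and edges are placeholders, not real interactions.

```python
def build_adjacency(edges):
    """Undirected adjacency map from a list of (protein, protein) pairs."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    return neighbours

def expand_seeds(edges, seeds):
    """Seeds plus every protein that directly interacts with a seed."""
    neighbours = build_adjacency(edges)
    expanded = set(seeds)
    for seed in seeds:
        expanded |= neighbours.get(seed, set())
    return expanded

def bridging_nodes(edges, seeds):
    """Non-seed proteins that interact with two or more different seeds."""
    neighbours = build_adjacency(edges)
    seeds = set(seeds)
    return {node for node, partners in neighbours.items()
            if node not in seeds and len(partners & seeds) >= 2}

# Placeholder network: seed1 and seed2 share partner p1; p3 is two steps away.
toy_edges = [("seed1", "p1"), ("seed2", "p1"), ("seed2", "p2"), ("p2", "p3")]
toy_seeds = {"seed1", "seed2"}

expanded_set = expand_seeds(toy_edges, toy_seeds)
bridges = bridging_nodes(toy_edges, toy_seeds)
```

Pathway enrichment of `expanded_set` and inspection of `bridges` would then correspond, loosely, to the enrichment and bridging-pathway steps described above.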

Given the complexity of how these diseases progress, instead of using gene expression data in the validation phase as done in the study above, one may use it directly in the discovery phase to identify and prioritize the bridging pathways or common disease subnetworks. This approach was taken in a related study (Liu et al. 2012) to search for similar subnetworks dysregulated in both AD and PD. Instead of simply looking at first-level neighbours of seed disease genes in the human protein network, this study also employed an objective function based on Steiner trees to expand the seed genes minimally into a connected disease subnetwork, and then inspected the resulting subnetworks for coordinated transcriptional dysregulation in both AD and PD. The reconstructed AD and PD networks respectively contained 225 genes (387 interactions) and 273 genes (502 interactions), with 72 genes shared between the two networks. While the majority of these 72 genes could be constructed from a direct overlap between the two starting seed disease gene sets (derived based on AD or PD association in at least four publications), the power of a network-based approach is in its depiction of the interconnected nature of these shared genes and the extra bridging genes that the Steiner tree algorithm adds. For instance, the Steiner tree algorithm added five AD genes (ACHE, APP, ATXN1, CLU and DAPK1) to the PD network, and five PD genes (APOB, CALR, CAV1, NOS1 and TFRC) to the AD network based on molecular interactions between the AD and PD genes.
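For readers unfamiliar with Steiner trees, the sketch below shows one simple way to connect a set of seed (terminal) nodes using as few extra nodes as a greedy search finds. It is a classic shortest-path heuristic on an unweighted graph, chosen by us for illustration; the objective function and algorithm used in the study discussed above may differ. Node names and edges are placeholders.

```python
from collections import deque

def steiner_tree_heuristic(edges, terminals):
    """Greedily connect all terminals; returns (tree nodes, tree edges).

    Starting from one terminal, repeatedly attach the closest remaining
    terminal to the growing tree along a shortest path (found by a
    breadth-first search started from every node already in the tree).
    """
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    ordered = sorted(terminals)
    tree_nodes = {ordered[0]}
    tree_edges = set()
    remaining = set(ordered[1:])

    while remaining:
        parent = {node: None for node in tree_nodes}  # tree nodes are BFS roots
        queue = deque(sorted(tree_nodes))
        hit = None
        while queue and hit is None:
            node = queue.popleft()
            for nxt in sorted(adj.get(node, ())):
                if nxt in parent:
                    continue
                parent[nxt] = node
                if nxt in remaining:
                    hit = nxt
                    break
                queue.append(nxt)
        if hit is None:
            raise ValueError("not all terminals are connected in the network")
        # Walk back from the reached terminal to the tree, adding the path.
        node = hit
        while parent[node] is not None:
            tree_edges.add(frozenset((node, parent[node])))
            tree_nodes.add(node)
            node = parent[node]
        remaining.discard(hit)
    return tree_nodes, tree_edges

# Placeholder network: b1 links all three terminals; x1-x2 is a longer detour.
toy_edges = [("t1", "b1"), ("b1", "t2"), ("b1", "t3"),
             ("t1", "x1"), ("x1", "x2"), ("x2", "t3")]
toy_terminals = {"t1", "t2", "t3"}

nodes, tree = steiner_tree_heuristic(toy_edges, toy_terminals)
added_nodes = nodes - toy_terminals  # extra "bridging" nodes chosen by the heuristic
```

The nodes in `added_nodes` play the role of the bridging genes that a Steiner tree approach adds to the seed genes; the heuristic does not guarantee a minimum-size tree in general.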

Conclusion

We set out to obtain a systematic genomewide elucidation of common molecular pathways to neurodegeneration, and did so using three different genomic data types for which data is relatively abundant to allow between-disease comparisons: genetic, gene expression and network-level signatures. The three data types seem to complement one another as the common dysregulated pathways supported by each data type are generally distinct from those supported by the other data types (table 1), with some exceptions such as vesicle-mediated transport, which receives support from both genetic and gene expression data for its association with AD, PD and ALS. Certain previously known shared mechanisms like protein misfolding and subsequent neuronal loss are only captured partially in our results (i.e., cell death but not protein misfolding is found enriched; table 1). Future work could review other lines of evidence such as shared protein, cellular (composition of cell types) or structural/neuronal network signatures among the neurodegenerative diseases when data becomes more enriched in these domains, and these could for instance reveal both protein misfolding pathways and other novel pathways shared among neurodegenerative diseases.

Several factors need to be considered when interpreting the results in this study. First, the biological implications of the different types of data can differ. For instance, genetic information deals with heritable DNA alterations that precede disease development, which may carry stronger implications for disease causality. Gene expression data, on the other hand, are more subject to dynamic changes and can reflect both causal genes/pathways that are upstream of disease development and reactive genes/pathways that are downstream of disease processes. Therefore, one cannot directly take the large numbers of genes and pathways revealed from gene expression studies as evidence for disease causality. Sophisticated integrative analyses of the various data modalities and perturbation experiments are needed to differentiate causal driver processes from reactive passenger pathways. Second, different statistical significance or other filtering cutoffs were employed in different studies and data modalities to build their disease gene sets; hence it is easier to interpret similarities than differences observed between diseases’ gene sets, since differences could simply arise from the use of different cutoffs or other technical/methodological artefacts rather than real differences between the diseases being compared. We note here that we did keep the factors under our control the same across diseases to minimize artefactual results (in table 1 for instance, the same pathway definitions and the same pathway enrichment tool based on the hypergeometric test were used). A further consideration when comparing gene signatures (and other omics modalities) is the tissue-specific nature of disease signatures as seen in neurodegenerative diseases like HD (Hodges et al. 2006) and tissue-specific patterns of splicing (Twine et al. 2011). Therefore, it is important to match the tissue types in which the genes and pathways manifest when comparing mechanisms between diseases. If these tissue-specific influences are not accounted for when comparing diseases, spurious conclusions can result. Having said that, the tissues of relevance to neurodegenerative diseases are predominantly different regions of the brain, with the cortical region profiled in many studies for instance, which allows us to compare different tissue-based disease signatures. Lastly, although we tried to be as comprehensive as possible in collecting data and studies, it is easily discernible that not all comparisons are possible due to missing information or data on one or more diseases. Therefore, future coordinated efforts are needed to systematically collect multi-omics data types from multiple tissues and cell types from multiple neurodegenerative diseases to enable more comprehensive investigations.

In summary, we reviewed common mechanisms shared between neurodegenerative diseases here (i.e. overlap between dysregulation signatures identified by typically studying each disease separately in independent studies or cohorts), and found several pathways to be enriched for genes associated with multiple neurodegenerative diseases. This short review of ‘disease–disease overlap’ could pave the way to understand ‘disease–disease interactions’, i.e. interaction between two neurodegenerative diseases afflicting the same individual. This is an underexplored topic of research that could constitute a promising avenue for future research, and thereby help reveal new therapies for diseases with mechanistic overlaps and interactions.