Introduction

Alzheimer’s disease (AD) is a long-term, progressive neurodegenerative disease, and it is the most common type of dementia. It causes the death of brain cells and shrinkage of brain tissues, which can cause memory loss, cognitive decline, disorientation, behavioral disorders, and, in more severe cases, death [1]. The number of AD patients is expected to exceed 100 million by 2050, with the effect of increasing life expectancy [2]. Lifelong care and therapy requirements for these patients place a significant financial burden on the family and society [3]. Therefore, uncovering the disease-associated molecular mechanisms that were not yet fully elucidated would have a significant contribution to developing novel treatment options.

In the pathophysiology of AD, there are several known physiological abnormalities in the brain tissues especially at hippocampus, entorhinal cortex, and neocortex [4]. Extracellular amyloid-beta (Aβ) plaques are one of the hallmarks of AD. Aβ peptides are products of the proteolytic process of amyloid precursor protein (APP), which is crucial for several biological activities like intracellular transport, signaling, neuronal development, and homeostasis [5]. In AD, the accumulation of Aβ plaques causes serious metabolic disorders and neurotoxicity, and, eventually, leads to neuronal cell death [6]. Another hallmark of AD is intracellular neurofibrillary tangles (NFTs) [7]. Tau is an essential protein for maintaining microtubule stability by tubulin assembly, and, thus, regulates normal functioning of neurons. Post-translational modifications in tau proteins lead to NFTs [8], which form aggregates that promote synaptic and neuronal dysfunction by damaging dendritic spines and impair axonal functions [9]. The other major characteristic of disease is neuroinflammatory response. Pathological alterations in AD state are detected by tissue-resident immune cells, microglia, and perivascular macrophages. They produce cytokines and chemokines that exacerbate inflammation and, eventually, result in neurodegeneration and neuronal loss [10]. Like microglia, astrocytes also release cytokines after exposure to Aβ and misfolded proteins, hence they aggravate the neuroinflammatory response. In addition, astroglial atrophy occurs in AD, leading to the destruction of synaptic transmission, which contributes to cognitive impairment [11].

In order to understand the complex molecular mechanisms and reveal the heterogeneous nature of the diseases, various methods and data sources are employed. Transcriptomics, one of the omics technologies, is widely used in disease diagnosis and for the elucidation of molecular mechanisms by enabling the extraction of important genes and pathways in complex systems [12]. Williams and colleagues conducted a study with transcriptome data from the prefrontal cortex region of AD patients. They revealed that synaptic function-associated genes were downregulated while genes connected to immune response were upregulated in AD. Genes encoding postsynaptic receptors, transporters, and enzymes mediating glutamatergic and GABAergic transmission were also downregulated [13]. In another study, Li et al. performed a meta-analysis by combining six transcriptome data and found candidate biomarker genes for AD. These genes were substantially downregulated and strongly linked with neuropathology. Also, their pathway analysis showed that the most affected pathways were related to nitric oxide and reactive oxygen species, NF-κB, and mitochondrial dysfunction [14]. In those studies and several other studies that use transcriptome data to investigate AD, the samples from patients are grouped together and compared with the control samples as a group, ignoring individual molecular differences between patients.

Differentially expressed genes (DEGs) alone are often insufficient to identify complicated molecular interactions. Subnetwork discovery tools, or, in other words, active module identifiers, are frequently employed in systems medicine to reveal potential molecular mechanisms of disease from protein-protein interaction networks. They mainly integrate the molecular profiles and knowledge of biological networks with different strategies and algorithms. Among the many tools mentioned in the literature for this purpose, DOMINO was shown to perform better in a benchmarking study [15] since it aims to eliminate non-specific over-represented Gene Ontology (GO) terms for the genes in the discovered subnetwork. It generates multiple modules, which are biologically rich and functionally distinct. Although it was published recently, it has been used in several studies as a subnetwork discovery tool [16, 17].

Although many studies have been conducted to understand the molecular mechanism of AD [18,19,20], its causes and complex mechanism still remain unclear. Due to the heterogeneous nature of the disease, putting all the data in the same pool and evaluating them as a whole may lead to missing key points. In this study, we used transcriptome data from the ROSMAP cohort [21], arguably the largest RNA sequencing dataset for AD, and adopted a personalized approach to evaluate all 403 AD samples individually, aiming to explore important genes and unknown mechanisms of disease pathology. To this end, we mapped each patient data separately on a human protein-protein interaction network to identify patient-specific subnetworks, which were separately investigated by functional enrichment analysis and in terms of known AD genes. Our patient-specific personalized approach captured several AD-related dysregulations that could not be identified by bulk comparisons of AD and control samples, offering a promising approach towards unraveling undiscovered molecular features of AD.

Methods

Fig. 1 provides the personalized methodology we followed in this study to investigate the molecular mechanisms of AD.

Fig. 1
figure 1

Workflow of the method applied individually for each AD patient in this study. 1 Differentially expressed gene (DEG) list was created for each patient (n = 403) separately based on two different tests: ranking-based approach and one-sample t-test. 2 Dysregulated subnetworks were discovered for each patient separately by mapping DEGs on the human protein-protein interaction network using the DOMINO algorithm and 3 functional enrichment analysis was applied to the genes in each subnetwork to identify dysregulated molecular processes

Transcriptome Datasets

Transcriptome data from the Religious Orders Study and Rush Memory and Aging Project (ROSMAP) [21], arguably the most comprehensive study conducted within the scope of AD in terms of the number of AD patients covered, was used in this study. The dataset includes RNA-sequencing data from the dorsolateral prefrontal cortex (DLPFC) region of the brains of 658 participants. The samples in the dataset were grouped as disease or control based on the CERAD score, a semiquantitative measure of neuritic plaques in brain. This led to 404 Alzheimer’s disease patients (CERAD scores 1 and 2, corresponding to definite and probable AD) and 165 controls (CERAD score 4, corresponding to no AD). Raw datasets of these 569 participants were downloaded from the Synapse database (accession code: syn3388564) in the fastq file format. Low-quality reads were trimmed with Trimmomatic (version 0.39) [22], and the reads were aligned to the reference human genome (hg38) by the STAR algorithm (version 2.7.8a) [23]. To obtain the raw read counts, FeatureCounts (version 2.0.2) [24] was used and raw read counts were normalized by the DESeq2 package [25] in R. Principal component analysis (PCA) was performed with the normalized data of 569 samples (404 AD and 165 control) in R to detect outlier samples that are clearly separated from the rest of the samples in a group when the samples were mapped on a two dimensional plot based on the variations in their gene expression profiles. The expression data was centered and scaled before PCA. Two samples were considered as outliers and excluded from the cohort: one was from AD and the other was from the control group (sample IDs: 500_120515 and 380_120503). All further analyses in the study were performed with 403 AD and 164 control samples. To eliminate confounding effects of age, sex, and PMI (postmortem interval), a mixed linear model was applied to the log2-transformed normalized data by lm() function in R. The detailed demographic information of participants can be found in Table 1.

Table 1 Demographic information of the control and AD groups from the ROSMAP cohort included in this study

Statistical Analysis

We adopted a personalized approach in this study where each sample from the AD group was compared to the control group to detect DEGs in each patient. To this end, we used a ranking-based approach as follows: (i) for the selected patient, the expression value of each gene of the patient was combined with the expression values of that gene in the control group, leading to a list of 165 values (1 value from the patient, 164 values from the control group), (ii) the expression values in the list were ranked from the smallest value to the largest value for each gene, (iii) a gene was considered to be downregulated for that patient if it was within the first 5th percentile in the ranked list while it was considered to be upregulated if it was within the last 5th percentile, and (iv) the procedure from (i) to (iii) was repeated for each patient. We also calculated Benjamini-Hochberg corrected p-values with one-sample t-test for each gene in each AD sample, and used this as additional cut-off (p-value < 0.01) in identifying DEGs for each sample (Fig. 1). We observed that the genes identified by our ranking-based approach had significant p-values in most cases. We confined our analysis to protein-coding genes in the transcriptome dataset. In the bulk analysis where all AD samples as a group were compared with all control samples as a group for protein-coding genes, limma [26] package was used, and the cut-off value was chosen as 0.05 for Benjamini-Hochberg corrected p-values.

Subnetwork Discovery from Protein-Protein Interaction Data

Human protein-protein interaction (PPI) network was downloaded from BioGRID [27] (version 4.4.207). It consists of physical, chemical, and genetic interactions between human proteins and interactions between human proteins and proteins of other organisms. Only physical interactions between human proteins were retained, and other interactions were excluded. Also, self-loops and duplicated edges were removed from the network. In its final form, the PPI network included 19,634 proteins and 678,802 interactions.

Subnetwork discovery was performed with the DOMINO algorithm [29]. To this end, for each patient, subnetworks were created for the upregulated and downregulated DEGs separately. The DOMINO analysis was performed with version 0.1.1 without changing the default parameters. A comparison of the results with an alternative subnetwork discovery algorithm (KeyPathwayMiner, KPM) [28, 30] was also provided. For the KPM analysis, standalone version 5.0 was used with greedy algorithm and Individual Node ExceptionS (INES) model. The maximum number of unchanged genes allowed in a discovered subnetwork was set to 8 through the parameter K. Both algorithms were run with the same DEG lists and the same PPI network as input.

Enrichment Analysis

Functional enrichment analysis was applied to the genes in each discovered subnetwork via the R package of g:Profiler [31, 32], with default correction type (g:SCS) and 0.05 p-value cut-off. Since DOMINO produces multiple modules with differing sizes, enrichment analysis was applied to the modules with the number of included genes greater than or equal to 10. Significantly enriched terms coming from the multiple modules of a given sample were combined for final analysis.

Selecting AD-Associated Genes and AD-Related Functional Terms

To evaluate the results obtained from subnetwork discovery and enrichment analysis, AD-associated gene lists and functional terms were identified. Four different AD-associated gene lists were used: (i) a list compiled from three different GWAS studies that were conducted to identify AD risk genes [33,34,35], (ii) a list including the genes linked with the term “KEGG:05010 Alzheimer’s disease,” obtained via g:Profiler, (iii) a list created by searching for the term “Alzheimer’s disease” in the GeneCards database, and (iv) a manually curated list of AD-associated genes, termed Alzgset, compiled by extensive scanning of 823 published genetic association studies in PubMed [36]. There are 117 genes in GWAS, 353 genes in KEGG, 191 genes in GeneCards, and 430 genes in Alzgset lists. These AD-associated gene lists can be accessed from Supplementary File 3.

Functional terms from KEGG, WikiPathway, and GO reflecting biological/neuronal processes known to be associated with AD, representing the pathophysiology of the disease and found in the enrichment results of the subnetworks, were determined. While selecting these terms, we have also benefited from a recent study by Morgan et al. [37]. Selected AD-related terms are given in Supplementary File 2.

Results and Discussion

In this study, we aimed to elucidate the molecular mechanisms of AD by adopting a personalized approach. RNA-seq data of 403 AD patients from the ROSMAP cohort were used for this purpose. The very beginning step of this study is the detection of upregulated and downregulated DEGs for each patient separately. By following our personalized approach (see “Methods,” “Statistical Analysis” section), we created upregulated and downregulated DEG lists for each patient (Fig. 1). For comparison, the ranges of the number of upregulated and downregulated DEGs are shown in Fig. 2A with a violin plot.

Fig. 2
figure 2

A Violin plots showing the range of the number of downregulated and upregulated DEGs for each AD patient from ROSMAP cohort. B Violin plots of number of genes in the downregulated and upregulated subnetworks discovered by DOMINO. C The top 500 genes in the upregulated and downregulated subnetworks were intersected with the AD-related gene lists

To reveal the affected mechanisms, subnetwork discovery was performed through the DOMINO algorithm from the human protein-protein interaction network by using upregulated and downregulated DEGs separately for each patient. As a result of its design, DOMINO produces multiple distinct modules, each of which has a different number of genes. It did not discover any modules for the downregulated DEGs of 23 patients and the upregulated DEGs of 9 patients. There was only one patient with no modules. Genes from these discrete modules were combined when reporting the total number of genes in the subnetworks. Fig. 2B shows violin plots of the number of genes in the upregulated and downregulated subnetworks discovered by DOMINO.

To see how many AD-related genes were present in the discovered subnetworks, the top 500 mostly encountered genes in the subnetworks were identified separately for upregulated and downregulated subnetworks, and their intersections with four different AD-associated gene lists were determined (Fig. 2C). In constructing the top 500 gene list, the genes were scored based on the number of patient-specific subnetworks they appeared, and the genes with the same score as the 500th gene were also taken into account. While the intersections with GWAS and GeneCards lists were low in general, the highest intersections were observed with the KEGG list. As a comparison, we repeated patient-specific subnetwork discovery analysis with an alternative algorithm, KeyPathwayMiner [28] (Supplementary File 6). In agreement with a recent benchmarking analysis that showed superiority of DOMINO over KeyPathwayMiner and other subnetwork discovery algorithms [15], DOMINO is more powerful in (i) generating subnetworks more consistent in size across the patients, (ii) capturing AD-associated genes, and (iii) capturing AD-related functional terms (Supplementary File 6).

Dysregulated Genes and Pathways Uncovered by the Patient-Specific Subnetworks Spotlight a Wide Range of AD-Related Mechanisms

To determine which pathways are associated with the discovered subnetworks, an enrichment analysis was carried out using the genes of each patient-specific subnetwork. Combining the enrichment results of the downregulated DOMINO subnetworks gives 15,451 unique terms while 16,286 unique terms were obtained from the combination of the enriched terms of the upregulated subnetworks. Then, majority voting was applied to the enrichment results to identify the number of patients found to be significantly associated with a given functional term. Supplementary File 1 provides enrichment results of the upregulated and downregulated subnetworks.

We calculated the prevalence of the enrichment terms and genes among patients to summarize the large number of results we generated. While doing this, the number of occurrences of a term or gene was divided by the total number of patient subnetworks and then multiplied by 100. Based on the prevalence rates of each identified term in the upregulated and downregulated results, a heatmap was produced (Fig. 3). The terms considered were grouped into seven categories as follows: mitochondria, immune response, cell death, calcium ion balance, synaptic activities, endoplasmic reticulum, and fatty acid metabolism.

Fig. 3
figure 3

Comparison of the percent prevalence rates of AD-related functional terms from KEGG, WikiPathway, and GeneOntology annotations based on the enrichment results of the upregulated and downregulated subnetworks

Mitochondria

The first obvious difference is in the mitochondria-associated terms. Mitochondria is an essential component of neurons because they both produce ATP and buffer calcium ion concentration at the synaptic region [38]. Impaired energy metabolism observed in AD state is an indication of disrupted mitochondrial activities. Moreover, damaged mitochondria has a higher potential of producing reactive oxygen species (ROS), and less efficient in ATP production. This increases oxidative stress, which further impairs mitochondrial functions [39]. Fig. 3 shows that altered ATP-related functions are primarily associated with the downregulated subnetworks. The enrichment results of downregulated subnetworks contain more mitochondria-related terms than upregulated ones, and their prevalence rates are higher (for detailed results, see Supplementary File 1). This indicates that there is a suppression of genes that control the mitochondrial activity in AD. The two genes mostly encountered in the downregulated subnetworks are TBRG4 and COX4I1. The mitochondria-localized protein TBRG4 (transforming growth factor beta-regulator 4) is a member of FASTK family. This family is involved in mitochondrial RNA processing, regulation of the mitochondrial mRNA stability, and modulation of mitochondrial respiration. TBRG4 deficiency promotes apoptosis and elevates the generation of ROS [40]. Mitochondrial cytochrome c oxidase (COX, complex IV) is involved in the mitochondrial respiratory chain as the final electron acceptor. It has 13 subunits. COX4 is one of the subunits, and the most common isoform of COX4 is COX4I1. Its deficiency elevates ROS levels and impairs ATP production [41].

Immune Response

Immune response is another important mechanism in AD. The related enriched terms are more prevalent in the upregulated subnetworks than the downregulated ones. In AD case, accumulated or misfolded proteins in the extracellular matrix like Aβ bind to pattern recognition receptors (PRRs) on microglia and astrocyte and activate them. This activation is predominantly initiated by toll-like receptors (TLRs) [42]. Activation of different types of TLRs triggers cytokine signaling networks, causing the release of various cytokines like TNF-α, IL-6, and TGF-β, which have proinflammatory effect [43]. The predominance of pathways related to these mechanisms in upregulated subnetworks is consistent with the disease pathology.

Cell Death Mechanism

Another striking difference in our results is in the terms related to cell death. Cell death mechanism is a highly controlled process required for healthy growth and tissue homeostasis. Several pathway alterations in the AD brain cause mitochondrial dysfunction, stress on the endoplasmic reticulum, and disruption of the normal functioning of cell death mechanisms, inducing the apoptotic process [44]. A high rate of neuronal death is seen in various parts of the brain, especially in the entorhinal cortex, hippocampus, amygdala, and cerebral cortex [45]. When the terms of upregulated and downregulated networks are compared, it is observed that the terms related to cell death like “apoptosis”, “apoptotic process”, and “apoptosis modulation and signaling” are predominant in the upregulated networks.

Endoplasmic Reticulum (ER)

ER and Golgi apparatus-associated pathways are also among the pathways where the differences were observed. ER is a membranous organelle with very basic functions such as folding of protein molecules, transport of synthesized proteins to the Golgi with vesicles, and regulation of calcium homeostasis in the cell [46]. In the ER, a subpopulation of cytosolic proteins, integral membrane proteins, and secretory proteins are all synthesized. Ribosomes on the cytosolic face of the ER produce proteins that are taken up into the ER lumen for processing [47]. Before being exported from the ER, proteins are folded, assembled, and undergo post-translational modifications. Misfolded proteins are either corrected in the ER or, if not corrected, are discovered by the ER and sent to the cytosol for degradation [48]. Buildup of misfolded proteins and disruption of intracellular Ca2+ homeostasis cause ER stress, resulting in neuronal dysfunction and, eventually, cell death [49]. Our results show that the terms “ER lumen”, “protein processing in the ER”, “response to ER stress”, and “protein folding in the ER” dominantly take place in the downregulated subnetworks, consistent with the AD pathology.

Dysregulated Genes in AD

It is also important to identify candidate genes in terms of revealing mechanisms of the disease and drug development. To this end, the genes in the four lists of known AD genes mentioned in the “Methods” section (Supplementary File 3) were ranked according to their prevalence in the patient-specific subnetworks obtained by combining the up- and downregulated subnetworks, and the interaction map was created by selecting the first 50 genes (Fig. 4A). Here we see HSPA5 and HSP90AA1 from the heat shock protein family in the first row. In the third place is the COX4I1 gene, the most common isoform of one of the subunits of the Cox involved in the mitochondrial respiratory chain. In order to identify novel genes that are not yet associated with AD but may be involved in the disease mechanism, the interaction map of the first 50 non-AD genes and the first 30 known AD genes that were most frequently involved in the subnetworks was visualized (Fig. 4B). We use the terms “non-AD genes” to refer to genes not included in the list of known AD genes (Supplementary File 3). The non-AD gene XPO1 stands out as the gene with the highest interaction in this network. It encodes the protein that mediates leucine-rich nuclear export signal (NES)-dependent protein transport. XPO1 inhibitors are not only anticancer and antiviral focused compounds, but they are also candidates for the treatment of neurodegenerative diseases. In preclinical models of inflammatory demyelination and axonal injury, XPO1 inhibition slowed the progression of the disease [50]. One other gene in Fig. 4B is CIT, citron rho-interacting serine/threonine kinase. It specifically targets the Rho-Rac binding proteins. In AD, Rho-Rac1-GTPase signaling dysregulation promotes the degeneration of synapses, APP processing, and increases the phosphorylation of tau [51]. The third most interacting gene is OCIAD1, ovarian cancer immunoreactive antigen domain containing 1. It can decrease oxidative phosphorylation by interacting with electron transport chain proteins and found to be neurodegeneration related factor for AD in a study. It was shown that OCIAD1 disrupts mitochondrial functions, which causes neuronal vulnerability and synaptic damage, and thus contributes to neurodegeneration in AD [52].

Fig. 4
figure 4

Genes were ranked based on their numbers of appearance in the discovered subnetworks, and A protein-protein interactions between the top 50 known AD genes were visualized. B Protein-protein interactions between top 50 non-AD genes and top 30 known AD genes were visualized. Inner circle in B belongs to the known AD genes and outer circle belongs to non-AD genes. Colors of genes in both A and B change based on the degrees of the genes in the graphs

Personalized Approach can Capture Many Specific AD Mechanisms that are Missed by the Bulk Approach

To understand whether our patient-specific personalized approach has an advantage over the conventional bulk-comparison approach, which is the standard in the literature, we performed a bulk analysis by comparing all control samples with all AD samples, and obtained 2073 DEGs. Then, as in the patient-specific analysis, subnetwork discovery was performed with these DEGs by using DOMINO, and enrichment analysis was applied to the genes in the discovered subnetwork. DOMINO and g:Profiler analyses were performed using the same parameters as in the patient-specific analysis.

The enriched terms for the bulk subnetwork included many terms related to mitochondrial functions, respiration, and ATP synthesis, which are very important for AD (Supplementary File 4). Likewise, these terms are also frequently encountered in the patient-specific analysis results. Regarding synaptic functions, which are also important for AD pathology, there are only few terms in the bulk result, and these are related with “presynapse”, “synapse”, “postsynapse”, and “synaptic vesicle”. On the other hand, terms that allow us to learn more and in-depth information about synapses are found only in the patient-specific results. Fig. 5 shows some of the terms included only in the enrichment results of the patient-specific analysis.

Fig. 5
figure 5

Illustration of some of the terms included only in the patient-specific upregulated and downregulated enrichment results. Sizes of the points change regarding the prevalence rate of the related terms

Another striking difference is that inflammation or immune system-related terms, which have a central place in AD pathology [53], were not identified in the bulk analysis enrichment results. In the patient-specific enrichment results, on the other hand, there are many terms related to inflammation and immune response like “immune response”, “immune system process”, “cytokine-mediated signaling pathway”, and “inflammatory response”. The terms related to the ER also differ between the two approaches. The bulk analysis results have only a few terms from Reactome Database reflecting ER-golgi transmission; on the other hand, our personalized analysis results contain many terms related to endoplasmic reticulum and its functional processes like “Protein processing in endoplasmic reticulum”, “response to endoplasmic reticulum stress”, and “endoplasmic reticulum lumen”. In addition, there are a limited number of terms in the bulk analysis enrichment results that reflect the mechanism of cell death, and these are only terms related to the autophagy mechanism. However, as it is known, the predominant cell death mechanisms in AD are apoptosis and necrosis [54], and the terms related with these processes appear in the patient-specific results such as “apoptotic process” and “apoptosis”. These mechanisms are triggered by the activation of caspase cascade, TNF, and TLR pathways [54]. Although personalized analysis results, especially upregulated ones, include different terms reflecting cell death mechanisms, there is no term referring to these mechanisms in the bulk analysis results. Another issue where differences are captured is the terms related to the SMAD protein. These proteins are activated by TGF-β binding to cell wall receptors and then move from the cytoplasm to the nucleus to activate or suppress transcription [55]. Abnormal localization of phosphorylated SMAD proteins in AD affects TGF-β signaling pathway, impairing the neuroprotective functions of this mechanism [56]. While the terms related to SMAD were observed in the patient-specific results, no terms were found in the bulk analysis results.

Protein homeostasis is essential for normal functioning of the cells. Ubiquitin-proteasome system (UPS) and the ubiquitin-like protein (UBL) conjugation pathways are crucial for maintaining this homeostasis [57]. Ubiquitin is a member of UBL family and it is an important component of the UPS, which degrades more than 80% of normal or misfolded cellular proteins [58]. In the AD brain, abnormal UPS activity, ubiquitin accumulated plaques, and tangles are observed. Also proteasomal activity decreases especially in hippocampus [59]. Although there are many terms related to these mechanisms and proteins in the patient-specific results such as “ubiquitin-dependent protein catabolic process” and “protein ubiquitination”, there is no related term in the bulk analysis. Another UBL where the difference is observed between the two approaches is small ubiquitin-like modifier (SUMO), and its related terms. SUMO is a group of proteins that are attached to or detached from proteins during post-translational modifications, and this phenomenon is called SUMOylation. It plays important roles in many cellular processes and regulation of biological functions of proteins [60]. In AD, dysregulations occur in SUMOylation [61], but only one term reflecting these alterations is found in the bulk analysis result, while many SUMO-related terms are found in the sample-based enrichment results (Fig. 5). The same observation holds also for NEDD8, another UBL member whose activity is deregulated in AD [62].

The phosphatidylinositol 3-kinase (PI3K)/AKT signaling pathway is another pathway where the differences are noticeable between the bulk and patient-specific results. The PI3K/AKT signaling pathway regulates a wide range of processes in the brain, including motility, intracellular trafficking, cell proliferation, growth, differentiation, and more complex procedures like dendrite and axon extension. It also has essential role in maintaining synaptic plasticity and memory process [63]. An abnormality in PI3K/AKT signaling can increase the activity of GSK-3β, which is a crucial molecule in the downstream of the pathway, and cause hyperphosphorylation of Tau, which causes the formation of NFTs [64]. Neurons can be protected against Aβ-induced toxicity by activating PI3K/AKT signaling pathway [65]. The terms reflecting these pathways and the proteins that are central for a thorough understanding of the molecular mechanism underlying AD are present in the patient-specific results, but they are absent in the results of the bulk analysis. These altogether show that our personalized approach can reveal different aspects of each patient, and the unobtrusive points in the bulk analysis are made visible with the patient-specific analysis.

PPI-Based Personalized Results Highlight Differences in the Subtypes of AD

It is a known fact that AD is a heterogeneous disease and has subtypes. Not all patients show the same pathological findings, and the disease progresses by different mechanisms [66]. Therefore, we also evaluated our patient-centered personalized results by referring to the subtyping suggested in a recent study by Neff et al. [67]. In that study, they used the Mount Sinai Brain Bank (MSBB) dataset to subgroup AD patients and validated these results with the ROSMAP dataset, where our personalized subnetworks are based. They identified 5 molecular subtypes, which they called blue, green, red, turquoise, and yellow under three major classes. There are 123 patients in the blue subtype, 96 patients in green, 80 in red, 56 in turquoise, and 35 in yellow subtype. In this section, we evaluate the 390 ROSMAP AD samples used by Neff et al. and show differences between subtypes by calculating prevalence rates of enriched terms and genes for each subtype separately for upregulated and downregulated subnetworks.

Fig. 6 illustrates how the prevalence rates of the enriched terms differ in the subtypes. The results are consistent with the fact that there are subtypes of the disease. The evaluation of the enrichment results revealed differences between the disease subtypes on the basis of biological processes and cellular functions, and these are detailed in the subsections below for each subtype (Figure 6). Moreover, in order to contribute to the elucidation of the molecular mechanism of the disease, the genetic alterations that cause these differences should also be revealed. For this purpose, first the prevalence rates of the genes in the subnetworks of each subtype were determined separately. Prevalence rate of a gene shows the percentage of patients that include that gene in the discovered subnetwork for a given subtype. Next, previously used known AD genes were compared with the genes in the subnetworks of each subtype, and the genes that had the highest prevalence rate were selected when compared to the other subtypes. This provides a different way of analyzing the differences between the disease subtypes. The prevalence rates of AD-related genes in discovered subnetworks for each subtype are given in Supplementary File 5. Each subtype will be discussed separately in the context of subtype-specific enrichment results and genes that differ from other subtypes.

Fig. 6
figure 6

Comparison of percent prevalence rates of selected AD-related functional terms from KEGG, WikiPathway, and GeneOntology databases across five AD subtypes reported by Neff et al. [67] for downregulated and upregulated subnetworks. The functional terms were grouped into eight molecular systems, known to be altered in AD. Blue, green, red, turquoise, and yellow designate the AD subtypes, as named in the original article. Percent prevalence of a term shows the percentage of patients whose discovered subnetworks are statistically enriched with the genes associated with that term

Blue Subtype

For the downregulated processes, for example, the patients in the blue subtype have a low prevalence rate in the terms related to immune response and cell death. However, the prevalence rates of the terms related with mitochondria, synapse, and calcium are high compared to the other subtypes in the blue subtype (Fig. 6). For the upregulated processes, the blue subtype shows very high prevalence rates in the terms related to cell death and immune system. There is also a slight upregulation of fatty acid metabolism. High levels of oxidative stress and ongoing bioenergetic disruption activate the apoptotic pathway, impair mitophagy, and ultimately cause the death of neuronal cells [68], which explains why there is an increase in immune response and cell death while there is a decrease in mitochondrial activity. Unlike other subtypes, the genes most frequently involved in the blue subtype are SLC6A3 and SNCA. The SLC6A3 gene, which is present in 23% of the subnetworks, is the most prominent gene. This gene encodes dopamine transporter (DAT) protein, which has often been associated with neuropsychiatric diseases. Abnormality in regulations of dopamine levels is associated with depression, bipolar disorder, and Parkinson’s disease. Since low levels of dopaminergic neurotransmitters are linked to AD [69], the function and significance of this gene can be further examined in terms of AD. Another gene is SNCA, which is found in 21% of subnetworks in the blue subtype. Although the function of α-synuclein encoded by SNCA is not exactly known, it is considered to be involved in the regulation of dopamine release and transport, the fibrillation of tau, neuroprotection by decreasing caspase activity [70].

Green Subtype

In the green subtype, cell death and immune response, which do not have high prevalence rates in other subtypes, are downregulated. Moreover, unlike the other subtypes, the rates of synapse-related terms in the green subtype are higher in the upregulated subnetwork results. At the same time, the prevalence rate of terms related to mitochondria is high (Fig. 6). For the green subtype, PSMC5 and PSMB6 genes, which encode subunits of the proteasome complex, draw attention with 13% prevalence rate in subnetworks. As mentioned earlier, the proteasome is a large, highly complex protein machinery that is responsible for protein degradation. In a study by Drummond et al., it was observed that phosphorylated tau has a significant interaction with PSMC5 and some other 19S subcomplex proteins [71].

Red Subtype

In the red subtype, while endoplasmic reticulum and fatty acid metabolism-related terms are downregulated, cell death-related terms are upregulated (Fig. 6). Polyunsaturated fatty acids (PUFAs) play critical roles in brain development and maintenance, and their deficiency or abnormality is associated with neurological problems such as AD, bipolar disorders, and major depression [72]. Linoleic acid (LA), docosahexaenoic acid (DHA), and eicosapentaenoic acid (EPA) levels are decreased in the AD brain. Although fatty acid metabolism is not affected in each subtype, these results show that it plays an active role in some subtypes of the disease (Fig. 6). The first AD-related gene identified in the red subtype, with a prevalence rate of 38%, is HSPA5. It encodes the binding immunoglobulin protein (BiP), which is a member of the HSP70 chaperon family. It can initiate the unfolded protein response (UPR) when the amount of unfolded/misfolded protein in the ER exceeds the capacity of the protein folding machinery [73]. The other notable gene is TP73, which has 29% prevalence rate in the red subtype and encodes the p73 protein. It is a transcription factor that is essential for neuronal development, stem cell renewal, differentiation, cell death, and proliferation [74]. Happloinsufficieny of p73 can induce aberrant hyperphosphorylation of tau and tau kinase activity and thus tau aggregation in aging and AD [75].

Turquoise Subtype

In the turquoise subtype, endoplasmic reticulum, mitochondria, and synapse-related terms were downregulated, while cell death and immune response-related terms were upregulated, similar to the majority of the subtypes (Fig. 6). Close physical contact exists between mitochondria and ER subcompartments, and this may be the reason why they are downregulated together in turquoise subtype. These contact sides, called mitochondria-associated ER membranes (MAMs), play very critical role in calcium homeostasis, lipid synthesis, apoptotic signaling, and autophagy [76]. This region is affected by AD pathogenesis [77]. First notable gene associated with AD for the turquoise subtype is MYH8 with a prevalence of 20%. Actin-based motor protein MYH8 is involved in the contraction of skeletal muscles as well as cell adhesion and migration mediated by integrins [78]. In some studies, it was observed that the MYH8 gene was downregulated in AD cases and was associated with AD [79], [80]. Another gene that stands out in terms of prevalence rate is the famous AD-related gene PSEN1 that has been studied extensively. It is well known that this gene’s mutations contribute to autosomal dominant early-onset Alzheimer’s disease [81]. However, it is not predominant in all disease subtypes, suggesting that it may not be the defining feature in all AD cases.

Yellow Subtype

In the yellow subtype, we observe that the terms associated with fatty acids and mitochondria are downregulated whereas the terms associated with immune response and cell death are upregulated (Fig. 6). HSP90AA1 and HSP90AB1 genes draw attention in the yellow subtype with 29% prevalence rate, unlike the other subtypes. Stressful situations like heat shock, ischemia, heavy metals, and hypoxia trigger the synthesis of heat shock proteins (HSPs) [82]. They are divided into several families, and the molecular chaperone HSP90s is one of them. They are able to target specific proteins for degradation, suppress protein aggregation, and solubilize protein aggregates [83]. HSP family genes are among the most prevalent genes for the yellow subtype, which motivates further study of these genes in an effort to uncover previously unknown aspects of the disease.

A summary of how cellular mechanisms vary between the AD subtypes is shown in Fig. 7. The cell death mechanism was upregulated in the majority of the subtypes, while it was downregulated only in the green subtype. In some AD cases, while no apoptotic morphology was observed in any part of the brain, it was observed that the cells were swollen and DNA fragmentation increased [54]. With the exception of the red subtype, cell death and immune response appear to act in together. This implies that both mechanisms trigger or activate each other.

Fig. 7
figure 7

A diagram that shows the alterations in five AD subtypes in terms of commonly affected cellular mechanisms, derived from the analysis of the prevalent AD-associated functional terms given in Fig. 6

Conclusion

The disease, described by Alois Alzheimer in 1906, has been the subject of many research since then. Despite these studies conducted from different perspectives, the cause of the disease and the underlying molecular mechanisms are not yet fully understood. Although there are many studies in the literature with transcriptomic data, the majority of these studies handled the data of the patients collectively and thus could not reveal the differences caused by the heterogeneous nature of the disease.

In this study, each patient was separately compared with the whole control group, and patient-specific profiles in terms of genes and pathways were revealed. Instead of considering only DEGs in the evaluations, subnetwork discoveries were performed, and DOMINO algorithm was employed for this purpose. As a result of the evaluation of patient-specific results, significant alterations were identified in mitochondrial activities, immune response, cell death mechanism, and endoplasmic reticulum in AD state. Additionally, it was demonstrated that the patient-specific approach captured many more AD-related dysregulations than the bulk analysis, indicating that a personalized approach is a much more effective way to decipher previously undiscovered features of the disease. It has been shown that molecular processes can be altered with different strength and even in different direction between the disease subtypes. In addition, novel AD genes were highlighted based on their interactions with the known AD genes. Our personalized approach captured alterations in several molecular processes that could not be identified by bulk comparisons of AD and control samples. More transcriptomic datasets with high number of individual samples are becoming available at an increasing speed for Alzheimer’s disease. The patient-specific approach presented here can be applied to such datasets or datasets from other neurological diseases that are heterogeneous in nature to extract biological information that, otherwise, would remain hidden in the data.