Abstract
Amyotrophic lateral sclerosis (ALS) is a neurodegenerative disease with a strong genetic component. To identify novel risk variants for ALS, we conducted a genome-wide expression association analysis using the summary data-based Mendelian randomization (SMR) method, drawing on the latest genome-wide association study (GWAS) and expression quantitative trait locus (eQTL) data. Summary data were derived from a large-scale GWAS of ALS involving 12,577 cases and 23,475 controls. The eQTL annotation dataset included 923,021 cis-eQTLs for 14,329 genes and 4732 trans-eQTLs for 2612 genes. Genome-wide single gene expression association analysis was conducted with the SMR software. To identify ALS-associated biological pathways, the SMR results were further subjected to gene set enrichment analysis (GSEA). SMR single gene analysis identified one significant and four suggestive genes associated with ALS: C9ORF72 (P value = 7.08 × 10−6), NT5C3L (P value = 1.33 × 10−5), GGNBP2 (P value = 1.81 × 10−5), ZNHIT3 (P value = 2.94 × 10−5), and KIAA1600 (P value = 9.97 × 10−5). GSEA identified 7 significant biological pathways, such as PEROXISOME (empirical P value = 0.006), GLYCOLYSIS_GLUCONEOGENESIS (empirical P value = 0.043), and ARACHIDONIC_ACID_METABOLISM (empirical P value = 0.040). Our study provides novel clues for studies of the genetic mechanism of ALS.
Introduction
Amyotrophic lateral sclerosis (ALS) is the most common motor neuron disease and mainly affects the motor neurons controlling voluntary muscles. ALS leads to muscle stiffness, twitching, and progressive weakness (Bonifacino et al. 2016). Most patients with severe ALS die of respiratory failure or pneumonia. Although ALS can occur at any time during adulthood, it usually affects people in their mid-fifties (Eisen 2009). The incidence of ALS has been estimated at 1.2–4 cases per 100,000 people per year in developed countries (Gordon et al. 2013). Most patients die within 2–5 years of developing ALS (Xiao et al. 2016).
It has been demonstrated that genetic factors play an important role in the pathogenesis of ALS. A group of susceptibility genes or loci have been identified for ALS, such as SOD1 (Rosen 1993), FUS (Vance et al. 2009), OPTN (Maruyama et al. 2010), VCP (Johnson et al. 2010), and C9ORF72 (DeJesus-Hernandez et al. 2011; Majounie et al. 2012; Renton et al. 2011). It is estimated that these candidate genes account for 25–30% of familial ALS (Orr 2011). Recently, a large genome-wide association study (GWAS) of ALS reported MOBP and SCFD1 as risk loci for ALS, and verified that ALS is a complex genetic trait with a polygenic architecture (van Rheenen et al. 2016). However, the identified loci explain only a limited share of the genetic risk, suggesting that additional genetic factors are implicated in the development of ALS.
GWAS are a powerful approach and have achieved great success in mapping susceptibility genes for complex diseases and traits, including ALS. However, GWAS have several limitations. For instance, because of strict statistical significance thresholds, GWAS usually focus on a few loci with the most significant association signals. Without considering the joint effects of multiple functionally related genes, GWAS have limited power to detect causal loci with moderate or weak genetic effects. Because most genes do not work individually and complex diseases are generally determined by complicated biological processes, identifying several disease-associated genes is often insufficient to reveal the genetic architecture of complex diseases. Inspired by the gene set enrichment analysis (GSEA) of microarray data, GWAS-based pathway association studies were proposed (Wang et al. 2007). By jointly analyzing GWAS summary data and known biological pathways, pathway association studies can provide more pathogenetic information because they consider the joint biological effects of multiple functionally related genes (Wang et al. 2010).
Expression quantitative trait loci (eQTLs) are genomic loci that affect gene expression levels. Recent studies have confirmed the important roles of eQTLs in the pathogenesis of human complex diseases (Ertekin-Taner 2011; Gibson et al. 2012; Murphy et al. 2010; Nicolae et al. 2010). The disease-associated SNPs identified by GWAS are also significantly enriched in eQTLs (Nicolae et al. 2010). However, because most disease-associated eQTLs lie outside genes, they were easily overlooked by previous GWAS. Recently, a new method named summary data-based Mendelian randomization (SMR) was proposed, which integrates GWAS summary data and eQTL annotation information to detect novel genes whose expression levels are associated with complex diseases (Zhu et al. 2016). SMR showed high power for identifying novel causal genes of complex diseases (Zhu et al. 2016).
In this study, using the latest published ALS GWAS and eQTL data, we first applied SMR to single gene expression association analysis. To reveal the biological significance of the single gene results, we further extended SMR to pathway association analysis: the SMR single gene analysis results were subjected to GSEA to identify novel ALS-associated pathways with known function.
Materials and Methods
GWAS Summary Data of ALS
A large-scale GWAS meta-analysis of ALS was used in this study (van Rheenen et al. 2016). Briefly, this GWAS meta-analysis combined two published GWAS: a recent GWAS of 15 cohorts (7,763 cases and 4,669 controls) and a previous GWAS of 26 cohorts (7,028 cases and 22,229 controls). In total, 14,791 cases and 26,898 controls from 41 cohorts were analyzed. Quality control (QC) was first performed per cohort to remove low-quality SNPs and individuals. After quality control, 12,577 cases and 23,475 controls were included (van Rheenen et al. 2016). For genotype imputation, prephasing was first performed for each stratum using SHAPEIT2 with the 1000 Genomes Project phase 1 haplotypes as a reference panel. Subsequently, strata were imputed up to the merged reference panel in 5-Mb chunks using IMPUTE2. An inverse-variance-weighted, fixed-effect meta-analysis was performed using METAL. The summary data of the meta-analysis were used in this study. Detailed information on the cohorts, genotyping, imputation, meta-analysis, and quality control approaches can be found in the published study (van Rheenen et al. 2016).
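The inverse-variance-weighted, fixed-effect pooling mentioned above can be illustrated with a minimal Python sketch. It shows the standard textbook estimator for a single SNP and is not METAL's code; the function name `ivw_fixed_effect` and the example effect sizes are hypothetical and chosen only for illustration.

```python
import math


def ivw_fixed_effect(betas, ses):
    """Pool per-stratum effect estimates for one SNP.

    betas: per-stratum effect sizes (e.g. log odds ratios)
    ses:   matching standard errors
    Returns (pooled beta, pooled SE, z score, two-sided P value).
    """
    # Each stratum is weighted by the inverse of its sampling variance.
    weights = [1.0 / se ** 2 for se in ses]
    w_sum = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / w_sum
    se = math.sqrt(1.0 / w_sum)
    z = beta / se
    # Two-sided P value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p


# Hypothetical example: two strata with equal precision.
pooled_beta, pooled_se, z_score, p_value = ivw_fixed_effect([0.10, 0.20], [0.05, 0.05])
```

With equal standard errors the pooled effect is simply the mean of the two estimates, and the pooled standard error shrinks by a factor of the square root of two.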
eQTL Datasets
An eQTL dataset obtained from peripheral blood was used in this study (Westra et al. 2013). In brief, a genome-wide eQTL scan was first conducted in 5311 individuals and replicated in an independent sample of 2775 subjects. Illumina whole-genome Expression BeadChips were used for mRNA expression profiling. SNP genotyping was conducted using commercial platforms, such as Illumina 610 K quad arrays and Illumina HumanHap300 arrays. Imputation was conducted using IMPUTE (Marchini et al. 2007) or MACH (Li et al. 2010) against the HapMap 2 reference panels. In total, 923,021 cis-eQTLs and 4732 trans-eQTLs were identified (Westra et al. 2013).
SMR Single Gene Analysis of ALS
The GWAS summary data of ALS were analyzed by SMR to detect associations between gene expression levels and ALS. SMR resembles a Mendelian randomization (MR) analysis, in which a genetic variant is used as an instrumental variable to assess the effect of gene expression level on a disease phenotype (Zhu et al. 2016). By integrating GWAS summary data and eQTL annotation information, SMR has shown good power to evaluate the impact of gene expression variation on complex diseases (Zhu et al. 2016). In total, 5366 genes with both GWAS summary data and eQTL data were analyzed in this study. Genome-wide significant genes were identified at P SMR < 0.05/5366 = 9.3 × 10−6. Suggestive association signals were identified at P SMR < 1 × 10−4. The heterogeneity in dependent instruments (HEIDI) test was also conducted by SMR. The genes with P HEIDI > 0.05 were further subjected to GSEA.
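As an illustration of the thresholds above, the following Python sketch computes the SMR test statistic for one gene in the form given by Zhu et al. (2016), T = z_gwas² z_eqtl² / (z_gwas² + z_eqtl²), which is approximately chi-square distributed with 1 degree of freedom, and then classifies a P value against the Bonferroni and suggestive cut-offs. It is a simplified sketch, not the SMR software; the function names `smr_test` and `classify` and the example effect sizes are hypothetical.

```python
import math

N_GENES = 5366                 # genes tested in this study
BONFERRONI = 0.05 / N_GENES    # genome-wide significance threshold
SUGGESTIVE = 1e-4              # suggestive threshold


def smr_test(b_gwas, se_gwas, b_eqtl, se_eqtl):
    """SMR test for one gene using the summary statistics of its top cis-eQTL.

    Returns (b_xy, T_SMR, P value), where b_xy estimates the effect of
    expression on the trait.
    """
    z_gwas = b_gwas / se_gwas
    z_eqtl = b_eqtl / se_eqtl
    b_xy = b_gwas / b_eqtl
    t_smr = (z_gwas ** 2 * z_eqtl ** 2) / (z_gwas ** 2 + z_eqtl ** 2)
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(t_smr / 2.0))
    return b_xy, t_smr, p


def classify(p_smr):
    """Label a gene according to the two thresholds used in the text."""
    if p_smr < BONFERRONI:
        return "significant"
    if p_smr < SUGGESTIVE:
        return "suggestive"
    return "not significant"


# Hypothetical summary statistics (not taken from the study).
b_xy, t_smr, p_smr = smr_test(b_gwas=0.3, se_gwas=0.1, b_eqtl=0.8, se_eqtl=0.2)
```

The HEIDI test, which requires linkage disequilibrium information for multiple SNPs in the cis region, is deliberately left out of this sketch.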
Pathway Enrichment Analysis of ALS
To reveal the biological significance of the SMR single gene expression association results for ALS, we extended SMR to pathway association analysis using the GSEA approach (Wang et al. 2007). A total of 162 biological pathways with known biological function from the Kyoto Encyclopedia of Genes and Genomes (KEGG) database (http://www.genome.jp/kegg/) were analyzed in this study. During the pathway enrichment analysis, 5000 permutations were conducted to calculate the empirical P value for each KEGG pathway.
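A permutation-based empirical P value of this kind can be sketched in Python as follows. The sketch uses a weighted Kolmogorov-Smirnov-like running-sum enrichment score in the spirit of GSEA and, as an assumption, builds the null distribution by shuffling pathway membership across genes; the study's exact permutation scheme is not described here and may differ. The names `enrichment_score` and `empirical_p`, the add-one correction, and the toy data are all hypothetical.

```python
import random


def enrichment_score(ranked_stats, in_set, weight=1.0):
    """Running-sum enrichment score for genes ranked by descending statistic.

    ranked_stats: gene-level statistics, already sorted in descending order
    in_set:       parallel list of booleans, True if the gene is in the pathway
    """
    n_hit = sum(in_set)
    n_miss = len(ranked_stats) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("pathway must contain some, but not all, genes")
    hit_total = sum(abs(s) ** weight for s, h in zip(ranked_stats, in_set) if h)
    running, best = 0.0, 0.0
    for s, h in zip(ranked_stats, in_set):
        if h:
            running += abs(s) ** weight / hit_total   # step up on a pathway gene
        else:
            running -= 1.0 / n_miss                   # step down otherwise
        best = max(best, running)
    return best


def empirical_p(gene_stats, pathway, n_perm=5000, seed=0):
    """Return (observed score, empirical P value) for one pathway."""
    genes = sorted(gene_stats, key=gene_stats.get, reverse=True)
    stats = [gene_stats[g] for g in genes]
    members = [g in pathway for g in genes]
    observed = enrichment_score(stats, members)
    rng = random.Random(seed)
    exceed = 0
    for _ in range(n_perm):
        shuffled = members[:]
        rng.shuffle(shuffled)          # assumption: gene-label permutation
        if enrichment_score(stats, shuffled) >= observed:
            exceed += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (exceed + 1) / (n_perm + 1)


# Toy example: 20 genes, with the pathway made up of the three top-ranked genes.
toy_stats = {f"g{i}": float(20 - i) for i in range(20)}
toy_score, toy_p = empirical_p(toy_stats, {"g0", "g1", "g2"}, n_perm=200, seed=1)
```

Fixing the random seed makes the permutation result reproducible; in practice the number of permutations bounds the smallest P value that can be reported.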
Results
SMR Single Gene Expression Association Analysis
After strict Bonferroni correction, SMR identified one significant gene for ALS, C9ORF72 (P SMR = 7.08 × 10−6, P HEIDI = 2.71 × 10−2). We also identified four genes with suggestive association signals: NT5C3L (P SMR = 1.33 × 10−5, P HEIDI = 8.88 × 10−2), GGNBP2 (P SMR = 1.81 × 10−5, P HEIDI = 7.23 × 10−2), ZNHIT3 (P SMR = 2.94 × 10−5, P HEIDI = 8.55 × 10−2), and KIAA1600 (P SMR = 9.97 × 10−5, P HEIDI = 9.90 × 10−2) (Table 1). The original GWAS of ALS identified three top genes associated with ALS (Table 2).
Pathway Enrichment Analysis
Pathway enrichment analysis identified 7 significant pathways for ALS, including PEROXISOME (empirical P value = 0.006), CITRATE_CYCLE_TCA_CYCLE (empirical P value = 0.025), TIGHT_JUNCTION, PPAR_SIGNALING_PATHWAY (empirical P value = 0.025), SNARE_INTERACTIONS_IN_VESICULAR_TRANSPORT (empirical P value = 0.027), ARACHIDONIC_ACID_METABOLISM (empirical P value = 0.040), and GLYCOLYSIS_GLUCONEOGENESIS (empirical P value = 0.043).
Discussion
Extensive GWAS have been conducted and have identified a large number of genetic variants associated with complex diseases. However, it remains a challenge to reveal the biological significance of GWAS-identified loci, many of which lie outside known genes. Recent studies have confirmed that eQTLs are implicated in the development of complex diseases (Ertekin-Taner 2011; Gibson et al. 2012; Murphy et al. 2010; Nicolae et al. 2010). Integrating GWAS with eQTL studies has the potential to discover novel susceptibility genes for complex diseases. In this study, using the recently developed SMR approach and published GWAS data, we conducted a genome-wide single gene association analysis and pathway enrichment analysis for ALS. We identified several genes and KEGG pathways associated with ALS, providing novel clues for pathogenetic studies of ALS.
SMR analysis detected a significant association between the C9ORF72 gene and ALS, confirming the important role of C9ORF72 in the pathogenesis of ALS. C9ORF72 plays an important role in the regulation of endosomal trafficking (Farg et al. 2014). Previous studies found that an expanded GGGGCC repeat within a non-coding region of C9ORF72 could increase the risk of ALS (Gijselinck et al. 2012; Mizielinska et al. 2014; Renton et al. 2011). Besides the well-known ALS-associated C9ORF72 gene, SMR also identified four genes with suggestive association signals for ALS: NT5C3L, GGNBP2, ZNHIT3, and KIAA1600. The biological functions of these four genes remain largely unknown. To the best of our knowledge, no study has investigated the possible roles of these four genes in the development of ALS.
In the pathway enrichment analysis, the most significant association with ALS was observed for the PEROXISOME pathway. Peroxisomes are essential organelles for redox signaling and lipid homeostasis, and are involved in many crucial metabolic processes. The PEROXISOME pathway plays a key role in the detoxification of reactive oxygen species (ROS) (Zhang et al. 2014). ROS have been suggested to contribute to the regulation of apoptosis, notably in inflammatory cells (Gu et al. 2017). Increased ROS production could induce apoptosis in endothelial cells and then initiate mitochondrial damage (Walford 2003). Previous studies have found that mitochondrial dysfunction and overproduction of ROS play a crucial role in the pathogenesis of neurodegenerative diseases such as Alzheimer's disease (AD), Parkinson's disease (PD), and ALS (Calabrese et al. 2005; Emerit et al. 2004; Federico et al. 2012; Zarkovic 2003).
Pathway enrichment analysis also identified several novel candidate pathways for ALS, such as GLYCOLYSIS_GLUCONEOGENESIS and ARACHIDONIC_ACID_METABOLISM. Previous studies have provided some evidence that these candidate pathways are implicated in the development of ALS. For instance, Valbuena et al. observed abnormal aerobic glycolysis in a cellular model of ALS (Valbuena et al. 2016). Arachidonic acid is implicated in the COX-2-driven inflammatory pathway in ALS (Kiaei et al. 2005). Further studies are warranted to reveal the roles of the identified pathways in the pathogenesis of ALS.
An eQTL dataset from peripheral blood was used for the SMR analysis in this study. This eQTL dataset should have good power for genes with consistent eQTL effects across tissues, but it may lose power for genes with brain tissue-specific effects. Zhu et al. also evaluated the impact of tissue specificity on the performance of the SMR approach by comparing the SMR results for schizophrenia obtained with eQTL datasets from brain and from peripheral blood. They found that the SMR results obtained with the peripheral blood eQTL dataset were highly consistent with those obtained with the brain eQTL dataset, suggesting that eQTL datasets from peripheral blood perform well in SMR analysis (Zhu et al. 2016).
In summary, using the latest published ALS GWAS and eQTL data, we conducted a genome-wide eQTL-based single gene expression association analysis and pathway association analysis for ALS. We identified several genes and pathways associated with ALS. Our results provide novel clues for pathogenetic studies of ALS. This study also illustrates the good performance of the SMR approach and extends it to pathway association analysis for complex diseases.
References
Bonifacino T et al (2016) Altered mechanisms underlying the abnormal glutamate release in amyotrophic lateral sclerosis at a pre-symptomatic stage of the disease. Neurobiol Dis 95:122–133. doi:10.1016/j.nbd.2016.07.011
Calabrese V et al (2005) Oxidative stress, mitochondrial dysfunction and cellular stress response in Friedreich’s ataxia. J Neurol Sci 233:145–162. doi:10.1016/j.jns.2005.03.012
DeJesus-Hernandez M et al (2011) Expanded GGGGCC hexanucleotide repeat in noncoding region of C9ORF72 causes chromosome 9p-linked FTD and ALS. Neuron 72:245–256. doi:10.1016/j.neuron.2011.09.011
Eisen A (2009) Amyotrophic lateral sclerosis: a 40-year personal perspective. J Clin Neurosci 16:505–512. doi:10.1016/j.jocn.2008.07.072
Emerit J, Edeas M, Bricaire F (2004) Neurodegenerative diseases and oxidative stress. Biomed Pharmacother 58:39–46. doi:10.1016/j.biopha.2003.11.004
Ertekin-Taner N (2011) Gene expression endophenotypes: a novel approach for gene discovery in Alzheimer’s disease. Mol Neurodegener 6:31. doi:10.1186/1750-1326-6-31
Farg MA et al (2014) C9ORF72, implicated in amytrophic lateral sclerosis and frontotemporal dementia, regulates endosomal trafficking. Hum Mol Genet 23:3579–3595. doi:10.1093/hmg/ddu068
Federico A, Cardaioli E, Da Pozzo P, Formichi P, Gallus GN, Radi E (2012) Mitochondria, oxidative stress and neurodegeneration. J Neurol Sci 322:254–262. doi:10.1016/j.jns.2012.05.030
Gibson G et al (2012) Brain expression genome-wide association study (eGWAS) identifies human disease-associated variants. PLoS Genet 8:e1002707. doi:10.1371/journal.pgen.1002707
Gijselinck I et al (2012) A C9orf72 promoter repeat expansion in a Flanders-Belgian cohort with disorders of the frontotemporal lobar degeneration-amyotrophic lateral sclerosis spectrum: a gene identification study. Lancet Neurol 11:54–65. doi:10.1016/s1474-4422(11)70261-7
Gordon PH, Mehal JM, Holman RC, Rowland LP, Rowland AS, Cheek JE (2013) Incidence of amyotrophic lateral sclerosis among American Indians and Alaska natives. JAMA Neurol 70:476–480. doi:10.1001/jamaneurol.2013.929
Gu HF, Li HZ, Xie XJ, Tang YL, Tang XQ, Nie YX, Liao DF (2017) Oxidized low-density lipoprotein induced mouse hippocampal HT-22 cell damage via promoting the shift from autophagy to apoptosis. CNS Neurosci Ther. doi:10.1111/cns.12680
Johnson JO et al (2010) Exome sequencing reveals VCP mutations as a cause of familial ALS. Neuron 68:857–864. doi:10.1016/j.neuron.2010.11.036
Kiaei M, Kipiani K, Petri S, Choi DK, Chen J, Calingasan NY, Beal MF (2005) Integrative role of cPLA with COX-2 and the effect of non-steriodal anti-inflammatory drugs in a transgenic mouse model of amyotrophic lateral sclerosis. J Neurochem 93:403–411. doi:10.1111/j.1471-4159.2005.03024.x
Li Y, Willer CJ, Ding J, Scheet P, Abecasis GR (2010) MaCH: using sequence and genotype data to estimate haplotypes and unobserved genotypes. Genet Epidemiol 34:816–834. doi:10.1002/gepi.20533
Majounie E et al (2012) Frequency of the C9orf72 hexanucleotide repeat expansion in patients with amyotrophic lateral sclerosis and frontotemporal dementia: a cross-sectional study. Lancet Neurol 11:323–330. doi:10.1016/S1474-4422(12)70043-1
Marchini J, Howie B, Myers S, McVean G, Donnelly P (2007) A new multipoint method for genome-wide association studies by imputation of genotypes. Nat Genet 39:906–913. doi:10.1038/ng2088
Maruyama H et al (2010) Mutations of optineurin in amyotrophic lateral sclerosis. Nature 465:223–226. doi:10.1038/nature08971
Mizielinska S et al (2014) C9orf72 repeat expansions cause neurodegeneration in Drosophila through arginine-rich proteins. Science 345:1192–1194. doi:10.1126/science.1256800
Murphy A et al (2010) Mapping of numerous disease-associated expression polymorphisms in primary peripheral blood CD4+ lymphocytes. Hum Mol Genet 19:4745–4757. doi:10.1093/hmg/ddq392
Nicolae DL, Gamazon E, Zhang W, Duan S, Dolan ME, Cox NJ (2010) Trait-associated SNPs are more likely to be eQTLs: annotation to enhance discovery from GWAS. PLoS Genet 6:e1000888. doi:10.1371/journal.pgen.1000888
Orr HT (2011) FTD and ALS: genetic ties that bind. Neuron 72:189–190. doi:10.1016/j.neuron.2011.10.001
Renton AE et al (2011) A hexanucleotide repeat expansion in C9ORF72 is the cause of chromosome 9p21-linked ALS-FTD. Neuron 72:257–268. doi:10.1016/j.neuron.2011.09.010
Rosen DR (1993) Mutations in Cu/Zn superoxide dismutase gene are associated with familial amyotrophic lateral sclerosis. Nature 364:362. doi:10.1038/364362c0
Valbuena GN, Rizzardini M, Cimini S, Siskos AP, Bendotti C, Cantoni L, Keun HC (2016) Metabolomic analysis reveals increased aerobic glycolysis and amino acid deficit in a cellular model of amyotrophic lateral sclerosis. Mol Neurobiol 53:2222–2240. doi:10.1007/s12035-015-9165-7
van Rheenen W et al (2016) Genome-wide association analyses identify new risk variants and the genetic architecture of amyotrophic lateral sclerosis. Nat Genet 48:1043. doi:10.1038/ng.3622
Vance C et al (2009) Mutations in FUS, an RNA processing protein, cause familial amyotrophic lateral sclerosis type 6. Science 323:1208–1211. doi:10.1126/science.1165942
Walford GA (2003) Hypoxia potentiates nitric oxide-mediated apoptosis in endothelial cells via peroxynitrite-induced activation of mitochondria-dependent and -independent pathways. J Biol Chem 279:4425–4432. doi:10.1074/jbc.M310582200
Wang K, Li M, Bucan M (2007) Pathway-based approaches for analysis of genomewide association studies. Am J Hum Genet 81:1278–1283. doi:10.1086/522374
Wang K, Li M, Hakonarson H (2010) Analysing biological pathways in genome-wide association studies. Nat Rev Genet 11:843–854. doi:10.1038/nrg2884
Westra HJ et al (2013) Systematic identification of trans eQTLs as putative drivers of known disease associations. Nat Genet 45:1238–1243. doi:10.1038/ng.2756
Xiao S et al (2016) C9orf72 isoforms in amyotrophic lateral sclerosis and frontotemporal lobar degeneration. Brain Res 1647:43–49. doi:10.1016/j.brainres.2016.04.062
Zarkovic K (2003) 4-Hydroxynonenal and neurodegenerative diseases. Mol Aspects Med 24:293–303. doi:10.1016/s0098-2997(03)00024-4
Zhang F et al (2014) Trans-omics pathway analysis suggests that eQTLs contribute to chondrocyte apoptosis of Kashin-Beck disease through regulating apoptosis pathway expression. Gene 553:166–169. doi:10.1016/j.gene.2014.10.018
Zhu Z et al (2016) Integration of summary data from GWAS and eQTL studies predicts complex trait gene targets. Nat Genet 48:481–487. doi:10.1038/ng.3538
Acknowledgements
This study was supported by the National Natural Science Foundation of China (81472925, 81673112), the Technology Research and Development Program of Shaanxi Province of China (2013KJXX-51), and the Fundamental Research Funds for the Central Universities.
Ethics declarations
Conflicts of interest
The authors declare no conflict of interest.
Cite this article
Du, Y., Wen, Y., Guo, X. et al. A Genome-wide Expression Association Analysis Identifies Genes and Pathways Associated with Amyotrophic Lateral Sclerosis. Cell Mol Neurobiol 38, 635–639 (2018). https://doi.org/10.1007/s10571-017-0512-2
DOI: https://doi.org/10.1007/s10571-017-0512-2