Abstract
While large-scale, genome-wide association studies (GWAS) have identified hundreds of loci associated with brain-related traits, identification of the variants, genes and molecular mechanisms underlying these traits remains challenging. Integration of GWAS with expression quantitative trait loci (eQTLs) and identification of shared genetic architecture have been widely adopted to nominate genes and candidate causal variants. However, this approach is limited by sample size, statistical power and linkage disequilibrium. We developed the multivariate multiple QTL (mmQTL) approach and performed a large-scale, multi-ancestry eQTL meta-analysis to increase power and fine-mapping resolution. Analysis of 3,983 RNA-sequenced samples from 2,119 donors, including 474 non-European individuals, yielded an effective sample size of 3,154. Joint statistical fine-mapping of eQTL and GWAS identified 329 variant–trait pairs for 24 brain-related traits driven by 204 unique candidate causal variants for 189 unique genes. This integrative analysis identifies candidate causal variants and elucidates potential regulatory mechanisms for genes underlying schizophrenia, bipolar disorder and Alzheimer’s disease.
Data availability
Brain eQTL meta-analysis resource: http://icahn.mssm.edu/brema
Code availability
mmQTL: https://github.com/jxzb1988/mmQTL and Zenodo (ref. 79; https://doi.org/10.5281/zenodo.5560014).
References
Jansen, I. E. et al. Genome-wide meta-analysis identifies new loci and functional pathways influencing Alzheimer’s disease risk. Nat. Genet. 51, 404–413 (2019).
Schizophrenia Working Group of the Psychiatric Genomics Consortium. Biological insights from 108 schizophrenia-associated genetic loci. Nature 511, 421–427 (2014).
Visscher, P. M. et al. 10 Years of GWAS discovery: biology, function, and translation. Am. J. Hum. Genet. 101, 5–22 (2017).
Nalls, M. A. et al. Identification of novel risk loci, causal insights, and heritable risk for Parkinson’s disease: a meta-analysis of genome-wide association studies. Lancet Neurol. 18, 1091–1102 (2019).
Wray, N. R. et al. Genome-wide association analyses identify 44 risk variants and refine the genetic architecture of major depression. Nat. Genet. 50, 668–681 (2018).
Schaid, D. J., Chen, W. & Larson, N. B. From genome-wide associations to candidate causal variants by statistical fine-mapping. Nat. Rev. Genet. 19, 491–504 (2018).
Gallagher, M. D. & Chen-Plotkin, A. S. The post-GWAS era: from association to function. Am. J. Hum. Genet. 102, 717–730 (2018).
GTEx Consortium. The GTEx Consortium atlas of genetic regulatory effects across human tissues. Science 369, 1318–1330 (2020).
Hormozdiari, F. et al. Colocalization of GWAS and eQTL signals detects target genes. Am. J. Hum. Genet. 99, 1245–1260 (2016).
Kim-Hellmuth, S. et al. Cell type-specific genetic regulation of gene expression across human tissues. Science 369, eaaz8528 (2020).
Dobbyn, A. et al. Landscape of conditional eQTL in dorsolateral prefrontal cortex and co-localization with schizophrenia GWAS. Am. J. Hum. Genet. 102, 1169–1184 (2018).
Hormozdiari, F. et al. Leveraging molecular quantitative trait loci to understand the genetic architecture of diseases and complex traits. Nat. Genet. 50, 1041–1047 (2018).
Fromer, M. et al. Gene expression elucidates functional impact of polygenic risk for schizophrenia. Nat. Neurosci. 19, 1442–1453 (2016).
Ng, B. et al. An xQTL map integrates the genetic architecture of the human brain’s transcriptome and epigenome. Nat. Neurosci. 20, 1418–1426 (2017).
Wang, D. et al. Comprehensive functional genomic resource and integrative model for the human brain. Science 362, eaat8464 (2018).
Jaffe, A. E. et al. Developmental and genetic regulation of the human cortex transcriptome illuminate schizophrenia pathogenesis. Nat. Neurosci. 21, 1117–1125 (2018).
Habib, N. et al. Massively parallel single-nucleus RNA-seq with DroNc-seq. Nat. Methods 14, 955–958 (2017).
Darmanis, S. et al. A survey of human brain transcriptome diversity at the single cell level. Proc. Natl Acad. Sci. USA 112, 7285–7290 (2015).
Lake, B. B. et al. Integrative single-cell analysis of transcriptional and epigenetic states in the human adult brain. Nat. Biotechnol. 36, 70–80 (2018).
Cao, J. et al. A human cell atlas of fetal gene expression. Science 370, eaba7612 (2020).
Raj, T. et al. Polarization of the effects of autoimmune and neurodegenerative risk alleles in leukocytes. Science 344, 519–523 (2014).
van der Wijst, M. G. P. et al. Single-cell RNA sequencing identifies celltype-specific cis-eQTLs and co-expression QTLs. Nat. Genet. 50, 493–497 (2018).
Fairfax, B. P. et al. Innate immune activity conditions the effect of regulatory variants upon monocyte gene expression. Science 343, 1246949 (2014).
Finucane, H. K. et al. Heritability enrichment of specifically expressed genes identifies disease-relevant tissues and cell types. Nat. Genet. 50, 621–629 (2018).
Farh, K. K.-H. et al. Genetic and epigenetic fine mapping of causal autoimmune disease variants. Nature 518, 337–343 (2015).
Young, A. M. H. et al. A map of transcriptional heterogeneity and regulatory variation in human microglia. Nat. Genet. 53, 861–868 (2021).
Jaffe, A. E. et al. Profiling gene expression in the human dentate gyrus granule cell layer reveals insights into schizophrenia and its genetic risk. Nat. Neurosci. 23, 510–519 (2020).
de Paiva Lopes, K. et al. Atlas of genetic effects in human microglia transcriptome across brain regions, aging and disease pathologies. Preprint at https://www.biorxiv.org/content/10.1101/2020.10.27.356113v1?rss=1 (2020).
Jansen, R. et al. Conditional eQTL analysis reveals allelic heterogeneity of gene expression. Hum. Mol. Genet. 26, 1444–1451 (2017).
Zhernakova, D. V. et al. Identification of context-dependent expression quantitative trait loci in whole blood. Nat. Genet. 49, 139–145 (2017).
Benner, C. et al. FINEMAP: efficient variable selection using summary data from genome-wide association studies. Bioinformatics 32, 1493–1501 (2016).
Hormozdiari, F., Kostem, E., Kang, E. Y., Pasaniuc, B. & Eskin, E. Identifying causal variants at loci with multiple signals of association. Genetics 198, 497–508 (2014).
Schrode, N. et al. Synergistic effects of common schizophrenia risk variants. Nat. Genet. 51, 1475–1485 (2019).
Nott, A. et al. Brain cell type-specific enhancer-promoter interactome maps and disease-risk association. Science 366, 1134–1139 (2019).
Sekar, A. et al. Schizophrenia risk from complex variation of complement component 4. Nature 530, 177–183 (2016).
Zaitlen, N., Paşaniuc, B., Gur, T., Ziv, E. & Halperin, E. Leveraging genetic variability across populations for the identification of causal variants. Am. J. Hum. Genet. 86, 23–33 (2010).
Asimit, J. L., Hatzikotoulas, K., McCarthy, M., Morris, A. P. & Zeggini, E. Trans-ethnic study design approaches for fine-mapping. Eur. J. Hum. Genet. 24, 1330–1336 (2016).
Morris, A. P. Transethnic meta-analysis of genomewide association studies. Genet. Epidemiol. 35, 809–822 (2011).
Zhou, X. & Stephens, M. Genome-wide efficient mixed-model analysis for association studies. Nat. Genet. 44, 821–824 (2012).
Yang, J., Zaitlen, N. A., Goddard, M. E., Visscher, P. M. & Price, A. L. Advantages and pitfalls in the application of mixed-model association methods. Nat. Genet. 46, 100–106 (2014).
Sul, J. H., Martin, L. S. & Eskin, E. Population structure in genetic studies: confounding factors and mixed models. PLoS Genet. 14, e1007309 (2018).
Han, B. et al. A general framework for meta-analyzing dependent studies with overlapping subjects in association mapping. Hum. Mol. Genet. 25, 1857–1866 (2016).
Bennett, D. A. et al. Religious orders study and rush memory and aging project. J. Alzheimers Dis. 64, S161–S189 (2018).
GTEx Consortium. Genetic effects on gene expression across human tissues. Nature 550, 204–213 (2017).
Wang, J. et al. CAUSALdb: a database for disease/trait causal variants identified using summary statistics of genome-wide association studies. Nucleic Acids Res. 48, D807–D816 (2020).
Wang, M. et al. The Mount Sinai cohort of large-scale genomic, transcriptomic and proteomic data in Alzheimer’s disease. Sci. Data 5, 180185 (2018).
Storey, J. D. & Tibshirani, R. Statistical significance for genomewide studies. Proc. Natl Acad. Sci. USA 100, 9440–9445 (2003).
Hauberg, M. E. et al. Common schizophrenia risk variants are enriched in open chromatin regions of human glutamatergic neurons. Nat. Commun. 11, 5581 (2020).
Han, B. & Eskin, E. Random-effects model aimed at discovering associations in meta-analysis of genome-wide association studies. Am. J. Hum. Genet. 88, 586–598 (2011).
Zeisel, A. et al. Brain structure. Cell types in the mouse cortex and hippocampus revealed by single-cell RNA-seq. Science 347, 1138–1142 (2015).
Lek, M. et al. Analysis of protein-coding genetic variation in 60,706 humans. Nature 536, 285–291 (2016).
Finucane, H. K. et al. Partitioning heritability by functional annotation using genome-wide association summary statistics. Nat. Genet. 47, 1228–1235 (2015).
Pardiñas, A. F. et al. Common schizophrenia alleles are enriched in mutation-intolerant genes and in regions under strong background selection. Nat. Genet. 50, 381–389 (2018).
Bipolar Disorder and Schizophrenia Working Group of the Psychiatric Genomics Consortium. Genomic dissection of bipolar disorder and schizophrenia, including 28 subphenotypes. Cell 173, 1705–1715 (2018).
Cross-Disorder Group of the Psychiatric Genomics Consortium. Genomic relationships, novel loci, and pleiotropic mechanisms across eight psychiatric disorders. Cell 179, 1469–1482 (2019).
Roussos, P. et al. A role for noncoding variation in schizophrenia. Cell Rep. 9, 1417–1429 (2014).
Najt, P. et al. Impulsivity and bipolar disorder. Eur. Neuropsychopharmacol. 17, 313–320 (2007).
Ouzir, M. Impulsivity in schizophrenia: a comprehensive update. Aggress. Violent Behav. 18, 247–254 (2013).
Cerimele, J. M. & Katon, W. J. Associations between health risk behaviors and symptoms of schizophrenia and bipolar disorder: a systematic review. Gen. Hosp. Psychiatry 35, 16–22 (2013).
Zhang, X. et al. Negative evidence for a role of APH1B T27I variant in Alzheimer’s disease. Hum. Mol. Genet. 29, 955–966 (2020).
Hwang, J.-Y. & Zukin, R. S. REST, a master transcriptional regulator in neurodegenerative disease. Curr. Opin. Neurobiol. 48, 193–200 (2018).
Schoenherr, C. J. & Anderson, D. J. The neuron-restrictive silencer factor (NRSF): a coordinate repressor of multiple neuron-specific genes. Science 267, 1360–1363 (1995).
Mancuso, N. et al. Probabilistic fine-mapping of transcriptome-wide association studies. Nat. Genet. 51, 675–682 (2019).
Wainberg, M. et al. Opportunities and challenges for transcriptome-wide association studies. Nat. Genet. 51, 592–599 (2019).
van der Wijst, M. et al. The single-cell eQTLGen consortium. eLife 9, e52155 (2020).
Mandric, I. et al. Optimized design of single-cell RNA sequencing experiments for cell-type-specific eQTL analysis. Nat. Commun. 11, 5504 (2020).
Wojcik, G. L. et al. Genetic analyses of diverse populations improves discovery for complex traits. Nature 570, 514–518 (2019).
Sul, J. H., Han, B., Ye, C., Choi, T. & Eskin, E. Effectively identifying eQTLs from multiple tissues by combining mixed model and meta-analytic approaches. PLoS Genet. 9, e1003491 (2013).
Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. Series B Stat. Methodol. 57, 289–300 (1995).
Joo, J. W. J., Hormozdiari, F., Han, B. & Eskin, E. Multiple testing correction in linear mixed models. Genome Biol. 17, 62 (2016).
Ota, M. et al. Dynamic landscape of immune cell-specific gene regulation in immune-mediated diseases. Cell 184, 3006–3021 (2021).
Ishigaki, K. et al. Polygenic burdens on cell-specific pathways underlie the risk of rheumatoid arthritis. Nat. Genet. 49, 1120–1125 (2017).
Mohammadi, P., Castel, S. E., Brown, A. A. & Lappalainen, T. Quantifying the regulatory effect size of cis-acting genetic variation using allelic fold change. Genome Res. 27, 1872–1884 (2017).
Su, Z., Marchini, J. & Donnelly, P. HAPGEN2: simulation of multiple disease SNPs. Bioinformatics 27, 2304–2305 (2011).
Meyer, H. V. & Birney, E. PhenotypeSimulator: a comprehensive framework for simulating multi-trait, multi-locus genotype to phenotype relationships. Bioinformatics 34, 2951–2956 (2018).
Devlin, B. & Roeder, K. Genomic control for association studies. Biometrics 55, 997–1004 (1999).
Delaneau, O. et al. A complete tool set for molecular QTL discovery and analysis. Nat. Commun. 8, 15452 (2017).
Glassberg, E. C., Gao, Z., Harpak, A., Lan, X. & Pritchard, J. K. Evidence for weak selective constraint on human gene expression. Genetics 211, 757–772 (2019).
Zeng, B. jxzb1988/MMQTL: mmQTL v1.5.0. https://doi.org/10.5281/zenodo.5771105 (2021).
Acknowledgements
This project was supported by the National Institute of Mental Health (NIH grants nos. R01-MH109677, U01-MH116442, R01-MH125246 and R01-MH109897), the National Institute on Aging (NIH grants nos. R01-AG050986, R01-AG067025 and R01-AG065582) and the Veterans Affairs Merit (no. BX004189) to P.R. G.E.H. was supported in part by NARSAD Young Investigator Grant no. 26313 from the Brain & Behavior Research Foundation. J.B. was supported in part by NARSAD Young Investigator Grant no. 27209 from the Brain & Behavior Research Foundation. Research reported in this paper was supported by the Office of Research Infrastructure of the National Institutes of Health under award nos. S10OD018522 and S10OD026880. The content herein is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Author information
Contributions
B.Z., G.E.H. and P.R. conceived and designed the study. B.Z. designed and implemented the statistical method. B.Z. and G.E.H. performed analyses. J.F.F. generated cell-type-specific expression and chromatin accessibility data. J.B. and R.K. preprocessed and analyzed cell-type-specific expression and chromatin accessibility data. J.F.F. and P.R. supervised data generation. G.E.H. and P.R. supervised data analyses. G.E.H., B.Z. and P.R. wrote the manuscript with the help of all authors.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Genetics thanks Andrew Jaffe and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Biologically motivated simulations demonstrate performance of mmQTL workflow: low correlation scenario.
a) QQ plot of results from null simulation shows that the linear mixed model (LMM) with fixed or random effect meta-analysis accurately controls the false positive rate, whereas linear regression with 5 genotype principal components does not. The Sidak method was very conservative in both cases. λGC indicates the genomic control inflation factor. Gray band indicates 95% confidence interval under the null. b) Power from LMM followed by 3 types of meta-analysis versus the number of tissues sharing an eQTL. c) Size of the 95% credible sets from fixed- (y-axis) and random- (x-axis) effects meta-analysis from simulations in Fig. 2c.
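As an illustration of the quantities named in panel a, the following minimal sketch computes a genomic control inflation factor from a list of p-values and a Sidak-adjusted p-value. It assumes two-sided p-values from 1-degree-of-freedom tests and, for the Sidak formula, independent tests; the function names are hypothetical and this is not the code used to produce the figure.

```python
from statistics import NormalDist, median


def genomic_inflation(p_values):
    """Return lambda_GC: median observed 1-df chi-square over its null median.

    Assumes each p-value is two-sided and lies strictly between 0 and 1
    (a p-value of exactly 1 is also accepted).
    """
    norm = NormalDist()
    # A two-sided p-value p corresponds to z = inv_cdf(p / 2); z squared is
    # the matching 1-df chi-square statistic.
    chi2 = [norm.inv_cdf(p / 2.0) ** 2 for p in p_values]
    # Median of a 1-df chi-square under the null (about 0.455).
    null_median = norm.inv_cdf(0.75) ** 2
    return median(chi2) / null_median


def sidak_adjust(p_min, n_tests):
    """Sidak-adjusted p-value for the smallest of n_tests independent tests."""
    return 1.0 - (1.0 - p_min) ** n_tests


if __name__ == "__main__":
    # Evenly spaced p-values mimic a well-calibrated null, so lambda_GC is near 1.
    null_like = [(i + 0.5) / 1000 for i in range(1000)]
    print(round(genomic_inflation(null_like), 3))
    print(sidak_adjust(0.001, 100))
```

A value of λGC close to 1 is what a calibrated test should produce under the null; values well above 1 indicate inflation. When the combined tests are positively correlated, the independence assumption in the Sidak formula overcorrects, which would be consistent with the conservative behavior described in the caption.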
Extended Data Fig. 2 Lead eQTL SNP sign concordance.
For the lead eQTL SNP of each gene in the meta-analysis, the sign of the mean estimated effect size is compared to the estimated effect sign from neuron and microglia eQTL analyses. The concordance rate increases as the p-value cutoff becomes stricter; that is, smaller p-value cutoffs yield higher concordance. Error bars indicate 95% confidence interval for a binomial proportion. Analysis included 11,709 variants for neurons and 10,865 variants for microglia.
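The comparison described above can be sketched as follows: count the variants whose effect sizes share a sign in two analyses, then attach a 95% interval to that proportion. The caption does not state which binomial interval was used, so the Wilson score interval here is an assumption, and the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def sign_concordance(effects_a, effects_b):
    """Return (number of concordant pairs, number of pairs).

    A pair is concordant when both effect sizes are nonzero and share a sign.
    """
    pairs = list(zip(effects_a, effects_b))
    concordant = sum(1 for a, b in pairs if a * b > 0)
    return concordant, len(pairs)


def wilson_interval(successes, n, level=0.95):
    """Wilson score confidence interval for a binomial proportion."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    p_hat = successes / n
    denom = 1.0 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half = z * sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


if __name__ == "__main__":
    # Toy effect sizes; the values are made up for illustration only.
    bulk = [0.20, -0.10, 0.30, 0.40]
    cell_type = [0.10, -0.20, -0.30, 0.50]
    k, n = sign_concordance(bulk, cell_type)
    print(k / n, wilson_interval(k, n))
```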
Extended Data Fig. 3 Impact of effect size heterogeneity.
The test statistic from the random effect meta-analysis used here (Han and Eskin, 2011) is the sum of statistics testing the mean (Smean) and variance (Svariance) of the estimated effect sizes. a) The percent of total signal contributed by the fixed effect (that is, Smean / (Smean + Svariance)) is shown for the lead eQTL SNP for multiple orders of conditional analysis. Box plot indicates median, interquartile range (IQR) and 1.5*IQR. b) The relationship between the test statistics is visualized by plotting the square root of Svariance against the square root of Smean from the lead eQTL SNP from the primary eQTL analysis. c) The estimated effect sizes from the lead eQTL SNP for genes with high and low levels of effect size heterogeneity are shown. Box plot indicates median, interquartile range (IQR) and 1.5*IQR.
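To make the decomposition concrete, here is a minimal sketch that splits a random-effects likelihood-ratio statistic into a mean component and a variance component and reports the fixed-effect share. It assumes independent studies, a normal model for each estimate with variance equal to its squared standard error plus a between-study variance, and a crude grid search for that between-study variance. The function names and grid settings are hypothetical, and this is not the mmQTL implementation.

```python
from math import log


def re2_decomposition(betas, ses, grid_steps=2000):
    """Return (S_mean, S_variance) for per-study effect sizes and standard errors."""
    variances = [s * s for s in ses]
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    sum_wb = sum(w * b for w, b in zip(weights, betas))
    # Mean component: squared inverse-variance-weighted z-score.
    s_mean = sum_wb ** 2 / sum_w
    mu_fixed = sum_wb / sum_w

    def log_lik(mu, tau2):
        # Normal log-likelihood with additive constants dropped.
        return -0.5 * sum(
            log(v + tau2) + (b - mu) ** 2 / (v + tau2)
            for b, v in zip(betas, variances)
        )

    def profile(tau2):
        # For a fixed between-study variance, the best mean is a weighted average.
        w = [1.0 / (v + tau2) for v in variances]
        mu = sum(wi * b for wi, b in zip(w, betas)) / sum(w)
        return log_lik(mu, tau2)

    # Grid search over tau^2 from 0 up to the squared range of the estimates.
    tau2_max = (max(betas) - min(betas)) ** 2
    best = max(profile(tau2_max * k / grid_steps) for k in range(grid_steps + 1))
    # Variance component: likelihood gain from allowing heterogeneity.
    s_variance = max(0.0, 2.0 * (best - log_lik(mu_fixed, 0.0)))
    return s_mean, s_variance


def fixed_effect_fraction(s_mean, s_variance):
    """Share of the total statistic contributed by the mean component."""
    total = s_mean + s_variance
    return s_mean / total if total > 0 else float("nan")


if __name__ == "__main__":
    s_mean, s_var = re2_decomposition([0.1, 0.5, 0.9], [0.1, 0.1, 0.1])
    print(s_mean, s_var, fixed_effect_fraction(s_mean, s_var))
```

When all studies report the same effect size, the variance component is zero and the fraction equals 1; the fraction falls as the estimates diverge across studies.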
Extended Data Fig. 4 Properties of conditional eQTLs.
a) The distribution of the distance to the transcription start site is shown for the lead variant for eQTL analysis of increasing degree. P-values indicate significance of one-sided Mann–Whitney U test between adjacent groups. Box plot indicates median, interquartile range (IQR) and 1.5*IQR. b) Cell type specificity metric tau plotted against the number of independent eQTLs discovered for each gene. Gray band indicates 95% confidence interval. c) Bar plot shows that the fraction of genes with high evolutionary constraint (pLI > 0.9) decreases with eQTL degree for the current study, PsychENCODE (ref. 15), and whole blood (ref. 78). Error bars indicate standard error based on asymptotic estimate of binomial proportion. Analysis included 10,769 genes with eQTLs.
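The caption does not define the tau metric in panel b. A minimal sketch, assuming it is the widely used tissue-specificity index in which each expression value is scaled by the maximum across cell types, is given below; the function name and the input layout (one non-negative expression value per cell type) are hypothetical.

```python
def tau_specificity(expression):
    """Specificity index in [0, 1] for one gene across cell types.

    0 means uniform expression across cell types; 1 means expression
    confined to a single cell type.
    """
    n = len(expression)
    x_max = max(expression)
    if n < 2 or x_max <= 0:
        raise ValueError("need at least two cell types and some positive expression")
    # Average shortfall of each cell type relative to the most expressed one.
    return sum(1.0 - x / x_max for x in expression) / (n - 1)


if __name__ == "__main__":
    print(tau_specificity([10.0, 0.0, 0.0, 0.0]))  # expressed in one cell type
    print(tau_specificity([5.0, 5.0, 5.0, 5.0]))   # uniformly expressed
```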
Extended Data Fig. 5 Estimated effect size and minor allele frequencies from conditional eQTL analysis.
The estimated effect size (a) and MAF (b) are shown for the lead eQTL SNP of significant genes for increasing order of conditional eQTL analysis. a) The distribution of estimated effect size is similar for all conditional analyses. b) The MAF shows a marked decrease with increasing order of conditional analysis. Box plot indicates median, interquartile range (IQR) and 1.5*IQR.
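For reference, the MAF in panel b can be computed from genotype dosages as in the sketch below. It assumes biallelic variants in diploid individuals, with dosages counting copies of the alternate allele (0 to 2) and missing calls encoded as None; the function name is hypothetical.

```python
def minor_allele_frequency(dosages):
    """Minor allele frequency from per-individual alternate-allele dosages."""
    called = [d for d in dosages if d is not None]
    if not called:
        raise ValueError("no called genotypes")
    # Each diploid individual contributes two alleles.
    alt_freq = sum(called) / (2.0 * len(called))
    # The minor allele is whichever of the two alleles is rarer.
    return min(alt_freq, 1.0 - alt_freq)


if __name__ == "__main__":
    print(minor_allele_frequency([0, 1, 2, 0]))
    print(minor_allele_frequency([2, 2, None, 2, 1]))
```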
Extended Data Fig. 6 Comparison of estimated effect size for bulk and cell-type specific data.
(a–c) Estimated allelic effect size for the lead eQTL SNP in (a) neurons (Jaffe, et al. 2020), (b) microglia from Kosoy, et al. (in preparation) and (c) microglia from Young, et al. (2021) compared to effect size estimates from meta-analysis of bulk data from the current study. (d–g) Estimated allelic effect size for eQTL lead SNP in four immune cell types including (d) B cells, (e) CD14, (f) monocytes, (g) NK cells from Ota, et al. (2021) compared to estimates from bulk samples (Ishigaki, et al. 2017).
Extended Data Fig. 7 Number of genes colocalizing for each MeSH category with CLPP > 0.01.
The phenotype with the highest number of colocalized genes for each MeSH category is indicated.
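As a sketch of how a CLPP threshold such as the one above might be applied, the code below multiplies per-variant posterior probabilities from eQTL and GWAS fine-mapping and keeps variants whose product exceeds 0.01. The product form follows the colocalization approach of Hormozdiari et al. (2016) and is an assumption about the computation here; the variant identifiers, dictionary layout and function names are hypothetical.

```python
def clpp_per_variant(pp_eqtl, pp_gwas):
    """Colocalization posterior probability for each variant in both analyses.

    Inputs map variant identifiers to fine-mapping posterior probabilities.
    """
    shared = pp_eqtl.keys() & pp_gwas.keys()
    return {variant: pp_eqtl[variant] * pp_gwas[variant] for variant in shared}


def colocalized_variants(pp_eqtl, pp_gwas, threshold=0.01):
    """Variants whose CLPP exceeds the threshold, with their CLPP values."""
    scores = clpp_per_variant(pp_eqtl, pp_gwas)
    return {variant: s for variant, s in scores.items() if s > threshold}


if __name__ == "__main__":
    # Made-up posterior probabilities for illustration only.
    pp_eqtl = {"var_a": 0.90, "var_b": 0.05, "var_c": 0.05}
    pp_gwas = {"var_a": 0.50, "var_b": 0.10, "var_d": 0.40}
    print(colocalized_variants(pp_eqtl, pp_gwas))
```

Variants present in only one analysis are dropped, because no colocalization probability can be formed for them.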
Extended Data Fig. 8 Expression of FURIN and risk for multiple complex traits share rs4702 as a candidate causal variant.
Starting from the top, the plot shows -log10 p-values from eQTL analysis, posterior probabilities from statistical fine-mapping of eQTL results, posterior probabilities from statistical fine-mapping of GWAS results, and colocalization posterior probabilities (CLPP) for combining eQTL and GWAS fine-mapping. Traits are shown in the box on the right in decreasing order of CLPP value.
About this article
Cite this article
Zeng, B., Bendl, J., Kosoy, R. et al. Multi-ancestry eQTL meta-analysis of human brain identifies candidate causal variants for brain-related traits. Nat Genet 54, 161–169 (2022). https://doi.org/10.1038/s41588-021-00987-9
DOI: https://doi.org/10.1038/s41588-021-00987-9
- Springer Nature America, Inc.