Introduction

Psychiatric disorders are often associated with significant morbidity and mortality (Saha et al. 2007). The estimated heritability for most psychiatric disorders is moderate to high (40–80 %), so genetic factors play a critical role in their etiology (Sullivan et al. 2000; Lichtenstein et al. 2009, 2010). In the past few years, many genome-wide association studies (GWAS) have been conducted to identify genetic risk variants underlying psychiatric disorders (Visscher et al. 2012). Despite recent progress, there is much yet to be discovered regarding the genetic architecture of psychiatric disorders (Gratten et al. 2014).

The relationship between psychiatric disorders and immune disorders has intrigued researchers for decades (Fig. S1). There is a moderately large body of evidence that supports a role for immune dysfunction in the development of several psychiatric disorders, including early hypotheses such as the macrophage theory of depression (Smith 1991), and recent findings such as the epidemiological observation of co-occurrence of rheumatoid arthritis (RA) and depression (Margaretten et al. 2011; Covic et al. 2012) and cross-disorder drug effects, for example, some drugs for psychiatric disorders have anti-inflammatory properties (Walker 2013; Muller et al. 2006). The genetic liability underlying these observed correlations has not been well studied, with the exception that recent GWAS have repeatedly identified association between SCZ and genetic variants at the major histocompatibility locus (MHC), which also plays an important role in the immune system (Schizophrenia 2014; Irish Schizophrenia Genomics 2012; International Schizophrenia 2009). However, no strong evidence of shared liability was observed between Crohn’s disease (CD) and multiple psychiatric disorders in another study (Cross-Disorder 2013b). In genetics, the term pleiotropy refers to a one-to-many relationship between a gene or mutation and phenotypes (Paaby and Rockman 2013). In the GWAS era, pleiotropy could explain correlations among disorders, and may also boost statistical power to detect genetic associations (Cross-Disorder 2013a, b; Vattikuti et al. 2012; Li et al. 2014; Lee et al. 2012; Andreassen et al. 2013). To date, pervasive pleiotropic effects have been discovered in autoimmune disorders (Cotsapas et al. 2011) and in psychiatric disorders (Gratten et al. 2014; Cross-Disorder 2013b), as separate classes.

Given the public health significance of these two classes of disorders and the treatment implications of any etiological overlap, it is important to resolve the nature of genetic pleiotropy between them, to understand the underlying mechanisms of pleiotropy, and to identify specific genes and pathways driving such pleiotropy. These inquiries can only now be carried out because of the large amounts of genomic data that have become available in recent years. Large consortia have been formed to study many psychiatric disorders and immune disorders (Schizophrenia 2014; Cross-Disorder 2013a, b; Tobacco Genetics 2010; Lango Allen et al. 2010; Speliotes et al. 2010; Barrett et al. 2009; Harley et al. 2008; IMSG 2007; Anderson et al. 2011; Franke et al. 2010). For example, the results from a well-powered GWAS of schizophrenia (Schizophrenia 2014) provided strong evidence supporting the link between schizophrenia and the immune system. Undoubtedly, the availability of high-quality omics data offers us an unprecedented opportunity to revisit the nature of the genetic connections between psychiatric disorders and immune-mediated disorders. The analysis results can deepen our understanding of the genetic architecture of complex human diseases.

Our current study takes advantage of multiple omics data resources to obtain a bird’s-eye view of the shared genetic components between psychiatric disorders and immune disorders. To better represent those two disorder categories while taking the data availability into account, we considered five psychiatric disorders: schizophrenia (SCZ), bipolar affective disorder (BPD), autism spectrum disorder (ASD), attention deficit hyperactivity disorder (ADHD), and major depressive disorder (MDD). For immune-mediated disorders, we considered two inflammatory bowel diseases (IBDs), Crohn’s disease (CD) and ulcerative colitis (UC), and five other immune disorders: multiple sclerosis (MS), psoriasis (PS), rheumatoid arthritis (RA), systemic lupus erythematosus (SLE), and insulin-dependent diabetes mellitus (T1D). For comparison, we also included a central nervous system degenerative disease, Parkinson’s disease (PD), and five traits related to education, height, and weight. We performed comprehensive genome-level analysis on psychiatric disorders and immune disorders by integrating both disorder-specific GWAS and genomic annotations, in search of common genetic liability. Our results not only confirmed previously reported genetic regions affecting disease risk for both psychiatric and immune-mediated disorders, but also implicated many novel shared genes and pathways.

Results

Pervasive pleiotropic effects between psychiatric disorders and immune system disorders

Previous studies have shown extensive shared genetic effects among many of the five psychiatric disorders studied by the Psychiatric Genomics Consortium (PGC) (Cross-Disorder 2013a, b) and among multiple immune system disorders (Cotsapas et al. 2011), separately. Consistent with those studies, we also observed pervasive pleiotropic effects among psychiatric disorders and among immune-related disorders (Table S1). Pleiotropic effects are significant (Bonferroni-adjusted \(p<0.05\)) for all 21 pairs of immune system disorders, and for seven of the 10 pairs of psychiatric disorders (the exceptions being ASD–ADHD, MDD–ADHD, and MDD–ASD).

We then tested pleiotropic effects between psychiatric and immune system disorders. We first considered SCZ with seven immune-mediated disorders. The conditional Q-Q plots (Fig. 1a) suggest that all seven immune-mediated disorders share genetic liability components with SCZ.

Conditional Q-Q plots, while simple and intuitive, suffer from arbitrary cutoffs, e.g. \(1\times 10^{-4}\), and do not offer statistical assessment of pleiotropy. We then used GPA (Genetic analysis incorporating pleiotropy and annotation), a statistically rigorous approach recently developed by us (Chung et al. 2014), to quantitatively test the significance of pleiotropy between the five psychiatric disorders (SCZ, BPD, MDD, ASD, ADHD) and seven immune-mediated disorders (CD, UC, MS, PS, SLE, RA, T1D). Twenty-four of the 35 pairs were significant at Bonferroni-adjusted p-value <0.05 (Table S1), indicating pervasive pleiotropic effects between psychiatric disorders and immune-mediated disorders. Consistent with previous studies (Andreassen et al. 2014), we observed strong pleiotropy between SCZ–MS (\(p=1.3\times 10^{-20}\)), but no significant pleiotropy between BPD–MS (\(p=0.26\)), with or without the MHC region (Fig. S2).
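The conditional Q-Q construction described above can be sketched in a few lines. The following Python fragment is a minimal illustration, not the code used in this study: it assumes two lists of p values aligned by SNP, and the function name `conditional_qq` and the midpoint plotting positions \((i+0.5)/n\) are illustrative choices.

```python
import math

def conditional_qq(p_primary, p_conditioning, cutoff=1e-4):
    """Return (expected, observed) -log10 p values for the primary trait,
    restricted to SNPs whose conditioning-trait p value is below `cutoff`.

    Hypothetical helper: both lists are assumed to be aligned by SNP and
    to contain p values in (0, 1].
    """
    subset = sorted(p for p, q in zip(p_primary, p_conditioning) if q < cutoff)
    n = len(subset)
    # Expected quantiles under the null (uniform p values), using midpoint
    # plotting positions; this convention is an assumption of the sketch.
    expected = [-math.log10((i + 0.5) / n) for i in range(n)]
    observed = [-math.log10(p) for p in subset]
    return expected, observed
```

A leftward and upward shift of the observed curve for the conditioned subset, relative to the curve for all SNPs, is what the plot reads as pleiotropy. The dependence on `cutoff` is the arbitrariness noted in the text.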

For each pair of disorders, we estimated the proportion of single nucleotide polymorphisms (SNPs) associated with both disorders vs. those associated with only one disorder (Table S1). Figure 2 shows the results among SCZ, BPD, UC and CD. Consistent with previous studies (Cross-Disorder 2013a; Parkes et al. 2013), most of the SCZ-associated SNPs and BPD-associated SNPs were estimated to be shared between these two disorders. Similarly, most UC-associated SNPs and CD-associated SNPs were shared between them. The proportions of SNPs shared by cross-class disorders were: SCZ–CD 0.063 (s.e. 0.0021); SCZ–UC 0.053 (s.e. 0.0018); BPD–CD 0.05 (s.e. 0.0034); and BPD–UC 0.039 (s.e. 0.0025), respectively.

To account for potential inflation of statistical significance in pleiotropy tests due to LD structure, pleiotropy among SCZ, BPD, CD, and UC was studied by means of chromosome-bound circular permutations (Kindt et al. 2013), as detailed in “Materials and methods”. Consistent with GPA pleiotropy test results, all six disease pairs have significant pleiotropy, with permutation-based p values below 0.001 for SCZ–BPD, CD–UC, SCZ–CD, and SCZ–UC, and about 0.004 for BPD–CD and BPD–UC (Fig. 1b).
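The idea behind a chromosome-bound circular permutation can be illustrated as follows. This Python sketch rests on stated assumptions rather than on the exact procedure of this study: it assumes that one trait's SNP-ordered values are rotated within each chromosome by a random offset, that the test statistic is supplied by the caller (in this paper it would be computed from fitted GPA proportions), and that an add-one correction is applied to the empirical p value. All function names are hypothetical.

```python
import random

def circular_shift_by_chromosome(values_by_chrom, rng):
    """Rotate the SNP-ordered values of each chromosome by a random offset.

    values_by_chrom: dict mapping chromosome -> list of values in genomic order.
    Rotation keeps neighbouring values adjacent, so local correlation within
    the trait is retained, while the SNP-level pairing with the other trait
    is broken.
    """
    shifted = {}
    for chrom, values in values_by_chrom.items():
        k = rng.randrange(len(values)) if values else 0
        shifted[chrom] = values[k:] + values[:k]
    return shifted

def permutation_p_value(trait1, trait2, statistic, n_perm=1000, seed=0):
    """Empirical one-sided p value for statistic(trait1, trait2).

    `statistic` is a user-supplied callable (an assumption of this sketch).
    """
    rng = random.Random(seed)
    observed = statistic(trait1, trait2)
    exceed = 0
    for _ in range(n_perm):
        permuted = circular_shift_by_chromosome(trait2, rng)
        if statistic(trait1, permuted) >= observed:
            exceed += 1
    # Add-one correction (a choice made here) keeps the estimate above zero.
    return (exceed + 1) / (n_perm + 1)
```

With 1000 permutations and this correction, the smallest attainable p value is 1/1001, which matches the resolution of the "below 0.001" results reported above.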

Fig. 1

Pervasive pleiotropic effects between psychiatric and immune system disorders. a Conditional Q-Q plot showing pleiotropy between schizophrenia and 7 immune system disorders. Black dots represent all 1 219 805 SCZ GWAS SNPs while the other 7 colored dots represent different subsets of SNPs selected from the corresponding immune system disorder GWAS whose \(p<0.0001\) (left panel) and \(p<0.001\) (right panel), with the number of SNPs in each subset shown in brackets. b Chromosome-bound circular permutation to adjust LD effects for assessment of the significance of pleiotropy of eight GWAS pairs. For each of the 8 trait pairs, 1000 chromosome-bound circular permutations were performed. The distribution of the test statistic from the 1000 permutations is shown by histograms, where the x-axis represents the test statistic, \(Diff\_PI = \hat{\pi }_{11}-( \hat{\pi }_{10} +\hat{\pi }_{11} )( \hat{\pi }_{01}+\hat{\pi }_{11})\), and the y-axis represents its frequency. The red vertical line denotes the observed test statistic for each trait pair (color figure online)

Fig. 2

GPA results showing pleiotropic effects among SCZ, BPD, UC and CD. Purple, red, green and blue represent SCZ, BPD, UC and CD; gray represents the proportion of SNPs associated with both disorders, and white represents the proportion of SNPs associated with neither disorder. Upper triangle: pie charts show proportion of SNPs associated with only one disorder, both disorders (gray), and neither disorder (white). Lower triangle: bar plots contrasting proportions of associated SNPs for each disorder when analyzed separately (first and third bar in darker color), and proportion of associated SNPs when two disorders are jointly analyzed (second and fourth bar for proportion of SNPs associated with only one disorder, and fifth gray bar for proportion of SNPs associated with both disorders). Error bars indicate one standard error (color figure online)

Enrichment of immune-related annotations in multiple psychiatric disorders

Observation of extensive pleiotropy naturally leads to the exploration of functional enrichment for the shared genes to better understand the underlying biology. We used central nervous system (CNS) SNPs and immune-related eQTLs (see “Materials and methods”) to represent the functional sites relevant to the CNS and immune system, respectively. Because 12.5 % of CNS SNPs overlap immune eQTLs, we also tested enrichments excluding those overlapping SNPs.

We first tested for enrichment of CNS SNPs in all 18 traits (Fig. 3a). As expected, all psychiatric disorders had modest enrichment for CNS SNPs (>1.3-fold, except for MDD, 1.09-fold). The enrichment effects could still be observed (and were even stronger for ADHD) with the MHC region and/or immune-related eQTLs excluded (Fig. S3). Only three immune system disorders (MS, PS and RA) showed modest enrichment for CNS SNPs (1.5, 1.2, and 1.2-fold, respectively), but not with immune eQTLs excluded (0.9, 0.4, and 0.5-fold, respectively). This suggests that enrichment of CNS SNPs in immune traits was driven by overlapping immune eQTLs. We also observed enrichment of CNS SNPs for education years (1.25-fold), college completion status (1.29-fold), and BMI (1.55-fold), but neither waist-to-hip ratio adjusted BMI nor height showed enrichment of CNS SNPs.

Next, we tested enrichment of immune eQTLs in the same set of 18 traits (Fig. 3b). The seven immune-mediated disorders consistently had the strongest enrichment (ranging from 2.0 to 8.5-fold). We also observed enrichment of immune eQTLs in four psychiatric disorders (SCZ, BPD, ASD, and MDD; 2.0, 2.0, 1.4, and 1.6-fold, respectively), and Parkinson’s disease (1.4-fold). Those enrichment effects persisted with the MHC region and/or CNS SNPs excluded, suggesting the enrichment was not solely due to eQTLs in the MHC region or overlapping with CNS SNPs (Fig. S3). We also observed immune eQTL enrichment in two education-related traits, college completion (1.39-fold) and years of education (1.46-fold), and in three physical features, BMI (1.99-fold), obesity measured by waist-to-hip ratio adjusted BMI (2.90-fold), and height (2.97-fold).

To explore this hypothesis further, we tested levels of enrichment of immune-related eQTLs in SNPs associated with both psychiatric disorders (SCZ, BPD, ASD, MDD, ADHD) and Crohn’s disease, and observed larger enrichment ratios compared with those SNPs associated with only one disease (Fig. S4). This result suggests that the shared genetic components between the five psychiatric disorders and CD are closely related to immune function. Next we tested enrichment of DNase-peak located SNPs in SCZ GWAS signals from 98 ENCODE cell lines (Table S2), and found the top cell lines were from blood elements having important roles in immune response, with the top two cell lines being CD20+ B cells and Th2 cells (CD4+ T cells) (Fig. S5). We also tested enrichment of an epigenetic marker H3K9ac (H3 lysine 9 acetylation), known to mark active enhancers and promoters, in eight tissues from the ROADMAP project (Roadmap Epigenomics Consortium 2015). We observed that both psychiatric disorders and immune-related disorders have the highest enrichment for H3K9ac markers in blood, while educational traits (years of education and college completion) have the highest enrichment for H3K9ac markers in brain (Fig. S6). We also observed enrichment of H3K9ac markers from fat tissue in waist-to-hip ratio adjusted BMI GWAS (Fig. S6). Those results further demonstrate an immune-specific contribution to the psychiatric disorder GWAS signals.

Fig. 3

Enrichment of CNS SNPs and immune eQTLs in 18 traits. a Enrichment of CNS SNPs (comprising 18.8 % of all SNPs) in 18 traits from three categories: psychiatric disorders or CNS-related disorder (red), immune system-related disorders (blue), and body somatic features (black). For each trait, the first bar (darker color) excludes immune eQTLs from CNS SNPs, and the second bar (light color) is for all CNS SNPs (comprising 21.4 % of all SNPs). b Enrichment of immune eQTLs (comprising 7.5 % of all SNPs) in 18 traits from three categories: psychiatric disorders or CNS-related disorder (red), immune system-related disorders (blue), and other body somatic features (black). For each trait, the first bar (darker color) excluded CNS SNPs from immune eQTLs, and the second bar (light color) is for all immune eQTLs (comprising 10.1 % of all SNPs) (color figure online)

Trend of consistent effect direction between psychiatric disorders and immune system disorders

To explore the mechanism of these pleiotropic effects further, we examined effect directions. For each SNP, the same allele may increase or reduce susceptibility for the two disorders (same direction) or have opposite effects (different directions). The SCZ Q-Q plot (Fig. S7) shows interesting signals conditional on having the same effect direction with CD. We then considered four trait pairs showing strong pleiotropy: SCZ–CD (\(p=1.9\times 10^{-109}\)), BPD–CD (\(p=1.5\times 10^{-13}\)), SCZ–height (\(p=2.0\times 10^{-122}\)), and BPD–height (\(p=5.4\times 10^{-150}\)). For each of these four pairs, there is no correlation in effect direction when all genotyped SNPs are considered (Table S3). However, trends emerged after we partitioned the SNPs into 10 groups according to their posterior probabilities of being associated with both traits. Proportions of SNPs having the same effect directions were calculated for each group. There are clear patterns for SCZ–CD and BPD–CD, but not for SCZ–height or BPD–height, as shown in Fig. 4.

In general, the higher the posterior probability of a SNP being associated with SCZ (or BPD) and CD, the more likely that the SNP had the same effect direction for the pair. For SCZ–CD, among the 85 top SNPs with posterior probabilities of association with both SCZ and CD higher than 0.9 (Table S4), 97.6 % of SNPs had the same effect directions (an allele either increases or reduces both SCZ and CD risks). Similarly, for BPD–CD, in the SNP groups with posterior probabilities higher than 0.8, and between 0.7 and 0.8, 83 and 95 % of SNPs had the same effect direction, respectively (Fig. 4). Similar patterns were also observed for SCZ–RA and BPD–RA pairs (Fig. S8). In contrast, the proportion of SCZ risk alleles that were associated with lower height was about 50 % for all SNP groups, regardless of their posterior probabilities of being associated with both SCZ and height. Effect direction distributions across the 10 posterior groups for the BPD–height pair behaved similarly. We also investigated the influence of LD blocks on the observed effect direction trend by grouping SNPs into LD blocks based on the DistiLD Database (Palleja et al. 2012), as detailed in Supplementary Methods. LD blocks with high posterior probability were more likely to have a higher proportion of SNPs with the same effect direction for SCZ–CD and BPD–CD, while the effect direction was less consistent for SCZ–height and BPD–height (Fig. S9).
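The grouping used for this effect-direction analysis can be sketched as below. This is an illustrative Python fragment under assumptions: the 10 groups are taken to be equal-width bins of posterior probability, effects are signed values (for example log odds ratios) reported for the same reference allele in both studies, and the function name is hypothetical.

```python
def concordance_by_bin(posteriors, effects1, effects2, n_bins=10):
    """Proportion of SNPs with the same effect sign in each posterior bin.

    posteriors: posterior probability of association with both traits.
    effects1, effects2: signed effects aligned to the same allele (assumed).
    Returns a list of length n_bins; empty bins are reported as None.
    """
    same = [0] * n_bins
    total = [0] * n_bins
    for post, b1, b2 in zip(posteriors, effects1, effects2):
        # Equal-width bins; a posterior of exactly 1.0 goes in the last bin.
        idx = min(int(post * n_bins), n_bins - 1)
        total[idx] += 1
        if b1 * b2 > 0:
            same[idx] += 1
    return [s / t if t else None for s, t in zip(same, total)]
```

Under random effect directions each bin would hover near 0.5, which is the pattern reported for the height pairs. A rising proportion toward the last bins corresponds to the SCZ–CD and BPD–CD trend.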

Fig. 4

Trend of consistent effect directions for SCZ/BPD–CD across posterior probability groups. Proportion of SNPs having the same effect direction for trait pairs, in each of the 10 posterior probability groups (darker colors indicate higher posterior probability), where SNPs were grouped based on posterior of being associated with both traits into 10 equal bins. Four pairs of traits: SCZ–CD, SCZ–height, BPD–CD, and BPD–height

Genome region enrichment analysis

To demonstrate the biological mechanism of the pleiotropy between psychiatric disorders and immune-mediated disorders, we tested genome-wide enrichment of potential pleiotropic SNPs in cytobands, protein–protein interaction networks, and gene ontology (GO) terms, for 28 disorder pairs between seven immune system disorders (CD, UC, MS, PS, RA, SLE, and T1D) and four psychiatric disorders (SCZ, BPD, MDD, and ASD), detailed in “Materials and methods”.

In cytoband enrichment analysis, a complete list of all cytobands with enrichment odds ratio (OR)>5 and Bonferroni-adjusted p value <0.001 in at least one disease pair is reported in Table S5. Some cytobands have significant enrichment in more than one disease pair, such as the MHC region and 1p13.2, indicating their role in affecting both psychiatric disorders and immune system disorders. Specifically, cytoband 1p13.2 was significantly enriched for the eight disorder pairs between {SCZ, BPD, MDD, and ASD} and {T1D and RA}, with Bonferroni-adjusted p values ranging from \(6.8\times 10^{-26}\) to \(2.7\times 10^{-78}\), with top SNPs located in genes AP4B1, PTPN22, and PHTF1 (Fig. S10). In protein–protein interaction (PPI) network analysis, several sub-networks (Fig. S11) were highlighted. Specifically, the protein–protein interaction clusters most responsible for shared genetic components between psychiatric disorders and immune system disorders in these data were: (1) three minor gene subunits HLA-E, HLA-F, and HLA-G, but not the three major gene subunits, interacting with TAP1 and TAP2, which are transporters associated with antigen processing that cooperate with MHC class I to present antigens (Suh et al. 1994); (2) interaction between HLA-DO, HLA-DM, and HLA-DR proteins; and (3) a set of genes with important roles in transcriptional activation, including BRD2, TUBB, ABT1, and multiple histone coding genes. In GO term enrichment analysis, the identified top terms included antigen processing and presentation, MHC protein complex, allograft rejection, and NF-kappaB binding (Table S6), which further suggests the enrichment of immune system function in shared genetic factors between psychiatric and immune system disorders.

Discussion

Our work demonstrates extensive pleiotropy between psychiatric disorders and immune system disorders. It is a common concern that the uneven distribution of genomic features, such as LD blocks and genes, may bias these findings. To address this issue rigorously, we performed chromosome-bound circular permutation (Kindt et al. 2013) on eight trait pairs for which we performed comprehensive analysis in this work (Fig. 1b). All eight pairs yielded highly significant permutation-based p values, consistent with the GPA pleiotropy test results.

Beyond the evidence of pleiotropy, our results suggest how psychiatric disorders and immune system disorders are related genetically. We observed a major but not exclusive role of the MHC region in contributing to the pleiotropy between psychiatric disorders and immune system disorders. First, we observed enrichment of immune eQTLs even after the whole MHC region was removed (Fig. S3). Second, cytoband enrichment results indicate roles played by other specific genomic regions, such as 1p13.2, harboring gene PTPN22 [Protein Tyrosine Phosphatase, Non-Receptor Type 22 (Lymphoid)], which was also prioritized in our PPI analysis.

The observed tendency toward the same effect direction for SNPs associated with either SCZ or BPD paired with CD gives some insight into the underlying mechanism of their shared genetic factors. Pleiotropy has been extensively reviewed (Paaby and Rockman 2013; Williams 1957; Stearns 2010; Solovieff et al. 2013), but is still not well understood in terms of its extent, mechanisms, and consequences. The weak hypothesis of universal pleiotropy (WHUP) advocated by Fisher (1930) and Wright (1968) is based on two assumptions: in general, a phenotype might be influenced by many variants, and a variant might cause changes to many phenotypes. Under WHUP, extensive pleiotropy should be detected while the effect directions of shared genetic variants should be approximately random, which is not what we observed for SCZ–CD. Our observation supports a closer genetic relationship between those two types of disorders. Various molecular mechanisms could result in pleiotropy (Solovieff et al. 2013), which can be classified as biological pleiotropy, mediated pleiotropy, or spurious pleiotropy. Biological pleiotropy has separate causal paths for different phenotypes, while mediated pleiotropy has one phenotype lying on another phenotype’s causal path; thus by this mechanism, one phenotype might lead to another (Solovieff et al. 2013). Our results, in particular the striking trend of shared SNPs for SCZ and CD acting in the same direction, can be best explained by mediated pleiotropy. This, together with our observation of pervasive enrichment of immune eQTLs in psychiatric disorders, and the lack of enrichment of CNS SNPs (immune eQTLs excluded) in immune-mediated disorders (except MS, which is characterized by CNS pathology), suggests that immune system disorders might mediate psychiatric disorder risk, i.e. some downstream immune dysfunctions might be a trigger for some psychiatric disorders (or subtypes).

Consistent with a recent GWAS finding that pathways associated with BMI mostly act in the brain or peripheral nervous system (Locke et al. 2015), we observed enrichment of CNS SNPs for BMI, but a depletion of CNS SNPs for WHRadjBMI, suggesting different regulation mechanisms for body fat level and fat distribution. We also observed considerable enrichment of immune-related eQTLs in height and BMI, which is consistent with previous reports that BMI is correlated with immune parameters (Ilavska et al. 2012), and that height is associated with immune response in young men (Krams et al. 2014). Our results further support the relationship of BMI and height with the immune system from a genomics perspective.

Our work revealed the shared genetic factors between psychiatric and immune system disorders using novel methods and multiple omics data accumulated in recent years. We were able to show that there is pervasive pleiotropy between those two categories of disorders. Although the MHC region shows the strongest pleiotropic effects, other regions, such as cytoband 1p13.2, also contribute to the overall pleiotropy. Moreover, we found that pleiotropic SNPs for schizophrenia and Crohn’s disease tend to have the same effect direction for both disorders, suggesting mediated pleiotropy. Apart from cross-disorder study of GWAS summary statistics, our study included various genome annotations, including CNS SNPs, eQTLs detected in immune-related contexts, and DNase I hypersensitive sites from 98 cell lines. Study of those genome annotations provided further support for correlated genetic factors for psychiatric disorders and immune system disorders. Our work offers insights on pleiotropic mechanisms and a better understanding of pathophysiology, which may lead to improved prevention and treatment strategies for these two classes of disorders via immunological mechanisms. Although our analyses were based on results from GWAS consortia, the statistical power remains limited to identify the majority of disease-associated variants for these disorders. GWAS results from larger studies and improved statistical and bioinformatics approaches will enable us to identify more shared genetic pathways between these classes of disorders. As always, despite the very high significance levels we observed for some relationships, independent replication of our results is called for.

Materials and methods

Genome-wide association study (GWAS) data sources

We made use of GWAS summary statistics from a set of diverse and representative traits, including major psychiatric disorders, various immune system disorders, body morphological features, and some socioeconomic measures (Table 1). The p values were available for all traits, but the specified alleles and their corresponding beta or odds ratios indicating effect direction were available for only some of them.

Table 1 Sources of GWAS summary statistics

Genomic annotation data sources

Central nervous system (CNS) genes were identified in a previous study (Raychaudhuri et al. 2010), comprising preferentially brain-expressed genes (Raychaudhuri et al. 2010), neuronal-activity genes (Walsh et al. 2008), learning-related genes (Weiss et al. 2008), and synapse genes, defined by Gene Ontology (Ashburner et al. 2000). A complete list of these genes is given in Table S7. CNS SNPs were defined as SNPs located within 50 kb of CNS genes. To investigate immune system influence, we used context-specific eQTLs upon triggering immune response as detected by Fairfax et al. (2014), where interferon-\(\gamma\) and lipopolysaccharide (LPS) were used as inflammatory proxies to stimulate innate immune effects in monocytes from volunteers of European ancestry. We used a union of cis-eQTLs detected in four distinct contexts, naïve, LPS2 (monocytes exposed to 2 h of LPS), LPS24 (monocytes exposed to 24 h of LPS), and IFN-\(\gamma\) (monocytes exposed to 24 h of interferon-\(\gamma\)), as a set of immune-related eQTLs in our study. In total, we have 94,674 immune eQTLs and 199,202 CNS SNPs, of which 24,860 CNS SNPs are also immune eQTLs.

To investigate the impact of chromatin state, we used DNase I hypersensitivity sites extracted from ENCODE (2012) DNase-seq peaks and signal of open chromatin from 125 cell lines. There were 98 cell lines after removing 27 cancer cell lines (Table S2). Although they include few cell lines from brain regions, those 98 cell lines provide broad coverage of various blood cells, making them suitable for studying whether SCZ GWAS signals are enriched in functional genomic regions of cell types with important immune functions. DNase-Peak SNPs are SNPs located in or within 1 kb of DNase-Peaks. We obtained the H3K9ac histone marker from the Roadmap Epigenomics project (Roadmap Epigenomics Consortium 2015). We downloaded the consolidated narrow peaks from http://egg2.wustl.edu/roadmap/data/byFileType/peaks/consolidated/narrowPeak/, and then generated our annotations based on a \(-\log (p)\) value cutoff of 6. We mainly focused on eight primary tissues: blood (E038, E047, E062, E115, E116, E123, E124), brain (E067, E068, E069, E072, E073, E074, E125), breast (E027, E119), fat (E023, E025, E063), heart (E083), lung (E017, E088, E114, E128), muscle (E052, E107, E108, E120, E121), and skin (E126, E127).
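The window-based annotation rule (a SNP is annotated if it lies in or within 1 kb of a peak) can be sketched as follows. This Python fragment is an illustration only: it handles a single chromosome, treats intervals as closed, and ignores the differing coordinate conventions of real peak files. The function name `annotate_snps` is hypothetical.

```python
import bisect

def annotate_snps(snp_positions, peaks, window=1000):
    """Return 1 for each SNP in or within `window` bp of any peak, else 0.

    snp_positions: list of base-pair positions on one chromosome.
    peaks: list of (start, end) pairs on the same chromosome (closed
    intervals, an assumption of this sketch).
    """
    # Extend every peak by the window, then merge overlapping intervals.
    expanded = sorted((max(0, s - window), e + window) for s, e in peaks)
    merged = []
    for s, e in expanded:
        if merged and s <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], e)
        else:
            merged.append([s, e])
    starts = [s for s, _ in merged]
    flags = []
    for pos in snp_positions:
        # Rightmost merged interval starting at or before the SNP.
        i = bisect.bisect_right(starts, pos) - 1
        flags.append(1 if i >= 0 and pos <= merged[i][1] else 0)
    return flags
```

The same pattern, with a 50 kb window around gene boundaries instead of peaks, would give a binary annotation of the kind used for CNS SNPs.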

Pleiotropy and annotation enrichment analysis using GPA

Pleiotropy analysis was performed via the GPA R package (Chung et al. 2014), which is a statistical approach to exploring the genetic architecture of complex traits by integrating pleiotropy and functional annotation information, including prioritizing risk genetic variants, and evaluating annotation enrichment and pleiotropy by hypothesis testing. Instead of relying on genotype–phenotype data at the individual level, it only requires the summary statistics from GWAS, which makes it useful for integrative analysis of genomic data. For each trait pair, only overlapped SNPs across the two traits were used in our analysis. For convenience, we briefly introduce the GPA model (Chung et al. 2014) and its notation here.

Consider the p values \(\{p_1, \ldots ,p_M\}\) obtained by performing hypothesis testing of genome-wide SNPs from one GWAS, where M is the number of SNPs. In the GPA model, these p values are assumed to come from a mixture of null (un-associated) and non-null (associated), with probability \(\pi _0\) and \(\pi _1 = 1-\pi _0\), respectively. GPA uses the Uniform distribution on [0,1] and the Beta distribution with parameters (\(\alpha ,1\)) to model the p values from the null and non-null groups, respectively. Let \(Z_{j}\in \{0,1\}\) be the latent variable indicating whether the jth SNP is from the null or non-null group, where \(Z_{j}=0\) means null and \(Z_{j}=1\) means non-null. Then the GPA model for one GWAS without annotation can be written as:

$$\begin{array}{ll} \pi _0 = \text{ Pr }(Z_{j}=0):\quad p_j \sim \mathcal {U}[0,1], &\quad \text{if } Z_{j}=0, \\ \pi _1 = \text{ Pr }(Z_{j}=1):\quad p_j \sim \mathrm{Beta}(\alpha ,1),&\quad \text{if } Z_{j}=1. \end{array}$$
(1)
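To make model (1) concrete, the mixture can be fitted with a simple EM iteration. The sketch below is a simplified Python illustration, not the GPA R package implementation: it assumes p values in \((0,1]\) that are not all equal to 1, uses arbitrary starting values, runs a fixed number of iterations, and does not enforce any constraint on \(\alpha\). The function name is hypothetical.

```python
import math

def fit_two_groups(pvals, alpha=0.5, pi1=0.1, n_iter=200):
    """EM for p ~ pi0 * Uniform[0,1] + pi1 * Beta(alpha, 1).

    Returns the estimates (pi1, alpha). Starting values and the fixed
    iteration count are choices of this sketch.
    """
    logs = [math.log(p) for p in pvals]
    for _ in range(n_iter):
        # E-step: posterior probability that each SNP is non-null.
        z = []
        for lp in logs:
            f1 = alpha * math.exp((alpha - 1.0) * lp)  # Beta(alpha, 1) density
            z.append(pi1 * f1 / ((1.0 - pi1) + pi1 * f1))
        # M-step: closed-form weighted maximum-likelihood updates.
        sz = sum(z)
        pi1 = sz / len(z)
        alpha = -sz / sum(zj * lp for zj, lp in zip(z, logs))
    return pi1, alpha
```

The update for \(\alpha\) follows from maximizing \(\sum _j z_j\{\log \alpha +(\alpha -1)\log p_j\}\), which gives \(\alpha = -\sum _j z_j/\sum _j z_j\log p_j\).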

GPA further incorporates functional annotation as follows. Let an M-dimensional vector \(\mathbf {A}\) collect functional information from an annotation source, where \(A_{j}\in \{0,1\}\) indicates whether the jth SNP is a functional unit according to the annotation source. For example, given an eQTL data set, if the jth SNP is an eQTL, then \(A_{j}=1\), otherwise \(A_{j}=0\). The relationship between \({Z}_j\) and \(A_{j}\) is described as:

$$\begin{aligned} q_{0}=\text{ Pr }\left( A_{j}=1|Z_{j}=0\right) , \quad q_{1}=\text{ Pr }\left( A_{j}=1|Z_{j}=1\right) . \end{aligned}$$
(2)

Clearly, \(q_{0}\) can be interpreted as the proportion of null SNPs being annotated, \(q_{1}\) corresponds to the proportion of non-null SNPs being annotated, and \(q_{1}>q_{0}\) implies that there exists enrichment in this annotation.

Let \(\hat{\Theta } = \{\hat{\pi }_0,\hat{\pi }_1,\hat{q}_0,\hat{q}_1,\hat{\alpha }\}\) be the collection of the estimated model parameters. Then SNPs can be prioritized based on their local false discovery rates (FDR). When there is no annotation data, the local FDR is defined as the probability that the jth SNP belongs to the null group given its p value, i.e., \(\widehat{fdr}(p_j) = \Pr (Z_{j}=0|p_j;\hat{\Theta }).\) With annotation data, the FDR can be calculated as \(\widehat{fdr}(p_j,A_j) = \Pr (Z_{j}=0|p_j,A_j;\hat{\Theta })\). We can use the likelihood ratio test to assess the significance of its enrichment. Specifically, the significance of enrichment of an annotation for GWAS can be assessed by testing \(H_0: q_{0}=q_{1}\) versus \(H_1: q_{0}\ne q_{1}\). Standard errors of all the parameters can also be calculated.
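The local FDR defined above follows from Bayes' rule once the parameters are estimated. The Python sketch below is illustrative and assumes that, given \(Z_j\), the p value and the annotation are independent, so that the annotation simply reweights the two mixture components. The function name and argument layout are hypothetical.

```python
def local_fdr(p, pi1, alpha, annotated=None, q0=None, q1=None):
    """Posterior probability that a SNP is null given its p value
    (and, optionally, its binary annotation status).

    annotated=None reproduces the no-annotation case; otherwise q0 and q1
    are Pr(A=1 | Z=0) and Pr(A=1 | Z=1).
    """
    f0 = 1.0                           # Uniform[0,1] density
    f1 = alpha * p ** (alpha - 1.0)    # Beta(alpha, 1) density
    w0, w1 = 1.0 - pi1, pi1
    if annotated is not None:
        # Conditional independence of p and A given Z is assumed here.
        w0 *= q0 if annotated else 1.0 - q0
        w1 *= q1 if annotated else 1.0 - q1
    return w0 * f0 / (w0 * f0 + w1 * f1)
```

When \(q_1>q_0\), an annotated SNP receives a lower local FDR than the same p value would without annotation, and an unannotated SNP receives a higher one. This is how an enriched annotation helps prioritize SNPs.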

The extension of the above model to handle two GWAS is straightforward. Suppose the p values from two GWAS have been collected in an \(M\times 2\) matrix \(\mathbf {p}=[p_{jk}]\), where \(p_{jk}\) denotes the p value of the jth SNP in the kth GWAS, \(k=1,2\). Let \({Z}_j\in \{00,10,01,11\}\) indicate the association between the j-th SNP and the two phenotypes: \(Z_{j}=00\) means the jth SNP is associated with neither of them, \(Z_{j}=10\) means it is only associated with the first one, \(Z_{j}=01\) means it is only associated with the second one, and \(Z_{j}=11\) means it is associated with both. Then the two-groups model (1) can be extended to the following four-groups model:

$$\begin{aligned} \begin{array}{ll}\pi _{00} = \text{ Pr }(Z_{j}=00):\quad p_{j1} \sim \mathcal {U}[0,1],\quad p_{j2} \sim \mathcal {U}[0,1], &\quad \text{if } Z_{j}=00, \\ \pi _{10} = \text{ Pr }(Z_{j}=10):\quad p_{j1} \sim \mathrm{Beta}(\alpha _1,1),\quad p_{j2} \sim \mathcal {U}[0,1], & \quad \text{if } Z_{j}=10, \\ \pi _{01} = \text{ Pr }(Z_{j}=01):\quad p_{j1} \sim \mathcal {U}[0,1],\quad p_{j2} \sim \mathrm{Beta}(\alpha _2,1), &\quad\text{if } Z_{j}=01,\\ \pi _{11} = \text{ Pr }(Z_{j}=11):\quad p_{j1} \sim \mathrm{Beta}(\alpha _1,1),\quad p_{j2} \sim \mathrm{Beta}(\alpha _2,1), &\quad\text{if } Z_{j}=11. \end{array}\end{aligned}$$
(3)
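To make the use of model (3) concrete, the posterior probability of each association state follows from Bayes' rule. The sketch below is illustrative only: it takes the two p values of a SNP to be conditionally independent given \(Z_j\), as the product form of (3) suggests, and the function name `four_group_posterior` and the dictionary layout are hypothetical.

```python
def four_group_posterior(p1, p2, pi, alpha1, alpha2):
    """Return Pr(Z_j = z | p_j1, p_j2) for z in {"00", "10", "01", "11"}.

    pi maps each state to its prior proportion; the four values sum to 1.
    """
    def beta_density(p, alpha):
        # Density of Beta(alpha, 1) evaluated at p.
        return alpha * p ** (alpha - 1.0)

    f1 = beta_density(p1, alpha1)
    f2 = beta_density(p2, alpha2)
    # The uniform density is 1, so null components contribute no extra factor.
    weights = {
        "00": pi["00"],
        "10": pi["10"] * f1,
        "01": pi["01"] * f2,
        "11": pi["11"] * f1 * f2,
    }
    total = sum(weights.values())
    return {state: w / total for state, w in weights.items()}
```

The entry for state `"11"` is the posterior probability that a SNP is associated with both phenotypes.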

Similarly, functional annotation information can be incorporated into the multiple GWAS model (3) in the following way:

$$\begin{aligned} q_{00}=\text{ Pr }(A_{j}=1|Z_{j}=00), \\ q_{10}=\text{ Pr }(A_{j}=1|Z_{j}=10), \\ q_{01}=\text{ Pr }(A_{j}=1|Z_{j}=01), \\ q_{11}=\text{ Pr }(A_{j}=1|Z_{j}=11), \end{aligned}$$

where \(q_{00}\) is the probability of a null SNP being annotated, \(q_{10}\) is the probability of a SNP associated only with the first phenotype being annotated, \(q_{01}\) is the probability of a SNP associated only with the second phenotype being annotated, and \(q_{11}\) is the probability of a jointly associated SNP being annotated. For joint analysis of two GWAS data sets, the local FDR calculation and enrichment assessment can be done in a similar way. In addition, pleiotropy between the two phenotypes can be tested in a statistically rigorous way. In the absence of pleiotropy, the signals from the two GWAS are independent of each other, so testing for pleiotropy can be formulated as testing the following hypothesis:

$$\begin{aligned} H_0: \pi _{11} = \pi _{1*} \pi _{* 1} \quad \text{ vs. } \quad H_1: \text{ not } H_0, \end{aligned}$$
(4)

where \(\pi _{1*}=\pi _{10 }+\pi _{11}\) and \(\pi _{* 1}=\pi _{0 1}+\pi _{11}\). The likelihood ratio test statistic asymptotically follows a \(\chi ^2\) distribution with \(df=1\) under the null hypothesis.
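For illustration, the conversion of the likelihood ratio statistic into a p value can be sketched as follows. The two log-likelihood inputs are hypothetical: they stand for the maximized log-likelihoods of the unconstrained four-groups model and of the model fitted under the constraint \(\pi _{11} = \pi _{1*} \pi _{* 1}\), neither of which is computed here. The sketch uses the identity that the upper tail of a \(\chi ^2\) distribution with one degree of freedom equals \(\mathrm{erfc}(\sqrt{x/2})\).

```python
import math

def pleiotropy_lrt(loglik_alt, loglik_null):
    """Likelihood ratio test with one degree of freedom.

    loglik_alt  : maximized log-likelihood without the independence constraint
    loglik_null : maximized log-likelihood with pi_11 = pi_1* x pi_*1 imposed
    """
    stat = 2.0 * (loglik_alt - loglik_null)
    # Guard against small negative values caused by numerical optimization error.
    stat = max(stat, 0.0)
    # Survival function of chi-square(df=1): P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value
```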

Chromosome-bound circular permutation to assess the significance of pleiotropy

To fully account for the potential effects of LD structure when testing the significance of pleiotropy, we adopted chromosome-bound circular permutation (Kindt et al. 2013). This permutation approach preserves the observed LD structure of SNPs across the genome, and establishes a robust null distribution from which the significance of the observed pleiotropy can be calculated. To assess the significance of pleiotropy between two traits A and B, the summary statistics of trait A were circularly permuted within each chromosome as follows. The summary statistics were ordered by SNP position. Next, a random integer between one and the number of SNPs on the chromosome was generated. The summary statistics were then shifted down by that number of positions, and statistics shifted past the last SNP on the chromosome wrapped around to the beginning. The summary statistics of trait B were left unchanged. For each permutation, GPA was used to estimate the proportions of SNPs in the four groups: SNPs associated with neither trait, SNPs associated only with trait A, SNPs associated only with trait B, and SNPs associated with both, denoted as {\(\hat{\pi }_{00},\hat{\pi }_{10},\hat{\pi }_{01},\hat{\pi }_{11}\)}. Under the null hypothesis that there is no pleiotropy between A and B, the joint distribution should be the product of the marginal distributions, i.e., \(\hat{\pi }_{11}=( \hat{\pi }_{10} + \hat{\pi }_{11} )( \hat{\pi }_{01} + \hat{\pi }_{11})\). Therefore, we defined our test statistic as \(Diff\_PI = \hat{\pi }_{11}-( \hat{\pi }_{10} +\hat{\pi }_{11} )( \hat{\pi }_{01}+ \hat{\pi }_{11}),\) and recorded it for each permutation. The observed test statistic from the real data was compared with the null distribution obtained from chromosome-bound circular permutation, and p values were obtained accordingly. The permutation results for eight pairs of GWAS are shown in Fig. 1b.
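The permutation scheme described above can be sketched in Python as follows. This is a minimal illustration, not the code used in the study: `fit_proportions` is a hypothetical callable standing in for the GPA fit, and the one-sided comparison with a +1 correction in `permutation_p_value` is our assumption, since the text does not specify how the empirical p value was computed.

```python
import random

def circular_shift(stats, rng):
    """Shift position-ordered statistics down by a random offset, wrapping around."""
    n = len(stats)
    k = rng.randint(1, n)  # inclusive bounds; k == n leaves the order unchanged
    return [stats[(i - k) % n] for i in range(n)]

def diff_pi(pi):
    """Test statistic: pi_11 minus the product of the two marginal proportions."""
    return pi["11"] - (pi["10"] + pi["11"]) * (pi["01"] + pi["11"])

def null_distribution(trait_a_by_chrom, trait_b_by_chrom, fit_proportions,
                      n_perm, seed=0):
    """Permute trait A within each chromosome; trait B is left unchanged."""
    rng = random.Random(seed)
    null_stats = []
    for _ in range(n_perm):
        permuted_a = {chrom: circular_shift(stats, rng)
                      for chrom, stats in trait_a_by_chrom.items()}
        # fit_proportions (hypothetical) returns {"00": .., "10": .., "01": .., "11": ..}
        null_stats.append(diff_pi(fit_proportions(permuted_a, trait_b_by_chrom)))
    return null_stats

def permutation_p_value(observed, null_stats):
    """Assumed one-sided empirical p value with a +1 correction."""
    exceed = sum(1 for s in null_stats if s >= observed)
    return (exceed + 1) / (len(null_stats) + 1)
```

Because each chromosome is rotated as a whole, neighbouring SNPs keep their relative order, which is what preserves the local correlation structure of trait A's statistics while breaking their alignment with trait B.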

Effect direction analysis on the level of LD blocks

SNPs were grouped into LD blocks based on the DistiLD database (Palleja et al. 2012). For each LD block, we first calculated the number of SNPs within the block and the proportion of these SNPs having the same effect direction for both traits. We then evaluated, for each SNP within the block, its posterior probability of being associated with both traits, and assigned the maximum of these posterior probabilities to the LD block as the block's posterior probability.
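The block-level summary can be sketched as below. The input layout is hypothetical: each SNP is a dictionary holding its LD block label, signed effect estimates for the two traits (assumed to be aligned to the same allele, so that equal signs mean the same direction), and its posterior probability of being associated with both traits.

```python
def summarize_ld_blocks(snps):
    """Summarize SNP-level results per LD block.

    snps: iterable of dicts with keys 'block', 'effect_a', 'effect_b', 'post_both'.
    """
    blocks = {}
    for snp in snps:
        entry = blocks.setdefault(
            snp["block"], {"n_snps": 0, "n_same": 0, "max_post": 0.0})
        entry["n_snps"] += 1
        # Same direction when both signed effects share a sign (assumption).
        if snp["effect_a"] * snp["effect_b"] > 0:
            entry["n_same"] += 1
        # The block posterior is the maximum SNP-level posterior in the block.
        entry["max_post"] = max(entry["max_post"], snp["post_both"])
    return {
        block: {
            "n_snps": e["n_snps"],
            "prop_same_direction": e["n_same"] / e["n_snps"],
            "block_posterior": e["max_post"],
        }
        for block, e in blocks.items()
    }
```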

Enrichment analysis

Enrichment analysis for cytobands. Cytoband positions were downloaded from the UCSC Table Browser (Karolchik et al. 2004), giving 862 cytobands in total. Enrichment tests were carried out on 28 disease pairs, formed between seven immune system disorders (CD, UC, MS, PS, RA, SLE, and T1D) and four psychiatric disorders (SCZ, BPD, MDD, and ASD). Fig. S12 shows the posterior probability of being associated with both diseases for those 28 disease pairs. For each disease pair, potentially shared SNPs were selected based on the posterior probability \(\text{ Pr }(Z_j=11)>0.5\). The number of potentially shared SNPs varies across disease pairs, ranging from 0 to 4505 (for SCZ–CD). For each cytoband, we calculated {\(x_{11},x_{10},x_{01},x_{00}\)}, with \(x_{11}\) being the number of SNPs in the cytoband that are potentially shared, \(x_{10}\) the number of SNPs in the cytoband that are not potentially shared, \(x_{01}\) the number of potentially shared SNPs not in the cytoband, and \(x_{00}\) the number of SNPs that are neither in the cytoband nor potentially shared. Under the null hypothesis that there is no enrichment, {\(x_{11},x_{10},x_{01},x_{00}\)} follows a hypergeometric distribution. Deviation from the null hypothesis was tested using Fisher's exact test, and p values were adjusted for multiple testing (Dunn 1961).
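The cytoband test can be illustrated with the following sketch, which evaluates the upper hypergeometric tail directly. Two points are assumptions on our part: the test is taken to be one-sided (over-representation of shared SNPs in the cytoband), and the multiple-testing adjustment is taken to be a Bonferroni-type correction, which is how we read the Dunn (1961) citation. Exact binomial coefficients become slow for genome-scale counts, where a numerical library routine would be preferable.

```python
from math import comb

def fisher_enrichment_p(x11, x10, x01, x00):
    """One-sided Fisher's exact test for over-representation.

    x11: shared SNPs in the cytoband      x10: non-shared SNPs in the cytoband
    x01: shared SNPs outside the cytoband x00: non-shared SNPs outside it
    """
    in_band = x11 + x10
    shared = x11 + x01
    total = x11 + x10 + x01 + x00
    denom = comb(total, in_band)
    upper = min(in_band, shared)
    # Sum hypergeometric probabilities of seeing x11 or more shared SNPs in the band.
    tail = sum(comb(shared, k) * comb(total - shared, in_band - k)
               for k in range(x11, upper + 1))
    return tail / denom

def bonferroni(p_values):
    """Multiply each p value by the number of tests and cap the result at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]
```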

Enrichment analysis for gene ontology (GO) terms and KEGG pathways. Genome annotation enrichment analysis was performed via DAVID (Huang et al. 2009a, b) on GO terms (Ashburner et al. 2000) and KEGG pathways (Kanehisa and Goto 2000; Data 2014). Enrichment tests were carried out on 28 disease pairs, formed between seven immune system disorders (CD, UC, MS, PS, RA, SLE, and T1D) and four psychiatric disorders (SCZ, BPD, MDD, and ASD). Gene lists were constructed from genes containing SNPs with posterior probability >0.8 in at least three disorder pairs.
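One way to read the gene list rule is sketched below: a gene is included if it contains at least one SNP whose posterior probability exceeds 0.8 in at least three disorder pairs. This per-SNP interpretation is our assumption, and the function name, the input dictionaries, and the SNP-to-gene mapping are all hypothetical.

```python
def build_gene_list(posteriors, snp_to_gene, threshold=0.8, min_pairs=3):
    """Collect genes containing a SNP that passes the threshold in enough pairs.

    posteriors : dict mapping SNP id -> dict mapping disorder pair -> posterior
    snp_to_gene: dict mapping SNP id -> gene symbol (SNPs outside genes omitted)
    """
    genes = set()
    for snp, by_pair in posteriors.items():
        # Count the disorder pairs in which this SNP exceeds the threshold.
        n_pairs = sum(1 for value in by_pair.values() if value > threshold)
        gene = snp_to_gene.get(snp)
        if gene is not None and n_pairs >= min_pairs:
            genes.add(gene)
    return sorted(genes)
```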

Enrichment analysis for protein–protein interaction (PPI) network. PPI can provide independent information for prioritizing genetic findings, and thus we used DAPPLE (Rossin et al. 2011) to construct PPI sub-networks in which PPI edges are overrepresented among top SNPs. Enrichment tests were carried out on 28 disease pairs, formed between seven immune system disorders (CD, UC, MS, PS, RA, SLE, and T1D) and four psychiatric disorders (SCZ, BPD, MDD, and ASD). For each disease pair, potentially shared SNPs were selected based on the posterior probability of being associated with both diseases, \(\Pr (Z_j=11)>0.8\) (Fig. S13).