Introduction

Humans express at least seven alcohol dehydrogenase (ADH) isoforms, each with slightly different properties (Luo et al. 2008). ADHs are expressed predominantly in the liver, the upper digestive tract (from mouth to stomach), and kidney, and partly in the brain (Yoshida et al. 1998). Because ADHs are key catabolic enzymes for ethanol, ADH variants have been implicated in the risk for alcohol dependence by previous studies [reviewed by (Luo et al. 2006)]. However, in addition to catalyzing the oxidation of retinol and ethanol, ADHs may be involved in the metabolic pathways of several neurotransmitters including serotonin, epinephrine, norepinephrine, and dopamine (Holmes 1994; Svensson et al. 1999). The functions of ADHs in the metabolism of these monoamines suggest potential roles in the etiology of other neuropsychiatric disorders.

ADH isoforms are encoded by the ADH7–ADH1C–ADH1B–ADH1A–ADH6–ADH4–ADH5 gene cluster on chromosome 4. Candidate gene studies have widely reported that at least four functional ADH gene variants, i.e., rs1229984 (ADH2*2; Arg48His), rs2066702 (ADH2*3; Arg370Cys), rs1693482 (ADH3*2; Arg272Gln), and rs698 (ADH3*2; Ile350Val), significantly affect the risk for alcohol dependence [reviewed by (Luo et al. 2006)]. These variants are rare in most populations, e.g., in Europeans (minor allele frequency [MAF] of rs2066702 = 0.000 and MAF of rs1229984 = 0.008) and Africans (MAF of rs1229984 = 0.000, MAF of rs1693482 = 0.052, and MAF of rs698 = 0.042). In one of our previous studies, we also found that the rare variant constellation across the entire ADH cluster was associated with alcohol dependence in European-Americans, European-Australians, and African-Americans (Zuo et al. 2013b). Numerous genome-wide association studies (GWASs) of alcohol dependence using common variants as markers have also been performed; however, only one GWAS identified a common ADH variant (rs1789891; MAF = 0.192) that was associated with alcohol dependence at the genome-wide significance level (p = 1.3 × 10^−8; OR = 1.46; α = 5 × 10^−8) (Frank et al. 2012). This leads to the hypothesis that common ADH variants might be associated with diseases other than alcohol dependence. For example, one candidate gene study reported that common variants at ADH7 were associated with Parkinson’s disease (Buervenich et al. 2000).
To further test this hypothesis, in the present study, we comprehensively examined the associations between common ADH variants (MAF >0.05 in both cases and controls) and 11 neuropsychiatric and neurological disorders including schizophrenia, autism, attention deficit hyperactivity disorder (ADHD), alcoholism, major depression, bipolar disorder, Alzheimer’s disease, amyotrophic lateral sclerosis (ALS), early onset stroke, ischemic stroke, and Parkinson’s disease in subjects of European or African descent.

Materials and methods

Subjects

A total of 50,063 subjects in 25 independent cohorts with 11 different neuropsychiatric and neurological disorders were analyzed. They included case–control and family-based samples, genotyped on Illumina, Affymetrix, or PERLEGEN microarray platforms. All subjects gave informed consent. Diagnoses, ethnicities, study designs, sample sizes, and dataset names for these cohorts are shown in Table 1. More detailed demographics data of these cohorts were published previously (Stefansson et al. 2009; Anney et al. 2010; Zuo et al. 2011, 2012, 2013a, b).

Table 1 Associations between ADH gene cluster and different neuropsychiatric or neurological disorders

The African-American schizophrenia cohort came from the GAIN dataset (dbGaP access number: phs000021.v3.p2), including 1,195 cases with schizophrenia and 954 controls. The subjects were genotyped on AFFYMETRIX AFFY_6.0 platform. All subjects were at least 18 years old. The cases included 746 males (41.9 ± 10.8 years) and 449 females (43.0 ± 9.8 years); and the controls included 362 males (46.2 ± 13.7 years) and 592 females (45.0 ± 12.9 years). Affected subjects met lifetime DSM-IV criteria for schizophrenia (American Psychiatric Association 1994). Cases were excluded if they had worse than mild mental retardation, or if their psychotic illness was judged to be secondary to substance use or a neurological disorder. Controls were excluded if they did not deny all of the following psychosis screening questions: treatment for or diagnosis of schizophrenia or schizoaffective disorder; treatment for or diagnosis of bipolar disorder or manic-depression; treatment for or diagnosis of psychotic symptoms such as auditory hallucinations or persecutory delusions.

The autism cohort came from the AGP dataset (dbGaP access number: phs000267.v1.p1). A total of 1,366 families (trios) contained 4,075 European-American subjects including 1,330 probands with autism. The probands consisted of 1,121 males (7.2 ± 3.2 years) and 209 females (7.1 ± 3.0 years). Affected subjects were diagnosed using the Autism Diagnostic Interview-Revised (ADI-R) and Autism Diagnostic Observation Schedule (ADOS) instruments, and met DSM-IV criteria for autism (American Psychiatric Association 1994). Cases with known karyotypic abnormalities, fragile X mutations, or other genetic disorders were excluded. The subjects were genotyped on ILLUMINA_Human_1M platform.

Imputation

To make the genetic marker sets highly consistent across the different samples, we imputed the missing single nucleotide polymorphisms (SNPs) across the entire ADH gene cluster (Chr4: 100204900-100631900) in all samples of the 25 cohorts using the same reference panels (i.e., the 1000 Genomes Project and HapMap 3 panels). We used the programs IMPUTE2 (Howie et al. 2009) and BEAGLE (Browning and Browning 2009) for imputation, with the reference CEU panel for the samples of European descent and the reference YRI panel for the samples of African descent. To maximize the accuracy of imputation and minimize false positives, only the genotypes that were consistently imputed from the two independent reference panels (i.e., the 1000 Genomes Project and HapMap 3 panels) and consistently imputed by both IMPUTE2 and BEAGLE were selected for analysis. The uncertainty rate of inference for missing genotypes was controlled at <1%. Furthermore, only the SNPs that had similar MAFs (with frequency difference <2% within the same ethnicity) in the healthy controls across different cohorts and the HapMap database were selected for analysis. After this strict selection, we were highly confident in the quality of the imputed genotype data. In the imputed genotypes of all four of our family-based cohorts, we did not find any individual with more than 0.1% Mendelian inconsistency (considering all SNPs tested) or any SNP with more than 0.1% Mendelian inconsistency (considering all individuals tested).
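The selection rules above can be sketched in a few lines. This is only an illustration under stated assumptions, not the pipeline used in the study: the function names, the genotype encoding as two-letter strings, and the per-genotype uncertainty inputs are all hypothetical, and neither IMPUTE2 nor BEAGLE is called here.

```python
from typing import Optional, Sequence


def consensus_genotype(call_a: Optional[str], call_b: Optional[str],
                       uncertainty_a: float, uncertainty_b: float,
                       max_uncertainty: float = 0.01) -> Optional[str]:
    """Keep an imputed genotype only if two independent runs agree.

    call_a / call_b are hypothetical two-letter genotype strings (e.g. "AG")
    from two imputation runs (two programs or two reference panels).
    A genotype is dropped (None) if either run is missing, if either run's
    uncertainty is not below the threshold (<1% in the text), or if the
    two calls differ.
    """
    if call_a is None or call_b is None:
        return None
    if uncertainty_a >= max_uncertainty or uncertainty_b >= max_uncertainty:
        return None
    # Allele order within a genotype carries no information: "GA" == "AG".
    norm_a = "".join(sorted(call_a))
    norm_b = "".join(sorted(call_b))
    return norm_a if norm_a == norm_b else None


def maf_consistent(control_mafs: Sequence[float], max_diff: float = 0.02) -> bool:
    """True if control MAFs from different sources differ by less than 2%."""
    return max(control_mafs) - min(control_mafs) < max_diff
```

For example, consensus_genotype("AG", "GA", 0.001, 0.002) keeps the call, whereas a disagreement between the two runs, or an uncertainty of 0.02 in either run, discards it.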

Data cleaning

We stringently cleaned the phenotype data [detailed previously (Zuo et al. 2012)] and then the imputed genotype data. Subjects with poor genotypic data, allele discordance, sample relatedness, a mismatch between self-identified and genetically inferred ethnicity, or a missing genotype call rate ≥2% across all SNPs were filtered out. Furthermore, we filtered out the monomorphic SNPs and the SNPs with allele discordance, Mendelian errors (in family samples), an overall missing genotype call rate ≥2%, MAFs <0.05 in either cases or controls, or deviation from Hardy–Weinberg equilibrium (HWE; p < 10^−4) within controls. We also filtered out the SNPs with MAF differences ≥2% or missing rate differences ≥2% between two samples that had the same phenotype and microarray platform. The cleaned sample sizes and cleaned SNP numbers are shown in Table 1.
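As a rough illustration of the SNP-level thresholds described above, the sketch below applies the MAF, missingness, and HWE filters to a single SNP. It rests on several assumptions: the function names are hypothetical, the HWE p value is obtained from a 1-degree-of-freedom chi-square approximation (the text does not state which HWE test was used, and an exact test may well have been applied), and the allele-discordance, Mendelian-error, and between-sample filters are omitted.

```python
import math


def hwe_chi2_p(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Chi-square (1 df) p value for deviation from Hardy-Weinberg equilibrium.

    Inputs are genotype counts among controls. This is an approximation
    chosen for brevity, not necessarily the test used in the study.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))


def snp_passes_qc(maf_cases: float, maf_controls: float, missing_rate: float,
                  hwe_p_controls: float, min_maf: float = 0.05,
                  max_missing: float = 0.02, min_hwe_p: float = 1e-4) -> bool:
    """Apply the MAF, call-rate, and HWE thresholds quoted in the text."""
    return (min(maf_cases, maf_controls) >= min_maf
            and missing_rate < max_missing
            and hwe_p_controls >= min_hwe_p)
```

A SNP with genotype counts 25/50/25 in controls sits exactly at HWE proportions and passes, whereas counts of 50/0/50 (no heterozygotes) fall far below the 10^−4 threshold and would be removed.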

Association test

For case–control cohorts, the allele frequencies were compared between cases and controls using logistic regression analysis as implemented in the program PLINK (Purcell et al. 2007). Diagnosis served as the dependent variable, alleles served as the independent variables, and ancestry proportions (to control for admixture effects) (Zuo et al. 2012), sex, and age served as covariates. The ancestry proportions for each individual were estimated using the program STRUCTURE (Pritchard et al. 2000). For the non-alcoholism cohorts, alcohol drinking behavior, if available, was also included as a covariate. For family cohorts, we used DFAM as implemented in PLINK to test associations (as effective as the program FBAT). The −log(p) value distribution is shown in Fig. 1. The MAFs and minimal p values of the most significant risk SNPs are shown in Table 1. The statistically significant risk SNPs associated with diseases (p < α) are shown in Table 2. Finally, we performed bioinformatic analysis of these significant risk SNPs to explore their potential functions using the UCSC Genome Browser including ENCODE data (http://genome.ucsc.edu).
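To make the case–control test concrete, the following is a minimal sketch of a single-SNP logistic regression fitted by Newton-Raphson, with a Wald test on the allele-dosage coefficient. It is a deliberate simplification and not PLINK's implementation: the covariates used in the study (ancestry proportions, sex, age, drinking behavior) are left out so that the 2 × 2 information matrix can be inverted by hand, and the function name and input encoding are hypothetical.

```python
import math
from typing import Sequence, Tuple


def logistic_additive_test(dosages: Sequence[float], status: Sequence[int],
                           n_iter: int = 25) -> Tuple[float, float]:
    """Fit logit P(case) = b0 + b1 * dosage and return (odds ratio, p value).

    dosages: minor-allele counts per subject (0, 1, or 2).
    status:  1 for cases, 0 for controls.
    No covariates are included (an assumption made to keep the sketch short).
    """
    b0 = b1 = 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(dosages, status):
            mu = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = mu * (1.0 - mu)
            g0 += y - mu            # score for the intercept
            g1 += (y - mu) * x      # score for the dosage coefficient
            h00 += w                # entries of the information matrix
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton-Raphson update: beta <- beta + H^{-1} * score
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    se_b1 = math.sqrt(h00 / det)    # sqrt of the (b1, b1) entry of H^{-1}
    z = b1 / se_b1
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided Wald test
    return math.exp(b1), p_value
```

With a binary predictor the fitted odds ratio equals the cross-product ratio of the 2 × 2 table, which gives a convenient sanity check. Perfectly separated data (every carrier a case, every non-carrier a control) would make the estimate diverge and is not handled here.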

Fig. 1 Regional association plots in the ADH cluster [left Y-axis corresponds to −log(p) value; right Y-axis corresponds to recombination rates; X-axis corresponds to genomic positions; quantitative color gradient corresponds to r^2; red squares represent peak SNPs. a Regional association plot in African-American GAIN schizophrenia sample; b regional association plot in European-American autism sample]

Table 2 Significant risk SNPs associated with schizophrenia and autism (p < α)

Correction for multiple testing

The experiment-wide significance level (α) was corrected for the number of effective markers that were calculated from the entire marker set by the program SNPSpD. SNPSpD is based on an adjusted Bonferroni correction method (Li and Ji 2005). The linkage disequilibrium (LD) structures were highly similar across different phenotype groups within the same ethnicity. Approximately, 100 effective SNPs captured most information of all common SNPs across the entire ADH gene cluster both in subjects of European and African descents. Thus, the corrected significance level (α) was set at 0.0005. The numbers of risk SNPs that were nominally (p < 0.05) or significantly (p < α) associated with phenotypes are shown in Table 1. Finally, q value for each SNP was estimated from p values within each phenotype group using the R package QVALUE (Storey and Tibshirani 2003). The numbers of risk SNPs with q < 0.05 and the q values for the significant risk SNPs are shown in Tables 1 and 2, respectively.
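The arithmetic behind the corrected threshold is 0.05 divided by roughly 100 effective SNPs, i.e., 0.0005. The sketch below spells this out and also shows the effective-marker formula of Li and Ji (2005), which sums I(λ ≥ 1) + (λ − ⌊λ⌋) over the eigenvalues λ of the SNP correlation matrix. The function names are hypothetical, the eigenvalues are assumed to be supplied by the caller (SNPSpD itself is not invoked), and the toy eigenvalues in the usage note are made up for illustration.

```python
import math
from typing import Sequence


def effective_marker_count(eigenvalues: Sequence[float]) -> float:
    """Li and Ji (2005) effective number of independent markers.

    eigenvalues: eigenvalues of the pairwise SNP correlation (LD) matrix,
    assumed to be computed elsewhere.
    """
    total = 0.0
    for lam in eigenvalues:
        lam = abs(lam)
        total += (1.0 if lam >= 1.0 else 0.0) + (lam - math.floor(lam))
    return total


def corrected_alpha(n_effective: float, family_alpha: float = 0.05) -> float:
    """Bonferroni-type threshold using the effective number of markers."""
    return family_alpha / n_effective


def classify(p_value: float, alpha: float) -> str:
    """Label a SNP as in the text: significant (p < alpha) or nominal (p < 0.05)."""
    if p_value < alpha:
        return "significant"
    if p_value < 0.05:
        return "nominal"
    return "not associated"
```

With about 100 effective SNPs, corrected_alpha(100) reproduces the threshold of 0.0005. As a toy case, eigenvalues of 2.5, 0.3, and 0.2 for three correlated SNPs give an effective count of 2.0.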

Results

Among a total of 632 common SNPs in African-American GAIN samples, 50 SNPs were nominally associated with schizophrenia (p < 0.05), 28 of which were significantly associated with schizophrenia after false discovery rate (FDR) correction (q < 0.05). With region-wide correction for multiple testing by SNPSpD, 19 SNPs were significantly associated with schizophrenia (8.9 × 10^−5 ≤ p ≤ 0.0003). These 19 SNPs were in high LD with one another (D′ = 1). Among a total of 921 common SNPs in European-Americans, 141 SNPs were nominally associated with autism (p < 0.05), 15 of which survived FDR correction (q < 0.05), and 6 of which survived region-wide SNPSpD correction (2.4 × 10^−5 ≤ p ≤ 0.0003) (Tables 1, 2). These six SNPs were in high LD with one another (D′ > 0.9). After further correction for the number of cohorts examined, these associations remained suggestively significant. In addition, as introduced above, a recent GWAS identified a common variant (rs1789891 between ADH1B and ADH1C) that was significantly associated with alcohol dependence in subjects of German descent (Frank et al. 2012). Interestingly, this SNP was suggestively associated with autism (p = 0.0015) in the present study, but not with alcohol dependence (p > 0.05).

Bioinformatic analysis showed that most of the significant risk SNPs (p < α; Table 2) were located at transcription factor binding sites (TFBS). Three SNPs, i.e., rs1442481 and rs1789912 at ADH1C and rs1229863 between ADH1B and ADH1C, were located at species-conserved elements. Three SNPs, i.e., rs71612682 between ADH1B and ADH1C and rs1789916 and rs1789912 at ADH1C, were located at methylated CpG islands. rs1789900 and rs1442480 at ADH1C were located at a 60-bp-long copy number variant (CNV: A_16_P16787293), and rs1789916 at ADH1C was located at another 60-bp-long CNV (A_16_P36841645). In addition, rs62323588 between ADH5 and ADH4 was located at a long RNA transcript (>200 bases).

Among a total of 916 common SNPs in African-Americans, 26 SNPs were nominally associated with alcohol dependence (p < 0.05), some of which were suggestively associated with alcohol dependence at a non-significant trend level. The most significant one was rs904092 at 5′ flanking region of ADH1A (p = 0.00053), and the second most significant one was rs2066702 (Arg370Cys; ADH2*3) at exon 9 of ADH1B (p = 0.0015; f = 0.142 in cases and 0.193 in controls). However, no SNPs survived either FDR or SNPSpD correction (Table 1). Similarly, although some SNPs were nominally associated with other neuropsychiatric and neurological disorders (p < 0.05), no SNPs survived either FDR or SNPSpD correction (Table 1).

Discussion

The principal finding of the current study was that common ADH variants were significantly associated with the risk for schizophrenia and autism, but not other neuropsychiatric disorders, including alcohol dependence. There is growing evidence that schizophrenia and autism share genetic risk variants including SNPs and CNVs (McCarthy et al. 2009; Sebat et al. 2009; Owen et al. 2011). The present study provided new evidence in support of this shared risk.

The location of the ADH variants within the ADH gene cluster may have functional significance. All of the 19 significant risk SNPs for schizophrenia and five of the six significant risk SNPs for autism were located within or flanking ADH1C (i.e., in 5′ flanking region of ADH1C or between ADH1C and ADH1B) (Table 2). These risk SNPs may have potential biological functions based on the bioinformatic analyses. It has been known that the lower functioning γγADH enzyme (mainly) (encoded by ADH1C) and ββADH enzyme (partially) (encoded by ADH1B) inhibit the turnover of 5-HIAL to 5-HTOL and increase 5-HIAA levels (Svensson et al. 1999). 5-HIAA is an important metabolite of serotonin. Alterations in 5-HIAA levels variably associated with schizophrenia (Wieselgren and Lindstrom 1998) and autism (Adamsen et al. 2011) have been interpreted as providing evidence of disturbances in serotonergic neurotransmission associated with these disorders (Cook and Leventhal 1996; Abi-Dargham et al. 1997; Chugani 2004). Thus, it is conceivable that ADH1B and ADH1C are involved in serotonergic dysfunction associated with these disorders. In addition, we noted that one significant risk SNP (rs62323588) for autism was located between ADH4 and ADH5 (Table 2). It has been known that the increased ππADH enzyme (encoded by ADH4) activity could lead to a very high turnover of norepinephrine aldehydes (Holmes 1994), and norepinephrine has been reported to be involved in the development of autism (Leboyer et al. 1992). These functional links may be supported, at least partially, by our current finding of the association between rs62323588 and autism.

It is also worth noting that the two top-ranked common ADH variants, i.e., rs904092 and rs2066702, that were suggestively associated with alcohol dependence in African-Americans at a trend level, are located in the 5′ flanking region of ADH1A and within ADH1B, respectively. The functional rs2066702 (ADH2*3) reduces the activity of ββADH enzyme in the oxidation of ethanol, and thus may affect risk for alcohol dependence (Thomasson et al. 1995). ADH1A encodes ααADH enzyme that has similar properties to ββADH and γγADH and contributes to the oxidation of ethanol. Thus, ADH1A is also a reasonable candidate gene for alcohol dependence (Zuo et al. 2009). In view of the apparent biological functions of these ADHs, the trend-level associations between these variants and alcohol dependence may reflect the smaller effects of common variants than rare variants. Future studies with larger samples are warranted to examine whether the associations between common ADH variants and alcohol dependence reach statistical significance.

In conclusion, human diseases may be caused by a constellation of rare variants (Dickson et al. 2010), common variants, or both. Our studies, including a previous work (Zuo et al. 2013b) and the present one, suggest that rare ADH variants are associated with alcohol dependence, whereas common ADH variants were only suggestively associated with alcohol dependence but significantly associated with schizophrenia and autism. These findings may support the hypothesis that rare and common ADH variants affect different ADH properties. The rare ADH variants (e.g., the four functional variants introduced above) may influence the ADH functions that are related to ethanol metabolism, and may thus be implicated in risk for alcoholism; the common ADH variants, in contrast, are more likely to affect the ADH activity that is related to the monoamine metabolic pathways, and may thus be implicated in risk for schizophrenia, autism, and possibly other neuropsychiatric disorders.