Introduction

DNA methylation usually occurs at CpG sites in somatic cells, and is non-randomly distributed across the genome. It is one of the major epigenetic mechanisms controlling gene expression. An inverse relationship is often observed between promoter DNA methylation levels and gene expression levels (Abdolmaleky et al. 2006; Baer et al. 2012; Egger et al. 2004; Etcheverry et al. 2010; Tekpli et al. 2013; Woloszynska-Read et al. 2007), although findings are inconsistent (Bell et al. 2011; Fraser et al. 2012; Lam et al. 2012). Altered DNA methylation in the promoter regions of genes may influence gene transcription and thus affect phenotypic variation or disease susceptibility.

To understand the mechanism by which DNA methylation regulates gene transcription, it is critical to understand the various biological influences on DNA methylation. Demographic factors, such as sex (Liu et al. 2010), age (Huang et al. 2013), and ethnicity (Nielsen et al. 2010; Zhang et al. 2011), covary with certain patterns of DNA methylation. Variation in DNA methylation has been linked to diverse environmental exposures, including pre- and post-natal nutrition (Lan et al. 2013; Soubry et al. 2013; Supic et al. 2013), early-life stress (Tyrka et al. 2012; Yang et al. 2013; Zhang et al. 2013b), psychosocial stress (de Rooij et al. 2012; Unternaehrer et al. 2012), and inflammation (Lam et al. 2012). Genetic variation can also impact DNA methylation patterns. DNA sequence variants that are associated with DNA methylation patterns are referred to as methylation quantitative trait loci (mQTLs). mQTLs are widely dispersed throughout the genome across different tissues (Bell et al. 2011; Gibbs et al. 2010; Kerkel et al. 2008; Schalkwyk et al. 2010; Zhang et al. 2010). The study by Quon et al. (2013) demonstrated that methylation patterns of 3–4 % of CpG sites in the genome were heritable. Given the close relationship between sequence variants and DNA methylation patterns, the heritability of DNA methylation is potentially mediated by genetic variation.

DNA methylation is important for normal development and is associated with key processes such as genomic imprinting and X-chromosome inactivation. Significant associations between aberrant DNA methylation and susceptibility to numerous diseases have been reported. For example, we compared the DNA methylation levels of 384 CpGs in the promoter regions of 82 candidate genes of subjects with alcohol dependence (AD) and healthy controls, and identified several promoter CpGs that were differentially methylated in AD subjects (Zhang et al. 2013a). Nevertheless, it is unknown whether DNA methylation alterations in AD subjects are due to DNA sequence variants (potentially enriched in the genome of AD subjects) or chronic alcohol consumption (an environmental factor). Alcohol consumption can alter DNA methylation status (Chen et al. 2013). Alternatively or additionally, considering the widespread presence of mQTLs in the genome, aberrant DNA methylation patterns may exist in individuals at risk for AD due to sequence variants (or mQTLs) that can moderate gene transcription, and thus confer vulnerability to AD.

The present study aimed to identify mQTLs that regulate the methylation patterns of CpGs in the promoter regions of 82 candidate genes potentially involved in AD-relevant biological pathways. Selecting genes on the basis of an a priori hypothesis (i.e., that these genes have a potential role in AD development) should be more appropriate than selecting genes at random from the genome, as suggested by previous studies (Zhao et al. 2013) and supported by the availability of a well-established alcohol-related gene database (Guo et al. 2009). In this study, we examined the association between genotypes of genome-wide single nucleotide polymorphisms (SNPs) and methylation levels (as quantitative traits) of 384 CpGs located in the promoter regions of 82 AD candidate genes (Supplementary Table S1), resulting in the identification of both cis- and trans-mQTLs for the 384 promoter CpGs.

Materials and methods

Human subjects

We included 411 subjects [268 African Americans (AAs) and 143 European Americans (EAs)] in this study. They were recruited for DNA methylation (Zhang et al. 2013a) and genome-wide association (Gelernter et al. 2014) studies of AD. The subjects were interviewed using an electronic version of the Semi-Structured Assessment for Drug Dependence and Alcoholism (SSADDA) (Pierucci-Lagha et al. 2005), and lifetime diagnoses for AD were made according to the criteria of the Diagnostic and Statistical Manual of Mental Disorders, 4th edition (American Psychiatric Association 1994). Control subjects were not affected with alcohol or drug abuse or dependence. Both AD and control subjects were free of major psychotic disorders such as schizophrenia and bipolar disorders. The ancestry proportions of the 411 subjects were estimated with the program STRUCTURE (Pritchard et al. 2000) using 36 short tandem repeat markers and five SNPs (Xie et al. 2009). Among 268 AAs, 129 (48.1 %) were AD cases and 139 (51.9 %) were healthy controls, 97 (36.2 %) were males, and the average AA ancestry proportion was 0.94 ± 0.08. Among 143 EAs, 129 (90.2 %) were AD cases and 14 (9.8 %) were healthy controls, 73 subjects (51.5 %) were males, and the average EA ancestry proportion was 0.97 ± 0.07. No age difference was observed between AAs (39 ± 11) and EAs (41 ± 13) (P > 0.05).

DNA methylation data

Methylation data for 384 CpGs in the promoter regions [from 2,000-bp upstream to 1,000-bp downstream of the transcription start sites (TSS)] of 82 candidate genes (Supplementary Table S1) for the 411 subjects were generated using Illumina’s GoldenGate DNA methylation assays (Zhang et al. 2013a). The 82 candidate genes are involved in several brain neurotransmission systems (dopaminergic, opioidergic, serotonergic, GABAergic/glutamatergic, cholinergic, and cannabinoidergic), alcohol metabolism, DNA methylation, or signal transduction related to alcohol reward and reinforcement (Hodgkinson et al. 2008; Zhang et al. 2013a). One to 12 CpGs in each of the 82 candidate genes were included in the custom methylation profiling panel. The background normalization algorithm minimized background variation within the array using built-in negative control probe signals. Raw DNA methylation data were further processed with the R package sva (Johnson et al. 2007) to remove batch effects. The processed DNA methylation data were used for the mQTL (or SNP genotype–CpG methylation association) analysis.

Genome-wide SNP genotype data

The 411 subjects were among 5,697 subjects (3,318 AAs and 2,379 EAs) included in our recent AD genome-wide association study (GWAS) (Gelernter et al. 2014). All subjects were genotyped on the Illumina HumanOmni1-Quad v1.0 microarray containing 988,306 autosomal SNPs. Genotypes of SNPs were called using the GenomeStudio software V2011.1 and genotyping module V1.8.4 (Illumina, San Diego, CA, USA). The extent of identity-by-descent (IBD) sharing among the 411 subjects was estimated by PLINK (Purcell et al. 2007), and all subjects passed the sample quality examination. SNPs were excluded if (1) the minor allele frequency (MAF) was ≤0.05; (2) the missing rate per SNP was >10 %; or (3) the P value of the Hardy–Weinberg equilibrium test was ≤0.001. After data cleaning and quality control, 784,514 SNPs for 268 AAs and 699,437 SNPs for 143 EAs remained for the mQTL analysis.
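The three SNP exclusion criteria can be sketched in a few lines of Python. This is only an illustration, not the pipeline used in the study (which relied on PLINK): the function names, the 0/1/2 genotype coding with `None` for missing calls, and the choice to take the Hardy–Weinberg P value as a precomputed input are all assumptions made for the example.

```python
from typing import Optional, Sequence

MAF_MIN = 0.05      # SNPs with MAF <= 0.05 are excluded
MISSING_MAX = 0.10  # SNPs with a missing rate > 10 % are excluded
HWE_P_MIN = 0.001   # SNPs with a Hardy-Weinberg test P <= 0.001 are excluded


def minor_allele_frequency(genotypes: Sequence[Optional[int]]) -> float:
    """MAF from additive codes (0, 1 or 2 copies of one allele; None = missing)."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    freq = sum(called) / (2 * len(called))
    return min(freq, 1.0 - freq)


def missing_rate(genotypes: Sequence[Optional[int]]) -> float:
    """Fraction of subjects without a genotype call for this SNP."""
    return sum(g is None for g in genotypes) / len(genotypes)


def passes_qc(genotypes: Sequence[Optional[int]], hwe_p: float) -> bool:
    """True if the SNP survives all three filters described in the text."""
    return (
        minor_allele_frequency(genotypes) > MAF_MIN
        and missing_rate(genotypes) <= MISSING_MAX
        and hwe_p > HWE_P_MIN
    )
```

A SNP is kept only when all three conditions hold, so a common, well-called SNP is still dropped if its Hardy–Weinberg P value is at or below 0.001.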

Statistical and bioinformatics analyses

All statistical analyses were implemented using the open-source program R 3.0.1 (http://www.r-project.org/). To identify mQTLs for the 384 CpGs in the promoter regions of 82 candidate genes, we analyzed the association of genotypes (in an additive model) of genome-wide SNPs and methylation levels of each of the 384 promoter CpGs using PLINK (Purcell et al. 2007) in 268 AAs and 143 EAs. Multiple linear regression analysis was used to estimate the effect of SNPs on the methylation of CpGs, with adjustment for sex, age, ancestry proportion, and AD status. Nominal P values (P nominal) were obtained for >260 million SNP–CpG pairs in each population [in 268 AAs: 784,514 (SNPs) × 384 (CpGs) pairs; in 143 EAs: 699,437 (SNPs) × 384 (CpGs) pairs]. To study the influence of AD on mQTL–CpG associations, multiple linear regression analysis was used to validate significant mQTL–CpG pairs (identified in AAs or EAs) in AD cases and healthy controls separately, with adjustment for sex, age, and ancestry proportion. All regression models are summarized in Supplementary Table S2. QVALUE (Storey 2002) was used to account for multiple testing by controlling the false discovery rate (FDR) at 0.05. mQTL (or SNP)–CpG pairs were considered to be significant if the q value was ≤ 0.05. Based on the distance between mQTLs and CpGs, mQTLs were further classified as cis-mQTLs (located within 1 Mb from a specific CpG site) or trans-mQTLs (located beyond 1 Mb from a specific CpG site or on different chromosomes). Regional association results from genome-wide association scans were plotted using LocusZoom (Pruim et al. 2010).
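The cis/trans rule and the size of the testing burden described above can be made concrete with a short Python sketch. The function names are hypothetical, and treating a distance of exactly 1 Mb as cis is an assumption, since the text says only "within 1 Mb".

```python
CIS_WINDOW_BP = 1_000_000  # 1 Mb window stated in the text


def classify_mqtl(snp_chrom: str, snp_pos: int, cpg_chrom: str, cpg_pos: int) -> str:
    """Label a SNP-CpG pair as 'cis' or 'trans'.

    Assumption: a pair on the same chromosome at exactly 1 Mb counts as cis.
    Pairs on different chromosomes are always trans.
    """
    if snp_chrom == cpg_chrom and abs(snp_pos - cpg_pos) <= CIS_WINDOW_BP:
        return "cis"
    return "trans"


def n_tests(n_snps: int, n_cpgs: int = 384) -> int:
    """Number of SNP-CpG regressions run in one population."""
    return n_snps * n_cpgs


aa_tests = n_tests(784_514)  # 301,253,376 pairs in AAs
ea_tests = n_tests(699_437)  # 268,583,808 pairs in EAs
```

Each population therefore involves more than 260 million regressions, which is why a correction for multiple testing is applied before any pair is called significant.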

To explore the function of mQTLs (or SNPs), the Web-based functional annotation tool SNPnexus (Dayem Ullah et al. 2012) was queried to determine whether the identified mQTLs were located in coding or non-coding regions, transcription factor-binding sites (TFBS), microRNA-binding sites, or vertebrate conserved regions, and whether they were previously known to be associated with phenotypes or diseases. We also examined whether the identified mQTLs were among AD-associated SNPs obtained in our recent GWAS (Gelernter et al. 2014). The identified mQTLs were further analyzed to see whether they were expression quantitative trait loci (eQTLs) by querying the gene expression datasets (generated from the lymphoblastoid cell lines derived from peripheral blood lymphocytes of HapMap YRI and CEU subjects) in the GTEx eQTL database (GTExConsortium 2013) using the web server SCAN (Nicolae et al. 2010). The influence of the distance between SNPs and CpGs, race, and AD on the strength of the mQTL (or SNP)-CpG association was also investigated.

Results

Significant mQTL–CpG pairs identified in African Americans (AAs)

In AAs, we identified 282 significant mQTL–CpG pairs with P values that survived multiple-testing correction when the false discovery rate (FDR) q value was controlled at the 0.05 level (9.9 × 10−100 ≤ P nominal ≤ 7.7 × 10−8 or 6.7 × 10−91 ≤ q ≤ 0.05) (Supplementary Table S3a). These 282 mQTL–CpG pairs (including 202 cis-mQTL–CpG and 80 trans-mQTL–CpG pairs) included 238 unique SNPs (mapped to 99 genes, including 20 of the 82 candidate genes) and 65 unique promoter CpGs (mapped to 44 of the 82 candidate genes). The top ten cis-mQTL–CpG and top ten trans-mQTL–CpG pairs identified in AAs are listed in Table 1. The most significant cis-mQTL was rs1800759, which was strongly associated with CpG cg12011299 (P nominal = 9.9 × 10−100, q = 6.7 × 10−91). Both rs1800759 and CpG cg12011299 are located in the promoter region of the alcohol dehydrogenase 4 gene (ADH4), 37 bp apart. A regional plot by LocusZoom displays the association of SNPs (400 kb upstream or downstream of rs1800759) and cg12011299 as well as the linkage disequilibrium (LD) between rs1800759 and the surrounding SNPs in AAs (Fig. 1a). Six additional cis-mQTLs (rs1154415, rs2851301, rs2602836, rs3828541, rs13148577, and rs4147547) located in the ADH5-ADH4-ADH6 gene cluster region (about 128 kb) were also strongly associated with cg12011299 (1.7 × 10−37 ≤ P nominal ≤ 8.0 × 10−23 or 6.7 × 10−30 ≤ q ≤ 2.2 × 10−15) (Table 1). The two top trans-mQTLs identified in AAs were rs3097793 and rs173567. Both are located in the intergenic region between CCDC69 (coding for coiled-coil domain containing 69) and GM2A (coding for GM2 ganglioside activator), and both were strongly associated with CpG cg14150516 in the promoter region of CREB1 (coding for cAMP-responsive element-binding protein 1) (rs3097793: P nominal = 3.1 × 10−43, q = 6.5 × 10−35; rs173567: P nominal = 3.8 × 10−32, q = 8.5 × 10−25) (Table 1).

Table 1 Top ten cis- and top ten trans-mQTLs identified in 268 African Americans (AAs)
Fig. 1

Regional plot of associations between SNPs around the top mQTL rs1800759 and CpG cg12011299. a Regional plot of associations between SNPs around the top mQTL rs1800759 and CpG cg12011299 in African Americans (AAs). b Regional plot of associations between SNPs around the top mQTL rs1800759 and CpG cg12011299 in European Americans (EAs). The graph was made by LocusZoom using the 1000 Genomes data (hg19). The left y-axis indicates −log10(P values) of mQTL–CpG associations, the right y-axis indicates recombination rates, and the x-axis shows the physical coordinates of genes and SNPs on chromosome 4. The color of the dots indicates the strength of correlation between rs1800759 and nearby SNPs. Gene annotation is displayed at the bottom of the figure, and the light blue shadow refers to the location of cg12011299 (color figure online)

Significant mQTL–CpG pairs identified in European Americans (EAs)

In EAs, we identified 313 significant mQTL–CpG pairs with P values that survived correction for multiple testing (2.7 × 10−53 ≤ P nominal ≤ 9.9 × 10−8 or 1.4 × 10−44 ≤ q ≤ 0.05) (Supplementary Table S3b). These 313 mQTL–CpG pairs (including 223 cis-mQTL–CpG and 90 trans-mQTL–CpG pairs) included 305 unique SNPs (mapped to 73 genes, including 10 of the 82 candidate genes) and 44 unique promoter CpGs (mapped to 30 of the 82 candidate genes). The top ten cis-mQTL–CpG and top ten trans-mQTL–CpG pairs identified in EAs are listed in Table 2. As in AAs, the most significant cis-mQTL identified in EAs was rs1800759, which was also strongly associated with CpG cg12011299 (P nominal = 2.7 × 10−53; q = 1.4 × 10−44). A regional plot by LocusZoom shows the association of SNPs (400 kb upstream or downstream of rs1800759) and cg12011299 as well as the LD between rs1800759 and the surrounding SNPs in EAs (Fig. 1b). Eight additional cis-mQTLs (rs2602836, rs1126670, rs1126671, rs6837685, rs1126673, rs4699710, rs6532798, and rs10017466) located in ADH4 or ADH5 were also strongly associated with cg12011299 (1.3 × 10−39 ≤ P nominal ≤ 3.0 × 10−26 or 3.4 × 10−31 ≤ q ≤ 4.8 × 10−19) (Table 2). The two top trans-mQTLs identified in EAs were the same as those identified in AAs, i.e., rs3097793 and rs173567 (two intergenic SNPs between CCDC69 and GM2A). Both of these were strongly associated with CpG cg14150516 in the promoter region of CREB1 (rs3097793: P nominal = 1.2 × 10−33, q = 2.0 × 10−25; rs173567: P nominal = 4.5 × 10−33, q = 5.8 × 10−25) (Table 2).

Table 2 Top ten cis- and top ten trans-mQTLs identified in 143 European Americans (EAs)

Influence of race on mQTL–CpG association

We searched for significant mQTL–CpG pairs (q ≤ 0.05) that were common to both AAs and EAs. As shown in Fig. 2a, 92 significant mQTL–CpG pairs (i.e., 32.6 % of the 282 significant mQTL–CpG pairs identified in AAs or 29.4 % of the 313 significant mQTL–CpG pairs identified in EAs) were shared between AAs and EAs. These 92 common mQTL–CpG pairs (including 57 cis-mQTL–CpG and 35 trans-mQTL–CpG pairs) were comprised of 89 unique SNPs (i.e., 37.4 % of the 238 unique SNPs involved in the 282 mQTL–CpG pairs in AAs or 29.2 % of the 305 unique SNPs involved in the 313 mQTL–CpG pairs in EAs) and 16 unique CpGs. However, 190 (or 67.4 %) of the 282 significant mQTL–CpG pairs (including 145 cis-mQTL–CpG and 45 trans-mQTL–CpG pairs) identified in AAs were specific for AAs, and 221 (or 70.6 %) of the 313 significant mQTL–CpG pairs (including 166 cis-mQTL–CpG and 55 trans-mQTL–CpG pairs) identified in EAs were specific for EAs. The most significant cis-mQTL–CpG pair, consisting of rs1800759 and CpG cg12011299 (both located in the promoter region of ADH4), was identified in both populations.
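The sharing analysis above amounts to a set comparison of significant SNP–CpG pairs. A minimal Python sketch follows; the function name and the pair identifiers in the usage example are hypothetical.

```python
def overlap_summary(pairs_a: set, pairs_b: set) -> dict:
    """Count shared and population-specific (SNP, CpG) pairs between two sets."""
    shared = pairs_a & pairs_b
    return {
        "shared": len(shared),
        "only_a": len(pairs_a - pairs_b),
        "only_b": len(pairs_b - pairs_a),
        "pct_of_a": round(100 * len(shared) / len(pairs_a), 1),
        "pct_of_b": round(100 * len(shared) / len(pairs_b), 1),
    }


# Toy example with made-up identifiers
aa_pairs = {("rsA", "cg1"), ("rsB", "cg1"), ("rsC", "cg2")}
ea_pairs = {("rsA", "cg1"), ("rsD", "cg3")}
summary = overlap_summary(aa_pairs, ea_pairs)
```

Applied to the reported counts, 92 shared pairs correspond to 32.6 % of the 282 AA pairs and 29.4 % of the 313 EA pairs, leaving 190 AA-specific and 221 EA-specific pairs.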

Fig. 2

The effect of race and AD on mQTL–CpG association. a The effect of race on mQTL–CpG association in African Americans (AAs) and European Americans (EAs). The x-axis represents the P values [converted to sign(linear regression beta value) × (−log10 P value)] of 282 (92 + 190) significant mQTL–CpG pairs in AAs (q ≤ 0.05), and the y-axis represents the P values [converted to sign(linear regression beta value) × (−log10 P value)] of 313 (92 + 221) significant mQTL–CpG pairs in EAs (q ≤ 0.05). The dashed line indicates the threshold of significance (q ≤ 0.05) (vertical, for AAs; horizontal, for EAs). The color of the dots indicates the type of mQTLs (red, cis-mQTLs; black, trans-mQTLs). b The effect of alcohol dependence (AD) on mQTL–CpG association in African Americans (AAs). The x-axis represents the P values [converted to sign(linear regression beta value) × (−log10 P value)] of 225 (188 + 37) significant mQTL–CpG pairs identified in AD subjects (q ≤ 0.05), and the y-axis represents the P values [converted to sign(linear regression beta value) × (−log10 P value)] of 245 (188 + 57) significant mQTL–CpG pairs identified in control subjects (q ≤ 0.05). The dashed line indicates the threshold of significance (0.05/282 = 1.8 × 10−4) (vertical, for AD subjects; horizontal, for control subjects) (color figure online)
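The axis transformation used in Fig. 2 combines the direction of the effect with the strength of the evidence. A small Python sketch of that conversion follows; the function name is hypothetical.

```python
import math


def signed_neg_log10_p(beta: float, p_value: float) -> float:
    """Return sign(beta) x (-log10 P), the quantity plotted on both axes of Fig. 2."""
    sign = (beta > 0) - (beta < 0)  # +1, 0 or -1
    return sign * -math.log10(p_value)
```

With this scale, a pair whose allele effect has the same direction in both groups falls in the upper-right or lower-left quadrant, whereas opposite directions place it in one of the other two quadrants.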

Influence of AD on mQTL–CpG association

Because chronic drinking or other features associated with AD may influence CpG methylation status and thus alter the association of mQTLs (or SNPs) and CpGs, we further examined the validity of the 282 significant mQTL–CpG pairs (identified in all AA subjects) in 129 AA cases and 139 AA controls separately. After correcting for multiple testing (i.e., the P value for significance was set at 0.05/282 = 1.8 × 10−4), 188 (or 66.7 %) of the 282 mQTL–CpG pairs were shared between AD and control subjects (Fig. 2b). These 188 common mQTL–CpG pairs (including 144 cis-mQTL–CpG and 44 trans-mQTL–CpG pairs) included 154 unique SNPs (64.7 % of the 238 unique SNPs involved in the 282 mQTL–CpG pairs in AAs) and 36 unique CpGs. However, 37 (or 13.1 %) of the 282 mQTL–CpG pairs (including 22 cis-mQTL–CpG and 15 trans-mQTL–CpG pairs) were observed only in AD subjects, and 57 (or 20.2 %) of the 282 mQTL–CpG pairs (including 36 cis-mQTL–CpG and 21 trans-mQTL–CpG pairs) only in control subjects. The influence of AD on the mQTL–CpG association was not analyzed in EAs, because the size of the EA control sample was small (n = 14).
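The case/control comparison can be sketched as a Bonferroni threshold plus a four-way label for each pair. The function name and the use of "less than or equal to" at the threshold are assumptions; the text states only the threshold value.

```python
N_PAIRS = 282
ALPHA = 0.05
THRESHOLD = ALPHA / N_PAIRS  # about 1.8e-4, as reported in the text


def replication_group(p_ad: float, p_control: float, threshold: float = THRESHOLD) -> str:
    """Label a pair by where it passes the Bonferroni-corrected threshold."""
    sig_ad = p_ad <= threshold
    sig_control = p_control <= threshold
    if sig_ad and sig_control:
        return "shared"
    if sig_ad:
        return "AD only"
    if sig_control:
        return "control only"
    return "neither"
```

The reported counts (188 shared, 37 AD only, 57 control only) sum to 282, so every pair identified in the full AA sample passed the threshold in at least one of the two subgroups.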

Association of identified mQTLs and AD

We next evaluated whether the mQTLs identified as associated with CpGs overlapped with the SNPs associated with AD in our recent GWAS (Gelernter et al. 2014). Of the 238 mQTLs identified in AAs, 21 (or 8.8 %) were associated with AD in AAs at the level of overall meta P ≤ 0.05. Permutation tests showed that AD-associated SNPs identified in our recent GWAS (Gelernter et al. 2014) were overrepresented among the mQTLs (P permutation = 0.041, 1 million permutations). The P values of three mQTLs identified in AAs reached genome-wide significance [rs2173201 (an intergenic SNP between ADH1B and ADH1C): overall meta P = 2.4 × 10−9; rs4147542 (in ADH1C intron 2): overall meta P = 9.1 × 10−9; and rs4147541 (in the ADH1C promoter region): overall meta P = 1.1 × 10−8] (Supplementary Table S4a). Of the 305 mQTLs identified in EAs, 58 (19.0 %) were associated with AD at the level of overall meta P ≤ 0.05. Permutation tests showed that AD-associated SNPs identified in our recent GWAS (Gelernter et al. 2014) were likewise overrepresented among these mQTLs (P permutation < 1.0 × 10−6, 1 million permutations); however, none of the 58 mQTLs was associated with AD at the genome-wide level. The most significant mQTL associated with AD in EAs was rs1800759 (P meta = 0.008), which is located in the promoter region of ADH4 (Supplementary Table S4b).
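The text does not describe how the permutation test was constructed, so the following Python sketch shows only one plausible design, stated as an assumption: random SNP sets of the same size as the mQTL set are drawn from all tested SNPs, and the empirical P value is the fraction of draws containing at least as many AD-associated SNPs as observed. The function name, the default of 10,000 permutations (the study used 1 million), and the add-one correction are all choices made for this example.

```python
import random
from typing import Dict, Sequence, Tuple


def permutation_enrichment_p(
    mqtl_ids: Sequence[str],
    gwas_p: Dict[str, float],
    alpha: float = 0.05,
    n_perm: int = 10_000,
    seed: int = 0,
) -> Tuple[int, float]:
    """Empirical P for enrichment of GWAS-associated SNPs among mQTLs.

    Assumed null: same-sized SNP sets drawn at random, without replacement,
    from every SNP that has a GWAS P value.
    """
    rng = random.Random(seed)
    all_ids = sorted(gwas_p)  # fixed order so a given seed is reproducible
    observed = sum(gwas_p[s] <= alpha for s in mqtl_ids)
    k = len(mqtl_ids)
    hits = 0
    for _ in range(n_perm):
        draw = rng.sample(all_ids, k)
        if sum(gwas_p[s] <= alpha for s in draw) >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from exactly zero
    return observed, (hits + 1) / (n_perm + 1)
```

A more careful null would match the random SNPs to the mQTLs on properties such as allele frequency and local LD; this sketch omits that step for brevity.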

Prediction of the function of identified mQTLs

First, the correlation of cis-mQTL–CpG physical distance and cis-mQTL–CpG association strength was examined. As shown in Fig. 3, the distance between cis-mQTLs and CpGs was highly correlated with the magnitude of association between cis-mQTLs and CpGs, i.e., the shorter the distance from cis-mQTLs to CpGs, the stronger the association between cis-mQTLs and CpGs. This result was obtained in both AAs (Fig. 3a) and EAs (Fig. 3b). The distance between rs1800759 and CpG cg12011299 (both in the promoter region of ADH4), which formed the most significant SNP–CpG pair in both AAs and EAs, was only 37 bp. Second, the function of the identified mQTLs was predicted using the online server SNPnexus (Dayem Ullah et al. 2012). The numbers of cis- and trans-mQTLs that were predicted to impact protein function, regulate gene transcription, or influence any disease phenotypes (including alcohol dependence or abuse) are summarized in Table 3. Third, the functional role of the identified mQTLs in regulating gene expression was predicted by querying the SCAN eQTL database. As shown in Table 3, 76 (46.9 %) of the 162 unique cis-mQTLs and 39 (51.3 %) of the 76 unique trans-mQTLs identified in AAs, as well as 151 (70.2 %) of the 215 unique cis-mQTLs and 49 (54.4 %) of the 90 unique trans-mQTLs identified in EAs, were predicted to be expression quantitative trait loci (eQTLs).
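The text does not name the correlation statistic used for the distance analysis. As one reasonable option, a rank-based (Spearman) correlation between SNP–CpG distance and −log10(P) can be computed as sketched below; the choice of Spearman and all function names are assumptions.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

A negative coefficient corresponds to the pattern described in the text: the shorter the distance between a cis-mQTL and its CpG, the stronger the association.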

Fig. 3

The correlation of SNP–CpG physical distance (bp) and SNP–CpG association strength in (a) African Americans (AA) and (b) European Americans (EA). X-axis, the distance (bp) between SNPs and CpGs; y-axis, the strength of SNP–CpG association [P values were converted to −log10(P value)]. SNP–CpG pairs (represented as black dots) with P nominal ≤ 0.001 are shown in the figure. The most significant mQTL (or SNP)–CpG association was derived from rs1800759 and cg12011299 (both located in the promoter region of ADH4, with a distance between them of only 37 bp), and this association was observed in both AAs and EAs

Table 3 Prediction of the function of mQTLs by querying the SNPnexus and SCAN databases

Discussion

The present study demonstrated that mQTL–CpG distance, race, and AD status could influence the association of mQTLs and CpGs. We also found that 48.3 % of the mQTLs identified in AAs and 65.6 % of the mQTLs identified in EAs were predicted to be eQTLs. These results help in interpreting our AD GWAS findings, providing a mechanism for the AD risk contribution of certain non-coding variants, i.e., their effect on disease susceptibility may be through altering promoter DNA methylation levels.
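The two overall eQTL percentages quoted above can be reproduced from the cis and trans counts reported in the Results and the numbers of unique mQTLs in each population (238 in AAs, 305 in EAs). A short Python check follows; the helper name is hypothetical.

```python
def pct(part: int, whole: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


aa_eqtl = 76 + 39   # cis + trans mQTLs predicted to be eQTLs in AAs
ea_eqtl = 151 + 49  # cis + trans mQTLs predicted to be eQTLs in EAs

aa_pct = pct(aa_eqtl, 238)  # 48.3
ea_pct = pct(ea_eqtl, 305)  # 65.6
```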

Our findings suggest that sequence variants (particularly non-coding variants) may influence AD susceptibility via DNA methylation. As summarized in Table 3, only a small number of the identified mQTLs lead to amino acid changes or potentially participate in gene expression regulation by altering transcription factor-binding sites or conserved genomic sequences. A majority of the identified mQTLs may affect AD risk by influencing DNA methylation patterns. Of the 12 coding mQTLs or non-synonymous SNPs (five identified in AAs and seven identified in EAs), only rs1126671 (located in ADH4 exon 7 and resulting in the substitution of valine to isoleucine) identified in EAs was predicted to damage protein function. This SNP marker was also strongly associated with CpG cg12011299, which was located in the promoter region of ADH4 (Table 2). Given the significant association of rs1126671 and AD (Edenberg et al. 2006; Luo et al. 2005), this variant may have more than one functional mechanism, i.e., altering ADH4 protein function and regulating ADH4 expression by impacting promoter CpG methylation levels. About half of the mQTLs identified in AAs and more than half of the mQTLs identified in EAs were predicted to be eQTLs (Table 3). This implies that some mQTLs may mediate the interaction of SNP genotype, DNA methylation, and gene expression. Additionally, by querying the Genetic Association Database (GAD) through the Web server SNPnexus, we found that 15 (9.2 %) of the 162 cis-mQTLs identified in AAs and 50 (23.2 %) of the 215 cis-mQTLs identified in EAs were associated with alcohol dependence or abuse. Three cis-mQTLs (rs2173201, rs4147542 and rs4147541 located in the ADH1B-ADH1C gene region) identified in AAs were all significantly associated with CpG cg25997474 in the promoter region of ADH1C (Supplementary Table S3a), and they were among the SNPs showing genome-wide association with AD in AAs in our recent GWAS (Gelernter et al. 2014). 
However, none of the 76 trans-mQTLs identified in AAs or the 90 trans-mQTLs identified in EAs was reported to be associated with alcohol dependence or abuse in published studies. Therefore, cis-mQTLs that influence the methylation levels of proximal CpGs in the promoter regions of genes appear more likely than trans-mQTLs to increase the risk of diseases such as AD. In addition, we noted that three of the top ten cis-mQTLs in AAs (Table 1) and four of the top ten cis-mQTLs in EAs (Table 2) were intronic SNPs. The significant correlation of these intronic SNPs and promoter CpGs may be due to the tight LD between these intronic SNPs and promoter SNPs that are proximal to promoter CpG sites. As shown in Fig. 1, SNP markers in the ADH4-ADH5 gene cluster formed a tight LD block.

The distance between cis-mQTLs and CpGs was inversely correlated with the strength of association between cis-mQTLs and CpGs: a shorter distance between cis-mQTLs and CpGs predicted a stronger effect of genetic variation on CpG methylation (Fig. 3). A similar observation was reported in a recent study that performed DNA methylome analysis using Illumina’s 450 K Methylation array (Moen et al. 2013). Both studies showed that a majority of the significantly associated SNP–CpG pairs consisted of SNPs and CpGs less than 500 kb apart. However, unlike the previous study, the present study focused mainly on the effect of genetic polymorphisms on methylation patterns of promoter CpGs in candidate risk genes for AD. As shown in Fig. 3, the most significant mQTL (or SNP)–CpG association was derived from rs1800759 and cg12011299 (both located in the promoter region of ADH4, with a distance between them of only 37 bp), and this association was observed in both AAs and EAs. The present mQTL study provided further evidence that rs1800759 may strongly influence the methylation status of the nearby promoter CpG cg12011299, thus leading to altered transcription of ADH4, a gene encoding an enzyme involved in alcohol metabolism.

Additionally, the present study demonstrated that the strength of mQTL–CpG association could be affected by non-genetic factors such as race and AD status: about 70 % of the identified mQTL–CpG pairs were either AA or EA specific (Fig. 2a). Previous studies also showed that mQTLs were population specific (Fraser et al. 2012; Lam et al. 2012; Moen et al. 2013). Our study provides additional evidence that population-specific mQTLs may be due to the combined effect of population differences in DNA methylation and allele frequencies. Another possible source of the observed population-specific mQTLs is the difference in sex composition between the two populations (i.e., 36.2 % of AAs vs. 51.5 % of EAs were males), although sex was adjusted for in the multiple linear regression model. Furthermore, we analyzed the effect of AD on the association between mQTLs and CpGs in AAs. Two-thirds of the mQTL–CpG pairs identified in AAs were common to both AD case and control subjects. In other words, a majority of the associations between mQTLs and CpGs were not affected by AD. This may be because AD exerts only a minor or moderate effect on DNA methylation, as demonstrated in published studies (Philibert et al. 2012; Zhang et al. 2013a).

There are several limitations to this study. First, we examined the influence of sequence variants (i.e., SNPs) on methylation patterns of only a small number of promoter CpGs in 82 candidate risk genes for AD. For a fuller understanding of the relations among SNP genotypes, DNA methylation, and gene expression, future studies should be performed on a genome-wide scale. Second, the present study analyzed the impact of only two non-genetic factors (i.e., race and AD) on the association of SNPs (or mQTLs) and CpGs. However, numerous studies have demonstrated that environmental factors such as early-life stress (Murgatroyd et al. 2009; Zhang et al. 2013b) and demographic factors such as age (Boks et al. 2009; Numata et al. 2012) and sex (Boks et al. 2009; Xu et al. 2014) can alter DNA methylation levels. Non-genetic factors such as early-life stress (not included in the present study) should be considered when performing mQTL studies. Third, the present study did not account for the linkage disequilibrium (LD) among SNPs in the mQTL–CpG association analysis, so multiple tightly linked SNP markers showed evidence of association with the same promoter CpG. Experimental studies are needed to identify the functional SNP that influences the methylation status of a specific CpG site. Lastly, the present study examined mQTLs only in peripheral blood. Because DNA methylation can be cell-, tissue-, organ-, individual- or population-specific, and there is a dynamic regulation of DNA methylation during development, caution is warranted when interpreting the mQTL–CpG association analysis results.

In conclusion, we found a significant association between certain DNA sequence variants (or mQTLs) and methylation patterns of promoter CpGs in a number of AD candidate genes. The strength of mQTL–CpG association was influenced by factors such as mQTL–CpG distance, race, and AD. In addition, a large proportion of the identified mQTLs were predicted to be eQTLs. These findings suggest that some mQTLs may regulate gene expression, and thus underlie the association of genetic variation with AD.