Introduction

Acne is a common condition in young people and affects all of them to some degree. It occurs earlier and is more severe in those with a family history (Ballanger et al. 2006; Ghodsi et al. 2009). Since 1984, several twin studies have assessed the heritability of acne and indicated a substantial genetic influence on familial clustering (Bataille et al. 2002; Evans et al. 2005; Friedman 1984; Niermann 1958; Walton et al. 1988). However, the individual genes responsible for this high heritability remain unclear. Although candidate gene-based studies have identified a few genetic variants associated with acne in tumor necrosis factor alpha (TNF-α) (Al-Shobaili et al. 2012; Baz et al. 2008; Szabo et al. 2011), tumor necrosis factor receptor 2 (TNFR2) (Tian et al. 2010), interleukin-1A (IL1A) (Szabo et al. 2010), cytochrome P450, family 17 (CYP17) (He et al. 2006), toll-like receptor 2 (TLR2) (Tian et al. 2010) and toll-like receptor 4 (TLR4) (Grech et al. 2012), no genome-wide association studies (GWASs) have been reported on this common skin condition. To identify novel genetic variants for acne, we conducted a genome-wide association study on severe teenage acne within a US cohort of young females, the Nurses’ Health Study II (NHSII).

Results

We conducted a genome-wide association study among 928 European Americans in the discovery stage (81 with a history of severe teenage acne and 847 without such a history). We imputed 2,542,603 autosomal single nucleotide polymorphisms (SNPs) based on the HapMap database phase II data build 35 (CEU). After quality control filtering (see Materials and methods), 2,165,857 SNPs were selected for further association analysis with acne. The quantile–quantile (Q–Q) plots showed no systematic deviation from the expected distribution, with an overall genomic control inflation factor (λGC) of 1.02 (Supplementary Figure 1). The Manhattan plot of each GWAS is presented in Supplementary Figure 2.
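For readers unfamiliar with λGC, the following is a minimal sketch of one common way to compute it: the median observed 1-df chi-square statistic divided by the median expected under the null. It assumes two-sided P values as input; the function name and the evenly spaced example P values are illustrative and are not taken from the study's analysis pipeline.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom,
# obtained as the squared 75th percentile of the standard normal.
CHI2_1DF_MEDIAN = NormalDist().inv_cdf(0.75) ** 2


def genomic_inflation(p_values):
    """Return lambda_GC for two-sided P values, each in (0, 1]."""
    std_normal = NormalDist()
    # A two-sided P value corresponds to a 1-df chi-square statistic z**2,
    # where z is the standard normal quantile at p / 2.
    chi2_stats = [std_normal.inv_cdf(p / 2.0) ** 2 for p in p_values]
    return median(chi2_stats) / CHI2_1DF_MEDIAN


# Hypothetical input: evenly spread P values, as expected under the null.
n = 1000
uniform_p = [(i + 0.5) / n for i in range(n)]
print(round(genomic_inflation(uniform_p), 2))  # close to 1 under the null
```

Values near 1, such as the 1.02 reported here, indicate little inflation of test statistics from population stratification or other systematic bias.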

Across the whole genome, the chromosome 8q24 locus showed the most significant association with severe teenage acne. In the discovery set, we identified the SNP rs4133274 on chromosome 8q24 as the SNP most significantly associated with severe teenage acne (G allele; odds ratio, OR 4.01; 95 % confidence interval, CI 2.37–6.82; P = 1.7 × 10⁻⁶, Table 1). Another SNP in the same region, rs13248513, which was in linkage disequilibrium (LD) with rs4133274 (LD r² = 0.85 in our study population), ranked second (C allele, OR 3.82, 95 % CI 2.29–6.36, P = 2.2 × 10⁻⁶, Table 1). Breast cancer cases included in the discovery set might bias the results, because our previous findings suggest that severe teenage acne was associated with an increased risk of breast cancer. Hence, we further conducted a sensitivity analysis in which all breast cancer cases were excluded. The estimates for the two SNPs remained similar (OR 3.86 vs. 4.01 for rs4133274; and OR 3.86 vs. 3.82 for rs13248513). Interestingly, neither of the two SNPs was in LD with the 14 SNPs in this region previously associated with cancer risk (Grisanzio and Freedman 2010) (data not shown), and none of the cancer risk-associated SNPs was related to severe teenage acne in the present GWAS after multiple testing correction (0.05/14 = 0.004, data not shown).

Table 1 Associations between the two SNPs in chromosome 8q24 and severe teenage acne

We selected the top one or two loci in each chromosome region showing association with P value < 2.5 × 10⁻⁵ for a fast-track replication in an additional 1,392 European Americans from the same cohort population (134 with a history of severe teenage acne and 1,258 without such a history). A total of 30 SNPs were selected for replication, but none of their associations with severe teenage acne was replicated in the replication set (Supplementary Table 1). In addition, none of the SNPs associated with acne in previous candidate gene-based studies (rs1799724 and rs1800629 in TNFα; rs1061622 in TNFR2; rs17561 in IL1A; rs743572 in CYP17; rs5743708 in TLR2; rs4986790 and rs4986791 in TLR4) (Al-Shobaili et al. 2012; Baz et al. 2008; Grech et al. 2012; He et al. 2006; Szabo et al. 2011; Tian et al. 2010) showed significant association in the GWAS of severe teenage acne (P value > 0.05; rs1799724 in TNFα and rs5743708 in TLR2 were not in our dataset).

Discussion

Our results suggested that the chromosome 8q24 locus may be associated with severe teenage acne in a GWAS among a cohort population of European Americans. Chromosome 8q24 is a large gene desert region of particular interest. This region harbors a set of risk loci associated with multiple types of cancer in previous GWASs, including prostate, breast, colon, ovarian and bladder cancers, chronic lymphocytic leukemia and glioma (Crowther-Swanepoel et al. 2010; Easton et al. 2007; Eeles et al. 2008, 2009; Ghoussaini et al. 2008; Gudmundsson et al. 2007, 2009; Shete et al. 2009; Thomas et al. 2008; Tomlinson et al. 2008; Turnbull et al. 2010; Yeager et al. 2007; Zanke et al. 2007).

Notably, we previously found that women with a history of severe teenage acne had a 17 % increased risk of breast cancer in the same cohort (data unpublished). We have also previously reported a 70 % increased risk of prostate cancer among men with a history of severe acne in the Health Professionals Follow-up Study, a men’s cohort in the US (Sutcliffe et al. 2007). The closest gene to the chromosome 8q24 region is MYC, a known proto-oncogene and a particularly compelling candidate in this region. Of interest, Myc has also been reported to regulate the androgenic effect. A Myc consensus site was identified in the androgen receptor (AR) gene to up-regulate AR (Grad et al. 1999). Previous studies also reported that Myc enhanced AR expression in androgen-independent prostate cancers and plays a key role in the control of hormone responsiveness and cell proliferation in epithelial prostatic cells (Lee et al. 2009; Silva et al. 2001). In addition, among the widespread microRNAs repressed by Myc, miR-let-7 was recently found to play an important role in the regulation of androgen signaling by down-regulating AR expression (Nadiminty et al. 2012). It is known that higher levels of circulating androgens can lead to the hyperplasia of the sebaceous glands and the seborrhea characterized by acne (Lucky 1995), and a growing body of evidence has suggested that high levels of circulating androgens are associated with increased risk of breast cancer (Eliassen and Hankinson 2008), including in the NHSII (Eliassen et al. 2006). Furthermore, prostate cancer, a well-recognized androgen-related cancer, has been positively associated with a history of severe acne in epidemiologic studies (Sutcliffe et al. 2007; Williams et al. 2012). Thus, the regulation of androgen by Myc may be a common mechanism underlying the association between acne and certain cancers. However, despite the evidence that the 8q24 region harbors regulatory elements that regulate the expression of MYC, most studies did not show a consistent association between risk allele status and MYC expression level (Grisanzio and Freedman 2010), and it cannot be ruled out that the risk variants in 8q24 may influence genes other than MYC.

One limitation of this study is the modest sample size, which may limit our statistical power to identify loci at genome-wide significance levels. In addition, we used self-reported information on acne. However, the high education level and interest in health of cohort members allow high quality and valid information to be collected on self-administered forms. Moreover, a previous study demonstrated that people reporting acne of some severity were likely to have seen a physician (Cheng et al. 2010). In summary, our study suggested an association between the chromosome 8q24 locus and severe teenage acne in European Americans. Upon further replication, our findings may shed new light on the genetic basis of acne and suggest a potential linkage between acne and cancer.

Materials and methods

Study population

We included 261 women with a history of severe teenage acne and 2,578 women without such a history in the NHSII cohort in the present GWAS. The NHSII is a prospective cohort study established in 1989, when 116,430 female registered nurses aged 25–42 and residing in the United States at the time of enrollment responded to an initial questionnaire on their medical histories and baseline health-related exposures. Details of this cohort have been described previously (Bertone-Johnson et al. 2009). The protocol for this study was approved by the Institutional Review Board at Brigham and Women’s Hospital and the Harvard School of Public Health.

Participants reported their history of severe teenage acne on the baseline questionnaire in 1989. In the discovery stage, we used pooled samples from two existing GWASs nested within the NHSII: one was a part of the Cancer Genetic Markers of Susceptibility project (CGEMS, n = 290; 34 with a history of severe teenage acne and 256 without such a history) and the other was a kidney stone case–control study (n = 638; 47 with a history of severe teenage acne and 591 without). All the NHSII individuals in the CGEMS set were breast cancer cases. In the replication stage, we recruited an additional 1,392 women in a renal function study nested in the same cohort for a fast-track replication (134 with a history of severe teenage acne and 1,258 without such a history). Detailed descriptions of each component data set are provided in the Supplementary Material. No samples overlapped among these datasets. All subjects included in this analysis were US non-Hispanic Europeans.

Genotyping, imputation and quality control

We used the Illumina 610K SNP chip for genotyping individuals in the discovery set. We imputed 2,542,603 autosomal SNPs based on haplotypes from the HapMap (http://www.hapmap.org) phase II data build 35 (CEU) (Marchini et al. 2005) using the MACH program (Li et al. 2009). Samples from the two component study sets were imputed separately. SNPs with imputation r² > 0.4 and minor allele frequency (MAF) > 0.05 in both studies were included. Finally, 2,165,857 SNPs were included in the association analysis. The top SNPs selected from the discovery stage were genotyped in the replication set using TaqMan/BioTrove assays at the Dana Farber/Harvard Cancer Center Polymorphism Detection Core. Laboratory personnel were blinded to the case–control status, and blinded quality control samples were inserted to validate genotyping procedures; concordance for the blinded samples was 100 %. Primers, probes and conditions for genotyping assays are available upon request. All of these SNPs had a P value > 0.002 (0.05/30) for the Hardy–Weinberg equilibrium test. We excluded two SNPs, rs7270170 and rs6996538, with low call rates.
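The two filters described above can be sketched as follows. This is an illustration under stated assumptions, not the study's actual code: the function names and the example values are hypothetical, and the Hardy–Weinberg check uses a simple 1-df chi-square test because the text does not specify which test (chi-square or exact) was applied.

```python
import math


def passes_imputation_qc(snp_stats, min_rsq=0.4, min_maf=0.05):
    """Keep a SNP only if it passes both thresholds in every study.

    snp_stats holds one (imputation_rsq, maf) pair per component study.
    """
    return all(rsq > min_rsq and maf > min_maf for rsq, maf in snp_stats)


def hwe_chi2_p(n_aa, n_ab, n_bb):
    """P value of a 1-df chi-square test for Hardy-Weinberg equilibrium.

    Assumption: a chi-square test stands in for whichever test was used.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        return 1.0  # monomorphic marker: no deviation can be tested
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a 1-df chi-square at chi2.
    return math.erfc(math.sqrt(chi2 / 2.0))


# Bonferroni-corrected threshold for the 30 replication SNPs.
HWE_THRESHOLD = 0.05 / 30

# Hypothetical values for one SNP in the two discovery data sets.
print(passes_imputation_qc([(0.92, 0.21), (0.88, 0.20)]))
# Hypothetical genotype counts in perfect Hardy-Weinberg proportions.
print(hwe_chi2_p(25, 50, 25) > HWE_THRESHOLD)
```

Requiring both thresholds in every component study mirrors the "in both studies" condition in the text; a SNP that imputes poorly in either data set is dropped.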

Statistical analysis

Logistic regression was applied to evaluate the association between the minor allele counts and severe teenage acne. In the discovery stage, we used the imputed dosage data and adjusted for the first four principal components as well as the data set in the model. These principal components were calculated for all individuals on the basis of ca. 10,000 unlinked markers using the EIGENSTRAT software (Price et al. 2006). We used ProbABEL for the GWAS analysis and used Statistical Analysis System software (version 9.1.3; SAS Institute, Cary, NC, USA) for the replication study. Associations in the discovery set and the replication set were combined in an inverse variance-weighted meta-analysis using the METAL software (http://www.sph.umich.edu/csg/abecasis/Metal/index.html).
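The inverse variance-weighted combination of the two stages can be illustrated with a minimal fixed-effect sketch. It is not the METAL implementation; the function name is hypothetical, the inputs are assumed to be per-allele log odds ratios with their standard errors, and the example numbers are invented for illustration only.

```python
import math


def inverse_variance_meta(estimates):
    """Fixed-effect inverse variance-weighted meta-analysis.

    estimates: (beta, se) pairs, one per study, where beta is a
    per-allele log odds ratio and se is its standard error.
    Returns the pooled beta, its standard error and a two-sided P value.
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total_weight = sum(weights)
    beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    z = beta / se
    # Two-sided P value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, p


# Hypothetical discovery and replication estimates for one SNP.
beta, se, p = inverse_variance_meta([(1.0, 0.5), (1.0, 0.5)])
print(f"pooled OR = {math.exp(beta):.2f}, SE = {se:.3f}, P = {p:.4f}")
```

Because each study is weighted by the inverse of its variance, the larger replication set contributes more to the pooled estimate than the discovery set when their standard errors differ.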