Introduction

Vitamin D deficiency is a common public health problem in the United States. It is reported that 40–100% of elderly US and European men and women are vitamin D deficient (Holick 2007). This deficiency has been shown to be related to bone health problems such as rickets and osteoporosis because of reduced calcium absorption (Rovner and Miller 2008; Sahota 2000; Simonelli 2005; Wharton and Bishop 2003). Additionally, recent data indicate that vitamin D deficiency may be related to other diseases such as diabetes, cardiovascular disease, renal disease, and some cancers (Borkar et al. 2009; Feskanich et al. 2004; Kilkkinen et al. 2009; Melamed et al. 2009).

Serum 25-hydroxyvitamin D [25(OH)D] concentration is the best indicator of vitamin D status and is used to determine deficiency or sufficiency. Recent studies indicate that genetic factors may play an important role in determining serum 25(OH)D levels. Wjst et al. performed a genome-wide linkage scan for serum 25(OH)D in asthma families and reported a heritability of 80% for serum 25(OH)D levels in these families (Wjst et al. 2007). In two other studies, performed in 226 adolescent twins by Arguelles et al. (2009) and in 1,762 participants by Shea et al. (2009), genetic factors were estimated to account for 68.9 and 28.8% of the variability in serum 25(OH)D levels, respectively.

Although serum 25(OH)D levels are under strong genetic control, few of the responsible genes have been identified. We conducted a cross-sectional study in healthy Caucasian subjects to investigate the association between variation in serum 25(OH)D levels and nine prominent candidate genes involved in the synthesis, transport, or degradation of vitamin D, or in the activation of its downstream signaling pathways. We then conducted a replication study in an independent cohort of healthy subjects to confirm the significant results.

Methods and materials

Participants

Subjects for the discovery cohort were 156 unrelated non-Hispanic white men and women, randomly selected from three vitamin D studies in the Osteoporosis Research Center (ORC) at Creighton University. Details of the recruitment criteria have been reported previously (Lappe et al. 2006). The subjects for the discovery cohort were randomly recruited from three locations: Alaska (41 subjects), Hawaii (23 subjects), and Nebraska (92 subjects). Subjects for the replication study were 340 unrelated non-Hispanic white postmenopausal women, randomly recruited from a rural area of eastern Nebraska (Lappe et al. 2006). All subjects were generally healthy. Subjects with diseases that may affect vitamin D metabolism were excluded. These diseases included: (1) any chronic kidney disease, (2) Paget’s or other metabolic bone diseases, and (3) all known cancers. All subjects provided written informed consent, and the Institutional Review Board at Creighton University approved the project. The characteristics of the discovery and replication cohorts are presented in Table 1.

Table 1 Characteristics of the participants

Serum samples for 25(OH)D measurements were collected at the initial visit. Total serum 25(OH)D levels, including both 25(OH)D2 and 25(OH)D3, were measured by radioimmunoassay (Nichols/Quest Diagnostics, San Clemente, CA, USA) in a single laboratory participating in the international quality assessment scheme for 25(OH)D assays (DEQAS) (Carter et al. 2004). Other important factors that could influence variation in serum 25(OH)D levels were also recorded, such as age, date of blood collection, serum 25(OH)D measurement date, height, weight, habitual vitamin D supplementation, and other medication intake.

Candidate gene and tag SNP selection

We selected nine candidate genes according to the following criteria: (1) evidence of significant association in previous studies; (2) biological importance in vitamin D metabolism, transportation, degradation, or downstream signaling activation. The genes selected are ALPL (alkaline phosphatase), CYP24A1 (vitamin D 24-hydroxylase), CYP27A1 (vitamin D3 25-hydroxylase), CYP27B1 [25(OH)D-1-alpha hydroxylase], CYP2R1 (cytochrome P450, family 2, subfamily R, polypeptide 1), CYP3A4 (cytochrome P450, family 3, subfamily A, polypeptide 4), GC (vitamin D binding protein), VDR (vitamin D receptor), PTH (parathyroid hormone). The basic characteristics of the nine genes are tabulated in Table 2.

Table 2 Basic characteristics of the nine candidate genes

We used the software program SNPbrowser (v4.0.1) to select tag SNPs within and around the nine candidate genes that have a minor allele frequency (MAF) >10% in the HapMap CEU population. Tag SNP selection was based on the HapMap database (release 20, January 24, 2006) with two methods: pair-wise r² (r² ≥ 0.8) and haplotype R² (R² ≥ 0.8) (De La Vega et al. 2006). In addition to tag SNPs, we chose other SNPs of potential functional importance in the promoter, 3′UTR, and exon regions. All chosen SNPs were confirmed in NCBI (http://www.ncbi.nlm.nih.gov/SNP/) and HapMap (http://www.hapmap.org).
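The pair-wise r² criterion can be illustrated with a short sketch. It is not the SNPbrowser algorithm, whose internals are not described here; it assumes phased haplotypes coded 0/1 and a simple greedy rule, and the SNP names and haplotypes are made up for illustration.

```python
def pairwise_r2(hap_a, hap_b):
    """r^2 between two biallelic SNPs from phased haplotypes coded 0/1."""
    n = len(hap_a)
    p_a = sum(hap_a) / n
    p_b = sum(hap_b) / n
    p_ab = sum(1 for a, b in zip(hap_a, hap_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                      # linkage disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    return d * d / denom if denom > 0 else 0.0


def greedy_tags(haplotypes, threshold=0.8):
    """Greedy tagging (an assumed rule): repeatedly pick the SNP that
    captures the most untagged SNPs at r^2 >= threshold."""
    untagged = set(haplotypes)
    tags = []
    while untagged:
        best = max(
            sorted(untagged),
            key=lambda s: sum(
                pairwise_r2(haplotypes[s], haplotypes[t]) >= threshold
                for t in untagged
            ),
        )
        tags.append(best)
        captured = {
            t for t in untagged
            if pairwise_r2(haplotypes[best], haplotypes[t]) >= threshold
        }
        untagged -= captured | {best}
    return tags


# Hypothetical haplotypes for three SNPs (8 chromosomes); not study data.
example = {
    "snpA": [0, 0, 1, 1, 0, 1, 0, 1],
    "snpB": [0, 0, 1, 1, 0, 1, 0, 1],   # identical to snpA, so r^2 = 1
    "snpC": [0, 1, 0, 1, 0, 1, 0, 1],   # weakly correlated with snpA
}
print(greedy_tags(example))
```

In this toy example snpA tags snpB, while snpC falls below the 0.8 threshold and must be genotyped itself.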

SNP genotyping

DNA was extracted from peripheral blood using the Gentra Puregene Blood kits (Qiagen Inc.), following the provided protocol. DNA samples were diluted to 20 µg/ml and shipped to KBioscience Company (http://www.kbioscience.co.uk) for SNP genotyping using the KASPar assay. Forty-six of the 49 selected SNPs were successfully genotyped and used for data analysis. SNPs rs3782130 and rs2853562 were monomorphic in the discovery cohort, and genotyping of SNP rs4646535 failed. After excluding these three SNPs, the genotype call rate for the 46 successfully genotyped SNPs was 97.8%.

Statistical data analyses in the discovery cohort

To evaluate the potential effects of covariates on serum 25(OH)D levels, Pearson correlation analysis was performed (SPSS, version 13.0). Age, gender, BMI (body mass index), habitual vitamin D supplementation, and blood collection season (December–February, March–May, June–August, September–November) were considered as potential covariates. Subsequently, the serum 25(OH)D levels were adjusted for the significant covariates using a linear regression approach (SPSS, version 13.0). The adjusted serum 25(OH)D levels were used for the data analyses.
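The two steps above, screening covariates by Pearson correlation and then adjusting the phenotype by linear regression, can be sketched as follows. The sketch assumes that "adjusted" means taking the residuals of an ordinary least-squares fit, which is a common reading but is not stated explicitly; the function names and the toy values are illustrative and are not the SPSS procedure or study data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def adjust_for_covariates(y, covariates):
    """Residuals of y after OLS regression on the covariates plus an intercept.

    covariates: one row per subject, e.g. (BMI, supplementation indicator).
    """
    rows = [[1.0] + list(c) for c in covariates]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    return [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(rows, y)]


# Made-up values for six subjects: (BMI, supplement use 0/1) and 25(OH)D in nmol/L.
covs = [(22, 0), (25, 1), (30, 0), (27, 1), (35, 1), (24, 0)]
levels = [70.0, 78.0, 55.0, 74.0, 60.0, 66.0]
print(pearson_r([c[0] for c in covs], levels))
print(adjust_for_covariates(levels, covs))
```

The residuals have mean zero by construction, so the adjusted trait carries only the variation not explained by the selected covariates.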

Several statistical analyses were conducted in the discovery cohort. The Hardy–Weinberg equilibrium (HWE) of the genotypic frequencies among subjects was examined. The association between the adjusted serum 25(OH)D level (a quantitative trait) and each SNP was tested using the Wald test implemented in PLINK (Purcell et al. 2007).
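A minimal sketch of these two checks is given below. It makes two assumptions that are not stated in the text: that HWE is assessed with a chi-square goodness-of-fit test (the specific HWE test used is not named), and that the Wald statistic for a quantitative trait is the slope of a simple regression of the trait on the minor-allele count (0, 1, 2) divided by its standard error. The data are made up.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic (1 df) comparing genotype counts with HWE expectations."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)           # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


def wald_statistic(genotypes, phenotype):
    """Return (beta, se, t) for regression of phenotype on allele count."""
    n = len(genotypes)
    mean_g = sum(genotypes) / n
    mean_y = sum(phenotype) / n
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    sxy = sum((g - mean_g) * (y - mean_y) for g, y in zip(genotypes, phenotype))
    beta = sxy / sxx
    intercept = mean_y - beta * mean_g
    rss = sum((y - intercept - beta * g) ** 2 for g, y in zip(genotypes, phenotype))
    se = math.sqrt(rss / (n - 2) / sxx)       # standard error of the slope
    return beta, se, beta / se


# Made-up example: six subjects, allele counts and adjusted trait values.
print(hwe_chi_square(30, 40, 30))
print(wald_statistic([0, 0, 1, 1, 2, 2], [1.0, 3.0, 3.0, 5.0, 5.0, 7.0]))
```

Under these assumptions the t statistic would be compared with a t distribution on n − 2 degrees of freedom, and an HWE chi-square above 3.84 would indicate departure from equilibrium at the 0.05 level.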

To control the family-wise error rate, the Max(T) permutation procedure in PLINK was used to correct for multiple testing and to adjust P values. We conducted 10,000 permutations (by shuffling the phenotypes in each permutation) to generate 10,000 replicated datasets. The observed Wald test statistic at each SNP was compared against the maximum of all statistics (over all SNPs) for each single replicate. Let n denote the number of replicates for which the maximum statistic is greater than or equal to the observed statistic at the SNP. The adjusted P value for the SNP was estimated as n/10,000.
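The procedure described above can be sketched in a few lines. This is an illustration of the max(T) idea as stated in the text, not PLINK's implementation; the absolute t statistic from a simple regression stands in for the Wald statistic, and the genotypes, trait values, function names, and random seed are all hypothetical.

```python
import math
import random


def abs_t(genotypes, phenotype):
    """Absolute t statistic for the slope of phenotype on allele count."""
    n = len(genotypes)
    mg = sum(genotypes) / n
    my = sum(phenotype) / n
    sxx = sum((g - mg) ** 2 for g in genotypes)
    if sxx == 0:
        return 0.0                             # monomorphic SNP carries no signal
    sxy = sum((g - mg) * (y - my) for g, y in zip(genotypes, phenotype))
    beta = sxy / sxx
    rss = sum((y - my - beta * (g - mg)) ** 2 for g, y in zip(genotypes, phenotype))
    if rss == 0:
        return math.inf
    return abs(beta) / math.sqrt(rss / (n - 2) / sxx)


def max_t_adjusted_p(snps, phenotype, n_perm=10000, seed=0):
    """Adjusted P value per SNP: fraction of permutations whose maximum
    statistic over all SNPs is >= the observed statistic at that SNP."""
    rng = random.Random(seed)
    observed = {name: abs_t(g, phenotype) for name, g in snps.items()}
    exceed = dict.fromkeys(snps, 0)
    shuffled = list(phenotype)
    for _ in range(n_perm):
        rng.shuffle(shuffled)                  # permute phenotypes only
        max_stat = max(abs_t(g, shuffled) for g in snps.values())
        for name in snps:
            if max_stat >= observed[name]:
                exceed[name] += 1
    return {name: exceed[name] / n_perm for name in snps}


# Made-up data: snp1 tracks the trait closely, snp2 does not.
toy_snps = {
    "snp1": [0, 0, 0, 1, 1, 1, 2, 2, 2, 0, 1, 2],
    "snp2": [0, 1, 2, 0, 1, 2, 0, 1, 2, 1, 1, 1],
}
toy_trait = [1.0, 1.2, 0.9, 2.1, 1.9, 2.0, 3.1, 2.9, 3.0, 1.1, 2.2, 2.8]
print(max_t_adjusted_p(toy_snps, toy_trait, n_perm=1000))
```

Because genotypes are kept fixed and only phenotypes are shuffled, the correlation among SNPs is preserved in every replicate, which is why this correction is typically less conservative than a Bonferroni adjustment for markers in LD.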

Replication and pooled analyses

The replication analyses were conducted in the replication cohort for significant SNPs found in the discovery cohort. Significant SNPs were genotyped in the replication cohort using the same method and the same company as for the discovery cohort. Age, BMI, habitual vitamin D supplementation, and phlebotomy season were tested as potential covariates for serum 25(OH)D concentrations. Significant covariates were used for adjusting serum 25(OH)D levels. The same statistical approaches and tools were used for replication analyses to identify genetic variants important for serum 25(OH)D variation. In addition, a one-way ANOVA was performed to compare serum 25(OH)D levels among the three genotype groups (GG, AG, and AA) of the most significant SNP, rs12794714.
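The one-way ANOVA across genotype groups reduces to the ratio of between-group to within-group mean squares. The sketch below computes only the F statistic and its degrees of freedom; it is not the SPSS routine, and the group values are made up. Obtaining the P value requires the F distribution, which a statistics package (for example `scipy.stats.f_oneway`) would supply.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of values."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]
    # Between-group sum of squares: group size times squared mean deviation.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-group sum of squares: deviations from each group's own mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up serum 25(OH)D values (nmol/L) for three genotype groups.
aa = [60.0, 64.0, 62.0]
ga = [70.0, 73.0, 69.0]
gg = [78.0, 75.0, 80.0]
print(one_way_anova([aa, ga, gg]))
```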

A pooled analysis was conducted involving all 496 subjects from the two cohorts. The phenotypes and the genotypes of the replicated SNPs from both cohorts were combined into one dataset for the analysis. Significant covariates were evaluated and used for adjusting serum 25(OH)D levels. Association tests were conducted using PLINK (version 1.0.7) with the same methods as for the discovery cohort.

Results

Table 3 lists the basic information for the 46 successfully genotyped SNPs. All of them passed the HWE test. The average MAF of the 46 SNP markers is 31.5%, ranging from 10 to 50%. We adjusted serum 25(OH)D concentration using significant covariates. In the discovery cohort, BMI (correlation coefficient r = −0.253, P = 0.001) and habitual vitamin D supplementation (r = 0.281, P = 0.001) had significant effects on serum 25(OH)D levels. These two covariates account for 14% of the variation in serum 25(OH)D levels. We also tested the effects of potential covariates on serum 25(OH)D in the replication cohort. As in the discovery cohort, BMI (r = −0.307, P = 7.7e−9) and habitual vitamin D supplementation (r = 0.247, P = 4.1e−6) were significantly correlated with serum 25(OH)D levels in the replication cohort. Age, gender, and phlebotomy season were not significant in either the discovery or the replication cohort.

Table 3 Results of single SNP association with serum 25(OH)D level in the discovery cohort (n = 156)

Three P values are reported for the association results. We set the nominal significance level at 0.05 for the Wald test of each individual SNP. Six SNPs in the CYP2R1 and GC genes were significantly associated with serum 25(OH)D levels (Table 3). Three SNPs in the promoter of the CYP2R1 gene, rs10741657 (P discovery = 0.001), rs1562902 (P discovery = 0.011), and rs10766197 (P discovery = 0.005), showed evidence of association. SNP rs12794714, a synonymous (Ser to Ser) variant in exon 1 of the CYP2R1 gene, reached the lowest P value among all tested SNP markers. For the GC gene, SNPs rs222020 (P discovery = 0.010) and rs2298849 (P discovery = 0.026) were significantly associated with serum 25(OH)D concentrations. With the family-wise error rate (i.e. the significance level for all SNPs) set at 0.05, SNP rs12794714 of the CYP2R1 gene remained significant after adjusting for multiple testing (adjusted P discovery = 0.022) (Table 3).

Replication analyses were conducted on the six significant SNPs identified at the nominal significance level of 0.05 for single SNP association test. The call rate was 95% for the genotyping experiment for the six SNPs. At the nominal significance level of 0.05 for single tests, SNPs rs12794714 (P replication = 0.018) and rs10766197 (P replication = 0.022) in the CYP2R1 gene were confirmed to be significantly associated with the variation in serum 25(OH)D levels (Table 4). The SNP marker rs222020 (P replication = 0.037) in the GC gene was confirmed to be marginally significant.

Table 4 Replication results of single SNPs association analyses for the serum 25(OH)D level in the replication cohort (n = 340)

All six replicated SNPs were used to construct the pooled dataset of the two groups. After correcting for multiple testing, SNPs rs12794714, rs10741657, and rs10766197 of the CYP2R1 gene and rs222020 of the GC gene were found to have a significant association with serum 25(OH)D levels (Table 5). The SNP rs12794714 in the CYP2R1 gene reached the lowest adjusted P value, 1.00 × 10⁻⁴.

Table 5 Results of single SNPs association analyses for the serum 25(OH)D level in the pooled dataset of the groups (n = 496)

We compared the raw serum 25(OH)D levels among the three genotypes of the most significant SNP, rs12794714 of the CYP2R1 gene, using one-way ANOVA (Fig. 1). Mean serum 25(OH)D levels differed significantly among genotypes in the discovery cohort (AA: 62.03 ± 17.82 nmol/L, GA: 70.65 ± 21.70 nmol/L, GG: 77.82 ± 21.96 nmol/L, P = 0.001) and in the replication cohort (AA: 66.13 ± 16.18 nmol/L, GA: 74.94 ± 19.90 nmol/L, GG: 75.09 ± 23.67 nmol/L, P = 0.017). Pooling the two cohorts into one dataset gave the lowest one-way ANOVA P value (AA: 65.20 ± 16.96 nmol/L, GA: 74.44 ± 20.83 nmol/L, GG: 76.62 ± 23.32 nmol/L, P = 0.0002). The results indicate that raw serum 25(OH)D levels differ significantly among the GG, GA, and AA genotypes of SNP rs12794714 in both the discovery and replication cohorts.

Fig. 1 Raw serum 25(OH)D means (±SE) in the three genotypes of SNP rs12794714 in the discovery cohort, replication cohort, and pooled dataset. The X axis shows the three genotypes in each dataset; the Y axis shows the raw serum 25(OH)D concentrations (nmol/L); error bars show the standard error of the mean serum 25(OH)D concentration. One-way ANOVA was conducted to compare the serum 25(OH)D levels grouped by genotype (SPSS 13.0)

Discussion

The present study investigated the association of nine prominent candidate genes with serum 25(OH)D levels. The GC and CYP2R1 genes were shown to be associated with serum 25(OH)D levels in the discovery cohort at nominal significance level of 0.05 for single tests. Further replication analysis and pooled dataset analysis confirmed the association between the CYP2R1 and GC genes and serum 25(OH)D concentrations, suggesting that the CYP2R1 and GC genes play an important role in regulating serum 25(OH)D levels in the non-Hispanic white population.

CYP2R1 is a member of the CYP2 family of genes encoding cytochrome P450 proteins. It is a key vitamin D 25-hydroxylase, which hydroxylates vitamin D at the 25-C position for 25(OH)D synthesis in the liver (Cheng et al. 2003; Shinkyo et al. 2004). Previous data show that the CYP2R1 gene is associated with several vitamin D related diseases, such as type 1 diabetes (Ramos-Lopez et al. 2007), ovarian cancer (Downie et al. 2005), and asthma and atopy (Bosse et al. 2009).

Our study found that the CYP2R1 gene is associated with serum 25(OH)D levels. This finding is supported by previous studies. Cheng et al. reported that a patient with low circulating levels of serum 25(OH)D and classic symptoms of vitamin D deficiency had a homozygous mutation, L99P, in exon 2 of the CYP2R1 gene. This homozygous mutation caused inactivation of CYP2R1 (Cheng et al. 2004). Ahn et al. performed a combined meta-analysis on 4,501 subjects of European ancestry from five cohorts and replicated the significant findings in 2,221 subjects. Their results show that rs2060793 and rs1993116 in the CYP2R1 gene are associated with serum 25(OH)D levels (Ahn et al. 2010). This is consistent with our results: SNP rs2060793 is located in the promoter region of the CYP2R1 gene and is in complete LD with SNP rs10741657 in the Caucasian population (D′ = 1, r² = 1, HapMap Data Rel 24/phase II Nov 08). Ramos-Lopez et al. tested the association of the CYP2R1 gene with variation in serum 25(OH)D concentration in 609 subjects from 203 type 1 diabetes families. The study found that SNP rs10741657 is associated with serum levels of 25(OH)D (Ramos-Lopez et al. 2007). In addition, in a recent genome-wide association study (GWAS), Wang et al. (2010) found multiple SNPs (including rs12794714 and rs10741657) in the CYP2R1 gene that are significantly associated with 25(OH)D levels in ~30,000 individuals of European descent from 15 cohorts. The SNP rs10741657 is significant in the association analyses in our pooled dataset, and the SNP rs12794714, located in exon 1 of the CYP2R1 gene, is significant in both the discovery cohort and the pooled dataset in our study. In the work of Wjst et al. (2006), SNP rs10766197 in the CYP2R1 gene was significantly associated with 25(OH)D levels in 872 participants of the German Asthma Family Study. This same SNP is also significant in the pooled dataset in our study. Both SNPs rs10741657 and rs10766197 are located in the promoter region of the CYP2R1 gene. Taken together, these studies indicate that genetic variants of the CYP2R1 gene are associated with serum 25(OH)D variation.

Recent association studies revealed several important genetic variants in the GC gene for serum 25(OH)D variation. Ahn et al. (2009) suggested the genetic markers rs12512631, rs2282679, rs7041, and rs1155563. In a recent GWAS of serum 25(OH)D in 4,501 persons of European ancestry, SNP rs2282679 in the GC gene was the most significant marker (Ahn et al. 2010). SNP rs2282679 was also significantly associated with serum 25(OH)D level in the GWAS on ~30,000 individuals in the SUNLIGHT consortium (Wang et al. 2010). Our most significant finding in the GC gene is SNP rs222020. Our LD analysis indicates low LD between SNP rs222020 and SNP rs2282679 (r² = 0.05) (Fig. 2). Both SNP rs222020 and SNP rs2282679 are located in intronic regions of the gene. It is possible that an unknown functional genetic variant near the two SNPs is important for the regulation of serum 25(OH)D levels. Future sequencing studies or large-scale association studies with dense markers may reveal such genetic variants.

Fig. 2 LD plots with r² values for the CYP2R1 (a) and GC (b) genes in the discovery cohort. The figure was generated with Haploview. D′ values are indicated by the depth of shading; r² values multiplied by 100 are shown as the numbers in the diamonds

Compared to previous studies, our study has two strengths: (1) This study, conducted in a healthy population to analyze the genetic association with variation in serum 25(OH)D levels, eliminated potential impacts of diseases. (2) Dense markers in nine important candidate genes involved in vitamin D metabolism were selected. One limitation of the study is that the sample size for the discovery cohort is relatively small. Given the lack of knowledge regarding genes regulating prevalent serum 25(OH)D levels, more genetic association studies are needed.

In summary, after comprehensive screening of nine functionally important vitamin D candidate genes, our study suggests that the CYP2R1 and GC genes may be important in regulating serum 25(OH)D levels in healthy Caucasian subjects. It is important to confirm these findings in other large-scale healthy populations or in other races. Further studies are needed to identify the functional genetic variants, and to characterize their functions.