Introduction

The cytochrome P450 enzyme CYP1A2 plays a significant role in the metabolism of caffeine and of several drugs including clozapine, imipramine, and theophylline (reviewed in [1]). To date, five of the 14 CYP1A2 alleles (http://www.imm.ki.se/CYPalleles) are known to be associated with decreased enzyme activity as compared with the wild-type CYP1A2*1A allele. First, the CYP1A2*1F allele, consisting of the single nucleotide polymorphism (SNP) −163A>C in the 5′ noncoding region of the CYP1A2 gene, has been associated with 1.6-fold decreased caffeine metabolism in Caucasian smokers [2], although no functional association was seen in nonsmokers [2], pregnant women [3], or individuals of other ethnicities [4, 5]. Of note, the −163C allele is numerically the minor allele, hence −163A>C, but CYP allele nomenclature lists this SNP as −163C>A according to the reference sequence (Ensembl Gene ID ENST00000343932). The CYP1A2*1K allele, consisting of three linked SNPs in the 5′ noncoding region (−739T>G, −729C>T, and −163A>C), was associated with decreased caffeine metabolism [4]. Likewise, the CYP1A2*1C allele, consisting of the −3860G>A SNP, was associated with decreased caffeine demethylation in Japanese smokers [6]. The CYP1A2*7 allele, consisting of the 3534G>A SNP in intron 6 of the CYP1A2 gene, was reported to be associated with impaired clozapine elimination due to low CYP1A2 enzyme activity in Japanese subjects [7]. Finally, the CYP1A2*11 allele, consisting of a 558C>A SNP in exon 2, was associated with a reduction of the catalytic activity for acetaminophen to approximately 5% of that of the wild-type allele [8].

It has been demonstrated that CYP1A2 polymorphisms are in linkage disequilibrium. Therefore, screening for the CYP1A2*1D (−2467Tdel) and CYP1A2*1F (−163A>C) alleles has been proposed to be predictive of the CYP1A2 phenotype [9]. Additional functional information is provided by the CYP1A2*1K (−739T>G, −729C>T, −163A>C) allele [4]. We thus developed pyrosequencing [10] assays for rapid detection of these alleles in order to facilitate routine assessment of the CYP1A2 genotype.

Methods

Blood samples were obtained from 495 healthy Caucasian subjects after written informed consent. Approval of genotyping had been obtained from the University of Frankfurt Medical Faculty Ethics Review Board. The DNA was extracted using a BioRobot EZ-1 and the EZ-1 DNA Blood Card (Qiagen, Hilden, Germany). PCR primers were designed with the Oligo primer analysis software (Molecular Biology Insight, Cascade, CO, USA) using the CYP1A2 nucleotide sequences provided with Ensembl Gene ID ENST00000343932. The primer pair biotin-5′–GGCAACATGGCAAGACCT–3′ and 5′–GGACAAGCCTTAAATTGGATG–3′ was used for −2467Tdel, biotin-5′–GGAGAGAGCCAGCGTTCA–3′ and 5′–GGACAATGCCATCTGTACCAA–3′ for −163A>C, and 5′–TCTTGGGACCAATTTACAATCTC–3′ and biotin-5′–GGCTTAGTCCAAACTGCTCATT–3′ for −739T>G/−729C>T. After initial denaturation (95°C, 5 min), 50 cycles were run, each consisting of 15 s at 95°C, 15 s at the annealing temperature (52.5°C for −2467Tdel, 54.5°C for −163A>C, and 51°C for −739T>G/−729C>T), and a 30-s extension at 72°C. A duplex reverse assay used the pyrosequencing primers 5′–CCAGGTTGGGGTTC–3′ for −2467Tdel and 5′–CCATCTACCATGCGTC–3′ for −163A>C, and a simplex forward assay used 5′–GGGCTAGGTGTAGGG–3′ for −739T>G and −729C>T. Assay design was carried out with SNP Primer Design software for a PSQ 96MA system (Pyrosequencing, Uppsala, Sweden). The deoxynucleotide triphosphate (dNTP) dispensation order was GATAGTGCTGTGTCA for the duplex and AGTGCTGAGTCTCG for the simplex assay. A 25-μl CYP1A2 PCR template of each allele was incubated in a shaker (10 min) with streptavidin-coated sepharose beads (Amersham Pharmacia Biotech, Uppsala, Sweden) and prepared with 70% ethanol and denaturation buffer in a Vacuum Prep Workstation (Pyrosequencing, Uppsala, Sweden) for transfer of the biotinylated templates into 55 μl of the corresponding 0.35 μM sequencing primer. Pyrosequencing took place after incubation for 2 min at 80°C.
For each of the genotypes −2467TT, −2467TdelTdel, −163CC, −163GG, −739TT, −739TG, −729CC, and −729CT, two randomly selected samples were amplified with nonbiotinylated primers, conventionally sequenced (AGOWA, Berlin, Germany), and included as positive controls during pyrosequencing.

The allelic frequencies were calculated as a/2n, where a is the number of mutated alleles and n is the number of DNA samples. On the basis of the observed allelic frequency, the expected number of homozygous and heterozygous carriers of the respective SNP was calculated using the Hardy-Weinberg equation \(p^{2} + 2pq + q^{2} = 1\), where p and q are the frequencies of the wild-type and mutated alleles, respectively. The agreement between the observed numbers of homozygous and heterozygous individuals and the numbers expected under Hardy-Weinberg equilibrium, which would indicate that the study sample corresponded to a random sample of subjects, was assessed using the χ² test. In addition, the observed allelic frequencies were compared with allelic frequencies reported in the literature by means of Fisher's exact test. Binomial 95% confidence intervals (CI) of allelic frequencies of SNPs and haplotypes were computed using the BiAS software (epsilon-Verlag, Germany). Furthermore, analysis of linkage disequilibrium was performed using the EMLD computer program (Qiqing Huang, Ph.D., University of Texas, USA; http://www.request.mdacc.tmc.edu/~qhuang/Software/pub.htm), which computed the values of D', denoting the difference, D, between the observed and the expected gamete frequency, normalized to the maximum that D can have according to Lewontin [11], and r², denoting the squared correlation measure of linkage disequilibrium between two loci [12]. Finally, in-silico haplotyping was performed using the PHASE computer software [13, 14].
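The allele-frequency and Hardy-Weinberg calculations described above can be illustrated with a minimal Python sketch. It is not the software used in the study; the function names are hypothetical, and the example call uses the genotype counts for −2467Tdel given in the Results. Only the χ² statistic is computed, because the conversion to a p value depends on the degrees of freedom, which are not stated here.

```python
def allele_frequency(n_het, n_hom_var, n_subjects):
    """Frequency of the mutated allele, a/2n, from genotype counts."""
    mutated_alleles = n_het + 2 * n_hom_var  # a
    return mutated_alleles / (2 * n_subjects)  # a / 2n


def hardy_weinberg_chi2(n_hom_wt, n_het, n_hom_var):
    """Expected genotype counts under p^2 + 2pq + q^2 = 1 and the chi-square statistic."""
    n = n_hom_wt + n_het + n_hom_var
    q = allele_frequency(n_het, n_hom_var, n)  # mutated allele
    p = 1.0 - q                                # wild-type allele
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_hom_wt, n_het, n_hom_var)
    # Sum of (observed - expected)^2 / expected over the three genotype classes
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return expected, chi2


if __name__ == "__main__":
    # Genotype counts for -2467Tdel: 422 TT, 68 TTdel, 5 TdelTdel
    expected, chi2 = hardy_weinberg_chi2(422, 68, 5)
    print("allele frequency:", round(allele_frequency(68, 5, 495), 4))
    print("expected counts:", [round(e, 1) for e in expected])
    print("chi-square statistic:", round(chi2, 3))
```

The same two functions apply unchanged to the other SNPs once their three genotype counts are known.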

Results

The pyrosequencing assays clearly identified all genotypes of the 495 DNA samples and were concordant with the genotypes of the conventionally sequenced controls (Fig. 1). The CYP1A2*1D allele (−2467Tdel) was found at an allelic frequency of 7.9% (95% CI 6.3–9.7; Table 1). The majority, 422 subjects (85.3%), had no deletion at position −2467, 68 carriers (13.7%) were heterozygous (−2467TTdel), and five carriers (1%) were homozygously mutated (−2467TdelTdel). The allelic frequency of the CYP1A2*1F allele (−163A>C) was 31.8% (95% CI 28.6–34.5). There were 224 (45.3%) homozygous noncarriers of the mutated −163C allele, 227 heterozygous carriers (45.9%), and 44 (8.9%) homozygous carriers of the −163C allele. The CYP1A2*1K allele (−739T>G, −729C>T, −163A>C) had an allelic frequency of 0.4% (95% CI 0.03–0.7). In detail, the SNPs −739T>G and −729C>T had allelic frequencies of 1.6% (95% CI 0.9–2.6) and 0.2% (95% CI 0.03–0.7), with 16 (3.2%) and two (0.4%) heterozygous carriers, respectively. Homozygous carriers were not detected. The observed distributions for all alleles agreed with the distributions predicted by the Hardy-Weinberg law (χ² test: p>0.45, indicating absence of difference between observation and expectation). Complete linkage disequilibrium (value of D' nearly 1) existed between −2467Tdel and both −739T>G (r²=0.19) and −729C>T (r²=0.02), between −739T>G and −729C>T (r²=0.12), and between −729C>T and −163A>C (r²=0.001). In contrast, −163A>C was in incomplete linkage disequilibrium with −2467Tdel and −739T>G (D'<0.6, r²<0.01). The most frequent CYP1A2 haplotype with respect to the four analyzed DNA positions was −2467T/−739T/−729C/−163A (61.6%), followed by −2467T/−739T/−729C/−163C (30.5%), −2467Tdel/−739T/−729C/−163A (5.1%), −2467Tdel/−739G/−729C/−163A (1.2%), and −2467Tdel/−739T/−729C/−163C (1.1%).
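For readers who wish to see how D' and r² relate to haplotype frequencies, the following Python sketch computes both measures for a pair of biallelic loci. It is an illustration only, not the EMLD procedure, which estimates haplotype frequencies from genotype data; the function name and argument layout are hypothetical. The example call collapses the five haplotype frequencies listed above onto the positions −2467 and −163. Because those frequencies are rounded and sum to 99.5%, they are renormalized, and the output can only approximate the reported values.

```python
def ld_measures(h11, h12, h21, h22):
    """Return (|D'|, r^2) for two biallelic loci A and B.

    h11: haplotypes carrying allele 1 at both loci
    h12: allele 1 at locus A, allele 2 at locus B
    h21: allele 2 at locus A, allele 1 at locus B
    h22: allele 2 at both loci
    Frequencies or counts are accepted; they are renormalized to sum to 1.
    """
    total = h11 + h12 + h21 + h22
    h11, h12, h21, h22 = (h / total for h in (h11, h12, h21, h22))
    p_a = h11 + h12  # frequency of allele 1 at locus A
    p_b = h11 + h21  # frequency of allele 1 at locus B
    d = h11 - p_a * p_b  # observed minus expected gamete frequency
    # Lewontin's normalization: the maximum of |D| depends on the sign of D
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if d_max == 0 or denom == 0:
        raise ValueError("both loci must be polymorphic")
    return abs(d) / d_max, d * d / denom


if __name__ == "__main__":
    # -2467 (allele 1 = Tdel) versus -163 (allele 1 = C), in percent:
    # Tdel/C = 1.1, Tdel/A = 5.1 + 1.2, T/C = 30.5, T/A = 61.6
    d_prime, r2 = ld_measures(1.1, 5.1 + 1.2, 30.5, 61.6)
    print("D' =", round(d_prime, 3), " r2 =", round(r2, 4))
```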

Fig. 1 Duplex and simplex assay design with expected and observed pyrograms for detection of the deletion −2467Tdel (CYP1A2*1D) and the single nucleotide polymorphisms (SNPs) −163A>C (CYP1A2*1F), −739T>G, and −729C>T (CYP1A2*1K with SNP −163A>C) in the CYP1A2 gene. The positions relevant for genotype identification are framed in the pyrogram and in the biotinylated forward or reverse DNA strand, respectively. The sequence to analyze, based on the cDNA position with attached sequencing primer, determines the nucleotide dispensation order. The duplex pyrograms denote the genotypes −2467TT, −163CC (above), −2467TTdel, −163AC (center), and −2467TdelTdel, −163AA (below). The simplex pyrograms show the genotypes −739TT, −729CC (above) and −739TG, −729CT (center). Homozygously mutated genotypes were not detected. Note that the initial part corresponding to the dispensation of enzyme and substrate has been omitted. From sequence analysis of the CYP1A2 gene, only the fragments relevant for identifying the polymorphisms at positions −2467, −163, −739, and −729 (marked with rectangles) are displayed in the electropherograms (right). The sequence was identical with the CYP1A2 gene cDNA as deposited at Ensembl Gene ID ENST00000343932

Table 1 Summary of detected allelic frequencies of CYP1A2 single nucleotide polymorphisms (SNPs) and CYP1A2*1D, *1F, and *1K alleles in different ethnicities [nd not determinable (95% CI/Fisher’s p not indicated or calculation not possible due to missing genotype distribution)]

Discussion

The presently observed allelic frequency of 7.9% of the CYP1A2*1D allele is similar to the frequencies of 4.1% and 5.4% previously observed in Caucasians [9] but differs statistically significantly from those in other ethnicities (Table 1). That is, the −2467T deletion appeared at much higher allelic frequencies of 41.5% in Japanese and of 40% in Egyptians [15]. For the CYP1A2*1F allele (i.e., −163C allele), the observed allelic frequency of 31.8% corresponded to the majority of previously published data, but not to those for Ethiopians [4], Bantu Africans [16], or Japanese [17], in whom significantly higher frequencies were found (Table 1). An even higher CYP1A2*1F frequency was detected in African-American patients with tardive dyskinesia after long-term neuroleptic medication [18], whereas patients with a sporadic form of porphyria cutanea tarda were significantly less often carriers of the C allele [19]. The low allelic frequency of 0.4% for CYP1A2*1K corresponded to previously obtained results [4], with allelic frequencies for SNPs −739T>G and −729C>T comparable to those in Caucasians [4] but differing from those in other ethnicities [4, 17] (Table 1). The presently observed frequency of 61.6% for the nonmutated haplotype −2467T/−739T/−729C/−163A reproduces the frequency of 61.8% recently reported for that haplotype [9], and the other haplotype frequencies also do not differ statistically significantly from the published ones of 33.3% for −2467T/−739T/−729C/−163C and 3.5% for −2467Tdel/−739T/−729C/−163A [9].

The suggestion that screening for the alleles CYP1A2*1D (−2467Tdel) and CYP1A2*1F (−163A>C) in Caucasians is sufficient to predict CYP1A2 enzyme activity was based on the observation that CYP1A2 polymorphisms are in strong linkage disequilibrium [9]. However, this would not identify the CYP1A2*1K allele consisting of the SNPs −739T>G, −729C>T, and −163A>C, because the −739T>G and −729C>T SNPs are much rarer than the −163A>C SNP. Nevertheless, because the functionally relevant [4] *1K allele has a very low frequency of 0.4% in Caucasians, only the *1D and *1F alleles need to be tested in Caucasians to obtain sufficient information for phenotype prediction [9]. Screening for the *1K allele in other ethnicities promises additional information because its frequency is higher: 3.6% in Arabians and 3% in Ethiopians [4]. Nonetheless, the present selection of SNPs might not be applicable to other populations in which frequencies of certain alleles have been reported to be different. For instance, the population frequencies for three additional characterized SNPs in the 5′ flanking region of CYP1A2 (−3591T>G, −3595G>T, and −3605Tins) were much lower in Caucasians than in African-Americans or Taiwanese [20]. A haplotype analysis in Japanese showed that allele CYP1A2*1C is highly linked with CYP1A2*1F (0.99 probability), whereas allele CYP1A2*1F was linked to a lesser degree with allele CYP1A2*1C (0.37 probability) [15]. The authors saw this discrepancy as a possible explanation for the difference in the plasma caffeine metabolic ratio between carriers of these two alleles [21].

In addition to the pharmacogenetic modulation of the plasma concentrations and effects of drugs that are substrates of CYP1A2, a role of CYP1A2 polymorphisms in clinical pathology has been suggested. For example, the SNP −163A>C (CYP1A2*1F) has been proposed to play an important role in disease states such as porphyria cutanea tarda [19], ovarian cancer [22], myocardial infarction [23], and tardive dyskinesia induced by neuroleptics in schizophrenic patients [18].

In conclusion, due to the large variation of CYP1A2 enzyme activity, which ranges from 10- to 160-fold [3, 24, 25], and the preliminary character of the clinical data, further research on the polymorphic character of CYP1A2 is required. To facilitate detection of the relevant CYP1A2 genotypes, we developed fast and reliable pyrosequencing assays.