Introduction

It is well known that considerable interindividual differences in drug response exist, and more and more research is focused on understanding to what extent genetic factors can explain these differences [1, 2]. Common genetic variants or polymorphisms are found in key enzymes and proteins involved in the metabolism of many drugs. In the case of substrate overlap (i.e. when more than one isoform of a gene is involved in the metabolism of a substrate) for a particular exogenous drug, the simultaneous occurrence of more than one variant in different genes may generate a synergistic effect that results in significant interindividual variation in the drug’s pharmacokinetic or toxicity properties. Knowing the prevalence of different genetic variants affecting the properties or levels of drug-metabolising enzymes would therefore be valuable when designing pharmacogenetic association studies.

Glucuronidation is primarily a detoxification process responsible for eliminating both endogenous and exogenous compounds. Glucuronidation reactions are catalysed by uridine-diphosphate glucuronosyl transferase (UGT) enzymes encoded by the UGT genes [3]. One subfamily is UGT1A, which has 13 alternative first exons. Different protein isoforms of the UGT1A enzyme are generated by alternative splicing of different first exons joined to the common downstream exons 2–5. The first exon confers the substrate specificity, which differs among isoforms, although an overlap exists for several drugs. The presence of a promoter region upstream of each of the unique exons allows for independent regulation of expression of the different UGT1A-encoded isoforms [4, 5]. Genetic polymorphisms have been identified for many of the UGT1A isoforms, but for several, the functional significance is unclear [3, 5].

A well-known defect of glucuronidation in humans is illustrated by the group of individuals with Gilbert’s syndrome and is associated with genetic polymorphism in the UGT1A gene. Gilbert’s syndrome is a familial unconjugated hyperbilirubinaemia in the absence of liver disease or overt haemolysis and is a benign disorder requiring no treatment [6]. A polymorphism in the UGT1A1 gene, in which seven instead of six thymine-adenine (TA) repeats are found in the proximal promoter region, accounts for most cases of this disorder. This results in lower expression levels of the protein and thereby a reduction of the glucuronidation capacity for bilirubin. Homozygosity for the (TA)7TAA variant (also called UGT1A1*28) appears to be necessary but not sufficient for the complete manifestation of the syndrome [3, 7]. Individuals with Gilbert’s syndrome as well as carriers of the UGT1A1*28 allele have also been shown to have altered pharmacokinetics of different drugs and an increased risk of drug toxicity [3, 4, 8].

The aim of this study was to determine the prevalence of polymorphisms (genotype distributions) in the UGT1A gene in a Swedish cohort of healthy female and male volunteers. The occurrence of variants found in the Swedish cohort was also investigated in different ethnic groups. The potential for a partly overlapping substrate specificity of different UGT1A isoforms suggests that analysis of a combination of cis-acting variations located on the same allele could be more informative with regard to pharmacokinetics. Thus, a haplotype analysis of the UGT1A gene was performed with regard to allelic variants in UGT1A1, UGT1A6, and UGT1A7. Moreover, we investigated the endogenous substrate bilirubin as an indicator for a UGT1A genotype-associated interindividual difference in glucuronidation capacity.

Materials and methods

Study group

Healthy subjects participating in clinical trials or entered in the clinical trial database at the Department of Clinical Pharmacology at Lund University Hospital were asked if they would be willing to be screened for selected polymorphisms. After verbal information was given and written informed consent obtained, blood samples were collected for genetic and plasma bilirubin analysis. The study was approved by the ethics committee at Lund University. In all, 248 individuals were genotyped for the variants in the UGT1A gene and its isoforms. Total plasma bilirubin samples (nonfasting) were obtained from 184 subjects (Table 1). DNA was prepared from 7 ml of EDTA whole blood either at the Department of Clinical Pharmacology using the salting-out technique [9] or at Karolinska Institute Biobank Facility in Stockholm.

Table 1 Descriptive statistics for subjects included in the bilirubin analysis

Human diversity panel

A human diversity DNA sample panel of 14 different ethnic groups was obtained from Coriell Repositories (Coriell, Camden, New Jersey, USA). Origins (and catalogue numbers) of the different groups were as follows: Yorubian in Ibadan, Nigeria; Southwestern American Indian (HD06); Southeast Asian, excluding Chinese and Japanese (HD13); Russian Zversky district northeast of Moscow (HD23); Puerto Rican (HD09); Middle Eastern (HD27); Middle Eastern (HD05); Mexican (HD08); Japanese (HD07); Italian (HD21); Indo-Pakistani (HD03); Chinese (HD02); Caucasian Northern European (HD01); African-American (HD04).

Genotyping

Genotyping was performed at the KTH DNA typing facility, Department of Biotechnology, Royal Institute of Technology, KTH, Stockholm, using pyrosequencing with an out-of-phase assay design [10]. Primers for polymerase chain reaction (PCR) and sequencing were designed based on available GenBank entries and are shown in Table 2. For template, 2 μl of the arrayed genomic DNA was used in a 15-μl PCR with 1.5 pmol of each primer, 2 mM MgCl2, 0.2 mM dNTP (Amersham Biosciences, Uppsala, Sweden), and 0.2 U AmpliTaq Gold with 1× PCR Gold Buffer (Applied Biosystems, Foster City, CA, USA). PCR was performed in an MWG multiblock thermocycler (MWG Biotech, Ebersberg, Germany) system with a temperature profile of 95°C for 12 min followed by 45 cycles of the sequence 95°C for 30 s, 62°C for 30 s, and 72°C for 2 min, followed by a final 72°C soak for 10 min, with the exception of the reactions for UGT1A7(T786C) and UGT1A28TA(TA)175003TAA, for which an annealing temperature of 60°C was used. Template preparation and primer annealing for pyrosequencing were performed in a Magnatrix 1200 instrument (Magnetic Biosolution, Stockholm, Sweden) with a standard method and kit provided by the manufacturer, using 50 μg of paramagnetic M270 streptavidin beads (Dynal Biotech, Oslo, Norway) and 1.65 pmol pyrosequencing primer per sample (Table 2). Samples obtained from the robot in a PSQ 96 plate were loaded on the PSQ HS 96 instrument (Pyrosequencing, Uppsala, Sweden), run according to the manufacturer’s standard procedure, and analysed using Pyrosequencing SNP software. Pyrosequencing assays for molecular haplotyping of UGT1A6 and UGT1A7 were designed according to the principles previously described [11]. The nucleotide additions in the different assays were as follows:

  • UGT1A1 promoter (TA/–): [GATATATATATGACTGAACAACAT]

  • UGT1A6 (T181A, R184S): [GTACAGTCTATCAGCAGACGACGCA]

  • UGT1A7 (N129K, R131K): [GCGTCAGCTATCTATAC]

  • UGT1A7 (W208R): [GACGTAGCACACAT]

Table 2 Polymorphisms and corresponding primer sequences (IP polymerase chain reaction primer, PS pyrosequencing primer)

Haplotype analysis and linkage disequilibrium

The corresponding haplotypes for the combination of all the analysed polymorphisms in the UGT1A gene were reconstructed using two statistical methods, the Gibbs sampling algorithm [12] and expectation maximisation (EM) [13]. The in silico SNP workbench tool [Andrade et al., unpublished] was used to perform the haplotype and linkage disequilibrium (LD) analysis. This tool is a set of ActiveX components, programmed in C++ Builder and available as a Web-based integrated workbench, that perform different statistical analyses on genotype data, including haplotype reconstruction and haplotype frequency estimation. This tool was also used to generate the graphical visualisation. The LD was measured by comparing the probability of the estimated frequency of each allele with the probability of the observed frequency of the allele when they are presumed to segregate independently. The EM-based algorithm was used to estimate the haplotypes followed by calculation of LD using various LD metrics as described by Devlin and Risch [14]. GOLD software [15] was used to display linkage disequilibrium graphically.
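The principle behind EM-based haplotype frequency estimation and the subsequent LD calculation can be illustrated for the simplest case of two biallelic markers. The following Python code is a minimal sketch under stated assumptions and is not the SNP workbench implementation used in this study: the function names em_two_locus and ld_measures are hypothetical, genotypes are coded as the number of variant alleles (0, 1, or 2) at each marker, and the squared correlation r2 is used as the LD metric on the assumption that it corresponds to the delta square measure.

```python
from itertools import product


def em_two_locus(genotypes, n_iter=200, tol=1e-10):
    """Estimate haplotype frequencies for two biallelic markers by EM.

    genotypes: list of (g1, g2); each value is the number of variant
    alleles (0, 1 or 2) carried at that marker (hypothetical coding).
    Returns a dict mapping haplotype (a1, a2), alleles coded 0/1,
    to its estimated frequency.
    """
    haps = list(product((0, 1), repeat=2))
    freq = {h: 0.25 for h in haps}  # start from equal frequencies
    n = len(genotypes)
    for _ in range(n_iter):
        counts = {h: 0.0 for h in haps}
        for g1, g2 in genotypes:
            if g1 == 1 and g2 == 1:
                # E-step: a double heterozygote has ambiguous phase, so
                # split it between the two possible haplotype pairs.
                cis = freq[(0, 0)] * freq[(1, 1)]
                trans = freq[(0, 1)] * freq[(1, 0)]
                total = cis + trans
                w = cis / total if total > 0 else 0.5
                counts[(0, 0)] += w
                counts[(1, 1)] += w
                counts[(0, 1)] += 1 - w
                counts[(1, 0)] += 1 - w
            else:
                # Phase is unambiguous when at most one marker is heterozygous.
                h_a = (1 if g1 == 2 else 0, 1 if g2 == 2 else 0)
                h_b = (1 if g1 >= 1 else 0, 1 if g2 >= 1 else 0)
                counts[h_a] += 1
                counts[h_b] += 1
        # M-step: new frequencies are expected counts per chromosome.
        new = {h: c / (2 * n) for h, c in counts.items()}
        change = max(abs(new[h] - freq[h]) for h in haps)
        freq = new
        if change < tol:
            break
    return freq


def ld_measures(freq):
    """Return (D, |D'|, r2) from two-marker haplotype frequencies."""
    p_a = freq[(1, 0)] + freq[(1, 1)]  # variant allele frequency, marker 1
    p_b = freq[(0, 1)] + freq[(1, 1)]  # variant allele frequency, marker 2
    d = freq[(1, 1)] - p_a * p_b       # deviation from independence
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else float("nan")
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else float("nan")
    return d, d_prime, r2
```

Calling em_two_locus on a list of genotype pairs and passing the result to ld_measures gives the pairwise LD for those two markers. Extending the approach to the seven sites analysed here would require enumerating all haplotype pairs compatible with each multi-site genotype, which this sketch does not attempt.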

Phylogenetic tree

A phylogenetic tree was created using the distance matrix method [16]. This is a graphical representation of the relationship between the estimated haplotypes. It illustrates the probability that two sequences are closely related to each other but not necessarily the pathways that created the existing relationships.
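As an illustration of the input to a distance-based tree, the following Python code computes a pairwise distance matrix between haplotypes. It is a sketch only: the choice of the Hamming distance (the number of sites at which two haplotypes differ) is an assumption, as are the function names and the example haplotype strings, which are hypothetical and do not reproduce Table 4.

```python
def hamming(a, b):
    """Number of sites at which two equal-length haplotypes differ."""
    if len(a) != len(b):
        raise ValueError("haplotypes must cover the same sites")
    return sum(x != y for x, y in zip(a, b))


def distance_matrix(haplotypes):
    """Return (names, matrix) of pairwise Hamming distances.

    haplotypes: dict mapping a haplotype label to a string with one
    character per polymorphic site (hypothetical 0/1 allele coding).
    """
    names = sorted(haplotypes)
    matrix = [
        [hamming(haplotypes[i], haplotypes[j]) for j in names]
        for i in names
    ]
    return names, matrix


# Hypothetical seven-site haplotypes, for illustration only.
example = {
    "HA": "0000000",
    "HB": "1110000",
    "HC": "1111111",
}
names, matrix = distance_matrix(example)
```

A clustering procedure such as neighbour joining or UPGMA would then build the tree from this matrix; that step is omitted here.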

Analysis of bilirubin

Samples for bilirubin levels were analysed using a validated method (used daily for routine analysis) at the Department of Clinical Chemistry Laboratory, Lund University Hospital, using a Hitachi Modular P apparatus. The reference limit was set to <20 μmol/l plasma.

Statistical analyses of bilirubin levels versus genotypes

A series of univariate analyses of the bilirubin levels categorised by gender, smoking status, and the UGT1A1 genotypes was performed. Additionally, multivariate analyses of the bilirubin levels categorised by gender, smoking, contraception, and genotypes (UGT1A1, haplotype combinations) and all possible interaction terms were performed.

All analyses were performed using SAS software, version 8.2 (SAS, Cary, NC, USA). The results are presented as least square means with 95% confidence intervals.

Results

Frequencies and haplotypes

To analyse the frequencies of the genotypes included in the study, we developed genotyping assays based on the pyrosequencing technique (Table 2). The frequencies of seven polymorphisms in the UGT1A gene (Fig. 1) were investigated in a cohort of 248 healthy individuals and confirmed to be in Hardy–Weinberg equilibrium (Table 3). The haplotypes of the UGT1A gene with respect to the UGT1A1, UGT1A6, and UGT1A7 polymorphisms were determined based on individual genotype information for the seven different nucleotide sites using two alternative statistical methods (Gibbs and EM). This analysis was performed both separately in the cohort and together with the genotype data from the human diversity panel (HDP). The results from the two methods show good agreement (Table 4). Three specific haplotypes constituted almost 84% of the allelic variants found in the total sample set (cohort and HDP; Table 4). Altogether we identified 15 haplotypes, of which three were found only in the HDP and two were unique to the cohort (Table 4). The common haplotype 3, carrying the TA insertion allele, was found in all ethnic groups of the diversity panel except for Japanese, Southwestern American Indians, and Southeast Asians (excluding Chinese and Japanese) (Table 5).
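A test of Hardy–Weinberg equilibrium for a single biallelic polymorphism can be sketched as follows. This is a minimal Python example assuming a chi-square goodness-of-fit test with one degree of freedom; the text does not state which test was applied, and the function name hwe_chi_square is hypothetical.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test of Hardy-Weinberg equilibrium for one biallelic site.

    n_aa, n_ab, n_bb: observed genotype counts (two homozygote classes
    and the heterozygote class).
    Returns (chi2, p_value) with one degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele a
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

A large p-value indicates no detectable deviation from the expected genotype proportions. For rare alleles with small expected counts, an exact test would be a more appropriate choice than this approximation.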

Fig. 1

Schematic illustration of the partial UGT1A gene with alternative first exons of the isoforms A1, A6, and A7, and the location of the analysed polymorphisms

Table 3 Allele frequencies of the UGT1A polymorphisms in the cohort
Table 4 Haplotype sequences and frequencies
Table 5 The origins and catalogue numbers of the Coriell HDP panel of ethnic groups with the number of carriers of the different haplotypes (H1–H15) of the UGT1A gene including the cohort

Phylogenetic tree and linkage disequilibrium

A graphical representation of the phylogenetic relationship between the different haplotype sequences is shown in Fig. 2. The degree of LD between the different alleles indicates at least two separate blocks of LD within this region of the UGT1A gene (Fig. 3). Furthermore, the statistical analysis identified two novel alleles with respect to the UGT1A6 polymorphisms (Table 4). We have denoted these alleles *3 and *4, which correspond to the amino acid combinations T181S184 and A181R184, respectively. These were independently confirmed experimentally through the molecular haplotyping analysis (Fig. 4).

Fig. 2

Relationship between the estimated haplotypes as a phylogenetic tree. This illustrates the probability that two sequences are more closely related to each other but does not necessarily describe the pathway that created the existing relationships

Fig. 3

Linkage disequilibrium (LD) between the markers in the cohort. The graphics correspond to a representation of the delta square measure. Two blocks can be defined with high levels of LD: the first corresponds to the linkage between the UGT1A1 (TA/–) and UGT1A6 (A754G, A765C) polymorphisms, and the second is defined for the UGT1A7 (T551G, C555A, G556A) and (T786C) polymorphisms

Fig. 4

Pyrosequencing raw data from molecular haplotyping of the UGT1A6 (A754G) and (A765C). The two novel haplotypes (*3, *4) in the UGT1A6 discovered by the two statistical methods are confirmed in heterozygotes for the *1/*3, *4/*2, and *3/*2 alleles. No homozygotes were identified in the genotyped individuals. Allele *1=T181R184; allele *2=A181S184; allele *3=T181S184; allele *4=A181R184

Genotype UGT1A1 and phenotype bilirubin

When testing for an association between genotype and total bilirubin levels in plasma, we found that carriers of the TA allele had increased levels of bilirubin compared with noncarriers (P=0.0003), with the highest levels found in the homozygous group (P<0.0001). When stratifying bilirubin levels for gender, smoking, and contraceptives, we found a significant difference for gender [P<0.0001; least square mean for women, 8.2±3.5 (SD), 95% CI 7.3–9.0; for men, 12.6±5.8 (SD), 95% CI 11.5–13.6], but none for smoking or contraceptive use.

When stratifying gender for genotype, there was still a significant difference between genders for wild type and heterozygous individuals (Fig. 5 and Table 6), but not for the homozygous (TA/TA). For men there was also a significant difference when comparing –/– with TA/– (P=0.0005) and TA/– with TA/TA (P<0.0001). For women this could be observed for TA/– and TA/TA (P<0.0001); however, no statistically significant difference could be found between female wild type and heterozygous (P=0.124), although the trend was the same (Table 6).

Fig. 5

Wild type (–/–), heterozygous (TA/–), and homozygous (TA/TA) for the –/TA insertion deletion polymorphism in the promoter region of the UGT1A1 gene (UGT1A1 *1/*28), stratified for gender and correlated to the levels of bilirubin [least square mean (Lsmean) and 95% confidence interval (CI)]

Table 6 Genotype UGT1A1, stratified for gender and bilirubin

Diploids of haplotype combinations with and without TA allele, and association with bilirubin

We analysed whether particular haplotype combinations were associated with differences in bilirubin levels. Diploids with a frequency < 2.2% were excluded, leaving 11 diploid combinations (1:1, 1:2, 1:7, 2:2, 2:4, 1:3, 2:3, 3:3, 3:4, 1:4, 1:5). We found that the 3:3 haplotype combination (here not separated for gender) was associated with an increased bilirubin level (least square mean 22.4, 95% CI 19.7–25), which was significantly different compared with the other diploids (P<0.0001). This is consistent with the identified association of bilirubin with the TA genotype, as this resides on haplotype 3. When we excluded the diploids with the TA allele, there was no statistically significant association between bilirubin levels and analysed diploid combinations (P=0.72).
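The frequency filter applied to the diploid combinations can be sketched in Python as follows. The function name common_diplotypes and the input layout (one pair of haplotype numbers per subject) are assumptions made for illustration; only the 2.2% threshold is taken from the text.

```python
from collections import Counter


def common_diplotypes(diplotypes, min_freq=0.022):
    """Keep diploid haplotype combinations at or above a frequency threshold.

    diplotypes: list with one (hap_a, hap_b) pair per subject; the order
    within a pair is ignored, so (3, 1) and (1, 3) are the same combination.
    Returns a dict mapping each retained combination to its frequency.
    """
    canonical = [tuple(sorted(pair)) for pair in diplotypes]
    counts = Counter(canonical)
    n = len(canonical)
    return {
        pair: count / n
        for pair, count in counts.items()
        if count / n >= min_freq  # combinations below 2.2% are excluded
    }
```

The retained combinations would then serve as the levels of the categorical factor in the analysis of bilirubin levels.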

Local haplotype analysis of UGT1A and association with bilirubin

It is not possible to analyse the effect of UGT1A6 allelic variation independently of the UGT1A1 polymorphism because of the strong linkage between them. However, we analysed whether there could be any independent influence of UGT1A7 variation on bilirubin levels. A new diploid set was analysed in which we extracted the local UGT1A7 haplotypes within the second block of LD, defined by the (T551G), (C555A), and (G556A) polymorphisms. No independent effect of UGT1A7 on bilirubin levels was observed (P=0.39) when alleles coupled to the TA insertion were excluded.

Discussion

This study was designed to investigate the prevalence of different polymorphisms, including haplotype combinations, of the UGT1A gene in a Swedish cohort of healthy individuals volunteering for clinical trials. In the cohort, the frequency of the UGT1A1*28 allele was found to be in the range previously reported for Caucasians (approximately 0.30–0.40) [17–19]. When analysing the haplotype structures and their frequencies, we discovered novel allelic variants and haplotypes.

Due to the limited number of individuals in each ethnic group (apart from the Swedish cohort), a true frequency determination was not possible, and the results give only an indication of whether common haplotypes found in the cohort could be significantly different or absent in certain ethnic groups. It has previously been reported that the frequency of UGT1A1*28 is low in Asians [17, 20]. Interestingly, we found the minor haplotypes H6 and H9 containing the TA allele among the 10-sample representations of Southeast Asians (excluding Chinese and Japanese) and Southwestern American Indian populations and H9 in Japanese, while the common haplotype 3 carrying the TA insertion (*28) was missing in all three groups (Table 5). Yorubians showed a high occurrence of the TA allele, including both haplotype H3 and the rare haplotypes H7 and H9, which could indicate that the common haplotype H3 may be of early origin. Our data agree with earlier reports that the TA allele has a high frequency in Africans [17]; however, the individual TA-containing UGT1A haplotypes behind this observation have not been previously described. Moreover, we found three novel haplotypes in the HDP that do not exist in the large cohort. Together, this indicates that there will be ethnic-related differences in UGT1A haplotypes.

Generally, the more individuals that form the basis for a statistical determination of haplotypes from unrelated individuals, the lower the likelihood of ending up with individuals with unresolved haplotypes, i.e. individuals who are heterozygotes for certain markers and in whom the exact phase of these cannot be determined. The two methods used here [12, 13] are in good agreement. Moreover, both methods discovered two new allelic variants of the UGT1A6 local haplotype or composite variation, *3 and *4 (Table 4). The independent confirmation of these from the molecular haplotyping raw data (Fig. 4) further supports the accuracy of these two statistical methods in deriving minor haplotypes from datasets of this sample size.

It has been suggested that the UGT1A6*2 variant is associated with reduced glucuronidation capacity [3, 21]. This study shows, in agreement with previously reported research [18], that there is a high linkage of UGT1A6*2 to the allelic variant UGT1A1*28. The literature does not suggest a role of the UGT1A6 enzyme in the metabolism of bilirubin [22], but there are compounds for which both UGT1A1 and UGT1A6 are involved in the metabolism [5, 22]. The common haplotype 3 (Table 4) carries both the *28 and the *2 alleles and would harbour a potential additive effect regarding reduced glucuronidation of these compounds. However, the high LD between these alleles would make it difficult to perform any differential analysis of the separate phenotypic effect of either of these two isoforms for a drug that is glucuronidated by both isoforms. This is the case unless a study is designed specifically with this aim and genotype screening of potential subjects is used to select and include a sufficient number of individuals with the rare haplotypes in which the *28 and *2 alleles are not co-occurring (haplotypes 5 and 7). The allele variation of UGT1A6 has hitherto been described as co-occurrence of the two amino acid variants, either the T181R184 (*1) or the A181S184 (*2) alleles; however, we describe here the two other possible combinations, *3 and *4 (Table 4). It is unknown whether the suggested reduced function of the *2 allelic variant of the UGT1A6 enzyme involves the amino acids at both of these positions (A181 and S184); therefore, the implications of the novel minor variants with the other alternative combinations (found in haplotypes H6, H8, H12, H14, and H15) of these amino acids are uncertain.

It has earlier been shown that increased bilirubin levels are related to the UGT1A1*28 allele [7, 23]. This association has been shown to be more pronounced in individuals who have been overnight-fasted and have refrained from drugs, alcohol, and smoking, compared with an unrestricted group of individuals such as in our study [7, 23]. In our study, the mean concentration of bilirubin in plasma was lower in women than in men (Table 6), which is in accordance with previous findings [24–26]. Furthermore, women homozygous for the UGT1A1*28 allele had plasma concentrations of bilirubin in the normal range, in contrast to the homozygous men who all had bilirubin concentrations above the normal range (Table 6). Although the study numbers are small, similar results have been reported in other studies where the subjects had undergone restriction regarding food intake [7, 19]. This is believed to explain the high male/female ratio among patients diagnosed with Gilbert’s syndrome [7]. In view of this, hyperbilirubinaemia seems to be a poor indicator for reduced glucuronidation capacity, at least in Caucasian females. This is an important issue because total bilirubin has been suggested to replace genotyping information of the TA insertion when the latter is not available, for example for predicting severe neutropaenia after treatment with the anticancer drug irinotecan [27].

We could detect no independent effect of UGT1A7 on bilirubin levels. The UGT1A7 is probably not involved by itself in the glucuronidation of bilirubin, as the major expression of UGT1A7 has been found in gastric tissue [28]. For drugs in which glucuronidation involves all three UGT1A isoforms, e.g. irinotecan [29, 30], haplotype-based analysis involving cis-acting polymorphisms in different blocks of LD is relevant to consider because it may identify a combined effect associated with a specific haplotype, not evident when analysing the different polymorphisms separately.

Conclusion

The studied genetic variants and haplotypes in the UGT1A gene are common in a Swedish cohort of healthy volunteers, but prevalence may differ due to ethnicity.

Because of the influence of gender on bilirubin levels, total bilirubin has limitations as a phenotype indicator when testing for a UGT1A genotype association with altered glucuronidation capacity. Determination of haplotypes of potential cis-acting polymorphisms in the UGT1A gene should be considered in pharmacogenetic association studies regarding drugs that undergo glucuronidation and when substrate overlap may potentially exist.