Introduction

A wide range of methods for the analysis of pharmacogenetic polymorphisms is now available. These can be divided into phenotyping methods, which determine phenotype either by measurement of enzyme levels or by monitoring levels of a metabolite in blood or urine, and genotyping methods, where DNA is analysed directly for the occurrence of a specific genetic polymorphism, usually following amplification of part of the gene of interest by PCR. Phenotypic analysis is largely restricted to metabolising enzymes, whereas genotypic analysis can be applied to a wide range of genes encoding proteins of pharmacogenetic interest including enzymes, receptors and transporters. Genotyping is now the method of choice for most pharmacogenetic studies and the increasing availability of genome sequence data is likely to increase the number of pharmacogenetic polymorphisms known to result in functionally significant effects on drug disposition or responses. However, the important early discoveries in pharmacogenetics such as the N-acetyltransferase (NAT2) and CYP2D6 debrisoquine/sparteine polymorphisms (Evans et al. 1960; Mahgoub et al. 1977) were achieved by phenotypic analysis and there is still a need for effective phenotyping methods for some types of study. Specific examples of such studies include in vivo drug interaction studies on specific cytochrome P450 isoforms such as CYP2D6 and population studies on certain enzymes such as CYP1A2, CYP3A4 and CYP2E1 where there is some evidence for genetically-determined variability in enzyme levels but the molecular basis of this variation remains unclear. Also, in the case of some polymorphisms, phenotypic analysis may be considered to be more sensitive in identifying all individuals with a particular deficiency than genotyping, where it is not usually possible to screen for certain rare or unknown polymorphisms. This is theoretically the case for the CYP2D6, CYP2C19 and thiopurine methyltransferase (TPMT) polymorphisms. 
However, in comparing assay sensitivity the fact that subject compliance and use of interfering drugs can decrease the accuracy of phenotypic analysis involving probe drug administration should also be considered. Direct enzyme assays such as that for TPMT levels in erythrocytes present fewer problems of this nature. Both types of phenotyping assays require a definition for the abnormal phenotype and occasionally it may be difficult to define phenotype if the measurement for an individual falls at the border of the normal and abnormal phenotype. With genotyping this problem will not arise.
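
The difficulty of assigning phenotype near the border between normal and abnormal can be illustrated with a minimal sketch. It assumes a probe-drug assay reported as a metabolic ratio (parent drug to metabolite) and compared with a cut-off on a log scale. The function name, the cut-off of 10.0 and the 1.5-fold borderline zone are hypothetical values chosen for illustration only and are not taken from any of the studies cited.

```python
import math


def classify_phenotype(metabolic_ratio, cutoff, borderline_fold=1.5):
    """Assign a phenotype from a metabolic ratio (parent drug / metabolite).

    cutoff and borderline_fold are hypothetical, study-specific values.
    """
    if metabolic_ratio <= 0 or cutoff <= 0:
        raise ValueError("metabolic ratio and cutoff must be positive")
    # Compare on a log scale (an assumption of this sketch), so that the
    # borderline zone is symmetrical in fold-change around the cut-off.
    distance = abs(math.log10(metabolic_ratio) - math.log10(cutoff))
    if distance < math.log10(borderline_fold):
        return "borderline"
    if metabolic_ratio > cutoff:
        return "poor metaboliser"
    return "extensive metaboliser"


for ratio in (0.5, 9.0, 40.0):
    print(ratio, classify_phenotype(ratio, cutoff=10.0))
```

Reporting a separate borderline class, rather than forcing every subject into one of two groups, makes explicit which individuals might need repeat phenotyping or genotyping.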

Phenotype determination

The most convenient approach to population phenotyping is to measure levels of enzyme activity or protein in an accessible tissue such as erythrocytes or leukocytes. Unfortunately, many enzymes of pharmacogenetic interest are not expressed at high levels in these cells. This is particularly so for most cytochrome P450 isoforms, which are mainly expressed at high levels only in the liver though there are now some reports of measurements in leukocytes using sensitive methods such as immunoblotting (Raucy et al. 1997). The validity of using levels of enzyme activity in leukocytes as a surrogate for levels in other tissues is however questionable (Hukkanen et al. 1997). Because of the limitations with the use of direct enzyme measurement in blood cells, phenotypic analysis involving administration of probe drugs is the most widely used method for phenotype determination at present. Table 1 summarises the main phenotyping methods, either involving direct enzyme assay or use of a probe drug, that have been used in population studies on interindividual variability in activity for a variety of xenobiotic metabolising enzymes and related proteins.

Table 1 Phenotyping methods for analysis of pharmacogenetic polymorphisms

Methods using probe drugs

As summarised in Table 1, a range of probe drugs has been used for phenotyping studies, particularly for CYP2D6, CYP2C19, CYP2C9, CYP3A4 and NAT2. The subject of cytochrome P450 probe drugs has been reviewed in some detail (Streetman et al. 2000). As discussed by these authors, there are well-established and effective probe drugs available for CYP2D6 and, to a lesser extent, CYP2C9. Though mephenytoin is a widely-used and sensitive probe for CYP2C19, there are concerns about its safety and no completely satisfactory alternative is currently available. A large number of different probe drugs have been used in studies on CYP3A4 but, as discussed by Streetman et al. (2000), there are certain drawbacks to each and no ideal probe for this enzyme in population studies on interindividual variation in activity has yet emerged. There is still controversy about the extent of polymorphism in CYP3A4. An upstream polymorphism has been detected but its functional significance remains uncertain (Rebbeck et al. 1998; Westlind et al. 1999). A number of coding region polymorphisms are also known including one showing substrate-dependent effects on activity (Sata et al. 2000). The existence of substrate-dependent polymorphisms makes the interpretation of phenotyping data relating to CYP3A4 more difficult, as does the fact that 10–20% of livers also express the homologous CYP3A5, which shows some differences in substrate specificity from CYP3A4 (Aoyama et al. 1989; Gillam et al. 1995). However, many phenotyping studies on CYP3A4 are concerned with determining the extent of induction or inhibition by new drugs and currently available phenotyping methods, including the erythromycin breath test and use of midazolam, are useful in such cases. Caffeine is a widely used probe for CYP1A2 activity but there is still uncertainty regarding the precise metabolic ratio that should be used and the functional significance of several upstream polymorphisms (Nakajima et al. 1999; Sachse et al. 1999; Sinues et al. 1999; Sachse et al. 2003).

In the case of non-P450 enzymes, NAT2 has been the most comprehensively studied by use of probe drugs. A wide variety of drugs have been used but caffeine is currently the most convenient probe (Grant et al. 1983). The homologous NAT1 gene has also been shown to exhibit polymorphism at a low frequency using 4-aminobenzoic acid as probe drug (Cribb et al. 1994). Phenotype for the polymorphic flavin-linked monooxygenase FMO3 can be determined by analysis of levels of endogenous trimethylamine in urine (Alwaiz et al. 1989). In the case of UDP-glucuronosyltransferases (UGT), a number of probe drugs have been used in phenotyping studies. Some evidence for the existence of considerable interindividual variation in levels of activity has emerged from these studies but interpretation of the data is made difficult by the possibility that some of the probe drugs used may be substrates for more than one UGT isoform (Liu et al. 1995; Patel et al. 1995; Shimoda et al. 1995; Yue et al. 1997).

Up to the present, the majority of studies using probe drugs have been concerned with measurement of levels of xenobiotic metabolising enzymes only. However, interindividual variation in absorption, distribution and excretion is also clearly of importance to overall drug disposition. In one study, levels of plasma digoxin in healthy volunteers were used as a measure of absorption, and a correlation was found between high plasma levels of the drug and the presence of a polymorphism in the MDR1 gene, which encodes P-glycoprotein (PGP); this polymorphism is apparently linked to low levels of PGP expression (Hoffmeyer et al. 2000). These findings have not been confirmed in all subsequent studies and the precise functional significance of several MDR1 polymorphisms remains controversial (Kim 2002).

Direct measurement

Phenotyping by direct measurement of enzyme activity is a useful technique in the study of many pharmacogenetic polymorphisms affecting drug metabolism but, as mentioned above, there are problems with low or absent enzyme expression in accessible tissues. In the case of the cytochromes P450, enzyme levels are often measured in human liver microsomes, but obtaining suitable liver samples presents some ethical and practical problems. In particular, the extent of interindividual variation in enzyme activity may be overestimated unless all specimens used are in good condition and uniform methods are used for preparation of microsomes.

Use of blood cells presents fewer difficulties if the enzyme of interest is expressed at sufficient level for detection and the data obtained is genuinely representative of gene expression in other tissues as well. As summarised in Table 1, studies using erythrocytes or leukocytes have resulted in the detection of polymorphisms affecting a number of phase II enzymes including the glutathione S-transferases GSTM1 and GSTT1 (Seidegard and Pero 1985; Hallier et al. 1993), the methyltransferases TPMT (Weinshilboum and Sladek 1980) and catechol O-methyltransferase (COMT) (Weinshilboum and Raymond 1977) and the phenol sulfotransferases (Price et al. 1988, 1989). In addition, an apparent polymorphism affecting the extent of CYP1A1 induction by polycyclic aromatic hydrocarbons has been detected in cultured lymphocytes (Kouri et al. 1982).

Genotype determination

Sequence data now exist on a wide range of genes relevant to pharmacogenetics, many of which are known to show polymorphism. The molecular basis of most polymorphisms previously detected by phenotypic analysis is now known and it is now possible to use simple genotyping assays to predict phenotype accurately. In addition, it is also possible to detect new polymorphisms for which no previous phenotypic data existed, either by directly scanning coding or regulatory sequences for polymorphisms or by comparing several independently sequenced cDNA or genomic clones. This is an increasingly important approach, which is being greatly facilitated by the availability of the Human Genome Project sequence data. In particular, it is now possible to obtain data on polymorphism in receptors and other drug targets on which phenotypic studies have been difficult in the past. Polymorphisms, including the most common type, the single nucleotide polymorphism (SNP), occur at typical frequencies of 1 per 1,000 bp in the genome. However, even in coding sequences the majority will not be functionally significant. Once novel polymorphisms have been identified, the next step in their study is usually assessment of their functional significance. In the case of polymorphisms in a protein coding region, the most important consideration is whether they result in any alterations in the protein sequence. If a polymorphism results in a premature stop codon or frameshift, there is a high likelihood of functional significance and many amino acid substitutions will also affect function, particularly if they are non-conservative or occur in an area of the protein critical to structure or function. However, many novel polymorphisms are silent or result in conservative amino acid substitutions. There are reports of some silent polymorphisms altering RNA stability (Milland et al. 1996) but in general such polymorphisms are unlikely to be of functional significance though possibly of interest for linkage studies. Where an amino acid substitution has occurred, direct assessment of its functional significance will require expression of the mutant protein for assessment of its biological properties. It may also be possible to assess functional significance by determining phenotype in individuals or isolated tissue samples positive for the variant genotype of interest.
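
The first question asked of a coding region polymorphism, whether it alters the protein sequence, can be sketched as a lookup in the standard genetic code. The function name and the example codons below are hypothetical and are not taken from any gene discussed here. The sketch handles only single-codon substitutions: frameshifts and the distinction between conservative and non-conservative substitutions are outside its scope.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def coding_consequence(reference_codon, variant_codon):
    """Classify a codon change as silent, premature stop or substitution."""
    ref_aa = CODON_TABLE[reference_codon.upper()]
    var_aa = CODON_TABLE[variant_codon.upper()]
    if ref_aa == var_aa:
        return "silent"
    if var_aa == "*":
        return "premature stop"
    if ref_aa == "*":
        return "stop codon lost"
    return f"amino acid substitution {ref_aa}->{var_aa}"


print(coding_consequence("CTG", "CTA"))  # both codons encode leucine
print(coding_consequence("TGG", "TGA"))  # tryptophan codon becomes a stop
print(coding_consequence("CGC", "TGC"))  # arginine to cysteine
```

A classification of this kind only prioritises polymorphisms for further study; as noted above, the functional effect of a substitution still has to be established experimentally.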

It is more difficult to predict effects of polymorphisms occurring outside protein coding sequences but those in promoter regions may affect gene expression, those in introns can interfere with RNA splicing and those at the 3'-end of a gene may affect RNA stability. The most usual approach to studying functional significance of promoter region polymorphisms is to compare the effects of mutant and wild-type sequence fused to reporter gene constructs. Other approaches to the study of functional effects of polymorphisms in regulatory regions include comparison of transcript levels for the gene of interest between samples from individuals of different genotypes.

Screening for known polymorphisms

Background

Most genotyping assays currently used to screen for known pharmacogenetic polymorphisms involve use of the polymerase chain reaction (PCR) followed by a specific detection step. Other amplification methods such as the oligonucleotide ligation detection assay (OLA) are occasionally used. The type of amplification-based assay used can range from relatively simple but slow and low-throughput methods such as restriction fragment length polymorphism-PCR (RFLP-PCR) analysis, single strand conformational polymorphism (SSCP) analysis or allele-specific PCR analysis to more specialised higher-throughput detection methods such as microarrays and sequencing. The choice of method will be determined by a number of factors including equipment available in the laboratory and number of samples to be analysed now and in the future.

An important issue in pharmacogenetics is whether the genotyping method chosen for a particular gene results in the identification of all possible variant alleles with 100% detection of all individuals of a particular phenotype. This is problematic since, even if DNA sequencing, which should identify all known polymorphisms, is used as the assay method, some individuals with a particular phenotype may have a previously unknown polymorphism present. It will seldom be possible to predict the precise effect of a new polymorphism on phenotype without further experimentation. Designing assay systems that will screen for all very rare polymorphisms also presents difficulty. Thus, in the case of polymorphic genes such as CYP2D6 or NAT2, most current genotyping approaches successfully identify only 95 to 99% of all individuals with the particular phenotype (Chen et al. 1996; Grant et al. 1997; Sachse et al. 1997; Leathart et al. 1998; Cascorbi and Roots 1999). Since, as discussed above, there are also limitations to the accuracy of phenotyping methods, this is generally regarded as adequate for most purposes. Provided appropriate quality control measures are in place (see below), false positive results should not occur with genotyping assays.
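
The relationship between the alleles covered by an assay and the proportion of individuals detected can be illustrated with a simple calculation. The sketch below assumes a recessive phenotype that requires two defective alleles, and assumes that defective alleles combine independently, so that an individual is identified only if both alleles are among those screened. The allele names and frequencies are invented for illustration and do not refer to real CYP2D6 or NAT2 data.

```python
def phenotype_detection_rate(defective_allele_frequencies, screened_alleles):
    """Fraction of two-defective-allele individuals identified by the assay.

    Assumes independent combination of alleles (a simplifying assumption).
    """
    total = sum(defective_allele_frequencies.values())
    covered = sum(
        frequency
        for allele, frequency in defective_allele_frequencies.items()
        if allele in screened_alleles
    )
    allele_detection = covered / total
    # Both alleles of an individual must be recognised by the assay.
    return allele_detection ** 2


# Hypothetical population frequencies of three defective alleles.
frequencies = {"variant_A": 0.20, "variant_B": 0.03, "variant_C": 0.005}
rate = phenotype_detection_rate(frequencies, {"variant_A", "variant_B"})
print(f"{rate:.3f}")  # about 0.958
```

Under these assumptions, an assay that misses only a small fraction of defective alleles misses roughly twice that fraction of affected individuals, which is one reason why rare alleles matter more than their frequency alone suggests.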

There are a wide range of publications concerning successfully amplifying sequences by PCR and a wide range of reagents to make the process easier are now commercially available. However, there are a few general points that are particularly important when performing PCR on genes relevant to xenobiotic metabolism. For many of these genes, it is important to use the highest annealing temperatures compatible with satisfactory amplification of the required sequence with the chosen primers to ensure that other sequences with homology to the gene of interest such as pseudogenes are not also amplified. Other factors important in successful PCR include choice of heat stable DNA polymerase and whether additional refinements such as "hot start" or long PCR conditions are required. Heat stable DNA polymerases from different suppliers vary in their ability to successfully amplify using different primer sets and it is advisable that enzyme from the same supplier should be used consistently for particular reactions.

PCR-RFLP

This method involves digestion of a PCR product with a restriction enzyme that distinguishes between the wild-type and mutant sequence. The method is most valuable when the polymorphism introduces a direct alteration in a restriction site so that a restriction site is lost or created, but even if there is no change in restriction enzyme sites as a result of a polymorphism, it may also be possible to engineer one or more base changes using a mismatched primer so that a new site seen only for either the mutant or wild-type sequence is created. Restriction patterns are analysed by gel electrophoresis on either agarose gels when the difference in fragment sizes is greater than approximately 40 bp or polyacrylamide gels where the difference is smaller such as in the case of an engineered restriction site. For genes where more than one polymorphism can occur such as CYP2D6 or NAT2, it may be possible to amplify a relatively large gene fragment and digest with several restriction enzymes, either together or in parallel reactions (Bell et al. 1993; Daly et al. 1996). With respect to restriction enzyme digestion, it is important to have an internal control for the enzyme activity in each tube to check that complete digestion has occurred. This can be an additional restriction site in the PCR product that is not subject to polymorphism or an additional PCR product with a site for the enzyme that will yield products of different non-interfering size. Careful observation of relative band intensities prior to genotype assignment is also important in avoiding misclassification due to partial digestion.
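
The logic of genotype assignment with an internal digestion control can be sketched as follows. The example assumes a hypothetical 500 bp PCR product with a non-polymorphic control site at position 100 and a polymorphic site at position 350. The function names, positions and the 5 bp sizing tolerance are illustrative assumptions only.

```python
def fragments(amplicon_length, cut_positions):
    """Return sorted fragment sizes after cutting at the given positions."""
    edges = [0, *sorted(cut_positions), amplicon_length]
    return sorted(end - start for start, end in zip(edges, edges[1:]))


def call_rflp_genotype(observed_bands, amplicon_length, control_site,
                       polymorphic_site, tolerance=5):
    """Assign a genotype from band sizes (bp) seen after digestion."""
    def near(a, b):
        return abs(a - b) <= tolerance

    # A full-length band means the invariant control site was not cut.
    if any(near(band, amplicon_length) for band in observed_bands):
        return "incomplete digestion"

    cut = fragments(amplicon_length, [control_site, polymorphic_site])
    uncut = fragments(amplicon_length, [control_site])
    patterns = {
        "homozygous, site present": cut,
        "homozygous, site absent": uncut,
        "heterozygous": sorted(set(cut) | set(uncut)),
    }
    observed = sorted(observed_bands)
    for genotype, expected in patterns.items():
        if len(observed) == len(expected) and all(
            near(o, e) for o, e in zip(observed, expected)
        ):
            return genotype
    return "unexpected pattern"


print(call_rflp_genotype([100, 150, 250], 500, 100, 350))
print(call_rflp_genotype([100, 150, 250, 400], 500, 100, 350))
```

Band sizes alone cannot distinguish a true heterozygote from partial digestion at the polymorphic site, which is why inspection of relative band intensities remains necessary.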

Allele-specific and sequence specific PCR

Allele and sequence specific PCR involves the specific amplification of individual alleles followed by a single detection step, usually gel electrophoresis. Two alleles can be amplified in the same tube providing allele-specific primers that give products of different sizes are included. In some cases where large deletions or insertions occur, the same primers can be used for both alleles with the individual alleles distinguished purely on the basis of electrophoretic mobility. It is also possible to carry out two separate PCR reactions in parallel, one specific for the wild-type allele and one for the variant allele. With allele-specific PCR reactions, it is important that amplification of a non-polymorphic sequence of a different size from a control gene be carried out in the same tube to ensure that lack of product is not simply due to failure of the amplification reaction. Although allele-specific PCR and sequence-specific PCR often work well and involve fewer steps than RFLP-based methods or SSCP analysis, the specificity of the PCR reactions, which is vital to the successful use of methods involving the use of allele-specific primers, may be affected by variations in the laboratory temperature, which presumably result in slight changes in the temperature of the annealing step in the thermocycler.
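
The interpretation of two parallel allele-specific reactions, each containing a control amplification, can be summarised in a short sketch. The function name and the dictionary keys are hypothetical. The essential point is that the absence of an allele-specific product is interpreted only when the control product is present in the same tube.

```python
def call_allele_specific_pcr(wild_type_reaction, variant_reaction):
    """Interpret two parallel allele-specific PCR reactions.

    Each reaction is a dict with boolean entries 'control_band' and
    'allele_band', recording which products were seen on the gel.
    """
    # Without the control product, a missing allele band is uninformative.
    if not (wild_type_reaction["control_band"]
            and variant_reaction["control_band"]):
        return "amplification failed"
    wild_type = wild_type_reaction["allele_band"]
    variant = variant_reaction["allele_band"]
    if wild_type and variant:
        return "heterozygous"
    if wild_type:
        return "homozygous wild-type"
    if variant:
        return "homozygous variant"
    return "no allele-specific product"


result = call_allele_specific_pcr(
    {"control_band": True, "allele_band": True},
    {"control_band": True, "allele_band": False},
)
print(result)  # homozygous wild-type
```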

Some new techniques that overcome the difficulties of achieving specificity and reproducibility in allele-specific PCR have been developed. The dynamic allele-specific hybridisation (DASH) technique (Howell et al. 1999) involves initial amplification of a biotin-labelled PCR product spanning the polymorphic site of interest and immobilisation of one strand to a streptavidin-coated well followed by hybridisation to allele-specific probes at the site of the polymorphism. The melting behaviour of the double stranded DNA is then followed by use of a fluorescent dye that binds to double stranded DNA only, with decreased fluorescence indicating denaturation. Any mismatch will be associated with denaturation at a lower temperature.
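
The way a melting curve of this kind might be reduced to a single melting temperature can be sketched as follows. This is an illustrative assumption about data analysis and not the published DASH procedure: the melting temperature is taken as the midpoint of the interval over which fluorescence falls most steeply. The function name and all readings are invented, and the sketch handles only a single melting transition.

```python
def melting_temperature(temperatures, fluorescence):
    """Estimate melting temperature as the point of steepest signal loss."""
    best_slope = 0.0
    best_temperature = None
    for i in range(1, len(temperatures)):
        interval = temperatures[i] - temperatures[i - 1]
        # Fluorescence falls on denaturation, so the negative slope is used.
        slope = -(fluorescence[i] - fluorescence[i - 1]) / interval
        if slope > best_slope:
            best_slope = slope
            best_temperature = (temperatures[i] + temperatures[i - 1]) / 2
    return best_temperature


temperatures = [50, 55, 60, 65, 70]            # degrees Celsius (invented)
matched_probe = [100, 98, 60, 10, 8]           # arbitrary fluorescence units
mismatched_probe = [100, 50, 12, 10, 9]

print(melting_temperature(temperatures, matched_probe))     # 62.5
print(melting_temperature(temperatures, mismatched_probe))  # 52.5
```

In this invented example the mismatched duplex melts at the lower temperature, as described above.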

SSCP analysis

Single strand conformational polymorphism analysis is currently a widely used polymorphism scanning technique because of its simplicity and relatively high sensitivity (Orita et al. 1989). The basis of the method is that if a double stranded PCR product is denatured and applied to a non-denaturing polyacrylamide gel, both strands will form intrastrand secondary structures whose electrophoretic mobility will differ in a sequence dependent manner. Where a mobility shift is detected, the sample is sequenced to determine the sequence change and its location. The precise sensitivity of SSCP analysis in mutation detection remains controversial, with estimates of approximately 80% commonly quoted for a 300 bp PCR product when the analysis is done under at least two different electrophoretic conditions varying by temperature and/or glycerol content of the gel (Ravnik-Glavac et al. 1994). The original description of the technique involved using 32P-labelled primers and sequencing gel equipment but methods without radiolabelling and using smaller polyacrylamide gels have also been developed (Hongyo et al. 1993). While problems with sensitivity may limit the usefulness of SSCP for mutation scanning, it is frequently a useful technique for routine screening for polymorphisms and generally will give consistent results provided samples are denatured adequately and consistent electrophoresis conditions, especially with respect to temperature, are used. It is particularly important for accurate genotype designation that positive controls of known genotype previously checked by DNA sequencing should be included on each gel when using SSCP analysis for routine detection of polymorphisms.

Other methods

There are a variety of newer methods available for routine genotyping including use of high throughput systems involving microarrays and MALDI-TOF. Genotyping by minisequencing, which involves primer extension of a PCR product using dideoxynucleotides, is a less specialised technique that can be multiplexed (Syvanen 1999). Adaptation to a microarray-based format to achieve higher throughput is also possible. Pyrosequencing is a specialised form of minisequencing that uses iterative nucleotide dispensation with detection of incorporation by real-time measurement of pyrophosphate production (Alderborn et al. 2000). Pyrophosphate is detected in a series of enzymatic reactions that result in light generation. The ability to multiplex and the availability of automated equipment allow high sample throughput.
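
The principle of iterative nucleotide dispensation can be illustrated with a small simulation. It assumes an idealised reaction in which each dispensed nucleotide is incorporated as many times as it matches the next bases to be synthesised, and in which the light signal is proportional to the number of incorporations. The function name, the sequences and the dispensation order are hypothetical.

```python
def simulate_pyrogram(sequence_to_synthesise, dispensation_order):
    """Return (nucleotide, incorporations) for each dispensation step."""
    position = 0
    peaks = []
    for nucleotide in dispensation_order:
        incorporated = 0
        # Extend while the dispensed nucleotide matches the next base.
        while (position < len(sequence_to_synthesise)
               and sequence_to_synthesise[position] == nucleotide):
            incorporated += 1
            position += 1
        peaks.append((nucleotide, incorporated))
    return peaks


order = "GCAGATG"
reference = simulate_pyrogram("GCAATG", order)
variant = simulate_pyrogram("GCGATG", order)
print(reference)  # [('G', 1), ('C', 1), ('A', 2), ('G', 0), ('A', 0), ('T', 1), ('G', 1)]
print(variant)    # [('G', 1), ('C', 1), ('A', 0), ('G', 1), ('A', 1), ('T', 1), ('G', 1)]
```

The two alleles give different peak patterns for the same dispensation order, which is the basis for genotype assignment. In this idealised picture a heterozygous sample would show a mixture of the two patterns.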

Quality control issues

In addition to the normal controls needed in carrying out any amplification by PCR, it is important that control DNA samples of known genotype are analysed in parallel with test samples regardless of the method being used for genotyping. In the case of rare polymorphisms, obtaining adequate amounts of DNA may be difficult. Use of cloned DNA rather than human genomic DNA as a control is an alternative. As discussed previously, there are particular problems concerning specificity that may be encountered when amplifying P450 genes and other members of multigene families because of the existence of highly homologous genes. These problems can be dealt with by careful design of primers, use of high annealing temperatures and checking the sequence of PCR products when setting up the assay initially. If problems persist, use of nested PCR may enable greater specificity to be achieved. Quality control issues in relation to NAT2 genotyping have been considered in detail (Cascorbi and Roots 1999) but many of the points discussed by these authors are also relevant to the analysis of other polymorphisms.

Identification and characterisation of novel polymorphisms

In spite of the large number of pharmacogenetic polymorphisms now detected and characterised, it is clear that many more exist. Provided genomic sequence information on the gene of interest is available, there are a number of different approaches to detecting novel polymorphisms (for review see Cotton 1997). DNA sequencing is probably the most accurate though if double stranded PCR products are sequenced there is a risk of missing some heterozygosity, even if traces from automated sequencing are scanned visually. In general, sequencing is a particularly useful method for mutation detection when a small number of samples of a known phenotype are available. Random population scanning of up to 100 DNA samples from particular ethnic groups for polymorphisms in genes of interest is also being increasingly performed (Cargill et al. 1999). Techniques such as SSCP and denaturing gradient gel electrophoresis (DGGE) analysis can be used for such studies but denaturing HPLC (DHPLC), which involves detection of heteroduplexes in PCR products on the basis of a mobility shift on an ion-paired reversed-phase liquid chromatography column, is being increasingly used. Although the equipment is relatively expensive initially, the technique offers the advantages of speed, the ability to automate and low running costs. It has a sensitivity for polymorphism detection in excess of 90% (Xiao and Oefner 2001). DHPLC is primarily a method for detecting heteroduplexes so samples homozygous for a mutation can be missed unless wild-type PCR product is added to form artificial heteroduplexes.
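
The reason for adding wild-type product to homozygous samples can be shown with a short calculation. It assumes that strands reanneal at random after denaturation, which is a simplification, so that the proportion of each duplex type depends only on the fraction of wild-type strands in the mixture. The function name is hypothetical.

```python
def duplex_fractions(wild_type_fraction):
    """Expected duplex proportions after random reannealing of strands."""
    f = wild_type_fraction
    return {
        "wild-type homoduplex": f * f,
        "variant homoduplex": (1 - f) ** 2,
        "heteroduplex": 2 * f * (1 - f),
    }


# Homozygous variant sample analysed alone: no wild-type strands present.
print(duplex_fractions(0.0)["heteroduplex"])  # 0.0
# Same sample mixed 1:1 with wild-type PCR product.
print(duplex_fractions(0.5)["heteroduplex"])  # 0.5
```

Under this assumption a homozygous variant sample analysed alone forms no heteroduplexes, whereas a 1:1 mixture with wild-type product behaves like a heterozygote.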

Conclusion

A variety of approaches to the analysis of both phenotype and genotype have been discussed. At present, the relationship between phenotype and genotype is well understood for a few genes concerned with drug disposition but there is still a need for further studies on aspects such as ethnic variation in patterns of polymorphisms and for increased understanding of the extent of polymorphism in the full range of genes encoding proteins relevant to drug disposition. In the case of drug targets, the amount of information on polymorphisms is more limited and there is a need to understand more about both the extent of polymorphism and the phenotypic effects of polymorphisms if individualisation of drug treatment is to become a reality. This greater understanding is likely to be facilitated both by the availability of sequence information from the Human Genome Project and by the development of improved methods for polymorphism detection.