Introduction

Isopentenyl-diphosphate isomerase (IDI) catalyzes a central reaction in the biosynthesis of isoprenoids: by converting isopentenyl diphosphate (IPP) to its highly electrophilic isomer dimethylallyl diphosphate (DMAPP), it activates the basic isoprenoid unit for polymerization. It is, therefore, necessary for the synthesis of a wide variety of essential cellular metabolites, including sterols, steroid hormones and bile acids, farnesol, geraniol, geranylgeraniol, dolichols and their derivatives, ubiquinones, carotenoids, vitamins A, D, E and K, and many secondary metabolites (reviewed in Ramos-Valdivia et al. 1997a). IDI is found in all free-living eukaryotes examined and in a wide range of eubacteria, but is absent in most archaea, which (with the possible exception of Halobacterium sp. NRC-1) use an unrelated flavoprotein for catalyzing the interconversion of IPP and DMAPP (Boucher and Doolittle 2000; Kaneda et al. 2001; Smit and Mushegian 2000). Hamster IDI has been shown to be a peroxisomal protein, targeted by a C-terminal variant PTS-1 signal (-HRM [Paton et al. 1997]). This was not surprising, given the predominantly peroxisomal localization of upstream and downstream reactions of the pathway in mammals (reviewed in Kovacs et al. 2002). In the liver of rats treated with cholesterol biosynthesis inhibitors (fluvastatin, lovastatin), IDI is the most highly upregulated protein identified (Steiner et al. 2000, 2001). In several physiological situations IDI has been suggested to be rate-limiting for isoprenoid biosynthesis (Albrecht and Sandmann 1994; Ramos-Valdivia et al. 1997a; Steiner et al. 2000).

IDI is duplicated in many plants and algae, where isozymes are reported to be subfunctionalized by different localization and different expression patterns. Examples include the tobacco plant (Nicotiana tabacum), where the level of one isozyme mRNA was increased under high-salt and high-light stress conditions, while that of the other was increased under high-salt and cold stress conditions (Nakamura et al. 2001); the quinine tree (Cinchona robusta), where one isozyme is specifically upregulated in response to elicitation by phytopathogenic fungi (Ramos-Valdivia et al. 1997b); and the green alga Haematococcus pluvialis, in which one isozyme is preferentially upregulated in high-light conditions (Sun et al. 1998). Arabidopsis thaliana also has two IDI genes, each consisting of six exons, but no functional or expressional differences are known (Campbell et al. 1998). The duplication in Arabidopsis is part of a larger region of conserved synteny between chromosome 3 and chromosome 5 (Blanc et al. 2000; The Arabidopsis Genome Initiative 2000; Vision et al. 2000), and, as in all other plants, the different isoforms show about 90% sequence identity. It is noteworthy that in both Cinchona and Haematococcus the shorter isoform is upregulated specifically in response to a noxious stimulus (pathogenic fungus and high light, respectively) and may be involved in the production of protective compounds such as anthraquinone phytoalexins (in Cinchona) and antioxidant carotenoids (in Haematococcus).

Enzymatically distinct isoforms of IDI have been described in pig and chicken (Bruenger et al. 1986; Sagami and Ogura 1983). During the cloning of human IDI we recently found that the IDI gene is also duplicated in humans (genomic contig: AF291755). This finding was confirmed when the human and mouse genome drafts became available: in both mammals IDI is encoded by two tandemly duplicated genes. Here we report a detailed phylogenetic and structural analysis of that duplication, focusing on the possible functional significance of widespread duplications in a central metabolic enzyme. Two scenarios are possible. On the one hand, the duplications might be favored over neutral mutations, because the gene is facing recurrent deleterious mutation or because an additional copy is directly advantageous (Clark 1994). On the other hand, the IDI duplications are not necessarily functionally relevant, because the rate of gene duplications is of the same order of magnitude as the rate of mutation per nucleotide site, so that 50% of all genes in a genome are expected to duplicate (and become established at high frequency in the species) at least once every 350 million years (Lynch and Conery 2000). In addition to the general evolutionary interest of this question, a possible subfunctionalization of IDI for different subcellular functions might open interesting avenues for therapeutic interventions in such diverse diseases as hypercholesterolemia or cancer (via modulation of protein isoprenylation [Wong et al. 2002]).

To determine the relevance of the IDI duplication in mammals and to provide direction for further experimental studies we performed a comprehensive bioinformatic analysis of the IDI genes in human and mouse.

Materials and Methods

Sequence Analysis

Protein, EST, and genomic sequences were obtained from the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/) and the Joint Genome Institute (http://www.jgi.doe.gov/). Sequence alignments were performed with ClustalW (Thompson et al. 1994). The differences between ancestral IDI1 and IDI2 in the most recent common ancestor of human and rodents were estimated by maximum parsimony using ProtPars in the Phylip package (Felsenstein 1989), based on an alignment of all mammalian IDI sequences and using the fugu IDI as an outgroup. The Phylip package was also used for the phylogenetic analysis by maximum parsimony and neighbor joining. Alu sequences were detected and classified using Censor (Jurka et al. 1996). Nucleotide alignments of large genomic sequences were performed with DiAlign (Morgenstern et al. 1998).

Structural Analysis

Two X-ray structures of E. coli IDI have been published recently (Durbecq et al. 2001) and served as templates for molecular modeling using SwissModel (Guex and Peitsch 1997). The quality of the models was assessed by WhatCheck (Hooft et al. 1996), and the results of this assessment were used to manually refine the alignment between target and template sequence. For an assessment of the three-dimensional distribution of differences between IDI1 and IDI2, amino acid positions in IDI were classified into three conservation classes: highly conserved, highly variable, and all the remaining residues. Highly conserved are those amino acids that are identified as functionally important (Durbecq et al. 2001), based on the X-ray structure of E. coli IDI, and are conserved in all known IDI homologues from eubacteria and eukaryotes. Highly variable are positions that differ between the three rodent enzymes for which sequences are available (mouse, rat, hamster). This method of classification focuses on differences that occur at the same timescale as the duplication and is, therefore, preferable to alternative methods, e.g., those based on overall sequence variability in all species.
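The three-class scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `classify_positions`, the toy sequences, and the rule that a functionally important position takes precedence over rodent variability are all hypothetical and not taken from the original analysis. The set of functionally important positions is assumed to have been filtered beforehand for conservation across all known homologues.

```python
def classify_positions(rodent_alignment, functional_positions):
    """Assign each alignment column to one of three conservation classes.

    rodent_alignment: equal-length aligned sequences (e.g. mouse, rat, hamster).
    functional_positions: 0-based column indices regarded as functionally
        important and universally conserved (assumed to be pre-filtered).
    """
    length = len(rodent_alignment[0])
    if any(len(seq) != length for seq in rodent_alignment):
        raise ValueError("aligned sequences must have equal length")

    classes = []
    for i in range(length):
        residues = {seq[i] for seq in rodent_alignment}
        if i in functional_positions:
            # Assumption: functional importance overrides rodent variability.
            classes.append("conserved")
        elif len(residues) > 1:
            # The column differs among the rodent enzymes.
            classes.append("variable")
        else:
            classes.append("other")
    return classes


# Toy example with made-up five-residue sequences (not real IDI data).
toy_alignment = ["MCEHA", "MCEHS", "MCDHT"]
print(classify_positions(toy_alignment, {1}))
```

The resulting class sizes are what the statistical analysis takes as input: the number of sites in each class and the number of IDI1/IDI2 differences falling into it.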

Statistical Analysis

The same classification as in the structural analysis was used to examine whether differences between IDI1 and IDI2 occur preferentially in highly variable regions and spare more conserved regions of the protein. If n is the total number of sites, x the number of sites in the class of interest, and m the total number of mutations observed, then the probability of observing no mutations in the class of interest is

$$P(0) = \frac{\binom{n-x}{m}}{\binom{n}{m}} = \prod_{i=0}^{m-1} \frac{n-x-i}{n-i}$$

From the same principle, the probability of observing exactly y mutations in the class of interest can be deduced as

$$P(y) = \frac{\binom{x}{y}\binom{n-x}{m-y}}{\binom{n}{m}}$$

which can be conveniently reduced for exact calculation even in the case of large numbers. The significance of differences from random expectations was calculated by adding up the probabilities of observing y or fewer mutations for the highly conserved class and y or more mutations for the highly variable class, respectively. A two-tailed test was done for the remaining residues, for which no a priori expectation for the direction of deviation from randomness exists. As this approach is independent of a detailed mutational model and does not explicitly consider reversals and multiple mutations at the same site, it tends to systematically underestimate the significance of differences in mutational rate, thus making conclusions derived from it more robust.
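The one-tailed probabilities described in this section can be computed exactly with integer arithmetic. The following minimal sketch assumes that the m mutated sites are drawn without replacement from the n sites, i.e., a hypergeometric model, which is our reading of the description; the function names and the toy numbers are hypothetical and do not correspond to the values in Table 2.

```python
from fractions import Fraction
from math import comb


def prob_exactly(n, x, m, y):
    """Probability that exactly y of m mutated sites fall in a class of x sites,
    when the m sites are drawn without replacement from n sites in total."""
    if y < 0 or y > min(x, m):
        return Fraction(0)
    return Fraction(comb(x, y) * comb(n - x, m - y), comb(n, m))


def prob_at_most(n, x, m, y):
    """One-tailed probability of y or fewer mutations (highly conserved class)."""
    return sum((prob_exactly(n, x, m, k) for k in range(0, y + 1)), Fraction(0))


def prob_at_least(n, x, m, y):
    """One-tailed probability of y or more mutations (highly variable class)."""
    upper = min(x, m)
    return sum((prob_exactly(n, x, m, k) for k in range(y, upper + 1)), Fraction(0))


# Toy numbers only: 20 sites, 5 in the class of interest, 4 mutations observed.
p_none = prob_at_most(20, 5, 4, 0)
print(p_none, float(p_none))  # 91/323, about 0.28
```

Using `Fraction` keeps every intermediate value exact, so the result is only converted to a floating-point number for display.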

To test whether the pattern of differences observed between human and mouse IDI2 is significantly different from that between the reconstructed ancestral IDI1 and IDI2, we replaced x in the calculations above by

, where m_ancestor is the number of differences between the two ancestral sequences and m_expected is the number of mutations expected if mutations occurred randomly in the sequence. This essentially amounts to determining the “mutational sensitivity” of the residues in the class of interest that maximizes the likelihood of observing the differences of the ancestral sequences, and then testing whether the pattern seen in the recent sequences could be explained by the same sensitivities.

Results

Gene Structure

IDI is tandemly duplicated in the human and mouse genomes (Fig. 1). In humans, the two genes are situated in the subtelomeric region of the short arm of chromosome 10, close to framework marker D10S1857. The mouse homologues of the two human isozymes are encoded by tandemly arranged genes in the subcentromeric region of chromosome 13, which shows conserved synteny to the subtelomeric region of human chromosome 10 (Hudson et al. 2001; Rheault et al. 1999). The complete genomes of lower vertebrates (fugu fish, Takifugu rubripes) and protochordates (sea squirt, Ciona intestinalis) lack the duplication. As one of the duplicated genes is highly conserved and corresponds to the known active enzyme (Hahn et al. 1996), it has been named IDI1, while the other is provisionally referred to as IDI2. This second copy is much more divergent, evolving much faster in both human and mouse (Fig. 2); however, it still retains considerable similarity with IDI1 (65% identity at the amino acid level). In both human and mouse, the exon-intron structure (five exons, four of them translated) and the numbers of codons in the open reading frames are exactly conserved between IDI1 and IDI2. As discussed in more detail below, human and mouse IDI2 share several unusual and highly characteristic mutations that, together with the tandem gene arrangement, clearly establish the orthology relationship between the two proteins.

Figure 1

Gene and transcript structure of the duplicated human IDI genes. The upper panel indicates the exon-intron structure and the position of Alu repeats. Translated regions of the exons are shaded. Subclasses of Alu repeats are named following Batzer et al. (1996). The lower panel shows the six different transcripts derived from the IDI locus.

Figure 2

Phylogenetic tree of animal IDI proteins. The tree was reconstructed by neighbor-joining analysis of IDI protein sequences (gaps removed) and rooted with the yeast IDI. Bootstrap support in 100 pseudoreplicates is indicated at the branches (upper number, maximum parsimony; lower number, neighbor joining).

IDI1 Pseudogenes

Both the mouse and the human genome contain a number of processed IDI pseudogenes, probably derived by retroposition (Table 1). Phylogenetic and genomic analysis shows that these pseudogenes originated from IDI1 after the human-rodent split (not shown), i.e., they show much closer similarity at the protein and nucleotide level to the IDI1 of their organism of origin than to any other IDI. Only one of the pseudogenes maintains an intact open reading frame, and this is an almost-exact duplicate of the IDI1 cDNA even in third-codon positions, indicating a very recent genesis. The corresponding proteins were predicted by the National Center for Biotechnology Information by automated computational analysis, and some of them either lack part of the IDI1 sequence because of nonsense mutations (e.g., XP_13653) or include additional unrelated sequence (e.g., XP_092157). There is no indication that the pseudogenes are actually transcribed.

Table 1 Processed pseudogenes of IDI1 in mouse and human

Expression Patterns

To determine the expression pattern of the two isozymes we searched the dbEST database with the cDNA sequence of both genes. As the two nucleotide sequences show almost no similarity, this search can reliably discriminate transcripts of IDI1 and IDI2. In both human and mouse, IDI1 is highly and ubiquitously expressed (146 human ESTs, 66 murine ESTs), while ESTs for IDI2 are very rare (8 human ESTs, 3 murine ESTs) and, in humans, are almost exclusively restricted to skeletal muscle (6 ESTs). In the mouse all three IDI2 ESTs are from embryonic or neonate head. The number of skeletal muscle ESTs for IDI2 is comparable to that of muscle-specific proteins like myosin light chain kinase 2, but even in that tissue the level of expression of IDI2 is hardly higher than that of IDI1 (four skeletal muscle ESTs). RT-PCR cloning and sequencing from adult mouse liver confirmed the very low levels of IDI2 expression (data not shown). The much higher level of IDI1 expression is also indicated by the number of processed pseudogenes for that gene in both human and mouse (at least two in human and five in mouse; Table 1), assuming that, all other factors being equal, the number of processed pseudogenes should roughly correlate with the number of mRNA molecules available for retroposition. In addition to the two different IDI transcripts, EST analysis also detects four other (probably untranslated) transcripts from the IDI locus in humans, three of them different splice products of the hypothetical gene HT009. All four of these transcripts are oriented in antisense direction compared to the IDI genes and overlap with parts of the coding region of IDI2 and, in the case of the HT009 transcripts, of IDI1 (Fig. 1). The absence of intronic sequences and the presence of poly(A) tails show that these ESTs are authentic transcription products and might be involved in the regulation of the IDI locus. 
This is especially the case for the transcript overlapping the last exon of IDI2, which is expressed almost ubiquitously at much higher levels than IDI2 itself (35, compared to 8 human ESTs). None of the antisense transcripts is detected in mouse and the corresponding sequences are not conserved.

Phylogenetic Analysis

A phylogenetic analysis of known IDI proteins reveals that the duplication of IDI occurred independently in several plants and in mammals, but also indicates that both mouse and human IDI2 are the result of a single ancient duplication event (i.e., before the human-mouse split about 70 MYA [Eizirik et al. 2001]). The tree of animal IDIs (Fig. 2) shows that mouse and human IDI2 are diverging at a much faster rate than IDI1 (or the plant IDIs, which are not included in the figure). Similarly high divergence rates are seen in Ciona intestinalis, Drosophila, and C. elegans, which are all cholesterol auxotrophs and use their IDI only for the synthesis of nonsterol isoprenoids. The highly different rates of evolution make resolution of the deep branching pattern within the vertebrates difficult: IDI1, IDI2, and fugu IDI essentially form an unresolved trichotomy. However, whole-genome analysis supports the topology shown in Fig. 2, as it minimizes the number of gene losses in the fish, which has only one IDI homologue.

Subcellular Localization

Maximum parsimony analysis predicts that the C-terminal tripeptide (peroxisomal targeting signal) of the most recent common ancestor of mouse and human IDI2 was probably the same as in the peroxisomal hamster IDI1 [-HRM (Paton et al. 1997)], indicating that for some time after the duplication both isoforms maintained a peroxisomal localization. Later, at least the mouse IDI2 lost its peroxisomal targeting signal (C-terminus: -YGL), and the sequence of the human IDI2 C-terminus (-HRV) is only slightly reminiscent of a canonical targeting signal (Subramani et al. 2000). The same is, however, true for the human IDI1 (-YRM), hence the evidence for a difference in subcellular localization of the two isozymes remains inconclusive. In addition to mutations in canonical targeting signals, subfunctionalization of IDI2 might also involve differences in interaction partners. In that case it would be predicted that differences between the isozymes concentrate at the new interaction surface. However, mapping the differences between IDI1 and IDI2 on a structural model of the protein reveals that they are distributed evenly over the whole protein (Fig. 3).

Figure 3

Distribution of differences between human IDI1 and IDI2 on a structural model of IDI2. The differences are color-coded corresponding to the severity of the change. Mutations of varying severity are found all over the protein and are not concentrated in specific regions. Red, Cys87Ser mutation in the catalytic center; orange, Asn84Asp and Tyr137His mutations of highly conserved residues; yellow, mutations that change charge or hydrophobicity in moderately conserved positions; blue, mutations that change charge or hydrophobicity in hypervariable sites; cyan, conservative mutations that leave charge and hydrophobicity unchanged; green, isopentenyl diphosphate. Only the C-alpha atom is shown for each residue.

Active-Site Analysis

As described above, IDI2 is diverging much faster than IDI1, indicating that this protein is probably no longer acting in housekeeping isoprenoid biosynthesis. To address whether IDI2 is an active isomerase at all, we examined its putative catalytic center in more detail. Several residues at the active site of IDI2 are mutated in human and mouse compared to all other IDI proteins, but all of those mutations are relatively conservative: Cys > Ser, Tyr > His, Asn > Asp. The physicochemical (“Grantham”) distance from Cys to Ser and from Asn to Asp is the smallest of all possible exchanges for these two amino acids, and the Tyr > His exchange is also relatively mild (i.e., its Grantham distance is below average [Grantham 1974]). In addition, the Tyr-His and the Cys-Ser pairs are adjacent in the amino acid exchangeability scheme of Argyle (Argyle 1980; Pieber and Toha 1983). The mutated cysteine residue is critically important for the catalytic function of IDI1. Targeted replacement by a serine all but inactivates the yeast homologue (Street et al. 1994). However, some activity is retained, in contrast to a cysteine-to-alanine exchange. Analysis of the three-dimensional arrangement of the three active-site mutations of IDI2 suggests a possible adjustment of the catalytic mechanism, as indicated in Fig. 4, which predicts significant isomerase activity in IDI2.

Figure 4

Active site of bacterial IDI and proposed mechanism for IDI2. A Structure of the active site of E. coli IDI based on X-ray data (Bonanno et al. 2001; Durbecq et al. 2001). B Proposed reaction mechanism of IDI1, involving a deprotonation/reprotonation of IPP by cysteine 67 and glutamate 116, respectively (modified after Durbecq et al. 2001). C Possible modification of the reaction mechanism for IDI2. Exchange of asparagine 64 by aspartate may sufficiently increase the acidity of serine 67 to enable protonation of IPP. Amino acid numbering refers to the bacterial protein for ease of comparison.

Purifying Selection

The mutations of crucial active-site residues of IDI2 raised the possibility that it might be a slowly decaying pseudogene that maintains an open reading frame but is otherwise accumulating random mutations that will ultimately lead to its demise. This is, however, not the case: The power of purifying selection acting on IDI2 can be appreciated by comparing it to the several processed pseudogenes that are present in the mouse and human genome. Those genes are much younger than IDI2 (after the mouse-rat split in the case of the two mouse pseudogenes as determined by parsimony analysis, not shown) and are still very similar to their ancestral cDNA, but all but one of them have already lost their ORFs by nonsense and frameshift mutations. That IDI2 is under strong purifying selection can also be demonstrated directly at both the DNA and the protein level. Figure 5 shows that only translated regions of IDI2 show detectable similarity to the IDI1 gene, while introns and untranslated exons have diverged beyond recognition. This pattern would not be expected had the mutations been fixed randomly. Figure 5 also shows that the distribution of Alu repeats in the IDI1 and IDI2 genes is independent, i.e., the duplication occurred before the Alu expansion and IDI2 has survived a massive onslaught of Alu integrations intact. At the protein level, differences between the two isozymes also deviate significantly from random expectations (Table 2). Highly conserved sites are preferentially guarded against fixed mutations, while those residues that show differences among mammalian IDI1 proteins tend to differ also between IDI1 and IDI2. This is true for both human and mouse IDIs. In addition, comparison of human and murine IDI2 shows that purifying selection is maintained even after the acquisition of disabling mutations (i.e., after the human-mouse split; Table 2, middle column). 
On the other hand, comparison of the ancestral IDIs as reconstructed by parsimony analysis indicates that IDI2 underwent an initial phase of random mutation after the duplication, i.e., differences between the two ancestral forms not only affected the highly conserved active site, but also do not show a preference for highly variable sites (Table 2, last column). If mutations in the active site were acquired at the same rate after the human-mouse split, the probability of observing no further changes in the active site is p < 0.05. For the highly variable sites the probability of the observed number of changes is p < 0.01, if only reliably reconstructed ancestral residues are taken into account, but is not different from random expectations when all ambiguous residues are assumed to be mutated. Thus, comparison of the differences between the ancestral enzymes and the differences within the IDI2 lineage shows that the rate of change in the two phases of IDI2 evolution is significantly different.

Table 2 Purifying selection at the protein level, showing the number of mutations in each conservation class (see Materials and Methods)

Figure 5

Conservation of the nucleotide sequence of human IDI1 and IDI2. Similarity of the two sequences in a pairwise comparison (according to the DiAlign score) is shown in arbitrary units. Only translated regions show detectable similarity although both genes have the same gene structure. Several Alu elements have independently invaded the loci of both genes after the duplication and their similarity is not the result of special selection pressure but of a more recent common ancestry.

Discussion

The duplication of essential enzymes of primary metabolism is relatively uncommon compared to the pervasive and extensive duplications seen, for example, in transcription factors, protein kinases, or cell surface receptors, and requires special explanation if it is not to be considered fundamentally redundant. Duplications of IDI were first described and analyzed in plants, where sufficient experimental evidence exists to suggest that in most cases one of the isoforms is specialized for temporally and spatially restricted isoprenoid biosynthesis tasks, such as the production of antioxidant carotenoids or fungicidal anthraquinones (Nakamura et al. 2001; Sun et al. 1998). It was therefore natural to assume that the mammalian duplication has a similar function and might even be the result of a common ancient duplication event. Our analysis, however, demonstrates that the IDI gene duplication in mammals differs from the well-characterized duplication in plants in several important aspects:

1. Mechanism of duplication. In contrast to the tandem duplication of mammalian IDI (most likely caused by unequal crossing-over), an analysis of the genomic context of the two Arabidopsis isozymes indicates that the plant genes are remnants of a whole-genome polyploidy event that occurred at least 100 million years ago (data not shown; Blanc et al. 2000; The Arabidopsis Genome Initiative 2000; Vision et al. 2000). The duplications in plants and mammals are due to independent events.

2. Level of sequence conservation. Mammalian and plant IDI isozymes arose during approximately the same time range, but while the coding sequences of the plant isoforms are highly conserved, sharing about 90% identity, the mammalian IDI1 and IDI2 genes are only 65% identical. On the other hand, the identity between the IDI1 genes of mouse and human, at about 87%, is comparable to that among the plant isozymes.

3. Functional divergence. In mammals one isozyme (IDI1) shows obvious signs of functional conservation while the other form is much more divergent from the common IDI pattern observed in other eukaryotes. No such difference is found in plants, where the isozymes cannot be classified based on sequence patterns. In contrast, the plant isozymes differ in expression and probably in physiological role, with one enzyme being responsible for the housekeeping synthesis of isoprenoids and the other meeting specialized demands.

These differences prevent the direct application of the botanical findings to the mammalian duplication, which has to be approached on its own terms. However, one similarity stands out: In both plants and mammals, one of the genes maintains general housekeeping functions. Mammalian IDI1 by sequence conservation, functional studies and expression pattern is easily identified as the “conservative” gene. The following analysis will accordingly focus on the question: Does IDI2 have a novel, nonredundant function setting it apart from its twin IDI1? In our discussion we consistently apply the systematic “respectful” terminology of Brosius and Gould (1992). Our central question can thus be rephrased as follows: Is IDI2 a poto-, napto-, or xaptogene?

IDI2 Is Not a Naptogene. A naptogene as defined by Brosius and Gould (1992) is a gene without function, on the way to obliteration as genomic noise. Several findings argue strongly against this interpretation of the human duplication. First, IDI2 is expressed (although antisense transcripts may interfere with translation); second, it retains an intact open reading frame; and most importantly, it is under effective purifying selection on the amino acid and nucleotide level. The vast majority of duplicated genes disappear after a few million years (half-life of duplications about 7.3 million years [Lynch and Conery 2000]). IDI2 originated more than 70 Mya (i.e., before the human-mouse split) and has survived in good shape for a much longer time than could be explained for a gene without function. On the other hand, even if a gene has only a very small but measurable fitness effect it is, in evolutionary terms, tantamount to essential (Hirsh and Fraser 2001), so that even both copies of a duplicated gene may be subject to purifying selection regularly (Hughes and Hughes [1993], for tetraploid Xenopus). Our results indicate that IDI2 should have at least such a minimal fitness effect and is consequently not a naptogene.
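The half-life argument above can be made concrete with a rough calculation. Assuming, purely for illustration, that functionless duplicates are lost by simple exponential decay with the quoted half-life of 7.3 million years (this decay model is an assumption of the sketch, not a result reported here), the fraction expected to persist for t = 70 million years is

$$S(t) = 2^{-t/t_{1/2}}, \qquad S(70) = 2^{-70/7.3} \approx 1.3 \times 10^{-3}$$

that is, on the order of one in a thousand. Under this simplified model, the survival of an intact IDI2 open reading frame over this period would be an unlikely outcome for a gene without function.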

IDI2 Was, but No Longer Is, a Potogene. Immediately after a gene duplication event one copy of the gene is always redundant and can rapidly acquire mutations that modify its original function. A further change (e.g., in targeting or expression pattern) may then lead to the exaptation of this copy for a new and no longer redundant function, i.e., it is a potentially new gene (potogene). The parsimony analysis of ancestral IDI2 indicates that IDI2 went through a potogene phase immediately after its duplication but had acquired a new function by the time of the rodent-human split. This is manifested in the change of mutation pattern: During the potogene phase mutations were fixed in apparently random positions, but after the rodent-human split changes appear preferentially in more variable sites, sparing the essential residues of the active center. While the extremely low expression of IDI2 (8 human ESTs vs. 146 for IDI1) might be characteristic of a gene still waiting for a new function, this pattern of conservation argues for the hypothesis that IDI2 is no longer redundant.

IDI2 Is a Xaptogene. A xaptogene is defined as a gene that has undergone exaptation (Gould and Vrba 1982), in other words, it has acquired a novel function, e.g., by expression in a new subcellular localization or in specialized tissues. As a xaptogene IDI2 may retain its function as an isomerase enzyme despite the changes in the active site, e.g., with the altered mechanism suggested in Fig. 4. Inhibitor studies on chicken IDIs show that some isoforms may have a cysteine-less mechanism (Sagami and Ogura 1983), and the “conservativeness” of the active site mutations (Cys > Ser, Tyr > His, Asn > Asp) suggests a possible modification of the catalytic mechanism. In that case IDI2 may have a skeletal muscle-specific enzymatic role, e.g., in the synthesis of geranylgeranyl diphosphate for the specific modification of proteins (compare the statin-induced myopathy and rhabdomyolysis due to geranylgeranylation deficiency [Flint et al. 1997a, b]). But IDI2 may also have acquired a completely new function, e.g., as a structural protein (with or without loss of enzymatic activity), an exaptation similar to eye-lens crystallins (Piatigorsky 1998). The low level of expression does not argue against this, as structurally important components are not necessarily bulk materials. Having acquired a new enzymatic function is extremely unlikely at the level of sequence similarity seen between IDI1 and IDI2 (Todd et al. 2001; Wilson et al. 2000). The large amount of sequence change that would be necessary to transform the enzymatic activity of IDI can be appreciated from a comparison with its closest relatives, the nudix/MutT family of hydrolases. While these proteins show extensive similarity to IDI in 3D structure, their pattern of sequence conservation is drastically different: Three of seven positions of the characteristic nudix motif (Gly-x5-Glu-x7-Arg-Glu-x2-Glu-Glu-x-Gly) are either changed or not conserved in IDI. 
In contrast, the essential histidine and glutamate residues involved in metal binding and catalysis in IDI are localized in areas that show little conservation in the rest of the nudix superfamily (data not shown). But it should be noted that even a very small change in protein sequence can sometimes change enzymatic function, e.g., a single amino acid replacement can transform Bacillus lactate dehydrogenase into malate dehydrogenase (Wilks et al. 1988). Whatever the new function of IDI2 may be, it is obviously restricted to a very specific set of tissues—one of them being skeletal muscle—which could explain the low level of overall expression. The adoption of a novel, nonredundant function by IDI2 is supported not only by the strong purifying selection acting on the gene but also by the observed two-phase mutation pattern (random mutations first—purifying selection later). Even though we observed strong purifying selection, the data are also consistent with the idea that positive selection is contributing to IDI2 evolution, especially during the initial rapid divergence of the two isoforms. However, the data do not allow the discrimination between random (neutral) mutations and positive selection for these ancient events. If IDI2 is indeed involved in the local production of farnesyl and geranylgeranyl diphosphate for the posttranslational modification of skeletal muscle proteins, polymorphisms in the IDI2 gene may well be responsible for differences in the susceptibility toward statin-induced rhabdomyolysis, the major lethal complication observed under treatment with these highly efficient cholesterol-lowering drugs (Evans and Rees 2002; Hodel 2002).

While our data suggest that IDI2 may be a cytosolic skeletal muscle-specific isomerase, the evidence for this remains inconclusive and needs to be supported by a further molecular characterization of the protein. We do, however, present compelling arguments that IDI2 is a functionally important protein and that such further efforts will indeed be worthwhile.