Introduction

Tocopherols and tocotrienols function as the principle lipid-soluble oxidation chain-breaking antioxidants in biological membranes and lipoproteins (Liebler 1998; Brigelius-Flohé et al. 2002). In contrast to cytoplasmatic systems, such as the glutathione redox cycle, or the superoxide dismutase, which depend on enzymatic inactivation of oxygen radicals, the antioxidant reactions of tocochromanols do not require involvement of enzymes. The physiological role of tocopherols and tocotrienols is thought to be the protection of polyunsaturated fatty acids (PUFA) from lipid oxidation by quenching free radicals in cell membranes and other lipophilic environments (Kamal-Eldin and Appelqvist 1996). Tocopherols and tocotrienols occur in four major isoforms each: α-, β-, γ-, and δ-tocopherol, and α-, β-, γ-, and δ-tocotrienol (Pongracz et al. 1995; Fig. 1). Tocotrienols are distinguished from tocopherols by the presence of three double bonds in the isoprenoid side chain, and greek letters refer to the grade of methylation on the aromatic head group, with α-tocopherol or α-tocotrienol being the highest methylated isoforms (Fig. 1). Of these, α-tocopherol has the highest vitamin E activity in animals and humans (Sheppard et al. 1993; Bramley et al. 2000), presumably due to its preferred retention and distribution throughout the mammalian body (Traber and Sies 1996).

Fig. 1
figure 1

Homogentisate prenyltransferase reaction in context of the tocopherol biosynthetic pathway. Tocopherols and tocotrienols with major isoforms (a) and schematic drawing of the homogentisate prenyltransferase reaction (b). MEP methylerythritol phosphate

The tocochromanol biosynthesis begins with the prenylation of homogentisic acid (HGA) with phytyldiphosphate (PDP) catalyzed by the homogentisate phytyltransferase (VTE2; Collakova and DellaPenna 2001; Schledz et al. 2001; Savidge et al. 2002), or by prenylation of HGA with geranylgeranyldiphosphate (GGDP) catalyzed by the homogentisate geranylgeranyl transferase (HGGT; Cahoon et al. 2003). HGA and PDP plus GGDP are derived from the shikimate pathway (Norris et al. 1998) and the methylerythritolphosphate pathway (Rohmer 2003), respectively (Fig. 1). The reaction products of this first committed step, 2-methyl-6-phytylbenzoquinol, or 2-methyl-6-geranylgeranylbenzoquinol are subsequently cyclized and methylated resulting in the formation of various tocopherol isoforms (Fig. 1).

Previous evidence suggests that the first committed reaction in tocopherol biosynthesis regulates tocopherol flux and to some extent tocopherol composition (Savidge et al. 2002; Cahoon et al. 2003; Collakova and DellaPenna 2003a, b; Karunanandaa et al. 2005). VTE2 has for these reasons been identified to be critical for tocopherol biosynthesis pathway engineering. Interestingly, VTE2 was discovered via bioinformatics approaches in parallel by two different groups (Collakova and DellaPenna 2001; Savidge et al. 2002). In this study, we report on further bioinformatics characterization of homogentisate phytyltransferases leading to the identification of four conserved sequence motifs and on the application of two of these motifs for database searches to specifically search homogentisate prenyltransferase sequences. Application of these motifs for additional database searches identified a VTE2 paralog in the Arabidopsis genome. This new VTE2 paralog was named VTE2-2 and the previously known VTE2 was renamed VTE2-1. Transgenic overexpression of this new gene supports a function of this new gene in tocopherol biosynthesis.

Material and methods

Bioinformatics analysis

BLAST 2.2 algorithms (Altschul et al. 1997) and the non-redundant amino acid database were downloaded from NCBI (http://www.ncbi.nlm.nih.gov). Profile Hidden Markov Models (HMMs) for the motifs described were built and profile HMM searches using these motifs were performed using HMMER 2.0 software as described in the users manual (http://hmmer.wustl.edu/). Gene prediction software FGENESH was procured from Softberry Inc., NY, USA. Dicot model provided with this software was used to predict coding regions from Arabidopsis genomic DNA sequences. Multiple sequence alignments using ClustalX (Thompson et al. 1997) were performed as described in the user’s manual (http://www.igbmc.u-strasbg.fr/BioInfo/ClustalX/Top.html). Multiple sequence alignments were visualized and edited using GeneDoc (Nicholas et al. 1997). The residues were shaded using conserved residue shading mode set to level 4 using default settings. Similar amino acid residues conserved in all columns were shaded dark. Phylogenetic analysis was conducted using MEGA version 2.1 (Kumar et al. 2001). Separate neighbor-joining trees were generated using both p-distance and gamma distance amino acid substitution models. Both pair wise and complete deletion approaches for handling gaps or missing data were applied separately for each model. The trees were tested by bootstrapping, performing 500 replications. Subsequently, the trees were rooted for the Chloroflexus aurantiacus VTE2.

For further functional characterization of VTE2-2 the expression patterns of VTE2-1 and VTE2-2 genes were analyzed in the Arabidopsis Massively Parallel Signature Sequencing (MPSS) data set available at http://mpss.udel.edu/at/ (Meyers et al. 2004). MPSS produces short sequence signatures chosen from a defined position within an mRNA, and the relative abundance of these signatures in a given library represents a quantitative estimate of expression of that gene. The MPSS signatures are 17 bp in length, and can uniquely identify >95% of all genes in Arabidopsis.

Arabidopsis transformation

Using standard cloning techniques the VTE2-2 full-length cDNA was excised from an in house EST clone (pMON69960) to generate binary vectors for sense and antisense expression under the control of the seed-specific Napin promoter (Kridl et al. 1991) and the constitutive e35S promoter (McPherson and Kay 1994) as described (Valentin et al. 2003). The plant binary vectors, pMON69963, and pMON69965 were used to transform Arabidopsis thaliana Col-0 for seed-specific sense and antisense expression of At-VTE2-2, respectively. Binary vectors, pMON69964 and pMON69966 were used for constitutive sense and antisense expression of VTE2-2, respectively. The binary vectors were transformed into Agrobacterium tumefaciens strain ABI by electroporation (Bio-Rad Electroprotocol Manual, Dower et al. 1988). Transgenic Arabidopsis plants were obtained by Agrobacterium-mediated transformation as described (Valvekens et al. 1988; Bechtold et al. 1993; Bent et al. 1994). Transgenic plants were selected by sprinkling the transformed seed onto selection plates containing MS basal salts (4.3 g/l), Gamborg’s B-5, 500× (2.0 g/l), sucrose (10 g/l), Mes (0.5 g/l), phytagar (8 g/l), carbenicillin (250 mg/l), cefotaxime (100 mg/l), plant preservation medium (2 ml/l), and kanamycin (60 mg/l) and then vernalizing them at 4°C in the absence of light for 2–4 days. Subsequently, the seed were transferred to 23°C, and 16/8 h light/dark cycle for 5–10 days until seedlings emerged. Once, one set of true leaves were formed on the kanamycin resistant seedlings, plantlets were transferred to soil and grown to maturity. Transgenic lines generated through kanamycin selection were grown under 16 h light and 8 h dark.

Tocochromanol analysis

The tocochromanol content was analyzed on R2 Arabidopsis seed as described by Savidge et al. (2002).

Northern analysis

RNA was isolated from 100 mg fully matured but not dried, Arabidopsis siliques. Tissue samples were ground with 0.25 g polyvinylpolypyrrolidone in liquid nitrogen. While the ground tissues were frozen, 10 ml REC.8+ buffer containing 50 mM Tris–HCl (pH 9.0), 800 mM NaCl, 10 mM EDTA, 0.5% CTAB and 0.5% β-mercaptoethanol was added to each sample, mixed and centrifuged for 5 min at 18,000g in a Sorvall centrifuge, model Super T21. The supernatant was filtered through miracloth (Calbiochem) onto 3 ml chloroform, mixed and centrifuged again at 18,000g for 5 min. The supernatant was separated and extracted twice with an equal volume of phenol:chloroform mix (1:1, v/v) and then ethanol precipitated. The ethanol pellet was re-suspended in 50 mM EDTA. Equal amounts of total RNA (20 μg) were fractionated on 1.2% agarose gels containing 1 M formaldehyde. Gels were blotted onto nylon membranes (Ambion) according to the manufacturer’s instructions and hybridized in Sigma PerfectHybTM buffer, containing 10 mg/ml of salmon sperm DNA at 65°C. A cDNA fragment (XhoI/BamHI) of the Arabidopsis VTE2-2 gene was labeled and used as a gene-specific probe. Probe labeling with 32P-dCTP (1.85 MBq) was performed using Amersham RediprimeTM labeling kit (Amersham Biosciences Corp.). Blots were washed in 2× SSC, 0.1% SDS at 65°C and exposed to X-ray film (BioMax MS film, Sigma). After detection of VTE2-2 mRNA the blot was stripped with 0.1× SSC containing 0.1% SDS to remove the radiolabelled VTE2-2 probe. Subsequently the blot was again pre-hybridized and hybridized with a rDNA radiolabelled probe using the same methodology described above. After 2 h hybridization with this probe, the blot was washed and exposed to X-ray film.

Results

Bioinformatic analysis

Motif analysis I: Arabidopsis thaliana and Synechocystis sp. PCC 6803 VTE2 genes (At-VTE2, and Sy-VTE2, respectively) have been described previously (Collakova and DellaPenna 2001; Schledz et al. 2001; Savidge et al. 2002). Using the deduced amino acid sequence of Sy-VTE2, the Nostoc and Anabaena VTE-genes were identified from GenBank (gi|17230940, and gi|46135490, respectively) based on sequence homology. In addition, sequences homolog to the At-VTE2 and Sy-VTE2 deduced amino acid sequences were obtained from soybean (Glycine max), leek (Allium porrum) and Cuphea pulcherrima cDNA libraries (GenBank accession no. DQ231059, DQ231057, and DQ231058, respectively). The deduced VTE2 amino acid sequences from Arabidopsis, leek, soybean, Cuphea pulcherrima, Synechocystis, Nostoc, and Anabaena were aligned and four motifs conserved in all VTE2 sequences were identified (Fig. S1 in supplemental material). These motifs have been numbered 1.1, 2.1, 3.1 and 4.1, respectively. They correspond to amino acids 99–112, 167–174, 285–298, and 371–390 of At-VTE2-1 respectively.

We used Hidden Markov Model (HMM) profile techniques (Eddy 1998) to test the specificity and sensitivity of these motifs. Profile HMMs are statistical models of multiple sequence alignments. They capture position-specific information about how conserved each column of the alignment is, and which residues are likely to be sensitive in identifying proteins sharing homology to a conserved domain of interest. Profile HMM models built from multiple alignments of these motifs were used to search all sequences of the NCBI non-redundant amino acid database available on March 2003. Motifs 1.1 through 4.1 identified all previously known deduced VTE2 amino acid sequences present in the database at high specificity. Surprisingly, this search identified two new genomic Arabidopsis VTE2 variants representing contradicting predicted protein sequences from the same area of genomic Arabidopsis DNA. When both of these sequences (gi15229898 and gi10998133) were used to search the non-redundant amino acid database, the BLAST search results indicated that these sequences were most related to VTE2 sequences from cyanobacteria and Arabidopsis VTE2-1 (from here on refered to as VTE2-1). However, gi15229898 appeared to encode a much larger protein of 970 amino acids, which was homologous to VTE2-1 over its carboxy terminal half. A blast search using the 561 amino-terminal deduced amino acids of gi15229898 that were not homologues to VTE2-1, identified homology to deduced amino acid sequences from genes of unknown function from Musa acuminate (gi|40850572, 53% identity over 310 amino acids), Oryza sativa (gi|50930768, 44% identity over 335 amino acids), Zea mays (gi|54652075, 41% identity over 224 amino acids), and TNF receptor-associated factor 5 from Gallus gallus (gi5|4020685, 27% identity over 179 amino acids). PFAM (http://www.sanger.ac.uk/Software/Pfam/index.shtml) analysis of this region identified a conserved TRAF-type zinc finger (Zf-TRAF) domain. Mammalian signal transducers associated with the cytoplasmic domain of the 75 kDa tumor necrosis factor (TNF) receptor are known to contain this domain. The presence of a putative Zf-TRAF domain may therefore indicate a regulatory function of the gi|15229898 amino terminal sequence. Analysis of plant EST databases indicated that the deduced amino terminal sequence of gi|15229898 that was not homologous to VTE2-1 may code for a gene of functional significance which is conserved in plants. EST sequences of this gene were identified in multiple plant species including Arabidopsis thaliana, Brassica napus, Zea mays and Triticum aestivum (data not shown). However, all cDNA sequences with homology to gi|15229898 lacked the carboxyterminal region that contained sequence similarities to VTE2, suggesting that the N-terminal region of gi|15229898 is transcribed independent from the sequence that was homolog to VTE2-1.

Gi|10998133 encoded a 441 amino acid protein that was highly similar to VTE2-1 but it appeared to miss approximately 40–50 amino acids from its carboxy terminus. Gene prediction algorithms are often mistaken in predicting terminal exons (Zhang 2002). Such predicted proteins are often found to have misannotated N-terminal and/or C-terminal sequences and require further verification. In order to verify the GenBank sequence prediction, the nucleotide sequence of a BAC clone that corresponded to the Arabidopsis genomic sequence that contained VTE2-motifs (gi|12408742|gb|AC016795.6|ATAC016795, 10,0835 bp) was analyzed, and its coding sequences were predicted using the FGENESH gene prediction program (Solovyev 2001). FGENESH identified 28 coding regions on this BAC clone (data not shown). To identify new homogentisate phytyltransferase proteins among these 28 sequences, all 28 predicted amino acid sequences were searched against the non-redundant amino acid database using BLAST tool (Altschul et al. 1997). Predicted protein no. 25 (base 80,571 to 83,410 on the BAC clone), with a length of 402 amino acids was most similar to gi10998133 (441 amino acids), the carboxyl terminal half of gi15229898 (970 aa) and other deduced VTE2 amino acid sequences. The deduced amino acid sequence of protein no. 25 aligned well with other VTE2s over the entire length of the protein (data not shown). To provide functional and transcriptional evidence and to confirm the coding sequence for this gene, plant EST sequence databases were searched. Several ESTs from Arabidopsis matched the amino terminal and carboxy terminal portions of this gene (data not shown). ESTs with significant similarity were also identified from other plant species (data not shown). The new gene was named VTE2-2 (GenBank accession no. DQ231060) and previously known VTE2s from plant sources were renamed VTE2-1. The deduced At-VTE2-2 amino acid sequence is quite distinct from At-VTE2-1 and is about 32% identical (Figs. 2 and 3). Similar as all plant derived VTE2-1 sequences represented in Fig. 2, the deduced amino acid sequences of all VTE2-2 represented in Fig. 2 contain a predicted chloroplast target peptide (CTP) according to ChloroP (Emanuelsson et al. 1998).

Fig. 2
figure 2

Neighbour-joining tree of homogentisate prenyltransferases. The tree has been rooted for Chloroflexus aurantiacus (Ca) VTE2. Branch length calculated based on amino acid substitutions per site is shown below (0.05=5% difference between sequences). Blank squares VTE2-1 sequences from plants, blank diamonds, VTE2-2 sequences from plants, filled squares HGGT sequences from plants, filled triangles VTE2 sequences from cyanobacteria, blank circle VTE2 from photobacterium. Abbreviations: An Anabaena, Ap Allium porrum, At Arabidopsis thaliana, Cp Cuphea pulcherrima, Cw, Crocosphaera watsonii, Gm Glycine max, Gv Gloeobacter violaceus, Ms Medicago sativa, No Nostoc, Os Oryza sativa, Sy Synechocystis, Ta Triticum aestivum, Te Trichodesmium erythraeum, Zm Zea mays. Reference numbers: An-VTE2, gi|46135490; Ap-VTE2-1, accession no. DQ231057; At-VTE2-1, gi|21281072; At-VTE2-2, accession no. DQ231060; Ca-VTE2, gi|53795310; Cp-VTE2-1, accession no. DQ231058; Cw-VTE2, gi|45525087; Gm-VTE2-1, accession no. DQ231059; Gm-VTE2-2, accession no. DQ231061; Gv-VTE2, gi|37519852; Hv-HGGT, gi|33391138; Ms-VTE2-1, gi|51949754; No-VTE2, gi|17230940; Os-HGGT, gi|33391144; Os-VTE2-1, gi|51536170; Os-VTE2-2, gi|50938601; Sy-VTE2, gi|16330366; Ta-HGGT, gi|33391142; Ta-VTE2-1, accession no. DQ231056; Te-VTE2, gi|48893591; Zm-VTE2-1, accession no. DQ231055

Fig. 3
figure 3

Multiple alignment of selected homogentisate prenyltransferases. Motif 1, 2, 3, and 4 are labeled with a bold line. Species abbreviations At Arabidopsis thaliana, Hv Hordeum vulgare, Sy Synechocystis, Ca Chloroflexus aurantiacus

Motif analysis II:

Additional EST database searches using the previously identified motifs revealed that At-VTE2-2 orthologs were present in soybean and rice as well. These newly identified VTE2-2 sequences were cloned and their sequences included in the multiple sequence alignment used in the previous motif analysis, for further motif optimization. The resulting new motifs were designated 1.2, 2.2, 3.2, and 4.2 (Fig. 4) corresponding to motifs 1.1, 1.2, 3.1 and 4.1 from previous motif analysis (Fig. S1). Profile HMM models were built and HMM searches were performed using the optimized motifs on the NCBI non-redundant amino acid database, containing more than 2.1 million sequences (Nov. 2004). The results of this search are shown in Table 1. All four motifs identified all previously known homgentisate phytyltransferase genes from plants and cyanobacteria found in the database. In addition to VTE2-1 and VTE2-2 sequences, these motifs also identified the recently identified HGGT sequences from monocotyledon plants (Cahoon et al. 2003). Motifs 3.2 and 4.2 were found to be highly specific to VTE2 type amino acid sequences as evident by lower E values. Motif 4.2 also identified a VTE2 related sequences from the photosynthetic bacterium Chloroflexus aurantiacus. Motif 1.2 is also specific to VTE2 type genes; however, its sensitivity for HGGT sequences is low as indicated by higher E values. Motif 2.2 identified five bacterial ubiA prenyltransferase sequences in addition to VTE2s and HGGTs as it spans the conserved catalytic domain of prenyltransferases (Collakova and DellaPenna 2001). However, the E-values for ubiA sequences were higher by several orders of magnitude, indicating a lower sequence homology and greater sequence divergence of tocochromanol related prenyltransferases from ubiA type prenyltransferases.

Fig. 4
figure 4

Optimized VTE-2 sequence motifs identified based on multiple alignment of VTE2 sequences. Numbers on the right border represent the number of amino acid residues in each motif. Consensus sequences are shown for each motif. Abbreviations: An Anabaena, Ap Allium porrum, At Arabidopsis thaliana, Cp Cuphea pulcherrima, Gm Glycine max, No Nostoc, Sy Synechocystis, Ta Triticum aestivum, Zm Zea mays

Table 1 HMM search of non-redundant NCBI amino acid databases

Phylogenetic analysis

Four neighbor-joining trees were generated using two different amino acid substitution models (see Materials and methods) and two different approaches for handling gaps or missing data. All four neighbor-joining trees clustered the homogentisate phytyltransferases in four clades. One representing plant VTE2-1 sequences, a separate clade representing plant VTE2-2 sequences, a third representing HGGT sequences, and a fourth clade representing bacterial homogentisate phytyltransferases. A tree generated using p distance and complete deletion model is shown in Fig. 2. The topology of the tree suggests that the bacteria and cyanobacteria harbor only one homogentisate phytyl transferase gene. Dicotyledonous and monocotyledonous plants harbor two and three homogentisate prenyltransferase genes, respectively.

Expression pattern of VTE2-1 and VTE2-2 in Arabidopsis

For characterization of the functional significance of VTE2-2 the Arabidopsis expression patterns of the two VTE2-paralogs were characterized in silico. While the frequency of VTE2-1 and VTE2-2 hits in public and proprietary cDNA-libraries from flower buds, seed, silique, roots and leaves suggested a similar expression pattern of these two genes (data not shown), a search in the more quantitative Arabidopsis MPSS data set (http://www.mpss.udel.edu/at/) revealed a preferred expression of VTE2-1 in inflorescence tissue, and a preferred expression of VTE2-2 in leaf (Fig. 5). Expression levels in callus, root and silique tissues were found to be similar.

Fig. 5
figure 5

Expression pattern of VTE2 paralogs in Arabidopsis. The expression pattern are based on analysis of the Arabidopsis MPSS data set (http://mpss.udel.edu/at/)

Overexpression and antisense of VTE2-2 in Arabidopsis

Expression of At-VTE2-2 under napin promoter control resulted in 38% increased seed tocopherol levels in the R2 seed population of all events (Table 2). This increase is slightly less than the increase obtained with VTE2-1 using the same promoter and 3′-UTR (Savidge et al. 2002), but it represents a significant change in total seed tocopherol content. Constitutive expression of VTE2-2 under e35S promoter control resulted in a significant tocopherol increase of 13% compared to the vector control (Table 2). β-Tocopherol was not detected in any of the VTE2-2 over expressing lines, and γ-tocotrienol was found only in trace amounts (≤7 ppm) of some samples harboring the napin driven expression construct for VTE2-2 (Table 2). Mean tocopherol levels of all events in Arabidopsis seed harboring seed-specific and constitutive VTE2-2 antisense constructs were not significantly different from tocopherol levels found in control seed populations. However, these constructs represented simple antisense constructs, which are frequently found to provide only a low percentage of events that exhibit the antisense effect.

Table 2 Sense and antisense expression of VTE2-2 in Arabidopsis thaliana

Expression analysis of transgenic VTE2-2 expressers

Siliques from seven selected R3, kanamycin resistant, transgenic lines originating from transformation with pMON69963 and an empty vector control line were used for Northern analysis to detect the expression level of the VTE2-2 transgene. All transgenic lines exhibited a signal migrating at a molecular mass of 1.2 kb corresponding with the expected molecular mass of the VTE2-2 transgene mRNA that was not present in wild-type control preparations. Transgenic lines with low, medium and high tocopherol increase were used for this experiment. The transgene expression level in the lines analyzed correlated with the level of tocopherol increase in R2 seed (Fig. 6).

Fig. 6
figure 6

Northern-blot analysis for detection of At-VTE2-2 transcripts in R3 transgenic Arabidopsis siliques harboring the pNapin::VTE2-2 expression cassette. The top panel indicates detection of the VTE2-2 mRNA from total RNA isolated form transgenic siliques. The bottom panel shows the same northern blot that was used for detection of VTE2-2, re-probed with rDNA. Among the lines compared, events 48109 and 48125 represent low (1.2-fold); 48108, 48111, and 48119 represent medium (1.4-fold), and 48115 and 48121 represent high (1.5-fold) seed tocopherol lines for seed of the R2 generation

Discussion

In this study, we describe a motif analysis approach which identified a set of conserved motifs in VTE2-1 and HGGT amino acid sequences. Application of these motifs in sequence database searches resulted in the discovery of a VTE2-paralog in the Arabidopsis genome. A function of this new gene in tocopherol biosynthesis was confirmed through over expression in Arabidopsis resulting in increased seed tocopherol levels. Interestingly, this new gene maps to the same location as a so far not characterized mutation on the top of chromosome 3 (pds2) that caused a tocopherol deficient phenotype (Norris et al. 1995).

The conserved motifs were identified through alignment of homologous sequences. Profile HMMs constructed from multiple alignments of these motifs were used in sequence database searches. In comparison to PFAM domains (http://pfam.wustl.edu) the motifs used here are small, vary in size from 12 to 30 amino acids, and are specific to a subset of proteins of a large protein family. In contrast, the corresponding PFAM motif is approximately 260 amino acids long and identifies a large group of prenyltransferases including ubiA and VTE2 orthologs found in multiple taxonomic groups. Motifs 3.2 and 4.2 described in this paper are specific to VTE2 and HGGT sequences found in bacteria and higher plants. While motif 2 covers the catalytic domain found in other ubiA family prenyltransfrases, the functional significance of the other conserved domains identified in this study is not known. Additional studies using VTE2 or HGGT genes mutagenized in the conserved regions identified here may help to decipher the structural and biochemical functions of these amino acid sequences. We found this approach sensitive in identifying short sequences that can be easily missed by sequence similarity search methods like BLAST.

It is interesting to note that all other tocopherol pathway genes appear to occur as single copy genes in Arabidopsis. The 2-methyl-6-phytylbenzoquinolmethyltransferase (VTE3) even appears to carry out reactions in two different metabolic pathways, the methylation of 2-methyl-6-phytylbenzoquinol for tocopherol biosynthesis, and the methylation of the plastoquinol precursor 2-methyl-6-solanylbenzoquinol (Cheng et al. 2003; Motohashi et al. 2003). In this context it may appear surprising to find two VTE2 paralogs in the Arabidopsis genome. However, VTE2 catalyzes the first committed reaction in tocopherol biosynthesis. These reactions frequently coincide with enzymes regulating the pathway flux. Indeed several recent publications suggest a flux limitation for tocopherol biosynthesis by VTE2 under certain growth conditions (Savidge et al. 2002; Cahoon et al. 2003; Collakova and DellaPenna 2003a, b; Karunanandaa et al. 2005). The occurrence of two VTE2 paralogs may provide a broader flexibility in tocopherol flux regulation, or it may help to separate tocopherol flux regulation from plastoquinol biosynthesis. This would however require differences in substrate specificity for the two VTE2 paralogs. A slight difference in the substrate spectrum of VTE2-1 and VTE2-2 may be suggested by the observation that some Arabidopsis seed samples expressing VTE2-2 under napin promoter control contained traces of γ-tocotrienol (Table 2), while a similar construct expressing VTE2-1 did not result in tocotrienols accumulation (Karunanandaa et al. 2005). VTE2-2 might therefore also help to explain the accumulation of tocotrienols in transgenic Arabidopsis in the presence of high HGA-levels.

Previous studies (Rippert et al. 2004; Karunanandaa et al. 2005) have demonstrated the accumulation of substantial tocotrienol levels in plants co-expressing a prephenate dehydrogenase and a p-hydroxypenylpyruvate dioxygenase. The accumulation of tocotrienols could not be explained on the basis of currently published data, as previous studies did not detect prenyltransferase activity of the Arabidopsis VTE2-1 with the tocotrienol precursor GGDP (Collakova and DellaPenna 2001). The identification of VTE2-2 and the data presented in Table 2 might suggest that this enzyme can utilize GGDP as substrate, and thereby solve the previous dilemma on how tocotrienols were formed. Whether VTE2-2 has a function in plastoquinol biosynthesis or fulfills other as yet to be identified physiological functions will require additional experimentation, such as enzyme assays using a variety of different substrates, or parallel complementation experiments to elucidate if the pds2 mutant as described by Norris et al. (1995) can be complemented by VTE2-1 or VTE2-2 or both.

Analysis of the Arabidopsis MPSS data set suggested that compared to VTE2-2 the VTE2-1 gene is expressed at substantially higher levels in inflorescence and at lower expression levels in leaf. Further enzymatic characterization of both VTE2 expression products with an emphasis on substrate preferences will help to decipher the physiological function of these two genes.