Introduction

Carotenoids are isoprenoids derived from the precursor molecule isopentenyl diphosphate and its isomer dimethylallyl diphosphate. All photosynthetic organisms, including higher plants and algae, synthesize carotenoids (Armstrong and Hearst 1996; Cunningham and Gantt 1998; DellaPenna and Pogson 2006; Fraser and Bramley 2004; Goodwin 1980; Hirschberg et al. 1997; Lichtenthaler 1999). Carotenoids are structural components of the thylakoid membrane, where they function as accessory molecules for light harvesting, protect against photo-damage, and act as antioxidants under stress conditions (Goodwin 1980; Hirschberg et al. 1997; McCarthy et al. 2004; DellaPenna and Pogson 2006; Lichtenthaler 2007).

Phytoene synthase (PSY) is considered the enzyme that catalyzes the rate-limiting entry reaction of the carotenoid biosynthesis pathway in photosynthetic organisms (Chen et al. 2007; Lindgren et al. 2003; Salvini et al. 2005; Yan et al. 2005; Li et al. 2008b). Multiple paralogous PSY genes were discovered in higher plants such as corn (Li et al. 2008a, b) and tomato (Bartley and Scolnik 1993), in which chromoplasts of specialized tissue cells over-accumulate carotenoids. When paralogous genes exist in higher plants, they are often differentially up-regulated depending on environmental conditions or on the developmental stage of various tissues (Bartley and Scolnik 1993; Buckner et al. 1996; Li et al. 2008b). Carotenoid biosynthesis has been studied extensively in bacteria and higher plants, whereas research on carotenoid biosynthesis in algae is still in its infancy. For unicellular green algae, PSY was previously investigated, for example, in Chlamydomonas reinhardtii (Bohne and Linden 2002; McCarthy et al. 2004; Lohr et al. 2005) and Haematococcus pluvialis/lacustris (Steinbrenner and Linden 2001). The genome of C. reinhardtii contains only one functional psy gene (McCarthy et al. 2004; Lohr et al. 2005). For Haematococcus the number of psy copies per genome is unknown, but psy was shown to be up-regulated under the stress conditions of high light and low nutrient availability (Steinbrenner and Linden 2001, 2003). Further, psy genes were cloned from different species of the unicellular green alga Dunaliella (Yan et al. 2005), but so far only one gene has been reported for each species.

Recently, the genome sequences of a number of different microalgae became available from the DOE Joint Genome Institute (http://www.jgi.doe.gov). These algal genomes, with representatives from very different groups such as green algae, red algae, diatoms, and haptophytes, were examined to identify their psy genes. The results presented here demonstrate that some algae contain only a single psy gene, whereas other algae contain multiple paralogous or orthologous copies of psy. Given the discovery of small psy gene families in algae, it can be expected that, analogous to the diversity of psy genes and their differential expression in higher plants (Li et al. 2008a, b), algae also differentially regulate expression of their paralogous psy gene copies. Similarly, it may be hypothesized that orthologous psy gene copies identified in some algae could also be differentially regulated.

Materials and methods

Phytoene synthase cDNA and protein sequences of Aureococcus anophagefferens, Chlamydomonas reinhardtii, Chlorella sp. NC64A, Volvox carteri, Micromonas pusilla, Micromonas sp. RCC299, Ostreococcus lucimarinus, Ostreococcus tauri, Ostreococcus RCC809, Emiliania huxleyi, Thalassiosira pseudonana, and Phaeodactylum tricornutum were obtained from the website of the DOE Joint Genome Institute (Walnut Creek, CA, USA; http://www.jgi.doe.gov). The cDNA and protein sequences of the PSY of the red alga Cyanidioschyzon merolae were obtained from the Cyanidioschyzon merolae Genome Project (http://merolae.biol.s.u-tokyo.ac.jp). The genomes of these algae were examined for the presence of psy genes. As a first step in determining the number of psy genes present in the algal genomes, we followed the annotation of JGI or of the Cyanidioschyzon merolae Genome Project. In a second step, the cDNA and the protein sequence of the PSY of C. reinhardtii were each used independently as query sequences in BLAST searches of each algal genome to identify psy homologues.
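The second search step can be illustrated with a minimal sketch in Python. It assumes the BLAST hits were exported in the 12-column tabular format of BLAST+ (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore). The e-value cutoff, the function name, and the example hit table are hypothetical, since no cutoff or export format is stated for the searches described above.

```python
# Column order of the standard 12-column BLAST+ tabular output.
BLAST_TAB_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def candidate_homologues(tabular_text, max_evalue=1e-10):
    """Return subject IDs with at least one hit at or below max_evalue.

    The cutoff is an assumed value for illustration only. IDs are sorted
    from the lowest (best) e-value to the highest.
    """
    best = {}  # subject ID -> lowest e-value seen for that subject
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(BLAST_TAB_COLUMNS, line.split("\t")))
        evalue = float(hit["evalue"])
        if evalue > max_evalue:
            continue
        subject = hit["sseqid"]
        if subject not in best or evalue < best[subject]:
            best[subject] = evalue
    return sorted(best, key=best.get)


# Made-up hit table: one query against three hypothetical gene models.
example = (
    "CrPSY\tmodel_101\t62.5\t310\t110\t3\t60\t369\t45\t352\t1e-120\t420\n"
    "CrPSY\tmodel_202\t48.0\t295\t140\t6\t70\t360\t30\t320\t3e-75\t280\n"
    "CrPSY\tmodel_101\t40.0\t50\t28\t1\t10\t59\t1\t50\t2e-12\t60\n"
    "CrPSY\tmodel_303\t30.0\t80\t50\t4\t100\t179\t200\t279\t0.5\t30\n"
)
print(candidate_homologues(example))
```

Each candidate returned this way would still need to be checked against the genome annotation, as in the first step, before being counted as a psy gene.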

Phytoene synthase protein sequences for the algae Dunaliella salina (Accession no. AY601075), Dunaliella bardawil (Accession no. DBU91900), Dunaliella spec. 366 (Accession no. DQ463305), and Haematococcus pluvialis (Accession no. DQ057355) were obtained from the NCBI GenBank database. The protein sequences for PSY of the higher plants Zea mays (Accession no. AY773475, AY773476, DQ356430), Arabidopsis thaliana (Accession no. P37271), and Solanum lycopersicum (Accession no. P08196, ABU40771) were also obtained from the NCBI GenBank. PSY protein sequences for Synechocystis sp. PCC6803 and Anabaena sp. PCC7120 were obtained from the Kazusa Cyanobase (http://bacteria.kazusa.or.jp/cyanobase/).

Based on a PSY cDNA sequence of Dunaliella bardawil available from NCBI (NCBI no. U91900), various forward and reverse primers (Table 1) were designed to clone the corresponding full-length genomic PSY gene by PCR. In addition, two further partial genomic psy sequences were obtained by PCR. Genomic DNA was isolated from cells using the DNeasy Plant Mini Kit (Qiagen Cat. no. 69104), and PCR was performed under the following standard conditions: 1 cycle of 95°C, 5 min; 32 cycles of 95°C, 1 min; 59°C, 1–3 min depending on the length of the products; 72°C, 1 min; and holding the sample at 4°C. The PCR products were purified by agarose gel electrophoresis followed by gel extraction (Qiagen Cat. no. 28704) and cloned into the vector pSA-C using the Strataclone PCR cloning kit (Cat. no. 240205-5) before being sent out for sequence determination at MWG Biotech Inc. (High Point, NC, USA). The resulting sequences were submitted to NCBI (PSY1A Accession no. DQ057342; PSY1B Accession no. FJ262988; PSY2 Accession no. FJ262989).
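As a worked check on the cycling program given above, the following Python sketch adds up the programmed hold times. The names Step, cycle_with, and total_minutes are illustrative and are not part of the original protocol. Ramp times and the final 4°C hold are ignored, so actual run times would be somewhat longer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int      # block temperature in degrees Celsius
    minutes: float   # programmed hold time


def total_minutes(initial, cycle, n_cycles):
    """Programmed time: one initial step plus n_cycles repeats of the cycle."""
    return initial.minutes + n_cycles * sum(step.minutes for step in cycle)


INITIAL = Step(95, 5)   # 1 cycle of 95 °C for 5 min
N_CYCLES = 32


def cycle_with(variable_minutes):
    """One cycle as stated in the text; the 59 °C step is given as 1-3 min."""
    return [Step(95, 1), Step(59, variable_minutes), Step(72, 1)]


shortest = total_minutes(INITIAL, cycle_with(1), N_CYCLES)  # 5 + 32 * 3
longest = total_minutes(INITIAL, cycle_with(3), N_CYCLES)   # 5 + 32 * 5
print(shortest, longest)  # 101 and 165 minutes
```

The programmed time therefore ranges from about 1 h 41 min to 2 h 45 min, depending on product length.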

Table 1 List of primer sequences used for PCR amplification to clone one full and two partial genomic psy sequences in Dunaliella bardawil UTEX LB2538

All protein sequences from our PSY dataset were multiply aligned using ClustalW, version 1.83 (Thompson et al. 1994). A primary PSY phylogenetic tree was constructed in MrBayes, version 3.1.2 (Huelsenbeck and Ronquist 2001), under 100,000 runs, using the Jones amino acid substitution matrix with a fixed rate among sites. A second PSY phylogenetic tree was constructed using the Seqboot, Neighbor, and Consense programs in the Phylip package, version 3.66 (Felsenstein 1989). Bootstrap support values were derived from 100 randomized replicate datasets.
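The bootstrap resampling step can be sketched as follows. This is a minimal Python illustration of how replicate datasets are generally built from an alignment (columns drawn with replacement), not a reimplementation of Seqboot. The function names, the random seed, and the toy alignment are hypothetical.

```python
import random


def bootstrap_columns(alignment, rng):
    """Build one replicate by sampling alignment columns with replacement."""
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(seq) != length for seq in alignment.values()):
        raise ValueError("all aligned sequences must have equal length")
    # Draw as many column indices as the alignment has columns.
    picks = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(alignment[name][i] for i in picks) for name in names}


def make_replicates(alignment, n_replicates=100, seed=0):
    """Return n_replicates resampled alignments (100 matches the text)."""
    rng = random.Random(seed)
    return [bootstrap_columns(alignment, rng) for _ in range(n_replicates)]


# Toy alignment with made-up sequences, used only to exercise the code.
toy = {"taxonA": "MDK-LV", "taxonB": "MDR-LI", "taxonC": "MEKALV"}
replicates = make_replicates(toy)
print(len(replicates), replicates[0])
```

A tree would then be inferred from each replicate, and the support value of a clade is the fraction of replicate trees in which that clade appears.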

Results

In contrast to higher plants, unicellular green algae such as C. reinhardtii and V. carteri were known to contain only one psy gene. However, the genomes of a variety of microalgae were recently sequenced by the DOE Joint Genome Institute (http://genome.jgi-psf.org/) and by the Cyanidioschyzon merolae Genome Project. These available microalgal genomes were analyzed for the presence of psy genes, and the results are summarized in Table 2. The translated PSY proteins were then used to perform a phylogenetic analysis.

Table 2 Phytoene synthase genes identified in algae genomes

Figure 1 displays a Bayesian phylogeny of our PSY dataset, rooted using the outgroup PSY proteins of the cyanobacteria Synechocystis and Anabaena. The tree topology matches our neighbor-joining tree (not shown), and the high clade support values coincide with high neighbor-joining bootstrap values, suggesting that the PSY phylogeny is robust to different tree reconstruction methods. The tree exhibits one major ancestral duplication event at the root node, which gave rise to two distinct PSY classes. To illustrate similarities and differences between the two PSY classes, Fig. 2 shows an exemplary alignment of selected class I and class II PSY protein sequences. The alignment indicates that both PSY classes share the essential characteristics of PSY, including predicted substrate-Mg2+-binding sites (aspartate-rich regions) and catalytic residues. Major differences between the two PSY classes appear to exist only in regions not essential to the enzymatic function.

Fig. 1 Shown is a phylogenetic tree for the phytoene synthase from various organisms including cyanobacteria, algae, and higher plants. The arrow indicates an ancient gene duplication event creating a class I PSY (I) and a class II PSY (II). Stars indicate where later gene duplications led to creation of paralogous genes found within one species. Major groups of organisms are labeled to allow comparison between the phylogeny of PSY and algae evolution. Note that the overall phylogeny of PSY follows the currently accepted system of classification of algae

Fig. 2 Alignment of selected PSY protein sequences from different algae, produced with the BioEdit program using Clustal W. The putative transit peptides were removed for this analysis. The alignment indicates selected sequences of PSY class I and class II, aspartate-rich regions/substrate-Mg2+-binding sites (DXXXD), residues of the substrate binding pocket (open circle), catalytic residues (dark filled circle), and active site lid residues (straight line)
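The DXXXD pattern marked in the alignment can be located in a protein sequence with a short regular-expression search. The sketch below is a Python illustration only. The function name and the toy sequence are hypothetical, and a match to the pattern alone does not show that a site binds substrate or Mg2+.

```python
import re

# Lookahead so that overlapping DXXXD occurrences are all reported.
DXXXD = re.compile(r"(?=(D.{3}D))")


def find_dxxxd(protein):
    """Return (1-based start position, matched 5-mer) for each DXXXD motif."""
    return [(m.start() + 1, m.group(1)) for m in DXXXD.finditer(protein.upper())]


# Made-up sequence, not a real PSY protein.
toy_sequence = "MKDAAGDTLLDELVDRRSW"
for position, motif in find_dxxxd(toy_sequence):
    print(position, motif)
```

Run on sequences with the transit peptide removed, as in the alignment, the reported positions would refer to the mature protein.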

Figure 1 also shows that gene loss occurred in both classes of PSY. Higher plants appear to have retained only class I PSY. Similarly, within the green algae, members of the Chlorophyceae and of the Trebouxiophyceae appear to have retained only class I PSY. In contrast, class I PSY was lost in the Rhodophyta (represented by Cyanidioschyzon), in the Haptophyta (represented by E. huxleyi), and in the Heterokontophyta, which appear to have retained only class II PSY, as shown in Fig. 1 for P. tricornutum and T. pseudonana (Bacillariophyceae) as well as for Aureococcus (Pelagophyceae). Only the Prasinophyceae within the green algae, represented by O. tauri, O. lucimarinus, Ostreococcus strain RCC809, Micromonas strain RCC299, and M. pusilla, appear to have retained both ancestral PSY copies.
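The retention pattern described above can be written down as a small lookup table from which the inferred losses follow directly. The Python sketch below encodes only the group-level statements made in the text. The variable and function names are illustrative.

```python
# PSY classes retained per group, as stated in the text for the sampled species.
RETAINED = {
    "higher plants": {"I"},
    "Chlorophyceae": {"I"},
    "Trebouxiophyceae": {"I"},
    "Prasinophyceae": {"I", "II"},
    "Rhodophyta": {"II"},
    "Haptophyta": {"II"},
    "Heterokontophyta": {"II"},
}

BOTH_CLASSES = {"I", "II"}  # products of the ancient duplication


def lost_classes(retained):
    """Classes missing relative to an ancestor that carried both copies."""
    return sorted(BOTH_CLASSES - retained)


losses = {group: lost_classes(classes) for group, classes in RETAINED.items()}
for group, lost in losses.items():
    print(group, "lost:", ", ".join(lost) if lost else "none")
```

The table reflects only the genomes examined here, so a group listed as having lost a class could still contain unsampled species that retain it.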

A special case appears to be represented by the two psy genes both coding for class II PSY identified in the alga A. anophagefferens which belongs to the Pelagophyceae within the phylum Heterokontophyta.

In addition to the ancient gene duplication leading to two PSY classes, more recent psy gene duplications appear to have occurred independently in higher plants and in some microalgae. For example, multiple paralogous psy genes were found not only in some higher plants such as corn and tomato, but also in the haploid genome of the chlorophyte D. bardawil. For D. bardawil, one full genomic psy sequence and two additional partial genomic psy sequences were cloned by PCR, indicating that this alga also has multiple paralogous psy genes (class I PSY duplications) in its haploid genome. To demonstrate the existence of multiple class I psy genes in D. bardawil, Fig. 3 shows parts of two paralogous psy genes that were selectively PCR-amplified using specific primers. In contrast to D. bardawil, which is a haplont, the two copies of class II PSY found in the haptophyte E. huxleyi are diploid alleles.

Fig. 3 Shown is the photo of a polyacrylamide gel. Lane M molecular weight marker, lane 1 PCR product obtained by using the primer pair PSYF1A and PSYR1B to amplify the partial psy1A from genomic DNA of D. bardawil, lane 2 PCR product obtained by using the primer pair PSYF2B and PSYR2B to amplify a partial psy2 from genomic DNA of D. bardawil

Discussion

Previously, it was known that some higher plants contain small psy gene families consisting of multiple paralogous genes (Bartley et al. 1992; Bartley and Scolnik 1993; Gallagher et al. 2004; Li et al. 2008a). In higher plants containing small paralogous psy gene families, the different psy genes are differentially regulated during development (Bartley et al. 1992; Bartley and Scolnik 1993; Buckner et al. 1996) and/or in response to environmental stress (Li et al. 2008a, b). In contrast to the situation in higher plants, the number of psy genes present in microalgal genomes was largely unknown. However, genomes from different classes of microalgae recently became available. The comparative analysis of various algal genomes for PSY, in combination with the phylogenetic analysis shown in Fig. 1, suggests an ancient gene duplication creating two classes of PSY. Both classes of PSY appear to have been retained only in members of the Prasinophyceae, which belong to the Chlorophyta, whereas all other investigated species belonging to the Chlorophyta, as well as the related higher plants (Streptophyta), seem to have lost the class II PSY. In contrast, members of the Rhodophyta, Heterokontophyta, and Haptophyta investigated here lost the class I PSY. This unbalanced distribution of PSY genes among different algal groups may have been driven by neutral processes or by adaptive pressure.

The persistence of gene duplicates in only a subset of algal groups may be due to the acquisition of a novel function in one copy (neo-functionalization), or to the degeneration of both copies such that both become jointly required (sub-functionalization) (Force and Lynch 1999). Evidence for adaptive neo-functionalization can be found in the variation in mutation rate between the two PSY classes. There appears to be a relaxed selective constraint in class II, which may be indicative of rapid evolution of a new function. This variation exists in those species containing both ancient PSY copies and is therefore not attributable to species-specific mutation rates. Taken together, the persistence of multiple PSY copies and the relaxed selective constraint in class II PSY suggest that functional novelty may have played a role in the maintenance of the two PSY gene classes. However, analysis of the conserved residues of class I and class II PSY showed that they share the essential substrate-binding and catalytic site residues. This conservation of the critical catalytic residues suggests that the catalytic function itself was not affected during evolution of the class I and class II enzymes; rather, it is proposed that regions of the enzyme that did change during evolution may be critical for regulation of its function. A structural analysis is necessary to further delineate the role of the more variable regions of the class II PSY proteins, but such an analysis is beyond the scope of this manuscript.
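How the rate variation between the two classes was quantified is not stated here. One simple proxy, offered only as an assumption, is to compare the mean pairwise p-distance (the proportion of differing aligned sites) within each class, restricted to species that carry both copies. The Python sketch below uses made-up aligned fragments, and the function names are illustrative.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of differing sites, skipping columns with a gap in either."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    return sum(a != b for a, b in pairs) / len(pairs)


def mean_pairwise_distance(sequences):
    """Average p-distance over all pairs of sequences in one class."""
    distances = [p_distance(a, b) for a, b in combinations(sequences, 2)]
    return sum(distances) / len(distances)


# Made-up aligned fragments standing in for the same three species.
class_i = ["MDKLV", "MDKLV", "MDKLI"]
class_ii = ["MEKAV", "MDRLV", "QDKLI"]

print(mean_pairwise_distance(class_i), mean_pairwise_distance(class_ii))
```

A larger mean distance within class II for the same set of species would be consistent with a relaxed constraint, although p-distances are uncorrected for multiple substitutions and a formal rate test would be needed to support the claim.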

Phylogenetic analysis of PSY also showed that more recent gene duplication events creating small paralogous gene families occurred not only in some higher plants but also, independently, in microalgae such as the chlorophyte D. bardawil. Possibly, the existence of multiple paralogous psy copies allowed their differential regulation in response to developmental or environmental cues (Li et al. 2008b). This hypothesis remains to be tested for the alga D. bardawil, which over-accumulates carotenoids in response to abiotic stress.

It is generally accepted that the chloroplasts of algae in the phylum Heterokontophyta were acquired by secondary endosymbiosis involving a red alga (Bhattacharya and Medlin 1998; Braun and Phillips 2008; Boore 2008). Therefore, it may be hypothesized that the two different copies of psy coding for class II PSY found in A. anophagefferens originated from the host cell and from a secondary endosymbiosis event, respectively, rather than from a more ancient gene duplication. Indirect evidence for this hypothesis comes from the fact that the genome of the red alga C. merolae contains only one gene coding for a class II PSY (Fig. 1). Analysis of further genomes of algae in this phylum is necessary to distinguish between these two possible origins of psy in A. anophagefferens.

In summary, phylogenetic analysis of PSY from a variety of photosynthetic organisms revealed an ancient gene duplication resulting in two PSY classes. Further, it was shown that more recent gene duplications occurred, which led to the existence of small paralogous psy gene families in some algae and some higher plants. This finding raises new questions regarding the function of multiple PSY copies within some of the unicellular algae. It is postulated that, similar to the situation in higher plants, in algae containing multiple copies of psy these genes are differentially regulated in response to developmental and/or environmental cues to fine-tune metabolic flux into carotenoid biosynthesis.