Introduction

Over 700 carotenoids are widely distributed in nature, occurring in bacteria, fungi, algae, and plants (Britton et al. 2004). They carry out important functions in photosynthesis, nutrition, and protection against oxidative damage. Most naturally occurring carotenoids are hydrophobic tetraterpenoids containing a C40 methyl-branched hydrocarbon backbone derived from successive condensation of eight C5 isoprene units. The C5 isoprene unit, isopentenyl pyrophosphate (IPP), can be generated from the acetyl-CoA precursor by the mevalonate pathway (Wilding et al. 2000) or from the pyruvate and glyceraldehyde 3-phosphate precursors by the non-mevalonate pathway (Rohmer et al. 1996). Several new genes, including ispH (Rohdich et al. 2002), have recently been identified in the non-mevalonate isoprenoid pathway. IPP is isomerized to dimethylallyl pyrophosphate (DMAPP) by IPP isomerase, encoded by the idi gene. IPP is condensed with DMAPP to form the C10 compound geranyl pyrophosphate (GPP), which is elongated to the C15 compound farnesyl pyrophosphate (FPP). In carotenogenic organisms, FPP is extended to common C40 carotenoids (Fig. 1) such as lycopene by geranylgeranyl pyrophosphate synthase (CrtE), phytoene synthase (CrtB), and phytoene dehydrogenase (CrtI). The aliphatic ends of lycopene (ψ,ψ-carotene) can be modified by enzymes such as hydratase (CrtC) and desaturase (CrtD). They can also be cyclized by lycopene cyclase and further modified by enzymes such as β-ionone ring ketolase (CrtW) and β-ionone ring hydroxylase (CrtZ) to generate a variety of C40 carotenoids.

Fig. 1

Proposed pathway for synthesis of flexixanthin in Algoriphagus sp. KK10202C. CrtE: geranylgeranyl pyrophosphate synthase, CrtB: phytoene synthase, CrtI: phytoene dehydrogenase, CrtC: hydratase, CrtD: desaturase, CrtY: lycopene β-cyclase, CrtW: 4,4′-β-ionone ring ketolase, CrtZ: 3,3′-β-ionone ring hydroxylase

Both structural and functional diversity have been reported for lycopene β-cyclases. Four different types of lycopene β-cyclases have been described in a review (Krubasik and Sandmann 2000b): the common monomeric bacterial CrtY-type (Misawa et al. 1995; Schnurr et al. 1996; Pasamontes et al. 1997; Krugel et al. 1999; Hannibal et al. 2000); the monomeric plant CrtL-type or Lcy-b type (Cunningham et al. 1994, 1996; Hugueney et al. 1995; Pecker et al. 1996); the heterodimeric CrtYc and CrtYd type (Krubasik and Sandmann 2000a; Viveiros et al. 2000); and the bifunctional fungal CrtYB type (Verdoes et al. 1999; Velayos et al. 2000; Arrach et al. 2001, 2002). The lycopene cyclase domain of the fungal CrtYB shares homology with the heterodimeric CrtYc and CrtYd from gram-positive bacteria; however, they do not show homology with the monomeric lycopene cyclases. The crtYc and crtYd genes encode two proteins, which have to interact as heterodimers for lycopene cyclization. Recently, a novel fusion-type lycopene β-cyclase has been described in archaea (Peck et al. 2002; Hemmi et al. 2003), whose N- and C-terminal halves are homologous to the CrtYc and CrtYd subunits of the bacterial heterodimeric enzymes. Most lycopene β-cyclases are bicyclases, which cyclize both ends of lycopene to produce bicyclic β-carotene. Two novel lycopene β-monocyclases, CrtYm (Teramoto et al. 2003) and CrtLm (Tao et al. 2004), were recently isolated from bacteria that synthesize exclusively monocyclic carotenoids. The lycopene β-monocyclases asymmetrically cyclize only one end of lycopene to produce monocyclic γ-carotene.

Algoriphagus sp. belongs to the Flexibacteraceae family within the bacterial phylum Bacteroidetes. The structures of the carotenoids (flexixanthin and deoxy-flexixanthin) produced in certain flexibacteria were elucidated several decades ago (Aasen and Jensen 1966). Flexixanthin is a monocyclic carotenoid that possesses one oxygenated β-ionone ring analogous to that of astaxanthin and one hydroxylated aliphatic end group as found in demethylspheroidene. However, no pathway has been proposed for flexixanthin synthesis and no carotenoid synthesis gene has been cloned from flexibacteria. Here we report the isolation of a carotenoid synthesis gene cluster from a marine isolate, Algoriphagus sp. strain KK10202C. Five of the genes in the cluster were involved in flexixanthin synthesis. One of them encoded a novel fusion-type lycopene β-cyclase, a type of enzyme reported here for the first time in eubacteria.

Materials and methods

Strain isolation

A marine bacterium, Algoriphagus sp. strain KK10202C, was isolated from the marine sponge Halichondria okadai, which was collected in the coastal area of Numazu, Japan. For strain isolation, 0.1-ml aliquots of the homogenized specimen were transferred onto plates of marine agar 2216 (Difco, Detroit, MI, USA). From a number of strains, an orange-pigmented bacterium, KK10202C, was isolated and purified on the same medium.

Carotenoid identification

Cultivation of the bacterium KK10202C and extraction of the carotenoids produced were carried out by the methods previously described (Yokoyama and Miki 1995). The main carotenoid extracted was purified by column chromatography on silica gel 60 (Nakarai Tesque, Japan) with chloroform and methanol (9:1, v/v), and by HPLC on Cosmosil 5-SL (Nakarai Tesque) with hexane and acetone (8:2, v/v). Three milligrams of the pigment were isolated from 12 l of culture. The purified carotenoid was identified by its visible, 1H-NMR, and mass spectra (Aasen and Jensen 1966). Authentic standards of lycopene and β-carotene were purchased from Sigma (St. Louis, MO, USA).

Library construction and screening

A small-insert library of strain KK10202C was prepared by the partial restriction digest method. Genomic DNA of KK10202C was partially digested with HincII and separated on a 0.8% agarose gel. The 4–6-kb fraction was excised from the gel and extracted using the Qiagen MinElute Gel-Extraction kit. The extracted DNA was ligated to the pEZSeq vector using the pEZSeq Blunt Cloning kit (Lucigen, Middletown, WI, USA). The ligation mixture was electroporated into freshly prepared competent cells of Escherichia coli 10G containing the β-carotene-producing plasmid pBHR-crt1 (Tao et al. 2004). Transformants were plated on LB plates with 100 μg/ml ampicillin and 50 μg/ml kanamycin. Positive clones were identified by the orange color of the colonies.

A cosmid library of KK10202C was constructed using the pWEB cosmid cloning kit from Epicentre Technologies (Madison, WI, USA). Genomic DNA was sheared by passing it through a 26 1/2 G syringe needle three times. The sheared DNA was end-repaired and size-selected on a low-melting-point agarose gel. DNA fragments approximately 40 kb in size were purified and ligated into the blunt-ended pWEB cosmid vector. The library was packaged using ultra-high efficiency MaxPlax Lambda Packaging Extracts, and titered with EPI100 E. coli cells. Approximately 600 cosmid clones were grown in LB with 100 μg/ml ampicillin. They were pooled and screened by PCR using primers for the crtW gene identified from the small-insert library.
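As a rough check on the size of such a library, the chance that a given locus is present in at least one clone can be estimated with the standard Clarke-Carbon relation, P = 1 − (1 − f)^N, where f is the insert size divided by the genome size and N is the number of clones. The sketch below applies it to the approximately 600 clones and 40-kb inserts described above. The genome size of KK10202C is not stated in this paper, so the 5,000-kb value and the function names are purely illustrative assumptions.

```python
import math


def coverage_probability(n_clones: int, insert_kb: float, genome_kb: float) -> float:
    """Probability that a given locus is present in at least one clone."""
    fraction = insert_kb / genome_kb  # share of the genome carried by one clone
    return 1.0 - (1.0 - fraction) ** n_clones


def clones_needed(probability: float, insert_kb: float, genome_kb: float) -> int:
    """Smallest number of clones giving at least the requested probability."""
    fraction = insert_kb / genome_kb
    return math.ceil(math.log(1.0 - probability) / math.log(1.0 - fraction))


# ASSUMPTION: hypothetical genome size; not reported for strain KK10202C.
ASSUMED_GENOME_KB = 5000.0

p_locus = coverage_probability(600, 40.0, ASSUMED_GENOME_KB)
n_for_99 = clones_needed(0.99, 40.0, ASSUMED_GENOME_KB)
print(f"P(locus represented) = {p_locus:.3f}; clones for 99% = {n_for_99}")
```

Under this assumed genome size, about 600 clones would be expected to represent any single locus with a probability above 99%. A larger genome would lower that figure.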

Sequencing of the positive clones

From an orange positive clone, the library plasmid pEZ-HY1 was separated from the β-carotene reporter plasmid by selecting for ampicillin-resistant and kanamycin-sensitive clones. The insert of pEZ-HY1 was sequenced by random transposon insertion using the EZ-TN<TET-1> kit (Epicentre, Madison, WI, USA) and/or primer walking. The insert of the positive cosmid was also sequenced using the EZ-TN<TET-1> kit. The sequences were assembled with the Sequencher program (Gene Codes Corp., Ann Arbor, MI, USA).

Construction of the expression clone

The lycopene cyclase gene crtYcd from KK10202C was expressed in E. coli using the pTrcHis2-TOPO expression vector. The gene was amplified by PCR using primers 5′-ATGGGAAACTATCTCTACCTAGC-3′ and 5′-CTAATAACTTTTAGTTTGTTGAATTGTTTTCCG-3′. The resulting plasmid pDCQ250, containing the crtYcd gene in the forward orientation, was confirmed by restriction analysis and sequencing. Its function was tested in lycopene-accumulating E. coli DH10B containing pDCQ51 (Tao et al. 2004).
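The two primers can be sanity-checked computationally. The minimal sketch below confirms that the forward primer begins at a start codon and that the reverse primer, once reverse-complemented, ends in a stop codon. It also estimates the amplicon length from the 236-amino-acid ORF reported in the Results, assuming the primers anneal exactly at the ends of the coding sequence. The helper names are hypothetical and no primer-design software is implied.

```python
# Complement table for unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}

FORWARD_PRIMER = "ATGGGAAACTATCTCTACCTAGC"
REVERSE_PRIMER = "CTAATAACTTTTAGTTTGTTGAATTGTTTTCCG"


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def starts_with_start_codon(primer: str) -> bool:
    """True if the forward primer begins with ATG."""
    return primer.upper().startswith("ATG")


def ends_with_stop_codon(reverse_primer: str) -> bool:
    """True if the coding-strand view of the reverse primer ends in a stop codon."""
    return reverse_complement(reverse_primer)[-3:] in STOP_CODONS


# ASSUMPTION: the amplicon spans exactly the ORF (236 codons) plus its stop codon.
expected_amplicon_bp = (236 + 1) * 3

print(starts_with_start_codon(FORWARD_PRIMER), ends_with_stop_codon(REVERSE_PRIMER))
print("coding-strand view of reverse primer:", reverse_complement(REVERSE_PRIMER))
print("expected amplicon length (bp):", expected_amplicon_bp)
```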

Phylogenetic analysis

The almost-complete 16S rDNA sequence (1,443 nucleotides) of Algoriphagus sp. KK10202C was deposited in GenBank under accession number AB086623. Aligned 16S rRNA gene sequences of representative strains of the family Flexibacteriaceae were obtained from the databases of RDP-II release 9 (Cole et al. 2005). Sequences from the GenBank database were aligned to the RDP alignment by using the profile alignment function of ClustalX v. 1.83 (Thompson et al. 1997). Phylogenetic analysis was conducted using MEGA 3.1 (Kumar et al. 2004). Evolutionary distances were calculated using the Kimura 2-parameter model (Kimura 1980). The neighbor-joining method (Saitou and Nei 1987) was used for tree construction. The reliability of the inferred tree was tested by 1,000 bootstrap replicates (Felsenstein 1985).
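For readers unfamiliar with the Kimura 2-parameter model, the distance between two aligned sequences is d = −(1/2) ln[(1 − 2P − Q) √(1 − 2Q)], where P and Q are the proportions of sites showing transitions and transversions, respectively. The following is a minimal sketch of that formula only. It is not the MEGA implementation, the function name is hypothetical, and columns containing gaps or ambiguous bases are simply skipped, which is an assumption made here for simplicity.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def kimura_2p_distance(seq1: str, seq2: str) -> float:
    """Kimura 2-parameter distance between two pre-aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in PURINES | PYRIMIDINES or b not in PURINES | PYRIMIDINES:
            continue  # skip gaps and ambiguous bases (simplifying assumption)
        sites += 1
        if a == b:
            continue
        same_class = ({a, b} <= PURINES) or ({a, b} <= PYRIMIDINES)
        if same_class:
            transitions += 1    # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1  # purine<->pyrimidine
    if sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / sites
    q = transversions / sites
    return -0.5 * math.log((1.0 - 2.0 * p - q) * math.sqrt(1.0 - 2.0 * q))


# Toy example: one transition (A->G) in ten sites.
toy_distance = kimura_2p_distance("AAAAAAAAAA", "GAAAAAAAAA")
print(round(toy_distance, 4))
```

The corrected distance for the toy pair is slightly larger than the raw proportion of differing sites (0.1), which is the purpose of the correction: it accounts for multiple substitutions at the same site.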

Nucleotide accession number

The carotenoid synthesis gene cluster isolated from Algoriphagus sp. KK10202C has been deposited in GenBank under accession number DQ286432.

Results

Isolation of Algoriphagus sp. that synthesized flexixanthin

Strain KK10202C was an orange-pigmented bacterium isolated from a marine sponge. It was gram-negative and rod-shaped, with an approximate size of 1.3 × 0.6 μm. This strain utilized glucose oxidatively, but could not degrade gelatin. Phylogenetic analysis (Fig. 2) revealed that strain KK10202C belongs to the family Flexibacteriaceae (Garrity and Holt 2001). Its 16S rDNA sequence has 98% identity to the 16S rDNA sequences of several other bacteria in the family Flexibacteriaceae, such as Algoriphagus and Cyclobacterium. Those close relatives of KK10202C were also isolated from the marine environment (Matsuo et al. 2003; Nedashkovskaya et al. 2004). The major carotenoid in KK10202C was identified as flexixanthin from its visible spectrum (λmax: 481, 509), 1H-NMR spectrum (data not shown), and mass spectrum (m/e: 582). Since flexixanthin was shown to be synthesized by Algoriphagus sp. (Aasen and Jensen 1966), strain KK10202C was designated Algoriphagus sp., which belongs to the genus Algoriphagus of the Flexibacteriaceae family. Strain Algoriphagus sp. KK10202C has been deposited in the Marine Biotechnology Institute Culture collection (MBIC) as MBIC01539 (previously named Cytophaga sp.). We propose the pathway for synthesis of flexixanthin by Algoriphagus sp. KK10202C shown in Fig. 1. The presence of carotenoids such as lycopene (λmax: 448, 473, 505; m/e: 536), 3,4-dehydrorhodopin (λmax: 454, 477, 509; m/e: 552), and deoxyflexixanthin (λmax: 481, 509; m/e: 566) as minor intermediates in Algoriphagus sp. KK10202C supported the proposed pathway.
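The reported m/e values can be cross-checked against nominal masses computed from molecular formulas. The short sketch below does this for the four carotenoids mentioned above. The molecular formulas are not given in this paper; they are the standard literature formulas for these compounds and are treated here as assumptions, as are the function and variable names.

```python
import re

# Nominal (integer) masses of the most abundant isotopes.
NOMINAL_MASS = {"C": 12, "H": 1, "O": 16}


def nominal_mass(formula: str) -> int:
    """Nominal mass of a simple formula such as 'C40H54O3'."""
    total = 0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += NOMINAL_MASS[element] * (int(count) if count else 1)
    return total


# name: (ASSUMED molecular formula, m/e value reported in the text)
CAROTENOIDS = {
    "lycopene": ("C40H56", 536),
    "3,4-dehydrorhodopin": ("C40H56O", 552),
    "deoxyflexixanthin": ("C40H54O2", 566),
    "flexixanthin": ("C40H54O3", 582),
}

for name, (formula, reported) in CAROTENOIDS.items():
    print(f"{name}: {formula} -> {nominal_mass(formula)} (reported m/e {reported})")
```

With these formulas, the steps between successive compounds (16, 14, and 16 mass units) match what would be expected for hydration combined with desaturation, introduction of a keto group, and hydroxylation, respectively.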

Fig. 2

Phylogenetic tree based on the 16S rRNA gene sequences of strain KK10202C and representative strains of the family Flexibacteriaceae. The tree was constructed by using the neighbor-joining method. Bootstrap values (more than 50%) expressed as percentages of 1,000 replications are shown at the branch points. Numbers in parentheses are DDBJ accession numbers. The scale bar represents 2 nucleotide substitutions per 100 nucleotides

Cloning of a carotenoid synthesis gene cluster from Algoriphagus sp.

The major carotenoid synthesized by Algoriphagus sp. KK10202C is the ketocarotenoid flexixanthin. KK10202C presumably contains a 4,(4′)-β-ionone ring ketolase, whose gene could be identified by a change in the color of β-carotene-producing E. coli cells from yellow to orange. Approximately 30,000 E. coli transformants were obtained from the small-insert library of KK10202C and several orange colonies were identified. HPLC analysis showed that ketocarotenoids (canthaxanthin and echinenone) were produced in the positive E. coli clones. One of the positive clones, pEZ-HY1, was shown to contain an insert of ∼3 kb. Sequencing of the insert confirmed that it contained an intact 4,(4′)-β-ionone ring ketolase gene (crtW). To obtain sequences of other carotenoid synthesis genes adjacent to crtW, a cosmid library was constructed and screened for the presence of crtW by PCR. A 22-kb DNA fragment spanning the crtW gene was assembled from cosmid sequencing. Table 1 shows the organization and homology analysis of the carotenoid synthesis genes and several flanking genes in this region.

Table 1

Organization and homology analysis of the genes in the carotenoid synthesis gene cluster in Algoriphagus sp. KK10202C


Seven genes were transcribed in the same direction, five of which were involved in carotenoid synthesis. ORF2 (designated as CrtI) contained 492 amino acids and showed high sequence similarity to phytoene desaturases/dehydrogenases from many different organisms. ORF3 (designated as CrtB) contained 280 amino acids and showed high sequence similarity to many phytoene synthases. ORF4 (designated as CrtYcd) contained 236 amino acids and showed sequence similarity to hypothetical proteins and lycopene cyclases from different organisms (more details in the next section). ORF5 (designated as IspH) contained 281 amino acids and showed high sequence similarity to many penicillin tolerance proteins (LytB). ORF6 (designated as CrtW) contained 256 amino acids and showed sequence similarity to 4,(4′)-β-ionone ring ketolases from many different organisms. No ORFs on the sequenced 22-kb insert showed homology to geranylgeranyl pyrophosphate synthases (CrtE) or 3,(3′)-β-ionone ring hydroxylases (CrtZ). These genes are presumably located elsewhere in the chromosome. The genes responsible for the aliphatic end modification of flexixanthin (crtCD) (Giraud et al. 2004) were also not found in this region.

Identification and characterization of the fusion-type lycopene β-cyclase gene

BLAST analysis (Altschul et al. 1990) showed that ORF4 (236 amino acids) from Algoriphagus sp. KK10202C had the highest similarity (42% amino acid identity) to a hypothetical protein, Chut02000358, from Cytophaga hutchinsonii. This 245-amino acid hypothetical protein from C. hutchinsonii was derived from automated computational analysis of its genomic sequence. ORF4 also had moderate similarity (25–28% amino acid identity) to a functionally confirmed lycopene β-cyclase from Sulfolobus solfataricus (Hemmi et al. 2003) and a putative lycopene β-cyclase from Picrophilus torridus (Futterer et al. 2004). The 226-amino acid lycopene β-cyclase from the thermoacidophilic archaeon S. solfataricus was shown to be a novel fusion-type enzyme, whose N- and C-terminal halves are homologous to the subunits of bacterial heterodimeric lycopene β-cyclases (CrtYc and CrtYd). The N- and C-terminal halves of ORF4 also showed homology to the CrtYc and CrtYd from Mycobacterium aurum (Viveiros et al. 2000) and Brevibacterium linens (Krubasik and Sandmann 2000a). Multiple sequence alignment of ORF4 with the archaeal fusion-type lycopene β-cyclases and the bacterial heterodimeric enzymes is shown in Fig. 3. ORF4 was designated as CrtYcd to reflect the fusion of CrtYc and CrtYd. This is the first time that a fusion-type lycopene β-cyclase has been identified in eubacteria. The lycopene cyclization function of the crtYcd from Algoriphagus sp. strain KK10202C was confirmed by heterologous expression in a lycopene-accumulating E. coli strain. β-Carotene was produced as the predominant carotenoid and γ-carotene as a minor carotenoid upon expression of crtYcd.
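Percent identity figures such as those quoted above depend on how the denominator is defined. The sketch below illustrates the calculation on a toy pair of pre-aligned sequences, with identity computed either over all alignment columns or over ungapped columns only. The toy sequences and the function name are hypothetical; they are not the actual ORF4 alignments, and this is not the BLAST implementation.

```python
def percent_identity(aligned1: str, aligned2: str, include_gaps: bool = True) -> float:
    """Percent identity between two pre-aligned sequences ('-' marks a gap).

    include_gaps=True divides by the full alignment length;
    include_gaps=False divides by the number of ungapped columns only.
    """
    if len(aligned1) != len(aligned2):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    columns = 0
    for a, b in zip(aligned1, aligned2):
        gapped = a == "-" or b == "-"
        if gapped and not include_gaps:
            continue  # ignore gapped columns entirely
        columns += 1
        if not gapped and a == b:
            matches += 1
    if columns == 0:
        raise ValueError("no columns to compare")
    return 100.0 * matches / columns


# Hypothetical toy alignment: 5 identical residues, 2 mismatches, 1 gap.
TOY_A = "MGNYLYLA"
TOY_B = "MGNF-YLS"

print(percent_identity(TOY_A, TOY_B, include_gaps=True))
print(percent_identity(TOY_A, TOY_B, include_gaps=False))
```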

Fig. 3

Multiple sequence alignment of the fusion-type lycopene β-cyclases with the heterodimeric lycopene β-cyclases. The sequences used for the alignment are as follows: the fusion-type lycopene β-cyclase from Algoriphagus sp. KK10202C (accession number DQ286432); the putative fusion-type lycopene β-cyclase from Cytophaga hutchinsonii (accession number ZP_00310778); the fusion-type lycopene β-cyclase from Sulfolobus solfataricus (accession number NP_344223); the fusion-type lycopene β-cyclase from Picrophilus torridus (accession number YP_024312); the heterodimeric lycopene β-cyclase from Brevibacterium linens (accession number AF139916); the heterodimeric lycopene β-cyclase from Mycobacterium aurum (accession number AJ133724). The alignment was generated by AlignX, a component of Vector NTI Suite 7.0 (Informax Inc., Bethesda, MD, USA). The identical or conservative residues were highlighted in black boxes; the similar residues in gray boxes

Discussion

This paper reports the cloning of a carotenoid synthesis gene cluster from a marine Algoriphagus isolate and the first identification of a fusion-type lycopene β-cyclase in eubacteria. Heterodimeric lycopene β-cyclases were previously reported in gram-positive bacteria (Krubasik and Sandmann 2000a; Viveiros et al. 2000). The novel fusion-type lycopene β-cyclases, whose N- and C-terminal halves are homologous to the CrtYc and CrtYd subunits of the bacterial heterodimeric enzymes, were recently described in archaea (Peck et al. 2002; Hemmi et al. 2003). In the thermoacidophilic archaeon S. solfataricus, the fusion-type crtY was located in a cluster with putative crtB, crtZ, and crtI genes. In the halophilic archaeon Halobacterium salinarum, the fusion-type crtY was upstream of blh and was required for bacteriorhodopsin biogenesis. The CrtYcd from Algoriphagus showed significant homology to the fusion-type CrtY from Sulfolobus, but no significant homology to the fusion-type CrtY from Halobacterium, whose function was also confirmed experimentally. Phylogenetic analysis was previously performed with the respective subunits or domains of the bacterial heterodimeric lycopene cyclases, the archaeal fusion-type lycopene cyclases, and the fungal lycopene cyclase–phytoene synthase fusion proteins (Hemmi et al. 2003). Two subgroups emerged, which separated the two functionally confirmed archaeal lycopene cyclases. Subgroup A contained the enzymes from Sulfolobus, Brevibacterium, and Mycobacterium, and subgroup B contained those from Halobacterium, Myxococcus (putative), and fungi. Based on sequence homology, the Algoriphagus CrtYcd appears to belong to subgroup A. It was proposed that the fusion-type lycopene cyclases might be evolutionary intermediates for the fungal CrtYB, in which the fusion CrtY was further fused with the phytoene synthase CrtB.

Aside from homology with the known lycopene β-cyclases, the CrtYcd from Algoriphagus sp. had the highest similarity to a hypothetical protein, Chut02000358, from C. hutchinsonii. Phylogenetic analysis of the strains in Fig. 2 showed that Cytophaga is closely related to Algoriphagus and belongs to the same Flexibacteraceae family as Algoriphagus. Chut02000358 from C. hutchinsonii would be another bacterial fusion-type CrtYcd if it were demonstrated to function as a lycopene β-cyclase.

The bicyclase activity of the CrtYcd from KK10202C was somewhat unexpected since only monocyclic carotenoids were detected in strain KK10202C. It appears that the presence of monocyclic carotenoids in strain KK10202C was not due to monocyclase activity of the lycopene cyclase, as reported in several other bacteria (Teramoto et al. 2003; Tao et al. 2004). It is likely that in the native KK10202C host, CrtCD competes with CrtYcd for the lycopene substrate (Fig. 1). Once one end of acyclic lycopene has been modified by CrtCD, only the other linear end is available for cyclization by CrtYcd, leading to monocyclic carotenoids such as flexixanthin.

It is interesting that the isoprenoid gene ispH was found for the first time to be organized with the carotenoid genes in this cluster. IspH was renamed from LytB since it was recently shown to encode 4-hydroxy-3-methylbut-2-enyl diphosphate reductase, which is part of the nonmevalonate isoprenoid pathway (Rohdich et al. 2002) to synthesize the IPP precursor. IspH (LytB) of E. coli is essential for cell growth (McAteer et al. 2001). Heterologous expression of ispH (lytB) in E. coli was shown previously to be able to increase carotenoid production (Cunningham et al. 2000). Linking of isoprenoid genes with carotenoid synthesis genes was observed previously in Pantoea agglomerans with a different isoprenoid gene idi encoding isopentenyl pyrophosphate isomerase (Hahn et al. 1999), which was located with the crtEXYIBZ genes as a cluster (GenBank accession number M87280).

Several genes (crtE, crtZ, crtC, and crtD) of the carotenoid synthesis pathway in Algoriphagus sp. KK10202C were not located in the cluster shown in Table 1. None of them was identified in the 15-kb region downstream of crtW obtained from cosmid sequencing. A degenerate PCR approach was attempted to clone crtZ but was not successful. Another approach would be to screen for the activity of the individual genes in an appropriate reporter strain. The crtE gene could be screened for by complementation of a crt cluster containing a crtE knockout mutation. The crtZ, crtC, or crtD genes would probably have to be screened for by HPLC or TLC. Cloning and characterization of the remaining carotenoid synthesis genes would give a complete picture of carotenoid synthesis in Algoriphagus sp. KK10202C. The functionalization genes (crtY, crtW, crtZ, crtC, and crtD) could also be exploited for combinatorial synthesis of novel carotenoids and heterologous production of desirable carotenoids.