Introduction

Rosmarinic acid (RA) is a well-known constituent of members of the Lamiaceae and Boraginaceae but also occurs in evolutionarily more distant plant families and even in monocotyledonous plants as well as ferns and hornworts (see Petersen and Simmonds 2003 for a review). The antiviral, antimicrobial and anti-inflammatory activities of RA add to the beneficial effects of important medicinal plants like Melissa officinalis, Salvia officinalis and Mentha × piperita. RA is an ester of caffeic acid and 3,4-dihydroxyphenyllactic acid; it is formed from the amino acids L-phenylalanine and L-tyrosine (Ellis and Towers 1970). A biosynthetic pathway for RA in the ornamental plant Coleus blumei (Lamiaceae) was proposed some time ago (Petersen et al. 1993; Petersen 1997). Phenylalanine is transformed to 4-coumaroyl-CoA by the enzymes of the general phenylpropanoid pathway (phenylalanine ammonia-lyase, cinnamic acid 4-hydroxylase, hydroxycinnamic acid:coenzyme A ligase). Tyrosine is transaminated by tyrosine aminotransferase to 4-hydroxyphenylpyruvate, which is further reduced to 4-hydroxyphenyllactate (pHPL) by hydroxyphenylpyruvate reductase. A cDNA encoding an enzyme catalyzing this reaction has recently been cloned from Coleus blumei (Kim et al. 2004). The hydroxycinnamoyl moiety of hydroxycinnamoyl-CoA is then transferred to the aliphatic hydroxyl group of a hydroxyphenyllactate by hydroxycinnamoyl-CoA:hydroxyphenyllactate hydroxycinnamoyltransferase (“rosmarinic acid synthase”, RAS, E.C. 2.3.1.140), which was shown to prefer the monohydroxylated substrates 4-coumaroyl-CoA and pHPL (Fig. 1; Petersen and Alfermann 1988; Petersen 1991). The hydroxyl groups in positions 3 and 3′ of the aromatic rings are finally introduced by cytochrome P450 monooxygenases (Petersen 1997).
Attempts to purify RAS from suspension cultures of Coleus blumei (Petersen 1993) showed that the enzyme is highly active at barely detectable protein quantities, which hampered amino acid sequence determination. Here we report a modified method to purify RAS, which eventually led to the identification of peptide sequences, cloning of a corresponding cDNA by a PCR-based approach and expression of active RAS.

Fig. 1 Reactions catalyzed by hydroxycinnamoyl-CoA:hydroxyphenyllactate hydroxycinnamoyltransferase = rosmarinic acid synthase (RAS) from Coleus blumei

Materials and methods

Materials

Primer synthesis and DNA sequencing were commercially performed by MWG Biotech, Martinsried, Germany. All reactions using commercial kits were performed according to the manufacturers’ protocols.

Cell cultures

Suspension cultures of Coleus blumei were initiated and propagated as described previously (Petersen and Alfermann 1988). Biosynthesis of RA was enhanced by feeding 4% sucrose in CB-medium (CB4) to the cell cultures.

Purification and peptide sequence determination of RAS

Suspension cells of Coleus blumei cultivated for 7 days in CB4-medium were harvested by suction filtration and homogenized in a mortar (50 g each; total 2400 g) together with 10 g Polyclar 10 and 50 ml 0.1 M potassium phosphate buffer pH 7.5 containing 1 mM DTT. The homogenate was centrifuged at 4°C and 48,000 g for 20 min and the supernatant subjected to (NH4)2SO4 precipitation (60–80% saturation). The precipitated protein was collected by centrifugation for 20 min at 4°C and 48,000 g and redissolved in 10 mM potassium phosphate buffer pH 7.0 containing 1 mM DTT (buffer A) with 1 M (NH4)2SO4. The protein was applied to a Fractogel TSK-Butyl (Merck, Darmstadt, Germany) column (10 ml) pre-equilibrated with buffer A with 1 M (NH4)2SO4 and was eluted with 45 ml of a linear gradient of (NH4)2SO4 (1 M to 0 M) in buffer A at 1 ml/min. Protein elution was monitored at 280 nm. Fractions containing RAS activity (see below) were pooled and concentrated by centrifugal ultrafiltration (Millipore Ultrafree Biomax 10 K). The concentrated protein was then applied to a Reactive Yellow 86 Agarose column (6 ml; Sigma-Aldrich, Taufkirchen, Germany) pre-equilibrated with buffer A. The column was washed with the same buffer at 0.5 ml/min. Fractions containing RAS activity were pooled and applied to a Fractogel TSK AF-Blue 650 M (Merck) column (8 ml) equilibrated with buffer A. Elution was performed with 30 ml of a linear gradient of 0–1 M KCl in buffer A. Fractions containing RAS activity were concentrated to 1 ml and subjected to non-denaturing polyacrylamide gel electrophoresis essentially according to Ornstein (1964). After electrophoresis the gel was cut into pieces (1 × 2 cm) and the protein was eluted for 5 h at 4°C with 400 μl 0.1 M potassium phosphate buffer pH 7.0 with 10 mM DTT and 0.5 mM ascorbic acid. The eluates were used for RAS activity assays (see below) and subjected to SDS-PAGE.
The band (after Coomassie staining) corresponding to the distribution of RAS activity in the fractions was used for the determination of peptide sequences. The peptide sequence determination was performed by Dr. Peter Hunziker (Biochemical Institute, University of Zuerich, Switzerland). The protein was tryptically digested in the gel and the peptides after elution were analyzed by MALDI-TOF and Nano-Electrospray.

Determination of protein concentrations

Protein concentrations were determined by the method of Bradford (1976) using bovine serum albumin as a standard.

Isolation of a full-length cDNA for RAS

Short degenerate primers were designed according to two RAS-specific peptides (VEFYP, DEDYL; see Fig. 2) and the DFGWG-motif that is conserved in acyltransferases of the BAHD superfamily (St. Pierre and De Luca 2000). Suspension cells were harvested 5, 6 and 7 days after transfer to fresh CB4-medium and then used for the isolation of total RNA with the phenol/chloroform method essentially according to Sambrook et al. (1989). The first-strand cDNA was synthesized with the RevertAid kit (Fermentas, St. Leon-Rot, Germany). A first PCR was performed with the forward primer 5′-GTNGARTTYTAYCC-3′ and the reverse-complementary primer 5′-CCCCANCCRAAR-3′ using GoTaq polymerase (Promega, Mannheim, Germany). A nested PCR was then performed on 2 μl of the first PCR amplification with the same forward primer and the reverse primer 5′-ARRTARTCYTCRTC-3′. Both PCR reactions were run for 40 cycles: 1 min 95°C (first cycle 2 min), 1 min 53°C (first cycle 2 min), 2 min 70°C (last cycle 10 min). A dominant band at 800–850 bp was seen on the agarose gel (0.7%); the amplicon was eluted from the gel with a NucleoSpin kit (Macherey and Nagel, Dueren, Germany), ligated into pGEM-T® (Promega) and transferred into Escherichia coli DH5α made competent with RotiTransform (Roth, Karlsruhe, Germany). The plasmid was isolated with the QIAprep Spin Miniprep Kit (Qiagen, Hilden, Germany) and sequenced. 3′- and 5′-RACE PCR experiments were performed with the GeneRacer system (Invitrogen, Karlsruhe, Germany) using the system-specific primers and the following gene-specific primers: 5′-RACE-primer 5′-GCCGTGGCGGGGGCGGTTTTGAA-3′, 5′-nested RACE-primer 5′-TGGGCGATGTCGGTGTGGGGAAGA-3′, 3′-RACE-primer 5′-CCTACCCCGCTGCCGCACTTCGA-3′, 3′-nested RACE-primer 5′-ACCCTCTTCCCCACACCGACATC-3′. Amplicons were ligated into pGEM-T® and transformed into E. coli JM109 made competent with RotiTransform. Inserts were sequenced and used for the design of primers for full-length amplification.
The full-length open reading frame was amplified by PCR using the Fermentas High Fidelity PCR Enzyme Mix with the forward and reverse primers, 5′-ATTACATATGAAGATAGAAGTCAAAGACTC-3′ and 5′-TAGGATCCTCATCAAATCTCATAAAACAACTTCTC-3′, respectively, introducing 5′-NdeI (CATATG) and 3′-BamHI (GGATCC) restriction sites.
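The relationship between the peptides and the primers described above can be checked with a short script. The following is a minimal sketch and not part of the original procedure: it back-translates each peptide with IUPAC ambiguity codes, forms the reverse complement where a reverse primer is needed, and confirms that the full-length primers carry the NdeI (CATATG) and BamHI (GGATCC) recognition sequences. The function names, the slicing positions and the use of YTN as the degenerate leucine codon are illustrative assumptions.

```python
# Complement of each IUPAC nucleotide code (ambiguity codes included).
COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A",
    "R": "Y", "Y": "R", "S": "S", "W": "W", "K": "M", "M": "K",
    "B": "V", "V": "B", "D": "H", "H": "D", "N": "N",
}

# Degenerate codons (standard genetic code) for the residues used here.
# Assumption: leucine is written as YTN, which covers CTN, TTA and TTG.
DEGENERATE_CODON = {
    "V": "GTN", "E": "GAR", "F": "TTY", "Y": "TAY", "P": "CCN",
    "D": "GAY", "G": "GGN", "W": "TGG", "L": "YTN",
}


def back_translate(peptide):
    """Return the degenerate coding sequence for a peptide."""
    return "".join(DEGENERATE_CODON[residue] for residue in peptide)


def reverse_complement(seq):
    """Return the reverse complement of an IUPAC nucleotide sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


# Forward primer: first 14 nucleotides of the VEFYP coding sequence.
forward = back_translate("VEFYP")[:14]

# Nested reverse primer: reverse complement of the first 14 nt of DEDYL.
nested_reverse = reverse_complement(back_translate("DEDYL")[:14])

# First-round reverse primer: reverse complement of 12 nt within DFGWG.
first_reverse = reverse_complement(back_translate("DFGWG")[2:14])

print(forward)         # GTNGARTTYTAYCC
print(nested_reverse)  # ARRTARTCYTCRTC
print(first_reverse)   # CCCCANCCRAAR

# Restriction sites in the full-length amplification primers.
FULL_FORWARD = "ATTACATATGAAGATAGAAGTCAAAGACTC"
FULL_REVERSE = "TAGGATCCTCATCAAATCTCATAAAACAACTTCTC"
print("CATATG" in FULL_FORWARD, "GGATCC" in FULL_REVERSE)  # True True
```

All three derived sequences match the degenerate primers given in the text, which supports the reading that the primers were truncated at codon wobble positions to limit degeneracy.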

Fig. 2 Nucleotide and amino acid sequence of rosmarinic acid synthase. RAS-peptides identified by amino acid sequence determination are shaded in grey; conserved motifs in acyltransferases of the BAHD superfamily are printed in bold letters. Primers (in italics) used for the isolation of the cDNA are indicated by arrows: full arrows: primers for first and nested PCR, broken arrows: primers for 3′- and 5′-RACE and nested RACE PCR

Expression of RAS-cDNA

The full-length PCR-amplicon was digested with NdeI and BamHI (Fermentas), separated on a 0.7% agarose gel, purified with a NucleoSpin column and ligated in-frame into the expression vector pET15b (Novagen, Bad Soden, Germany) digested with the same restriction enzymes. The plasmid was introduced into E. coli BL21(DE3)pLysS made competent with RotiTransform. Bacteria harboring pET15b without insert or pET15b with RAS-cDNA were inoculated into 5 ml LB-medium supplemented with 100 μg/ml ampicillin (LBamp100). The cells were grown at 37°C overnight under shaking (200 rpm) and 2 ml transferred into 50 ml fresh LBamp100-medium. IPTG was added to a final concentration of 1 mM when bacterial growth was equivalent to OD600 = 1.4; control cultures were without IPTG addition. After incubation at 37°C under shaking (200 rpm) for 5–6 h bacteria were harvested by centrifugation (5,000 g, 5 min). The bacterial sediments were washed with 12.5 ml 20 mM Tris/HCl pH 8.0, centrifuged as before and stored at −70°C. Soluble protein was extracted by sonication (0.3 cycles, 100%, 1 min) of the bacteria in 3 ml 50 mM sodium phosphate buffer pH 7.5 (crude protein extracts) or 4 ml HisTag binding buffer per gram fresh weight of bacteria. The crude extract was centrifuged at 10,000 g for 20 min at 4°C to yield a cell-free supernatant that was assayed for RAS activity. Further purification by metal chelate chromatography was performed with Ni-NTA His-Bind Superflow Resin (Novagen) and the eluates were tested for RAS activity as well.

Enzyme assays

The enzymatic activity in purification fractions, gel eluates or bacterial extracts with heterologously expressed protein was determined by incubating protein fractions in a total volume of 125/250 μl 0.1 M potassium phosphate buffer pH 7.0 with 1 mM DTT and 0.5 mM ascorbic acid: 0.2 mM 4-coumaroyl- or caffeoyl-CoA and 0.4 mM acceptor (pHPL, DHPL, shikimic or quinic acid dissolved in 20% ethanol) for up to 30 min at 30°C. The reaction was stopped by adding 10/20 μl 6 N HCl. The reaction products were extracted three times with 0.3/0.5 ml ethyl acetate each. The combined extracts were evaporated and redissolved in 100 μl 50% aqueous methanol/0.01% H3PO4. HPLC analysis of the reaction products was performed on a Hypersil ODS column (length 270 mm, diameter 4 mm, particle size 5 μm) with 40 or 45% aqueous methanol/0.01% H3PO4 at a flow of 1 ml/min; detection wavelength 333 nm. Authentic standards (25 μM) of pC-pHPL, Caf-pHPL, pC-DHPL and RA were from our laboratory collection; 4-coumaroyl-shikimate, caffeoyl-shikimate and 4-coumaroyl-quinate were a kind gift of Dr. Pascaline Ullmann (Strasbourg, France); shikimic, quinic and chlorogenic acids were purchased from Roth.

Phylogenetic analysis

Phylogenetic trees were constructed with the PHYLIP program package v. 3.65 (Felsenstein 1994) using 12 sequences encoding (putative) plant hydroxycinnamoyltransferases. Alignments were performed with Clustal W v. 1.83 (Thompson et al. 1994). The tree was drawn with Phylodendron (http://www.iubio.bio.indiana.edu/soft/molbiol/java/apps/trees/).
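Bootstrap support values for such trees rest on resampling the columns of the alignment with replacement. The sketch below illustrates only this general idea in plain Python; it is not PHYLIP's implementation, and the toy alignment, the sequence names and the function name are invented for illustration.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Build one bootstrap replicate by resampling columns with replacement."""
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(seq) != length for seq in alignment.values()):
        raise ValueError("all aligned sequences must have the same length")
    # Draw as many column indices as the alignment has columns.
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(alignment[name][c] for c in columns) for name in names}


# Hypothetical toy alignment (not real hydroxycinnamoyltransferase data).
toy_alignment = {
    "seqA": "ACDEFGHK",
    "seqB": "ACDQFG-K",
    "seqC": "ASDEFGHR",
}

rng = random.Random(0)  # fixed seed so the replicates are reproducible
replicates = [bootstrap_alignment(toy_alignment, rng) for _ in range(100)]
print(len(replicates), "replicates of length", len(replicates[0]["seqA"]))
```

Each replicate would then be passed to the tree-building method, and the fraction of replicate trees containing a given partition gives the value reported at that branch.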

Results

Purification of hydroxycinnamoyl-CoA:hydroxyphenyllactate hydroxycinnamoyltransferase (rosmarinic acid synthase, RAS)

A previously reported purification of RAS had resulted in a 225-fold enrichment of the RAS activity in the final protein fraction, which showed only one protein band after silver staining and two-dimensional electrophoresis (Petersen 1993). Since this protein band proved not to be RAS, a new purification strategy (fractionated ammonium sulphate precipitation, hydrophobic interaction and two affinity chromatography steps) was followed, which resulted in a 119-fold enrichment of RAS activity after the second affinity chromatography step (see Table 1); this protein fraction had a specific activity of 0.7 mkat kg−1. Due to high losses during the chromatographic procedures the yield was only 0.003%. The partially purified protein preparation with RAS activity was subjected to non-denaturing PAGE as the final purification step. Protein eluted from the gel slices still showed RAS activity, and bands on SDS-PAGE corresponding to this activity were sequenced. The sequencing of tryptic peptides resulted in the identification of six peptide sequences (note that I and L cannot be distinguished by their molecular mass): DF(L/I)E(L/I)QED(L/I)SK, A(L/I)VEFYPSFGR, (L/I)DEDY(L/I)R, PAPTP(L/I) and SY(L/I)(L/I)P; two other peptide sequences, LTRDQLNSLK and QHMERFEK, were already known from a former sequencing attempt.

Table 1 Purification of rosmarinic acid synthase from suspension cultures of Coleus blumei
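The quantities in a purification table follow from two simple ratios: fold enrichment is the specific activity of a step divided by that of the crude extract, and yield is the total activity of a step as a percentage of the total activity in the crude extract. The sketch below, with illustrative function names, applies the first ratio to the two values stated in the text. The crude-extract specific activity it prints is only implied by those values and is not a figure reported here.

```python
def fold_enrichment(specific_activity_step, specific_activity_crude):
    """Specific activity of a step relative to the crude extract."""
    return specific_activity_step / specific_activity_crude


def percent_yield(total_activity_step, total_activity_crude):
    """Total activity recovered in a step, as a percentage of the crude extract."""
    return 100.0 * total_activity_step / total_activity_crude


# Values stated in the text for the second affinity chromatography step.
final_specific_activity = 0.7  # mkat per kg protein
reported_fold = 119

# Crude-extract specific activity implied by the two reported numbers.
implied_crude = final_specific_activity / reported_fold
print(f"implied crude specific activity: {implied_crude:.4f} mkat/kg")  # 0.0059
```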

Isolation of a full-length cDNA for RAS

Only short degenerate primers corresponding to the amino acid sequences VEFYP and DEDYL were used together with a degenerate primer corresponding to the highly conserved motif DFGWG (St. Pierre and De Luca 2000) in BAHD acyltransferases. A nested PCR approach resulted in the amplification of an 810 bp sequence. RACE-PCR was then performed to identify the 5′- and 3′-ends of the cDNA. The deduced full-length cDNA (EMBL accession number AM283092) and amino acid sequences showed high similarities to acyltransferases: the highest identity on nucleotide level was found to an alcohol acyltransferase from Prunus mume (58.9%; AB218791) and on amino acid level to a hydroxycinnamoyl-CoA:shikimate/quinate hydroxycinnamoyltransferase from Nicotiana tabacum (55.6%; CAD47830; Hoffmann et al. 2003). The open reading frame of the cDNA had a length of 1290 base pairs encoding a protein of 430 amino acids with a calculated molecular mass of 47,932 Da and an isoelectric point of 5.89. Transit peptides for chloroplasts, mitochondria or endoplasmic reticulum were not predicted by suitable programs. The amino acid sequence contained all tryptic RAS peptides (see above and Fig. 2) with the exception of two altered amino acids in one of the peptides. The protein can be grouped into the superfamily of BAHD acyltransferases (St. Pierre and De Luca 2000). This family was named according to its first four members: BEAT (benzylalcohol O-acetyltransferase from Clarkia breweri), AHCT (anthocyanin O-hydroxycinnamoyltransferase), HCBT (anthranilate N-hydroxycinnamoyl/benzoyltransferase) and DAT (deacetylvindoline 4-O-acetyltransferase); see St. Pierre and De Luca (2000) and D’Auria (2006) for further information. Conserved sequence motifs for this enzyme family are the HXXXD motif, which can be recognized at amino acid positions 152–156, and the DFGWG motif, detectable at positions 377–381 (see Fig. 2, bold letters).
Sequence alignments of hydroxycinnamoyltransferases accepting quinate and shikimate together with RAS show a number of conserved stretches such as the typically conserved motifs HXXXD and DFGWG but also distinct differences (not shown).

Heterologous expression and activity determination

The full-length open reading frame was amplified by PCR and ligated into the NdeI and BamHI restriction sites of the expression vector pET15b, thus introducing an N-terminal His6-tag into the heterologously expressed protein. The vector harboring the putative RAS sequence was introduced into E. coli BL21(DE3)pLysS and the bacteria were induced to express the protein by IPTG. Crude bacterial protein extracts as well as eluates from HisTag-purification were assayed with the following substrates: 4-coumaroyl- and caffeoyl-CoA as hydroxycinnamoyl donors and pHPL, DHPL, shikimate and quinate as hydroxycinnamoyl acceptors. Product formation was observed with 4-coumaroyl- as well as caffeoyl-CoA and only with pHPL and DHPL (Fig. 1 and Table 2). Shikimic and quinic acid did not serve as acceptors. This makes the newly isolated cDNA and enzyme quite distinct from other hydroxycinnamoyltransferases like the hydroxycinnamoyl-CoA:shikimate/quinate hydroxycinnamoyltransferase from tobacco (Hoffmann et al. 2003) despite the sequence similarity of 55.6% on amino acid level. The specific activities of the HisTag-purified RAS protein ranged between 10 and 30 mkat kg−1.

Table 2 Relative activities of heterologously expressed RAS with 4-coumaroyl- and caffeoyl-CoA as hydroxycinnamoyl donors and pHPL, DHPL, shikimate and quinate as acceptors. A relative activity of 100% corresponded to a specific activity of 11 mkat kg−1 in HisTag-purified protein preparations
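For comparison with literature values given in other units, specific activities in mkat kg−1 can be converted as follows. This is a small sketch with illustrative function names; it relies only on the definitions 1 kat = 1 mol s−1 and 1 U = 1 μmol min−1, from which 1 mkat kg−1 equals 1 nkat mg−1 or 0.06 U mg−1. The relative-activity helper mirrors the normalization described in the caption of Table 2.

```python
KAT_PER_UNIT = 1e-6 / 60.0  # 1 U = 1 umol/min = (1e-6 / 60) mol/s


def mkat_per_kg_to_nkat_per_mg(value):
    """1 mkat/kg = 1e-3 kat / 1e6 mg = 1e-9 kat/mg = 1 nkat/mg."""
    kat_per_mg = value * 1e-3 / 1e6
    return kat_per_mg / 1e-9


def mkat_per_kg_to_units_per_mg(value):
    """Convert mkat/kg to enzyme units (umol/min) per mg protein."""
    kat_per_mg = value * 1e-3 / 1e6
    return kat_per_mg / KAT_PER_UNIT


def relative_activity(specific_activity, reference):
    """Activity as a percentage of the reference (100%) value."""
    return 100.0 * specific_activity / reference


for sa in (10, 11, 30):  # range and reference value stated in the text
    print(f"{sa} mkat/kg = {mkat_per_kg_to_units_per_mg(sa):.2f} U/mg")
# 10 mkat/kg = 0.60 U/mg
# 11 mkat/kg = 0.66 U/mg
# 30 mkat/kg = 1.80 U/mg
```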

Discussion

A modified purification procedure for RAS from suspension-cultured cells of Coleus blumei including a non-denaturing polyacrylamide gel electrophoresis and elution of active protein from the gel (Table 1) produced sufficient protein amounts for amino acid sequence determination of short peptides after a tryptic digest. During the purification procedure a problem similar to that reported by Hoffmann et al. (2003) occurred: the activity of RAS is very high at barely detectable protein concentrations, which hampered the purification of sufficient amounts of protein for amino acid sequence determination. The amino acid sequences determined by MALDI-TOF and Nano-Electrospray did not allow the design of long PCR primers without too high a degree of degeneracy, due to ambiguities in amino acids (leucine and isoleucine have identical masses) and to the occurrence of amino acids with high numbers of codons. Therefore only very short primers were designed according to the sequences VEFYP and DEDYL and were used together with a degenerate primer corresponding to the highly conserved motif DFGWG (St. Pierre and De Luca 2000) in BAHD acyltransferases in a nested PCR approach. PCR and RACE-PCR experiments finally resulted in the identification of a full-length cDNA clone with similarities to acyltransferases of the BAHD superfamily (St. Pierre and De Luca 2000; D’Auria 2006). The highest similarity on amino acid level was found to a hydroxycinnamoyl-CoA:shikimate/quinate hydroxycinnamoyltransferase from Nicotiana tabacum (55.6%; Hoffmann et al. 2003). Similarities can be observed throughout the whole sequence; however, RAS shows a deletion of seven amino acids after amino acid number 222 (RAS numbering) and an insertion of four amino acids after amino acid 238 compared to the tobacco hydroxycinnamoyl-CoA:shikimate/quinate hydroxycinnamoyltransferase.
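The degeneracy argument can be made concrete by counting how many distinct nucleotide sequences encode a peptide. The sketch below uses the codon counts of the standard genetic code and treats an ambiguous leucine/isoleucine position (written J, the IUPAC code for Leu or Ile) as 6 + 3 = 9 codons. The function names and the choice of example peptides from the list in the Results are illustrative.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code.
CODON_COUNT = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
    "J": 9,  # Leu or Ile, indistinguishable by mass: 6 + 3 codons
}

# Number of bases represented by each IUPAC nucleotide code.
IUPAC_SIZE = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3, "N": 4,
}


def peptide_degeneracy(peptide):
    """Number of nucleotide sequences that can encode the peptide."""
    return prod(CODON_COUNT[residue] for residue in peptide)


def primer_degeneracy(primer):
    """Number of distinct sequences in a degenerate primer pool."""
    return prod(IUPAC_SIZE[base] for base in primer)


for peptide in ("VEFYP", "DEDYL", "DFGWG", "PAPTPJ", "SYJJP"):
    print(peptide, peptide_degeneracy(peptide))
# VEFYP 128
# DEDYL 96
# DFGWG 64
# PAPTPJ 9216
# SYJJP 3888

for primer in ("GTNGARTTYTAYCC", "ARRTARTCYTCRTC", "CCCCANCCRAAR"):
    print(primer, primer_degeneracy(primer))
# GTNGARTTYTAYCC 32
# ARRTARTCYTCRTC 32
# CCCCANCCRAAR 16
```

Peptides with L/I ambiguities or several serine, leucine or proline residues are one to two orders of magnitude more degenerate than the three stretches chosen, and truncating the primers at wobble positions lowers the pool size further.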

The BAHD acyltransferase superfamily comprises proteins catalyzing the acyl transfer from coenzyme A-activated acids to varying acceptor molecules (St. Pierre and De Luca 2000; D’Auria 2006). Several members of this family are involved in plant secondary metabolism, e.g. deacetylvindoline 4-O-acetyltransferase (DAT) from Catharanthus roseus (St. Pierre et al. 1998), vinorine synthase from Rauvolfia serpentina (Bayer et al. 2004), anthranilate N-hydroxycinnamoyl/benzoyltransferase from Dianthus caryophyllus (Yang et al. 1997) or hydroxycinnamoyl-CoA:shikimate/quinate hydroxycinnamoyltransferase (Hoffmann et al. 2003). In contrast, the transfer of acyl moieties from glucose esters in plant secondary metabolism is catalyzed by a different enzyme family, the serine carboxypeptidase-like acyltransferase family (Milkowski and Strack 2004). A first crystal structure of a member of the BAHD acyltransferase family has been published recently (Ma et al. 2005). Genes encoding enzymes transferring hydroxycinnamoyl moieties have been found in three out of five clades as defined for this acyltransferase family by D’Auria (2006). Sequence homology alone is therefore not sufficient for functional predictions.

Phylogenetic trees were constructed by parsimony, distance matrix plus neighbor joining and maximum likelihood methods of the PHYLIP program package from 12 published amino acid sequences of hydroxycinnamoyltransferases. A tree constructed by the maximum likelihood method is shown in Fig. 3. All trees, irrespective of the method used, show that RAS is clearly separated from the hydroxycinnamoyl-CoA:quinate/shikimate hydroxycinnamoyltransferase sequences. However, the overall topology of the trees differs distinctly between methods, probably because of the limited number of known sequences of enzymes using hydroxycinnamoyl-CoA as a substrate. D’Auria (2006) had already shown that hydroxycinnamoyltransferases can be found in three out of five clades in the BAHD superfamily and therefore might not have evolved as a single group.

Fig. 3 Unrooted phylogenetic tree constructed on the basis of 12 amino acid sequences of (putative) hydroxycinnamoyltransferases by the maximum likelihood method of the PHYLIP package. Values at the branches indicate numbers of partitions of 100 bootstrapping replicates. AAO73071: agmatine N-hydroxycinnamoyltransferase from Hordeum vulgare; CAB06430: anthranilate N-hydroxycinnamoyl/benzoyltransferase from Dianthus caryophyllus; BAA74428: anthocyanin 5-aromatic acyltransferase from Gentiana triflora; BAA93475: hydroxycinnamoyl-CoA:anthocyanin 3-O-glucoside-6’’-O-acyltransferase from Perilla frutescens; BAD72525: shikimate/quinate hydroxycinnamoyltransferase from Oryza sativa; BAC78633: hydroxyanthranilate N-hydroxycinnamoyltransferase from Avena sativa; NP_199704: hydroxycinnamoyltransferase from Arabidopsis thaliana; CAD47830, CAE46932: shikimate/quinate hydroxycinnamoyltransferases from Nicotiana tabacum; ABA46756: shikimate/quinate hydroxycinnamoyltransferase from Solanum tuberosum; CAE46933: shikimate/quinate hydroxycinnamoyltransferase from Lycopersicon esculentum; AM283092: rosmarinic acid synthase from Coleus blumei

The open reading frame of the cDNA isolated from Coleus blumei was transferred into the expression vector pET15b and heterologously expressed in E. coli as an N-terminal His6-fusion protein. Crude bacterial protein extracts as well as protein purified via the His-tag were tested for enzyme activities. Despite the similarities to hydroxycinnamoyltransferases accepting shikimate and/or quinate as substrates the heterologously expressed protein from Coleus blumei could not form hydroxycinnamoylshikimates or hydroxycinnamoylquinates but catalyzed the formation of rosmarinic acid or corresponding less hydroxylated esters from 4-coumaroyl- and caffeoyl-CoA and hydroxyphenyllactates (see Fig. 1).

So far, acceptors for hydroxycinnamoyl transfer catalyzed by enzymes described on the molecular level have been anthocyanins in Gentiana triflora (BAA74428) or Perilla frutescens (BAA93475; Fujiwara et al. 1998; Yonekura-Sakakibara et al. 2000), agmatine in Hordeum vulgare (AAO73071; Burhenne et al. 2003), anthranilate in Dianthus caryophyllus (CAB06430; Yang et al. 1997), hydroxyanthranilate in Avena sativa (BAC78633; Yang et al. 2004) as well as shikimic and quinic acid in Nicotiana tabacum (CAE46932, CAD47830), Solanum tuberosum (ABA46756), Lycopersicon esculentum (CAE46933), Arabidopsis thaliana (NP_199704) and Oryza sativa (BAD72525; Hoffmann et al. 2003, 2005; Niggeweg et al. 2004). A cDNA sequence coding for an enzyme accepting hydroxyphenyllactic acids as hydroxycinnamoyl acceptors is presented here for the first time; this enzyme is active in the biosynthesis of rosmarinic acid, which was formerly considered to be a typical compound of Lamiaceae and Boraginaceae. In recent years, however, it has been shown that rosmarinic acid is much more widely distributed in the plant kingdom (Petersen and Simmonds 2003). This raises the question whether the ester-forming enzymes in all plant taxa are related to each other or whether the ability to synthesize rosmarinic acid has evolved independently several times. Recent investigations on other plant biosynthetic pathways towards natural products have shown that both possibilities exist. The recruitment of a key enzyme for natural product biosynthesis from primary metabolism together with a change in function seems to have occurred several times independently during the evolution of pyrrolizidine alkaloid biosynthesis (Ober and Hartmann 2000). In contrast, data on enzymes involved in the biosynthesis of benzylisoquinoline alkaloids support the hypothesis of a monophyletic evolution of this pathway (Liscombe et al. 2005).
Wink (2003) additionally suggested a monophyletic evolution for the biosyntheses of tropane and quinolizidine alkaloids.

The isolation of the cDNA encoding the essential enzyme of RA biosynthesis in the Lamiaceae plant Coleus blumei now opens up the possibility to look further into the evolution of RA biosynthesis and the biosynthesis of related caffeic acid esters in the plant kingdom.