Introduction

Plants produce a vast array of terpenoids, a class of structurally diverse natural products that play essential roles in growth and development, respiration and photosynthesis, and interactions with the environment. Gibberellins (GAs) are plant hormones that control diverse aspects of plant growth and development, including seed germination, stem elongation, leaf expansion, and seed development [1, 2]. They are biosynthesized in a number of committed steps, as shown in Fig. 1: (1) cyclization of geranylgeranyl diphosphate (GGPP, 1) to ent-copalyl diphosphate (CPP, 4) by ent-CPP synthase (ent-CPS); (2) cyclization of ent-CPP to ent-kaurene (5) by ent-kaurene synthase (ent-KS); (3) a three-step oxidation of ent-kaurene to ent-kaurenoic acid (8) by ent-kaurene oxidase (ent-KO); and (4) a three-step oxidation and oxidative extrusion of an endocyclic ring carbon, converting ent-kaurenoic acid via ent-7-hydroxy-kaurenoic acid (9) to GA12 (10), catalyzed by ent-kaurenoic acid oxidase. GA12 is further converted into bioactive GAs, such as GA4 (11) and GA1 (12), in several oxidation steps [1, 3, 4].

Fig. 1

Diterpene metabolic pathway in Scoparia dulcis. CPP ent-Copalyl diphosphate, GA gibberellin, GGDP geranylgeranyl diphosphate, SDB scopadulcic acid B

It is well recognized that diterpene synthases (DTSs) and cytochrome P450 enzymes (CYP450s) are the key participants in providing diterpenoids, terpenoids composed of two terpene units, with structural diversity [5,6,7]. DTSs are classified into two types, referred to as class I and class II DTSs, according to their reaction mechanisms. Class II DTSs catalyze bicyclization initiated by protonation of the carbon–carbon double bond of GGPP. In contrast, the enzymatic reaction of class I DTSs is dependent on ionization of labdadienyl diphosphate to the labdadienyl cation by removal of the diphosphate group. After formation of the diterpene skeletons by these two consecutive enzymatic reactions, P450s modify the redox status of these diterpenoids.

Scoparia dulcis L. (Plantaginaceae) is a perennial herb widely distributed in the torrid zone that has been used as a medication for stomach disorders, diabetes, hypertension, and insect bites [8]. It has been reported that S. dulcis produces a number of unique diterpenoids, such as scopadulcic acid B (SDB, 3) [9], and that these diterpenoids may be biosynthesized via syn-CPP (2), which is a diastereomer of 4. Therefore, S. dulcis (Sd) possesses two distinct biosynthetic machineries for the production of diterpenoids. We recently reported that SdCPS2 and SdKSL1 were involved in SDB biosynthesis, based on the results of next-generation sequencing-based transcriptome analysis [10]. The results of that analysis revealed that S. dulcis possessed at least two specific pairs of class II and class I DTSs that resulted in the formation of distinct diterpene skeletons derived from 2 and 4. ent-CPS (SdCPS1) was previously cloned and characterized from S. dulcis [11]; however, other participants involved in GA biosynthesis have not been identified to date. In the study reported here, we focused on GA metabolism in an attempt to obtain a detailed overview of the diterpene biosynthetic machinery in S. dulcis.

Materials and methods

Plant material

Scoparia dulcis was germinated under sterile conditions and was grown on half-strength Murashige and Skoog agar medium at 25 °C under continuous illumination. After 5–6 weeks of growth the seedlings were harvested, frozen immediately in liquid nitrogen, and stored at −80 °C for RNA isolation.

Cloning of ent-kaurene synthase and kaurene oxidase

Total RNA was isolated from seedlings with TRIzol reagent (Invitrogen, Carlsbad, CA), and cDNA was generated by the reverse-transcription reaction using the PrimeScript II First-strand cDNA synthesis kit (Takara Bio Inc., Kusatsu, Shiga, Japan). The SdKS and SdKO genes were cloned using the degenerate primers described in Electronic Supplementary Material (ESM) Table S1. The 5′ and 3′ ends of the targeted genes were obtained using the 5′ and 3′ rapid amplification of cDNA ends (RACE) system (Invitrogen, Carlsbad, CA) according to the manufacturer’s instructions. The full-length cDNA for each open reading frame (ORF) was cloned into the pGEM-T Easy vector (Promega Corp., Madison, WI) and transformed into Escherichia coli TOP10 cells (Invitrogen).

Construction of expression vectors

All expression vectors were constructed according to the method previously reported by Cyr et al. [12]. Briefly, SdGGPPS (Accession No. AB034250) was truncated to remove the transit peptide sequence (57 amino acids) and introduced into multiple cloning site 2 (MCS2) of pACYC–Duet (Novagen Merck, Darmstadt, Germany). The ORF of SdCPS1 (Accession No. AB169881), after truncation of the transit peptide sequence (98 amino acids), was then cloned into MCS1 of the vector, and the resulting vector was named pSdGGeC. SdKS and AtKS (from Arabidopsis thaliana; Accession No. AF034774) were each cloned into the pET-28b vector (Novagen Merck) to give the pSdKS and pAtKS vectors, respectively. The SdKO1 or SdKO2 genes were cloned into the MCS1 site in the pETDuet-1 vector (Novagen Merck), and ATR2 from A. thaliana (Accession No. X66017) was ligated into the MCS2 site in the vector to give pSdKO1ATR and pSdKO2ATR, respectively. In order to enhance the expression of these genes in bacterial cells, we modified the N-terminal sequences of SdKO1 and SdKO2 as described by von Wachenfeldt et al. [13] and Wang et al. [14]. Briefly, the N-terminal regions (44 and 39 amino acids, respectively) of these genes were truncated using specific primer sets (KO1-Nmod-FW/SdKO-RV5 or KO2-Nmod-FW/SdKO2-RV5), and the nucleotide sequence ATGGCGAAAAAAACCAGCAGCAAAGGTAAA was then inserted into the truncated genes to introduce the peptide sequence MAKKTSSKGK at the N-terminus of the ORFs. The introduced modifications were confirmed by sequence analysis. After introducing the restriction enzyme sites BglII and XhoI into these cDNA fragments, they were cloned into the MCS2 site of the pETDuet-1 vector. Then, ATR2 was cloned into the MCS2 site in the vector to construct pmodKO1ATR and pmodKO2ATR.
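The relationship between the inserted nucleotide sequence and the leader peptide can be checked by a simple in-frame translation. The following is a minimal sketch, assuming the standard genetic code; the function and variable names are illustrative and are not taken from the original work.

```python
# Standard genetic code, built in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA sequence into one-letter amino acid codes."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[p:p + 3]] for p in range(0, len(dna), 3))


# Leader sequence given in the text, split into 10-nt chunks for readability.
LEADER_DNA = "ATGGCGAAAA" "AAACCAGCAG" "CAAAGGTAAA"

leader_peptide = translate(LEADER_DNA)
print(leader_peptide)
```

The 30-nt insert corresponds to ten codons, which matches the ten-residue lysine- and serine-rich leader described in the text.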

Heterologous expression and metabolite analysis

Expression vectors were transformed into the E. coli C41 (DE3) strain (Lucigen Corp., Middleton, WI). The transformant was grown in NZY medium containing chloramphenicol (20 µg/ml) supplemented with 1% glucose at 37 °C. When the optical density at 600 nm reached approximately 0.6, the incubation temperature was shifted down to 20 °C and kept there for 1 h. Isopropyl β-D-1-thiogalactopyranoside (IPTG; 0.5 mM) and glycerol (5 g/l) were then added to the medium, and the culture was incubated at the same temperature for 24 h. After the incubation period, the culture medium, including cells, was extracted with n-hexane and the extract was concentrated in vacuo. The n-hexane extracts were analyzed by gas chromatography–mass spectrometry (GC–MS) using a DB5-MS column (Agilent Technologies, Palo Alto, CA) on a Shimadzu QP-2010 Ultra gas chromatograph mass spectrometer (Shimadzu Corp., Kyoto, Japan) in electron ionization mode (70 eV). Each sample was injected at 60 °C in the splitless mode. For ent-kaurene, the samples were initially held at 60 °C for 5 min, following which the oven temperature was increased at 10 °C/min to 310 °C and held for 5 min. For kaurenoic acid, the oven temperature was increased from 60 °C (5-min hold) to 200 °C at 25 °C/min and then increased at 5 °C/min to 300 °C. The flow rate of the helium carrier gas was set at 1.4 ml/min. The MS data were collected from 40 to 400 m/z.
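The two oven programs above imply different run lengths, which can be worked out from the ramp rates and hold times. The sketch below is illustrative only: the step representation and function name are hypothetical, and because no final hold is stated for the kaurenoic acid program, a final hold of 0 min is assumed.

```python
from typing import List, Optional, Tuple

# One oven step: (target temperature in deg C, ramp rate in deg C/min or None, hold in min).
# A ramp rate of None means the oven is already at the target (initial hold).
Step = Tuple[float, Optional[float], float]


def run_time_min(steps: List[Step]) -> float:
    """Sum the ramp and hold durations of a GC oven temperature program."""
    total = 0.0
    current = steps[0][0]  # the program starts at the first target temperature
    for target, rate, hold in steps:
        if rate is not None:
            total += (target - current) / rate
        current = target
        total += hold
    return total


# ent-Kaurene program: 60 deg C for 5 min, 10 deg C/min to 310 deg C, hold 5 min.
ENT_KAURENE: List[Step] = [(60, None, 5), (310, 10, 5)]

# Kaurenoic acid program: 60 deg C for 5 min, 25 deg C/min to 200 deg C,
# then 5 deg C/min to 300 deg C (final hold assumed to be 0 min).
KAURENOIC_ACID: List[Step] = [(60, None, 5), (200, 25, 0), (300, 5, 0)]

print(round(run_time_min(ENT_KAURENE), 1))
print(round(run_time_min(KAURENOIC_ACID), 1))
```

Under these assumptions the ent-kaurene program runs for 35 min and the kaurenoic acid program for about 30.6 min.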

Phylogenetic analyses

Phylogenetic analyses were performed using RAxML software [15] with alignments prepared using the MAFFT program [16]. Selected KS and CYP701A proteins were aligned using the high-accuracy L-INS-i method. Maximum likelihood (ML) trees were built using the WAG model with 500 bootstrap replicates, and the resulting phylogeny was displayed using FigTree software (http://tree.bio.ed.ac.uk/software/figtree).
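An alignment and tree-building run of this kind could be scripted roughly as follows. This is a sketch under stated assumptions, not the commands actually used: the file names and run name are hypothetical, the RAxML executable name varies by installation, and the text does not specify the rate-heterogeneity model, the bootstrap algorithm, or the random seeds, so the gamma model (PROTGAMMAWAG), the rapid-bootstrap mode (-f a), and the seed value are assumptions. The code only builds the command lines and does not execute them.

```python
import shlex
from typing import List


def mafft_linsi_cmd(fasta_in: str) -> List[str]:
    """MAFFT L-INS-i: local pairwise alignment with iterative refinement.
    MAFFT writes the alignment to standard output."""
    return ["mafft", "--localpair", "--maxiterate", "1000", fasta_in]


def raxml_ml_bootstrap_cmd(alignment: str, run_name: str,
                           replicates: int = 500, seed: int = 12345) -> List[str]:
    """RAxML ML search with bootstrapping under the WAG protein model.
    Assumed choices: gamma rate heterogeneity and rapid bootstrapping (-f a)."""
    return [
        "raxmlHPC",              # executable name is installation-dependent
        "-f", "a",               # rapid bootstrap plus best-scoring ML tree search
        "-m", "PROTGAMMAWAG",    # WAG substitution matrix with gamma rates (assumed)
        "-p", str(seed),         # parsimony random seed
        "-x", str(seed),         # rapid-bootstrap random seed
        "-#", str(replicates),   # number of bootstrap replicates
        "-s", alignment,
        "-n", run_name,
    ]


# Hypothetical file names, for illustration only.
print(shlex.join(mafft_linsi_cmd("ks_proteins.fasta")), "> ks_aligned.fasta")
print(shlex.join(raxml_ml_bootstrap_cmd("ks_aligned.fasta", "ks_tree")))
```

Keeping the commands as argument lists makes it straightforward to pass them to `subprocess.run` once the tools are installed, and to reuse the same settings for both the KS and the CYP701A alignments.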

Quantitative PCR analysis of SdCPS1, SdKS, and SdKOs

First-strand cDNAs from each plant organ were synthesized using a PrimeScript™ II First-strand cDNA Synthesis kit (Takara Bio Inc.) and used as templates for quantitative (q) PCR. Real-time PCR was performed using the Brilliant III Ultra-Fast SYBR® Green QPCR Master Mix on an Mx3005 real-time qPCR system (Agilent Technologies, Santa Clara, CA). The S. dulcis 18S rRNA gene (Accession No. JF718778) was used for normalization. The primer sequences used in the qPCR study are listed in ESM Table S1.
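The caption of Fig. 5 states that relative values were obtained with the ΔΔCT method after normalization to 18S rRNA. A minimal sketch of that calculation is given below, assuming approximately 100% amplification efficiency for both the target and the reference gene; the function name and the Ct values are hypothetical and are not data from this study.

```python
def relative_expression(ct_target_sample: float, ct_ref_sample: float,
                        ct_target_calibrator: float, ct_ref_calibrator: float) -> float:
    """Relative transcript level by the delta-delta-Ct method (2 ** -ddCt)."""
    # Normalize the target gene to the reference gene within each tissue.
    d_ct_sample = ct_target_sample - ct_ref_sample
    d_ct_calibrator = ct_target_calibrator - ct_ref_calibrator
    # Compare the tissue of interest with the calibrator tissue.
    dd_ct = d_ct_sample - d_ct_calibrator
    return 2.0 ** (-dd_ct)


# Hypothetical Ct values for illustration only (not measured data):
# target gene and 18S rRNA in a sample tissue and in a calibrator tissue.
fold = relative_expression(
    ct_target_sample=22.0, ct_ref_sample=10.0,
    ct_target_calibrator=28.0, ct_ref_calibrator=10.0,
)
print(fold)
```

With these illustrative numbers the target is detected six cycles earlier in the sample than in the calibrator after normalization, which corresponds to a 64-fold higher relative level.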

Results

Cloning of ent-kaurene synthase

A homology-based approach and the RACE system were adopted to isolate a full-length cDNA of KS from S. dulcis. Degenerate primers corresponding to highly conserved regions (SAYDTAW and DDxxD motifs) among plant KSs were used in degenerate PCR reactions to amplify a 1500-bp product from S. dulcis leaf cDNA. The core fragment was identified as KS by sequencing and BLASTn analysis and was then extended in the 5′ and 3′ directions by RACE. The full-length cDNA was 2376 bp, including an 82-bp 5′ untranslated region (UTR) and a 152-bp 3′ UTR. The ORF was estimated to be 2142 bp, and the deduced amino acid sequence had 791 amino acid residues (91 kDa; pI 5.6). The first 39 N-terminal amino acids were rich in serine and threonine, which is a common characteristic of plastid-targeting transit peptides. The ORF contained two motifs that are highly conserved in terpene synthases, SSYDTAW and QxxDGSW, and two characteristic motifs important for metal-dependent ionization of the prenyl diphosphate substrate, DDxxD and NSE/DTE [17] (ESM Fig. S1). As these characteristics were in good agreement with those of class I DTSs, we named the gene SdKS (Accession No. JF781124). In addition, Southern blotting analysis revealed that there were no other isoforms in the S. dulcis genome (data not shown), a result that was consistent with our earlier transcriptome analysis [10].

BLAST searches showed extended similarities of SdKS with already functionally annotated KSs from Chinese red sage (SmKS), tomato (SlKS), and cucumber (CsKS) (approximately 66, 58, and 55% identities, respectively). Phylogenetic comparison revealed that SdKS fell into a sub-clade consisting of KSs isolated from members of the plant order Lamiales (Fig. 2a), showing in particular that it was most closely related to the ent-KS from Dorcoceras hygrometricum (DhKS).

Fig. 2

Phylogenetic trees of ent-kaurene synthase (KS; a) and ent-kaurene oxidase (KO; b). The maximum likelihood trees illustrate the phylogenetic relatedness of S. dulcis KS (SdKS) to other KSs (a) and of S. dulcis KOs (SdKO1, SdKO2) to other KOs (b). The ancestral Physcomitrella patens ent-kaurene synthase (PpCPS/KS) and ent-kaurene oxidase (PpKO) were used to root the trees. Descriptions of KSs and KOs used in the phylogeny are listed in ESM Table S2

Cloning of ent-kaurene oxidases

The SdKO cDNAs were also isolated by the homology-based approach, using the degenerate primers shown in ESM Table S1. The core fragment thus obtained (approx. 1200 bp) was sequenced and found to show similarity to known plant KOs. Subsequent 5′ and 3′ RACE provided a full-length cDNA (1970 bp) containing an ORF of 1533 bp, which we named SdKO1 (Accession No. KP987567). In order to determine whether there were other isoforms of SdKO1 in the S. dulcis genome, we performed genomic Southern analysis using the SdKO1 cDNA probe under low-stringency conditions. The results suggested that at least one homolog of SdKO1 was present, since two hybridized signals were observed in the genomic DNA digested with BamHI and EcoRI (ESM Fig. S2). Therefore, we attempted to isolate the SdKO1 homolog from genomic DNA via PCR amplification. The PCR product (1114 bp) obtained using the same degenerate primer sets as for SdKO1 was sequenced and the exon region sequence (565 bp) determined. The predicted exon sequence was used to obtain the full-length cDNA by 5′ and 3′ RACE. The full-length cDNA of the homolog, which represented SdKO2 (Accession No. KP987568), was 1826 bp, including a 171-bp 5′ UTR and a 134-bp 3′ UTR. SdKO1 and SdKO2 shared 69.2% identity at the nucleotide sequence level. The ORFs of SdKO1 and SdKO2 encoded polypeptides of 511 and 506 amino acid residues, respectively. SdKO1 and SdKO2 showed the highest similarity to KOs from Salvia miltiorrhiza (SmKO, 76% identity) and Vitis vinifera (VvKO, 68% identity), respectively.

The molecular weights of SdKO1 and SdKO2 were calculated to be 57.9 and 58.3 kDa, respectively, and both SdKO1 and SdKO2 contained motifs characteristic of P450s, such as substrate binding and oxygen pocket, ExxR, PERF, and FxxGxRxCxG (CxG) (ESM Fig. S3). In addition, computational analyses (TargetP program) revealed the presence of a hydrophobic N-terminus, suggesting that SdKO2 was located in the secretory pathway, i.e., in the endoplasmic reticulum; the location of SdKO1 was not specified by the program. Phylogenetic comparison of the SdKOs with CYP701 subfamily members placed SdKO1 and SdKO2 in a clade consisting of KOs from members of the order Lamiales and Coffea arabica (Fig. 2b). SdKO1 and SdKO2 were assigned as CYP701A50 and CYP701A51, respectively, by the CYP nomenclature committee (Prof. David Nelson, University of Tennessee Health Science Center).

Functional characterization of SdKS and SdKOs

In order to confirm the biochemical functions of SdKS and the SdKOs, we constructed expression vectors for the identification of enzymatic reaction products in vivo. In general, transit peptide sequences interfere with the expression of SdKS and SdKOs in E. coli; therefore, the corresponding nucleotide sequence in SdKS was truncated. The truncated cDNA derived from SdKS was ligated into the expression vector pET-28b to give the pSdKS plasmid. In addition to pSdKS, the pSdGGeC vector, which harbored GGPPS and ent-CPS isolated from S. dulcis, was transformed into E. coli C41 cells to supply ent-CPP, as reported by Cyr et al. [12] with slight modifications. The transformed E. coli C41 cells and the medium of the bacterial culture grown under induction conditions at 16 °C for 21 h were extracted with n-hexane and then concentrated in vacuo; the resultant enzymatic reaction products were then subjected to GC–MS analysis. As shown in Fig. 3, the recombinant bacteria produced a diterpene hydrocarbon (peak 5′) with a mass spectrum and retention time that were identical to those of the authentic ent-kaurene standard. Based on these results we considered SdKS to be an ent-kaurene synthase.

Fig. 3

ent-Kaurene synthase activity of SdKS. Gas chromatography–mass spectrometry (GC–MS) analysis of diterpene products from hexane extracts of recombinant Escherichia coli harboring the pSdGGeC and pSdKS vectors (see text for description). a Selective ion chromatograms of authentic ent-kaurene and diterpene products of recombinant E. coli. b Mass spectra of ent-kaurene and peak 5′ shown in a

Next, we attempted to characterize the function of the SdKOs cloned from S. dulcis. In preparation for the heterologous expression of the S. dulcis P450s, a cDNA encoding a cytochrome P450 reductase from Arabidopsis thaliana (ATR2) had been isolated and functionally expressed in E. coli. In order to characterize the biological function of the SdKOs, the pSdKO1ATR and pSdKO2ATR vectors were constructed, and each vector was co-transformed with the pSdGGeC and pSdKS vectors into E. coli C41 cells. After induction, the culture medium was extracted with n-hexane and the resultant extract analyzed by GC–MS after methylation. However, the peak corresponding to the ent-kaurenoic acid methyl ester was not detected (data not shown). In addition, truncation of the N-terminal membrane anchor regions of the SdKOs also resulted in little oxidative activity toward ent-kaurene (data not shown). It has been reported that replacement of the transmembrane region in the N-terminus of P450s with a ten amino acid-long lysine- and serine-rich leader peptide (MAKKTSSKGK) optimizes the functional bacterial expression of P450s [13, 14]. Based on this information, we constructed modified SdKO genes, modSdKO1 and modSdKO2, to enhance the expression of these P450s in E. coli cells. Figure 4 shows the GC–MS data of the in vivo enzymatic reaction products of pmodKO1ATR and pmodKO2ATR. In the chromatograms, peaks 6′, 7′, and 8′ were identified as ent-kaurenol (6), ent-kaurenal (7), and ent-kaurenoic acid (8), respectively, on the basis of their mass spectra by comparison with reported values [18]. Therefore, we considered both SdKO1 and SdKO2 to be ent-KOs.

Fig. 4

Kaurene oxidase activities of SdKO1 and SdKO2. GC–MS analysis of diterpene products from hexane extracts of recombinant E. coli harboring the pSdGGeC, pSdKS, and pmodKO1ATR/pmodKO2ATR vectors. a Chromatograms (total ion chromatogram and extracted ion chromatograms) of recombinant E. coli and the authentic kaurenoic acid methyl ester (KA). b Mass spectra of peaks 6′, 7′, and 8′

Steady-state transcript levels in S. dulcis tissues

Relative transcript levels of SdCPS1, SdKS, SdKO1, and SdKO2 were determined in young and mature leaves, upper and lower stems, roots, and lateral roots of S. dulcis by quantitative real-time PCR (Fig. 5). The expression pattern of SdKS was similar to that of SdKO1, and the relative transcript levels observed in roots were approximately 50- to 60-fold higher than those in young leaves. In addition, the results suggested that both genes were expressed constitutively in all tested tissues. On the other hand, transcript levels of SdCPS1 and SdKO2 in mature leaf and upper stem were lower than those in the other tested tissues. However, SdCPS1 and SdKO2 were also predominantly expressed in root tissue.

Fig. 5

Real-time quantitative PCR analysis of SdCPS1 (S. dulcis ent-copalyl diphosphate synthase 1), SdKS, SdKO1, and SdKO2. Relative steady-state accumulation levels of transcripts were determined in different tissues of S. dulcis. Data were normalized to an internal control (18S rRNA), and the ΔΔCT method was used to obtain relative values. Data are expressed as the mean ± standard deviation. Asterisks indicate significant differences from the control (*p < 0.05 and **p < 0.01)

Discussion

Scoparia dulcis produces a variety of diterpenoids that are synthesized from GGPP (1), including the GAs and phytoalexins such as SDB (3). The former are synthesized via ent-CPP (4) by ent-CPS, whereas the latter are synthesized via syn-CPP (2) by syn-CPS. Researchers who design studies aimed at furthering our understanding of (di)terpene metabolism in a target organism face a significant hurdle in preparing specific substrates for biosynthetic enzymes, as many substrates are not commercially available and are difficult to isolate from natural resources. However, Cyr et al. have demonstrated a modular approach for the relatively easy biosynthesis of diterpenes [12], thereby facilitating functional analyses of terpene synthases and P450s involved in the metabolism of terpenes. In accordance with this concept, we were able to analyze the function of biosynthetic enzyme genes from S. dulcis involved in GA metabolism.

In our study, we cloned an SdKS and two SdKOs which catalyze consecutive reactions from ent-CPP to ent-kaurenoic acid via ent-kaurene. The deduced amino acid sequences of the SdKS and SdKOs shared high identity with functionally characterized plant KSs and KOs, respectively. Since we had already characterized SdCPS1 (formerly SdECPS) from S. dulcis in an earlier study [11], in the present study we examined the three enzyme genes which we believed to be involved in subsequent steps in the GA biosynthetic pathway. When pSdKS was transformed into E. coli together with pSdGGeC, ent-kaurene produced by the transformant was detected in the culture medium. Truncation of hydrophobic segments in the N-terminus of plant P450s has been reported to enhance expression in bacterial cells [13, 14], but no activity was detected in the pSdKO1ATR and pSdKO2ATR transformants when only the signal peptide was removed. Therefore, we prepared pmodKO1ATR and pmodKO2ATR by adding ten residues to the N-terminus, as reported by von Wachenfeldt et al. [13]. After pmodKO1ATR or pmodKO2ATR was introduced into E. coli harboring pSdGGeC and pSdKS, we noted that these modifications enabled the detection of enzymatic activities in E. coli, as shown in Fig. 4. Therefore, the approach reported here may be useful for the characterization of terpene biosynthetic enzymes in E. coli.

GAs are key regulators of plant growth and development. In general, GA biosynthesis occurs predominantly in apical parts of the plant, namely apical buds, developing leaves, and root tips [19, 20]. Moreover, ent-CPS and KS are known to be preferentially transcribed in rapidly growing tissues [21]. As shown in Fig. 5, all examined transcripts were preferentially expressed in roots, but they were also detected in all examined tissues. This tendency is in good agreement with results from studies in A. thaliana and Oryza sativa [22,23,24,25]. Figure 5 also shows that the expression patterns of the two SdKOs were quite different from each other. In addition, an earlier study showed that the two SdKOs were differentially expressed when stimulated with methyl jasmonate [10], suggesting that these two KOs might play different roles in S. dulcis.

In conclusion, we were able to functionally characterize consecutive enzymes (SdKS and two SdKOs). Although we have not performed a genetic study to determine the function of these genes, several lines of evidence strongly support the conclusion that these enzymes are responsible for GA biosynthesis in S. dulcis. To date there have been no reports of the metabolism of diterpenoids derived from ent-CPP as secondary metabolites in S. dulcis. In addition, S. dulcis possesses only two genes encoding ent-CPS and syn-CPS; therefore, diterpene metabolism in the plant has two lineages, namely GA biosynthesis and the metabolism of unique diterpenes [10]. Taking all of the evidence into consideration, we suggest that these genes are involved in GA biosynthesis in S. dulcis.