Introduction

Coprinopsis cinerea is a model organism for the study of basidiomycetes. The complete genome sequence of this mushroom was made publicly available in 2003 [1], and a substantial volume of information on genes involved in the biosynthesis of secondary metabolites has become available. In addition, methods of gene introduction and disruption were established in our previous report [2]. Therefore, the stage has been set to deepen our understanding of the biosynthesis of complex secondary metabolites in basidiomycetes. Lagopodins are terpenoid natural products isolated from C. cinerea that were shown to exhibit antibacterial activity against Staphylococcus aureus [3]. The chemical structures of key members of this family of compounds have a unique sesquiterpene core consisting of a fusion of a five- and a six-membered ring. Because of their unique chemical structures and potentially useful biological activity, lagopodins have garnered interest in the research fields of natural product chemistry, medicinal chemistry, and chemical biology [4, 5]. Analysis of the biosynthetic gene cluster of lagopodin B (1) revealed that the terpene cyclase encoded by cop6 and two cytochrome P450s encoded by cox1 and cox2 perform cyclization and oxidation reactions to form 1 (Fig. 1) [2, 6]. Specifically, the biosynthetic pathway of 1 starts with the cyclization of farnesyl pyrophosphate catalyzed by Cop6 to generate α-cuprenene (3) (Fig. 1) [6, 7]. The next step involves a series of oxidation reactions catalyzed by the P450s Cox1 and Cox2 to generate 4, in which hydroxyl groups are added to C-1, -4 and -9, and the C-9 position undergoes a further oxidative cyclization to generate hitoyopodin A (5) [2, 8]. Subsequently, oxidation of the hydroxyl groups at C-1 and C-4 could result in lagopodin A (6), which has a quinone core structure. Lastly, a hydroxyl group is introduced at C-5, followed by another hemiacetal formation to form 1.
Recently, the biosynthetic steps toward the formation of hitoyols were proposed by our group, where lagopodins were implicated as the early intermediates. The biosynthesis of prehitoyol from 1 was envisaged to proceed through a benzilic acid rearrangement of the o-quinone in the hemiketal form of 1, followed by decarboxylation to yield hitoyol A and then hitoyol B [7, 8]. On the other hand, we have determined the pathway leading from the early intermediate 3 to the formation of 4, the intermediate immediately preceding the formation of 5 and lagopodins [2]. However, the effort was hindered by the low production level of the relevant secondary metabolites. Here we report the identification of an expression boost area (EBA) within the C. cinerea chromosome and exploitation of the EBA to substantially increase the production of the lagopodin biosynthetic pathway products. Use of this newly established method allowed successful isolation of an additional pathway product 2 and establishment of a more complete pathway for the biosynthesis of the lagopodin family of secondary metabolites.

Fig. 1
The proposed biosynthetic pathway of lagopodins and hitoyols through cuparene and other intermediates

Materials and methods

Strains

To engineer a high-producing strain of C. cinerea for the formation of 1, EBA coded in C. cinerea was evaluated using the C. cinerea ku3-24 strain (A43mut B43mut pab1-1 Cc.ku70::FltR). This strain was generated from the C. cinerea 326 strain (A43mut B43mut pab1-1), which was developed as a basidiomycete model strain for biotechnological applications [9], by disrupting the ku70 gene to minimize nonhomologous end joining. The strain ku3-24 allows efficient gene disruption and replacement in C. cinerea [10,11,12,13].

General techniques for DNA manipulation

PCR was performed using PrimeSTAR GXL DNA polymerase as recommended by the manufacturer (TAKARA Bio, Inc.). Sequences of PCR products were confirmed through DNA sequencing (Macrogen Japan Corporation). Escherichia coli XL1-Blue (Agilent Technologies, Inc.) and Saccharomyces cerevisiae BY4705 were used for plasmid propagation. DNA restriction enzymes were used as recommended by the manufacturer (Thermo Fisher Scientific, Inc.).

RNA sequencing

Total RNA was isolated from the C. cinerea ku3-24 strain that was cultured in MYG medium for 7 days at 180 rpm and purified following the procedure reported earlier [2]. Library construction and sequencing were carried out by Macrogen Japan Corp. (Kyoto, Japan). Briefly, the sequence library was constructed from the total RNA using the TruSeq Stranded mRNA LT Sample Prep Kit (Illumina) and subsequently sequenced by HiSeq (101 bp × 2). The raw data were processed for quality trimming by fastp (0.19.6) [14], and the mapping data to the genome of C. cinerea okayama7 [10,11,12,13] were generated by hisat2 (2.1.0) [15] and samtools (1.9) [16]. Finally, the fragments per kilobase of transcript per million mapped reads (FPKM) value of each gene in C. cinerea was calculated by stringtie (1.3.5) [17]. Raw read counts are biased by transcript length, because a long transcript yields more sequenced fragments than a short transcript expressed at the same level, and are therefore not suitable for comparing the amount of transcription between genes. To correct for this, the fragment count of each gene was divided by the total number of mapped fragments in millions to obtain the fragments per million (FPM) value, and the FPKM value was calculated by normalizing the FPM value by the length of each gene in kilobases.
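The normalization described above can be illustrated with a minimal sketch. In this study the FPKM values were calculated by stringtie; the function name `fpkm`, the gene labels, and all counts and lengths below are hypothetical and serve only to show the arithmetic.

```python
def fpkm(fragment_counts, gene_lengths_bp):
    """Return FPKM per gene from raw fragment counts and gene lengths in bp."""
    total_fragments = sum(fragment_counts.values())
    per_million = total_fragments / 1_000_000  # library-size scaling factor
    result = {}
    for gene, count in fragment_counts.items():
        fpm = count / per_million                  # fragments per million (FPM)
        length_kb = gene_lengths_bp[gene] / 1_000  # gene length in kilobases
        result[gene] = fpm / length_kb             # FPKM
    return result


# Hypothetical counts and lengths, for illustration only (not data from this study).
counts = {"geneA": 500, "geneB": 1_500, "geneC": 998_000}
lengths = {"geneA": 1_000, "geneB": 3_000, "geneC": 2_000}
values = fpkm(counts, lengths)
```

In this made-up example, geneB has three times the raw count of geneA but is also three times as long, so the two genes receive the same FPKM value. This is the length bias that the normalization is meant to remove.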

Construction of pKW20605, the cop6 expression vector

The integration cassette comprised a 5′ side- and a 3′ side-flanking fragment named Rec1 and Rec2, respectively. Rec1 and Rec2 are 1500-base pair fragments homologous to the 5′ and 3′ flanking regions of the target site in the C. cinerea ku3-24 genome, respectively. The primer sets for Rec1, 5′-CGACGGTATCGATAAGCTTGATATCGGATGGGCGCCTCTGAAGAACTCGAG-3′ (pKW20603_F1) and 5′-AACTTCTCCAAACCAACGTGTTCAAATTTAAATCTCTACAGGTCCGCAAGTTGGCCAA-3′ (pKW20603_R1), and Rec2, 5′-TTGGCCAACTTGCGGACCTGTAGAGATTTAAATTTGAACACGTTGGTTTGGAGAAGTTGGG-3′ (pKW20603_F2) and 5′-CGGTGGCGGCCGCTCTAGAACTAGTGGTCCATGCACCCTACGAGTCAACT-3′ (pKW20603_R2), were used to prepare the required flanking homologous regions for the target site. PCR amplification of the Rec1 and Rec2 fragments was carried out as described earlier.
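As a quick sanity check on the primer design, the following sketch compares the reverse complement of pKW20603_R1 with pKW20603_F2 and counts how many leading bases they share. The primer sequences are copied from the text; the helper `reverse_complement` and the reading that this overlap was designed to let the Rec1 and Rec2 amplicons share a homologous junction are assumptions, not statements from the original report.

```python
def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    complement = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(complement[base] for base in reversed(seq))


# Primer sequences as given in the text (5' to 3').
PKW20603_R1 = "AACTTCTCCAAACCAACGTGTTCAAATTTAAATCTCTACAGGTCCGCAAGTTGGCCAA"
PKW20603_F2 = "TTGGCCAACTTGCGGACCTGTAGAGATTTAAATTTGAACACGTTGGTTTGGAGAAGTTGGG"

r1_rc = reverse_complement(PKW20603_R1)

# Count the leading positions at which the two sequences agree.
shared = 0
for a, b in zip(r1_rc, PKW20603_F2):
    if a != b:
        break
    shared += 1

print(f"pKW20603_R1 length: {len(PKW20603_R1)} nt; shared with pKW20603_F2: {shared} nt")
```

If the shared length equals the full length of pKW20603_R1, the two primers are complementary across that whole primer, which is what one would expect if the two fragments are meant to be joined through a common junction sequence.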

For plasmid construction, the four fragments (Rec1, Rec2, hph hygromycin resistance marker [18], and a cassette of pDED1 (the promoter of the ATP-dependent RNA helicase DED1) [19]/cop6 [2]), each at 50–150 ng in a total volume of 45 μL, were mixed with the delivery vector pRS426 [20] (2 µg) predigested with EcoR I (10 units) and BamH I (10 units) at 37 °C for 30 min. The mixture was transformed into the Saccharomyces cerevisiae strain BJ5464-NpgA [21] to assemble the plasmid pKW20605 (Fig. 2a) possessing the desired expression cassette through in vivo homologous recombination. The resulting plasmid pKW20605 was recovered from the yeast transformant and transferred to E. coli, where the plasmid was amplified for subsequent characterization by restriction enzyme digestion and DNA sequencing to confirm its identity.

Fig. 2
Confirmation of targeted integration of cop6 into the specified expression boost area (EBA) within the genome of C. cinerea ku3-24. a The map of the plasmid pKW20605 used for homologous recombination-mediated integration of cop6 into the genome of C. cinerea ku3-24. Rec1 and Rec2: 1500-base pair fragments homologous to the 5′ and 3′ flanking regions of the target site in the C. cinerea ku3-24 genome, respectively; pDED1: promoter of the ATP-dependent RNA helicase DED1 [19]; cop6: terpene cyclase [2]; hph: hygromycin resistance marker [18]. b The schematic diagram of the C. cinerea ku3-24 genome showing the outcome of the successful genome integration event, and the annealing sites of the primers designed to identify the positive outcome by PCR using the gDNA as a template. c The schematic diagram of the C. cinerea ku3-24 genome without a genome integration event, and the annealing sites of the primers designed to identify the negative outcome by PCR using the gDNA as a template. d Gel electrophoretic analyses of the PCR products from the cop6-integrated and the wild-type strains amplified with the primer sets described in (b) and (c). The sequences of the primer sets are listed in the “Materials and methods” section. Lane M: molecular weight marker; lane 1: the 1.8 kb positive PCR product including the Rec1 segment amplified from the cop6-integrated gDNA using the primer set pKW20604_pos_LF/pKW20604_pos_LR; lane 2: the 1.8 kb positive PCR product including the Rec2 segment amplified from the cop6-integrated gDNA using the primer set pKW20605_pos_RF1/pKW20604_pos_RR; lane 3: the 0.5 kb negative PCR product amplified from the gDNA that did not undergo the integration event using the primer set pKW20604_neg_F2/pKW20604_neg_R2

Confirmation of targeted integration of cop6 by PCR

Transformation of C. cinerea ku3-24 was carried out using essentially the same experimental procedure described in our previous report [2]. To verify that the target region was replaced with the cassette, the gDNA isolated from the transformants was analyzed by diagnostic PCR, for which three sets of PCR primers were designed (Fig. 2b–d). For the first primer set, one primer (pKW20604_pos_LF: 5′-CACCTGTGCGTATTCCCTCTGCCAT-3′) that anneals to the gDNA and another primer (pKW20604_pos_LR: 5′-GTTTCTTCCCTCCCGCCGTAGTCG-3′) that anneals at the 3′ side of the Rec1 region were designed (Fig. 2b, “Positive PCR”). With this primer set, no PCR product will be formed with the wild-type gDNA as the template. However, a PCR product around 1.8 kb in size will be formed with the gDNA of the strain that has gone through the desired gene integration event. For the second primer set, one primer (pKW20605_pos_RF1: 5′-GATGTTGTCACTTCTATTTGTCATTTTGCGG-3′) that anneals to the selection marker and Rec2, and another primer (pKW20604_pos_RR: 5′-CTGAAGCACATCCACAGTCAGCCAT-3′) that anneals to the gDNA were designed (Fig. 2b, “Positive PCR”). With this primer set, no PCR product will be formed with the wild-type gDNA as the template. However, a PCR product around 1.8 kb in size will be formed with the gDNA of the strain containing the desired gene integration. For the third primer set, one primer (pKW20604_neg_F2: 5′-CCCCAACGCCTTCTCCCTTTCACAT-3′) that anneals near the 3′ end of Rec1 and another primer (pKW20604_neg_R2: 5′-GCAGTCCTTGAAACAACCGGGAAGC-3′) that anneals ~500 bp inside of the target region were designed (Fig. 2c, “Negative PCR”). With this primer set, a PCR product around 0.5 kb in size will be formed only with gDNA that has not undergone the integration event. The results of the electrophoretic analysis of the three sets of PCR reactions are given in Fig. 2d.
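The logic of the three diagnostic reactions can be summarized as a small decision rule. The sketch below is illustrative only: the function name `classify_transformant`, the boolean inputs, and the handling of mixed band patterns are assumptions and not part of the reported procedure.

```python
def classify_transformant(left_positive_band, right_positive_band, negative_band):
    """Interpret the presence (True) or absence (False) of the three PCR bands.

    left_positive_band:  ~1.8 kb product spanning Rec1 (first primer set)
    right_positive_band: ~1.8 kb product spanning Rec2 (second primer set)
    negative_band:       ~0.5 kb product from the unmodified locus (third primer set)
    """
    if left_positive_band and right_positive_band and not negative_band:
        return "targeted integration"
    if not left_positive_band and not right_positive_band and negative_band:
        return "no integration (unmodified locus)"
    # Any other pattern is treated here as inconclusive (an assumed convention).
    return "ambiguous"


print(classify_transformant(True, True, False))
print(classify_transformant(False, False, True))
```

Treating every mixed pattern as ambiguous is a conservative choice for this sketch; in practice such a result could arise from a failed reaction or a mixed culture and would call for repeating the PCR.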

Spectroscopic analyses

NMR spectra were obtained with a Bruker BioSpin AVANCE III HD 500 MHz spectrometer (1H 500 MHz, 13C 125 MHz). 1H NMR chemical shifts are reported in parts per million (ppm) using the proton resonance of residual solvent as reference: CDCl3 δ 7.26, CD3OD δ 3.31, and DMSO-d6 δ 2.50 [22]. 13C NMR chemical shifts are reported relative to CDCl3 δ 77.16, CD3OD δ 49.0, and DMSO-d6 + TFA δ 39.52 [22].

Results and discussion

In our previous study, we showed that the terpene cyclase Cop6 was essential to the biosynthesis of lagopodins in C. cinerea [2]. To improve the production level of 1 and thereby increase the production levels of downstream pathway intermediates for further analyses, a C. cinerea strain with a copy of cop6 nonspecifically incorporated into its genome was initially prepared. This strain was generated by introducing an overexpression cassette containing a cop6 gene under the control of the DED1 promoter [19] into the C. cinerea 326 genome via random integration. Five of the resulting strains were cultured in 500 ml of MYG liquid medium for 18 days at 30 °C with agitation at 180 rpm to confirm the production of 1 (Fig. 3). The yield of 1 was 0.65 mg l−1, and the overall production level of 1 and its related metabolites remained very low. Thus, a new approach was needed for elucidating the biosynthetic pathway of lagopodins. Such an effort would at the same time afford us a high-yielding secondary metabolite production system for obtaining new and useful compounds derived from basidiomycetes using C. cinerea.

Fig. 3
Productivities of 1 in two strains, one expressing an extra copy of cop6 integrated randomly into the genome of C. cinerea ku3-24 (random), and another having an extra copy of cop6 integrated into the expression boost area (EBA) identified in this study. Each data point represents the mean of the amounts of 1 isolated from three independent cultures of each strain

One strategy for improving the expression level of a gene is to place the gene of interest in an area of the chromosome, which we refer to as an EBA, where the level of gene expression is higher than in other areas of the chromosome [23]. To test this idea, we sought to identify EBAs in the genome of C. cinerea ku3-24, a ku70-deficient strain developed previously [13] to enable high-fidelity homologous recombination-mediated genome modification [24], by analyzing the expression level of each gene by RNA-Seq. When the transcription levels of all genes in the liquid culture of C. cinerea ku3-24 were sorted by the fragments per kilobase of transcript per million mapped reads (FPKM) values, genes CC1G_14010 and CC1G_14011 were found to be highly expressed, and the region containing them was hence designated as an EBA (Fig. 4a). Thus, the cop6-containing overexpression cassette was introduced into the position between CC1G_14010 and CC1G_14011 on the C. cinerea ku3-24 chromosome via homologous recombination (Fig. 4b). Proper integration of the cop6-containing overexpression cassette at the designated EBA was confirmed by colony PCR (Fig. 2), and the resulting transformant was named pKW20605-C2. When the pKW20605-C2 strain was cultured under essentially the same conditions as described earlier, it produced ~9.2 mg l–1 of 1 in 1 l of MYG liquid medium, which was slightly over 14-fold higher than the yield obtained with the strain with cop6 randomly integrated in its genome (Fig. 3). We have not established directly that introduction of cop6 into the EBA increased its expression and hence caused the increase in the production of 1. Nevertheless, our result indicated that targeted introduction of a key biosynthetic gene into an EBA could result in a substantial enhancement of the yield of the corresponding metabolic products.
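The reported fold increase follows directly from the two yields given in the text (0.65 mg l–1 for the random-integration strain and ~9.2 mg l–1 for the pKW20605-C2 strain). A minimal sketch of the arithmetic, with variable names chosen here for illustration:

```python
# Yields of 1 as reported in the text (mg per litre of culture).
random_integration_yield = 0.65
eba_integration_yield = 9.2

# Ratio of the EBA-integrated yield to the random-integration yield.
fold_increase = eba_integration_yield / random_integration_yield

print(f"fold increase: {fold_increase:.1f}")
```

The ratio comes out just above 14, consistent with the "slightly over 14-fold" description. Because the EBA yield is given only approximately, the ratio should be read as an estimate.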

Fig. 4
Exploration of expression boost areas (EBAs) within the genome of C. cinerea. a FPKM counts of EBA in C. cinerea ku3-24 grown under a liquid culture condition. b The site of targeted integration of the cop6 overexpression cassette within the genome of C. cinerea ku3-24

Next, we performed a thorough analysis of the metabolites obtained from the pKW20605-C2 strain to establish a more complete pathway for the biosynthesis of lagopodins and hitoyols. The general method for large-scale isolation of lagopodins and related products for detailed characterization is given below using the purification of 2 as an example. The pKW20605-C2 strain was grown in MYG liquid medium (30 ml × 20) at 30 °C for 5–7 days. The culture was separated by filtration into broth and mycelia. The broth was extracted with ethyl acetate (600 ml × 2), and the extract was dried in vacuo to yield 105.8 mg of residue. The acetone extract of the mycelia was concentrated in vacuo to give 178.5 mg of an oily residue. The broth and mycelia extracts were combined, dissolved in MeOH and subjected to HPLC purification using a Mightysil RP18GP column (5 µm, 20 × 250 mm, Kanto Chemical Co., Inc.) with a 20–100% (v v–1) MeOH linear gradient in H2O supplemented with 0.05% (v v–1) TFA at a flow rate of 8.0 ml min–1 over 30 min to yield a fraction containing 2 (3.3 mg), a previously unidentified compound in the C. cinerea culture extract. To further purify 2, the resultant fraction was subjected to HPLC purification using a Cosmosil MS-II column (10 × 250 mm, Nacalai Tesque) with a 25% (v v–1) MeCN isocratic elution in H2O supplemented with 0.05% (v v–1) TFA at a flow rate of 4.0 ml min–1 over 60 min to afford 2 (1.9 mg). The resulting solution was subjected to LC–MS analysis for an initial identification of the compound. LC–MS analysis was performed with a Thermo Scientific Exactive liquid chromatography-mass spectrometer using both positive and negative electrospray ionization. Samples were analyzed using an ACQUITY UPLC 1.8 μm, 2.1 × 50 mm C18 reversed-phase column (Waters) and separated on a 10–50% (v v–1) MeCN linear gradient in H2O supplemented with 0.05% (v v–1) formic acid at a flow rate of 500 µl min–1. Subsequently, 2 was recovered as yellow crystals.
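For readers reproducing the first preparative HPLC step, the solvent composition at any time point and the total mobile-phase volume follow from the stated gradient (20–100% MeOH over 30 min at 8.0 ml min–1). The sketch below assumes an ideal linear gradient with no dwell volume or re-equilibration; the function name `methanol_percent` is hypothetical.

```python
def methanol_percent(t_min, start=20.0, end=100.0, duration_min=30.0):
    """Return % MeOH (v/v) at time t_min for an ideal linear gradient."""
    if not 0.0 <= t_min <= duration_min:
        raise ValueError("time is outside the gradient window")
    return start + (end - start) * t_min / duration_min


flow_rate_ml_per_min = 8.0
gradient_duration_min = 30.0

# Total mobile phase delivered during the gradient, in millilitres.
total_volume_ml = flow_rate_ml_per_min * gradient_duration_min

midpoint_percent = methanol_percent(15.0)
```

Under these assumptions the gradient passes through 60% MeOH at the 15 min midpoint and consumes 240 ml of mobile phase per run.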

The chemical structure of 2 was elucidated by UV/Vis spectroscopy, high-resolution electrospray ionization mass spectrometry (HRESIMS) and NMR spectroscopy as well as comparison against the spectroscopic data of 1 [2, 3]. For instance, the absorption maximum at 300 nm indicated the presence of a quinone moiety in the structure. The HRESIMS data indicated a molecular formula of C15H18O5 for 2 based on the [M + H]+ ion signal at m/z 279.1227 (calcd. for C15H19O5+, 279.1227; Δ = 0.0 mmu). The 1H and 13C NMR spectra (Fig. 5a) contained the number of resonance signals expected for 16 protons and 15 carbons. The 1D-NMR (1H and 13C) and HMBC spectra for 2 accounted for 15 carbon signals for the following types of carbons: three tertiary methyls, one olefinic methyl, two methylenes, four aromatic quaternary carbons, two sp3 carbons bearing no hydrogen, one dioxygenated tertiary carbon, and two carbonyl carbons (Fig. 5a). Use of DMSO-d6 supplemented with 0.05% (v/v) of TFA-d1 as the solvent for measuring the 1H NMR spectrum afforded a sharpened signal for C-14 and allowed us to characterize it clearly as a methyl group. In addition, the DQF-COSY spectrum revealed two sets of geminal couplings at H2-8 and H2-10, respectively. The benzoxabicyclo[3.2.1]octane core was revealed by HMBC analyses (Fig. 5). The HMBC correlations for the two methylenes were as follows: H2-8/C-9, C-7, C-10, C-11 and C-14; H2-10/C-9, C-11 and C-13. The presence of a 1,1,2-trimethyl-4,4-disubstituted cyclopentane moiety was also established by the HMBC correlations from three methyls (H3-12/C-7, C-10, C-11 and C-13; H3-13/C-7, C-10, C-11 and C-12; and H3-14/C-6, C-7, C-8 and C-11) and another methyl (H3-15/C-2, C-3 and C-4). The HMBC correlations from H3-14/C-6 and C-7 established the linkage between C-6 and C-7.
Considering that 2 had one remaining degree of unsaturation after accounting for the quinone and the cyclopentane ring, we deduced from the downfield shift of C-5 (δC 181.5 ppm), assigned as a hemiacetal carbon, that 2 had an ether bridge from C-5 to C-9. Consequently, the planar structure of 2 was established as illustrated in Fig. 5. In addition, the stereochemistry of 2 at C-7 and C-9 was determined to be S* at both positions. The stereochemical configuration at those positions was identical to that of 1, in line with the idea that both originate from the same biosynthetic pathway (Fig. 1). The optical rotation of 2 was observed as follows: [α]D20: +4.9 (c 0.19, MeOH). Based on all of the evidence collected, 2 was determined to be hydroxylagopodin B. While the chemical structure has been analyzed in previous reports [25, 26], this is the first report of a rigorous structural elucidation through NMR spectroscopic analysis. A plausible tautomerization of this compound is shown in Fig. 5b.
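Two of the numerical arguments above, the calculated m/z of the [M + H]+ ion and the count of degrees of unsaturation, can be checked with a short sketch. The monoisotopic masses and the electron mass are standard reference values; the helper names are hypothetical, and counting the quinone as five degrees of unsaturation (one ring, two C=C and two C=O) is our reading of the argument rather than a statement from the text.

```python
# Monoisotopic masses of the most abundant isotopes (u) and the electron mass (u).
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}
ELECTRON = 0.00054858


def monoisotopic_mass(formula):
    """Sum the monoisotopic masses for a formula given as {element: count}."""
    return sum(MONOISOTOPIC[element] * n for element, n in formula.items())


def degrees_of_unsaturation(formula):
    """Rings plus pi bonds for a C/H/O compound: (2C + 2 - H) / 2."""
    return (2 * formula["C"] + 2 - formula["H"]) // 2


neutral = {"C": 15, "H": 18, "O": 5}  # neutral formula implied by [M + H]+ = C15H19O5+

# [M + H]+ : add one hydrogen atom and remove one electron.
mz_calc = monoisotopic_mass(neutral) + MONOISOTOPIC["H"] - ELECTRON

total_dbe = degrees_of_unsaturation(neutral)
quinone_dbe = 5       # assumed: one ring, two C=C, two C=O
cyclopentane_dbe = 1  # one ring
remaining_dbe = total_dbe - quinone_dbe - cyclopentane_dbe

print(f"calcd. m/z for [M + H]+: {mz_calc:.4f}")
print(f"degrees of unsaturation: {total_dbe} total, {remaining_dbe} remaining")
```

The single remaining degree of unsaturation corresponds to one additional ring, which is consistent with the ether bridge from C-5 to C-9 proposed above.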

Fig. 5
Chemical structure of 2. a Chemical structures of 2 showing selected key HMBC (arrow) with a table of NMR data of 2 in DMSO-d6 including TFA-d1. b Possible tautomerization of 2

Regarding the proposed lagopodin biosynthetic pathway, since 1 and 2 were both isolated from C. cinerea ku3-24, 2 appeared to be associated with the lagopodin biosynthetic pathway, which we identified to be encoded by a gene cluster containing two cytochrome P450 monooxygenase (P450) genes, cox1 and cox2, and the terpene cyclase gene cop6 [2]. Based on the chemical structure of 2, involvement of a P450 other than Cox1 and Cox2 can be speculated. However, in silico analysis of the gene cluster and the regions within 5 kb from both ends of the gene cluster could not identify any additional P450 gene. It is possible that a yet-to-be-identified P450 is responsible for the formation of the divergent product 2 through a benzyl radical-mediated hydroxylation reaction. Similarly, a radical coupling catalyzed by a P450 that does not result in hydroxylation can also lead to the formation of a dimeric product, lagopodin C (7) [26] (Fig. 1). A previous work that predicted the chemical structure of 7 proposed that the dimerization proceeds via a nonenzymatic base-catalyzed nucleophilic addition [26]. While we have not yet identified 7 in our C. cinerea culture, our detailed characterizations of the lagopodin biosynthetic pathway led us to propose a possible enzymatic mechanism for the formation of this dimeric compound. These findings reconfirm the critical role of P450s and other oxygenating enzymes in introducing chemical diversity into natural products [27, 28].

In this study, we successfully identified a highly expressed area of a chromosome, which we call an EBA, within the genome of C. cinerea by RNA-Seq, and exploited it in an attempt to overexpress the key lagopodin biosynthetic gene cop6. The gene was integrated into the chromosome through homologous recombination, which was made possible with the use of the previously developed ku70-knockout strain of C. cinerea [13]. The resultant strain not only achieved an over 14-fold increase in the production of 1, but also provided previously undetectable metabolites. Isolation and chemical structure determination of one of the metabolites allowed us to establish that 2 was also produced by C. cinerea, implicating possible involvement of yet another cytochrome P450 in the lagopodin biosynthetic pathway and possible enzymatic production of a dimeric product, lagopodin C, from the pathway. While we have not confirmed directly that the placement of cop6 in the EBA resulted in an increased expression of the gene, successful enhancement of the product yield suggests that the use of an EBA may be a powerful means to boost the productivity of otherwise poorly biosynthesized target compounds in complex eukaryotic organisms such as basidiomycetes.