Abstract
Akebiae Fructus (AF) is the dry and near-ripe fruit of Akebia trifoliata (Thunb.) Koidz. Triterpenoid saponins are important characteristic components of AF, but their molecular biosynthetic pathway has not yet been identified. In this study, DNBSEQ was used to sequence the transcriptome of the five stages of AF from May to September. A total of 96.87 Gb of data were assembled, including 115,430 unigenes with an average sequence length of 1445 bp. Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses for metabolic pathway enrichment identified 434 unigenes involved in two metabolic pathways related to triterpenoid saponin biosynthesis, of which 255 encoded 19 key enzymes in the triterpenoid saponin biosynthetic pathway. Sequence analyses and homology modeling of the key enzyme, β-amyrin synthase (β-AS), which is implicated in triterpenoid saponin biosynthesis, showed that it has a conserved substrate-binding domain. Differentially expressed genes (DEGs) were identified by comparing gene expression levels between September and other months, and multiple genes encoding key enzymes, such as β-AS, and the transcription factors (TFs) that are involved in triterpenoid saponin biosynthesis were further analyzed. The chemical constituents of triterpenoid saponins were identified via ultra-high performance liquid chromatography-quadrupole electrostatic field orbitrap mass spectrometry (UHPLC-QE-MS), and triterpenoid saponin content in AF at different developmental periods was determined by high performance liquid chromatography (HPLC). Finally, the expression levels of some unigenes encoding key enzymes were fully verified using quantitative real-time polymerase chain reaction (qRT-PCR). Herein, we elucidate the biosynthetic pathway of triterpenoid saponins in AF and its key enzymes, laying a foundation for future investigations of AF biosynthesis regulation.
Similar content being viewed by others
Avoid common mistakes on your manuscript.
Introduction
Akebiae Fructus (AF) is the immature fruit of the Akebia genus of the Lardizabalaceae family. AF has many beneficial effects, including soothing the liver, regulating qi, promoting blood circulation, relieving pain, dispersing nodules, and diuresis. AF is used in the clinic to treat dysmenorrhea, epigastric distension, and adverse urination, among other conditions (Chen et al. 2021). Components found in AF include triterpenoid saponins, amino acids, non-triterpenoids, monosaccharides, polysaccharides, tannins, and fatty oils (Fang et al. 2021; Ma et al. 2020). Triterpenoid saponins are the main components in AF (Ye et al. 2019), including 14 compounds with hederagenin as a mother nucleus, such as α-hederagenin, saponin A, and saponin D, which have proven antitumor and anti-inflammatory activities (Xing et al. 2017; Xu et al. 2019).
In recent years, transcriptomic analysis using RNA sequencing data has become an important method of gene mining and functional prediction in medicinal plants. Previous studies have capitalized on this to identify synthetic pathways of triterpenoid saponins from Poria, Salviae miltiorrhiza, and Panax ginseng (Zeng et al. 2019; Zhan et al. 2020; Zhao et al. 2015). The triterpenoid saponins biosynthetic pathway can be divided into three stages. First, the upstream precursors isopentenyl diphosphate and dimethylallyl diphosphate are synthesized using a mevalonicacid pathway (MVA) and 2-methyl-D-erthritol-4-phosphate (MEP). Previous studies have shown that the MVA pathway is localized in the endoplasmic reticulum and in the cytoplasm, while the MEP pathway mostly occurs in the cytosol and in mitochondria (Newman et al. 1999; Choi et al. 1992). In the second stage, the universal triterpene precursor, 2,3-oxidosqualene, is obtained from the catalysis of geranyl diphosphate synthase (GPPS), farnesyl diphosphate synthase (FPPS), squalene synthase (SS), and squalene epoxidase (SE). Finally, in the third stage, more than 100 triterpene nuclei with different skeletons are cyclized under the action of oxidosqualene cyclases (OSCS) (Suzuki et al. 2002; Basyuni et al. 2007), before triterpenoid skeletons can be successively modified using cytochrome P450 monooxygenase, UDP-glycosyl transferase, and glycosidase. Next, hydroxylation and glycosylation perform oxidative replacement (Xu et al. 2021, 2014; Yang et al. 2018; Zhu et al. 2020; Wei et al. 2019) allowing for the generation of different types of triterpenoid saponins. OSCS can then produce β-amyrin following catalysis of β-amyrin synthase (β-AS), which is a key step in hederagenin synthesis (Zhang et al. 2016). Thus, β-AS is clearly an important regulatory enzyme in AF triterpenoid saponin biosynthesis.
The peel of fruits in the Akebia genus can be eaten after ripening. Importantly, the medicinal and nutritional value of these fruits is closely related to the accumulation of chemical components as they ripen. Previous studies of AF have largely focused on evaluating quality and understanding the underlying pharmacology of these fruits (Zhang et al. 2013; Zhong et al. 2021; Hong et al. 2017; Wang et al. 2019; Zhang et al. 2022). However, the molecular pathways behind triterpenoid saponin synthesis remain uncharacterized. The work of our research group focuses on the regulation of gene biosynthesis in medicinal plants (Ou et al. 2020; Song et al. 2021), and in this work, we sought to understand the dynamic law of compositional changes in AF during development and the internal genes that regulate compositional changes. As such, experimental samples of fruit at different stages of development were collected and subjected to high-throughput transcriptome sequencing and co-expression analyses to assess the expression patterns of key genes in triterpenoid saponin biosynthetic pathways along the different developmental stages. Moreover, we also sought to investigate the internal biosynthetic and transportation mechanisms of triterpenoid saponins. Transcriptomic and metabonomic analyses were used to explore changes in triterpenoid saponins at different developmental stages to provide a basis for the comprehensive application of AF.
Materials and Methods
Plant Material and RNA Extraction
Fresh AF was harvested from the planting base of Jinzhai County, Lu’an, Anhui Province, and officially identified as Akebia trifoliata (Thunb.) Koidz by Associate Professor Jinmei Ou of Anhui University of Chinese Medicine. Fresh fruits were collected at each of the five developmental stages from the beginning of May to the beginning of September (Y5-Y9). Fresh plant samples were washed several times with ultrapure water and then air-dried on filter paper. Samples from each developmental stage were placed into 10-mL centrifuge tubes, instantly frozen in liquid nitrogen, and then stored at − 80 °C until subsequent RNA extraction. Total RNA was extracted from each sample using an RNA kit. Three biological replicates were sat up for each treatment.
cDNA Library Construction and Sequencing
The mRNA enrichment method was used to purify qualified total RNA with magnetic beads using Oligo (dT) and to enrich the mRNA with a polyA tail. An appropriate amount of breaking reagent was added to the resulting mRNA sample, which was segmented under high-temperature conditions. Next, single-stranded cDNA was synthesized using the broken mRNA as a template. Subsequently, two-strand cDNA was synthesized using a two-strand synthesis reaction system. This two-strand cDNA was then purified and recovered with a kit, the sticky ends were repaired, and single adenosine bases were added to the 3′ end of the cDNA before connecting to the sequencing connector. Next, fragment size and length were selected for polymerase chain reaction (PCR) amplification, before the cDNA library was successfully generated. The constructed libraries were assessed and examined using the Agilent 2100 Bioanalyzer and the ABI StepOnePlus real-time PCR System (Zhao et al. 2020; Shan et al. 2020a; Shi et al. 2019). The Illumina HiSeq sequencing platform (BGI, Shenzhen, China) was used to perform high-throughput sequencing analysis of the complete cDNA libraries with the samples. Sequencing data were deposited in the NCBI SRA database (accession number: PRJNA857910).
Transcriptome De Novo Assembly and Gene Functional Annotation
Following transcriptome sequencing, Trimmomatic 0.36 was used to filter out reads containing joints (joint contamination), unknown base N contents greater than 5%, and low-quality reads to obtain high-quality reads. Clean high-quality reads were then assembled using Trinity 2.0.6 software (Kim et al. 2017). Unigenes were obtained using Tgicl software for clustering and redundancy removal (Pertea et al. 2003), and these were used as reference sequences for subsequent analyses. The candidate coding regions in the predictor unigenes were searched using the TransDecoder software (v 3.0.1), the longest open reading frame (ORF) was extracted, and based on the sequence similarity, the ORF was cast to the SwissProt database using the sequence comparison tool, Diamond Blastp. Homologous sequences of the PFAM proteins were then searched using Hmmscan and the coding sequence (CDS) of predictor unigenes coding sequence was predicted using TransDecoder.Predict function (Kim et al. 2015). The parameters were then set according to the assembly results, and the simple sequence repeat (SSR) of the predictor unigenes was detected using MISA (v 1.0), leading to the identification of six nucleotide repeat types (Thiel et al. 2003).
Using NCBI Blast + (Pirooznia et al. 2008) tool, the unigene NCBI non-redundant nucleic acid sequence database (NT, https://www.ncbi.nlm.nih.gov/blast/db), NCBI non-redundant protein sequence database (NR, https://www.ncbi.nlm.nih.gov/blast/db), protein direct homology database (KOG, http://www.ncbi.nlm. nih.gov/KOG), SwissProt protein sequence database (SwissProt, http://ftp.ebi.ac.uk/pub/databases/swissprot), and the Kyoto Encyclopedia of Genes and Genomes (KEGG, http://www.genome.jp/kegg), multi-sequence alignment results were matched with the annotation database of the protein family domain constructed using a hidden Markov model (multi-sequence alignment and protein family database, PFAM, http://pfam.Xfam.org). Using annotation data from the NR library, the similarity between the transcription sequences of species and related species, as well as the functional information of homologous sequences, was checked, and Gene Ontology (GO, http://geneontology.org) information on related unigenes was matched with Blast2GO 2.5.0. E < 1 × 10−5 was set to obtain the corresponding functional annotations and the classification information of the unigenes (Shan et al. 2020b).
Structural Characteristics Analysis of β-AS
The β-AS was translated using the Translatetool online software, and the complete ORF sequence was screened. Triterpenoid saponin sequences were compared using DNAMAN and Clustalx software, and secondary structures within β-AS were visualized using ESPript 3.0 software. Next, the SWISS-MODEL (https://swissmodel.expasy.org/) online system and PyMOL software were used to model and visualize β-AS tertiary structures.
Differentially Expressed Gene Analyses
High-quality reads for each sample were assessed using Bowtie2 (version 2.2.5) software against AF genomic sequences (Langmead et al. 2012). Next, gene and transcript expression levels for each developmental stage were calculated using RSEM (version 1.2.8) software and the TPM formula. The standard expression of each unigene was obtained (fragments per kilobase of exon model per million mapped fragments, FPKM) (Li et al. 2011). Functional annotation and KEGG pathway enrichment analysis of differentially expressed genes (DEGs) were conducted for fruit samples from the five developmental stages. DEGs were assigned the default false discovery rate (FDR) ≤ 0.001.
Transcription Factor-Encoding Genes Analyses
Genes encoding transcription factors (TFs) along with the TF families were classified. Next, unigene ORFs were detected using Getorf (EMBOSS:6.5.7.0) (Rice et al. 2000) and aligned to TF protein domains using Hmmsearch (version 3.0). Subsequently, competent unigenes were identified based on the TF family features found in PlantTFDB (Mistry et al. 2013).
Ultra-high Performance Liquid Chromatography-Quadrupole Electrostatic Field Orbitrap Mass Spectrometry (UHPLC-QE-MS) Analysis
AF metabolites were analyzed by high-performance liquid chromatography (HPLC) coupled with tandem mass spectrometry (LC–MS/MS) (Vanquish, Thermo Fisher Scientific) (Xie et al. 2021). The samples were separated on a Waters UHPLC BEH C18 column (2.1 mm × 100 mm, 1.7 μm). The flow rate was set at 0.5 mL/min, the column temperature was maintained at 50 °C, and the sample injection volume was 5 μL. The mobile phase consisted of 0.1% (v/v) formic acid–water (A) and 0.1% (v/v) formic acid-acetonitrile (B). Multi-step linear elution gradient was carried out as follows: 0–11 min, 85–25% A; 11–12 min, 25–2% A; 12–14 min, 2% A; 14–14.1 min, 2–85% A; and 14.1–16 min, 85% A. The mass spectrum with a scanning mass range of 100–1500 Da was then obtained using the Orbitrap Exploris 120 mass spectrometer in negative mode. The detailed parameters were as follows: sheath gas flow, 35 Arb; auxiliary gas flow, 15 Arb; ion transfer tube temperature, 350 °C; vaporizer temperature, 350 °C; full ms resolution, 60,000; MS/MS resolution, 15,000; collision energy, 16/32/48 in NCE mode; and spray voltage, − 4.00 kV (negative) (Wu et al. 2023).
Determining Triterpenoid Saponins Content
The α-hederin content in AF at different developmental stages was determined using an Agilent 1260 HPLC system equipped with a Waters SunFire C18 column (4.6 × 250 mm, 5 µm). The mobile phase consisted of an acetonitrile:water:phosphate mix (45:55:0.1). The flow rate was maintained at 1.0 mL/min with the detection wavelength at 203 nm, the column temperature at 30 ℃, and the injection volume at 20 μL. For each sample, 1 g of AF powder (via sieve four) was added to a conical flask with a stopper and extracted with 100 mL of 75% (v/v) methanol (300 W, 50 kHz, 30 min) at room temperature by ultrasonography. The sample weight was then calculated, and the lost weight was supplemented with 75% (v/v) methanol to obtain the final test solution (Deng et al. 2021).
Subsequently, 5 mg of α-hederin control was carefully weighed, and 10 mL of methanol was added to generate a solution with mass concentration of 0.5 mg/mL before it was filtered through a 0.22-µm microporous filter membrane. Next, various volumes of this α-hederin standard solution (i.e., 1 µL, 2 µL, 4 µL, 6 µL, 8 µL, 10 µL, 12 µL, and 14 µL) were taken for measurement. Linear regression analysis was used to generate a standard curve with the sample volume (µL) on the x-axis and the peak area on the y-axis.
Triterpenoid Saponin Biosynthetic Gene Expression Analyses
To validate our transcriptomic data, we assessed the expression of key enzyme genes involved in triterpenoid saponin biosynthesis by quantitative real-time PCR (qRT-PCR). Primers for genes involved in biosynthesis as well as those for reference genes (actin) were designed using Primer Premier (version 5.0). After reverse transcription, each reaction was prepared in a 15 μL volume containing diluted cDNA (2 μL), 1.5 μL of 2.5 μM primer (forward and reverse), qPCR Mixture (2X, 7.5 μL), and RNase-free water (4 μL). The following thermal cycling protocol was followed: initial denaturation at 95 °C for 10 min, followed by 40 cycles of denaturation at 95 °C for 15 s, and annealing or extension at 60 °C for 30 s. Three technical replicates were performed for each sample. The relative expression of each selected unigene was normalized against actin (Un 10407) and analyzed using the 2−ΔΔCt method (Livak et al. 2001).
Results
RNA Sequencing and De Novo Transcriptome Assembly
We obtained 96.87 Gb of data from transcriptome sequencing of AF samples. The Clean Reads Q20 value for each sample was > 97.28%, and the Clean Reads Q30 value was > 92.85%, suggesting good date reliability. After Trinity assembly and Tgicl de-redundancy, 115,430 unigenes were obtained. The total sequence length, average length, N50, and GC contents were 166,896,333 bp, 1445 bp, 2269 bp, and 40.16%, respectively (Table 1). The unigenes obtained after assembly of the AF sequences were distributed across 200–500 bp (33,848 unigenes), accounting for 29.32% of the total, and gene sequences exceeding 3000 bp accounted for 11.31% (Fig. 1a). Taken together, the unigenes were relatively complete allowing for functional annotation.
Using TransDecoder software CDS to predict the predictor data, we found that 66,882 unigenes were successfully predicted as CDS’. The total length of the CDS’ was 77,730,759 bp with a maximum and minimum length of 15,297 bp and 297 bp, respectively. Further, the N50 was 1485 bp while the N90 was 576 bp, and the GC content was 43.62%, of which 52,703 CDS’ were larger than 500 bp, accounting for 78.80% of the data (Fig. 1b). SSR detection was performed on the transcriptome data using MISA software, and screening detected that there were 55,777 SSRs distributed in 37,794 unigenes. Among them, the main types were dinucleotide repeats (24,619, 44.14%), followed by mononucleotides (17,736, 31.80%) and trinucleotides (10,268, 18.41%) (Table S1).
Functional Classification and Expression Overview of Unigenes
Among the 115,430 unigenes, 81,143, 69,686, 63,991, 66,446, 65,365, 61,359, and 65,675 unigenes were annotated using the NR, NT, SwissProt, KEGG, KOG, PFAM, and GO functional databases. All databases annotated 36,704 unigenes, or 31.80% of the total, and 85,859 unigenes were annotated by at least one database, constituting 74.38% of the total (Table 2 and Fig. S1).
In the transcriptome data of AF samples taken in May, June, July, August, and September, 81,114, 76,201, 78,848, 75,823, and 71,014 unigenes were obtained, respectively. Of these, which 33,805, 35,472, 34,192, 32,623, and 32,753 had low expression (FPKM ≤ 1); 36,999, 30,704, 34,337, 32,616, and 27,824 had medium expression (FPKM = 1–10), and 10,310, 10,025, 10,319, 10,584, and 10,437 had high expression (FPKM ≥ 10) (Fig. 2).
Identification and Analyses of Triterpenoid Saponin Pathways in AF
The metabolic pathways that AF may be involved in were analyzed using the KEGG database (Kanehisa et al. 2000; Kanehisa et al. 2019; Kanehisa et al. 2021). In all, 66,446 unigenes were annotated in the AF transcriptome, including 2798 for cell processes, 4288 for environmental information processing, 14,307 for genetic information processing, 36,398 for metabolism, and 3233 for the organic system. The five categories included 136 metabolic pathways. These results indicate that, among other pathways for the synthesis of secondary metabolites, 911 unigenes were mapped to the phenylpropanoid biosynthesis pathway, occupying the highest (Fig. 3a). Moreover, terpenoid and polyketide metabolism were mainly distributed in terpenoid backbone biosynthetic processes, containing 309 unigenes (Fig. 3b).
There were two metabolic pathways for triterpenoid saponin biosynthesis, including 309 unigenes involved in terpenoid backbone biosynthesis, and 125 involved in sesquiterpenoid and triterpenoid biosynthesis. These two metabolic pathways involved a total of 434 unigenes. Those related to enzymes in the triterpenoid saponin synthesis pathway were further sorted: 10 unigenes encoded acetyl-CoA acetyltransferase (AACT), 29 encoded GPPS, 10 encoded SS, and 43 encoded β-AS (Table 3). The FPKM of unigenes was greater than 1, and the distance from the complete coding frame was less than 50. There were 7, 4, 4, 3, 5, and 5 unigenes encoding AACT, 3-hydroxy-3-methylglutaryl-CoA synthase (HMGS), 3-hydroxy-3-methylglutaryl-CoA reductase (HMGR), the mevalonicacid pathway (MVK), phosphomevalonate kinase (PMK), and mevalonate-5-pyrophosphate decarboxylase (MVD) in the MVA pathway, respectively. Further, there were 7, 1, 2, 2, 3, 7, and 2 unigenes encoding 1-deoxy-D-xylulose-5-phosphate synthase (DXS), 1-deoxy-D-xylulose-5-phosphate reductoisomerase (DXR), 2-C-methyl-D-erythritol 4-phosphate cytidylyltransferase (MCT), 4-diphosphocytidyl-2-C-methyl-D-erythritol kinase (CMK), 2-C-methyl-D-erythritol 2,4-cyclodiphosphate synthase (MDS), (E)-4-hydroxy-3-methylbut-2-enyl-diphosphate synthase (HDS), and (E)-4-hydroxy-3-methylbut-2-enyl-diphosphate reductase (HDR) in the MEP pathway, respectively. The formation of the carbon ring skeleton requires 4, 14, 2, 5, and 5 unigenes for isopentenyl-diphosphate delta-isomerase (IDI), GPPS, FPPS, SS, and SE, respectively (Fig. 4). The relative expression of these key enzyme genes in the five developmental stages in the biosynthetic pathway of AF triterpenoid saponins is shown as a heat map. Importantly, we found that β-AS expression in August was significantly higher compared to September. Moreover, AACT and MDS expression was upregulated in July and September, respectively. In addition, HMGS, PMK, SE, FPPS, and IDI expression was highest in June, and the same was true for GPPS in August.
Structural Characteristics of the Key β-AS Enzyme
Transcriptome data analyses indicated 43 unigenes encoding β-AS. The amino acid sequences were compared to the NCBI Blast database, and it was determined that nine unigenes had complete ORFs. The sequences of the nine genes were compared using DNAMAN software, and the results indicated a sequence identity of 90.82%, thereby suggesting that sequences encoding β-AS had high homology. Their three-dimensional structures, constructed using SWISS MODEL, also showed a high degree of similarity (Fig. 5a). The unigene, CL 11200–8, which had the highest expression level in the complete reading frame, had an ORF length of 2304 bp and encoded 767 amino acids. Next, we used the β-AS protein sequence (CL 11200–10, 42.00%) with high sequence identity to the template protein to model the secondary and tertiary structures. Secondary structure predictions showed that β-AS consists of 43.94% alpha helices, 36.77% random coils, 12.39% extended strands, and 6.91% β-turns. The β-AS tertiary structure was modeled using PyMOL software, and we found that it contains a characteristic MWCYCR region and a highly conserved substrate binding site (DCTAE), which is consistent with the conserved β-AS sequences reported in other plants and in animals (Kushiro et al. 2000; Ito et al. 2013) (Fig. 5b, c).
DEGs Analysis
According to our transcriptome data, 71,596 unigenes were co-expressed in the five developmental stages of AF, and 1865, 2080, 1177, 1441, and 840 unigenes were expressed in May, June, July, August, and September, respectively (Fig. 6a). DESeq was used to compare AF samples from the five developmental stages, and DEGs were analyzed (Fig. 6b). For Y9 vs. Y5, there were 16,905 significant DEGs, including 9191 up- and 7714 down-regulated genes. For Y9 vs. Y6, there were 14,972 significant DEGs, including 8218 up- and 6754 down-regulated genes. For Y9 vs. Y7, we identified 10,778 significant DEGs, including 5597 up- and 5181 down-regulated genes. Finally, for Y9 vs. Y8, we observed 7053 significant DEGs, including 3366 up- and 3687 down-regulated genes. These results suggest that AF samples taken in September were more divergent than samples taken in the other months. The DEGs of Y9/Y5, Y9/Y6, Y9/Y7, and Y9/Y8 were analyzed using a Venn diagram, for a total of 7444, 5899, 3651, 1635 DEGs in Y9/Y5, Y9/Y6, Y9/Y7, and Y9/Y8, respectively.
KEGG enrichment analyses of DEGs from different AF developmental stages of AF showed that 6407 DEGs were annotated to 132 metabolic pathways in Y5 vs. Y9, with a focus on enrichment in plant hormone signal transduction (ko04075, 458 DEGs), starch and sucrose metabolism (ko00500, 321 DEGs), and amino sugar and nucleotide sugar metabolism (ko00520, 283 DEGs) (Table S2). Further, Y6 vs. Y9, 5,775 DEGs were annotated to 134 metabolic pathways, with a focus on plant hormone signal transduction (ko04075, 409 DEGs), amino acids biosynthesis (ko01230, 340 DEGs), and carbon metabolism (ko01200, 340 DEGs) (Table S3). In addition, 4297 DEGs were annotated to 132 metabolic pathways in the Y7 vs. Y9, with an emphasis on plant hormone signal transduction (ko04075, 267 DEGs) and starch and sucrose metabolism (ko00500, 238 DEGs) (Table S4). Similarly, 2960 DEGs were annotated to 133 metabolic pathways in Y8 vs. Y9, with an emphasis on plant hormone signal transduction (ko04075, 201 DEGs) and starch and sucrose metabolism (ko00500, 170 DEGs) (Table S5).
Analysis of TFs Related to Triterpenoid Saponin Biosynthesis
TFs play key roles in regulating the activities of genes involved in triterpenoid saponin biosynthesis and other secondary metabolic processes (Dong et al. 2020). In the AF transcriptome database, 3490 unigenes were annotated as TFs and grouped into 56 TF families. The MYB family had the largest number of unigenes (422), followed by the WRKY family (261), NAC family (241), and bHLH family (213) (Fig. 7). Next, KEGG metabolic pathway enrichment analysis was performed on 3490 unigenes. Among the three metabolic pathways of triterpenoid saponin synthesis, only Un 15,655, a unigene encoding TF, was involved in terpenoid backbone biosynthesis. This unigene belongs to the bHLH family and regulates hydroxymethylglutaryl-CoA reductase (NADPH).
UHPLC-QE-MS and HPLC Analyses of Triterpenoid Saponins in AF
A total of 119 metabolites were identified using the UHPLC-QE-MS, the specific classifications and numbers of which are shown in Fig. 8a. Of these, 29 were terpenoids, which accounted for the highest proportion (24.37%). Notably, among the 29 terpenoids identified, 18 were triterpenoid saponins. A qualitative analysis of the triterpene saponin metabolites of AF revealed the presence of α-hederin in AF. The total ion flow diagram for positive and negative ion modes is shown in Fig. S2. In addition, based on retention time (Tr), maximum UV absorption wavelength (λ-MS), molecular ion peaks, and ESI–MS data, oleanolic acid, hawthornic acid, and ginsenoside Ro may also be present in AF (Fig. S3 and Table S6).
HPLC analysis was carried out to further compare changes in triterpenoid saponin expression in prehensile seeds at different developmental stages. To this end, α-hederin expression was calculated (Fig. 8b) according to the generated the standard curve (Y = 4.703 1 X − 0.097 7, R2 = 0.999 9). Dynamic changes in α-hederin expression in the AF samples first increased then decreased, before finally increasing again. The accumulation of α-hederin in AF peaked in September, probably because AF typically mature around September.
Validation of RNA-Seq Data with qRT-PCR
To independently validate the transcriptome data and DEGs at different developmental periods, six unigenes involved in triterpenoid saponin biosynthesis were selected for qRT-PCR analysis. The levels of CL 2686–3 (GPPS) expression were the highest in May. Similarly, CL 2371–4 (AACT), CL 2808–1 (IDI), and Un 21,574 (PMK) expression was peaked in June, while CL 3010–8 (HMGR) and CL 1616–6 (SS) expression was highest in July. Importantly, these data are consistent with our transcriptome data (Fig. 9).
Discussion
Triterpenoid saponins are the main bioactive components in AF and exhibit good pharmacological activities. We identified triterpenoid saponins in AF by UHPLC-QE-MS analysis and determined α-hederin expression by HPLC. Importantly, α-hederin is an indicator and characteristic component of AF, with strong antitumor activity (Cao et al. 2022). We found that α-hederin expression increased from May to June in AF, decreased from June to August, and peaked in September, suggesting that AF is more medicinally useful in September. This is in agreement with previous studies, and importantly, these data provide guidance for the optimal harvesting period for AF (Xiong et al. 2008; Cai et al. 2022). These observed changes were also consistent with our transcriptomic data, indicating a certain correlation between the composition and transcriptomic expression.
Transcriptome sequencing was carried out on samples from five developmental stages using the DNBSEQ platform. After reassembly and de-redundancy procedures, 115,430 unigenes were obtained, with a total length, an average length, N50, and GC content of 166,896,333 bp, 1445 bp, 2269 bp, and 40.16%, respectively. These unigenes were compared to the NR, NT, SwissProt, KOG, KEGG, GO, and PFAM databases. From this, 81,143 (70.30%), 69,686 (60.37%), 63,991 (55.44%), 65,365 (56.63%), 66,446 (57.56%), 65,675 (56.90%), and 61,359 (53.16%) unigenes were functionally annotated, respectively. At the same time, 3490 unigenes encoding TFs were predicted; all Clean Reads Q30 values were greater than 92.85%, indicating that the transcriptome database was reliable and the assembly quality was high.
A large number of unigenes, 65,675, were assigned to the GO database, which indicated significant diversity in the AF transcriptome across the different developmental stages. A total of 66,446 unigenes were annotated in the KEGG database, and 451 unigenes were found to be involved in triterpenoid saponin biosynthesis, of which 255 encoded 19 enzymes that play a key role in that process. Next, the expression of key enzyme genes was determined according to their FPKM values. We found that AACT (CL 2371–4), PMK (Un 21,574), and SS (CL 1616–6) expression first increased before it decreased, suggesting that the synthesis rate may change accordingly, providing a basis for further research and cultivation. AACT (CL 2371–4), GPPS (CL 2686–3), HMGR (CL 3010–8), IDI (CL 2808–1), PMK (Un 21,574), and SS (CL 1616–6) genes were assessed using qRT-PCR, and the authenticity and reliability of our transcriptional data were verified.
As a key enzyme in oleanane triterpene saponin synthesis, β-AS is also one of the reasons for the diversity of triterpene products (Wu et al. 2013). The full-length (2304 bp) of the β-AS gene (CL 11200–10), which is highly consistent with the template protein sequence, encodes 767 amino acids. The sequence of published plant β-AS identified through the NCBI search was between 2000 and 2800 bp, with an amino acid sequence of 690–780 residues in length (Zhao et al. 2019). Our data show that β-AS is highly conserved, with secondary structures mainly composed of an alpha helices and random coils, followed by extended strands and β-turns. These secondary structures are also very similar in different plants, indicating additional β-AS conservation at the secondary structure. A search of the conserved domains in the NCBI database showed that β-AS contains a conserved DCTAE substrate-binding domain. These residues may be necessary to open the oxygen ring (Haralampidis et al. 2002; Dang et al. 2000; Shibuya et al. 1999).
There were 16,905 significant DEGs in Y9 vs. Y5, 14,972 in Y9 vs. Y6, 10,778 in Y9 vs. Y7, and 7053 in Y9 vs. Y8. The DEGs of different developmental stages shows that AF harvested in September is quite different from compared to fruits harvested at other times of the year, with the number of DEGs also being highest in September. This is in line with the pervious observed phenomenon, which further proves that AF has important medicinal value in September (August in the lunar calendar).
In the transcriptome database, 3490 candidate TFs were assigned to 56 TF families. These TFs may play an important role in regulating triterpenoid saponin biosynthesis and plant metabolism. The MYB family has the largest number of unigenes at 422. MYB factors are very important for the biosynthesis of terpenoid trunks. MYB can be divided into four categories. The subclass R2R3-MYB is the most widely distributed MYB TF in plants (Dubos et al. 2010), and it regulates the biosynthesis of terpenoid secondary metabolites. In-depth study of MYB TFs has shown that R2R3-MYB TFs regulate terpenoid synthesis in plants. For example, R2R3-MYB TF PnMYB1 can simultaneously regulate the expression of two key enzyme genes in the biosynthesis pathway of Panax notoginseng saponins, thereby affecting their biosynthesis (Lei et al. 2022). In addition, overexpression of CrBPF1 TFs can increase the transcription level of genes related to terpene biosynthesis in Catharanthus roseus to also regulate terpene biosynthesis (Li et al. 2015).
In summary, we conducted a comprehensive transcriptome analysis five developmental stages of AF, identified genes related to triterpenoid saponin biosynthesis using RNA-Seq, identified the key enzymes involved in biosynthesis, and verified the RNA-Seq data of some genes with qRT-PCR. Our data provide novel insights into the metabolic pathways related to AF triterpenoid saponins. These data will also aid further analyses of the molecular mechanisms involved in regulating AF triterpenoid saponin biosynthesis.
Data Availability
No datasets were generated or analysed during the current study.
Abbreviations
- AF:
-
Akebiae Fructus
- KEGG:
-
Kyoto Encyclopedia of Genes and Genomes
- β-AS:
-
Beta-amyrin synthase
- DEGs:
-
Differentially expressed genes
- TFs:
-
Transcription factors
- UHPLC-QE-MS:
-
Ultra-high-performance liquid chromatography-quadrupole electrostatic field orbitrap mass spectrometry
- HPLC:
-
High-performance liquid chromatography
- qRT-PCR:
-
Quantitative real-time polymerase chain reaction
- MVA:
-
Mevalonicacid pathway
- MEP:
-
2-Methyl-D-erthritol-4-phosphate
- GPPS:
-
Geranyl diphosphate synthase
- FPPS:
-
Farnesyl diphosphate synthase
- SS:
-
Squalene synthase
- SE:
-
Squalene epoxidase
- OSCs:
-
Oxidosqualene cyclases
- PCR:
-
Polymerase chain reaction
- ORF:
-
Longest open reading frame
- CDS:
-
Coding sequence
- SSR:
-
Simple sequence repeat
- NT:
-
NCBI non-redundant nucleic acid sequence database
- NR:
-
NCBI non-redundant protein sequence database
- KOG:
-
Protein direct homology database
- SwissProt:
-
Swiss-Prot protein sequence database
- PFAM:
-
Multi-sequence alignment and protein family database
- GO:
-
Gene Ontology
- FPKM:
-
Fragments per kilobase of exon model per million mapped fragments
- FDR:
-
False discovery rate
- TFs:
-
Transcriptome factors
- LC–MS/MS:
-
High-performance liquid chromatography coupled with tandem mass spectrometry
- RNA-Seq:
-
RNA sequencing
- AACT:
-
Acetyl-CoA acetyltransferase
- HMGS:
-
3-Hydroxy-3-methylglutaryl-CoA synthase
- HMGR:
-
3-Hydroxy-3-methylglutaryl-CoA reductase
- MVK:
-
Mevalonate kinase
- PMK:
-
Phosphomevalonate kinase
- MVD:
-
Mevalonate-5-pyrophosphate decarboxylase
- DXS:
-
1-Deoxy-D-xylulose-5-phosphate synthase
- DXR:
-
1-Deoxy-D-xylulose-5-phosphate reductoisomerase
- MCT:
-
2-C-methyl-D-erythritol 4-phosphate cytidylyltransferase
- CMK:
-
4-Diphosphocytidyl-2-C-methyl-D-erythritol kinase
- MDS:
-
2-C-methyl-D-erythritol 2,4-cyclodiphosphate synthase
- HDS:
-
(E)-4-hydroxy-3-methylbut-2-enyl-diphosphate synthase
- HDR:
-
(E)-4-hydroxy-3-methylbut-2-enyl-diphosphate reductase
- IDI:
-
Isopentenyl-diphosphate delta-isomerase
References
Basyuni M, Oku H, Tsujimoto E, Kinjo K, Baba S, Takara K (2007) Triterpene synthases from the Okinawan mangrove tribe. Rhizophoraceae FEBS J 274(19):5028–5042
Cai FL, Zou SY, Gao PX, Jia TJ, Cheng CS, Fan SQ, Ding JM, Huang HW (2022) Analysis on quality variations of Akebia trifoliata fruit at different harvesting times. J Plant Resour Environ 31(1):83–85
Cao LN, Peng PK, Pan ZQ (2022) Research advances on anti-tumor effects of plants of Akebia extracts. Chin Tradit Herb Drugs 53(13):4187–4197
Chen WC, Wang C, Chen T, Hu W, Zeng JH (2021) Mechanism of action of Fructus Akebiae in the treatment of hepatocarcinoma: a network pharmacology analysis. Acta Med Univ Sci Technologiae Huazhong 50(04):471–477
Choi D, Ward L, Bostock RM (1992) Differential induction and suppression of potato 3-hydroxy-3-methylglutaryl coenzyme A reduetase genes in response to Phytophthora infestans and to its elieitor arechidonic acid. Plant Cell 4(10):1333–1344
Dang T, Prestwich GD (2000) Site-directed mutagenesis of squalene-hopene cyclase: altered substrate specificity and product distribution. Chem Biol 7(8):643–649
Deng JJ, He JJ, Ma HY (2021) Establishment of fingerprint and identification of the components in Akebiae Fructus formula granule by HPLC-Q-TOF-MS. Chin J Mod Appl Pharm 38(6):715–720
Dong YM, Zhang WY, Ling ZY, Li JR, Bai HT, Li H, Shi L (2020) Advances in transcription factors regulating plant terpenoids biosynthesis. Chin Bull Bot 55(03):340–350
Dubos C, Stracke R, Grotewold E, Weisshaar B, Martin C, Lepiniec L (2010) MYB transcription factors in Arabidopsis. Trends Plant Sci 15(10):573–581
Fang Y, Bi Y, Li XC, Hu YB, Liu XL (2021) Provenance selection and biological characteristics of fruit type Akebia trifoliata (Thunb.) Koidz. Hubei Agric Sci 60(01):68–71+76
Haralampidis K, Trojanowska M, Osbourn AE (2002) Biosynthesis of triterpenoid saponins in plants. Adv Biochem Eng Biotechnol 75:31–49
Hong HL (2017) A preliminary study on the domestication and cultivation technique of wild Akebia trifoliata (Thunb.) Koidz. in Guangde County. Anhui Agric Sci Bull 23(13):53–54
Ito R, Masukawa Y, Hoshino T (2013) Purification, kinetics, inhibitors and CD for recombinant beta-amyrin synthase from Euphorbia tirucalli L and functional analysis of the DCTA motif, which is highly conserved among oxidosqualene cyclases. FEBS J 280(5):1267–1280
Kanehisa M (2019) Toward understanding the origin and evolution of cellular organisms. Protein Sci 28:1947–1951
Kanehisa M, Goto S (2000) KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Res 28:27–30
Kanehisa M, Furumichi M, Sato Y, Ishiguro-Watanabe M, Tanabe M (2021) KEGG: integrating viruses and cellular organisms. Nucleic Acids Res 49:D545–D551
Kim HS, Lee BY, Won EJ, Han J, Hwang DS, Park HG, Lee JS (2015) Identification of xenobiotic biodegradation and metabolism-related genes in the copepod Tigriopus japonicus whole transcriptome analysis. Mar Genom 24(3):207–208
Kim CS, Winn MD, Sachdeva V, Jordan KE (2017) K-mer clustering algorithm using a MapReduce framework: application to the parallelization of the inchworm module of trinity. BMC Bioinformatics 18(01):467
Kushiro T, Shibuya M, Masuda K, Ebizuka Y (2000) Mutational studies on triterpene synthases: engineering lupeol synthase into β-amyrin synthase. J Amer Chem Soc 122(29):6816–6824
Langmead B, Salzberg SL (2012) Fast gapped-read alignment with Bowtie 2. Nat Methods 9(04):357–359
Lei J, Chen Q, Deng B, Zhang JY, Liu DQ, Cui XM, Ge F (2022) Biosynthesis of Panax notoginseng saponins regulated by R2R3-MYB transcription factor PnMYB1. Biotechnol Bull 38(05):74–83
Li B, Dewey CN (2011) RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome. BMC Bioinforma 12:323
Li CY, Leopold AL, Sander GW, Shanks JV, Zhao L, Gibson SI (2015) CrBPF1 overexpression alters transcript levels of terpenoid indole alkaloid biosynthetic and regulatory genes. Front Plant Sci 6:818
Livak KJ, Schmittgen TD (2001) Analysis of relative gene expression data using real-time quantitative PCR and the 2-ΔΔCt method. Methods 25(04):402–408
Ma BL, Li JM, He MZ, Ou YH, Tan T, Yang SL, Feng YL (2020) In-vivo studies of the metabolites of Akebiae Fructus extract based on UHPLC-Q-TOF/MS method. Tradit Chin Drug Res & Clin Pharmacol 31(11):1350–1359
Mistry J, Finn RD, Eddy SR, Bateman A, Punta M (2013) Challenges in homology search: HMMER3 and convergent evolution of coiled-coil regions. Nucleic Acids Res 41(12):e121
Newman JD, Chappll J (1999) Isoprenoid biosynthesis in plant: carbon partitioning within the cytopl-asmic pathway. Crit Rev Biochem Mol Biol 34(02):95–106
Ou JM, Yang X, Shan CM, Zhang SX, Shi YY, Wu JW, Huang LQ, Wang R (2020) Transcriptome analysis of “Langmei” fruits and key enzyme genes structure and function prediction involved in citric acid biosynthesis. China J Chin Mater Med 45(19):4606–4616
Pertea G, Huang X, Liang F, Antonescu V, Sultana R, Karamycheva S, Lee Y, White J, Cheung F, Parvizi B, Tsai J, Quackenbush J (2003) TIGR gene indices clustering tools (TGICL): a software system for fast clustering of large EST datasets. Bioinformatics 19(05):651
Pirooznia M, Perkins EJ, Deng Y (2008) Batch blast extractor: an automated blastx parser application. BMC Genomics 9(S2):S10
Rice P, Longden I, Bleasby A (2000) EMBOSS: the European molecular biology open software suite. Trends Genet 16(06):276–277
Shan CM, Wang CK, Shi YY, Zhang SX, Zhao LQ, Wu JW (2020a) Identification of key enzyme genes involved in biosynthesis of steroidal saponins and analysis of biosynthesis pathway in Polygonatum cyrtonema. China J Chin Mater Med 45(12):2847–2857
Shan CM, Wang CK, Zhang SX, Shi YY, Ma KL, Yang QS, Wu JW (2020b) Transcriptome analysis of Clinopodium gracile (Benth.) Matsum and identification of genes related to triterpenoid saponin biosynthesis. BMC Genomics 21(01):49
Shi YY, Zhang SX, Peng DY, Wang CK, Zhao DR, Ma KL, Wu JW, Huang LQ (2019) Transcriptome analysis of Clinopodium chinense (Benth.) O. Kuntze and identification of genes involved in triterpenoid saponin biosynthesis. Int J Mol Sci 20(11):2643
Shibuya M, Zhang H, Endo A, Shishikura K, Kushiro T, Ebizuka Y (1999) Two branches of the lupeol synthase gene in the molecular evolution of plant oxidosqualene cyclases. Eur J Biochem 266(1):302–307
Song C, Li XL, Jia B, Liu L, Ou JM, Han BX (2021) De novo transcriptome sequencing coupled with co-expression analysis reveal the transcriptional regulation of key genes involved in the formation of active ingredients in Peucedanum praeruptorum Dunn under bolting period. Front Genet 12:683037
Suzuki H, Achnine L, Xu R, Matsuda SP, Dixon RA (2002) A genomics approach to the early stages of triterpene saponin biosynthesis in Medicago truncatula. Plant J 32(6):1033–1048
Thiel T, Michalek W, Varshney RK, Graner A (2003) Exploiting EST databases for the development and characterization of gene-derived SSR-markers in barley (Hordeum vulgare L.). Theor Appl Genet 106(3):411–422
Wang X, Yuan MM, Zhou GP, Wu X, Zeng S, Zhang LJ, Zhong RJ (2019) TLC identification and HPLC determination of the active constituents in Akebia trifoliata (Thunb.) Koidz. Chin J Pharmaceutical Analysis 39(04):638–643
Wei YD, Xiong C, Zhang TY, Gao H, Yin QG, Yao H, Sun W, Hu ZG, Chen SL (2019) Synthesis of triterpenoid saponins from Aesculus chinensis based on transcriptome data. Chin J Chin Mater Med 44(06):1135–1144
Wu Q, Sun C, Chen SL (2013) Cloning of β-amyrin synthase gene from Panax quinquefolius and its bioinformatics analysis. Chin Tradit Herbal Drugs 44(11):1476–1480
Wu R, Yang YT, Qian CC, Zhang W, Ou JM (2023) Compositional differences of Mume Flos at different developmental periods by UPLC-QE-MS and UPLC and their antioxidant activity. Nat Prod Res Dev 35:1915–1928
Xie FF, Su J, Cai Y, Xu LB, Li L (2021) Analysis of chemical constituents of Zhuang medicine Lindernia ruellioides by UHPLC-QE-MS. Res Zhuang Yao Ethnic Medicine 01:71–78+184-185
Xing Y, Nan ML, Wang X, He YF, Zhao YW, He ZM (2017) Research progress of hederagenin. Chine J Exp Tradit Med Formulae 23(22):226–234
Xiong DS, Xiong Y, Deng WJ, Cai ZL (2008) A study on technology of harvest and process for stem and fruit of Akebia trifoliate. J Chin Med Mater 31(8):1116–1119
Xu XS, Zhang FS, Qin XM (2014) Research advances on triterpenoid saponins biosynthesis and its key enzymes. Modernization Tradit Chin Med Mater Med-World Sci Technol 16(11):2440–2448
Xu J, Zhan YP, Jiang YY, Cao Q (2019) Advance in anti-tumor pharmacological action of α-hederin. Practical Oncology J 33(06):553–556
Xu YY, Chen Z, Jia LM, Weng XH (2021) Advances in understanding of the biosynthetic pathway and regulatory mechanism of triterpenoid saponins in plants. Sci Sinica Vitae 51(05):525–555
Yang L, Fan W, Meng ZG, Long GQ, Li LG, Zhang GH, Yang SC, Chen JW (2018) Research development in biosynthesis and metabolic regulation of tetracyclic triterpenoid saponins in medicinal plants. Modernization Tradit Chin Med Mater Med-World Sci Technol 20(06):1018–1025
Ye X, Liu XQ, Gao HM, Feng WH, Yang LX, Li C, Wang ZM (2019) Study on assay strategy of total tritepenoid saponins in traditional Chinese medicines using total saponins from Akebiae Caulis as an example. China J Chin Mater Med 44(14):3087–3093
Zeng GP (2019) The expression and regulatory mechanism of the key gene of triterpenoid biosynthesis in Wolfiporia cocos. Guizhou Univ
Zhan ZG (2020) Advances in biosynthesis and regulation of the active ingredient of Salvia miltiorrhiza based on multi-omics approach. Acta Pharm Sin 55(12):2892–2903
Zhang YJ, Dang HS, Yang LL, Wei XY, Wang Y (2013) Geographical distribution and resource survey of wild medicinal plant Akebia trifoliata subsp. trifoliata. Chin Wild Plant Resources 32(03):58–62
Zhang YJ, Hou ZF, Liang S, Wang H, Song J, Wang YP (2016) The research progress of key enzymes involved in triterpene saponin biosynthesis in Panax genus plants. Spec Wild Econ Anim Plant Res 38(02):53–57+70
Zhang YH, Peng PK, Fang ZQ, Lu WL (2022) Antitumor effect of Akebia Fructus combined with celastrol on hepatoma bearing nude mice and its influence on NF-κB pathway. Chin J Inf Tradit Chin Med 29(09):95–100
Zhao ZX, Wang XX (2019) Bioinformatics analysis of DS and β-AS in metabolism of triterpenoid saponins in plants. Jiangsu Agric Sci 47(18):86–92
Zhao C, Guo LN, Peng YS, Liu C, Wang RF (2015) Research progress in key enzymes involved in biosynthesis of Panax notoginseng saponins and their regulation. Chin Tradit Herbal Drugs 46(19):2954–2965
Zhao LQ, Shan CM, Zhang SX, Shi YY, Ma KL, Wu JW (2020) Identification of key enzyme genes involved in anthocyanin synthesis pathway in Clinopodium gracile by transcriptome analysis. Bull Botanical Res 40(06):886–896
Zhong J, Liang GY, Wang JW, Hu HL, Sun C (2021) Nutritional of Akebia trifoliata seed and its effects on mycelial growth of Auricularia polytricha. Food Industry 42(08):333–337
Zhu Y, Li JX, Li JB, Huang RB (2020) Research progress in synthetic biology of tetracyclic triterpenoids saponins in medicinal plants. J Guangxi Acad Sci 36(03):309–316
Acknowledgements
We are very grateful to professor JWW (Anhui University of Traditional Chinese Medicine, CHN) for helpful suggestions.
Funding
This research was supported by the Ability Establishment of Sustainable Use for Valuable Chinese medicine Resources (2060302), the Research Project on Pharmacodynamic Material Basis of Anhui Genuine Main Medicinal Materials (RZ2100000757), the Scientific Research Project of University in Anhui Province (2023AH050787), and the Talent Project of Anhui University of Chinese Medicine (2021RCYB011).
Author information
Authors and Affiliations
Contributions
W.T. Fang, C.C. Qian, R. Wu, and Y.T. Yang contributed to the sample preparation, sequencing, and metabolite analysis. W.T. Fang and C.C. Qian performed the transcriptome data analysis and wrote the manuscript. J.M. Ou revised the manuscript. J.M. Ou provided financial support. All authors read and approved the manuscript.
Corresponding author
Ethics declarations
Ethical Approval
Not applicable.
Conflict of Interest
The authors declare no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Below is the link to the electronic supplementary material.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Fang, W., Qian, C., Wu, R. et al. Transcriptomic Analyses of Akebiae Fructus and Identification of Genes Related to Triterpenoid Saponin Biosynthesis. Plant Mol Biol Rep (2024). https://doi.org/10.1007/s11105-024-01496-9
Received:
Accepted:
Published:
DOI: https://doi.org/10.1007/s11105-024-01496-9