Introduction

Three original plants, Glycyrrhiza uralensis, Glycyrrhiza glabra, and Glycyrrhiza inflata, are prescribed as licorice and widely used in traditional Chinese medicine (TCM) (Chinese Pharmacopoeia 2015). Modern pharmacological studies demonstrate that licorice possesses antitumor (Fukuchi et al. 2016; Rehan et al. 2013), anticancer (Lin et al. 2014), anti-inflammatory (Yang et al. 2017), anti-microbial (Wang et al. 2015a, b; Huang et al. 2012), immune-regulatory (Ma et al. 2013; Kim et al. 2013), liver protective (Seo et al. 2014), and neuroprotective activities (Chakravarthi and Avadhani 2014). In addition, it is an important raw material for cosmetic brighteners, food additives, and tobacco flavoring agents (Hayashi et al. 2016).

To date, more than 300 flavonoids have been isolated from licorice (Wang et al. 2013). Licorice flavonoids possess various pharmacological properties (He et al. 2017; Gao et al. 2017; Wang et al. 2015a, b; Luo et al. 2016; Park et al. 2015; Gong et al. 2015; Zhou and Ho 2014). Liquiritin, a flavonoid, is the marker component for evaluating the quality of licorice. According to the Chinese Pharmacopoeia, the content of liquiritin in licorice must be at or above 0.5% to be effective. However, our previous investigation showed that about 60% of cultivated licorice did not meet this requirement. Therefore, how to improve the content of liquiritin in cultivated licorice has become a crucial issue.

In order to increase the content of natural active components, many researchers have focused on radiation mutation breeding, which has become one of the most effective ways to obtain new germplasm resources in recent years (Tanaka et al. 2010). In our previous studies, we also found that X-ray irradiation improved the contents and yields of flavonoids in G. uralensis (Hu et al. 2017). However, the molecular mechanism of flavonoid accumulation in licorice remains unclear.

The flavonoids in licorice are biosynthesized by the phenylpropanoid metabolic pathway, which is controlled and regulated by many key enzymes (Fig. 1). Several functional genes have been successfully cloned and characterized, such as cinnamate 4-hydroxylase (C4H) (Li et al. 2016), isoflavone synthase (IFS) (Cheng et al. 2013), and chalcone synthase (CHS) (Awasthi et al. 2016). However, flavonoid biosynthesis is a complex process, and many genes critical for it remain to be identified.

Fig. 1 Liquiritin biosynthesis through the phenylpropanoid metabolic pathway in G. uralensis. PDH prephenate dehydratase, AAT aromatic aminotransferase, GT glycosyltransferase, PAL phenylalanine ammonia-lyase, C4H cinnamic acid 4-hydroxylase, 4CL 4-coumarate CoA ligase, ACC acetyl CoA carboxylase, CHS chalcone synthase, CHR chalcone reductase, I-CHI I-chalcone isomerase, II-CHI II-chalcone isomerase

With the development of gene sequencing technology, genomic and transcriptome analyses have attracted increasing attention as tools for exploring the molecular mechanism of secondary metabolism in medicinal plants (Han et al. 2016; Rhoads and Au 2015). A recent analysis of the transcriptome of Carthamus tinctorius has led to the identification of several key genes involved in the biosynthesis of flavonoids (Chen et al. 2018). In addition, transcriptome sequencing of Scutellaria baicalensis has uncovered 54 unigenes encoding 12 key enzymes involved in the biosynthetic pathway of flavonoids (Liu et al. 2015). Similarly, a transcriptome analysis of licorice has identified cytochrome P450 enzymes (CYP) and vacuolar saponin transporters involved in glycyrrhizin production (Ramilowski and Daub 2013).

In the present study, X-ray irradiated licorice seeds were cultivated for one year, and two resulting plants with high flavonoid content were selected for transcriptome analysis and compared with a control plant grown from untreated seeds. A metabolic network of flavonoids was established on the basis of the RNA-sequencing (RNA-Seq) results. The co-expressed differentially expressed genes (DEGs) involved in the metabolic network of flavonoids were determined, and some of them were analyzed by qRT-PCR. This study provides a basis for mining functional genes and elucidating the molecular regulatory mechanism of flavonoid biosynthesis in licorice.

Materials and methods

Plant material and X-ray treatment

Healthy seeds of G. uralensis were irradiated with X-rays at six graded doses (5, 10, 15, 20, 30, and 50 Gy) and cultivated for one year in the herb garden at Beijing University of Chinese Medicine. Healthy seeds without irradiation were used as the blank control. The roots of all samples were collected for HPLC analysis of the contents of liquiritin, isoliquiritin, liquiritigenin, and isoliquiritigenin. Based on the HPLC results, two irradiated samples with high flavonoid contents and one blank sample with low flavonoid contents were selected for RNA-Seq analysis, as shown in Table 1.

Table 1 Information of the three G. uralensis samples for transcriptome analysis

RNA extraction, cDNA library construction and RNA-Seq

Total RNA was extracted from the roots of licorice samples (Ding et al. 2010). RNA purity (OD 260/280) and concentration were measured with a NanoDrop spectrophotometer, and RNA integrity was assessed with an Agilent 2100 Bioanalyzer. The cDNA library of each sample was constructed with an Illumina TruSeq™ RNA Sample Preparation Kit (Illumina, San Diego, CA, USA) and sequenced on an Illumina HiSeq™ X10. Raw reads were obtained (oss://nextomics/FTP/BJXWZ-201707001D/) and filtered with NGS QC Toolkit (version 2.3.3) to obtain high-quality clean reads (Patel and Mukesh 2012). All clean reads were mapped to the reference genome sequence (https://ngs-data-archive.psc.riken.jp/Gur-genome/download.pl) using HISAT (version 2.0.5) (Kim et al. 2015).

Gene annotation and functional enrichment

Gene annotation and functional enrichment were performed using Java Treeview (version 1.1.6), the Kyoto Encyclopedia of Genes and Genomes (KEGG) database, and the Gene Ontology (GO) database (Kanehisa et al. 2008; Ye et al. 2006). The BLASTX algorithm was used to query the assembled sequences against the GO and KEGG databases. The IDs of DEGs were submitted to these databases for the enrichment analysis of GO biological process terms and KEGG pathway categories. The functional genes influencing the accumulation of flavonoids in G. uralensis were selected according to the gene annotation.
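The statistical test behind the enrichment analysis is not stated above. As an illustration only, the sketch below assumes a one-sided hypergeometric (over-representation) test, which is a common choice for GO and KEGG enrichment. The function name and the example numbers are hypothetical and are not taken from this study.

```python
from math import comb

def enrichment_p(total_genes, pathway_genes, deg_count, deg_in_pathway):
    """One-sided hypergeometric P value: probability of observing at least
    `deg_in_pathway` pathway members among `deg_count` DEGs drawn from
    `total_genes` annotated genes, `pathway_genes` of which are in the pathway.
    Assumption: this test is used here only as a generic illustration."""
    upper = min(pathway_genes, deg_count)
    favourable = sum(
        comb(pathway_genes, i) * comb(total_genes - pathway_genes, deg_count - i)
        for i in range(deg_in_pathway, upper + 1)
    )
    return favourable / comb(total_genes, deg_count)

# Hypothetical toy numbers: 10 annotated genes, 5 in the pathway,
# 5 DEGs, all 5 of them in the pathway.
p_toy = enrichment_p(10, 5, 5, 5)
```

In practice the P values for all tested terms would also be corrected for multiple testing before a term is called enriched.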

Gene expression quantification and differential expression analysis

The fragments per kilobase of exon per million mapped reads (FPKM) value was used to measure the gene expression level (Mortazavi et al. 2008). Taking the blank sample (L1) as the control, DEGs in the irradiated samples H1 and H2 were identified using edgeR (Robinson et al. 2010). The significance of differences in gene expression was expressed as the false discovery rate (FDR), calculated by the Benjamini–Hochberg method. Fold change (FC) indicates the ratio of the gene expression level between two samples. FDR < 0.01 and \(\left| \log_{2}(\text{FC}) \right| \ge 1\) were set as the thresholds for differential gene expression. According to the gene expression level, the DEGs were classified into two groups, up-regulated and down-regulated genes. The DEGs simultaneously up-regulated or down-regulated in both samples H1 and H2 were defined as co-expressed DEGs. Using the KEGG mapping tool (https://www.genome.jp/kegg/tool/map_pathway2.html), co-expressed DEGs were mapped to the different pathways. Then, the pathways enriched with DEGs, the biosynthesis map map01060, and various hypotheses about the biosynthetic mechanism of secondary metabolites, including the growth differentiation balance hypothesis (GDB), optimum defense hypothesis (OD), carbon nutrient balance hypothesis (CNB), and resource availability hypothesis (RA) (Huang et al. 2010), were combined to construct a metabolic network of flavonoids in licorice.
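The quantification and filtering rules described in this subsection can be sketched in a few lines of Python. This is a minimal illustration, not the edgeR workflow itself: it assumes that FC and FDR values have already been computed for each gene, and all function names, gene IDs, and numbers are hypothetical.

```python
import math

def fpkm(fragments, exon_length_bp, total_mapped_fragments):
    """FPKM = fragments * 10^9 / (exon length in bp * total mapped fragments)."""
    return fragments * 1e9 / (exon_length_bp * total_mapped_fragments)

def classify(fc, fdr, fdr_cutoff=0.01, log2fc_cutoff=1.0):
    """Return 'up', 'down', or None for one gene in one comparison,
    using the thresholds FDR < 0.01 and |log2(FC)| >= 1."""
    if fdr >= fdr_cutoff or fc <= 0:
        return None
    log2fc = math.log2(fc)
    if log2fc >= log2fc_cutoff:
        return "up"
    if log2fc <= -log2fc_cutoff:
        return "down"
    return None

def co_expressed(comparison_a, comparison_b):
    """Keep genes that are DEGs with the same direction in both comparisons."""
    return {
        gene: direction
        for gene, direction in comparison_a.items()
        if direction is not None and comparison_b.get(gene) == direction
    }

# Hypothetical (FC, FDR) pairs for three invented genes.
h1_vs_l1 = {"geneA": classify(4.0, 0.001),
            "geneB": classify(0.25, 0.002),
            "geneC": classify(3.0, 0.0005)}
h2_vs_l1 = {"geneA": classify(2.5, 0.003),
            "geneB": classify(0.4, 0.004),
            "geneC": classify(0.3, 0.001)}

core = co_expressed(h1_vs_l1, h2_vs_l1)  # geneC is dropped: up in H1, down in H2
```

Genes that change in opposite directions in the two irradiated samples, like the invented geneC, are excluded from the co-expressed set by this rule.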

Relative expression analysis of co-expressed DEGs

To verify the RNA-Seq results, qRT-PCR was performed with the SYBR® Green qPCR Master Mix (High ROX) as the fluorescent dye. The β-actin gene was used as the internal control. All primers for the co-expressed DEGs were designed with Primer Premier 5.0 (Table 2). The qRT-PCR was performed on a LightCycler 480 II (Roche, Switzerland). Gene expression level was calculated by the \(2^{-\Delta \Delta CT}\) method (Livak and Schmittgen 2001). The correlation of gene expression between the RNA-Seq and qRT-PCR data was analyzed by the Pearson correlation test.
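The two calculations named in this subsection can be illustrated with a short sketch. It assumes one CT value per gene and sample (in practice, replicate CT values would be averaged first), and the function names and CT values are hypothetical. Only the correlation coefficient is computed here; a library routine such as `scipy.stats.pearsonr` would also return the P value.

```python
import math

def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """2^(-ddCT): target gene normalized to the internal control gene
    (e.g. beta-actin) and expressed relative to the control sample."""
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** -(delta_sample - delta_control)

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical CT values: the target gene crosses threshold 2 cycles
# earlier (relative to the reference gene) in the treated sample.
fold = relative_expression(22.0, 18.0, 24.0, 18.0)

# Hypothetical paired log2 values from the two platforms.
r = pearson_r([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

A difference of two cycles corresponds to a four-fold change, because each PCR cycle ideally doubles the amount of product.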

Table 2 Primer sequences for qRT-PCR analysis

Results

RNA-Seq sequencing analysis

The RNA integrity numbers (RIN) of the three licorice samples were 8.1, 9.6, and 8.3, respectively, which met the requirements for cDNA library construction. More than 91% of clean reads had a sequencing error rate below 0.1% (Q30) (Table 3). After data filtering, 61.37 million, 54.21 million, and 54.22 million clean reads were obtained for samples H1, H2, and L1, respectively. More than 88% of clean reads in each library were mapped to the reference genome. Therefore, the transcriptome data of the three samples were obtained with high accuracy and good genomic coverage.

Table 3 Statistics of RNA-seq analysis

Gene annotation and pathway enrichment analysis

A total of 16,006 unigenes were annotated to the GO database and classified into three principal categories, including cellular component, molecular function, and biological process, which were further subdivided into 39 categories. The percentage of genes with the function of “binding” was the largest, followed by “catalytic activity” and “metabolic process” (Fig. 2a). A total of 9693 unigenes were annotated to the KEGG database and categorized into 208 KEGG pathways. The percentage of genes with the function of “signal transduction” was the largest, followed by “carbohydrate metabolism” and “translation” (Fig. 2b).

Fig. 2 Unigenes function classification in G. uralensis. a GO classification of unigenes in G. uralensis. Green shows the GO categories of molecular function. Red shows the GO categories of biological process. Purple shows the GO categories of cellular component. Stripes 1, 2, 3 show the percentage of genes with the function of protein binding, metabolic process and catalytic activity, respectively. b KEGG classification of unigenes in G. uralensis. The number of unigenes is marked in pie chart. (Color figure online)

Gene expression analysis and identification of DEGs

In the comparison “H1 vs L1”, 3386 up-regulated and 1141 down-regulated DEGs were identified, and in the comparison “H2 vs L1”, 1995 up-regulated and 2235 down-regulated DEGs were identified (Fig. 3a). A total of 1875 core DEGs were shared by both comparisons (Fig. 3b), and these were closely related to the accumulation of flavonoids in licorice. The expression patterns of the core DEGs in samples H1 and H2 were similar to each other but contrasted with those in sample L1 (Fig. 3c). In comparison with L1, many genes in H1 and H2 were down-regulated, such as the peroxidase gene, beta-glucosidase gene, phenylalanine ammonia-lyase gene, and coniferyl-aldehyde dehydrogenase gene.

Fig. 3 Expression profiling of core DEGs in the three samples. a The numbers of up-regulated and down-regulated DEGs in samples H1 and H2 compared with L1. b The core DEGs between H1 and L1, and H2 and L1. c Hierarchical clustering graph of the total 1875 core DEGs

Functional enrichment analysis of DEGs

The major GO enrichment terms of the 1875 core DEGs are shown in Fig. 4 and include “oxidoreductase activity”, “transmembrane transporter activity”, and “carbohydrate metabolic process”. In the KEGG analysis, the core DEGs were significantly enriched in 11 pathways: “starch and sucrose metabolism (map00500)”, “terpenoid backbone biosynthesis (map00900)”, “diterpenoid biosynthesis (map00904)”, “carotenoid biosynthesis (map00906)”, “sesquiterpenoid and triterpenoid biosynthesis (map00909)”, “phenylpropanoid biosynthesis (map00940)”, “flavonoid biosynthesis (map00941)”, “flavone and flavonol biosynthesis (map00944)”, “plant hormone signal transduction (map04075)”, “isoflavonoid biosynthesis (map00943)”, and “circadian rhythm-plant (map04712)”. A metabolic network of flavonoids in licorice was constructed, as shown in Fig. 5. Five KEGG pathways are closely related to the biosynthesis of flavonoids: the flavonoid metabolic pathway, plant hormone signal transduction pathway, terpenoid biosynthetic pathway, plant circadian rhythm pathway, and starch and sucrose metabolic pathway.

Fig. 4 GO enrichment scatter plot of the total 1875 core DEGs

Fig. 5 Network diagram of the flavonoid metabolic pathway and secondary metabolites in licorice. Blue box marks the starch and sucrose metabolic pathway. Green box marks the plant circadian rhythm pathway. Red box marks the flavonoid metabolic pathway. Orange box marks the plant hormone signal transduction pathway. Purple box marks the terpenoid biosynthetic pathway. (Color figure online)

Expression analysis of DEGs in the flavonoid metabolic pathways

Twenty-three DEGs were obtained on the flavonoid metabolic pathway (Table 4): one was up-regulated and eighteen were down-regulated in both samples H1 and H2, while the remaining four genes were up-regulated in sample H1 but down-regulated in sample H2. Eighteen DEGs were obtained on the plant hormone signal transduction pathway: four were up-regulated and eleven were down-regulated in both samples H1 and H2, while the remaining three genes were up-regulated in sample H1 but down-regulated in sample H2. Nine DEGs were obtained on the terpenoid biosynthetic pathway: five were up-regulated and one was down-regulated in both samples H1 and H2, while the remaining three genes were up-regulated in sample H1 but down-regulated in sample H2. Seven DEGs were obtained on the plant circadian rhythm pathway, all of which were up-regulated in both samples H1 and H2. Four DEGs were obtained on the starch and sucrose metabolic pathway: two were up-regulated and two were down-regulated in both samples H1 and H2. Further analysis of the above DEGs on the five metabolic pathways is presented in the “Discussion”.

Table 4 FPKM value of DEGs on the five metabolic pathways

Verification of gene expression by qRT-PCR

The gene expression levels of ten co-expressed DEGs, PAL (Glyur000106s00011717), CHS1 (Glyur000424s00026890), NCED (Glyur000278s00017280), GA2ox (Glyur000261s00014360), DXS (Glyur000231s00022061), GID1 (Glyur000158s00011331), JAZ (Glyur002299s00036262), SAUR (Glyur000017s00002448), LHY (Glyur000116s00009244), and AMYB (Glyur000047s00004005), were verified by qRT-PCR. The expression levels of these ten genes as measured by RNA-Seq are shown in Fig. 6a. The correlation coefficient between the \(\log_{2}(\text{FPKM})\) value and the \(\log_{2}(\text{average}(2^{-\Delta \Delta CT}))\) value was 0.6324 (P < 0.0001) (Fig. 6b). Except for DXS, the expression levels of the other nine genes were consistent with the RNA-Seq results (Fig. 6c). The expression levels of NCED, GA2ox, LHY, and AMYB were highest in sample H1, and those of CHS1 and JAZ were highest in sample H2 (Fig. 6c).

Fig. 6 Gene expression levels of the co-expressed DEGs in RNA-Seq and qRT-PCR. a Gene expression levels of the ten co-expressed DEGs measured in RNA-Seq. b Correlation scatter plot between \(\log_{2}(\text{average}(2^{-\Delta \Delta CT}))\) and \(\log_{2}(\text{FPKM})\), which shows the relationship between RNA-Seq and qRT-PCR. c Gene expression levels of the nine co-expressed DEGs in qRT-PCR. The blue columns, corresponding to the ordinate axis on the left, show the expression level of co-expressed DEGs in qRT-PCR. The red polyline, corresponding to the ordinate axis on the right, shows the expression of co-expressed DEGs in RNA-Seq. (Color figure online)

Discussion

In this study, a total of 1875 core DEGs involved in the secondary metabolic pathways were obtained in three licorice samples by RNA-Seq analysis, and a metabolic network for the biosynthesis of flavonoids was established. Five metabolic pathways that play important roles in flavonoid accumulation in licorice were identified and are analyzed as follows.

The flavonoid metabolic pathway: the up-regulated expression of chalcone synthase gene (CHS) in samples H1 and H2 is expected to influence the biosynthesis of flavonoids (Wang et al. 2018). The down-regulated expression of beta-glucosidase gene (GLU), coniferyl-aldehyde dehydrogenase gene (CADH), and peroxidase gene (POD) may result in attenuation of the downstream alternative pathways, and hence shunt biosynthetic substrates towards the flavonoid synthesis pathways.

The terpenoid biosynthetic pathway: two up-regulated genes, the gibberellin 2-oxidase gene (GA2ox) and gibberellin 3-β-dioxygenase gene (GA3ox), are involved in photosynthesis and plant primary metabolism (Zhou et al. 2011). The up-regulated expression of the 1-deoxy-D-xylulose-5-phosphate synthase gene (DXS) is expected to promote the biosynthesis of monoterpenes, diterpenes, and carotenoids. The down-regulated expression of the capsanthin synthase gene (CS1) may favor the biosynthesis of carotenoids and promote the accumulation of flavonoids (Zhou et al. 2017).

The starch and sucrose metabolic pathway: the up-regulated genes, the 1,4-α-glucan branching enzyme gene (GBE) and β-amylase gene (AMYB), are involved in the conversion of starch to dextrin and maltose. The down-regulated expression of the α-amylase gene and sucrose synthase gene may lead to inhibition of sucrose synthesis, which may be related to the increased biomass of G. uralensis observed in our previous studies (Hu et al. 2017).

The plant hormone signal transduction pathway: the up-regulated expression of the protein transport inhibitor response 1 gene (TIR1) and gibberellin receptor gene (GID1) is expected to promote protein ubiquitination (Maraschin et al. 2010) and diterpene biosynthesis. The up-regulated expression of the jasmonate ZIM-domain protein 5 gene (JAZ) is expected to enhance stem maturation and stress resistance. The down-regulated expression of the SAUR-like auxin-responsive protein gene (SAUR), auxin-reactive GH3 family protein gene (GH3), and indoleacetic acid-induced protein 10 gene (IAA10) indicates a down-regulation of auxin metabolism (Luo et al. 2018), while the down-regulated expression of the histidine-containing phosphotransfer factor 5 gene (AHP) and response regulator 4 gene (RR4) suggests a down-regulation of cytokinin metabolism (Verma et al. 2015). The down-regulation of auxin and cytokinin metabolism may slow the growth of the two samples in the premature state, which, according to the growth-differentiation balance hypothesis, the optimum defense hypothesis, and the resource availability hypothesis, leads to the accumulation of secondary metabolites and promotes the biosynthesis of flavonoids.

The plant circadian rhythm pathway: the up-regulated expression of the chalcone synthase gene (CHS) and pseudo-responsive regulator gene 5 (PRR5) may protect licorice against the injury caused by X-ray irradiation and thereby increase the biomass of G. uralensis. The up-regulated expression of the flowering locus T gene (FT) and MYB-related transcription factor gene (LHY) may influence the flowering period and anthocyanidin production, resulting in an increased accumulation of flavonoids.

Many genes critical for the biosynthesis of bioactive components in medicinal plants have been identified in the past several years (Wei et al. 2015; Wang et al. 2016). In this study, sixty-one genes (thirty different kinds) involved in five pathways that play an important role in flavonoid biosynthesis in licorice were mined (Table 4). Among them, AH, DXS, LUP, CHS, and SQS have already been identified from licorice, and POD, FCH, GLU, PAL, CADH, GA3OX1, NCED, CS1, β-AL, GBE, α-AL, GH3, JAZ, RR4, SAUR, PYR, TIR1, GID1, MYC2, FT, and LHY have been identified from other Leguminosae plants. However, for most of these genes, their roles in flavonoid production in higher plants are unclear. Among the ten genes we selected for qRT-PCR analysis, CHS has been reported to be related to a high level of flavonoid accumulation (Wang et al. 2018). In our current study, we also find that CHS and DXS are able to regulate flavonoid biosynthesis. These studies provide important insights regarding the role of the identified genes in the production of various bioactive components.

With the development of second-generation sequencing technology, genomic and transcriptome analyses have become powerful tools for analyzing biosynthetic pathways in plants (Han et al. 2016; Rhoads and Au 2015). For example, to clarify the mechanism of gibberellin-regulated flowering in Jatropha curcas, many genes involved in gibberellin metabolism and signaling pathways were identified by genomic and transcriptome analyses (Gao et al. 2015). The present study uses a genome-wide analysis to uncover changes in transcription associated with a high content of flavonoids. The results show that five pathways are involved in the biosynthesis of flavonoids in licorice. Changes in these five pathways are able to affect the production of flavonoids, which makes them potential targets for genetic manipulation to increase flavonoid production in licorice. Our transcriptome study provides a framework for further determining the role of each identified gene in the process.