Background

Agarwood is resinous heartwood derived from Aquilaria and Gyrinops trees. Due to the high economic value of these trees and extensive deforestation, agarwood-producing tree species have become endangered. The use of agarwood is prevalent in many cultures for religious ceremonies, perfumes, and especially in Chinese herbal medicine, where plant materials are commonly utilized [1, 2]. Agarwood is one of the most used plant materials in Chinese medicine, second only to ginseng. The value of agarwood lies not only in its aromatic compounds [3], but also in its non-volatile compounds, which potentially have beneficial properties with regard to human medicine [4, 5].

In our previous study, we presented a draft genome and a putative pathway for cucurbitacins E and I, compounds with known medicinal value, in Aquilaria agallocha [6], one of the largest producers of agarwood. Briefly, gene expression changes for in vitro samples treated with methyl jasmonate (MJ) were shown to be consistent with known responses of A. agallocha to biotic stress, and a set of genes homologous to those related to cucurbitacin biosynthesis in Arabidopsis thaliana was identified. However, MJ treatment is perhaps not the most efficient protocol. Although there is much research into Chinese medicinal herbs and the extraction of high-value compounds, few studies have focused on increasing the quantity of target compounds through stimulation of their related pathways in this species.

In this study, we demonstrate that the quantity of cucurbitacins can be controlled by utilizing different types of light. Red light (R) and far-red light (FR) are components of the solar spectrum that strongly affect plant tissues. Many studies have reported an interaction between plant defenses and R/FR responses [7, 8]. Under low R/FR conditions, there is a dramatic decrease not only in the number of root nodules but also in the expression of jasmonic acid (JA) response genes. In a study on phytochrome B (phyB) mutants, JA-related genes were also observed to be down-regulated [9]; these genes are known to participate in secondary metabolic pathways [10].

In order to better understand the effect of light conditions on cucurbitacin secondary metabolic pathways in A. agallocha, we performed high-throughput sequencing experiments under two different light conditions: red light, a factor activating phyB, and far-red light, a factor inhibiting phyB [11]. Three types of sequencing experiments were performed: RNA sequencing (RNA-seq) to study gene expression, whole-genome bisulfite sequencing to study DNA methylation, and small RNA (sRNA) sequencing to determine sRNAs that play a role in methylation. As epigenetic modifications may also play a role in the regulation of gene expression, studies on DNA methylation are becoming increasingly important.

In higher organisms, DNA methylation, mediated by DNA methyltransferases (DMTs), plays an important and widespread role in epigenetic modification. DNA methylation in the genome is known to provide protection from transposons and/or RNA viruses, and it also plays a role in regulating splicing. DNA methylation is also associated with major developmental reprogramming [12]. Small RNAs are also an essential factor in plants, where they play a role in regulating the activation of functional genes and transposons [3].

The results of our analysis show that R/FR conditions have a large effect on gene expression levels in agarwood. RNA-seq data revealed an array of gene clusters with distinctive expression patterns, where individual gene clusters responded primarily to red light or far-red light. Differentially methylated regions (DMRs) discovered from whole-genome bisulfite sequencing data showed that there is also a large difference in methylation levels between R/FR conditions. We observed that sRNAs may potentially play a role in influencing the methylation levels of genes important to secondary metabolism and subsequently play a role in gene expression regulation.

These genome wide profiles provide insight into the regulatory interaction between red light and far-red light conditions in A. agallocha as well as identify compelling new candidates for secondary metabolic functional components. The data used in this study is freely available at our provided webserver (http://molas.iis.sinica.edu.tw/agarwood) and at NCBI (Bioproject ID: PRJNA240626).

Results and discussion

Red light conditions increase cucurbitacin E and I content

In our previous study, we showed that agarwood contained high cucurbitacin content and that MJ treatment increased content levels [6]. Here, we instead used red light conditions to stimulate cucurbitacin biosynthesis (Fig. 1). From LC-ESI-MS quantification, it was seen that cucurbitacin content increased as red light exposure increased, up to 356 μg/g of cucurbitacin I at day 2. Cucurbitacin I content decreased as far-red light exposure increased, down to 96 μg/g at day 2. Similarly for cucurbitacin E, content levels increased up to 972 μg/g under red light conditions at day 1 and decreased down to 567 μg/g under far-red light conditions at day 5. Under red light conditions, at peak levels, cucurbitacin content was significantly increased compared to normal light conditions with p-values of 1.09E-5 and 4.57E-6 for cucurbitacin I and E respectively in a two-sample t-test. Similarly for far-red light conditions, at the lowest levels, cucurbitacin content was significantly decreased compared to normal light conditions with p-values of 3.44E-2 and 1.32E-4 for cucurbitacin I and E respectively.

Fig. 1
Endogenous cucurbitacin content of in vitro agarwood. Content was measured after red and far-red light treatment over the course of 5 days. Data is represented as mean ± standard deviation (n = 5). At peak levels under red light conditions, cucurbitacin content was significantly increased compared to normal light conditions (paired t-test p-values 1.09E-5 and 4.57E-6 for cucurbitacin I and E respectively). At the lowest levels under far-red light conditions, cucurbitacin content was significantly decreased compared to normal light conditions (paired t-test p-values 3.44E-2 and 1.32E-4 for cucurbitacin I and E respectively)

Different types of light affect various biological pathways in plants. There are five classes of phytochromes which typically absorb red light and far-red light [13]. Previous studies on phyA and phyB photosensory functions show that red light activated phyB interacts with transcription factors to induce a phytochrome-dependent signaling cascade [7, 8] and that vascular plant one-zinc-finger (VOZ) transcription factors interact with phyB [14]. VOZs are active transcription factors that promote SA and JA-mediated defense responses under biotic stress [14, 15]. Far-red light is known to inhibit phyB and plays an antagonistic role in most pathways [11, 14].

Previous studies have demonstrated that target compounds can be increased through stimulating biosynthetic pathways [6, 16] and that light can be used as a stimulus for increasing compound yield [17]. With plant factories becoming increasingly common, the use of light as a stimulus instead of chemical treatment may be preferable due to its simpler protocol.

Red light and far-red light gene expression patterns in agarwood

In order to study the effects of different light on gene expression in agarwood, we performed high-throughput RNA sequencing under red light and far-red light conditions. The time-course RNA-seq data (Table 1) was obtained from samples under red light and far-red light conditions at 1, 2, and 5 days, as well as normal conditions (white light control). Two biological replicates were sequenced.

Table 1 RNA-seq libraries under different light conditions

We utilized the RNA-seq data and the previously constructed A. agallocha genome [6] for gene expression quantification, resulting in an average correlation coefficient of 0.9404 for gene expression levels between biological replicates. Genes were clustered into 16 clusters based on their expression patterns, requiring a two-fold change in expression and a p-value cut-off of 0.001 for differential expression (Fig. 2). In total, 8882 genes were determined to be differentially expressed and clustered into distinct expression patterns (Additional file 1: Table S1). Gene ontology (GO) classification was performed to identify each cluster’s most significant biological process (Table 2).

Fig. 2
Cluster analysis of gene expression patterns in agarwood. Sixteen clusters were identified by k-means clustering. The samples are represented on the x-axis, from left to right: FR day 5, FR day 2, FR day 1, normal, R day 1, R day 2, R day 5. The centered log2 fold-change is represented on the y-axis

Table 2 Gene ontology analysis on 16 clusters of gene expression patterns

Clusters 3 and 11 were observed to exhibit a pattern of up-regulation under red light conditions and repression under far-red light conditions, consistent with the observed changes in cucurbitacin content levels. The GO classifications show that 253 out of 495 genes, in clusters 3 and 11 combined, are classified as belonging to metabolic processes (Additional file 2: Figure S1). Furthermore, these clusters contain 3 genes classified as belonging to terpene biosynthesis, the main class of compounds related to the medicinal properties of agarwood [18–20]. Terpenoid content is induced under biotic stress as an immune response to resist various pathogens [6, 21] and its derivatives have been shown to exhibit anti-microorganism, anti-tumour, and other pharmacological effects that are beneficial towards human medicine [4, 5]. In addition to terpene biosynthesis, clusters 3 and 11 contained 26 genes related to defense response. Previous studies have shown that far-red light down-regulates the expression of defense response genes by reducing a plant’s sensitivity to jasmonate (or methyl jasmonate) in Arabidopsis [7, 8]. From the RNA-seq data, it was seen that some defense response genes were up-regulated under red light conditions and down-regulated under far-red light conditions. These results are consistent with our expectations and suggest that controlled light conditions can be used in place of plant hormones to induce defense response genes in agarwood.

Red light and far-red light DNA methylation patterns in agarwood

In order to study the effect of different light on methylation patterns in agarwood, we performed whole-genome bisulfite sequencing with two biological replicates for red light day 2, far-red light day 2, and normal samples (Additional file 2: Table S2). The methylation levels for each sample were used to discover differentially methylated regions (DMR) between different light conditions. A characterization of DMRs (Fig. 3a) shows that DMR proportions in transposons and intergenic regions were not significantly changed by R or FR conditions. In genic regions, it was seen that there was a slight increase (~6.4 %) in DMR proportions at promoter regions under FR conditions. The number of DMRs for each light condition (Fig. 3b) indicates that there is a large change in methylation levels between red light and far-red light conditions.

Fig. 3
Characterization of differentially methylated regions for light conditions red light, far-red light, and normal. a Composition of DMRs in the A. agallocha genome. TE represents transposable elements, IG represents intergenic regions, Gene represents the gene body, and Promoter represents gene promoter regions. b Number of DMRs that are overlapping or unique to red light and far-red light conditions

We focused on hypo-DMRs under red light conditions, using the consensus hypo-DMRs between R/normal and R/FR data, resulting in 621 regions for analysis. The average methylation levels in red light hypo-DMRs (Fig. 4a) show that CHH methylation (where H represents A, T, or C) exhibits the most significant differences under red light conditions. This trend also holds for average weighted methylation levels [22] in genic regions (Fig. 4b), where the most significant differences in methylation levels were observed in promoter regions for CHH methylation. CHG methylation levels were also observed to be affected by red light, while CG methylation levels were relatively unchanged. These results suggest that red light may regulate gene expression in agarwood by changing CHH and CHG methylation, primarily in promoter regions.

Fig. 4
Methylation levels for hypo-DMRs under red light conditions. a Box plots displaying the distribution of average CG, CHG, and CHH methylation levels for hypo-DMRs under red light conditions. b Average methylation levels in gene bodies and flanking 2 kb regions. Each gene was aligned from start to end and divided into 20 equal bins. Upstream and downstream flanking regions were also each divided into 20 equal bins. Weighted methylation levels were calculated for each of the 60 bins across all corresponding regions
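The binning described in the caption of Fig. 4b can be illustrated with a short sketch. It assumes that the weighted methylation level of a bin is the ratio of summed methylated read counts to summed total read counts over the cytosines in that bin; the function names and the input layout (position, methylated reads, total reads) are hypothetical and chosen only for illustration, and strand orientation is ignored.

```python
def weighted_methylation(sites):
    """Weighted level for (methylated_reads, total_reads) pairs.

    Assumed definition: sum of methylated reads / sum of total reads.
    Returns None when the bin has no coverage.
    """
    sites = list(sites)
    methylated = sum(m for m, _ in sites)
    total = sum(t for _, t in sites)
    return methylated / total if total else None


def bin_profile(start, end, sites, n_bins=20):
    """Split the interval [start, end) into n_bins equal bins.

    sites: list of (position, methylated_reads, total_reads).
    Returns one weighted methylation level (or None) per bin.
    """
    width = (end - start) / n_bins
    bins = [[] for _ in range(n_bins)]
    for pos, m, t in sites:
        if start <= pos < end:
            idx = min(int((pos - start) / width), n_bins - 1)
            bins[idx].append((m, t))
    return [weighted_methylation(b) for b in bins]


# Toy gene body spanning positions 0..99 with three covered cytosines.
profile = bin_profile(0, 100, [(0, 1, 4), (2, 3, 4), (99, 2, 2)])
```

Applying the same function to the gene body and to each 2 kb flanking region would give the 60 bins mentioned in the caption; aggregating a bin across genes would pool the read counts of all genes before taking the ratio.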

In higher plants, Domains Rearranged Methylase 2 (DRM2) catalyzes de novo DNA methylation in all cytosine contexts including CG, CHG, and CHH [23], via the RNA-directed DNA methylation pathway (RdDM) [24–26]. Cytosine methylation and demethylation are both closely linked with gene regulation where high methylation patterns typically accompany low gene expression [27, 28]. In RdDM, Argonaute 4 (AGO4) has been recognized to interact with sRNAs and participate in DNA methylation [28–30].

sRNAome of red light and far-red light conditions in agarwood

In order to identify sRNAs that play a role in changes to methylation under different light conditions, we performed sRNA sequencing with two biological replicates for red light day 2, far-red light day 2, and normal samples (Table S2). Overall, approximately 6 million distinct sRNAs could be mapped perfectly and uniquely to the genome. A characterization of mapped sRNAs (Additional file 2: Figure S2) revealed that the majority (56.28 %) of sRNAs were mapped to genic regions, within which a large majority (61.11 %) were mapped to promoter regions. We also characterized the mapped sRNAs in terms of their length (Table 3) and observed that 71.93 % of the sRNAs overall, and 73.37 % of those in promoter regions, were 24 nt long. These results support the idea that under different light conditions, sRNA may play a role in DNA methylation via AGO4 and the RdDM pathway in agarwood.

Table 3 Characterization of sRNAs by sequence length

Small RNAs are classified into two major categories: microRNA (miRNA) and short interfering RNA (siRNA) [31]. Small RNAs, which are cut from double-stranded RNA (dsRNA) by Dicer-like enzymes, participate in gene silencing as miRNA [32–34]. The focus of this study, siRNAs, are processed from the overlapping regions of natural sense-antisense transcript pairs or the near-perfect dsRNAs synthesized by RNA-dependent RNA polymerases (RDRs) [35–37]. Based on their origins, plant siRNAs include four major classes: heterochromatic siRNAs (hc-siRNAs), trans-acting siRNAs (ta-siRNAs), natural antisense transcript-derived siRNAs (nat-siRNAs), and long siRNAs (lsiRNAs) [38]. siRNAs bind to specific Argonaute proteins to form an RNA-induced silencing complex (RISC), guide the RISC to DNA or RNA targets based on sequence complementarity, and trigger gene silencing transcriptionally or post-transcriptionally [31]. Different AGOs have different preferences. AGO1 has a strong bias towards 5’ terminal uridine, AGO2 prefers 5’ terminal adenosine, and AGO4 prefers 5’ terminal adenosine, guanine, or uridine [29]. Small RNAs of different lengths play different roles and are cut by different Dicer-like enzymes (DCL) [34, 36, 39]. Among them, the 24-nt long miRNAs (lmiRNAs) and 24-nt siRNAs are processed by DCL3 [40]. These 24-nt small RNAs interact with AGO4 and act as guides to catalyze DNA methylation via RdDM [40, 41].

Regulation of secondary metabolic gene expression by RdDM pathway

Although DNA methylation in promoter regions and intergenic transposable elements generally inhibits gene expression [42], the role of DNA methylation in A. agallocha is still unclear. To further our understanding of DNA methylation in A. agallocha, we identified sRNAs that inhibit gene expression through the RdDM pathway, selected from the set of metabolic process genes containing hypo-methylated regions (Additional file 2: Figure S3).

As mentioned previously, different AGOs have different preferences. Here, we focused on sRNA sequences that suited AGO4 preferences and mapped to hypo-DMRs. We identified 61 genes in agarwood related to secondary metabolism that fit our criteria. Three candidate genes were selected for further analysis (Fig. 5): a sterol methyltransferase (g16251), a hydroxysteroid dehydrogenase (g23648), and a cytochrome P450 (g29032). The selected genes show that sRNAs were mapped to red light hypo-DMRs with a corresponding increase in mRNA expression under red light conditions. The expression levels were also verified using qRT-PCR (Additional file 2: Figure S4).
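The selection criteria in this paragraph can be sketched as a simple filter. The sketch assumes that "suited AGO4 preferences" means a 5’ terminal adenosine, guanine, or uridine, and additionally assumes a length of 24 nt; the length requirement, the function names, and the coordinate layout are illustrative assumptions rather than the exact procedure used in this study.

```python
def suits_ago4(sequence, length=24):
    """True if the sRNA has a 5' terminal A, G, or U and the assumed length.

    DNA-style reads are accepted: T is treated as U.
    """
    rna = sequence.upper().replace("T", "U")
    return len(rna) == length and rna[0] in "AGU"


def overlaps_any(start, end, regions):
    """True if the closed interval [start, end] overlaps any (r_start, r_end)."""
    return any(start <= r_end and end >= r_start for r_start, r_end in regions)


def select_candidates(srnas, hypo_dmrs):
    """Keep sRNAs that suit AGO4 and map inside a hypo-DMR.

    srnas: list of (sequence, start, end) on one scaffold (hypothetical layout).
    hypo_dmrs: list of (start, end) on the same scaffold.
    """
    return [
        (seq, start, end)
        for seq, start, end in srnas
        if suits_ago4(seq) and overlaps_any(start, end, hypo_dmrs)
    ]


# Toy example: only the first sRNA passes both criteria.
toy_srnas = [
    ("A" + "C" * 23, 120, 143),   # 24 nt, 5' A, inside the hypo-DMR
    ("C" + "A" * 23, 130, 153),   # 24 nt, but 5' C
    ("G" + "A" * 23, 900, 923),   # 24 nt, 5' G, outside the hypo-DMR
    ("T" + "A" * 20, 125, 145),   # 5' U (read as T), but 21 nt
]
selected = select_candidates(toy_srnas, hypo_dmrs=[(100, 400)])
```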

Fig. 5
Light conditions regulate gene expression by the RdDM pathway. The RNA expression, DNA methylation, and sRNA expression are shown for three candidate genes: g16251 (sterol methyltransferase), g23648 (hydroxysteroid dehydrogenase), and g29032 (cytochrome P450). Signals in red represent red light conditions while signals in blue represent far-red light conditions

In the three candidate genes, we detected three specific sRNAs that mapped perfectly to promoter regions under far-red light conditions. These sRNAs had a positive relationship with DNA methylation levels and a negative relationship with gene expression levels. In contrast, in both the sRNA sequencing and the qRT-PCR validation, these sRNAs could not be detected under red light conditions. This suggests that the effects of red light and far-red light on secondary metabolism gene expression in agarwood are antagonistic to each other and that these sRNAs potentially play a role in gene expression regulation through the RdDM pathway in cucurbitacin biosynthesis.

Sterols (steroid alcohols) are a class of steroids and are ubiquitous in eukaryotic organisms, playing pivotal roles in membrane structure and as precursors of vitamins and steroid hormones [43]. Sterol methyltransferases are known to catalyze a single methyl addition, an important step in phytosterol synthesis [43] that is also important to the biosynthesis of secondary metabolites such as cucurbitacin. Hydroxysteroid dehydrogenases belong to the alcohol oxidoreductases, which catalyze the dehydrogenation of hydroxysteroids in steroidogenesis using the cofactor NADP(H) or NAD and may affect the activity of compounds [44]. Cytochrome P450s (CYP450s) are also ubiquitous in many organisms. In plants, one or more CYP450s participate in compound modification and affect compound activity in secondary metabolism [45]. In addition, some CYP450s play an important role in steroidogenesis [46, 47].

Although these three candidate genes belong to rather large gene families, the gene expression, sRNA, and methylation patterns under red light and far-red light conditions indicate that these genes are potentially important for cucurbitacin metabolism in agarwood.

Conclusion

In this study, we performed three types of sequencing experiments in order to study the effect of light conditions on cucurbitacin biosynthesis and secondary metabolism in agarwood. This resulted in a number of new insights regarding the global regulation of genes by red light and far-red light. From the RNA sequencing results, gene expression patterns were grouped into distinct clusters, many of which can be characterized as responding primarily to light conditions. In particular, two gene expression clusters clearly exhibited gene expression patterns in response to red light and far-red light. Significantly, the two clusters included genes related to terpene biosynthesis and defense response. In addition to gene expression, small RNA and DNA methylation were observed to be affected by different light conditions, which in turn affect cucurbitacin metabolism in agarwood. We identified a set of small RNAs which potentially regulate gene expression through the RdDM pathway.

The results from this study provide genome-wide profiles of RNA expression, small RNA, and DNA methylation with regard to light conditions. These profiles provide insight into the effect of light on gene expression for cucurbitacin biosynthesis in agarwood and provide compelling new candidates for functional secondary metabolic components, highlighting new questions to be addressed in future studies.

We also demonstrate that light conditions can be used in lieu of methyl jasmonate treatment to stimulate pathways related to secondary metabolism, increasing the yield of cucurbitacins. This has important implications for the increasing use of plant factories for the synthesis of high value compounds.

Methods

Plant materials for DNA and RNA extraction

A plant regeneration system from shoot tips into in vitro plants was created using a tissue culture process similar to the processes described by He et al. [48]. LED light sources (Daina Electronics) were used to provide different light conditions (Table S3). Normal (white light ~55 μmol m−2 s−1) in vitro plant materials were grown under long-day conditions (16 h of light, 8 h of darkness) at 25 °C. Red light samples (~15 μmol m−2 s−1, 680 nm) and far-red light samples (~15 μmol m−2 s−1, 730 nm) were continuously exposed to their respective light conditions at 25 °C and the materials used for sequencing were collected after 1, 2, and 5 days.

DNA was extracted from 1 g of in vitro materials using the Plant Genomic DNA MiniKit (Maestrogen) following the manufacturer’s instructions. RNA was extracted from 1 g of in vitro materials using RNeasy Plant MiniKit following the protocol prescribed by the manufacturer. Normal light samples were collected from material grown under long-day conditions in white light. The DNA and RNA samples were sent to BGI for poly(A) RNA sequencing, whole-genome bisulfite sequencing, and small RNA sequencing.

LC-ESI-MS

In vitro materials were ground with liquid nitrogen and mixed with 1 mL of methanol. Supernatant was collected by centrifugation (12000 rpm, 1 min). The LC-ESI-MS system consisted of an ultra-performance liquid chromatography system (Ultimate 3000 RSLC, Dionex) and a quadrupole time-of-flight mass spectrometer with an electrospray ionization source (maXis UHR-QToF system, Bruker Daltonics). The autosampler was set at 4 °C. Separation was performed with reversed-phase liquid chromatography on a BEH C8 column (2.1 × 100 mm, Waters). The elution started from 99 % mobile phase A (0.1 % formic acid in ultrapure water) and 1 % mobile phase B (0.1 % formic acid in ACN), held at 1 % B for 1.5 min, raised to 60 % B in 6 min, further raised to 90 % B in 0.5 min, and then lowered to 1 % B in 0.5 min. The column was equilibrated by pumping 1 % B for 4 min. The flow rate was set to 0.4 mL/min with an injection volume of 5 μL. LC-ESI-MS chromatograms were acquired under the following conditions: capillary voltage of 4500 V in positive ion mode, dry temperature of 190 °C, dry gas flow maintained at 8 L/min, nebulizer gas at 1.4 bar, and acquisition range of m/z 100–1000. Five samples for each condition were independently measured for cucurbitacin content levels.

RNA sequencing analysis

The RNA-seq data for all samples (Table 1) were trimmed for low quality bases at the 3’ end and then individually aligned to the set of annotated A. agallocha transcripts using BWA [49]. For each dataset, expression quantification was performed using eXpress [50]. R/FR pair-wise differential gene expression analysis was performed using edgeR [51] incorporating all replicates. Genes which exhibited at least a two-fold change in expression with a p-value threshold of 0.001 between any red light and far-red light sample were retained for clustering analysis. Clustering analysis was performed on the expression profiles of differentially expressed genes using k-means clustering. Gene ontology classification for each cluster was performed using BiNGO [52].
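The retention rule and the centered log2 scale used for clustering can be sketched as follows. This is an illustration only: it takes log2 fold changes and p-values as already computed by a differential expression tool, and the function names, the strict p-value comparison, and the pseudocount of 1 are assumptions not stated in the text.

```python
import math


def is_differential(log2_fc, p_value, min_fold=2.0, max_p=0.001):
    """At least a two-fold change in either direction and p below the cut-off."""
    return abs(log2_fc) >= math.log2(min_fold) and p_value < max_p


def retain_gene(comparisons):
    """Keep a gene if any R versus FR comparison passes the filter.

    comparisons: list of (log2_fc, p_value), one per pair-wise comparison.
    """
    return any(is_differential(fc, p) for fc, p in comparisons)


def centered_log2(expression, pseudocount=1.0):
    """Log2-transform a gene's expression profile and subtract its mean.

    The pseudocount (an assumption) avoids taking log2 of zero.
    """
    logs = [math.log2(x + pseudocount) for x in expression]
    mean = sum(logs) / len(logs)
    return [value - mean for value in logs]


# Toy gene with three samples; the profile is centered around zero.
profile = centered_log2([1.0, 3.0, 7.0])
```

Profiles centered in this way emphasize the shape of the response across samples rather than absolute abundance, which is what a k-means step would then group.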

Whole-genome bisulfite sequencing analysis

The whole-genome bisulfite sequencing data for red light day 2, far-red light day 2, and normal were trimmed for low quality bases at the 3’ end. MOABS [53] was utilized to perform alignment to the A. agallocha genome, methylated cytosine calling, discovery of differentially methylated cytosines (DMCs), and discovery of differentially methylated regions (DMRs). Differentially methylated cytosines were discovered using Fisher's exact test, with a p-value threshold of 0.05, a minimum depth of 3, and a minimum of 33 % nominal difference in methylation ratios between conditions. Differentially methylated regions were discovered using Fisher's exact test, with a p-value threshold of 0.05, a minimum of 3 DMCs in a region, and a maximum distance of 300 bp between DMCs.
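The thresholds above can be made concrete with a small sketch. It is not the MOABS implementation: it applies a plain two-sided Fisher's exact test to per-cytosine read counts and then chains neighbouring DMCs by distance, and it omits the region-level test. All function names and the input layout are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are at most as likely as the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


def is_dmc(m1, t1, m2, t2, max_p=0.05, min_depth=3, min_diff=0.33):
    """m = methylated reads, t = total reads, for conditions 1 and 2."""
    if t1 < min_depth or t2 < min_depth:
        return False
    if abs(m1 / t1 - m2 / t2) < min_diff:
        return False
    return fisher_exact_two_sided(m1, t1 - m1, m2, t2 - m2) < max_p


def group_dmcs(positions, max_gap=300, min_dmcs=3):
    """Chain DMC positions on one scaffold into candidate regions."""
    regions, current = [], []
    for pos in sorted(positions):
        if current and pos - current[-1] > max_gap:
            if len(current) >= min_dmcs:
                regions.append((current[0], current[-1]))
            current = []
        current.append(pos)
    if len(current) >= min_dmcs:
        regions.append((current[0], current[-1]))
    return regions


# Toy example: three nearby DMCs form one region; the others are too sparse.
candidate_regions = group_dmcs([100, 250, 500, 1200, 1300, 5000])
```

One consequence worth noting is that a cytosine covered by only three reads per condition cannot pass this test even when fully methylated in one condition and unmethylated in the other (p = 0.1), so in practice the p-value threshold is more restrictive than the minimum depth.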

sRNA sequencing analysis

The sRNA sequencing reads for red light day 2, far-red light day 2, and normal were aligned to the A. agallocha genome using BWA [49]. Only sequences that mapped perfectly (no mismatches, no gaps) and uniquely (to one genome location only) were retained for analysis.
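The retention rule can be written as a short filter, followed by the kind of length tally summarized in Table 3. The record layout below is hypothetical; in a real pipeline the number of mapping locations, mismatches, and gaps would be read from the aligner output.

```python
from collections import Counter
from typing import NamedTuple


class Hit(NamedTuple):
    """Hypothetical summary of one distinct sRNA's alignment."""
    sequence: str
    n_locations: int
    mismatches: int
    gaps: int


def is_retained(hit):
    """Perfect (no mismatches, no gaps) and unique (one location) mapping."""
    return hit.n_locations == 1 and hit.mismatches == 0 and hit.gaps == 0


def length_fractions(hits):
    """Fraction of retained sRNAs at each sequence length."""
    counts = Counter(len(h.sequence) for h in hits)
    total = sum(counts.values())
    return {length: n / total for length, n in counts.items()}


# Toy data: three of the five sRNAs are retained, two of which are 24 nt long.
toy_hits = [
    Hit("A" * 24, 1, 0, 0),
    Hit("C" * 24, 1, 0, 0),
    Hit("G" * 21, 1, 0, 0),
    Hit("T" * 24, 3, 0, 0),   # multi-mapped
    Hit("AC" * 12, 1, 1, 0),  # one mismatch
]
retained = [h for h in toy_hits if is_retained(h)]
fractions = length_fractions(retained)
```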

qRT-PCR analysis

Validation of RNA expression on three candidate genes was performed using qRT–PCR analysis. The RNA samples for each light condition were extracted from 1 g of in vitro A. agallocha shoots using RNeasy Plant MiniKit following the protocol prescribed by the manufacturer. Primer pairs were designed for each transcript (Table S4) with the ABI Prism 7500 sequence detection system (Applied Biosystems). Each primer pair was used to amplify the respective cDNA fragments using a cycling profile consisting of 58 °C for 2 min, 95 °C for 10 min, and 40 cycles of 95 °C for 15 s and 60 °C for 1 min. The relative gene expression was determined by the comparative CT method, 2^(−ΔCT), where ΔCT = CT(gene of interest) − CT(control gene), using AcHistone as the internal control [54]. Four independent biological repeats were performed for each assay, where the final expression value is the mean expression of the repeats.
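As a worked illustration of the comparative CT calculation, the sketch below computes 2^(−ΔCT) for each repeat and averages the results. The CT values are invented for the example and the function names are hypothetical.

```python
def relative_expression(ct_gene, ct_control):
    """Comparative CT method: 2^(-dCT), with dCT = CT(gene) - CT(control)."""
    delta_ct = ct_gene - ct_control
    return 2.0 ** (-delta_ct)


def mean_relative_expression(ct_pairs):
    """Mean of 2^(-dCT) over biological repeats.

    ct_pairs: list of (ct_gene, ct_control), one pair per repeat.
    """
    values = [relative_expression(gene, control) for gene, control in ct_pairs]
    return sum(values) / len(values)


# Four made-up repeats: the gene of interest crosses the threshold
# 0 to 3 cycles later than the internal control.
repeats = [(22.0, 20.0), (21.0, 20.0), (20.0, 20.0), (23.0, 20.0)]
final_value = mean_relative_expression(repeats)
```

In this sketch the averaging is done on the 2^(−ΔCT) values rather than on ΔCT itself, following the statement that the final value is the mean expression of the repeats.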

Validation of sRNA used the same plant materials as described above. An endogenous sRNA (CGGTGGAAGAAATAATAGGGCCTG) was chosen as the internal control because its expression levels were stable under different light conditions (mean TPM of 237.00 ± 39.44) and because it maps uniquely to an intergenic region and is thus not expected to affect genes. For detecting sRNAs of g16251, g23648, and g29032, miScript Primer Assays (Qiagen) #MSC0074731, #MSC0074729, and #MSC0074727, respectively, as well as the miScript Universal primer were used. Five independent biological repeats were performed for each assay, where the final expression value is the mean expression of the repeats.

Availability of supporting data

The datasets supporting the results of this article are available in the NCBI repository, BioProject ID: PRJNA240626, http://www.ncbi.nlm.nih.gov/bioproject/?term=PRJNA240626. Gene annotations, KEGG, and GO classifications for Aquilaria agallocha are available at our webserver, http://molas.iis.sinica.edu.tw/agarwood.