Introduction

Cymbidium ensifolium L. is a valuable terrestrial orchid with fragrant flowers and graceful leaves. It is primarily found in southern China, southeastern Asia, northwest India, Korea, northern and eastern Australia, Borneo, New Guinea, and the Philippines (Wang et al. 2011). The horticultural traits of C. ensifolium, notably those of its floral organs, have been the focus of much research, particularly on floral morphology, floral fragrance, flowering time, and flower color (Ai et al. 2021; Yang et al. 2021; Zhang et al. 2021, 2022a; Sun et al. 2021). C. ensifolium can be classified into several varieties depending on the color and shape of the flower and leaf. ‘Hong He’ is the most common variety, combining safflower and lotus petals. The leaves of this variety change color during growth and development, with red leaves in the juvenile stage (RL) that turn into green leaves in the mature stage (GL). Prior to flowering, C. ensifolium generally experiences a 5- to 6-year juvenile phase, with a subsequent flowering time of approximately 1–2 months (Li et al. 2013; Zhang et al. 2022b). Because the leaf viewing period is longer than the flower viewing period, variations in leaf color at different developmental stages are key in determining the aesthetic and economic value of the orchids (Zhang et al. 2022b). However, despite its significance, leaf color change in C. ensifolium is poorly understood. Compared with flower color, the factors influencing leaf color are more complex (Gao et al. 2020). Thus, to fill this gap in the literature, it is crucial to investigate the molecular basis of leaf color change in C. ensifolium, which will in turn enhance the understanding of its ecological significance and increase its economic value.

Variations in the levels of pigment families (e.g., chlorophylls and anthocyanins) are reported to be responsible for plant pigmentation (Pahlavani et al. 2004; Fernández-García et al. 2012; Wang et al. 2020; Li et al. 2015). Chlorophyll provides green pigmentation and comprises chlorophyll a and chlorophyll b molecules (Wang et al. 2020). In plants, chlorophyll synthesis is strongly affected by genes of the metabolic pathway, such as HEMA, which encodes glutamyl-tRNA reductase (GluTR). Anthocyanins are water-soluble pigments derived from phenylalanine through the flavonoid biosynthetic pathway. They are synthesized in the cytosol and localized in the vacuoles (Tanaka et al. 2008; Zhao et al. 2022). Anthocyanin biosynthesis primarily depends on structural genes, including F3H, F3′H, F3′5′H, DFR, ANS, and UFGT (Ferrer et al. 2008; Holton and Cornish 1995; Sunil and Shetty 2022). In addition, genes related to anthocyanin biosynthesis are controlled by transcription factors (TFs) from the MYB, bHLH and WD40 families (Ahmad et al. 2022; Lloyd et al. 2017; Liu et al. 2021). In ornamental plants, the development or fading of bright colors (red, pink, purple, violet, and blue) is generally attributed to the accumulation or loss of metabolites in the anthocyanin biosynthesis pathway (Gao et al. 2020; Lim et al. 2020; Song et al. 2021). Understanding the underlying mechanisms of color change can help with the effective selection of desirable leaf traits in breeding. However, research on the mechanism by which C. ensifolium leaves change from red to green is limited.

Studies have demonstrated that combining metabolomics and transcriptomics provides an effective approach to identifying trait-related key genes associated with metabolites, and it currently plays a significant role in fundamental plant biology (Wei et al. 2016; Zhang et al. 2023, 2020; Li et al. 2020). In Crataegus maximowiczii C. K. Schneid, Zhang et al. integrated metabolome and transcriptome analyses to reveal cyanidin-based anthocyanins as the main pigments responsible for the black coloration of the fruit (Zhang et al. 2023). Illumina transcriptome analysis and widely targeted metabolomics (LC–ESI–MS/MS) were successfully used to identify the B-class AP3- and AGL6-like MADS-box genes participating in the flower color formation of the Cattleya hybrid ‘KOVA’ (Li et al. 2020). In the absence of a complete and high-quality genome sequence of C. ensifolium, the integrative analysis of metabolome and transcriptome provides a successful approach for gaining new insights into the molecular mechanisms of orchids.

In this study, analysis of the pigment contents from RL to GL of C. ensifolium ‘Hong He’ was performed. The molecular basis of leaf color change was explored using a widely targeted metabolomics approach and transcriptome analysis. Moreover, we focused on the differentially accumulated metabolites (DAMs), differentially expressed unigenes (DEUs) and TFs related to anthocyanin biosynthesis. Investigating the molecular mechanism of leaf color change in C. ensifolium can determine new directions for future genetic modifications and improvements in orchids.

Materials and methods

Plant materials

C. ensifolium ‘Hong He’ was cultivated at a spacing of 30 cm in the multi-span greenhouse of the Hangzhou Academy of Agricultural Sciences (China, 30.16° N, 120.08° E). The substrate consisted of 5–10 mm bark and 3–6 mm perlite. The plants were grown at 28/20 °C (day/night) and 80% relative humidity, with illumination of 8,000 to 10,000 lux. RL and GL were collected from 3-year-old plants, with three biological replicates each. After sampling, the leaves were immediately frozen in liquid nitrogen and then stored at −80 °C. The samples were used for the measurement of pigment contents (anthocyanin, total flavone and chlorophyll), transcriptome sequencing, and metabolic profiling.

Measurement of pigment contents

For anthocyanin profiling (Meiling et al. 2018), fresh leaves (0.1 g) were mixed with 4 mL of a hydrochloric acid–ethanol solution (0.1 mol/L) and incubated in a 60 °C water bath for 6 h. The absorbance of the supernatant was measured at 530 nm, 620 nm, and 650 nm. Total anthocyanin concentration (μg/g) was determined as [(A530 − A620) − 0.1 × (A650 − A620)] ÷ ε × V × N ÷ M × 10³ × 287.24 (ε = molar extinction coefficient, 4.62 × 10⁴; V = volume of the extract, mL; N = dilution multiple; M = sample fresh quantity, g).
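The formula above can be written as a small function for checking individual readings. This is a minimal sketch rather than the authors' script: the function name and the example absorbance values are hypothetical, and only the constants (4.62 × 10⁴ and 287.24) come from the text.

```python
def anthocyanin_content(a530, a620, a650, volume_ml, dilution, mass_g,
                        epsilon=4.62e4, molar_mass=287.24):
    """Total anthocyanin (ug/g fresh weight) following the formula in the text."""
    # Baseline-corrected absorbance term exactly as written in the formula.
    corrected = (a530 - a620) - 0.1 * (a650 - a620)
    return corrected / epsilon * volume_ml * dilution / mass_g * 1e3 * molar_mass


# Hypothetical readings for a 0.1 g sample extracted in 4 mL, undiluted.
example = anthocyanin_content(0.50, 0.10, 0.10, volume_ml=4, dilution=1, mass_g=0.1)
print(round(example, 2))
```

Keeping ε and the molar mass as default arguments makes it easy to see which numbers are constants of the method and which are measured per sample.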

Extraction of flavone was conducted according to SN/T 4592-2016 (Industry Standard for Entry Exit Inspection and Quarantine of the People’s Republic of China). A 0.1 g sample and 3 mL of absolute ethanol were added to a 10 mL centrifuge tube. The solution was shaken well and sonicated at 60 °C for 1 h. The reaction mixture was then prepared in the proportion sample liquid : aluminum nitrate solution (100 g/L) : potassium acetate solution (98 g/L) : distilled water = 30:2:2:66. The mixture was left at room temperature for 1 h to allow the color reaction. The absorbance was measured at 420 nm and a standard curve was established. The total flavone content (μg/g) was estimated as C × V ÷ V sample × N ÷ M (C = flavone content obtained from the standard curve, μg; V = volume of the total extract, mL; V sample = volume of the sample extract for the standard curve, mL; N = dilution multiple; M = sample quantity, g).
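The standard-curve step can be illustrated as follows. This is a sketch under stated assumptions: the standard amounts, absorbances, and the 0.3 mL aliquot are invented for illustration, a straight-line curve is assumed, and the names `fit_line` and `flavone_content` are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def flavone_content(a420, slope, intercept, total_ml, aliquot_ml, dilution, mass_g):
    """Total flavone (ug/g) as C x V / V_sample x N / M."""
    c_ug = (a420 - intercept) / slope  # invert the standard curve to get C
    return c_ug * total_ml / aliquot_ml * dilution / mass_g


# Hypothetical standards: flavone amount (ug) against absorbance at 420 nm.
std_ug = [0, 10, 20, 30, 40]
std_a420 = [0.00, 0.10, 0.20, 0.30, 0.40]
slope, intercept = fit_line(std_ug, std_a420)
content = flavone_content(0.25, slope, intercept,
                          total_ml=3, aliquot_ml=0.3, dilution=1, mass_g=0.1)
print(round(content, 1))
```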

The total chlorophyll content was determined by spectrophotometric analyses (Xuekui 2006). Fresh leaves (0.1 g) from RL and GL were homogenized in a glass tube with 5 mL of extracting solution (95% ethanol) and 50 mg of calcium carbonate powder. The glass tube was maintained in the dark until the tissue was completely whitened. The absorbance was measured at 665 nm and 649 nm. The chlorophyll was measured in three replicates. The total chlorophyll content (μg/g) was calculated as (18.08 × A649 + 6.63 × A665) × V × N ÷ M (V = volume of the extract, mL; N = dilution multiple; M = sample fresh quantity, g).
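As a worked illustration of the chlorophyll formula, the sketch below uses the coefficients given in the text (18.08 and 6.63). The absorbance readings are hypothetical and the function name is an assumption.

```python
def total_chlorophyll(a649, a665, volume_ml, dilution, mass_g):
    """Total chlorophyll (ug/g) from absorbance at 649 nm and 665 nm."""
    concentration = 18.08 * a649 + 6.63 * a665  # pigment per mL of extract
    return concentration * volume_ml * dilution / mass_g


# Hypothetical readings for 0.1 g of leaf extracted in 5 mL of 95% ethanol.
value = total_chlorophyll(a649=0.20, a665=0.50, volume_ml=5, dilution=1, mass_g=0.1)
print(round(value, 2))
```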

RNA extraction, library construction, and sequencing

RNA was extracted from six samples (two tissues, three biological replicates) of RL and GL using TRIzol® reagent (Invitrogen, USA) according to the manufacturer’s instructions. Sequencing libraries were constructed using the NEBNext® Ultra™ RNA Library Prep Kit for Illumina® (NEB, USA). Transcriptome sequencing was performed on an Illumina NovaSeq 6000 platform with a 150-bp paired-end strategy. Raw data in fastq format were further processed using the BMK online platform (http://www.biocloud.net) for quality control and to output the clean reads. Transcriptome assembly was performed using Trinity (version 2.14.0) with non-default parameters (min_contig_length 200, group_pairs_distance 500) (Grabherr et al. 2011). Clean reads were aligned to the unigene library using Bowtie (Langmead et al. 2009), and expression levels were estimated with RSEM (version 1.2.19) (Li and Dewey 2011). The fragments per kilobase of exon per million fragments mapped (FPKM) value for each unigene was subsequently obtained (Trapnell et al. 2010). Using BLAST alignment (cutoff E-value ≤ 10⁻⁵) and HMMER alignment (version 3.1b2; cutoff E-value ≤ 10⁻¹⁰), the unigenes were annotated based on eight functional annotation databases, including Clusters of Orthologous Groups (COG), euKaryotic Orthologous Groups (KOG), Swiss-Prot, Protein Family (Pfam), Kyoto Encyclopedia of Genes and Genomes (KEGG), Gene Ontology (GO), eggNOG, and NCBI non-redundant protein sequences (NR). The TFs were identified and classified by iTAK and PlantTFDB (http://planttfdb.gao-lab.org/) with E-value ≤ 10⁻⁵.
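For readers unfamiliar with the FPKM unit, the textbook definition can be sketched as below. This is an illustration only: RSEM uses an effective transcript length and a statistical model for multi-mapped reads, so its values will differ from this simplified calculation, and the counts shown are hypothetical.

```python
def fpkm(fragments, length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    return fragments * 1e9 / (length_bp * total_mapped_fragments)


# Hypothetical unigene: 500 fragments, 2,000 bp long, 10 million mapped fragments.
print(fpkm(500, 2000, 10_000_000))
```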

Extraction, data preprocessing and annotation of metabolome

The extraction procedure for metabolome analysis (Dunn et al. 2011; Want et al. 2010) was as follows: leaves from C. ensifolium were freeze-dried in a vacuum freeze-dryer (Scientz-100F) and crushed using a mixer mill with a zirconia bead for 1.5 min at 30 Hz. A total of 100 mg of lyophilized powder was dissolved in 1.2 mL of a 70% methanol solution. The samples were vortexed for 30 s every 30 min, six times in total, and then placed in a refrigerator at 4 °C overnight. The samples were centrifuged (12,000 rpm, 10 min) and the extracts were filtered for subsequent analysis. Widely targeted metabolite analysis was conducted using a UPLC-ESI–MS/MS system, which consisted of an ultra-high-performance liquid chromatograph (UPLC, SHIMADZU Nexera X2, Japan) and a tandem mass spectrometer (MS/MS 4500 QTRAP, Applied Biosystems, USA). The raw data were collected using MassLynx (version 4.2, Waters Corporation) and processed for the identification and quantification of metabolites with Progenesis QI, using the online METLIN database and Biomark’s self-built library. The results of the OPLS-DA were plotted with the ropls package (version 1.18.8).

Identification of differentially expressed unigenes

Differential expression analysis was performed using DESeq2 (version 1.30.1) (Love et al. 2014). The resulting P values were adjusted using Benjamini and Hochberg’s approach for controlling the false discovery rate (FDR). Unigenes with at least a twofold change (FC) in expression [|log2(FC)| ≥ 1] and an adjusted P < 0.01 were designated as DEUs.
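The thresholding step can be sketched in a few lines. This is not DESeq2 itself: it only reproduces the Benjamini and Hochberg adjustment and the two cutoffs stated above, applied to P values and log2 fold changes that are assumed to be already available. The unigene identifiers and numbers are hypothetical.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


def call_deus(records, lfc_cutoff=1.0, padj_cutoff=0.01):
    """Return ids with |log2FC| >= cutoff and adjusted P below cutoff."""
    padj = bh_adjust([r["pvalue"] for r in records])
    return [r["id"] for r, q in zip(records, padj)
            if abs(r["log2fc"]) >= lfc_cutoff and q < padj_cutoff]


# Hypothetical results table for four unigenes.
table = [
    {"id": "g1", "log2fc": -2.5, "pvalue": 1e-6},
    {"id": "g2", "log2fc": 0.4, "pvalue": 1e-6},
    {"id": "g3", "log2fc": 3.0, "pvalue": 0.2},
    {"id": "g4", "log2fc": 1.2, "pvalue": 0.001},
]
print(call_deus(table))
```

In this toy table, g2 fails the fold-change cutoff and g3 fails the adjusted P cutoff, which shows that both conditions must hold at once.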

Identification of differentially accumulated metabolites

Metabolites meeting the criteria of variable importance in projection (VIP) ≥ 1, |log2(FC)| ≥ 1, and P < 0.05 were identified as DAMs, i.e., metabolites with significant differences in content.
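The three DAM criteria combine as a simple conjunction. The sketch below is illustrative only: the function name and the sign convention for labelling "up" and "down" (positive log2 fold change as "up") are assumptions, not details taken from the text.

```python
def classify_dam(vip, log2fc, pvalue):
    """Label a metabolite using VIP >= 1, |log2FC| >= 1 and P < 0.05."""
    if vip >= 1 and abs(log2fc) >= 1 and pvalue < 0.05:
        return "up" if log2fc > 0 else "down"
    return "not significant"


print(classify_dam(vip=1.5, log2fc=-2.7, pvalue=0.003))
```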

Statistical analysis

PCA and Pearson’s correlation coefficient (PCC) analyses were performed using the prcomp and cor functions in R (version 3.6.3), and the heat map was generated using pheatmap (version 1.0.12). GO and KEGG enrichment analyses were conducted using the R package clusterProfiler (version 3.14.3) (Yu et al. 2012).

qRT-PCR verification

qRT-PCR was conducted using SYBR® Premix Ex Taq™ II (Takara, Japan), with three replicates for each sample. We employed EF1a (GenBank: KC794500.1) as the reference gene. The amplification procedure on the Analytik Jena qTOWER 2.2 was as follows: (i) pre-denaturation at 95 °C for 3 min; (ii) denaturation at 95 °C for 10 s; and (iii) annealing at 60 °C for 30 s. Steps (ii) and (iii) were repeated for 40 cycles. We calculated the relative abundance of transcripts using the 2^(−ΔΔCt) method (Livak and Schmittgen 2001).
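The relative quantification of Livak and Schmittgen (2001) reduces to two subtractions and one exponent. The sketch below assumes mean Ct values are already available for the target and reference gene in both conditions. The Ct numbers are hypothetical, and treating one leaf stage as the calibrator is an assumption of the example.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Relative transcript abundance by the 2^(-ddCt) method."""
    delta_sample = ct_target_sample - ct_ref_sample      # dCt in the sample
    delta_control = ct_target_control - ct_ref_control   # dCt in the calibrator
    return 2 ** -(delta_sample - delta_control)


# Hypothetical Ct values: the target appears two cycles later in the sample
# than in the calibrator, while the reference gene is unchanged.
print(relative_expression(24, 18, 22, 18))
```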

Results

Phenotypical and physiological differences between RL and GL

The leaves of C. ensifolium were red during the leaf bud stage and finally matured into green (Fig. 1A). In order to determine the pigment components involved in the leaf color change, we measured the contents of anthocyanin, total flavone, and chlorophyll. Anthocyanin content declined sharply (P = 6.09 × 10⁻⁵, Student’s t-test), changing from 432.93 μg/g in RL to 78.94 μg/g in GL (Fig. 1B). A slightly lower total flavone content was observed in RL compared to GL (Fig. 1C). However, no significant differences were observed in the chlorophyll content between RL and GL (Fig. 1D). To further explore the mechanism of leaf color change in C. ensifolium, a widely targeted metabolomics approach and transcriptome analysis were conducted.

Fig. 1

Characteristics of phenotype and pigment content in red leaves (RL) and green leaves (GL) of C. ensifolium. A Phenotypes of RL and GL. B Significant difference in anthocyanin content of RL and GL. C Significantly higher total flavone content in GL than RL. D No difference in chlorophyll content between RL and GL. Bars represent the mean ± SD (n = 3). NS no significant difference

De novo transcriptome sequencing of RL and GL

A total of 47.58 Gb of clean data was obtained by de novo transcriptome sequencing of RL and GL in C. ensifolium, with 7.54 Gb of clean data per sample and a mean GC content of 46.56%. Q20 and Q30 were at least 97.99% and 89.20%, respectively, for each sample (Supplementary Table 1). The mapped ratio ranged from 81.91% to 88.23%, indicating that the RNA-seq data were of reliable quality (Supplementary Table 1). Principal component analysis (PCA) revealed that the first two principal components explained 59.3% of the total variance (39.1% by PC1, 20.2% by PC2). The three biological replicates of each tissue formed a cluster, indicating a clear difference in gene expression between RL and GL (Fig. 2A). The correlation coefficient among the biological replicates for each sample was high (Pearson’s r > 0.982 in RL, Pearson’s r > 0.994 in GL), while the correlation coefficient between RL and GL was low (Pearson’s r < 0.189) (Fig. 2B). Following assembly, 59,445 unigenes were obtained with an average length of 1,013 bp and an N50 of 1,890 bp (Supplementary Table 2). A total of 25,856 unigenes were functionally annotated in the eight public databases (COG, KOG, SwissProt, Pfam, KEGG, GO, eggNOG, and NR) (Figure S1), with 3,274 annotated in all databases, accounting for 5.51% of the total unigenes. In addition, 3,955 DEUs were identified, of which 1,338 were significantly upregulated and 2,617 were downregulated in GL compared to RL (Figure S2, Supplementary Table 3).
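The N50 statistic quoted for the assembly is the length of the shortest unigene in the smallest set of longest unigenes that together cover at least half of the assembled bases. A minimal sketch of that definition follows. The contig lengths are hypothetical and the function is not taken from Trinity.

```python
def n50(lengths):
    """Length L such that contigs of length >= L hold at least half of all bases."""
    half = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half:
            return length
    raise ValueError("lengths must not be empty")


# Hypothetical toy assembly: 20 bases in total, so the halfway mark is 10.
print(n50([2, 3, 4, 5, 6]))
```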

Fig. 2

Information of de novo assembled unigenes in C. ensifolium determined from transcriptomic analysis. A Principal Component Analysis of unigene expression derived from the transcriptome. B Pearson’s correlation coefficient of unigene expression. Numbers 1, 2, and 3 correspond to each replication. C Bubble chart of significant GO biological process terms (FDR < 0.01) among the DEUs. The redder the bubble color, the higher the enrichment degree. D Bar diagram showing the KEGG analysis of the DEUs. X-axis represents − log10 (P value), with greater values indicating a higher enrichment degree. Y-axis plots KEGG terms; numbers represent the DEUs included in each term. FDR < 0.01 was set as the cutoff

To gain a deeper understanding of the major functions of the DEUs during leaf color change, enrichment analysis was performed. In the biological process category, the DEUs were significantly enriched for 15 GO terms, including the carbohydrate metabolic process (GO:0005975), cell wall organization (GO:0071555), pectin catabolic process (GO:0045490), and auxin-activated signaling pathway (GO:0009734) (Fig. 2C). In the molecular function and cellular component categories, the terms cell wall (GO:0005618), extracellular region (GO:0005576), heme binding (GO:0020037), and protein kinase activity (GO:0004672) were most enriched (Figure S3). Furthermore, the phenylpropanoid biosynthesis (ko00940), starch and sucrose metabolism (ko00500), and flavonoid biosynthesis (ko00941) pathways exhibited the highest significance and rich factors in the KEGG analysis (Fig. 2D).
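Over-representation tests of this kind are commonly based on the hypergeometric distribution. The sketch below shows that calculation together with a rich factor, under two assumptions: the rich factor is taken to be the number of DEUs in a pathway divided by all annotated unigenes in that pathway, and the counts in the example are hypothetical. It illustrates the idea and does not reproduce the clusterProfiler implementation.

```python
from math import comb


def enrichment_p(k, n_pathway, n_de, n_total):
    """P(X >= k): chance of seeing at least k pathway genes among n_de DEUs."""
    upper = min(n_pathway, n_de)
    total_ways = comb(n_total, n_de)
    favourable = sum(comb(n_pathway, i) * comb(n_total - n_pathway, n_de - i)
                     for i in range(k, upper + 1))
    return favourable / total_ways


def rich_factor(k, n_pathway):
    """Share of a pathway's annotated genes that are differentially expressed."""
    return k / n_pathway


# Hypothetical counts: 12 of 40 pathway genes are DEUs, out of 400 DEUs
# among 8,000 annotated unigenes.
print(enrichment_p(12, 40, 400, 8000), rich_factor(12, 40))
```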

Metabolite identification and quantification

In order to explore the metabolic networks active during leaf color change in C. ensifolium, widely targeted metabolite analysis was used to detect the metabolites based on the UPLC-ESI–MS/MS system. A total of 897 metabolites were identified (Supplementary Table 4). PCA revealed clear differences between RL and GL, with the first two components explaining 89.0% of the total variance (64.1% by PC1, 24.9% by PC2) (Fig. 3A). The Pearson’s correlation coefficient (PCC) was high among biological replicates (Pearson’s r > 0.816 in RL, Pearson’s r > 0.993 in GL) (Fig. 3B). Moreover, the orthogonal projections to latent structures-discriminant analysis (OPLS-DA) model was stable and reliable (R2X = 0.936, R2Y = 1, Q2Y = 0.996) and could be used to identify DAMs according to the VIP value (Fig. 3C). We performed 200 permutation tests to validate the OPLS-DA model (Figure S4). In total, 381 DAMs were identified, of which 309 were upregulated and 72 were downregulated in GL compared with RL (Fig. 3D, Supplementary Table 4).

Fig. 3

Metabolomic UPLC-ESI–MS/MS-based analyses of the C. ensifolium leaves. A Principal Component Analysis of metabolites. B Pearson’s correlation coefficient of metabolites. C OPLS-DA model of metabolites. D Bar chart depicting the DAMs between RL and GL. E Annotation of DAMs against HMDB databases. F Bar chart depicting the KEGG analysis of the DAMs

To further understand the category of metabolites, we annotated the DAMs based on the HMDB and KEGG databases. A total of 60 DAMs were annotated into 19 classification entries, such as carboxylic acids and derivatives, organooxygen compounds, and flavonoids (Fig. 3E). Based on the rich factor ≥ 1.5, 40 DAMs were assigned to 22 KEGG pathways. A total of 7 DAMs were mapped to glyoxylate and dicarboxylate metabolism (ko00630), 1 DAM was mapped to anthocyanin biosynthesis (ko00942), and 2 DAMs were mapped to flavone and flavonol biosynthesis (ko00944) (Fig. 3F).

Comprehensive analysis of the metabolome and transcriptome of C. ensifolium leaf

Many DEUs and DAMs were simultaneously annotated in 57 different KEGG pathways, such as flavone and flavonol biosynthesis (2 DAMs and 6 DEUs), the citrate cycle (TCA cycle) (5 DAMs and 4 DEUs), and phenylalanine metabolism (6 DAMs and 9 DEUs) (Supplementary Table 5). In order to investigate the association between DEUs and DAMs in more depth, a comprehensive analysis was performed using PCC. Supplementary Table 6 reports the correlations meeting |r| > 0.90 and P < 1.0 × 10⁻², revealing strong correlations between 3,723 DEUs and 381 DAMs. TRINITY_DN703_c0_g1 (encoding kinesin-like protein KIN-5A) exhibited the strongest negative correlation (r = −0.999, P = 3.68 × 10⁻⁹) with icariside B2, and TRINITY_DN6894_c0_g1 (encoding MYB) showed the strongest positive correlation with quercetagetin-7-O-glucoside (r = 0.999, P = 5.81 × 10⁻⁹) (Supplementary Table 6). The highly correlated DEUs and DAMs, which were involved in multiple metabolic pathways, may play important roles in the leaf color change of C. ensifolium.
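The pairing of DEUs with DAMs by Pearson’s correlation can be sketched as follows. This is an illustration under stated assumptions: the profiles are invented six-sample vectors, the identifiers are hypothetical, and only the |r| > 0.90 cutoff is applied. The accompanying P value filter would additionally require a t-distribution, which is not included here.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two equal-length profiles."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def strong_pairs(genes, metabolites, r_cutoff=0.90):
    """List (gene, metabolite, r) for every pair with |r| above the cutoff."""
    pairs = []
    for gene, expression in genes.items():
        for metabolite, abundance in metabolites.items():
            r = pearson_r(expression, abundance)
            if abs(r) > r_cutoff:
                pairs.append((gene, metabolite, round(r, 3)))
    return pairs


# Hypothetical profiles across six samples (three RL, three GL replicates).
genes = {"gene_a": [1, 2, 3, 4, 5, 6]}
metabolites = {
    "met_x": [2, 4, 6, 8, 10, 12],
    "met_y": [6, 5, 4, 3, 2, 1],
    "met_z": [1, -1, 1, -1, 1, -1],
}
print(strong_pairs(genes, metabolites))
```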

Regulatory network of anthocyanin biosynthesis

The anthocyanin content dropped significantly from RL to GL (Fig. 1B). The transcriptome results demonstrated the enrichment of DEUs in flavonoid biosynthesis, and the metabolome analysis revealed a high rich factor for anthocyanin biosynthesis (ko00942, an important branch of flavonoid synthesis) during the leaf color change of C. ensifolium. Thus, we focused on anthocyanin biosynthesis and analyzed the key DEUs and DAMs implicated in this pathway. A total of 10 anthocyanin synthesis-related DEUs were identified, comprising CHS (TRINITY_DN316_c0_g1, TRINITY_DN1786_c0_g1), CHI (TRINITY_DN10644_c0_g1, TRINITY_DN143_c0_g1), DFR (TRINITY_DN2247_c1_g1), F3H (TRINITY_DN76_c0_g3, TRINITY_DN3316_c1_g1), F3′H (TRINITY_DN3907_c0_g1, TRINITY_DN9664_c0_g1), and ANS (TRINITY_DN852_c0_g1). Interestingly, all 10 major DEUs in the anthocyanin biosynthesis pathway were downregulated in GL relative to RL, with FC varying from 3.49 to 79.17 (Supplementary Table 7). Metabolome analysis revealed that 3 downregulated and 4 upregulated DAMs were involved in the anthocyanin biosynthesis pathway. Cyanidin-3-O-glucoside (down by 6.47 FC), petunidin-3-O-(6″-O-p-coumaroyl) rutinoside (down by 7.26 FC), and peonidin-3-O-glucoside (down by 15.70 FC) accumulated at lower levels in GL than in RL. In contrast, petunidin-3-O-glucoside (up by 17.78 FC), delphinidin-3-O-rutinoside-7-O-glucoside (up by 18.54 FC), malvidin-3-O-(6″-O-caffeoyl) glucoside (up by 21.20 FC) and pelargonidin-3,5-O-diglucoside (up by 20.71 FC) accumulated at higher levels in GL than in RL. Figure 4A presents the anthocyanin synthesis pathway determined by combining transcriptomic and metabolomic data. The results signify the important role of cyanidin-3-O-glucoside, petunidin-3-O-(6″-O-p-coumaroyl) rutinoside, and peonidin-3-O-glucoside in coloration in C. ensifolium. Furthermore, CHI, DFR, F3H, and ANS were positively correlated with cyanidin-3-O-glucoside (r > 0.932, P < 6.83 × 10⁻³), and CHI was also positively correlated with peonidin-3-O-glucoside (r = 0.927, P = 7.83 × 10⁻³) (Fig. 4B).

Fig. 4

DEUs and DAMs in the synthesis pathway of anthocyanins. A Regulatory network of anthocyanin biosynthesis. Red indicates higher levels of unigene expression in RL relative to GL and blue indicates lower levels of unigene expression in RL relative to GL. Blue rectangular boxes represent the DAMs that are significantly lower in GL than RL. PAL phenylalanine ammonia-lyase, C4H cinnamate 4-hydroxylase, 4CL 4-coumarate:CoA ligase, CHI chalcone isomerase, CHS chalcone synthase, F3H flavanone 3-hydroxylase, F3′H flavonoid 3′-hydroxylase, F3′5′H flavonoid 3′,5′-hydroxylase, DFR dihydroflavonol 4-reductase, ANS anthocyanidin synthase, and UFGT flavonoid 3-O-glucosyltransferase. B Correlations between the anthocyanin synthesis-related DAMs and DEUs

Identification of transcription factors

To promote functional gene research in C. ensifolium, the identification of TFs was performed. A total of 242 DEUs were identified as TFs (Figure S5, Supplementary Table 8). In particular, the bHLH and MYB TF families, which are considered to regulate the biosynthesis of anthocyanins (Ahmad et al. 2022; Lloyd et al. 2017), accounted for the largest proportions, with 23 (9.50%) and 18 (7.44%) members, respectively. The correlation coefficients between the 242 TF-associated DEUs and the key anthocyanin synthesis-related DEUs are presented in Supplementary Table 9. In addition, we identified 93 TF-associated DEUs that were significantly correlated with key anthocyanin synthesis-related DAMs (Supplementary Table 10). Further investigation determined a set of 77 TF-associated DEUs that were strongly correlated with both the DEUs and DAMs implicated in anthocyanin biosynthesis (Figure S6). The 77 DEUs were assigned to 31 different TF families, with the top 5 TF families (in terms of quantity) being bHLH (N = 13), NAC (N = 9), AP2/ERF-ERF (N = 6), MYB (N = 5), and MYB-related (N = 4). Interestingly, among these 77 TFs, a general tendency toward downregulation from RL to GL was observed in the MYB and bHLH families. For example, TRINITY_DN1376_c0_g1 (R2R3-MYB) exhibited a positive correlation with 1 key anthocyanin synthesis-related DAM and 5 anthocyanin synthesis-related DEUs from RL to GL. This implies a role for these TF families, especially MYBs and bHLHs, in regulating the biosynthesis and transport of anthocyanins during leaf color change.

Real-time polymerase chain reaction (PCR) validation

To assess the accuracy and reproducibility of the RNA-seq data, key structural unigenes were selected and their relative expression levels were examined using quantitative real-time PCR (qRT-PCR). Supplementary Table 11 lists the specific primers. The correlation between the qRT-PCR and RNA-seq results is represented by R². The R² ≥ 0.767 indicates that the expression patterns of the DEUs obtained from RNA-seq and qRT-PCR are consistent, confirming that our RNA-seq data are reliable (Fig. 5). Thus, the acquired data verified the importance of anthocyanin biosynthesis-related DEUs in the change of leaf color in C. ensifolium.

Fig. 5

Validation of qRT-PCR. Bars represent the mean ± SD (n = 3). R² represents the correlation analysis of the qRT-PCR and RNA-seq results

Discussion

The color of leaves frequently varies as a result of a plant’s growth and development. Examples include Alternanthera bettzickiana (Li et al. 2022), Acer pictum subsp. mono (Ge et al. 2019), and Padus virginiana (Li et al. 2021). To understand the mechanism of leaf color change, earlier studies concentrated on physiology, cytology, and molecular biology techniques (Qi et al. 2022; Wang et al. 2021). However, multi-omics analyses of the regulatory mechanisms and abundant metabolites underlying leaf color change are lacking. The integrated analysis of the transcriptome and metabolome can be employed to identify secondary metabolites or related genes that affect leaf color (Ge et al. 2019; Feng et al. 2021; Qi et al. 2022; Qiu et al. 2020). In our study, the combined analysis of metabolome and transcriptome data was performed to provide mechanistic information for the genetic improvement of C. ensifolium.

Pigment accumulation between RL and GL

The aesthetic and ornamental worth of C. ensifolium is greatly influenced by leaf color, a key horticultural characteristic. Studies have shown that the formation of leaf color in colored-leaf plants is complex and based on the biosynthesis, distribution, type, and content of pigments (Tang et al. 2020). This study revealed that the anthocyanin content in GL is 5.48-fold lower than that in RL, a greater change than that observed for the total flavone and chlorophyll contents (Fig. 1B–D). We hypothesized that anthocyanins might be a key cause of color change in C. ensifolium leaves. Anthocyanins are important secondary metabolites and have the ability to safeguard photosynthesis, resist oxidation, moderate stress, and postpone senescence. More significantly, numerous studies have demonstrated that anthocyanins are crucial for the pigmentation of key plant organs including leaves, flowers, fruits, and seeds (Gao et al. 2020; Song et al. 2021; Lim et al. 2020; Zhang et al. 2023). Cyanidin controls the change from the pink to purple-red color spectrum (Kawase et al. 1970; Veberic et al. 2015; Yi et al. 2021; Han et al. 2021; Bureau et al. 2009; Robinson and Robinson 1932) and is the dominant anthocyanin in the scarlet skin of Crataegus pinnatifida Bunge and Crataegus laevigata (Poir.) DC. fruits (Veberic et al. 2015). Longan’s pericarp has a bright red color owing to the accumulation of cyanidin derivatives (cyanidin 3-O-glucoside, cyanidin 3-O-6″-malonyl-glucoside, and cyanidin O-syringic acid) (Yi et al. 2021). The mature fruit pericarp of sweet osmanthus is purple-black due to elevated levels of cyanidin-3-O-rutinoside and peonidin-3-O-rutinoside (Han et al. 2021). In this work, cyanidin-3-O-glucoside, petunidin-3-O-(6″-O-p-coumaroyl) rutinoside, and peonidin-3-O-glucoside all accumulated at significantly lower levels in GL. The reduction of these three key anthocyanins is likely one of the primary causes of the C. ensifolium leaf color shift from red to green.

Anthocyanin synthesis-related DEUs of C. ensifolium leaves

The anthocyanin biosynthetic route in plants has been extensively researched. Numerous enzymes catalyze steps of anthocyanin synthesis, such as CHS, CHI, DFR, ANS, and F3H (Ferrer et al. 2008; Holton and Cornish 1995; Sunil and Shetty 2022). CHS catalyzes the key branch point from monolignol biosynthesis to the anthocyanins, while F3′H determines whether the cyanidin-based anthocyanins will be synthesized (Jaakola 2013). Genes related to anthocyanin synthesis are vital for leaf color. In Paeonia qiui, F3H, F3′H, DFR and ANS were found to be upregulated during the fading of red color in spring leaves (Luo et al. 2017). An increased expression of the UFGT and BZ1 genes contributes to the purple-red color change of Padus virginiana (Li et al. 2021). In our results, unigenes annotated as CHS, CHI, DFR, F3H, F3′H, and ANS exhibited a consistent downregulation from RL to GL in C. ensifolium leaves (Fig. 5, Supplementary Table 7). We inferred that these 10 anthocyanin-related DEUs are key genes affecting leaf color change. Further functional verification of the identified DEUs can shed light on the regulatory processes underlying anthocyanin accumulation in C. ensifolium. Furthermore, C. ensifolium leaves can be utilized for natural pigmentation.

Transcription factors related to anthocyanin biosynthesis

The biosynthesis of anthocyanins is strictly regulated by TFs, particularly MYBs and bHLHs (Ahmad et al. 2022; Lloyd et al. 2017; Liu et al. 2021). Anthocyanin synthesis can be restored by simultaneously introducing MYB and bHLH anthocyanin regulators in Cymbidium orchids (Albert et al. 2010). Numerous studies on model plants and horticultural crops have shown that anthocyanin-promoting R2R3-MYB TFs can control anthocyanin accumulation (Yang et al. 2022). Among the 77 TF-associated DEUs, we found that an R2R3-MYB (TRINITY_DN1376_c0_g1) might have a significant impact on the color change of leaves, exhibiting positive correlations with CHS, CHI, F3H and petunidin-3-O-(6″-O-p-coumaroyl) rutinoside. Its expression decreased in GL compared to RL by 10.26 FC. Furthermore, TRINITY_DN6149_c2_g1 (encoding bHLH25) demonstrated a strong positive association with 2 key anthocyanin synthesis-related DAMs and 8 anthocyanin synthesis-related DEUs. Previous research attributed the purple-red leaf stage of Padus virginiana to the coordinated regulation of MYBs and bHLHs (Li et al. 2021). According to our research, MYBs and bHLHs may positively or cooperatively regulate the expression of several structural genes involved in the production of anthocyanins, which in turn causes the color shift in C. ensifolium leaves. Several additional types of differentially expressed TFs, such as bZIPs, MADS, WRKYs, NACs, and AP2/ERF-ERFs, were also found to be connected to the anthocyanin synthesis-related DEUs and DAMs. TRINITY_DN1116_c0_g1 (which encodes the ethylene-responsive TF ERF039) was reduced from RL to GL by 12.89 FC and was positively correlated with CHI and peonidin-3-O-glucoside. TRINITY_DN1013_c0_g2 (WRKY44) increased in expression by 3.65 FC from RL to GL and was negatively correlated with peonidin-3-O-glucoside, CHI, and CHS. TFs such as ERF, WRKY, bZIP, and MADS may synergistically regulate the expression of unigenes in anthocyanin biosynthesis. Similar results were found in Artemisia annua L. (Hassani et al. 2020).

Conclusion

In this study, the molecular regulatory mechanisms of leaf color change in C. ensifolium were characterized by combining metabolome and transcriptome data. We found that anthocyanin, whose biosynthesis pathway is essential for leaf color change, was markedly lower in GL than in RL. The key anthocyanin-related DEUs (CHS, CHI, DFR, F3H, F3′H, and ANS) were downregulated in GL relative to RL, implying a significant role in coloration. The reduction of cyanidin-3-O-glucoside, petunidin-3-O-(6″-O-p-coumaroyl) rutinoside, and peonidin-3-O-glucoside was considered one of the main reasons for the leaf color change from red to green in C. ensifolium. A total of 77 TF-associated DEUs exhibited strong correlations with both DEUs and DAMs involved in anthocyanin biosynthesis, such as TRINITY_DN1376_c0_g1 (R2R3-MYB) and TRINITY_DN6149_c2_g1 (encoding bHLH25). In the future, biotechnology may be applied to enhance the expression of the identified DEUs, with the goal of prolonging the red color of the leaves in C. ensifolium. Taken together, the comprehensive analyses help explain the mechanism of color change in C. ensifolium leaves, providing an important reference for colored-leaf plants.