Introduction

Catalpa bungei C.A. Meyer, which originated in China, is a valuable timber species and a well-known ornamental tree found extensively in the Yellow River and Yangtze River basins. Its timber has good physical and chemical properties, and the species also has medicinal value (Lin et al. 2010). However, self-sterility, few fruits, a low seed germination rate, and poor rooting ability largely limit the propagation and use of this fine timber species. Tissue culture therefore offers a low-cost, easily mastered way to rapidly propagate C. bungei plantlets with a high survival rate (Lin et al. 2010).

Somatic embryogenesis is an artificial process through which an embryo, and eventually a whole plant, is derived from a single somatic cell or a group of somatic cells. It has become one of the most promising approaches for plant propagation (Martin 2004), germplasm storage (Yin and Hong 2010), synthetic seed production (Manjkhola et al. 2005; Kumar and Chandra 2014), generation of virus-free plants (Li et al. 2016), and genetic transformation (Pathi et al. 2013), as well as a potential model system for basic research on the earliest events of plant development (Zhu and Perry 2005). For woody plants, it is necessary to develop a method of mass propagation and conservation to satisfy the demand for wood. Lin et al. (2010) reported callus induction in C. bungei with leaf, stem, and petiole explants, and Jiang et al. (2010) reported somatic embryogenesis in Catalpa fargesii with seedling tissues as explants. In our previous study, we established an effective plant regeneration system in C. bungei via somatic embryogenesis. Using immature seeds as explants, calli were obtained at a high frequency (90%). Compared with the method of Jiang et al. (2010), these calli, induced from immature zygotic embryos, showed a higher frequency of somatic embryo induction (59.82%, unpublished data) and required less induction time, since the explants of Jiang et al. (2010) came from mature zygotic embryos and required 7–15 d of post-germination growth. Upon further subculturing, embryogenic callus formed and eventually developed into somatic embryos. However, the genetic and physiological basis of this reprogramming is largely unknown. In agreement with previous reports on American chestnut (Carraway and Merkle 1997), Passiflora cincinnata Masters (Rocha et al. 2012), and Quercus rubra L. (Vengadesan and Pijut 2009), we found that this process also requires certain growth conditions, such as suitable concentrations and combinations of plant growth regulators.

Investigations in various species have revealed that embryogenic callus and somatic embryo induction are affected by various genetic and physiological factors (Karami et al. 2009). Plant hormones are among the most important physiological factors regulating somatic embryogenesis in many woody plants, such as cassava (Ma et al. 2015), Caesalpinia sappan L. (Naik et al. 2014), Jatropha curcas L. (Kumar et al. 2015), and Citrus species (Savita et al. 2010). However, whether and how plant hormones participate in somatic embryogenesis in C. bungei remains unknown. Plant hormone signal transduction pathways have been well documented in model plants (Wang and Irving 2011; Daviere and Achard 2013; Pacifici et al. 2015). Recognition of auxin by F-box proteins (TIR1/AFBs, the auxin receptors) initiates signal transduction, followed by ubiquitin-mediated degradation of AUX/IAA repressors and release of ARFs, which activates the auxin response (Gray et al. 2001; Dharmasiri et al. 2005; Paciorek and Friml 2006). Similar ubiquitin-mediated signaling activation pathways operate in abscisic acid (ABA), gibberellin, jasmonic acid, and ethylene signaling, with different proteins acting as receptors, repressors, and transcription factors (Guo and Ecker 2003; Sasaki et al. 2003; Vierstra 2009; Chen et al. 2013; Daviere and Achard 2013; Nagels et al. 2016).

To better understand the molecular basis of embryogenic callus (EC) induction in C. bungei, the gene expression profiles of EC and non-embryogenic callus (NEC) were characterized by RNA-seq. The major pathways and key events associated with these two representative materials of somatic embryo development were identified through in-depth analysis of differentially expressed genes (DEGs). Combined with quantitative analysis of endogenous hormones in EC and NEC, these data improve our understanding of the molecular networks regulating somatic embryo formation and may provide guidance for improving somatic embryo induction, as well as for ensuring the quality of regenerated plants, through improved culture conditions.

Materials and methods

Materials and establishment of somatic embryogenesis in C. bungei

Immature C. bungei seeds collected 70 days post-pollination at the Hutoushan Forest Park in Yanshi County, Henan, were used as explants for callus induction. Tissue culture was performed on half-strength MS basal medium, and cultures were incubated in a plant growth chamber at 25 ± 2 °C under a 12 h light/12 h dark cycle.

Using immature zygotic embryos of C. bungei clone Q12 as explants, the highest frequency of calli (90%) was obtained on half-strength MS medium with 1.0 mg L−1 2,4-D (2,4-dichlorophenoxyacetic acid) and 0.1 mg L−1 6-BA (6-benzyladenine). After about 1 week, the callus was transferred onto half-strength MS medium containing 1.0 mg L−1 6-BA and 0.01 mg L−1 NAA (1-naphthaleneacetic acid) for subsequent culturing. About 5 weeks later, some of the calli had proliferated rapidly without significant morphological change, while the rest grew slowly and became light yellow, granular embryogenic callus, which eventually turned into somatic embryos. At this point, after 5 weeks on this medium, non-embryogenic callus (Fig. 1a) and embryogenic callus (Fig. 1b) were separated and collected for RNA-seq analysis and plant hormone quantification. The highest frequency of somatic embryo induction (59.82%) was achieved on half-strength MS medium supplemented with 0.2 mg L−1 6-BA and 0.05 mg L−1 NAA after subculturing for two and a half months. The embryogenic calli gradually converted into globular, heart-, torpedo-, and cotyledonary-shaped somatic embryos during subsequent subculture on the same medium. Cotyledonary embryos with two visible leaves were transferred onto a hormone-free plant regeneration medium, where they eventually developed into normal plantlets.

Fig. 1 Morphology of embryogenic callus and non-embryogenic callus. a Non-embryogenic callus without significant morphological changes. b Embryogenic callus. c Differentiated embryogenic callus with visible embryo-like structures. Arrow indicates somatic embryo. Scale bar = 5 mm

RNA isolation, library construction, and sequencing

Total RNA was isolated from EC and NEC of C. bungei as previously described (Liu et al. 2015). In brief, 100 mg of callus per sample was ground in liquid nitrogen, and RNA was isolated with the RNAprep pure Plant Kit (TIANGEN, China) according to the manufacturer’s protocol. RNA was quantified with an Agilent 2100 Bioanalyzer and 1.5% agarose gel electrophoresis, and RNA quality and integrity were assessed with a NanoDrop and the Agilent 2100 Bioanalyzer. The isolated RNA samples were then shipped on dry ice to BGI-Shenzhen (headquarters) for library construction and sequencing. The cDNA libraries were constructed with the mRNA-Seq Sample Preparation Kit™ (Illumina, San Diego, CA, USA), analyzed with an Agilent 2100 Bioanalyzer and an ABI StepOnePlus Real-Time PCR System, and then sequenced on an Illumina HiSeq™ 2000 instrument. RNA-seq analysis was performed with three independent biological replicates for each sample.

Bioinformatics analyses of RNA-seq data

Low-quality regions and adaptor sequences were first removed from the raw RNA-Seq reads to obtain clean reads, with a clean-read ratio above 93%. The clean reads were then assembled de novo with Trinity (v2.0.6). The assembled unigenes were annotated against the NT, NR, COG, KEGG, and SwissProt databases with Blast (Altschul et al. 1990), against the GO database with Blast2GO (Conesa et al. 2005), and against the InterPro database with InterProScan5 (Quevillon et al. 2005). Transcription factors were predicted with getorf (Rice et al. 2000) and hmmsearch (Mistry et al. 2013), and simple sequence repeats (SSRs) were detected with MISA (Thiel et al. 2003). Three independent replicates were used for the DEG analysis. DEGs were identified with the NOIseq (Tarazona et al. 2011) and PoissonDis (Audic and Claverie 1997) methods, with thresholds of fold change ≥ 2.00 and probability ≥ 0.8 for NOIseq, and fold change ≥ 2.00 and false discovery rate (FDR) ≤ 0.001 for PoissonDis. The overall process is shown in Figure S1.
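
The two threshold sets described above can be written as a simple filter. The following Python sketch is only an illustration and not the pipeline used in this study: the record fields (fold_change, probability, fdr) and all function names are hypothetical, and fold change is assumed to be an EC/NEC expression ratio that counts as 2-fold whether it is ≥ 2 or ≤ 0.5.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GeneStat:
    """Per-unigene statistics. All field names are hypothetical."""
    gene_id: str
    fold_change: float                   # EC/NEC expression ratio, must be > 0
    probability: Optional[float] = None  # NOIseq-style probability, if available
    fdr: Optional[float] = None          # Poisson-method FDR, if available


def fold_magnitude(ratio: float) -> float:
    """Fold change as a magnitude >= 1, so 0.25 (4-fold down) becomes 4.0."""
    return ratio if ratio >= 1.0 else 1.0 / ratio


def passes_noiseq(g: GeneStat, min_fc: float = 2.0, min_prob: float = 0.8) -> bool:
    """Fold change >= 2.00 and probability >= 0.8."""
    return (g.probability is not None
            and fold_magnitude(g.fold_change) >= min_fc
            and g.probability >= min_prob)


def passes_poisson(g: GeneStat, min_fc: float = 2.0, max_fdr: float = 0.001) -> bool:
    """Fold change >= 2.00 and FDR <= 0.001."""
    return (g.fdr is not None
            and fold_magnitude(g.fold_change) >= min_fc
            and g.fdr <= max_fdr)


def direction(g: GeneStat) -> str:
    """Label a gene as up- or down-regulated in EC relative to NEC."""
    return "up" if g.fold_change > 1.0 else "down"
```

How the calls from the two methods were combined into a final DEG list is not stated in the text, so the sketch keeps the two filters separate.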

qRT-PCR

Total RNA was isolated as described above, and first-strand cDNA was then synthesized with the StarScript II First-strand cDNA Synthesis Kit (Genstar, Beijing) according to the manufacturer’s protocol. Quantitative real-time PCR was performed on a Bio-Rad CFX96 instrument using diluted cDNA and SYBR® Premix Ex Taq™ II (Tli RNaseH Plus, TaKaRa) as previously described (Zhang et al. 2015). PCR was carried out in 96-well plates with the following program: 3 min at 94 °C for complete denaturation, followed by 40 cycles of 94 °C for 10 s, 55 °C for 30 s, and 72 °C for 10 s. The ACTIN gene was used as the reference gene. All experiments were performed with three independent biological replicates and three technical replicates. The primers for real-time PCR are listed in Table S1.
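
The text does not state how relative expression was calculated from the Ct values. A common choice when a single reference gene such as ACTIN is used is the 2^−ΔΔCt method. The Python sketch below assumes that method and 100% amplification efficiency; the function and argument names are hypothetical.

```python
from statistics import mean


def delta_ct(target_cts, reference_cts):
    """Mean Ct of the target gene minus mean Ct of the reference gene.

    Each argument is a list of technical-replicate Ct values.
    """
    return mean(target_cts) - mean(reference_cts)


def relative_expression(sample_target, sample_ref, control_target, control_ref):
    """Relative expression of a target gene in a sample versus a control.

    Assumes the 2^-ddCt method, i.e. the product doubles every cycle.
    """
    ddct = delta_ct(sample_target, sample_ref) - delta_ct(control_target, control_ref)
    return 2.0 ** (-ddct)
```

Under these assumptions, a target gene whose reference-normalized Ct is two cycles lower in EC than in NEC would be reported as 4-fold higher in EC.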

Quantification of endogenous hormone contents

Endogenous hormones in C. bungei EC and NEC were extracted and quantified as previously described (Chen et al. 2011, 2012), using a derivatization approach coupled with nano-LC–ESI–Q–TOF–MS analysis. The experiment was performed with three biological replicates for each sample.

Results

Transcriptome sequencing analyses

To overcome the low propagation rate of C. bungei, an important timber species, we established a somatic embryogenesis system for rapid plant regeneration. Using immature zygotic C. bungei embryos as explants, we sequentially observed the induction of calli, light yellow granular embryogenic callus, and globular, heart-, torpedo-, and cotyledonary-shaped somatic embryos. Throughout this process, induction of EC is a key step. To better understand it, we investigated the underlying molecular mechanisms by comparing the transcriptional profiles and endogenous plant hormone levels of NEC (Fig. 1a) and EC (Fig. 1b). Transcriptome sequencing was performed on an Illumina HiSeq™ 2000 instrument. A total of 27.13 GB of data was obtained, with Q20 of the clean reads above 96%. After trimming of low-quality regions, 73038 unigenes were assembled, totaling 98,906,132 bp with an average length of 1354 bp; 55573 unigenes (76.09%) were longer than 500 bp. The length distribution of the unigenes is shown in Figure S2. These transcripts were then searched against seven databases (NR, NT, SwissProt, KEGG, COG, InterPro, and GO) for annotation. As shown in Table 1, the proportions of unigenes annotated in these seven databases were 71.44%, 64.95%, 50.14%, 44.03%, 30.63%, 56.23%, and 7.86%, respectively. Meanwhile, 5813 SSRs were detected in 12904 unigenes (Figure S3A), and 2251 unigenes were predicted to encode transcription factors (Figure S3B).
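
The assembly figures reported above (number of unigenes, total length, mean length, and share of unigenes longer than 500 bp) are simple summaries of the unigene length list. A minimal Python sketch of how such a summary could be computed follows; the function name and the dictionary keys are hypothetical, and the input is assumed to be a plain list of unigene lengths in bp.

```python
def assembly_summary(lengths, min_len=500):
    """Summarize an assembly from a list of unigene lengths (bp).

    Returns the unigene count, total and mean length, and the
    percentage of unigenes strictly longer than min_len.
    """
    n = len(lengths)
    total = sum(lengths)
    n_long = sum(1 for x in lengths if x > min_len)
    return {
        "n_unigenes": n,
        "total_bp": total,
        "mean_bp": total / n,
        "n_long": n_long,
        "pct_long": 100.0 * n_long / n,
    }


# Consistency check on the reported values:
# 98,906,132 bp / 73038 unigenes gives a mean of about 1354 bp,
# and 55573 / 73038 is about 76.09%.
reported_mean = 98_906_132 / 73038
reported_pct_long = 100.0 * 55573 / 73038
```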

Table 1 The ratios of unigenes annotated in the seven databases

Pathway and GO term enrichment analyses

GO annotation was performed with the Blast2GO pipeline, and 27445 unigenes were assigned at least one GO term (Fig. 2a). Among the three GO categories, 11764 unigenes were annotated in biological process, led by metabolic process (3203) and cellular process (2840); 8590 in cellular component, led by cell (1934) and cell part (1934); and 7091 in molecular function, led by catalytic activity (3174) and binding (2989). KEGG annotation assigned 32167 unigenes, which were mostly enriched in metabolism, with 7454 in the global map, 2342 in carbohydrate metabolism, 1912 in lipid metabolism, 1521 in amino acid metabolism, and 1029 in biosynthesis of other secondary metabolites (Fig. 2b). Taken together, the annotated unigenes provide valuable information for investigating the specific processes, functions, and pathways involved in EC development and will facilitate further gene functional analysis in this non-model organism.

Fig. 2 GO and KEGG enrichment analyses of annotated unigenes. a Clustering of DEGs into the GO sub-categories. b Classification of DEGs into the sub-categories of KEGG pathways

GO Clustering and KEGG functional classification of DEGs

Using the NOIseq (Tarazona et al. 2011) and PoissonDis (Audic and Claverie 1997) methods, we identified 12310 reliable DEGs between EC and NEC from three independent biological experiments, of which 6994 were up-regulated and 5316 were down-regulated (Figure S4), suggesting that these genes might be involved in somatic embryogenesis in C. bungei. GO term enrichment analysis of these DEGs showed that 2483 DEGs were classified in biological process, mainly enriched in metabolic process and cellular process; 1628 in cellular component, mainly in cell and cell part; and 1464 in molecular function, mainly in binding and catalytic activity (Fig. 3a). Pathway analysis revealed that the DEGs between EC and NEC could be mapped to 125 biological pathways. Among them, 32 pathways were significantly enriched, including metabolic pathways, biosynthesis of secondary metabolites, plant hormone signal transduction, phenylpropanoid biosynthesis, glycerophospholipid metabolism, endocytosis, ether lipid metabolism, and phenylalanine metabolism (Fig. 3b). These results indicate that these pathways might function in EC induction.
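
The statistical test used to call a pathway significantly enriched is not stated here. A hypergeometric test is a common choice for this kind of analysis, and the Python sketch below assumes it. The function name and all argument names are hypothetical, and the exact integer arithmetic is meant for illustration rather than for speed on full-size gene sets.

```python
from math import comb


def hypergeom_enrichment_p(total_genes, pathway_genes, total_degs, pathway_degs):
    """One-sided hypergeometric p-value for pathway enrichment.

    Probability of seeing at least `pathway_degs` pathway members among
    `total_degs` genes drawn without replacement from `total_genes`
    annotated genes, of which `pathway_genes` belong to the pathway.
    """
    upper = min(pathway_genes, total_degs)
    denominator = comb(total_genes, total_degs)
    numerator = sum(
        comb(pathway_genes, k) * comb(total_genes - pathway_genes, total_degs - k)
        for k in range(pathway_degs, upper + 1)
    )
    return numerator / denominator
```

In practice the resulting p-values would also need a multiple-testing correction across the 125 mapped pathways; which correction, if any, was applied is likewise not specified in the text.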

Fig. 3 Classification of DEGs by GO (a) and pathway (b)

To validate the reliability of the RNA-seq data, six unigenes (Unigene5588, Unigene15766, Unigene21038, CL998.Contig10, Unigene10161, and Unigene2849) that belong to different functional categories or pathways and were differentially expressed between EC and NEC were randomly selected, and their expression was assessed by quantitative real-time PCR. Using ACTIN as the reference gene, we found that the qRT-PCR results showed the same trend as the RNA-Seq data, although the expression levels differed slightly (Table 2). These results support the reliability of the RNA-Seq data.

Table 2 Validation of selected gene expression by qRT-PCR

Analysis of DEGs involved in plant hormones pathways

To gain insight into the mechanisms of somatic embryo formation, we further analyzed the transcriptome data. KEGG functional classification revealed that the plant hormone signal transduction sub-category was significantly enriched (Fig. 4), together with hormone-related pathways such as tryptophan metabolism (auxin) and zeatin biosynthesis (cytokinin); these hormones have long been known to play important roles in somatic embryogenesis by regulating cell elongation and division (Skoog and Miller 1957; Dello Ioio et al. 2008). Further analysis of auxin signaling genes showed that 5, 1, 20, 15, 4, and 8 unigenes encoding putative AUX1, TIR1, ARF, AUX/IAA, GH3, and SAUR proteins, respectively, were up-regulated in EC, whereas 4, 0, 7, 6, 2, and 3 unigenes, respectively, were down-regulated (Fig. 4, Table S2). In the cytokinin signaling pathway, 7 unigenes encoding putative CRE1 were up-regulated and 13 were down-regulated; 2 AHP unigenes were up-regulated; 13 B-ARR unigenes were up-regulated and 10 were down-regulated; and 4 A-ARR unigenes were down-regulated (Fig. 4, Table S2). In the ABA signaling pathway, 3 unigenes encoding putative PYR/PYL were up-regulated and 9 were down-regulated; 15 PP2C unigenes were up-regulated and 6 were down-regulated; 1 SnRK unigene was down-regulated; and 21 ABF unigenes were up-regulated and 4 were down-regulated (Fig. 4, Table S2). Thus, genes involved in auxin, cytokinin, and ABA signaling are differentially expressed between EC and NEC, suggesting that auxin, cytokinin, and ABA might be substantially involved in EC formation in this tree species.
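
The per-family counts given above can be tabulated to compare the three pathways at a glance. The Python sketch below transcribes the counts from the text as (up, down) pairs. Where the text reports only one direction for a family (AHP, A-ARR, SnRK), the other direction is assumed here to be zero, which is an assumption and not something the text states. The variable and function names are hypothetical.

```python
# (up-regulated, down-regulated) unigene counts in EC relative to NEC,
# transcribed from the text; a missing direction is ASSUMED to be 0.
HORMONE_DEG_COUNTS = {
    "auxin": {
        "AUX1": (5, 4), "TIR1": (1, 0), "ARF": (20, 7),
        "AUX/IAA": (15, 6), "GH3": (4, 2), "SAUR": (8, 3),
    },
    "cytokinin": {
        "CRE1": (7, 13), "AHP": (2, 0), "B-ARR": (13, 10), "A-ARR": (0, 4),
    },
    "ABA": {
        "PYR/PYL": (3, 9), "PP2C": (15, 6), "SnRK": (0, 1), "ABF": (21, 4),
    },
}


def pathway_totals(counts):
    """Sum up- and down-regulated unigenes over all families in each pathway."""
    return {
        pathway: (
            sum(up for up, _ in families.values()),
            sum(down for _, down in families.values()),
        )
        for pathway, families in counts.items()
    }


totals = pathway_totals(HORMONE_DEG_COUNTS)
# totals == {"auxin": (53, 22), "cytokinin": (22, 27), "ABA": (39, 20)}
```

Under this tabulation, up-regulated unigenes outnumber down-regulated ones in the auxin and ABA pathways, whereas the cytokinin pathway shows slightly more down-regulated unigenes.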

Fig. 4 Changes of plant hormone signal transduction pathway between embryogenic callus and non-embryogenic callus. Red lines indicate up-regulated gene expression, and yellow lines indicate down-regulated gene expression. + p indicates phosphorylation, − p indicates dephosphorylation, and + u indicates ubiquitination

Then, we further analyzed the expression of several somatic embryogenesis-associated genes, previously reported as markers of somatic embryogenesis (Roberts et al. 1993; Kitamiya et al. 2000; Stasolla et al. 2004; Torii 2004). The transcriptome data showed that one SERK (SOMATIC EMBRYOGENESIS RECEPTOR-LIKE KINASE) gene, four LEA (LATE EMBRYOGENESIS ABUNDANT) genes, one AGP (ARABINOGALACTAN PROTEIN) gene, and seven HSP (HEAT SHOCK PROTEIN) genes were significantly up-regulated in EC (Table 3).

Table 3 Expression of somatic embryogenesis-related genes in EC and NEC of Catalpa bungei

Analysis of endogenous hormone levels of EC and NEC in C. bungei

Since plant hormones directly affect the induction of EC in C. bungei, we quantified the endogenous hormone contents of EC and NEC using a derivatization approach coupled with nano-LC–ESI–Q–TOF–MS analysis. We assayed IAA, ABA, and eight CK isoforms (cZ, DHZ, iPR, iP, iP9G, tZR, tZ, tZ9G) and analyzed the results from three independent biological experiments. iP, tZ, cZ, and DHZ were not detected in the samples (Table 4). The levels of iPR and tZR (precursors of iP and tZ), as well as of iP9G and tZ9G (cytokinin N9-glucosides, the inactive forms of iP and tZ), were very low, implying that cytokinin contents might be too low to be detected in the callus of C. bungei. However, IAA and ABA levels were dramatically higher in EC (18.2 ± 0.4 and 37.8 ± 1.4 ng g−1 F.W., respectively) than in NEC (1.2 ± 0.1 and 2.3 ± 0.1 ng g−1 F.W., respectively) (Table 4). Consistent with this, many genes involved in the IAA and ABA signaling pathways showed altered expression between EC and NEC (Table S2). These results imply that elevated IAA and ABA levels may be crucial for somatic embryogenesis in C. bungei.
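
To put the difference in hormone levels in numbers, the EC/NEC ratio can be computed directly from the mean values reported above. The short Python sketch below does this; the function name is hypothetical, and the ratios ignore the reported ± values because the text does not say whether they are standard deviations or standard errors.

```python
def fold_change(ec_mean, nec_mean):
    """Ratio of the mean hormone content in EC to that in NEC."""
    return ec_mean / nec_mean


# Mean contents in ng per g fresh weight, taken from the text.
iaa_fold = fold_change(18.2, 1.2)   # about 15-fold
aba_fold = fold_change(37.8, 2.3)   # about 16-fold
```

Both hormones are therefore roughly 15- to 16-fold more abundant in EC than in NEC.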

Table 4 Quantification of hormone contents in EC and NEC of C. bungei

Discussion

Catalpa bungei is one of the oldest fine timber and ornamental tree species in China. Because the species is self-incompatible, rapid propagation through somatic embryogenesis is of great commercial value. Previous research on somatic embryogenesis of C. bungei focused mainly on morphological, physiological, and biochemical aspects (Lin et al. 2010; Ren et al. 2011) and rarely addressed the molecular mechanism. Studying the molecular process will contribute to a more comprehensive understanding of the mechanism of somatic embryogenesis. In our previous study, we established a somatic embryogenesis and plant regeneration system in C. bungei. Here, we isolated the embryogenic callus (EC) and non-embryogenic callus (NEC), which have different developmental fates during somatic embryogenesis, and described their morphological features. Furthermore, we investigated their transcriptional profiles by RNA sequencing and identified 12310 DEGs. Meanwhile, the contents of various endogenous plant hormones were quantified by a derivatization approach coupled with nano-LC–ESI–Q–TOF–MS analysis.

To reveal the regulatory mechanism of EC formation, we first performed RNA sequencing to compare global transcriptome reprogramming. In total, we identified 12310 DEGs, with 6994 up-regulated and 5316 down-regulated (Figure S4). These DEGs might function in C. bungei callus differentiation during somatic embryogenesis. We therefore classified them by GO and KEGG analysis (Fig. 3a, b). GO term clustering showed that, of the three GO categories, biological process (2483 DEGs) contained the most DEGs, followed by cellular component (1628 DEGs) and molecular function (1464 DEGs). KEGG functional classification revealed that the DEGs were significantly enriched in 32 pathways, led by metabolic pathways, biosynthesis of secondary metabolites, and plant hormone signal transduction, indicating that these pathways might be involved in EC induction. In addition, this is the first study reporting transcriptome data for C. bungei. A total of 73038 unigenes were assembled, and 27445 unigenes were assigned at least one GO term. Thus, our results also provide a useful resource for further gene functional analysis in C. bungei.

A series of genes associated with cellular signal transduction, such as LEUCINE-RICH REPEAT CONTAINING RECEPTOR-LIKE KINASE (LRR-RLK), AGP, HSP, and LEA, play key roles during plant somatic embryogenesis (Karami et al. 2009). For instance, many studies have suggested that somatic embryogenesis receptor kinase (SERK), an LRR-RLK family member, is specifically expressed during somatic embryogenesis and can be considered a marker of it (Schmidt et al. 1997; Hecht et al. 2001; Torii 2004). DcSERK is expressed only in heart-stage somatic embryos (Schmidt et al. 1997), and CitSERK1 is expressed in embryogenic cells but not in non-embryogenic cells (Shimada et al. 2005). Moreover, AGP genes, which encode extracellular glycoproteins, play an important role in the generation and development of somatic embryos. Saare-Surminski et al. (2000) found that the development of somatic embryos appears to be associated with the presence of particular AGPs in Euphorbia pulcherrima. Portillo et al. (2012) demonstrated that the presence of AGPs is directly correlated with embryogenic capability, indicating a positive relationship between AGPs and somatic embryogenesis. The expression of specific AGPs has also been detected in somatic embryogenesis cultures of tobacco (Du et al. 1994, 1996), carrot (Kiyosue and Shinozaki 1995; Kreuger and van Holst 1995; Smallwood et al. 1996), and rice (Smallwood et al. 1996). In addition, HSPs are distinctively expressed in somatic embryos of carrot (Kitamiya et al. 2000), alfalfa (Fowler et al. 1998), white spruce (Dong and Dunstan 1996), sunflower (Coca et al. 1994), and cork oak (Puigderrajols et al. 2002). Likewise, late embryogenesis abundant (LEA) proteins are closely associated with somatic embryogenesis in cotton (Roberts et al. 1993), white spruce (Stasolla et al. 2004), and sweet orange (Pedrosa et al. 2015). In this study, our RNA-seq data showed that one SERK gene, one AGP gene, seven HSP genes, and four LEA genes were significantly up-regulated in EC compared with NEC (Table 3). These results not only support our identification of EC versus NEC but also point to common regulatory mechanisms underlying somatic embryo induction in C. bungei and other species. They may also help distinguish embryogenic genotypes from non-embryogenic ones. Using certain genes as markers of embryogenesis induction during C. bungei regeneration will provide new tools for assessing the embryogenic development conditions of this species and for further optimizing the methodology.

Plant hormones play key regulatory roles in plant somatic embryogenesis (Karami et al. 2009). Our results showed that auxin and cytokinin significantly affect callus formation as well as the induction of EC, suggesting that plant hormones are required for somatic embryogenesis in C. bungei. Consistent with this, KEGG functional classification revealed that the plant hormone signal transduction sub-category was significantly enriched (Fig. 4), together with hormone-related pathways such as tryptophan metabolism (auxin), zeatin biosynthesis (cytokinin), and carotenoid biosynthesis (ABA). These processes play a crucial role in regulating cell elongation and division and may thus be involved in somatic embryogenesis. To examine the possible involvement of phytohormones in somatic embryogenesis, we quantified the contents of these endogenous hormones in EC and NEC. Cytokinins were not reliably detected, probably because their contents in the callus were too low. However, IAA and ABA were clearly increased in EC compared with NEC (Table 4). Furthermore, the transcriptome data suggested that the expression of downstream genes involved in signaling by these hormones varied extensively between EC and NEC. Interestingly, ABA content was significantly elevated in the EC of C. bungei. Similar results were observed in Hevea brasiliensis (Laurent et al. 1997), Picea glauca (Dong et al. 1997), and Picea asperata (Jing et al. 2017). However, the effects of exogenous ABA application on somatic embryogenesis differ between species and experiments, possibly owing to differences in concentration, duration, and interaction with other plant growth regulators. Thus, the endogenous ABA content may be a better indicator of its role in somatic embryogenesis.

Revealing the mechanisms and cellular signals of somatic embryogenesis is of great importance for basic research in plant embryology as well as for plant improvement and propagation. Here, we studied the changes in the transcriptome and in plant hormone contents associated with the induction of C. bungei somatic embryos. These data extend our understanding of the initial stages of somatic embryogenesis in this tree species, especially the dynamic changes in endogenous hormone levels and the potential key regulatory genes. Modulating these key elements may promote somatic embryogenesis and thus facilitate rapid plant regeneration through improved culture conditions.