Introduction

Somatic embryogenesis (SE) is an important developmental process in plant regeneration that includes the development and maturation of embryos from embryogenic callus. SE is regarded as a powerful tool to study the morphological, physiological, molecular and biochemical events occurring during the onset of embryogenesis and embryo development in higher plants (Yang and Zhang 2010).

The complex physiological and molecular processes during SE result from a program of gene expression. Disturbed regulation in the early stages of embryogenesis causes developmental arrest or aberrations at subsequent stages of embryonic and postembryonic development (Heckel et al. 1999; Aquea and Arce-Johnson 2008). Recently, efforts have been made to describe SE at the molecular level in white spruce (van Zyl et al. 2003; Stasolla et al. 2004), Cyclamen persicum (Rensing et al. 2005), cotton (Zeng et al. 2006), maize (Che et al. 2006), Brassica napus (Tsuwamoto et al. 2007), potato (Sharma et al. 2008), and Elaeis guineensis (Lin et al. 2009). However, metabolism and regulation during early SE are poorly understood, and no pathways specific to SE have been identified.

The isolation of embryo-defective mutants and molecular studies on specific genes have been used to study embryogenesis and seed development in Arabidopsis. Tzafrir et al. (2004) described an initial dataset of 250 embryo (EMB) genes required for normal embryo development, which were enriched for basal cellular functions but deficient in transcription factors and signaling components. A regulatory network linking the four major regulators of seed maturation, i.e. ABI3, FUS3, LEC1, and LEC2, was uncovered by examining their expression in single, double, and triple mutants (To et al. 2006). In M. truncatula, important genes for SE, such as MtSERK1 (Nolan et al. 2003, 2009), MtSK1 (Nolan et al. 2006) and MtSERF1 (Mantiri et al. 2008), were identified using the highly embryogenic mutant Jemalong 2HA and its progenitor Jemalong; Imin et al. (2004, 2005) applied a proteomic approach to detect and identify developmentally regulated and differentially expressed proteins during SE. Collectively, molecular studies of zygotic and somatic embryogenesis indicate that embryogenesis is a complex process regulated by many different genes, and dissecting this network is a major challenge.

For citrus, an economically important perennial woody fruit crop, SE is critical for in vitro germplasm conservation and genetic improvement via biotechnological approaches such as transformation, protoplast fusion and somatic mutation breeding (Guo et al. 2007a, b). In contrast to other plant species, citrus somatic embryogenesis can be induced without plant growth hormones, and a large number of embryos can form rapidly in MT (Murashige and Tucker 1969) culture medium with glycerol rather than sucrose as the carbon source (Kayim and Koc 2006). However, the SE capacity of Citrus species varies among cultivars, and long-term subculture is generally accompanied by a gradual loss of embryogenic capacity. Isolating genes specifically expressed in the SE process is an effective approach to understanding the inherent mechanism of different embryogenic competence among citrus cultivars. In the past, factors related to somatic embryo induction in citrus were studied in our laboratory at the morphological, physiological, molecular and proteomic levels (Hao and Deng 2002; Zhang et al. 2006; Pan et al. 2009; Wu et al. 2009; Ge et al. 2010). To date, studies of citrus SE have focused only on individual genes, such as CitSERK1 (Shimada et al. 2005) and CsHPt1 (Maul et al. 2006). A comprehensive investigation of global transcription in SE is lacking. In an attempt to understand the mechanism of embryogenic competence acquisition and of embryo induction and development in citrus, an embryogenic callus line of Citrus sinensis cv. ‘Valencia’ sweet orange, which still maintains strong SE capacity after preservation in vitro for over 28 years, was used as the experimental material. A transcriptome study was conducted among non-embryogenic calluses, embryogenic calluses and somatic-embryo-induced calluses by applying suppression subtractive hybridization (SSH) followed by cDNA microarray analysis. Interpretation of potential unisequences in this study provided new insight into the molecular processes in SE of citrus and other woody fruit trees.

Materials and methods

Plant material and growth condition

Embryogenic callus (EC) of Citrus sinensis cv. ‘Valencia’ induced in 1982 from undeveloped ovules was kindly provided by Prof. Jude Grosser (Citrus Research & Education Centre, University of Florida, Lake Alfred, FL, USA) and preserved in vitro at Huazhong Agricultural University. The callus line was subcultured every 20 days on MT (Murashige and Tucker 1969) basal medium. To obtain the non-embryogenic callus (NEC) line, epicotyl segments of 4-week-old germinated seedlings of C. sinensis cv. ‘Valencia’ were cultured on MT basal medium supplemented with 1.5 mg/l 2,4-dichlorophenoxyacetic acid (2,4-D), 1.0 mg/l naphthaleneacetic acid (NAA), 0.15 mg/l kinetin (KT), and 0.1 mg/l zeatin (ZT) in the dark at 23 °C. The established callus line was subcultured monthly in the same culture medium but with a decreased 2,4-D concentration (1.0 mg/l) (Deng 1987). After 3 months of culture, rapidly growing NEC was used for further experiments.

Embryogenic callus was cultured in MT liquid medium for 45 days to obtain synchronous cultures, and then moved to induction medium (MT medium containing 2 % (v/v) glycerol as carbon source instead of sucrose) (Liu and Deng 2002). EC induced for 7 days (IEC1), 14 days (IEC2), 21 days (IEC3) and 28 days (IEC4), and the subsequently formed globular embryos (GE) and cotyledonary embryos (CE), were collected and sampled. In addition, calluses of seven citrus cultivars with different embryogenic capacity were cultured for 45 days in MT liquid medium. These cultivars are: C. sinensis cv. Valencia (V), C. sinensis cv. Anliucheng (AL), C. sinensis cv. Newhall (NH), C. sinensis cv. Jingcheng (JC), C. sinensis cv. Bingtangcheng (BT), C. sinensis cv. Olinda (O) and C. unshiu cv. Guoqing No.1 (G1).

Morphological, histological and ultrastructural analysis of samples from certain SE stages

For histological observation, samples (NEC, EC, IEC2, IEC4, GE and CE) (Fig. 1) were fixed with formalin–acetic–alcohol (FAA) overnight (or longer), then dehydrated in an ethanol series and embedded in paraffin. Sections approximately 10 μm thick were cut using a Thermo microtome (Thermo-Fisher). Sections were double stained with periodic acid-Schiff (PAS) and naphthol blue-black (NBB) according to the method described by Fisher (1968). PAS stained starch reserves and cell walls red, and NBB specifically stained soluble or reserve proteins blue-black. Pictures were taken using a BX61 microscope (Olympus).

Fig. 1

Morphological, histological and ultrastructural observations of non-embryogenic callus, embryogenic callus and induced embryogenic callus of ‘Valencia’. Morphological observation: a non-embryogenic callus (NEC), b embryogenic callus (EC), c embryogenic callus induced for 2 weeks (IEC2), d embryogenic callus induced for 4 weeks (IEC4), e somatic embryos. Histological observation: f NEC with large vacuoles, stained red by PAS, g EC with small, compactly arranged cells, the outer-layer cells stained blue by NBB (arrow), h IEC2 with embryogenic nodular structures, i IEC4 with visible globular embryos, j GE and CE. Ultrastructural observation: k NEC cells with a mature vacuole but no obvious nucleus, l EC with a large nucleus and amyloplasts, m IEC2 with a large nucleolus and several small vacuoles, n IEC4 with a large nucleolus and vacuole, o EC with abundant protein bodies. Scale bars 1.5 mm (a); 2 mm (b); 3 mm (c); 2 mm (d); 200 μm (e); 600 μm (f–i); 300 μm (j); 2 μm (k–n); 1 μm (o). Am amyloplast, N nucleus, Nu nucleolus, V vacuole, P protein body

For ultrastructural analysis, the surface cells of the calluses were used because the cells in the outer layer of calluses with higher metabolic activity could form embryos easily, while the development of interior cells was relatively late. Samples were prefixed in a solution of 2.5 % glutaraldehyde adjusted to pH 7.4 with 0.1 M phosphate buffer, fixed in 2 % OsO4 in the same buffer, and dehydrated and embedded in epoxy resin and SPI-812 in sequence. Ultrathin sections obtained with a Leica UC6 ultramicrotome were stained with uranyl acetate and subsequently with lead citrate. The observations and recording of images were performed with a Hitachi H-7650 transmission electron microscope at 80 kV and a Gatan 832 CCD camera.

Experimental design and tissue sampling

For constructing the SSH libraries, approximately 3-month-old NEC and EC cultured for 45 days in MT liquid medium were collected. Because SE in citrus is a long process that takes at least 2 months, IEC1, IEC2, IEC3 and IEC4 and the subsequently formed GE and CE were combined into one pooled sample (IEC) in order to capture all potentially useful information. For microarray analysis, cultures at six typical stages (NEC, EC, IEC2, IEC4, GE and CE) were sampled and three separate biologically replicated pools were processed (Fig. 2). Total RNA was extracted using a modified Trizol extraction protocol (Liu et al. 2006). The mRNA used to construct the libraries was purified using Oligotex mRNA Spin-Column Kits (Qiagen) according to the manufacturer’s instructions.

Fig. 2

Schematic representation of samples used for SSH analysis and microarray design. For constructing SSH libraries, NEC, EC and IEC (IEC1, IEC2, IEC3, IEC4, GE and CE as one pool test) were used. For microarray analysis, cultures at six typical stages (NEC, EC, IEC2, IEC4, GE and CE) were sampled and three separate biologically replicated pools were processed

Construction of subtracted cDNA library and amplification of cDNA inserts

One SSH cDNA library was constructed using EC as the tester and NEC as the driver. The other two libraries were constructed reciprocally from EC and an induced embryogenic callus pool (IEC) containing transition-stage calluses (IEC1–4) and transition-stage embryos (GE and CE): one with IEC as the tester and EC as the driver, and one with EC as the tester and IEC as the driver. PCR-selected cDNA subtraction was performed using the Clontech PCR-Select cDNA Subtraction Kit (Clontech) according to the manufacturer’s instructions. The tester and driver cDNAs were reverse transcribed from 2 μg mRNA, digested with RsaI, and then ligated to different adaptors. Hybridization and PCR amplifications were performed to enrich the differentially expressed sequences according to the manufacturer’s recommendations. The subtracted cDNAs were inserted directly into the T/A cloning vector pMD18-T (Takara) and transformed into ElectroMAX™ DH5α-E™ cells (Invitrogen) to produce the subtracted cDNA libraries. The transformed colonies were randomly picked and grown in LB (Luria–Bertani) liquid medium with ampicillin (100 μg ml⁻¹) at 37 °C in an incubator for 16 h. Glycerol (4.4 % final; Amresco, Solon, OH, USA) was added for storage at −80 °C.

cDNA clones were amplified, and the PCR products were precipitated with anhydrous ethanol–sodium acetate (25:1, v/v) and resuspended in 15 μl sterile water. Yield and quality were determined spectrophotometrically (Nanodrop 1000, Thermo, Germany) at wavelengths of 230, 260 and 280 nm. The products were then run on a 1.2 % agarose gel and examined by Bio-Rad UV spectroscopy (Bio-Rad, USA) to confirm that each came from a single clone, and finally stored at −80 °C.

Microarray slides fabrication and preparation of fluorescent dye-labelled cDNA

Fifty microlitres of each PCR product was precipitated by adding 100 μl of anhydrous ethanol, dissolved in EasyArray™ spotting solution (CapitalBio Corp, Beijing, China) at a final concentration of 0.1–0.5 μg μl⁻¹, and then printed on amino-silaned glass slides with a SmartArrayer microarrayer (CapitalBio). Each clone was printed twice. After printing, the slides were baked for 1 h at 80 °C and stored dry at room temperature until use. Prior to hybridization, the slides were rehydrated over 65 °C for 10 s, snap dried on a 100 °C heating block for 5 s, and UV cross-linked at 250 mJ cm⁻². The unimmobilized PCR products were washed off with 0.5 % SDS for 15 min at room temperature, and SDS was removed by dipping the slides in anhydrous ethanol for 30 s. The slides were spun dry. Eight sequences derived from intergenic regions of the yeast genome, showing no significant homology to any existing citrus sequence, were spotted multiple times onto the microarray as exogenous controls. Total citrus RNA was spiked with a mixture of these exogenous control RNAs to validate the semi-quantitative microarray results.

The relative gene expression profiles of somatic embryogenesis at six different developmental stages, i.e. NEC, EC, IEC2, IEC4, GE and CE were investigated by microarray analysis. An aliquot of 5 μg total RNA was used to produce Cy5/Cy3-labelled cDNA employing an RNA amplification combined with Klenow enzyme labeling strategy.

DNA microarray hybridization and normalization

The microarray consisted of 12,576 amplicons obtained from three SSH libraries. In each comparison, two-color (Cy3 and Cy5) hybridizations were conducted with three biological replicates and two technical replicates. Cy5/Cy3-labelled cDNA was hybridized with the microarray at 42 °C overnight. After that, the arrays were washed with 0.2 % SDS, 2× SSC at 42 °C for 5 min, and then with 0.2 % SSC for 5 min at room temperature. Arrays were scanned with a confocal LuxScan scanner and the resulting images were analyzed with LuxScan 3.0 software (CapitalBio). Spots with fewer than 50 % of the signal pixels exceeding the local background value plus two standard deviations of the local background in both channels (Cy3 and Cy5) were removed. A spatial and intensity dependent (LOWESS) normalization method was employed (Yang et al. 2002). Differentially expressed genes were identified using a t test, and multiple test corrections were performed using the false discovery rate (FDR) (Benjamini and Hochberg 1995). Genes with FDR <0.01 and a fold change >2 were identified as differentially expressed genes.
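The selection rule stated above (FDR <0.01 and fold change >2) can be illustrated with a minimal sketch. It assumes that per-spot t-test p-values and normalized log2 ratios are already available, and that the fold-change threshold is applied symmetrically to up- and down-regulation; the function names are hypothetical and this is not the software used in the study.

```python
import math

def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (FDR), in input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

def select_differential(log2_ratios, p_values, fdr_cutoff=0.01, fold_cutoff=2.0):
    """Indices of spots with FDR < fdr_cutoff and |fold change| > fold_cutoff.

    Assumption: the fold-change cutoff is symmetric (up or down).
    """
    fdr = benjamini_hochberg(p_values)
    min_abs_log2 = math.log2(fold_cutoff)
    return [i for i, (ratio, q) in enumerate(zip(log2_ratios, fdr))
            if q < fdr_cutoff and abs(ratio) > min_abs_log2]
```

Working on the log2 scale keeps the up- and down-regulated thresholds symmetric, so a twofold cutoff becomes |log2 ratio| > 1.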

Annotation and GO functional categorization

Sequence information for the differentially expressed clones originating from the three SSH cDNA libraries was obtained using standard high-throughput sequencing by BGI-Wuhan, China. The software SeqClean was used to perform vector removal, poly (A) removal, trimming of low-quality segments at the 5′ and 3′ ends, and cleaning of low-complexity regions. Read assembly was performed with the CAP3 program (Huang and Madan 1999), using the read quality and default parameters.

The unisequences were searched against the TAIR protein database (version 8) using BLASTX with an E value cutoff of 10⁻⁵ (Altschul et al. 1997), and GO annotations were retrieved according to the gene ontology of The Arabidopsis Information Resource (http://www.arabidopsis.org/tools/bulk/go/index.jsp). The GO annotation results were plotted by biological process and molecular function by submitting the annotation numbers to the Web Gene Ontology Annotation Plot (WEGO) (http://wego.genomics.org.cn/cgi-bin/wego/index.pl).
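As an illustration of the E value cutoff, the sketch below keeps the best significant hit for each unisequence. It assumes that the BLASTX results were exported in the standard 12-column tab-separated layout (subject ID in column 2, E value in column 11), which the text does not specify; the function name and the identifiers in the usage example are hypothetical.

```python
def best_significant_hits(tabular_lines, max_evalue=1e-5):
    """Keep, per query, the lowest-E-value hit with E value below max_evalue.

    Assumption: each line has 12 tab-separated BLAST tabular columns,
    with the subject ID in column 2 and the E value in column 11.
    """
    best = {}
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank and comment lines
        fields = line.split("\t")
        query, subject, evalue = fields[0], fields[1], float(fields[10])
        if evalue >= max_evalue:
            continue  # not significant under the stated cutoff
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return best

# Hypothetical example rows (not data from this study).
example = [
    "UNISEQ_A\tSUBJECT_1\t80.0\t100\t20\t0\t1\t300\t1\t100\t2e-30\t120",
    "UNISEQ_A\tSUBJECT_2\t60.0\t100\t40\t0\t1\t300\t1\t100\t1e-12\t70",
    "UNISEQ_B\tSUBJECT_3\t30.0\t50\t35\t0\t1\t150\t1\t50\t0.5\t20",
]
hits = best_significant_hits(example)
```

Here UNISEQ_B has no hit below the cutoff, so it would fall into the "no BLAST hits" group.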

Analysis of differentially expressed genes by qRT-PCR and RT-PCR

Quantitative real-time PCR (qRT-PCR) was performed with the ABI 7500 Real Time System (PE Applied Biosystems). Primers for the actin gene, used as the endogenous control, were as described by Liu et al. (2009). The target gene and actin gene were diluted in the SYBR Green PCR Master Mix (PE Applied Biosystems) and 10 μl of the reaction mix was added to each well. Reactions were performed with an initial incubation at 50 °C for 2 min and at 95 °C for 1 min, followed by 40 cycles of 95 °C for 15 s and 60 °C for 1 min. qRT-PCR was performed in two biological replicates and three technical replicates for each sample. Output data generated by the instrument on-board software Sequence Detector Version 1.3.1 (PE Applied Biosystems) were transferred to a custom-designed Microsoft Excel macro for analysis; the data are presented as mean ± SD (n = 3).
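The calculation performed by the Excel macro is not described. One common way to obtain relative expression from qRT-PCR data normalized to a reference gene is the 2^−ΔΔCt method, sketched here purely as an assumption. The function name, the choice of calibrator sample and the pairing of technical replicates are hypothetical.

```python
from statistics import mean, stdev

def relative_expression(ct_target, ct_reference, ct_target_cal, ct_reference_cal):
    """Relative expression by 2^-ddCt; returns (mean, sample SD).

    ct_target / ct_reference: Ct values of the target and reference
    (e.g. actin) gene in the sample, one per technical replicate.
    ct_target_cal / ct_reference_cal: the same for the calibrator sample.
    Assumption: replicates are paired by position in the lists.
    """
    # Mean dCt of the calibrator sample.
    d_ct_cal = mean([t - r for t, r in zip(ct_target_cal, ct_reference_cal)])
    # Fold change of each sample replicate relative to the calibrator.
    folds = [2 ** -((t - r) - d_ct_cal) for t, r in zip(ct_target, ct_reference)]
    return mean(folds), stdev(folds)
```

With three technical replicates, the returned pair corresponds to a mean ± SD (n = 3) summary of the kind reported in the text.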

Semi-quantitative RT-PCR was performed in 20 μl reactions. As a control, the actin transcripts were analyzed. The general program started with a denaturation step at 94 °C for 5 min, followed by 22 cycles of 94 °C for 30 s, 60 °C for 30 s and 72 °C for 30 s, and a final extension at 72 °C for 5 min. The PCR products (10 μl) for each sample were then electrophoresed in a 1.8 % agarose gel stained with ethidium bromide and viewed under ultraviolet light.

Results

Morphological, histological and ultrastructural analysis of SE cultures used for SSH libraries

Like zygotic embryos, somatic embryos go through a series of distinct morphological developmental stages. Under the growth conditions used, the cell arrangements of NEC and EC differed markedly. NEC growing on medium supplemented with plant growth regulators (PGRs) had a higher water content and a smoother texture (Fig. 1a) than EC cultured in MT liquid medium, and its polysaccharide cell walls were stained red by PAS (Fig. 1f). Compared with the loosely organized cells of NEC (Fig. 1f), EC had compactly arranged cells (Fig. 1g). Cells in the outer layers of EC were small, with dense cytoplasm rich in soluble proteins stained blue by NBB (Fig. 1g). The EC cultured on induction medium became yellow (Fig. 1c) during the IEC2 stage, and embryogenic nodular structures could be observed by histological analysis (Fig. 1h); at the IEC4 stage, visible globular embryos formed (Fig. 1d, i). Compared with NEC (Fig. 1k), ultrastructural observation of EC showed small cells with a larger nucleus, an obvious nucleolus and amyloplasts (Fig. 1l). During the IEC2 stage, cells contained a larger nucleolus and rich nucleoplasm with high electron density (Fig. 1m), and at the IEC4 stage, cells became mature with a large vacuole in the cell center (Fig. 1n). Moreover, more protein bodies were found in EC (Fig. 1o) than in NEC, IEC2 or IEC4. For EC cultured on induction medium, embryogenic cells could develop into globular and cotyledonary embryos (Fig. 1e, j). The cells in the outer layer of calluses, which were stained blue by NBB (Fig. 1g, h), were observed to develop into embryos easily. Based on the above characteristics, NEC, EC, IEC2, IEC4, GE and CE were collected for the construction of three SSH libraries as described in “Materials and methods”.

Analysis of EST data and functional annotation

The microarray consisted of 12,576 amplicons obtained from three SSH libraries. Genes with FDR <0.01 and a fold change >2 were identified as differentially expressed genes. The number of amplicons in the SSH libraries significantly up- or down-regulated in SE is shown in Supplementary Fig. S1. We sequenced 3,700 significantly up- or down-regulated amplicons and finally obtained 3,263 high-quality ESTs. The entire EST dataset was deposited in NCBI under accession numbers HS085836 to HS089098. Clustering the three libraries with an overlap percent identity cutoff of 80 % yielded 988 sequences with an average length of 526 bases and an average GC content of 45.43 % (the assembly results of the libraries are provided in Supplementary Table S1). In total, 880 unisequences were selected; among them, 386, 285 and 158 differentially expressed unisequences were uniquely detected in the EC versus NEC, IEC versus EC and EC versus IEC SSH libraries, respectively. Moreover, thirty-three genes overlapped between the EC versus NEC and IEC versus EC libraries, and thirty-four genes were shared between EC versus NEC and EC versus IEC (Fig. 3) (more detailed information is shown in Supplementary Tables S2 and S3).
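Summary statistics such as the average length and GC content of assembled sequences can be computed as in the following minimal sketch. It assumes that the GC percentage is pooled over all bases rather than averaged per sequence, which the text does not specify; the function name is hypothetical.

```python
def assembly_summary(sequences):
    """Return (mean length, pooled GC percentage) for nucleotide strings."""
    total_len = sum(len(s) for s in sequences)
    # Count G and C bases case-insensitively across all sequences.
    gc = sum(s.upper().count("G") + s.upper().count("C") for s in sequences)
    return total_len / len(sequences), 100.0 * gc / total_len
```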

Fig. 3

Venn diagrams showing differentially expressed genes in SSH EC versus NEC, IEC versus EC and EC versus IEC libraries (for gene identity see Supplementary Table S1)

The EST unisequences obtained were annotated on the basis of existing annotation available for the protein database of Arabidopsis. The BLASTX program was used to screen the TAIR protein database (version 8) with the unisequences as queries, and hits with an E value less than 10⁻⁵ were considered significant (Altschul et al. 1997). Unisequences annotated as unclassified and those with no BLAST hits were excluded from further analysis. According to the putative function of their homologous genes, the main molecular function classes were binding (29.1 %), oxidoreductase activity (12.6 %), structural molecule activity (8.4 %), transporter activity (4 %) and transcription factor activity (3.8 %); smaller but notable proportions were electron carrier activity (1.7 %), translation regulator activity (1.3 %), transcription activator activity (0.6 %), and signal transducer activity (0.6 %) (Fig. 4a). Among biological processes, the largest groups of genes were involved in response to stimulus and stress (26.4 %) and in multicellular organismal process and development (17.3 %). Genes involved in embryonic and post-embryonic development (7.2 %), as well as localization (7 %), were also frequently found (Fig. 4b).

Fig. 4

Gene ontology categorization based on molecular function (a) and biological process (b) of transcripts which showed differential expression patterns during somatic embryogenesis

Validation of microarray transcript profiling by qRT-PCR

To confirm the reproducibility of the microarray hybridization and data quality, ten unisequences were selected (Supplementary Table S4) and subjected to qRT-PCR for expression profile analysis. We examined the differentially expressed genes at four stages (NEC, EC, GE, and CE) of SE. There was good correspondence (R² = 0.762) between the log2 ratios from qRT-PCR and microarray results (Supplementary Fig. S1), indicating the overall reliability of the microarray expression data.
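The agreement between the two platforms is summarized by R². A minimal sketch of that statistic, taken here as the squared Pearson correlation between paired log2 ratios, is shown below; the function name is hypothetical and the values in the usage line are illustrative, not data from this study.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)

# Illustrative log2 ratios only (microarray vs. qRT-PCR).
example_r2 = r_squared([1.0, 2.0, 3.0], [1.0, 3.0, 2.0])
```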

Cluster analysis of gene expression patterns at certain SE stages

To classify the patterns of differentially expressed genes during SE, nine clusters were identified using K-means clustering (KMC) (Fig. 5). The expression of genes at the NEC stage was used as the control. Genes were classified into four groups based on the clustering results. The first group contained genes up-regulated in EC and the subsequent induction stages (cluster 1). The second group of genes was up-regulated over the EC to CE stages but expressed at a low level at the EC stage (clusters 2 and 3). The third group consisted of genes up-regulated from EC to IEC4, down-regulated at the subsequent GE stage and up-regulated again at the CE stage (clusters 4, 5 and 6). The fourth group contained genes down-regulated during the SE process (clusters 7, 8 and 9). Genes identified from the clusters are shown in detail in Supplementary Table S5. On the basis of the gene expression profiles, an overview of major physiological and biosynthesis processes at different developmental stages during SE in citrus was proposed (Fig. 6).
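For readers unfamiliar with KMC, the sketch below shows the basic K-means (Lloyd) iteration applied to expression profiles, each profile being a list of log2 ratios relative to NEC across stages. It is only an illustration under stated assumptions: the software, distance measure and initialization used in the study are not given, so squared Euclidean distance and seeding with the first k profiles are choices made here, and the function name is hypothetical.

```python
def kmeans(profiles, k, max_iter=100):
    """Lloyd's K-means on equal-length numeric profiles.

    Centroids are seeded with the first k profiles so the run is
    deterministic (an assumption of this sketch).
    Returns (labels, centroids).
    """
    centroids = [list(p) for p in profiles[:k]]
    labels = [None] * len(profiles)
    for _ in range(max_iter):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in profiles
        ]
        if new_labels == labels:
            break  # assignments stopped changing: converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids
```

In practice the result depends on the initial centroids, so repeated runs with different seeds are usually compared before the clusters are interpreted.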

Fig. 5

Schematic description of the nine patterns based cluster analysis. The differentially expressed unisequences were classified into nine clusters using KMC. The mean signal ratio (in log2 ratio scale) for unisequences in cluster was plotted on the y axis. The NEC stage was used as the control point; the ratios of the EC, GE and CE stage samples were on the x axis from left to right

Fig. 6

Overview of major physiological and biosynthesis processes at different developmental stages of SE in Valencia sweet orange obtained from the GO and KEGG data

Differentially expressed genes in EC cells and in embryo induction and formation stages

To understand the molecular mechanism of embryogenic competence acquisition, we examined the differentially expressed genes in EC cells. Several differentially expressed transcription factors were further characterized, such as LEC1/B3 domain regulatory network genes, e.g. Leafy cotyledon 1 (LEC1) (CS_P030_E_05), and zinc-finger family genes (CS_P016_C_07, CS_P032_B_04), and especially three kinds of transcription factors, i.e. homeobox-leucine zipper family genes (HB) (CS_P009_F_12, CS_P062_F_06), a B3 domain containing gene (CS_P006_E_03) and a tubby like protein 8 (CS_P044_H_07). Moreover, genes encoding fibrillarin 1 (CS_P013_C_11), Arabidopsis thaliana seed gene 1 (CS_P131_A_07) and an F-box family protein (CS_P006_F_01) were also highly expressed in EC cells. Some of the differentially expressed genes with no homology to any database entry are also shown in Table 1.

Table 1 Expression profiles of certain genes of interest during citrus SE

The expression of a series of genes was further investigated by qRT-PCR (Fig. 7). Among the genes associated with embryo induction and formation, the first group was transcription factors, such as the B3 domain regulatory network genes FUSCA 3 (FUS3) (CS_P068_D_11) and ABA INSENSITIVE 3 (ABI3) (CS_P072_C_08), and genes encoding a basic leucine zipper transcription factor (CS_P135_E_05) and indoleacetic acid-induced proteins (CS_P142_A_02, CS_P107_E_08), which were up-regulated during the embryo induction and formation stages. The second group was transporter and channel genes, such as transport family proteins (CS_P055_G_11, CS_P056_B_12), water channel genes (CS_P034_A_12, CS_P038_E_09, CS_P075_H_07, CS_P099_A_03, CS_P122_D_10) and a calcium channel (CS_P102_C_01), which were up-regulated in EC or transition-stage embryos. In addition, the expression of a histidine-containing phosphotransmitter 3 (CS_P088_H_07) increased during initiation of SE. We also found other genes up-regulated during SE, encoding embryo defective 1644 (CS_P035_A_03), early methionine labeled (CS_P094_B_07) and a rubber elongation factor (REF) protein-related protein (CS_P136_D_07). Interestingly, two copper/zinc superoxide dismutase (CSD) genes (CS_P072_H_04 and CS_P090_E_04) were identified in our SSH libraries (CSD1 and CSD2); the expression of CSD1 was up-regulated more than threefold in embryo transition stages, whereas the transcript of CSD2 was up-regulated in EC and down-regulated in embryo transition stages.

Fig. 7

Relative expression levels of transcription factor genes in six developmental stages during SE by qRT-PCR. Six stages are: non-embryogenic callus (NEC), embryogenic callus (EC), embryogenic callus induced for 2 weeks (IEC2), embryogenic callus induced for 4 weeks (IEC4), globular embryo (GE), and cotyledonary embryo (CE). Gene expression was expressed as the mean and standard error calculated over biological replicates (n = 2) and technical replicates (n = 3)

Metabolic pathways involved in formation and development of embryos in citrus

Eighty-one metabolic pathways were found to change during SE formation and development. The four most altered pathways closely related to SE were the glutathione pathway, amino sugar and nucleotide sugar metabolism, chlorophyll and photosynthesis, and starch and sucrose metabolism (Table 2). Five genes in the glutathione pathway, encoding 1-cysteine peroxiredoxin (CS_P013_D_12), glutathione peroxidase 2 (CS_P027_A_01), glutathione peroxidase 6 (CS_P086_D_03), glutathione S-transferase (CS_P083_E_08), and glutathione transferase (CS_P087_C_12), were up-regulated in the embryo transition stages, while nine genes involved in amino sugar and nucleotide sugar metabolism were regulated during embryo-induced stages, including up-regulated UTP-glucose-1-phosphate uridylyltransferase (CS_P059_B_11), UDP-D-galactose (CS_P103_H_08), beta-xylosidase (CS_P086_A_08, CS_P108_B_11) and chitinase (CS_P091_D_07, CS_P094_F_06, CS_P100_E_06), and down-regulated UDP-glucose 6-dehydrogenase (CS_P037_E_02) and POM-POM1 (CS_P094_C_10). Several genes encoding key enzymes in chlorophyll and photosynthesis metabolism were significantly altered, among which chlorophyll-chlorophyllido hydrolase 2 (CS_P038_A_07), ATP synthase d chain, mitochondrial (CS_P048_E_09) and ferritin 1 (CS_P098_G_03) were up-regulated, and glutamyl-tRNA synthetase (CS_P088_F_02) was down-regulated.

Table 2 List of differentially expressed genes involved in metabolic pathways associated with embryo formation according to the KEGG pathway database (KEGG Kyoto Encyclopedia of Genes and Genomes; http://www.genome.jp/kegg/pathway.html)

Expression dynamics of SE associated genes in seven citrus cultivars with different embryogenic competence

Isolating genes specifically expressed in the SE process is an effective approach to understanding the inherent mechanism of different embryogenic competence among citrus cultivars. Based on the preceding characterization of gene expression in the SE process of Valencia, eight genes were selected and their expression patterns were examined in the callus lines of seven cultivars with different embryogenic competence, i.e. calluses of C. sinensis cv. Valencia (V), C. sinensis cv. Anliucheng (AL), C. sinensis cv. Newhall (NH), and C. sinensis cv. Jingcheng (JC), which have high somatic embryogenic capacity, and C. sinensis cv. Bingtangcheng (BT), C. sinensis cv. Olinda (O), and C. unshiu cv. Guoqing No.1 (G1), which are recalcitrant to forming somatic embryos under the present culture conditions (Hao and Deng 2002; Zhang et al. 2006). The expression of most genes was higher in cultivars with high embryogenic competence than in recalcitrant cultivars (and significantly low in G1), except that the L1L gene was also expressed at a high level in calluses of BT and O (Fig. 8). Up-regulated expression of LEC1 and FUS3 was detected in calluses of NH, V, AL and JC, while low expression was detected in calluses of BT, O and G1; similar transcription dynamics were found for the B3 domain containing gene, HB4 (homeobox-leucine zipper protein) and TLP (tubby like protein). These results suggest that LEC1, FUS3, the B3 domain containing gene, HB4 and TLP are closely correlated with SE capacity. Moreover, a considerable transcript level of ABI3 was also detected in calluses of O and JC, while expression of ABI5 was higher in calluses of O than in those of V.

Fig. 8

The expression dynamics of SE associated genes in seven citrus cultivars with different embryogenic capacity by qRT-PCR and RT-PCR. These cultivars are: C. sinensis cv. Valencia (V), C. sinensis cv. Anliucheng (AL), C. sinensis cv. Newhall (NH), C. sinensis cv. Jingcheng (JC), C. sinensis cv. Bingtangcheng (BT), C. sinensis cv. Olinda (O), and C. unshiu cv. Guoqing No.1 (G1). Gene expression is shown as the mean and standard error calculated over technical replicates (n = 3)

Discussion

This study was undertaken to investigate the genes controlling citrus SE. The SSH-based microarray approach has proved to be an efficient way to enrich and identify differentially expressed genes (Yang et al. 1999; Liu et al. 2009). Herein, SSH and cDNA microarray were combined for the first time to explore the transcriptional regulation of SE in citrus. In our study, non-embryogenic callus, embryogenic callus, and somatic-embryo-induced calluses and embryos were used to construct three SSH cDNA libraries, aiming to identify the genes and biogenesis processes associated with SE competence acquisition and with somatic embryo induction and development. In our libraries, 880 unisequences were identified. Comparison of our results with previous data from other higher plants suggests that some unisequences are most likely conserved in certain pathways and processes, e.g. cell proliferation and growth, polarity establishment, signal transduction, some extracellular proteins and housekeeping genes. Notably, the analyses of differentially expressed genes indicated that nucleolus associated regulation and biogenesis processes, transporters, embryo development related genes, hormone signal transduction and stress factors might be involved in the SE process, and that transcription factors in particular might play an important role in citrus SE.

The important events in embryogenic competence maintenance and somatic embryo induction

The SE process is generally divided into two stages, induction and expression. In the induction stage, somatic cells acquire embryogenic competence and differentiate into embryogenic cells with multiplication capacity; in the expression stage, embryogenic cells develop into somatic embryos (Namasivayam 2007). To identify the early events involved in the acquisition of somatic embryogenic competence, we used the NEC as a negative control. The Valencia NEC was obtained by a routine method, i.e. culturing epicotyl explants of in vitro seedlings on MT basal medium supplemented with PGRs in the dark (Maul et al. 2006; Niedz and Evens 2008), which indicates that the NEC originated from non-embryogenic cells, whereas the EC line originated from embryogenic cells. Because the NEC and EC arose from different explants, we tested whether genomic variation had occurred between them using 21 SSR primers that were previously developed in our lab and shown to distinguish Citrus sinensis cultivars; no genetic difference was revealed (unpublished data). In our study, exposure to medium containing PGRs, particularly 2,4-D, resulted in the formation and preservation of NEC cells lacking embryogenic competence (Fig. 1a), whereas the EC proliferated and was maintained on medium without PGRs while retaining strong embryogenic competence (Fig. 1b), and EC cells could be induced to form somatic embryos on medium with 2% glycerol (Fig. 1d). NEC and EC cells of different embryogenic origin were used to construct the SSH library. The data showed that a significantly high proportion of the differentially expressed genes were embryogenesis related, which suggests that some important genes are associated not only with somatic embryo induction and formation but also with the acquisition and maintenance of embryogenic competence in calluses before the induction of SE in citrus.

Significant expression of LEC1/B3 regulatory network genes indicates conserved mechanism in citrus embryogenesis

LEC1/B3 regulatory network genes have been reported to act in embryogenesis during Arabidopsis seed development. Ectopic expression of LEC1 can induce embryo formation in vegetative tissues of Arabidopsis (Lotan et al. 1998), but in white spruce no ectopic organ formation was observed on transgenic seedlings expressing the spruce LEC1 homolog (Klimaszewska et al. 2010). In our SE libraries, five regulatory genes were identified, i.e. LEC1 (CS_P030_E_05), L1L (CS_P047_E_07, CS_P056_E_06), FUS3 (CS_P068_D_11), ABI3 (CS_P072_C_08) and ABI5 (CS_P016_H_10). Increased expression of these genes was further confirmed by qRT-PCR analysis (Fig. 7). In our ultrastructural analysis, many protein bodies were observed in EC cells (Fig. 1o), which is consistent with the finding that LEC1 controls the accumulation of seed storage proteins through its regulation of FUS3 and ABI3 in Arabidopsis (Kagaya et al. 2005). Together, these observations make it reasonable to hypothesize that this regulatory network is necessary for citrus SE, that these genes play an important role in maintaining EC cells in a state suitable for further development, and that they are also required for regulating the cascade of embryo development.

The expression of two genes encoding B3 domain-containing protein and homeobox-leucine zipper protein is necessary in somatic embryogenic competence acquisition and maintenance

In our study, we detected one gene encoding a B3 domain-containing protein that was up-regulated in EC and down-regulated during the induction and transition stages of SE (Fig. 7). In addition, its expression dynamics in calluses of seven citrus cultivars with different embryogenic capacity indicated that this gene might reflect the embryogenic potential of different citrus cultivars (Fig. 8). Another important gene, encoding homeobox-leucine zipper protein 4 (HB4), was expressed at a high level in citrus calluses with embryogenic capacity (Fig. 8). In white spruce, a homeodomain leucine-zipper gene (PgHZ1) has been isolated; PgHZ1 belongs to the HD-Zip class I family, and its expression increased in an embryogenic spruce line during embryo formation (Tahir et al. 2008). In our experiment, the full-length CsHB4 was amplified. Sequence analyses showed that this gene belongs to the HD-Zip class II family, and SSR analyses showed that its polymorphism was related to the capacity to acquire embryogenic competence (data not shown). Functional studies of these SE-related transcription factors are still needed and would help to elucidate the regulatory networks that function during somatic embryogenesis in citrus and other woody fruit trees.

The expression of bZIP genes is necessary in somatic embryo development

The basic-region leucine zipper (bZIP) transcription factor has recently been reported to function as an important trans-acting factor in the ABA signal transduction pathway during carrot somatic embryo development (Guan et al. 2009). In our libraries, two candidate bZIP genes (CS_P107_D_09, CS_P135_E_05) with different transcription profiles were identified. One of them, Dc3 promoter-binding factor 2 (DPBF2) (CS_P135_E_05), which can interact with ABA-responsive and embryo-specification elements in the Dc3 promoter (Kim et al. 1997), was up-regulated in citrus EC and during the embryo induction and transition stages. These two genes belong to different bZIP groups and might play different roles in SE.

Genes responsible for nucleolus associated regulation and biogenesis processes in somatic embryogenic competence maintenance

In eukaryotic cells, the nucleolus is a prominent non-membrane-bound nuclear structure that forms around chromosome segments bearing nucleolar-organizing regions (NORs). Some important regulation and biogenesis events occur in the nucleolus (Lam et al. 2005). In our ultrastructural analysis, the NEC cells were large and had no obvious nucleus, whereas the EC cells had an obvious nucleolus and abundant protein bodies, suggesting that regulation within the nucleolus is crucial for the maintenance of SE capacity and the induction of somatic embryos. In addition to many genes encoding ribosomal proteins, two genes, i.e. fibrillarin 1 (CS_P013_C_11) and nuclear transport factor 2 (NTF2), were found to be up-regulated in EC cells. Previous studies showed that depletion or knockdown of fibrillarin led to early embryonic lethality in mouse (Newton et al. 2003) and to abnormal nuclear morphology and reduced cellular growth in HeLa cells (Amin et al. 2007). The NTF2 gene in Caenorhabditis elegans plays a critical role in establishing and maintaining the Ran gradient and is required for early embryogenesis (Quimby et al. 2000). Taken together, the molecular and cytological analyses suggest that the nucleolus of citrus EC is in a state of high metabolic activity and that regulation and biogenesis events occurring in the nucleolus of EC play an important role in maintaining the competence to form embryos, whereas the NEC has a large cell size and remains in an undifferentiated stage.

Hormone signal transduction pathways might control the somatic embryo induction and development

In citrus, endogenous hormone levels and their balance play a key role in regulating SE. To maintain the embryogenic capability of callus during long-term subculture, a relatively high level of endogenous ABA, but not GA, was needed (Liu 2003). In our study, a number of ABA signal transduction genes were identified, including ABI3 (CS_P072_C_08), ABI5 (CS_P016_H_10), ABI1 (CS_P037_F_07), PP2C (CS_P128_C_08), ABI8 (CS_P077_B_11), SAD1 (CS_P052_F_08) and bZIP proteins (CS_P107_D_09, CS_P135_E_05). These findings suggest an important role for the ABA signal transduction pathway in citrus SE, as proposed by Ikeda and Kamada (2006), who found that ABA and ABI3 could regulate LEA gene expression in SE. Thus, the ABA signal transduction pathway may be associated with important embryological processes in citrus SE.

Another gene family involved in signal transduction during SE is the GASA (GAST1 protein homolog) family. A GAST gene that is more highly expressed in non-embryogenic callus than in embryogenic callus and embryoids was recently identified in oil palm (Low et al. 2008). In our libraries, down-regulated expression of GASA1 (CS_P085_B_01) and GASA2 (CS_P083_H_11) was detected throughout the SE process, while the lowest expression of GASA3 (CS_P095_E_02) was detected in EC. In some plant species, e.g. Solanum tuberosum and Cocos nucifera, exogenous GA3 in the culture medium is needed, and some genes associated with GA biosynthesis and GA response were found to be up-regulated during the formation and germination of somatic embryos (Sharma et al. 2008; Montero-Cortes et al. 2010). However, citrus SE capacity could be maintained in medium without growth hormones (GA or ABA), suggesting that the SE process could be regulated by endogenous hormones that lead to changes in gene expression.

Role of stress factors in somatic embryo induction and development

During the SE initiation and transition phases, stress-related genes showed significant up-regulation. The molecular basis for stress-induced acquisition of SE was described in detail by Karami and Saidi (2010). Some stress-related genes and proteins have been reported to be associated with SE, such as pathogenesis-related (PR) proteins (Helleboid et al. 2000; Passarinho et al. 2001) and glutathione metabolic genes (Winkelmann et al. 2006; Zeng et al. 2006). Notably, two copper/zinc superoxide dismutases (i.e. CSD1 and CSD2), which have been reported to prevent ROS-mediated cell damage and death, were identified in our libraries. In Arabidopsis, the two CSD genes, cytosolic CSD1 and plastidic CSD2, are both targets of miR398, a conserved miRNA family that was identified in Arabidopsis from small RNA libraries of seedlings exposed to abiotic stresses (Sunkar and Zhu 2004). In a previous study, the expression of miR398 was found to be closely associated with CE development in citrus (Wu et al. 2011). We suppose that there may be an intrinsic relationship between the expression of CSD genes and the antioxidation mechanism during citrus embryo development, which could establish a metabolic balance.

In summary, our study uncovered transcriptome changes during SE of citrus. A total of 880 unisequences identified as differentially expressed via SSH-based microarray analysis were considered potentially significant for SE competence acquisition and for somatic embryo induction and formation. Gene ontology categorization of the differentially expressed genes showed that a number of important factors involved in regulation and biogenesis are required for SE, including transcription factors, nucleolus-associated regulation and biogenesis processes, and stress factors. In our study, LEC1/B3 domain regulatory network genes (i.e. LEC1, L1L, FUS3, ABI3, and ABI5) were isolated for the first time in citrus SE. Moreover, some genes not previously known to be involved in SE were identified, e.g. genes encoding a homeobox-leucine zipper protein, a B3 domain-containing protein, IAA9, and a tubby-like protein. Our results provide, for the first time, an extensive view of the regulatory factors involved in the entire process of SE in citrus. Characterization of these genes and in-depth elucidation of their functions in citrus SE are currently under way.