Introduction

Coffee is one of the main commodities traded worldwide, with Coffea arabica L. accounting for about 60% of global production (International Coffee Organization 2014). However, to maintain production levels and to satisfy the expectations of demanding markets that best remunerate the producer, it is necessary to sustain programs aiming at coffee breeding. The release of elite genotypes that are disease resistant and produce top quality beans can be accelerated by combining conventional breeding and biotechnological techniques (Gatica-Arias et al. 2008). In this context, somatic embryogenesis is a technique with considerable potential since it not only provides the possibility of mass clonal propagation of genetically improved varieties and maintenance of germplasm, but also serves as an efficient regeneration system for the genetic transformation process (Pathi et al. 2013; Ribas et al. 2011; Winkelmann 2010).

Quantitative polymerase chain reaction (qPCR) is a rapid and sensitive technique that has frequently been used to study gene expression during somatic embryogenesis and embryo germination (Gruszczyńska and Rakoczy-Trojanowska 2011; Ma et al. 2012; Silva et al. 2014; Zhang et al. 2014). Although the procedure offers high reproducibility, precision and throughput (Bustin et al. 2005; Logan et al. 2009), the reliability of the technique depends on various factors including the integrity of RNA, the quality of cDNA synthesis, the number of repetitions, the efficiency of amplifications and the appropriate selection of reference genes used as internal controls for normalizing and monitoring sample-to-sample and run-to-run variations (Pfaffl et al. 2004; Santis et al. 2011; Vandesompele et al. 2002). Reference genes must exhibit moderate levels of expression and be stably expressed in different cell types and experimental conditions, but must not be associated with pseudogenes (Ling et al. 2014; Lland et al. 2006; Wan et al. 2010).

A great number of reference genes have been validated for gene expression analysis in plants, including those encoding β-actin, β-tubulin, ubiquitin, glyceraldehyde-3-phosphate dehydrogenase and elongation factors (Kumar et al. 2011). However, there appears to be no universal reference gene that could be used for all plant tissues and/or experimental conditions (Chen et al. 2011; Cheng et al. 2013; Gutierrez et al. 2008a, b; Imai et al. 2014; Lin et al. 2013; Rodrigues et al. 2014; Zeng et al. 2014), hence it is important to assess the stability of the selected reference gene under the specific experimental conditions employed in the expression analysis (Vandesompele et al. 2009). Indeed, the selection of suitable reference genes is a key step in qPCR analysis, since inappropriate choices will negatively affect the reliability of the results (Carvalho et al. 2013b; Docimo et al. 2013; Fan et al. 2013; Kong et al. 2014; Nolan et al. 2006).

RefFinder is a computational tool for evaluating and screening candidate reference genes from experimental datasets in order to infer their suitability for normalization of qPCR data. The web-based application assigns appropriate weights to individual candidate genes based on data obtained using the algorithms geNorm (Vandesompele et al. 2002), NormFinder (Andersen et al. 2004), BestKeeper (Pfaffl et al. 2004) and Delta-Ct (Silver et al. 2006), and ranks the genes according to the geometric mean of their weights (RefFinder 2016).
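The overall ranking step described above can be illustrated with a minimal sketch. It assumes that the weight assigned to a gene by each algorithm is simply its rank position (1 = most stable) and that the overall score is the geometric mean of those positions. The gene names and the function name overall_ranking are hypothetical and serve only as an illustration; this is not the RefFinder source code.

```python
from math import prod


def overall_ranking(rankings):
    """Combine per-algorithm rankings into one overall ranking.

    rankings maps an algorithm name to a list of genes ordered from
    most stable to least stable. The rank position of a gene is used
    as its weight (an assumption), and genes are sorted by the
    geometric mean of their weights.
    """
    genes = next(iter(rankings.values()))
    scores = {}
    for gene in genes:
        ranks = [order.index(gene) + 1 for order in rankings.values()]
        scores[gene] = prod(ranks) ** (1.0 / len(ranks))
    # Lowest geometric mean first, i.e. most stable first
    return sorted(scores.items(), key=lambda item: item[1])


# Hypothetical rankings for three made-up genes
example = {
    "geNorm": ["A", "B", "C"],
    "NormFinder": ["B", "A", "C"],
    "BestKeeper": ["A", "C", "B"],
    "Delta-Ct": ["A", "B", "C"],
}
ranking = overall_ranking(example)
```

The geometric mean is less affected than the arithmetic mean by a single algorithm that places a gene far from its position in the other rankings.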

In the case of C. arabica, various reference genes have already been described for expression analysis (Barsalobres-Cavallari et al. 2009; Carvalho et al. 2013b; Cruz et al. 2009; Figueiredo et al. 2013; Goulao et al. 2012), but none have been validated in samples related to somatic embryogenesis. Considering the increasing interest in functional genomics of coffee that has arisen in response to the recent availability of genome and transcriptome data, the requirements for appropriate reference genes for expression normalization have become much more stringent. In order to address this issue, 12 candidate reference genes were selected and evaluated in 18 different embryogenic and non-embryogenic samples obtained from C. arabica explants to identify those suited for normalization of qPCR analyses and, therefore, appropriate for application in expression studies involving embryogenic tissues of this species. In addition, the best combination of reference genes determined for all tissue types was used to assess the expression of the Baby boom (BBM) gene, which encodes a plant-specific transcription factor belonging to the AP2/ERF superfamily that activates developmental pathways associated with cell proliferation and growth (Passarinho et al. 2008) and is involved in the acquisition of embryogenic competence (Namasivayam 2007). The results of the present study will be valuable in future research for defining reference genes that are appropriate for evaluation of target gene expression during the process of somatic embryogenesis in C. arabica.

Materials and methods

Somatic embryogenesis

Embryogenic and non-embryogenic calli were established from leaves of greenhouse-grown plants of C. arabica cv. Catuaí Amarelo IAC 62 according to the protocol described by Teixeira et al. (2004). Embryogenic cell suspensions were obtained by transferring calli to Erlenmeyer flasks containing liquid multiplication medium T3 (Van Boxtel and Berthouly 1996) at an inoculum density of 10 g callus L−1 (Zamarripa et al. 1991). The flasks were maintained in the dark under constant agitation at 100 rpm in a growth room at 25 °C, and the medium was replaced every 15 days. Embryos were regenerated within 2 months of transferring cell suspensions to RR medium (Carvalho et al. 2013a) at an inoculum density of 1 g L−1. Maturation and germination of somatic embryos were accomplished following the method described by Teixeira et al. (2004).

Experimental design

The experiment was performed according to a completely randomized design and each sample comprised three biological replicates. The samples evaluated included: (i) embryogenic and non-embryogenic calli: each repetition encompassed a set of ten calli obtained from different leaf explants; (ii) two cell lines of embryogenic cell suspensions with different culture times (60, 90, 120, 150, 180 and 210 days): each repetition consisted of 200 mg of cell agglomerates for each cell line; (iii) somatic embryos at different stages of development: each repetition included 275 globular embryos, 25 cordiform/torpedo embryos and 25 cotyledonary embryos; (iv) coffee plantlets: each repetition included 25 plantlets. All samples were stored at −80 °C until required for RNA extraction.

Extraction of total RNA and cDNA synthesis

Total RNA was extracted from embryogenic calli, embryogenic suspension cells and globular embryos using Macherey Nagel (Düren, Germany) NucleoSpin® kits, and from non-embryogenic calli, cordiform/torpedo embryos, cotyledonary embryos and plantlets using Invitrogen™ (Life Technologies, Carlsbad, CA, USA) Concert™ Plant RNA reagent. RNA extracts were treated with Ambion® (Life Technologies) Turbo DNA-free kit reagents in order to remove any contaminating genomic DNA. The quantity and purity of total RNA were assessed with an ND-1000 spectrophotometer (NanoDrop Technologies, Wilmington, NC, USA), while quality and integrity were verified using an Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA) to ensure an RNA integrity number (RIN) ≥ 7.0. The synthesis of cDNA from 1000 ng aliquots of RNA was carried out using Applied Biosystems (Life Technologies) High-Capacity cDNA Reverse Transcription kits according to the recommendations of the manufacturer.

Selection of candidate reference genes and design of primers

A set of 12 potential reference genes that had been reported in other crops or frequently used in coffee was selected. These included ribosomal protein 24S (24S), β-actin (ACT), glyceraldehyde-3-phosphate dehydrogenase (GAPDH), cyclophilin (CYCL), elongation factor 1α (EF1a), β-tubulin (TUB), protein phosphatase 2A (PP2A), clathrin adaptor complex medium subunit (AP47), 60S ribosomal protein L39 (RPL39), adenine phosphoribosyltransferase (APRT), ubiquitin (UBQ) and protein 14-3-3 (14-3-3).

The 24S, ACT, GAPDH, CYCL, EF1a, TUB, PP2A, RPL39, AP47, APRT, UBQ and 14-3-3 genes were selected from C. arabica EST sequences developed by the Brazilian Coffee Genome Project Consortium (http://www.lge.ibi.unicamp.br/cafe/) (Vieira et al. 2006) (Table 1). For the BBM gene, we used primer pairs already reported in previous studies in Coffea (Silva et al. 2015). Coffee homologue sequences with the best matches were retrieved and submitted to Primer Express software version 3.0 (Applied Biosystems) for primer design. The specificity of each pair of primers was verified by analysis of the dissociation (melting) curves. PCR amplification efficiencies (E) and regression coefficients (R2) were determined during the validation of primers according to the standard curve method, using a pool of all cDNA samples in fivefold serial dilutions. The specifications of the selected reference genes and primer pairs are shown in Table 1.
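As an illustration of the standard curve method, the sketch below fits Cq against log10 of the relative template quantity by ordinary least squares and derives the efficiency from the slope using the widely used relation E = 10^(−1/slope) − 1. The Cq values are synthetic, and the function name standard_curve is hypothetical; the authors' software may compute these quantities differently.

```python
import math


def standard_curve(relative_quantities, cq_values):
    """Fit Cq = slope * log10(quantity) + intercept by least squares.

    Returns (slope, efficiency, r_squared), where efficiency is
    10 ** (-1 / slope) - 1, so 1.0 corresponds to 100%.
    """
    x = [math.log10(q) for q in relative_quantities]
    y = list(cq_values)
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    r_squared = sxy ** 2 / (sxx * syy)
    efficiency = 10 ** (-1.0 / slope) - 1.0
    return slope, efficiency, r_squared


# Synthetic fivefold dilution series: with perfect doubling, each
# fivefold dilution delays Cq by log2(5) cycles.
quantities = [5.0 ** -k for k in range(5)]
cqs = [20.0 + k * math.log2(5) for k in range(5)]
slope, efficiency, r_squared = standard_curve(quantities, cqs)
```

A slope close to −3.32 corresponds to an efficiency close to 100%, whereas steeper (more negative) slopes indicate lower efficiency.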

Table 1 Descriptions of coffee candidate reference genes and BBM for qPCR analysis

qPCR amplification

qPCR analyses were performed using a Qiagen (Venlo, Netherlands) Rotor Gene-Q® thermal cycler with a reaction mix containing 7.5 μL of 2× Rotor-Gene SYBR® Green PCR Master Mix (Qiagen), 5 ng of cDNA, optimized concentrations of primers (see Table 1) and RNase-free water to a total volume of 15 μL. Amplification conditions involved an initial activation at 95 °C for 5 min and 40 cycles of denaturation at 95 °C for 5 s and combined annealing/extension at 60 °C for 10 s. In order to confirm the specificity of primers, melting curves were recorded after the 40 amplification cycles had been completed by increasing the temperature from 60 to 95 °C. All qPCR assays were carried out in technical and biological triplicate.

Analysis of expression stability of candidate reference genes

In qPCR, amplified DNA bears a fluorescent label and the amount of fluorescence detected during the reaction is directly proportional to the amount of amplified DNA present. The levels of expression of candidate reference genes were determined on the basis of the quantification cycle (Cq), also known as the threshold cycle (Ct), which is defined as the cycle at which the fluorescence from amplification exceeds that of the background. The Cq values of samples were determined using Qiagen Rotor Gene-Q Series software with the fluorescence threshold set at 0.2, and corrected according to the efficiency of each pair of primers with the aid of GenEx Enterprise software (MultiD Analyses, Göteborg, Sweden; http://genex.gene-quantification.info/). Box plot-type diagrams were made using Systat Software (San Jose, CA, USA) SigmaPlot version 12.0 to illustrate levels and variations in expression of the tested reference genes.
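The exact efficiency correction applied to the Cq values is not given in the text. A commonly used form rescales each Cq to the value it would have at 100% efficiency, Cq(corrected) = Cq × log2(1 + E); the sketch below applies this relation as an assumption, with a hypothetical function name and illustrative values.

```python
import math


def efficiency_corrected_cq(cq, efficiency):
    """Rescale a Cq value to its equivalent at 100% efficiency.

    efficiency is expressed as a fraction (1.0 = 100%). The relation
    Cq * log2(1 + E) is an assumed, commonly used correction.
    """
    return cq * math.log2(1.0 + efficiency)


# Illustrative values only: a Cq of 25 measured with a primer pair
# of 80% efficiency, and the same Cq with 100% efficiency.
corrected_low = efficiency_corrected_cq(25.0, 0.80)
corrected_full = efficiency_corrected_cq(25.0, 1.00)
```

With E = 100% the value is unchanged, while lower efficiencies yield a smaller corrected Cq, since fewer true doublings took place over the same number of cycles.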

The RefFinder tool (http://fulxie.0fees.us) was employed to assess the stability of the tested genes in seven sample sets: (i) embryogenic cell suspensions at different culture times, (ii) non-embryogenic calli, (iii) embryogenic calli, (iv) combined embryogenic and non-embryogenic calli, (v) globular, cordiform/torpedo and cotyledonary embryos, (vi) plantlets, and (vii) a pool of biological samples representing all tissue types from (i) to (vi). The RefFinder tool evaluates the rankings of stability according to the geNorm, NormFinder, BestKeeper and Delta-Ct algorithms and provides an overall ranking for the tested reference genes (RefFinder 2016). The geNorm algorithm calculates an M-value describing the variation of each candidate gene in comparison with all other candidates, and eliminates the gene with the highest M-value. This process is repeated until only two genes remain and these are then ranked as the best possible pair of reference genes. A low M-value, i.e. one that is below the cut-off point of 1.5, indicates transcriptional stability (Mamo et al. 2007; Rodrigues et al. 2014; Spinsanti et al. 2006; Vandesompele et al. 2002). NormFinder calculates both intra- and inter-sample variance and the stability value (SV) for each of the candidate reference genes. A low SV corresponds to a gene with high transcriptional stability (Andersen et al. 2004). The BestKeeper algorithm assesses the transcriptional stability of all candidate reference genes based on calculated variation parameters, namely standard deviation (SD), correlation coefficient (r) and coefficient of variation (CV). The genes are then ranked according to variability from those most stably expressed (lowest variation) to the least stable (highest variation) (Pfaffl et al. 2004). The Delta-Ct method compares the differences in Cq between pairs of the tested reference genes within each sample; the smaller the variation of these differences across samples, the more stable the gene transcript (Silver et al. 2006).
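The pairwise logic shared by geNorm and Delta-Ct can be sketched as follows. The sketch assumes efficiency-corrected Cq values, so that the difference in Cq between two genes equals their log2 expression ratio up to sign. For each gene it averages the standard deviation of its Cq differences with every other gene, which gives the Delta-Ct score, and then applies the stepwise elimination used by geNorm. Gene names, data and function names are hypothetical, and the original programs differ in implementation details.

```python
from statistics import stdev


def mean_pairwise_sd(cq, genes):
    """Mean SD of Cq differences between each gene and all others."""
    result = {}
    for gene in genes:
        sds = [
            stdev([a - b for a, b in zip(cq[gene], cq[other])])
            for other in genes
            if other != gene
        ]
        result[gene] = sum(sds) / len(sds)
    return result


def genorm_ranking(cq):
    """Stepwise exclusion of the gene with the highest M-value.

    Returns the best pair with its M-value, and the excluded genes
    with the M-value at exclusion, ordered from more to less stable.
    """
    remaining = list(cq)
    excluded = []
    while len(remaining) > 2:
        m_values = mean_pairwise_sd(cq, remaining)
        worst = max(remaining, key=m_values.get)
        excluded.append((worst, m_values[worst]))
        remaining.remove(worst)
    final_m = mean_pairwise_sd(cq, remaining)[remaining[0]]
    return (tuple(remaining), final_m), excluded[::-1]


# Hypothetical Cq values for four made-up genes in four samples
cq_data = {
    "G1": [20.0, 20.1, 19.9, 20.0],
    "G2": [22.0, 22.1, 21.9, 22.0],
    "G3": [18.0, 18.5, 17.8, 18.3],
    "G4": [25.0, 23.0, 26.5, 24.0],
}
delta_ct = mean_pairwise_sd(cq_data, list(cq_data))
best_pair, others = genorm_ranking(cq_data)
```

Because geNorm recalculates M after each exclusion whereas Delta-Ct uses all genes at once, the two methods can report different values and, occasionally, different orders for the same data.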

Validation of reference genes by BBM expression analysis

Coffea arabica baby boom (CaBBM) was identified in the EST library of the Brazilian Coffee Genome Project (Vieira et al. 2006). In order to validate the reference genes, the expression levels of CaBBM were quantified in the same tissue types used for testing the reference genes, using both the most stable and the least stable reference genes, to demonstrate how the choice of reference genes can affect the measured expression of a specific gene of interest. To determine the optimal number of reference genes for normalization in each experimental condition, pairwise variation (Vn/n+1) was calculated using geNorm. Vandesompele et al. (2002) proposed 0.15 as a cut-off value for determining the optimal number of reference genes, below which the inclusion of additional reference genes is not required. The transcriptional activity of BBM was calculated by applying the Pfaffl formula (Pfaffl 2001).
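For illustration, the Pfaffl (2001) ratio divides the efficiency-weighted change in the target gene by that in the reference gene, each expressed as an amplification factor (1 + E) raised to the power of the Cq difference between control and sample. The text does not state how two reference genes were combined; the sketch below assumes the geometric mean of their relative quantities, and all names and values are hypothetical.

```python
from math import prod


def relative_quantity(amplification_factor, cq_control, cq_sample):
    """Amplification factor (1 + E) raised to (Cq control - Cq sample)."""
    return amplification_factor ** (cq_control - cq_sample)


def pfaffl_ratio(target, references):
    """Relative expression of a target gene against reference genes.

    target and each entry of references are tuples of
    (amplification_factor, cq_control, cq_sample). Reference genes
    are combined by geometric mean (an assumption).
    """
    target_rq = relative_quantity(*target)
    reference_rqs = [relative_quantity(*ref) for ref in references]
    reference_mean = prod(reference_rqs) ** (1.0 / len(reference_rqs))
    return target_rq / reference_mean


# Illustrative values: the target appears 3 cycles earlier in the
# sample than in the control; one reference gene is unchanged and
# the other appears 1 cycle earlier.
ratio = pfaffl_ratio(
    target=(2.0, 28.0, 25.0),
    references=[(2.0, 20.0, 20.0), (2.0, 22.0, 21.0)],
)
```

If an unstable reference gene shifts between samples, its relative quantity moves the denominator and the apparent expression of the target changes accordingly.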

Results

Specificity and efficiency of primers

The amplification efficiencies (E) and correlation coefficients (R2) of the 12 candidate genes and the BBM gene were calculated from the slopes of the standard curves obtained by serial dilutions. Values of E varied from 80 to 100%, while those of R2 obtained from standard curves were ≥0.947 (Fig. S1 and Table 1), indicating that the amount of product approximately doubled in each cycle (exact doubling corresponds to E = 100%). The specificity of each of the tested primer pairs was confirmed by the presence of a single peak corresponding to one amplicon in the respective melt curves (Fig. S1).

Levels of expression of candidate reference genes

The expression profiles of all qPCR products for all genes and all sample sets are shown in Fig. 1. The mean Cq values of the 12 candidate reference genes ranged from 17 to 28, indicating a wide variation in the levels of expression. The results obtained with the pool of biological samples showed that UBQ presented the lowest level of expression (mean Cq = 25.3) while EF1a exhibited the highest (mean Cq = 18.7). The coefficients of variation (CV; lower values represent lower variability) of the 12 reference genes were 1.80% (24S), 2.69% (ACT), 3.47% (GAPDH), 2.76% (CYCL), 2.61% (EF1a), 2.26% (TUB), 1.93% (PP2A), 2.48% (AP47), 2.39% (RPL39), 1.84% (APRT), 2.40% (UBQ) and 2.86% (14-3-3).

Fig. 1

Expression of candidate reference genes, expressed as quantification cycle (Cq) values, in seven sample sets. Bars indicate maximum and minimum Cq values while circles represent mean values

When expression was evaluated in embryogenic cell suspensions at different culture times, GAPDH presented the highest CV (2.72%) and APRT the lowest (1.52%). For non-embryogenic calli, APRT presented the highest CV (0.70%) while UBQ exhibited the lowest (0.05%). Regarding embryogenic calli, EF1a presented the highest CV (2.43%) and RPL39 the lowest (0.24%). With the combined embryogenic and non-embryogenic calli, the highest CV was observed for GAPDH (4.21%) and the lowest for RPL39 (0.42%). In samples from somatic embryos at different stages of development, TUB presented the highest CV (5.17%) and APRT the lowest (1.77%). Regarding plantlet samples, the highest CV was observed for CYCL (0.60%) and the lowest for PP2A (0.06%).
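The coefficient of variation reported above expresses the standard deviation of the Cq values as a percentage of their mean. A minimal sketch follows, assuming the sample standard deviation (the text does not state whether the sample or the population form was used); the Cq values and the function name are hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_variation(cq_values):
    """CV (%) of a set of Cq values, using the sample SD (assumed)."""
    return 100.0 * stdev(cq_values) / mean(cq_values)


# Hypothetical Cq values for one gene in three samples
cv_percent = coefficient_of_variation([20.0, 20.5, 19.5])
```

Because the CV is scaled by the mean, it allows genes with different expression levels to be compared on the same footing.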

Expression stability of candidate reference genes

According to M-values calculated by the geNorm algorithm using data obtained from the pool of biological samples (Table 2), all candidate genes exhibited acceptable transcriptional stability (M ≤ 1.5) although the stabilities of 24S and RPL39 (M = 0.273) were higher than those of the other genes. On the other hand, PP2A and 24S were designated as the most stable genes by NormFinder (SV = 0.290 and 0.390, respectively) and Delta-Ct (ΔCt = 0.592 and 0.628, respectively), while BestKeeper assigned 24S and APRT as most stable (SD = 0.425 and 0.435, respectively). All four algorithms identified GAPDH as the least stable gene (M = 0.706, SV = 0.711, SD = 0.562 and ΔCt = 0.850), although none of the stability values exceeded the recommended cut-off points. The overall ranking established by RefFinder revealed that the most stable candidate genes were 24S followed by PP2A.

Table 2 Ranking of candidate reference genes according to stability values assessed in a pool of biological samples of Coffea arabica

In embryogenic cell suspension samples (Table 3), the most stable genes were identified as 24S and RPL39 by geNorm (M = 0.249), APRT by NormFinder and Delta-Ct (SV = 0.257 and ΔCt = 0.494, respectively) and EF1a by BestKeeper (SD = 0.342). All four algorithms identified AP47 as the least stable gene (M = 0.583, SV = 0.665, SD = 0.643 and ΔCt = 0.755). The overall ranking established by RefFinder revealed that the most stable candidate genes were APRT followed by EF1a, whereas AP47 was the least stable.

Table 3 Ranking of candidate reference genes according to stability values assessed in embryogenic cell suspensions of Coffea arabica at different culture times

In non-embryogenic calli samples (Table 4), the geNorm algorithm identified UBQ and ACT as the most stable genes (M = 0.042), while BestKeeper and Delta-Ct algorithms both designated UBQ (SD = 0.013 and ΔCt = 0.098) as the most stable. In contrast, NormFinder assigned RPL39 as the most stable gene (SV = 0.029) followed by UBQ (SV = 0.032). All four algorithms identified APRT as the least stable gene (M = 0.127, SV = 0.172, SD = 0.160 and ΔCt = 0.188). In the overall ranking, UBQ and ACT were classified as the most stable genes, although all candidate genes presented relatively high transcriptional stability in this system.

Table 4 Ranking of candidate reference genes according to stability values assessed in non-embryogenic calli samples of Coffea arabica

In embryogenic calli samples (Table 5), geNorm identified 24S and UBQ as the most stable genes (M = 0.094), whereas NormFinder designated ACT and TUB as most stable (SV = 0.040). In contrast, BestKeeper assigned RPL39 and 24S (SD = 0.047 and 0.106, respectively) as most stable, while Delta-Ct identified ACT as showing the greatest transcriptional stability (ΔCt = 0.253) similar to the finding of NormFinder. In the overall ranking, ACT followed by 24S were classified as the most stable genes with EF1a as the least stable.

Table 5 Ranking of candidate reference genes according to stability values assessed in embryogenic calli samples of Coffea arabica

In combined embryogenic and non-embryogenic calli samples (Table 6), the genes RPL39 and 24S were identified as the most stable by all four algorithms (M = 0.106; SV = 0.053 and 0.094, respectively; SD = 0.083 and 0.145, respectively; ΔCt = 0.489 and 0.488, respectively). According to the overall ranking, the most stable genes were RPL39 followed by 24S, while UBQ constituted the least stable.

Table 6 Ranking of candidate reference genes according to stability values assessed in combined embryogenic and non-embryogenic calli samples of Coffea arabica

In samples from somatic embryos at different stages of development (Table 7), geNorm classified PP2A and RPL39 as the most stable genes (M = 0.200), a finding that was in agreement with those of NormFinder (SV = 0.074 and 0.099, respectively) and Delta-Ct (ΔCt = 0.470 and 0.475, respectively). According to BestKeeper, however, APRT and 14-3-3 were the most stable genes (SD = 0.421 and 0.427, respectively). In the overall ranking, PP2A and RPL39 were considered the most stable genes while TUB was the least stable.

Table 7 Ranking of candidate reference genes according to stability values assessed in somatic embryos (globular, cordiform/torpedo and cotyledonary) of Coffea arabica

Finally, in samples of C. arabica plantlets (Table 8), PP2A and APRT were indicated as the most stable genes by geNorm (M = 0.024) and BestKeeper (SD = 0.014 and 0.024, respectively), while NormFinder and Delta-Ct classified AP47 as the most stable (SV = 0.017 and ΔCt = 0.069, respectively). According to the overall ranking, PP2A followed by AP47 were the most stable genes and CYCL was the least stable, although all candidate genes presented relatively high stability.

Table 8 Ranking of candidate reference genes according to stability values assessed in plantlets of Coffea arabica

Optimal number of reference genes

Accurate and reliable normalization with several reference genes depends on combining genes that are stably expressed (Liu et al. 2015). Normalization with an inadequate number of reference genes can produce significant analytical errors (Vandesompele et al. 2002). The results showed that pairwise variation values for V2/3 were below the cut-off value of 0.15 in all sample sets (Fig. 2), indicating that the combination of two stable reference genes would be sufficient for gene expression normalization.
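The pairwise variation can be sketched as follows, under the assumption of efficiency-corrected Cq values. In that case the log2 of a normalization factor built from n genes equals minus the mean Cq of those genes (up to a constant), so Vn/n+1 is the standard deviation across samples of the difference between the mean Cq of the n + 1 and of the n most stable genes. Data and names are hypothetical and do not reproduce the geNorm implementation.

```python
from statistics import mean, stdev


def pairwise_variation(cq_ranked, n):
    """V(n/n+1) from Cq values of genes ordered by stability.

    cq_ranked is a list of per-gene Cq lists, most stable gene first,
    with one value per sample in the same sample order.
    """
    n_samples = len(cq_ranked[0])
    log_ratios = [
        mean(gene[s] for gene in cq_ranked[: n + 1])
        - mean(gene[s] for gene in cq_ranked[:n])
        for s in range(n_samples)
    ]
    return stdev(log_ratios)


# Hypothetical Cq values for three made-up genes in four samples
ranked = [
    [20.0, 20.1, 19.9, 20.0],
    [22.0, 22.1, 21.9, 22.0],
    [18.0, 18.1, 17.9, 18.1],
]
v2_3 = pairwise_variation(ranked, 2)
needs_third_gene = v2_3 >= 0.15
```

A value below the 0.15 cut-off means that adding the third gene changes the normalization factor very little, so two genes are regarded as sufficient.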

Fig. 2

Pairwise variation (V) calculated by geNorm to determine the optimal number of reference genes. A value < 0.15 means that inclusion of an additional reference gene is not required

Reference genes validation

To assess the impact of the selection of reference genes on gene expression calculations, we analyzed BBM expression by employing two normalization strategies (Fig. 3). The relative expression profiles of the target gene in samples related to the process of somatic embryogenesis were very similar when normalized with the different internal reference genes. BBM transcripts showed higher relative expression levels in embryogenic calli and embryogenic cell suspensions. However, the relative transcript abundance of the target gene depended on the reference genes used for normalization, and BBM expression levels were clearly overestimated when unsuitable reference genes were used. Moreover, the use of the newly identified normalization controls resulted in significantly lower standard deviations, underlining the higher reproducibility of the results. This indicates that questionable results would be produced by using unstable reference genes. These results reinforce the importance of validating reference genes prior to experimental applications.

Fig. 3

Differential gene expression of BBM using the selected reference genes. Relative gene expression quantification was performed using two different normalization strategies: the combination of the two top-ranked genes and the combination of the two most unstable genes. The columns represent the gene expression in different materials (NEC non-embryogenic calli, EC embryogenic calli, ECS embryogenic cell suspensions with 60, 90, 120, 150, 180 and 210 days culture, GLO globular embryos, TOR cordiform/torpedo embryos, COT cotyledonary embryos, PLA plantlets) of C. arabica. Error bars indicate standard deviation (SD). Letters denote statistically significant differences (Student’s t test, P < 0.05)

Discussion

Coffee, the world’s favorite beverage, plays an important role in industry. Studies are necessary to produce quality coffee and to protect the coffee supply chain from economic, climate, pest and disease threats. Despite the rapid exploration of the coffee genome and the growing need for in-depth study of gene function in published work that employs omics approaches, to our knowledge very limited information is available on the expression stability of reference genes in Coffea spp. during somatic embryogenesis. The expression pattern, a reflection of the biological function of a target gene, is preferably detected by qPCR, in which a reference gene is used for normalization (Lin et al. 2014). The expression patterns of reference genes are expected to be stable irrespective of experimental conditions (Chen et al. 2011; Cheng et al. 2013; Gutierrez et al. 2008a, b; Imai et al. 2014; Lin et al. 2013; Rodrigues et al. 2014; Zeng et al. 2014). It is assumed that genes encoding proteins involved in the primary metabolism and structural integrity of cells are uniformly expressed regardless of the experimental conditions and cell type (Vandesompele et al. 2002). However, studies have shown that there is no universal reference gene appropriate for all cells and conditions because these genes can participate in other cell functions (Chen et al. 2011; Cheng et al. 2013; Gutierrez et al. 2008a, b; Imai et al. 2014; Lin et al. 2013; Rodrigues et al. 2014; Zeng et al. 2014). Therefore, it is necessary to perform a systematic validation of candidate reference genes for the specific tissues to be evaluated.

RefFinder ranks candidate genes on the basis of output from different algorithms, namely geNorm, NormFinder, BestKeeper and Delta-Ct. When the stability of some genes was compared across the four analyses, discrepant results were observed within the same sample set, because the algorithms use dissimilar mathematical models to calculate gene expression stability. Volland et al. (2016) recommended employing a minimum of two experimental validation procedures, as individual algorithms can produce variable results and may not reach a consensus. The use of the RefFinder tool can be an alternative for obtaining a global ranking. However, when raw Cq values are used as input in RefFinder, the results are not faithful to those generated by the original software packages (Spiegelaere et al. 2015). Before data entry in RefFinder, the raw data should be converted using the specific primer efficiency of each gene. Rankings similar to those of the original algorithms can then be obtained, as demonstrated by Li et al. (2016).

Real-time qPCR data are frequently normalized without validation of the reference genes (Gutierrez et al. 2008a, b). In the present study, it is of interest to note that the commonly employed reference gene GAPDH (Cardoso et al. 2014; Ivamoto et al. 2015; Marraccini et al. 2011; Ságio et al. 2014; Silva et al. 2014, 2015) exhibited very high levels of expression (i.e. very low Cq values) but was ranked as one of the most unstable of the candidate genes by all four of the algorithms accessed by RefFinder. One explanation for this finding is that the criterion for selection of a reference gene requires gene expression at a moderate level (Ling et al. 2014). Moreover, the protein encoded by GAPDH performs alternative metabolic roles (Zaffagnini et al. 2013), hence expression levels are likely to be highly variable. Instability in GAPDH expression has been reported for coffee hypocotyls inoculated with Colletotrichum kahawae (Figueiredo et al. 2013), although the gene exhibited stable expression in other tissues/organs of Coffea arabica (namely, roots, stems, leaves and fruits) (Barsalobres-Cavallari et al. 2009) and in leaves and roots that had been submitted to abiotic stress (Carvalho et al. 2013b). GAPDH was also considered less stable in soybean under water stress (Stolf-Moreira et al. 2011), strawberry fruits (Galli et al. 2015), maize grains (Galli et al. 2013) and lettuce (Borowski et al. 2014), among others.

Expression analysis of 12 different candidate reference genes in a pool of C. arabica samples comprising embryogenic and non-embryogenic calli, embryogenic cell suspensions with different culture times, somatic embryos at different developmental stages and plantlets revealed that 24S and PP2A were the most stably expressed. However, separate analyses of the different categories of samples revealed that other genes were the most stable in individual sample sets. For instance, UBQ emerged as the most stable gene in non-embryogenic calli, while ACT and APRT were the most stable in embryogenic calli and cell suspensions, respectively. In contrast, when combined samples of embryogenic and non-embryogenic calli were analyzed, RPL39 emerged as the most stable. Moreover, the highest level of transcriptional stability in somatic embryos and plantlet samples was attained by PP2A. Variation in the expression stability of reference genes in different tissues has been frequently reported (Barsalobres-Cavallari et al. 2009; Carvalho et al. 2013b; Chao et al. 2012; Imai et al. 2014; Lin et al. 2013; Yeap et al. 2014), hence the discrepancies observed in our study are not surprising and reinforce the importance of caution in the selection of suitable reference genes for the normalization of qPCR analyses.

Thus, even though GAPDH and ACT have been previously employed for normalization of expression data in calli and cell suspensions of coffee (Silva et al. 2014, 2015), the present study revealed that GAPDH was not appropriate as a reference gene for any embryogenic tissues of coffee, while ACT was appropriate only for embryogenic calli. The arbitrary choice of reference genes leads to inadequate normalization, imprecision in the qPCR technique and improper quantification of the target gene (Carvalho et al. 2013b; Fan et al. 2013; Kong et al. 2014; Nolan et al. 2006). The results obtained in this study help to emphasize the necessity of validating reference genes for the specific tissues to be evaluated.

To demonstrate the need for accurate relative quantification using suitable reference genes, the expression of the C. arabica BBM gene was studied. A clear role for CaBBM in the embryogenic process is nonetheless evident from its high expression in embryogenic calli but not in non-embryogenic calli. Previous reports showed that the baby boom (BBM) gene is related to cell proliferation and morphogenesis during embryogenesis (Boutilier et al. 2002; Florez et al. 2015; Kulinska-Lukaszek et al. 2012; Passarinho et al. 2008). However, the results of this work suggest that the BBM gene in C. arabica may be associated only with cell proliferation and acquisition of embryogenic capacity, because high levels of transcription of the target gene were not observed during the transition from embryos to plantlets.

In summary, among the 12 candidate reference genes studied, 24S/PP2A emerged as the most appropriate pair for normalization of qPCR analyses of all somatic embryogenesis-related cultures of C. arabica. We recommend APRT/EF1a, UBQ/ACT, ACT/24S, RPL39/24S, PP2A/RPL39 and PP2A/AP47 for embryogenic cell suspensions, non-embryogenic calli, embryogenic calli, combined embryogenic and non-embryogenic calli, somatic embryos and plantlets, respectively. The results provide guidelines for reference gene selection towards more accurate normalization of qPCR data in future Coffea transcriptomic studies involving embryogenesis-related tissues.