Introduction

Astrocytomas are the most common primary tumors of the central nervous system in humans. These tumors are classified by the World Health Organization (WHO) into four histological grades. Among them, glioblastoma multiforme (GBM) is the most aggressive tumor (WHO grade IV), characterized by the shortest survival [1].

Despite advances in genetic and epigenetic characterization, the molecular pathogenesis of malignant glioma and the complex biological interactions that regulate glioma development remain incompletely understood [2]. Various therapy options based on characterized genetic alterations are already in use or in clinical trials, but their efficacy is still poor [3–6]. Therefore, it is necessary to analyze the molecular pathways responsible for the etiology and development of gliomas, and to search for new biomarkers of potential use in treatment, modification, and outcome improvement. Gene expression analysis using quantitative real-time reverse-transcription PCR (RT-qPCR) has been shown to be a promising approach for identifying novel differentially expressed genes and molecular markers [7–9]. RT-qPCR is a highly sensitive, specific, and reproducible method, and has potential for high throughput.

To obtain valid results by RT-qPCR, it is crucial to consider experimental variations such as the amount of starting material, the quantity and quality of the RNA, and the efficiencies of reverse transcription and PCR [10]. To account for these, RT-qPCR data must be normalized to valid reference genes (“housekeeping genes”), and an efficiency correction of the PCR reactions has to be performed [11]. Valid reference genes should be stably expressed in all samples under investigation, regardless of tissue type, developmental stage, disease state, and medical or experimental treatment. Because no gene meets all of these criteria, candidate reference genes have to be verified under the same experimental conditions used for target genes, and a combination of at least two validated reference genes should be used for proper quantitative evaluation [12]. To date, three studies have analyzed a panel of candidate reference genes in glioma tissue. They give different results, which underlines the need to verify reference genes for each individual study [13–15]. Here we investigated the suitability of nine frequently used reference genes for SYBR Green-based RT-qPCR analysis in human astrocytomas of WHO grade II–IV and non-neoplastic brain tissue.

Materials and methods

Tissue samples

Tumor tissues and normal brain samples (diffuse astrocytoma, n = 3; anaplastic astrocytoma, n = 8; GBM, n = 8; normal tissue, n = 8) were obtained from patients undergoing surgery at the Department of Neurosurgery, Jena University Hospital. All tumor specimens were histologically diagnosed according to the WHO criteria [1] and stored at −80 °C. The study was approved by the local Human Research Ethics Committee, and every patient gave signed informed consent.

RNA extraction and cDNA synthesis

Total RNA was extracted from the tissue samples using Qiazol Reagent and the RNeasy Mini Kit (Qiagen). In brief, tissue samples were homogenized in 1 ml Qiazol Reagent, and 200 µl chloroform was added. After centrifugation, RNA was extracted from the upper phase with the RNeasy Mini Kit following the manufacturer’s instructions. Concentration and purity of the isolated RNA were assessed by absorbance (A) readings on a NanoDrop spectrophotometer (peqlab) at wavelengths of 260 and 280 nm. The mean A260/280 ratio for all RNA samples was 2.03 (±0.21), reflecting high purity. RNA integrity was evaluated by the ratio of 28S/18S ribosomal RNA bands after electrophoresis in a denaturing 1 % agarose gel. One microgram of total RNA from each sample was reverse transcribed into cDNA using the GoScript reverse transcription system (Promega) according to the manufacturer’s protocol.

Quantitative real-time PCR

The expression levels of nine frequently used reference genes (Table 1) were measured on a Rotor-Gene 6000 instrument (Qiagen) using the DyNAmo Flash SYBR Green qPCR Kit (Finnzymes). All primers were designed with Primer-BLAST (NCBI) and NetPrimer (http://www.premierbiosoft.com) software to be intron spanning and to amplify at an annealing temperature of 55 °C. The primer sequences used are listed in Table 1. Real-time qPCR was performed in duplicate with the following cycling conditions: 7 min at 95 °C, followed by 40 cycles of 10 s at 95 °C, 20 s at 55 °C, and 30 s at 72 °C. Each run was completed with a melting curve analysis to confirm the specificity of amplification and the absence of primer dimers. PCR reaction efficiency was calculated for each sample based on a standard curve using serial dilutions of pooled cDNA [16]. All primer pairs used in this study showed reaction efficiencies between 89 and 110 % (Table 1).
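The efficiency calculation from a dilution series can be illustrated with a minimal sketch. It assumes the commonly used relation E = 10^(−1/slope) − 1, where the slope comes from a linear fit of Ct against log10 of the dilution; the source does not state the exact formula, and the dilution series and Ct values below are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def pcr_efficiency(dilutions, ct_values):
    """Return (slope, efficiency in percent) for a serial dilution.

    Assumption: efficiency = 10 ** (-1 / slope) - 1, with the slope taken
    from the regression of Ct on log10(relative template amount).
    """
    log_dilutions = [math.log10(d) for d in dilutions]
    slope, _intercept = fit_line(log_dilutions, ct_values)
    efficiency = 10 ** (-1.0 / slope) - 1.0
    return slope, efficiency * 100.0


# Hypothetical 10-fold dilution series of pooled cDNA (not data from the study).
dilutions = [1.0, 0.1, 0.01, 0.001]
ct_values = [15.0, 18.4, 21.8, 25.2]
slope, efficiency_percent = pcr_efficiency(dilutions, ct_values)
print(f"slope = {slope:.2f}, efficiency = {efficiency_percent:.1f} %")
```

A slope near −3.32 corresponds to perfect doubling per cycle (100 % efficiency); shallower or steeper slopes give values above or below 100 %.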

Table 1 Selected candidate reference genes evaluated in this study and information on the primers used for RT-qPCR

Data analysis

To select the most stable candidate reference genes, we applied the software tools geNorm [17], NormFinder [18], and BestKeeper [19], as previously described. For comparison of candidate gene transcription levels, the cycle threshold (Ct) values were plotted. The mean Ct values of the replicates for each sample were transformed into raw, non-normalized quantities (Q) using the standard curve method. In NormFinder and geNorm, stability values (M) were calculated for each candidate gene under investigation. BestKeeper uses raw Ct values to calculate the variation of the candidate genes.
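The transformation of Ct values into raw quantities can be sketched as follows. This is an illustration under stated assumptions, not the authors' script: it assumes a linear standard curve Ct = slope × log10(Q) + intercept, so that Q = 10^((Ct − intercept)/slope). The function names, curve parameters, and Ct values are hypothetical.

```python
def mean_ct(replicates):
    """Average the Ct values of technical replicates (duplicates here)."""
    return sum(replicates) / len(replicates)


def ct_to_quantity(ct, slope, intercept):
    """Invert a standard curve Ct = slope * log10(Q) + intercept.

    Assumption: slope and intercept come from a dilution series of pooled
    cDNA whose undiluted pool is assigned the relative quantity 1.
    """
    return 10 ** ((ct - intercept) / slope)


# Hypothetical standard-curve parameters and duplicate Ct readings.
slope, intercept = -3.4, 15.0
duplicate_cts = [18.3, 18.5]
quantity = ct_to_quantity(mean_ct(duplicate_cts), slope, intercept)
print(f"relative quantity Q = {quantity:.3f}")
```

Because the slope enters the conversion, quantities obtained this way are already corrected for the measured efficiency of each assay.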

Means and standard deviations were computed and compared. Normality was tested with the Kolmogorov–Smirnov test using SPSS 21 software. Intergroup comparisons were performed with the t test if the data were normally distributed and with the Mann–Whitney test otherwise. Multiple group comparisons were done with one-way ANOVA (in case of normal distribution). p values < 0.05 were considered statistically significant.

Results

Expression levels of candidate reference genes

To determine the expression stability of nine commonly used reference genes, RNA expression levels were measured in nineteen glioma tissue samples (3 diffuse astrocytoma, 8 anaplastic astrocytoma, and 8 GBM) and eight samples of normal brain tissue. Genes encoding beta-2-microglobulin (B2M), cytochrome c-1 (CYC1), glyceraldehyde-3-phosphate dehydrogenase (GAPDH), hydroxymethylbilane synthase (HMBS), hypoxanthine guanine phosphoribosyl transferase 1 (HPRT I), ribosomal protein L13a (RPL13A), succinate dehydrogenase (SDHA), TATA-box binding protein (TBP), and 14-3-3 protein zeta (YWHAZ) were selected (see Table 1) and validated using the statistical algorithms geNorm, NormFinder, and BestKeeper. The efficiency of all qPCR assays was confirmed to be between 89 and 110 %. The Ct values were found to be normally distributed. Mean Ct values ranged from 14.15 (GAPDH) to 21.43 (HMBS). The expression levels of the candidate reference genes for glioma and normal brain samples are shown in Fig. 1a. B2M was excluded from expression stability analysis because its expression level differed by more than twelve cycles within the high grade gliomas. The expression levels of the remaining candidate genes varied widely across the tissues of the panel, with standard deviations of the Ct values between 0.75 (RPL13A) and 1.66 (HMBS). Intergroup comparison of each tumor subgroup versus normal brain revealed statistically significant differences in candidate gene expression for HMBS in anaplastic astrocytoma and for SDHA in anaplastic astrocytoma and glioblastoma (p < 0.05). Multiple group comparison of all tumor entities plus normal brain indicated that no candidate reference gene other than HMBS and SDHA (in anaplastic astrocytoma compared to normal tissue) was differentially expressed at the p < 0.05 level. Figure 1b shows the differences in mean expression between high grade tumors and normal brain tissue. The confidence intervals for HPRT I, RPL13A, TBP and YWHAZ lie within the deviation area and overlap the zero line, so these candidate genes are considered to be equivalently expressed.

Fig. 1

Expression levels of nine candidate reference genes in glioma and normal brain tissue samples. a Values are given as quantitative real-time reverse-transcription PCR (RT-qPCR) cycle threshold (Ct) of the nine candidate genes in normal brain tissue (filled circles, n = 8), astrocytoma WHO grade II (open circles, n = 3), astrocytoma WHO grade III (filled triangles, n = 8), and glioblastoma (open diamonds, n = 8). Horizontal bars indicate the mean. b The differences of the means (filled circles) and confidence intervals (whiskers) are shown for the logarithmized relative expression of the candidate reference genes (except B2M). The deviation area from −1 to 1 (dotted lines) indicates twofold changes in expression levels between high grade gliomas and normal tissue. If the confidence interval lies within the deviation area and overlaps the zero line, the gene is considered equivalently expressed

geNorm expression stability analysis

geNorm calculates an average expression stability value (M) for each candidate gene by determining the average pairwise variation of that gene with all other candidate genes. The genes with the lowest M values are considered to be the most stable [17]. For selection of reference genes from heterogeneous sample sets, an M value of 1.5 was taken as the cut-off. For the three tumor subgroups compared to normal brain, the order of expression stability of the six analyzed candidate genes was as follows: GAPDH/RPL13A > CYC1 > TBP > YWHAZ > HPRT I (M values: 0.800, 0.886, 0.935, 1.00, 1.135) (Fig. 2a). Analysis of the pairwise variation (V) of the candidates indicates that only two reference genes are required for optimal normalization in this sample set (V < 0.15). When the high grade gliomas were analyzed separately, the stability ranking for anaplastic astrocytoma was GAPDH/YWHAZ > TBP > RPL13A > CYC1 > HPRT I (M values: 0.438, 0.654, 0.725, 0.794, 0.880), and for glioblastoma CYC1/TBP > GAPDH > RPL13A > HPRT I > YWHAZ (M values: 0.712, 0.982, 1.03, 1.15, 1.25) (Fig. 2b, c). Pairwise variation of the candidates in these sample sets likewise indicated that two reference genes are required for normalization.
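The two geNorm measures used here can be sketched in a few lines. This is a simplified illustration of the published definitions as we understand them, not the geNorm software itself: M is taken as the mean standard deviation of the log2 expression ratios of one gene against every other gene, and V as the standard deviation of the log2 ratio of two successive normalization factors (geometric means of the n and n + 1 most stable genes). The stepwise exclusion of the least stable gene is omitted, and the gene names and quantities are hypothetical.

```python
import math
from statistics import stdev


def genorm_m(quantities):
    """Single-pass stability measure M per gene.

    quantities: dict mapping gene -> list of relative quantities, with all
    lists in the same sample order. Lower M means more stable.
    """
    genes = list(quantities)
    m_values = {}
    for gene in genes:
        pair_sds = []
        for other in genes:
            if other == gene:
                continue
            log_ratios = [math.log2(a / b)
                          for a, b in zip(quantities[gene], quantities[other])]
            pair_sds.append(stdev(log_ratios))
        m_values[gene] = sum(pair_sds) / len(pair_sds)
    return m_values


def normalization_factor(quantities, genes, sample_index):
    """Geometric mean of the chosen genes' quantities in one sample."""
    logs = [math.log2(quantities[g][sample_index]) for g in genes]
    return 2 ** (sum(logs) / len(logs))


def pairwise_variation(quantities, ranked_genes, n):
    """V(n/n+1): SD of log2(NF_n / NF_{n+1}) across all samples."""
    n_samples = len(quantities[ranked_genes[0]])
    log_ratios = []
    for i in range(n_samples):
        nf_n = normalization_factor(quantities, ranked_genes[:n], i)
        nf_next = normalization_factor(quantities, ranked_genes[:n + 1], i)
        log_ratios.append(math.log2(nf_n / nf_next))
    return stdev(log_ratios)


# Hypothetical quantities for three placeholder genes in four samples.
example = {
    "GENE_A": [1.0, 2.0, 4.0, 8.0],
    "GENE_B": [2.0, 4.0, 8.0, 16.0],
    "GENE_C": [1.0, 1.0, 1.0, 1.0],
}
m = genorm_m(example)
ranked = sorted(m, key=m.get)            # most stable first
v_2_3 = pairwise_variation(example, ranked, 2)
print(m, ranked, round(v_2_3, 3))
```

In this toy data set GENE_A and GENE_B change eightfold across the samples but always in parallel, so the pairwise measure ranks them as more stable than the truly constant GENE_C. This shows why candidate genes that may be co-regulated need care when a pairwise method is used.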

Fig. 2

Gene expression stability analysis of candidate reference genes in human glioma and normal brain tissue using geNorm. The average expression stability values (M) of six reference candidates after exclusion of differentially expressed candidate genes are shown. Candidate genes are sorted from the least (left) to the most (right) stable. a Stability values for glioma compared to normal brain (n = 27). b Stability values for anaplastic astrocytoma alone (AIII, n = 8). c Stability values for glioblastoma alone (GBM, n = 8)

NormFinder expression stability analysis

Ranking of candidate genes by NormFinder is done by estimation of inter- and intragroup variations and calculation of a stability value after separate analysis of the sample subgroups [18]. The calculated stability values of the six candidate genes comparing all four subgroups are reported in Fig. 3a. The ranking was GAPDH > RPL13A > YWHAZ > CYC1 > TBP > HPRT I (stability values: 0.243, 0.283, 0.316, 0.330, 0.383, 0.442). The most stable combination of two genes with minimal combined intra- and intergroup variation in the tumor subgroups compared to normal brain was CYC1 and GAPDH with a stability value of 0.161. The stability values for high grade gliomas were ranked as shown in Fig. 3b for anaplastic astrocytoma: YWHAZ > GAPDH > CYC1 > TBP > HPRT I > RPL13A (stability values: 0.141, 0.255, 0.336, 0.408, 0.532, 0.539), and for glioblastoma: CYC1 > GAPDH > RPL13A > TBP > HPRT I > YWHAZ (stability values: 0.207, 0.305, 0.557, 0.676, 1.048, 1.161) (Fig. 3c).

Fig. 3

Gene expression stability analysis of candidate reference genes in human glioma and normal brain tissue using NormFinder. The lower the stability value, the more stable a gene is within the set of samples analyzed. Candidate genes are sorted from the least (left) to the most (right) stable. a Average stability values (M) for glioma compared to normal brain (n = 27). b Stability values (M) for anaplastic astrocytoma alone (AIII, n = 8). c Stability values (M) for glioblastoma alone (GBM, n = 8)

BestKeeper expression stability analysis

BestKeeper determines suitable reference genes by using pairwise correlation analysis [19]. The initially calculated variations [SD (±Ct) and CV (% Ct)] showed overall stability in gene expression. HPRT I exhibited a high standard deviation (SD = 1.01) and was excluded from further analysis. Thereafter, the highly correlated candidate genes were combined into an index, and pairwise correlation and regression analysis assessed the relation between this index and each candidate gene. Comparing the four sample subgroups, RPL13A showed the lowest correlation (r = 0.74) and was therefore eliminated. TBP, YWHAZ, and CYC1 showed the best correlation coefficients (r = 0.875, 0.854, 0.854), indicating that their expression correlates very well with one another and with the BestKeeper index (Fig. 4a). Analysis of the high grade glioma subgroups identified YWHAZ, CYC1, and GAPDH as the best correlating candidate reference genes for anaplastic astrocytoma (r = 0.948, 0.920, 0.903) (Fig. 4b), and TBP, YWHAZ, and CYC1 for glioblastoma (r = 0.942, 0.915, 0.902) (Fig. 4c).
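The correlation step can be illustrated with a minimal sketch. It assumes that the index is the geometric mean of the candidate genes' Ct values in each sample and that each gene is then compared with this index by Pearson correlation. It does not reproduce the BestKeeper tool (the SD and CV screening and the regression statistics are left out), and the gene names and Ct values are hypothetical.

```python
import math


def geometric_mean(values):
    """Geometric mean of positive numbers."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def bestkeeper_correlations(ct_by_gene):
    """Return (index per sample, r of each gene against the index).

    ct_by_gene: dict mapping gene -> list of raw Ct values, with all
    lists in the same sample order.
    """
    genes = list(ct_by_gene)
    n_samples = len(ct_by_gene[genes[0]])
    index = [geometric_mean([ct_by_gene[g][i] for g in genes])
             for i in range(n_samples)]
    correlations = {g: pearson_r(ct_by_gene[g], index) for g in genes}
    return index, correlations


# Hypothetical raw Ct values for three placeholder genes in four samples.
example_ct = {
    "GENE_1": [14.0, 15.0, 16.0, 17.0],
    "GENE_2": [18.0, 19.0, 20.0, 21.0],
    "GENE_3": [20.0, 19.0, 21.0, 18.0],
}
index, r_values = bestkeeper_correlations(example_ct)
for gene, r in sorted(r_values.items(), key=lambda item: -item[1]):
    print(f"{gene}: r = {r:.3f}")
```

A gene whose Ct values rise and fall together with the index obtains a high r, whereas a gene that varies independently of the others scores low even if its own spread is small.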

Fig. 4

Gene expression stability analysis of candidate reference genes in human glioma and normal brain tissue using BestKeeper. Coefficient of correlation (r) is shown for the four most stable candidate reference genes compared to the BestKeeper index. Candidate genes are sorted from the least (left) to the most (right) stable. a Correlation values for glioma compared to normal brain (n = 27). b Correlation values for anaplastic astrocytoma alone (AIII, n = 8). c Correlation values for glioblastoma alone (GBM, n = 8)

Discussion

RT-qPCR is becoming the method of choice for gene expression analysis in specific experimental settings because of its wide dynamic range and high sensitivity. However, RT-qPCR requires accurate data normalization strategies. The purpose of normalization is to eliminate non-biological variation, and reference genes are used for this purpose. A valid reference gene is a protein-coding gene or ribosomal RNA gene that ideally exhibits invariant expression levels across all test samples. Although genes that regulate basic and ubiquitous cellular functions are supposed to be almost invariable between different samples, previous studies demonstrated that the expression levels of possible reference genes can vary within a sample set, for instance as a result of pathological transformation or experimental treatment [20, 21]. Hence, more than one reference gene should be used for validation of each experimental setting [22, 23], and an accurate evaluation of reference gene stability is necessary to generate reliable results. Systematic analysis of multiple reference genes should be used to identify putative candidates. Furthermore, regulated genes must first be excluded on the basis of raw expression data [24].

We analyzed the expression of nine candidate reference genes in 27 human tissue samples (19 gliomas and 8 normal brain samples). All potential reference genes used in this study are considered common reference genes, and their use has been reported in a multitude of RT-qPCR studies [13, 23, 25]. The RT-qPCR expression data obtained for the reference candidates were evaluated using three independent expression stability analysis methods, geNorm, NormFinder, and BestKeeper, and the results were compared. We conducted a preliminary intergroup comparison of Ct values to exclude genes whose expression levels differ between sample groups. Here HMBS and SDHA showed significant differences in expression between high grade gliomas and normal brain tissue, and were therefore excluded from further evaluation.

geNorm is one of the most popular algorithms for validating candidate reference genes [17]. It determines expression stability (M) via a pairwise comparison of one candidate reference gene with all other candidate genes, independent of the level of gene expression in each sample. A low M value indicates more stable expression and hence greater suitability of a particular gene as a reference gene. However, because the algorithm relies on pairwise comparisons, co-regulation of candidate genes can influence its results. To minimize the risk of co-regulation, the nine candidate reference genes selected for this analysis were chosen on the basis of their physiological functions: immune system (B2M), electron transport (CYC1), energy metabolism (GAPDH, SDHA), heme production (HMBS), nucleotide salvage (HPRT I), translation (RPL13A), transcription (TBP), and signaling pathways (YWHAZ).

According to this analysis, GAPDH and RPL13A represented the best combination of reference genes for glioma compared to normal brain tissue (Fig. 2a), while CYC1 and TBP were ranked third and fourth, respectively. For gene expression analysis in glioblastoma alone, CYC1 and TBP were the most stable reference genes, followed by GAPDH and RPL13A (Fig. 2c).

Andersen and colleagues proposed a model-based approach that is implemented in the NormFinder software [18]. This algorithm identifies candidate reference genes with an intergroup variation as close to zero as possible and, at the same time, a small intragroup variation. This represents an effective method to overcome the influence of co-regulation. Ranking our candidate reference genes by their stability values, CYC1 and GAPDH seemed to be the best combination of endogenous control genes. Further examination of the results revealed that the intragroup variation of CYC1 was relatively large. Therefore, it was ranked fourth, and RPL13A and YWHAZ appear to be the more stable reference genes. Examining the stability values of the candidate genes for glioblastoma alone, the best control genes are CYC1, GAPDH, and RPL13A.

The same two genes, GAPDH and RPL13A, were ranked as the most stable by both geNorm and NormFinder for gene expression analysis of glioma compared to normal brain.

In order to compare the geNorm and NormFinder results with an independent ranking method, the data were also analyzed with BestKeeper [19]. In this algorithm, stable reference gene expression is indicated by low variation between the samples under examination. BestKeeper uses a pairwise correlation analysis for all pairs of candidate genes based on the raw Ct values and calculates the geometric mean of the best suited ones. Hence, stability (SD) and relationship to the BestKeeper index (r values) are the two most important criteria for evaluating the stability of reference genes. HPRT I was excluded from further analysis because of its high standard deviation. Based on this algorithm, TBP and YWHAZ followed by CYC1 were ranked as the top three genes (Fig. 4a). GAPDH was ranked as the least stable gene, and RPL13A had to be excluded because it showed the worst coefficient of correlation, although it exhibits the lowest standard deviation and therefore the lowest intergroup variation. Both GAPDH and RPL13A have relatively low Ct values of 14.15 and 14.33, respectively. It therefore seems that the comparison to the BestKeeper index favors candidate reference genes close to the overall mean Ct value, in this case 17.71. For that reason, the BestKeeper results were left out of the concluding assessment of reference genes for expression analysis in human glioma. For glioblastoma alone, BestKeeper analysis produced TBP, YWHAZ, and CYC1 as the most stable reference genes, unlike geNorm and NormFinder.

The reference genes identified as suitable for normalization in tumor tissue of different WHO grades and normal brain (GAPDH, RPL13A, CYC1, and TBP) were identical to the top-ranked reference genes for normalization in glioblastomas, thereby supporting the validity of the results obtained from different modeling strategies. These genes may serve as “universal” reference genes for RT-qPCR experiments comparing GBM, astrocytoma grade III, astrocytoma grade II, and normal brain tissue. It is important to note, however, that the order of suitable candidate genes changed when the stability evaluation was done for glioblastoma alone. Although highly ranked across the tumor subgroups, GAPDH and RPL13A turned out to be more variable within the glioblastoma samples and were not the best references for expression analysis. Here GAPDH and CYC1 are the most stable.

Based on our stability analysis and previous data [13–15], we recommend the inclusion of GAPDH, RPL13A, CYC1, YWHAZ, and TBP in a candidate set for evaluation of reference genes for expression analysis in human glioma. Out of these most constitutively expressed candidates, we use GAPDH, RPL13A, and CYC1 in further expression analysis. The remaining candidate genes, YWHAZ and TBP, exhibited high expression variation within the glioblastoma samples and hence were positioned among the least stable reference genes in this set. Furthermore, the three proposed candidates belong to three different functional classes of protein-coding RNAs, which should avoid the problem of co-regulation within the reference gene set. In conclusion, our findings suggest that use of these three evaluated reference genes will result in accurate RT-qPCR data analysis for human glioma of different WHO grades and normal brain tissue. Overall, regardless of the algorithm used, the pattern of the most and least stable genes remains the same, indicating well-matched estimations of the intragroup variation of candidate genes by all three software applications.