Introduction

Molecular biomarkers are urgently needed for late-onset autosomal dominant Machado-Joseph disease/spinocerebellar ataxia type 3 (MJD/SCA3) (MIM # 109150; ORPHA98757) to complement clinical scales in ongoing and imminent clinical trials for this disease. MJD is a polyglutamine (polyQ) disorder caused by a CAG repeat expansion located in exon 10 of the ATXN3 gene, which encodes the ubiquitously expressed ataxin-3 protein (Kawaguchi et al. 1994). Expansion of the polyQ tract above a pathological threshold initiates a cascade of pathogenic events, including failure of cellular protein homeostasis, transcriptional dysregulation, mitochondrial dysfunction, and abnormal neuronal signaling (reviewed in Costa and Paulson 2012).

We previously identified a set of potential blood transcriptional biomarkers for this disease in cross-sectional (Raposo et al. 2015) and preliminary longitudinal studies (unpublished work) in MJD subjects. Validation of transcriptional candidate biomarkers relies on quantitative real-time PCR (qPCR) experiments, yet accurate quantification of gene expression levels by qPCR depends on the normalization of data using reference genes. While the ideal reference gene should be stably expressed under different experimental conditions (Bustin et al. 2009), several studies have reported that expression levels of commonly used housekeeping genes may vary across biological and experimental conditions (see, among others, Coulson et al. 2008; Stamova et al. 2009; Butterfield et al. 2010). It is a common but unfortunate practice in the scientific community to choose reference genes without validating them in the samples used in a specific study. Identification of the best reference genes for an experimental design (e.g., specific tissue, cell type, and/or biological condition) is a crucial step towards trustworthy data (reviewed in Chapman and Waldenström 2015). To our knowledge, the identification and validation of reference genes to normalize qPCR expression data using biological samples from MJD subjects (preataxic and patients) have not been conducted. The reported involvement of mutant ataxin-3 in transcriptional dysregulation (reviewed in Nóbrega and de Almeida 2012; Costa and Paulson 2012) further reinforces the need to validate the reference genes to be used in such experiments.

Seeking to identify a set of reference genes to normalize gene expression in blood samples from MJD subjects, we evaluated the expression behavior of five housekeeping genes previously reported as being stably expressed in peripheral blood: (1) peptidylprolyl isomerase B gene (PPIB), which has been used by our group as a reference gene in previous gene expression experiments using samples from MJD subjects (Raposo et al. 2015, 2017; Kazachkova et al. 2017), based on its stability in blood samples from Huntington disease (HD) subjects (Diamanti et al. 2013); (2) TNF receptor associated protein 1 gene (TRAP1) (Stamova et al. 2009); (3) beta-2-microglobulin gene (B2M) (Stamova et al. 2009); (4) 2,4-dienoyl-CoA reductase 1 gene (DECR1) (Stamova et al. 2009); and (5) folylpolyglutamate synthase gene (FPGS) (Stamova et al. 2009). Verification of the stability of these reference genes will allow their use in future MJD blood-based qPCR studies, namely those aiming to identify reliable transcriptional biomarkers for MJD. Our results showed that all five candidate housekeeping genes were expressed in a stable manner in our samples and, therefore, can be used as reference genes in future qPCR experiments to identify transcriptional biomarkers in blood of MJD subjects.

Subjects and Methods

Subjects

Peripheral blood samples from ten preataxic MJD subjects (without clinical diagnosis of MJD), ten patients, and 20 age- and sex-matched controls of Azorean background (Table 1) were used to evaluate the expression behavior of five candidate housekeeping genes. Preataxic subjects volunteered for the study after completing the Genetic Counseling and Predictive Test Program and being identified as carriers of the MJD mutation (CAG repeat expansion in the ATXN3 gene). The MJD mutation was molecularly confirmed in patients and excluded in control individuals. Clinical diagnosis of MJD was established by a single neurologist (J. Vasconcelos) at the Department of Neurology–Hospital Divino Espírito Santo (HDES, Ponta Delgada, Azores, Portugal), and age at onset was defined as the age of appearance of gait disturbance and/or diplopia reported during clinical assessment. Neurological evaluation was performed using the NESSCA (Neurological Examination Score for the Assessment of Spinocerebellar Ataxias) rating scale (Kieling et al. 2008). This study was approved by the Ethics Committee of HDES. Informed consent was obtained from all participants.

Table 1 Demographic, genetic, and clinical features of MJD subjects and age- and sex-matched controls

RNA Isolation, Quality Control, and cDNA Synthesis

Whole-blood samples were collected in Tempus™ Blood RNA tubes. Total RNA was isolated using the MagMax for Stabilized Blood Tubes RNA Isolation Kit (Thermo Fisher Scientific, Waltham, MA, USA), according to the manufacturer’s protocol. Purity (A260/280) and concentration of RNA samples were assessed with a Nanodrop 2000c (Thermo Fisher Scientific). Complementary DNA (cDNA) was synthesized from 0.5 μg of total RNA, using the High Capacity cDNA Reverse Transcription Kit (Applied Biosystems, Foster City, CA, USA), according to the manufacturer’s protocol.

Quantitative Real-Time PCR

Expression of PPIB, TRAP1, B2M, DECR1, and FPGS

TaqMan Gene Expression Assays (Applied Biosystems) (Table 2) and TaqMan Gene Expression Master Mix (Applied Biosystems) were used for qPCR of the five selected reference genes. The TaqMan assays were selected to fulfill three criteria cumulatively: (1) they are pre-developed and pre-validated assays for which an amplification efficiency of over 96% is guaranteed by the manufacturer; (2) the probes span an exon-exon junction; and (3) the qPCR reaction generates a short amplicon. qPCR was performed in an ABI StepOnePlus™ Real-Time PCR System (Applied Biosystems) using 100 ng of cDNA per reaction, and each sample was tested in triplicate.

Table 2 Five candidate reference genes and one target gene selected to be analyzed by qPCR in blood samples of MJD preataxic subjects, MJD patients, and corresponding age- and sex-matched controls

Performance of PPIB and TRAP1 Using ERAP2 as Target Gene

To assess the performance of PPIB and TRAP1 as reference genes, the endoplasmic reticulum aminopeptidase 2 gene (ERAP2) was randomly selected from a previous study (Raposo et al. 2015) to be used as the target gene. Expression levels of ERAP2 were determined by qPCR using TaqMan Gene Expression Assays (Applied Biosystems) (Table 2) and the SensiFast Probe Hi-ROX Kit (Bioline). Data were analyzed using either PPIB or TRAP1 as the reference gene.

Data Analysis

Expression of PPIB, TRAP1, B2M, DECR1, and FPGS

Raw data obtained from qPCR experiments were analyzed with StepOne™ Software v2.3 (Applied Biosystems). Average raw quantification cycle (Cq) and standard deviation (SD) values for PPIB, TRAP1, B2M, DECR1, and FPGS were calculated, considering ungrouped samples. The median, 25th and 75th percentiles, and the range of raw Cq values for each candidate reference gene were calculated for each subgroup (preataxic MJD subjects, patients, and corresponding matched controls) using IBM® SPSS® statistics (version 25). Expression stability of the five candidate reference genes was assessed using three different statistical algorithms, each implemented in its own software package: the original geNorm package (Version 3) (Vandesompele et al. 2002), NormFinder (R version 5, 2015-01-05) (Andersen et al. 2004), and the Excel-based BestKeeper tool (Version 1) (Pfaffl et al. 2004). These algorithms rank the candidate reference genes in decreasing order of expression stability. GeNorm calculates a gene expression stability measure (M-value) for all candidate reference genes in ungrouped samples, based on the average pairwise variation of a particular gene compared with all other genes. The least stable gene (highest M-value) is then excluded stepwise, and the process is repeated until the combination of the two most stable genes is obtained (an M-value ≤ 1.0 indicates stable expression (Hellemans and Vandesompele 2014)). NormFinder calculates the expression stability of candidate reference genes based on their intra- and intergroup variation, which are combined into a stability value (S) for each reference gene (the most stable gene has the lowest stability value). This algorithm calculates stability values for either ungrouped or grouped samples and, for grouped samples, also provides the best pair of genes.
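
As an illustration of the geNorm stability measure described above, the following is a minimal Python sketch, not the geNorm package itself. It assumes 100% amplification efficiency, so that the log2 expression ratio of two genes in a sample equals the difference of their Cq values, and takes the pairwise variation to be the SD of these log2 ratios across samples. The function names (m_values, genorm_ranking), the gene labels, and the Cq values are hypothetical.

```python
from statistics import mean, stdev


def m_values(cq):
    """Return a geNorm-style M-value for each gene.

    cq maps a gene name to its Cq values, one per sample, in the same sample order.
    """
    m = {}
    for gene_j, cq_j in cq.items():
        pairwise_sd = []
        for gene_k, cq_k in cq.items():
            if gene_k == gene_j:
                continue
            # Assumption: 100% efficiency, so log2(quantity_j / quantity_k) = Cq_k - Cq_j.
            log2_ratios = [k - j for j, k in zip(cq_j, cq_k)]
            # Pairwise variation: SD of the log2 ratios across samples.
            pairwise_sd.append(stdev(log2_ratios))
        # M-value: average pairwise variation against all other genes.
        m[gene_j] = mean(pairwise_sd)
    return m


def genorm_ranking(cq):
    """Exclude the least stable gene stepwise until two genes remain."""
    remaining = dict(cq)
    excluded = []  # (gene, M-value at exclusion), least stable first
    while len(remaining) > 2:
        m = m_values(remaining)
        worst = max(m, key=m.get)  # highest M-value = least stable
        excluded.append((worst, m[worst]))
        del remaining[worst]
    pair_m = mean(list(m_values(remaining).values()))
    return tuple(remaining), pair_m, excluded


# Hypothetical Cq values for three made-up genes in four samples.
toy_cq = {
    "GENE_A": [20.0, 21.0, 22.0, 23.0],
    "GENE_B": [25.0, 26.0, 27.0, 28.0],
    "GENE_C": [30.0, 28.0, 33.0, 29.0],
}
best_pair, pair_m, excluded = genorm_ranking(toy_cq)
print(best_pair, pair_m, excluded)
```

In this toy data set, GENE_A and GENE_B differ by a constant Cq offset in every sample, so they form the most stable pair, while GENE_C is excluded first.
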
BestKeeper calculates the stability of gene expression in ungrouped samples based on the SD of Cq values and on the Pearson correlation coefficient (r) (a higher r value indicates higher stability). The Pearson correlation coefficient is computed between each candidate reference gene and the BestKeeper Index (a combination of all highly correlated genes, calculated by pairwise correlation analysis). In addition, BestKeeper tests sample integrity by calculating an intrinsic variance (InVar) of the Cq value for each sample, based on the deviation of the sample from the mean value of all samples (samples with a threefold over- or under-expression should be excluded from further analyses (Pfaffl et al. 2004)). In all analyses, 100% efficiency was assumed for all candidate reference genes, and data analysis was performed according to the instructions of each software package.
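
The two BestKeeper-style quantities described above (the per-gene SD of Cq values and the correlation of each gene with an index) can be sketched as follows. This is a simplified illustration rather than the BestKeeper spreadsheet: it assumes the index is the per-sample geometric mean of the Cq values of all candidate genes and uses the ordinary sample SD, whereas the tool's exact descriptive statistics may differ. The function names, gene labels, and Cq values are hypothetical.

```python
from math import sqrt
from statistics import geometric_mean, mean, stdev


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def bestkeeper_summary(cq):
    """Return per-gene SD and r against the index, plus the index itself.

    cq maps a gene name to its Cq values, one per sample, in the same sample order.
    """
    genes = list(cq)
    n_samples = len(cq[genes[0]])
    # Assumption: the index of a sample is the geometric mean of its Cq values.
    index = [geometric_mean([cq[g][i] for g in genes]) for i in range(n_samples)]
    stats = {
        g: {"sd": stdev(cq[g]), "r": pearson_r(cq[g], index)}
        for g in genes
    }
    return stats, index


# Hypothetical Cq values for three made-up genes in four samples.
toy_cq = {
    "GENE_A": [20.0, 21.0, 22.0, 23.0],
    "GENE_B": [25.0, 26.0, 27.0, 28.0],
    "GENE_C": [30.0, 28.0, 33.0, 29.0],
}
stats, index = bestkeeper_summary(toy_cq)
for gene, s in stats.items():
    print(gene, round(s["sd"], 3), round(s["r"], 3))
```

A gene whose Cq values track the index closely obtains an r value near 1, which under the criterion above is read as higher stability.
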

Performance of PPIB and TRAP1 as Reference Genes Using ERAP2 as Target Gene

Relative expression values of ERAP2 were normalized to either PPIB or TRAP1 and determined by the 2^(−ΔCq) method (Livak and Schmittgen 2001) using DataAssist v3.0 (Applied Biosystems). ERAP2 expression levels in blood of MJD subjects (preataxic and patients) and corresponding age- and sex-matched controls were compared by the Wilcoxon test. An ANCOVA test was conducted to compare ERAP2 expression levels between preataxic subjects and patients using age at blood collection as a covariate. Statistical analyses were performed using IBM® SPSS® statistics (version 25).
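
For readers unfamiliar with the normalization step, the relative expression calculation can be sketched in a few lines of Python. This is an illustrative sketch and not the DataAssist implementation: it assumes that technical replicates are averaged first, that ΔCq is the target Cq minus the reference Cq, and that amplification efficiency is 100%. The function name and the Cq values are hypothetical.

```python
from statistics import mean


def relative_expression(target_cq_replicates, reference_cq_replicates):
    """Relative expression of a target gene normalized to one reference gene."""
    # Average the technical replicates of each gene first.
    delta_cq = mean(target_cq_replicates) - mean(reference_cq_replicates)
    # Assumption: 100% efficiency, i.e. the product doubles every cycle.
    return 2.0 ** (-delta_cq)


# Hypothetical triplicate Cq values for one sample.
target = [27.0, 27.0, 27.0]
reference = [25.0, 25.0, 25.0]
value = relative_expression(target, reference)
print(value)  # 0.25
```

A target gene that crosses the threshold two cycles later than the reference gene is therefore reported at one quarter of the reference level.
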

Results

Expression of PPIB, TRAP1, B2M, DECR1, and FPGS in Blood of MJD Subjects

Transcript levels of PPIB, TRAP1, B2M, DECR1, and FPGS were assessed by qPCR in blood of MJD preataxic subjects, patients, and matched controls. Average raw Cq values plotted versus SD of average raw Cq values for PPIB, TRAP1, B2M, DECR1, and FPGS are shown in Fig. 1. Considering ungrouped samples, Cq values ranged from 23.3 to 32.7 with B2M and FPGS showing, respectively, the highest and lowest expression levels (Fig. 1). PPIB and TRAP1 showed the highest and the lowest expression level variation, respectively (Fig. 1).

Fig. 1 Raw quantification cycle (Cq) values for each candidate reference gene. Average raw Cq values vs. standard deviation (SD) of average raw Cq values for ungrouped samples

GeNorm, NormFinder, and BestKeeper were used to analyze and rank the expression stability of the five candidate reference genes (Table 3). GeNorm analysis revealed that the M-values of all candidate reference genes were below the cutoff value (M ≤ 1.0), indicating that all genes are stably expressed in our samples. The geNorm ranking from the most stable (lowest M-value) to the least stable gene (highest M-value) was: B2M/DECR1, TRAP1, FPGS, and PPIB (Table 3). NormFinder provided a different rank order of candidate reference genes, which was identical in ungrouped and grouped samples: TRAP1 was identified as the most stable gene (lowest stability value), followed by B2M, DECR1, FPGS, and PPIB (Table 3). Moreover, considering the four biological groups (grouped samples), NormFinder identified TRAP1/B2M and TRAP1/DECR1 as the two best pairs of genes. BestKeeper analysis confirmed that all candidate reference genes are stably expressed (SD < 1) and revealed the same ranking of expression stability as NormFinder (Table 3). Furthermore, based on the InVar value of each sample (lower value and x-fold below the exclusion value) provided by BestKeeper (data not shown), we inferred that all samples had adequate integrity according to this algorithm. Taken together, the three algorithms showed that all five candidate housekeeping genes were expressed in a stable manner in MJD samples.

Table 3 Ranking of the five candidate reference genes provided by geNorm, NormFinder, and BestKeeper algorithms

Performance of PPIB and TRAP1 as Reference Genes Using ERAP2 as Target Gene

To evaluate the behavior of PPIB and TRAP1 as reference genes, we analyzed the expression levels of ERAP2 in blood samples of MJD subjects. ERAP2 was selected among hits of a gene expression study in MJD blood samples (Raposo et al. 2015). No differences in expression levels of ERAP2 normalized using PPIB were found for MJD preataxic subjects (0.199 ± 0.044 standard error (SE)) or for MJD patients (0.189 ± 0.052 (SE)) when compared to corresponding matched controls (0.265 ± 0.062 (SE) and 0.255 ± 0.068 (SE), respectively) (Wilcoxon test, p > 0.05). Importantly, ERAP2 transcript levels normalized using TRAP1 were also similar for preataxic subjects (1.018 ± 0.241 (SE)) and patients (2.277 ± 1.084 (SE)) when compared to matched controls (2.009 ± 0.549 (SE) and 2.378 ± 0.868 (SE), respectively) (Wilcoxon test, p > 0.05). Furthermore, no statistically significant differences in ERAP2 levels, normalized using either PPIB or TRAP1, were found between MJD preataxic subjects and patients (ANCOVA test, p > 0.05). Overall, the statistical analysis revealed no differences in ERAP2 levels between the biological groups studied, whether normalized with PPIB or TRAP1.

Discussion

In this study, we evaluated the stability of expression of PPIB, TRAP1, B2M, DECR1, and FPGS as candidate reference genes for normalization of gene expression data obtained by qPCR in blood from MJD subjects (preataxic and patients). Assessment of expression stability showed that all five studied genes were stably expressed in our set of samples. These results are in accordance with previous reports that identified these genes as suitable reference genes for qPCR experiments using peripheral blood from patients with several diseases (including HD, Tourette syndrome, and Muscular Dystrophy) and controls (Pachot et al. 2004; Stamova et al. 2009; Diamanti et al. 2013). While all five genes were stably expressed in MJD blood samples, NormFinder and BestKeeper identified TRAP1 as the most stable gene of the five, in contrast to the result obtained by geNorm. Notably, using both geNorm and NormFinder, Stamova et al. (2009) also identified TRAP1 as the most stable gene in blood of patients with several other disorders.

Following the expression stability analysis of all five candidate reference genes, the performance of the most stable gene, TRAP1, was tested alongside PPIB, which was used by our group as a reference gene in previous gene expression experiments using blood samples from MJD subjects (Raposo et al. 2015). Using ERAP2 as the target gene, and PPIB or TRAP1 as reference genes, we were able to show that the choice of the reference gene (PPIB or TRAP1) did not impact the results of the comparison between the biological groups analyzed. This result reinforces the idea that both PPIB and TRAP1 are stably expressed in our sample set.

Several studies reported that qPCR data normalization with a set of reference genes is more precise than normalization with a single gene (Vandesompele et al. 2002; Coulson et al. 2008; Stamova et al. 2009; Diamanti et al. 2013). However, according to the Minimum Information for Publication of Quantitative Real-Time PCR Experiments (MIQE) guidelines (Bustin et al. 2009), the use of a single reference gene is acceptable if its expression stability in the specific experimental conditions has been previously demonstrated. Here, we show that all five candidate housekeeping genes (PPIB, TRAP1, B2M, DECR1, and FPGS) were expressed in a stable manner in our samples and, therefore, any of them can be used as a single reference gene in future qPCR studies in blood of MJD subjects.