Introduction

The razor clam (Sinonovacula constricta) is an economically important bivalve species distributed in the intertidal zones and estuarine waters along the coast of the West Pacific Ocean. In China, it is one of four important traditional shellfish, with a breeding history of hundreds of years. Owing to its benthic lifestyle and strong adaptability, it is also a main species for polyculture with crustaceans. Because of its commercial value and ecological niche, molecular and genetic research on S. constricta has been increasing (Niu et al. 2017; Peng et al. 2017; Ran et al. 2017; Zhao et al. 2017).

Gene expression analysis is an important tool for understanding gene function and the pathways and networks underlying cellular and biological processes. Although several technologies are available to evaluate gene expression, such as northern blotting, microarrays and next-generation sequencing, quantitative real-time PCR (qRT-PCR) remains a common method, and is even used to validate sequencing results, owing to its high sensitivity and convenience (Bustin et al. 2005; Kubista et al. 2006). However, qRT-PCR data must be normalized before samples are compared, in order to minimize non-specific variation arising from RNA quality, reverse transcription efficiency and sample quantity (Guénin et al. 2009). To ensure the accuracy of qRT-PCR results among tissues and over time, appropriate selection of reference genes is crucial. Housekeeping genes are constitutive genes that are expressed to maintain basic cellular functions, and their expression is expected to remain constant across tissues and physiological conditions (Thellin et al. 1999; Dheda et al. 2004). Hence, the internal reference controls for qRT-PCR are mostly housekeeping genes. In practice, however, various studies have reported that housekeeping genes used as reference controls are expressed variably across tissues, developmental stages and treatments (Stürzenbaum and Kille 2001). Previous studies have also shown that the expression level of the same gene can vary greatly, and even yield opposite results, when normalized by different reference genes (Niu et al. 2015; Koramutla et al. 2016). These findings show that choosing a reliable reference gene is essential for accurate normalization of target gene expression levels. It is therefore necessary to validate the stability of housekeeping genes for the given organism and experiment.

The identification of optimal reference genes for qRT-PCR has been carried out in aquatic organisms, such as fish (Ingerslev et al. 2006; Filby and Tyler 2007; Zheng and Sun 2011), shrimp (Dhar et al. 2009; Valenzuela-Castillo et al. 2017) and molluscs (Cubero-Leon et al. 2012; Du et al. 2013; Song et al. 2017; Volland et al. 2017). Unfortunately, qRT-PCR normalization in immunology studies of non-model organisms, such as bivalves, is frequently performed with genes that have not been validated, or with genes previously validated under different experimental conditions or in other species (Volland et al. 2017).

The razor clam is a promising species for study among the bivalves. To reduce the damage caused by disease, its immune response to pathogens has attracted widespread attention. A number of housekeeping genes commonly used in bivalves, such as β-actin (ACT), elongation factor 1-α (EF1), glyceraldehyde-3-phosphate dehydrogenase (GAPDH), 18S and tubulin (TUB), have been reported in previous studies (Meistertzheim et al. 2007; Mateo et al. 2010; Li et al. 2011; Liu et al. 2014a, b). In addition to these genes, we selected housekeeping genes that were stably expressed in high-throughput data from S. constricta treated with Cd2+ and Vibrio parahaemolyticus (Wang et al. 2016; Zhao et al. 2017) as candidate reference genes. In this study, we evaluated eight candidate housekeeping genes across five tissues and under Vibrio infection in S. constricta to identify the most stable internal controls for normalization of qRT-PCR data in response to pathogens.

Materials and methods

Selection of candidate reference genes and primer design

Based on RNA-seq data from 12 libraries of the gill and hepatopancreas of the razor clam under Cd2+ stress (Wang et al. 2016) and Vibrio infection (Zhao et al. 2017), genes with similar expression levels were classified into ten clusters. Candidate reference genes were selected from clusters with stable expression trends and were orthologs of the housekeeping genes of Crassostrea gigas (Du et al. 2013). Using the FPKM values of the candidate reference genes in the 12 transcriptomes, the coefficient of variation (CV) of each gene was calculated in Microsoft Excel to assess expression stability (Robinson et al. 2010). CV was calculated as SD/mean. Based on the CV values and on commonly used internal controls, six candidate genes were selected. In addition, β-actin (ACT) and tubulin (TUB), which are commonly used in bivalves, were added as candidate reference genes (Volland et al. 2017). Glyceraldehyde-3-phosphate dehydrogenase is a common housekeeping gene in most animals, but no ortholog was found in the transcriptome of S. constricta. The cDNA sequences of the selected housekeeping genes were obtained from our RNA-seq transcriptome dataset of the razor clam. The qRT-PCR primers were designed with Primer 3 (http://bioinfo.ut.ee/primer3-0.4.0/) with melting temperatures between 59 and 61 °C (Table 1). Lysozyme (F: AACTATTGGCTGGACTGTGGCTC; R: GACATTAGAACATCCTGGCTGG) and C1q-domain-containing protein (C1qDC) (F: AATGCTTACAACTCCCACGCCG; R: CCACGCTTTGCTGTCACTACTG) were chosen as target genes to validate the applicability of the reference genes.
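The CV calculation described above (SD/mean of FPKM values across libraries) can be sketched in a few lines. This is only an illustration: the gene names and FPKM values are hypothetical, and the sample standard deviation is an assumption, since the text does not state whether the sample or population SD was used in Excel.

```python
from statistics import mean, stdev


def coefficient_of_variation(fpkm_values):
    """Return SD / mean for one gene's FPKM values across libraries.

    Assumption: the sample standard deviation (n - 1 denominator) is used.
    """
    m = mean(fpkm_values)
    if m == 0:
        raise ValueError("mean FPKM is zero; CV is undefined")
    return stdev(fpkm_values) / m


# Hypothetical FPKM values for two hypothetical genes (not data from the study).
fpkm = {
    "gene_A": [100.0, 110.0, 90.0, 105.0],
    "gene_B": [10.0, 200.0, 50.0, 5.0],
}

# Rank genes from lowest CV (most stable) to highest CV (least stable).
ranked = sorted(fpkm, key=lambda gene: coefficient_of_variation(fpkm[gene]))
```

A lower CV indicates less variation relative to the expression level, which is why genes at the front of the ranked list are the more plausible reference candidates.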

Table 1 Candidate reference genes and their primer sequences for qRT-PCR

Sample collection

Adult clams (S. constricta) with an average body weight of 10 g, average body length of 5.5 cm and average shell width of 2.25 cm were purchased from a commercial clam fishery (Ningbo, Zhejiang, China) in June 2017. The clams were held in aerated 40 L tanks with filtered seawater at a salinity of 20‰ and a temperature of 16 °C. After acclimation for three days, all the clams were exposed to V. parahaemolyticus at a final concentration of 10^7 CFU/mL. The gill tissues from five individuals were pooled as one sample, and samples were collected at 0, 6, 12, 24 and 48 h. The tissues were dissected, immediately frozen in liquid nitrogen and stored at −80 °C for later use. Each time point had three biological replicates.

Total RNA extraction and cDNA synthesis

The gill tissues from five individuals with the same treatment were homogeneously mixed. Total RNA was extracted using RNAiso Plus (TaKaRa, Tokyo, Japan) according to the manufacturer’s instructions. RNA integrity was confirmed by gel electrophoresis, with a ribosomal band ratio of 2.0. The quantity and quality of the total RNA were measured with a NanoDrop ND-1000 spectrophotometer (Thermo Scientific, Wilmington, DE, USA). Samples with A260/A280 and A260/A230 ratios greater than 1.8 were used for cDNA synthesis. Total RNA was diluted with nuclease-free water, and 1 µg of RNA from each sample was reverse transcribed using a PrimeScript™ RT reagent Kit with gDNA Eraser (TaKaRa).

qRT-PCR analysis

Real-time PCR reactions were performed using an ABI 7500 with 7500 software v2.0.1 (Applied Biosystems). The 20 µl PCR mixture contained 10 µl of PowerUp SYBR Green Master Mix (Applied Biosystems), 0.8 µl of 10 µM forward primer, 0.8 µl of 10 µM reverse primer, 1 µl of cDNA template and 7.4 µl of nuclease-free water. Cycling conditions were an initial UDG activation step at 50 °C for 2 min, then 95 °C for 2 min, followed by 40 cycles of 15 s at 95 °C and 1 min at 60 °C, and finally a melting curve stage. For each reference gene, no-template reactions were run as negative PCR controls. To calculate the gene-specific PCR efficiency, standard curves were generated from a dilution series of cDNA template for each primer pair.
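As an illustration of how an amplification efficiency is commonly derived from such a standard curve, the sketch below fits CT against log10 of the relative template amount and applies the widely used relation E = 10^(−1/slope) − 1. The text does not state the formula that was used, so this relation is an assumption, and the dilution series and CT values are hypothetical.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares fit; returns slope, intercept and R^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


def pcr_efficiency(relative_amounts, ct_values):
    """Return (efficiency in percent, R^2) from a dilution-series standard curve."""
    log_amounts = [math.log10(a) for a in relative_amounts]
    slope, _, r_squared = fit_line(log_amounts, ct_values)
    efficiency = 10 ** (-1 / slope) - 1  # 1.0 means perfect doubling per cycle
    return efficiency * 100, r_squared


# Hypothetical 10-fold dilution series with ideal doubling:
# each 10-fold dilution delays the CT by log2(10) cycles.
amounts = [1.0, 0.1, 0.01, 0.001]
cts = [20.0 + math.log2(10) * k for k in range(4)]
efficiency_pct, r2 = pcr_efficiency(amounts, cts)
```

With this idealized input the efficiency comes out at about 100% and R^2 at about 1; real dilution series deviate from both.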

Analysis of gene expression stability

Gene expression stability was evaluated by geNorm (qbase+, version 3.1) (Hellemans et al. 2007), NormFinder (version 0.953) (Andersen et al. 2004) and BestKeeper (version 1) (Pfaffl et al. 2004). The raw CT values for the eight selected genes in the 15 vibrio-infected samples and 15 tissue samples were exported into an Excel sheet. RefFinder (http://150.216.56.64/referencegene.php?type=reference), which integrates geNorm, NormFinder, BestKeeper and the delta CT method, was used to combine the stability rankings generated by the individual algorithms.
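For readers unfamiliar with the geNorm stability measure, the following is a minimal sketch of the M value as defined by Vandesompele et al. (2002): for each gene, the average standard deviation of its log2 expression ratios with every other candidate. It is not the qbase+ implementation. It assumes 100% amplification efficiency, so that log2 ratios reduce to CT differences, and the gene names and CT values are hypothetical.

```python
from statistics import mean, stdev


def genorm_m(ct):
    """Return the geNorm-style M value for each gene.

    ct maps gene name -> list of CT values, one per sample (same sample order).
    Assumption: efficiency is 100%, so log2(expr_j / expr_k) = CT_k - CT_j.
    """
    m_values = {}
    for gene_j in ct:
        pairwise_sds = []
        for gene_k in ct:
            if gene_k == gene_j:
                continue
            log2_ratios = [ck - cj for cj, ck in zip(ct[gene_j], ct[gene_k])]
            pairwise_sds.append(stdev(log2_ratios))
        m_values[gene_j] = mean(pairwise_sds)
    return m_values


# Hypothetical CT values for three hypothetical genes across four samples.
ct_values = {
    "ref_a": [20.0, 21.0, 20.0, 21.0],
    "ref_b": [22.0, 23.0, 22.0, 23.0],  # tracks ref_a exactly
    "ref_c": [18.0, 22.0, 18.0, 22.0],  # varies independently
}
m = genorm_m(ct_values)
```

Lower M means more stable expression relative to the other candidates; in this toy input ref_c receives the highest M.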

Results

Selection of candidate reference genes

We identified ten clusters with different gene expression patterns (Fig. 1). Each cluster contained 631 to 4091 genes. The expression patterns of clusters 1, 2, 3, 4 and 6 (Fig. 1a–d, f) were highly variable between gill and hepatopancreas, while gene expression in the last four clusters changed with the treatment (Fig. 1g–j). Only the genes in cluster 5 (Fig. 1e) showed a stable expression trend. Therefore, the 2061 unigenes in cluster 5 formed the dataset for selecting candidate reference genes. Based on the 317 published housekeeping genes of C. gigas (Du et al. 2013), a total of 55 orthologous housekeeping genes were identified in S. constricta (Supplementary Table S1). Of these 55 orthologs, 11 were in cluster 5. To measure the expression stability of these genes, we calculated their CV values from the FPKM values across the 12 transcriptomes (Table 2). The nine genes in Table 2 other than 18S rRNA and methionyl-tRNA synthetase had lower CV values. Based on the lower CV values and on commonly used reference genes, we chose six of these 11 housekeeping genes as candidate internal controls: RS9, RL12, RL13, EIF3, EF1 and 18S.

Fig. 1

Clusters of genes in 12 transcriptomes of two tissues in response to Vibrio and Cd2+ in S. constricta. Control group post 12 h in the gill (G12N) and in the hepatopancreas (L12N), control group post 48 h in the gill (G48N) and in the hepatopancreas (L48N), vibrio treatment post 12 h in the gill (G12B) and in the hepatopancreas (L12B), vibrio treatment post 48 h in the gill (G48B) and in the hepatopancreas (L48B), Cd2+ treatment post 12 h in the gill (G12Cd) and in the hepatopancreas (L12Cd), and Cd2+ treatment post 48 h in the gill (G48Cd) and in the hepatopancreas (L48Cd)

Table 2 The coefficient of variation (CV) of the expression of 11 genes across 12 transcriptomes

qRT-PCR specificity and efficiency

A total of eight candidate genes were selected to normalize gene expression levels in S. constricta using qRT-PCR. Primer specificity was confirmed by melting curve analysis, which showed single-peak melting curves for the qPCR products without primer-dimer formation (Supplementary Figure S1). The amplification efficiencies of the primer pairs ranged from 95.6% (RL13) to 104.4% (ACT) (Table 1), and the standard curves displayed highly linear correlation coefficients (R2), ranging between 0.921 and 0.998.

Analysis of gene transcription stability

We tested the eight candidate housekeeping genes for expression stability in vibrio-infected razor clams at five time points (0, 6, 12, 24 and 48 h) and in five tissues (gill, hepatopancreas, siphon, hemocytes and foot) from five individuals with three biological replicates. The raw qRT-PCR data were summarized by boxplot analysis (Fig. 2) and are listed in Supplementary Tables S2 and S3. The results showed that 18S was the most abundant housekeeping gene, with the lowest mean CT values of 9 and 13 in vibrio-infected samples and tissues, respectively. EIF3 was the least expressed housekeeping gene, with the highest mean CT value of 23 in vibrio-infected samples. The genes showed smaller variation in transcript levels in vibrio-infected samples than across tissues. Among them, EIF3 had the smallest variation in transcript levels in both vibrio-infected samples and tissues. In addition, RS9, RL12, RL13, EIF3 and EF1 had relatively stable expression across all the tissues and vibrio-treated samples tested.

Fig. 2

Boxplot of absolute cycle threshold values for eight candidate reference genes. The median is indicated by a line in each box. Whiskers extend from the smallest to the largest value

In this study, we used three algorithms to rank the stability of the housekeeping genes and select the best-suited ones. The results of the algorithms differed, even for the most appropriate reference genes (Fig. 3; Table 2). However, the four or five most stably expressed genes varied little among the programs. In addition, TUB and 18S were identified as the least stably expressed genes in vibrio-infected samples and tissues, respectively. Among vibrio-infected samples, EIF3 and RS9 were the most stable reference genes according to geNorm and NormFinder (Fig. 3a). Moreover, all candidate reference genes could be considered stable in vibrio-infected samples, as their geNorm stability values (M) were below 0.5 (Hellemans et al. 2007). In tissue samples, geNorm identified RS9 and EF1 as the most stable reference genes, followed by RL13, EIF3, RL12, ACT, TUB and finally 18S. Except for TUB and 18S, the stability values (M) of the candidate housekeeping genes were below 0.5 (Fig. 3b). This showed that the five genes selected from the stable transcriptome expression data, other than 18S, were validated as stably expressed. NormFinder identified RL13 as the most stable reference gene tested, followed by RL12, ACT, EF1, RS9, EIF3, TUB and 18S. BestKeeper calculates the standard deviation (SD) to identify stable reference genes, with the SD value inversely related to expression stability. As shown in Table 3, EF1 was the most stably expressed gene under vibrio infection, while EIF3 was the most stably expressed in tissues. Consistent with the geNorm results, genes were expressed more stably in vibrio-infected samples than in tissues, based on the SD distribution in the two groups. According to the integration of the four algorithms’ results by RefFinder (Fig. 4), EIF3, RS9 and ACT were the three most stable reference genes under vibrio infection, while RL13, RS9, EF1 and EIF3 were the four most stable reference genes in the five tissues of S. constricta.

Fig. 3

Ranking of candidate reference genes in vibrio-infected samples (a) and in different tissues (b). A lower value indicates more stable expression

Table 3 Expression stability values of the candidate reference genes calculated by BestKeeper

Fig. 4

Expression stability analysis of reference genes in vibrio-infected samples (a) and in different tissues (b) by RefFinder
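RefFinder is generally described as combining the per-algorithm rankings through a geometric mean of each gene's ranks. The sketch below illustrates that idea only; it is not RefFinder's code and uses unweighted ranks. As input it takes just the two tissue rankings spelled out in the text (geNorm and NormFinder), with the geNorm tie between RS9 and EF1 broken in the order listed. Because BestKeeper and delta CT are left out, the output is not expected to reproduce Fig. 4.

```python
import math


def geometric_mean_rank(rankings):
    """Return gene -> geometric mean of its 1-based rank across rankings.

    Assumption: every ranking lists the same genes, most stable first.
    """
    scores = {}
    for gene in rankings[0]:
        ranks = [order.index(gene) + 1 for order in rankings]
        scores[gene] = math.prod(ranks) ** (1 / len(ranks))
    return scores


# Tissue rankings as stated in the text (most stable first).
genorm_order = ["RS9", "EF1", "RL13", "EIF3", "RL12", "ACT", "TUB", "18S"]
normfinder_order = ["RL13", "RL12", "ACT", "EF1", "RS9", "EIF3", "TUB", "18S"]

scores = geometric_mean_rank([genorm_order, normfinder_order])
# A lower combined score means a more stable gene overall.
overall = sorted(scores, key=scores.get)
```

The geometric mean is less dominated by a single poor rank than an arithmetic mean, which is one reason it is a common choice for aggregating ordinal rankings.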

Determination of the optimal number of reference genes

To determine the optimum number of reference genes, geNorm uses a pairwise variation analysis (Vn/Vn+1). The suggested cut-off value for the pairwise variation is 0.15; a value below it suggests that the inclusion of an additional reference gene is not required (Vandesompele et al. 2002). In vibrio-infected samples, the V2/3 value was 0.063, suggesting that EIF3 and RS9 (on the basis of the M value) were sufficient for normalization (Fig. 5). The V3/4 value was slightly lower than the V2/3 value, which indicated that the inclusion of a third reference gene could slightly improve the stability of normalization. In tissues, the V2/3 value was 0.138, suggesting that RS9 and EF1 were sufficient for normalization. BestKeeper does not provide a measure of the optimal number of reference genes. Following the authors’ advice (Pfaffl et al. 2004), we took the three genes with the highest correlation coefficients (r) as the minimal number of reference genes.
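The pairwise variation can be illustrated as follows. In the definition of Vandesompele et al. (2002), Vn/Vn+1 is the standard deviation, across samples, of the log2 ratio between the normalization factor built from the n most stable genes and the one built from n + 1 genes, where each factor is a geometric mean of relative quantities. The sketch assumes 100% efficiency, so the log2 normalization factor becomes the negative mean CT; the gene names and CT values are hypothetical and this is not the qbase+ implementation.

```python
from statistics import mean, stdev


def pairwise_variation(ct, ranked_genes, n):
    """Return V_n/n+1 for genes ranked from most to least stable.

    ct maps gene name -> list of CT values, one per sample.
    Assumption: efficiency is 100%, so log2 of the geometric-mean
    normalization factor equals minus the mean CT of the genes used.
    """
    if not 2 <= n < len(ranked_genes):
        raise ValueError("n must be at least 2 and leave one gene to add")
    n_samples = len(ct[ranked_genes[0]])

    def log2_norm_factor(genes):
        return [-mean([ct[g][s] for g in genes]) for s in range(n_samples)]

    nf_n = log2_norm_factor(ranked_genes[:n])
    nf_n_plus_1 = log2_norm_factor(ranked_genes[:n + 1])
    return stdev([a - b for a, b in zip(nf_n, nf_n_plus_1)])


# Hypothetical CT values: g3 fluctuates independently of g1 and g2.
ct_values = {
    "g1": [20.0, 21.0, 20.0, 21.0],
    "g2": [22.0, 23.0, 22.0, 23.0],
    "g3": [18.0, 22.0, 18.0, 22.0],
}
v2_3 = pairwise_variation(ct_values, ["g1", "g2", "g3"], 2)
needs_third_gene = v2_3 > 0.15  # 0.15 is the cut-off cited in the text
```

In this toy input the third gene changes the normalization factor substantially, so V2/3 lies above the cut-off; the values reported in the text (0.063 and 0.138) lie below it.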

Fig. 5

Determination of the optimal number of reference genes by geNorm. The horizontal line represents the proposed cut-off value

Validation of the reference genes

The above analyses indicated RS9 as the top-ranked reference gene in terms of expression stability across tissues and Vibrio-infected treatments, while EIF3 was expressed more stably under Vibrio infection. In contrast, TUB and 18S were found to be the least stable. Therefore, we chose i-type lysozyme and C1qDC as target genes to validate the applicability of these reference genes for normalization, either singly or in combination, in response to Vibrio infection at different time points. The results showed that the expression patterns of lysozyme and C1qDC differed when normalized by the most and the least stable reference genes (Fig. 6). For lysozyme, the expression levels were consistent across all the reference genes except in the 12 h treatment group. In that group, normalization by TUB or 18S gave significantly different values from normalization by RS9, EIF3 and their combinations (Fig. 6a). At 48 h, the expression profile normalized by 18S showed significant variation. For C1qDC, TUB was the least stable normalizer in the 12 h and 48 h treatment groups (Fig. 6b).
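To make concrete how the choice of normalizer can change an apparent fold change, here is a minimal sketch using the common 2^(−ΔΔCT) calculation with one or more reference genes. The text does not state which relative quantification formula was applied, so this method is an assumption, as are 100% efficiency and the hypothetical CT values.

```python
from statistics import mean


def relative_expression(target_ct, ref_cts, target_ct_control, ref_cts_control):
    """Return the fold change of a target gene by the 2^(-ddCT) method.

    ref_cts and ref_cts_control are lists of reference-gene CT values;
    averaging CTs corresponds to a geometric mean of quantities at 100%
    efficiency (an assumption of this sketch).
    """
    delta_treated = target_ct - mean(ref_cts)
    delta_control = target_ct_control - mean(ref_cts_control)
    return 2 ** -(delta_treated - delta_control)


# Hypothetical CT values. Stable references: unchanged between groups.
stable = relative_expression(22.0, [20.0, 18.0], 25.0, [20.0, 18.0])

# Hypothetical unstable reference: its CT shifts by one cycle after treatment,
# which alters the apparent fold change of the same target gene.
unstable = relative_expression(22.0, [21.0], 25.0, [20.0])
```

With the same target CT values, the stable pair yields an 8-fold induction and the drifting reference a 16-fold induction, which is the kind of discrepancy that an unstable normalizer introduces.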

Fig. 6

Variation in lysozyme and C1qDC gene-expression data normalized by different reference genes and their combinations. Lysozyme (a) and C1qDC (b) expression pattern under Vibrio challenge at different time points. The five kinds of bars indicate the two genes’ expression levels normalized by different reference genes (RS9, EIF3, TUB and 18S) and their combination, respectively

Discussion

Quantitative real-time PCR is a popular technology for measuring the expression levels of target genes under a wide range of experimental conditions, including different tissues, treatments and developmental stages, on account of its high sensitivity, specificity and cost-efficiency (Ginzinger 2002). Vibrio sp. is commonly detected in marine and estuarine environments (Thompson et al. 2004); it is one of the main pathogens of bivalves, including S. constricta, and is responsible for massive mortality. Hence, exploring the expression patterns of immune-related genes is imperative. However, accurate measurement of gene expression levels by qRT-PCR requires normalization with an internal control, which should have a relatively stable expression level regardless of experimental variation. There is little information about S. constricta housekeeping genes, and no study has examined the suitability of potential housekeeping genes for qRT-PCR in S. constricta. In this study, we grouped the gene expression data of 12 transcriptomes into ten clusters based on similar expression patterns. About half of these clusters showed expression patterns with significant tissue specificity. Because an ideal internal control gene should be stably expressed regardless of experimental variation, cluster 5 was chosen as the candidate dataset, as its gene expression patterns were stable across tissues and stresses. We identified 55 orthologs of housekeeping genes in S. constricta by BLAST against those of C. gigas, and used the CV values to select six candidate reference genes. In addition, two genes commonly used as internal controls in Mollusca were added to the list of candidate reference genes. The expression stability of these eight genes was evaluated by qRT-PCR in both vibrio-infected samples and different tissues of S. constricta.

We selected three algorithms (geNorm, NormFinder and BestKeeper) to evaluate the expression stability of the eight candidate reference genes. Under vibrio infection, EIF3 was the most stable internal control gene, followed by RS9, based on the consistent results of the three methods. The pairwise variation analysis performed by geNorm indicated that EIF3 and RS9 were sufficient for normalization. Among the five tissues, the results of the methods differed. The integrated results from RefFinder showed that RL13 was the most stable internal control gene, followed by RS9. The pairwise variation analysis indicated that RS9 and EF1 were sufficient for normalization. Given that genes with M ≤ 0.5 calculated by geNorm are considered to have high reference target stability, all the genes tested could be considered stable under vibrio infection, and all the genes except 18S were stably expressed among tissues. That almost all the genes were stably expressed may result from the filter criterion, under which reference genes were selected for stable expression in the gill and hepatopancreas transcriptomes in response to Vibrio infection. We also used the CV values to measure the stability of the genes selected from the transcriptome data. The ranking of these genes based on the transcriptome data was similar to that obtained by qRT-PCR, especially for the least stable gene, 18S. This indicates that cluster analysis of the transcriptome combined with CV ranking is a useful guide for selecting candidate internal controls for qRT-PCR.

It was apparent that the best-ranked reference gene for one treatment was not necessarily applicable to other treatments. For experimental convenience, it is important to identify reference genes that show acceptable expression stability across various treatments. Based on our results, RS9 was the most stable reference gene across both vibrio-infected samples and different tissues. It is a small-subunit ribosomal protein that takes part in ribosome biogenesis in all cell types and is a common internal control for qRT-PCR in human cells (Aychek et al. 2008). Ribosomal protein-encoding genes are widely used as internal controls in humans and other animals, and even in plants (Hsiao et al. 2001; Thorrez et al. 2008; Barsalobres-Cavallari et al. 2009). In this study, two other ribosomal proteins, RL12 and RL13, both large-subunit ribosomal proteins, were also tested for stable expression. RL13 had the most stable expression among tissues according to NormFinder and RefFinder, and it was also found to be one of the best reference genes in mustard aphid (Koramutla et al. 2016). RL12 was less stable both in different tissues and under vibrio challenge. This result may reflect their different functions in protein synthesis. In marine bivalves, the stability and suitability of ribosomal protein-encoding genes as reference genes have been validated in previous studies. RL7 and RS18 were found to be the most stably expressed genes during the development of Crassostrea gigas larvae under OsHV-1 infection (Du et al. 2013). RS18 was also found to be the most stably expressed gene in Mya arenaria after Vibrio challenge (Mateo et al. 2010). 40S ribosomal protein S20 (RPS20) was validated as the most stably expressed gene in Ruditapes philippinarum hemocytes in response to copper stress (Volland et al. 2017). Our study again confirms that certain ribosomal protein genes can be stably expressed under environmental challenge.

EIF3, the most stable gene under vibrio infection, is required for several steps in the initiation of protein synthesis and targets a subset of mRNAs involved in various cellular processes (Masutani et al. 2007; Lee et al. 2015). EIF3 has been shown not to be regulated by experimental conditions in animals (Kouadjo et al. 2007; Zhang et al. 2012) and plants (Shi et al. 2012). However, the expression of EIF3 varied among tissues in this study, suggesting that protein synthesis differs among the tissues.

EF1, ACT, 18S and TUB are common reference genes, used with or without validation in studies of bivalves (Volland et al. 2017) and in many other organisms, including animals and plants. ACT and 18S have been the most common internal controls for qRT-PCR in S. constricta under a range of experimental conditions, including different tissues, pathogen infection and ocean acidification (Li et al. 2011; Niu et al. 2015; Peng et al. 2017). ACT encodes a ubiquitous cytoskeletal protein and is expressed at moderately abundant levels in all tissues. In the present study, ACT was variable among tissues and more stable in gills under vibrio infection at different time points. Although ACT is one of the earliest and most often used reference genes, it has been shown to vary considerably and to be unsuitable as an internal control for normalizing gene expression in some cases (Selvey et al. 2001). 18S is a component of the small subunit (40S) of eukaryotic ribosomes involved in translation (Boujedidi et al. 2012) and has been used as a reference gene in many previous studies (Zhang et al. 2011; Banni et al. 2014). However, in this study, 18S was the least stable gene across the two datasets. It has also been found inappropriate for normalization of qRT-PCR analysis in C. gigas (Du et al. 2013), R. philippinarum (Volland et al. 2017) and Rapana venosa (Song et al. 2017). In S. constricta, the expression of 18S should be validated before it is used as an internal control for qRT-PCR. EF1 is a member of the G-protein family and plays a key role in protein translation (Browne and Proud 2002). It is frequently used as a reference gene in C. gigas and is the most stable reference gene in hemocytes of the flat oyster Ostrea edulis (Morga et al. 2010) and during gametogenesis of Mytilus edulis (Cubero-Leon et al. 2012). The expression stability of EF1 in the razor clam had not previously been examined. Based on our results, EF1 is moderately stable in different tissues and vibrio-infected samples of S. constricta. TUB, which plays a crucial role in maintaining cell structure, was another of the least stable candidate genes in our study. In contrast, TUB was found to be the most stable gene across developmental stages of Hippoglossus hippoglossus (Fernandes et al. 2008) and in goat (Costa et al. 2012). Taken together, these results indicate that suitable reference genes differ among species and treatments.

To verify the practical utility of the reference genes validated in this study, the expression profiles of lysozyme and C1qDC (MF289989) were examined in the razor clam under Vibrio infection at different time points. Lysozyme and complement components play important roles in protecting animals against bacterial pathogens (Callewaert and Michiels 2010; Cui et al. 2018). Their mRNA expression has been shown to be induced following bacterial challenge in other molluscs (Ren et al. 2012; Bathige et al. 2013; Liu et al. 2014a, b). Here, both genes were markedly up-regulated at 12 h after Vibrio infection when normalized by the four selected reference genes. However, normalization by the least stable genes, TUB and 18S, gave significantly different results from normalization by the most stable genes, RS9 and EIF3. The C1qDC expression pattern at 48 h after infection normalized by 18S even showed a very different profile from the others. These results further demonstrate the importance of choosing appropriate reference genes.

Conclusions

To date, only 18S and β-actin have been commonly used for normalization of qRT-PCR data in S. constricta, and without validation. We used transcriptome expression patterns to select candidate housekeeping genes, and the results suggest that cluster analysis of the transcriptome combined with the CV values of FPKM is a useful guide for choosing candidate reference genes. In addition, three algorithms were used to determine the most appropriate internal control, and RS9 was identified as the recommended reference gene for qRT-PCR analysis both under Vibrio infection and in different tissues. RL13 and EIF3 were the most stable reference genes in different tissues and vibrio-infected samples, respectively. The results of this study provide a primary reference for selecting internal controls for qRT-PCR in S. constricta and will aid further studies of the clam immune response.