Introduction

Sika deer (Cervus nippon hortulorum) are one of the best-known and valued animals in China. The Sika deer antler, well valued in traditional Chinese medicine and listed as a tonic in many ancient Chinese pharmaceutical monographs, is a valuable model for the study of the mechanisms of organ regeneration and rapid tissue growth [1]. It also serves as a useful model of bone growth and mineralization in mammals [2]. Antler growth can be divided into two stages: the rapid growth stage (60 days after the casting of the previous hard antlers) and the high ossification stage (90 days after the casting of the previous hard antlers) [3]. During the period of rapid growth, the antlers of large species of deer will elongate by more than 2 cm per day. This represents the fastest rate of tissue growth in the animal kingdom and involves the coordinated regeneration of various tissue types, including skin, nerves, blood vessels, cartilage and bone [4], after which the regenerating antler gradually ossifies.

Deer antler has been extensively studied from morphological and histological perspectives, but the molecular regulation mechanisms are poorly understood. At present, the genomic resources available for antler are scarce. These genetic data are insufficient for elucidating the molecular mechanisms of antler growth. RNA-Seq is a recently developed approach to transcriptome profiling that uses deep-sequencing technologies. It is the first sequencing based method that allows the entire transcriptome to be surveyed in a high-throughput and quantitative manner [5]. Currently, Illumina/Solexa sequencing technology is one of the widely used platforms in the field. It can produce millions of sequence reads from a single run at a much lower cost than conventional Sanger sequencing [6].

Our group has recently reported on the transcriptome sequencing and analysis of Sika deer antler tips at two different stages (60 and 90 days) [7, 8]. We generated over 80 million short reads, each 90 bp in length, from two cDNA libraries using Illumina technology. Approximately 95 % of the clean reads had quality scores at the Q20 level (a base quality of at least 20, corresponding to an error probability of at most 0.01). The data sets were deposited in the ArrayExpress repository (accession numbers: E-MTAB-802 and E-MTAB-853). After sequence assembly, we obtained 89,001 and 138,642 unique sequences, respectively. Assembled sequences were annotated using GO terms, COG classifications and KEGG pathways. As a result, we have contributed to the preliminary understanding of gene expression patterns in Sika deer antler.

To obtain a comprehensive understanding of gene expression patterns during antler growth, in the present study we applied a rigorous algorithm to identify differentially expressed genes (DEGs) between two stages (60 and 90 days) of antler growth. A total of 16,905 significantly changed transcripts were identified. Those sequences were mapped to 5,573 genes, with 2,217 genes up-regulated and 3,356 genes down-regulated (60 vs. 90 days), including ribosomal proteins, translation initiation and elongation factors, transcription factors, signaling molecules and extracellular matrix proteins. We also performed Gene Ontology (GO) functional enrichment and pathway enrichment analyses of the gene expression patterns using a hypergeometric test with Bonferroni correction. Both stages were enriched with members of distinct GO categories and pathways. These results provide useful information for further research in the area of deer antler growth.

Materials and methods

Sequence assembly

As previously reported, we generated over 80 million short reads from the cDNA libraries of antler tips collected at different stages (60 and 90 days) using the Illumina sequencing platform [7, 8]. De novo assembly of the short reads was performed using SOAPdenovo [9]. After sequence assembly, the resulting contigs were joined into scaffolds using the read mate pairs. The scaffolds were clustered using TGI Clustering tools [10] to obtain distinct gene sequences defined as unigenes. In the present study, the unigenes from the assembly of two samples were further processed. CAP3 [11] was used to splice sequences and remove redundancy to construct unigenes as long as possible.

A BLASTN program was used to confirm the sequencing accuracy and completeness; our sequences were compared with the ESTs in the Sika deer EST database at NCBI (http://www.ncbi.nlm.nih.gov/nucest?term=%20Cervus%20nippon). The results with E-values equal to or less than 10^−10 were treated as “significant matches”, whereas ESTs with no hits or matches with E-values greater than 10^−10 to nucleotides in NCBI were classified as “no significant matches”.

Functional annotation by sequence comparison with public databases

BLASTX alignment with an E-value cut-off of 10^−5 was performed between the unigenes and the protein databases nr (non-redundant), Swiss-Prot, KEGG and COG, and the best alignments were used to determine the sequence direction of each unigene. If results from the different databases conflicted, a priority order of nr, Swiss-Prot, KEGG and COG was followed to determine the direction of the unigene sequences. When a unigene was not aligned to any of the abovementioned databases, ESTScan software [12] was used to predict the coding regions and determine the sequence direction. Functional annotation with GO terms was performed using Blast2GO [13] and the GO database [14] with an E-value cut-off of 10^−5. After obtaining a GO annotation for every unigene, we used WEGO software [15] to perform a GO functional classification for the unigenes and determine the gene function distribution of the species at the macro level. Annotation using the COG and KEGG pathways was performed by searching with BLASTX against the Clusters of Orthologous Groups (COG) of proteins database [16] and the Kyoto Encyclopedia of Genes and Genomes (KEGG) database [17] using an E-value cut-off of 10^−5.

Analysis of differential gene expression during antler growth

Gene expression levels were expressed as RPKM (reads per kilobase of exon model per million mapped reads) [18] using the formula RPKM = 10^9 C/(NL). In this equation, C is the number of mappable reads that uniquely align to a unigene, N is the total number of mappable reads that uniquely align to all unigenes and L is the length of a unigene in base pairs.
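The RPKM formula above can be illustrated with a minimal Python sketch. The function name and the example numbers are hypothetical and are not taken from the study; only the formula itself comes from the text.

```python
def rpkm(read_count: int, total_mapped_reads: int, length_bp: int) -> float:
    """Return RPKM = 10^9 * C / (N * L).

    read_count         -- C, reads uniquely aligned to the unigene
    total_mapped_reads -- N, reads uniquely aligned to all unigenes
    length_bp          -- L, unigene length in base pairs
    """
    if total_mapped_reads <= 0 or length_bp <= 0:
        raise ValueError("N and L must be positive")
    return 1e9 * read_count / (total_mapped_reads * length_bp)


# Hypothetical example: 500 reads on a 1,000-bp unigene, 40 million mapped reads.
example_value = rpkm(500, 40_000_000, 1_000)  # 12.5
```

Because the count is scaled by both library size and unigene length, values computed this way are comparable between the two libraries and between unigenes of different lengths.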

Based on the previous method [19], we performed a rigorous algorithm to identify the DEGs between the two samples. The number of unambiguous clean tags from gene A is denoted as x. Because the expression of every gene occupies only a small part of the library, p(x) will closely follow the Poisson distribution:

$$ p\left( x \right) = \frac{{e^{ - \lambda } \lambda^{x} }}{x!} $$

The parameter λ represented the number of the real transcripts of the gene A. The total number of clean tags from sample 1 was N1 and the total number of clean tags from sample 2 was N2. Gene A had x tags in sample 1 and y tags in sample 2. The formula to calculate the probability of gene A expressed equally in both samples and the significant p value are defined as follows:

$$ \begin{aligned} & 2\sum\limits_{i = 0}^{i = y} {p\left( {i|x} \right)} \\ & {\text{or}}\; 2 \times \left( {1 - \sum\limits_{i = 0}^{i = y} {p\left( {i|x} \right)} } \right)\quad \left( {{\text{if}}\;\sum\limits_{i = 0}^{i = y} {p\left( {i|x} \right)} > 0.5} \right) \\ & p\left( {y|x} \right) = \left( {\frac{{N_{2} }}{{N_{1} }}} \right)^{y} \frac{{\left( {x + y} \right)!}}{{x!\,y!\left( {1 + \frac{{N_{2} }}{{N_{1} }}} \right)^{{\left( {x + y + 1} \right)}} }} \\ \end{aligned} $$
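As an illustration only, the probability p(y|x) and the two-sided p value defined above can be evaluated as in the following Python sketch. The function names are hypothetical, and the use of log-gamma functions to avoid overflow in the factorials is an implementation choice made here, not something described in the text.

```python
import math


def ac_prob(y: int, x: int, n1: int, n2: int) -> float:
    """p(y|x): probability of y tags in sample 2 given x tags in sample 1."""
    ratio = n2 / n1
    log_p = (
        y * math.log(ratio)
        + math.lgamma(x + y + 1)      # log (x + y)!
        - math.lgamma(x + 1)          # log x!
        - math.lgamma(y + 1)          # log y!
        - (x + y + 1) * math.log1p(ratio)
    )
    return math.exp(log_p)


def ac_p_value(x: int, y: int, n1: int, n2: int) -> float:
    """Two-sided p value: 2 * cumulative sum, or 2 * (1 - sum) if sum > 0.5."""
    cumulative = sum(ac_prob(i, x, n1, n2) for i in range(y + 1))
    p = 2.0 * (1.0 - cumulative) if cumulative > 0.5 else 2.0 * cumulative
    return min(1.0, max(0.0, p))  # clamp against floating-point drift
```

For example, with equal library sizes, `ac_p_value(100, 10, 10**6, 10**6)` is far below 0.001, whereas `ac_p_value(10, 10, 10**6, 10**6)` is close to 1.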

The p value corresponds to the differential gene expression test. The threshold of the p value in multiple tests was determined by controlling the false discovery rate (FDR). Assume that we have selected R DEGs, of which S genes are truly differentially expressed and the other V genes are false positives. If we decide that the error ratio “Q = V/R” must remain below a cutoff (e.g., 5 %), we should select an FDR value that is no greater than 0.05 [20]. We used “FDR ≤ 0.001 and the absolute value of log2Ratio ≥ 1” as the threshold to assess the significance of gene expression difference.
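The thresholding step can be sketched as follows. The text does not state which FDR procedure was applied, so the Benjamini-Hochberg step-up adjustment used here is an assumption. The record layout, the direction of the log2 ratio (sample 2 over sample 1) and the skipping of genes with zero expression are also illustrative choices rather than the authors' method.

```python
import math


def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p values (assumed FDR procedure)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_degs(records, fdr_cutoff=0.001, min_abs_log2=1.0):
    """records: list of (gene_id, rpkm_sample1, rpkm_sample2, p_value)."""
    fdr = bh_adjust([rec[3] for rec in records])
    selected = []
    for (gene, rpkm1, rpkm2, _), q in zip(records, fdr):
        if rpkm1 <= 0 or rpkm2 <= 0:
            continue  # log2 ratio undefined; zero handling is not given in the text
        log2_ratio = math.log2(rpkm2 / rpkm1)
        if q <= fdr_cutoff and abs(log2_ratio) >= min_abs_log2:
            selected.append((gene, log2_ratio, q))
    return selected
```

A gene is retained only when both conditions hold, so a large fold change with a weak p value, or a tiny p value with a small fold change, is not reported as a DEG.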

DEGs were then analyzed using GO functional analysis and KEGG Pathway analysis. DEGs were first mapped to GO terms in the database (http://www.geneontology.org/). Gene numbers for the GO terms were calculated and a hypergeometric test was subsequently used to identify the significantly enriched GO terms in the DEGs compared to the genome background. The calculating formula was the following:

$$ P = 1 - \sum\limits_{i = 0}^{m - 1} {\frac{{\binom{M}{i}\binom{N - M}{n - i}}}{{\binom{N}{n}}}} $$

in which N is the number of all genes with GO annotation; n is the number of DEGs in N; M is the number of all genes that are annotated to the specific GO term; and m is the number of DEGs in M. The calculated p value underwent Bonferroni correction, taking “adjusted p value ≤0.05” as the threshold. The GO terms fulfilling this requirement were defined as significantly enriched GO terms in DEGs. KEGG pathway enrichment analysis was performed in the same way as the GO analysis. In that case, N is the number of genes with a KEGG annotation; n is the number of DEGs in N; M is the number of genes annotated to the specific pathway and m is the number of DEGs in M.
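The enrichment test can be sketched in Python as below. The function names are hypothetical. The sketch sums the upper tail directly (i = m … min(n, M)), which is mathematically identical to one minus the lower-tail sum in the formula above but avoids cancellation for very small p values. The Bonferroni step assumes that the number of tests equals the number of terms examined.

```python
from math import comb


def enrichment_p(N: int, n: int, M: int, m: int) -> float:
    """P(X >= m) for a hypergeometric draw.

    N -- annotated genes in the background
    n -- DEGs among the N genes
    M -- background genes annotated to the term
    m -- DEGs annotated to the term
    """
    if not (0 <= m <= min(n, M) and n <= N and M <= N):
        raise ValueError("inconsistent counts")
    total = comb(N, n)
    upper_tail = sum(
        comb(M, i) * comb(N - M, n - i)  # comb returns 0 when n - i > N - M
        for i in range(m, min(n, M) + 1)
    )
    return upper_tail / total


def bonferroni(p_values):
    """Multiply each p value by the number of tests, capped at 1."""
    k = len(p_values)
    return [min(1.0, p * k) for p in p_values]
```

Terms whose corrected value is at most 0.05 would then be reported as significantly enriched.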

Real-time PCR analysis

Real-time PCR (qPCR) assays were performed using a One Step SYBR PrimeScript PLUS RT-PCR Kit (Takara, Japan) and a BioRad Chromo4 Real-Time PCR System (BioRad, USA) to validate the transcriptome data. The results were analyzed with Opticon Monitor software version 3.1. Total RNA was treated with DNase I (Takara, Japan). The reactions were performed in a 25 μl total volume containing 2 μl total RNA (50 ng); 1 μl primer mix containing 10 μM each of the forward and reverse gene-specific primers; 1.5 μl of Ex Taq HS Mix; 0.5 μl of PrimeScript PLUS RTase Mix; 12.5 μl of 2X One Step SYBR RT-PCR Buffer 4 and 6.5 μl of RNase-free H2O. The PCR amplification was performed under the following conditions: 42 °C for 5 min, 95 °C for 10 s and then 40 cycles at 95 °C for 5 s and 60 °C for 30 s. Subsequently, the specificity of each PCR amplification was verified by agarose gel electrophoresis and by a heat dissociation protocol that raised the temperature from 60 to 95 °C following the final PCR cycle. All reactions were performed in biological triplicates; the expression levels of target genes were normalized against an internal reference gene, 60S ribosomal protein L40. The relative gene expression was calculated using the 2^−ΔΔCt method [21].
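The relative quantification step can be written out as a short Python sketch. The function name, the argument names and the Ct values in the example are hypothetical. The calculation assumes roughly 100 % amplification efficiency for both the target and the reference gene, which is the usual premise of the 2^−ΔΔCt method.

```python
def relative_expression(
    ct_target_sample: float,
    ct_reference_sample: float,
    ct_target_calibrator: float,
    ct_reference_calibrator: float,
) -> float:
    """Fold change of a target gene by the 2^-ddCt method."""
    # Normalize the target gene to the reference gene within each condition.
    delta_ct_sample = ct_target_sample - ct_reference_sample
    delta_ct_calibrator = ct_target_calibrator - ct_reference_calibrator
    # Compare the condition of interest with the calibrator condition.
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: the target amplifies two cycles earlier (relative to
# the reference gene) in the sample than in the calibrator, i.e. a 4-fold increase.
fold_change = relative_expression(22.0, 18.0, 24.0, 18.0)  # 4.0
```

Which stage serves as the calibrator is a matter of convention and is not fixed by this sketch.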

Results

Sequence assembly

After sequence splicing and removing the sequence redundancies using sequence clustering software, the unigenes from each library were assembled into 116,504 unigenes with a mean size of 522 bp and an N50 of 682 bp (50 % of the assembled bases were incorporated into unigenes 682 bp or longer). The length of assembled unigenes ranged from 200 to 10,522 bp. There were 84,364 unigenes (72.41 %) with lengths varying from 200 to 500 bp, 19,190 unigenes (16.47 %) with lengths varying from 501 to 1,000 bp and 12,950 unigenes (11.12 %) with lengths greater than 1,000 bp (Fig. S1).
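For readers unfamiliar with the N50 statistic quoted above, the following Python sketch shows one common way to compute it from a list of unigene lengths. The function name and the toy lengths are hypothetical; this is not the assembler's own implementation.

```python
def n50(lengths):
    """Length L such that sequences of length >= L hold at least half of all bases."""
    if not lengths:
        raise ValueError("no sequences given")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy example: total = 30 bp; the two longest sequences (10 + 6) reach half.
toy_n50 = n50([2, 3, 4, 5, 6, 10])  # 6
```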

To assess the dataset quality, we analyzed the ratio of the gap length to the length of each assembled unigene (Fig. 1). For the majority of the unigenes (101,355 unigenes; 87 % of the total), gaps accounted for less than 5 % of the total length. In addition, sequencing bias was analyzed by examining the distribution of reads along the assembled unigenes (Figs. 2, 3). Although the 3′ ends of the assembled unigenes contained relatively fewer reads, the other positions showed a greater and more even distribution. Comparisons with the Sika deer EST database using BLASTN showed that 2,160 out of 2,749 ESTs in the database had significant matches (E-value ≤10^−10) to sequences produced by our transcriptome sequencing.

Fig. 1 Ratio distribution of the gap length to the length of assembled unigenes. The x axis indicates the ratio of the gap length to the length of assembled unigenes. The y axis indicates the number of unigenes containing gaps

Fig. 2 Random distribution of Illumina sequencing reads in the assembled unigenes. The x axis indicates the relative position of sequencing reads (60 days) in the assembled unigenes. The orientation of the unigenes is from the 5′ end to the 3′ end

Fig. 3 Random distribution of Illumina sequencing reads in the assembled unigenes. The x axis indicates the relative position of sequencing reads (90 days) in the assembled unigenes. The orientation of the unigenes is from the 5′ end to the 3′ end

Functional annotation and classification

To annotate the gene sequences, BLASTX searches were first performed against the NCBI non-redundant (nr) database using a cut-off E-value of 10^−5. Using this BLAST approach, 39,047 genes (33.52 % of all unigenes) returned hits with E-values below the cut-off. The species distribution of the best match for each sequence is shown in Fig. 4. Of these sequences, 57.11 % had their best match with Bos taurus sequences, followed by Ailuropoda melanoleuca (7.41 %), Homo sapiens (5.43 %), Mus musculus (4.88 %) and Canis familiaris (4.48 %).

Fig. 4 Species distribution of the BLASTX results. This figure shows the species distribution of unigene BLASTX results against the nr protein database with a cut-off E-value <10^−5 and the proportions of each species. Different colors represent different species. Species with proportions of more than 1 % are shown

We used GO terms, COG classifications and KEGG pathways to classify the functions of the unigenes. Based on GO classifications, 15,667 sequences were categorized into 48 functional groups (Fig. S2). Within the three main categories of the GO classification (biological process, cellular component and molecular function), the terms “cellular process” (10,699 members), “cell part” (12,927 members) and “binding” (11,983 members), respectively, were predominant. We also observed a high percentage of genes from the following categories: “organelle” (8,660 members), “metabolic process” (8,353 members), “catalytic activity” (5,882 members), “biological regulation” (5,669 members) and “pigmentation” (5,122 members). Few genes were found in the following categories: “metallochaperone activity” (5 members) and “electron carrier activity” (1 member) (Fig. S2).

We also searched the annotated sequences for the genes involved in COG classifications. In total, out of 39,047 nr hits, 10,278 sequences had a COG classification (Fig. S3). Among the 25 COG categories, the cluster for “general function prediction only” represented the largest group (3,742 members) followed by “translation, ribosomal structure and biogenesis” (2,512 members) and “replication, recombination and repair” (2,002 members). The categories representing “extracellular structures” (13 members) and “nuclear structure” (5 members) were found to be the smallest groups (Fig. S3).

To further evaluate the completeness of our transcriptome library and effectiveness of our annotation process, we mapped the 39,047 annotated sequences to the reference canonical pathways in the KEGG database. In total, we assigned 25,182 sequences to 218 KEGG pathways (Table S1). The pathways with the highest representation of unique sequences were the “metabolic pathways” (2,494 members) followed by pathways related to “focal adhesion” (1,685 members), “amoebiasis” (1,347 members), “regulation of actin cytoskeleton” (1,257 members), “ECM-receptor interaction” (1,183 members) and “cancer pathways” (1,104 members).

Comparison of gene expression level between the two libraries

To identify the genes that showed a significant expression change between the two antler growth stages, the differentially expressed tags between the two samples were identified using an algorithm modified from that of Audic et al. [19]. A hierarchical clustering of the differentially expressed transcripts (FDR ≤0.001 and the absolute value of log2Ratio ≥1) was generated by comparing the 60- and 90-day stages; the two main clusters consisted of up- and down-regulated transcripts. A total of 16,905 significantly changed transcripts were detected by comparing the 60- and 90-day stages. Those sequences were mapped to 5,573 genes; 2,217 of the genes were up-regulated and 3,356 of the genes were down-regulated (Table 1).

Table 1 Overview of the differentially expressed genes between two stages (60 vs. 90 days)

Genes with similar expression patterns potentially have a functional correlation. To screen for novel candidate genes that functionally correlate with antler development, we performed GO functional enrichment and pathway enrichment analyses on the gene expression patterns using a hypergeometric test with Bonferroni correction. Both stages were enriched with members of distinct GO categories and pathways (Tables 2, 3). According to the GO functional enrichment analysis, the DEGs between the two stages can be grouped into the three main categories (biological process, cellular component and molecular function) (Table 2). In the biological process category, some of these genes were related to metabolic processes, such as translation, polyamine metabolism and hexose catabolism. Similarly, in the cellular component category some genes were related to cell part (ribosomal subunit, intermediate filament and cytoplasm), and in the molecular function category some genes were related to structural molecule activity and catalytic activity (Table 2). In the KEGG pathway classification, 16 significantly enriched pathways were observed (Table 3), e.g., ribosome and oxidative phosphorylation, which was in accordance with the GO functional enrichment analysis results.

Table 2 Gene ontology (GO) functional enrichment analysis between two stages
Table 3 KEGG pathway enrichment analysis between two stages

Candidate genes involved in rapid growth and ossification of Sika deer antler

As shown in Table 4, some of the significantly changed transcripts are closely related to rapid growth and ossification, such as cartilage oligomeric matrix protein, chondroadherin, collagen X, thrombospondin-1, periostin, annexin and runt-related transcription factor 2. Most of these transcripts were identified in deer antler for the first time, and they play important roles in chondrocyte proliferation, hypertrophy and ossification. We selected all 19 genes involved in rapid growth and ossification and designed 19 pairs of primers (Table S2) for qPCR validation. The qPCR results showed expression patterns consistent with the findings of the transcriptome analysis (Figs. S4, S5). The agreement between the data sets indicated that the abundance of the Illumina sequences from the Sika deer transcriptome closely mirrors the actual expression level.

Table 4 Differentially expressed genes (DEGs) involved in antler growth (60 vs. 90 days)

Discussion

Despite the morphological and histological importance of the Sika deer antler, the lack of molecular information on this species has prevented the analysis of the molecular regulation mechanisms of antler growth. Creating cDNA libraries from the Sika deer antler and performing RNA-Seq provided an efficient procedure for identifying the DEGs associated with rapid growth and ossification. We have previously produced 89,001 and 138,642 unique sequences from two cDNA libraries of Sika deer antler. Assembled sequences were annotated using GO terms, COG classifications and KEGG pathways. These studies represent the largest de novo transcriptome of Sika deer to date. Our goal is to develop a comprehensive and representative core gene set of the deer antler. In the present study, these unique sequences were assembled into 116,504 unigenes with a mean size of 522 bp and an N50 of 682 bp (Table 1). To evaluate the quality of the dataset, we analyzed sequence quality, sequencing bias and gap lengths. These results are consistent with previous publications [22, 23], suggesting that the quality of our dataset was comparable to similar reports in other non-model species. Comparisons with the Sika deer EST database using BLASTN revealed that our transcriptome dataset had both a high degree of consistency (78.57 %) with previous EST data and an increase in coverage by a factor of approximately 42. We found that more than 60 % of the unigenes are orphan sequences for which no homologues were found in the NCBI database. This might be consistent with the expectation that antler has a unique gene expression pattern compared to other species.

Among the unigenes, 39,047 genes (33.52 % of all unigenes) returned a BLAST hit with an E-value below the cut-off. It was surprising to find that Sika deer shared the highest similarity (57.11 %) with cattle (Bos taurus) in the BLAST annotation. Although a Sika deer antler can grow rapidly and regenerate every year, the difference between an antler and a horn is unresolved. Although the present experimental results cannot suggest a clear explanation, they will provide useful reference information for further studies. To further evaluate the completeness of our transcriptome libraries and the effectiveness of our annotation process, we searched the annotated sequences for genes involved in GO classifications, COG classifications and KEGG pathways. In total, we assigned 15,667 sequences to 48 GO classifications, 10,278 sequences to 25 COG classifications and 25,182 sequences to 218 KEGG pathways. These annotations provide a valuable resource for investigating specific processes, functions and pathways for Sika deer antler research.

To better understand the gene expression pattern obtained for Sika deer antler, we applied a rigorous algorithm to identify the DEGs between two stages of antler growth. By comparing the 60- and 90-day stages, 5,573 genes were identified; 2,217 genes were up-regulated and 3,356 genes were down-regulated. We also performed GO functional enrichment and pathway enrichment analyses of the gene expression patterns using a hypergeometric test with Bonferroni correction. We found that most of the significantly changed genes were related to protein synthesis and translation, e.g., ribosomal proteins and translation initiation and elongation factors (Tables 2, 3, S3).

Protein synthesis is the ultimate step of the gene expression process and a critical control point for regulation. In particular, it enables cells to rapidly initiate protein production without additional mRNA synthesis, processing, or export. It is well established that cellular protein synthesis consists of three steps: initiation, elongation and termination [24]. Previous evidence suggests that some of the ribosomal proteins have secondary functions independent of their involvement in protein biosynthesis. A number of these proteins function as cell proliferation regulators and in some instances as inducers of cell death [25]. There were 24 ribosomal proteins significantly up-regulated by comparing the 60- and 90-day stages, such as 40S ribosomal protein S2 (RPS2), 40S ribosomal protein S6 (RPS6), 40S ribosomal protein S8 (RPS8), ribosomal protein S27L (RPS27L), 40S ribosomal protein S29 (RPS29) and 60S ribosomal protein L11 (RPL11) (Table S3). Under normal conditions, ribosomal biogenesis is accurately tuned to the cellular need for protein synthesis. An increased ribosomal biogenesis is, in general, associated with increased proliferative activity, whereas cells in a stationary state normally reduce the synthesis of ribosomal proteins [26]. For example, the levels of RPS2 protein and its corresponding mRNA were higher in mouse hepatocellular carcinoma, in mouse livers after one-third partial hepatectomy and in serum-starved cultured hepatocytes following serum treatment compared to normal livers [27]. RPS6 is highly expressed in primary diffuse large B cell lymphoma (DLBCL) samples. The genetic modulation of RPS6 protein levels with specifically targeted short hairpin RNA (shRNA) lentiviruses led to a decrease in the actively proliferating population of cells compared with the control shRNA [28]. 
RPS8 was also found to be up-regulated in human hepatocellular carcinoma (HCC), which suggests that activation of this gene is an important manifestation of HCC proliferation [29]. Ribosomal proteins are also involved in the regulation of apoptosis. It has been reported that ribosomal protein S27L (RPS27L) is a direct p53 target that regulates apoptosis: p53 directly induces the expression of RPS27L, which in turn promotes apoptosis [30]. Studies demonstrated that enhanced S29 expression resulted in increased apoptosis in HeLa cells; it is a potent apoptosis-inducing agent, which acts through both receptor-mediated and mitochondria-mediated caspase-dependent cell death pathways in human NSCLC cells and H520 cells. S29 induces apoptosis by increasing p53 and Bax and decreasing Bcl-2, Bcl-XL, survivin and NF-κB. Mitochondrial involvement was observed by an increased release of cytochrome c and increased downstream production of caspase-3 [31]. In addition, ectopic L11 expression can induce p53-dependent cell cycle arrest or apoptosis by reducing HDM2-mediated p53 ubiquitination and stabilizing p53 [32]. These results suggest that, during the process of antler growth, ribosomal proteins play a prominent role in cell growth and apoptosis. On the one hand, some of the ribosomal proteins regulate the rapid growth of antler through promoting cell proliferation. On the other hand, some of the ribosomal proteins control the balance between cell growth and apoptosis to avoid tumor formation caused by excessive cell proliferation.

Several translation initiation factors, such as eukaryotic initiation factor 5A (eIF5A), eukaryotic initiation factor 3 (eIF3) and eukaryotic initiation factor 4B (eIF4B), which have been shown to have important regulatory roles in cell growth, were significantly up-regulated by comparing the 60- and 90-day stages. For example, eIF5A is a highly conserved translation initiation factor that is essential for cell proliferation. Mature eIF5A-1 is a marker of abnormally proliferating cells in intraepithelial neoplasia of the vulva, whereas the eIF5A-2 transcript is overexpressed in ovarian cancer and colorectal cancer cell lines. Furthermore, mutations in eIF5A genes of other eukaryotic model organisms (Drosophila melanogaster and yeast) also revealed a correlation between eIF5A function and cell proliferation control [33]. The translation initiation factor eIF4B is required for cell proliferation and survival; it acts by regulating the translation of a functionally related subset of mRNAs. These mRNAs encode proteins that function in cell proliferation (Cdc25C, c-myc and ODC [ornithine decarboxylase]) and survival (Bcl-2 and XIAP [X-linked inhibitor of apoptosis]) [34]. Likewise, eIF3 stimulates the translation of mRNAs involved in regulating cell growth and proliferation. Ectopic expression of eIF3A, -3B, -3C, -3H, or -3I in stably transfected NIH3T3 cells leads to a number of oncogenic properties [35].

We also found two forms of elongation factor 1α significantly up-regulated by comparing the 60- and 90-day stages. One was elongation factor 1-alpha, oocyte form (EF-1αO) and the other was elongation factor 1-alpha, somatic form (EF-1αS). Both forms have been described in Xenopus laevis, where they are expressed at different developmental stages. EF-1αO is expressed predominantly in oocytes and transiently in early embryogenesis; accordingly, its mRNA cannot be detected in somatic cells after neurulation. In contrast, EF-1αS is expressed at low levels in oocytes but actively in somatic cells [36]. The function of the two forms of elongation factor in antler is still unknown, but the significant changes indicate that they may also play an important role in antler development.

To further investigate the mechanism of rapid growth and ossification of the Sika deer antler, we analyzed the significantly changed transcripts closely related to chondrocyte proliferation, hypertrophy and ossification. As shown in Table 4, there were 12 transcripts up-regulated and 7 transcripts down-regulated by comparing the 60- and 90-day stages. Most of the up-regulated transcripts were related to chondrocyte proliferation and hypertrophy. Cartilage oligomeric matrix protein (COMP), a prominent noncollagenous component of cartilage extracellular matrix, has been reported to interact with multiple protein partners and these interactions are required for chondrogenesis and endochondral bone growth. COMP associates with granulin–epithelin precursor (GEP) and potentiates GEP-stimulated chondrocyte proliferation. The overexpression of GEP stimulates the proliferation of chondrocytes and this stimulation is enhanced by COMP. COMP is required for GEP-mediated chondrocyte proliferation because chondrocyte proliferation induced by GEP is dramatically inhibited by an anti-COMP antibody [37]. COMP interacts with extracellular matrix protein 1 (ECM1), which was found to potently inhibit chondrocyte proliferation in vitro and neutralizes ECM1-mediated inhibition of chondrocyte proliferation. COMP and ECM1 play opposing roles in regulating chondrocyte hypertrophy, matrix calcification and endochondral bone growth [38]. Chondroadherin is a cartilage matrix protein that is primarily found close to the cells. It is particularly prominent in growth plate cartilage, primarily between the proliferative and hypertrophic zones. Chondroadherin interaction with cells may be essential for maintaining the adult chondrocyte phenotype and cartilage homeostasis [39]. Collagen X is the only known hypertrophic chondrocyte–specific molecular marker [40]. 
The up-regulation of this collagen type is consistent with previous studies reporting that chondrocyte hypertrophy plays an essential role in longitudinal bone growth. In rapidly growing bones, 9 % of bone length is contributed by proliferating cells and 32 % by matrix synthesis throughout the growth plate, whereas the process of hypertrophy accounts for approximately 60 % of bone elongation [41]. Thrombospondin-1 (TSP-1), which has been detected in large amounts in articular cartilage, can significantly prevent chondrocyte hypertrophy and strongly inhibit matrix mineralization. TSP-1 prevents excessive ossification in cartilage repair tissue induced by osteogenic protein-1 (OP-1), which stimulates chondrogenesis but also induces chondrocyte hypertrophy and excessive bone formation [42]. Chondromodulin 1 (ChM-1), a 25-kDa glycoprotein that induces the chondrocyte phenotype, strongly inhibits angiogenesis. ChM-1 not only promotes chondrogenesis, but also significantly inhibits chondrocyte hypertrophy and endochondral ossification [43]. Midkine (MK), a member of the pleiotrophin (PTN)/MK family, was originally described as a retinoic acid-inducible secreted neurotrophic factor. Previous studies have shown that it plays an essential role in neurons and neurite outgrowth, bone fracture healing and liver regeneration. Recombinant human midkine (rhMK) can stimulate the proliferation of auricular chondrocytes and attenuate the dedifferentiation of primary auricular chondrocytes during serial passages in monolayer cultures [44]. Integrin-linked kinase (ILK) is a key effector of downstream integrin signaling and an essential regulator of gene expression and cellular activity. ILK is highly expressed in proliferating and prehypertrophic chondrocytes and regulates chondrocyte shape and proliferation. The specific ablation of ILK in a portion of the growth plate cells leads to dwarfism and chondrodysplasia. 
Mice with a chondrocyte-specific disruption of the gene encoding ILK develop chondrodysplasia and die at birth due to respiratory distress. The chondrodysplasia was characterized by abnormal chondrocyte shape and decreased chondrocyte proliferation [45, 46]. Anterior gradient protein 2 (AG2) has evolutionarily conserved roles in development and tissue regeneration and is linked with several human cancers. AG2 was first identified as a cement gland-specific gene in Xenopus laevis. The overexpression of AG2 in Xenopus laevis (XAG-2) induces both cement gland differentiation and the expression of anterior neural marker genes in the absence of mesoderm formation [47]. Elevated levels of AG2 were also found in a range of human malignancies including hormone-dependent cancers (breast, prostate) and hormone-independent cancers (colorectal, pancreatic, esophageal and other common cancers). AG2 protein expression induces metastasis, acts as a p53 tumor suppressor inhibitor and survival factor, participates directly in neoplastic transformation and is involved in drug resistance [48]. Strikingly, the newt AG homolog (nAG) was the first protein discovered with the ability to promote limb regeneration. The recombinant nAG protein acts as a growth factor for cells dissociated from the blastema. The expression of this protein rescues the denervated limb blastema and supports regeneration to the digit stage [49]. Matrix Gla protein (MGP) is a 14-kDa member of the mineral-binding Gla protein family. In the mammalian growth plate, MGP is expressed by proliferative and late hypertrophic chondrocytes. Studies of MGP-deficient mice suggest that MGP is an inhibitor of extracellular matrix calcification in arteries and the epiphyseal growth plate [50]. 
Connective tissue growth factor (CTGF), a member of the CCN family of secreted proteins, is required for cell proliferation and matrix remodeling during chondrogenesis and is a critical regulator that induces extracellular matrix remodeling and angiogenesis in the growth plate. CTGF stimulates DNA synthesis in chondrocytes and acts in a paracrine manner to control chondrocyte proliferation [51]. In addition, we found that hematopoietic progenitor cell antigen CD34 (CD34) and insulin-like growth factor II (IGF-II) were highly expressed at the rapid growth stage. CD34 is a transmembrane protein that is strongly expressed on hematopoietic stem/progenitor cells (HSCs). CD34 expression is required for the proliferation and maintenance of hematopoietic progenitor cells in embryos and adults [52]. Recent studies have shown that human embryonic stem cells (hESCs) have the capability to generate functional hemato-endothelial cells and other mesoderm lineage cells. hESC-derived CD34+ cells can serve as mesenchymal stem/progenitor cells (MSCs) with the ability to differentiate into adipocytes, osteoblasts and chondrocytes [53]. CD34 overexpression may indicate that stem cells are located in a niche in the cellular layer of the antler tip. These stem cells are periodically activated to differentiate into osteoblasts and chondrocytes and regenerate a new antler. IGF-II, but not IGF-I, was significantly up-regulated when the 60-day stage was compared with the 90-day stage. This is consistent with previous studies suggesting that IGF-II may be more effective than IGF-I in stimulating antler growth [54, 55].

In contrast, most of the down-regulated transcripts, such as periostin, annexin A1, collagen I, osteoclast-stimulating factor 1, bone morphogenetic protein 3, runt-related transcription factor 2 and tumor necrosis factor receptor superfamily member 11B, were related to ossification, as shown in Table 4. Periostin, a 90-kDa secreted extracellular matrix protein, was initially identified as osteoblast-specific factor 2. Immunohistochemical analysis showed that periostin is preferentially expressed in the extracellular matrix of the periosteum and periodontal ligaments in the adult mouse. The periosteum and periodontal ligaments are vascular and cellular connective tissues that play critical roles in mineralized tissue generation and support [56]. Immunohistochemistry in bones from different growth stages showed that periostin localized specifically at the sites of endochondral and intramembranous ossification, especially at the periosteal envelopes [57]. Annexin A1 is a 37-kDa protein that has complex intra- and extracellular effects. An anomalous development of the skull was observed in annexin A1 null mice. In newborn annexin A1 null animals, this was characterized by delayed intramembranous skull ossification, incomplete fusion of the interfrontal suture and palatine bone and the presence of an abnormal suture structure [58]. Collagen I is an essential protein in endochondral bone. Collagen I was detected only in the perichondrium, vascular cavities, osteoid and bone; it was absent from the hypertrophic cartilage matrix [59, 60]. Osteoclast-stimulating factor 1 (OSTF1) is an intracellular protein that enhances 1,25-dihydroxyvitamin D3, induces bone resorption, binds c-Src with high affinity and indirectly stimulates osteoclast formation [61]. 
Bone morphogenetic protein 3 (BMP3) overexpression in the chick wing bud at the onset of chondrogenesis, achieved using a replication-competent retrovirus, reduces BMP signaling, which leads to increased cell proliferation and delayed cell differentiation, resulting in expanded skeletal elements and joint fusions. These results suggest that BMP3 expression in the perichondrium may serve to regulate cartilage cell proliferation by modulating the levels of BMP signaling, thus ensuring proper endochondral ossification [62]. Runt-related transcription factor 2 (Runx2) is a member of a family of transcription factors that share a DNA-binding domain with the Drosophila pair-rule gene runt. Runx2 plays a prominent role in endochondral ossification because it activates proliferative chondrocytes to further differentiate into hypertrophic chondrocytes [63]. Tumor necrosis factor receptor superfamily member 11B, which is also known as osteoclastogenesis inhibitory factor (OCIF) and osteoprotegerin (OPG), is a secreted factor that specifically inhibits osteoclast formation. Studies have shown that OPG plays an important role in regulating physiological bone formation. Overexpression of OPG in the liver is associated with high OPG protein levels in the systemic circulation and a marked increase in bone density (osteopetrosis). Adolescent and adult OPG−/− mice exhibit decreased total bone density, which is characterized by severe trabecular and cortical bone porosity, marked thinning of the parietal bones of the skull and a high incidence of fractures [64]. The expression pattern of OPG at 90 days of antler growth may help explain why Sika deer antlers harden at the stage of ossification.

The sensitivity of RNA-Seq is a function of both molar concentration and transcript length. Therefore, we quantified transcript levels in terms of reads per kilobase of exon model per million mapped reads (RPKM). The RPKM measure of read density reflects the molar concentration of a transcript in the starting sample by normalizing for the RNA length and the total read number in the measurement. Measuring the RPKM facilitates transparent comparisons of transcript levels both within and between samples [18]. We selected a set of highly expressed genes related to rapid growth and ossification to validate our gene annotations. Each of the genes generated the expected band size and showed consistent expression patterns. We are confident that our transcriptome dataset is a valuable addition to the publicly available Sika deer genomic information.
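
The normalization described above can be illustrated with a short Python sketch. It assumes the standard RPKM definition (reads mapped to a transcript, divided by the transcript length in kilobases and by the total mapped reads in millions); the function name and the read counts in the example are hypothetical and are not values from this study.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of exon model per million mapped reads.

    Assumed standard definition: RPKM = 1e9 * C / (N * L), where
    C = reads mapped to the transcript,
    N = total mapped reads in the library,
    L = transcript (exon model) length in base pairs.
    """
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    return 1e9 * read_count / (total_mapped_reads * transcript_length_bp)


# Hypothetical example: 500 reads on a 2,000-bp transcript
# in a library of 40 million mapped reads.
example_rpkm = rpkm(500, 2000, 40_000_000)

# The same read count on a transcript twice as long yields half the RPKM,
# which is the length normalization the text refers to.
longer_rpkm = rpkm(500, 4000, 40_000_000)
```

Because both transcript length and library size appear in the denominator, values computed this way can be compared between transcripts of different lengths and between libraries of different sequencing depth.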

Taken together, our analysis of the Sika deer antler transcriptome provides useful information for further understanding antler biology, particularly the molecular mechanisms regulating rapid growth and ossification. Although more than 60 % of the transcripts are still unknown, the pathways and genes identified here offer a starting point for further research on deer antler development.

Conclusion

This study provides an efficient method to identify and examine the DEGs involved in regulating Sika deer antler growth and development. We applied a rigorous algorithm to identify DEGs between two different stages (60 and 90 days) of antler growth. A total of 16,905 significantly changed transcripts were identified. Those sequences were mapped to 5,573 genes, with 2,217 genes up-regulated and 3,356 genes down-regulated (60 days vs. 90 days), including ribosomal proteins, translation initiation and elongation factors, transcription factors, signaling molecules and extracellular matrix proteins, which are responsible for chondrocyte proliferation, differentiation and ossification during antler growth. We also performed GO functional enrichment and pathway enrichment analyses of gene expression patterns using the hypergeometric test with Bonferroni correction. Both stages were enriched with members of GO categories and distinct pathways. These data will provide the foundation for other functional genomic research in Sika deer or its closely related species.
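
For readers unfamiliar with the enrichment statistics mentioned above, the following Python sketch shows one common way a hypergeometric enrichment test with Bonferroni correction is computed. It is an illustration under stated assumptions, not the exact procedure or software used in this study: the function names are hypothetical, and the test is assumed to be the usual upper-tail probability of observing at least m annotated DEGs.

```python
from math import comb


def hypergeom_enrichment_pvalue(N, M, n, m):
    """Upper-tail hypergeometric p-value P(X >= m).

    N = all genes with an annotation (background),
    M = background genes annotated to the term or pathway,
    n = number of DEGs,
    m = DEGs annotated to the term or pathway.
    """
    upper = min(M, n)
    # Count the ways to draw i annotated and n - i unannotated genes.
    tail = sum(comb(M, i) * comb(N - M, n - i) for i in range(m, upper + 1))
    return tail / comb(N, n)


def bonferroni(pvalues):
    """Multiply each p-value by the number of tests, capped at 1."""
    k = len(pvalues)
    return [min(1.0, p * k) for p in pvalues]


# Toy example (hypothetical numbers): 10 background genes, 4 annotated
# to a term, 3 DEGs, of which 2 carry the annotation.
p_toy = hypergeom_enrichment_pvalue(10, 4, 3, 2)

# Correcting three hypothetical p-values for multiple testing.
corrected = bonferroni([0.01, 0.5, 0.04])
```

A term or pathway would then be called enriched when its corrected p-value falls below the chosen threshold; the threshold itself is a design choice that this sketch does not fix.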