Introduction

As the second most common cancer among women worldwide, cervical cancer is a major threat to women's health, especially in developing countries where active screening and human papillomavirus (HPV) vaccines are not yet available to every individual. Although most patients can be cured if diagnosed at an early stage, those with advanced-stage disease are at high risk of secondary metastatic cancer and tumor relapse [1]. Biomarkers for early diagnosis and novel, effective therapeutic modalities against advanced-stage cervical cancer are urgently required. To this end, uncovering the mechanisms underlying tumorigenesis and progression of cervical cancer is of great importance. Tumorigenesis is a complicated pathological process that involves multiple genetic alterations, such as overexpression of oncogenes and inactivation of tumor-suppressor genes [2]. Expression profiling with microarrays has been used extensively to identify biomarkers for early cancer diagnosis, classify cancer subtypes, and predict response to anticancer therapies [3]. Identification of genes with multiple roles in dysregulated cancer pathways sheds light on tumorigenesis and cancer progression and thus helps in devising novel anticancer strategies. For cervical cancer, earlier studies comparing expression profiles between normal cervix and cervical cancer samples identified many differentially expressed genes (DEGs). However, the DEGs reported in different studies vary greatly, with only a few detected consistently across studies [3–5]. Although these studies provide evidence that several DEGs are potentially valuable biomarkers for diagnosis and prognosis, few of the identified DEGs have led to novel therapeutics because of their limited role in disease progression.
The objectives of this study were to verify DEGs in cervical cancer and to identify significantly changed biological networks and KEGG pathways, with a special focus on uncovering possible hub genes that play multiple roles in tumorigenesis and disease progression of cervical cancer. To detect DEGs in cervical cancer, two groups of Affymetrix microarray data were analyzed. One group consisted of carcinomatous cervical epithelial cell samples and the other of healthy cervical epithelial cell samples, both from an Amerindian population. In addition, gene ontology (GO) biological process and KEGG pathway analyses of the identified DEGs were conducted, and the corresponding heat maps and biological networks were constructed.

Methods

Data collection

The Gene Expression Omnibus (GEO) database (http://www.ncbi.nlm.nih.gov/geo/) was searched, and an Affymetrix microarray expression data set (GSE29570) containing two groups was obtained [6]. One group comprised 43 samples of carcinomatous cervical epithelial cells and the other 17 samples of healthy cervical epithelial cells, both from an Amerindian population. Unprocessed data sets (.cel files) were collected for further analysis. The experiments used the Affymetrix Human Gene 1.0 ST Array [transcript (gene) version], and the corresponding probe annotation files were downloaded. This study was approved by the Ethics Committee of the Affiliated Hospital of North Sichuan Medical College, which concluded that the study would not cause harm or risk to the subjects.

Data processing and filtering

Considering the array platform, we applied the robust multi-array average (RMA) method to quantify the microarray signal [7]. The procedure consisted of three main steps: model-based background correction, quantile normalization, and summarization. To filter out uninformative data, such as control probesets and other internal controls, and to remove genes expressed uniformly close to background detection levels, we used the nsFilter function of the genefilter package in R. As applied here, the filter did not remove probesets lacking Entrez Gene identifiers or sharing identical Entrez Gene identifiers.
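As an illustration of the quantile normalization step only, the following minimal Python sketch forces every sample to share the same intensity distribution. It is not the R implementation used in the study: background correction and probeset summarization are omitted, ties are broken by input order, and the function name and toy intensities are hypothetical.

```python
from statistics import mean


def quantile_normalize(columns):
    """Quantile-normalize a list of equal-length sample columns.

    Each column is one array (sample); each position is one probe.
    Ties are broken by original order, which is a simplification.
    """
    n_probes = len(columns[0])
    if any(len(col) != n_probes for col in columns):
        raise ValueError("all samples must have the same number of probes")

    # Sort each sample, then average across samples at each rank.
    sorted_cols = [sorted(col) for col in columns]
    rank_means = [mean(col[r] for col in sorted_cols) for r in range(n_probes)]

    normalized = []
    for col in columns:
        # order[r] is the index of the probe holding rank r in this sample.
        order = sorted(range(n_probes), key=lambda i: col[i])
        out = [0.0] * n_probes
        for rank, probe_idx in enumerate(order):
            out[probe_idx] = rank_means[rank]
        normalized.append(out)
    return normalized


# Toy log2 intensities (made up): three samples, four probes each.
samples = [
    [5.0, 2.0, 3.0, 4.0],
    [4.0, 1.0, 4.5, 2.0],
    [3.0, 4.0, 6.0, 8.0],
]
normalized = quantile_normalize(samples)
```

After normalization, each sample contains the same set of values (the rank-wise means), while the ordering of probes within each sample is preserved.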

DEGs analysis

Normal and carcinomatous samples were compared statistically using the limma package in R to identify differential expression. For probes with identical Entrez Gene identifiers, only the probe with the largest variance was kept for further DEG analysis. Only genes with absolute log2(fold change) >2 and adjusted p value <0.01 were considered differentially expressed between the two sample groups. The adjusted p value was obtained by applying Benjamini and Hochberg's (BH) false discovery rate correction to the original p value, and the fold change threshold was chosen to focus on the most strongly differentially expressed genes.
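The selection rule can be sketched as follows. This minimal Python example assumes that log2 fold changes and raw p values are already available (for instance from a limma fit, which is not reproduced here); the gene names, values, and function names are hypothetical.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p values, returned in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def call_degs(genes, log2_fc, p_values, fc_cutoff=2.0, fdr_cutoff=0.01):
    """Split genes passing both thresholds into up- and down-regulated."""
    adj = bh_adjust(p_values)
    up, down = [], []
    for gene, lfc, q in zip(genes, log2_fc, adj):
        if q < fdr_cutoff and abs(lfc) > fc_cutoff:
            (up if lfc > 0 else down).append(gene)
    return up, down


# Made-up example values; not taken from the study.
genes = ["geneA", "geneB", "geneC", "geneD", "geneE"]
log2_fc = [3.1, -2.6, 2.4, 0.4, -3.0]
p_values = [0.0001, 0.0004, 0.02, 0.5, 0.001]
up, down = call_degs(genes, log2_fc, p_values)
```

In this toy input, geneC exceeds the fold change cutoff but fails the adjusted p value cutoff, so it is not called as a DEG.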

Hierarchical clustering

Hierarchical clustering was performed to classify the analyzed samples based on their gene expression profiles [8]. Clustering was first carried out using all DEGs to observe the global gene expression patterns. In addition, DEGs assigned to specific GO biological processes or KEGG pathways were extracted, their expression patterns were characterized, and heat maps for these subsets were constructed in R.

GO and KEGG pathway analysis

The R packages GO.db, KEGG.db, and KEGGREST were used to detect GO categories and KEGG pathways that were significantly over-represented among the DEGs compared with the whole genome. Biological processes with a p value <0.01 were considered significantly enriched; for KEGG pathways, the threshold was a p value <0.05.
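The text does not state which statistical test was used for over-representation. A common choice is the one-sided hypergeometric test, which is assumed in the minimal Python sketch below; the function name and the toy counts are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, n_degs, n_in_term, n_background):
    """P(X >= k): chance of seeing at least k term genes among n_degs
    genes drawn without replacement from n_background genes, of which
    n_in_term are annotated to the term."""
    max_k = min(n_degs, n_in_term)
    total = comb(n_background, n_degs)
    hits = sum(
        comb(n_in_term, i) * comb(n_background - n_in_term, n_degs - i)
        for i in range(k, max_k + 1)
    )
    return hits / total


# Toy numbers (made up): 20 background genes, 5 annotated to a term,
# 4 DEGs, 3 of which carry the annotation.
p_enrich = hypergeom_upper_tail(k=3, n_degs=4, n_in_term=5, n_background=20)
significant = p_enrich < 0.05  # KEGG threshold given in the text
```

With real data, n_background would be the number of annotated genes on the array after filtering, and one p value would be computed per GO term or KEGG pathway.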

Construction of biological network

Interaction data were downloaded from HPRD, BIOGRID, and PIP [9–11]. Pairwise interactions present in any of the three databases were included in our curated database. Cytoscape was used to construct the biological networks [12]. Interacting gene pairs in the curated database were imported as the stored network. After the functional enrichment analysis, DEGs assigned to significantly altered biological processes and KEGG pathways were mapped to the corresponding networks to analyze their interactions.
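The curation and mapping steps can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Cytoscape workflow itself: interactions are treated as undirected, self-interactions are dropped (an assumption not stated in the text), and the gene names, database contents, and function names are hypothetical.

```python
from collections import defaultdict


def merge_interactions(*sources):
    """Union of undirected gene pairs from several interaction databases."""
    merged = set()
    for pairs in sources:
        for a, b in pairs:
            if a != b:  # assumption: self-interactions are dropped
                merged.add(frozenset((a, b)))
    return merged


def induced_subnetwork(edges, genes):
    """Keep only edges whose two endpoints are both in the gene set."""
    genes = set(genes)
    return {edge for edge in edges if edge <= genes}


def degree_table(edges):
    """Number of direct neighbours per gene; high degree suggests a hub."""
    degree = defaultdict(int)
    for edge in edges:
        for gene in edge:
            degree[gene] += 1
    return dict(degree)


# Made-up pairs standing in for the three downloaded databases.
db_a = [("g1", "g2"), ("g1", "g3")]
db_b = [("g2", "g1"), ("g3", "g4")]
db_c = [("g1", "g5"), ("g5", "g5")]

curated = merge_interactions(db_a, db_b, db_c)
deg_network = induced_subnetwork(curated, ["g1", "g2", "g3", "g5"])
degrees = degree_table(deg_network)
hub = max(degrees, key=degrees.get)
```

Ranking genes by degree in the induced subnetwork is one simple way to flag candidate hub genes before inspecting the network visually.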

Results

Differential expression analysis

Comparative analysis was performed between normal and carcinomatous samples to identify genes with significantly differential expression levels. Setting the threshold as absolute log2(fold change) >2 and adjusted p value <0.01, 122 DEGs were identified, among which 46 were up-regulated and 76 were down-regulated (Table 1, Supplementary Table 1, 2). The top 5 up-regulated genes were MMP12, SYCP2, SMC1B, MMP1 and CDKN2A. The top 5 down-regulated genes were SPINK7, CRISP3, TMPRSS11B, CRNN and DSG1.

Table 1 Statistical distribution of differentially expressed genes in normal and tumor samples

Construction of biological network

Because the number of DEGs was relatively small, all identified DEGs were used to construct a single biological network (Fig. 1) in which the interactions among genes could be observed clearly. Two separate sub-networks emerged, the larger one centered on CDK1 and ESR1. Moreover, the direction of differential expression was largely consistent within each neighborhood: most of the genes connected with CDK1, such as CDC6, CDKN3, ECT2, and TOP2A, were up-regulated, whereas those connected with ESR1, such as AR, FOS, DSG1, and KRT1, were down-regulated. Genes connected with CDK1 are mainly involved in cell cycle and cell proliferation. The smaller network was composed mainly of genes of the MMP family, such as MMP1 and MMP3.

Fig. 1

Heat map of DEGs (a) and corresponding biological network (b). a Heat map of hierarchical clustering of all DEGs (17 normal samples and 43 tumor samples). Red indicates high relative expression, and green indicates low relative expression. b Biological network was constructed according to the direct connections among all DEGs. Red indicates high relative expression, and green indicates low relative expression

GO and KEGG pathway analysis

In total, 402 GO biological processes and 9 KEGG pathways were over-represented among the DEGs (Table 2). The most significant biological processes included cell cycle process, cell proliferation, tissue development, nuclear division, and cell division (Table 3). The most significant KEGG pathways were oocyte meiosis, cell cycle, p53 signaling pathway, pathways in cancer, and progesterone-mediated oocyte maturation (Table 4). Heat maps and biological networks of significantly enriched GO biological processes and KEGG pathways were constructed (Figs. 2, 3, 4, 5). Notably, CDK1 appeared frequently in these processes, which suggests a significant role in the progression of cervical cancer.

Table 2 Obtained GO biological processes and KEGG pathways
Table 3 Several significantly changed GO biological processes
Table 4 Several significantly changed KEGG pathways
Fig. 2

Heat map of GO: 0007049/cell cycle (a) and corresponding biological network (b). a Heat map of hierarchical clustering of all data sets (17 normal samples and 43 tumor samples) using 34 DEGs in “cell cycle”. b Biological network was constructed according to the direct connections among corresponding DEGs

Fig. 3

Heat map of GO: 0008283/cell proliferation (a) and corresponding biological network (b). a Heat map of hierarchical clustering of all data sets (17 normal samples and 43 tumor samples) using 34 DEGs in “cell proliferation”. Red indicates high relative expression, and green indicates low relative expression. b Biological network was constructed according to the direct connections among corresponding DEGs

Fig. 4

Heat map of KEGG: 04110/cell cycle (a) and corresponding biological network (b). a Heat map of hierarchical clustering of all data sets (17 normal samples and 43 tumor samples) using 6 DEGs in “cell cycle”. b Biological network was constructed according to the direct connections among corresponding DEGs

Fig. 5

Heat map of KEGG: 04114/oocyte meiosis (a) and corresponding biological network (b). a Heat map of hierarchical clustering of all data sets (17 normal samples and 43 tumor samples) using 6 DEGs in “oocyte meiosis”. Red indicates high relative expression, and green indicates low relative expression. b Biological network was constructed according to the direct connections among corresponding DEGs

Discussion

One objective of this study was to identify DEGs between normal cervix and cervical cancer. By comparing the gene expression profiles of 17 normal cervical epithelial samples and 43 carcinomatous cervical samples, 122 DEGs were identified, including 46 up-regulated and 76 down-regulated genes. Many of these DEGs, such as SPP1, MMP12, CDKN2A, and CDK1, were reported in previous profiling studies of cervical cancer [3, 4, 13, 14]. In this study, 402 GO biological processes and 9 KEGG pathways were over-represented among the DEGs. Cell cycle process, cell proliferation, tissue development, nuclear division, and cell division were the most significantly over-represented biological processes, and the top KEGG pathways were pathways in cancer, oocyte meiosis, cell cycle, p53 signaling pathway, and progesterone-mediated oocyte maturation. Our results are consistent with previous findings. A study by Seo and colleagues using mRNA differential display and GO analysis demonstrated that cervical cancer presented with complete up-regulation of cell cycle, transport, protein biosynthesis, and RNA metabolism [15]. In addition, up-regulation of pathways in cancer, metabolic pathways, and progesterone-mediated oocyte maturation in cervical cancer was previously reported [3]. In fact, dysregulation of cell cycle, cell proliferation, and the p53 signaling pathway is present in most types of cancer. Under normal circumstances, cell cycle progression is tightly regulated by groups of proteins that interact with each other in a very specific sequence of events. In cancer cells, some of these proteins no longer function properly, leading to dysregulated cell cycle progression and cell proliferation. Described as the guardian of the genome, the p53 signaling pathway contains an array of genes whose protein products are very important in maintaining cellular homeostasis. Mutations in the p53 signaling pathway are very frequent in cancers.

Using the identified DEGs, a large and a small biological network were constructed. The up-regulated genes detected in the smaller network are members of the MMP family and SPP1. Matrix metalloproteinases (MMPs) are known to promote tumor metastasis by participating in tumor angiogenesis, local invasion, and establishment of metastatic lesions at distant secondary sites [16]. For example, research has shown that membrane type 1 MMP (MT1-MMP) is involved in cervical carcinoma progression and invasion [17]. In a separate study, overexpression of MMP-1, MMP-2, MMP-9, MMP-14, and MMP-15 was detected in more than 50 % of primary cervical cancer samples. More importantly, progressively up-regulated expression of MMP-2 and MMP-9 is closely associated with progression and recurrence of human cervical cancer [18]. Our results further support a close association between the MMP family and the progression and invasion of cervical cancer. In addition to the MMPs, SPP1 (secreted phosphoprotein 1), also known as osteopontin, which encodes an extracellular glycosylated bone phosphoprotein, was another up-regulated gene identified in the smaller biological network. Up-regulation of SPP1 in cervical cancer was detected in a number of previous studies [3, 5, 19]. In a study by Cho and colleagues, the diagnostic and prognostic significance of increased plasma SPP1 levels in cervical cancer patients was evaluated. Their results suggest that SPP1 is a potential diagnostic and prognostic biomarker for cervical cancer [20]. In a separate study, SPP1 was also suggested as an effective biomarker of disease progression in cervical cancer [3].

There were 11 up-regulated genes in the larger biological network. These genes function mainly in the regulation of cell cycle, cell proliferation, and oocyte meiosis. Dysregulation of most of these genes has been reported in cervical cancer studies [21, 22]. Intriguingly, up-regulated CDK1 was found in all up-regulated biological networks, with direct interactions with most of the 11 DEGs. In the cell cycle network, CDK1 was directly linked to CCNB2 and CDC6. In the cell proliferation network, direct interactions of CDK1 with CDC6, CDKN3, DLGAP5, and FOXM1 were revealed. In oocyte meiosis, we found interconnections among CDK1, CCNB2, and AR. Taken together, these results suggest that CDK1 plays a pivotal role in mediating cervical cancer genetic networks in tumorigenesis and disease progression. The finding is not surprising. It is well known that HPV infection is one of the most important factors in carcinogenesis of cervical cancer. After infection, the E6 and E7 viral oncogenic proteins inhibit p53-mediated apoptosis and disrupt regulation of the cell cycle by inducing rapid proteasomal degradation of p53 [23]. As a tumor-suppressor protein, p53 regulates the transcription of regulatory genes involved in apoptosis and cell cycle arrest. Under normal circumstances, p53 negatively regulates CDK1 transcription [24]. In cervical cancer, abrogation of p53 function due to HPV infection leads to overexpression of a number of its downstream genes, such as SPP1 [5, 25, 26]. Results from a number of studies have suggested that therapeutics that effectively restore p53 levels in cancer cells have significant anticancer effects. Overexpression of CDK1, a direct downstream gene of p53, was previously reported in cervical cancer. Cyclin-dependent kinase 1 (Cdk1), a highly conserved protein, functions as a serine/threonine kinase. It plays a pivotal role in the control of the cell cycle, with more than 70 regulatory targets. In response to various stimuli, Cdk1 directly phosphorylates a variety of target substrates that are involved in the control of transcription and cell cycle progression [27].

Studies have shown that aberrant activation of CDKs and their modulators is present in many cancers. Dysregulation of CDKs causes abnormal cell proliferation and genomic instability [28]. In fact, the D-cyclin-cdk4/6-INK4-Rb pathway is disrupted in all human cancers [29]. A study comprising both in vitro and in vivo anticancer assays showed that IPP5, a novel inhibitor of protein phosphatase, inhibited the growth of HeLa cells by regulating the expression levels of cell cycle regulators including Cdk1, cyclin A1, cyclin B1, p21, and p53 [30]. These results further indicate that aberrant cell cycle regulation is a major pathological event in the progression of cervical cancer. Previous studies have proposed a number of genes, such as p53, SPP1, MMP1, and CDKN2A, as very important genes in cervical cancer with great diagnostic and therapeutic value [3, 13, 19]. Based on our data, CDK1 appears to be another important hub gene in cervical cancer because of its extensive involvement in the aberrantly regulated pathways, and it might help in developing novel therapeutics. Indeed, studies have indicated that anticancer drugs targeting aberrant cell cycle regulation have great therapeutic potential. For example, treatment with JNJ-7706621, a cell cycle inhibitor, inhibited CDK1 activity, changed its phosphorylation status, and arrested the cell cycle at the G2/M phase. Data from both in vitro and in vivo studies showed that CDK inhibitors have significant anticancer effects against a number of cancers, including cervical, colon, and breast cancer [31, 32]. A recent study showed that oridonin, an active compound extracted from the plant Rabdosia rubescens, inhibited proliferation of gastric cancer cells by inducing cell cycle arrest in the G2/M phase. It was suggested that the cell cycle arrest might be induced by decreased expression of CDK2 and CyclinB1 [33].

In summary, 122 DEGs were detected in the current study by comparing gene expression profiles between normal and cervical cancer samples. Among the 122 DEGs, a number of genes, such as SPP1, CDKN2A, and CDK1, were previously reported. In addition, several significantly changed biological networks and KEGG pathways were identified. The most important finding was that CDK1 appears to play a comprehensive role in modulating genetic networks implicated in tumorigenesis, disease progression, and metastasis of cervical cancer. To date, only a few genes, such as p53, pRb, and SPP1, have been reported to have multiple impacts on the progression of cervical cancer. Our finding suggests that novel therapeutics targeting CDK1 or its related pathways might help improve the prognosis of advanced-stage cervical cancer. However, because of CDK1's global regulatory functions in the cell cycle, inhibition of CDK1 is likely to induce severe toxicities and side effects in patients. Strategies focusing on its downstream regulatory pathways might therefore be more promising options.