3.1 Introduction

Cervical cancer is a major burden on healthcare systems, accounting for close to 0.6 million new cases every year worldwide and ranking fourth among cancers in women [1]. A key to tackling the disease at a global level is large-scale screening that specifically identifies the causative viral infection (human papillomavirus, HPV) at an early stage. Over the years, various research groups have worked extensively on identifying specific biosignatures of HPV infection and correlating them with the various stages of cervical cancer (CC). According to the National Institutes of Health, these biosignatures/biomarkers are defined as “A biological molecule found in blood, other body fluids, or tissues that is a sign of a normal or abnormal process, or of a condition or disease” [2]. Biomarkers thus aid early detection, diagnosis, prognosis, and prediction of patient outcomes.

As described in the previous chapter, the presence of an HPV infection does not imply development into invasive cancer. In more than 90% of cases, the virus is cleared from the body within about 2–3 years [3]. Only in a small percentage of the infected population (less than 8%), often but not exclusively those with a compromised immune system, does the infection give rise to cervical lesions, which further develop into carcinoma in situ and then into invasive carcinoma of the cervix [4]. Thus, an ideal biomarker should accurately unmask the infection at the precancerous lesion stage, when the probability of transformation into invasive carcinoma is higher, providing a chance for early intervention and improving disease management. Identification of stage-dependent biomarkers (risk assessment) that can distinguish between transient and clinically significant infections is therefore a critical necessity for the detection of cervical cancer. This is further substantiated by the fact that the treatment course depends on the grade of the lesion. Beyond screening for the early onset of disease, biomarkers are used at every stage of the disease, for surveillance of treatment response and for prognosis, to ascertain patient outcomes on a case-by-case basis.

The biomarkers for cervical cancer are broadly classified into molecular markers and protein-based markers. The molecular markers can further be subdivided into DNA and RNA based markers which are characteristic of either the virus or the host.

3.2 Molecular Markers

3.2.1 DNA Based Markers

Following the publication of the HPV DNA sequence in the late 1980s, one of the first biomarkers for the detection of CC was HPV DNA itself, identified in various samples [5]. Radiolabeled DNA probes were applied to cervical smears or scrapes in a dot-blot assay. This was shortly followed by in situ hybridization using non-radiolabeled fluorescent probes.

One such early study reports in situ hybridization for the detection of HPV types 1a, 6b, 16, and 11 using synthetic 30-mers labeled with biotin and targeting the beginning of the E6 open reading frame. The study successfully discriminated between HPV-16 and HPV-11, whose probes differed by only four bases, with minimal cross-hybridization. The total detection time of 2 h (essentially just the incubation time) paved the way towards an easy, safe (in comparison with radiolabeled detection techniques), and efficient HPV detection system [6]. Following the in situ hybridization detection of HPV in cervical smears, attempts have also been made to identify HPV in Cervical Intraepithelial Neoplasia (CIN) [7].

Initially, the identification of HPV DNA in cervical tissue samples was motivated by the idea that integration of the HPV genome into the host genome precedes the development of lesions into invasive carcinoma. However, after years of research, it is now widely accepted that progression into CC precedes the integration process. Thus, alternative biomarkers are being studied to diagnose CC effectively [8].

Epigenetics is defined as the study of heritable variations in phenotype that occur without changes in the DNA sequence [9]. DNA methylation and histone acetylation are the most common chemical modifications that result in altered gene expression. Methylation occurs predominantly at cytosine-guanine (CpG) dinucleotides, which may or may not be part of CpG islands [10]. In cancer cells, hypomethylation is typically observed genome-wide, while hypermethylation is observed at promoter regions, resulting in inactivation of tumor-suppressor genes [11].

The expression level of the HPV L1 protein plays a major role in determining the grade of a CIN and the probability of its progressing to CC. As the L1 protein, which forms the viral capsid, is strongly immunogenic, the basal cells downregulate the L1 protein primarily in the case of a productive infection while not in the case of a low-grade lesion [12]. Since gene regulation plays a major role in the progression of the infection, epigenetic modifications such as methylation of the L1 gene come into the picture, as they are characteristic of the integration of viral genes into the host genome [13]. Analysis of the L1 gene rather than the conventional analysis of the LCR (long control region) has also been shown to be more powerful, since the latter is largely influenced by the physical state of the virus [14].

A pyrosequencing-based study of exfoliated cervical cells collected from a Thai female population suggests that the methylation status of the HPV 16 L1 gene (with specific emphasis on the 5′ and 3′ ends of the gene) can serve as a marker of infection progression (Fig. 3.1a). A clear distinction between CIN1 and the subsequent stages can be made by analyzing hypermethylation at the 5′ CpG sites 5600 and 5609, while comparatively lower methylation percentages are found at the CpG sites of the 3′ terminus of the L1 gene. Thus, the combined analysis of the percentage methylation at sites 5600 and 5609 in exfoliated cervical cells can be used as a prognostic marker for CC, chiefly to differentiate CIN2–3 from CC [15].

Fig. 3.1 (a) Stage-dependent methylation status variation of the HPV 16 L1 gene [15]. (b) Proposed independent predictors of CIN2+ based on global methylation patterns [16]. (Copyrights Received)

In addition to the methylation status of HPV-specific genes, the global DNA methylation profile of various tumor suppressor genes also provides an overall picture of the status of lesions. Based on quantitative methylation-specific PCR, it was concluded that, of the 15 genes analyzed, hypermethylation of hsa-miR-124, SOX1, TERT, and LMX1A independently predicted CIN2+ (95% confidence interval) regardless of HPV status (Fig. 3.1b) [16].

Quantitative methylation-specific PCR offers sensitivity equal to that of the HPV DNA test. The study reports that hsa-miR-124 helps improve cell adhesion due to its role in inducing expression of insulin-growth factor, while LMX1A aids epithelial–mesenchymal transition (EMT), an important hallmark of cancer. The study also hypothesizes an alternative route in the overall cervical carcinogenesis pathway, suggesting that even after clearance of HPV from the system, the initial hypermethylation caused by the virus could influence progression to a high-grade lesion [16]. This provides substantial evidence that, in addition to the traditional study of epigenetic changes in the genome, studies on miRNA could help identify novel biomarkers of CC. A few other epigenetics-based markers for CC are described in Table 3.1.

Table 3.1 List of methylation pattern-based biomarkers for cervical cancer detection

Epigenetics-based biomarkers are still far from being used in commercial assays, primarily because the assays used to identify these methylation patterns are not well standardized, resulting in false-positive markers and large variability. Moreover, in contrast to the limited number of RNA-based markers, the sites at which DNA methylation can take place are extremely numerous and diverse, which further complicates analysis [24].

3.2.2 RNA Based Markers

3.2.2.1 miRNA-Based Markers

Around 98% of the human genome consists of non-coding regions, whose transcripts broadly include microRNAs (miRNAs), lncRNAs, and circRNAs. miRNAs are RNAs around 20 nt long that suppress gene expression by binding to the 3′UTR (untranslated region) of mRNAs, and they can modulate the expression of close to 60% of coding genes in humans [25]. miRNAs are extremely stable: they resist ribonucleases in bodily fluids because they exist extracellularly either within exosomes or in complexes with proteins such as Ago (Argonaute) [26]. This makes them easily accessible for analysis from bodily fluids. Since a single differentially expressed miRNA may show the same change in multiple disease conditions, multi-panel miRNA analysis has been widely adopted, further improving the sensitivity and selectivity of the tests [27].
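The binding of a miRNA to a 3′UTR can be illustrated with a minimal sketch. It assumes a deliberately simplified model in which a target site is any stretch of the 3′UTR that is perfectly complementary to nucleotides 2–8 of the miRNA (the "seed"), one common convention in target prediction. The function names and both sequences are hypothetical and serve only as an illustration; they are not taken from the studies cited here.

```python
# Simplified seed-match search: assumes perfect Watson-Crick pairing
# between miRNA nucleotides 2-8 and the 3'UTR (an illustrative model only).
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence (5'->3')."""
    return rna.translate(COMPLEMENT)[::-1]


def seed_sites(mirna: str, utr: str) -> list[int]:
    """Return 0-based start positions in `utr` that pair with the miRNA seed."""
    seed = mirna[1:8]                # nucleotides 2-8 of the miRNA
    site = reverse_complement(seed)  # sequence expected on the mRNA
    width = len(site)
    return [i for i in range(len(utr) - width + 1) if utr[i:i + width] == site]


# Hypothetical sequences, written 5'->3'.
toy_mirna = "UACGGAUCCAGUUCGAUAGCAA"
toy_utr = "AAGCUGAUCCGUAACCGGAUCCGUUU"

positions = seed_sites(toy_mirna, toy_utr)
print("seed-match sites at positions:", positions)
```

Real target-prediction tools weigh additional features such as site context and conservation, so this sketch should be read only as a picture of the pairing principle.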

With regard to their use as biomarkers for CC, miRNAs can be broadly classified into those produced under the influence of HPV genes and those that are not. A study providing evidence for the latter shows that upregulation of miR-21-5p and downregulation of miR-34a in 118 CC tissue samples are characteristic of the early onset of CC (progression from pre-neoplastic lesion to CC). In particular, miR-34a expression decreases consistently and significantly as the stages of CC progress, starting with CIN1. The human telomerase RNA component (hTERC), reported in the same study, is the RNA template used by the enzyme telomerase during telomere elongation. Although not a miRNA, hTERC has been found in a significantly higher number of copies as cancer progresses, and can thus serve as a marker of the transformation of precancerous lesions.

A recent study by Xin Liu shows that the relative overexpression of miR-20a in CC cell lines was facilitated by the HPV E6 gene, as confirmed by gene silencing studies. Further analysis showed that PDCD6, the target of miR-20a, was downregulated, enhancing cell proliferation through activation of the Akt/p38 pathway. This provides substantial evidence that HPV genes can strongly influence the miRNA profiles of the host [28] (Fig. 3.2).

Fig. 3.2 Stage-dependent expression of various miRNAs as potential biomarkers for CC [29]. (Copyright Received)

However, one of the major challenges in using miRNA-based biomarkers for the detection of CC is inconsistency in results; universal standardization of protocols for sample collection, analysis, and detection is therefore necessary for greater reliability [30] (Table 3.2).

Table 3.2 List of miRNA-based biomarkers for cervical cancer detection

3.2.2.2 circRNA-Based Markers

Circular RNAs (circRNAs) are novel non-coding RNAs that differ from miRNAs and lncRNAs in their structure. Through back-splicing, the free 5′ and 3′ ends are joined covalently to form a closed circular structure, unlike their linear counterparts (miRNA and lncRNA) [48]. While circRNAs regulate gene expression through a variety of mechanisms, the most widely recognized is their ability to function as miRNA sponges. circRNAs carry miRNA response elements (MREs) that selectively capture miRNAs, acting like a sponge [49]. Binding of the miRNA to the circRNA disrupts downstream signaling, resulting in aberrant expression of the sponged miRNA's targets [50]. A distinguishing feature that makes circRNAs more reliable biomarkers than other non-coding RNAs is their high stability in mammalian systems and the presence of highly conserved sequences [51]. Their high stability in bodily fluids allows detection not only in tissue samples but also in serum, plasma, urine, etc. circRNA-based CC biomarkers reported in the last 3 years are listed in Table 3.3. A few notable studies are discussed below.

Table 3.3 List of circRNA-based biomarkers for cervical cancer detection

An extensive study conducted by Ma et al. on the profiling of circRNA in cervical cancer cell lines revealed that, of a total of 4760 circRNAs detected, 9.3% were differentially expressed in CC cells [59]. Further analysis showed that circ_000284 was consistently and significantly overexpressed across the five cervical cell lines under consideration when compared to normal cells. It was concluded that, because circ_000284 sponges miR-506, SNAI1 (Snail, the target of miR-506) is overexpressed; this protein drives the epithelial-to-mesenchymal transition (EMT), facilitating metastasis of carcinoma in situ [62].

While the previously discussed study was restricted to in vitro analysis, another study, which included cervical tissue samples from patients, suggested circ_0005576 as a potential biomarker for CC. This circRNA is a sponge for miR-153-3p and was found to be expressed differentially depending on the stage of cancer (CIN1,2a vs CIN2b), and it correlated well with lymph node metastasis status. Based on Kaplan–Meier survival analysis, it was also concluded that the overall outcome of patients with high expression of circ_0005576 is poor, since the target of miR-153-3p, kinesin family member 20A (KIF20A), is overexpressed and is known to promote excessive cell proliferation [63].
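For readers unfamiliar with Kaplan–Meier analysis, the sketch below shows how a survival curve of this kind is typically computed with the product-limit estimator. It assumes patients are split into high- and low-expression groups; the function name and the follow-up data are hypothetical and are not taken from the cited study.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Product-limit estimate of survival.

    times  : follow-up time of each patient
    events : 1 if death was observed at that time, 0 if the patient was censored
    Returns a list of (time, survival probability) at each observed death time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    deaths = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)          # deaths plus censorings at each time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]         # everyone leaving at t drops out of the risk set
    return curve


# Hypothetical follow-up data in months (1 = death observed, 0 = censored).
high_expr = kaplan_meier([5, 8, 12, 20], [1, 1, 0, 1])
low_expr = kaplan_meier([10, 15, 24, 30], [0, 1, 0, 0])

for label, curve in (("high expression", high_expr), ("low expression", low_expr)):
    print(label, [(t, round(s, 3)) for t, s in curve])
```

In this toy data the high-expression curve drops faster than the low-expression curve, which is the pattern a poorer overall outcome would produce; a formal comparison would additionally need a test such as the log-rank test.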

However, the conclusions summarized in Table 3.3 are drawn from either in vitro models or CC tumor tissue samples. Notably, in all these reports the control samples are non-cancerous tissue samples adjacent to the cancerous tissue. Because gene expression patterns vary among tissues (even among adjacent tissues), the results may not be completely reliable. The reports also leave open the question of how abundant these circRNAs are in circulating fluids, and hence whether they are suitable for non-invasive testing. Even though circRNAs were discovered in 1976, it was not until 2012 that circRNAs in humans were sequenced [64]. Research on circRNAs thus remains at an early stage, and further studies are required to validate their applicability as prognostic biomarkers for CC.

3.2.2.3 lncRNA-Based Markers

Long non-coding RNAs (lncRNAs) are non-protein-coding RNAs longer than 200 nt that lack an open reading frame [65]. In general, lncRNAs regulate cellular processes by acting as transcriptional regulators, as scaffolds that recruit effectors, and as guide RNAs [66]. In cancer, lncRNAs interfere with normal gene regulation by acting as miRNA sponges, thereby affecting downstream signaling pathways in the same way as circRNAs [67]. Unlike circRNAs, however, they do not contain highly conserved sequences [68]. The main advantage of lncRNA analysis is that tumor samples need not be extracted invasively: lncRNAs are stable circulating RNAs, so their expression can be studied in the body fluids of patients alone [69].

A recent list of lncRNA-based markers is tabulated in Table 3.4. A representative study by Duan et al. shows relatively higher expression of RHPN1 antisense RNA 1 (RHPN1-AS1) in CC cell lines than in normal squamous epithelial cells. This was further substantiated by the analysis of 60 CC tumor tissue samples, which show a similar trend [75]. Rescue experiments further indicate that fibroblast growth factor 2 (FGF2) was overexpressed (becoming oncogenic, with stem cell-like properties) due to the sponging of miR-299-3p by RHPN1-AS1, which is thus involved in tumorigenesis through invasion, proliferation, and metastasis of the cancerous cells [77] (Fig. 3.3).

Table 3.4 List of lncRNA-based biomarkers for cervical cancer detection

Fig. 3.3 (a) CC tissue sample, (b) CC cell lines showing significant overexpression of RHPN1-AS1 compared to controls [75]. (c) CC stage-dependent downregulation of lnc_H19 [70] in CC cell lines. (Copyrights Received)

A particularly interesting lncRNA is H19, which has been widely reported to have contradictory roles in the development of CC. In cell lines, overexpression of H19 (sponging hsa-miR-675) has been observed, whereas in tissue samples a lower expression of H19 (miR-138-5p) has been reported [70, 78]. The primary target of miR-675 is involved in controlling the tumor environment and thus facilitates CC migration, while miR-138-5p promotes tumor suppression [79, 80]. The authors attribute the different miRNA targets to stage-dependent molecular alterations in clinical samples that are not pronounced in cell lines [70].

lncRNAs are promising candidates for CC biomarkers. However, further research on their practicality is needed, since circulating lncRNAs may not be abundant enough to be detected sensitively. A deeper understanding of the CC stage-dependent release of lncRNAs from cervical cells into other bodily fluids is essential to extend their applicability as prognostic markers. Moreover, most of the differentially expressed lncRNAs are not specific to CC but are found across most cancer types, so there is a need to identify lncRNAs that are characteristic of CC alone.

3.3 Protein-Based Markers

The protein-based markers are essentially host proteins that are differentially expressed under the influence of the HPV oncoproteins. The proteins identified generally play a pivotal role in the cell cycle, since the primary targets of the HPV oncoproteins are tumor suppressors (retinoblastoma protein pRb, p53) and proto-oncogenes (EGFR, CDK4, Ras). Fig. 3.4 provides an overview of the cell cycle modulations caused by expression of the HPV oncogenes E6 and E7. As most of these protein-based markers are well established, only those newly identified in the last 3 years are tabulated in Table 3.5.

Fig. 3.4 Modulation of cell cycle molecules by HPV oncoproteins [81]. (Copyright Received)

Table 3.5 List of protein-based biomarkers for cervical cancer detection

Since HPV infects the basal cells and moves upwards towards the squamous epithelial cells, the topmost or outer layer of the epithelium (exfoliated cervical cells) is a good source for understanding the stage of CC. A study by Jin et al. evaluated tumor-associated proteins (TAPs) in tissue samples from 146 CC patients using ELISA. p53, SLeA (Sialyl Lewis A), and HPV 16 L1 were identified as potential markers of cervical lesions, while the expression of SLeA in combination with L1 was found to depend on the progression of the disease. In addition, SLeA and p53 together differentiated CC from normal samples with 91.3% sensitivity and 96.7% specificity [86].
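Sensitivity and specificity figures such as those above are derived from a confusion matrix of test results against the true disease status. The sketch below shows the calculation; the function name and the counts are hypothetical, chosen only for illustration, and are not the data of the cited study.

```python
def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> tuple[float, float]:
    """Return (sensitivity, specificity) as fractions.

    tp, fn : diseased samples called positive / negative by the marker panel
    tn, fp : normal samples called negative / positive by the marker panel
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one diseased and one normal sample")
    sensitivity = tp / (tp + fn)   # fraction of diseased samples detected
    specificity = tn / (tn + fp)   # fraction of normal samples correctly excluded
    return sensitivity, specificity


# Hypothetical counts: 100 CC samples and 100 normal samples.
sens, spec = sensitivity_specificity(tp=90, fn=10, tn=95, fp=5)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```

A high sensitivity means few cancers are missed, while a high specificity means few healthy women are flagged for follow-up; a screening marker generally needs both.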

Most protein-based biomarkers can be tied to some part of the aberrant cell signaling pathways of cancerous cells. In contrast, for proteins such as the hepatitis B virus X-interacting protein (HBXIP), no mechanism has been clearly elucidated for a role in CC, even though its oncogenic role in the development of breast cancer is well established. In the study, HBXIP overexpression was found to be inversely related to the overall survival rate of patients. It thus also serves as a prognostic marker of CC, based on immunohistochemical studies on 105 CIN patients compared with 31 normal cervical epithelial samples. The strongly positive rate of HBXIP expression was close to 57.9% in SCCs and was also strongly correlated with the differentiation stage, p63 expression status (a key player in SCC tumorigenesis), and lymph node metastasis (Fig. 3.5) [82].

Fig. 3.5 HBXIP overexpression as an indicator of clinical stage, differentiation, and lymph node metastasis of CC [82]. (Copyright Received)

Higher expression of cancer antigen 125 (CA-125), squamous cell carcinoma antigen (SCC-Ag), and high-sensitivity C-reactive protein (hs-CRP) in comparison with normal cells has been reported by Guo et al. The authors claim that these protein markers can indicate whether recurrence is to be expected in CC patients. Since the survival rate is <20% when the disease recurs within 5 years, such markers could help in timing treatment and improving the survival rate [87].

3.4 Conclusion

Cervical cancer is one of the major causes of morbidity among women. The high morbidity rate is closely associated with the fact that the disease is diagnosed at a very late stage, which underscores the importance of early large-scale screening strategies. While currently used screening techniques such as cytology have low specificity for precancerous lesions, DNA-, RNA-, and protein-based biomarkers have the potential to be exploited for CC diagnosis. The detection of HPV DNA as a biomarker is well established; epigenetics-based and RNA-based biomarkers also have large potential, making them promising candidates for diagnosis and for studying the effect of therapy. Further studies associating nucleic acid expression and methylation patterns with clinical outcomes may provide promising results in terms of disease management with suitable therapeutic interventions. Protein-based biomarkers, on the other hand, will have to be further studied and validated for use as CC biomarkers, as they compromise on specificity, unlike the DNA- and RNA-based markers. Thus, an in-depth evaluation of the molecular and protein-based biomarkers will pave the way to affordable, simple, selective, and specific detection of CC at an early stage.

Declaration of Competing Interest

The authors declare no conflicts of interest.