
2.1 Cancer Biomarkers

Cancer is a cluster of diseases responsible for about nine million deaths each year, almost one-sixth of global mortality. The rapidly increasing number of cancer cases places a growing strain on the health sector, and cases are forecast to increase by about 70% over the next 20 years. This disease burden can be reduced effectively by applying cancer biomarkers for prediction, early detection, and appropriate therapy followed by routine checkups. The US Food and Drug Administration (FDA) defines a biomarker as follows: “Any biological molecule that can be used as diagnostic indicator to measure the risk and presence of disease” (Ilyin et al. 2004; World Health Organization 2017). A biomarker can be an enzyme, cell, gene, protein, or nucleic acid that is detectable in blood, urine, tissues, or other body fluids. Cancer biomarkers (CBMs) are biological substances secreted by tumors or other cells that can be used to detect and diagnose cancer, to estimate prognosis, and to distinguish subpopulations of patients by their response to a therapy (Goossens et al. 2015; Rhea and Molinaro 2011).

2.2 Types of Cancer Biomarkers

Cancer biomarkers can be categorized into the following classes based on their usage:

2.2.1 Screening Biomarkers

Screening biomarkers are used for the early detection of cancer. They identify individuals who are at risk of developing a specific disease, or detect a disease while the affected individuals are still asymptomatic, which differs from the diagnosis of symptomatic individuals. Early detection increases the survival rate and reduces complications and morbidity (Weigelt et al. 2005). Examples of screening biomarkers include alpha-fetoprotein (AFP) for hepatocellular cancer in high-risk individuals, CA125 for ovarian cancer, prostate-specific antigen (PSA) for prostate cancer, and fecal occult blood testing (FOBT) for colorectal cancer (Duffy 2015).

2.2.2 Predictive Biomarkers

Predictive biomarkers are used to predict the response of cancer cells to a specific therapy or drug. Examples include HER2 activation in breast cancer, which predicts response to trastuzumab, and mutated KRAS in colorectal cancer, which predicts resistance to the EGFR inhibitor cetuximab (Cameron et al. 2017; Romond et al. 2005; Slamon et al. 2001; Van Cutsem et al. 2009).

2.2.3 Prognostic Biomarkers

Prognostic biomarkers provide information about disease recurrence or progression but are not linked directly to therapeutic interventions. An example is the 21-gene recurrence score, which is used to predict recurrence in tamoxifen-treated, node-negative breast cancer (Paik et al. 2004).

2.2.4 Diagnostic Biomarkers

Diagnostic biomarkers are used to detect the presence or absence of a particular disease in a patient. For example, stool DNA testing has recently been used as a diagnostic biomarker in colorectal cancer surveillance (Imperiale et al. 2014).

2.2.5 Monitoring Biomarkers

Biomarkers used to monitor or predict cancer recurrence after therapy are known as monitoring biomarkers. When the level of such a biomarker rises above its basal level, recurrence can be predicted biochemically before any clinical or radiological evidence appears. An example is carbohydrate antigen CA19-9, which is used as a monitoring biomarker in pancreatic cancer and has been FDA approved since 2002 (Bast et al. 2001; Koprowski et al. 1979; Rosty and Goggins 2002; Sharma 2009).

2.3 Discovery of CBMs

The discovery of cancer biomarkers follows numerous routes that span several disciplines, from high-throughput data generation and the resulting big data, through the use of machine learning algorithms, to the validation of biomarkers in preclinical and clinical trials. The steps involved in cancer biomarker discovery are depicted in Fig. 2.1.

Fig. 2.1

Depiction of the technologies used for the identification and validation of cancer biomarkers. High-throughput technologies have generated a huge bulk of big data, which is deciphered by data mining to produce meaningful information. Data mining identifies novel targets that go on to become cancer biomarkers through approaches such as support vector machine learning and the analysis of integrated databases. In the meantime, the hunt for biomarker-specific inhibitors begins, employing computational techniques such as cheminformatics to identify functional groups with potential binding affinity for the identified biomarkers. The potential cancer biomarker is validated in preclinical studies using in silico, in vitro, microfluidics, and in vivo approaches. This leads to developmental validation of the cancer biomarker in the human population through clinical trials, with the ultimate goal of biomarker approval for cancer clinics.

2.3.1 Preclinical Studies

2.3.1.1 In-Silico Studies

Gene data from the large databases held in gene expression profiling repositories can be integrated, evaluated, and analyzed with a set of tools termed “Bioplat” (biomarker platform). The core purpose of the user-friendly Bioplat software is to aid the early diagnosis and prognosis of cancer patients by means of functional genomic data. Besides the in-silico identification of new cancer biomarkers, it also supports data extraction from gene repositories and gene expression analysis.

Bioplat plays a significant role in editing genes and creating biomarkers with the help of the identifiers in its embedded database, namely gene name, Entrez, Ensembl, and probe IDs. Additionally, Bioplat can integrate gene data from online resources including DAVID (Database for Annotation, Visualization and Integrated Discovery), STRING (Search Tool for the Retrieval of Interacting Genes/Proteins), Enrichr, Expression Atlas, RNA-seq Atlas, and GeneCards.

Gene signature optimization is a prominent step in Bioplat. Its main optimization processes are “blind search” and “particle swarm optimization (PSO)”, which help find the optimal genes in less time (Butti et al. 2014).
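To make the idea of particle swarm optimization more concrete, the following is a minimal, generic PSO sketch in Python. It is not Bioplat's implementation: the objective function `toy_objective`, the parameter values (`w`, `c1`, `c2`, swarm size, bounds), and all function names are assumptions chosen purely for illustration. In a gene signature setting, the objective would instead score a candidate signature, for example by how well it separates patient groups.

```python
import random


def particle_swarm_minimize(objective, dim, n_particles=30, n_iter=300,
                            w=0.7, c1=1.5, c2=1.5, bounds=(-5.0, 5.0), seed=0):
    """Minimize `objective` over a box using a basic particle swarm."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]            # personal best positions
    best_val = [objective(p) for p in pos]    # personal best scores
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]  # swarm-wide best so far

    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull towards personal best + pull towards swarm best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                # Move the particle and keep it inside the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val


def toy_objective(x):
    """Hypothetical stand-in for a signature score; minimum at (1, 1, ...)."""
    return sum((xi - 1.0) ** 2 for xi in x)


best_x, best_f = particle_swarm_minimize(toy_objective, dim=2)
```

Each particle remembers its own best position while the swarm shares a global best, which is why PSO can reach a good solution with far fewer evaluations than an exhaustive blind search over the same space.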

Another study describes further tools for the in-silico identification of cancer biomarkers, including Panther, UniProtKB, NetOGlyc, NetNGlyc, Oncomine, and Cytoscape (Azevedo et al. 2018).

2.3.1.2 In Vitro

The use of tissue culture has paved a promising path towards the discovery of cancer biomarkers. Tissue cultures are rich in tumor cell lines and hence offer a wide spectrum of candidate biomarkers (Minamida et al. 2011). Limited access to patient tissue samples has driven the use of tumor cell lines as a second option for the discovery of potential biomarkers.

Secretory proteins are the major component of conditioned media (CM) and play a major role in identifying biomarkers with greater efficacy (Xu et al. 2010).

Traditional two-dimensional (2D) cell cultures are being replaced by three-dimensional (3D) cell cultures, which better represent homeostasis during in vitro analysis. 3D cultures resemble tissue engineering models, which helps in understanding gene expression and the mutated molecular pathways of cancer (Lenas et al. 2009; Martin et al. 2008).

Among the techniques used to study biomarkers, mass spectrometry has received central attention. It can determine molecular mass accurately and precisely from a minimal amount of sample (Boja and Rodriguez 2012). Mass spectrometry-based techniques for biomarker identification fall into two broad categories: gel-based (2-DE and 2D-DIGE) and gel-free (SILAC, iTRAQ) (Leong et al. 2012).

Gel-free techniques have also emerged as promising tools for biomarker discovery. In tissue culture model systems, stable isotope labelling by amino acids in cell culture (SILAC), in which amino acids containing stable isotope nuclei are incorporated into proteins, is now considered the method of choice. Isobaric tags for relative and absolute quantitation (iTRAQ) can be used as an alternative method (Mann 2006).
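Relative quantification in SILAC is commonly read out as the ratio of heavy-labelled to light-labelled signal for each protein. The sketch below illustrates that calculation only; the protein names and intensity values are hypothetical placeholders, not measured data, and the function name is an assumption.

```python
import math


def silac_log2_ratios(intensities):
    """Return {protein: log2(heavy / light)} for pairs with positive signal.

    `intensities` maps a protein name to a (light, heavy) intensity pair.
    Proteins with a missing (zero) channel are skipped rather than guessed.
    """
    ratios = {}
    for protein, (light, heavy) in intensities.items():
        if light > 0 and heavy > 0:
            ratios[protein] = math.log2(heavy / light)
    return ratios


# Hypothetical (light, heavy) intensities for three placeholder proteins.
example = {
    "protein_A": (1000.0, 4000.0),  # higher in the heavy-labelled condition
    "protein_B": (2000.0, 2000.0),  # unchanged
    "protein_C": (800.0, 200.0),    # lower in the heavy-labelled condition
}
ratios = silac_log2_ratios(example)
```

On the log2 scale a value of 0 means no change, and positive or negative values mean enrichment or depletion in the heavy-labelled condition, which makes candidate secreted biomarkers easy to rank.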

2.3.1.3 Microfluidics Chip Technology

Microfluidic chip technology controls fluids on a microscale, which allows cell-culture parameters to be manipulated comprehensively to mimic the microenvironment of a malignant tumor in vivo (Xu et al. 2016). The microfluidic chip has emerged as a biochip that brings together numerous fields, including cell biology, oncology, pathology, physiology, biophysics, biomechanics, bio-printing, motorized design, and so forth (Chaudhuri et al. 2016; Rosenbluth et al. 2008). In recent decades, biochip technology has displayed remarkable potential in the field of cancer treatment. A number of validation techniques, such as 2D and 3D cell and tissue cultures, spheroids, and tissue organoid cultures, can be performed on microfluidic biochips (Vadivelu et al. 2017). Moreover, cell lines and tissues derived from cancer patients can be cultured on microfluidic biochips in an observable, controllable, manageable, and high-throughput fashion, which should significantly advance personalized medicine (Mulholland et al. 2018).

Novel biomarker and drug development consists of several major stages, including drug discovery, validation in preclinical trials, and clinical development trials. Since its initial development in the 1990s, microfluidic biochip technology has been employed in multiple research disciplines, including single-cell analysis, medicinal synthesis, proteomics, tissue engineering, library screening, and medical diagnosis (Yu et al. 2014). Such platforms deliver new insight into biological mechanisms and enable the rapid and effective generation of new data. The microfluidic biochip revolution accelerated because miniaturization offers numerous advantages: high-throughput analysis, improved sensitivity, enhanced analytical potential, multiplexing, low reagent consumption, portability, and ease of fabrication (Boobphahom et al. 2020). This ultimately leads to economical in vitro models for lead compound identification that can reliably predict the effectiveness, cytotoxicity, and pharmacokinetics of test compounds in humans, as well as for novel library screening analyses.

2.3.1.4 In Vivo

With the emergence of biomarker discovery in in vivo mouse models, extracting plasma from genetically modified mice has become an attractive approach (Hingorani et al. 2003). Plasma extracted from mice at different stages of pancreatic tumor development, followed by proteomic analysis, helps identify protein alterations (Aguirre et al. 2003).

Comparative analysis revealed a noticeable similarity in the expression of candidate biomarkers between human and mouse models. To identify differences in protein concentrations, the samples are labeled with different Cy dyes and analyzed with the intact-protein analysis system (IPAS). Mass spectrometry can additionally help characterize the differences observed in protein bands (Wang et al. 2005).

Another study compared sera from a mouse model bearing human A549 lung adenocarcinoma cells with sera from a control group. The results showed prominent quantitative and qualitative alterations in protein expression between the two groups. The key question was whether the differences in protein expression were due to acute-phase inflammatory protein responses or antibody-mediated immune responses. Histopathological staining indicated that the protein alterations were secondary changes of host origin and were not related to tumor cell-derived proteins (Subramaniam et al. 2013).

2.3.2 Clinical Studies

2.3.2.1 CBMs Already in Clinics?

HER2 (ERBB2), a member of the epidermal growth factor receptor (EGFR) family, is used as a molecular biomarker in clinical settings. Amplification and overexpression of HER2 predict a considerable response to monoclonal antibodies such as trastuzumab and pertuzumab. Among the roughly 20% of breast cancer patients with HER2-positive disease, phase 3 trials showed appreciable results for anti-HER2 therapy along with better survival rates (Piccart-Gebhart et al. 2005; Romond et al. 2005).

Presently, ten HER2 assays have been approved by the FDA as companion diagnostic devices, and three HER2 assays (nucleic acid-based tests) have been approved by the Center for Devices and Radiological Health. Other biomarkers in clinical use include BCR-ABL in chronic myeloid leukemia, KRAS mutations in colorectal cancer, and multiple mutations in non-small cell lung cancer (NSCLC) (Kalavar and Philip 2019) (Table 2.1).

Table 2.1 List of FDA-approved protein tumor markers presently utilized in clinical practice adapted from (Füzéry et al. 2013)

2.3.2.2 CBMs Clinical Trials

Significant efforts are under way to introduce predictive biomarkers that can replace invasive cancer biomarkers. Most are based on a single protein or gene and are in phase II or III trials for evaluation and validation alongside therapeutic targets (Tables 2.2 and 2.3).

Table 2.2 List of CBMs in Phase 1–Phase 4 clinical trials adapted from (Goossens et al. 2015)
Table 2.3 List of ongoing clinical trials for CBMs adapted from (Kirwan et al. 2015)

2.4 Technologies That Lead to CBMs Discovery

2.4.1 Genomics (Nuclear and Mitochondrial CBMs)

2.4.1.1 Next-Generation Sequencing (DNA and RNA seq)

Genomic alterations are being studied in most major tumor types using sequencing techniques (Brooks 2012). Maxam and Gilbert, with their chemical cleavage method, and Sanger, with dideoxy synthesis, laid the basis for next-generation sequencing (Maxam and Gilbert 1980; Sanger and Coulson 1975; Sanger et al. 1977). Next-generation sequencing, also called deep or massively parallel sequencing, can sequence an entire genome in a single day, which is extremely fast compared with Sanger sequencing, which took almost 10 years to sequence the human genome (Behjati and Tarpey 2013). Short-read whole genome sequencing and barcode-linked read sequencing are novel approaches that can resolve the genomic rearrangements that may lead to tumorigenesis (Cunha 2017).

2.4.1.2 Microarrays: Gene Expression Profiling

A microarray is an arrangement of nucleic acids attached to a solid surface that can be used to detect the expression of different nucleic acids (DNA, mRNA, miRNA, circRNA, etc.). Recently, a circular RNA microarray was used to discover novel circulating biomarkers for the diagnosis of gastric cancer.
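Once microarray signals are available, a common first step in gene expression profiling is to compare each probe between tumor and normal samples. The following is a minimal sketch under stated assumptions: the signal values are hypothetical, the function name is illustrative, and a Welch t statistic on log2-transformed intensities stands in for the more elaborate statistical models used by dedicated microarray packages.

```python
import math
from statistics import mean, variance


def differential_expression(tumor, normal):
    """Return (log2 fold change, Welch t statistic) for one probe.

    Intensities are log2-transformed first, so the fold change is the
    difference between the mean log2 signals of the two groups.
    """
    log_tumor = [math.log2(v) for v in tumor]
    log_normal = [math.log2(v) for v in normal]
    log2_fc = mean(log_tumor) - mean(log_normal)
    # Welch standard error: the groups may have unequal variances.
    std_err = math.sqrt(variance(log_tumor) / len(log_tumor)
                        + variance(log_normal) / len(log_normal))
    return log2_fc, log2_fc / std_err


# Hypothetical raw intensities for a single probe.
tumor_signal = [64.0, 128.0, 256.0]
normal_signal = [4.0, 8.0, 16.0]
log2_fc, t_stat = differential_expression(tumor_signal, normal_signal)
```

In a real study this calculation would be repeated for every probe on the array, followed by multiple-testing correction before any probe is proposed as a candidate biomarker.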

2.4.1.3 Genome-Wide Association Studies

Genome-wide association studies (GWAS) are used to identify associations between genotype and phenotype and can link a genetic variant to a particular disease (Tam et al. 2019). This approach has proved particularly effective in breast cancer, where it has been used to associate many risk factors and biomarkers with the disease (Walsh et al. 2016).
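At the level of a single variant, a case-control association can be illustrated with an allelic chi-square test on a 2×2 table of allele counts. The sketch below is only an illustration: the allele counts are hypothetical, the function names are assumptions, and real GWAS pipelines add quality control, covariate adjustment, and correction for testing very many variants.

```python
import math


def allelic_chi_square(case_alt, case_ref, control_alt, control_ref):
    """Pearson chi-square statistic (1 degree of freedom) for a 2x2 table."""
    table = [[case_alt, case_ref], [control_alt, control_ref]]
    total = sum(sum(row) for row in table)
    row_sums = [sum(row) for row in table]
    col_sums = [table[0][j] + table[1][j] for j in range(2)]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected
    return chi2


def chi_square_p_value_1df(chi2):
    """Upper-tail p-value of a chi-square distribution with 1 df."""
    return math.erfc(math.sqrt(chi2 / 2.0))


def odds_ratio(case_alt, case_ref, control_alt, control_ref):
    """Odds of carrying the alternative allele in cases versus controls."""
    return (case_alt * control_ref) / (case_ref * control_alt)


# Hypothetical allele counts for one variant.
chi2 = allelic_chi_square(60, 40, 40, 60)
p_value = chi_square_p_value_1df(chi2)
effect = odds_ratio(60, 40, 40, 60)
```

With these toy counts the variant looks associated when tested alone, but because a GWAS tests a very large number of variants, a much stricter genome-wide threshold (conventionally around 5 × 10⁻⁸) is applied before an association is reported.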

2.4.2 Proteomics (Cytoplasmic and Membrane CBMs)

2.4.2.1 Western Blotting

Western blotting is an important procedure for the immunodetection of proteins, particularly less abundant proteins, after electrophoresis (Kurien and Scofield 2006). Diagnostic and therapeutic biomarkers for hepatocellular carcinoma, ovarian cancer, and breast cancer have been discovered using western blotting (Cho 2007).

2.4.2.2 FACS

Fluorescence-activated cell sorting (FACS) is a technique used to sort, detect, and count fluorescently labelled cells. More recently, intelligent image-activated cell sorting (iIACS) has been devised. This machine intelligence technology can analyze fluorescence-intensity profiles as well as multidimensional images of cells, and can therefore sort cells and their components more efficiently (Isozaki et al. 2019).

2.4.2.3 MALDI-TOF

Matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry is an inexpensive technique for analyzing the protein composition of a tissue. It has proven valuable in discovering novel biomarkers of gastrointestinal, respiratory, breast, and ovarian cancers, and has the potential to uncover many more valuable biomarkers in other types of cancer (Rodrigo et al. 2014).

2.4.3 Bioinformatics (Predictive/Deduced CBMs)

2.4.3.1 Molecular Docking

Molecular docking is a tool for analyzing the interaction between two molecules (Morris and Lim-Wilby 2008), and can therefore indicate whether two molecules are likely to interact under in vivo conditions. Many docking tools are available online. One of them is HADDOCK 2.4 (High Ambiguity Driven protein–protein DOCKing), which encodes information on identified or predicted protein interfaces as ambiguous interaction restraints and docks the proteins accordingly (Van Zundert et al. 2016); in this respect it differs from ab initio methods.

2.4.3.2 Simulations

Molecular dynamics (MD) simulation is a basic tool for evaluating biomolecules and biomolecular interactions generated through in-silico approaches (Hansson et al. 2002). Many software packages and servers are available for MD simulation, for example CABS-flex 2.0, an online server for fast modeling of protein structural flexibility (Kuriata et al. 2018), and GROMACS, a software package that simulates the Newtonian equations of motion for systems of particles (Van Der Spoel et al. 2005).
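The core of any MD engine is the numerical integration of Newton's equations of motion. The sketch below shows one widely used scheme, velocity Verlet, applied to a single particle on a harmonic spring. It is an illustration only and is not taken from GROMACS or CABS-flex; the spring constant, mass, time step, and function names are all assumptions.

```python
import math


def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Integrate one particle in 1D and return the list of (x, v) states."""
    trajectory = [(x, v)]
    f = force(x)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # complete the velocity update
        trajectory.append((x, v))
    return trajectory


SPRING_CONSTANT = 1.0  # assumed toy value
MASS = 1.0             # assumed toy value


def spring_force(x):
    """Hooke's law restoring force, F = -k x."""
    return -SPRING_CONSTANT * x


def total_energy(x, v):
    """Kinetic plus potential energy of the oscillator."""
    return 0.5 * MASS * v * v + 0.5 * SPRING_CONSTANT * x * x


trajectory = velocity_verlet(x=1.0, v=0.0, force=spring_force,
                             mass=MASS, dt=0.01, n_steps=1000)
final_x = trajectory[-1][0]
analytic_x = math.cos(10.0)  # exact solution x(t) = cos(t) at t = 10
```

Checking that the total energy stays nearly constant along the trajectory is a standard sanity test for an integrator; production MD codes apply the same idea to many interacting atoms with far more elaborate force fields.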

2.4.3.3 Molecules-Interaction Network Analysis

TargetScan and STRING are examples of servers that can be used to visualize the interactions of miRNAs with their targets and of proteins with other proteins, respectively (Agarwal et al. 2015; Szklarczyk et al. 2019). These interactions can be used to analyze and predict biomarkers.

2.4.3.4 Support Vector Machine Learning

Support vector machine (SVM) learning is a supervised learning method that uses a collection of labeled training data to generate input–output mapping functions (Wang 2005); in simple terms, it learns from examples how to classify new observations. It is a powerful classification tool that can be used to discover new biomarkers (Huang et al. 2018). ISOWN is a program based on this approach (Kalatskaya et al. 2017).
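As an illustration of how an SVM learns a classifier from labeled data, the following sketch trains a linear SVM by subgradient descent on the regularized hinge loss. It is a teaching example under stated assumptions and not the method used by ISOWN: the two features stand for hypothetical standardized expression values of two genes, the labels +1 and -1 stand for tumor and normal samples, and the learning rate, regularization strength, and function names are arbitrary choices.

```python
def train_linear_svm(samples, labels, lam=0.01, lr=0.1, epochs=100):
    """Fit weights w and bias b by subgradient descent on the hinge loss."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            score = sum(wj * xj for wj, xj in zip(w, x)) + b
            if y * score < 1:
                # Inside the margin or misclassified: push towards the label.
                w = [wj - lr * (lam * wj - y * xj) for wj, xj in zip(w, x)]
                b += lr * y
            else:
                # Correct with enough margin: only apply regularization.
                w = [wj - lr * lam * wj for wj in w]
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the separating hyperplane."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Hypothetical standardized expression of two genes per sample.
samples = [[2.0, 2.0], [3.0, 3.0], [2.5, 3.0],
           [-2.0, -2.0], [-3.0, -1.0], [-1.0, -3.0]]
labels = [1, 1, 1, -1, -1, -1]

weights, bias = train_linear_svm(samples, labels)
predictions = [predict(weights, bias, x) for x in samples]
```

In biomarker discovery the learned weights are often as interesting as the predictions, because features with large absolute weights are the ones that contribute most to separating the classes.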

2.4.3.5 Integrated Databases

The Cancer Genome Atlas (TCGA) dataset contains the molecular characteristics of over 20,000 cancer and matched normal samples spanning 33 cancer types. TCGA and other similar databases are used by ISOWN. OncoMX is a database focused more specifically on biomarkers; it integrates data and literature from different resources such as EDRN, Bgee, BioXpress, Reactome, and BioMuta (Singleton and Mazumder 2019).

2.4.4 Metabolomics

Metabolites released as byproducts of metabolic pathways or during tumor growth can be used as cancer biomarkers to detect cancer, predict the response to different therapies, and predict or monitor cancer recurrence. The levels of specific metabolites change during cancer occurrence and development, which is why they can serve as biomarkers (Cardoso et al. 2018; Haukaas et al. 2017; Winter et al. 2003; Zaimenko et al. 2017). These biomarkers can be detected in circulatory fluids such as blood and CSF, in fluids such as urine and saliva, and in the tissues themselves (Cavaco et al. 2018; Hadi et al. 2017; Harvie et al. 2016; Jagannathan and Sharma 2017). Exploring the cancer metabolome appears to be an effective way to analyze the phenotypic variations connected with tumor proliferation, because the metabolome represents the phenotype more closely than the genome, transcriptome, or proteome (Holmes et al. 2008). Metabolite markers differ from traditional biomarkers (e.g., biochemical indices) and rely on various analytical techniques, including nuclear magnetic resonance spectroscopy and mass spectrometry. Various metabolite markers have been identified to date. One thoroughly studied example is 2-hydroxyglutarate (2-HG), a product of IDH1 and IDH2 mutations that has been identified in many types of cancer, including breast cancer, renal cancer, papillary thyroid carcinoma, and AML (Borger et al. 2014; Dang et al. 2009; Fathi et al. 2014; Kanaan et al. 2014; Montrose et al. 2012; Rakheja et al. 2011; Shim et al. 2014; Wang et al. 2013).

2.4.5 Epigenetics Biomarkers

Heritable changes at the molecular level in the cell are primarily due to alterations in the nucleotide sequence, as the Human Genome Project made clear. However, further analysis has revealed the importance of other components of the human genome that can alter how phenotypes are expressed. These include epigenetic mechanisms such as DNA methylation and histone modifications, as well as non-coding RNA.

These changes may be due to external (environmental) effects or internal mutations, and they act by controlling trigger zones on the DNA, e.g., through repressor proteins. These epigenetic factors play a major role in various malignancies and may thus be used as potential biomarkers for tumor identification, progression, and recovery (Kamińska et al. 2019). Bisulfite sequencing is a valuable technique for analyzing DNA cytosine methylation. Bisulfite treatment of the sample converts unmethylated cytosines into uracil, which is read as thymine after PCR amplification, whereas methylated cytosines remain unchanged (Xi and Li 2009).
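The logic of bisulfite sequencing can be sketched in a few lines: bisulfite treatment converts unmethylated cytosine to uracil, which is read as thymine after PCR, so a cytosine that survives in the read marks a methylated position. The example below simulates this on the forward strand only and then recovers the methylation calls; the sequence, the methylated position, and the function names are hypothetical, and real aligners such as the one described by Xi and Li handle both strands, mismatches, and sequencing errors.

```python
def bisulfite_convert(sequence, methylated_positions):
    """Simulate bisulfite treatment plus PCR on the forward strand.

    Unmethylated C is read as T; methylated C (0-based positions given in
    `methylated_positions`) is protected and stays C.
    """
    converted = []
    for index, base in enumerate(sequence.upper()):
        if base == "C" and index not in methylated_positions:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


def call_methylation(reference, converted_read):
    """Return {position: is_methylated} for every cytosine in the reference."""
    calls = {}
    pairs = zip(reference.upper(), converted_read.upper())
    for index, (ref_base, read_base) in enumerate(pairs):
        if ref_base == "C":
            if read_base == "C":
                calls[index] = True    # protected from conversion
            elif read_base == "T":
                calls[index] = False   # converted, hence unmethylated
    return calls


# Hypothetical reference with cytosines at positions 1, 4, and 5.
reference = "ACGTCCGA"
read = bisulfite_convert(reference, methylated_positions={1})
calls = call_methylation(reference, read)
```

Comparing the converted read with the untreated reference is what turns an ordinary sequencing run into a base-resolution methylation map.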

Therefore, whatever the genetic sequence, the final phenotypic expression depends on how the mutations are translated, hence the term epimutation. Epimutations are heritable and are associated with the repression of genetic activity in somatic cells and, in some cases, germ cells.

The Human Epigenome Project (HEP) has evolved and expanded to add data to the ENCODE database (Encyclopedia of DNA Elements) and The Cancer Genome Atlas (TCGA) with 212 cell culture lines. Covalent modifications of DNA or its histones (chromatin) play a central role in epigenetic inheritance. This section examines epigenetic markers in the field of oncology as follows:

2.4.5.1 DNA Methylation: Aberrations

Both hyper- and hypomethylation are implicated in cancer; promoter hypermethylation in particular can silence important tumor suppressor genes. Since the first discovery of such aberrations in 1983, there has been immense progress in developing in vitro diagnostic (IVD) assays for cancer screening and progression. DNA methylation is important in reprogramming the predetermined genetic makeup. After fertilization, the original methylation is lost from the paternal genome and partly from the maternal genome, erasing the epigenetic memory of the parents; later re-methylation introduces a phenotype that is specific and tailored to the new individual (Bradbury 2003). Methylation occurs mainly in promoter regions and CpG-rich regions, where cytosine residues are converted to 5-methylcytosine. These methyl groups silence the non-coding promoter sites and attract methyl-CpG-binding domain (MBD) proteins.

2.4.5.2 Histone Posttranslational Modifications

Histones are made up of amino acids; when these amino acids are modified, the shape of the histone changes, and a new lineage-specific transcription pattern is maintained after cell division. Histone methylation and acetylation lead to euchromatin, whereas phosphorylation and deacetylation lead to heterochromatin, which is condensed and inactive. Global histone acetylation modifications are potential markers of tumor recurrence and carry a better prognosis than global methylation.

On this basis, patients can be classified into two subtypes. More specific and more dangerous modifications, such as Lys16 and Lys20 hypomethylation, are considered characteristic of human tumor cells (Shain and Pollack 2013); breast cancers with these modifications, for example, have a worse prognosis (Elsheikh et al. 2009). The presence of histone isoforms also increases the tendency towards cancer, as in the overexpression of H2A.Z in prostate and bladder tumors (Monteiro et al. 2014). Increased levels of circulating histones, released through cancer cell death or active secretion, indicate tumor progression and also serve as a non-invasive biomarker for predicting tumor response to chemotherapy. Upregulation of citrullinated histone H3 (H3Cit) has been documented as a predictor of short-term mortality (Thålin et al. 2018).

2.4.5.3 Chromatin Spatial Modifications

One of the chromatin remodeling complexes, Switch/Sucrose Non-Fermentable (SWI/SNF), is mutated in a wide range of cancers, from ovarian and gastric to pancreatic (Shain and Pollack 2013).

2.4.5.4 MicroRNAs

MicroRNAs (miRNAs) are non-coding RNAs that regulate various biological functions. Each miRNA targets approximately 200 messenger RNAs (mRNAs) and inhibits their translation. miRNAs are themselves regulated by either CpG islands or histone modifications. They can serve as biomarkers in both tumor tissue and body fluids such as blood, CSF, urine, and saliva. The study of circulating miRNAs in liquid biopsy samples therefore offers an encouraging biomarker platform for non-invasive diagnosis in many human cancers. The detailed role of miRNAs as prognostic, predictive, and diagnostic factors is given in Table 2.4.

Table 2.4 The predictive, prognostic, and diagnostic role of different epigenetic markers in the field of oncology

2.4.6 Microbiomics Biomarkers

Omics technologies are promising contributors towards the discovery of biomarkers. The path towards the development of personalized medicines is paved by the discovery of relevant biomarkers under the umbrella of omics technologies (Quezada et al. 2017).

The microbial communities that reside on and inside the human body consist of bacteria, viruses, fungi, and archaea. They are termed the “microbiota” or “microflora”, and their encoded genes are called the “microbiome” (Schwabe and Jobin 2013). Maintaining homeostasis and shielding against pathogens are prominent roles of the microbiome (Shreiner et al. 2015).

In 2007, the Human Microbiome Project (HMP) brought the importance of the microbiome into the limelight through bioinformatics approaches. Its major aim was to manipulate the components of the microbiome to trigger immune responses against deadly diseases (Clemente et al. 2012).

However, disturbances or alterations in the microbiome are directly linked to the triggering of different cancers. Even a single alteration in the microbiota can have drastic consequences (Bultman 2014). A continuously evolving microbiome has been recognized as playing a crucial role in carcinogenesis at the molecular level. One of the penalties of coexisting with these bacteria, fungi, and viruses is their potential silent hazardous effect on human health. Elaborating the taxonomy of these microbes and understanding their basic mechanisms can thus shed light on the role they play not only in disease development but also, when these effects are reversed, as therapeutic agents and diagnostic tools (Singh et al. 2015).

The differing composition of the microbiota across human organs reflects the variability of inflammatory responses and carcinogenesis in different parts of the body. Additionally, interpersonal differences in microbiome composition at various locations within the same organ can also lead towards cancer (Huttenhower et al. 2012).

Susceptibility to cancer also varies with the presence or proportion of the microbiome in different organs. The higher microbial density in the large intestine indicates a higher risk of cancer compared with the small intestine (Breitbart et al. 2008; O’Hara and Shanahan 2006).

The variety of microbes and their metabolites present in body fluids, e.g., blood, saliva, urine, and cervicovaginal discharge, is a promising factor in establishing the microbiome as a novel, non-invasive source of cancer biomarkers (Farrell et al. 2012). For example, in non-small cell lung cancer, a higher level of the metabolite hippuric acid was found in responders to PD-1 blockade therapy than in non-responders. Hippuric acid can therefore act as a “combinatorial biomarker” for selecting patients for cancer immunotherapy, while other patients are directed towards different therapies (Hatae et al. 2020).

The advent of next-generation sequencing has made it possible to further explore the inter-relationship of the disease, host, and microbe triad, especially in the gut microbiome, elaborating its role in cancer via direct or immunological mechanisms. Any imbalance of these factors, or dysbiosis, is linked with a plethora of diseases, including cancers, so these microbiomes may in future be used as markers for cancer diagnosis. This has led to a rapid expansion of microbiomics, the study of the DNA of microbes (Feng et al. 2020).

Though many studies have identified these pathogens in different cancers, it is still not clear whether they are a cause or an effect of these cancers. Do they proliferate under the influence of the tumor cells, or do they lead to the growth and progression of these cancers? In either case, identifying them and using them as markers may help track the prognosis of disease or even open possible routes for targeted therapies.

There are, however, many challenges because of the complexity of the technologies involved and because this is a very young field. In the case of the gut microbiota, for example, these include whether the sample comes from stool or biopsy, correctly defining the genes, and understanding the source of microbial genes (Cong and Zhang 2018). To overcome insufficient biomass, contamination, and variability between kits, repetition is the best way to validate and substantiate findings across labs and microbiomes.

The most studied microbiome is the gut microbiome. In some cases, treatment with simple antibiotics has been shown to reverse tumors such as Helicobacter pylori-induced gastric mucosa-associated lymphoid tissue (MALT) lymphoma, using lansoprazole 30 mg, amoxicillin 1 g, and clarithromycin 500 mg (PREVPAC) (Stolte et al. 2002). Certain bacterial species can have a pro-tumoral effect by creating enzymatically active protein toxins, directly inducing host cell DNA damage, or interfering with critical host cell signaling pathways of cell proliferation, apoptosis, and inflammation (Fiorentini et al. 2020).

The mechanisms that carcinogenic microbes employ are shown in Table 2.5 (Goodman and Gardner 2018).

Table 2.5 Description of carcinogenic microbes’ mechanisms

2.4.7 Cancer Imaging Technologies

Imaging technologies are used commonly to detect and categorize cancer. Imaging is performed widely to stage cancer, to monitor cancer therapy, to detect disease recurrence, or for surveillance purposes (Dregely et al. 2018).

In oncology, commonly used imaging biomarkers (IBs) include clinical TNM (tumor, node, metastasis) stage, objective response, and left ventricular ejection fraction. Besides these, MRI, CT, PET, and ultrasonography biomarkers are used extensively in cancer research and drug development (O’Connor et al. 2017). In the diagnosis, staging, and treatment of cancers, the imaging modalities range from radiological X-rays, computed tomography (CT), and magnetic resonance imaging (MRI) to ultrasound (US), radioactive single-photon emission computed tomography (SPECT), positron emission tomography (PET), and optical imaging. Despite advances in other aspects of diagnostic radiology, cancer imaging will remain poor unless the tumor-to-background ratio improves 2- to 4-fold through more efficient sensitivity and contrast agent targeting (Frangioni 2008).

For several cancers, MRI is now the main imaging evaluation method and plays a key role in management decisions. It is the initial imaging tool for prostate cancer and myeloma diagnosis; for rectal, cervical, and endometrial cancer staging; and for hepatocellular cancer response evaluation. A variety of MRI biomarkers are already identified or are well on their way to being established for oncology evaluation in clinical practice. These MRI biomarkers include BI-RADS (Breast Imaging Reporting and Data System), PI-RADS (Prostate Imaging Reporting and Data System), and LI-RADS (Liver Imaging Reporting and Data System), to diagnose breast, prostate, and hepatocellular cancers, respectively (Dregely et al. 2018).

Positron emission tomography (PET) scans are used to detect cancer and to examine the effects of cancer therapy. They identify localized biochemical changes at the site of cancer. PET scans show only the location of a molecular marker; they do not provide anatomical information. PET is a diagnostic test that acquires physiological images based on positron detection. Positrons are tiny particles emitted by a radioactive substance administered to the patient (Scaros and Fisler 2005).

The patient receives an injection of a radioactive tracer that contains a type of sugar attached to a radioactive isotope. When cancer cells take up the sugar and the attached isotope, positively charged particles known as positrons are emitted. The positrons react with electrons in the cancer cells, producing gamma rays. These gamma rays are then detected by the PET machine, which transforms this information into an image.

For example, 18F-FDG is a tracer commonly used to detect cancer in clinical oncology. FDG-PET is very useful in the diagnosis, staging, and therapy monitoring of cancers, particularly Hodgkin lymphoma (Zaucha et al. 2019).

Newer and improved versions of these modalities, including PET radiotracers using gallium-68 and hyperpolarized MRI using carbon-13 pyruvate, will be needed to increase sensitivity in the diagnosis of cancers. Specificity may be increased by using cancer-specific targeting ligands such as immunoglobulins (Fass 2008).

Image-guided chemotherapy by MRI and optical tomography using radioisotopes for neoadjuvant therapies are now changing our approach to cancer treatment (Table 2.6).

Table 2.6 New imaging technologies for different cancer types and their advantages and disadvantages

2.5 Emerging Technologies

2.5.1 Circulatory Cancer Biomarkers

2.5.1.1 Circulating Tumor Cells

Circulating tumor cells (CTCs) can be found in the peripheral blood of patients with metastatic cancer. With the recent advent of technologies sensitive enough to detect very rare cells, research on CTC detection has increased considerably. These tools have enabled research into the clinical implications of CTCs and have revealed that CTC levels in patients’ blood correlate with prognostic outcomes and are a clinically significant prognostic biomarker in patients with metastatic prostate, colon, and breast cancers. Several studies have shown that CTC tracking can be used to assess patient responses to therapy and to track genetic and phenotypic tumor changes in real time (Preedy and Patel 2015).

By analogy with traditional tumor tissue biopsy, the term “liquid biopsy” was introduced for measuring the concentration of CTCs in blood (Alix-Panabières and Pantel 2013). Compared with tissue biopsy, liquid biopsy offers numerous advantages: the sample is simple and quick to draw, the procedure is cheaper and less painful, and the risk to patients is low because it is minimally invasive. It offers the prospect not only of a better understanding of underlying biological mechanisms such as cell dissemination and metastasis, but also of using these circulating cells as biomarkers to detect, analyze, and treat cancer more efficiently and successfully. Nevertheless, because CTC levels in blood are exceptionally low and cancer-specific biomarkers are mostly lacking, CTC detection remains a major challenge, which limits its significance in cancer diagnosis.

In 1998, Racila and colleagues described a major scientific breakthrough in identifying the extremely rare circulating tumor cells (CTCs) (Racila et al. 1998). They used antibodies against the epithelial cell adhesion molecule (EpCAM) coupled to ferrofluids for immunomagnetic CTC enrichment, combined with flow cytometry. This method formed the basis of the CellSearch® (CS) system, which is now widely used and is the only CTC detection method approved by the US FDA (Marcuello et al. 2019).

Several in vitro approaches have been reported for detecting CTCs in the peripheral blood of cancer patients. However, the in vitro techniques currently in use have limitations such as low yield and sensitivity. An innovative in vivo CTC isolation product, the GILUPI CellCollector®, can isolate CTCs directly from the circulating blood. It is intended to increase the yield of captured CTCs and has received a Conformité Européenne (CE) mark for application in solid cancers, as well as approval by the China Food and Drug Administration for breast cancer. This new strategy has been found to have high capture rates in advanced-stage lung cancer and can even detect CTCs in patients with ground glass nodules (He et al. 2020).

2.5.1.2 Circulatory DNA/RNA

Circulatory fluids such as blood carry small quantities of circulating tumor DNA/RNA (ctDNA/ctRNA) released from primary and metastatic tumor cells, along with cell-free DNA (cfDNA) from non-malignant cells, primarily hematopoietic cells. Compared with a single tissue biopsy, ctDNA can provide a more detailed description of the range of mutations that may be present in a patient’s tumor. ctDNA also offers the potential for minimally invasive monitoring of the disease course and evaluation of residual disease following surgery (Marcuello et al. 2019).

2.5.1.3 miRNA

MicroRNAs (miRNAs or miR-) are endogenous single-stranded non-coding RNAs that can post-transcriptionally control the expression of hundreds of target genes. They negatively regulate gene expression through two main mechanisms: first, by binding to the 3′-untranslated regions (3′-UTRs) of target mRNAs, thereby inhibiting translation; and second, by binding with effective complementarity to messenger RNA sequences, resulting in their degradation (Luo et al. 2013; Yang et al. 2015). On the other hand, there are also some data indicating that miRNAs can trigger translation of target mRNAs (Vasudevan et al. 2007).

The initial association between human cancer and miRNA was revealed in 2002 (Calin et al. 2002). MiRNAs can be present in the circulation alone or in combination with proteins. In addition, they can be released directly into extracellular fluids and can also be carried by microvesicles (O’Brien et al. 2018). In 2008, Chim et al. found placental miRNAs in maternal plasma, making theirs the first major study of miRNAs in biological fluids (Chim et al. 2008). Many subsequent studies have characterized miRNAs in fluids as biomarkers.

MiRNAs possess many distinctive features that make them well suited as non-invasive cancer biomarkers. Cancer-specific miRNAs are highly stable and resistant to storage, their sequences are conserved across species, they can be identified by cutting-edge technologies in small sample volumes with high specificity and reproducibility, and they are found in many biological fluids (e.g., blood, breast milk, amniotic fluid, saliva, feces, tears, urine), which makes their detection easy and minimally invasive (Mitchell et al. 2008).

2.5.1.4 Exosomes

Cells release exosomes under both normal and pathological conditions. These exosomes carry nucleic acids and proteins that indicate pathophysiological conditions and can therefore be used as biomarkers in clinical diagnostics. Tumor cells release exosomes containing tumor-specific RNAs that can serve as potential biomarkers for cancer diagnosis. Exosomes contain several proteins, including common membrane and cytosolic proteins, as well as origin-specific protein subsets that reflect cell functions and conditions (Roldán Herrero 2021).

For example, exosomes are highly enriched with tetraspanins, a family of scaffolding membrane proteins. The exosomal marker CD63 is also a member of the tetraspanin family. In 2009, Logozzi and colleagues revealed that plasma CD63+ exosomes were significantly higher in patients with melanoma relative to healthy controls (Logozzi et al. 2009). All of these circulatory cancer biomarkers and their promising role in cancer research are depicted in Fig. 2.2.

Fig. 2.2

Depiction of circulatory cancer biomarkers in liquid biopsies and their wide range of applications in understanding the genomic and proteomic instabilities of cancer. The biological fluids of cancer patients contain a large number of circulatory biomarkers, including exosomes, circulating tumor cells from the primary or metastatic site, blood cells, different types of microRNAs, circulating proteins such as cell surface receptors, enzymes, and signaling molecules, and cell-free circulating DNA and RNA released from the tumor site. These circulatory cancer biomarkers offer numerous diagnostic, prognostic, and therapeutic applications through analysis of cancer cells’ chromosomal abnormalities, single-cell analysis, RNA expression profiles, types and levels of miRNAs, protein expression and phosphorylation, in vitro and in vivo culture assays, gene amplifications, insertions and deletions, segment translocations, and other types of genetic mutations

2.5.2 Drug Repurposing

Repurposing, or repositioning, involves drugs whose mechanism of action is completely or partially understood. Clinical repositioning studies can take advantage of this information and provide predictive biomarkers from early-phase development or trials. These biomarkers are frequently identified among molecules known to be involved in sensitivity or resistance to the test compound. In early drug testing, the use of predictive biomarkers may increase the treatment efficacy of the test agent by directing it to the biomarker-favorable population. In the same way, drug-induced cytotoxicity in the biomarker-unfavorable population can be avoided, because these trial participants will not be exposed to the test agent (Stenvang et al. 2013).
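The enrichment argument above can be made concrete with a small arithmetic sketch. The cohort below is entirely hypothetical (invented patients and outcomes, not data from Stenvang et al. 2013), and the helper name `response_rate` is an assumption made for this illustration. It only shows how restricting a trial to biomarker-favorable patients can raise the observed response rate while sparing biomarker-unfavorable patients from exposure.

```python
def response_rate(patients):
    """Fraction of patients in the list who responded to the test agent."""
    if not patients:
        raise ValueError("patient list is empty")
    return sum(1 for p in patients if p["responded"]) / len(patients)


# Hypothetical cohort, for illustration only: 4 biomarker-favorable
# patients (3 responders) and 6 biomarker-unfavorable patients (1 responder).
cohort = (
    [{"biomarker_favorable": True, "responded": True}] * 3
    + [{"biomarker_favorable": True, "responded": False}] * 1
    + [{"biomarker_favorable": False, "responded": True}] * 1
    + [{"biomarker_favorable": False, "responded": False}] * 5
)

favorable = [p for p in cohort if p["biomarker_favorable"]]
unfavorable = [p for p in cohort if not p["biomarker_favorable"]]

overall_rate = response_rate(cohort)       # unselected trial population
enriched_rate = response_rate(favorable)   # biomarker-selected population
spared = len(unfavorable)                  # patients not exposed to the agent

print(f"Unselected response rate: {overall_rate:.2f}")
print(f"Biomarker-selected response rate: {enriched_rate:.2f}")
print(f"Patients spared exposure: {spared}")
```

In this invented cohort the observed response rate rises from 0.40 in the unselected population to 0.75 in the biomarker-selected one, and six patients unlikely to benefit are not exposed to the agent.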