Introduction

Cancers of an identical primary site can be heterogeneous in molecular pathogenesis, clinical course, and treatment responsiveness, which reflects the existence of multiple cancer subtypes [1]. The differentiation of these subtypes is often based on biomarkers that distinguish important cancer features such as the aggressiveness of the disease (prognostic biomarkers) or the response to treatment (predictive biomarkers). The latter have fueled an increasing interest in biomarkers, given the potential they hold for individualized or personalized medicine. This new field focuses on differences between people and the potential for these differences to influence medical outcomes. With individualized or “precision” medicine, a person’s cancer may be subtyped based on an explicit biomarker that is present or absent or that may have increased or decreased expression levels. This may result in a greater likelihood of receiving treatment that is appropriate and effective for a specific tumor in a particular cancer patient. Individualized medicine contrasts markedly with the traditional “empiric method,” which uses a standardized treatment for the whole patient population with an established presentation of disease symptoms, based on long-standing generic descriptions of the average patient (Fig. 13.1).

Fig. 13.1 Empiric treatment versus patient-oriented treatment. Individualized medicine is contrasted with the traditional “empiric method,” which uses a standardized treatment for all patients with a certain disease

Nowadays, tumor biomarkers, together with new genomic and proteomic technologies, provide powerful tools for the early identification of cancer patients and recurrent disease and for defining therapeutic responsiveness. In spite of the rapid developments in biotechnology and genomics, the pace of acceptance of new markers in clinical practice is surprisingly slow. The reasons for this slow uptake are substantial and are presented below and elsewhere [1,2,3]. In this chapter we (1) summarize the importance of personalized medicine and describe some of the biomarkers and genetic tests that are currently used in pathology practice, (2) describe the translational research cycle and draw attention to some of the challenges faced in delivering practice-changing discoveries, (3) discuss the impact of genomic biomarkers on the design of new clinical trials, and (4) briefly review the guidelines and recommendations for moving successful biomarkers into clinical practice.

Cancer-Associated Biomarker Categories

Personalized, i.e., patient-oriented, research refers to a continuum from initial studies in humans to comparative effectiveness and outcome research and the integration of this research into the health-care system and clinical practice. The goal of patient-oriented research is to optimize the translation of innovative diagnostic and therapeutic approaches to the point of care, as well as to help researchers meet the challenge of contributing to high-quality, cost-effective health care [4]. It involves ensuring that the right patient receives the right clinical intervention at the right time, ultimately leading to better health outcomes [5, 6]. In order to make patient-oriented care effective, there is a great need to discover more promising, reliable cancer-specific biomarkers and translate them successfully into clinical use.

In general, biomarkers are biological measurements that are used to aid clinical practice. The National Cancer Institute defines a biomarker as a “biological molecule found in blood, other body fluids, or tissues that is a sign of a normal or abnormal process, or of a condition or disease” [7]. A biomarker may be used to see how well the body responds to a treatment for a disease or condition [8]. The Biomarkers Consortium (managed by the Foundation for the National Institutes of Health) states that “biomarkers are characteristics that are objectively measured and evaluated as indicators of normal biological processes, pathogenic processes, or pharmacologic responses to therapeutic intervention” [9].

There are several categories of cancer biomarker measurements, which can be assayed either once at baseline (diagnostic, prognostic, and predictive) or repeatedly (disease screening, disease monitoring, and molecular imaging) during the course of the disease. A marker may belong to one or to several of these categories.

A diagnostic biomarker is an indicator measurement that will aid in the detection of malignant disease in an individual. PSA (prostate-specific antigen) is the best-known cancer biomarker for early detection of prostate cancer. Serum PSA has been widely used for almost 25 years in screening for prostate cancer and has brought about a dramatic increase in early detection of the disease. Unfortunately, the low specificity of elevated serum PSA as a cancer biomarker results in a significant number of men who do not actually have prostate cancer undergoing unnecessary needle core biopsies [10, 11]. To address these concerns, the US Preventive Services Task Force (USPSTF) reconsidered the potential harms and relative benefits of using PSA as a screening biomarker. It found insufficient evidence to recommend routine use of PSA as a screening test at any age (see section “The Biomarker Development Process”). The PCA3 (prostate cancer antigen 3) RNA biomarker test has been introduced as a simple additional urine assay to address the significant diagnostic dilemma in new cases of prostate cancer [12, 13]. The specificity of this test in prostate cancer is 74% compared to only 21–51% (depending on grade) for serum PSA, which increases the potential of this type of assay for predicting the likelihood of a positive needle core biopsy [14,15,16]. Its sensitivity is lower, however: using a cutoff of 4.0 ng/mL, the PSA blood test has a sensitivity of 67.5–80% compared to 52% for the PCA3 urine test. Since the USPSTF recommended against its routine use as a general screening biomarker, except in high-risk patients with a family history, PSA has been used as both a diagnostic and a prognostic test. Nowadays, PSA is more appropriately used as part of the diagnostic work-up of a new patient rather than as a primary screening test, though it can still be used for both purposes.

Screening biomarkers are an important subclass of biomarkers that must have high sensitivity and a good negative predictive value (specificity is less critical) in a clinical setting. These biomarkers are designed to robustly differentiate patients with disease from those without disease. A perfect screening biomarker would have 100% sensitivity and 100% specificity, but at present none of the available biomarkers achieves these ideal performance standards. Another good example of a currently used screening biomarker is the widespread testing for HPV (human papillomavirus) DNA as part of cervical cancer screening programs. The HPV molecular test is more sensitive, and has a higher negative predictive value, than either conventional cytology (Pap smear) or liquid-based cytology methods. An example is the cobas® HPV (Roche Molecular Systems, Inc.) DNA test, which has been used as an adjunct to conventional screening methods in the USA and in some European countries since 2011. In the ATHENA screening trial, this test was able to quantify the risk of precancer and cervical cancer in HPV 16+ and/or HPV 18+ women who had either atypical squamous cells of undetermined significance (ASC-US) or normal cytology [17]. In 2014, the FDA announced approval of the HPV DNA test as a primary screening method for cervical cancer for women over the age of 24 [18].
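The performance measures used in this section (sensitivity, specificity, and predictive values) can be made concrete with a short calculation. The following Python sketch is illustrative only: the function names are hypothetical, and the 5% prevalence is an assumed value rather than a figure from the cited studies. The sensitivity and specificity inputs are the PCA3 values quoted above (52% and 74%).

```python
def confusion_metrics(tp, fp, tn, fn):
    """Summarize a 2x2 table of test result versus true disease status."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased patients correctly flagged
        "specificity": tn / (tn + fp),  # disease-free patients correctly cleared
        "ppv": tp / (tp + fp),          # P(disease | positive test)
        "npv": tn / (tn + fn),          # P(no disease | negative test)
    }


def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) at a given disease prevalence via Bayes' theorem."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# PCA3 figures from the text; the prevalence is an ASSUMED illustrative value.
ppv, npv = predictive_values(sensitivity=0.52, specificity=0.74, prevalence=0.05)
print(f"PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

Under these assumptions the negative predictive value is high while the positive predictive value is low. This shows why predictive values depend on how common the disease is in the tested population, not on the assay alone.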

Prognostic biomarkers are often defined as measurements made at diagnosis that provide information about patient prognosis. Prognostic biomarkers may predict disease recurrence (disease-free survival) and/or cancer-related death (cancer-specific survival) or overall survival for an individual patient in the absence of treatment or in the presence of standard primary treatment. Thus, prognostic markers typically give information about patient outcomes and tumor aggressiveness. For example, estrogen receptor (ER)-positive breast cancer patients have longer survival in the absence of systemic therapy than patients who are ER negative [19]. CA125, which is present in a subset of ovarian cancers, is not used for detection of early cancers because the serum levels are elevated in only 50% of patients with stage I disease [20, 21]. This biomarker is usually used to evaluate response to chemotherapy, relapse, and disease progression in ovarian cancer patients. Gupta and Lis performed a comprehensive evaluation of the existing literature on the prognostic role of CA125 and suggested that postoperative levels of serum CA125 are also a strong prognostic factor for estimating overall survival and progression-free survival in ovarian cancer [22].

Disease-monitoring biomarkers are assays that are performed repeatedly over time. A change in disease status during treatment will be reflected by a concomitant change in the biomarker status. Examples of biomarkers used for such monitoring are PSA in prostate cancer, CA125 in ovarian cancer, CEA in colorectal cancer, CA19-9 in pancreatic cancer, and CA15-3 or CA27.29 in breast cancer.

Predictive biomarkers are used to predict response or resistance to a specific cancer therapy, i.e., they are used to identify the patients who are likely or unlikely to benefit from a specific treatment. For example, in addition to its role as a prognosticator, tumor ER positivity is considered to be a predictive biomarker in breast cancer because such patients are far more likely to benefit from antiestrogen therapy such as tamoxifen. On the other hand, ER negativity is a predictive biomarker for benefit from conventional cytotoxic chemotherapy. Human epidermal growth factor receptor 2 (Her2/neu) amplification is a predictive marker for benefit from trastuzumab (Herceptin®), doxorubicin, and taxanes [23, 24]. In some situations, predictive biomarkers can be used to identify patients who may not benefit from a particular drug. For example, advanced colorectal cancer patients whose tumors have KRAS mutations are typically poor candidates for treatment with epidermal growth factor receptor (EGFR) antibodies [25, 26].

Cancer Genomics: From Research to Pathology Practice

The successful completion of the Human Genome Project stimulated a shift in emphasis from studying genes and proteins as individual biomarkers toward understanding their interactions in pathways of therapeutic importance. Thus, genomics, proteomics, transcriptomics, and metabolomics are now providing excellent opportunities for researchers to learn more about complex diseases like cancer by studying the overall response of cells to a mutation or to changes in the disease microenvironment. It is important to note that the technologies used for biomarker discovery are often not exactly the same technologies that will be routinely used in a clinical laboratory. However, it is clear that discoveries made using genomic and proteomic technologies, coupled with advances derived from applied bioinformatics, are showing great promise for simpler and more cost-effective analysis of clinical samples.

Genomic Technologies Used for Biomarker Discovery

Gene Expression Arrays

Gene expression analysis was one of the first high-throughput molecular profiling technologies to be widely adopted for biomarker discovery. Microarrays enable simultaneous analysis of tens of thousands of genes and thus the rapid identification of new potential biomarkers. Gene expression analysis measures the levels of messenger RNA (mRNA) in a tissue or bodily fluid at a given point in time, and it may provide information about the current status of a disease or the likelihood of future disease. RNA levels are dynamic and change as a result of pathology or environmental signals [27]. Certain patterns of gene activity may be used to diagnose a disease or to predict how an individual will respond to treatment over time. Methods used for gene expression analysis are diverse, ranging from real-time reverse transcription polymerase chain reaction (RT-PCR) to microarray screening technologies, which have been widely used in research and are now beginning to be applied in clinical settings.
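As a rough illustration of how expression measurements might be screened for candidate biomarkers, the Python sketch below ranks genes by the log2 fold change between tumor and normal samples. The gene names, expression values, pseudocount, and cutoff are all hypothetical. A real analysis would also require normalization, replicate-aware statistics, and multiple-testing correction.

```python
import math
from statistics import mean


def log2_fold_change(tumor, normal, pseudocount=1.0):
    """log2 ratio of mean tumor to mean normal expression.

    The pseudocount (an assumed value) avoids division by zero.
    """
    return math.log2((mean(tumor) + pseudocount) / (mean(normal) + pseudocount))


def rank_candidates(expression, min_abs_lfc=1.0):
    """Return (gene, log2FC) pairs whose |log2FC| meets the cutoff, largest first."""
    scored = [
        (gene, log2_fold_change(values["tumor"], values["normal"]))
        for gene, values in expression.items()
    ]
    kept = [(gene, lfc) for gene, lfc in scored if abs(lfc) >= min_abs_lfc]
    return sorted(kept, key=lambda item: abs(item[1]), reverse=True)


# Hypothetical expression values for three made-up genes.
expression = {
    "GENE_A": {"tumor": [15, 15, 15], "normal": [3, 3, 3]},
    "GENE_B": {"tumor": [10, 10], "normal": [10, 10]},
    "GENE_C": {"tumor": [0, 0], "normal": [15, 15]},
}
for gene, lfc in rank_candidates(expression):
    print(gene, round(lfc, 2))
```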

The most significant genomic biomarkers that have emerged in recent years include BCR-ABL1 for CML (chronic myeloid leukemia) diagnosis and monitoring of treatment responses [28], Her2/neu for diagnosis and prognosis of the breast cancer subtype which benefits from monoclonal antibody (trastuzumab [Herceptin®]) treatment [29], and detection of EGFR (epidermal growth factor receptor) and KRAS mutations for predictive purposes in lung [30] and metastatic colon cancer [31]. Discoveries from molecular profiling of RNA and DNA continue to generate many new candidate biomarkers that have potential similar to these successful genomic biomarkers.

The use of DNA expression microarrays has provided one of the most powerful tools to discover subsets of clinically important genes in human cancer [32]. Such expression arrays have been used to obtain major insights into progression, prognosis, and response to therapy on the basis of gene expression profiles (see the section on gene expression tests, below). Typically microarrays have been used to discover subsets of genes whose expression levels can be used to provide a distinct molecular subclassification of disease state. Once such a distinguishing genetic signature with likely clinical relevance has been discovered, custom-made arrays or other molecular biology methods are used to develop preclinical or clinical testing.

Genome-Wide Association Studies (GWAS)

GWAS is a comprehensive approach that identifies and correlates single-nucleotide polymorphisms (SNPs) to complex diseases such as cancers and is predominantly carried out with SNP microarrays specifically designed to interrogate millions of different polymorphisms in the human genome. GWAS is also very helpful as a biomarker discovery tool [32]. Results obtained from GWAS are typically cross-referenced with data from the HapMap Project or the 1000 Genomes Project in a process called imputation that aims to substitute values for missing data [33]. The advantage of GWAS is that it is unbiased and less likely to miss important genes or pathways than methods that use selected genes. Analysis of the large complex datasets generated by GWAS poses several challenges: (1) it requires large sample numbers and advanced bioinformatics to determine statistical significance; (2) there often remains a high likelihood of false-positive associations; and (3) with such marked biostatistical complexity, small differences may be missed due to stringent biostatistical corrections. With the introduction of high-throughput next-generation sequencing (NGS) into clinical medicine, diagnostic genomics is becoming an integral part of advanced molecular oncology. In 2015 the USA launched the Precision Medicine Initiative, a multimillion dollar program that includes a longitudinal cohort study of a million participants, to understand the hurdles and pitfalls of NGS-based applications and to accelerate the progress of personalized medicine [34, 35].
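The tension between false-positive control and missed small effects can be sketched numerically. The Python example below applies a Pearson chi-square test to a 2×2 table of allele counts for a single SNP and compares the p-value with a Bonferroni-corrected threshold. The allele counts and the number of tested SNPs are hypothetical, and this simple allelic test is only one of several association tests used in practice.

```python
import math


def allelic_chi_square(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 allele-count table.

    Returns (chi2, p_value). For 1 df, the upper-tail probability of the
    chi-square distribution equals erfc(sqrt(chi2 / 2)).
    """
    n = case_alt + case_ref + ctrl_alt + ctrl_ref
    margins = (
        (case_alt + case_ref)
        * (ctrl_alt + ctrl_ref)
        * (case_alt + ctrl_alt)
        * (case_ref + ctrl_ref)
    )
    if margins == 0:
        raise ValueError("every row and column of the table must be non-empty")
    chi2 = n * (case_alt * ctrl_ref - case_ref * ctrl_alt) ** 2 / margins
    return chi2, math.erfc(math.sqrt(chi2 / 2))


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


# Hypothetical allele counts for one SNP; one million SNPs tested (assumed).
chi2, p = allelic_chi_square(case_alt=60, case_ref=40, ctrl_alt=40, ctrl_ref=60)
threshold = bonferroni_threshold(alpha=0.05, n_tests=1_000_000)
print(f"chi2 = {chi2:.2f}, p = {p:.4g}, significant = {p < threshold}")
```

In this made-up example the SNP would pass a conventional 0.05 cutoff but not the corrected genome-wide threshold. This illustrates how a modest association can be missed in a small sample once stringent corrections are applied.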

Next-Generation Sequencing (NGS)

The comprehensive screening power of NGS promises to help mine the remaining “unannotated regions” of the genome for novel sequence-based biomarkers that are below the resolution levels for detection by conventional microarray analysis [36]. In NGS all sequence information from a patient sample is aligned to a full-length reference genome to match all sequencing reads to their exact genomic locations [37]. Counting the number of sequencing reads that align to a given genomic location is analogous to measuring the microarray intensity for a probe with a specific sequence, and this metric can provide an estimate of relative expression levels. With slight modifications to the NGS experimental design, DNA copy number, expression levels, and differential methylation can be determined. Sequencing technologies can further identify variation between samples: reads that do not perfectly match the reference genome at a given genomic location may indicate individual genetic variation such as SNPs, loss of heterozygosity (LOH), and copy number variation (CNV) [38, 39].
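The read-counting idea described above can be sketched in a few lines of Python. The example counts aligned reads that fall within named genomic regions and converts a count to RPKM (reads per kilobase per million mapped reads), one common length- and depth-normalized measure. The coordinates, region names, and read positions are hypothetical, and production pipelines use dedicated alignment and counting tools rather than a loop like this.

```python
def count_reads(alignments, regions):
    """Count reads whose start position falls inside each region.

    alignments: iterable of (chromosome, position) tuples.
    regions: dict mapping name -> (chromosome, start, end), end exclusive.
    """
    counts = {name: 0 for name in regions}
    for chrom, pos in alignments:
        for name, (r_chrom, start, end) in regions.items():
            if chrom == r_chrom and start <= pos < end:
                counts[name] += 1
    return counts


def rpkm(count, region_length_bp, total_mapped_reads):
    """Reads per kilobase of region per million mapped reads."""
    return count * 1e9 / (region_length_bp * total_mapped_reads)


# Hypothetical aligned read positions and regions of interest.
alignments = [("chr1", 100), ("chr1", 150), ("chr1", 900), ("chr2", 100)]
regions = {"geneX": ("chr1", 50, 200), "geneY": ("chr1", 800, 1000)}
print(count_reads(alignments, regions))
```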

While sequencing costs continue to decrease over time, costs associated with downstream data analysis are expected to grow by ∼50% between 2010 and 2020 [40]. There are two types of NGS technology: (a) targeted sequencing of genes or so-called gene panel sequencing and (b) whole-exome (WES) or whole-genome sequencing (WGS) both for clinical management and for discovery of new disease-associated genes.

Gene panel sequencing can detect base-pair substitutions (gene mutations, SNPs), short insertions and deletions (indels), duplications or deletions of large chromosomal regions, and gene copy number changes. The advantage of targeted NGS is that the method works well with the relatively low amounts of DNA present in FFPE samples and provides high depth of coverage (up to 1000×), which makes it ideal for use in clinical laboratories. Such NGS panels have been designed for diagnostic, prognostic, and predictive purposes to detect and monitor regions of interest and specific gene sets. Although gene panel sequencing can detect CNVs, the method is not sufficiently sensitive for detection of low copy number changes or for evaluation of complex gene rearrangements [41, 42]. While whole-exome sequencing (WES) provides DNA sequence data for just the coding regions of the genome, whole-genome sequencing (WGS) provides full sequence data for all coding exons as well as all the intervening noncoding regions. Whole-genome sequencing looks at the genome more broadly, allowing more accurate detection of genome rearrangements, and is the most sensitive approach for characterizing copy number changes that are often not evident with other sequencing approaches such as targeted sequencing. The disadvantages of this broader sequencing are the high cost of analysis and the inability to capture intratumoral heterogeneity at sufficient depth. In addition, data analysis and interpretation remain the biggest drawback [43]. Although whole-exome and whole-genome sequencing are more comprehensive than targeted sequencing, whole-exome sequencing covers only the approximately 1% of the genome that is translated into protein, and a large number of noncoding regions are therefore excluded from analysis. A number of recent studies have demonstrated that mutations in noncoding regions may have direct tumorigenic effects, and therefore, future diagnostic genomics will need to move toward complete genome sequencing [44]. Current clinically available sequencing-based tests are discussed in section “Gene Expression and Sequencing-Based Tests”.
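The trade-off between breadth and depth of sequencing follows from simple arithmetic: mean depth is approximately the number of reads multiplied by the read length and divided by the size of the target. The Python sketch below is a back-of-the-envelope estimate under stated assumptions. The read length, panel size, and a genome size of roughly 3 billion base pairs are illustrative, and the calculation ignores duplicates, uneven coverage, and unmapped reads.

```python
import math


def mean_depth(n_reads, read_length_bp, target_size_bp):
    """Approximate mean depth of coverage across the target."""
    return n_reads * read_length_bp / target_size_bp


def reads_needed(target_depth, read_length_bp, target_size_bp, on_target_fraction=1.0):
    """Approximate number of reads required to reach a mean depth.

    on_target_fraction is the assumed share of reads landing on the target.
    """
    return math.ceil(
        target_depth * target_size_bp / (read_length_bp * on_target_fraction)
    )


# Hypothetical 150 kb gene panel sequenced with one million 150 bp reads.
print(mean_depth(n_reads=1_000_000, read_length_bp=150, target_size_bp=150_000))

# Reads for 30x coverage of a genome of roughly 3 billion bp (assumed size).
print(reads_needed(target_depth=30, read_length_bp=150, target_size_bp=3_000_000_000))
```

With the same number of reads, a small panel reaches far greater depth than a whole genome, which is consistent with the depth advantage of targeted panels noted above.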

Role of Bioinformatics and Genomic Datasets in the Public Domain

In order to facilitate the biomarker discovery process, it was recognized that there was a need for freely accessible datasets containing comprehensive information associated with DNA and with RNA expression. Most journals now require that investigators make such genomic data publicly available in a standardized format for open access in silico analysis. All data must be MIAME (minimum information about a microarray experiment)-compliant. In other words, MIAME comprises the minimum requirements that should always be included with published microarray datasets, as suggested by the Functional Genomics Data Society (http://www.fged.org). The most popular genomic data resources are GEO, ONCOMINE, and ArrayExpress Archive, described below.

GEO (the Gene Expression Omnibus) is the biggest public repository that was designed to utilize features of the most commonly used molecular profiling methods today. These include data generated from microarray analyses as well as sequence technologies and include gene expression profiling, noncoding RNA profiling, chromatin immunoprecipitation (ChIP) profiling, genome methylation profiling, SNP genomic variation profiling, array comparative genomic hybridization (aCGH), serial analysis of gene expression (SAGE), and protein arrays (http://www.ncbi.nlm.nih.gov/geo/).

ONCOMINE is a cancer microarray database and Web-based data mining platform aimed at facilitating discovery from genome-wide expression analyses [45]. Using the ONCOMINE platform, researchers can easily compare gene expression profiles between cancer and normal samples; compare gene expression between different molecular, pathological, and clinical cancer subtypes; and investigate expression of genes in pathways and networks associated with cancer. It is possible to identify pathways, processes, chromosomal regions, and regulatory motifs activated in cancer and also search for genes that distinguish and predict cancer types and subtypes (http://www.oncomine.org).

ArrayExpress Archive/Gene Expression Atlas is a European database that contains functional genomic experiments including gene expression data. Here, researchers can query and download data collected according to MIAME and MINSEQE (minimum information about a high-throughput nucleotide sequencing experiment) standards. It is also an atlas that can be queried for individual gene expression under different biological conditions across experiments (http://www.ebi.ac.uk/arrayexpress).

Integration Approaches to In Silico Datasets

For in silico analysis, information is extracted from publicly available genomic datasets and then analyzed computationally by the researcher to look for patterns associated with particular diseases. In silico analysis can be applied, for example, to determine the location of mutations in a certain tumor suppressor gene, to look for copy number changes for particular genes, and to compare gene/protein expression patterns between cancerous and normal samples. Commercial bioinformatics software (such as Nexus™, BioDiscovery, Inc., California, USA, or Partek®, Partek Inc., Saint Louis, USA) enables users to manage, integrate, visualize, and analyze data generated from high-throughput gene expression analysis, aCGH, SNP arrays, and NGS datasets.

The advantages of in silico methods are that they are rapid and avoid the need for expensive experiments to evaluate a biomarker’s clinical value. Moreover, bioinformatics permits the investigator to search for a biomarker in one dataset and attempt to validate it in another. However, the utility of in silico analysis depends on the quality of the clinical data collected, as well as the coverage and accuracy of the annotations used to report the genomic data. It can also be difficult to compare results across several datasets because of the differences in genomic methods. For these reasons, in silico analysis in biomarker discovery is often considered an initial step that must be followed by rigorous experimental validation prior to preclinical investigation.

Clinically Applicable Gene-Based Assays

A very important aspect of marker development is translating a marker to the clinic once its usefulness has been established. A potential marker can be tested in different sources, including tumor tissues and body fluids such as serum and urine. The methods used should be rapid, reliable, and ideally inexpensive. As our understanding of complex diseases grows, additional biomarkers are being identified and developed into new and improved diagnostic tools that can analyze multiple biomarkers simultaneously. Often, such biomarker assays establish a complex molecular profile of the disease and provide an estimate of the likelihood of a response to a given treatment. They combine the values of multiple variables to yield a single patient-specific result. Such multigene assays commonly use PCR tests or gene expression microarrays, the results of which are integrated by an algorithm that organizes and prioritizes the individual markers, thereby producing a readily accessible result [46]. Common examples of this modality are discussed below, and some are already FDA cleared or approved.
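The general idea of combining several markers into a single patient-specific result can be sketched as a weighted sum of reference-normalized expression values. The Python example below is purely illustrative. The gene names, weights, scaling constants, and risk cutoffs are all hypothetical, and it does not reproduce the proprietary algorithm of any commercial assay.

```python
from statistics import mean


def reference_normalize(expression, reference_genes):
    """Subtract the mean reference-gene level from each marker gene."""
    ref_level = mean(expression[g] for g in reference_genes)
    return {
        gene: value - ref_level
        for gene, value in expression.items()
        if gene not in reference_genes
    }


def risk_score(normalized, weights, scale=10.0, offset=50.0):
    """Weighted sum of normalized expression, rescaled and clamped to 0-100.

    The weights, scale, and offset are hypothetical constants.
    """
    raw = sum(weight * normalized[gene] for gene, weight in weights.items())
    return max(0.0, min(100.0, offset + scale * raw))


def risk_category(score, low_cutoff=30.0, high_cutoff=60.0):
    """Map a score to a category using assumed cutoffs."""
    if score < low_cutoff:
        return "low"
    if score < high_cutoff:
        return "intermediate"
    return "high"


# Hypothetical log-scale expression values for one patient sample.
sample = {"REF1": 10.0, "REF2": 12.0, "GENE_A": 13.0, "GENE_B": 9.0}
weights = {"GENE_A": 1.0, "GENE_B": -0.5}  # made-up weights

normalized = reference_normalize(sample, reference_genes={"REF1", "REF2"})
score = risk_score(normalized, weights)
print(score, risk_category(score))
```

Normalizing against reference genes first is one way to make scores comparable between samples that differ in RNA quantity or quality. Real assays derive their weights and cutoffs from clinical training and validation cohorts.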

Gene Expression and Sequencing-Based Tests

In spite of the fact that microarray technologies are costly, gene expression tests are increasingly being implemented in modern clinical practice as an aid to conventional diagnostic, prognostic, and predictive decision tools used in cancer management. Some of the most recently used examples are discussed below.

ColoPrint® (Agendia, Amsterdam, the Netherlands) is a microarray-based gene expression profile used to predict the risk of distant recurrence of stage II and III colon cancer. ColoPrint® combines a multigene panel, which includes seven colon cancer-related genes and five reference genes, with a proprietary algorithm for determining risk of recurrence (http://www.agendia.com). ColoPrint® uses the same technology, methods, and quality control as FDA-cleared assays (e.g., MammaPrint®), though it is not approved by the FDA. Similarly, Genomic Health, Inc. provides the Oncotype DX® colon cancer test for stage II colon cancer patients, which evaluates the expression levels of 12 genes. The results of the test are reported as a quantitative Recurrence Score® result, which is a score between 0 and 100 that correlates with the likelihood of the cancer returning [47]. At present this test is not FDA approved. The assay is only performed by the developers in their Clinical Laboratory Improvement Amendments (CLIA) commercial laboratory. Genomic Health also provides MMR (mismatch repair) testing by immunohistochemistry on colon tumor samples, which, in combination with Oncotype DX®, may help the clinician in making treatment decisions (http://www.oncotypedx.com). Stage II colon cancer patients with MMR-deficient (MMR-D) tumors have a much lower risk of recurrence compared to patients with MMR-proficient (MMR-P) tumors [48].

BluePrint® is an 80-gene expression signature which classifies breast cancer into basal-type, luminal-type, and ERBB2-type cancers. The BluePrint® molecular subtyping profile, combined with the patient’s MammaPrint® (see below) test results, provides a greater level of clinical information to assist in therapeutic decision-making (http://www.agendia.com). BluePrint® does not require FDA clearance because it is considered a class I, low-risk device under FDA regulations.

MyPRS™/MyPRS Plus™ (my prognostic risk signature) is a tool for guiding treatment in patients with multiple myeloma. It analyzes all of the nearly 25,000 genes in a patient’s genome to determine the gene expression profile (GEP) that is associated with a particular patient’s condition. The GEP is made up of the 70 most relevant genes (GEP70) which aid in the prediction of the patient’s outcome (http://www.signalgenetics.com). Both MyPRS™ and MyPRS Plus were developed by Myeloma Health, LLC, who determined performance characteristics in a CLIA-certified laboratory. The FDA has indicated that these tests do not require either clearance or approval at present.

MammaPrint® (Agendia, Amsterdam, the Netherlands) is based on microarray technology using 70 cancer-related and about 1800 non-cancer-related genes (http://www.agendia.com). The test stratifies patients into two distinct groups: low risk or high risk for distant recurrence, with no intermediate-risk patients. With low-risk patients, hormonal therapy (e.g., tamoxifen) might be sufficient, avoiding the necessity of aggressive treatment such as chemotherapy. The test was cleared by the FDA as a class II device in 2007. However, the FDA did not evaluate treatment outcomes as a result of use of this “prognostic” device. In addition, the EWG (the Evaluation of Genomic Applications in Practice and Prevention [EGAPP] working group) found that “data were adequate to support an association between the MammaPrint Index and 5 or 10 year metastasis rates, but the relative efficacy of testing in ER-positive and ER-negative women is not clear.” Also, study subjects were European, and how characteristics of other demographic populations might affect test performance is not known [49]. The MINDACT (Microarray In Node-Negative Disease May Avoid Chemotherapy Trial) is designed to compare the effectiveness of MammaPrint test results versus clinical evaluation in predicting 15-year disease-free survival and overall survival (EORTC (European Organization for Research and Treatment of Cancer), MINDACT 2008). This trial will compare clinical response to endocrine therapy alone versus endocrine therapy combined with chemotherapy regimens (anthracycline-based, docetaxel-capecitabine, letrozole).

The Oncotype DX® breast cancer test (Genomic Health, Inc., Redwood City, CA) uses RT-PCR to study gene expression profiles in formalin-fixed, paraffin-embedded (FFPE) breast cancer tissues. Oncotype DX® analyzes expression of 21 genes, 16 cancer-related and 5 reference genes [50]. The test is intended for stage I or II, lymph node-negative, ER-positive breast cancer patients who will be treated with tamoxifen. Results are reported as a Recurrence Score™ (RS; scale of 0–100). Patients are divided into low-, intermediate-, and high-risk categories. Oncotype DX® claims to provide information beyond conventional risk assessment tools, including how likely the woman is to benefit from chemotherapy in addition to tamoxifen therapy (http://www.genomichealth.com). The TAILORx (Trial Assigning Individualized Options for Treatment) trial was designed to determine the benefit of chemotherapy for women with intermediate risk. The trial has shown that the gene expression test can identify women with a low risk of recurrence who could be spared chemotherapy [51]. The test is not FDA cleared but is available at the Genomic Health, Inc. CLIA-certified laboratory.

The most extensively studied tests among those listed above are Oncotype DX® breast cancer and MammaPrint®. In many countries these new tests are being offered for clinical use, but there remains a need for more comprehensive long-term studies to assess whether test outcomes lead to clear beneficial effects for patients and are cost-effective.

There are also a number of sequence-based gene panel tests that have been developed recently that provide precise information on mutations of clinical importance. These include clinical tests of germ line DNA for the risk of hereditary disorders and tests of tumor DNA for therapeutic decision-making in cancer [52].

A hotspot panel is a collection of frequently mutated hotspots that are either therapeutically actionable or of diagnostic or prognostic significance. Two types of hotspot cancer panels are currently commercially available to guide treatment: one informs the choice of therapy and the other the dose of medication.

The AmpliSeq™ Cancer Panel v1, developed by Life Technologies, covers 739 clinically relevant hotspot mutations from 46 cancer genes, including well-established tumor suppressor genes and oncogenes. A similar panel from ThermoFisher (Ion AmpliSeq™ Cancer Panel v2) has become very popular as a clinically validated test that is compatible with FFPE samples, and it has been adopted by many academic institutions and private laboratories in North America [52]. PGxOne™, developed by Admera Health, is a hotspot panel (http://www.admerahealth.com/pgxone/) that screens for 152 frequently mutated sites from 13 well-established pharmacogenomics genes that affect drug absorption, metabolism, or activity. The data from the panel help physicians prescribe appropriate doses for effective treatment based on the presence of specific actionable mutations. Several institutions offer similar panels as lab-developed procedures performed in CLIA-certified laboratories.

The disease-focused panels are designed to detect germ line mutations to screen for the risk of inherited diseases or to diagnose suspected genetic diseases in carriers. The hereditary cancer panels are widely used tests since approximately 5–10% of all cancers are considered to be hereditary. More than 100 cancer susceptibility syndromes have been reported, including hereditary breast and ovarian cancer syndrome, Lynch syndrome, Cowden syndrome, and Li-Fraumeni syndrome. Today around 227 tests are available for hereditary cancer screening in clinical laboratories.

Comprehensive panels aim to include all genes with known disease associations. Illumina’s TruSight One is an example of such a comprehensive panel. This panel includes more than 60 well-established subpanels and covers 4813 genes having known association with clinical phenotypes. Such panels minimize test development and validation efforts and enable physicians to request testing for specific disease(s) if clinically indicated, without additional effort.

Whole-genome sequencing is the most comprehensive tool for future clinical application. Like whole-exome sequencing (WES), it provides full coverage of all protein-coding regions, but it also covers intronic and other noncoding regions associated with inherited diseases. With the recent release of Illumina HiSeq X Ten, a human genome can be sequenced at 30x coverage for under $1000 for the wet lab portions of the analysis.
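
The “30x coverage” figure can be related to sequencing output with the standard approximation that mean coverage equals the number of reads multiplied by the read length and divided by the genome size. The Python sketch below applies that relation. The genome size (about 3.1 billion bases) and the 150-base read length are illustrative assumptions rather than figures from this chapter, and the function names are hypothetical.

```python
import math


def mean_coverage(n_reads, read_length_bp, genome_size_bp):
    """Approximate mean depth of coverage: (reads x read length) / genome size."""
    return n_reads * read_length_bp / genome_size_bp


def reads_for_coverage(coverage, read_length_bp, genome_size_bp):
    """Smallest whole number of reads giving at least the requested mean coverage."""
    return math.ceil(coverage * genome_size_bp / read_length_bp)


# ASSUMED illustrative values, not taken from the chapter.
GENOME_SIZE_BP = 3_100_000_000   # approximate human genome size
READ_LENGTH_BP = 150             # a typical short-read length

needed = reads_for_coverage(30, READ_LENGTH_BP, GENOME_SIZE_BP)
print(f"Reads needed for 30x mean coverage: {needed:,}")
```

Under these assumptions roughly 620 million reads are required. Real runs need more, because duplicate reads, unmapped reads, and uneven coverage all reduce the usable depth.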

Protein Chips

Just as DNA chips are used to identify gene expression profiles in particular tumors, “protein chips” enable the simultaneous analysis of thousands of proteins expressed by a single tumor sample. Their advent has helped researchers to better understand the molecular basis of disease, including disease susceptibility, diagnosis, progression, and potential points of therapeutic interference. The basic format of most protein chips is similar to that of DNA chips: a glass or plastic surface is printed with an array of molecules (e.g., antibodies) that can capture proteins. Ideally, a protein chip would be able to predict a cancer state by a simple serum or urine test. This technology is likely to see considerable additional development and application in the coming years [53].

Fluorescence In Situ Hybridization (FISH)

Quantification of multiple mRNA levels in tumors is expensive, technically demanding, and not readily available in a routine clinical setting. FISH provides an alternative way to diagnose and identify predictive or prognostically important genetic alterations. The method is simple, fast, and reliable and therefore has been widely accepted for clinical use in human cancer. It is used to assess various genetic alterations (amplifications, deletions, translocations). FISH can detect genomic anomalies over a much greater dynamic size range than other techniques. In the past decade, the technique has been developed to include multicolor FISH assays so it is now possible to assess complex genomic alterations [54]. Recent improvements have been made to FISH in the form of chromogenic in situ hybridization (CISH) and silver-enhanced in situ hybridization (SISH). These techniques use peroxidase enzyme-labeled probes whose signals do not decay over time and allow the specimen to be viewed using bright-field microscopy. CISH and SISH have been used to assess Her2/neu gene status [55].

Assessment of Her2/neu amplifications in breast cancer, to assess prognosis and to predict treatment outcome, is the most common example of FISH use in clinical settings [56]. Other examples include the recently developed commercialized test eXagenBC. The latter promises to provide a tailored prognosis in node-positive and node-negative breast cancer patients and is based on assessment of DNA copy numbers of three genomic regions (around CYP24, PDCD6IP, and BIRC5) for ER-positive and progesterone (PR)-positive tumors and three different genes (NR1D1, SMARCE1, and BIRC5) for ER-negative and PR-negative tumors in both node-negative and node-positive patients. The eXagenBC test uses a prognostic index (PI) from an algorithm to integrate the information from the three genes and predict recurrence rates. This test may provide greater accuracy compared to other criteria for recurrence risk assessment and therefore has been suggested for routine clinical use [57].

Additional promising prognosticators in prostate cancer include fusion genes such as TMPRSS2-ERG and PTEN deletions, which may help to identify aggressive disease. PTEN deletions have been associated with earlier biochemical relapse following radical prostatectomy. Prostate cancers showing homozygous PTEN deletions, termed “PTEN null,” have been strongly associated with metastasis and androgen-independent progression, i.e., castration-resistant prostate cancers (CRPC) [58,59,60]. One important new FISH biomarker is the echinoderm microtubule-associated protein-like 4-anaplastic lymphoma kinase (EML4-ALK) fusion gene, present in a small subset of non-small-cell lung cancers (NSCLC). Such tumors are particularly sensitive to ALK inhibitors such as crizotinib, which was approved by the FDA in 2011 for the treatment of locally advanced or metastatic non-small-cell lung cancers that are ALK-positive [61, 62]. The FDA also approved the Vysis ALK Break Apart FISH Probe Kit (Abbott Molecular, Inc.), a diagnostic test designed to detect rearrangements of the ALK gene in NSCLC [63].

Polymerase Chain Reaction (PCR)

Clinical diagnostic applications of real-time PCR or real-time quantitative PCR (qPCR) have been widely implemented by hospital-based clinical laboratories [64]. In translational research, qPCR is simple and one of the fastest, most reliable and cheapest molecular techniques for the validation of a newly discovered biomarker. A qPCR assay can be used to identify gene amplifications, deletions, fusions, overexpression, and mutations down to single base changes, and therefore, these very sensitive and specific molecular tests are among the most widely used methods to translate recent discoveries in cancer research into clinical practice.

Examples of clinically applicable qPCR assays in cancer diagnostics and prognostics include the detection of BCR-ABL1 transcripts in patients with chronic myeloid leukemia (CML) who are then subjected to tyrosine kinase inhibitor (imatinib [Gleevec®]) treatment as a first-line therapy and to quantification of minimal residual disease (MRD) by qPCR [65]. Recently highly sophisticated methods have been developed using DNA-based and RNA-based PCR assays for the detection of BCR-ABL1 transcripts that were previously not detectable by conventional PCR methods [66, 67]. Thyroid cancer is another example where qPCR assays play an important role: in this case they have a diagnostic and predictive role. Real-time PCR can be used to diagnose papillary thyroid carcinomas (PTCs) harboring a point mutation in BRAF or RAS, or a RET-PTC rearrangement (>70%), and they can help diagnose follicular thyroid carcinomas (FTCs) that harbor either RAS mutations or PAX8/PPARγ rearrangements [68]. RAS mutations may also be found in benign thyroid lesions. In addition, sporadic and hereditary medullary thyroid carcinomas (MTCs) are both associated with point mutations in the RET gene. Thus, molecular testing is now an important component of thyroid cancer diagnosis and management [68, 69].

Assays that simultaneously amplify (or detect) two or more target fragments (or detect sequence changes within target fragments) are termed duplex and multiplex real-time PCRs, respectively. It is noteworthy that the multiplexing of biomarkers has many advantages over single biomarker measurements, especially when trying to identify the best diagnostic or prognostic models for various human cancers (prostate cancer, as an example, is discussed below) [70]. One commercially available real-time PCR assay (HemaVision, DNA Technology, Aarhus, Denmark) is widely used in clinical laboratories to simultaneously detect 28 fusion genes and more than 80 breakpoints and splice variants in patients with acute myelogenous leukemia (AML) and acute lymphoid leukemia (ALL) ([71]; http://www.biocompare.com).

Classical cytogenetic methods (e.g., conventional karyotyping) continue to provide well-established diagnostic findings to clinicians. However, certain genetic abnormalities (translocations or fusion genes) that have often been missed by conventional cytogenetics can now be detected with high reliability using newer molecular techniques. The advantages of these techniques over traditional methods may include a shorter turnaround time, automated analyses, and no requirement for dividing cells [72].

Impact of Genetic Biomarkers on Drug Development and Clinical Trial Designs

Genetic biomarkers now have tremendous impact in every phase of drug development, from drug discovery to preclinical evaluations through each phase of clinical trials and into routine clinical use [73]. In the early phases of drug development, biomarkers are used to evaluate the activity of small molecule therapeutics in animal models, to investigate mechanisms of action and to provide essential preclinical data needed for the various later stages of clinical trials. If the preclinical phase of drug development is successful, then it is followed by an application to the FDA as an investigational new drug (IND). The purpose of an IND is “to ensure that subjects will not face undue risk of harm” in a clinical investigation that involves the use of a drug. The IND is the mechanism by which the investigator, or pharmaceutical sponsor, provides the requisite information to obtain authorization to administer an investigational agent to human subjects [74]. By doing so, the compound can be tested for dose response, efficacy, and toxicity. After an IND is approved, the next steps are clinical phases 1, 2, and 3. Phase 1 trials determine safety and dosage and identify side effects (patient number: 20–80); phase 2 trials are used to obtain an initial assessment of efficacy and to further explore safety of the drug or treatment in a larger number of patients (100–300); and in phase 3 trials, the treatment is given to large groups of patients (>1000) to confirm effectiveness, monitor side effects, compare efficacy to established treatments, and collect information that will allow it to be used safely.

In clinical trials which are designed to validate and assess the usefulness of a prognostic or predictive biomarker, the major issues are to obtain sufficient statistical evidence of treatment benefit in patients who are positive for the predictive or prognostic biomarker and then to examine the biological relationships associated with the biomarker’s expression and the molecular pathways targeted by the therapeutic agent. Often, such studies utilize a retrospective analysis of a biomarker in available tissues from patients with known response who have been treated similarly [75]. Before initiating studies to confirm the clinical utility of a novel biomarker, it is necessary to conduct validation trials in which several criteria must be met. First, specific testable hypotheses must be proposed based on scientific evidence of the predictive properties of the putative biomarker relative to the existing (standard) treatment. In addition, any prognostic benefit is assessed as well. A novel biomarker is considered promising for clinical utility when it demonstrates the following features in the validation study: (1) the marker is independently associated with clinical outcome; (2) its biological effects are specific for the cancer of interest as opposed to normal tissues, other disease states, or other cancers; (3) the marker’s prevalence in the target population is high; and (4) the methods of marker measurement are feasible and reproducible.

In the next phase of the evaluation of clinical utility of the predictive or prognostic biomarker, two major issues have to be considered: the selection of an appropriate patient population and the choice of the most appropriate end point. For example, when evaluating predictive markers of therapeutic efficacy in the adjuvant setting, the primary end point usually is overall, disease-free, or recurrence-free survival. Possible primary end points for metastatic disease trials would include response rate, time to progression, survival, or risks of toxicity [75].

With respect to clinical trial designs for new drugs or treatment options and companion biomarkers, randomized controlled trials (RCT) are the most popular, because they limit the potential for bias by randomly assigning patients to an intervention arm or to a nonintervention (or placebo) arm. This minimizes the chance that the incidence of confounding (particularly unknown confounding) variables will differ between the two groups. Currently, some phase 2 and most phase 3 drug trials are randomized, double-blind, and placebo-controlled. Traditional RCT designs, however, are not always well suited for drugs with molecular targets and associated biomarkers, and newer clinical trial designs have incorporated the recent discoveries of molecular oncology [76]. These trial designs can be much more efficient because study arms are enriched based on mutational profiles associated with a specific actionable drug response. For example, the standard randomized approach in a clinical trial for trastuzumab would not be very effective without the use of an enrichment design, because the drug has little effect on Her2/neu-negative patients. Because almost 75% of patients are Her2/neu negative, a standard design would require a large sample size to detect the treatment effect of trastuzumab on Her2/neu-positive patients. An enrichment clinical trial design is used to evaluate a treatment or a drug whose effect can be readily demonstrated in a specific subset of the study population. Often such a subset is identified by a biomarker test that is used to select those patients who are likely to respond well to the treatment. The efficiency of the study thus depends on the prevalence of test-positive patients and on the relative effectiveness of the new treatment in test-negative patients [76]. In enrichment designs, the number of randomized patients is often substantially smaller than for a standard design.
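
The dependence of efficiency on marker prevalence and on the treatment effect in test-negative patients can be illustrated with a simplified calculation. The Python sketch below assumes that the required number of randomized patients scales inversely with the square of the average treatment effect, that outcome variance is the same in both subgroups, and that the biomarker test is perfectly accurate. Under those assumptions the effect in an unselected population is the prevalence-weighted average of the two subgroup effects. The function names and example values are hypothetical, and the result is a rough approximation rather than a trial-planning tool.

```python
import math


def enrichment_efficiency(prevalence, effect_pos, effect_neg=0.0):
    """Approximate ratio of patients randomized: standard design / enrichment design.

    prevalence : fraction of patients who are test-positive (0 < prevalence <= 1)
    effect_pos : treatment effect in test-positive patients
    effect_neg : treatment effect in test-negative patients
    ASSUMES equal variance in both subgroups and a perfectly accurate test.
    """
    if not 0 < prevalence <= 1:
        raise ValueError("prevalence must be in (0, 1]")
    # Average effect when all patients are enrolled regardless of marker status.
    overall_effect = prevalence * effect_pos + (1 - prevalence) * effect_neg
    if overall_effect == 0:
        raise ValueError("no average effect in the unselected population")
    return (effect_pos / overall_effect) ** 2


def patients_to_screen(n_randomized, prevalence):
    """Patients who must be tested to randomize n test-positive patients."""
    return math.ceil(n_randomized / prevalence)


# Example loosely modeled on the text: about 25% of patients test positive
# and the drug is assumed to have no effect in test-negative patients.
print(enrichment_efficiency(prevalence=0.25, effect_pos=1.0, effect_neg=0.0))
print(patients_to_screen(n_randomized=100, prevalence=0.25))
```

With 25% of patients test-positive and no effect in test-negative patients, the standard design needs about 16 times as many randomized patients as the enrichment design, although the enrichment design must screen about four patients for each one randomized. As the effect in test-negative patients approaches the effect in test-positive patients, the ratio falls toward 1 and the advantage of enrichment disappears.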

Another new type of clinical trial is the “basket” phase 2 design, which is based on the idea that the presence of a molecular marker will predict response to targeted therapies, independently of tumor histology. Basket trials can be nonrandomized or randomized and can include a single drug or multiple individual drugs [76]. The MATCH (Molecular Analysis for Therapy Choice) clinical trial, launched by the National Cancer Institute with 400 clinical sites and 10 drugs, is an example of a large multidrug basket design [77]. In this trial, more than 3000 patients with advanced metastatic cancer of many histologic types have been genomically tested with a common platform and triaged to a nonrandomized substudy with an actionable drug.

The “umbrella” trial design is a similarly innovative approach that takes patients with the same type of cancer and assigns them to different arms of a study based on their mutations and the availability of a targeted therapy. The BATTLE I (Biomarker-Integrated Approaches of Targeted Therapy for Lung Cancer Elimination) phase 2 trial for patients with non-small-cell lung cancer is an example of a phase 2 umbrella trial [78]. In this trial, patients’ samples were assayed for four candidate biomarkers based on genomic or transcriptomic alterations. The patients were then randomly assigned to receive one of the four drug regimens. The analysis of this trial was the same as for the randomized basket designs, but in the umbrella design, the conclusion about whether targeting was useful was limited to patients with the single selected primary site of disease.

The Translational Research Continuum

Despite the rapid pace of biomarker discovery in recent years, there are still very few validated genetic biomarkers of proven and robust clinical utility [79]. This poor performance reflects the fact that the clinical development of a new biomarker is just as difficult as the development and approval of a new drug. Here we outline the bench-to-bedside pipeline and discuss how best to facilitate the successful development of biomarkers and molecularly targeted treatments. Throughout the cancer research process, many challenges are faced during the transition of a new discovery from the “research bench” through the phases of laboratory and clinical validation. Unfortunately, the majority of “exciting discoveries” never survive these rigorous evaluations and are not accepted as part of routine clinical practice or used for laboratory testing by pathologists (Fig. 13.2).

Fig. 13.2

The translational research continuum. This graph schematically depicts the three major obstacles that impede an exciting research discovery (leftmost peak) moving through the validation phase from preclinical research into clinical trials (middle peak) and on to clinical or laboratory practice (small peak on right). The graph illustrates the continuing gap between basic biomedical research and clinical research and knowledge. This gap limits the capacity to translate the results of provocative discoveries generated by basic biomedical laboratory research to the bedside, as well as to successfully engage and educate health-care providers in the benefits of the discoveries

Challenges in Preclinical and Clinical Research

A major factor contributing to the lack of use of genetic biomarkers in clinical trials is the poor quality of published preclinical data. This has been the focus of a recent commentary by Begley and Ellis [3]. IND trials rely heavily on the literature and on having a comprehensive understanding of the agent’s target, its associated biomarker, and the various downstream consequences of the drug. Very often, however, the biological hypothesis around a new agent and its companion biomarker is uncertain or questionable. The lack of reproducibility of preclinical “research assays” when applied to patient samples may prevent the application of novel biomarkers in a clinical setting. Some of the issues that are considered to be associated with poor uptake of research biomarkers by trialists and clinical laboratories are summarized in Table 13.1.

Table 13.1 Challenges in preclinical and clinical research

The Biomarker Development Process

The biomarker development process requires multiple collaborative mechanisms, knowledge networks, and consortia to bring biomarkers to fruition in clinical practice. The critical limitation in biomarker development is that the biomarker discovery process lacks the kind of formal structure that exists for testing a new drug. Even after, among other things, the clinical validity and clinical utility of a newly discovered biomarker have been shown (see below), the biomarker is not considered “validated” and cannot be recommended for use in clinical practice until independent research groups at multiple sites have demonstrated concordant results in separate trials. The challenge is firstly to determine which data are required to perform these studies and, secondly, to obtain, share, and pool these data and to provide adequate support to analyze the pooled datasets. A solution would be to apply uniform standards, which should facilitate effective translation of newly discovered biomarkers to the clinical setting. Therefore, numerous collaborative mechanisms, knowledge networks, and consortia have emerged in order to facilitate biomarker discovery and enhance the delivery process to the clinic. Examples such as the Early Detection Research Network (EDRN) and The Biomarkers Consortium (TBC) demonstrate the value of a nationally coordinated approach [80, 81].

Guidelines (known as the Standards for Reporting of Diagnostic Accuracy, or STARD statement) have been developed for diagnostic studies and were inspired by CRGs (Cochrane Review Groups) in 1999. For prognostic studies, guidelines known as REMARK criteria were developed by NCI-EORTC (National Cancer Institute-European Organisation for Research and Treatment of Cancer) [82,83,84]. The STARD initiative aims to improve the reporting quality and diagnostic accuracy of publications describing new biomarkers. The statement consists of a checklist of 25 items, and the decision to include items in the checklist was based on evidence linking these items to either bias, variability in results, or limitations of the applicability of results to other settings [82]. The checklist can be used to verify that all essential elements are included in the report of a research study.

REMARK (REporting recommendations for tumor MARKer prognostic studies) guidelines were developed primarily for studies of prognostic markers, especially those evaluating a single tumor marker while possibly adjusting for other known prognostic factors. The guidelines suggest relevant information that should be provided about the study design, preplanned hypotheses, patient and specimen characteristics, assay methods, and statistical analysis methods [83].

While some biomarkers have already been approved by the FDA, the use of others has been recommended in clinical guidelines by various cancer societies [5]. A recent example is the test for epidermal growth factor receptor (EGFR) mutation in patients with advanced NSCLC, which determines whether or not first-line EGFR tyrosine kinase inhibitor therapy is indicated [5, 85]. The introduction of biomarkers into routine clinical practice is considered within the tumor marker utility grading system (TMUGS) framework, which was designed to evaluate the clinical utility of tumor markers and to propose a hierarchy of “levels of evidence” that might be used to determine whether available data support the use of a marker [86]. TMUGS provides guidelines to determine the clinical utility of known and future tumor markers, as well as guidance on biomarker assay design, interpretation, and use in clinical practice. This evidence scale has been widely cited and used for deciding whether to recommend the use of a tumor marker in clinical practice and for the design and conduct of tumor marker studies [87, 88]. It has recently been revised to distinguish data generated from prospective clinical trials, in which the marker is the primary objective of the study, from those in which archived specimens are used [1, 75, 89]. Starting in 2000, the Office of Public Health Genomics (OPHG) at the Centers for Disease Control and Prevention (CDC) established the analytic framework ACCE Model Project based on four main criteria for evaluating a genetic test:

1. Analytic validity is a component of clinical validity (see below) describing how accurately and reliably the test measures the genotype of interest. Analytic validity assesses technical test performance and includes analytic sensitivity (detection rate), analytic specificity (1 minus the false-positive rate), reliability (repeatability of test results), and assay robustness (resistance to small changes in pre-analytic or analytic variables).

2. Clinical validity describes the accuracy with which a test predicts a particular clinical outcome and clearly separates two subgroups of patients with different outcomes within a large population. When a test is used diagnostically, clinical validity measures the association of the test with the disorder [90], and when used predictively, it measures the probability that a positive test will result in the appearance of the disorder within a stated time period.

3. Clinical utility is the balance of benefits and harms when the test is used to influence patient management, i.e., the evidence that the use of the marker improves outcomes compared to not using it. Evaluation of clinical utility factors in the available information about the effectiveness of the interventions for people who test positive and the consequences for individuals with false-positive or false-negative results.

4. Ethical, legal, and social implications (ELSI) refer to other implications which may arise in the context of using the test and cut across clinical validity and clinical utility criteria.

In 2004, a new initiative, termed EGAPP™ (Evaluation of Genomic Applications in Practice and Prevention), was created by OPHG at the CDC “to better organize and support a rigorous, evidence-based process for evaluating genetic tests and other genomic applications that are in transition from research to clinical and public health practice in the U.S.” [49, 91].
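
The sensitivity, specificity, and predictive-value concepts in the ACCE criteria above can be made concrete with a small calculation from a two-by-two table of test results against true status. The Python sketch below is illustrative only: the class and function names are hypothetical, and the counts are invented for the example rather than drawn from any study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AssayCounts:
    """Two-by-two table of test result versus true status."""
    true_pos: int    # condition present, test positive
    false_neg: int   # condition present, test negative
    true_neg: int    # condition absent, test negative
    false_pos: int   # condition absent, test positive


def sensitivity(c):
    """Detection rate: fraction of true cases that the test calls positive."""
    return c.true_pos / (c.true_pos + c.false_neg)


def specificity(c):
    """Fraction of non-cases that the test calls negative."""
    return c.true_neg / (c.true_neg + c.false_pos)


def false_positive_rate(c):
    """Fraction of non-cases that the test calls positive (1 - specificity)."""
    return c.false_pos / (c.false_pos + c.true_neg)


def positive_predictive_value(c):
    """Probability that a positive result corresponds to a true case."""
    return c.true_pos / (c.true_pos + c.false_pos)


# INVENTED counts for illustration: 100 true cases and 1000 non-cases.
counts = AssayCounts(true_pos=90, false_neg=10, true_neg=950, false_pos=50)
print(f"sensitivity = {sensitivity(counts):.3f}")
print(f"specificity = {specificity(counts):.3f}")
print(f"false-positive rate = {false_positive_rate(counts):.3f}")
print(f"positive predictive value = {positive_predictive_value(counts):.3f}")
```

In this invented example the test is both sensitive and specific, yet more than a third of positive results are false because non-cases greatly outnumber cases. That dependence on how common the condition is in the tested population is one reason clinical validity has to be assessed separately from analytic validity.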

The US Preventive Services Task Force (USPSTF) is an independent panel of non-federal experts in prevention and evidence-based medicine and is composed of primary care providers. The USPSTF strives “to make accurate, up-to-date, and relevant recommendations about preventive services in primary care. It conducts scientific evidence reviews of a broad range of clinical preventive health care services (such as screening, counseling, and preventive medications) and develops recommendations for primary care clinicians and health systems” (http://www.uspreventiveservicestaskforce.org). These recommendations are published in the form of “Recommendation Statements.” The USPSTF also stratifies the evidence about the effectiveness of treatments or screening into three levels of quality (Table 13.2). For example, in 2002, the USPSTF deemed the evidence to be insufficient to recommend routine use of PSA as a screening test among men younger than age 75. The recommendation, however, does not cover the use of the PSA test for surveillance after diagnosis or treatment of prostate cancer. The USPSTF reviewed the available evidence again in 2011 and in a draft report concluded that the population benefit from PSA screening was inconclusive, recommending against PSA-based prostate cancer screening at any age [92, 93]. The USPSTF makes evidence-based recommendations about clinical preventive services such as screenings, counseling services, or preventive medications. Currently the majority of USPSTF recommendations are not in favor of widespread use of cancer screening using biomarkers. However, as more DNA-based biomarkers are developed, it seems likely that the benefits of screening may outweigh the risks for some of the diseases where early intervention can prevent disease progression (http://www.uspreventiveservicestaskforce.org/uspstopics.htm#AZ).

Table 13.2 Stratification of evidence by quality [94]

Conclusions

The various consortia, grading systems, and collaborative initiatives discussed in this chapter were largely founded and developed in North America and are part of the effort to provide evidence-based medicine, which seeks to assess the strength of the evidence of risks and benefits of treatments, diagnostic tests, and biomarkers. Similar systems exist in Europe, though they are not discussed here. The development and application of high-throughput sequencing have led to the precision medicine initiative in cancer. At the same time, radical changes in clinical trial design, combined with accelerated biomarker development, suggest there will be greatly improved response rates for patients and reduced cancer mortality for many more tumor types. The networking infrastructures developed throughout the world to date aim to share and pool analyzed data to complete the biomarker discovery → development → validation continuum. Increased collaboration between such consortia will continue to accelerate biomarker development and the use of genomics in clinical oncology. Global harmonization of guidelines in the years ahead will likely underpin the success of biomarker translation from bench to bedside.