Background

When the search terms “cancer” and “histology” were entered in the search engine of the US National Institutes of Health service (www.clinicaltrials.gov) at the end of June 2015, more than 15,000 interventional clinical trials came up. This staggering number illustrates how often interventional studies use or rely on information provided by histological data. Of the drugs approved in 2013, 40 % target cancer. In the last 20 years, improvements in conventional chemotherapeutic drugs, which include among others alkylating agents, antimetabolites, and topoisomerase inhibitors, have been slow and limited in impact. However, significant improvements have been achieved in a few cancer types using either antihormonal medication or drugs targeting molecules in specific signaling pathways, e.g., receptor tyrosine kinases and their ligands. Conventional chemotherapy requires accurate histological diagnosis of cancer subtype and the assessment of tumor stage, while “targeted” drugs often need predictive biomarker testing on a tissue specimen. Targeted treatment tends to be palliative rather than curative, but it has extended progression-free, tumor-specific, and overall survival for an increasing number of cancer types and added further “bullets” in the “war against cancer.” This marketing-type terminology, commonly used in public media, tends to generate optimism and hope but in doing so ignores major problems encountered, not only in terms of drug efficacy in clinical trials but also in the development of reliable companion diagnostics. The EXPAND study, which tested capecitabine and cisplatin with or without cetuximab in patients with previously untreated advanced gastric cancer, is a good example: adding cetuximab to capecitabine-cisplatin as first-line treatment did not result in any benefit.
Cetuximab is a chimeric monoclonal antibody directed against the epidermal growth factor receptor (EGFR) and provides significant survival benefit to patients with advanced colorectal cancer with wild-type KRAS status [1]. Because KRAS mutations are rare in gastric cancer compared with colorectal cancer, while gastric cancers do express EGFR, expectations were high. However, cetuximab failed to show any benefit, and the relative success in colorectal cancer did not translate into better clinical management of gastric cancer [2]. Likewise, results of recent studies communicated during the annual meeting of the American Society of Clinical Oncology failed to show a benefit of MET inhibitors in gastric cancer. In both studies, drug efficacy was tested along with a potential companion diagnostic [3, 4].

Therefore, in spite of a breathtaking number of clinical trials, many of which also attempt to validate companion diagnostics, the number of effective targeted drugs and the spectrum of tumors for which they work remain limited. The same holds true for the accompanying handful of tissue-based biomarkers that have become standard of care in diagnostic surgical pathology. This raises the question of which role pathologists might play in planning and executing clinical trials.

Role of the surgical pathologist in clinical trials

Surgical pathology already contributes considerably to clinical trials. Three roles can be distinguished regarding pathology input: support in clinical studies, participation in preclinical investigations, and implementing trial results, notably concerning companion diagnostics, into diagnostic pathology practice [5]. Support in clinical studies consists of providing sound pathological diagnoses through, e.g., central review, as well as providing biomarker test results (specific tissue-based diagnostics conditional for patient inclusion or exclusion). Support in the preclinical phase includes biomarker discovery, i.e., support for or execution of tissue-based translational studies for the development of new clinically relevant biomarkers, which requires well-characterized tissue collections. Implementing trial results concerns roll-out of new tests into daily diagnostic practice, including bedside-to-bench research projects that improve application of a new tissue-based biomarker, but also the establishment and execution of external, objective quality control procedures.

Diagnostic pathology support in clinical studies

In the past 20 years, major progress has been made in surgical pathology. The World Health Organization (WHO)/International Agency for Research on Cancer (IARC) currently publishes the fourth edition of the classification of tumors. Treatment decisions are based on a sound histological diagnosis, and the “blue books,” as they tend to be called, provide a globally accepted consensus-based framework for the histological and molecular classification of human tumors. Clinical trials would be inconceivable without standardized histological diagnoses, which classify diverse human tumors according to morphology, taking molecular characteristics and natural behavior into account. These histological classifications are work in progress. Novel insights, including concepts based upon next-generation sequencing data but also clinical findings and pharmacotherapeutic progress, all have an impact on classification schemes. Lung cancer is a striking example of the dynamic evolution of tumor classification: new modalities of targeted therapy of pulmonary adenocarcinomas have led to international initiatives to profoundly revise their classification [6].

The tumor (T), node (N), metastasis (M) classification of the Union for International Cancer Control (UICC) provides a solid basis for staging of human tumors [7]. It is the most important instrument for tailoring treatment to the needs of the individual cancer patient. The TNM classification has stood the test of time and has never been surpassed in multivariate analyses by any other single prognostic biomarker, often based upon immunohistochemistry or molecular analysis. The TNM classification is used in clinical trials, to select patients who are eligible for inclusion, and in cancer registries to compare outcome between different patient series, across different countries, over different time periods (i.e., ethnicity, medical treatment developments, sociocultural effects), and particularly between different studies. The TNM classification is also continuously improved by the addition of novel criteria and adoption of novel insights into tumor biology and prognostic factors. It is important to emphasize here that every diagnostic pathologist involved in signing out biopsies and surgical specimens of cancer patients contributes to clinical trials, in providing reliable pTNM classification for each case.

Diagnostic standards harmonize and standardize surgical pathology and provide a basis for evidence-based medicine and indirectly for the quality of clinical trials. Standardized classification is particularly important because most clinical trials are multicenter and often span several countries and even continents. Clinical trials would be impossible to conduct without the support of a solid histopathological diagnosis, including tumor type, grade (if appropriate), and stage, all of which are provided by a surgical pathologist who usually is not directly involved in study design and/or execution. Patient accrual by clinicians often depends on work done by surgical pathologists. Beyond standardization of diagnostic criteria, adherence to diagnostic standards needs to be monitored, not only for daily practice but also for clinical trials. To this end, internal and external quality assurance programs have been implemented in many countries. In addition, continuous medical education programs, such as those provided by learned societies including the International Academy of Pathology (www.iapcentral.org), are dedicated to improving quality of diagnostic pathology through education. This implies that pathologists involved in daily diagnostic practice, which can directly influence quality of trial data, as well as those actively participating in developing and executing clinical studies, should be board-certified specialists with a diagnostic capacity at the level required for standard of care. Specific expertise may be required in addition, depending on the particularities of the study or the applied technologies. Such expertise becomes imperative when a surgical pathologist is actively involved in a clinical trial as central reviewer providing a second opinion prior to inclusion of a patient.

Participation in preclinical investigations

The question arises whether harmonization of diagnostic standards, continuous medical education, and specialization are sufficient guarantees for the required quality. Tissue-based biomarker testing has become an integral part of histopathology practice in oncology because prognostic biomarkers might determine whether the patient needs more, less, or no additional (adjuvant) treatment. Predictive biomarkers are needed to select the right drugs for a specific patient/target, hence the term targeted therapy, the mainstay of precision medicine [8]. As a result, clinical trials increasingly include exploration of tissue-based biomarkers in the quest for prognostic and companion diagnostics. This requires not only diagnostic expertise but also competence in laboratory quality assurance procedures and an understanding of diagnostic algorithms beyond the basic requirements for board certification. Important issues here are pre-analytical variables, sampling issues, and test and evaluation algorithms.

Pre-analytical variables

An area of particular importance is the impact of pre-analytical variables on the results of a test. A striking example is the Her2/neu retesting controversy in Canada [9]. In 1998, trastuzumab was approved for the treatment of breast cancer, but only for tumors that tested positive for Her2/neu expression or amplification on a breast cancer tissue sample. Guidelines for Her2/neu testing of breast cancer were published in 2007 [10]. In spite of these, retesting of breast cancer samples in a quality assurance program revealed major discrepancies, in part due to pre-analytical conditions of tissue treatment including fixation time, as test results are heavily influenced by insufficient fixation. Accumulating evidence led to revised guidelines, which were published in 2013 [11]. In comparison to the significant standardization efforts regarding diagnostic criteria, little effort has been dedicated to standardization of pre-analytical variables. International multicenter trials often fail to provide detailed specification of pre-analytical variables such as fixative and fixation time. While standards for tumor marker prognostic studies (REMARK; [12]) and diagnostic accuracy studies (STARD; [13]) have been published, these mainly concern reporting and publishing. Harmonization is also needed in terms of pre-analytical tissue treatment conditions.

Fully automated immunostaining devices and certified test kits have improved tissue-based test results, but these will never become truly robust if pre-analytical variables are not harmonized. Commonly, formalin-fixed and paraffin-embedded tissue specimens are used. Formalin fixation is still strongly recommended, in spite of the fact that it yields fragmented DNA of suboptimal quality, which allows amplification of only short PCR amplicons. It remains the most widely used fixation procedure and also allows retrospective cohort studies, particularly of cancers with a low prevalence or of historical, chemotherapy-naïve patient populations.

Sampling procedure

It is essential that before any molecular testing, a solid histopathological diagnosis including detailed classification has been established. This also applies to the biopsy taken for clinical trials. DNA will be extracted from tissue samples which include a mixture of non-neoplastic and neoplastic cells. The percentage of each of these varies and has to be considered. Supervision and interpretation of molecular testing should not be carried out without surgical pathologists [14]. Lack of supervision by surgical pathologists carries the risk of testing non-representative or even non-neoplastic tissue samples. The sensitivity of molecular biological assays varies, and mutations can be missed when inappropriate assays are applied or when the “mutational load” in a tumor subclone is below the detection level [15]. The need for quality assurance programs is exemplified by the external quality assessment for KRAS mutation testing in colorectal cancer carried out by the European Society of Pathology, in which 27 % of the participants genotyped at least 1 of 10 samples incorrectly [16].
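The interplay between tumor cell content, subclone size, and assay sensitivity described above can be illustrated with a minimal sketch. It assumes a heterozygous mutation in a diploid genome without copy-number change, so that the expected variant allele fraction is half the fraction of mutated cells in the sample. The function names, the sample composition, and the detection limits are illustrative assumptions and are not taken from the cited studies.

```python
def expected_vaf(tumor_cell_fraction, subclone_fraction=1.0):
    """Expected variant allele fraction (VAF) of a heterozygous mutation.

    Simplifying assumptions (not from the source text): diploid genome,
    no copy-number change at the locus, one mutated allele per mutated cell.
    """
    for name, value in (("tumor_cell_fraction", tumor_cell_fraction),
                        ("subclone_fraction", subclone_fraction)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    # Fraction of all cells carrying the mutation, halved because only
    # one of the two alleles in each mutated cell is affected.
    return tumor_cell_fraction * subclone_fraction / 2.0


def is_detectable(tumor_cell_fraction, subclone_fraction, limit_of_detection):
    """True if the expected VAF reaches the assay's limit of detection."""
    return expected_vaf(tumor_cell_fraction, subclone_fraction) >= limit_of_detection


if __name__ == "__main__":
    # Hypothetical sample: 30 % tumor cells, mutation present in 40 % of them.
    print(f"expected VAF: {expected_vaf(0.30, 0.40):.2f}")
    # Hypothetical assay limits of detection, expressed as allele fractions.
    for lod in (0.20, 0.05, 0.01):
        print(f"limit of detection {lod:.2f}: detectable = "
              f"{is_detectable(0.30, 0.40, lod)}")
```

Under these assumptions, a subclonal mutation in a sample with low tumor cell content can fall below the detection level of a less sensitive assay while remaining detectable with a more sensitive one, which is why the tumor cell percentage estimated by the pathologist matters for the choice and interpretation of the assay.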

Tumor heterogeneity is becoming one of the main obstacles to cancer treatment in the era of precision medicine. Sequencing of multiple samples of primary clear cell renal cell carcinoma (ccRCC) and its metastatic sites revealed mind-boggling complexity, which allowed reconstruction of its evolutionary history in terms of gene abnormalities [17, 18]. Genetically distinct subclones were found in the primary tumor and distant metastases. Cancer is now viewed as a highly dynamic evolutionary disease [19], with continuing mutations favoring the emergence of new (sub)clones with distinct biological properties already in precursor lesions [20]. Subclones may show different patterns of interaction; they may compete, overtake other clones, parasitize, or peacefully co-exist. Based upon the ccRCC observations, a “trunk and branch” model has been proposed for tumor evolution [21]. Common events, found in every subclone of the tumor region, are represented in the trunk. Heterogeneous somatic events occurring in a limited number of subclones (or even a single subclone) represent the branches of the tree [21]. A similar level of heterogeneity has, as yet, not been found in other tumor types, but tumor heterogeneity does have a major impact on the interpretation of earlier studies and the design of future clinical trials. While targeting alterations in the trunk might be most promising as they occur in all tumor cells, many actionable targets exclusively occur in branches. This subclonal heterogeneity of treatment targets poses problems in diagnosis (tissue sampling error) and therapy (emergence of resistance). Treatment and eradication of a subclone may provide a growth advantage for a competing non-responsive subclone, which finally may kill the patient. This is probably why the efficacy of several targeted therapies was limited in time [21].

Molecular heterogeneity appears to be inherent to tumor evolution and has consequences for sampling procedures. We have studied the impact of sampling protocols on Her2/neu testing in gastric cancer. Her2/neu has been introduced as a predictive biomarker for the treatment of gastric cancer with trastuzumab [22]. Amplification of genes encoding receptor tyrosine kinases usually occurs in genomically unstable gastric cancers [23], which is why in gastric cancer, Her2/neu expression is intrinsically heterogeneous. We assessed Her2/neu status (according to the gastric cancer scoring system [24]) using a tissue microarray (TMA) approach, in which a tissue core serves as a surrogate for a biopsy procedure, and compared the results with those obtained on whole tissue sections cut from the same paraffin block. On the TMA cores, we obtained a “false-negative” rate of 24 % and a “false-positive” rate of 3 %. Similar observations regarding heterogeneity were made in gastric cancer for expression of MET [25] and microsatellite instability [26]. Heterogeneous expression of predictive biomarkers poses a major challenge in clinical trials, but this issue is often neglected. The phenomenon is often better explored in retrospective biomarker studies, once a treatment modality has been formally approved by a regulatory body. As an example, many studies on Her2/neu expression were carried out after the approval of trastuzumab for the treatment of gastric cancer (for a review, see [27]). These issues should be addressed prior to implementation of a trial: whether (expression of) the molecular target of the treatment is homogeneous (“trunk alteration”) or heterogeneous (“branch alteration”) should be explored in advance and translated into suitable biopsy procedures. However, many studies address treatment in later stages of cancer with palliative intent, and analyses rely on biopsy samples only, as patients may not be eligible for tumor resection.
Nevertheless, an essential requirement for any trial is the exploration of tumor heterogeneity before the trial takes off, to avoid sampling errors and favor representative tissue biopsies, which should then be assessed according to REMARK standards (Table 1) [12].
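How the number of biopsies affects the risk of missing a heterogeneously expressed marker can be sketched with a simple probability model. The sketch assumes that each biopsy independently hits a marker-positive area with a probability equal to the positive fraction of the tumor. This ignores the spatial clustering of subclones seen in real tumors, so the result is an optimistic approximation at best. The function names and the numbers in the example are hypothetical and do not derive from the studies cited above.

```python
def prob_all_negative(positive_fraction, n_biopsies):
    """Probability that every one of n independent biopsies misses the
    marker-positive tumor area (assumed model: independent random hits)."""
    if not 0.0 <= positive_fraction <= 1.0:
        raise ValueError("positive_fraction must be between 0 and 1")
    if n_biopsies < 0:
        raise ValueError("n_biopsies must not be negative")
    return (1.0 - positive_fraction) ** n_biopsies


def biopsies_needed(positive_fraction, max_miss_probability):
    """Smallest number of biopsies for which the probability of missing
    the positive area does not exceed max_miss_probability."""
    if not 0.0 < positive_fraction <= 1.0:
        raise ValueError("positive_fraction must be in (0, 1]")
    if not 0.0 < max_miss_probability < 1.0:
        raise ValueError("max_miss_probability must be in (0, 1)")
    n = 1
    while prob_all_negative(positive_fraction, n) > max_miss_probability:
        n += 1
    return n


if __name__ == "__main__":
    # Hypothetical tumor in which 20 % of the area expresses the marker.
    q = 0.20
    for n in (1, 5, 10):
        print(f"{n} biopsies: miss probability = {prob_all_negative(q, n):.3f}")
    print("biopsies for <= 5 % miss probability:", biopsies_needed(q, 0.05))
```

Even under this favorable independence assumption, a marker confined to a small part of the tumor requires many more samples than a single core or biopsy provides, which is in line with the argument for exploring heterogeneity before fixing the biopsy protocol of a trial.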

Table 1 Recommendations for tumor marker prognostic studies (REMARK)

Test and evaluation algorithms

Another obstacle is the choice of adequate test and evaluation algorithms. Test results of immunohistochemistry-based markers are sensitive to the choice of antibody, the staining protocol, and the microscopical evaluation procedure. The use of different antibodies and evaluation procedures, added onto dissimilar study populations, almost inevitably generates enormous variability in the final result, and the true prevalence of expression and the significance of a biomarker may be difficult to establish. With regard to Her2/neu, MET, and microsatellite instability in gastric cancer, we carried out a literature review and found that the reported prevalence ranges from 5 to 29 % for Her2/neu [28], from 3.8 to 85 % for MET [25], and from 0 to 44.5 % for microsatellite instability [26]. Similar observations were made for Her2/neu in breast cancer. In Australia, the Her2/neu-positivity rate decreased over a 4-year period from 23.8 % in 2006 to 14.6 % in 2010 [29]. Thus, standardization is mandatory not only for laboratory procedures but also for the evaluation of the staining result. In Germany, this has led to the introduction of the Her2-monitor, which serves as an external benchmark [30]. However, such benchmarks are not available for a considerable number of (future) actionable targets.

KRAS (for treatment with cetuximab) and Her2/neu testing (for treatment with trastuzumab) taught us further lessons. Test algorithms cannot be translated from one tumor type to another without solid experimental evidence. Testing for mutations of KRAS (and more recently also of NRAS, BRAF, and PIK3CA) in colon cancer allowed identification of patients (with a KRAS mutation) who do not respond to cetuximab [31]. This has improved progression-free and tumor-specific survival in the palliative setting [32], but the approach has not proved applicable in stomach cancer. Mutations in cancer genes do not occur in isolation but in an established genomic landscape, which differs between cell and tissue types. This “ground state” of the transformed cell may have a profound impact on the effect of a mutation, such as determining whether cell death or clonal expansion might ensue [19]. Somatic alterations occurring in the “omic” landscape of the stomach might not have the effect of the same mutation in the context of the colon. A somatic mutation may be a driver in one context but a passenger in another [19]. The understanding pathologists have of cell and tissue context makes their contribution to molecular testing particularly valuable.

Once again Her2/neu is a good example. The breast cancer scoring system was found to be unsuitable for gastric cancer, and it was deemed necessary to change the test algorithm [33]. As yet, there are no generally accepted approaches or guidelines on how to develop the test algorithm for a novel tissue-based biomarker. Current approaches are characterized by “trial and error.” The rationale underpinning a cutoff value in an immunohistochemical test is rarely provided. Cutoffs often follow statistical reasoning (e.g., splitting at the median) but rarely reflect tumor biology.

Implementing trial results: roll-out and quality assurance

In many countries, diagnostic standards are maintained through certification and accreditation. In the UK and the USA, clinical pathology is accredited according to the internationally recognized standard ISO 15189:2012 (www.ukas.com). In Germany, departments of pathology are accredited according to ISO/IEC 17020:2012 (www.dakks.de). In accreditation programs, an independent third party confirms compliance with standards and competence in laboratory and diagnostic procedures, which includes continuous medical education. The National Accreditation Body in Germany has published guidelines for the validation of immunohistochemical tests [34]. The College of American Pathologists has published principles for analytic validation of immunohistochemical assays [35]. As a result, standards in diagnostic pathology are high, but these standards are not always applied in clinical trials or validation studies, at least in part because what has become mandatory in diagnostic surgical pathology might not be applied in a research laboratory. In our literature review of Her2/neu [23], MET [25], and MSI testing [26], we did not find a single publication reporting participation in an external quality assurance program. Certification or accreditation of a surgical pathology research laboratory appears not to exist.

This raises the question of whether certification and accreditation of surgical pathology research laboratories are necessary or even feasible. In a translational study aimed at finding a prognostic or predictive tissue-based biomarker, external quality assurance is not appropriate; instead, good scientific and laboratory practice, including appropriate positive and negative controls, is essential. However, subsequent confirmatory studies anticipating clinical/diagnostic use (which is the case for the vast majority of Her2/neu studies in gastric cancer after formal approval of trastuzumab) should benefit significantly from external quality assurance programs.

Several examples illustrate the gap between phase III clinical trials leading to formal approval of a new drug in combination with a companion diagnostic and subsequent roll-out of the new test into clinical and surgical pathological practice. Narrowing this gap is necessary. Drug development, clinical trials, and roll-out are expensive and time consuming and, when not carried out according to the guidelines recommended by REMARK and STARD (Tables 1 and 2, respectively) [12, 13], may end up being disastrous for the patient. Quality assurance programs should also become mandatory in biomarker validation studies and in the roll-out of new biomarker tests. A minimum requirement during roll-out of clinical trial results with companion diagnostics would be implementation of an external quality assurance program as soon as the drug that requires a companion diagnostic is approved by the European Medicines Agency. This has become the standard approach of the German Quality Initiative for Pathology (QuIP; a collaborative initiative of the German Society for Pathology (Table 3) and the Bundesverband Deutscher Pathologen e.V.).

Table 2 Statement for reporting studies of diagnostic accuracy (STARD)
Table 3 Statement of the German Society for Pathology

Common pitfalls and key issues

Recently, the US Food and Drug Administration (FDA) published common pitfalls and challenges in their companion diagnostic review and approval process [36]. These include the use of multiple tests on patient specimens to select patients for a trial, the lack of analytical validation prior to use in the trial, inappropriate specimen types used to validate the assay, missing samples, the instability of analytes during storage, the assessment of drug efficacy in only one subset defined by the test, and the retrospective assignment of cutoffs in a pivotal trial. Almost all of these also apply to tissue-based diagnostics. The proposals provided by the FDA to improve on these key issues are particularly relevant:

1) A companion diagnostic development plan should be included as part of the development plan of the drug under study when biomarker-based conclusions about drug safety and efficacy are anticipated.
2) The final version of the test should be used to screen patients for the trial.
3) Drug and device claims rely on prespecified device design, and analytical validation prior to initiation of a study is critical to planning patient enrollment.
4) A plan for appropriate banking and annotating of patient specimens (both test negative and test positive) and assuring storage that does not impact on test results will be critical to future bridging studies [36].

A problem with the latter is that it will often conflict with regulations governing patient service outside clinical trials. Many patients are enrolled in clinical trials only after a diagnostic biopsy or therapeutic resection specimen has been obtained, and such tissue samples were obtained for diagnostic and not for research purposes.

Do clinical trials need to interfere with diagnostic pathology service?

Surgical pathologists not involved in clinical trials are increasingly challenged in daily practice by the requirements of an ever increasing number of clinical trials. They are requested to provide tissue samples entrusted to them for diagnostic purposes only, to supply additional study-related information, or even to perform additional studies on the tissue samples without having any primary research/study intention themselves. This confronts pathologists in their role as tissue trustee with a multitude of unresolved issues, including heterogeneity of the research landscape (who finances the study, what are the aims of the study, does patient informed consent cover the request), study-associated tissue collections, proprietorship, study results, data protection, and compensation of expenses (for a review see [5]). The German Society for Pathology recently summarized the problems increasingly encountered in diagnostic pathology outside clinical trials and published a proposal [5], which explicitly recommends that pathologists be actively involved in study planning, implementation, and data analysis, as well as in the use of cell and tissue material. By doing so, pathologists assume their dual role as competent diagnostician and expert advisor in specific issues regarding tissue-based research and diagnostics. In such a position, pathologists would be instrumental in avoiding protocols, procedures, and regulations that are disadvantageous for health care, hamper the study, and disturb interactions between involved parties. The study pathologist would ensure that pathology-specific standards are met comprehensively in cell- and tissue-based research and clinical studies including diagnostics and that the study requirements are not in conflict with principles of good practice. When central pathology review is part of the study, the study pathologist must have an outstanding level of expertise and recognized professional status.
Trials and studies which include analyses of tissue but do not use tissue-based analytical methods (e.g., histopathology, immunohistochemistry, in situ hybridization) also require expert histological characterization of tissue samples in order to assure adequate sample selection based upon relevant parameters (e.g., tumor tissue vs. non-neoplastic tissue, necrotic vs. vital tissue, percentage of tumor tissue in the sample). This is the only way in which potentially irrelevant results or inadequate interpretation of the results can be avoided [5].