Introduction

Radiologic images have been used for decades to gauge the effectiveness of therapeutic interventions [1]. Increasingly, novel quantitative imaging techniques are being incorporated into oncology clinical trials, where they serve as surrogate biomarkers for various aspects of tumorigenesis or as indicators that facilitate evaluation of the efficacy of experimental therapies. Moreover, properly designed imaging studies can significantly affect the size, duration, cost, and success of clinical trials and ultimately affect patient care. Indeed, in this modern era, which has witnessed the development of targeted therapies and personalized medicine, development of a tissue biomarker that predicts sensitivity to a targeted therapy has become an essential step in the clinical success of a novel anticancer agent. Imaging can help identify such predictive tissue biomarkers, which can allow us to subdivide tumors into sensitive and resistant populations. In day-to-day oncology practices, imaging has been widely used to assist radiologists in the early detection of metastasis and in identification of ineffective and toxic therapies so that patients can be promptly switched to an alternative treatment option.

Classic imaging approaches, however, may be less appropriate for many of the new cancer therapies now being developed. For 30 years, the standard way to assess a patient’s response to treatment in both clinical trials and clinical practice has been to monitor tumor changes measured bidimensionally per World Health Organization (WHO) criteria [2, 3] or, since 2000, unidimensionally using the Response Evaluation Criteria in Solid Tumors (RECIST) guidelines [4, 5]. Many of the new classes of anticancer therapies, though, are cytostatic drugs that may shrink tumors more slowly, or to a lesser extent, than previous generations of cytotoxic chemotherapies. Instead of size reduction, therapy-induced tumor changes may be associated with development of central necrosis or other complex changes. Such new patterns of change seen on radiographic images are challenging traditional response assessment methods, which are based on measuring tumor diameters, predominantly on longitudinal computed tomography (CT) or magnetic resonance imaging (MRI) scans.

While functional and molecular imaging techniques, e.g., positron emission tomography (PET) and dynamic contrast-enhanced (DCE) MRI, hold great promise, they are immature, expensive, rarely accessible, and prone to measurement variability. In contrast, CT is well developed and globally ubiquitous and is the standard clinical imaging modality for monitoring the growth of solid tumors. Furthermore, in clinical trials, there are many quantitative endpoints that depend upon CT findings and may correlate with overall survival. These endpoints include (but are not limited to) objective response rate (ORR), progression-free survival (PFS), disease-free survival (DFS), and time to progression (TTP). Ultimately, in monitoring therapies, the goal of imaging is for the images to serve as a successful surrogate endpoint for a patient’s response to therapy. Thus, tumor shrinkage that is quantified by ORR, PFS, etc., as measured on CT should ideally correlate with prolonged survival. CT has also facilitated identification of target lesions in baseline examinations, detection of new lesions in follow-up studies, and confirmation of tumor responses after completion of therapy.

In this chapter, we will take CT as an example to address the role and progress of medical imaging and image analysis techniques in monitoring tumor responses to therapies in oncology clinical trials and clinical care. We will start with a brief overview of conventional response assessment methods and then address limitations of these standard response criteria, especially in the era of therapies targeting specific molecules. We will then introduce revised and modified RECIST criteria for lymphoma, mesothelioma, hepatocellular carcinoma (HCC), and gastrointestinal stromal tumor (GIST). Last but not least, we will discuss the use of volumetric CT to improve tissue biomarker discovery for novel therapies in non-small cell lung cancer (NSCLC).

Conventional Response Assessment Methods

Tumor change with therapy plays a fundamental role in medical oncologic care. A reduction in tumor size, termed a “response” to therapy, indicates that the patient is gaining some degree of benefit from treatment. In contrast, an increase in tumor size, termed “disease progression,” suggests a tumor that is refractory to therapy and that a change of treatment is needed. In clinical trials and also in clinical care, tumor sizes are measured mainly on CT, and based on size changes, tumor responses to therapies are generally assessed by WHO and by the currently recommended RECIST guidelines.

WHO Criteria

The first guideline, known as the WHO criteria, that attempted to use objective metrics to assess tumor responses to therapy and to standardize reports of clinical outcomes from cancer treatment trials was codified by the WHO and published in 1981 [2, 3]. The WHO criteria utilize the cross product (i.e., bidimensional measurement) of the greatest diameter of the tumor and its greatest perpendicular diameter in a transverse plane to approximate tumor burden. Based on the change in the sum of these cross products of tumors, the WHO criteria recommend reporting results of cancer treatment using the following four categories: complete response (CR), partial response (PR), stable disease (SD), and disease progression (PD) (Table 8.1). A size reduction of 50 % or more from the baseline study was considered to be a PR, whereas a size increase of 25 % or more was deemed to be PD. The presence of any new lesion would be considered PD, and any “substantial” enlargement in tumor size that was not easily measured would also be considered PD.
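The bidimensional bookkeeping described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function names are hypothetical, diameters are taken in millimeters, change is computed against the baseline sum of products for both PR and PD, and the qualitative rule about "substantial" unmeasurable enlargement is not modeled.

```python
from typing import Sequence, Tuple

# One lesion = (greatest diameter, greatest perpendicular diameter), in mm.
Lesion = Tuple[float, float]


def sum_of_products(lesions: Sequence[Lesion]) -> float:
    """Sum of the cross products (bidimensional measurements) of all lesions."""
    return sum(d1 * d2 for d1, d2 in lesions)


def who_category(baseline: Sequence[Lesion],
                 followup: Sequence[Lesion],
                 new_lesion: bool = False) -> str:
    """Assign CR, PR, SD, or PD from the change in the sum of cross products.

    Assumption (simplification): change is measured against baseline only.
    """
    base = sum_of_products(baseline)
    if base <= 0:
        raise ValueError("baseline tumor burden must be positive")
    if new_lesion:
        return "PD"  # any new lesion counts as progression
    current = sum_of_products(followup)
    if current == 0:
        return "CR"  # all measured disease has disappeared
    change = (current - base) / base
    if change >= 0.25:
        return "PD"  # increase of 25 % or more
    if change <= -0.50:
        return "PR"  # reduction of 50 % or more
    return "SD"
```

For example, a single lesion shrinking from 40 × 30 mm to 25 × 20 mm reduces the cross product from 1,200 to 500 mm², a reduction of about 58 %, which this sketch reports as PR.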

Table 8.1 The WHO and the RECIST response criteria

RECIST

In the middle of the 1990s, the European Organization for Research and Treatment of Cancer (EORTC), the National Cancer Institute (NCI) of the USA, and the NCI of Canada trials group set up a task force to review existing response assessment criteria. Based upon a retrospective review of clinical trials involving approximately 4,000 patients, and considering advances that had been made, particularly in medical imaging technologies, a new set of guidelines for assessing the response of solid tumors to anticancer therapies was released in 2000 [4]. Known as RECIST, these recommendations included the adoption of a simplified approach to measuring tumors utilizing only the greatest diameter (unidimensional measurement) and the sum of the greatest diameters of the target lesions. RECIST defines the rules to select target lesions on baseline scans including the number (up to 10 per patient and 5 lesions per organ) and the size of target lesions (≥10 mm). The establishment of these new criteria was guided by a number of important principles: (1) the need to maintain the standard four-response category system (i.e., CR, PR, SD, PD) (Table 8.1), with a size reduction of 30 % or more for PR and a size increase of 20 % or more for PD; (2) the goal of maintaining consistency of results such that no major discrepancy in the meaning of PR would exist between the older WHO criteria and the new RECIST criteria; (3) the recognition of both the arbitrary nature of the cutoff value for PR and the need to maintain this cutoff until other potentially more reliable or powerful surrogates could be developed; (4) concern about categorizing patients as PD too easily; and (5) recognition that cytostatic agents may not have the same measurement “activity” and that other serum markers and specific tumors may present unique challenges.

Assuming that a tumor is spherical and changes size symmetrically, a size reduction of 30 % defined by the unidimensional RECIST method corresponds to a size decrease of approximately 50 % by the bidimensional WHO criteria. To address the concern about declaring PD too soon, RECIST proposed a size increase of 20 %, which is equivalent to a size increase of 44 % by the WHO method (Table 8.2). Consequently, when clinical trials evaluated by RECIST are compared with older studies, tumor progression may be detected later because of the increased threshold for PD. Since its establishment in 2000, the unidimensional RECIST guideline has been widely accepted as the standard method for assessing tumor responses to systemic therapies.
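The arithmetic behind these equivalences is straightforward to reproduce. The sketch below assumes, as the text does, a spherical tumor that scales uniformly, so the cross product scales with the square of the diameter and the volume with its cube. The function name is illustrative only.

```python
def equivalent_changes(diameter_change_pct: float) -> dict:
    """Convert a percentage change in diameter into the matching percentage
    changes in bidimensional product and volume for a uniformly scaling sphere."""
    scale = 1.0 + diameter_change_pct / 100.0  # e.g. -30 % -> 0.7
    return {
        "diameter": diameter_change_pct,
        "product": (scale ** 2 - 1.0) * 100.0,  # area-like quantity
        "volume": (scale ** 3 - 1.0) * 100.0,   # volume-like quantity
    }


if __name__ == "__main__":
    for pct in (-30.0, 20.0):
        result = equivalent_changes(pct)
        print(f"diameter {pct:+.0f} % -> product {result['product']:+.1f} %, "
              f"volume {result['volume']:+.1f} %")
```

Under this model a 30 % decrease in diameter corresponds to a 51 % decrease in the product and about a 66 % decrease in volume, while a 20 % increase in diameter corresponds to a 44 % increase in the product and about a 73 % increase in volume.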

Table 8.2 Relationship between changes in diameter, product, and volume

RECIST 1.1

Continuous evaluation and updating of the RECIST guidelines was suggested by the RECIST Working Group at the time the criteria were published. Based on an intensive analysis of data collected for more than 6,000 clinical trial patients [6] and the reported inadequacies of using the RECIST criteria in prospective clinical trials [7], a revised version of RECIST (version 1.1) was published in 2009 [5]. Major modifications in this new release included (1) a reduction in the number of target lesions from 10 to 5 per patient and from 5 to 2 per organ, (2) the need for response confirmation only in nonrandomized trials and only where response is the primary endpoint, (3) use of the short axis to measure malignant lymph nodes, and (4) requirements for both a 20 % increase and a minimum absolute increase of 5 mm in the sum of all target lesions’ diameters for PD. The Working Group believed that it was not yet time to adopt volumetric and functional assessments (e.g., DCE MRI, DCE CT, or 18F-fluorodeoxyglucose (FDG) PET) because these techniques have not been standardized, are not widely available, and have not received thorough clinical validation.
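A minimal sketch of how the target-lesion thresholds of RECIST 1.1 could be applied is given below. It is a simplification rather than a full implementation of the guideline: the function name is hypothetical, non-target lesions and response confirmation are ignored, CR is reduced to a zero sum of diameters (the guideline's lymph node short-axis rule is not modeled), and progression is measured against the smallest sum recorded on study (the nadir), as the guideline specifies.

```python
def classify_recist11(baseline_sum_mm: float,
                      nadir_sum_mm: float,
                      current_sum_mm: float,
                      new_lesion: bool = False) -> str:
    """Classify target-lesion response from sums of diameters (in mm).

    baseline_sum_mm: sum of target lesion diameters at baseline
    nadir_sum_mm:    smallest sum recorded so far (baseline included)
    current_sum_mm:  sum at the current assessment
    """
    if baseline_sum_mm <= 0:
        raise ValueError("baseline sum must be positive")
    if new_lesion:
        return "PD"
    increase = current_sum_mm - nadir_sum_mm
    # PD requires BOTH a 20 % relative increase and a 5 mm absolute increase.
    if increase >= 0.20 * nadir_sum_mm and increase >= 5.0:
        return "PD"
    if current_sum_mm == 0:
        return "CR"  # simplification: all target lesions have disappeared
    if current_sum_mm <= 0.70 * baseline_sum_mm:
        return "PR"  # decrease of 30 % or more from baseline
    return "SD"
```

The 5 mm rule matters for small tumor burdens: a sum that rises from a nadir of 10 mm to 13 mm has grown by 30 %, but because the absolute increase is only 3 mm the sketch does not call it PD.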

Limitations of Conventional Response Assessment Methods

Limitations of RECIST have been well described [4, 5]. First, changes in tumor maximal diameters measured on an axial plane between longitudinal imaging scans cannot fully capture changes in total tumor burden, especially along the z-axis (Fig. 8.1a, b). Second, the response cutoffs (e.g., 50 % or more reduction in the sum of tumor bidimensional measurements is considered to be a PR by the WHO) were developed by evaluating the measurement error of antiquated response assessment modalities used during the 1970s and early 1980s (i.e., physical palpation or plain X-ray measurements) [8, 9]. These cutoff values probably do not reflect the variability in tumor diameters measured on modern CT scans with today’s measurement tools (e.g., an electronic ruler on a diagnostic workstation) [10, 11]. Third, conventional response assessment methods disregard changes in tumor composition, such as tumor necrosis or decreases in tumor density, which offer a potential new dimension for evaluating the anticancer effects of antiangiogenic agents with anatomical imaging.

Fig. 8.1

Asymmetric growth of lung cancer. Two examples of NSCLC tumors (a, b) taken from a clinical trial testing gefitinib. (a) Line direction along which the greatest tumor diameter was measured on baseline and follow-up scan images changed. Percentage changes in unidimensional, bidimensional, and volume measurements were 1.1, 9.8, and 24.9 %, respectively. (b) Unidimensional and bidimensional measurements did not detect tumor change, but the volumetric technique did. Percentage changes in unidimensional, bidimensional, and volume measurements were 1.2, 3.2, and 33.4 %, respectively (b: Used with permission from Zhao et al. [43])

Revised and Modified RECIST and Beyond

Over the past decade, revised and modified RECIST guidelines have been suggested for certain types of tumors that do not lend themselves to unidimensional or bidimensional measurements due to their origin, extent, posttreatment residue, and density changes with targeted therapies. In the remaining sections of this chapter, we will discuss, for selected types of cancer, the limitations of RECIST and the revisions and modifications that have been proposed and that are being, or will be, evaluated.

Lymphoma

Lymphoma usually resides in the normal structure of the lymph nodes. Variations in the size of normal nodes can make selection of target lesions at baseline scans and determination of new lesions at follow-up scans both difficult and inconsistent. Furthermore, posttreatment residual masses often consist of non-tumor components such as fibrosis, necrosis, or inflammation that can be indistinguishable from tumors on CT and affect classification of CR and PR rates in clinical trials [12].

To resolve inconsistencies that can arise in lymphoma clinical trials, in 1999 an international working group (IWG) of lymphoma experts published a set of guidelines, based on their consensus, for the standardization of response assessment in adult patients with indolent and aggressive non-Hodgkin’s lymphomas (NHL) [13]. The IWG criteria specifically defined a posttreatment size range of normal lymph nodes by taking into account the baseline size. For lymph nodes greater than 1.5 cm at baseline, a CR should be declared if these nodes have regressed to less than 1.5 cm after therapy. For nodes less than 1.5 cm but greater than 1.1 cm at baseline, the size of the nodes after therapy should be no larger than 1.0 cm in order to qualify as a CR. The IWG guidelines have provided clinicians with uniform criteria to interpret and assess outcomes of lymphoma clinical trials. However, these criteria cannot differentiate viable tumor components from necrosis or fibrosis [14]. With the increased availability of PET with 18F-FDG radiotracers and the use of immunohistochemistry and flow cytometry, an International Harmonization Project (IHP) significantly revised the IWG criteria for lymphoma clinical trials [15]. The new IHP criteria evaluate all types of lymphomas and tumor responses to therapy by jointly considering tumor changes measured on both FDG PET and CT (Table 8.3). Additionally, the IHP criteria suggest post-therapy time intervals at which response should be assessed (i.e., after 3 weeks or more to evaluate the effects of chemotherapy and between 6 and 12 weeks to evaluate chemoimmunotherapy and radiation therapy).

Table 8.3 IHP response criteria for lymphoma

Mesothelioma

Malignant pleural mesothelioma (MPM) typically grows as a rind of tumor encasing the lung in an irregular pattern (Fig. 8.2a). Measuring tumor size by placing the “longest in-plane diameter” per RECIST can be problematic and is unlikely to capture the change in tumor burden. Lack of reproducibility due to the circumferential and axial growth patterns of MPM is another major problem when using RECIST to assess tumor changes [16]. Because of these limitations and recent reports on the inadequacy of the RECIST criteria in the response assessment of MPM [17–19], current practice is to modify the RECIST criteria so that they can better capture the unique growth pattern of MPMs [20].

Fig. 8.2

Measuring pleural mesothelioma using conventional RECIST and modified RECIST. (a) Lines represent possible interpretations of “greatest tumor diameter” per conventional RECIST. (b) Lines represent suggested measurement sites that are perpendicular to fixed structures such as chest wall and vertebral column, according to the modified RECIST guidelines (Used with permission from Byrne and Nowak [20])

The modified RECIST criteria for MPM measure tumor thickness perpendicular to fixed anatomical structures such as the chest wall, mediastinum, or vertebral column at two sites for each of three separate levels on transverse CT planes (Fig. 8.2b). The unidimensional measurement is defined as the sum of the six tumor thicknesses, and the response evaluation follows the RECIST guidelines. In the modified RECIST criteria, the anatomical landmarks where measurements should be taken have been defined to improve measurement reproducibility on serial scans. Using the modified RECIST method to reevaluate two clinical trials, Byrne et al. found no change in response rate as assessed originally by the WHO criteria. However, the median survival for responding patients was significantly higher than that for nonresponding patients (15.1 months vs 8.9 months, p = 0.03) [20].
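The measurement scheme of the modified RECIST criteria for MPM can be illustrated as follows. This sketch only shows how the six thickness measurements combine into one unidimensional value and how its change is computed. The function names and the example thicknesses are hypothetical, and the response category would then be assigned with the usual RECIST thresholds.

```python
from typing import Sequence, Tuple

# One CT level = thickness (mm) at two measurement sites.
Level = Tuple[float, float]


def mpm_unidimensional_sum(levels: Sequence[Level]) -> float:
    """Sum the six pleural thickness measurements (two sites on each of
    three transverse levels) into a single unidimensional measurement."""
    if len(levels) != 3 or any(len(level) != 2 for level in levels):
        raise ValueError("expected three levels with two sites each")
    return sum(sum(level) for level in levels)


def percent_change(baseline_mm: float, followup_mm: float) -> float:
    """Percentage change of the follow-up sum relative to baseline."""
    if baseline_mm <= 0:
        raise ValueError("baseline sum must be positive")
    return (followup_mm - baseline_mm) / baseline_mm * 100.0


if __name__ == "__main__":
    baseline = mpm_unidimensional_sum([(12, 10), (15, 9), (8, 6)])  # hypothetical values
    followup = mpm_unidimensional_sum([(8, 7), (10, 6), (5, 4)])    # same sites, later scan
    print(f"{baseline} mm -> {followup} mm ({percent_change(baseline, followup):+.1f} %)")
```

Because the same anatomical sites are measured on every scan, the follow-up measurements must be taken at the positions recorded at baseline. The code assumes the two lists are already matched in that way.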

In the late 1990s, Pass and colleagues published a study showing that the preoperative tumor volume was representative of tumor T status in MPM and predictive of overall and progression-free survival as well as postoperative stage [21]. However, lack of automated or semiautomated volume quantification tools prevented further validation of these important findings. Recently, with the help of a computer algorithm, Fan et al. found a strong association between MPM patient survival and change in tumor volumes measured at two cycles after the onset of induction chemotherapy [22].

HCC and GIST

HCC is one of the most common malignancies worldwide and is the fastest growing cancer in the USA. GIST is a classic tumor model for the development, in modern drug discovery, of anticancer therapies that target specific molecules such as enzymes or receptors rather than killing cells. Many such therapeutic agents have been proposed for these two cancers. The size-based RECIST method, especially with an arbitrary 30 % cutoff value defining a response (i.e., a 30 % or greater reduction in tumor size), has been shown to be misleading in the evaluation of tumor responses. Indeed, responding tumors may only minimally decrease in size or even slightly increase in size, but they may undergo internal necrosis and hemorrhage, hyalinization, and fibrosis [23–25]. Accurate and sensitive response assessment methods are thus imperative for the success of these clinical trials as well as for continued discovery of novel, target-specific, anticancer agents.

HCC: mRECIST for Locoregional Treatments

A variety of locoregional treatments have been developed for HCC in the past decade. However, such therapies are hard to evaluate by WHO and RECIST criteria because of the development of central necrosis, an outcome of all effective locoregional therapies.

In 2000, an expert panel on HCC organized by the European Association for the Study of the Liver (EASL) revised the response assessment for HCC by taking into account therapy-induced tumor hypodense areas and necrosis [26]. The concept of viable tumor, i.e., an enhanced tumor component in the arterial phase of dynamic CT or MRI, was then used to assess HCC responses to therapies and was soon accepted by the American Association for the Study of Liver Diseases (AASLD) [27]. To standardize a growing number of complex HCC clinical trials, a formal guideline, called the modified RECIST assessment (mRECIST), was subsequently established for HCC by the AASLD panel and was published in 2008 [28]. The mRECIST method emphasizes the importance of standardization of dynamic contrast-enhanced imaging techniques because the viable tumors are best depicted and measured on arterial-phase images. Instead of measuring an entire tumor to assess treatment response, the mRECIST method suggests that one should measure the longest diameter of only the viable tumor component in each tumor area (Fig. 8.3a, b) [29]. The four categories and the corresponding cutoff values for tumor response and progression used by the RECIST guidelines, however, remain unchanged.

Fig. 8.3

Use of mRECIST criteria in the assessment of HCC responses to therapy. Target tumor response measurements on arterial-phase CT scans. (a) Measurement of greatest overall tumor diameter according to conventional RECIST criteria and (b) measurement of greatest viable tumor diameter according to the mRECIST method for HCC (Used with permission from Lencioni and Llovet [29])

GIST: Choi’s Criteria for Targeted Therapies

A number of research groups reported significant underestimation of tumor responses by the RECIST method while monitoring GISTs treated with imatinib mesylate, a targeted therapy [30–32]. Although PET scanning has proven useful for monitoring tumor responses in GIST patients, the high cost of PET, its lack of worldwide availability, and its lack of standardization led Choi and her colleagues to develop alternative CT criteria in evaluating responses for patients with GIST [32].

Choi’s criteria define a 10 % decrease or more in a unidimensional measurement or a 15 % decrease or more in density (as measured by Hounsfield units) on a selected image plane as a partial response [32]. In a study of metastatic GIST patients treated with imatinib, the group reported that Choi’s criteria reached a sensitivity of 97 % and a specificity of 100 % in identifying the responders assessed by PET, whereas the RECIST method only had a sensitivity of 52 %, though specificity was also 100 % [32].
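The partial-response rule of Choi’s criteria combines a size test and a density test, either of which is sufficient. A minimal sketch is shown below. The function name and argument names are hypothetical, and the code assumes a positive baseline diameter and a positive baseline density in Hounsfield units so that percentage decreases are well defined.

```python
def choi_partial_response(baseline_diameter_mm: float,
                          followup_diameter_mm: float,
                          baseline_density_hu: float,
                          followup_density_hu: float) -> bool:
    """Return True if either Choi threshold for partial response is met:
    a decrease of 10 % or more in the unidimensional measurement, or
    a decrease of 15 % or more in CT density (Hounsfield units)."""
    if baseline_diameter_mm <= 0 or baseline_density_hu <= 0:
        raise ValueError("baseline diameter and density must be positive")
    size_decrease = (baseline_diameter_mm - followup_diameter_mm) / baseline_diameter_mm
    density_decrease = (baseline_density_hu - followup_density_hu) / baseline_density_hu
    return size_decrease >= 0.10 or density_decrease >= 0.15
```

For instance, a tumor that shrinks only from 100 to 92 mm (8 %) but whose density falls from 60 to 48 HU (20 %) meets the density threshold and would be counted as a partial response here, although it falls far short of the 30 % size reduction required by RECIST.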

HCC: Necrosis to Tumor Ratio

Necrotic lesions frequently develop a cyst-like appearance without a significant change in anatomic dimensions, when evaluated by CT or MRI. In a pivotal clinical trial of sorafenib in HCC, a partial response according to the WHO criteria occurred in only 2 % of subjects [33]. However, there was clear clinical benefit as 33.6 % of patients had stable disease (SD) for ≥16 weeks, and central tumor necrosis in response to sorafenib was common. Using baseline and follow-up triphasic CT scans, Abou-Alfa et al. then calculated lesion and necrosis volumes with the help of computer software. They found that the necrosis to tumor ratio (N/T) was significantly associated with responses, with responders having greater increases (in the ratio between necrosis volume and tumor volume) relative to baseline, as compared to nonresponders (Fig. 8.4a–d). The study did not show an association between the N/T ratio and overall survival [34].
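The necrosis to tumor ratio is a simple quotient of two segmented volumes. The sketch below is not the software used in the study. The function names are hypothetical, and it assumes that the tumor volume is the whole lesion volume, necrosis included, so that the ratio lies between 0 and 1.

```python
def necrosis_tumor_ratio(necrosis_volume_ml: float, tumor_volume_ml: float) -> float:
    """Ratio of necrosis volume to total tumor volume (N/T), from 0 to 1."""
    if tumor_volume_ml <= 0:
        raise ValueError("tumor volume must be positive")
    if not 0 <= necrosis_volume_ml <= tumor_volume_ml:
        raise ValueError("necrosis volume must lie between 0 and the tumor volume")
    return necrosis_volume_ml / tumor_volume_ml


def nt_change_from_baseline(baseline_ratio: float, followup_ratio: float) -> float:
    """Absolute change in N/T relative to baseline (positive = more necrosis)."""
    return followup_ratio - baseline_ratio


if __name__ == "__main__":
    # Hypothetical volumes chosen only to give round ratios.
    baseline = necrosis_tumor_ratio(34.0, 100.0)
    followup = necrosis_tumor_ratio(47.0, 100.0)
    print(f"N/T change: {nt_change_from_baseline(baseline, followup):+.2f}")
```

Tracking the change in the ratio rather than the raw necrosis volume keeps the measure meaningful when the overall lesion size also changes between scans.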

Fig. 8.4

Contrast-enhanced CT scans of an HCC patient enrolled in a growth factor inhibitor trial where target lesion tumors are outlined in blue and necrosis is outlined in red. (a) At baseline, the ratio of necrosis volume to tumor volume (N/T) was 34 %; (b) at 6 weeks post-therapy there was less necrosis and a decrease in the N/T ratio to 20 %. It was considered to be an ineffective treatment; (c) the patient was then switched to sorafenib, with an increase in the N/T ratio up to 47 %; (d) density histogram of the tumor at baseline (blue), at the first follow-up (red), and at the final time point (turquoise)

HCC: Choi’s Criteria for Targeted Therapies

Studies on HCC treated with sunitinib reported very low response rates (<10 %) when evaluated by RECIST criteria, even though improved patient survival was observed [33, 35, 36]. Increases in hypodense regions rather than reductions in tumor size were often seen in the sunitinib-treated tumors. Thus, the antitumor effects of antiangiogenic agents in HCC could not be captured by the RECIST criteria. For a recently published phase II clinical study of HCC patients treated with sunitinib, Faivre and colleagues reported a drastically increased response rate from 3.8 % (by RECIST criteria) to 65.4 % (by Choi’s criteria). They also found that, using Choi’s criteria, responding patients had a significantly longer time to progression than nonresponders (7.5 months vs 4.8 months, p = 0.0182). However, no significant difference in median overall survival between the two groups was observed [37].

NSCLC

In 2004, somatic activating mutations in the tyrosine kinase (TK) domains of the epidermal growth factor receptor (EGFR) gene were discovered in a subset of non-small cell lung cancer (NSCLC) patients who showed remarkable responses to an inhibitor of the TK within the EGFR [38–40]. This has opened the door to genotype-directed therapies for NSCLC. That is, treatment options can be selected based on an individual patient’s clinical characteristics and tumor biology, and novel therapies are developed to target key genetic mutations.

The rapid increase in the number of systemic agents in NSCLC has demanded novel biomarkers that better determine drug-induced tumor changes. Such biomarkers should be indicative of underlying biologic processes in the tumor and be of high precision so that they can serve as a valuable means for screening of promising anticancer agents. One potential technique of particular interest is volumetric CT, a technology with the potential to more accurately capture tumor growth dynamics. In a phase II neoadjuvant NSCLC trial [41], Zhao and colleagues incorporated an analysis of thin-section volumetric CT scans to determine the value of early radiographic changes (i.e., 3 weeks post-therapy) in predicting the biologic activity of gefitinib therapy in a subset of EGFR-enriched NSCLC patients (Fig. 8.5a–c). The authors found that, compared to unidimensional measurements, volume measurements allow significantly better dichotomization of these molecular subtypes, indicating that volume change has promise as an investigational method for early detection of the biologic activity of a systemic therapy in NSCLC [42].

Fig. 8.5

Volumetric CT technique in tissue biomarker discovery. (a) Shows an EGFR mutant NSCLC (likely response to gefitinib therapy). Changes in unidimensional and volume measurements were −4.4 and −52.4 %, respectively. The volumetric technique detected tumor change at 3 weeks post-therapy, whereas the diameter technique did not. (b) Shows a KRAS mutant NSCLC (resistant to gefitinib therapy). Changes in unidimensional and volume measurements were −5.2 and −10.0 %, respectively. Both measurement techniques showed no real tumor change at 3 weeks post-therapy. (c) Receiver operating characteristic curve (ROC) shows a significantly higher area under the curve (AUC) of the volumetric measurement than unidimensional measurement, indicating that early volumetric change is a better metric for predicting EGFR mutation status than is early diameter change (a, c: Used with permission from Zhao et al. [42])

Summary

Insensitive methods for evaluating patient responses to cancer treatments can delay drug discovery and mislead patient management. The conventional RECIST guideline has pitfalls when applied to solid tumors whose origin, morphology, and extent are not suitable for linear measurements (e.g., lymphoma and mesothelioma) or when used to assess treatments that may not necessarily reduce tumor size (e.g., interventional and targeted therapies in HCC). In the past decade, revisions and modifications of the RECIST criteria were proposed for certain types of cancers treated with these novel therapies, aiming at better assessing drug efficacy. Yet, the full potential of modern CT and computerized image analysis in accurately assessing tumor changes over time, and thus optimally interpreting tumor responses to therapies, has not been well explored.

Even though the RECIST criteria are considered the standard guidance for evaluating modern clinical trials, their response cutoff values have not been validated against state-of-the-art CT and advanced tumor measurement tools. In a recent study, Zhao and her colleagues revealed, for the first time, the magnitudes of the variability in tumor unidimensional, bidimensional, and volumetric measurements made on two repeat CT scans performed within 15 min in patients with NSCLC [10]. The high reproducibility of both radiologists’ and computer-aided tumor size measurements suggests that a thorough reevaluation of conventional RECIST criteria should be done for assessing novel targeted therapies.

There is no doubt that volume measurements are more accurate in quantifying changes in tumor burden than current diameter measurements. Changes in necrosis to tumor ratio and/or in tumor density can be quantified by computer-aided algorithms. However, to be accepted by the oncology community as better imaging biomarkers of tumor responses, these new quantitative metrics along with the proposed (or to be proposed) response criteria need to be validated in prospective multicenter clinical trials. The new metrics should be reproducible, and tumor responses to therapies assessed by any new criteria should be correlated with clinical outcomes (e.g., survival).

There is an ever-improving understanding of tumor biology and of the underlying mechanisms of interactions between drugs and cancers. There are also advances in medical imaging technologies and computerized image analysis methods. Therefore, optimal strategies to monitor tumor responses to novel therapies at anatomical, functional, or molecular levels should be developed by jointly considering the best possible imaging modalities, by standardizing imaging acquisition techniques, and by developing advanced response assessment methods.