Introduction

Metastasis to bone is a common pattern of breast cancer relapse, with complications of bone pain, hypercalcemia, myelopathy, spinal cord compression, and pathologic fractures, Nielson et al. [1] and Mundy [2]. The imaging studies that best detect bone metastases, bone scan and magnetic resonance imaging (MRI), depict tumor sites largely on the basis of the tumor’s effect on adjacent bone, Vanel et al. [3] and Bares [4]. These modalities have high sensitivity for detecting bone metastases on this basis, underlying their use for breast cancer staging, Bares [4], O’Sullivan and Cook [5] and Ghanem et al. [6]. However, for bone scan and MRI, changes in the appearance of bone lesions with effective treatment occur slowly, or even paradoxically, as exemplified in the phenomenon of bone scan “flare,” making the evaluation of treatment response difficult, Hamaoka et al. [7], Schneider et al. [8], Vogel et al. [9], Maffioli et al. [10]. In clinical trials evaluating novel systemic therapies for metastatic breast cancer, patients with bone-only or bone-dominant disease are often excluded due to the lack of measurable disease by the Response Evaluation Criteria in Solid Tumors (RECIST).

FDG PET is increasingly applied to breast cancer staging and response evaluation, Eubank and Mankoff [11]. FDG PET appears to image bone metastases in a pattern distinct from bone scintigraphy, Cook et al. [12], Nakai et al. [13] and Abe et al. [14]. This spurred our previous study, showing that serial FDG PET can measure changes in tumor glucose metabolism in response to therapy in patients with metastatic, bone-dominant (BD) breast cancer, Stafford et al. [15]. However, the assessment of bone metastasis response to therapy is challenging and may not provide an appropriate “gold standard” against which to test a new modality such as FDG PET. With this in mind, we undertook the current analysis to test whether changes seen in FDG uptake in serial scans of patients undergoing treatment of bone metastases correlate with more robust, clinically relevant patient outcome measures such as time-to-progression (TTP) and time to skeletal-related events (t-SRE).

Methods

Patients

A retrospective analysis was performed from the records of 405 breast cancer patients referred to our institution for FDG PET between January 1999 and December 2003. All patients had histologically-proven breast cancer and were referred for breast cancer staging to help identify the extent of breast cancer metastasis. Inclusion criteria for this study of changes in FDG uptake in bone sites were: (a) bone metastasis with clear uptake on an initial FDG PET scan and confirmed by biopsy or conventional imaging (e.g., bone scanning, MR imaging, CT, plain radiography); (b) bone-dominant disease during the entire course of follow-up (i.e., all distant lesions were confined to the bone, with or without nodal involvement either at the time of initial staging or subsequent biopsy or imaging); (c) no fewer than two FDG PET scans separated by at least 1 month and by no more than 18 months while receiving treatment for bone metastases; and (d) no granulocyte colony-stimulating factor (G-CSF, GM-CSF) for at least 2 weeks before FDG PET. Use of G-CSF can induce significant increases in bone marrow uptake of FDG, which can be mistaken for diffuse bony metastases or obscure the presence of known bony lesions, Sugawara et al. [16]. Of the 405 breast cancer patients, 28 patients fulfilled these inclusion criteria.

All patients provided written, informed consent for access to medical record evaluation before their initial FDG PET examination according to University of Washington IRB guidelines.

FDG PET

FDG was produced at our institution by means of a standard technique that uses nucleophilic fluorination, Hamacher et al. [17]. FDG radiochemical purity was 95%, and its specific activity was greater than 47 GBq/mmol. All patients fasted for at least 4 h before undergoing PET. All had a medical history negative for diabetes and prescanning serum glucose levels less than 150 mg/dl (8.3 mmol/l) (mean, 84.5 mg/dl [4.7 mmol/l]). Patients were injected with an activity of 260–370 MBq (7–10 mCi) of FDG via a peripheral intravenous or central venous catheter and rested comfortably in a supine position for 45–60 min prior to scanning. Most patients received sedation (lorazepam 1 mg intravenously) to reduce muscular uptake of FDG, particularly in the neck and upper thorax.

Imaging was performed with a commercially available whole-body PET scanner (Advance; GE Medical Systems, Milwaukee, WI) that was operating in the high sensitivity mode. Imaging consisted of a torso survey covering 5 adjacent 15-cm axial fields of view (FOVs). Emission scans, 7 min per axial FOV, were obtained beginning 45–60 min after FDG injection. The time between injection and start of the emission scan was carefully documented for each patient in order to duplicate timing for follow-up scans, which was within 5–10 min of the baseline scan in almost all cases, Beaulieu et al. [18]. Prior to implementing segmented attenuation scans, transmission imaging at 15 min per FOV was performed for one to two axial FOVs that contained FDG-avid sites of bone disease, in approximately half the patients. After implementing a segmented attenuation scan routine supplied by the tomograph manufacturer, then tested and verified for quantitative accuracy and comparability to full attenuation scans at our institution, Lewellen et al. [19], transmission data were collected for all axial FOVs, 3–7 min per FOV. Attenuation-corrected emission data were reconstructed with standard filtered back-projection after correction for scattered and random coincidences. A Hanning filter and a 128 × 128 × 35 plane matrix were used for reconstruction, which yielded a reconstructed in-plane spatial resolution of approximately 10 mm. Images were corrected for radioactive decay of the tracer and normalized to the injected dose and body weight, which resulted in regional standardized uptake values (SUVs) as follows: SUV = A/(ID/BW), where A is the tissue tracer uptake in microcuries per gram obtained from the PET image, ID is the injected dose in millicuries, and BW is the body weight in kilograms. FDG PET scans were evaluated prospectively by nuclear medicine physicians with experience in PET imaging. FDG PET images were interpreted to determine the extent of abnormal uptake in the skeleton. 
Available conventional images, especially bone scans and spine MR images, were used for comparison to help localize sites of disease.
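The SUV normalization described above can be illustrated with a minimal Python sketch. The function name and the numerical inputs are hypothetical and chosen only for illustration; they are not patient data or the software used in this study.

```python
def suv(tissue_uci_per_g: float, injected_dose_mci: float, body_weight_kg: float) -> float:
    """Body-weight-normalized SUV = A / (ID / BW).

    A is the decay-corrected tissue uptake in microcuries per gram,
    ID the injected dose in millicuries, BW the body weight in kilograms.
    Because 1 mCi/kg equals 1 uCi/g, the units cancel.
    """
    if injected_dose_mci <= 0 or body_weight_kg <= 0:
        raise ValueError("injected dose and body weight must be positive")
    return tissue_uci_per_g / (injected_dose_mci / body_weight_kg)


# Hypothetical inputs: 10 mCi injected, 70 kg patient,
# lesion uptake of 0.75 uCi/g after decay correction.
example_suv = suv(0.75, 10.0, 70.0)
print(round(example_suv, 2))
```

An SUV of 1 corresponds to tracer distributed uniformly through the body, so values well above 1 indicate focal accumulation.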

The maximum SUV at the site of most prominent FDG uptake was prospectively recorded on the baseline scan, and this site was designated as the index lesion. The SUV of the index lesion was used to determine absolute and percentage change in FDG uptake between the initial and second scan. In some cases the index lesion was in a field that had received previous radiation therapy; however, sites undergoing selective radiotherapy after the first PET scan were not used as index lesions. The maximum SUV of the index lesion served as the primary metric for the FDG PET scans. Although the interpreting physicians prospectively noted all other sites of abnormal uptake, the determination of the degree of heterogeneity in the FDG uptake was beyond the scope of this initial analysis.
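As a minimal sketch of the two change metrics, the following Python function computes the absolute and percentage change in index-lesion SUV between scans. The function and variable names are hypothetical; the input values are the index-lesion SUVs reported for the two example patients in Fig. 1.

```python
def suv_change(baseline: float, follow_up: float) -> tuple[float, float]:
    """Return (absolute change, percent change) relative to the baseline SUV."""
    if baseline <= 0:
        raise ValueError("baseline SUV must be positive")
    absolute = follow_up - baseline
    percent = 100.0 * absolute / baseline
    return absolute, percent


# Index-lesion SUVs from the Fig. 1 examples (baseline, follow-up).
decreasing_case = suv_change(5.2, 2.2)   # L4 lesion, uptake fell after therapy
increasing_case = suv_change(3.4, 8.0)   # thoracic lesion, uptake rose

for label, (absolute, percent) in [("decreasing", decreasing_case),
                                   ("increasing", increasing_case)]:
    print(f"{label}: absolute {absolute:+.1f}, percent {percent:+.1f}%")
```

Negative values denote a decline in uptake. The percentage metric scales the change by each lesion's own baseline, which is why it can rank patients differently from the absolute difference.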

Clinical outcome measures

The primary aim of the study was to compare change in FDG uptake in serial PET scans with long-term, clinically relevant outcomes such as time to progression (TTP) and time to skeletal-related event (t-SRE). Overall survival was not explored because only one patient died during follow-up. TTP was defined as the time from the second FDG PET to clinical disease progression. The second PET scan was the time at which change in FDG uptake was determined, and patients who met the inclusion criterion of serial PET scans were therefore not at risk of progression during the period between the scans. Clinical progression was defined as >25% increase in blood tumor markers (CA 27.29, CEA), clear symptomatic progression, progression determined by an imaging modality other than FDG PET (CT, MRI, bone scans), skeletal-related events (defined below), or development of a new bone lesion. FDG PET results were not considered in the determination of TTP.

The t-SRE was defined as the time from the second FDG PET to a SRE, defined as radiation therapy to stabilize skeletal disease, pathologic fracture, spinal cord compression, surgery to stabilize spine, or hypercalcemia of malignancy. Both clinical progression and SREs were determined retrospectively by a board-certified medical oncologist specializing in the treatment of breast cancer (J.R.G.), who was blinded to patient identity and FDG PET results.

Statistical analysis

Proportional hazards regression was used to assess the relationship between FDG PET predictors (maximum SUV of the index lesion at the initial scan; absolute change in FDG uptake; percent SUV change in FDG uptake) and time to clinical progression or skeletal-related event. Nested proportional hazards models were compared using likelihood-ratio tests, as recommended for small samples, Singer and Willett [20].

In this retrospective analysis of clinical practice, systemic therapy for bone metastasis and the time interval between scans were determined by the treating attending clinician, and thus were not standardized prospectively. In particular, FDG PET may have been considered as part of treatment decisions, and timing of scans could be related to suspicion of progressive disease. Although the aim of this study was to show the potential of FDG PET in assessing response to treatment, the analysis could not address the role of treatment in influencing progression, SRE, or change in SUV. To explore the interval between scans as a competing explanation for differences in TTP and t-SRE, proportional hazards models were stratified by long (≥5 months) or short (<5 months) time elapsed between scans. Further sensitivity analyses examined the influence of individual subjects. Kaplan–Meier survival curves and log-rank tests were used to illustrate results, and to examine threshold effects for median splits of SUV predictors and interval between scans. Analyses were conducted using the statistical package R 2.0.1 (R Foundation for Statistical Computing, Vienna, Austria) [21].
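The median-split Kaplan–Meier approach can be sketched as follows. This is an illustrative Python implementation of the product-limit estimator only, not the R code used for the analysis, and it does not implement the log-rank test or the stratified proportional hazards models. All data values and names below are hypothetical.

```python
from statistics import median


def kaplan_meier(times, events):
    """Product-limit estimate: list of (time, S(time)) at each event time.

    times  -- follow-up in months from the second scan
    events -- 1 if progression was observed, 0 if the patient was censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        observed = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        leaving = sum(1 for ti in times if ti == t)  # events plus censored at t
        if observed > 0:
            survival *= 1.0 - observed / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical cohort: percent SUV change, months of follow-up, event flag.
pct_change = [-60, -55, -50, -45, -30, -10, 5, 20]
months = [30, 24, 18, 12, 6, 5, 3, 2]
event = [0, 0, 1, 0, 1, 1, 1, 1]

cut = median(pct_change)  # split patients at the median percent change
large = [i for i, p in enumerate(pct_change) if p < cut]
small = [i for i, p in enumerate(pct_change) if p >= cut]

km_large = kaplan_meier([months[i] for i in large], [event[i] for i in large])
km_small = kaplan_meier([months[i] for i in small], [event[i] for i in small])
print("larger decline:", km_large)
print("smaller decline or increase:", km_small)
```

Censored patients leave the risk set without lowering the curve, which is how the estimator uses follow-up from patients who had not progressed at last contact.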

Results

Study population

Twenty-eight patients met the inclusion criteria. Of these, two had been included in our prior report of bone metastasis response, Stafford et al. [15]. Patient menopausal status, primary tumor characteristics, and treatment for metastatic breast cancer are described in Table 1. Primary tumor histology, estrogen receptor (ER), progesterone receptor (PR), and HER2/neu over-expression were collected from medical records and pathology reports where available and most often reflect tumor characteristics at time of original diagnosis. Average age at time of first FDG PET scan was 51 years, with a range between 30 years and 68 years of age. Fifty percent of the patients were premenopausal at time of the first FDG PET scan.

Table 1 Selected characteristics among patients with metastatic breast cancer

The majority of the patients presented with ductal carcinoma, followed by lobular carcinoma at the time of breast cancer diagnosis. Histology of the primary tumor was not available for 25% of patients, many of whom were diagnosed with breast cancer many years prior to the first FDG PET. Immunohistochemistry for ER and PR from either the primary or metastatic tumor was available for 27 patients. The majority of tumors were ER positive or PR positive. Data on HER2/neu over-expression was available for tumors of 24 patients, of which 10 over-expressed HER2/neu.

Information on type of therapy for metastatic breast cancer during the interval between first PET scan and last follow up is presented in Table 1. These data were not collected prospectively and are limited in detail and scope in this retrospective analysis. The heterogeneity of patients included in this analysis reflects that of those patients referred to our tertiary cancer center.

FDG PET imaging

Figure 1 shows representative examples of FDG PET images obtained in study patients. The images in the upper frames illustrate a patient with decreasing FDG uptake after therapy. The patient’s initial FDG PET scan demonstrated a maximum SUV reading of 5.2 at the index lesion located in the lumbar spine at the level of L4. Following treatment, the SUV reading decreased to background levels of 2.2. The images in the lower frames of Fig. 1 illustrate an increase in FDG uptake with treatment. Table 2 summarizes initial maximum SUV values and FDG PET changes for the cohort. The average pre-treatment SUV was 5.9. The average SUV change during treatment was a decrease of 2.2, ranging from a decrease of 9.2 to an increase of 2.8. The average percentage change in FDG PET was a 33% decrease in SUV.

Fig. 1

Serial FDG PET imaging examples. The upper frames illustrate representative coronal images from serial FDG PET scans on a patient with decreasing FDG uptake after therapy. The index lesion in the lumbar spine at L4 demonstrates a SUV of 5.2 on the initial scan (upper left). Following treatment, the SUV reading of the L4 lesion decreased to background levels of 2.2 (upper right). The lower frames depict coronal images from a patient with increased FDG uptake after therapy. The index lesion in the thoracic spine increased from an initial SUV of 3.4 (lower left) to a post-therapy SUV of 8.0 (lower right)

Table 2 Descriptive statistics for maximum SUV of index lesion at initial PET scan, absolute change in FDG uptake between scans, and percent change in FDG uptake between scans, n = 28

The interval between PET scans ranged from 1 month to 17 months, with an average of 4.9 months between scans. Of 28 patients, 12 (43%) had scans 3 or fewer months apart, 11 (39%) had scans 4 or 5 months apart, and the remaining 5 (18%) had scans 6 or more months apart. SUV data were compared by two-sample t-tests for patients who had scans under 5 months apart (n = 18) and those with scans 5 or more months apart (n = 10). We were concerned about the potential interaction between FDG PET scan interval and treatment response in that patients with a shorter interval between scans may have less opportunity for treatment response to be reflected by FDG PET. Among patients with shorter intervals between FDG PET scans compared to those with longer intervals there was no difference in FDG uptake at baseline (P = 0.57), in absolute SUV change (P = 0.72), or in percentage change in SUV (P = 0.75).

Outcome measures

Median follow-up time was 17.5 months from the time of the second FDG PET scan (range 1–61 months). Fifteen patients (54%) experienced clinical progression with a median TTP of 5 months after the second PET scan (range 1–42 months). The median follow-up for patients free of disease progression was 18 months (range 1–61 months). Five patients received radiation therapy for a bone lesion, and one patient developed a pathologic fracture, for a total of 6 SREs (21% of patients) occurring 1–28 months after the second PET scan. Hypercalcemia of malignancy was not observed in this patient cohort. For the remaining 22 patients free of a SRE, the median follow-up was 18.5 months (range 1–61 months).

Association between SUV changes and long-term outcome measures

Figure 2 shows Kaplan–Meier estimates of TTP, predicted by FDG PET imaging. Absolute difference in SUV is not a compelling predictor of TTP, and initial SUV does not appear as predictive as percent change in SUV. However, percentage change in SUV is predictive of TTP, with a greater than 41% (median) decline in SUV being associated with a longer TTP, P = 0.0054 (Fig. 2c). Table 3 shows fitted Cox proportional hazards models, with continuous predictors of TTP instead of the median splits shown in Fig. 2. Again, percent SUV change predicts TTP (hazard ratio 1.02), and the effect is still present when the regression model is stratified by the interval between scans (χ² = 7.63, 1 df, P = 0.006). According to the stratified model, the hazard for progression in a patient with no change in SUV is expected to be 2.16 times the hazard for a patient with a 42% decrease in SUV (the median value in Table 2). The estimated hazard is 37% greater for a patient in the 75th percentile of percent change in SUV (a 25% decrease) than for a patient with the median 42% decrease.
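The two hazard comparisons quoted from the stratified model can be checked against each other. In a proportional hazards model the log hazard is linear in the predictor, so the hazard ratio for a difference of d units is the per-unit ratio raised to the power d. The Python sketch below backs out an implied per-percentage-point ratio from the reported 2.16 and applies it to the 17-point gap between the median and the 75th percentile. The implied per-point value is derived here for illustration and is not a figure reported in the text; the function names are hypothetical.

```python
import math


def per_unit_hazard_ratio(hr_total: float, units: float) -> float:
    """Per-unit hazard ratio implied by a ratio that spans `units` units."""
    return math.exp(math.log(hr_total) / units)


def hazard_ratio_for_difference(hr_per_unit: float, difference: float) -> float:
    """Hazard ratio for a covariate difference under a log-linear model."""
    return hr_per_unit ** difference


# Reported: no change in SUV vs. a 42% decrease gives a hazard ratio of 2.16.
hr_per_point = per_unit_hazard_ratio(2.16, 42)

# 75th percentile (25% decrease) vs. median (42% decrease): 17 points apart.
hr_17 = hazard_ratio_for_difference(hr_per_point, 42 - 25)
excess_percent = 100.0 * (hr_17 - 1.0)

print(f"implied per-point hazard ratio: {hr_per_point:.4f}")
print(f"hazard ratio for a 17-point difference: {hr_17:.3f}")
print(f"excess hazard: {excess_percent:.0f}%")
```

Under this assumption the 17-point difference reproduces the 37% greater hazard stated above, so the two reported comparisons are mutually consistent.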

Fig. 2

Kaplan–Meier estimates of time to progression, for median splits of three FDG-PET predictors with P-values for log-rank tests, n = 28

Table 3 Cox proportional hazards models predicting time to clinical progression, n = 28

Figure 3 shows Kaplan–Meier estimates of time to first skeletal-related event (t-SRE), predicted by FDG PET imaging, with the three predictors split at the median value. Initial SUV predicts t-SRE, both in the Kaplan–Meier analysis (Fig. 3a) and the proportional hazards regression model (Table 4). In a model stratified by longer (≥5 months) or shorter (<5 months) intervals between PET scans, the fitted hazard ratio is 1.30, suggesting that a patient in the 75th percentile for the initial scan (SUV 5.9) has a SRE hazard 56% greater than for a patient at the 25th percentile (SUV 4.2). Since only 6 of 28 patients experienced a SRE, the influence of each event is quite high: excluding a patient who experienced a SRE 1 month after the second scan, the hazard ratio is estimated as 1.20 but is not found to be statistically significant by the likelihood ratio test (χ² = 2.09, 1 df, P = 0.15).

Fig. 3

Kaplan–Meier estimates of time to skeletal-related event (SRE), for median splits of three FDG-PET predictors with P-values for log-rank tests, n = 28

Table 4 Cox proportional hazards models predicting time to skeletal-related event, n = 28

Discussion

Although bone is among the most common sites of metastases in patients with breast cancer, traditional imaging modalities such as plain radiography, bone scan, and even MRI fall short in providing accurate assessment of response to systemic therapy, Bares [4], O’Sullivan and Cook [5], Hamaoka et al. [7], Maffioli et al. [10], and Even-Sapir [22]. FDG PET appears to image bone metastases in a pattern distinct from bone scintigraphy. In particular, FDG PET is more sensitive in identification of lytic bone metastases, while bone scintigraphy more readily identifies the sclerotic reaction to metastasis by the surrounding bone, Cook et al. [12], Peterson et al. [23], Cook and Fogelman [24], and Cook and Fogelman [25]. Thus, FDG PET may be a more accurate reflection of tumor activity, versus the bony reaction, and may provide a superior measure of response to systemic therapy in serial studies. Our preliminary studies have demonstrated that serial FDG PET correlates with therapeutic response as assessed by tumor markers, symptoms, and conventional imaging of bone-dominant breast cancer, Stafford et al. [15]. Patients included in the present retrospective analysis had FDG-avid, bone-dominant (BD) metastatic breast cancer and received a representative range of standard systemic therapies including endocrine therapy, chemotherapy, biologic therapy, and bisphosphonates. The findings of this study demonstrate the prognostic potential of FDG PET for time to progression (TTP) and time to first skeletal-related event (t-SRE), as assessed by tumor markers, symptoms, and conventional imaging. Using proportional hazards regression, we observed that patients with greater percentage decreases in SUV were at lower risk of disease progression. Additionally, higher SUV values for an index lesion on the initial FDG PET predicted a shorter time to a SRE.

Interestingly, absolute changes in SUV during treatment, unlike percent changes, were not predictive of TTP. Heterogeneity of FDG uptake by bony lesions has been previously characterized and likely reflects the “lytic” versus “blastic” biology of individual lesions, Cook et al. [12], Cook and Fogelman [24], Cook and Fogelman [25]. FDG uptake heterogeneity is the most plausible explanation for the lack of association between absolute changes in SUV and TTP. Percentage change in SUV takes into account the unique uptake characteristics of an individual patient’s lesion. Our results suggest that it is the fractional change in SUV, rather than an absolute value, that is the important measure of response.

Also novel in this analysis was the observation that a high SUV reading in the index lesion on initial FDG PET was predictive of a shorter time to a SRE. Increased FDG uptake on PET is characteristic of predominantly lytic-type bone lesions, Cook et al. [12]. It is biologically plausible that greater tumor activity in such a bony lesion may confer a greater fracture risk, as manifested by a shorter t-SRE.

This study’s limitations are typical for a retrospective study. Treatment regimens, duration of treatment intervals, intervals between serial FDG PET, and the use of conventional imaging and tumor markers all varied widely in this patient cohort. Bias in selection of patients for inclusion in this analysis was minimized by careful review of consecutive patients referred for FDG PET imaging, and by use of specific inclusion criteria to define a more homogeneous group of patients with respect to disease biology and prognosis. The variation in time interval between initial and subsequent FDG PET imaging may introduce bias as responses to different systemic therapies may manifest after shorter (e.g., chemotherapy) or longer (e.g., endocrine therapy) periods of time. Use of bisphosphonates, which have been demonstrated to prolong time to SRE, was also not controlled in this retrospective series, although most patients did receive bisphosphonate therapy. Analyses based on changes in SUV in a single index lesion also fail to capture and characterize the heterogeneity of responses that can be observed in multiple sites of bony metastases.

In this patient cohort, FDG PET imaging was ordered by attending clinicians, and, although not the sole determinant for decisions on continuation or change in therapy, the results may have influenced clinical decision-making. Attempts to minimize such bias in this analysis included independent determination of TTP and t-SRE through comprehensive clinical assessment by a medical oncologist blinded to FDG PET results. We accounted for the interval between scans through stratification, but cannot account for the potential information from patients who progressed before a second PET scan was performed.

Planned studies at our institution will build upon these retrospective observations with prospective trials of FDG PET imaging in patients presenting with bone-dominant metastatic breast cancer. In the prospective setting, tumor characteristics, systemic therapies, intervals between FDG PET scans, and comparison to standard imaging modalities (bone scintigraphy, CT, MRI) can be evaluated in a more standardized fashion and correlated with changes in SUV by FDG PET with the goal of developing RECIST-like criteria for bone metastases using PET imaging. Future studies will also benefit from improvements in the sophistication of FDG PET image analysis to better account for heterogeneity in uptake of multiple lesions and to provide better estimates of volume of FDG-avid disease. The unique uptake characteristics of FDG and [18F]-fluoride in bony metastases also provide a robust platform for important in vivo studies into the physiologic responses of both the metastases and the surrounding bone to different therapeutic modalities. Such studies may further the understanding of breast cancer bone metastases and provide a method to quantify bone metastasis response for clinical trials and clinical practice.