Introduction

Anthracycline-based chemotherapy followed by a taxane (paclitaxel or docetaxel) is currently a standard regimen for patients with primary breast cancer, yielding substantially better disease-free survival and overall survival than an anthracycline-based regimen alone [1–4]. Neoadjuvant chemotherapy using these two kinds of regimens has also been widely administered to patients with locally advanced breast cancer to improve surgical options and to provide information on in-vivo chemosensitivity [5, 6]. Patients who achieve a pathological complete response (pCR) have a favorable outcome after chemotherapy. In contrast, non-responding patients, who are unlikely to benefit from neoadjuvant chemotherapy, are recommended to receive additional therapy after surgery [7].

Noninvasive molecular imaging techniques have recently been developed to visualize cancer viability and metabolism [8, 9]. Positron emission tomography (PET) with 18F-fluorodeoxyglucose (FDG) can image the glucose metabolism of tumor cells [10–12]. There is cumulative evidence that a reduction in FDG uptake [e.g., standardized uptake value (SUV)] in the early cycles after the initiation of chemotherapy reliably predicts a favorable outcome in patients with breast cancer [13–16]. Imaging tumor glucose metabolism with FDG PET has the potential to distinguish viable cancer cells from those in suspension or necrotic components because the degree of tumor FDG uptake is closely associated with proliferative activity [17]. Therefore, FDG PET could help to individualize treatment and to avoid potentially ineffective chemotherapies [18–21].

In this prospective two-center study, we performed serial FDG PET combined with computed tomography (FDG PET/CT) scans during neoadjuvant chemotherapy at baseline, after four cycles of an anthracycline-based regimen, and after the taxane regimen prior to surgery, to monitor tumor response in patients with primary breast cancer. We classified breast tumors according to receptor status and evaluated the absolute value of baseline tumor SUV and the rate of change in tumor SUV during treatment. The primary aim was to determine whether the baseline tumor SUV could predict pCR. The secondary aim was to determine whether serial FDG PET/CT scans could optimize sequential use of taxane in patients who receive an anthracycline-based regimen. We discuss the role of FDG PET/CT monitoring in optimizing neoadjuvant chemotherapy for each subtype of breast cancer.

Materials and methods

Patients’ enrollment

Between August 2007 and July 2009, 108 patients with newly diagnosed, non-inflammatory primary breast cancer (≥2 cm, stages II and III) who were candidates for neoadjuvant chemotherapy were eligible for the study. Fifty-five (51%) of the patients were enrolled at the Department of Breast Oncology and the Department of Nuclear Medicine, International Medical Center, Saitama Medical University (Center 1), and 53 (49%) at the Department of Breast Surgery, National Defense Medical College and Tokorozawa PET Diagnostic Imaging Center (Center 2). Two patients had bilateral breast tumors, so a total of 110 tumors were evaluated in this study.

Invasive breast cancer was diagnosed by core needle biopsy in all patients. Baseline staging comprised palpation, chest X-ray, ultrasound examination of the bilateral breast, axilla, and the liver, and whole-body FDG PET/CT. Estrogen receptor (ER) and progesterone receptor (PR) status were assessed by immunohistochemistry (IHC) in the specimens obtained by core needle biopsy, and HER2 status was determined by IHC or fluorescence in-situ hybridization (FISH) analysis. Patients who were pregnant and those who had severe diabetes mellitus, other types of cancer, or metastatic disease were excluded. Patients’ characteristics are shown in Table 1. All patients were enrolled after they had been properly informed and had agreed to participate. The study protocol was approved by the institutional review boards at both centers.

Table 1 Patient and tumor characteristics between Center 1 and Center 2

Neoadjuvant chemotherapy regimen and surgery

All patients received neoadjuvant chemotherapy comprising 4 cycles of an anthracycline-based regimen followed by a taxane regimen (Fig. 1). They received 4 triweekly cycles of epirubicin at 90 mg/m² and cyclophosphamide at 500 mg/m² (EC). Forty-six (42.6%) patients (48 tumors) and 60 (55.6%) patients (60 tumors) received 8–12 weekly cycles of paclitaxel at 90 mg/m² and 4 triweekly cycles of docetaxel at 70 mg/m², respectively (Table 1). None of the patients with HER2-positive tumors received trastuzumab. Two (1.9%) patients discontinued sequential use of taxane due to severe adverse effects from chemotherapy or progressive disease. All patients underwent lumpectomy, partial mastectomy, or mastectomy with sentinel node biopsy and/or axillary dissection between 4 and 6 weeks after completion of chemotherapy.

Fig. 1

Schematic presentation of the study design. The chemotherapy regimen consisted of 4 cycles of epirubicin (90 mg/m²) + cyclophosphamide (600 mg/m²) on a triweekly basis and sequential use of 8–12 cycles of weekly paclitaxel (80 mg/m²) or 4 cycles of triweekly docetaxel (70 mg/m²). The treatment was monitored by serial measurement of FDG PET/CT at baseline (baseline PET/CT), after four cycles of initial chemotherapy (midtherapy PET/CT), and prior to surgery (posttherapy PET/CT)

FDG PET/CT examination

Whole-body FDG PET/CT scanners (Biograph, Siemens Medical Systems) were set up identically at both centers. FDG PET/CT scans were scheduled before treatment (baseline PET/CT), after 4 cycles of an anthracycline-based regimen (midtherapy PET/CT), and prior to surgery (posttherapy PET/CT) for the patients enrolled.

Details of the FDG PET/CT procedure have been previously described [19, 22]. Patients fasted overnight for at least 4 h, and a normal blood glucose level was confirmed before FDG injection. In Center 1, 60 min after intravenous injection of FDG tracer (3.7 MBq/kg body weight), patients were positioned prone with both arms at their sides on the whole-body PET/CT scanner couch. In Center 2, 60 min after intravenous injection of FDG (3.7 MBq/kg body weight), patients were scanned in the same manner. The Biograph allows simultaneous collection of 64 slices over a span of 15.8 cm with a slice thickness of 2.5 mm and a transaxial resolution of 6.3 mm. All data were reconstructed using ordered-subset expectation maximization (OSEM). After acquisition of CT data had been completed, the table top with the patient automatically advanced into the PET-sensitive field of view, and acquisition of PET data was started in three-dimensional mode with the patient in exactly the same position on the table. Attenuation correction was performed automatically using the corresponding CT data.

Regions of interest (ROI) with a maximum diameter of 1.0 cm were drawn on the area of abnormal FDG accumulation corresponding to the baseline tumor lesion. In a series of FDG PET/CT scans, careful attention was paid to drawing the ROI on the same lesion as at baseline. CT combined with FDG PET provided an anatomical landmark for detecting the lesion. FDG PET/CT images were analyzed by at least two blinded interpreters at each center. The maximal SUV was obtained from breast tumors detected by FDG PET/CT scan. Only patients with clearly visible and highly metabolic lesions at baseline underwent serial FDG PET/CT scans. The SUV was calculated using the formula: SUV = activity concentration in ROI (MBq/ml)/injection dose (MBq/kg body weight).
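As a minimal sketch of the SUV formula above, the following Python function normalizes the ROI activity concentration by the injected dose per unit body weight. The function name, the explicit unit choices (Bq/mL, Bq, and g), and the example values are illustrative assumptions and not part of the study protocol. Decay correction is assumed to have been applied to the measured concentration already, and tissue density is taken as 1 g/mL so that the result is dimensionless.

```python
def suv_body_weight(activity_bq_per_ml: float,
                    injected_dose_bq: float,
                    body_weight_g: float) -> float:
    """Body-weight-normalized SUV (hypothetical helper, not the scanner software).

    Assumes the activity concentration is already decay-corrected and that
    1 mL of tissue weighs 1 g, so the ratio is dimensionless.
    """
    if injected_dose_bq <= 0 or body_weight_g <= 0:
        raise ValueError("injected dose and body weight must be positive")
    dose_per_gram = injected_dose_bq / body_weight_g  # Bq per g of body weight
    return activity_bq_per_ml / dose_per_gram


# Hypothetical example: a 60 kg patient injected with 3.7 MBq/kg,
# and a maximum ROI activity concentration of 27.75 kBq/mL.
weight_kg = 60.0
dose_bq = 3.7e6 * weight_kg
suv_max = suv_body_weight(27_750.0, dose_bq, weight_kg * 1000.0)
print(round(suv_max, 2))
```

Keeping all quantities in base units (Bq, mL, g) avoids the factor-of-1000 slips that can occur when MBq, kBq, and kg are mixed.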

Assessment of endpoints: pathological response

All specimens excised from the breast were cut into 5-μm-thick slices and were microscopically analyzed for the presence of residual tumor [23]. Assessment of therapeutic response was determined by two pathologists (T.S. and H.T.). In this study, pathological complete response (pCR) was defined as microscopic evidence of the entire disappearance of invasive components of cancer cells in all resected pathological specimens. Regional lymph node involvement was not evaluated in this study. The tumors that did not achieve pCR were considered as non-pCR.

Metabolic response

The maximal SUV values on baseline PET and midtherapy PET were used to assess tumor response to therapy by calculating the relative change (%) in SUV of midtherapy PET relative to that of baseline PET: relative change (%) in SUV = [(midtherapy SUV − baseline SUV)/(baseline SUV)] × 100. The rate of change of SUV was used to derive the cut-off values from receiver operating characteristic (ROC) analysis.
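The relative change can be illustrated with a short Python sketch. The function names and the example SUVs are hypothetical, and the cut-off passed to the classifier is a placeholder parameter rather than a value prescribed by this section. A negative result indicates a decrease in tumor FDG uptake.

```python
def relative_suv_change(baseline_suv: float, midtherapy_suv: float) -> float:
    """Relative change (%) of midtherapy SUV with respect to baseline SUV."""
    if baseline_suv <= 0:
        raise ValueError("baseline SUV must be positive")
    return (midtherapy_suv - baseline_suv) / baseline_suv * 100.0


def is_metabolic_responder(change_percent: float, cutoff_percent: float) -> bool:
    """A tumor counts as a responder if its SUV fell at least as far as the cut-off."""
    return change_percent <= cutoff_percent


# Hypothetical tumor: maximal SUV falls from 8.0 at baseline to 2.0 at midtherapy.
change = relative_suv_change(8.0, 2.0)
print(change)                                   # -75.0
print(is_metabolic_responder(change, -70.0))    # True for this placeholder cut-off
```

Because responders are those with the larger decrease, the comparison uses "less than or equal to" on a negative percentage.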

Statistical analysis

Statistical analysis was performed using MedCalc software (Mariakerke, Belgium). The unpaired t test was used to compare mean SUV between two independent tumor subgroups. The cut-off value for predicting pathological response was calculated by ROC analysis. For the optimal cut-off value, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) of FDG PET/CT were calculated by the software.
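The following Python sketch shows one way such a cut-off and its diagnostic indices could be computed. The text does not state which criterion the software used to select the optimal cut-off; maximizing the Youden index (sensitivity + specificity − 1) is assumed here purely for illustration. The function names and the ten-tumor data set are invented and do not reproduce the study data.

```python
from typing import Dict, List


def diagnostic_metrics(changes: List[float], pcr: List[bool],
                       cutoff: float) -> Dict[str, float]:
    """Sensitivity, specificity, PPV and NPV when 'change <= cutoff' predicts pCR."""
    tp = fp = tn = fn = 0
    for change, achieved_pcr in zip(changes, pcr):
        test_positive = change <= cutoff
        if test_positive and achieved_pcr:
            tp += 1
        elif test_positive:
            fp += 1
        elif achieved_pcr:
            fn += 1
        else:
            tn += 1

    def ratio(num: int, den: int) -> float:
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


def youden_cutoff(changes: List[float], pcr: List[bool]) -> float:
    """Observed value that maximizes sensitivity + specificity - 1 (assumed criterion)."""
    best_j, best_cutoff = float("-inf"), changes[0]
    for cutoff in sorted(set(changes)):
        m = diagnostic_metrics(changes, pcr, cutoff)
        j = m["sensitivity"] + m["specificity"] - 1.0
        if j > best_j:
            best_j, best_cutoff = j, cutoff
    return best_cutoff


# Invented data: relative SUV change (%) and whether the tumor achieved pCR.
changes = [-90.0, -85.0, -80.0, -75.0, -70.0, -60.0, -50.0, -40.0, -30.0, -10.0]
pcr = [True, True, True, False, True, False, False, False, False, False]

best_cutoff = youden_cutoff(changes, pcr)
metrics = diagnostic_metrics(changes, pcr, best_cutoff)
print(best_cutoff, metrics)
```

In this toy data set the selected cut-off captures every pCR tumor at the cost of one false positive, which is the same trade-off pattern (high NPV, moderate PPV) that a sensitivity-favoring cut-off produces.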

Results

Patients’ characteristics

There were no significant differences in patients’ backgrounds and baseline SUV between Center 1 and Center 2 (Table 1). Overall, 22 (20%) tumors achieved pCR, a rate which was identical between the two centers. In this study, 70 tumors (63.6%) and 55 tumors (50%) were ER-positive and PR-positive, respectively. Twenty-six tumors (23.6%) had HER2 overexpression or amplification.

Comparison of baseline SUV between pCR and non-pCR in tumor groups

Figure 2 compares baseline SUV between pCR and non-pCR tumors in each subgroup. Overall, tumors with pCR had a significantly higher SUV (9.3 ± 3.7 SD) than those with non-pCR (7.2 ± 3.8 SD) (p = 0.02). Stratified by receptor status, the baseline SUV of ER-positive tumors that achieved pCR (9.8 ± 4.0 SD) was significantly higher than that of ER-positive tumors with non-pCR (6.4 ± 3.1 SD) (p = 0.006). In HER2-negative tumors, baseline SUV in pCR cases (10.5 ± 4.2 SD) was higher than that in non-pCR cases (7.0 ± 3.9 SD) (p = 0.01). In contrast, no significant difference in SUV between pCR and non-pCR was detected in the other subgroups.

Fig. 2

Distribution of baseline SUV in breast cancer subtypes. Filled circles represent pCR cases while open circles show non-pCR cases

Chemotherapy monitoring and pathological response

All patients underwent baseline FDG PET/CT. Ninety-eight (89.1%) of patients underwent midtherapy PET/CT, and ninety (81.8%) underwent posttherapy PET/CT.

The test results depended on the cut-off values (Fig. 3). By modifying the cut-off value by means of ROC analysis, an optimal SUV cut-off value was calculated with regard to its ability to predict pCR. A greater distance from the diagonal line indicated a diagnostic test with a higher reliability. The area under the curve (AUC) was 0.85 in overall tumors. Table 2 indicates pathological response rates by optimal cut-off of change in SUV after completion of an anthracycline-based regimen. The optimal cut-off value was −72.1% with a sensitivity of 88.9%, specificity of 78.7%, PPV of 48.5%, and NPV of 95.5%.

Fig. 3

ROC analysis for predicting pCR. ROC analyses of overall tumors (n = 98) for predicting pCR by change in midtherapy SUV relative to baseline SUV on a per-individual tumor basis. This shows the performance in predicting pCR with sensitivity of 88.9%, specificity of 78.7%, PPV of 48.5%, and NPV of 95.5% at the optimal cut-off of −72.1%

Table 2 Pathological response rate by optimal cut-off SUV

Center dependence

The diagnostic performance between the two centers was assessed by using ROC analysis. By baseline SUV, the test quality of predicting pCR showed a sensitivity of 44.2%, specificity of 86.4%, and AUC of 0.61 in Center 1 and sensitivity of 72.7%, specificity of 79.5%, and AUC of 0.76 in Center 2. The test quality of predicting pCR by relative change in SUV exhibited a sensitivity of 88.9%, specificity of 87.8%, and AUC of 0.84 in Center 1 and sensitivity of 88.9%, specificity of 87.8%, and AUC of 0.84 in Center 2.

Discussion

Prediction of pCR at baseline FDG PET/CT

In the current study, the baseline SUV of pCR tumors was significantly higher than that of non-pCR tumors, and the trend was marked in ER-positive tumors and in HER2-negative tumors. The baseline SUV could permit prediction of the treatment effect in the luminal A subtype even though there was considerable overlap in baseline SUV between tumors with and without pCR.

At the two centers, baseline SUV predicted pCR with high specificity, ranging from 79.5 to 86.4%, but with a wide range of sensitivity, from 44.2 to 72.7%. Therefore, even though the whole-body FDG PET/CT scanners were set up identically at both centers, we surmise that a direct comparison of baseline tumor SUVs between institutions has certain limitations due to inter-institutional differences in injection time, data acquisition, and reconstruction parameters. Given the considerable overlap and the variation in sensitivity, we propose that, for an individual patient, the baseline SUV of the primary tumor is not a reliable indicator for predicting chemo-responsiveness in breast cancer. On the other hand, the results for relative change in SUV were very similar between the two centers.

We evaluated only the maximum SUV of the tumor lesion. Although the maximum SUV is certainly reproducible, opinions differ on which metric should be used to measure FDG uptake.

Predicting pCR and optimizing sequential use of taxane at midtherapy FDG PET/CT

When the optimal cut-off value was set, the ROC curve analysis showed good performance in predicting pCR, with a sensitivity of 88.9%, specificity of 78.7%, PPV of 48.5%, and NPV of 95.5%. This finding suggests that administering the full course of anthracycline and taxane chemotherapy seems appropriate in patients who are responding to initial chemotherapy, irrespective of receptor status. The high NPV identifies patients who might not respond to the full-course regimen. However, both tumors (4.5%) that achieved pCR among the non-responding tumors had HER2 overexpression. This suggests that the HER2 subtype could benefit from sequential taxane even if tumors responded poorly to anthracycline-based chemotherapy. This finding is in line with previous studies [24, 25]. It should also be noted that patients with the HER2 subtype in this study did not receive specific anti-HER2 therapy, e.g., trastuzumab or lapatinib. Introduction of anti-HER2 therapy given concurrently with taxane might improve the pCR rate in patients who did not respond to the anthracycline-based regimen, as well as in patients responding to it [26, 27].

We evaluated only primary tumor metabolism with FDG PET/CT during chemotherapy in the present study. However, monitoring would be more reliable if axillary response were assessed along with tumor response, because FDG PET/CT can visualize lymph node metastasis with great accuracy [28, 29].

In conclusion, a high SUV of the primary tumor on baseline PET/CT may not be a useful indicator of chemo-responsiveness because of its variable sensitivity; however, it might be useful for a small subset of breast cancers such as the luminal A subtype. Relative change in SUV on midtherapy PET/CT could predict pCR after completion of chemotherapy. Metabolic information from FDG PET/CT scans could be useful in future clinical practice for tailoring individual therapy in breast cancer.