Introduction

Neoadjuvant chemotherapy (NAC) prior to surgery is not only the recommended standard treatment for locally advanced breast cancer [1], but is also used in patients with operable breast cancers in order to increase breast-conservation rates [2]. In this regard, it has previously been shown that achieving pathologic complete response (pCR) after NAC is significantly associated with favourable disease-free survival outcomes [3]. However, as 20 % of patients are resistant to NAC, predicting response during the initial cycles, which would allow the treatment to be modified in non-responders, would be of great clinical significance [4]. Indeed, a recent study has reported that the survival rates of non-responders to initial NAC who responded to a different regimen were similar to those of responders to the initial regimen [5].

For the prediction of pCR during the early cycles of NAC, tumour size or volume change measured on contrast-enhanced magnetic resonance imaging (MRI) has been demonstrated to be superior to clinical assessment [6]. However, as changes of tumour size after treatment usually occur after the 2nd cycle of NAC [7], earlier predictors reflecting angiogenesis, metabolic activity, or tumour cellularity, which may show change before tumour shrinkage, have been explored. 18F-fluorodeoxyglucose (FDG)-positron emission tomography (PET) has been reported to show 65–88 % accuracy in the prediction of pathologic response after the first or second cycle of NAC [8, 9]. In addition, greater changes in the total choline-containing compounds (tCho) signal of the tumour on in vivo proton magnetic resonance spectroscopy (1H-MRS) have been observed in patients with pCR than in non-pCR patients, also after only the first or second cycle of NAC [10–12]. Furthermore, Tozaki et al. reported that the changes in the integral value of the tCho peak determined via MRS were correlated with those of peak standardized uptake values (SUVs) determined via FDG-PET, both during and following NAC cycles [13].

However, to the best of our knowledge, there have been no studies comparing the performances of MRS and FDG-PET in predicting the pathological responses to NAC. Thus, the purpose of our study was to prospectively compare the performances of single-voxel 1H-MRS and FDG-PET in predicting the pathologic response to NAC in breast cancer patients.

Methods

Patients

The institutional review board of our hospital approved this prospective study and written informed consent was obtained from all patients. Between August 2010 and December 2012, 57 stage II or stage III breast cancer patients who received NAC were enrolled according to criteria described in a previous study [14]: (a) breast cancer pathologically confirmed by core needle biopsy, (b) an initial clinical stage of II or III, (c) objectively measurable lesions, (d) an Eastern Cooperative Oncology Group performance score of 0–2, (e) no previous treatment, and (f) adequate bone marrow, hepatic, cardiac and renal function. The patients received 6 cycles of neoadjuvant docetaxel (75 mg/m2; Taxotere; Sanofi Aventis, Paris, France) with doxorubicin (50 mg/m2; Adriamycin PFS; Ildong Pharmaceutical Co., Seoul, Korea) or 4 cycles of doxorubicin (60 mg/m2) with cyclophosphamide (600 mg/m2; Endoxane; Baxter, Deerfield, IL) followed by 4 cycles of docetaxel (75 mg/m2), which is the standard NAC protocol at our institution. All patients were clinically examined prior to each cycle of chemotherapy and the regimen was continued if there was no evidence of progression. Their response was evaluated at the completion of chemotherapy and curative surgery was performed. Each patient underwent both MRS and FDG-PET twice: once within 3 weeks prior to chemotherapy and once 2 weeks after the first cycle of chemotherapy but prior to the second cycle. The second examination was performed after the first cycle because predicting response at an earlier time point is highly important for modifying the treatment of non-responders.

Of the 57 patients enrolled, 22 patients were excluded as no MRS data were obtained (n = 13), or due to withdrawal of consent (n = 3), an inconsistent MR protocol (n = 4) or withdrawal from surgery (n = 2). Thus, 35 patients (mean age, 46.9 years; range, 35–65 years) with invasive breast cancers (mean size, 4.8 cm; range 2.3–11.5 cm measured at baseline contrast-enhanced MRI) were finally included for analysis. The patients had invasive ductal carcinomas (94.3 %) or invasive lobular carcinoma (5.7 %) with a clinical stage of II (22.9 %) or III (77.1 %).

1H-MR spectroscopy and image analysis

Single-voxel 1H-MRS examinations were performed with patients in the prone position using a 3.0-Tesla scanner (Verio; Siemens Medical Solutions, Erlangen, Germany) with a dedicated 16-channel breast coil equipped with a bottle phantom (syngo GRACE external phantom; Siemens Medical Solutions, Erlangen, Germany) for external reference [15]. The phantom, filled with 1.25 g of NiSO4·6H2O per 1,000 g of H2O, was placed behind the breast coil. After the contrast-enhanced MR examination, sagittal, coronal, and axial T1-weighted localizer images were acquired and used for voxel placement. After a 20 × 20 × 20-mm3 voxel was placed in the tumour, a point-resolved spectroscopy sequence (PRESS) with a repetition time/echo time (TR/TE) of 6,000/125 msec, a spectral width of 2,000 Hz, and 2,048 data points was performed. The acquisition time was 8 minutes and 12 seconds. Following the first cycle of chemotherapy, when a tumour had decreased to less than 2 cm, a 10 × 10 × 10-mm3 voxel was used. Semiautomatic shimming was performed to achieve a full width at half maximum of the unsuppressed water peak of 10 to 25 Hz. For the acquisition of water- and fat-suppressed data, 160 scans with a TR of 3,000 msec were obtained. For semi-quantification of the choline signal, water-unsuppressed data were obtained using 4 scans with a TR of 6,000 msec from a 10 × 10 × 10-mm3 voxel in an external water bottle.

Each spectrum was analyzed using syngo software (Siemens Medical Solutions, Erlangen, Germany) at an MR workstation. Hanning filtering with a width of 400 msec and zero filling of 4,096 was applied before the Fourier transform (FT). Final spectra were obtained using phase and polynomial baseline corrections. Areas of choline signals were estimated by applying Gaussian model fitting and their values were normalized via the normalization procedure using the external water reference signal as performed in a previous study [15]. The presence of a tCho signal was defined as a peak at 3.2 ppm clearly identifiable from the baseline noise. The integral values of the tCho signals were calculated and recorded.

18F-FDG PET image acquisition

FDG-PET images were obtained using an integrated PET/computed tomography (CT) system (Biograph 40, Siemens Medical Solutions, Erlangen, Germany). After the patients fasted for at least 6 hours, 18F-FDG (5.18 kBq/g) was intravenously injected and PET/CT scanning was initiated 1 hour later. CT scans were performed for attenuation correction and anatomical reference from the base of the skull to the proximal thigh, followed by an emission scan of 2 minutes per bed position. An iterative algorithm was used to reconstruct the PET images. The SUV, defined as the tissue concentration of radioactivity (kBq/mL) divided by the injected dose per body weight (kBq/g), was measured by three nuclear medicine physicians (H.L., H.J.Y., I.K.C.) in consensus using dedicated software (syngo.via, Siemens Medical Solutions, Erlangen, Germany). After a volume of interest (VOI) was drawn over the breast lesion, the maximal SUV (SUVmax), defined as the value of the highest pixel, and the peak SUV (SUVpeak), defined as the highest mean value of SUV in a 1 cm3 spherical VOI, were measured. An isoactivity contour was then drawn in the VOI using a preset margin threshold, and the metabolic tumour volume (MTV, cm3), the mean SUV (SUVmean) of the MTV, and total lesion glycolysis (TLG), defined as MTV multiplied by SUVmean, were calculated [16–18]. For the margin threshold, an SUVmax of 2.0 was used, as in a previous study of breast cancer [18].
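The SUV, MTV and TLG definitions above can be illustrated with a minimal sketch. The function names, the flat list of per-voxel SUVs and the fixed voxel volume are assumptions made for illustration only; the study used vendor software with an isoactivity contour, which the simple thresholding below merely approximates.

```python
def suv(tissue_kbq_per_ml, injected_dose_kbq, body_weight_g):
    """SUV = tissue activity concentration / (injected dose / body weight)."""
    return tissue_kbq_per_ml / (injected_dose_kbq / body_weight_g)


def volume_metrics(voxel_suvs, voxel_volume_cm3, threshold=2.0):
    """Return (SUVmax, MTV, SUVmean, TLG) for the voxels inside a VOI.

    Assumption: every voxel with SUV >= threshold belongs to the
    metabolic tumour volume (a stand-in for the isoactivity contour).
    """
    suv_max = max(voxel_suvs)
    inside = [v for v in voxel_suvs if v >= threshold]
    if not inside:
        return suv_max, 0.0, 0.0, 0.0
    mtv = len(inside) * voxel_volume_cm3      # metabolic tumour volume, cm3
    suv_mean = sum(inside) / len(inside)      # mean SUV within the MTV
    tlg = mtv * suv_mean                      # TLG = MTV x SUVmean
    return suv_max, mtv, suv_mean, tlg


# Hypothetical VOI of four voxels, 0.5 cm3 each (not study data).
print(volume_metrics([1.0, 2.5, 3.5, 6.0], 0.5))
```

SUVpeak is omitted here because it requires the spatial arrangement of the voxels (a 1 cm3 sphere), which a flat list does not carry.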

Histopathologic analysis

Core biopsy specimens obtained prior to chemotherapy as well as surgical specimens were evaluated by an experienced pathologist (I.A.P.) with 29 years of experience in breast histopathology to determine pathological response and to quantify changes in tumour cellularity. The Miller-Payne system was used to classify the changes in tumour cellularity after chemotherapy [19]: grade 1 = no reduction of cellularity, grade 2 = loss of tumour cells up to 30 %, grade 3 = 30–90 % reduction, grade 4 = more than 90 % reduction, and grade 5 = pCR, defined as the absence of invasive tumour cells, although ductal carcinoma in situ may have been present. Miller-Payne grades of 3, 4 or 5 were defined as showing good pathologic response and grades 1 or 2 were defined as showing minor response [20].
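The grouping of Miller-Payne grades described above can be written as a small lookup. This is only an illustrative sketch; the function and constant names are hypothetical and are not part of the study's workflow.

```python
GOOD_RESPONSE_GRADES = {3, 4, 5}   # good pathologic response
MINOR_RESPONSE_GRADES = {1, 2}     # minor response


def response_group(miller_payne_grade):
    """Map a Miller-Payne grade (1-5) to the response group used in this study."""
    if miller_payne_grade in GOOD_RESPONSE_GRADES:
        return "good"
    if miller_payne_grade in MINOR_RESPONSE_GRADES:
        return "minor"
    raise ValueError("Miller-Payne grade must be an integer from 1 to 5")


def is_pcr(miller_payne_grade):
    """Grade 5 corresponds to pCR (no invasive tumour cells remaining)."""
    return miller_payne_grade == 5


print(response_group(2), response_group(5), is_pcr(5))
```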

Tumour diameters and expressions of the oestrogen receptor (ER), progesterone receptor (PR) and human epidermal growth factor receptor (HER)-2 were evaluated based on the surgical histopathology. Standardized templates were used to record the number and sizes of the invasive components and carcinoma in situ components of the tumour and the status of lymph node metastasis. For lesions showing pCR, the expressions of ER, PR, and HER-2 were determined based on the core biopsy specimens. A cut-off value of 10 % was used to define positivity for ER and PR at 10X magnification. HER-2 expression was categorized as 0, 1+, 2+ or 3+ through immunohistochemical staining. Tumours with a score of 3+ were categorized as HER-2 positive, and tumours with a score of 0 or 1+ as HER-2 negative. For tumours with a score of 2+, fluorescence in situ hybridization was performed. When the ratio of HER-2 gene copies to chromosome 17 signals was >2.2, HER-2 expression was considered positive. Tumours were classified as hormone receptor-positive (i.e., ER-positive or PR-positive), triple negative (i.e., hormone receptor-negative and HER-2-negative) or HER-2-positive (i.e., HER-2-positive and hormone receptor-negative) based on the immunohistochemical staining.
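The receptor rules in this paragraph amount to a short decision procedure, sketched below. The function names are hypothetical, and treating a staining percentage of exactly 10 % as positive is an assumption, since the text does not say whether the cut-off is inclusive.

```python
ER_PR_CUTOFF_PERCENT = 10.0   # assumed inclusive; the text only states "10 %"
FISH_RATIO_CUTOFF = 2.2       # HER-2 gene copies / chromosome 17 signals


def her2_positive(ihc_score, fish_ratio=None):
    """HER-2 status from the IHC score, with FISH used only for 2+ tumours."""
    if ihc_score == "3+":
        return True
    if ihc_score in ("0", "1+"):
        return False
    if ihc_score == "2+":
        if fish_ratio is None:
            raise ValueError("FISH ratio is required for an IHC score of 2+")
        return fish_ratio > FISH_RATIO_CUTOFF
    raise ValueError("IHC score must be one of '0', '1+', '2+', '3+'")


def subtype(er_percent, pr_percent, her2_pos):
    """Immunohistochemical subtype as defined in the text."""
    hormone_receptor_pos = (er_percent >= ER_PR_CUTOFF_PERCENT
                            or pr_percent >= ER_PR_CUTOFF_PERCENT)
    if hormone_receptor_pos:
        return "hormone receptor-positive"
    if her2_pos:
        return "HER-2-positive"
    return "triple negative"


# Hypothetical tumour: ER 0 %, PR 0 %, IHC 2+ with a FISH ratio of 2.5.
print(subtype(0.0, 0.0, her2_positive("2+", 2.5)))
```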

Statistical analysis

Fisher’s exact test was performed to compare the clinicopathologic variables (age at diagnosis, clinical stage, expression of ER, PR, HER-2 and the Ki-67 protein, immunohistochemical subtype, chemotherapy regimen, and type of surgery) of the pCR and non-pCR groups as well as the good-response and minor-response groups. An independent samples t-test was performed to compare imaging variables (mean tumour size, mean tumour volume, integral value of the tCho signal, SUVmax, SUVpeak, and TLG) between the pCR and non-pCR groups as well as the good-response and minor-response groups. Linear regression analysis was performed to evaluate the correlation between imaging parameters and Miller-Payne grades of cellular reduction.

Area under the receiver operating characteristic (ROC) curve analysis was performed to compare the diagnostic performance of MRS and FDG-PET in the prediction of non-pCR. To suggest the optimal criteria for prediction, the best cut-off values of the MRS and FDG-PET to achieve the maximal sum of sensitivity (proportion of pathologic non-responders correctly classified as non-responders) and specificity (proportion of pathologic responders correctly classified as responders) were determined. Thereafter, sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) for the prediction of non-pCR or minor responses were calculated.
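The cut-off selection described above (maximal sum of sensitivity and specificity) can be sketched as follows. This is not the SPSS or MedCalc procedure used in the study; the function name, the choice of candidate thresholds, and the rule "predict non-response when the % reduction is below the cut-off" are assumptions made for illustration.

```python
def best_cutoff(reductions, is_nonresponder):
    """Return (cutoff, sensitivity, specificity) maximizing sensitivity + specificity.

    reductions: % reduction of an imaging parameter per patient.
    is_nonresponder: True for pathologic non-responders (the positive class).
    A patient is predicted to be a non-responder when reduction < cutoff.
    """
    n_pos = sum(1 for y in is_nonresponder if y)
    n_neg = len(is_nonresponder) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both responders and non-responders are required")
    best = None
    for cutoff in sorted(set(reductions)):   # each observed value is a candidate
        pairs = list(zip(reductions, is_nonresponder))
        tp = sum(1 for r, y in pairs if y and r < cutoff)
        tn = sum(1 for r, y in pairs if not y and r >= cutoff)
        sens, spec = tp / n_pos, tn / n_neg
        if best is None or sens + spec > best[1] + best[2]:
            best = (cutoff, sens, spec)
    return best


# Hypothetical % reductions (not study data): three non-responders, three responders.
print(best_cutoff([10, 20, 30, 70, 80, 90],
                  [True, True, True, False, False, False]))
```

Ties are resolved in favour of the lowest cut-off, which is an arbitrary design choice in this sketch.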

All statistical analyses were performed using the Statistical Program for Social Sciences software (version 19.0; SPSS, Chicago, IL) and MedCalc (version 10.3.0; MedCalc Software, Mariakerke, Belgium). A two-tailed P value of less than 0.05 was considered to indicate a statistically significant difference.

Results

The initial clinical stages of the patients were stage II in 8 of 35 patients (22.9 %) and stage III in 27 of 35 patients (77.1 %). The median interval between the MRS and FDG-PET examinations was 0 days (range, 0 to 4 days). Nineteen (55.9 %) of the 35 patients underwent MRS and FDG-PET examinations on the same day. The median interval between baseline imaging examination and the beginning of NAC was 4 days (range, 1–15 days). The median interval between baseline and the second imaging examination was 20 days for MRS and 21 days for FDG-PET examinations (both ranges, 14–36 days).

After surgery, pCR was achieved in 6 of the 35 patients (17.1 %). According to the Miller-Payne system, 28 patients were classified as good responders (80.0 %, grades 3, 4, or 5), and 7 patients as minor responders (20.0 %, grades 1 or 2; Table 1). No differences were found in mean baseline tumour size between patients who showed pCR [4.5 ± 1.7 cm (standard deviation, SD)] and patients with non-pCR (4.8 ± 1.9 cm; P = 0.736; Table 2). Following the first cycle of chemotherapy, tumour diameter decreased by a mean of 17.7 ± 19.2 % and tumour volume by a mean of 24.5 ± 32.8 %. The pCR group showed a greater decrease in tumour diameter than the non-pCR group (-38.5 ± 13.4 vs. -13.4 ± 16.8 %, P = 0.002) and the good-response group showed a greater decrease in tumour diameter than the minor-response group (-20.2 ± 20.0 vs. -7.9 ± 6.8 %, P = 0.012; Table 2). The mean integral value of tCho of the 35 patients decreased from 13.7 (range, 2.3 to 56.4) to 7.0 (range, 0 to 21.9; P < 0.001) and the mean SUVmax decreased from 12.2 (range, 3.8 to 35.3) to 8.0 (range, 2.2 to 30.4; P < 0.001). The mean % reduction of tCho of the pCR group was significantly greater than that of the non-pCR group (-80.3 ± 13.9 % vs. -32.1 ± 49.4 %, P = 0.025; Table 2; Figs. 1 and 2). The mean % reductions of SUVmax, SUVpeak, and TLG of the pCR group were also greater than those of the non-pCR group (SUVmax: -54.7 ± 22.1 % vs. -26.3 ± 33.7 %, P = 0.058; SUVpeak: -60.7 ± 18.3 % vs. -32.3 ± 23.3 %, P = 0.009; TLG: -89.5 ± 8.5 % vs. -52.6 ± 36.2 %, P = 0.020; Table 2; Figs. 1 and 2).
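The % changes reported here follow the usual relative-change formula, with negative values indicating a reduction. A minimal sketch, with a hypothetical function name, applied to the SUVmax values given in the Fig. 1 caption (9.1 before treatment, 2.4 after the first cycle):

```python
def percent_change(baseline, follow_up):
    """Relative change from baseline in percent; negative means a reduction."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (follow_up - baseline) / baseline * 100.0


# SUVmax of the Fig. 1 patient: 9.1 at baseline, 2.4 after the first cycle.
print(round(percent_change(9.1, 2.4), 1))
```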

Table 1 Association between clinicopathologic characteristics and pathological response
Table 2 Association between imaging variables and pathological response
Fig. 1

Single-voxel 1H-MR spectroscopy and 18F-FDG-PET of a 55-year-old woman with invasive ductal carcinoma (ER-, PR- and HER-2-positive) which demonstrated pCR after neoadjuvant chemotherapy and surgery. Pre-treatment (A) and post-1st cycle (B) single-voxel 1H-MR spectra show that the normalized integral value of the total choline-containing compound (tCho) resonance peak at 3.21 ppm decreased from 8.83 to 1.81, measured as a 79.6-% reduction. Pre-treatment (C) and post-1st cycle (D) SUVmax of the tumour on FDG-PET decreased from 9.1 to 2.4, measured as a 73.6-% reduction. SUVpeak of the tumour decreased from 7.2 to 1.7, measured as a 76.4-% reduction. Pre-treatment contrast-enhanced MRI (E) shows that the mass is 4.1 cm in diameter and preoperative contrast-enhanced MRI taken after completion of NAC shows the disappearance of the mass (F). Surgical histopathology revealed no residual tumour. Miller-Payne classification was grade 5, or complete response.

Fig. 2

Single-voxel 1H-MR spectroscopy and 18F-FDG-PET of a 53-year-old woman with invasive lobular carcinoma (ER+ and HER-2-negative) which was found to show a considerable amount of residual tumour after neoadjuvant chemotherapy and surgery. Pre-treatment (A) and post-1st cycle (B) single-voxel 1H-MR spectra show that the normalized integral value of the tCho resonance peak at 3.21 ppm decreased from 25.5 to 21.1, measured as a 17.3-% reduction. Pre-treatment (C) and post-1st cycle (D) SUVmax of the tumour on FDG-PET decreased from 7.3 to 4.1, measured as a 43.8-% reduction. SUVpeak of the tumour decreased from 5.1 to 2.9, measured as a 43.1-% reduction. Pre-treatment contrast-enhanced MRI (E) shows that the mass is 5.3 cm in diameter and preoperative contrast-enhanced MRI taken after completion of NAC shows the mass is 4.3 cm in diameter (F). Surgical histopathology revealed a 6.3-cm invasive lobular carcinoma. Miller-Payne classification was grade 2, or a minor response.

In the linear regression analysis between imaging parameters and grades of cellular reduction of tumours, the % change of tumour size (r = -0.54, P = 0.001), % change of tCho (r = -0.60, P < 0.001), % change of SUVmax (r = -0.45, P = 0.006), % change of SUVpeak (r = -0.60, P < 0.001), and % change of TLG (r = -0.53, P = 0.001) were each significantly correlated with the Miller-Payne grade, which indicates that a greater reduction of tumour size, tCho, SUVmax, SUVpeak, or TLG is associated with less residual tumour cellularity (Fig. 3).

Fig. 3

Scatter plots demonstrate the results of linear regression between the % change of tumour size (A), % change of tCho at MRS (B), % change of SUVmax (C), % change of SUVpeak (D), % change of total lesion glycolysis (TLG) at FDG-PET (E), and Miller-Payne grades of cellular reduction, all of which showed significant correlations. For each scatter plot, the best-fit line is shown as a solid line.

In the prediction of pathological residual tumours, the area under the ROC curve of % change of tCho [0.911, 95 % confidence interval (CI), 0.765 to 0.980] was similar to those of % change of SUVmax (0.822, 95 % CI, 0.656 to 0.930, P = 0.552), SUVpeak (0.862, 95 % CI, 0.703 to 0.955, P = 0.675), and TLG (0.879, 95 % CI, 0.725 to 0.965, P = 0.551; Fig. 4). When the best cut-off value was defined as the value yielding the maximum sum of sensitivity and specificity in the ROC curve analysis, a 60.7-% reduction of tCho yielded a 75.9-% sensitivity (22/29 non-pCRs) and 100-% specificity (6/6 pCRs); a 62.3-% reduction of SUVmax yielded a 100-% sensitivity (29/29 non-pCRs) and a 66.7-% specificity (4/6 pCRs; Table 3, Fig. 5).
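The sensitivity and specificity figures above follow directly from the stated counts, with non-pCR as the positive class. The sketch below recomputes them; the function name is hypothetical, and the PPV and NPV it returns are simply what these counts imply arithmetically, not values quoted from Table 3.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, PPV and NPV from a 2 x 2 table (None if undefined)."""
    def ratio(numerator, denominator):
        return numerator / denominator if denominator else None

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# tCho, 60.7-% cut-off: 22 of 29 non-pCRs and 6 of 6 pCRs correctly classified.
tcho = diagnostic_metrics(tp=22, fn=7, tn=6, fp=0)
# SUVmax, 62.3-% cut-off: 29 of 29 non-pCRs and 4 of 6 pCRs correctly classified.
suvmax = diagnostic_metrics(tp=29, fn=0, tn=4, fp=2)

for name, metrics in (("tCho", tcho), ("SUVmax", suvmax)):
    print(name, {key: round(value * 100, 1) for key, value in metrics.items()})
```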

Fig. 4

Receiver operating characteristic curves evaluating the performances of MRS and FDG-PET in the prediction of non-pCR. The area under the ROC curve (Az) of % change of tCho was 0.911 (95 % CI, 0.765 to 0.980), which was similar to those of % change of SUVmax (0.822, 95 % CI, 0.656 to 0.930; P = 0.552), SUVpeak (0.862, 95 % CI, 0.703 to 0.955, P = 0.675), or TLG (0.879, 95 % CI, 0.725 to 0.965, P = 0.551). MRS = single-voxel 1H-MR spectroscopy, FDG-PET = 18F-FDG-PET, pCR = pathologic complete response, tCho = total choline-containing compounds, SUV = standardized uptake value, TLG = total lesion glycolysis.

Table 3 Diagnostic performance of single-voxel 1H-MR spectroscopy and 18F-FDG-PET in the early prediction of non-pCR and minor responses
Fig. 5

Graphs demonstrating the sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) of single-voxel 1H-MR spectroscopy (1H-MRS), SUVmax, SUVpeak, and total lesion glycolysis (TLG) at 18F-FDG-PET in the early prediction of non-pCR (A) and minor response (B)

Discussion

According to our study results, in the prediction of pathological residual tumours, the area under the ROC curve of the % change of tCho at MRS was 0.911 (95 % CI, 0.765 to 0.980), which was similar to those of the % change of SUVmax (0.822, 95 % CI, 0.656 to 0.930; P = 0.552), SUVpeak (0.862, 95 % CI, 0.703 to 0.955, P = 0.675), and TLG (0.879, 95 % CI, 0.725 to 0.965, P = 0.551) as determined via FDG-PET. Moreover, the % change of tumour size (r = -0.54, P = 0.001), the % change of tCho (r = -0.60, P < 0.001), the % change of SUVmax (r = -0.45, P = 0.006), the % change of SUVpeak (r = -0.60, P < 0.001), and the % change of TLG (r = -0.53, P = 0.001) were all significantly correlated with changes in tumour cellularity grades after NAC.

Of the imaging biomarkers for the early prediction of response to NAC in breast cancer patients, previous studies have shown that FDG-PET is the leading imaging modality, with reports that the changes of SUV on FDG-PET during the early cycles of NAC are associated with the pathologic response rate [8, 9] and disease-free survival outcomes [21, 22]. With regard to MR imaging, changes of the transfer constant derived from pharmacokinetic modelling on dynamic contrast-enhanced (DCE)-MRI have also been shown to be associated with pathologic response [23] or favourable survival outcomes [24]. In direct comparisons, however, either superior accuracy of FDG-PET over DCE-MRI [25] or the usefulness of the complementary use of FDG-PET and DCE-MRI [26–28] has been reported. Yet, although these previous studies have reported that the combined use of FDG-PET and DCE-MRI after the 1st cycle of NAC improved the prediction of pCR or disease-free survival in breast cancer patients, additional ionizing radiation and cost have limited this combination in real clinical practice.

In this regard, previous studies showing that MRS may have the potential to predict the pathologic response in the initial stages of chemotherapy have clear clinical implications, as MRS may overcome the aforementioned limitations of PET while still providing the opportunity to minimize ineffective or toxic treatment or to provide more cycles of effective regimens with confidence. Indeed, changes in the tCho concentration of the tumour 24 hours after the 1st chemotherapy session [10] or at a time point between the 1st and 2nd chemotherapy sessions [11, 12] have already been demonstrated, similar to our study results. Moreover, given our linear regression analysis results and the demonstrated correlations of the % tCho changes (r = -0.60, P < 0.001) and the % changes in SUVmax (r = -0.45, P = 0.006) observed after the 1st cycle of chemotherapy with tumour cellularity grade changes, MRS may be sufficiently sensitive to identify early changes of cellular proliferation from membrane choline phospholipid metabolism changes, to a degree similar to the early glucose metabolism changes observed with FDG-PET [29, 30]. Thus, our results suggest that MRS may be used as an alternative to FDG-PET for the imaging prediction of the pathological response to NAC without additional ionizing radiation.

With regard to imaging parameter criteria, we analyzed the cut-off values in distinguishing not only pCR versus non-pCR but also good responders versus minor responders so as to avoid discontinuing an effective regimen in partial responders. When breast cancer patients receiving NAC present with a % tCho reduction lower than 25.7 %, modification of the regimen or alternative therapy may need to be considered as the patients would not respond to the regimen [100 % (7 of 7 minor responses) sensitivity]. Conversely, when patients present with a % tCho reduction of greater than 60.7 %, we can expect that the patients would respond to NAC with 100 % (6 of 6 pCR) specificity. With regard to the cut-off values of FDG-PET in our study, a 35.6-% reduction of SUVmax yielded a sensitivity of 85.7 % (6/7 minor responses) and a specificity of 71.4 % (20/28 good responses). Although cut-off values can vary depending on the timing of the imaging (after 1st cycle versus 2nd cycle), the standard of reference (pCR versus histological grading versus survival outcome), and quantification methods (SUV versus metabolic rate), cut-off values of a 25-% to 45-% reduction in SUVmax after the 1st cycle of chemotherapy have been recommended by earlier studies [9, 31, 32], and our results fall within this recommended range. With regard to the cut-off values of tCho, without standardization of acquisition and quantification methods of MRS for breast cancers, our cut-off tCho value may not be applicable to other institutions. Robust quantification methods for tCho using an external phantom reference or internal reference approach regardless of the voxel, magnetic field strength, breast coil design, or capability of shimming have been investigated in the literature [33]. However, these absolute quantification methods are too complicated to be performed in routine clinical practice [34]. In our study, we used relative changes of the normalized integral value of the tCho peak before and after the 1st cycle of NAC, an approach which has been shown to be accurate in discriminating malignant from benign lesions and which is more practical for clinical use [34]. Thus, we believe that relative quantification of normalized integral values of the tCho peak can be used as a parameter in longitudinal studies for the early prediction of the pathologic response to NAC.
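The two tCho cut-offs discussed above can be read as a simple three-way rule, sketched below purely for illustration. The function name and the output labels are hypothetical, the middle category is an assumption (the text does not name one), and the cut-offs come from this single-institution cohort without external validation, so this is not a clinical decision tool.

```python
MINOR_RESPONSE_CUTOFF = 25.7   # % tCho reduction below which a minor response was predicted
PCR_CUTOFF = 60.7              # % tCho reduction above which response was expected


def interpret_tcho_reduction(reduction_percent):
    """Three-way reading of the % tCho reduction after the first NAC cycle."""
    if reduction_percent < MINOR_RESPONSE_CUTOFF:
        return "possible non-responder: consider regimen modification"
    if reduction_percent > PCR_CUTOFF:
        return "likely responder"
    return "indeterminate"


# % tCho reductions from the Fig. 1 (79.6 %) and Fig. 2 (17.3 %) captions.
print(interpret_tcho_reduction(79.6))
print(interpret_tcho_reduction(17.3))
```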

Another interesting result of our study is that the Az value of TLG (0.879) was similar to those of SUVpeak (0.862) and SUVmax (0.822). Although SUVmax is simple and reproducible to obtain, it represents only the single pixel with the highest SUV; volume-based analysis such as TLG has therefore been recommended in recent studies, which is consistent with our study results [17, 19].

The strengths of our study are the prospective application of imaging protocols, i.e., MRS and FDG-PET were acquired from the same patients with a mean interval of only 1 day, with consistent time points of imaging acquisition and chemotherapy regimens.

Nonetheless, our study has several limitations. First, this was a single-institution study with a relatively small sample size. Moreover, due to a higher number of dropouts than we expected, we included only 35 patients. Second, we did not assess the reproducibility of the quantification of integral values of the tCho peak on MRS. Third, because the cut-off values for imaging parameters were derived from our own study population, the diagnostic performance may have been overestimated. Thus, further studies with a validation cohort are warranted. Despite these limitations, we believe that our results showing that MRS provided predictive capabilities similar to those of FDG-PET can be of great clinical value.

In conclusion, our study demonstrated that MRS can be used as an alternative to FDG-PET in the early prediction of the pathological response to NAC in breast cancer patients, allowing response-guided treatment modification with comparable accuracy to FDG-PET without additional ionizing radiation.