Introduction

Osteosarcoma is the most common malignant bone tumor in adolescents and young adults. The long-term survival of osteosarcoma patients has been significantly improved by combining neoadjuvant chemotherapy and surgical resection [1]. The goal of preoperative chemotherapy is to treat the micrometastatic disease systemically and to provide tumor debulking before “en bloc” resection.

The prognosis of localized osteosarcoma strongly depends on tumor histological response to neoadjuvant chemotherapy [1, 2]. However, this gold standard criterion is available only postoperatively. If the information on tumor response could be obtained earlier by an alternative method, the preoperative treatment could also be optimized earlier.

Static MRI has proven its usefulness in the management of these tumors and is routinely used to determine the surgical strategy. The pretreatment tumor volume has been shown to be related to post-chemotherapy histological necrosis and event-free survival [1, 3, 4]. However, the initial tumor volume is not routinely used in current protocols since it appears to be a poor predictive factor in multivariate analyses [1, 5]. Static MRI is not considered suitable for prognostic determination [6]. The tumor volume changes during neoadjuvant chemotherapy also do not correlate with prognosis [7, 8], since tumor shrinking mainly concerns the soft tissue component, whereas the bony component itself changes very little.

Dynamic contrast-enhanced MRI (DCE-MRI) allows neoangiogenesis regression to be assessed during chemotherapy and has been validated as an accurate predictor of histological response [9–14]. However, the best correlation is obtained only on the preoperative examination, so DCE-MRI cannot be used as an early predictor of treatment response [14].

Diffusion-weighted (DW) MRI has been used to characterize musculoskeletal tumors, both in adults [15–17] and children [18, 19]. Since the diffusion signal is directly related to the tumor cellularity [20, 21], necrotic areas within the tumoral tissue induce a local diffusion increase. This phenomenon has been demonstrated both in clinical practice [20, 21] and experimentally [22, 23]. Three studies have shown the accuracy of DW-MRI to predict the histological response to chemotherapy in pediatric osteosarcoma [24–26]. However, all three studies compared only the prechemotherapy diffusion values to those recorded immediately before surgery.

18F-FDG PET has also been used to assess the response to neoadjuvant chemotherapy in this setting. A strong correlation between FDG uptake decrease and tumor necrosis was demonstrated. However, there is still no standardized uptake value threshold available that will definitively separate good and poor responders [5, 27–31]. As with DCE-MRI, the FDG-PET response is obtained too late in the treatment course to modify potentially unsuccessful treatment. Moreover, FDG-PET is associated with a significant radiation dose to patients and should be used carefully in the pediatric population.

The main objective of our study was to assess the accuracy of DW-MRI as a method to predict the early response to neoadjuvant chemotherapy by comparing the tumor apparent diffusion coefficient (ADC) values measured at mid-course of chemotherapy with the final histological response measured after completion of preoperative chemotherapy. Our secondary objective was to determine the same values of DW-MRI at the end of chemotherapy.

Materials and methods

All children and adolescents with long-bone osteosarcoma treated in our institution between 2005 and 2010 were included in this prospective study. The study was approved by our Institutional Review Board and Ethics Committee, and written informed consent was obtained from both patients and parents before the MR examinations.

All patients were treated according to the same protocol, which included a 13-week preoperative chemotherapy schedule as follows: seven infusions of high-dose methotrexate (12,000 mg/m²) and two courses of an ifosfamide/etoposide regimen (3 g/m² and 75 mg/m²/day, respectively) for 4 days.

Three MR examinations were planned: the first one (MRI-1) at baseline, before surgical biopsy; the second one (MRI-2) halfway through chemotherapy (range −4 to +3 days, mean 2.3 days between the actual date of the examination and the theoretical “mid-course” treatment date); and the third one (MRI-3) at the end of neoadjuvant chemotherapy (i.e., immediately before surgery).

MR examinations

All examinations were performed on the same MR unit. Our routine osteosarcoma MR protocol included the following sequences:

  • Body coil: sagittal and coronal STIR, 3 mm slices (TR > 2,500/TE 70/TI 140, matrix 320 × 256, FOV 500), for joint-to-joint coverage (skip metastasis depiction).

  • Surface coil: centered on the bone tumor (with FOV ranging from 200 to 240 mm):

      • T1 coronal spin-echo (TR 435/TE 18, matrix 304 × 242).

      • T2 axial turbo-spin echo (TR 5,400/TE 110, matrix 197 × 400).

      • T2 sagittal fat-saturated turbo-spin echo (TR 4,504/TE 52, matrix 208 × 165).

      • T1 GE dynamic with subtraction (TR 11/TE 4.2, matrix 256 × 163, 15 dynamic scans, dynamic scan time 17).

      • 3D T1 GE with spectral fat suppression after gadolinium injection of 0.1 mmol/kg body weight gadoteric acid (Dotarem, Guerbet, Roissy, France) (TR 14/TE 6.9, matrix 256 × 114).

      • The slice thickness for all sequences performed with the surface coil was 3 mm.

Diffusion-weighted sequences: A diffusion-weighted sequence was obtained before contrast medium injection: SPIR coronal DW turbo-spin echo (TR 1,500/TE 138, matrix 112 × 89, TSE factor 16, b values = 0 and 900 s/mm²). DWIs were acquired along three gradient directions. This sequence provided one slice of 20 mm thickness in the long axis of the bone in 1.46 min, parallel to the plane in which the future histological specimen was to be analyzed.

The full examination took less than 45 min.

Image analysis

The image analysis was performed with a Philips View Forum processing console (Best, The Netherlands). The diffusion map and conventional sequences were analyzed concomitantly, with the screen divided into four quadrants dedicated to the following four sequences: T1 coronal, T1 fat-sat gadolinium coronal, b0 coronal, and its corresponding ADC map. Because the FOV, gap, and slice thickness differed between the T1 sequences (T1 and 3D T1 gadolinium with fat saturation) and the DWI sequence, we could not copy and paste the tumor contours from the former to the latter. However, using the same on-screen magnification and the anatomical landmarks visible on both the T1 sequences and the DWI sequence, we were able to draw the contours of the tumor depicted by the T1 sequences on the b0 image. We then copied this delineation from the b0 image to the ADC map in order to obtain the ADC value (Figs. 1 and 2).

Fig. 1

Osteosarcoma of the upper tibia with physeal extension. Example of ADC calculation in patient no. 13 (poor responder) at MRI-3, prior to surgery. T1 coronal (a), T1 coronal gadolinium fat saturation (b), DWI acquired at b0 (c) and corresponding ADC map (d) and histological section (e)

Fig. 2

Osteoblastic osteosarcomas. ADC maps in patient no. 12 (good responder; ADC = 2.21) (a) and patient 13 (poor responder; ADC = 1.93) (b) at mid-course of chemotherapy (MRI-2). The signal looks higher and more homogeneous for the good responder

ADC was calculated using the usual formula: ADC = ln(S0/S900)/(b900 − b0), where S0 is the signal intensity at b = 0, S900 is the signal intensity at b = 900, and ln is the natural logarithm. ADC is expressed in mm²/s.
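As an illustration only, the two-point mono-exponential calculation, ADC = ln(S0/S900)/(b900 − b0), can be sketched in a few lines of Python. The function name, argument names, and signal values below are hypothetical and are not taken from the study; in practice the ADC map is generated by the processing console.

```python
import math


def adc_from_signals(s0, s_b, b_low=0.0, b_high=900.0):
    """Two-point mono-exponential ADC estimate.

    s0, s_b: signal intensities at b_low and b_high (same arbitrary units).
    b_low, b_high: b values in s/mm^2 (0 and 900 in the protocol described).
    Returns the ADC in mm^2/s.
    """
    if s0 <= 0 or s_b <= 0:
        raise ValueError("signal intensities must be positive")
    if b_high <= b_low:
        raise ValueError("b_high must be greater than b_low")
    return math.log(s0 / s_b) / (b_high - b_low)


# Hypothetical signal intensities, for illustration only.
example_adc = adc_from_signals(s0=1000.0, s_b=150.0)
print(f"ADC = {example_adc:.2e} mm^2/s")
```

Because the signal ratio sits inside the logarithm, a stronger signal drop between b = 0 and b = 900 yields a higher ADC, which is the behavior expected in necrotic, less cellular tissue.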

These measurements were made independently by two senior radiologists, enabling the interobserver variability to be defined. Intra-observer variability with a 6-month interval between measurements was also assessed for observer 1. Because of his greater experience, the measurements made by observer 1 were those used for the different ADC calculations.

Three ADC values were recorded: ADC1, ADC2, and ADC3 (for MRI-1, MRI-2, and MRI-3, respectively). Four other parameters were defined: the ADC differential between the first and second MRI (ADC2 − ADC1), the ADC differential between the first and third MRI (ADC3 − ADC1), and the ADC variations expressed as a percentage between the first and second MRI, defined as ((ADC2 − ADC1)/ADC1) × 100, and between the first and third MRI, defined as ((ADC3 − ADC1)/ADC1) × 100.
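The derived parameters can be written out explicitly as a short sketch. The helper names and the three ADC values are hypothetical, chosen only to show the arithmetic; note that the percentage variation divides the whole difference by the baseline value.

```python
def adc_differential(adc_baseline, adc_followup):
    """Absolute change from baseline, e.g. ADC2 - ADC1."""
    return adc_followup - adc_baseline


def adc_percent_variation(adc_baseline, adc_followup):
    """Relative change from baseline in percent: (follow-up - baseline) / baseline * 100."""
    if adc_baseline == 0:
        raise ValueError("baseline ADC must be non-zero")
    return (adc_followup - adc_baseline) / adc_baseline * 100.0


# Hypothetical ADC values for one patient (not taken from the study tables).
adc1, adc2, adc3 = 1.60, 2.00, 2.40

print(f"ADC2 - ADC1 = {adc_differential(adc1, adc2):.2f}")
print(f"ADC3 - ADC1 = {adc_differential(adc1, adc3):.2f}")
print(f"Mid-course variation = {adc_percent_variation(adc1, adc2):.1f}%")
print(f"End-of-treatment variation = {adc_percent_variation(adc1, adc3):.1f}%")
```

A negative differential means the ADC fell below its baseline value.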

Histology

After surgical resection, bone specimens were sent for histological analysis according to Huvos’ grading system [32]. The percentage of viable residual cells was calculated from a 5 mm coronal slice of the specimen along the largest axis of the tumor, including soft tissue extension. A good response was defined as tumors composed of 10% viable tumor cells or less, and a poor response as tumors containing more than 10% viable tumor cells.
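The response definition above reduces to a single threshold, sketched here to make the boundary case explicit: a tumor with exactly 10% viable cells counts as a good response. The function and constant names are hypothetical.

```python
GOOD_RESPONSE_MAX_VIABLE_PCT = 10.0  # threshold stated in the text


def classify_response(viable_pct):
    """Return 'good' for 10% viable tumor cells or less, otherwise 'poor'."""
    if not 0.0 <= viable_pct <= 100.0:
        raise ValueError("viable_pct must be between 0 and 100")
    return "good" if viable_pct <= GOOD_RESPONSE_MAX_VIABLE_PCT else "poor"


# Hypothetical percentages, for illustration only.
for pct in (3.0, 10.0, 25.0):
    print(pct, classify_response(pct))
```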

Statistics

Normality of data was assessed using the Shapiro-Wilk test. Owing to the small sample size and the nonhomogeneity of the distributions of the different parameters, nonparametric tests were used. The comparison of ADC values, differentials, and ratios for the “good responder” and “poor responder” groups was performed using Mann-Whitney nonparametric tests. We carried out a receiver operating characteristic curve analysis to assess the performance of three parameters in discriminating good and poor responders: ADC2 values, ADC2 − ADC1, and the ratio ((ADC2 − ADC1)/ADC1) × 100. For each parameter, we chose the cut-off giving the best sensitivity at 100% specificity.
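The cut-off selection can be illustrated with a minimal sketch. It assumes that poor responders are the class to detect and that lower parameter values point toward a poor response. The function names and the data are hypothetical, and the sketch is not the SPSS procedure used in the study. The second function computes the area under the ROC curve by pairwise comparison, which equals the Mann-Whitney U statistic divided by the product of the two group sizes.

```python
def cutoff_at_full_specificity(good_values, poor_values):
    """Cut-off with the best sensitivity while keeping 100% specificity.

    A patient is flagged as a poor responder when the value is strictly
    below the cut-off. Setting the cut-off at the lowest good-responder
    value guarantees that no good responder is flagged.
    Returns (cutoff, sensitivity).
    """
    cutoff = min(good_values)
    detected = sum(1 for value in poor_values if value < cutoff)
    return cutoff, detected / len(poor_values)


def auc_low_is_poor(good_values, poor_values):
    """AUC as the fraction of (good, poor) pairs ranked correctly.

    Ties count for one half. This equals U / (n_good * n_poor).
    """
    score = 0.0
    for good in good_values:
        for poor in poor_values:
            if poor < good:
                score += 1.0
            elif poor == good:
                score += 0.5
    return score / (len(good_values) * len(poor_values))


# Hypothetical ADC differentials, NOT the study data.
good = [0.3, 0.5, 0.6, 0.9]
poor = [-0.2, 0.1, 0.4, 0.7]

cutoff, sensitivity = cutoff_at_full_specificity(good, poor)
print(f"cut-off = {cutoff}, sensitivity = {sensitivity:.0%}")
print(f"AUC = {auc_low_is_poor(good, poor):.2f}")
```

Fixing specificity at 100% trades sensitivity for certainty: every flagged patient is a true poor responder in the sample, at the cost of missing those whose values overlap with the good responders.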

Inter- and intra-observer variability was tested using the nonparametric Wilcoxon test for paired samples. Statistical analysis was done using SPSS version 15.0 (SPSS, Chicago, IL, USA). For all tests, a p-value of less than 0.05 was considered significant.
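For readers unfamiliar with the paired Wilcoxon test, the sketch below computes only its rank sums (W+ and W−), using average ranks for tied absolute differences and discarding zero differences. It does not compute a p-value and is not the SPSS procedure used in the study; the function name and the measurements are hypothetical.

```python
def wilcoxon_signed_rank_sums(first, second):
    """Return (W_plus, W_minus) for paired measurements.

    Zero differences are discarded; tied absolute differences share
    the average of the ranks they occupy.
    """
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    diffs = [b - a for a, b in zip(first, second) if b != a]
    ordered = sorted(diffs, key=abs)
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        # Extend j over the run of equal absolute differences.
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    w_plus = sum(rank for diff, rank in zip(ordered, ranks) if diff > 0)
    w_minus = sum(rank for diff, rank in zip(ordered, ranks) if diff < 0)
    return w_plus, w_minus


# Hypothetical paired readings by two observers (integers keep ties exact).
observer_1 = [10, 12, 9, 11, 14]
observer_2 = [12, 11, 9, 14, 15]
print(wilcoxon_signed_rank_sums(observer_1, observer_2))
```

When the two observers agree closely, positive and negative differences balance and the two rank sums are similar; a marked imbalance suggests a systematic difference between readings.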

Results

Fifteen patients, aged 4.8 to 18.5 years (mean = 13.5 years, median = 14.5 years), were prospectively included during the study period. The male/female sex ratio was 6/9. The tumors were located on the distal femur (n = 10), the proximal tibia (n = 3), the proximal humerus (n = 1), and the proximal femur (n = 1).

Histological analysis demonstrated eight good responders and seven poor responders, with a similar distribution between the subtypes of osteosarcoma. There were six patients with osteoblastic osteosarcomas and one with a mixed osteosarcoma (chondroblastic and osteoblastic) in each group. The 15th patient belonged to the good responder group and had a mixed type of osteosarcoma. Patient characteristics are provided in Table 1.

Table 1 Tumor location, osteosarcoma subtype, and response to chemotherapy

No significant difference in ADC values was observed either between the two observers or between measurements obtained at a 6-month interval by observer 1 (Table 2).

Table 2 Intra- and interobserver variability of ADC measurements at MRI-1, MRI-2, and MRI-3

A significant difference was observed for all values between MRI-1 and MRI-2 (Tables 3 and 4). All three patients who had an ADC2 value lower than their ADC1 value were poor responders.

Table 3 For each patient, ADC values compared to histological response to chemotherapy

Table 4 Values (mean ± standard deviation) at mid-course of chemotherapy for good and poor responders

The best parameter to detect poor responders was the ADC differential (ADC2 − ADC1). It showed 100% specificity for depicting 57% (4/7) of the poor responders correctly (Fig. 3).

Fig. 3

Graphs comparing the ADC2 values (a), the differential ADC2 − ADC1 (b), and the ratio ((ADC2 − ADC1)/ADC1) × 100 (c) at mid-point of chemotherapy for poor (PR) and good responders (GR)

No difference was observed for any of the MRI-3 values (Table 5).

Table 5 Values (mean ± standard deviation) at the end of chemotherapy for good and poor responders

Discussion

Early detection of those who respond poorly to preoperative chemotherapy may be an important issue in the management of osteosarcomas in children. For at least 30 years, poor histological response has been strongly associated with a poor outcome [2]. The major challenge is to find an early prognostic factor that will allow the neoadjuvant treatment regimen to be adjusted. The treatment of suspected poor responders could then be intensified earlier, potentially improving their prognosis and decreasing the risk of iatrogenic toxicity. Another possibility would be to decide early to proceed with surgery, in order to remove a resistant tumor as soon as possible, aiming to lower the risk of metastasis. Although DW-MRI has been used to assess tumor necrosis in bone tumors [24–26], this method has been tested at a time point too close to surgery to allow treatment optimization. Therefore, we assessed the value of early DW-MRI, i.e., at mid-course of chemotherapy. To our knowledge, such a study has never been published, and our results demonstrate a potential value for predicting the final histological response.

In previous DWI studies, the relationship between histological response and ADC values and their changes during chemotherapy showed discordant results [4, 24–26]. On the basis of a short series (eight patients), Uhl et al. [24] described a significant difference between good and poor responders; their ADC3 − ADC1 values were 0.58 ± 0.15 vs. 0.23 ± 0.15 mm²/s, respectively (p = 0.016). Hayashida et al. [25] confirmed the ability of DWI to accurately separate good and poor responders in a larger series. However, this study was not homogeneous as it also included two patients with Ewing’s sarcoma.

With larger series of 22 and 31 osteosarcoma patients, respectively, Oka et al. [26] and Bajpai et al. [4] obtained results close to those obtained in the present series. In these cases, no significant difference between good and poor responders was observed in terms of the average ADC signal measured in different parts of the tumor at the end of neoadjuvant chemotherapy. This calculation (i.e., average ADC signal) was the same as that used by Uhl [24] and Hayashida [25]. In Oka’s study, however, the minimum ADC signal of the tumor and the minimum ADC ratio ((ADC3 − ADC1)/ADC1) were also calculated, and good responders had a significantly higher minimum ADC ratio than poor responders (1.01 ± 0.22 vs. 0.55 ± 0.29) [26].

In the present series, we observed that the best predictor of poor responders was the ADC differential at mid-course of chemotherapy (ADC2 − ADC1). Furthermore, the method we used was associated with good intra- and interobserver reproducibility.

The comparison of ADC results between different series expressed as absolute ADC values remains difficult. The main reason is the use of different MR techniques and settings. DW sequences are highly sensitive to acquisition parameters: B0 magnetic field, gradient intensity, type of sequence (TE, suppression of background), choice of diffusion gradients, field of view sizes, slice thicknesses, coding direction, bandwidth, concomitant use of imaging, and size and position of region of interest (ROI) [16, 17, 33]. One way to standardize the results is to use ADC differentials or variations. ADC differentials should be more reproducible and less dependent on the MR unit. ADC variations in percentage terms should also be more reproducible and could also be more easily understood by clinicians for comparison to histological response. However, in our series, ADC variations appear to be less accurate than ADC differentials for distinguishing poor responders.

Spin echo-echo planar imaging (EPI) sequences are the most commonly used since they are performed within a few seconds, reducing the risk of motion artefacts. However, this sequence is associated with low signal-to-noise ratio and is highly sensitive to magnetic field homogeneity. Since the anatomic regions explored in osteosarcoma patients are not especially prone to motion artefacts, we elected to use a turbo spin echo sequence in order to increase the signal-to-noise ratio. This allowed us to work with a high diffusion gradient (b = 900 s/mm²) but with less magnetic field distortion as compared to EPI sequences. This high diffusion gradient improved the results by decreasing the perfusion-related component of the diffusion signal. Uhl [24], Hayashida [25], and Oka [26] used EPI sequences at 1.5 Tesla, with diffusion gradients of 700–1,000 s/mm². However, this difference is probably not the only factor explaining this variation in the results. All authors working with DW-MRI in osteosarcomas, including us, selected an initial b-value equal to 0. In order to exclude the perfusion contribution, a higher first b-value of up to 100 s/mm² should be considered. All previous DW-MR studies [4, 24–26] based their measurements on small ROIs. We preferred to use an overall measurement of the whole tumor after delineation on a morphological image. ADC was therefore calculated including all intra- and extraosseous tumor margins. We elected to take into account all of the variables that can influence diffusion parameters, before and also during chemotherapy. Even if the intraosseous volume does not actually change during chemotherapy, its content is modified in terms of tumor cellularity, necrosis, edema, ossification, and hemorrhage. All of these modifications, especially tumor cellularity, will have an impact on the ADC values.

Furthermore, to be more representative of the tumor modifications, we used a 20 mm slice thickness. When using the Huvos grading, pathologists use a 5 mm thick coronal section. Therefore, our DW-MRI slices are actually more representative of the tumor than the pathologist’s slice. Our sample size and follow-up are currently too limited to compare the prognostic values of DW-MRI and Huvos grade. However, as MR is able to assess more of the tumor volume than pathology, MR could be more accurate in terms of long-term survival prediction.

Our study has limitations. The number of patients with this rare pathology is small, which limits the statistical power of our results. However, a trend seems to emerge and needs to be confirmed by a multicenter trial. We arbitrarily decided to conduct MRI-2 at the mid-course of chemotherapy as this represents the routine follow-up time point. Earlier DW-MRI (e.g., after the first chemotherapy course) might provide the same information. This could be further assessed with serial MR studies in the future.

Conclusion

Our study suggests that mid-course DW-MRI is a potential method for obtaining an early prognostic factor to monitor neoadjuvant chemotherapy in osteosarcoma patients. If these results prove to be significant in a larger population, we assume that patients identified as being poor responders during early treatment could benefit from intensified treatment.