Introduction

Osteosarcoma (OS), though a rare disease, is the most frequent primary bone tumor [1]. OS arises in or adjacent to regions of high bone growth, most commonly involving the distal femur, proximal tibia or humerus. OS parameters affecting survival include American Joint Committee on Cancer (AJCC) stage, histology, age, gender, and site of disease [2, 3]. These factors are all taken into consideration when determining whether to include specific neoadjuvant and/or adjuvant therapies in addition to surgery. New and more aggressive chemotherapy protocols have recently been introduced in patients with high risk OS, and the National Comprehensive Cancer Network (NCCN) suggests switching chemotherapy protocols in patients with poor neoadjuvant response [4].

The radiologic diagnosis of OS is usually made with plain radiography and MRI. Furthermore, OS is one of the most FDG-avid tumors, and the clinical application of FDG PET/CT has been well documented. The combined metabolic and morphological information provided by 18F-FDG PET/CT imaging allows for high diagnostic performance in the detection of OS and its distant metastases. Therefore, this technique is useful both for initial staging and for follow-up.

Several papers have demonstrated the ability of 18F-FDG PET/CT to predict clinical outcome, as well as histological response before surgery. Histological response is highly predictive of patient survival; however, this can only be assessed on surgical pathology after resection, at which point the neoadjuvant chemotherapy (CHT) protocol could no longer be modified. Thus, the ability to assess response based on the detection of viable tumor cells on PET is of significant clinical value. Furthermore, the combination of histological response and PET/CT data may further improve prediction of prognosis [5].

PET/CT has rapidly evolved as a technique in the field of radiology and is the first hybrid technique to be widely used worldwide. In the first PET studies, lesions were defined as 18F-FDG-avid based on comparison with the background. Later, semi-quantitative parameters such as SUVmax and SUVmean were developed to better quantify uptake. More recently, other quantitative methods, such as total lesion glycolysis (TLG) and metabolic tumor volume (MTV), have been studied and shown to be valuable in identifying malignant tissue and assessing response to therapy.

The purpose of this article is to review different tools of FDG PET/CT image analysis and their correlation with clinical outcome in the setting of OS.

Research Methodology

This review is based on data gathered from PubMed searches (Table 1). PubMed was searched for the terms “Osteosarcoma and PET/CT,” “Osteosarcoma imaging,” and “Osteosarcoma and radiology,” restricted to papers published in or after 2009. Both prospective and retrospective studies were included. All papers were reviewed and scored for methodological quality according to the quality assessment tool for diagnostic accuracy studies (QUADAS). This 14-item tool is composed of five items related to verification bias, three items related to review bias, two items related to generalizability and to context and spectrum bias, and four items related to reporting. Each of the 14 items was scored “1” for “yes” and “0” for “no,” and only papers with a score of 8/14 or higher were considered for this review.
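The inclusion rule described above can be sketched in a few lines of Python. This is only an illustration of the scoring arithmetic; the function names and the example answers are hypothetical and are not taken from any of the reviewed papers.

```python
# Illustrative sketch only; the names below are hypothetical.
QUADAS_ITEM_COUNT = 14   # number of items in the QUADAS tool, as stated in the text
INCLUSION_CUTOFF = 8     # minimum score for inclusion, as stated in the text


def quadas_score(answers):
    """Return the QUADAS score: 1 point per "yes" (True), 0 per "no" (False)."""
    if len(answers) != QUADAS_ITEM_COUNT:
        raise ValueError("QUADAS requires exactly 14 yes/no answers")
    return sum(1 for answer in answers if answer)


def is_included(answers):
    """Apply the inclusion rule: a score of 8/14 or higher."""
    return quadas_score(answers) >= INCLUSION_CUTOFF


# Made-up example: a paper with 9 "yes" answers and 5 "no" answers.
example_paper = [True] * 9 + [False] * 5
print(quadas_score(example_paper), is_included(example_paper))
```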

Table 1 List of FDG PET/CT studies including osteosarcoma populations and quantification parameters used since 2009

OS Chemotherapy Treatment Protocols and General Considerations

Low-grade OS is usually treated without induction chemotherapy (CHT), while high-grade OS is usually treated with adjuvant chemotherapy; therefore, this review focuses specifically on high-grade OS. The most valuable and clinically available protocol for the assessment of OS by FDG PET/CT includes scans before and after induction CHT, which in this paper are referred to as PET/CT1 and PET/CT2, respectively. If the disease is controlled, the patient will usually go on to surgery. On pathology, a histologic response to therapy is defined as viable tumor comprising less than 10% of the specimen [6]. Adjuvant CHT is administered postoperatively to control possible circulating micrometastases. At this point, the patient enters a pathway of clinical and radiological follow-up lasting several years. Long-term follow-up is required because OS recurrences occur at a median of 2.5 years (range 1.2–8.2 years) after the initial diagnosis [7], and early diagnosis of recurrent disease improves patient survival [7].

PET/CT

The clinical protocol for PET/CT in primary bone tumor assessment includes a whole-body examination from vertex to thighs or toes, depending on the primary tumor site. Patients fast for at least 6 h, blood glucose levels are confirmed to be between 80 and 120 mg/dl, and 555–740 MBq of 18F-FDG is then administered intravenously according to the patient’s weight. Usually neither intravenous nor oral contrast is used, because the CT scan is utilized primarily for attenuation correction of the PET data; musculoskeletal structures, lymph nodes, and lung nodules can still be clearly identified to localize areas of focal 18F-FDG uptake. Standard CT acquisition parameters are often close to 130 kV, 110 mA, and 3.75 mm axial slice spacing. Emission PET, performed in either 2D or 3D mode, is obtained about 60 min after 18F-FDG injection for 5–8 bed positions (3–5 min each). Finally, attenuation-corrected datasets are reconstructed and used to assess clinical parameters such as SUVmax, SUVmean, total lesion glycolysis, and metabolic tumor volume (Fig. 1).

Fig. 1

A 12-year-old female with tibial osteosarcoma; an incidental osteochondroma at the lateral aspect of the tibia is also noted. An ROI was drawn to include the entire tumor, and semiquantitative PET/CT values were calculated. Panel a shows the 2 ROIs used to assess the tumor/background ratio. Panel b demonstrates SUV maximum, minimum, and average calculations. Panel c demonstrates SUV maximum, peak, and average pre- and postinduction chemotherapy (bottom and top rows, respectively). Panel d shows the calculation of TLG with a threshold of 42% of SUVmax, pre- and postinduction chemotherapy (bottom and top rows, respectively). Panel e shows the metabolic volume.

SUV: Standardized Uptake Value

The widely used abbreviation SUV stands for Standardized Uptake Value. This value is determined with one of several available software packages by placing a region of interest (ROI) over the investigated lesion. SUV correlates linearly with ROI activity and body weight and is inversely correlated with the administered activity. However, this measurement is difficult to interpret because it is affected by several factors, such as body composition, serum glucose, level of hydration, time from injection to scan, possible extravasation, and therapy. The most commonly used SUV-derived parameter is the SUV maximum (SUVmax), which measures the FDG activity in the most avid pixel. SUVmax is usually considered the most reproducible measurement and is therefore often used in the clinical setting and in clinical trials.
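The proportionalities described above correspond to the usual body-weight-normalized definition of SUV: tissue activity concentration divided by the injected activity per unit of body weight. The following Python sketch illustrates that arithmetic. It assumes that both activities are decay-corrected to the same time point; the function names and example numbers are illustrative and are not drawn from any specific software package.

```python
def suv_body_weight(roi_activity_kbq_per_ml, injected_activity_mbq, body_weight_kg):
    """Body-weight-normalized SUV in g/ml.

    Assumption of this sketch: the ROI activity concentration and the injected
    activity are already decay-corrected to the same time point.
    """
    if injected_activity_mbq <= 0 or body_weight_kg <= 0:
        raise ValueError("injected activity and body weight must be positive")
    injected_kbq = injected_activity_mbq * 1000.0  # MBq -> kBq
    weight_g = body_weight_kg * 1000.0             # kg -> g
    return roi_activity_kbq_per_ml * weight_g / injected_kbq


def suv_max(voxel_suvs):
    """SUVmax: the single highest SUV value inside the ROI."""
    return max(voxel_suvs)


def suv_mean(voxel_suvs):
    """SUVmean: the average SUV over all values inside the ROI."""
    return sum(voxel_suvs) / len(voxel_suvs)


# Made-up example: 5 kBq/ml in the ROI, 370 MBq injected, 70 kg patient.
example_suv = suv_body_weight(5.0, 370.0, 70.0)
print(round(example_suv, 3))
```

Doubling the injected activity in this example halves the result, while doubling the body weight doubles it, which matches the direct and inverse relationships stated in the text.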

The primary lesion’s biological aggressiveness, locoregional extent, and involvement of adjacent tissues can be characterized by PET/CT. Higher FDG avidity is associated with higher histologic grade. On the other hand, necrotic tumor components are associated with decreased FDG uptake. Therefore, the area of maximum activity represents the optimal target for biopsies.

Several articles have assessed the relationship between SUVmax and clinical outcome in OS cohorts [8, 9•, 10,11,12]. High SUVmax values before and after CHT (SUVmax1 and SUVmax2, respectively) were shown to be predictive of worse progression-free survival in a study from MD Anderson Cancer Center that included 31 patients [9•]. In particular, cutoff points of 15 g/ml for SUVmax1 and 5 g/ml for SUVmax2 were both significantly associated with poor progression-free survival. In addition, the same study, along with a second work with a partially overlapping cohort, demonstrated that high SUVmax2 with a cutoff point of 3.3 g/ml was associated with poor overall survival [9•, 13]. Another study including 40 patients demonstrated that a low SUVmax2 value is associated with better progression-free survival; in this case, the optimal cutoff point was identified as an SUVmax2 of 2.5 g/ml [8].

With regard to the relationship between histological response and SUVmax, several papers demonstrated that SUVmax before and after induction CHT, as well as the difference between the two values (calculated as a ratio or as a percentage change), could predict the histological response [14•, 15,16,17,18,19]. Byun et al. deemed a difference in SUVmax pre- and post-CHT of greater than 52% to be the optimal criterion to predict histological response. Im et al. showed that the post-therapy SUVmax was also able to predict response to induction CHT, obtaining an SUVmax of 3.0 as the optimal cutoff soon after the end of the first cycle [15]. Similarly, an SUVmax value of less than 2.5 after chemotherapy corresponded to tumor necrosis in OS [20].
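As a rough illustration of the percentage-change criterion, the sketch below computes the relative decrease in SUVmax between the pre- and post-CHT scans and compares it with the 52% cutoff quoted above. The function names and the example values are hypothetical, and the cutoff is used here only to show the arithmetic, not as a validated decision rule.

```python
SUVMAX_DECREASE_CUTOFF = 52.0  # percent; the cutoff quoted in the text above


def percent_decrease(pre_value, post_value):
    """Relative decrease from the pre-therapy to the post-therapy value, in percent."""
    if pre_value <= 0:
        raise ValueError("pre-therapy value must be positive")
    return (pre_value - post_value) / pre_value * 100.0


def exceeds_decrease_cutoff(suvmax_pre, suvmax_post, cutoff_percent=SUVMAX_DECREASE_CUTOFF):
    """True when the SUVmax decrease is greater than the chosen cutoff."""
    return percent_decrease(suvmax_pre, suvmax_post) > cutoff_percent


# Made-up example: SUVmax falls from 10.0 before CHT to 4.0 after CHT.
example_change = percent_decrease(10.0, 4.0)
print(round(example_change, 1), exceeds_decrease_cutoff(10.0, 4.0))
```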

Another parameter used to describe FDG uptake is the SUVmean. It is calculated over an isocontour-defined volume delineated with a chosen threshold relative to the maximum voxel value. In the most recent literature, this parameter has been reported as a standalone variable less often than SUVmax; rather, SUVmean is often used to calculate other parameters such as total lesion glycolysis (TLG), which will be discussed in detail later. This may be related to a possible problem with reproducibility: a study assessing both OS and soft tissue tumors found SUVmean to be more observer dependent than SUVmax [21].

A more recently introduced semi-quantitative parameter is the SUVpeak. This is calculated as the average SUV in a small, fixed-size area located in a high-uptake part of the tumor. Because it samples several pixels rather than a single pixel, SUVpeak is less affected by image noise and is considered a more robust alternative to SUVmax [22,23,24,25]. Given its recent introduction, fewer studies on dedicated OS cohorts are available for SUVpeak than for SUVmax. However, there are several reports of the usefulness of SUVpeak, in particular as a predictor of event-free survival [26, 27].
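The difference between SUVmax and SUVpeak can be illustrated with a deliberately simplified sketch. The code below works on a single 2D slice and averages a fixed square window centered on the hottest pixel; both the 2D simplification and the square window are assumptions made for brevity, since published SUVpeak definitions vary in the shape, size, and placement of the sampled region.

```python
def suv_peak_2d(image, radius=1):
    """Average SUV in a fixed square window centered on the hottest pixel.

    `image` is a list of rows of SUV values. The window spans `radius` pixels
    in each direction and is clipped at the image border.
    """
    rows, cols = len(image), len(image[0])
    # Locate the hottest pixel (this pixel's value is the SUVmax).
    r0, c0 = max(
        ((r, c) for r in range(rows) for c in range(cols)),
        key=lambda rc: image[rc[0]][rc[1]],
    )
    window = [
        image[r][c]
        for r in range(max(0, r0 - radius), min(rows, r0 + radius + 1))
        for c in range(max(0, c0 - radius), min(cols, c0 + radius + 1))
    ]
    return sum(window) / len(window)


# Made-up slice with one noisy hot pixel surrounded by low uptake.
example_slice = [
    [1.0, 1.0, 1.0],
    [1.0, 10.0, 1.0],
    [1.0, 1.0, 1.0],
]
example_max = max(max(row) for row in example_slice)
example_peak = suv_peak_2d(example_slice)
print(example_max, example_peak)
```

In this made-up slice, a single outlying pixel dominates SUVmax, whereas the window average is much lower, which is the noise-damping behavior the text attributes to SUVpeak.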

Tumor/Background Ratio: TBR

Tumor-to-Background Ratio (TBR), along with SUVmax, is one of the most-used parameters to quantify tumor uptake. Several ways of calculating this parameter have been described in the literature [22, 29]; they differ slightly from one another, mainly in the position of the background ROI. For example, TBR can be calculated by drawing one ROI around the tumor and a second ROI in the same area on the contralateral extremity (background ROI) [21]. The TBR is then calculated by dividing the tumor SUVpeak by the mean activity in the contralateral background ROI.
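The contralateral-ROI variant described above reduces to a single division, sketched below in Python. The function name and example values are hypothetical, and other published variants place the background ROI differently.

```python
def tumor_to_background_ratio(tumor_suv_peak, background_voxel_suvs):
    """TBR: tumor SUVpeak divided by the mean uptake in the background ROI."""
    if not background_voxel_suvs:
        raise ValueError("background ROI must contain at least one value")
    mean_background = sum(background_voxel_suvs) / len(background_voxel_suvs)
    if mean_background <= 0:
        raise ValueError("mean background uptake must be positive")
    return tumor_suv_peak / mean_background


# Made-up example: tumor SUVpeak of 8.0, contralateral background ROI values below.
example_tbr = tumor_to_background_ratio(8.0, [0.5, 1.0, 1.5])
print(example_tbr)
```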

TBR and SUVmean were found to be significantly more observer dependent than SUVpeak and SUVmax [21]. This may be one of the reasons why investigations of OS using these parameters are rare in the most recent literature. TBR was described as a predictor of clinical outcome in several papers in the 1990s and early 2000s [28,29,30]. One of the most recent papers using TBR, by Ye et al., demonstrated a significant association between TBR and histological findings [29]. TBR was measured both before (TBR1) and after (TBR2) chemotherapy, and the TBR-ratio (TBR2/TBR1) was then calculated. While no significant differences in TBR1 values were observed between responders and nonresponders, TBR2 was significantly lower in responders; in addition, the TBR-ratio was significantly correlated with histological response.

MTV: Metabolic Tumor Volume

Metabolic Tumor Volume (MTV) is calculated by adding the volumes of the voxels with uptake above a pre-established threshold. Several thresholds have been investigated and assessed in comparison with other variables. For example, Byun et al. [31] applied thresholds of SUV 2.0, SUV 2.5, and 45% of the tumor SUVmax to calculate the MTV. This work showed that MTV with a threshold SUV of 2.5 at baseline correlated with histological response and that an MTV of 47 ml after one cycle of CHT was the optimal cutoff to predict histological outcome. The same study also analyzed the percentage change between pre- and post-therapy scans, calculated as (MTV1 − MTV2)/MTV1 × 100, and again demonstrated a statistically significant correlation with the histological specimen for a reduction in MTV of more than 70% [31]; similarly, Gaston et al. [20] demonstrated that an MTV2/MTV1 ratio below 0.5 was associated with favorable histologic response. Further studies confirmed these results in different OS cohorts using different thresholds and ratios [15, 32].
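A minimal sketch of the MTV calculation follows, using a flat list of voxel SUVs and a uniform voxel volume. It shows both a fixed SUV threshold and a threshold defined as a fraction of SUVmax. The function names, the choice to include voxels exactly at the threshold, and the example values are assumptions of this sketch rather than details taken from the cited studies.

```python
def metabolic_tumor_volume(voxel_suvs, voxel_volume_ml, threshold):
    """MTV in ml: number of voxels at or above the threshold times the voxel volume."""
    included = sum(1 for suv in voxel_suvs if suv >= threshold)
    return included * voxel_volume_ml


def relative_threshold(voxel_suvs, fraction_of_max=0.45):
    """Threshold defined as a fraction of the ROI's SUVmax (e.g., 45%)."""
    return fraction_of_max * max(voxel_suvs)


# Made-up ROI: six voxels of 0.5 ml each.
example_roi = [0.8, 1.9, 2.6, 3.0, 5.0, 10.0]
voxel_ml = 0.5

mtv_fixed = metabolic_tumor_volume(example_roi, voxel_ml, threshold=2.5)
mtv_relative = metabolic_tumor_volume(
    example_roi, voxel_ml, threshold=relative_threshold(example_roi)
)
print(mtv_fixed, mtv_relative)
```

The two thresholds yield different volumes for the same ROI in this example, which illustrates why MTV cutoffs reported with different thresholds are not directly comparable.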

TLG: Total Lesion Glycolysis

Total Lesion Glycolysis (TLG), also called the Larson–Ginsberg index, is another semi-quantitative parameter, calculated by multiplying the SUVmean by the MTV. TLG corresponds to the global metabolic activity in all viable cells of the target lesion above a specific threshold designed to exclude background activity. This biomarker is therefore thought to better represent the metabolic activity of the entire tumor. At the time of its introduction about 10 years ago by Dr. Larson, computation of TLG was lengthy and time consuming; with advances in technology, new software has been developed to simplify this analysis. With recent programs, it is enough to draw a volumetric ROI including the tumor, and the TLG is then calculated automatically. TLG has the potential to add value to clinical staging, patient follow-up, and CHT optimization.
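The definition above (SUVmean multiplied by MTV, both taken over the voxels above the threshold) can be sketched as follows. The function name, the inclusion of voxels exactly at the threshold, and the example values are assumptions of this illustration.

```python
def total_lesion_glycolysis(voxel_suvs, voxel_volume_ml, threshold):
    """TLG = SUVmean x MTV, both computed over voxels at or above the threshold."""
    included = [suv for suv in voxel_suvs if suv >= threshold]
    if not included:
        return 0.0  # no voxel reaches the threshold, so there is no metabolic volume
    mtv_ml = len(included) * voxel_volume_ml
    suv_mean = sum(included) / len(included)
    return suv_mean * mtv_ml


# Made-up ROI: six voxels of 0.5 ml each, thresholded at SUV 2.5.
example_roi = [0.8, 1.9, 2.6, 3.0, 5.0, 10.0]
example_tlg = total_lesion_glycolysis(example_roi, voxel_volume_ml=0.5, threshold=2.5)
print(round(example_tlg, 2))
```

Because SUVmean times MTV equals the sum of the included voxel SUVs times the voxel volume, TLG reflects both how large and how avid the thresholded lesion is.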

The threshold used to calculate TLG varies slightly among studies, ranging from an SUV of 1.5 to 3.0 or defined as a percentage of the tumor SUVmax [9•, 15, 31]. Thresholding is particularly efficient when there is a high tumor-to-background ratio. Costelloe et al. [9•] set the threshold for TLG at 45% of the tumor SUVmax and demonstrated that high TLG in pre-induction CHT tumors is predictive of poor outcome. Further analysis showed that an increase in TLG on the postinduction therapy study was associated with poor overall survival [9•]. TLG was also associated with histological response, with an optimal cutoff value of 190 ml after one cycle of chemotherapy and a decrease of more than 90% after completion of chemotherapy being suggested. In addition, thresholds of 2.0 and 2.5 were better associated with the histologic necrosis fraction. In particular, the pre-/postinduction CHT ratio was associated with histologic outcome [15]. Similarly, Byun et al. [31] compared different thresholds and found that a threshold of 2.5 performed best.

Dual-Phase PET/CT

Dual-phase FDG PET/CT consists of image acquisitions at two different time points after injection. The interval between acquisitions varies widely but is usually less than 2 h because of FDG decay. The principle of this technique is that tumor usually continues to accumulate FDG over time, whereas nonmalignant causes of uptake, such as inflammation, usually show decreasing FDG accumulation after 60–90 min. This technique has already been applied to other cancers, such as breast and colon cancer [33, 34], and was demonstrated to be useful both for distinguishing benign from malignant lesions and for predicting prognosis. Only one paper on dual-phase FDG PET/CT focused on OS was found in our PubMed search [35]. Byun et al. acquired PET images at 60 and 150 min after FDG injection and calculated the change in SUVmax between early (60 min) and late (150 min) images as the SUVmax ratio. They demonstrated that the SUVmax ratio after induction CHT correlated with the histological response [35]. In addition, this parameter could be used in conjunction with the more widely used SUVmax to improve prediction of tumor response.

Technique Limitations

Several limitations are common to previous publications assessing the use of PET/CT in OS. First, most of the studies are limited by a small sample size, which limits the statistical power of their results. Only a small number of cancer centers have a volume of OS patients large enough to allow for large population studies. This limitation is mostly related to the low incidence of OS: even though OS is the most common primary bone tumor, its incidence is low, at 4–7 new cases per million per year as reported by a group at MD Anderson Cancer Center [36]. Second, all selected studies are affected by limitations related to the use of 18F-FDG as the radiotracer. Increased FDG uptake after CHT can be confounded by inflammatory etiologies, which can limit the utility of FDG uptake-derived parameters. Only one study, using the dual-phase PET/CT technique, tried to address this specific limitation. In addition, the partial-volume effect produced by limited PET spatial resolution (7–8 mm) may result in underestimation of uptake in small lesions. This limitation likely did not significantly affect tumor assessment in the reviewed papers, given the usually large size of OS at presentation, typically ranging from 3–5 to 20–25 cm [5].

Statistical results were possibly affected by interdependence of variables. For example, AJCC staging already includes OS size; therefore, other variables such as tumor location might not be independent. As a case in point, upper extremity OSs are usually smaller than lower extremity OSs.

The continuous efforts of different groups to establish numerical or percentage cutoffs are often counterproductive, for several reasons. Scanners used at different institutions are not calibrated against one another, and it is very difficult to validate a specific number across all scanners. Furthermore, a specific cutoff may be useful within a clinical trial but is difficult to generalize to daily clinical practice.

Results are often inconsistent, and many studies do not agree with previously reported data [22]. Even within the same study, sometimes only some variables are statistically correlated with histological response or survival [9•, 14•]. For example, Andersen et al. did not find any association between clinical outcome and SUVmax or TBR in their osteosarcoma cohort [22]; similar results were found by Palmerini et al. [37•]. Costelloe et al. demonstrated correlation with overall survival only for postinduction CHT SUVmax [9•]. One possible explanation for these heterogeneous results is the relatively small sample size of those studies.

Finally, other drawbacks of PET/CT are the relatively high dose of radiation in pediatric patients, and the lack of a breath-hold CT acquisition of the lung [11]. Potential dose savings could be accomplished by combining the PET and a low-dose breath-hold CT in one imaging study, or by means of hybrid PET/MRI for staging and follow-up of pediatric sarcoma patients.

Conclusion

PET/CT may be useful as a tool for predicting outcome and response to therapy when all its limitations are understood and when it is used in conjunction with clinical data. Multiple quantification parameters beyond the routine SUVmax have demonstrated promising results in evaluating response. However, further investigations with prospective, multicenter studies are required before issuing evidence-based recommendations for 18F-FDG PET/CT as a prognostic tool in OS.