Introduction

Head and neck squamous cell carcinoma (HNSCC) is the fifth most common malignancy worldwide [1]. There are many challenges in the diagnosis and therapy of patients with head and neck tumours. Their disease course is often complicated by recurrent disease, regional lymphatic spread, synchronous primary tumours, and distant metastases. Although there are aggressive combined modality treatment regimens, there remains a high rate (up to 40 %) of locoregional recurrence [2]. Moreover, two-thirds of locoregional recurrences and lymph node metastases occur within the first 2 years [3]. However, these recurrences are still often underdiagnosed as evidenced by the results of two studies reporting a 30 % rate of occult relapse in asymptomatic patients 1 year after treatment [4, 5]. Traditional clinical prognostic factors are tumour size/stage, lymph node involvement, and anatomic subsite. Indeed, in oropharyngeal SCC, 5-year overall survival (OS) is 65 % and 21 % in patients with T2 and T4b stage, respectively. In oral cavity SCC, the presence of lymph node involvement reduces the 5-year OS from 80 % to 50 %, particularly when lymph node capsular involvement is present [6]. Despite careful evaluation of these clinical factors, it is difficult to reliably predict the outcome after selected treatment [7]. Identification of novel pretreatment factors capable of predicting patient outcome is thus of great interest. Patients whose prognoses are likely to be unfavourable with conventional approaches might be selected for alternative strategies. This may involve moving away from single modality therapy to multidisciplinary approaches, either by intensifying radiochemotherapy schedules, or by adding innovative biologic agents.

18F-FDG PET/CT is a medical imaging technique based on the glucose metabolism of tumour cells [8]. During the last decade, it has emerged as an essential imaging tool in the field of oncology, not only for diagnosis but also for prognostic and therapeutic evaluation [9, 10]. Many studies have investigated the prognostic value of FDG PET/CT in HNSCC. Maximum SUV (SUVmax), a semiquantitative measure of tumoral uptake, is a predictor of survival regardless of the size and stage of the tumour, but no consensus cut-off has been established, with reported values varying between 4 and 10 [11, 12]. For example, in our department we have shown in a cohort of 89 patients that a tumour SUVmax value of 7 is the best cut-off for both disease-free survival and OS [13].

Recently, metabolic tumour volume (MTV), defined as the volume of FDG activity in a tumour assessed by automated volume of interest (VOI) delineation, has been proposed as a new quantitative PET index. Indeed, MTV has been reported as an additional diagnostic [14] and prognostic imaging biomarker in various solid cancers [15]. A large MTV has already been suggested as a poor prognostic factor for lung cancer, oesophageal carcinoma and epithelial ovarian cancer [16–20], and even seems to be a better predictor of survival than SUVmax [16, 17]. Some studies have focused on this novel FDG PET/CT application in HNSCC [21–28], but the method of volume segmentation has not been clearly stated. To the best of our knowledge, few of these retrospective studies have compared the MTV obtained according to several different segmentation methods with multiple SUV thresholds, as was recently reported for non-small-cell lung cancer [29].

The aim of this study was therefore to prospectively determine MTV measured by 18F-FDG PET/CT using two different SUV-based segmentation methods with various thresholds, and to evaluate its significance as an independent prognostic factor for predicting event-free survival (EFS) and OS in comparison with more commonly used prognostic factors in patients with HNSCC.

Materials and methods

Patients and pathology

Patients referred for 18F-FDG PET/CT at initial staging of a histologically proven HNSCC at the Department of Nuclear Medicine of the University Hospital of Brest between February 2009 and March 2011 were prospectively included. Patients with a previous history of recurrence or suspected metastatic disease or suspicion of secondary cancer at initial evaluation of the disease were excluded.

Imaging technique

Scans were performed on a Gemini GXL PET/CT scanner (Philips, Eindhoven, The Netherlands). PET/CT imaging was performed 1 h after injection of approximately 370 MBq (5 MBq/kg) of 18F-FDG. Standard patient preparation consisted of a fast of at least 4 h and a serum blood glucose level of <7 mmol/l before 18F-FDG administration. The patients were supine with their neck supported in a semirigid headrest. They were allowed to breathe normally during the PET and CT acquisitions.

PET data were acquired in 3-D mode and reconstructed with a row-action maximum-likelihood iterative algorithm, using the CT data for attenuation correction. The Gemini scanner includes a six-slice multidetector-row spiral CT scanner with a transverse field of view of 600 mm. The CT parameters (collimation 6 × 5 mm, tube voltage 120 kV, and effective tube current 100 mAs) are standard for PET/CT studies and permit differentiation between tissues with good spatial resolution while ensuring that the patient does not receive a high radiation dose.

PET/CT parameter measurement

18F-FDG PET/CT datasets were evaluated by an experienced nuclear medicine physician blinded to the clinical history and the results of other imaging studies.

Tumour uptake

FDG uptake was determined from SUVs calculated from the following expression:

SUV = tissue radioactivity concentration (kBq/mL)/[injected dose (kBq)/patient weight (g)]
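The expression above can be sketched in a few lines of Python. This is only an illustration: the function name, argument names and example values are hypothetical, and the sketch ignores radioactive decay correction, which is not described here.

```python
def suv(tissue_kbq_per_ml: float, injected_dose_kbq: float, weight_g: float) -> float:
    """Body-weight SUV: tissue concentration / (injected dose / patient weight)."""
    if injected_dose_kbq <= 0 or weight_g <= 0:
        raise ValueError("injected dose and patient weight must be positive")
    return tissue_kbq_per_ml / (injected_dose_kbq / weight_g)


# Hypothetical example: a 70 kg patient injected with 350 MBq (5 MBq/kg)
# and a tumour voxel measured at 25 kBq/mL.
example_suv = suv(tissue_kbq_per_ml=25.0, injected_dose_kbq=350_000.0, weight_g=70_000.0)
print(f"SUV = {example_suv:.2f}")
```

With these assumed inputs the injected activity per gram is 5 kBq/g, so a voxel at 25 kBq/mL corresponds to an SUV of 5.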

SUVmax and SUVpeak were recorded for each primary site, with SUVpeak corresponding to the highest possible mean value of a 1 cm³ spherical VOI positioned within the tumour.

Metabolic tumour volume

The MTV was defined as the summed volume in millilitres of the primary tumour and was measured using a semiautomated contouring program on a Syngo workstation (Siemens, Erlangen, Germany). Different MTVs were obtained for each tumour using two SUV-based methods: an absolute and a relative threshold method. For the absolute threshold method, 3-D contours were applied around voxels with an SUV equal to or greater than 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0, 6.5 and 7.0. For the relative threshold method, 3-D contours were applied around voxels equal to or greater than 30 %, 40 % and 50 % of SUVmax.
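As a rough illustration of the two threshold methods, the following Python sketch counts the voxels at or above a threshold and multiplies by the voxel volume. It is not the workstation's contouring program: it assumes the voxel SUVs of a pre-selected tumour VOI are already available as a flat list, it ignores spatial connectivity, and the voxel volume and SUV values are hypothetical.

```python
from typing import Sequence

# Thresholds taken from the text: absolute SUV 2.0 to 7.0 in steps of 0.5,
# and relative thresholds of 30 %, 40 % and 50 % of SUVmax.
ABSOLUTE_THRESHOLDS = [2.0 + 0.5 * i for i in range(11)]
RELATIVE_FRACTIONS = [0.30, 0.40, 0.50]


def mtv_absolute(suvs: Sequence[float], voxel_volume_ml: float, threshold: float) -> float:
    """Volume (ml) of voxels whose SUV is equal to or greater than `threshold`."""
    return sum(1 for value in suvs if value >= threshold) * voxel_volume_ml


def mtv_relative(suvs: Sequence[float], voxel_volume_ml: float, fraction: float) -> float:
    """Volume (ml) of voxels at or above `fraction` of the SUVmax of the VOI."""
    if not suvs:
        return 0.0
    return mtv_absolute(suvs, voxel_volume_ml, fraction * max(suvs))


# Hypothetical VOI: seven voxels of 0.064 ml each (4 mm isotropic voxels).
voi_suvs = [1.2, 2.6, 3.4, 5.1, 6.3, 8.2, 10.0]
voxel_ml = 0.064

mtv_by_threshold = {f"MTV{t:.1f}": mtv_absolute(voi_suvs, voxel_ml, t) for t in ABSOLUTE_THRESHOLDS}
mtv_by_fraction = {f"MTV{int(f * 100)}%": mtv_relative(voi_suvs, voxel_ml, f) for f in RELATIVE_FRACTIONS}
print(mtv_by_threshold["MTV5.0"], mtv_by_fraction["MTV50%"])
```

In this toy VOI the SUVmax is 10, so the 50 % relative threshold and the absolute threshold of 5.0 select the same four voxels; in real tumours the two methods generally give different volumes.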

Clinical evaluation

Clinical data, including sex, tumour location, SCC histologic grade, treatment and initial pathologic stage according to the American Joint Committee on Cancer (AJCC) [30] and World Health Organization (WHO) [31] classifications, were recorded. Patients were treated in accordance with standard guidelines of the French Society of Cervico-facial Oncology, after multidisciplinary board approval. Patients were clinically followed up for at least 12 months to calculate EFS and OS. Clinical follow-up consisted of a standard head and neck examination including inspection and palpation of all anatomic head and neck subsites and examination of internal structures using a mirror and a flexible endoscope every 1 – 3 months during the first year and every 2 – 4 months during the second year after treatment, as recommended by the National Comprehensive Cancer Network, and radiography of the thorax each year [32]. EFS was defined as the time from diagnosis to disease progression, relapse, or death. OS was defined as the time from diagnosis to death.

Statistics

Receiver-operating characteristic (ROC) curve analysis was used to test the statistical significance of the differences among the multiple volumetric parameters calculated using the various SUV-based cut-off values. The area under the ROC curve (AUC) and the diagnostic accuracy for recurrence and death were used to select the best method.
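A minimal Python sketch of how candidate thresholds might be ranked by AUC is given below. It computes the AUC as the probability that a patient with an event has a larger MTV than a patient without one (the rank-based formulation), and it does not reproduce the significance testing or accuracy criteria used in the study. The patient values are hypothetical.

```python
from typing import Dict, Sequence


def roc_auc(values: Sequence[float], events: Sequence[int]) -> float:
    """AUC as the fraction of (event, non-event) pairs in which the event case
    has the larger value; ties count as half."""
    positives = [v for v, e in zip(values, events) if e]
    negatives = [v for v, e in zip(values, events) if not e]
    if not positives or not negatives:
        raise ValueError("both outcome classes are required")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical data: recurrence status and MTV (ml) at two thresholds for six patients.
recurrence = [1, 1, 1, 0, 0, 0]
candidates: Dict[str, Sequence[float]] = {
    "MTV2.5": [30.0, 20.0, 10.0, 25.0, 15.0, 12.0],
    "MTV5.0": [12.0, 6.0, 3.0, 4.0, 1.0, 0.5],
}

auc_by_method = {name: roc_auc(vals, recurrence) for name, vals in candidates.items()}
best_method = max(auc_by_method, key=auc_by_method.get)
print(auc_by_method, best_method)
```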

OS and EFS were chosen as endpoints to evaluate prognosis. Univariate analysis was first performed to test the significance of the following factors: age, sex, AJCC stage, TNM classification, SCC histologic grade, tumour location, treatment, SUVmax, SUVpeak and MTV. The Kaplan-Meier method was used to estimate EFS and OS probabilities. A log-rank test was used to compare survival distributions. Multivariate analysis using the Cox proportional hazards model was then performed to assess the potential independent effect of MTV, after adjusting for the effect of other relevant variables. Significance level of p-values was 0.05. All statistics were determined using XLSTAT-Life software (Addinsoft, Paris, France).
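For readers unfamiliar with the Kaplan-Meier method, the product-limit estimate can be sketched in plain Python as follows. This is an illustrative implementation, not the XLSTAT procedure used in the study, and the follow-up times are hypothetical.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float], events: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct event time.

    `events[i]` is 1 if the event was observed at `times[i]` and 0 if the
    patient was censored at that time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve: List[Tuple[float, float]] = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients sharing the same time point.
        while i < len(data) and data[i][0] == t:
            n_events += 1 if data[i][1] else 0
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


# Hypothetical follow-up (months) for six patients; 1 = event, 0 = censored.
km_curve = kaplan_meier([6, 9, 12, 18, 24, 24], [1, 1, 0, 1, 0, 0])
for time, prob in km_curve:
    print(f"t = {time} months, S(t) = {prob:.3f}")
```

Censored patients reduce the number at risk without lowering the curve, which is why the step at 18 months is computed on three patients rather than four.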

Results

Population

The study included 80 patients (70 men, 10 women; mean age ± SD 62.4 ± 9.0 years, median 61.3 years, range 46.4 – 80.9 years, at the time of diagnosis). Patient characteristics including age, sex, site of primary tumour, SCC histologic grade, AJCC stage, TNM classification and treatment are presented in Table 1. FDG PET/CT scans were performed at an average of 16.6 ± 8.4 days before the start of treatment. Mean SUVmax and SUVpeak were respectively 9.5 ± 5.6 and 7.7 ± 4.7.

Table 1 Characteristics of patients

Follow-up

Mean follow-up ± SD was 51.2 ± 27.1 months. Three patients (4 %) were lost to follow-up. Forty-two patients (53 %) showed local recurrent disease or distant metastases with a mean delay of 8.9 ± 4.7 months (median 225 days, range 44 – 735 days), and 33 patients (41 %) died of their head and neck cancer during the follow-up period, with a mean delay of 11.9 ± 5.9 months (median 358 days, range 44 – 720 days).

Metabolic tumour volumes

According to the ROC analysis results, and combining AUC, accuracy and p-values, MTV using an absolute SUV threshold of 5.0 (MTV5.0) yielded the best prediction of recurrence and death. Figures 1 and 2 show ROC curves illustrating the value of MTV5.0 for predicting recurrence and death, respectively. When using MTV5.0 >4.86 ml to predict recurrence, the sensitivity, specificity, positive predictive value, negative predictive value and accuracy were 61.5 %, 88.6 %, 86.7 %, 66.0 % and 74.0 %, respectively. When using MTV5.0 >4.86 ml to predict death, the sensitivity, specificity, positive predictive value, negative predictive value and accuracy were 66.7 %, 81.8 %, 73.3 %, 76.6 % and 75.3 %, respectively.
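The five performance measures quoted above all derive from a 2 × 2 table of test result against outcome, as the Python sketch below illustrates. The underlying counts are not reported in this article; the counts used here are hypothetical, chosen under the assumption of 77 evaluable patients so that they reproduce the percentages reported for predicting death.

```python
from typing import Dict


def diagnostic_metrics(tp: int, fn: int, fp: int, tn: int) -> Dict[str, float]:
    """Standard diagnostic performance measures, expressed as percentages."""
    total = tp + fn + fp + tn
    return {
        "sensitivity": 100.0 * tp / (tp + fn),
        "specificity": 100.0 * tn / (tn + fp),
        "ppv": 100.0 * tp / (tp + fp),
        "npv": 100.0 * tn / (tn + fn),
        "accuracy": 100.0 * (tp + tn) / total,
    }


# Hypothetical counts (NOT reported in the source), consistent with the
# percentages given for MTV5.0 > 4.86 ml as a predictor of death.
death_metrics = diagnostic_metrics(tp=22, fn=11, fp=8, tn=36)
for name, value in death_metrics.items():
    print(f"{name}: {value:.1f} %")
```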

Fig. 1 ROC curve using MTV5.0 to predict EFS

Fig. 2 ROC curve using MTV5.0 to predict OS

The proportions of MTV5.0 >4.86 ml in the T1, T2, T3 and T4 groups were respectively 7 %, 5 %, 42 % and 73 %. Retrospective analysis using Fisher’s test showed a significant correlation between tumour size (T classification) and MTV5.0 value (p < 0.001).

Figure 3 shows different volumes delineated on a hypopharyngeal tumour.

Fig. 3 Maximum intensity projection and axial PET/CT images showing examples of MTVs delineated with a semiautomated contouring program on a Syngo workstation (Siemens, Erlangen, Germany): MTV3.0, MTV5.0 and MTV50%. This patient with a hypopharyngeal SCC stage IV tumour (T4 N0 M0) had an MTV5.0 of 20.67 ml, and showed relapse 9 months after diagnosis.

Univariate analysis

Table 2 shows the factors associated with EFS and OS in the univariate analysis, including MTV5.0. There was a significant difference in EFS (p < 0.0001) between the high MTV5.0 group (>4.86 ml) and the low MTV5.0 group (≤4.86 ml; Fig. 4). The estimated 2-year EFS was 67 % (95 % CI 53 – 81 %) and 12 % (95 % CI 0 – 24 %) in patients in the low MTV5.0 group and the high MTV5.0 group, respectively. There was also a significant difference in OS (p < 0.0001) between the high MTV5.0 group and the low MTV5.0 group (Fig. 5). The estimated 2-year OS was 73 % (95 % CI 59 – 87 %) and 22 % (95 % CI 6 – 38 %) in patients in the low MTV5.0 group and the high MTV5.0 group, respectively.

Table 2 Univariate survival analysis

Fig. 4 Kaplan-Meier curves for EFS of two groups according to the optimal cut-off value of MTV5.0 (p < 0.0001)

Fig. 5 Kaplan-Meier curves for OS according to the optimal cut-off of MTV5.0 (p < 0.0001)

Multivariate analysis

All prognostic factors with significance in the univariate analysis were included in the multivariate model to evaluate their interaction and joint effect on EFS and OS. These parameters and the significance of the obtained estimators are presented in Tables 3 and 4. The adjusted Cox proportional hazards regression model revealed that both age and MTV5.0 were independently correlated with EFS (p = 0.011) and OS (p = 0.010). SUVmax and SUVpeak were not correlated with EFS (p = 0.277 and p = 0.188, respectively) or OS (p = 0.975 and p = 0.182, respectively).

Table 3 Multivariate Cox regression analysis of EFS
Table 4 Multivariate Cox regression analysis of OS

Discussion

HNSCC shows a high rate of locoregional recurrence [2]. Moreover, two-thirds of locoregional recurrences and lymph node metastases occur within the first 2 years [3]. Fewer than 50 % of patients survive more than 5 years from diagnosis [6]. Against this background, it seems clinically important to identify a subgroup of patients with a poor prognosis who may benefit from aggressive therapy aiming to improve their survival. Such a subgroup may also benefit from close monitoring with 18F-FDG PET/CT, whose accuracy in the diagnosis of occult recurrences has been proven [4, 5].

Several studies have demonstrated that high SUVmax of the primary tumour is correlated with worse local control and poor survival rates in patients with HNSCC, and that such patients should be considered for a more aggressive treatment approach [11, 12]. In contrast, some studies have not confirmed that tumoral SUVmax is a significant factor predicting clinical outcome in patients with pharyngeal cancer [22, 27]. In our results, neither SUVmax nor SUVpeak of the tumour was an independent predictive indicator of survival in the multivariate analysis. Similarly, Choi et al. found that higher SUVpeak (>6.2) of tumour tissue was not significantly associated with EFS (p = 0.721) or OS (p = 0.238) [23]. Furthermore, we note that SUVpeak is mainly used for assessing treatment response, and its prognostic value has not really been evaluated [33]. As for SUVmax, as mentioned above, the optimum cut-off value for predicting survival has not been established, varying between 4 and 10 in different studies [11, 12]. It therefore seems important to search for more relevant prognostic indicators than SUVmax or SUVpeak. A study by Higgins et al. suggested that SUVmean may be useful as a predictive factor for disease-free survival [34]. Furthermore, our team has already considered the possibility of using various SUVmax ratios such as tumour/liver or tumour/blood activities, which have proved interesting [13].

Because of the good correlation between MTV and gross tumour volume (GTV) segmentation in radiotherapy assessed by other imaging tools such as CT, we can expect that MTV will be another important factor to be considered in the management of HNSCC. Thus, Schwartz et al. found that FDG PET was the most accurate modality for delineating tumour volume in a study comparing CT, MRI and 18F-FDG PET/CT for measurement of tumour volume in pharyngolaryngeal SCC, with the results validated by surgical specimens [35].

Our results showed that in a multivariate analysis MTV >4.86 ml using an absolute SUV threshold of 5.0 was a significant independent predictor of poor EFS (HR 6.2, p = 0.011) and OS (HR 17.3, p = 0.010). There have been only a few reports of MTV as an independent prognostic factor in the treatment of HNSCC, compared with the many studies of GTV performed using CT or MRI. In the study by Lim et al. in 176 patients with oropharyngeal HNSCC undergoing definitive radiation therapy with concurrent chemotherapy, univariate analysis showed a significant relationship between MTV and local treatment failure (HR 2.4, p = 0.005) and OS (HR 1.8, p < 0.001) [26]. Furthermore, Seol et al. found that MTV of 9.3 ml or higher was significantly associated with an increased risk of relapse or death in 59 patients with HNSCC treated by chemoradiotherapy [28]. Moreover, a similar study of MTV in head and neck cancer by Chung et al. indicated that MTV >40 ml was a predictive factor for disease-free survival in a cohort of 82 patients [27]. In these two studies, the MTV regions were defined on the pretreatment PET/CT images using a fixed SUV of >2.5. This large difference in MTV using the same segmentation method can be explained by the fact that Chung et al. [27] included in their study 63 patients with nasopharyngeal cancer, which corresponded histopathologically to undifferentiated carcinoma, recognized as having more 18F-FDG-avid lesions. In contrast, La et al. used a threshold intensity value of 50 % of tumour SUVmax and found that MTV >17.4 ml was an adverse factor for recurrence and death in 85 patients treated for pharyngeal cancer [22].

As expected, we found a high proportion of tumours with MTV >4.86 ml among patients with large tumours, such as T4 lesions (73 %). Moreover, retrospective analysis showed a statistically significant correlation between tumour size and MTV5.0 value (p < 0.001). Tumour size has long been recognized as a prognostic factor in HNSCC [6]. However, in this study, neither AJCC stage nor TN classification was an independent predictive factor for survival in multivariate analysis, in contrast to MTV5.0. This supports our hypothesis that in addition to its morphology, a tumour’s MTV has to be considered for prognostic analysis.

To the best of our knowledge, our study is the only one in which MTV determined with so many thresholds using different segmentation SUV-based methods has been studied. Kao et al. used four segmentation methods and found that MTV2.5 >13.6 ml achieved the best predictive value for poor disease-free survival of pharyngeal cancer treated by definitive chemoradiotherapy [25]. In contrast, Moon et al. did not find that MTV based on an isocontour at SUVs over 2.5, 3.0, 3.5 or 4.0 was correlated with OS in 69 patients with SCC of the tonsil [24].

The current study had some limitations. Firstly, we did not measure overall tumour burden, including the primary lesion and lymph node metastasis, which might have provided a more accurate prognosis. In this study, only the MTV of the primary site was evaluated because its measurement is simple and applicable in routine clinical practice, and even if metastatic lymph nodes are adjacent but not completely contiguous to the primary tumour, it is still possible to correctly segment the lesion. Secondly, we did not evaluate the adaptive threshold method based on signal-to-background or the gradient-based method, which have been shown to add value to clinical staging of oral cavity and oropharyngeal SCC [14]. Among various PET segmentation methods, the gradient-based technique provided the best estimate of the true tumour volume in non-small-cell lung cancer [36]. However, this type of method is more tedious and requires preliminary calibration of the machine. Thirdly, the primary tumour sites were heterogeneous, the proportion of advanced stages was high, and the treatment regimens were nonuniform. Fourthly, the human papilloma virus (HPV) status of the oral cavity and oropharyngeal lesions was not determined. Recent literature suggests that HPV/p16 is an effective prognostic factor for SCC at these locations [37]. For example, Tang et al. showed that MTV predicted progression-free survival (HR 4.23; p < 0.0001) and OS (HR 3.21; p = 0.0029) in patients with HPV/p16-positive oropharyngeal cancer [19]. However, at the start of our study, HPV/p16 status was not determined routinely in our institution. Despite these limitations, we obtained highly significant results demonstrating that MTV is an independent prognostic factor in patients with HNSCC.

Recently, Chu et al. introduced the concept of MTV velocity, which corresponds to the increase in metabolic tumour volume over time in patients who have received two pretreatment 18F-FDG PET/CT scans. They demonstrated that primary tumour velocity predicts disease progression (HR 2.94; p = 0.001) and OS (HR 1.85; p = 0.03) [21]. This concept of MTV velocity should be explored in further studies, but the performance of two PET scans is ethically unacceptable. Nevertheless, we have previously suggested that the tumour retention index of FDG measured by dual time-point 18F-FDG PET/CT is a predictive factor for EFS independent of more commonly used prognostic factors [38]. A large retention index has also been reported to be a poor prognostic factor for survival in other solid cancers, such as non-small-cell lung cancer and malignant pleural mesothelioma [39, 40]. Thus, based on these data, calculating the difference in MTV between a whole-body acquisition and a delayed cervical step on a pretreatment dual time-point PET/CT scan could be a future approach.

Conclusion

Our results suggest that pretreatment MTV measured by 18F-FDG PET/CT can be used as an independent predictive factor for EFS and OS, with higher prognostic value than SUVmax. Patients with an MTV5.0 value greater than 4.86 ml should be considered for a more aggressive treatment approach or close monitoring.