Introduction

Oropharyngeal squamous cell carcinoma (OPSCC) is the most common subtype of head and neck cancer (HNC), and its incidence is increasing despite a decline in the overall incidence of HNC [1]. In the US, approximately 10,000 men and women are diagnosed with OPSCC and more than 1,500 patients die of the disease every year [2]. A recent trend towards organ-preserving strategies has resulted in the use of nonsurgical modalities such as radiotherapy (RT) or concurrent chemoradiotherapy (CRT) as the first choice of treatment in patients with OPSCC, particularly when associated with human papillomavirus (HPV) infection [3]. The treatment outcomes in OPSCC are generally favourable, with locoregional and distant failure rates of 9 – 20 % and 5 – 13 %, respectively, regardless of HPV status [4–6]. In general, patients with HPV-associated OPSCC have markedly better survival after CRT than those with HPV-negative OPSCC [6]. Despite the low rates of locoregional failure and development of second malignancies compared with cancers at other sites in the head and neck region, locoregional and distant failure remain the main factors limiting the survival of patients with OPSCC [6]. Therefore, the identification of high-risk patients with a poor prognosis is important for the selection of a suitable treatment, and to predict and improve the outcomes in patients with OPSCC.

Although the American Joint Committee on Cancer (AJCC) tumour-node-metastasis (TNM) staging system is a widely accepted prognostic indicator, this system is based almost exclusively on anatomical information and has limited prognostic accuracy [7, 8]. Primary tumour (T) staging indicates tumour size and its relationship with adjacent tissues [9], which is not representative of the three-dimensional volume or the biological behaviour of tumours. Lymph node (N) staging evaluates the number, size and side of metastatic lymph nodes [9], which is not an accurate representation of metastatic burden. Therefore, the identification of reliable prognostic factors predictive of treatment outcomes is necessary [7, 8].

Combined PET and CT using 18F-FDG is a widely used primary tool for tumour staging and for evaluating the therapeutic responses of solid tumours, including head and neck squamous cell carcinoma (HNSCC) [10–13]. 18F-FDG PET/CT can improve risk prediction and shows prognostic value as a biomarker of HNSCC [14, 15]. The maximum standardized uptake value (SUVmax), which is currently the most commonly used parameter in 18F-FDG PET/CT [12, 13, 16], is an easily readable value that can be determined by estimating the highest intensity of 18F-FDG uptake. However, it does not reflect the metabolic activity of the whole tumour [17]. Volumetric measurement of tumour with increased 18F-FDG uptake has recently emerged as a prognostic marker in human malignancies [18, 19]. Metabolic tumour volume (MTV) and total lesion glycolysis (TLG), which combine the tumour volume and metabolic activity of the entire tumour, were introduced as prognostic biomarkers in HNC [18, 20, 21].

The prognostic value of SUV and metabolic volumetric parameters has been demonstrated in patients with OPSCC [16, 20, 22, 23]. However, the prognostic significance of these imaging parameters needs to be examined in larger cohorts of OPSCC patients receiving definitive treatment. We therefore hypothesized that these metabolic indices could be useful in predicting outcome in OPSCC patients. Here, we evaluated the ability of SUV, MTV and TLG measured on pretreatment 18F-FDG PET/CT to predict recurrence and survival outcomes in patients with OPSCC.

Materials and methods

Study population

We reviewed the records of 517 patients who were identified by searching an electronic database for a referring physician’s diagnosis of oropharyngeal cancer and who had undergone pretreatment 18F-FDG PET/CT for initial staging between 2004 and 2012. Patients were included if they had previously untreated OPSCC, were aged >18 years, had undergone pretreatment 18F-FDG PET/CT scanning at initial staging and had received definitive treatment with curative intent. Patients were excluded if they had pathology other than OPSCC (n = 202; e.g. lymphoma, adenocarcinoma, minor salivary gland tumour and metastatic cancer to the oropharynx), initial distant metastasis (n = 8), or a previous history of HNC (n = 3). Also excluded were patients with incomplete clinical data (n = 12) and patients with 18F-FDG images that could not be properly analysed (n = 71) because they had been acquired with different equipment at other institutions. The remaining 221 patients were eligible for the study. All surviving patients were followed for at least 12 months. All included patients were restaged according to the AJCC staging system [9], using the pathology results in those who underwent surgery and re-reading of the PET/CT scans and other imaging studies in those who received CRT/RT. The present study was conducted under the review and approval of our Institutional Review Board. For this type of retrospective study, informed consent from each patient was waived.

Treatments and follow-up

The treatment modality for each patient was selected according to tumour stage, operability, patient condition and patient preference at a multidisciplinary team meeting that included head and neck surgeons, medical and radiation oncologists, radiologists and pathologists. Patients were divided into two groups according to primary treatment strategy: a surgery group (115 patients who underwent curative surgery with or without adjuvant therapy) and a nonsurgery group (106 patients who received RT or CRT). The surgical patients underwent excision of primary tumours and ipsilateral (n = 74) or bilateral (n = 21) neck dissection, and 95 of these patients received postoperative RT or CRT with a median dose of 66 Gy (range 44 – 78 Gy). Indications for postoperative RT/CRT were the presence of adverse pathological features, e.g. positive resection margin, local invasiveness, extracapsular spread of tumour, and cervical lymph node involvement. Primary RT and CRT were performed in 11 and 95 patients, respectively. RT was performed using intensity-modulated or three-dimensional conformal radiation administered in daily fractions of 1.8 or 2.0 Gy, 5 days per week for 7 to 8 weeks; for CRT, high-dose cisplatin (80 – 100 mg/m2) was infused every 3 weeks for three cycles. Of the 221 patients, 79 (35.7 %) received induction chemotherapy consisting of two or three cycles of cisplatin plus 5-fluorouracil with/without TS-1 or docetaxel administration. The patients who underwent induction chemotherapy followed by surgery were allocated to the primary surgery group.

After initial treatment, all patients received physical and endoscopic examinations at every clinic visit. The patients were followed up every 1 to 3 months during the first year, every 2 to 4 months during the second year, every 4 to 6 months during the third, fourth, and fifth years, and annually thereafter. Imaging work-ups included head and neck CT and/or MRI and whole-body 18F-FDG PET/CT.

18F-FDG PET/CT imaging

18F-FDG PET/CT scans were performed with a Biograph Sensation16 (BIO16; 145 patients) or TruePoint 40 (BIO40; 76 patients) system (Siemens Medical Systems, Knoxville, TN), equipped with a 16-slice or 40-slice CT scanner, respectively. All patients fasted for 6 h or longer and had a serum glucose concentration <150 mg/dL prior to 18F-FDG PET scanning. Whole-body images were obtained 50 – 70 min after intravenous injection of 333 – 688 MBq (9.0 – 18.6 mCi) of 18F-FDG. CT scanning was performed in spiral mode from the skull base to the proximal thigh at 100 mAs and 120 kV, with a section width of 5 mm and collimation of 0.75 mm. No oral or intravenous contrast medium was used for CT. CT scanning data were obtained for attenuation correction and image fusion, followed by three-dimensional caudocranial PET emission scanning with an acquisition time of 2.5 min per bed position with six or seven bed positions for the whole body, and 5 min per bed position with two bed positions for the head and neck.

The PET data were reconstructed with CT attenuation correction using an attenuation-weighted three-dimensional ordered subsets expectation maximization (3D-OSEM) algorithm. For the BIO16, two iterations and 16 subsets were used for the OSEM, followed by postreconstruction smoothing with a gaussian filter (4 mm FWHM). For the BIO40, three iterations and 21 subsets were used for the OSEM in conjunction with a point spread model-based algorithm (True-X), followed by 4 mm FWHM smoothing. To equalize the SUV between the two scanners, we used an SUV equalization method similar to that described elsewhere and included in the Siemens EQ-PET software [24–26]. Specifically, recovery coefficients (i.e. relative SUV ratios in relation to the ideal SUV of 2.5) were estimated for hot rods of different diameters in the American College of Radiology-approved phantom (Data Spectrum, Hillsborough, NC) filled with 18F-FDG water solution to set the background SUV to 1.0 and the hot rod SUV to 2.5. The SUVs of these hot rods were measured, and their recovery coefficients were estimated as well. Comparison of smoothed phantom images acquired with the BIO40 using smoothing kernels of different sizes with the phantom image acquired with the BIO16 allowed identification of the optimal smoothing kernel (i.e. 6 mm FWHM gaussian filter) for matching the recovery coefficients of the BIO40 to those of the BIO16.
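The kernel-matching step above can be illustrated with a minimal sketch. It assumes that a recovery coefficient is the measured hot-rod SUV divided by the nominal SUV of 2.5, and that the "optimal" kernel is the one minimizing the sum of squared differences between the recovery coefficients of the two scanners. The text does not state the matching criterion, so that choice is an assumption, and all phantom values below are hypothetical.

```python
IDEAL_HOT_ROD_SUV = 2.5  # nominal hot-rod SUV stated in the text


def recovery_coefficients(measured_suvs):
    """Recovery coefficient per rod: measured SUV relative to the ideal SUV."""
    return [s / IDEAL_HOT_ROD_SUV for s in measured_suvs]


def best_kernel(reference_rcs, candidate_rcs_by_fwhm):
    """Return the FWHM (mm) whose recovery coefficients are closest to the
    reference scanner. Closeness is the sum of squared differences, which is
    an assumed criterion, not one taken from the paper."""
    def mismatch(rcs):
        return sum((a - b) ** 2 for a, b in zip(reference_rcs, rcs))
    return min(candidate_rcs_by_fwhm,
               key=lambda fwhm: mismatch(candidate_rcs_by_fwhm[fwhm]))


# Hypothetical hot-rod SUVs, smallest to largest rod diameter.
bio16_rcs = recovery_coefficients([1.00, 1.50, 2.00, 2.25])
bio40_rcs = {
    4: recovery_coefficients([1.50, 2.00, 2.40, 2.50]),  # too sharp
    6: recovery_coefficients([1.05, 1.50, 2.05, 2.25]),  # close match
    8: recovery_coefficients([0.75, 1.20, 1.70, 2.00]),  # too smooth
}
chosen_fwhm = best_kernel(bio16_rcs, bio40_rcs)
```

With these made-up numbers the 6 mm candidate is selected; real phantom measurements would be needed to reproduce the value reported in the text.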

Image analysis

18F-FDG PET/CT findings were reviewed on a workstation (Petavision, Seoul, Korea) by a board-certified nuclear medicine physician (J.S.K.) with 20 years of clinical experience, who identified the lesions with increased tracer uptake. The nuclear medicine physician (J.S.K.) and a nuclear medicine physicist (J.O.) then performed the analyses blinded to patient outcome. 18F-FDG PET data were fed into the workstation in DICOM format, and intensity values were automatically converted to SUVs. SUVmax, SUVpeak, MTV and TLG were measured for each patient. The reported metabolic parameters represent the primary tumour. The SUV was used to determine 18F-FDG PET activity and was calculated using the following equation: SUV = A/(ID/LBM), where A is the decay-corrected activity in tissue (in millicuries per millilitre), ID is the injected dose of FDG (in millicuries), and LBM is the patient lean body mass (body weight in grams minus body fat weight). Spherical or ellipsoidal volumes of interest (VOIs) were placed over the lesions visible on PET images. SUVmax values were obtained by drawing the VOIs over the most intense voxel of the primary tumour within the oropharynx. SUVpeak was measured as a hybrid SUV value including a local average SUV in a group of voxels surrounding the voxel with the highest activity [17]. For MTV calculations, the contouring margins of the tumour were delineated with the SUV 2.5 isocontour, as described previously [18, 21]. Our in-house PET/CT quantification and VOI editing tool, AMC NM Toolkit for Image Quantification of Excellence (ANTIQUE), was used to automatically calculate the volume of the tumour in each VOI [27]. The metabolic volumes of primary tumours were then determined (Fig. 1). TLG was calculated as MTV × SUVmean.
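As a rough illustration of how these parameters relate to one another, the sketch below computes SUV from the stated equation and derives SUVmax, SUVmean, MTV and TLG from the voxel SUVs of a single VOI. It is not the ANTIQUE implementation. The function names, the flat list of voxel values, and the choice to include voxels with SUV equal to the 2.5 threshold are assumptions made for illustration, and SUVpeak is omitted because its neighbourhood definition is not specified here.

```python
from typing import Dict, Sequence

SUV_THRESHOLD = 2.5  # isocontour used for MTV delineation in the text


def suv_lbm(activity_mci_per_ml: float, injected_dose_mci: float,
            lbm_g: float) -> float:
    """SUV = A / (ID / LBM), with A in mCi/mL, ID in mCi and LBM in g."""
    return activity_mci_per_ml / (injected_dose_mci / lbm_g)


def volumetric_metrics(voxel_suvs: Sequence[float], voxel_volume_ml: float,
                       threshold: float = SUV_THRESHOLD) -> Dict[str, float]:
    """Return SUVmax, SUVmean, MTV (mL) and TLG (g) for one VOI.

    Voxels at or above the threshold are counted as tumour; whether the
    boundary value is included is an assumption of this sketch.
    """
    suv_max = max(voxel_suvs, default=0.0)
    inside = [s for s in voxel_suvs if s >= threshold]
    if not inside:
        return {"suv_max": suv_max, "suv_mean": 0.0, "mtv_ml": 0.0, "tlg": 0.0}
    mtv_ml = len(inside) * voxel_volume_ml   # volume inside the isocontour
    suv_mean = sum(inside) / len(inside)     # mean SUV inside the isocontour
    return {
        "suv_max": suv_max,
        "suv_mean": suv_mean,
        "mtv_ml": mtv_ml,
        "tlg": mtv_ml * suv_mean,            # TLG = MTV x SUVmean
    }


# Hypothetical VOI: five voxels of 0.5 mL each.
example = volumetric_metrics([1.0, 2.0, 3.0, 5.0, 10.0], voxel_volume_ml=0.5)
```

In this toy VOI three voxels exceed the threshold, giving an MTV of 1.5 mL, an SUVmean of 6.0 and a TLG of 9.0 g, while SUVmax depends only on the single hottest voxel.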

Fig. 1

Examples of standardized uptake value (SUV) and metabolic tumour volume (MTV) measurements in two patients with oropharyngeal squamous cell carcinoma (a and b). The volumes of interest (circles) were defined to include the primary tumour with metabolic activity, and the software automatically calculated the SUVmax, SUVpeak, SUVmean and MTV. The TLG was calculated as MTV × SUVmean. Patient a, with high SUVmax and MTV, died from disease recurrence 31 months after concurrent chemoradiotherapy. Patient b, with low SUVmax and MTV, was disease-free 48 months after chemoradiotherapy at the time of this report.

Statistical analysis

Continuous variables are expressed as median and range, and categorical variables as number and percentage. The study endpoints were the oncological outcomes of disease-free survival (DFS) and overall survival (OS). DFS was defined as the time from the last day of treatment until the first evidence of disease recurrence. OS was defined as the time from the first day of treatment to the date of death from any cause or the last clinical follow-up. The optimal cut-off values for imaging parameters of 18F-FDG PET were set according to the method of a previous study [28]. Test statistics (Q statistics) were used to define the cut-off points, and the maximum Q allowed the cut-off point to be estimated, and its significance determined, under any monotonic relationship between the continuous variable and the survival outcome (DFS). Survival curves were estimated using the Kaplan-Meier method, and survival outcomes were compared using log-rank tests. Univariate and multivariate analyses using the Cox proportional hazards regression model were performed to identify significant factors for predicting OS and DFS. Variables with P < 0.05 in the univariate analyses were selected for the multivariate analysis. All tests were two-sided, and a P value < 0.05 was considered statistically significant. Statistical analyses were performed using IBM SPSS software version 21.0 (IBM, Armonk, NY) and SAS version 9.3 (SAS Institute Inc., Cary, NC).
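For readers unfamiliar with the Kaplan-Meier method, the following is a minimal sketch of the product-limit estimator in plain Python. It is not the SPSS or SAS procedure used in the study. The function name and the follow-up data are hypothetical, and the log-rank test, Cox regression and Q-statistic cut-off search are deliberately not reproduced.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times:  follow-up duration for each patient (e.g. months)
    events: 1 if the event (death or recurrence) was observed, 0 if censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / n_at_risk
            curve.append((t, survival))
        # Both events and censored patients leave the risk set after t.
        n_at_risk -= n_leaving
    return curve


# Hypothetical cohort of five patients (months of follow-up, event flag).
km_curve = kaplan_meier([5, 8, 12, 12, 20], [1, 0, 1, 1, 0])
```

In this toy cohort the estimate drops to 0.8 at 5 months (one event among five at risk) and to about 0.27 at 12 months (two events among the three still at risk). Censored patients reduce the risk set without lowering the curve.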

Results

Patient characteristics

A total of 221 patients comprising 197 men and 24 women with a median age of 60 years (range 37 – 87 years) were analysed. The characteristics of the patients are summarized in Table 1. Of the 221 patients, 115 (52.0 %) underwent primary surgery with or without postoperative RT or CRT, and 106 (48.0 %) received RT or CRT as the primary treatment modality. Of the 221 patients, 39 (17.6 %) experienced disease recurrence, including 23 (10.4 %) with locoregional recurrence, 15 (6.8 %) with distant metastases, and 1 with both regional and distant metastases. At last follow-up, 157 (71.0 %) patients were alive without disease, 4 (1.8 %) were alive with disease, 34 (15.4 %) had died of the disease, and 26 (11.8 %) had died of other causes. The 5-year OS and DFS rates were 72.0 % and 79.5 %, respectively. The median follow-up period for survivors was 61 months (range 18 – 122 months).

Table 1 Characteristics of the 221 patients

Measurement of the cut-off values of imaging parameters

The median (range) SUVmax, SUVpeak, MTV and TLG were 7.8 (2.5 – 51.3), 6.8 (2.2 – 40.0), 10.4 mL (0.1 – 172.4 mL) and 44.5 g (0.3 – 2,394 g), respectively. Optimal Cut Point Search was used to find the maximal Q for the calculation of the optimal cut-off values for DFS. The cut-off values for primary tumour SUVmax and SUVpeak were 7.55 and 6.80, respectively, and Q statistics were 2.15 and 2.08, respectively (P < 0.001). The cut-off value for MTV was 11.06 mL and the Q statistic was 1.57 (P = 0.014). The cut-off value for TLG was 78.56 g and the Q statistic was 1.46 (P = 0.027).

Univariate and multivariate analyses

Survival outcomes were compared according to the dichotomized values of clinical variables and 18F-FDG PET/CT imaging parameters by univariate and multivariate analyses. Univariate analyses for OS revealed that age >60 years, nonsurgical treatment, advanced tumour stage, high primary tumour SUVmax and SUVpeak, and high MTV and TLG were significantly associated with decreased OS (P < 0.05 each). Univariate analyses for DFS showed that old age, advanced tumour stage, high primary tumour SUVmax and SUVpeak, and high MTV and TLG were significantly associated with decreased DFS (P < 0.05 each; Tables 2 and 3).

Table 2 Factors affecting overall survival in the 221 study patients
Table 3 Factors affecting disease-free survival in the 221 study patients

Although SUV parameters showed a low correlation with MTV and TLG (Pearson’s correlation coefficient <0.25), multivariate analyses were performed with backward elimination. Multivariate analyses revealed that old age, treatment modality, primary tumour SUVmax and MTV were independent prognostic factors for OS (P < 0.05 each; Table 2), and that old age, primary tumour SUVmax and MTV were independent prognostic factors for DFS (P < 0.05 each; Table 3). The Kaplan-Meier curves for DFS and OS with dichotomized (high and low) values of SUVmax and MTV are shown in Fig. 2. Patients with high SUVmax had a higher likelihood of recurrence and poorer OS than those with low SUVmax (70.9 % and 90.3 % for 5-year DFS, 60.9 % and 80.3 % for 5-year OS; P < 0.05 each). Patients with high MTV had a higher likelihood of recurrence and poorer OS than those with low MTV (62.3 % and 83.9 % for 5-year DFS, 57.3 % and 78.9 % for 5-year OS; P < 0.05 each).

Fig. 2

Kaplan-Meier curves showing overall survival and disease-free survival according to the dichotomized values of tumour SUVmax (a and c) and MTV (b and d) with cut-off values of 7.55 and 11.06 mL, respectively. Log-rank test, P < 0.01.

Survival outcomes according to treatment modality were compared between the subgroups stratified by the defined cut-off (low and high) values of SUVmax and MTV. Among patients with high SUVmax, the OS rates of the surgery group (n = 61) were higher than those of the nonsurgery group (n = 47, P = 0.011), whereas among patients with high MTV, the OS rates of the surgery group did not differ significantly from those of the nonsurgery group (P = 0.241). Among patients with low SUVmax and MTV, OS did not differ significantly between the surgery and nonsurgery groups (P = 0.805 and P = 0.155, respectively).

Discussion

In the present study, we examined the role of quantitative metabolic measurement by pretreatment 18F-FDG PET/CT in predicting clinical outcomes in patients with OPSCC. Our results showed that primary tumour SUVmax and MTV were independent factors predictive of recurrence or survival in these patients. The prognostic value of the imaging parameters was superior to that of TNM staging or clinicopathological variables except patient age. Our study showed that the AJCC staging system alone, which is based on the size, location and extent of tumours, and the number, size and side of nodal involvement, did not adequately predict the posttreatment clinical course in our cohort, as also reported for previous cohorts [16, 20].

The clinical value of 18F-FDG PET or PET/CT parameters has been examined in HNC. SUVmax is an easily readable imaging parameter that has been most commonly used for determining the prognostic value of 18F-FDG PET in HNC [12, 13, 16, 29, 30]. A tumour SUVmax of >10 was correlated strongly with poor local control and DFS in 73 patients with HNC, regardless of tumour stage and size [29]. Lin et al. showed that a pretreatment SUVmax of >11 was a predictor of primary recurrence in 62 patients with pharyngeal cancer treated with RT [31]. A recent report has also shown that a pretreatment SUVmax of >7.1 is associated with a poor 5-year DFS in patients with OPSCC treated with initial surgery [32]. However, recent studies have failed to show an association of SUVmax with survival outcome in patients with pharyngeal cancers [18, 20, 22]. In a recent study, MTV and TLG provided important prognostic information in 176 patients with OPSCC treated with CRT, whereas SUVmax was not associated with distant metastasis or locoregional failure [33]. By contrast, our data showed that SUVmax was an important prognostic marker in OPSCC patients, with a high SUVmax of >7.55 showing a significant association with poor OS and DFS outcomes. In addition to SUVmax, our study showed that a high SUVpeak was a significant risk factor for OS and DFS, but not an independent one. SUVpeak is measured as an average SUV value in a group of voxels surrounding the voxel with the highest uptake [34]. The relationship between SUVpeak and survival in patients with HNC has not been examined in a standardized fashion, and further investigation is necessary.

The present study also showed the prognostic value of tumour MTV and TLG. Lim et al. showed that volumetric measurements of 18F-FDG-positive disease, that is tumour MTV and TLG, remained associated with death after correcting for T stage in patients with OPSCC [33]. Tumour MTV (>12.5 mL) and TLG (>92 g) were independent predictors of recurrence-free survival in 74 patients with oropharyngeal and hypopharyngeal cancers [35]. Moon et al. reported that only tumour TLG was an independent predictive factor associated with decreased OS in 69 patients with tonsillar cancer [36]. Our study supports previous results that MTV and TLG provide important prognostic information in patients with OPSCC. However, the discrepancy in the value of MTV and TLG between this and previous studies [33] might be a result of differences in treatment modality and measurement of imaging parameters.

Our study had limitations inherent in its retrospective design. The patients included underwent primary surgery, RT or CRT with/without induction chemotherapy or postoperative RT/CRT. The various treatment modalities might have affected the clinical outcomes. However, a head and neck oncological team approach to planning and multimodal treatments was used in each OPSCC patient. The exclusion of a considerable number of patients from the analysis might have resulted in biased estimates of treatment outcomes. However, most of the excluded patients had non-SCC pathologies, and age, tumour stage, grading and location, and treatment modalities were comparable between the included and excluded patients. Data on p16 and HPV status were only obtained in some patients, which led to difficulties in the analysis according to these biomarkers [23]. The present study enrolled OPSCC patients treated over a period of >7 years, which might have resulted in differences in the measured values of abnormal 18F-FDG uptake and metabolic volume. This was minimized by reinterpretation and proper delineation of VOIs from the 18F-FDG PET/CT data of each patient by a nuclear medicine physician and a physicist. Previous studies involved patients with HNC arising at different anatomical sites that may show different clinical behaviour and outcomes, leading to potential bias. The present study included a relatively large cohort of patients with only OPSCC who underwent definitive treatment including primary surgery or RT/CRT. Our data show the value of pretreatment SUVmax, SUVpeak, MTV and TLG for predicting the clinical outcome in OPSCC patients.

In conclusion, our data show that imaging parameters measuring focal 18F-FDG uptake in OPSCC patients provide important prognostic information for predicting recurrence and survival after definitive treatment. SUVmax and MTV may be useful for risk stratification and selection of at-risk patients who would benefit from intensive treatments and posttreatment surveillance. The present results and those of previous studies should be further validated in a large prospective cohort of OPSCC patients, with the inclusion of biological information such as p16 and HPV status.