Introduction

Over 60 years ago, it was shown that hypoxia inhibits the ability of radiation to kill mammalian cells [1]. The existence of hypoxic regions in solid tumors was hypothesized [2] and subsequently confirmed [3], establishing this as a key limiting factor in cancer radiotherapy. Given the biological and clinical significance of tumor hypoxia, a number of technologies for quantifying tumor oxygenation have been developed and applied, including electrode-based measurements as well as hypoxia-sensing probes suitable for use in immunohistochemistry and non-invasive molecular imaging. Molecules that accumulate in hypoxic cells have been conjugated to positron-emitting isotopes to produce hypoxia radiotracers for positron emission tomography (PET), including [18F]-Fluoromisonidazole (FMISO), [18F]-Fluoroazomycin arabinoside (FAZA), [18F]-2-(2-Nitro-1-H-imidazol-1-yl)-N-(2,2,3,3,3-pentafluoropropyl) acetamide (EF5), 3-[18F]-2-(4-((2-nitro-1H-imidazol-1-yl)methyl)-1H-1,2,3-triazol-1-yl)-propan-1-ol (HX4), [18F]-fluoroerythronitroimidazole (FETNIM), and a number of others.

Tumors of the head and neck are prime candidates for use of these methods because of their location away from regions of physiologic uptake of these tracers, as well as the incidence and clinical significance of hypoxia in these lesions [4–6]. However, despite positive initial reports of the utility of several hypoxia PET radiotracers to identify clinically important features of head and neck cancers [7], debate continues over how to optimally utilize them in the clinical management of these patients. A variety of qualitative and quantitative analysis methods have been proposed, but their relative association with patient outcome is not known. Tumor human papillomavirus (HPV) status has been shown to be strongly associated with both local control and overall patient survival following treatment [8, 9]; however, previous studies have not elucidated the relationship between HPV and PET-visible hypoxia as predictors of treatment outcome [10]. We therefore investigated the relationship between FDG and FAZA PET signals, tumor HPV status, and patient outcome in a sample of 41 head and neck cancer patients treated prospectively on the randomized Trans-Tasman Radiation Oncology Group (TROG) trial 02.02, comparing chemoradiotherapy to chemoradiotherapy plus the hypoxic cell cytotoxin tirapazamine. We extracted a panel of qualitative and quantitative features from FDG and FAZA PET datasets of primary tumors and involved lymph nodes, and evaluated their association with patient outcome and with tumor HPV status.

Materials and methods

Patients

Patients were drawn from a previous open-label, randomized phase III trial of radiation and cisplatin combined with the hypoxic cytotoxin tirapazamine that was performed at 82 centers in 16 countries across Australia, New Zealand, North America, Europe, and South America between April 2002 and September 2005 (TROG 02.02). The study protocol including this subanalysis was approved by the institutional ethics committees. Eligibility criteria for this phase III trial have been described previously [11]. Additional eligibility criteria imposed for this analysis included pre-treatment FAZA PET-CT, an applied radiotherapy dose >60 Gy, and the absence of any major deviations in radiotherapy quality that would be predicted to have an adverse effect on tumor control [12].

Positron emission tomography imaging and analysis

All patients underwent both an 18F-2-fluoro-2-deoxy-glucose (FDG) PET scan as well as a FAZA PET scan at the Peter MacCallum Cancer Centre on separate days prior to the initiation of treatment. FDG was obtained through the licensed manufacturer Cyclotek Pty Ltd. (Melbourne, Australia, license no. MI-12092005-LI-000904-2). 18F-FAZA was produced at the Peter MacCallum Cancer Centre using a previously reported synthesis methodology [13]. Static PET images were acquired 1 h after intravenous injection of FDG or 2 h after intravenous injection of FAZA at a dose of 5.2 MBq/kg using a dedicated PENN-PET300H PET scanner (UGM Medical Systems Inc., Philadelphia, PA, USA) or a Discovery LS PET/CT scanner (GE Medical Systems, Milwaukee, WI, USA). PET scans were acquired over a range of at least 24 cm from the base of the skull to the top of the lungs, including all tumor and involved lymph nodes in the neck. The PENN-PET300H imaged a single field-of-view over a 15-min period, while the Discovery LS was operated in whole body mode, scanning a series of bed positions for 5 min each. FDG and FAZA PET images were acquired within 1 month of each other. Following acquisition of both PET datasets, the FAZA images were coregistered with FDG PET images using a semiautomated mutual information-based PET-to-PET rigid registration method implemented within the RT_Image software package [14]. This method has been applied previously to register metabolic and hypoxia PET scans, resulting in average errors on the order of 2 mm (approximately half the width of a PET voxel) [15]. The FDG PET was then used to identify tumor and involved nodal areas as regions of interest for FAZA analysis. The FAZA images were initially analyzed qualitatively by two blinded readers, with lesions in each dataset scored from 0 (uptake less than background) to 4 (focal uptake markedly greater than background).
The FAZA scan was interpreted to be positive if there was greater activity (score of 2 or 3) within the sites of tumoral uptake of FDG than the activity present in adjacent or mirrored soft-tissue sites. The PET images were also analyzed quantitatively using RT_Image. The maximum SUVs over the primary tumor, lymph nodes, and total tumor volume were calculated for FAZA and FDG. The volume of FDG-avid tissue, defined as the metabolic tumor volume (MTV), was measured for the tumor and involved nodes as described previously [16]. The histogram analysis of Mortensen et al. was applied to this patient cohort to characterize the distribution of FAZA signals, in order to calculate a threshold value separating normoxic and hypoxic voxels [10]. Hypoxic tumors were identified based on the quantitative analysis as those with at least one voxel with a FAZA signal greater than the calculated threshold. The hypoxic tumor volume (HTV) was defined as the volume in the FAZA PET dataset with an intensity greater than the determined threshold. The HTV analysis was restricted to the MTV. Finally, the fractional hypoxic tumor volume (fHTV) was calculated by dividing the HTV by the MTV for the primary tumor, lymph nodes, and total tumor volume.
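The volumetric definitions above can be illustrated with a minimal sketch in Python. This is not the RT_Image implementation: the function name, the flat-list voxel representation, and the argument names are assumptions made for illustration, and the default threshold is simply the tumor:muscle ratio of 1.44 reported in the Results.

```python
from typing import Dict, Sequence


def hypoxia_metrics(
    faza_suv: Sequence[float],      # FAZA SUV per voxel (hypothetical flat list)
    in_mtv: Sequence[bool],         # True where the voxel lies inside the FDG-derived MTV
    muscle_suv_mean: float,         # mean FAZA SUV of the reference muscle region
    voxel_volume_ml: float,         # volume of a single voxel in mL
    threshold_ratio: float = 1.44,  # tumor:muscle threshold (assumed default)
) -> Dict[str, object]:
    """Compute MTV, HTV, and fHTV, with the HTV restricted to the MTV."""
    if len(faza_suv) != len(in_mtv):
        raise ValueError("faza_suv and in_mtv must have the same length")
    if muscle_suv_mean <= 0:
        raise ValueError("muscle_suv_mean must be positive")

    n_mtv = 0
    n_hypoxic = 0
    for suv, inside in zip(faza_suv, in_mtv):
        if not inside:
            continue  # voxels outside the MTV never contribute to the HTV
        n_mtv += 1
        if suv / muscle_suv_mean > threshold_ratio:
            n_hypoxic += 1

    mtv = n_mtv * voxel_volume_ml
    htv = n_hypoxic * voxel_volume_ml
    fhtv = htv / mtv if mtv > 0 else 0.0
    # A lesion counts as hypoxic if at least one voxel exceeds the threshold.
    return {"MTV": mtv, "HTV": htv, "fHTV": fhtv, "hypoxic": n_hypoxic > 0}
```

Applied separately to the primary tumor mask, the nodal mask, and their union, a function of this kind would yield the three sets of HTV and fHTV values described above.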

p16 status

p16 immunohistochemistry was performed as previously described to ascertain the HPV status of each patient [9]. Slides were semiquantitatively scored for staining intensity in the cell nucleus and cytoplasm. Intensity was scored as 0 (none), 1 (weak), 2 (moderate), or 3 (strong), with 0 or 1 defined as p16-negative and 2 or 3 defined as p16-positive.
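The scoring rule can be written compactly as follows. This is a sketch only: the function name and the handling of a missing score are assumptions, and the rule that non-oropharyngeal primaries are grouped as HPV-negative is taken from the Statistical analysis section.

```python
from typing import Optional


def hpv_group(p16_intensity: Optional[int], oropharyngeal: bool) -> str:
    """Map a p16 staining intensity (0-3) to an HPV analysis group."""
    # Non-oropharyngeal primaries are treated as HPV-negative a priori.
    if not oropharyngeal:
        return "negative"
    # Assumption: a missing score is reported as unknown status.
    if p16_intensity is None:
        return "unknown"
    if p16_intensity not in (0, 1, 2, 3):
        raise ValueError("p16 intensity must be 0, 1, 2, or 3")
    # 0 or 1 is p16-negative; 2 or 3 is p16-positive.
    return "positive" if p16_intensity >= 2 else "negative"
```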

Endpoints

Locoregional failure was defined as disease progression at the primary tumor and/or involved lymph nodes. Time to locoregional failure was measured from the date of the start of the treatment to the date of first locoregional failure. Distant failure and death were treated as censoring events for time to locoregional failure. Overall treatment failure was defined as disease progression at any site, locoregional and/or distant, or death. Failure-free survival was measured from the date of the start of the treatment to the date of first failure or death.
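To make the censoring rules concrete, the two time-to-event endpoints can be derived from per-patient dates as sketched below. The function and argument names are hypothetical, and the sketch assumes that every recorded event date falls on or before the date of last follow-up.

```python
from datetime import date
from typing import Optional, Tuple


def time_to_locoregional_failure(
    start: date,
    locoregional_failure: Optional[date],
    distant_failure: Optional[date],
    death: Optional[date],
    last_follow_up: date,
) -> Tuple[int, bool]:
    """Return (days from start, event flag); False means censored."""
    # Distant failure, death, and end of follow-up all censor this endpoint.
    censor_dates = [d for d in (distant_failure, death, last_follow_up) if d is not None]
    first_censor = min(censor_dates)
    if locoregional_failure is not None and locoregional_failure <= first_censor:
        return (locoregional_failure - start).days, True
    return (first_censor - start).days, False


def failure_free_survival(
    start: date,
    locoregional_failure: Optional[date],
    distant_failure: Optional[date],
    death: Optional[date],
    last_follow_up: date,
) -> Tuple[int, bool]:
    """Return (days from start, event flag) for first failure at any site or death."""
    event_dates = [d for d in (locoregional_failure, distant_failure, death) if d is not None]
    if event_dates:
        return (min(event_dates) - start).days, True
    return (last_follow_up - start).days, False
```

For a patient whose distant failure precedes any locoregional failure, the first function returns a censored observation at the distant failure date, while the second returns an event at the same date.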

Statistical analysis

Patients were grouped based on their p16 status into HPV-positive and -negative groups. For the purposes of this analysis, all patients with non-oropharyngeal primary tumors were considered HPV-negative. Kaplan–Meier curves were computed for the endpoints described above, and the treatment arms were compared using the exact logrank test. The impact of hypoxia metrics on the time-to-event outcomes was assessed using the logrank test for trend. The hypoxic tumor volume in the primary tumor, lymph nodes, and tumor plus nodes was compared between HPV-positive and -negative patients using the Wilcoxon rank sum test.
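For reference, the Kaplan–Meier (product-limit) estimate used for these curves can be sketched in plain Python as below. The statistical software used in the study is not stated, so this is an illustration of the estimator rather than the authors' code; the function name is hypothetical, and the exact logrank test is not reproduced here.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float], events: Sequence[bool]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct event time.

    events[i] is True for an observed event and False for a censored observation.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")

    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve: List[Tuple[float, float]] = []

    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Gather all observations sharing this time; subjects censored at t
        # are still counted as at risk at t.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                n_events += 1
            n_removed += 1
            i += 1
        if n_events > 0:
            survival *= 1.0 - n_events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= n_removed
    return curve
```

The input pairs would be the (time, event flag) values for one stratum, for example HPV-negative patients with hypoxic tumors in a single treatment arm.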

Results

Patient population

A total of 41 patients were included in this study. The characteristics of this patient sample are described in Table 1 according to treatment arm, either chemoradiotherapy (CRT) or chemoradiotherapy plus tirapazamine (CRT + TPZ). Fourteen (34 %) of the oropharyngeal tumors were determined to be HPV-positive, 24 (59 %) were HPV-negative, and three (7 %) were of unknown HPV status. Only one of the non-oropharyngeal tumors was p16-positive, but it was included in the HPV-negative group as per the a priori decision. On subsequent analysis using an in-situ hybridization assay for HPV DNA [9], this case was found to be HPV-negative. FDG and FAZA PET datasets were acquired from all patients within a median of 7 days of each other (range: 1–29 days). Twenty-four patients were imaged using the PENN-PET300H scanner, while 17 were imaged on the Discovery LS PET/CT. Representative FDG and FAZA PET images from this patient cohort are shown in Fig. 1.

Table 1 Properties of the study patient population based on randomization arm
Fig. 1

Analysis of FDG and FAZA PET images. a Shows an FDG PET scan of a patient with a tumor of the larynx, showing the computed MTV for the primary tumor (central lesion) and involved lymph node (peripheral lesion) as a black contour. This contour was then overlaid on the coregistered FAZA PET image shown in b, and used to determine the FAZA positive volumes for the primary tumor and lymph node. In this patient it is apparent that the majority of the primary tumor is hypoxic according to FAZA, while the lymph node is less FAZA-avid and correspondingly less hypoxic. Note the lower contrast between FAZA uptake in tumor compared to background soft tissues than observed with FDG

PET imaging and HPV status

All patients underwent FDG and FAZA PET imaging prior to treatment. Of the 24 HPV-negative patients, 15 (62 %) were classified as having no PET-visible hypoxia according to the qualitative grading scheme. Four (17 %) exhibited FAZA accumulation in the primary tumor but not in lymph nodes, two (8 %) showed FAZA accumulation in the lymph nodes but not the primary tumor, and three (12 %) were FAZA-positive in both the primary tumor and lymph nodes. Of the 14 HPV-positive patients, six (43 %) showed no PET evidence of tumoral hypoxia, one (7 %) exhibited FAZA uptake in the primary tumor, four (29 %) exhibited FAZA uptake in the lymph nodes, and three (21 %) showed uptake in both the primary tumor and lymph nodes. Analysis of the distribution of FAZA signals across all patients according to the method of Mortensen et al. [10] resulted in a threshold hypoxic tumor:muscle FAZA ratio of 1.44.

Table 2 summarizes the quantitative imaging metrics derived from the FDG PET images for HPV-positive and -negative tumors. The cumulative MTVs for HPV-positive and -negative tumors are similar; however, the MTVs for HPV-negative primary tumors are significantly larger (p = 0.01), while the MTVs for the involved lymph nodes of HPV-negative tumors are significantly smaller than those for HPV-positive tumors (p = 0.001). There are no significant differences in the SUVs between the two HPV groups. A similar analysis of the FAZA PET images is shown in Table 3 and in Fig. 2. As with the MTV values, the HTVs for the involved lymph nodes of HPV-negative tumors are significantly smaller than those for the HPV-positive tumors (p = 0.004). There was a statistically significant correlation between T stage and hypoxic tumor volume assessed using the Kendall Tau rank correlation (p = 0.002, tau = 0.42), with higher T stage associated with a higher FAZA tumor HTV (Fig. 3). A correlation was observed between the SUVmax values measured for the tumor and the nodal volumes (r = 0.81), as seen in Fig. 4. This correlation is weaker when normalizing by the muscle SUVmean (r = 0.45).
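Because T stage takes only a few ordinal values, many patient pairs are tied on it, and a tie-corrected form of Kendall's tau is the natural choice for such data. The text does not state which variant was used, so the tau-b form below is an assumption, and the function name is hypothetical.

```python
import math
from typing import Sequence


def kendall_tau_b(x: Sequence[float], y: Sequence[float]) -> float:
    """Kendall's tau-b rank correlation, with correction for ties in x and y."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")

    concordant = discordant = tied_x_only = tied_y_only = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied in both variables: excluded from the statistic
            if dx == 0:
                tied_x_only += 1
            elif dy == 0:
                tied_y_only += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1

    denom = math.sqrt(
        (concordant + discordant + tied_x_only)
        * (concordant + discordant + tied_y_only)
    )
    if denom == 0:
        return float("nan")  # undefined when one variable is constant
    return (concordant - discordant) / denom
```

Here `x` would hold the numeric T stage and `y` the primary tumor HTV for each patient; the sketch returns only the coefficient and does not compute a p value.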

Table 2 FDG PET metrics stratified based on HPV status
Table 3 FAZA PET metrics stratified based on HPV status
Fig. 2

Distributions of hypoxic volumes, sorted in increasing order, within the tumor (a), nodes (b), and total tumor volume including the primary and nodes (c). Blue markers denote HPV-negative tumors and red HPV-positive

Fig. 3

The relationship between hypoxic tumor volumes and T stage. HPV-negative tumors are indicated by blue markers and HPV-positive tumors by red markers

Fig. 4

The relationship between FAZA PET signals in the primary tumor and nodal volumes. a Shows a scatter plot of tumor SUVmax against nodal SUVmax. b Shows a scatter plot of the same metrics scaled by the SUVmean of the reference muscle tissue for that patient. Black lines identify the empirically determined SUVmax/muscle SUVmean hypoxia threshold of 1.44. Blue markers denote HPV-negative tumors and red HPV-positive

An alternative method of analyzing the degree of hypoxia using the fractional hypoxic tumor volume (fHTV) is considered in Table 4. The fHTV values range from 0.00 to 0.95 for the primary tumor, from 0.00 to 0.92 for the lymph nodes, and from 0.00 to 0.90 for the total tumor volume. The distributions of fHTV values for the primary tumor and the total tumor volume are not significantly different between the HPV-positive and -negative tumors; however, the fHTV values for the lymph nodes are smaller (p = 0.05) for the HPV-negative tumors (median 0.13) than for the HPV-positive tumors (median 0.30). This is in accord with the lower HTV observed for the lymph nodes of HPV-negative tumors, as seen in Table 3.

Table 4 Fractional hypoxic tumor volumes stratified based on HPV status

Treatment outcome

The treatment failures observed in this patient population are summarized in Table 5. A total of six failures (35 %) were seen in the patients with tumor or nodal volumes qualitatively classified as oxic, while six failures (29 %) were observed in the patients qualitatively classified as hypoxic. The failure rates for the patients treated with chemoradiotherapy and with chemoradiotherapy plus tirapazamine were, respectively, 40 % and 17 % for the oxic tumors, and 56 % and 8 % for the hypoxic tumors. Stratified by HPV status, the failure rates for patients in the chemoradiotherapy and chemoradiotherapy plus tirapazamine arms were, respectively, 80 % and 15 % for the HPV-negative tumors, and 11 % and 0 % for the HPV-positive tumors.

Table 5 Patterns of failure stratified based on qualitative hypoxia score, HPV status, and treatment arm

The Kaplan–Meier curves for locoregional failure-free survival, failure-free survival, and overall survival for the HPV-negative patients are shown in Fig. 5, stratified by their quantitative FAZA hypoxia status and the treatment arm in which they were enrolled. Due to the low number of failures in the HPV-positive group, a parallel analysis of those patients was not possible. A significant difference in time to locoregional failure (p < 0.001) and failure-free survival (p = 0.004) was observed between HPV-negative patients treated with chemoradiotherapy and with chemoradiotherapy plus tirapazamine in the subset of patients with hypoxic tumors. However, the difference between these treatment arms in terms of overall survival was not significant (p = 0.17).

Fig. 5

Kaplan–Meier curves for locoregional failure-free survival (a), failure-free survival (b), and overall survival (c). Curves are stratified into groups consisting of the patients with non-hypoxic tumors according to FAZA PET, patients with hypoxic tumors treated in the cisplatin arm, and patients with hypoxic tumors treated with tirapazamine. Identification of hypoxia was based on a hypoxic tumor volume greater than 0. The curves for the non-hypoxic and tirapazamine hypoxic groups overlap in panel a

Discussion

In this study, we have applied hypoxia PET imaging using the radiotracer 18F-FAZA to patients treated with chemoradiotherapy and a hypoxic cytotoxin for tumors of the head and neck. PET imaging of hypoxia has been applied in studies of human head and neck cancer for almost two decades. The FMISO tumor:blood (T:B) ratio as well as the absolute hypoxic volume, computed using a threshold FMISO T:B ratio of 1.2, were found to be independent predictors of outcome in a group of 73 head and neck cancer patients [17]. Qualitative analysis of FMISO images was shown to identify patients who would benefit from the addition of tirapazamine in a previous phase II TROG study [7]. Analysis of FMISO time-activity curves obtained from dynamic PET scanning of head and neck cancer patients has revealed an association between curve shape and patient response to radiotherapy [15].

FAZA is a successor to FMISO that has been reported to exhibit improved tumor to muscle ratios, indicative of faster washout of unbound probe [18]. While FAZA has been comprehensively studied in animal models of cancer and has been shown to be predictive of the efficacy of chemoradiotherapy and hypoxia-directed therapies [19], only recently has it entered into pilot clinical trials [20, 21]. In our study, FAZA PET signals exhibited a strong association with T stage, and primary tumor volumes were larger for HPV-negative tumors. This is consistent with previously noted associations between hypoxia as detected by FMISO PET and T stage [22]. Notably, the FDG- and FAZA-avid regions in the involved lymph nodes were larger in the HPV-positive than in the HPV-negative tumors. The frequent presence of cystic nodal metastases in HPV-positive tumors has been previously reported [23], and may explain the lack of association between nodal hypoxia and treatment outcome that has been observed [7].

We observed a lower incidence of qualitative hypoxia in this cohort with FAZA compared to an earlier cohort imaged with FMISO [24]. While this could reflect in part differences between the cohorts, it may be due to the different tracers and/or changes in image interpretation. Locoregional failures in the HPV-negative patients on the control arm in the FMISO cohort were predominantly in patients with tumors that had qualitative hypoxia, but this was not the case in the current study, with several locoregional failures in the non-hypoxic group. This suggests that a lower, more sensitive threshold for qualitative hypoxia may be required for FAZA than for FMISO in order to maintain its predictive value in identifying patients at risk of locoregional failure with standard treatment. These findings also emphasize the difficulty in applying qualitative scoring schema established with one agent to another with different uptake characteristics.

PET imaging of hypoxia has a number of limitations that remain to be overcome as it progresses toward widespread clinical adoption. There is a lack of consensus on the optimal method to present and interpret hypoxia PET data, a problem that has been identified as a key limitation of many emerging functional and molecular imaging methods. Hypoxia PET is limited by the inherently low signal:noise ratio (SNR) and dynamic range of the acquired images. This makes selection of a robust threshold discriminating FAZA-positive and -negative tumors much more difficult than for FDG, where SUVs may be as high as 20. The physical and biological basis of this poor SNR has been suggested [25]; however, this does not compensate for the practical difficulties imposed by it. Although qualitative scoring is commonly used for PET interpretation, it is potentially subject to inter-reader variability and might be less reproducible than a quantitative metric. Therefore, we performed both qualitative and quantitative analyses for these studies, using investigators who were blinded to each other’s results. The spread of quantitative metrics within a single qualitatively equivalent group probably reflects the emphasis of qualitative analysis on image contrast and quantitative analysis on absolute image intensities. The hypoxic tumor volume also incorporates a volumetric dimension not included in qualitative interpretation. It is possible that both the severity and extent of tumoral hypoxia will have an influence on outcomes, and therefore further work is required to determine optimum quantitative analysis methods and to establish prognostically important thresholds for these parameters.

This study was subject to several shortcomings. Because of the duration over which patients were accrued, two different scanners were used to acquire the PET data. While each scanner was routinely calibrated according to the manufacturer’s specifications, the differences in spatial resolution and imaging performance between these systems impose a source of variability across the dataset. However, we did not observe any significant differences in the quantitative and qualitative metrics between patients imaged on each system. Additionally, due to practical considerations there was an interval of between 1 and 29 days between the acquisition of the FDG and FAZA PET scans. While the majority of patients were scanned with each radiotracer within a week, longer delays were encountered for some patients that could complicate comparison of the scans because of changes in the tumor during this interval. As above, we did not observe any significant association between the imaging metrics considered and the delay between the scans. Finally, interpretation of FAZA scans relative to the total tumor volume, such as the fHTV, should ideally be performed using the gross tumor volume (GTV) identified from an anatomic CT imaging examination. In the absence of these images within this dataset, we instead relied on the FDG PET-derived MTV to identify the tumor volume. While practically convenient, this practice may underestimate the tumor volume if portions of the tumor are not FDG-avid, introducing error into the calculated metrics.

Overall, our data support the prognostic significance of hypoxia in HPV-negative head and neck cancer treated with conventional radiotherapy and the potential benefit of hypoxic cytotoxins in this setting, consistent with our previous reports [7]. The subjectivity of the qualitative analysis, as well as the strong relationship between the quantitative grading of hypoxia and outcome as seen in Fig. 5, encourages the use of these standardized, quantitative methods in future studies. Hypoxic HPV-negative tumors treated with tirapazamine exhibited significantly better locoregional failure-free and overall failure-free survival.

Conclusion

In summary, in this study we have characterized PET images acquired using the metabolic radiotracer FDG and the hypoxia radiotracer FAZA in patients with head and neck cancer being treated as part of the TROG 02.02 trial of chemoradiotherapy and tirapazamine. A significant association between FAZA PET signals and tumor T-stage was observed. Locoregional failure-free and failure-free survival for HPV-negative tumors identified as hypoxic by PET imaging and quantitative analysis was improved for patients treated with tirapazamine relative to those treated with cisplatin. These findings will need to be validated in future larger studies, testing hypoxic cell radiosensitizers or hypoxia-activated prodrugs.