Introduction

Quantitative assessment of therapy response and response prediction with positron emission tomography (PET) are now accepted diagnostic procedures in the management of various tumor entities. The maximum standardized uptake value (SUVmax) and the metabolically active tumor volume (MTV) are most commonly used as semi-quantitative and quantitative parameters for response assessment [1]. Additionally, the product of MTV and SUVmean (total lesion glycolysis: TLG) is increasingly considered (see, e.g., [2–4]).

In a recent publication, we proposed an easily computable novel parameter for quantifying the spatial heterogeneity of tumor lesions in PET (asphericity: ASP) [5]. The rationale for the definition of ASP was that prominent intratumoral spatial variation of cellularity, angiogenesis, extravascular extracellular matrix, and necrosis in aggressive tumors is expected to result in irregularity of the shape of the FDG uptake at both the external surface (irregular tumor expansion and necrosis) and the internal surface (necrosis). In that pilot study, we were able to show that in head and neck cancer the pretherapeutic ASP of the primary tumor is a prognostic factor for both progression-free survival (PFS) and overall survival (OS). Moreover, the prognostic value of ASP was independent of MTV, TLG, and SUV.

The motivation for the present work was to verify these results in an independent group of head and neck cancer patients examined on a different scanner at a different institution. For this, we employed the cut-off values that were derived in the initial exploratory study as the optimal thresholds for discriminating “low risk” from “high risk” head and neck cancer patients. We applied these predefined cut-off values in a survival analysis of the current patient group in order to confirm the robustness of ASP in a multi-site setting. In a second step, the cut-off values were adjusted to the present patient group. Finally, the prognostic value of ASP was confirmed in a larger pooled patient group encompassing the present patient group and that from our previous study [5].

Methods

Definition of asphericity

The rationale for investigation of spatial heterogeneity is that aggressive tumors show particularly fast and uncontrolled growth that often results in irregular expansion and/or necrosis and thus increases the ratio of surface area S to volume V of the metabolically active part of the tumor. However, the ratio S/V is not a dimensionless quantity (it has the dimension [Length]⁻¹) and thus depends on the chosen length unit as well as on absolute tumor size. Therefore, this ratio is not directly suitable for describing shape irregularities. It is straightforward, however, to remove the scale dependency by defining the dimensionless quantity

$$ H = \frac{1}{36\pi}\frac{S^{3}}{V^{2}}\,, $$
(1)

where the factor 1/(36π) ensures that H is equal to one for a sphere. More generally, the cube root of H is equal to the ratio between the actual lesion surface area and the surface area of a sphere with the same volume. Based on this fact, we define the asphericity (ASP) of the tumor as

$$ ASP = \sqrt[3]{H} - 1\,, $$
(2)

which is equal to zero for spheres. For non-spherical shapes, ASP > 0, and ASP is a quantitative measure of the degree of deviation from a spherical shape.
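
Equations (1) and (2) translate directly into a few lines of code. The following minimal sketch is for illustration only; the function name `asphericity` and the two test shapes are assumptions introduced here, and the sketch takes S and V as given rather than showing how they are obtained from a delineated ROI.

```python
import math

def asphericity(surface_area, volume):
    """ASP = cbrt(H) - 1 with H = S^3 / (36 * pi * V^2), Eqs. (1) and (2)."""
    if surface_area <= 0 or volume <= 0:
        raise ValueError("surface area and volume must be positive")
    h = surface_area ** 3 / (36.0 * math.pi * volume ** 2)
    return h ** (1.0 / 3.0) - 1.0

# Sphere of radius 10 mm: ASP should vanish (up to rounding).
r = 10.0
asp_sphere = asphericity(4.0 * math.pi * r ** 2, 4.0 / 3.0 * math.pi * r ** 3)

# Cube of edge 10 mm: H = 6 / pi, so ASP = (6 / pi)^(1/3) - 1, about 0.24.
a = 10.0
asp_cube = asphericity(6.0 * a ** 2, a ** 3)

# Scale invariance: the same cube expressed in cm gives the same ASP.
asp_cube_cm = asphericity(6.0 * (a / 10.0) ** 2, (a / 10.0) ** 3)
```

The last line shows why the combination S³/V² was chosen: changing the length unit rescales S³ and V² by the same factor, so ASP is unaffected.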

As a side note, we would like to mention a geology paper published in 1935 by Wadell et al. [6], which came to our attention while finalizing the present manuscript. These authors investigated the shape variability of quartz particles and defined a parameter “sphericity”, which can be expressed as 1/(1+ASP) using our definition of ASP. Thus the concept of defining a scale-invariant parameter for quantifying deviations from a spherical shape is not new at all.
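
The relation between the two parameters can be verified in two steps. Here Ψ denotes the sphericity, taken in its usual form as the surface area of the volume-equivalent sphere divided by the actual surface area; the symbols Ψ, S_sphere, and r are notation introduced for this illustration only.

$$ S_{\mathrm{sphere}}(V) = 4\pi r^{2}\Big|_{r = (3V/4\pi)^{1/3}} = \left(36\pi V^{2}\right)^{1/3} \quad\Rightarrow\quad \sqrt[3]{H} = \frac{S}{\left(36\pi V^{2}\right)^{1/3}} = \frac{S}{S_{\mathrm{sphere}}(V)}\,, $$

$$ \Psi = \frac{S_{\mathrm{sphere}}(V)}{S} = \frac{1}{\sqrt[3]{H}} = \frac{1}{1 + ASP}\,. $$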

As indicated above, the numerical value of ASP has a simple intuitive meaning: for example, ASP = 0.5 = 50 % indicates that the lesion’s surface area is increased by 50 % relative to the surface area of a sphere exhibiting the same volume. Figure 1 shows three examples. The displayed lesions (marked with red lines) exhibit comparable volumes (A: 15.4 ml, B: 13.8 ml, C: 14.1 ml) but distinctly different ASP. Lesion A is quite close to a spherical shape, yielding a moderate ASP (11 %). The shape of lesion B differs more distinctly from the shape of a sphere, leading to ASP = 28 %. Finally, lesion C exhibits a pronouncedly irregular shape including necrotic parts, which leads to ASP = 65 %.
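
To make this interpretation concrete, Eq. (2) can be inverted to recover the surface area implied by a given volume and ASP. The sketch below does this for the three lesions quoted above; the function names are assumptions introduced for illustration, and the resulting surface areas are derived quantities, not values reported in this study.

```python
import math

def equivalent_sphere_surface(volume_ml):
    """Surface area (cm^2) of a sphere with the given volume (1 ml = 1 cm^3)."""
    return (36.0 * math.pi * volume_ml ** 2) ** (1.0 / 3.0)

def surface_from_asp(volume_ml, asp):
    """Invert Eq. (2): S = (1 + ASP) * surface of the volume-equivalent sphere."""
    return (1.0 + asp) * equivalent_sphere_surface(volume_ml)

# Volumes (ml) and ASP values quoted for lesions A, B and C of Fig. 1.
lesions = {"A": (15.4, 0.11), "B": (13.8, 0.28), "C": (14.1, 0.65)}

for name, (volume_ml, asp) in lesions.items():
    s_sphere = equivalent_sphere_surface(volume_ml)
    s_lesion = surface_from_asp(volume_ml, asp)
    print(f"lesion {name}: sphere {s_sphere:.1f} cm^2, lesion {s_lesion:.1f} cm^2")
```

Although the three volumes differ by only about 10 %, the implied surface areas differ much more strongly, which is exactly the information ASP is meant to capture.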

Fig. 1 Representative coronal slices of three patients with head and neck cancer. The red lines indicate the delineation of the primary tumors. Tumors have approximately the same MTV but different shapes resulting in different values of ASP

Patient group

The investigated patient group included 37 subjects with head and neck cancer (33 men and four women, mean age 54 years, range, 37–72). Data were acquired prospectively from August 2005 to May 2009 at the University Hospital, Technische Universität Dresden (TUD) in the context of a different study (ClinicalTrials.gov identifier: NCT00180245, see [7]) and were evaluated retrospectively in the present study. Inclusion criteria were:

  1. patient had been referred to whole-body FDG PET/CT prior to chemoradiotherapy
  2. head and neck cancer was proven histologically
  3. no distant metastasis present
  4. histopathology and/or clinical/radiological follow-up for at least 12 months was available

All tumors were squamous cell carcinomas. All tumors were inoperable due to tumor size, localization, and/or general condition of the respective patient. Tumor progression was defined by the follow-up as occurrence of (a) local, regional recurrence, (b) distant metastases, or (c) a combination of both. In the following we refer to this patient group as “prospective group”.

Retrospective evaluation of the data has been approved by the local Clinical Institutional Review Board and complies with the Declaration of Helsinki.

PET imaging

Hybrid PET/CT scans were performed with a Biograph 16, Siemens Medical Solutions Inc., Knoxville, TN, USA (3D acquisition, 3-min emission per bed position). Data acquisition started 83 ± 15 min after injection of 305–380 MBq 18F-FDG. All patients had fasted for at least 6 h prior to FDG injection. Tomographic images were reconstructed using attenuation-weighted OSEM reconstruction (four iterations, eight subsets, 5-mm FWHM Gaussian filter). The resulting image data had a voxel size of 4.1 × 4.1 × 5 mm³.

Combined patient group

To increase the statistical power, we created another group by combining the above-described patient group with the patient group from our previous work [5]. Only patients matching the above-described inclusion criteria were included. The respective PET investigations were performed at Charité – Universitätsmedizin Berlin. Details of the data acquisition and image reconstruction can be found in [5]. This patient group includes 31 subjects (22 men and nine women, mean age 60 years, range, 42–81 years). The combined group thus comprises a total of 37 + 31 = 68 subjects. Tumor characteristics are summarized in Table 1. In the following we refer to this patient group as “combined group”.

Table 1 Combined group: summary of tumor characteristics

Image analysis

The metabolically active part of the primary tumor was delineated by an automatic algorithm based on adaptive thresholding taking the local background into account [8]. The result of the automatic delineation was inspected visually by an experienced observer (one observer at each site) and corrected manually in case of obvious segmentation failure. This was deemed necessary in seven of 68 cases. In six of these seven cases, the automatic delineation did not separate primary tumor from lymph node metastases in the immediate vicinity. Here, the voxels corresponding to lymph nodes were removed from the respective ROI. In a single case with low diffuse tracer accumulation, the automatic method failed completely. This tumor was therefore delineated manually. The delineating observers were blinded to patient outcome in all cases.

ASP of the delineated ROIs was computed together with SUVmax, SUVmean, the metabolic tumor volume (MTV = metabolically active part of the tumor), and the total lesion glycolysis (TLG = MTV × SUVmean).
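
The intensity-based parameters follow directly from the voxel values inside the ROI. The sketch below is a minimal illustration and not the ROVER implementation; the function name `roi_metrics` and the four-voxel example ROI are hypothetical, and only the voxel size is taken from the PET imaging section.

```python
def roi_metrics(suv_in_roi, voxel_volume_ml):
    """Return SUVmax, SUVmean, MTV (ml) and TLG for the voxels inside one ROI."""
    if not suv_in_roi:
        raise ValueError("ROI contains no voxels")
    suv_max = max(suv_in_roi)
    suv_mean = sum(suv_in_roi) / len(suv_in_roi)
    mtv_ml = len(suv_in_roi) * voxel_volume_ml  # volume of all ROI voxels
    tlg = mtv_ml * suv_mean                     # TLG = MTV x SUVmean
    return {"SUVmax": suv_max, "SUVmean": suv_mean, "MTV": mtv_ml, "TLG": tlg}

# Voxel size of the TUD data: 4.1 x 4.1 x 5 mm^3, converted to ml.
voxel_volume_ml = 4.1 * 4.1 * 5.0 / 1000.0

# Hypothetical SUVs of a tiny four-voxel ROI, for illustration only.
metrics = roi_metrics([4.0, 6.0, 8.0, 10.0], voxel_volume_ml)
```

Because TLG is the product of MTV and SUVmean, it inherits the delineation dependence of both factors.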

ROI definition and ROI analyses were performed using ROVER version 2.1.20 (ABX, Radeberg, Germany).

Statistical analysis

Prospective group

The association of MTV, SUVmax, SUVmean, TLG and ASP with PFS and OS was analyzed using univariate Cox proportional hazards regression. In a first evaluation, the same cut-off values as in [5] were used to define two subgroups. In the following we refer to the subgroup with “parameter ≤ cut-off” as “low risk” and to the subgroup with “parameter > cut-off” as “high risk”.

Additionally, the cut-off values were optimized for the present patient group. For this purpose, a univariate Cox regression was performed for each parameter, with each observed value of that parameter used in turn as a candidate cut-off. The candidate within the interquartile range (restricted in order to avoid too small a sample size in either subgroup) that led to the hazard ratio (HR) with the highest significance was chosen as the cut-off value. The cut-off values were computed for PFS and OS separately.
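
The scan over candidate cut-offs can be sketched as follows. As a deliberate simplification, the sketch ranks candidates by the two-group log-rank chi-square statistic instead of the significance of a fitted Cox model (for a single binary covariate the log-rank test is the score test of the Cox model, so the two are closely related but not necessarily identical). The function names, the quartile convention of `statistics.quantiles`, and the toy data are assumptions made for illustration; this is not the analysis code used in this study, which was carried out in R.

```python
import math
import statistics

def logrank_chi2(times, events, high):
    """Log-rank chi-square (1 d.o.f.) comparing the high group with the rest."""
    obs_minus_exp = 0.0
    variance = 0.0
    for t in sorted({ti for ti, e in zip(times, events) if e}):
        n = sum(1 for ti in times if ti >= t)                       # at risk
        n_high = sum(1 for ti, h in zip(times, high) if ti >= t and h)
        d = sum(1 for ti, e in zip(times, events) if ti == t and e)  # events
        d_high = sum(1 for ti, e, h in zip(times, events, high)
                     if ti == t and e and h)
        obs_minus_exp += d_high - d * n_high / n
        if n > 1:
            frac = n_high / n
            variance += d * frac * (1.0 - frac) * (n - d) / (n - 1)
    return obs_minus_exp ** 2 / variance if variance > 0 else 0.0

def optimal_cutoff(values, times, events):
    """Scan cut-offs inside the interquartile range; return (cut-off, chi2, p)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    candidates = sorted({v for v in values if q1 <= v <= q3})
    if not candidates:
        raise ValueError("no candidate cut-off inside the interquartile range")
    best_cut, best_chi2 = None, -1.0
    for cut in candidates:
        # "high risk" is defined as parameter > cut-off, as in the text.
        chi2 = logrank_chi2(times, events, [v > cut for v in values])
        if chi2 > best_chi2:
            best_cut, best_chi2 = cut, chi2
    p_value = math.erfc(math.sqrt(best_chi2 / 2.0))  # chi-square tail, 1 d.o.f.
    return best_cut, best_chi2, p_value

# Toy data (hypothetical): subjects with parameter values above 6 have early events.
values = list(range(1, 13))
times = [30, 32, 34, 36, 38, 40, 2, 3, 4, 5, 6, 7]
events = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
cut, chi2, p = optimal_cutoff(values, times, events)
```

Note that a cut-off selected in this way is optimized on the data at hand, so the associated p value is optimistic; this is one reason why applying predefined cut-offs to an independent group, as done in the first evaluation, is informative.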

Combined group

Pair-wise correlation of the PET parameters was tested by Spearman’s rank correlation method and illustrated as scatter plots.
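
For reference, Spearman’s rho is the Pearson correlation of the rank-transformed data, with tied values receiving the mean of their ranks. The following sketch illustrates the computation; the function names and example values are assumptions introduced here, and the sketch does not include the significance test.

```python
import math

def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho = Pearson correlation of the rank-transformed data."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical example: two swapped pairs in an otherwise monotonic relation.
rho = spearman_rho([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
```

Because only ranks enter the computation, any monotonic relation (for example between MTV and TLG) yields a rho close to one regardless of whether it is linear.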

The optimal cut-off values for defining groups of high and low risk were computed as described above. The impact of the resulting parameters on PFS and OS was analyzed using univariate Cox regression. Univariate Cox regression was also performed with the investigating site (TUD vs. Charité) as independent parameter. The probability of survival was computed and illustrated as Kaplan–Meier curves. Additionally, the independent prognostic value of these parameters was analyzed in multivariate Cox regression, including clinically relevant parameters with at least a trend for significance according to univariate Cox regression (p < 0.1) as confounding parameters.
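
The Kaplan–Meier (product-limit) estimate underlying such curves can be sketched in a few lines. The function name and the toy data are assumptions made for illustration; in practice the curves are computed separately for the low-risk and high-risk subgroups defined by the respective cut-off.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time (product-limit estimate)."""
    survival = 1.0
    curve = []
    for t in sorted({ti for ti, e in zip(times, events) if e}):
        at_risk = sum(1 for ti in times if ti >= t)
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e)
        survival *= 1.0 - deaths / at_risk
        curve.append((t, survival))
    return curve

# Toy data (hypothetical): follow-up in months, event flag 1 = event, 0 = censored.
times = [1, 2, 3, 4]
events = [1, 0, 1, 1]
curve = kaplan_meier(times, events)
```

Censored subjects (here the one at 2 months) do not produce a step in the curve but reduce the number at risk for all later event times.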

Statistical significance was assumed at a p value of less than 0.05. Statistical analyses were performed with the R language and environment for statistical computing [9] version 3.0.2.

Results

Prospective group

Mean follow-up of all surviving patients was 27 months (range, 14–80.7 months). Seventeen patients died during the follow-up. The mean survival time of these patients was 13 months (range, 1–30 months). Progression had occurred after a mean time interval of 9.9 months (range, 1–23 months). Four of the survivors experienced tumor recurrence or progression after an average of 15 months (range, 4–23 months).

In a first subanalysis, the cut-off values determined in our previous study [5] were used to define groups of high and low risk. Cox regression revealed TLG and ASP but not MTV as significant prognostic factors with respect to PFS, whereas only MTV and ASP were significant prognostic factors with respect to OS (Table 2A).

Table 2 Prospective group: the results of univariate Cox regression. A: same cut-off values as in [5], B: cut-off values adjusted to prospective group

In a second subanalysis, the cut-off values were adjusted to the current patient group. After this adjustment, Cox regression showed a significant prognostic effect of TLG, MTV, and ASP on both PFS and OS. SUVmax and SUVmean showed no significant effect even after adjusting the cut-off values (see Table 2B).

Combined group

Mean follow-up of all surviving patients was 29 months (range, 14–80.7 months). Twenty-seven patients died during the follow-up. The mean survival time of these patients was 12 months (range, 1–30 months). Progression had occurred after a mean interval of 11 months (range, 1–32.8 months). Nine of the survivors experienced tumor recurrence or progression after an average of 17 months (range, 4–32.8 months).

Correlation analysis revealed, as expected, a strong correlation between MTV and TLG (Spearman’s rho = 0.89) and to a lesser extent between SUVmax/mean and TLG (Spearman’s rho = 0.63/0.61). Correlation of MTV and SUV was < 0.45. ASP was only moderately correlated with MTV and TLG (Spearman’s rho ≤ 0.43). There was no significant correlation between ASP and SUV. All other correlations were significant. Corresponding scatter plots are shown in Fig. 2.

Fig. 2 Scatterplots: TLG vs. MTV (a), ASP vs. MTV (b), ASP vs. TLG (c), SUVmax vs. MTV (d), SUVmax vs. TLG (e), SUVmax vs. ASP (f)

The results of the univariate Cox regression and the corresponding cut-off values are shown in Table 3. Cox regression revealed a significant prognostic value with respect to PFS and OS for MTV, TLG, SUVmax and ASP. The clinical parameters T stage and patient age revealed a trend for significance with respect to both clinical endpoints. Both were included in the multivariate analysis. The investigating site had no significant effect.

Table 3 Combined group: the results of univariate Cox regression

Kaplan–Meier curves for TLG, SUVmax, MTV, and ASP with respect to PFS and OS are shown in Figure 3. The significant prognostic value of each of these parameters demonstrated by the univariate Cox regression was confirmed. The results of the multivariate Cox regression are shown in Table 4. TLG and ASP showed a significant prognostic value with respect to PFS and OS. MTV and SUV were excluded from multivariate Cox regression due to collinearity with TLG. A repeated analysis with MTV and SUVmax instead of TLG showed a significant prognostic value with respect to both clinical endpoints for these parameters as well (MTV: HR = 2.3/2.1, p = 0.04/0.04, SUVmax: HR = 2.1/2.5, p = 0.03/0.04, ASP: HR = 3.7/5.1, p < 0.001).

Fig. 3 Combined group: Kaplan–Meier curves with respect to PFS (left) and OS (right) for MTV (a,e), SUVmax (b,f), TLG (c,g) and ASP (d,h)

Table 4 Combined group: results of multivariate Cox regression of PET parameters with the confounding clinical parameters T stage and patient age. The same risk categories as in Table 3 were used

Discussion

In this work, we further investigated the prognostic value of ASP in patients with head and neck cancer. Our main goal was to corroborate our previously published results [5]. For this purpose, we applied the optimal cut-off for defining subgroups as derived in [5] to an independent patient group (prospective group) measured at a different site. We were able to demonstrate that in the present patient group, too, ASP proved to be an independent significant prognostic factor for PFS and OS according to univariate Cox regression (see Table 2). The observation that a fixed ASP cut-off provides a significant prognostic discrimination of low-risk and high-risk patients in two independent patient groups indicates that ASP can be determined unambiguously across different sites, which is especially relevant in the context of multi-center trials and which is a basic requirement for a generalized use of this new parameter.

This was not the case for MTV and TLG. With the optimal cut-off values defined in [5], MTV was a prognostic factor only for OS and TLG only for PFS according to univariate Cox regression. After adjusting the cut-off values to the respective study group, MTV and TLG were also prognostic factors for both investigated endpoints. However, the derived optimal cut-off values for TLG were distinctly larger for the present group, where cut-offs of 82.6 ml for PFS and 141 ml for OS were derived, compared with 59 ml found in [5]. In accordance with our findings, the prognostic value of pretherapeutic TLG in patients with head and neck cancer has been reported in several previous studies [10–13]; however, the reported cut-off values vary notably. For example, Chan et al. found TLG = 330 as optimal cut-off for predicting OS [10], while Dribble et al. reported a distinctly lower cut-off of TLG = 55 [11].

In the combined patient group, the cut-off values were adjusted to the whole group. This seems more reasonable than using the previously derived cut-offs, since one can expect to derive improved cut-off values from a larger patient group. For ASP, we calculated a cut-off value of 25.6 %, which is very close to the previously derived cut-off (24 %). With this cut-off, ASP was a significant prognostic factor for PFS and OS according to both univariate and multivariate Cox regression. In the combined patient group, MTV, TLG, and SUVmax were also significant prognostic factors for PFS and OS. These results confirm our previous findings, namely that ASP provides information in addition to MTV, TLG, or SUV, and is an independent prognostic factor for therapy outcome in patients with head and neck cancer.

In contrast to SUV-related parameters, ASP is not directly affected by partial volume effects but only depends on the given tumor delineation. However, the delineation (and thus ASP) is affected by the limited spatial resolution of the PET images and, in the case of manual delineation, by inter- and intra-observer variability. The chosen delineation thus has an effect on ASP (as well as on MTV, TLG, and SUVmean). The problem of inter- and intra-observer variability can be mostly avoided by using (supervised) automatic methods for volume delineation. Several viable algorithms for automatic delineation can be found in the literature [14–23]. We used our own algorithm, for which we have shown previously that lesions > 2.5 ml can be delineated reliably with sufficient accuracy [8]. Similar limits of applicability (which are essentially determined by the typical spatial resolution of PET image data) can be expected for the different alternative delineation algorithms.

In the present study, only two out of 68 lesions were slightly smaller than 2.5 ml (2.1 and 2.4 ml, respectively), but were still delineated in a reasonable way according to thorough visual inspection. Nevertheless, near a lesion size of about 2.5 ml (corresponding to a diameter of about 17 mm) ASP becomes progressively less meaningful because the limited spatial resolution drives the lesion image towards a spherical shape (at very small sizes asymptotically approaching the point spread function of the scanner). Another non-negligible factor is that discretization errors become increasingly noticeable for small lesions, which can lead to random variations of the derived ASP values. Overall, the precise limits of validity of the ASP approach are not yet clear and would need to be determined by exhaustive future investigations. This, however, is beyond the scope of the present work, in which lesion size was (apart from the two cases mentioned) large enough that the above-stated problems were not relevant.
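
The size figures quoted above can be reproduced with a short calculation. The sketch below converts a lesion volume to the diameter of the volume-equivalent sphere and, as an illustration of the discretization issue, to the approximate number of voxels it occupies. It assumes the TUD voxel size of 4.1 × 4.1 × 5 mm³ for all volumes, which is an assumption made here (the voxel size of the Charité data is given in [5]); the function name is likewise hypothetical.

```python
import math

def equivalent_diameter_mm(volume_ml):
    """Diameter of a sphere with the given volume (1 ml = 1000 mm^3)."""
    return (6.0 * volume_ml * 1000.0 / math.pi) ** (1.0 / 3.0)

# TUD voxel size from the PET imaging section, in mm^3.
voxel_volume_mm3 = 4.1 * 4.1 * 5.0

for volume_ml in (2.1, 2.4, 2.5):
    d = equivalent_diameter_mm(volume_ml)
    n_voxels = volume_ml * 1000.0 / voxel_volume_mm3
    print(f"{volume_ml} ml: diameter {d:.1f} mm, about {n_voxels:.0f} voxels")
```

Under this assumption a 2.5 ml lesion consists of only about 30 voxels, which makes it plausible that adding or removing a few boundary voxels changes the computed surface area, and hence ASP, noticeably.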

Our study suffers from the following limitations. One limitation is the problem of a non-standardized uptake period at both contributing sites. Variable uptake time is an issue in general and especially when data from multiple centers are evaluated, as was the case in the present study [24]. Therefore, our results for uptake-related parameters in the combined group, e.g., for SUVmax, are of only limited value. Indeed, SUVmax was an independent prognostic factor for OS and PFS in the combined group, but with a rather poor discrimination between high and low risk (see Figure 3B,F). Whether this discrimination would be improved by a standardized acquisition protocol cannot be decided with our data, but this possibility cannot be ruled out. Another noteworthy observation is that TLG seems to be less affected by the variability of the uptake period. A tentative explanation might be that the prognostic value of TLG is dominated by MTV and that SUVmean has a lesser influence. Another limitation of our study is its retrospective character. Furthermore, the size of both investigated patient groups as well as the size of the combined group is still too small for definitive results. A final assessment of ASP will only be possible in a prospective multi-center trial.

Conclusions

Our results confirm that ASP of the pretherapeutic FDG uptake in the primary tumor provides independent significant prognostic information in patients with head and neck cancer. Moreover, these results indicate that ASP can be determined unambiguously across different sites. We consider these results highly encouraging and believe that ASP is a promising new quantitative PET measure that is comparatively easy to determine from the PET data. ASP thus deserves further investigation. A final assessment of ASP in a prospective multi-center trial would be highly desirable.