Abstract
Purpose
Quantification of tumour heterogeneity in PET images has recently gained interest, but has been shown to be dependent on image reconstruction. This study aimed to evaluate the impact of the EANM/EARL accreditation program on selected 18F-FDG heterogeneity metrics.
Methods
We prospectively analysed 71 tumours in 60 biopsy-proven lung cancer patients. Each acquisition was reconstructed in three ways: unfiltered point spread function (PSF) positron emission tomography (PET) images (optimised for diagnostic purposes), PSF-reconstructed images with a 7-mm Gaussian filter (PSF7) chosen to meet European Association of Nuclear Medicine (EANM) 1.0 harmonising standards, and EANM Research Ltd. (EARL)-compliant ordered subset expectation maximisation (OSEM) images. Delineation was performed with the fuzzy locally adaptive Bayesian (FLAB) algorithm on PSF images and transferred to the PSF7 and OSEM images, and with a threshold at 50 % of the maximum standardised uptake value (SUVmax50%) applied independently to each image. Robust and repeatable heterogeneity metrics including 1st-order [area under the curve of the cumulative histogram (CHAUC)], 2nd-order (entropy, correlation, and dissimilarity), and 3rd-order [high-intensity larger area emphasis (HILAE) and zone percentage (ZP)] textural features (TF) were statistically compared.
Results
Volumes obtained with SUVmax50% were significantly smaller than FLAB-derived ones, and were significantly smaller in PSF images compared to OSEM and PSF7 images. PSF-reconstructed images showed significantly higher SUVmax and SUVmean values, as well as heterogeneity for CHAUC, dissimilarity, correlation, and HILAE, and a wider range of heterogeneity values than OSEM images for most of the metrics considered, especially when analysing larger tumours. Histological subtypes had no impact on TF distribution. No significant difference was observed between any of the considered metrics (SUV or heterogeneity features) that we extracted from OSEM and PSF7 reconstructions. Furthermore, the distributions of TF for OSEM and PSF7 reconstructions according to tumour volumes were similar for all ranges of volumes.
Conclusion
PSF reconstruction with Gaussian filtering chosen to meet harmonising standards resulted in similar SUV values and heterogeneity information as compared to OSEM images, which validates its use within the harmonisation strategy context. However, unfiltered PSF-reconstructed images also showed higher heterogeneity according to some metrics, as well as a wider range of heterogeneity values than OSEM images for most of the metrics considered, especially when analysing larger tumours. This suggests that, whenever available, unfiltered PSF images should also be exploited to obtain the most discriminative quantitative heterogeneity features.
Introduction
Using quantitative parameters extracted from the positron emission tomography (PET) component of PET/Computed Tomography (CT) images, such as standardised uptake values (SUV), as biomarkers in multicentre trials or in sites equipped with multiple PET/CT scanners requires that these parameters be comparable among patients, regardless of the PET/CT system used. This can be achieved by harmonising patient preparation, as well as data acquisition, reconstruction, and processing, including the steps for image analysis and parameter extraction [1–3]. The American College of Radiology (ACR) program [4], the European Association of Nuclear Medicine (EANM)/EANM Research Ltd. (EARL) accreditation program [5] and the Society of Nuclear Medicine (SNM) clinical trials network (SNM-CTN) [6] have set up harmonisation programs based on the use of phantom acquisitions. The phantoms are used as standardised objects in order to harmonise data acquisition, processing, and analysis so that the physical, technical, and biological sources of error [1, 7] in SUV measurements can be limited.
A specific issue is related to reconstruction-dependent variations encountered with recently introduced advanced image reconstruction algorithms, such as those incorporating the point spread function (PSF) in the system matrix [8] or Bayesian penalised likelihood (BPL) reconstruction [9]. These new image reconstruction schemes have been shown to produce SUV metrics significantly higher than conventional ordered subset expectation maximisation (OSEM) algorithms [10]. Consequently, an additional filtering step can be used in order to meet harmonising standards [11–13]. With regards to the EANM/EARL program [5], a set of PET images with NEMA NU-2 anthropomorphic phantom-based filtering is mandatory to harmonise SUVs to the EANM standards. Given that centres running PET systems with advanced reconstruction algorithms are often willing to use them with parameters chosen in order to achieve optimal lesion detection, EARL-accredited centres tend to use two PET datasets when participating in multicentre trials: one for optimal lesion detection and image interpretation, and the filtered one for harmonised quantification [12].
It is important to emphasise that all these previous efforts have been focused on typical SUV metrics, as they are commonly used in oncology for therapy assessment and risk stratification. However, there is growing interest in using alternative measurements—for instance, metabolically active tumour volume (MATV) and heterogeneity metrics—in order to provide a more comprehensive quantitative assessment of lesions from PET images [14, 15]. One of the most promising approaches for heterogeneity quantification is textural features analysis, introduced for image processing applications in the 1970s, used in magnetic resonance imaging (MRI) and CT since the early 1990s, and more recently in PET [16]. As PSF reconstruction improves resolution and therefore provides higher definition of structures within a lesion, it is reasonable to expect improved evaluation of tumour heterogeneity as compared to OSEM algorithms. This raises the question of which reconstruction should be used for assessing tumour heterogeneity within a program using a smoothed dataset to reach harmonising standards. Two studies have already reported on the impact of the type of reconstruction algorithm or variation of reconstruction parameters on the textural features values [17, 18]. However, they have mostly been focused on reporting the quantitative impact only, and have neither explored the issue within the context of harmonisation programs nor looked at the relationship between heterogeneity and volume, an important aspect that was recently demonstrated [19].
This study focused on lung cancer, a tumour type for which standard SUV metrics have been proven to be clinically useful [20–23] and for which quantification of tumour heterogeneity in PET images has recently gained interest [24, 25], and aimed at evaluating the potential impact of the EARL accreditation program [5] on selected 18F-FDG heterogeneity metrics. The primary aim was to compare several heterogeneity features previously identified as reliable (robust and repeatable) in lung cancer patients, in PSF-reconstructed images, PSF-reconstructed images with a filter chosen to meet harmonising standards, and in EARL-compliant OSEM images, hereafter referred to as OSEM images. This comparison was performed not only in terms of absolute values but also in terms of their distributions with respect to tumour volume, which was not considered in previous studies. A secondary aim was to study whether potential differences in heterogeneity features amongst these three reconstructions would be similar in adenocarcinomas (ADC), squamous cell carcinomas (SqCC), and large cell lung cancer (LCC), the main histological types encountered in non-small cell lung cancers (NSCLC).
Materials and methods
Patients’ selection
Over a 3-month period, 60 consecutive biopsy-proven lung cancer patients (four with small cell lung cancer and 56 with NSCLC) were prospectively included. Informed consent was waived for this type of study by the local ethics committees (Ref A12-D24-VOL13, Comité de protection des personnes Nord-Ouest III) since the scans were performed for clinical indications and the study procedures were performed independently of normal clinical reporting.
PET calibration and cross-calibration
The calibration of the PET system was performed daily with a 68Ge cylinder with a known radioactive concentration.
The cross-calibration procedure was performed twice during the present study, as per the EANM guidelines [11]. Details regarding this cross-calibration can be found elsewhere [12]. The cross-calibration factors were found to be 0.99 and 1.00.
PET/CT examinations
After a 15-min rest in a warm room, patients who had been fasting for 6 h were injected with 18F-FDG. The injected activity and the exact delay between injection and the start of the acquisition were recorded for each patient.
All PET imaging studies were performed on a Biograph TrueV (Siemens Medical Solutions) with a 6-slice spiral CT component. For additional technical details regarding this system we refer to a previous publication [26]. CT acquisition was performed first, with the following parameters: 60 mAs, 130 kVp, pitch 1 and 6 × 2 mm collimation. Subsequently, the PET emission acquisition was performed in three-dimensional (3-D) mode. Patients were scanned from the skull base to the mid-thighs.
PET reconstruction
The standard reconstruction in the department where patients were recruited is a PSF reconstruction algorithm (HD; TrueX, Siemens Medical Solutions; three iterations and 21 subsets) without filtering. For the purpose of the present study, raw data were also reconstructed with the OSEM reconstruction algorithm (four iterations and eight subsets) and a PSF reconstruction algorithm (HD; TrueX, Siemens Medical Solutions; three iterations and 21 subsets) with a 7-mm Gaussian filter (PSF7). As shown in a previous study, this latter reconstruction leads to protocol-specific images with NEMA NU-2 phantom-based filtering that meet EANM quantitative harmonising standards, therefore reducing reconstruction-dependent variation in SUVs [12]. The OSEM reconstruction parameters met the EANM requirements regarding activity recovery.
For all reconstructions, matrix size was 168 × 168 voxels, resulting in isotropic voxels of 4.07 × 4.07 × 4.07 mm3. Scatter and attenuation (using the associated CT) corrections were applied.
PET tumour delineation
All lesions were first automatically delineated in the PSF PET image using the fuzzy locally adaptive Bayesian (FLAB) algorithm, and the resulting segmentations were transferred to the two other images (OSEM and PSF7). This process avoided any variability in the tumour volume definition and number of voxels involved in the calculations when comparing features across the three images. FLAB has been developed specifically for PET image segmentation [27] and has been thoroughly validated for reproducibility, robustness, and repeatability [28, 29], as well as for accuracy on simulated and clinical images [30]. Tumours were first located and isolated in a volume of interest (VOI) fully enclosing the tumour and its surrounding background, without including nearby pathological uptake. This was performed using in-house software in which points are placed by the user around the tumour (see Supplemental Fig. 1). The FLAB algorithm was then applied to this VOI in fully automated mode, in contrast to a semi-supervised approach considered in a previous work on [18F]fluorothymidine (FLT) images during radiotherapy, in which the contrast and signal-to-noise ratio were lower [31].
However, in order to be more representative of a current multi-centric clinical setting, tumour volumes were also defined using a fixed threshold at 50 % of SUVmax (SUVmax50%) applied independently to each of the three images. Furthermore, for the most discordant volumes between PSF7/OSEM and PSF (outliers located above the 90th percentile) when using SUVmax50%, tumours were also segmented independently with FLAB on each set of images.
Features were then extracted from each of the three volumes and compared.
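As a minimal sketch of the fixed-threshold delineation described above, the following Python snippet keeps every voxel of a VOI whose SUV is at least 50 % of the VOI maximum and converts the voxel count into a volume. The function name, the flat-list representation of the VOI, and the toy SUV values are illustrative assumptions; only the 50 % fraction and the 4.07-mm isotropic voxel size come from the text. Spatial connectivity, which a clinical implementation would probably enforce, is ignored here.

```python
def threshold_volume(suv_values, fraction=0.5, voxel_mm=4.07):
    """Return (mask, volume_cm3) for a fixed threshold at fraction * SUVmax.

    suv_values: flat list of voxel SUVs inside the VOI (hypothetical layout).
    """
    suv_max = max(suv_values)
    cutoff = fraction * suv_max
    # A voxel belongs to the tumour if it reaches the cutoff.
    mask = [v >= cutoff for v in suv_values]
    voxel_cm3 = (voxel_mm ** 3) / 1000.0  # mm^3 -> cm^3
    volume_cm3 = sum(mask) * voxel_cm3
    return mask, volume_cm3


# Toy profiles through the same lesion (made-up values, not study data):
# a smoother OSEM-like profile and a sharper PSF-like profile.
osem_like = [1.0, 4.0, 6.0, 8.0, 6.0, 4.0, 1.0]
psf_like = [1.0, 4.0, 6.0, 13.0, 6.0, 4.0, 1.0]

mask_osem, vol_osem = threshold_volume(osem_like)
mask_psf, vol_psf = threshold_volume(psf_like)
```

In this toy case the higher peak of the PSF-like profile raises the cutoff, so fewer voxels pass it. This is the mechanism by which a threshold relative to SUVmax can yield smaller volumes on sharper images.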
Tumour characterisation and quantification
From the FLAB-delineated volumes, all “standard” metrics were extracted: SUVmax, SUVmean and metabolically active tumour volume. To characterise uptake heterogeneity, several metrics were considered: on the one hand, a first-order metric based on the intensity histogram (IH) [16] denoted area under the curve of the cumulative histogram (CHAUC) [32], and on the other hand, second- and third-order textural features (TF). The metric CHAUC is based on the intensity histogram only and does not incorporate spatial information. TFs have been defined to quantify patterns of spatial arrangements and/or intensity variations. There exist dozens of TFs based on different computational frameworks. In the present work, we used only a few selected TFs. This selection was based on several previous studies showing that most of the features (including first-order metrics such as skewness), especially third-order metrics focusing on small areas and/or low intensities, are unreliable due to poor robustness vs. reconstruction [17, 18] or partial-volume effects and segmentation [33], and low repeatability on test-retest images [34]. The remaining features are either calculated from co-occurrence (second-order: entropy, correlation, and dissimilarity) or size-zone matrices [third-order: high-intensity larger area emphasis (HILAE) and zone percentage (ZP)]. Before building these matrices, images are first discretised into a chosen number of bins (B) with a quantisation step. It has been shown that the choice of the quantisation value (usually between 8 and 256) has an important impact not only on the resulting TF value, but also on its reproducibility [34] and on its complementary value with respect to the tumour volume in which it is calculated [19].
Based on these previous results, a value of B = 64 was used in the present work, and the quantisation was performed using equation 1, in which I(x) is the original SUV of the voxel of interest and SUVmin and SUVmax are the minimal and maximal SUV values within the tumour volume.
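Equation 1 is not reproduced in this text, so the sketch below assumes a common linear resampling of SUVs between SUVmin and SUVmax into B bins. The exact form used in the study may differ, for example in how the upper bound is handled. The function name and the toy values are illustrative.

```python
def quantise(suv_values, n_bins=64):
    """Map SUVs linearly onto integer bins 0 .. n_bins - 1 (assumed form).

    SUVmin and SUVmax are taken within the tumour volume, as in the text.
    """
    suv_min = min(suv_values)
    suv_max = max(suv_values)
    span = suv_max - suv_min
    if span == 0:
        # Perfectly uniform uptake: every voxel falls into the first bin.
        return [0 for _ in suv_values]
    bins = []
    for v in suv_values:
        b = int(n_bins * (v - suv_min) / span)
        bins.append(min(b, n_bins - 1))  # SUVmax is clamped into the top bin
    return bins


# Made-up SUVs for four tumour voxels.
levels = quantise([2.0, 4.0, 6.0, 10.0], n_bins=64)
```

Because the bin edges depend on the SUVmin and SUVmax of each tumour, the same voxel SUV can map to different bins in different reconstructions of the same lesion.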
For co-occurrence matrices, it has been shown that less redundant features are obtained when calculated using a single co-occurrence matrix taking into account all 13 spatial directions simultaneously, rather than computing a matrix for each direction followed by averaging [19, 35]. A single co-occurrence matrix was thus adopted in the present work.
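The single-matrix approach can be sketched as follows, under stated assumptions: the tumour is stored as a dictionary of quantised levels keyed by voxel coordinates, each voxel pair is counted symmetrically, and entropy uses a base-2 logarithm. These choices and all names are illustrative, not the study's implementation. Correlation is omitted for brevity.

```python
import math
from itertools import product

# 13 unique 3-D directions: half of the 26 neighbours (the rest are opposites).
DIRECTIONS = [d for d in product((-1, 0, 1), repeat=3) if d > (0, 0, 0)]


def cooccurrence(volume, n_levels):
    """Build one normalised, symmetric matrix pooling all 13 directions.

    volume: dict mapping (z, y, x) -> quantised level, tumour voxels only.
    """
    counts = [[0.0] * n_levels for _ in range(n_levels)]
    for (z, y, x), i in volume.items():
        for dz, dy, dx in DIRECTIONS:
            j = volume.get((z + dz, y + dy, x + dx))
            if j is None:
                continue  # neighbour lies outside the tumour mask
            counts[i][j] += 1.0
            counts[j][i] += 1.0
    total = sum(map(sum, counts))
    if total == 0:
        return counts
    return [[c / total for c in row] for row in counts]


def entropy(p):
    """Shannon entropy of the matrix (base 2 is an assumption)."""
    return -sum(v * math.log2(v) for row in p for v in row if v > 0)


def dissimilarity(p):
    """Mean absolute level difference between neighbouring voxels."""
    return sum(abs(i - j) * v for i, row in enumerate(p) for j, v in enumerate(row))


# Toy 2 x 2 x 2 tumours with two grey levels (made-up data).
coords = list(product((0, 1), repeat=3))
uniform = {c: 0 for c in coords}
mixed = {c: sum(c) % 2 for c in coords}

p_uniform = cooccurrence(uniform, n_levels=2)
p_mixed = cooccurrence(mixed, n_levels=2)
```

Pooling all directions into one matrix yields a single value per feature and per tumour, rather than 13 directional values that then have to be averaged.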
Noise analysis in PET images
In order to evaluate noise characteristics of each of the three reconstructions, the signal-to-noise ratio (SNR), defined as \( 20 \times \log_{10}\left(\frac{\mu}{\sigma}\right) \) dB [36], where μ and σ are the mean and standard deviation of intensities, was measured in circular regions of interest (ROIs) placed in homogeneous regions of the liver and automatically copied to each reconstruction.
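As a worked example of this definition, the short Python sketch below computes the SNR in dB for a list of ROI intensities. The function name and the ROI values are made up for illustration, and the use of the sample standard deviation (rather than the population one) is an assumption.

```python
import math
import statistics


def snr_db(roi_values):
    """SNR in dB: 20 * log10(mean / standard deviation) of ROI intensities."""
    mu = statistics.mean(roi_values)
    sigma = statistics.stdev(roi_values)  # sample SD; an assumption
    return 20.0 * math.log10(mu / sigma)


# Hypothetical SUVs in a homogeneous liver ROI.
liver_roi = [2.0, 2.2, 1.8, 2.1, 1.9]
snr = snr_db(liver_roi)
```

For these values the mean is 2.0 and the sample variance is 0.025, so the ratio μ/σ equals the square root of 160 and the SNR is about 22.0 dB.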
Statistical analysis
Quantitative data are presented as mean (standard deviation), as well as the median when not normally distributed. Bland-Altman analyses were used to compare the SUV metrics obtained in the three images. The features obtained on each of the three sets of PET images were first compared globally using Friedman tests. Graphical plots of each feature depending on tumour volume were also used to estimate the impact of PSF reconstruction compared to OSEM and PSF7 images, and the features were then compared by categories of volumes using Friedman tests. MATV, SUVmax, and TFs extracted from the three sets of data were compared according to the histological type of the tumour (ADC, SqCC, and LCC) using Kruskal-Wallis tests. For all tests, a two-tailed P value of less than 0.05 was considered statistically significant. Graphs and analyses were carried out using Prism (GraphPad Software, La Jolla, CA).
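The study's tests were run in Prism; purely to illustrate the paired three-way comparison, the sketch below computes the Friedman chi-square statistic for one feature measured on the same tumours in three reconstructions. It uses the large-sample chi-square approximation without tie correction, which is an assumption and may differ from what Prism reports for small samples. All names and values are hypothetical.

```python
import math


def rank_row(row):
    """Rank values within one tumour (1 = smallest), averaging tied ranks."""
    order = sorted(range(len(row)), key=lambda k: row[k])
    ranks = [0.0] * len(row)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and row[order[end + 1]] == row[order[pos]]:
            end += 1
        avg = (pos + end) / 2.0 + 1.0
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks


def friedman_three(rows):
    """Friedman statistic and approximate p value for exactly three conditions."""
    n, k = len(rows), 3
    rank_sums = [0.0] * k
    for row in rows:
        for j, r in enumerate(rank_row(row)):
            rank_sums[j] += r
    q = 12.0 / (n * k * (k + 1)) * sum(s * s for s in rank_sums) - 3.0 * n * (k + 1)
    # With k = 3 there are 2 degrees of freedom, for which the chi-square
    # survival function has the closed form exp(-q / 2).
    p = math.exp(-q / 2.0)
    return q, p


# Made-up SUVmax values per tumour: (OSEM, PSF, PSF7).
rows = [
    (10.0, 15.0, 10.2),
    (8.1, 12.0, 8.0),
    (5.0, 7.5, 5.1),
    (12.3, 18.0, 12.1),
    (6.0, 9.1, 6.2),
    (9.4, 14.2, 9.3),
]
q_stat, p_value = friedman_three(rows)
```

In this toy dataset PSF always ranks highest while OSEM and PSF7 alternate, so the rank sums are 9, 18, and 9 and the test flags a global difference. A post hoc pairwise comparison would be needed to show that it is driven by PSF.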
Results
Population characteristics and compliance to guidelines for tumour imaging
Population characteristics are displayed in Table 1. Overall, 58 (96.7 %) patient examinations fulfilled the EANM 2.0 guidelines for PET tumour imaging. The mean (SD) injected dose of 18F-FDG was 4.02 (0.16) MBq/kg. The mean (SD) delay between the injection and the start of the PET acquisition was 60.43 (3.38) min. The mean (SD) blood glucose level was 1.04 (0.23) mmol/L.
Validation of the use of an additional harmonised PET dataset to overcome reconstruction-dependency of SUVs
Overall, 71 pulmonary lesions were delineated. The mean (SD, median) FLAB-derived MATV was 31.7 (46.4, 9.7) cm3. The mean (SD) SUVmax for OSEM, PSF, and PSF7 reconstructions were 10.50 (5.85), 15.42 (9.56), and 10.56 (5.88), respectively. The mean (SD) SUVmean for OSEM, PSF, and PSF7 reconstructions were 6.14 (2.99), 7.37 (4.03), and 6.25 (2.98), respectively.
As shown in Supplemental Fig. 2, a Bland-Altman analysis demonstrated that the mean ratios of PSF to OSEM reconstructions for SUVmax and SUVmean were 1.46 (95 % CI = 0.86–2.08) and 1.19 (95 % CI = 0.71–1.67), respectively. When using the PSF7 harmonised reconstruction, the mean ratios of PSF7 to OSEM reconstructions were 1.01 (95 % CI = 0.93–1.09) and 1.02 (95 % CI = 0.95–1.09) for SUVmax and SUVmean, respectively.
Compared to OSEM, SNR in the liver was lower in PSF images (−25.8 ± 3.9 %), whereas it was very similar in PSF7 images (1.2 ± 2.9 %).
Impact of newest reconstruction algorithms on textural features
As a first step, we used FLAB to delineate lesions in PSF images and transferred this segmentation to OSEM images. For the first-order metric CHAUC based on the intensity histogram, most tumours were quantified as significantly more heterogeneous in PSF images compared to OSEM images, as PSF values were significantly lower than OSEM ones (a lower area under the curve indicating higher heterogeneity). Regarding second-order metrics calculated on the co-occurrence matrix (entropy, correlation, and dissimilarity), on the one hand, PSF values were significantly lower for correlation and significantly higher for dissimilarity (in both cases indicating higher heterogeneity), compared to OSEM reconstruction. On the other hand, no significant difference between PSF and OSEM images was observed for entropy. Regarding third-order metrics calculated on size-zone matrices (HILAE and ZP), there was a significant difference between PSF and OSEM images only for HILAE values, which were lower, indicating higher heterogeneity. Figure 1 displays TFs for the three reconstructions used.
Heterogeneity features were also analysed depending on the range of tumour volumes. As shown in Fig. 2, the dispersion of the values (represented by the interquartile range) was larger and calculated values were significantly smaller for PSF reconstruction as compared to OSEM reconstruction for tumour volumes larger than 1 cm3 in the case of CHAUC. For HILAE, calculated values were also significantly smaller for tumour volumes larger than 1 cm3 but the dispersion of values was narrower for PSF compared to OSEM. Dissimilarity values were significantly higher in PSF for volumes >50 cm3 and the dispersion of these values was larger in PSF images for tumour volumes larger than 1 cm3, compared to OSEM reconstruction. Distributions for all metrics are shown in detail in Supplemental Figs. 3 and 4.
When defining volumes using SUVmax50% applied independently to OSEM and PSF images, mean (SD, median) MATVs were significantly smaller on PSF images [9.0 (12.4, 3.3) cm3] as compared to OSEM images [18.8 (24.2, 7.3) cm3] (p < 0.0001). There were significant differences between PSF and OSEM images for first-order, second-order, and third-order metrics with the same trends as detailed above (Fig. 3). Figure 4 displays representative examples of tumour delineation using the FLAB algorithm and a threshold at 50 % of SUVmax, as well as the related SUV and TF metrics.
When comparing volumes obtained with SUVmax50% on PSF to OSEM and PSF7 ones, eight outliers (above the 90th percentile) were observed and re-processed by independent contouring with FLAB. There was no significant difference between PSF and OSEM mean (median, SD) MATV [12.8 (12.2, 4.8) cm3 and 13.4 (13.0, 4.8) cm3, respectively]. There were significant differences between PSF and OSEM images for first-order, second-order, and third-order metrics as previously described, except for dissimilarity and ZP, for which there were only trends (Supplemental Fig. 5).
Effect of the harmonisation strategy on textural features
In the previous section, using the FLAB algorithm to delineate lesions, we found that data extracted from OSEM reconstructions were different from those extracted from PSF reconstructions for several TFs but also for CHAUC. When comparing these metrics extracted from OSEM and PSF7 reconstructions, none exhibited significant differences (Fig. 1). Furthermore, the distributions of their values according to MATV were much more similar for all ranges of volumes (Fig. 2) with no significant difference in whatever range of tumour volumes was considered (except for HILAE in volumes larger than 50 cm3), highlighting that the quantifiable heterogeneity content of the PSF7 images was very close to the one contained in OSEM images.
We also defined tumour volumes using SUVmax50% applied independently to OSEM and PSF7 images. When using this methodology, no significant difference was observed between MATV, CHAUC, and all TFs extracted from OSEM and PSF7 reconstructions (Fig. 3). Mean (SD, median) MATV were 18.8 (24.2, 7.3) cm3 and 19.5 (25.5, 7.7) cm3 for OSEM and PSF7 reconstructions, respectively (ns).
For the eight outliers described above, for which FLAB was used independently on the three sets of images, there was no difference between OSEM and PSF7 mean (median, SD) MATV [13.4 (13.0, 4.8) cm3 and 13.6 (13.2, 5.11) cm3, respectively] or between textural features extracted from OSEM and PSF7 images (Supplemental Fig. 5).
SUV metrics and heterogeneity features amongst the histological subtypes
Standard metrics exhibited significant differences amongst the three NSCLC histological subtypes. In particular, there was a trend towards smaller volumes in ADC. SUVmax values also differed among the three subtypes; however, the three distributions overlapped largely. SUVmax values obtained in PSF reconstructed images were higher for all three subtypes, though the differentiation between subtypes was unchanged: ADC had significantly lower SUVmax than SqCC and LCC in all three reconstructions (Fig. 5).
Although the heterogeneity metrics were differently distributed with the three different reconstruction schemes, none of them were significantly different among the three histological subtypes, whatever reconstructed image set was considered (Fig. 6).
Discussion
Heterogeneity metrics, especially textural features, have gained interest in the past few years to quantify intratumour heterogeneity in PET images. There have been several studies highlighting the dependency of these metrics to various factors, including the image analysis workflow (such as tumour delineation or partial-volume effects correction) [33], the image reconstruction schemes or parameters [17, 18, 37] and basic stochastic effects occurring in the PET acquisition process [38].
Our results confirm some of these previous results regarding the impact of the reconstruction choices on these metrics values [18]. Compared to OSEM images, unfiltered PSF-reconstructed images showed lower SNR in the liver, higher heterogeneity and a wider range of heterogeneity values in the tumour, for most of the metrics when using FLAB (independently on the three sets of images or not) and for all of the metrics considered in the present work when using SUVmax50%, to be more representative of a current multi-centric clinical setting. As expected, this difference was most apparent when analysing larger tumours. Our study indeed sheds light on the impact of reconstruction algorithms on the distributions of heterogeneity features with respect to tumour volume, which had not been considered in these previous studies. Regarding the differences observed in the case of SUVmax50%, it should be emphasised that part of these can be directly attributed to the fact that this segmentation method applied to PSF images led to significantly smaller volumes than on OSEM and PSF7 images, with sometimes drastically reduced volumes not covering the spatial extent of the tumour uptake (see Fig. 4c and d and Supplemental Fig. 6). This method, which has been evaluated previously mostly on standard non-PSF images, is thus clearly not appropriate to extract tumour volume and associated metrics from PSF-reconstructed images because of their higher contrast.
Thus, the impact of reconstruction for comparable tumour volumes was found to be significant for some metrics (CHAUC, correlation, dissimilarity, HILAE) and only an observable trend for others (entropy, ZP), and the differences increased with larger tumour volumes (for instance, in the case of ZP, differences were significant only for tumours larger than 50 cm3). This suggests that PSF-based reconstruction may provide more quantifiable heterogeneity-related information in larger tumours than OSEM images, as the interval between smallest and highest values increases, thereby providing more potential for differentiating different levels of heterogeneity in these tumours. Our results also highlight the fact that some TFs seem more sensitive than others to the changes in PSF-reconstructed images compared to OSEM images when analysing similar volumes determined with FLAB: CHAUC, correlation, and HILAE showed higher sensitivity with larger differences in both overall and volume-related distributions than entropy, dissimilarity, and ZP.
The present study was conducted within the overall harmonisation strategy context and focused on the EARL accreditation program, which is why unfiltered PSF images (optimised for diagnostic purposes) and OSEM images were compared to PSF images filtered with a 7-mm Gaussian filter chosen to meet the EANM 1.0 guidelines (PSF7). As previously published [12], the use of PSF7 resulted in SNR in the liver and SUVmax values in the tumour very close to OSEM, and the same pattern was observed for heterogeneity metrics. All metrics considered in the present study were very close with no significant differences when extracted from OSEM and PSF7 images, no matter the delineation technique used. This suggests that OSEM and PSF7 EARL-compliant images present a similar quantifiable heterogeneity content, and validates the use of TFs extracted from PSF-filtered images for multi-centre studies. However, as stated above, our results also suggest that using unfiltered PSF-based reconstructions could potentially provide more discriminative image features allowing for higher differentiation amongst patients, for studies aiming to quantify tumour heterogeneity using TFs and exploit these metrics for a clinical endpoint, such as patient stratification according to survival or response to therapy. Of note, these studies have been mostly performed in single sites, but future validation studies will likely require pooling data from several centres in order to obtain larger cohorts with enough statistical power. This raises the issue of using filtered-harmonised PSF images so that they can be pooled with OSEM data from other centres (potentially losing some discriminative power from TFs), or pooling data only from centres using PSF reconstruction with no post-filtering. This issue is problematic, as the EARL accreditation program is not meant to exclude images from centres running PET systems not equipped with PSF reconstruction or other advanced algorithms. 
Also, the sensitivity of TFs to reconstruction parameters needs to be interpreted in the context of an important reconstruction disparity within PET centres, even in centres running the same PET system, as recently reported by the Clinical Trials Network of the Society of Nuclear Medicine and Molecular Imaging (SNMMI) [39]. Taken together, these findings suggest that PSF reconstruction with a Gaussian filtering chosen to meet harmonising standards could be used within the harmonisation strategy context for studies aiming to quantify intra-tumour heterogeneity to stratify, rank, or classify patients with respect to a given clinical endpoint. In addition, we recommend that whenever available, unfiltered PSF images should also be analysed, especially for large, single-centre series since quantitative metrics obtained from these could potentially offer higher discriminative power. This of course requires additional studies in larger cohorts to be carried out.
One limitation of our study is the inclusion of a single system where the underlying reconstruction was identical apart from the use of the PSF modeling. Differences in image reconstruction methods between several vendors may give rise to additional variability that needs to be evaluated before texture analysis could be reliably used in the context of multi-centric studies, despite the demonstrated repeatability and robustness of several features versus changes in image properties [17, 18, 37].
Finally, we sought to identify differences in heterogeneity features within NSCLC histological subtypes. Although we showed that the ADC presented much smaller volumes as well as lower SUVmax values than the SqCC and LCC subtypes, none of the heterogeneity metrics showed any discriminative power in differentiating these subtypes, whichever reconstruction method was used, which is in line with recent results obtained in breast cancer studies [40]. On the other hand, it contradicts another recent study that suggested textural features could differentiate between ADC and SqCC in a cohort of 30 NSCLC Asian patients [41]. Those results were obtained from 2D-slice analysis only, not 3D volume analysis, and required the combination of numerous parameters through machine learning (automated clustering) in order to differentiate the two subtypes. The derived model was not validated in an external cohort. This possibly led to overfitting and the results might not be generalisable to other series of patients, especially European ones. Our results suggest that heterogeneity features could be used in a multi-centre setting regardless of the histology in series of European patients. Indeed, results in terms of heterogeneity in Asian patients may not be applicable to European patients, as not only is the ratio between ADC and SqCC inverted in these populations, but the rate of EGFR mutation is also higher in Asian patients (20–40 %) compared to European patients (around 10 %) [42]. One could therefore postulate different TFs in ADC depending on the mutation status. Studies with a larger ADC population, focusing on the differences in heterogeneity features between mutated and non-mutated ADC, are therefore required to complement recent data on standard SUV metrics [43, 44].
Conclusion
The use of PSF reconstruction with Gaussian filtering chosen to meet harmonising standards produced comparable SUV values, as well as similar levels of heterogeneity information, compared to OSEM images, which validates its use within the harmonisation strategy context for studies aiming to quantify intra-tumour heterogeneity to stratify, rank, or classify patients. However, unfiltered PSF-reconstructed images showed significantly higher heterogeneity for CHAUC, correlation, and HILAE, as well as a wider range of heterogeneity values than OSEM ones, for most of the metrics considered, especially when analysing larger tumours. This suggests that, whenever available, unfiltered PSF images should also be analysed because resulting quantitative heterogeneity features could be more discriminative in stratifying or ranking patients, which remains to be demonstrated. Finally, the main NSCLC histological subtypes in this cohort did not show any differences in terms of intra-tumour heterogeneity, despite some notable differences in metabolically active tumour volume and levels of uptake (SUVmax). This may facilitate the potential multi-centre use of heterogeneity features regardless of the histology in series of European patients.
References
Boellaard R. Standards for PET image acquisition and quantitative data analysis. J Nucl Med. 2009;50 Suppl 1:11S–20S. doi:10.2967/jnumed.108.057182.
Boellaard R. Methodological aspects of multicenter studies with quantitative PET. Methods Mol Biol. 2011;727:335–49. doi:10.1007/978-1-61779-062-1_18.
Boellaard R. Mutatis mutandis: harmonize the standard! J Nucl Med. 2012;53:1–3. doi:10.2967/jnumed.111.094763.
American College of Radiology, Nuclear Medicine and PET accreditation program http://www.acraccreditation.org/Modalities/Nuclear-Medicine-and-PET.
European Association of Nuclear Medicine. EARL FDG-PET/CT accreditation. 2015. http://earl.eanm.org/cms/website.php?id=/en/projects/fdg_pet_ct_accreditation.htm.
The Society of Nuclear Medicine and Molecular Imaging Clinical trials network (SNMMI-CTN). Scanner validation program. http://www.snmmi.org/Research/Content.aspx?ItemNumber=5482&navItemNumber=6834.
Makris NE, Huisman MC, Kinahan PE, Lammertsma AA, Boellaard R. Evaluation of strategies towards harmonization of FDG PET/CT studies in multicentre trials: comparison of scanner validation phantoms and data analysis procedures. Eur J Nucl Med Mol Imaging. 2013;40:1507–15. doi:10.1007/s00259-013-2465-0.
Panin VY, Kehren F, Michel C, Casey M. Fully 3-D PET reconstruction with system matrix derived from point source measurements. IEEE Trans Med Imaging. 2006;25:907–21.
Teoh EJ, McGowan DR, Macpherson RE, Bradley KM, Gleeson FV. Phantom and clinical evaluation of the Bayesian penalized likelihood reconstruction algorithm Q.Clear on an LYSO PET/CT system. J Nucl Med. 2015. doi:10.2967/jnumed.115.159301.
Bellevre D, Blanc Fournier C, Switsers O, Dugue AE, Levy C, Allouache D, et al. Staging the axilla in breast cancer patients with 18F-FDG PET: how small are the metastases that we can detect with new generation clinical PET systems? Eur J Nucl Med Mol Imaging. 2014;41:1103–12. doi:10.1007/s00259-014-2689-7.
Boellaard R, Delgado-Bolton R, Oyen WJ, Giammarile F, Tatsch K, Eschner W, et al. FDG PET/CT: EANM procedure guidelines for tumour imaging: version 2.0. Eur J Nucl Med Mol Imaging. 2015;42:328–54. doi:10.1007/s00259-014-2961-x.
Lasnon C, Desmonts C, Quak E, Gervais R, Do P, Dubos-Arvis C, et al. Harmonizing SUVs in multicentre trials when using different generation PET systems: prospective validation in non-small cell lung cancer patients. Eur J Nucl Med Mol Imaging. 2013;40:985–96. doi:10.1007/s00259-013-2391-1.
Quak E, Le Roux PY, Hofman MS, Robin P, Bourhis D, Callahan J, et al. Harmonizing FDG PET quantification while maintaining optimal lesion detection: prospective multicentre validation in 517 oncology patients. Eur J Nucl Med Mol Imaging. 2015. doi:10.1007/s00259-015-3128-0.
Mertens J, Dobbeleir A, Ham H, D’Asseler Y, Goethals I, Van de Wiele C. Standardized added metabolic activity (SAM): a partial volume independent marker of total lesion glycolysis in liver metastases. Eur J Nucl Med Mol Imaging. 2012;39:1441–8. doi:10.1007/s00259-012-2166-0.
Visvikis D, Hatt M, Tixier F, Cheze Le Rest C. The age of reason for FDG PET image-derived indices. Eur J Nucl Med Mol Imaging. 2012;39:1670–2. doi:10.1007/s00259-012-2239-0.
El Naqa I, Grigsby P, Apte A, Kidd E, Donnelly E, Khullar D, et al. Exploring feature-based approaches in PET images for predicting cancer treatment outcomes. Pattern Recogn. 2009;42:1162–71. doi:10.1016/j.patcog.2008.08.011.
Galavis PE, Hollensen C, Jallow N, Paliwal B, Jeraj R. Variability of textural features in FDG PET images due to different acquisition modes and reconstruction parameters. Acta Oncol. 2010;49:1012–6. doi:10.3109/0284186X.2010.498437.
Yan J, Chu-Shern JL, Loi HY, Khor LK, Sinha AK, Quek ST, et al. Impact of image reconstruction settings on texture features in 18F-FDG PET. J Nucl Med. 2015;56:1667–73. doi:10.2967/jnumed.115.156927.
Hatt M, Majdoub M, Vallieres M, Tixier F, Le Rest CC, Groheux D, et al. 18F-FDG PET uptake characterization through texture analysis: investigating the complementary nature of heterogeneity and functional tumor volume in a multi-cancer site patient cohort. J Nucl Med. 2015;56:38–44. doi:10.2967/jnumed.114.144055.
Calais J, Dubray B, Nkhali L, Thureau S, Lemarignier C, Modzelewski R, et al. High FDG uptake areas on pre-radiotherapy PET/CT identify preferential sites of local relapse after chemoradiotherapy for locally advanced oesophageal cancer. Eur J Nucl Med Mol Imaging. 2015;42:858–67. doi:10.1007/s00259-015-3004-y.
Sauter AW, Schwenzer N, Divine MR, Pichler BJ, Pfannenberg C. Image-derived biomarkers and multimodal imaging strategies for lung cancer management. Eur J Nucl Med Mol Imaging. 2015;42:634–43. doi:10.1007/s00259-014-2974-5.
Takeuchi S, Khiewvan B, Fox PS, Swisher SG, Rohren EM, Bassett Jr RL, et al. Impact of initial PET/CT staging in terms of clinical stage, management plan, and prognosis in 592 patients with non-small-cell lung cancer. Eur J Nucl Med Mol Imaging. 2014;41:906–14. doi:10.1007/s00259-013-2672-8.
Vera P, Mezzani-Saillard S, Edet-Sanson A, Menard JF, Modzelewski R, Thureau S, et al. FDG PET during radiochemotherapy is predictive of outcome at 1 year in non-small-cell lung cancer patients: a prospective multicentre study (RTEP2). Eur J Nucl Med Mol Imaging. 2014;41:1057–65. doi:10.1007/s00259-014-2687-9.
Fried DV, Mawlawi O, Zhang L, Fave X, Zhou S, Ibbott G, et al. Stage III non-small cell lung cancer: prognostic value of FDG PET quantitative imaging features combined with clinical prognostic factors. Radiology. 2016;278:214–22. doi:10.1148/radiol.2015142920.
Lovinfosse P, Janvary ZL, Coucke P, Jodogne S, Bernard C, Hatt M, et al. FDG PET/CT texture analysis for predicting the outcome of lung cancer treated by stereotactic body radiation therapy. Eur J Nucl Med Mol Imaging. 2016. doi:10.1007/s00259-016-3314-8.
Aide N, Desmonts C, Beauregard JM, Beyer T, Kinross K, Roselt P, et al. High throughput static and dynamic small animal imaging using clinical PET/CT: potential preclinical applications. Eur J Nucl Med Mol Imaging. 2010;37:991–1001. doi:10.1007/s00259-009-1352-1.
Hatt M, Cheze le Rest C, Turzo A, Roux C, Visvikis D. A fuzzy locally adaptive Bayesian segmentation approach for volume determination in PET. IEEE Trans Med Imaging. 2009;28:881–93. doi:10.1109/TMI.2008.2012036.
Hatt M, Cheze Le Rest C, Albarghach N, Pradier O, Visvikis D. PET functional volume delineation: a robustness and repeatability study. Eur J Nucl Med Mol Imaging. 2011;38:663–72. doi:10.1007/s00259-010-1688-6.
Hatt M, Cheze-Le Rest C, Aboagye EO, Kenny LM, Rosso L, Turkheimer FE, et al. Reproducibility of 18F-FDG and 3′-deoxy-3′-18F-fluorothymidine PET tumor volume measurements. J Nucl Med. 2010;51:1368–76. doi:10.2967/jnumed.110.078501.
Hatt M, Cheze le Rest C, Descourt P, Dekker A, De Ruysscher D, Oellers M, et al. Accurate automatic delineation of heterogeneous functional volumes in positron emission tomography for oncology applications. Int J Radiat Oncol Biol Phys. 2010;77:301–8. doi:10.1016/j.ijrobp.2009.08.018.
Arens AI, Troost EG, Hoeben BA, Grootjans W, Lee JA, Gregoire V, et al. Semiautomatic methods for segmentation of the proliferative tumour volume on sequential FLT PET/CT images in head and neck carcinomas and their relation to clinical outcome. Eur J Nucl Med Mol Imaging. 2014;41:915–24. doi:10.1007/s00259-013-2651-0.
van Velden FH, Cheebsumon P, Yaqub M, Smit EF, Hoekstra OS, Lammertsma AA, et al. Evaluation of a cumulative SUV-volume histogram method for parameterizing heterogeneous intratumoural FDG uptake in non-small cell lung cancer PET studies. Eur J Nucl Med Mol Imaging. 2011;38:1636–47. doi:10.1007/s00259-011-1845-6.
Hatt M, Tixier F, Cheze Le Rest C, Pradier O, Visvikis D. Robustness of intratumour 18F-FDG PET uptake heterogeneity quantification for therapy response prediction in oesophageal carcinoma. Eur J Nucl Med Mol Imaging. 2013;40:1662–71. doi:10.1007/s00259-013-2486-8.
Tixier F, Hatt M, Le Rest CC, Le Pogam A, Corcos L, Visvikis D. Reproducibility of tumor uptake heterogeneity characterization through textural feature analysis in 18F-FDG PET. J Nucl Med. 2012;53:693–700. doi:10.2967/jnumed.111.099127.
Vallieres M, Freeman CR, Skamene SR, El Naqa I. A radiomics model from joint FDG-PET and MRI texture features for the prediction of lung metastases in soft-tissue sarcomas of the extremities. Phys Med Biol. 2015;60:5471–96. doi:10.1088/0031-9155/60/14/5471.
Webb AG. Introduction to biomedical imaging. Wiley-IEEE Press; 2003.
van Velden FH, Kramer GM, Frings V, Nissen IA, Mulder ER, de Langen AJ, et al. Repeatability of radiomic features in non-small-cell lung cancer [18F]FDG-PET/CT studies: impact of reconstruction and delineation. Mol Imaging Biol. 2016. doi:10.1007/s11307-016-0940-2.
Nyflot MJ, Yang F, Byrd D, Bowen SR, Sandison GA, Kinahan PE. Quantitative radiomics: impact of stochastic effects on textural feature analysis implies the need for standards. J Med Imaging (Bellingham). 2015;2:041002. doi:10.1117/1.JMI.2.4.041002.
Sunderland JJ, Christian PE. Quantitative PET/CT scanner performance characterization based upon the society of nuclear medicine and molecular imaging clinical trials network oncology clinical simulator phantom. J Nucl Med. 2015;56:145–52. doi:10.2967/jnumed.114.148056.
Groheux D, Majdoub M, Tixier F, Le Rest CC, Martineau A, Merlet P, et al. Do clinical, histological or immunohistochemical primary tumour characteristics translate into different 18F-FDG PET/CT volumetric and heterogeneity features in stage II/III breast cancer? Eur J Nucl Med Mol Imaging. 2015;42:1682–91. doi:10.1007/s00259-015-3110-x.
Ha S, Choi H, Cheon GJ, Kang KW, Chung JK, Kim EE, et al. Autoclustering of non-small cell lung carcinoma subtypes on 18F-FDG PET using texture analysis: a preliminary result. Nucl Med Mol Imaging. 2014;48:278–86. doi:10.1007/s13139-014-0283-3.
Chung C. Tyrosine kinase inhibitors for epidermal growth factor receptor gene mutation-positive non-small cell lung cancers: an update for recent advances in therapeutics. J Oncol Pharm Pract. 2015. doi:10.1177/1078155215577810.
Caicedo C, Garcia-Velloso MJ, Lozano MD, Labiano T, Vigil Diaz C, Lopez-Picazo JM, et al. Role of [18F]FDG PET in prediction of KRAS and EGFR mutation status in patients with advanced non-small-cell lung cancer. Eur J Nucl Med Mol Imaging. 2014;41:2058–65. doi:10.1007/s00259-014-2833-4.
Ko KH, Hsu HH, Huang TW, Gao HW, Shen DH, Chang WC, et al. Value of 18F-FDG uptake on PET/CT and CEA level to predict epidermal growth factor receptor mutations in pulmonary adenocarcinoma. Eur J Nucl Med Mol Imaging. 2014;41:1889–97. doi:10.1007/s00259-014-2802-y.
Acknowledgments
We thank Dr. Alison Johnson for proofreading the manuscript.
Ethics declarations
Conflicts of interest
The authors declare that they have no conflicts of interest.
Ethical approval
All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.
Informed consent
Informed consent was waived for this type of study by the local ethics committee (Ref A12-D24-VOL13, Comité de protection des personnes Nord-Ouest III), since the PET scans were performed for clinical indications and the trial procedures were performed independently of usual clinical reporting.
Additional information
Mathieu Hatt and Nicolas Aide contributed equally to this work.
Electronic supplementary material
Below is the link to the electronic supplementary material.
Supplemental Fig. 1
Illustration of the in-house software used to define a 3-D box around the tumour, aiming at enclosing the tumour as well as excluding any nearby physiological or undesired uptake. For details see the Materials and Methods section. (GIF 466 kb)
Supplemental Fig. 2
Relationship between quantitative values extracted from PSF or PSF7 and OSEM images, assessed using Bland-Altman plots for SUVmax (a) and SUVmean (b) in tumour lesions. (GIF 34 kb)
Supplemental Fig. 3
Plots of TFs (CHAUC: area under the curve of the cumulative histogram; HILAE: high-intensity larger area emphasis; ZP: zone percentage) against tumour MATV for OSEM versus PSF reconstructions. (GIF 84 kb)
Supplemental Fig. 4
Plots of TFs (CHAUC: area under the curve of the cumulative histogram; HILAE: high-intensity larger area emphasis; ZP: zone percentage) against tumour MATV for OSEM versus PSF7 reconstructions. (GIF 72 kb)
Supplemental Fig. 5
Impact of the EARL harmonisation strategy on textural features using the FLAB algorithm independently on the three sets of reconstructions to delineate lesions. Textural features for the three reconstructions used. CHAUC: area under the curve of the cumulative histogram; HILAE: high-intensity larger area emphasis; ZP: zone percentage. Data are shown as Tukey boxplots (lines displaying median, 25th and 75th percentiles; cross represents the mean value). *, **, and *** indicate two-tailed P < .05, P < .01, and P < .001, respectively. ns: not significant. Data represent the eight outliers (above the 90th percentile) observed with OSEM-PSF MATV and/or PSF7-PSF MATV when using a 50 % SUVmax threshold. (GIF 39 kb)
Supplemental Fig. 6
PET transverse slice of a PSF-reconstructed image showing one of the largest differences in MATV (−95 %) between FLAB and 50 % SUVmax thresholding. (GIF 22 kb)
Cite this article
Lasnon, C., Majdoub, M., Lavigne, B. et al. 18F-FDG PET/CT heterogeneity quantification through textural features in the era of harmonisation programs: a focus on lung cancer. Eur J Nucl Med Mol Imaging 43, 2324–2335 (2016). https://doi.org/10.1007/s00259-016-3441-2
DOI: https://doi.org/10.1007/s00259-016-3441-2