1 Introduction

Recent years have witnessed exponential growth in the use of imaging for diagnostic and therapeutic radiological purposes. In particular, positron emission tomography (PET) has been widely used in oncology for the purposes of diagnosis, grading, staging, and assessment of response. For instance, PET imaging with 18F-FDG (fluoro-2-deoxy-D-glucose), a glucose metabolism analog, has been applied for diagnosis, staging, and treatment planning of lung cancer [1–10], head and neck cancer [11, 12], prostate cancer [13], cervical cancer [14, 15], colorectal cancer [16], lymphoma [17, 18], melanoma [19], and breast cancer [20–22]. Moreover, accumulating evidence supports that pretreatment or posttreatment FDG-PET uptake could be used as a prognostic factor for predicting outcomes [23–27]. For a review, see Chaps. 13 and 19.

Besides FDG-PET, other PET tracers have also been shown to be useful in interrogating tumor properties such as hypoxia by FMISO or Cu-ATSM and DNA synthesis and cell proliferation by FLT [28]. Interestingly, Denecke et al. compared CT, MRI, and FDG-PET in the prediction of outcomes to neoadjuvant radiochemotherapy in patients with locally advanced primary rectal cancer, demonstrating sensitivities of 100 % for FDG-PET, 54 % for CT, and 71 % for MRI and specificities of 60 % for FDG-PET, 80 % for CT, and 67 % for MRI [29]. Benz et al. showed that combined assessment of metabolic and volumetric changes predicts tumor response in patients with soft tissue sarcoma [30]. Similarly, Yang et al. showed that the combined evaluation of contrast-enhanced CT and FDG-PET/CT predicts the clinical outcomes in patients with aggressive non-Hodgkin’s lymphoma [31]. Indeed, quantitative information from hybrid imaging modalities could be related to biological and clinical endpoints, a new emerging field referred to as “radiomics” [32, 33]. We were among the leading groups to demonstrate the potential of this new field to monitor and predict response to radiotherapy in head and neck [34], cervix [34, 35], and lung [36] cancers, in turn allowing treatment regimens to be adapted and individualized.

In this chapter, we discuss the application of advanced image processing techniques in PET imaging with specific focus on two major areas of better tumor target definition and image-based prediction of treatment outcomes.

2 Image Features from PET

A necessary prerequisite of image processing application in PET is the robust extraction of relevant imaging features, which could be used in varying applications. The features extracted from PET images could be divided into static (time-invariant) and dynamic (time-variant) features according to the acquisition protocol at the time of scanning and into pre- or intra-treatment features according to the scanning time point [37].

2.1 Static PET Features

(a) Standard uptake value (SUV) descriptors: SUV is a standard method in PET image quantitative analysis [38]. In this case, raw intensity values are converted into SUVs, and statistical descriptors such as maximum, minimum, mean, standard deviation (SD), and coefficient of variation (CV) are extracted.

(b) Total lesion glycolysis (TLG): This is defined as the product of volume and mean SUV [5, 30, 39].

(c) Intensity-volume histogram (IVH): This is analogous to the dose-volume histogram widely used in radiotherapy treatment planning in that it reduces complicated 3D data into a single, easier-to-interpret curve. Each point on the IVH defines the absolute or relative volume of the structure that exceeds a variable intensity threshold as a percentage of the maximum intensity [34]. This method would allow for extracting several metrics from PET images for outcome analysis such as \(I_x\) (minimum intensity to x% highest intensity volume), \(V_x\) (percentage volume having at least x% intensity value), and descriptive statistics (mean, minimum, maximum, standard deviation, etc.). We have reported the use of the IVH for predicting local control in lung cancer [36], where a model combining PET and CT image-based metrics provided superior predictive power compared to commonly used dosimetric-based models of local treatment response.

(d) Morphological features: These are generally geometrical shape attributes such as eccentricity (a measure of non-circularity), which is useful for describing tumor growth directionality; Euler number (the number of connected objects in a region minus the total number of holes in the objects); and solidity (a measure of convexity), which may be a characteristic of benign lesions [40, 41]. An interesting demonstration of this principle is that a shape-based metric based on the deviation from an idealized ellipsoid structure (i.e., eccentricity) was found to have strong association with survival in patients with sarcoma [41, 42].

(e) Texture features: Texture in imaging refers to the relative distribution of intensity values within a given neighborhood. It integrates intensity with spatial information, resulting in higher-order histograms when compared to common first-order intensity histograms. It should be emphasized that texture metrics are independent of tumor position, orientation, size, and brightness and take into account the local intensity spatial distribution [43, 44]. This is a crucial advantage over direct (first-order) histogram metrics (e.g., mean and standard deviation), which only measure intensity variability independent of the spatial distribution in the tumor microenvironment. Texture methods are broadly divided into three categories: statistical methods (e.g., high-order statistics, co-occurrence matrices, moment invariants), model-based methods (e.g., Markov random fields, Gabor filter, wavelet transform), and structural methods (e.g., topological descriptors, fractals) [45, 46]. Among these methods, statistical approaches based on the co-occurrence matrix and its variants such as the gray-level co-occurrence matrix (GLCM), neighborhood gray-tone difference matrix (NGTDM), run-length matrix (RLM), and gray-level size zone matrix (GLSZM) have been widely applied for characterizing FDG-PET heterogeneity [47].

Four commonly used features from the GLCM include energy, entropy, contrast, and homogeneity [44]. The NGTDM is thought to provide more humanlike perception of texture such as coarseness, contrast, busyness, and complexity. RLM and GLSZM emphasize regional effects. Textural map examples from multiple PET tracers are shown in Fig. 12.1.
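As an illustration of the four GLCM features named above, the following is a minimal sketch in Python with NumPy, not the implementation used in the cited studies. The function name `glcm_features`, the number of gray levels, the single pixel offset, and the symmetric normalization are all assumptions made for this example; homogeneity is computed with the \(1/(1+|i-j|)\) weighting, which is only one of several conventions in use.

```python
import numpy as np

def glcm_features(image, levels=8, offset=(0, 1)):
    """Energy, entropy, contrast, and homogeneity from a symmetric, normalized GLCM.

    `offset` is a (row, column) displacement with non-negative entries.
    """
    img = np.asarray(image, dtype=float)
    # Quantize intensities into `levels` gray levels (0 .. levels-1).
    lo, hi = img.min(), img.max()
    if hi > lo:
        q = np.floor((img - lo) / (hi - lo) * levels).astype(int)
        q = np.clip(q, 0, levels - 1)
    else:
        q = np.zeros(img.shape, dtype=int)
    dr, dc = offset
    rows, cols = q.shape
    # Pair each pixel with its neighbor at the given offset.
    ref = q[:rows - dr, :cols - dc]
    nbr = q[dr:, dc:]
    glcm = np.zeros((levels, levels))
    np.add.at(glcm, (ref.ravel(), nbr.ravel()), 1.0)
    glcm = glcm + glcm.T            # count each pair in both directions
    p = glcm / glcm.sum()           # joint probability of gray-level pairs
    i, j = np.indices(p.shape)
    nz = p > 0
    return {
        "energy": float(np.sum(p ** 2)),
        "entropy": float(-np.sum(p[nz] * np.log2(p[nz]))),
        "contrast": float(np.sum(p * (i - j) ** 2)),
        "homogeneity": float(np.sum(p / (1.0 + np.abs(i - j)))),
    }
```

A perfectly uniform region gives the maximum energy and homogeneity and zero entropy and contrast, whereas a region whose neighboring pixels always differ gives the opposite trend. In practice the features are usually averaged over several offsets and directions.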

Fig. 12.1 The two rows, referring to an individual patient with primary cervical cancer, show texture maps for FDG (metabolic marker) and Cu-ATSM (hypoxia marker) alone and overlapping texture maps of the two markers [37]

These features were shown to predict response in cancers of the cervix [34], esophagus [48], head and neck [49], and lung [50]. MaZda is a dedicated software package for image feature analysis [51]. An example of different PET features extracted from a head and neck cancer case is shown in Fig. 12.2.

Fig. 12.2 A pretreatment PET scan of a head and neck cancer case of a patient who died from disease after radiotherapy treatment. (a) The head and neck tumor region of interest (brown) and the gross tumor volume (GTV) (green) were outlined by the physician. (b) An IVH plot, where \(I_x\) and \(V_x\) parameters are derived. (c) A texture map plot of the GTV heterogeneity through intensity spatial mapping
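The first-order descriptors of Sect. 2.1 (SUV statistics, TLG, and the IVH-derived \(I_x\) and \(V_x\)) can be sketched in a few lines. The code below is a minimal illustration that assumes the lesion voxels have already been converted to SUV and collected into a one-dimensional array; the function names and the `voxel_volume_ml` parameter are hypothetical and chosen for this example only.

```python
import numpy as np

def suv_descriptors(suv, voxel_volume_ml):
    """First-order SUV statistics and TLG for the voxels of one lesion."""
    suv = np.asarray(suv, dtype=float)
    mean, sd = suv.mean(), suv.std(ddof=1)
    volume_ml = suv.size * voxel_volume_ml
    return {
        "max": suv.max(), "min": suv.min(), "mean": mean, "sd": sd,
        "cv": sd / mean,              # coefficient of variation
        "volume_ml": volume_ml,
        "tlg": volume_ml * mean,      # total lesion glycolysis = volume x mean SUV
    }

def v_x(suv, x):
    """V_x: percentage of the lesion volume with intensity >= x% of the maximum."""
    suv = np.asarray(suv, dtype=float)
    return 100.0 * np.mean(suv >= (x / 100.0) * suv.max())

def i_x(suv, x):
    """I_x: minimum intensity within the x% highest-intensity volume."""
    ordered = np.sort(np.asarray(suv, dtype=float))[::-1]
    n = max(1, int(np.ceil(x / 100.0 * ordered.size)))
    return ordered[n - 1]
```

Evaluating `v_x` over a range of thresholds traces out the IVH curve itself, in the same way a dose-volume histogram is built from a dose grid.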

2.2 Dynamic PET Features

These features are based on kinetic analysis using tissue compartment models and parameters related to transport and binding rates [52]. In the case of FDG, a three-compartment model could be used to depict the trapping of FDG-6-phosphate (FDG6P) in tumor [53, 54]. Using estimates from compartmental modeling, the glucose metabolic uptake rate could be evaluated. The uptake rate and other compartment estimates themselves could form “parameter map” images, from which the previously described static features could be derived as well (see Chap. 14).
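To make the compartmental description concrete, the following sketch simulates the irreversible FDG model (plasma, free tracer in tissue, and trapped FDG-6-phosphate, with \(k_4 = 0\)) and recovers the net influx rate constant \(K_i = K_1 k_3/(k_2 + k_3)\) from the late part of a Patlak plot. It is an illustration under simplifying assumptions: a forward-Euler integrator, a noise-free input function, and no blood volume term. All function names are hypothetical.

```python
import numpy as np

def simulate_irreversible_fdg(cp, dt, K1, k2, k3):
    """Forward-Euler simulation of the irreversible FDG model (k4 = 0).

    cp : plasma input function sampled every `dt` minutes.
    Returns the tissue activity C_free + C_bound (blood fraction ignored).
    """
    c_free, c_bound = 0.0, 0.0
    ct = np.zeros(len(cp))
    for n in range(len(cp)):
        ct[n] = c_free + c_bound
        d_free = K1 * cp[n] - (k2 + k3) * c_free   # plasma exchange and phosphorylation
        d_bound = k3 * c_free                       # trapping as FDG-6-phosphate
        c_free += dt * d_free
        c_bound += dt * d_bound
    return ct

def net_influx_rate(K1, k2, k3):
    """Net influx rate constant Ki = K1 * k3 / (k2 + k3)."""
    return K1 * k3 / (k2 + k3)

def patlak_slope(t, cp, ct, t_start):
    """Patlak graphical estimate of Ki from samples with t >= t_start."""
    dt = t[1] - t[0]
    int_cp = np.cumsum(cp) * dt          # running integral of the input function
    late = t >= t_start
    x = int_cp[late] / cp[late]
    y = ct[late] / cp[late]
    slope, _intercept = np.polyfit(x, y, 1)
    return slope
```

Applying such an estimate voxel by voxel is what yields a parametric map; the glucose metabolic rate then follows from \(K_i\) scaled by the plasma glucose concentration and divided by a lumped constant.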

Glucose metabolic rate was correlated with pathologic tumor control probability in lung cancer [55]. Thorwarth et al. published interesting data on the scatter of voxel-based measures of local perfusion and hypoxia in head and neck cancer [56, 57]. Tumors showing a wide spread in both showed less reoxygenation during a course of radiotherapy and had lower local control. A rather interesting approach to improve the robustness of such features is the use of advanced 4D iterative techniques; an example is given in Fig. 12.3. Further improvement could be achieved by utilizing multi-resolution transformations (e.g., wavelet transform) to stabilize kinetic parameter estimates spatially [58].

Fig. 12.3 An abdominal dynamic FDG-PET/CT kinetic analysis. The figure shows the parameter maps of the three-compartment model (\(K_1\), \(k_2\), \(k_3\)) assuming irreversible kinetics (\(k_4 = 0\)), the blood volume component (bv = \(K_1/k_2\)), and the FDG net influx rate constant \(K_{\mathrm{FDG}}\) (\(K_i\)). In this case, the parameters were obtained using a 4D iterative technique (compared with simple differential methods) by estimation directly from the sinogram domain

3 Extension to PET/CT and PET/MR

Combining information from multiple modalities allows for better utilization of complementary features from different images. For instance, several studies have indicated that inter- and intra-observer variability of defining the tumor extent could be reduced by using PET/CT or PET/MR. Researchers in lung cancer reported reduced variability when using CT with PET for target definition [3, 10]. Furthermore, a study of fractionated stereotactic radiotherapy for meningioma patients demonstrated improved target definition by combining physiological information from PET, anatomical structure from CT, and soft tissue contrasts from MRI, resulting in alterations of the original contour definitions in 73 % of the cases [59]. However, this visual approach for integrating multimodality imaging information is prone to observer subjectivity and variations as contrasted to single image analysis as discussed later.

The static PET imaging features presented in Sect. 12.2.1 could be applied equally to PET/CT or PET/MR: instead of SUV, Hounsfield units are used in the case of CT, and the pixel intensities of, for instance, T1-weighted (T1w), T2-weighted (T2w), or proton density images are used in the case of MRI. In the case of dynamic MRI acquisitions, the corresponding pharmacokinetic models would be applied to extract parametric maps, such as the extended Tofts model [60], which is also a three-compartment model; the extracted parameters include the transfer constant (\(K^{\mathrm{trans}}\)), the extravascular-extracellular volume fraction (\(v_e\)), and the blood volume (bv).
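For reference, the extended Tofts model mentioned above is commonly written in the following form. The notation is the conventional one rather than that of this chapter: \(C_t\) is the tissue contrast agent concentration, \(C_p\) the plasma concentration (arterial input function), and \(v_p\) the plasma volume fraction, which plays the role of the blood volume term (bv) in the text.

$$ C_t(t)={v}_p\,{C}_p(t)+{K}^{\mathrm{trans}}{\int}_0^t{C}_p\left(\tau \right)\,{e}^{-{k}_{ep}\left(t-\tau \right)}\,d\tau, \qquad {k}_{ep}=\frac{K^{\mathrm{trans}}}{v_e} $$

Fitting this expression to the measured time course in each voxel gives the parametric maps from which static-type features can again be extracted.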

However, among the most challenging issues in multimodality imaging is the fusion of multiple imaging data from different scanners. This is typically carried out through a geometric transformation, i.e., image registration. The problem could be largely solved using integrated hardware systems such as PET/CT scanners or PET/MRI scanners; otherwise, software solutions need to be deployed. These software solutions could be divided into rigid or deformable registration techniques [61]. In our previous work, we have developed several tools for this purpose such as MIASYS [62] and DIRART [63]. These tools are available at https://sites.google.com/a/umich.edu/ielnaqa/home/software-tools.

MIASYS is a dedicated open-source software tool developed in MATLAB for multimodality image analysis. The software tool aims to provide a comprehensive solution for 3D image segmentation by integrating automatic algorithms, manual contouring methods, image preprocessing filters, post-processing procedures, user-interactive features, and evaluation metrics. The implemented methods include multiple image thresholding, clustering based on K-means and fuzzy C-means (FCM), and active contours based on snakes and level sets. Image registration is achieved via manual and automated rigid methods [62].

DIRART is also an open-source software implemented in MATLAB to support deformable image registration (DIR) for adaptive radiotherapy applications. Image registration in this regard computes voxel mapping between two image sets and is formulated as an optimization problem in which the solution is found by maximizing a similarity metric between the two images (e.g., mutual information). DIRART contains 20+ deformable image registration (DIR) algorithms including 12 different optical flow algorithms, different demon algorithms, and four level set algorithms. It also supports interface to ITK so that ITK DIR algorithms can be called from within DIRART. Currently five ITK DIR algorithms are supported, including demon algorithms and B-spline algorithms. In addition, the newer inverse consistency algorithms to provide consistent displacement vector field (DVF) in both forward and backward directions are implemented [63].
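The formulation of registration as the maximization of a similarity metric can be illustrated with a toy example. The sketch below estimates mutual information from a joint intensity histogram and searches exhaustively over integer translations. It is a deliberately simplified rigid, translation-only case and does not represent the deformable algorithms in DIRART or ITK; the function names and the bin count are assumptions for this example.

```python
import numpy as np

def mutual_information(a, b, bins=16):
    """Mutual information (in nats) estimated from the joint intensity histogram."""
    joint, _, _ = np.histogram2d(np.ravel(a), np.ravel(b), bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of the first image
    py = pxy.sum(axis=0, keepdims=True)   # marginal of the second image
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

def register_translation(fixed, moving, max_shift=5):
    """Exhaustive search for the integer (row, col) shift that maximizes MI.

    Shifts are applied with wrap-around (np.roll), which is adequate for a toy case.
    """
    best_shift, best_mi = (0, 0), -np.inf
    for dr in range(-max_shift, max_shift + 1):
        for dc in range(-max_shift, max_shift + 1):
            candidate = np.roll(moving, (dr, dc), axis=(0, 1))
            mi = mutual_information(fixed, candidate)
            if mi > best_mi:
                best_shift, best_mi = (dr, dc), mi
    return best_shift, best_mi
```

Because mutual information depends only on the statistical relationship between intensities, the same search still finds the correct alignment when the moving image has inverted contrast, which is the property that makes the metric attractive for multimodality data. Deformable methods replace the two translation parameters with a dense displacement vector field and the exhaustive search with a regularized optimizer.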

4 Application of PET in Radiotherapy

In the following, we discuss the application of PET to radiotherapy with focus on two cases, contouring of tumor/organs at risk in treatment planning and outcome prediction for clinical decision-making using radiomics.

4.1 Biological Target Definition Using PET

Medical image segmentation is a process to separate structures of interest in an image from its background or other neighboring structures. It is a necessary prerequisite step for many medical imaging applications in radiology and radiation therapy. These applications may include automatic organ delineation, quantitative tissue classification, surface extraction, visualization and image registration, etc. [64, 65]. For instance, Pham and coworkers divided segmentation algorithms into eight different categories: thresholding, region growing, classifiers, clustering, Markov random field models, artificial neural networks, deformable models, and atlas-guided approaches. In our work on PET-guided treatment planning in radiotherapy, we presented a comparative survey of the current methods applied for tumor segmentation [66, 67]; an example in head and neck cancer using different segmentation algorithms is presented in Fig. 12.4.

Fig. 12.4 Example of different PET segmentation methods of head and neck cancer. The methods include 40 % of SUVmax (SUVmax40) and the methods of Nestle, Black, Biehl, and Schaefer. This is in addition to the level set technique (active contour), the stochastic EM approach (EM), the FCM algorithm (FCM), and the FCM-SW variant of the FCM algorithm (FCM-SW). The 3D contour defined on the macroscopic tumor specimen is used as the reference for assessing the performance of the different segmentation techniques (From Zaidi et al. [66])
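Two of the simpler methods compared in Fig. 12.4 can be sketched briefly: fixed thresholding at a fraction of SUVmax and plain fuzzy C-means (FCM) clustering of voxel intensities. This is a minimal illustration, not the implementation evaluated in [66]; the FCM-SW variant is not attempted, and the function names, the fuzziness exponent `m`, and the deterministic initialization are assumptions for this example.

```python
import numpy as np

def threshold_suvmax(suv, fraction=0.40):
    """Binary mask of voxels at or above a fixed fraction of SUVmax."""
    suv = np.asarray(suv, dtype=float)
    return suv >= fraction * suv.max()

def fuzzy_c_means(values, n_clusters=2, m=2.0, n_iter=100, tol=1e-6):
    """Plain fuzzy C-means on voxel intensities (no spatial term).

    Returns (centers, memberships); memberships has shape (n_clusters, n_voxels).
    """
    x = np.asarray(values, dtype=float).ravel()
    # Deterministic initialization: centers spread evenly over the intensity range.
    centers = np.linspace(x.min(), x.max(), n_clusters)
    for _ in range(n_iter):
        dist = np.abs(x[None, :] - centers[:, None]) + 1e-12
        inv = dist ** (-2.0 / (m - 1.0))
        u = inv / inv.sum(axis=0, keepdims=True)     # membership update
        um = u ** m
        new_centers = (um @ x) / um.sum(axis=1)      # center update
        converged = np.max(np.abs(new_centers - centers)) < tol
        centers = new_centers
        if converged:
            break
    return centers, u
```

A hard segmentation is obtained by assigning each voxel to the cluster with the highest membership, for example `u.argmax(axis=0)`. Unlike the fixed threshold, the cluster centers adapt to the intensity distribution of each scan, but neither method uses spatial information.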

There are several commercial and academic software tools that support different segmentation algorithms. In general, commercial software packages have better implementations with a user-friendly interface for manual and semiautomatic segmentation methods, but often lag behind the latest developments in the field. In contrast, academic software packages, such as ITK [68], BioImage Suite [69], MIPAV [70], ImageJ [71], and 3D Slicer [72], may tend to be oriented toward single-modality applications and be less friendly in handling multimodality images as proposed here.

Most automatic algorithms attempt to utilize image intensity variations or image gradient information. However, for low-contrast images, many of these algorithms tend to provide suboptimal solutions that are not clinically acceptable. For such cases, it has been demonstrated that if multiple images are available for the same object (the same image modality or different image modalities), all the available complementary information can be fed into the segmentation algorithms to define the so-called biophysical target [73]. Thus, the segmentation algorithms would benefit from the complementary information from different images, and consequently the accuracy of the final segmentation results could be improved. Similar approaches have been applied for detecting blood-wall interface of heart ventricles from CT, MRI, and ultrasound images using a snake deformable model [74], for classifying coronary artery plaque composition from multiple contrast MR images using K-means clustering algorithm [75], and for defining tumor target volumes using PET/CT/MR images for radiotherapy treatment planning by using a multivalued deformable level set approach as in our previous work. Mathematically, such an approach is a framework that could be thought of as a mapping from the imaging space to the “perception” space as identified by the radiologists [73]:

$$ \mathrm{Biophysical}\ \mathrm{target}=f\left(\mathrm{CT},\;\mathrm{PET},\;\mathrm{MRI},\dots ;\lambda \right) $$
(12.1)

where \( f\left(\cdot \right) \) is the mapping function from the different imaging modalities to the target space parameterized by λ, which represents users’ defined set of parameters representing the prior knowledge. This framework is highlighted in Fig. 12.5.

Fig. 12.5 Biophysical target as generated from multimodality imaging by combining anatomical and functional information

Robust image segmentation methods are based on deformable models, which are geometric representations of curves or surfaces that are defined explicitly or implicitly in the imaging domain. These models move under the influence of internal forces (contour curvature) and external forces (image boundary constraints) [76, 77]. The level set is a state-of-the-art variational method for shape recovery [76, 78–80]. Level sets were originally developed in curve evolution theory to overcome the limitations encountered in parametric deformable models (e.g., snakes [81]) such as initialization requirement, generalization to 3D, and topological adaptation such as splitting or merging of model parts. Our generalization to multimodality imaging is based on redefining the concept of a boundary as a logical or “best” combination of multiple images by learning and incorporating expert’s knowledge on subregional or even voxel levels. An example showing combination of PET/CT in lung cancer is shown in Fig. 12.6 using a multivalued level set algorithm [73].

Fig. 12.6 Joint estimation of lung PET/CT target/disease volume. (a) A fused PET/CT displayed in CERR with manual contouring shown of the subject’s right gross tumor volume. The contouring was performed separately for CT (in orange), PET (in green), and fused PET/CT (in red) images. (b) The MVLS algorithm was initialized with a circle (in white) of 9.8 mm diameter, an evolved contour in steps of ten iterations (in black), and the final estimated contour (in thick red). The algorithm converged in 120 iterations in a few seconds. The PET/CT ratio weighting was selected as 1:1.65. (c) MVLS results are shown along with manual contour results on the fused PET/CT. Note the agreement of the fused PET/CT manual contour and MVLS (Dice = 0.87). (d) MVLS contour superimposed on CT (top) and PET (bottom) separately
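The idea of evolving a single contour under weighted forces from several co-registered images can be sketched with a simplified region-based update in the style of the Chan-Vese model. This is not the MVLS algorithm of [73]: the sketch omits the narrow-band delta function, uses a discrete Laplacian of the level set function as a crude substitute for the curvature term, and normalizes the force at each step. The function names, the step size `dt`, and the smoothing weight `mu` are assumptions for illustration. The Dice coefficient used to compare contours is included for completeness.

```python
import numpy as np

def dice(a, b):
    """Dice similarity coefficient between two binary masks."""
    a, b = np.asarray(a, dtype=bool), np.asarray(b, dtype=bool)
    return 2.0 * np.sum(a & b) / (np.sum(a) + np.sum(b))

def multichannel_level_set(channels, weights, init_mask, n_iter=200, dt=0.5, mu=0.2):
    """Simplified region-based level set driven by several co-registered 2D images."""
    phi = np.where(init_mask, 1.0, -1.0)       # positive inside the contour
    for _ in range(n_iter):
        inside = phi > 0
        if not inside.any() or inside.all():
            break
        force = np.zeros_like(phi)
        for img, w in zip(channels, weights):
            c_in, c_out = img[inside].mean(), img[~inside].mean()
            # Positive where a pixel resembles the inside mean more than the outside mean.
            force += w * ((img - c_out) ** 2 - (img - c_in) ** 2)
        force /= np.abs(force).max() + 1e-12
        # Discrete Laplacian of phi as a crude smoothness (curvature-like) term.
        lap = (np.roll(phi, 1, 0) + np.roll(phi, -1, 0)
               + np.roll(phi, 1, 1) + np.roll(phi, -1, 1) - 4.0 * phi)
        phi = np.clip(phi + dt * (force + mu * lap), -1.0, 1.0)
    return phi > 0
```

The `weights` argument plays the role of the modality weighting quoted in the caption above (for example 1:1.65 for PET/CT): increasing the weight of one image makes the final contour follow the boundaries of that image more closely.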

In another example, the PET/CT images were taken from patients with cervix cancer. The PET image was sharpened using a deconvolution approach [82]. The 40 % maximum SUV (standard uptake value) thresholding is adopted in many institutes to estimate gross tumor volume for cervix cancer patients due to the high target-to-background ratio of these tumors in PET and the difficulty of distinguishing their boundary in CT. In Fig. 12.7, the active contour algorithm is initialized with a circle (in white) of 15.9 mm diameter around the PET lesion. The evolved contour in steps of ten iterations (in blue) and the final estimated contour (in thick black) are shown. The algorithm converged in just 30 iterations. This fast convergence could be attributed in part to the almost spherical shape of the tumor and the sharpness of the gradient. The results of the algorithm match the PET ground truth (99 %) as delineated by an experienced nuclear medicine radiologist. Hence, the delineation results were explained mainly by PET in this case, although information from CT could still be used to steer the algorithm, if desired.

Fig. 12.7 A 3D generalization of multivalued level set (MVLS) algorithm in the case of PET/CT cervix. The MVLS algorithm is initialized with a sphere (in white) of 15.9 mm diameter, a curve evolution in steps of ten iterations (in magenta), and the final estimated contour (in thick blue). The algorithm converged in 30 iterations. MVLS estimated contour superimposed on CT. MVLS estimated contour superimposed on PET

4.2 PET Radiomics

The extraction of quantitative information from imaging modalities and relating this information to biological and clinical endpoints is a new emerging field referred to as “radiomics” [32, 33]. Traditionally, quantitative analysis of FDG-PET or other PET tracer uptake is conducted based on observed changes in the standard uptake value (SUV). For instance, a decrease in SUV postirradiation has been associated with better outcomes in lung cancer [83, 84]. However, SUV measurements themselves are potentially prone to errors due to the initial FDG uptake kinetics and radiotracer distribution, which depend on the initial dose and the time elapsed between injection and image acquisition. In addition, some commonly reported SUV measurements might be sensitive to changes in tumor volume definition (e.g., mean SUV). These factors and others might make such an approach subject to significant intra- and inter-observer variability [25, 26, 34].

Radiomics consists of two main steps: extraction of static and dynamic features, as discussed in Sect. 12.2, and outcome modeling, as presented in the following. Outcomes in oncology and particularly in radiation oncology are characterized by tumor control probability (TCP) and the surrounding normal tissue complication probability (NTCP) [85, 86]. A detailed review of outcome modeling in radiotherapy is presented in our previous work [87]. DREES is a dedicated software tool for modeling of radiotherapy response [88]. In the context of image-based treatment outcome modeling, the observed outcome (e.g., TCP or NTCP) is considered to be adequately captured by extracted image features [34, 89]. We will highlight this approach using classical logistic regression.

Logistic modeling is a common tool for multi-metric modeling. In our previous work [90, 91], a logit transformation was used:

$$ \begin{array}{cc}f\left({\mathbf{x}}_i\right)=\frac{e^{g\left({\mathbf{x}}_i\right)}}{1+{e}^{g\left({\mathbf{x}}_i\right)}},& i=1,\dots, n,\end{array} $$
(12.2)

where n is the number of cases (patients) and \( {\mathbf{x}}_i \) is a vector of the input variable values (i.e., image features) used to predict \( f\left({\mathbf{x}}_i\right) \) for outcome \( y_i \) (i.e., TCP or NTCP) of the ith patient, with

$$ g\left({\mathbf{x}}_i\right)={\beta}_0+\sum_{j=1}^d{\beta}_j{x}_{ij},\quad i=1,\dots, n,\quad j=1,\dots, d, $$
(12.3)

where d is the number of model variables and the βs are the set of model coefficients determined by maximizing the probability that the data gave rise to the observations. Resampling methods such as cross validation and bootstrapping methods could be used to determine optimal model order and parameter selection as shown in Fig. 12.8 for PET/CT modeling of lung cancer [36]. Interestingly, a model of two parameters from PET and CT based on intensity-volume histograms provided the best fit to local control.
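Equations (12.2) and (12.3), together with leave-one-out cross validation, can be sketched as follows. This is a minimal illustration rather than the DREES implementation: the coefficients are fitted by Newton-Raphson maximization of the log-likelihood, a very small ridge term is added purely for numerical stability, and the function names are hypothetical.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25, ridge=1e-6):
    """Maximum-likelihood logistic regression via Newton-Raphson.

    X : (n, d) feature matrix; an intercept column is added internally.
    Returns beta of length d + 1, with beta[0] the intercept of Eq. (12.3).
    """
    Xb = np.column_stack([np.ones(len(X)), X])
    beta = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xb @ beta))              # Eq. (12.2)
        grad = Xb.T @ (y - p) - ridge * beta
        hess = Xb.T @ (Xb * (p * (1.0 - p))[:, None]) + ridge * np.eye(len(beta))
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def predict_proba(X, beta):
    """Predicted outcome probability f(x_i) for each row of X."""
    Xb = np.column_stack([np.ones(len(X)), X])
    return 1.0 / (1.0 + np.exp(-Xb @ beta))

def loo_log_likelihood(X, y):
    """Leave-one-out cross-validated log-likelihood for comparing candidate models."""
    total = 0.0
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        beta = fit_logistic(X[keep], y[keep])
        p = np.clip(predict_proba(X[i:i + 1], beta)[0], 1e-12, 1 - 1e-12)
        total += y[i] * np.log(p) + (1 - y[i]) * np.log(1 - p)
    return total
```

Model-order selection then amounts to computing the cross-validated score for candidate feature subsets of increasing size and keeping the order at which it stops improving; bootstrap resampling can be used in the same way to count how often each candidate model is selected.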

Fig. 12.8 Multi-metric modeling of local failure from PET/CT features. (a) Model-order selection using leave-one-out cross validation. (b) Most frequent model selection using bootstrap analysis where the y-axis represents the model selection frequency on resampled bootstrapped samples. (c) Plot of local failure probability as a function of patients binned into equal-size groups showing the model prediction of treatment failure risk and the original data (Reproduced with permission from Vaidya et al. [36])

5 Current Issues and Future Directions

5.1 PET Image Characteristics

Generally speaking, PET images have lower resolution than CT or MRI, on the order of 3–5 mm, which is further worsened under cardiac or respiratory motion conditions due to longer acquisition periods. Moreover, PET images are susceptible to limited photon count noise. Advances in hardware such as crystal detector technologies [92] and software such as image reconstruction techniques [93] are poised to improve PET image quality and their subsequent use. See Chaps. 8 and 11.

5.2 Robustness and Stability of Extracted Image Features

It is well recognized that image acquisition protocols may impact the reproducibility of extracted features from PET images, which may consequently impact the robustness and stability of these features for image analysis. This includes static features such as SUV descriptors [94–96] and texture features [97, 98]. Interestingly, texture-based features were shown to have a reproducibility similar to or better than that of simple SUV descriptors [99]. Moreover, textural features from the GLCM seemed to exhibit lower variations than NGTDM features [97]. Other factors that may impact the stability of these features include signal-to-noise ratio (SNR), partial volume effect, motion artifacts, parameter settings, resampling size, and image quantization [34, 98]. Noise arising from limited photon counts in PET imaging can be mitigated using traditional denoising filters [66, 100] or more advanced methods that combine wavelet and curvelet transform characteristics [101].

5.3 Improved PET-Based Outcome Models

In addition to the need for appropriate candidate image features in PET-based outcome modeling (“radiomics”), a main weakness of the classical logistic regression formalism is that the model’s capacity to follow details of the data trends is limited. Moreover, Eq. (12.3) requires the user’s feedback to determine whether interaction terms or higher-order terms should be added, making it a trial-and-error process. A solution to ameliorate this problem is offered by applying machine learning methods [102].

A class of machine learning methods that is particularly powerful, and which we propose to use for image-based outcome prediction, includes so-called kernel-based methods and their most prominent subtype, support vector machines (SVMs). These methods have been applied successfully in many diverse areas including outcome prediction [103–107]. Learning is defined in this context as estimating dependencies from data [108]. In the example of outcome prediction (i.e., discrimination between patients who are at low risk versus patients who are at high risk of local failure), the main function of the kernel-based technique would be to separate these two classes with “hyperplanes” that maximize the margin (separation) between the classes in the nonlinear feature space. The objective here is to minimize the bounds on the generalization error of a model on unseen data rather than minimizing the mean-square error over the training dataset itself (data fitting). Note that the kernel in these cases acts as a similarity function between sample points in the feature space. Moreover, kernels enjoy closure properties, i.e., one can create admissible composite kernels by weighted addition and multiplication of elementary kernels. This flexibility allows for the construction of a neural network by using a combination of sigmoidal kernels. Alternatively, one could choose a logistic regression equivalent kernel by proper choice of the objective function itself.
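The closure properties of kernels can be checked numerically. The sketch below builds a composite kernel from a Gaussian (RBF) kernel and a linear kernel by non-negative weighted addition and elementwise multiplication, and verifies that the resulting Gram matrix is symmetric and positive semi-definite, which is the admissibility condition. It does not train an SVM; the function names, weights, and `gamma` value are assumptions for illustration.

```python
import numpy as np

def rbf_kernel(A, B, gamma=0.5):
    """Gaussian (RBF) kernel: exp(-gamma * ||a - b||^2)."""
    sq = (np.sum(A ** 2, axis=1)[:, None]
          + np.sum(B ** 2, axis=1)[None, :] - 2.0 * A @ B.T)
    return np.exp(-gamma * np.maximum(sq, 0.0))

def linear_kernel(A, B):
    """Linear kernel: plain inner product between feature vectors."""
    return A @ B.T

def composite_kernel(A, B, w_rbf=0.7, w_mix=0.3):
    """Non-negative weighted sum and product of the two elementary kernels."""
    k_rbf = rbf_kernel(A, B)
    k_lin = linear_kernel(A, B)
    return w_rbf * k_rbf + w_mix * (k_rbf * k_lin)

def is_admissible(K, tol=1e-8):
    """Check symmetry and positive semi-definiteness of a Gram matrix."""
    symmetric = np.allclose(K, K.T)
    return bool(symmetric and np.linalg.eigvalsh((K + K.T) / 2.0).min() >= -tol)
```

In an image-based outcome model, the rows of `A` and `B` would be the feature vectors of individual patients, and the resulting Gram matrix is what a kernel machine such as an SVM operates on.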

Evaluation of radiomics in clinical trials is still in its infancy. According to the website clinicaltrials.gov, a registered trial for “Radiomics: A Study of Outcome in Lung Cancer” between the Maastro Clinic in the Netherlands, Moffitt Cancer Center and Research Institute from Florida, USA, and Gemelli Hospital from Rome, Italy, is reported. Another trial on “Radiomics Prediction of Long Term Survival and Local Failure After Stereotactic Radiotherapy for Brain Metastases” by the Maastro Clinic has been recently opened.

Conclusions

Image processing constitutes an indispensable set of tools for analyzing and extracting valuable information from PET images. We presented in this chapter an overview of different features that could be extracted from PET images for different applications including contouring and response prediction. We have shown that incorporation of anatomical information from CT and MRI into PET is feasible and could yield better results. However, there are still challenges in the use of PET; some are related to inherent image quality, and others are related to standardization of image acquisition protocols and reconstruction algorithms. Nevertheless, advances in hardware and software technologies will further facilitate wider application of advanced image processing techniques to PET and hybrid imaging to achieve better clinical results. In particular, the synergy between image analysis and machine learning could provide powerful tools to strengthen and further the utilization of PET in clinical practice.