Introduction

In recent decades, 18F–fluorodeoxyglucose (FDG) positron emission tomography/computed tomography (PET/CT) has been routinely used for diagnosing lymph node and distant metastases, assessing treatment response, and detecting recurrence after treatment in various malignancies [1,2,3,4]. Because tumor FDG uptake is associated with tumor aggressiveness, positive tumor FDG uptake on visual analysis and intensity of uptake, expressed as maximum standardized uptake value (SUV), are associated with prognosis in patients with malignant diseases [5, 6]. Furthermore, volumetric parameters including metabolic tumor volume (MTV) and total lesion glycolysis (TLG), which integrate metabolically active tumor volume with tumor FDG uptake, also show significant associations with clinical outcomes [3, 7]. Generally, tumor FDG uptake shows uneven spatial distribution, at least partly due to underlying biological tumor conditions such as metabolism, hypoxia, necrosis, and cellular proliferation, which is referred to as intratumoral heterogeneity [8, 9]. However, although intratumoral heterogeneity is related to tumor aggressiveness, treatment response, and prognosis [9, 10], established FDG PET/CT parameters such as SUV, MTV, and TLG do not reflect this property, raising the need for different analytic methods.

Recently, the concept of radiomics, which is defined as high-throughput extraction of a large number of features from medical images, has become widespread in analyses of diagnostic imaging modalities [11, 12]. The underlying hypothesis of radiomics is that genomic and proteomic cancer patterns are expressed in image-based features and that, with optimal analysis of medical images such as texture analysis, we can quantify these cancer properties [11]. This could extend to the concept of radiogenomics, which implies that genomic tumor heterogeneity is associated with intratumoral heterogeneity on imaging studies [11]. Texture analysis has long been applied in CT and magnetic resonance imaging (MRI), but it was introduced in PET images only recently, at the end of the 2000s [13,14,15]. Since then, a considerable number of clinical studies assessing the significance of tumor FDG uptake heterogeneity have been published.

In this review, we survey recent reports in the field of oncology regarding intratumoral heterogeneity on PET images, and we discuss the clinical roles of radiomics in PET/CT for patients with malignant diseases. Because methodological aspects of heterogeneity analysis will be discussed in other works, we mainly focus here on the clinical value of intratumoral heterogeneity on PET/CT.

Brain Tumor

Among the various histological brain tumor types, glial-origin tumors comprise approximately 80% of malignant primary brain tumors [16]. Although previous studies have shown the role of FDG PET/CT in diagnosis, detecting recurrence, and predicting grade and prognosis of glioma, because of high physiologic FDG uptake in the normal brain cortex, the clinical utility of FDG PET is known to be limited in brain tumors [17,18,19,20]. Therefore, other radiotracers, such as 18F–fluoroethyl-L-tyrosine (FET), which is taken up specifically by L-type amino acid transporters, and 3′-deoxy-3′-18F-fluorothymidine (FLT), accumulation of which is closely associated with cellular proliferation, have been used to evaluate gliomas [16, 20].

Only a few studies have investigated intratumoral heterogeneity on PET/CT in patients with brain tumors, and all of those studies were performed using FET and FLT. In patients with gliomas, two studies investigated the predictive value of textural features for survival. Pyka et al. [16] evaluated the role of texture parameters on FET PET for predicting World Health Organization glioma grade and prognosis in 113 patients with high-grade gliomas. In receiver operating characteristic (ROC) analysis for differentiating grade IV gliomas from grade III tumors, contrast (a measure of local variation of intensity of a pixel and its neighbor over the image), busyness (the intensity changes from a pixel to its neighborhood), and coarseness (a measure of intensity differences throughout the image) from the gray-level neighborhood difference matrix showed area under the curve (AUC) values between 0.737 and 0.775, which was higher than the tumor-to-background ratio (TBR; 0.644) and the MTV (0.710). With the combination of contrast, complexity (a measure of the uniformity of pattern versus rate of change in an image), and MTV, 85% of tumors were correctly classified with an AUC of 0.830. Furthermore, coarseness, contrast, and busyness correlated significantly with progression-free survival (PFS) and overall survival (OS) on multivariate analysis, whereas TBR and volumetric parameters were not significant. Mitamura et al. [20] evaluated uptake heterogeneity on FLT PET in patients with newly diagnosed gliomas. 
In that study, kurtosis (flatness of gray-level histogram distribution), entropy (a measure of randomness of the image, reflecting irregularity of gray-level distribution), and uniformity (a measure of the heterogeneity of the image array) showed significant correlations with the Ki-67 index, which indicates proliferative tumor activity, and patients with low skewness (asymmetry of gray-level histogram distribution) and kurtosis showed significantly longer OS than those with high values.
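The histogram-based parameters named above (skewness, kurtosis, entropy, and uniformity) can be illustrated with a minimal sketch. It is not the implementation used in the cited studies: the function name `first_order_features`, the equal-width binning, and the default of 8 bins are assumptions made for illustration, and published analyses differ in how they discretize uptake values.

```python
import math
from collections import Counter


def first_order_features(values, n_bins=8):
    """Histogram-based heterogeneity features for a list of voxel uptake values.

    Assumption: equal-width discretization into n_bins gray levels.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all voxel values are identical; features are undefined")

    n = len(values)
    mean = sum(values) / n
    # Central moments of the uptake distribution.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n

    # Discretize into gray levels; the maximum value falls into the last bin.
    width = (hi - lo) / n_bins
    levels = [min(int((v - lo) / width), n_bins - 1) for v in values]
    probs = [count / n for count in Counter(levels).values()]

    return {
        "skewness": m3 / m2 ** 1.5,          # asymmetry of the histogram
        "kurtosis": m4 / m2 ** 2,            # non-excess; a normal distribution gives 3
        "entropy": -sum(p * math.log2(p) for p in probs),  # randomness of gray levels
        "uniformity": sum(p * p for p in probs),           # 1.0 for a single gray level
    }
```

In this formulation, entropy rises and uniformity falls as uptake values spread more evenly across gray levels.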

In patients with brain tumors, whole-brain radiation therapy and stereotactic radiosurgery have been frequently used to treat tumor lesions [21, 22], but radiotherapy can cause radiation-induced injuries and post-therapeutic changes including tumor necrosis, demyelination, and coagulative necrosis [21]. In some patients, these radiation-related changes are difficult to differentiate from tumor progression on MRI [23]. Recently, Kebir et al. [24] assessed the diagnostic value of textural features of FET PET for detecting true tumor progression after radiotherapy in 14 patients with high-grade gliomas. Using total lesion FDG uptake, maximum SUV, and eight textural features [contrast, entropy, correlation (a measure of continuous areas of same or similar voxel values in an image), size-zone variability (a measure of variability in the size and intensity of homogeneous tumor zones), coarseness, volume, coefficient of variation, and complexity], the authors used unsupervised consensus clustering to classify PET features into three clusters of high, intermediate, and low heterogeneity. They found that all of the patients in the high heterogeneity cluster were diagnosed with true progression and three out of four patients with pseudoprogression were assigned to the low heterogeneity cluster, suggesting that clustering-based analysis could be useful in differentiating true progression from pseudoprogression. In another recent study, Lohmann et al. [22] evaluated the potential of textural features of FET PET for differentiating tumor recurrence from radiation-related injuries in patients with brain metastases originating from malignancies including breast cancer, lung cancer, and melanoma.
The authors demonstrated that by combining maximum and mean TBR with textural parameters such as coarseness and short-zone emphasis (a measure of distribution of short zones as the differences of the gray value when going to the next voxel, showing high value when the intensity changes often between voxels), diagnostic accuracy increased from 81% to 85% for mean TBR and from 83% to 85% for maximum TBR.

Head and Neck Cancer

Squamous cell carcinoma is the most common histopathological type of head and neck cancer, and the most common site for head and neck cancer is the oral cavity, followed by the pharynx [25, 26]. Several clinical studies have evaluated the clinical implications of FDG uptake heterogeneity on PET/CT in patients with pharyngeal cancer, oral cavity cancer, and parotid gland cancer, as summarized in Table 1.

Table 1 Current literature evaluating FDG uptake heterogeneity on PET/CT in head and neck cancer

Most studies involving head and neck cancer investigated the prognostic value of heterogeneity parameters for predicting disease-free survival and OS in patients with pharyngeal cancer [27,28,29,30, 32, 35,36,37]. The results of those studies consistently demonstrated that intratumoral heterogeneity parameters such as gray-level uniformity, coarseness, busyness, zone-size nonuniformity (the sum of the squares of gray-level zones normalized to the total number of zones, showing a large value in tumors with heterogeneous uptake due to uneven zone distribution), and skewness are significantly associated with survival in patients with naso-, oro-, and hypopharyngeal cancer, with worse survival in patients with high intratumoral heterogeneity. One representative study is a retrospective study by Folkert et al. [32] performed in a relatively large number of patients (n = 174) with oropharyngeal cancer. In that study, the authors derived 24 representative features including maximum SUV, intensity-volume histogram distribution of SUV, and second-order gray-level co-occurrence matrix-based features from pre-treatment FDG PET/CT images. Using these features, the authors built multiparameter logistic regression models for all-cause mortality, local failure, and distant metastasis using machine-learning-based feature selection methods, and tested the models on an independent cohort of 65 patients with oropharyngeal cancer treated at another institution for external validation.
The study results revealed that all multiparameter models were statistically significant on cross-validation: kurtosis and MTV for all-cause mortality; homogeneity (a local measure of similarity of intensity in paired voxels, showing a large value when the gray levels of each pixel pair are similar) and MTV for local failure; and solidity (a measure of the proportion of pixels of the region-of-interest to the largest possible convex hull polygon structure, which is the best-fitting polygon that encloses all the pixels of the region-of-interest), kurtosis, and MTV for distant metastasis. Furthermore, the model for predicting local failure still showed statistical significance on the independent validation cohort.
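To make the second-order features concrete, the following minimal sketch builds a gray-level co-occurrence matrix for a small 2D image that has already been discretized into integer gray levels, then derives homogeneity, contrast, energy, and entropy from it. The function names, the single horizontal offset, and the symmetric counting are illustrative assumptions; homogeneity is written here with the 1/(1 + |i - j|) weighting, and other formulations use the squared difference instead. It is not the pipeline of any cited study.

```python
import math


def glcm(image, levels, offset=(0, 1)):
    """Symmetric, normalized co-occurrence matrix for one pixel offset.

    image: 2D list of integer gray levels in range(levels).
    offset: (row step, column step); (0, 1) pairs each pixel with its right neighbor.
    """
    d_row, d_col = offset
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + d_row, c + d_col
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count each pair in both directions
    total = sum(map(sum, counts))
    return [[n / total for n in row] for row in counts]


def glcm_features(p):
    """Texture features from a normalized co-occurrence matrix p."""
    size = len(p)
    cells = [(i, j, p[i][j]) for i in range(size) for j in range(size)]
    return {
        # Large when paired voxels have similar gray levels.
        "homogeneity": sum(v / (1 + abs(i - j)) for i, j, v in cells),
        # Large when paired voxels differ strongly.
        "contrast": sum(v * (i - j) ** 2 for i, j, v in cells),
        "energy": sum(v * v for _, _, v in cells),
        "entropy": -sum(v * math.log2(v) for _, _, v in cells if v > 0),
    }
```

A uniform patch yields homogeneity 1 and contrast 0, whereas a checkerboard of two gray levels lowers homogeneity and raises contrast.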

Because human papillomavirus (HPV) infection-mediated oropharyngeal squamous cell carcinoma is known to have different biological characteristics and a good prognosis, HPV-related oropharyngeal cancer has different TNM staging and treatment strategies from other pharyngeal cancers in recent guidelines [26, 38]. In line with this, Mena et al. [35] retrospectively evaluated the role of intratumoral heterogeneity on FDG PET/CT in patients with HPV-positive oropharyngeal cancer. They used the area under a cumulative SUV volume histogram (AUC-CSH) to represent intratumoral FDG uptake heterogeneity and concluded that patients with greater heterogeneity had worse clinical outcomes in HPV-positive oropharyngeal cancer.
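As a rough illustration of AUC-CSH, the sketch below computes the fraction of tumor voxels at or above each threshold, expressed as a percentage of maximum SUV, and integrates that curve with the trapezoidal rule. The function name `auc_csh`, the 1% threshold step, and the normalization of both axes to the range 0 to 1 are assumptions for illustration; the exact definition used by Mena et al. is not given in this review. With this construction, a lower area is generally read as more heterogeneous uptake.

```python
def auc_csh(suv_values, n_steps=100):
    """Area under the cumulative SUV-volume histogram, between 0 and 1.

    suv_values: SUVs of the voxels inside the tumor delineation.
    n_steps: number of threshold steps between 0% and 100% of maximum SUV (assumed).
    """
    suv_max = max(suv_values)
    n_voxels = len(suv_values)

    # Fraction of tumor volume with uptake at or above each threshold.
    curve = []
    for k in range(n_steps + 1):
        threshold = suv_max * k / n_steps
        curve.append(sum(v >= threshold for v in suv_values) / n_voxels)

    # Trapezoidal rule on a threshold axis normalized to [0, 1].
    step = 1.0 / n_steps
    return sum((curve[k] + curve[k + 1]) / 2 * step for k in range(n_steps))
```

A perfectly homogeneous lesion gives an area of 1, because every voxel stays above every threshold up to the maximum.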

In a group of patients with oral cavity cancer, Kwon et al. [34] performed a retrospective study to evaluate the predictive value of heterogeneity parameters for OS. In the study, the authors measured the heterogeneity factor to evaluate intratumoral heterogeneity. The heterogeneity factor is defined as the derivative of a tumor volume-SUV threshold function of the primary tumor and is calculated as the slope from the linear regression on the tumor volume-SUV threshold curve. Hence, the heterogeneity factor has a negative value, and a tumor with more heterogeneous uptake has a larger negative value. The results of their study demonstrated that the heterogeneity factor had significant negative correlations with maximum SUV, MTV, and TLG of cancer lesions, showing more heterogeneous FDG uptake in tumors with high metabolic activity. Furthermore, only the presence of neck lymph node metastasis and a low heterogeneity factor were independently associated with worse survival. Hofheinz et al. [31] used asphericity to quantify spatial heterogeneity in head and neck squamous cell carcinoma including oral cavity and pharyngeal cancers. Asphericity reflects irregularity in the shape of the FDG uptake and quantifies the degree of deviation from a spherical shape. It is equal to 0 for spheres and is >0 for non-spherical shapes. In that study, asphericity was an independent prognostic factor for PFS and OS, along with maximum SUV, MTV, and TLG.
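The heterogeneity factor described here can be sketched as the least-squares slope of tumor volume against SUV threshold. The function name `heterogeneity_factor`, the threshold range of 40% to 80% of maximum SUV in 5% steps, and the unit voxel volume are assumptions for illustration only; the cited studies may have used different ranges, steps, or units.

```python
def heterogeneity_factor(suv_values, voxel_volume_ml=1.0, thresholds=range(40, 85, 5)):
    """Slope of the volume-versus-threshold curve (mL per % of maximum SUV).

    suv_values: SUVs of the voxels inside the tumor delineation.
    thresholds: percentages of maximum SUV used as cut-offs (assumed 40..80 in steps of 5).
    """
    suv_max = max(suv_values)
    xs = list(thresholds)
    # Tumor volume remaining at or above each relative threshold.
    ys = [
        voxel_volume_ml * sum(v >= suv_max * t / 100 for v in suv_values)
        for t in xs
    ]

    # Ordinary least-squares slope of volume on threshold.
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    return s_xy / s_xx
```

Because volume can only shrink as the threshold rises, the slope is zero for perfectly uniform uptake and negative otherwise, which matches the sign convention described above.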

In patients with parotid gland tumors, Kim et al. [33] assessed the diagnostic ability of the heterogeneity factor to differentiate malignant from benign tumors, compared with tumor size, SUV, MTV, TLG, and coefficient of variation. The malignant tumors showed significantly lower heterogeneity factors than did the benign tumors. The heterogeneity factor showed the highest AUC (0.947) among the parameters and, using the optimal cut-off heterogeneity factor of −0.084, the sensitivity and specificity for detecting malignant parotid gland tumors were 100% and 89.2%, respectively.

Thyroid Cancer

Differentiated thyroid cancer is the most common endocrine tumor and is well-known for its favorable prognosis [39]. Except for those with early-stage disease, patients with differentiated thyroid cancer generally receive total thyroidectomy followed by radioiodine treatment. In progressive radioiodine-refractory thyroid cancer, tyrosine kinase inhibitor treatment and peptide receptor radionuclide therapy have been introduced with promising results, but the prognosis remains dismal [40]. Among the various oncogene mutations in thyroid cancer, BRAF mutation is notable for its association with extrathyroidal invasion, lymph node metastasis, glucose metabolism in cancer cells, resistance to radioiodine treatment, and poor prognosis, making it a potential target for both imaging and therapy [41,42,43].

Kim et al. [44] investigated the diagnostic performance of intratumoral heterogeneity of FDG uptake for identifying malignancy in patients with thyroid nodules. They enrolled 200 patients who had a thyroid incidentaloma on FDG PET/CT and used the heterogeneity factor to measure intratumoral heterogeneity of the thyroid nodules. The heterogeneity factor showed a high diagnostic value for predicting malignant thyroid nodules, with sensitivity of 100%, specificity of 60%, and AUC of 0.826.

Several studies with papillary thyroid cancer patients have evaluated the relationship between BRAF mutation in thyroid cancer and findings of FDG PET/CT [39, 42, 45]. All of those studies revealed that BRAF mutation was independently associated with maximum SUV of papillary thyroid cancer, showing significantly higher prevalence of BRAF mutation in FDG-avid papillary thyroid cancer. Furthermore, two of those studies investigated the effect of tumor size on the relationship between BRAF mutation and FDG uptake of papillary thyroid cancer [42, 45]. Both studies demonstrated that BRAF mutation was significantly associated with maximum SUV for papillary thyroid cancer with a size of >1.0 cm, but for papillary thyroid microcarcinoma (size of ≤1.0 cm), there was no significant relationship between BRAF mutation and FDG uptake. Nagarajah et al. [46] enrolled 48 patients with metastatic differentiated thyroid cancer and 34 with metastatic poorly-differentiated thyroid cancer and evaluated differences in FDG uptake between BRAF-mutated and BRAF wild-type tumors for each tumor type. In differentiated thyroid cancer, FDG uptake of BRAF-mutated tumors was significantly higher than that of BRAF wild-type tumors, but in poorly-differentiated thyroid cancer, no significant difference in FDG uptake was shown between BRAF-mutated and BRAF wild-type tumors.

Lapa et al. [40] performed a retrospective study with eight radioiodine-refractory differentiated thyroid cancer patients and four medullary thyroid cancer patients to evaluate the relationship between textural features on somatostatin receptor PET/CT and treatment response to peptide receptor radionuclide therapy. The results of their study demonstrated that gray-level non-uniformity (the variability of gray-level intensity values in the image) showed the highest AUC (0.93) in ROC curve analysis for predicting PFS, followed by contrast (0.89), whereas none of the conventional PET/CT parameters, such as maximum SUV, mean SUV, and total receptor expression, was associated with PFS. Furthermore, in lesion-based analysis, only entropy could predict progression of the individual lesion, with sensitivity of 67%, specificity of 75%, and AUC of 0.73.

Non-Small Cell Lung Cancer

Lung cancer is the most common cause of cancer death as well as the most commonly diagnosed cancer worldwide [25]. FDG PET/CT has been shown to be valuable for staging, evaluating treatment response, detecting recurrence, and predicting prognosis of non-small cell lung cancer (NSCLC) and has been used as an essential imaging tool [1, 7, 47]. Reflecting the wide use of FDG PET/CT in NSCLC, dozens of studies investigating clinical values of intratumoral heterogeneity on PET/CT have been published in recent years (Table 2). In addition to evaluations of the prognostic value of PET/CT for predicting survival, various clinical roles of intratumoral heterogeneity have been assessed in patients with NSCLC.

Table 2 Current literature evaluating FDG uptake heterogeneity on PET/CT in lung cancer

Miwa et al. [62] evaluated the diagnostic performance of the morphological fractal dimension, which reflects morphological complexity on CT images, and the density fractal dimension, which reflects FDG uptake heterogeneity on PET images, for differentiating malignant pulmonary nodules from benign nodules in patients with suspected NSCLC. According to the results of the study, the diagnostic accuracy of the density fractal dimension for detecting malignant nodules (78%) was higher than those of maximum SUV (68%) and the morphological fractal dimension (65%). Gao et al. [56] evaluated the diagnostic ability of textural analysis to detect mediastinal lymph node metastasis compared with that of maximum SUV and maximum short diameter of the lymph node. With textural parameters from a gray-level co-occurrence matrix, the authors constructed three support vector machine classifiers from CT, PET, and combined PET/CT images. The diagnostic ability of the support vector machine classifier generated from the combined PET/CT images (AUC: 0.685) was not inferior to that of maximum SUV (AUC: 0.652) and maximum short diameter (AUC: 0.684), suggesting that the diagnostic ability of a computer algorithm might not underperform that from visual experience.

Several studies have evaluated the relationships between textural features and tumor characteristics in NSCLC, such as cancer subtypes, histopathologic staging, volumetric parameters, and genetic mutations. Two studies compared textural features between adenocarcinoma and squamous cell carcinoma and showed that the majority of textural features were different between them [57, 64]. Another previous study revealed that the heterogeneity factor was significantly higher in squamous cell carcinoma than in adenocarcinoma [59]. Furthermore, intratumoral heterogeneity on FDG PET/CT was found to be significantly associated with T, N, and M stages, MTV, Ki-67 index, vascular endothelial growth factor expression, and epidermal growth factor receptor mutation of NSCLC [49, 58, 67, 69]. These results suggest that textural parameters on FDG PET/CT can be used as imaging biomarkers to characterize NSCLC. Interestingly, MTV and textural features had fewer correlations in NSCLC with large tumor volumes [58, 64].

Two studies have investigated the clinical significance of intratumoral heterogeneity of FDG uptake for planning radiotherapy in NSCLC. Dong et al. [52] evaluated, according to entropy, the differences in gross tumor volume between four volume definition methods in patients with NSCLC and esophageal cancer: 1) CT images, 2) fused PET/CT images, 3) PET images using 40% of SUV as a threshold, and 4) PET images using a cut-off SUV of 2.5. In tumors with high entropy, the authors found large differences in gross tumor volume between the four methods, suggesting that FDG uptake heterogeneity should be considered in radiation treatment planning. Another study of stage III NSCLC patients by Fried et al. [55] demonstrated that a subgroup of patients with increased textural parameters [solidity and primary co-occurrence matrix energy (a measure of homogeneous patterns in the image)] had improved survival when the radiation dose increased from 60–70 Gy to 74 Gy, whereas, for all patients in the study, no significant difference in survival was shown between patients who received the two different doses, indicating that textural features can be used to isolate subgroups of patients who benefit from radiation dose escalation.
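The two PET-based delineation rules compared by Dong et al. can be sketched as simple voxel counts. This is a toy illustration rather than a treatment-planning tool: the function names, the reading of "40% of SUV" as 40% of maximum SUV, and the voxel volume are assumptions, and real delineation works on connected 3D regions rather than a flat list of values.

```python
def volume_above(suv_values, threshold, voxel_volume_ml):
    """Volume (mL) of voxels whose SUV is at or above the threshold."""
    return voxel_volume_ml * sum(v >= threshold for v in suv_values)


def pet_threshold_volumes(suv_values, voxel_volume_ml=1.0):
    """Compare a relative and an absolute SUV threshold on the same voxels."""
    suv_max = max(suv_values)
    return {
        # Relative rule: 40% of maximum SUV (assumed interpretation).
        "relative_40pct": volume_above(suv_values, 0.4 * suv_max, voxel_volume_ml),
        # Absolute rule: fixed cut-off SUV of 2.5.
        "absolute_2p5": volume_above(suv_values, 2.5, voxel_volume_ml),
    }
```

When uptake is uneven, a high maximum SUV pushes the relative threshold well above 2.5, so the two rules can enclose markedly different volumes.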

In assessing treatment response, previous studies have shown relationships between textural features and response to radiotherapy [65], chemoradiotherapy [53], and erlotinib [50]. Reduced textural features such as entropy, uniformity, and contrast on follow-up FDG PET/CT were associated with chemoradiotherapy and erlotinib response [50, 53]. Moreover, entropy on pretreatment PET/CT showed a higher AUC value (0.872) for predicting radiotherapy response than did maximum (0.613) or mean SUV (0.575), MTV (0.806), or size (0.739) of NSCLC, demonstrating significantly higher treatment failure in patients with high entropy [65].

A number of studies assessed the prognostic value of intratumoral heterogeneity for predicting survival using various parameters, such as asphericity, histogram-based parameters, heterogeneity factor, and textural features, in diverse clinical NSCLC settings in patients from stage I to distant metastases [48, 49, 51, 53, 54, 59,60,61, 63, 65, 66, 68]. In all of these studies, increased intratumoral heterogeneity was associated with poor NSCLC prognosis and treatment failure. Asphericity, the heterogeneity factor, solidity, dissimilarity (a measure of local intensity variation between the neighboring pairs of voxels), and entropy were found to be independent prognostic factors for predicting survival [48,49,50, 54, 59, 61, 65]. In addition to the known heterogeneity parameters, Kim et al. [60] developed the macroheterogeneity factor, defined as the ratio of tumor surface area to spherical surface area, and showed that it was the only predictor of recurrence in pathologically N0 squamous cell lung carcinoma patients. In a multi-institutional dataset of 201 patients from more than 30 institutions who had stage III NSCLC and who underwent definitive concurrent chemoradiotherapy, the gray-level co-occurrence matrix-based sum mean, which measures the relationship between occurrence of pairs with lower intensity values and occurrence of pairs with higher intensity values and is frequently used in image segmentation for separating tumors from surrounding tissues, was found to be an independent predictor of OS, and patients who had high tumor MTV and low sum mean (more heterogeneous) showed poorer OS (median OS: 6.2 months) than did those with low MTV (median OS: 22.6 months) and those with high MTV and high sum mean (median OS: 20.0 months) [63]. Moreover, combining textural parameters with conventional prognostic factors further enhanced the prognostic value for predicting clinical outcomes.
In a previous study of stage I NSCLC, the prognostic model for predicting distant metastasis, which consisted of peak SUV and the cluster shade (a measure of the skewness and uniformity of the matrix) of the Gaussian-filtered image within the Laws feature group, showed a higher concordance index (0.71) than those of the maximum SUV (0.67) and MTV (0.64) [68]. In a study of stage III NSCLC patients, linear predictors of OS generated with both histogram- and gray-level co-occurrence matrix-based quantitative imaging features and conventional prognostic factors revealed improved risk stratification compared with that generated with only conventional prognostic factors [54]. Desseroit et al. [51] built a nomogram with entropy on PET images and zone percentage (a measure of the coarseness of the texture from gray-level size-zone matrix) on CT images in 116 patients with stage I-III NSCLC and revealed that the nomogram had higher stratification power for OS than did staging alone in patients with stages II and III.
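The shape-based measures in this section, the macroheterogeneity factor and asphericity, both compare a lesion's surface with that of a sphere. The sketch below assumes that the reference sphere has the same volume as the lesion, which this review does not state explicitly, and it writes asphericity in one common formulation, the cube root of S^3 / (36 * pi * V^2) minus 1, which equals the surface ratio minus 1. The function names are hypothetical, and surface area S and volume V are taken as already measured.

```python
import math


def equal_volume_sphere_area(volume):
    """Surface area of the sphere that has the given volume."""
    return math.pi ** (1 / 3) * (6 * volume) ** (2 / 3)


def surface_ratio(surface_area, volume):
    """Lesion surface area divided by that of the equal-volume sphere (assumed reference)."""
    return surface_area / equal_volume_sphere_area(volume)


def asphericity(surface_area, volume):
    """0 for a sphere and greater than 0 for any other shape (assumed formulation)."""
    return surface_ratio(surface_area, volume) - 1.0
```

For a sphere of radius 2 the ratio is 1 and asphericity is 0, whereas a unit cube gives an asphericity of about 0.24.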

Breast Cancer

Breast cancer is the most common cancer in women worldwide [70]. In diagnosing breast cancer, identifying molecular subtypes based on immunohistochemistry with estrogen receptor, progesterone receptor, and human epidermal growth factor receptor 2 (HER2) is essential in selecting treatment strategies [71]. In particular, because patients with triple-negative breast cancer typically show poor response to treatment and rapid progression, there is an urgent clinical need for imaging biomarkers related to triple-negative breast cancer [72]. Furthermore, in locally advanced breast cancer, neoadjuvant treatment followed by surgical resection is now considered standard, because it can provide information on treatment response and prognosis and downstage the tumor lesions, allowing complete surgical resection [73]. Therefore, a number of previous studies have evaluated the clinical role of FDG PET/CT in differentiating breast cancer subtypes and predicting treatment response to neoadjuvant treatment as well as staging and predicting prognosis [74,75,76,77]. Similarly, many studies of intratumoral heterogeneity in breast cancer have also focused on evaluating discriminative power for differentiating breast cancer subtypes and predictive power for neoadjuvant treatment response, as shown in Table 3.

Table 3 Current literature evaluating FDG uptake heterogeneity on PET/CT in breast cancer

A previous study by Soussan et al. [85] used three textural parameters, high-gray-level run emphasis (a measure of the distribution of segments of high gray-level values, showing a higher value when a greater concentration of high gray-level values is present in the image), entropy, and homogeneity, to evaluate the relationships between intratumoral heterogeneity and hormone receptor negativity and triple-negative breast cancer. The authors found that estrogen receptor-negative, progesterone receptor-negative, and triple-negative breast cancers showed significantly higher high-gray-level run emphasis, and, by combining maximum SUV and high-gray-level run emphasis, 77% of triple-negative breast cancers and 71% of non-triple-negative breast cancers were correctly classified. In contrast, other studies demonstrated no significant correlations between textural parameters and hormone receptor and HER2 expression status [78, 80, 82]. In a study by Groheux et al. [78], none of the textural parameters such as entropy, dissimilarity, and homogeneity from a gray-level co-occurrence matrix differentiated between three breast cancer subgroups: estrogen receptor-positive/HER2-negative, HER2-positive, and triple-negative.
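High-gray-level run emphasis is derived from a gray-level run-length matrix, which counts runs of consecutive pixels sharing the same gray level. The minimal sketch below scans rows in the horizontal direction only and weights each run by the square of its gray level; the function names, the single scan direction, and the convention of numbering gray levels from 1 in the weighting are assumptions for illustration, not details taken from Soussan et al.

```python
def run_length_matrix(image, levels):
    """Horizontal run-length matrix.

    rlm[g][r - 1] is the number of runs of gray level g with length r.
    image: 2D list of integer gray levels in range(levels).
    """
    max_run = len(image[0])
    rlm = [[0] * max_run for _ in range(levels)]
    for row in image:
        run_level, run_length = row[0], 1
        for value in row[1:]:
            if value == run_level:
                run_length += 1
            else:
                rlm[run_level][run_length - 1] += 1  # close the finished run
                run_level, run_length = value, 1
        rlm[run_level][run_length - 1] += 1  # close the last run of the row
    return rlm


def high_gray_level_run_emphasis(rlm):
    """Mean squared gray level over all runs; levels are numbered from 1 (assumed)."""
    n_runs = sum(map(sum, rlm))
    weighted = sum((g + 1) ** 2 * sum(row) for g, row in enumerate(rlm))
    return weighted / n_runs
```

The value grows when more of the runs belong to high gray levels, in line with the definition quoted above.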

Elevated MYC expression and increased activity of the MYC pathway have been shown in triple-negative breast cancer, implying that imaging the MYC oncogenic pathway could be a breakthrough as a new imaging biomarker for triple-negative breast cancer [87]. One of the promising imaging biomarker targets for the MYC pathway is the transferrin receptor, which is generally highly expressed in cancer cells and regulated by the MYC pathway [87]. A recent study by Henry et al. [88] demonstrated that 89Zr-labeled transferrin uptake was significantly associated with MYC expression status and that 89Zr-labeled transferrin PET can detect triple-negative breast cancer significantly better than FDG PET, suggesting 89Zr-labeled transferrin imaging as a novel imaging biomarker for predicting triple-negative breast cancer.

A recent study by Ha et al. [80] showed clinical significance of textural features for predicting response to neoadjuvant treatment and risk of recurrence in 73 stage II and III breast cancer patients. Using unsupervised clustering, the authors created three individual tumor clusters from 109 textural features based on multiple matrices (gray-level co-occurrence matrix, gray-level run-length matrix, gray-level neighborhood intensity-difference matrix, gray-level size-zone matrix, SUV statistics, texture spectrum, texture feature coding, texture feature coding co-occurrence matrix, and neighboring gray-level dependence). Although estrogen receptor, progesterone receptor, and HER2 expression showed no statistically significant differences between the three tumor clusters, pathological complete remission after neoadjuvant treatment and risk of recurrence were significantly associated with the clusters, suggesting that integrated textural features can be used as predictive and prognostic biomarkers in managing breast cancer. Son et al. [84] used the heterogeneity factor, the derivative of a volume threshold function from 40% to 80%, to represent breast cancer intratumoral heterogeneity. The authors showed that both MTV and the heterogeneity factor were significantly associated with OS, even after they adjusted for TNM stage, in 123 patients with invasive ductal carcinoma. Furthermore, the heterogeneity factor was the best OS predictor among the PET parameters. However, in contrast to the results of the aforementioned studies, two other studies failed to show any significance of textural features for predicting clinical outcomes [79, 82]. In a previous study by Groheux et al. [79], only MTV was significantly associated with event-free survival among the PET parameters, and the textural features entropy and homogeneity did not further improve the prediction of survival.
Similarly, four textural parameters (entropy, homogeneity, energy, and contrast generated from the co-occurrence matrix) failed to show significance for predicting pathological response to neoadjuvant chemotherapy in a study by Lemarignier et al. [81], although maximum SUV and TLG did predict response. Because the methodologies for textural analysis differed in these studies, more research is needed to elucidate the predictive value of textural features.

Shin et al. [83] investigated the relationships between histopathological results and intratumoral heterogeneity using the heterogeneity factor in patients with breast cancer and found that dermal lymphatic cancer involvement was significantly associated with a higher heterogeneity factor; however, the heterogeneity factor could not predict lymph node metastasis.

Kim et al. [81] investigated the relationships between FDG PET/CT parameters including maximum SUV, MTV, TLG, and heterogeneity factor, dynamic contrast-enhanced MRI parameters, and recurrence-free survival in 67 patients with invasive ductal carcinoma. In that study, only maximum SUV showed a significant correlation with the tumor cellularity on MRI images; there were no significant correlations between other FDG PET/CT and MRI parameters. Furthermore, the authors found that only the heterogeneity factor was an independent predictor of recurrence-free survival among FDG PET/CT and MRI parameters.

Only one study has evaluated the role of intratumoral heterogeneity on FDG PET/CT in identifying the invasive components of breast ductal carcinoma in situ [86]. In that study, 65 patients with ductal carcinoma in situ who underwent pre-operative FDG PET/CT were retrospectively enrolled. The study results showed that the invasive component of tumor was significantly correlated with lower AUC-CSH and that AUC-CSH was the only independent predictor of invasive components of ductal carcinoma in situ.

Esophageal Cancer

Esophageal cancer is one of the most aggressive cancers, and it has a poor prognosis, with mortality close to incidence [89, 90]. For patients with resectable esophageal cancer, trimodal therapy, consisting of concurrent neoadjuvant chemoradiotherapy followed by surgical resection, has been applied as primary treatment [91]. In those patients, tumor response to chemoradiotherapy is important for determining treatment plans and predicting clinical outcomes; however, there are still no clinical factors or imaging studies, including FDG PET/CT, that can accurately predict the response [90, 92, 93].

The results of recent clinical studies of intratumoral heterogeneity on FDG PET/CT are summarized in Table 4. Most of the studies investigated the value of textural features for predicting tumor response to neoadjuvant chemoradiotherapy [93, 98, 100, 101]. Textural parameters on pre-treatment FDG PET/CT and changes in textural parameters between FDG PET/CT scans performed before and after chemoradiotherapy demonstrated high predictive value for pathological response to chemoradiotherapy in studies with small sample sizes of between 20 and 54 patients [97, 98, 100, 101]. Recently, van Rossum et al. [93] performed a retrospective study with a relatively large number of patients with esophageal adenocarcinoma (n = 217) who underwent baseline and post-chemoradiotherapy FDG PET/CT scans and who were treated with neoadjuvant chemoradiotherapy followed by surgery. The authors found that adding four geometric and textural features [cluster shade on baseline PET/CT, relative change of run percentage (a measure of coarseness of the texture using a number of runs from gray-level run-length matrix), relative change of intensity co-occurrence matrix-based entropy, and roundness on post-chemoradiation PET/CT] increased the c-index of the model for predicting response to chemoradiotherapy to 0.77; for the model with only clinical factors, the model with clinical factors and subjective assessment of FDG PET/CT, and the model with clinical factors and subjective assessment of FDG PET/CT with the addition of post-chemoradiation TLG, the c-indices were 0.67, 0.72, and 0.73, respectively. However, decision curve analyses showed no incremental value of the prediction model with geometric and textural features at a decision threshold of 0.9 or higher, reflecting the clinical predictive value for pathological complete remission of 90% or more, at which point, it might be possible to forgo surgery.
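Run percentage, one of the features in the model above, can be illustrated with a minimal sketch. The version below is an assumption-laden simplification: it counts runs only along the rows of a small 2D grid of already discretized gray levels, whereas published implementations typically work in 3D and combine several directions. The function name `run_percentage` is hypothetical and is not taken from any of the cited studies.

```python
def run_percentage(image):
    """Ratio of the number of gray-level runs to the number of pixels.

    `image` is a list of rows of discretized gray levels. Runs are counted
    along rows only (a single direction), which is a simplifying assumption.
    Values close to 1 mean many short runs (fine texture); values close to 0
    mean a few long runs (coarse, homogeneous texture).
    """
    n_runs = 0
    n_pixels = 0
    for row in image:
        n_pixels += len(row)
        for i, level in enumerate(row):
            # A new run starts at the first pixel of a row and at every
            # pixel whose gray level differs from its left-hand neighbor.
            if i == 0 or level != row[i - 1]:
                n_runs += 1
    if n_pixels == 0:
        raise ValueError("image must contain at least one pixel")
    return n_runs / n_pixels


homogeneous = [[1, 1, 1, 1], [1, 1, 1, 1]]
alternating = [[1, 2, 1, 2], [2, 1, 2, 1]]
print(run_percentage(homogeneous), run_percentage(alternating))
```

A relative change of such a feature between two scans would then be computed as (post - pre) / pre, again as an illustrative convention rather than the exact definition used in [93].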

Table 4 Current literature evaluating FDG uptake heterogeneity on PET/CT in esophageal cancer

In evaluating the prognostic value of intratumoral heterogeneity for predicting survival, researchers found AUC-CSH and fractal dimension on FDG PET/CT to be significant predictors for relapse-free survival and overall survival in esophageal squamous cell carcinoma patients with curative surgical resection [95, 99]. However, in a study of patients treated with definitive chemoradiotherapy, intensity variability and size-zone variability, which represent regional heterogeneity and showed the highest AUC for predicting treatment response among the PET parameters, failed to show significance for predicting PFS and OS, whereas lymph node metastasis, tumor stage, and response status to chemoradiotherapy were found to be independent prognostic factors [97]. Because the tumor stages and treatment modalities differed between these studies, further studies are necessary to assess the predictive value of heterogeneity parameters for survival.

Two studies have investigated the relationship between tumor stage and intratumoral heterogeneity in esophageal cancer [94, 96]. Authors found significant correlations between T and N stages of tumor and textural parameters (entropy and energy), and entropy showed an AUC of 0.789 for detecting tumors above stage II [94]. Furthermore, the heterogeneity factor measured on FDG PET/CT was the only independent predictor of regional lymph node metastasis among PET/CT parameters including maximum SUV, MTV, and TLG in esophageal cancer patients with curative surgical resection [96].

Pancreatic Cancer

Pancreatic adenocarcinoma is notable for its aggressiveness and poor prognosis. Surgical resection is the only curative treatment, but only a small portion of patients are candidates for curative surgical resection and most patients with pancreatic adenocarcinoma undergo palliative chemoradiotherapy [102, 103]. Furthermore, even following surgical resection with curative intent, the 5-year survival rate after surgery is only approximately 10% [102].

A small number of studies have been performed to investigate the clinical significance of intratumoral heterogeneity on FDG PET/CT in pancreatic adenocarcinoma, mostly assessing the prognostic value of intratumoral heterogeneity for predicting survival [104,105,106,107].

Two studies evaluated the relationships between textural features and prognosis in pancreatic cancer patients treated with radiation therapy [104, 107]. Cui et al. [104] performed a retrospective study of 139 patients with locally advanced pancreatic cancer treated with stereotactic body radiation therapy and derived 162 image features from FDG PET/CT images. Those authors proposed a prognostic signature that contained seven FDG PET image features: 1) sum variance of the Laplacian of Gaussian filtered image, 2) information measure of correlation of the filtered image wavelet coefficients, 3) information measure of correlation of the Laplacian of Gaussian filtered image, 4) maximum of Gaussian filtered image, 5) sum variance of the Laplacian of Gaussian filtered image, 6) surface area-to-volume ratio, and 7) correlation of the filtered image wavelet coefficients. The authors found that the proposed signature was the only significant predictor of OS when adjusted for conventional FDG PET parameters such as maximum SUV and TLG and clinical factors. In another study, Yue et al. [107] retrospectively enrolled 26 patients with pancreatic adenocarcinoma who had been treated with radiotherapy and who had undergone both pre-radiation and post-radiation FDG PET/CT. The authors extracted 12 gray level co-occurrence matrix-based textural features from PET/CT images. In univariate analysis, the change in homogeneity, variance, and cluster tendency (a measure of grouping of voxels with similar gray-level values) between pre- and post-radiation PET/CT images were significantly associated with OS along with age, node stage, and maximum SUV in the post-radiation images. In multivariate analysis, all three features showed borderline significance (p-values between 0.065 and 0.081).

Kim et al. [106] evaluated the prognostic value of the heterogeneity factor, along with other PET/CT parameters including MTV and TLG, in 93 patients with pancreatic adenocarcinoma who underwent surgical resection. There were significant differences in MTV, TLG, and the heterogeneity factor between patients with recurrence and those with no recurrence, with patients with recurrence showing lower heterogeneity factor (that is, more heterogeneous). The heterogeneity factor on PET/CT and venous invasion in the surgical specimens were found to be independent predictors of recurrence-free survival.

Hyun et al. [105] performed a retrospective study to evaluate the prognostic value of textural features generated from gray-level run length matrix, neighborhood gray-level difference matrix, and gray-level size zone matrix in 137 patients with pancreatic adenocarcinoma who underwent diverse treatment modalities including curative surgery, concurrent chemoradiotherapy, chemotherapy, and supportive care. On time-dependent ROC curve analysis for 2-year survival prediction, entropy showed the highest AUC (0.720) among textural parameters and conventional PET/CT parameters such as TLG (AUC = 0.697) and MTV (AUC = 0.692), with worse survival in patients with high entropy. On multivariate analysis, clinical stage, tumor size, serum carbohydrate antigen 19–9 level, and entropy were independently associated with overall survival, but TLG failed to show statistical significance.

Colorectal Cancer

Colorectal cancer is the third most commonly diagnosed cancer worldwide [25]. It usually develops from adenomatous polyps with an alteration of several oncogenes [108]. RAS is one of the well-known oncogenes for colorectal cancer and is known to correlate with the effects of cetuximab, an epidermal growth factor receptor inhibitor used to treat metastatic colorectal cancer [109, 110]. Surgical resection, with adjuvant chemotherapy in the advanced stage, has been performed as curative treatment in colorectal cancer patients with no distant metastasis or with resectable metastatic lesions [111]. Recently, in patients with locally advanced rectal cancer, neoadjuvant chemoradiotherapy followed by surgical resection improved local control rates and overall survival, and predicting response to neoadjuvant treatment has become a focus of studies using diverse imaging modalities such as MRI and FDG PET/CT in patients with rectal cancer [112, 113].

Two studies have evaluated the predictive value of intratumoral heterogeneity on FDG PET/CT for tumor response to neoadjuvant treatment in locally advanced rectal cancer [113, 114]. Bundschuh et al. [114] performed a study with 27 patients who underwent FDG PET/CT before neoadjuvant chemoradiotherapy, 2 weeks after treatment started, and 4 weeks after it was completed, and they used skewness, kurtosis, and coefficient of variation of FDG uptake to assess intratumoral heterogeneity. The results of the study demonstrated that the changes in the coefficient of variation during and after the treatment showed the highest AUCs (0.83 for changes in early response 2 weeks after treatment started and 0.89 for changes in late response 4 weeks after treatment was completed) for predicting response to neoadjuvant treatment and were significantly associated with PFS. Bang et al. [113] investigated 50 textural features from histogram, absolute gradient, co-occurrence matrix, and run-length matrix for predicting neoadjuvant treatment response and disease-free survival in 74 patients. Textural parameters such as sum entropy (a sum of neighborhood intensity value differences) and entropy were significantly higher in the non-responder group than in the responder group; however, there were no significant associations between textural features and treatment response on multivariate analysis. For disease-free survival, the study authors found that only kurtosis based on absolute gradient was an independent predictor among the textural features and conventional PET/CT parameters.
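The first-order metrics named above (coefficient of variation, skewness, and kurtosis of FDG uptake) can be sketched as follows. This is a minimal illustration, assuming population (biased) moment estimators and non-excess kurtosis; the cited studies may have used other conventions. The helper names `first_order_features` and `relative_change` are hypothetical.

```python
import math


def first_order_features(suv):
    """First-order heterogeneity metrics from a flat list of voxel SUVs.

    Assumptions: population (divide-by-n) central moments and non-excess
    kurtosis (a normal distribution gives 3). Other conventions exist.
    """
    n = len(suv)
    mean = sum(suv) / n
    m2 = sum((x - mean) ** 2 for x in suv) / n  # variance
    m3 = sum((x - mean) ** 3 for x in suv) / n
    m4 = sum((x - mean) ** 4 for x in suv) / n
    if m2 == 0 or mean == 0:
        raise ValueError("features are undefined for constant or zero-mean uptake")
    return {
        "cov": math.sqrt(m2) / mean,   # coefficient of variation
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,
    }


def relative_change(before, after):
    """Relative change of a metric between two scans (illustrative convention)."""
    return (after - before) / before


baseline = first_order_features([1.0, 2.0, 3.0, 4.0, 5.0])
print(baseline)
```

With features computed on the baseline scan and on a follow-up scan, `relative_change` gives one simple way to express the change in the coefficient of variation during treatment.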

Lovinfosse et al. [115] performed a retrospective study with 151 patients with rectal cancer to evaluate the relationships between textural features and tumor genetic mutational status. In the analysis, the authors used 11 textural parameters from gray-level co-occurrence matrix and neighborhood intensity difference matrix as well as MTV, TLG, and maximum, mean, standard deviation, skewness, kurtosis, and coefficient of variation of SUV of primary tumors. The results of the study revealed that only maximum, skewness, standard deviation, and coefficient of variation of SUV of primary tumors were significantly associated with RAS mutations. In contrast, volumetric PET parameters and textural features were not significantly associated with the presence of RAS mutations, suggesting the limited role of textural parameters in predicting RAS mutations of rectal cancer. Similarly, in a previous study by Kawada et al. [116] with 51 colorectal cancer patients, the KRAS/BRAF-mutated group had significantly higher maximum SUV than wild-type group, suggesting significant association between KRAS/BRAF mutation and FDG uptake of primary colorectal cancers.

In a different study, Wagner et al. [117] enrolled 50 colorectal cancer patients, 32 without hepatic metastases and 18 with hepatic metastases, and compared skewness and kurtosis of primary tumor lesions and hepatic metastatic lesions on contrast-enhanced CT and FDG PET/CT images. In their study, there were no significant differences in skewness or kurtosis of primary colorectal cancer derived from contrast-enhanced CT and FDG PET/CT images between patients with and without hepatic metastasis, and no significant associations between tumor stage and skewness and kurtosis. Only skewness and kurtosis between primary colon cancer and hepatic metastatic lesions showed significant differences.

Cervical Cancer

Cancer of the cervix uteri is the fourth most common cancer in women worldwide, and the majority of cervical cancers occur in less developed countries [25]. The standard treatment for cervical cancer is surgical resection for early cancer and concurrent chemoradiotherapy for locally advanced cancer [118]. Because adjuvant chemotherapy after chemoradiotherapy in patients with advanced cervical cancer has improved clinical outcomes by reducing distant recurrence, identifying predictors of distant recurrence is important for planning treatment and follow-up strategies [119, 120]. Lymph node metastasis was shown to be a predictive factor for distant recurrence in several studies [120, 121].

A small number of studies with cervical cancer have been performed to investigate intratumoral heterogeneity, and two of them assessed textural features for predicting lymph node metastasis [122, 123]. Shen et al. [123] evaluated 33 textural features generated from gray-level co-occurrence matrix, neighboring gray-level dependence matrix, gray-level run length matrix, and gray-level size zone matrix as well as conventional PET/CT parameters for predicting pelvic and paraaortic lymph node metastases in 170 cervical cancer patients with stage IB-IVA tumors. The authors found that homogeneity from the gray-level co-occurrence matrix was the single predictor of pelvic lymph node metastasis, whereas for paraaortic lymph node metastasis, TLG was the sole independent predictor among PET parameters and all textural parameters failed to show predictive value. In contrast, in a retrospective study with 85 stage IIB cervical cancer patients, the authors found no significant association between the presence of pelvic lymph node metastasis and intratumoral heterogeneity metrics on PET/CT including sphericity (a measure of the roundness of the shape of the region relative to a sphere), extent (a ratio of a volume of an object area to a volume of a bounding area), Shannon entropy (a measure of informational content within the individual distributions of gray-level intensities), and the accrued deviation from the smoothest gradients (a measure of deviations from homogeneity) [122]. The authors of that study suggested that the heterogeneous nature of the patient populations in other studies, which could influence the statistical distribution of heterogeneity metrics, might have led to the different results.

Mu et al. [124] assessed the staging value of intratumoral heterogeneity by using more than 50 textural features in 52 squamous cell cervical cancer patients with stage IA-IVB tumors. Using automatic classification with the support vector machine classifier, the authors showed that run percentage generated from the gray-level run length matrix was the most discriminative feature for differentiating advanced stage (stages III and IV) from early stage (stages I and II) among PET parameters, showing an accuracy of 88.10% and an AUC of 0.880.

Yang et al. [125] performed a study of 90 patients with locally advanced cervical cancer treated with chemoradiation to evaluate the predictive value of textural parameters for treatment response. They proposed a novel image metric, standardized spatial heterogeneity, which reflected the spatial patterns of intratumoral gray-level value distribution, and they compared its predictive value with that of SUV and various textural features based on gray-level histogram, gray-level co-occurrence matrix, gray-level neighborhood difference matrix, and gray-level zone size matrix. The authors found that five image metrics [standardized spatial heterogeneity, energy and entropy from the gray-level co-occurrence matrix, and gray-level non-uniformity and zone size non-uniformity (the variability of size zone volumes in the image) from the gray-level zone size matrix] showed significant differences between responders and non-responders. Non-responders exhibited greater standardized spatial heterogeneity, indicating a larger number of segregated regions of equal level of FDG uptake. On ROC curve analysis, standardized spatial heterogeneity showed the highest AUC (0.782) for differentiating responders from non-responders among the five parameters, suggesting that this new image metric has competent predictive value for clinical outcomes in cervical cancer patients treated with chemoradiation.

Another recent retrospective study of 118 locally advanced cervical cancer patients treated with chemoradiation assessed textural features for predicting local recurrence after treatment [126]. In the study, a four-feature signature that consisted of peak SUV, homogeneity, low gray-level zone emphasis (a measure of the distribution of the lower gray-level size zones), and high gray-level zone emphasis (a measure of the distribution of the higher gray-level size zones) could differentiate between patients with and without recurrence with an AUC of 0.86, which was significantly higher than that for maximum SUV alone (0.67).

Sarcoma

Sarcomas are uncommon, heterogeneous tumors of mesenchymal cell origin. There are more than 50 subtypes of sarcoma, but they can be classified into two broad types, soft tissue and bone sarcomas [127]. Although curative surgical resection can be performed for localized low-grade sarcomas, because approximately 50% of patients with intermediate- to high-grade sarcomas develop metastatic disease, neoadjuvant or adjuvant treatment with surgical resection is considered a standard treatment for most of these tumors [127, 128]. Therefore, differential diagnosis for benign versus malignant tumors and low-grade versus high-grade tumors is important in patients suspected of sarcoma. Morphological imaging modalities, specifically CT and MRI, are known to have limitations in differential diagnosis, and FDG PET/CT using tumor lesion SUV has been reported to better guide differential diagnosis [129, 130]. However, a certain amount of overlap in FDG uptake exists between malignant and benign tumors and between low-grade and high-grade sarcomas [129,130,131,132].

Two recent studies have assessed the value of intratumoral heterogeneity for differentiating benign and malignant bone and soft tissue tumors because malignant tumors have more heterogeneous FDG uptake than do benign tumors [132, 133]. In a previous study of 85 patients with pathologically proven musculoskeletal tumors, AUC-CSH on PET/CT showed significantly higher accuracy (75%) and a larger AUC (0.71) for diagnosing malignant tumors than did maximum SUV (64% and 0.60, respectively) and mean SUV (57% and 0.51, respectively) [132]. Xu et al. [133] investigated the diagnostic value of 19 textural features on both PET and CT images generated from gray-level histogram, gray-level dependence matrix, and neighborhood gray-tone difference matrix in 103 patients with benign and malignant bone and soft tissue tumors. Entropy and coarseness showed more discriminative power than did SUV for differential diagnosis. Furthermore, using a support vector machine classifier, combining the entropy and coarseness of PET images with the entropy and correlation of CT images showed significantly higher accuracy (82.52%) for detecting malignant tumors than did maximum SUV (63.11%), CT textural parameters (72.82%), or PET textural parameters (74.76%) alone.

Sagiyama et al. [134] performed a prospective study of 35 patients with soft tissue tumors who were imaged on a PET/MRI system to evaluate the usefulness of voxel-based PET and MRI parameters for differentiating high-grade from low-grade and intermediate-grade sarcomas. Using the SUV on PET and the apparent diffusion coefficient (ADC) on MRI for every voxel of tumor lesions, the authors calculated the correlation coefficients between SUV and ADC and heterogeneity, defined as the elliptical 95% area of bivariate normal distribution of SUV and ADC, for each patient, in addition to tumor volume, maximum SUV, and minimum ADC. The study results demonstrated that the correlation coefficient between SUV and ADC was the only parameter that significantly differed between high-grade and low-grade to intermediate-grade tumors, showing a sensitivity of 96.0% and an accuracy of 85.7% in detecting high-grade sarcomas.
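The two voxel-based quantities described above can be sketched under explicit assumptions. The code below computes the Pearson correlation between paired voxel SUV and ADC values and the area of the 95% ellipse of a bivariate normal distribution fitted to them, using the standard result that this area equals pi times the 0.95 quantile of a chi-square distribution with 2 degrees of freedom times the square root of the determinant of the covariance matrix. Whether the authors of [134] computed the ellipse in exactly this way is an assumption, and the function name `suv_adc_metrics` is hypothetical.

```python
import math


def suv_adc_metrics(suv, adc):
    """Correlation and 95% ellipse area for paired voxel SUV and ADC values.

    Assumptions: sample (n - 1) covariance estimates and a bivariate normal
    model, for which the 95% ellipse area is pi * q * sqrt(det(cov)) with
    q the 0.95 quantile of a chi-square distribution with 2 degrees of
    freedom (q = -2 * ln(0.05)).
    """
    if len(suv) != len(adc) or len(suv) < 2:
        raise ValueError("need at least two paired voxel values")
    n = len(suv)
    mean_s = sum(suv) / n
    mean_a = sum(adc) / n
    var_s = sum((s - mean_s) ** 2 for s in suv) / (n - 1)
    var_a = sum((a - mean_a) ** 2 for a in adc) / (n - 1)
    cov_sa = sum((s - mean_s) * (a - mean_a) for s, a in zip(suv, adc)) / (n - 1)
    if var_s == 0 or var_a == 0:
        raise ValueError("correlation is undefined for constant values")
    r = cov_sa / math.sqrt(var_s * var_a)
    q95 = -2.0 * math.log(0.05)
    # Clamp tiny negative determinants caused by floating-point rounding.
    det = max(var_s * var_a - cov_sa ** 2, 0.0)
    return r, math.pi * q95 * math.sqrt(det)


r, area = suv_adc_metrics([1.0, -1.0, 0.0, 0.0], [0.0, 0.0, 1.0, -1.0])
print(r, area)
```

A strongly negative correlation with a small ellipse area would indicate voxels lying close to a single SUV-ADC line, whereas a large area indicates widely scattered voxel pairs.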

Vallieres et al. [135] investigated the potential role of textural features of FDG PET and MRI for predicting lung metastasis in 51 patients with soft-tissue sarcoma. The authors extracted a total of 41 textural features from intensity-histogram, gray-level co-occurrence matrix, gray-level run-length matrix, gray-level size zone matrix, and neighborhood gray-tone difference matrix from five different image types: FDG PET, T1-weighted (T1) MRI, T2-weighted fat-suppression (T2FS) MRI, fused FDG PET/T1 images, and fused FDG PET/T2FS images. They found that textural features from fused FDG PET/T1 and FDG PET/T2FS images had high predictive value for lung metastasis, with an AUC of 0.984, sensitivity of 95.5%, and specificity of 92.6%.

Lymphoma

In both Hodgkin’s and non-Hodgkin’s lymphoma, FDG PET/CT is one of the standard imaging modalities for evaluating the lymphoma extent and treatment response [136]. Furthermore, FDG PET/CT has been used to evaluate bone marrow infiltration of lymphoma, assess systemic inflammatory response to lymphoma involvement, and define radiation treatment targets, and the intensity of lymphoma FDG uptake has been shown to be the significant predictor of survival in various types of lymphoma including diffuse large B-cell lymphoma and T-cell lymphoma [137,138,139,140,141].

Lartizien et al. [142] investigated the diagnostic performance of FDG PET for detecting lymphomatous disease sites in 25 lymphoma patients using textural analysis. They extracted 115 geometrical and textural features from both PET and CT images and used two supervised classifiers, support vector machine and random decision forest, to select features and assess diagnostic performance. The authors demonstrated a high diagnostic ability in discriminating lymphomatous disease sites from physiologic uptake sites and inflammatory non-lymphomatous sites using the support vector machine classifier, with AUCs between 0.91 and 0.97. Watabe et al. [143] used AUC-CSH to differentiate lymphoma from gastrointestinal stromal tumors in 21 patients who had large abdominal tumors of more than 5.0 cm. Gastrointestinal stromal tumors showed more heterogeneous FDG uptake and significantly lower AUC-CSH than did lymphoma, and the authors concluded that intratumoral heterogeneity on FDG PET/CT could aid in discriminating between these two tumors.

In one recent study, Ko et al. [144] retrospectively evaluated the value of pretreatment FDG PET/CT for predicting PFS using 64 conventional and textural parameters from co-occurrence matrix, intensity size zone matrix, and neighborhood intensity difference matrix in 17 patients with nasal type extranodal natural killer/T-cell lymphoma. These authors found that dissimilarity and low-intensity short-zone emphasis were independent predictors of PFS, whereas conventional parameters including MTV and TLG failed to show correlations with PFS.

A study by Hanaoka et al. [145] compared radiotracer accumulation and heterogeneity, using skewness and kurtosis of voxel distribution and AUC-CSH, of FDG PET/CT and 111In-ibritumomab tiuxetan single photon emission computerized tomography (SPECT)/CT with the tumor response in 16 patients treated with 90Y–ibritumomab tiuxetan therapy. On FDG PET/CT, only maximum SUV showed significant differences between responders and non-responders; skewness, kurtosis, and AUC-CSH showed no significant differences between the groups. In contrast, on 111In-ibritumomab tiuxetan SPECT/CT, non-responders showed more heterogeneous distribution of radiotracer uptake than did responders, although there was no significant difference in maximum radiotracer uptake between them.

Current Limitations

Although a number of studies of malignant diseases have emerged and have shown the clinical implications of intratumoral heterogeneity on FDG PET/CT, there are several limitations in evaluating intratumoral heterogeneity using FDG PET. First, because PET/CT has lower spatial resolution than that of anatomical imaging modalities such as CT and MRI, the reliability of heterogeneity parameters in tumors with small volume is limited [9, 58]. A previous study by Brooks et al. [146] concluded that tumors with volumes less than 45 cm3 can profoundly bias comparisons of heterogeneity metrics. Establishing recommendations on methodological choices according to tumor volume would be needed rather than applying a uniform method to all tumors [146]. Furthermore, developing PET devices and reconstruction protocols that overcome this limitation in small tumors is required in the long term. Second, the repeatability and reproducibility of textural features can be of concern in the clinical setting. In one recent study, repeatability was highly variable among textural features, and numerous metrics were identified as only poorly to moderately reliable on repeatability analyses [147]. Moreover, various factors including PET image reconstruction methods, quantization processes, tumor segmentation methods, partial volume effect correction, and the time of acquisition post-injection of FDG have been shown to affect the values of textural features [12, 147,148,149,150,151]. Generating and organizing a standard imaging protocol, image preprocessing, and analytic method could resolve this problem [9]. Third, because the vast majority of clinical studies that investigated intratumoral heterogeneity retrospectively enrolled small numbers of patients in diverse stages with different treatment modalities, it is necessary to validate these studies’ findings against external cohorts to establish the clinical significance of intratumoral heterogeneity. A previous study by Chalkidou et al. [152] estimated that the average type I error probability in published studies on the relationships between survival and textural parameters derived from CT and PET images was 76% (range: 34–99%) and did not reach statistical significance. Fourth, because a number of heterogeneity metrics correlate with each other or with MTV, more sophisticated analytic methods such as machine-learning techniques are necessary to properly select and combine textural features and clarify their complementary value [9, 56, 64]. Finally, the definitions and nomenclature of textural features are inconsistent throughout studies, which makes it difficult to compare their results [9].
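The dependence of textural features on the quantization process can be illustrated with a small sketch. Here the same voxel values are discretized into a fixed number of equal-width bins and the Shannon entropy of the resulting gray-level histogram is computed; changing only the number of bins changes the feature value. The fixed-bin-number scheme and the function names `discretize` and `histogram_entropy` are illustrative assumptions, not a description of any specific study's preprocessing.

```python
import math
from collections import Counter


def discretize(values, n_bins):
    """Map values to integer gray levels 0..n_bins-1 using equal-width bins
    between the minimum and maximum value (fixed-bin-number scheme)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # The maximum value falls on the upper edge, so clamp it to the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def histogram_entropy(levels):
    """Shannon entropy (in bits) of the gray-level histogram."""
    n = len(levels)
    counts = Counter(levels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


voxels = [0.0, 1.0, 2.0, 3.0]
for bins in (2, 4):
    print(bins, histogram_entropy(discretize(voxels, bins)))
```

In this toy case the entropy doubles when going from 2 to 4 bins although the underlying uptake values are identical, which is one reason why feature values from studies with different quantization settings are difficult to compare directly.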

Conclusion

FDG PET/CT is a non-invasive imaging modality that can provide much information on the characteristics of cancer lesions. Using numerous imaging features based on radiomics, we can quantify and investigate the intratumoral heterogeneity of cancer lesions on FDG PET/CT. Many clinical studies regarding intratumoral heterogeneity, mostly textural features, have been recently published in the literature and have demonstrated the clinical value of intratumoral heterogeneity in diagnosing, evaluating treatment response, and predicting survival in a variety of cancers, complementing the roles of the conventional PET parameters. Radiomics is a promising method for providing personalized medicine and enhancing cancer management. However, multicenter studies with large patient populations are needed for further validation.