Introduction: Role of Radiomics in Precision Medicine

Tumors are biologically complex and show phenotypic and genomic heterogeneity between different tumors and even within an individual tumor. In other words, tumors of the same histopathological cell type can still show vast variations in imaging features including vascularity, contrast enhancement, and necrosis. In parallel, such variations have also been reported in the genetic profile of cancers. This genetic variation has become of great interest because patient-centered chemotherapy based on patient-specific tumor cell mutations, an approach called precision medicine, has recently been introduced and shows excellent results. Thus, during the past decade, large database studies have shifted the concept of cancer diagnosis from traditional histopathological cell type to a new classification based on molecular genetic data [1,2,3]. However, cancer treatment based on these results often fails due to the ability of tumor cells to acquire subclonal mutations during tumor evolution. Therefore, the key factor leading to successful precision medicine lies in a clear understanding of each patient’s tumoral heterogeneity and individual situation [4]. Accordingly, robust biomarkers are required to obtain a better understanding of the evolving biology of cancer.

During the last decade, dramatic advancements in high-throughput computing and automated pipeline systems have been introduced. Such advancements, especially in computed tomography (CT), have made it possible to extract innumerable quantitative features from medical CT images, a discipline known as radiomics. Thus, by extracting radiomics features, a great deal of information hidden within the layers of conventional CT images can be revealed for clinical use. Although radiomics can be applied to various conditions, its potential has been most promising in the field of oncology. Multiple studies using a radiomics approach have shown that quantitative features offer better characterization of the tumor, more precise prognosis assessment, and improved prediction of drug resistance [5,6,7]. Texture analysis has also been shown to be a highly significant independent predictor of survival in patients with non-small cell lung cancer [8]. Intratumor heterogeneity, which is nearly ubiquitous in malignant tumors, is a key challenge in cancer medicine. Genetic heterogeneity of a malignant tumor leads to regional differences in the stromal architecture or function of individual tumors, and imaging can quantify these adverse spatial features and functional heterogeneity through measurement of quantitative features [9, 10]. In other words, quantitative tumor characteristics observable on medical imaging reflect the molecular, cellular, and tissue components, which might ultimately advance our understanding of the evolving biology of the whole tumor. In this article, we review the methodology of CT radiomics and discuss its application in thoracic oncology.

Methodology of CT Radiomics in Oncology

Radiomics is a quantitative, noninvasive method of revealing information embedded within conventional CT images performed clinically for diagnosis and preoperative planning. Radiomics data can be used to build descriptive and predictive clinical models relating imaging characteristics to tumor biology phenotypes. Although conceptually simple, each step of radiomics has its own challenges and applications. Furthermore, clinical translation of CT radiomics is a complex undertaking that requires the coordinated efforts of the radiologist, computer scientist, and oncologist (Figs. 1 and 2).

Fig. 1

Overview of radiomics in lung cancer. Whole tumors are segmented by drawing regions of interest that trace the tumor edge, and quantitative features are extracted within the defined tumor contours on CT images. Relationships among the radiomics features, clinical data, and genomic data are analyzed.

Fig. 2

Schematic diagram of quantitative analysis of diffuse lung disease in lung cancer patients. The whole lung and lobes are segmented, and quantitative features are extracted using histogram-based or texture-based methods. Relationships among the quantitative features, clinical data, and genomic data are analyzed.

Steps

  1)

    Image acquisition

    Image acquisition is the first step in the practice of radiomics. One major challenge in this step is the wide variation in image acquisition parameters including radiation dose, scanning protocol, reconstruction algorithm, and slice thickness used in routine clinical practice. Yan et al. successfully identified several features that remained stable even at different PET image reconstruction settings [11], of which peak standardized uptake value (SUVpeak), SUVmean, multiple texture features, and entropy were the most robust. However, comparison of radiomics features extracted from different methods of image acquisition needs further investigation.

  2)

    Segmentation

    The next step is to define the region of interest (ROI) that contains the whole tumor or subregions within the tumor, a process called tumor segmentation. This is generally not a problem for solid tumors with definite tumor margins. However, when tumors have indistinct borders, e.g., peripheral ground glass opacity (GGO) in invasive lung adenocarcinoma, identification of the tumor margin becomes a much more complex task [12].

    In addition, particular attention should be paid to whole lung and lobe segmentation, which provides the advantage of predicting postoperative residual lung function, morbidity, and mortality. For lobe segmentation, the first step is to segment the lung region including the lung parenchyma, airways, and vessels by applying an airway threshold. Next, the major airways and vessels are removed to separate the left and right lungs. Fissure detection is essential for accurate lobe segmentation and is based on image intensity computed from the local neighborhoods around each voxel and anatomical information, such as the airways and vasculature [13]. Segmentation of the airways can be performed by manual, semiautomatic, or automatic methods. Manual segmentation is extremely time consuming. Region growing and wave propagation are common methods of airway segmentation based on threshold (cutoff) pixel values in Hounsfield units (HU). A morphology-based method is also used for segmentation of airways [14]. Some additional algorithms have been developed to improve this basic segmentation process. The luminal segmentation is condensed to the centerline that runs exactly through the center of the airway [15]. Airway branches are identified by detection of the divergence of each point on the skeleton, and the airway is labeled or classified by the identified airway branches [16].
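As a rough illustration of threshold-based region growing, the sketch below grows a 4-connected region on a single 2D slice starting from a seed pixel inside the airway lumen. The function name `region_grow`, the toy slice, and the cutoff of -950 HU are illustrative assumptions, not values taken from the cited studies; practical airway segmentation works in 3D and adds leakage control.

```python
from collections import deque

def region_grow(image, seed, max_hu=-950):
    """Return the set of (row, col) pixels 4-connected to `seed` with HU <= max_hu.

    `image` is a list of rows of HU values; `max_hu` is an assumed cutoff.
    """
    rows, cols = len(image), len(image[0])
    if image[seed[0]][seed[1]] > max_hu:
        return set()  # the seed itself is not air-like, nothing to grow
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in region and image[nr][nc] <= max_hu:
                region.add((nr, nc))
                queue.append((nr, nc))
    return region

# Toy slice: soft tissue (40 HU) around an air-filled lumen, plus one
# disconnected air pixel at (1, 4) that the seed cannot reach.
toy_slice = [
    [40,   40,   40, 40,   40],
    [40, -990, -985, 40, -995],
    [40, -980,   40, 40,   40],
    [40,   40,   40, 40,   40],
]
lumen = region_grow(toy_slice, seed=(1, 1))
```

Because growth is restricted to connected pixels, the isolated air pixel is excluded even though it passes the threshold, which is the property that distinguishes region growing from plain thresholding.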

  3)

    Feature extraction

    After accurate tumor segmentation, a nearly limitless supply of radiomics features can be extracted from the identified tumor ROI. The advantage of including radiomics features in the field of oncology is quite clear: quantitative features will allow better tumor characterization and can objectively reveal valuable patterns reflecting the tumor biology that are hard to detect with the human eye. Furthermore, extracted radiomics features are constantly being refined and developed [17,18,19]. We will discuss major types of currently available radiomics features in detail later in this article.

  4)

    Feature selection

    Having extracted a massive number of radiomics features, the next step is to capture the true clinical value of such features. Although the full potential of extracted features has yet to be realized, they have been shown to have associations with cancer detection, diagnosis, prognosis assessment, and even monitoring of treatment response [5, 7, 17, 20]. Commonly used methods are the least absolute shrinkage and selection operator (LASSO), principal component analysis, and random forest. The challenge in this step is the considerable variability in predictive performance that has been reported for the different methods of feature selection and classification [21]. Therefore, the goal is to select the most useful radiomics features for clinical translation in the field of oncology.
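To make the idea of LASSO-based selection concrete, here is a minimal sketch using plain coordinate descent. It assumes the features and the outcome have already been centered (no intercept is fitted), and the function names, the toy data, and the penalty `alpha=0.1` are hypothetical. Features whose coefficient is shrunk exactly to zero are treated as not selected.

```python
def soft_threshold(value, alpha):
    """Shrink `value` toward zero by `alpha` (the LASSO proximal step)."""
    if value > alpha:
        return value - alpha
    if value < -alpha:
        return value + alpha
    return 0.0

def lasso_coordinate_descent(X, y, alpha, n_iter=100):
    """Minimize (1/2n)*||y - Xw||^2 + alpha*||w||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # squared norm of feature j
            for i in range(n):
                partial = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - partial)
                z += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, alpha) / (z / n) if z > 0 else 0.0
    return w

# Toy data: feature 0 drives the outcome (y = 2 * x0); feature 1 is unrelated.
X = [[-1.5, 1.0], [-0.5, -1.0], [0.5, -1.0], [1.5, 1.0]]
y = [-3.0, -1.0, 1.0, 3.0]
weights = lasso_coordinate_descent(X, y, alpha=0.1)
selected = [j for j, wj in enumerate(weights) if wj != 0.0]
```

In this toy case only the informative feature keeps a non-zero weight. With real radiomics data the penalty would normally be chosen by cross-validation, which is one source of the variability between selection methods noted above.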

Types of Radiomics Features

We present five major classes of radiomics features: (a) morphological, (b) statistical, (c) regional, (d) model-based, and (e) skeleton features [22]. Morphological features provide detailed information about the shape and volume of a tumor. Features calculated by statistical methods can be further classified into first-order statistical (histogram) features and higher-order statistical (texture) features. Regional features can allow quantification beyond the immediate neighborhood and represent intratumor clonal heterogeneity by subregional clustering. Model-based features are extracted using mathematical approaches, such as the fractal model. Skeleton features provide information about the alteration, shape, thickness, and narrowing of airways.

In the subsequent subsections, we briefly summarize the details of each class of feature.

  1)

    Morphological features

    Morphological features are used to define the physical characteristics of a tumor. For example, the roundness of a tumor can be quantified using features such as spherical disproportion, sphericity, and discrete compactness. Surface area can be calculated by triangulation, which is a technique of generating a net of triangles that completely covers the tumor surface. In terms of spiculation, a larger surface-to-volume ratio indicates a more spiculated and irregular tumor, while a lower surface-to-volume ratio indicates a smoother and rounder tumor. Another morphological feature of interest is tumor mass, a parameter that integrates volume and density. A wide spectrum of lung adenocarcinomas, the most common histologic type of lung cancer, manifest as sub-solid nodules including a GGO portion. Tumor mass measurement enables the detection of GGO growth earlier than traditional measurements [23, 24].

    Laplacian of Gaussian is a spatial filtering technique that enhances the marginal features from surrounding regions. This technique enables quantitative analysis regarding tumor margin characterization, which can reflect the relationship between tumor and surrounding tissue and thus the tumor microenvironment.
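The shape descriptors above reduce to simple formulas once volume and surface area are known. The sketch below uses the standard sphericity definition and, as an assumption (the text gives no formula), a commonly used definition of tumor mass as volume multiplied by (mean HU + 1000) / 1000. The function names are illustrative.

```python
import math

def sphericity(volume, surface_area):
    """Ratio of the surface of an equal-volume sphere to the actual surface (1.0 for a sphere)."""
    return (math.pi ** (1 / 3)) * (6 * volume) ** (2 / 3) / surface_area

def surface_to_volume_ratio(volume, surface_area):
    """Higher values suggest a more irregular or spiculated surface."""
    return surface_area / volume

def tumor_mass_mg(volume_mm3, mean_hu):
    """Assumed definition: mass (mg) = volume (mm^3) * (mean HU + 1000) / 1000."""
    return volume_mm3 * (mean_hu + 1000) / 1000

# A perfect sphere of radius 10 mm has sphericity 1.
radius = 10.0
sphere_volume = 4 / 3 * math.pi * radius ** 3
sphere_area = 4 * math.pi * radius ** 2
round_score = sphericity(sphere_volume, sphere_area)

# A 1000 mm^3 ground-glass nodule with mean attenuation of -500 HU.
ggo_mass = tumor_mass_mg(1000, -500)
```

Under this definition, a ground-glass nodule that becomes denser without changing in volume still shows an increase in mass, which is why mass can reveal growth earlier than diameter or volume alone.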

  2)

    Statistical features

  a.

    First-order histogram features

    The basis of first-order statistics is a histogram, which is a simple plot of tumor pixel attenuation along one axis versus the frequency of pixels at each attenuation value along the other axis. Thus, a histogram displays the range and frequency of pixel values within the defined lesion ROI. Multiple features including mean, median, standard deviation, kurtosis, skewness, energy, entropy, uniformity, and variance can be calculated from this histogram, and most features are reported to be reproducible [25].

    Constructing a histogram from conventional CT images is easy, and histogram analysis yields multiple quantitative features; therefore, histogram-based features have been used widely in the field of oncology. Quantitative features from the histogram demonstrate information from the voxel level, which can reflect subtle changes in lung cancers. However, a major limitation of histogram-based features is the loss of spatial information about each voxel.
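A minimal sketch of first-order feature extraction from the voxel values inside an ROI is given below. The function name and toy values are illustrative. Definitions of some features differ between software packages; here energy is assumed to be the sum of squared voxel values, and entropy and uniformity are computed from the discrete histogram.

```python
import math
import statistics
from collections import Counter

def first_order_features(values):
    """Compute histogram-based features from a flat list of ROI voxel values."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    std = math.sqrt(variance)
    # Standardized third and fourth moments (0.0 if the ROI is constant).
    skewness = sum((v - mean) ** 3 for v in values) / n / std ** 3 if std > 0 else 0.0
    kurtosis = sum((v - mean) ** 4 for v in values) / n / std ** 4 if std > 0 else 0.0
    # Probability of each distinct value, i.e. the normalized histogram.
    probs = [count / n for count in Counter(values).values()]
    return {
        "mean": mean,
        "median": statistics.median(values),
        "std": std,
        "variance": variance,
        "skewness": skewness,
        "kurtosis": kurtosis,
        "energy": sum(v * v for v in values),
        "entropy": -sum(p * math.log2(p) for p in probs),
        "uniformity": sum(p * p for p in probs),
    }

# Toy ROI with two equally frequent intensity values.
toy_features = first_order_features([1, 1, 2, 2])
```

Note that shuffling the voxel order leaves every one of these values unchanged, which is the loss of spatial information described above.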

  b.

    Higher-order texture features

    In contrast to histogram features, higher-order texture features denote spatial information about each voxel. A gray level co-occurrence matrix (GLCM) is constructed using the number, distance, and angle of a combination of gray levels in the image. From the GLCM, features of cluster, correlation, contrast, energy, and entropy can be extracted. A gray level run length matrix (GLRL) characterizes continuous voxels with the same gray level in any direction. From the GLRL, features such as long run emphasis, short run emphasis, run length non-uniformity, gray level non-uniformity, and run percentage can be extracted. The neighborhood gray-tone difference matrix (NGTDM) uses the intensity values of a neighborhood instead of one voxel to represent how similar or dissimilar voxel intensities are within a neighborhood. Features of busyness, complexity, and texture strength can be extracted from the NGTDM. There is a large body of literature on texture analysis showing an association with tumor stage, metastasis, treatment response, survival, and molecular genetic profiles in lung cancer [8, 26,27,28,29,30].
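The construction of a GLCM can be sketched as follows for a 2D image whose intensities have already been quantized to integer gray levels. The function names, the single offset (one pixel to the right), and the symmetric counting are illustrative assumptions; radiomics software typically averages over several distances and angles in 3D.

```python
import math

def glcm(image, levels, dx=1, dy=0):
    """Normalized, symmetric gray level co-occurrence matrix for offset (dx, dy)."""
    rows, cols = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count each pair in both directions
    total = sum(sum(row) for row in counts)
    return [[value / total for value in row] for row in counts]

def glcm_features(p):
    """Contrast, energy, and entropy of a normalized GLCM."""
    levels = len(p)
    cells = [(i, j, p[i][j]) for i in range(levels) for j in range(levels)]
    return {
        "contrast": sum((i - j) ** 2 * pij for i, j, pij in cells),
        "energy": sum(pij ** 2 for _, _, pij in cells),
        "entropy": -sum(pij * math.log2(pij) for _, _, pij in cells if pij > 0),
    }

# Two toy images with identical histograms but different spatial arrangement.
stripes = [[0, 0], [1, 1]]
checkerboard = [[0, 1], [1, 0]]
stripe_features = glcm_features(glcm(stripes, levels=2))
checker_features = glcm_features(glcm(checkerboard, levels=2))
```

Both toy images contain the same number of each gray level, so their first-order features are identical, yet the GLCM contrast differs (0 for the stripes, 1 for the checkerboard) because the matrix encodes which gray levels sit next to each other.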

  3)

    Regional features

    As mentioned above, a great deal of heterogeneity exists even within a single tumor. Intratumoral heterogeneity is important because certain subregions can initiate cancer cell transformation leading to tumor progression. Intratumoral heterogeneity can be exhibited by mapping the spatial distribution of similar gray level intensities within a tumor, namely regional features. In other words, regional features demonstrate the number of subregions and how often certain subregions occur within a tumor. Methods of subregional partitioning include data-driven segmentation and the use of threshold values [6, 10, 31]. Data-driven segmentation groups voxels with similar intensity into clusters, and threshold values are also used to group voxels into clusters.
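As one possible illustration of threshold-based partitioning, the sketch below assigns each pixel to an intensity class using cutoff values and then counts the connected zones in each class along with the fraction of the tumor each class occupies. The function name, the single cutoff of -300 HU, and the toy ROI are hypothetical; data-driven methods would replace the fixed cutoffs with clusters learned from the data.

```python
from bisect import bisect_right

def partition_subregions(image, thresholds):
    """Return (zones_per_class, fraction_per_class) for a 2D ROI.

    `thresholds` is a sorted list of cutoffs; class k holds values between
    thresholds[k-1] and thresholds[k]. Zones are 4-connected.
    """
    rows, cols = len(image), len(image[0])
    labels = [[bisect_right(thresholds, v) for v in row] for row in image]
    n_classes = len(thresholds) + 1
    zone_counts = [0] * n_classes
    voxel_counts = [0] * n_classes
    seen = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            cls = labels[r][c]
            voxel_counts[cls] += 1
            if seen[r][c]:
                continue
            # New zone found: flood-fill all connected pixels of the same class.
            zone_counts[cls] += 1
            seen[r][c] = True
            stack = [(r, c)]
            while stack:
                y, x = stack.pop()
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and labels[ny][nx] == cls):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
    total = rows * cols
    return zone_counts, [count / total for count in voxel_counts]

# Toy part-solid nodule: ground-glass (-700 HU) and solid (30 HU) pixels.
toy_roi = [
    [-700, -700,   30],
    [-700,   30,   30],
    [  30, -700, -700],
]
zones, fractions = partition_subregions(toy_roi, thresholds=[-300])
```

The zone counts capture how fragmented each intensity class is, while the fractions capture how much of the tumor it occupies; together they give a simple numerical summary of the spatial distribution described above.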

  4)

    Model-based features

    Fractal analysis characterizes the shape complexity of an object over a range of scales. In other words, fractal dimension is a mathematical measure that reflects the intrinsic shape of an object. In this context, the morphological complexity and spatial heterogeneity of tumors can be quantified and assigned a numerical value.

    Advantages of fractal dimension are that it is relatively stable, less susceptible to noise than other features, and can be used for longitudinal assessment in a single patient [32]. Another feature of interest is the recently developed fractal signature dissimilarity method, which has been suggested as a novel image texture analysis technique [33]. In that study, the fractal signature dissimilarity method was used to quantitatively assess contrast agent uptake heterogeneity dynamics, indicating a potential role in monitoring the early response to anti-angiogenesis treatment [33].
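One common way to estimate a fractal dimension is box counting, sketched below for a 2D binary mask. This is offered as an assumed example of the general idea, not as the method of the cited studies; the function names and box sizes are illustrative. The estimate is the slope of log(number of occupied boxes) against log(1 / box size).

```python
import math

def box_count(mask, size):
    """Number of size-by-size boxes that contain at least one foreground pixel."""
    boxes = set()
    for r, row in enumerate(mask):
        for c, value in enumerate(row):
            if value:
                boxes.add((r // size, c // size))
    return len(boxes)

def fractal_dimension(mask, sizes=(1, 2, 4, 8)):
    """Least-squares slope of log N(size) versus log(1/size)."""
    counts = [box_count(mask, s) for s in sizes]
    if min(counts) == 0:
        raise ValueError("mask must contain at least one foreground pixel")
    xs = [math.log(1.0 / s) for s in sizes]
    ys = [math.log(n) for n in counts]
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator

# Sanity checks on shapes with known dimension: a filled square (2) and a line (1).
filled_square = [[1] * 8 for _ in range(8)]
single_line = [[1] * 8] + [[0] * 8 for _ in range(7)]
dim_square = fractal_dimension(filled_square)
dim_line = fractal_dimension(single_line)
```

Irregular tumor outlines or heterogeneous enhancement maps would be expected to fall between these two reference values, with more complex shapes giving higher estimates.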

  5)

    Skeleton features

    Skeletonization, also referred to as medial axis extraction, is widely used in computerized shape analysis [16]. Quantitative measurement of the airways follows segmentation, which is needed to accurately locate the inner airway; the measurement is then computed on planes perpendicular to the targeted bronchi [14, 34]. The full-width-at-half-maximum (FWHM) method, based on the distance between the two points at which the HU value equals half of its maximum, is most commonly used to find the inner and outer boundaries of the airway wall and calculate the airway wall dimensions [15, 34]. The luminal area and wall area percentage (WA%) are automatically extracted and have been used for quantification of airway wall thickening and airway narrowing [34]. Bifurcation angle and airway luminal circularity are used to identify the alteration of airway skeletal structure and heterogeneous airway luminal shape [35].
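The FWHM idea can be sketched on a one-dimensional HU profile sampled along a ray from the lumen center outward. The version below is an assumption about one reasonable implementation: the inner edge is placed halfway between the lumen minimum and the wall peak, the outer edge halfway between the wall peak and the parenchymal minimum, with linear interpolation between samples. WA% is assumed to follow the usual definition of wall area divided by total airway area. All names and values are illustrative.

```python
def fwhm_edges(profile):
    """Return (inner, outer) wall edge positions, in sample units, along a ray.

    `profile` must contain a wall peak with lower HU values on both sides.
    """
    peak_idx = max(range(len(profile)), key=profile.__getitem__)
    peak = profile[peak_idx]
    if peak_idx == 0 or min(profile[peak_idx:]) == peak:
        raise ValueError("profile needs a wall peak with lower values on both sides")
    inner_half = (peak + min(profile[:peak_idx + 1])) / 2
    outer_half = (peak + min(profile[peak_idx:])) / 2
    # Walk from the peak toward the lumen until the half-maximum is crossed.
    i = peak_idx
    while profile[i - 1] > inner_half:
        i -= 1
    inner = (i - 1) + (inner_half - profile[i - 1]) / (profile[i] - profile[i - 1])
    # Walk from the peak toward the parenchyma until the half-maximum is crossed.
    j = peak_idx
    while profile[j + 1] > outer_half:
        j += 1
    outer = j + (profile[j] - outer_half) / (profile[j] - profile[j + 1])
    return inner, outer

def wall_area_percent(lumen_area, wall_area):
    """WA% = wall area / (wall area + lumen area) * 100."""
    return 100.0 * wall_area / (wall_area + lumen_area)

# Toy ray: air-filled lumen, airway wall peaking at 200 HU, then lung parenchyma.
toy_profile = [-1000, -1000, -400, 200, -300, -800, -800]
inner_edge, outer_edge = fwhm_edges(toy_profile)
wall_thickness = outer_edge - inner_edge  # in sample units
toy_wa = wall_area_percent(lumen_area=20.0, wall_area=30.0)
```

Multiplying the thickness by the sampling interval gives a physical wall thickness, and repeating the measurement over many rays around the centerline yields the lumen and wall areas that enter WA%.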

Clinical Application of CT Radiomics in Oncology

Radiomics Approach to Lung Cancer

In 2011, the International Association for the Study of Lung Cancer (IASLC), the American Thoracic Society (ATS), and the European Respiratory Society (ERS) introduced a new classification for lung adenocarcinomas. A vast volume of literature has covered sub-solid nodules, namely nodules with a GGO component, which correlate with the spectrum of lung adenocarcinoma. CT findings of early-stage lung adenocarcinomas and their precursors are usually pure GGO nodules or part-solid nodules. Thus, the imaging spectrum of GGO reflects the evolving process of adenocarcinoma from preinvasive lesions caused by the accumulation of gene mutations. However, discrimination between the invasive and non-invasive proportions is challenging in GGO lesions due to limited visual perception and subjective analysis of conventional CT scans [36, 37]. Multiple investigations have shown that quantitative radiomics features of GGO lesions can help find small pathologically invasive components that are hard to visually perceive at the medical imaging voxel level [7, 18, 38]. Entropy or a high attenuation value, such as the 75th percentile CT attenuation value from histograms, has been reported as a significant discrimination factor for invasive adenocarcinomas [7]. Furthermore, the 97.5th percentile CT attenuation value and the slope of CT attenuation values have been suggested as predictors for future CT attenuation changes and the growth rate of pure GGO lesions [39]. Therefore, lung cancer-specific (GGO-related) radiomics features can provide additional information about tumor invasiveness and progression, help distinguish such tumors from indolent or non-invasive lesions, and can even predict tumor growth.

In addition, regional features have shown great potential in depicting the spatial heterogeneity of cancers. By grouping similar voxels together, multiple subregions that respond differently to therapy or result in tumor progression can be revealed. In fact, in one recent study, researchers were able to identify clinically relevant high-risk subregions in lung cancer using intratumor partitioning of 18F-FDG PET and CT images [31].

Tissue stiffness is a widely accepted biomechanical property of fibrotic tumors and affects tumor growth, invasion, metastasis, and treatment. Analysis of tissue displacement after disruption of the confining structure showed that solid stress depends on both cancer cell type and the microenvironment; solid stress increased with tumor size, and mechanical confinement by the tumor surroundings substantially contributes to intratumoral solid stress [40].

Radiomics features have also shown favorable results when linked to underlying genomic alterations. Features of tumor size, edge shape, and sharpness showed the highest prognostic significance to predict metagenes in patients with non-small cell lung cancer [30]. Another study using semantic features and clinical variables was able to predict patients with ALK rearrangements [41]. Finally, Yoon et al. combined radiomics features and clinical information to successfully predict oncogenic fusion genes in lung cancer [42].

Prediction of Postoperative Lung Function or Postoperative Morbidity

Prediction of postoperative lung function plays a key role in the preoperative evaluation of lung cancer patients with impaired lung function in order to identify an increased risk of postoperative complication and mortality [43]. Currently, postoperative lung function is predicted using spirometry, including forced expiratory volume in 1 s and diffusing capacity for carbon monoxide, and radionuclide lung scanning [44]. An accurate prediction of postoperative pulmonary function is considered for conditions with inhomogeneous effective pulmonary function such as pulmonary emphysema or interstitial lung disease (ILD) [45]. Quantitative CT can be used to calculate the volume of regional and total functional lung, enabling the normal functional volume to be distinguished from the non-functional volume resulting from emphysema, tumor, and atelectasis using histogram-based lung densitometry, separately in the resected lobe and the remaining lung regions [46, 47]. The effectiveness of quantitative CT for predicting postoperative lung function was first proposed by Wu et al. [48]. Further studies have demonstrated the role of quantitative CT in predicting postoperative lung function, and its prediction seems to correlate well with perfusion scintigraphy and pulmonary function tests [43, 49,50,51]. A recent study showed that volumetry from inspiration/expiration CT could be useful for prediction of postoperative lung function [45]. Although lung lobectomy results in permanent loss of functional lung, patients with lung cancer and chronic obstructive pulmonary disease (COPD) who undergo cancer resection can have minimal loss or improvement in postoperative lung function, a phenomenon known as the lung volume reduction effect [52]. A combined evaluation using spirometry and quantitative CT could characterize the respiratory dynamics and might be used as a predictor of the volume reduction effect [47]. 
In addition, dual-energy CT (DECT) provides images presenting the lung perfusion at a specific time point. By extraction and quantification of the iodine concentration, DECT provides the ratio of lobar perfusion of the lung, which allows accurate prediction of postoperative lung function [46, 53]. Choe et al. reported that a modified method incorporating postoperative lung volume change using DECT can be considered a comparable method for predicting postoperative lung function [54].
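The arithmetic behind a densitometry-based prediction can be sketched as follows. Two assumptions are made that the text does not spell out: functional lung is taken as voxels within an illustrative attenuation window of -910 to -500 HU, and the predicted postoperative value is the preoperative value scaled by the fraction of functional lung volume that remains after resection. Function names and numbers are hypothetical.

```python
def functional_volume_ml(hu_values, voxel_volume_ml, low=-910, high=-500):
    """Volume of voxels whose attenuation falls inside the assumed functional window."""
    n_functional = sum(1 for v in hu_values if low <= v <= high)
    return n_functional * voxel_volume_ml

def predicted_postoperative_fev1(preop_fev1, resected_functional_ml, total_functional_ml):
    """Scale preoperative FEV1 by the fraction of functional lung that remains."""
    remaining_fraction = 1 - resected_functional_ml / total_functional_ml
    return preop_fev1 * remaining_fraction

# Toy voxel list: emphysema-like (-950), two functional voxels, and tumor-like (-300).
toy_volume = functional_volume_ml([-950, -850, -700, -300], voxel_volume_ml=0.5)

# Patient with FEV1 of 2.0 L; the lobe to be resected holds 500 mL of the
# 4000 mL total functional lung volume.
ppo_fev1 = predicted_postoperative_fev1(2.0, 500.0, 4000.0)
```

Because emphysematous or collapsed regions are excluded from the functional volume, resecting a lobe that is mostly non-functional lowers the predicted value only slightly, which is consistent with the lung volume reduction effect described above.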

Furthermore, the frequencies of postoperative complications and mortality are higher in patients with COPD and ILD [52, 55]. Quantitative CT in combination with spirometric measurements contributes to improved prediction of cardiopulmonary complications after lobectomy for lung cancer [56]. In one study, the authors reported that lung density lower than −787.5 HU and volume of emphysema greater than 5.41% increase the risk for developing postoperative pulmonary morbidity [57]. In addition, the severity of lung fibrosis on preoperative CT images was an independent predictive factor of postoperative mortality in lung cancer patients with combined pulmonary fibrosis and emphysema [58]. The severity of lung fibrosis can be quantitatively analyzed for extent and pattern on CT using histogram-based quantification, texture-based quantification, and deep learning. Several studies showed that automated quantification of radiologic patterns of ILD, including normal, GGO, reticular opacity, honeycombing, emphysema, and consolidation, could predict lung function, disease severity, and progression [59,60,61,62,63]. Therefore, quantitative analysis of ILD and COPD can be used to predict mortality and morbidity after treatment in lung cancer patients.

Conclusion and Future Aspects

Although radiomics is still in its infancy, the approaches used to date are very promising sources of robust imaging biomarkers for predicting molecular genetic subtypes related to patient prognosis, optimizing treatment such as selecting the appropriate chemotherapeutic agent, and predicting treatment response. Nevertheless, the hurdle of reproducibility of radiomics features remains. Although radiomics features were found to be mostly unstable in earlier studies [64,65,66], there are continuous improvements in their standardization. In addition, most studies have extracted radiomics features from a single imaging modality, and extraction of radiomics features using multispectral analysis across different modalities has substantial potential. By combining anatomic, functional, and metabolic imaging, a radiomics approach can provide valuable information for phenotyping tumor biology in correlation with tumor diagnosis, classification, and treatment response prediction.

Another issue is the sharing of data across multiple institutions. Data sharing is a critical hurdle in the field of radiomics that must be overcome. Possible answers are large centralized data repositories or federated approaches. For tumor response and prognosis, any imaging biomarker must be reliable and meaningful. Thus, by incorporating different image protocols and reconstructions, data sharing among institutes can help to create a predictive and prognostic model with high accuracy. Furthermore, as intensity inhomogeneity may significantly affect the extracted radiomics features, special consideration is required when applying radiomics to magnetic resonance images [67, 68].

In conclusion, the role of medical imaging in human cancer is now larger than ever, and analysis of radiologic data, namely radiomics, has enormous potential to further enrich the knowledge obtained from medical images. Therefore, we anticipate that radiomics will have an essential position in the development and implementation of precision medicine in the field of oncology in the foreseeable future.