Introduction

Thymic epithelial tumors (TETs) are relatively rare, accounting for 0.2–1.5% of adult malignancies, but they are the most common primary tumors of the anterior mediastinum [1, 2]. In addition to resectability, Masaoka stage and World Health Organization (WHO) pathological classification are the major prognostic indicators [3, 4]. Clinical management of TETs depends mainly on pathological subtype and stage [5]. Early-stage (stages I and II) TETs are generally treated with surgery, whereas advanced-stage (stages III and IV) TETs frequently require a multimodality treatment regimen [5,6,7,8,9]. Therefore, accurately identifying the risk grade of a TET before treatment is critically important for guiding treatment decisions.

Imaging-based diagnosis and risk grading of TETs would therefore be valuable for determining appropriate treatment strategies [10]. Computed tomography (CT) and conventional magnetic resonance imaging (MRI) are the routine imaging modalities for TET patients and provide detailed morphologic information on tumor location, shape, contour, homogeneity, and infiltration [11,12,13,14]. Although conventional imaging has shown considerable potential, it relies on qualitative parameters that overlap considerably among different entities, and it cannot quantitatively and accurately assess the subtypes and stages of TETs [4].

With technical advances in fast MR imaging, diffusion-weighted imaging (DWI) has been successfully used in studies of lung cancer, mediastinal tumors, and pleural lesions [15,16,17]. DWI is considered the most sensitive method for detecting differences in the diffusion of water molecules in living tissues [18]. The apparent diffusion coefficient (ADC) obtained with a monoexponential model provides information on tumor cellularity and is potentially useful for quantitatively differentiating the grades of TETs [9, 19, 20].

Texture analysis evaluates the distribution of signal intensity at the pixel level within a tumor to quantify tumor heterogeneity, which arises from variations in genomic subtypes, cell proliferation or apoptosis, metabolic activity, vascular structure, and other factors [21]. Compared with the morphological analysis of conventional imaging, texture parameters provide more detailed and quantitative information on tumor composition through pixel-by-pixel analysis. A series of texture analysis studies have demonstrated a potential role in differential diagnosis, staging, and prognosis prediction for various tumors [22,23,24,25,26,27]. However, it remains largely unknown whether DWI texture analysis can improve the prediction of TET grades.

In the present study, we aimed to evaluate the potential value of combining ADC and DWI texture parameters in predicting the pathological subtypes and stages of TETs preoperatively.

Materials and methods

Subjects

This retrospective single-center study was approved by the local Ethics Committee, and informed consent was waived.

Between December 2013 and March 2016, 91 consecutive patients with suspected TETs on CT underwent conventional MRI and DWI of the thorax, and the final diagnosis was based on pathological analysis. Of these, 20 patients were excluded because pathological evaluation yielded a non-TET diagnosis, 11 because of poor image quality or motion artifacts, and 3 because the solid part of the tumor was < 2.0 cm in size. The final study population comprised 57 patients (36 men, 21 women; mean age, 48.9 ± 11.4 years) with newly diagnosed TETs according to the pathological results (Fig. 1 and Table 1).

Fig. 1 Flow diagram of patient selection, quantitative metrics, and grouping

Table 1 Clinical and demographic characteristics of 57 patients with thymic epithelial tumor

Thorax MRI protocol

All patients were examined preoperatively with the same image acquisition protocol on a 3.0-T whole-body system (MR750, GE Healthcare) with a 40-mT/m maximum gradient capability and a standard 8-channel torso coil. Conventional MRI and DWI were performed in sequence during the same examination. The conventional MRI protocol included respiratory-triggered T1-weighted spin-echo imaging in the axial plane, T2-weighted fast spin-echo imaging in the axial and coronal planes, and axial T2-weighted turbo spin-echo imaging with fat suppression. Subsequently, DWI (b = 1000 s/mm²) was performed with a single-shot diffusion-weighted spin-echo echo-planar sequence. Respiratory triggering and chemical shift-selective fat suppression were used to reduce artifacts. Parallel imaging was used with an acceleration factor of 2. A local shim box covering the whole thorax was applied to minimize susceptibility artifacts. Other parameters were as follows: TR/TE, 6000 ms/51 ms; matrix size, 96 × 128; FOV, 40 cm × 40 cm; slice thickness, 5 mm; gap, 0.5 mm; NEX, 4. The total scan time was approximately 1 min 33 s.

ADC value measurement

All data were analyzed on a GE ADW4.6 workstation. The mean ADC values were measured by one experienced radiologist (B.L., with 15 years of experience in MRI) using commercial software (Functool 9.4.05, GE Healthcare). He was aware that the patients had TETs but was blinded to the pathological subtypes of the tumors. First, he carefully reviewed the conventional MR images to identify the solid part of each tumor. Next, the DWI data were analyzed. Two circular regions of interest (ROIs) were drawn manually with an electronic cursor on the slice containing the largest cross-sectional area of the tumor. The ROIs were placed on solid tumor elements, identified by relatively high signal intensity on the DW image (bright region, b = 1000 s/mm², as shown in Fig. 2a) or relatively low ADC on the ADC map (deep-blue region, as shown in Fig. 2b), avoiding large vessels and hemorrhagic, cystic, and necrotic areas. The mean ROI area was 46.9 ± 20.2 mm² (range, 11.5–88.0 mm²). The ADC value was calculated by fitting the b0 image and the DW image acquired at a b value of 1000 s/mm² to the conventional monoexponential ADC equation (Eq. (1)) [28]:

$$ S_{b}/S_{0}=\exp\left(-b\cdot \mathrm{ADC}\right) \tag{1} $$

Fig. 2 Schematic diagram of tumor segmentation. Two circular regions of interest (ROIs) were drawn on the solid tumor elements, identified by relatively high signal intensity on the DW image (a, bright region, b = 1000 s/mm²) or relatively low ADC on the ADC map (b, deep-blue region). A volume of interest (VOI) was drawn around the entire cross-sectional tumor region in three consecutive slices on DW images (c)

The ADC maps were generated automatically (as shown in Fig. 2b), and the mean ADC value within each of the two ROIs was obtained. The final ADC was the average of the two ROI values.
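For illustration, the two-point calculation implied by Eq. (1) can be sketched as follows. Solving Eq. (1) for ADC gives ADC = ln(S0/Sb)/b, where S0 and Sb are the signal intensities on the b0 and b = 1000 s/mm² images. This is a minimal sketch and not the Functool implementation; the function names and the signal intensities are hypothetical.

```python
import math


def adc_two_point(s0, sb, b=1000.0):
    """Solve S_b / S_0 = exp(-b * ADC) for ADC.

    With b in s/mm^2 the result is in mm^2/s.
    """
    if s0 <= 0 or sb <= 0:
        raise ValueError("signal intensities must be positive")
    return math.log(s0 / sb) / b


def final_adc(roi_means):
    """Average the mean ADC values of the ROIs (two ROIs in this study)."""
    return sum(roi_means) / len(roi_means)


# Hypothetical mean signal intensities for two ROIs (not measured data).
adc_roi_1 = adc_two_point(s0=1000.0, sb=250.0)
adc_roi_2 = adc_two_point(s0=1000.0, sb=300.0)
print(final_adc([adc_roi_1, adc_roi_2]))
```

Because only two b values are used, the fit reduces to this closed-form expression; a multi-b acquisition would need a least-squares fit instead.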

DWI texture analysis

Tumor segmentation

All images were anonymized and stored in DICOM format. One experienced radiologist (B.L., with 15 years of experience in MRI) manually segmented the tumor area on axial DW images using commercial software (NordicICE, version 4.0.4, NordicNeuroLab). A volume of interest (VOI) was drawn manually, slice by slice, around the entire cross-sectional tumor region on the three consecutive DW slices containing the largest cross-sectional area. Care was taken to exclude adjacent tissues by keeping the VOI boundary approximately 1–2 mm away from the tumor margin to minimize partial volume effects (Fig. 2c).

Feature extraction

Quantitative features were extracted from the DW images within the VOI drawn in the previous step using non-commercial Analysis-Kit software (GE Healthcare). Forty-two first-order histogram features, 983 second-order texture features (432 gray-level co-occurrence matrix, 540 gray-level run length matrix, and 11 gray-level size zone matrix features), 9 form factors, and 10 Haralick features were extracted, giving a total of 1044 quantitative features from the original images. Detailed descriptions of the extracted features are presented in the Supplemental file (Texture parameters description.docx).
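To make the second-order features more concrete, the sketch below builds a gray-level co-occurrence matrix (GLCM) for one pixel offset and computes cluster shade and GLCM entropy from it using the standard textbook definitions. It is an illustration under stated assumptions, not the Analysis-Kit implementation: the symmetric normalization, the mapping of an "angle" and "offset" to the `dx`/`dy` arguments, and the tiny example image are all hypothetical.

```python
import math


def glcm(image, dx, dy, levels):
    """Normalized, symmetric GLCM of a 2D list of integer gray levels."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1.0
                counts[j][i] += 1.0  # count each pixel pair in both directions
    total = sum(sum(row) for row in counts)
    if total == 0:
        raise ValueError("no pixel pairs exist for this offset")
    return [[v / total for v in row] for row in counts]


def cluster_shade(p):
    """Sum over (i, j) of (i + j - mu_x - mu_y)^3 * p(i, j)."""
    n = len(p)
    mu_x = sum(i * p[i][j] for i in range(n) for j in range(n))
    mu_y = sum(j * p[i][j] for i in range(n) for j in range(n))
    return sum((i + j - mu_x - mu_y) ** 3 * p[i][j]
               for i in range(n) for j in range(n))


def glcm_entropy(p):
    """Shannon entropy (base 2) of the GLCM; empty cells contribute zero."""
    return -sum(v * math.log2(v) for row in p for v in row if v > 0)


# Hypothetical 4-level image; offset of one pixel along the row direction.
image = [[0, 0, 1, 3],
         [0, 1, 1, 3],
         [0, 0, 2, 3]]
p = glcm(image, dx=1, dy=0, levels=4)
print(cluster_shade(p), glcm_entropy(p))
```

A perfectly uniform region gives a cluster shade and entropy of zero, whereas a skewed distribution of co-occurring gray levels moves the cluster shade away from zero.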

Feature selection

Because of the large number of features, we applied a three-step feature selection process to remove irrelevant or redundant features and identify the most informative ones. First, one-way ANOVA was used to test for significant differences in texture features among the low-risk thymoma (LRT), high-risk thymoma (HRT), and thymic carcinoma (TC) groups, and features with p < 0.01 were retained for further selection. Similarly, the independent samples t test was used to test for significant differences in texture features between early and advanced TETs, with p < 0.05 as the selection criterion. Second, the features were further reduced with Pearson’s correlation tests: when the Pearson correlation coefficient of any two features was larger than 0.9, one of the two was selected at random, and the remaining pairs were treated in the same manner [29]. Finally, within each category of similar features, the feature with the best discriminative performance was selected.
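The correlation-based reduction step can be sketched as follows. This is a minimal illustration, not the code used in the study: it assumes the univariate filtering (ANOVA or t test) has already been applied, it compares the absolute value of the correlation coefficient with the 0.9 threshold (an assumption, since the text does not state how negative correlations were handled), and the feature names and values are hypothetical.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant feature is treated as uncorrelated
    return sxy / math.sqrt(sxx * syy)


def prune_correlated(features, threshold=0.9, seed=0):
    """Drop one randomly chosen feature from each highly correlated pair.

    `features` maps a feature name to its list of per-patient values.
    The search restarts after every removal until no pair exceeds the
    threshold.
    """
    rng = random.Random(seed)
    kept = list(features)
    changed = True
    while changed:
        changed = False
        for a_idx in range(len(kept)):
            for b_idx in range(a_idx + 1, len(kept)):
                a, b = kept[a_idx], kept[b_idx]
                if abs(pearson_r(features[a], features[b])) > threshold:
                    kept.remove(rng.choice([a, b]))
                    changed = True
                    break
            if changed:
                break
    return kept


# Hypothetical feature table: f1 and f2 are nearly collinear, f3 is not.
example = {
    "f1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "f2": [2.0, 4.0, 6.0, 8.0, 10.5],
    "f3": [5.0, 1.0, 4.0, 2.0, 3.0],
}
print(prune_correlated(example))
```

Fixing the random seed keeps the random choice reproducible between runs, which matters when the selected features are reported by name.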

Pathologic diagnosis

The final diagnosis was based on surgical or puncture biopsy specimens and was confirmed by pathologic analysis performed by an expert in the pathology department who was blinded to the MR findings. Based on the 2004 World Health Organization (WHO) histological classification and Jeong’s simplified classification of thymic tumors [30, 31], TETs were divided into three subgroups: LRT (types A, AB, and B1), HRT (types B2 and B3), and TC. Early (stages I and II) and advanced (stages III and IV) TETs were defined according to the Masaoka staging system [3].

Statistical analysis

The Kolmogorov–Smirnov (K–S) test was used to assess the normality of data distributions. Normally distributed numerical variables are reported as mean and standard deviation. The ADC value and DWI texture parameters were compared among the LRT, HRT, and TC groups with one-way ANOVA (equal variances assumed) or the Brown–Forsythe test (equal variances not assumed), followed by post hoc multiple comparisons with the Bonferroni test (equal variances assumed) or Dunnett’s T3 test (equal variances not assumed). The ADC value and texture parameters were compared between early (Masaoka stages I and II) and advanced (stages III and IV) TETs using the independent samples t test. Binary logistic regression analysis was used to evaluate the effect of combining multiple parameters. Receiver operating characteristic (ROC) curve analyses were performed to determine the optimum thresholds of the various parameters for differentiating the defined groups and to calculate the sensitivity, specificity, and area under the curve (AUC). Leave-one-out cross-validation (LOOCV) was used as the resampling method for the area under the ROC curve. All of the above analyses were performed with IBM SPSS 20.0 software (IBM Corp). Differences between AUCs were compared with the DeLong test using MedCalc version 18.5 software (MedCalc). Statistical significance was accepted at p < 0.05.
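As an illustration of the ROC step, the sketch below computes the AUC as the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, and picks a cutoff by maximizing the Youden index (sensitivity + specificity − 1). The Youden criterion is an assumption made here for illustration; the text does not state how the optimum thresholds were chosen in SPSS. The scores are hypothetical.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC from all positive/negative pairs; ties count as one half."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def youden_cutoff(scores_pos, scores_neg):
    """Return (cutoff, sensitivity, specificity) maximizing the Youden index.

    A case is called positive when its score is >= cutoff. If several
    cutoffs tie, the lowest one is returned.
    """
    best = None
    for cut in sorted(set(scores_pos) | set(scores_neg)):
        sens = sum(s >= cut for s in scores_pos) / len(scores_pos)
        spec = sum(s < cut for s in scores_neg) / len(scores_neg)
        j = sens + spec - 1.0
        if best is None or j > best[0]:
            best = (j, cut, sens, spec)
    return best[1:]


# Hypothetical scores (for example, predicted probabilities), not study data.
positives = [0.9, 0.8, 0.7, 0.4]
negatives = [0.6, 0.3, 0.2, 0.1]
print(roc_auc(positives, negatives))
print(youden_cutoff(positives, negatives))
```

For a parameter such as ADC, where lower values indicate the positive group, the scores can be negated before calling these functions.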

Results

Demographic characteristics

Table 1 summarizes the demographic characteristics of the patients. The major clinical features were myasthenia gravis (26.3%; 15 of 57 patients), chest pain (24.6%; 14 of 57), respiratory symptoms (19.3%; 11 of 57), and other symptoms (12.3%; 7 of 57); 10 patients (17.5%) had no symptoms.

Forty-seven TET patients were staged based on surgical specimens; the other 10 patients, whose advanced-stage disease was confirmed by puncture biopsy, were staged by the presence of pleural or pericardial implants or metastasis at imaging. According to the Masaoka staging system, 17 (29.8%) patients were in clinical stage I, 15 (26.3%) in clinical stage II, 7 (12.3%) in stage III, and 18 (31.6%) in stage IV. As for the pathological subtypes, 12 patients had LRT (types A (n = 1), AB (n = 8), and B1 (n = 3)); 32 had HRT (types B2 (n = 23) and B3 (n = 9)); and 13 had TC (squamous cell carcinoma (n = 7), adenocarcinoma (n = 3), and neuroendocrine carcinoma (n = 3)). The relationship between Masaoka stage and the simplified WHO histologic classification is shown in Table 2. None of the LRTs was in an advanced stage (stages III and IV), compared with 12 of 32 (37.5%) HRTs and all 13 TCs.

Table 2 Relationship between histopathological subtypes and Masaoka stages of thymic epithelial tumors

Comparison of ADC values and DWI texture parameters among low-risk thymomas, high-risk thymomas, and thymic carcinomas

After feature selection, seven texture parameters with good discriminative performance were retained: V80 (Cluster Shade_angle90_offset1), V140 (GLCM Entropy_AllDirection_offset3), V153 (Cluster Prominence_AllDirection_offset3_SD), V204 (GLCM Entropy_angle0_offset4), V296 (Cluster Shade_AllDirection_offset6_SD), V312 (Cluster Shade_angle45_offset6), and V735 (Short Run Emphasis_AllDirection_offset5_SD). The statistical results for the ADC and the selected texture parameters among LRTs, HRTs, and TCs are shown in Table 3 and Fig. 3.

Table 3 ADC and DWI texture parameter comparisons among low-risk thymoma, high-risk thymoma, and thymic carcinoma
Fig. 3 Box plots of ADC (a), V80 (b), V312 (c), and V296 (d) values among low-risk thymoma (LRT), high-risk thymoma (HRT), and thymic carcinoma (TC). ADC, apparent diffusion coefficient; V80, Cluster Shade_angle90_offset1; V296, Cluster Shade_AllDirection_offset6_SD; V312, Cluster Shade_angle45_offset6

The mean ADC values in the LRT and HRT groups were each significantly higher than that in the TC group according to the one-way ANOVA (ADC, 1.63, 1.30, and 0.86 × 10⁻³ mm²/s; p = 0.004 and 0.001, respectively), whereas there was no significant difference in ADC values between the LRT and HRT groups (p > 0.017). For the seven selected DWI texture features, significant differences were found among the LRT, HRT, and TC groups (all p < 0.05). After post hoc multiple comparisons, V80 and V312 values in LRTs were significantly higher than those in the HRT or TC groups (all p < 0.017).

Comparison of ADC values and DWI texture parameters between early and advanced stages of TETs

After feature selection, nine texture parameters with better discriminative performance remained: V57 (Cluster Prominence_AllDirection_offset1_SD), V204 (GLCM Entropy_angle0_offset4), V278 (Correlation_angle135_offset5), V296 (Cluster Shade_AllDirection_offset6_SD), V735 (Short Run Emphasis_AllDirection_offset5_SD), V920 (High Gray-Level Run Emphasis_AllDirection_offset8_SD), V1025 (Sphericity), V1030 (Maximum 3D Diameter), and V1033 (Spherical Disproportion). The statistical results for the ADC and the selected texture parameters between early and advanced TETs are shown in Table 4 and Fig. 4.

Table 4 ADC and DWI texture parameter comparisons between early and advanced stages of thymic epithelial tumors
Fig. 4 Box plots of ADC (a), V204 (b), V1030 (c), and V296 (d) values between early and advanced stages of thymic epithelial tumors. ADC, apparent diffusion coefficient; V204, GLCM Entropy_angle0_offset4; V296, Cluster Shade_AllDirection_offset6_SD; V1030, maximum 3D diameter

The mean ADC value in early-stage TETs was significantly higher than that in advanced-stage TETs (1.48 vs. 1.00 × 10⁻³ mm²/s, p < 0.001). As for the nine selected DWI texture parameters, significant differences were found between early and advanced stages of TETs (all p < 0.05).

ROC analysis

Based on the ROC analyses, V80 and V312 achieved relatively high diagnostic efficacy for differentiating LRT from HRT and TC, with AUCs of 0.862 and 0.875, the same sensitivity of 75.0%, and specificities of 88.1% and 90.5% at cutoff values of 1.29 × 10⁶ and 5.00 × 10⁵, respectively. For the ADC value, the AUC, sensitivity, specificity, and cutoff value for differentiating LRT from HRT or TC were 0.746, 50.0%, 90.5%, and 1.86 × 10⁻³ mm²/s, respectively. In addition, we fitted binary logistic regression models with the group as the dependent variable and either V80 and V312 or ADC and V312 as covariates, and obtained each patient’s predicted probability. These predicted probabilities were then analyzed by ROC curve analysis. Logistic (ADC, V312) achieved the highest diagnostic efficacy, with an AUC of 0.933, sensitivity of 95.2%, and specificity of 91.7%, whereas logistic (V80, V312) did not improve the differentiating efficacy, with an AUC of 0.877 (Table 5 and Fig. 5a). Pairwise comparison of the AUCs revealed that the AUC of logistic (ADC, V312) was significantly higher than that of the ADC value for differentiating LRT from HRT and TC (p = 0.0147), but no significant differences were found between the AUCs of any other parameters (all p > 0.05).

Table 5 Diagnostic efficacy comparisons of ADC and DWI texture parameters in differentiating the defined groups of thymic epithelial tumors (TETs) based on the WHO classification and Masaoka–Koga stage
Fig. 5 Receiver operating characteristic curves for the differentiating performance of the ADC and DWI texture parameters among the defined groups of thymic epithelial tumors based on the WHO classification and Masaoka stage. a LRT vs. HRT and TC with the ADC, V80, and V312 values. b Early vs. advanced stage with the ADC, V204, and V1030 values. LRT, low-risk thymoma; HRT, high-risk thymoma; TC, thymic carcinoma; ADC, apparent diffusion coefficient; V80, Cluster Shade_angle90_offset1; V312, Cluster Shade_angle45_offset6; V204, GLCM Entropy_angle0_offset4; V1030, maximum 3D diameter

As shown in Table 5 and Fig. 5b, for differentiating early from advanced stages of TETs, the ADC value obtained a relatively high differentiating ability with an AUC of 0.755, sensitivity of 68.8%, and specificity of 76.0% at the cutoff value of 1.18 × 10⁻³ mm²/s. The AUC, sensitivity, and specificity for V204, V1030, and logistic (ADC, V1030), respectively, were as follows: V204, 0.720, 68.0%, and 75.0%; V1030, 0.726, 92.0%, and 56.3%; and logistic (ADC, V1030), 0.772, 84.0%, and 65.6%. Pairwise comparison of the AUCs revealed no significant differences between AUCs from the above parameters for differentiating early from advanced stages of TETs (all p > 0.05).

In the discriminant analysis performed across the two subtypes (LRT vs. HRT and TC), 92.6% of original or cross-validated grouped cases were correctly classified by ADC+V312, and 71.9% of original or cross-validated grouped cases were correctly classified across the two stages (early vs. advanced stages) by ADC+V1030 according to the LOOCV procedure.
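The leave-one-out procedure itself can be sketched as follows: each patient is held out in turn, a classifier is fitted on the remaining patients, and the held-out patient is classified. To keep the example self-contained, a nearest-centroid classifier is used as a stand-in; it is not the discriminant analysis or logistic regression performed in SPSS, and the two-feature data are hypothetical. In practice, features on very different scales (such as ADC and V312) would need to be standardized first.

```python
def fit_centroids(X, y):
    """Stand-in classifier: the per-class mean of each feature."""
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def predict_centroid(centroids, x):
    """Assign x to the class with the nearest centroid (squared distance)."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(x, center))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def loocv_accuracy(X, y):
    """Fraction of cases classified correctly when each is held out once."""
    correct = 0
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        model = fit_centroids(X_train, y_train)
        correct += predict_centroid(model, X[i]) == y[i]
    return correct / len(X)


# Hypothetical standardized two-feature data for two groups.
X = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.1],
     [3.0, 3.0], [3.1, 2.9], [2.9, 3.2]]
y = [0, 0, 0, 1, 1, 1]
print(loocv_accuracy(X, y))
```

Because the model is refitted without the held-out case each time, the resulting accuracy is less optimistic than one computed on the same data used for fitting.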

Discussion

Digital medical images, which are routinely acquired for almost every patient with a tumor, are an important source of data for mining, and texture analysis is rapidly becoming a noninvasive means of lesion characterization and classification for improved decision support [32]. In this study, we evaluated whether DWI texture analysis could be used to predict the histological subtypes and stages of TETs. The results revealed significant differences in several texture parameters among pathological subtypes and between stages of TETs, and the combination of ADC and V312 differentiated subtypes significantly better than ADC alone, which could potentially be used in clinical practice for evaluating TETs before treatment.

Stage and histological type are two of the most important factors that bridge the biology and clinical behavior of TETs [33], and WHO histological subtypes of TETs correlate with clinical stage [19, 34]. Accurately identifying TET risk grades before treatment is therefore crucial for clinicians and radiologists in guiding therapeutic strategies. In this study, all LRTs were early stage, whereas 12 of 32 HRTs (37.5%) and all TCs were advanced stage. Therefore, to a certain extent, pathological classification was correlated with the clinical stage of TETs.

Previous studies showed that the ADC value was helpful in differentiating subtypes or stages of TETs [9, 19, 20]. Similarly, we found that mean ADC values differed among subtypes and between stages of TETs. Compared with LRTs or early-stage TETs, HRTs and TCs or advanced-stage TETs have more markedly enlarged nuclei and show hypercellularity, which leads to a decreased ADC value [19, 20].

The parameters derived from imaging texture analysis are regarded as good indices of tumor heterogeneity, which is thought to result from regional differences in tumor cellularity, proliferation, angiogenesis, hypoxia, and necrosis, all of which are related to tumor risk grades [35, 36]. Two recent studies differentiated the pathological subtypes of TETs using PET/CT textural features and demonstrated their potential value for TET grading [27, 35]. In this study, several DWI texture parameters differed significantly among subtypes or between stages of TETs. V80 and V312 are both cluster shade parameters, which measure the skewness and uniformity of the gray-level co-occurrence matrix (GLCM); a higher cluster shade implies greater asymmetry or less homogeneity. Interestingly, V80 and V312 values were significantly higher in LRT than in HRT or TC, indicating lower tumor homogeneity in LRT, which was unexpected. In most cases, malignant tumors are more heterogeneous than benign or low-grade tumors because of more necrosis and heterogeneous growth. However, a macroscopic appearance of multiple nodules separated by fibrous bands has been demonstrated in LRT, especially in over 77% of type AB thymomas, whereas this typical growth pattern is less common in HRT and absent in TC [11], which might explain the significantly higher V80 and V312 values in LRT. In addition, V1030 (maximum 3D diameter) values were significantly lower in LRT than in HRT or TC in the current study, consistent with previous results showing longer diameters in advanced-stage TETs than in LRT or early-stage TETs [19].

Furthermore, we evaluated the diagnostic efficacy of ADC values and DWI texture parameters in differentiating TET subtypes and stages. V80 and V312 achieved relatively good performance, and binary logistic regression combining ADC and V312 yielded the highest efficacy, with an AUC of 0.933, in differentiating LRT from HRT or TC. Interestingly, combining the two texture parameters did not improve the differentiating efficacy, possibly because V80 and V312 both belong to the same category of cluster shade features. Overall, the combination of ADC and V312 could potentially be useful as a predictive biomarker of TET subtypes.

As in previous studies with various imaging modalities [9, 14, 19, 20], neither the ADC value nor the DWI texture parameters achieved excellent performance in differentiating the stages of TETs in this study. Indeed, there is no unequivocal correlation between pathological subtypes and stages [19], and all types of TETs can follow an aggressive clinical course [2], which might partly explain the relatively low performance in predicting the stages of TETs [9].

In addition, significant differences were found in the form factor parameters V1025, V1030, and V1033 between early- and advanced-stage TETs. These features describe the three-dimensional size and shape of the tumor region and are determined mainly by the volume and surface area of the VOI. In this study, the VOI was kept 1–2 mm away from the tumor margin to avoid the effects of edge noise and the partial volume phenomenon [37], which could affect the volume and surface area of the VOI and therefore the form factor parameters. Segmentation is of critical importance, but because there is no defined ground truth, the reproducibility of segmentation becomes an important issue [38]. Further research is therefore warranted to clarify this problem.

This study has several limitations. First, although this was the largest series of TETs evaluated with DWI, the sample size was relatively small, mainly because of the low clinical incidence of thymic tumors. Second, the ADC value was calculated from ROIs drawn on focal tumor areas by a single reader. Histogram analysis based on the whole tumor and calculation of the intraclass correlation coefficient might be needed to reduce sampling bias and assess the inter-observer repeatability of the measurements. In addition, we used the b0 and b1000 images to obtain the ADC values with the monoexponential model of DWI; these values could be overestimated because of perfusion bias, and perfusion-free ADC measurements using a minimum b value of 100–150 s/mm² instead of 0 s/mm² might avoid this problem [39, 40]. Third, ten advanced-stage TET patients did not undergo surgery owing to widespread invasion or metastasis; their disease was confirmed by puncture biopsy and staged by imaging, which might have introduced bias. Finally, this was a preliminary study exploring the possibility of using DWI texture analysis to obtain quantitative biomarkers of TETs, and further radiomics studies with larger samples are needed to confirm the findings.

In conclusion, several parameters derived from DWI texture analysis could be useful in distinguishing pathological subtypes or stages of TETs, and the combination of ADC and texture indices improved the efficacy of differential diagnosis.