Introduction

Meningiomas are the most common primary intracranial neoplasms encountered in clinical settings, and 20–30% of them originate from the skull base [1, 2]. Although most meningiomas are classified as benign tumors according to the 2016 WHO classification system [3], a subset of these tumors may show early progression/recurrence (P/R) after surgical resection [4,5,6]. Because of the complex neurovascular structures involved in this location, complete surgical resection of skull base meningiomas (SBM) is often difficult to achieve safely [1]. To avoid surgery-related neurological complications, subtotal tumor resection (STR) or conservative follow-up is often chosen as an alternative [7,8,9,10]. In clinical practice, it is important to identify risk factors that correlate with P/R in SBM so that the best treatment and follow-up strategies can be selected. Conventional MR imaging findings such as tumor size, bone invasion, and proximity to the major sinuses are related to P/R in meningiomas [6, 11]; however, quantitative analysis of MRI features for the evaluation of clinical outcomes in meningiomas is rarely reported in the available literature. In recent years, radiomics analysis has emerged as a comprehensive quantitative method to evaluate brain tumors [12], extracting parameters related to the underlying anatomical microstructure and the dynamics of smaller-scale biophysical processes such as gene expression, tumor cell proliferation, and neovascularization [13]. Furthermore, radiomics analysis has been shown to provide predictive markers for diagnosis, prognosis, and therapeutic planning in brain tumors [12,13,14,15,16,17].

Recently, the preoperative apparent diffusion coefficient (ADC) value was used for the prediction of P/R in patients diagnosed with SBM and non-skull base meningiomas [18,19,20]. Since subjective region of interest (ROI) placement might vary from operator to operator, in this study we investigated the role of quantitative radiomics analysis based on automatically segmented tumors for the prediction of P/R in SBM. For comparison, we also evaluated manually measured ADC values for the prediction of P/R in SBM.

Materials and methods

Ethics statement

This retrospective study was approved by our Institutional Review Board (IRB serial no.: 10708–005). Written consent was waived as this retrospective study did not impact the healthcare of the included individuals. All patients’ records were anonymized and de-identified prior to analysis.

Patient selection

From October 2006 to December 2017, 138 patients were diagnosed with SBM (WHO grades I–III) by brain MRI and pathological confirmation. Patients with less than 1 year of postoperative MRI follow-up were excluded (N = 34). Patients with incomplete preoperative MRI, poor imaging quality, or without preoperative diffusion-weighted imaging (DWI) and ADC map were excluded (N = 29). In addition, patients whose imaging sequences were inconsistent with those of the majority of patients were also excluded (N = 15). Finally, 60 patients (14 men, 46 women; median age, 57 years), including 56 benign (WHO grade I), 3 atypical (WHO grade II), and 1 malignant (WHO grade III) SBM, were included. None had a history of cranial radiation or neurofibromatosis type 2. Twenty-one patients (21/60, 35%) were diagnosed with P/R, and the median time to P/R was 27 months (range 2–56 months). The median follow-up time was 52 months (range 12–122 months). According to anatomic location, the SBM were classified into five subgroups: anterior fossa/olfactory groove, spheno-orbital, temporal floor, sellar/cavernous sinus, and posterior fossa [21, 22]. The extent of surgical resection was determined by a review of surgical documentation in combination with preoperative and postoperative MRI findings by a neuroradiologist (C.C.K.) and a neurosurgeon (S.W.L.). Simpson Grades I–III resection (considered gross-total resection, GTR) was performed in 33 patients, and Simpson Grades IV–V resection (considered subtotal tumor resection, STR) was performed in 27 patients. Postoperative adjuvant radiotherapy (RT) was usually performed for patients with STR or high-grade meningiomas (WHO grade II or III) in our hospital. A total of 24 patients (21 benign, 2 atypical, and 1 malignant SBM) received postoperative adjuvant RT, and 3 patients refused further adjuvant RT. RT was delivered by linear accelerators using either stereotactic radiosurgery (SRS) (N = 15; median dose of 25 Gy, ranging from 18 to 30 Gy; median of 5 fractions, ranging from 3 to 5 fractions) or fractionated stereotactic intensity-modulated radiotherapy (IMRT) (N = 9; dose ranging from 55 to 60 Gy in 30–33 fractions). Detailed information on the radiation therapy techniques can be found in supplementary file 1.

Determination of progression/recurrence

P/R of SBM was evaluated by two experienced neuroradiologists (C.C.K. and T.Y.C.), blinded to the clinical and radiologic findings of the studied patients. In equivocal cases, judgment was made in consensus. Interobserver reliability was high (Cohen's κ = 0.9). P/R was defined as recurrence of tumor after GTR (Simpson Grades I–III resection) or progression of residual tumor size after STR (Simpson Grades IV–V resection) on contrast-enhanced T1WI. In cases of STR, the threshold for P/R was defined as a 10% increase in tumor volume in comparison with postoperative brain MRI. In patients who received adjuvant RT, P/R was differentiated from post-radiation effect (pseudo-progression) based on progressive tumor growth rather than a transient increase in tumor volume.
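As an illustration of the interobserver agreement statistic reported above, the following minimal Python sketch computes Cohen's kappa for two raters. The function name `cohen_kappa` and the example ratings are hypothetical and are not the study data; the sketch only shows how observed and chance agreement combine into the statistic.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: fraction of cases on which both raters agree.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement, from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical ratings (1 = P/R, 0 = no P/R); NOT the study data.
reader_1 = [1, 1, 0, 0, 0, 1, 0, 0, 1, 0]
reader_2 = [1, 1, 0, 0, 0, 1, 0, 0, 0, 0]
kappa = cohen_kappa(reader_1, reader_2)
```

In this toy example the readers disagree on one of ten cases, so the observed agreement is 0.9, while kappa is lower because part of that agreement is expected by chance.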

Imaging acquisition and tumor segmentation

MRI images in this study were acquired using a 1.5-T (N = 52) or a 3.0-T (N = 8) scanner. The scanning protocol included axial and sagittal spin echo T1-weighted imaging (T1WI), axial and coronal fast spin echo T2-weighted imaging (T2WI), axial fluid-attenuated inversion recovery (FLAIR), axial T2*-weighted gradient-recalled echo (GRE), axial DWI and ADC map, and contrast-enhanced T1WI in axial and coronal sections. Detailed imaging parameters of the MR scanners can be found in supplementary file 2. Because radiomics features from T2WI, ADC, and contrast-enhanced T1WI have been associated with histopathology in meningiomas [16, 32], these three sequences were selected for analysis in our study. Figure 1 shows the flowchart of the analysis process. The lesion was segmented on contrast-enhanced T1WI by subtracting pre-contrast images from post-contrast images. For each lesion, the operator placed an initial ROI indicating the lesion location and then selected the beginning and ending slices that contained the lesion. The outline of the lesion ROI on each imaging slice was then automatically obtained using a fuzzy c-means (FCM) clustering-based algorithm [23]. The ROIs from all imaging slices containing the lesion were combined to obtain a 3D mask of the entire lesion. 3D connected-component labeling was then applied to remove scattered voxels not connected to the main lesion ROI, and hole-filling was applied to include all voxels within the main ROI that had been labeled as non-lesion components. When necessary, the operator performed manual corrections, and the number of pixels that were changed was recorded. The percentage of corrected pixels was calculated by dividing the number of changed pixels by the total number of pixels in the entire tumor. Correction was necessary in 28 of the 60 cases, and the corrected pixels accounted for less than 5% of the tumor (mean 3.2 ± 2.1%).
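The mask clean-up and correction bookkeeping described above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the mask is a nested list indexed as `[slice][row][column]`, voxels are joined by 6-connectivity, the "main lesion ROI" is taken to be the largest connected component, and the correction percentage is taken relative to the final tumor mask. None of these choices is specified in the text. Hole-filling is omitted for brevity, and the function names are hypothetical.

```python
from collections import deque


def keep_largest_component(mask):
    """Return a copy of a 3D binary mask keeping only its largest 6-connected component."""
    nz, ny, nx = len(mask), len(mask[0]), len(mask[0][0])
    neighbours = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    seen, best = set(), []
    for z in range(nz):
        for y in range(ny):
            for x in range(nx):
                if not mask[z][y][x] or (z, y, x) in seen:
                    continue
                # Breadth-first flood fill from an unvisited lesion voxel.
                component, queue = [], deque([(z, y, x)])
                seen.add((z, y, x))
                while queue:
                    cz, cy, cx = queue.popleft()
                    component.append((cz, cy, cx))
                    for dz, dy, dx in neighbours:
                        v = (cz + dz, cy + dy, cx + dx)
                        inside = 0 <= v[0] < nz and 0 <= v[1] < ny and 0 <= v[2] < nx
                        if inside and mask[v[0]][v[1]][v[2]] and v not in seen:
                            seen.add(v)
                            queue.append(v)
                if len(component) > len(best):
                    best = component
    cleaned = [[[0] * nx for _ in range(ny)] for _ in range(nz)]
    for z, y, x in best:
        cleaned[z][y][x] = 1
    return cleaned


def corrected_percentage(auto_mask, final_mask):
    """Percentage of voxels changed by manual correction, relative to the final tumor size."""
    flat_auto = [v for sl in auto_mask for row in sl for v in row]
    flat_final = [v for sl in final_mask for row in sl for v in row]
    tumor_size = sum(flat_final)
    if tumor_size == 0:
        raise ValueError("final mask is empty")
    changed = sum(a != b for a, b in zip(flat_auto, flat_final))
    return 100.0 * changed / tumor_size


# Toy two-slice mask with one stray voxel at slice 0, row 1, column 3.
toy_mask = [
    [[1, 1, 0, 0], [1, 1, 0, 1], [0, 0, 0, 0]],
    [[1, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]],
]
cleaned = keep_largest_component(toy_mask)
```

In the toy mask, the six connected voxels spanning both slices are kept and the isolated voxel is removed.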

Fig. 1

Flowchart of the analysis process. The tumor is segmented on contrast-enhanced T1WI, and then mapped to T2WI and ADC maps. On each set of images, a total of 33 texture and histogram features are extracted. The random forest algorithm is used to select features for building the classification model by using the decision tree

The segmented tumor mask was co-registered to T2WI and ADC maps to transfer the tumor ROI to these images (Fig. 1). This process was performed with FMRIB’s Linear Image Registration Tool (FLIRT) [24]. This tool reads the header information of the images, which contains the slice locations and the field of view of the T2WI, ADC maps, and T1WI. Because image resolution and slice thickness differed between sequences, the pixels in the tumor masks were mapped to T2WI and ADC maps using affine transformation and linear interpolation.

Quantitative feature extraction

On each set of contrast-enhanced T1WI, T2WI, and ADC maps, 20 Gray Level Co-occurrence Matrix (GLCM) texture features were calculated from the tumor ROI: autocorrelation, cluster prominence, cluster shade, contrast, correlation, dissimilarity, energy, entropy, homogeneity 1, homogeneity 2, maximum probability, sum average, sum variance, sum entropy, difference variance, difference entropy, information measure of correlation 1, information measure of correlation 2, inverse difference normalized, and inverse difference moment normalized [25]. In addition, 13 histogram-based parameters were calculated: the 10th, 20th, and so on up to the 90th percentile values, mean, standard deviation, kurtosis, and skewness. Thus, 33 features were extracted per image set, for a total of 99 parameters from the three sets of images.
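To make the GLCM features concrete, the sketch below computes a co-occurrence matrix and three of the texture features named above (maximum probability, cluster shade, and correlation) using their commonly used definitions. It is a minimal illustration, not the study's pipeline: the single horizontal pixel offset, the three gray levels, the symmetric normalisation, and the toy image are all assumptions, since the text does not state the GLCM settings.

```python
import math


def glcm(image, levels, offset=(0, 1)):
    """Symmetric, normalised gray-level co-occurrence matrix for one pixel offset."""
    dy, dx = offset
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for y in range(rows):
        for x in range(cols):
            ny, nx = y + dy, x + dx
            if 0 <= ny < rows and 0 <= nx < cols:
                a, b = image[y][x], image[ny][nx]
                counts[a][b] += 1  # pair (a, b)
                counts[b][a] += 1  # mirrored pair, making the matrix symmetric
    total = sum(map(sum, counts))
    if total == 0:
        raise ValueError("no pixel pairs found for this offset")
    return [[c / total for c in row] for row in counts]


def glcm_features(p):
    """Maximum probability, cluster shade, and correlation of a normalised GLCM."""
    idx = range(len(p))
    mu_x = sum(i * p[i][j] for i in idx for j in idx)
    mu_y = sum(j * p[i][j] for i in idx for j in idx)
    var_x = sum((i - mu_x) ** 2 * p[i][j] for i in idx for j in idx)
    var_y = sum((j - mu_y) ** 2 * p[i][j] for i in idx for j in idx)
    covariance = sum((i - mu_x) * (j - mu_y) * p[i][j] for i in idx for j in idx)
    # Correlation is undefined for a constant image (zero variance).
    denom = math.sqrt(var_x * var_y)
    return {
        "max_probability": max(max(row) for row in p),
        "cluster_shade": sum((i + j - mu_x - mu_y) ** 3 * p[i][j] for i in idx for j in idx),
        "correlation": covariance / denom if denom > 0 else float("nan"),
    }


# Toy image already quantised to three gray levels (0, 1, 2); hypothetical data.
toy_image = [
    [0, 0, 1],
    [0, 1, 1],
    [2, 2, 2],
]
p = glcm(toy_image, levels=3)
features = glcm_features(p)
```

In practice the matrix would be accumulated only over voxels inside the tumor ROI, and typically over several offsets, before the features are computed.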

Feature selection and classification

A random forest algorithm, implemented as bootstrap-aggregated decision trees, was used to evaluate the importance of these features in differentiating patients with and without P/R [26]. The significance of a feature was assessed as the loss of accuracy after the feature was removed. All features were sorted by importance, and then increasing numbers of features, starting from the top 1, 2, 3, and so on, were used to test classification performance with 10-fold cross-validation. Finally, three imaging features, T1 max probability, T1 cluster shade, and ADC correlation, were selected. A decision tree with five leaves was used to build the final classification model [27]. The decision tree was a binary tree. Since the outcome was categorical, the splits were based on the improvement of cross entropy [27]. For each node of the tree, the cross entropy of the classification results was calculated using the following formula:

$$ \mathrm{Cross}\ \mathrm{Entropy}=-\sum \limits_{i=1}^k{p}_i\log \left({p}_i\right) $$

where k is the number of classes and p_i is the proportion of cases belonging to class i. At each node, the split into child nodes was determined by the splitting threshold that minimized the cross entropy. This procedure was implemented in MATLAB 2018b.
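The threshold search at a single node can be sketched in Python as below. This is an illustration rather than the MATLAB procedure used in the study: it assumes natural logarithms, candidate thresholds at midpoints between consecutive feature values, and a size-weighted average of the two child-node cross entropies as the quantity to minimise. The function names and toy data are hypothetical.

```python
import math


def cross_entropy(labels):
    """Cross entropy -sum(p_i * log(p_i)) of the class proportions in one node."""
    n = len(labels)
    if n == 0:
        return 0.0
    proportions = [labels.count(c) / n for c in set(labels)]
    return -sum(p * math.log(p) for p in proportions)


def best_threshold(values, labels):
    """Threshold on one feature minimising the size-weighted cross entropy of the child nodes."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best_impurity, best_cut = float("inf"), None
    for k in range(1, n):
        if pairs[k - 1][0] == pairs[k][0]:
            continue  # no threshold can separate identical feature values
        cut = (pairs[k - 1][0] + pairs[k][0]) / 2
        left = [lab for _, lab in pairs[:k]]
        right = [lab for _, lab in pairs[k:]]
        impurity = (len(left) * cross_entropy(left) + len(right) * cross_entropy(right)) / n
        if impurity < best_impurity:
            best_impurity, best_cut = impurity, cut
    return best_cut, best_impurity


# Hypothetical feature values and outcomes (1 = P/R, 0 = no P/R); NOT the study data.
feature = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
outcome = [0, 0, 0, 1, 1, 1]
root_entropy = cross_entropy(outcome)
cut, impurity = best_threshold(feature, outcome)
```

Here the root node has two equally frequent classes, and a cut midway between 0.3 and 0.7 separates them completely, so the child-node cross entropy drops to zero.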

Measurement of ADC value

For comparison with the radiomics model in the prediction of P/R in SBM, the ADC value was measured manually by two experienced neuroradiologists, as in the published literature [18,19,20]. The ROI was placed so as to avoid volume averaging with necrosis, calcification, hemorrhage, and cystic regions that might influence the ADC values in SBM (Fig. 3). A circular ROI with an area ranging from 35 to 76 mm² (mean 56 ± 4 mm²) was placed within the tumor area to obtain ADC values. Because interobserver reproducibility was almost perfect, the subsequent statistical evaluation of the ADC value was performed using the mean of the values obtained by the two raters.

Statistical analysis

Statistical analyses were performed using the statistical package SPSS (V.24.0, IBM, Chicago, IL, USA). The Mann-Whitney U test was used to compare the three parameters obtained by the random forest algorithm for differentiation of P/R. The chi-square or Fisher exact test was used to compare the clinical categorical data. Receiver operating characteristic (ROC) analysis was performed for ADC values to discriminate between patients with and without P/R. A p value < 0.05 was considered statistically significant.
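For readers unfamiliar with ROC analysis, the following Python sketch shows one way to obtain an area under the curve and a cut-off value from a single continuous marker. It is not the SPSS procedure used in the study. It assumes that lower values indicate the positive class and that the cut-off is chosen by maximising the Youden index, neither of which is stated in the text, and the ADC-like values are hypothetical.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC as the probability that a positive case scores lower than a negative case."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p < n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half
    return wins / (len(scores_pos) * len(scores_neg))


def youden_cutoff(scores_pos, scores_neg):
    """Cut-off c (predict positive when score <= c) maximising sensitivity + specificity - 1."""
    best_j, best_c = -1.0, None
    for c in sorted(set(scores_pos) | set(scores_neg)):
        sensitivity = sum(s <= c for s in scores_pos) / len(scores_pos)
        specificity = sum(s > c for s in scores_neg) / len(scores_neg)
        j = sensitivity + specificity - 1.0
        if j > best_j:
            best_j, best_c = j, c
    return best_c, best_j


# Hypothetical marker values in units of 10^-3 mm^2/s; NOT the study data.
with_pr = [0.70, 0.78, 0.81, 0.90]
without_pr = [0.80, 0.88, 0.95, 1.02, 1.10]
auc = roc_auc(with_pr, without_pr)
cutoff, youden_j = youden_cutoff(with_pr, without_pr)
```

The pairwise formulation of the AUC is equivalent to the Mann-Whitney U statistic divided by the number of positive-negative pairs.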

Results

Clinical data

The clinical data of the 60 SBM cases included in this study are summarized in Table 1. Twenty-one (21/60, 35%) patients were diagnosed with P/R. Meningothelial meningioma was the most common histological type in both groups, and no significant association was found between histological subtype and P/R (p = 0.86). Although a higher rate of P/R was observed in patients with STR, no statistically significant association was found between the extent of resection and P/R (p = 0.17) (Figs. 2 and 3). Of the 24 patients receiving adjuvant RT, 6 (6/24, 25%) still had P/R during subsequent follow-up. No significant difference in P/R existed between patients with and without adjuvant RT (p = 0.19). The spheno-orbital region was the most common location among SBM with P/R (p = 0.03).

Table 1 The clinical data of SBM with and without progression/recurrence (P/R)
Fig. 2

A 44-year-old woman with pathologically proven sellar meningioma (WHO grade I). a Axial contrast-enhanced T1WI showing an enhancing tumor (green outline) involving the sellar/suprasellar region. The tumor (green outline) is segmented on contrast-enhanced T1WI, and then mapped to b axial T2WI and c axial ADC maps; d coronal contrast-enhanced T1WI showing the sellar/suprasellar enhancing tumor (arrows) with bilateral encasement of the proximal internal carotid arteries, middle cerebral arteries, and anterior cerebral arteries; e gross-total resection was performed, and WHO grade I meningioma was confirmed pathologically; f recurrent tumor at the left clinoid process (arrow) was observed 36 months after surgical resection

Fig. 3

A 46-year-old man with pathologically proven right posterior fossa meningioma (WHO grade I). a Axial T2WI and b axial contrast-enhanced T1WI showing an enhancing tumor (arrow) in the right posterior fossa with involvement of the right transverse sinus; c measured ADC value (circular ROI) was 0.823 × 10⁻³ mm²/s (b = 1000 s/mm²); d coronal contrast-enhanced T1WI showing the enhancing tumor (arrow) arising from the right tentorium with downward extension; e subtotal resection was performed to preserve the right transverse sinus, with residual tumor (arrowheads) in the right tentorium, and WHO grade I meningioma was confirmed pathologically; f progression of the residual tumor (curved arrow) was observed 14 months after surgical resection

Radiomics model and ADC in differentiation of P/R

The three most important parameters selected by the random forest method for differentiation of P/R were T1 maximum probability, T1 cluster shade, and ADC correlation. The performance could not be improved by adding more features. The p values (Mann-Whitney U test) of T1 maximum probability, T1 cluster shade, and ADC correlation between the P/R and non-P/R groups were 0.004, 0.043, and 0.52, respectively (Fig. 4). The final classification results were generated by using the selected thresholds in the decision tree (Fig. 5). The results comprised 18 true positive cases, 36 true negative cases, 3 false positive cases, and 3 false negative cases, with an overall prediction accuracy of 90%. In comparison, an area under the ROC curve (AUC) of 0.88 and a cut-off value of 0.825 × 10⁻³ mm²/s (b = 1000 s/mm²) were obtained for ADC in the prediction of P/R in SBM (Fig. 3). Based on the optimal cut-off point of 0.825 × 10⁻³ mm²/s, the overall accuracy in differentiation of P/R by the ADC value obtained from the manually placed ROI was 83% (10 false prediction cases). The interobserver reliability for ADC values, expressed as the intraclass correlation coefficient, was 0.9 (95% confidence interval 0.88, 0.96).
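The reported accuracy follows directly from the confusion-matrix counts given above. The short Python sketch below reproduces that arithmetic. The sensitivity and specificity it returns are derived here from the reported counts and are not values stated by the authors, and the function name is hypothetical.

```python
def diagnostic_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity, and specificity from confusion-matrix counts."""
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # fraction of P/R cases correctly predicted
        "specificity": tn / (tn + fp),  # fraction of non-P/R cases correctly predicted
    }


# Counts reported for the radiomics decision tree.
radiomics = diagnostic_metrics(tp=18, tn=36, fp=3, fn=3)

# For the manual ADC measurement only the total number of false predictions
# (10 of 60) is reported, so only the accuracy can be reproduced.
adc_accuracy = (60 - 10) / 60
```

With these counts the radiomics model classifies 54 of 60 cases correctly (90%), against 50 of 60 (about 83%) for the manually measured ADC value.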

Fig. 4

Box plot of a T1 maximum probability, b T1 cluster shade, and c ADC correlation in skull base meningiomas with and without progression/recurrence (P/R). Statistical difference (p < 0.05) (Mann-Whitney U test) in T1 maximum probability and T1 cluster shade was observed. Boxes indicate the interquartile range, and whiskers indicate the range. The horizontal line represents the median in each box. Circles represent outliers, defined as distances greater than 1.5 times the interquartile range above the third quartile. The star represents an extreme value, defined as a distance greater than three times the interquartile range below the first quartile or above the third quartile

Fig. 5

The diagnostic decision tree with five leaves to separate patients into P/R and non-P/R groups. The total number of splits is four

Discussion

In this study, we established a system implementing radiomics to predict P/R in SBM. A random forest algorithm was applied to evaluate the importance of the extracted features. Of the three selected features, two were extracted from contrast-enhanced T1WI and one from the ADC map. The overall accuracy in differentiating between P/R and non-P/R groups was 90%, with 6 false prediction cases. No histogram parameters were selected in the final model, suggesting that texture provides more important prognostic information. Although 4 high-grade meningiomas were included in our study, the results were similar after excluding them, with an accuracy of 89.3%.

Although 90% of meningiomas are benign (WHO grade I) tumors, about 21% of these tumors recur within 5 years after surgical resection [4, 5]. The risk factors related to progression of SBM were investigated in several studies, and recurrence rates varying from 13.2 to 56% were reported [1, 28, 29]. In our study, the relatively high rate of P/R (21/60, 35%) may also be caused by the small sample size and selection bias. It is known that the genetic and pathologic mechanisms of SBM and non-skull base meningiomas (non-SBM) are different [30]. Furthermore, reports on the recurrence rates and clinical outcomes of these two disease presentations are inconsistent [1, 29]. Mansouri et al. [1] reported higher recurrence rates in non-SBM. In contrast, Savardekar et al. [29] reported that SBM progressed at a higher rate than non-SBM during the first 10 years of follow-up after surgery. The higher recurrence rate in SBM may be caused by incomplete tumor resection and bone invasion [19, 30]. Since complete surgical resection may result in neurologic complications, prediction of recurrence in SBM is a clinically significant issue for selecting optimal treatment strategies.

Although conventional MR imaging findings related to recurrence in meningiomas have been reported, most imaging data were presented in qualitative and subjective terms [6, 31]. In contrast, MR radiomics can reproducibly extract objective and quantitative data from different imaging sequences to build diagnostic models classifying different types of lesions [12,13,14,15,16,17]. Several authors have reported that MR radiomics provides valuable information for differential diagnosis, tumor staging, prediction of prognosis, and assessment of cancer genetics [12,13,14,15,16,17]. The spatial and temporal texture features of radiomics are known to reflect the compression and destruction of normal brain anatomy by tumor mass, peritumoral edema, tumor cellularity, and degenerative changes, some of which cannot be detected by the human visual system [14,15,16]. Further, some studies reported that texture analysis can reveal visually imperceptible tumor information that extends beyond radiology to histopathology, and that it could be a potentially useful approach for estimating grades and molecular status in brain tumors [14,15,16]. Recently, MR radiomics and machine learning analyses have been employed in meningioma grading [15, 16]. Park et al. [16] reported that radiomics feature-based machine learning classifiers of postcontrast T1-weighted images, ADC, and fractional anisotropy maps were useful for differentiating meningioma grades. Niu et al. [17] found that radiomics features provided satisfactory performance in the preoperative differential diagnosis of meningioma subtypes. Therefore, it is reasonable to expect that radiomics features may play a role in the prediction of recurrence in meningiomas. However, the application of radiomics for predicting clinical outcomes in meningiomas has been reported in only a few studies [32]. To the best of our knowledge, this is the first MR radiomic analysis for preoperative prediction of P/R in SBM.

In this study, we employed random forest to undertake feature selection and then implemented a binary decision tree to build the final classification model. Random forest combines multiple decision trees, with each tree stratifying the feature space into a number of simple non-overlapping regions that maximize classification accuracy. Compared with other feature selection algorithms, such as LASSO and artificial neural networks [23], random forest improves the generalization of the selection process and works better for small datasets. In this study, three features were selected from 99 features. With a small number of features and cases, a binary decision tree can be constructed, and the results can be easily interpreted. Although other classification algorithms such as support vector machines or convolutional neural networks may achieve very high accuracies, they require huge datasets. Moreover, these algorithms are considered “black-box” classifiers, and interpretation of the obtained results is difficult [33]. Although ADC correlation was one of the most important parameters identified by the random forest algorithm in our study, it did not differ significantly between groups in the Mann-Whitney U test. Univariate feature-ranking filters such as the t-test or Mann-Whitney U test do not take into account possible interactions between variables. In contrast, a random forest algorithm, embedded in the estimation of a multivariate predictive model, typically captures those interactions [34]. Although some differences may exist in radiomic analysis between 1.5-T and 3-T MRI scanners, most of our cases (N = 52) were scanned on the 1.5-T MRI scanner. Moreover, an accuracy of 92.3% was obtained after excluding the 8 cases scanned on the 3-T MRI scanner.

A previous study showed that the ADC value measured from a manually placed ROI on the aggressive tumor area could be used to predict P/R in SBM [19]. The ROI was carefully placed to avoid volume averaging with calcification, necrosis, and cystic regions. However, the texture and heterogeneity within the tumor cannot be captured by this manual ROI analysis, and valuable information may be overlooked. In this study, the accuracy for prediction of P/R using the ADC value measured from a manually placed ROI was 83% (10/60 false predictions), which was inferior to the radiomics model, which yielded an accuracy of 90% (6/60 false predictions).

There was a total of six false prediction cases. In the three false positive cases, all involved lesions located in the right sphenoid ridge. Two received GTR and one received STR. None received adjuvant RT. Two had large tumor sizes (maximal diameter 6.8 and 5.6 cm) that exhibited heterogeneous contrast enhancement and uneven ADC mapping. In the three false negative cases, two involved lesions located in the temporal fossa, and all received STR. One patient underwent adjuvant RT. Relatively homogeneous contrast enhancement and consistently low ADC values were seen in all three false negative cases. Further investigation involving a larger sample size is necessary to better understand factors contributing to false positive and false negative predictions.

Mathiesen et al. [35] reported recurrence rates of SBM at 3.5–25% in Simpson Grades I–III resection and 45% in Simpson Grade IV resection. Although it is generally agreed that the extent of surgical resection is an important determining factor in the rate of recurrence [1], Voß et al. [36] recently reported a similar recurrence rate between GTR and STR in 325 SBM. Similarly, no significant difference was observed between the extent of resection and P/R in our study.

Adjuvant RT is known to improve overall survival in high-grade meningiomas, but its role in benign (WHO grade I) meningiomas is still unclear [37]. For patients without evidence of tumor recurrence, adjuvant RT is controversial because it increases risks of complications such as cranial nerve deficits, symptomatic peritumoral edema, internal carotid artery stenosis, and neurologic deficits [38]. With advanced radiomics approaches, aggressive surgical resection combined with postoperative adjuvant RT and close imaging follow up should be considered in patients with high risk factors of P/R; in contrast, for patients with lower possibilities of recurrence, the aim of surgery would be relief of mass effect and clinical symptoms, and adjuvant RT may be performed more conservatively to avoid long-term side effects [37]. Therefore, radiomics approaches offer objective and clinically valuable information for the planning of treatment in SBM.

Our study still had several limitations. The retrospective nature of the study may result in selection bias. All images were acquired at a single site, mostly with a single protocol. Future testing on multi-institutional data and on varying imaging protocols is important in determining whether the trained classifier is generalizable. The implemented radiomics analysis method is straightforward, and it may not fully utilize the information from all images since it is based on pre-defined features. Due to the small number of cases, only a few features can be selected into the classification model to avoid over-fitting. More cases are expected to improve the model performance. Although no statistical significance existed in adjuvant RT between P/R and non-P/R groups, adjuvant RT may alter the independent predictive value of the extracted features for recurrence. More advanced statistical analysis methods that can take all confounding factors into account need to be developed in the future.

Conclusions

To the best of our knowledge, this is the first study attempting to apply MR radiomic analysis to predict P/R in SBM. The results are superior to those of the approach using ADC measured from operator-defined ROIs. Preoperative radiomics offers valuable clinical information for the planning of treatment in SBM, including the extent of tumor resection, implementation of adjuvant RT, and the time interval of imaging follow-up. This approach will need to be validated when more cases with long-term follow-up are available.