Introduction

Endometrial cancer (EC) is the most common gynaecologic cancer worldwide, with ever-increasing incidence and disease-related mortality [1]. EC is a heterogeneous disease comprising multiple histological subtypes, including endometrioid, serous, and clear cell carcinomas. The endometrioid type constitutes approximately 85% of EC cases [2]. Due to its complex biological features and distinct heterogeneity, EC has wide variability in its treatment response and prognosis [3]. Genomic and transcriptomic analyses have shown that 25–30% of ECs present with a high frequency of somatic mutations in mismatch repair genes [4]. Cancers with a defective DNA mismatch repair (dMMR) system accumulate thousands of mutations, most frequently located in monomorphic microsatellites, causing microsatellite instability (MSI).

Considerable evidence has suggested that MSI status has emerged as an important prognostic factor associated with shorter overall and disease-free survival in patients with EC [5, 6]. Furthermore, MSI contributes significantly to the high objective response rate to anti-programmed death protein 1 (PD-1) therapy [7]. Preclinical and clinical studies have shown that the anti-PD-1 monoclonal antibody pembrolizumab demonstrated robust antitumour activity and encouraging survival outcomes in MSI/dMMR EC [8, 9]. Testing for MSI in all EC samples has been recommended in the National Comprehensive Cancer Network (NCCN) [10] and European Society for Medical Oncology (ESMO) [11] guidelines. The MSI status is assessed by immunohistochemical (IHC) or pentaplex polymerase chain reaction (PCR) assays on tumour tissue. However, these procedures are costly and invasive. Furthermore, the limited amount of tissue obtained through biopsy can be insufficient, making diagnosis challenging [12]. In addition, it is difficult to monitor the MSI status when tissue is not available for immunohistochemical or mutational analysis [13]. Therefore, a noninvasive, cost-effective, and convenient approach to stratifying MSI status in EC is needed.

Radiomics offers an objective and reproducible approach to personalised medicine in oncologic practice. Studies have suggested that radiomics features can capture tissue and lesion characteristics that reflect genomic and biological variations [14, 15]. MRI-based radiomics has been confirmed to be a useful tool for stratifying preoperative risk [16] and predicting lymphovascular space invasion [17] and pelvic lymph node metastasis [18] in EC patients. In contrast to those with microsatellite stability (MSS), tumours with MSI/dMMR show a higher incidence of heterogeneity because of the hypermutator phenotype [19]. These internal changes could be captured by radiomics. A prior study demonstrated a significant correlation between a contrast-enhanced computed tomography (CECT)-based radiomics signature and MSI status in EC [13]. However, it is difficult to delineate EC on CT due to the ill-defined borders of the tumours. In addition, previous studies have demonstrated that radiomics features from pretreatment MRI are associated with MSI status and that radiomics analysis greatly contributes to MSI status identification in rectal cancer [20, 21]. The aim of this study was to evaluate the performance of the radiomics analysis of multisequence MRI in predicting MSI status in patients with EC.

Materials and methods

Patients

In this retrospective multicentre study, a total of 307 patients from centre I and 172 patients from centre II with postoperative pathology-confirmed EC who underwent preoperative contrast-enhanced MRI between January 2017 and July 2021 were initially retrieved. The MRI sequences included T2-weighted imaging (T2WI), diffusion-weighted imaging (DWI), and fat-saturation contrast-enhanced T1-weighted imaging (CET1WI). The exclusion criteria were as follows: (1) therapy prior to the MRI (n = 30); (2) an interval between MRI examination and surgery of more than 30 days (n = 16); (3) no MSI status information (n = 56); (4) insufficient tumour size with diameters less than 0.5 cm (n = 20); and (5) poor image quality (n = 19). Finally, 227 patients from centre I and 111 patients from centre II were enrolled. The data from centre I were randomly divided into the training cohort (37 MSI, 123 MSS) and internal validation cohort (15 MSI, 52 MSS) at a ratio of 7:3, and the data from centre II were designated the external validation cohort (30 MSI, 81 MSS). Figure 1 shows the flowchart for the patient selection and study design. Clinical characteristics, including age, body mass index (BMI), cancer antigen 125 (CA 125), tumour size (maximum axial diameter of the tumour measured on the axial T2W image), histological subtype, and International Federation of Gynaecology and Obstetrics (FIGO) stage, were obtained for each patient from the archived medical records. Ethical approval was granted by the medical ethics committees of both participating centres, and the requirement for informed consent was waived.
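As a quick consistency check on the patient flow described above, the counts can be tallied in a few lines. This is only a bookkeeping sketch: the numbers are copied from the paragraph, while the dictionary and function names are illustrative and not part of the study.

```python
# Counts copied from the Patients paragraph; all names are illustrative.
retrieved = {"centre I": 307, "centre II": 172}
excluded = {
    "therapy prior to MRI": 30,
    "MRI-to-surgery interval > 30 days": 16,
    "no MSI status information": 56,
    "tumour diameter < 0.5 cm": 20,
    "poor image quality": 19,
}
enrolled = {"centre I": 227, "centre II": 111}
cohorts = {
    "training": {"MSI": 37, "MSS": 123},
    "internal validation": {"MSI": 15, "MSS": 52},
    "external validation": {"MSI": 30, "MSS": 81},
}


def cohort_size(name):
    """Total number of patients (MSI + MSS) in the named cohort."""
    return sum(cohorts[name].values())


total_retrieved = sum(retrieved.values())
total_excluded = sum(excluded.values())
total_enrolled = sum(enrolled.values())

# Share of centre I patients assigned to training (the text states a 7:3 split).
training_fraction = cohort_size("training") / enrolled["centre I"]
```

The retrieved total minus the exclusions matches the enrolled total, and the training cohort holds roughly 70% of the centre I patients, in line with the stated ratio.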

Fig. 1

Flowchart of patient selection and study design. MSI microsatellite instability; MSS microsatellite stability

Identification of MSI status

Tumour microsatellite status was determined by examining the expression of four MMR proteins in the hysterectomy specimen following standard IHC protocols [22]. MMR protein expression was tested by using primary antibodies against MLH1, MSH2, MSH6, and PMS2. Tumours were classified according to clinical practice as MSI when one or more MMR proteins were negatively expressed and MSS when all MMR proteins were positively expressed [23]. The details of the IHC procedure are shown in Supplementary S1.
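The classification rule above can be written as a small function. This is a minimal sketch of the stated rule only (MSI when at least one of the four MMR proteins is not expressed, MSS when all four are expressed); the function name and the boolean encoding of the IHC result are assumptions made for illustration.

```python
MMR_PROTEINS = ("MLH1", "MSH2", "MSH6", "PMS2")


def classify_microsatellite_status(expression):
    """Return "MSI" or "MSS" from IHC results.

    `expression` maps each MMR protein name to True (expressed) or
    False (expression lost). This encoding is an illustrative assumption.
    """
    missing = [p for p in MMR_PROTEINS if p not in expression]
    if missing:
        raise ValueError("missing IHC result for: " + ", ".join(missing))
    # MSS only when all four proteins are expressed; otherwise MSI.
    if all(expression[p] for p in MMR_PROTEINS):
        return "MSS"
    return "MSI"


example = {"MLH1": False, "MSH2": True, "MSH6": True, "PMS2": False}
status = classify_microsatellite_status(example)
```

Raising an error on an incomplete panel, rather than defaulting to MSS, mirrors the exclusion of patients without MSI status information.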

MRI acquisition and segmentation

All patients underwent pelvic contrast-enhanced MRI scans in the two centres covering the entire region of the tumour. The sequences included T2WI, DWI with a b value of 1000 s/mm², and fat-saturation CET1WI with the administration of contrast medium (gadopentetate dimeglumine or gadodiamide, at a dose of 0.1 mmol/kg). The details of the MRI scanners and acquisition parameters are listed in Table 1.

Table 1 MRI acquisition parameters

The tumour region of interest (ROI) was manually segmented by a radiologist using ITK-SNAP software (version 3.8.0; www.itksnap.org) on axial T2WI, DWI, and delayed-phase CET1WI separately. Our preliminary study in Supplementary S2 demonstrated that three-dimensional (3D) ROIs (multiple slices covering the whole tumour volume) and two-dimensional (2D) ROIs (a single slice at the largest cross-sectional diameter of the tumour) achieved similar performance in the prediction of MSI status in EC. Thus, a 2D ROI was applied in this study (Fig. 2). Another radiologist performed ROI segmentation on a randomly selected group of 50 patients. Both radiologists were blinded to the histopathology results of the patients. The interobserver variability of the radiomics features was assessed by calculating the intraclass correlation coefficient (ICC).
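The ICC form used is not stated in the text. As an illustration only, the sketch below computes the two-way random-effects, absolute-agreement, single-rater form, ICC(2,1), for one feature measured by several readers; the choice of this form, the function name, and the example values (taken from the classic Shrout and Fleiss demonstration data, not from this study) are all assumptions.

```python
def icc_2_1(ratings):
    """ICC(2,1) for a list of rows, one row per subject, one column per rater.

    Assumed form: two-way random effects, absolute agreement, single rater.
    """
    n = len(ratings)      # subjects (patients)
    k = len(ratings[0])   # raters (radiologists)
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Illustrative data only: 6 subjects rated by 4 raters.
demo = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
demo_icc = icc_2_1(demo)
```

In a radiomics setting each feature would get its own ICC from the two radiologists' values on the 50 re-segmented patients, and the value would then be compared with the chosen stability threshold.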

Fig. 2

Examples of manual lesion segmentation in a patient with endometrial cancer. Axial T2-weighted imaging (T2WI) with manually segmented regions of endometrial cancer a, axial diffusion-weighted imaging (DWI) (b = 1000 s/mm²) b, and axial contrast-enhanced T1-weighted imaging (CET1WI) c

Image preprocessing and feature extraction

Before feature extraction, image preprocessing was performed as follows: (1) resampling voxels to 1 mm × 1 mm × 1 mm for T2WI and CET1WI and 1.5 mm × 1.5 mm × 1.5 mm for DWI using BSpline interpolation; (2) grey-level normalisation by z score, with values limited to within 3 standard deviations of the mean; (3) grey-level discretisation with a bin width of 5; and (4) application of Laplacian of Gaussian (LoG, σ: 3, 5) and wavelet (HH, HL, LH, LL) image filters. Seven classes of Imaging Biomarker Standardisation Initiative-based features were extracted from the original and filtered images using the PyRadiomics package [24]: (1) 18 first-order (histogram) features; (2) 14 shape features; (3) 24 grey-level co-occurrence matrix (GLCM) features; (4) 16 grey-level run-length matrix (GLRLM) features; (5) 16 grey-level size zone matrix (GLSZM) features; (6) 5 neighbouring grey-tone difference matrix (NGTDM) features; and (7) 14 grey-level dependence matrix (GLDM) features. From the three sequences, 1980 features (660 × 3) were obtained from each patient.
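Steps (2) and (3) above can be illustrated on a flat list of intensities. This is a simplified sketch, not the PyRadiomics implementation: it assumes a population standard deviation, clipping of z scores at ±3, and the fixed-bin-width rule floor(x / W) − floor(min / W) + 1. Any rescaling applied between the two steps in the actual pipeline is not described in the text and is omitted here; all function names are illustrative.

```python
import math
import statistics


def zscore_clip(values, n_sd=3.0):
    """Z-score intensities and clip them to [-n_sd, +n_sd]."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # population SD (an assumption)
    if sd == 0:
        return [0.0 for _ in values]
    return [max(-n_sd, min(n_sd, (v - mean) / sd)) for v in values]


def discretise_fixed_bin_width(values, bin_width):
    """Map intensities to integer grey levels using a fixed bin width."""
    offset = math.floor(min(values) / bin_width)
    return [math.floor(v / bin_width) - offset + 1 for v in values]


# Illustrative intensities: one extreme value among a flat background.
intensities = [0.0] * 11 + [100.0]
normalised = zscore_clip(intensities)

# Illustrative discretisation with a bin width of 5.
grey_levels = discretise_fixed_bin_width([0, 4.9, 5, 12], bin_width=5)
```

With a fixed bin width, the number of grey levels depends on the intensity range of each ROI, which is why normalisation before discretisation matters for comparability across scanners.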

Feature selection and model building

First, the ComBat harmonisation method was applied to realign feature distributions computed from different MRI scanners and protocols [25,26,27]. Second, the ICC was used to assess the stability of the radiomics features, and features with ICC > 0.75 were retained. Third, univariate analysis was performed to select important features using the Wilcoxon rank-sum test (P < 0.05). Finally, the Boruta wrapper algorithm was applied to select the key radiomics features that were most relevant to the MSI status. Based on the selected radiomics features, radiomics models were built with three machine learning classification algorithms, logistic regression (LR), random forest (RF), and support vector machine (SVM), to predict the MSI status in EC. Considering that MSI is more common in endometrioid carcinomas than in non-endometrioid carcinomas [28], the performance of the radiomics model in the prediction of the MSI status in endometrioid carcinoma was also evaluated separately.
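The univariate step of this pipeline can be sketched as follows. The code implements a two-sided Wilcoxon rank-sum test with the large-sample normal approximation (tie-corrected, no continuity correction), which may differ slightly from the exact test an R routine would use for small samples. The function names, the feature names, and the toy values are hypothetical; ComBat and Boruta are not reproduced here.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for t in range(i, j + 1):
            ranks[order[t]] = shared_rank
        i = j + 1
    return ranks


def rank_sum_p_value(group_a, group_b):
    """Two-sided P value from the normal approximation to the rank-sum test."""
    n1, n2 = len(group_a), len(group_b)
    n = n1 + n2
    pooled = list(group_a) + list(group_b)
    w = sum(average_ranks(pooled)[:n1])
    expected = n1 * (n + 1) / 2
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(c ** 3 - c for c in counts.values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        return 1.0
    z = (w - expected) / math.sqrt(variance)
    return math.erfc(abs(z) / math.sqrt(2))


def univariate_filter(features, labels, alpha=0.05):
    """Keep feature names whose MSI (1) and MSS (0) values differ at alpha."""
    kept = []
    for name, values in features.items():
        msi = [v for v, y in zip(values, labels) if y == 1]
        mss = [v for v, y in zip(values, labels) if y == 0]
        if rank_sum_p_value(msi, mss) < alpha:
            kept.append(name)
    return kept


# Hypothetical toy data: 5 MSI followed by 5 MSS patients.
toy_labels = [1] * 5 + [0] * 5
toy_features = {
    "separating_feature": [6, 7, 8, 9, 10, 1, 2, 3, 4, 5],
    "noise_feature": [1, 3, 5, 7, 9, 2, 4, 6, 8, 10],
}
selected = univariate_filter(toy_features, toy_labels)
```

Features passing this filter would then go to the multivariate Boruta step, which judges relevance jointly rather than one feature at a time.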

Statistical analysis

All statistical analyses were performed with R software (version 4.1.0, www.r-project.org). The differences in variables were evaluated by the Mann‒Whitney U test, chi-square test and Kruskal‒Wallis H test. Principal component analysis (PCA) was performed to evaluate the impact of ComBat on feature uniformisation. Z-scores were used to standardise all radiomics features. The receiver operating characteristic (ROC) curves were plotted, and the best cut-off point was obtained by the Youden index in the training cohort and applied in both validation cohorts. The area under the ROC curve (AUC), accuracy, sensitivity, and specificity were used to assess the performance of the models. The differences in AUC values between radiomics models were compared using the DeLong test. Decision curve analysis (DCA) was conducted to determine the clinical usefulness of the models by quantifying the net benefits at different threshold probabilities.
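The cut-off procedure described above (choose the threshold that maximises the Youden index J = sensitivity + specificity − 1 in the training cohort, then reuse it unchanged in validation) can be sketched as follows. The scores and labels are made-up illustrative values, and the function names are not from the study.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a positive case outscores a negative one."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(scores, labels, cutoff):
    """Classify score >= cutoff as positive (MSI) and return both rates."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    return tp / n_pos, tn / n_neg


def youden_cutoff(scores, labels):
    """Return (cutoff, J) maximising J; ties keep the lowest cutoff."""
    best_cut, best_j = None, -1.0
    for cut in sorted(set(scores)):
        sens, spec = sensitivity_specificity(scores, labels, cut)
        j = sens + spec - 1
        if j > best_j:
            best_cut, best_j = cut, j
    return best_cut, best_j


# Illustrative training scores (1 = MSI, 0 = MSS).
train_scores = [0.1, 0.2, 0.3, 0.35, 0.6, 0.7, 0.8, 0.9]
train_labels = [0, 0, 0, 1, 0, 1, 1, 1]
train_auc = roc_auc(train_scores, train_labels)
cutoff, j_stat = youden_cutoff(train_scores, train_labels)

# The training cut-off is applied as-is to an illustrative validation set.
val_scores = [0.2, 0.4, 0.5, 0.9]
val_labels = [0, 1, 0, 1]
val_sens, val_spec = sensitivity_specificity(val_scores, val_labels, cutoff)
```

Fixing the cut-off on the training data avoids optimistic bias in the reported validation sensitivity and specificity.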

LR was performed with the ‘base’ package. The RF model was performed using the ‘randomForest’ package with 500 trees and a node size of 3. SVM was accomplished by the ‘e1071’ package with the Radial Basis Function kernel.
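For reference, the net benefit that decision curve analysis plots at a threshold probability p_t is commonly defined as TP/n − (FP/n) × p_t / (1 − p_t). The sketch below computes it for a model and for the "treat all" reference strategy; the function names and example values are illustrative assumptions, not the study's data.

```python
def net_benefit(probabilities, labels, threshold):
    """Net benefit of treating patients whose predicted probability >= threshold."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    tp = sum(1 for p, y in zip(probabilities, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probabilities, labels) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Reference strategy: every patient is classified as positive."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(labels) / len(labels)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Illustrative predicted probabilities and true labels (1 = MSI, 0 = MSS).
demo_probabilities = [0.9, 0.7, 0.4, 0.2]
demo_labels = [1, 1, 0, 0]
model_nb = net_benefit(demo_probabilities, demo_labels, threshold=0.5)
treat_all_nb = net_benefit_treat_all(demo_labels, threshold=0.5)
```

A decision curve repeats this calculation over a grid of thresholds; a model is considered useful over the range where its curve lies above both the "treat all" and "treat none" (net benefit 0) references.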

Results

Patient clinical characteristics

The clinical characteristics of the training cohort, internal validation cohort, and external validation cohort are summarised in Table 2. The prevalence of MSI was 23.13% (37/160) in the training cohort, 22.39% (15/67) in the internal validation cohort, and 27.03% (30/111) in the external validation cohort. There were no significant differences in MSI prevalence among the three cohorts (P = 0.704). In terms of distinguishing MSI from MSS, no notable differences in the clinical characteristics were found in the three cohorts (all P > 0.05), except for BMI in the training cohort (P = 0.011).
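The prevalence figures and the between-cohort comparison can be re-derived from the counts above. The sketch assumes that a Pearson chi-square test on the 3 × 2 table of cohort by MSI status was the test used for this comparison, which the text does not state explicitly. With 2 degrees of freedom the chi-square survival function has the closed form exp(−x/2), so no statistics library is needed. Function and variable names are illustrative.

```python
import math

# MSI and MSS counts per cohort, copied from the text.
cohort_counts = {
    "training": (37, 123),
    "internal validation": (15, 52),
    "external validation": (30, 81),
}

prevalence = {
    name: msi / (msi + mss) for name, (msi, mss) in cohort_counts.items()
}


def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


def chi_square_p_value_df2(statistic):
    """Upper-tail P value of a chi-square statistic; valid only for df = 2."""
    return math.exp(-statistic / 2)


statistic, df = chi_square_independence([list(c) for c in cohort_counts.values()])
p_value = chi_square_p_value_df2(statistic)
```

Under these assumptions the statistic is small and the P value is about 0.70, consistent with the reported absence of a significant difference in MSI prevalence among the cohorts.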

Table 2 Patient clinical characteristics in the training, internal validation, and external validation cohorts

Feature selection and model establishment

After harmonisation with ComBat, the differences in radiomics features caused by interscanner variability were removed (Supplementary S3). Of the 1980 radiomics features extracted from T2WI, DWI, and CET1WI, 1576 features showed ICC > 0.75 in the interobserver analysis. Of these, 149 features were retained after univariate analysis using the Wilcoxon rank-sum test. Finally, 9 radiomics features, including 5 from T2WI (T2WI_wavelet.LH_firstorder_Maximum; T2WI_wavelet.LH_GLDM_Low Grey Level Emphasis; T2WI_wavelet.HH_GLSZM_Zone Entropy; T2WI_wavelet.HL_GLSZM_Grey Level Non Uniformity Normalized; T2WI_original_GLCM_Inverse Variance), 2 from DWI (DWI_wavelet.HH_GLCM_Informational Measure of Correlation 1; DWI_wavelet.LL_GLSZM_Small Area Low Grey Level Emphasis), and 2 from CET1WI (CET1WI_log.sigma.5.0_firstorder_10th Percentile; CET1WI_wavelet.LH_GLCM_Cluster Shade), were selected by Boruta to establish the radiomics models. The ICC and the distribution of selected features are shown in Supplementary S4. With the 9 optimal radiomics features, prediction models were built using the three different classifiers, LR, RF, and SVM, in the training and validation cohorts.

Performance of the radiomics models in the prediction of MSI status in endometrial cancer

After application of ComBat, a higher MSI prediction performance was achieved in the training, internal validation, and external validation cohorts (Fig. 3). In predicting MSI, the SVM model significantly outperformed the LR (AUC: 0.921 vs. 0.691, P = 0.001) and RF models (AUC: 0.921 vs. 0.794, P = 0.002) in the training cohort. Validation of the model also confirmed this finding. In the internal validation cohort, the SVM model yielded the highest AUC (0.903), followed by the LR (AUC: 0.729, P = 0.004) and RF models (AUC: 0.763, P = 0.021). In the external validation cohort, the SVM model also significantly outperformed the LR (0.937 vs. 0.624, P < 0.001) and RF models (0.937 vs. 0.702, P < 0.001). The DCA results showed that the SVM model achieved more clinical utility in predicting MSI than the LR and RF models if the threshold probability was in the range of 0.581–0.783. The detailed accuracy, sensitivity, specificity, and AUC of these predictive models are summarised in Table 3. The ROC curve analysis and the DCA results for the radiomics models are presented in Fig. 4.

Fig. 3

Areas under the ROC curve of radiomics features in the prediction of microsatellite instability (MSI) status with and without ComBat harmonisation in the training cohort a, internal validation cohort b, and external validation cohort c. The P value shown in each figure represents the DeLong test result

Table 3 Performance of radiomics models with logistic regression, random forest, and support vector machine algorithms in predicting the MSI status in endometrial cancer

Fig. 4

Areas under the ROC curve of the logistic regression (LR), random forest (RF), and support vector machine (SVM) classifiers in the prediction of microsatellite instability (MSI) status of endometrial cancer (EC) in the training cohort a, internal validation cohort b, and external validation cohort c. The decision curves in the training cohort d, internal validation cohort e, and external validation cohort f showed that the SVM classifier achieved more clinical utility in predicting MSI in EC than the LR and RF classifiers

Performance of the radiomics model in predicting MSI status in endometrioid carcinoma

For the prediction of MSI status in endometrioid carcinoma only, SVM also achieved good performance (Fig. 5). The AUC values were 0.914 for the training cohort and 0.902 and 0.940 for the internal validation and external validation cohorts, respectively. Table 4 presents the details of the accuracy, sensitivity, and specificity of the SVM radiomics model.

Fig. 5

Areas under the receiver operating characteristic curve of the support vector machine (SVM) radiomics model in the prediction of the microsatellite instability (MSI) status in endometrioid carcinoma

Table 4 Performance of the support vector machine radiomics model in predicting the MSI status in endometrioid carcinoma

Discussion

This study demonstrated that multisequence MRI-based radiomics features were able to predict the MSI status in patients with EC, and that the SVM classifier outperformed the other classifiers. A recent study by Veeraraghavan et al. suggested that radiomics features derived from CECT provided an adjunct tool for identifying dMMR/MSI ECs, with an AUC, sensitivity, and specificity of 0.78, 0.67, and 0.77, respectively [13]. In our study, the MRI radiomics model with the SVM classification algorithm achieved the highest diagnostic performance, with an AUC of 0.921 for the prediction of MSI status in EC patients in the training cohort. The multisequence data from the relatively large, multicentre sample may reveal more comprehensive information about tumour heterogeneity. Considering that the inherent intensity variability across multicentre MRIs may hamper the reproducibility and performance of the radiomics features, ComBat harmonisation was applied to improve the prediction model’s performance. Further validation of the model was performed using internal and external validation cohorts and resulted in AUCs of 0.903 and 0.937, respectively. These results indicate that this computer-based model could be an effective tool to predict the MSI status in patients with EC.

Tumour segmentation, feature extraction, and selection are key steps in radiomics analysis. This study was carried out on manually drawn 2D ROIs. Previous radiomics studies in oropharyngeal squamous cell carcinoma [29] and non-small cell lung cancer [30] reported that 2D segmentation achieved comparable or even higher performance than 3D segmentation. Our preliminary study also showed that 2D and 3D ROIs had a similar capacity for predicting MSI status in EC. As a result, although 3D segmentation may provide higher dimensional information about the tumours, 2D ROIs, with their lower complexity and reduced time consumption, are preferable. In feature selection, the Boruta algorithm was used to select a set of the most relevant features. Compared to selecting univariately predictive and nonredundant features, Boruta can improve the generalisability and interpretability of selected features [31]. Finally, 9 radiomics features, including seven texture features, were retained. The GLCM_Cluster Shade, GLCM_Informational Measure of Correlation 1 and GLCM_Inverse Variance features quantify certain parameters of cell structure such as textural uniformity and complexity; GLDM_Low Grey Level Emphasis is a measure of the distribution of low grey-level values; GLSZM_Grey Level Non Uniformity Normalized and GLSZM_Small Area Low Grey Level Emphasis measure, respectively, the variability of grey-level intensity values in the image and the proportion in the image of the joint distribution of smaller size zones with lower grey-level values. Texture features are known to demonstrate intratumour heterogeneity and subtle alterations by quantifying the spatial relationships and interactions between pixel intensities [32]. They have been recognised as potential markers for grading brain glioma [33] and predicting survival in patients with metastatic colon cancer [34]. In accordance with previous studies of colorectal cancer, the texture features are the most frequent radiomics features significantly related to MSI status [20, 21]. Consistent with the existing evidence in the literature, our study showed that MRI-based radiomics offers a way to characterise the difference in intratumour heterogeneity between MSS and MSI in EC.

This study applied three classification algorithms, including LR, RF, and SVM, to build the prediction models. Among these classifiers, SVM was the most effective algorithm. The SVM algorithm is very effective at recognising subtle patterns in complex datasets due to its ability to minimise classification errors on unseen data without prior assumptions made on the probability distribution of the data [35]. The main advantage of SVM is the ability to model moderate nonlinearities [36]. Previous studies have reported that SVM had the highest prediction efficacy in differentiating lung malignancies from benign lesions [37] and glioblastoma from solitary brain metastases [38]. However, the best classification algorithms may vary for different clinical applications. Yin et al. [39] reported that the generalised linear model classifier with the least absolute shrinkage and selection operator had the best performance in differentiating sacral chordoma and sacral giant cell tumours. Mao et al. [40] demonstrated that LR performed best in the preoperative classification of primary and metastatic liver cancer. Thus, further studies are warranted to explore the optimal algorithm for different clinical applications.

Our study has several limitations. First, the MR images were retrospectively obtained from different scanners and acquisition protocols, which may influence the generalisability of the results. Nevertheless, the ComBat harmonisation method was applied to realign feature distributions. Second, the ROIs were manually delineated in this study, and manual tracing is resource intensive. Semiautomatic and automatic ROI methods will be explored in future studies. Third, ICC analysis was performed to reduce the variability caused by the manual delineation of ROIs on each sequence separately. The variability in areas and volumes across the different sequences was not assessed. Finally, our study only focused on the primary EC lesions. Considering that the Food and Drug Administration has approved pembrolizumab in the recurrent setting of EC, it is also important to validate the results in recurrent lesions.

In conclusion, a radiomics model based on multisequence MRI showed good performance in predicting MSI status in EC. This model could serve as a potential tool for evaluating the pathological features of this disease to guide clinicians in EC management.