Introduction

The eighth edition of the American Joint Committee on Cancer (AJCC) cancer staging system, announced in 2017, reported that the most important change in breast cancer staging is the incorporation of biomarkers that can affect survival (histologic grade, estrogen receptor [ER] or progesterone receptor [PR] status, and human epidermal growth factor receptor 2 [HER2] expression) into anatomic staging to define clinical prognostic stages [1]. Additionally, the eighth edition recognizes the division of breast cancer into four molecular subtypes: luminal A (hormone receptor [HR]-positive, i.e., ER- or PR-positive; HER2-negative; low Ki-67), luminal B (HR-positive, HER2-negative, high Ki-67), HER2 (HER2-positive regardless of HR status), and basal (both HR- and HER2-negative) [1,2,3].

Luminal cancer accounts for 73.2% of all breast cancers [4]. Subgroups (A and B) are classified based on a cutoff value of 14% for Ki-67, a marker of tumor proliferation [5, 6]. The luminal A subgroup accounts for 48.4% of all breast cancers, typically presents with pT1 tumors, is lymph node (LN)-negative, and responds best to endocrine therapy (ET). The luminal B subgroup accounts for 24.8%, more commonly presents with pT2 tumors, and is LN-positive and resistant to ET; thus, these cases benefit from combined chemotherapy and ET [4, 7,8,9]. Besides Ki-67, histologic grade, which reflects tumor differentiation, is another important prognostic factor in luminal cancer. A study reported that patients with histologic grade 1 luminal cancers have a better prognosis than those with histologic grade 3 [10]. Therefore, the AJCC manual combines the histologic grade with anatomic staging to determine the clinical prognostic stage, in which luminal cancers with histologic grade 3 are categorized one stage higher than their anatomic stage [1, 11]. For example, in a patient with an anatomic stage IIA luminal cancer, the clinical prognostic stage is changed to stage IIB when the histologic grade is 3; however, this case would remain as stage IIA if the histologic grade is 1 or 2 [1, 11].

To efficiently select the luminal B cancer patients who may benefit from chemotherapy, many studies have been performed using preoperative MRI features including apparent diffusion coefficient (ADC) at diffusion-weighted imaging (DWI) [12, 13], texture features [14], and imaging features extracted from computer vision algorithms [15]. However, developing elaborate prediction models requires multiparametric MRI (mpMRI) features rather than a partial set of imaging features; 3.0-T mpMRI can provide detailed information about tumors, such as morphology, kinetic features related to angiogenesis, and function at the cellular level via DWI [16]. In the present study, we aimed to investigate whether prediction models based on eight machine learning algorithms using mpMRI features can predict Ki-67 levels and histologic grade in cases of stage I–II luminal cancer.

Materials and methods

This retrospective study was approved by our institutional review board and the requirement for written informed consent was waived.

Study population

A retrospective review of medical records at our institution between November 2013 and April 2019 was performed, and 497 consecutive women were selected who met the following criteria: initially diagnosed with invasive breast cancer via core needle biopsy, confirmed immunohistochemically as having luminal cancer (ER- or PR-positive and HER2-negative), underwent preoperative MRI with DWI followed by surgery, and staged as stage I–II. Among them, 197 patients were excluded for the following reasons: received neoadjuvant chemotherapy (NAC) before surgery (n = 49), no recorded Ki-67 proliferation index (n = 46), underwent previous vacuum-assisted biopsy, vacuum-assisted excision, or excisional biopsy before MRI examination (n = 36), cancer showed no enhancement on CAD, so kinetic features could not be acquired (n = 25), tumor size was too small to perform exact ADC measurements on DWI (n = 24), DWI was suboptimal for analysis (n = 13), and no histologic grade recorded (n = 4). Finally, 300 women (mean age ± standard deviation, 56.6 years ± 12.0; age range, 32–93 years) were enrolled (Fig. 1). After an initial diagnosis of breast cancer by imaging-guided biopsy, MRI was performed. In all cases, no clip was inserted after biopsy. The interval between biopsy and MRI ranged from 1 to 39 days (median: 13 days).

Fig. 1

Flowchart of the study population
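The enrollment arithmetic summarized in the flowchart can be checked with a few lines of Python. This is only a bookkeeping sketch: the counts are those reported in the text, while the dictionary keys and the function name `enrolled_count` are illustrative and not part of the study.

```python
# Exclusion counts as reported in the text (reason -> number of patients).
# The shortened reason labels are illustrative paraphrases.
EXCLUSIONS = {
    "neoadjuvant chemotherapy before surgery": 49,
    "no recorded Ki-67 proliferation index": 46,
    "vacuum-assisted or excisional biopsy before MRI": 36,
    "no enhancement on CAD": 25,
    "tumor too small for exact ADC measurement": 24,
    "suboptimal DWI": 13,
    "no recorded histologic grade": 4,
}


def enrolled_count(screened: int, exclusions: dict) -> int:
    """Return the number of patients remaining after all exclusions."""
    return screened - sum(exclusions.values())


total_excluded = sum(EXCLUSIONS.values())
final_cohort = enrolled_count(497, EXCLUSIONS)
print(total_excluded, final_cohort)  # prints: 197 300
```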

MRI examination

MRI was performed using a 3.0-T scanner (Achieva 3.0T TX; Philips Healthcare) with a dedicated seven-channel breast array coil in the axial orientation. All scans were taken with the patients in the prone position. Bilateral axial T2-weighted images (T2WI) with fat suppression were acquired (repetition time ms/echo time ms, 5375/65; flip angle, 90°; matrix, 620 × 303; field of view, 340 × 340 mm²; section thickness, 3 mm; section gap, 0 mm) and DWI was obtained using an echo-planar-imaging sequence with fat suppression in the axial plane (5417/72; flip angle, 90°; matrix, 96 × 126; field of view, 320 × 320 mm²; section thickness, 3 mm; section gap, 0 mm; b values, 0 and 1000 s/mm²). One pre-contrast and five post-contrast dynamic series were obtained using a three-dimensional T1-weighted fast spoiled gradient-echo sequence with fat suppression (5/2; matrix, 436 × 436; flip angle, 12°; field of view, 340 × 340 mm²; section thickness, 1 mm; no gap) with intravenous injection of a bolus of 0.1 mmol/kg gadoterate (Dotarem; Guerbet).

MRI image analysis

For evaluation of tumor morphology, all MRI images were independently reviewed by two breast radiologists (18 and 9 years of experience), blinded to the patient’s clinical history, using the 2013 Breast Imaging Reporting and Data System MR lexicon [17]. The amounts of fibroglandular tissue (FGT), background parenchymal enhancement (BPE), and the lesion type as mass or non-mass enhancement (NME) were assessed. In patients with multiple lesions, the largest lesion was evaluated. For mass observations, margin, shape, internal enhancement patterns, and associated NME were evaluated. The presence of intratumoral high signal intensity (SI) and peritumoral edema was assessed as absent or present on T2WI [18, 19]. High intratumoral SI on T2WI was visually defined when SI of the lesion was stronger than that of vessels or water or higher than that of the surrounding parenchymal tissue [17,18,19]. Peritumoral edema was also determined when high SI around the tumor was observed on T2WI [18, 19]. Axillary LN enlargement was defined when one or more following findings were found: abnormal LN shape, increased cortical thickness, irregular LN margins, or completely or partially effaced fatty hilum [17].

Kinetic feature analysis

All T1-weighted images were transferred to a CAD system (CADstream, version 6.0, Confirma) and processed for evaluation of kinetic features. As a threshold of 50–60% has been reported as the most appropriate for the CAD system [20], we used a 50% threshold to classify enhancement. A color overlay angio-map was generated at all enhancing lesions above this threshold. After one radiologist selected the largest tumor on the angio-map, the peak enhancement, angio-volume, and early phase and delayed phase profiles were calculated. Within the tumor, the early enhancement phase profiles were described as either medium or rapid enhancement. The delayed enhancement phase profiles were described as persistent, plateau, or washout.
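The kinetic categories described above can be sketched in Python as follows. The text gives only the 50% enhancement threshold. The 100% boundary between medium and rapid early enhancement and the ±10% band separating persistent, plateau, and washout curves are commonly used conventions that are assumed here, not values stated in this study, and the CAD system's internal computation may differ. All function names are hypothetical.

```python
def early_phase(initial_rise_pct: float, threshold_pct: float = 50.0) -> str:
    """Classify early enhancement from the percent signal increase over baseline.

    threshold_pct = 50 is the threshold used in the study; the 100% boundary
    between "medium" and "rapid" is an assumed convention.
    """
    if initial_rise_pct < threshold_pct:
        return "below threshold"
    return "rapid" if initial_rise_pct > 100.0 else "medium"


def delayed_phase(change_from_early_pct: float, band_pct: float = 10.0) -> str:
    """Classify the delayed curve from the percent change after the early phase.

    The +/-10% band is an assumed convention, not a value from the study.
    """
    if change_from_early_pct > band_pct:
        return "persistent"
    if change_from_early_pct < -band_pct:
        return "washout"
    return "plateau"


def component_fractions(delayed_changes: list) -> dict:
    """Percent of enhancing voxels falling in each delayed-phase category."""
    if not delayed_changes:
        raise ValueError("at least one enhancing voxel is required")
    labels = [delayed_phase(c) for c in delayed_changes]
    return {
        name: 100.0 * labels.count(name) / len(labels)
        for name in ("persistent", "plateau", "washout")
    }


# Four hypothetical enhancing voxels (percent change in the delayed phase).
fractions = component_fractions([25.0, 5.0, -15.0, -30.0])
print(fractions)
```

Under these assumptions, the washout component is the share of above-threshold voxels whose delayed signal falls by more than the band.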

DWI analysis

All DWI images were transferred to a CAD system and the ADC maps were constructed using b-values of 0 and 1000 s/mm². On the ADC map, two radiologists independently and manually outlined the regions of interest (ROIs) for each lesion. By using T2WI and angio-map as references, measurements were performed twice to cover the entire tumor in its largest cross-sections and to avoid cystic areas (Fig. 2). The minimum, mean, and maximum ADC values of each ROI were measured. The average value of the first ADC measurements by two reviewers was used for statistical analysis.

Fig. 2

Preoperative images of the left breast in a 55-year-old woman with breast cancer. a Axial T1-weighted contrast-enhanced subtraction MR image shows rim enhancing mass in the left upper breast. b Axial T2-weighted MR image shows intratumoral high signal intensity within the tumor. c Auto-portfolio of CAD system shows tumor enhancement kinetics with 23% washout component. d Apparent diffusion coefficient (ADC) map shows mass with low signal intensity. A region of interest (green) was manually drawn on the CAD system to encompass an entire cross-section of the lesion. Maximum ADC, mean ADC, and minimum ADC were 1.29 × 10⁻³ mm²/s, 0.94 × 10⁻³ mm²/s, and 0.69 × 10⁻³ mm²/s, respectively. Surgical histopathologic examination revealed a 3-cm invasive ductal carcinoma with a histologic grade 3 and Ki-67 proliferation index of 37% (stage II luminal B cancer)
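For readers less familiar with how an ADC value arises from two b-values, the following Python sketch uses the standard monoexponential model, ADC = ln(S_low / S_high) / (b_high − b_low). The text does not state that the CAD system uses exactly this computation, so it is an assumption. The function names and the signal and ROI values are hypothetical.

```python
import math


def adc_two_point(s_low: float, s_high: float,
                  b_low: float = 0.0, b_high: float = 1000.0) -> float:
    """Voxel ADC in mm^2/s from signals at two b-values (s/mm^2).

    Assumes monoexponential signal decay; b = 0 and 1000 s/mm^2 match the
    acquisition described in the text.
    """
    if s_low <= 0 or s_high <= 0:
        raise ValueError("signal intensities must be positive")
    if b_high <= b_low:
        raise ValueError("b_high must exceed b_low")
    return math.log(s_low / s_high) / (b_high - b_low)


def roi_summary(adc_values: list) -> dict:
    """Minimum, mean, and maximum ADC over the voxels of one ROI."""
    return {
        "min": min(adc_values),
        "mean": sum(adc_values) / len(adc_values),
        "max": max(adc_values),
    }


def reader_average(reader1: float, reader2: float) -> float:
    """Average of the first measurements of two readers, as used for analysis."""
    return (reader1 + reader2) / 2.0


# Hypothetical voxel: the signal drops from 1000 to 1000 * exp(-0.94),
# which corresponds to an ADC of 0.94e-3 mm^2/s.
adc = adc_two_point(1000.0, 1000.0 * math.exp(-0.94))
summary = roi_summary([0.7e-3, 0.9e-3, 1.1e-3])  # hypothetical ROI voxels
print(adc, summary, reader_average(0.92e-3, 0.96e-3))
```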

Histopathologic data analysis

Histopathologic data were determined based on the surgically excised tissue. Using the labeled streptavidin-biotin method for immunohistochemical (IHC) technique, ER and PR positivity were defined as the presence of positive staining in at least 1% of the nuclei in ten high-power fields [21]. HER2 negativity was defined as an IHC score of 0 or 1+ or 2+ staining with negative HER2 gene amplification on fluorescence in situ hybridization. Ki-67 expression was graded as low (< 14%) or high (≥ 14%) [5, 6] and histologic grade was dichotomized as low (grade 1 or 2) or high (grade 3).
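The dichotomization rules above can be written compactly in Python. This sketch follows the definitions in the text (Ki-67 cutoff of 14%, grade 3 as high, and HER2 negativity by IHC score with FISH for 2+ cases). The function names and the string encoding of IHC scores are illustrative assumptions.

```python
from typing import Optional


def ki67_group(ki67_pct: float) -> str:
    """Low (< 14%) or high (>= 14%) Ki-67 expression."""
    return "high" if ki67_pct >= 14.0 else "low"


def grade_group(histologic_grade: int) -> str:
    """Low (grade 1 or 2) or high (grade 3) histologic grade."""
    if histologic_grade not in (1, 2, 3):
        raise ValueError("histologic grade must be 1, 2, or 3")
    return "high" if histologic_grade == 3 else "low"


def her2_negative(ihc_score: str, fish_amplified: Optional[bool] = None) -> bool:
    """HER2 negativity: IHC 0 or 1+, or IHC 2+ with negative FISH.

    An IHC 2+ case without a FISH result is not counted as negative here;
    that handling of missing data is an assumption of this sketch.
    """
    if ihc_score in ("0", "1+"):
        return True
    if ihc_score == "2+":
        return fish_amplified is False
    return False


# The case shown in Fig. 2: Ki-67 of 37% and histologic grade 3.
print(ki67_group(37.0), grade_group(3))  # prints: high high
```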

Statistical analysis

The MRI morphologic features, kinetic features, and ADC values were compared between cancers with low Ki-67 and high Ki-67 and in those with low histologic grade and high grade. For categorical variables, the χ² test or Fisher exact test was used. For continuous variables, a Student t test was used for normally distributed variables and the Mann-Whitney U test was used for non-normally distributed variables. To determine the optimal cutoff values of kinetic features and ADC parameters for predicting high Ki-67 and histologic grade, receiver operating characteristic (ROC) curve analysis was performed by using the maximum Youden index (i.e., sensitivity + specificity − 1).
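A minimal sketch of cutoff selection by the maximum Youden index is given below. It is not the software routine used in the study. Candidate cutoffs are taken as midpoints between adjacent observed values, which is an assumption of this sketch. The function name, the `positive_if_greater` parameter, and the data are hypothetical.

```python
def youden_cutoff(values, labels, positive_if_greater=True):
    """Return (cutoff, J, sensitivity, specificity) maximizing the Youden index.

    labels: 1 for the positive class (e.g., high Ki-67), 0 otherwise.
    positive_if_greater: True when larger values suggest the positive class
    (e.g., washout component); False when smaller values do (e.g., mean ADC).
    """
    pos = [v for v, l in zip(values, labels) if l == 1]
    neg = [v for v, l in zip(values, labels) if l == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    unique = sorted(set(values))
    # Candidate cutoffs: midpoints between adjacent observed values (assumed).
    candidates = [(a + b) / 2.0 for a, b in zip(unique, unique[1:])]
    if not candidates:
        raise ValueError("at least two distinct values are required")
    best = None
    for c in candidates:
        if positive_if_greater:
            sens = sum(v > c for v in pos) / len(pos)
            spec = sum(v <= c for v in neg) / len(neg)
        else:
            sens = sum(v < c for v in pos) / len(pos)
            spec = sum(v >= c for v in neg) / len(neg)
        j = sens + spec - 1.0  # Youden index
        if best is None or j > best[1]:
            best = (c, j, sens, spec)
    return best


# Hypothetical washout components (%) for ten tumors; 1 = high Ki-67.
washout = [5, 8, 10, 12, 13, 14, 20, 25, 30, 9]
high_ki67 = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
cutoff, j, sens, spec = youden_cutoff(washout, high_ki67)
print(cutoff, round(j, 2), sens, spec)  # prints: 13.5 0.8 0.8 1.0
```

With integer-valued measurements, midpoint candidates yield half-integer cutoffs such as 13.5%.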

To identify the mpMRI features associated with high Ki-67 and histologic grade, logistic regression analysis was used. For multivariate analysis, we used covariates with p values < 0.05 in the univariate analysis. To build a prediction model based on significant mpMRI features from the univariate and multivariate analyses, eight machine learning algorithms (random forest, adaptive boosting, decision tree, k-nearest neighbor, linear support vector machine, naive Bayes classifier, linear discriminant analysis, and logistic regression) were used. To train the eight machine learning algorithms, 10-fold cross-validation repeated 20 times was applied. The DeLong method was used to compare the AUCs of the eight algorithms in independent data sets.
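The repeated 10-fold cross-validation scheme can be illustrated with a dependency-free Python sketch. It is not the study's scikit-learn pipeline. Stratifying the folds by class, the toy prevalence-weighted scorer standing in for the eight algorithms, and the randomly generated cohort are all assumptions made for illustration, and every function name is hypothetical. The AUC is computed with the rank-based (Mann-Whitney) formulation.

```python
import random


def auc(scores, labels):
    """Rank-based AUC: P(positive score > negative score), ties count 0.5."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def repeated_stratified_folds(labels, n_splits=10, n_repeats=20, seed=0):
    """Yield (train_idx, test_idx) pairs for repeated stratified k-fold CV."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        folds = [[] for _ in range(n_splits)]
        for cls in sorted(set(labels)):
            idx = [i for i, l in enumerate(labels) if l == cls]
            rng.shuffle(idx)
            for k, i in enumerate(idx):  # deal each class out across folds
                folds[k % n_splits].append(i)
        for k in range(n_splits):
            test = sorted(folds[k])
            train = sorted(i for j, f in enumerate(folds) if j != k for i in f)
            yield train, test


def cross_validated_aucs(X, y, fit, predict, n_splits=10, n_repeats=20, seed=0):
    """Fit on each training split and collect the AUC on each held-out fold."""
    aucs = []
    for train, test in repeated_stratified_folds(y, n_splits, n_repeats, seed):
        model = fit([X[i] for i in train], [y[i] for i in train])
        scores = [predict(model, X[i]) for i in test]
        aucs.append(auc(scores, [y[i] for i in test]))
    return aucs


def fit_prevalence_weights(X_train, y_train):
    """Toy model: weight each binary feature by its prevalence difference."""
    weights = []
    for j in range(len(X_train[0])):
        pos = [x[j] for x, l in zip(X_train, y_train) if l == 1]
        neg = [x[j] for x, l in zip(X_train, y_train) if l == 0]
        weights.append(sum(pos) / len(pos) - sum(neg) / len(neg))
    return weights


def predict_score(weights, x):
    """Risk score as the weighted sum of the binary features."""
    return sum(w * v for w, v in zip(weights, x))


# Synthetic cohort (NOT study data): 97 positives, 203 negatives, two binary
# features that are more frequent in the positive class.
rng = random.Random(42)
y = [1] * 97 + [0] * 203
X = [[int(rng.random() < (0.7 if label else 0.3)) for _ in range(2)]
     for label in y]

aucs = cross_validated_aucs(X, y, fit_prevalence_weights, predict_score)
print(len(aucs), sum(aucs) / len(aucs))  # 200 fold-level AUCs and their mean
```

The 200 fold-level AUCs (20 repeats × 10 folds) are the kind of distribution a box plot of model performance would summarize. In scikit-learn, a comparable splitting scheme is available as `RepeatedStratifiedKFold`, although the text does not state which splitter was used.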

Interobserver reliability between two reviewers was assessed using the intraclass correlation coefficient. An r value of 1.0 was deemed to indicate perfect agreement; 0.81–0.99, almost perfect agreement; 0.61–0.80, substantial agreement; 0.41–0.60, moderate agreement; 0.21–0.40, fair agreement; and ≤ 0.20, slight agreement [22]. Statistical analyses were performed using SPSS for Windows, version 20.0 (SPSS Inc.), open-source R software (version 3.5.1; R Foundation for Statistical Computing), and Python (Python Software Foundation, version 3.7.4) with Scikit-learn package (https://github.com/scikit-learn/scikit-learn).
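The agreement categories above map directly onto a small lookup function. The sketch below rounds r to two decimals before binning so that the published ranges, which are stated to two decimals, leave no gaps. That rounding step and the function name are assumptions.

```python
def agreement_category(r: float) -> str:
    """Map a reliability coefficient to the agreement category used in the text."""
    if r > 1.0:
        raise ValueError("r cannot exceed 1.0")
    r2 = round(r, 2)  # bins are defined to two decimals (assumed handling)
    if r2 == 1.0:
        return "perfect"
    if r2 >= 0.81:
        return "almost perfect"
    if r2 >= 0.61:
        return "substantial"
    if r2 >= 0.41:
        return "moderate"
    if r2 >= 0.21:
        return "fair"
    return "slight"


# Example coefficients chosen for illustration.
for r in (0.96, 0.72, 0.53):
    print(r, agreement_category(r))
```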

Results

Patient characteristics and multiparametric MRI features

Of the 300 women, 227 (75.6%) had undergone breast conservation surgery and 73 (24.4%) had undergone total mastectomy. The median pathologic invasive tumor size was 1.5 cm (range, 0.6–4.7 cm) and axillary LN metastasis was confirmed in 25.6% (77 of 300). The range of Ki-67 within this group spanned from 0.1 to 56.6% (median, 12.0%); 203 (67.7%) had a Ki-67 of less than 14%; and 97 (32.3%) had a Ki-67 of 14% or higher. Histologic grade 1 was observed in 91 patients, grade 2 in 151 patients, and grade 3 in 58 patients, and 242 (80.7%) were histologic grade 1 or 2 while 58 (19.3%) were histologic grade 3.

Compared to luminal cancers with low Ki-67, those with high Ki-67 more frequently demonstrated intratumoral high SI on T2WI, larger tumor size, larger angio-volume, smaller persistent component, and larger washout component (all p < .05). Compared to luminal cancers with low histologic grade, those with high histologic grade more frequently demonstrated rim enhancement, intratumoral high SI on T2WI, axillary LN enlargement, larger tumor size, larger angio-volume, smaller persistent component, larger washout component, and lower ADC values (all, p < .05) (Table 1).

Table 1 Associations between multiparametric MRI features and Ki-67 and histologic grade

The optimal cutoff values of kinetic and ADC parameters

Based on the results of the ROC curve analysis, the optimal cutoff values for predicting high Ki-67 were as follows: (1) tumor size, 1.75 cm (area under the curve [AUC], 0.60; p = .005); (2) angio-volume, 1.35 cm³ (AUC, 0.58; p = .019); (3) persistent component, 37.5% (AUC, 0.58; p = .019); and (4) washout component, 13.5% (AUC, 0.64; p < .001). The optimal cutoff values for predicting high histologic grade were as follows: (1) tumor size, 1.85 cm (AUC, 0.66; p < .001); (2) angio-volume, 1.45 cm³ (AUC, 0.67; p < .001); (3) persistent component, 33.5% (AUC, 0.67; p < .001); (4) washout component, 15.5% (AUC, 0.73; p < .001); and (5) mean ADC value, 0.945 × 10⁻³ mm²/s (AUC, 0.61; p = .008).

Features associated with high Ki-67 and histologic grade

Univariate analysis demonstrated that high Ki-67 was associated with intratumoral high SI on T2WI, tumor size > 1.75 cm, angio-volume > 1.35 cm³, persistent component < 37.5%, and washout component > 13.5%. In multivariate analysis, washout component > 13.5% (odds ratio [OR] = 4.16; p < .001) and intratumoral high SI on T2WI (OR = 1.89; p = .018) were each associated with high Ki-67 (Table 2).

Table 2 Univariate and multivariate logistic regression analysis of features associated with high Ki-67

In univariate analysis, intratumoral high SI on T2WI, rim enhancement, axillary LN enlargement, tumor size > 1.85 cm, angio-volume > 1.45 cm³, persistent component < 33.5%, washout component > 15.5%, and mean ADC < 0.945 × 10⁻³ mm²/s were associated with high histologic grade. Multivariate analysis revealed that washout component > 15.5% (OR = 7.22; p < .001), rim enhancement (OR = 2.59; p = .022), and mean ADC < 0.945 × 10⁻³ mm²/s (OR = 2.47; p = .015) were each associated with high histologic grade (Table 3).

Table 3 Univariate and multivariate logistic regression analysis of features associated with high histologic grade

Diagnostic performance of prediction models

The diagnostic performances of prediction models using the eight machine learning algorithms are summarized in Table 4. Among the eight methods using significant features from univariate analysis, two algorithms had the highest diagnostic performance for predicting high Ki-67 (AUC: 0.71) and k-nearest neighbor had the highest performance for predicting high histologic grade (AUC: 0.83). When using significant features from multivariate analysis, six algorithms had the highest performance for predicting high Ki-67 (AUC: 0.70) and the naive Bayes classifier had the highest performance for predicting high histologic grade (AUC: 0.79) (Fig. 3). Figure 4 shows box plots of the diagnostic performance of the eight machine learning methods.

Table 4 AUCs of eight machine learning models using significant features acquired from univariate and multivariate logistic regression analysis for the prediction of high Ki-67 and histologic grade
Fig. 3

Graphs show areas under receiver operating characteristic curve (AUCs) of eight machine learning algorithms using significant features acquired from univariate analysis for the prediction of higher Ki-67 (a) and higher histologic grade (b), and those using significant features acquired from multivariate analysis for the prediction of higher Ki-67 (c) and higher histologic grade (d). RF random forest, AB adaptive boosting, DT decision tree, kNN k-nearest neighbor, SVM linear support vector machine, NB naive Bayes, LDA linear discriminant analysis, LGR logistic regression

Fig. 4

Box-and-whisker plots illustrate the diagnostic performances of eight machine learning (ML) algorithms using significant features acquired from univariate analysis for the prediction of higher Ki-67 (a) and higher histologic grade (b), and those using significant features acquired from multivariate analysis for the prediction of higher Ki-67 (c) and higher histologic grade (d). RF random forest, AB adaptive boosting, DT decision tree, kNN k-nearest neighbor, SVM linear support vector machine, NB naive Bayes, LDA linear discriminant analysis, LGR logistic regression

Interobserver agreement

The overall agreement for ADC values between two readers was almost perfect (r = 0.96). For MRI morphologic imaging features, the overall agreement for FGT (r = 0.62), BPE (r = 0.65), mass internal enhancement (r = 0.63), associated NME (r = 0.61), intratumoral T2 high SI (r = 0.72), and axillary LN enlargement (r = 0.78) was substantial, while that for mass shape (r = 0.53), mass margin (r = 0.57), and peritumoral edema (r = 0.56) was moderate.

Discussion

Our study revealed that, among various mpMRI features, the CAD-measured washout component is key for predicting both higher Ki-67 and higher histologic grade in patients with stage I–II luminal breast cancer. A machine learning-based prediction model incorporating mpMRI features shows good diagnostic performance for predicting Ki-67 and histologic grade, with AUCs of 0.70 and 0.79, respectively.

Washout kinetics was the most important feature for this detailed prediction model. The results of our study are concordant with a prior study that indicated that the washout curve shape may predict a higher level of Ki-67 and histologic grade [23]. The impact of washout kinetics on poorer survival outcomes has previously been reported [24, 25]. One study found that a higher washout component was significantly associated with poorer disease-free survival [24], and another showed that patients with a smaller reduction in washout component after NAC had worse recurrence-free survival [25]. The most probable explanation for this is that hypervascular tumors may have a higher washout component indicating rapid perfusion because they have higher angiogenic markers, which could facilitate the spread of tumor cells into the body [26, 27]. Our study suggests that higher washout component could predict unfavorable prognosis due to its association with higher Ki-67 and histologic grade in luminal cancer [4].

The second-most influential mpMRI feature for predicting higher Ki-67 was the presence of intratumoral high SI on T2WI, not the ADC value. This result is similar to a recent study that found no correlation between mean ADC and Ki-67, while mean Ki-67 significantly increased with the amount of very high SI on T2WI, indicating tumor necrosis [28]. A possible explanation is that breast cancers with high proliferation (i.e., high Ki-67) may exceed the oxygen supply of their vascular system, resulting in prolonged hypoxia and subsequent necrosis with decreased cellularity [29]. Although controversy still remains regarding the correlation between ADC values and Ki-67 [14, 30,31,32], our results support that intratumoral high SI on T2WI may be a more appropriate predictor for Ki-67 than the ADC value.

In contrast to Ki-67, higher histologic grade in luminal cancer can be predicted by a lower ADC value. Previous studies also reported an inverse correlation between ADC value and histologic grade [31, 32]. However, another multicenter analysis of 870 breast cancer patients suggested that ADC cannot be used as a surrogate marker for Ki-67 and histologic grade [30]. Whereas the prior study attempted to distinguish grade 2 or 3 tumors from grade 1 tumors [30], we tried to distinguish grade 3 tumors from grade 1 or 2 tumors, because grade 3 luminal cancers are categorized one stage higher according to the eighth edition AJCC guidelines. From our results, we can assume that a higher histologic grade, which means less tubule formation and higher mitotic activity, could reflect higher cellularity, which results in lower ADC in luminal cancer.

Rim enhancement is a significant predictor of higher histologic grade [23, 32]. In luminal cancer subtype, the impact of rim enhancement on poorer distant metastasis-free survival has been reported [33]. In addition, a radiogenomic approach reported that a high calculated score indicating rim enhancement was correlated with early occurrence of metastasis and certain RNA phenotypes [34].

To date, machine learning algorithms have been applied to build prediction models incorporating clinical or imaging data to further improve accuracy [35, 36]. In breast cancer patients, the integration of machine learning-based prediction models using mpMRI features has the potential to provide valuable information on predicting NAC treatment outcomes or risk of recurrence, and predicting which BI-RADS 3 and BI-RADS 4 lesions are benign [35,36,37,38]. Similar to the study of Eun et al. [37], we used eight machine learning algorithms in our study to provide the best model to fit the input data and to predict correctly. Through comparison of the diagnostic performance of the eight machine learning algorithms, we found that models using the two or three significant features from multivariate analysis achieved performance similar to that of models using the five or eight significant features from univariate analysis for predicting Ki-67 and histologic grade.

Our study had several limitations. First, this was a retrospective study from a single tertiary academic institution. Second, we excluded 25 lesions without enhancement on CAD and 24 lesions whose lesion visibility on DWI was too poor to acquire exact ADC measurements. Because the DWI acquisition resolution was low, it was difficult to measure ADC values in some subcentimeter lesions with poor lesion visibility. Therefore, selection bias may have been present in patient enrollment. Third, we did not analyze histopathological findings such as tumor necrosis and cellularity using the surgical specimens, which could have been important to support our findings. Fourth, the visual assessment of the presence of intratumoral high SI or peritumoral edema could be affected by postbiopsy hemorrhage because all MRI examinations were performed after imaging-guided biopsy. Fifth, we did not divide enrolled patients into test and validation sets. Instead, 10-fold cross-validation was used to train the eight machine learning algorithms to test the model’s ability to predict correctly and to provide an insight into how the prediction model will generalize to an independent dataset.

In conclusion, higher values of CAD-measured washout component and a presence of high intratumoral SI on T2-weighted image were associated with high Ki-67, and higher values of CAD-measured washout component, rim enhancement, and lower apparent diffusion coefficient value were correlated with high histologic grade in luminal cancer subtypes. Our results provide an understanding about the relationship between luminal cancer biology and mpMRI features. In addition, the integration of machine learning algorithms using mpMRI features can provide important predictive information to guide treatment decisions or determine the clinical prognosis for patients with stage I–II luminal cancers.