Introduction

Magnetic resonance imaging (MRI) is a sensitive modality for assessing breast lesions and is currently one of the major breast imaging examinations. To standardize breast MRI reports worldwide, the Breast Imaging Reporting and Data System (BI-RADS) for MRI recommends the use of assessment categories (0–6) that reflect the likelihood of cancer [1]. Category 4 is assigned when a breast lesion does not fulfill the typical criteria of malignancy but is suspicious and needs pathological investigation with invasive procedures; it covers a wide range of probabilities of malignancy, extending from > 2% to < 95%. BI-RADS for mammography and ultrasound subdivides this category into 4A, 4B, and 4C, which represent low (> 2%, ≤ 10%), moderate (> 10%, ≤ 50%), and high (> 50%, < 95%) probabilities of malignancy, to give more graded stratification and increase clinical utility, but BI-RADS for MRI still does not [1].

There is a broad spectrum of histopathologic results in category 4 with a substantial overlap in imaging findings between benign and malignant lesions [2], which causes difficulty in subcategorization. However, several studies have demonstrated the correlation between imaging findings and the likelihood of malignancy; for example, irregular shape, spiculated margins, and rim enhancement of mass lesions; and segmental distribution, heterogeneous enhancement, and clustered ring enhancement of non-mass lesions are known to suggest malignancy [3,4,5,6,7]. Liberman et al. showed that increasing lesion size correlates with the likelihood of malignancy [8]. Diffusion-weighted imaging (DWI) is also known for its utility in distinguishing between malignant and benign lesions, though it is not yet included in the BI-RADS lexicon [9,10,11,12].

At our institution, all category 4 breast lesions on MRI are classified into subcategories 4A, 4B and 4C by board-certified breast radiologists. There is, however, limited evidence on how successful this subcategorization is. This retrospective study aimed to estimate the positive predictive value (PPV) for malignancy of each category and subcategory in a single tertiary hospital, and to examine the clinical impact of category 4 subcategorization.

Materials and methods

Study population

This retrospective analysis was approved by the institutional review board of our institution with a waiver of informed consent. We included MRI scans obtained at our tertiary hospital with reports based on a standard protocol using T2-weighted images (T2WI), T1-weighted images (T1WI), DWI, and dynamic contrast-enhanced (DCE) MR images. MRI report databases were searched for studies with findings classified as BI-RADS category 2–6 from July 2015 to December 2016. Categories were allocated per lesion. Exclusion criteria were lesions identified after chemotherapy, no confirmation of malignancy or benignity, and postoperative examinations.

From 391 breast DCE-MRI examinations with 496 lesions, 62 category 6 lesions from 45 patients were excluded. Consequently, 346 breast DCE-MRI examinations with 434 category 2–5 lesions were identified in the designated time period. Indications for MRI were inconclusive findings on other imaging modalities (n = 317), follow-up for suspected benign lesions (n = 19), screening (n = 8), and others (n = 2).

MRI protocol

All examinations were performed using 3 T scanners [MAGNETOM Skyra (41 exams with 52 lesions) or Prisma (305 exams with 382 lesions), Siemens Healthcare GmbH, Erlangen, Germany] and dedicated 16-channel or 18-channel bilateral breast coils. Each patient received 0.2 mL/kg gadoteridol (ProHance, Eisai, Tokyo, Japan) or 0.1 mL/kg gadobutrol (Gadovist, Bayer Healthcare, Berlin, Germany) at a rate of 2.0 mL/s intravenously, followed by 20 mL of saline at the same rate. Our standard MRI protocols included T2WI [axial orientation; 2D-turbo spin echo with fat suppression; repetition time/echo time (TR/TE), 5500/79 ms; field of view (FOV), 330 × 330 mm; matrix, 448 × 336; thickness, 3.0 mm], T1WI [axial orientation; volumetric interpolated breath-hold examination (VIBE); TR/TE, 5.14/2.46 ms; FOV, 330 × 330 mm; matrix, 384 × 319; thickness, 2.5 mm], DWI [axial orientation; single-shot echo planar imaging; TR/TE, 9200/57 ms (Skyra) or 6300–6600/43–50 ms (Prisma); FOV, 330 × 185 mm; matrix, 162 × 92; thickness, 3.0 mm; number of excitations, 3; b = 0 and 1000 s/mm²], DCE-MRI (axial orientation; VIBE with fat suppression; TR/TE, 3.84/1.43 ms; FOV, 330 × 330 mm; matrix, 384 × 384; thickness, 1.0 mm) before and at 0–1 min, 1–2 min, and 5–6 min after contrast injection, and high-resolution CE-MRI (coronal orientation; VIBE with fat suppression; TR/TE, 4.59/1.80 ms; FOV, 330 × 330 mm; matrix, 512 × 461; thickness, 0.8 mm) at 2–5 min after contrast injection.

Image analysis

All enhancing lesions except background parenchymal enhancement were prospectively classified into category 2 (0% probability of malignancy), 3 (> 0%, ≤ 2%), 4 (> 2%, < 95%) or 5 (≥ 95%), and category 4 lesions were further subcategorized into 4A, 4B and 4C at the time of diagnosis by one of two radiologists, each with > 10 years of experience in breast MRI diagnosis. In accordance with BI-RADS mammography [1], category 4A was used for lesions that needed biopsy but had a low suspicion of malignancy (> 2%, ≤ 10% probability of malignancy); category 4B was used for lesions with a moderate suspicion of malignancy (> 10%, ≤ 50%); and category 4C was used for findings with a high suspicion of malignancy that were not as highly suggestive of malignancy as category 5 (> 50%, < 95%). Lesion types included focus, mass, and non-mass enhancement (NME). A focus is characterized by its small size, generally smaller than 5 mm, although BI-RADS does not define a size criterion.

The final categorization/subcategorization was determined by the radiologist reporting the specific breast MRI, with reference to the published PPVs of particular MRI descriptors [3,4,5,6,7,8] and based on a comprehensive analysis of all the findings: morphology, kinetics, and signal intensity on T1WI, T2WI, DWI and DCE-MRI. Larger size, irregular shape with non-circumscribed margins of masses, rim enhancement of masses, segmental distribution and heterogeneous or clustered ring enhancement of NME, low apparent diffusion coefficient (ADC) values on DWI, and washout kinetics led to a higher probability of malignancy, whereas smaller size, round or oval shape with circumscribed margins of masses, NME with associated cysts, high ADC values on DWI, and persistent delayed enhancement led to a lower probability of malignancy. The radiologists were allowed to refer to mammograms, ultrasound images and available clinical information. Suspicious calcification on mammograms or a strong family history of breast cancer could be grounds for a higher category, whereas a normal-sized intramammary lymph node could be confirmed on ultrasound and assigned to category 2.

Radiological and pathological reports of each case were retrospectively analyzed, and the PPV and tissue biopsy-proven positive predictive value (PPV3) of each category were calculated. The PPV was calculated as the number of malignant lesions divided by the number of lesions assigned to each category. The PPV3 was calculated as the number of malignant lesions divided by the number of lesions with a tissue diagnosis obtained through biopsy or operation. For lesions in category 4 or lower, data on the presence of a malignant lesion (category 5 or 6) in the ipsilateral breast were collected to examine the effect of ipsilateral malignancy. Malignant lesions were pathologically diagnosed within 2 months of the MR examination. Benign lesions were diagnosed pathologically or confirmed by stability or shrinkage over 2 years’ follow-up.
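
As a minimal sketch of the two definitions above, the following Python snippet computes PPV and PPV3 from lesion counts. The function names `ppv` and `ppv3` are illustrative and do not come from the study. The counts are those reported for category 4 in the Results (211 lesions assigned, 166 with tissue diagnosis, 64 malignant); the PPV3 value is derived here from those counts rather than quoted from the text.

```python
def ppv(malignant: int, assigned: int) -> float:
    """PPV: malignant lesions divided by all lesions assigned to the category."""
    if assigned <= 0:
        raise ValueError("assigned must be a positive count")
    return malignant / assigned


def ppv3(malignant: int, tissue_diagnosed: int) -> float:
    """PPV3: malignant lesions divided by lesions with a tissue diagnosis."""
    if tissue_diagnosed <= 0:
        raise ValueError("tissue_diagnosed must be a positive count")
    return malignant / tissue_diagnosed


# Category 4 counts as reported in the Results section
cat4_ppv = ppv(malignant=64, assigned=211)
cat4_ppv3 = ppv3(malignant=64, tissue_diagnosed=166)

print(f"Category 4 PPV:  {cat4_ppv:.1%}")   # 30.3%
print(f"Category 4 PPV3: {cat4_ppv3:.1%}")  # 38.6%
```

Because lesions confirmed benign by follow-up alone enter the PPV denominator but not the PPV3 denominator, PPV3 is never lower than PPV for the same category.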

Results

A total of 434 category 2–5 lesions found on 346 breast MRI examinations in patients with a mean age of 53 years (range, 21–83 years) were included in the study. Among the 434 lesions, 149 were malignant and 285 were benign. The lesion types included 24 foci, 239 masses, and 171 NME. The distribution of patients’ ages and lesion types in each category is shown in Table 1.

Table 1 Distribution of patients’ ages and lesion types

The number of lesions assigned to category 4 was 211, including 147 benign and 64 malignant lesions. Among them, 166 (102 benign and 64 malignant) were diagnosed through ultrasound- or mammography-guided biopsy. Among the 147 benign lesions assigned to category 4, 45 lesions (31 category 4A and 14 category 4B lesions) were not diagnosed through biopsy for the following reasons: the lesion was previously diagnosed as benign (n = 4), diagnosed as probably benign through fine needle aspiration (n = 1), decreased in size after treatment for abscess (n = 1), or absence of suspicious finding on ultrasound (n = 39).

None of the lesions classified as category 2 or 3 were diagnosed as cancer, with a PPV of 0%. The PPVs of category 4 and 5 lesions were 30.3% and 100%, respectively. One of 55 category 4A lesions was diagnosed as malignant, for a PPV of 1.8%. Nine of 76 category 4B lesions were diagnosed as malignant, for a PPV of 11.8%, and 54 of 80 category 4C lesions were diagnosed as malignant, for a PPV of 67.5% (Table 2, Fig. 1). Representative cases of subcategory 4A, 4B and 4C lesions are shown in Figs. 2, 3, 4.
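
The subcategory PPVs above can be reproduced from the reported counts and compared with the BI-RADS mammography target ranges cited in the Introduction. The following is a minimal sketch; the names `counts`, `targets`, `in_target` and `summary` are illustrative only and are not part of the study.

```python
# (malignant, total) per category 4 subcategory, as reported above
counts = {"4A": (1, 55), "4B": (9, 76), "4C": (54, 80)}

# Target ranges in percent: (lower bound, upper bound, upper bound inclusive?)
# 4A: > 2, <= 10; 4B: > 10, <= 50; 4C: > 50, < 95. Lower bounds are exclusive.
targets = {
    "4A": (2.0, 10.0, True),
    "4B": (10.0, 50.0, True),
    "4C": (50.0, 95.0, False),
}


def in_target(ppv_pct: float, low: float, high: float, high_inclusive: bool) -> bool:
    """Return True if ppv_pct lies inside the target range."""
    upper_ok = ppv_pct <= high if high_inclusive else ppv_pct < high
    return ppv_pct > low and upper_ok


summary = {}
for sub, (malignant, total) in counts.items():
    ppv_pct = 100.0 * malignant / total
    summary[sub] = (round(ppv_pct, 1), in_target(ppv_pct, *targets[sub]))
    print(f"{sub}: {malignant}/{total} = {ppv_pct:.1f}%, within target: {summary[sub][1]}")

# Pooled category 4 PPV from the same counts
total_malignant = sum(m for m, _ in counts.values())
total_lesions = sum(n for _, n in counts.values())
print(f"Category 4 overall: {total_malignant}/{total_lesions}")  # 64/211
```

With these counts, 4B and 4C fall inside their target ranges, while 4A (1.8%) falls just below the lower bound of > 2%.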

Table 2 Number of malignant and benign cases with PPVs
Fig. 1 Number of malignant and benign lesions by each category/subcategory

Fig. 2 Axial MRI of the left breast in a 56-year-old female with a non-mass lesion detected incidentally on MRI obtained for examination of another lesion. This non-mass lesion was 18 mm in diameter with linear contour, heterogeneous enhancement, and plateau kinetics and was assigned to category 4A. Histopathology revealed usual ductal hyperplasia. There was a mass lesion assigned to category 5 in the ipsilateral breast which was diagnosed as invasive carcinoma of no special type. a T2WI with fat suppression. b DWI (b = 1000 s/mm²). c Apparent diffusion coefficient (ADC) map. d T1WI. e Dynamic contrast-enhanced MRI (1–2 min from contrast injection). f Dynamic contrast-enhanced MRI (5–6 min from contrast injection). g Time–signal intensity curve in a circular region of interest of 3 mm in diameter inside the lesion

Fig. 3 Axial MRI of the left breast in a 41-year-old female, performed for detailed examination of calcification on mammography. A non-mass lesion 18 mm in diameter with focal distribution, heterogeneous enhancement, and plateau kinetics was detected and assigned to category 4B. Core needle biopsy was performed and revealed no malignancy. a T2WI with fat suppression. b DWI (b = 1000 s/mm²). c ADC map. d T1WI. e Dynamic contrast-enhanced MRI (1–2 min from contrast injection). f Dynamic contrast-enhanced MRI (5–6 min from contrast injection). g Time–signal intensity curve in a circular region of interest of 3 mm in diameter inside the lesion

Fig. 4 Axial MRI of the left breast in a 65-year-old female, performed for detailed examination of calcification on mammography. A non-mass lesion 35 mm in diameter with segmental distribution, clustered ring enhancement, and washout kinetics was detected and assigned to category 4C. Histopathological diagnosis was high-grade ductal carcinoma in situ. a T2WI with fat suppression. b DWI (b = 1000 s/mm²). c ADC map. d T1WI. e Dynamic contrast-enhanced MRI (1–2 min from contrast injection). f Dynamic contrast-enhanced MRI (5–6 min from contrast injection). g Time–signal intensity curve in a circular region of interest of 3 mm in diameter inside the lesion

Table 3 shows the number of malignant and benign cases with PPVs and PPV3s by category/subcategory and lesion type. The PPVs of mass lesions and NME assigned to category 4B were 2.4% and 23.5%, respectively. No focus included in this study was proven to be malignant. BI-RADS lexicon features of lesions assigned to subcategories 4A–4C are shown in Table 4. There were no significant differences in the distribution of lexicon features between the subcategories except for the margin and internal enhancement of mass lesions and the distribution of NME: mass lesions with circumscribed margins or dark internal septations were mainly assigned to subcategory 4A, whereas mass lesions with rim enhancement were rarely assigned to subcategory 4A, and NME with segmental distribution was mainly assigned to subcategory 4C.

Table 3 Number of malignant and benign cases with PPVs by category/subcategory and lesion type
Table 4 BI-RADS lexicon features in breast lesions assigned to subcategory 4A–4C

Of the 211 category 4 lesions, 21 coexisted with category 5 or 6 lesions in the same breast. Among them, 17 lesions were malignant, yielding a PPV of 81.0%. Three of the 21 lesions were assigned to category 4A, one of which was diagnosed as malignant, for a PPV of 33.3%. Two were assigned to category 4B, one of which was diagnosed as malignant, for a PPV of 50%. The remaining 16 were assigned to category 4C, 15 of which were diagnosed as malignant, for a PPV of 93.8%. The PPVs for categories 4A, 4B, and 4C were higher in lesions coexisting with category 5 or 6 lesions within the same breast than in isolated lesions (Table 5).
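
The comparison with isolated lesions can be sketched by subtracting the coexisting-lesion counts from the overall subcategory counts reported earlier in the Results. The variable and function names below are illustrative, and the isolated-lesion counts are derived here by subtraction from the numbers in the text, not quoted from Table 5.

```python
# (malignant, total) for all category 4 lesions, by subcategory
all_lesions = {"4A": (1, 55), "4B": (9, 76), "4C": (54, 80)}

# Subset coexisting with an ipsilateral category 5 or 6 lesion
coexisting = {"4A": (1, 3), "4B": (1, 2), "4C": (15, 16)}

# Isolated lesions = all lesions minus coexisting lesions (derived, see lead-in)
isolated = {
    sub: (all_lesions[sub][0] - coexisting[sub][0],
          all_lesions[sub][1] - coexisting[sub][1])
    for sub in all_lesions
}


def ppv_pct(malignant: int, total: int) -> float:
    """PPV in percent for a (malignant, total) pair."""
    return 100.0 * malignant / total


for sub in all_lesions:
    co_m, co_n = coexisting[sub]
    iso_m, iso_n = isolated[sub]
    print(f"{sub}: coexisting {co_m}/{co_n} ({ppv_pct(co_m, co_n):.1f}%), "
          f"isolated {iso_m}/{iso_n} ({ppv_pct(iso_m, iso_n):.1f}%)")

# Pooled counts across the three subcategories
co_total = tuple(sum(v[i] for v in coexisting.values()) for i in (0, 1))
iso_total = tuple(sum(v[i] for v in isolated.values()) for i in (0, 1))
print(f"All category 4: coexisting {co_total[0]}/{co_total[1]}, "
      f"isolated {iso_total[0]}/{iso_total[1]}")
```

Under this subtraction, the isolated counts are 0/52 for 4A, 8/74 for 4B, and 39/64 for 4C, so the PPV is higher in the coexisting group for every subcategory. The coexisting 4A and 4B groups contain only three and two lesions, so their PPVs are very imprecise.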

Table 5 Number of malignant and benign cases of category 4 lesions with and without ipsilateral malignancy

Discussion

This study aimed to determine the utility of category 4 subdivision for breast MRI from an observational database using real-world data. Our results demonstrate that the PPVs for categories 4A, 4B, and 4C were 1.8%, 11.8%, and 67.5%, respectively, meaning that subcategorization provides graded risk stratification and that category 4C lesions are significantly more likely to be malignant than category 4A and 4B lesions. The PPV of category 4 lesions with ipsilateral category 5 or 6 lesions was 81.0%, higher than that of isolated lesions. These results suggest that, among category 4 lesions, assignment to category 4C or coexistence with category 5 or 6 lesions alerts clinicians to a higher likelihood of malignancy, which facilitates more informed treatment decisions.

Biopsy is generally required for category 4 lesions, but the sensitivity of tissue biopsy is not 100%. If the biopsy result is benign, the follow-up interval or the need for repeat biopsy should be determined according to the likelihood of malignancy of the lesion, which is estimated mainly from imaging. BI-RADS mammography suggests 6-month or routine follow-up after a benign tissue diagnosis for category 4A lesions, and some risk-tolerant patients with category 4A lesions may even choose to decline biopsy because malignant results are not expected [1]. Considering that MRI has relatively low specificity and that 77% of MRI findings that require biopsy (i.e., category 4 or 5) turn out to be benign [13], risk-tolerant patients with a low likelihood of malignancy on MRI may also choose careful follow-up instead of invasive biopsy, as with mammography.

Another important issue regarding biopsy is how to deal with MRI-detected category 4 lesions. Where MRI-guided biopsy is uncommon, as in Japan, breast tissue biopsy is often performed under ultrasound or mammographic guidance, but not all MRI-detected lesions are visible on these imaging modalities. Subdivision of category 4 can convey stratified levels of the likelihood of malignancy, which helps patients and clinicians to determine the indication for biopsy.

BI-RADS MRI states that assessment category 4 is not currently divided into subcategories [1], owing to the paucity of data on the feasibility and accuracy of subdivision. Our results show that the distribution of each BI-RADS lexicon feature does not differ much between subcategories, which implies that combinations of lexicon features or other findings may contribute substantially to subdivision in clinical settings. There may also be factors other than MRI findings to consider, such as the baseline risk represented by family history or genetic mutations, and the concordance with mammography or ultrasound images. Strigel et al. showed the feasibility of category 4 subdivision in a study of high-risk patients undergoing screening MRI [14]. Our current results provide data on subdivision of breast MRI in routine clinical practice, suggesting the feasibility of subdivision.

There are several studies of category 4 subdivision using scoring systems based on DCE-MRI alone or combined with T2WI and/or DWI [11, 12, 15, 16]. Fujiwara et al. proposed a grading system for breast mass descriptors of morphology and kinetic features [12], and Asada et al. proposed a grading system for NME descriptors of morphology (internal enhancement and distribution) [15]. Both successfully stratified lesions by the likelihood of malignancy, although the PPVs for subcategories 4A and 4B were higher than the target ranges for BI-RADS mammography and ultrasound. Almeida et al. added the signal intensity on T2WI to the scoring system and achieved category 4 subdivision with a slightly higher PPV for subcategory 4A and PPVs within the target ranges for subcategories 4B and 4C, and achieved even better diagnostic performance with the addition of ADC measurements [11]. Simple scoring systems may also help to generalize subcategories regardless of the readers’ experience [17]. These results would help to establish formal criteria for subcategory classification, although some issues remain, such as the baseline risk or exceptional findings such as bloody discharge.

Our results show that the malignancy rate varies depending on the lesion type. Among category 4B lesions, in particular, NME is approximately 10 times more likely to be malignant than mass lesions; the PPV of mass lesions assigned to category 4B is only 2.4%, which is within the range of category 4A. We may tend to assign a higher category to masses than to NME. Lesions classified as foci may have a considerably lower probability of malignancy, as no focus was proven to be malignant in the current results.

Although there was a large overlap of BI-RADS lexicon features among the subcategories, mass lesions with circumscribed margins or dark internal septations tended to be assigned to subcategory 4A, and mass lesions with rim enhancement tended to be assigned to subcategory 4B or 4C. In addition, NME with segmental distribution was mainly assigned to subcategory 4C. Previous studies have shown the importance of non-circumscribed margins and rim enhancement of mass lesions [3, 18, 19] and segmental distribution of NME for differentiating malignant from benign lesions [6]. Dark internal septation is a benign feature [1, 18]. Our subcategorization might reflect the probability of malignancy estimated from these lexicon features. Along with these morphologic features, delayed washout enhancement is known to be a feature suggestive of malignancy [20]; however, kinetic features did not differ much among our subcategories, as most category 4 lesions in our study showed washout kinetics.

In our results, the PPVs of category 4A, 4B and 4C lesions coexisting with category 5 or 6 lesions within the same breast were 33.3%, 50% and 93.8%, respectively, all higher than those of isolated lesions. The PPV of these category 4A lesions exceeds the pre-defined range, while those of subcategories 4B and 4C are within the ranges defined in BI-RADS. In previous studies, as many as 44–75% of suspicious lesions on MRI were reported to be malignant in breasts harboring synchronous cancer [13, 21]. For suspicious findings that coexist with typical or known breast cancer, it may be necessary to assign a higher category than that inferred solely from the imaging findings.

The PPVs of category 3 and subcategory 4A are lower than the pre-defined PPVs based on BI-RADS. This result implies the possibility of unnecessary follow-up or biopsy and indicates a need for improvement, as short-term follow-up is recommended for category 3 and biopsy is recommended for category 4 [1]. Our categorization is based on lesions’ morphology, kinetics, signal intensity on T1WI, T2WI, DWI and DCE-MRI, and other available clinical information. Revealing the contribution of each finding may improve prediction of malignancy. Further consideration is also needed regarding the diagnosis and management of category 4A lesions.

This study has some limitations. First, this was a retrospective single-site study. Second, classification criteria were not clearly defined and depended on the radiologist’s decision, so the current results are difficult to apply directly to other facilities. Generalizing category 4 subdivision requires further analysis with evaluation by multiple readers, with more effort to clarify diagnostic criteria and possibly the development of a different scoring system or decision tree. Third, categorization of a specific lesion might have been affected by the presence of an ipsilateral malignant-looking lesion. Fourth, some foci or NME might not have been classified separately from adjacent cancer when tissue diagnosis was not recommended, in accordance with BI-RADS. Fifth, MR scanners and coils were not uniform across the examinations.

In conclusion, category 4 lesions can be classified into three subcategories depending on the likelihood of malignancy. The PPVs of lesions in each subcategory were within or close to the pre-defined ranges. Subcategorization may increase the clinical utility of categorization, especially when determining the indications for biopsy. Category 4 lesions coexisting with category 5 or 6 lesions are more likely to be malignant than isolated lesions.