Introduction

Breast cancer is a heterogeneous disease that poses a major threat to women’s lives worldwide. To overcome the heterogeneous malignant potential of breast cancer, a good biomarker may contribute to more precise treatment strategies. Several genomic signature studies of breast cancer have identified various distinct prognostic markers or predictors [1]. However, the currently available gene signatures are limited to hormone receptor-positive (HR +) and human epidermal growth factor receptor 2-negative (HER2−) breast cancers. The National Comprehensive Cancer Network (NCCN) Guidelines (version 3.2021-March 29 2021) recommend that only a proliferation-related marker, Oncotype Dx, has both prognostic and predictive value with regard to chemotherapy in HR + /HER2- breast cancers [2]. Oncotype Dx identifies patients who are more likely to show distant recurrence but have a good response to chemotherapy, thereby offering strong evidence for the development of treatment plans [3,4,5]. On the other hand, for HER2 + or triple-negative (TN) breast cancers, there is no clinically available genomic assays for consideration of adjuvant chemotherapy to optimize treatment strategies from the NCCN Guideline. Clinically, decision-making for adjuvant chemotherapy for HER2 + and TN breast cancer patients is based on the classic clinical and pathological information (tumor size and nodal or distant metastatic status).

Accumulating evidence has suggested that immune genomic signatures (IGSs) can be used to predict clinical outcomes or response to chemotherapy in HER2 + and TN breast cancer [6]. However, previous studies evaluating the predictive value of IGSs for prognosis and response to chemotherapy have been performed in different databases, and the results differed accordingly [7,8,9,10,11,12,13]. Therefore, we aimed to identify optimal biomarkers for response to chemotherapy and prognosis for HR– patients by evaluating both prediction results in the same patient cohort. The aim of this analysis was to directly asses and compare five previously reported IGSs: Ascierto (7), Schmidt (8), Bianchini (9), TILsGS (13) and IRSN-23 (10) in a single cohort of breast cancer patients who received uniform neoadjuvant chemotherapy and were followed up for survival. Accordingly, we performed various assessments based on different breast cancer subtypes and evaluated pairwise consistency in IGSs.

Materials and methods

Patient cohorts and gene expression data

We retrieved a single dataset of 508 breast cancer patients that contained gene expression data and clinical information, including both response to neoadjuvant chemotherapy and prognosis after surgery. Complete gene expression data for primary breast cancer are available in the Gene Expression Omnibus (GEO: https://www.ncbi.nlm.nih.gov/geo/) under accession numbers GSE25055 and GSE25066. Gene expression profiling was performed using Affymetrix U133A gene chips, as previously described [14]. Pretreatment fine needle aspiration samples (n = 384) and core biopsy samples (n = 124) of primary breast cancer were collected. Expression data were normalized using the MAS5 algorithm, mean centered to 600, and log2-transformed. Of the 508 patients, 23 patients with HER2 + and unknown HER2 statuses were excluded because the patients received no HER2-targeted therapy during that time and the sample size was small; thus, the data for 485 HER2-negative patients were retained for further analyses. All patients received neoadjuvant chemotherapy with sequential paclitaxel (80 mg/m2 weekly × 12 treatments) and 5-fluorouracil, doxorubicin, and cyclophosphamide (500, 50, and 500 mg/m2, respectively, once every 21 days with 4 treatments) and underwent mastectomy or breast-conserving surgery with axillary lymph node sampling after completion of neoadjuvant chemotherapy. Pathologic complete response (pCR) was defined by the absence of viable invasive cancer in the breast and lymph nodes. HR and HER2 statuses were determined in the diagnostic core needle biopsy specimens before chemotherapy, in accordance with the American Society of Clinical Oncology/College of American Pathologists guidelines. Patients showing 10% positive nuclear staining for ER and/or progesterone receptor (PR) with immunohistochemistry (IHC) were considered HR-positive (+) [15]. Patients with either 3 + IHC staining for HER2 or showing HER2 gene copy number of 2.0 by fluorescent in situ hybridization (FISH) analysis were considered HER2-positive. Patients with HR-positive status received adjuvant hormone therapy. Distant relapse-free survival (DRFS) was defined as the time from operation to the first distant recurrence, and cases of death without distant recurrence were censored at the time of death.

Immune genomic signatures (IGSs)

We applied five different previously reported immune genomic signatures (IGSs) to this dataset. An Affymetrix-based approximation of IRSN-23 was calculated as previously described [10]. In brief, genes in the 23-probe [19 genes] signature were identified on the Affymetrix U133A platform using gene symbols, and their expression data were calculated as the weighted sum of the gene expression values. The weight was calculated as the expression level of a gene multiplied by its predetermined correlation coefficient that was taken from the original publication [10]. In order to simplify genomic markers with distinct complex algorithms for the other four immune genomic signatures, namely, “Ascierto (7),” “Bianchini (9),” “Schmidt (8),”and “TILsGS (13),” we calculated the average gene expressions based on algorithm-normalized MAS5 log2-converted mRNA gene expression data. The list of genes for each gene signature is shown in Supplementary Table 1. Intrinsic molecular subtypes (luminal A, luminal B, HER2 + , basal-like, and normal type) were assigned to each case using the PAM50 centroid-based classifier, as described previously [16].

We first compared five IGSs by molecular subtypes (luminal A, luminal B, HER2 + , basal-like, and normal type) to ascertain the associations between breast cancer subtypes and IGSs. IGSs scores were considered as continuous variables, and P values were calculated using the Kruskal–Wallis test. Next, we plotted a scatter plot matrix to visualize the bivariate relationships between combinations of every pair of IGSs. The Pearson's r values ranged from − 1 to 1. An r of − 1 indicated a perfect negative linear relationship between variables, while an r of 0 indicated no linear relationship between variables, and an r of 1 indicated a perfect positive linear relationship between variables.

Then, we classified each IGS model by dividing the patients evenly into three groups based on the expression level of the IGS score (low, intermediate, and high). To assess the prognostic value of each IGS model (low vs. high and intermediate vs. high), we performed univariate Cox proportional hazards analysis of the five IGSs by evaluating hazard ratios and 95% confidence intervals (95% CIs) in all, HR-positive ( +) and HR-negative (−) breast cancer patients separately. The outcome of interest was defined as DRFS and evaluated according to the tertiles of the IGS score. Survival curves were also calculated using the Kaplan–Meier method and compared using the log-rank test. Next, to evaluate the prediction of patients’ response to neoadjuvant chemotherapy in each IGS model, we performed univariate logistic regression analysis of five IGSs for the response to pCR after neoadjuvant chemotherapy and assessed the odds ratios (OR) and 95% confidence intervals (95% CI) in all, HR + , and HR– breast cancer patients separately. The predictive outcomes were classified as either pCR or residual disease (RD). Predicted outcomes were plotted using boxplots in each IGS model, and the P values were calculated by the Wilcoxon test.

Finally, we conducted multivariate analysis using traditional clinical pathological factors related to prognosis and therapy response, including age, nodal status (1–3 vs. 0), tumor stage (3,4 vs. 0–2), estrogen receptor status (negative vs. positive), and MKI67 (low vs. intermediate vs. high). Of the 485 cases, 17 did not include information regarding the histological grade. Therefore, we used the proliferation index, MKI67, instead and divided all patients into three groups (high, intermediate, and low) according to the gene expression level. Cox proportional hazards analysis and logistic regression analysis were separately performed in all, HR + , and HR– breast cancer patients.

Statistical analyses were performed using R software version 3.6.2 (https://www.r-project.org/) and BRB Array Tools version 3.9.0a (https://brb.nci.nih.gov/BRB-ArrayTools/). Differences were considered statistically significant at two-sided P values of ≤ 0.05.

Results

Patient characteristics and breast cancer subtypes

The patient characteristics are demonstrated in Table 1. The median age at diagnosis was 49.9 years. And the median follow-up time is 31.4 months. Among the 485 HER2-negative patients, 306 (63.1%) were HR + and 179 (36.9%) were HR–. The molecular subtypes of luminal A, luminal B, HER2, basal-like, and normal types were noted in 153 (31.5%), 75 (15.5%), 35 (7.2%), 179 (36.9%), and 43 (8.9%) cases, respectively. In assessments based on tumor size, 279 (57.5%) cases were categorized as T0-2, (T0 presents patients with no evidence of primary tumor) while evaluations of nodal status showed that 151 cases (31.1%) had no lymph node metastasis. Seventeen cases (3.5%) included no information on histological grade status. After neoadjuvant chemotherapy treatment, 93 (19.2%) patients achieved pCR, while 373 (76.9%) patients showed residual disease. As for the prognostic outcomes, 106 (21.9%) patients showed no relapse in 5 years after primary diagnosis, whereas 379 (78.1%) patients experienced recurrence within 5 years.

Table 1 Patient characteristics

The expression levels of each IGS classified by molecular subtype are shown in Fig. 1. Each model tended to show differences in expression across subtypes (Kruskal–Wallis P value < 0.001). As expected, significantly higher expression levels were found in more aggressive subtypes, such as HER2 or basal-like subtypes.

Fig. 1
figure 1

Boxplots for five IGSs by breast cancer subtypes. P values were calculated using the Kruskal–Wallis rank-sum test. Subgroups were classified by molecular subtype using the PAM50 centroid-based classifier. IGSs immune genomic signatures

Scatter plot matrix for each IGS

We then assessed the strength of the correlation between the models using scatter plot matrix-calculated correlation coefficients determined with Pearson's rank correlation test. As shown in Fig. 2, most pairs of IGSs showed high correlations with each other: Ascierto (r: 0.65–0.83), Schmidt (r: 0.58–1), Bianchini (r: 0.56–1), TILsGS (r: 0.56–0.66), and IRSN-23 (r: 0.6–0.67). The two variables with the strongest correlations are Schmidt and Bianchini model (r = 1). Whereas the least associations between Bianchini and TILsGS are found (r = 0.56).

Fig. 2
figure 2

Scatter-plot matrix of five immune genomic signatures for all breast cancer patients. A scatter-plot matrix was plotted to visualize bivariate relationships between combinations of every pair of IGSs. Pearson's r values ranged from − 1 to 1

Correlations between distant relapse-free survival and IGSs

In the univariate Cox proportional analysis of 5-year DRFS, in all cancer patients (n = 485), 3 of the 5 IGSs showed significant prognostic value (Table 2), including Ascierto: low vs. high (HR 0.577, 95% CI 0.353–0.944, P = 0.028); TILsGS: low vs. high (HR 0.281, 95% CI 0.164–0.482, P < 0.001); IRSN-23: low vs. high (HR 0.420, 95% CI 0.254–0.693, P < 0.001). In HR + patients (n = 306), two out of five IGS had considerable importance (Table 2), including TILsGS: low vs. high (HR 0.358, 95% CI 0.166–0.775, P = 0.009); IRSN-23: low vs. high (HR 0.319, 95% CI 0.141–0.724, P = 0.006). However, in HR– patients (n = 179), there was no significant difference between the groups (Table 2). Kaplan–Meier curves for survival are shown in Supplementary Fig. 1.

Table 2 Cox proportional hazards analysis of five immune-related gene signatures for 5-year distant relapse-free survival

In the multivariate Cox proportional analysis of 5-year DRFS (Table 2), in all cancer patients (n = 485), only IRSN-23 had significant prognostic value: low vs. high (HR 0.459, 95% CI 0.224–0.941, P = 0.033). In HR + patients (n = 306), two of the five IGSs had considerable importance, including TILsGS: low vs. high (HR 0.402, 95% CI 0.181–0.892, P = 0.025) and IRSN-23: low vs. high (HR 0.394, 95% CI 0.166–0.932, P = 0.034). However, in HR– patients, no IGSs were statistically significant.

Correlations between pCR and IGSs

In the univariate logistic regression analysis of the prediction of chemotherapy response, for all cancer patients (n = 466), 4 of the 5 IGSs showed significant predictive values (Table 3), including Ascierto: intermediate vs. high (OR 0.382, 95% CI 0.215–0.661, P < 0.001), low vs. high (OR 0.363, 95% CI 0.203–0.631, P < 0.001); Schmidt: intermediate vs. high (OR 0.460, 95% CI 0.267–0.809, P = 0.007), low vs. high (OR 0.400, 95% CI 0.249–0.762, P = 0.004); Bianchini: intermediate vs. high (OR 0.459, 95% CI 0.263–0.790, P = 0.005), low vs. high (OR 0.430, 95% CI 0.242–0.750, P = 0.003); TILsGS: intermediate vs. high (OR 0.484, 95% CI 0.284–0.814, P = 0.007), and low vs. high (OR 0.212, 95% CI 0.108–0.394, P < 0.001). In HR + patients (n = 306), 4 of the 5 IGSs showed considerable importance (Table 3), including Ascierto: intermediate vs. high (OR 0.258, 95% CI 0.100–0.615, P = 0.003), low vs. high (OR 0.268, 95% CI 0.112–0.609, P = 0.002); Schmidt: intermediate vs. high (OR 0.397, 95% CI 0.151–0.459, P = 0.034), low vs. high (OR 0.345, 95% CI 0.143–0.797, P = 0.014); Bianchini: intermediate vs. high (OR 0.400, 95% CI 0.168–0.926, P = 0.034), low vs. high (OR 0.340, 95% CI 0.139–0.798, P = 0.014); TILsGS: intermediate vs. high (OR 0.392, 95% CI 0.163–0.915, P = 0.032), low vs. high (OR 0.253, 95% CI 0.103–0.595, P = 0.002). However, there was no statistically significant difference in HR– patients (Table 3). Boxplots for the predictive results of pCR and RD are shown in Supplementary Fig. 2.

Table 3 Logistic progression analysis of five immune-related gene signatures for pCR

Also, among the multivariate logistic regression analysis of the prediction to chemotherapy response (Table 3), in HR + patients, 4 of the 5 IGSs (Ascierto, Schmidt, Bianchini, and TILsGS) showed considerable importance, including Ascierto: intermediate vs. high (OR 0.217, 95% CI 0.077–0.559, P = 0.002), low vs. high (OR 0.274, 95% CI 0.106–0.670, P = 0.005); Schmidt: intermediate vs. high (OR 0.357, 95% CI 0.137–0.880, P = 0.028), low vs. high (OR 0.309, 95% CI 0.118–0.771, P = 0.013); Bianchini: intermediate vs. high (OR 0.313, 95% CI 0.120–0.783, P = 0.015), low vs. high (OR 0.276, 95% CI 0.102–0.712, P = 0.009); TILsGS: low vs. high (OR 0.418, 95% CI 0.159–1.075, P = 0.019). However, no IGS showed statistical significance in all cancer patients and in HR− patients.

Discussion

We found that IGSs have disparate predictive and prognostic abilities in different breast cancer subtypes. In HR + breast cancer, IGSs showed coincident predictive power for response to chemotherapy and some predictive power for prognosis. However, in all breast cancer cases, IGSs showed little predictive power for prognosis, and in HR– breast cancer, IGSs showed neither predictive nor prognostic ability.

Various multigene assays have been developed with distinct breast cancer datasets to predict the likelihood of distant recurrence and response to adjuvant therapy. Previous studies identified some IGSs that could be used for either prediction of the response to chemotherapy or the patient’s prognosis, particularly in HER2 + and triple-negative breast cancer [4,5,6,7,8,9, 13, 14]. To improve prognostic power, a classifier consisting of seven immune genomic signatures was developed and validated to identify good prognosis of ER-breast cancer that was independent of lymph node metastatic status [11]. A set of 14 novel prognostic genes, including eight genes linked to immune and inflammatory chemokine regulation, were identified for HR − breast cancer, which was superior to other gene signatures with regard to the prediction of metastasis outcome of patients with early-stage HR– breast cancer/TNBC [18]. More recently, a 17-immunity gene signature was developed to predict the prognosis of distant metastasis-free survival among patients with ER- and highly proliferative breast cancers. Patients with high expression of these immune genes had significantly better outcomes [17]. To predict the response to chemotherapy, a novel prediction model (IRSN-23) constructed with immune genomic signatures can predict pCR independent of the intrinsic subtypes and chemotherapeutic regimens [10]. An immune module score was identified to predict the response to chemotherapy in ER-positive and luminal advanced BC [12]. However, all of the previous IGSs studies only showed results for either prediction of the response to chemotherapy or patient’s prognosis, particularly with regard to HER2 + and triple-negative breast cancer, and all were performed on different datasets. Our analyses were performed on a single dataset including both chemotherapy response and survival information. Moreover, the IGSs showed their utility in HR + breast cancer, whereas they were not as effective in HR– breast cancer.

The discrepancies in the predictive power of HR status in previous studies and our analysis may be attributed to the small sample size. Only 179 HR– patients were registered in our dataset, and no HER2 + patients were included because they were not treated with trastuzumab, the outcome should be different from the current standard care, so that HR– are not detected as a significance in our study. In addition, the occurrence of pCR might be insufficient. The pCR rates in the data set were only 11.8% (36/306) in HR + and 31.8% (57/179) in the HR– group, respectively. Moreover, the recurrence rate between HR + and HR– patients might be incomparable. When combining IGSs with ER status, the predictive function of some IGSs that were useful in the univariate analysis seemed to disappear, which may suggest that ER status is an extremely powerful predictor. Another interesting discrepancy between our study and previous reports could be its prognostic value. Previous studies have shown that high expression of immune genomic signatures is associated with favorable prognosis [7,8,9, 17] and favorable response to chemotherapy [10, 12] in patients with HR– and HER2 + breast cancer. In contrast, in our data, high expression was associated with poor prognosis, especially in HR + cases. Essentially, prognosis is determined by the nature of the original cancer and the effect of chemotherapy. Our data indicated that higher levels for IGSs were associated with aggressive and poor subtypes of breast cancer (e.g., basal and HER2 enriched), as expected [9, 13]. The difficulties in predictions for HR– breast cancer have been reported previously [19]. Indeed, in clinical practice, genomic assays to predict clinical outcomes are available only for HR + breast cancers to stratify Luminal A and B, and there were no powerful signatures for more subtle prediction in other settings. HR– breast cancer may be a heterogeneous group, and subtyping is needed to better identify optimal treatment strategies. Lehmann et al. identified 6 TN breast cancer subtypes from clustering analysis based on mRNA gene expression and to show the “driver” signaling pathways associated with treatment outcomes [20]. To overcome the problem of the dual predictor for HR– breast cancers, another study with a larger data set and novel statistical methods to capture the subtle differences or unknown biological processes associated with clinical outcomes is needed.

Furthermore, we demonstrated that five distinct IGSs are highly correlated with each other and that some different non-overlapping immune genomic signatures can produce statistically similar results. When the gene lists derived from some of these seemingly identical studies were compared, a minor overlap was noticed, if any. It is unclear what caused the lower-than-expected overlap, but differences in patient cohorts, platforms, and statistical methods are likely to be among the factors responsible for these differences. Despite the lack of gene overlaps, the various gene signatures were highly correlated, since they all might represent the same biological functions, thereby allowing distinction between poor and good outcomes of breast cancers (e.g., in terms of prognosis or response to chemotherapy).

One of the strengths of this study is that the cases included both survival and pathologic response data, allowing the prognostic and predictive functions to be analyzed and compared directly. However, our analyses had several limitations. First, we used the proliferation index, MKI67, instead of pathological grade, since information regarding the pathological grade of 17 (out of 485) patients was unknown. Second, the sample size was small especially in the HR– cohort, as we have already mentioned. Third, the dataset we used had the samples from fine needle aspiration and core needle biopsy. These differences in collection methods may have the different contents of stroma in the sample and lead to the bias for our analyses [9]. However, as shown in the Fig. 2, the IGSs are highly correlated with each other. The more, as shown in our prognostic and predictive results, we found some reproducibility of our findings, regardless of the IGSs. Therefore, we believe that there are some truths in our findings beyond a bias.

In conclusion, our results show that IGSs are, to some extent, useful for predicting prognosis and chemotherapy response. However, more compelling biomarkers are required for both prognosis and response to chemotherapy in HR and HER2 + cases. As an alternative, we hope to evaluate different biomarkers for prognosis and response to chemotherapy. In conjunction, we aim to obtain larger datasets including data for both prognosis and response to chemotherapy to determine more concrete biomarkers.