Introduction

Estrogen receptor (ER), progesterone receptor (PR) and human epidermal growth factor receptor 2 (HER2) are important in breast cancer prognosis and treatment choice. The use of hormonal therapy, HER2-targeted therapy and chemotherapy depends on ER, PR and HER2 status. The accuracy of the assessment of these receptors is expressed most clearly by the risk of obtaining a false positive or false negative test result and by the clinical consequences of such results. Patients with false negative test results may be undertreated, whereas patients with false positive results may receive unnecessary treatment with expensive drugs that may cause serious side effects. These potential risks emphasize the importance of proper diagnostic procedures to determine ER, PR and HER2 status.

For years, the breast cancer resection specimen has been considered the gold standard for obtaining information on the receptor status. However, core needle biopsy has become increasingly important in the preoperative work-up of breast cancer patients, especially in those who receive neoadjuvant treatment. Mann et al. [1] concluded that hormone receptor analysis on core needle biopsy appears to be more reliable than analysis on the resection specimen, probably due to loss of antigen caused by the fixation process in the resection specimen. On the other hand, Richter-Ehrenstein et al. [2] state that, particularly in larger and/or heterogeneous tumors, core needle biopsy is less accurate than the resection specimen for determining histological features of the tumor, such as tumor invasiveness and tumor grade. Gerlinger et al. [3] also stated that intratumoral heterogeneity can lead to underestimation of the tumor genomics, due to varying prognostic gene expression signatures in different regions of the same tumor. Because of these conflicting results, many pathology laboratories determine the receptor status on both the core needle biopsy and the resection specimen. However, this strategy is time-consuming and costly. Moreover, even though the receptor status as determined in the resection specimen is generally considered the gold standard, discrepant results may still puzzle both the oncologist and the patient with respect to final adjuvant treatment decision-making.

We assessed the hormone and HER2 receptor status in both the core needle biopsy and resection specimen in 526 breast cancer patients and looked for predictors of discrepant results between the two. The results were used to develop and propose a cost-effective algorithm for assessment of the ER, PR and HER2 status in patients with adjuvant or neoadjuvant systemic treatment.

Patients and methods

Data collection

We retrospectively collected data from a database containing records of patients with a new diagnosis of breast cancer in Atrium Medical Centre, a large teaching hospital in the Netherlands. All patients diagnosed with breast cancer from January 2005 until December 2009 (n = 1,213) were included. Patients for whom we had information only on the resection specimen (n = 346) or only on the core needle biopsy (n = 282) were excluded. Patients who had received neoadjuvant systemic therapy (n = 59) were also excluded. This resulted in a study population of 526 patients, for whom information was available on both the core needle biopsy and the resection specimen. For all these patients information was available on ER and PR status, and for 432 of them complete information was available on HER2 status. Table 1 shows the patient and tumor characteristics of all 526 patients who were included in the study.

Table 1 Patient and tumor characteristics

Our data were collected from routine practice and there was no strict protocol with regard to the number of core needle biopsies taken for diagnostic purposes. It was left to the radiologist on call to decide how many biopsies should be taken, depending on factors such as tumor size and location. With respect to the resection specimens, the policy of the pathologists was to select, for every case, one block with a representative volume of tumor.

Tissue fixation, processing and analysis

Core needle biopsies were performed at the radiology department under ultrasound guidance, in most instances using an 18-gauge needle. After sampling, they were fixed for 24 h in 4 % neutral buffered formalin according to the local standards. They were then embedded in paraffin and 4-μm sections from at least three levels were taken. Resection specimens were lamellated upon arrival in the laboratory; a biopsy sample of the visible tumor was then taken for receptor analysis, and these samples were likewise fixed for 24 h in 4 % neutral buffered formalin. The remaining breast tissue was sent to the radiology department for a lamellogram and then fixed in formalin for routine histology work-up.

For immunostaining of both the core needle biopsy and the resection specimen samples, sections were deparaffinized in xylene and rehydrated in a descending ethanol series. Endogenous peroxidase activity was blocked by immersing the slides for 10 min in 3 % hydrogen peroxide in methanol, after which they were rinsed in PBS (pH 7.2–7.4). Slides were placed in a 0.1 M citrate solution (pH 6.0) and heated for 10 min at 90 °C in a microwave oven for antigen retrieval. The complete incubation process was carried out on an automated stainer (Dako immunostainer). After preincubation with 1 % bovine serum albumin (Sigma) and PBS for 10 min, monoclonal antibodies directed against ER and PR (1D5 and PgR636, respectively) were applied at a dilution of 1:100 for 1 h at room temperature. After washing in PBS, the secondary antibody (biotin-labeled goat anti-mouse Ig; ready-to-use LSAB2 kit; Dako, Denmark) was applied for 45 min at room temperature. After washing in PBS, the slides were incubated with streptavidin conjugated with horseradish peroxidase (ready-to-use LSAB2 kit; Dako). After a further wash in PBS, peroxidase activity was detected with 3,3′-diaminobenzidine and 0.002 % H2O2 solution (Sigma). Sections were counterstained with Harris’s hematoxylin, dehydrated, cleared in xylene, and mounted in Entellan [4]. Immunohistochemistry (IHC) for HER2 was done on paraffin sections for all patients using a polyclonal HER2 antibody (Dako) in a SABC-peroxidase procedure after heat-induced antigen retrieval [5]. Silver in situ hybridization (SISH) for HER2 was done using the Ventana Benchmark Dual SISH protocol according to the manufacturer’s instructions.

ER and PR status was considered positive when staining occurred in 10 % or more of the tumor cells (score ≥ 1). The scoring system was based on a combination of the number of cells that were stained and the intensity of this staining. The two components were multiplied, leading to scores varying between 0 and 6. This method was a combination of the H-score and the category score [6].

IHC was used to determine the HER2 status in both the core needle biopsy and the resection specimen, giving a score on a 0 to 3+ scale. A score of 0 or 1+ was considered negative and a score of 3+ positive, according to international guidelines [7]. Samples with a score of 2+ were considered equivocal. To determine the definitive HER2 status of this 2+ group, dual color SISH analysis (Ventana/Roche) was performed on both the core needle biopsy and the resection specimen. A 2+ score without amplification in the SISH analysis was considered negative, and a 2+ score with amplification in the SISH analysis was considered positive.
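The HER2 decision rule described above can be written down as a small function. This is only an illustrative sketch: the function name `her2_status`, the string encoding of the IHC scores and the `sish_amplified` flag are assumptions introduced here, not part of the laboratory protocol.

```python
from typing import Optional


def her2_status(ihc_score: str, sish_amplified: Optional[bool] = None) -> str:
    """Classify HER2 status from an IHC score and, for 2+ cases, the SISH result.

    ihc_score      -- one of "0", "1+", "2+", "3+" (hypothetical encoding)
    sish_amplified -- True/False once SISH has been read, None if not performed
    """
    if ihc_score in ("0", "1+"):
        return "negative"
    if ihc_score == "3+":
        return "positive"
    if ihc_score == "2+":
        if sish_amplified is None:
            # 2+ without a SISH result stays equivocal
            return "equivocal"
        return "positive" if sish_amplified else "negative"
    raise ValueError(f"unknown IHC score: {ihc_score!r}")


if __name__ == "__main__":
    for score, sish in [("0", None), ("1+", None), ("2+", None),
                        ("2+", True), ("2+", False), ("3+", None)]:
        print(score, sish, "->", her2_status(score, sish))
```

In this sketch SISH is consulted only for 2+ cases, mirroring the text; a 0, 1+ or 3+ score is final regardless of the flag.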

Statistics

In the statistical analysis, sensitivity, specificity, negative and positive predictive values and the corresponding false negative and false positive rates were calculated, including the 95 % confidence intervals (CI). The false positive and false negative rates are defined as one minus the positive predictive value (1 − PPV) and one minus the negative predictive value (1 − NPV), respectively. Test results based on the resection specimen were considered the gold standard.
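For reference, these definitions can be written out for a two-by-two table. The cell labels a, b, c and d are notation introduced here for illustration only. Note that the false positive and false negative rates as defined in this study are conditional on the core needle biopsy result, which differs from the conventional definitions based on specificity and sensitivity.

```latex
\[
\begin{array}{l|cc}
 & \text{Resection positive} & \text{Resection negative} \\ \hline
\text{Core biopsy positive} & a & b \\
\text{Core biopsy negative} & c & d
\end{array}
\]
\[
\begin{aligned}
\text{Sensitivity} &= \frac{a}{a+c}, &
\text{Specificity} &= \frac{d}{b+d}, \\
\text{PPV} &= \frac{a}{a+b}, &
\text{NPV} &= \frac{d}{c+d}, \\
\text{False positive rate} &= 1-\text{PPV} = \frac{b}{a+b}, &
\text{False negative rate} &= 1-\text{NPV} = \frac{c}{c+d}, \\
\text{Concordance} &= \frac{a+d}{a+b+c+d}.
\end{aligned}
\]
```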

Concordance was calculated as the proportion of patients who had the same result on core needle biopsy and resection specimen.

A multivariable logistic regression analysis was performed to determine predictors of discordance between the tumor specimen and core needle biopsy with respect to ER, PR and HER2 status. Age (continuous variable), grade (grade 1, 2 and 3), pathological tumor size (pT1 and ≥ pT2), pathological nodal status (pN0 and ≥ pN1) and histological tumor type (invasive ductal carcinoma, invasive lobular carcinoma and others) were included in the multivariable models, and odds ratios (OR) with 95 % CIs were calculated.

Results

Patient and test characteristics in dataset

Information on the ER, PR and HER2 status, as determined on core needle biopsy and the resection specimen, is presented in Tables 2, 3 and 4. According to the results of the resection specimen, 80.8 % (425/526) of the patients had a positive ER status, 67.3 % (354/526) had a positive PR status and 19.4 % (84/432) had a positive HER2 status.

Table 2 IHC estrogen correlation between core needle biopsy and resection specimen
Table 3 IHC progesterone correlation between core needle biopsy and resection specimen
Table 4 HER2 correlation of IHC and SISH in resection specimen

Concordance between core needle biopsy and resection specimen

The concordance between core needle biopsy and resection specimen regarding the ER status was 89.5 % (Tables 2 and 5). The sensitivity of core needle biopsy to determine the ER status was 93.9 % (399/425) and the specificity was 71.3 % (72/101). The risk of obtaining a false negative result with core needle biopsy was 26.5 % (26/98) and the risk of a false positive test result was 6.8 % (29/428) (Tables 2 and 5).

Table 5 Core needle biopsy vs. resection specimen

The concordance between core needle biopsy and resection specimen regarding the PR status was 82.5 % (Tables 3 and 5). The sensitivity of core needle biopsy to determine the PR status was 83.6 % (296/354) and the specificity was 80.2 % (138/172). The false negative risk was 29.6 % (58/196) and the false positive risk was 10.3 % (34/330).

The concordance between core needle biopsy and resection specimen regarding the HER2 status was 80.6 % (Tables 4 and 5). The sensitivity of core needle biopsy to determine the HER2 status was 81.0 % (68/84) and the specificity was 80.5 % (280/348). The risks of obtaining a false negative or false positive test result on core needle biopsy were 5.4 % (16/296) and 50.0 % (68/136), respectively.
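The percentages reported in this section follow directly from the two-by-two counts given in the text. The sketch below recomputes them; the cell counts are taken from the fractions quoted above (for example 399/425 and 26/98 for ER). The function and variable names are illustrative, and the Wilson score interval is an assumption, since the method used for the 95 % CIs is not stated.

```python
from math import sqrt


def wilson_ci(successes: int, total: int, z: float = 1.96):
    """Wilson score interval for a proportion (assumed method, not stated in the paper)."""
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half, centre + half


def summarize(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Accuracy measures with the resection specimen as the gold standard."""
    n = tp + fn + tn + fp
    return {
        "concordance": (tp + tn) / n,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "false_negative_risk": fn / (fn + tn),  # 1 - NPV
        "false_positive_risk": fp / (fp + tp),  # 1 - PPV
    }


# tp/fn/tn/fp: core needle biopsy result relative to the resection specimen
COUNTS = {
    "ER": dict(tp=399, fn=26, tn=72, fp=29),
    "PR": dict(tp=296, fn=58, tn=138, fp=34),
    "HER2": dict(tp=68, fn=16, tn=280, fp=68),
}

if __name__ == "__main__":
    for receptor, cells in COUNTS.items():
        measures = summarize(**cells)
        print(receptor, {k: f"{100 * v:.1f} %" for k, v in measures.items()})
        low, high = wilson_ci(cells["fn"], cells["fn"] + cells["tn"])
        print(f"  false negative risk, 95 % CI: {100 * low:.1f}-{100 * high:.1f} %")
```

Keeping the four cell counts as the only input makes it easy to check that every reported proportion is consistent with the same underlying table.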

Multivariate logistic regression analysis

The multivariable analysis showed that the probability of having a discordant ER test result between core needle biopsy and the tumor specimen decreased significantly with higher age (odds ratio [OR] = 0.96, p = 0.002, 95 % CI 0.93–0.98) (Table 6). In patients with a discordant ER result, a higher risk for discordance was also observed for the PR status (OR = 2.87, p = 0.007, 95 % CI 1.33–6.21) (Table 6).
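To give a feel for the size of the age effect, the odds ratio per unit of age can be rescaled to a larger age difference. The sketch below assumes that age entered the model in years as a linear term on the log-odds scale (the unit is not stated explicitly in the text), and it works from the rounded estimates, so the result is approximate.

```python
def rescale_odds_ratio(or_per_unit: float, units: float) -> float:
    """Odds ratio for a difference of `units`, given the odds ratio per single unit.

    Valid only for a covariate that enters the logistic model linearly.
    """
    return or_per_unit ** units


if __name__ == "__main__":
    # Reported ER discordance estimate: OR = 0.96 (95 % CI 0.93-0.98) per unit of age
    point = rescale_odds_ratio(0.96, 10)
    lower = rescale_odds_ratio(0.93, 10)
    upper = rescale_odds_ratio(0.98, 10)
    print(f"OR per 10 units of age: {point:.2f} ({lower:.2f}-{upper:.2f})")
```

Under these assumptions, a patient ten years older has roughly two-thirds the odds of a discordant ER result.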

Table 6 Multivariate analysis model comparing the likelihood of having discordant IHC ER, PR and IHC-SISH HER2 result between core needle biopsy and resection specimen in patients with primary diagnosed breast cancer

The risk of having discordant results for the PR status between core needle biopsy and the tumor specimen was lower for patients with a poorly differentiated tumor than for those with a well-differentiated tumor (OR = 0.24, p = 0.004, 95 % CI 0.09–0.63). A higher risk of discordance for PR status was also observed for patients who had discordant results with respect to the ER status (OR = 2.46, p = 0.024, 95 % CI 1.12–5.39) (Table 6).

We also observed a significantly higher risk of a discordant HER2 test result for patients with tumor involvement of the axillary lymph nodes (OR = 2.30, p = 0.002, 95 % CI 1.35–3.92) (Table 6).

Discussion

Most studies have reported on the concordance between core needle biopsy and resection specimen when studying the reliability of receptor assessment in breast cancer. Our approach was different, as we focused on the negative and positive predictive value of core needle biopsy for determining the ER, PR and HER2 status. Such values are of greater importance for clinical practice, as they give more information about the possible impact of the test results on the treatment of patients. We observed a high rate of false negative test results in the core needle biopsy as compared to the resection specimen: 26.5 % for ER and 29.6 % for PR. For HER2, a false positive rate of 50 % was seen. The core needle biopsy result therefore does differ from the resection specimen result in a substantial number of patients. These findings make us critical about the reliability of using core needle biopsy alone in the assessment of the ER, PR and HER2 receptor status.

Our data on concordance for ER and PR test results, using the resection specimen as the gold standard, are comparable to the figures reported in the literature [1, 2, 8–12]. The concordance was 89.5 % for the ER status, 82.5 % for the PR status and 80.6 % for the HER2 status. In the literature, concordance rates for the ER and PR status vary between 61.7 % and 98.8 % and between 69.1 % and 89 %, respectively [1, 2, 8–10, 13–18]. For the HER2 status, reported concordance rates range from 54 % to 100 %. Concordance rates for HER2 were lower when tests were performed with IHC alone than when both IHC and FISH were used [1, 2, 8, 10, 13–15, 19–21]. Previous studies comparing core needle biopsy and resection specimen also took the resection specimen as the gold standard, but mainly focused on concordance rates. To compare our data to the literature, we used two-by-two tables to calculate false negative and false positive results from the available data in previous studies. False negative rates for ER status ranged between 0 % and 33.3 %, and false negative rates for PR status between 0 % and 37.5 % [1, 8–13].

Several explanations have been given for the discordant findings between core needle biopsy and the resection specimen. Tumor heterogeneity, variation in tissue processing and fixation, and inter- and intra-observer variability are some of the factors described as influencing concordance in ER, PR and HER2 [22]. Because core needle biopsy reflects only part of the tumor, predominantly the core of the tumor, key information may be missed when tumor heterogeneity is large [2, 19]. The higher discordance rate seen in the PR group compared to the ER group might be because PR concentrations appear to vary more within the tumor than ER concentrations [23]. HER2 expression is also believed to vary between different parts of the tumor [21, 24].

Formalin fixation leads to cross-linking between and within proteins of the tissue components. Adequate cross-linking is necessary for proper IHC assessment. IHC incubation protocols for antigen retrieval have been designed to work with properly fixed tissue samples, in order to avoid overactivity of this process, which may destroy epitopes, or underactivity, which may insufficiently unmask epitopes. False negative test results are often considered to be the result of a fixation time of less than 24 h [1, 25]. Fixation artifacts were avoided in our study by processing both the core needle biopsy and the resection specimen in the same way, for 24 h. Rhodes et al. suggested that interlaboratory variations accounted for up to 24 % of false negative ER and PR test results [26]. In our study, one specialized pathologist examined all the core needle biopsies and tumor specimens. Inter-observer variability was therefore ruled out, though this still leaves room for intra-observer variability. This might be considered a limitation of our study.

Further, patient selection might have contributed to the rate of discordance between core needle biopsy and the resection specimen in our study. In our study, younger patients were more likely to have discordant results with respect to ER. ER expression is generally known to be more homogeneously spread over the tumor cells than PR. Elderly patients are more likely to have positive hormone status and are also known to have more favorable tumor characteristics, including small tumor size [27], enabling homogeneous fixation of the resection specimen as well. These characteristics might explain the higher concordance between core needle biopsy and the resection specimen with respect to ER in older patients. Patients with a poorly differentiated tumor were less likely to have discordant results for PR status. PR expression is more heterogeneously spread throughout the tumor, in contrast to ER, which is more homogeneously spread over the tumor cells. The expression of PR is often dependent on an intact signal pathway [28]; low-grade tumors therefore frequently show more heterogeneity in PR expression. On the other hand, poorly differentiated tumors are more often PR negative [29], and if positive they are not easily influenced by external signals. For that reason, less discordance is seen between core needle biopsy and resection specimen. Usami et al. [9] have also reported a higher level of concordance between core needle biopsy and resection specimen for PR in grade 3 tumors. For HER2, a higher nodal status was found to play a role in identifying patients with a high risk of discordant results. Patients with HER2 positive disease are known to have a more aggressive tumor than patients with HER2 negative disease. These patients are younger, have larger tumors and have more lymph node involvement [30, 31]. In addition, HER2 intratumoral heterogeneity is often seen in both immunohistochemical staining and in situ hybridization [18]. It is possible that the HER2 discordance in patients with axillary node involvement can be explained by a combination of more aggressive tumors and a higher probability of heterogeneity within the tumor. In other studies, discordance was also associated with these factors [1, 2, 9, 11, 12, 15, 18, 24, 27, 30–32]. Since our patient population shows large similarities with that of several previous studies regarding age, tumor size, nodal involvement and grade, differences in concordance rates based on patient and tumor characteristics are unlikely.

Most institutions still choose to use the resection specimen result to base and/or confirm their adjuvant treatment choices. However, for some patients receiving neoadjuvant treatment, namely those who will eventually achieve a pathologic complete response (pCR), the core needle biopsy is the only source of information regarding tumor characteristics. Considering the high risk of a false negative test result, hormone treatment would be withheld from more than one fourth of the patients with a ‘negative’ ER or PR status if the decision were based solely on core needle biopsy. Considering the beneficial effects of hormone therapy on prognosis, such undertreatment will increase the risk of recurrence and breast cancer related death in a substantial number of patients [33]. An overall pCR rate of about 16 % is observed in patients receiving neoadjuvant systemic treatment [34]. Thus, the large majority of these patients do not have a pCR. In our view, these patients can benefit from retesting of their hormone status on the resection specimen. Van der Ven et al. [22] reviewed 32 trials that investigated the influence of neoadjuvant therapy with or without trastuzumab treatment on ER, PR and HER2 receptors in breast cancer. They advised retesting the receptor status of the residual tumor after neoadjuvant treatment in order to improve future tailored adjuvant therapies. They concluded that neoadjuvant chemotherapy may directly or indirectly change the biology of tumor cells, or might cause a selection of resistant tumor cells in the residual disease [22].

For HER2, we observed a high risk of a false positive test result. Sixty-eight out of 136 patients (50 %) with a positive result in the core needle biopsy had a negative result in the resection specimen. When considering only the result of the core needle biopsy, these patients would have received trastuzumab, whereas according to the results of the resection specimen this treatment would not be given. Taking only the core needle biopsy result into consideration would lead to needless exposure to treatment, with toxicity and economic costs as a result. False positive rates for HER2 reported in the literature vary between 0 % and 31.2 % [8–10, 13, 14, 21]. The HER2 positive rate of 19.4 % in the resection specimen of the patients in our study does not differ from the rate in other studies [2, 9, 10, 14, 20]. The high false positive rate seen in our study may therefore be related to a high HER2 percentage in the core needle biopsy rather than a low HER2 percentage in the resection specimen. The antibody techniques we used were similar for the core needle biopsy and the resection specimen. Compared to other studies, the number of biopsies taken may have differed, since it was left to the radiologist to decide how many samples were taken. In our institution, a mean of about two to three samples was taken per patient. This could have influenced the concordance with the results of the resection specimen, since a higher number of biopsies is known to result in a higher level of concordance [17, 35]. On the other hand, a high prevalence of HER2 positive cases in the core needle biopsy may also partly be related to heterogeneity of the HER2 expression within and between the cells [2]. Heterogeneity of HER2 expression has also been reported between samples [18]. More biopsy samples from different locations in the blocks and from multiple tumor blocks, which is usually the case in tissue microarray analysis (TMA), may improve concordance.

Untch et al. [36] described a 39 % pCR rate in patients with a HER2 positive tumor treated with chemotherapy and trastuzumab in the neoadjuvant setting. Extrapolating our study result to the neoadjuvant setting, where core needle biopsy is predominantly used as a diagnostic tool, we believe that up to half of the HER2 positive group could be false positive. Considering the fact that trastuzumab has no effect in HER2 negative tumors [7], the pCR rate is actually higher in true HER2 positive tumors. In the Netherlands, the cost of treating one patient with 1 year of trastuzumab is €36,298 for the drug, €2,899 for regular outpatient visits and €1,559 for cardiac monitoring, amounting to a total of €40,759 per patient [37]. Considering these high costs in the light of the reliability of the HER2 test result on core needle biopsies with current laboratory techniques, one may argue that patients with HER2 positive tumors who are considered in advance not to be eligible for breast conserving therapy might preferably have their local treatment first. Further research should focus on the value of adjuvant trastuzumab in patients treated with neoadjuvant trastuzumab, both in those who achieved a pCR and in those who did not but had a negative HER2 status in the remaining tumor.

To increase reproducibility, several ways of improving the assessment can be considered. Ozdemir et al. [32] suggest that taking a minimum of five samples, instead of a maximum of five samples, with the core needle biopsy might increase its performance. Harvesting core samples from the centre of the tumor instead of the peripheral parts might be useful in case of heterogeneity. Better needle technologies might also increase the core needle biopsy quality by providing full-length samples instead of fragmented samples [32]. However, Allred [38] suggests that new and more powerful techniques be developed to overcome the problems related to tumor heterogeneity and IHC assessment. TMA analysis might have this potential. With TMA, it is possible to combine different samples of the same tumor together with samples from other patients in one TMA block. This may improve the assay qualitatively (tumor heterogeneity and process uniformity) and may also be cost-effective for the laboratory (reducing labor and reagents). Moreover, additional tests related to proliferation activity, along with a standard HER2 in situ hybridization, can be done within the limits of the original IHC budget. Rossing et al. [39] suggest that combining TMA with digitalization is a more cost-effective alternative to routine diagnostics on whole mount slides in a laboratory with a high throughput of the same biomarkers. This method will decrease the number of slides, which will have an impact on storage capacity. All breast cancer specimens in one TMA block are exposed to the same pre-analytical conditions, which unifies the analyses and minimizes the day-to-day variability in sample interpretation. Instead of being stored physically, slides are scanned and stored digitally, which means that staining will not fade and the quality will stay intact over time [39].

In conclusion, we need to be aware of the problem of false negative and false positive test results in ER, PR and HER2 assessment on core needle biopsy. With current techniques, our recommendation is to use the resection specimen to measure ER, PR and HER2 receptors in patients without neoadjuvant treatment. Extrapolating our results to the neoadjuvant setting, where receptor assessment on core needle biopsy is required to guide neoadjuvant and adjuvant treatment choices, we would consider reassessment of ER, PR and HER2 status on the resection specimen if no pCR is achieved.