Background

Adjuvant Lapatinib and/or Trastuzumab Treatment Optimisation (ALTTO) is a phase III randomized international clinical trial conducted by the Breast International Group (BIG) and the North American Breast Cancer Groups [NABCG: lead group, North Central Cancer Treatment Group (NCCTG, now part of the Alliance)]. ALTTO evaluates the role of adjuvant lapatinib alone, or in combination or sequence with trastuzumab, compared with trastuzumab alone for the adjuvant treatment of patients with early human epidermal growth factor receptor-2 (HER2)-positive breast cancer. A trial overview and further details can be found on the trial Web site (http://alttotrials.com). Between April 2007 and July 2011, 8,381 patients were enrolled in ALTTO.

One of the key features of the trial is that patients with disease classified as HER2 positive or HER2 equivocal by local laboratories were eligible for randomization only after HER2-positive status was confirmed by a central laboratory. Mayo Clinic (Mayo: Rochester, Minnesota; Scottsdale, Arizona, Drs. Robert Jenkins, Ann McCullough, Wilma Lingle) was responsible for confirmatory testing for North American patients enrolled through US NCI sponsorship; the European Institute of Oncology (IEO: Milan, Italy, Dr. Giuseppe Viale) was responsible for confirmatory testing for patients from the rest of the world (except China, which used a third central laboratory).

There is increasing recognition that HER2-positive disease that is also steroid hormone receptor positive has a different natural history and requires different adjuvant therapy (specifically, antiestrogens after the completion of chemotherapy) than HER2-positive disease that does not express either estrogen receptor α (ER) or progesterone receptor (PR) [1]. The lack of local/central concordance in pathological reading of estrogen and progesterone receptor status in tumor specimens has been documented [2]. Therefore, central laboratory determination of ER and PR status was also initiated in ALTTO, and the stratification of patients in the randomization was according to centrally determined hormone receptor status of the primary tumor.

In this manuscript, we present results of a ring study in which a small number of cases were exchanged between Mayo and IEO for assessment of HER2 or ER status in order to understand the similarities or differences in results obtained between the two central confirming laboratories. PR status was not considered in this ring study.

Motivation for the ring study

The ALTTO Steering Committee annually reviewed data regarding eligibility failures (defined as locally HER2 positive, but HER2 negative at central review) as well as discrepancies between local and central determinations of ER status. In 2009, it was recognized that very few of the locally HER2-positive cases referred to Mayo were found to be ineligible (5.8 %), while 14.5 % of the HER2-positive cases referred to IEO were defined centrally as HER2 negative (Table 1). In addition, differences between central laboratories were seen with respect to ‘false-positive’ and ‘false-negative’ ER rates. The percent of cases defined as ER-positive locally but ER-negative on central review (i.e., false positive) was 16.2 % at Mayo compared with 4.2 % at IEO (Table 2). The percent of cases defined as ER-negative locally but with at least 1 % of cells staining positive for ER centrally (i.e., false negative) was 3.4 % at Mayo compared with 21.4 % at IEO (Table 2). ALTTO recruitment was completed in July 2011, and the final concordance figures between local and central laboratory determinations for HER2 and ER are shown in Supplementary Appendix C.

Table 1 Concordance between local and central HER2 status/combined IHC and FISH methods (as of December 2009)
Table 2 Concordance between local and central ER status (as of December 2009)

Methods

Ring study design

This ring study involved an exchange of slides between Mayo and IEO. There were three phases, which were launched sequentially based on results of the preceding phases. The statisticians (ACD and RDG) selected the cases per criteria described below. All evaluations were conducted with the pathologists blinded to the evaluations from the other laboratory, and the results were combined by the statisticians for analysis. Definitions of testing results followed published ASCO/CAP guidelines [3, 4], as summarized in Supplementary Appendix A.

Phase 1 was motivated by the false-positive and false-negative rates (Tables 1, 2) and involved exchange of material as follows: IEO submitted to Mayo for retesting 20 HER2 false-positive cases, 5 ER false-positive cases, and 5 ER false-negative cases; Mayo submitted to IEO for retesting 5 HER2 false-positive cases, 20 ER false-positive cases, and 5 ER false-negative cases. False positive was defined as locally positive/centrally negative, while false negative was defined as locally negative/centrally positive. Thus, for studying HER2 concordance, 25 cases with central HER2-negative status were exchanged, 20 from IEO, and 5 from Mayo. Cases that were centrally HER2 positive were not included in phase 1 of the ring study. For studying ER status, 25 centrally ER-negative cases (5 from IEO and 20 from Mayo) and 10 centrally ER-positive (5 from each laboratory) cases were exchanged.

Phase 2 was initiated when IEO identified 5 of the 20 Mayo ER-negative cases from phase 1 as ER positive. The Mayo laboratory re-evaluated 11 of the previously tested cases using the dual ER antibody method used at the IEO site. The 5 discordant cases and 6 non-discordant cases were retested together to maintain blinding of the pathologists.

Phase 3 was initiated when it was recognized that the HER2 testing in the phase 1 plan included only cases that were locally HER2 positive, but centrally HER2 negative. The additional question was whether IEO central review would confirm HER2 positivity for cases that were locally IHC equivocal (by local laboratory criteria) and centrally HER2 positive at Mayo. Therefore, in phase 3, 23 additional cases not previously involved in the ring study and with local IHC equivocal results for HER2 were sent from Mayo to IEO for HER2 retesting: 10 were Mayo IHC positive and FISH positive (ratio >2.2), 5 were Mayo IHC equivocal and FISH positive, 5 were Mayo IHC equivocal and FISH negative (ratio <1.8), and 3 were Mayo IHC negative and FISH negative.

HER2 and ER testing methods

The various procedures used by the Mayo and the IEO laboratories are described in Supplementary Appendix B. In the central laboratories, HER2 IHC was tested using the HercepTest® kit (Dako, Carpinteria, CA), and HER2 FISH was tested using the PathVysion HER2 DNA probe kit/HER2/CEP17 probe mixture (Abbott Molecular, Des Plaines, IL). HER2 positivity was defined according to the 2007 ASCO/CAP guidelines (IHC positive: 3+ complete membrane staining in >30 % of invasive cells; FISH positive: HER2/CEP17 ratio >2.2) [3].

IEO and Mayo performed IHC for ER each according to their own methods, previously used in large multicenter trials and each recommended by ASCO/CAP ER/PR testing guidelines. IEO used a dual antibody (Dako cocktail of ER 1D5 and 2.123 monoclonal antibodies), while Mayo used a single antibody (Dako ER 1D5 monoclonal antibody). ER status was defined as positive if ≥1 % cells stained positively versus <1 % for negative.
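The positivity definitions above amount to simple threshold rules, which the following Python sketch makes explicit. It is illustrative only: the function names are hypothetical, and the equivocal FISH band (ratio 1.8 to 2.2) is inferred from the positive (>2.2) and negative (<1.8) cut-offs quoted in this section. The rules for combining IHC and FISH into a final eligibility call are not reproduced here, because the definitions of testing results are summarized in Supplementary Appendix A.

```python
def her2_fish_category(ratio: float) -> str:
    """Classify a HER2/CEP17 ratio with the cut-offs quoted in the text."""
    if ratio > 2.2:
        return "positive"
    if ratio < 1.8:
        return "negative"
    return "equivocal"  # assumed band: 1.8 to 2.2 inclusive


def her2_ihc_positive(score: int, percent_cells: float) -> bool:
    """IHC positive: 3+ complete membrane staining in >30 % of invasive cells."""
    return score == 3 and percent_cells > 30


def er_status(percent_stained: float) -> str:
    """ER positive if at least 1 % of cells stain positively."""
    return "positive" if percent_stained >= 1 else "negative"


if __name__ == "__main__":
    for ratio in (1.5, 2.0, 2.5):
        print(ratio, her2_fish_category(ratio))
    print(er_status(0.5), er_status(1.0))
```

Written this way, a ratio of exactly 2.2 is equivocal rather than positive, which is why small differences in signal counting near the cut-off can change the category.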

Statistical analysis

All statistical analyses were descriptive, consisting primarily of listings of results for the cases, as well as relative frequencies with exact binomial confidence intervals. The phase 1 sample size of 30 retested cases per laboratory was selected primarily to control costs for this underfunded investigation. In phase 1, the selection of more cases in some categories (20 HER2 false positive from IEO to Mayo and 20 ER false positive from Mayo to IEO) was based on the two primary goals to be evaluated. With 20 cases in a group, the 95 % exact binomial confidence interval would be 83–100 % if all 20 were confirmed, and 51–91 % if 15 of the 20 were confirmed (point estimate 75 %).
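The exact binomial intervals quoted above can be reproduced with a short calculation. The Python sketch below assumes that "exact" refers to the Clopper-Pearson construction, which the text does not state explicitly. It uses only the standard library and finds each bound by bisection on the binomial tail probability; the function names are hypothetical.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect_decreasing(f, lo: float = 0.0, hi: float = 1.0, iters: int = 100) -> float:
    """Root of a function that decreases from positive to negative on [lo, hi]."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_binomial_ci(k: int, n: int, alpha: float = 0.05) -> tuple:
    """Two-sided Clopper-Pearson interval for k successes in n trials."""
    # Lower bound: the p at which P(X >= k) equals alpha/2 (0 when k == 0).
    lower = 0.0 if k == 0 else _bisect_decreasing(
        lambda p: alpha / 2 - (1 - binom_cdf(k - 1, n, p)))
    # Upper bound: the p at which P(X <= k) equals alpha/2 (1 when k == n).
    upper = 1.0 if k == n else _bisect_decreasing(
        lambda p: binom_cdf(k, n, p) - alpha / 2)
    return lower, upper


if __name__ == "__main__":
    for k, n in ((20, 20), (15, 20)):
        lo, hi = exact_binomial_ci(k, n)
        print(f"{k}/{n}: {100 * lo:.0f}-{100 * hi:.0f} %")
```

Under this assumption, the two design scenarios (20 of 20 and 15 of 20 confirmed) round to the 83–100 % and 51–91 % intervals given in the text.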

Results

Phase 1: HER2 central laboratory concordance

The results of the HER2 testing in phase 1 are shown in Supplementary Appendix D. Both central laboratories identified ALTTO ineligibility (HER2 negative) in each of the 25 cases exchanged. There was a slight tendency for Mayo evaluations to be in the equivocal category (rather than in the negative category) for both IHC (7 Mayo, 1 IEO) and FISH (3 Mayo, 0 IEO) determinations, but this did not impact the HER2-negative ineligibility determination in any case.

Phase 1: ER central laboratory concordance

By contrast, discordance was observed with respect to ER determinations (Table 3). Five of 34 cases with ER determination in both laboratories [15 %; 95 % confidence interval (CI), 5–31 %] were discordant. Five of the 20 Mayo ER-negative cases sent to IEO (25 %; 95 % CI 9–49 %) were determined to be ER positive at IEO. Assessments were concordant for all 5 IEO ER-negative cases, for all 5 IEO ER-positive cases, and for all 5 Mayo ER-positive cases.

Phase 2: ER retesting at Mayo

When the 11 cases in phase 2 were retested at Mayo using the dual antibody, all 11 gave results that were concordant with the IEO central laboratory determination (Table 3). Figure 1 shows the difference in staining for ER in two carcinomas when the dual antibody is used as compared with the single antibody.

Table 3 Concordance in ER status between central laboratories—35 cases included in the original set (phase 1) and 11 cases repeated at Mayo using dual antibody (phase 2)
Fig. 1 Representative ER IHC staining in two separate carcinomas (×10). Sections A and C are each from carcinoma #1; B and D are each from carcinoma #2. All stains were performed at Mayo. ER IHC staining using single ER antibody (1D5) is depicted in A, B. ER IHC staining using the dual ER antibody cocktail (1D5/2.123) is depicted in C, D. Each carcinoma shows negative staining with single ER antibody (A, B) and positive staining by dual antibody cocktail (C, D)

Phase 3: HER2 retesting at IEO

Table 4 shows the results of the review of the 23 cases included in phase 3. All 23 cases had equivocal local HER2 IHC results. Five of the 15 cases (33 %; 95 % CI 12–62 %) that were HER2 positive at Mayo central review did not reach the threshold of positivity at IEO review. By contrast, only 1 of the 8 cases (12 %; 95 % CI 0–53 %) that were HER2 negative at Mayo was HER2 positive at IEO (Table 4). A clear tendency was observed for ratios above 2 to be higher in the Mayo central review than in the IEO central review (Fig. 2). This tendency is due to the fact that the Mayo central review counted nuclei with fewer than 2 green signals, while the IEO central review did not.

Table 4 23 cases with locally equivocal IHC for determination of HER2 positivity (bold values indicate discordance between central laboratories)
Fig. 2 HER2 FISH ratios for 23 local HER2 IHC equivocal cases comparing Mayo and IEO in phase 3 of this ring study. All 15 amplified Mayo cases have higher FISH ratios than the IEO values

Discussion

This ring study clearly established diversity between central laboratory results due to both technical issues in immunohistochemical testing and interpretation issues in FISH testing. Epitope mapping of the two antibodies contained in the dual ER cocktail used at IEO indicates binding to amino acid sequence 15–23 of the N-terminus for clone ER-2-123 (region A), and binding to amino acid sequence 127–130 for clone 1D5 (region B) [5]. We postulate that the different rates of false-negative ER determination are due to the use at IEO of the dual ER antibody, which has a second epitope binding site, compared with the older single-epitope ER 1D5 reference antibody used at Mayo. In phase 2, the use of the dual antibody on cases reassessed at Mayo found ER expression levels concordant between the two laboratories. Standardized assays using the same reagents tend to increase concordance between laboratories; the same method on the same tissue yielded the same result in ER IHC testing. ASCO/CAP guidelines for ER testing identified four well-validated antibodies for ER immunohistochemical detection on the basis of outcomes in estrogen receptor modulating therapeutic trials [4]. These antibodies were 6F11, 1D5 (used at Mayo), SP1, and the dual 2.123 + 1D5 (used at IEO). We found more discordance between these clinically validated ER antibodies than previously reported in a single laboratory study demonstrating 99 % concordance between the two antibodies [6].

Interpretation appears to play a role in the assessment of HER2 positivity, affecting both IHC and FISH results. The higher agreement regarding HER2 positivity between Mayo and local US sites compared with IEO and local rest-of-world sites might be due in part to the differences between central laboratories highlighted in phase 3 of this ring study. While the phase 1 initial part of the ring study provided 100 % concordance between central laboratories in cases that were clearly not eligible, the phase 3 part involving 23 cases with equivocal local HER2 IHC results revealed more positive FISH calls at Mayo compared with IEO. Similar low discordance was found in another HER2 international ring study utilizing predominantly equivocal immunohistochemical and borderline FISH cases [7]. There are several reasons for the discordance in phase 3. First, some discordance in FISH results between laboratories is likely due to the way the FISH-amplified signals are counted. When clouds of amplification (strongly clustered fluorescence partially obscuring individual nuclear signals) are seen, the Mayo pathologists assign a copy number of 20 based on the strong belief that one cannot count the signals in the cloud. Others, however, try to estimate the number of HER2 signals, and these estimates are always less than 20. Second, there are also differences in how the green (control CEP17) signals are counted. Some laboratories (such as the central IEO laboratory) only enumerate nuclei with 2 or more green signals. Other laboratories count nuclei with any numbers of green signals. The latter method will also inflate the HER2:CEP17 ratio. Third, phase 3 was focused on IHC equivocal cases. A large proportion of these cases had duplication or low-level amplification (e.g., 3–6 HER2 signals). The HER2:CEP17 ratios for such cases usually range from 1.3 to 3.0, so some laboratories may observe ratios slightly over and others slightly under 2.0 (or 2.2). Fourth, two of the 23 cases in phase 3 of the ring study contained clear heterogeneous HER2 amplification. However, the overall ratio was less than 2, and the fraction of amplified cells was less than 50 %. Accordingly, the final classification of these tumors was not amplified. In light of the new ASCO/CAP recommendations, these tumors would qualify as amplified if the fraction of amplified cells is higher than 10 %. Fifth, two cases were reflexed (at Mayo) to a 17 p-arm control probe (D17S122) because CEP17 exhibited frequent aneusomy, and this reflex increased the ratio.
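The effect of the second reason (which nuclei are enumerated for the CEP17 control) can be illustrated with a small calculation. The Python sketch below uses entirely hypothetical signal counts, not data from this study, and assumes the ratio is computed as total HER2 signals divided by total CEP17 signals over the counted nuclei. The function name and the `min_cep17` parameter are illustrative only.

```python
from typing import Iterable, Tuple

# One nucleus is recorded as (HER2 signals, CEP17 signals).
Nucleus = Tuple[int, int]


def her2_cep17_ratio(nuclei: Iterable[Nucleus], min_cep17: int = 0) -> float:
    """Ratio of total HER2 to total CEP17 signals.

    Only nuclei with at least `min_cep17` CEP17 (green) signals are counted.
    """
    counted = [(her2, cep17) for her2, cep17 in nuclei if cep17 >= min_cep17]
    total_her2 = sum(her2 for her2, _ in counted)
    total_cep17 = sum(cep17 for _, cep17 in counted)
    if total_cep17 == 0:
        raise ValueError("no CEP17 signals in the counted nuclei")
    return total_her2 / total_cep17


if __name__ == "__main__":
    # Hypothetical low-level amplified case; some nuclei show 1 CEP17 signal.
    nuclei = [(6, 2), (5, 1), (4, 2), (6, 1), (3, 2)]
    print(her2_cep17_ratio(nuclei))               # all nuclei counted: 3.0
    print(her2_cep17_ratio(nuclei, min_cep17=2))  # >= 2 green signals only: about 2.17
```

In this invented example the same five nuclei give a ratio above 2.2 when every nucleus is counted and a ratio below 2.2 when only nuclei with 2 or more green signals are enumerated, which is the kind of shift near the cut-off that the text describes.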

It is not possible to know which central laboratory determination of HER2 status or ER status was biologically correct in terms of distinguishing patients who do or do not benefit from HER2-targeted or endocrine therapies, respectively. Determinations of trial eligibility based on interpretation of FISH testing in HER2 immunohistochemically equivocal patients differed somewhat between central laboratories due to small variations in FISH signal quantification. Because the same immunohistochemically equivocal case read by both laboratories had systematically higher ratios at Mayo than at IEO, patients classified as eligible to enter ALTTO from North America would include some with lower intrinsic HER2-positive FISH signaling who would have been classified as ineligible if screened from the rest of world. It is unlikely that this discordance will have a meaningful impact on trial results given the relatively few immunohistochemically equivocal cases for which differences in interpretation of HER2 FISH results would yield a difference in ALTTO eligibility [8].

Information regarding estrogen receptor central testing was communicated to trial participants from both central testing sites. Local treatment with estrogen receptor modulating therapies was not specified in the trial and was determined by locally treating physicians. Whether local and central discordances in ER testing would change the utilization of endocrine therapy locally is not known. With respect to ER status, more rest-of-world cases will be classified as centrally ER positive and thus may receive adjuvant endocrine therapy which would not have been given based on local ER-negative results. If the locally ER-negative, but centrally ER-positive disease is in fact responsive to endocrine therapy, better overall results might be achieved.

Some evidence regarding the validity of IEO central review of ER status comes from the BIG 1-98 trial [2]. Postmenopausal women with locally assessed steroid hormone receptor-positive disease were enrolled. Of 3,610 cases evaluated, 94 were found by the central IEO laboratory to have steroid hormone receptor-negative disease. The subsequent disease course of these 94 patients precisely followed the natural course expected for ER-negative breast cancer: rapid early recurrence in a proportion of patients, followed by a plateau and long-term disease freedom for many (Ref. 2, Fig. 2C). Furthermore, cases with central ER staining in 1–9 % of cells (using the single epitope 1D5 antibody) had a disease-free survival better than those with central ER-absent disease (no ER expression in any cells) (Ref. 2, Fig. 3A).

An important side effect of the pre-randomization central laboratory review of HER2 and ER conducted prior to initiation of our ring study was that discordance between central and local laboratories could be identified. ALTTO enrollment from Germany was substantial, and it was noted that the discordance rate for HER2 assessment from Germany was slightly higher than the mean discordance rate for all rest-of-world cases. Drs. Jackisch and Untch, German ALTTO principal investigators, organized a meeting of the officers of the German Society of Pathologists specifically to discuss the discordance rate. Dr. Viale provided discordance data for each of the 192 German centers. Some (10) high-volume laboratories in Germany were identified as having a high discordance rate (up to 50 %), which substantially increased the discordance rate for Germany. Initiatives were undertaken by the German Society of Pathologists to address the issue with the relevant centers (see Supplementary Appendix E), and the concordance between local German assessment and central IEO assessment for both HER2 and ER improved substantially. Perhaps the most important contribution of the central review initiative is that patients are now benefiting from high-quality biological determination of the characteristics of their tumor. This is particularly critical for HER2 and ER, as these two biomarkers are used to directly determine the type of adjuvant therapy to be used.

ASCO/CAP guidelines are available to guide HER2 and ER determination [3, 4], and proficiency testing is also required for every laboratory that provides information for breast cancer biomarkers. A re-evaluation of HER2 guidelines for positivity is now available [9]. It remains to be seen whether the treatment guided by local determinations leads to worse outcomes than the treatment guided by central review. In fact, some evidence suggests that trastuzumab is effective for patients with centrally determined HER2-negative disease, which was locally determined to be HER2 positive [8, 10]. The degree of endocrine responsiveness is unknown for that subset of tumors that are locally ER negative but centrally ER positive based on the newer dual antibody.

In this multinational combined collaborative group trial, we can be assured that the patients enrolled in the trial bear disease with the HER2 target as confirmed by central testing. Excellent concordance in testing for trial eligibility was obtained between central confirming laboratories using identical methods and positivity criteria. Minor differences in HER2 immunohistochemical quantification and FISH amplification between central laboratories were detected in a selected small population; standardization of these types of morphological parameters is an area of intense exploration. Differences in centrally confirmed ER IHC status based on antibody choice were detected. A recommendation for future clinical trials involving multiple collaborative groups is that the validated hormonal receptor antibody should be specified and that central testing should be performed to determine trial eligibility. Given the absence of analytical standards in breast marker immunohistochemical and in situ hybridization methods, similar ring study comparisons between confirming collaborative laboratories are encouraged in the context of clinical trials.