Introduction

The prognostic relevance of histologic subtype within high-grade endometrial carcinomas (ECs) is poorly defined. It is, however, generally accepted that high-grade endometrioid-type ECs (EECs, GR3) have a slightly better prognosis than the high-grade non-endometrioid ECs. For adjuvant treatment decisions, a risk stratification (e.g., low/intermediate/high-intermediate/high risk) is made, which relies on a combination of clinicopathological risk factors including the International Federation of Gynecology and Obstetrics (FIGO) stage, grade, age, lymphovascular space invasion (LVSI), and histologic subtype. FIGO stage III/IV disease is considered high risk by definition, independent of any of the other factors. In stage I/II disease, the risk assignment is stratified depending, among other factors, on grade and histotype [1]. For risk assignment of a patient with stage I/II disease with a high-grade EC, histologic subtype is considered relevant: patients with FIGO stage IA myoinvasive grade 3 endometrioid-type EC (GR3 EEC) without substantial LVSI are considered intermediate risk, whereas those with myoinvasive stage IA non-endometrioid-type (non-EEC) are considered high risk. Similarly, FIGO stage IB GR3 EECs are high-intermediate risk, whereas FIGO stage IB non-EEC would be considered high risk [1]. Therefore, in the context of stage I/II disease, distinguishing histologic subtype of a high-grade EC may have consequences for clinical management.

High-grade EC is a heterogeneous group of tumours consisting of GR3 EEC and non-EECs including serous carcinoma (SEC), clear cell carcinoma (CCC), mixed epithelial carcinomas, de-/undifferentiated endometrial carcinomas (DEC), and uterine carcinosarcoma (UCS). Despite the apparently clear histological description of high-grade histologic subtypes in the WHO classification [2], it has now been well documented that significant interobserver variability exists, even among experts [3,4,5,6,7]. This is likely due to the morphologic heterogeneity of this disease, in which a significant number of cases are difficult to classify. Although immunohistochemical markers are likely to be helpful in these ambiguous high-grade ECs (e.g., Napsin A for the diagnosis of CCC), these markers are not uniformly used and are not always conclusive [8, 9]. This poses a problem for the clinical management of those stage I/II patients for whom the risk assignment relies on histologic subtype.

Research groups aware of this problem invest significant amounts of time in having retrospective cohorts reviewed by specialized gynecopathologists to ensure uniformity in the research setting [7, 10,11,12]. In addition, this interobserver variability issue has resulted in the recommendation to apply a low threshold for pathology revision of high-grade EC in clinical practice, suggesting that experienced and specialized pathologists may be in a better position to assign histologic subtype. The obvious downside of this practice is the time and costs involved, in both the clinical and the research setting. The consequences of this general practice are only poorly studied; hence, it is worthwhile to clarify the impact of possible changes on clinical outcome in relation to the revised diagnosis. Therefore, the aim of this study was to investigate the effects of histological review of high-grade EC and its prognostic impact in a large national Danish cohort.

Materials and methods

The Danish Gynecological Cancer Database (DGCD) includes 4707 EC patients diagnosed between January 1, 2005, and December 31, 2012 [13]. The DGCD holds prospectively registered information about initial surgical and adjuvant treatment, pathology diagnosis, and follow-up data [14]. From the DGCD 2005–2012 cohort, we included 425 patients with an original diagnosis of high-grade EC (all histologic subtypes except uterine carcinosarcomas). Of these, at least one hematoxylin and eosin (H&E)–stained slide from 396 cases (93.2%) could be retrieved for review (Fig. 1). These cases were originally diagnosed at 19 different pathology institutes distributed throughout Denmark. Distribution in age, original histologic subtype, stage, lymphovascular space invasion (LVSI) status, and risk group according to ESMO-ESGO-ESTRO 2016 [15] are shown in Table 1. Follow-up data for Cox analyses and Kaplan–Meier curves were retrieved from the database, from the national patient file registry, and from the patients' medical records. Missing data regarding recurrences were retrieved from the pathology reports in the Danish pathology database. Deaths were retrieved from the Danish Person Register and Cause of Death Register.

Fig. 1 CONSORT diagram

Table 1 Distribution in age, original histological type, FIGO stage, lymphovascular space invasion (LVSI) status, and risk group according to ESMO-ESGO-ESTRO 2016 [15]. SEC, serous EC; CCC, clear cell carcinoma; DEC, de-/undifferentiated EC; GR3 EEC, grade 3 endometrioid-type EC

Pathology revision

The review was performed by four gynecopathologists (EEMP, ALC, VTHBMS, and TB). Even though in some instances immunohistochemistry was used for the original diagnosis, the histology review for this study was performed with H&E slides only. The vast majority of cases included H&E slides from the hysterectomy specimen (394/396; 99.5%), but in two cases the review was limited to an H&E slide of the endometrial biopsy (2/396; 0.5%). The mean number of slides reviewed per case was 10.9 (median 10, range 1–70), and cases were equally and randomly distributed among the members of the reviewing group. Prior to final histologic subtype assignment, all cases with ambiguous morphology (68/396; 17.2%) were discussed by the review group together to reach a consensus diagnosis. The review group was blinded to the original diagnosis and any of the other clinicopathological variables listed in Table 1. The pathology review focused on histologic subtype and did not include re-assessment of grade or FIGO stage. The review group also assessed LVSI extent in this study cohort, results of which will be published separately.

The cases included were originally diagnosed as high-grade carcinomas including GR3 EEC, SEC, CCC, or un-/dedifferentiated carcinoma (DEC). For histologic subtype assignment, the review group used the terminology of the WHO 2014 [2]. In a minority of cases, histology could not be assessed due to poor tissue fixation, a tumour too small to assess, or no remaining tumour in the available slides from the hysterectomy.

Statistics

For the statistical analysis of interobserver variability between the original and the reviewed diagnosis, we used eight categories as shown in Table 2, similar to the categorizations used in two other studies, which were based on histological cell type or on major/minor disagreement, respectively [3, 5]. Mixed cell carcinomas were categorized according to their high-grade component or, in the case of two high-grade components, according to the major component. Interobserver variability was analyzed using simple kappa statistics with 95% confidence limits. Furthermore, interobserver variability was stratified by the type of institute that made the original diagnosis (subspecialized or general) and by stage, and the strata were tested for differences under the hypothesis of equal kappa values. Calculations were done using SAS v.9.4 (SAS Institute, Cary, NC, USA).
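As an illustration of the agreement statistic described above, the following minimal Python sketch computes Cohen's simple kappa from a square table of original versus reviewed diagnoses. The function name `cohen_kappa`, the two-category table, and its counts are hypothetical and are not study data. The confidence interval uses a simple large-sample approximation of the standard error, which is an assumption of this sketch and may differ slightly from the asymptotic standard error reported by SAS.

```python
import math

def cohen_kappa(table, z=1.959964):
    """Simple (unweighted) Cohen's kappa for a square agreement table.

    table[i][j] = number of cases with original category i and
    reviewed category j. Returns (kappa, lower, upper), where the
    limits are an approximate 95% confidence interval.
    """
    n = sum(sum(row) for row in table)
    k = len(table)
    # Observed agreement: proportion of cases on the diagonal.
    p_obs = sum(table[i][i] for i in range(k)) / n
    # Expected chance agreement from the row and column marginals.
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]
    p_exp = sum(row_tot[i] * col_tot[i] for i in range(k)) / (n * n)
    kappa = (p_obs - p_exp) / (1 - p_exp)
    # Simplified large-sample standard error (assumption of this sketch).
    se = math.sqrt(p_obs * (1 - p_obs) / (n * (1 - p_exp) ** 2))
    return kappa, kappa - z * se, kappa + z * se

# Hypothetical 2x2 example (rows: original, columns: reviewed).
toy_table = [[20, 5],
             [10, 15]]
kappa, lower, upper = cohen_kappa(toy_table)
print(f"kappa = {kappa:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

For the eight categories of Table 2, the same function would take an 8 × 8 table with the original diagnosis in rows and the reviewed diagnosis in columns.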

Table 2 Categories for histological types. SEC, serous EC; CCC, clear cell carcinoma; EEC, endometrioid-type EC; DEC, de-/undifferentiated EC; GR3 EEC, grade 3 EEC; EIN, endometrioid intraepithelial neoplasia; UCS, uterine carcinosarcoma; MC, mucinous carcinoma

For statistical analyses regarding clinical outcome, a predefined categorization into four groups was used. This allowed for a comparison between GR3 EEC, SEC, CCC, and other high-grade ECs. The "other" group contained all remaining histological subtypes of high-grade EC, such as DEC and UCS. Recurrence-free survival (RFS) was calculated from the time of surgery to the first recurrence, omitting patients who died from causes other than EC. Overall survival (OS) was calculated from the time of surgery to death. The Kaplan–Meier method was used to calculate survival rates, with p values for the Kaplan–Meier curves based on the log-rank test. Hazard ratios were calculated with Cox regression analyses, where adjustments were made for age, comorbidity using ASA score, FIGO stage, lymph node resection, and/or adjuvant treatment. GR3 EEC was used as reference. Cases that were not high-grade carcinoma at revision were omitted from calculations of RFS and OS. p values for RFS and OS hazard ratios were calculated using the adjusted Cox proportional hazards model. Calculations were done using STATA 11 (StataCorp, College Station, TX, USA).
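The Kaplan–Meier method mentioned above can be sketched in a few lines of Python. This is a minimal illustration of the product-limit estimator only; the log-rank test and the Cox model are not shown. The function name `kaplan_meier` and the follow-up times (in months) are hypothetical and are not study data.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times:  follow-up time for each patient
    events: 1 if the event (death or recurrence) was observed at that
            time, 0 if the patient was censored
    Returns a list of (time, survival probability) pairs, one for each
    distinct time at which at least one event occurred.
    """
    event_counts = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)  # events and censorings both leave the risk set
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = event_counts.get(t, 0)
        if d:
            # Multiply by the conditional probability of surviving time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve

# Hypothetical follow-up in months; 1 = event observed, 0 = censored.
toy_times = [6, 12, 12, 24, 36, 60]
toy_events = [1, 0, 1, 1, 0, 0]
curve = kaplan_meier(toy_times, toy_events)
for t, s in curve:
    print(f"t = {t:>2} months, S(t) = {s:.3f}")
```

Patients censored at a given time are counted as at risk for events at that same time, which is the usual convention for tied event and censoring times.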

Results

The distribution of the original histologic subtypes and the revised histologic subtypes is shown in Table 3. Of a total of 396 high-grade ECs, histology review could be performed on 384 (97%). The 396 cases were originally diagnosed as GR3 EEC (n = 163; 41.2%), SEC (n = 141; 35.6%), CCC (n = 83; 21.0%), and un-/dedifferentiated carcinomas (n = 9; 2.3%). This distribution changed substantially after review, which added one category: GR3 EEC (n = 181; 45.7%), SEC (n = 133; 33.6%), CCC (n = 38; 9.6%), DEC (n = 17; 4.3%), and UCS (n = 13; 3.3%). Only two cases (0.5%) were not considered to be high-grade EC on review; they were re-classified as EIN (n = 1; 0.25%) and mucinous carcinoma (n = 1; 0.25%), respectively. In both these outlier cases, the available H&E slides were from representative tumour from the hysterectomy specimen. The original diagnoses of these two cases were GR3 EEC and CCC, respectively. Furthermore, 12 cases (3.0%) could not be revised: 10 due to lack of tumour in the available H&E slides and 2 due to insufficient fixation quality for assessment. The distribution of these cases is presented in Table 4.

Table 3 Original and revised histological types. SEC, serous EC; CCC, clear cell carcinoma; EEC, endometrioid-type EC; DEC, de-/undifferentiated EC; GR3 EEC, grade 3 EEC; EIN, endometrioid intraepithelial neoplasia; UCS, uterine carcinosarcoma; MC, mucinous carcinoma
Table 4 Distribution of cases that could not be revised. SEC, serous EC; CCC, clear cell carcinoma; DEC, de-/undifferentiated EC; GR3 EEC, grade 3 endometrioid-type EC

Overall kappa value was 0.42. The highest concordance was obtained for GR3 EEC and SEC with 75.5% and 63.8%, respectively. For CCC and undifferentiated carcinoma, the concordance was considerably lower with 30.1% and 33.3%, respectively. The main histologic subtype shift was from SEC to GR3 EEC (26/43; 60.5%), followed by GR3 EEC to SEC (19/39; 48.7%). Interestingly, review of the 83 original CCC resulted in 29 GR3 EECs and 23 SECs, while only 25 remained CCC. Examples of CCC that were re-classified are shown in Fig. 2.

Fig. 2 Original CCC that were re-classified as either GR3 EEC (A), SEC (B), or remained CCC (C)

Looking at concordance per stage, there were no statistically significant differences. Most of the patients were stage I (n = 292), and the distribution and type of discrepancies of stage I were completely in line with the overall results. For stage II–IV, numbers of patients were too small to draw any conclusions, but we saw no obviously different tendencies. Also, there were no significant differences in concordance whether the original diagnosis was made at a general or subspecialized institute.

Five-year survival, hazard ratios (HRs), and p values based on the Cox proportional hazards model for OS and RFS are shown in Table 5, and Kaplan–Meier curves for OS and RFS in Fig. 3. The OS of patients originally diagnosed with GR3 EEC, SEC, and CCC was not significantly different, and this remained the case after revision, despite the shift in histologic subtypes. However, patients with SEC had a poorer RFS than those with GR3 EEC, with stronger significance after revision (HR 2.36 (95% CI 1.43–3.89), p = 0.001) than for the original diagnosis (HR 1.74 (95% CI 1.07–2.81), p = 0.024). Finally, patients with an EC falling under the “other” category, consisting of un-/dedifferentiated carcinoma and UCS after review, had significantly worse OS and RFS than those with GR3 EEC for revised diagnoses, with HR 2.41 (95% CI 1.39–4.16; p = 0.002) and HR 3.65 (95% CI 1.81–7.35; p < 0.001), respectively, while there was no statistically significant difference for original diagnoses, with HR 2.10 (95% CI 0.92–4.78; p = 0.078) and HR 2.35 (95% CI 0.69–8.06; p = 0.174), respectively.
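The relation between a reported hazard ratio, its 95% confidence interval, and its p value can be checked with a short calculation. The Python sketch below, with the hypothetical helper `wald_p_from_ci`, back-calculates the standard error of the log hazard ratio from the confidence limits and derives a two-sided Wald p value. It assumes that the interval is symmetric on the log scale and uses only the rounded values quoted above for SEC versus GR3 EEC (RFS), so the results only approximate the published p values.

```python
import math

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal

def wald_p_from_ci(hr, lower, upper):
    """Approximate Wald z statistic and two-sided p value for a hazard
    ratio, back-calculated from its 95% confidence interval."""
    log_hr = math.log(hr)
    # The interval width on the log scale equals 2 * Z_95 * SE.
    se = (math.log(upper) - math.log(lower)) / (2 * Z_95)
    z = log_hr / se
    # Two-sided normal tail probability.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

# SEC versus GR3 EEC, recurrence-free survival, values as reported.
z_rev, p_rev = wald_p_from_ci(2.36, 1.43, 3.89)    # revised diagnosis
z_orig, p_orig = wald_p_from_ci(1.74, 1.07, 2.81)  # original diagnosis
print(f"revised:  z = {z_rev:.2f}, p = {p_rev:.4f}")
print(f"original: z = {z_orig:.2f}, p = {p_orig:.4f}")
```

Under these assumptions, both recomputed p values agree with the reported ones to within rounding, and the larger z statistic for the revised diagnosis reflects the stronger separation of SEC from GR3 EEC after review.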

Table 5 Five-year overall survival and recurrence-free survival, HR with 95% CI, and p values based on Cox proportional hazards model. GR3 EEC serves as reference. GR3 EEC, grade 3 endometrioid-type EC; SEC, serous EC; CCC, clear cell carcinoma; Other, other types of high-grade EC
Fig. 3 Kaplan–Meier curves for 5-year overall survival and recurrence-free survival, original and revised diagnosis. SEC, serous EC; CCC, clear cell carcinoma; GR3 EEC, grade 3 endometrioid-type EC; Other, other types of high-grade EC

Discussion

We present an interobserver pathology study of a large nationwide high-grade EC cohort including well-documented clinical outcome data. We were able to retrieve 90% of all high-grade EC cases, and the data presented are therefore a good reflection of the true distribution of high-grade EC in Denmark.

It was reassuring to find that after revision as many as 99.5% of cases were consistently diagnosed as high-grade EC by specialized gynecopathologists, despite the fact that the original diagnoses were made at 19 different pathology institutes, subspecialized as well as general. However, this study showed once again that histological subtyping of high-grade EC is poorly reproducible. From a clinical management perspective, one may argue that this inconsistency in histological type assignment has limited consequences, as adjuvant treatment recommendations according to international guidelines [1] would be altered for a minority of patients. This mainly involves reallocation from GR3 EEC to non-EEC and vice versa in FIGO stage I/II. In Denmark, currently the only exception would be the indication for omentectomy in SEC and DEC, which is not considered to be relevant for patients with GR3 EEC. In other countries, other choices are made, which is why the impact of the observed diagnostic shift may vary per country.

The overall agreement of histologic subtype assignment in our high-grade EC cohort was only moderate, with a kappa value of 0.42. This is in agreement with other studies, which reported kappa values of 0.30–0.68 for high-grade EC [4, 5, 7, 10], illustrating the limited reproducibility of histological subtyping of high-grade EC. The highest reproducibility was obtained for GR3 EEC (75.5%) and serous EC (63.8%). In addition, 13 cases were re-classified as uterine carcinosarcomas upon revision. The higher number of revised histological types is likely a reflection of the lack of reproducible histologic subtype-specific features. This appeared particularly problematic for the diagnosis of CCC, as CCC was the subtype with the worst reproducibility.

CCC often includes a mixture of architectural patterns and can be difficult to distinguish from variants of EEC and SEC. In the new WHO classification published in 2020 [16], it was stressed that strict adherence to architectural and cytological diagnostic criteria is required to optimize the diagnostic reproducibility of CCC. Adding an immunohistochemical panel of ER/PR, p53, Napsin A, and HNF1Beta likely improves the correct diagnosis of CCC, but is not always helpful [8, 9]. Consequently, the WHO 2014 histology-based classification of EC is an insufficient basis for histotype-directed clinical treatment decisions and forms a poor basis for clinical trial inclusion.

The WHO 2020 [16] introduced the molecular classification, which relies on the analysis of surrogate markers in order to identify the four subgroups analogous to the ones described by The Cancer Genome Atlas (TCGA) [17]. This novel classification has a strong prognostic value and higher reproducibility than the histology-based classification [17,18,19,20] and therefore may be a better basis for future clinical trials [19]. Most of the data on the molecular EC classification is derived from analysis of EEC and SEC; however, small series of CCC indicate that the molecular classification may also be applicable to CCC [21, 22]. The clinical relevance of the rarer histologic EC subtypes remains to be determined in larger cohorts, and therefore, it will remain important to accurately assign histologic subtype going forward with the molecular classification. As H&E-based histologic subtyping of high-grade EC is poorly reproducible, the use of diagnostic IHC markers such as PTEN, ARID1A, Napsin A, and ER/PR is advisable.

Although the interobserver variability of high-grade EC diagnosis has been addressed in previous works, this is the first study to include an assessment of the impact of revision on RFS or OS. This is of obvious importance, as histologic classification systems are meant to serve as an important prognostic variable and guide treatment. The shift between the high-grade subtypes GR3 EEC, SEC, and CCC at revision had no significant impact on overall survival. However, the group of GR3 EEC had better RFS with much stronger significance after revision compared to the original diagnosis. Furthermore, there were significantly poorer RFS and OS of the revised DEC and UCS. These findings support the most recent European guidelines which differentiate between GR3 EEC and non-endometrioid subtypes to assign risk groups and consequently different adjuvant treatment recommendations [1]. Therefore, our study builds on previous work and argues in favor of central pathology review for all high-grade ECs in routine clinical practice.

This study is not without limitations. Due to the study design (selection of high-grade EC), there is an over-representation of serous carcinomas compared to the general EC population in Denmark, where 70–80% are EEC and 10% are SEC according to the Danish national guideline group [23], and therefore, we cannot generalize our findings to low-grade EC. We note that previous studies analyzing the interobserver reproducibility of histological diagnosis had a lower proportion of SEC [3, 4, 11, 12]; however, their results did not differ substantially from the present work. Furthermore, due to our approach, we did not adjust for stage in Cox regression analysis, and therefore, the role of stage in this context could not be addressed. In addition, our study design does not fully reflect “real-life” practice. First, for some cases, only selected slides were available for review, possibly omitting the part of the tumour with the most representative morphology. This limitation is counterbalanced by our ability to retrospectively review cases with an average of 10.9 H&E slides per case. Second, review diagnoses were solely based on H&E without any immunohistochemistry (IHC), and it is conceivable that use of an IHC panel would improve interobserver agreement [8, 9]. Finally, in this study, an expert consensus diagnosis was used, which is not completely the same as a referral diagnosis in real-life practice. To improve on these points, a valuable future study would be to analyze interobserver variability of local and referral diagnoses in a country or region that has implemented a standard IHC marker panel for high-grade EC.

In conclusion, we confirmed the substantial interobserver variability in histologic subtyping of high-grade EC in a large Danish population cohort. All but two cases remained high grade; however, a major shift in histologic subtype was observed, most pronounced for CCC. After revision, endometrioid-type high-grade carcinomas had significantly better RFS than SEC, and better RFS and OS than the group of DEC and UCS, but otherwise the shift between the different subtypes of high-grade EC did not change the outcome in terms of RFS or OS. We suggest keeping a low threshold for pathology revision of high-grade EC in clinical practice and foresee that molecular classification of high-grade EC will be a better foundation for future clinical management, as it is built upon more objective parameters.