Introduction

In 1966, Gleason first presented his grading system for prostatic carcinoma (Fig. 1a). The Gleason grading is now universally acknowledged and also endorsed by the WHO [13, 19–22]. During the preparation of the last edition of the WHO classification of prostatic carcinomas, other grading systems were also discussed, such as the modified histo-cytological WHO grading of Mostofi from 2002 and the Helpap subgrading system [13, 19, 23, 25, 33], but only the Gleason grading was finally included in the new WHO classification. When the Gleason grading was introduced, the clinical and diagnostic steps to find prostate cancer and the therapeutic consequences of this diagnosis differed from today’s practice. The basis of the Gleason grading system was the morphological experience from simple prostatectomy or transurethral resection specimens. Core needle biopsies were very rare at that time, and radical prostatectomy as performed today was virtually unknown.

Fig. 1
Schematic representations of conventional (a) and modified (b) Gleason grading system. The most important changes between them are within patterns 3 and 4. In the modified system, most cribriform patterns and also poorly defined glands are included in pattern 4. Reproduced from [14] with permission of Lippincott Williams and Wilkins, Baltimore, MD, USA

The Gleason grading was slightly modified twice, in 1966/1967 and 1974/1977, and then remained unchanged until 2005 [9, 10, 19, 20, 22, 29–31]. After the publication of the new WHO classification of tumours of the urinary system and male genital organs in 2004 [13], there was a continued discussion about Gleason grading of needle biopsies. This issue was debated at the International Consultation on Prediction of Patient Outcome in Prostate Cancer sponsored by the WHO in Stockholm in 2004, and some modifications of the Gleason grading were suggested [1]. A recommendation was given to separately report the Gleason score for each recognizable core. It was also suggested that tertiary patterns of higher grade in needle biopsies should influence the Gleason score [32].

To clarify how the Gleason grading system is applied in practice, a survey was recently circulated among 91 experts in genitourinary pathology in countries around the world [8]. Of 67 responding participants, only 13 and 36%, respectively, ever diagnosed a Gleason score of 2 to 4 on needle biopsies, and 88% of those who did so assigned a Gleason score 4 to <1% of cancers. Cribriform Gleason pattern 3 was acknowledged by 88%, but a majority of them would classify less than 20% of cribriform patterns as Gleason pattern 3. Two thirds included incomplete or poorly defined glands in Gleason pattern 4 (Fig. 2). Although only 26% defined Gleason score on needle biopsies as primary + tertiary Gleason pattern, a majority would mention a tertiary pattern separately. The conclusion was that there was a need to standardize the application of Gleason grading both in terms of interpretation of patterns and reporting of the grade. Therefore, the International Society of Urological Pathology (ISUP) decided to organize a consensus conference, which took place in San Antonio, TX, in 2005 [14]. Essentially the same experts who participated in the survey study were invited to the consensus meeting. The Gleason grading system for prostatic carcinoma now underwent its first major revision including both pattern interpretation and compiling and reporting of grade information. The definition of Gleason patterns 3 and 4 was modified, and the use of Gleason patterns 1 and 2 on needle biopsy and radical prostatectomy specimens was restricted. It was decided that focal areas of high-grade cancer in needle biopsies should be included in the Gleason score.

Fig. 2
Adenocarcinoma of the prostate with both complete, circumscribed glands and poorly formed glands corresponding to a modified Gleason score 3 + 4 = 7. H&E

To be clinically useful, a grading system must be reproducible, and there must be a reasonable agreement between biopsies and surgical specimens [18]. The overall agreement of the Gleason score between biopsy and radical prostatectomy specimens has generally only been between 35 and 45% [2–6, 16, 27, 28, 31, 34–37]. This relatively poor agreement is caused by undergrading of low-grade carcinomas in needle biopsies, while the agreement is better when high-grade carcinomas are diagnosed [16, 28, 31, 37]. Biopsies have been found to under-grade in 24–60% (pooled data 45%) and over-grade in 5–32% (pooled data 10.4%) [11]. Although other grading systems, e.g. the old and new WHO grading or the nucleolar subgrading of Helpap, have shown better agreement between biopsy and radical prostatectomy specimens, the Gleason grading became the preferred grading system [13, 15, 23, 25, 33]. It would be clinically very important if the modified Gleason grading could contribute to a better agreement because the Gleason score is one of the most significant prognostic factors of prostatic carcinoma.

Therefore, we assessed the modified Gleason scores of all prostatic carcinomas diagnosed during 1 year in needle biopsy, radical prostatectomy and transurethral resection specimens at the Department of Pathology, General Hospital Singen. A separate series of biopsies was regraded to compare the score distributions of the two grading systems. Furthermore, the correlation and agreement of Gleason scores were assessed between needle biopsy and radical prostatectomy specimens, both with the conventional and the modified Gleason grading system.

The study demonstrates the changed distributions of the Gleason scores with modified grading, which may be significant for the post-operative prediction of Gleason scores and its influence on therapeutic strategies.

Materials and methods

In 2004, 3,215 adenocarcinomas of the prostate were diagnosed by histological analysis at the Department of Pathology, General Hospital Singen, including 2,502 core needle biopsies, 533 radical prostatectomy specimens and 180 transurethral resection specimens. The prostate tissue was formalin-fixed, usually in 4% formaldehyde. Biopsy specimens were fixed for 12 h, while transurethral resection and radical prostatectomy specimens were fixed for 24–36 h. Radical prostatectomy specimens were worked up according to a modified Stanford method [36]. Every second section was embedded in paraffin. The transurethral resection material was embedded completely in paraffin when suspicious for malignancy. After paraffin embedding, specimens were cut at 3 μm and stained with hematoxylin and eosin. At low microscopic magnification (4–10× objective lens), prostatic carcinomas were graded according to the modified Gleason grading system, and the scores were assessed (Fig. 1b) [14]. Every core needle biopsy was graded, and a final Gleason score (global Gleason score) was assigned for all core needle biopsies of every patient. In 20 cases of Gleason score 4 + 3 = 7, a tertiary pattern 5 (less than 5%) was included in the final Gleason score.

Modified and conventional Gleason grading (Fig. 1a and b) were compared by regrading 368 radical prostatectomy specimens and the available preoperative biopsies, and we estimated the agreement of conventional and modified Gleason scores between both tissue specimens. The radical prostatectomy specimens were received from 1/1996 to 12/2000. Gleason scores 7 were divided into 3 + 4 = 7a and 4 + 3 = 7b and scores 9 to 4 + 5 = 9a and 5 + 4 = 9b. A distinction between Gleason scores 7a and 7b and between 9a and 9b was not routinely made in the original pathology reports.
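The subdivision of scores 7 and 9 described above can be illustrated with a short sketch. The function name `gleason_label` and the label format are illustrative assumptions, not part of the study protocol. The sketch only encodes the stated rule that the order of primary and secondary pattern separates 7a from 7b and 9a from 9b.

```python
def gleason_label(primary: int, secondary: int) -> str:
    """Return a label such as '3 + 4 = 7a' for a primary/secondary pattern pair.

    Hypothetical helper for illustration; the name and output format are assumptions.
    """
    for pattern in (primary, secondary):
        if pattern not in range(1, 6):
            raise ValueError(f"Gleason pattern must be 1-5, got {pattern}")
    score = primary + secondary
    suffix = ""
    # Only 3+4 / 4+3 and 4+5 / 5+4 are subdivided in the text:
    # 'a' when the lower pattern is primary, 'b' when the higher pattern is primary.
    if score in (7, 9) and abs(primary - secondary) == 1:
        suffix = "a" if primary < secondary else "b"
    return f"{primary} + {secondary} = {score}{suffix}"


print(gleason_label(3, 4))
print(gleason_label(4, 3))
```

Scores built from other pattern pairs, for example 3 + 3 = 6 or 4 + 4 = 8, are returned without a suffix.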

For statistical analysis, chi-square was used. A p value of <0.05 was considered significant.
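As a minimal sketch of this kind of comparison, the following computes a Pearson chi-square statistic for a 2 × 2 table using only the Python standard library. The text does not state which software was used or whether a continuity correction was applied, so the uncorrected form is an assumption. The example counts are back-calculated from the reported share of Gleason score 7 among the 368 regraded biopsies (25.5% conventional, 67.9% modified) and are therefore approximate. The function name `chi_square_2x2` is hypothetical.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-square without continuity correction for the table [[a, b], [c, d]].

    Returns (statistic, p_value) for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one observation")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Assumed counts: 25.5% and 67.9% of 368 biopsies, rounded to whole cases.
total = 368
conventional_gs7, modified_gs7 = 94, 250
chi2, p = chi_square_2x2(
    conventional_gs7, total - conventional_gs7,
    modified_gs7, total - modified_gs7,
)
print(f"chi2 = {chi2:.1f}, significant at 0.05: {p < 0.05}")
```

Because the same cases were graded twice, a paired test could also be argued for. The sketch follows the chi-square approach named in the text.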

Results

The distribution of modified Gleason scores in 3,215 prostate specimens is presented in Fig. 3. The most frequent Gleason score of biopsies and radical prostatectomy specimens was 7 (52.8 and 81.8%, respectively), while Gleason score 5 was the most common grade in transurethral resection specimens (28.9%). Gleason scores 7–10 were more common in radical prostatectomy specimens than in needle biopsies (94.4 and 66.7%, respectively, p < 0.001) and more common in needle biopsies than in transurethral resection specimens (66.7 and 43.9%, respectively, p < 0.001). Gleason score 3 + 4 = 7a was more common than 4 + 3 = 7b in all specimen types. In needle biopsies, radical prostatectomy specimens and transurethral resection specimens, Gleason score 7a accounted for 69.3, 67.2 and 79.4%, respectively, of all Gleason score 7 cases.

Fig. 3
Distribution of modified Gleason scores (GS) in needle biopsies (NB), radical prostatectomy (RP) specimens and transurethral resection (TURP) specimens. The grading was performed by two uropathologists

The major changes in Gleason score distribution after regrading 368 needle biopsy cases were that Gleason scores 2–4 decreased from 2.7 to 0% (p < 0.001), Gleason score 5 decreased from 12.2 to 0.3% (p = 0.18) and Gleason score 6 decreased from 48.4 to 22.0%, whereas Gleason score 7 increased from 25.5 to 67.9% (p < 0.001; Tables 1 and 2). Similarly, after regrading 368 radical prostatectomy specimens, Gleason scores 2–5 decreased from 6.3 to 0% (p = 0.008), Gleason score 6 decreased from 32.3 to 6.3% (p < 0.001) and Gleason score 7 increased from 35.9 to 82.9% (p < 0.001; Tables 1 and 2). The overall exact agreement between Gleason scores of core needle biopsies and radical prostatectomy specimens was 58% with conventional and 72% with modified Gleason grading (p < 0.001; Tables 1 and 2). With modified Gleason grading, the agreement was 28.4% for a biopsy Gleason score 6, 88.5% for Gleason score 3 + 4 = 7a, 68.0% for Gleason score 4 + 3 = 7b and 73.3–100% for Gleason scores 8–10.
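To make the notion of exact agreement concrete, the following sketch derives it from paired scores, together with the shares of biopsies that were undergraded or overgraded relative to the prostatectomy specimen. The pairs are hypothetical toy data, not values from Tables 1 and 2, and the function name `agreement_summary` is an assumption. For simplicity the sketch compares total scores only and ignores the 7a/7b and 9a/9b subdivision.

```python
from collections import Counter


def agreement_summary(pairs):
    """Summarize (biopsy_score, prostatectomy_score) pairs as fractions of all cases."""
    if not pairs:
        raise ValueError("need at least one biopsy/prostatectomy pair")
    counts = Counter(
        "exact" if nb == rp else "undergraded" if nb < rp else "overgraded"
        for nb, rp in pairs
    )
    n = len(pairs)
    return {key: counts[key] / n for key in ("exact", "undergraded", "overgraded")}


# Hypothetical toy data for illustration only.
toy_pairs = [(6, 7), (6, 6), (7, 7), (7, 7), (8, 8), (7, 6), (6, 7), (7, 7)]
print(agreement_summary(toy_pairs))
```

Exact agreement corresponds to the diagonal of a biopsy-by-prostatectomy cross-tabulation divided by the total number of cases.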

Table 1 Correlation of Gleason scores of prostatic carcinomas after conventional Gleason grading in needle biopsies (NB) and radical prostatectomy specimens (RP)
Table 2 Correlation of modified Gleason scores of prostatic carcinomas in needle biopsies (NBX) and radical prostatectomy (RP) specimens

Discussion

In recent years, there has been a discrepancy between the original version of the Gleason grading and how it is applied in practice [8]. The ISUP consensus meeting in 2005 was organized to standardize both the perception of histological patterns and how the grade information is compiled and reported [14].

In contrast to incidental carcinoma in transurethral resection material from the transition zone of the prostate, carcinomas with Gleason patterns 1 to 2 or Gleason scores 2 to 4 are generally not seen in the peripheral zone. Therefore, these Gleason scores are very unlikely to be diagnosed in core needle biopsies, which mainly sample the peripheral zone [8, 12–14, 17]. A recommendation was issued by the ISUP that with extremely rare exceptions, a Gleason score of 1 + 1 = 2 should not be diagnosed in any type of specimen, and that Gleason score 4 should rarely, if ever, be diagnosed on needle biopsies. Furthermore, the consensus was that cribriform patterns are not allowed within Gleason pattern 2 in contrast to Gleason’s original diagram. When regrading needle biopsies in the present study, no specimens were assigned a Gleason score of 2 to 4, which is in line with these recommendations.

Similar to the results of our own consultation service for prostatic carcinomas, reference centers for prostatic carcinomas in the USA and Scandinavia have reported frequent undergrading of prostatic carcinomas in biopsy specimens [8, 14, 24]. We have observed that Gleason score 3 + 3 = 6 carcinoma is often undergraded on needle biopsies by external pathologists. Very often, small nests of Gleason 3 pattern (score 6) are misclassified as pattern 1 or 2 (scores 2, 3 or 4) carcinomas. The results of this study support the notion that Gleason patterns 1 and 2 should not be used in diagnostic biopsies. Such patterns may be diagnosed in incidental adenocarcinomas (T1a–b) detected on transurethral resection specimens from the transition zone of the prostate [12, 14, 17].

However, the most important modifications of pattern interpretation proposed by the ISUP meeting pertained to patterns 3 and 4. In the original description by Gleason, cribriform patterns were assigned either a Gleason pattern 3 or 4 depending on the shape of the cribriform glands. There is now an increased understanding that invasive cribriform carcinoma is a relatively aggressive disease. Therefore, it was recommended that most cribriform carcinomas should be assigned a Gleason pattern 4 rather than 3, and that cribriform glands must be small and round to qualify for pattern 3. The ISUP consensus meeting emphasized that Gleason pattern 4 should also include ill-defined glands with poorly formed glandular lumina, a relatively common pattern that was not incorporated in the original definition of pattern 4 (Fig. 2). It is unclear how this pattern used to be graded, but the incomplete structure of some of the glands may have been overlooked, so that these glands were included in Gleason pattern 3. With the ISUP modifications, some cases that used to be interpreted as Gleason score 6 are now considered Gleason score 7.

Even before the ISUP revision of the Gleason grading, the most frequent Gleason scores in needle biopsies and radical prostatectomy specimens were Gleason scores 6 and 7, respectively [37]. Therefore, the modification of the Gleason grading system may change both the distribution of Gleason scores and the agreement of Gleason scores between specimens. The most common grade in needle biopsies with conventional Gleason grading in this study was Gleason score 6, while with modified grading, Gleason score 7 was the most frequent. In the literature, Gleason score 6 is often reported to be the most common Gleason score of needle biopsy specimens. This may be explained by pattern interpretation (cribriform and some poorly formed glands have been interpreted as pattern 3), but could also be attributed to the current stage shift with more and more cancers detected at an early stage.

In the current study, the overall exact agreement between Gleason score of needle biopsies and radical prostatectomy specimens improved from 58 to 72% with modified Gleason grading. Compared to previous reports [2–6, 16, 27, 28, 31, 34–38], this is an excellent concordance. Preoperative identification of a high-grade tumour component is clinically important. Gleason pattern 4 may be either undergraded in needle biopsies or not sampled. Therefore, a significant disagreement exists between biopsies and radical prostatectomy specimens for Gleason scores 6, 7a and 7b [7, 21, 26]. Notably, the agreement of modified Gleason score 3 + 4 = 7a between needle biopsies and radical prostatectomies at 88.5% was very high compared to Gleason score 3 + 3 = 6 (28.4%) and 4 + 3 = 7b (68.0%). In comparison to this high level of agreement, the concordance of conventional Gleason grading was only moderate, corresponding to the upper end of the range (28–68%) reported in the literature [2–6, 9, 10, 16, 18, 27, 28, 31, 34–38].

Hence, the modified Gleason grading reduces undergrading of prostatic carcinomas in biopsies and improves the agreement between biopsies and radical prostatectomy specimens. By providing more accurate prognostic information before treatment, the modified Gleason grading contributes to a safer treatment decision.