Introduction

Prostate cancer (PCa) is the most common non-skin malignancy among US men [1]. The incidence and outcomes of PCa vary by race/ethnicity: compared to non-Hispanic White (NHW) men, non-Hispanic Black (NHB) men and Hispanic men have both a higher risk of developing PCa and lower 5-year survival rates [2]. Additionally, NHB men are at higher risk of biochemical recurrence (BCR) following radical prostatectomy (RP) compared to NHW men [3]. Efforts are needed to better characterize the prognosis of PCa within different racial and ethnic groups.

While surgical upgrading from prostate biopsy (PBx) to RP surgical pathology (SPx) has been well studied, to our knowledge, data on the outcomes of patients who are downgraded from PBx to RP are limited, particularly among high-risk cohorts [4]. Furthermore, surgical downgrading may reflect suboptimal care, as some of these patients may have been reasonable candidates for less invasive management, such as active surveillance.

The goal of this study was to perform a systematic review of peer-reviewed published papers describing surgical downgrading of PCa at the time of RP and its impact on biochemical recurrence (BCR), with particular attention to the representation of racial/ethnic minority groups. A secondary aim of this study was to examine the impact of surgical downgrading on BCR within our own diverse, multiethnic patient population and compare our findings to those of the systematic review.

Materials and methods

Systematic review

Search strategy

The PubMed, EMBASE, MEDLINE, and Cochrane Library databases were searched to identify full-text studies published in English that compared patients who underwent surgical downgrading at the time of RP to their non-downgraded peers. The following search terms were used: (“prostate cancer”) AND (“downgrade”) AND (“surgical”). These terms were submitted in an identical manner to each database. Studies published from January 1984 through November 2020 were included.

Inclusion and exclusion criteria

All studies in which men older than 18 years underwent a standard or fusion prostate biopsy (PBx) with subsequent RP for PCa were reviewed. We included all studies that stratified surgical downgrading, defined as moving from a higher Gleason score (GL) on PBx to a lower GL on RP, and reported post-operative outcomes, such as recurrence-free survival or BCR (Supplementary Figure S1). We excluded studies that did not report on surgical downgrading by distinct GL groups, studies that pooled multiple GL groups into one combined group, studies that did not report post-operative oncologic outcomes, and those with fewer than 20 subjects in total. Articles published in a language other than English were also excluded. If it was unclear from the abstract whether an article met the inclusion criteria, the full text was reviewed. The literature search was completed by one investigator (DZ).

For the included studies, data were extracted by one investigator (DZ), including patient demographics, downgrading definition, clinical variables, and clinical outcome rates (e.g., BCR) for the downgraded group vs the non-downgraded group.

Retrospective cohort study

Patient population

Following institutional review board approval, we identified all patients who underwent RP for biopsy-proven PCa at our institution from 2005 to 2020. Biochemical recurrence (BCR) was defined as the first post-operative PSA ≥ 0.2 ng/mL. Follow-up time was defined as the time from radical prostatectomy to the last encounter with our health system. Demographic and clinical variables, such as age at surgery, race/ethnicity, pre-operative body mass index (BMI), and pre-operative PSA, were extracted by manual review of electronic medical records (EMR), queries of the EMR using Clinical Looking Glass (Streamline Health, Atlanta, GA) [5], and our institutional Cancer Registry [6]. Adverse histologic features, tumor grade, and pathologic tumor stage were extracted by manual review of RP pathology reports.

We examined three downgrading types: GL7 to GL6, GL8-10 [7] to GL7, and GL7(4 + 3) to GL7(3 + 4). We compared patients who were downgraded against patients with concordant GL (i.e., the same GL on PBx and RP) at both the original and the lower GL (e.g., for GL7 to GL6 downgrading, we compared downgraded patients against patients with GL7 concordance and patients with GL6 concordance). Note that patients with GL7(4 + 3) to GL7(3 + 4) downgrading and GL7(3 + 4) to GL7(4 + 3) upgrading were included within the GL7 concordance group.
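As an illustration only, the grouping described above can be sketched in Python. The function names (`gleason_stratum`, `classify_pair`) and the representation of each specimen as a (primary, secondary) pattern tuple are hypothetical and were not part of the study workflow. The sketch assumes that strata are compared as GL6, GL7, and GL8-10, so that a change between 4 + 3 and 3 + 4 stays within the GL7 concordance group, as noted above.

```python
# Hypothetical sketch: assign a PBx/RP pair to a downgrading group.
# Each specimen is given as (primary pattern, secondary pattern).

STRATUM_ORDER = {"GL6": 0, "GL7": 1, "GL8-10": 2}


def gleason_stratum(primary: int, secondary: int) -> str:
    """Map Gleason patterns to the strata used in the text."""
    total = primary + secondary
    if total <= 6:
        # Assumption: any score of 6 or below is treated as GL6.
        return "GL6"
    if total == 7:
        return "GL7"
    return "GL8-10"


def classify_pair(biopsy: tuple, rp: tuple) -> str:
    """Label a patient as downgraded, concordant, or upgraded."""
    pbx = gleason_stratum(*biopsy)
    spx = gleason_stratum(*rp)
    if STRATUM_ORDER[spx] < STRATUM_ORDER[pbx]:
        return f"{pbx} to {spx} downgraded"
    if STRATUM_ORDER[spx] > STRATUM_ORDER[pbx]:
        return "upgraded"
    return f"{pbx} concordant"


if __name__ == "__main__":
    print(classify_pair((3, 4), (3, 3)))
    print(classify_pair((4, 3), (3, 4)))
```

The separate GL7(4 + 3) to GL7(3 + 4) comparison would need the primary and secondary patterns to be compared directly rather than their sum.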

Statistical analysis

Baseline demographics were compared between patients who had surgical downgrading vs those without downgrading. Patients who experienced upgrading were excluded from this analysis. Continuous, normally distributed variables were compared using one-way analysis of variance (ANOVA). Continuous, non-normally distributed variables were compared using the Kruskal–Wallis H test, and categorical variables were compared using the χ² test. Rates of BCR were compared using the log-rank test. All statistical tests were two-sided, with significance set at p ≤ 0.05. We followed up patients from date of PBx through date of BCR, death (if the patient died), or end of follow-up (last patient contact with our institution). Kaplan–Meier survival analysis was used to compare BCR-free survival between different downgrading strata [8, 9]. A Cox proportional hazards (PH) model was used to calculate and compare the cumulative BCR-free survival among downgraded and non-downgraded patients, and to estimate the hazard ratio (HR), after adjusting for age at biopsy (years), race/ethnicity, pre-operative PSA, and the presence of obesity (defined as BMI ≥ 30 kg/m²) [9]. Lastly, we performed a sensitivity analysis in which we replicated the analysis using Gleason Group (GG) definitions (GG4/5 to GG3, GG4/5 to GG2, GG3 to GG2, GG3 to GG1, and GG2 to GG1 downgrading). Kaplan–Meier survival curves were plotted using Stata v15.1 (StataCorp, College Station, TX); all other analyses were done in SPSS v27 (IBM, Armonk, NY).
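The analyses reported here were run in Stata and SPSS. Purely for illustration, the Kaplan–Meier product-limit estimate that underlies BCR-free survival curves can be written in a few lines of Python. The function name `kaplan_meier` and the toy follow-up data below are hypothetical and are not study data.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Product-limit estimate of BCR-free survival.

    times  : follow-up time for each patient (e.g., months)
    events : 1 if BCR was observed at that time, 0 if censored
    Returns a list of (time, survival) pairs at each distinct event time.
    """
    at_risk = len(times)
    # Number of BCR events and number of patients leaving the risk set
    # (event or censoring) at each distinct time.
    n_events = Counter(t for t, e in zip(times, events) if e)
    n_leaving = Counter(times)

    survival = 1.0
    curve = []
    for t in sorted(n_leaving):
        d = n_events.get(t, 0)
        if d:
            # Multiply by the conditional probability of remaining BCR-free.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving[t]
    return curve


if __name__ == "__main__":
    # Toy data (hypothetical): five patients, three with BCR.
    toy_times = [6, 12, 12, 24, 36]
    toy_events = [1, 0, 1, 0, 1]
    for t, s in kaplan_meier(toy_times, toy_events):
        print(f"t = {t:>2} months, BCR-free survival = {s:.2f}")
```

Censored patients contribute to the risk set until their last contact and then drop out without changing the survival estimate, which is why the curve steps down only at observed BCR times.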

Results

Systematic review

A total of 137 unique manuscripts and abstracts were identified using our search criteria (Supplementary Fig. S1). Of these, 101 were excluded because their titles or abstracts fell outside the scope of our analysis. Of the remaining 36 studies, 28 were excluded based on the criteria listed in Supplementary Fig. S1. The remaining eight retrospective studies, which examined BCR among men with surgically downgraded PCa, were analyzed for this review (Table 1). The combined study population comprised 9,059 men, of whom 1,608 (17.8%) experienced surgical downgrading. All studies found that downgraded patients had lower rates of BCR compared to patients with concordant pathology. However, only 3 of the 8 included studies provided data on race and ethnicity, and in those studies, the majority of patients were non-Hispanic White.
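The study-selection counts and the pooled downgrading proportion can be checked with simple arithmetic. The short Python sketch below uses only the numbers reported in this paragraph; the variable names are illustrative.

```python
# Counts taken from the paragraph above.
identified = 137               # unique manuscripts and abstracts
excluded_title_abstract = 101  # excluded on title or abstract
excluded_by_criteria = 28      # excluded on the listed criteria

included = identified - excluded_title_abstract - excluded_by_criteria

total_men = 9059
downgraded_men = 1608
downgraded_pct = round(100 * downgraded_men / total_men, 1)

if __name__ == "__main__":
    print(f"Studies included: {included}")
    print(f"Downgraded: {downgraded_pct}% of {total_men} men")
```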

Table 1 Summary data for studies included in review

Retrospective cohort study

Of the 1,621 patients who underwent RP at our institution, 137 were excluded from the analysis because pre-operative prostate biopsy data were lacking. Among the 1,484 eligible patients, 37.6% were NHB and 28.8% were Hispanic (Supplementary Table S1). Patients downgraded from GL7 to GL6 were compared to GL6 concordant and GL7 concordant patients (Table 2A). Only 2.6% (n = 3) of downgraded patients had a 6-month delay between PBx and RP. Patients downgraded from GL7 to GL6 had similar actuarial rates of BCR (15.5%) compared to those with GL6 concordance (13.0%), both of which were lower than the rate among men with GL7 concordance (22.6%; p < 0.001, Fig. 1A).

Table 2 Association of demographic, pathologic, and clinical variables with (A) Gleason score (GL) 7 to GL 6 downgrading, (B) GL 8–10 to GL 7 downgrading, and (C) GL 7(4 + 3) to GL 7(3 + 4) downgrading
Fig. 1 Kaplan–Meier curves for biochemical recurrence-free survival following radical prostatectomy for prostate cancer: (A) GL7 → GL6 downgraded patients, (B) GL8-10 → GL7 downgraded patients, and (C) GL7(4 + 3) → GL7(3 + 4) downgraded patients

Patients who were downgraded from high-risk PCa (GL8-10) to intermediate-risk disease (GL7) were compared to patients whose RP pathology remained high risk (GL8-10 concordance) and to GL7 concordant patients (Table 2B). Only 3.4% (n = 3) of downgraded patients had a 6-month delay between PBx and RP. Patients downgraded from high-risk PCa had lower actuarial rates of BCR (n = 25, 28.7%) compared to GL8-10 concordant patients (n = 19, 44.2%) and similar rates of BCR compared to GL7 concordant patients (n = 102, 22.6%, Fig. 1B). We also reviewed our data for downgrading from GL8-10 to GL6 disease and found 21 cases (Supplementary Figure S2).

Patients who were downgraded from GL4 + 3 to GL3 + 4 PCa were compared against GL4 + 3 concordant and GL3 + 4 concordant patients (Table 2C). A 6-month delay between PBx and RP occurred in only 4.3% (n = 4) of downgraded patients. Patients who were downgraded from GL4 + 3 to GL3 + 4 had similar actuarial rates of BCR (n = 17, 18.5%) compared to GL3 + 4 concordant patients (n = 41, 17.1%), both of which were lower than the rate among GL4 + 3 concordant patients (n = 22, 36.1%, p = 0.008, Fig. 1C).

The association of surgical downgrading with BCR is reported in Table 3. On univariate analysis, downgrading from GL7 to GL6 was associated with a 51% reduction in the risk of BCR (HR = 0.49, 95% CI 0.29–0.83, p = 0.008), an HR of similar magnitude to that for GL6 concordance (HR = 0.43, 95% CI 0.30–0.62, p < 0.001), with GL7 concordant patients as the reference. Downgrading from high-risk PCa (GL8-10) to GL7 was associated with a significant 52% reduction in the risk of BCR (HR = 0.48, 95% CI 0.26–0.88, p = 0.018). The magnitude of this HR was slightly higher than that for GL7 concordance (HR = 0.37, 95% CI 0.22–0.60, p < 0.001). GL7(4 + 3) downgrading was associated with a 49% reduction in the risk of BCR (HR = 0.51, 95% CI 0.27–0.97, p = 0.039), an HR of similar magnitude to that for GL7(3 + 4) concordance (HR = 0.46, 95% CI 0.27–0.77, p = 0.003). When the models were adjusted for age, race/ethnicity, pre-operative PSA, and the presence of obesity, the results were similar for GL7 to GL6 downgrading (HR = 0.50, 95% CI 0.28–0.90, p = 0.022) and GL8-10 to GL7 downgrading (HR = 0.42, 95% CI 0.22–0.82, p = 0.011). GL7(4 + 3) to GL7(3 + 4) downgrading was no longer statistically significant, although the HR was similar (HR = 0.56, 95% CI 0.27–1.15, p = 0.12). When we examined the association with BCR using the GG downgrading definitions, we failed to detect statistically significant associations between surgical downgrading and BCR (Supplementary Table S2).
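For readers who want to relate the reported HRs to the percentage reductions quoted above, the following Python sketch may help. It assumes a Wald-type 95% CI that is symmetric on the log scale; this is an assumption of the sketch, not something confirmed by the software output, and the function names are hypothetical. Under that assumption, the standard error of log(HR) and an approximate two-sided p value can be recovered from the HR and its CI.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% CI


def percent_reduction(hr: float) -> int:
    """Percentage reduction in hazard implied by an HR below 1."""
    return round((1 - hr) * 100)


def log_hr_se(lower: float, upper: float) -> float:
    """Standard error of log(HR), assuming a Wald-type 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2 * Z_95)


def wald_p_value(hr: float, lower: float, upper: float) -> float:
    """Approximate two-sided p value from an HR and its 95% CI."""
    z = math.log(hr) / log_hr_se(lower, upper)
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for a standard normal.
    return math.erfc(abs(z) / math.sqrt(2))


if __name__ == "__main__":
    # Univariate GL7 to GL6 downgrading, values as reported in the text.
    hr, lo, hi = 0.49, 0.29, 0.83
    print(f"Reduction in risk: {percent_reduction(hr)}%")
    print(f"Approximate p value: {wald_p_value(hr, lo, hi):.3f}")
```

Because the published HRs and CIs are rounded to two decimals, p values recovered this way should be expected to agree with the reported ones only approximately.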

Table 3 Association of surgical downgrading with biochemical recurrence following robotic-assisted radical prostatectomy, among Gleason 7 to 6 downgrading (A), Gleason 8–10 to Gleason 7 downgrading (B), and Gleason 4 + 3 to 3 + 4 downgrading (C)

Discussion

Discrepancies in Gleason score between PBx and RP are well recognized, with many studies examining the prognostic impact of surgical upgrading on BCR [10, 11]. Compared to surgical upgrading, the prognostic impact of surgical downgrading on BCR has been less explored. To our knowledge, no study has attempted to systematically examine the literature on this topic and summarize the effects of our proposed groupings of surgical downgrading (e.g., GL8-10 to GL7, GL7 to GL6, GL7(4 + 3) to GL7(3 + 4)) on clinical outcomes.

When we examined the impact of surgical downgrading on BCR, we found that GL at RP is a stronger determinant of BCR than GL at PBx. In particular, the risk of BCR was reduced by at least 50% for men who were downgraded from GL8-10 to GL7 as well as for those downgraded from GL7 to GL6. In our adjusted models, GL7(4 + 3) to GL7(3 + 4) downgrading had an HR similar to that of GL7(3 + 4) concordance; however, the association did not reach statistical significance. This may be due to the smaller cohort of men with GL7(4 + 3) downgrading (n = 42). Lastly, when we examined our data using the GG downgrading definitions, we failed to detect statistically significant associations between surgical downgrading and BCR, but this is likely due to the small number of men with GG3 and GG2 on either biopsy or RP.

Our data are consistent with prior studies which found that GL at RP more accurately determines the risk of BCR than GL at PBx. In our systematic review, we found two studies that examined downgrading from GL7(4 + 3) to GL7(3 + 4). Jang et al. [12], in a review of 286 patients, of whom 125 were downgraded, found that downgraded patients had lower rates of BCR (34.4% vs 51.6%, p = 0.001). Furthermore, a smaller study of 84 patients (16 downgraded) also found a difference in BCR rates between the two groups, although the authors did not report the number of patients who developed BCR [13]. We similarly detected a difference in the rates of BCR in our cohort (downgraded: 18.5%, concordant: 36.1%, p = 0.008, Table 2C), although the association was not significant in our multivariable models (Table 3).

Three studies examined downgrading from high-risk PCa (GL8-10) to GL7 [14,15,16]. A large study of 860 patients (332 downgraded at RP) found that rates of BCR differed between downgraded and non-downgraded patients (49% vs 76.5%, respectively, p < 0.001) [15]. These rates were similar to those of a medium-sized review of 235 patients (103 downgraded), which reported a BCR rate of 48.5% among downgraded men compared to 61.4% among GL8-10 concordant patients (p = 0.0004). A smaller review of 91 patients (46 downgraded) found significantly lower rates of BCR among downgraded (28.3%) vs non-downgraded (48.9%) patients, although the small sample size may have biased results in this cohort [14]. Our results were more similar to those of the smaller study (downgraded: 28.7% vs concordant: 44.2%, p < 0.001, Table 2B), suggesting that rates of BCR vary substantially between sites.

Downgrading from GL7 to GL6 was evaluated by three studies included in the systematic review [17,18,19]. Large retrospective reviews from Su et al. [17] and Ham et al. [18] found that downgrading was associated with lower rates of BCR (11.7% and 14.8%, respectively) compared to GL7 concordance (29% and 38.1%, respectively). Additionally, Ham et al. estimated HRs for BCR with GL6 concordance as the reference and reported that GL7 downgrading was associated with an HR of 1.87 (95% CI 1.40–2.51, p < 0.0001), while GL7 concordance was associated with an HR of 4.09 (95% CI 3.50–4.78, p < 0.0001). In contrast, our adjusted models found that GL7 downgrading and GL6 concordance carry similar risks of BCR, which represents a difference between our cohorts. Lastly, a smaller study of 1,317 men (115 downgraded) found similar results, but did not report rates of BCR [19].

Etiologically, the discordance between GL on PBx and RP is likely due in part to sampling error during PBx. The small amount of tissue obtained during PBx likely does not reflect the totality of PCa found when the prostate is examined following RP. Discordance between PBx and SPx is compounded by differing reporting guidelines for PBx and SPx specimens: for PBx, pathologists report the most prevalent Gleason pattern and the highest grade pattern, while for SPx specimens, pathologists report the first and second most prevalent Gleason patterns. This is supported by studies which have been able to use percentages of cores positive for PCa to predict surgical downgrading [20,21,22]. Interobserver heterogeneity in grading PBx and surgical pathology has also been suggested as a contributor, although a well-validated study on this subject failed to demonstrate significant heterogeneity between pathologists [23]. Lastly, it now appears that targeted biopsy of MRI lesions may lead to more downgrading than standard PBx, as fusion biopsy has been shown to be associated with downgrading [24]. This may be because cores from prostatic lesions are likely to be higher grade than cores taken systematically, resulting in discordance when the prostate is evaluated following surgery.

Our study has several strengths to consider. Our sample size is relatively large, with longer follow-up times than previously reported studies. In addition, our population is racially/ethnically diverse, contributing to the body of literature in this domain. Of the studies reviewed, only 3 reported race/ethnicity data, with NHW men composing ≥ 73% of the study population (Table 1) [14, 17, 18]. We found this to be a significant limitation during our systematic review of the literature on this topic, given that PCa incidence and mortality vary by race/ethnicity [2] and that NHB men are at higher risk for BCR following RP compared to NHW men [3]. In our study, we could not perform stratified analyses by race/ethnicity to better examine the impact of race/ethnicity on surgical downgrading. We can only conclude that we were able to replicate the findings of other studies in a population composed predominantly of NHB (37.6%) and Hispanic (28.8%) men (Supplementary Table S1). Analysis of larger datasets of men with PCa treated with RP is needed to better understand the relationship between race/ethnicity and surgical downgrading.

Limitations of our study include its retrospective nature and some incomplete biopsy data. We also could not perform a detailed analysis of the number or percentage of cores involved by PCa, which may provide data on the risk of downgrading. Additionally, the definition of the Gleason score changed between 2005 and 2015, with some of the older Gleason pattern 3 morphologies being shifted to Gleason pattern 4. Given that this period falls within the time frame covered by our systematic review, our results should be interpreted with this change in mind. Lastly, we could not replicate our analysis when we used GG definitions for surgical downgrading, due to small sample sizes. Future work is needed to re-assess these findings using the GG definitions for surgical downgrading. Our findings further support the need for more advanced methods of PBx in order to minimize discordance between PBx and surgical pathology, as we have demonstrated that surgical pathology is more likely than PBx results to determine the clinical outcome of patients.

Conclusions

Our systematic review and cohort study support prior findings from other authors that surgical downgrading of prostate cancer is associated with a lower risk of BCR compared to patients whose RP pathology is concordant with the Gleason score on PBx. We have validated these findings in our multiethnic population. Further research is needed in the development of advanced PBx techniques and biomarkers to limit discordance between PBx and RP pathology, and to better optimize care.