Introduction

Prostate cancer (PCa) is the most common non-cutaneous malignancy in men [1]. Radical prostatectomy (RP) is a standard first-line treatment for eligible patients with localized disease, and the majority of patients will harbor pT2 PCa in the final histopathological specimen [2–4]. While most of these patients have excellent biochemical recurrence-free survival (BCRFS) rates, ranging from 73 to 99 % over a 10-year period [2, 4–6], some will eventually recur. Previous studies have identified variables such as prostate-specific antigen (PSA), pathological substage, Gleason score (GS) and/or surgical margin (SM) status as predictors of biochemical recurrence (BCR) [2–7]. Additionally, several studies have reported the predictive value of pathologic tumor volume (TV) and/or percentage of high-grade tumor volume (%HGTV) for BCR [7–12]. TV and %HGTV were shown to be independent predictors of BCR, and their inclusion in a multivariable model increased predictive accuracy (PA) [12]. However, other studies have questioned these findings [13, 14].

Clinically, the management of patients with adverse pathologic features and localized disease remains challenging. While adjuvant therapies might be considered in these patients, the risk of overtreatment is inherent. On the basis of these considerations, we assessed long-term BCRFS rates after RP in patients with pT2 PCa and aimed to discern the clinical value of TV and %HGTV in this setting. We specifically sought to provide a clinical decision tool to identify patients who would benefit from adjuvant measures.

Patients and methods

Study population

Institutional data of 2095 men who underwent RP between 1992 and 2005 for pT2 PCa were collected. For the current study, we only considered 1181 men with available detailed computer planimetric-assisted quantification of TV and %HGTV. To ensure homogeneity of the cohort, 278 patients were excluded from analyses due to neo-adjuvant or adjuvant androgen-deprivation therapy, lymph node involvement, or missing information on clinical/pathological variables and/or follow-up. This resulted in 903 men eligible for statistical analysis.

Variables

Pathological stage was defined according to the 2002 TNM staging system [15]. Until March 2001, the decision for a nerve-sparing (NS) technique was based on the surgeon's expertise; thereafter, it was based on a tree regression model [16]. RP specimens were processed according to the 3-mm whole-mount Stanford protocol [17]. Contact of tumor cells with the inked margin of the prostate was considered a positive SM. In all prostate specimens, detailed computer planimetric-assisted quantification of TV and %HGTV was performed, as previously reported [18]. For further analysis, TV was dichotomized as ≤2 versus >2 ml based on the overall median value. HGTV was defined as the tumor volume in the specimen consisting of Gleason pattern ≥4. During follow-up, PSA was measured quarterly in the first year, twice a year in the second year and once a year from the third year after RP onward. BCR was defined as a PSA level of ≥0.2 ng/ml and rising after an initially undetectable PSA level after RP [7]. Men without BCR or who died from causes other than prostate cancer were censored at the date of their last undetectable serum PSA level. The primary outcome of interest was overall BCR. Occurrence of metastases and cancer-specific survival were reported as secondary outcomes.

Statistical analysis

Descriptive statistics focused on frequencies and proportions for categorical variables. Means, medians and interquartile ranges (IQRs) were reported for continuously coded variables. Chi-square and independent-sample t tests were used to test differences in proportions and means, respectively. Kaplan–Meier curves were used to estimate actuarial BCRFS probabilities at 5, 10 and 15 years. Uni- and multivariable Cox logistic regression models (Cox LRMs) were used to test several clinical and pathological parameters for their ability to predict BCR at 5, 10 and 15 years after RP. Specifically, the base models included PSA, Gleason score and SM. The extended models additionally included TV and %HGTV. Owing to the lack of standardized %HGTV thresholds, we tested several cutoff points for predicting BCR using the minimal p value approach, as previously described [19]. This approach assesses whether the predictive ability of a continuously coded variable may be improved by dichotomization; the resulting optimal cutoff was a %HGTV of 20 %.
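
As an illustration of how actuarial BCRFS probabilities of this kind can be estimated, the following minimal Python sketch implements the Kaplan–Meier product-limit estimator. It is not the code used in this study (the analyses were run in R); the function names `kaplan_meier` and `survival_at` and the toy data are hypothetical and serve only to show the mechanics.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time (product-limit estimator).

    times  -- follow-up time per patient (e.g. months since RP)
    events -- 1 if BCR was observed at that time, 0 if the patient was censored
    """
    event_counts = Counter(t for t, e in zip(times, events) if e)
    exit_counts = Counter(times)  # everyone leaving the risk set at time t
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exit_counts):
        d = event_counts.get(t, 0)
        if d:
            # Multiply by the conditional probability of surviving time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Events and censored patients both leave the risk set after t.
        at_risk -= exit_counts[t]
    return curve


def survival_at(curve, horizon):
    """Step-function lookup: estimated event-free probability at `horizon`."""
    s = 1.0
    for t, value in curve:
        if t > horizon:
            break
        s = value
    return s


# Hypothetical toy data (months, event flag); not taken from the study cohort.
toy_times = [12, 30, 30, 48, 60, 75, 90, 120]
toy_events = [1, 0, 1, 0, 1, 0, 0, 0]
toy_curve = kaplan_meier(toy_times, toy_events)
print(survival_at(toy_curve, 60))
```

Censored patients reduce the number at risk without lowering the curve, which is why the estimate differs from the crude proportion of patients without an event.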

The impact of each predictor variable on BCRFS rates at 5, 10 and 15 years was graphically depicted by a nomogram. To test the discrimination ability of the base model versus the extended model, we assessed Harrell’s c index [20]. The c index ranges from 0.5 (chance) to 1.0 (perfect discrimination) and describes the probability in a randomly chosen pair of patients that the one with the event (e.g., BCR) has a higher predicted probability of experiencing the event.
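
The pairwise definition of the c index can be made concrete with a short Python sketch. This is a simplified illustration under stated assumptions, not the implementation used in this study: the function name `harrell_c_index` is hypothetical, a pair is treated as usable only if the patient with the shorter follow-up experienced the event, tied risk scores count as half, and pairs with tied follow-up times are ignored.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored data (simplified).

    times       -- follow-up time per patient
    events      -- 1 if the event (e.g. BCR) was observed, 0 if censored
    risk_scores -- model output; higher means higher predicted risk
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a usable pair
        for j in range(n):
            if times[i] < times[j]:
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable


# Hypothetical toy data; not taken from the study cohort.
c = harrell_c_index(
    times=[10, 20, 30, 40],
    events=[1, 1, 0, 0],
    risk_scores=[0.9, 0.4, 0.5, 0.1],
)
print(c)
```

In this toy example, four of the five usable pairs are ordered correctly by the risk score.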

Two hundred bootstrap resamples were used for internal validation [21]. Discriminative ability estimates were compared using the Mantel–Haenszel test. Finally, to identify the net increase in the proportion of cases identified by the full model, we performed decision curve analyses at 5, 10 and 15 years, as introduced by Vickers and Elkin [22]. All tests were two-sided, with statistical significance set at p < 0.05. Analyses were conducted using R, version 3.1.2 (R Foundation for Statistical Computing, Vienna, Austria).
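
In decision curve analysis, the net benefit at a threshold probability p_t is the proportion of true positives minus the proportion of false positives weighted by the odds p_t / (1 − p_t). The Python sketch below illustrates this calculation under a simplifying assumption that differs from the present study: the outcome is treated as a known binary status at the time horizon, whereas censored follow-up data require Kaplan–Meier-based estimates of the true- and false-positive proportions. The function names and the toy data are hypothetical.

```python
def net_benefit(predicted_risks, outcomes, threshold):
    """Net benefit of acting on patients whose predicted risk >= threshold.

    predicted_risks -- model-predicted event probabilities (0..1)
    outcomes        -- 1 if the event occurred by the horizon, else 0
    threshold       -- threshold probability p_t, strictly between 0 and 1
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(outcomes)
    flagged = [(p >= threshold, y) for p, y in zip(predicted_risks, outcomes)]
    true_pos = sum(1 for hit, y in flagged if hit and y)
    false_pos = sum(1 for hit, y in flagged if hit and not y)
    # False positives are weighted by the odds at the threshold probability.
    return true_pos / n - false_pos / n * threshold / (1.0 - threshold)


def net_benefit_treat_all(outcomes, threshold):
    """Reference strategy: every patient is treated as high risk."""
    prevalence = sum(outcomes) / len(outcomes)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical toy data; not taken from the study cohort.
toy_risks = [0.05, 0.10, 0.20, 0.30, 0.60]
toy_outcomes = [0, 0, 0, 1, 1]
nb_model = net_benefit(toy_risks, toy_outcomes, threshold=0.15)
nb_all = net_benefit_treat_all(toy_outcomes, threshold=0.15)
print(nb_model, nb_all)
```

Comparing two models then amounts to evaluating `net_benefit` for each model's predictions over a range of thresholds and plotting the resulting curves against the treat-all and treat-none (net benefit 0) strategies.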

Results

Descriptive characteristics of patients with pT2 PCa are shown in Table 1. Median patient age was 63 years (IQR 59–66 years), and median pre-treatment PSA level was 6.4 ng/ml (IQR 4.6–9.0 ng/ml). Median follow-up time was 133 months (11.1 years; IQR 97–157 months [8.1–13.1 years]). Among cancer survivors alone, the median follow-up time was 11.1 years (IQR 9–13.1). The majority of patients had an unremarkable digital rectal examination (n = 644, 71 %) and a Gleason score of ≤3 + 3 at biopsy (79 %). Nerve sparing was performed during RP in 731 men (81 %). The majority of patients had a pathological Gleason score of 3 + 3 (n = 578, 64 %), whereas 325 men (36 %) harbored less favorable Gleason patterns. Positive surgical margins were observed in 13.1 % of patients (n = 118).

Table 1 Descriptive perioperative characteristics of 903 men with pathologically organ confined prostate cancer treated with radical prostatectomy between 1992 and 2005

Overall, 137 men (15.2 %) developed BCR, and the proportion was higher in patients with positive SM than in those with negative SM (27.1 vs. 13.4 %). BCRFS rates at 5, 10 and 15 years were 91.7, 87.0 and 85.9 %, respectively. Metastatic disease occurred in 1.7 % and cancer-specific/overall mortality in 1.4/12.4 %, respectively. Median survival time was 20.0 years for men without BCR, 15.0 years for men with BCR and 18.3 years overall. Stratified Kaplan–Meier estimators are depicted in Fig. 1 and show significantly different BCRFS rates for (b) tumor volume (TV ≤2 vs. >2 ml), (c) percentage of high-grade tumor volume (%HGTV; ≤20 vs. >20 %), and (d) combination of surgical margin status (SM) and TV (SM− & TV ≤ 2 ml vs. SM− & TV > 2 ml vs. SM+ & TV ≤ 2 ml vs. SM+ & TV > 2 ml), respectively (all log-rank p < 0.001).

Fig. 1 a Kaplan–Meier analyses depicting BCRFS after radical prostatectomy for the entire patient population (n = 903). Additional Kaplan–Meier analyses have been performed b after stratification according to tumor volume (TV; ≤2 vs. >2 ml), c percentage of high-grade tumor volume (%HGTV; ≤20 vs. >20 %), and d combination of surgical margin status (SM) and TV (SM− and TV ≤2 ml vs. SM− and TV >2 ml vs. SM+ and TV ≤2 ml vs. SM+ and TV >2 ml), respectively (all log-rank p < 0.001)

In multivariable Cox LRMs, patients with a Gleason score ≥3 + 4 were more likely to develop BCR relative to patients with a Gleason score ≤3 + 3 (HR 1.8, p = 0.004). Additionally, SM (HR 2.5; p < 0.001), TV > 2 ml (HR 2.0; p < 0.001) and %HGTV > 20 % (HR 1.74, p = 0.02) were independent predictors of BCR after RP (Table 2).

Table 2 Results of uni- and multivariable Cox regression analyses predicting biochemical recurrence in patients with pathologically organ confined, non-metastatic prostate cancer

The regression coefficient-based nomogram (Fig. 2a) to predict BCRFS consisting of PSA, Gleason score and surgical margin showed c-indices of 76.8 % (95 % CI 67.9–78.2 %), 70.5 % (95 % CI 64.9–75.0 %) and 68.1 % (95 % CI 60.6–73.5 %) at 5, 10 and 15 years after RP, respectively. Inclusion of TV and %HGTV increased the c-indices to 76.9 % (95 % CI 71.2–81.1 %), 72.4 % (95 % CI 68.5–78.3 %) and 70.7 % (95 % CI 63.2–75.8 %) at 5, 10 and 15 years, respectively. These differences in c-indices were statistically significant at 10 and 15 years (both p < 0.001). Decision curve analysis demonstrated a net clinical benefit of the full model across a range of clinically relevant threshold probabilities for BCR at all determined time points (Fig. 2b).

Fig. 2 a Nomogram predicting biochemical recurrence-free survival (BCRFS) at 5, 10 and 15 years after radical prostatectomy in patients with pathologically organ confined prostate cancer. To obtain the nomogram-predicted probability of BCRFS, locate the patient's values on each axis. Draw a vertical line to the “Point” axis to determine how many points are attributed to each variable value and sum the points for all variables. Locate the sum on the “Total Points” line to assess the individual probability of BCRFS on the “Predicted value” line for each time point (5 and 10 years, respectively). b In decision curve analysis (DCA), the developed nomogram appears to confer an advantage in identifying patients with BCRFS. For example, applying a nomogram-derived probability threshold of 15 % at 10 years after RP, use of the nomogram including TV and %HGTV (Model 1) would incur a net benefit of 20 over the base model (Model 2). The net benefit at a threshold probability of 15 % can be interpreted as follows: using the full model leads to the equivalent of a net 20 true-positive results per 100 patients without an increase in the number of false-positive results. PSA, prostate specific antigen; SM−, negative surgical margin; SM+, positive surgical margin; c index, predictive accuracy

Discussion

Men with pT2 PCa after RP generally present excellent BCRFS. However, some patients fail to remain BCR-free [2, 4, 5]. Disease characteristics, such as PSA, GS, pathological stage or SM, have been described as predictive factors of BCR after RP [2–7, 23]. Additionally, some authors have investigated the predictive role of TV and %HGTV for BCR [7–11]. Pathologists do not regularly provide TV assessments because of the higher costs and the complexity of the protocol. However, these tumor characteristics are promising markers to classify patients at risk of BCR. To clarify the cost–benefit ratio of these tumor characteristics with regard to long-term BCRFS rates in men with pT2 PCa, we developed a novel predictive model based on TV and %HGTV and evaluated its discriminative accuracy.

Several of our findings are noteworthy. First, we report excellent long-term outcomes of patients with pT2 disease. The majority (85.9 %) of our patients were disease-free after 15 years. While BCR is rare in pT2 disease, it is significantly more frequent in patients with positive SM. In our current cohort, 27 % of men with positive SM developed BCR, which was significantly higher than in patients with negative SM. Consistent with this, Pettenati et al. [24] found a 33 % BCR rate after positive SM in an institutional cohort of 630 patients. Although a positive SM constitutes an adverse pathologic feature, there is an ongoing debate about the need for adjuvant therapies in patients with positive SM in pT2 disease. In light of evidence for favorable outcomes in locally advanced (≥pT3) tumors [25, 26], adjuvant radiotherapy might be considered, but has to be weighed against worse toxicity profiles. Thus, current guidelines advocate either adjuvant radiotherapy or a PSA-based follow-up with concurrent salvage radiotherapy as viable treatment options [27]. As SM status falls short as a lone determinant of adjuvant therapy, it might rather be considered a stratification tool for the definition of high-risk disease in conjunction with TV and %HGTV as central determinants of BCR [11, 12, 28, 29]. Similar to Rampersaud et al., we used different categories for TV and %HGTV to distinguish between low and high risk of BCR. As expected, patients with higher absolute TV showed the highest risk of BCR. Likewise, %HGTV was strongly associated with BCR, which is in accordance with previous reports [9, 30]. The inclusion of TV and %HGTV in our nomogram resulted in significant improvements of PA at 10 and 15 years (p < 0.001). Contrary to previous reports on the weak discriminative accuracy of TV and %HGTV [9], our study provides strong evidence for the inclusion of these variables, drawing on a larger cohort and a substantially longer follow-up time.

It might further be speculated that a more tailored patient selection, newer grading methods and improved statistical analyses have contributed to the differences between earlier studies and ours. In line with our results, adding TV and %HGTV to the reported pathological results in patients with positive surgical margins holds potential for optimizing (long-term) risk prediction of BCR. Therefore, our newly developed nomogram is a promising tool to predict BCRFS in these patients and guide adjuvant treatment decisions. In addition, the predictions from our nomogram can enable clinicians to tailor a more stringent, PSA-based follow-up to patients with high BCR risk; conversely, patients with pathologically insignificant PCa could undergo a less rigorous follow-up, e.g., a PSA test every second year or less often after 10 years of inconspicuous follow-up. The initially higher costs of obtaining TV and %HGTV could be balanced by forgoing unnecessary and costly adjuvant therapies as well as unnecessary follow-up PSA measurements.

Despite its strengths, our study has some limitations. First, all patients were operated on at a single European tertiary referral center; therefore, rates of pathologically organ-confined PCa may be affected by selection bias and differ from other geographical regions and institutions. Specifically, it has to be acknowledged that 278 men were excluded because of unavailable data or additional therapy regimens. Although this ensured homogeneity of the study cohort, it might have increased sampling bias. Second, the majority of patients in our study had a Gleason score of 3 + 3 at final pathology and initial biopsy. According to the latest guidelines, these patients would currently be candidates for active surveillance. However, our study comprises a historical cohort from a time when such recommendations were not readily available. The influence of tumor volume on BCR in current cohorts, with higher proportions of intermediate- and high-risk patients, needs to be further evaluated. Furthermore, computer planimetric evaluation of tumor volume, as performed on the specimens in our study, is itself prone to bias. For one, the method was first described over 20 years ago and has a potential weakness in assessing tumor volume in cases of unusual tumor shapes. In addition, in patients with positive surgical margin status, it might have underestimated the true tumor volume. Nevertheless, our cohort provides valuable information in conjunction with an extended follow-up. Finally, it has to be acknowledged that statistical models such as nomograms depend on their development cohort. Our nomogram should be validated in an external patient cohort to truly assess accuracy and calibration of the model. While the active use of nomograms has been questioned, advances in the routine applicability of these predictive tools have been made, and their use should be further encouraged [31, 32].

Taken together, our findings show good BCR-free survival rates after 5, 10 and 15 years in pT2 PCa patients after RP. The developed nomogram for prediction of BCRFS showed a high predictive accuracy after internal validation. TV and %HGTV could potentially serve as valuable measures to stratify patients at high risk of BCR. The use of our nomogram should be considered to counsel patients with pT2 disease and SM and to design appropriate follow-up or treatment regimens.