The utilization of robotic surgery has increased across many surgical disciplines over the last decade [1]. Perceived advantages of robotic surgery, including ergonomics and the ability to operate in the confined anatomic space of the pelvis, have made proctectomy for rectal cancer a particularly attractive target for robotic surgery. As a result, since the first published reports of robotic colectomies in 2002, a large number of colorectal procedures have been performed with robotic assistance [2, 3].

Despite increased interest in robotic proctectomy (RP), there remains a paucity of comparative evidence supporting the use of robotic surgery over laparoscopic or open approaches. Small observational studies in specialized centers have shown support for RP, with low rates of conversion to open surgery and roughly equivalent oncologic outcomes, but have also raised cost-based concerns for this approach [4]. A large multi-center trial is nevertheless needed to evaluate the short- and long-term outcomes after RP compared with the laparoscopic approach. The second phase of the ROLARR (Robotic vs Laparoscopic Resection for Rectal Cancer) trial is currently underway to address this question in a randomized fashion. First-phase results from this trial demonstrated no difference between arms in conversion to open resection. Importantly, the median number of RP performed by an individual surgeon in the study was 50 (range 30–101), suggesting that the reported outcomes reflect those achieved by surgeons who have significant experience with this technique [5]. In practice, center-level and surgeon-level volumes of RP are highly variable and lower than those of the centers and surgeons included in the ROLARR trial, so this trial likely does not represent outcomes in non-specialized centers. We hypothesized that a center’s robotic volume would be associated with outcomes after RP for rectal cancer. The primary aim of this study was to describe center-level volume for RP over time and determine the relationship between center-level RP volume and short- and long-term outcomes.

Materials and methods

Data source

After institutional review board approval, data from 2010 to 2015 were identified in the rectal participant use file of the National Cancer Data Base (NCDB), a joint effort between the American Cancer Society and American College of Surgeons’ Commission on Cancer. Established in 1989, the NCDB is a nationwide, facility based, comprehensive clinical surveillance resource oncology data set that currently captures 70% of all newly diagnosed malignancies in the USA annually [6].

Patient selection

Patients diagnosed with rectal adenocarcinoma, as defined by the International Classification of Diseases for Oncology, 3rd Edition, who underwent proctectomy from 2010 to 2015 were selected [7]. As the surgical approach was not recorded prior to 2010, patients diagnosed before 2010 were excluded. Staging was derived from the AJCC information provided. Patients who underwent surgical excision at a facility other than the one where they were diagnosed were excluded, as surgical approach was not captured in this cohort.

Variables

Demographic, cancer-specific, and facility-related variables available in NCDB have been defined previously, and include demographic information, socioeconomic variables, tumor characteristics, staging, and surgery type [6].

The total number of proctectomies performed at each institution during the years of interest was identified. The number of RPs performed at each institution was then identified, and the average number of RPs per year per institution was calculated. Institutions were subsequently divided into four groups by this average: the top 10% and bottom 10% of volume, with the middle 80% divided in half. The cut points were rounded to whole numbers (e.g., 4.3 cases/year was rounded down to 4), giving final groups of ≤ 1/year, > 1 to 4/year, > 4 to < 12/year, and ≥ 12/year.
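The grouping step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the facility identifiers, and the toy case list are hypothetical, and dividing by six study years (2010 through 2015) is an assumption, since the text does not state how years without any RP were handled.

```python
from collections import Counter

STUDY_YEARS = 6  # assumption: 2010 through 2015 inclusive


def average_annual_volume(rp_cases, n_years=STUDY_YEARS):
    """Map each facility id to its average number of RP cases per year.

    rp_cases is an iterable of hypothetical facility ids, one entry per RP case.
    """
    counts = Counter(rp_cases)
    return {facility: n / n_years for facility, n in counts.items()}


def volume_group(avg_per_year):
    """Assign a volume group using the rounded cut points 1, 4, and 12."""
    if avg_per_year <= 1:
        return "<=1/year"
    if avg_per_year <= 4:
        return ">1-4/year"
    if avg_per_year < 12:
        return ">4-<12/year"
    return ">=12/year"


# Toy input, not study data: facility "A" has 72 cases, facility "B" has 3.
toy_cases = ["A"] * 72 + ["B"] * 3
groups = {f: volume_group(v) for f, v in average_annual_volume(toy_cases).items()}
```

With this toy input, facility "A" averages 12 cases per year and falls in the highest group, while facility "B" averages 0.5 cases per year and falls in the lowest.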

Outcome measures

Primary outcomes were associations between center RP volume and markers of oncologic adequacy, namely margin positivity (either circumferential resection or distal margin) and adequacy of lymphadenectomy (at least 12 nodes identified), together with 30- and 90-day mortality and overall survival. Secondary outcomes were 30-day readmission and conversion to open surgery.

Statistical methods

Descriptive statistics are displayed as frequencies for categorical variables. Chi-square testing was used to evaluate the univariate associations between RP volume and patient (age, sex, race, insurance status, Charlson–Deyo score, socioeconomic status, and distance from treatment facility), hospital (facility type, overall proctectomy volume), and tumor (clinical stage, pathologic stage, neoadjuvant chemoradiotherapy) characteristics. Similarly, chi-square testing was used to evaluate the univariate association between RP volume and the short-term outcomes, including conversion to open, lymph node (LN) harvest, 30-day readmission, margin positivity, and 30- and 90-day mortality.
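For readers less familiar with the test, the Pearson chi-square statistic for a contingency table (for example, four volume groups by a yes/no outcome) can be computed as below. This is a minimal sketch, not the analysis code used in the study, which was run in Stata; the function name and the toy table are hypothetical, and the p value would still need to be read from a chi-square distribution with the returned degrees of freedom.

```python
def chi_square_statistic(table):
    """Return (statistic, degrees_of_freedom) for a rows-by-columns count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected

    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Toy 2 x 2 table, not study data: rows are two groups, columns are outcome yes/no.
toy_stat, toy_dof = chi_square_statistic([[10, 20], [20, 10]])
```

In the toy table every expected count is 15, so each cell contributes 25/15 and the statistic is 20/3 with one degree of freedom.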

Next, we created separate multivariable logistic regression models to examine the relationship between RP volume and each short-term outcome. To create these models, we first performed univariate logistic regression to evaluate the association between the above-mentioned patient, hospital, and tumor characteristics and each short-term outcome separately. The final multivariable model for each short-term outcome included RP volume and any patient, hospital, or tumor characteristic that was significantly (p < 0.05) associated with the outcome on univariate analysis. We present the models in which RP volume remained associated with the short-term outcome after multivariable adjustment. The conversion to open multivariable model included RP volume, age, facility type, and income. The lymph node harvest model included RP volume, age, clinical stage, and overall proctectomy volume. For the margin positivity model, covariables included RP volume, age, insurance status, and clinical stage. Full models for 30- and 90-day mortality and 30-day readmission are available upon request.
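The screening logic can be illustrated with a short sketch. For a single binary covariate, the odds ratio from a univariate logistic regression equals the cross-product ratio of the 2 x 2 table, and a Wald 95% confidence interval follows from the standard error of the log odds ratio. The code below is an illustration under those assumptions only: the function names, counts, and p values are hypothetical, and it does not reproduce the adjusted odds ratios from the multivariable models fitted in Stata.

```python
import math


def odds_ratio_wald(a, b, c, d, z_crit=1.96):
    """Unadjusted odds ratio, Wald confidence interval, and two-sided p value.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    All four counts must be greater than zero.
    """
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(log_or - z_crit * se)
    upper = math.exp(log_or + z_crit * se)
    # Two-sided p value from the standard normal distribution.
    p_value = math.erfc(abs(log_or / se) / math.sqrt(2))
    return math.exp(log_or), (lower, upper), p_value


def screen_covariates(univariate_p_values, alpha=0.05):
    """Keep covariates whose univariate p value is below alpha."""
    return [name for name, p in univariate_p_values.items() if p < alpha]


# Toy counts and p values, not study data.
toy_or, toy_ci, toy_p = odds_ratio_wald(20, 80, 10, 90)
toy_selected = screen_covariates({"age": 0.001, "sex": 0.40, "income": 0.049})
```

In this toy example the odds ratio is 2.25, and only the covariates named "age" and "income" pass the p < 0.05 screen; RP volume would be added to every final model regardless of the screen.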

The Kaplan–Meier method was used to estimate the overall survival function. Overall survival was defined as time from diagnosis to death, with patients alive at time of last follow-up censored. Patients who underwent surgery in 2015 were not included in survival analysis, as they did not have enough follow-up for relevant analyses. Additionally, patients who died within 90 days were excluded from survival analysis, to highlight long-term outcomes rather than perioperative survival. Univariate Cox regression analysis was used to evaluate the association between overall survival and patient, hospital, and tumor variables. Those variables significant on univariate analysis were then included in a multivariable Cox regression analysis to identify the adjusted association between center RP volume and overall survival. A p value of < 0.05 was considered statistically significant. All statistics were performed with Stata MP [8].
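The Kaplan–Meier estimate itself is simple to compute: at each distinct death time, the running survival probability is multiplied by one minus the fraction of at-risk patients who died at that time, and censored patients leave the risk set without lowering the curve. The sketch below is a generic implementation for illustration, not the study code; the function name and the toy follow-up times are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct time with at least one death.

    times:  follow-up time for each patient
    events: 1 if the patient died at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []

    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share this follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after time t.
        at_risk -= leaving
    return curve


# Toy follow-up in months, not study data.
toy_curve = kaplan_meier([5, 8, 8, 12, 20], [1, 1, 0, 1, 0])
```

For the toy data the curve steps to 0.8 at 5 months, 0.6 at 8 months, and 0.3 at 12 months; the patient censored at 20 months does not change the estimate.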

Results

Predictors of robotic proctectomy at high-volume centers

A total of 8107 patients underwent RP. The fraction of proctectomies performed with robotic assistance increased significantly over the study period, from 487 (4.9%) in 2010 to 2404 (23.2%) in 2015 (p < 0.001, Fig. 1). The cohort was predominantly 70 years of age or older (n = 4249, 52%), male (n = 5100, 63%), white (n = 6586, 81%), and privately insured (n = 4288, 53%). The majority had a Charlson–Deyo score of 0 (n = 6293, 78%) and were treated at an academic center (n = 3392, 44%) or comprehensive community cancer program (n = 3252, 42%) (Table 1). On univariate analysis, age (p < 0.001), race (p < 0.001), insurance status (p < 0.001), facility type (p < 0.001), income (p < 0.001), distance from treatment facility (p < 0.001), and clinical stage (p < 0.001) were associated with center RP volume (Table 1).

Fig. 1 The rate of RP significantly increased over time, while the rate of open resection decreased (p < 0.001). The overall rates of proctectomy did not significantly differ by year

Table 1 Cohort Demographics

Low robotic proctectomy volume is associated with poorer short- and long-term outcomes

On univariate analysis, lower RP volume was associated with increased conversion to open procedure (p < 0.001), inadequate (< 12) lymph node harvest (p < 0.001), positive margin status (p = 0.002), and increased 30- and 90-day mortality (p = 0.003 for each) (Table 2).

Table 2 Univariate association between RP volume and short-term outcomes

Following multivariate analysis, each volume group below 12 RP/year was associated with an increased likelihood of conversion to open procedure. As annual volume decreased, the odds of conversion to open increased: 4–12 RP/year [OR 1.9 (95% CI 1.1–3.2), p = 0.02], 1–4 RP/year [OR 3.9 (95% CI 2.3–6.5), p < 0.001], and 1 or fewer RP/year [OR 6.6 (95% CI 3.9–11.3), p < 0.001]. Performing one or fewer RP/year was associated with inadequate lymph node harvest [OR 1.5 (95% CI 1.1–2.0), p = 0.01]. For each volume group below 12 RP/year, there was also an increased likelihood of margin positivity: 4–12 RP/year [OR 1.8 (95% CI 1.1–2.9), p = 0.01], 1–4 RP/year [OR 1.9 (95% CI 1.2–3.0), p = 0.01], and 1 or fewer RP/year [OR 2.1 (95% CI 1.2–3.5), p = 0.005] (Table 3). Additional factors that remained significant on multivariate analysis are presented in Supplemental Table 1.

Table 3 Multivariate association between RP volume and short-term outcomes

Univariate survival analysis demonstrated associations between lower survival and conversion to open operation, positive margin, and inadequate lymph node harvest (additional factors are provided in Table 4). Higher RP volume was also associated with improved overall survival on univariate analysis (Table 4). Five-year survival was 66% among centers that performed ≤ 1 RP/year, 72% for 1–4 RP/year, 75% for 4–12 RP/year, and 84% for ≥ 12 RP/year (Fig. 2). After multivariable adjustment including all variables significant on univariate analysis, the lowest two RP volume groups remained significantly associated with poorer overall survival when compared to ≥ 12 RP/year: ≤ 1 RP/year (HR 1.4, 95% CI 1.0–1.8, p = 0.04) and > 1–4 RP/year (HR 1.4, 95% CI 1.1–1.9, p = 0.02) (Table 4). Additional factors predictive of poorer survival on multivariate analysis are presented in Table 4 and included age > 70, male gender, Medicare/unknown insurance, Charlson–Deyo score ≥ 2, increasing pathologic stage, clinical stage IV, omission of adjuvant chemotherapy, margin positivity, and inadequate lymph node harvest.

Table 4 Univariate and multivariate predictors of survival
Fig. 2 Kaplan–Meier curve demonstrating the relationship between the volume of RP performed each year and the overall survival, excluding patients with 90-day mortality

Discussion

These data demonstrate that RP, while being performed with increasing frequency, is associated with poorer oncologic outcomes and overall survival when performed at lower volume centers. When controlling for factors associated with the selection of higher volume centers, RP at centers that perform fewer than 12 per year is associated with higher conversion to laparotomy, positive surgical margins, inadequate lymph node harvest, and poorer overall survival. Despite the lack of definitive evidence defining the role of RP, use of RP is increasing in frequency. Previous series have shown increased adoption of laparoscopic techniques, and an ill-defined trend in RP [9]. Here, we demonstrate a profound expansion of this technique from 2010, when 4.9% of proctectomies were robotically assisted, to 2015, when 23.2% were. This is not surprising given the overall utilization of robotics in oncologic surgery, despite a lack of clear evidence supporting this use [1].

In general, significant controversy exists as to the utilization of laparoscopy in rectal cancer. While laparoscopy represents a fundamentally different technique, the debate around its utilization informs the developing controversy around RP. ACOSOG Z6051, a multi-center randomized non-inferiority trial examining the utilization of laparoscopy in stage II–III rectal cancer, initially failed to demonstrate non-inferiority for the minimally invasive technique when first reported in 2015 [10]. Despite this, at follow-up in 2019, the authors reported no significant differences in disease-free survival and recurrence [11]. Given the initial challenges in demonstrating the oncologic safety of laparoscopy, as exemplified by ACOSOG Z6051, significant hesitance exists surrounding the adoption of RP with respect to oncologic and technical outcomes.

Several small, multi-center studies have reported encouraging outcomes with RP, including low rates of open conversion and acceptable oncologic outcomes. Pigazzi et al. reported a conversion rate of 4.9% in their series of 143 patients undergoing RP, with comparable lymph node harvest and margin status as compared to a series of open operations [12]. A meta-analysis of five series by Ortiz-Oshiro et al. reported lower rates of open conversion in RP as compared to laparoscopic proctectomy, and equivalent rates of adequate lymph node harvest and margin positivity [13]. A larger series of 251 matched patients comparing laparoscopic proctectomy and RP demonstrated no difference in short-term outcomes, including readmission and reoperation, though RP was associated with an increased overall cost [14]. Despite these studies showing equivalent or superior outcomes in patients receiving RP, it is possible that these data represent the outcomes of selected patients at specialized centers with higher volume and greater experience in RP.

Large randomized studies, aiming to address selection and experience biases, present less conclusive evidence for RP. Early results from the ROLARR trial reported an 8.1% rate of conversion to laparotomy, 5.1% rate of positive circumferential margin, and 0.9% rate of 30-day mortality, which is very similar to the data we report here [5]. However, when stratified by hospital-level volume, our analyses suggest that RP may, in fact, have increased laparotomy conversion rates (as high as 13.1% in the lowest volume cohort) and poorer oncologic outcomes, with inadequate lymph node harvest in 30.4% in the lowest volume cohort. ROLARR, while acknowledging experience differences between individual surgeons, did not aim to stratify based on volume, with a range of experience from 30 to 101 RP. In our cohort, over half of all RP (4151 patients, 51.2%) were performed in centers where fewer than 7 RP were performed on average each year.

Relationships between volume and outcomes in rectal cancer have been demonstrated previously. Baek et al. showed decreased post-operative mortality and increased use of sphincter-preserving techniques in higher volume compared to lower volume centers in a cross-sectional study in California [15]. Additionally, a recent meta-analysis by Chioreso et al. demonstrated associations between hospital volume and both margin positivity and overall survival in rectal cancer surgery, but did not find any association between surgeon volume and either outcome [16]. Our data extend these relationships to robotic surgery in rectal cancer by showing a profound difference in overall survival between high- and low-volume centers, with 75% of patients surviving greater than 80 months in the highest volume centers, versus 47.2 months in the lowest volume centers.

It is important to consider the possibility that the outcomes reported at lower volume centers at any given time point may represent the evolution of the learning curve in robotic surgery for a particular surgeon or a center. Guend et al. demonstrated that for a center adopting robotics for rectal cancer, the first adopter requires 74 cases to complete the initial learning curve, with later adopters requiring 25–30 cases [17]. As such, volume-based differences may be secondary to surgeons at varying experience levels and may be expected to diminish with time. That outcomes may improve with more experience, however, does not change the fact that we demonstrate poorer outcomes in the low-volume group. Further, centers that perform one RP per year will likely never achieve the necessary level of proficiency. Our data may highlight the need for, at the least, proctoring and supervision by higher volume providers as centers begin to utilize RP, in an attempt to mitigate the poorer outcomes while volume and experience are increasing. Additionally, centers must be realistic about their predicted RP volumes. Additional solutions, including structured training programs, simulation, cadaver labs, and initially performing RP for benign indications, may shorten the learning curve and expedite general improvement in outcomes. Many of these techniques have been used within surgery with great success, including during the dissemination of endovascular techniques in vascular surgery [18]. As the study period is 2010–2015, it is possible that many of the lowest volume centers have moved beyond this learning curve in the ensuing 4 years; however, these data will not be known for several years.

It is important to note that conversion to a laparotomy does not represent a technical failure, rather a reversion to a known technique typically performed for poor visualization, technical complexity, or bleeding. Despite this, associations between higher rates of conversion to laparotomy and lower volume centers may, in part, represent decreased technical experience, or inexperience in patient selection.

There are several limitations in the current study that should be noted. First, despite efforts to ensure data quality and accuracy, retrospective data are subject to omitted entries and coding errors. Additionally, the NCDB provides information on neither total center-level surgical volume nor surgeon-specific volume, which may account for some of the differences observed in this cohort. We would anticipate discernible differences between the outcomes of RP performed at centers that perform an otherwise large volume of robotic procedures but are just beginning to learn RP and those at centers and surgeons that perform few robotic surgeries overall. We would also anticipate a difference in outcomes between RPs performed by a single, experienced, high-volume surgeon at an otherwise low-volume center and those performed at a center with multiple low-volume surgeons. Despite this potential limitation, previous work has demonstrated that hospital volume is an adequate surrogate for surgeon volume in colorectal resections, likely mitigating the impact of this limitation on the outcome of our study [19].

Additionally, the selection of robotic volume cohorts by identifying the top and bottom 10% and dividing the middle 80% in half may introduce selection bias, namely where the lowest volume group comprises surgeons who are just learning the procedure. Despite this, the volume groups are clinically grounded, ultimately comparing hospitals that average one RP or fewer per year in the lowest group with hospitals where one per month or more is performed. In addition, when robotic volume is treated as a continuous variable, it remains predictive of overall survival, conversion to open, inadequate lymph node harvest, and margin positivity in multivariate models, suggesting that our groups do not enrich for these findings. More importantly, as noted above, if outcomes are poor early in the learning curve (in the low-volume cohorts), then training programs must be designed so patients are not subject to inferior surgery as a procedure is being learned.

As a final limitation of this study, the NCDB does not capture disease-specific survival, and we are therefore only able to draw conclusions regarding center-level volume and overall survival. Even after rigorous adjustment for factors associated with both RP volume and overall survival, however, RP volume was still significantly associated with overall survival.

We conclude that outcomes after RP depend on the hospital-level volume of this procedure. While patients undergoing RP at centers that perform 12 or more cases per year have outcomes paralleling those reported previously, patients treated at centers performing lower volumes of RP have significantly increased rates of conversion to laparotomy, inadequate lymph node harvest, and positive margins. Most significantly, lower volume centers are associated with poorer overall survival. These findings are concerning given the widespread, and increasing, utilization of RP across the country. They highlight the need for closer inspection of the utilization of RP, especially when the volume of cases is lower than that required in the current randomized trials. Clearly, more work is needed, including randomized controlled trials such as the ongoing ROLARR trial, as well as supervision from accreditation agencies such as the National Rectal Cancer Accreditation system.