Introduction

The learning curve for cytoreductive surgery (CRS) and hyperthermic intraperitoneal chemotherapy (HIPEC), which peaks at 130 procedures, applies to the surgeon as well as the institute. Apart from technical skills, multidisciplinary decision-making plays a vital role in selecting appropriate patients for the procedure, excluding patients who are unlikely to benefit from it, and in the early recognition and management of complications [1, 2]. The parameters that have been used to evaluate the proficiency of the surgeon in performing the procedure are the grade 3–4 morbidity, the mortality, and the completeness of cytoreduction [1]. Proficiency has also been evaluated in terms of early oncological failure, defined as recurrence within 2 years of surgery [3]. Ideally, the initial procedures should be performed under the direct supervision or with the help of expert centers. This may not be possible when starting a peritoneal surface oncology unit in a different country, so the initial surgeries are performed while the surgeons are still on the learning curve. A surgeon’s proficiency can be evaluated by qualitative methods such as direct observation by an expert and a cumulative summation (CUSUM) graph [4, 5, 6]. These, however, have no formal means of indicating when the required level of competence has been reached [7]. Moreover, a minimal level of proficiency, that is, an acceptable safe limit before the peak of the learning curve has been reached, has not previously been defined for this procedure. The CUSUM test for learning curve (LC-CUSUM) is a quantitative and individualized statistical tool for analyzing the learning curve [7]. We aimed to determine the number of cases required to attain a minimal level of proficiency in treating pseudomyxoma peritonei (PMP) with CRS and HIPEC using the LC-CUSUM.

Methods

Prospectively collected data were analyzed retrospectively after institutional permission was obtained. All cases of pseudomyxoma peritonei (PMP) from an appendiceal primary treated with CRS and HIPEC from Jan 2011 to Jan 2016 were included. CRS was performed as per the techniques of peritonectomy and visceral resection described before [8]. HIPEC was performed by the coliseum technique using either an oxaliplatin- or a mitomycin-based regimen [9, 10]. Bidirectional chemotherapy was used in all patients undergoing HIPEC [9, 10]. Morbidity was graded according to CTCAE version 4 [11]. The time to recurrence was calculated from the day of surgery. All patients were followed up till death from any cause. Early oncological failure was defined as recurrence occurring within 2 years of cytoreductive surgery and HIPEC [3].

Statistical Analysis

The data were analyzed using SPSS version 20.0, and p < 0.05 was considered significant. Kaplan-Meier survival curves, Cox proportional hazards regression models, Pearson’s chi-square test, Fisher’s exact test, and the t test were used to determine the survival outcomes. To determine the number of cases required to attain a minimal level of proficiency, the learning curve CUSUM (LC-CUSUM), also known as the CUSUM test, was used [7]. The CUSUM graph should be differentiated from the LC-CUSUM [12]. The CUSUM graph plots the cumulative sums of the deviation between the process and a target value and provides a qualitative overview of the process being monitored [13]. The LC-CUSUM (CUSUM test) is a hypothesis test that compares computed values with a limit and determines whether a process has reached a predefined level of performance, in our case, a minimum level of proficiency (acceptable limit of safety) in treating patients with PMP by CRS and HIPEC [14]. It presumes that the process is not in control at the start of monitoring (the surgeon is not proficient) and indicates when the process can be considered to be in control (the surgeon has reached the acceptable predefined level of performance) [7]. In this test, “h” is the defined target level of proficiency, h0 represents unacceptable performance (the process is out of control), and h1 indicates that the process is in control though the highest level of performance has not been reached. The hypotheses are therefore inverted for the LC-CUSUM relative to a conventional CUSUM: under h0 the process is out of control and under h1 the process is in control [7]. The LC-CUSUM remains responsive at all times: even if proficiency is reached after successive failures, this is indicated promptly.
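The descriptive CUSUM graph mentioned above, as distinct from the LC-CUSUM test, can be illustrated with a minimal sketch. The function name `cusum_graph` and the 0/1 coding of outcomes are assumptions made for illustration; they are not taken from the study's analysis, which was run in SPSS.

```python
def cusum_graph(outcomes, target_rate):
    """Cumulative sum of deviations between observed outcomes and a target rate.

    outcomes    : iterable of booleans, True = adverse event (failure)
    target_rate : target adverse event rate, e.g. 0.25 for 25%
    (Hypothetical helper for illustration only.)
    """
    total = 0.0
    points = []
    for failed in outcomes:
        # Each case contributes (observed - target): a failure pushes the
        # curve up, a success pulls it down by the target rate.
        total += (1.0 if failed else 0.0) - target_rate
        points.append(total)
    return points


# Four consecutive cases: failure, success, success, failure.
points = cusum_graph([True, False, False, True], target_rate=0.25)
```

A curve that trends upward suggests an event rate above the target and a downward trend suggests a rate below it, but the plot by itself gives no formal signal that competence has been reached.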
In terms of graphical representation, the process is assumed to be unacceptable as long as the LC-CUSUM score remains above (or below, if successes are indicated by an ascending graph) the limit h; the process is considered to be acceptable (the various prognostic variables are within acceptable limits) when the LC-CUSUM score crosses this limit. The LC-CUSUM incorporates a holding barrier at zero that cannot be crossed. The limits h1 and h0 have to be set for the analysis [7]. We employed the LC-CUSUM to determine whether the grade 3–4 morbidity, 30-day in-hospital mortality, rate of incomplete cytoreduction (CC-2/3), and early recurrence were within acceptable limits. The most important aspect of the study was to set the target levels for the minimal level of proficiency. For grade 3–4 morbidity, 25% was taken as the upper limit, the target being 15% (h0 = 25%, h1 = 15%); for the 30-day in-hospital mortality, a lower limit of 2% and an upper limit of 4% were set (h0 = 4%, h1 = 2%). These values were based on the two largest published series of PMP treated with CRS and HIPEC: the multi-institutional study by Chua et al., which is the largest published series so far and includes data from less experienced centers, and the single-institution series of 1000 patients by Moran et al. [15, 16]. The major morbidity reported by Chua et al. was 35% for emerging units and 25% for established units; that reported by Moran et al. was 15.2%, although this included their early experience as well. For CC-2/3 resections, h1 = 15% and h0 = 25%. These values were again based on the two studies above, in which complete cytoreduction was achieved in 83% (Chua et al.) and 75% (Moran et al.) of patients. For early oncological failure, 5% was the target (h1) and 15% or more was considered unacceptable (h0). We included this parameter based on the study by Kusamura et al., who considered this factor to be more reflective of the performance of the entire unit than of the surgeon alone [17]. The reported recurrence rate from another series of 512 patients was 25% [18]. The acceptable limits were calculated using an α error of 5% and a β error of 20% [19].
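The test described above can be sketched for a binary outcome as follows. This is an illustrative sketch under stated assumptions, not the computation performed in this study: it uses the standard log-likelihood-ratio weights for a Bernoulli outcome, the ascending convention (successes raise the score), and a hypothetical decision limit `limit_h = 1.0`. In practice the limit is chosen so that the test meets the stated α and β error rates. The names `lc_cusum`, `unacceptable_rate`, and `acceptable_rate` are invented for this example and correspond to h0 and h1 in the text.

```python
import math


def lc_cusum(outcomes, unacceptable_rate, acceptable_rate, limit_h):
    """Sketch of an LC-CUSUM for binary outcomes (ascending convention).

    outcomes          : iterable of booleans, True = adverse event (failure)
    unacceptable_rate : event rate under h0 (process out of control)
    acceptable_rate   : event rate under h1 (process in control)
    limit_h           : decision limit (hypothetical value in this sketch)

    Returns (scores, first_case), where first_case is the 1-based case
    number at which the score first reaches limit_h, or None.
    """
    # Log-likelihood-ratio weights: positive for a success, negative for a failure.
    w_success = math.log((1 - acceptable_rate) / (1 - unacceptable_rate))
    w_failure = math.log(acceptable_rate / unacceptable_rate)

    score = 0.0
    scores = []
    first_case = None
    for case_number, failed in enumerate(outcomes, start=1):
        # Holding barrier at zero: the score cannot drop below 0, so the
        # test stays responsive even after a run of failures.
        score = max(0.0, score + (w_failure if failed else w_success))
        scores.append(score)
        if first_case is None and score >= limit_h:
            first_case = case_number
    return scores, first_case


# Grade 3-4 morbidity limits from the text: h0 = 25%, h1 = 15%.
# Two early complications followed by eight uneventful cases.
scores, first_case = lc_cusum([True, True] + [False] * 8, 0.25, 0.15, limit_h=1.0)
```

Because of the holding barrier, the two early failures leave the score at zero rather than building up a deficit, so the subsequent run of successes is credited in full.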

Results

From Jan 2011 to Jan 2016, 77 patients underwent CRS and HIPEC. The patient characteristics, the operative findings, and morbidity and mortality are presented in Table 1. The mean PCI was 28, and a complete cytoreduction (CC-0/1) was attained in 75% of the patients. Of the patients, 77.9% had low-grade PMP and 21.1% had high-grade PMP. The grade 3–4 morbidity was 42.6% and the mortality was 5.2%. The 5-year overall survival (OS) was 62.3% and the 3-year disease-free survival (DFS) was 71%. Patients with low-grade tumors (p < 0.01), a low PCI (p < 0.01), a CC-0/1 score (p < 0.01), and those without obstructive symptoms (p < 0.01) experienced a better DFS; the only independent predictor was a CC score of 0/1 (p = 0.03). The absence of grade 3–4 complications (p < 0.01), a low PCI (p = 0.05), and a CC score of 0/1 (p = 0.04) were the factors favorably affecting OS; the only independent predictor was the absence of grade 3–4 complications (p < 0.01).

Table 1 Patient characteristics, operative findings, morbidity and mortality in patients undergoing CRS and HIPEC for PMP

Analysis of the Learning Curve Using LC-CUSUM

The LC-CUSUM curve was interpreted as follows:

  • If the curve stayed inside the upper and lower limits then the process was considered to be under control.

  • When the curve went above the upper control limit, it was considered worsening of the performance.

  • When the curve went below the lower control limit, it was considered improvement of the performance.

  • Once the lower control limit was crossed and the graph plateaued, it was inferred that the highest level of proficiency had been attained.
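The first three interpretation rules above can be expressed as a small classification function. This is only a sketch: the function name `interpret_point` and the numeric limits in the usage lines are hypothetical, and the plateau rule is omitted because the text does not give a quantitative definition of a plateau.

```python
def interpret_point(value, lower_limit, upper_limit):
    """Classify one point of the curve against the two control limits.

    Hypothetical helper mirroring the interpretation rules listed above.
    """
    if value > upper_limit:
        return "worsening"    # curve above the upper control limit
    if value < lower_limit:
        return "improving"    # curve below the lower control limit
    return "in control"       # curve between the two limits


# Illustrative limits only; they are not the limits used in the study.
labels = [interpret_point(v, lower_limit=-1.0, upper_limit=1.0) for v in (1.4, 0.2, -1.3)]
```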

For in-hospital mortality, the graph remained above the upper limit till the 57th case (Fig. 1), after which it was within the acceptable limits that had been set. For the grade 3–4 morbidity (Fig. 2) and the proportion of CC-2/3 resections (Fig. 3), it took 38 cases for the graph to be within the acceptable limits; for early oncological failure (Fig. 4), it took 70 cases. For all the above parameters, the highest level of proficiency had not been attained after 77 cases though the processes were in control.

Fig. 1 CUSUM test for in-hospital mortality: acceptable performance is reached after 57 cases

Fig. 2 CUSUM test for grade 3–4 morbidity: acceptable performance is reached after 38 cases

Fig. 3 CUSUM test for CC-2/3 resections: acceptable performance is reached after 38 cases

Fig. 4 CUSUM test for early oncological failure: acceptable performance is reached after 70 cases

Discussion

CRS and HIPEC is the standard of care for treating PMP of appendiceal origin [20]. The main concerns with widespread acceptance have been the extensive nature of the surgery, the cost, and the high rates of morbidity and mortality when compared to other gastrointestinal cancer surgeries. Over the years, with increasing experience, the morbidity and mortality have decreased significantly. This is due in part to more careful patient selection and early recognition and management of complications by a multidisciplinary team [17]. Reaching the level of expertise of the established centers has been associated with a prolonged learning curve that peaks at 90 procedures for the surgeon and 100 for the institute [17]. This may take years, as these are rare tumors and each institute may treat only a few cases every year. In the initial phase of the experience, surgeries are performed without having gained the highest level of expertise. During this phase, it becomes important to review the outcomes and ensure that the results are within acceptable limits (minimal level of proficiency) in terms of the determinants of the quality of the procedure, i.e., that the procedure has an acceptable morbidity and mortality and that the proportion of patients having complete removal of macroscopic disease is within acceptable limits. Looking at the outcomes of all patients put together, the morbidity was 42.6%, the mortality was 5.2%, and the rate of CC-0/1 was 75%. A 75% rate of CC-0/1 is similar to that published in large series of such patients, but the morbidity and mortality are on the higher side; they are higher than the limits set for the CUSUM test in this study [21, 22]. However, these values represent averages, and there would be a point from which we were performing better than the average reflects. To determine whether the procedure was being performed with a minimum level of proficiency (safety) and at what point this level was reached, we used the LC-CUSUM.

The morbidity and rate of complete cytoreduction were in control by the 38th procedure. It took 57 cases for the mortality and 70 cases for early oncological failure to be within acceptable limits. Voron and colleagues found in their study that acceptable limits were reached after the 40th procedure, though the peak of the learning curve was reached only after 140 procedures [23]. However, they studied only the grade 3–4 morbidity.

The LC-CUSUM has a simple formulation, can detect small changes, and is a useful tool to evaluate the performance of the surgeon. For determining the peak of the learning curve, the sequential probability ratio test (SPRT) is a more appropriate tool [24, 25]. Our findings also indicate that the number of cases required to attain expertise is not the same for different parameters. In the CUSUM test, setting the values of h0 and h1 is crucial. We based these values on the results of high-volume centers. Experience with this procedure has grown substantially around the world and there are publications that can guide surgeons in difficult situations. Hence, the target level of proficiency should also be high. CRS and HIPEC are used to treat peritoneal metastases from other tumors as well and this adds to the experience with the procedure, though disease-specific experience is important. PMP constituted 40% of our patients undergoing CRS and HIPEC.

For other common primaries such as colorectal cancer, such procedures are performed only for limited peritoneal spread, whereas for PMP there is no PCI cut-off for selecting patients for the procedure [26, 27]. Moreover, a third of the patients present with a high PCI (> 28–30) [28]. Hence, the surgeon has to deal with more extensive disease in an effort to attain complete cytoreduction.

The most significant variable influencing both DFS and OS is the completeness of cytoreduction, which depends on the skill and ability of the surgeon to remove the tumor from all regions of the abdominal cavity [15, 28]. This would make the surgeon an important prognostic factor. The surgeon should be able to predict if complete tumor removal is possible with reasonable accuracy. At the same time, the disease extent as reflected by the PCI also influences this end point [28]. While some indicators of inoperability on imaging have been defined, it is not uncommon to underestimate the extent of disease. With extensive disease, it may become impossible to achieve CC-0/1 and such cases should not be subjected to surgery unless a palliative debulking is possible [28]. The intent of surgery and prognosis should also be determined before the procedure.

Our study is important as it looks at a minimal level of proficiency with which the procedure should be performed. The number of cases needed for each prognostic variable is different, and surgeons can improve in the areas in which they are lagging. The main drawbacks of this study are that the limits that were set may be considered arbitrary and that we have not performed a validation of the test. Though this test has been used to assess surgeons’ performance, it is impossible to exclude the influence of the institution and of the multidisciplinary team managing these patients on certain parameters such as morbidity, mortality, and early oncological failure.

Conclusions

Whereas maximal expertise in performing CRS and HIPEC for PMP is attained after a prolonged learning curve, surgeons could perform this procedure with a minimum level of proficiency (acceptable level of safety) before the peak of the curve is reached. This level can be determined by the LC-CUSUM test. The number of cases required is different for each prognostic variable, which can help surgeons identify the areas in which they need to improve more. For CRS and HIPEC, the morbidity, mortality, and early oncological failure are dependent on the performance of the multidisciplinary team treating these patients, and hence it is impossible to exclude this influence on the results. Thus, the performance is reflective of the performance of the team/institute and not the surgeon alone.