Introduction

Few would disagree that the use of permanent materials in the surgical treatment of pelvic organ prolapse represents one of the most compelling controversies in gynecologic surgical practice. Resolution of this controversy is complicated by the rapid pace of change in surgical therapies. Surgeons now weigh the risks and benefits of traditional native tissue vaginal procedures, vaginal surgery supplemented by permanent and nonpermanent mesh, abdominal repair via laparotomy, and abdominal repair via minimally invasive approaches. There is, in fact, level 1 evidence that abdominal sacral colpopexy via laparotomy results in more durable anatomic correction than vaginal sacrospinous ligament fixation [1, 2]. However, whether the best route of apical prolapse surgery is abdominal or vaginal remains controversial [3–5]. Minimally invasive approaches to open mesh colpopexy may provide the durability of abdominal repairs with the easier recovery of vaginal surgery [6], yet only one prospective comparison of minimally invasive sacral colpopexy (MISC) and vaginal mesh (VM) has been reported [2]. As for vaginal surgery, various randomized trials comparing VM to “native tissue” traditional vaginal surgery have reported superior anatomic results with mesh, though at the price of visceral injury, hemorrhage, and variable, often high rates of mesh erosion/exposure [7–9].

Mesh erosion (also called exposure) is one of the most common adverse events due to placement of mesh during pelvic organ prolapse surgery. Risk factors for mesh erosion 2 years after sacral colpopexy include mesh type (with expanded polytetrafluoroethylene increasing risk fourfold), concurrent hysterectomy [odds ratio (OR) 4.9], and current smoking (OR 5.2). It is also becoming apparent that the overall observed 2-year rate of mesh or suture erosion of 6 % increases over time as postoperative follow-up continues after abdominal sacral colpopexy [10]. Studies reporting outcomes beyond 2 years after vaginal mesh repairs are now being published, with reported mesh erosion rates of 18–19 % [9]. While vaginal estrogen (estradiol, E2) therapy is widely used and thought to aid in vaginal healing and prevention of operative complications, very few data exist on the use of this therapy to prevent mesh erosion in the vagina.

The objective of this study was to explore the feasibility of addressing these two prominent controversies in prolapse surgery. We sought to design a randomized trial testing two commonly used techniques for durable repair of prolapse: VM and MISC. We also wished to test the efficacy of vaginal estrogen therapy to prevent mesh erosion after these two surgical procedures. A potentially efficient way to address both questions at once is the use of a factorial design, where each subject undergoes two separate randomizations: one to a surgical intervention and one to perioperative E2 therapy vs placebo, yielding four distinct treatment arms. This kind of trial offers the opportunity to test two hypotheses with fewer overall subjects than if each hypothesis were studied separately. We evaluated the feasibility of such a trial by constructing a model to simulate the probability of important treatment outcomes and estimate the sample size required to detect differences between the treatment arms.
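The two independent randomizations of a factorial design can be illustrated with a minimal sketch. It assumes simple (unblocked) randomization with equal allocation, which the text does not specify; the function name `randomize` and the cohort size of 400 are hypothetical.

```python
import random
from collections import Counter

SURGERIES = ("VM", "MISC")    # first randomization: surgical route
DRUGS = ("E2", "placebo")     # second randomization: perioperative therapy

def randomize(n_subjects, seed=0):
    """Assign each subject independently to a surgery and to a drug.

    Simple randomization is an assumption made for illustration;
    a real trial would more likely use blocked or stratified allocation.
    """
    rng = random.Random(seed)
    return [f"{rng.choice(SURGERIES)} + {rng.choice(DRUGS)}"
            for _ in range(n_subjects)]

assignments = randomize(400)          # 400 is an illustrative cohort size
arm_counts = Counter(assignments)     # four arms: VM/MISC crossed with E2/placebo

# Every subject contributes to both comparisons, which is the source
# of the efficiency of a factorial design.
n_vm = sum(c for arm, c in arm_counts.items() if arm.startswith("VM"))
n_e2 = sum(c for arm, c in arm_counts.items() if arm.endswith("E2"))
print(arm_counts, n_vm, n_e2)
```

Each of the four arms receives roughly a quarter of the subjects, while the surgical and E2 comparisons each draw on the full cohort.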

Material and methods

Markov model

This study was deemed exempt from Institutional Review Board (IRB) approval by the Duke University IRB. We constructed a Markov state transition model (Fig. 1) to simulate the probability of clinical outcomes over the period of 2 years after undergoing treatment with one of the four treatment assignments: VM + E2, VM + placebo, MISC + E2, and MISC + placebo. Markov models are particularly suited to estimating the likelihood of events over time, whether the risk is constant or changing, and are widely used in health care [11]. In research design, simulation models can be particularly helpful for estimating sample size requirements in settings where risks may change over time and where there is a paucity of data about important outcomes [12]. We used TreeAge Pro 2009 software (TreeAge Pro, Williamstown, MA, USA) to build the model and perform all simulations. Simplifying assumptions in this version of the model included (a) no competing risk of death from other causes, (b) all states are mutually exclusive (i.e., recurrent prolapse symptoms do not develop during symptomatic mesh erosion, and vice versa), (c) post-procedure event rates are constant, and (d) individual patient characteristics (age, history of prior surgery, comorbidities, degree of prolapse) that might affect outcome are not included. Many of these assumptions could be changed or expanded upon in further iterations of the model. For the purposes of sample size estimation in a randomized trial, where these individual characteristics would be balanced through randomization, our last assumption regarding patient characteristics should not significantly affect the results.
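The mechanics of a monthly Markov cohort model can be sketched as follows. This is not the TreeAge model used in the study: the three states, the treatment of erosion and recurrence as absorbing, and all monthly probabilities are hypothetical values chosen for illustration. It does reflect assumptions (a) to (c) above: no competing mortality, mutually exclusive states, and constant event rates.

```python
# Hypothetical monthly transition probabilities, for illustration only.
MONTHLY_P = {
    "VM + placebo": {"erosion": 0.006, "recurrence": 0.001},
    "MISC + placebo": {"erosion": 0.002, "recurrence": 0.001},
}

def run_cohort(p_erosion, p_recurrence, months=24):
    """Return the fraction of a cohort in each state after `months` cycles.

    Everyone starts in "well"; "erosion" and "recurrence" are treated
    as absorbing states (a simplification assumed for this sketch).
    """
    state = {"well": 1.0, "erosion": 0.0, "recurrence": 0.0}
    for _ in range(months):
        to_erosion = state["well"] * p_erosion
        to_recurrence = state["well"] * p_recurrence
        state["well"] -= to_erosion + to_recurrence
        state["erosion"] += to_erosion
        state["recurrence"] += to_recurrence
    return state

outcomes = {arm: run_cohort(p["erosion"], p["recurrence"])
            for arm, p in MONTHLY_P.items()}
for arm, dist in outcomes.items():
    print(arm, {k: round(v, 3) for k, v in dist.items()})
```

Because the state fractions always sum to 1, the 2-year cumulative incidence of each outcome can be read directly from the final distribution.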

Fig. 1

Markov model. Arrows show possible monthly transitions between states. The probability of a given transition in a given month varies between treatment arms

Model estimates (or parameter estimates)

For our probability estimates regarding the surgical comparison, we used previously published estimates from a systematic review and meta-analysis by Diwadkar et al. (Table 1) [13]. This review encompassed published manuscripts and abstracts from January 1985 to January 2008 and compared a variety of outcomes in 5,639 sacral colpopexy subjects and 3,425 vaginal mesh kits. Subjects in the colpopexy group included women undergoing open colpopexy as well as those undergoing a laparoscopic approach. While the majority of subjects undergoing colpopexy in this review had traditional open colpopexy, we assumed that their outcome estimates would be similar to those having MISC [6]. Importantly, the authors reported these outcomes as mean incidence [with 95 % confidence intervals (CI)] over the entire follow-up period, which allowed us to generate beta probability distributions (bounded by 0 and 1.0) for each outcome included in the model. We determined that the most relevant clinical outcomes to include in the model were the following: (1) visceral injury (including cystotomy, ureteral injury, and bowel injury), (2) mesh erosion, and (3) reoperation for recurrent prolapse. Mean length of follow-up differed between VM and MISC in the studies included in the review. The shorter duration of reported postoperative follow-up after VM procedures could have led to an underestimation of erosion after VM. We compensated for this difference by converting overall incidence within the mean follow-up time to a monthly probability, using standard methods for converting rates to probabilities (http://www.treeage.com). This allowed us to compare the procedures while modeling equivalent duration of postoperative follow-up.
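The two transformations described above can be sketched under stated assumptions. The conversion uses the common constant-hazard formula, rate = −ln(1 − P)/t and p = 1 − exp(−rate), and the beta parameters are obtained by the method of moments from a mean and 95 % CI. Neither is confirmed as the exact procedure used in the study, and the input values (10 % incidence over 17 months, CI 8–12 %) are hypothetical rather than taken from Table 1.

```python
import math

def monthly_probability(cumulative_incidence, followup_months):
    """Convert cumulative incidence over a follow-up period to a
    monthly probability, assuming a constant event rate."""
    rate = -math.log(1.0 - cumulative_incidence) / followup_months
    return 1.0 - math.exp(-rate)

def beta_parameters(mean, ci_low, ci_high):
    """Method-of-moments beta parameters from a mean and 95 % CI.

    The standard error is approximated from the CI width assuming
    approximate normality (an assumption of this sketch).
    """
    se = (ci_high - ci_low) / (2 * 1.96)
    common = mean * (1.0 - mean) / se ** 2 - 1.0
    return mean * common, (1.0 - mean) * common

# Hypothetical inputs, not values from Table 1.
p_month = monthly_probability(0.10, 17)
p_24 = 1.0 - (1.0 - p_month) ** 24      # projected to a common 24-month horizon
a, b = beta_parameters(0.10, 0.08, 0.12)
print(round(p_month, 5), round(p_24, 4), round(a, 1), round(b, 1))
```

Projecting both procedures to the same 24-month horizon is what removes the bias introduced by shorter follow-up in one group.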

Table 1 Probability distributions for outcomes. These inputs were used in the model from published data [13]. A beta distribution was used in all instances. Probabilities represent cumulative risk over the follow-up period shown

We nonsystematically surveyed published literature regarding the effectiveness of E2 to prevent erosion after the use of permanent mesh to treat pelvic organ prolapse and found little to no useful data. Using key words “pelvic organ prolapse” and “mesh erosion” or “treatment” or “prevention” or “estrogen,” we found no case series, cohort studies, or randomized trials that explored the use of estrogen or other local treatments (e.g., anti-inflammatory creams) to treat or prevent mesh erosion. In the absence of data, we assumed that E2 would be 50 or 75 % effective in preventing erosion if used perioperatively. We purposely chose very optimistic estimates for the effect of E2, reasoning that high efficacy for E2 would be associated with a smaller sample size predicted by the model. If our predicted sample size was feasible for a trial, we were prepared to explore lower, more clinically relevant efficacy rates for E2. On the other hand, we determined that if the predicted sample size for a trial based on very optimistic E2 efficacy rates was too large to be clinically feasible, a trial with lower E2 efficacy rates would be even more challenging to conduct.

Monte Carlo simulations

We then performed a series of Monte Carlo simulations to estimate outcome rates (with 95 % CIs) under two different assumptions about estrogen efficacy (50 and 75 % reduction in erosion rates) with 10,000 simulations at each assumed level of estrogen efficacy. During each individual simulation, the values for each model parameter were drawn from the underlying probability distributions; for example, for a “bell-shaped” distribution, most of the values would be close to the mean, with 95 % of the values coming from within the 95 % CI. The calculated outcome rates were then used to generate sample size estimates for the hypothetical four-arm factorial trial using standard methods (http://www.biomedcentral.com/1471-2288/3/26). For sample size estimates, we calculated both 80 and 90 % power to detect a difference between groups, with α = 0.05 and assuming a 10 % loss to follow-up over the 2-year study period.
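A minimal sketch of this workflow follows. It draws erosion probabilities from a beta distribution, applies an assumed E2 efficacy, and feeds the resulting rates into the textbook normal-approximation formula for comparing two proportions. The formula is a stand-in, not necessarily the factorial method referenced above, and the beta parameters (86, 777) are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean

def n_per_arm(p1, p2, alpha=0.05, power=0.80, loss=0.10):
    """Per-arm sample size for two proportions (normal approximation),
    inflated for the expected loss to follow-up."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    n = (z_a + z_b) ** 2 * (p1 * (1 - p1) + p2 * (1 - p2)) / (p1 - p2) ** 2
    return math.ceil(n / (1 - loss))

def simulate_erosion(a, b, efficacy, n_sims=10_000, seed=0):
    """Mean and 95 % interval of the erosion rate across simulations."""
    rng = random.Random(seed)
    draws = sorted(rng.betavariate(a, b) * (1 - efficacy)
                   for _ in range(n_sims))
    return mean(draws), draws[int(0.025 * n_sims)], draws[int(0.975 * n_sims)]

# Hypothetical beta parameters (mean erosion risk of about 10 %).
placebo_mean, _, _ = simulate_erosion(86, 777, efficacy=0.0)
for efficacy in (0.50, 0.75):
    e2_mean, lo, hi = simulate_erosion(86, 777, efficacy)
    print(efficacy, round(e2_mean, 4), (round(lo, 4), round(hi, 4)),
          n_per_arm(placebo_mean, e2_mean, power=0.80),
          n_per_arm(placebo_mean, e2_mean, power=0.90))
```

The required sample size falls as the assumed efficacy rises, because a larger absolute difference between arms is easier to detect.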

Results

Table 2 shows the results of the model simulations expressed as mean event rate with 95 % CI for visceral injury, surgery for recurrent prolapse, or mesh erosion. Based on the data in the systematic review, event rates for both visceral injury and surgery for recurrence were quite low, with overlapping CIs. CIs for mesh erosion rates did not overlap for any level of E2 effectiveness; in other words, the modeled difference in erosion rates between surgical procedures was significant at a p value of less than 0.05 even with very high levels of assumed E2 effectiveness. Sample size estimates generated for a two-arm trial as well as a four-arm factorial design are shown in Table 3. A two-arm trial comparing only VM and MISC, without estrogen, and with mesh erosion as the primary outcome would require 214 subjects/arm for 80 % power and 286 subjects/arm for 90 % power. Adding a factorial design to compare E2 to placebo substantially increases the sample size if E2 is 50 % effective; total sample sizes for a two-arm trial without E2 vs a four-arm factorial trial with E2 are similar if E2 is approximately 75 % effective in reducing mesh erosion.

Table 2 Comparison of predicted incidence and 95 % CIs for outcomes included in the model by treatment assignment. Values for VM + E2 and MISC + E2 are not included since those outcomes were assumed to be independent of E2 use. Values for predicted rates of mesh erosion for placebo treatment assignments as well as those assigned to E2 use are shown, with varied estimated E2 efficacy
Table 3 Sample size estimates for the hypothetical two-arm or four-arm factorial randomized trial, assuming both 50 and 75 % E2 efficacy to prevent mesh erosion

Discussion

Unless vaginal estrogen is so effective that it reduces the incidence of mesh erosion by 75 % over 2 years, the sample size required to detect statistically significant differences in important clinical outcomes in a factorial randomized trial would be prohibitively large. A four-arm factorial randomized controlled trial (RCT) including E2 would be substantially larger (i.e., 1,212 subjects for 50 % E2 effectiveness at 80 % power) than a two-arm trial comparing mesh erosion between surgical approaches alone (i.e., 428 subjects at 80 % power).
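The arithmetic behind these totals can be checked in a few lines. The per-arm figure of 214 and the totals of 428 and 1,212 come from the text; the per-arm size of the factorial trial assumes equal allocation across the four arms, which is an assumption of this sketch.

```python
# Figures reported in the text (80 % power).
two_arm_per_arm = 214
two_arm_total = 2 * two_arm_per_arm            # VM vs MISC, no E2
factorial_total = 1212                         # four arms, 50 % E2 efficacy

# Assumes equal allocation across the four arms.
factorial_per_arm = factorial_total / 4
inflation = factorial_total / two_arm_total    # relative increase in enrollment

print(two_arm_total, factorial_per_arm, round(inflation, 2))
```

Under these figures, adding the E2 question at 50 % efficacy nearly triples the required enrollment.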

The strengths of this study include the robust data we had available to use for outcome estimates after vaginal mesh and colpopexy surgeries [13], and the use of a modeling approach which allowed us to synthesize quantitative estimates of event rates (along with the uncertainty in those estimates and variability in follow-up) to use in sample size estimation.

Limitations of this study include our assumption that colpopexy outcomes reported by Diwadkar et al. could be extrapolated to represent those of minimally invasive colpopexy. We felt this was reasonable because the outcomes we included in the model (visceral injury, mesh erosion, and reoperation) are likely similar between the two approaches. This assumption would not have held had we been interested in outcomes such as wound complications or length of stay. Another limitation is that our outcome estimates were based on a systematic review that was published in 2009 and included published reports only through January 2008. Prospective studies of various methods of VM surgery and MISC have been published since that review, and outcome estimates incorporating these reports may differ from those we used here. However, we chose to limit our model input to those studies included in the well-documented meta-analysis by Diwadkar et al. [13] to ensure the validity of the outcome estimates and distributions. Another important limitation inherent in the study question itself is the near-complete lack of reliable data about the efficacy of estrogen to prevent mesh erosion. This forced us to test efficacy rates based on opinion alone. It seems likely, however, that the actual efficacy of estrogen for this purpose is lower, and if so, our calculated sample size estimates for the factorial study would only increase. Furthermore, mesh erosion is likely a multifactorial event involving infection, graft degradation, and other factors, and it is unclear how much of an effect estrogen has on each component contributing to the risk of erosion [14]. As with any modeling study (or any sample size estimation exercise), the degree to which our underlying assumptions differ from reality may affect the validity of our results.

While vaginal estrogen therapy is widely used and thought to aid in vaginal healing and prevention of operative complications, few data exist on the use of this therapy to prevent mesh erosion in the vagina [15]. There are numerous case reports and case series, largely from tertiary care centers, about surgical management of mesh complications [16–18]. In general, most women referred to a tertiary care center have already been prescribed estrogen cream, and so it is not known how often this treatment succeeds. This has not been rigorously studied. Vaginal contraction, apparently due to mesh shrinkage, is another complication reported after transvaginal mesh placement [19]. Symptoms include severe vaginal pain aggravated by movement, dyspareunia, and focal tenderness. Some experts believe that vaginal estrogen may also help to prevent this outcome; this, too, has not been studied. It is truly unfortunate that a commonly used technique to treat an important complication of prolapse surgery has so little supportive data. The recent US Food and Drug Administration (FDA) Safety Communication regarding complications of mesh used in prolapse surgery emphasizes the importance of a better understanding of prevention and management of mesh complications in our patients (http://www.fda.gov/medicaldevices/safety/alertsandnotices/ucm262435.htm).

We hope that this model may represent one method that can be used to aid productive investigation in these areas. Despite the above limitations, this approach has potential for further application in identifying research priorities and study design. By incorporating data or estimates related to costs and health-related quality of life, the model can be extended to perform value of information analysis, a technique which is being increasingly applied in other settings for research prioritization and study design efforts [20, 21].