Introduction

Of the 225,000 operations performed annually for prolapse in the USA [1], repairs involving the posterior vaginal wall are performed in 40–85% of cases [2]. Despite the obvious clinical importance of the posterior compartment, we lack evidence-based criteria for the diagnosis of rectocele. Dynamic magnetic resonance imaging (MRI) and x-ray defecography are increasingly being used for showing pelvic organ anatomy in ways that lend themselves to objective measurement. In our clinical practice, we see many women who have had a rectocele diagnosed on defecography, but who clinically do not have one. In looking at the existing cutoffs used to make this diagnosis as recommended by international standardization groups, we found that a 2-cm cutoff value for a distance from the rectocele to the mid-anal line for both MRI and defecography [3] referenced a paper that actually described using “resting position of the anorectal junction < 2 cm above the plane of the ischial tuberosities” for assessing pelvic floor descent [4].

A search for evidence-based cutoffs reveals important knowledge gaps and no consensus regarding which reference lines should be used, what cutoffs are appropriate, or even how rectocele size is best assessed. A search on PubMed of articles employing the phrases “Pelvic organ prolapse,” “Rectocele,” “Posterior Prolapse,” “Dynamic Magnetic Resonance Imaging/MRI,” and “Reference lines”—as well as their cited references—reveals a wide variety of measures (Fig. 1) that have been used to diagnose and quantify posterior vaginal wall prolapse. However, the lack of an objective comparison of performance characteristics (sensitivity/specificity) for the systems listed in Fig. 1 represents an important knowledge gap. In addition, evidence-based cutoff values for diagnostic purposes are generally lacking. Because up to half of women with documented rectocele do not have symptoms [5], it is not possible to use the presence or absence of symptoms as a gold standard for assessing diagnostic criteria. However, one method that might lend itself to comparing the efficacy of the measures in Fig. 1 is the statistic-based cutoff value [6]. This could provide an objective, evidence-based comparison of the assessment measures tabulated in Fig. 1, which could then help in evaluating current clinical practice.

Fig. 1

Diagram of reference lines and measurements in the different anatomical landmark systems on midsagittal magnetic resonance imaging at maximum Valsalva in a woman with remarkable posterior vaginal wall prolapse. a–h The existing eight reference lines from the literature; a–g the maximum perpendicular distance from the most protuberant point on the posterior vaginal wall vs. each reference line that was measured; h the distance from the most proximal point of the puborectalis to the most protuberant point; i the exposed vaginal length, measured from the point where the posterior vaginal wall separates from the anterior wall to the ventral tip of the perineal body (dark zone between the distal posterior vaginal wall and the front of the external anal sphincter). B = bladder; PS = pubic symphysis; R = rectum; Sac = sacrum; U = uterus (a); Ext. = outside of pubic symphysis (d); Int. = inside of pubic symphysis (e). All figures ©DeLancey 2017

In this study, we sought to evaluate the abilities of the measurement systems described in Fig. 1 to identify the presence and assess the size of posterior vaginal wall prolapse. This included a new measurement, the exposed vaginal length: the portion of the posterior vaginal wall from the point where the anterior and posterior walls lose their contact to the perineal body just ventral to the external anal sphincter, which our previous studies have suggested is a critical biomechanical factor in the formation of anterior vaginal wall prolapse [7,8,9]. We sought to (1) evaluate each system’s ability to distinguish subjects who did and did not have posterior vaginal wall prolapse, including establishing the appropriate evidence-based cutoff values, and (2) evaluate each system’s ability to assess rectocele size during maximal Valsalva. We therefore tested the null hypothesis that the measurement systems would not differ in their performance characteristics for evaluating posterior vaginal wall prolapse presence or size.

Materials and methods

Subjects

This is a secondary analysis of magnetic resonance images from an NIH-funded, IRB-approved case-control mechanistic cohort study at the University of Michigan comparing women with posterior vaginal wall prolapse with normal asymptomatic women (Institutional Review Board HUM00012823) [10]. Subjects in the posterior wall prolapse group (cases) were those who had distal posterior wall prolapse (POP-Q point Bp ≥ 1 cm beyond the hymen, with no anterior or apical compartment point below the hymen) and in whom this was the predominant element of the prolapse (the most dependent point of the POP-Q measurements). Subjects in the control group were women who were asymptomatic based on Pelvic Floor Impact Questionnaires, had negative full bladder stress tests, and had all vaginal points above the hymen on POP-Q examination. Women with prolapse were recruited from our urogynecology clinic and women with normal support from research volunteer registries and advertisements. In the case group, only women who could reproduce their maximal prolapse with Valsalva during the MR imaging study and whose POP-Q values on physical examination matched the degree of prolapse detected by MRI were included. We were able to identify 52 cases that met the inclusion criteria and had usable scans and 60 race-, age-, and parity-similar women in the control group who had adequate image quality.

Women with prior prolapse repair surgery, which could alter the pelvic floor anatomy, were excluded from the study. Women were also excluded if they had been pregnant within the last year or had factors that might place them at risk for infection such as a history of chronic steroid use, previous radiation to the pelvis, or having been immunocompromised. Women with a prior hysterectomy were included if the procedure had not been done for prolapse and if the prolapse occurred at least 2 years after surgery. Demographic and clinical information including age, body mass index, race, POP-Q measurements, and levator defect score was collected and compared across cases and controls (Table 1). We asked women about difficult defecation using the relevant questions from the Colo-Rectal-Anal Distress Inventory (CRADI-8): “Do you need to strain hard to have a bowel movement?” and “Do you feel you have not completely emptied your bowels at the end of a bowel movement?” Those who responded with significant bother (“moderately” or “quite a bit”) were considered symptomatic.

Table 1 Demographic overview

Magnetic resonance imaging

Each participant underwent rest, dynamic Valsalva, and three-dimensional stress magnetic resonance sequences, acquired as described previously [11]. Briefly, sagittal images were acquired while women were in the supine position during maximal Valsalva using a 3-T Philips Achieva scanner with a six-channel, phased-array coil. To define the vaginal lumen, 10 to 20 ml of ultrasound gel was inserted intravaginally. To assure that the maximal prolapse was achieved during MRI, four Valsalva efforts were made and captured, based on our previous study indicating that this many efforts were sometimes required to achieve maximal prolapse. One research team member reviewed the images, identified the one in which the rectocele protruded the most (usually the fourth) [12], and compared it with the POP-Q data to assure that the size of the prolapse was similar. The image with maximal protrusion was selected and used for data analysis. To achieve the full size of the rectocele, we used a maximal Valsalva effort to develop the prolapse, as is done during clinical examination, rather than a standardized Valsalva pressure. For the 3D stress imaging, the participants held the Valsalva for approximately 17 s with the prolapse protruding maximally while we obtained 14 sagittal images from one side of the pelvis to the other (repetition time range 1249–1253 ms, echo time 80 ms, 6-mm slice thickness, 1-mm gap, SENSE factor 4, number of signal averages 2, 320 × 178 voxels). In all, 52 cases and 60 controls were included in the study.

Measurement methods

The midsagittal magnetic resonance images from the effort with the largest prolapse from either the dynamic or stress sequences of both cases and controls were selected and measured with ImageJ (v1.51) using the eight existing measurement systems. Seven of these systems measured the perpendicular distance from a reference line to the most protuberant point on the posterior vaginal wall (Fig. 1a–g). One of the systems, anteroposterior prolapse diameter, measured the distance between the most proximal point of the puborectalis and the most protuberant point of the posterior wall (Fig. 1h). Exposed vaginal length (Fig. 1i) was also measured because of its mechanistic importance to prolapse [7,8,9]. We measured the exposed vaginal length on sagittal MRI at maximal Valsalva as the portion of the posterior vaginal wall from the point where the anterior and posterior vaginal walls lose their contact to the perineal body just ventral to the external anal sphincter. We define the perineal body on MRI as the location of the substance of the perineal body where the dense connective tissue created a dark region. Note that this was not located on the perineal skin or vaginal wall, which are often difficult to see in magnetic resonance imaging. All measurements were performed by two experienced raters.
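The perpendicular-distance measurement used by seven of the systems can be illustrated with a short sketch. This is not the ImageJ workflow used in the study; it only shows the underlying geometry, assuming that each reference line is defined by two landmark points in image (pixel) coordinates and that a pixel-to-centimeter scale is known. The function name, the `cm_per_pixel` parameter, and the coordinates are hypothetical.

```python
import math


def perpendicular_distance_cm(line_start, line_end, point, cm_per_pixel=1.0):
    """Perpendicular distance from `point` to the line through two landmarks.

    All coordinates are (x, y) pixel positions on the midsagittal image;
    `cm_per_pixel` is an assumed calibration factor, not a study value.
    """
    (x1, y1), (x2, y2), (px, py) = line_start, line_end, point
    dx, dy = x2 - x1, y2 - y1
    length = math.hypot(dx, dy)
    if length == 0:
        raise ValueError("reference line landmarks must be distinct")
    # |cross product| divided by the line length is the perpendicular distance.
    cross = dx * (py - y1) - dy * (px - x1)
    return abs(cross) / length * cm_per_pixel


# Hypothetical landmarks: two ends of a reference line and the most
# protuberant point on the posterior vaginal wall.
inner_pubis = (120, 200)
perineal_body_tip = (220, 260)
most_protuberant_point = (200, 300)
print(perpendicular_distance_cm(inner_pubis, perineal_body_tip,
                                most_protuberant_point, cm_per_pixel=0.05))
```

The sketch returns an unsigned distance; a signed variant (dropping `abs`) would be needed to record whether the point lies above or below the reference line.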

To evaluate the ability of those measurements to assess prolapse size, we first used expert opinion to determine the size of the prolapse as we felt POP-Q did not always reflect rectocele size. Three experienced clinicians with expertise in reading MRIs further scored the prolapse size from 0 to 5 by sorting all the scans into six subgroups (no prolapse = 0, smallest = 1, small = 2, medium = 3, large = 4, largest = 5) to be used in assessing estimates of rectocele size. Discrepancies in group assignment were resolved by consensus.

Statistical methods

Demographic characteristics for those with and without posterior vaginal wall prolapse were described with means, standard deviations, medians, and interquartile ranges. The primary measures that were compared across groups were: mid-anal line [3, 4, 13], internal anal sphincter line [6], hiatus line [14, 15], “perineal line-internal pubis,” a reference line from the inside of the pubic symphysis to the front tip of the perineal body [15, 16], “perineal line-external pubis,” a reference line from the outside of the pubic symphysis to the front tip of the perineal body [15, 16], mid-pubic line [15, 17,18,19,20], horizontal line [18, 21], anteroposterior prolapse diameter [22, 23], and exposed vaginal length [7,8,9]. Statistical differences between groups (cases and controls) were assessed with either simple linear regressions or the nonparametric test of medians, as appropriate. Cohen’s d effect sizes, receiver-operating characteristic curves, and the area under the curve were used to rank the ability of each measurement system to discriminate between groups. Receiver-operating characteristic curves were further used to determine a cutoff or threshold value at which those with and without posterior vaginal wall prolapse are optimally correctly classified as such. Measurement systems with a larger effect size and larger area under the curve were judged to be better at discriminating women with and without posterior prolapse. Rectocele size was first described by means and standard deviations within each size category. The association between each measure and expert-assigned rectocele size was assessed by r-squared values from simple linear regressions. Statistical significance was determined at α = 0.05. All statistical analyses were conducted in Stata version 14.1.
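The ranking statistics described above can be sketched in a few lines. The analyses in this study were run in Stata; the Python below is only an illustration under stated assumptions. It assumes the pooled-standard-deviation form of Cohen's d and uses Youden's J (sensitivity + specificity − 1) as the cutoff criterion, neither of which is specified in the text. All function names and the toy measurements are hypothetical.

```python
import math


def cohens_d_from_summary(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Cohen's d using the pooled standard deviation (assumed variant)."""
    pooled_var = ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / (n_a + n_b - 2)
    return (mean_a - mean_b) / math.sqrt(pooled_var)


def auc(cases, controls):
    """Area under the ROC curve as the share of case-control pairs in which
    the case has the larger value (ties count one half)."""
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


def youden_cutoff(cases, controls):
    """Threshold maximizing Youden's J; values >= threshold are called positive.

    Returns (threshold, J, sensitivity, specificity). Youden's J is an assumed
    criterion; the study does not state which optimality rule was used.
    """
    best = None
    for t in sorted(set(cases) | set(controls)):
        sens = sum(c >= t for c in cases) / len(cases)
        spec = sum(k < t for k in controls) / len(controls)
        j = sens + spec - 1
        if best is None or j > best[1]:
            best = (t, j, sens, spec)
    return best


# Effect size from the reported exposed vaginal length summary statistics
# (cases 4.8 +/- 2.3 cm, n = 52; controls 1.0 +/- 1.2 cm, n = 60).
print(round(cohens_d_from_summary(4.8, 2.3, 52, 1.0, 1.2, 60), 2))  # about 2.1

# Toy (hypothetical) measurements in cm, only to exercise the functions.
toy_cases = [1.8, 2.9, 3.5, 4.8, 5.2, 6.1]
toy_controls = [0.0, 0.5, 1.0, 1.2, 2.4]
print(auc(toy_cases, toy_controls))
print(youden_cutoff(toy_cases, toy_controls))
```

With the rounded summary values for exposed vaginal length, the pooled-SD form gives an effect size of about 2.1, in line with the figure reported in the Results, which suggests (but does not confirm) that this variant was used.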

Results

Groups were similar in terms of age, vaginal parity, body mass index, and levator defect scores (Table 1). By design, the POP-Q assessments were statistically different between groups. As expected, cases had a significantly higher degree of both anterior (point Ba) and posterior (points Ap and Bp) prolapse than controls (Table 1).

The cases also had larger prolapse size during maximum Valsalva (both median and interquartile range) compared with controls (Fig. 2) for all measurement systems using various reference lines on midsagittal magnetic resonance imaging. In the existing parameters (Fig. 2a–g), the perineal line-internal pubis (Fig. 2e) displayed less overlap than others between cases and controls, while of all nine systems, the new parameter—exposed vaginal length (Fig. 2i)—had the least overlap overall.

Fig. 2

Equivalent boxplots of each measurement in the different anatomical landmark systems for controls and cases. All data presented as centimeters (cm). Plots present the distributions and the extent of overlap the groups have for each parameter, with range shown as a box with the median (black center line), interquartile range (edge of box), and extreme values (whiskers). All figures ©DeLancey 2017

Table 2 provides the descriptive statistics for those with and without prolapse, along with a statistical comparison of the two groups. Cases had notably larger mean values than controls for our measures of interest. More variability was also observed within the cases. Large effect sizes (e.g., 0.83–2.12) were observed when investigating the magnitude of the difference between cases and controls.

Table 2 Statistical comparison between case and control groups

Figure 3 displays the receiver-operating characteristic curves for each measure and the associated area under the curve, which ranges from 0.72 to 0.95 and quantifies the discrimination between those with rectocele and normal women. Cutoff values derived from the receiver-operating characteristic curves are also presented (Table 2), indicating the threshold at which each measure optimally discriminates between cases and controls.

Fig. 3

Equivalent receiver-operating characteristic curves of each parameter. Parameters listed in order of decreasing area under the curve. All figures ©DeLancey 2017

Of the nine measurement systems, exposed vaginal length had a cutoff value for identifying posterior vaginal wall prolapse of 2.7 cm, was nearly 4 cm larger in cases (4.8 ± 2.3 cm) than controls (1.0 ± 1.2 cm), and exhibited the largest effect size (2.1) and area under the curve (0.95). Among the existing eight measurements, the perineal line based on the internal surface of the pubis had the highest sensitivity and specificity to identify posterior vaginal wall prolapse and showed that referenced to this line the most protuberant point in cases (1.4 ± 1.1 cm) was over 1 cm higher than in controls (0.2 ± 0.3 cm). It had a large effect size (1.6) and high area under the curve (0.9), with a cutoff value of 0.9 cm.

Table 3 shows the ability of each measurement to assess prolapse size as categorized by expert examiners. Each measurement was significantly associated with increasing prolapse size (all p < 0.05), but not all the parameters had the same ability to distinguish where, in the gradual change from normal to the largest prolapse, that change had become large enough to move from one size group to the next. For depicting prolapse size, the reference system that agreed most with expert opinion was the exposed vaginal length, followed by the perineal line-internal pubis and perineal line-external pubis, as evidenced by coefficients of determination (the proportion of variance in prolapse size explained by the measurements) of 0.77, 0.68, and 0.62 (p < .001), respectively.
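For readers less familiar with the statistic, the coefficient of determination from a simple linear regression equals the squared Pearson correlation between the two variables and has no units. The sketch below illustrates the calculation on made-up data; it is not the Stata analysis used in the study, and the function name and values are hypothetical.

```python
def r_squared(x, y):
    """R-squared of a simple linear regression of y on x.

    For a single predictor this equals the squared Pearson correlation,
    so it is the same whichever variable is treated as the outcome.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("both variables must vary")
    return sxy**2 / (sxx * syy)


# Hypothetical example: expert size category (0-5) vs. a measurement in cm.
expert_size = [0, 0, 1, 2, 3, 4, 5]
measurement_cm = [0.4, 1.1, 2.0, 3.2, 3.9, 5.5, 6.0]
print(r_squared(expert_size, measurement_cm))
```

A value closer to 1 means the measurement tracks the expert-assigned size category more closely.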

Table 3 Comparison of rectocele size

In response to the CRADI-8 difficult defecation question “Do you need to strain hard to have a bowel movement?”, 42% of women with posterior prolapse answered “moderately” or “quite a bit” vs. 7% of women with normal support (p < 0.001). For the question “Do you feel you have not completely emptied your bowels at the end of a bowel movement?”, 43% of the women with prolapse responded “moderately” or “quite a bit” vs. 6% of women with normal support (p < 0.001). Overall CRADI-8 scores for the two groups were 20.5 (CI 16.0–24.8) vs. 49.4 (41.6–57.3) (p < 0.001). Posterior wall prolapse size, however, did not differ between symptomatic and asymptomatic women with prolapse. Exposed vaginal length was 4.7 cm (SD 1.8) in symptomatic vs. 4.5 cm (SD 2.4) in asymptomatic women (p = 0.8). For the perineal line-internal pubis, measures were 1.4 cm (SD 0.96) and 1.5 cm (SD 1.3), respectively (p = 0.7).

Discussion

Of the nine reference systems for assessing posterior vaginal wall prolapse, exposed vaginal length and perineal line-internal pubis were the two best-performing parameters for diagnosing and measuring prolapse size. The exposed vaginal length measured on MRI assesses the amount of vaginal wall exposed to the pressure differential between abdominal and atmospheric pressure, so it has mechanistic as well as diagnostic significance [7,8,9]. In addition, it could also theoretically be measured during physical examination, making it clinically useful—though data from examination would need to be analyzed to find the optimal cutoff, because the landmarks used in MRI would not be the same. Of the established systems, rectocele size measured as the distance from the most protuberant point on the posterior vaginal wall to the perineal line, based on the inside of the pubic symphysis to the ventral tip of the perineal body, performs best.

Our study also provides evidence-based cutoff values. In previous publications on the existing systems, only one described a cutoff value based on statistical methods (receiver-operating characteristic curve), and that study used nulliparous women as controls [6], which might result in many parous women with normal support being identified as having abnormal support. Because exposing asymptomatic women to radiation raises ethical concerns, radiographic studies lack appropriate control groups. Three studies had only prolapse cases [20,21,22], two had only “controls” [13, 18], and three used asymptomatic nulliparas as controls [6, 16, 17]. This latter group, although demonstrating ideal anatomy, does not represent the anatomy seen in normal women after vaginal deliveries, so cutoffs based on this stringent criterion for normal would label many women as having a rectocele when their anatomy is actually normal for parous women. Our measurements locate the posterior vaginal wall rather than the lumen of the rectum, as would be measured with defecography; because of the thickness of the posterior vaginal and anterior rectal walls, our cutoff values would therefore be an overestimation if applied to defecography images quantified using the mid-anal reference line. Ethical limitations on radiating asymptomatic volunteers will probably preclude gathering similar data from asymptomatic women with normal support in x-ray studies.

Definitions and diagnostic criteria should be carefully evaluated. For example, a 2-cm cutoff value recommended by an international society for the distance a rectocele protrudes from the mid-anal line [3] cites a study that actually described the “resting position of the anorectal junction < 2 cm above the plane of the ischial tuberosities” as the cutoff [4]. Testing conditions also matter; thus, it would be appropriate when citing cutoffs from our study to limit them to straining in the supine position on imaging and not apply them to women having defecography in the sitting position.

It must be emphasized that we are not saying that magnetic resonance imaging and related measurements are always necessary or are superior to physical examination in the diagnosis of posterior wall prolapse. However, because imaging is increasingly being used, especially by non-obstetrician gynecologists, to diagnose and evaluate prolapse, it is important for women undergoing these studies to have evidence-based diagnostic cutoffs—and to know which systems perform best—to avoid unnecessary surgery.

This MRI study has several strengths. In a single cohort with a relatively large sample size of both asymptomatic multiparous women with normal support as determined by expert examination and women with clinician-confirmed posterior prolapse and normal support, it compares nine reference systems for posterior compartment assessment. It also provides cutoffs based on receiver-operating characteristic curve/area under the curve. Care was taken to make sure that women knew what to do before going into the scanner and all subjects attempted more than three Valsalva maneuvers in order to maximally develop the prolapse in the scanner [12].

Several limitations should be acknowledged when interpreting the results of this study. It is a study of posterior vaginal wall prolapse assessed with supine MRI and focuses on structural deformation of the posterior vaginal wall rather than the contour of the anorectum or common prolapse symptoms [5]. The purpose of this manuscript was to study the anatomical changes found in clinical evaluation of rectocele rather than the changes seen during defecation. We do not know whether these cutoff values would be appropriate for ultrasound or radiographic studies. All women were diagnosed clinically with rectocele or as having normal support. “Gray zone” individuals of uncertain status were not included, so this was not a population-based sample. This may somewhat overemphasize the differences between groups but should not substantially change the cutoff values. In assessing prolapse size without an existing “gold standard,” we relied on expert opinion that is admittedly subjective, but our examiners were experienced and the observations of three different people were used to establish our final assessment. Our observations were made in the supine position during maximal strain, not the seated posture, and we did not have women defecate in the scanner. This is similar to the way women are examined for prolapse, but is not the posture used for defecation. A total of 60 women with posterior vaginal prolapse had been recruited, and we selected 60 controls to be of similar age and parity; however, 8 of the cases had to be excluded because of motion artifact or scanner problems. These were not problems for the control group, because their selection had involved screening for adequate scans. As a result of these factors, we must consider the potential for bias, but we did not feel it would substantially affect our overall findings. 
Prolapse does not occur in neat groups, and although all our cases had posterior-predominant prolapse, there were varying degrees of anterior or uterine prolapse that might affect posterior vaginal protrusion during a study because of organ competition; this is the inherent nature of prolapse and not due to our study design.

This report systematically evaluated the effectiveness of current measurement systems in providing cutoff values for objective and evidence-based criteria for use in diagnosing posterior vaginal wall prolapse during supine magnetic resonance imaging. A simple new parameter, exposed vaginal length, demonstrates performance characteristics that are slightly better than the best existing measurement systems. This measure also has an advantage in that it could easily be adapted for use during physical examination as an additional objective method of assessing prolapse size alongside POP-Q. We believe that having evidence-based cutoff values specific for the selected measurement technique is essential to making progress. As imaging plays an increasingly important role in clinical management, having proper cutoffs will be important to avoid diagnosing women with a condition that they do not have or missing an important diagnosis.