Introduction

The POP-Q is currently the most objective, site-specific system for quantifying and describing pelvic organ prolapse [1]. The POP-Q system describes support to the anterior, posterior and apical aspects of the vagina by measuring the distance between the most dependent part of each point and the hymeneal ring. Points above the hymen are assigned a negative value, whereas points below the hymen are positive. Two external measurements, the genital hiatus and the perineal body, complete the evaluation.

After the introduction of the POP-Q, investigators reported good reproducibility of the measures [2, 3]. Patient position did, however, affect reproducibility in that the degree of pelvic organ prolapse was higher when women were examined in a birthing chair at a 45° angle, rather than in the dorsal lithotomy position [4]. Although the original description of the POP-Q system recommended that the type of vaginal retractor be recorded and that the patient position be specified, the effect of variation in these techniques has not been previously studied.

The National Institute of Child Health and Human Development (NICHD)-funded Pelvic Floor Disorders Network (PFDN) is a cooperative network of investigators from seven clinical academic centers and a Data Coordinating Center (DCC). The primary goal of the PFDN is to improve the level of knowledge about pelvic floor disorders in women, including pelvic organ prolapse, through clinical trials. The POP-Q was selected as the measurement tool to evaluate baseline and post-treatment anatomical findings. An initial survey of members of the PFDN revealed that although all believed they were performing the POP-Q in a standardized fashion, significant variability existed in potentially important aspects of the examination, including patient position, measurement of genital hiatus and perineal body at rest or strain, and whether a speculum was used. To ensure that anatomic outcomes are assessed in a consistent manner, and to improve the reliability of POP-Q measurements, the PFDN sought to evaluate the impact of specific technique variations on POP-Q measurements. The objectives of this study were to assess: 1) whether there were differences in the POP-Q measurements of Aa, Ba, C, D, Ap, Bp and TVL obtained with and without a speculum; 2) whether there were differences in GH and PB between rest and maximum strain; and 3) whether the point of maximum prolapse was different in the lithotomy position compared to standing.

Methods

Members of the PFDN performed POP-Q measurements as outlined in the standardization document [1] on a convenience sample of women presenting for care over a 2-week period. Institutional review board approval or exemption was obtained from each clinical site. All patients were initially examined in the supine lithotomy position after emptying their bladders. Postvoid residual urine was not removed by catheterization. A rigid measuring device, such as a marked swab or sound calibrated in centimeters, was used. The size and type of speculum used were not standardized. All internal points were measured both with and without a speculum in place. The posterior blade of the speculum was used to measure Aa, Ba, Ap and Bp. An intact speculum, posterior blade or, rarely, a small vaginal dilator was used to measure the apical points C and D and TVL. For the points measured without a speculum, the vagina was manually depressed to provide exposure for the measurements. The point of maximal prolapse was assessed with the patient in both the supine lithotomy and the upright positions by asking the patient to perform a maximum Valsalva effort. External points were measured both with and without maximal straining. Examiners indicated when they were unable to obtain an accurate measurement. Study site and subject age, weight and height were recorded. Body mass index (kg/m²) was calculated from height and weight. Women for whom all POP-Q points were complete except for point D were assumed to have undergone a hysterectomy. Stage of prolapse was determined based on measurements of maximal protrusion obtained without a speculum.
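The two derived quantities described above, body mass index and ordinal stage, can be sketched as follows. This is an illustrative sketch only, not the study's analysis code. The function names are hypothetical, and the stage thresholds are written as we understand them from the standardization document [1] (positions in cm relative to the hymen, negative above and positive below); they should be checked against that document before any use.

```python
from typing import Optional


def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index in kg/m^2."""
    return weight_kg / height_m ** 2


def popq_stage(aa: float, ba: float, c: float, ap: float, bp: float,
               tvl: float, d: Optional[float] = None) -> int:
    """Ordinal stage 0-4 from POP-Q points (assumed thresholds, see lead-in).

    d is None when point D was not measured, e.g. after hysterectomy.
    """
    apex = [p for p in (c, d) if p is not None]
    # Leading edge = most dependent (largest) of the measured points.
    leading = max([aa, ba, ap, bp] + apex)

    # Stage 0: no descent of the walls, apex within 2 cm of full length.
    if all(p == -3 for p in (aa, ba, ap, bp)) and any(p <= -(tvl - 2) for p in apex):
        return 0
    if leading < -1:          # more than 1 cm above the hymen
        return 1
    if leading <= 1:          # within 1 cm of the hymen, either side
        return 2
    if leading < tvl - 2:     # beyond +1 cm but short of near-complete eversion
        return 3
    return 4                  # at or beyond +(TVL - 2) cm
```

For example, `popq_stage(0, 0, -5, -2, -2, tvl=9)` returns 2 because the leading edge lies at the hymen.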

Univariate statistics were calculated for the demographic data. POP-Q measurements (taken with and without the use of a vaginal speculum; at rest or with maximal strain; and lithotomy or standing) were compared using Student’s t-tests, Pearson’s correlation coefficients, and repeated measures analysis of variance (ANOVA). ANOVA was used to determine the relationship between the POP-Q measurements and stage of prolapse, age and body mass index, controlling for variation between observers. In order to determine whether the effect of the vaginal speculum differed across examiners, an interaction term was included in the model. A weighted κ statistic was used to determine the level of agreement between the compartment of maximal prolapse in the lithotomy and standing positions.
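The within-subject comparisons described above can be illustrated with a short sketch. It assumes the t-tests were paired (each woman measured under both conditions), which the design implies but the text does not state explicitly. The analyses in this study were run in SAS; the Python below, its function names and its data are hypothetical and only show the arithmetic. No P value is computed, since that requires the t distribution.

```python
import math
from statistics import mean, stdev


def paired_t(x, y):
    """Paired t statistic and degrees of freedom for matched measurements."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


def pearson_r(x, y):
    """Pearson correlation coefficient between two matched series."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Invented TVL values in cm, NOT study data.
tvl_with = [8.5, 9.0, 7.5, 10.0, 8.0]
tvl_without = [8.0, 9.0, 7.0, 9.5, 8.0]

t_stat, df = paired_t(tvl_with, tvl_without)
r = pearson_r(tvl_with, tvl_without)
print(f"t = {t_stat:.2f} on {df} df, r = {r:.2f}")
```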

Results are expressed as mean ±SD. All P values less than 0.05 were considered statistically significant and all analyses were performed using SAS (SAS Institute Inc., Cary, NC).

Results

POP-Q examinations were performed on 133 patients by 16 examiners at seven clinical sites. Two subjects were excluded from all analyses because there were insufficient POP-Q data to calculate the stage of prolapse. For the remaining 131 subjects, mean age was 60.8±13.8 years (range 26–84), mean weight 73.3±17.1 kg (range 42.5–135), and mean body mass index (BMI) 28.0±6.7 kg/m² (range 15.8–63.9). Eight subjects (6%) were classified as stage 0, 30 (23%) as stage I, 62 (47%) as stage II, 27 (21%) as stage III, and 4 (3%) as stage IV. Because of the small sample sizes for stages 0 and IV, the patients were reclassified into three stage groups: stage 0/I (38, 29%), stage II (62, 47%) and stage III/IV (31, 24%).

Anterior and posterior vaginal wall measurements (Aa, Ba, Ap and Bp) did not differ when obtained with and without a speculum (Table 1). For all anterior and posterior positions, the correlations between measurements taken with and without a speculum were greater than 0.89. For both the anterior and posterior measurements, 94% of the values measured without a speculum were within 1 cm of the values measured with a speculum. Only 2.5% of the values were more than 2 cm apart, with no bias as to direction.
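The agreement summaries reported here (the proportion of paired values within 1 cm, and the proportion more than 2 cm apart in each direction) amount to simple counts over the paired differences. The sketch below shows one way to compute them; the function name, dictionary keys and example values are hypothetical and are not study data.

```python
def agreement_summary(with_spec, without_spec, tol_cm=1.0, large_cm=2.0):
    """Summarize paired differences (with speculum minus without), in cm."""
    diffs = [a - b for a, b in zip(with_spec, without_spec)]
    n = len(diffs)
    return {
        # Share of pairs that agree to within tol_cm.
        "within_tol": sum(abs(d) <= tol_cm for d in diffs) / n,
        # Share where the speculum value is more than large_cm greater.
        "larger_with": sum(d > large_cm for d in diffs) / n,
        # Share where the speculum value is more than large_cm smaller.
        "smaller_with": sum(d < -large_cm for d in diffs) / n,
    }


# Invented point Ba values in cm, NOT study data.
summary = agreement_summary([-3, -2, 0, 1, 3], [-3, -1, 0, -2, 2.5])
print(summary)
```

Comparing `larger_with` against `smaller_with` is what reveals whether large discrepancies show a directional bias.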

Table 1. Anterior, posterior and apical points: effect of measurements made with and without a speculum

We found no difference in C or D measurements obtained with or without a speculum (Table 1). For both the C and D measurements, the correlations between measurements taken with and without a speculum were greater than 0.90. For the C measurements obtained with and without a speculum, 75% of values obtained were within 1 cm of each other. Three percent of the values measured with a speculum were at least 2 cm greater than values measured without a speculum, whereas slightly over 5% of values obtained with a speculum were at least 2 cm less than values measured without a speculum.

There were only 52 measurements of D owing to the high rate of prior hysterectomies in this population. For the D measurements obtained with and without a speculum, 89% of values obtained were within 1 cm of each other. The remaining values were equally distributed in both directions.

Several of the patients for whom the difference between C measurements was greater than 2 cm did not have measurements for D. However, for those for whom D measurements were available, those measurements also often differed by more than 2 cm.

TVL measurements performed with a speculum were statistically significantly greater than without a speculum; however, the average difference was small (0.2 cm). The difference in TVL measurements taken with and without a speculum varied by hysterectomy status (P=0.04). For those subjects without a uterus, TVL was longer when measured with a speculum (8.3±1.5 cm vs. 8.0±1.5 cm, P=0.022), whereas for those with a uterus there was no significant difference in TVL when measured with and without a speculum (9.3±1.7 cm vs. 9.2±1.7 cm, respectively, P=0.55). Correlation between TVL measurements taken with and without a speculum was 0.82.

The relationships between genital hiatus (GH) and perineal body (PB) measurements performed during rest and maximal straining are summarized in Tables 2 and 3. The percentage difference between rest and maximal straining was similar for all stages (about 20%). The correlation between GH measurements performed during rest and during maximal straining was 0.86. For GH, 79% of values obtained at rest were within 1 cm of values measured with maximal straining. All the differences greater than 1 cm were in the same direction, i.e. with larger measurements at maximal strain. The correlation between PB measurements performed during rest and maximal strain was 0.93. Ninety-nine percent of the PB values obtained at rest were within 1 cm of values measured with maximal straining.

Table 2. GH and PB measurements at rest and with maximal straining
Table 3. Genital hiatus (GH) measurements at rest and on maximal straining, with stage 0–I, stage II and stage III–IV pelvic organ prolapse

We assessed the stage of prolapse both with and without a speculum in 127 women (Table 4). In 100 women (79%) the stage remained the same. For 26 patients (20%) their stage changed by one, with half of these subjects moving to a more severe stage group when measured with a speculum, whereas the other half moved to a less severe stage. The remaining subject moved two stage groups, from stage II to stage IV, when measured with a speculum rather than without.

Table 4. Cross-tabulation of stage measured with and without a speculum

Lastly, we compared the maximal degree of prolapse between standing and lithotomy positions. Mean maximal prolapse was significantly greater when standing than in lithotomy, overall and for each stage of prolapse, as shown in Table 5. The correlation between maximal prolapse measured in the standing and lithotomy positions was 0.86. The compartment that was maximally prolapsed in the supine position was the same as that maximally prolapsed standing 82% of the time (κ=0.72, CI=0.59, 0.84).
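The κ statistic above corrects the raw 82% agreement for the agreement expected by chance. A minimal sketch of the calculation follows. The weighting scheme used in the study is not stated, so the function takes an optional disagreement-weight function and defaults to unweighted κ; the linear weights in the example impose an arbitrary ordering on the compartments and are purely an assumption. All names and ratings are hypothetical, not study data.

```python
from collections import Counter


def cohen_kappa(a, b, categories, weight=None):
    """Cohen's kappa; weight(i, j) is a disagreement weight in [0, 1].

    With weight=None every disagreement counts fully (unweighted kappa).
    Undefined (division by zero) if chance disagreement is zero.
    """
    if weight is None:
        weight = lambda i, j: 0.0 if i == j else 1.0
    n = len(a)
    idx = {cat: k for k, cat in enumerate(categories)}
    pairs = Counter(zip(a, b))
    row, col = Counter(a), Counter(b)
    # Observed weighted disagreement across the paired ratings.
    observed = sum(weight(idx[i], idx[j]) * cnt / n
                   for (i, j), cnt in pairs.items())
    # Disagreement expected by chance from the marginal frequencies.
    expected = sum(weight(idx[i], idx[j]) * row[i] * col[j] / n ** 2
                   for i in categories for j in categories)
    return 1 - observed / expected


# Invented compartment of maximal prolapse in each position.
cats = ["anterior", "apical", "posterior"]
lithotomy = ["anterior", "anterior", "posterior", "apical", "anterior", "posterior"]
standing = ["anterior", "anterior", "posterior", "anterior", "anterior", "apical"]

unweighted = cohen_kappa(lithotomy, standing, cats)
linear = cohen_kappa(lithotomy, standing, cats,
                     weight=lambda i, j: abs(i - j) / (len(cats) - 1))
print(f"unweighted = {unweighted:.3f}, linear-weighted = {linear:.3f}")
```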

Table 5. Maximal extent of prolapse in lithotomy and on standing

BMI and age were not associated with differences in measurements taken with and without either straining or speculum. The differences between measurements taken with and without a speculum differed significantly by examiner. For measurements taken with a speculum, investigators had different mean values. As the range is bounded for many of the measurements, differences for high values measured with the speculum tend to go in the opposite direction from those with low values measured with a speculum; this may lead to the investigator interaction that we observed.

Discussion

Our data suggest that the outcome of the POP-Q examination varies with certain aspects of examination technique. This examination system, initially described in 1996 [1], was proposed as a tool for quantifying and describing pelvic organ prolapse. It has since been used in clinical research [5, 6] and has demonstrated good reproducibility [2, 3]. The system does not, however, dictate all aspects of examination technique. Examples of possible variations in technique include [1]: the position of the patient, the type of table or chair used, the type of speculum used, the type and intensity of straining, the fullness of the bladder, and the contents of the rectum.

Our results suggest that the use of specula does not affect most aspects of the examination. Only TVL differed significantly, and the difference (0.2 cm) is unlikely to be clinically significant. This effect on TVL was limited to women with a prior hysterectomy, in whom TVL appeared significantly longer when measured with a speculum. The use of the speculum did not significantly affect POP-Q measures at other vaginal sites. The overall stage of prolapse was affected by the use of the speculum in 20% of cases, but there was no pattern in either direction. In other words, the stage was as likely to be lower with a speculum as without one. The observed interaction between examiner and speculum may be due to the fact that the severity of prolapse differed between patients of different investigators. Alternatively, it may represent an inherent difference in examination technique among investigators. Therefore, our results suggest that for clinical studies of pelvic organ prolapse the results of the POP-Q examination are unlikely to be affected by whether or not a speculum is used. The observed variation between examiners suggests that each investigator should maintain a consistent technique when following subjects over time.

The POP-Q system does not specify whether the external measurements (GH and PB) should be made at rest or with straining. We observed statistically significant increases in GH and PB with Valsalva, which may represent widening of the levator hiatus with increased intra-abdominal pressure. It is not clear from our data whether the widening of GH and PB with Valsalva is a normal phenomenon or evidence of pelvic floor dysfunction. Because GH and PB at rest and with maximal straining may measure different aspects of pelvic floor function, we recommend that both be measured in clinical trials.

Another technical issue regarding the POP-Q examination is whether to use a standing examination to verify that the full extent of prolapse has been observed. Three criteria for the demonstration of maximal prolapse are suggested [1]: the protrusion is taut, traction causes no further descent, and/or the subject confirms that the examination has demonstrated the full extent of the prolapse. A standing examination is recommended if these criteria are not met. Patient position has been shown to affect POP-Q examination results. Specifically, examination at a 45° angle in a birthing chair demonstrates a higher degree of prolapse than a supine examination [4]. Our results suggest that the maximum extent of prolapse was best observed with the patient standing. Mean differences were small but statistically significant. In some cases, the difference between the standing and supine examination was as great as 6 cm. Such large differences could affect the management of individual patients. More importantly, the supine and standing examinations did not always agree with respect to identification of the most dependent portion of the prolapse. This could affect the selection of surgical approach and is therefore highly relevant to clinical management. Further study is needed to compare prolapse determination between the 45° supine and the standing positions.

Although this study is the first to evaluate the effects of using a speculum and of straining on POP-Q measures, other technique modifications may also be relevant. The potential effects of particular types or sizes of specula as well as rectal and bladder volume on POP-Q assessments need to be examined. Although a population-based sampling frame was not used in this study, we took advantage of the PFDN to recruit subjects quickly from a clinical sample.

Clinicians and researchers alike should understand how specific technique modifications may affect their POP-Q measures. Future work should focus on the standardization of technique as a means to improve the reliability of pelvic organ prolapse measurement.