Introduction

Pelvic floor disorders (PFDs) are a group of anatomic and functional disorders of the pelvic organs and their sexual and excretory systems. Specifically, PFDs include both functional and anatomic conditions that can result in urinary incontinence (UI), voiding dysfunction, pelvic organ prolapse (POP), fecal incontinence, defecatory dysfunction, and/or sexual dysfunction. PFDs are associated with poor quality of life (QOL) and compromised physical, social, and mental well-being [1, 2]. About one quarter of American women are estimated to have a symptomatic PFD. Higher prevalence is associated with increased age and parity [3], and consequently, as the population ages, PFD evaluation and treatment will likely become more common. It is projected that annual procedures performed for stress urinary incontinence (SUI) and POP will reach 310,050 and 245,970, respectively, by 2050 [4].

The proper method for measuring efficacy of treatment for PFD is a question of great interest for the pelvic floor surgeon. A substantial body of literature exists around this point, and in this chapter, the outcomes measures for SUI and POP will be reviewed.

Traditional methods for assessing treatment efficacy were the so-called objective outcomes measures, such as physical examination and pad weight. However, a growing body of PFD literature has suggested that these measures are insufficient and suboptimal for representing the symptom severity as experienced by patients, or the multidimensional impact of these symptoms on women [5]. A broad field of outcomes measures—for PFDs and in healthcare generally—has grown to address these gaps and continues to evolve.

Subjective measures, particularly patient-reported outcomes (PRO), play an increasingly important role in assessment of PFD treatment efficacy. Clinicians and researchers now recognize that the impact of patient goals and experience of morbidity on overall patient satisfaction is key to determining efficacy of what is often considered “lifestyle surgery.” Formally developed outcomes questionnaires measure disease burden and severity, and/or impact on quality of life. The individual measures, as well as the field of PRO and its standards of instrument development, have expanded and evolved over the last 15 years.

Despite the rapid expansion of available tools for the pelvic floor surgeon and researcher, there is no consensus on the optimal subjective outcome instruments. The multitude of PRO measures applied in the PFD literature limits interpretation and comparisons across studies. Each measure has its own strengths and weaknesses, broadness of scope, and focus on symptom burden versus life-impact. Overall limitations of the data on PRO measures include ambiguity in reporting, inadequate follow-up, lack of standard definitions for treatment success or failure, lack of sufficient power, and lack of complications data.

The goal of this chapter is to provide an overview of selected tools available to assess outcomes of SUI and POP procedures. A complete review of validated measures available for all PFDs is beyond the scope of this chapter.

Objective Outcomes Measures

Objective outcomes measures, sometimes called “anatomic” measures, are used to quantify or assess the severity of pelvic floor symptoms independent of patient or clinician perception. Anatomic outcomes are only a subset of these measures, and the term “objective” is somewhat of a misnomer, since some of the tests do involve clinician judgment or patient compliance and, in fact, most of these measures are not validated. However, these tools are included here for completeness since they are frequently reported outcomes.

Stress Incontinence

SUI is clinically diagnosed by patient history and exam, and often treatment may be initiated without further assessment. Objective demonstration is required as part of the diagnostic evaluation of SUI by the guidelines of the International Urogynecological Association (IUGA), International Consultation on Incontinence (ICI) and International Continence Society (ICS) [6, 7].

Stress Test, Urodynamics, Pad Test, Bladder Diary

Most commonly, demonstration of SUI is accomplished by the cough stress test (CST), which should be performed with a full bladder. If the CST is negative in the supine position, the maneuver is repeated with the patient standing. Urodynamics and abdominal leak point pressure can demonstrate and characterize SUI, and are often utilized in PFD surgical outcomes trials.

Adjunctive testing such as a 24-h pad test has been used primarily in the diagnostic evaluation. The pad test has demonstrated limited validity compared to self-assessment questionnaires. The 1-h pad test is a less accurate evaluation tool and is not recommended by the ICI. The 3-day bladder diary has been shown to be a feasible, reliable, valid measure [8] pre-intervention, though it may be difficult to report change after intervention given its complexity.

Bladder diaries and frequency-volume charts are highly recommended by the ICS as diagnostic tests to document micturition frequency, voided volumes, incontinence episodes, and pad use. Incontinence episodes, as recorded on a bladder diary, are an outcomes measure for SUI and overactive bladder (OAB), often used for OAB pharmaceutical trials. A recent industry-sponsored study proposed that reduction of incontinence episode frequency by at least 40–50% is necessary for patients to appreciate a change with therapy [9]. This threshold illustrates an important concept in outcomes research, the minimum clinically important difference (MCID).
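The arithmetic behind such a threshold can be sketched in a few lines of Python. This is an illustrative sketch only: the function names and the default 40% cutoff (the lower bound of the range quoted above) are assumptions made for the example, not a published algorithm, and both diaries are assumed to cover the same number of days.

```python
def percent_reduction(baseline_episodes, followup_episodes):
    """Percent reduction in incontinence episodes between two diaries
    of equal length (for example, two 3-day bladder diaries)."""
    if baseline_episodes <= 0:
        raise ValueError("baseline episode count must be greater than zero")
    return 100.0 * (baseline_episodes - followup_episodes) / baseline_episodes


def meets_threshold(baseline_episodes, followup_episodes, threshold_pct=40.0):
    """True if the reduction reaches the chosen threshold.
    threshold_pct is a hypothetical parameter; 40 is the lower bound
    of the 40-50% range quoted in the text."""
    return percent_reduction(baseline_episodes, followup_episodes) >= threshold_pct
```

For example, a patient who records 12 episodes at baseline and 6 at follow-up has a 50% reduction and would meet the threshold, whereas a fall from 10 to 7 episodes (30%) would not.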

Pelvic Organ Prolapse

Pelvic organ prolapse is a clinical diagnosis of urogenital organ descent or laxity. Treatment is generally reserved for symptomatic women. Traditional outcome measures for POP attempt to capture severity of maximum anatomic support defect and assess pelvic floor muscle function by physical exam.

POP-Q, S-POPQ and Pelvic Floor Muscle Strength

POP Quantification (POP-Q) staging was jointly developed and adopted by the ICS and AUGS as the standard, validated physical exam system utilized in the PFD literature [6, 10–12]. Six defined landmark points along the vaginal walls are measured in relation to the hymen during maximal straining. The vaginal length and the lengths of the perineal body and genital hiatus are recorded at rest. The exam should be performed with an empty bladder and rectum and in a position of the patient’s choice that she feels demonstrates maximal descent of her pelvic organs. It should also be noted that intraoperative exams or exams after pessary removal may alter the degree of POP. Barriers to clinical use of the formal POP-Q include its time-consuming nature and clinician unfamiliarity or confusion with the system. The simplified POP-Q (S-POPQ) has only four defined vaginal points but is still a valid system for documenting anatomic severity of prolapse, and likely is more representative of many physicians’ examinations. Nevertheless, the ICS and IUGA have established that objective measures, such as POP-Q, should be fully tabulated (not summarized) for proper documentation of outcome measures in surgical trials of POP [11].
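As an illustration of how the six points translate into an ordinal stage, the following Python sketch applies the staging cut-offs as they are commonly described for the POP-Q system (stage 0 to IV, based on the most distal point relative to the hymen and on total vaginal length). The cut-offs are not stated in this chapter, and the function and parameter names are assumptions made for the example. All values are in centimeters, with negative numbers above the hymen.

```python
def popq_stage(aa, ba, c, ap, bp, tvl, d=None):
    """Return the ordinal POP-Q stage (0-4) from point measurements in cm.

    aa, ba, ap, bp: anterior and posterior wall points
    c, d: apical points (d is omitted when there is no cervix)
    tvl: total vaginal length
    """
    wall_points = (aa, ba, ap, bp)
    apical_points = (c,) if d is None else (c, d)
    leading_edge = max(wall_points + apical_points)  # most distal point

    # Stage 0: all wall points at -3 and an apical point within 2 cm of TVL
    if all(p == -3 for p in wall_points) and any(p <= -(tvl - 2) for p in apical_points):
        return 0
    if leading_edge < -1:       # more than 1 cm above the hymen
        return 1
    if leading_edge <= 1:       # within 1 cm of the hymen
        return 2
    if leading_edge < tvl - 2:  # beyond +1 cm, short of complete eversion
        return 3
    return 4                    # essentially complete eversion
```

Because a stage collapses the measurements into a single number, a sketch like this is a summary aid only; it does not replace the full tabulation of points recommended for surgical trials.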

Urinary and Defecatory Function Tests, Imaging Studies

A full evaluation of POP includes measurements of prolapse impact on urine storage, micturition, sexual, and defecatory function. A post void residual is commonly performed to objectively rule out voiding dysfunction, and can be used as an outcome measure. Investigative trials and clinical evaluations may utilize other assessments of concurrent PFD including functional studies (urodynamics, anal manometry), imaging studies (ultrasound, cystogram, defecography, magnetic resonance imaging), pelvic floor muscle strength assessment, or bowel diary.

Subjective Outcomes Measures

Introduction to Validated Outcomes Measures

As the surgical PRO literature has evolved over the last 20 years, physicians and researchers have struggled with the quality of outcomes measure reporting. Reported “cure rate” varied depending on how outcomes were measured and did not correlate well with patient satisfaction or self-assessment of symptoms. Objective measures, like POP-Q or dry rate, did not comprehensively capture the full extent of symptom burden or multidimensional life impact of PFDs. Nor did the outcomes measures employed sufficiently reflect patient goals for surgery [5]. PFD researchers, and healthcare researchers in general, responded to this need by creating and improving validated PRO instruments.

Development of the rigorous, standardized scientific methodology for producing valid PRO instruments has occurred with the collaboration of epidemiologists, statisticians, psychologists, and physician scientists [13, 14]. This process ensures that the PRO is reliably measuring what it is intended to measure, and that the PRO is appropriate for use in the population under investigation. A comprehensive review of the process for developing validated PRO instruments is beyond the scope of this chapter, but can be found in the Report on the 5th International Consultation on Incontinence [15].

In brief, the process begins with patient input, such as focus groups or structured interviews, to produce concept maps. Identified themes undergo factor analysis by experts to develop and refine symptom burden or QOL questionnaires. The questionnaire’s psychometric properties are rigorously tested to establish internal consistency, reliability (reproducibility), validity (degree to which an instrument measures its aim) of content and construct, stability (over time), and sensitivity to change within the relevant population.
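Internal consistency is one of the few properties above that reduces to a short formula. A common statistic for it is Cronbach's alpha; the chapter does not name the statistic used for any particular instrument, so its use here is an assumption, and the function below is a minimal sketch rather than a validated psychometric tool.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents
    item_variances = [variance([r[i] for r in responses]) for i in range(k)]
    # Sample variance of each respondent's total score
    total_variance = variance([sum(r) for r in responses])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)
```

Values closer to 1 indicate that the items vary together, as expected when they measure a single underlying construct.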

The method of outcome measure development is important to clinicians and researchers using these measures because the instrument should be administered according to how it was validated. For example, a written questionnaire validated in English-speaking women with SUI is not valid or reliable if administered in another language, by telephone, to incontinent men, or to women with defecatory dysfunction, until studies have explicitly demonstrated the questionnaire’s validity within those populations.

A plethora of validated outcomes assessment instruments have been developed in response to the identified need to improve upon our ability to quantify the effects of PFD surgery. It behooves the clinician to be familiar with these instruments, since they vary in strengths, scope and applicability, and limitations with regard to approximating the true symptom burden, QOL impact, and patient goals or satisfaction. Furthermore, there is still no consensus on the optimal outcomes measures for POP or SUI surgery, nor is there consensus on establishing the MCID detected by these outcome measures [16].

A selected review of the subjective validated outcomes instruments used to evaluate SUI and POP surgery is presented below. Some non-validated outcomes measures important to the PFD literature are included for completeness and context. Validated outcomes instrument characteristics are summarized in Table 21.1.

Table 21.1 Summary of validated outcome instrument characteristics

Generic Measures

Generic outcomes measures are not specific to PFDs, and can be utilized for a number of health assessments. These instruments allow comparison of symptom distress or health impact across different health conditions. However, PFD researchers have shown that condition-specific QOL instruments are more responsive than generic QOL tools [17], and accordingly, these generic measures should not be used exclusively.

SF-36, PGI-I

The Short Form (36) Health Survey, SF-36, is a multidimensional short form health questionnaire that is designed to profile functional health and well-being, and assess outcomes of interventions. It yields eight subscales: physical functioning, role-physical, bodily pain, general health, vitality, social functioning, role-emotional and mental health. Depending on the subject of study, a researcher might prefer a weighted outcome of the SF-36: the Physical or Mental Component Summary Measures. The ICI recommends the more abbreviated form, SF-12, as a generic QOL measure. These questionnaires can estimate disease burden and compare disease-specific benchmarks with general population norms. While they are sensitive to change after intervention, the condition-specific PFD instruments may be even more responsive to change.

The Patient Global Impression of Improvement (PGI-I) scale is a 7-point Likert scale that measures patient perception of improvement or worsening of symptoms. The PGI-I is not specific to PFDs per se, but has been validated as a qualitative PRO in the POP population [18]. The Expectations, Goal setting, Goal achievement and Satisfaction (EGGS) mnemonic [19] provides a thorough guide for qualitative evaluation of satisfaction by emphasizing communication around patient goals and measuring how many pre-specified goals are achieved post intervention. Recent IUGA and ICS guidelines for reporting surgical outcomes recommend using a satisfaction scale in conjunction with a symptom scale [11].

Stress Incontinence

Subjective dry rate can be used as a measure of cure, though this is not a psychometrically validated measure. A number of formal PRO instruments have been developed for evaluation of incontinence and incontinence interventions. Of note, most of these instruments were developed in females with incontinence, but are not specific to SUI.

ISS

The Incontinence Symptom Severity Index (ISS) [20] is a commonly utilized validated PRO instrument to evaluate severity of female urinary storage and voiding symptoms. It has 8 items and was tested against bladder diaries, post void residual measurement, and pad tests.

UDI, UDI-6, IIQ, IIQ-7

The Urogenital Distress Inventory (UDI) and Incontinence Impact Questionnaire (IIQ) were rigorously developed together as the first female incontinence-specific PRO instruments in 1994 [21]. The UDI measures the degree to which symptoms associated with incontinence are troubling to women. It encompasses three symptom subscales including irritative, obstructive/discomfort, and stress. The IIQ assesses the impact of incontinence on various activities, roles, and emotional states. The same three subscales are utilized. The authors intended these measures for paired use in order to provide detailed information on the effect of incontinence on health-related QOL. Short forms of each instrument were validated shortly thereafter (i.e. the UDI-6 and IIQ-7), facilitating adoption in the clinical setting. Statistical analyses using more contemporary validation techniques have confirmed the test-retest reliability of these foundational PRO questionnaires.

ICI Modular Questionnaires (ICIQ)

The ICI modular Questionnaires (ICIQ) are a collection of validated PRO instruments adopted by the ICI examining various aspects of incontinence and voiding dysfunction [22]. The focus of each instrument is slightly different, including measures of such variables as symptom burden, QOL impact, and impact on sexual function. All modules hold a “Grade A” highly recommended assessment by the ICI for the quality of the instrument’s published psychometric testing. The ICI questionnaires related to SUI are outlined below.

The ICIQ-urinary incontinence (ICIQ-UI) short form was developed and validated according to the strict methodology outlined above for use in outcomes and epidemiological research. It is a brief and robust three-item questionnaire assessing the prevalence and frequency of UI and its impact on everyday life, with a fourth unscored item for self-diagnosis of the perceived cause of UI [23]. The ICIQ-Lower Urinary Tract Symptoms Quality of Life (ICIQ-LUTSqol) module is derived from the King’s Health Questionnaire [24], which was appropriately validated as a condition-specific QOL assessment for women with incontinence; the ICIQ-LUTSqol, by contrast, can be used for men and women. It was designed as an outcomes measure for clinical trials and does not incorporate any symptom scales.

The ICIQ-Urinary Incontinence Symptoms Quality of Life (ICIQ-UIqol) module was derived from the I-QOL [25], which was developed as an incontinence QOL measure for use in clinical trials. After a rigorous development process, the I-QOL was validated in women with both stress and mixed urinary incontinence, against other QOL questionnaires as well as objective measures such as the pad weight test. The authors of the I-QOL further published an MCID of 2–5% in association with those measures. The ICIQ-UIqol is intended for use in both genders, and scores reflect symptom impact on QOL.

The ICIQ-Female Lower Urinary Tract Symptoms (ICIQ-FLUTS) and ICIQ-FLUTS Long Form modules are both derived from the Bristol Female Lower Urinary Tract Symptoms (BFLUTS) questionnaire, which was developed as an instrument for symptom severity (especially incontinence and impact on sexual function), QOL impact, and evaluation of treatment outcome. The validation was performed in women undergoing urodynamic assessment. The ICIQ-FLUTS is a short, 12-item symptom impact scale for incontinence as well as dysfunctional voiding symptoms, while the corresponding Long Form module has 18 items and no scoring system. The ICIQ-FLUTS Long Form provides a detailed summary of the level and impact of urinary symptoms for outcomes assessment.

The sexual impact of incontinence is not well captured in traditional incontinence impact or QOL instruments. The ICIQ-Female Sexual Matters associated with Lower Urinary Tract Symptoms (ICIQ-FLUTSsex) module is also derived from the BFLUTS questionnaire. It contains four items to assess sexual dysfunction and impact of urinary symptoms (not restricted to the UI population).

PISQ, PISQ-12, PISQ-IR

The Pelvic Organ Prolapse/Urinary Incontinence Sexual Questionnaire (PISQ) is a 31-item condition-specific validated assessment instrument with the specific focus of evaluating sexual function in women with POP or UI. The questionnaire was designed by expert review and extrapolation from existing instruments, and then validated against other validated outcome assessments (both condition-specific measures and generic sexual health measures) in women with POP or incontinence. A short form instrument, the PISQ-12, was later developed based on the original dataset. Most recently, the questionnaire was updated and published as the PISQ, IUGA-Revised (PISQ-IR) with extensive validation and psychometric testing (including patient input toward concept gaps with cognitive interviews) for use in sexually active and inactive women with PFDs [26]. The PISQ-IR items better capture symptom impact than prior iterations.

Pelvic Organ Prolapse

PFDI, PFIQ, PFDI-20, PFIQ-7

The Pelvic Floor Distress Inventory (PFDI) and Pelvic Floor Impact Questionnaire (PFIQ) are condition-specific QOL instruments for POP [27]. They were developed by extrapolating the structure and content of the UDI and IIQ onto multiple domains related to POP. The resulting PFDI questionnaire included three scales to cover symptom distress: the UDI, a POP Distress Inventory (POPDI), and a Colorectal-Anal Distress Inventory (CRADI). Similarly, the PFIQ was expanded into three corresponding scales assessing symptom impact. The PFDI and PFIQ were then validated in women with subjective complaints of a vaginal bulge using objective and subjective measures. Short form instruments, the PFDI-20 and PFIQ-7, were later validated from the original dataset. The subscales of the PFDI are also useful clinically and in research given their narrower scope, though only the UDI and UDI-6 have been independently validated (albeit with slightly different scoring).
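To make the scale structure concrete, the sketch below scores the PFDI-20 as its scoring is commonly described: each item is answered 0 to 4, each of the three scales is the mean of its answered items multiplied by 25 (range 0 to 100), and the summary score is the sum of the three scales (range 0 to 300). These rules are not given in this chapter and should be checked against the instrument's published scoring instructions; the function names and the use of `None` for a skipped item are assumptions made for the example.

```python
def scale_score(item_responses):
    """Score one PFDI-20 scale: mean of answered items (0-4) times 25.
    A skipped item is represented by None and left out of the mean."""
    answered = [r for r in item_responses if r is not None]
    if not answered:
        raise ValueError("no answered items in this scale")
    if any(not 0 <= r <= 4 for r in answered):
        raise ValueError("item responses must be between 0 and 4")
    return 25.0 * sum(answered) / len(answered)


def pfdi20_summary(popdi6, cradi8, udi6):
    """Summary score (0-300): the sum of the three scale scores."""
    return scale_score(popdi6) + scale_score(cradi8) + scale_score(udi6)
```

Scoring each scale separately before summing keeps the narrower scale scores available for reporting alongside the summary score.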

ICIQ-VS

The ICIQ-Vaginal Symptoms (ICIQ-VS) is a 14-item ICI modular questionnaire for assessment of severity and impact of vaginal symptoms and related sexual matters, as well as evaluation of treatment efficacy in women with POP [28]. The instrument was developed through expert consultation and included structured interviews with patients. The questionnaire was validated in an outpatient population that included a group of controls without POP. It is intended for clinical use and epidemiologic research. An ICI QOL module for vaginal symptoms is under development.

POP-SS

The Pelvic Organ Prolapse Symptom Score (POP-SS) is a seven-item scale developed through expert consultation, modeling on existing ICIQ instruments, and qualitative patient interviews [29]. An eighth item elicits the patient’s identification of her most bothersome symptom. The POP-SS was then administered alongside validated instruments in a study that included women at risk for POP, presenting with POP symptoms, and undergoing surgery for POP. Post hoc validation studies were then performed, and an MCID for the measure was established (a decrease of 1.5 points on the 0–28 score range).
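A minimal Python sketch shows how an MCID of this kind might be applied to an individual patient's scores. The per-item range of 0 to 4 is an assumption inferred from seven items and a 0 to 28 total; the function names are hypothetical.

```python
def pop_ss_total(items):
    """Total POP-SS score (0-28) from seven item responses.
    Assumes each item is scored 0-4, inferred from the 0-28 range."""
    if len(items) != 7:
        raise ValueError("expected seven item responses")
    if any(not 0 <= i <= 4 for i in items):
        raise ValueError("item responses must be between 0 and 4")
    return sum(items)


def meets_mcid(baseline_total, followup_total, mcid=1.5):
    """True if the score fell by at least the MCID (lower scores = fewer symptoms)."""
    return (baseline_total - followup_total) >= mcid
```

With whole-number totals, a 1.5-point MCID means in practice that a drop of at least 2 points is needed to count as a clinically important improvement.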

P-QOL

The prolapse quality of life questionnaire (P-QOL) assesses severity of symptoms and QOL impact in women with urogenital prolapse [30]. The P-QOL was developed based on expert consultation, literature review, and patient interviews. The 20-item questionnaire includes multidimensional assessment of symptom impact on life, relationships, sleep, emotions, and other items. It was administered to symptomatic and asymptomatic women presenting to a gynecology clinic. Validation statistics were performed, and a strong correlation was demonstrated with POP-Q findings.

Given the multidimensional nature of POP and its ability to impact the urinary, sexual, and defecatory organs, validated measures assessing these particular domains of POP are frequently used in the surgical literature. A limited review of available instruments is included herein for completeness.

OAB-q

Outcomes instruments for voiding dysfunction, such as the Overactive Bladder Questionnaire (OAB-q) [31], may be utilized in the POP literature. The OAB-q (and its 19-item short form) is a measure of symptom bother and QOL impact that was validated in women and men. A plethora of other validated instruments in the OAB literature may be applicable to POP-related voiding dysfunction [15], but are beyond the scope of this discussion.

FSFI

The Female Sexual Function Index (FSFI) was developed by experts and then validated in women with female sexual arousal disorder, as well as matched controls [32]. It is a comprehensive assessment with six different domains. This instrument has not been validated in the POP or SUI population, though it has been used as a sexual function measure in validation studies of the PISQ-IR, for example. The PISQ instruments were validated in the POP population, and are discussed in detail above.

Wexner, Vaizey Scales

The Wexner Continence Grading Scale and Vaizey Severity Score [33] are standardized instruments that capture severity of fecal incontinence symptoms. These instruments were not developed via the psychometric PRO methodology described above, but they were validated against clinical assessments and scored 28-day bowel diaries.

FIQL, ICIQ-B

The 29-item Fecal Incontinence Quality of Life (FIQL) is a validated QOL instrument containing four subscales: Lifestyle, Coping/Behavior, Depression/Self-Perception, and Embarrassment [34]. The ICIQ-Bowel (ICIQ-B) is a comprehensive, validated symptom severity and QOL measure for fecal incontinence, recently updated for the ICI modular Questionnaires [35]. It contains 17 scored items in three domains (Bowel Pattern, Bowel Control, and QOL), as well as unscored items on other symptoms and sexual impact. The ICIQ-B was recently validated in American English and in an electronic web-based form [36]. Further outcomes measures for bowel dysfunction related to POP are under development by the ICI.

Composite Outcomes

Despite the proliferation of PFD-specific validated outcomes measures for symptom severity and QOL impact, consistent reporting of surgical outcomes, and thus uniform assessment of interventions, remains a challenge. Furthermore, the quality of reporting of randomized clinical trials in the PFD literature has recently been challenged for failing to comply with numerous methodological standards [37].

The ICI has suggested that POP and incontinence surgery studies should report subjective, objective, and QOL outcomes to address this need for quality comprehensive assessments that can be compared between trials [6, 15]. At this point, there is no gold standard measure for evaluating success of anti-incontinence procedures or prolapse repairs. Current thinking suggests that each component of multidimensional outcomes data will contribute meaningfully to the overall comprehension of patient well-being. Thus, researchers are supporting the idea of using composite outcomes measures. Pelvic floor researchers will need to thoughtfully select their outcomes measures, and balance and refine the components of these composite measures so that the collection of a large quantity of data does not drive the measures to be too broad in scope to capture important changes in the patient experience.

IUGA and the ICS have released a joint report on terminology for reporting surgical outcomes in POP [11]. Beyond terminology, specific guidelines were outlined for methodology (power calculations, avoiding bias, following established research guidelines such as Consolidated Standards of Reporting Trials, or CONSORT). Regarding choice of measures, IUGA and the ICS propose that outcomes reporting of POP surgery should include validated PRO questionnaires (guided by the SMART criteria: specific, measurable, appropriate, realistic, timely), a qualitative patient satisfaction scale, appropriate and fully validated quality of life instruments, specific reporting of objective outcomes, timelines, cost analysis, secondary PFD outcomes, and complications reporting [38].

Conclusions

The outcomes measures available for reporting on SUI and POP procedures have rapidly expanded in the last 20 years. Familiarity with PRO instrument validation methods and characteristics of available instruments will assist the clinician and researcher in selecting and reporting appropriate and valid outcomes measures.

Future PFD outcomes research should focus on optimization (in terms of reflecting patient preferences and the ability to define and detect MCID), standardization, reaching consensus on definition of surgical success, and better applying outcomes measures in both the research and clinical setting.