Introduction

Pelvic floor disorders (PFDs) cover a range of conditions, most notably pelvic organ prolapse (POP), urinary incontinence (UI), and fecal incontinence (FI). Although not life threatening, these conditions greatly impair women’s quality of life (QoL). In developed countries, PFDs represent one of the most common indications for gynecologic surgery. Studies from Sub-Saharan countries show substantial variation in PFD prevalence, with mean prevalence of POP at 28.7% (range 2.8–70.8%), UI at 19.7% (range 3.4–56.4%), and FI at 6.9% (range 5.3–41.0%) [1]. Data from Ethiopia on the prevalence of PFDs are limited. A cross-sectional study of 395 women in Dabat, northern Ethiopia, showed a significantly lower prevalence of PFDs when screened by questionnaire than studies from other Sub-Saharan countries. In the Dabat study, the prevalence of UI, POP, and FI was 7.8%, 6.3%, and 0.5%, respectively. Despite these low numbers, clinical examination in the same population detected stage II–IV prolapse in 162 (55.1%) of the women who underwent pelvic examination. The screening questionnaire used in that study, adapted from the Norwegian Epidemiology of Incontinence in the County of Nord-Trøndelag (EPINCONT) study, demonstrated poor sensitivity (38.3%) in detecting symptomatic prolapse, even in cases of clinically relevant prolapse (stage III or IV) [1].

The Pelvic Floor Distress Inventory (PFDI) and Pelvic Floor Impact Questionnaire (PFIQ) are validated in women with PFDs [1]. Developed in English [2], these companion questionnaires are widely used to provide an accurate measure of symptom bother related to prolapse and urinary and colorectal–anal symptoms. Short versions of the forms—PFDI-20 and PFIQ-7 [3]—are consistent with full versions. Both have been adapted and validated in French, Swedish, Chinese, Arabic, and Turkish, as well as for Spanish-speaking people in the USA and Spain [4,5,6,7,8,9,10,11,12]. Currently in Ethiopia, validated instruments to aid in the clinical assessment of women with PFDs do not exist, and the prior questionnaire used in the Dabat study performed poorly. We aimed to translate and adapt the English versions of the PFDI-20 and PFIQ-7 to Tigrigna, the third most common language in Ethiopia, and test the validity and reliability of the translated questionnaires.

Materials and methods

The PFDI-20 consists of 20 condition-specific questions examining pelvic symptoms. Each question asks whether an individual symptom is present and, if so, how bothersome it is on a 4-point scale. The PFDI-20 contains three subscales: Pelvic Organ Prolapse Distress Inventory (POPDI), Colorectal–Anal Distress Inventory (CRADI), and Urinary Distress Inventory (UDI). The PFIQ-7 consists of 21 condition-specific items, seven questions answered for each of three subscales, about the impact of pelvic symptoms on daily life. Its subscales are the Urinary Impact Questionnaire (UIQ), Colorectal–Anal Impact Questionnaire (CRAIQ), and Pelvic Organ Prolapse Impact Questionnaire (POPIQ).
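The scoring rule is not spelled out in this section. The sketch below assumes the commonly published scoring of the short forms: each subscale score is the mean of its answered items scaled to 0–100, and the total is the sum of the three subscale scores (0–300). The item ranges (0–4 for the PFDI-20, 0–3 for the PFIQ-7), the function names, and the example respondent are assumptions for illustration, not details taken from this study.

```python
from statistics import mean


def subscale_score(responses, max_item_score):
    """Scale the mean of answered items to 0-100; None marks a skipped item."""
    answered = [r for r in responses if r is not None]
    if not answered:
        raise ValueError("no answered items in subscale")
    return mean(answered) * 100 / max_item_score


def pfdi20_total(responses_by_subscale):
    """Sum of three 0-100 subscale scores (items assumed scored 0-4)."""
    return sum(subscale_score(r, 4) for r in responses_by_subscale.values())


def pfiq7_total(responses_by_subscale):
    """Sum of three 0-100 subscale scores (items assumed scored 0-3)."""
    return sum(subscale_score(r, 3) for r in responses_by_subscale.values())


# Hypothetical respondent, for illustration only.
pfdi = {
    "POPDI": [4, 3, 0, 2, 4, 1],
    "CRADI": [0, 0, 1, 0, 2, 0, 0, 1],
    "UDI": [3, 2, 2, 0, 1, 4],
}
print(round(pfdi20_total(pfdi), 1))
```

Under this scoring, totals on both instruments range from 0 to 300, which is the scale on which the mean total scores in the Results are reported.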

In our study, English versions of the PFDI-20 and PFIQ-7 were independently translated into Tigrigna by two medical experts familiar with survey methodology and medical issues surrounding PFDs and who were also fluent in English and Tigrigna. Translated versions were harmonized (Version 1), and women were then recruited to participate in cognitive interviewing. This group comprised women self-identifying as having or not having PFDs. Cognitive interviewing consisted of discussing the meaning behind questions as written and comparing any alternative wording that arose during translation of Version 1. Following cognitive interviewing, a revised version of the Tigrigna questionnaires was developed and back-translated into English by a bilingual, independent, professional translator. Discrepancies in the back-translation were corrected in Version 2. In a final step, Version 2 and the original English version were compared to ensure concepts in the original questionnaire were present in the final version.

Following translation, a cross-sectional study was conducted at five hospitals in Tigray: Ayder, Mekelle, Adwa, Maichew, and Aksum hospitals. Women with and without PFDs were recruited at gynecology outpatient clinics between January 2015 and January 2016. Women without PFDs were included after physicians determined no PFD or related symptoms were present. Women <18 years of age, those with mental illness, pregnant women, and those who had undergone urogynecologic surgery within 6 months preceding the study were excluded.

The study protocol was approved by the Institutional Ethics Review Committee of Mekelle University’s College of Health Sciences. All participants gave written informed consent. Sociodemographic data and clinical variables, such as age, history of pregnancy, clinical diagnosis of PFD, type of complaint, and severity of UI or anal incontinence (AI), were obtained from medical charts and direct interview with participants. POP stage according to the Pelvic Organ Prolapse Quantification system (POP-Q) was determined by physician examination. Severity of UI was defined as the self-reported frequency of urine leakage (less than once a month, more than once a month, more than once a week, more than once a day) in women whom physicians diagnosed with UI after physical examination. AI severity was defined by self-report (gas, stool, both gas and stool). FI was defined as at least one symptom of incontinence of flatus or stool (only flatus, only loose stool, only normal stool, or the combination of flatus, loose, and normal stool).

Reliability of the translated questionnaires was evaluated by reproducibility (test–retest reliability) over a 1-week interval (mean 8 days), using intraclass correlation coefficients (ICC) and Bland–Altman analysis [13] for total PFDI-20 and PFIQ-7 scores and their subscales. Bland–Altman analysis was used to describe agreement between paired measurements by plotting the average score of the two tests (x-axis) against the difference in scores (y-axis). Cronbach’s alpha was used to determine internal consistency of the full PFDI-20 and PFIQ-7 questionnaires and their subscales. Scores are expressed as mean ± standard deviation (SD).
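The two simpler reliability measures named above can be sketched as follows. This is a minimal illustration, not the SPSS procedure used in the study: the function names and all data are hypothetical, Cronbach's alpha uses the usual item-variance formula, and the Bland–Altman limits are taken as bias ± 1.96 SD of the paired differences. The ICC is not covered here because the ICC model used is not stated.

```python
from statistics import mean, stdev, variance


def cronbach_alpha(items):
    """items: one list of scores per questionnaire item, same respondents in each."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    item_var = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_var / variance(totals))


def bland_altman(first, second):
    """Return per-pair means (x-axis), differences (y-axis), bias, and 95% limits."""
    means = [(a + b) / 2 for a, b in zip(first, second)]
    diffs = [a - b for a, b in zip(first, second)]
    bias = mean(diffs)
    half_width = 1.96 * stdev(diffs)
    return means, diffs, bias, (bias - half_width, bias + half_width)


# Hypothetical data: three items answered by five respondents.
items = [[0, 1, 2, 3, 4], [0, 2, 2, 3, 4], [1, 1, 2, 4, 4]]
print(round(cronbach_alpha(items), 3))

# Hypothetical total scores at test and retest for five respondents.
test_scores = [150, 120, 90, 200, 60]
retest_scores = [148, 125, 88, 196, 63]
_, _, bias, limits = bland_altman(test_scores, retest_scores)
print(bias, limits)
```

A bias close to 0 with most differences inside the limits is the pattern that indicates good test–retest agreement.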

Sample size was determined a priori, with 88 women with PFD providing >80% power at 5% significance to detect Spearman’s correlation coefficients of ≥0.4 between PFIQ-7 total/subscale scores and the severity of PFD (POP-Q stage, frequency of UI, type of AI). A two-sample comparison between women with and without PFD at 80% power and 5% significance required 23 women in each group for known-groups validity. The Kaiser-Meyer-Olkin Measure of Sampling Adequacy was computed across all 41 questions, yielding a score of 0.95, close to the desirable value of 1. All statistical analyses were performed using SPSS 20.0.
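The power statement for the correlation analysis can be checked approximately. The sketch below assumes the Fisher z approximation for a two-sided test of a Pearson correlation; the paper does not state which method was used, and Spearman's coefficient has a slightly larger sampling variance, so the figures are indicative only. Function names are hypothetical.

```python
from math import atanh, sqrt
from statistics import NormalDist


def power_for_correlation(r, n, alpha=0.05):
    """Approximate two-sided power to detect correlation r with n pairs (Fisher z)."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    noncentrality = atanh(r) * sqrt(n - 3)
    # The negligible contribution of the opposite tail is ignored.
    return NormalDist().cdf(noncentrality - z_crit)


def n_for_correlation(r, power=0.80, alpha=0.05):
    """Smallest n reaching the requested power under the same approximation."""
    n = 4
    while power_for_correlation(r, n, alpha) < power:
        n += 1
    return n


print(n_for_correlation(0.4))
print(round(power_for_correlation(0.4, 88), 3))
```

Under this approximation, 88 women give well above 80% power to detect a coefficient of 0.4, which is consistent with the statement above.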

Results

Face validity was established through translation/back-translation and cognitive review; ten women participated in the cognitive interviewing, five with and five without PFDs. Experts incorporated some changes to words or phrases after primary translation to ensure the final version was appropriate to the culturally diverse population in Tigray. For example, “anus” was translated as “bowel”. Questionnaires were completed by an additional 118 women not included in cognitive interviewing. Fifty women were reinterviewed 1 week later (mean 8 days) to assess test–retest reliability. Average time for the interview was 14 min and for the clinical exam 12 min. All participants answered all questions. Mean age of the women was 49 ± 10 years; mean parity was 6 ± 3. Of 118 women, 88 were diagnosed with POP and 75 with POP-Q stage III or IV. Thirty had no symptoms of PFDs. Among women with POP, 52 had symptoms of UI and 25 had AI. Of the 52 women with UI, 12 had stress (SUI), 12 had urgency (UUI), and the remaining 28 had mixed UI. Among the 25 women with AI, 11 had incontinence of flatus only, three had fecal incontinence (FI) only, and the remaining 11 had both. Mean scores of the total PFDI-20 and PFIQ-7 were 152.1 ± 59.0 and 167.8 ± 71.1, respectively, at baseline in women with PFDs. In women without PFDs, mean scores were 6.3 ± 5.7 and 8.1 ± 8.9 for the PFDI-20 and PFIQ-7, respectively (Tables 1, 2, 3, 4 and 5).

Table 1 Patient characteristics (n = 118)
Table 2 Pelvic floor disorders (n = 118)
Table 3 Pelvic floor disorders at baseline
Table 4 Reliability of the PFDI-20 and PFIQ-7
Table 5 Known-groups validity: comparison of total PFDI-20 and PFIQ-7 scores with subscale scores between women with and without PFDs

Translated PFDI-20 and PFIQ-7 questionnaires showed high internal consistency, with Cronbach’s α coefficients of 0.930 for PFDI-20 (range 0.891–0.897 for subscales) and 0.956 for PFIQ-7 (range 0.909–0.942 for subscales). Test–retest ICC was 0.941 for PFDI-20 and 0.916 for PFIQ-7. All values were statistically significant (p < 0.001). ICC values >0.70 indicate good reliability [14]. Among women who were retested, Bland–Altman analysis showed that the differences between the first and second scores of the total PFDI-20, PFIQ-7, and subscales were not significantly different from 0 and largely fell within the range of 0 ± 1.96 SD. Total PFDI-20 and PFIQ-7 and all subscale scores were higher in women with POP (p < 0.01, Table 6), especially in the POPDI-6 (62.5 ± 21.7 vs 3.0 ± 4.6; p < 0.001) and UIQ (67.3 ± 28.3 vs 2.9 ± 4.3; p < 0.001). Total PFDI-20, PFIQ-7, and all subscale scores were higher in women with than without UI and AI (p < 0.01).

Table 6 Construct validity: correlations of total and subscale scores of PFDI-20 and PFIQ-7 with measures of clinical severity of PFDs

Spearman’s correlation coefficients were 0.51–0.53 (p < 0.001) between POP-Q stage and PFDI-20/POPDI-6 scores and 0.38–0.41 (p < 0.001) between POP-Q stage and CRADI/UDI scores. Coefficients were 0.31–0.41 (p < 0.04) between POP-Q stage and PFIQ-7/UIQ/CRAIQ/POPIQ scores. AI severity demonstrated the strongest correlation with CRADI (0.61, p < 0.001) and UDI (0.37, p = 0.068) scores and total PFDI-20, but correlation was poor with POPDI-6 (0.07, p = 0.739). AI severity demonstrated closer correlation with CRAIQ (0.55, p = 0.005) scores than with UIQ, POPIQ, and total PFIQ-7 (0.36–0.51, p = 0.078) scores. UI frequency was correlated with UDI (0.35, p = 0.01), CRADI (0.37, p = 0.007), and total PFDI-20 (0.43, p = 0.001) scores, but not with the POPDI (0.28, p = 0.048); it showed poor correlations with PFIQ-7 and subscale scores.
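For readers unfamiliar with the statistic, Spearman's coefficient is the Pearson correlation of the ranks of the two variables, with tied values sharing their average rank. The sketch below is a minimal standard-library illustration with hypothetical data and function names; it does not reproduce the SPSS analysis and does not compute p values.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation of the tie-corrected ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# Hypothetical POP-Q stages and subscale scores, for illustration only.
stages = [1, 2, 2, 3, 3, 4, 4, 4]
scores = [10, 25, 20, 40, 55, 50, 70, 65]
print(round(spearman_rho(stages, scores), 2))
```

Because ordinal measures such as POP-Q stage produce many ties, the average-rank handling matters; the shortcut formula based on squared rank differences is exact only when there are no ties.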

Discussion

Our study establishes the reliability and validity of orally administered Tigrigna versions of the PFDI-20 and PFIQ-7 questionnaires. Translation/back-translation was similar to previous adaptations to Spanish, Turkish, and Chinese [9,10,11,12]. While other versions of the PFDI and PFIQ have been validated for online and written administration, other studies validating oral administration of these questions are lacking. We show that oral administration has similar properties to these questionnaires administered in written formats.

Face validity of the PFDI-20 and PFIQ-7 translations was adequate; both demonstrated acceptable reliability and validity. Bland–Altman analysis showed no significant difference between the first and second administrations in total and subscale scores. ICC was 0.94 for PFDI-20 and 0.92 for PFIQ-7, showing a high level of test–retest reliability for both. Reliability of the Tigrigna versions was higher than in a previous study of Dutch women, which reported ICCs of 0.79–0.91 [15].

As most women with POP in our study were stage III or IV, PFDI-20 and PFIQ-7 total and subscale scores were higher than in other studies [16]. A possible reason is that our study population was from rural areas and primarily uneducated, with significant barriers to accessing care, resulting in long-standing prolapse and influencing symptom bother. In addition, participants were older than those in the study from Japan [16], again potentially exacerbating symptom severity.

There was high construct validity using Spearman’s correlation coefficients between PFDI-20 scores and POP-Q stage, UI frequency, and AI severity. Most subscales demonstrated strong association with the measure of symptom severity for which they were designed and moderate association with other symptoms. Correlations of PFIQ-7 total and subscale scores with clinically diagnosed UI were lower, possibly because women in Tigray may be reluctant to discuss the severity of their UI.

Limitations

A limitation of our study is the inability to compare construct validity with other questionnaires used for PFD evaluation, for example, the Short Form Health Survey (SF-12), International Consultation on Incontinence Questionnaire Short Form (ICIQ-SF), or the Patient Global Impression of Improvement (PGI-I), because no translated and validated versions are available in Tigrigna and producing them was beyond the scope of our study objective. However, future studies comparing translated and adapted versions of these instruments would be useful, particularly if Ethiopia considers adapting our PFDI-20 and PFIQ-7 forms for national use. A second limitation is the need for oral administration of the surveys owing to illiteracy in the population. Questionnaires were not self-administered, so data were collected by direct interview by obstetrics and gynecology residents, possibly introducing bias. The sample size of women with AI was relatively small, though the number (n = 25) was nearly adequate for conducting known-groups validity based on sample-size calculations from a previous study [2].

Conclusion

The validated PFDI-20 and PFIQ-7 in Tigrigna showed semantic, conceptual, idiomatic, and content equivalence with original versions. The questionnaires are reliable, valid, and feasible for evaluating symptoms and QoL of women with PFD in Tigray, Ethiopia. The Tigray Regional Health Bureau should consider integrating these questionnaires into service delivery in the region. Our validated questionnaires can assist in similar adaptations in Ethiopia, possibly to Amharic or Oromiffa, as well as contributing to the body of research available globally on this topic.