Introduction

Using a standardized questionnaire to measure quality of life (QOL) accurately in patients with osteoporosis (OP) is important because it allows reasonable comparisons between countries with different languages and cultures. Several questionnaires have been developed to measure QOL in OP [1–3], but none of these was originally created in Spanish. Before any of these tools can be used, validation and cross-cultural adaptation are needed because cultural groups can vary in disease expression. Even among countries that share a language, local idiomatic expressions exist (for example, the names of foods), and cultural or social activities may be oriented to local needs and differ from one country to another. These differences need to be incorporated to obtain accurate and reproducible results that are comparable with outcomes from other countries, cultures, and health systems [4].

OP has recently become a focus of research in developing countries, and an important increase in the number of elderly people and in the life expectancy of Mexicans has been reported [5]. One out of 12 Mexican women over 50 years of age will sustain a hip fracture in her remaining lifetime. In addition, a 19.5% prevalence of vertebral fractures in Mexican women has been reported recently [6].

Vertebral fractures, the hallmark of OP fractures, are commonly associated with back pain, kyphosis, abdominal protrusion, and height loss. Patients with vertebral fractures have impaired walking and difficulty with activities of daily living such as shopping and carrying or lifting objects [7, 8].

Studies have also shown that such patients suffer from a loss of independence [9–11]. Furthermore, the physical changes affect psychological function, causing anxiety, depression, low self-esteem, and stress. Anxiety and panic prove to be the most significant problems, as patients endeavor to avoid any situation in which fractures could take place. All these characteristics negatively affect the QOL of OP patients and patients with vertebral fractures.

Several specific tools have been developed to evaluate QOL in people with OP, such as the Osteoporosis Assessment Questionnaire, the Quality of Life Questionnaire for Osteoporosis, the Mini-Osteoporosis Quality of Life Questionnaire, the Quality of Life Questionnaire of the European Foundation for Osteoporosis (QUALEFFO), and the Osteoporosis Functional Disability Questionnaire. These tools have been useful in determining QOL in the countries and cultures in which they originated. Some of them have been validated for use in other countries with languages and cultures different from those in Mexico.

No tool originally developed in Spanish exists to evaluate QOL in people living with OP and vertebral fractures. Therefore, an existing tool must be validated and, where necessary, adapted so that QOL can be evaluated in Mexican patients, taking into account both the type of Spanish spoken and any cultural differences. The QUALEFFO has excellent psychometric characteristics; it is a consistent, homogeneous, and reliable instrument in all countries where it is currently in use, and in addition, it has shown the potential to discriminate QOL between people with and without vertebral fractures.

Therefore, the aim of this study was to perform a transcultural validation of the QUALEFFO in Mexican patients with vertebral fractures and OP.

Materials and methods

Sample

A total of 160 women were included in the study: 80 cases with at least one vertebral fracture, defined morphometrically at least 3 months before the study began, and 80 women with OP and no fractures as a control group. All women were 50 years of age or older at the start of the study. The World Health Organization classification criteria for OP were used to classify the patients. Lateral X-rays from all subjects were read using the Genant semiquantitative method, and vertebral morphometry was done in all cases using the modified Eastell criteria to define the vertebral deformities [12, 13]. The OP patients were enrolled from the National Institute of Rehabilitation, Mexico City (INR), and from the Latin American Vertebral Osteoporosis Study (LAVOS) sample from the City of Puebla [6]. Cases with secondary OP were excluded, as were patients with metabolic bone disease, disseminated malignancies, or conditions interfering with mobility and/or activity.

Questionnaires

The QUALEFFO is a self-administered, specific questionnaire designed by the Working Party for Quality of Life of the European Foundation for Osteoporosis (EFFO) to be used by patients with vertebral fractures attributed to OP. It consists of 41 questions in the following five domains: pain, physical function, social function, general health perception, and mental function.

The Iberian Spanish version was obtained from the original study with the consent of the authors and Professor Paul Lips from the Work group of the EFFO, Academic Vrije University Hospital, Amsterdam, The Netherlands.

Along with the QUALEFFO, the Spanish version of the Short Form 36 of the Medical Outcomes Study (SF-36) [14] was applied concurrently. Previous authorization for the use of this tool was obtained. The SF-36 is a generic questionnaire developed to measure the state of health in a general population. It consists of eight domains: bodily pain, physical function, social function, general health, mental health, vitality, physical role, and emotional role. It is used widely to test concurrent validity.

Methods and procedure

A panel of experts consisting of two rheumatologists and one methodologist reviewed the tool in both English and Iberian Spanish to assess content validity and the feasibility [15] of applying it in Mexican patients. A number of modifications were made to the type and form of the language used, along with some adaptations arising from cultural differences; nine of the questions were modified, and the Iberian and Mexican wording can be seen in the Appendix. A pilot study of the resulting “Mexican Tool” was carried out in 15 patients to verify that patients were able to understand the instructions, the questions, and the different answering options.

After piloting the tool, the final version was applied to all subjects in the same order. In 30 cases, the tool was applied twice to evaluate reproducibility. The QUALEFFO was applied first, followed by the SF-36, and then vice versa, with a period of 1 day between the first and second application. The patients with vertebral fractures from the LAVOS study were contacted and visited in their homes, while the patients recruited from the INR were seen at the Osteoporosis Clinic of the same institution. Lateral X-rays of the dorsal and lumbar spine were taken using the same protocol [16], and densitometry of two regions (spine and hip) was carried out in all cases to determine the OP diagnosis.

The tools were administered by direct interview in 101 cases because these participants were unable to use the self-administration modality (impaired visual capacity or low level of education). In all of these cases, the tools were applied by the same interviewer (Ramírez Pérez). Written consent was requested from all participants. The interviews were carried out between November 2001 and December 2003. The protocol was reviewed and authorized by the Research Committee at the National Institute of Rehabilitation, Mexico City.

Statistical analysis

The QUALEFFO was scored according to the instructions in its original version. The answers to each question were scored from 1 to 5, except for questions 23, 24, 25, and 26 (scored 1–3) and questions 27 and 28 (scored 1–4); “not applicable” was not scored. The response options for questions 33, 34, 35, 37, 39, and 40 were reversed so that the order was always from 1 (healthy) to 5 (not healthy). Parameter scores were calculated by adding up the answer scores and submitting the sum to a linear transformation onto a 100-point scale.
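The scoring rule described above can be illustrated with a short sketch. The exact transformation is not given in the text, so the formula below (rescaling the sum of applicable items between its minimum and maximum possible values) is an assumption, and the function names `reverse_item` and `domain_score` are hypothetical.

```python
def reverse_item(answer, max_option=5):
    """Reverse a response so that 1 means healthy and max_option means not healthy."""
    if answer is None:  # "not applicable" stays unscored
        return None
    return max_option + 1 - answer


def domain_score(answers, max_options):
    """Rescale the answers of one domain to a 0-100 score.

    answers:     list of ints, with None for "not applicable"
    max_options: highest response option of each item (3, 4, or 5)
    Assumption: every item starts at 1, and the score is the position of the
    raw sum between the lowest and highest sums the applicable items allow.
    """
    applicable = [(a, m) for a, m in zip(answers, max_options) if a is not None]
    if not applicable:
        return None  # no applicable item, so no score for this domain
    raw = sum(a for a, _ in applicable)
    lowest = len(applicable)               # every applicable item answered 1
    highest = sum(m for _, m in applicable)
    return 100.0 * (raw - lowest) / (highest - lowest)


# Three items; the second one (a 1-3 item) is "not applicable" and is skipped.
print(domain_score([3, None, 3], [5, 3, 5]))
```

Because the score is expressed relative to the applicable items only, a respondent who skips an item as "not applicable" still receives a score on the same 0 to 100 range.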

For demographic data, descriptive statistics were obtained for the variables concerning age, marital status, level of formal education, and type of fracture.

The Cronbach alpha coefficient was calculated to evaluate internal consistency. Test–retest reproducibility was evaluated [15] by calculating the intraclass correlation coefficient (Ri) [17]. The Pearson correlation coefficient was calculated for concurrent validity, and the t test for independent groups was used to test for significance between the fracture and nonfracture groups.
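As an illustration of the internal consistency measure, the following is a minimal sketch of the standard Cronbach alpha formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores). It is not the SPSS procedure used in the study; the function name `cronbach_alpha` and the toy data are hypothetical, and complete responses (no "not applicable" answers) are assumed.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach alpha for a list of respondents, each a list of item scores.

    Assumes complete data: every respondent answered all k items.
    """
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents
    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of each respondent's total score
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Toy data (hypothetical): four respondents, three perfectly consistent items.
toy = [[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]]
print(round(cronbach_alpha(toy), 3))
```

Items that move together across respondents push alpha toward 1, whereas items that vary independently pull it toward 0.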

Logistic regression was used to assess how well the different dimensions of both tools (QUALEFFO and SF-36) discriminated between the fracture and nonfracture groups. Odds ratios (ORs) and 95% confidence intervals (95% CI) were calculated, and the regression was adjusted for age, years of formal education, presence/absence and number of fractures, and marital status. Receiver-operating characteristic (ROC) curves were constructed to compare the ability of the QUALEFFO and SF-36 to discriminate between the vertebral fracture and nonfracture groups. The difference between the areas under the curves was used to compare the two tools [18, 19]. SPSS v.10 was used for the analysis.

Results

A total sample of 160 women was included in the study with a mean age of 76 ± 11.2 years for the fracture group and 68 ± 9.4 years for the control group (p = 0.0001).

In the fracture group, 49 women (61.3%) were widows, as opposed to 28 women (35%) in the control group. A significant difference in the level of formal education was found between cases and controls; 48.8% of controls had more than 6 years of formal education in comparison to 25% in the fracture group (p = 0.002).

A total of 109 vertebral fractures were diagnosed: 47 fractures at the lumbar level and 62 at the thoracic level, with the majority found at the T12–L1 junction (14 and 16 fractures, respectively). Sixty-one cases (76.2%) had one fracture, 13 cases (16.25%) had two fractures, and six cases (7.4%) had three or more fractures.

Content validity was assessed by consensus during a meeting with experts (two rheumatologists and one methodologist) in the field and verified in the pilot study.

The properties of the tools were as follows: the QUALEFFO had an Ri of 0.94 and a Cronbach alpha coefficient of α = 0.922, and the SF-36 had an Ri of 0.97 and an α of 0.925.

When the different dimensions of the QUALEFFO were analyzed, the internal consistency by parameter was as follows: pain, α = 0.886; physical function, α = 0.944; general health perception, α = 0.715; and mental function, α = 0.696. For social function, a lower α value of 0.463 was found when the seven original questions from the tool were analyzed. Because of the high percentage of nonapplicable answers to two questions in this domain, a second analysis was performed with just five items, and the alpha value then increased to α = 0.706. For the SF-36 tool, the internal consistency by domain was: physical function, α = 0.940; mental function, α = 0.839; physical role, α = 0.843; emotional role, α = 0.813; bodily pain, α = 0.772; social function, α = 0.767; general health, α = 0.693; and vitality, α = 0.753.

Analysis of the data disaggregated by mode of administration showed no difference in psychometric properties. No differences in tool consistency were found between self-administration (α = 0.93) and interview (α = 0.91).

For the QUALEFFO tool, the following parameters showed statistically significant differences between the fracture and nonfracture groups: pain (p = 0.046), physical function (p = 0.001), social function (p = 0.001), and mental function (p = 0.010). Significant differences between groups were also found using the generic SF-36 questionnaire for physical function (p = 0.002) and social function (p = 0.042; Table 1).

Table 1 Grading the domains of the QUALEFFO-SF36 in people with and without vertebral fractures (FxV)

When the relation between the number of fractures and the deterioration in QOL was analyzed (Table 2), patients with one fracture and patients with two or more fractures both showed deterioration related to pain, physical function, social function, and mental function.

Table 2 Relation between the number of fractures and the quality of life

Data were adjusted for age, years of formal education, marital status, and number of fractures; only age and number of fractures had a significant effect: QOL was worse in older women (p = 0.005), and QOL decreased as the number of fractures increased (p = 0.001).

The concurrent validity between the two QOL tools, calculated using the Pearson correlation, showed a good fit (r = −0.817, p < 0.0001). As expected, the correlations were negative because the QUALEFFO and SF-36 questionnaires are scored in opposite directions, but they were strong: mental function, r = −0.726; general health perception, r = −0.632; pain, r = −0.512; physical function, r = −0.471; and social function, r = −0.467.

Logistic regression analysis is shown in Table 3. The discriminative capacity of the tool was found to be as follows: pain, OR = 1.3 (95% CI = 0.85–2.02); physical function, OR = 3.7 (95% CI = 1.27–10.70); social function, OR = 2.2 (95% CI = 1.04–4.83); and mental function, OR = 1.5 (95% CI = 0.675–3.35).

Table 3 Discriminative capacity of QUALEFFO and SF-36 questionnaires as assessed by Logistic Regression
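For readers less familiar with logistic regression output, the sketch below shows the usual way an odds ratio and its 95% confidence interval are derived from a fitted coefficient and its standard error. The Wald interval with z = 1.96 is an assumption about how the intervals in Table 3 were obtained, and the function name `odds_ratio_ci` and the coefficient and standard error passed to it are hypothetical.

```python
import math


def odds_ratio_ci(beta, standard_error, z=1.96):
    """Odds ratio and Wald confidence limits from a logistic coefficient.

    The interval is symmetric on the log-odds scale, so on the odds ratio
    scale the point estimate is the geometric mean of the two limits.
    """
    odds_ratio = math.exp(beta)
    lower = math.exp(beta - z * standard_error)
    upper = math.exp(beta + z * standard_error)
    return odds_ratio, lower, upper


# Hypothetical coefficient and standard error, for illustration only.
print(odds_ratio_ci(0.8, 0.35))

# Consistency check on a reported interval: the geometric mean of the
# limits 1.27 and 10.70 should be close to the reported OR of 3.7.
print(round(math.sqrt(1.27 * 10.70), 2))
```

An interval that includes 1, such as those reported for pain and mental function, indicates that the domain did not discriminate significantly between the groups in this model.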

Table 4 and Figs. 1 and 2 show the differences in the discriminative capacity of both tools using ROC curve analysis. The QUALEFFO discriminates for physical function (p < 0.0001), social function (p ≤ 0.001), and mental function (p = 0.02), whereas the SF-36 discriminates exclusively for physical function (p ≤ 0.001).

Table 4 Receiver-operating characteristic (ROC) curve analysis for QUALEFFO and the SF-36

Fig. 1 Discriminating between vertebral fracture cases and nonfracture controls in a receiver-operating characteristic curve for individual QUALEFFO domain performance

Fig. 2 Discriminating between vertebral fracture cases and nonfracture controls in a receiver-operating characteristic curve for individual SF-36 domain performance

Discussion

This study confirms that the cross-culturally adapted QUALEFFO maintains the psychometric properties found in the original English version. In addition, the modifications made to the Iberian Spanish version of the tool make it both useful and acceptable for measuring the QOL of Mexican patients with OP and vertebral fractures.

The results are similar to those reported by Lips et al. and other authors who have used this tool in patients with OP and vertebral fractures. The tool is consistent and reproducible and has good discriminative properties [20–23]. This was demonstrated in the current study through the high alpha coefficients for internal consistency, good test–retest reproducibility, and significant discriminative properties using logistic regression. Furthermore, very good concurrent validity was found between the QUALEFFO and SF-36 tools.

Although the results were similar to those reported internationally, analysis by domain highlighted some important differences within the social domain. When the original seven questions were analyzed, a lower reliability was found (α = 0.46). A second analysis using only five of these questions showed better homogeneity (α = 0.70).

The two questions that were ruled out were question 24, “Can you do your gardening?” and question 29, “Does your back pain or disability interfere with intimacy (including sexual activity)?”. Of the 160 patients interviewed, 90 (56.3%) did not consider question 24 applicable, while 123 (77%) did not consider question 29 applicable. The lower homogeneity in this domain was probably due to this high percentage of nonresponses.

A possible explanation for this finding in the social domain could be cultural [24]: gardening is an actual job in Mexico rather than a hobby, and people are paid for this activity, which makes the question irrelevant. In addition, houses tend not to have gardens, as space is limited, and a large percentage of the population lives in apartments rather than houses. In terms of sexual activity, the majority of the sample included in the study were more than 70 years of age, and many were widows or living alone. They reported that, in reality, the question did not apply to them. Therefore, the question was not assumed to be irrelevant; rather, it did not apply. This issue has been reported previously in the Mexican population [25].

A possible solution to overcome the cultural differences identified could be to replace these questions with others that are more applicable in terms of culture and social context, so as to maintain the same number of items as in the original tool. When the validation and cultural adaptation of the Health Assessment Questionnaire Disability Index was performed in Mexican patients with rheumatoid arthritis, a similar problem was found. Two questions from the original version of the tool, “Are you able to use a bathtub?” and “Are you able to drive a car?”, were removed because they were not applicable to most of the patients in the study. To maintain the length of the questionnaire, a team of physicians and physiotherapists identified activities that involved similar joint and muscle movements, such as the use of a foot-operated sewing machine, and these replaced the original questions.

In the current study, it was decided not to replace the questions that attracted a low response, so the questions were removed, and consequently, a lower score was obtained. However, because the output scale is expressed as a percentage of applicable items in each domain, it is believed that the loss of these two questions did not render the tool incomparable across different cultures or populations. A future version of the tool could always replace these questions with others that are culturally appropriate.

The discriminative properties of the tool were demonstrated by finding differences between the groups with and without vertebral fractures within the domains for pain, physical state, social activities, mental state, and total score. The generic SF-36 tool discriminated only for physical function (p < 0.001) and social function (p = 0.029). In the original study by Lips et al., the tool discriminated significantly between the fracture case group and the control group in all domains.

For concurrent validity with the SF-36 tool, a significant correlation coefficient was obtained (p < 0.001) in all the domains. Nevertheless, the correlations in the pain (r = −0.481) and social function (r = −0.444) domains were low. These results justify the use of a specific tool like the QUALEFFO and concur with previously reported studies worldwide in which the QOL domains most affected in people with vertebral fractures are pain, physical activities, and social activities.

The number of fractures was related to the degree of QOL impairment in the physical, social, and mental function [26, 27]. Patients with several fractures do not report higher levels of pain than patients with a single fracture. This could be explained because older fractures may be asymptomatic, or patients are already taking analgesics [7]. Melton et al. [28] found that 72% of the subjects studied had symptoms such as backache for 1 day or less that could possibly have been associated with a vertebral fracture. Another study reported that vertebral deformity was negatively associated with pain, in that pain influences the performance of activities [11].

With the exception of the general health perception parameter, the ratings for all the other parameters in this questionnaire discriminated between patients with and without vertebral fractures.

Conclusion

The tool has excellent psychometric characteristics; it is consistent, homogeneous, and reliable, and it has the potential to discriminate four QOL domains between people with and without vertebral fractures. This study confirms the utility of the tool to demonstrate QOL deterioration in people with vertebral fractures. In addition, the study showed that the degree of QOL deterioration depends on the number of vertebral fractures and on age.

A cultural adaptation of the tool should be carried out even if the original version is written in the language of the country in which the tool is being applied. This study demonstrates that cultural differences affect the internal consistency of the tool, as seen in the social function parameter, and it may be important to consider omitting questions or replacing them with others that are more culturally appropriate.

The self-administered questionnaire can be applied by interview for people with either visual impairment or a low level of education without losing its psychometric properties.

QUALEFFO can be used in a Mexican population to evaluate the QOL in patients with vertebral fractures attributable to OP.