Introduction

Osteoporosis is a common skeletal disease characterized by low bone mass and micro-architectural deterioration of bone tissue, resulting in increased bone fragility and risk of fracture [1]. More than 200 million people are affected by this disease worldwide [2, 3]. Almost 50 % of women will suffer an osteoporotic fracture in their lifetime [4]. By 2050, half of the world’s fractures will occur in Asia, mainly in China, with osteoporotic fractures expected to reach an estimated 5.99 million. Females sustain four times as many fractures as males [5]. Osteoporotic fractures may cause pain [6], decrease physical function, limit daily activities [7], and lead to social isolation and emotional distress [8, 9]. Undoubtedly, osteoporotic fractures have a negative impact on health-related quality of life (QOL).

Assessment of QOL plays an increasingly important role in clinical studies, particularly as an outcome measure in clinical trials of patients with osteoporosis [7]. Given that no correlation has been found between quality of life and radiography or densitometry [10, 11], an appropriate instrument is required to determine the effects of osteoporotic fractures on a patient’s everyday health and well-being. Two types of instruments (generic and disease targeted) are generally recommended. Generic instruments are designed for use across a wide range of medical conditions but are less effective for exploring specific aspects of osteoporosis. Conversely, disease-targeted instruments can measure specific clinical changes in response to treatments for individual diseases such as osteoporosis and, in particular, can highlight gender disparities [12].

Several instruments have been developed in recent years. However, the majority have not been sufficiently tested for clinical practice or optimally validated. In China, there is no unified tool for assessing QOL in females with osteoporotic fractures. There is only one home-grown disease-targeted questionnaire to determine QOL in patients with lower bone density [13]. It has not been widely applied, partly because of the length of the questionnaire. The Chinese version of the Quality of Life Questionnaire of the European Foundation for Osteoporosis is well validated but was developed to measure QOL without gender specificity [14]. The Osteoporosis Assessment Questionnaire Short Version (OPAQ-SV) has only 34 items categorized according to three dimensions: “physical function,” “emotional status,” and “symptoms.” The instrument is internally consistent, homogeneous, and reliable, and it consistently measures dysfunction associated with osteoporotic fractures in females [15]. Therefore, we cross-culturally adapted the OPAQ-SV to measure QOL in Chinese females with osteoporotic fractures and subsequently evaluated its psychometric properties.

Methods

Cross-cultural adaptation

With permission from the original developers of the OPAQ-SV and in strict adherence to cross-cultural adaptation guidelines [16], we began our adaptation by asking a team of osteoporosis experts to translate the questionnaire into Mandarin Chinese. The instrument was then back-translated by another team (i.e., bilingual experts fluent in English and Chinese), in conformity with the COSMIN recommendations [17]. Each expert conducted a retranslation independently. The back translation was then reviewed by the team of experts who had been involved in development of the original version of the OPAQ-SV. The cross-culturally adapted version (the Chinese OPAQ-SV) was then modified until the original and the Chinese versions were linguistically equivalent.

Validation

Pilot study

The Chinese OPAQ-SV was first pilot tested on a sample of 25 Chinese female patients with osteoporotic fractures. The aim was to detect problems with the questionnaire such as wording, terminology, instructions, and choice responses. Patients were administered the questionnaire followed by a structured interview with questions about each of the questionnaire items. Patients were asked to comment on items and offer recommendations for improvement. The Chinese OPAQ-SV was amended accordingly.

Field study

Following pilot testing, the Chinese OPAQ-SV was validated using a cross-sectional study design with 234 post-menopausal osteoporotic fracture patients recruited from orthopedic units in five tertiary hospitals in China. Ethics approval was obtained from the University’s Human Research Ethics Committee and institutional review boards overseeing research at each hospital site. Inclusion criteria were post-menopausal women (a) with onset of menopause after 40 years of age, (b) who had suffered a fracture with minimal or no trauma, (c) with a fracture diagnosed according to radiological reports, (d) willing to give consent to participate in the study, and (e) able to read or speak Mandarin Chinese. Patients were excluded if they had secondary osteoporosis, had a chronic disabling disease other than osteoporosis, or had cognitive impairment. To assess the discriminant validity of the OPAQ-SV, 235 post-menopausal females with osteoporosis but without fractures were also recruited. Osteoporosis was defined clinically through the measurement of bone mineral density (BMD, g/cm²) in 229 (97.4 %) of these patients, using daily calibrated Hologic 4500 dual-energy X-ray absorptiometry (DEXA) (Hologic, Bedford, MA, USA). The presence of osteoporosis in the remaining 6 (2.6 %) patients, who did not have a fracture history and who had no BMD test, was diagnosed by physicians/radiologists according to risk factors, symptoms of osteoporosis, and X-rays.

Patients were recruited at clinics. After obtaining verbal consent, data collectors distributed questionnaires (including a one-page subject information form for socio-demographic and medical variable data), explained the purpose of the study, and provided instructions according to a pre-determined protocol. Patients filled in the questionnaires independently. For those unable to complete the questionnaire (due to low literacy), questions were asked orally. Completed questionnaires were checked for unanswered items, and subjects were invited to answer any unanswered items. To ensure that the self-administered mode was equivalent to the interview mode for the instruments used in this study, two strategies were adopted: (1) the data collector was trained beforehand, with unified guidance, to achieve the best possible equivalence; and (2) 30 randomly selected written responses were compared with responses provided verbally.

Measures

OPAQ-SV

The OPAQ-SV consists of 34 items. There are three major dimensions and corresponding health domains: “physical function” (walking/bending, daily activities, transferring), “emotional status” (fear of falls, body image, independence), and “symptoms” (back pain), which together assess health-related QOL in osteoporosis. The higher the OPAQ-SV score, the better the health status. Each item has five options: “all days,” “most days,” “some days,” “few days,” and “no days,” or “always,” “very often,” “sometimes,” “almost never,” and “never.” Items 1, 3, 4, 8, 9, 10, 13, 14, 15, and 32 are reverse scored, with the five options scoring 10, 7.5, 5, 2.5, and 0 points, respectively. The remaining items are forward scored, with the five options scoring 0, 2.5, 5, 7.5, and 10 points, respectively. Item scores within a domain are then summed to create the domain score. A normalization procedure expresses all domain scores in the 0–10 range, with 0 representing the worst possible health status and 10 representing the best possible health status. Domain scores are summed within the dimension and then normalized to a range from 0 to 100, with 0 representing the worst possible health status and 100 representing the best possible health status [15].
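The scoring rules above can be illustrated with a minimal sketch. The reverse-scored item numbers and the point values come from the description above. Everything else is an assumption made for illustration: the normalization is taken to be a simple mean (domain) and a mean multiplied by 10 (dimension), the option index runs from 0 for the first listed option to 4 for the last, and the item-to-domain mapping in the usage example is hypothetical, since the real mapping is not given here.

```python
# Items scored 10, 7.5, 5, 2.5, 0 across the five options (from the text).
REVERSE_ITEMS = {1, 3, 4, 8, 9, 10, 13, 14, 15, 32}
# Points for forward-scored items, by option index 0..4 (from the text).
FORWARD_POINTS = (0.0, 2.5, 5.0, 7.5, 10.0)


def item_score(item_no, option_index):
    """Score one item; option_index 0 is the first listed option."""
    points = FORWARD_POINTS[option_index]
    return 10.0 - points if item_no in REVERSE_ITEMS else points


def domain_score(responses, items):
    """Domain score on a 0-10 scale.

    ASSUMPTION: the normalization is the mean of the item scores.
    """
    return sum(item_score(i, responses[i]) for i in items) / len(items)


def dimension_score(responses, domains):
    """Dimension score on a 0-100 scale.

    ASSUMPTION: the normalization is the mean domain score times 10.
    """
    scores = [domain_score(responses, items) for items in domains.values()]
    return 10.0 * sum(scores) / len(scores)


# Hypothetical responses and a hypothetical item-to-domain mapping.
example_responses = {1: 0, 2: 4, 3: 2, 5: 1}
example_domains = {"domain_a": [1, 2], "domain_b": [3, 5]}
example_dimension = dimension_score(example_responses, example_domains)
```

With these hypothetical inputs, the two domain scores are 10 and 3.75, so the dimension score is 68.75.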

Osteoporosis general information questionnaire

Socio-demographic and medical variable data were obtained with the Osteoporosis General Information Questionnaire that we developed. The questionnaire included age, occupation, education, working status, income, fracture incidence, and so forth.

SF-12

To obtain general quality of life measures, patients completed the Short Form 12 (SF-12). The SF-12 consists of 12 items, each categorized in one of eight dimensions: “general health,” “physical functioning,” “role-physical,” “role-emotional,” “bodily pain,” “mental health,” “vitality,” and “social functioning” [18].

Statistical analyses

Data were entered into a database using EpiData 3.1 software. Calculations were performed using the Statistical Package for the Social Sciences (SPSS 19.0). Demographic variables were described using frequency tables, means, and standard deviations (SD). Validity of items was determined through item analysis. Reliability was investigated using internal consistency, split-half reliability, and test-retest reliability, for which the Chinese OPAQ-SV was completed twice with an interval of 2 weeks. The intraclass correlation coefficient (ICC) measured test-retest reliability. Values of 0.60–0.80 were deemed good reliability, and values above 0.80 were regarded as excellent reliability [19]. Internal consistency was calculated with the Cronbach alpha coefficient. A Cronbach alpha coefficient equal to or greater than 0.70 was considered satisfactory [20].
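For readers unfamiliar with the internal consistency statistic, the following is a minimal sketch of the standard Cronbach alpha formula, alpha = k / (k - 1) × (1 - sum of item variances / variance of total scores), where k is the number of items. It is not the SPSS procedure used in the study. The function names and the toy data are illustrative assumptions, and only the 0.70 threshold is taken from the text.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach alpha for a list of respondents, each a list of item scores.

    Uses sample variances. Requires at least two items, at least two
    respondents, and non-zero variance of the total scores.
    """
    k = len(rows[0])
    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def is_satisfactory(alpha, threshold=0.70):
    """Apply the threshold stated in the text (alpha >= 0.70)."""
    return alpha >= threshold


# Toy data (hypothetical): three respondents, two items.
toy_rows = [[1, 2], [2, 1], [3, 3]]
toy_alpha = cronbach_alpha(toy_rows)
```

For the toy data, each item has a variance of 1 and the total scores have a variance of 3, so alpha is 2 × (1 - 2/3), about 0.67, which would fall just below the 0.70 threshold.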

Construct validity of the Chinese OPAQ-SV was determined through exploratory and confirmatory factor analyses. Exploratory factor analysis using principal component analysis with varimax rotation was conducted to investigate the factor structure [21]. The scree plot, the Kaiser criterion (eigenvalue > 1.0), and clinical interpretability were used to determine a factor solution. An item was accepted on a final factor if it had a loading of more than 0.4 on that factor [22]. To verify the results, a confirmatory factor analysis was also performed. Fit indexes were calculated [23, 24], including (a) the chi-squared test value, which had to be ≤2 to be acceptable; (b) the root mean squared error of approximation (RMSEA), where a value <0.08 was considered acceptable; and (c) the Comparative Fit Index (CFI), which had to be >0.90 to be satisfactory [25]. Content validity was assessed by the correlation between each item score and its corresponding dimension score, and between each dimension score and the total scale score. Discriminant validity was determined by comparing mean Chinese OPAQ-SV dimension scores in the 234 women with a history of clinical fracture to those of the 235 women without clinical fracture using the Wilcoxon test for independent samples, as appropriate. Baseline characteristics were tested for balance between the two groups. The one parameter that differed significantly (p < 0.05), age, was then included as a covariate when comparing quality of life. Spearman’s rank correlation between similar domains in the Chinese version of the OPAQ-SV and the SF-12 was calculated. Correlations above 0.4 and 0.7 were classified as moderate and strong, respectively [26]. A p value less than 0.05 was considered statistically significant.
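The decision rules in this paragraph can be written down compactly. The sketch below only encodes the thresholds stated above. The function names are illustrative, the chi-squared criterion is treated as a single reported number because the text does not say how it was derived, and the label "below moderate" is an assumption because the text does not name correlations of 0.4 or less.

```python
def fit_is_acceptable(chi2_value, rmsea, cfi):
    """Check the three fit criteria from the text.

    chi2_value: the chi-squared criterion as reported (must be <= 2).
    rmsea: must be < 0.08.
    cfi: must be > 0.90.
    """
    return chi2_value <= 2.0 and rmsea < 0.08 and cfi > 0.90


def correlation_strength(rho):
    """Classify a correlation coefficient using the cut-offs in the text."""
    size = abs(rho)
    if size > 0.7:
        return "strong"
    if size > 0.4:
        return "moderate"
    return "below moderate"  # ASSUMPTION: label not given in the text


# Hypothetical inputs chosen only to exercise each branch.
example_fit = fit_is_acceptable(chi2_value=1.5, rmsea=0.05, cfi=0.95)
example_labels = [correlation_strength(r) for r in (0.75, 0.5, 0.2)]
```

Note that the comparisons are strict for RMSEA and CFI, so a CFI of exactly 0.90 would not pass.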

Results

Participant characteristics

Table 1 lists the characteristics of the females with osteoporosis, with and without fracture. A total of 234 females with osteoporotic fractures and 235 females with osteoporosis but without fracture were identified for recruitment. The mean age (SD) of the respondents in the validation study was 67.48 (9.19) years for females with osteoporotic fractures and 63.18 (9.09) years for females without fractures. The average BMI (SD) for the participants was 22.96 (3.67) kg/m². The average age at menopause (SD) was 49.97 (3.69) years. Half had a primary school education or less; 60 (25.6 %) females with osteoporotic fractures and 59 (25.1 %) females without fractures had received high school or post-secondary education. The majority (>90 %) had medical insurance. The percentage of females with osteoporotic fractures who had more than one fracture was 78.6 %. Of the 234 females with osteoporotic fractures, 55 (23.5 %) had suffered hip fractures.

Table 1 Characteristics of the osteoporosis females with and without fracture

Item analysis

Item analysis entailed, first, sorting OPAQ-SV scores into high and low scoring groups. The top 27 % of scores comprised the high group, and the bottom 27 % comprised the low group. Then, the mean score of each item in the two groups was compared. Item analysis verified that the two groups were significantly different (p < 0.001). Item analysis was also performed on the three dimensions and total scores. For the high group, the mean scores for “physical function,” “emotional status,” “symptoms,” and the total scale were 90.87 (5.76), 32.97 (4.76), 88.79 (10.36), and 79.03 (6.17), respectively. For the low group, the corresponding mean scores were 18.86 (12.14), 12.81 (2.92), 21.83 (8.83), and 23.33 (8.71), respectively. These results indicated that the items discriminated well, without floor or ceiling effects, so all 34 items were retained.
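The high/low group comparison can be sketched as follows. This assumes the usual extreme-groups procedure, in which respondents are ranked by total score and the top and bottom 27 % form the two groups. The text does not spell out how the groups were formed, so that reading, the function names, and the toy data are all assumptions. The sketch reports only the per-item difference in means and leaves out the significance test.

```python
from statistics import mean


def extreme_groups(rows, fraction=0.27):
    """Split respondents into high and low groups by total score.

    rows: list of respondents, each a list of item scores.
    Returns (high_group, low_group), each holding about `fraction`
    of the respondents (at least one).
    """
    ranked = sorted(rows, key=sum)
    n = max(1, round(len(ranked) * fraction))
    return ranked[-n:], ranked[:n]


def item_mean_gaps(rows, fraction=0.27):
    """Per-item difference between high-group and low-group mean scores."""
    high, low = extreme_groups(rows, fraction)
    n_items = len(rows[0])
    return [
        mean(row[j] for row in high) - mean(row[j] for row in low)
        for j in range(n_items)
    ]


# Toy data (hypothetical): ten respondents, two items.
toy_rows = [[float(i), float(i)] for i in range(10)]
toy_gaps = item_mean_gaps(toy_rows)
```

A large positive gap for an item suggests that it separates high scorers from low scorers. In practice each gap would be tested for significance, as was done in the study.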

Reliability

The Cronbach alpha coefficient was calculated for each dimension of the Chinese OPAQ-SV, resulting in 0.975 for physical function, 0.861 for emotional status, and 0.823 for symptoms. The Cronbach alpha coefficient was 0.970 for the entire Chinese OPAQ-SV, all indicative of good internal consistency. Split-half coefficient reliability was 0.868 and was between 0.697 and 0.956 for the three dimensions. Test-retest reliability of the Chinese OPAQ-SV across the three dimensions produced robust results. Item mean scores for physical function, emotional status, and symptoms were 48.51, 50.58, and 41.25, respectively. After the second week, item mean scores for physical function, emotional status, and symptoms were 48.38, 48.38, and 40.00, respectively. Pearson’s rank-order correlations for physical function, emotional status, and symptoms were 0.995, 0.984, and 0.992, respectively (see Table 2).

Table 2 Reliability analysis of the OPAQ-SV

Discriminant validity

Health status scores for the three dimensions of the Chinese OPAQ-SV across females with and without fractures are compared in Table 3. For females with fractures, the mean (SD) health status scores for physical function, emotional status, and symptoms were 59.73 (29.04), 53.66 (26.93), and 44.03 (15.61), respectively. Females without fracture indicated a higher mean (SD) health status score across dimensions: 73.34 (24.49), 63.50 (26.98), and 55.01 (21.45), respectively. Significant differences were revealed between females with fractures and those without fractures across all dimensions even after adjustment by age (p < 0.001).

Table 3 Comparison of OPAQ-SV dimensions in females with and without fractures

Construct validity

Content validity was assessed by determining the correlation between item scores and the corresponding dimension and the dimension with the total Chinese OPAQ-SV score. Strong correlations were evident. Correlation coefficients between 19 items and physical function were 0.539 to 0.904, between 11 items and emotional status were 0.401 to 0.745, and between four items and symptoms were 0.669 to 0.896. In addition, Table 4 illustrates the correlation coefficients between the dimensions and the overall Chinese OPAQ-SV scores: value for physical function was 0.974; for emotional status, it was 0.833; for symptoms, it was 0.753. Values suggested a moderate to strong correlation (p < 0.01), indicating good content validity.

Table 4 Correlation coefficient of each dimension and the total scale (N = 234)

The Chinese OPAQ-SV data were suitable for exploratory factor analysis, with satisfactory results for Bartlett’s test of sphericity (χ² = 8320.075, df = 561, p < 0.001) and the Kaiser-Meyer-Olkin test (KMO = 0.946). Factor analysis was carried out to examine the factorial structure. The number of factors was determined through graphic analysis of a scree plot and simple structure analysis [27]. Analysis revealed a 6-factor solution explaining 75.847 % of the total variance. Exploratory factor analysis of the 34 items produced factor loadings from 0.543 to 0.892 and item communalities from 0.512 to 0.890. Factor 1 explained 51.279 % of the variance and covered the “walking/bending,” “transferring,” and “daily activities” domains (except item 12). Factor 2 covered the “fear of falls” domain. Factor 3 covered the “independence” domain and included “daily activities.” Factors 4 to 6 covered the “back pain” and “body image” domains, but factor 6 also, potentially, covered daily activities. This 6-factor structure is slightly different from the original structure of seven domains.

In the second-order 6-factor model, the CFI was 0.777 and the chi-squared test value was 2.91, which exceeds the benchmark of 2. Therefore, the model was modified. In the modified model, the fit indexes were acceptable: (a) the chi-squared test value was 1.87, less than the benchmark of 2; (b) the RMSEA was 0.078, less than 0.08; and (c) the CFI was 0.922, exceeding the benchmark of 0.90.

The three domains of the OPAQ-SV were compared to four domains of the SF-12. In the SF-12, those domains were physical functioning, role-physical, bodily pain, and mental health. The Spearman’s r values for the corresponding comparisons were as follows: physical function, 0.778; role-physical, 0.770; emotional status, 0.515; and symptoms, 0.621. Judged by Spearman’s r as a measure of construct validity, the data suggest a moderate to strong correlation between the OPAQ-SV and the SF-12 across all domains (see Table 5).

Table 5 The correlations of OPAQ-SV with SF-12 in corresponding dimensions (N = 234)

Discussion

Osteoporosis is thought to be a silent disease, yet osteoporosis with fractures produces pain and other negative effects on patients’ physical and emotional functioning, adversely impacting their QOL. For this reason, precise QOL assessment instruments are needed to properly quantify the burden brought on by osteoporotic fractures. Most of the available instruments exist in several languages. We translated and validated a Chinese version of the OPAQ-SV for discerning health-related QOL in females coping with osteoporotic fractures and who live in China. To our knowledge, this is the first validation of an OP-targeted female-specific instrument for use in China.

Given the different language and culture, it was important to take a comprehensive approach to validating the OPAQ-SV [28]. Like Silverman, who tested the original version and verified strong psychometric properties [15], we found strong support for the reliability and validity of the Chinese OPAQ-SV, in part because of our comprehensive validation procedures. Functional assessment scales with Cronbach alpha values above 0.7 are considered adequate for internal consistency [20]. The Chinese OPAQ-SV produced values ranging from 0.823 to 0.975. Silverman, originator of the OPAQ-SV, reported coefficients ranging from 0.72 to 0.92. Lips et al. [29] published internal consistency values for the Quality of Life Questionnaire of the European Foundation for Osteoporosis ranging from 0.72 to 0.92.

In addition, correlation coefficients between dimension scores and the entire Chinese OPAQ-SV scores indicated good content validity. Furthermore, construct validity was confirmed by the EFA and supported by the scree plot analysis and the CFA. Construct validity results suggest that the structure of the Chinese version of the instrument is similar to the original version. Exploratory factor analyses also revealed that the extracted components were similar to the original domains. Results were further supported by the Kaiser-Meyer-Olkin test (KMO = 0.946) and cumulative variability (75.847 %).

Nevertheless, there are several limitations. First, patients were recruited using convenience sampling from tertiary hospitals in northwest China. It is likely that patients who were motivated to complete the OPAQ-SV differed from patients who would have been selected by random sampling. Patients from other types of hospitals should have been included, not just patients from tertiary hospitals. Second, discharged patients living at home and relying on family members for everyday care should have been included. Finally, the questionnaire was interviewer-administered for illiterate participants, which may lead to social desirability bias, particularly when addressing sensitive topics or mental health issues [30]. Although no significant difference in responses was found between the two modes, this is still an important point to consider in future applications of the Chinese OPAQ-SV.

Conclusion

QOL instruments provide an efficient, standardized approach to explore an individual’s sense of well-being and ability to carry out activities of daily life. The disease-targeted OPAQ-SV is more suitable than generic instruments for clinical assessment of health-related QOL in female patients with osteoporosis. Given our validation results, we conclude that the Chinese OPAQ-SV can and should be used in this patient group. Furthermore, we anticipate that this newly cross-culturally adapted and validated instrument can facilitate international research collaboration between Chinese and English-speaking clinicians interested in addressing the QOL needs of patients suffering from osteoporotic fractures.