Introduction

Traumatic shoulder instability occurs in 1.7% of the general population [1]. It is observed more frequently in physically active individuals under the age of 40 [2]. About 98% of the patient population with instability has anterior dislocation [3]. Following a first-time traumatic dislocation, recurrent instability is observed 3.2 times more often in males than in females [2].

Joint stability is provided by static stabilizers such as the glenoid labrum, ligaments, and capsule, and by dynamic stabilizers including the biceps tendon, rotator cuff, and scapulothoracic muscles. Insufficiency of these structures causes instability [4]. The humeral head cannot maintain its position on the glenoid fossa during upper extremity movements, which causes anatomical and physiological anomalies and losses, functional limitations, and thus dysfunction and disability in the patients [5].

The fact that complaints are not permanent and that pain occurs only in certain physical activities makes the evaluation of patients with shoulder instability difficult [6]. Recent evaluation approaches aim to determine the patients' functional status, and thus their quality of life, in addition to the clinical evaluation [7–9]. Subjective evaluation scales used in clinics to evaluate patients with shoulder instability are the Disabilities of Arm, Shoulder and Hand Scale (DASH) [10], Rating Sheet of Bankart Repair (Rowe Score) [11], Oxford Shoulder Instability Questionnaire (OSIQ) [12], Melbourne Instability Shoulder Scale (MISS) [13], Western Ontario Rotator Cuff Index (WORC) [14], and Western Ontario Shoulder Instability Index (WOSI) [15].

DASH is a patient-reported outcomes measure (PROM) developed for upper extremity in order to measure the physical function and symptoms in musculoskeletal disorders induced by upper extremity joints. The scale is composed of three sections labeled as symptoms (5 items), functions (25 items), and work and sports (8 items). Each item has five response options [10]. Rowe Score is a disorder-specific scoring questionnaire conducted by the clinician. It was developed by Rowe et al. in order to evaluate the long-term results of Bankart repair. The Rowe Score is calculated over a total of 100 points divided into three domains: (1) stability, which corresponds to a total of 50 points; (2) mobility, which corresponds to 20 points; and (3) function, which corresponds to 30 points [11]. OSIQ is a disorder-specific PROM developed by Dawson et al. [12] in order to evaluate the treatment results of the patients with shoulder instability. It is composed of a total of 12 items, and each item is scored on a scale from 1 to 5 [12]. WORC is a PROM that measures the quality of life in patients with rotator cuff disorder. It consists of five sections, namely pain and physical symptoms (6 items), sports and recreation (4 items), work (4 items), lifestyle (4 items), and emotions (3 items). Questions are answered on a 100-mm visual analog scale [14, 16]. WOSI is a disorder-specific PROM developed by Kirkley et al. in accordance with the methodology outlined by Kirschner and Guyatt in order to evaluate the treatment results of the patients with shoulder instability. It consists of four main sections, namely physical symptoms (10 items), sports/recreation/work-related activities (4 items), lifestyle (4 items), and emotional well-being (3 items). Questions are answered on a 100-mm visual analog scale [15, 17].
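As an illustration of the WOSI structure described above (21 items in four sections, each answered on a 100-mm visual analog scale), the following minimal Python sketch adds up section and total scores. It assumes that the responses are supplied in section order and that scores are obtained by simple summation; both are assumptions made for illustration, and the function and variable names are hypothetical.

```python
# Item counts per WOSI section, as listed in the text.
WOSI_SECTIONS = {
    "physical_symptoms": 10,
    "sports_recreation_work": 4,
    "lifestyle": 4,
    "emotional_well_being": 3,
}


def score_wosi(responses_mm):
    """Sum 21 VAS responses (0-100 mm each) into section and total scores.

    Assumption: responses are ordered by section and scores are plain sums.
    """
    expected = sum(WOSI_SECTIONS.values())
    if len(responses_mm) != expected:
        raise ValueError(f"expected {expected} responses, got {len(responses_mm)}")
    if any(not 0 <= r <= 100 for r in responses_mm):
        raise ValueError("each response must lie between 0 and 100 mm")

    scores = {}
    start = 0
    for section, n_items in WOSI_SECTIONS.items():
        # Slice out the items that belong to this section and add them up.
        scores[section] = sum(responses_mm[start:start + n_items])
        start += n_items
    scores["total"] = sum(responses_mm)
    return scores
```

Under these assumptions, the total score ranges from 0 to 2100 mm.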

Self-administered questionnaires have been recommended as effective measures that capture patients' own perceptions of their functional status and symptoms [18]. Although there is no gold standard for shoulder instability, DASH, Rowe Score, MISS, OSIQ, WORC, and WOSI were shown to be reliable and are used for patients with shoulder instability [14, 15, 19, 20]. Due to the similarities between the compared parameters and the availability of Turkish versions, DASH, OSIQ, and WORC were used for validity analyses [16, 21, 22]. However, WOSI is simpler, more effective, and more responsive compared to the other instruments [6, 15, 23]. Although MISS has similar questions to WOSI, it is not as comprehensive as WOSI [15]. DASH can be used for all patients with upper extremity problems, but it is less responsive than condition-specific questionnaires. In the Rowe Score, the evaluation of apprehension is not clearly defined and pain is not measured specifically [23].

Recent research shows WOSI to be among the best of all patient-reported outcomes measures (PROMs) for patients with a shoulder disorder [19]. It was also reported that this scale should be opted for since a 0–10 numeric scoring system is more sensitive to changes in the status [19]. Since it examines the patient in detail under four main sections, WOSI gives the clinician a chance for a detailed examination to reveal the functional status. We believe that the use of WOSI in clinical studies will prove effective, as it is highly sensitive in patients with shoulder disorders and is able to evaluate the effectiveness of treatment. The purpose of this study was to evaluate the cultural adaptation, validity, and reliability of WOSI in the Turkish population with shoulder instability.

Methods

Permission was obtained from the author who developed the original scale to use it in our study. The study was approved by the Local Ethics Commission (dated 18 June 2015, reference number: 77082166-604.01.02).

Translation and Cross-cultural Adaptation Process

Translation and cultural adaptation of the scale were completed following the stages described by Beaton et al. [24]. The original scale was translated into Turkish by two native Turkish speakers, one working in healthcare and one from outside the field. Both translators were fully competent in both languages. The translators merged the two Turkish translations into a single translation. This Turkish version was translated back into English by two independent professional bilingual translators. A committee consisting of four translators and one Turkish linguist finalized the Turkish version of WOSI by comparing the first and the last translation. After the committee decided that the original WOSI and its Turkish version could be considered equivalent, the comprehensibility of the final Turkish version was first tested on 15 patients with shoulder instability and 15 healthy individuals of similar age and physical characteristics. The 15 patients had been diagnosed with shoulder instability following examination by an orthopedist at Gazi University Hospital and met the inclusion criteria. They completed the WOSI under the guidance of a physiotherapist. Following the successful completion of these processes, the final Turkish version of WOSI (WOSI-T) was evaluated in terms of reliability and validity.

Subjects

The study was conducted with 74 patients with shoulder instability (anterior, posterior, or multidirectional) who were admitted to the Department of Orthopedics and Traumatology in Gazi University Hospital and who agreed to participate in the study. Fourteen patients were later excluded from the study as they did not meet the inclusion criteria or declined to participate in the study (Fig. 1). The number of participants in the sample was determined following the recommendation of Altman, who states that the minimum number of patients must be 50 for methodological comparison [25]. The inclusion criteria were as follows: (1) being eighteen years old or older, (2) being diagnosed with symptomatic shoulder instability, whether anterior, posterior, or multidirectional, traumatic or non-traumatic, (3) being a native Turkish speaker and being able to read Turkish, and (4) receiving no treatment between test–retest assessments. The exclusion criteria were as follows: (1) inability to complete the form due to significant psychiatric or psychological disorder, (2) having a neurological disease, (3) having systemic inflammatory conditions, and (4) having neoplastic disorders or cervical radiculopathy and thoracic outlet syndrome. All the patients were administered the WOSI-T, DASH, WORC, OSIQ, and Rowe Score questionnaires. In order to determine test–retest reliability, 30 patients took the WOSI-T again 72 h later.

Fig. 1 Flow diagram of the patients

Statistical analysis

Statistical Package for Social Sciences (SPSS) 22.0 was used to conduct the statistical analyses. Data were expressed as mean ± standard deviation and percentages. The reliability of the WOSI scale was assessed through test–retest and internal consistency analyses. Test–retest reliability was calculated using the Intraclass Correlation Coefficient (ICC), while internal consistency was determined by Cronbach’s Alpha coefficient. The cut-off value for Cronbach’s Alpha was 0.70, and an ICC value of 0.80 or higher was considered significant [26, 27].
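For readers who wish to reproduce the internal consistency analysis outside SPSS, Cronbach's Alpha can be sketched in a few lines of Python. This is a minimal illustration using the standard formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores), not the SPSS routine used in the study; the function name and the example data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's Alpha for a table with one row per respondent
    and one column per questionnaire item."""
    if len(item_scores) < 2:
        raise ValueError("at least two respondents are required")
    k = len(item_scores[0])
    if k < 2:
        raise ValueError("at least two items are required")

    # Sample variance of each item (column) across respondents.
    item_var_sum = sum(variance(col) for col in zip(*item_scores))
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in item_scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")

    return k / (k - 1) * (1 - item_var_sum / total_var)


# Hypothetical responses: 4 respondents x 3 items.
example = [
    [20, 25, 30],
    [40, 45, 35],
    [60, 55, 65],
    [80, 85, 75],
]
alpha = cronbach_alpha(example)
print(f"alpha = {alpha:.2f}, meets 0.70 cut-off: {alpha >= 0.70}")
```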

The validity of the scale was evaluated in terms of construct validity. Construct validity was examined through convergent validity. For convergent validity of the scale, WOSI total score and the total scores of Rowe Score, OSIQ, DASH, and WORC were compared. Moreover, the correlation between the sub-parameters of WOSI and WORC was calculated. Pearson correlation coefficient was used for this analysis, and coefficients ranging between 0.81 and 1.00 were considered excellent, while coefficients between 0.61 and 0.80; 0.41 and 0.60; 0.21 and 0.40; and 0 and 0.20 were considered as very good, good, weak, and bad, respectively [28]. All values were considered significant at p < 0.05.
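The correlation bands quoted above can be expressed as a small Python sketch. It computes a Pearson coefficient from paired scores and maps its magnitude onto the five labels. Applying the bands to the absolute value and assigning values that fall between two quoted bands (for example 0.805) to the lower band are assumptions of this sketch; the function names are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant scores")
    return sxy / sqrt(sxx * syy)


def interpret_r(r):
    """Label the strength of a correlation using the bands in the text."""
    magnitude = abs(r)  # assumption: the sign does not affect the label
    if magnitude > 1:
        raise ValueError("a correlation coefficient lies in [-1, 1]")
    if magnitude >= 0.81:
        return "excellent"
    if magnitude >= 0.61:
        return "very good"
    if magnitude >= 0.41:
        return "good"
    if magnitude >= 0.21:
        return "weak"
    return "bad"
```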

Results

Patient data

Of the 60 patients with shoulder instability participating in the study to determine the validity and reliability of the Turkish version of WOSI, 14 were females (23.3%) and 46 were males (76.7%). All the participants filled out the evaluation forms and no missing data were encountered in the study. Detailed demographic information about the participants is presented in Table 1.

Table 1 Demographics of the patients

Translation and cultural adaptation

Translation and cultural adaptation process was completed in accordance with the procedure outlined above and no problems were encountered at this stage.

Internal consistency

As a result of the internal consistency analysis, Cronbach’s Alpha coefficient was found to be 0.91. This value indicates that the scale has a high level of internal consistency. When the Cronbach’s Alpha coefficient was computed separately for each sub-parameter of WOSI, it was seen that the lifestyle parameter had a lower coefficient (0.77) compared to the other sub-parameters. This value was still not found to be below the cut-off value (Table 2).

Table 2 Internal consistency, test–retest, and floor–ceiling effect analyses

Reproducibility

For the test–retest analysis of the WOSI scale, a 72-h time interval was considered to be appropriate, and 30 participants were included in the test. As a result of the analysis, the ICC value was computed for the total score of the scale, which was found to be fairly high (0.97). Similarly, high ICC values were recorded in physical symptoms, sports, recreation, and work-related activities, lifestyle, and emotional well-being sub-parameters (0.83–0.97) (Table 2). The results revealed that both the sub-parameters and the total score of the scale and thus the WOSI scale itself are stable over time.
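To illustrate how a test–retest ICC can be computed, the Python sketch below implements one common form: the two-way random-effects, absolute-agreement, single-measure coefficient, often written ICC(2,1). The ICC model used in the study is not stated, so this choice is an assumption; the function name and the data are hypothetical.

```python
def icc_2_1(scores):
    """ICC(2,1) for a table with one row per patient and one column
    per measurement session (for example test and retest)."""
    n = len(scores)       # number of patients
    k = len(scores[0])    # number of sessions
    if n < 2 or k < 2:
        raise ValueError("need at least two patients and two sessions")

    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(col) / n for col in zip(*scores)]

    # Two-way ANOVA sums of squares.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical test and retest totals for five patients.
test_retest = [[900, 880], [1200, 1250], [400, 420], [1500, 1490], [700, 690]]
print(f"ICC(2,1) = {icc_2_1(test_retest):.2f}")
```

Because this form measures absolute agreement, a systematic shift between the two sessions lowers the coefficient even when the ranking of patients is unchanged.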

Floor and ceiling effects

The analysis of the worst and best possible scores, which is considered an important measure of scale sensitivity (floor and ceiling effects) in version studies, showed no floor or ceiling effect (threshold: 15%) [29] in the sub-parameters (0–4.9%) or in the total score (0%) (Table 2).
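The floor and ceiling check can be sketched in Python as the share of patients who obtain the lowest or the highest possible score, compared against the 15% threshold. Treating an effect as present only when the share exceeds (rather than equals) the threshold is an assumption of this sketch, and the function name, score range, and data are hypothetical.

```python
def floor_ceiling(scores, min_score, max_score, threshold=0.15):
    """Return the proportion of scores at the minimum and maximum,
    and whether each proportion exceeds the threshold."""
    if not scores:
        raise ValueError("scores must not be empty")
    n = len(scores)
    floor_share = sum(1 for s in scores if s == min_score) / n
    ceiling_share = sum(1 for s in scores if s == max_score) / n
    return {
        "floor_share": floor_share,
        "ceiling_share": ceiling_share,
        "floor_effect": floor_share > threshold,
        "ceiling_effect": ceiling_share > threshold,
    }


# Hypothetical total scores on an assumed 0-2100 range.
totals = [0] + [1000] * 19
print(floor_ceiling(totals, min_score=0, max_score=2100))
```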

Construct validity

Construct validity of the scale was examined in terms of convergent validity. In order to analyze convergent validity, correlation analysis was conducted between WOSI-T and DASH, Rowe Score, OSIQ, and WORC questionnaires. The total score of the WOSI scale was found to have a good negative correlation with the Rowe Score (−0.57), and a very good and excellent correlation with DASH, OSIQ, and WORC questionnaires (0.67–0.89) (Table 3). Similarly, a very good–excellent relationship was observed between the sub-parameters of WOSI and WORC (0.69–0.83) (Table 4).

Table 3 Correlation values of WOSI with other questionnaires
Table 4 Pearson Correlation analysis* of WOSI and WORC Questionnaires’ Sub-parameters

Discussion

The assessment of the effectiveness of the treatment with respect to functional condition and quality of life after shoulder injuries is a commonly used method in the clinic. The functional condition can be determined by implementing objective tests or administering questionnaires. Although DASH, Rowe Score, OSIQ, and WORC are often used as scoring methods in shoulder problems, they are not the scales specific to shoulder instability, except for the Rowe Score. Two of the three subscales of the Rowe Score are scored by the clinician. In this respect, WOSI, which has disorder-specific up-to-date criteria, will bring advantages in determining quality of life. Also, it was necessary to make the Turkish adaptation and test the validity and reliability of the scale in order to interpret the data pertaining to the Turkish society and to compare the data obtained from WOSI as the common language in both meta-analyses and international studies and meetings.

Cronbach’s Alpha coefficient, which indicates the internal consistency, was found to be excellent (0.91). Similarly, Cronbach’s Alpha coefficient was found to be 0.88–0.90 (n = 22) in the Swedish version, 0.93 (n = 64) in the Italian version, 0.92 (n = 49) in the German version, and 0.93–0.96 (n = 138) in the Dutch version of the scale [6, 30–32]. The internal consistency of the WOSI-T could not be compared with that of the original WOSI, as the original does not report this coefficient. It was concluded that the Turkish version of the scale can be used in the clinic as the Cronbach’s Alpha coefficient was above 0.90. When the Cronbach’s Alpha coefficient for each sub-parameter of WOSI was analyzed, it was found that the lifestyle parameter had the lowest coefficient (0.77), which was nevertheless above the cut-off value. Similarly, the coefficient of the lifestyle parameter was lower than that of the other parameters and below the cut-off value in the Swedish (0.56) and German (0.68) versions, while it was much higher in the Dutch version (0.94).

For the test–retest analysis of the WOSI scale, the questionnaire was administered twice, 72 h apart. The average total score of the scale was 101.9 ± 35.0 in the first measurement and 99.1 ± 39.5 in the second measurement. The calculated ICC value was high (0.97). Gaudelli et al., who developed the French version, reported an ICC value of 0.84 (95% CI 0.78–0.88) and a correlation coefficient of r = 0.85 (p = 0.01) for the questionnaire, which was administered to 116 patients after 16 days [33]. In the Swedish version, which was administered to 32 patients after 2–3 months, the ICC value was 0.94 [6]. In the Dutch version, which was retested with 99 patients after 13 days, the value was 0.92 (0.88–0.95) [32]. In the Italian version, which was retested with 64 patients at 3-day intervals, the ICC value was 0.96 (95% CI 0.90–0.97) [30]. Finally, it was 0.91 in the Japanese version, which was retested 2 weeks later [34]. In the Turkish version, high ICC values (0.83–0.97) were recorded in the physical symptoms, sports/recreation/work-related activities, lifestyle, and emotional well-being sub-parameters. Apart from the original WOSI, the sub-parameters were analyzed in only four versions. In the English version, the ICC value for the sub-parameters was 0.72–0.94 [15], while it was between 0.88 (0.81–0.92) and 0.90 (0.85–0.93) in the Dutch version [32], between 0.85 and 0.91 in the Swedish version [6], between 0.87 and 0.93 in the German version [31], and finally between 0.64 and 0.86 in the Japanese version [34]. In the original WOSI, which was repeated after 2 weeks for the reliability analysis, the ICC value for the total WOSI was found to be 0.96, and it was found to be between 0.71 and 0.94 (2-week period) for the sub-parameters [15]. High ICC values in the Turkish version indicate that the translation of WOSI into Turkish did not change the features of the scores to a large extent.

In order to determine the changes in the clinical conditions of the patients more precisely, questionnaires are expected not to have floor and ceiling effects. No floor and ceiling effects were detected in the Turkish version of WOSI (n = 60, 0–0%). This may be attributed to the fact that WOSI is scored on a VAS rather than on a Likert scale.

As there is no self-report scale specifically designed for patients with shoulder instability in the clinic other than OSIQ, construct validity analysis was conducted using the Rowe Score, OSIQ, DASH, and WORC questionnaires. In the Italian and German versions, SF-36 was used because the original WOSI and sub-parameters were similar, and the correlations with the original WOSI were found to be low [30, 31]. The reason for this is that SF-36 is more related to the general health condition, rather than being disorder specific. Therefore, in the Turkish version, instead of SF-36, WORC was used because it is a scoring questionnaire specifically designed for shoulder pathologies. In addition, although the first two parameters of the Rowe Score are scored by the clinician and the third one by the patient, it was still used in the analysis of the study. Moreover, OSIQ, DASH, and WORC questionnaires were preferred because of the existence of their Turkish validations. The total score of the WOSI scale was found to have a good negative correlation with the Rowe Score (−0.57) and a very good–excellent correlation with OSIQ, DASH, and WORC questionnaires (0.67–0.89). WOSI and Rowe Score values were reported to have a medium correlation (0.627) in the German version [31], and WOSI and OSIQ scores were reported to have a high correlation (0.82) in the Dutch version [32]. In this study, the lowest correlation was between WOSI and Rowe Score values. The correlation (0.6) between WOSI and DASH in the original WOSI is similar to that in the Turkish version (0.67). The same correlation was reported to be 0.81 in the Dutch version [32]. In its broad sense, the correlations obtained for construct validity in both the original version and the other versions are similar to the correlation values in our study. Besides, very good–excellent correlation was found between Turkish WOSI and the sub-parameters of WORC (r = 0.69–0.83). The results derived as part of this analysis support the validity of the Turkish questionnaire.

The study has certain limitations. Responsiveness analysis, which is a significant measure in determining the sensitivity to clinical changes in health-related quality of life questionnaires, was not conducted in our study. We believe that the WOSI, whose clinical responsiveness was determined in other language versions, needs to be analyzed from this aspect as well. Another limitation of our study is that the sample size might not be adequate to perform a factor analysis to assess construct validity from another aspect. Future studies may conduct responsiveness and factor analyses to support the clinical significance of the Turkish WOSI.

Conclusion

The Turkish version of WOSI is a valid and reliable scale for use in studies evaluating the final condition of the patients with shoulder instability.