Introduction

Vocational rehabilitation (VR) is an integral part of a disability management strategy and is essential to support workers and persons with disabilities willing to work to “secure, retain and advance in suitable gainful work participation”. Such a process will, in turn, enhance the workers’ integration or reintegration into society [1]. VR interventions not only encourage the person’s participation in daily living and society but also reduce the days of sick leave and help to prevent health-related impairments from becoming chronic [2]. Individuals who benefit from VR not only have a primary health condition but also suffer from various comorbidities, leading to a complex level of disability [3,4,5]. Moreover, overall problems in functioning are assumed to contribute to decreased or lost workability [6, 7], where workability has been described as “the balance of the workers’ resources and work demands”. In this context, resources or deficits refer to “functional abilities, professional skills and professional values” [8]. Measures that mitigate work disability are critical to ensure that the work functioning of an individual is restored to the optimal level possible.

For VR to be effective, a carefully planned and comprehensive vocational assessment that addresses all relevant factors is crucial. Identifying the individuals who need appropriate return-to-work support, optimizing the timing, and selecting the best possible VR interventions require a comprehensive understanding of the complex relationship between work and health resources [9,10,11].

Previous studies showed that the International Classification of Functioning, Disability and Health (ICF), an integrative biopsychosocial model of the World Health Organization (WHO), provides a suitable reference framework to address the complexity and multi-faceted nature of VR [12, 13]. ICF core sets, abbreviated lists of relevant domains based on the ICF, have been developed over the years to make the application of the ICF to specific health care settings practical and easy [14]. The categories of the ICF core set for VR were selected systematically and empirically and were intended to describe work-related functioning independent of a health condition, the respective VR setting or a specific time point throughout the return-to-work continuum.

While the ICF core sets can provide the basis for which domains of functioning should be assessed, they do not necessarily specify how such an assessment can be undertaken. Hence, we searched for a patient-centered measure that can be used to assess the ICF core set for VR. Because no such instrument was found during our search, efforts were made to develop a new questionnaire. The first version of the Work Rehabilitation Questionnaire (WORQ) is interviewer-administered. WORQ consists of two parts: part one includes sociodemographic information, environmental factors, and work-related information, and part two contains items on work-related functioning covering body functions and activities and participation domains. The development process of WORQ is described in detail elsewhere [15]. Feasibility tests of WORQ showed high patient satisfaction with the questionnaire. However, the interviewing health and work professionals raised the question of whether a patient-reported version of WORQ could also be made available to enhance feasibility and ease of use [16, 17]. To address the lack of such an instrument, a self-reported version of WORQ was developed. In the self-reported version, the changes mainly concern the introductory text, a shift of three support items from part two to part one, and the wording of items related to personality and temperament functions (http://www.mywork.org).

Switzerland has four official national languages. The majority of the population (63%) speaks German; hence, the first version of WORQ was made available in German. However, the data collection for testing its psychometric properties is still ongoing. The second largest language group in Switzerland speaks French (23% of the population) [18]. Because there was no French version of WORQ, a study was undertaken to ensure the applicability of the questionnaire to the majority of the Swiss population by developing a French version and examining its psychometric properties. Hence, the aim of our study was to translate and cross-culturally adapt the patient-reported version of WORQ to French and to evaluate the fundamental psychometric properties of WORQ-French in a sample of persons undergoing vocational rehabilitation in a specialized center in Switzerland.

Methods

This single-center observational psychometrics study was approved by the ethical committee of the canton Valais in Switzerland (CCVEM 005/15) and conducted according to the principles outlined in the Declaration of Helsinki. All patients signed a written informed consent form before participating in the study.

WORQ Instrument

Part one of WORQ contained items on sociodemographic information and work-related items including age, gender, education, profession, work status, current VR interventions and items about family support, the support provided by the superior/supervisor, and the labor system support. Part two of WORQ has 40 functioning items, each with a response scale of 0–100 (0 = No problem to 100 = Complete problem). The internal consistency of the interviewer-administered version of WORQ showed Cronbach’s alpha of 0.887 and test–retest agreement of 0.789 (Spearman correlation) [15].

Step 1: Translation and Cross-Cultural Adaptation of WORQ

The translation and cross-cultural adaptation were based on guidelines proposed by Beaton et al. and later modified to follow a dual-panel approach [19, 20]. In this approach, the English-language version of WORQ was forward-translated into French in Switzerland by a bilingual panel including two experts in questionnaire development and validation, three bilingual experts (two Swiss-Francophone, one French), one VR specialist and two of the developers (Fig. 1). This first version of WORQ-French was then evaluated by two lay persons (non-patients) as well as three patients undergoing VR interventions due to a shoulder problem, a hip injury and a knee injury [21, 22]. These five individuals completed the questionnaire and were asked to comment on the understandability of WORQ. The panel discussed the findings and, where needed, agreed on adaptations. The version of WORQ modified as a result of this process then underwent pilot testing in the next stage, which consisted of cognitive testing of WORQ by a group of ten patients. These patients were diverse regarding age, gender, diagnosis, and education. In the cognitive testing, the patients were asked to complete the pilot-testing version of WORQ and to provide feedback on the clarity and understandability of the items. They were also asked about the ease of completing WORQ. Finally, the bilingual panel verified the second version of WORQ based on the information from the pretest and finalized the French version of WORQ (http://www.myworq.org).

Fig. 1 Process of cross-cultural adaptation of WORQ to French

Step 2: Psychometric Evaluation of WORQ-French Version

Participants and Procedures

The psychometric evaluation took place in a Swiss rehabilitation teaching hospital (Clinique Romande de Réadaptation, Switzerland). Participants with musculoskeletal injuries were referred from all of the French-speaking cantons of Switzerland, which include urban and industrial city centers as well as more rural regions. The inclusion period lasted from March 2015 to February 2016. Patients were contacted by a research assistant and invited to participate if they (a) were aged between 18 and 65 years, (b) were participating in a vocational rehabilitation intervention due to a musculoskeletal condition, and (c) were sufficiently proficient in French (oral and written) to complete questionnaires. After giving written informed consent, each patient was provided with a patient case report form (CRF-T0) including WORQ.

Variables and Instruments

A patient case report form (CRF-T0), along with WORQ, was completed by the participant upon admission into the VR program in the clinic. The CRF-T0 collected sociodemographic information, such as age, gender, and family status. The CRF-T0 also contained a self-evaluation of general health using a numeric rating scale from 0 to 10 (0 = excellent and 10 = poor), a self-evaluation of general functioning using a numeric rating scale from 0 to 10 (0 = no problem and 10 = extensive problem), as well as an appraisal of the current state of health using the single item of the EQ-5D (a vertical visual analogue scale from 0 to 100 with the bottom anchor “the worst state of health imaginable” and the top anchor “best state of health imaginable”). These variables have proven to be valuable indicators for understanding the health situation of persons in the context of vocational rehabilitation [5, 23]. In addition, the rehabilitation center provided further data that were collected at admission in the context of routine clinical practice. These data included injury location (upper extremity, lower extremity, trunk/back and polytrauma), having a case manager (response option: yes/no), and “case severity”. “Case severity” was determined by the treating physician and based on the Abbreviated Injury Scale (AIS) [24]. The AIS is an anatomically based coding system created by the Association for the Advancement of Automotive Medicine to classify and describe the severity of injuries. Finally, a variable reflecting the patient’s expectation of being back at work within 6 months (response option: yes/no) was recorded [25]. The patient’s expectation has been found to be a strong predictor of the resumption of work in patients with subacute and chronic musculoskeletal conditions [26,27,28].

The Hospital Anxiety and Depression Scale (HADS) was used to detect anxiety and depression, variables that are closely related to rehabilitation outcome and successful return to work [29,30,31]. The HADS is a 14-item scale that was developed by Zigmond and Snaith [32] and is commonly used to determine the levels of anxiety and depression that a patient is experiencing. It was initially created to detect the perception of anxiety and depression in people with physical health restrictions. Seven of the items relate to anxiety and seven relate to depression. A subscore of 8–10 is considered borderline, and a subscore of 11 or more is considered indicative of an anxiety or depression disorder.
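
The HADS cut-offs described above can be written as a simple classification rule. The following is a minimal Python sketch for illustration only: the function name `classify_hads_subscore` and the label strings are assumptions, and the label "normal" for subscores below 8 is not stated in the text. Each HADS subscale has seven items scored 0 to 3, so a subscore lies between 0 and 21.

```python
def classify_hads_subscore(subscore: float) -> str:
    """Classify one HADS subscale score (anxiety or depression).

    Cut-offs follow the text: 8-10 is borderline, 11 or more indicates
    a disorder. The "normal" label for scores below 8 is an assumption.
    """
    if not 0 <= subscore <= 21:
        raise ValueError("HADS subscores range from 0 to 21")
    if subscore >= 11:
        return "disorder"
    if subscore >= 8:
        return "borderline"
    return "normal"  # assumed label, not named in the text


if __name__ == "__main__":
    for value in (5, 8, 10, 11):
        print(value, classify_hads_subscore(value))
```

The function also accepts non-integer values, so it can be applied to a group mean as well as to an individual subscore.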

Usability

Usability of WORQ-French was tested in a group of ten patients who participated in the psychometric testing. Understandability of the items in WORQ was addressed by asking “Did you have any difficulties understanding these questions? If YES, please write down which question/s”. For suitability of response options, the question was: “Did the response options make sense to you? Please comment”. For feasibility and appropriateness of length of the entire questionnaire, the participants were asked to choose one of the following options: “too long”, “a little long”, “a good length”, “a little short”, and “too short”.

Reliability

Internal Consistency

Internal consistency of WORQ-French was examined based on data from the full study sample at admission, using Cronbach’s alpha. Cronbach’s alpha is a general coefficient of homogeneity between items. Values for the coefficient α can range from 0 (no internal consistency) to 1 (perfect internal consistency). Coefficients above 0.5 are considered moderate, above 0.75 good, and above 0.9 excellent [33].
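
As an illustration of how this coefficient is obtained, the following minimal Python sketch computes Cronbach’s alpha from complete item responses as α = k/(k − 1) × (1 − sum of item variances / variance of the total score), where k is the number of items. The function name `cronbach_alpha` and the toy data are illustrative assumptions; this is not the analysis code used in the study, which was run in SPSS.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for complete data.

    rows: one list of k item scores per respondent (no missing values).
    """
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    if any(len(r) != k for r in rows):
        raise ValueError("every respondent must answer all k items")
    # Sample variance of each item across respondents
    item_variances = [variance([r[i] for r in rows]) for i in range(k)]
    # Sample variance of the respondents' total scores
    total_variance = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


if __name__ == "__main__":
    toy = [[1, 2], [2, 1], [3, 3]]  # three respondents, two items
    print(round(cronbach_alpha(toy), 3))
```

Because alpha depends on the number of items k, a long questionnaire can reach a high value even when individual items correlate only moderately.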

Test–Retest Reliability

The first 50 study participants were invited to complete WORQ-French for a second time 7 days after their initial completion to evaluate test–retest reliability. An average score was calculated for each participant by summing the scores of the 40 equally weighted items from part two and dividing by the total number of items answered. As a single exception, item 34 “Overall in the past week, to what extent did you have problems with driving a car or any form of transportation?” had an additional answer option, “not applicable”. This option related to the situation of a person without a driver’s license or without the possibility to drive because of not having a car. In this case, a problem with driving would not relate to a disability. When “not applicable” was selected, the item was excluded from the score calculation, i.e., the average was calculated over a total of 39 items. Test–retest reliability was calculated based on the average score using the intraclass correlation coefficient ICC2,1 [34].
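
The scoring rule and the reliability coefficient described above can be sketched as follows. This is a minimal Python illustration, not the study’s analysis code: the function names `average_score` and `icc_2_1` are hypothetical, `None` is assumed to encode a “not applicable” answer, and the ICC is computed from two-way ANOVA mean squares in the two-way random-effects, absolute-agreement, single-measurement form that is commonly denoted ICC(2,1).

```python
def average_score(responses):
    """Mean of the answered items; None marks "not applicable" (item 34)."""
    answered = [r for r in responses if r is not None]
    if not answered:
        raise ValueError("no answered items")
    return sum(answered) / len(answered)


def icc_2_1(scores):
    """ICC(2,1) for a table with one row per participant and one column
    per measurement occasion (here: test and retest)."""
    n, k = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    # Sums of squares for participants, occasions, and residual error
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)                # between-participants mean square
    msc = ss_cols / (k - 1)                # between-occasions mean square
    mse = ss_error / ((n - 1) * (k - 1))   # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


if __name__ == "__main__":
    print(average_score([2, 4, None, 6]))       # item 3 not applicable
    print(icc_2_1([[3.0, 3.5], [5.0, 4.5], [7.0, 7.5], [2.0, 2.0]]))
```

Because this form measures absolute agreement, a systematic shift between test and retest lowers the coefficient even when participants keep the same rank order.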

Precision

Floor and ceiling effects were considered to be present if more than 15% of participants achieved either the lowest or highest possible scores, respectively [35].
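
The 15% criterion can be checked directly from the distribution of scores. The sketch below is a minimal Python illustration under stated assumptions: the function name `floor_ceiling_effects` is hypothetical, and the lowest and highest possible scores are passed as parameters because they depend on the response scale in use.

```python
def floor_ceiling_effects(scores, lowest, highest, threshold=0.15):
    """Flag floor and ceiling effects.

    An effect is flagged when more than `threshold` (default 15%) of the
    participants obtain the lowest or the highest possible score.
    """
    if not scores:
        raise ValueError("no scores given")
    n = len(scores)
    floor_share = sum(1 for s in scores if s == lowest) / n
    ceiling_share = sum(1 for s in scores if s == highest) / n
    return {
        "floor_share": floor_share,
        "ceiling_share": ceiling_share,
        "floor": floor_share > threshold,
        "ceiling": ceiling_share > threshold,
    }


if __name__ == "__main__":
    example = [0, 0, 5, 5, 5, 5, 5, 5, 5, 5]  # 20% at the lowest score
    print(floor_ceiling_effects(example, lowest=0, highest=10))
```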

The standard error of measurement (SEM) was calculated to assess response stability; the SEM is the amount of variation in a score that can be attributed to measurement error. The SEM was calculated using Cronbach’s α as the reliability coefficient \(R_x\): \(\text{SEM} = \text{SD} \times \sqrt{1 - R_x}\) [36, 37]. The minimal detectable change (MDC), meaning the minimum amount of change in a patient’s average score that is not the result of measurement error, was calculated at the 95% confidence level as \(\text{MDC} = 1.96 \times \text{SEM} \times \sqrt{2}\) [38].
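
The two formulas above translate into a few lines of code. The following minimal Python sketch is an illustration only; the function names `standard_error_of_measurement` and `minimal_detectable_change` are assumptions and not part of the study’s analysis.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - Rx), with Rx a reliability coefficient in [0, 1]."""
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must lie between 0 and 1")
    return sd * math.sqrt(1 - reliability)


def minimal_detectable_change(sem, z=1.96):
    """MDC = z * SEM * sqrt(2); z = 1.96 corresponds to the 95% level."""
    return z * sem * math.sqrt(2)


if __name__ == "__main__":
    # SEM of 0.323 points as reported in the Results section
    print(round(minimal_detectable_change(0.323), 3))
```

With the SEM of 0.323 points reported in the Results, the second function returns approximately 0.895, which matches the MDC reported there. The factor \(\sqrt{2}\) reflects that a change score combines the measurement error of two assessments.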

Validity

Content validity examines the extent to which the domain of interest, in this study work-related functioning, is comprehensively covered by the items in the questionnaire. We examined content validity in interviews by asking patients to comment on the comprehensiveness of WORQ-French in relation to their specific situation in VR.

Given the theoretical and conceptual framework of the ICF on which WORQ is based, WORQ evaluates work-related functioning by taking into account, amongst other things, mental, emotional, physical and movement-related body functions and activities. We therefore assumed a positive moderate association of “work-related functioning” as measured by WORQ with self-evaluated general functioning, which represents the encompassing concept of functioning. Anxiety and depression as measured by the two dimensions of the HADS were expected to correlate only with items in WORQ that address mood and emotion; hence, we assumed a moderate correlation of the HADS scores with the WORQ sumscore in our musculoskeletal population. As functioning is conceptually seen as a determinant of health in the ICF, we assumed that the self-evaluated general health rating and the overall health (VAS 0–100) would also be positively associated with WORQ, and reasonably so given that these are also overarching global concepts. Finally, we expected that patients with better functioning (WORQ) would be more likely to return to work (measured as the patient’s expectation of return-to-work within 6 months) than patients with lower functioning. In contrast, we expected that participants who were supported by a case manager would show greater problems in work functioning than those without, because case management is typically provided in more complex and severe cases.

Statistical Analysis

Descriptive statistics were used to describe the sample. Distribution of data (normality) was tested based on histogram analysis and the Kolmogorov–Smirnov and Shapiro–Wilk tests [39] to determine whether Spearman or Pearson correlation should be used. Values for the coefficient r can range from −1 (perfect negative correlation) through 0 (no correlation) to 1 (perfect positive correlation); a value above 0.7 is considered a high positive correlation [40]. Imputation of missing data in WORQ items was done in RStudio using MissForest, a non-parametric missing value imputation method for mixed-type data [41, 42]. As predictor variables, we used the WORQ variables, supplemented by the following variables: sex, age, self-evaluation of general functioning and self-evaluation of general health. All other calculations were performed using the software package IBM SPSS Statistics for Windows, Version 24.0, released 2016 [43].
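
To make the difference between the two correlation coefficients concrete, the following minimal Python sketch implements both using only the standard library. It is an illustration under stated assumptions, not the study’s analysis code: the function names are hypothetical, the normality tests are not reproduced, and tied values are assumed to receive their average rank in the Spearman calculation.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for idx in order[i:j + 1]:
            ranks[idx] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    x = [1, 2, 3, 4]
    y = [1, 4, 9, 16]  # monotonic but not linear in x
    print(pearson_r(x, y), spearman_rho(x, y))
```

Pearson correlation captures linear association and is usually preferred for approximately normal data, whereas Spearman correlation only requires a monotonic relationship.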

Results

Step One: Translation and Cross-Cultural Adaptation

The bilingual panel experienced no significant problems in finding consensus on the translation of the functioning items in part two of WORQ regarding meaning and style. A discussion on the anchor definitions of the scale led to the translations “Aucun Problème” for “No problem” and “Problème grave” for “Complete problem” as the most appropriate expressions. The cross-cultural adaptation of part one (sociodemographic and work-related items) revealed the need to adapt items to local and regional characteristics and settings, such as school systems and the availability of VR services, in items 6, 7, and 14 (http://www.myworq.org).

In the first evaluation of WORQ-French, the wording of the items required only minor changes, such as in item 12 in part one, where the word “paramedical” was changed to “therapeutic”, and in part two, where the verb “bouger” (to move) was replaced by the verb “se déplacer”. In contrast to the wording of the items, all three patients and one lay person reported serious problems with the visual analog scale (0–100) that was used to rate the functioning items in part two. Therefore, the panel and the developers of WORQ decided to use a numeric rating scale (0–10) instead, while maintaining the same anchor definitions of 0 = No problem to 10 = Complete problem. The participants understood the numeric rating scale of 0–10 better and felt that it gave them a better estimation of how they felt about the item. Two patients reported some problems with answering items from part one related to their work status, work situation and vocational rehabilitation interventions, but all patients reported good understandability of the functioning items in part two of WORQ. Only one person considered the numeric rating scale not meaningful for completing the items. Finally, after two minor changes in the wording of item 5: “Which of the following describes your current work status best?” and item 14: “What kind of work or vocational intervention are you receiving now?” in part one, a final version of WORQ-French was approved for psychometric testing.

Step Two: Psychometric Evaluation

Participants

Eighty-nine patients completed CRF-T0. Ten patients out of the 89 provided feedback on the usability of WORQ after its completion.

Fifty consecutive participants out of the initial 89 also completed the patient record form “CRF-T1”, which was administered 7 days after the CRF-T0 for the test–retest analysis. The CRF-T1 contained WORQ-French, the self-evaluated general functioning rating, and the self-evaluated general health rating. The participants were predominantly male and married. Although over 25% of the participants were still at work prior to admission to the clinic, the remaining participants had been off work for more than 10 months on average. The largest group of participants suffered from an upper extremity injury (44%). Around 40% of the participants took part in more than one vocational intervention, and 26–34% were actively looking for a new job, while 30–32% were engaged in activities to maintain their current job.

With an average WORQ sumscore of 3.4/10 (n = 50) and 4.5/10 (n = 89), patients reported fewer problems in work-related functioning than in general functioning, which was rated 5.22/10 (n = 50) and 5.56/10 (n = 89, the whole population). On the HADS depression and anxiety scales, a score between 8 and 10 is considered a “borderline case”. This indicates that our study population, with an anxiety score of 9 and 8.8, respectively, was burdened with a substantial amount of anxiety (Table 1).

Table 1 Characteristics of respondents, study 1

Usability

Participants rated the usability of WORQ from fair to very good. However, three work-related items from part one were flagged by the participants. Item 8: “What is your current job or profession or if currently not working, what is the last job or profession you worked in (job title)?”, item 9: “What kind of business, industry or service is (or was) your job in?” and item 10: “What kind of work are (or were) you doing?” were found to be misleading and overlapping by three persons. These three also complained that they had to formulate the answers themselves and that WORQ did not provide simple response options to choose from. In particular, defining specific work tasks in item 10 was considered difficult and somewhat redundant with the job type asked for in item 8. All patients reported good understandability of the functioning items in part two and no problems with the NRS (0–10). The length of the entire questionnaire was rated as “a good length” by seven participants, as “a little long” by two, and as “too long” by one.

Reliability

The psychometric testing of WORQ-SR revealed excellent internal consistency (Cronbach’s alpha = 0.968), although the high internal consistency may be influenced by the high number of items and potential redundancy amongst the items. The test–retest reliability was also high with an ICC2,1 = 0.935 (CI 0.889–0.963). These findings indicate the reliability of WORQ for use as a single measure to evaluate functioning in the context of work. Data quality of WORQ was good with randomly missing values of only 2.3% in total or 0–4 (3.56%) missing values per item, which were consequently imputed. Seven persons answered “not applicable” in item 34 “problems with driving” resulting in an average score out of 39 items. However, no change in test–retest reliability or construct validity was found when “not applicable” cases were eliminated via sensitivity analysis.

Precision of WORQ

No floor or ceiling effect was detected. The WORQ sumscores ranged from 6 to 346 out of 400 points.

The SEM was calculated as 0.323 points out of the maximal average score of 10, and the MDC was calculated as 0.895 points, meaning that changes in the average score that are larger than 0.895 points can be attributed to a real change. This value will be helpful in determining change or stability of WORQ scores collected over multiple time points.

Validity

Content Validity

In the context of the content evaluation, all patients found that WORQ covered all relevant aspects of work-related functioning, although one person missed items on “off work activities”, such as household chores, sport and community activities. This person considered the degree of problems that he experienced off work as relevant, as it impacted on his work-life balance. He thought that having a considerable amount of problems off-work would lead to a decrease in work functioning.

Construct Validity

Correlation of WORQ-French with other standard instruments can be seen in Table 2. As expected, a higher WORQ-French score was moderately associated with a higher score on self-reported general functioning (Pearson correlation = 0.662). WORQ-French also correlated moderately with both HADS scales, reflecting psychological distress that is represented in WORQ with six items related to the construct of “mood”, which highlights the contribution of emotional functioning to the overall work-related functioning in our population of mostly chronically injured workers.

Table 2 Construct validity of WORQ-French

Self-perceived health, measured with the self-evaluated general health rating and the overall health rating, also correlated significantly with the WORQ-French score. The self-evaluated general health rating correlated positively with WORQ. However, as expected due to the opposing direction of its visual analog scale, with the lower anchor “worst health possible” and the top anchor “best imaginable health”, the overall health score correlated negatively with WORQ-French.

Contrary to our expectations, neither the patient’s expectation of return-to-work within 6 months nor having a case manager (in severe cases) showed any significant correlation with WORQ-French (work-related functioning) or self-reported general functioning. This may indicate that patients based their expectations on factors unrelated to their problems in functioning and more related to personal or environmental factors, such as the work itself and the work environment [26, 27].

Discussion

In this study, we performed a cross-cultural adaptation of WORQ into French and evaluated its fundamental psychometric properties. Our findings suggest that WORQ-French is a reliable, valid, and easy-to-use instrument to assess self-reported work-related functioning. WORQ is a useful instrument to describe an individual’s work functioning using a biopsychosocial framework, while at the same time recognizing the influence of the environment. WORQ can also assess work-specific and general functioning aspects in the context of VR and support the planning of work participation strategies for people with various health conditions.

WORQ-French represents the first psychometric evidence on the self-reported version of WORQ. A dual-panel approach was successfully used for the cross-cultural adaptation as recommended in current guidelines [19, 44]. The adaptation of the second (functioning) part of WORQ caused few problems concerning either the language or the cultural setting. In contrast, in the first part, items six (“When thinking about your work or vocational rehabilitation program: Are you currently”), seven (“What is the highest level of education that you have completed?”) and fourteen (“What kind of work or vocational intervention are you receiving now?”), which relate to school systems, occupational training or VR interventions, gave rise to extensive discussions. The region-specific naming of the respective school level, such as secondary school or real school instead of high school, and finding appropriate examples to describe setting- or system-specific VR interventions were identified as crucial for enabling participants to provide exact and reliable answers in the self-reported instrument. The need for context-specific adaptation of these work- and education-related items was further confirmed by two physicians from France, who were asked by the authors to comment on WORQ-French. In further cross-cultural adaptations, a specific focus has to be placed on context-specific items from part one to achieve cultural appropriateness without losing comparability across countries [45].

One primary concern for the use of WORQ in clinical practice, and even more so in research, is its length. The global perspective taken by WORQ leads to 40 functioning items, not all of which may be relevant to everyone. Nevertheless, because of its breadth, WORQ-French is well suited to capturing the diversity of functioning problems that may be caused by the increasing number of comorbidities found in individuals in VR [4, 46]. For example, a 42-year-old construction worker is referred to VR to evaluate future work perspectives after a severe work accident. He is not able to return to his former work due to severe lower extremity injuries. In addition, he suffers from the consequences of a previous mild traumatic brain injury with minor attentional deficits that did not bother him in his old job. He controlled his type 2 diabetes well with sport and diet before the accident, but due to a lack of movement, diabetic complications such as ulcers at his coccyx have appeared. In such a situation, WORQ-French can provide a fast and comprehensive overview as needed for a patient-centered assessment, to provide the necessary interventions promptly and to inform decisions concerning sustainable work in the future [16]. Nevertheless, to address the challenge of the length of WORQ, a brief version of part two containing a subset of 13 items was developed. These 13 items represent the body function and activity categories from the brief ICF core set for VR, complemented with the categories from the generic set [14, 47]. WORQ-brief assesses critical aspects of work-related functioning, such as “energy and drive”, “emotional functions”, “cognitive functions” and mobility (http://www.myworq.com). The extent to which WORQ-brief fulfills the needs for a screening instrument has to be further evaluated and is currently being tested with the English version of WORQ-brief.

A finding that had a retroactive effect on the overall development of WORQ was the apparent confusion around the use of the VAS (0–100) by the patients in the first test. In the first, interviewer-administered version of WORQ, the VAS was well accepted by the interviewers as well as by the patients. In the self-reported version, however, the change from the VAS to an NRS (0–10) resulted in high patient satisfaction and increased reliability compared with the interviewer-administered VAS version [48]. This finding is also supported by the literature, which concludes that the NRS has better compliance rates and responsiveness and is more user-friendly than the VAS [49,50,51]. As a result of this study, the developers of WORQ decided to replace the VAS with an NRS 0–10 in the revised interviewer-administered version of WORQ, as well as in all future self-reported versions (WORQ-SR).

When comparing the reliability and validity of WORQ-French with those of the initial interviewer-administered version of WORQ, reliability improved substantially [15]. The change of scale may have contributed to this improvement. Another reason could be that in the first study test–retest reliability was assessed over a period of 14 days, whereas in the current study the test–retest period was 7 days, which may be better suited to evaluating the stability of an instrument. With a minimal detectable change (MDC) of 9%, WORQ-French shows good precision, not only when considering the heterogeneous population evaluated in this study, but also when comparing WORQ-French with other patient-reported outcome measures (PROMs) such as the Disabilities of the Arm, Shoulder, and Hand (DASH) with an MDC of 17% or the Oxford Shoulder Score (OSS) with an MDC of 10% [52]. These promising results support the use of WORQ-SR as a reliable and valid way to evaluate patient-reported functioning in the context of return to work and employment.

WORQ was designed to assess the functioning of individuals in VR independent of the health condition, throughout the whole continuum of the return-to-work process. WORQ was developed from the ICF core set for VR. This concept-based approach assumes concept validity per se, which was supported in this study by the patients. Nevertheless, no leisure time activities are addressed in WORQ. This goes back to the ICF core set consensus conference, where interdisciplinary VR experts explicitly decided to concentrate on work-related functioning. This decision may be questioned, considering the ongoing debate on the importance of work-life balance. However, the 40 functioning items of the self-reported version of WORQ provide information on body functions and activities, such as lifting, walking and relating to others, that are also relevant to leisure time activities and may be used as proxy measures for related activities, such as sport and other hobbies [53, 54].

Similar to the interviewer-administered version of WORQ, the items of part two of the self-reported version allow a functioning profile to be created. Such a profile can be used to evaluate abilities and resources or potential and to identify problem areas for an individual, and may then serve as a basis for intervention and case management planning [16, 23].

As work-related functioning measured by WORQ-French can be assumed to be a subarea of general functioning, WORQ-French showed a good correlation with general functioning. The higher score on the general functioning rating compared with the WORQ average score in our study may be due to the fact that the WORQ average score is based on 40 tangible items related to emotional, cognitive, physical and social aspects of functioning. Depending on the location and type of injury, it can be expected that patients score high on items related to their condition and low on items that are not affected by the respective health condition. This may lead to an average score that is lower than the overall functioning rated on one single 0–10 rating scale, because in the single rating the patient focuses on their problems and not on their resources or unaffected functional abilities. Nevertheless, the correlation coefficient of 0.66 may indicate that some items in WORQ weigh more than others in the patient’s overall rating of functioning, and these items may differ from patient to patient. In addition, general functioning was assessed with only one NRS 0–10. It can be assumed that the participants included their recent experiences with activities, their “off-work” situation as well as their current well-being in their rating [55, 56]. A good correlation of WORQ-French, with its six items on mood, with the HADS scores confirmed that mental and psychological functioning play a significant role in self-perceived functioning in the context of vocational rehabilitation and employment, which is consistent with the literature [57, 58]. Mood-related and cognitive aspects of functioning, in addition to movement-related aspects, have also been found to be most relevant in non-musculoskeletal health conditions such as heart diseases, cancer and neurologic diseases such as multiple sclerosis or traumatic brain injury. These findings may contribute to the external validity of WORQ in non-musculoskeletal conditions [59,60,61,62].

To our surprise, the individual’s expectation to return to work within 6 months showed no correlation at all with functioning (WORQ and general functioning). Although we know from the literature that patients’ ability to predict improvement of functioning is limited, we assumed that patients would base their return-to-work prediction on their current functioning status [63], which apparently was not the case in our mostly chronic population. That patients do not base their prediction on their level of work-related functioning is also in line with the fact that “having a case manager”, a proxy for injury severity, also did not correlate significantly with functioning. We presume that mainly environmental factors, such as having a work contract, support from the supervisor and type of work, influenced the predictions of the study participants. Nevertheless, the influence of work-related functioning on the prediction of return-to-work is not well studied and has to be evaluated further [64, 65].

Another issue in the context of VR is whether self-reported functioning creates a reliable picture of a person’s abilities or whether a clinician-rated assessment such as functional capacity evaluation may be needed. Current research suggests that our ability to predict return-to-work can be improved by combining self-reported and clinical information on functioning [66]. As clinical testing of work functioning is time-consuming and expensive and relies on skilled health professionals, using a reliable and valid self-reported instrument such as WORQ as a screening instrument may help to identify the appropriate clinical evaluation and targeted intervention in a timely and inexpensive way. In any case, in light of patient-centred care and the person-centredness of healthcare services, it is indispensable to evaluate and integrate the patient’s perception of his or her problems, which is easily covered by WORQ-SR [67,68,69].

Despite the convincing results of the psychometric evaluation of WORQ-French, we urge caution in the interpretation and generalization of our results, because this study employed convenience sampling of patients with predominantly traumatic musculoskeletal injuries from a single VR center. Although our sample represents the general population at the VR department of the study center, the fact that < 10% of the participants included in this study were female could limit the transferability of the results to settings with a different gender distribution. The cross-sectional data used for the construct validation cannot provide any basis for causality. These results have to be confirmed in other clinical settings and patient groups with diverse health conditions to ensure the external validity of WORQ-French. In addition, ongoing longitudinal studies will further evaluate the assumed underlying factors of the self-reported version of WORQ-French, as well as its predictive value for return to work.

In conclusion, we found evidence that WORQ-French is a valid, reliable and easy-to-administer instrument for evaluating self-reported work-related functioning, given our study setting and sample characteristics. Results of WORQ may be used to guide intervention planning and to document changes in functioning throughout the VR process. However, further studies are needed to shed light on the use of WORQ in clinical practice and research, as well as in diverse patient populations and settings.