Background

Health-related quality of life (HRQoL) is defined as the subjective assessment of the impact of disease and treatment across the physical, psychological, social and somatic domains of functioning and well-being [1]. Clinicians and policymakers are recognizing the importance of measuring HRQoL to inform patient management and policy decisions [2, 3]. Generic HRQoL instruments are designed to measure overall health states and allow comparisons across patients with different diseases and with the general population [4, 5]. One of the few generic HRQoL instruments validated for the Thai population is the Short-Form 36 (SF-36) [6]. However, neither version 1 nor version 2 of the SF-36 provides a utility score, which is essential for cost-utility analyses. Cost-utility analyses can guide healthcare professionals and decision makers on resource allocation decisions [4]. The EuroQoL-5D (EQ-5D) is a widely used short generic HRQoL instrument that provides a utility score [7–11]. The EQ-5D instrument consists of a five-item descriptive system of health states and a visual analogue scale (VAS) [7]. Scores for the five health states can be converted into an EQ-5D index score (i.e., utility) by using scores from value sets (preference weights) elicited from a general population [12]. Under management of the EuroQoL group, the EQ-5D has been officially translated into Thai and over 150 other languages [13]. However, to the best of our knowledge, few validation studies of the Thai EQ-5D have been performed. A study by Sakthong et al. [10] found good construct validity of the Thai EQ-5D in a small sample of patients with HIV/AIDS. A second study, by the same authors, found good test–retest reliability and good convergent and known-groups validity of the EQ-5D in type 2 diabetic outpatients from a general hospital in Bangkok [14].
In this study, we sought to evaluate the construct validity of the Thai EQ-5D in a large occupational population using Thai preference weights that were recently reported [15].

Methods

Sample

Data were derived from 4,850 participants, aged 25–70 years, of the EGAT (Electricity Generating Authority of Thailand) study surveys conducted in 2008 and 2009 [16]. The EGAT study is a longitudinal study comprising three waves of recruitment, referred to as EGAT 1, 2 and 3. Data used for this validity study are cross-sectional, comprising data from the third EGAT 2 survey in 2008 (n = 2,273) and the first EGAT 3 survey in 2009 (n = 2,584). These are the most recent data to date from surveys that incorporated the Thai EQ-5D. Seven participants from EGAT 1 were unintentionally included in EGAT 2 and were therefore removed from the EGAT 2 dataset. The majority of the study population came from the middle class working in both rural and urban Thailand. This study is part of the LIFECARE consortium [17].

Measures

A survey comprising the Thai EQ-5D [7], SF-36v2 [18], items on demographic characteristics, as well as the presence of chronic medical conditions was self-completed by study participants and checked for completeness by a health professional. An example of the questionnaire items on a chronic medical condition is, “Have you ever been told that you have liver disease?” If the answer to this question was “yes,” then the participant was asked to give details of the onset and treatment of that disease.

The Thai EQ-5D consists of a self-reported health state description and a visual analogue scale (VAS). The health state description comprises five single-item dimensions (mobility, self-care, usual activities, pain/discomfort and anxiety/depression), each with three response levels: no, some and severe problems. In this study, we used the Thai population-specific preference weights to convert the EQ-5D health state into a single EQ-5D index score [15]. The VAS allows a direct valuation of the current health state and differs from the EQ-5D index score in that it does not reflect preferences elicited under conditions of uncertainty and is therefore not recommended to be used as a measure of utility. An update to the EQ-5D is available, where the three response levels are replaced by five response levels (EQ-5D-5L) with the aim of improving the instrument’s sensitivity and reducing ceiling effects [19]. However, to date, no value sets are available for the EQ-5D-5L and there is no Thai translation of the EQ-5D-5L yet.
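
The conversion of a health state into an index score can be illustrated with a minimal sketch. It assumes an additive structure of the kind used for the original UK value set (a constant for any departure from full health, a decrement per dimension and level, and an extra term when any dimension is at the severe level). All numerical weights below are hypothetical placeholders chosen for illustration; they are not the Thai preference weights reported in [15], and the function and variable names are likewise assumptions.

```python
# Hypothetical weights for illustration only; these are NOT the Thai
# preference weights reported in [15].
CONSTANT = 0.20  # applied once for any departure from full health
N3_TERM = 0.10   # applied once if any dimension is at level 3 (severe)
DECREMENTS = {
    "mobility":           {1: 0.0, 2: 0.10, 3: 0.40},
    "self_care":          {1: 0.0, 2: 0.10, 3: 0.30},
    "usual_activities":   {1: 0.0, 2: 0.05, 3: 0.20},
    "pain_discomfort":    {1: 0.0, 2: 0.10, 3: 0.30},
    "anxiety_depression": {1: 0.0, 2: 0.10, 3: 0.30},
}


def eq5d_index(state):
    """Convert a dict of dimension -> level (1, 2 or 3) into an index score."""
    levels = list(state.values())
    if all(level == 1 for level in levels):
        return 1.0  # no problems on any dimension: full health
    score = 1.0 - CONSTANT
    for dimension, level in state.items():
        score -= DECREMENTS[dimension][level]
    if any(level == 3 for level in levels):
        score -= N3_TERM
    return round(score, 3)


full_health = dict.fromkeys(DECREMENTS, 1)
some_mobility_problems = dict(full_health, mobility=2)
print(eq5d_index(full_health), eq5d_index(some_mobility_problems))
```

With a structure of this kind, the constant term means that even a single "some problems" response moves the score well below 1, so no index values fall just below full health.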

The Thai SF-36v2 is a 36-item generic questionnaire measuring eight health concepts: physical functioning (PF), role limitations due to physical health (role-physical, RP), bodily pain (BP), general health perceptions (GH), vitality (VT), social functioning (SF), role limitations due to emotional problems (role-emotional, RE) and mental health (MH). For each concept, item scores were coded, summed and transformed to a norm-based score with a mean of 50 and standard deviation of 10 based on US general population norms [18]. Two summary scores were generated: physical component summary (PCS) and mental component summary (MCS), which were similarly norm-based. Self-reported overall health was derived from the first SF-36v2 question “How would you rate your overall health?” with answers “Excellent, very good, good, fair or poor.”
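
The two scoring steps described above (rescaling a summed raw score to 0–100, then standardizing it to a mean of 50 and a standard deviation of 10) can be sketched as follows. The reference mean and standard deviation in the example are hypothetical placeholders, not the published US norms [18], and the function names are assumptions made for illustration.

```python
def transform_0_100(raw, lowest, highest):
    """Rescale a summed raw scale score to the 0-100 range."""
    return (raw - lowest) / (highest - lowest) * 100.0


def norm_based(score_0_100, norm_mean, norm_sd):
    """Standardize a 0-100 score to mean 50, SD 10 in the reference population."""
    return 50.0 + 10.0 * (score_0_100 - norm_mean) / norm_sd


# Example: a scale with ten items scored 1-3 has a raw range of 10-30.
scaled = transform_0_100(raw=25, lowest=10, highest=30)
# norm_mean and norm_sd below are hypothetical, not the US norms.
print(scaled, norm_based(scaled, norm_mean=80.0, norm_sd=20.0))
```

A norm-based score below 50 therefore indicates a score below the mean of the reference population, which makes the eight scales directly comparable.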

Data analysis

Descriptive statistics were used to characterize the sample in terms of age, sex, marital status, level of education, self-reported overall health, and number and types of chronic medical conditions. Construct validity was tested by assessing relationships between the Thai EQ-5D and SF-36v2. The SF-36v2 was selected as the gold standard against which the EQ-5D was tested, due to its widespread use in clinical research [20–22], validity in the Thai population [6], and evidence of a relationship with the EQ-5D [23]. We hypothesized that, in general, mean SF-36v2 summary scores for participants reporting no problems for any EQ-5D dimension (i.e., participants in perfect health) would be higher than those for participants reporting some or severe problems in one or more EQ-5D dimensions (i.e., participants not in perfect health) [24]. Specifically, we expected that the difference in scores between participants reporting problems on the EQ-5D dimensions mobility, self-care, usual activities and pain/discomfort and participants reporting no problems on these dimensions would be greater on the SF-36v2 PF scale than on the SF-36v2 MH scale [25]. Similarly, greater score differences between participants reporting problems for the EQ-5D dimension anxiety/depression and participants reporting no problems were expected on the RE and MH scales of SF-36v2 than on scales related to physical health. Finally, mean SF-36v2 summary scores for participants reporting no problems for any EQ-5D dimension were expected to be higher than those for participants reporting problems [24].

Known-groups validity was tested with the following hypotheses: older people, females, participants who were not married, those with a low level of education, and those with a medical condition were expected to have lower EQ-5D index scores [26–28]. It was also expected that scores would decline with an increasing number of chronic conditions (0, 1, 2 and 3 or more chronic conditions) and with poorer self-reported overall health. Because males, participants with a high level of education, and those living in an urban area were over-represented in the sample, the capacity of the EQ-5D index scores to discriminate between groups with varying self-reported overall health was tested separately. Non-parametric analyses (Mann–Whitney test for two groups and Kruskal–Wallis test for more than two groups) were used for all comparisons except those of EQ-VAS scores, for which parametric analyses (independent sample t test for two groups and analysis of variance (ANOVA) for more than two groups) were used because the distribution was normal. All data were analysed using SPSS version 19.0. A two-tailed p value <0.05 was considered statistically significant.
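
For readers unfamiliar with the two-group non-parametric comparison, the following is a minimal sketch of a Mann–Whitney U test with a tie-corrected normal approximation and a two-tailed p value. It is an illustration only and not the SPSS procedure used in this study: it omits the continuity correction, the function names are assumptions, and the scores in the example are invented.

```python
import math
from collections import Counter


def midranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U, two-tailed p) using a tie-corrected normal approximation."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = list(x) + list(y)
    ranks = midranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2.0
    u = min(u1, n1 * n2 - u1)
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    variance = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        return u, 1.0  # every observation is tied; no evidence of a difference
    z = (u - n1 * n2 / 2.0) / math.sqrt(variance)
    return u, math.erfc(abs(z) / math.sqrt(2.0))


# Invented index scores for two groups, for illustration only.
group_a = [1.0, 1.0, 0.77, 0.69, 1.0, 0.73]
group_b = [0.69, 0.73, 0.66, 0.77, 0.69, 0.62]
print(mann_whitney_u(group_a, group_b))
```

The tie correction matters for EQ-5D index scores in particular, because a three-level descriptive system produces many identical values, including a large cluster at 1.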

Results

Complete data for all EQ-5D dimensions and SF-36v2 subscales from 4,689 participants (96.7 %) were analysed. The majority of participants were men (72.5 %), married or co-habiting (74.4 %), completed at least vocational school (87.0 %), had good or excellent self-reported health (65.6 %), did not have any chronic medical conditions (68.1 %) and mean (SD) age was 46 (8.3) years. The most common chronic medical conditions in this cohort were liver disease (11.3 %), arthritis (10.4 %) and diabetes mellitus (6.7 %) (Table 1).

Table 1 Characteristics of the study sample (n = 4,689)

EQ-5D response

The EQ-5D index showed a considerable ceiling effect in this sample, with 48.7 % of participants having an EQ-5D index score of 1, representing perfect health (Table 2). The mean EQ-5D index score was 0.841 (SD 0.173), the median 1 (IQR 0.69 to 1) and the mean EQ-VAS score was 76.7 (SD 12.7) (Table 3).
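
The ceiling effect reported here is simply the share of participants at the maximum index score. A minimal sketch of how such a summary could be computed is given below; the scores are invented for illustration and are not study data, and the function name is an assumption.

```python
import statistics


def summarize_index(scores, ceiling=1.0):
    """Return the percentage of scores at the ceiling, plus mean and median."""
    at_ceiling = sum(1 for score in scores if score == ceiling)
    return {
        "ceiling_percent": 100.0 * at_ceiling / len(scores),
        "mean": statistics.mean(scores),
        "median": statistics.median(scores),
    }


# Invented scores, for illustration only.
print(summarize_index([1.0, 1.0, 0.8, 0.7]))
```

When close to half of a sample sits at the ceiling, as here, the median equals the maximum while the mean is pulled down by the remaining scores, which is why both are reported.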

Table 2 Median SF-36v2 norm-based scores for participants with and without problems on individual EQ-5D dimensions
Table 3 Comparison of EQ-5D index and VAS scores for subgroups of participants with differing socio-demographic characteristics

Validity

Participants reporting some or severe problems for any of the EQ-5D dimensions reported much lower SF-36v2 scores on all scales than those reporting no problems (Table 2). As hypothesized, there was a greater score difference between participants reporting problems and those reporting no problems in the EQ-5D dimension mobility on the SF-36v2 PF scale (−8.6) than on the MH scale (−5.5). Similarly, greater score differences were seen between participants reporting problems for the EQ-5D anxiety/depression dimension and those reporting no problems on the SF-36v2 RE and MH scales (−7.6 and −11.1, respectively), than all other scales. Finally, mean SF-36v2 summary scores for participants reporting problems for any EQ-5D dimension were significantly lower than for those participants reporting no problems.

Older people, females and those with a low level of education had significantly lower EQ-5D index scores, reflecting poorer HRQoL. No significant difference was found in EQ-5D index scores between participants who were married and those who were not. Among the socio-demographic comparisons, EQ-VAS scores differed significantly only by sex, with females scoring lower than males (Table 3). Mean EQ-5D index scores were significantly lower (p < 0.05) for persons with chronic heart disease (0.806 vs. 0.842), chronic kidney disease (0.795 vs. 0.842), diabetes mellitus (0.801 vs. 0.844), arthritis (0.749 vs. 0.852), liver disease (0.807 vs. 0.845) and stroke (0.773 vs. 0.842) than for persons without the respective condition. EQ-VAS scores were significantly lower only for participants with stroke, diabetes mellitus and arthritis. Both EQ-5D index and VAS scores declined as the number of concurrent chronic conditions increased (Table 4).

Table 4 Comparison of EQ-5D index and VAS scores for subgroups of participants with chronic medical conditions

EQ-5D index and VAS scores were significantly lower for participants with poorer self-reported health, in both males and females, participants living in urban and rural locations, and with a low and high level of education (Table 5).

Table 5 Comparison of EQ-5D index and VAS scores for subgroups of participants with differing self-reported health

Discussion

It is well recognized that cultural differences in perceptions of HRQoL exist [29]. Hence, it is important that the validity and reliability of HRQoL measures be evaluated in any given population [30]. To the best of our knowledge, this is the first study to validate the Thai EQ-5D using Thai preference weights in a large occupational population sample rather than in a specific disease group. Construct validity of the Thai EQ-5D in this population was supported with most a priori hypotheses being met, an exception being the EQ-VAS scores which were not significantly different for most socio-demographic groups. The VAS was also less likely to show significant differences between participants with or without a specific chronic medical condition. This may be explained by the fact that the VAS is only a single question, which restricts detection of small differences in health. However, VAS scores did decline with an increasing number of concurrent chronic conditions and were also significantly lower for participants with poorer self-reported health, in both males and females, and in participants with a low and high level of education.

Up to 2011, the Japanese or UK value sets were applied to transform health states into EQ-5D index scores for Thai samples [10–14]. However, valuations of health states could differ for people in different countries due to differences in demographic backgrounds, social–cultural values, and economic systems [29]. Thus, it is advisable to use country-specific weights in a given country if available. Recently, Thai preference weights were established from a national household survey in the Thai general population by Tongsiri and Cairns [15], using the same estimation methods as used in the original (UK) version [12]. It was found that any departure from perfect health is associated with a substantial decline in health state value, with the effect more marked for Thailand than for other countries [15]. Additional analysis for this study (not reported here) confirmed that Thai EQ-5D index scores for our population were on average 0.024 and 0.040 points lower than EQ-5D index scores based on Japanese and UK value sets, respectively, but all scores showed similar patterns across socio-demographic groups and self-reported health. Hence, the choice of preference weights is not likely to affect the outcomes of this validity study, but a different value set will lead to different EQ-5D index scores. The Thai preference weights are now used in several studies reporting HRQoL in specific medical conditions [31, 32]. Since other studies have used either different value sets [14] or different patient populations [10, 31, 32], a comparison of Thai EQ-5D index scores between studies is complex and often inappropriate.

A number of limitations should be considered when interpreting the study findings. First, the study sample is not representative of the general Thai population. People aged 60 years and over were not represented, females were underrepresented, and people with a higher education and living in urban areas were overrepresented. The fact that EQ-5D index scores <0.69 were not observed suggests that severely ill persons were also underrepresented in this sample. This may have been due to a healthy worker effect. Those who had significant health problems may not have entered the workforce. Also, the EQ-5D asks respondents to describe and rate their health on the day of the interview. Workers who were severely ill may not have been able to attend the interview. Nonetheless, the Thai EQ-5D was able to discriminate between genders, geographic locations and education levels, providing evidence for its use in the general population. Readers who intend to extrapolate mean EQ-5D index scores found in this study to other settings should take care to use weighted mean scores. Second, self-reported survey data were used to identify chronic conditions and, if inaccurate, this can potentially threaten the validity of study findings. Studies suggest self-report is fairly reliable for life-threatening, acute-onset conditions (e.g., stroke) and conditions requiring ongoing management (e.g., diabetes and hypertension), and less reliable for conditions such as asthma and depression, but results are inconclusive [33–36]. Since most conditions identified in this study population are acute onset or require ongoing management, and surveys were checked for completeness by a health professional, we are confident that our data are sufficiently accurate for the purpose of this study. Finally, since to date only one survey per participant included both the EQ-5D and SF-36v2, we could not evaluate the test–retest reliability and responsiveness of the Thai EQ-5D in this population.

In conclusion, this paper has expanded the evidence base for the use of the Thai EQ-5D beyond clinical populations. Further research on the reliability and responsiveness of the Thai EQ-5D in the general population is recommended.