Introduction

In health-care evaluations, capturing patients’ valuation of treatment success requires the inclusion of patient-reported outcome measures (PROMs). PROMs can facilitate comparisons of alternative treatments, health-care providers, or health-care routines of different countries; improve care; and pave the way for personalized medicine [1–5].

The EQ-5D is one of the most commonly used generic health-related quality-of-life (HRQoL) instruments to measure patient-reported health outcomes [6]. It provides a simple descriptive system encompassing five dimensions of health: mobility, self-care, usual activities, pain/discomfort, and anxiety/depression. By applying a value set, responses can be converted into a single index (hereinafter referred to as the EQ-5D value), which serves as an overall measure of HRQoL. Value sets may be established by different methodologies. Common methods to value health states include, but are not limited to, the use of rating scales, such as a visual analogue scale (VAS), and time trade-off (TTO) questions [7–9]. Further, valuations can be based either on preferences from individuals to whom health states are described, i.e., hypothetical values, or on preferences from individuals who are actually in the health state, i.e., experience-based values. Hypothetical values, also referred to as social values, thus represent the valuation of imagined health states generated from a sample of the general population [10]. Experience-based values reflect individuals’ health states as currently experienced and are also denoted patient or individual values [11, 12]. Depending on the method used to construct a value set, the calculated EQ-5D values differ considerably [8, 9]. A relevant value set should ideally be consistent with the decision-making context and mirror the population whose health status is measured [13–15]. Experience-based valuation may better reflect cultural differences [16]. Although cross-country similarities in response patterns have been observed [17], there are differences between cultures in how different health states are valued [8]. Even in bordering countries with cultural similarities, considerable differences in self-reported EQ-5D health states have been observed [18].
As previously proposed, country- or population-specific value sets may differ due to cultural and socioeconomic differences, differences in health-care policies, and sociodemographic factors [8, 9].

Despite the long tradition in Sweden of using the EQ-5D for public health surveys [19], clinical research [20, 21], and quality registry work [22], there has not, until recently, been a value set based on values from a Swedish population available. In the absence of Swedish value sets, the UK TTO value set [10] has been the most commonly used. Burström and colleagues [11] took the first step in remedying this situation. Based on data from 45,000 respondents, they attempted to estimate experience-based value sets for EQ-5D health states using general population health survey data. The authors state that an interesting area for further research would be to test the performance of the value sets by assessing how the predicted values correspond to directly measured values in other populations.

In 2002, the Swedish Hip Arthroplasty Register adopted a routine follow-up program covering patient-reported outcomes. The intention is that all patients undergoing elective total hip replacement (THR) in Sweden are asked to participate in the program preoperatively and at 1, 6, and 10 years after surgery. Among twelve items, the survey includes the EQ-5D questionnaire. Thus far, EQ-5D values have been calculated and presented using the UK TTO value set based on hypothetical values [10]. A shift to the recently introduced Swedish experience-based value set, derived from a representative Swedish population, is an appealing alternative for the registry [11].

The overall aim of this study was to test the performance of the recently published Swedish TTO and Swedish VAS value sets in patients with THR. The first objective was to investigate how accurately the Swedish experience-based VAS value set predicts pre- and postoperative EQ VAS values observed in the Swedish THR population. The second objective was to compare correlations between the Swedish TTO and VAS value sets, the UK TTO and VAS value sets, and provisional value sets derived from the Swedish THR population.

Methods

Data selection

We used data from the routine follow-up PROMs program of the Swedish Hip Arthroplasty Register. This program invites all patients scheduled for elective THR in Sweden to participate. Patients are asked to complete a short PROMs survey at their preoperative visit to the orthopedic clinic. In addition to the EQ-5D instrument [the EQ-5D descriptive system and the EQ visual analogue scale (EQ VAS)], the preoperative questionnaire comprises questions on hip pain, musculoskeletal comorbidity, smoking status, and previous physiotherapeutic interventions. At the follow-ups 1, 6, and 10 years postoperatively, the survey is sent by mail. Response rates are 86 % preoperatively and 90 % at the one-year follow-up [23].

The current analyses are based on 56,062 patients with THR operated between January 1, 2002 and December 31, 2012. These data contain THR patients with complete preoperative and one-year follow-up questionnaires. For patients with bilateral surgery during the study period, we only used data from the first operation with complete registration.

Ethical review approval was obtained from the Central Ethical Review Board in Gothenburg, Sweden (decision 293-13).

Statistical analyses

Establishment of provisional value sets

For the purpose of this study, we derived two provisional value sets: one register-based VAS value set for the preoperative data and one for the postoperative data.

For both provisional value sets, the patient-reported EQ VAS served as outcome. We regressed the recorded EQ VAS values on the five dimensions of the EQ-5D questionnaire using ordinary least squares (OLS) regression. If the assumptions of uncorrelated errors and no heteroscedasticity are met, linear regression coefficients given by OLS are the best linear unbiased estimators. For the provisional register-based preoperative value set, the preoperative EQ VAS score was regressed onto the preoperative EQ-5D dimensions, and for the provisional register-based postoperative value set, the postoperative EQ VAS score was regressed onto the postoperative EQ-5D dimensions. Each of the five questions of the EQ-5D has three levels of severity: no problems (level 1), moderate problems (level 2), and severe problems (level 3). This descriptive system generates 243 health profiles or health states.
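The count of 243 follows from five dimensions with three levels each (3^5 = 243). As a minimal sketch, the full set of profiles can be enumerated as below. The five-digit labels (for example, 11111 for no problems in any dimension) follow the usual EQ-5D notation; the variable names are illustrative only.

```python
from itertools import product

LEVELS = (1, 2, 3)   # 1 = no problems, 2 = moderate problems, 3 = severe problems
N_DIMENSIONS = 5     # mobility, self-care, usual activities, pain/discomfort, anxiety/depression

# Every combination of one level per dimension is one health profile.
profiles = ["".join(str(level) for level in combo)
            for combo in product(LEVELS, repeat=N_DIMENSIONS)]

print(len(profiles))              # 243
print(profiles[0], profiles[-1])  # 11111 33333
```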

Each dimension’s level 1 was defined as reference, and the estimated regression coefficients denote the mean difference in EQ VAS between patients who reported level 2 and those who reported level 1, and between patients who reported level 3 and those who reported level 1. Additionally, we defined an indicator variable N3 that takes the value 1 if any of the five EQ-5D dimensions is scored at level 3, and 0 otherwise. Our premise with the regression analysis was that coefficients should have negative signs, and that level 3 should have coefficients of larger magnitude than level 2.
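The coding described above can be sketched as follows. This is an illustration under stated assumptions, not the code used in the study: the function names, the column order, and the synthetic coefficients are all hypothetical, and NumPy's least-squares solver stands in for whatever statistical software was actually used.

```python
import numpy as np

def design_matrix(profiles, include_n3=True):
    """Dummy-code EQ-5D profiles (levels 1-3) with level 1 as reference.

    Columns: intercept, then a level-2 and a level-3 dummy per dimension,
    then (optionally) N3, which is 1 if any dimension is at level 3.
    """
    profiles = np.asarray(profiles, dtype=int)
    columns = [np.ones(profiles.shape[0])]
    for d in range(profiles.shape[1]):
        columns.append((profiles[:, d] == 2).astype(float))
        columns.append((profiles[:, d] == 3).astype(float))
    if include_n3:
        columns.append((profiles == 3).any(axis=1).astype(float))
    return np.column_stack(columns)

def fit_ols(profiles, eq_vas, include_n3=True):
    """Return OLS coefficients of EQ VAS regressed on the dummy-coded profiles."""
    X = design_matrix(profiles, include_n3)
    beta, *_ = np.linalg.lstsq(X, np.asarray(eq_vas, dtype=float), rcond=None)
    return beta

# Synthetic data only: these coefficients are made up for illustration.
rng = np.random.default_rng(0)
profiles = rng.integers(1, 4, size=(2000, 5))          # levels 1..3
true_beta = np.array([90, -5, -12, -4, -10, -6, -14,
                      -7, -15, -5, -11, -3], dtype=float)
eq_vas = design_matrix(profiles) @ true_beta + rng.normal(0, 5, size=2000)

beta_hat = fit_ols(profiles, eq_vas)
print(np.round(beta_hat, 2))
```

With this coding, the premise in the text corresponds to every coefficient after the intercept being negative, with each level-3 coefficient larger in magnitude than the level-2 coefficient of the same dimension.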

Concordance with the observed EQ VAS

Based on the individual responses to the EQ-5D descriptive system, we calculated expected EQ VAS values for each patient using the Swedish VAS value set. Deviation between the observed and calculated EQ VAS values was summarized by bias and mean absolute deviation (MAD). In this context, bias is the mean signed difference between the observed and the calculated values, whereas MAD is the mean of the absolute differences. The bias measure allowed us to explore not only the degree of deviation, but also its pattern.
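As a minimal sketch of these two summary measures: the function name and the toy values below are hypothetical, and the sign convention (calculated minus observed) is an assumption, since the text does not state the direction.

```python
import numpy as np

def bias_and_mad(observed, calculated):
    """Return (bias, MAD) for paired observed and calculated EQ VAS values.

    Assumed convention: deviation = calculated - observed, so a negative
    bias means the value set underestimates the observed EQ VAS on average.
    """
    observed = np.asarray(observed, dtype=float)
    calculated = np.asarray(calculated, dtype=float)
    deviation = calculated - observed
    return deviation.mean(), np.abs(deviation).mean()

# Toy values, not study data.
observed = [80, 60, 50, 90]
calculated = [75, 65, 55, 80]
bias, mad = bias_and_mad(observed, calculated)
print(bias, mad)  # -1.25 6.25
```

The toy example also shows why both measures are reported: positive and negative deviations cancel in the bias but not in the MAD.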

On the presumption that EQ-5D values based on VAS value sets should correlate with the observed EQ VAS values, we then estimated the correlation between the observed EQ VAS values and the calculated EQ-5D values based on four different value sets: the Swedish VAS, the UK VAS, the register-based preoperative VAS, and the register-based postoperative VAS value sets. We used Pearson’s correlation to assess the correlation coefficient and nonparametric bootstrapping (B = 1000) for assessing the associated 95 % confidence intervals. For comparison of the different correlation coefficients, we calculated their differences and associated 95 % CI. If this interval contains zero, we cannot reject the null-hypothesis of no difference.
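The bootstrap comparison described above might be sketched as follows. The text does not say which type of bootstrap interval was used, so the percentile interval here is an assumption, as are the function names and the synthetic data. Patients are resampled with replacement, and both correlations are computed on the same resample so that their difference is paired.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient of two 1-D arrays."""
    return np.corrcoef(x, y)[0, 1]

def bootstrap_correlation_cis(observed, calc_a, calc_b, n_boot=1000, seed=0):
    """Percentile 95 % CIs for r(observed, calc_a), r(observed, calc_b),
    and their difference, from a paired nonparametric bootstrap."""
    observed = np.asarray(observed, dtype=float)
    calc_a = np.asarray(calc_a, dtype=float)
    calc_b = np.asarray(calc_b, dtype=float)
    rng = np.random.default_rng(seed)
    n = observed.size
    r_a = np.empty(n_boot)
    r_b = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)   # resample patients with replacement
        r_a[b] = pearson_r(observed[idx], calc_a[idx])
        r_b[b] = pearson_r(observed[idx], calc_b[idx])

    def percentile_ci(samples):
        low, high = np.percentile(samples, [2.5, 97.5])
        return float(low), float(high)

    return {"r_a": percentile_ci(r_a),
            "r_b": percentile_ci(r_b),
            "diff": percentile_ci(r_a - r_b)}

# Synthetic data only: value set A tracks the observed values more closely than B.
rng = np.random.default_rng(1)
observed = rng.normal(70, 15, size=500)
calc_a = observed + rng.normal(0, 8, size=500)
calc_b = observed + rng.normal(0, 30, size=500)

cis = bootstrap_correlation_cis(observed, calc_a, calc_b)
print(cis["diff"])  # an interval excluding zero indicates a difference
```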

The correlation between the observed preoperative EQ VAS values and values calculated using the register-based preoperative VAS value set was used as reference for preoperative data. Similarly, the correlation between the observed postoperative EQ VAS values and values calculated using the register-based postoperative VAS value set was used as reference for postoperative data.

Results

Patient characteristics

The mean age at operation was 67.9 years (range 15–97), and 57.1 % (n = 32,000) were female. Compared to the preoperative EQ-5D value, at the one-year follow-up 84.3 % (n = 47,244) had improved, 6.8 % (n = 3806) were unchanged, and 8.9 % (n = 5012) reported worsened health status (Table 1).

Table 1 Prevalence of problems on the EQ-5D dimensions in 56,062 Swedish total hip replacement patients preoperatively and one year postoperatively

Derivation of the provisional register-based value sets

The coefficients of the two provisional register-based VAS value sets are presented in Table 2. For the preoperative data, all variables entered in the model were significant. The model indicated that level 3 in self-care had a lesser effect on health status than level 2. Additionally, we observed that reporting level 3 in at least one dimension (the N3 variable) had a positive effect on health status. As a result, the N3 variable was removed, and the self-care dimension was dichotomized to “no problems” versus “any problems.” This caused only a 0.0001 drop in predictive power, and the final coefficient of determination was 0.22. For the postoperative data, the model indicated that “severe problems” with self-care had a lesser effect on health status than moderate problems. Additionally, the effect of level 3 in self-care was nonsignificant (p = 0.058). Consequently, the self-care dimension was dichotomized to “no problems” versus “any problems.” The final coefficient of determination was 0.52.

Table 2 UK and Swedish VAS and TTO value sets for EQ-5D together with a regression analysis on the EQ VAS and EQ-5D dimensions of Swedish total hip replacement patients from the Swedish Hip Arthroplasty Register

Concordance with the observed EQ VAS

Using the responses to the EQ-5D questionnaire and the regression coefficients of the Swedish VAS value set, we calculated expected EQ VAS values before and after surgery. The calculated values correlated moderately with the observed EQ VAS values in the preoperative data (r = 0.459). The correlation was higher in the postoperative data set (r = 0.722). The bias was relatively small, −0.192 for the preoperative and −1.098 for the postoperative data set. However, the bias correlated with the EQ VAS values (preoperative: r = −0.739; postoperative: r = −0.622). Calculated values for mild health states were consistently lower than observed values, whereas the opposite was found for severe health states (Fig. 1). This pattern was more consistent for the preoperative data set.

Fig. 1 Bias in the observed and calculated pre- and postoperative EQ VAS using the Swedish VAS value set

Generally, the correlation between the calculated values based on VAS value sets and the observed EQ VAS was lower in the preoperative data than in the postoperative data (Table 3). As expected, the EQ-5D values based on the register-based value sets had the highest correlations.

Table 3 Correlations between EQ-5D values derived from different value sets using preoperative and postoperative data

Using preoperative data, EQ-5D values based on the preoperative Swedish VAS value set had significantly higher correlation with the observed EQ VAS ratings than values based on the UK VAS value set (diff = 0.023, 95 % CI 0.020; 0.026). EQ-5D values based on the preoperative register-based VAS value set had higher correlation with observed EQ VAS values than values based on the Swedish VAS value set (diff = 0.011, 95 % CI 0.010; 0.013). In magnitude, this difference was half of the former (Fig. 2).

Fig. 2 Correlation (r) and 95 % bootstrap confidence intervals of different VAS value sets in the prediction of observed EQ VAS values

A similar pattern was present for the postoperative data. EQ-5D values based on the postoperative Swedish VAS value set had significantly higher correlation with observed EQ VAS ratings than values based on the UK VAS value set (diff = 0.032, 95 % CI 0.030; 0.034). EQ-5D values based on the postoperative register-based VAS value set had higher correlation with the observed EQ VAS ratings than values based on the Swedish VAS value set (diff = 0.0038, 95 % CI 0.0031; 0.0046).

Correlations between EQ-5D values based on different value sets

The correlations between EQ-5D values based on different value sets were high (Table 3). Generally, the correlation between EQ-5D values based on the Swedish TTO, Swedish VAS, and the two provisional register-based value sets was higher than the correlation between values based on the UK and the Swedish value sets. For preoperative data, the correlation between the EQ-5D values based on the Swedish TTO value set and the preoperative register-based value set was significantly higher than the correlations between EQ-5D values based on the UK TTO and preoperative register-based EQ-5D value sets (diff = 0.070, 95 % CI 0.069; 0.072). The same pattern was observed for the postoperative data (diff = 0.037, 95 % CI 0.036; 0.038).

Discussion

The recently developed Swedish experience-based value sets for weighting responses to the EQ-5D descriptive system appear to be valid for patients undergoing THR in Sweden. The predictive accuracy, as measured by correlations, differed between the pre- and postoperative data sets. This may be explained by the fact that patients eligible for THR are generally in a condition that deviates markedly from that of the general population, whereas by 1 year after the operation, patients’ health states are more similar to those of the general population [23]. Correlations between EQ-5D values calculated using the UK TTO value set and the register-based value sets were consistently lower than when values were calculated using either the Swedish TTO or VAS value sets. Both pre- and postoperative data showed high correlations between the Swedish experience-based value sets and the provisional register-based value sets. It should be emphasized that the value sets derived from the Swedish THR population were provisionally established for the comparative purposes of this study. Thus, they are not meant to be used in practice.

The choice of an experience-based value set as opposed to a value set based on hypothetical health states is to some extent a normative issue [11, 13]; however, it can affect resource allocation decisions [24]. It has been argued that the respondents in a general population health survey may be more focused on their overall perceptions of their health status and consequently their valuation of the EQ-5D health state [14, 25]. This may keep respondents from contextualizing this perception into a particular disease condition or the actual dimensions and levels of the EQ-5D descriptive system. Therefore, when evaluating health-care performance from a health outcome perspective, experience-based value sets derived from a representative sample of the general population are appealing. Regardless of preferences for hypothetical or experience-based value sets, this study takes an empirical approach to testing the performance of different value sets in a target population.

Agreement between the EQ-5D values of the provisional register-based value sets and the Swedish VAS value set was higher than the agreement between the values of the provisional register-based and the Swedish TTO value sets. This is probably due to the similarities in the methodologies used to anchor the conception of an individual’s current health state. However, the higher correlations between the Swedish TTO value set and the provisional register-based value sets, compared to correlations between the UK TTO value set and the provisional register-based value sets, indicate that the former better represents the Swedish THR population. The Swedish TTO, the Swedish VAS, and the register-based value sets were obtained using OLS regression assuming additive effects. The majority of value sets in the literature are based on the additive model and generalized least squares regression, with a few notable exceptions [8]. However, in our case, coefficient estimates from OLS regression and generalized least squares regression agreed even at the fifth decimal place.

Apart from the cultural aspects, the Swedish value sets were established from a contemporary population, while the UK value sets are nearly 20 years old. This may contribute to the greater concordance between the value sets established in the THR population and the Swedish value sets compared to the UK value sets. However, both the register-based and Swedish value sets were experience-based, as opposed to the hypothetical UK value sets, which we believe is the strongest contributing factor in explaining the high correlations. For the evaluation of clinical health outcomes, experience-based value sets may better reflect the patients’ appraisal of their health states [12–14].

The five dimensions of EQ-5D can predict the EQ VAS [26], but as Whynes [27] observed, this correlation varies by the medical condition of patients, and VAS valuations differ even among patients ostensibly in the same EQ-5D health state. Thus, it is likely that the discrepancy observed is directly related to patients’ health states. The data collected from patients with hip problems awaiting THR (chronic and possibly debilitating conditions) showed lower agreement between the observed and predicted EQ VAS valuation than the follow-up data. One year after a commonly life-changing operation, patients are in a health state close to that of the general population, suggesting a possible response shift prior to surgery [28].

The results imply that the new Swedish experience-based value sets are more accurate in terms of representation of the Swedish THR patients than the currently used UK TTO value set. In evaluations of health-care performance from a health outcome perspective, experience-based value sets are appealing. Based on the results of this study, we find it feasible to use the Swedish value sets for further presentation of EQ-5D values in the Swedish THR population. As the Swedish Hip Arthroplasty Register does not collect TTO data, we could not validate the Swedish TTO value set in detail. However, detailed validation of the Swedish VAS value set and the concordance of the Swedish TTO with the provisional register-based value sets suggest that the Swedish value sets in general are more suitable for the Swedish THR population than the UK VAS or UK TTO value sets. As the VAS values were not anchored between dead and full health, they cannot be used directly in the calculation of quality-adjusted life years (QALYs); we therefore prefer the TTO value set in economic evaluation studies [11].