Key Points for Decision Makers

Psychosocial health status has been increasingly identified as an important health-related quality of life outcome measure for the morbidly obese population who receive bariatric surgery.

Compared with the EQ-5D-5L, the AQoL-8D’s descriptive/classification system (and subsequent utility valuation) preferentially captures psychosocial health status for people who have received bariatric surgery.

While the EQ-5D dominates the clinical and economic evaluation literature, choice of multi-attribute utility instrument should be influenced by the innate sensitivities of the instrument to the relevant domains of health for the study population.

1 Introduction

Obesity is a worldwide problem. Its extensive health repercussions include a high prevalence of psychological comorbidities, and it also has substantial negative economic impacts [1]. Many clinical and epidemiological studies find the most efficacious therapy for morbid obesity is metabolic/bariatric surgery [2, 3]. A systematic review of the impact of bariatric surgery on health-related quality-of-life (HRQoL) found physical HRQoL was improved to a significantly greater degree than mental HRQoL [4]. Furthermore, the psychosocial health status of bariatric surgery patients is dynamic [5]. This recent study found an initial improvement in mental health followed by deterioration between 4 and 9 years post-surgery. Potential reasons for this diminution of HRQoL were postulated to include disappointment from unrealistic expectations about surgical treatment, unforeseen changes in eating behaviour, medical sequelae after surgery, dissatisfaction with body appearance and excess skin, and the recurrence of psychiatric disorders [5, 6].

The need to assess the psychosocial health status of bariatric surgery patients in the short, medium and longer terms has been increasingly identified [4–7], and underpins the quality-of-life component of recent guidelines on standardised outcomes reporting for bariatric surgery patients from the American Society for Metabolic and Bariatric Surgeons (ASMBS) [8, 9]. The guidelines made no specific recommendations regarding the most appropriate HRQoL instrument, the recommendation being only to use a “validated instrument(s)”. Importantly, the measurement of psychosocial health or any domain of health is wholly dependent on the sensitivity of the instrument employed to assess that domain.

Health state utility values (HSUVs), or utilities, are important health economic metrics that assess the strength of preference for an individual’s health state relative to perfect health and death. Utilities are assessed relative to a 0.00–1.00 scale where 1.00 represents perfect health and 0.00 death [10]. The utility value therefore indicates the strength of preference for quality versus quantity of life [11], and quality-adjusted life-years (QALYs) can be calculated as the product of time spent in a health state and its utility. QALYs are a unit of benefit used in economic evaluation, namely cost-utility analysis (CUA) and, in principle, may be used to measure the HRQoL component of the burden of disease [10, 12]. Clinicians have also found that measuring health utilities is of benefit to patient–clinical assessment, relationships, communication and management [13]. Furthermore, utilities have been shown to be independent predictors of patient outcomes, including all-cause mortality and development of complications [14].
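As a minimal sketch of the QALY calculation described above (time spent in a health state multiplied by the utility of that state), the following Python snippet sums QALYs over a sequence of health states. The function name `qalys`, the durations and the utilities are hypothetical and purely illustrative; discounting is deliberately ignored.

```python
def qalys(states):
    """Sum QALYs over (years, utility) pairs: each state contributes years * utility."""
    total = 0.0
    for years, utility in states:
        if years < 0:
            raise ValueError("time in a health state cannot be negative")
        total += years * utility
    return total


# Hypothetical profile (not study data): 2 years at utility 0.75,
# followed by 4 years at utility 0.50.
profile = [(2.0, 0.75), (4.0, 0.50)]
print(qalys(profile))  # 3.5
```

Because each health state's utility enters the product directly, any difference between instruments in the utility assigned to the same state carries through proportionally to the QALY estimate.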

Multi-attribute utility instruments (MAUIs) are designed to rapidly and simply assess an individual’s HSUV through application of pre-established formulae/weights to the array of responses to the MAUI’s questions. Generic and disease-specific non-utility instruments may also be reduced to a single number; however, this number does not have independent meaning [10]. MAUIs thus differ fundamentally from generic HRQoL instruments.

Many MAUIs target physical health. For example, four of the five items in the EQ-5D, a well utilised international measure, relate to physical health. In contrast, 25 of the 35 items in the recently developed Assessment of Quality of Life (AQoL)-8D relate to psychosocial health [12]. Utility values assessed by MAUIs are not equivalent [15, 16], with the difference between the descriptive/classification systems of the MAUIs the principal determinant [15]. Differences in descriptive/classification systems are estimated to explain an average of 66 % of the difference between utilities obtained by MAUIs, and 81 % of the difference between the utilities of the EQ-5D-5L and AQoL-8D [15]. MAUIs are thus ‘imprecisely related’, a finding that threatens the comparability of economic evaluations that employ different instruments [17].

A small number of MAUIs dominate the economic evaluation literature [17]. A review of the Web of Science database (2005–2010) found that, of 1663 studies employing an MAUI, 63 % used the EQ-5D [15, 17]. Arguably, this finding reflects the recommendations of the UK National Institute for Health and Care Excellence (NICE) guidelines to use the EQ-5D as the preferred measure of HRQoL in adults [18]. These guidelines also acknowledge that the EQ-5D “may not be an appropriate measure of health-related utility in all circumstances” [19]. Emerging research is investigating the concept of ‘bolt-on’ dimensions to the EQ-5D in an attempt to broaden the classification system of this instrument [20, 21].

To inform debate on the choice of instrument for a particular patient group, it is important to compare different preference-based measures of health [22]. In particular, it is necessary to consider the applicability of the descriptive/classification systems. Our study investigated a ‘head-to-head’ cross-sectional comparison of the EQ-5D-5L [23] and AQoL-8D [24] MAUIs for patients who have previously undergone bariatric surgery. The EQ-5D-5L and AQoL-8D have not been specifically validated for patients who have undergone bariatric surgery. This study explored agreement between, and suitability of, the AQoL-8D and EQ-5D-5L for assessing health state utility in patients who have received bariatric surgery to determine whether either instrument could be preferentially recommended in this study population.

2 Materials and Methods

2.1 Participants

Participants were individuals who had previously received bariatric surgery [predominantly laparoscopic adjustable gastric band (LAGB)] in the private sector (n = 33) in Tasmania, Australia. Clinical and socio-demographic data were obtained during recruitment for a focus group designed to explore patient experiences following bariatric surgery. Participants were recruited with the aim of ensuring an appropriate mix of demographic/clinical characteristics. Each participant was sent both MAUIs for self-completion at home 2 weeks before their focus group [13, 25]. All data were de-identified. Questionnaire responses were independently entered into a database by two authors and cross-checked before utilities were generated. Ethics approval was granted by the University of Tasmania’s Health and Medical and Social Sciences Human Research Ethics Committees.

2.2 Instruments

The EQ-5D-5L [23] is a recent augmentation of the EQ-5D-3L [26], and the AQoL-8D [27] is the latest in the suite of AQoL instruments (AQoL-4D/6D/7D/8D) [28]. Table 1 provides a detailed comparison of the characteristics of both instruments. The EQ-5D-5L was developed to address the limited sensitivity (lack of descriptive richness and serious ceiling effects [29]) of the EQ-5D-3L. The EQ-5D-5L includes two additional levels for each of the five dimensions in the EQ-5D [30]. Nevertheless, it has the second lowest number of health states of the major MAUIs at 5⁵ (3125). The EQ-5D-3L has the lowest, at 243. The EQ-5D-5L retains an optional visual analogue scale (EQ-VAS) in which patients rate their current health state on a scale of 0–100 (worst/best imaginable) [31].
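The health-state counts quoted above follow from multiplying the number of response levels across items. The short Python sketch below reproduces the two EQ-5D figures; the function name `n_health_states` is hypothetical and used only for illustration.

```python
def n_health_states(levels_per_item):
    """Number of distinct health states = product of the number of levels of each item."""
    total = 1
    for levels in levels_per_item:
        total *= levels
    return total


eq5d_5l = n_health_states([5] * 5)  # five dimensions, five levels each
eq5d_3l = n_health_states([3] * 5)  # five dimensions, three levels each
print(eq5d_5l, eq5d_3l)  # 3125 243
```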

Table 1 Comparison of the dimensions and content of EQ-5D-5L and AQoL-8D multi-attribute utility instruments

The AQoL-8D is the fourth and most comprehensive of the AQoL suite of instruments, developed to achieve increased sensitivity in psychosocial dimensions of health, which were relatively neglected in other MAUIs, including earlier versions of the AQoL [10]. Both patient and public involvement were utilised during the construction of the AQoL-8D, a key element of robust MAUI development according to a recent systematic review by Stevens [33]. Psychometric principles were also employed during construction of the AQoL instruments, the only MAUIs to do so [32]. These key features of the AQoL-8D were not identified in the Stevens [33] review. The AQoL-8D contains 35 questions and encompasses the largest number of health states of any existing MAUI (2.4 × 10²³).

2.3 Data Analysis

Baseline socio-demographic and clinical data are presented descriptively as mean [standard deviation (SD)] and/or median [interquartile range (IQR)] for continuous variables and frequency (%) for categorical variables. Body mass index (BMI) was calculated as weight (kg)/[height (m)]².
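For completeness, the BMI formula above can be written as a one-line function. This is an illustrative sketch only; the function name `bmi` and the example weight and height are hypothetical and are not study data.

```python
def bmi(weight_kg, height_m):
    """Body mass index in kg/m^2: weight divided by height squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


print(bmi(100.0, 2.0))  # 25.0
```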

HSUVs were generated for the EQ-5D-5L using the UK ‘crosswalk’ value set with the EQ-5D-5L version mapped (crosswalked) onto the 3L version through the preferred non-parametric model [34]. For the AQoL-8D, we used a scoring algorithm incorporating Australian weights published on the AQoL group’s website (http://www.aqol.com.au). We assessed questionnaire completion by measuring the proportion of participants who completed the questionnaire and for whom an individual utility value could be generated.

Summary statistics of the HSUVs for each MAUI were assessed as mean (SD) and median (IQR) given the skewed nature of the data. Strength of correlation between the instruments’ utility values for the sample was tested using Spearman’s correlation coefficient, with Spearman’s rho of greater than 0.50 or less than −0.50 considered strong, values from −0.49 to −0.30 and from 0.30 to 0.49 considered moderate, and values between −0.30 and 0.30 considered weak [35]. To determine interchangeability between the instruments, pairwise agreement between the utility values for each instrument for each participant was assessed using a scatterplot and through the Bland–Altman (BA) method of differences [36]. The difference between the two measures was plotted against the mean measurement for those two instruments for each individual, along with the limits of agreement (the range of values that would be expected to include 95 % of individual differences) [31].
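The two agreement analyses described above can be sketched in a few lines of standard-library Python. This is not the SPSS or R code used in the study. The paired utilities are invented for illustration, all function names are hypothetical, and the handling of values exactly on a threshold (for example, treating |rho| ≥ 0.50 as strong) is an assumption. The limits of agreement use the conventional Bland–Altman definition of the mean difference ± 1.96 SD of the differences.

```python
from statistics import mean, stdev


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def strength(rho):
    """Classify rho as strong, moderate or weak (boundary handling is an assumption)."""
    size = abs(rho)
    if size >= 0.50:
        return "strong"
    if size >= 0.30:
        return "moderate"
    return "weak"


def bland_altman(x, y):
    """Return per-pair means and differences, the bias, and the 95 % limits of agreement."""
    diffs = [a - b for a, b in zip(x, y)]
    means = [(a + b) / 2 for a, b in zip(x, y)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return means, diffs, bias, (bias - 1.96 * sd, bias + 1.96 * sd)


# Hypothetical paired utilities for five participants (not study data).
eq5d = [1.00, 0.84, 0.75, 0.40, 0.90]
aqol = [0.88, 0.81, 0.63, 0.35, 0.95]

rho = spearman_rho(eq5d, aqol)
_, _, bias, (lower, upper) = bland_altman(eq5d, aqol)
print(round(rho, 2), strength(rho))
print(round(bias, 3), round(lower, 3), round(upper, 3))
```

A high rho with wide limits of agreement, or with differences that trend along the mean, would indicate that two instruments rank participants similarly without producing interchangeable values.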

An MAUI should be able to produce utility valuations for various health states with a significant degree of accuracy to effectively detect and represent differences between individuals [31]. Discriminatory attributes of the instruments were therefore assessed globally and then at dimensional levels. Globally, the extent of floor (worst health: −0.594 EQ-5D-5L and +0.09 AQoL-8D) and ceiling effects (perfect health: 1.0 each instrument) was determined, and then utility values obtained on the alternate instrument were explored. At the dimensional level, summary statistics were obtained for the summary scores for each individual dimension of the AQoL-8D and its super-dimensions. The distribution of responses across the levels (1–5 or 6) of each of three psychosocial-related dimensions within each instrument was then explored. These dimension-to-dimension comparisons [22] encompassed anxiety/depression, self-care and pain/discomfort for the EQ-5D-5L, comprising one item each; and mental health, independent living and pain for the AQoL-8D, comprising eight, four and three items each, respectively.
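The global floor and ceiling checks described above amount to counting the share of participants whose utility sits at an instrument's worst or best possible value. The Python sketch below assumes the floor and ceiling values quoted in the text; the function name `share_at`, the tolerance and the sample utilities are hypothetical.

```python
EQ5D_5L_FLOOR = -0.594  # worst health state, as quoted in the text
AQOL_8D_FLOOR = 0.09    # worst health state, as quoted in the text
CEILING = 1.0           # perfect health on both instruments


def share_at(utilities, target, tol=1e-9):
    """Proportion of utilities equal to `target` (within a small tolerance)."""
    if not utilities:
        raise ValueError("at least one utility value is required")
    hits = sum(1 for u in utilities if abs(u - target) <= tol)
    return hits / len(utilities)


# Hypothetical EQ-5D-5L utilities for five participants (not study data).
sample = [1.0, 1.0, 0.84, 0.75, 0.40]
print(share_at(sample, CEILING))        # 0.4
print(share_at(sample, EQ5D_5L_FLOOR))  # 0.0
```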

The association between ‘current BMI’ and utility valuation obtained with each instrument was investigated by testing strength of correlation using the Spearman’s correlation coefficient.

Statistical analyses were undertaken using IBM® SPSS® (version 22) or R (version 3.0.2).

3 Results

3.1 Participants’ Clinical and Socio-Demographic Characteristics

Table 2 provides the participants’ clinical and socio-demographic characteristics. Mean (SD) age was 56 (11) years, and two-thirds (n = 22; 67 %) were female. Mean (SD) of the maximum recorded BMI (before surgery) was 43.7 (7.3) kg/m², and mean (SD) current BMI (at recruitment) was 32.8 (7.7) kg/m². One-third of participants had obtained university qualifications and one-quarter were educated to ≤year 10. Most participants (n = 32; 97 %) had received an LAGB, and 12 % (n = 4) of these participants had undergone a secondary procedure such as a revision. Median (IQR) number of years since primary surgery was 5.0 (3.0–8.0).

Table 2 Baseline socio-demographic characteristics of participants

3.2 Questionnaire Practicality

All participants completed both MAUIs. The EQ-5D-5L was completed without omissions or additions (such as multiple responses to one question). In contrast, one participant attempted to select two response items to two questions and modify those items when completing the AQoL-8D. These nonconformities had no impact on our ability to assess the utility of this participant. As advised by the AQoL group, we used the worst response for utility generation.

3.3 Construct Validity

Frequency distributions of the individual utility values for both instruments are provided in Fig. 1a, b. Utilities obtained through both MAUIs were skewed towards perfect health, more so for the EQ-5D-5L than the AQoL-8D. There was no significant difference in the mean and median utility values of the two instruments, and the overall correlation between them was strong. The range and IQR for the EQ-5D-5L (0.40–1.00 and 0.75–1.00) and the AQoL-8D (0.35–0.95 and 0.63–0.88) were of the same width (0.60 and 0.25, respectively; Table 3), but each was located higher on the scale for the EQ-5D-5L, reflecting its greater negative skew. In turn, the AQoL-8D’s assessed range and IQR for our study population, expressed as a proportion of the potential scored range (the difference between the floor and ceiling levels of +0.09 and 1.00), are larger than the equivalent proportions for the EQ-5D-5L. The inclusion of 1.00 in the EQ-5D-5L’s range and IQR also reflects the ceiling effects of this instrument within our study population, as detailed below.

Fig. 1 Distribution of (a) EQ-5D-5L utility scores and (b) AQoL-8D utility scores. (c) Scatterplot of participants’ utility scores for EQ-5D-5L and AQoL-8D. (d) Bland–Altman method of differences for utility scores between the EQ-5D-5L and AQoL-8D, all participants (n = 33)

Table 3 Descriptive statistics of EQ-5D-5L and AQoL-8D utility valuations, EQ-VAS scores and percent achieving worst and best health states

The mean (SD) and median utility values tended to be higher for the EQ-5D-5L [0.84 (0.15); 0.84] than for the AQoL-8D [0.76 (0.17); 0.81] (Table 3). A strong correlation was obtained between the utilities for the EQ-5D-5L and AQoL-8D (Spearman’s rho 0.68; p < 0.001). The EQ-VAS gave rise to mean (SD) and median (IQR) ratings of 76 (17) and 80 (70–90), respectively.

3.4 Sensitivity

A scatterplot of individual utility values (Fig. 1c) demonstrated two distinct groupings around 0.8 and 1.0 for the EQ-5D-5L. The BA plot (Fig. 1d) revealed a relatively wide limit of agreement (0.55) and systematic variation, notably a negative trend in the difference between individual participant utility values by mean value. No floor effects were identified for either instrument, nor were there ceiling effects for the AQoL-8D (Table 3). However, a ceiling effect was observed for over one-third (n = 12; 36 %) of participants with the EQ-5D-5L.

Table 4 provides the EQ-VAS rating scores and AQoL-8D global utility values for each participant scoring perfect health using the EQ-5D-5L. One participant (number 11) rated themselves as experiencing perfect health on the EQ-VAS; however, their AQoL-8D utility valuation was high but not perfect (0.93). Across this subgroup, the mean (SD) and median (IQR) EQ-VAS ratings were 83 (10) and 84 (79–90), and the mean (SD) and median (IQR) AQoL-8D utility values were 0.87 (0.08) and 0.88 (0.84–0.93).

Table 5 provides summary statistics for the individual and super-dimension scores of the AQoL-8D for the entire sample. The maximum score for the mental health dimension, at 0.73, was markedly lower than for all other dimensions. The maximum score in the other seven individual dimensions was at least 0.96, six scoring 1.00. In turn, the maximum mental super-dimension score was 0.79. The mental health dimension and the mental super-dimension also recorded the lowest mean (SD) and median (IQR) scores, at 0.62 (0.12) and 0.63 (0.52–0.73) for the former and 0.44 (0.17) and 0.45 (0.27–0.54) for the latter.

Table 6 provides AQoL-8D individual and super-dimension scores for those recording perfect health using the EQ-5D-5L. One of these participants (Table 4, participant 11) achieved the maximum score (1.00) for the physical super-dimension (PSD). Given their AQoL-8D utility valuation was 0.93, this participant’s overall health status was diminished due to psychosocial impacts. The maximum mental super-dimension (MSD) score within this group was 0.71. The mean (SD) scores for the AQoL-8D PSD and MSD were 0.89 (0.07) and 0.52 (0.13), respectively. These findings are also reflected at the level of the individual physical and psychosocial dimensions. The physical health dimensions gave rise to the highest scores [independent living 0.97 (0.04), senses 0.92 (0.06), pain 0.95 (0.09)], and the psychosocial dimensions the lowest scores [happiness 0.85 (0.07), coping 0.87 (0.08), relationships 0.85 (0.12), self-worth 0.90 (0.08), mental health 0.65 (0.09)].

Table 4 EQ-VAS rating and AQoL-8D utility valuation for each individual assessed in perfect health through the EQ-5D-5L
Table 5 AQoL-8D individual and super dimensions scores for the entire sample (n = 33)
Table 6 AQoL-8D individual dimension and super-dimension scores for each individual assessed in perfect health through the EQ-5D-5L

Table 7 provides a dimension-to-dimension comparison for each of three individual psychosocial-related dimensions of the EQ-5D-5L and AQoL-8D. The EQ-5D-5L showed a larger proportion of participants at Level 1 than the AQoL-8D for each dimension, and less dispersion overall. There were no participants rated at Level 4 or above within the psychosocial dimensions for the EQ-5D, unlike the AQoL-8D.

Table 7 Distribution of levels of response for EQ-5D-5L individual dimensions of anxiety/depression, self-care and pain/discomfort with the AQoL-8D individual dimensions of mental health, independent living and pain

A moderate association was found between ‘current BMI’ and utility valuations for both the EQ-5D-5L and AQoL-8D with Spearman’s rho −0.37; p = 0.03 and −0.39; p = 0.02, respectively.

4 Discussion

To the best of our knowledge, our study is the first to investigate a ‘head-to-head’ comparison of the EQ-5D-5L and AQoL-8D MAUIs in patients who have undergone bariatric surgery. Our study’s key finding was the divergent sensitivity of the instruments in assessing health state utility in this patient group, a difference arguably due to their ability to assess and capture psychosocial HRQoL impacts. This finding is crucial because psychosocial health status has been identified as a significant outcome for the morbidly obese population who receive bariatric surgery [4–6, 8, 9].

We found 36 % of participants were assessed as having perfect health on the EQ-5D-5L, but none on the AQoL-8D. The mean utility valuation of the patient group scoring perfect health on the EQ-5D-5L was 0.87 using the AQoL-8D, the lower utility driven by less than perfect scores on the AQoL-8D MSD and, in all but one instance, the PSD. The assessed IQR (0.25 wide for both instruments) as a proportion of the potential scored range was smaller for the EQ-5D-5L than for the AQoL-8D, at 16 % [0.25/(1 − (−0.594))] and 27 % [0.25/(1 − 0.09)], respectively, indicating greater discriminatory attributes of the latter for this study population. These findings are underpinned by differences in the classification/descriptive systems and scoring algorithms of the two instruments.
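The two proportions quoted above can be checked directly. In the Python sketch below, 0.25 is the width of the interquartile range reported for both instruments in the results (0.75–1.00 and 0.63–0.88), and the floor values are those given in the methods; the function name `share_of_scale` is hypothetical.

```python
def share_of_scale(observed_width, floor, ceiling=1.0):
    """Observed spread as a fraction of the instrument's potential scored range."""
    return observed_width / (ceiling - floor)


eq5d_5l = share_of_scale(0.25, floor=-0.594)  # 0.25 / 1.594
aqol_8d = share_of_scale(0.25, floor=0.09)    # 0.25 / 0.91
print(f"{eq5d_5l:.0%} {aqol_8d:.0%}")  # 16% 27%
```

The same observed spread therefore occupies a larger share of the AQoL-8D's narrower scale than of the EQ-5D-5L's wider one.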

EQ-5D-5L and AQoL-8D utility values were highly correlated by rank ordering (Spearman’s rho 0.68); however, high correlation does not imply close agreement and is blind to the possibility of systematic bias [36]. We observed pairwise disagreement in utility values assessed for a given individual and evidence of systematic bias. In turn, the utility valuations obtained with these instruments in the population who underwent bariatric surgery are non-interchangeable. Our finding of non-interchangeability between the EQ-5D-5L and the AQoL-8D is consistent with a lack of pairwise agreement between the EQ-5D-3L and the AQoL-4D [25].

One of the key drivers for the development of the EQ-5D-5L was to address serious ceiling effects of the EQ-5D-3L [23], with over 45 % of participants scoring perfect health in some studies [37, 38]. The severe ceiling effects of the EQ-5D-3L reflected difficulties in its ability to measure small and medium changes in health [23]. In an investigation of the EQ-5D-5L compared with the EQ-5D-3L across eight patient groups, the ceiling effect was reduced from 20 % (EQ-5D-3L) to 16 % (EQ-5D-5L), on average. Importantly, this study found that ceiling effects were higher for chronic diseases such as diabetes. In this population, the ceiling effect reduced from 34 % (EQ-5D-3L) to 28 % (EQ-5D-5L). In contrast, the ceiling effects for depression were reduced from 12 % (EQ-5D-3L) to 6 % (EQ-5D-5L) [30]. Arguably, this is a direct reflection of the specific question on depression/anxiety in the EQ-5D and underpins the importance of the descriptive systems employed.

Whilst floor/ceiling effects were not investigated in studies of bariatric surgery patients that employed the EQ-5D-3L [39–41], over one-third of participants reported perfect health on the EQ-5D-5L in our study. This is a finding comparable to the extent of ceiling effects reported in recent studies that used the EQ-5D-5L for chronic conditions, including diabetes (n = 117 [42] and n = 289 [43]), end-stage renal disease (n = 150 [44]), and chronic hepatic disease (n = 1088 [45]), and consistent with the comparative findings above.

The ongoing ceiling effects measured in this and other studies indicate the limitations of the breadth of the EQ-5D. Furthermore, research concerning the development of ‘bolt-on’ items for the EQ-5D has argued that these items could facilitate greater sensitivity for specific conditions, and further research has been encouraged [20]. However, it has also been noted that the use of ‘bolt-on’ items may lead to “some variations in measurement between conditions and detract from the advantages of using a generic instrument” [20]. We postulate that inclusion of one or more ‘bolt-on’ items may render results non-interchangeable, even with other ‘EQ-5D’ analyses and, in turn, the current dominance of this instrument irrelevant.

We found the mean, median and maximum scores of the AQoL-8D mental health dimension and MSD were low relative to other AQoL-8D dimension scores, both for the entire sample and for the subgroup at the EQ-5D-5L ceiling. We also found greater dispersion for the AQoL-8D than the EQ-5D-5L across the three most comparable individual dimensions potentially impacting psychosocial health. We contend that together these findings support the greater sensitivity of the AQoL-8D than the EQ-5D towards psychosocial health.

In regard to the moderate correlations observed between utilities obtained from each instrument and ‘current BMI’, we contend that this finding is reflective of weight status being just one factor contributing to the HRQoL of people who have received bariatric surgery. This position is consistent with the most recent evidence, which does not support a direct link between long-term weight reduction and continued improvement/decline in mental health after bariatric surgery [5, 6]. Psychosocial support and weight loss maintenance are both important management components for the HRQoL of this group of individuals in the longer term.

Economic evaluations of interventions that affect HRQoL commonly employ CUA, which prioritises interventions according to the cost per QALY gained [15]. We found that significant differences in the EQ-5D-5L and AQoL-8D descriptive systems impact their sensitivity towards psychosocial domains of health. We also found that the utility values obtained cannot be used interchangeably. Impacts on psychosocial health for bariatric surgery patients have been identified as a vital outcome. Our findings thus have implications for the choice of utility instrument employed for clinical assessment and/or economic evaluation in the population for whom bariatric surgery is a consideration.

As noted previously, NICE’s recommendation to use the EQ-5D for utility assessment is tempered by whether use of the EQ-5D is considered appropriate; a lack of content validity, including missing key health dimensions, is a primary concern [18]. If the nominated choice of instrument lacks sensitivity within a particular health context (or health domain), interventions affecting health states where the instrument’s sensitivity is low will be disadvantaged [32], a potential bias of particular importance for healthcare decision makers. For people who are morbidly obese considering or having undergone bariatric surgery, the impact of any intervention will not be fully captured unless the nominated MAUI is sensitive to psychosocial health.

While the EQ-5D dominates the clinical/economic evaluation literature, its prevalence should not influence the choice of instrument in this (or other) study population(s). Rather, the choice of MAUI should be influenced by the sensitivity of the instrument to a patient group’s health profile. We therefore argue that the AQoL-8D should be preferred to the EQ-5D-5L within the morbidly obese population, including those undergoing or having undergone bariatric surgery, given its sensitivity to the psychosocial dimension of HRQoL.

Within the ASMBS’s recently published outcomes reporting guidelines for bariatric and metabolic surgery [8, 9], the EQ-5D was classified as one of several frequently used generic HRQoL instruments within this population; however, the ASMBS was unable to provide specific guidance as to a preferred HRQoL instrument(s), as previously noted. No reference was made to MAUIs per se, a situation we believe is an important oversight. If an MAUI and its associated utility valuation comprehensively assess and capture the physical and psychosocial domains of health for bariatric surgery patients, use of such an instrument could fulfil ASMBS HRQoL requirements. Related economic evaluations would also be underpinned by robust utility valuation, and thus facilitate defensible resource allocation.

Respondent burden is also a necessary consideration in instrument choice. The ASMBS document argues that HRQoL instruments with more items are less likely to be completed by patients, whereas instruments with fewer items are completed at higher rates. We expected the EQ-5D-5L would achieve a higher level of completion given that it comprises 30 fewer items than the AQoL-8D. Additionally, the average time for completion of the EQ-5D-5L (1 min) is approximately 4 min shorter than that of the AQoL-8D (Table 1). Our study showed a 100 % response rate for both instruments and subsequent generation of individual utility values. Nevertheless, we acknowledge that our study participants were fully engaged through focus group involvement and that this may have influenced the completion rate of the MAUIs for our study. As participant levels of education were relatively evenly spread, this should not confound questionnaire completion.

The ASMBS document also recommends, with reference to a 2011 review of HRQoL instruments measuring bariatric surgery [46], the use of a combination of HRQoL instruments to capture psychosocial impacts. The 2011 review found that while several generic and obesity-specific instruments have been developed and/or used in bariatric surgery, all have limitations [46]. The review investigated the content validity of one MAUI (EQ-5D) and other generic and disease-specific instruments, including the SF-36, Nottingham Health Profile and IWQoL-lite. The review consequently proposed a conceptual framework for a bariatric surgery-specific HRQoL instrument that comprised 20 items, 19 of which, including all of the psychosocial domains of health, are included in the AQoL-8D. The item not included in the AQoL-8D pertained to eating. This conceptual framework subsequently underpinned the development of the disease-specific quality-of-life instrument, the ‘bariatric and obesity-specific survey’ (BOSS) [47]. The BOSS is not an MAUI. The BOSS-42 (the final version of this instrument) contains 42 items, seven more than the AQoL-8D.

Thus, our study found that a single MAUI, the AQoL-8D, is sensitive to the psychosocial as well as the physical domains of health for people who have undergone bariatric surgery, and it captures the vast majority of domains considered crucial in this population. While the length of the AQoL-8D may be an initial deterrent, this concern must be balanced against the sensitivity of this instrument to mental health [48] and physical health dimensions. Further, the use of a combination of up to three or four HRQoL instruments could be more burdensome and time consuming for the study population than the use of a single comprehensive instrument.

The major strength of this study is the use of a homogeneous group of bariatric surgery patients to minimise confounding due to patient characteristics in the identification of similarities and key differences between the EQ-5D-5L and AQoL-8D. The key limitation of this study is the sample size (n = 33). Nevertheless, we found that about one-third of the participants scored perfect health on the EQ-5D-5L, which is consistent with other studies of chronic disease with larger samples. Another limitation is that we did not include a disease-specific instrument because of concerns about the potential impact of respondent burden on both the quantitative and the qualitative components of the broader study. In lieu of a disease-specific instrument, we compared the utility valuations with ‘current BMI’. One further limitation could be attributed to the utilities estimated from the EQ-5D-5L crosswalk value set [49]. Given this study was exploratory, larger confirmatory studies are justified. We also suggest that a comparison between the AQoL-8D and SF-6D would be of value.

5 Conclusions

Before selecting a generic MAUI, researchers should fully understand the instruments’ descriptive/classification systems and the innate sensitivities of the MAUI in their context. Given the relative importance of psychosocial health in the population contemplating or having undergone bariatric surgery, the choice of MAUI may be crucial. For bariatric surgery, the AQoL-8D more fully captured and assessed the psychosocial aspects of these patients’ HRQoL than the EQ-5D-5L did. Additionally, the AQoL-8D was sensitive to the physical aspects of these patients’ HRQoL. We recommend the AQoL-8D in preference to the EQ-5D-5L as the MAUI for patients undergoing bariatric surgery, given their complex physical and psychosocial needs.