In various ways health economists are engaged in eliciting persons’ stated preferences with a view to using this information to inform resource allocation decisions in health care. One elicitation method that is increasingly being used is the discrete choice experiment (DCE). Such experiments involve the analyst constructing a set of alternatives based upon a limited number of important attributes and then obtaining from respondents an indication of preferences over those alternatives. This is done by presenting two or more alternatives and asking respondents to choose between them. The exercise is repeated with alternatives comprising different levels of the attributes in order to infer the relative weight attached to each level of each attribute.

Such techniques were originally developed in marketing research in the early 1970s and have since been widely used in transport research [1, 2, 3, 4]. They are now increasingly being applied to health and health care [5, 6, 7, 8, 9, 10]. The vast majority of health-related DCE studies have sought to assess the benefits of specific health care services such as interventions for miscarriage [5], liver transplantation [8] and in vitro fertilisation [11]. Where the focus has been on benefit assessment, DCEs can be used to: (a) calculate the health gains from different interventions, where the various dimensions of health are used as attributes; (b) calculate the implied willingness to pay for those benefits when ‘financial cost’ is included as an attribute; and (c) express the value of different attributes, including non-health outcomes and ‘process’ factors, in terms of one another [12].
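Uses (b) and (c) rest on simple ratios of estimated coefficients. The following is a minimal sketch, assuming a linear additive utility function has already been estimated; the attribute names and coefficient values are hypothetical and are not drawn from any of the cited studies. Under that assumption, the marginal willingness to pay for an attribute is conventionally taken as the negative of its coefficient divided by the cost coefficient, and the rate at which one attribute trades against another is the ratio of their coefficients.

```python
# Hypothetical coefficients from a linear additive utility model
# (illustrative values only, not taken from any published DCE).
coefs = {
    "staff_attitude_good": 0.50,   # utility gain from 'good' staff attitude
    "waiting_time_weeks": -0.10,   # utility change per extra week of waiting
    "cost": -0.02,                 # utility change per unit of cost
}


def marginal_wtp(coefs, attribute, cost_key="cost"):
    """Marginal willingness to pay: -(attribute coefficient) / (cost coefficient)."""
    if coefs[cost_key] == 0:
        raise ValueError("cost coefficient must be non-zero")
    return -coefs[attribute] / coefs[cost_key]


def trade_off(coefs, attribute, numeraire):
    """Units of `numeraire` giving the same utility change as one unit of `attribute`."""
    return coefs[attribute] / coefs[numeraire]


wtp_attitude = marginal_wtp(coefs, "staff_attitude_good")
weeks_for_attitude = trade_off(coefs, "staff_attitude_good", "waiting_time_weeks")
```

With these illustrative numbers, 'good' staff attitude is valued at 25 cost units and is equivalent to five fewer weeks of waiting.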

One of the principal motivations behind the interest in the approach in health economics has been a desire to go ‘beyond health outcomes’ in the economic evaluation of health technologies, and therefore the approach has been most widely used in the third of these ways [12, 13, 14]. Indeed, the ability to consider a broad range of benefits is put forward as one of the main strengths of the DCE approach. However, others have used the technique to consider preferences relating to other aspects of the health care market such as the physician-patient relationship and job characteristics of health professionals [7, 9]. Advocates of DCE can point to encouraging results concerning both reliability and validity when applied in a health context [15, 16].

The picture painted of DCE in the health economics literature does indeed appear to be a rosy one. However, comparison is rarely made with other preference elicitation methods [5, 6, 11, 12, 13], and more often than not, apparent solutions to old problems bring with them their own set of challenges. We would therefore encourage further reflection by researchers working in this area on some of the important uncertainties and potential weaknesses of DCE as it has been applied in health economics. There are four specific issues that, in our opinion, deserve further consideration: (a) normative issues (i.e. how might DCE data be used to inform policy?), (b) psychological issues (i.e. how meaningful are the data?), (c) technical issues (i.e. how robust are the data?) and (d) generalisability issues (i.e. does the DCE approach imply many repeated preference elicitation exercises?).

Normative issues

A clear objective in using DCE to elicit preferences regarding the benefits from health care is that the results can be used to inform policy decisions. This case has been made strongly in a recent editorial in the British Medical Journal which called for national health policy to be informed by the results of DCEs [14]. However, if data from DCEs are to be so used, consideration needs to be given to the normative issues of whose preferences about what are relevant for which policies.

Taking first the question of whose preferences, it is interesting to note that the vast majority of published DCE studies have collected data from either patients or service users [5, 6, 7, 8, 12, 13, 15, 17, 18, 19, 20, 21, 22, 23, 24]. There are still only a very small number of DCE studies that have sought to elicit preferences from members of the general public [25, 26, 27]. Turning to the question of preferences about what: as indicated earlier, much of the health-related DCE work has focused explicitly on measuring benefits that are not captured well by the measures of health conventionally adopted in health economics, such as the quality-adjusted life year. The desire of researchers is to include non-health benefits, including ‘process’ concerns (e.g. staff attitudes [11]).

There are clearly a number of value judgements being made by DCE analysts, and the relevance of the data generated must be placed in its appropriate normative context, largely that of a tax-financed health care system with a high degree of cross-subsidisation. Whilst patients ex post might place a relatively high value on ‘non-health’ attributes (as seems to be the case from many, although not all, of the studies), tax-payers ex ante might value a more limited (and possibly more health-focused) set of attributes. It can therefore be argued that the DCE studies conducted to date are more applicable to private health insurance schemes than to predominantly tax-based systems, such as those found in the UK and in many other European countries.

Psychological issues

In general, DCE analysts in health have not dealt adequately with the broad range of psychological and cognitive aspects of eliciting preferences and of using the DCE in particular. A specific issue that has received limited attention is the effect that the processes of elicitation themselves have on the construction of preferences. There is now a significant literature (some of it in health economics) that deals with a range of framing effects and heuristics and offers insights that are key to a better understanding of DCE data (for examples see [28, 29]). Attention to this literature might help the authors of papers reporting DCE studies to explain why, for example, the percentage of respondents whom they have had to exclude for having dominant preferences ranges from nearly nobody (for example, [8]) to more than 50% (for example, [13]).

A related issue is the cognitive burden placed on respondents to a DCE study. The simplicity of the exercise for the respondent and the familiarity of making choices in real-life situations are often cited as important strengths of the DCE [13]. Unlike in most other stated preference techniques (willingness to pay, for example), respondents to a discrete choice question are not required explicitly to quantify the strength of their preference for the stimulus presented. However, the choice task is still a considerable cognitive challenge: respondents are required to process a large amount of information contained in the presented scenarios and to consider trade-offs between all of the attributes. If undertaken ‘correctly’, the choice task involves the simultaneous comparison of different levels on many attributes, sometimes as many as seven or eight [7, 8, 12, 18, 24, 25, 30, 31, 32, 33]. This cognitive burden might help to explain the low response rates in some of the studies, in some cases lower than 35% [5, 18, 25].

In addition, further investigation of the thought processes and reasoning behind responses to DCE questions is necessary in order for the meaningfulness of such data to be critically reviewed. For example, it would be interesting to know whether infertile women really do consider ‘good’ (as opposed to ‘bad’) attitudes of staff to be worth giving up a 6% absolute chance of having a child, or about a 33% relative reduction [12]. Similarly, it is surprising that Ubach et al. [9] found hospital consultants to be willing to work an extra one hour per week for an additional net income of only £11.40. This finding has been strongly disputed in a letter from a disgruntled gastroenterologist [34]! In-depth exploration of preferences elicited using DCE is, however, clearly not made possible through the use of postal surveys, which represent the most prevalent approach to data collection in DCE work in the health field [5, 8, 12, 15, 18, 19, 20, 21, 23, 24, 25, 26, 27, 30, 32, 33, 35].

Technical issues

The number of discrete choices presented to respondents in DCE surveys is often very small in relation to the total number of scenarios generated. For example, it is common to see only eight or nine pairwise choices presented in a DCE questionnaire, even when the total number of possible scenarios ranges from approximately 250 to 500, depending on the number of attributes and the number of levels on each attribute [8, 21, 23, 24, 25, 26, 35]. It is not surprising then that the algorithm or model used to generate overall preference weights is almost always a linear additive one with no interaction terms specified. But is this highly restrictive model a descriptively valid one? And are the results robust to the particular scenarios chosen for the valuation exercise? With so few data filling the valuation space, it is impossible to answer these fundamental questions. Further methodological research is required to address these issues as a matter of some urgency.
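The scale of the gap between the scenarios shown and the scenarios possible can be illustrated with a short sketch. It assumes a hypothetical design of four attributes with four levels each (the attribute names are invented for illustration), which gives 256 scenarios, within the range quoted above. It also counts the parameters needed under dummy coding for a main-effects model and for a model with all two-way interactions; the coding scheme is an assumption, not something the cited studies necessarily used.

```python
from itertools import combinations, product
from math import comb

# Hypothetical design: attribute name -> number of levels (illustrative only).
levels = {"effectiveness": 4, "waiting_time": 4, "staff_attitude": 4, "cost": 4}


def full_factorial(levels):
    """Enumerate every scenario as a dict of attribute -> level index."""
    names = list(levels)
    ranges = [range(n) for n in levels.values()]
    return [dict(zip(names, combo)) for combo in product(*ranges)]


def main_effect_params(levels):
    """Dummy coding needs (levels - 1) parameters per attribute."""
    return sum(n - 1 for n in levels.values())


def two_way_interaction_params(levels):
    """Each pair of attributes adds (l_i - 1) * (l_j - 1) parameters."""
    return sum((a - 1) * (b - 1) for a, b in combinations(levels.values(), 2))


scenarios = full_factorial(levels)
n_scenarios = len(scenarios)              # every combination of levels
n_possible_pairs = comb(n_scenarios, 2)   # distinct pairwise choices
n_main = main_effect_params(levels)
n_interactions = two_way_interaction_params(levels)
```

Under these assumptions there are 256 scenarios and 32,640 possible pairs. A main-effects model needs 12 parameters, while adding all two-way interactions needs a further 54, which helps to explain why interaction terms are rarely specified when each respondent answers only eight or nine choices.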

Further technical issues relating to the practice of DCE in health care that need to be considered more fully include: (a) the analysis of data at the individual respondent rather than the aggregate level, (b) alternative methods by which the factorial design is generated, (c) methods of pairing selected scenarios in order to form choices and (d) how to take account of people with dominant preferences in the formal analysis. On the last of these concerns, respondents with dominant preferences are commonly excluded from DCE analyses on the grounds that they are not ‘trading’ between the attributes in question. It is difficult to justify such a position when one considers the purpose for which the preferences are elicited, namely to inform public policy decisions. Given this, it is important that DCE analysts ensure that the preferences of all respondents (both traders and non-traders) are included, such that the results of their analyses have relevance for policy decisions that affect all stakeholders. Scott [36] explores this particular issue.
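Whichever way such respondents are handled, they first have to be identified. The sketch below shows one possible operational definition, which is an assumption rather than a rule taken from the studies cited: a respondent is flagged as dominant on an attribute if, in every choice where the two alternatives differ on that attribute, the alternative with the better level was chosen. The data layout, the attribute names and the convention that a higher level is better are all hypothetical.

```python
def dominant_attributes(choices, attributes):
    """Return the attributes on which a respondent never traded.

    choices: list of (alt_a, alt_b, chosen) tuples, where alt_a and alt_b
             map attribute -> numeric level (higher assumed better) and
             chosen is "a" or "b".
    """
    dominant = []
    for attr in attributes:
        # Only choices in which the alternatives differ on this attribute
        # tell us anything about whether the respondent follows it.
        informative = [(a, b, c) for a, b, c in choices if a[attr] != b[attr]]
        if not informative:
            continue
        if all((c == "a") == (a[attr] > b[attr]) for a, b, c in informative):
            dominant.append(attr)
    return dominant


# Hypothetical responses: levels are coded so that higher is better.
non_trader = [
    ({"success": 2, "attitude": 0}, {"success": 1, "attitude": 1}, "a"),
    ({"success": 0, "attitude": 1}, {"success": 1, "attitude": 0}, "b"),
]
trader = [
    ({"success": 2, "attitude": 0}, {"success": 1, "attitude": 1}, "a"),
    ({"success": 0, "attitude": 1}, {"success": 1, "attitude": 0}, "a"),
]

flags_non_trader = dominant_attributes(non_trader, ["success", "attitude"])
flags_trader = dominant_attributes(trader, ["success", "attitude"])
```

Flagging is kept separate from exclusion so that, in line with the argument above, non-traders can be retained and modelled rather than dropped.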

Generalisability

The final issue relates to the generalisability of the data collected in a DCE study. It has become accepted practice in health economics that, where a focus on ‘health’ is seen as sufficient and/or appropriate, a generic health status measurement tool (such as the EQ-5D) can be used, allowing off-the-shelf preference data to be applied to the health states of interest [37], and for quality-adjusted life years then to be calculated. This limits the burden on patients (in a clinical trial, say) since they are then simply required to describe their health (using a generic health state descriptor), and are not required to consider the strength of their preferences for the health care service being received and its associated benefits. In contrast, when the DCE technique is used, the preference elicitation exercise must be repeated for each clinical setting or technology. For example, the DCE results for the use of magnetic resonance imaging for patients with knee injuries are specific to that intervention and the context of that study and have no meaning in other settings [6].
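The off-the-shelf calculation can be sketched as follows. This is a minimal illustration, assuming a lookup table of preference values for generic health states; the state labels follow the five-digit EQ-5D convention, but the values for states other than full health are invented for illustration and are not taken from any published value set.

```python
# Hypothetical preference values per health state (NOT a published tariff).
# "11111" denotes no problems on any dimension and is anchored at 1.0.
tariff = {"11111": 1.00, "21221": 0.70, "22322": 0.40}


def qalys(profile, tariff):
    """Sum of (preference value x years spent) over a patient's health profile.

    profile: list of (health_state, years) tuples as described by the patient.
    """
    return sum(tariff[state] * years for state, years in profile)


# A patient spends 0.5 years in state 21221, then 1.5 years in full health.
patient_profile = [("21221", 0.5), ("11111", 1.5)]
total_qalys = qalys(patient_profile, tariff)
```

The patient supplies only the health state descriptions; the preference values come from the pre-existing table, which is what removes the need for a new elicitation exercise in each setting.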

The research resource implications of repeated preference elicitation exercises across different groups of patients and by the same patient groups across different interventions are potentially very considerable, and the use of case-specific surveys for every new technology is unlikely to be an attractive or even a realistic option. The onus therefore is on the advocates of the DCE to show how the results generated have sufficient generalisability to be of use in a broad health policy context.

Conclusion

Our intention in writing this editorial is not to condemn the use of DCE in health economics since, as we have indicated above, we believe that the approach has the potential to broaden our focus in the measurement of benefits associated with health care interventions. Rather, our purpose is to seek a more open debate on the relative strengths and limitations of the DCE method, particularly when applied in health settings. We acknowledge that there have been previous contributions to the literature calling for methodological issues to be addressed [12, 13, 38]. However, the focus has been on the identification of the preferred approach to conducting DCEs. For example, further research into how best to model the benefit function has repeatedly been called for, but little subsequent empirical work on this topic has appeared in the literature.

We believe that DCEs as currently practised in health economics raise some important normative and methodological issues that require urgent attention. There is an apparent rush to make use of the DCE approach, as demonstrated by the increasing number of published studies. Given the concerns that have been outlined here, it is our view that more caution and greater circumspection towards the technique are appropriate at this stage.