Key Points for Decision Makers

Interest in using quantitative patient preference data to inform the valuation of non-health benefits in health technology assessment decision-making is growing, although there is no consensus on how this should be done.

This paper explains how patient preference studies could be used to generate a generic measure of the value of non-health benefits that adopts the concept of value favoured by many agencies – quality-adjusted survival equivalents (QASE).

Further work is required to demonstrate the methodological advantages of QASE over other methods and to determine whether and how health technology assessment agencies would accept this metric.

1 Introduction

Choice-based patient preference (PP) data are being increasingly used to inform healthcare decisions [1]. The US Food and Drug Administration (FDA) defines PP data as ‘assessments of the relative desirability or acceptability to patients of specified alternatives or choices among outcomes or other attributes that differ among alternative health interventions’ [2].

There are many reasons why PP data should inform health technology assessment (HTA) [3], most notably patients’ right to participate in decisions that affect them, their experiential knowledge of their disease, the social legitimacy that their involvement adds to decisions and the need to ensure patients are satisfied with reimbursed technologies. HTA agencies have historically sought PP data on these topics in the form of qualitative insights, for example via submissions from patient advocacy groups or patient representatives as committee members. In contrast, quantitative preference data used in HTA tend to be from general population samples to inform the utility estimation required by economic evaluation [1].

While HTA bodies have conventionally used matching methods, such as time trade-off (TTO), and general population preferences to value treatment benefits [1], they have recently indicated an interest in using quantitative, choice-based PP data in their evaluations [1, 4, 5]. The Innovative Medicines Initiative PREFER project was established to generate recommendations on when and how to collect and use PP data to support decision-making by industry, regulatory authorities and HTA bodies. The final recommendations stated that quantitative PP data could support HTA decision makers’ deliberations by justifying unmet needs, selecting endpoints in early clinical development, understanding patients’ treatment choices and estimating the relative value of non-health benefits such as changes in mode of administration (MoA) [4, 6]. However, little attention has been given to how these data would be used for such purposes. The objective of this paper is to identify and evaluate a method for using choice-based PP data from a discrete choice experiment (DCE) to estimate the value of non-health benefits while remaining consistent with the concept of value prevalent among HTA agencies.

The method considered is an adaptation of that used in a submission to the National Institute for Health and Care Excellence (NICE) to estimate the value of changes in MoA in the assessment of migalastat for Fabry disease [7]. In that submission, a DCE was conducted with a general population sample and estimated that the utility of moving from biweekly infusions to oral treatment once every other day was equivalent to 1.79 additional years of survival. Although the effect of this study on NICE’s decision is unclear, this example provides insight into how choice-based PP data might be used to quantify the non-health benefit of less-burdensome MoA. First, it demonstrated how a DCE can be used to elicit preferences for MoA relative to other benefits of treatment. Second, it expressed the value of changes in MoA on a standardisable metric that is consistent with the notion of value favoured by many HTA agencies.

The remainder of this paper further develops the method used in the above Fabry disease example to consider whether it provides the basis for using quantitative choice-based PP data to generate a standardised measure of non-health benefits using the concept of value employed by many HTA agencies. The next section describes how the concept of survival equivalence, as used in the Fabry disease submission, may generate erroneous estimates of the utility of non-health benefits and how adopting quality-adjusted survival equivalents (QASE) can correct for this. The subsequent section asks whether face-valid, QASE-based utility estimates can be estimated from published DCEs. The fourth section identifies some of the methodological and normative considerations with the adoption of QASE.

2 Using PP Data to Calculate Quality-Adjusted Survival Equivalents

This section describes how PP data could be used to calculate survival equivalence for a treatment that generates both improvements in health-related quality of life, \({q}^{{\text{H}}}\), and non-health benefits, \({q}^{NH}\), and how an alternative utility concept, QASE, provides a more accurate estimate of the utility of non-health benefits.

The NICE Fabry submission valued \({q}^{NH}\) in terms of years of survival equivalence. The utility function generated by the study was of the following form:

$${U}^{pt}= {f}^{pp}\left({q}^{NH},{q}^{H}, T, X\right),$$
(1)

where \(T\) is life expectancy in years, and \(X\) is a matrix of other attributes that influence a patient’s treatment choice.

The increase in survival that will give the patient the equivalent utility as a change in \({q}^{NH}\) generated by intervention a was estimated as:

$${T}^{{\text{equiv}}}={\left(\frac{\partial {U}^{pt}}{\partial T}\right)}^{-1}\left(\frac{\partial {U}^{pt}}{\partial {q}^{NH}}\right) \left(\frac{\partial {q}^{NH}}{\partial a}\right),$$
(2)

where \(\left(\frac{\partial {U}^{pt}}{\partial T}\right)\) is the utility generated by a one-unit improvement in survival, \(\left(\frac{\partial {U}^{pt}}{\partial {q}^{NH}}\right)\) is the utility generated by an improvement in non-health benefit and \(\left(\frac{\partial {q}^{NH}}{\partial a}\right)\) is the non-health benefit generated by intervention a.
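As a minimal sketch of Eq. 2, the calculation can be written as a ratio of marginal utilities. The function name, argument names and coefficient values below are hypothetical and chosen only for illustration; they are not taken from the Fabry submission or from any published DCE.

```python
def survival_equivalent(mu_survival, mu_non_health, delta_non_health):
    """Eq. 2: years of survival giving the same utility as a non-health change.

    mu_survival      -- dU/dT, utility per additional year of survival
    mu_non_health    -- dU/dq_NH, utility per unit of non-health benefit
    delta_non_health -- dq_NH/da, non-health benefit generated by intervention a
    """
    if mu_survival == 0:
        raise ValueError("marginal utility of survival must be non-zero")
    return (mu_non_health * delta_non_health) / mu_survival


# Hypothetical coefficients, for illustration only.
example = survival_equivalent(mu_survival=0.5, mu_non_health=0.25,
                              delta_non_health=1.0)
print(f"Survival equivalent: {example:.2f} years")
```

With these assumed values, a one-unit non-health improvement is worth half as much utility as a year of survival, so the survival equivalent is half a year.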

In Eq. 1, the utility impact of \(T\) is estimated while controlling for \({q}^{{\text{H}}}\). So \({T}^{{\text{equiv}}}\) can be interpreted as the improvement in survival required to generate the same utility as an improvement in \({q}^{NH},\) assuming that \({q}^{{\text{H}}}\) is constant. However, a patient’s \({q}^{{\text{H}}}\) can reasonably be expected to change. If \({q}^{{\text{H}}}\) were to decline post treatment, then \({T}^{{\text{equiv}}}\) would be underestimated, and vice versa. That is, all other things being equal, if \({q}^{{\text{H}}}\) is lower, \({T}^{{\text{equiv}}}\) will need to be longer to generate the equivalent improvement in utility. To account for this, it is necessary to consider the change in \({q}^{{\text{H}}}\) post treatment and translate \({T}^{{\text{equiv}}}\) into QASE. For instance:

$$QASE={\left(\frac{\partial {U}^{pt}}{\partial T}\right)}^{-1}\left(\frac{\partial {U}^{pt}}{\partial {q}^{NH}}\times \frac{\partial {q}^{NH}}{\partial a}+\frac{\partial {U}^{pt}}{\partial {q}^{H}}\times \frac{\partial {q}^{{\text{H}}}}{\partial a}\right),$$
(3)

where \(\left(\frac{\partial {q}^{{\text{H}}}}{\partial a}\right)\) is the change in health benefit post treatment a and \(\left(\frac{\partial {U}^{pt}}{\partial {q}^{{\text{H}}}}\right)\) is the utility generated by a change in health benefit. This formulation is a special case that assumes that \(T\) and \({q}^{{\text{H}}}\) are utility independent. As this is not necessarily true, Eq. 1 needs to be updated to account for second-order effects, as follows:

$${U}^{pt}= {f}^{pp}({q}^{NH},{q}^{{\text{H}}}, T, {q}^{{\text{H}}}\times T, X),$$
(4)

and QASE is given by:

$$QASE={\left(\frac{\partial {U}^{pt}}{\partial T}+ \frac{\partial {U}^{pt}}{\partial T \times {q}^{{\text{H}}}} \times \frac{\partial {q}^{{\text{H}}}}{\partial a}\right)}^{-1}\left(\frac{\partial {U}^{pt}}{\partial {q}^{NH}}\times \frac{\partial {q}^{NH}}{\partial a}+\frac{\partial {U}^{pt}}{\partial {q}^{{\text{H}}}}\times \frac{\partial {q}^{{\text{H}}}}{\partial a}+\frac{\partial {U}^{pt}}{\partial T\times {q}^{{\text{H}}}} \times \frac{\partial {q}^{{\text{H}}}}{\partial a} \times \frac{\partial T}{\partial a}\right),$$
(5)

where \(\frac{\partial {U}^{pt}}{\partial T \times {q}^{{\text{H}}}}\) is the utility generated by the interaction between \(T\) and \({q}^{{\text{H}}}\) and \(\frac{\partial T}{\partial a}\) is the improvement in survival generated by treatment a.
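The two QASE formulations can be sketched side by side. This is an illustration under stated assumptions rather than a definitive implementation: Eq. 5 is read here as multiplying the interaction coefficient by the change in \({q}^{{\text{H}}}\) in the denominator, and by the changes in both \({q}^{{\text{H}}}\) and \(T\) in the numerator. All function names, argument names and numerical values are hypothetical.

```python
def qase_first_order(u_T, u_nh, d_nh, u_h, d_h):
    """Eq. 3: QASE assuming T and q_H are utility independent.

    u_T  -- dU/dT          d_nh -- dq_NH/da
    u_nh -- dU/dq_NH       d_h  -- dq_H/da
    u_h  -- dU/dq_H
    """
    if u_T == 0:
        raise ValueError("marginal utility of survival must be non-zero")
    return (u_nh * d_nh + u_h * d_h) / u_T


def qase_with_interaction(u_T, u_nh, d_nh, u_h, d_h, u_Th, d_T):
    """Eq. 5 as read in the lead-in: adds the T x q_H interaction term.

    u_Th -- dU/d(T x q_H), utility of the interaction between T and q_H
    d_T  -- dT/da, improvement in survival generated by treatment a
    """
    denominator = u_T + u_Th * d_h
    if denominator == 0:
        raise ValueError("adjusted marginal utility of survival is zero")
    numerator = u_nh * d_nh + u_h * d_h + u_Th * d_h * d_T
    return numerator / denominator


# Hypothetical inputs, for illustration only.
inputs = dict(u_T=0.4, u_nh=0.2, d_nh=1.0, u_h=1.0, d_h=0.1)
print(qase_first_order(**inputs))
print(qase_with_interaction(**inputs, u_Th=0.5, d_T=0.5))
```

When the interaction coefficient is zero, the second function reduces to the first, which mirrors the statement that Eq. 3 is a special case.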

3 Estimate of QASE for Mode of Administration Changes for Patients with Cancer

This section asks whether face-valid, QASE-based utility estimates can be estimated from published DCEs in oncology.

Published DCEs containing both MoA and survival attributes were identified from an existing review of DCEs with patients with cancer [8]. Six studies met these inclusion criteria [9,10,11,12,13,14]. The studies were conducted among patients with three types of solid tumours: metastatic renal cell carcinoma, melanoma and castration-resistant prostate cancer. Data were extracted on (1) study and sample characteristics, (2) attribute definitions and levels and (3) choice model coefficients. Data from one study were removed, as the part-worths for all MoA levels were not statistically significant [9]. Key study characteristics are outlined in Table 1.

Table 1 Characteristics of the DCEs included in the review [9,10,11,12,13,14]

The first observation from the review is that none of the published DCEs estimated second-order effects (\(\frac{\partial {U}^{pt}}{\partial T \times {q}^{{\text{H}}}}\)), and none reported the change in \({q}^{{\text{H}}}\) or \(T\) post treatment. Thus, it was only possible to estimate \({T}^{{\text{equiv}}}\) from these studies.

Among the studies, survival attributes were defined as either length of remaining life (two studies, remaining life-years ranged from one to three years) or timepoint-specific probability of survival (four studies, range of survival probability: 45% at 1 year to 50% at 3 years). To facilitate comparison across studies, probabilistic survival attributes were transformed into median length of remaining life [15]. Where mode and frequency of administration were included as separate attributes, both coefficients were extracted and used to calculate utility levels for mode and frequency combinations.
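The transformation of a timepoint-specific survival probability into a median length of remaining life can be illustrated as follows. The method actually used is the one described in [15] and is not reproduced here; this sketch instead assumes a constant hazard (exponential survival), which is our simplifying assumption, and the function name is hypothetical.

```python
import math


def median_survival_from_probability(survival_probability, timepoint_years):
    """Median survival in years, ASSUMING exponential survival S(t) = exp(-h * t).

    survival_probability -- probability of being alive at timepoint_years
    timepoint_years      -- the timepoint the probability refers to
    """
    if not 0.0 < survival_probability < 1.0:
        raise ValueError("survival probability must lie strictly between 0 and 1")
    if timepoint_years <= 0:
        raise ValueError("timepoint must be positive")
    # Constant hazard implied by the reported survival probability.
    hazard = -math.log(survival_probability) / timepoint_years
    # The median is the time at which S(t) falls to 0.5.
    return math.log(2.0) / hazard


# The two ends of the range of survival attributes reported above.
print(median_survival_from_probability(0.45, 1.0))
print(median_survival_from_probability(0.50, 3.0))
```

Under this assumption, a 50% probability of survival at 3 years corresponds to a median of 3 years by construction, and a 45% probability at 1 year corresponds to a median of slightly less than 1 year.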

Utility differences were calculated for pairs of MoA types (see Table 1 for MoA levels included in the studies). This was done for all pairs of MoA levels in each study, excluding within-category pairings. For example, utility changes were calculated for pairings of oral and intravenous (IV) MoA levels but not for pairs of oral MoA levels.
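The pairing rule described above can be sketched as follows. The level names, category labels and part-worth values are hypothetical placeholders, not coefficients from the reviewed studies.

```python
from itertools import combinations


def cross_category_differences(levels):
    """Utility differences for all pairs of MoA levels in different categories.

    levels -- dict mapping level name to (category, part_worth)
    Returns a list of (level_a, level_b, utility_a - utility_b) tuples.
    """
    differences = []
    for (name_a, (cat_a, u_a)), (name_b, (cat_b, u_b)) in combinations(
        levels.items(), 2
    ):
        if cat_a == cat_b:
            continue  # skip within-category pairings, e.g. oral versus oral
        differences.append((name_a, name_b, u_a - u_b))
    return differences


# Hypothetical part-worths for one study, for illustration only.
study_levels = {
    "oral_daily": ("oral", 0.30),
    "oral_twice_daily": ("oral", 0.20),
    "iv_every_3_weeks": ("iv", -0.25),
    "sc_weekly": ("sc", -0.10),
}
for level_a, level_b, diff in cross_category_differences(study_levels):
    print(f"{level_a} vs {level_b}: {diff:+.2f}")
```

With four levels there are six possible pairs; the single oral versus oral pair is dropped, leaving five cross-category utility differences.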

Survival equivalents for changes in MoA were estimated using the marginal utility of survival and utility differences in MoA. This was performed for all estimates of marginal utility reported in the studies, reflecting variations in the marginal utility of survival with different levels of survival and models reported for patient subgroups (Table 1). A total of 78 estimates of survival equivalents were generated.
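One way to picture why a single MoA utility difference yields several survival equivalents is to compute the marginal utility of survival separately for each interval between adjacent survival levels. The sketch below assumes survival enters the choice model as categorical levels with part-worths; the values and function names are hypothetical.

```python
def marginal_utilities(survival_levels):
    """Utility per year between adjacent survival levels.

    survival_levels -- list of (years, part_worth) tuples
    Returns a list of ((years_low, years_high), utility_per_year).
    """
    ordered = sorted(survival_levels)
    return [
        ((y0, y1), (u1 - u0) / (y1 - y0))
        for (y0, u0), (y1, u1) in zip(ordered, ordered[1:])
    ]


def survival_equivalents(utility_difference, survival_levels):
    """One survival equivalent per survival interval for a given MoA change."""
    return [
        (interval, utility_difference / slope)
        for interval, slope in marginal_utilities(survival_levels)
        if slope != 0
    ]


# Hypothetical part-worths showing diminishing marginal utility of survival.
levels = [(1.0, 0.0), (2.0, 0.8), (3.0, 1.2)]
for interval, years in survival_equivalents(0.2, levels):
    print(f"Survival {interval[0]:.0f}-{interval[1]:.0f} years: {years:.2f} years")
```

In this illustration the same MoA utility difference is worth more years of survival in the interval where an extra year adds less utility.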

Figure 1 shows the survival-equivalent values for MoA pairs. Most survival-equivalent estimates were for moving from IV to oral MoA (n = 43), with fewer estimates for moving from subcutaneous to oral (n = 11) and subcutaneous to IV (n = 8). Within the IV-to-oral estimates, survival equivalents were estimated for a variety of IV frequencies and administration locations: IV administration every 2 weeks (n = 6), every 3 weeks (n = 24), every 3 weeks following an initial hospital stay (n = 2), every 4 weeks (n = 6), and 5 days per week for 1 month followed by self-injection 3 days a week for 1 year (n = 3).

Fig. 1 Survival equivalents by MoA pairs

These data present a mixed picture of the face validity of survival equivalence estimates. As expected, oral MoAs were consistently preferred to both IV and subcutaneous MoAs. Also, less utility was gained from avoiding IV administered every 2–4 weeks (survival-equivalent mean 0.127, standard deviation [SD] 0.116, n = 38) than from avoiding IV that involved an initial hospital stay (survival-equivalent mean 0.675, SD 0.049, n = 2) or avoiding an intensive daily IV regimen followed by a period of self-injection (survival-equivalent mean 0.670, SD 0.233, n = 3). However, the variation in survival equivalents of IV versus oral MoA with the frequency of IV was counter-intuitive, with survival equivalents increasing as the frequency of IV use declined. Further, there was variation in the survival equivalents for common MoA pairs. For instance, the survival equivalence of the utility generated by moving from an IV every 3 weeks to an oral regimen ranged from 0 to more than 0.35 years.

This variation in survival-equivalence estimates does not necessarily invalidate QASE, as it would not be expected that the marginal utility of either MoA or survival would be consistent across the patients studied in the reviewed publications. Relatedly, we have only estimated \({T}^{{\text{equiv}}}\) rather than QASE.

To explore this variation further, Table 2 reports a bivariate analysis of survival equivalents by study characteristics. Survival equivalents varied with all study characteristics included in this analysis; the survival equivalent is higher in Japan than in North America, increases with age and baseline survival, decreases with baseline utility and is higher when the survival attribute is defined as length of survival and where the treatment period is clearly defined. Some of these trends have face validity (as the marginal utility of survival decreases with baseline survival, we might expect a higher survival equivalence for the same MoA change). However, the type of multivariate analysis required to determine whether variation in patient and study characteristics can explain variation in survival equivalence is not possible because of the small sample size and the strong correlation between explanatory variables (Supplementary Table 1).

Table 2 Survival-equivalents of moving from IV to oral MoA by study characteristics (n = 38)

4 Methodological and Normative Considerations

Compared with alternative approaches, the QASE metric has methodological and normative advantages and disadvantages. The QASE estimate leverages choice-based PP data, such as DCE. The obvious approach against which to compare QASE is TTO, the matching method conventionally preferred by HTA agencies.

Both TTO and DCE are recommended by the EuroQol Group for valuing the EQ-5D-5L [16]. However, further work is required to compare the methods in the context of QASE. Theoretically, Johnson and colleagues point to various methodological advantages of choice-based PP data compared with the quality-adjusted life-year estimated using TTO [17]. Most relevant here is their observation that choice-based estimates can account for preference non-linearities in \({q}^{{\text{H}}}\) or \(T\), as would be required to estimate Eqs. 4 and 5.

However, further empirical work is required to substantiate this benefit of QASE. Only one study has used both DCE and TTO to estimate process utility [18]. It identified different patterns of responses across methods. For example, some processes of childbirth that were found to be relatively important according to the DCE data were found to be relatively unimportant based on the TTO responses and vice versa. However, the DCE conducted in the study did not test for second-order effects. More generally, direct comparisons of TTO and DCEs provide mixed evidence on the consistency, relative validity and reliability and ease of completion [19, 20].

The application of QASE also raises normative questions, in particular, whether HTA agencies accept the use of a patient’s perspective in non-health benefit valuation. The normative considerations raised by the selection of patient or public preferences as the basis for benefit valuation have been examined in an abundance of literature [21,22,23,24]. Here we highlight two key considerations. The first is whether the legitimate perspective is that of the recipient of treatment or that of the potential recipient or funder of treatment. In socially funded healthcare systems, HTA agencies have often favoured the use of the public’s preference as representative of the taxpayer who funds provision. The second is whether and how public and patient preferences vary. Variation is expected to arise from two sources: patients’ greater insight into the benefits of treatment and the possibility that patients adapt to their condition and, thus, undervalue benefits. The latter is amenable to empirical assessment. Comparisons of patient and public preferences conclude that, on average, patients give higher values to their health states than non-patients, but this varies by patient and by the health state valuation method [25].

The survival equivalence estimates used in the NICE Fabry disease submission may suggest that patients adapt to the burden of treatment administration and, thus, ‘undervalue’ improvements in mode of administration. The DCE conducted as part of the submission with a general population sample estimated a utility gain of 1.79 years of survival equivalence for moving from an IV to an oral formulation. This is substantially greater than that estimated from published DCEs with patients with cancer in the previous section: 0.127 years (SD 0.116, n = 38). The evidence review group’s (ERG) assessment supports the greater face validity of the PP-based estimate; they criticised the magnitude of the general population-based estimate and instead conducted scenario analyses that reduced the associated utility impact by 50–75%. The PP-based estimate may further converge with those used by the ERG if they are adjusted for baseline life expectancy, as we would expect smaller survival equivalence estimates from patients with cancer with a lower life expectancy and, thus, a high marginal utility of survival.
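As a rough illustration of the size of the remaining gap, suppose the ERG’s 50–75% reductions are applied proportionally to the 1.79-year survival equivalent. This proportional scaling is our simplifying assumption; the ERG applied its reductions to the utility impact used in the economic model, not to the survival equivalent itself.

$$1.79 \times (1-0.50) = 0.895 \text{ years}, \qquad 1.79 \times (1-0.75) = 0.4475 \text{ years}.$$

Under this assumption, the scaled general population values remain roughly 3.5 to 7 times the PP-based mean of 0.127 years, so the ERG scenarios narrow the difference without eliminating it.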

5 Discussion

HTA agencies are interested in using quantitative choice-based PP data to evaluate non-health benefits [4, 6]. However, there is no accepted method for using such data for this purpose in a way that would generate a standard, comparable metric consistent with the value concepts commonly adopted by HTA agencies. This paper describes, expands upon and evaluates one such method used in a Fabry disease submission to NICE: estimation of survival equivalence.

It is proposed that the concept of survival equivalence be extended to estimate QASE, as this will give a more accurate estimate of utility by accounting for changes in health-related quality of life post treatment. The ability of choice-based preference methods to do this while accounting for second-order effects between length and quality of life is a theoretical advantage of QASE over TTO, the utility estimation method conventionally preferred by HTA agencies. However, the empirical evidence comparing TTO and DCE is not sufficient to substantiate this benefit, as the DCEs included in this work do not account for second-order effects. Further, capturing the second-order effects required to estimate QASE will require more complex DCE designs and larger sample sizes. This extra complexity may be compounded if second-order effects are not restricted to health-related quality of life and survival, as assumed in Eq. 5. For instance, when estimating the QASE for non-health benefits, depending on how the non-health benefit is experienced over time, second-order effects may also exist between non-health benefits and survival. Insufficient research has been conducted to determine the feasibility of assessing such second-order effects, with DCEs in healthcare frequently not accounting for them [26]. This may partly reflect the fact that specifying such effects was not required to address the research questions that the DCEs were designed to answer. Further work could usefully assess the validity of DCE-based second-order effects between quality and length of life generally and, more specifically, test the impact of including second-order effects in QASE estimates.

Given HTA agencies’ conventional leaning towards the use of general population preferences to estimate utility, the use of PP data to estimate QASE will require the resolution of normative concerns. The use of QASE may be normatively more acceptable where the patient is the consumer of healthcare, and patients are expected to have greater insight into their condition than the public. QASE will be less acceptable in socially funded health systems and where the patient adapts to their condition. Further work could usefully compare patient and public trade-offs of survival and non-health benefits and understand HTA agencies’ normative objectives to determine their acceptance of QASE.

Where QASE is adopted by HTA agencies, we would not recommend that this valuation be formally integrated with the results of the cost-utility analysis, as was done in the Fabry disease example [7]. Doing so would involve mixing the results of equivalent, though not identical, methods and mixing perspectives. Rather, knowing patients’ QASE valuation of non-health benefits facilitates decision makers’ weighing of this value component with other HTA evidence. The weight that such evidence is given in a decision would be left to the normative judgement of the decision maker.

This study identified several implications for the design of the preference studies required for estimating QASE. First, it would be necessary for studies to be designed to allow the identification of second-order effects between length and quality of life attributes. Second, the marginal utility of survival varied with differing survival attribute definitions, indicating the potential value of standards for defining survival attributes in preference studies. Third, it is important to provide a detailed MoA attribute definition, including the period over which the change in MoA is experienced. Fourth, the levels of the survival attributes should anticipate the likely QASE estimates so that estimates of equivalence do not exceed the attribute level ranges included in the study design.

In response to calls from the field [4, 6], this paper explicitly considers QASE as a means to use choice-based PP data to estimate the value of non-health benefits. However, there is nothing about QASE that would necessarily limit its application to this context. The valuation of health benefits is subject to the same dependencies between quality and length of life, and these are currently not captured by the TTO-based utility tariffs used to value health. Further, patient preferences are not integral to the estimation of QASE, which could be estimated based on DCEs conducted with, for instance, the general population. Further work could usefully consider these broader applications of QASE.

In summary, QASE provides an opportunity to generate a patient-centric, standardised measure of non-health benefits, calculated using the utility estimates and trade-offs elicited through a DCE. Despite its methodological advantages, further work is required to demonstrate its reliability and validity and to align its application with the normative goals of HTA agencies.