Introduction

Quality-adjusted life year (QALY) gain is a very common unit used to measure the effectiveness or clinical profitability of an intervention in health [1,2,3]. Namely, a QALY provides the utility value associated with a given health state for 1 year [4] and can be calculated in many ways [5]. The utility value of a QALY is equal to one in perfect health and to zero in case of death. Considering the limited resources in healthcare, it is important to determine a threshold below which a health intervention is considered cost-effective and a maximum threshold beyond which an intervention should be rejected. Several studies have proposed a threshold for the monetary value of a QALY [6,7,8], and several methods have been used to estimate this value [9].

A first way to give monetary value to a QALY is to consider the opinion of experts in the field. This relatively simple method consists of choosing, for a country, the value of a QALY based on the literature and the opinions of experts in the field. Starting with this method, several authors have provided a cost-effectiveness threshold value for a QALY in certain countries. Laupacis et al. [7] originally proposed a range value of CAD $20,000 to CAD $100,000 for Canada based on the work of Kaplan and Bush [6] for the United States. Menon and Stafinski [8] proposed a more recent value of USD $50,000, which seems to come from Laupacis [10]. However, this method remains criticized in the sense that it appears arbitrary and does not necessarily reflect the real monetary value of a QALY. In particular, it does not consider the preferences of society (i.e., in its role of taxpayer).

A second way to establish these threshold values is based on the human capital theory, which postulates that the statistical value of a life (SVL) corresponds to the contribution that an individual brings to society [11, 12]. It consists of using a value equivalent to the gross domestic product (GDP) per capita. Therefore, the value of the remaining life of an individual would be the discounted present value of his or her future income. However, it is argued that the SVL goes well beyond an individual’s simple economic contribution, which is why some authors have postulated using multiples of the GDP per capita to assess the value of a QALY [13, 14]. For the World Health Organization (WHO), this value increases up to three times the GDP per capita, especially in developing countries [15, 16]. However, it should be noted that although the GDP per capita reflects an individual’s economic contribution, it does not reflect the individual’s net contribution, because everyone consumes a large number of goods and services during his or her lifetime. Therefore, Weisbrod [17] proposed using income net of consumption. Regardless of the method used, the human capital theory approach presents several limitations. For instance, besides the justification of the income discount rate, this method does not consider the pain and suffering from disease, the reduction or cessation of leisure activities, or the contribution of unpaid domestic and volunteer activities [11].

A third and more appropriate way to determine a threshold value is to make the assessment directly within the general population [18]. In this setting, the preferences of individuals make it possible to reach the theoretical foundations of utility and economic well-being [19] and to ensure the effective allocation of health resources to maximize the population’s health. Although this method seems to be the most appropriate and could consider the value of unpaid activities [20], it is not free of the biases that can affect survey methodology and the understanding of a questionnaire. There are two possible approaches used in the direct assessment method [15, 21]. First, there is the perspective that one is willing to gain a year of life in perfect health (QALY = 1). From this perspective, the threshold value is evaluated based on what one is willing to pay, i.e., the willingness to pay (WTP). The other perspective is that one is willing to take a risk for his or her health (to degrade the quality of his or her health); in such a case, the utility value of a QALY is strictly less than 1. From this perspective, the threshold value is evaluated based on the amount of money one would receive as compensation or payment for risk taking, i.e., the willingness to accept (WTA). These two approaches produce different results, with higher values when evaluating how much one is willing to accept [22,23,24], thereby implying that the monetary value of a QALY differs depending on whether the choice is for an investment or a disinvestment [25]. Generally, the monetary value of a QALY is close to one times the GDP per capita when it is assessed according to the WTP method, with the ratio varying from 0.05 to 5.40 [26]. However, when assessed according to the WTA approach, this ratio varies from 3 to 20 [9]. The disparity of this ratio can be explained by the experiment, the subjects, the standard of living, etc. [1, 27,28,29,30,31,32].

Considering the strong differences that exist between countries in terms of the monetary value assigned to a QALY, it appears that considering a threshold value that may not reflect the preferences of individuals could lead to over- or underinvestment in health [33]. It is therefore important to determine, for each country and in its specific context, the value that its population assigns to 1 year of life in perfect health.

The objective of this paper was first to conduct a review of the literature on the WTP for a QALY (WTPQ) and second to propose a method for predicting the monetary value of a QALY. Our contributions are multiple. First, we revised and updated the previous empirical literature [26, 34,35,36]. Second, we performed an econometric estimation to highlight the relation between WTPQ and study settings such as the country or region context, the utility elicitation method (UEM), and the WTP elicitation method (WEM). We also used sample characteristics such as the age, sex, education, and income of respondents. This econometric estimation included more variables than other studies (e.g., [37,38,39,40,41,42]) while maintaining a consistent sample size. Third, as an application of the method, we used the estimated coefficients to predict the monetary value of a QALY for the population of the province of Quebec (Canada).

The rest of the paper is structured as follows. The research methodology and the selection process are presented first. Then, we describe the studies included and the data used for the regression. After that, a meta-regression is performed to predict the monetary value of a QALY based on the studies’ characteristics. Finally, we extrapolate this value to the Quebec population.

Materials and methods

Research strategy

Following a predefined research protocol, we conducted a systematic review of the literature with meta-regression. We consulted the APA PsycINFO, Cumulative Index to Nursing and Allied Health Literature (CINAHL), Medline EBSCO, EconLit, Scopus, ScienceDirect (Elsevier), PubMed, and the Cochrane Library databases. We also searched for relevant studies in the gray literature (i.e., various websites). We completed our review by searching the reference lists of the previous works cited in the introduction section [26, 34,35,36] and the references of the identified studies. The search was conducted using a combination of keywords in either English or French. The keywords used were as follows: Quality-adjusted life year, QALY, Willingness to Pay, WTP, Willingness to Accept, WTA. We did not restrict the search to any elicitation method. The article searches covered an unrestricted period up to June 26, 2020. The research strategy is available in the appendix.

Study selection and data analysis

The selection of studies was based on the criteria that the study must include the monetary value of a QALY calculated through a stated preferences method and should not be a literature review. Thus, only primary studies were included in this review. In the case that we found a systematic review, we considered the included primary studies to complete our selection. The selection process was independently performed by two evaluators after reading the titles and abstracts of the studies. After that, the studies were fully read and selected if they met the inclusion criteria. Studies were excluded if they did not explicitly calculate a WTPQ or they did not allow us to perform the calculation ourselves based on the data presented. However, authors were contacted to obtain additional information if necessary. We had no restriction regarding the target population or geographic location. One evaluator extracted data in a grid format, and the second validated the data. In case of disagreement between the evaluators, arbitration by a third evaluator was planned. Our focus was to extract data that could allow us to perform a meta-regression. Particularly, in addition to WTPQ and annual income, we sought information for region, country, health status, UEM, WEM, payment vehicle, and socioeconomic variables such as age, sex, employment status, marital status, education, etc.

We first performed a descriptive analysis of the data. We extracted and reported the cost data in 2018 US dollars and US international dollars (purchasing power parity, PPP). We used inflation and exchange rate data from the World Bank and International Monetary Fund websites [43, 44]. For Taiwan (Republic of China, Taiwan), we used data from the National Statistics and Economy Watch (Economy Watch, Statistics Taiwan). For studies that did not report the cost year, authors were contacted, and when we did not get answers, we took the year before the publication date as the reference. We grouped the socioeconomic data by category if such data were available. The quality of each study was evaluated using the NIH Quality Assessment Tool [45] and the checklist from Xie et al. [46]. Then, we conducted a regression using ordinary least squares (OLS). Our dependent variable was the monetary value of a QALY in purchasing power parity, and the independent variables were the study characteristics and socioeconomic data of the sample. Based on coefficients from the meta-regression and population characteristics, a QALY value prediction was made for the province of Quebec.
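The conversion of reported costs to a common 2018 PPP basis can be illustrated with a minimal sketch. The order of operations (inflate in local currency, then divide by a PPP conversion factor), the function name, and all numeric inputs below are assumptions made for illustration; they are not taken from the study data.

```python
def to_2018_usd_ppp(amount_local, cpi_cost_year, cpi_2018, ppp_factor_2018):
    """Convert a local-currency amount to 2018 international dollars.

    Assumed procedure (illustrative, not necessarily the authors' exact one):
      1. inflate the amount from its cost year to 2018 with a price index;
      2. divide by the 2018 PPP conversion factor
         (local currency units per international dollar).
    """
    if cpi_cost_year <= 0 or ppp_factor_2018 <= 0:
        raise ValueError("price index and PPP factor must be positive")
    amount_2018_local = amount_local * (cpi_2018 / cpi_cost_year)
    return amount_2018_local / ppp_factor_2018


# Hypothetical inputs: a WTP of 30,000 local units, a price index rising
# from 100 to 110, and a PPP factor of 1.2 local units per international dollar.
example_value = to_2018_usd_ppp(30000.0, 100.0, 110.0, 1.2)
print(round(example_value, 2))
```

Keeping inflation and currency conversion as two explicit steps makes it easy to substitute a different reference year or conversion factor.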

Theoretical framework for the meta-regression

For each study that calculated a WTP using a stated preference method, the value found was dependent on how the utility and willingness to pay were elicited and on the socioeconomic characteristics of the sample. Our theoretical framework is based on this evidence from the literature.

Let f be a function of WTPQ defined by two components s and r, where s represents the study characteristics and r represents the respondents’ characteristics. Therefore, f (wtpq) = f [(s, r)]. The study characteristics could be assessed as the study settings, for example, the utility and willingness-to-pay elicitation method, the mode of survey, the type of outcome presented in the scenarios, etc. The respondents’ characteristics include income, age, sex, marital status, education level, health status, area of residence, etc. Our assumption is that ex ante, the study characteristics have no influence on the respondents’ characteristics. The design of a study cannot influence the characteristics of the people for whom it is intended. However, when a study is designed for a specific sample, the authors of the survey consider the context of the respondents when choosing and adapting the scenario, bid values, administration mode (e.g., face to face, internet, telephone), etc. Therefore, it is possible that the respondents’ characteristics could have an influence on the study settings, even if we cannot measure the magnitude of this influence. We could then specify that f (wtpq) = f [s(r)]. On the other hand, ex post, the respondents, while eliciting their preferences, integrate the study’s characteristics (e.g., QoL gained, payment vehicle, duration, and severity of scenario) to maximize their utility at the best price (or cost). By doing so, the study’s characteristics, which are exogenous ex ante, become endogenous ex post. Therefore, the respondents integrate the study’s characteristics and their own characteristics to give their best response. We could then specify that f (wtpq) = f [(R(s), r)], where R corresponds to all the responses given to the WTP question by the respondents based on the study’s characteristics.

Because we performed a meta-regression in this study, we could not directly observe the bias in the respondents’ answers. However, we considered the fact that in each study, the authors were aware of these issues and tried their best to address them. Additionally, considering the great number of observations from the different types of studies, these biases would potentially be neutralized and thus allow us to perform our estimation.

We used ordinary least squares to estimate the coefficients. Due to potential heterogeneity and heteroscedasticity in the data, we used a robust estimator. Due to missing values in some of the respondents’ variables, we used official databases (Ministère de l’Économie et des Finances, France, 2020; [44, 47,48,49]) to fill in the missing values and obtain a complete database for the estimations.
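As an illustration of OLS with a heteroscedasticity-robust estimator, the following sketch fits a one-regressor log-log model and computes a White (HC0) standard error for the slope. The choice of HC0, the single regressor, the function name, and the toy data are all assumptions for illustration; the text does not state which robust estimator was used, and the actual model includes many more variables.

```python
import math


def ols_loglog_robust(x, y):
    """Simple OLS of ln(y) on ln(x) with a White (HC0) robust slope SE.

    Returns (intercept, slope, robust_se_of_slope). In a log-log model the
    slope reads as an elasticity (% change in y per 1% change in x).
    """
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need at least 3 paired observations")
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((a - mean_x) ** 2 for a in lx)
    if sxx == 0:
        raise ValueError("regressor has no variation")
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [b - (intercept + slope * a) for a, b in zip(lx, ly)]
    # HC0 sandwich variance for the slope: sum((x_i - xbar)^2 * e_i^2) / Sxx^2
    var_slope = sum(((a - mean_x) ** 2) * (e ** 2)
                    for a, e in zip(lx, residuals)) / sxx ** 2
    return intercept, slope, math.sqrt(var_slope)


# Toy data (hypothetical): income and WTP per QALY for five observations.
income = [12000.0, 20000.0, 31000.0, 45000.0, 60000.0]
wtpq = [9000.0, 15000.0, 17000.0, 26000.0, 27000.0]
a_hat, b_hat, se_b = ols_loglog_robust(income, wtpq)
print(a_hat, b_hat, se_b)
```

The robust standard error leaves the coefficient estimates unchanged and only alters the inference drawn from them.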

Regarding potential endogeneity issues, we think that the final value of the WTPQ observed has no impact on the studies’ or respondents’ characteristics. Consequently, a reverse causality (simultaneity) issue is unlikely. Regarding the omitted variables, we remain aware that it is not possible to have a regression model that includes all explanatory variables. In this model, we included all the variables we found relevant according to the literature and our goal. The third and last possible source of endogeneity is measurement error. To account for this problem, we reported the data from studies as they were given. Since this study was a meta-regression, we did not have access to the primary data. Therefore, if we suppose that the authors of each paper adequately reported the data from their studies, then measurement error should not be an issue. Some data were missing or not reported for certain respondents, especially for education, age, income, and employment status.

Results

Characteristics of studies included

The PRISMA flow diagram (Fig. 1) presents the selection process of the studies. In total, 9991 articles were identified and screened. Among them, 9894 articles were excluded, and 97 articles were fully read. We finally selected 39 studies and excluded 58 for various reasons. The reasons for excluding studies that were fully read were the following: the study did not provide a WTPQ value (n = 14); the study did not provide full information (n = 1); the study was a duplicate study (n = 7); the study did not use a stated preference method (n = 6); the study was not a WTPQ study (n = 20); the study was a literature review (n = 7); the study was an editorial (n = 1); the study was in Spanish (n = 1) or Japanese (n = 1).

Fig. 1 PRISMA flow diagram for the selection of studies, June 26, 2020

The year of publication ranged from 1998 to 2020, with 82% (n = 32) of the studies published since 2010 and 48% (n = 19) published since 2015 (Table 1). The studies cover 4 continents, namely, America (n = 5), Asia (n = 15), Europe (n = 18), and Oceania (n = 1), and 24 countries; two studies (n = 2) concerned several countries located on different continents [41, 50]. In the majority of the studies, the sample respondents were drawn from the general population (n = 27) or patients (n = 8), and four studies (n = 4) focused on both patients and non-patients [51,52,53,54]. The sample size varied from 40 to 5008, with a mean of 955 respondents. Various survey modes were used to interview respondents. One study combined internet and telephone surveys [53], one study combined internet and face-to-face surveys [55], and one study combined internet and mail surveys [56]. The most used survey methods were face to face (n = 23) and internet (n = 13). All studies used the contingent valuation (CV) method to assess the WTPQ. Double-bounded dichotomous choice (n = 8) and closed-ended (n = 8) were the most used WTP elicitation methods. Some studies used a combination of two CV methods, generally payment card, card sorting, payment scale, or single- or double-bounded dichotomous choice (DBDC), followed or not by an open-ended question. For the utility elicitation methods, most studies used a direct method (n = 17), seven studies used an indirect method, and others used a combination of direct and indirect methods (n = 15). Most of the common direct methods were used: standard gamble (SG) (n = 3), time trade-off (TTO) (n = 3), visual analog scale (VAS) (n = 5), and a combination of SG, TTO, VAS, and rating scale (RS) (n = 5). One study used a combination of TTO and discrete choice experiment (DCE) [57]. Indirect methods were also applied using multi-attribute utility instruments (MAUI), such as the EQ-5D, SF-6D, HUI, and QWB.
Among the seven studies that used only an indirect method, EQ-5D was used six times alone and one time in combination with SF-6D [54]. Among the fifteen studies that used a combination of direct and indirect methods, the most commonly used indirect method was EQ-5D (n = 12), and the most used direct method was VAS (n = 12).

Table 1 Characteristics of studies included

Improving the quality of life (QoL) was the most used outcome in the included studies (n = 12), while other studies focused on life extension (n = 5), and some used a combination of the two outcomes (i.e., improving QoL and life extension) (n = 6). Other studies used other outcomes, such as perfect health, income gain and lifesavings. The large majority of the studies used out of pocket as the payment vehicle (n = 36), two combined out of pocket and taxes [58, 59], and one used income variation [60]. More than 3 out of 4 studies used an individual perspective (n = 30); four used a patient perspective, another four used a societal perspective, and one study used both individual and societal perspectives [61]. The WTPQ values ranged from $992 [62] to $3,363,338 [56] (2018 US dollars). Overall, the included studies were of good quality, with quality scores of 73% and 63.7%, which were calculated using the NIH grid [45] and the Xie et al. checklist [46], respectively.

Descriptive results

The mean WTP per QALY (in USD PPP 2018) was $103,618.2 (sd 396,417.5) (Table 2). This value was inflated by the inclusion of three studies reporting very high values [56, 63, 64]. When the three studies were dropped, the mean value was estimated at $34,282.86 (sd 31,630.76). In all cases, the median WTP value was below $35,000 per QALY.

Table 2 Mean and median estimates of WTP per QALY (in USD 2018)

By looking at the WTPQ values by continent, we found that America and the Far East (China, Japan, Malaysia, Republic of Korea, Taiwan, Thailand, Vietnam) had almost the same value, while Oceania (Australia) had the highest value of WTPQ (Fig. 2). The Middle East (Iran, Kingdom of Saudi Arabia, Palestine) had the lowest value of WTP per QALY, which was almost one-third of the values of America, Europe and the Far East, and one-quarter of the value of Oceania. These differences could be explained by the methodological setup and the context of each study and country. Many reasons may explain the differences in the monetary value of a QALY, from the health utility measures used [65, 66] to methodological issues [67] and other reasons previously stated in the introduction section.

Fig. 2 Willingness to pay per QALY by continent (without the three studies)

Since the studies included in this review cover 24 countries, it seemed more appropriate to group countries by category. We chose geographical criteria instead of economic ones (e.g., low-, middle-, high-income country) because we think that when it comes to valuing non-market goods such as the environment, and health in this case, the social and cultural context and habits matter more than the economic context. The most appropriate way to capture these contexts and habits is the geographical location of the country.

Since it is important to empirically determine which variables are more susceptible to influencing the monetary value of a QALY, we conducted an econometric meta-regression of WTPQ based on the studies and samples’ characteristics.

Meta-regression data and econometric specification

From the 39 studies selected in the review process, we extracted 511 weighted aggregated observations since most studies offer more than one estimate of WTPQ. We excluded three studies [56, 63, 64] because they provided very high WTPQ values that are very unlikely to be considered by decision-makers and society for resource allocation. The first two studies were quite similar and used a variation in foodborne risk, while the other study used variation in health risk to elicit WTPQ. For the last study, the authors used a large variation in QALY gain applied to a road traffic injury that could be permanent. While the 36 other studies used variation in common health risk to elicit WTP, these three studies used somewhat particular methods that may lead to very high values. Table 3 shows the details of the variables used for the estimation. Variables were selected based on several criteria. First, they had to have a potential relationship with the WTP, as shown in the included studies (e.g., [37,38,39,40,41,42]). Second, they had to be available in the studies selected and included in this review. Finally, since this is a meta-regression, we used many fixed effects to control for potential multiple sources of heterogeneity across the studies.

Table 3 Description of variables used in the meta-regression

The general model for the meta-regression is specified as follows:

$${\text{ln WTPQ}}_{i} = R(X) + S(X) + u_{i},$$

where R(X) and S(X) are sets of variables for respondents and studies, respectively:

R(X) = {Income, Age, Female, Education, Unemployment}; and,

S(X) = {Continent, Country, Respondent, Survey mode, Payment vehicle, Perspective, Outcome, Utility EM, WTP EM}.

Based on the information found in the literature review and as stated in the theoretical framework, the predicted WTPQ based on respondents’ characteristics could be estimated as follows:

$${\text{ln WTPQ}}_{i} = a \, + \, b_{{{\text{INC}}}} \left( {{\text{ln INC}}_{i} } \right) \, + \, b_{{{\text{AGE}}}} {\text{AGE}}_{i} + \, b_{{{\text{FEM}}}} {\text{FEM}}_{i} + \, b_{{{\text{EDUC}}}} {\text{EDUC}}_{i} + \, b_{{{\text{UNEMP}}}} {\text{UNEMP}}_{i} + e_{i}$$
(1)

and the predicted WTPQ based on the study characteristics is as follows:

$${\text{ln WTPQ}}_{i} = a \, + \, b_{{{\text{CONT}}}} {\text{CONT}}_{i} + \, b_{{{\text{RESP}}}} {\text{RESP}}_{i} + \, b_{{{\text{SURV}}}} {\text{SURV}}_{i} + \, b_{{{\text{PAYM}}}} {\text{PAYM}}_{i} + \, b_{{{\text{PERSP}}}} {\text{PERSP}}_{i} + \, v_{i} .$$

Thus, the general model can be written as follows:

$${\text{ln WTPQ}}_{i} = a \, + \, b_{{{\text{INC}}}} \left( {{\text{ln INC}}_{i} } \right) \, + \, b_{{{\text{AGE}}}} {\text{AGE}}_{i} + \, b_{{{\text{FEM}}}} {\text{FEM}}_{i} + \, b_{{{\text{EDUC}}}} {\text{EDUC}}_{i} + \, b_{{{\text{UNEMP}}}} {\text{UNEMP}}_{i} + \, b_{{{\text{CONT}}}} {\text{CONT}}_{i} + \, b_{{{\text{RESP}}}} {\text{RESP}}_{i} + \, b_{{{\text{SURV}}}} {\text{SURV}}_{i} + \, b_{{{\text{PAYM}}}} {\text{PAYM}}_{i} + \, b_{{{\text{PERSP}}}} {\text{PERSP}}_{i} + \, u_{i}$$

where a is a constant and b represents the coefficients for the explanatory variables; the subscript “i” represents the observation. The dependent variable is the logarithm of the WTPQ in USD PPP 2018.

The variables Country, Outcome, Utility EM, and WTP EM were included as specific fixed effects.

For the respondents’ regression, we included three more variables from the study characteristics (i.e., continent, country, respondent). Being from a specific continent or country has an indirect effect on the standard of living and the health status, which could affect the willingness to pay for a QALY. Additionally, the fact that the respondent is a patient or not (general population) could influence the perception of a year spent in perfect health. Therefore, we think that these three variables are relevant for assessing how respondents’ characteristics can have an impact on their willingness to pay per QALY.

Estimation results

The results of the estimations are shown in Table 4. Columns 1 and 4 are the estimations based on the respondents’ characteristics, columns 2 and 5 are based on the studies’ characteristics, and columns 3 and 6 are the overall estimations, including all the characteristics. In the first regression (column 1), the American respondents have significantly lower WTPQ values than those of the Oceanian and Far East respondents. The difference between the American and European WTP values is very small. Patients have a greater WTP than the general population, but this difference was not statistically significant. A 1% increase in individual income was shown to lead to an increase of 0.6% in the WTPQ value; in contrast, a 1-year increase in the respondent’s age was shown to lead to a decrease of 3.3% in the WTPQ value. Female respondents tended to have a lower WTPQ than that of male respondents. The more the respondents were educated, the less they were willing to pay for a QALY. In contrast, unemployment tended to increase WTP.

Table 4 Estimation results

The second regression (column 2) shows that patients’ WTP is much higher than that of the general population. Compared to face-to-face interviews, surveys conducted by internet or telephone yield statistically significantly higher WTPQ values. This outcome may indicate that the proximity of the interviewer to the respondent affects the responses given. Studies using out of pocket and taxes as payment vehicles tend to have lower WTP than studies using a variation in the respondents’ income. Studies with a patient perspective have lower WTP values than those with an individual perspective; however, the coefficient for the societal perspective is not statistically significant.

Overall (column 3), we found that most of the explanatory variables have significant effects on the value of the WTPQ. The more people earn, the more they are willing to pay for a QALY. In contrast, educated respondents are less willing to pay. Internet and telephone surveys have a statistically significant positive effect on WTP values compared to face-to-face surveys. The design setup of a study, such as the mode of survey, the payment vehicle, and the perspective, also has a statistically significant effect on the final WTPQ value obtained.

Including all the studies in the regressions led to somewhat different results. It is also important to highlight the large coefficient variations in the regressions including all studies. The large gap between the mean WTPQ with and without these three studies could explain the large variations in the estimated coefficients.

From the regressions, we predicted the mean WTPQ value based on the estimated coefficients (Table 5).

Table 5 Mean WTPQ predicted

The predicted mean value was obtained by retransforming the predicted value of the (log) dependent variable with the exponential function:

$$\widehat{y} =\mathrm{ exp}\left(\frac{{\widehat{\sigma }}^{2}}{2}\right)\mathrm{exp}\left(\widehat{\log y}\right),$$

where \({\widehat{\sigma }}^{2}\) is the unbiased estimator of \({\sigma }^{2}\), the variance of the error terms.
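The retransformation formula can be sketched in a few lines. This is a minimal illustration that assumes the unbiased variance estimator is the sum of squared residuals divided by n minus the number of estimated parameters; the function name and inputs are hypothetical.

```python
import math


def retransform_log_prediction(log_pred, residuals, n_params):
    """Return y_hat = exp(sigma2_hat / 2) * exp(log_pred).

    sigma2_hat is taken as SSR / (n - k), the usual unbiased estimator of
    the error variance in a linear regression with k estimated parameters.
    """
    n = len(residuals)
    if n <= n_params:
        raise ValueError("need more observations than parameters")
    sigma2_hat = sum(e * e for e in residuals) / (n - n_params)
    return math.exp(sigma2_hat / 2.0) * math.exp(log_pred)


# Hypothetical log-scale prediction and regression residuals.
y_hat = retransform_log_prediction(10.5, [0.4, -0.3, 0.1, -0.2, 0.0], 2)
print(y_hat)
```

Taking the exponential alone would give a prediction closer to the median than to the mean when the errors are approximately normal on the log scale; the exp(σ̂²/2) factor corrects for this.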

The predicted value for a QALY, namely the monetary value estimated through the studies’ and respondents’ characteristics, was estimated at USD $52,619.39 when excluding the three problematic studies (model 3 in Table 4). The respondents’ and studies’ estimates show practically the same value of WTPQ. When including all the studies, there is a gap of more than USD $40,000; the overall estimate of the WTPQ in this case is USD $95,901.69.

Predicted value of a QALY for Quebec

We used the estimated coefficients from Eq. (1) to calculate a theoretical value of a QALY for Quebec. To do so, we used the average characteristics of the population of Quebec in 2018 taken from the website of Statistics Quebec (Statistics Quebec). The mean age was 49.45 years for the adult population (i.e., those more than 18 years old), the total population was approximately 50.13% women, and the average income was estimated at CAD $52,874 (USD $40,672). The mean years of schooling was estimated at 10.67, and 5.5% of the active population was unemployed. Hence, we multiplied those values by their corresponding coefficients in model 3 (Table 4) to predict the final value of a QALY for Quebec.

We obtained a predicted value of USD $98,450 (CAD $127,985) per QALY. This value seems plausible since, compared to the sample, Quebec had almost the same respondents’ characteristics (education and proportion of women) but a lower unemployment rate and a higher age and average income (Table 3).
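The prediction step can be sketched as a linear predictor on the log scale followed by retransformation. The Quebec characteristics below are those reported in the text; the coefficient values are placeholders chosen only to make the code run and are not the estimates in Table 4. The coding of the female and unemployment variables as shares, the omission of study-level terms, and the value of the error variance are further assumptions.

```python
import math

# Average 2018 characteristics of the Quebec population, as reported in the text.
QUEBEC_2018 = {
    "ln_income": math.log(40672.0),  # average income in USD
    "age": 49.45,                    # mean age of adults, years
    "female": 0.5013,                # share of women (coding assumed)
    "education": 10.67,              # mean years of schooling
    "unemployment": 0.055,           # unemployment rate (coding assumed)
}

# PLACEHOLDER coefficients: hypothetical values, NOT the paper's estimates.
PLACEHOLDER_COEFS = {
    "const": 8.0,
    "ln_income": 0.5,
    "age": -0.03,
    "female": -0.5,
    "education": -0.05,
    "unemployment": 1.0,
}


def predict_ln_wtpq(coefs, x):
    """Linear predictor: a + sum of b_k * x_k over the respondent variables."""
    return coefs["const"] + sum(coefs[name] * value for name, value in x.items())


def predict_wtpq(coefs, x, sigma2_hat):
    """Retransform the log prediction: exp(sigma2_hat / 2) * exp(ln WTPQ)."""
    return math.exp(sigma2_hat / 2.0) * math.exp(predict_ln_wtpq(coefs, x))


# sigma2_hat = 0.8 is a hypothetical error variance used for illustration.
print(predict_wtpq(PLACEHOLDER_COEFS, QUEBEC_2018, 0.8))
```

With the actual coefficients and error variance from the estimated model substituted for the placeholders, the same two functions would reproduce the calculation described above.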

Discussion

Through this review, we aimed to synthesize the large literature on the monetary value of a QALY to provide global insights on this topic and to update the previous reviews. Since, to the best of our knowledge, there is no previous empirical study that elicited the WTP for a QALY in Quebec, this review made the first attempt at estimating this value based on the empirical evidence. We initially aimed to review WTP and WTA in this article, but it was impossible to find studies that elicited the preferences of people regarding accepting payment to take a risk for their health. Therefore, we focused on WTP studies only.

The selection of only studies containing stated preferences for this review is supported by the fact that the WTP values elicited through these methods are the most likely to represent population preferences. Our methodology for searching studies and reporting results was consistent with the PRISMA guidelines [68]. We found many differences in the studies selected. Studies across countries used different utility and health state elicitation methods, with the most preferred being EQ-5D, VAS and TTO. The way in which people were exposed to the WTP question also varied largely across studies. From these findings, we derived two possible ways to predict the WTPQ using the respondents’ and/or the studies’ characteristics. Our results in the “respondents” model confirmed a statistically significant effect of income, education, age, and sex on the WTPQ. This result confirmed the previous results found by Van Houtven et al. [36]. The signs of the coefficients were as expected for income and age, but we cannot draw conclusions about the signs of the coefficients associated with sex, education, and unemployment. However, we found that women had lower WTP values than men. This finding needs to be confirmed by further empirical studies. Our results also showed that the respondents tend to report a higher WTPQ value when they are not near the interviewer. This result is contrary to the “social desirability bias” stated by Ethier et al. [69]. “Social desirability bias occurs when individuals provide different responses in the presence of an interviewer in an attempt to appear more socially acceptable”. In our study, the respondents from internet or telephone surveys had higher WTPQs than did those from face-to-face surveys. The robustness estimation made using standard errors clustered by study showed that our results remain consistent (Table 8 in Appendix).
Furthermore, when removing all fixed effects from the regressions, the predicted WTP per QALY gave values close to the main results, which strengthens our findings (Table 6).

Table 6 Mean WTPQ predicted (without fixed effects)

The prediction based on the Quebec population’s characteristics suggests that the value of a QALY could be USD $98,450 (CAD $127,985), which is more than twice Quebec’s GDP per capita in 2018. This value is consistent with the findings of Nimdet et al. (2015), who reported that “the ratio between WTP per QALY and GDP per capita varied widely from 0.05 to 5.40”, depending on the study settings. Furthermore, this value is consistent with those suggested by other authors [7, 8, 10]. Importantly, the result presented in this review relies on our methodology and on the characteristics of the included studies and their econometric estimation methods. To the best of our knowledge, no study has been conducted in Quebec to assess the value of a QALY. Our meta-analysis thus provides a broad overview of what this value could be based on what is already known: the literature and the characteristics of the Quebec population.
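
As a quick arithmetic check on the figures reported above, the following Python sketch derives the exchange rate implied by the two amounts and the upper bound on GDP per capita implied by the statement that the predicted value exceeds twice that figure. The variable names are ours, and the bound is only what the reported numbers imply; the actual 2018 GDP per capita of Quebec is not restated in this section.

```python
# Figures reported in the text for the predicted value of a QALY in Quebec.
wtpq_usd = 98_450
wtpq_cad = 127_985

# Exchange rate implied by the two reported amounts (CAD per USD).
implied_cad_per_usd = wtpq_cad / wtpq_usd

# "More than twice GDP per capita" implies that GDP per capita
# is below half of the predicted WTP per QALY.
gdp_per_capita_upper_bound_cad = wtpq_cad / 2

print(f"Implied exchange rate: {implied_cad_per_usd:.2f} CAD per USD")
print(f"Implied GDP per capita: below CAD {gdp_per_capita_upper_bound_cad:,.2f}")
```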

The results obtained from the econometric regressions showed some consistency with the previous literature on WTP. Respondents with higher income tend to be willing to pay more, while the payment amount has a negative effect on the willingness to pay [70,71,72]. While WTP decreases with age, another study did not find an association with education and occupation [71]. These authors also found that self-payment (out-of-pocket) tends to be rejected by individuals and lowers the WTP. Finally, although we did not have these variables in our estimates, it is worth noting that some studies found pain and risk reduction to be associated with higher willingness to pay [73, 74].

Because the studies included in this review come from different settings and countries, it is also important, when interpreting the results, to remember that all these hypothetical scenarios were implemented in countries that, for the most part, have health insurance plans. The included studies come from different types of healthcare systems, which are generally classified into four categories: the Beveridge model, the Bismarck model, the national health insurance model, and the out-of-pocket or private insurance model [75]. We do not know how this may affect our results, since mixed models may coexist; this could be investigated in future research. Indeed, each country has adapted a general model to its own needs. For example, the United States and China have an out-of-pocket healthcare system, but in the US part of the population is covered by a national health insurance system (Medicare), another part by a Beveridge system (for veterans), and another by a Bismarck system (employer-based health care). The Bismarck system is widely used in Germany, Japan, the Netherlands, and France. The United Kingdom and Spain use a Beveridge system, and the national health insurance system is used by Canada and South Korea [75, 76].

During this review, we selected studies that met our inclusion criteria and also gave us as many details as possible about their design and respondents’ characteristics. Including more studies than previous systematic reviews [26, 35] gave us more observations and allowed us to obtain more precise regression results. Including variables such as the utility elicitation method, the willingness to pay elicitation method, the outcome, and the survey mode also improved the precision of our estimates. There was no significant difference between the willingness to pay value obtained for Quebec and those obtained from the studies included in this paper. The quality of the studies included in this review was acceptable, with mean scores above 64%. The three studies excluded were from Sweden [56, 64] and the United States [63]. That America became the region with the highest QALY value after their inclusion may be due to the higher weight given to the United States than to Sweden. These three studies were excluded because the scenarios they used were very different, which may have led to very high QALY values ranging from USD $900,000 [64] to more than USD $3.3 million [56]. Two of them used a foodborne risk to assess the value of a QALY, while one used a road accident injury.

Overall, combining all these studies to estimate the monetary value of a QALY raises some concerns, especially the lack of some variables that could have helped to explain the WTPQ more precisely. The underreporting of these variables could bias the estimates. In addition, having health status measured with different tools is a key concern in this review, but we addressed this issue using fixed effects to capture this heterogeneity. Furthermore, the perspective of the study, whether societal or individual, could have an impact on the WTPQ. Finally, the way in which the respondents were asked to pay (i.e., the payment vehicle) may have affected the results. Indeed, paying out of pocket or through taxes would have some effect on the WTP [77]. The duration of the scenario is also a key point. Scenarios covering different possibilities, such as improving QoL, extending life or simply saving a life, could lead to varied effects on the WTP. It is, of course, hardly possible to consider all these details in a review where the data are limited by what previous authors reported in their primary studies. Our aim herein was to use as many variables as possible without constraining the estimation sample.

Even though the QALY values were converted using purchasing power parity, they still vary widely, probably due to country context. Countries with high incomes tend to have high QALY values, while low-to-middle income countries have lower values. The prediction we obtained for Quebec could be used as evidence of this. Excluding the three studies with extremely high QALY values [56, 63, 64] did not affect the general findings of this systematic review; Ryen and Svensson (2015) likewise excluded one of these studies. Including these studies merely raised the mean estimate without calling into question the overall results. The precision of the results outlined herein depends on the precision of the primary data reported by each included study, and this should be kept in mind when interpreting the results. It is also worth recalling that, as stated by Van Houtven et al. [36], all the estimated WTP values in the included studies were for adult respondents. No study has focused on children or teenagers.
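
To illustrate what a purchasing power parity conversion involves, a minimal Python sketch follows. The country labels, PPP factors, and WTP amounts are hypothetical placeholders, not values from the included studies; in practice, conversion factors would be taken from a published source such as the World Bank or the OECD, and the function name `to_ppp_usd` is ours.

```python
# Hypothetical PPP conversion factors: local currency units per international dollar.
# These numbers are placeholders for illustration only.
PPP_FACTORS = {"CountryA": 1.25, "CountryB": 12.0}


def to_ppp_usd(amount_local: float, country: str) -> float:
    """Convert a local-currency amount into PPP-adjusted (international) dollars."""
    return amount_local / PPP_FACTORS[country]


# Hypothetical WTP per QALY values reported in local currency.
reported = {"CountryA": 50_000, "CountryB": 240_000}
converted = {country: to_ppp_usd(value, country) for country, value in reported.items()}
print(converted)
```

In this made-up example the two converted values still differ by a factor of two, which mirrors the point that PPP adjustment does not remove differences tied to country context.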

Given all this heterogeneity in the data, it may not be possible to use this model as a standard for meta-regressions of QALY studies. However, the model provided insight into how the monetary value of a QALY can be modelled and predicted based on evidence from studies. The model would be much more efficient with homogeneous data samples.

Finally, we believe that this review, despite the weaknesses highlighted above, could be useful in many ways. First, it provides a new and updated systematic review and a meta-regression using stated preference studies. Including more studies and more observations in this review allowed us to obtain more efficient and precise estimates. Second, in a context where decision-makers need up-to-date information to make the best decisions possible, it would be useful for them to have an overview of the current literature on the WTPQ. Third, a country that does not yet have a monetary value for a QALY could predict a theoretical value using the estimated coefficients, as we did for Quebec. Even if this estimate would not be totally accurate, it could provide a first idea based on the population characteristics.
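
The idea of predicting a value for a new country from estimated coefficients can be sketched as follows. This is a minimal illustration that assumes a log-linear specification; the coefficients, variable names, and population characteristics are hypothetical placeholders and are NOT the estimates reported in this review. Exponentiating the linear prediction also ignores any retransformation correction, a simplification made here for brevity.

```python
import math

# Hypothetical coefficients of a log-linear model: ln(WTPQ) = b0 + sum(b_k * x_k).
# Placeholders only; they are not the estimates reported in this review.
coefficients = {
    "intercept": 8.0,
    "log_income": 0.3,
    "mean_age": -0.01,
    "share_female": -0.2,
}

# Hypothetical population characteristics for the target country or region.
population = {"log_income": math.log(40_000), "mean_age": 42.0, "share_female": 0.5}


def predict_wtpq(coefs: dict, x: dict) -> float:
    """Return the predicted WTP per QALY implied by the log-linear specification."""
    linear = coefs["intercept"] + sum(coefs[name] * value for name, value in x.items())
    return math.exp(linear)


print(round(predict_wtpq(coefficients, population)))
```

Swapping in the characteristics of another population changes only the `population` dictionary, which is what makes this kind of transfer prediction convenient, subject to the accuracy caveat noted above.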

Conclusion

Knowing the monetary value that a population assigns to one QALY is very useful information for health authorities. This information allows them to make effective decisions when choosing between different health technologies, treatments, or medicines. In this article, we reviewed the literature on the WTP for a QALY. Based on the included studies, we calculated a mean estimate of the value of a QALY by continent. We performed regressions using a generalized least squares method with fixed effects to control for heterogeneity in the data. The results confirmed a strong and significant relationship between socioeconomic variables, study characteristic variables, and the WTP for a QALY. The value predicted for Quebec was consistent with the values found in the literature. Although this study may have some weaknesses, we are confident that it is methodologically rigorous and could help to estimate the monetary value of a QALY elsewhere.