Key Points for Decision Makers

The overall WTP value for one QALY in this study was estimated to be less than the threshold proposed by the WHO for developing countries (0.57 GDP per capita versus 1–3 GDP per capita).

Findings also support the observation that the general public places greater value on life-extending QALYs than life-enhancing QALYs and justifies additional weighting for the use of life-extending QALYs in cost-effectiveness analyses.

The value attached to a QALY is commonly used by health technology assessment (HTA) agencies across many countries as a measure of healthcare effectiveness. An estimation of this threshold can pave the way for evidence-informed policy making on HTA in Iran.

1 Introduction

Healthcare systems in developing countries are facing serious financing problems given the increase in medical needs of populations, the progress achieved in medical interventions, and the scarcity of resources from governmental and non-governmental agencies. Every healthcare system implements its own methodologies in assessing value for money, which serves as a threshold of cost-effectiveness analysis and as a decision aid in the financing of new and innovative healthcare technologies. Three methods of determining a cost-effectiveness threshold have been suggested [1]. The first is to infer a threshold on the basis of previous decisions taken by leading institutions. The drawbacks of this procedure are that it is susceptible to inconsistencies and that the threshold may change over time [2]. The second method is to establish a threshold that exhausts an exogenously determined budget, so that a newly conceived intervention is more efficient per unit than the least efficient previously funded intervention [3]. The third method involves determining a threshold in accordance with the optimal healthcare budget, but this approach has its own informational limitations. It can be performed in three different ways. First, some studies [1] use the value of a statistical life employed in other areas of public health [4]. Such studies are concerned with avoiding small risks of immediate death and involve analyses of fatal situations, such as accidents or fire [2,3,4,5]. However, health is the primary objective of the health sector and is not regarded as a key responsibility of other public sectors; thus, separating the health component from the activities of those sectors is difficult [1]. Second, a cost-effectiveness threshold can be set equal to GDP per capita [2], a value that is commonly applied in developing countries. Under this approach, society would have to devote all its wealth to healthcare, which makes this value a common-sense threshold rather than an evidence-based threshold. Finally, a cost-effectiveness threshold can be set by determining the monetary valuation of a gain in quality-adjusted life-years (QALYs) by using the willingness to pay (WTP) of a representative sample of society [6, 7]. Given the ultimate goals of a health system (improvement of health) and QALY gains (avoidance of losses), a desirable approach is to estimate the monetary value that society places on additional QALYs. This approach, however, has been criticized for certain limitations. The main critique is that the health system budget is mainly determined by parliament, so a society-determined value of an additional QALY may not be practical; researchers have instead suggested estimating the value of health gains through a political process [1, 5]. A recent argument, however, is that economists should combine both approaches, with scholars stating that QALYs can appropriately quantify health gains [8]. The use of the WTP of society to estimate the value of a QALY is more prevalent in developed countries [7,8,9,10,11].

The value attached to a QALY is commonly used by health technology assessment (HTA) agencies across many countries as a measure of healthcare effectiveness [9]. HTA agencies can recommend the adoption of a technology on the basis of the affordability and sustainability of the intervention in publicly funded healthcare systems. Most recommendations are formulated with reference to the QALY gain relative to the cost incurred by the intervention. Methods of estimating cost per QALY have been used in developed countries [9]. However, in cases characterized by insufficient evidence regarding the QALY value attached by a population (e.g., in developing countries) or the QALY value obtained from other information sources, comparing the cost per QALY gained with a specific threshold or range of values is recommended. For example, in a developing country, a new therapeutic intervention may be recommended for adoption if its cost per QALY gained falls below 1–3 times the GDP per capita and rejected if its cost exceeds that range (http://www.who.int/choice/en/). The problem is that such thresholds are essentially arbitrary, with little or no economic foundation. The determination of cost-effectiveness thresholds has become a critical policy issue, as reflected in the growing interest of some countries, especially developed ones, in the design of reliable methods for determining QALY values.

From 2009 to 2010, nine European countries and Palestine endeavored to determine the monetary value of a QALY in a research project called the European Value of a QALY (EuroVaQ). The study was conducted by a group of experienced health economists and funded by the Commission of the European Union under the Sixth Framework Programme. The main methodology consisted of online surveys, administered to approximately 4000 individuals in each of the participating countries in late 2009 and early 2010 [7, 10, 11]. The EuroVaQ study used two online survey approaches, namely, the chained and direct approaches. In the chained method, standard gamble or time tradeoff questions were used to determine WTP for a QALY with respect to different health gains based on EQ-5D health states. In the direct method, a QALY gain was ascertained through illustrations of health status on a visual analog scale called the “health thermometer”. The respondents were asked to rate their health status on a scale of 0 (“death”) to 100 (“full health”). Then, the health gains and losses that last for a certain number of years were illustrated as rectangles in a diagram, in which time (according to respondent age) was measured along the horizontal axis and health (according to health status determined via the health thermometer) along the vertical axis. Finally, the respondents were asked to indicate what they were willing to pay for the avoidance of health losses or the achievement of health gains, which were expressed as a predefined number of points on the scale lasting for a given number of years. The results of the EuroVaQ study, particularly the direct approach, showed that the threshold stated by the respondents was lower than the predefined threshold used in most countries [10].

The direct method adopted by the EuroVaQ project appears to be applicable in developing countries, which suffer from informational limitations in their assessments of a cost-effectiveness threshold. This approach yields the valuation of only one QALY and avoids the use of complex scenarios. It also uses specific illustrations for gains and losses in QALY on the basis of respondent age and health status. However, the EuroVaQ’s online survey method for data collection seems to have limited applicability and acceptability in the current settings of developing countries [11].

In the Eastern Mediterranean Region, Iran is one of the first countries that implemented an HTA system (initiated in 2007); since 2010, an HTA department has been operating in Iran under the supervision of the Health Technology Assessment, Standardization, and Tariffs Office of the Deputy of Curative Affairs [12]. Despite the progress achieved by the agency, however, it has also been confronted with challenges, primary among which is the lack of an evidence-based approach to HTA [12,13,14]. As previously stated, the value attached to a QALY is used as a measure of healthcare effectiveness. An estimation of this threshold can pave the way for evidence-informed policy making on HTA in Iran. With this issue in mind, we conducted a cross-sectional study to ascertain the WTP of a representative sample of the Shiraz general public for one QALY gained. To this end, the survey questionnaire employed in the EuroVaQ project, with some modifications, was used in face-to-face interviews [11].

2 Materials and Methods

2.1 Setting and Sample

Shiraz is the center of a large province in southwest Iran and has a population of 1,869,000. For the selection of 600 study participants, people older than 18 years who were willing to participate were included in the study. Non-residential premises such as hotels, stores, agencies, companies, and other places that are not residential homes were excluded from the study.

Postal codes were obtained from the Shiraz Central Post Office, after which multistage random cluster sampling was used to select clusters corresponding with the population in each postal area. We chose a total of 57 head clusters, each having 14 postal codes spaced ten houses apart. Thus, we acquired 798 respondents (57 × 14), among whom 82 “protestors” (i.e., respondents who indicated that payment for treatment was the responsibility of the government; the unwillingness of these respondents to pay seemed unreasonable) were excluded. This left us with a count of 716 individuals. The final sample comprised 651 individuals, which is equivalent to a response rate of 91%. Questionnaires were initially administered through computer-assisted personal interviews with the 798 individuals in January and February 2017. Similar to the EuroVaQ study, the current research applied the direct survey method [12].
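The sample flow described above can be checked with a few lines of arithmetic. This is only a sketch: the variable names are illustrative and do not come from the study software, and the 91% figure is read here as the final sample divided by the count remaining after protestors were excluded, which is an assumption about how the rate was computed.

```python
# Sample-flow arithmetic using the counts reported in the text.
# Variable names are illustrative; they do not come from the study software.
head_clusters = 57
postal_codes_per_cluster = 14
protestors = 82
final_sample = 651

approached = head_clusters * postal_codes_per_cluster  # respondents interviewed
after_protest_exclusion = approached - protestors      # kept after excluding protestors

# Assumption: the reported response rate is final sample / post-exclusion count.
response_rate = final_sample / after_protest_exclusion

print(approached, after_protest_exclusion, f"{response_rate:.0%}")
```

Under this reading, the printed values reproduce the 798, 716, and 91% stated in the paragraph.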

Approval was obtained from the Ethics Committee of the Shiraz University of Medical Sciences (approval code: IR.SUMS.REC.1395.S403).

2.2 Data Collection

In each of the randomly selected homes, the aim of the research was introduced, after which initial information about family members was obtained. In a family with more than one member older than 18 years, one of these members was selected randomly using a software program developed specifically for this study and installed on a personal digital assistant (PDA). The questionnaires were then completed on the PDA by trained interviewers. When the criteria for using the gray block questionnaire were met, the software on the PDA switched to the gray questionnaire automatically.

2.3 Questionnaire Design

In the direct approach, the cost that a respondent was willing to pay for a health gain was determined by presenting a descriptive instance. Most of the values in the questions reflected a 1-year calculation, thus eliminating the need to multiply annual WTP values.

In a pilot study, we found that the contents of three questionnaires, including those on risk, were phrased in a way that prevented full comprehension [13]. Because a simple and understandable questionnaire increases the validity of a survey, we excluded the three problematic ones, leaving a final number of ten [12]. The number of ways in which the ten questionnaires can be combined into different versions is excessive. Such instruments should therefore be combined in a way that supports reliable conclusions; some versions arising from questionnaire combination may fail to elicit suitable answers and may even confuse respondents. In the EuroVaQ project, four versions were developed on the basis of diverse combinations of questionnaires. In the present study, our decision to conduct interviewer-facilitated sessions (instead of independently completed online questionnaires) highlighted the need to eliminate the possibility of boredom and prevent the repetition of questions. A triadic combination of the questionnaires that are the most similar to the EuroVaQ instruments was used. The layout of the different questionnaires used in the current research is described in Table 1.

Table 1 Layout of questionnaire versions

At the beginning of questionnaire completion, a respondent expressed his/her hopes regarding life expectancy and described his/her current health status using the health thermometer. Then, different versions of the questionnaire, based on instruments similar to the EuroVaQ questionnaires, were processed using the developed software and shown to the respondents to enable them to comprehend the questions correctly. Through this approach, we generated a diagram similar to the model in the EuroVaQ questionnaires; the diagram shows the “year” on the horizontal axis and the “health loss” rate (0–100) on the vertical axis [12]. “Year” reflected the duration of health gains and losses (in years) (Graph 1 in the ESM Appendices).

All the questionnaires had the same core structure: a scenario was presented (corresponding to the questionnaire versions), and a respondent was asked whether he/she was willing to pay money to avoid disease occurrence or increase survival. If the answer was “yes,” the respondent was asked to state the amount that corresponded to his/her WTP in Rials. The values were then converted into US dollars on the basis of the exchange rate during the study period in 2017 (US$1 = 32,381 Rials) provided by the Central Bank of Iran (http://www.cbi.ir/ExRates/rates_en.aspx).
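The currency conversion is a single division by the reported exchange rate. The sketch below uses that rate; the function name and the example amount are hypothetical and serve only to illustrate the step.

```python
# Convert a stated WTP from Iranian Rials to US dollars.
# The rate is the one reported in the text for the 2017 study period.
RIALS_PER_USD = 32381

def rials_to_usd(amount_rials: float) -> float:
    """Return the US dollar equivalent of an amount stated in Rials."""
    return amount_rials / RIALS_PER_USD

# Hypothetical stated WTP of 100 million Rials (not a value from the study).
example_wtp_rials = 100_000_000
print(round(rials_to_usd(example_wtp_rials), 2))
```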

If the answer was “no,” the respondent was asked to explain his/her reason for this response. A set of predefined answers were offered to the respondent, but he/she was also provided the option to provide other explanations on a field where the justification could be noted down. Different responses were incorporated by the trained interviewers.

In the versions that combined questionnaires (A, B), (I, J), and (F, G)—the questionnaires are described in Table 1—upon expression of WTP, a respondent was asked to explain why the amount that he/she is willing to pay for one questionnaire is greater, lower, or the same as the other one.

A questionnaire was evaluated as completed by the software program, at the time of questioning in the home, on the basis of criteria regarding the answers provided; the completion criteria were used to reduce errors. We ensured the relevance of the questions to the respondents as follows: if the current health status was less than 20 units or if hope regarding life expectancy was below 2 years, the session was only partially completed and the next respondent in the same family was randomly selected. The gray block version was used in cases wherein two or more questionnaires in each combined version were unanswerable and were discontinued on the basis of the completion criteria. Note, as well, that the responses obtained through the unanswerable questionnaire version were considered inapplicable. Accordingly, the sessions switched from software-assisted sessions to gray block questionnaire sessions to reduce errors from the trained interviewers.

We sought to ensure that all health gains occurred at least a year before the end of an individual’s life expectancy and that health losses did not cause individual health status to fall below 10 points.
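The relevance and completion rules described in this section can be written as simple predicates. This is a sketch under stated assumptions, not the study software: the function and parameter names are hypothetical, life expectancy is treated as expected remaining years of life, and a gain is taken to satisfy the one-year rule when it ends at least one year before that horizon.

```python
# Relevance checks paraphrased from the text; thresholds are the reported ones.
# Function and parameter names are hypothetical, not the study software's API.
MIN_HEALTH_STATUS = 20      # health thermometer units (0-100)
MIN_REMAINING_YEARS = 2     # assumption: expected remaining years of life
MIN_HEALTH_AFTER_LOSS = 10  # health status must not fall below this level

def respondent_is_eligible(health_status: float, remaining_years: float) -> bool:
    """True if the scenarios are considered relevant to the respondent."""
    return (health_status >= MIN_HEALTH_STATUS
            and remaining_years >= MIN_REMAINING_YEARS)

def loss_is_presentable(health_status: float, loss_points: float) -> bool:
    """True if the health loss keeps the respondent at or above 10 points."""
    return health_status - loss_points >= MIN_HEALTH_AFTER_LOSS

def gain_is_presentable(gain_end_year: float, remaining_years: float) -> bool:
    """True if the gain ends at least one year before expected end of life."""
    return gain_end_year <= remaining_years - 1

# Hypothetical respondent: health status 60, expecting 30 more years of life.
print(respondent_is_eligible(60, 30), loss_is_presentable(60, 25))
```

A respondent failing `respondent_is_eligible` corresponds to the partially completed session described above, after which another household member is selected.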

The interviewers were invited to participate in the study and took part in an explanatory meeting regarding the work process and the goals of the research. Thereafter, they were fully trained through a four-session program over 2 weeks before the initiation of the research. Meetings with the trained interviewers were held at least once a week to review the problems that they encountered and implement necessary changes with the cooperation of the team members. After the data collection was initiated, the supervision of the trained interviewers and the accuracy with which they completed their work were reviewed every day. In addition to the local supervision, some of the respondents were called and asked about the completion of the questionnaires.

2.4 Software Development

As previously indicated, proprietary software designed to advance the study’s aims and method was used. The design of the software program was essential given that the questions in each succeeding section of the questionnaires were to be presented on the basis of the responses to the first section of the instruments and that different kinds of health losses were to be visually presented. The software program was installed on PDAs and was tested before the start of the study. During the survey sessions, the PDAs were connected to the Internet. The devices could also be connected to a GPS, thereby enabling us to monitor a trained interviewer’s location on a map of the city up to the entrance of a residence, as well as the online transmission and completion of the questionnaires.

2.5 Data Analysis

2.5.1 Statistical Methods

All statistical analyses were performed in STATA 12.0, and P < 0.05 was considered statistically significant. Descriptive features are presented as n (%) and mean ± SD. Tobit regression was used to assess factors contributing to the stated value for one QALY. Using traditional regression methods to assess the factors that affect WTP would have led to biased estimates of coefficients because the value of the dependent variable was zero for some of the respondents. To handle the large number of zero values and obtain consistent estimates, we used Tobit regression (Tobin, 1958), which can be used to estimate linear relationships when a response is censored to the left or right.
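To make the censoring idea concrete, the following sketch evaluates the log-likelihood of a Tobit model that is left-censored at zero, using only the Python standard library. It is not the authors’ analysis, which was run in STATA; the function names are hypothetical, and normally distributed latent errors with censoring exactly at zero are assumptions of this illustration.

```python
import math

def normal_pdf(z: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def tobit_loglik(y, X, beta, sigma):
    """Log-likelihood of a Tobit model left-censored at zero.

    y: observed WTP values (zeros are treated as censored)
    X: covariate rows, each the same length as beta
    beta: coefficients; sigma: positive scale of the latent error
    """
    total = 0.0
    for y_i, x_i in zip(y, X):
        xb = sum(b * x for b, x in zip(beta, x_i))
        if y_i <= 0:
            # Censored observation: probability that latent WTP is <= 0.
            total += math.log(normal_cdf(-xb / sigma))
        else:
            # Uncensored observation: normal density of the residual.
            total += math.log(normal_pdf((y_i - xb) / sigma) / sigma)
    return total

# Hypothetical data: one zero-WTP respondent and one paying respondent,
# with an intercept-only design matrix.
print(tobit_loglik([0.0, 1.0], [[1.0], [1.0]], beta=[0.5], sigma=1.0))
```

Maximizing this function over `beta` and `sigma` gives the Tobit estimates; ordinary least squares ignores the first branch, which is the source of the bias mentioned above.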

Because no appropriate univariate test for Tobit regression is available, unadjusted Tobit regression was performed for each of the variables as a means of univariate testing, and variables with P < 0.2 were entered in the multiple Tobit regression.

The overall mean value of one QALY was compared with the most recently calculated GDP per capita of Iran (US$4957), with 2015 as the reference year.

2.5.2 Sensitivity Analysis

To deal with the possibility that respondents give a very high WTP value, the top 1% of WTP responses for each scenario were omitted in the sensitivity analysis (the trimmed WTP value). The means and medians of WTP values (both trimmed and untrimmed) were reported after the exclusion of all respondents who chose not to pay for health gains (zero WTP).
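A minimal sketch of the trimming step for one scenario is given below. The order of operations (zero responses removed first, then the top 1% of the remaining values) and the rounding-up of the number of trimmed responses are assumptions of this illustration, and the data are hypothetical.

```python
import math
import statistics

def trimmed_positive_wtp(values, top_percent=1):
    """Drop zero WTP responses, then drop the highest `top_percent` of the rest.

    Assumption: the number of trimmed responses is rounded up, so at least one
    response is removed whenever any positive response exists.
    """
    positive = sorted(v for v in values if v > 0)
    n_drop = math.ceil(len(positive) * top_percent / 100)
    return positive[: len(positive) - n_drop]

# Hypothetical responses for one scenario (not study data): 5 zeros, then 1..100.
responses = [0] * 5 + list(range(1, 101))
kept = trimmed_positive_wtp(responses)
print(statistics.mean(kept), statistics.median(kept))
```

Because WTP distributions are right-skewed, removing even a single extreme response can shift the mean noticeably while leaving the median almost unchanged.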

3 Results

3.1 Descriptive Population Characteristics

The mean age ± SD of the participants was 43.9 ± 16.3 years, and their mean utility value ± SD was 0.79 ± 1.9, as calculated for the Iranian population using the EQ-5D questionnaire [14]. The mean family size was 3.7 ± 1.4. The characteristics of the participants are presented in Tables 2 and 3.

Table 2 Age and sex of participants
Table 3 Participants’ characteristics

3.2 Zero WTP

The participants who were unwilling to pay (Zero WTP) were asked to explain the reasons for their answer. The distribution of reasons for Zero WTP showed that in almost all scenarios, more than half of respondents “could not afford” the cost of the proposed treatment, and in the I (LEnd) and L (Terminal illness) scenarios, more than 40% stated “It is my faith” (Online Resource 1). The participants who stated that the government should pay for treatment (protestors) were excluded from the analyses because the unwillingness of these respondents to pay seemed unreasonable. The number and percentage of unwilling participants and protestors for each questionnaire are presented in Table 4.

Table 4 Value of one quality-adjusted life-year across questionnaire versions for respondents agreeing to pay, with and without trimming at 1%

3.3 Mean and Median WTP for One QALY

Across the questionnaires, the mean WTP value of respondents who expressed WTP for one QALY without trimming (1139 responses) ranged from US$3664 to US$10,080, with the overall mean value being US$4329 (= 0.87 of 1 GDP per capita). The trimmed WTP value (1103 responses) ranged from US$1792 to US$6250, with the overall mean value being US$2847 (= 0.57 of 1 GDP per capita) (see Table 4). For questionnaires I, J, and L, which included health gains as extra life scenarios, the overall mean values were US$3400 (= 0.68 of 1 GDP per capita) and US$4494.8 (= 0.9 of 1 GDP per capita) with and without trimming, respectively. The highest WTP for one QALY was that for scenario L, which presents a gain of extra life from a terminal illness situation. The results on median WTP probably reflect the principle of democracy or recognition of a majority vote. However, the median does not take the right-skewed distribution of WTP into account; thus, only the mean values are discussed in this paper.
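The GDP-relative figures for the overall means follow from dividing each mean by the GDP per capita reported in the Methods. The sketch below reproduces that calculation; the function name is hypothetical.

```python
# Express mean WTP per QALY as a fraction of GDP per capita.
# Both sets of figures are the ones reported in the text (US dollars).
GDP_PER_CAPITA_USD = 4957  # Iran, 2015 reference year

def share_of_gdp(wtp_usd: float) -> float:
    """Return WTP as a multiple of GDP per capita."""
    return wtp_usd / GDP_PER_CAPITA_USD

untrimmed_mean = 4329
trimmed_mean = 2847

print(round(share_of_gdp(untrimmed_mean), 2))  # 0.87
print(round(share_of_gdp(trimmed_mean), 2))    # 0.57
```

Both ratios fall below 1, which is the lower bound of the 1–3 GDP per capita range used for comparison.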

3.4 Determinants of WTP for One QALY

The coefficients from the Tobit regression models for factors that affect WTP for one QALY (determined for each question via unadjusted Tobit regression (Online Resource 2) and multivariate analyses) are presented in Table 5.

Table 5 Multiple Tobit regression

4 Discussion

This study estimated Iranian WTP values as a useful threshold for HTA in an era characterized by scarce resources. The overall WTP value for one QALY in this study is less than the threshold proposed by the World Health Organization for developing countries (0.57 GDP per capita versus 1–3 GDP per capita). This value aligns with the QALY value estimated on the basis of the perspectives of Iranian patients with cardiovascular diseases and diabetes [15, 16]. The method used in this study reflects a population’s preferences given that they are the potential medical service clients who pay for such services through out-of-pocket payments or tax/insurance premium-based payments. Gains in health and life expectancy, whose importance is determined by citizens, should therefore be valued [17]. Different methods for the evaluation of a cost-effectiveness threshold would lead to varying values. In the EuroVaQ project, the mean WTP value stated by society ranged from US$10,000 to US$30,000 in most cases [12] and was lower than the threshold of GBP20,000 to GBP30,000 for the “cost-per-QALY gained” used in the funding recommendations of the National Institute for Health and Clinical Excellence in England and Wales. Nevertheless, previous international surveys held in the UK and the USA before the initiation of the EuroVaQ project showed that the thresholds estimated on the basis of the WTP of society for the value of a QALY are similar to previously reported values for these countries (US threshold = US$50,000–US$100,000 per QALY) [13]. To the best of our knowledge, no comparable country-specific study on the WTP value for one QALY has been conducted in the context of developing countries.

The mean values of QALY gains in the present study are relatively invariant to the shape of the gains in terms of the fractions of gained health and duration, except for scenarios E(0.1) and D(0.25), which reflected higher values. The WTP per QALY was higher only when fractions of a QALY could be gained (theoretically derived from budget constraints and diminishing marginal utility of health). Note that the mean values of QALY gains for scenarios featuring health gains show similar fractions for improved health in the near future and in the distant future (A 0.25 × 4, B 0.25 × 4 end; F 0.1 × 10, G 0.1 × 10 end). The results with untrimmed WTP showed that the respondents assigned a higher WTP to more distant future scenarios than near-future scenarios (higher values for B than A, higher values for G than F). These findings were cancelled out, to some extent, by the results with trimmed WTP values in scenarios that reflect a less severe scenario (higher values for F than G), but the values assigned to B remained higher than those assigned to A. A review of the respondent-stated causes for the higher WTP in cases where scenario B was presented right after scenario A showed that most of these respondents preferred not to grapple with illnesses that afflict the elderly or be bedridden in their old age, or stated that quality of life in old age is as important as quality of life during youth. This finding contrasts with those of other similar studies on the value of one QALY [10, 12, 18]. The difference is attributed to the fact that the aforementioned studies were performed mostly in developed countries. Elderly populations in developing countries enjoy less support from welfare systems and are concerned that their old age will be confronted with high costs of treatment and inadequate long-term care and services [19, 20]; some also stated a preference for leaving their money to their families, especially their children.

The mean value of life-extending gain at the end of life (LEnd) is similar to that of installment payments for a 0.25 × 4 gain of life. As stated by Cairns [21], however, gains offered in the distant future are discounted by respondents, thus limiting the comparability of QALY gains in the distant future with those in the near future.

The mean value of a QALY gained in the scenario where a coma occurs, with the assumption that time spent in a coma represents life reduction rather than health reduction, was around 15% higher than those composed of gains in health (Table 1). This difference in mean values is lower than that found in European countries [12]. The variance between the current work and the European studies may be attributed to the cultural matter of accepting a terminal illness. In our study, around 20% of the participants who were unwilling to pay stated that being in a coma must be their fate and that they would accept this situation. The value of health gains in the end-of-life (terminal illness) scenario was 13% higher relative to health gain scenarios, suggesting that the general public attaches more value to health gains achieved during a severe health state [22]. It also supports the observation that the general public places greater value on life-extending QALYs than life-enhancing QALYs and justifies additional weighting for the use of life-extending QALYs in cost-effectiveness analyses. Without this weighting, values may be inconsistent with welfare economics [23]. Nevertheless, such weighting may vary by country because of the preferences, culture, and norms of a country.

In scenario A, respondents are required to pay for a gain of one QALY with a single payment, but in scenario P, the value is paid in installments over 4 years. The mean values were consistently higher in the latter case—a finding that aligns with those of previous studies. It also indicates that payment mechanism influences WTP [10, 18].

This study confirmed the finding that one unique WTP cannot be obtained. Using a single value for QALY overrides individual trends, such as diminishing marginal utility of health and potential differences in the value of incremental health across population groups [10, 12, 24, 25].

Furthermore, Tobit regression results showed that the WTP for one QALY seems to be related to several contextual factors. Some of the factors assessed in this work were inconsistently related to WTP, but education, family income, and family expenditure (as an indicator of income) were more consistently associated with WTP.

A striking result was the proportion of zero-WTP respondents across the scenarios, a phenomenon similar to that observed in the EuroVaQ study and a corresponding German survey [10]. In particular, a sizable proportion of zero WTP occurred in the LEnd scenario, in which respondents exhibited acceptance of what they regard as fate, and in scenario E (0.1 QALY), in which respondents indicated that they had insufficient money to pay.

The proportion of protestors in the present research was very low compared with the percentage in similar studies [10, 12], probably because of our decision to conduct face-to-face interviews. Such sessions minimize cognitively challenging valuations and enable the acquisition of values that are more valid than those obtained from independently completed online surveys [10]. Face-to-face interviews also reduce the possibility of overlooking certain age groups and social classes, another issue that challenges conventional Internet-based surveys [12, 13].

Another strength of this study is its use of information and communications technology for the visual illustration of scenarios; this approach is effective when respondents are asked to imagine hypothetical situations [10]. Although it resulted in values lower than the threshold provided for developing countries, it also provided valuable insights into values attached to different QALY types.

An important issue for consideration is that although decision makers and policy makers need to be aware of the general public’s WTP value for a QALY, this cost-effectiveness information is an input to decision making and not a magical substitute value. A WTP threshold should be used alongside other considerations rather than implemented as a strict decision-making measure; the final decision on allocating public resources remains the responsibility of politicians [24]. Additionally, WTP-based values should be rigorously tested across cultures using a consistent methodological approach. Correspondingly, we recommend that similar surveys be carried out in other provinces of Iran and other low- and middle-income countries to derive WTP values for QALY and the factors that affect such values.

5 Conclusion

The overall WTP value for one QALY in this study was estimated to be less than the threshold proposed by the World Health Organization for developing countries and was higher for life-extending gains than for health-enhancing gains. We strongly recommend that further WTP-based values be estimated across cultures using a consistent methodological approach. Our results provide a threshold range of WTP and insights into rigorous scientific decision making about healthcare technology for the future.

Data Availability Statement:

The datasets generated and analyzed during the current study are available from the corresponding author on reasonable request.