Abstract
Background
Decisions on reimbursement of health interventions in many jurisdictions are informed by health technology assessments (HTAs). Historically, the focus of these has often been cost effectiveness or cost utility, while other criteria were considered informally. More recently, there has been an increasing interest in the formal incorporation of additional criteria using multi-criteria decision analysis. Such an approach has not yet formally been part of decision-making policy in Ireland.
Objective
The objective of this analysis is to demonstrate that cost effectiveness is not the only criterion influencing reimbursement decisions in Ireland. Furthermore, the aim is to reveal criteria that may have informally influenced reimbursement decisions in the past.
Methods
A list of potential criteria was identified based on the literature, national guidelines and experience of the national HTA agency. Information on each of these criteria was sought for every assessment conducted in Ireland up to July 2015. A logistic regression was fitted to the data to identify influential parameters. Model selection was performed using the Bolasso method.
Results
Thirteen criteria were considered in the analysis. Two members of the HTA review team assessed the performance of the interventions against these criteria. Model selection suggests that the incremental cost-effectiveness ratio and quality of evidence could be important drivers of reimbursement recommendations in Ireland. Less important drivers suggested include the year of assessment, the level of uncertainty, as well as safety and tolerability.
Conclusion
The analysis demonstrates that recommendations for or against the reimbursement of technologies in Ireland are not only driven by cost effectiveness. This highlights the need for more formal inclusion of criteria in the process, to improve transparency and ensure consistency.
Reimbursement recommendations in Ireland are not only driven by cost effectiveness.
Quality of evidence appears to be an important driver for reimbursement recommendations in Ireland.
The analysis highlights a need for the formal inclusion of additional criteria to ensure transparent and consistent decision making.
1 Introduction
Health technology assessment (HTA) plays an important role in informing reimbursement decisions in many jurisdictions across Europe and the world [1]. While the focus of such evaluations has historically been on cost-effectiveness analysis [2], it is widely accepted that cost effectiveness alone should not determine the decision on reimbursement [3]. A decision problem, such as the one at hand, requires simultaneous consideration of multiple, sometimes conflicting, objectives.
In Ireland, decisions on the reimbursement of pharmaceutical products are made by the Health Service Executive (HSE). On behalf of the HSE, the National Centre for Pharmacoeconomics (NCPE) carries out assessments of all new pharmaceuticals where an application has been made for reimbursement. Following a rapid review, the NCPE either recommends the reimbursement of a product or requests a full pharmacoeconomic assessment. Based on the full assessment, the NCPE makes one of three recommendations: (1) reimbursement at the requested price; (2) against reimbursement; or (3) against reimbursement at the submitted price. A cost-effectiveness analysis (including an analysis of comparative safety and efficacy) and a budget impact (BI) analysis are the core parts of each assessment. An agreement between the HSE and the marketing authorisation holders (MAHs) is in place in Ireland, which suggests a threshold of €45,000 per quality-adjusted life-year (QALY), below which technologies will be reimbursed [4]. The agreement states that exceptional products that fail to satisfy the €45,000/QALY threshold may proceed to discussions between the HSE, relevant stakeholders and the MAH. Likewise, the attainment of an incremental cost-effectiveness ratio (ICER) below €45,000/QALY may not result in a positive recommendation, particularly where there are concerns about the validity of any aspect of the cost-effectiveness analysis. In practice, therefore, there is flexibility in the threshold, which allows for other aspects (criteria) of individual interventions to be taken into account. No list of relevant criteria or their influence on the decision is made explicit, limiting the transparency of the decision-making process.
A number of authors have identified influential criteria in the HTA process [5–8]. Dakin et al. [5] modelled National Institute for Health and Care Excellence (NICE) decisions with three possible outcomes (“recommendation”, “restricted recommendation” and “no recommendation”). Multinomial logistic regression results suggest that clinical evidence, higher ICERs and decision date influence the decision. In a binary model, Devlin and Parkin [6] found that a combination of ICERs, burden of disease and uncertainty are acceptable indicators for NICE decisions. Harris et al. [7] modelled a binary decision outcome for Australia using probit multiple regression. Their analysis suggests clinical significance, cost effectiveness, cost to government and severity of disease as predictors for coverage decisions. Tanios et al. [8] conducted an international survey of 140 decision makers in 23 countries to report criteria that are currently considered or should be considered in healthcare decisions. Clinical efficacy, safety, quality of evidence (QoE), disease severity and costs were ranked to be the most relevant criteria.
The work in this paper aims to identify criteria that are likely to have influenced reimbursement decisions in Ireland in the past, through a retrospective analysis. The revealed preferences do not necessarily represent the criteria the HSE and the Irish population would select in a formal process to influence decisions on drug reimbursement [9]; however, they establish what is likely to currently influence decisions. The work highlights the need for a systematic process to incorporate influential criteria to improve transparency and consistency in reimbursement decision making.
2 Methods
2.1 Identification of Criteria
Based on case studies of HTA agencies in the UK, Germany, Australia and the USA, Rotter et al. [10] recently explored the changing landscape of economic evaluations in HTA. They propose a taxonomy of factors impacting on the value of medical technologies, which we used as a basis for this analysis. Guindo et al. [11] recently published a literature review identifying decision criteria for resource allocation. The top ten criteria identified in their analysis are covered by the taxonomy proposed by Rotter et al. [10]. The list of criteria was modified to reflect past Irish experience, and a clear definition and measurement scale for each criterion were agreed upon in discussions with members of the NCPE review team.
2.2 Scoring of Assessments
Information was sought for every full pharmacoeconomic assessment completed by the NCPE up to July 2015. Data were extracted from summaries published on the NCPE website (http://www.ncpe.ie/), as these do not contain any confidential information. Each intervention was scored against each of the criteria using the predefined scales. Two members of the NCPE review team (LMC and SS) independently extracted the data; disagreements were resolved in discussions with the NCPE review team.
2.3 Statistical Analysis
The statistical analysis explores the relationship between the set of criteria identified in the literature review and the outcome of the NCPE assessment. The outcome was binary, with either a positive recommendation for reimbursement (1) or a negative recommendation (0).
As a first step, descriptive statistics are used to explore the relationship of each individual criterion and reimbursement. The dataset is then analysed using logistic regression. Logistic regression predicts the outcome of a categorical variable based on a number of predictor variables, which can be either categorical or continuous. In addition to the set of criteria identified in the literature review, the year in which the technology was assessed (coded as year of assessment) is considered as an additional predictor variable to account for the changes made during the last 7 years caused by financial constraints. Values are imputed where applicable for the primary analysis; cases with missing data were excluded in a sensitivity analysis. A logistic regression is a special case of a generalised linear model and takes the following form:
$$Y_i \sim \mathrm{Bernoulli}(p_i), \qquad \mathrm{logit}(p_i) = u_i, \qquad u_i = \sum_j w_j S_{ij}$$
where $Y_i$ is the observed outcome of treatment $i$, taking values 1 and 0; each observation has a Bernoulli distribution with a probability $p_i$ of success. The weight given to each criterion is denoted by $w_j$; the score achieved by treatment $i$ on criterion $j$ is denoted by $S_{ij}$. The overall utility $u_i$ of treatment $i$ is then calculated as a weighted sum of these scores. A logit link connects the probability of reimbursement $p_i$ with the utility of treatment $i$.
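The model described above can be illustrated with a minimal Python sketch. The function names, the criterion labels and the numeric weights and scores are hypothetical and chosen only for illustration; they are not the fitted values from this analysis.

```python
import math

def utility(weights, scores):
    """Weighted sum u_i = sum_j w_j * S_ij for one treatment."""
    return sum(weights[name] * scores[name] for name in weights)

def reimbursement_probability(weights, scores):
    """Inverse logit link: p_i = 1 / (1 + exp(-u_i))."""
    return 1.0 / (1.0 + math.exp(-utility(weights, scores)))

# Hypothetical weights and scores, for illustration only (not fitted values).
example_weights = {"icer_per_10k": -0.5, "qoe": 2.0}
example_scores = {"icer_per_10k": 4.0, "qoe": 0.0}

u_example = utility(example_weights, example_scores)
p_example = reimbursement_probability(example_weights, example_scores)
```

A utility of zero corresponds to a probability of exactly 0.5, so negative utilities map to probabilities below 0.5 and positive utilities to probabilities above it.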
Model selection is used to select important predictors out of the pool of defined criteria. Model selection is conducted using the Bolasso method, a combination of bootstrapping and LASSO (least absolute shrinkage and selection operator) [12]. The LASSO regression selects relevant variables by minimising the residual sum of squares (for a logistic model, the negative log-likelihood) while constraining the sum of the absolute values of the coefficients to be lower than a tuning parameter. If the tuning parameter is large, the constraint is not binding and LASSO gives the same results as the standard unpenalised regression. Cross-validation, trading off bias and variance, is used to choose the tuning parameter [13]. To ensure a consistent model choice, the analysis is repeated on 100 bootstrapped realisations of the dataset and only variables selected in more than 95 % of replications are retained for the base-case model. Variables selected in more than 85 % of replications are considered as an alternative scenario.
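The final step of this procedure, turning the per-replication LASSO selections into a consensus model, can be sketched in Python as follows. This is only the counting step: the LASSO fits themselves are assumed to have been run elsewhere on each bootstrap resample, and the replication results shown are invented for illustration.

```python
from collections import Counter

def selection_frequencies(selected_sets):
    """Share of bootstrap replications in which each variable was selected."""
    counts = Counter(var for selected in selected_sets for var in set(selected))
    n = len(selected_sets)
    return {var: count / n for var, count in counts.items()}

def consensus(selected_sets, threshold):
    """Variables selected in strictly more than `threshold` of replications."""
    freqs = selection_frequencies(selected_sets)
    return sorted(var for var, f in freqs.items() if f > threshold)

# Hypothetical output of 100 LASSO fits, one per bootstrap resample.
replications = (
    [{"icer", "qoe", "year"}] * 88
    + [{"icer", "qoe"}] * 9
    + [{"icer"}] * 3
)

base_case = consensus(replications, 0.95)    # stricter rule
alternative = consensus(replications, 0.85)  # looser rule
```

In this invented example, `icer` and `qoe` pass the 95 % rule, while `year` only passes the 85 % rule and so enters the alternative scenario.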
Model selection is conducted in R (version 3.2.1) using the glmnet package [14, 15]. The logistic regression incorporating the selected predictors is fitted with JAGS, called from R through the rjags package [16].
3 Results
3.1 Identification of Criteria
The final list of criteria and their definitions are summarised in Table 1. A total of 13 criteria were selected: cost effectiveness, BI, safety and tolerability, process utility (evaluation of the delivery/implementation of the product within the health service), unmet need, orphan status, disadvantaged population, end of life, severe disease, innovation, reversibility, QoE and uncertainty (based on probability of cost effectiveness). Cost effectiveness is measured as the ICER. The use of the ICER as a criterion in a regression can cause problems, since negative ICERs can refer to situations where QALYs are gained at a reduced cost (dominant technology) as well as situations where a higher cost is requested for a reduction in QALYs (dominated technologies). However, in this application, all negative ICERs refer to dominant technologies. The 5-year gross BI is measured in euros. The gross BI was chosen as it was broadly available and a high level of heterogeneity was found in the net budget calculations. All other criteria are measured on a simplistic two- or three-category scale as more detailed information on these criteria was not broadly available. Pharmaceuticals that are listed on the Orphanet Database (having been granted an orphan designation for disease(s) considered to be rare in Europe) [17] are classified as orphan drugs. Further details can be found in Table 1.
3.2 Scoring of Assessments
A total of 85 full pharmacoeconomic assessments were completed by the NCPE between January 2006 and July 2015. Details on the technologies are summarised in Table 2. Each of the assessments was scored against each of the criteria. Each assessment evaluated one intervention (Table 3).
The base-case ICER was chosen; in cases where more than one base case was specified, the average across values was calculated. (In six cases, multiple comparators were deemed equally appropriate. In one case patients were split into subgroups; however, an overall recommendation on reimbursement was made.) Annual BIs are multiplied by five to determine the 5-year BI; the average is used where a range of values was reported.
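The two data-preparation rules above can be written down as a short Python sketch. The helper names and the example figures are hypothetical; only the rules themselves (averaging multiple base-case ICERs, averaging a reported range, multiplying the annual BI by five) come from the text.

```python
def base_case_icer(icers):
    """Average the ICER when more than one base case was specified."""
    return sum(icers) / len(icers)

def five_year_budget_impact(annual_bi):
    """annual_bi is a single annual value or a (low, high) range in euros."""
    if isinstance(annual_bi, tuple):
        low, high = annual_bi
        annual_bi = (low + high) / 2  # use the average of a reported range
    return 5 * annual_bi  # assumes a constant annual budget impact

# Hypothetical figures, for illustration only.
icer = base_case_icer([30_000, 50_000])               # two base cases
bi = five_year_budget_impact((1_000_000, 3_000_000))  # annual range
```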
Three assessments [17, 20, 44] did not submit a cost-effectiveness analysis and three assessments [23, 30, 75] conducted a cost-minimisation analysis. No ICER was reported in the summary in a further four cases [56, 59, 78, 81]. Two cases reported an ICER in the southwest quadrant of the cost-effectiveness plane (less costly and less effective than comparator) [22, 64] and the intervention in one case was dominated (more costly, less effective) [69]. These 13 cases were excluded from the analysis. One of the remaining assessments was a re-evaluation of a previous assessment. We only included the more recent assessment in the analysis in order to avoid double counting. Case 85 was therefore excluded.
No BI was reported in nine of the remaining cases [18, 24, 32, 39, 52, 54, 57, 72, 74]. Thirteen assessments [2, 3, 4, 12, 18, 21, 39, 41, 47, 49, 50, 52, 67] show dominance of the intervention in question and do not report an ICER.
In the base-case analysis, the BI is imputed as the mean BI of the remaining treatments. An ICER of –1 is adopted for dominant treatments. A sensitivity analysis (SA1) explores the analysis of the ICER as a categorical variable (0 for dominant technologies, 1 for ICERs €0–20,000, 2 for ICERs €20,000–40,000, 3 for ICERs €40,000–60,000, 4 for ICERs €60,000–80,000, 5 for ICERs €80,000–100,000 and 6 for ICERs more than €100,000). We have excluded dominant cases and cases for which no BI was reported in a sensitivity analysis (SA2).
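A minimal Python sketch of these coding choices is given below. The function names and example values are hypothetical. The text does not state on which side the category boundaries fall, so the sketch assumes that each band includes its upper bound.

```python
def impute_mean(values):
    """Replace missing values (None) by the mean of the observed ones."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]

def icer_category(icer, dominant=False):
    """SA1 coding: 0 = dominant, 1..5 = EUR 20,000 bands, 6 = above EUR 100,000.

    Assumption: each band includes its upper bound; the text does not say
    which side the boundaries fall on.
    """
    if dominant:
        return 0
    bands = (20_000, 40_000, 60_000, 80_000, 100_000)
    for category, upper in enumerate(bands, start=1):
        if icer <= upper:
            return category
    return 6

DOMINANT_ICER = -1  # base-case placeholder value for dominant treatments

# Hypothetical budget impacts in euros; None marks an unreported value.
bi_imputed = impute_mean([2_000_000, None, 4_000_000])
```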
The base-case analysis and SA1 are therefore based on 71 assessments, while 52 assessments are included in SA2.
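These sample sizes can be checked with simple set arithmetic on the case numbers listed above. The Python sketch below assumes that cases are numbered 1 to 85; the variable names are illustrative. Three cases (18, 39 and 52) are both dominant and lack a reported BI, which is why SA2 drops 19 rather than 22 assessments.

```python
all_cases = set(range(1, 86))  # assumption: the 85 assessments are numbered 1..85

no_cea = {17, 20, 44}
cost_minimisation = {23, 30, 75}
no_icer_reported = {56, 59, 78, 81}
southwest_quadrant = {22, 64}
dominated = {69}
re_evaluated = {85}  # the case dropped to avoid double counting

excluded = (no_cea | cost_minimisation | no_icer_reported
            | southwest_quadrant | dominated | re_evaluated)
base_case = all_cases - excluded

no_bi = {18, 24, 32, 39, 52, 54, 57, 72, 74}
dominant = {2, 3, 4, 12, 18, 21, 39, 41, 47, 49, 50, 52, 67}
sa2 = base_case - (no_bi | dominant)

print(len(base_case), len(sa2))  # 71 52
```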
3.3 Descriptive Analysis
Of the 71 evaluations included in the base-case analysis, 27 (38 %) interventions were recommended for reimbursement. Table 4 summarises the descriptive statistics. For continuous criteria, the table reports the overall mean as well as the mean stratified by reimbursement recommendation. For binary and categorical outcomes, the table reports overall counts and percentages in each category as well as stratified by recommendation.
All but one of the reimbursement decisions were classified as being reversible and only one intervention related to a disadvantaged population. These criteria are therefore not further considered in the analysis.
3.4 Regression Results
Two variables, ICER and QoE, are selected in >95 % of the bootstrap replications to be important predictors for the reimbursement recommendation and are therefore chosen for the base-case model. Other strong predictors (selected in >85 % of the bootstrap replications) are year of assessment, safety and tolerability, as well as level of uncertainty.
The regression model estimates a probability of reimbursement for each decision made in the past. The decision rule based on the logistic regression suggests a positive recommendation for technologies with a probability of ≥0.5 and a negative recommendation for technologies with a probability of <0.5.
Figure 1 plots the estimated probability of reimbursement for each assessment stratified by actual recommendation using the ICER and QoE as predictors.
The model correctly classifies 64 of 71 assessments (90 %). When extending the model to incorporate year of assessment, safety and tolerability and the level of uncertainty, 96 % of assessments are classified correctly.
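The decision rule and the classification rate can be sketched in Python as follows. The fitted probabilities and observed recommendations in the example are invented; only the 0.5 cutoff and the 64 of 71 figure come from the text.

```python
def recommend(probability, cutoff=0.5):
    """1 = positive recommendation if probability >= cutoff, else 0."""
    return 1 if probability >= cutoff else 0

def accuracy(probabilities, actual):
    """Share of assessments whose predicted recommendation matches the actual one."""
    hits = sum(recommend(p) == y for p, y in zip(probabilities, actual))
    return hits / len(actual)

# Hypothetical fitted probabilities and observed recommendations.
probs = [0.92, 0.50, 0.31, 0.07, 0.64]
observed = [1, 1, 0, 0, 0]
share_correct = accuracy(probs, observed)  # 4 of 5 correct

reported_base_case = 64 / 71  # figure reported in the text, about 0.90
```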
The coefficients of both models are summarised in Table 5; they contain information on the impact the different criteria have on the reimbursement decision. In the base case, coefficients are significant at the 95 % confidence level. Coefficients in the extended base case are more uncertain. The ICER, QoE, uncertainty and year of assessment remain significant at the 95 % confidence level, while confidence intervals for coefficients for safety and tolerability (95 % CI –0.02 to 0.04) as well as the intercept (95 % CI –49.7 to 70.3) are very wide.
Since QoE is a categorical variable, it is possible to calculate a threshold for the cost per QALY for each level. The model suggests the reimbursement of interventions based on evaluations with no issues associated with the quality of available evidence (QoE = 0) up to a threshold of €40,633/QALY. For interventions with poor QoE (QoE = –1), a negative threshold is estimated, indicating that evaluations based on poor QoE should not be recommended for reimbursement.
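The threshold for each QoE level is the ICER at which the modelled probability equals 0.5, that is, where the linear predictor equals zero. The Python sketch below assumes a linear predictor made up of an intercept, an ICER term and a QoE term. The coefficients are invented for illustration and are not the Table 5 estimates.

```python
def icer_threshold(intercept, w_icer, w_qoe, qoe):
    """ICER at which the modelled probability of reimbursement equals 0.5.

    Solves intercept + w_icer * ICER + w_qoe * qoe = 0 for ICER.
    """
    return -(intercept + w_qoe * qoe) / w_icer

# Hypothetical coefficients, NOT the Table 5 estimates.
coef = {"intercept": 2.0, "w_icer": -0.00005, "w_qoe": 4.0}

good_evidence = icer_threshold(qoe=0, **coef)   # positive threshold
poor_evidence = icer_threshold(qoe=-1, **coef)  # negative threshold
```

With these invented values the threshold is positive for QoE = 0 and negative for QoE = –1, which mirrors the pattern described above: a negative threshold means that no positive ICER would lead to a modelled probability of 0.5 or more.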
For the extended base case, the probability of a positive recommendation decreases with increasing ICER; a probability of 0.5 is reached for an ICER of €59,046/QALY for a technology assessed in 2015 where all other predictors equal 0. QoE remains important with a probability of close to zero for evaluations with poor QoE. The probability of a positive recommendation increases slightly with time. The probability decreases with a worsening of safety and tolerability and increased uncertainty.
SA1, treating the ICER as categorical, also selects the ICER and QoE in >95 % of simulations. The model suggests the reimbursement of technologies at a threshold between 2 (ICER: €20,000–40,000) and 3 (ICER: €40,000–60,000) for technologies with appropriate evidence and does not recommend the reimbursement of technologies with a poor evidence base. The model correctly classifies 90 % of technologies.
SA2, excluding dominant cases as well as cases where BI was not reported, also selects the same two criteria (ICER and QoE) in >95 % of bootstrap replications; year of assessment is selected in >85 % of replications.
4 Discussion
The analysis demonstrates that recommendations for or against the reimbursement of technologies in Ireland are not only driven by cost effectiveness. Apart from the ICER, the quality of available evidence is identified as a potential key predictor. Other potential drivers of the decision include safety and tolerability as well as decision uncertainty. While uncertainty surrounding the estimates is high, the results resemble what one would expect. Ireland currently operates at a threshold of €45,000/QALY for pharmaceuticals, which was reduced to €20,000/QALY between 2010 and 2012; an ICER of €40,633/QALY in the base-case model is therefore in line with current policy. An evaluation can only be as good as the evidence it is built on; estimates based on poor QoE or inadequate modelling approaches are unreliable. The large impact of QoE on reimbursement recommendation is therefore not surprising. While all interventions have adverse effects, a significant improvement in safety and tolerability has been identified to positively impact on reimbursement. In the same way, a significant dis-improvement in safety and tolerability makes a positive recommendation less likely. While the economic evaluation takes place post market authorisation, meaning companies have demonstrated an acceptable level of safety, the consideration of this criterion remains important as it may impact on the well-being of the patient.
These findings resemble those of analyses conducted in other countries [5–8]. The criteria selected here were also identified in one or several of these studies (see Sect. 1 for details). Burden of disease had been identified in several of the published studies; however, this was not mirrored in this analysis.
Multi-criteria decision analysis (MCDA) in HTA is emerging as a new approach allowing for the systematic inclusion of multiple objectives in the assessment. Thokala and Duenas [3] have illustrated how different approaches to MCDA can be incorporated in the HTA framework. Goetghebeur et al. [18] have developed the EVIDEM (Evidence and Value: Impact on DEcisionMaking) framework for Canada, a model which has also been applied in Nepal [19], Chile [20] and Ghana [21]. Sullivan [22] has proposed the application of MCDA for New Zealand. One approach to MCDA extends the incremental net benefit (INB) to a linear additive value function incorporating relevant criteria [23]. The methodology easily extends the current decision-making approach in Ireland.
This analysis is the first phase of the development of an MCDA approach for the Irish healthcare setting. A range of potentially relevant criteria was selected based on the literature and a descriptive approach is taken to explain the impact different criteria have had on reimbursement recommendations in the past. The results of this analysis could be used to inform parameters of a linear additive value function. However, while the identified criteria are likely to have influenced decision making in the past, they have done so in an informal way; their value and relative importance have never been formally assessed. Such a formal assessment is needed to select parameters of an MCDA model to inform future decisions.
The NCPE now routinely assesses all factors identified as relevant in this analysis and incorporates outcomes in their report. The findings of the analysis can strengthen the case for conditional reimbursement in a policy setting. MCDA structures may offer a way of formally recognising concerns regarding QoE or safety and tolerability.
There are some limitations associated with this analysis. The analysis is limited to data available in the public domain; it is therefore possible that other factors influencing the recommendation may not be captured.
The outcome variable of the analysis distinguishes between positive and negative recommendations for reimbursement. Approximately half of the negative recommendations by the NCPE were issued with a recommendation “not at this price”, which may result in post-recommendation price discussions and potential reimbursement following price reduction. While this study analysed factors influencing the recommendation process, an analysis of factors influencing the actual reimbursement decision would complete the picture of decision making in Ireland. However, price discussions and agreements following the recommendation are often commercial in confidence and assessments based on the newly agreed price are not available.
The ICER as a measure for cost effectiveness has certain properties that are not ideal in a regression analysis. Positive values can indicate a QALY gain at an additional cost as well as a loss of QALYs at a saving, and negative values can indicate a QALY gain at a saving as well as a QALY loss at additional costs. However, this is not a problem in this analysis, since the technologies included all show a positive QALY gain. A solution to the problems associated with the ICER is the use of the INB instead. Unfortunately, data to infer the INB were not available for many of the technologies analysed here.
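The sign ambiguity of the ICER, and how the INB avoids it, can be shown with a small Python sketch. It uses the standard definition of the INB as the threshold multiplied by the incremental QALYs, minus the incremental cost. The incremental costs and QALYs are invented; the €45,000/QALY threshold is the one named in the agreement.

```python
def icer(delta_cost, delta_qaly):
    """Incremental cost-effectiveness ratio (cost per QALY gained)."""
    return delta_cost / delta_qaly

def incremental_net_benefit(delta_cost, delta_qaly, threshold):
    """INB in monetary units: threshold * incremental QALYs - incremental cost."""
    return threshold * delta_qaly - delta_cost

THRESHOLD = 45_000  # EUR per QALY

# Hypothetical increments: both yield an ICER of -10,000 per QALY.
dominant = {"delta_cost": -5_000, "delta_qaly": 0.5}   # cheaper and better
dominated = {"delta_cost": 5_000, "delta_qaly": -0.5}  # dearer and worse

inb_dominant = incremental_net_benefit(threshold=THRESHOLD, **dominant)
inb_dominated = incremental_net_benefit(threshold=THRESHOLD, **dominated)
```

Both technologies share the same negative ICER, yet the INB is positive for the dominant one and negative for the dominated one, so the INB orders them correctly on a single scale.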
Some of the criteria are not easily defined. Gross BI was reported for the majority of treatments and therefore used to measure BI. However, it would be interesting to see if net BI yields a different outcome. Unfortunately, the net BI is not in the public domain for many cases, and there is high heterogeneity in its calculation between the different assessments. Nevertheless, there is uncertainty in the gross BI as it is influenced by many factors including projections on patient numbers and market penetration. Furthermore, a constant BI over 5 years was assumed where only an annual BI was reported.
The scale used to measure the criteria also influences the analysis. Due to the difficulties in obtaining information on all criteria as well as to improve clarity, a simple two- or three-level scale was used for most of the criteria. A more sophisticated approach for some of the criteria may be beneficial. For instance, measuring uncertainty on a continuous scale, such as the expected value of perfect information, would capture the full level of uncertainty.
In addition, while year of assessment was included in the analysis to explain some of the changes over time, the ICER and BI were not adjusted to a common price year to account for inflation.
5 Conclusion
Transparency is reduced when criteria informally influence the decision process. We hope that highlighting potentially influencing criteria in this analysis will trigger communication between decision makers, experts and other stakeholders to discuss potential ways of incorporating relevant criteria in a more formalised manner. The results of this analysis provide a basis for discussions on whether the revealed criteria, which have influenced decisions in the past, should also be the criteria used to influence decisions in the future. MCDA forces an explicit list of criteria and the quantification of their value. Careful consideration is necessary to determine the relative importance of the criteria before an MCDA approach can be applied to actually aid decision making in the Irish healthcare setting.
References
1. INAHTA. Global Networking for Effective Healthcare website. http://www.inahta.org. Accessed 30 Oct 2013.
2. Johri M, Norheim OF. Can cost-effectiveness analysis integrate concerns for equity? Systematic review. Int J Technol Assess Health Care. 2012;28(2):125–32.
3. Thokala P, Duenas A. Multiple criteria decision analysis for health technology assessment. Value Health. 2012;15(8):1172–81.
4. Health Service Executive. Framework agreement between the Irish pharmaceutical healthcare association ltd and the department of health and the health service executive on the supply terms, conditions and prices of medicines. 2012. http://www.hse.ie/eng/about/Who/cpu/IPHA_Agreement_2012.pdf. Accessed 19 Mar 2016.
5. Dakin HA, Devlin NJ, Odeyemi IA. “Yes”, “No” or “Yes, but”? Multinomial modelling of NICE decision-making. Health Policy. 2006;77(3):352–67.
6. Devlin N, Parkin D. Does NICE have a cost-effectiveness threshold and what other factors influence its decisions? A binary choice analysis. Health Econ. 2004;13(5):437–52.
7. Harris AH, Hill SR, Chin G, Li JJ, Walkom E. The role of value for money in public insurance coverage decisions for drugs in Australia: a retrospective analysis 1994–2004. Med Decis Mak. 2008;28(5):713–22.
8. Tanios N, Wagner M, Tony M, Baltussen R, van Til J, Rindress D, et al. Which criteria are considered in healthcare decisions? Insights from an international survey of policy and clinical decision makers. Int J Technol Assess Health Care. 2013;29(4):456–65.
9. Bridges JF, Hauber AB, Marshall D, Lloyd A, Prosser LA, Regier DA, et al. Conjoint analysis applications in health—a checklist: a report of the ISPOR Good Research Practices for Conjoint Analysis Task Force. Value Health. 2011;14(4):403–13.
10. Rotter JS, Foerster D, Bridges JF. The changing role of economic evaluation in valuing medical technologies. Expert Rev Pharmacoecon Outcomes Res. 2012;12(6):711–23.
11. Guindo LA, Wagner M, Baltussen R, Rindress D, van Til J, Kind P, et al. From efficacy to equity: Literature review of decision criteria for resource allocation and healthcare decisionmaking. Cost Eff Resour Alloc. 2012;10(1):9.
12. Bach F. Model-consistent sparse estimation through the bootstrap. arXiv. 2009; eprint arXiv:0901.3202. http://arxiv.org/abs/0901.3202. Accessed 19 Mar 2016.
13. Efron B, Hastie T, Johnstone I, Tibshirani R. Least angle regression. Ann Stat. 2004;32(2):407–99.
14. Friedman J, Hastie T, Tibshirani R. Regularization paths for generalized linear models via coordinate descent. J Stat Softw. 2010;33(1):1–22.
15. R-Core-Team. R: a language and environment for statistical computing. Vienna: R Foundation for Statistical Computing; 2014.
16. Plummer M. rjags: Bayesian graphical models using MCMC. R package version 3-15. 2015. http://CRAN.R-project.org/package=rjags. Accessed 1 July 2015.
17. Orphanet. List of rare diseases online. 2016. http://www.orpha.net/consor/cgi-bin/Drugs.php?lng=EN. Accessed 1 Feb 2016.
18. Goetghebeur M, Wagner M, Khoury H, Levitt RJ, Erickson LJ, Rindress D. Bridging health technology assessment (HTA) and efficient health care decision making with multicriteria decision analysis (MCDA): applying the EVIDEM framework to medicines appraisal. Med Decis Mak. 2012;32(2):376–88.
19. Baltussen R, Ten Asbroek A, Koolman X, Shrestha N, Bhattarai P, Niessen L. Priority setting using multiple criteria: should a lung health programme be implemented in Nepal? Health Policy Plan. 2007;22(3):178–85.
20. Vargas V, Poblete S. Health prioritization: the case of Chile. Health Aff. 2008;27(3):782–92.
21. Jehu-Appiah C, Baltussen R, Acquah C, Aikins M, Amah d’Almeida S, Bosu WK, et al. Balancing equity and efficiency in health priorities in Ghana: the use of multicriteria decision analysis. Value Health. 2008;11(7):1081–7.
22. Sullivan T. Using MCDA (Multi-Criteria Decision Analysis) to prioritise publicly-funded health care. 2012. http://otago.ourarchive.ac.nz/bitstream/handle/10523/2651/SullivanTrudyA2012PhD.pdf?sequence=1. Accessed 1 Dec 2015.
23. Belton V, Stewart TJ. Multiple criteria decision analysis: an integrated approach. Heidelberg: Springer; 2002.
Author contributions
SS and LMC collected the data for the analysis. SS and CW conducted the statistical analysis. RA and MB provided insights for the interpretation of the results. SS and LMC wrote the initial draft and all authors reviewed and provided advice for the final manuscript.
Ethics declarations
No funding was received to conduct this study. None of the authors (SS, LMC, RA, MB, CW) have a conflict of interest in respect to this study.
Cite this article
Schmitz, S., McCullagh, L., Adams, R. et al. Identifying and Revealing the Importance of Decision-Making Criteria for Health Technology Assessment: A Retrospective Analysis of Reimbursement Recommendations in Ireland. PharmacoEconomics 34, 925–937 (2016). https://doi.org/10.1007/s40273-016-0406-z
DOI: https://doi.org/10.1007/s40273-016-0406-z