Key Points for Decision Makers

MCDA has been used to capture the patient voice by involving patients as the source of weights.

A range of methods are available to elicit patient preferences.

A number of challenges face the implementation of MCDA when working with patients, such as how best to reflect the heterogeneity of patient preferences in decision making, and how to manage the cognitive burden associated with some MCDA tasks.

1 Introduction

Most people recognize that the opinion of the patient (his or her voice) should be a central consideration in making healthcare decisions [1–4]. However, cases of patients challenging decisions suggest that it is questionable whether this is being meaningfully achieved. For instance, patient testimonies and social media campaigns were used to gain Food and Drug Administration (FDA) approval for flibanserin to treat female hypoactive sexual desire disorder, after an advisory committee had previously rejected the drug twice [5, 6]. The drug was initially rejected because of concerns over side effects and questionable benefit. However, intensive campaigning by women’s groups is thought to have influenced the decision to eventually approve the drug, with patients stating that they were willing to accept potential risks in exchange for the potential benefit.

The apparent failure to capture the patient’s voice runs contrary to many efforts of decision makers. The Patient-Centered Outcomes Research Institute (PCORI) was established to fund research designed to improve patient care and outcomes through methods that bring the patient to the center of healthcare research and development [7]. The European Medicines Agency (EMA) has started a pilot project that will involve at least two patients in the Committee for Medicinal Products for Human Use [8]. Health technology assessment (HTA) agencies involve patient groups in committees and in citizen juries intended to inform the principles on which decisions are made [2]. The FDA has recently recognized the need for further patient involvement in the drug development process, forming the Patient-Focused Drug Development initiative [9], which incorporates patient perspectives in earlier stages of drug development, and the Patient Representative Program, which invites patient representatives to take part in advisory committees considering drugs for approval [10].

The incongruity between the efforts to involve patients in regulatory decision making on the one hand and their ongoing dissatisfaction with the results on the other points to the limitations of current approaches. These approaches are criticized for collecting qualitative data and doing so from just a few patient representatives [11]. That is, while the performance of treatments on, for example, clinical and safety endpoints is quantified, patient preferences for these endpoints are currently not quantified. The natural tendency to focus on aspects of the decision problem that are quantified means that less attention is given to those which are not. If patient preferences are to be given the attention they deserve, it is necessary that they too be addressed quantitatively [12].

Multi-criteria decision analysis (MCDA) provides a way to quantify the patients’ values in order to inform the decision-making process. While different types of MCDA exist, it is the value measurement approach to MCDA that is most prevalent in healthcare [13]. This is a method for disaggregating a decision into its components and systematically addressing them, often quantitatively, to support decision making [14, 15]. This is done by systematically identifying decision criteria, measuring how well each alternative under consideration does against the criteria, valuing this performance (‘weighting’), and aggregating the data into an overall assessment of the relative value of each option. By breaking down complex, multi-dimensional health decisions into more manageable components, MCDA can support the quantification of patient values and, thus, facilitate their incorporation into decision making.
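The four steps described above (identify criteria, score performance, weight, aggregate) can be sketched as a simple additive model. This is a minimal illustration only: the criteria names, the 0–100 scores, and the point allocations are hypothetical and are not taken from any study in this review, and a weighted sum is just one common way of aggregating in value measurement MCDA.

```python
def normalize_weights(raw_points):
    """Rescale raw importance points so that the weights sum to 1."""
    total = sum(raw_points.values())
    return {criterion: pts / total for criterion, pts in raw_points.items()}


def overall_value(scores, weights):
    """Aggregate criterion scores into one value using a weighted sum."""
    return sum(weights[c] * scores[c] for c in weights)


# Hypothetical importance points for three illustrative criteria.
weights = normalize_weights({"efficacy": 50, "safety": 30, "convenience": 20})

# Hypothetical performance scores on a common 0-100 scale.
options = {
    "treatment_a": {"efficacy": 80, "safety": 40, "convenience": 60},
    "treatment_b": {"efficacy": 60, "safety": 70, "convenience": 90},
}

totals = {name: overall_value(s, weights) for name, s in options.items()}
for name, value in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {value:.1f}")
```

In this sketch the option with the lower efficacy score comes out ahead once safety and convenience are weighted in, which is the kind of trade-off the weighting step is meant to make explicit.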

Though widely applied outside healthcare [16], MCDA’s value to healthcare decision makers has only recently been realized [17–19], leading to a sharp increase in publications [15]. Agencies including the German Institute for Quality and Efficiency in Healthcare (IQWiG) [20, 21] and the EMA [8] are piloting its use, and MCDA has been applied to support shared decision making (SDM) [22].

The relatively recent healthcare interest in MCDA means that further work is required to determine how best to use it to support healthcare decisions [15]. For instance, many weighting techniques are available, including: ranking (respondents are asked to order criteria according to importance, and assumptions are made to translate these ordinal ranks into weights); direct weighting (respondents assign numbers to each criterion to indicate its relative importance); pairwise comparison (respondents compare pairs of criteria, as in the analytic hierarchy process (AHP); see Saaty [23]); and discrete choice experiments (DCEs; see Ryan et al. [24]). It is not yet established which approach is most appropriate for use with patients.
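To make the difference between ranking and direct weighting concrete, the sketch below converts the same ordinal ranking into weights under two different assumptions (rank sum and rank order centroid) and contrasts this with normalizing directly stated points. These two conversion rules are common textbook choices used here as examples; the text above does not specify which assumptions the reviewed studies made, and the point values are hypothetical.

```python
def rank_sum_weights(n):
    """Rank sum rule: the weight falls linearly with rank (rank 1 = most important)."""
    total = n * (n + 1) / 2
    return [(n - rank + 1) / total for rank in range(1, n + 1)]


def rank_order_centroid_weights(n):
    """Rank order centroid rule: w_i = (1/n) * sum of 1/k for k = i..n."""
    return [sum(1 / k for k in range(rank, n + 1)) / n for rank in range(1, n + 1)]


def direct_weights(points):
    """Direct weighting: normalize the numbers a respondent states."""
    total = sum(points)
    return [p / total for p in points]


rs = rank_sum_weights(3)
roc = rank_order_centroid_weights(3)
dw = direct_weights([50, 30, 20])  # hypothetical stated points

print("rank sum:           ", [round(w, 3) for w in rs])
print("rank order centroid:", [round(w, 3) for w in roc])
print("direct weighting:   ", [round(w, 3) for w in dw])
```

The same ranking yields noticeably different weights for the top criterion depending on the conversion rule, which is why the assumption needs to be reported alongside the results.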

Weighting techniques differ in the level of cognitive challenge they pose. This is particularly important to consider when working with patients who may be unfamiliar with the tasks they are being asked to complete. For instance, techniques that involve making a choice may be easier than pairwise comparison, which may be easier than directly providing a precise estimate of the relative importance of two or more criteria. Equally, techniques that require one or two criteria to be considered at a time may be easier than those that require all criteria to be weighed up. MCDA methods also vary in the level of ‘support’ provided to participants. This is partly a function of whether a workshop, interview, or a survey approach is adopted. DCEs typically use a survey, which limits the information provided to participants, and does not allow for interaction with or between participants. An interview or workshop context allows participants to clarify tasks, facilitates discussion between participants and knowledge sharing between experts and participants.

There is a lack of formal guidance on how to use MCDA to support healthcare decision making. Several frameworks have been proposed that identify the differences between MCDA methods (see, for instance, De Montis et al. [25]). These were not, however, developed for a healthcare audience, although efforts are ongoing to generate such use-specific guidance [26].

The objective of this review is to support the use of MCDA to capture the patient voice by reporting on existing MCDAs that elicited weights from patients, summarizing the approaches adopted, and the lessons learned from this experience.

2 Methods

MEDLINE and EMBASE were searched in June 2014 for English-language papers with no date restriction. Abstracts were reviewed and included if they reported the application of MCDA to assess healthcare interventions. Abstracts were excluded if they did not apply MCDA, such as discussions of how MCDA could be used; or did not evaluate healthcare interventions, such as MCDAs to assess the level of health need in a locality. Full texts were retrieved for the remaining studies and reviewed to identify MCDAs that involved patients as a source of weights. Patients can be involved at a number of steps during the MCDA process (e.g. selecting criteria or providing value functions); however, this review selected studies that included patients for weighting, because weighting methods are often more transparently and thoroughly described in MCDA reports. Abstracts and full texts were reviewed by two reviewers and disagreements were resolved in a meeting. The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) diagram is reported in Fig. 1.

Fig. 1 Overview of the literature review. MCDA multi-criteria decision analysis

Data were extracted on: date of study; location (either the jurisdiction for the research or, if no jurisdiction was given, the location of the lead author); therapeutic area; number of criteria included in the MCDA; nature of patient involvement; number of stakeholders (including patients) involved; evidence on the variation in preferences between patients and experts, and among patients; type of weighting technique adopted; and evidence on the appropriateness of different weighting techniques. Study quality was not reviewed, not least because there is no established approach for assessing the quality of MCDAs. Rather, one of the objectives of the study was to learn the lessons for undertaking MCDAs. This study was funded by Sanofi Genzyme.

3 Results

3.1 Study Characteristics

After duplicate removal, the searches identified 2346 abstracts. Following abstract review, full texts were retrieved for 129 studies, and 10 of these reported MCDAs in which patients were the source of weights. Table 1 summarizes the characteristics of the 10 studies included. These covered 12 examples of MCDAs that elicited patient views to use as weights in the analysis. Five studies also drew on patient opinion to determine the value framework (criteria included in the analysis) [27–31].

Table 1 Summary of study characteristics

3.2 Variation in Preferences Between Stakeholders

Three papers compared and found differences between weights obtained from patients and clinicians [27, 31, 32]. Sussex et al. [31] piloted the use of MCDA to value orphan medicinal products. Based on an extensive review of the literature and stakeholder engagement, the authors identified as criteria: availability of treatments; survival prognosis before treatment; morbidity before treatment; social impact of disease on patients’ and caregivers’ daily lives before treatment; treatment innovation; clinical efficacy of treatment; treatment safety; and social impact of treatment on patients’ and caregivers’ daily lives. Direct rating was used, allocating 100 points across the criteria in proportion to the respondent’s assessment of importance. Patients gave more weight to the impact of the disease, while experts gave greater weight to efficacy and availability of alternatives. Patients gave lower weight to availability (11 points out of 100, compared with 19.5 for clinical experts) and clinical efficacy (17.5 points out of 100, compared with 27.5 for clinical experts), and greater weight to the social impact of the disease without treatment (15 points out of 100, compared with 8 for clinical experts) and the social impact of treatment on the patients’ and caregivers’ daily lives (17.5 points out of 100, compared with 11 for clinical experts).

Hummel et al. [32] used a pairwise comparison technique (AHP) to estimate weights to inform reimbursement decisions on antidepressants. Weights were elicited from two panels: one comprising patients only, and the other comprising expert psychiatrists and psychotherapists. The groups differed significantly in the weight given to response (patients = 0.37; experts = 0.05), and to remission (patients = 0.09; experts = 0.40). The authors suggested that this is related to the clinicians’ longer-term perspective versus the patients valuing immediate results.

Hummel et al. [27] used MCDA to evaluate augmentative treatment of upper limbs in persons with tetraplegia. They compared the weights elicited from an expert panel with those of patients attending rehabilitation clinics. The expert panel comprised two rehabilitation physicians, two occupational therapists, two physiotherapists, and one social worker, as well as a person with tetraplegia. Experts gave greater weight to arm-hand function (0.53 vs. 0.39 for patients). Patients gave a greater weight to ease of use (0.24 vs. 0.17 for experts) and the time required for treatment (0.11 vs. 0.03 for experts).

3.3 Heterogeneity of Patient Preferences

Four studies reported the variation in patient responses, all concluding that there is significant heterogeneity. Three studies that elicited patient priorities for colorectal cancer screening using pairwise comparisons of criteria (AHP) observed weights varying widely [33–35]. Dolan et al. [34] surveyed 484 patients and found the differences were not associated with demographic factors, numeracy, or literacy skills. This supported the authors’ earlier finding in a survey of 48 primary care patients [33]. Hummel et al. [35] elicited weights from 167 patients and found the standard deviation for weights for each criterion ranged between 58 and 100% of the mean weight. The authors speculated that this large variation may be caused by the limited number of respondents, but could also reflect real differences among respondents. Hummel et al. [32] applied the AHP to elicit patient views on the attributes of antidepressants, and also observed large differences about the importance of response and relapse.
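The heterogeneity measure reported by Hummel et al. [35], the standard deviation expressed as a percentage of the mean weight, is straightforward to compute for a single criterion. The sketch below uses hypothetical weights from five respondents and assumes the sample standard deviation; the text above does not state which estimator the authors used.

```python
from statistics import mean, stdev


def sd_as_percent_of_mean(weights):
    """Sample standard deviation of a criterion's weights as a percentage of their mean."""
    return 100 * stdev(weights) / mean(weights)


# Hypothetical weights given to one criterion by five respondents.
hypothetical_weights = [0.1, 0.2, 0.3, 0.4, 0.5]
spread = sd_as_percent_of_mean(hypothetical_weights)
print(f"SD is {spread:.1f}% of the mean weight")
```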

Two approaches for aggregating patient responses were observed across the 10 studies. Most studies used the geometric mean of patients’ responses [27, 30, 32, 35, 36]. One study asked participants to reach consensus on the weights [31].
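A minimal sketch of geometric mean aggregation follows. It assumes each respondent supplies a strictly positive weight for every criterion and that the aggregated weights are renormalized to sum to 1; the reviewed studies may have handled these details differently, and the two respondents shown are hypothetical.

```python
import math


def geometric_mean_weights(responses):
    """Aggregate individual weights per criterion by geometric mean, then renormalize.

    Assumes every respondent gives a strictly positive weight for every criterion.
    """
    criteria = list(responses[0].keys())
    n = len(responses)
    gm = {
        c: math.exp(sum(math.log(r[c]) for r in responses) / n)
        for c in criteria
    }
    total = sum(gm.values())
    return {c: value / total for c, value in gm.items()}


# Two hypothetical respondents with different priorities.
responses = [
    {"efficacy": 0.5, "convenience": 0.5},
    {"efficacy": 0.8, "convenience": 0.2},
]
aggregated = geometric_mean_weights(responses)
print({c: round(w, 3) for c, w in aggregated.items()})
```

Whatever the averaging rule, the aggregated weights describe neither respondent exactly, so reporting the spread of individual weights alongside the average preserves information that the average alone discards.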

The variation in weights has prompted reflection on the implications for elicitation methods and sampling strategies. Hummel et al. [32] observed that a single panel may be insufficient to ensure a representative assessment; rather, multiple panels or surveys may be necessary. Dolan [33] elicited responses from patients recruited from a single practice setting and noted that different results may be obtained by asking patients in a variety of practice settings. Hummel et al. [32] also expressed concern about representation, speculating that those volunteering to participate in a panel may be more experienced or knowledgeable about medication and treatment than the average patient.

3.4 Weighting Techniques

Several weighting techniques were used in these studies. Six used the AHP; three direct weighting; two a DCE; and one a rank-ordering approach. These methods were combined with different elicitation modes: direct weighting techniques all used workshops; DCEs and rank-ordering used surveys; and AHPs were split equally between workshops and surveys.

Authors reported concerns about patients’ ability to undertake weighting tasks. Airoldi et al. [28] used direct rating (allocating points to criteria on a 0–100 scale to reflect their relative importance) because alternatives were considered too laborious for participants to understand. Still, they observed that participants found some criteria, which were set by the decision-making board, challenging to understand, such as equity. Sussex et al. [31] justified a similar direct weighting technique in the same way. Along with a third study [36], also using a direct approach (weighting criteria on a 1–5 scale), these authors felt that the cognitive challenges faced by patients may have influenced the results. They also speculated that variation in the weights obtained from participants, while possibly due to diverse perspectives, may also be due to patients misunderstanding the task.

Youngkong et al. [30] used a DCE to assess weights for prioritizing HIV interventions, eliciting responses from patients, community groups, and policy makers. The results of the DCE were presented to groups of participants for discussion. Of the three groups, patients were the only ones for whom the ranking of interventions changed following deliberation. The authors pointed to the cognitive challenges posed by the DCE as a possible source of the changes. They also noted that patients did not share other stakeholders’ positive view of the DCE as representing a systematic approach to priority setting. They suggested that this rejection of the purpose of the exercise—to prioritize interventions—might be another reason for their negative views of the method. During the deliberative exercise, patients gave the same priority to 40 different interventions, arguing that every intervention was important, and requesting more budget be made available to enable all interventions to be adopted.

The only studies that systematically elicited patients’ views of the MCDA method were those that used AHP [33, 34]. Dolan [33] used a survey to obtain responses from 48 patients. Following the elicitation exercise, patients were asked about their ability to complete the required tasks and the value of the method. To the question ‘Did you understand the interview?’ patients largely said yes (mean 4.72 on a scale of 1 = no, did not understand at all to 5 = yes, fully understood). Patients also generally liked the interview (mean 4.85, with 5 = strongly agree); and agreed with the statement ‘Doctors should use [this type of interview] routinely’ (mean 4.81). Dolan et al. [34] asked patients several questions to assess the feasibility of using AHP. A high proportion (92–93% across the five sites in which the study was undertaken) of the 484 participants indicated that it was not hard to understand the criteria; most found it easy to follow the pairwise comparison process (91%), and make the comparisons (85%). The majority (88%) would be willing to use a similar procedure to help make important healthcare decisions. Thus, the authors concluded that it was possible to use AHP to foster patient-centered decision making and that patients are able and willing to perform complex MCDA tasks.

Three authors reported on the consistency of the responses to weighting questions. Of the 650 respondents to the AHP undertaken by Hummel et al. [35], only 167 (26%) met the threshold of a consistency ratio lower than 0.3 (a threshold often adopted as a cut-off for responses to be included in AHP studies). The consistency ratio is a measure of the extent to which a participant’s responses to multiple pairwise comparisons of criteria are logically sound in relation to each other. The authors speculated that the high proportion of responses considered too inconsistent to include could be due to the lengthy questionnaire, causing respondents to complete questions too hastily or to experience fatigue. Dolan et al. [34] also reported the consistency of responses to the weighting questions employed in an AHP. They employed a lower consistency ratio threshold of 0.15, and found that 79% of respondents met this requirement. Goetghebeur et al. [36] observed inconsistencies in responses when weighting exercises were repeated with the same respondents. Only half the weights (50.8%) were identical between test and retest, 39.2% differed by 1 point (on the scale of 1–5), and 10% differed by 2 points. The largest variations were observed for disease-related criteria, suggesting difficulty in deciding on the importance of disease severity. The authors attributed this to either participants wrestling with their values, or discomfort with the process or misinterpretation of the data.
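For readers unfamiliar with the consistency ratio used as an inclusion threshold above, the sketch below derives AHP weights from a pairwise comparison matrix and computes the ratio following Saaty's standard formulation: consistency index (lambda_max - n) / (n - 1), divided by a tabulated random index. The two comparison matrices are hypothetical, and the power iteration used to approximate the principal eigenvector is an implementation choice made here to avoid external libraries; the reviewed studies may have used other software or approximations.

```python
# Saaty's random consistency index for comparison matrices of order n.
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def ahp_weights(matrix, iterations=100):
    """Approximate the principal eigenvector by power iteration, normalized to sum to 1."""
    n = len(matrix)
    w = [1.0 / n] * n
    for _ in range(iterations):
        v = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
        total = sum(v)
        w = [x / total for x in v]
    return w


def consistency_ratio(matrix, weights):
    """Consistency ratio = consistency index / random index (defined here for n = 3..9)."""
    n = len(matrix)
    aw = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / weights[i] for i in range(n)) / n
    consistency_index = (lambda_max - n) / (n - 1)
    return consistency_index / RANDOM_INDEX[n]


# Hypothetical respondent whose judgments are perfectly consistent
# (criterion 1 is twice as important as 2 and six times as important as 3).
consistent = [[1, 2, 6], [1 / 2, 1, 3], [1 / 6, 1 / 3, 1]]

# Hypothetical respondent with circular judgments (1 > 2, 2 > 3, but 3 > 1).
circular = [[1, 3, 1 / 3], [1 / 3, 1, 3], [3, 1 / 3, 1]]

w_ok = ahp_weights(consistent)
cr_ok = consistency_ratio(consistent, w_ok)
w_bad = ahp_weights(circular)
cr_bad = consistency_ratio(circular, w_bad)

print([round(x, 3) for x in w_ok], round(cr_ok, 3))
print([round(x, 3) for x in w_bad], round(cr_bad, 3))
```

The circular respondent would be excluded under either the 0.3 or the 0.15 threshold reported above, while the consistent respondent passes both.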

4 Discussion

The review identified several recent examples of MCDA used to elicit the patient voice. The authors were all positive about the prospects of using MCDA with patients to support reimbursement and investment decisions [28, 30, 31, 36]; to weight outcome measures from clinical studies [35]; and to foster patient-centered decision making [34].

The studies also support a key reason for wanting to obtain patients’ views: they differ from those of the experts responsible for decision making in healthcare. The three papers that reported both patients’ and experts’ weights all found differences, with patients tending to put greater weight on the characteristics of the disease and the convenience of treatments, and experts tending to put more weight on efficacy.

Views vary widely among patients in ways that are difficult to predict. This has three important implications. First, it calls into question the aggregation of results across patients adopted by all the studies reviewed, which either averaged responses or forced participants to consensus. This ignores the heterogeneity: a given treatment may have a positive benefit-risk balance for some patients, but not for others. Just as variation in treatment efficacy increasingly leads clinical trialists to conduct analyses that assess various determinants (e.g. Cox proportional hazards) and to present results for subgroups, so too should differences in patient views be considered in decision making. This would also be consistent with the growing push toward personalized medicine.

Second, as noted by several authors, the heterogeneity casts doubt on the use of a single, small workshop. It has not been established whether multiple workshops or more extensive surveys can solve this problem, or whether methods that specifically address the variation are required.

Third, given the heterogeneity, one should have reservations about the practice of ‘capturing’ the patient voice by incorporating a patient representative onto an expert panel. This approach was adopted by a number of the studies reviewed, and is often the approach adopted by decision-making bodies. For instance, Goetghebeur et al. [36] developed an MCDA to support HTA, and constructed an expert panel in the manner in which HTA bodies might be expected to, comprising: clinical experts, including academics and nurses; an ethicist; a health economist; an epidemiologist; and a representative from a patient group. The weights for the MCDA were estimated by averaging the values provided by the members of the panel. It is questionable whether this approach captures a patient perspective and it is not obvious why the patient’s view is accorded equal importance to that of, for instance, the health economist. Rather, it might be more insightful to run the MCDA separately for different stakeholders. This will allow the MCDA to identify whether differences in stakeholder preferences have implications for the result of the technology assessment. However, the appropriate treatment of preference heterogeneity should be determined with the decision maker on a case-by-case basis.
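The suggestion of running the MCDA separately for different stakeholders can be illustrated with a small sketch. All weights, scores, and option names below are hypothetical and chosen only to show how a preferred option can change with the weight set; they are not drawn from any of the reviewed studies.

```python
def overall_value(scores, weights):
    """Weighted-sum value of one option under one stakeholder group's weights."""
    return sum(weights[c] * scores[c] for c in weights)


# Hypothetical weight sets for two stakeholder groups.
stakeholder_weights = {
    "patients": {"efficacy": 0.3, "convenience": 0.7},
    "experts": {"efficacy": 0.7, "convenience": 0.3},
}

# Hypothetical performance scores on a 0-100 scale.
options = {
    "option_x": {"efficacy": 90, "convenience": 40},
    "option_y": {"efficacy": 60, "convenience": 80},
}

# Run the same model once per stakeholder group and record the top option.
preferred = {}
for group, weights in stakeholder_weights.items():
    values = {name: overall_value(s, weights) for name, s in options.items()}
    preferred[group] = max(values, key=values.get)
    print(group, values, "->", preferred[group])
```

When the preferred option is the same under every weight set, the disagreement between stakeholders has no bearing on the decision; when it differs, as in this sketch, the decision maker can see exactly where the conflict lies rather than having it hidden in an average.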

Youngkong et al. [30] observed another challenge to engaging patients in an MCDA: patients were unwilling to consider the trade-offs required to choose between healthcare interventions, and instead argued that budgets should be increased to allow more interventions to be funded. This challenge could be overcome by eliciting patients’ preferences in the abstract, independently of a specific decision, and then applying these preferences in an MCDA designed to evaluate interventions.

The review underscores the challenges facing patients undertaking elicitation exercises. Authors speculated that some variation in preferences, as well as inconsistencies observed, may result from patients’ difficulties understanding elicitation tasks. It is important to avoid jumping to this conclusion without further research, however, as the two studies that surveyed patients about elicitation tasks [33, 34] suggested that patients were able and willing to provide the required data. Nevertheless, the review points to the importance of considering the cognitive challenges when designing MCDAs. Strategies that can be adopted to improve the quality of patients’ answers include reducing the number of questions posed; and face-to-face elicitation. Good MCDA practice should also involve participant training; piloting elicitation tasks; and validating that the results are consistent with participants’ understanding of the meaning of the scoring and weighting data they provide [37].

It is important to acknowledge the limitations of this review. First, there were only a small number of relevant studies, reducing the ability to identify methodological trends and lessons. While the studies covered several MCDA methods, they did not cover the entire range. Second, this review was restricted to English-language papers; unpublished studies, such as those employed by local decision makers, were not captured; and the choice of search terms may have overlooked relevant applications that did not explicitly use the term MCDA. As a consequence, certain MCDA techniques, such as swing weighting, were not represented in the review.

5 Conclusion

MCDA has the potential to ensure that the patient voice informs decision making by quantifying patient values. This review identified several recent examples of MCDA used to elicit the patient voice. These studies observed different values being expressed by patients and other stakeholders, reinforcing the need for better methods for capturing the patient voice. They also support the feasibility of using MCDA to capture the patient voice, though they point to a number of important challenges that will need further work, including: how best to reflect the heterogeneity of patient values in decision making; and the cognitive burden associated with some of the MCDA tasks.