INTRODUCTION

The mode of presentation of clinical trial results influences the perception of treatment benefit1–10. Generally, relative risks (e.g., mortality reduction by half) accentuate the perception of benefit compared to absolute risks (e.g., mortality was 5% versus 10%) or numbers needed to treat (e.g., 20 patients need to be treated to avoid one death). Similarly, negative framing (in terms of mortality) leads to greater perception of benefit than positive framing (in terms of survival).
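The arithmetic linking these formats can be made explicit. The following is a minimal sketch, not part of the study itself; the function name `risk_formats` and its argument names are illustrative assumptions. It takes the two mortality risks used in the example above (5% and 10%) and derives each presentation from them.

```python
def risk_formats(risk_control, risk_treated):
    """Express one pair of mortality risks in several presentation formats."""
    # Absolute risk reduction: difference between the two mortality risks.
    arr = risk_control - risk_treated
    return {
        "absolute_mortality": (risk_treated, risk_control),
        "absolute_survival": (1 - risk_treated, 1 - risk_control),
        # Relative risk reduction: share of the control risk that is removed.
        "relative_risk_reduction": arr / risk_control,
        # Number needed to treat: inverse of the absolute risk reduction.
        "number_needed_to_treat": 1 / arr,
    }


# Illustrative figures from the text: mortality of 5% versus 10%.
formats = risk_formats(risk_control=0.10, risk_treated=0.05)
for name, value in formats.items():
    print(name, value)
```

The same underlying data thus read as a mortality reduction by half, as 5% versus 10% mortality, as 95% versus 90% survival, or as 20 patients treated per death avoided.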

The absolute risk format is often recommended because the relative risk format causes misunderstandings11,12, but others have argued against the provision of absolute risks13. Arguably, no single format allows a well-informed opinion, which would require careful consideration of all sides of the argument14. Only a few recent studies have included a “fully informed” condition in their design, with variable results15–18. These studies enrolled various volunteer samples, not patients or doctors, who are directly concerned by treatment decisions. Thus, the available evidence remains inconclusive.

Another unresolved question is whether patients and doctors are affected similarly by framing formats. A meta-analysis has suggested that both doctors and patients overrate treatment benefits when presented with relative risks4, but no study has probed doctors and patients using the same instruments. The doctors’ greater familiarity with the assessment of medical research findings, higher numeracy levels, and greater knowledge of the relevant clinical context may make them less susceptible to framing effects.

The objectives of this study were to determine which single framing format best approximates the comprehensive information format in terms of the perception of treatment benefit, and to compare the impact of risk framing formats on doctors and on patients.

METHODS

We conducted two randomized mail surveys that included a hypothetical scenario, one among doctors and one among patients recently discharged from hospital. The doctor survey primarily explored doctors’ opinions about policy issues and was approved by the Research Ethics Committee of the University Hospitals of Geneva19,20; the patient survey, primarily a patient satisfaction survey, was exempted from full review21,22.

The doctor survey included all active clinicians in canton Geneva, Switzerland, and was conducted between November 2007 and February 2008. Doctors were identified through the registries of the Geneva Medical Association and University Hospitals of Geneva. Duplicate records, invalid addresses and doctors who did not work with patients were excluded. This left 2746 eligible doctors. The survey response rate was 56.3% (1546/2746). Participation was not related to age, setting of practice or source database, but differed by sex (58.0% in men vs. 53.7% in women, p = 0.027) and specialty (from 52.6% in technical specialists to 62.2% in primary care doctors, p = 0.003).

The patient survey took place between November 2005 and February 2006. It included all adult patients discharged to their home during a one-month period. No clinical data were available, nor any information about the relevance of the study scenario for the patient. We excluded patients who were transferred to another facility, did not live in Switzerland, or reported not speaking French or being incapable of completing a questionnaire. The core of the questionnaire was the Picker patient opinion survey. The response rate was 65.0% (1432/2204).

Scenario

The scenario described a hypothetical clinical trial in which a new treatment provided a survival benefit over the old treatment, but caused more digestive side-effects (Box). Four basic risk formats were used in both surveys: 1) survival proportions, 2) mortality proportions, 3) relative mortality reduction, and 4) all three presentations of risk. The last of these was considered to be the fully informed condition. The respondent was asked how the new treatment compared with the old treatment.

We tested two additional risk formats in doctors only: 5) the number needed to treat, and 6) the relative survival extension. These formats did not work well with patients during pretests, and we abandoned their use in this sample. Furthermore, we called the viral disease HIV infection in the doctor scenario, but not in the patient scenario, to avoid singling out a specific patient group. The study was called a “multicenter clinical trial” in the doctor version, but just “study” in the patient version. We mentioned statistical significance in the doctor version because that question arose during pretests, but not in the patient version.

Presentations of treatment benefits as absolute survival, absolute mortality, relative mortality reduction, and number needed to treat do not require elaboration. Computation of the relative survival extension was based on the assumption of a constant mortality rate (i.e., exponential model). Under this assumption the expectation of survival time equals the inverse of the mortality rate23; thus if the mortality rate with the new drug is 2/3 of the mortality rate with the old drug, the expected survival on the new drug will be 3/2 the expected survival on the old drug, i.e., an increase of 50%. Of note, in the other versions of the scenario, we reported risks rather than rates, but because risks were low, the ratios of hazards and risks were practically equivalent (an exact computation yields a survival extension of 51.2% for this case; we used the rounded figure of 50% for simplicity).
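The exponential-model reasoning above can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' own computation: the function names are ours, and the risks of 4% and 6% are taken from the scenario figures mentioned in the Discussion. The exact result obtained this way may differ slightly from the figure quoted above, depending on rounding and on details of the calculation that are not given in the text.

```python
import math


def relative_survival_extension(hazard_ratio):
    """Relative gain in expected survival under a constant hazard.

    With an exponential model, expected survival equals 1 / hazard, so the
    ratio of expected survival times is the inverse of the hazard ratio.
    """
    return 1 / hazard_ratio - 1


def hazard_ratio_from_risks(risk_new, risk_old):
    """Ratio of constant hazards implied by two cumulative risks.

    Under a constant hazard h, risk over follow-up t is 1 - exp(-h * t),
    so h * t = -log(1 - risk); the common follow-up t cancels in the ratio.
    """
    return math.log(1 - risk_new) / math.log(1 - risk_old)


# Approximation used in the scenario: hazard ratio taken as the risk ratio 2/3.
approx = relative_survival_extension(2 / 3)

# Computation from the risks themselves (assumed here: 4% versus 6%).
exact = relative_survival_extension(hazard_ratio_from_risks(0.04, 0.06))

print(f"approximate extension: {approx:.1%}")
print(f"extension from risks:  {exact:.1%}")
```

Because the risks are low, the two values are close, which is why the rounded 50% is a reasonable simplification.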

Sample Size Determination

We sought to detect a difference in positive assessments of 60% versus 75%. With a type 1 error at 5% and a desired power of 90%, 220 observations per group were necessary.
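The order of magnitude of this figure can be checked with a standard normal-approximation formula for comparing two proportions. The sketch below is an assumption on our part: the text does not state which formula or software was used, nor whether the test was two-sided (assumed here) or continuity-corrected, and the function name `n_per_group` is illustrative.

```python
import math
from statistics import NormalDist


def n_per_group(p1, p2, alpha=0.05, power=0.90, continuity=False):
    """Approximate sample size per group for comparing two proportions.

    Uses the usual normal-approximation formula for a two-sided test,
    optionally followed by a continuity correction.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    delta = abs(p1 - p2)
    n = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2 / delta ** 2
    if continuity:
        # Continuity correction applied to the uncorrected sample size.
        n = n / 4 * (1 + math.sqrt(1 + 4 / (n * delta))) ** 2
    return math.ceil(n)


plain = n_per_group(0.60, 0.75)
corrected = n_per_group(0.60, 0.75, continuity=True)
print(plain, corrected)
```

Both variants give a little over 200 observations per group, the same order of magnitude as the 220 stated above.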

Analysis

We compared the distributions of the 5-level assessment across versions of the scenario using a chi-square test for linear trend, separately in patients and in doctors. For most analyses, we dichotomized the assessment as favorable (“much better” or “somewhat better”) versus other, and used chi-square tests to compare versions of the scenario, and doctors to patients within each version. Multivariate modeling was conducted with logistic regression. The reference level was the comprehensive information condition, and differences between this and other risk frames were interpreted as framing bias. The models were replicated using ordinal logistic regression with the original 5-level assessment as dependent variable, but because the results were virtually identical, we report binary logistic regression results. To compare the effects of risk framing across subgroups we used interaction terms; e.g., to compare men and women, the model predictors included risk formats, sex, and the sex*formats interaction.

RESULTS

Of the 1546 doctors who returned the questionnaire, 107 (6.9%) left the scenario blank. Non-response was similar in men and women (6.5% vs. 7.6%, p = 0.38), but was higher in older doctors (up to 35 years: 3.3%, 36–50 years: 6.4%, 51 and over: 9.2%, p-value for trend = 0.001), and in psychiatrists compared to other specialties (12.6% vs. 5.6%, p < 0.001). The doctor respondents included a majority of men (Table 1), and their mean age was 47.0 years (standard deviation 11.6). The majority were in private practice. All clinical specialties were represented.

Table 1 Characteristics of Doctors and Patients Who Completed the Risk Assessment Scenario (Totals May Not Add up to Total Due to Missing Values)

Non-completion of the scenario was higher among the 1432 patient respondents (311, 21.7%). Non-response was similar in men and women (20.7% vs. 17.9%, p = 0.20), but was higher in older patients (up to 35 years: 10.2%, 36–50 years: 16.0%, 51 and over: 28.9%, p-value for trend < 0.001), patients who had less than high school education (23.1%) versus the more educated (12.0%, p < 0.001), and in patients who reported poor to good health (21.4%) compared to those in excellent or very good health (11.4%, p < 0.001). The majority of patient participants were women (Table 1), and their mean age was 51.0 years (standard deviation 18.7). Slightly more than half had only basic education; about a third reported very good or excellent health.

Scenario Assessments

Most doctors considered the new treatment to be better than the old treatment, but the proportions of positive assessments varied considerably across formats (Table 2). For example, the proportion that rated the new treatment “clearly better” exceeded 60% for relative formats (relative mortality reduction, relative survival extension), but was less than 10% for absolute survival. The differences between the six formats were statistically significant (p < 0.001). In comparison to the fully informed format, the distribution of responses was similar for the absolute mortality format (p = 0.42), but differed significantly for all others (p = 0.009 for the NNT format, and p < 0.001 for the others).

Table 2 Distributions of Perceptions of Benefit of a New Treatment, by Risk Presentation Format, Among 1439 Doctors and 1121 Patients Recently Discharged from Hospital, in Geneva, Switzerland

The pattern of responses was similar for patients, but the contrasts appeared somewhat more attenuated, and more negative opinions were expressed (Table 2). The differences between formats were statistically significant overall (p < 0.001), and in comparison to the fully informed condition, the difference was non-significant for absolute mortality (p = 0.80), but significant for absolute survival and for relative mortality reduction (both p < 0.001).

Collapsing the two positive categories (clearly or somewhat better) yielded results that were almost indistinguishable for doctors and patients (Table 3). About 50% of respondents considered the new treatment better when absolute survival was shown, about 70% did when comprehensive information or absolute mortality was shown, and about 90% did when relative mortality reduction or relative survival extension was shown. None of the comparisons of doctors with patients was significant.

Table 3 Proportions of Doctors and Patients with a Positive Perception of the Benefits of a New Treatment (Clearly Better or Somewhat Better)

The logistic regression model confirmed these results (Table 4). Compared to the fully informed condition, the odds of a favorable assessment of the new drug were about fourfold higher for relative mortality reduction and relative survival extension, increased to a lesser degree for the number needed to treat, similar for absolute mortality, and reduced more than twofold for absolute survival. The difference between doctors and patients was small and non-significant.
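These odds ratios are consistent with the approximate proportions reported above. As a rough check only, the sketch below converts the rounded proportions of favorable assessments (about 50%, 70%, and 90%) into crude odds ratios against the fully informed condition. These are unadjusted illustrations, not the model estimates of Table 4, and the function names are ours.

```python
def odds(p):
    """Odds corresponding to a proportion p."""
    return p / (1 - p)


def odds_ratio(p, p_ref):
    """Crude odds ratio of proportion p relative to a reference proportion."""
    return odds(p) / odds(p_ref)


# Rounded proportions of favorable assessments quoted in the text.
reference = 0.70   # comprehensive information (and absolute mortality)
or_relative = odds_ratio(0.90, reference)   # relative formats
or_survival = odds_ratio(0.50, reference)   # absolute survival

print(f"relative formats:  OR = {or_relative:.2f}")
print(f"absolute survival: OR = {or_survival:.2f} (1/OR = {1 / or_survival:.2f})")
```

A shift from about 70% to about 90% favorable corresponds to odds nearly four times higher, and a shift from about 70% to about 50% to odds more than two times lower.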

Table 4 Odds Ratios for a Positive Perception of the Benefits of a New Treatment

Subgroup Analyses

The only difference that we anticipated (attenuated framing effects in doctors compared with patients) was not borne out by the data (p value of interaction test = 0.40). Among doctors, the impact of risk framing was similar across subgroups: interaction terms between format and sex (p = 0.78), age group (p = 0.50), site of practice (p = 0.84), and specialty (p = 0.80) were all non-significant. Similarly, among patients, no significant interactions were found between risk framing and sex (p = 0.82), age groups (p = 0.35), education level (p = 0.85), country of birth (p = 0.17), health status (p = 0.71), and hospital department (p = 0.91).

DISCUSSION

Risk Formats

Our study confirms that the risk information format influences perceptions of treatment benefit: absolute survival led to the perception of weakest benefit, and relative mortality reduction led to the perception of greatest benefit. To this we add two novel findings: comprehensive information that combined absolute mortality, absolute survival, and relative mortality reduction produced a similar perception of benefits to the presentation of absolute mortality, and the susceptibility to framing bias was similar in doctors and patients.

If respondent perceptions following comprehensive information are used as a yardstick, absolute risks constitute the least biased risk format. In contrast, presentation of relative risk reductions in isolation caused an optimistic bias, with a more than fourfold increase in the odds of a positive assessment of the new treatment. Presentation of absolute survival proportions caused a pessimistic bias, with a more than twofold decrease in the odds of a favorable assessment. The differences we observed between these risk formats are consistent with the results of a recent meta-analysis on the subject4. The novel aspect contributed by this study is the inclusion of a fully informed condition, which aimed to capture the “true” values and preferences of the respondents. Indeed, on face value, all perceptions of treatment benefit are equally valid. As long as the data on which these perceptions are based are correct (and a relative risk is as technically correct as two survival proportions), it cannot be said that one perception is right and the other wrong. Arguably, the “right” answer is one that the respondent arrived at after careful consideration of all arguments, preferably after sufficient reflection time, and after clarification of any questions that may have occurred. This is the ideal that we attempted to approach, however imperfectly, through the “comprehensive information” condition.

A few previous studies have compared isolated risk frames to a fully informed condition. Stovring et al. followed an initial presentation of treatment benefit by a comprehensive description; initial presentation as the absolute risk reduction of a heart attack led to fewer changes in the decision to take the drug (6%) than if the initial presentation was a relative risk reduction (9%), but this difference was not statistically significant15. Carling et al. reported that framing the risks of cardiovascular events initially as a rate (events per year) resulted in fewer changes in decisions (19.1%) than describing a cumulative risk over 10 years (28.2%) or disease-free survival over 10 years (23.8%)16. Armstrong et al. reported that showing survival and mortality curves together led to similar decisions regarding a preventive colectomy as showing survival curves alone, but not mortality curves alone17. Peters et al. observed that a positive frame (absence of an adverse drug reaction) led to lower perceptions of risk than a negative frame (risk of adverse event), with the combined frame presentation falling in-between18. These studies have shown weak and somewhat inconsistent contrasts, while we have observed strong and statistically significant differences. This raises the issue of contextual factors which may influence study results, such as the type of scenario, the exact question asked, or the type of study population. All four aforementioned studies were conducted among various volunteer samples from the general population, while we surveyed patients and doctors.

Doctors Versus Patients

The second novel finding of this study is that doctors are just as prone to framing bias as patients. The results of doctors and patients were remarkably similar when the perceptions were dichotomized as positive versus neutral or negative. This was unexpected: we thought that doctors would be more sophisticated than patients in interpreting the scenario, less likely to be convinced by a relative mortality reduction with no absolute risk to anchor the comparison, and more apt to deduce the proportion of patients who died when the proportion who survived was given. This illustrates the difficulty that many doctors have in applying quantitative analysis skills in their practice. Previous studies have shown that doctors’ understanding of various terms used in medical literature, such as relative risk, absolute risk, or the number needed to treat, differs considerably from an objective, criterion-based assessment24. Similarly, most doctors misunderstand numerical data regarding test accuracy, regardless of whether they are presented as sensitivity and specificity or likelihood ratios25, and fail to use relevant numerical information, such as disease prevalence, when they interpret the results of diagnostic tests20,26. Were doctors less prone to framing bias than patients, it would be possible to simply warn doctors of the patients’ limited capacity to interpret numerical risk data, and suggest various communication aids to improve patient understanding. As it is, doctors themselves may not be aware of the problem. More systematic reporting of absolute risks may be advisable in medical research reports and other original sources of medical information used by doctors. In this, our findings support the recent addition to the CONSORT statement that “For binary outcomes, presentation of both relative and absolute effect sizes is recommended”27.

The comparability of framing effects in doctors and patients and the lack of any meaningful subgroup difference both suggest that framing biases are widespread, if not universal. However, we were not able to examine the personal context of the respondent, notably the direct relevance of the treatment under consideration. It is possible that study results are examined more carefully by potential users of a treatment, and that framing biases may be less prominent in such a situation.

Strengths and Limitations

We have used a robust design—randomized scenario-based trial—and applied it to large samples of patients and doctors, who are directly concerned by risk framing and treatment decisions. All scenarios described identical study results, so that only framing of information could explain the differences between responses.

The main limitation of the study is that only one situation was depicted. We do not know what would have happened had we described risks of an event other than death, if the risks had been 20% and 30% instead of 4% and 6%, if the relative mortality reduction had been one half or one tenth instead of one third, etc. Nevertheless, the intent of this study was not to establish some universally valid constants (we do not believe that they exist), but rather to identify the format that best approaches the fully informed condition. We believe that the general answers that we obtained (absolute risks are the least biased, doctors are as sensitive to framing bias as patients) have a more general validity than any specific percentages obtained in this study.

Another limitation is that we do not know whether our “comprehensive information” condition achieved its purpose for all respondents. In real life, an optimal information process would allow for additional questions, further explanations, and sufficient time to reflect on the information, all things that could not be offered via a scenario. Therefore replication of our findings in clinical studies would be useful. A clinical study would also confirm whether our results hold for actual decisions, which is always a concern for simulation studies. In this case, we would argue that the real-life and simulated situations are similar, as both entail reading numerical results of a study and forming an opinion about the new drug. Finally, a clinical study would allow an exploration of the link between perceptions of benefit and actual decisions to prescribe, recommend, or agree to a new treatment.

As for most surveys, selection bias is a concern. Skipping the scenario was common among patients, particularly among older and less educated patients. This suggests that despite pretests, understanding and answering the scenario was not trivial. Whether this reflects a general difficulty inherent in interpreting risk data, or specific features of our scenario, cannot be determined. Finally, participation was average in both surveys (56% and 65%), which can also cause selection bias.

CONCLUSIONS

Presentation of study results in terms of absolute mortality risks leads to the least biased perceptions of the benefit of a new treatment, when a combination of three risk formats is used as the reference. In comparison, relative presentations of benefit cause an optimistic bias, and survival proportions induce a pessimistic bias. Doctors are just as prone to these biases as patients.