The placebo problem

The ethical dilemma of placebo controls arises when effective treatments exist and the use of placebo carries some risk or burden in excess of that present in individualized treatment. In a typical phase-III trial, there is potential for direct benefit, since evidence supporting the efficacy of the experimental treatment already exists and the point is to demonstrate that efficacy at the level needed for mass marketing. However, the ethical difficulty of placebos arises because this potential direct benefit to the subjects does not equal or exceed the risks and burdens of entering a clinical trial.

What exactly is the nature of that sacrifice? What value counterbalances it? From the perspective of some proponents of placebo controls in psychiatric research, this sacrifice asked of the subjects with major depression or schizophrenia involves "no important risk" (Ellenberg and Temple 2000) such that "there should be no ethical objection" to the practice (Ellenberg and Temple 2000). This is the basis for the placebo-controlled psychiatric research conducted in the U.S. and in countries that have adopted the International Conference on Harmonization guidelines (Food and Drug Administration 1999)—although not without continuing debate and controversy.

Two broad conceptual frameworks dominate the debate: those that see the problem primarily as a matter of weighing risks/burdens and benefits (Ellenberg and Temple 2000; Temple and Ellenberg 2000; Emanuel and Miller 2001) and those following the World Medical Association (WMA) Declaration of Helsinki (World Medical Association 1997), which place a special ethical emphasis on the act of intentionally not providing a proven treatment (Rothman and Michels 1994; Freedman et al. 1996; Weijer 2000; Huston and Peterson 2001). The first group of authors has tended to favor placebo controls and the second has not, although some have criticized placebo controls even from within a risk–benefit framework (Lavori 2000). A thorough analysis of the differences between the frameworks is beyond the scope and intent of this paper. Instead, this paper's goal is to inform the ongoing debate by examining in depth the currently employed justifications for placebo controls.

This review paper focuses on psychiatric research conducted in the process of seeking a new drug approval, or a treatment indication approval, for the treatment of major depression or schizophrenia. I have set aside clinical trials of special populations, such as treatment-refractory patients, since for such persons no effective treatments exist and the placebo problem may be less acute. I have also set aside drug-free studies conducted with the primary aim of understanding disease processes, an issue that deserves a separate and detailed discussion of its own.

The risks and burdens of placebo controls

While all arguments in favor of placebo controls are based on the framework of risks and benefits, such arguments are not completely utilitarian, since even the most active proponents of placebos acknowledge that placebos are unethical if they would result in increased risk of death or of serious permanent morbidity (Food and Drug Administration 1999; Temple and Ellenberg 2000). Not unexpectedly, a debate is evolving around the issue of where to draw such a line, with a recent article advocating a lower threshold for barring placebos by adding two further categories: "reversible but serious harm" and "severe discomfort" (Emanuel and Miller 2001). Presumably, all four of these conditions are categorical, in the sense that no amount of societal benefit would justify exposing persons to those risks or burdens, when effective treatments exist. What is the relevant evidence for each of these risk categories, as it applies to antidepressant and antipsychotic clinical trials?

Increased mortality?

The main concern regarding placebo-controlled trials in persons with major depression or schizophrenia—conditions with inherently high risk for suicide—is that placebo-exposed subjects may be at higher risk for suicide. Until recently, the relatively infrequent nature of suicides in clinical trials made it difficult to assess this question. We now have analyses of large databases that suggest that, at least for the duration of the clinical trial and a short time beyond, there is no significant increase in suicide or suicide attempts among those receiving placebos in clinical trials testing antidepressants and antipsychotics.

For antidepressants, this has been a consistent finding in two analyses of the Food and Drug Administration (FDA) database (Khan et al. 2000; Laughren 2001). Laughren's analysis of FDA data from the approval process for eight different antidepressants revealed that the suicide rates were 0.10% (22 of 22,062) for those exposed to the investigational drug, 0.13% (7 of 5514) for those exposed to active controls, and 0.02% (2 of 8692) for those exposed to placebos (Laughren 2001). Khan et al., using a subset of the same database, adjusted for duration of exposure (which raises the estimated suicide rate in the placebo group, since that group had more dropouts) and found no significant difference in suicide or suicide attempt rates among the three exposure groups (Khan et al. 2000). These analyses are consistent with a previous industry-sponsored analysis of fluoxetine studies (n=3065, Beasley et al. 1991) as well as a recent analysis of the Medicines Evaluation Board of the Netherlands database (Storosum et al. 2001). In the review of the Netherlands database, eight 'long-term' studies with a median length of 36 weeks (n=1949, 604 of whom were on placebo) showed no difference in rates of suicide or suicide attempts between the placebo and active drug groups (suicides: 0% vs 0.2%; suicide attempts: 0.7% vs 0.7%; Storosum et al. 2001).
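The arithmetic behind these figures can be sketched briefly. The Python snippet below is only an illustration: it recomputes the crude percentages from the counts reported by Laughren (2001) and then shows, with invented person-year figures, why adjusting for duration of exposure raises the estimated rate in a group with shorter follow-up. The function names and the person-year values are assumptions introduced here; they are not taken from either FDA analysis, and the snippet does not reproduce the method of Khan et al.

```python
# Counts (suicides, subjects) as reported in Laughren (2001).
counts = {
    "investigational drug": (22, 22062),
    "active control": (7, 5514),
    "placebo": (2, 8692),
}


def crude_rate_percent(events, subjects):
    """Events per 100 enrolled subjects, ignoring time on treatment."""
    return 100.0 * events / subjects


def rate_per_100_person_years(events, person_years):
    """Exposure-adjusted rate: events per 100 person-years of follow-up."""
    return 100.0 * events / person_years


crude = {group: crude_rate_percent(e, n) for group, (e, n) in counts.items()}
for group, rate in crude.items():
    print(f"{group}: {rate:.2f}%")

# HYPOTHETICAL person-years, for illustration only (not from the source).
# The same number of events over half the follow-up time gives twice the
# exposure-adjusted rate, which is why early dropouts matter.
long_follow_up = rate_per_100_person_years(2, 800.0)
short_follow_up = rate_per_100_person_years(2, 400.0)
print(f"hypothetical adjusted rates: {long_follow_up:.2f} vs {short_follow_up:.2f}")
```

The crude percentages depend only on the number of subjects enrolled, whereas the exposure-adjusted rate depends on how long each group was actually followed; the two can therefore rank the groups differently.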

In terms of antipsychotic trials, using a much smaller FDA database (n=10,118) than for the antidepressant analysis, Khan et al. (2001) concluded that there was no significant difference between the placebo and active drug groups in terms of suicides and suicide attempts. Estimated annual rates of suicide and attempted suicides were 1.8% and 3.3% for placebo, 0.9% and 5.7% for active controls, and 0.7% and 5.0% for the investigational drugs (Khan et al. 2001).

While it is reassuring that the exposure to placebo did not seem to increase suicide rates in comparison with the active drug groups, the overall suicide rates in both the antidepressant trials and in the antipsychotic trials are still at least as high as previous estimates for those populations, if not higher (Khan et al. 2000, 2001). These high rates occurred despite the routine practice of excluding subjects perceived to be at risk for suicide. It is not clear whether such exclusion criteria are simply ineffective screening tools for suicidality, whether being in a placebo-controlled trial per se adds to the risk, or whether suicides and suicide attempts are detected with higher sensitivity during a clinical trial, when compared with events occurring in the community. Lastly, it may be that, because suicide is relatively infrequent, even the current analyses have limited sensitivity despite their large n value.

Permanent serious harm?

There are considerably fewer data on the question of long-term harm from delayed treatment. Since no evidence of harm is different from evidence for no harm, the debate here seems to turn on which assertion carries the burden of proof. If the burden is to show evidence for no harm, it is notable that we currently lack the data for such a claim. Further, as Lavori points out, we do not seek the information needed to support that claim directly, such as collecting follow-up data of subjects after the placebo exposure period (Lavori 2000). However, if the burden of proof is to show evidence of harm, the relative lack of data perhaps raises less concern. Carpenter et al. conducted a careful review of conflicting data regarding the long-term adverse effects of extended drug-free periods in schizophrenia. They concluded that "the hypothesis that drug-free periods have a long-term toxic effect is not yet substantiated" (Carpenter et al. 1997).

Reversible but serious harm or serious discomfort?

Unlike suicides and attempted suicides, it is difficult to quantify psychosocial and symptom-related burdens. What is the best available scientific evidence regarding this issue? One useful exercise is to examine how we currently measure the disability and burdens experienced by persons suffering from major mental illnesses in the context of having to make morally difficult tradeoffs, such as in allocation of limited health care resources. Health state utility units, such as QALYs (quality adjusted life years), reflect both probability and magnitude of burdens and harms (Nord 1999). It is a useful conceptual tool in the risk–benefit analysis of research participation because it is meant to be used in a policy context and since it provides a comparative measure of the psychosocial burdens of various disease states.

In terms of major depression, 70 patients in remission from a major depressive episode were interviewed in one study using a standard gamble technique to score various depression-related states (Revicki et al. 1995). In this technique, the subject is asked to consider two options, A and B. In option A, one would remain in a chronic illness state for t years. In option B, one would have probability p of living for t years in perfect health and probability 1-p of dying immediately. The probability p is varied until the subject is indifferent between options A and B. The final p will be between 0 and 1, and this is the preference weight or utility score of the illness state. For untreated depression, the average utility score was 0.306. That is, on average, the patients were willing to accept a nearly 70% risk of immediate death in order to exchange a depressed health state for a perfect health state. This is a very low utility score, among the lowest measured for any illness state (Kamlet et al. 1995). Remarkably, 1 month of treatment on an effective antidepressant (e.g., fluoxetine) was scored on average as 0.725: when depression is even partially treated, the reduction in burden is impressive. One recent study examining the utility scores for varying degrees of depression noted that the utility for mild depression is akin to that for kidney dialysis, while the utility of moderate depression is below that reported for "being blind, deaf, or dumb" (Bennett et al. 2000).
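The relation between a standard gamble utility score and the risk of death a respondent accepts can be made concrete with a short sketch. The Python snippet below is illustrative only: the two utility scores are those reported above, while the function names, the two-month delay, and the use of the one-month treated score as the comparison state are assumptions introduced here, not part of the cited studies.

```python
def implied_death_risk(utility):
    """Risk of immediate death (1 - p) accepted at the indifference point p."""
    if not 0.0 <= utility <= 1.0:
        raise ValueError("a standard gamble utility must lie between 0 and 1")
    return 1.0 - utility


def qaly_difference(utility_a, utility_b, years):
    """Quality-adjusted life years separating two health states over a period."""
    return (utility_a - utility_b) * years


UNTREATED = 0.306          # average utility, untreated depression (source)
TREATED_ONE_MONTH = 0.725  # average utility after 1 month of treatment (source)

risk = implied_death_risk(UNTREATED)
print(f"accepted risk of immediate death: {risk:.3f}")  # 0.694

# ILLUSTRATIVE ASSUMPTION: a two-month delay in effective treatment,
# valued as if the treated state were otherwise available throughout.
delay_years = 2 / 12
loss = qaly_difference(TREATED_ONE_MONTH, UNTREATED, delay_years)
print(f"QALYs lost over the assumed delay: {loss:.3f}")
```

The second calculation is only a rough device for putting the burden of delayed treatment on a common scale; it says nothing about how any individual subject would fare.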

The very low valuations of the depressed health state hold true whether evaluated from the point of view of depressed patients (Revicki et al. 1995, 1997; Bennett et al. 2000), from the point of view of community samples (Kamlet et al. 1995) or from public health experts (Murray and Lopez 1996), using diverse valuation techniques. The health state of active psychotic illness is also one of the worst health states known. In the World Health Organization's Global Burden of Disease Project, it was rated as a worse health state than major depression and fell in the worst health state category (among seven categories; Murray and Lopez 1996; Chouinard and Albright 1997). As Murray and Lopez point out, these valuations seem to hold even when persons of diverse cultural backgrounds are surveyed (Murray and Lopez 1996).

In short, suffering from a major mental illness is a terrible state; its burden does not come merely from suicide risk. The very state of being depressed or psychotic causes immense suffering. To delay treatment, for example, so that one has to suffer two more months of depression or psychotic exacerbation than necessary is a burden that we as a society should be very reluctant to describe as involving 'no important risk.'

While it might be argued whether such risks and burdens are outweighed by the societal benefits, there should be less disagreement about the kind of informed consent required in placebo-controlled studies when effective treatments exist. Currently, there is what might be called an 'elemental' emphasis in informed consent: the various elements of risks and potential benefits are presented as separate items. It is ethically important to make the overall costs to the subject more explicit. In particular, subjects should be aware that, from a strictly therapeutic point of view, entering the clinical trial is the worse option since they are expected to do significantly worse (i.e., greater chance of suffering from depression longer), even if it turns out that the new drug is at least as effective as the current standard (Charney et al. 2002; Kim and Holloway 2002). Further, they should be told that there is no evidence for long-term harm or safety in entering the trial.

The benefits and limits of the 'better than placebo' standard

A clinical trial compares an experimental intervention and a control intervention. The current practice in psychiatric drug development is to show that the experimental intervention is statistically significantly, though not necessarily to a clinically meaningful degree, more effective than the placebo intervention. The primary benefit of this practice is that it is one way of overcoming the problem of assay sensitivity. In psychiatric research, as in other fields, important methodological issues arise in attempting to show efficacy of a new drug (Ellenberg and Temple 2000; Temple and Ellenberg 2000). In an internal FDA study, all 50 studies used to approve eight new antidepressants between 1987 and 1999 were analyzed (Laughren 2001). All studies were deemed adequately designed. Of the 50 studies, 23 (46%) failed to show a significant advantage over placebo, a disconcerting result given that power considerations alone cannot account for the rate of insensitivity (Laughren 2001). The high placebo response rate is a perennial problem with continuing uncertainty regarding its causes (Posternak et al. 2002). This 'assay sensitivity' problem, at least for antidepressant trials, makes active control equivalence trials (that is, trials using a non-inferiority or "as good as an active control" standard) uninterpretable: if the new drug and an active control have equivalent effects, no conclusion can be drawn about the efficacy of the new drug, because the trial may simply have been unable to detect a difference from placebo (Temple and Ellenberg 2000).

A similar FDA analysis of antipsychotic studies revealed that 4 of 16 studies failed to show superiority over placebo. This 25% failure rate is in the range of being explainable by power considerations alone, raising the possibility that equivalence trials could be a scientifically valid way of testing new antipsychotics (Laughren 2001). The low placebo-response rate in antipsychotic trials is supported by Khan et al.'s analysis of the FDA database; they found that, across studies, the mean symptom reduction as measured by decrease in the brief psychiatric rating scale at 6 weeks was only 1.1% for those exposed to placebo (compared with 16.6–17.3% for active drug groups; Khan et al. 2001).
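One way to see what "explainable by power considerations alone" means is a simple binomial calculation. The Python sketch below is not Laughren's analysis; it assumes, purely for illustration, that every drug tested was truly effective, that the studies were independent, and that each had 80% power, so that each study fails with probability 0.2. The function name and the power value are assumptions introduced here.

```python
from math import comb


def prob_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): chance of k or more failed studies."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


ASSUMED_POWER = 0.80        # ASSUMPTION: not reported in the source
p_fail = 1 - ASSUMED_POWER  # chance a study of an effective drug fails

# Failure counts reported in the FDA analyses (Laughren 2001).
antidepressant_tail = prob_at_least(23, 50, p_fail)  # 23 of 50 failed
antipsychotic_tail = prob_at_least(4, 16, p_fail)    # 4 of 16 failed

print(f"P(>=23 failures of 50): {antidepressant_tail:.6f}")
print(f"P(>=4 failures of 16):  {antipsychotic_tail:.3f}")
```

Under these assumptions, 23 or more failures among 50 studies would be extremely unlikely, whereas 4 or more failures among 16 would occur in roughly four of ten such programs. A different assumed power would change the numbers but, within a plausible range, not the contrast between the two sets of trials.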

Why not use a 'better than active control' standard in developing new drugs? There is a concern that a standard requiring superiority over available treatments for drug approval may be too difficult to achieve, given the remote likelihood that genuine superiority can be shown in such contexts (Laughren 2001). This may result in far fewer drugs being developed and approved for clinical use. Some of the important benefits of the 'better than placebo' standard could be lost. For instance, it is possible that a drug approved on the basis of the 'better than placebo' standard could in the long run turn out to be more efficacious than standard drugs (as with clozapine); or, if not more efficacious, it could turn out to have fewer toxic effects (Carpenter et al. 1997; Lavori 2000; Temple and Ellenberg 2000; Laughren 2001). The increase in adherence rates due to fewer side effects may have important public health benefits.

It is important, however, to place the benefits of the better-than-placebo standard in a historical context. It may have been ethically and scientifically necessary to use placebo controls in the early trials of selective serotonin re-uptake inhibitors when tricyclic antidepressants were the standard. Currently, there are (depending on how one counts them) at least seven distinct classes of antidepressants on the market, all better than placebo but none clearly more effective than another. Thus, it appears that the aims of the better-than-placebo standard have been well achieved. However, it does raise the question of how much more incremental benefit yet another better-than-placebo antidepressant would add to society. This is not to suggest that we abandon the better-than-placebo standard, but it does point out that that standard will have varying benefits, depending on the historical context. Again, in the ethical analysis of placebos, making this contextual variability on the benefit side explicit is important.

In summary, the current standard for new drug approvals involves a value judgment as well as a scientific necessity. It amounts to the claim that the sacrifice of the relatively few is outweighed by the benefits of the current better-than-placebo standard. This may be an acceptable value judgment but it is important to make the nature of such judgments explicit in evaluating the ethics of placebos.

The neglected perspective of research participants

A major weakness in the current arguments in favor of placebo controls is that the perspective of those suffering from the illness (those who bear the burdens of being in the trials) is seldom, if ever, considered. First, the potential subjects' views are not consulted on what level of risk or burden is acceptable. It is worth noting that even the lower risk levels suggested by some writers (Emanuel and Miller 2001) allow a fair degree of sacrifice on the part of subjects since, by implication, the position says that we as a society should tolerate placebo-controlled trials even if we could predictably expect reversible harm that falls just short of 'serious' and discomfort at a level that is just short of 'severe.' Since none of the advocates of various thresholds give an argument beyond an appeal to intuition, it is not clear why those thresholds are the best ones. Specifically, since these thresholds are meant to reflect a socially acceptable standard, it might be useful to scientifically gather community and patient-specific opinions regarding such thresholds.

Second, even if certain risk categories or descriptions are adopted, the question of what counts as falling under those descriptions remains. Fortunately, we are beginning to obtain important data regarding the perspectives of potential research participants regarding risks and benefits of psychiatric research (Roberts et al. 2000). However, we currently have no specific data regarding how much risk/burden is indeed acceptable to subjects when they truly understand the risk–benefit tradeoff in entering a placebo-controlled trial.

One might argue that since subjects give informed consent to participate in these trials, we already know the answer: research participants are indicating their preference based on risk–benefit considerations. However, this position makes an assumption contrary to the considerable data that show that de facto practice of informed consent in clinical trials is fraught with difficulties, not the least of which is the therapeutic misconception [the subject erroneously believes that the intent of the trial is to provide treatment individualized to the subject (Appelbaum et al. 1987; Daugherty et al. 1995)]. As noted above, current informed consent requirements use a laundry list of risks and potential benefits, rather than a direct and frank statement of what the subject is expected to lose by entering the clinical trial. This is important because, as Edwards et al. note in their comprehensive review of studies on participants' motivation for entering clinical trials, self-interested motives are more common than altruistic ones (Edwards et al. 1998).

As recently pointed out by Gelenberg, the stakeholders who make the rules of psychiatric drug development research are not the "silent consumers" (Gelenberg 2002). The International Conference on Harmonization (ICH), for instance, prohibits placebos when effective treatments exist only in cases of increased mortality or permanent morbidity (Food and Drug Administration 1999). It is important to remember that the ICH represents the perspective of pharmaceutical companies and their regulators, both of whom are primarily concerned with the impact of the mass marketing of medications. This is an extremely important ethical perspective, as the marketing of unsafe or ineffective drugs can have wide public health consequences. However, it is also true that this perspective does not always align with one that treats the protection of human subjects in research as the primary issue.

Conclusion: implications of making the tradeoffs explicit

Just as the public's scrutiny of the ethics of research with human subjects is unlikely to abate, the debate over placebos in psychiatric research will likely continue. The placebo issue is highly complex, both ethically and scientifically, and this paper's main goal was to make explicit the burdens and benefits of using placebos in psychiatric research.

It appears that the practice of allowing placebo-controlled trials in psychiatric research is only beginning to be based on data. It is reassuring that, at least for the short term, there does not seem to be increased risk of mortality due to suicide in those exposed to placebo. However, if a policy regarding placebo controls is to be evidence-based in a more robust sense, there are at least three areas for further empirical inquiry. First, we lack sufficient data regarding long-term adverse effects of delayed treatment due to use of placebos in clinical trials because we do not routinely obtain follow-up information. Second, it is unclear whether the tremendous burden of major depression or active psychosis is adequately addressed by the proponents of placebo controls. Currently, we can only speculate based on data borrowed from cost-effectiveness research. We need more direct data for the research ethics context. Finally, since risk–benefit analyses are influenced by the perspective from which they are performed, it is important that we do two things: (1) gather more data from the perspective of those directly affected by placebos, viz., the potential research participants, and (2) take a genuinely societal perspective, rather than a more limited one, when evaluating particular protocols.

From a societal point of view, placebo controls, at least for antidepressant clinical trials, may be a necessity for the foreseeable future. However, if we do not make the moral tradeoffs involved explicit—and seek to gather ethically relevant data—we may not realize when they are no longer needed or no longer acceptable and, therefore, no longer ethical.