1 The Emperor’s New Drugs: Medication and Placebo in the Treatment of Depression

On February 26, 2008, an article about antidepressants that my colleagues and I wrote was published in the journal PLoS Medicine (Kirsch et al. 2008). That morning, I awoke to find that our paper was the front-page story in all of the leading national newspapers in the United Kingdom. A few months later, Random House invited me to expand the article into a book, entitled The Emperor’s New Drugs: Exploding the Antidepressant Myth, which has since been translated into French, Italian, Japanese, Polish, and Turkish (Kirsch 2009). Two years later, the book and the research reported in it were the topic of a five-page cover story in the influential American news magazine Newsweek. And 2 years after that, it was the focus of a 15-minute segment on 60 Minutes, America’s top-rated television news program. Somehow, I had been transformed from a mild-mannered university professor into a media superhero, or supervillain, depending on whom you asked. What had my colleagues and I done to warrant this transformation?

To answer that question, we have to go back to 1998, when a former graduate student, Guy Sapirstein, and I published a meta-analysis on antidepressants in an online journal of the American Psychological Association (Kirsch and Sapirstein 1998). When they were new, meta-analyses were somewhat controversial and our article was accompanied by an editorial warning to that effect—not unlike the suicide warning that the U.S. Food and Drug Administration (FDA) requires for antidepressants. But now meta-analyses are published in all of the major medical journals, where they are widely considered to be the best and most reliable way of making sense of the data from studies with different and sometimes conflicting results.

When Sapirstein and I began our analysis of the antidepressant clinical trial data, we were not particularly interested in antidepressants. Instead, we were interested in understanding the placebo effect. I have been fascinated by the placebo effect for my entire academic career. How is it, I wondered, that the belief that one has taken a medication can produce some of the effects of that medication?

It seemed to Sapirstein and me that depression was a good place to look for placebo effects. After all, one of the prime characteristics of depression is the sense of hopelessness that depressed people feel. If you ask depressed people to tell you what the worst thing in their life is, many will tell you that it is their depression. The British psychologist John Teasdale called this being depressed about depression (Teasdale 1985). If that is the case, then the mere promise of an effective treatment should help to alleviate depression, by replacing hopelessness with hopefulness—the hope that one will recover after all. It was with this in mind that we set out to measure the placebo effect in depression.

We searched the literature for studies in which depressed patients had been randomized to receive an inert placebo or no treatment at all. The studies we found also included data on the response to antidepressants, because antidepressant trials were the only place one could find data on the response to placebo among depressed patients. I was not particularly interested in the drug effect. I assumed that antidepressants were effective. As a psychotherapist, I sometimes referred my severely depressed clients for prescriptions of antidepressant drugs. Sometimes the condition of my clients improved when they began taking antidepressants; sometimes it did not. When it did, I assumed it was the effect of the drug that was making them better. Given my long-standing interest in the placebo effect, I should have known better, but back then I did not.

Analyzing the data we had found, we were not surprised to find a substantial placebo effect on depression. What surprised us was how small the drug effect was. Seventy-five percent of the improvement in the drug group also occurred when people were given dummy pills with no active ingredient in them. Needless to say, our meta-analysis proved to be very controversial. Its publication led to heated exchanges (e.g., Klein 1998; Beutler 1998; Kirsch 1998). The response from critics was that these data could not be accurate. Perhaps our search had led us to analyze an unrepresentative subset of clinical trials. Antidepressants had been evaluated in many trials, the critics said, and their effectiveness had been well established.

In an effort to respond to these critics, we decided to replicate our study with a different set of clinical trials (Kirsch et al. 2002). To do this, we used the Freedom of Information Act to request that the FDA send us the data that pharmaceutical companies had submitted to it in the process of obtaining approval for six new-generation antidepressants that accounted for the bulk of antidepressant prescriptions being written at the time. There are a number of advantages to the FDA dataset. Most important, the FDA requires that the pharmaceutical companies provide information on all of the clinical trials that they have sponsored. Thus, we had data on unpublished trials as well as published trials. This turned out to be very important. Almost half of the clinical trials sponsored by the drug companies have not been published (Turner et al. 2008; Melander et al. 2003). The results of the unpublished trials were known only to the drug companies and the FDA, and most of them failed to find a significant benefit of drug over placebo. A second advantage of the FDA dataset is that the trials all used the same primary measure of depression, the Hamilton depression scale (HAM-D). That made it easy to understand the clinical significance of the drug–placebo differences. Finally, the data in the FDA files were the basis upon which the medications were approved. In that sense they have a privileged status. If there is anything wrong with those trials, the medications should not have been approved in the first place.

In the data sent to us by the FDA, only 43 % of the trials showed a statistically significant benefit of drug over placebo. The remaining 57 % were failed or negative trials. Similar results have been reported in other meta-analyses (Turner et al. 2008), including one conducted by the FDA on the clinical trials of all antidepressants that it had approved between 1983 and 2008 (Khin et al. 2011). The results of our analysis indicated that the placebo response was 82 % of the response to these antidepressants. Subsequently, my colleagues and I replicated our meta-analysis on a larger number of trials that had been submitted to the FDA (Kirsch et al. 2008). With this expanded dataset, we once again found that 82 % of the drug response was duplicated by placebo, with an effect size (d) of 0.32. More important, in both analyses, the mean difference between drug and placebo was less than two points on the HAM-D. The HAM-D is a 17-item scale on which people can score from 0 to 53 points, depending on how depressed they are. A 6-point difference can be obtained just by changes in sleep patterns, with no change in any other symptom of depression. So the 1.8-point difference that we found between drug and placebo was very small indeed, small enough to be clinically insignificant. But you don’t have to take my word for how small this difference is. The National Institute for Health and Clinical Excellence (NICE), which drafts treatment guidelines for the National Health Service in the United Kingdom, has established a drug–placebo effect size (d) of 0.50 or a 3-point difference between drug and placebo on the HAM-D as criteria of clinical significance (NICE 2004). Thus, when published and unpublished data are combined, they fail to show a clinically significant advantage for antidepressant medication over inert placebo.

I should mention here the difference between statistical significance and clinical significance. Statistical significance concerns how reliable an effect is. Is it a real effect, or is it just due to chance? Statistical significance does not tell you anything about the size of the effect. Clinical significance, on the other hand, deals with the size of an effect and whether it would make any difference in a person’s life. Imagine, for example, that a study of 500,000 people has shown that smiling increases life expectancy by 5 minutes. With 500,000 subjects, I can virtually guarantee you that this difference will be statistically significant, but it is clinically meaningless.
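The distinction can be made concrete with a small sketch. The numbers here are hypothetical and not taken from any trial: a fixed between-group difference of 0.1 points on a depression scale, an assumed standard deviation of 8 points, and equal group sizes. The function names are illustrative. Under those assumptions, a simple two-sample z-test shows the p-value shrinking as the sample grows while the effect size stays the same.

```python
import math

def two_sample_z(mean_diff, sd, n_per_group):
    """Return (z, two-sided p) for a difference in means.

    Assumes equal group sizes, a common known SD, and a normal
    approximation. All inputs are hypothetical illustration values.
    """
    se = sd * math.sqrt(2.0 / n_per_group)   # standard error of the difference
    z = mean_diff / se
    p = math.erfc(abs(z) / math.sqrt(2.0))   # two-sided normal p-value
    return z, p

def cohens_d(mean_diff, sd):
    """Standardized effect size: mean difference in SD units."""
    return mean_diff / sd

if __name__ == "__main__":
    diff, sd = 0.1, 8.0                      # hypothetical values
    for n in (100, 10_000, 250_000):
        z, p = two_sample_z(diff, sd, n)
        print(f"n per group={n:>7}  d={cohens_d(diff, sd):.4f}  z={z:.2f}  p={p:.2g}")
```

The effect size is identical in every row; only the p-value changes with sample size, which is why a statistically significant result says nothing by itself about clinical importance.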

The results of our analyses have since been replicated repeatedly (Turner et al. 2008; NICE 2004; Fournier et al. 2010; Fountoulakis and Möller 2011). Some of the replications used our data; others analyzed different sets of clinical trials. The FDA even did its own meta-analysis on all of the antidepressants that it has approved (Khin et al. 2011). But despite differences in the way the data have been spun, the numbers are remarkably consistent (Table 1). Differences on the HAM-D are small, always below the criterion set by NICE. Thomas P. Laughren, the director of the FDA’s psychiatry products division, acknowledged this on the American television news program 60 Minutes. He said, “I think we all agree that the changes that you see in the short term trials, the difference in improvement between drug and placebo, is rather small.”

Table 1 Drug–placebo effect sizes and HAM-D difference scores in meta-analyses of antidepressant trials

And it is not only the short-term trials that show a small, clinically insignificant difference between drug and placebo. In their meta-analysis of published clinical trials, NICE (2004) found that the differences between drug and placebo in the long-term trials were no larger than those in short-term trials.

2 Severity of Depression and Antidepressant Effectiveness

Critics of our 2002 meta-analysis argued that our results were based on clinical trials conducted on subjects who were not very depressed (e.g., Thase 2002; Hollon et al. 2002). In more depressed patients, they argued, a more substantial difference might be found. This criticism led my colleagues and me to reanalyze the FDA data in 2008 (Kirsch et al. 2008). We categorized the clinical trials in the FDA database according to the severity of the patients’ depression at the beginning of the trial, using conventional categories of depression. As it turns out, only one of the trials was conducted on moderately depressed patients, and that trial failed to show any significant difference between drug and placebo. Indeed, the difference was virtually nil (0.07 points on the HAM-D). All of the rest of the trials were conducted on patients whose mean baseline scores put them in the “very severe” category of depression, and even among these patients, the drug–placebo difference was below the level of clinical significance.

Still, severity did make a difference. Patients at the very extreme end of depression severity, those scoring at least 28 on the HAM-D, showed an average drug–placebo difference of 4.36 points. To find out how many patients fell within this extremely depressed group, I asked Mark Zimmerman from the Brown University School of Medicine to send me the raw data from a study in which he and his colleagues assessed HAM-D scores of patients who had been diagnosed with unipolar major depressive disorder (MDD) after presenting for an intake at a psychiatric outpatient practice (Zimmerman et al. 2005). Patients with HAM-D scores of 28 or above represented 11 % of these patients. This suggests that 89 % of depressed patients are not receiving a clinically significant benefit from the antidepressants that are prescribed for them.

Yet this 11 % figure may overestimate the number of people who benefit from antidepressants. Antidepressants are also prescribed to people who do not qualify for the diagnosis of major depression. My neighbor’s pet dog died; his physician prescribed an antidepressant. A friend in the United States was diagnosed with lumbar muscle spasms and was prescribed an antidepressant. I have lost count of the number of people who have told me they were prescribed antidepressants when complaining of insomnia—even though insomnia is a frequently reported side effect of antidepressants. About 20 % of patients suffering from insomnia in the United States are given antidepressants as a treatment by their primary care physicians (Simon and VonKorff 1997), despite the fact that “the popularity of antidepressants in the treatment of insomnia is not supported by a large amount of convincing data, but rather by opinions and beliefs of the prescribing physicians” (Wiegand 2008, p. 2411).

3 Predicting Response to Treatment

Severity of depression is one of the few predictors of response to treatment. Type of antidepressant has little if any impact on treatment response. As summarized in a 2011 meta-analysis of studies comparing one antidepressant to another:

On the basis of 234 studies, no clinically relevant differences in efficacy or effectiveness were detected for the treatment of acute, continuation, and maintenance phases of MDD. No differences in efficacy were seen in patients with accompanying symptoms or in subgroups based on age, sex, ethnicity, or comorbid conditions… Current evidence does not warrant recommending a particular second-generation antidepressant on the basis of differences in efficacy. (Gartlehner et al. 2011, p. 772)

Although type of medication does not make a clinically significant difference in outcome, response to placebo does. Almost all antidepressant trials include a placebo run-in phase. Before the trial begins, all of the patients are given a placebo for a week or two. After this run-in period, the patients are reassessed, and anyone who has improved substantially is excluded from the trial. That leaves patients who have not benefited at all from placebo and those who have benefited only a little bit. These are the patients who are randomized to be given drug or kept on placebo. As it turns out, the patients who show at least a little improvement during the run-in period are the ones most likely to respond to the real drug, as shown not only by physician ratings, but also by changes in brain function (Hunter et al. 2006; Quitkin et al. 1998).

4 How Did These Drugs Get Approved?

How is it that medications with such weak efficacy data were approved by the FDA? The answer lies in an understanding of the approval criteria used by the FDA. The FDA requires two adequately conducted clinical trials showing a significant difference between drug and placebo. But there is a loophole: There is no limit to the number of trials that can be conducted in search of these two significant trials. Trials showing negative results simply do not count. Furthermore, the clinical significance of the findings is not considered. All that matters is that the results are statistically significant.
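The consequence of this loophole can be sketched with a simple binomial calculation. It assumes, purely for illustration, that trials are independent and that each has the same chance of reaching statistical significance; the 43 % figure used below is the share of significant trials in the FDA dataset described earlier, borrowed here as a hypothetical per-trial probability. The function name is illustrative.

```python
def prob_at_least_two_significant(n_trials, p_significant):
    """Probability that at least two of n independent trials are significant.

    Assumes every trial has the same probability of a significant result,
    which is a simplification made for illustration only.
    """
    if n_trials < 2:
        return 0.0
    p_none = (1.0 - p_significant) ** n_trials
    p_exactly_one = n_trials * p_significant * (1.0 - p_significant) ** (n_trials - 1)
    return 1.0 - p_none - p_exactly_one

if __name__ == "__main__":
    ILLUSTRATIVE_P = 0.43  # hypothetical per-trial chance of significance
    for n in (2, 4, 7, 10):
        prob = prob_at_least_two_significant(n, ILLUSTRATIVE_P)
        print(f"{n:>2} trials: P(at least two significant) = {prob:.2f}")
```

Under these assumptions the chance of collecting two significant trials rises steadily with the number of trials run, which is the point of the loophole: with no cap on attempts, the two-trial requirement becomes progressively easier to meet.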

The most egregious example of the implementation of this criterion is provided by the FDA’s approval of vilazodone in 2011 (http://www.accessdata.fda.gov/drugsatfda_docs/nda/2011/022567Orig1s000StatR.pdf). Seven controlled efficacy trials were conducted. The first five failed to show any significant differences on any measure of depression, and the mean drug–placebo difference in these studies was less than ½ point on the HAM-D. In two of the three trials, the direction of the difference actually favored the placebo. The company ran two more studies and managed to obtain small but significant drug–placebo differences (1.70 points). The mean drug–placebo difference across the seven studies was 1.01 HAM-D points. This was sufficient for the FDA to grant approval, and the information approved by the FDA for informing doctors and patients reads, “The efficacy of VIIBRYD was established in two 8-week, randomized, double-blind, placebo-controlled trials.” No mention is made of the five failed trials that preceded the two successful ones.

The failure to mention the unsuccessful trials was not merely an oversight; it reflects a carefully decided FDA policy dating back for decades. To my knowledge, there is only one antidepressant in which the FDA included information on the existence of negative trials. The exception is citalopram, and the inclusion of the information followed an objection raised by Paul Leber, who was at the time the director of the FDA Division of Neuropharmacological Drug Products. In an internal memo dated May 4, 1998, Leber wrote:

One aspect of the labelling deserves special mention. The [report] not only describes the clinical trials providing evidence of citalopram’s antidepressant effects, but make mention of adequate and well controlled clinical studies that failed to do so…The Office Director is inclined toward the view that the provision of such information is of no practical value to either the patient or prescriber. I disagree. I believe it is useful for the prescriber, patient, and 3rd-party payer to know, without having to gain access to official FDA review documents, that citalopram’s antidepressants effects were not detected in every controlled clinical trial intended to demonstrate those effects. I am aware that clinical studies often fail to document the efficacy of effective drugs, but I doubt that the public, or even the majority of the medical community, is aware of this fact. I am persuaded that they not only have a right to know but that they should know. Moreover, I believe that labeling that selectively describes positive studies and excludes mention of negative ones can be viewed as potentially ‘false and misleading’. (Leber 1998).

Hooray for Paul Leber. I have never met or corresponded with this gentleman, but because of this courageous memo, he is one of my heroes.

5 The Serotonin Myth

Over the years, I have noticed something very strange in the antidepressant literature. When different antidepressants are compared with each other, their effects are remarkably similar. I first noticed this when Guy Sapirstein and I did our 1998 meta-analysis of the published literature. When we first saw how small the actual drug effect was, we thought we might have done something wrong. Perhaps we had erred by including trials that had evaluated different types of antidepressants. Maybe we were underestimating the true effectiveness of antidepressants by including clinical trials of drugs that were less effective than others.

Before submitting our paper for publication, we went back to the data and examined the type of antidepressant used in each trial. Some were selective serotonin reuptake inhibitors (SSRIs), others were tricyclic medications, and there were trials on antidepressant drugs that were neither SSRIs nor tricyclics. And then we noticed that there was a fourth category of drugs in the trials we had analyzed. These were trials in which drugs that are not thought to be antidepressants at all (tranquilizers and thyroid medications, for example) were given to depressed patients and evaluated for their effect on depression.

When we analyzed the drug and placebo response for each type of drug, we found another surprise awaiting us. It did not matter what kind of drug the patients had been given in the trial. The response to the drug was always the same, and 75 % of that response was also found in the placebo groups. I recall being impressed by how unusual the similarity in results was, but I have since learned that it is not unusual at all. I have encountered this phenomenon over and over again. In the STAR*D trial, which, at a cost of $35,000,000, is the most costly clinical trial of antidepressants ever conducted, patients who did not respond to the prescribed SSRI were switched to a different antidepressant (Rush et al. 2006). Some were switched to an SNRI, a drug that is supposed to increase norepinephrine as well as serotonin in the brain. Others were switched to an NDRI, which is supposed to increase norepinephrine and dopamine, without affecting serotonin at all. And still others were simply given a different SSRI. About one out of four patients responded clinically to the new drug, but it did not matter which new drug they were given. The response rates ranged from 26 to 28 %; in other words, they were essentially the same regardless of type of drug.

The most commonly prescribed antidepressants are SSRIs, drugs that are supposed to selectively target the neurotransmitter serotonin. But there is another antidepressant that has a very different mode of action. It is called tianeptine, and it has been approved for prescription as an antidepressant by the French drug regulatory agency. Tianeptine is an SSRE, a selective serotonin reuptake enhancer. Instead of increasing the amount of serotonin in the brain, it is supposed to decrease it. If the theory that depression is caused by a deficiency of serotonin were correct, we would expect it to make depression worse. But it doesn’t. In clinical trials comparing the effects of tianeptine to those of SSRIs and tricyclic antidepressants, 63 % of patients show significant improvement (defined as a 50 % reduction in symptoms), the same response rate that is found for SSRIs, NDRIs, and tricyclics in this type of trial (Wagstaff et al. 2001). It simply does not matter what is in the medication: it might increase serotonin, decrease it, or have no effect on serotonin at all. The effect on depression is the same.

What do you call pills, the effects of which are independent of their chemical composition? I call them “placebos.”

6 Antidepressants as Active Placebos

All antidepressants seem to be equally effective, and although the difference between drug and placebo is not clinically significant, it is significant statistically. This leads to the obvious question: What do all of these active drugs have in common that makes their effect on depression slightly, but statistically significantly, better than placebo?

One thing that antidepressants have in common is that they all produce side effects. Why is that important? Imagine that you are a subject in a clinical trial. You are told that the trial is double blind and that you might be given a placebo. You are told what the side effects of the medication are. The therapeutic effects of the drug may take weeks to notice, but the side effects might occur more quickly. Would you not wonder to which group you had been assigned, drug or placebo? And noticing one of the listed side effects, would you not conclude that you had been given the real drug? In one study, 89 % of the patients in the drug group correctly “guessed” that they had been given the real antidepressant, a result that is very unlikely to be due to chance (Rabkin et al. 1986). In a more recent study (Chen et al. 2011), actual treatment assignment (sertraline, hypericum, or placebo) did not affect treatment outcome, but patients’ guesses about which treatment they were getting did.

In other words, clinical trials are not really double blind. Many patients in clinical trials realize that they have been given the real drug, rather than the placebo, most likely because of the drug’s side effects. What effect is this likely to have in a clinical trial? We do not have to guess at the answer to this question. Bret Rutherford and his colleagues at Columbia University have provided the answer. They compared the response to antidepressants in studies that did not have a placebo group with the response in studies that did have a placebo group (Rutherford et al. 2009). The main difference between these studies is that in the first case, the patients were certain they were getting an active antidepressant, whereas in the placebo-controlled trials, they knew that they might be given a placebo. Knowing for sure that they were getting an active drug boosted the effectiveness of the drug significantly. This supports the hypothesis that the relatively small differences between drug and placebo in antidepressant trials are at least in part due to “breaking blind” and discerning that one is in the drug group, because of the side effects produced by the drug.

7 What to Do?

To summarize, there is a strong therapeutic response to antidepressant medication. But the response to placebo is almost as strong. In the FDA files my colleagues and I analyzed (Kirsch et al. 2008), the response to antidepressants was a mean improvement of 10.1 points on the HAM-D, whereas the response to placebo was an improvement of 8.3 points. Furthermore, meta-analyses of published trials reveal that the response to placebos is mostly a true placebo effect; it is not due to spontaneous remission, the natural history of depression, or regression toward the mean (Khan et al. 2012; Kirsch and Sapirstein 1998). This presents a therapeutic dilemma. The drug effect of antidepressants is not clinically significant, but the placebo effect is. What should be done clinically in light of these findings?
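The arithmetic behind this summary is easy to check. The sketch below uses the improvement scores quoted above (10.1 and 8.3 HAM-D points) and the 3-point NICE criterion mentioned earlier; the function and variable names are illustrative and not part of any published analysis.

```python
def summarize_response(drug_improvement, placebo_improvement, threshold_points=3.0):
    """Compare mean drug and placebo improvement on the HAM-D.

    Returns the drug-placebo difference, the share of the drug response
    that is duplicated by placebo, and whether the difference reaches
    the given clinical-significance threshold (in HAM-D points).
    """
    difference = drug_improvement - placebo_improvement
    placebo_share = placebo_improvement / drug_improvement
    return {
        "difference": difference,
        "placebo_share": placebo_share,
        "clinically_significant": difference >= threshold_points,
    }

if __name__ == "__main__":
    result = summarize_response(10.1, 8.3)
    print(f"Drug-placebo difference: {result['difference']:.1f} HAM-D points")
    print(f"Placebo share of drug response: {result['placebo_share']:.0%}")
    print(f"Meets 3-point criterion: {result['clinically_significant']}")
```

With these inputs the difference is 1.8 points and the placebo accounts for about 82 % of the drug response, matching the figures reported in the text and falling short of the 3-point criterion.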

One possibility would be to use antidepressants as active placebos. But the risks involved in antidepressant use render this alternative problematic (Andrews et al. 2012; Domar et al. 2013; Serretti and Chiesa 2009). Among the side effects of antidepressants are sexual dysfunction (which affects 70–80 % of patients on SSRIs), long-term weight gain, insomnia, nausea, and diarrhea. Approximately 20 % of people attempting to quit taking antidepressants show withdrawal symptoms. Antidepressants have been linked to increases in suicidal ideation among children and young adults. Older adults have increased risks of stroke and death from all causes. Pregnant women using antidepressants are at increased risk of miscarriage, and if they don’t miscarry, their offspring are more likely to be born with autism, birth malformations, persistent pulmonary hypertension, and newborn behavioral syndrome. Furthermore, some of these risks have been linked to antidepressant use during the first trimester of pregnancy, when women may not be aware that they are pregnant. Perhaps the most surprising health consequence of antidepressant use is one that affects people of all ages. Antidepressants increase the risk of relapse after one has recovered. People are more likely to become depressed again after treatment by antidepressants than after treatment by other means, including placebo treatment (Andrews et al. 2012; Babyak et al. 2000; Dobson et al. 2008). Furthermore, the degree to which the risk of relapse increases depends on the degree to which the particular antidepressant used changes neurotransmission in the brain. Given these health risks, antidepressants should not be used as a first-line treatment for depression.

Another possibility is to prescribe placebos. They are almost as effective as antidepressants, but elicit far fewer side effects. Surveys indicate that many physicians do in fact prescribe placebos (Tilburt et al. 2008; Raz et al. 2011). The conventional wisdom is that for a placebo to be effective, patients must believe they are receiving active medication, which entails deception. Besides being ethically questionable, the practice of deceiving patients runs the risk of undermining trust, which may be one of the most important clinical tools that clinicians have at their disposal. But is the conventional wisdom correct? My colleagues and I have tested and confirmed the hypothesis that placebos can be effective even when given openly, without deception, provided they are given in the context of a warm therapeutic relationship and with an honest but convincing rationale as to why they should be effective (Kaptchuk et al. 2010). Our study targeted irritable bowel syndrome, rather than depression, but a small pilot study suggests that it might also work in the treatment of depression (Kelley et al. 2012). Until this is confirmed, however, placebo treatment is not a viable option.

Fortunately, placebos are not the only alternative to antidepressant treatment. My colleagues and I have conducted a meta-analysis of various treatments for depression, including antidepressants, psychotherapy, the combination of psychotherapy and antidepressants, and “alternative” treatments, which included acupuncture and physical exercise (Khan et al. 2012). We found no significant differences between these treatments or among the different types of psychotherapy. When different treatments are equally effective, choice should be based on risk and harm, and of all of these treatments, antidepressant drugs are the riskiest and most harmful. If they are to be used at all, it should be as a last resort, when depression is extremely severe and all other treatment alternatives have been tried and have failed.