1 Introduction

Although psychoactive drugs have been used since antiquity, clinical psychopharmacology as we currently understand it has only developed over the last 60 years. Drugs that appeared to specifically target depression were discovered by chance and astute observation, on a background of scientific advances in the understanding of biological processes and in the synthesis of new chemical compounds. The discovery that treatment with the antituberculous drug, iproniazid, and the chlorpromazine derivative, imipramine, led to improvement in 65–70 % of patients with depression led to a search for an understanding of the possible mechanisms. Iproniazid’s ability to inhibit what became known as monoamine oxidase was discovered early on, and the discovery that imipramine could reverse the effects of reserpine followed shortly after (Ban 2001). This set the stage for the development of the monoamine hypothesis of depression in its various versions which has held sway over subsequent decades and guided, or probably restrained, drug development (Ban 2001). Subsequent developments in the production of new drugs have depended on the growth of interrelated industrial, business and marketing processes leading to the current reliance on major pharmaceutical companies for advances in the field (Preskorn 2010). This has led to considerable criticism in recent years, with increasing distrust of the marketing activities of ‘big pharma’ and a questioning of the transparency, validity and interpretation of clinical drug trials used for registration purposes. It has been argued that this resulted in a ‘me-too’ emphasis in drug development leading to failure to develop novel agents that escape the pharmacological shadow of the first drugs (Preskorn 2011).

Some aspects of clinical trials in depression such as internal validity, consistency, statistical analysis and reporting have improved since the first studies in the 1950s (Brunoni et al. 2010). However, the primary endpoints have not altered since the introduction of the Hamilton Depression Rating Scale (HDRS), and we lack useful markers of disease or treatment response which is a serious challenge to the hope of being better able to match treatment to individual circumstances or illness profile (Simon and Perlis 2010). A number of factors are likely to contribute to this including the limitations of our assessment tools, the large non-specific (placebo) component to treatment response and, more fundamentally, the current classification systems of depression which offer only heterogeneous symptom clusters (Gelenberg et al. 2008) and lack a foundation in neuropsychobiology. In addition, the signal to noise ratio in studies has decreased over time with a growth of placebo response rates and a greater proportion of ‘failed’ trials (Gelenberg et al. 2008). This has been fertile ground for attacks on the usefulness of antidepressants for the majority of cases of depression, especially for those presenting in primary care where most people are treated (Kirsch et al. 2008; Middleton and Moncrieff 2011). Paradoxically this is occurring at a time when the prescription of antidepressants is growing fast (Pratt et al. 2011) and underlines the need to have better tools to target treatment.

In this chapter I will discuss some of the current controversies in the use of antidepressants to treat depression and then outline current approaches to treatment, finishing with a brief consideration of future developments.

2 Some Current Controversies

2.1 Do Antidepressants Work in Treating Depression?

The last decade has seen a challenge to the assertion that antidepressants are useful in the treatment of depression. There are essentially three main versions of the challenge that are amenable to empirical examination (I will leave aside here more emotive philosophical or moral arguments about using antidepressants to treat mental ‘distress’ and concerns about their widespread use in society). The arguments are (1) that antidepressants do not work better than placebo once publication bias has been taken into account, (2) if they do work it is due to an augmented placebo effect because of failed blinding/side-effects and (3) even if there is a specific effect it is too small to be clinically useful and/or the risks and disadvantages outweigh the small benefits.

There is not space here to do the arguments full justice but (1) is easily countered by systematic reviews that include unpublished studies and show clear benefit for antidepressants over placebo, although the effect size is reduced to about 0.3 (Kirsch et al. 2008; Turner et al. 2008). Argument (2) is difficult to address directly but implausible for a number of reasons. First, a meta-analysis of randomised controlled trials (RCTs) of tricyclic antidepressants against ‘active’ placebo (anticholinergic drugs), in order to match the adverse effects of the treatment arms and prevent unblinding, still showed a benefit for antidepressants (Moncrieff et al. 2004), while studies comparing tricyclic antidepressants against the better tolerated selective serotonin reuptake inhibitors (SSRIs) show equal efficacy (Anderson 2001). Conversely, differences in efficacy have been demonstrated between second-generation antidepressants with similar side-effect profiles (see Sect. 4.2). Second, it is difficult to explain any differential effect of depression severity on response to placebo and antidepressants using this argument. More severely ill patients respond less well to placebo but the response to antidepressants is maintained across the severity range (e.g. Kirsch et al. 2008); if the response to antidepressants is simply an enhanced placebo response the drug-placebo difference should be as good, or even larger, in milder depression (Fountoulakis and Moller 2011). Finally, the robust relapse prevention data (Geddes et al. 2003) are difficult to explain away as a placebo effect given that patients are currently well (however see Sect. 2.2).

Perhaps the most challenging argument is (3). Meta-analyses at the level of studies (Kirsch et al. 2008) and individual patient data (Fournier et al. 2010) using continuous measures of improvement suggest rather small, and it is argued clinically questionable, differences between drug and placebo at mild and moderate severity depression, i.e. for the vast majority of patients, only reaching an effect size of 0.5, or an HDRS difference of 3, at HDRS scores of 25–28 (Kirsch et al. 2008; Fournier et al. 2010). There are a number of complications in addressing this, including debate about the accuracy of the figures (Fountoulakis and Moller 2011), their generalisability (they are based on a relatively small number of studies/patients with considerable measurement error) and how meaningful it is to apply a strict threshold of clinical importance (Anderson and Haddad 2011). The method of analysis is also important. A large European meta-analysis of 56 published and unpublished RCTs of SSRIs and serotonin and noradrenaline reuptake inhibitors (SNRIs) in regulatory submissions analysed outcomes by a responder analysis (i.e. the proportion achieving a 50 % or greater reduction in HDRS) (Melander et al. 2008). Their results appear to challenge the previous analyses as there was only a non-significant marginal effect of severity on the drug-placebo difference in response across the range from mild to severe depression (15–20 %) and few would argue that a number needed to treat (NNT) between 5 and 7 lacks clinical significance. However, this has been countered by the argument that response rates derived from continuous measures provide only an illusion of efficacy (Kirsch and Moncrieff 2007). Response defined arbitrarily as a threshold of improvement in a continuous measure means that the difference between responder and non-responder numbers reflects those who have moved over the threshold rather than a true difference in those improving and those not improving. The majority of people will cluster around the mean improvement and apparent large differences in response proportions may reflect only a small difference in mean HDRS reductions in the two treatment arms. In addition, in milder severity depression, the difference in mean HDRS reductions will be smaller than in more severe depression so that a 15 % difference in response rates may only reflect a difference of 1 or 2 HDRS points or even less.
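The dichotomisation argument can be illustrated with a small numerical sketch. It assumes, purely for illustration, that the reduction in HDRS score is normally distributed in each arm; the baseline score, the standard deviation and the mean reductions below are hypothetical values chosen to make the arithmetic transparent and are not taken from any of the cited trials.

```python
from statistics import NormalDist


def responder_rate(mean_change, sd_change, threshold):
    """Proportion of patients whose score reduction is at least `threshold`,
    assuming the reduction is normally distributed (an assumption)."""
    return 1.0 - NormalDist(mu=mean_change, sigma=sd_change).cdf(threshold)


def number_needed_to_treat(rate_drug, rate_placebo):
    """Unrounded NNT: reciprocal of the absolute difference in response rates."""
    return 1.0 / (rate_drug - rate_placebo)


# Hypothetical illustration values (not from the source).
BASELINE_HDRS = 20.0
THRESHOLD = 0.5 * BASELINE_HDRS   # response = 50 % or greater reduction
SD_CHANGE = 8.0                   # assumed spread of individual reductions

placebo_rate = responder_rate(8.0, SD_CHANGE, THRESHOLD)   # mean reduction 8 points
drug_rate = responder_rate(10.0, SD_CHANGE, THRESHOLD)     # mean reduction 10 points
difference = drug_rate - placebo_rate
nnt = number_needed_to_treat(drug_rate, placebo_rate)

print(f"placebo responders: {placebo_rate:.1%}")
print(f"drug responders:    {drug_rate:.1%}")
print(f"difference:         {difference:.1%}")
print(f"unrounded NNT:      {nnt:.1f}")
```

Under these assumptions a mean difference of only 2 HDRS points produces a difference in response rates of roughly 10 percentage points, which is the pattern the critics describe. The result depends entirely on the assumed unimodal distribution.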

This argument is however based on a normal distribution in the change in HDRS scores. If improvement is instead more bimodal, with a ‘responder group’ that have a substantial improvement and a ‘non-responder group’ tending to change rather little, the argument weakens considerably. A recent reanalysis of data comparing escitalopram and placebo, using a mixture model to identify subgroups, found that a bimodal distribution fitted the data much better than a unimodal model (explaining 60 % of the variance compared with 6 %) (Thase et al. 2011). The Montgomery–Åsberg Depression Rating Scale (MÅDRS) dropped from a mean of 30 to 10 for those who benefited and only from 30 to 25 for those who did not. The difference in proportion of benefiters and nonbenefiters was 14 % (NNT 7) for the mild to moderately ill patient group (mean initial MÅDRS 26) and 23 % (NNT 5) for more severely ill patients (mean MÅDRS 33). It is important also to take into account the relapse prevention data where there is an extremely robust effect of antidepressants for those with a high risk of recurrence (Geddes et al. 2003) although interpretation of these data has been challenged (see Sect. 2.2).
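A rough sketch of why a bimodal pattern changes the interpretation: if each arm is a mixture of benefiters and nonbenefiters, the difference in arm means is simply the difference in the proportion of benefiters multiplied by the gap between the two subgroup improvements. The subgroup improvements below use the rounded MÅDRS figures quoted above (30 to 10, and 30 to 25); the proportions of benefiters in each arm are hypothetical, chosen only so that they differ by 14 percentage points.

```python
# Rounded subgroup improvements quoted in the text (MADRS points).
BENEFITER_CHANGE = 30 - 10      # 20-point mean improvement
NONBENEFITER_CHANGE = 30 - 25   # 5-point mean improvement


def arm_mean_change(p_benefit):
    """Mean improvement in an arm that is a two-component mixture."""
    return p_benefit * BENEFITER_CHANGE + (1.0 - p_benefit) * NONBENEFITER_CHANGE


# Hypothetical proportions of benefiters (assumption); only their
# difference of 0.14 corresponds to a figure reported in the text.
p_placebo = 0.40
p_drug = 0.54

mean_difference = arm_mean_change(p_drug) - arm_mean_change(p_placebo)
print(f"mean drug-placebo difference: {mean_difference:.1f} MADRS points")
```

On these assumptions the arms differ by only about 2 points on average, yet that average conceals an additional 14 % of patients who improve by around 20 points rather than 5. A small mean difference is therefore compatible with a substantial benefit for a subgroup.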

What do we take away from this debate? Given the increase in the use of antidepressants it must be right to subject the claims of their efficacy and usefulness to rigorous inspection. We are blessed with abundant data but cursed by potential confounds, endless analysis, reanalysis and subanalysis of the data not to mention vested interests. In my view the current evidence does support a clinically useful benefit from antidepressants for many depressed patients but there remains some uncertainty at the mild to moderate end of the spectrum in acute treatment. The threshold where benefit outweighs disadvantages is not fixed and has to be judged against individual patient needs together with the appropriateness and availability of alternative or conjunctive treatments (Anderson and Haddad 2011).

2.2 Sensitisation to Relapse

The robust efficacy of antidepressants in preventing relapse in patients with highly recurrent depression (Geddes et al. 2003) has been a cornerstone of the argument for antidepressant efficacy and usefulness. There have previously been challenges to the interpretation of these relapse prevention RCTs on the basis of the potential confound from discontinuation symptoms either being mistaken for, or even precipitating, depressive relapse (Moncrieff 2006). Recently there has been an even stronger version of this argument proposing that antidepressants bring about oppositional tolerance in long-term use in which homeostatic mechanisms counter the pharmacological effect of antidepressants. This means that when they are stopped there is an ‘overshoot’ pushing monoaminergic neurotransmission in the opposite direction resulting in the triggering of a depressive episode (Andrews et al. 2011). In support of their thesis the authors compared the relapse rate in the placebo arm in standard relapse prevention studies following remission on an antidepressant (antidepressant-placebo) with placebo continuation studies in which patients had remitted on placebo acute treatment (placebo-placebo). They found a 42–56 % relapse rate depending on drug class in the antidepressant-placebo studies compared with only 25 % on placebo-placebo. They report that this remained after correcting for illness characteristics and differential drop-outs (substantially more in the placebo-placebo group). Furthermore, the relapse rate in the antidepressant-placebo group was predicted by the degree of monoamine perturbation (specifically serotonin and noradrenaline) caused by individual antidepressants in the medial prefrontal cortex of rodents using microdialysis (Andrews et al. 2011). Separate evidence indicates that the shorter the length of time without antidepressants in remitted patients the higher the risk of relapse with tryptophan depletion (Ruhe et al. 2007), consistent with the oppositional tolerance hypothesis.

This is an intriguing hypothesis with fundamental implications for our treatment of depression. However, in spite of attempting to correct for patient characteristics, the crucial factor of the underlying risk of relapse is poorly reported in studies, so that it remains highly likely that the population entered into the continuation studies was significantly different to the relapse-prevention study patients who are specifically chosen for a high relapse risk. Uncertainties in extrapolation from animal studies, selection of brain area and study selection for quantifying monoamine perturbation also warrant caution in the pharmacological correlation seen. If the hypothesis is true then one might predict that the longer the period of antidepressant treatment before randomisation to placebo the greater would be the risk of relapse on placebo; in fact the opposite occurs (Geddes et al. 2003). Treatment with antidepressants in healthy volunteers might also be expected to bring about oppositional tolerance and cause vulnerability to mood lowering by tryptophan depletion but this does not happen, at least with treatment over 6 weeks (Barr et al. 1997) which should be long enough to result in adaptive changes. A naturalistic follow-up study of young patients randomised to antidepressant, cognitive behaviour therapy, the combination or placebo did not find that original treatment allocation predicted relapse over a 5-year follow-up (Curry et al. 2011).

At present, therefore, the evidence is not compelling that this type of sensitisation to relapse occurs with antidepressants but, given the importance of the question, a properly designed prospective study to investigate it is warranted.

2.3 Suicide

Over the last two decades concerns have been raised about whether SSRIs (and other antidepressants) might actually cause an increased risk of suicidal acts and completed suicide during treatment. This has been particularly pertinent with regard to their use in adolescents where their benefit in treating depression has been questioned. Resolving the issue has been complicated by a number of factors: the background raised risk of suicidal acts in depression, detecting an increased risk in the context of a decrease in suicidality by effective treatment, the definition of suicidality in children and adolescents where impulsive and self-harm acts are relatively high and the inadequacy of data from studies which have not been designed to address this question. To a large extent the controversy has been replaced by a degree of consensus (Moller et al. 2012). It is now known that suicidal acts are most common before treatment starts (and this applies to psychological as well as drug treatment) and decrease through the early treatment period, so there is indeed a raised risk of suicidal behaviour in the early period of treatment. This is however predominantly due to the nature of the illness itself (Simon and Savarino 2007) with the risks of self-harm greatest in younger age-groups, especially adolescents (Simon and Savarino 2007). On top of this background risk, antidepressant treatment generally (not just SSRIs), compared with placebo, appears to slightly increase the risk of suicidal ideation or suicidal behaviour in adolescents and younger adults up to the age of 25 years (number needed to harm about 100 for adolescents, 200 for younger adults). However, in adults older than 25 years antidepressants have a neutral or protective effect which increases with age (number needed to benefit about 160 for those aged 65 and older) (Hammad et al. 2006; Stone et al. 2009). The reasons for this are not understood with suggestions including a link between suicidality and nonresponse to treatment, a higher rate of undiagnosed bipolar disorder in younger subjects or an age-related effect of changes in neurotransmitter, especially serotonin, function on impulsive behaviours, activation/drive-enhancement or side-effects such as akathisia (Moller et al. 2012). It is important to recognise, however, that these data cannot resolve the question as to whether antidepressants affect the risk of dying as the rate of completed suicide in studies is so low (Stone et al. 2009).

The implications of these findings are that all patients should be carefully assessed and monitored for suicide risk before and during treatment, and that particular care should be taken with younger patients who should be monitored more frequently (National Institute for Health and Clinical Excellence 2009a). The risk–benefit balance when prescribing antidepressant treatment in younger patients needs to take into account that there might be a slightly higher risk of suicidal behaviour with antidepressants early in treatment.

2.4 Emotional Blunting

Reports of a blunting of both positive and negative emotions with SSRIs go back 20 years (Opbroek et al. 2002) but there has been relatively little exploration of the phenomenon. Two small studies, one observational (Opbroek et al. 2002) and one qualitative (Price et al. 2009), have concluded that at least some patients do experience emotional blunting or detachment which they attribute to the medication and can distinguish from the effects of depression. The consequences varied between individuals but some found it sufficiently problematic to make them consider stopping, or actually stop, treatment (Price et al. 2009). Some neuroimaging results have provided support for a general reduction of neural responses to both pleasant and unpleasant stimuli with SSRIs (e.g. McCabe et al. 2010).

This is potentially a previously unrecognized side-effect of SSRIs and other serotonergic antidepressants (Price et al. 2009), and it may also have implications for our understanding of how antidepressants alter emotional processing. However, evidence to date is based on small and largely self-selected samples so further investigation is needed to determine if it is a true effect of treatment and, if so, what proportion of patients is affected.

3 Principles and Practice of Treatment

The obvious aim of treatment is to return the sufferer from depression to being ‘well’ again. However, the patient may have different expectations and emphasis compared with the doctor, and goals may need to be tempered by practical considerations and unwanted effects of treatment. Optimising treatment on an individual basis involves careful assessment, application of the best evidence and sensitivity to the wishes, concerns and beliefs of the patient. This necessarily involves a mutual enterprise based on negotiation which recognises the different roles of doctor and patient and the rights of the patient to make informed choices. I consider here some factors that need to be considered in treating patients with antidepressants but do not aim to be comprehensive and will not discuss in any detail the assessment of risk and the need to adapt drug treatment in the face of factors such as age, medical and psychiatric comorbidity, pregnancy and breastfeeding. These are all important but outside the scope of this chapter.

3.1 Goals of Treatment

The emphasis that clinicians tend to have on symptoms and their improvement does not capture what patients desire from treatment (Zimmerman et al. 2006) which includes an improvement in their overall well-being or quality of life (QoL). In addition to the presence of depressive symptoms this is impacted by the burdens of treatment (such as side effects) as well as wider aspects of functioning, optimism and engagement in social and occupational activities. In addition, much of the focus has been on short-term outcomes whereas depression often has a chronic course with persistent symptoms and relapses. This requires consideration of longer term treatment in which the acceptable balance of benefit to acceptability is likely to be very different; readily illustrated by the different short- and long-term impact of two common side effects of antidepressants, sexual impairment and weight gain.

Although there are many measures that attempt to capture QoL, none have received sufficient general acceptance to become primary, or even required, outcomes in clinical trials. In practice symptom, function and QoL measures tend to improve together and persisting depressive symptoms are associated with ongoing functional and QoL impairments as well as increased risk of relapse (Keller 2003). This has led to an increasing emphasis on absence, or near absence, of symptoms (‘remission’) rather than significant improvement (‘response’) as the goal of treatment (Keller 2003). Remission is usually defined as a minimal number and severity of depressive symptoms (e.g. a score of 7 or less on the observer-rated 17-item HDRS) and was adopted as the primary outcome in the large Sequenced Treatment Alternatives to Relieve Depression (STAR*D) study (Rush et al. 2006a). This higher bar, however, emphasises how disappointing the outcomes of treatment are, with typically only about a third of depressed patients in the STAR*D trial achieving remission with initial antidepressant treatment compared with about 50 % of patients meeting criteria for a response (Rush et al. 2006a).

3.2 When to Treat with Antidepressants

Treatment with antidepressants, as with all drug treatments, needs to be based on a balance of benefits and risks. The risks need to be judged on an individual basis, for example medical comorbidity or pregnancy can raise the threshold for antidepressant drug treatment on the basis of safety. The balance of evidence, discussed above, supporting an increasing active drug-placebo difference in outcome with increasing severity of depression (Angst and Stabl 1992; Kirsch et al. 2008), has led recent guidelines (Anderson et al. 2008; National Institute for Health and Clinical Excellence 2009a) to restrict the primary recommendation for antidepressants to depression of moderate or greater severity while acknowledging that a trial of antidepressants is warranted for milder degrees of depression that have failed to respond to non-drug treatments. The British Association for Psychopharmacology guidelines suggest a dimensional approach with a DSM-IV moderate major depressive episode and a duration of 3 months as a guide to where clinically important benefit starts to be seen (Anderson et al. 2008). Another way of putting this is to consider using antidepressants in situations where spontaneous improvement, or response to nonspecific measures, is unlikely. One of the questions underlying the increase in prescribing of antidepressants is to what degree this represents ‘inappropriate’ prescribing (i.e. in those who would get better spontaneously or with simple psychosocial interventions) as opposed to an increasing awareness and appropriate treatment of significant depression. The truth is likely to be a combination of both. The poor response to antidepressants alone, however, emphasizes that drugs always need to be combined with psychosocial approaches including specific psychotherapies, particularly if there is insufficient response or high risk of relapse.

3.3 How to Treat with Antidepressants

Paradoxically the discussions about personalising medicine come at a time when treatment has never been so impersonal and continuity of care so fragmented. The nonspecific or placebo component to treatment response highlights the importance of how, as well as what, treatment is given. Expectations (patient and clinician) play a considerable role in outcome. For example when patients recruited into RCTs know they are going to receive an antidepressant in non-placebo-controlled comparative trials the remission rate is on average 50 % higher than for patients on active treatment in placebo-controlled trials (when they know they have a 50:50 chance of receiving placebo) (Sinyor et al. 2010). Patient preference not surprisingly influences whether a treatment is accepted (King et al. 2005) but the evidence that patient preference when receiving active treatment influences outcome is less established. However, recent studies in depression have supported a better outcome when the treatment type (drug or psychological treatment) matches the one preferred (Lin et al. 2005; Kocsis et al. 2009; Mergl et al. 2011). Interestingly this might be a stronger effect for psychological treatment than for antidepressant drugs (Mergl et al. 2011). As well as assessing expectations, and matching treatment to them where possible, the process of treatment is also important. More frequent follow-up assessments for patients receiving placebo in RCTs increase the degree of improvement and may account for as much as 40 % of the placebo response (Posternak and Zimmerman 2007). Undergoing standardised assessments with implementation of time-critical treatment changes according to an algorithm improves response for those receiving active treatment (Trivedi et al. 2004). Patient engagement in treatment with feedback of progress from assessments has been shown to improve symptomatic outcome, at least with psychological treatment (Newnham et al. 2010).

The implications from research are therefore that simply providing a prescription for an antidepressant is not enough. Matching or influencing patient preference, and the process of management including assessments, monitoring and probably feedback/engagement are at least as important as the pharmacological agent if treatment benefit is to be maximised.

3.4 Duration of Treatment

Preventing relapse and recurrence is a high priority and the findings from STAR*D (Rush et al. 2006a) show just how high this risk can be. The effectiveness of antidepressants in preventing relapse is well established (Geddes et al. 2003) (but again see Sect. 2.2) with a consistent large benefit in reducing relapse to about a third relative to the risk of relapse on placebo. This is an especially important effect in those with higher risk of relapse. When considering the duration of treatment with antidepressants, therefore, it is important to take a dimensional view and match the potential benefit to an individual’s risk of relapse. This involves taking into account the history of number, frequency and severity of previous relapses, other risk factors such as psychosocial stress and residual symptoms, protective factors such as change in circumstances or receiving specific psychotherapies targeting relapse and, often forgotten, the consequences of relapse. In effect we need to get rid of the idea of fixed durations of treatment and to use the evidence to individualise treatment according to need. In practice this means a minimum duration of 6–9 months treatment in those with minimal risk factors extending to indefinite treatment in highly recurrent depression (Anderson et al. 2008). This is one of the most important areas where we can personalise treatment at present.
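The reason a fixed relative benefit matters more at higher risk can be shown with simple arithmetic. The sketch below reads ‘about a third’ as a relative risk of one third and assumes, for illustration only, that this ratio holds across risk levels; the baseline relapse risks are hypothetical and are not taken from the cited studies.

```python
def absolute_benefit(placebo_relapse_risk, relative_risk=1 / 3):
    """Return (treated risk, absolute risk reduction, unrounded NNT).

    `relative_risk` of one third is an assumed reading of 'about a third';
    applying it uniformly across risk levels is also an assumption.
    """
    treated_risk = placebo_relapse_risk * relative_risk
    risk_reduction = placebo_relapse_risk - treated_risk
    return treated_risk, risk_reduction, 1.0 / risk_reduction


# Hypothetical relapse risks without continued treatment.
for baseline in (0.15, 0.30, 0.60):
    treated, reduction, nnt = absolute_benefit(baseline)
    print(f"baseline {baseline:.0%}: treated {treated:.0%}, "
          f"absolute reduction {reduction:.0%}, NNT {nnt:.1f}")
```

With the same relative effect, the absolute reduction in relapse rises from 10 percentage points at a 15 % baseline risk to 40 points at a 60 % baseline risk, which is why the expected gain from continuing treatment should be weighed against the individual's own risk.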

4 Choice of Antidepressant

4.1 Current Antidepressants

As stated in the introduction, the drugs currently marketed as antidepressants show the legacy of their first discovery and all act primarily on monoamine systems. The latest drug to be marketed, agomelatine, is unusual in that one of its main actions is as a melatonin agonist, although it is also a 5-HT2C antagonist, a property it shares with other antidepressant drugs such as mirtazapine and trazodone (Hickie and Rogers 2011). Melatonin, although a monoamine, is not one previously strongly implicated in antidepressant action although its role in circadian rhythms and sleep means it has links to theories relating disordered chronobiology with depression (Hickie and Rogers 2011). Theories of the mechanism of action of antidepressants have attempted to move beyond the monoamines, ranging from changes in neuroplasticity and immune function to neuropeptides and neurohumoral effects (Tanti and Belzung 2010). To date this has not radically changed the landscape of drugs available for treatment and the link between pharmacology and the alleviation of depression remains tantalizingly obscure. The link between pharmacology and adverse or unwanted effects is rather better understood.

The classification, or nomenclature, of antidepressants is another problem that does not help in their understanding or clinical use. It has grown in a random way with inconsistencies as to whether they are named according to structure (e.g. tricyclic antidepressant), history (e.g. second generation), mode of action (e.g. SSRI), mode of action with a marketing spin (e.g. noradrenergic and specific serotonergic antidepressant or NaSSA) and those without a clear nomenclature as yet (e.g. agomelatine). The World Health Organisation (WHO) system is based on monoamine pharmacology and is limited because most recent antidepressants, and putative antidepressants, are simply listed under ‘other antidepressants’ (Nutt 2009). Nutt (2009) has suggested that a more meaningful approach to nomenclature might be to include neurotransmitter targets and site of action (reuptake sites, receptors, enzymes) combined with a more generic classification as to whether they act at single or multiple sites. Unfortunately our lack of understanding of how pharmacology is translated into efficacy means that all current attempts at classification are going to either have to squeeze atypical or novel drugs into our current monoamine scheme (e.g. lithium as a functional inhibitor of 5-HT2-mediated function, S-adenosylmethionine as a modulator of monoamine neurotransmission), or continue the ramshackle construction depending on empirical evidence (such as adding actions at melatonin, glutamate or glucocorticoid receptors).

Although the basic pharmacology of marketed antidepressants has not changed radically over the last 60 years the pattern of usage has altered. The search for ‘cleaner’ antidepressants with fewer side-effects and greater safety in overdose resulted in the marketing of a series of single action drugs, the SSRIs, in the 1980s. Over the next decade these became the standard first-line treatment, especially once they became available as generic drugs and the acquisition costs reduced. Since then the focus of clinical use has shifted to the current dominance of newer antidepressants over the original tricyclic antidepressants and irreversible monoamine oxidase inhibitors (MAOIs).

4.2 Efficacy of Antidepressants

Arguably more ink has been used (or wasted) on the topic of relative efficacy and/or tolerability of antidepressants than on any other aspect of their use. This has of course been driven strongly by marketing needs, but also the genuine recognition that, given the limited efficacy and tolerability of individual antidepressants, it is important to try and choose the best drug for a patient. In addition, differences between drugs have a potential heuristic value in the search for understanding mechanisms and improving treatments. Clinical experience makes it difficult to believe that antidepressant drugs do not have important differences at an individual patient level, but this has proved difficult to show at a group level in RCTs, and even more difficult to demonstrate differences that are clinically important.

In terms of efficacy there is limited evidence from meta-analyses of RCTs that there might be small advantages (typically 5–10 % difference in response rates, effect sizes of 0.15–0.2) for amitriptyline over SSRIs, for venlafaxine (at doses of 150 mg or more) over SSRIs and for escitalopram 20 mg over other SSRIs including citalopram (Anderson 2001; Smith et al. 2002; Kennedy et al. 2009). The advantage to venlafaxine, an SNRI, over SSRIs may reflect a more general small efficacy benefit for drugs increasing both serotonin and noradrenaline function (SNRIs, mirtazapine, moclobemide) over a single action at the serotonin reuptake site (Papakostas et al. 2007b). A network, or multiple-treatments, meta-analysis of 12 newer antidepressants, which was able to include direct and indirect drug comparisons, was largely consistent with this proposal with mirtazapine, escitalopram, venlafaxine and sertraline being ranked as the top four most effective drugs (in that order), and significantly so compared with fluoxetine, paroxetine, fluvoxamine and reboxetine; the last being less effective than all other antidepressants (Cipriani et al. 2009). A more recent meta-analysis using both direct and indirect comparisons found fewer differences between drugs (Gartlehner et al. 2011). In head-to-head comparisons, sertraline and venlafaxine were more effective than fluoxetine, and escitalopram more effective than citalopram, but the differences were modest (about 5 % difference in response rate). Indirect comparisons only found significant differences in favour of escitalopram over both fluoxetine and duloxetine. The question raised over the efficacy of reboxetine has also led to further attempts to clarify this with the predictable result that one meta-analysis including unpublished data found it did not separate significantly from placebo and was less effective than SSRIs in leading to remission (Eyding et al. 2010) but a previous meta-analysis of essentially the same data against SSRIs concluded they did not differ in efficacy (Papakostas et al. 2008b). These differing results highlight how much outcomes of meta-analyses are dependent on their methodology and suggest that any differences between drugs are not robust. In addition if antidepressants do differ in efficacy it is hard to find a straightforward pharmacological property to explain this: in particular why sertraline and escitalopram might be more effective than other SSRIs. For the former an additional effect in inhibiting dopamine reuptake may be a factor (Carrasco and Sandner 2005), and for the highly selective drug, escitalopram, its unique binding properties at the serotonin transporter could be relevant (Zhong et al. 2012), but the truth is that we have yet to really understand the basis for differences in antidepressant efficacy.

The limited efficacy of antidepressants, and the rationale that multiple pharmacological actions may be more effective than a single action, has led to the question of whether combining treatment from the start may be more effective than using a single antidepressant. Two recent RCTs attempting to answer this have reached opposite conclusions. Blier et al. (2010) reported that mirtazapine combined with fluoxetine, venlafaxine or bupropion was greatly superior to fluoxetine monotherapy with equal tolerability and an impressive doubling of remission rates from 25 % to over 50 %. Limitations of this study were the lack of a mirtazapine monotherapy arm, low numbers assigned to each treatment arm and surprisingly few patients remitting on fluoxetine alone. A larger RCT comparing escitalopram monotherapy with escitalopram plus bupropion and venlafaxine plus mirtazapine found no difference in efficacy between treatments but more adverse events with the venlafaxine-mirtazapine combination (Rush et al. 2011). There is therefore a lack of evidence to recommend combination antidepressant treatment first-line at present.

Although overall efficacy might not differ substantially between antidepressants it remains possible that there may be differences between antidepressants in other aspects of efficacy. However, in a comprehensive meta-analysis of newer antidepressants (Gartlehner et al. 2011) it was not possible to find any good evidence for clinical predictors of differential response to specific antidepressants. There is some evidence that mirtazapine may act more quickly than SSRIs although it is not clear that this generalizes to other antidepressants such as venlafaxine and tricyclic antidepressants (Anderson 2001; Gartlehner et al. 2011) and among the SSRIs fluoxetine may have a slower onset of action (Anderson 2001); if these are correct they may reflect pharmacodynamic and pharmacokinetic properties respectively given mirtazapine’s direct receptor action and fluoxetine’s long half-life.

4.3 Tolerability of Antidepressants

Whereas the efficacy differences between antidepressants are subtle, the differences in side-effects and tolerability are clearer. In RCTs more patients stop treatment due to adverse effects on tricyclic antidepressants than on SSRIs, although the absolute difference is only about 3–4 % (Anderson 2001). Among the more recent antidepressants, a network meta-analysis found that treatment acceptability, as measured by patients dropping out of treatment for all reasons, was greatest for escitalopram, citalopram, sertraline and bupropion and worst for reboxetine, duloxetine and milnacipran (an SNRI). In direct comparisons, escitalopram and sertraline led to significantly fewer discontinuations than did duloxetine, fluvoxamine, paroxetine, reboxetine and venlafaxine (Cipriani et al. 2009).

However, just as important as the rather crude measure of tolerability given by dropouts is the side-effect profile of different drugs (see Anderson et al. (2008) for a summary table). SSRIs compared with TCAs cause more gastrointestinal (nausea, diarrhoea, anorexia) and stimulant (agitation, anxiety and insomnia) side effects, whereas TCAs cause more anticholinergic side effects (dry mouth, constipation, blurred vision), sweating and dizziness (Anderson 2001). In a systematic review of newer antidepressants, bupropion caused significantly less sexual dysfunction than SSRIs; venlafaxine more nausea and vomiting than SSRIs; sertraline more diarrhoea than other SSRIs, venlafaxine, bupropion, mirtazapine and nefazodone; trazodone more sedation than SSRIs, venlafaxine and bupropion; and mirtazapine more weight-gain than SSRIs and venlafaxine (Gartlehner et al. 2011). Reboxetine causes more anticholinergic-like side-effects and insomnia than SSRIs (Papakostas et al. 2008b). Treatment-emergent sexual dysfunction is an often under-reported adverse effect that has important implications for patients. A meta-analysis found that SSRIs, venlafaxine, the TCA imipramine and the MAOI phenelzine caused greater sexual dysfunction than placebo (ranging from 26 to 80 %) with the worst being sertraline and venlafaxine and the least escitalopram and fluvoxamine (Serretti and Chiesa 2009). Antidepressants with a similar rate of sexual dysfunction to placebo included agomelatine, mirtazapine and moclobemide (Serretti and Chiesa 2009). The discontinuation syndrome on stopping newer antidepressants seems most common with paroxetine and venlafaxine and least with fluoxetine (Gartlehner et al. 2011).

4.4 Safety of Antidepressants

Depression is associated with an increased risk of eventual suicide ranging from a fourfold increase over the general population risk (0.5 %) in outpatient populations, to a 16-fold increased risk in patients hospitalized with suicidal attempts or ideation (Bostwick and Pankratz 2000). The considerable risk of death from overdose of tricyclic and MAOI antidepressants was one driver for the development of newer drugs that would be safer if taken in overdose. The fatal toxicity index (FTI) has been used as a measure of lethality in overdose, and describes the mortality rate per 1 million prescriptions of a drug. As a crude measure it is liable to a number of potential confounds, including the risk of overdose in the population being treated with a specific drug which might inflate or mask the true toxicity; this is likely to have contributed to the high FTI for venlafaxine (Rubino et al. 2007). A recent study used the case fatality index (CFI), which is the rate ratio for poisoning mortality relative to self-poisoning (deaths/100,000 self-poisonings), to control for this confound. Tricyclic antidepressants were the most toxic (rate ratio 8.6–23.3) and greater than venlafaxine (rate ratio 2.5) and mirtazapine (rate ratio 1.9) which in turn were higher than SSRIs (rate ratio 0.5) (Hawton et al. 2010). Of the SSRIs citalopram was the most toxic in overdose (rate ratio 1.1) compared with other SSRIs (rate ratios 0–0.4) (Hawton et al. 2010). The finding with citalopram is consistent with recent data showing that it affects cardiac conduction with increases in the QT interval at higher treatment doses (Food and Drug Administration 2011), which has led to dose restrictions for citalopram and escitalopram.
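
The difference between a prescription-based and an overdose-based measure of toxicity comes down to the choice of denominator, which a small calculation can illustrate. The Python sketch below uses invented counts for two hypothetical drugs and hypothetical function names; it is not a reconstruction of the methods or data in Rubino et al. (2007) or Hawton et al. (2010), only an illustration of why a drug given to a higher-risk population can show a higher mortality rate per prescription without being more lethal per overdose.

```python
def fatal_toxicity_index(deaths: int, prescriptions: int) -> float:
    """Deaths per 1 million prescriptions of a drug."""
    return deaths / prescriptions * 1_000_000


def case_fatality(deaths: int, self_poisonings: int) -> float:
    """Deaths per 100,000 self-poisoning episodes involving a drug."""
    return deaths / self_poisonings * 100_000


# Invented figures (assumption): both drugs are equally lethal per overdose,
# but drug B is prescribed to a population in which overdose is more frequent.
drugs = {
    "drug A": {"deaths": 10, "prescriptions": 5_000_000, "self_poisonings": 2_000},
    "drug B": {"deaths": 30, "prescriptions": 5_000_000, "self_poisonings": 6_000},
}

for name, d in drugs.items():
    fti = fatal_toxicity_index(d["deaths"], d["prescriptions"])
    cf = case_fatality(d["deaths"], d["self_poisonings"])
    print(f"{name}: {fti:.1f} deaths per million prescriptions, "
          f"{cf:.0f} deaths per 100,000 self-poisonings")
```

In this made-up example the prescription-based rate for drug B is three times that of drug A, while the rate per self-poisoning episode is identical, which is the kind of confound the overdose-based measure is intended to remove.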

Antidepressants vary in their pharmacokinetic properties and a particular issue is their ability to interact with other medication through inhibition or competition in catabolic pathways in the liver, particularly involving cytochrome P450 (CYP450) enzymes. This is important in those on treatment for physical health problems and especially for the elderly population who are increasingly on multiple other treatments. A recent guideline from the National Institute for Health and Clinical Excellence specifically considers the treatment of depression in the medically ill, providing useful information about drug interactions and a recommendation to reserve antidepressants for more severe degrees of depression because of concerns about the risk–benefit balance (National Institute for Health and Clinical Excellence 2009b). Antidepressants with few or no interactions at the level of CYP450 enzymes include sertraline, citalopram/escitalopram, mirtazapine and venlafaxine. The National Institute for Health and Clinical Excellence (2009b) also draws attention to the risk of increased gastrointestinal bleeding with SSRIs and SNRIs, particularly when combined with aspirin or non-steroidal anti-inflammatory drugs.

4.5 Choosing an Antidepressant

Relative efficacy is therefore not a main criterion for choosing between currently available antidepressants, at least for first-line use, although it seems reasonable to choose an antidepressant with established efficacy. The different tolerability and side-effect profiles of current antidepressants do provide information that is useful in attempting to match treatment to patient (although it is difficult to predict to what degree an individual will be affected). Particular care is needed to consider adverse effects that have long-term impact such as weight gain and sexual dysfunction.

The factors described above have led most guidelines to recommend that SSRIs are a general first-line choice when prescribing an antidepressant, and other factors being equal, sertraline or escitalopram may be reasonable choices (but note recent cardiac concerns with escitalopram). However, it is important to take a rounded view in choosing a drug and Table 1 lists some factors to consider when prescribing an antidepressant.

Table 1 Choosing a specific antidepressant

5 Treatment Non-Response and Sequencing

Inadequate response to initial treatment is unfortunately a common problem and the STAR*D study found that only about half responded and a third remitted at the first stage (Rush et al. 2006a). In this section, I will concentrate on reviewing pharmacological treatment options but it must be emphasized that they should be undertaken after an assessment of possible reasons for inadequate response including re-evaluation of the diagnosis and treatment goals, attempting to identify factors that may be impairing response to treatment (including nonadherence) and considering the whole range of further treatment options, especially psychological treatment.

It is common to hear the term ‘treatment resistant’ depression used as a shorthand in patients with poor response to treatment, but it is unsatisfactory as it lacks a clear definition and does not help to identify the degree of, or reasons for, lack of improvement on treatment (Anderson et al. 2008; National Institute for Health and Clinical Excellence 2009a). An emphasis on considering ‘next-step’ treatment given the person’s current situation and treatment history is preferable, and has been adopted in some recent guidelines (Anderson et al. 2008; National Institute for Health and Clinical Excellence 2009a).

The STAR*D study was an ambitious attempt to investigate treatment sequencing. It highlights many important points but I will only mention a few here. First, that poor treatment tolerability is a common reason for changing treatment, not just inadequate response. Second, patients have treatment preferences about next-step treatment and it was not possible to examine the relative efficacy of switching and augmenting treatment because there was little willingness to be randomised between these types of treatment option (although there was within treatment type). This seemed to be related to the adverse effects and degree of improvement on the first treatment. Third, the response and remission rates declined with each treatment step so that by the fourth step only about 15 % of patients were achieving remission (Rush 2007). Amidst the plethora of publications coming from the STAR*D study the primary next-step outcomes are reported in 6 studies (Trivedi et al. 2006; Fava et al. 2006; McGrath et al. 2006; Nierenberg et al. 2006; Rush et al. 2006b; Thase et al. 2007). No treatments were demonstrated to be more effective than any other on the primary outcome of remission although bupropion compared with buspirone augmentation, and the mirtazapine-venlafaxine combination compared with the MAOI tranylcypromine, led to greater reductions in continuous measures of depression, and cognitive behaviour therapy was slower in its action than medication. There were tolerability differences with later stage treatments tending to lead to more discontinuations and lithium and buspirone augmentation were poorly tolerated, probably due to the dosing regime with the former. This disappointing outcome has led to soul-searching but some of the problems may lie with the details and complexity of the study design, and the size of effect that could be detected (Anderson 2009).

The treatment choices faced by clinicians and patients when current treatment has not resulted in sufficient improvement are (1) to persist longer with current treatment, (2) to increase the dose or intensity, (3) to switch treatment and (4) to combine treatments. There has been a debate about how long to treat with an antidepressant before changing treatment, given the recent recognition that the greatest degree of improvement in depressive symptoms occurs in the first week and 60 % of the improvement seen at 6 weeks occurs by 2 weeks (Posternak and Zimmerman 2005). In spite of the conclusion by the authors of one meta-analysis (Szegedi et al. 2009) that treatment could be changed as early as 2–3 weeks into treatment for nonimprovers (less than 20 % improvement in rating scale scores), other 8-week studies found negative predictive values (proportion of non-improvers at a specific timepoint failing to respond by end of study, NPV) of 55–64 % at 2 weeks, 80–82 % at 4 weeks and 90–93 % at 6 weeks (Nierenberg et al. 1995; Nierenberg et al. 2000; Baldwin et al. 2009). Caution is required in extrapolating from these data to clinical practice; the STAR*D study found that eventual responders following treatment periods of about 12 weeks first responded on average after 5.5 weeks at step 1 increasing to 8.3 weeks at step 4 (with similar figures for remitters) (Rush et al. 2006a). Of eventual remitters, over a third did so after 8 weeks, showing that even an 8-week trial may not be long enough; this is consistent with a separate 12-week open study which found that the NPV for nonremission at 12 weeks based on nonimprovement at 4 weeks was only 49 % (Quitkin et al. 2003). The decision about when to change treatment therefore cannot be based on a fixed timepoint but needs to be based on whether there is any improvement, or trajectory of improvement, and on the eventual chance of response/remission.
This will depend on many factors especially the number of failed treatments and the type of next step being considered. It is reasonable to start thinking about changing treatment at 3–4 weeks, especially dose increase, when there is no improvement (National Institute for Health and Clinical Excellence 2009a), but for many patients longer trials are warranted.
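
The negative predictive value used in this discussion can be made concrete with a short calculation. The Python sketch below follows the definition given in the text (the proportion of non-improvers at a given timepoint who fail to respond by the end of the study); the function name and the patient counts are hypothetical and are not taken from any of the cited studies.

```python
def negative_predictive_value(non_improvers: int, later_responders: int) -> float:
    """Proportion of early non-improvers who have still not responded at study end.

    non_improvers: patients with less than 20 % improvement at the early timepoint
    later_responders: how many of those patients went on to respond by study end
    """
    if non_improvers <= 0:
        raise ValueError("need at least one non-improver")
    if not 0 <= later_responders <= non_improvers:
        raise ValueError("later_responders must lie between 0 and non_improvers")
    return (non_improvers - later_responders) / non_improvers


# Invented counts (assumption): 100 non-improvers assessed at each timepoint.
hypothetical_later_responders = {"week 2": 40, "week 4": 20, "week 6": 8}

for timepoint, responders in hypothetical_later_responders.items():
    npv = negative_predictive_value(100, responders)
    print(f"{timepoint}: NPV = {npv:.0%}, "
          f"so {1 - npv:.0%} of non-improvers would still go on to respond")
```

Read this way, an NPV of around 60 % at 2 weeks means that roughly 4 in 10 non-improvers at that point would still have responded had treatment continued, which is why a low NPV argues against changing treatment too early.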

5.1 Dose Increase

In spite of it being common clinical practice, evidence from RCTs is that increasing the dose of SSRIs is not more helpful than continuing the same dose in patients who have not responded (Adli et al. 2005), with a lack of evidence for other antidepressants although a trend was seen with high dose venlafaxine (Thase et al. 2006). There is indirect evidence from dose response data for trying a higher dose of tricyclic antidepressants, venlafaxine (>150 vs. 75 mg), escitalopram (20 vs. 10 mg) and MAOIs (Rudolph et al. 1998; Adli et al. 2005; Anderson et al. 2008) and a recent meta-analysis suggests there may be a small advantage at higher than starting doses for other SSRIs at the cost of poorer tolerability (Papakostas et al. 2010). Therefore, in spite of the limited evidence, increasing the dose, provided side-effects and safety allow, may be a reasonable step especially as there is wide interindividual variability in plasma concentration of antidepressants and associated uncertainty about what is an effective dose for an individual patient. Increasing the dose may also keep a patient in treatment to allow adequate time to respond.

In patients who have failed to respond to previous treatments high-dose antidepressants are sometimes considered. This is usually an off-label use and needs to be discussed with patients. Clinical experience and limited data suggest that some patients do benefit, particularly when treated with high-dose TCAs, venlafaxine or MAOIs (Adli et al. 2005), but caution needs to be exercised and patients monitored for adverse events.

5.2 Switching Antidepressants

Open and controlled studies of antidepressant switching show widely varying response rates (25–70 %) (Anderson 2003). It is important to remember that RCTs of antidepressant switching show up to 50 % response rate by simply continuing on the same antidepressant at the same dose (Shelton et al. 2005; Corya et al. 2006). There may be a marginally greater benefit from switching to a drug of a different antidepressant class than to a second drug of the same class (Papakostas et al. 2008a), largely accounted for by studies switching from an SSRI to venlafaxine (Ruhe et al. 2006). Older studies have also suggested that switching from a reuptake inhibitor to an MAOI may be effective (Anderson 2003) but the STAR*D study did not find tranylcypromine had greater efficacy than a venlafaxine-mirtazapine combination (McGrath et al. 2006).

How to switch antidepressants has been little studied and there is a need to balance the needs of safety (arising from the potential drug interactions) and tolerability against those of avoiding discontinuation reactions and timeliness. Immediate switching appears reasonable with drugs of similar pharmacology and shorter half-lives and STAR*D did not report problems switching between classes. If two drugs can reasonably be combined then immediate switching (between modest doses) of the two drugs appears safe and well-tolerated. However, potentially toxic interactions do need to be considered and these are compounded if the initial drug has long-lasting effects (e.g. fluoxetine to a tricyclic antidepressant, MAOI to a serotonergic drug).

5.3 Combining or Augmenting Antidepressants

There has been considerable interest in combining drug treatments since the first antidepressants were developed and two of the major clinically applicable advances in pharmacological treatment have been lithium augmentation and, more recently, atypical antipsychotic augmentation. The rationale for combining antidepressants has been discussed above when considering antidepressant efficacy. However, although commonly done, evidence for it being an effective strategy is largely lacking. In practice the safety of antidepressant combinations is a key factor in determining their choice. Table 2 summarises common combination/augmentation strategies.

Table 2 Common antidepressant augmentation/combination strategies

Some of the strongest evidence is for lithium augmentation of monoamine reuptake inhibitors, mostly tricyclic antidepressants (Bauer et al. 2010). However, the studies are small and there are few recent trials so this evidence is less secure than it once seemed, particularly as lithium augmentation was poorly tolerated and not particularly effective in the STAR*D study (Nierenberg et al. 2006). Nevertheless, as in bipolar disorder, lithium augmentation does appear to decrease the risk of suicide (Guzzetta et al. 2007) making it a potentially valuable treatment. In clinical practice its use appears to have declined and we do not know whether it has efficacy when added to newer antidepressants such as SSRIs or non-reuptake inhibiting antidepressants such as mirtazapine. The big development in the last decade has been the evidence that atypical antipsychotics are effective as augmenting agents for SSRIs (Papakostas et al. 2007a; Anderson et al. 2008; National Institute for Health and Clinical Excellence 2009a). Pooled data and some individual studies have provided evidence for quetiapine, olanzapine, aripiprazole and risperidone, although further data are needed as well as longer term studies. The best evidence currently is for quetiapine at a lower dose than used as an antimanic or antipsychotic agent (McIntyre et al. 2009). Atypical antipsychotics are reasonably well-tolerated but there is an increased side-effect burden, including weight gain and sedation (National Institute for Health and Clinical Excellence 2009a).

Evidence for other strategies is weaker. Results with triiodothyronine (T3) augmentation suggest efficacy but it has not been extensively studied and longer term experience with continuing treatment is lacking (Anderson et al. 2008; National Institute for Health and Clinical Excellence 2009a); it was better tolerated and nonsignificantly better than lithium in the STAR*D study (Nierenberg et al. 2006). In contrast pindolol augmentation has been shown to be an ineffective strategy for improving outcome, although it does speed improvement (Anderson et al. 2008; Portella et al. 2011), and buspirone augmentation is also ineffective (Anderson et al. 2008) and was less well-tolerated than bupropion in the STAR*D study (Trivedi et al. 2006).

The evidence-base for combining antidepressants is even weaker, including for the popular strategy of mirtazapine augmentation of SSRIs/venlafaxine (Anderson et al. 2008; National Institute for Health and Clinical Excellence 2009a). A variety of other augmentation strategies have been used. Tryptophan augmentation of MAOIs has some support (Anderson 2003) and clinically it has been used as an adjunct with lithium-MAOI and lithium-TCA combinations with the rationale that it increases serotonin availability. There has been interest in adding treatments that might be considered ‘complementary’ to antidepressants such as the omega-3 fatty acid eicosapentaenoic acid (Anderson et al. 2008) but the evidence is equivocal.

The disappointing efficacy of strategies for treating depression poorly responsive to antidepressants has led to a wide variety of proof-of-concept studies based on plausible pharmacological rationales but a full review is beyond the scope of this chapter.

Augmentation/combination strategies are increasingly seen as important next-step treatments and are probably being used earlier in treatment algorithms; caution is needed to make sure they are used safely. Many combinations involve off-label use and discussion with the patient is required and informed consent obtained. Augmentation strategies are particularly useful when current treatment has produced some benefit and stopping the current antidepressant risks losing this, and when dose increase and switching strategies have failed.

6 Conclusions and Future Directions

In this chapter, I have briefly reviewed the use of antidepressants in the treatment of unipolar depression and tried to put it in the historical context of drug development as well as considering some of the current controversies about the efficacy and adverse effects of antidepressants. There is no doubt that there are major limitations in our current ability to treat the syndrome of major depression. The efficacy of antidepressants is limited and has not changed in the years since their discovery, although we have a much better knowledge of their pharmacology and adverse effects, providing a ‘palette’ of drugs with different side-effects that at least allows some personalisation of treatment in that aspect. The relative failure to improve outcomes so far underlines how much of a challenge there is in developing better antidepressants although this remains an aspiration for researchers, clinicians and sufferers alike.

What are the challenges and where are advances likely to lie? The thrust so far has been to produce new compounds that might have advantages over current treatments, which has been done by working largely in an empirical way based on known pharmacology and animal models. It remains to be seen whether new pharmacological approaches such as glutamatergic drugs will offer rapid onset of action and greater efficacy (e.g. Zarate et al. 2006) or whether our current lack of understanding of how neurobiology maps onto mood, psychological processes and symptom clusters means that progress in improving outcomes in such a broadly defined group as major depression is a vain hope. Although there is current enthusiasm for stratified or personalised treatment we are still a long way from being able to identify predictors of response (Simon and Perlis 2010) that will allow us to target treatment more effectively for individuals. The present interest in affective processing and its neurobiology, together with increasing sophistication in functional brain imaging techniques, holds some promise for greater understanding of abnormalities that occur in depression and the effects of pharmacological manipulation. This may provide a means for moving beyond animal models which arguably have anchored us to the pharmacology of the past and to oversimple behaviours that do not sufficiently model or capture what occurs in human depression.

Perhaps, unfortunately, we are destined to be defeated by a combination of the complexities of the human brain and the neural networks underlying human thought and experience, the multiple causes of depression and the variations in its meaning for individuals. This may simply mean that crude manipulations of neurotransmitter pathways by systemic administration will never be able to take us farther than the modest ‘improvement-enabling’ effect of current drug treatments. The challenge in that case could be to combine what we already have with smarter targeted psychological or brain region-specific physical interventions rather than to continue to seek to develop a ‘better’ antidepressant drug as currently conceived.

The definition of depression itself is generally accepted to be too broad and probably encompasses heterogeneous conditions, or at least facets, at a more basic level (Holtzheimer and Mayberg 2011). This confusion helps little in furthering our understanding of the neurobiology of depression, its social and psychological context and causes, nor in the development of better treatments. It has been argued on this basis that we need to target treatments at more narrowly defined symptoms or symptom sets, or have a different concept of depression (Holtzheimer and Mayberg 2011). It is of interest that a recent antidepressant treatment study found that symptom dimensions, but not standard rating scales, could identify drug-specific effects of a serotonergic compared with a noradrenergic antidepressant (Uher et al. 2009) which, if replicated, argues that we need to alter our outcome measures in treatment trials if we want to truly personalise treatment and develop new targeted agents.

Finally, although there have been attempts to identify genetic predictors of treatment none have been sufficiently robust to be useful at an individual patient level. We may simply be misguided in thinking that a single gene, or a combination of a few genes, will ever yield a large enough effect to be clinically useful. Alternatively we may need to be more sophisticated in looking for gene × environment interactions (or gene × gene interactions) to explain variance in outcomes (e.g. Keers et al. 2011) and hence to be able to harness knowledge about genes as predictors of outcomes.