In 2019, Aaron T. Beck published a retrospective on his 60 years in the field of cognitive behavioural therapy (CBT), with a specific focus on the scientific method that guided his work developing, implementing, and disseminating an efficacious treatment for diverse psychological disorders. He highlighted what is now CBT lore—that his initial interest in validating psychoanalytic treatment for depression via rigorous scientific methods ultimately led him to leave psychoanalysis behind and revolutionise psychotherapy by focusing on the key role of automatic thoughts (Beck 2019). Although Beck never used the term experimental psychopathology, he described a back-and-forth process between experiments in the lab and interventions in the clinic, advancing both cognitive theory and therapy. In so doing, he laid the groundwork for research that focuses not just on evidence-based practice, but also on practice-based research.

Researchers define experimental psychopathology (EPP) in a variety of ways (e.g. Waters et al. 2017; Zvolensky et al. 2001), though all agree that the overarching goal is to understand which factors contribute to the aetiology and maintenance of psychological disorders and how they do so. Forsyth and Zvolensky (2001; see also Zvolensky et al. 2013) describe four research designs that fall under the EPP umbrella, each varying in participant type and in the experimental manipulation of independent variables. Boddez et al. (2017) lay out EPP core tenets: (1) psychopathological behaviour is caused by deviations in cognitive processes that operate on a continuum “shared by healthy people and patients alike” (p. 110); (2) proposed mechanisms (i.e. manipulated constructs) are well-known psychological phenomena rather than novel, previously unidentified constructs; and (3) the central goal is to contribute to improving evidence-based treatments. Despite the long history of clear links between EPP and CBT (e.g. van den Hout et al. 2017), the degree to which researchers consider clinical implications as primary varies considerably (e.g. Forsyth and Zvolensky 2001; Ouimet et al. 2019).

In July 2019, over 4000 researchers from more than 80 countries came together in Berlin to share and discuss CBT research and treatment at the World Congress of Behavioural and Cognitive Therapies (WCBCT) – “CBT at the Crossroads”. The integral role of EPP in CBT research is apparent in the congress programme (https://wcbct2019.org/Downloads/Congress-Programme.pdf). From basic science/experimental psychopathology poster sessions to symposia related to implementing behavioural experiments in treatment (Kube and Rief 2019), researchers and clinicians learned about applying evidence from new EPP research to their CBT practice.

Although CBT clinicians should feel comfortable pointing to the evidence base to support much of what they do with their clients, the pathway from the cognitive behavioural model to cognitive behavioural treatment is not often clear (Clark 2004). Waters et al. (2017) proposed a conceptual model of research programming to encourage better integration of EPP and clinical practice, highlighting the multi-directional links between theory, experiments, and practice. For example, high relapse rates observed in randomised controlled trials (RCTs) should inform EPP researchers’ efforts to identify additional potential mechanisms of change, which should then be tested via RCTs. The model is iterative: as researchers, we must move information from science to practice, and then use practice to inform our science (Waters et al. 2017).

There is a general structure that EPP/CBT research tends to follow:

1. Identify an existing phenomenon that appears exaggerated among people with psychopathology;
2. Induce that exaggeration among people without psychopathology to see whether “symptoms” appear;
3. Test whether changing that exaggeration (i.e. via an intervention) reduces psychopathology;
4. Modify and repeat.

Although this structure has led to improved treatment outcomes for people with psychopathology (e.g. Clark 2004), it is not uncommon for researchers to skip the second step (testing the role of the proposed mechanism), leaving open the question of whether that mechanism should be prioritised in an intervention. Moreover, psychological research is changing rapidly, with a focus on improving research practices and reproducibility via Open Science methods, increased transparency, more precise measurement, larger, a priori justified sample sizes, and inclusion of diverse samples, to name a few. EPP has to change with it, particularly given its important role in developing and refining evidence-based treatments.

In this paper, we cast a critical eye on EPP research and its role in CBT practice, focusing on where we have been, where we are now, and where we are going. This is by no means an exhaustive review. Rather, we aim to highlight research that has pushed therapeutic practices forward and current research that has the potential to do so, and to put forth a call for continued improvement of our research practices to ensure we help people with various forms of psychopathology. Moreover, we put a particular emphasis on the diverse crossroads that EPP and CBT researchers encounter, using the WCBCT congress as our inspiration.

Although there are numerous examples of exciting ways in which EPP has advanced CBT, we were able to include only a few. The choice was not always easy, and was certainly influenced by our own areas of expertise and familiarity. Rather than provide a complete account of how EPP shaped CBT for a given disorder, we highlight themes we believe to be particularly important (e.g. identifying mechanisms of change), using specific processes and disorders as exemplars.

Where Have We Been?

EPP has a long history, dating back to Wundt, who advocated for experimental laboratories as the cornerstone of psychological science; James, who was perhaps the first to use experiments to highlight how psychopathology was a variant of normative processes; and Pavlov, whose experimental work on conditioning was the precursor to systematic desensitisation and exposure therapy. A detailed review is beyond our paper’s scope; we refer interested readers to Olatunji et al. (2007) and Forsyth and Zvolensky (2001). Instead, we reflect on a few specific programs of research that have put EPP at the centre of treatment development. We provide examples of how researchers used innovative experimental manipulations to better test and measure cognitive and behavioural causes of symptoms, changing the way we understand and treat some disorders.

Developing Novel Interventions: Memory Confidence in OCD

After a long history of obsessive-compulsive disorder (OCD) being considered “untreatable”, research on the disorder leapt forward following strong, clinically focused cognitive models by Rachman (1997, 1998) and Salkovskis (1985). By focusing on people’s beliefs—that their thoughts meant they were mad, bad, or dangerous, and that they were responsible for preventing their feared events from occurring, respectively—Rachman and Salkovskis laid the foundation for experimental work focused on delineating the effects of specific beliefs on OCD symptoms. These experiments were critical to developing CBT for OCD, which has one of the highest efficacy rates among treatments for anxiety-related disorders (Carpenter et al. 2018). By exploring the memory confidence research in depth, we aim to highlight how well-designed experiments pave the road for specific CBT components.

Since Rachman’s (1997, 1998) and Salkovskis’ (1985) early models, researchers have induced OCD symptoms experimentally to better understand their causal and maintaining mechanisms (De Putter et al. 2017). For example, Rachman et al. (1996) demonstrated that asking healthy people high in thought-action fusion to write out a sentence wishing their loved one would die increased their anxiety and urge to neutralise. Rassin et al. (1999) informed healthy participants that every time they thought the word “apple” (assessed ostensibly via EEG), a participant in the next room would receive a shock. This “thought-action fusion” increased participants’ intrusions of the word “apple”, discomfort, efforts to suppress the word “apple”, self-directed anger, and even their evaluation of the valence of the word “apple”. Using a novel (deceptive) pill-sorting task, Bouchard et al. (1999) demonstrated that high perceived responsibility led to slower decision-making, more checking, and greater accuracy, all of which likely reinforce symptoms and mechanisms such as perfectionism. These experiments, in tandem with others highlighting the important roles of mental contamination (e.g. Fairbrother et al. 2005), disgust propensity and sensitivity (e.g. Deacon and Olatunji 2007; Inozu et al. 2017), and doubt and inferential confusion (e.g. Wong and Grisham 2017) as key maintaining factors—and thus treatment targets—for OCD symptoms, led to important refinements of CBT for OCD.

One debate in the OCD literature centred on suggestions that people engaged in compulsive checking because of memory deficits. Researchers hypothesised that people with OCD could not remember well enough whether they had checked something properly, so they checked again (e.g. Rubenstein et al. 1993; Sher et al. 1989). McNally and Kohlbeck (1993), however, offered preliminary evidence that repeated checking may arise more from poor memory confidence than from poor memory performance. In an early meta-analysis, Woods et al. (2002) concluded that although people with OCD exhibited impaired memory performance, those who checked compulsively had lower memory confidence than those who did not. It was not until van den Hout and Kindt (2003) tested the effects of repeated checking on memory confidence via experimental manipulation, however, that a clinically plausible theory began to take hold.

In an innovative series of experiments, van den Hout and Kindt (2003) asked healthy participants to repeatedly check either a virtual stove or a virtual light bulb, prior to performing a final check of the virtual stove. Across three experiments, participants who repeatedly checked the stove remembered the final check just as accurately as those who repeatedly checked the light bulb; however, they reported significantly lower vividness, detail, and confidence in their memory. The authors thus provided strong evidence that OCD symptoms and behaviours likely cause decreased memory confidence. Alcolado and Radomsky (2011) further demonstrated that experimentally reducing people’s memory confidence caused them to check even more. Researchers have replicated these findings across non-clinical and clinical samples (e.g. Boschen and Vuksanovic 2007), leading to a novel CBT intervention specifically targeting memory confidence (Alcolado and Radomsky 2016). Moreover, people undergoing cognitive therapy for compulsive checking reported increased memory confidence immediately after psychoeducation and behavioural experiments targeting that construct (Radomsky et al. 2020b). Crucially, increases in memory confidence predicted decreased reported checking time.

Taken together, these findings suggest that people who check repeatedly do so not because they cannot remember whether they checked properly, but because they are not confident that their memory of the check is accurate. It is difficult to overstate the importance of this conclusion for CBT practice. A focus on memory impairment could lead to ineffective memory training (and potentially to client beliefs that they are “untreatable”), whereas a focus on memory confidence has already led to efficacious cognitive interventions (and thus client symptom improvement) (Alcolado and Radomsky 2016; Radomsky et al. 2020a, b). Indeed, targeting memory confidence via behavioural experiments has been incorporated successfully into CBT treatment recommendations for compulsive checking (Radomsky et al. 2010; Shafran et al. 2013).

Identifying Mechanisms of Change: EMDR for PTSD

Multiple CBT interventions for posttraumatic stress disorder (PTSD) are effective, including prolonged exposure therapy, cognitive processing therapy, and trauma-focused CBT (see Watkins et al. 2018, for a review). We focus here, however, on eye-movement desensitisation and reprocessing (EMDR) to highlight EPP’s vital role not only in CBT development, but also in better understanding the mechanisms of efficacious treatments and improving their implementation. Shapiro (1989, 1995) developed EMDR as a novel treatment for PTSD with little theoretical or empirical foundation. Shapiro herself stated, “The basis for the effectiveness is, however, unclear in that the technique was not derived from a theoretical position and there is, at this time, insufficient evidence to justify conclusions” (Shapiro 1989, p. 216). What ensued was a decades-long back-and-forth between researchers and clinicians declaring EMDR essentially a miracle cure (e.g. Marcus et al. 1997) and those insisting EMDR was either ineffective (e.g. McNally 1999) or effective exclusively because it included elements of evidence-based psychotherapy, such as exposure and empathy (e.g. Cusack and Spates 1999).

Despite debate over the mechanism of change, early reviews and meta-analyses suggested that EMDR was an efficacious treatment, equally so to cognitive behavioural and exposure-based therapies (e.g. Davidson and Parker 2001; Van Etten and Taylor 1998), though serious questions remained about the plausibility and clinical utility of the eye movements. The argument that components over and above exposure were likely unnecessary (e.g. Cusack and Spates 1999; Davidson and Parker 2001) paved the way for ignoring or omitting the potential additive effect of eye movements altogether, despite some research suggesting they may have important impacts on memory and emotionality (e.g. Andrade et al. 1997).

Some EPP researchers, however, were not content to leave unanswered the question of whether and/or why eye movements might be important. van den Hout et al. (2001) extended hypotheses about eye movements taxing working memory during recollection (e.g. Andrade et al. 1997; Merckelbach et al. 1994) by showing experimentally that healthy participants who recalled negative and positive memories reported reduced vividness and negative valence of the negative memories only after engaging in eye movements. They suggested that eye movements were an active ingredient in EMDR, contrary to CBT assumptions at the time. Crucially, there was finally a mechanism to which researchers and clinicians could point: eye movements during recall tax working memory, which reduces memory vividness and emotionality, resulting in re-encoding of a less distressing memory (van den Hout and Engelhard 2012).

In line with an alternate theory—that eye movements were effective because they increase communication between the two cerebral hemispheres (e.g. Christman et al. 2003)—many clinicians began replacing eye movements with auditory tones that alternate between the left and right headphone channels, despite a lack of evidence suggesting that this “binaural stimulation” was equivalent to eye movements or other working memory taxing behaviours (van den Hout et al. 2011). In four experiments, van den Hout et al. (2011) demonstrated that eye movements impair working memory and reduce negative memory vividness and emotionality significantly more than do binaural tones for healthy participants in the lab. Moreover, eye movements resulted in significantly larger reductions in memory emotionality and vividness than did tones among people with PTSD undergoing EMDR (van den Hout et al. 2012). These findings clarified that working memory taxation, over and above exposure to the memory, was the crucial mechanism of action in EMDR, and were thus critical to ensuring its proper implementation.

Since then, experimental (e.g. Kearns and Engelhard 2015) and clinical research (e.g. Hurley 2018) related to eye movements and EMDR has increased substantially, including several symposium presentations at WCBCT (e.g. Bendall 2019; Dominguez 2019; Lee 2019). In a recent meta-analysis of mechanisms of change, Landin-Romero et al. (2018) supported the working memory taxation and psychophysiological change hypotheses, and clinical practice guidelines recommend EMDR as an effective treatment for PTSD (NICE 2018). As such, the innovation at the core of EPP allowed researchers to determine a potential mechanism of change to explain a previously implausible treatment effect, hopefully resulting in more people receiving adequate care for their PTSD symptoms.

Perhaps even more importantly, EPP should not end where clinical research begins. There needs to be synergy between EPP and intervention science (Waters et al. 2017). Continued experimental investigation into eye movements suggests that eye movements during initial recall may lead to increased false memories (Houben et al. 2018), though van Schie and Leer (2019) were unable to replicate the finding with a larger sample. van Veen et al. (2019) recently suggested that eye movements may actually function as a safety behaviour during exposure to traumatic memories, reducing distress and vividness in the moment but ultimately leading to rebound distress at follow-up. In other words, there is more work to be done, and EPP offers a context in which to do that work well.

Where Are We Now?

CBT is no longer (and has not been for quite a while!) the underdog that needs to prove itself. There are numerous meta-analyses, systematic reviews, Cochrane Reviews, and clinical guidelines attesting to its efficacy and effectiveness for multiple mental health problems (e.g. Hofmann et al. 2012). Nonetheless, many people continue to live with psychopathology, even after receiving high-quality CBT (e.g. Ali et al. 2017; Litt et al. 2018). Access to evidence-based treatment remains elusive for many, whether due to living in remote locations, lack of public funding, or lack of trained clinicians (Gunter and Whittal 2010). The solutions to many of these problems rest in intervention science. For example, there is now strong evidence that CBT interventions delivered online are generally as efficacious as in-person CBT (e.g. Andersson et al. 2014). In some jurisdictions, publicly funded CBT is a government policy (e.g. Improving Access to Psychological Therapies, UK; The National Collaborating Centre for Mental Health 2020). Manualised training programs support paraprofessionals in administering CBT with good effectiveness rates (e.g. Bass et al. 2013). Thus, it may seem as if the role of EPP in CBT is dwindling (or should dwindle). Not so.

There remain several mental health issues that have been historically understudied, or for which maintaining mechanisms are poorly understood, hampering treatment efficacy. Even for well-understood disorders, there is significant room for treatment improvement, likely because researchers have yet to identify key mechanisms and/or the relations among those mechanisms (i.e. moderators and mediators). EPP researchers continue to innovate, test, and manipulate (variables, not people) with the goal of improving treatment for people living with psychopathology. Indeed, it is exciting to survey the numerous ongoing EPP research programs that have high potential to contribute to improvements in CBT. Below, we provide just a few examples, before moving on to our thoughts on how to advance the role of EPP for CBT.

Identifying Treatment Targets for Understudied Disorders: Body Dysmorphic Disorder

In a recent meta-analysis, Harrison et al. (2016) demonstrated that CBT was more efficacious for body dysmorphic disorder (BDD) than wait-list control and credible placebo treatments, with treatment gains maintained at 2–4 months. However, outside of controlled research trials, only a minority of people with BDD receive CBT, and remission rates are low (Phillips et al. 2013). Although CBT models of BDD focus on mechanisms such as information processing biases, overvaluation of appearance, and avoidance/safety behaviours (e.g., Wilhelm et al. 2010), evidence for the causal and maintaining roles of those factors is lacking.

More recently, EPP researchers have begun experimentally manipulating these proposed mechanisms. For example, Veale et al. (2016) expanded on previous research demonstrating that people experience more body dissatisfaction after staring into a mirror (e.g. Windheim et al. 2011). They manipulated the object of attention (self vs external) and low mood (self with negative mood induction vs without) among a large sample of unselected women. Whereas attentional direction had no effect on any of the key outcomes, people in the negative mood condition reported more body dissatisfaction and sadness after mirror gazing, suggesting that low mood plays an important role in how mirror gazing impacts BDD symptoms. These findings have important implications for mirror exposure interventions in CBT, and for targeting transdiagnostic factors such as low mood, which frequently co-occurs with psychopathology.

People with BDD engage in safety behaviours such as mirror and body/appearance checking, reassurance-seeking, hiding their disliked feature, and excessive grooming, to avoid their disliked feature being evaluated negatively and/or to transiently reduce their anxiety (Wilhelm et al. 2010). Despite hypotheses that these safety behaviours contribute to people’s preoccupations with their own perceived physical flaw and to other important factors in BDD (e.g. catastrophic misinterpretations), experimental work testing the causal and maintaining role of safety behaviours in BDD is only just beginning. Using a 1-week, pre-post experiment, Summers and Cougle (2018) found that healthy participants who increased their safety behaviour use reported greater BDD symptoms, social anxiety symptoms, body dissatisfaction, and appearance- and social-evaluation-related cognitive biases at post-test, compared to participants in control conditions. Following an appearance-threat task (having full-body photos taken), participants in the increased safety behaviour condition also reported higher levels of fear, urges to seek reassurance, and symptoms of anxiety and depression. As such, safety behaviours can cause BDD and associated symptoms, providing solid evidence for the importance of targeting them in CBT. Indeed, Wilver et al. (2020) demonstrated that fading safety behaviours in response to text message reminders reduced BDD symptoms (compared to control) among young women with elevated appearance concerns—a compelling example of how quickly experimental work can translate to potential treatment improvements. Continued experimental research on causal and maintaining factors of BDD is sorely needed to continue to improve CBT outcomes.

How EPP Advances CBT for Previously Overlooked Disorders: Paranoia

Given the importance of genetic and biological causes in psychotic disorders (and thus, the primacy of pharmaceutical therapies), researchers have focused less on cognitive and behavioural treatment targets. Nonetheless, CBT, as an adjunct to medication, prevents (e.g. Hutton and Taylor 2014) and reduces psychotic symptoms (e.g. Hazell et al. 2016; van der Gaag et al. 2014). However, effect sizes are relatively small, suggesting that more work is needed. EPP may seem particularly misplaced in psychosis research, given the presumed difficulty inherent in “manipulating” psychosis; however, hallucinations and delusions occur on a continuum just like any other symptom (Johns and van Os 2001). A fundamental tenet of EPP is that pathological behaviour is an extreme manifestation of universal cognitive processes that impact functioning for people with and without clinical disorders (e.g. Boddez et al. 2017).

Researchers have thus taken on the challenge of investigating cognitive and behavioural mechanisms relevant to paranoia, a broad symptom common to many presentations of psychotic disorders, using a well-established social exclusion paradigm—Cyberball (Williams et al. 2000). Cyberball is a virtual game in which participants toss a ball with two experimentally controlled “players”. Researchers vary the degree to which the computer-controlled players interact with the participant (i.e. the degree to which the participant is ostracised). Generally, the more people are excluded, the more they report negative affect, interpret the exclusion as intentional, and engage in maladaptive interpersonal behaviours (Hartgerink et al. 2015).
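To make the manipulation concrete, the minimal sketch below reduces the exclusion manipulation to a single parameter: the probability that a computer-controlled player throws the ball to the participant. The player labels, throw counts, and probabilities are our own illustrative assumptions, not the parameters used by Williams et al. (2000) or by the studies reviewed here.

```python
import random

def run_cyberball(n_throws=30, include_prob=0.33, seed=None):
    """Simulate computer-controlled throw decisions in a Cyberball-style game.

    include_prob is the probability that a computer player throws to the
    participant; ~0.33 approximates fair three-player inclusion, whereas
    values near 0 model ostracism. All values are illustrative only.
    """
    rng = random.Random(seed)
    holder = "computer_A"
    throws_to_participant = 0
    for _ in range(n_throws):
        if holder == "participant":
            # In the real task the participant chooses; here the choice is simulated.
            holder = rng.choice(["computer_A", "computer_B"])
        elif rng.random() < include_prob:
            holder = "participant"
            throws_to_participant += 1
        else:
            holder = "computer_A" if holder == "computer_B" else "computer_B"
    return throws_to_participant

# The inclusion and exclusion conditions differ only in this one parameter.
print("inclusion condition:", run_cyberball(include_prob=0.33, seed=1))
print("exclusion condition:", run_cyberball(include_prob=0.05, seed=1))
```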

Kesting et al. (2013) used this paradigm with a sample of healthy undergraduate students to test whether social exclusion and criticism would lead to increases in paranoia symptoms, consistent with hypotheses that threatening interpretations (i.e. cognitive biases) of social and other stressors play a causal role in some psychotic symptoms (e.g. Freeman et al. 2002). They found that people who were excluded and received negative feedback reported a greater increase in state paranoid ideation than those in the control condition (included, received neutral feedback). Social exclusion and negative feedback led to reduced self-esteem, which in turn led to greater beliefs that others were out to get them. Kaltsi et al. (2018) showed experimentally that people who held positive beliefs about paranoia experienced greater paranoid ideation in response to Cyberball than people who held negative beliefs about paranoia. EPP researchers further demonstrated that criticism amplifies social exclusion’s effect on paranoia, whereas self-esteem is protective (Butler et al. 2019). Social exclusion can also trigger reductions in both explicit and implicit self-esteem, each of which is associated with paranoia post-Cyberball (Stewart et al. 2017).

Despite the seeming difficulty using EPP to better understand psychotic symptoms, Brown et al. (2019) found over 40 experiments conducted in the last 20 years with the objective of determining causes of psychosis. They found that the majority of non-intervention experiments were conducted with non-clinical samples, and that studies involving manipulation of a purported mechanism provided the “most robust causal evidence” (p. 25). Paranoia was by far the most-studied symptom, leading Brown et al. (2019) to advocate for more EPP research aimed at investigating causal and maintaining factors of other symptoms.

Freeman (2016) reviewed the evidence for focusing on specific mechanisms, including worry, self-beliefs, and mistrust when treating paranoia and persecutory delusions. Although one meta-analysis shed doubt on the long-term efficacy of CBT for delusions (Mehl et al. 2015), the more recent interventions targeting mechanisms identified by EPP showed more positive effects on symptoms. In their meta-regression, Sitko et al. (2020) demonstrated that the efficacy of CBT for delusions has improved over time. Clearly, increased understanding of causes and maintaining factors of delusions has translated to increased efficacy of CBT for these symptoms.

The Importance of Theory and Clinical Relevance

We contend that “where we are” currently, in EPP, is at something of a crossroads, between needing to focus on specific mechanisms to better prioritise treatment targets and getting so caught up in minute details that we miss the forest for the trees. Indeed, one thing shared by the research programs discussed thus far is a focus not just on identifying some kind of phenomenological “uniqueness” specific to a disorder, but on testing a theory grounded in experiences working with people living with mental health problems. Although EPP research typically begins when researchers identify human processes that seem exaggerated in people with psychopathology, that identification is most likely to translate into an important treatment target when the mechanism of change is theoretically and clinically plausible. Researchers began studying memory and memory confidence in OCD because clients often reported that they checked because they “could not remember” whether they had checked properly. Identifying the role of eye movements came about because researchers and clinicians alike were not satisfied with knowing that a treatment worked without knowing why it worked. The research on paranoia has progressed so much further than research on cognitive and behavioural causes of other psychotic symptoms because of its basis in Freeman’s (Freeman 2016; Freeman et al. 2002) plausible, complex, and evidence-based model of persecutory delusions, which focuses on clients’ real concerns about deciding who to trust, when, and on what basis. In other words, when we use EPP to validate experimentally what our clients already know, we seem to get much closer to understanding what thoughts and behaviours drive psychopathological symptoms. This perspective requires keeping clinical relevance at the heart of our research, rather than simply studying what we think might be interesting. It also requires testing theories, rather than merely testing relationships among variables. As researchers focused on improving treatment, we need to think deeply about what we want to know about people who experience mental health symptoms, and why we want to know it.

For example, in a previous review, we found that although cognitive science research had provided much knowledge about the phenomenology of OCD, it had contributed little to improving treatments for people living with OCD (Ouimet et al. 2019). We critiqued work on cognitive deficits and highlighted problems with paradigm specificity and failure to account for the alternative hypothesis—that participants’ beliefs and symptoms caused the so-called deficits. Indeed, the major advances in CBT have come not from documented brain differences, but from the thoughts, behaviours, and emotions that people living with psychopathology describe. We must consider that some lines of EPP research will be more useful than others for improving treatment.

In principle, experimental psychopathologists are more interested in core human processes that occur across the range of human functioning than in diagnostic categories (van den Hout et al. 2017). Nonetheless, much of the published research focuses on people who experience a specific type of psychopathology. We contend, however, that the programs of research that have been the most effective in developing and refining CBT have tended to focus on the causes of a particular symptom rather than of a particular disorder.

Although it may thus appear that EPP tends towards “splitting” rather than “clumping”, there is a strong connection between this kind of specificity and transdiagnostic treatments for various problems. For example, although Rachman (1984) originally described safety behaviours as they occurred for people with agoraphobia, we now have strong evidence that people use safety behaviours across many types of psychopathology. Similarly, we associate paranoia with psychotic disorders; however, it may play a role in other disorders where social/interpersonal functioning is affected, such as social anxiety disorder, borderline personality disorder, and posttraumatic stress disorder. This focus on the specific allows for strong experimental design and internal validity, which over time contributes to better understanding how a specific cognitive or behavioural process impacts symptoms across multiple presentations.

As such, we encourage EPP researchers to put theory and clinical relevance at the forefront of their research programs, and to prioritise studying human processes that fit logically as potential mechanisms. Essentially, we need to focus on the forest and the trees. One of the enormous advantages of participating in an international conference dedicated to CBT is the opportunity to get a bird’s eye view of the exciting EPP and CBT research going on worldwide.

Where Are We Going?

Looking back and looking forward, a few areas stand out as necessary to ensure EPP remains a solid and effective bridge between basic science and clinical practice. First, the push towards improving scientific practices and transparency in psychology needs to be more quickly and more forcefully integrated into EPP to ensure replicable methods and findings—and thus, reliable treatment targets for CBT. Second, we need more EPP research to identify the cognitive and behavioural causes of symptoms and disorders that remain poorly understood, or for which CBT produces suboptimal outcomes. Third, for therapies that we know are efficacious, we need to start testing which components are truly “active” in reducing people’s symptoms, with the goals of increasing CBT efficiency and access. Finally, we are particularly guilty of ignoring diversity in EPP and CBT research; change on this front is long overdue.

Improving the Science of Psychopathology

The “Open Science” movement is no longer novel. In 2015, the Open Science Collaboration (2015) published the results of their large-scale, multi-site replication project wherein 270 researchers conducted 100 direct replications of journal articles published in 2008 from three leading journals in psychology. They directly reproduced fewer than 40% of the significant findings of the original research. Their work served as a call to action for psychology researchers to critically examine and improve our research practices. Since then, we have witnessed important changes in publication practices, such as pre-registering methods, hypotheses, and analyses to reduce questionable research practices (e.g. p-hacking, HARKing) and making data and analysis code openly available alongside published articles for verification.

Nonetheless, EPP and CBT research have been slower to implement these practices. According to the Center for Open Science (n.d.), over 250 journals currently offer the option to submit articles as Registered Reports. Of these, 11 (4%) appear relevant to clinical psychology/psychiatry, CBT models and treatments, or psychopathology. For comparison, 24 (9%) appear relevant to cognitive psychology (excluding those specific to neuroscience); however, there are far fewer cognitive psychology journals in general (154; Scimago Journal and Country Rank 2019a) than clinical psychology journals (295; Scimago Journal and Country Rank 2019b). Hildebrandt and Prenoveau (2020) analysed the rigour and reproducibility of research design and data analyses in all articles published in 2018 in arguably the highest impact EPP journal, Behaviour Research and Therapy; none met all the criteria they outlined for statistical rigour and reproducibility, and only 14% met just under half of the criteria. They noted particular areas for improvement, including pre-registering hypotheses, reporting statistical power, including data and analysis code and output, and linking hypotheses to planned analyses.

At the WCBCT, one of us (A.O.) chaired a well-attended panel discussion entitled Open science and reproducibility in CBT research: Where do we go from here? Panellists, including Andreas Burger (KU Leuven), Michelle Craske (UCLA), Peter McEvoy (Curtin University), and Bethany Teachman (University of Virginia), discussed reasons that CBT research has been slower to take up Open Science practices, and avenues for facilitating such practices moving forward. Improving the science of psychopathology has important implications for improving CBT and increasing access to treatment advances for those who need it most.

Improving CBT

Although CBT is the most empirically validated treatment for a wide range of clinical disorders, it produces suboptimal outcomes for some problems. The reasons for CBT’s underperformance are multifactorial and include lack of research in some areas and therapeutic elements that clients find difficult. Perhaps most importantly, researchers, at times, have skipped the crucial step of identifying and testing causal and maintaining mechanisms. Below, we focus on eating disorders research and treatment as our exemplar, because there is considerable room for improvement in CBT efficacy (particularly for anorexia nervosa) and a notable lack of experimental work attempting to test the core tenets of CBT models of eating disorders.

Historically, eating disorders, and the people who lived with them, were considered intransigent, a term that puts the onus of treatment failure on the client rather than where it belongs—on the researchers and clinicians who have not yet figured out how to help them. Pennesi and Wade (2016) noted that of 54 theoretical models of disordered eating identified in their systematic review, fewer than 20% translated to tests of interventions. They also identified overlapping core features of the models (e.g. emotion regulation, external pressure, maladaptive beliefs; Pennesi and Wade 2016); however, the evidence for any of these variables as symptom causes or perpetuators is minimal (Jansen 2016). The efficacy of CBT for eating disorders is modest at best (Bulik 2014; Galsworthy-Francis and Allan 2014; but see Linardon et al. 2017), which may be due in part to a lack of EPP work validating the theories that underlie these treatments.

One of the most influential models of eating disorders—the transdiagnostic maintenance model of eating disorders (Fairburn et al. 2003)—holds that multiple mechanisms perpetuate both anorexia and bulimia nervosa symptoms (as well as those of other eating disorders). Targeting those mechanisms via CBT can reduce symptoms (NICE 2017). However, Murray et al. (2019) found that although specialised treatments led to better weight gain outcomes among people with anorexia nervosa at post-treatment than did treatment-as-usual, there were no differences on psychological factors, and weight outcomes were not maintained at follow-up.

To illustrate the shortage of experimental psychopathology work related to models and treatments of eating disorders, we focus on one important hypothesis: that dietary restriction reinforces anorexia and bulimia nervosa symptoms. People with anorexia who do not binge-eat ultimately experience something of a “starvation syndrome”, wherein they withdraw socially, become even more preoccupied with food and eating, but actually experience heightened fullness and decreased appetitive reward (Fairburn et al. 2003). To explain this relationship, some researchers, citing correlational data, proposed that anorexia symptoms stem from biological abnormalities in reward system functioning (e.g. Wierenga et al. 2015). However, Jansen (2016) highlighted a series of experiments that point the finger at restriction as the primary cause of altered reward system functioning. Essentially, people with anorexia expose themselves frequently to high-reward foods, but extinguish their reward expectancies by failing to actually eat these foods. These experimental findings suggest that treatments incorporating exposure to eating high-reward foods may be critical to improving CBT efficacy among the minority of people with eating disorders who engage in restriction without binge-eating. Moreover, inhibitory learning accounts of extinction suggest that ensuring exposures are varied, occur in multiple contexts, and prioritise new learning will help reduce relapse (Jansen 2016; Reilly et al. 2017). However, we need EPP to clarify these mechanisms and how best to target them in CBT for eating disorders.

With respect to bulimia nervosa (and anorexia nervosa, binge-purge type), researchers theorise that people who restrict their eating are more likely to eventually lose control of their eating, binge, purge to correct for the binge, and further exacerbate their maladaptive beliefs related to overvaluation of weight and shape (Fairburn et al. 2003). However, there is limited evidence supporting the causal role of dietary restriction on binge-eating. Cooper et al. (1993) tested the effects of maladaptive beliefs on bulimia symptoms by priming participants with bulimia nervosa with word pairs either related (experimental condition) or unrelated (control condition) to negative and positive interpretations of weight and shape. People in the experimental condition reported more negative thoughts after the cognitive induction and ate less food in a “taste-test” than the control condition; however, they also reported the same number of objective binges and fewer subjective binges in the following 24 h. Although negative thoughts about weight and shape activated dietary restriction, the restriction did not trigger more binge-eating (Cooper et al. 1993). In a longitudinal weight loss study, Anton et al. (2009) found no differences in appetite over 6 months between people engaged in various caloric restriction diets. Beliefs about restriction—manipulated by describing a shake as indulgent and high-calorie or low-fat and low-calorie—increased physiological but not subjective hunger cues, suggesting that perceptions of restriction may be more important than actual restriction (Crum et al. 2011).

As summarised eloquently by Jansen (2016), the bulk of the experimental evidence supporting dietary restriction as a cause of binge-eating does not include experimental manipulation of dietary restriction. Instead, researchers typically classified participants as “high-restrained” or “low-restrained” eaters, and then tested the impact of different variables on their consumption of high-reward food following an initial forced food load. Moreover, although researchers propose that the restrained eating model may explain bulimia nervosa (e.g. Fairburn et al. 2003), very little of the dietary restriction research is focused on eating disorders, blurring the lines among dieting, overeating, binge-eating, and health interventions for overweight or obese people. We are not arguing that dietary restraint is unimportant to understanding bulimia, or that clinicians should not target dietary restraint in their interventions. Rather, we are arguing that we do not know the exact role of dietary restraint in bulimia nervosa, which may be limiting our ability to successfully reduce symptoms in treatment. We need EPP to advance CBT for eating disorders and other forms of psychopathology for which treatment outcome is suboptimal.

Identifying Active Components of CBT

RCTs are the gold standard of intervention science, and typically feature tests of complete treatment packages. However, we also need to understand which elements of therapy actually lead to the most symptom change. Despite the historical debate regarding whether cognitive or behavioural change was the driving mechanism for treatment efficacy (e.g. Ougrin 2011), researchers and clinicians alike now appear to agree that symptom reduction occurs via diverse cognitive and behavioural mechanisms of change (Hofmann and Hayes 2019). Although it is no longer interesting or useful to pit the “B” against the “C”, it is important to test the efficacy of individual techniques, an objective for which EPP is especially well suited.

There are many important reasons for homing in on particular techniques, whether to add new targets to existing treatments or to validate the efficacy of an existing technique on a specific mechanism. Given the well-documented difficulty accessing evidence-based treatments (Gunter and Whittal 2010), understanding which techniques give us the “greatest bang for our buck” should help reduce the length of some treatments and work through waiting lists a little more quickly. Additionally, understanding which technique works for whom may help clinicians individualise their treatment administration and achieve symptom reduction earlier. Finally, mini-intervention trials are experiments in their own right. They provide further evidence that a particular mechanism plays a role in maintaining symptoms, or identify factors that are not as crucial as we believed, allowing researchers to focus on the next candidate mechanism.

One particularly interesting EPP program exploring intrusive memories in PTSD demonstrated that healthy people who engage in some kind of visuospatial task (e.g. Tetris) after an analogue traumatic event actually experience fewer future trauma intrusions due to interference with encoding of the memory trace (Holmes et al. 2009). Moreover, although playing Tetris disrupts and reduces involuntary intrusions, it leaves voluntary memories intact—providing a selective interference effect (see James et al. 2016, for a review; Badawi et al. 2020, for a replication by a different research group). In a clever series of three experiments, Lau-Zhu et al. (2019) further demonstrated that the interference effect was not attributable to cue overlap, attentional capture, or working memory load during retrieval, providing further experimental support for Tetris post-trauma as a potential inoculation of sorts against traumatic intrusions.

Iyadurai et al. (2018) translated these laboratory-based findings to the clinic by conducting an RCT in an emergency room. Participants who had experienced a motor vehicle collision were randomly assigned to either a memory cue + 20 min of Tetris condition or a control activity log condition. Participants who played Tetris after being reminded of their collision reported substantially fewer intrusions during the week following the collision and significantly less distress related to the intrusions at 1-week follow-up. They overwhelmingly rated the intervention as helpful, easy to implement, and minimally burdensome. Horsch et al. (2017) demonstrated similar positive effects in women who had experienced an emergency caesarean section, suggesting that the intervention likely generalises to diverse types of traumatic experiences. Whether the effects on intrusions have positive downstream impacts on other PTSD symptoms and the degree to which other variables (e.g. negative affect) moderate the effect of the visuospatial task remain to be seen (Cristea et al. 2018). By testing specific techniques validated in EPP research, we get closer to understanding if, how, and why interfering with memory encoding may reduce PTSD incidence.

Typically, a specific technique is just one of many components that make up an intervention. For example, dialectical behaviour therapy (DBT; Linehan 2014) teaches a wide variety of skills to improve emotion regulation, distress tolerance, mindfulness, and interpersonal effectiveness. Although DBT is an evidence-based treatment for borderline personality disorder (e.g. Cristea et al. 2017), it is not clear how each technique or skill targets a particular mechanism, or even whether each technique is effective.

The “half-smile” skill, for example, is a distress tolerance skill based on the facial feedback hypothesis (Strack et al. 1988)—that producing a smiling facial expression provides people with (unconscious) feedback that improves their affective experience. In a famous set of two studies, people who held a pen between their teeth (unknowingly simulating a smile) rated cartoons as funnier than people who held a pen with their lips (unknowingly simulating a frown) (Strack et al. 1988). According to Google Scholar, that article had been cited over 1600 times by 2016, when Wagenmakers et al. (2016) conducted a Registered Replication Report of the original study, which included 17 independent replication attempts. A meta-analysis of the findings concluded that there was no effect of smiling on cartoon ratings, shedding substantial doubt on the potential helpfulness of the “half-smile” technique. We recently explored smiling as an emotion regulation strategy in response to viewing negative images (Bahl and Ouimet 2020, manuscript submitted for publication) and during a conversation task (Bahl and Ouimet 2020, manuscript in preparation). Overall, smiling offered no advantages over maintaining a neutral expression or reacting as usual on cognitive, behavioural, subjective, or psychophysiological outcomes, providing further evidence that the “half-smile” technique may be inefficacious at best, and may thus consume more therapeutic time than it is worth.

Testing specific techniques also leads to improvements in their impact. In their classic CBT model of social anxiety disorder, Clark and Wells (1995) theorise that self-focused attention during exposure to a perceived social threat, coupled with excessive use of safety behaviours and biased post-event processing, maintains people’s symptoms. CBT based on their model leads to robust decreases in social anxiety disorder symptoms (e.g. Stangier et al. 2011). Moreover, video feedback—a set of behavioural experiments designed specifically to target self-focused attention and safety behaviours—leads to immediate decreases in social anxiety symptoms and in negative self-perception ratings (McManus et al. 2009; Schreiber et al. 2015). Some researchers reported equivocal findings on the efficacy of video feedback with clinical (Smits et al. 2006) and analogue (Rodebaugh 2004) samples. Following continued EPP to understand the disparate findings and clarify if, when, how, and for whom video feedback might be helpful (e.g. Moscovitch et al. 2013), Warnock-Parkes et al. (2017) refined and re-tested the technique. Their modifications resulted in greater effect sizes for reductions in negative self-perceptions and social anxiety compared to McManus et al. (2009). In other words, initial tests of a specific technique resulted in further EPP and, subsequently, refinement of an already efficacious technique—hopefully providing more relief to a larger proportion of treatment seekers. These findings also bolster support for self-focused attention and safety behaviours as key maintaining mechanisms that must be targeted early in social anxiety treatment.

Mini-intervention trials are also vital to determine which factors identified by EPP are actually causal or maintaining mechanisms that need targeting during CBT, rather than simply correlates or consequences of the symptoms. Cognitive bias modification interventions are targeted, computerised techniques that researchers developed to try to reverse attentional, interpretation, memory, and behavioural biases exhibited by people with psychopathology (initially identified via correlational and experimental research). For example, participants may be asked to complete ambiguous sentences with positive (vs negative) outcomes, to identify a target that always appears in the location of a positive (vs negative) stimulus, to use a joystick to virtually approach a positive (vs negative) stimulus, or to actively forget negative (vs positive) stimuli (see Hertel and Mathews 2011). In an ingenious series of experiments, MacLeod et al. (2002) expanded on previous findings demonstrating that people with anxiety had an attention bias for threat-relevant information by modifying the dot-probe paradigm to actually manipulate attention bias, as sketched below. They demonstrated that threat-relevant attention biases could cause anxious reactions to stress. Rapid proliferation of research on attention bias and psychopathology followed, ultimately leading to attention bias modification paradigms as potential interventions, and an ongoing spirited debate about their usefulness (e.g. Emmelkamp 2012).
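The sketch below illustrates the core of that manipulation under simplifying assumptions: in the standard assessment version of the dot-probe task, the probe replaces threat and neutral stimuli equally often, whereas in the training version the contingency is shifted so that the probe almost always replaces the neutral stimulus. The word lists, trial counts, and probabilities are illustrative assumptions for exposition, not the materials or parameters used by MacLeod et al. (2002).

```python
import random

def make_dot_probe_trials(n_trials=96, p_probe_at_neutral=0.5, seed=0):
    """Generate trial specifications for a dot-probe task.

    p_probe_at_neutral = 0.5 yields a standard assessment block; values
    near 1.0 yield an 'attend-away-from-threat' training block in which
    the probe nearly always replaces the neutral word. Illustrative only.
    """
    rng = random.Random(seed)
    threat_words = ["failure", "danger", "ridicule"]
    neutral_words = ["carpet", "window", "notebook"]
    trials = []
    for _ in range(n_trials):
        threat_position = rng.choice(["top", "bottom"])
        neutral_position = "bottom" if threat_position == "top" else "top"
        # The training contingency lives in this single line:
        probe_position = (neutral_position if rng.random() < p_probe_at_neutral
                          else threat_position)
        trials.append({
            "threat_word": rng.choice(threat_words),
            "neutral_word": rng.choice(neutral_words),
            "threat_position": threat_position,
            "probe_position": probe_position,
        })
    return trials

assessment_block = make_dot_probe_trials(p_probe_at_neutral=0.5)   # measures bias
training_block = make_dot_probe_trials(p_probe_at_neutral=0.95)    # attempts to modify bias
```

Attention bias is then typically indexed by comparing reaction times to probes that appear in the location of the threat versus the neutral stimulus.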

Despite the innovation in the original work by MacLeod et al. and some promising initial findings supporting attention bias modification as a low-cost, easily accessible intervention for psychopathology (e.g. Hakamata et al. 2010), several meta-analyses have failed to support its reliability in substantially reducing symptoms of anxiety and depression over the long term (e.g. Fodor et al. 2020; Mogoaşe et al. 2014). Moreover, in a meta-analysis of attention bias modification RCTs, Kruijt et al. (2019) estimated the mean baseline (i.e. pre-intervention) threat-related attention bias on the dot-probe paradigm among over 1000 people with anxiety disorders. They found no evidence for a threat-relevant attention bias in people with anxiety disorders, shedding some doubt on a long-held belief in the literature. However, Kruijt et al. focused on bias towards threat (vs neutral or positive stimuli) among people with anxiety disorders, rather than comparing bias across people with varying degrees of anxiety. Indeed, in a similar meta-analysis that focused on individual differences at the participant level, people who reported higher social anxiety also demonstrated greater attentional biases towards threat (Price et al. 2016), further supporting the general contention that when people are anxious, they preferentially attend to anxiety-relevant stimuli.

Unfortunately, despite the current evidence that attention bias modification is unlikely to lead to reliable improvements in psychopathology, the cart ran ahead of the horse. Building on the excitement and dissemination of the initial attention bias modification findings, app developers boast of the large evidence base backing up the ability of their numerous for-purchase bias modification apps to “cure” people of their mood and anxiety disorders with minimal effort, time, and money (see, e.g. BMI n.d.). Our objective of helping people may lead us to push interventions into the clinic before they are well understood. While this urge is understandable, it can cause more harm than good, potentially teaching clients that therapy is not helpful, or at least cannot help them. The role of EPP goes beyond studying what causes psychopathology; we have an ethical obligation and responsibility to ensure interventions work, understand how and for whom they work, and identify any iatrogenic impacts before they end up available for widespread public use. For example, Laurens et al. (2020) assessed the feasibility of an alcohol avoidance training app among a large sample of people experiencing problems with alcohol use. They found some evidence for the app’s ease of use and efficacy in reducing alcohol consumption. However, there was a large drop-out rate (over 60%), highlighting a need for improvement. Given the proliferation of mental health apps, we need controlled studies of their advantages and limitations to complement the EPP work that serves as their foundation.

To address the often lengthy delay in translating findings from EPP into CBT, researchers may want to consider Blackwell et al.’s (2019) leapfrog design, a version of adaptive rolling trials used previously in cancer treatment research, modified for psychological treatment development and implementation. Essentially, researchers can compare multiple versions of a standard treatment in a clinical trial, and use repeated Bayesian modelling to quickly identify treatment variants that are or are not likely to improve upon the standard treatment. Probably inefficacious arms are dropped earlier (and perhaps sent back to the lab for improvement) and new insights from EPP are translated to new arms to be tested in the already established treatment trial. This research design reduces the likelihood of false negatives and false positives, accelerates the inclusion of new EPP research insights into treatment trials, requires fewer participants, and increases the efficiency and cost-effectiveness of clinical trials. Indeed, this design has enormous potential to strengthen and facilitate the EPP-CBT bridge.
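To give a flavour of the interim decision-making on which such a design relies, the minimal sketch below compares hypothetical treatment variants to a standard arm using simple Beta-Binomial posteriors over response rates, dropping variants that are very unlikely to beat the standard and promoting those that very likely do. The response counts, decision thresholds, and Monte Carlo comparison are our own illustrative assumptions; they are not the specific models or stopping rules proposed by Blackwell et al. (2019).

```python
import random

def prob_new_beats_standard(success_new, n_new, success_std, n_std,
                            n_draws=20000, seed=1):
    """Monte Carlo estimate of P(response rate of new arm > standard arm),
    using independent Beta(1 + successes, 1 + failures) posteriors."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_draws):
        p_new = rng.betavariate(1 + success_new, 1 + n_new - success_new)
        p_std = rng.betavariate(1 + success_std, 1 + n_std - success_std)
        wins += p_new > p_std
    return wins / n_draws

# Interim look at two hypothetical arms against the standard treatment.
# Thresholds (drop below 0.10, adopt above 0.95) are illustrative only.
standard = (24, 60)  # (responders, patients) in the standard arm
for arm, (responders, n) in {"variant_A": (20, 40), "variant_B": (9, 40)}.items():
    p = prob_new_beats_standard(responders, n, *standard)
    if p < 0.10:
        decision = "drop (send back to the lab for refinement)"
    elif p > 0.95:
        decision = "adopt as the new standard"
    else:
        decision = "continue recruiting"
    print(f"{arm}: P(better than standard) = {p:.2f} -> {decision}")
```

In a running leapfrog trial, comparisons of this kind would be repeated as data accumulate, with new EPP-informed variants added as existing arms are resolved.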

Prioritising Diversity in CBT and Reducing Mental Health Disparities

There is a wealth of evidence that people from minority groups are underrepresented in CBT trials (Mak et al. 2007; Mendoza et al. 2012; Wetterneck et al. 2012; Williams et al. 2010), and that non-culturally adapted psychological interventions (including CBT) lead to insufficient improvement for people from underrepresented minority groups (Benish et al. 2011; Hall et al. 2016; Huey and Tilley 2018; Windsor et al. 2015). In some cases, there is not enough well-conducted research to determine the efficacy of interventions for members of minority groups (Mak et al. 2007; Steinka-Fry et al. 2017). We need adequately powered, methodologically rigorous experiments with underrepresented participants to understand which mechanisms to target within diverse populations (e.g. Gregory 2016). This issue is further complicated by researchers’ use of outcome measures that themselves are not culturally adapted, and do not take into consideration how people from different cultural groups interpret the questionnaire and interview items (Helms 2015). Rathod et al. (2018) succinctly summarised the impact of these limitations: “…the generalization of findings of the effectiveness of therapeutic interventions to many cultural groups may not be valid or even appropriate” (p. 165).

The lack of attention to cultural diversity is even more noticeable in EPP. We encountered considerable difficulty locating experiments wherein researchers had a primary goal of understanding how elements of diversity play a role in the cause and/or maintenance of psychopathology. There is strong evidence from correlational research that people from different cultural groups experience symptoms differently (see Chapman et al. 2014; Moleiro 2018 for reviews), and also differ in terms of risk, protective, and associated factors (e.g. Hall-Clark et al. 2016; Williams et al. 2012a, b, c). However, cultural diversity is rarely more than a statistical consideration in EPP research, and even in that case, we fall far short of APA recommendations for reporting and considering diversity in research (APA 2019, p. 79).

We quickly reviewed the sociodemographic reporting in the first two issues published in 2020 by two leading experimental psychopathology journals. Across 46 articles, there were 32 experiments or quasi-experiments, all of which we counted as experiments (excluding secondary analyses of data). Authors of the experiments consistently reported age and sex/gender; however, only 9 (less than 30%!) reported any other sociodemographic factors (e.g. marital status, education level). Notably, only one EPP article described its sample’s self-reported race/ethnicity. Given the evidence for cultural differences in symptoms and potential mechanisms, we should expect that there are important cultural differences in causal/maintaining models, differences that require not only adaptations of existing treatments but also novel treatments that include culture as a core feature through which beliefs and behaviours develop. How can we hope to improve CBT effectiveness and access for all people who need it if we are not even measuring and reporting the diversity of the participants in our experiments?

It is thus incumbent upon EPP researchers to actively consider cultural diversity at the outset of their study design, including conceptualisation/theory, recruitment, methods and measurement, data analysis, interpretation, and conclusions. Obviously, we are not suggesting that EPP researchers should find “clever” ways to manipulate diversity. At a bare minimum, measuring and describing the sample’s sociodemographic characteristics are vital. But we can do much more, starting with intentionally recruiting diverse samples. The purported difficulty of including participants from underrepresented minority groups is not a valid excuse for using almost all-White samples. Williams and colleagues describe current barriers to participating in research for Black Americans and offer evidence-based and tangible solutions for improving both recruitment and retention of these participants (Williams et al. 2012b, d, 2013a, b). In a recent meta-analysis, Pieterse et al. (2012) demonstrated that Black Americans who reported more exposure to racism also reported greater mental distress and psychopathological symptoms. So, the question is not “Is being a member of an underrepresented minority group a risk factor for psychopathology?”, but instead “How do the experiences of people from underrepresented minority groups put them at greater risk for psychopathology?”, a question that EPP researchers are exceptionally poised to answer.

Conclusion

EPP has been instrumental to CBT since its inception. It has been especially useful when researchers used theoretically driven and rigorous research designs to focus on specific processes that align with the experiences of people living with psychopathology. To navigate the most important CBT crossroads, EPP researchers need to expand their efforts by including diverse populations, prioritising open and transparent research practices, and clarifying which therapeutic elements are the most important for whom. In so doing, they will ensure that EPP serves not just as a bridge between basic and applied science, but as a framework for understanding core processes in psychopathology and translating them directly into efficacious treatments to reduce suffering and increase well-being.