N-back Versus Complex Span Working Memory Training

Blacker, Kara J.; Negoita, Serban; Ewen, Joshua B.; Courtney, Susan M.

doi:10.1007/s41465-017-0044-1

Original Article
Published: 16 October 2017

Volume 1, pages 434–454, (2017)

Journal of Cognitive Enhancement




Abstract

Working memory (WM) is the ability to maintain and manipulate task-relevant information in the absence of sensory input. While its improvement through training is of great interest, the degree to which WM training transfers to untrained WM tasks (near transfer) and other untrained cognitive skills (far transfer) remains debated and the mechanism(s) underlying transfer are unclear. Here we hypothesized that a critical feature of dual n-back training is its reliance on maintaining relational information in WM. In experiment 1, using an individual differences approach, we found evidence that performance on an n-back task was predicted by performance on a measure of relational WM (i.e., WM for vertical spatial relationships independent of absolute spatial locations), whereas the same was not true for a complex span WM task. In experiment 2, we tested the idea that reliance on relational WM is critical to produce transfer from n-back but not complex span task training. Participants completed adaptive training on either a dual n-back task, a symmetry span task, or on a non-WM active control task. We found evidence of near transfer for the dual n-back group; however, far transfer to a measure of fluid intelligence did not emerge. Recording EEG during a separate WM transfer task, we examined group-specific, training-related changes in alpha power, which are proposed to be sensitive to WM demands and top-down modulation of WM. Results indicated that the dual n-back group showed significantly greater frontal alpha power after training compared to before training, more so than both other groups. However, we found no evidence of improvement on measures of relational WM for the dual n-back group, suggesting that near transfer may not be dependent on relational WM. These results suggest that dual n-back and complex span task training may differ in their effectiveness to elicit near transfer as well as in the underlying neural changes they facilitate.


In the past 15 years, a substantial body of research on the effects of cognitive training has accumulated. The overarching goal of cognitive training is to improve specific or broad-ranging cognitive skills, and numerous approaches have been examined with this goal in mind, such as video game training (Green et al. 2016; Powers et al. 2013), working memory (WM) training (Au et al. 2015; Morrison and Chein 2011; Shipstead et al. 2012), meditation training (Chiesa et al. 2011; Lutz et al. 2008; Tang et al. 2015), non-invasive brain stimulation (Berryhill et al. 2014; Parkin et al. 2015), aerobic exercise (Hillman et al. 2008), and many others (for a comprehensive review, see Simons et al. 2016). Of these approaches, WM training has arguably received the most attention and investigation, but it has also been surrounded by the most controversy.

The seminal study by Jaeggi et al. (2008) demonstrated that training on a WM task yielded improvement on a measure of fluid intelligence (Gf). While this result has been replicated many times (e.g., Jaeggi et al. 2014; Jausovec and Jausovec 2012; Rudebeck et al. 2012; Stephenson and Halpern 2013; for a meta-analysis, see Au et al. 2015), others have failed to replicate this effect (e.g., Redick et al. 2013; Thompson et al. 2013; for a review, see Shipstead et al. 2012). Specifically, this area of investigation has focused on “far” transfer, where the distance of transfer reflects the degree of similarity between the trained task (e.g., dual n-back) and untrained tasks (e.g., matrix reasoning tasks). Some of the debate in the WM training literature has surrounded the methods used to draw conclusions about transfer. For example, many early WM training studies, like Jaeggi et al. (2008), utilized passive control groups that did not engage in any training and were simply tested on the transfer measures twice. It is now widely accepted that making any firm conclusion about transfer effects requires an active control group, which controls for expectations, contact with experimenters, and other factors (see Simons et al. 2016 for a review of best practices). Another early criticism of the cognitive training field was the issue of inadequate baselines, whereby transfer might be concluded even when the apparent effect arose from the control group performing worse after training (for a detailed consideration of this issue, see Boot et al. 2013). Furthermore, a critical feature of effective training that has evolved with the field is the use of adaptive training programs. For example, Lövdén et al. (2010) argue that a fundamental prerequisite for successful cognitive training is a mismatch or imbalance between environmental demands and actual brain supply. To create this prolonged mismatch, training tasks must be challenging but manageable with a high degree of effort. This balance between keeping a task difficult enough that the participant is not bored and easy enough that the participant does not become frustrated is typically achieved through adaptive paradigms that keep the effort and feasibility level consistent for each individual participant.

Further, a source of heterogeneity in this literature that may account for some of the inconsistent results is the diverse set of WM tasks that have been used for training purposes. The most commonly used training task is a variant of the single or dual n-back task, but others have used complex span WM tasks (Kane et al. 2004; Redick et al. 2012), or commercialized programs like Cogmed (e.g., Holmes et al. 2009; Klingberg et al. 2005). Indeed, in our review of the literature, we found that far transfer to Gf was reported most often in response to dual n-back training (Colom et al. 2013; Jaeggi et al. 2008, 2010b, 2014; Jausovec and Jausovec 2012), whereas there is no evidence of far transfer to Gf from complex span tasks (Chein and Morrison 2010; Richmond et al. 2011). While there are examples of dual n-back training not resulting in far transfer to Gf, this type of training has shown the most promise in effectively improving Gf. Therefore, we sought to identify the “active ingredient” in the dual n-back that may make it more efficacious. Specifically, we aimed to directly compare two of the most commonly used tasks for both assessment and training of WM, namely an n-back task and a complex span task. N-back and complex span tasks, while both used to measure WM, have been shown to be only weakly correlated with one another (for a meta-analysis, see Redick and Lindsey 2013). To the best of our knowledge, no previous study has directly compared n-back and complex span WM training within a single study.

Here we hypothesized that a critical difference between n-back and complex span tasks is their reliance on maintaining and updating relational information in WM (Jonides et al. 1997; Miyake et al. 2000). Matrix reasoning tasks that are typically used to assess Gf also involve extracting relational information between stimuli (Carpenter et al. 1990). Previous work from our group has demonstrated a neural dissociation between maintaining concrete and relational information in WM (Ackerman and Courtney 2012; Blacker and Courtney 2016; Blacker et al. 2016; Ikkai et al. 2014). Specifically, these studies have shown that maintaining a concrete piece of sensory information, such as a spatial location, is supported by distinct neural substrates as compared to maintaining a spatial relationship that is independent of the original sensory location. Other groups have also shown dissociable neural substrates for processing or maintaining relational information from other types of concrete stimuli (Badre 2008; Bahlmann et al. 2014; Libby et al. 2014). Based on this previous work, we hypothesized that the n-back task relies more on relational information than a complex span task does due to the constant need for spatiotemporal updating of information. It can be reasoned that n-back tasks may involve multiple forms of relational information depending on the task stimuli. For example, in a spatial n-back task, participants must remember both the spatial relationship between a current and a previously presented item and the “temporal” relationship, that is, how many items have passed between any two given stimuli. While most n-back tasks likely involve some form of relational information, here we focused on visuospatial versions that seem to have multiple relations to maintain.

We first tested this hypothesis in a cross-sectional study examining individual differences in WM for spatial relations versus locations and performance on n-back and complex span tasks, as well as a measure of Gf. Next, we sought to test whether this differential reliance on relational information could explain the differences in training efficacy found between n-back and complex span tasks in the literature by examining near and far transfer effects in groups trained on either one task or the other. Further, the current study aimed to investigate the neural underpinnings of effective WM training by comparing the effects of these two training tasks on neural activity during performance of a spatial relation and location WM task using EEG. In particular, we hypothesized that n-back training would have a greater effect on neural activity during relational WM than would complex span training.

We examined alpha power (8–13 Hz) because a body of research has emerged suggesting that oscillations in this range are involved in the maintenance of information in WM and have been linked to individual differences in WM performance (for a review, see Roux and Uhlhaas 2013). Alpha band oscillations have been observed in a number of sensory modalities during WM maintenance (Haegens et al. 2010; Herrmann et al. 2004; Kaiser et al. 2007). For example, it has recently been demonstrated that the topography of alpha tracks the spatial location of to-be-remembered items in WM (Foster et al. 2016). The majority of work has focused on alpha over posterior brain regions, which is thought to reflect the inhibition of task-irrelevant brain regions in an effort to prioritize processing of task-relevant information (Jensen et al. 2002; Jokisch and Jensen 2007; Kelly et al. 2006; Klimesch et al. 2007). Moreover, WM performance has also been linked to alpha over frontal brain regions (Sadaghiani et al. 2012; Sauseng et al. 2005a, b). Specifically, there is evidence that synchronization of alpha oscillations in posterior and frontal brain regions serves to protect items currently held in WM (Bonnefond and Jensen 2012). In fact, Zanto et al. (2011) found that modulating this anterior-posterior alpha synchronization via repetitive transcranial magnetic stimulation (rTMS) impaired WM performance, which suggests a causal role for prefrontal cortex (PFC) in modulating posterior brain regions. Taken together, this previous evidence suggests that alpha may represent a putative mechanism by which WM training enhances performance.

More specifically, in two related studies, our group has found that when participants maintain a spatial relation in WM, compared to a spatial location, there is an increase in posterior alpha power (Blacker et al. 2016; Ikkai et al. 2014). This increase in posterior alpha power has been interpreted as representing suppression of sensory brain regions because the sensory information (i.e., the spatial locations) is no longer task-relevant. In addition to differences in posterior alpha, our previous work has also shown that there is greater frontal alpha power when a relationship is being maintained in WM compared to a location, as well as increased frontal-posterior phase synchrony (Blacker et al. 2016). Therefore, if dual n-back and complex span differentially rely on relational WM, and if dual n-back training strengthens relational WM, then changes in alpha power may represent a neural marker of improvement that would be greater for dual n-back training. Thus, in the current study, we investigated changes in alpha power during the same type of spatial relations and locations WM task before and after dual n-back and complex span WM training.

Here we describe the results of two experiments. Experiment 1 is a behavioral study testing our hypothesis that the n-back task relies more on relational WM as compared to complex span tasks. The results of that study supported the idea that the reliance on relational processing in the n-back task could account for differences in transfer following training that have been reported in the literature. In experiment 2, we aimed to directly compare the efficacy of dual n-back and complex span training to elicit near and far transfer and we aimed to investigate a potential neural basis of these training-induced changes in WM performance: changes in alpha power as measured by EEG.

Experiment 1: Individual Differences Study

Method

Participants

Twenty-four healthy adults (age: M = 19.50, SD = 1.2; 8 males) participated for course credit. All participants had normal or corrected-to-normal vision, reported no current or past neurological or psychological conditions (e.g., traumatic brain injury, major depression, epilepsy, etc.), and gave written informed consent under a protocol approved by the Institutional Review Board of Johns Hopkins University.

Tasks and Procedures

Experimental stimuli were controlled by MATLAB (The MathWorks, Natick, MA) using Psychophysics Toolbox extensions (Brainard 1997; Pelli 1997), and displayed on a 21″ Apple iMac monitor. Participants were seated approximately 60 cm from the monitor.

Participants completed one session, which included performance of the following tasks in one of two orders, counterbalanced across participants: Order (1) Spatial Locations and Relations WM task, Symmetry Span, single spatial n-back task, and BOMAT; Order (2) BOMAT, Spatial Locations and Relations WM task, single spatial n-back task, Symmetry Span.

Symmetry Span

Participants completed the automated Symmetry Span task (Kane et al. 2004). Participants recalled sequences of 2–5 red square locations while performing an interleaved symmetry judgment task. The dependent measure for Symmetry Span was the partial score (see Footnote 1), which is the sum of red squares recalled in the correct location and serial order, regardless of whether the entire trial was recalled correctly. This score will henceforth be referred to as the Symmetry Span Score.
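The partial score can be illustrated with a short Python sketch. This is not the scoring code of the automated task; the function name `partial_score` and the representation of a trial as a pair of lists (presented and recalled locations, in serial order) are assumptions made here for illustration.

```python
def partial_score(trials):
    """Sum of locations recalled in the correct serial position.

    `trials` is a hypothetical list of (presented, recalled) pairs, where
    each element is a sequence of location labels in serial order.
    """
    score = 0
    for presented, recalled in trials:
        # zip stops at the shorter sequence, so omitted items earn no credit.
        score += sum(p == r for p, r in zip(presented, recalled))
    return score


# Two made-up trials (set sizes 2 and 3); the second contains one error.
example = [
    ([4, 9], [4, 9]),
    ([1, 6, 11], [1, 7, 11]),
]
print(partial_score(example))  # 4
```

Because credit is given per item, the second trial still contributes two points even though it was not recalled perfectly.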

Spatial n-back

Participants were shown a sequence of spatial locations indicated by black squares and were asked to respond each time the current stimulus was in the same location as the one presented n positions back in the sequence. The black squares were presented in eight possible locations organized in an imaginary circle around fixation. The squares were presented on a gray background for 500 ms each, followed by a 2500 ms interstimulus interval (ISI). Participants were required to press the SPACE bar for targets, and their response window lasted from the onset of the stimulus until the presentation of the next stimulus (3000 ms); no response was required for non-targets. Participants were tested on 1-, 2-, 3-, 4-, and 5-back levels in that order. A level consisted of 20 + n stimuli and contained 6 targets and 14 + n non-targets each. For each n-level, we calculated d’ as a measure of sensitivity (d’ = Z(hit rate) – Z(false alarm rate)), which is a useful metric for an n-back task because it measures hit rate while penalizing for false alarms (e.g., Haatveit et al. 2010). We used the average d’ across all five levels as the dependent measure of performance.
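As a rough illustration of this scoring, the Python sketch below classifies each stimulus as a target or non-target and then computes d’. The function names and data layout are hypothetical. Two assumptions are not stated in the text: the first n stimuli are scored as non-targets (which matches the 14 + n non-target count), and a log-linear correction is applied so that hit or false alarm rates of 0 or 1 do not produce infinite z-scores.

```python
from statistics import NormalDist


def score_nback(locations, responded, n):
    """Return (hits, misses, false_alarms, correct_rejections).

    locations: location index of each stimulus, in presentation order.
    responded: True where SPACE was pressed during that stimulus's window.
    """
    hits = misses = false_alarms = correct_rejections = 0
    for i, (loc, pressed) in enumerate(zip(locations, responded)):
        # Assumption: the first n stimuli cannot be targets.
        is_target = i >= n and loc == locations[i - n]
        if is_target and pressed:
            hits += 1
        elif is_target:
            misses += 1
        elif pressed:
            false_alarms += 1
        else:
            correct_rejections += 1
    return hits, misses, false_alarms, correct_rejections


def d_prime(hits, misses, false_alarms, correct_rejections):
    """d' = Z(hit rate) - Z(false alarm rate)."""
    # Assumption: log-linear correction (0.5 added to each count).
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)


def mean_d_prime(levels):
    """Average d' across levels; `levels` maps n -> (locations, responded)."""
    values = [d_prime(*score_nback(loc, resp, n)) for n, (loc, resp) in levels.items()]
    return sum(values) / len(values)
```

Calling `mean_d_prime` with one entry per n-level (1 through 5) would give a single summary value comparable in form to the dependent measure described above, although the exact values depend on the correction chosen.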

Bochumer Matrices Test

We assessed fluid intelligence (Gf) using the Bochumer Matrices Test (BOMAT) (Hossiep et al. 1999). The BOMAT is a matrix reasoning test that contains 29 problems and has two full versions, A and B. All participants in experiment 1 completed version B. The standard time limit of 45 min was used here. The number of correct solutions was used as the dependent measure.

Spatial Locations and Relations Task

Participants also completed a novel WM task that required them to maintain either spatial relations or spatial locations (Fig. 1). This task was modeled after the tasks used in previous studies (Ackerman and Courtney 2012; Blacker and Courtney 2016; Ikkai et al. 2014).

All stimuli were presented on a 50% gray background. A trial began with a 500-ms fixation cross, presented in the middle of the screen. Next, a 500-ms verbal cue indicated whether the trial would be a “Relation” or a “Location” trial. A sample array was then presented for 500 ms, which contained two or three colored circles (each subtending 0.67° × 0.49° of visual angle). The color of each circle was chosen randomly without replacement from red, green, yellow, and blue. The circles in an array were presented between 2.0 and 2.9° of visual angle apart both horizontally and vertically, but within the same quadrant of visual space. After a 2000 ms delay period, a test array was displayed for 500 ms. Participants had a maximum of 1500 ms to enter their response, which meant that the 1000 ms following the offset of the test array was also considered the response period. Finally, a 300 ms feedback display was presented where the fixation cross turned green for a correct response, red for an incorrect response, and blue if the response was slower than 1500 ms.
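The trial timeline and feedback rule can be summarized in a small Python sketch. The durations are taken from the description above; the class and function names are hypothetical, and treating a missing response like a slow one is an assumption, since the text does not say how omissions were handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrialTiming:
    """Event durations in ms, as given in the task description."""
    fixation: int = 500
    cue: int = 500
    sample: int = 500
    delay: int = 2000
    test: int = 500
    response_window: int = 1500  # measured from test-array onset
    feedback: int = 300

    def post_test_response_time(self):
        # Part of the response window remaining after the test array offset.
        return self.response_window - self.test


def feedback_color(correct, rt_ms, timing=TrialTiming()):
    """Return the color of the fixation cross shown as feedback."""
    # Assumption: no response (rt_ms is None) is treated as too slow.
    if rt_ms is None or rt_ms > timing.response_window:
        return "blue"
    return "green" if correct else "red"


print(TrialTiming().post_test_response_time())  # 1000
```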

For both trial types, there was a low load (i.e., sample array contained two colored circles) and a high load (i.e., sample array contained three colored circles) condition.

For Location trials (Fig. 1), under low load, participants were instructed to imagine a line segment connecting the two sample circles and maintain the location of that line in memory across the delay period. These instructions were used to encourage participants to encode the exact spatial coordinates of one concrete object (i.e., the imaginary line segment), while the same number of circles (i.e., two) was presented on the screen for both Location and Relation trials. At test, participants were asked to decide whether or not the black test circle fell on that imaginary line segment. For match trials, the black test circle fell in the exact center between the two previously presented sample circles (i.e., in the center of the remembered imaginary line segment). For non-match trials, the black test circle fell between 1.6 × 2.2° and 2.1 × 2.9° of visual angle from the position of that center point in any direction between the sample circles.

For Location trials, under high load, participants were instructed to remember the absolute locations of the three sample circles. At test, participants were asked to decide whether or not the black test circle was in one of the three sample locations or in a completely new location. For match trials, the black test circle fell in the identical location as one of the sample circles and for non-match trials it fell between 1.6 × 2.2° and 2.1 × 2.9° of visual angle from any of the three sample locations.

While the low and high load Location trials consisted of different instructions to the participant, the key factor is that under low load participants were asked to maintain one spatial location (i.e., the location of the imaginary line segment) and under high load participants were asked to maintain three spatial locations (i.e., the locations of the three sample circles). The imaginary line manipulation in the low load trials was necessary to equate the number of sample circles displayed for Location and Relation trials. Further, previous work using this task has shown that the load manipulation for Location trials activates brain regions that are typically found with load increases in other types of visuospatial WM (Blacker and Courtney 2016), such as posterior parietal cortex (e.g., Todd and Marois 2004) and frontal eye fields (e.g., Leung et al. 2004).

For Relation trials (Fig. 1), under low load, participants were instructed to encode and maintain the relative vertical positions of the two sample circles (e.g., red is above blue). At test, participants indicated whether or not the circles in the test array had the same relative vertical positions as the sample circles. For Relation trials, under high load, participants were instructed to encode and maintain the three possible vertical relationships between the sample circles (e.g., green is above yellow, yellow is above red, red is below green). As with low load, at test, participants indicated whether or not the circles in the test array had the same relative vertical positions as the sample circles. Note that which of the three relationships would be tested was unpredictable, which forced participants to maintain all three relationships during the delay period. The horizontal relationship was never task-relevant in Relation trials.
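To make the load manipulation concrete, the following Python sketch enumerates the vertical relations in a sample array and checks a test array against them. It is an illustration only: the function names, the dictionary layout, and the convention that larger y means higher on the screen are all assumptions, and the coordinates are made up.

```python
from itertools import combinations


def vertical_relations(circles):
    """Map each color pair to the color that is higher on the screen.

    circles: hypothetical dict of color -> (x, y); larger y is assumed
    to mean higher. Only the vertical coordinate is used.
    """
    relations = {}
    for a, b in combinations(sorted(circles), 2):
        relations[(a, b)] = a if circles[a][1] > circles[b][1] else b
    return relations


def relation_match(sample, test):
    """True if every color pair in the test array keeps its sample order."""
    sample_rel = vertical_relations(sample)
    test_rel = vertical_relations(test)
    return all(sample_rel[pair] == top for pair, top in test_rel.items())


# High load: three circles yield three pairwise relations.
sample = {"green": (1.0, 6.0), "yellow": (3.5, 3.5), "red": (6.0, 1.0)}
print(len(vertical_relations(sample)))  # 3

# The test array uses new locations but preserves "yellow is above red".
print(relation_match(sample, {"yellow": (2.0, 5.0), "red": (4.5, 2.5)}))  # True
```

Two circles yield a single relation and three circles yield three, which mirrors the one-versus-three items held in the low and high load conditions.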

There are a few crucial aspects of the task design worth elaborating on. First, regardless of trial type, under low load, participants were asked to encode and maintain one piece of information: either one spatial location (Location trials) or one spatial relation (Relation trials) and under high load, participants were asked to encode and maintain three pieces of information: either three spatial locations or three spatial relations. Second, trial type was pseudorandomly presented so participants could not predict what trial type they would see until the cue. Load was uncued, so the participants were unaware of the load until the sample array appeared. Third, the sample array circles were always presented in one quadrant of the display and the test array circles were always presented in the same quadrant as the sample circles, although in the Relation task the test circles were never in the exact same locations as the sample circles. For all trial types, participants pressed one button for a “match” response and another for a “non-match” response and these response key mappings were counterbalanced across participants. Participants completed a total of 256 trials.

Data Analysis

The goal of this initial experiment was to investigate individual differences in WM for spatial relations versus locations and how those differences predicted n-back, complex span, and Gf performance. To do so, we tested partial correlations between Relation WM accuracy, while controlling for Location WM accuracy, and n-back, Symmetry Span, and BOMAT. We also tested for partial correlations between Location WM accuracy, while controlling for Relation WM accuracy, and n-back, Symmetry Span, and BOMAT. For the Location and Relation task, we focused our analysis on high load trials, because previous work has shown that individual differences in WM performance are more evident when load is high (Cusack et al. 2009; Linke et al. 2011). However, we report parallel analyses for the low load condition in the Supplementary Online Material.
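A first-order partial correlation of this kind can be computed from the three pairwise Pearson correlations. The Python sketch below shows the standard formula; it is not the authors' analysis code, the function names are hypothetical, and the example values are made up purely to show the call.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_r(x, y, z):
    """Correlation between x and y after controlling for z."""
    r_xy, r_xz, r_yz = pearson_r(x, y), pearson_r(x, z), pearson_r(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Made-up values: x = Relation accuracy, y = n-back d', z = Location accuracy.
relation_acc = [0.71, 0.80, 0.65, 0.90, 0.77]
nback_d = [1.9, 2.4, 1.5, 3.0, 2.1]
location_acc = [0.75, 0.78, 0.70, 0.85, 0.81]
print(partial_r(relation_acc, nback_d, location_acc))
```

With one control variable, the degrees of freedom for the associated significance test are n − 3.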

Results

One participant had below-chance accuracy (< 50%) on the Locations/Relations task and was not included in any analyses. To report complete results for the Locations/Relations task, we conducted a 2 (trial type: Location vs. Relation) × 2 (load: low vs. high) repeated-measures ANOVA. A main effect of load emerged, F(1,23) = 66.61, p < 0.001, partial η² = 0.74, with accuracy being higher for low load compared to high load. However, neither the main effect of trial type, F(1,23) = 1.04, p = 0.32, nor the trial type × load interaction, F(1,23) = 1.72, p = 0.20, reached significance. Table 1 includes descriptive statistics for each level of the task.
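The reported effect size for load can be recovered from the F statistic and its degrees of freedom using the standard identity partial η² = (F × df_effect) / (F × df_effect + df_error). The short Python check below applies it to the values above; the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its dfs."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Main effect of load: F(1,23) = 66.61
print(round(partial_eta_squared(66.61, 1, 23), 2))  # 0.74
```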

Table 1 Descriptive statistics for accuracy in the Locations/Relations WM task in experiment 1


Next, we examined whether Relation WM accuracy was correlated with average d’ on the spatial n-back task, while controlling for Location WM accuracy. A significant correlation emerged, R(21) = 0.59, p = 0.003, whereby higher Relation WM accuracy was associated with higher n-back d’ values. Conversely, Location WM accuracy was not significantly correlated with n-back performance, when controlling for Relation WM accuracy, R(21) = 0.30, p = 0.16. Next, we examined how these two types of WM related to Symmetry Span performance. Neither Relation WM nor Location WM accuracy, while controlling for the other, was significantly correlated with Symmetry Span Score, ps ≥ 0.064. Moreover, Relation WM accuracy was significantly correlated with n-back d’ even after controlling for Symmetry Span Score, R(21) = 0.49, p = 0.018, which demonstrates that n-back performance, above and beyond Symmetry Span performance, is related to Relation WM. Finally, we examined correlations with BOMAT score. Relation WM accuracy, controlling for Location accuracy, was significantly correlated with BOMAT score, R(21) = 0.44, p = 0.034, whereby better Relation WM performance was associated with better BOMAT performance. Conversely, Location WM accuracy was not significantly correlated with BOMAT score, when controlling for Relation WM accuracy, R(21) = 0.16, p = 0.47. These results are illustrated in Fig. 2. For the BOMAT analyses, it was possible to test whether the two partial correlation coefficients differed significantly from one another. This difference did not reach significance, p = 0.33 (two-tailed). However, our goal was to determine whether Relation and Location WM each predict BOMAT performance while controlling for the other type of WM. We have used this approach elsewhere (Blacker et al. 2017) and argue that these partial correlations are evidence for a stronger relationship between Relation WM and BOMAT performance compared to Location WM and the BOMAT.
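As a sanity check on how a partial correlation and its degrees of freedom map onto a p-value, the Python sketch below converts r to a t statistic, t = r·sqrt(df / (1 − r²)), and integrates the Student t density numerically. It assumes the standard t-test for a correlation coefficient, which the text does not state explicitly. The function names are hypothetical, and the Simpson's-rule integration stands in for a statistics library.

```python
import math


def r_to_t(r, df):
    """t statistic for a (partial) correlation with df degrees of freedom."""
    return r * math.sqrt(df / (1 - r ** 2))


def t_pdf(x, df):
    """Density of the Student t distribution."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value via Simpson's rule on [0, |t|] (steps must be even)."""
    b = abs(t)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # probability mass between 0 and |t|
    return max(0.0, 1 - 2 * area)


# Reported partial correlations with df = 21.
for r in (0.59, 0.44):
    t = r_to_t(r, 21)
    print(r, round(t, 2), round(two_tailed_p(t, 21), 3))
```

Because the reported r values are rounded to two decimals, p-values computed this way may differ slightly in the third decimal from those reported above.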

Given that Relation WM was significantly correlated with n-back and BOMAT performance, it would be expected that n-back and BOMAT performance would also be highly correlated. In line with previous work demonstrating the robust relationship between WM and Gf performance (e.g., Engle et al. 1999), here we did find a significant positive correlation between BOMAT score and n-back d’, R = 0.58, p = 0.003.

Discussion

The results of experiment 1 represent a proof of concept that performance on a relational WM task is predictive of performance on both an n-back task and a measure of Gf. This supports our initial hypothesis that the n-back task relies more heavily on relational processing than complex span tasks do. It also demonstrates that a measure of Gf, the BOMAT, is related to relational WM performance. Therefore, we next examined whether this reliance on relational WM could explain differences in transfer between these two tasks in a training study, as well as whether n-back and complex span training would result in significantly different changes in alpha power following training. If n-back training were to have a greater effect than complex span training on an EEG measure related to relational WM, then it would provide support for the idea that the involvement of relational WM might underlie its greater effectiveness for cognitive training and transfer.