Introduction

The complexity and adaptability of goal-directed behavior is astounding. Humans operate under diverse and dynamic demands, yet we consistently produce appropriate actions (Baddeley, 1996; Miller & Cohen, 2001; Norman & Shallice, 1986). How can our cognitive systems achieve such flexibility? Various theoretical accounts of complex representations in the brain (e.g., Hommel, 2021; Sakai, 2008; Schumacher & Hazeltine, 2016) have been proposed to answer this question. A common theme is that sets of stimulus-response (SR) associations are stored hierarchically, so that they can be activated by broader behavioral goals and task contexts. However, despite much interest in these representations, researchers have only begun to unravel their behavioral signatures and determinants. Here we propose a novel approach to examining the task representations that guide behavior when there is uncertainty about the upcoming stimulus and appropriate response.

Task representations are often measured with task-switching procedures in which participants make responses according to distinct sets of SR mappings. For instance, participants might respond to distinct categories of stimuli (e.g., numbers and letters), respond flexibly to stimulus dimensions (color or shape), or respond to subsets with different effectors. These experiments typically produce increased response times (RTs) when the SR set switches on consecutive trials compared to when it repeats (Allport, Styles, & Hsieh, 1994; Rogers & Monsell, 1995).

This RT difference is referred to as a switch cost, but its source remains uncertain and likely involves multiple components (Kiesel et al., 2010; Logan, 2003). For instance, switch procedures commonly require participants to shift attention to distinct dimensions when the SR set switches (e.g., color, shape). Thus, differences in RT between switch and repeat trials could, at least in part, reflect attentional processes rather than the navigation of the underlying task representation. Similarly, many switch procedures associate a particular stimulus with multiple responses, with the response depending on which stimulus dimension is relevant. Consequently, stimuli may activate multiple responses and necessitate resolution of this conflict. Together, these factors can complicate the interpretation of switch costs as measured in conventional switch tasks (and measured at the level of switching vs. repeating task).

A key feature of the standard approach to measuring switch costs is that trials are categorized according to sets predefined by the experimenter. Trials belonging to the same set as the previous trial are compared to trials belonging to a different set. This approach for categorizing transitions as within-set or between-set is useful for testing a priori hypotheses about factors affecting switch costs, for example, how the similarity of the stimulus or response sets affects the switch cost (e.g., Arrington, Altmann, & Carr, 2003; Mayr, 2001). However, performance differences between within-set and between-set transitions only weakly constrain the task organization as reflected by the complete pattern of transitional RTs. In other words, a robust switch cost does not indicate that all within-set or between-set RTs are the same (see Lien, Ruthruff, Remington, & Johnston, 2005). If the experimenter is wrong or even only partly correct about the organization of the SR mappings, experimental results may lead to incorrect inferences about the underlying task organization.

For example, participants might be required to switch between making parity or magnitude judgments about digits. The experimenter may assume participants represent the SR mappings as two task sets (one for magnitude and the other for parity), and this may be verified by a switch cost when subjects change the type of judgment made on consecutive trials. But there may instead be (or additionally be) relationships among SR mappings that are unaccounted for in simple measures of switch cost. The switch cost measure does not identify the presence of these relationships. To better understand the structure of the tasks, all transitions between each SR mapping should be considered.

The present study

The objective of the present study is to develop an alternative approach to examining task structure. Accordingly, we use transitional RTs to measure the relatedness of all combinations of responses. By transitional RT, we refer to classifying each RT as a function of the specific SR mapping used on the previous trial. We compare the effects of stimulus and response manipulations on task structure revealed by the transitional RT approach to those revealed by the conventional switch costs approach.

Because we are interested in the organization of the SR mappings rather than attentional, preparatory, and inhibitory processes, we deviate from conventional task-switching procedures and use univalent stimuli with no task cues. Eight stimuli are mapped consistently to eight responses. We avoid using task cues because they may affect the task representation and because they indicate that a subset of the possible SR mappings may be relevant for the next response, adding a preparatory component to the task. More broadly, the interpretation and use of cues engages control processes (Dreisbach et al., 2002; Koch, 2003; Mayr, 2006; Schneider & Logan, 2006), and this may further obscure the effects of task representation that we wish to study.

In Experiment 1, we compare the conventional task-switching approach and the transitional RT approach by manipulating stimulus and response properties that previous work has established affect task representation. Specifically, switch costs are often instantiated by alternating sets of stimuli (e.g., Rogers & Monsell, 1995; Ruthruff, Remington, & Johnston, 2001; Wylie & Allport, 2000) or the task-relevant dimension (e.g., Allport et al., 1994). Therefore, we manipulated whether the eight stimuli belonged to the same category. In the one-set condition, all stimuli were numbers, while in the two-set condition, stimuli mapped to the left responses were numbers and stimuli mapped to the right responses were face images.

To manipulate the responses, we varied the effectors used to make responses (e.g., Philipp et al., 2013). The two-hand group responded to the stimuli with both hands (the left responses with the left hand, and right responses with the right hand). The one-hand group made all responses with their right index fingers.

Experiment 2 provided a conceptual replication of the one-set, one-hand group of Experiment 1 with a different set of stimuli (faces). Given that Experiment 1 used numbers, which may have a pre-existing ordinal relationship, we sought to confirm that the dominant organizational principle observed there would also emerge in a similar task using arbitrary SR mappings.

Method

Participants

For Experiment 1, 72 participants were sampled from the research pool established by the University of Iowa Psychological and Brain Sciences Department and randomly assigned to one of the four between-subjects conditions. One subject was run in the one-set, one-hand condition instead of the two-set, one-hand condition, leading to a slightly imbalanced design. One participant was excluded from analysis due to chance level accuracy. For Experiment 2, 36 participants were run in a single condition from the same subject pool. Two participants were excluded for low accuracy (< 80%).

Each participant was compensated with credit to fulfill course requirements. All study procedures were reviewed and approved by the University of Iowa Institutional Review Board. Informed consent was obtained from all participants both for participation in the study and for publication of group-level data. All data were collected in person. Data analyzed in this manuscript are available on the Open Science Framework at: https://osf.io/7xf59/.

Task

We evaluated switch costs and transitional RT on data obtained from tasks in which participants responded to one of eight centrally presented stimuli using a standard computer keyboard. Task procedures were programmed and displayed using MATLAB version R2018b software (The MathWorks, Natick, MA, USA) and the Psychophysics Toolbox Version 3 (Brainard, 1997).

In Experiment 1, stimulus set and response mode were manipulated between subjects. In the one-set stimulus condition, participants were presented with the numbers 1, 2, 3, 4, 5, 6, 7, and 8. In the two-set stimulus condition, stimuli were four numbers – 1, 2, 3, and 4, and four images of faces generated online from faceresearch.org (see Fig. 1, Panel A). The numbers were compatibly mapped to responses in a left-to-right fashion in all conditions. Numbers were always presented left to right and faces (when present) were always mapped to the right-most responses.

Fig. 1

Panel A: Stimulus arrangements for the one-set and two-set conditions in Experiment 1. Panel B: Stimulus arrangements for Experiment 2 (note: one-set condition only). All stimuli were centrally presented, and location represents the location of the corresponding response. Panel C: Example trials in the one-hand and two-hand conditions. ITI = inter-trial interval. Response keys above a frame indicate responses given at the beginning (left) or end (right) of that interval

All participants responded using the same set of contiguous keys in the home row of a QWERTY keyboard. Specifically, the keys were: S, D, F, G, H, J, K, and L. In the two-hand responding condition, participants were instructed to respond with the four fingers of their left and right hands (excluding thumbs). In the one-hand responding condition, participants began a trial by pressing the ‘B’ key with their right index finger. When a stimulus appeared, they released the ‘B’ key, moved the same finger to the desired response location, and returned to complete the trial.

Experiment 2 was identical to Experiment 1 except that there was only a single condition in which all the stimuli were faces and the responses were made as in the one-hand conditions of Experiment 1 (Fig. 1, Panel B).

Trials began with a central fixation cross displayed for 500 ms. The target stimulus then appeared at the center of the screen until a response or the 3,000-ms deadline had elapsed. Feedback was presented for 2,000 ms following erroneous responses. Feedback consisted of the mapping of stimuli to responses, identical to that displayed during instruction (Fig. 1, Panel C).

Participants completed one practice block of 16 trials to familiarize themselves with the experimental procedure. Eight experimental blocks followed, with 32 trials each block for a total of 256 experimental trials per subject.

Analyses

Data from the first two experimental blocks were excluded from the analyses to allow behavior to stabilize. None of the conditions showed a significant difference in accuracy from block 3 to block 8, and only the one-set, one-hand condition of Experiment 1 showed a significant change in RT, t(18) = 2.58, p = .019, Cohen’s d = 0.61, ΔRT = 53 ms. We also excluded from the analyses the first two trials of each block, error trials and trials after errors (6.4%), and trials in which the stimulus repeated from the previous trial (11.8%). Because accuracy was above .96 in all conditions, our analyses focused on RT. The experimental program recorded the onset and release of the ‘B’ key on each trial, but we analyzed only the response time.
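The exclusion rules above can be sketched as a single filter over the trial sequence. This is a minimal illustration, not the authors' analysis code: the `Trial` record, its field names, and the assumption that trials arrive in presentation order are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    # Hypothetical record layout; the field names are assumptions.
    block: int      # experimental block, 1-8
    index: int      # position within the block, 1-32
    stimulus: int   # stimulus identity, coded 0-7
    correct: bool   # whether the response was correct
    rt_ms: float    # response time in milliseconds


def select_trials(trials):
    """Return the trials that survive the exclusions described in the text.

    Assumes `trials` is one participant's data in presentation order.
    """
    kept = []
    prev = None
    for t in trials:
        same_block = prev is not None and prev.block == t.block
        exclude = (
            t.block <= 2                                      # first two experimental blocks
            or t.index <= 2                                   # first two trials of each block
            or not t.correct                                  # error trial
            or (same_block and not prev.correct)              # trial following an error
            or (same_block and prev.stimulus == t.stimulus)   # stimulus repetition
        )
        if not exclude:
            kept.append(t)
        prev = t
    return kept
```

Because every trial is compared with the trial actually presented before it, a trial following an excluded error is itself dropped, which matches the "trials after errors" rule.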

Conventional task-switching approach

To examine traditional switch costs, we classified trials according to whether the side of the response differed from that of the previous trial, and determined whether this affected RT. For Experiment 1, we performed a 2 (side switch) × 2 (stimulus set) × 2 (response mode) mixed-factor ANOVA. Side switch varied within-subjects, and we manipulated stimulus set and response mode between subjects. Because we associated different stimulus sets with left- versus right-side responses in the two stimulus set groups, alternating response side also means alternating stimulus set. This approach allowed us to evaluate the cost for switching stimulus set (again, in our case, this also corresponds to response side). For Experiment 2, we performed a one-way ANOVA with side-switch as a within-subjects factor.
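As an illustration of the side-switch classification, the sketch below computes a descriptive switch cost for one participant. It assumes responses are coded 0-7 from left to right (0-3 on the left side, 4-7 on the right); that coding and the function names are hypothetical, and the sketch does not reproduce the ANOVA itself.

```python
from statistics import mean


def side(response):
    """Map a response position (0-7, left to right) to its side of the array."""
    return "L" if response < 4 else "R"


def switch_cost(pairs):
    """Mean RT on side-switch trials minus mean RT on side-repeat trials.

    `pairs` is an iterable of (previous_response, response, rt_ms) tuples
    for the retained trials of one participant.
    """
    pairs = list(pairs)
    same = [rt for prev, cur, rt in pairs if side(prev) == side(cur)]
    diff = [rt for prev, cur, rt in pairs if side(prev) != side(cur)]
    return mean(diff) - mean(same)


# Toy data: two same-side and two different-side transitions.
example = [(0, 1, 1000), (1, 5, 1100), (5, 6, 1020), (6, 2, 1080)]
cost = switch_cost(example)  # positive values indicate a switch cost
```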

Transitional RT approach

Our goal was to demonstrate that inferences from the visualizations of the transitional RTs can generate testable hypotheses. Therefore, we converted each trial’s RT to a z-score by obtaining the mean and standard deviation for each response by each participant. We standardized by response key to control for the possibility that moving further in the response array (to a more distant key) would require more time and contaminate our analyses. To visualize the complete set of transitional z-scored RTs (ZRT), we applied an approach from graph theory used to depict connectivity between regions from neuroimaging data (see, e.g., Cohen & D'Esposito, 2016).
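The standardization step can be sketched as follows. The text does not say whether the sample or the population standard deviation was used; this sketch assumes the sample SD (`statistics.stdev`), and the tuple layout and function name are hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev


def zscore_by_response(trials):
    """Z-score each RT within its (participant, response) cell.

    `trials` is a list of (participant_id, response, rt_ms) tuples.
    Returns the z-scores in the same order as the input.
    Each cell needs at least two RTs that are not all identical,
    otherwise the SD is undefined or zero.
    """
    cells = defaultdict(list)
    for pid, resp, rt in trials:
        cells[(pid, resp)].append(rt)
    # Sample SD is an assumption; swap in statistics.pstdev for the population SD.
    stats = {key: (mean(rts), stdev(rts)) for key, rts in cells.items()}
    return [
        (rt - stats[(pid, resp)][0]) / stats[(pid, resp)][1]
        for pid, resp, rt in trials
    ]
```

Standardizing within response key removes each key's mean RT, so a slow transition to a distant key is judged only relative to other responses made with that same key.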

$$\mathrm{Weight}\ \mathrm{strength}={e}^{-2\,\mathrm{ZRT}} \tag{1}$$

The ZRT was converted to a weight strength using Eq. 1. The constant 2 was chosen so that weights varied between 0 and 10. We collapsed across symmetrical transitions (e.g., L3->R2, R2->L3) to simplify and because these scores were strongly correlated (r = .77). Thus, the weights reflect the benefit of making one response following another. We plotted the weights using a force atlas model with Gephi software (gephi.org; Bastian, Heymann, & Jacomy, 2009), with each response represented by a node and the ZRTs between them represented by edges. In the force atlas model, nodes repel each other but are attracted by tighter transitional RT relationships. The resulting network is plotted with thicker lines depicting shorter transitional RTs. For Experiment 1, we created separate network models for each of the four groups. Alternative approaches, such as presenting the transitional RTs in a matrix, would also be effective for depicting the organization. We chose the network approach because it is easy to visualize relationships among responses. Because we evaluated each of the possible combinations of the eight stimuli, many transitions were present. Accordingly, we analyzed relatively few trials per cell compared to the overall trial counts. After excluding exact repetitions and erroneous responses, there were on average 5.33 trials (SD = 1.20) per subject per transition in Experiment 1 and 5.41 (SD = 2.14) in Experiment 2.
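A minimal sketch of the conversion from ZRTs to edge weights is given below. It collapses symmetrical transitions by treating each pair of responses as unordered and applies Eq. 1 to the mean ZRT of the pair. Averaging before applying the exponential is an assumption (the text does not specify the order of operations), as are the function name and tuple layout.

```python
import math
from collections import defaultdict
from statistics import mean


def edge_weights(transitions):
    """Convert transitional ZRTs into edge weights via exp(-2 * ZRT).

    `transitions` is an iterable of (previous_response, response, zrt)
    tuples, with responses given as labels such as "L3" or "R2".
    Returns a dict mapping each unordered response pair to its weight.
    """
    cells = defaultdict(list)
    for prev, cur, zrt in transitions:
        key = tuple(sorted((prev, cur)))  # L3->R2 and R2->L3 share one edge
        cells[key].append(zrt)
    # Assumption: average the ZRTs within each pair, then apply Eq. 1.
    return {key: math.exp(-2 * mean(zs)) for key, zs in cells.items()}


weights = edge_weights([("L3", "R2", 0.5), ("R2", "L3", -0.5), ("L1", "L2", -0.25)])
```

Under Eq. 1, a ZRT of 0 gives a weight of 1, faster-than-average transitions (negative ZRT) give weights above 1, and slower-than-average transitions give weights between 0 and 1. The resulting dictionary could be written out as an edge list for a graph-layout tool.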

Results

Mean accuracy was high across conditions (.96-.98, lowest subject’s accuracy: .92) and will not be considered further.

Conventional task switching approach

The question posed to the conventional analysis was whether stimulus set and/or response mode would alter the underlying task representation as measured by switch costs. Our design deliberately aligned stimulus-set and side of the response. Switching response side corresponded to switching stimulus set (in the two-set groups) and switching hands (in the two-hand groups).

For Experiment 1, two main effects were reliable: RTs were shorter when the response belonged to the same side of the response array than when successive responses changed sides, F(1,68) = 135.418, p < .001, \({\eta}_p^2\) = .666. RTs were also shorter when two hands were used, F(1,68) = 6.336, p = .014, \({\eta}_p^2\) = .085 (see Fig. 2). There was no significant main effect of stimulus set, F < 1, BF10 = .254.

Fig. 2

Conventional switch cost analysis for Experiment 1. Error bars represent standard error of the mean. SameSide = response occupied same side of array as previous trial. DiffSide = response occupied opposite side as previous trial

A reliable interaction between response mode and stimulus sets indicated that responding was faster for one stimulus set in the one-hand response groups, while the opposite was true of the two-hand response groups, F(1,68) = 5.838, p = .018, \({\eta}_p^2\) = .079. No other two-way interaction was reliable. Critically, neither stimulus set, F < 1, BF10 = .247, nor response mode, F < 1, BF10 = .297, affected switch costs, suggesting that they did not alter the task structure. The three-way interaction, side × response mode × stimulus set, was also not reliable, F < 1. Thus, the conventional analysis indicates that neither response mode nor stimulus set affected the underlying task structure in our experiment.

For Experiment 2, we conducted a one-way ANOVA with side as a within-subjects factor, which showed that, as in Experiment 1, same-side responses (1,163 ms) were produced more quickly than different-side responses (1,218 ms), F(1,33) = 22.459, p < .001, \({\eta}_p^2\) = .405. Thus, the switch cost observed in all groups of Experiment 1 was also observed in Experiment 2.

Transitional RT approach

The visualization of the transitional RTs in Experiment 1 reveals that the stimulus set manipulation had a clear effect on the task representation (Fig. 3). For both response modes, the one-set groups (left column of Fig. 3) appear quite linear, effectively a chain of responses from one side to the other. Neighboring responses are strongly connected, and the other connections are relatively weak. Strikingly, the plot generates the configuration of the responses on the keyboard based only on the pattern of transitional RTs. This is depicted as thicker lines (shorter transitional RTs) connecting neighboring responses and thinner lines (longer transitional RTs) connecting more distant nodes. This pattern is broken when two stimulus sets are used (right column). Here, responses appear to be more closely linked within each side, with a chain only on the left-most responses. In contrast, the right-side responses show a more uniform organization. Because this pattern appears across response-mode groups, there appears to be little impact of response mode on task structure. Inspection of Fig. 3 suggests the neighbor benefit was larger for the one-hand groups.

Fig. 3

Transitional RT between responses (leftmost – L1; rightmost – R4) for Experiment 1. Nodes are repelled by each other and attracted proportionally to transitional response time (RT). Thicker lines indicate shorter transitional RTs

For Experiment 2, the pattern revealed by the transitional RT analysis was largely consistent with the one-set groups in Experiment 1. That is, the chaining organization, albeit weaker, was maintained, although the relatively strong connection between L1 and L3 produced a spiral pattern (with L1 in the center) rather than the more circular patterns observed in Experiment 1. Consistent with Experiment 1, the switch cost observed when switching between left- and right-sided responses did not seem to emerge from separate clusters as implied by the logic of the switch cost analysis. Rather, the costs emerged because neighbors are more likely to be on the same side (Fig. 4).

Fig. 4

Transitional response time (RT) between responses (leftmost – L1; rightmost – R4) for Experiment 2. Thicker lines indicate shorter transitional RTs

Post hoc analyses

Our goal was to show that the insights gleaned from the transitional RT analysis could be used to form hypotheses testable with inferential statistics. We focus on the apparent advantage for neighboring responses. For Experiment 1, we performed a post hoc ANOVA on the transitional RT z-scores with stimulus set and response mode as between-subject factors and neighbor (neighboring response on the keyboard vs. non-neighboring response) and response side (left vs. right) as within-subject factors. The z-scores were recentered around zero to adjust for differences in the number of trials in the neighbor and non-neighbor cells that were bringing the mean z-score below zero.
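The neighbor factor can be illustrated with a short sketch. It assumes responses are coded 0-7 by keyboard position and that any two adjacent keys count as neighbors, including the middle pair that straddles the left and right sides. That treatment of the middle pair is an assumption, as are the function names, and the recentering step is not reproduced here.

```python
from statistics import mean


def is_neighbor(prev, cur):
    """True when two response positions (0-7) are adjacent on the keyboard."""
    return abs(prev - cur) == 1


def neighbor_benefit(transitions):
    """Mean ZRT for non-neighbor transitions minus mean ZRT for neighbor transitions.

    `transitions` is an iterable of (previous_response, response, zrt) tuples.
    Positive values indicate faster responding after a neighboring response.
    """
    transitions = list(transitions)
    near = [z for prev, cur, z in transitions if is_neighbor(prev, cur)]
    far = [z for prev, cur, z in transitions if not is_neighbor(prev, cur)]
    return mean(far) - mean(near)


# Toy data: two neighbor and two non-neighbor transitions.
benefit = neighbor_benefit([(0, 1, -0.5), (1, 2, -0.25), (0, 5, 0.5), (7, 2, 0.25)])
```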

The main effect of neighbor was significant, F(1,68) = 752.43, p < .001, \({\eta}_p^2\) = .92, indicating that the mean z-score for neighboring responses (-.29) was less than the mean z-score for non-neighbors (.29). There were also two-way interactions between neighbor and response mode, F(1,68) = 12.48, p < .001, \({\eta}_p^2\) = .16, and between neighbor and stimulus set, F(1,68) = 43.74, p < .001, \({\eta}_p^2\) = .39. The former interaction indicates that the benefit of neighbors was larger for the one-hand groups (.73) than for the two-hand groups (.44). This may reflect biomechanical factors (e.g., Rosenbaum, 2002). The latter interaction was part of the three-way interaction between stimulus set, response side, and neighbor, F(1,68) = 8.74, p = .004, \({\eta}_p^2\) = .11. This three-way interaction indicates that, for the one-set groups, the neighbor benefit was nearly identical for both response sides (.66), but for the two-set groups, the benefit was greater for the left side (.56) than for the right side (.46). In short, the faces produced less of a benefit for neighbors than the numbers. No other effects were significant.

For Experiment 2, we conducted a two-way ANOVA with response side and neighbor as factors. There was a significant effect of neighbor, F(1,33) = 94.35, p < .001, \({\eta}_p^2\) = .74, indicating an advantage for neighbors (-.14) over non-neighbors (.14). Furthermore, this effect was moderated by an interaction with response side, F(1,33) = 5.85, p = .021, \({\eta}_p^2\) = .15, indicating that the advantage was greater for the right side (-.18 vs. .18) than for the left side (-.10 vs. .10). This interaction is a topic for further research.

We conclude that the neighbor benefit is not restricted to the number stimuli used in Experiment 1. Across the experiments, transitional RTs identified differences in task structure (i.e., chains of neighbors) that are not apparent when SR mappings are simply divided into sets. Moreover, the transitional RT approach reveals that in some cases switch costs emerge even when there is no organization of the mappings into distinct sets. This demonstrates that transitional data can depict representational structures invisible to conventional measures.

General discussion

Experiment 1 compared switch costs to an analysis of individual response transitions under conditions with varying stimuli and response modes. According to the conventional switch-cost analysis, switching response sides produced a cost when compared to repeating the response side. Neither the manipulation of the stimulus set nor the response mode affected the switch cost, suggesting that these manipulations did not affect the task representation. In contrast, the transitional RT analysis showed clear changes in task structure resulting from the manipulations of the stimulus set and response effectors. We observed that the task was organized as a chain of responses in the one-set groups – regardless of the effectors used to produce them. This suggests that, for these groups, the SR pairs were represented as a single task, despite the consistent observation of a switch cost. In the two-set groups, the chain was broken and response side appears to drive task organization. This suggests that a consideration of all transitions may be useful for verifying that the task structure is indeed set based, as is typically assumed, and that alternative forms of organization should be considered. In short, the transitional RT approach increases the range of structures that can be detected.

In Experiment 2, we used a single set of face stimuli that were not ordinally related and still observed the chain-like structure observed in the one-set groups in Experiment 1, although it was less pronounced. This suggests that chaining may be a common means of representing tasks, and that it can be misinterpreted as a division into distinct task sets if only conventional measures are employed.

Limitations

Our eight-choice task used a more complex response scheme than many task-switching procedures (e.g., Rogers & Monsell, 1995), but it was also in some ways less complex than conventional switch tasks. For example, many designs use bivalent stimuli and instruct participants to alternate between relevant stimulus features (e.g., respond to color vs. shape). Relatedly, conventional switching procedures often assign the same stimulus or feature to different responses depending on context (e.g., a cue presented before the stimulus; Logan & Schneider, 2006; Mayr & Bryck, 2005). In this way, attentional and conflict resolution processes likely impact behavioral measures purporting to measure switching between sets of SR pairs (Kiesel et al., 2010). Our study also differed from others in that we did not explicitly instruct participants that two tasks were present, but left the participants to create their own structure (see Dreisbach et al., 2006, 2007).

These differences between our task and those used in many task-switching experiments may have affected the task representation, and it remains to be discovered how the findings here apply to other task procedures. Nonetheless, the transitional RT approach can be used to check whether task-switch costs do in fact reflect the organization of separate sets of SR mappings. That was not the case here, even though we observed significant switch costs by the conventional measure. Our objective was to demonstrate the utility of the transitional RT approach, but important theoretical questions remain about the organization of mappings in other tasks, including more typical switching procedures. Accordingly, future research could use transitional RTs with task demands we have avoided here for simplicity (e.g., bivalent stimuli, shifts of attention, response conflict).

Experiment 1 used numeric stimuli that were mapped to responses in a way that may already have been learned by participants (we often manually respond to numbers ascending left to right). This may have partly driven the strong chain-like organization. Experiment 2 addressed this concern by demonstrating the chain pattern with face stimuli arbitrarily mapped to responses. Additionally, although stimulus-response mappings were not counterbalanced, partly because we were examining how specific mappings affected task representation, the similarity across Experiments 1 and 2, which used an arbitrary mapping, suggests that our findings are not artefactual.

With eight stimuli and eight responses, the number of possible transitions is quite large. Accordingly, trial counts for each transition per subject were low (i.e., just over five observations per cell) after removing error trials and exact repetitions. However, despite this limitation, the transitional analysis revealed different response relationships across the stimulus-set conditions that were verified by conventional inferential statistics. That said, further iterations of this approach should improve upon this design by including additional trials per transition.
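For reference, once exact repetitions are excluded, eight responses yield the following number of ordered transitions, and half as many unordered pairs when symmetrical transitions are collapsed:

$$8 \times 7 = 56, \qquad \binom{8}{2} = \frac{8 \times 7}{2} = 28$$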

Future directions

The transitional RT approach may be adopted to evaluate the similarity of SR episodes and might effectively be paired with recent multivariate techniques to identify the neural correlates of task representations (e.g., Freund, Bugg, & Braver, 2020a; Freund, Etzel, & Braver, 2020b; Kikumoto & Mayr, 2020). For example, representational similarity analyses (RSAs) explicitly model the underlying neural signal associated with a particular trial type. One starting hypothesis is that trial types with shorter transitional RTs have more similar representations as determined by RSA. Findings from the transitional RT approach might also be applied to neural measures. Future research should examine relationships between EEG signatures known to be associated with task switching, such as the P3a (Barcelo, Escera, Corral, & Periánez, 2006), and the task boundaries suggested by transitional RT analysis.

Taken together, the present study demonstrates that the transitional RT approach can reveal the structure of task representations in a way that may complement conventional switch cost metrics. While switch costs did not detect representational changes due to the number of stimulus sets or mode of responding, fine-grained transitional RT data indicated changes in relationships between responses. These findings indicate that conventional switch cost analyses may obscure differences in structure or provide support for a set-like organization when no such organization exists. Moreover, when a single stimulus set was used, we observed a chainlike organization based on the configuration of the responses, which produced the switch cost as revealed by conventional analyses. The configuration of the responses may play a strong role in task organization, especially when there are no clear relationships among the stimuli.