Introduction

In 1980, the New Wave band Devo claimed that “freedom of choice is what [we] got” and that “freedom from choice is what [we] want”. Indeed, it appears that a lack of freedom is what we want to speed up our decisions: an increase in choice options can slow down decisions, which has been shown in situations ranging from complex decision-making contexts (e.g., Hanoch, Wood, Barnes, Liu, & Rice, 2011) to minimalist laboratory experimental setups (e.g., Merkel, 1885). A specific example of the latter setups is the comparison between so-called forced choice and free choice tasks (Berlyne, 1957).

Forced choice and free choice tasks and their use in research

In the simplest version of forced and free choice tasks (see, e.g., Berlyne, 1957), participants have two response options (e.g., a left and a right key) and are confronted with three different stimuli (e.g., letters or color patches). Participants are instructed to respond to two of these stimuli with prescribed responses (e.g., red → left key press; blue → right key press)—the forced choice task. In case of the third stimulus (e.g., white), in contrast, they can choose “freely” from the two response options—the free choice task. Footnote 1

Notably, and of particular importance to the present study, the vast majority of studies comparing forced and free choice tasks report shorter response times (RTs) in forced choice compared with free choice tasks (e.g., Berlyne, 1957; Janczyk, Nolden, & Jolicoeur, 2015). The purpose of the present study is to elucidate the source of this RT difference.

One interpretation of the RT difference is that both tasks differ in terms of their underlying response/action selection systems or processes. In this vein, free and forced choice tasks have often been used to operationalize qualitatively different self-generated (or intentional, internally generated, intention-based, voluntary) and externally triggered (or stimulus-based) actions (e.g., Brass & Haggard, 2008; Herwig, Prinz, & Waszak, 2007; Keller et al., 2006; Passingham, Bengtsson, & Lau, 2010; Waszak et al., 2005). Evidence for such a distinction comes, for example, from research on learning and using associations between bodily movements and their environmental consequences (i.e., their action effects), a field that was inspired by Ideomotor Theory (e.g., Greenwald, 1970; Harleß, 1861; Shin, Proctor, & Capaldi, 2010; Stock & Stock, 2004). In particular, when specific bodily movements are consistently followed by an auditory stimulus as an action effect (e.g., left key → low-pitch tone, right key → high-pitch tone), results from some studies suggested that associations between the movements and the effects are only learned in free choice tasks, that is, in an intention-based action control mode (Herwig et al., 2007; see also Gaschler & Nattkemper, 2012; Herwig & Waszak, 2009, 2012; Pfister, Kiesel, & Melcher, 2010).

This claim is, however, controversial. For example, Pfister, Kiesel, and Hoffmann (2011) reported learning of action effects even in forced choice tasks, and many other studies observed clear evidence for a role of action effects for performance in forced choice tasks (e.g., Janczyk, Pfister, Crognale, & Kunde, 2012; Janczyk, Pfister, Hommel, & Kunde, 2014; Janczyk, Pfister, & Kunde, 2012; Janczyk, Skirde, Weigelt, & Kunde, 2009; Kühn, Elsner, Prinz, & Brass, 2009, Exp. 3; Kunde, 2001; Kunde, Pfister, & Janczyk, 2012; Wolfensteller & Ruge, 2011). Furthermore, studies using the response-effect (R-E) compatibility paradigm (Kunde, 2001) reported R-E compatibility effects of the same size in forced and free choice tasks (e.g., Janczyk, Durst, & Ulrich, 2017), and the size of dual-task interference is also comparable for both tasks (Janczyk, Nolden & Jolicoeur, 2015). In addition, Janczyk, Dambacher, Bieleke, and Gollwitzer (2015) used the Psychological Refractory Period (PRP) paradigm in combination with the locus of slack logic (Schweickert, 1978; see also Janczyk, 2013, 2017, or Miller & Reynolds, 2003, for applications) to identify the source of the RT difference within the stream of processing. Based on Gollwitzer’s (1999) implementation intention account, they argued for a perceptual locus, and indeed reported evidence in support of this idea in their study. Essentially, their observations suggest that the RT difference actually results from facilitated perceptual processing of forced choice stimuli.

In light of the evidence summarized in the last paragraph and the importance of forced and free choice tasks in contemporary research, we argue that effect or goal state anticipation drives response selection in both forced and free choice tasks, but that for the latter task, the effect must be self-generated, which comes with additional demands. Here, we investigate further whether both tasks and their RT difference can be described within a common theoretical framework. This will help to clarify the sources of the RT difference between the tasks.

A sequential sampling account of the RT difference

Sequential sampling models offer tools to delineate the source(s) of the RT difference between free and forced choice tasks. These approaches assume that evidence for one or the other response is (noisily) accumulated until a decision threshold is reached and the corresponding response is initiated (for an overview, see Ratcliff, Smith, Brown, & McKoon, 2016). The best-known model of this class is the drift–diffusion model, proposed by Ratcliff (1978). While many sophisticated models from this family feature high complexity, the present study focuses on a very simple model with three parameters reminiscent of the features in Grice’s (1968) variable criterion model: (1) the decision thresholds that must be reached to count as a decision and to initiate emission of a response, (2) the non-accumulation time reflecting all the time before and after the accumulation time proper (i.e., early perceptual processing, motor execution, and perhaps other additional processes), and (3) the drift rate reflecting the strength of evidence for one particular response, and thus, the amount of evidence for each response added at each time-step. Footnote 2 With higher drift rates, for example, a threshold is on average reached earlier, resulting in shorter RTs and fewer errors (which occur when the incorrect threshold is reached, e.g., due to the noise in the accumulation process). Furthermore, lowering the thresholds (using a more liberal criterion) yields shorter RTs but more errors (because the chance of reaching the incorrect threshold increases), and raising the thresholds (using a more conservative criterion) yields longer RTs and fewer errors. Importantly, the exact kind of evidence that is accumulated is not further specified within this model. In a simple two-alternative forced choice task, one may think of an individual stimulus as the immediate cause of evidence accumulation in one or the other direction, but as already noted in the previous section, it is also conceivable that an anticipated effect or goal state is the source of evidence being accumulated.
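The roles of the three parameters can be illustrated with a minimal simulation sketch. This is not the analysis procedure of the present study: the function names, the symmetric thresholds at plus and minus `threshold`, the unit noise, the step size, and all parameter values are illustrative assumptions.

```python
import random


def simulate_trial(drift, threshold, non_accumulation,
                   noise_sd=1.0, dt=0.002, rng=random):
    """Simulate one trial of a simple two-threshold accumulator.

    Evidence starts at 0 and is accumulated in small time steps until it
    reaches +threshold (correct response) or -threshold (error).
    Returns (rt_in_seconds, correct).
    """
    evidence = 0.0
    elapsed = 0.0
    while abs(evidence) < threshold:
        # Mean increment is drift * dt; Gaussian noise scales with sqrt(dt).
        evidence += drift * dt + noise_sd * (dt ** 0.5) * rng.gauss(0.0, 1.0)
        elapsed += dt
    return non_accumulation + elapsed, evidence > 0


def summarize(drift, threshold, non_accumulation, n_trials=500, seed=1):
    """Return (mean RT over all trials, proportion of errors)."""
    rng = random.Random(seed)
    trials = [simulate_trial(drift, threshold, non_accumulation, rng=rng)
              for _ in range(n_trials)]
    mean_rt = sum(rt for rt, _ in trials) / n_trials
    error_rate = sum(1 for _, correct in trials if not correct) / n_trials
    return mean_rt, error_rate


if __name__ == "__main__":
    # Hypothetical parameter sets: (label, drift, threshold).
    for label, drift, threshold in [("baseline", 1.0, 1.0),
                                    ("higher drift", 2.0, 1.0),
                                    ("lower threshold", 1.0, 0.5)]:
        mean_rt, error_rate = summarize(drift, threshold, non_accumulation=0.3)
        print(f"{label}: mean RT = {mean_rt:.3f} s, errors = {error_rate:.3f}")
```

In such a simulation, a higher drift rate should shorten RTs and reduce errors, whereas a lower threshold should shorten RTs at the cost of more errors, which is the qualitative pattern described above.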

Assuming that effect or goal anticipation in the case of a free choice must happen endogenously (without the stimulus entirely determining the goal as in the case of forced choice trials), at least two scenarios can explain the RT difference between forced and free choice tasks within the framework described above: (1) Accumulation starts at the same time in both tasks, but the evidence driving the accumulation process towards one of the response thresholds is weaker in free choice tasks, and thus, the drift rate is lower (see Fig. 1, left panel). (2) Longer RTs in free choice tasks can also result when drift rates are the same in both tasks, but additional time is needed before (or after) the start of accumulation in free choice tasks (for a more thorough description of the consequences of different onsets of information accumulation, see Bausenhart, Rolke, Seibold, & Ulrich, 2010). In this case, the additional delay would be reflected in the non-accumulation time (see Fig. 1, right panel).

Fig. 1 Two accounts of the mean RT difference between free and forced choice tasks. The continuous black line represents a medium decision threshold. The dashed grey line indicates an increased decision threshold and the dotted grey line indicates a lowered threshold. Under the “drift rate” account, the RT difference between forced and free choice tasks becomes smaller with lower thresholds (ΔRTa > ΔRTb > ΔRTc). In other words, task and threshold manipulation interact with each other. In contrast, with differences in non-accumulation times, the RT difference remains the same irrespective of the threshold (ΔRTa = ΔRTb = ΔRTc) and, therefore, reflects an additive relation between task and threshold manipulation

Even though on the global level, both accounts predict longer RTs in free than in forced choice tasks, there is a way to empirically distinguish them by manipulating the decision thresholds. Under the assumption of different non-accumulation times but equal drift rates, the RT difference between free and forced choice tasks should be independent of the actual threshold (see Fig. 1, right panel) and, therefore, of the same size under liberal and conservative criteria. Thus, task type and the manipulation of the decision thresholds should combine additively, because gathering the required additional information takes the same amount of time when both types of task have the same speed of information acquisition. To use a metaphor: If two horses in a horse race run at the same speed, but one horse starts 5 m closer to the goal than the other horse, the distance between the two horses when they cross the finishing line will not change, even if the goal is moved closer to or farther away from the starting point of the race. In contrast, if there is a difference in the drift rates between the two tasks, the RT difference should become smaller the lower the threshold and bigger the higher the threshold is (see Fig. 1, left panel). In other words, task type and the manipulation should statistically interact. In the horse race metaphor, this means that one horse is faster than the other, but they start in the same position. Over the course of the race, the distance between the two horses would increase. If the race is short (liberal criterion), there is less time for the distance to increase, whereas distance can increase more in a longer race (conservative criterion), resulting in a larger difference.
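The two predictions can be made concrete with the noise-free simplification depicted in Fig. 1, in which the mean RT equals the non-accumulation time plus the threshold divided by the drift rate. The following sketch uses arbitrary, hypothetical parameter values (not estimates from the present data), and the helper names `mean_rt` and `rt_differences` are assumptions introduced for illustration only.

```python
def mean_rt(threshold, drift, non_accumulation):
    """Noise-free accumulator: time to reach the threshold plus residual time."""
    return non_accumulation + threshold / drift


# Hypothetical threshold levels (arbitrary evidence units).
THRESHOLDS = {"liberal": 0.8, "medium": 1.0, "conservative": 1.2}


def rt_differences(forced, free):
    """Free-minus-forced RT difference (in s) at each threshold level.

    `forced` and `free` are dicts with keys 'drift' and 'non_accumulation'.
    """
    return {
        label: mean_rt(a, free["drift"], free["non_accumulation"])
        - mean_rt(a, forced["drift"], forced["non_accumulation"])
        for label, a in THRESHOLDS.items()
    }


# Account 1: lower drift rate in the free choice task, same onset.
drift_account = rt_differences(
    forced={"drift": 4.0, "non_accumulation": 0.20},
    free={"drift": 3.6, "non_accumulation": 0.20},
)

# Account 2: same drift rate, longer non-accumulation time in free choice.
onset_account = rt_differences(
    forced={"drift": 4.0, "non_accumulation": 0.20},
    free={"drift": 4.0, "non_accumulation": 0.22},
)

print("drift rate account:", drift_account)
print("non-accumulation account:", onset_account)
```

Under the first account the difference grows with the threshold (an interaction), whereas under the second it stays constant (additivity), mirroring the horse race metaphor.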

Two previously established methods of manipulating decision thresholds are the amount of catch-trials in an experimental block and time pressure. Catch-trials are trials in which no stimulus appears at the time when a stimulus would normally appear. Participants are instructed not to react to this absence of a stimulus. In general, the more catch-trials there are, the longer the reaction will take (e.g., Gordon, 1967; Näätänen, 1972). It has been theorized that this is because a higher amount of catch-trials leads to a decreased stimulus expectancy, which in turn leads to a higher and thus more conservative decision threshold (e.g., Brysbaert, 1994; Grice, Nullmeyer, & Spiker, 1982; Seibold, Bausenhart, Rolke, & Ulrich, 2011). Another manipulation of the threshold is to vary the time available for responding, that is, varying the time pressure. Increasing time pressure has been repeatedly theorized and empirically shown to lower the decision criterion (e.g., Diederich, 1997; Dror, Basola, & Busemeyer, 1999; Forstmann et al., 2008; Ratcliff & McKoon, 2008).

The present experiments

The aim of the present study was to investigate the RT difference between forced and free choice tasks and to distinguish between the two accounts introduced in the previous section. In Experiment 1, we varied the amount of catch-trials to manipulate thresholds (Näätänen, 1972; Seibold et al., 2011). In Experiments 2 and 3, we manipulated the response deadline (thus inducing time pressure) to manipulate the thresholds (Ratcliff & McKoon, 2008).

Experiment 1

In Experiment 1, participants worked on forced and free choice tasks that were randomly intermingled. We expected longer RTs in the free than in the forced choice task (Berlyne, 1957). The critical manipulation was the proportion of catch-trials within a block (0%, 25%, 50%, or 75%), in which no stimulus appeared and thus no response was to be given. If task type and the catch-trial manipulation affect RTs additively, this would support the idea of comparable drift rates but longer non-accumulation times in free choice tasks. In contrast, if both interact in a way that the RT difference increases with the amount of catch-trials, this would favor an account in terms of different drift rates.

Methods

Participants

Thirty-two persons from the Tübingen area participated (mean (M) age = 24 years; standard deviation (SD) = 3 years; 27 female; one unknown value for age) for monetary compensation or course credit. All participants reported normal or corrected-to-normal vision, were naïve regarding the underlying hypotheses, and provided written informed consent prior to data collection.

Apparatus and stimuli

Stimulus presentation and response collection were done via a standard PC connected to a 17-inch CRT monitor. Stimuli were red, green, and white circles, presented against a black background. Manual responses were collected with the two CTRL keys on a standard keyboard placed on the table in front of the participants.

Tasks and procedure

The task was either to give a predefined response to two of the possible colors (forced choice task: red and green stimuli), or to freely choose one of the two possible responses to the third color (free choice task: white stimulus). On catch-trials, where no stimulus appeared, the participants were instructed not to respond at all. Prior to each block, participants were informed about the percentage of catch-trials in this block. A trial began with the presentation of a small fixation cross (250 ms; see Fig. 2). Following a blank screen (250–350 ms), the stimulus appeared and remained on screen until the response was made. A trial was terminated if no response was given within 1500 ms after stimulus onset. General errors (i.e., no response in non-catch-trials within the time limit of 1500 ms and responses before stimulus appearance) and erroneous responses (response in a catch-trial or wrong key in forced choice trials) triggered respective feedback (1000 ms). The next trial started after an inter-trial interval (ITI) of 1000 ms. Eight blocks of 120 trials (all three stimuli appeared equally often in the normal non-catch-trials) were administered. The amount of catch-trials was varied across four block types (0%, 25%, 50%, and 75%). The first four blocks (one of each type) were ordered by a Latin Square, and the order of the next four blocks was the reverse of the first four blocks.

Fig. 2 Time course of a trial in Experiment 1. No feedback text was given if no error occurred. Feedback text was displayed in German and described the type of error made (“Wrong key!”, “Too slow!”, “No stimulus was given!”)

Participants were tested individually in one single session of about 45 min. Written instructions emphasized speed as well as accuracy and, for the free choice trials, an even distribution of left and right responses as well as the avoidance of patterns in maintaining this distribution. The mappings of stimuli and responses in the forced choice task and the order of blocks were counterbalanced across participants. The data of participants whose free choice responses showed a strong bias towards one response option (>80% of choices) were discarded and new data were collected from new participants with the same block sequence (three participants in this experiment).
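The block sequence described in this section (four block types ordered by a Latin Square, followed by the same four in reverse order) can be sketched as follows. The source does not state which Latin Square was used or how its rows were assigned to participants; the cyclic square and the assignment by participant index below are assumptions, as are the function names.

```python
CATCH_LEVELS = [0, 25, 50, 75]  # percentage of catch-trials per block type


def latin_square(levels):
    """Cyclic Latin square: row i is the level list rotated by i positions."""
    n = len(levels)
    return [[levels[(row + col) % n] for col in range(n)] for row in range(n)]


def block_sequence(participant_index, levels=CATCH_LEVELS):
    """Eight-block sequence: one Latin-square row, then the same row reversed."""
    square = latin_square(levels)
    first_half = square[participant_index % len(levels)]
    return first_half + first_half[::-1]


for participant in range(4):
    print(participant, block_sequence(participant))
```

Reversing the second half balances the average serial position of each block type within a participant, which is one common reason for choosing such a scheme.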

Design and analyses

The experimental manipulations resulted in two independent variables of interest, namely (1) task type (forced choice vs. free choice) and (2) block type (0% vs. 25% vs. 50% vs. 75% catch-trials). Trials with general errors were discarded. For RT analyses, only correct responses were considered (note that no erroneous responses can be made in free choice tasks). Trials with RTs deviating more than 2.5 SDs from the participants’ mean per condition were excluded. Data were then submitted to a 2 × 4 Analysis of Variance (ANOVA) with repeated measures on task type and block type. Percentages of errors (PEs) were only analyzed for the forced choice task with an ANOVA with block type as repeated measures factor. The choice rates in the free choice task were analyzed similarly as a function of block type. p values were Greenhouse–Geisser adjusted when the assumption of sphericity was violated. In these cases, the respective ε is reported.
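The outlier criterion can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis script: the trial representation (a list of dicts with the hypothetical keys `participant`, `task`, `block`, and `rt`), the use of the sample standard deviation, and the handling of cells with fewer than two trials are all assumptions.

```python
from collections import defaultdict
from statistics import mean, stdev


def trim_outliers(trials, criterion=2.5):
    """Drop trials whose RT deviates more than `criterion` SDs from the
    mean of their participant x task x block cell."""
    cells = defaultdict(list)
    for trial in trials:
        key = (trial["participant"], trial["task"], trial["block"])
        cells[key].append(trial)

    kept = []
    for cell_trials in cells.values():
        rts = [t["rt"] for t in cell_trials]
        if len(rts) < 2:
            # An SD cannot be computed from a single trial; keep it as is.
            kept.extend(cell_trials)
            continue
        cell_mean, cell_sd = mean(rts), stdev(rts)
        kept.extend(t for t in cell_trials
                    if abs(t["rt"] - cell_mean) <= criterion * cell_sd)
    return kept


# Toy data: twenty trials at 400 ms and one extreme trial in the same cell.
example = [{"participant": 1, "task": "forced", "block": 25, "rt": 400.0}
           for _ in range(20)]
example.append({"participant": 1, "task": "forced", "block": 25, "rt": 2000.0})
print(len(example), "->", len(trim_outliers(example)))
```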

Results

Participants chose the left response button in the free choice task about 48.7% of the time in the 0%, 48.0% in the 25%, 47.2% in the 50%, and 50.6% in the 75% catch-trials blocks. These differences were not significant, F(3,93) = 1.13, p = .331, ηp² = .04, ε = .69.

Mean correct RTs (2.5% excluded as outliers) are shown in Fig. 3 and are summarized in Table 1. As expected, responses in the forced choice task were faster than in the free choice task, F(1,31) = 55.50, p < .001, ηp² = .64, and responses slowed down with an increasing amount of catch-trials in a block, F(3,93) = 102.59, p < .001, ηp² = .77, ε = .63. This latter result suggests that the manipulation worked as intended and increased the decision thresholds. Most importantly, there was a significant interaction between block type and task type, F(3,93) = 3.15, p = .048, ηp² = .09, ε = .69. A closer look at Fig. 3, however, suggests that this interaction is driven by the smaller RT difference in the 0% catch-trials blocks compared to the other blocks, and arguably, the 0% blocks differ in an important aspect from the other blocks: While in the 0% condition, participants knew that a response is always required, in the other blocks, the additional demand of distinguishing normal from catch-trials was imposed. The drift rate account, however, predicts an increasing RT difference across all levels of increasing decision thresholds (with growing differences as the amount of catch-trials increases). Therefore, we re-analyzed the data but omitted the 0% catch-trial blocks. Again, the two main effects were significant as expected, task type: F(1,31) = 47.33, p < .001, ηp² = .60, and block type: F(2,62) = 90.47, p < .001, ηp² = .74, ε = .87. Clearly, however, their interaction was not significant, F(2,62) = 0.36, p = .697, ηp² = .01.

Fig. 3 Mean correct RTs in milliseconds (ms) from all three experiments as a function of task type and block type. Error bars are 95% within-subject confidence intervals calculated for the difference between free and forced choice tasks collapsed across block types (see Pfister & Janczyk, 2013)

Table 1 Ms (and SDs) of response times (RTs) in ms and percentages of errors (PEs) in forced choice tasks of Experiments 1–3 as a function of block type and trial type

PEs in the forced choice task (i.e., wrong response keys pressed) increased with the amount of catch-trials (see Table 1), F(3,93) = 8.41, p = .001, ηp² = .21, ε = .57. Finally, there was a negligible amount (<0.1%) of catch-trials in which a response was given.

Discussion

The results of Experiment 1 are not in line with the drift rate account, but more compatible with differences in the non-accumulation time. Yet, they entail several aspects that complicate a straightforward interpretation.

First, RTs increased with increasing amount of catch-trials, an observation that complies with the intended manipulation of increasing thresholds (see also Näätänen, 1972; Seibold et al., 2011). At the same time, though, more errors were made in the forced choice task as well. This is unexpected, since increasing thresholds should make errors less likely. Footnote 3 We will get back to this in the General Discussion.

Second, we replicated the common observation of longer RTs in the free than in the forced choice task, and task type interacted with the amount of catch-trials in the initial analyses, which included the 0% catch trials condition. Straightforwardly, this would argue against the account of equal drift rates with the differences arising from different non-accumulation times. Yet, different drift rates should result in increasing RT differences across all amounts of catch-trials, and, clearly, this was not the case. Rather, the interaction was driven by a smaller RT difference in the 0% catch-trial blocks, perhaps reflecting the absence of the additional demand of distinguishing normal from catch-trials in the other blocks. When considering only the comparable blocks with catch-trials, task type and block type combined additively.

Tentatively, we therefore take the results as support for the account of different non-accumulation times between the tasks (see Fig. 1, right panel). At the same time, we wish to avoid premature conclusions based on this single experiment. Accordingly, in the following experiments, we sought converging evidence and employed time pressure as a different means of manipulating response thresholds.

Experiment 2

Experiment 2 followed the same logic as Experiment 1, but time pressure was used to manipulate response thresholds. To individually adjust time limits, the mean and the standard deviation of participants’ RTs in free and forced choice tasks were determined first. Subsequently, the same tasks were presented with three different levels of time pressure that were announced prior to each block and were varied block-wise.

Methods

Participants

Thirty-six persons from the Tübingen area participated (M age = 23 years; SD = 4 years; 31 female) for monetary compensation or course credit. All participants reported normal or corrected-to-normal vision, were naïve regarding the underlying hypotheses, and provided written informed consent prior to data collection.

Stimuli and procedure

Stimuli were adopted from Experiment 1. The task was largely the same, except that there were no catch-trials, and the blank screen interval between the fixation cross and the stimulus’ appearance was fixed to 250 ms. At the beginning, two pre-experimental blocks with a response window of 1500 ms assessed mean RTs of each participant separately for free and forced choices. The respective Ms and their SDs were then used to calculate three different response deadlines separately for the free and forced choice tasks: long (M + SD), medium (M), and short (M − SD). Then three experimental blocks, one of each deadline condition, followed. The order of these blocks and the S-R mapping within the forced choice task were fully counterbalanced. After these three blocks, another three blocks in reverse order followed. At the beginning of every block, the time limit of the task type with the shorter deadline (determined in the first two blocks, see above) was announced to the participants. After each block, participants were informed about how long their responses took on average (averaged across both free and forced choice trials). The same exclusion criterion as in Experiment 1 was used, and data from one participant were discarded and replaced by a new data set in the same condition.
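The individual deadline calculation can be sketched as follows. The function names, the use of the sample standard deviation, and the example RTs are assumptions for illustration; the source does not specify these implementation details.

```python
from statistics import mean, stdev


def response_deadlines(pre_block_rts, sd_multipliers=None):
    """Deadlines (same unit as the RTs) computed as M + k * SD.

    Defaults to the three levels of Experiment 2: short (M - SD),
    medium (M), and long (M + SD).
    """
    multipliers = sd_multipliers or {"short": -1.0, "medium": 0.0, "long": 1.0}
    m, sd = mean(pre_block_rts), stdev(pre_block_rts)
    return {label: m + k * sd for label, k in multipliers.items()}


def announced_deadline(forced_deadlines, free_deadlines, block):
    """The time limit announced before a block: the shorter of the two."""
    return min(forced_deadlines[block], free_deadlines[block])


# Hypothetical pre-experimental RTs (ms) of one participant.
forced = response_deadlines([400, 420, 440])
free = response_deadlines([430, 450, 470])
print(forced, free, announced_deadline(forced, free, "short"))
```

Computing the deadlines separately per task keeps the relative time pressure comparable for forced and free choices even though their baseline RTs differ.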

Design and analyses

The experimental manipulations resulted in two independent variables of interest, namely (1) task type (forced choice vs. free choice) and (2) block type (M − SD vs. M vs. M + SD response deadline). Trials with general errors were discarded. For RT analyses, only correct responses were considered (note that no erroneous responses can be made in free choice tasks), and trials with RTs deviating more than 2.5 SDs from the participants’ mean per condition were excluded as outliers from analyses. Data from the experimental blocks were then submitted to a 2 × 3 ANOVA with repeated measures on task type and block type. Error data were only analyzed for the forced choice task by means of an ANOVA with repeated measures on block type. The choice rates in the free choice task were analyzed similarly, but included the pre-experimental blocks. p values were Greenhouse–Geisser adjusted when the assumption of sphericity was violated. In these cases, the respective ε is reported.

Results

Participants chose the left response button in the free choice task about 54.0% of the time in the M + SD blocks, 56.1% in the M blocks, 58.3% in the M − SD blocks, and 51.1% in the pre-experimental blocks, and the main effect of block type was significant, F(3,105) = 5.65, p = .004, ηp² = .14, ε = .72. In the pre-experimental blocks, mean RTs were 423 ms in the forced choice task and 444 ms in the free choice task, F(1,35) = 18.16, p < .001, ηp² = .34.

Mean correct RTs (1.3% excluded as outliers) are shown in Fig. 3 (middle panel) and are summarized in Table 1. As expected, there was a main effect of block type on RTs, F(2,70) = 363.32, p < .001, ηp² = .91, ε = .75, with higher time pressure induced by shorter response deadlines resulting in shorter RTs, as well as a main effect of task type, F(1,35) = 5.72, p = .022, ηp² = .14, with longer RTs in the free choice task compared to the forced choice task. The interaction between block type and task type was also significant, F(2,70) = 4.50, p = .021, ηp² = .11, ε = .81. Inspection of the RTs revealed virtually no RT difference between both tasks in the high time-pressure (M − SD) block, which may point to a large proportion of fast guesses in this condition. Indeed, the PEs in this block ranged from 9.6% to 46.6%, that is, close to chance level. Thus, we performed a median split based on error rates in this condition (with mean PEs in the M − SD condition of 22% and 38% for the below- and above-median groups, respectively), and ran an ANOVA that included this grouping variable. This ANOVA yielded an almost significant interaction between block type, task type, and the grouping variable, F(2,68) = 2.98, p = .068, ηp² = .08, ε = .84, and we continued to analyze both groups separately. As expected, for the participants with the above-median PEs, the interaction of task type and block type was significant, F(2,34) = 6.41, p = .004, ηp² = .27. In contrast, for the other group of participants with lower PEs (and thus a performance not as close to chance level), the interaction was far from significance, F(2,34) = 0.23, p = .799, ηp² = .01.

The PEs in the forced choice task increased with shorter response deadlines, F(2,70) = 52.16, p < .001, ηp² = .60, ε = .63 (see Table 1).

Discussion

In this experiment, we manipulated the thresholds by inducing time pressure with a response deadline. First, and as expected, RTs were shorter the more time pressure was induced in a block, and the PEs (in the forced choice task) increased accordingly. This pattern suggests that the time pressure manipulation worked as intended. Second, the initial analysis revealed a significant interaction of task and block type. Taking into account PEs, however, post hoc analyses indicated that this interaction likely resulted from a substantial proportion of fast guesses in the high time-pressure condition, which undermines the validity of the measured performance. When considering only the half of participants with below-median PEs, the interaction vanished, and results are compatible with our tentative proposal from Experiment 1, favoring an account in terms of comparable drift rates but different non-accumulation times. In addition, the RT difference in the other two blocks remained constant, whereas the drift rate account would predict an increase of the RT difference with the longer deadline.

To further validate our conclusion that longer RTs in free than in forced choices are due to differences in non-accumulation times rather than in drift rates, we ran Experiment 3. This experiment was essentially a repetition of Experiment 2, but with less severe time pressure to avoid the high error rates that supposedly resulted from fast guesses.

Experiment 3

Experiment 3 used the same setup as Experiment 2, except that we used response deadlines of M, M + 0.5*SD, and M + 1.5*SD to avoid the fast guesses that presumably occurred under the very short time limit in Experiment 2. We expected an additive combination of task type and block type in the present experiment.

Methods

Thirty-six persons from the Tübingen area participated (M age = 23 years, SD = 4 years; 29 female) for monetary compensation or course credit. All participants reported normal or corrected-to-normal vision, were naïve regarding the underlying hypotheses, and provided written informed consent prior to data collection. This experiment was identical to Experiment 2 in all regards with the exception of the time limits, which were set at M, M + 0.5*SD, and M + 1.5*SD.

Results

Participants chose the left response button in the free choice task about 55.6% of the time in the M blocks, 55.0% in the M + 0.5*SD blocks, 55.6% in the M + 1.5*SD blocks, and 52.3% in the pre-experimental blocks. These differences were not significant, F(3,105) = 1.80, p = .164, ηp² = .05, ε = .82. In the pre-experimental blocks, mean RTs were 398 ms in the forced choice condition and 417 ms in the free choice condition, F(1,35) = 15.00, p < .001, ηp² = .30.

Mean correct RTs (1.3% excluded as outliers) are shown in Fig. 3 (right panel) and are summarized in Table 1. As expected, there was a main effect of block type, F(2,70) = 98.41, p < .001, ηp² = .74, ε = .96, with shorter response deadlines resulting in shorter RTs, as well as a main effect of task type, F(1,35) = 12.92, p = .001, ηp² = .27, with longer RTs in free choice tasks compared with the forced choice tasks. The interaction between block type and task type was not significant, F(2,70) = 2.07, p = .133, ηp² = .06.

The PEs in the forced choice task decreased with increasing response deadlines, F(2,70) = 11.93, p < .001, ηp² = .25, ε = .88 (see Table 1).

Discussion

In Experiment 3, we observed no significant interaction between block type and task type and, if anything, the numerical decrease of the RT effect with longer deadlines was in a direction incompatible with the drift rate account (see Fig. 1, left panel). Rather, the results are in line with predictions of different non-accumulation times between the tasks.

General Discussion

Three experiments were run to elucidate the source of the RT difference between forced and free choice tasks. We used the sequential sampling framework to derive two hypotheses (see Fig. 1): First, the difference can arise from differences in the speed of evidence accumulation with drift rates being smaller for free choice tasks (see Fig. 1, left panel). Second, the difference can arise from differences in the non-accumulation time with a later onset (but a similar rate) of accumulation in the case of free choice tasks (see Fig. 1, right panel). To distinguish these two accounts, we manipulated the response thresholds by varying the amount of catch-trials per block in Experiment 1 (Näätänen, 1972; Seibold et al., 2011) and by inducing time pressure via response deadlines in Experiments 2 and 3 (Ratcliff & McKoon, 2008).

Summary of results

First, in all experiments, forced choice stimuli were responded to faster than free choice stimuli. Second, the manipulations of catch-trials and response deadlines effectively changed the overall level of RTs as expected. However, evidence about the nature of interactions between these manipulations and task type was rather mixed. In Experiments 1 and 2, we observed significant interactions, which seemingly argue against non-accumulation time differences. In Experiment 1, though, this interaction was attributable to the block without any catch-trials, thus without an additional demand of distinguishing normal and catch-trials. In Experiment 2, no RT difference between forced and free choice tasks was evident in the high time-pressure condition, and RTs were only about 250 ms. We suspected a large proportion of fast guesses in this case, and indeed, only the participants with above-median PEs yielded a significant interaction. In the group with below-median PEs, and thus a performance not as close to chance level, the interaction vanished. Admittedly, excluding trials or reducing the number of participants lowers the statistical power for detecting an interaction. However, as a further aspect, the drift rate account predicts increasing RT differences with increasing thresholds, and this was not even descriptively the case. The clearest evidence against the drift rate account, however, comes from Experiment 3. This experiment was a repetition of Experiment 2 without a very high level of time-pressure. In this experiment, no interaction was observed and the results are compatible with the predictions derived from assuming differences in the non-accumulation time.

Overall, it seems that the drift rate account received little if any support from these results. In contrast, we did not observe evidence against the idea that there is a difference in non-accumulation times between free and forced choice tasks. Therefore, we suggest that the RT difference between free and forced choice tasks is at least partly caused by additional processes subsumed in the non-accumulation times of free choice tasks.

Limitations

One odd result in Experiment 1 is that PEs increased with increasing proportions of catch-trials. Because a higher PE, especially together with a longer RT, is compatible with a lower drift rate, a possible explanation is that the manipulation in Experiment 1 targeted the drift rates instead of the decision thresholds. This would have broader implications for any argument that assumes or concludes that the proportion of catch-trials influences (only) the decision thresholds (e.g., in Brysbaert, 1994; Grice et al., 1982; Seibold et al., 2011). Should the catch-trial manipulation target the drift rates instead of, or in addition to, the decision thresholds, this of course complicates the interpretation of the results of Experiment 1. The mean RTs and PEs in Experiments 2 and 3, though, were in line with our assumptions about the manipulation of decision thresholds: PEs increased while RTs became shorter with shorter response deadlines and thus increasing time pressure. It should be noted, though, that for time-pressure manipulations and speed–accuracy trade-off instructions as well, concerns have been raised that not (only) decision thresholds but also other parameters such as the drift rate change (e.g., Arnold, Bröder, & Bayen, 2015; Dambacher & Hübner, 2015; Rae, Heathcote, Donkin, Averall, & Brown, 2014; Rinkenauer, Osman, Ulrich, Müller-Gethmann, & Mattes, 2004).

To check which parameters our manipulations affected, we extracted the parameters for the forced choice trials with EZ (Wagenmakers, Van Der Maas, & Grasman, 2007) and compared the parameter estimates across the block types for each experiment (see the Appendix for a summary of these analyses). Notably, in Experiments 1 and 2, the drift rates were, indeed, affected by the manipulations (i.e., smaller drift rates for conditions which should only have higher decision thresholds in Experiment 1 and larger drift rates for conditions which should only have higher decision thresholds in Experiment 2). No such effect was observed for Experiment 3, which was also the one with the most straightforward and clear data pattern in RTs and PEs. Furthermore, in Experiment 1, there was no significant effect on the extracted response thresholds, while for Experiments 2 and 3, the threshold pattern matched our theoretical assumptions. Finally, in all three experiments, there was a significant influence on the non-accumulation times, which increased with the decision thresholds.
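For readers unfamiliar with EZ, the following is a minimal sketch of the closed-form EZ-diffusion equations as published by Wagenmakers et al. (2007). It is not the analysis script used for the Appendix. The function name and the choice to raise an error at the edge cases are assumptions of this sketch.

```python
import math


def ez_diffusion(pc, vrt, mrt, s=0.1):
    """EZ-diffusion parameter estimates (Wagenmakers et al., 2007).

    pc  : proportion of correct responses (must not be 0, 0.5, or 1)
    vrt : variance of correct RTs (in seconds squared)
    mrt : mean of correct RTs (in seconds)
    s   : scaling parameter (0.1 by convention)

    Returns (v, a, ter): drift rate, boundary separation,
    and non-accumulation (non-decision) time.
    """
    if pc in (0.0, 0.5, 1.0):
        raise ValueError("pc of 0, 0.5, or 1 requires an edge correction first")
    logit = math.log(pc / (1.0 - pc))
    x = logit * (logit * pc ** 2 - logit * pc + pc - 0.5) / vrt
    v = math.copysign(1.0, pc - 0.5) * s * x ** 0.25
    a = s ** 2 * logit / v
    y = -v * a / s ** 2
    mean_decision_time = (a / (2.0 * v)) * (1.0 - math.exp(y)) / (1.0 + math.exp(y))
    ter = mrt - mean_decision_time
    return v, a, ter


# Hypothetical summary statistics for one participant in one block type
v, a, ter = ez_diffusion(pc=0.90, vrt=0.015, mrt=0.450)
print(f"v = {v:.3f}, a = {a:.3f}, ter = {ter:.3f}")
```

Applied per participant and block type, the three estimates can then be compared across block types, which is the kind of comparison summarized above.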

The results from Experiments 1 and 2 must therefore be interpreted with some caution. It is unclear, however, whether the effects on drift rates reflect genuine effects of the manipulations or merely trade-offs in the parameter estimation itself. If they are genuine, previous studies using similar manipulations may suffer from the same limitation, which potentially has broader implications for other research fields. Importantly, the results of Experiment 3 revealed no drift rate effect.

Another potential limitation is that our conclusion is based on retaining the null hypotheses of the critical (2 × 3) interaction effects. To facilitate interpretation of the results, we ran a power analysis using G*Power (Faul, Erdfelder, Lang, & Buchner, 2007). To achieve a power of 1 − β = .8 with α = .05 and ρ = .3 for detecting a medium-sized effect, the required sample size was n = 20. Footnote 4

Potential additional processes

If we accept that RT differences between free and forced choices are due to differences in the non-accumulation time, and assume that one or more additional cognitive processes are involved in free choice task performance, what is known about them? Their duration is most likely not, or only minimally, influenced by manipulations of stimulus features, because in Experiment 3 of Janczyk, Dambacher, Bieleke & Gollwitzer (2015), stimulus brightness only affected forced choice RTs but not free choice RTs (both task types were intermixed in the same blocks). As there is an alternative explanation of the latter result (that participants only ruled out the presence of a forced choice stimulus instead of identifying free choice stimuli), this should be seen as a tentative conclusion. We discuss candidates for the additional processes in the following.

(1) Memory processes trigger one or the other response (trial history bias): Part of the premise of free choice tasks is that participants are asked to choose each response option about equally often and without a clear pattern, essentially asking them to act as (pseudo-)random number generators for the experiment. The breadth of the literature on random number generation alone suggests that this task is not trivial and can be approached in many different ways. Participants either really generate random numbers or they try to generate patterns that ‘feel’ random but are, in fact, not. Various biases in human random number generation are known (see also Heuer, Janczyk, & Kunde, 2010, for an overview). Examples are a lack of symmetrical response sequences, a lack of long runs of the same response, or a balancing of responses across short sequences (Bar-Hillel & Wagenaar, 1991). Both negative and positive recency effects (i.e., lowered and heightened chances of repetitions) can be observed under different circumstances (Ayton & Fischer, 2004). We suggest that investigating what strategies, if any, are used to generate the pattern of decisions in free choice tasks could provide insight into the processes that are subsumed in the non-accumulation time. To shed some light on whether a free choice is affected by the immediate history of responses in the preceding trials, we ran a post hoc analysis on choice frequencies. In particular, when comparing the overall percentage of left responses with the percentages of left responses conditional on the previous response (left or right) and type of task (free or forced choice), there were significant differences in all three experiments (see Table 2), Footnote 5 Experiment 1: F(4,124) = 6.11, p = .005, ηp² = .16, ε = .451; Experiment 2: F(4,140) = 8.16, p = .002, ηp² = .19, ε = .404; Experiment 3: F(4,140) = 16.39, p < .001, ηp² = .32, ε = .438.
Interestingly, the resulting pattern of choice frequencies bears similarities to reports in the task switching literature where a response repetition benefit (in RTs) is only observed when the task repeats but not when the task switches (e.g., Kleinsorge, 1999; Rogers & Monsell, 1995). Whether or not the present result of fewer response repetitions following a switch from a forced choice to a free choice extends this effect is open to future research.
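The conditional choice frequencies entering this analysis can be computed along the following lines. This is a minimal sketch with a hypothetical data layout (one list of trials per participant, in presentation order); it ignores block boundaries and any trial exclusions, which an actual analysis would need to handle.

```python
from collections import defaultdict


def left_percentages(trials):
    """Percentage of left responses in free choice trials.

    trials: list of (task, response) tuples in presentation order,
            with task in {"free", "forced"} and response in {"left", "right"}.

    Returns a dict with the key "overall" and keys (previous_task,
    previous_response) for the conditional percentages.
    """
    counts = defaultdict(lambda: [0, 0])  # condition -> [n_left, n_total]
    previous = None
    for task, response in trials:
        if task == "free":
            conditions = ["overall"]
            if previous is not None:
                conditions.append(previous)
            for condition in conditions:
                counts[condition][0] += response == "left"
                counts[condition][1] += 1
        previous = (task, response)
    return {c: 100.0 * n_left / n for c, (n_left, n) in counts.items()}


# Hypothetical trial sequence of one participant
example = [("forced", "left"), ("free", "right"), ("free", "left"),
           ("forced", "right"), ("free", "left"), ("free", "left")]
for condition, percentage in left_percentages(example).items():
    print(condition, percentage)
```

The resulting five percentages per participant (overall plus four conditional values) correspond to the five levels of the factor in the analyses reported above.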

Table 2 Percentage of left responses in free choice trials, both unconditional (column overall) and conditional on the previous trial being a free or forced choice task and left or right responses

In sum, these observations point to the idea that the responses in the immediately preceding trials were considered on a current free choice trial. In other words, participants seem to use systematic strategies to decide what response to give, which takes time and adds to the RTs in free choice tasks.

(2) Endogenous generation of stimulus/effect representations: Free choice tasks as used in this study are usually intermixed with forced choice tasks. In the introduction, we stated that the exact basis of evidence accumulation is not fully specified in diffusion models. First, after having realized that they are in a free choice trial, participants may endogenously generate a representation of one of the two forced choice stimuli, and evidence is then accumulated for internal representations of these stimuli, which are associated with one or the other response. Second, according to Ideomotor Theory (e.g., Greenwald, 1970; Harleß, 1861; Shin et al., 2010), bodily movements are always addressed via an anticipation of the sensory consequences of these movements, that is, their action effects (see also Janczyk, 2016; Janczyk et al., 2017; Kunde, 2001). Importantly, the possible action effects (depressed left/right response keys, visual and proprioceptive feedback from moving a left/right finger) are the same in forced and free choice tasks. A difference, however, is that for forced choice tasks, the stimulus determines the desired action effect, while in free choice tasks, this state must again be generated endogenously. Either way, such processes take time and would, therefore, be compatible with the results of this study.

The present data do not allow distinguishing between these two possibilities, and we do not claim that our list is exhaustive. For example, it is also possible that motor execution takes longer in free choice compared to forced choice tasks. Furthermore, these accounts are not mutually exclusive. It may well be that the choice of an effect is first driven by response history and then the action effect is endogenously generated, and thus both processes contribute to the non-accumulation time.

Modeling free choice data

The present results can be used as constraints for future formal models of free choice behavior that assess their parameters more directly. We are currently aware of only one direct application of a sequential sampling model to data from priming experiments in free and forced choice tasks (Mattler & Palmer, 2012). The most important outcome of the experiments in that study was that the response in the free choice task was biased by a (subliminal) stimulus-preceding prime. In the model, each of the two accumulator nodes inhibits the response unit of the other accumulator. This inhibition may account for potential response–response conflicts in free choices, which slow down responses (Berlyne, 1957). Furthermore, when a free choice stimulus appears, an exponential drop of the decision thresholds is assumed. While this model fits the priming data, the threshold drop can be viewed critically: it is not assumed for the forced choice task, yet it is assumed to start immediately after the appearance of the stimulus, which implies some sort of stimulus identification. Nevertheless, we believe that the approach by Mattler and Palmer (2012) is a valuable step towards the identification of similarities and differences between forced and free choice tasks.
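To make the two ingredients just described more concrete (mutual inhibition and an exponentially dropping threshold), the following toy simulation may help. It is explicitly not the model of Mattler and Palmer (2012): the architecture is collapsed into two rectified accumulators that inhibit each other directly, and the function name, parameter names, and all values are hypothetical.

```python
import math
import random


def simulate_free_choice(prime_bias=0.0, inhibition=0.5, tau=0.3,
                         start_threshold=1.0, floor=0.2, noise=1.0,
                         dt=0.001, max_time=3.0, rng=None):
    """Simulate one free choice trial of a toy two-accumulator race.

    prime_bias > 0 favors the left accumulator, < 0 the right one.
    The shared threshold drops exponentially (time constant tau)
    from start_threshold towards floor after stimulus onset.
    Returns (response, decision_time); response is None on timeout.
    """
    rng = rng or random.Random()
    left = right = 0.0
    t = 0.0
    while t < max_time:
        t += dt
        threshold = floor + (start_threshold - floor) * math.exp(-t / tau)
        # Each accumulator gets its (prime-induced) input, is inhibited
        # by the current activity of the other one, and receives noise.
        d_left = ((prime_bias - inhibition * right) * dt
                  + noise * math.sqrt(dt) * rng.gauss(0.0, 1.0))
        d_right = ((-prime_bias - inhibition * left) * dt
                   + noise * math.sqrt(dt) * rng.gauss(0.0, 1.0))
        left = max(0.0, left + d_left)    # activity cannot become negative
        right = max(0.0, right + d_right)
        if left >= threshold or right >= threshold:
            return ("left" if left >= right else "right"), t
    return None, max_time


rng = random.Random(2024)
trials = [simulate_free_choice(prime_bias=2.0, rng=rng) for _ in range(200)]
share_left = sum(response == "left" for response, _ in trials) / len(trials)
print(f"share of left responses with a left-favoring prime: {share_left:.2f}")
```

In such a sketch, a prime shifts the choice proportions without any imperative stimulus information, while the dropping threshold guarantees that a response is eventually given even when neither accumulator receives systematic input.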

Conclusion

Applying a framework borrowed from diffusion models, we observed no evidence that the mean RT difference between free and forced choice tasks is attributable to a higher drift rate in forced than in free choice tasks. Our results are rather compatible with a delay of the information accumulation process in free compared to forced choice tasks. Future work should aim at identifying the nature of this delay and the concurrent processes in more detail.