Introduction

For perception and action in everyday life, it is crucial to develop expectations about possible future events and also about the moments in time when these events might occur. Highly coherent events, such as speech, body gestures or music, contain structural regularities that allow us to predict specific events at specific time points (Jones & Boltz, 1989). In Western tonal music, listeners familiar with the musical idiom anticipate “what” kind of musical events might occur and “when” in time they are likely to occur. These expectations are based on the pitch and time dimensions of tonal music.

Pitch dimension

Listeners’ expectations about what kind of musical event occurs next are linked to the regularities of the pitch dimension in Western tonal music. These regularities imply frequencies of co-occurrence between musical events (i.e., notes, chords) and frequencies of occurrence of events in a given tonal context. Footnote 1 In a musical piece, notes and chords belonging to one key are frequently associated, and those with highly referential musical functions (e.g., tonic chords) are used more often than are less referential ones (Francès, 1958; Krumhansl, 1990). Listeners become sensitive to these regularities via implicit learning processes with mere exposure to musical pieces obeying this system (Francès, 1958; Krumhansl, 1990; Tillmann, Bharucha & Bigand, 2000). The implicit musical knowledge allows listeners to develop expectations for future events. When listening to a musical context with an established tonal center (i.e., key), the underlying tonal-harmonic hierarchies evoke the expectations that referential tones and chords are more likely to occur in the remainder of the piece than other events. For chords, the harmonic priming paradigm investigates the influence of these expectations on processing speed. A prime context (i.e., a single chord or a chord sequence) is followed by a target chord, on which participants are required to perform a simple perceptual judgment. Footnote 2 The critical point consists in manipulating the harmonic relatedness between prime and target and in analyzing its influence on task performance. 
Targets are processed faster and more accurately when they are preceded by a harmonically related prime than by an unrelated prime (Bharucha & Stoeckig, 1986, 1987; Justus & Bharucha, 2001; Tekman & Bharucha, 1992, 1998; Tillmann & Bharucha, 2002; Tillmann, Bigand, & Pineau, 1998), and when they have a strong referential function in the prime context than when they have a less referential one (Bigand, Madurell, Tillmann, & Pineau, 1999; Bigand & Pineau, 1997; Tillmann & Bigand, 2001). Harmonic priming has been interpreted in terms of tonal knowledge activation, which allows listeners to form musical expectations about what kind of event might come next.

Time dimension

Listeners are sensitive to temporal regularities in event sequences and develop expectations about the temporal occurrence of future events (Fraisse, 1974; Longuet-Higgins & Lee, 1984; Povel, 1981; Jones, 1976; Lerdahl & Jackendoff, 1983). Temporal regularities include the organization of event onset intervals through time leading to a sensation of meter - a sensation of a regular succession of strong and weak beats superimposed over an isochronous pulse. They also include the temporal patterns of onset intervals creating rhythms that are perceived against the metrical background. Processing advantages for metrical sequences with a hierarchical structure over nonmetrical sequences have been observed in production tasks (Essens & Povel, 1985), detection tasks and memory (Bharucha & Pryor, 1986; Palmer & Krumhansl, 1990; Yee, Holleran & Jones, 1994). Metrical structure is also reflected in completion judgments: equitonal sequences ending on a strong beat are judged as more complete than sequences ending on a weak beat (Palmer & Krumhansl, 1987a, 1987b).

Pitch and time

For the respective contributions of tonal and temporal regularities in music perception, two theoretical frameworks have been distinguished (Peretz & Kolinsky, 1993): a single-component model that predicts interactive interference between the processing of the two dimensions (Jones & Boltz, 1989), and a two-component model that postulates independence between the processing of temporal and nontemporal information (Peretz & Kolinsky, 1993).

In the single-component model (Jones & Boltz, 1989), perception and memory of musical sequences are dynamic, context-specific processes that are related to attention. Pitch structures create tonal-harmonic accents and temporal structures create temporal accents. These accents are integrated in “a joint-accent structure” that guides listeners’ attention during the ongoing musical sequence and defines the dynamic shape of the sequence. This approach suggests that tonal and temporal structures are not processed independently, but interact. As a consequence, the modification of one dimension inevitably influences the processing of the other dimension. For example, changing the temporal organization of the same notes influences the mental representation of the musical structure (see also Lerdahl & Jackendoff, 1983). For melodies, the temporal organization affects memory performance in recognition and recall (Bigand & Pineau, 1996; Boltz, 1991, 1993), the evaluation of musical tension (Bigand, 1997), the degree of completion judgments (Boltz, 1989a, 1989b) and the estimation of duration (Boltz, 1993a, 1995, 1998; Jones, Boltz, & Klein, 1993). In Boltz (1989b), for example, melodic and temporal accents were manipulated in melodies. The melodies ended on notes with different degrees of tonal stability, and they were constructed in such a way that the last note of the melody either coincided with a temporal accent of the melody’s structure or occurred too early or too late. Subjective ratings of musical completion of the melody (i.e., judging the strength of the sense of finality) showed an influence of tonal and temporal manipulations and an interaction of both factors. Melodies were judged as being more complete when ending on the tonic tone in comparison to the mediant (third degree) or the leading tone (seventh degree). 
They were judged as more complete when the final tone occurred on-time than when it occurred too early or too late - with the early endings receiving the lowest completion judgments. A significant interaction between tonal ending and temporal accent structures indicated that the temporal manipulation had a stronger impact on melodies ending with less stable tonal events.

In the two-component model, the processing of pitch and time is supposed to be independent at first, with information being integrated at later processing stages (Peretz & Kolinsky, 1993; Peretz & Coltheart, 2003). Empirical support comes from neuropsychological cases and experimental data. Neuropsychological cases documented a double-dissociation between amelodia and arhythmia: some patients showed a selective deficit of pitch processing with temporal processing being preserved, and other patients showed the reverse pattern (Peretz, 1990, 1996; Peretz & Kolinsky, 1993; Peretz & Morais, 1989; Piccirilli et al., 2000). Further evidence is based on perceptual ratings (Palmer & Krumhansl, 1987a, 1987b): melodic sequences were presented either in their original version (pitch and rhythm) or in two modified versions that kept only the information of one of the two dimensions (i.e., isotonous sequences respecting the original rhythm and isochronous sequences respecting the original pitches). Completion judgments provided evidence for independent processing of the two dimensions: judgments of the original excerpts (i.e., pitch and time) were predicted by a linear combination of judgments of the versions keeping only one dimension (i.e., pitch only, time only).
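The contrast between the two accounts can be written as a regression sketch. The symbols are illustrative labels introduced here, not notation from Palmer and Krumhansl (1987a, 1987b): J_PT stands for the completion judgment of the original excerpt, J_P for the judgment of the pitch-only (isochronous) version, and J_T for the judgment of the time-only (isotonous) version. Under these assumptions, independent processing corresponds to the additive form, whereas an interactive account would require an additional product term.

```latex
% Additive (independent) combination of the two single-dimension judgments
J_{PT} \approx \beta_0 + \beta_P J_P + \beta_T J_T

% Interactive alternative: the product term carries the pitch-by-time interaction
J_{PT} \approx \beta_0 + \beta_P J_P + \beta_T J_T + \beta_{P \times T}\, J_P J_T
```

In this reading, evidence for independence amounts to the additive form predicting the judgments of the original excerpts without any need for the product term.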

For music cognition, it is crucial to understand how listeners process pitch and time dimensions together, as their combination defines the musical structure of a piece (e.g., Lerdahl & Jackendoff, 1983). Surprisingly few studies have investigated influences of pitch and time in music perception, and existing studies have mainly focused on melodies. In melodies, series of notes are presented sequentially with different durations. Listeners are sensitive to the implied harmony and infer the harmonic meaning from the tonal melodies (see also Holleran et al., 1995). The harmonic organization is explicit in chord sequences, with each chord consisting of three or more notes played simultaneously. The question arises whether the respective influences of pitch and time dimensions in harmonic sequences remain similar to their influences in melodic sequences. The more directly established tonal organization might reinforce the contribution of the pitch dimension.

Some studies extended the investigation of pitch and time processing from melodies to melodies with harmonic accompaniment. Using a sonata theme combining melody and harmony, Palmer and Krumhansl (1987b) replicated independent processing of the two dimensions. Using short melodies accompanied by four chords, Schmuckler and Boltz (1994) reported interactive influences. The last chord of the accompaniment was either harmonically related, less related or unrelated to the context, and was played either on time, earlier or later than expected on the basis of the context’s regular meter. Additionally, an irregular context (i.e., the first three context chords presented without periodicity) was created to hinder the development of temporal expectations. Participants judged the degree of fit of the last chord (i.e., the target) to their expectations and whether this chord belonged to the sequence or not. These subjective judgments showed an influence of harmonic and temporal manipulations (e.g., more strongly fulfilled expectations for related than for unrelated chords), and an interaction: in regular contexts (but not in irregular ones), related targets were judged as better fitting than less-related targets only for targets played on time or late, but not for targets played early. This study did not isolate pitch and time influences on chord perception, since tonal and temporal manipulations of the targets also changed the targets’ relations to the simultaneously sounding melodic line (i.e., creating in some cases acoustical dissonance). For chord sequences, a first attempt to investigate harmonic and temporal structures was made with the priming paradigm (Bigand et al., 1999, Experiment 3). The manipulation of harmonic relatedness between target and prime context was crossed with the manipulation of the temporal structure of the sequence. 
The placement of a fermata (i.e., increased duration of a chord in comparison to surrounding chords) created either a temporally symmetric structure or an asymmetric structure. With this manipulation, listeners’ attention was expected to be drawn either to the last chord (the target) or to the penultimate chord (Jones, 1987). The priming data showed a marginally significant interaction between temporal organization and harmonic context: chord processing took the longest when the target was harmonically and temporally unexpected.

The aim of our study was to further investigate the processing of pitch and time dimensions in chord sequences. For the pitch dimension, the harmonic relation between the prime context’s tonality and the last chord of the sequence (i.e., the target) was manipulated so that the target was either strongly related (i.e., a stable, tonic chord) or less related (i.e., a less stable subdominant chord) (Bigand et al., 2001, 2003). For the time dimension, two manipulations were made. First, the chords of the prime sequence were played either regularly (i.e., isochronously) or irregularly. The regular, invariant presentation should allow listeners to develop temporal expectations about when the next chord (and mainly the last chord, the target) should occur. This temporal regularity was disrupted in irregular sequences, in which the periodicity between successive chords was varied. Playing chords without periodicity was intended to make the next chord difficult to anticipate. Second, the temporal occurrence of the final chord of the sequence was manipulated (i.e., the ending time of the sequence): the target occurred either on time, respecting the regularly established timing intervals of the context, or later or earlier than expected on the basis of this regularity. Footnote 3 These experimental manipulations were similar to those of Schmuckler and Boltz (1994), but were applied to chord sequences and combined with two experimental tasks. To measure the influence of tonal and temporal expectations on speed of chord processing, the priming paradigm was used (Experiments 1 and 4). In the priming paradigm, participants performed a task on a feature that is orthogonal to the experimental manipulations of pitch and time, which allows for an indirect measurement of developed expectations and their influence on chord processing. To allow for comparison with direct tasks mainly used for melodies in previous research, the same material was tested with subjective judgments of completion (Experiments 2 and 3).

Experiment 1

Experiment 1 investigated the influence of musical expectations about the “what” and the “when” of future events on the processing speed of a target chord. In 8-chord sequences, the harmonic relations between target and prime context and the temporal structure (i.e., ending time and temporal regularity of the context) were manipulated. Facilitated processing was predicted for events that were harmonically and temporally expected. For the pitch dimension, we expected to replicate previously reported harmonic priming effects with facilitated processing for the related tonic over the less-related subdominant. For the temporal structure, the hypothesis was that the regular context allows listeners to develop expectations about the temporal occurrence of the target, leading to faster processing for on-time targets in comparison to targets occurring early or late. More specifically, the influence of ending time was expected to be stronger for events occurring too early in time (Boltz, 1989b; Schmuckler & Boltz, 1994). These temporal differences were not expected for the irregularly played sequences, for which response times should be slowed down overall in comparison to regular sequences. The crossed manipulation of harmonic relations and temporal structures allowed us to investigate to what extent pitch and time dimensions influence chord processing interactively or independently.

For the priming paradigm, a timbre discrimination task was used. The prime context was played with a piano timbre and the target with one of two different timbres (i.e., providing a surface marker that clearly indicated the to-be-judged target). Participants indicated whether the target was played by timbre A or timbre B. The aim of this task was to remove possible temporal ambiguities about when to respond in the irregular sequences and in sequences with early or late targets.

Methods

Participants

Twenty-five volunteers participated in Experiment 1. Years of instrumental practice ranged from 0 to 14 (mean = 2.88, std = 3.9).

Material

Twenty-four 8-chord sequences from Bigand et al. (2003) were used as the basis for the experimental material. These 24 sequences were constructed in the following way: twelve 6-chord sequences (covering the 12 major keys) were completed with two different 2-chord endings so that the last chord acted as either a related tonic chord or a less-related subdominant chord. In these sequences, the first seven chords defined the prime context and the last one, the target. The related target never occurred in the prime context, but the less-related target occurred once or twice. Observing facilitated processing of the related target thus provides evidence for cognitive priming in contrast to sensory, perceptual priming (cf. Bigand et al., 2003). On the basis of this material, two temporal manipulations were made: the prime’s regularity and the ending time. Regularity: The prime context was played regularly or irregularly. In regular sequences, the stimulus onset asynchrony (SOA) between two chords was kept constant for the seven prime chords (SOA = 640 ms), so that the prime chords started at fixed time points: t1 = 0 ms, t2 = 640 ms, t3 = 1280 ms, t4 = 1920 ms, t5 = 2560 ms, t6 = 3200 ms, t7 = 3840 ms. Twelve irregular sequences were built on the basis of regular sequences (cf. Appendix). The first and seventh chords were played at the same time points as in regular sequences (i.e., t1, t7), but the SOAs of chords in positions 2 to 6 were manipulated in order to destroy the context’s isochrony. These SOAs varied between 320 and 1280 ms: chords were placed at different temporal positions with the restriction that no more than one chord was played at t2 to t6. In regular sequences, each chord was played for 640 ms. In irregular sequences, chords sounded for the duration of the SOA. Ending time: In regular and irregular sequences, the target was played either on time (t8), earlier (t8 − 240 ms) or later (t8 + 240 ms) than predicted by the regular SOA. The target sounded for 2000 ms. 
Figure 1 (top) displays one example sequence of the experimental material with related and less-related endings; Fig. 1 (bottom) schematically displays the temporal manipulations. Sound examples are available at http://olfac.univ-lyon1.fr/unite/equipe-02/tillmann/sound_examples.html.
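The regular timing grid described above can be sketched in a few lines of Python. This is a minimal illustration, not the software used to build the stimuli: the function and constant names are hypothetical, and the on-time target onset t8 is assumed to continue the 640 ms grid (t7 + 640 ms), which the text implies but does not state as a number. Irregular primes are not modeled because their onsets are given only in the Appendix.

```python
# Timing values taken from the Material section; all names are illustrative.
SOA_MS = 640          # constant onset-to-onset interval in regular primes
N_PRIME_CHORDS = 7    # chords 1 to 7 form the prime context
SHIFT_MS = 240        # early/late displacement of the target


def prime_onsets(soa_ms=SOA_MS, n_chords=N_PRIME_CHORDS):
    """Onset times (ms) of the prime chords in a regular sequence."""
    return [i * soa_ms for i in range(n_chords)]


def target_onsets(soa_ms=SOA_MS, n_chords=N_PRIME_CHORDS, shift_ms=SHIFT_MS):
    """Target onset (ms) for each ending-time condition.

    Assumption: the on-time target (t8) falls one regular SOA after t7.
    """
    t8 = n_chords * soa_ms
    return {"early": t8 - shift_ms, "on time": t8, "late": t8 + shift_ms}


print(prime_onsets())
print(target_onsets())
```

Under this assumption, the early and late targets fall 400 ms and 880 ms after the onset of the seventh chord, compared with 640 ms for the on-time target.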

Fig. 1 Top) An example pair of the experimental sequences with A) representing the related condition and B) the less-related condition. Bottom) Schematic representation of the temporal manipulations. a)-c): regular sequences with the target played a) on time, b) too early and c) too late. d)-f): irregular sequences with the target played at the same three ending times.

Apparatus

The timbres were produced by a Yamaha SO3 synthesizer: prime chords were played with an acoustic grand piano sound (reference XG001) and the target by either an acoustic guitar sound (XG026) (timbre A) or a harp sound (XG047) (timbre B). The Yamaha synthesizer was controlled through a MIDI interface by a Macintosh computer running Digital Performer 3.01 software (Mark of the Unicorn). The sequences were recorded by SoundEdit 16 Software version 2 (MacroMedia), and the experiment was run on Psyscope 1.2.5 PPC software (Cohen, MacWhinney, Flatt, & Provost, 1993).

Procedure

The procedure was split into two phases. In the first phase, participants were trained to differentiate timbre A and timbre B with twenty-four isolated chords and with six eight-chord sequences. They were asked to judge as quickly and accurately as possible the timbre of the isolated chord (or the last chord of the sequence) by pressing one key for timbre A and another key for timbre B on a computer keyboard. Timbre A was defined as sounding rather “bright” and timbre B as rather “dull”. For the chord sequences, participants were informed that all sequences contained eight chords and that the first seven chords were played by a piano-like sound. In the second phase, participants judged the timbre of the target with chord sequences only. In both phases, incorrect responses were accompanied by an alerting feedback signal and a correct response stopped the sounding of the target. After each response, a 250 ms noise mask was presented. Pressing the space bar on the computer keyboard started the next trial. Short breaks were imposed during the experimental session, and participants could take additional breaks by delaying pressing the space bar. The duration of the experiment was about 20 min.

Design

The within-subject factors were Relatedness (related, less related), Regularity (regular, irregular) and Ending time (early, on time, late). Crossing these factors produced 12 possible versions for each of the 12 basic sequences, which were split into two sets of six. For half of the participants, one set was presented with timbre A, the other set with timbre B. For the other half of the participants, the target timbre was reversed for each set. Each participant judged 144 sequences presented in random order.
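The factorial design can be made concrete with a short Python sketch. It is an illustration only: the names are hypothetical, and the assumption that the two timbre sets correspond to sequences 1 to 6 and 7 to 12 is made here for the example, since the text does not say how the twelve sequences were divided.

```python
import random
from itertools import product

RELATEDNESS = ("related", "less related")
REGULARITY = ("regular", "irregular")
ENDING_TIME = ("early", "on time", "late")
N_BASIC_SEQUENCES = 12


def build_trials():
    """Cross the three within-subject factors with the 12 basic sequences."""
    conditions = list(product(RELATEDNESS, REGULARITY, ENDING_TIME))
    return [(seq, *cond)
            for seq in range(1, N_BASIC_SEQUENCES + 1)
            for cond in conditions]


def target_timbre(sequence_id, participant_group):
    """Counterbalanced target timbre for participant groups 0 and 1.

    Assumption: set 1 = sequences 1-6, set 2 = sequences 7-12.
    """
    in_first_set = sequence_id <= N_BASIC_SEQUENCES // 2
    return "A" if in_first_set == (participant_group == 0) else "B"


trials = build_trials()
random.shuffle(trials)  # each participant received a random order
print(len(trials))
```

Each basic sequence appears in all 12 factor combinations, and the timbre assignment is reversed between the two participant groups so that timbre is not confounded with a particular set of sequences.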

Results

Percentages of correct responses and correct response times were averaged over all sequences in each experimental condition and were analyzed by two 2 × 2 × 3 ANOVAs with Relatedness (related, less related), Regularity (regular, irregular) and Ending time (early, on time, late) as within-subject factors. Performance accuracy (Table 1) was high overall, and no effects were significant. For correct response times (Fig. 2), the main effect of Relatedness was significant (F (1,24) = 7.36, p < 0.05, MSE = 3478.77): participants responded faster for related than for less-related targets. Overall, response times were shorter when the context was played regularly than irregularly, as reflected in the significant main effect of Regularity (F (1,24) = 22.43, p < 0.001, MSE = 3209.32). Additionally, the main effect of Ending time was significant (F (1,24) = 10.25, p < 0.001, MSE = 2235.06): response times were slower for early targets than for on-time or late targets. There were no significant interactions involving Relatedness and temporal modifications (Regularity or Ending time) (Fs < 1), nor any other significant effects. An additional contrast focusing on the interaction between Regularity and Relatedness for on-time targets only was not significant, F < 1. An ANOVA with chord sequences as the random variable confirmed the main effects of Relatedness, F2 (1,11) = 7.55, p < .05, Regularity, F2 (1,11) = 27.47, p < .001, and Ending time, F2 (2,22) = 16.33, p < .0001. To study a possible influence of musical practice, an additional analysis separated participants into 12 nonmusicians (without any musical practice) and 13 experienced listeners (mean of 5.5 years of instrumental practice). The between-subjects factor Musical Expertise yielded only a significant main effect, F (1,23) = 17.12, p < 0.001, MSE = 113485.3, with faster response times in the more experienced group.

Table 1 Percentages of Correct Responses in Experiments 1 and 4 as a function of Regularity (regular, irregular), Ending Time (early, on time, late) and Relatedness (related, less related)

Fig. 2 Correct response times in Experiment 1 as a function of regularity of the prime context (regular, irregular), ending time (early, on time or late) and harmonic relatedness (related, less related).

Discussion

Experiment 1 investigated the influence of tonal and temporal structures on chord processing with a priming paradigm. The data provide evidence for priming effects due to the two manipulations, but no interactive relations between them. The harmonic relatedness influenced chord processing with faster response times for related than less-related targets. The regular sequences with on-time targets replicated the previously reported harmonic priming effect (Bigand et al., 2003). The facilitation for related targets was observed not only for on-time targets in regular sequences, but also for targets played early or late, and for irregularly played sequences.

The two temporal manipulations of prime context and target (i.e., regularity and ending time) influenced chord processing. In irregular sequences, responses were overall slowed down in comparison to regular sequences. The temporal jittering of the prime chords had an effect on performance suggesting that listeners were less precise in anticipating the target at the end of the sequence. The temporal occurrence of the target also influenced processing: response times were longer for early targets than for on-time and late targets. This asymmetry between early and late is similar to data reported for melodies (Boltz, 1989) and for melodies with harmonic accompaniment (Schmuckler & Boltz, 1994). Footnote 4

Contrary to our hypothesis, the effect of ending time was not significantly altered by context regularity. This missing interaction between the two temporal manipulations might be due to the local nature of the priming task. In the timbre discrimination task, listeners focused their attention locally on the final events. Over the course of the experiment, the SOA of 640 ms occurred more often than other SOAs since it defines the temporal interval of the regular prime context. Consequently, a temporal interval of 640 ms would be expected more strongly to occur between the two final events than shorter or longer intervals. The shorter interval led to increased response times. The behavioral consequences of the longer interval were less strong (processing times did not increase) as participants were ready to go thanks to the developed expectations.

Finally, the temporal manipulations did not interact with the harmonic relatedness manipulation. The harmonic relatedness effect was not modified by the regularity of the context nor by the ending time of the target. For all ending times (early, on-time, late), response times were the fastest when both tonal and temporal regularity expectations were respected (related, regular) and the slowest when both were violated (less-related, irregular).

Experiment 2

The priming data of Experiment 1 showed that chord processing is influenced by tonal and temporal expectations, but no evidence for an interaction of these two expectancy types was observed. The priming paradigm was used to investigate the influence of expectations on speed of chord processing. In comparison to previous studies on pitch and time perception, our study differed not only in the material used (harmonic sequences instead of melodies with or without accompaniment), but also in the experimental task. Discussions of task influences have previously tried to reconcile differences between experimental data showing either independent or interactive processing (Boltz, 1989, 1993, 1999; Peretz & Kolinsky, 1993; Peretz & Morais, 1989). Peretz and Morais (1989) proposed that pitch and time are processed independently at early processing stages, but are integrated later. Experimental tasks tapping into post-perceptual processes, which might also include decisional stages (e.g., judgments of similarity or memory tasks), should show an interaction between the dimensions. In previous studies reporting interactions between pitch and time, subjective judgments of completion (or musical tension) are among the experimental tasks used (Boltz, 1989a, 1989b; Bigand, 1997), next to memory tasks (Boltz, 1991, 1993; Bigand & Pineau, 1996) or duration estimations (Boltz, 1993; Jones et al., 1993). Completion judgments require global integration processes to evaluate the entire sequence including its ending. It might be argued that this task taps into a stage or type of processing that is a better candidate to reveal interactive effects between pitch and time, even in chord sequences. The priming task focused on a local judgment and could be done even without considering the entire context. 
Footnote 5 It might thus be tapping into processing steps that focus on a local level of the sequence without connecting to the global context and thus favoring independent influences of the two dimensions.

In Experiment 2, participants were required to judge the degree of completion of chord sequences from Experiment 1. If the observed differences between our results and those in the literature are due to the nature of the material (harmony versus melody), pitch and time should independently influence the completion judgments. If they are due to the experimental tasks, an interaction between the two dimensions should also be observed for the harmonic material with the subjective judgments.

Methods

Participants

Twenty-five volunteers participated in Experiment 2; none had participated in Experiment 1. Practice on an instrument ranged from 0 to 10 years (mean = 3.12, std = 3.22).

Material, apparatus

The sequences of Experiment 1 with targets played by timbre B were used. Six practice trials were constructed with material and apparatus described in Experiment 1.

Procedure

Participants were asked to rate the degree of completion (i.e., how well the sequence is ending) of each sequence by using an 8-point scale (1 = incomplete; 8 = very complete). At the end of each sequence, the scale appeared on the screen and participants had to make their judgment by pressing one of eight keys on the computer keyboard within five seconds. Pressing the space bar started the next trial. Before the experimental phase, participants received practice trials to familiarize themselves with the experimental procedure. Short breaks were indicated during the experiment, and participants could add further breaks by delaying pressing the space bar. The duration of the experiment was about 20 min.

Design

The within-subject factors were Relatedness (related, less related), Regularity (regular, irregular) and Ending time (early, on time, late). Crossing these factors produced 12 versions for each of the 12 basic sequences, resulting in 144 sequences. Each participant judged 144 sequences presented in random order.

Results

Completion ratings (Fig. 3) were averaged over sequences for each condition and were analyzed by a 2 × 2 × 3 ANOVA with Relatedness (related, less related), Regularity (regular, irregular) and Ending time (early, on time, late) as within-subject factors. The main effect of Relatedness was significant: sequences ending on related targets were judged as being more complete than sequences ending on less-related targets (F (1,24) = 44.85, p < 0.0001, MSE = 1.94). Concerning the temporal manipulations, the main effect of Ending time was significant (F (1,24) = 18.96, p < 0.0001, MSE = 1.001) and Ending time interacted with Regularity (F (2,48) = 11.27, p < 0.0001, MSE = 0.42). In regular sequences, early targets were rated less complete than on-time or late targets, but they were judged equally complete in irregular sequences. No other effects were significant. For regular (but not for irregular) sequences, the mean differences in ratings between related and less-related targets were slightly reduced for early targets in comparison to late or on-time targets, but the three-way interaction did not reach significance (p = .12). A planned comparison on this specific contrast for regular sequences was significant, F (1,24) = 9.67, p < .005. An ANOVA with chord sequences as the random variable confirmed the main effect of Relatedness, F2 (1,11) = 73.24, p < .0001, the main effect of Ending time, F2 (2,22) = 129.99, p < .0001, and their interaction, F2 (2,22) = 42.40, p < .0001. An additional analysis separated participants into 9 nonmusicians (without any musical practice) and 16 experienced listeners (mean of 5.2 years of instrumental practice). The between-subjects factor Musical Expertise yielded only a significant interaction with Relatedness, F (1,23) = 6.82, MSE = 1.56, p < .05: differences between related and less-related sequences were more pronounced for musically experienced participants, but remained significant for nonmusicians, F (1,23) = 7.28, p < .05.

Fig. 3 Completion ratings in Experiment 2 (from 1 incomplete to 8 very complete), presented as a function of regularity of the prime context (regular, irregular), ending time (early, on time or late) and harmonic relatedness (related, less related).

Discussion

Experiment 2 investigated the influence of both pitch and time dimensions on subjective completion judgments. Independently of the temporal manipulations, the sequences ending on related targets were judged as being more complete than sequences ending on less-related targets. The subjective judgments reflect participants’ sensitivity to the degree of tonal stability of the final chord, with sequences ending on the most referential chord (i.e., the tonic) providing the strongest feeling of finality. This outcome corroborates previous findings on the perception of tonal stability in melodic and harmonic sequences (e.g., Hébert et al., 1995; Bigand & Pineau, 1997; Boltz, 1989a, 1989b; Bigand, 1997). The temporal manipulations also influenced the subjective completion judgments. For regular sequences, an asymmetry in the ending time effect was replicated, with a delayed ending being better tolerated than an anticipated ending. Sequences ending on late targets were judged to be as complete as sequences ending on time, but the judged completion decreased for early targets. For irregular sequences, judgments were similar for targets played early, on time and late. This interaction between ending time and regularity corresponds to our initial hypothesis: the effect of anticipated or delayed endings does not apply as strongly for irregular sequences as for regular sequences, for which specific temporal expectations have been built up during the context. The temporal manipulation of the prime context’s regularity thus had the expected effect, which was not reflected in the response times of the priming task (Experiment 1). The completion judgments might have strengthened listeners’ global evaluation of the sequence, with expectancies spanning the entire sequence. This outcome is in agreement with the subjective judgments reported by Schmuckler and Boltz (1994) for melodic sequences with harmonic accompaniment.

The comparison between Experiments 1 and 2 addresses a possible influence of task demands on the relative contributions of pitch and time. The completion judgments were sensitive to experimental manipulations on pitch and time dimensions, but no interaction involving both temporal manipulations and harmonic relatedness was observed. In contrast to the priming task, however, the completion judgments suggest an interaction between the two dimensions for regular sequences. Experiment 3 addresses the hypothesis that the presence of irregular sequences in the same experimental session, and the associated carry-over effects, might have reduced the strength of regular attention cycles.

Experiment 3

Although regularity has been defined as a within-subject factor in previous pitch and time experiments on melody perception (e.g., Boltz 1989a, 1989b; Schmuckler & Boltz, 1994), it has also been reported that rhythmic context influenced melody recognition only when it was manipulated as a between-subjects factor, but not as a within-subject factor (Kidd, Boltz & Jones, 1984). In Experiments 1 and 2, the two temporal manipulations and the harmonic manipulation defined within-subject factors. It might thus be argued that the strong temporal manipulation of the irregular sequences reduced the influence of the other temporal variable (i.e., ending time) as well as its interaction with the harmonic manipulation. The data of Experiment 2 suggest some interactive influences for the regular sequences. Experiment 3 thus focused on the regular sequences only, that is, the sequences that instill listeners’ rhythmic attention cycles and are closest to natural music listening situations. Our hypothesis was that under these improved experimental conditions completion judgments should reveal an interaction between harmonic relatedness and ending time.

Methods

Participants

Twenty-one volunteers participated in Experiment 3; none had participated in Experiments 1 and 2. Practice of an instrument ranged from 0 to 17 years (mean = 3.9, SD = 4.71).

Material, apparatus, procedure

The regular sequences of Experiment 2 were used with the procedure as described in Experiment 2. The experiment lasted for about 10 min.

Design

The within-subject factors were Relatedness (related, less related), and Ending time (early, on time, late). Crossing Relatedness and Ending time produced 6 possible versions for each of the 12 basic sequences, resulting in 72 sequences. Each participant judged 72 sequences presented in random order.
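
The crossing described in this design can be enumerated explicitly. The following is a minimal sketch, not the authors' stimulus-presentation code: the function name `build_trials`, the sequence indices 0 to 11, and the fixed seed are hypothetical choices made only for illustration.

```python
import itertools
import random

RELATEDNESS = ("related", "less related")
ENDING_TIME = ("early", "on time", "late")
N_BASIC_SEQUENCES = 12  # number of basic chord sequences given in the Design


def build_trials(seed=None):
    """Cross basic sequence x Relatedness x Ending time, then shuffle the order."""
    trials = [
        {"sequence": seq, "relatedness": rel, "ending_time": end}
        for seq, rel, end in itertools.product(
            range(N_BASIC_SEQUENCES), RELATEDNESS, ENDING_TIME
        )
    ]
    # Random presentation order; the seed is an assumption for reproducibility.
    random.Random(seed).shuffle(trials)
    return trials


trials = build_trials(seed=0)
print(len(trials))  # 12 sequences x 2 x 3 versions
```

Each basic sequence appears in exactly six versions, one per combination of Relatedness and Ending time.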

Results

Completion ratings (Fig. 4) averaged over the sequence set were analyzed by a 2 x 3 ANOVA with Relatedness (related, less related) and Ending time (early, on-time, late) as within-subject factors. The main effect of Relatedness was significant: sequences with related targets were judged as being more complete than sequences with less-related targets (F (1,20) = 47.96, p < 0.0001, MSE = .82). The main effect of Ending time was also significant (F (2,40) = 43.44, p < 0.0001, MSE = .82): sequences ending on-time and late were rated more complete than sequences ending early. And, most importantly, the interaction between Relatedness and Ending Time was significant, (F (2,40) = 5.27, p < 0.01, MSE = 0.27): mean rating differences between related and less-related sequences were reduced for early targets in comparison to late or on-time targets. An ANOVA with chord sequences as random variable confirmed the main effect of Relatedness, F2 (1,11) = 20.31, p < .001, the main effect of Ending time, F2 (2,22)= 160.64, p < .0001, and their interaction, F2 (2,22) = 8.10, p < .01. An additional analysis separating participants into 9 nonmusicians (without musical practice) and 12 experienced listeners (mean of 6.8 years of instrumental practice) did not show any significant influence of the between-subjects factor Musical Expertise (Fs<1).
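
The degrees of freedom reported above follow directly from the fully within-subject design. As a sketch for checking them (the helper `within_anova_df` is a hypothetical name, and the calculation assumes uncorrected degrees of freedom, i.e., no sphericity correction):

```python
def within_anova_df(n_units, *factor_levels):
    """Return (effect df, error df) for an effect in a fully within-unit design.

    n_units: number of participants (F) or of chord sequences (F2).
    factor_levels: number of levels of each factor involved in the effect.
    """
    effect_df = 1
    for levels in factor_levels:
        effect_df *= levels - 1
    # In a fully repeated-measures design, each effect is tested against
    # its own effect-by-unit error term.
    error_df = effect_df * (n_units - 1)
    return effect_df, error_df


# By-participant analysis with 21 participants
print(within_anova_df(21, 2))     # Relatedness
print(within_anova_df(21, 3))     # Ending time
print(within_anova_df(21, 2, 3))  # Relatedness x Ending time

# By-item analysis with 12 chord sequences as the random variable
print(within_anova_df(12, 2))
print(within_anova_df(12, 3))
```

With 21 participants this gives (1, 20) for Relatedness and (2, 40) for Ending time and for the interaction; with 12 chord sequences it gives (1, 11) and (2, 22), in line with the reported F and F2 values.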

Fig. 4
figure 4

Completion ratings of Experiment 3 (from 1 incomplete to 8 very complete), presented as a function of ending time (early, on time or late) and harmonic relatedness (related, less related).

Discussion

Experiment 3 replicated the influence of harmonic relatedness and ending time on completion judgments: sequences ending on related targets and sequences ending on time or late were judged as more complete than sequences ending on less-related targets and sequences ending early. The data also showed an interaction between relatedness and ending time: the harmonic relatedness effect was reduced for sequences ending earlier than expected. Comparable to data on melodies (Boltz 1989b), Experiment 3 thus showed an interaction between listeners’ expectations on both dimensions. The difference between related and less-related sequences was very similar for sequences ending on time and late, but was reduced for early-ending sequences. The violation of temporal expectation has a stronger effect on tonal expectations when the event arrives early rather than late. For early targets, this effect can be linked to musical rules stating that an anticipation generally does not complete a musical sequence. The completion data show that, for listeners, early endings sound less complete, but also that the difference between tonic and subdominant persists, even if attenuated.
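
The interaction can be written as a difference of differences. The notation below is introduced here only as a reading aid and does not appear in the original analysis: $\bar{C}_{r,t}$ stands for the mean completion rating for relatedness $r$ and ending time $t$, and $\Delta_t$ for the relatedness effect at ending time $t$.

```latex
\Delta_t = \bar{C}_{\text{related},\,t} - \bar{C}_{\text{less related},\,t},
\qquad t \in \{\text{early}, \text{on time}, \text{late}\}

% Pattern described in the text: the effect is attenuated but present for
% early targets, and similar for on-time and late targets.
0 < \Delta_{\text{early}} < \Delta_{\text{on time}} \approx \Delta_{\text{late}}
```

A purely additive (non-interactive) pattern would instead correspond to $\Delta_t$ being equal across the three ending times.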

Experiment 4

In Experiment 3, a significant interaction between pitch and time on completion judgments was observed with participants listening solely to regular sequences throughout the experimental session. Based on the hypothesis that listening only to regular sequences might reinforce listeners’ attention cycles, which would then strengthen the effect of temporal violation (and thus the possibility of observing an interaction), we decided to run the priming paradigm on regular sequences only.

Methods

Participants

Thirty volunteers participated in Experiment 4; none had participated in Experiments 1 to 3. Number of years of practice of an instrument ranged from 0 to 13 (mean = 2.6, SD = 3.76).

Material, apparatus and procedure

The regular sequences of Experiment 1 were used. The same apparatus and procedure described in Experiment 1 were used. The duration of the experiment was about 20 min.

Design

The within-subject factors were Relatedness (related, less related) and Ending time (early, on time, late). Crossing these factors produced 6 possible versions for each of the 12 basic sequences, resulting in 72 sequences, which were presented with targets played by TimbreA and by TimbreB. Each participant judged all 144 sequences presented in random order.

Results

Percentages of correct responses and correct response times were averaged over the sequences in each condition and were analyzed by two 2 x 3 ANOVAs with Relatedness (related, less related) and Ending time (early, on time, late) as within-subject factors. Performance accuracy (Table 1) was high overall. The main effects of Relatedness and of Ending Time were significant (F(1,29) = 6.67, MSE = 43.72, p < .05 and F(2,58) = 10.79, MSE = 34.76, p < .001 respectively). Correct responses were more numerous for related targets than for less-related targets, and more numerous for on-time and late targets than for early targets. The two-way interaction was not significant (F < 1). Correct response times (Fig. 5) also showed significant main effects of Relatedness and Ending Time (F (1,29) = 19.47, p < 0.01, MSE = 1304.55 and F (2,58) = 11.02, p < 0.001, MSE = 1607.79 respectively). Response times were faster for related than for less-related targets, and faster for on-time and late targets than for early targets. The interaction was not significant (F < 1). An ANOVA with chord sequences as random variable confirmed the main effects of Relatedness, F2 (1,11) = 16.25, p < .001, and Ending time, F2 (2,22) = 24.20, p < .0001. An additional analysis separating participants into 16 nonmusicians (without musical practice) and 14 experienced listeners (with a mean of 5.6 years of instrumental practice) did not reveal any significant influence of the between-subjects factor Musical Expertise.

Fig. 5
figure 5

Correct response times in Experiment 4 as a function of ending time (early, on time or late) and harmonic relatedness (related, less related).

Discussion

Experiment 4 replicated the data pattern of Experiment 1: facilitated processing for related targets over less-related targets and for late and on-time targets over early targets, but no interaction between harmonic relatedness and ending time. This outcome stands in contrast to the subjective completion judgments, for which the relatedness effect was reduced for early targets in comparison to on-time or late targets. For the priming task, the relatedness effect was not influenced by advanced timing: response time differences between related and less-related targets were the same for early and on-time targets.

General Discussion

Tonal and temporal structures are assumed to guide listeners’ formation of musical expectations, notably about the “what” and the “when” of future events. These expectations influence listeners’ perception, as has been shown for melody perception with subjective judgments and memory tasks (e.g., Bigand & Pineau, 1996; Boltz, 1989a, 1989b, 1991). Our study extended the investigation of tonal and temporal expectations to chord sequences and to the direct comparison of two experimental tasks. Subjective completion judgments and musical priming data showed an influence of both pitch and time dimensions. The comparison of the two tasks suggests that, at least for chord sequences, the combined influences of the two dimensions depended on the task and the required level of processing.

Completion judgments (Experiments 2 and 3) showed an influence of the pitch dimension and of the temporal dimensions. The outcome is obtained with chord sequences and thus extends the observations previously reported for melodies without and with harmonic accompaniment (e.g., Boltz 1989a, 1989b; Schmuckler & Boltz, 1994). Sequences ending on musically related events (i.e., tonally stable events) were judged as more complete than sequences ending on less-related events. For regular sequences, completion judgments were reduced for early occurring targets in comparison to on-time or late targets, and for these early targets, the relatedness effect was reduced. These interactive influences between pitch and time observed for chord sequences were comparable to those reported for completion judgments on melodies.

Response time data of the priming task (Experiments 1 and 4) showed an influence of tonal and temporal expectations on the speed of processing, with facilitated processing for expected events at expected time points. The harmonic relatedness manipulation replicated facilitated processing for targets related to the prime context and confirmed the cognitive component in musical priming (Bigand et al., 2003). The temporal manipulations also influenced response times: target processing was faster for regular sequences than for irregular sequences and was faster for on-time and late targets than for early targets. These temporal effects indicate global and local influences of temporal organization. On a global level, the possibility of instilling a regular, metric framework in the prime context speeds up response times in comparison to irregular sequences (i.e., without periodicity). On a more local level, target processing is influenced by the temporal distance to the penultimate chord, with increased processing times for intervals that are shorter than the majority of encountered intervals (i.e., SOAs of 640 ms). In sum, processing of target chords that fulfill tonal or temporal expectations is facilitated in the priming task. Concerning the temporal manipulations, the effect of regularity and the increased response times for “early” targets replicate, with musical material, timing effects previously observed for isotonous sequences and for temporal intervals between a warning tone and a visual target (Klemmer, 1956, 1967; Los, Knol & Boers, 2001). The increased response times for irregular chord sequences mirror the increased response times to visual targets in case of stronger variability of foreperiods inside the experimental presentation block (Klemmer, 1956; Los et al., 2001; Van der Lubbe et al., 2004).
The asymmetry in the ending time effect reflects the sequential effects observed for short foreperiods (i.e., 0.5 s in Los et al., 2001): when a short interval follows a longer interval, response times increase, but not for the reversed order (i.e., with the longer interval leaving time to rectify). The overall data pattern suggests that participants tend to expect the same interval again, which is in agreement with the attentional theory proposed by Jones and Boltz (1989).

The influence of the temporal manipulations (i.e., context regularity, ending time) and their different articulation for completion judgments and priming data might be integrated in Jones’ (1992) attentional framework, which distinguishes several levels of attending to provide different perspectives on the same object. Attending to local details involves attending over relatively small time periods (note-to-note changes) and is called analytic attending. Attending to larger, more global structural relations involves attending over relatively large time periods and is called future-oriented attending, which allows listeners to anticipate phrase endings. Listeners “should have little difficulty shifting attending between levels to acquire different temporal perspectives. One may effortlessly shift attending from relatively small time levels to phrase levels (from analytic attending to “future-oriented attending”)” (Jones, 1992). In the following, we propose to integrate the effects of the temporal manipulations in this attentional framework, notably by linking global attending to the regularity effect and local attending to the ending time effect. To engage in global attending, a coherent event structure is necessary. Listeners are able to anticipate the event’s future time course with the help of higher-order time structure (e.g., Jones, Boltz & Klein, 1993, for event duration estimation). Event sequences that lack temporal coherence do not support future-oriented attending. In the regular sequences, the periodicity of the temporal structure and the underlying referent beat allow listeners to anticipate the sequence’s ending. In the irregular sequences, this feature is broken as no coherent, higher level reference beat can be easily abstracted. This incoherence might render the engagement in future-oriented attending more difficult, leading to increased response times.
In parallel to this global attending, listeners might also switch attention level and zoom their attention to a local level. This switch can be particularly useful in the priming task as it requires local feature processing. Due to the experimental context, listeners might expect a specific time interval between two events (i.e., the most frequently occurring one). This kind of temporal window expectation might then lead to the main effect of ending time in the priming task. This local focus was not necessary to perform the completion judgments. In this task, the two temporal dimensions interacted: the ending time effect was observed only with regular sequences. For the regular sequences, listeners might create a future-oriented attending mode and integrate the ending time in this overall structure, which is then reflected in the completion judgments. However, irregular sequences do not favor engagement in global attending and thus lead to a weaker influence of ending time on the global completion judgments. The preceding discussion of the data in Jones’ attentional framework focuses on temporal expectations. Our data further suggest that, independently of task demands and of the induced attention level (future-oriented versus analytic), listeners extract the tonality of the prime context. The difference between related and less-related targets was observed independently of temporal occurrence, context regularity and task (only for completion judgments was the difference attenuated for early targets in regular sequences). The persistence of the global effect of tonality even for local judgments is in agreement with other data sets showing that tonal context effects persist over time, even after silent periods (e.g., 2500 ms in Tekman & Bharucha, 1992) or constant local contexts of up to 7 chords (Bigand et al., 1999).

One further goal of our study was to investigate whether the processing of pitch and time dimensions was independent or interacted in chord sequences. Previous research focusing on melodic structures provided evidence for perceptual influences of both dimensions either in interaction (e.g., Boltz, 1989; Bigand, 1997) or independence (e.g., Palmer & Krumhansl, 1987a, 1987b). Task influences had been discussed as one candidate to reconcile these differences in experimental data on music perception (e.g., Boltz 1999; Peretz & Morais, 1989). The influence of task demands on the interpretation of underlying processes is not specific to music cognition. For example, task-specific encoding can influence memory performance (Craik & Tulving, 1975), changes in experimental tasks influence the estimated amount of explicit versus implicit learning (Reber, 1992), and the type of discrimination task led to data either supporting or contradicting the hypothesis of categorical perception of phonemes (Gerrits & Schouten, 2004). In the following, our data pattern will be discussed in the light of task-specific influences on interaction or independence, and it will be integrated in a multiple-stage processing framework as initially proposed by Peretz and Morais (1989).

For the combined contribution of tonal and temporal regularities, the obtained data patterns differ between priming data and completion judgments. The completion judgments showed an interaction between pitch and time dimensions: notably for early targets the difference between related and less-related targets was decreased in comparison to on-time or late targets. The priming paradigm showed that the impact of expectations on one dimension did not modify the expectations on the other dimension, suggesting rather independent influences of the two dimensions.

For the completion judgments, our outcome on chord sequences extends previous data on melodies, with the exception that for chord sequences the relatedness effect was reduced for early targets only, in comparison to on-time targets, but not for late targets. For melodies, Boltz (1989b) observed the modification of pitch influence both for early and late targets, even if it was stronger for early targets. It is worth noting that the tonal stability of the events might play an important role. In the interaction reported by Boltz (1989b), the tonic (i.e., the most stable tone) is less strongly influenced by the temporal manipulations than are less stable or unstable tones (i.e., mediant or leading tones). In the chord sequences, related and less-related targets were both relatively important and stable in the harmonic hierarchy. A strong tonal stability might allow events to “resist” temporal interactions. This interpretation, which suggests a difference between single-note lines and chord sequences, is reinforced by data on melodies with harmonic accompaniment: the harmonic relatedness effect in subjective judgments decreased only for early occurring chords in the accompaniment, but not for late chords (Schmuckler & Boltz, 1994).

For the priming task, no significant interaction between pitch and time was observed: the harmonic relatedness effect was not reduced by the temporal violation. The aim of the priming task was to measure the influence of expectations on the speed of target processing. It thus required a very local judgment. Our time manipulation (i.e., ending time) was also situated on a local time level, notably a deviation (shortening or lengthening) from the most frequently occurring SOA. This local feature might explain in part the difference from Bigand et al. (1999), who reported a marginally significant interaction between harmonic relatedness and a temporal manipulation that changes the overall structure of the sequence (i.e., a fermata punctuating the global attention cycle differently to create symmetric and asymmetric structures). However, it is important to note that this interaction was mainly observed for musician listeners attending the Music Conservatory and not for nonmusician listeners. Participants in our study had no or only moderate musical practice, and this weak difference did not influence the data pattern involving pitch and time. In numerous harmonic priming studies, listeners’ sensitivity to subtle manipulations of tonal structures was not dependent on the extent of musical expertise (e.g., Bharucha & Stoeckig, 1986, 1987; Bigand & Pineau, 1997; Bigand et al., 1999, 2003; Tillmann & Bigand, 2001). Musical expertise might become more influential when temporal organizations (i.e., local or global) are modified in combination with harmonic structures. Future research needs to systematically address the role of musical expertise in pitch and time perception (for melodies and chord sequences) by including high-level experts.

In sum, priming data and completion judgments were influenced by manipulations on harmonic relations and temporal regularities, but their combined influence depended on the task. Footnote 6 Based on research mainly focusing on melodies, it has been proposed that interaction and independence of pitch and time dimensions might depend on the stage of processing, with independent processing at early processing stages and interaction at later, more integrative processing stages (Peretz & Morais, 1989; Peretz & Coltheart, 2003; Thompson et al., 2001; Pfordresher, 2003). Our study used two tasks focusing on different processing levels in chord sequence perception. The priming task focuses on the processing speed of a local feature (i.e., timbre discrimination), which might tap into first processing stages revealing independent influences of the two dimensions. The completion judgments require a global evaluation of the sequence, and in this more global evaluation process, the influences of pitch and time dimensions interacted. Our data on chord sequences as well as previous work on melodies showed interactive influences between pitch and time on completion judgments, notably a reduced relatedness effect for early targets. The absence of this interaction in our priming task raises the question of whether the required local judgment might reveal independent influences of the two dimensions also for melodies. This question needs to be addressed in future research because, in contrast to harmonic structures (Bharucha & Stoeckig, 1987; Bigand & Pineau, 1996; Bigand et al., 1999, 2003; Tillmann et al., 2003), priming effects due to the pitch dimension have only just begun to be investigated for melodies.