Introduction

Our visual landscape is rarely static: it typically includes several different kinds of moving objects. Changes of visual stimuli over time are salient events and may index the passage of time (see Mauk and Buonomano 2004; Eagleman et al. 2005; Zago et al. 2011a). According to the internal clock model, in which oscillators generate pulses that are detected by a counter (Creelman 1962; Treisman 1963), the brain may estimate how much time has passed by counting the temporal indices associated with discrete visual events.

Distributed, specialized mechanisms exist in the brain to monitor the temporal behavior of moving objects. One such specialization of particular importance is related to the animacy or biological nature of the objects. The animate–inanimate distinction hinges on the living/sentient versus non-living/non-sentient nature of the object. An emerging notion is that, to deal with the motion of animate objects, the brain uses mechanisms partially different from those used to deal with the motion of inanimate objects. Indeed, there exist several differences in the kinematic and kinetic properties of these two different categories of targets (Zago et al. 2011a), and the neural networks processing biological and non-biological motion are partially segregated in the brain (Rizzolatti and Craighero 2004; Blake and Shiffrar 2007).

According to one hypothesis, the time base used by the brain to process visual motion is calibrated against the predictions regarding the motion of biological characters when animacy is detected by the observer, while the time base is calibrated against the predictions of motion of passive objects when no animacy is detected (Chang and Troje 2008; Sebanz and Knoblich 2009; Carrozzo et al. 2010; Orgs and Haggard 2011; Zago et al. 2011a, b; Wang and Jiang 2012).

Consistent with this hypothesis, there is recent evidence that time perception and motor timing are influenced by animacy: The observation of a biological movement performed by other people may bias the timing of a motor act or the judgement of perceived duration of an event (Watanabe 2008; Bove et al. 2009; Carrozzo et al. 2010; Orgs and Haggard 2011; Zago et al. 2011b; Mouta et al. 2012; Wang and Jiang 2012). In particular, Carrozzo et al. (2010) investigated two different timing tasks: (1) button-press responses aimed at intercepting a falling ball and (2) discrimination of the duration of a stationary flash. They used interference paradigms in which the timing task was run concurrently with the presentation of different computer-graphics characters in the background of the scene. The timing task served as a probe to reveal potential biases or distortions of time induced by the background. In both tasks, the observers were presented with different background scenes before and during the execution of the task. The scene displayed characters which could differ in terms of biological (human) or non-biological appearance and kinematics. In all cases, the background characters and their movements were totally unrelated to the foreground target and to the viewer’s action. Carrozzo et al. (2010) found that, for both the motor interception and the time discrimination task, the time estimates were systematically shorter in the sessions involving human characters moving in the background than in those involving inanimate moving characters. The presence of a systematic offset between the time estimates associated with biological movements and the time estimates associated with non-biological movements is compatible with the hypothesis that there exist timing mechanisms differentially tuned to biological movements and to non-biological movements.

In addition to the animacy of the stimuli, their speed may also affect the time estimates. It has long been known that there is a strong link between speed and time perception: The perceived duration of a rapidly moving stimulus is generally longer than that of a slower or stationary stimulus having the same physical duration, a phenomenon known as time dilation (Lhamon and Goldstone 1974; Brown 1995; Kanai et al. 2006; Kaneko and Murakami 2009). Time dilation has been interpreted in the context of the idea that the brain estimates time based on the number of events that occur (Fraisse 1963; Brown 1995). In other words, the occurrence of a greater number of events would be taken by the brain as evidence for a longer duration. There is some controversy, however, as to whether stimulus speed or its temporal frequency or spatial frequency represents the critical element in the time dilation phenomenon. Thus, Kanai et al. (2006) showed that the time distortion could be determined simply by a flickering stimulus, consistent with the idea that the temporal frequency is the key factor. However, Kaneko and Murakami (2009) found that the speed of the stimulus, rather than temporal frequency or spatial frequency per se, best described the perceived duration of a moving stimulus, with the apparent duration increasing in proportion to the logarithm of speed.

A moving stimulus may also induce visual after-effects. Thus, it is known that the prolonged exposure to a pattern moving at constant speed may affect the perceived speed of subsequent moving patterns (Thompson 1981). In particular, adaptation to a moving stimulus reduces the perceived speed of that stimulus and all slower speeds in the same direction, while it may increase the perceived speed of faster stimuli (Smith and Edgar 1994; Hammett et al. 2005; Hietanen et al. 2008). These effects have been explained by a simple model in which speed is encoded as the ratio of two temporal filters whose sensitivities decay exponentially over time (time constant around 10 s). The model assumes that perceived speed is based upon the ratio of the outputs of a low-pass and a band-pass temporal filter, corresponding to a low- and a high-speed channel (Smith and Edgar 1994; Hammett et al. 2005). According to the model, adaptation to a fast speed produces a change in filter sensitivities that results in a drop of the ratio, and perceived speed is reduced. In contrast, following adaptation to a slow speed, the change in filter sensitivities causes an increase in the ratio, and perceived speed is higher.

The effects of speed and animacy may interact with each other and influence time estimates jointly. For instance, the tempo of self-paced, repetitive finger movements is biased by the observation of a rhythmical action, the frequency of self-paced movements becoming entrained to that of the observed movement (Bove et al. 2009). Moreover, these effects last well beyond the observation epoch. Also, pointing movements are contaminated automatically (and without the subject’s awareness) by the speed of a previously seen stimulus when the latter moved according to biological laws (Bisio et al. 2010). Artificially speeding up (slowing down) point-light animations of human actions results in faster (slower) reaction time responses, while no such trend is observed with scrambled or solid object motion, a phenomenon denoted as “behavioral speed contagion” (Watanabe 2008). Apparent biological motion leads observers to underestimate the duration of a square surrounding the picture sequence compared to trials displaying degraded body pictures (Orgs and Haggard 2011), and higher speeds of apparent motion produce shorter perceived durations (Orgs et al. 2011).

In sum, changes in speed of an object may lead to two opposite effects: Speed increments may induce an increase in perceived duration (and decrease in perceived speed), or speed increments may induce a decrease in perceived duration (and increase in perceived speed). Could it be that these opposite effects are related to the biological or non-biological nature of the stimuli?

Here, we investigated the interactions among timing, speed and animacy by means of two separate experiments. In Experiment 1, we used the same interception task and two types of background characters included in Carrozzo et al. (2010). The animate character was a dancer, while the inanimate character was a whirligig consisting of 14 disconnected rods whose motion matched that of the corresponding body segments of the dancer. We varied the speed of the movements of these two characters across different sessions. This manipulation scaled up or down the motion speed of all segments of the character by the same amount and to the same extent for both the dancer and the whirligig. Crucially, default trials always depicting the same static character of a standing person were randomly intermingled with the dynamic trials in both biological (dancer) and non-biological (whirligig) sessions, so as to assess persistent influences of the animate and inanimate contexts. Therefore, our study allowed us to address the issue of whether speeding up or slowing down an action generates a persistent contextual effect that carries over to later trials in the absence of the action. In Experiment 2, instead, we assessed the effect of speeding up or slowing down the dancer or the whirligig on the perceived animacy as rated on the basis of a questionnaire.

Experiment 1: timing

Methods

Participants

Twelve subjects (5 females and 7 males, 31 ± 7 years old, mean ± SD) participated in this experiment, receiving modest monetary compensation. They were right-handed (as assessed by a short questionnaire based on the Edinburgh scale), had normal or corrected-to-normal vision and were naïve to the purpose of the experiments. The participants gave written informed consent to procedures approved by the Institutional Review Board of Fondazione Santa Lucia, in conformity with the Declaration of Helsinki on the use of human subjects in research.

Setup and visual stimuli

General methods were similar to some of those used in our previous study (Carrozzo et al. 2010) with specific modifications needed to address the current experimental questions. Participants sat in front of a 22-inch LCD monitor (ViewSonic, model VG2230wm, 1,680 × 1,050 pixels, 60 Hz refresh rate) in a dimly illuminated room with the head restrained by a chin rest. The distance between the subject’s forehead and the monitor was 0.6 m. Button-press responses were recorded by means of a National Instruments PCI 6601 timer/counter at 10 μs resolution.

Visual stimuli were programmed in C++ using custom software and rendered using OpenGL on an nVidia GeForce 8800 GTX graphics card. The display surface was 470 × 295 mm. Visual stimuli were defined in a right-handed reference frame with rightward x-axis and upward y-axis in the frontal plane, plus in-depth z-axis. Scene projection was computed using on-axis linear perspective, assuming a viewpoint at [0, 1.2 m, D] and looking at point [0, 1.2 m, 0]. The fixation point was located at the origin [0, 0, 0] of this frame. D (horizontal distance between the origin and the viewpoint) could take one of two different values (17 or 28.7 m). The position of the observer relative to the screen was adjusted to keep the viewing angle congruent with the above parameters. Timing of the visual stimuli and motor responses was strictly controlled by linking the duration of stimulus presentation to a counter of screen refreshes. To guarantee precise control of timing, all moving stimuli were generated based on look-up tables.

The horizontal and vertical visual angles of the overall scene were 43° by 28°, respectively. In the following, we report the visual angles of specific scene items only for the apparent viewing distance of 17 m. The visual angles corresponding to the other viewing distance (28.7 m) used in the experiments can be easily derived. The scene included a red cross (0.9° by 0.4°) centered at the origin and drawn on the ground (in perspective, as the other scene items), a building, a few trees, a human figure in the distant background and another human figure in the proximal background (see Fig. 1a). The distant figure (0.7° by 2.5°, placed at 14-m distance in-depth from the origin) was always still. This figure (depicting a man of average height) helped to provide a metric reference for the visual scene. The proximal figure (at 8-m distance from the origin) remained static or moved depending on the specific trial (denoted as static and dynamic trials, respectively). The static figure was a standing human (1.2° by 4.3°) depicting a man of average height, while the dynamic one was a male dancer in the biological sessions or a whirligig in the non-biological sessions. Dynamic figures moved smoothly for the whole duration of a dynamic trial. Displacement was mostly (but not exclusively) confined to the frontal plane and was centered on the position which was occupied by the static figure in the static trials (see Fig. 1b).

Fig. 1

Schematics of the experiments. a Scene (at the near viewing distance) displayed during the static trials. The ball was thrown from the building and hit the ground at the center of the red cross. Different positions of the ball during its motion are shown for illustrative purposes only. Single frames from dynamic trials of the animate session with a dancer (b) or from the inanimate session with a whirligig (c). d Time sequence of events during each trial

In the animate sessions, the sequence of movements performed by the dancer included several steps from classical ballet, such as a pirouette en dehors à la seconde and arabesque. 3D kinematics of the dancer was recorded by means of a Vicon 612 motion capture system at 60 Hz, then resampled at 120 Hz using linear interpolation in order to vary the speed of the displayed motion (see below). The dancer wore 62 markers placed on external body references to allow off-line reconstruction of the movements of his body and limbs. The reconstruction was based on a model of the dancer’s skeleton composed of 18 segments with a total of 57 degrees of freedom (dof, 3 translation dof and 3 orientation dof for the pelvis root segment, plus 3 orientation dof for 17 additional segments hierarchically connected to the root). The range of fundamental frequencies of motion was 0.1–0.2 Hz for translation and 0.1–0.8 Hz for rotation (depending on the dof). The first 3 harmonics accounted together for >85 % of the variance in the original data at each dof. As a next step for the graphic presentation, 120-Hz motion was imported in Poser-6 (Smith Micro Software) to generate the sequence of 3D polygonal mesh-frames of the moving character and exported in the experimental control program to be displayed at normal, slow or fast speed. The slow speed condition was achieved by displaying consecutive 120-Hz mesh-frames at 60-Hz refresh rate. The normal and fast speed conditions were obtained by skipping one or two mesh-frames, respectively, in the 120-Hz sequence. In this way, the slow and fast speeds corresponded to 0.5 and 1.5 times the normal one. Full-length movies lasted 25.2, 12.6 and 8.4 s corresponding to slow, normal and fast speeds. In each dynamic trial, a 4.5-s continuous sequence was displayed. This sequence was extracted from the full-length movie at the corresponding speed by randomly selecting a starting frame within the first 10-, 5- or 3.3-s segment (for slow, normal or fast speeds, respectively).
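The frame-skipping scheme can be checked with a little arithmetic. The following Python sketch is illustrative only and is not the C++ control program used in the experiment: the total number of mesh-frames (1,512) is an assumption inferred from the 25.2-s slow movie shown at 60 frames per second, and the names `STEP` and `playback` are hypothetical.

```python
REFRESH_HZ = 60          # monitor refresh rate (from the text)
N_MESH_FRAMES = 1512     # assumed: 25.2-s slow movie x 60 frames/s

# Step through the 120-Hz mesh-frame sequence: 1 = every frame (slow),
# 2 = skip one frame (normal), 3 = skip two frames (fast).
STEP = {"slow": 1, "normal": 2, "fast": 3}


def playback(speed):
    """Return (speed relative to normal, full movie duration in s)."""
    step = STEP[speed]
    shown = range(0, N_MESH_FRAMES, step)   # indices of displayed mesh-frames
    duration_s = len(shown) / REFRESH_HZ    # one mesh-frame per screen refresh
    return step / STEP["normal"], duration_s


for name in STEP:
    rel, dur = playback(name)
    print(f"{name}: {rel:.1f}x normal, {dur:.1f} s")
```

Under these assumptions, the three step sizes reproduce the relative speeds (0.5, 1 and 1.5) and the full-length movie durations (25.2, 12.6 and 8.4 s) reported above.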

In the inanimate sessions, the dynamic figure, the whirligig (see Fig. 1c), consisted of 14 close but disjointed rods whose individual length matched that of the corresponding head, trunk and limb segments of the human figure in the animate condition (there was no neck, right or left collar, pelvis in the whirligig). Each rod rotated around its center of mass according to the sum of sinusoids whose amplitude and frequency matched those of the first 3 harmonics (zero-phased) of the angular motion of the corresponding body segment of the human dancer, while all rods translated in 3D according to the sum of the first 3 harmonics of the translational motion of the dancer’s pelvis. Changes in speed of the whirligig were achieved in the same manner as for the dancer character. Moreover, the kinematics of both the dancer and the whirligig complied with the 2/3 power law that relates the instantaneous velocity of a limb end point to the curvature of the geometrical path (Lacquaniti et al. 1983). This law is typically obeyed by biological motion, as well as by non-biological harmonic motions.
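To illustrate why harmonic motion complies with the 2/3 power law, consider the simplest case of an elliptical path traced by two sinusoids of equal frequency. In its tangential-velocity form, the law states that speed v and curvature κ satisfy v = K κ^(−1/3), so that the product v κ^(1/3) stays constant along the path. The Python sketch below checks this numerically; the ellipse semi-axes and the frequency are arbitrary illustrative values, not parameters of the whirligig, and `power_law_gain` is a hypothetical helper name.

```python
import math


def power_law_gain(a, b, omega, t):
    """v * kappa**(1/3) for the harmonic ellipse x = a cos(wt), y = b sin(wt)."""
    # First and second time derivatives of the trajectory
    dx = -a * omega * math.sin(omega * t)
    dy = b * omega * math.cos(omega * t)
    ddx = -a * omega**2 * math.cos(omega * t)
    ddy = -b * omega**2 * math.sin(omega * t)
    v = math.hypot(dx, dy)                     # tangential speed
    kappa = abs(dx * ddy - dy * ddx) / v**3    # curvature of the path
    return v * kappa ** (1.0 / 3.0)


# Arbitrary example: 0.3 m x 0.1 m ellipse traversed at 0.5 Hz
a, b, omega = 0.3, 0.1, 2 * math.pi * 0.5
gains = [power_law_gain(a, b, omega, t) for t in (0.0, 0.3, 0.7, 1.1, 1.9)]
expected = omega * (a * b) ** (1.0 / 3.0)      # analytical constant K
print(gains, expected)
```

For this trajectory the gain is the same at every sampled instant and equals ω(ab)^(1/3), which is the sense in which a purely harmonic, non-biological motion obeys the law.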

On average, the envelope of the displacement of the dynamic figure (dancer or whirligig) extended over an area 4.8° × 4.5°. The average speed at the character’s point closest to the interception point was 0.25, 0.5 or 0.75 m s−1 (0.85, 1.7 or 2.6° s−1) for slow, normal and fast speeds, respectively, in the animate condition and 0.3, 0.6 or 0.9 m s−1 (1, 2 or 3° s−1) in the inanimate condition.

A new scene was shown every 4.5 s. The size of the elements in the scene was consistent with an apparent viewing distance D of 17 m (near) or 28.7 m (far). Participants were free to visually explore the new scene for 2.5 s, then the red cross flickered for 0.5 s indicating that they should fixate at the cross center for the remaining 1.5 s of the trial (see Fig. 1d). After the flicker period, a textured soccer ball (0.22 m diameter, 0.8°) was thrown downward from an open window of the building and bounced away after hitting the ground at the fixation point. The task for the participants was to press the button with the right index finger when the ball first hit the ground. No performance feedback was provided (that is, participants were not told whether or not the timing of the motor response was correct). This was done in order to investigate the contribution of internal timing mechanisms in the absence of sensory error signals which may correct the performance with practice. Ball trajectory was confined to the frontal plane (Z = 0). The ball fell under gravity (vertical acceleration = −9.81 m s−2), neglecting air drag. Horizontal velocity was kept constant (−5 m s−1), whereas initial vertical velocity could take one of 3 different values (−5.39, −2.33, −0.09 m s−1) resulting in 3 different fall durations (FD = 0.6, 0.8, 1.0 s). The ball was thrown from a constant height above the ground (5 m), but different horizontal positions (X0 = 3, 4, 5 m) to achieve a constant contact point with the ground. The restitution coefficient at the ground was 0.7 (consistent with our measurements performed on a real soccer ball). Ball speed at the interception point was 11–12 m s−1 (34.6–38° s−1), depending on the initial conditions.
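The launch parameters listed above follow from drag-free projectile kinematics. As a consistency check, the Python sketch below recovers the initial vertical speed, the horizontal starting position and the speed at ground contact from the fall duration alone. It is a worked example under the stated values (5-m height, 5 m s−1 horizontal speed, 9.81 m s−2 gravity) and not the stimulus-generation code; `launch_parameters` is a hypothetical name, and speeds are returned as magnitudes.

```python
import math

G = 9.81        # gravitational acceleration, m/s^2 (magnitude)
HEIGHT = 5.0    # launch height above the ground, m
VX = 5.0        # horizontal speed, m/s (magnitude)


def launch_parameters(fall_duration):
    """Return (initial downward speed, horizontal start offset, impact speed)."""
    # Solve HEIGHT = v0*T + 0.5*G*T^2 for the initial downward speed v0
    v0 = (HEIGHT - 0.5 * G * fall_duration**2) / fall_duration
    x0 = VX * fall_duration               # horizontal distance covered in flight
    vy_impact = v0 + G * fall_duration    # downward speed at ground contact
    return v0, x0, math.hypot(VX, vy_impact)


for fd in (0.6, 0.8, 1.0):
    v0, x0, speed = launch_parameters(fd)
    print(f"FD = {fd:.1f} s: v0 = {v0:.2f} m/s, X0 = {x0:.1f} m, "
          f"impact speed = {speed:.1f} m/s")
```

The computed initial speeds and starting positions match the values given in the text, and the impact speeds fall at roughly 11 to 12 m s−1.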

Protocol

Each participant carried out two sets of experiments at about 1-month distance (with a randomized presentation order across participants): animate and inanimate motion experiments. Each set of experiments included 3 sessions at about 1-week distance between each other, each session involving a different speed (slow, normal or fast) of the character motion (order randomized across participants). In other words, each participant experienced separate sessions involving dancer slow, dancer normal, dancer fast, whirligig slow, whirligig normal and whirligig fast. In each session, the participant first carried out a block with only static trials (no motion of the character), in which the apparent viewing distance of the scene (2 values of D) and ball descent duration (3 values of FD) were randomized trial by trial. There were 20 repetitions of each combination of D and FD (yielding a total of 120 trials). The purpose of this first block was to acquire the baseline data to be subtracted from the experimental data, in order to correct for average differences in performance across days which are unrelated to the specific experimental manipulation. Immediately after the completion of the baseline block, the participant carried out a block of static/dynamic trials in which the presence of character motion (present in dynamic trials and absent in static trials) was randomized across trials, along with the values of apparent viewing distance (2 values of D) and ball descent duration (3 values of FD). There were 50 repetitions of each combination of character motion, D and FD (yielding a total of 600 trials).

The randomization procedure avoided the consecutive repetition of trials with all identical conditions, and the sequence of trials was identical in each session of all participants. Participants were allowed to pause during the experiment whenever they wanted. Trials with invalid responses (button-press earlier or later than 0.5 s relative to the arrival time of the ball on the ground, or no response at all) were rejected and repeated at the end of the experiment (there were <1 % of such trials per experiment).
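A constrained randomization of this kind can be sketched as follows. This Python example is one possible approach and not the procedure actually used in the experiment: it draws trials one at a time, excludes the condition shown on the previous trial, and restarts in the rare case that only repeats remain. The function name `constrained_sequence` and the fixed seed (chosen here so that every session would receive the same sequence) are assumptions.

```python
import itertools
import random


def constrained_sequence(conditions, reps, rng):
    """Random trial order with no two identical conditions in a row."""
    while True:                                   # retry after a dead end
        remaining = {c: reps for c in conditions}
        order = []
        while remaining:
            # Candidates: conditions with trials left, except the previous one
            pool = [c for c in remaining if not order or c != order[-1]]
            if not pool:
                break                             # only repeats left: start over
            weights = [remaining[c] for c in pool]
            pick = rng.choices(pool, weights=weights)[0]
            order.append(pick)
            remaining[pick] -= 1
            if remaining[pick] == 0:
                del remaining[pick]
        if not remaining:
            return order


# 2 motion states x 2 viewing distances x 3 fall durations, 50 repetitions each
conditions = list(itertools.product(("static", "dynamic"),
                                    (17.0, 28.7),
                                    (0.6, 0.8, 1.0)))
trials = constrained_sequence(conditions, reps=50, rng=random.Random(0))
print(len(trials))
```

Weighting each draw by the number of trials still owed to a condition keeps the sequence balanced and makes dead ends unlikely, so the retry loop rarely needs more than a few passes.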

In sum, animacy (present or absent) and speed (slow, normal or fast) of the moving stimuli were blocked to investigate immersive contextual effects on timing, and their persistent influence on the default static trials randomly intermingled with the dynamic trials.

Data analysis and statistics

For each session, the time of the button-press responses was averaged across all 120 trials of the static baseline block. This average value was then subtracted from the time of response of each trial of the static/dynamic block. The response times of the static/dynamic block were then averaged across all repetitions of each condition (except for the analysis of trends in sequences of consecutive static trials, see “Results”).
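The baseline correction described above amounts to two averaging steps, sketched here in Python. The data layout (a list of baseline response times and a list of condition/response-time pairs) and the function name `baseline_corrected_means` are assumptions made for illustration, and the numbers in the usage lines are invented placeholders rather than measured data.

```python
from collections import defaultdict
from statistics import mean


def baseline_corrected_means(baseline_times, block_trials):
    """Subtract the session baseline, then average per condition.

    baseline_times: response times (s) from the static baseline block.
    block_trials: iterable of (condition, response_time) pairs from the
        static/dynamic block of the same session.
    """
    baseline = mean(baseline_times)           # one value per session
    by_condition = defaultdict(list)
    for condition, rt in block_trials:
        by_condition[condition].append(rt - baseline)
    return {cond: mean(values) for cond, values in by_condition.items()}


# Invented placeholder values (s, relative to ball arrival), for illustration
baseline_block = [0.00, 0.02]
main_block = [("dynamic", 0.00), ("dynamic", -0.02), ("static", 0.03)]
print(baseline_corrected_means(baseline_block, main_block))
```

Because the baseline is computed separately for each session, day-to-day shifts in overall performance are removed before sessions are compared.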

General linear model repeated measures within-subject ANOVA was carried out on the average response times pooled across all 12 subjects, with animacy (2 levels, dancer or whirligig) and speed (3 levels, slow, normal or fast) as within factors, the other factors being character motion (2 levels, static or dynamic), viewing distance (2 levels, near or far) and ball descent duration (3 levels, FD = 0.6, 0.8, 1.0 s). Sphericity was tested using Mauchly’s test. Post hoc pairwise t tests were carried out with the Bonferroni correction for multiple comparisons. An alpha level of 0.05 was used for all statistical tests.

Results

In agreement with the previous report (Carrozzo et al. 2010), the average response times were significantly earlier in the sessions involving the dancer in the background (animate sessions) than in those involving the whirligig (inanimate sessions). Thus, the main effect of the factor animacy in ANOVA was highly significant (F 1,79 = 1,374.1, P < 10−4). On average, the difference in timing was 16.5 ± 0.9 ms (mean ± SD, n = 432, over all conditions and subjects).

The new findings concern the effects of the speed of character motion (Fig. 2). First, it is important to notice that the average response times remained significantly earlier in the animate sessions than in the inanimate sessions at all character speeds. Indeed, the 95 % confidence limits of Fig. 2 do not overlap between dancers and whirligigs at any of the slow, normal or fast speeds. Second, the relative effects of speed were completely different in the animate sessions as compared with the inanimate sessions (animacy–speed interaction in ANOVA: F 2,158 = 424.9, P < 10−4). On average, participants delayed their responses appreciably with increasing speed in the inanimate sessions: Responses were 22.9 ± 0.8 ms later with fast speed than with slow speed. By contrast, participants anticipated the responses slightly but significantly with increasing speed in the animate sessions: Responses were 8.3 ± 0.9 ms earlier with fast speed than with slow speed. Post hoc tests revealed that all pair-wise comparisons of Fig. 2 were statistically significant (all P < 0.002). These trends were generally observed also at the level of single participants. Thus, in the inanimate sessions the mean response with fast speed was later than that with slow speed in all subjects, while in the animate sessions the mean response with fast speed was earlier than that with slow speed in 7/12 participants.

Fig. 2

Ensemble averages (±95 % confidence intervals, n = 144 for each data point) of the response times for the slow, normal and fast character speeds across all participants. Averages were computed (after subtracting the baseline values of the corresponding session, see “Methods”) over all static and dynamic trials of all sessions involving the dancer and the whirligig. More negative values correspond to earlier responses

A crucial feature of our experimental design was that default static trials always depicting the same static character of a standing person were randomly intermingled with the dynamic trials in all sessions, so as to assess persistent influences of the overall context (Fig. 3). There was a significant main effect of character motion (ANOVA: F 1,79 = 288.9, P < 10−4), the responses in dynamic trials being earlier than those in static trials. Moreover, the effect of motion interacted significantly with that of animacy (F 1,79 = 433.4, P < 10−4), because motion had a much greater effect in the animate sessions than in the inanimate ones.

Fig. 3

Ensemble averages (±95 % confidence intervals, n = 72 for each data point) of the response times for static (Sta) and dynamic (Dyn) trials at the 3 character speeds across all participants

Critically, however, speed effects did not interact significantly with the presence or absence of character motion (F 2,158 = 0.26, P = 0.77), or with motion and animacy (F 2,158 = 1.34, P = 0.27). The lack of interaction effects shows that the speed context (slow, normal or fast) biased the response timing in the same direction in the dynamic and static trials, although in the static trials the visual scenes corresponding to the two viewing distances were identical in all sessions, and there were no dynamic signals other than those due to ball motion. Notice that the effect of speed on static trials does not imply any change in the static character (which was identical across all speed and character conditions), but it implies a contextual effect of the speed of motion of the dynamic character (dancer in animate sessions and whirligig in the inanimate sessions) onto the static character during the static trials randomly intermingled with the dynamic trials in each session.

We also verified that the results in static trials remained stationary across the first repetitions after a dynamic trial. To this end, instead of averaging across all repetitions as we did in the previous analyses, we considered the responses in individual trials belonging to a sequence of 4 consecutive static trials following a dynamic trial (Fig. 4). Because each trial lasted 4.5 s (see “Methods”), a sequence of 4 trials corresponded to 18 s of continuous presentation of the still character. In neither animate nor inanimate sessions was there a significant change of response timing in these consecutive static presentations, indicating that the responses were affected by the context steadily. Thus, the interactions between animacy and serial position of the trial in the sequence, between speed and serial position, and among animacy, speed and serial position were not significant (general linear model on all trials of all subjects, F 3,2280 = 1.0064, P = 0.39, F 6,4560 = 0.47, P = 0.83, and F 6,4560 = 0.97, P = 0.45, respectively).

Fig. 4

Response timing in consecutive static trials. Sequences of 2 or more consecutive static trials were extracted from all experiments and participants. The response timing in each consecutive trial of the sequence is graphed as a function of the serial position i of the corresponding trial (n = 852 trials for i = 1–2, n = 420 for i = 3, n = 204 for i = 4). The first trial of the sequence was preceded by one or more dynamic trials. Notice that the consecutive static trials were not identical, because either ball descent duration or apparent viewing distance varied between any two consecutive trials due to the randomization procedure

Experiment 2: animacy rating

Methods

A total of twelve subjects (7 females and 5 males, 36.3 ± 7.3 years old, mean ± SD) were recruited according to the same modalities as for Experiment 1. Seven of these subjects had previously (about 6 months earlier) performed Experiment 1. We presented the same 6 moving characters (dancer slow, dancer normal, dancer fast, whirligig slow, whirligig normal, and whirligig fast) and the same background scene as in Experiment 1 at the fixed 17-m viewing distance. The computer setup was also similar to that of Experiment 1. Here, the characters were randomly intermingled across trials, no static trials were included, and there was no falling ball in the scene. Each trial started with a 3.5-s continuous sequence extracted from the full movie as in Experiment 1. Then the character disappeared and a pair of words appeared at the bottom of the scene. The pair was drawn randomly from a questionnaire based on semantic bi-polar items (Carrozzo et al. 2010). The questionnaire included 9 pairs of Italian words whose English equivalents are Thing/Person, Artificial/Natural, Unaware/Aware, Apathetic/Sensitive, Passive/Active, Automatic/Voluntary, Mechanical/Alive, Inanimate/Animate, Dull/Lively. Participants were asked to press a numerical key between 1 and 7 to rate the character according to the semantic bi-polar pair (question) currently displayed, higher ratings denoting greater animacy. There was no time limit to deliver the response. After the keypress, a new trial started. Subjects were given 9 practice trials including all questions. During each experiment, each question was randomly presented 3 times for each character, yielding a total of 162 trials (6 characters × 9 questions × 3 repetitions).

The rating responses were pooled over all subjects, after averaging across all repetitions of each condition, and subjected to repeated measures within-subject ANOVA. Post hoc pairwise t tests were carried out with the Bonferroni correction for multiple comparisons.

Results

Figure 5 plots the mean ratings as a function of the specific character and semantic item, higher ratings corresponding to greater perceived animacy. In general, the dancers were judged much more animate than the whirligigs, irrespective of the character speed and of the specific semantic item used to interrogate animacy. Statistical analysis [2-way ANOVA, 6 (characters) × 9 (questions)] showed significant effects of both the characters (F 5,440 = 618, P < 10−5) and the questions (F 8,440 = 6.28, P < 10−5), as well as a significant interaction between characters and questions (F 40,440 = 3.84, P < 10−5).

Fig. 5

Mean (±95 % confidence intervals, n = 12 for each data point) animacy rating computed across all participants. Ratings for different characters are color-coded (see right inset), and the values for each of the 9 different semantic pairs are plotted in different columns (bi-polar words on the abscissa). Ratings could vary between 1 and 7, higher ratings denoting greater animacy

The average ratings across all questions for each character were 4.98 for dancer slow, 5.90 for dancer normal, 5.15 for dancer fast, 2.03 for whirligig slow, 2.25 for whirligig normal and 2.33 for whirligig fast (Fig. 6). Post hoc tests showed that the animacy rating of any type of dancer (slow, normal and fast) was significantly higher than that of any type of whirligig (all P < 10−5). Moreover, post hoc tests showed that the rating of the normal-speed dancer was significantly higher than that of both the slow and fast dancers (all P < 10−5). Notice that the rating of the normal-speed dancer was higher than that of the other dancers for all semantic items used to interrogate animacy (see Fig. 5). The fast speed whirligig was rated significantly higher than the slow one (P = 0.043), but the other comparisons within the whirligig group were not significant.

Fig. 6

Average animacy ratings (±95 % confidence intervals, n = 108 for each data point) for each character across all questions and participants

Discussion

In Experiment 1, we used an interference paradigm in which a timing task was run concurrently with the presentation of computer-graphics characters in the background of the scene. The timing task served as a probe to reveal potential biases of time and/or speed induced by the background characters. Specifically, observers were asked to press a button in synchrony with the landing of a falling ball, while a dancer or a multi-segment whirligig moved in the background in the animate and inanimate sessions, respectively. The speed of the movements of these two characters was varied across different sessions, so that the motion speed of all the segments of the character was scaled up or down by the same amount and to the same extent for both the dancer and the whirligig. The results showed that the average responses were timed systematically earlier in the animate sessions than in the inanimate sessions at all character speeds. This confirms and extends the previous observations of Carrozzo et al. (2010) which were obtained at one speed only.

A novel result was that the effects of character speed were completely different in the animate sessions as compared with the inanimate sessions. Observers delayed their responses considerably with increasing speed in the inanimate sessions. By contrast, the effect of speed in the animate sessions was much weaker and in the opposite direction: Observers anticipated the responses slightly with increasing speed.

It is unlikely that the different effects of the dancer and whirligig on interception timing resulted from different low-level features of the corresponding visual stimuli. First, the spatial position, size and color were matched between the dancer and the whirligig, as were their motion speed and temporal frequency. Moreover, the kinematics of both characters complied with the 2/3 power law typical of harmonic motion (Lacquaniti et al. 1983). The possibility that low-level features of the stimuli were the main cause of the different response timing is also ruled out by the finding that the speed context (slow, normal or fast) of the dynamic trials biased the response timing in the same direction in the static trials randomly intermingled with the dynamic ones. In the static trials, the visual scenes corresponding to the two viewing distances were identical in all sessions, and there were no dynamic signals other than those due to ball motion. Nevertheless, we found no significant interaction of speed effects with the presence or absence of character motion, or with motion and animacy. The intermittent presence of the moving character in the background was sufficient to determine a bias in the time estimates that carried over to the static background.

Remarkably, the time estimates were systematically affected by the animate or inanimate context even during the static trials, several seconds after the offset of the moving character. The specific, persistent bias in the time estimates indicates contextual priming of the observers’ ability to represent elapsed time. The biological context also conveyed animacy to the standing human figure of the static trials, as if the observer expected this figure to begin moving at any moment. In the inanimate context, by contrast, the same figure perhaps borrowed the passive features of the whirligig. As discussed below, the carry-over was greater for the inanimate context than for the animate one.

Visual motion after-effects may account, at least in part, for the observed time biases in both static and dynamic trials. The exposure to the moving characters of the dynamic trials may have affected the perceived speed of the falling ball in the current and subsequent trials. Indeed, it is known that exposure to a moving stimulus may reduce the perceived speed of slower stimuli, while it may increase the perceived speed of faster stimuli (Smith and Edgar 1994; Hammett et al. 2005; Hietanen et al. 2008). Moreover, the effects are roughly proportional: The higher the differential in speed between the adapting and test stimulus, the higher the distortion in perceived speed of the test. Here, the characters’ motion was always more than an order of magnitude slower than that of the ball at the interception point (see “Methods”). Thus, prolonged motion of the characters in the background may have induced an increase in the perceived speed of the falling ball, leading to anticipated motor responses; that is, observers may have pressed the button earlier than they should because they felt that the ball was arriving sooner than it actually did. The shifts in response times we found in the inanimate sessions were fully consistent with this hypothesis: Average motor responses were monotonically shifted to earlier values for progressively decreasing speeds of the whirligig (from fast to normal and slow, Fig. 2).
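The reasoning above can be illustrated with a minimal numerical sketch. It assumes, purely for illustration, that the ball moves at constant speed over the final stretch and that adaptation multiplies its perceived speed by a gain factor; the function name, the distance, the speed and the gain values are all hypothetical and are not taken from the experiment.

```python
def predicted_response_shift(distance_m, true_speed_m_s, perceived_gain):
    """Shift (in seconds) of the estimated arrival time relative to the true one.

    Illustrative assumption: the observer estimates arrival time as
    distance / perceived speed, with perceived speed = gain * true speed.
    Negative values correspond to earlier (anticipated) responses.
    """
    if distance_m <= 0 or true_speed_m_s <= 0 or perceived_gain <= 0:
        raise ValueError("distance, speed and gain must be positive")
    true_time = distance_m / true_speed_m_s
    estimated_time = distance_m / (true_speed_m_s * perceived_gain)
    return estimated_time - true_time


if __name__ == "__main__":
    # Hypothetical values: 1 m remaining path, 5 m/s ball speed.
    for gain in (1.00, 1.05, 1.10):
        shift_ms = 1000 * predicted_response_shift(1.0, 5.0, gain)
        print(f"perceived-speed gain {gain:.2f}: shift {shift_ms:+.1f} ms")
```

Under these assumptions, any gain above 1 yields a negative shift, and a larger gain yields a larger anticipation, which is the qualitative pattern that the speed after-effect account predicts.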

By contrast, the shifts in response times in the animate sessions were not consistent with speed after-effects. In this case, average motor responses changed little with character speed and in the direction opposite to that predicted by speed after-effects: The slower the dancer, the later the button-press response (Fig. 2). Moreover, the difference in response time between dynamic and static trials was much smaller for the inanimate sessions than for the animate ones (Fig. 3), indicating that there was much greater carry-over (after-effect) in the former than in the latter case. The contextual effects (animate versus inanimate context) affected sequences of four consecutive static trials, corresponding to 18 s of continuous presentation of the still character, as shown by the fact that there was no significant change of response timing across these consecutive trials in either animate or inanimate sessions (Fig. 4). Taken together, these findings suggest that the dynamic signals associated with the moving whirligig generated robust and persistent after-effects, whereas the dynamic signals associated with the moving dancer did not.

In Experiment 2, we assessed the perceived animacy of the dancer and whirligig at the three character speeds used in the interception experiments. The results were as expected. The dancers were rated by the observers as much more animate than the whirligigs, irrespective of the character speed. Moreover, the rating of the normal-speed dancer was significantly higher than that of the dancers artificially slowed down or sped up. Instead, the perceived animacy of the whirligig changed only slightly as a function of character speed. Therefore, the results of Experiment 1 and Experiment 2 considered together are consistent with our previous hypothesis that event timers are selectively biased as a function of perceived animacy, implicating high-level mechanisms for time modulation (Carrozzo et al. 2010). However, the changes in timing of the button-press responses as a function of character speed do not necessarily parallel the corresponding changes in the perceived animacy of the character (cf. Fig. 2 with Fig. 6), presumably because animacy interacts with speed in the complex manner discussed above.

Here, we demonstrated interference effects that result from showing animate or inanimate motion in the background while the observer performs a totally unrelated task (the interception). However, when the observed action is related and instrumental to the task performance, the interaction between the two (observed and performed) actions results in facilitation rather than interference (Sebanz and Knoblich 2009). In this vein, Zago et al. (2011b) compared the interception of a moving target when it depended on a biological motion or a non-biological motion triggered by the observer and simulated on the computer screen. They found that the performance significantly improved in the presence of biological movements under all ecological conditions of coherence between scene and target gravity directions.

Overall, the present and previous studies converge on the idea that time estimates are affected by animacy. Specialization of the neural time estimates as a function of animacy would enhance the temporal resolution of visual processing and the ability to predict critically timed events.