Avian vision is a marvel of nature. Given the weight limitations of flight, how the demands of vision are met within the small avian brain remains a compelling puzzle for visual scientists. Because of their distant and distinct evolutionary path and contrasting central nervous system organization, birds are highly useful to examine in pursuit of a general theory of visual cognition (Cook et al., 2015). Among the various problems facing any highly mobile social animal is recognizing and categorizing the behaviors of others. This ability requires observers to process the dynamic sequence and configuration of an actor’s body and limbs over time. Our laboratory has focused on how pigeons recognize various actions and behaviors, establishing that pigeons can classify different kinds of extended complex actions performed by digital models (Asen & Cook, 2012; Qadri & Cook, 2017; Qadri et al., 2014a, 2014b). This research has found that pigeons use a combination of configural motion and static pose cues to recognize and discriminate the actions exhibited by articulated digital models.

Creating additional challenges for action recognition, however, is the fact that the natural world is cluttered with a wide variety of objects. These objects regularly obscure and occlude actors, preventing observers from having a complete view of the action. For example, as a stationary observer views an animal running through the forest, fragments of the animal appear and disappear at different times and locations. As any birder knows, such partial and incomplete views of objects are challenging enough to recognize without the additional complexity of dynamic occlusion or movement (Drori et al., 2003; Herling, 2014; Iizuka et al., 2017; Tvardíková & Fuchs, 2010). Our previous research with pigeons has tested only a limited set of occluded conditions. In those experiments, the models and occluders always moved in place in the center of the scene (Asen & Cook, 2012; Qadri, Asen, et al., 2014). In the current experiments, we wanted to create a more realistic situation in which the animals moved across the scene, disappearing and reappearing as would occur frequently in nature.

Under such “dynamic occlusion” conditions, humans integrate successive partial views of an object passing behind an occluder into a whole object (Erlikhman et al., 2016; Ghose et al., 2014; Kellman & Shipley, 1991; McCarthy et al., 2015; Palmer et al., 2006). Palmer et al. (2006) postulated that this type of spatiotemporal integration by humans uses information persistence and positional updating in a coordinated fashion to form a completed mental representation of the occluded moving figure. They suggested that we have a dynamic visual store that briefly maintains shapes and updates their positions to do so. There is growing evidence regarding how the brain processes this interaction between form and motion (Erlikhman & Caplovitz, 2017).

It has been suggested that this ability to integrate such fragments into configural or integrated representations may be limited in birds (Loidolt et al., 2006) or possibly uniquely human (Imura & Tomonaga, 2013). The spatiotemporal integration of dynamic and fragmented information has been far less explored in birds, and the majority of these efforts have focused on shape- or form-related phenomena rather than motion processing.

For example, amodal completion is the process of perceiving a connected shape when viewing its elements separated by an occluder. Studies of amodal completion in pigeons have produced mixed results (see Qadri & Cook, 2015, for a review), with most of the evidence suggesting that birds do not readily complete such occluded objects. The addition of motion to studies of amodal completion has not resolved the issue. Ushitani et al. (2001) tested pigeons with line elements fragmented into two parts by a static occluder. Even when the elements were aligned or moved in synchrony, pigeons responded to them as if they were separated fragments. Nagasaka and Wasserman (2008) tested pigeons using the outline of a square moving in a circular pattern or four separated lines moving in a corresponding circular pattern. When tested with strategically placed occluders under conditions where humans would perceive the line figure as the outline of a square, all four pigeons responded as if they did not see a “completed” figure. Even with training and additional modifications designed to enhance the perception of completion, the results remained mixed. Different pigeons reported seeing a completed figure under differing conditions at different times. The critical condition(s) for consistently producing a “completed” perception by the pigeons were not identified.

In the current study, we examined action recognition by pigeons under the more realistic and natural conditions of dynamic occlusion. Here, we report how pigeons discriminate the locomotor actions of digital animal models as they transit behind a set of occluders. These occluders were arranged to provide only a successive series of partial or fragmentary views of the actions. Although they experienced all parts of the model at some point in time and space, the pigeons never saw a visually complete model. The main question of interest was how these fragments were used to support the discrimination of the actions.

Three experiments were conducted using a go/no-go action recognition discrimination. Experiment 1 examined how six pigeons acquired and transferred a locomotor action category discrimination (walk vs. run) of three different digital models under dynamic occlusion conditions. Experiment 2 examined how manipulations of the viewing conditions and context influenced this discrimination. Experiment 3 examined how different manipulations of the actions influenced this discrimination. The results indicate that pigeons can discriminate actions even when a model is partially obscured and fragmented in time and space, and that they do so using cues extracted from the successive views of the model.

Experiment 1

Experiment 1 examined whether pigeons could learn to discriminate the different locomotor actions of a model dynamically passing behind a set of occluders. Six pigeons were tested using the walk and run behaviors developed by Asen and Cook (2012), but with several important differences. Unlike that study, in which the model was completely visible for the entire duration of the trial, the pigeons here never saw the complete model at any time. Further, Asen and Cook’s models remained stationary while locomoting in the middle of the screen (i.e., “running in place”), but here, the digital animal models transited across the display from one side to the other. Thus, the model appeared, disappeared, and reappeared in the gaps between occluders as it moved across the display, requiring the pigeons to rely on dynamically changing partial views of the model’s actions to recognize them.

The important question of Experiment 1 was whether the pigeons could learn to discriminate such actions from the resulting series of temporally and spatially separated partial views of moving models. Figure 1 shows illustrative frames of the videos tested with the pigeons, including examples from different sequences to portray the variety of occluder shapes and textures used.

Fig. 1

Illustrative frames of the partial views using the two different occluder shapes and the three different occluder textures, depicted with the cat model walking (top row) and running (bottom row). All permutations of the three models, two behaviors, and six occluder shape and texture combinations were tested with the pigeons. (Color figure online)

Each session consisted of go and no-go trials. On each trial, a 20-s video was presented showing an animal model either walking or running repeatedly across the screen from left to right. Three horizontally spaced occluders were positioned across the screen. As the model moved left to right, parts of it would appear, disappear, and reappear in the gaps between the occluders. To prevent transiting speed from cuing the action category, the model moved across the screen at the same fixed rate (about 5 s per transit) regardless of whether it was running or walking. For three pigeons, the run behavior was the S+ stimulus, during which pecking was rewarded, while pecking during the S− walk behavior contributed to a dark time-out. This assignment was reversed for the other three pigeons (walk+/run−). Three digital animal models (antelope, camel, and cat) exhibited these run and walk behaviors on different trials. Across trials, the appearance of the occluders also varied: they appeared as either tall blocks or cylinders with one of three distinct visual textures (brick, metal, or wood; see Fig. 1).

After training, several sessions of data were collected as a baseline period to determine how the scene elements (i.e., model type, occluder shape, occluder texture) influenced the discrimination. Following that, three transfer tests evaluated the specificity of the pigeons’ learned discrimination. The pigeons were tested with (1) familiar models transiting in the opposite direction (right to left), (2) a novel digital animal model (dog), and (3) novel transit timing (two or six transits per trial). If the pigeons had memorized temporally or spatially specific details or features, they would fail to transfer to some or all of these novel situations. Successful transfer to these novel displays would suggest more generalized and flexible representations of the walk and run actions.

Methods

Animals

Six male pigeons (Columba livia) were tested. Three pigeons had previously participated in different action discrimination studies (J3 & S5 in Qadri & Cook, 2020; J3 & Y6 in Qadri, Sayde, et al., 2014; Y6 in Asen & Cook, 2012, and Qadri, Asen, et al., 2014). Three pigeons (D1, D2, M4) had participated in different non-action discriminations (Cook et al., 2011; Hagmann & Cook, 2011, 2013). The pigeons were maintained at 85% of their free-feeding weights during testing and had free access to water and grit in their home cages. Tufts University’s Institutional Animal Care and Use Committee (IACUC) approved the procedures used in these experiments. The data and code for this study are available by emailing the corresponding author.

Apparatus

Testing was conducted in a flat-black metal chamber (42 cm tall × 38 cm wide × 37 cm deep). Stimuli were presented on an LCD color monitor (NEC AccuSync LCD51VM; resolution 1,024 × 768 px) visible through a 30 cm × 35 cm viewing window in the front panel, with the monitor recessed 12 cm behind the window. Pecks to the display were detected by an infrared LED touchscreen (EZscreen EZ-150-Wave-USB) mounted in the front panel of the chamber. A 28-V houselight in the ceiling illuminated the chamber, except during time-outs. Mixed grain was delivered from a central hopper located in the middle of the front panel below the touchscreen.

Stimuli

The 13.5-cm × 10.1-cm video scenes were rendered using digital animation software (Poser 7, www.smithmicro.com). Once rendered, the scenes were joined to create 20-s, 30-fps, AVI video files (VirtualDub, www.virtualdub.org, using the Camstudio codec). The three-dimensional scene consisted of a green grass-like “ground” surface with an upper blue “sky.” The scene was lit using a diffuse light source located above the camera and a second one to the right of the camera. Additional lighting was provided from weaker spotlights above and behind the scene. The model’s shadow was cast from the diffuse light above the camera, while occluders had no cast shadows.

Three vertical occluders were located at the front of the scene. This resulted in four gaps in which the model was partially visible: two central gaps 16.9 mm in width and edge gaps of 8.2 mm (left) and 4.5 mm (right). Note that in Experiment 2 these scenes were modified to have four vertical occluders, resulting in three equal 16.9-mm gaps (compare Fig. 1 with the Supplemental figures of scenes in Experiment 2). The occluders could be either of two possible shapes (block or cylinder) and were scaled to the same size (3 cm × 8 cm). These occluder shapes could be rendered in one of three different visual textures (brick, metal, or wood).

In each video, an animal model (standing motionless, 4.3 cm nose to haunches × 3.8 cm foot to ear/horn) transited from left to right behind the occluders. The antelope, camel, cat, and later dog digital models were created using third-party content (both 3D shape and action paths; Daz 3D, www.daz3d.com). While transiting, the model’s body followed a walk or run action pattern that was biomechanically characteristic of the depicted species (see Supplemental Fig. 1 for a depiction of the models and behavioral cycles).

Within each 20-s video, the animal model transited the screen in a single direction. It first entered on the left side, transiting rightward across the screen, and disappeared off to the right. After a complete transit, the model reappeared on the left side for another transit. Each model transited the scene four times during each trial’s presentation. It took 4.3 s for the model to move from left to right across the scene, with at least some body part visible at all times before disappearing off-screen for approximately 0.3 s. The transit rate and timing were matched across the walk and run behaviors for each model, although for all models, the behavioral cycle before a pose repeated was shorter for the run action (mean 1.7 behavioral cycles per second, or cps) than for the walk action (0.82 cps).
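To make these timing parameters concrete, a back-of-the-envelope computation (using only the values reported above; the script and its variable names are ours) shows how many full behavioral cycles were visible during each transit:

```python
# Approximate behavioral cycles completed per transit, from the reported
# on-screen time (4.3 s) and cycle rates (run = 1.7 cps, walk = 0.82 cps).
TRANSIT_VISIBLE_S = 4.3
CYCLES_PER_S = {"run": 1.7, "walk": 0.82}

for action, cps in CYCLES_PER_S.items():
    print(f"{action}: ~{TRANSIT_VISIBLE_S * cps:.1f} cycles per transit")
# run: ~7.3 cycles per transit
# walk: ~3.5 cycles per transit
```

Each transit thus contained several complete action cycles for both behaviors.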

Procedure

Acquisition

Each trial started with a centrally presented 2.5-cm white ready signal on a black screen. When pecked, the ready signal was replaced by a video stimulus for a 20-s trial period. Each training trial portrayed one of the three animal models (antelope, camel, or cat) transiting behind one of the six occluder types (2 shapes × 3 textures). An intertrial interval (ITI) of 3 s followed the consequences of each trial before the presentation of the next ready signal.

There was a brief initial shaping period of several sessions during which pecks to all stimuli were rewarded. Once the pigeons were readily pecking the displays, three pigeons (D2, S5, & Y6) were rewarded only for pecking at running models (run S+), while the other three pigeons (D1, J3, & M4) were rewarded only for pecking at walking models (walk S+). Pecks to the correct (S+) action were rewarded with 3 s of access to mixed grain (3.3 s for Y6) on a variable-interval schedule (VI-10 s). Pecks to the incorrect (S−) action resulted in no reward and contributed to a variable dark time-out at the end of the presentation (0.5 s per peck).

Each 72-trial session tested all combinations of occluder shape, texture, model, and behavior. Six of the S+ trials (3 models × 2 occluder shapes) were designated as nonrewarded probe trials. These S+ probe trials allowed for the uncontaminated measurement of peck rate without the interruption of food presentations. All S+ dependent measures were calculated from these probe trials. Across a block of three sessions, all combinations of model, occluder texture, and occluder shape were tested as probe trials. Acquisition consisted of 24 sessions of testing.
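For illustration, the following minimal sketch assembles one such 72-trial session. The construction is hypothetical; the text specifies only the factorial combinations, the 72-trial total, and the six nonrewarded S+ probes per session, so the use of two repetitions of the 36 unique displays and the per-session probe texture rotation are our assumptions:

```python
import itertools
import random

SHAPES = ["block", "cylinder"]
TEXTURES = ["brick", "metal", "wood"]
MODELS = ["antelope", "camel", "cat"]
BEHAVIORS = ["walk", "run"]
S_PLUS = "run"  # "walk" for the other counterbalanced group of birds

# 2 shapes x 3 textures x 3 models x 2 behaviors = 36 combinations;
# showing each twice is one way to reach 72 trials per session.
trials = [
    {"shape": s, "texture": t, "model": m, "behavior": b, "probe": False}
    for _ in range(2)
    for s, t, m, b in itertools.product(SHAPES, TEXTURES, MODELS, BEHAVIORS)
]

# Designate six S+ trials (3 models x 2 shapes) as nonrewarded probes,
# rotating the probe texture so all combinations appear across a
# three-session block.
probe_texture = random.choice(TEXTURES)
marked = set()
for trial in trials:
    key = (trial["model"], trial["shape"])
    if (trial["behavior"] == S_PLUS and trial["texture"] == probe_texture
            and key not in marked):
        trial["probe"] = True
        marked.add(key)

random.shuffle(trials)
assert len(trials) == 72 and sum(t["probe"] for t in trials) == 6
```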

Baseline

After acquisition, six additional sessions were conducted consisting of 108 trials each (three repetitions of each stimulus configuration). Twelve S+ trials were conducted as probe trials and counterbalanced across sessions so that each S+ configuration was tested four times. Following this baseline, three different transfer tests were conducted.

Novel direction transfer

For this test, new videos depicting the cat model transiting from right to left were tested (i.e., a novel direction). The cat model was chosen for the test because it supported the best discrimination for five of the six pigeons. Each test session included six novel direction trials (2 behaviors × 3 occluder textures) and six matched baseline trials. Sessions alternated in using the block and cylinder occluder shapes for these test trials. The transfer and matched control trials were all conducted as probe trials. To accommodate these 12 test trials, comparable baseline trials from the 108-trial baseline session were replaced. Four 108-trial test sessions were conducted with each separated by two baseline sessions. After completion, transits from right to left for all three models were added in a counterbalanced manner to the baseline trials for 26 sessions before further tests.

Novel model transfer

New videos were created depicting a novel dog model transiting the scene behind the occluders. This dog model was standardized to the same size and timing parameters as the three baseline animal models. Each test session presented six novel model trials (2 behaviors × 3 occluder textures) and six matched control trials with the familiar models as probe trials. Transit direction and occluder shape alternated between different test sessions. As before, comparable trials were removed from the 108-trial baseline session to accommodate the 12 novel dog test trials. Four 108-trial test sessions were conducted with each separated by two baseline sessions. After completion of testing with the novel model, trials with the dog model were added in a counterbalanced manner to the baseline trials for 40 sessions before further tests.

Novel transit timing transfer

During training, the digital models transited the scene four times during a 20-s trial. In this test, new videos were created depicting the cat model transiting the scene more slowly (two transits per trial) or more quickly (six transits per trial). At two, four, and six scene transits per 20-s trial, the model was visible for 8.6, 4.3, and 2.7 s per transit, respectively (see Supplemental Video 1 for depictions of all three transit types).

Each session included eight trials with novel numbers of transits (2 behaviors × 2 directions × 2 test transit numbers) along with four matched control trials as probe trials. All test stimuli in a session used a single occluder shape and texture combination. All occluder combinations were presented by the end of testing. Six test sessions, each separated by two baseline sessions, were conducted. As above, the total session remained 108 trials by removing comparable baseline trials from each session. After testing the novel numbers of transits, stimuli with two and six transits were added into the baseline organization.

Discrimination metrics

Discrimination was primarily measured using discrimination ratio (DR), calculated as

$$\mathrm{DR}=\frac{\text{S+ probe peck rate}}{\text{S+ probe peck rate}+\text{S− peck rate}}$$
(1)

This metric equals 1.0 for perfect discrimination and .5 when no discrimination is present. While the S+ peck rates were stable across the trial, the pigeons’ responses to S− stimuli declined over the 20 s of a trial. We have established from prior experiments with go/no-go discriminations that pigeons continually improve their discrimination over the course of a presentation (e.g., Cook et al., 2003; Cook et al., 2016; Koban & Cook, 2009). As a result, the best measure of the pigeons’ discriminative ability comes from the peck rates in the last half of the trial. One pigeon (J3) appeared to lose inhibitory control with time, however, and consistently pecked more at S− stimuli toward the end of a trial despite low peck rates earlier in the same trial. We have occasionally observed this in a few birds before (Cook et al., 2012). As a result, for this specific bird we used peck rates during the second quarter of the trial (seconds 5 to 10) to best capture his discrimination.
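A minimal sketch of this computation in Python (the function and variable names are ours, not the authors’ code):

```python
import numpy as np

def peck_rate(peck_times, window=(10.0, 20.0)):
    # Pecks per second within a window of the 20-s trial; the last half
    # (seconds 10-20) was used for most birds, seconds 5-10 for pigeon J3.
    t = np.asarray(peck_times)
    start, end = window
    return np.sum((t >= start) & (t < end)) / (end - start)

def discrimination_ratio(s_plus_probe_rates, s_minus_rates):
    # Eq. 1: DR = S+ / (S+ + S-); 1.0 = perfect discrimination, .5 = chance.
    s_plus = np.mean(s_plus_probe_rates)
    s_minus = np.mean(s_minus_rates)
    return s_plus / (s_plus + s_minus)

# Example: discrimination_ratio([1.2, 1.0, 1.1], [0.3, 0.4, 0.2]) ~= 0.79
```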

Results

Acquisition

All six pigeons learned the action discrimination, despite the continually partial and fragmented views of the models. The top panel of Fig. 2 shows the birds’ improving discrimination over training. On average, the three birds in the run S+ condition appeared to learn the discrimination faster than the three birds in the walk S+ condition. A mixed ANOVA (Session as a within-subject factor and S+ condition as a between-groups factor) confirmed that the pigeons learned the discrimination, main effect of Session, F(23, 92) = 4.1, p < .001, ηp2 = .51. However, the S+ assignment had no effect on the learning of the discrimination, F(1, 4) = 1.3, p = .32, and did not interact with session, F(23, 92) = 0.7, p = .83.
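Analyses of this form can be reproduced with standard tools. Below is a hedged sketch using the pingouin package, assuming a long-format table with one DR value per bird per session; the file and column names are hypothetical:

```python
import pandas as pd
import pingouin as pg

# Long-format data: one row per bird per session, with columns
# "bird", "session" (1-24), "s_plus" ("run" or "walk"), and "dr".
df = pd.read_csv("acquisition_dr.csv")  # hypothetical data file

# Mixed ANOVA: Session as the within-subject factor,
# S+ condition as the between-groups factor.
aov = pg.mixed_anova(data=df, dv="dr", within="session",
                     subject="bird", between="s_plus")
print(aov[["Source", "DF1", "DF2", "F", "p-unc", "np2"]])
```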

Fig. 2

Top: The mean discrimination ratio (DR) across the 24 acquisition and six baseline sessions of Experiment 1. Data in black depict the average for the three pigeons in the Run S+ condition, while data in white are for the three pigeons in the Walk S+ condition. Error bars depict SE. Bottom: The mean discrimination ratio (DR) for each of the three model types during the baseline phase of Experiment 1 for the Run S+ and Walk S+ conditions

Baseline

The subsequent baseline period was used to examine the effects of model, occluder shape, and texture. All pigeons were good at the discrimination regardless of occluder shape (mean DR = .73 for both block and cylinder) or texture (.74 for brick, .74 for metal, and .72 for wood). Five of the six pigeons were best at discriminating the actions with the cat model (.84), followed by the antelope (.76). All pigeons were poorest with the camel model (.59), probably due to its distinctive gait. The lower panel of Fig. 2 shows discrimination broken down by model and S+ assignment. One-sample t tests found that each model supported above-chance discrimination (vs. .5, using a Holm–Bonferroni correction), ts(5) > 3.7, ps < .014, ds > 1.5. A mixed ANOVA (Model × Occluder Shape × Occluder Texture as within-subject factors, S+ condition as a between-groups factor) evaluating effects on DR confirmed a significant main effect of model, F(2, 8) = 9.8, p = .007, ηp2 = .71. Pairwise comparisons among the models found a significant difference between the cat and camel (p = .017) and between the antelope and camel (p = .028), but not between the cat and antelope (p = .220). All other main effects and interactions were not significant.
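The chance-level tests follow a standard recipe; here is a sketch of Holm–Bonferroni-corrected one-sample t tests using scipy, with illustrative placeholder DR values (one per bird), not the reported data:

```python
import numpy as np
from scipy import stats

dr_by_model = {  # illustrative per-bird DRs (n = 6 pigeons), not real data
    "cat":      [.86, .83, .85, .82, .84, .84],
    "antelope": [.78, .74, .77, .75, .76, .76],
    "camel":    [.60, .57, .61, .58, .59, .59],
}

# One-sample t test of each model's DRs against chance (.5).
tests = {m: stats.ttest_1samp(np.array(v), 0.5) for m, v in dr_by_model.items()}

# Holm-Bonferroni: step through p-values sorted ascending, comparing the
# i-th smallest to alpha / (k - i), and stop at the first failure.
alpha, k = .05, len(tests)
for i, (model, res) in enumerate(sorted(tests.items(),
                                        key=lambda kv: kv[1].pvalue)):
    if res.pvalue >= alpha / (k - i):
        break
    print(f"{model}: t(5) = {res.statistic:.1f}, p = {res.pvalue:.4g}")
```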

We next examined how the pigeons responded as the model repeatedly transited the scene during each trial. For this analysis, the average horizontal location of each peck relative to the center of the display was determined for each second of a presentation (i.e., twenty 1-s bins). The model’s leftward origin was standardized as a value of −1 and its rightward destination as 1 relative to the center of the display (coded as 0). Only pecks to positive probe stimuli were analyzed, because pecking during nonprobe trials was interrupted by food availability, and pecks to the negative stimuli were fewer and more variable (cf. Stahlman & Blaisdell, 2011). The top panel of Fig. 3 depicts the results, with guides highlighting the four separate transits within a baseline trial. The upward slope of each transit reflects how the pigeons tracked the models from left to right as they repeatedly moved across the scene. Their pecks started on the left side of the screen as the model appeared, shifted across the screen toward the right as the model transited, and then returned to the left side for the reappearance of the model, in synchrony with the four transits occurring within a trial.
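A minimal sketch of this binning and normalization, assuming each peck has a timestamp and a horizontal screen coordinate (function and parameter names are ours):

```python
import numpy as np

def binned_peck_locations(peck_times, peck_x, origin_x, dest_x,
                          trial_dur=20.0, bin_s=1.0):
    # Mean horizontal peck location per 1-s bin, rescaled so the model's
    # origin maps to -1, the display center to 0, and the destination to +1.
    t = np.asarray(peck_times)
    center = (origin_x + dest_x) / 2.0
    half_span = (dest_x - origin_x) / 2.0
    norm_x = (np.asarray(peck_x) - center) / half_span
    edges = np.arange(0.0, trial_dur + bin_s, bin_s)
    bins = np.digitize(t, edges) - 1
    return np.array([norm_x[bins == b].mean() if np.any(bins == b) else np.nan
                     for b in range(len(edges) - 1)])
```

Plotting the resulting 20 bin means against time yields the transit-by-transit pattern shown in Fig. 3.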

Fig. 3

Mean horizontal peck location to S+ stimuli across time in the trial using 1-s bins. The top panel shows behavior during the baseline phase of Experiment 1. The y-axis presents the horizontal peck location, with the model’s first appearance (origin) coded as −1 and the model’s disappearance (destination) coded as 1. This results in the center of the scene coded as 0 (dashed line). The bottom two panels show the results from the two-transit transfer (left) and six-transit transfer (right) tests. Error bars depict SE

Novel direction transfer test

Reversing the transiting direction of the model had no impact on the discrimination. The pigeons discriminated the behaviors equally well whether the model transited in the novel right-to-left direction (mean DR = .90) or in the trained left-to-right direction (mean = .88). A mixed ANOVA (Direction × Session as within-subject factors, S+ condition as a between-groups factor) found no effect of transit direction, F(1, 4) = 0.4, p = .566. All other main effects and interactions were not significant.

We also examined the pigeons’ tracking behavior in this condition. When the direction was reversed, the pigeons’ pecks started on the right side of the screen as the model appeared, shifted across the screen toward the left as the model transited, and then returned to the right side upon the model’s reappearance.

Novel model transfer test

All birds showed good discrimination transfer to the novel dog model. Across the four test sessions, the mean DR for the dog model was .78, whereas the mean DR for the three baseline models was .84 (antelope .90, camel .72, cat .90). One-sample t tests for each of the four models confirmed them to be significantly above chance (using a Holm–Bonferroni correction), ts(5) > 5.3, ps < .003, ds > 2.2. A mixed ANOVA (Model × Session as within-subject factors, S+ condition as a between-groups factor) evaluating DR confirmed a significant effect of model, F(3, 12) = 7.6, p = .004, ηp2 = .66. Pairwise comparisons of the models revealed differences that approached significance between the dog and antelope models (p = .058) and the dog and cat models (p = .074), and no difference between the dog and camel models (p = .37). All other main effects and interactions were not significant.

Novel transit timing transfer test

The pigeons were equally good at discriminating the model’s behaviors at both novel transit rates. The bottom two panels of Fig. 3 show the pigeons’ horizontal peck location across the duration of the trial for these two novel transit rates. The continued synchrony between peck location and transit number across both panels clearly indicates that the pigeons were tracking and pecking at the transiting animal’s position as it moved across the screen. Across the six test sessions, the mean DR with the trained four transits was .90. When the model transited the scene either two or six times, the mean DR was .93 and .89, respectively. A mixed ANOVA (Transit Number × Session as within-subject factors, S+ condition as a between-groups factor) revealed no main effect of transit timing, F(2, 8) = 1.1, p = .406. The analysis did reveal a significant three-way interaction, F(10, 40) = 3.0, p = .007, ηp2 = .427. This interaction appeared due to the walk S+ birds performing more poorly on the two-transit trials in the earlier sessions compared with the end of the test.

Discussion

The experiment revealed that, despite never seeing the entire animal model at any one time, the pigeons could discriminate the occluded walk and run actions demonstrated by the three animal models and transfer this discrimination to a novel fourth model. Thus, sufficient action information was available within the gaps to support discrimination of the model’s action. The pigeons also showed excellent discrimination transfer to a novel transit direction and novel transit timings, and their performance was invariant to changes in the occluders. The general flexibility of the discrimination indicates that the pigeons were attending to the relevant motion or pose differences between the walk and run behaviors when visible through the gaps. This flexibility is consistent with the results of Asen and Cook (2012), obtained when the full digital model was visible.

Finally, the pigeons’ pecking location, a proxy for attention (Dittrich et al., 2010), was synchronized to the model’s transits and direction of travel. This strongly indicates that the birds specifically tracked the models as they moved across the display. This finding is consistent with work by Wilkinson and Kirkpatrick (2020), who found that pigeons can track and peck at a moving object. An important difference is that the current birds did it spontaneously instead of as a trained requirement of the task.

Experiment 2

Experiment 1 established that pigeons could learn to classify locomotor actions without seeing the entire model. In this experiment, we examined how different scene characteristics affected the pigeons’ discrimination of this dynamically occluded partial information to better understand how the birds processed the displays. Four tests were conducted.

The first test presented scenes with only a single viewing gap to examine how the individual gaps were used. Knowing how and what type of information is processed from a single gap would advance our understanding of how the pigeons integrate information across multiple gaps. These kinds of “slit” viewing conditions, known as anorthoscopic perception, have a long history of study in humans, going back more than a century. Such viewing conditions have been explored in primates (Bognár & Vogels, 2021; Imura & Tomonaga, 2013) but have received virtually no attention in birds. In the second test, the size of this single gap was reduced. Intuitively, all reasonable accounts of action processing predict reduced accuracy when model visibility is decreased; this test examined by how much. To prepare for this test, adjustments to the spacing and number of gaps were first introduced into baseline training.

In the third test the camera position was varied, thus presenting the same scene from a new perspective. Testing novel perspectives evaluated pigeons’ flexibility in perceiving the digital model and its relation to the scene’s elements. In the fourth test, the occluders’ appearance was modified to determine how this irrelevant contextual information impacted the birds’ performance. In theory, changing irrelevant scene characteristics, like occluder texture, should not impact the discrimination since the occluders contain no discriminative information. In practice, however, changes to irrelevant features often surprisingly hinder pigeons (DiPietro et al., 2002; Koban & Cook, 2009; Lazareva et al., 2007; Qadri, Asen, et al., 2014). We wanted to investigate this further within the present dynamic context.

Methods

Animals and apparatus

The same pigeons and apparatus were used as in Experiment 1.

Procedure

Transfer to a single gap

For this test, videos were made depicting the cat model transiting the scene four times with only one of the four gaps being present. To create the stimuli, the cylinder occluder was stretched to cover the locations of all but one of the four gaps (see Supplemental Figure 2). With only one gap available, the total time that some portion of the model was visible was reduced from 4.3 to 1.6 s. Additionally, the cat model appeared either 0 s, 0.7 s, 1.7 s, or 2.7 s after video onset, depending on the gap being tested.

Each 106-trial test session presented eight one-gap trials (2 behaviors × 4 gap positions) and two four-gap control trials (one for each behavior) as probe trials randomly intermixed with the baseline trials. To minimize the number of test trials, only the cylinder occluder shape and the rightward motion direction were tested. The remaining trials of the test session were composed of 96 standard baseline trials. All occluder textures were tested over a three-session block. Six total test sessions were conducted, each separated by one 108-trial baseline session.

Adjustments to baseline stimuli and training

At this point, the baseline stimuli were modified to use three 16.9 mm (64-px) gaps. Instead of four gaps from three occluders, these new stimuli had three gaps created from two interior and two partially visible outside occluders (as is visible in Fig. 1). This change was made to better standardize the gaps. With this modification, the digital models were visible for 1.7 s within each gap and yielded a total of 3.7 s of visibility per 5-s transit. The model now took 0.3 s to appear before the first transit and remained off screen for 1.3 s, before becoming visible again back on the initial side of entry. Pigeons experienced these new three-gap scenes for 10 sessions using the same session organization as before.

After this modification, one-gap scenes were also added to baseline training. Instead of stretching the occluders as in the first test, these new one-gap scenes added additional occluders to create a single 16.9 mm opening at one of the three locations. In these one-gap scenes, the digital models were only visible for 1.7 s during each 5-s transit. Depending on gap location, the animal first appeared either 0.3 s, 1.3 s, or 2.3 s after the start of the trial.

Baseline sessions now consisted of 111 trials, with three-gap scenes presented on 84 trials and one-gap scenes presented on 27 trials. For scenes with three gaps, the animal model, behavior, and numbers of transits were counterbalanced (72 trials with the standard reinforcement contingency), with one of each positive combination added as a probe trial (12 probe trials per session). For scenes with one gap, all combinations of behavior, number of transits, and gap position were included (18 trials with the standard reinforcement contingency), and again one of each positive combination added as a probe trial (9 probe trials per session). The remaining scene features randomly varied (e.g., occluder shape, occluder texture, transit direction). Pigeons received 20 sessions of familiarization training with these three-gap and one-gap scenes before the next test.

Reduced gap transfer and testing

To determine how much information the pigeons needed to discriminate the behaviors, we tested scenes with smaller single gaps. The single gap between cylinder occluders was modified to be either one half (8.5 mm/32 px), one quarter (4.2 mm/16 px), or one eighth (2.1 mm/8 px) of the 16.9-mm/64-px training size. This reduced the total visible time per transit (1.6 s, 1.5 s, and 1.4 s, respectively). A matched no-gap control condition was also included, in which the cat model walked or ran behind the cylinder occluders but was completely blocked from view. In total, this created 10 new reduced-gap conditions (3 gap sizes × 3 gap positions + 1 no-gap). See Supplemental Fig. 3 for example scenes of these gap conditions.
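As a quick consistency check, the millimeter and pixel values above imply a display scale of roughly 0.26 mm per pixel (16.9 mm / 64 px); a tiny script (ours) recovers the reported gap sizes:

```python
# Check the reported mm/px pairs against the implied display scale.
MM_PER_PX = 16.9 / 64  # ~0.264 mm per pixel
for px in (64, 32, 16, 8):
    print(f"{px:2d} px = {px * MM_PER_PX:.2f} mm")
# 64 px = 16.90 mm, 32 px = 8.45 mm, 16 px = 4.22 mm, 8 px = 2.11 mm
# i.e., the reported 16.9, 8.5, 4.2, and 2.1 mm after rounding
```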

These transfer videos were presented in a counterbalanced sequence across sessions so that all conditions were tested six times and distributed evenly across the test sessions. This counterbalancing resulted in 10 reduced gap trials (2 behaviors × 5 selected transfer conditions) being added to each session, along with six one-gap control trials (2 behaviors × 3 gap positions) and two three-gap control trials (two behaviors) as nonrewarded probe trials. All transfer probe trials used the cat model transiting from left to right for four transits behind the cylinders. Within a session, the same occluder texture was used for all test trials, but across all the sessions all occluder textures were tested. These test trials were randomly mixed with 93 baseline trials—72 three-gap trials and 18 one-gap trials with the same features counterbalanced and randomized as described in Adjustments to baseline stimuli and training and three random positive three-gap probes using any animal model except the cat. Twelve total test sessions were conducted, although only six sessions included the no-gap condition to minimize the impact of this ambiguous stimulus. Test sessions were separated by a 111-trial baseline session (described in Adjustments section above).

Following this test, reduced-gap stimuli (excluding no-gap scenes) were added into the daily baseline sessions. These baseline sessions consisted of 112 trials, with three-gap scenes presented on 28 trials and one-gap scenes presented on 84 trials. For scenes with three gaps, the animal model, behavior, and numbers of transits tested were counterbalanced, with four random positive probe trials (one of each animal model). For scenes with one gap, the behavior, number of transits, gap size, and gap position were counterbalanced, again with 12 random positive probe trials (4 gap sizes × 3 gap positions). The remaining scene features randomly varied (e.g., occluder shape, occluder texture, transit direction). Pigeons received 30 sessions of training with this new baseline before the next test.

Novel perspective transfer

To create novel camera perspectives, the camera’s position was moved while keeping the central gap at a fixed distance and centered in the video. The camera was “orbited” either horizontally left or right (azimuth ±24°), vertically up (elevation +16°), or a combination of both. Relative to the animal model, the camera was now right, left, elevated, elevated & right, or elevated & left (see Supplemental Fig. 4). These changed perspective scenes were tested with one central 16.9-mm gap and the cat model transiting from left to right for four transits. A different occluder texture was tested each session. Test sessions included 10 trials testing the novel camera perspectives (2 behaviors × 5 camera perspectives) and two baseline control trials (two behaviors) as probe trials. These were randomly mixed with 96 baseline trials (i.e., a 112-trial baseline session as described above, with the 16 probe trials removed). Three 108-trial test sessions were conducted, each separated by two 112-trial baseline sessions.

Novel texture transfer

To determine how the irrelevant occluders were being processed, occluders with a novel pattern, a novel color, or both novel features were tested. For the novel color, the three original textures’ patterns were rendered in red (RGB 255, 0, 0). For the novel pattern, a marble pattern was rendered in the colors taken from the familiar textures (brick RGB 77, 46, 35; metal RGB 110, 108, 110; wood RGB 83, 53, 32). The third condition presented the novel marble pattern in a novel white color (RGB 255, 255, 255; see Supplemental Fig. 5 for examples). These scenes used a single central 16.9-mm gap between block-shaped occluders behind which the cat model transited from left to right four times.

Test sessions presented six test trials (2 behaviors × 3 occluder conditions) and two standard occluder control trials (two behaviors) as non-reinforced probe trials randomly intermixed within 96 baseline trials. Three 104-trial test sessions were conducted, each separated by a 112-trial baseline session.

Results

Transfer to a single gap and reduction in gap size

For purposes of exposition, the results of the transfer test to a single gap, its subsequent reduction in size, and the ensuing training are combined. These combined results are shown in Fig. 4A, which displays discrimination as a function of gap size. Overall, the pigeons continued to discriminate well even when the acting model was visible through only a single gap. As might be expected, performance declined with smaller gap sizes.

Fig. 4

The mean discrimination ratio (DR) for baseline stimuli (black) as compared with transfer stimuli (gray) for the transfer tests conducted in Experiment 2. Panel A shows the pigeons’ discrimination as a function of gap size. The black symbols depict performance with baseline, multigap scenes, with each symbol placed on the x-axis at the mean gap size of the scene. Pigeons first experienced four-gap scenes (black square) in which the two middle gaps were 16.9 mm, the leftmost gap was 8.2 mm, and the rightmost gap was 4.5 mm. They were tested in transfer to scenes in which only one of those four gaps was available (gray squares). Pigeons were then retrained with scenes containing three 16.9-mm gaps (black circle). Finally, pigeons were tested in transfer with a single gap varied in size between 16.9 mm and 0 mm (gray circles). Panel B shows the pigeons’ discrimination as a function of camera position, with left-right position presented on the x-axis and camera elevation presented as separate lines. Panel C shows the pigeons’ discrimination as a function of the novel texture of the occluder. Error bars depict SE

In the first test, the pigeons discriminated the actions through a single gap at all four original locations. When the scene contained a single gap, the mean DR was .82, while the mean DR with four gaps was higher at .93. The interior, larger gaps supported slightly higher discrimination (gap 2 = .84, gap 3 = .86) than the smaller gaps on the far sides of the display (gap 1 = .83, gap 4 = .75), likely reflecting their slightly different sizes. One-sample t tests confirmed significant above-chance discrimination (.5) with a single gap at each position (using a Holm–Bonferroni correction), ts(5) > 5.7, ps < .001, ds > 2.3. A mixed ANOVA (Gap position [all-gaps baseline and four one-gap positions = five total levels] × Session as within-subject factors, S+ condition as a between-groups factor) confirmed a significant main effect of gap position, F(4, 16) = 4.5, p = .012, ηp2 = .53. All other main effects and interactions were not significant.

In the second test, the single, central gap’s size was parametrically reduced (from 16.9 mm to 2.1 mm). The gray circles in Fig. 4A show the pigeons’ DR as a function of the decreasing gap sizes during this test. Both the 16.9-mm gap (mean DR = .88) and the 8.5-mm gap (.88) supported good discrimination. Further reductions in gap size reduced performance (.56). When the gap was eliminated, discrimination was expectedly quite poor (mean DR = .42). One-sample t tests indicated that the pigeons performed significantly better than chance (.5) with one-gap scenes only at the 16.9-mm and 8.5-mm gap sizes (using a Holm–Bonferroni correction), ts(5) > 15.8, ps < .001, ds > 1.3.

A mixed ANOVA (Gap size [all-gaps baseline, one gap at four sizes, and no gap = six total levels] × three-session block [because the no-gap condition was not tested every session] as within-subject factors, S+ condition as a between-groups factor) confirmed a significant main effect of gap size, F(5, 20) = 19.7, p < .001, ηp2 = .83. Pairwise comparisons revealed that the three-gap baseline, the one-gap 16.9-mm (baseline size), and the 8.5-mm conditions differed from each other, and each differed from the smaller sizes and the no-gap condition (ps < .015). Unlike in previous tests, the pigeons’ discrimination worsened across sessions, F(3, 12) = 4.4, p = .026, ηp2 = .52, likely indicating that the birds recognized these transfer scenes were nonrewarded. A three-way interaction between S+ condition, Session, and Gap condition, F(15, 60) = 2.1, p = .021, ηp2 = .35, suggests this decline may have been specific to the run S+ condition and the smallest gap sizes (4.2 mm and less). The related two-way interactions were also significant, S+ condition × Session: F(3, 12) = 4.8, p = .020, ηp2 = .55; Session × Gap condition: F(15, 60) = 2.3, p = .013, ηp2 = .36.

Analysis of the ensuing 30-session training phase, in which the reduced-gap scenes were added to daily training, showed a similar pattern: the pigeons’ discrimination was reduced as the gap size became smaller. This additional training modestly improved the pigeons’ discrimination. Pigeons were now able to discriminate with the 4.2-mm gap size (DR = .62) but not the 2.1-mm size (DR = .55). One-sample t tests for each gap size indicated that the pigeons now performed significantly better than chance (.5) with the 4.2-mm gap as well as the 16.9-mm and 8.5-mm sizes (using a Holm–Bonferroni correction), ts(5) > 6.2, ps < .002, ds > 2.5. These results suggest that the smallest gap limited the spatial extent or time available to effectively discriminate the model’s behaviors. A mixed ANOVA (Gap condition and Session as within-subject factors, S+ condition as a between-groups factor) on DR showed a significant effect of gap condition, F(4, 16) = 307, p < .001, ηp2 = .987, but no main effect of session, F(29, 116) = 1.46, p = .082, and no interaction, F(116, 464) = 0.97, p = .581. Unlike in the transfer test, pairwise comparisons found significant differences between all gap sizes (ps < .003).

Whereas in Experiment 1 the pigeons tracked the models as they transited (and still did so during baseline trials), in the one-gap scenes the pigeons largely pecked the gap location where the model would appear. With the smallest-gap and no-gap conditions, however, the pigeons instead spent their time pecking at the middle of the screen, suggesting a loss of stimulus control.

Novel perspective transfer

Moving the camera perspective leftward or rightward had no significant impact on the discrimination (see Fig. 4B). These adjusted scenes supported the same level of discrimination (mean DR = .87) as baseline scenes (.87). Elevating the camera, however, did reduce the pigeons’ discrimination (mean DR = .75). One-sample t tests indicated that each novel perspective supported above-chance discrimination (using a Holm–Bonferroni correction), ts(5) > 4.8, ps < .005, ds > 2.0.

A mixed ANOVA (Camera elevation × Camera position × Session as within-subject factors, S+ condition as a between-groups factor) revealed no significant effects. A deeper analysis suggested that one of the birds (J1) performed drastically worse than the other five pigeons, adding considerable variability to the results. Rerunning this analysis excluding J1 revealed a significant effect of camera elevation, F(1, 3) = 11.3, p = .044, ηp2 = .790, but not camera position, F(2, 6) = 2.0, p = .211. This analysis also showed a significant effect of session, F(2, 6) = 7.6, p = .023, ηp2 = .717. A pairwise analysis showed that the pigeons performed significantly more poorly on the first test session than on the second and third (p ≤ .047). All other main effects and interactions were not significant.

Novel texture transfer

The pigeons’ discrimination varied as a function of the degree of change in the texture of the occluder (see Fig. 4C). Compared with scenes with the familiar baseline occluders (mean DR = .90), a novel occluder color (mean DR = .83) disrupted the discrimination, but less than a novel occluder pattern did (mean DR = .75). Scenes where the occluders used both a novel color and a novel pattern supported the poorest discrimination (mean DR = .60); this latter condition was not reliably different from chance, t(5) = 1.2, p = .289, d = 0.5. A mixed ANOVA (Pattern × Color × Session as within-subject factors, S+ condition as a between-groups factor) using DR showed a significant effect of pattern, F(1, 4) = 8.3, p = .045, ηp2 = .68, but not color, F(1, 4) = 5.6, p = .077. No other main effects or interactions were significant.

Discussion

This experiment found that pigeons could discriminate the actions of the occluded models with only a single gap of sufficient size. The pigeons initially needed a gap of at least 8.5 mm to reliably discriminate between the walk and run actions, although with experience they learned to successfully discriminate with a 4.2-mm gap. Given these gap sizes, the pigeons were likely temporally integrating successive movements of the fragmentary models as they appeared within the single gap. The remaining tests revealed the pigeons were resilient, but not immune, to changes in scene characteristics. Horizontal camera movements had little impact, while vertical changes reduced the discrimination. Color and pattern changes on the occluders also impacted performance, despite being irrelevant to the discrimination of the actions themselves.

These outcomes are consistent with previous research with pigeons. Several investigations have found that irrelevant scene changes impact ongoing discriminations by pigeons (DiPietro et al., 2002; Koban & Cook, 2009; Lazareva et al., 2007; Qadri, Asen, et al., 2014). Previous speculation has suggested that this impact may be an effect of introducing novel edge relationships among the scene features (Lazareva et al., 2007; Qadri, Asen, et al., 2014). These new edges and their identity may have attracted attention, reducing attention to the model’s actions and consequently reducing discrimination. Alternatively, the pigeons may have been adversely impacted by the novelty of these features, perhaps reflecting the species’ well-known neophobia. In line with this latter hypothesis, the pigeons made fewer total pecks to the test conditions alongside their reduced discrimination.

Experiment 3

Experiment 3 examined how different types of dynamic and static cues contributed to the pigeons’ discrimination of the fragmented views of the actions. Our past research has focused on the role of global and local movement cues as related to the articulated actions themselves and the additional contribution of different shape or pose cues (Qadri, Asen, et al., 2014; Qadri & Cook, 2017, 2020). When presented in full view with complete models, we have found that both action and pose cues contribute to the discrimination of complex human actions by pigeons (Qadri & Cook, 2017).

Three conditions were tested to clarify what action and pose information the pigeons were extracting from the partial views experienced through the gap. In the first test, all motion information was eliminated, leaving a partially occluded familiar model presented for 20 seconds in a partially visible motionless pose. In the second test, the transiting motion was restored to this motionless model, but without the articulated actions. This test used a “gliding” static model in a single pose that appeared across the gap. In the third test, only the articulated motion was presented by having the partially occluded model run or walk in a fixed location within a gap. Separating these different sources of information allowed us to identify their respective contributions to the pigeons’ discrimination of the partially occluded actions.

Methods

Animals and apparatus

The same pigeons and apparatus were used as in the prior experiments.

Procedure

Test 1: Static pose

Videos were made depicting the familiar cat model in a static pose facing rightward in the central gap, such that only controlled parts of the model were visible. Six portions of the model’s body were selected to demonstrate the walk and run poses. These presented only the head and neck area (head condition), the front legs (full front legs), all four legs (torso and portions of all legs), only the torso (torso, no legs visible), only the back legs (full back legs), or only the tail (tail). These videos used a single 8.5-mm central gap to show each static pose for the trial’s 20-s duration (see Supplemental Fig. 6).

Six 110-trial test sessions were conducted, each separated by a 112-trial baseline session. Each test session presented 12 novel static pose trials (2 behaviors × 6 poses) and two matched baseline control trials (two behaviors) as probe trials. These probes were randomly intermixed within the 96 baseline trials (baseline session, minus the 16 baseline probes). Each test session used a single occluder texture for probe trials, counterbalanced across test sessions.

Test 2: Transiting motion

Videos were made depicting the familiar cat model in a static pose facing rightward transiting the scene without articulated run or walk motions. This created the effect of the model gliding across the scene as visible through the central 8.5-mm gap. For each behavior, four poses (not used in the prior static pose test) were selected to have a fair overall representation of the behavioral cycles’ variability (see Supplemental Fig. 7 for images of the poses used and Supplemental Video 2 for an example video). The model was initially “off-screen” for about 1.4 s, then was visible “gliding” behind the occluders for about 1.6 s through the central gap, then was off-screen for 3.4 s before becoming visible again in the gap. This gliding model completed four transits during the 20-s trial.

Six 106-trial test sessions were conducted, each separated by a 112-trial baseline session. Each test session presented eight novel transiting pose trials (2 behaviors × 4 poses) and two baseline control trials (2 behaviors) as probe trials. These test trials were randomly intermixed with the 96 trials from a baseline session. Each test session used a single occluder texture for probe trials, counterbalanced across test sessions.

Test 3: Articulated motion

Videos were made depicting a portion of the familiar cat model facing rightward visible in an 8.5 mm central gap. The visible parts of the model appeared to run or walk in place, depicting articulating motion within the model, but with no transiting motion across the screen. Five portions of the model’s body were selected to demonstrate the actions in different videos—using only the head and neck (head condition), only the front legs (full front legs), all four legs (torso and portions of all legs), only the back legs (full back legs), or only the tail (tail). Supplemental Video 3 is one such video used in this test.

Six 108-trial test sessions were conducted, each separated by a 112-trial baseline session. Each test session presented 10 novel articulated motion trials (2 behaviors × 5 body portions) and two baseline control trials (two behaviors) as probe trials. These were randomly intermixed within the 96 baseline trials. Each test session used a single occluder texture for probe trials, counterbalanced across test sessions.

Results

The results of the three different tests are summarized in Fig. 5. The figure shows the mean DR from each test along with the combined average baseline discrimination over the three tests. The pigeons did very poorly in the static condition. They improved when the static poses glided across the central gap, allowing a “complete” successive view of the pose. Pigeons did best in the articulated motion condition, with a single view of a partially occluded model moving in place. The following sections describe these results in more detail.

Fig. 5

The mean discrimination ratio (DR) for baseline stimuli (black bar; average baseline performance across all three tests) as compared with each type of transfer stimuli tested in Experiment 3 (gray bars). Error bars depict SE

Test 1: Static poses

When shown only a partially occluded model in a static pose, the pigeons failed to discriminate the depicted behaviors. In this condition, the mean DR dropped to .51, ranging from .42 to .58 depending on the body part that was visible (torso = .58, head = .54, front legs = .52, rear legs = .51, all legs = .51, tail = .42). One-sample t tests found that none of these values was significantly different from chance (using a Holm–Bonferroni correction), ts(5) < 1.7, ps > .160. A mixed ANOVA (Pose condition [baseline and six static poses = seven levels] × Session as within-subject factors, S+ condition as a between-groups factor) indicated a significant effect of pose condition, F(6, 24) = 9.3, p < .001, ηp2 = .70. Pairwise comparisons indicated that this main effect was due to the baseline condition (.89) differing significantly from every static pose test condition (ps < .015). No other main effects or interactions were significant.

Test 2: Transiting motion

Pigeons showed reduced discrimination when the static models transited/glided across the scene behind the occluders. Compared with performance on baseline trials (.88), the mean DR was .62 when the model “glided” through the central gap. This latter value was significantly better than chance (one-sample t test, using a Holm–Bonferroni correction), t(5) = 2.9, p = .032, d = 1.2, but significantly below baseline performance (paired t test, using a Holm–Bonferroni correction), t(5) = 6.8, p = .001, d = 2.8. A mixed ANOVA (Transiting motion condition × Session as within-subject factors, S+ condition as a between-groups factor, and DR as the response) found a significant effect of the transiting motion condition, F(1, 4) = 40.9, p = .003, ηp2 = .91. No other main effects or interactions were significant.

Test 3: Articulated motion

The pigeons demonstrated excellent discrimination when shown only a part of the model walking or running in a fixed location within the central gap. The birds’ average performance in this transfer and during the baseline trials was equivalent (.90). Discrimination varied slightly depending on the body part that was visible, ranging from a mean DR of .81 to .95 (torso = .95, front legs = .92, back legs = .93, tail = .89, head = .81). Each of these parts supported above-chance discrimination (using a Holm–Bonferroni correction), ts(5) > 3.1, ps < .027, ds > 1.3.

Consistent with this analysis, a mixed ANOVA (Articulated motion condition [baseline and five articulated motion views = six levels] × Session as within-subject factors, with S+ condition as a between-groups factor and DR as the response) did not find a significant effect of the articulated motion condition, F(5, 20) = 1.4, p = .269. This analysis did reveal a significant three-way Group × Condition × Session interaction, F(10, 40) = 2.5, p = .018, ηp2 = .40. On closer inspection, this appeared to be due to the run-S+ birds performing more poorly with the head condition during the third test session. No other main effects or interactions were significant.

Discussion

Experiment 3 revealed the greater contribution of articulated motion to the pigeons' action discrimination relative to static pose cues. That the pigeons attended more to articulated motion features is not unexpected, given that we have found this type of dynamic superiority effect in a number of settings (e.g., Cook, 2001; Cook & Katz, 1999). Across different digital models, pigeons have consistently discriminated actions better than they have pose information alone (Asen & Cook, 2012; Qadri, Sayde, et al., 2014). Even when the model is fully visible, pigeons require more than the separate static forms to correctly discriminate such displays (Asen & Cook, 2012). Thus, the current results further consolidate this pattern by indicating that, even when the model is partially occluded, articulated motion contributes substantially to the birds' discrimination of actions.

The processing of static information was more complex. In the present experiment, static form processing seemed to depend partially on which parts of the model were visible. While not significant, the static torso did support slightly better performance than the other static parts (Aust & Huber, 2002; Qadri, Asen, et al., 2014). The pigeons' ability to discriminate the behaviors from static poses did improve moderately, to above-chance levels, when the model transited the gap in a single pose. This transiting discrimination was consistently better than with any single static part tested in the first test. The benefit of the transiting motion was likely not due simply to seeing the "best" part of the static model, as it never reached baseline levels. This pattern suggests the pigeons may have been successively integrating pose cues from different parts of the animal as they appeared over time.

The benefit of the transiting motion could also have had other sources. For example, the transiting motion certainly increased the stimuli's similarity to the trained conditions. Further, the transiting motion may have helped capture the pigeons' attention, allowing them to attend to these static figures in a way that an unmoving, fixed presentation did not. This attentional attraction account would be consistent with the pigeons' demonstrated tracking of the models as they transited the scene in the two earlier experiments.

General discussion

These experiments revealed that pigeons can discriminate the actions of digital models under highly restricted and dynamic viewing conditions. Using dynamic conditions mimicking those of natural occlusion, pigeons discriminated among the different locomotor actions of different animal models. Experiment 1 established that pigeons can successfully learn to discriminate sequentially presented walking and running actions within and across the gaps of a set of occluding objects. This was accomplished in part by tracking the model as it transited across the display. Experiment 2 revealed that a single gap of at least 4.2 mm was needed to reliably determine the actions. This experiment also found that features of the background and occluders, despite their independence from the discriminative actions, may have been processed as part of the pigeons' "representation" of the discrimination. Experiment 3 found that the model's articulated motions were most critical to the discrimination, while static pose or shape information may have had only a secondary contribution. Together, these experiments demonstrate for the first time that pigeons can successfully recognize dynamically occluded fragments of actions. This capacity would, of course, be highly valuable to any mobile, social animal navigating an object-filled natural world.

What perceptual and cognitive processes in the birds supported this discrimination? Previous research has established that pigeons use a combination of dynamic and static features to discriminate actions, with an attentional emphasis on the motion of the extremities (Asen & Cook, 2012; Qadri, Asen, et al., 2014; Qadri & Cook, 2016; Qadri, Sayde, et al., 2014; Troje & Aust, 2013). The present results deepen our understanding of how these features are utilized. The pigeons' improvement with multiple gaps (Experiment 2) and with successively presented static information (Experiment 3) suggests the birds were likely recognizing and integrating information across the gaps and over time to discriminate the actions. Because the pigeons never saw the model in its entirety, these results raise the possibility that they may have developed a complete, configural representation of the dynamic model, or at least a connected series of views of each model (cf. Aust & Huber, 2010).

These experiments highlight how much and what type of visible action information is needed to discriminate locomotor actions. For example, they expand our understanding of the contribution of spatially localized information. A single viewing aperture of 4.2 mm or greater gave the pigeons sufficient momentary information to distinguish the actions of the transiting models. Smaller apertures did not support reliable discrimination. This provides a good indication of the approximate size and scope of the action features being used by the pigeons. These features seem to be on the larger side, as opposed to small localized details, as the current and past research suggests that seeing more of the model supports the best discrimination (Asen & Cook, 2012; Qadri, Asen, et al., 2014).

Our results resemble those from human studies involving dynamic occlusion. In their study of humans recognizing moving objects under conditions of dynamic occlusion, Palmer et al. (2006) postulated that humans use informational persistence and positional updating to form a completed mental representation of the occluded moving figure. While we are uncertain whether the pigeons constructed a single mental representation of the digital models under similar conditions, the results suggest that the pigeons do have a dynamic visual store that can at least briefly maintain passing shapes and successively use these snapshots as the model's position changes. The pigeons' better performance when more information was successively presented suggests that some part of the pigeons' visual system accumulated and connected information from the actions of the fragmentary models.
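To make this idea concrete, the toy sketch below shows one way such a store could operate: part views persist briefly after they disappear, their positions are extrapolated as the model moves, and the surviving views can be read out together. This is our own illustrative construction, not a model proposed by Palmer et al. (2006) or tested here; all names and parameters are hypothetical.

```python
# Toy sketch of a "dynamic visual store" (hypothetical construction):
# briefly persist glimpsed part views and update their positions so that
# successive fragments can be read out as one connected description.
from dataclasses import dataclass

@dataclass
class StoredView:
    part: str      # which body part was glimpsed
    x: float       # last estimated horizontal position
    age: int = 0   # frames since the part was last visible

class DynamicVisualStore:
    def __init__(self, persistence=10, velocity=1.0):
        self.persistence = persistence  # frames a view survives unseen
        self.velocity = velocity        # assumed transit speed (px/frame)
        self.views = {}

    def update(self, visible_parts):
        """visible_parts maps each currently seen part to its x position."""
        for part, x in visible_parts.items():   # refresh seen parts
            self.views[part] = StoredView(part, x)
        for part in list(self.views):            # age the unseen parts
            if part not in visible_parts:
                view = self.views[part]
                view.age += 1
                view.x += self.velocity          # positional updating
                if view.age > self.persistence:  # persistence limit reached
                    del self.views[part]

    def assembled(self):
        """Read out all maintained views, ordered by estimated position."""
        return sorted(self.views.values(), key=lambda v: v.x)
```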

While connecting successive partial views of a moving object into a single representation would be valuable, the current results do not yet demand such an explanation (Aust & Huber, 2003). Future experiments could directly examine this question by presenting the models' different parts in and out of their experienced or normal order. If the pigeons indeed integrate successive views into complete representations, such re-orderings should diminish their recognition of the behaviors. Similarly, for pigeons trained with re-ordered models, presenting a normally ordered model should also reduce action recognition.
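A minimal sketch of this proposed manipulation, assuming hypothetical part labels and a left-to-right transit, might scramble the order in which the parts emerge at the gap while leaving the set of part views unchanged:

```python
# Hypothetical sketch of the proposed re-ordering probe (not from the
# article): scramble the order in which the model's parts reach the gap.
import random

# order in which parts of a left-to-right walker would normally appear
NORMAL_ORDER = ["head", "torso", "front_legs", "rear_legs", "tail"]

def scrambled_order(seed=None):
    """Return a part sequence guaranteed to differ from the normal order."""
    rng = random.Random(seed)
    order = NORMAL_ORDER[:]
    while order == NORMAL_ORDER:
        rng.shuffle(order)
    return order

print(scrambled_order(seed=2))  # e.g., ['tail', 'head', 'rear_legs', ...]
```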

Finally, it is worth noting the results of the occluder transfer tests in Experiment 2. Here, we found the pigeons' action discrimination was diminished by simply changing the occluders' appearance. Why did changing features irrelevant to the discrimination affect the pigeons? This is not the first time that features unrelated to the main discrimination have had unexpected impacts (DiPietro et al., 2002; Koban & Cook, 2009; Lazareva et al., 2007; Qadri, Asen, et al., 2014). We have previously speculated that this may be an effect of introducing novel edge relationships into the scene. This raises the possibility that surface and edge relations beyond the critical features are encoded in the pigeons' representation of the task. Pigeons are frequently neophobic, and perhaps these introduced featural changes attracted attention or encouraged avoidance of the display. The result might interfere with or prevent the relevant information from being encoded in a timely way. This issue raises the question of what pigeons encode when looking at complex displays of information.

It is convenient to talk about the pigeons' recognition, and presumably their representation, of the movements of the models, yet other features of the display are likely being encoded, too. Despite our using a wide variety of models and occluders to encourage as generalized and flexible a discrimination as possible, the pigeons clearly became familiar with specific features within the displays, even features directly irrelevant to the action discrimination. Given their large memory capacity, perhaps that is not surprising (Cook et al., 2005; Fersen & Delius, 1989; Vaughan & Greene, 1984). Nevertheless, such results are a constant reminder that what pigeons encode from our displays is not always what we might expect. The rigidity of pigeons demonstrated by such findings perhaps shows where they diverge from the symbolic and abstract capacities of humans.

These experiments establish that pigeons, and likely most birds, can process sequentially presented partial action information in ways that mimic human object and action recognition abilities. Whether the pigeons extract an integrated configural representation of the animal models from these successive views remains to be seen, although the current results raise that intriguing possibility. From our human perspective, that certainly seems the most natural way to represent such spatially and temporally distributed experiences. Nevertheless, the pigeons may not do so. Pigeons are often very absolute, regularly relying on specific and direct past experiences to guide behavior. Continued exploration of pigeons' ability to discriminate and represent constantly changing motion and scene features under limited viewing conditions like those tested here will be needed to provide a more complete understanding of avian visual cognition.