Introduction

Much of our current understanding of how animals discriminate and learn about stimuli in their environments has been derived from experiments that not only use simple stimuli but also pose simple problems. By ‘simple’ we mean here, in the case of stimuli, variation along a single sensory dimension, and in the case of problems, the solving of elemental discriminations in which each stimulus or reaction is specifically and unambiguously associated with a defined and predictable outcome. Although such a strategy is certainly useful for understanding basic principles of associative learning, it may overlook the core capacities that animals exhibit in natural situations, when confronted with compound stimuli that vary along several dimensions and with problems that may admit several solutions and outcomes and thus require sophisticated decision-making (Watanabe and Huber 2006).

This possibility may appear irrelevant in the case of invertebrates, where research on experience-dependent plasticity has focused predominantly on elemental forms of learning. In fact, invertebrates have been credited with only low-level forms of cognitive processing and, generally, the term ‘cognitive’ has been carefully avoided for these animals. Such a prevailing view, based on the assumption that invertebrate behavior is organized in terms of isolated and rather automatic modules having specific sensory inputs and motor outputs, has inhibited the analysis of intermediate and higher forms of cognitive processing in invertebrates (Menzel and Giurfa 2001; Giurfa 2003).

Besides elemental forms of associative learning and memory, animals, including invertebrates, can sometimes respond to novel stimuli that they have never met before or can generate novel responses that are adaptive given the context in which they are produced. In doing this they exhibit positive transfer of learning (Robertson 2001), a capacity that involves stimulus comparison and generalization such that the animal's responses can be aimed appropriately towards novel stimuli (Giurfa 2003). Here we focus on experiments in honeybees that demonstrate such a positive transfer from known to novel visual stimuli, a capacity that demonstrates their generalization power in the visual domain and which also underlies stimulus categorization where this capacity exists.

We will present and analyze evidence showing stimulus generalization for specific visual features or sets of features and will discuss such results in the framework of categorization performances. We ask whether, besides well-documented generalization abilities for visual features, honeybees can also be credited with categorization abilities or whether stimulus categorization is a prerogative of vertebrates characterized as ‘good learners’ such as pigeons, dolphins or primates.

Stimulus generalization and categorization

A fundamental function of perceptual systems is to record events related to relevant consequences and to signal their reappearance. This requires learning, memorization and evaluation of perceptual input. It also requires the capacity to cope with possible distortions of the original stimuli, due to noise, extrinsic or intrinsic environmental interferences, positional or developmental changes, etc. Two strategies which allow for flexible responding when the animal is confronted with these possible interferences are stimulus generalization and categorization (Thorndike 1913; Spence 1937; Estes 1994). These two strategies allow responding in an adaptive way to novel stimuli on the basis of similarity criteria. Although generalization and categorization are different processes, they are often deeply intermingled and a clear separation is sometimes difficult (Estes 1994; Zentall et al. 2002).

Generalization involves assessing the similarity between the present perceptual input and the previous experience. The evaluation of similarity is performed along one or several dimensions such that stimuli that lie close to each other along a perceptual scale or in a perceptual space are treated as equivalent (Spence 1937; Shephard 1958; Ghirlanda and Enquist 2003). Generalization processes imply gradual responding along a perceptual scale (Spence 1937; Shephard 1958; Ghirlanda and Enquist 2003).
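The graded responding described above can be pictured with a toy generalization gradient. The following is only a minimal sketch: the Gaussian shape, the `width` parameter and the function name `generalization_response` are hypothetical choices made for illustration and are not taken from the studies cited.

```python
import math

def generalization_response(stimulus, trained, width=10.0):
    """Toy response strength in (0, 1] that decays with perceptual distance.

    The Gaussian shape and the `width` value are illustrative assumptions.
    """
    distance = abs(stimulus - trained)
    return math.exp(-(distance ** 2) / (2 * width ** 2))

trained_value = 45.0  # hypothetical position of the trained stimulus on a perceptual scale

# Response falls off gradually and symmetrically around the trained value.
gradient = [
    (s, round(generalization_response(s, trained_value), 2))
    for s in (25.0, 35.0, 45.0, 55.0, 65.0)
]
```

In this sketch, responding is maximal at the trained value and decreases smoothly with distance, with no sharp boundary anywhere along the scale.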

Categorization, on the other hand, refers to the classification of perceptual input into defined functional groups (Harnard 1987). It can be defined as the ability to group distinguishable objects or events on the basis of a common feature or set of features, and therefore to respond similarly to them (Troje et al. 1999; Delius et al. 2000; Huber 2001; Zentall et al. 2002). Categorization deals, therefore, with the extraction of these defining features from objects of the animal's environment. Although it is currently debated whether categories have strict or fuzzy boundaries, there is general agreement that category boundaries are sharper than those corresponding to the gradual decrease of responding along a perceptual scale that underlies generalization, and that they are indicative of perceptual discontinuities (Pastore 1987).

At this point, one could argue that categorization is nothing more than generalization, because a stimulus can be assigned to a category simply depending on its similarity with a known stimulus representation. However, categorization includes a discriminative task, which demands the existence of not one but at least two category representations. Keller and Schönfeld (1950) defined categorization as generalization within and discrimination between classes. Thus, generalization underlies categorization, but the converse certainly does not hold.
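The requirement of at least two category representations can be sketched with a nearest-prototype rule. This is an illustrative assumption, not a model proposed in the literature cited here: the function `categorize`, the prototype values and the use of a linear (non-circular) distance are all hypothetical simplifications.

```python
def categorize(stimulus, prototypes):
    """Return the label of the nearest category prototype (illustrative rule)."""
    return min(prototypes, key=lambda label: abs(stimulus - prototypes[label]))

# Two hypothetical one-dimensional category representations.
prototypes = {"rewarded": 45.0, "non_rewarded": 135.0}

# Stimuli on either side of the midpoint receive different labels,
# however small their distance from each other.
labels = [categorize(s, prototypes) for s in (30.0, 60.0, 89.0, 91.0, 120.0)]
```

Stimuli at 89 and 91 are close to each other on the scale yet fall into different classes, which illustrates discrimination between classes together with generalization within them.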

A functional definition of categorization was provided by Herrnstein (1990) who recognized five basic levels, from the lowest to the highest level of abstraction: (a) straightforward discrimination, in which an animal attempts to identify a unique stimulus, but the imperfect resolving power of its sensory system results in a category that occupies a small region, rather than a point, in a perceptual space; (b) categorization by rote, in which an animal learns and categorizes according to a relatively small number of such small regions; (c) open-ended categorization, in which categorization is governed by one or more regions circumscribed by generalization gradients (i.e. regions extending beyond those created by the resolving power of the sensory system but still determined by proximity in perceptual space); (d) concepts, in which categorization is governed, not by proximities in perceptual space, but by the relationship to reinforcement that different stimuli could share; and (e) abstract relations, in which categorization is governed by relations between concepts.

We will review experiments on honeybee visual generalization, which go beyond straightforward discrimination (Herrnstein's first level; see above). We will discuss whether the performances and strategies uncovered by these works are consistent with some of the other four levels defined by Herrnstein.

The honeybee as a model for studies on visual cognition

The honeybee Apis mellifera constitutes a good model for addressing the question of visual categorization due to its remarkable learning and memory capabilities (Menzel 1999, 2001; Menzel and Giurfa 2001; Giurfa 2003). Despite its small size, the honeybee displays an extremely rich behavioral repertoire. A social lifestyle is obligatory, and a single bee cannot survive very long independent of the colony. Honeybees are central-place foragers, which means that they have to return to the hive after any foraging bout. In this context, they often have to navigate over distances of several kilometers using landmark constellations and celestial cues such as the azimuthal position of the sun and the polarized light pattern of the sky (Wehner and Rossel 1985). They visit hundreds of flowers in quick and efficient succession to gather food, and they learn and memorize local landmarks characterizing places of interest, which they communicate to other hive-mates through ritualized body movements, called ‘dances’ (von Frisch 1967). Bees see floral colors, shapes and patterns and resolve movements achromatically with a high temporal resolution (Menzel and Backhaus 1991; Srinivasan et al. 1994). Their olfactory sense allows them to distinguish a large range of odors (Guerrieri et al. 2005) and their mechanosensory perception is also extremely rich.

Bees are flower-constant (Chittka et al. 1999), as they exploit only one flower species as long as it offers a profitable reward. Flower recognition can exploit several modalities, such as olfaction: bees learn and memorize flower odors and also mark rewarding and emptied flowers with attractant and repellent scents, respectively (Giurfa and Núñez 1992; Giurfa 1993). These strategies will not be considered here, as visual (and not olfactory) recognition is the main topic of the present review.

Recognition based on a memorized image of the flower species currently exploited is therefore crucial for a bee in order to forage efficiently. Recognition operates when the bee is approaching a flower during its foraging flight. In such an approach flight, and depending on the visual range to the target, different visual cues may activate different representations and trigger different responses (Giurfa and Menzel 1997; Giurfa and Lehrer 2001). Depending on the similarity between the perceived and the stored information, the bee will recognize the flower as similar or not. Recognition has to be flexible enough to identify the appropriate flower despite its different orientation in space, or despite distortions in shape introduced by wind, approach direction, occlusion by leaves, etc. Thus, flower constancy requires the ability to discriminate between different flower species but also the capacity to generalize between slightly different flowers of the same species.

It is possible to study visual recognition in honeybees experimentally because they can be easily trained to fly towards a visual target on which a reward of sucrose solution is delivered by the experimenter (von Frisch 1915). The associations built in this context link visual stimuli and reward, but also the response of the animal (e.g. landing) and reward, i.e. bees learn that a given visual cue (e.g. a color) will be associated with a reward of sucrose solution and that they have to land on it to get the reward. Although this basic design does not correspond to a “go-no go” design typically used in vertebrate experiments of categorization, experimental variations can be conceived which address the point of stimulus exposure. For instance, training free-flying bees to discriminate visual stimuli in Y-shaped mazes is a procedure common to most of the works reviewed here. In such a design bees have to fly towards a rewarded stimulus in one of the arms of the maze and avoid entering the alternative arm with a non-rewarded stimulus. It has been shown that, under such circumstances, choosing the rewarded alternative also means learning explicitly to avoid the non-rewarded stimulus (Giurfa et al. 1999).

Using this basic design, in which procedural modifications can be introduced, several studies on honeybees trained to discriminate different patterns and shapes have shown performances which could be interpreted as visual categorization. Whether or not these results reflect categorization abilities, they have dramatically changed previous views on honeybee visual recognition.

Previous views on visual pattern recognition by bees and other insects

Early studies on visual pattern and shape recognition were performed as early as the beginning of the 20th century (e.g. von Frisch 1915; Hertz 1933) and since that time they have used relatively simple stimuli (e.g. Hertz 1933; Wehner 1972b; Gould 1985). Researchers mostly credited bees, and insects in general, with limited recognition capabilities associated with low-level cognitive abilities (von Frisch 1962). The majority of studies on bee visual perception aimed to examine the mechanisms of the bee's visual system, and only a few of them asked how such mechanisms are implemented in flexible, higher-level cognitive strategies.

The earliest ideas on how bees perceive patterns and shapes put the accent on the detection of simple features, which would be evaluated in isolation from each other, irrespective of the actual pattern. Before 1940, the work of Mathilde Hertz led to the conclusion that bees detected and discriminated patterns purely on the basis of cues like the disruption of the pattern (related to spatial frequency) and the area of black or color (Hertz 1933, 1935). Peripheral rather than central processing was assumed to underlie this performance (Hertz 1933, 1935).

Later, the dominant view until the beginning of the 1990s was that bees essentially learn and recognize patterns in terms of a retinotopic template (Wehner 1972a,b; Gould 1985). The template theory postulated that patterns are perceived as retinotopically fixed images. The terms ‘snapshot’ (Cartwright and Collett 1983) or ‘eidetic image’ (Wehner and Lindauer 1966; Wehner 1972a,b, 1981) were often used to refer to this pixel-based and detailed form of pattern representation. Recognition of a perceived shape or pattern would depend on the amount of overlap between a memorized representation and the stimulus actually perceived (Wehner 1972a, 1974, 1981; Gould 1985). The template hypothesis therefore implied a pixel-based storing and comparison of visual information. This ‘holistic’ representation is not very flexible concerning generalization and transfer of acquired information to novel, unknown stimuli (Dill et al. 1993). In fact, a small displacement of the memorized pattern in the insect visual field would preclude recognition, as the retinal position of the object would be different with respect to the memorized one (Wehner 1974). This strategy is therefore extremely rigid and does not account for flexible stimulus use in the natural world, in which bees experience an enormous variety of shapes and patterns during their foraging flights and respond to the same flower species even if flowers appear differently oriented in space or are partially occluded by vegetation.
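The rigidity of a pixel-based template can be made concrete with a small sketch. It assumes a deliberately crude overlap measure (the fraction of agreeing pixels between two binary images); the functions `overlap` and `shift_right` and the tiny grating are hypothetical illustrations, not the measure used in the studies cited.

```python
def overlap(template, image):
    """Fraction of pixels on which two equal-sized binary images agree."""
    total = sum(len(row) for row in template)
    matches = sum(
        t == p
        for t_row, p_row in zip(template, image)
        for t, p in zip(t_row, p_row)
    )
    return matches / total

def shift_right(image, k):
    """Shift a binary image k pixels to the right, padding with background (0)."""
    return [[0] * k + row[: len(row) - k] for row in image]

# A vertical grating with a period of 2 pixels (1 = black, 0 = white).
grating = [[1, 0, 1, 0, 1, 0, 1, 0] for _ in range(4)]

same_position = overlap(grating, grating)               # 1.0
displaced = overlap(grating, shift_right(grating, 1))   # 0.0
```

A displacement of a single pixel takes the overlap from complete to none, even though the pattern itself is unchanged, which is the kind of failure the template account predicts for small retinal shifts.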

Positive transfer in honeybee visual recognition

As the template theory became unsatisfactory due to its limited explanatory power in numerous cases, and with progress in this research area, new experiments in the 1990s started to show that positive transfer of learning (Robertson 2001) occurred: bees sometimes responded to novel stimuli that they had never encountered before and generated novel responses that were adaptive given the context in which they were produced. Some of these studies aimed at unraveling visual processing mechanisms and were not directly concerned with the problem of categorization (van Hateren et al. 1990; Horridge 1997a; Srinivasan et al. 1993, 1994). Nevertheless, like other works that explicitly asked about categorization abilities in bees, they showed positive transfer from trained to novel stimuli.

Generalization of edge orientation

The first study showing generalization capabilities beyond straightforward discrimination in honeybees was, in fact, not concerned with the question of high-level cognitive performances in honeybees. This study focused on the sensory physiology of bees and, more specifically, on the mechanisms of orientation discrimination in honeybees (van Hateren et al. 1990), such that the term ‘categorization’ was not mentioned in this work. Van Hateren et al. (1990) trained free-flying bees with pairs of achromatic (black and white) disks presenting stripes of varying period and width (ten different stimuli) but with a single orientation that could be varied by rotating the disks (Fig. 1a). Bees were trained to discriminate two given stripe orientations (e.g. 45° from 135°) by rewarding one of these orientations with sucrose solution and the other not. During the training, pairs of stimuli with extremely different spatial quality (see Fig. 1a) were presented in a random succession to the bees. Within each pair, one was oriented at 45° and the other at 135°. Thus, irrespective of their differences in spatial detail, gratings could be classified as displaying either a 45° or a 135° orientation. In this case, gratings oriented at 45° were rewarded with sucrose solution while those at 135° were non-rewarded. Thus, the critical procedural modification introduced by van Hateren et al. (1990) was to train each bee with a changing succession of pairs of different disks, one of which was always rewarded and the other not. Despite the difference in pattern quality, all the rewarded patterns had the same edge orientation and all the non-rewarded patterns also had a common orientation, perpendicular to the rewarded one. Through this training procedure, in which rewarding and non-rewarding patterns were randomly changed, the possible formation of a template of a rewarded pattern was prevented. Under these circumstances, the bees had to extract and learn the orientation that was common to all rewarded patterns to solve the task.
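The logic of this training procedure can be sketched as a randomized schedule of stimulus pairs. The sketch below rests on assumptions: the function `training_schedule`, the uniform sampling of two different patterns per trial and the number of trials are hypothetical, since the exact randomization used in the original study is not described here.

```python
import random

def training_schedule(n_trials, n_patterns=10, seed=0):
    """Random succession of stimulus pairs, one at 45 deg and one at 135 deg.

    Pattern indices stand for P1..P10; uniform sampling of two different
    patterns per trial is an assumption made for illustration.
    """
    rng = random.Random(seed)
    trials = []
    for _ in range(n_trials):
        rewarded, unrewarded = rng.sample(range(1, n_patterns + 1), 2)
        trials.append({
            "rewarded": (f"P{rewarded}", 45),
            "non_rewarded": (f"P{unrewarded}", 135),
        })
    return trials

schedule = training_schedule(20)

# Orientation is the only property shared by every rewarded stimulus.
rewarded_orientations = {trial["rewarded"][1] for trial in schedule}
rewarded_patterns = {trial["rewarded"][0] for trial in schedule}
```

Because the rewarded pattern changes from trial to trial while its orientation stays fixed, a single stored image of one pattern would not predict reward reliably, whereas orientation would.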

Fig. 1

Categorization of edge orientation by honeybees. a Training stimuli (P1 to P10) used in van Hateren et al.'s experiments (1990). Pairs of stimuli were presented in a random succession to the bees. Within each pair, one was oriented at 45° and the other at 135°. In this case, gratings oriented at 45° were rewarded with sucrose solution while those at 135° were non-rewarded. b Tests performed with stimulus pairs not used during the training. In each case, there was a significant preference for the pattern presenting the orientation rewarded during the training. Bars indicate the proportion of choices for each stimulus. Bees transferred their choice from the known to the novel patterns and classified them according to their orientation (from van Hateren et al. 1990)

In the tests, bees were presented with novel patterns (Fig. 1b), to which they had never been exposed before, which were all non-rewarded, but which exhibited the same stripe orientations as the rewarding and non-rewarding patterns employed during the training. In such transfer tests, bees chose the appropriate orientation despite the novelty of the structural details of the stimuli. The authors concluded that bees detect the orientation of a visual pattern per se, independently of pattern quality (van Hateren et al. 1990). This conclusion led to a model of orientation detection in the honeybee, based on the existence of three types of orientation detectors, each with a defined preferred orientation and tuning (Srinivasan et al. 1994), comparable to those available in the mammalian visual cortex (Hubel and Wiesel 1962). Such detectors were found later by means of electrophysiological recordings in the visual areas of the bee brain (Yang and Maddess 1997).
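How three broadly tuned detectors could in principle encode orientation can be sketched as follows. Everything specific in this sketch is an assumption made for illustration: the preferred orientations in `PREFERRED`, the cosine-squared tuning and the population-vector readout are hypothetical choices and are not the parameters of the model cited above.

```python
import math

PREFERRED = (0.0, 60.0, 120.0)  # assumed preferred orientations, in degrees

def detector_responses(orientation_deg):
    """Responses of three broadly tuned detectors (cos^2 tuning is an assumption)."""
    return [
        math.cos(math.radians(orientation_deg - pref)) ** 2
        for pref in PREFERRED
    ]

def decode_orientation(responses):
    """Recover orientation (0..180 deg) from three responses via a population vector.

    Angles are doubled because orientation repeats every 180 degrees.
    """
    x = sum(r * math.cos(math.radians(2 * p)) for r, p in zip(responses, PREFERRED))
    y = sum(r * math.sin(math.radians(2 * p)) for r, p in zip(responses, PREFERRED))
    return (math.degrees(math.atan2(y, x)) / 2) % 180

decoded_45 = decode_orientation(detector_responses(45.0))
decoded_135 = decode_orientation(detector_responses(135.0))
```

Under these assumptions the pattern of activity across the three detectors identifies the stimulus orientation without any reference to the spatial detail of the pattern that carries it.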

The work by van Hateren et al. (1990) shows that bees can extract pattern orientation as a feature per se, irrespective of pattern quality, and generalize their response to unknown stimuli. This performance could comply, in principle, with the definition of categorization because bees exhibited appropriate transfer from known to novel stimuli such that stimuli were classified according to their orientation. In that sense, bees would be able to categorize patterns based on their main orientation.

However, although discrimination between classes (orientations) was granted, generalization within classes was not really studied in detail. In other words, when bees transferred their choice to a novel pattern sharing the same orientation as previously rewarded ones, were they really generalizing their choice to novel, distinguishable stimuli? Or were they just choosing the novel stimuli because they could not distinguish them from the previous ones? In the latter case, speaking about categorization would be obviously senseless.

In van Hateren et al.'s (1990) experiments the answers to these questions are partial. Bees could discriminate between a random grating used during the training (i.e. patterns P1 to P10 in Fig. 1a) and a similarly oriented single bar used in some tests (i.e. test patterns in Fig. 1b, first row), but they could not discriminate between two of the ten similarly oriented training stimuli (P4 vs. P8 in Fig. 1a). No information was provided about whether or not bees could discriminate between the eight other training stimuli. Therefore, only the results of the tests involving the distinguishable stimuli could be strictly viewed as reflecting a categorization performance.

In conclusion, one could safely state that bees exhibit generalization of orientation between patterns of very distinct spatial quality, a performance that goes beyond straightforward discrimination. This performance could be viewed as visual categorization, but putting this conclusion on firm ground requires additional control experiments showing that bees treated all stimuli used as distinct, independently of generalizing their responses in certain cases and not in others. Although caution is necessary, certain patterns that were treated as equivalent by bees based on their common orientation could obviously be discriminated (e.g. P2 vs. P9, or P5 vs. P7). Thus, although not all requirements for concluding that categorization occurred were fulfilled, van Hateren et al.'s results (1990) strongly suggest that bees could indeed categorize patterns based on their main orientation.

Generalization of radial and concentric patterns

Horridge and Zhang (1995) conducted another study on pattern vision in honeybees using patterns with no predominant orientation, namely radial and concentric patterns (Fig. 2a,b). Their original motivation, again, was not to study categorization but instead to provide evidence for the existence of specific filters in the bee visual system that would be tuned to such kinds of patterns. Horridge and Zhang (1995) proposed that besides having orientation detectors, as shown by the work of van Hateren et al. (1990; see also Srinivasan et al. 1993, 1994), bees extract the global concentric or radial nature of a pattern per se, and that local orientation is neglected when these detectors operating on the whole pattern are excited.

Fig. 2

Categorization of radial and concentric patterns. a Training stimuli used in Horridge and Zhang's experiments (1995). Pairs of stimuli were presented in a random succession to the bees. Within each pair, one was sectored and presented therefore radial cues, while the other was concentric and presented therefore tangential cues. In this case, radial patterns were rewarded with sucrose solution (Tr+) while concentric patterns were non-rewarded (Tr−). b A transfer test in which a novel radial pattern was presented against a trained concentric one. Bees preferred the novel radial pattern. c A transfer test with novel stimuli made from four bars disposed in order to create radial cues (the cross) or tangential cues (the square). Bees preferred the radial cross stimulus to the square. Bars indicate the proportion of choices for each stimulus (from Horridge and Zhang 1995)

As for the experiment on orientation extraction (van Hateren et al. 1990; see above), bees were trained with a series of changing radial vs. concentric patterns (Fig. 2a) and then confronted with novel patterns, concentric vs. radial (Fig. 2b), that were not used during the training. Depending on the contingency of the stimuli, bees chose the novel radial or concentric patterns, thus showing a capacity to transfer their choice of the rewarded feature to novel stimuli sharing this feature with the trained ones.

Thus, the results of these experiments suggest that bees could categorize patterns based either on their radial symmetry or on their concentric organization. As for the case of orientation generalization, however, caution is necessary because not all requirements for a categorization experiment were fulfilled. Although the authors showed (Horridge 1997a) that stimuli within a group (radial or concentric) could indeed be distinguished from each other, thus showing that generalization within each class was not due to a lack of discrimination, additional tests should address the issue of whether test stimuli were indeed perceived as being different from training stimuli. Test stimuli in Fig. 2c were, nevertheless, in principle extremely different from those used during the training (Fig. 2a), thus suggesting that bees could indeed categorize patterns based on their radial or concentric organization.

Generalization of pattern disruption

Pattern disruption is a cue that historically received particular attention in earlier studies on insect vision (Hertz 1933) because it was originally believed that insects, and bees in particular, could only distinguish patterns and shapes on the basis of their disruption, i.e. they would not recognize shapes but just classify them as dissected or not dissected (Hertz 1933, 1935). Although these ideas have been abandoned, pattern disruption is certainly a low-level visual cue that bees exploit under certain circumstances. Horridge (1997a) showed that honeybees discriminate between patterns that differ in average disruption as a generalized cue, irrespective of pattern. To this end, he trained bees with different kinds of black and white patterns that presented either orientation cues (vertical gratings), radial cues or concentric cues. The only feature that remained constant between rewarded patterns of different quality was the period of the black and white areas. Bees were trained, for instance, on randomized-phase vertical gratings of period 6 cm rewarded vs. vertical gratings of period 4 cm non-rewarded. They learned the task and transferred their choice to radial patterns whose sectors were also spaced by a period of 6 cm. Furthermore, when confronted in the tests with unknown concentric patterns (spirals), they preferred a spiral with a period of 6 cm to a spiral with a period of 2 cm. Bees trained to prefer a larger period transferred to an even larger period when given a forced choice with a pair of patterns of differing disruption from those they were trained on. However, bees trained to prefer a smaller period could not transfer to an even smaller period and preferred the formerly negative pattern. In other words, extrapolation towards larger periods was possible but not towards smaller ones (Horridge 1997a).

Fig. 3

Categorization of bilateral symmetry. a Example of triads of training stimuli used to train an individual bee for bilateral symmetry. Each triad consisted of a (+) symmetric stimulus rewarded with sucrose solution, and two different non-rewarded (−) asymmetric stimuli presented simultaneously. In the case of bees trained for asymmetry, each triad had a rewarded (+) asymmetric stimulus and two different non-rewarded (−) symmetric stimuli. b Novel stimuli used during the multiple-choice, generalization tests. None was rewarded. c Choice frequency for the trained feature in the tests; the performance of bees trained for symmetry (white circles) and for asymmetry (black circles) is shown. From test 7 onwards, bees trained to discriminate bilaterally symmetric from non-symmetric patterns learned the task and transferred it appropriately to the novel stimuli, thus demonstrating a capacity to classify stimuli on the basis of their symmetry or asymmetry (from Giurfa et al. 1996)

Although this work shows that honeybees were able to classify stimuli according to their disruption, irrespective of pattern, and thus supports the idea that bees can categorize patterns purely on the basis of disruption, control discrimination experiments were also absent here (Horridge 1997a) because this study was not concerned with the issue of visual categorization. Control experiments, which should demonstrate to what extent patterns that were close in period were really distinguishable for the bees, are critical for explaining the unidirectional nature of transfer towards periods different from the trained ones.

Generalization of bilateral symmetry

Transfer to novel instances has been also shown in the case of bilaterally symmetric patterns that are vertically displayed (Giurfa et al. 1996). In this case, Giurfa et al. (1996) asked explicitly whether bees can perceive bilateral symmetry as an independent pattern feature. The term ‘categorization’ was introduced here to account for this kind of visual performance in honeybees (Giurfa et al. 1996). Bees were trained with triads of patterns (Fig. 3a) in which one pattern was rewarded with sucrose solution and the other two were non-rewarded. For the bees trained for symmetry, the rewarded pattern was symmetric and the non-rewarded patterns were asymmetric. For the bees trained for asymmetry, the rewarded pattern was asymmetric and the two non-rewarded patterns were symmetric. To avoid learning of a specific pattern, bees were again confronted with a succession of changing triads along training. The tests were interspersed along the training with the triads and they consisted in presenting 12 novel stimuli (Fig. 3b), 6 symmetric and 6 asymmetric, all non-rewarded.
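What distinguishes the two stimulus classes in this design can be illustrated with a simple symmetry score for binary patterns. This is a hypothetical measure introduced only for illustration: the function `bilateral_symmetry_score` and the two small example patterns are assumptions and do not reproduce the stimuli or any analysis of the original study.

```python
def bilateral_symmetry_score(pattern):
    """Fraction of pixels that match their mirror image about the vertical midline."""
    total = 0
    matches = 0
    for row in pattern:
        for pixel, mirrored in zip(row, reversed(row)):
            total += 1
            matches += pixel == mirrored
    return matches / total

# Invented 3 x 4 binary patterns (1 = black, 0 = white).
symmetric = [
    [0, 1, 1, 0],
    [1, 0, 0, 1],
    [0, 1, 1, 0],
]
asymmetric = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 1, 1, 0],
]

score_symmetric = bilateral_symmetry_score(symmetric)    # 1.0
score_asymmetric = bilateral_symmetry_score(asymmetric)  # 0.5
```

The score depends only on the mirror relationship between the two halves of a pattern, so patterns that differ in every other respect can still share the same value.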

Bees trained to discriminate bilaterally symmetric from non-symmetric patterns learned the task and transferred it appropriately to novel stimuli, thus demonstrating a capacity to detect and generalize symmetry or asymmetry appropriately (Fig. 3c). Interestingly, bees trained for symmetry chose the novel symmetric stimuli more frequently, came closer to them and hovered longer in front of them than bees trained for asymmetry did for the novel asymmetric stimuli. It was thus suggested that bees have a predisposition for learning and generalizing symmetry. Such a predisposition can either be innate and could facilitate a better and faster learning about stimuli that are biologically relevant (Rodriguez et al. 2004) or can be based on the transfer of past experience from predominantly symmetric flowers in the field. A feature-positive effect could possibly explain the better performance of bees trained for symmetry compared to those trained for asymmetry. This effect is related to the notion that better acquisition occurs when subjects are trained to respond to the presence of a given feature (here symmetry) rather than to its absence. However, asymmetry could also be a feature per se, thus questioning the validity of this latter interpretation.

In this study, acquisition curves were provided for the first time because the experimenters monitored the individual performance of each bee studied throughout the whole experiment (Fig. 3c). This factor, which was absent in all honeybee vision works cited up to now, is important as pattern recognition strategies may change with cumulative experience during training (Giurfa et al. 2003; Stach and Giurfa 2005) such that different levels of experience with the same stimuli may result in different recognition strategies. Acquisition curves showed that bees did not master the task during the first tests, probably because they were applying low-level, hierarchic cues (i.e. disruption) that were irrelevant to the problem; however, from test 7 onwards, they exhibited an abrupt increase of correct responses, which reveals that bees started to focus on the appropriate feature predicting the presence of reward.
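An acquisition curve of the kind mentioned above is simply the proportion of correct choices computed over successive portions of an individual's record. The sketch below shows one way to compute it; the function `acquisition_curve`, the block size of six choices and the data are hypothetical and invented for illustration only.

```python
def acquisition_curve(choices, block_size=6):
    """Proportion of correct choices per block of consecutive choices.

    `choices` is a sequence of booleans (True = correct choice); a trailing
    incomplete block is ignored.
    """
    n_blocks = len(choices) // block_size
    return [
        sum(choices[i * block_size:(i + 1) * block_size]) / block_size
        for i in range(n_blocks)
    ]

# Invented record of one individual, not results from any experiment.
choices = [True, False, False, True, False, True,
           True, True, False, True, True, True]

curve = acquisition_curve(choices)  # first block 3/6, second block 5/6
```

Keeping the record per individual, rather than pooling choices across animals, is what allows a change of strategy in the course of training to show up as a change in the curve.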

Although the training stimuli in these experiments did not resemble each other, at least to the human eye, the control experiments showing that all symmetrical and asymmetrical patterns were distinguishable from each other are also missing here. Specific analyses performed in this work showed that stimuli varied along several low-level cues that bees usually use while distinguishing patterns (disruption, orientation, subtended angle, area, etc.) but that bees were not responding to these cues but to symmetry or asymmetry.

In this case, a specific ecological advantage would arise from flower categorization in terms of symmetrical vs. asymmetrical. The perception of symmetry would be important for pollinators because the symmetry of a flower may signal its quality and thus influence the mating and reproductive success of plants by affecting the behavior of pollinators (Møller and Eriksson 1994, 1995). As bees discriminate between symmetry and asymmetry, they should also be capable of performing selective pollination with respect to floral symmetry even within a patch of flowers. This suggests that plants may have exploited such cognitive capabilities of the pollinators during the evolution of flowers.

Thus, although the performance of bees in Giurfa et al.'s experiments (1996) is indeed consistent with categorization of figures based on their symmetry, additional controls would be necessary to show that all patterns used in this work were indeed distinct for the bees despite the fact that some were considered as equivalent based on their symmetry.

Generalization based on topological invariants

Mathematicians classify and categorize shapes by identifying their special properties, called topological invariants. As indicated by the term, topological invariants remain constant even when an object's appearance changes due to orientation, change of position, noise, and other distortions (Chen 1982). Recently, Chen et al. (2003) have proposed that topological invariants exist in honeybee vision. They suggested that global topological features are primitives in bee vision (i.e. features that are processed first), thus underlining the role of global rather than local analysis in shape perception by honeybees.

Fig. 4

Categorization based on sets of multiple features. a Training stimuli used in Stach et al.'s experiments (2004). A patterns (A1 to A6) differed from each other but shared a common layout defined by the spatial arrangement of orientations in the four quadrants. B patterns (B1 to B6) shared a common layout perpendicular to that of A patterns. b Test stimuli. Bees appropriately transferred their choice to these novel, non-rewarded patterns preserving the basic layout of the trained ones. c Test stimuli used to determine whether or not bees extract the simplified layout of four bars from the rewarded A patterns. The four test pairs shown correspond to the honeybees trained with A patterns. Equivalent tests were performed with the honeybees trained with B patterns (not shown). S+, simplified layout of the rewarded training patterns; UL, upper-left bar rotated; UR, upper-right bar rotated; LL, lower-left bar rotated; LR, lower-right bar rotated. d (Left panel) Acquisition curve showing the pooled performance of bees rewarded on A and B patterns. The proportion of correct choices along seven blocks of six consecutive visits is shown. Bees learned to discriminate the rewarding patterns (A or B) used for the training (a) and significantly improved their correct choices along training. (Right panel) Proportion of correct choices in the tests with the novel patterns. Bees always preferred the simplified layout of the training patterns previously rewarded (S+) to any variant in which one bar was rotated, thus showing that they were using the four bars in their appropriate spatial locations and orientations

In their study, Chen et al. (2003) trained bees with only one pair of black and white patterns, one rewarded ring vs. one non-rewarded “S” shaped pattern. In subsequent tests, bees were presented with novel patterns differing in shape with respect to the trained ones. In all cases, bees chose the pattern that was topologically equivalent to the rewarded ring. For instance, Chen et al.'s data (2003) suggest that a hollow diamond and the ring – two shapes that are topologically equivalent – might be indistinguishable for honeybees.

The design of these experiments refers to a generalization rather than to a categorization problem, as the question raised is to what extent experience with a single pair of patterns can be generalized to novel patterns preserving the rewarded topology. However, concluding that topological invariants such as the number of holes, inside vs. outside, and connectivity exist in honeybee vision implies that bees could classify patterns according to these basic topologies, and thus perform topology-based categorization. In this sense, training bees with a changing succession of different patterns in which a certain topology is preserved would not change the basic result found by Chen et al. (2003): when confronted with novel patterns, bees would respond to them based on the presence or absence of the rewarded topology. Again, even with multiple-stimulus training, stating that bees categorize patterns based on topological invariants would be incautious in the absence of appropriate control experiments. However, Chen et al.'s results (2003) allow addressing the question of whether or not such categorization is possible in honeybees.
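
To make the notion of a topological invariant concrete, the following minimal sketch counts the number of holes in small binary figures. It is only an illustration of the mathematical property under discussion; it is not the analysis used by Chen et al. (2003) and makes no claim about how the bee visual system would compute such a property. The three figures are hypothetical, coarse drawings of a ring, a hollow diamond and an S-shaped pattern, and the function name is our own.

```python
from typing import List


def count_holes(pattern: List[str]) -> int:
    """Count background regions ('.') fully enclosed by the figure ('#')."""
    rows, cols = len(pattern), len(pattern[0])
    seen = [[False] * cols for _ in range(rows)]

    def flood(r0: int, c0: int) -> None:
        # Mark every background cell 4-connected to (r0, c0).
        stack = [(r0, c0)]
        seen[r0][c0] = True
        while stack:
            r, c = stack.pop()
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if (0 <= nr < rows and 0 <= nc < cols
                        and not seen[nr][nc] and pattern[nr][nc] == "."):
                    seen[nr][nc] = True
                    stack.append((nr, nc))

    # Background connected to the image border is "outside", not a hole.
    for r in range(rows):
        for c in range(cols):
            on_border = r in (0, rows - 1) or c in (0, cols - 1)
            if on_border and pattern[r][c] == "." and not seen[r][c]:
                flood(r, c)

    # Each remaining unvisited background region is one hole.
    holes = 0
    for r in range(rows):
        for c in range(cols):
            if pattern[r][c] == "." and not seen[r][c]:
                holes += 1
                flood(r, c)
    return holes


RING = [".....",
        ".###.",
        ".#.#.",
        ".###.",
        "....."]

HOLLOW_DIAMOND = ["..#..",
                  ".#.#.",
                  "#...#",
                  ".#.#.",
                  "..#.."]

S_SHAPE = [".###.",
           ".#...",
           ".###.",
           "...#.",
           ".###."]
```

In this sketch the ring and the hollow diamond each enclose one hole whereas the S-shaped figure encloses none, which is the sense in which the first two are topologically equivalent despite their different outlines.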

Generalization based on sets of multiple features

The previous works have in common that they assumed that bees focused their attention on a single feature at a time (orientation, radial symmetry, bilateral symmetry or disruption) to solve the problem. In other words, they demonstrate that bees can generalize visual stimuli on the basis of a single feature, beyond straightforward discrimination. In fact, it has been repeatedly argued that due to limited cognitive capabilities, bees could not do anything but focus on a single isolated feature at a time (Horridge 1996, 1997b) and could not therefore attain levels of stimulus classification such as configural categorization as exhibited by humans (Maurer et al. 2002).

Recently, Stach et al. (2004) showed that a further level may exist in honeybee visual generalization. Besides focusing on a single feature, honeybees were shown to assemble different features to build a generic pattern representation, which could be used to respond appropriately to novel stimuli sharing such a basic layout (Stach et al. 2004). Honeybees trained with a series of complex patterns sharing a common layout comprising four edge orientations (Fig. 4a) remembered these orientations simultaneously in their appropriate positions, and transferred their response to novel stimuli that preserved the trained layout (Fig. 4b). Honeybees also transferred their response to patterns with fewer correct orientations (Fig. 4c), depending on their match with the trained layout. This generic pattern configuration was inculcated by a training in which a randomized succession of changing patterns sharing a common configuration was used (Stach et al. 2004). Thus, the question of whether bees can extract a configuration common to a group of rewarded patterns, made from four different edge orientations arranged in a specific spatial relationship to each other, was answered positively. Bees can extract such configuration and respond to novel patterns that also present this configuration.
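
The idea of a layout defined by four orientations in fixed positions, and of a graded match between a test pattern and that layout, can be sketched as follows. This is a hypothetical illustration only: the orientation values, the quadrant labels used as keys and the matching score are our assumptions and are not taken from Stach et al. (2004).

```python
from typing import Dict

# A layout maps each quadrant to an edge orientation in degrees (0-179).
Layout = Dict[str, int]

# Hypothetical orientations; the actual values used by Stach et al. (2004)
# are not reproduced here.
TRAINED_LAYOUT: Layout = {"UL": 45, "UR": 135, "LL": 135, "LR": 45}


def layout_match(candidate: Layout, trained: Layout) -> int:
    """Number of quadrants whose orientation equals the trained one."""
    return sum(1 for quadrant, angle in trained.items()
               if candidate.get(quadrant) == angle)


def rotate_bar(layout: Layout, quadrant: str, by: int = 90) -> Layout:
    """Return a copy of the layout with one quadrant's bar rotated."""
    changed = dict(layout)
    changed[quadrant] = (changed[quadrant] + by) % 180
    return changed


# A test variant in which only the upper-left bar is rotated.
ul_rotated = rotate_bar(TRAINED_LAYOUT, "UL")
# A layout perpendicular to the trained one in every quadrant.
perpendicular = {q: (a + 90) % 180 for q, a in TRAINED_LAYOUT.items()}
```

Under these assumptions, the simplified trained layout matches in four quadrants, a variant with one rotated bar in three, and the perpendicular layout in none, which gives a simple way of ordering test patterns by their correspondence with the trained configuration.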

This capacity could also be inculcated after prolonged training with a single pair of constant patterns instead of using a randomized succession of changing patterns sharing a common configuration (Stach and Giurfa 2005). In this study, bees were trained with a single pair of patterns (Fig. 5a) following a short (21 trials) or a long (42 trials) training. They were subsequently tested with their simplified layout (Fig. 5b). Bees which received the short training failed to discriminate between the two simplified layouts, while bees which received the long training discriminated and preferred the simplified layout corresponding to the rewarded pattern (Fig. 5b). Thus, bees could generalize or not from the trained to simplified patterns sharing the same basic layout, depending on training length. Furthermore, bees which received the short training discriminated and preferred the training pattern to its simplified layout while they chose randomly between both patterns after 42 learning trials (Fig. 5d). Enhanced experience promotes, therefore, a higher level of generalization, thus showing that recognition strategies vary dynamically along training.

Fig. 5

The effect of cumulative experience on pattern generalization (from Stach and Giurfa 2005). a Learning curves of the two groups of honeybees trained along either 21 or 42 trials. Bees were trained with a single pair of stimuli A and B, shown under the learning curves. For each training length, a group of bees was trained with A rewarded and B non-rewarded while another group had the reverse contingency. Curves show pooled performances (correct choices of S+ vs. S−) along six or three blocks of seven learning trials each. b In a test, bees were presented with the simplified layouts of the trained stimuli, shown under the test bars. Depending on training length, bees could generalize or not from the trained patterns to the simplified patterns. More learning trials (here 42) were required to transfer the choice of the known patterns to their simplified layouts. c Learning curves of the two groups of honeybees trained as explained in a. d In a test, bees were presented with the stimulus previously rewarded vs. its simplified layout. Bees trained along 21 trials significantly preferred the rewarded pattern to its simplified layout. Bees trained along 42 trials chose randomly between both patterns, thus showing that enhanced training results in higher generalization

The results of Stach et al. (2004) show that honeybees extract regularities in their visual environment and establish correspondences among correlated features such that they generate a large set of object descriptions from a finite set of elements. This performance could be the basis for configural categorization, although further control experiments would be necessary in this case.

A related conclusion was reported by Zhang et al. (2004), who showed that honeybees have the ability to group similar, natural images together. They showed positive transfer to novel stimuli within four groups of stimuli: (1) star-shaped flowers, (2) circular-shaped flowers, (3) plant stems and (4) landscapes. Although these experiments did not reveal the specific cues used by the bees to assign a stimulus to one of these groups, Zhang et al. (2004) excluded the use of color and mean luminance as defining single low-level features. They suggested instead that configurational properties of the figures, integrating specific features such as circular symmetry, angular periodicity, bilateral symmetry and the presence of a horizontal, high-contrast edge (the horizon), could help the bees to classify the different stimuli. Therefore, they suggested that this categorization is based on a combination of low-level features, a suggestion that coincides with that of Stach et al. (2004).

Conclusions

We have seen that honeybees show positive transfer of appropriate responding from a trained to a novel set of stimuli, and that their performances are consistent with the notion of categorization. Such a transfer was demonstrated for specific isolated features such as symmetry or orientation, but also for configurations (layouts) of features. The transfers discussed in this review can certainly be viewed as reflecting powerful generalization abilities in the visual domain. The question is whether, besides generalization, they also reflect a certain level of categorization. A purist point of view would deny this latter capacity to honeybees based on the evidence reviewed. Such a view would claim that control experiments are missing in all the works presented, including stimulus balance and the demonstration that transfer to novel stimuli was not due to a lack of discrimination. Although we agree with the necessity of such a strict framework, it would also be careless to ignore that all the evidence presented strongly suggests that bees can indeed categorize visual stimuli based on defined properties. Moreover, given the nature of the stimuli used in the works presented, it is clear that many if not most of the transfer tests of these experiments involved clearly distinguishable stimuli, such that the concerns about the lack of discriminability are not necessarily justified.

Since species-specific differences make it impossible to transfer directly to bees the experimental procedures successfully employed with vertebrates, the challenge is to conceive experiments addressing the issue of visual categorization in honeybees that preserve the biological context of this insect. In doing this, the issue of stimulus balance and discriminability should be explicitly addressed if the term categorization is to be used. Setting strict requirements for the use of this term should be a general concern, independently of the species used. Sharp and strict definitions should be inspired by scientific rigor and not by a biased skepticism about what one species, and not another, can achieve in terms of cognitive abilities.

Old views on insect vision recognition that assumed that insects were limited to building rigid templates of the patterns viewed appear now inadequate when considering the results obtained in the last decade. Without being concerned by levels of cognitive processing, several works studying mechanisms of pattern vision in insects have refuted the idea of a template-based recognition (Ernst and Heisenberg 1999; Efler and Ronacher 2000; Campan and Lehrer 2002; Hempel de Ibarra and Giurfa 2003). Other works have put the accent on the inherent plasticity of honeybee visual learning but studied mainly elemental problem solving from the perspective of experimental psychology (Grossman 1970; for review see Bitterman 1996), thus avoiding cognitive interpretations. In the present review, we have focused on cognitive visual performances in honeybees. The works we have presented and discussed show that bees are able to abstract certain general properties of patterns without memorizing them entirely. This capacity is more understandable in the case of a miniature brain with storing capacities that are obviously limited, and which has to recognize different objects and landmarks in a variable environment and from different viewpoints. In these circumstances, storing templates of any object or landmark in all their possible variations would not necessarily contribute to cognitive economy.

An important aspect to consider is the role of the training procedures in the categorization performances of bees. In the different examples reviewed in this work, a common training procedure was used, namely to randomize patterns with respect to all possible parameters but one, which was consistently associated with reward or the absence of it. This procedure, which is the very basis of learning sets (Harlow 1949), prevents the forming of an association between any of the variable features and the reward, leaving the bee with only the common feature (or set of features) as the useful information predicting reward. Therefore, the basic principle of a learning set as used in all experiments performed on pattern recognition in honeybees is that animals are explicitly trained to perform in a given way defined a priori by the experimenter. This procedural aspect raises a question about the nature of the task emerging from the experiments on visual generalization by bees: does the extraction of a common feature or of a set of features from a series of different stimuli occur under training conditions different from those of a learning set? In contrast to training with a changing sequence of patterns sharing a common feature, training with the same constant pair of patterns does not explicitly impose the necessity of extracting a specific feature. Under such training, any available cue could be used to discriminate between patterns. Do bees under these circumstances also extract particular features of patterns, thus being able to show categorization-like performances even after such a simplified training? As discussed above (see ‘Generalization based on sets of multiple features’), the amount of experience with the training patterns is critical to the amount of generalization exhibited in the tests (Stach and Giurfa 2005). Higher levels of experience result in higher levels of generalization, reflected in significant responding to novel stimuli. With ongoing training, redundant information seems to be eliminated and reduced to the minimum that is necessary and sufficient to solve a discrimination task. Thus, controlling precisely the level of experience of individuals is crucial in experiments on visual recognition.
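
The logic of a learning-set procedure can be sketched as follows: every combination of the variable parameters is presented both with and without reward, so that only the common feature predicts the outcome. The feature names and values below are hypothetical and chosen for illustration; they do not describe the stimulus sets of any particular study.

```python
import random
from itertools import product
from typing import Dict, List, Set, Tuple

Stimulus = Dict[str, str]
Trial = Tuple[Stimulus, bool]  # (stimulus, rewarded?)


def build_learning_set(seed: int = 0) -> List[Trial]:
    """Balanced set of trials in which only 'symmetry' is tied to reward."""
    trials: List[Trial] = []
    # Hypothetical variable parameters: each combination appears once
    # rewarded and once non-rewarded.
    for area, orientation in product(["small", "large"],
                                     ["vertical", "oblique"]):
        for symmetry, rewarded in (("symmetric", True), ("asymmetric", False)):
            stimulus = {"area": area, "orientation": orientation,
                        "symmetry": symmetry}
            trials.append((stimulus, rewarded))
    # Present the patterns in a randomized succession.
    random.Random(seed).shuffle(trials)
    return trials


def predicts_reward(trials: List[Trial], feature: str) -> bool:
    """True if each value of `feature` is always followed by the same outcome."""
    outcomes: Dict[str, Set[bool]] = {}
    for stimulus, rewarded in trials:
        outcomes.setdefault(stimulus[feature], set()).add(rewarded)
    return all(len(seen) == 1 for seen in outcomes.values())


learning_set = build_learning_set()
```

In such a balanced set, `predicts_reward` holds for the common feature only. With a single constant pair of patterns, by contrast, every feature that differs between the two patterns would satisfy it, which is why any available cue could in principle support the discrimination.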

Could a neural basis for visual stimulus categorization exist in the honeybee brain? If we admit that visual stimuli are categorized on the basis of specific features, the neural implementation of category recognition could be relatively simple. The feature(s) allowing stimulus classification would activate specific neuronal detectors in the visual neuropiles (the optic lobes) of the bee brain. Examples of such feature detectors are the orientation detectors whose tuning and orientation have already been characterized by means of electrophysiological recordings in the honeybee optic lobes (Yang and Maddess 1997; see above). In the case of multiple-feature categorization, synchronous activation of the corresponding feature detectors could provide the neural representation of the category. However, in the case of category learning, the activation of an additional neural element is needed. Such an element would be necessary and sufficient to represent the reward (sucrose solution) and should contact and modulate the activity of the visual feature detectors in order to assign value to appropriate firing. This kind of neuron has been found in the honeybee brain in relation to the olfactory circuit. VUMmx1 is a neuron present in the honeybee brain that receives its name from its localization (the name is the abbreviation of “ventral unpaired median neuron of the maxillary neuromere 1”). The dendrites of VUMmx1 arborize symmetrically in the brain and converge with the olfactory pathway at different sites (Hammer 1993). The essential property of VUMmx1 is that it responds to sucrose solution delivered both at the antennae and the proboscis with long-lasting spike activity (Hammer 1993). Furthermore, the activity of this neuron constitutes the neuronal representation of reward in olfactory learning, as shown by the fact that bees can learn an olfactory stimulus that was paired with an artificial depolarization of VUMmx1 instead of sucrose reward (Hammer 1993). Other VUM neurons whose function is still unknown are present in the bee brain. It could be conceived that one of them (or more than one) contacts the visual circuit to function as reinforcement in associative visual learning. Category learning, if any, could thus be reduced in the honeybee brain to the progressive establishment (through Hebbian rules, for instance) of an associative neural circuit relating visual-coding and reinforcement-coding neurons, similar to that underlying simple associative (e.g. Pavlovian) conditioning.
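
The kind of associative circuit envisaged here can be caricatured in a few lines. The sketch below is a deliberately simplified toy model and not a description of the bee brain: it assumes four hypothetical feature detectors, a single reinforcement signal standing in for a VUM-like neuron, and a plain Hebbian rule in which a connection is strengthened whenever detector activity coincides with the reinforcement signal. All names and values are our own.

```python
from typing import List, Sequence, Tuple

# One trial: (activity of each feature detector, reinforcement signal).
Trial = Tuple[Sequence[float], float]


def train_hebbian(trials: Sequence[Trial], n_detectors: int,
                  rate: float = 0.1) -> List[float]:
    """Strengthen each connection in proportion to the coincidence of
    detector activity and reinforcement (no decay, no upper bound)."""
    weights = [0.0] * n_detectors
    for activity, reward in trials:
        for i in range(n_detectors):
            weights[i] += rate * activity[i] * reward
    return weights


def response(weights: Sequence[float], activity: Sequence[float]) -> float:
    """Summed, weighted detector activity evoked by a stimulus."""
    return sum(w * a for w, a in zip(weights, activity))


# Hypothetical detectors 0 and 1 code the category-defining features;
# detectors 2 and 3 code features that vary from pattern to pattern.
training = [
    ([1, 1, 1, 0], 1.0),  # rewarded pattern
    ([1, 1, 0, 1], 1.0),  # rewarded pattern
    ([0, 0, 1, 1], 0.0),  # non-rewarded pattern
] * 10

weights = train_hebbian(training, n_detectors=4)
novel_in_category = response(weights, [1, 1, 0, 0])
novel_outside = response(weights, [0, 0, 1, 1])
```

In this toy model, the detectors active on every rewarded trial acquire the strongest connections, so a novel stimulus containing only the common features evokes a larger response than one containing only the variable features. A more realistic rule would include weight decay or saturation, which are omitted here for brevity.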

Our review is obviously based on available published data. Because the works presented here analyzed categorization-like performances based on specific pattern or shape properties (symmetry, orientation, etc.), they were done using black and white stimuli in order to study the effect of a single variable at a time and determine in this way its effect on the task considered. It would be extremely interesting to study whether, besides generalizing flower-like stimuli on the basis of geometrical features, bees can also classify such stimuli based on further features such as hue or achromatic contrast to the background.

Suggesting that insects can categorize visual stimuli raises further questions about further levels of cognitive processing. In Herrnstein's classification (1990) the levels following ‘open-ended categorization’ were those of ‘concepts’ and ‘abstract relations’. It seems thus obvious to ask whether bees can also exhibit performances that would be considered as concept- or abstract relation formation. Giurfa et al. (2001) have shown that bees can learn to solve both a delayed matching-to-sample task and a delayed non-matching-to-sample task thus meaning that they can master the relations of sameness and difference, respectively. However, they cannot solve transitive inference problems due to memory constraints and because, due to their natural organization of foraging activities, they assign a higher value to the last acquired memory instead of handling several memories simultaneously and assigning them equivalent weights (Benard and Giurfa 2004). Instead of simply asking whether or not animals can do the same as humans do, further research should explore the mechanisms of problem solving and how ecological and evolutionary constraints influence cognitive processing.