Introduction

In the past, several hypotheses about the evolution of art, including iconic representations (i.e., figurative imagery, realistic art), have been proposed. These hypotheses differ as to whether art is an adaptation or not (e.g., Pinker 1997, 2002), on which level it is selected (the cultural level, as in Boyd and Richerson 1985, chap 8; 2005, or the genetic level), and which mechanism is responsible for its evolution: mating display (Miller 1998, 1999, 2000, 2001), group bonding (Coe 2003; Dissanayake 1992, 2001), and so on. These different suggestions are all possible solutions to the same problem: the high costs of art production, which is known to consume resources, time, and energy. How could such a costly behavior have emerged? Are the costs compensated by benefits (art as an adaptation)? Or are they merely borne by a system that can support a certain amount of suboptimal variants (art as a consequence of non-adaptive evolution)? In order to answer these questions, we need a framework in which all hypotheses about art can be articulated and evaluated. Previously, we have proposed the concept of sensory exploitation (SE) to this end (Verpooten and Nelissen 2010). In this article, we will first discuss why SE should be considered when modeling the evolution of iconic representations. Then we will apply the concept specifically to shed light on the late emergence of iconic representations in human evolution.

The concept of sensory exploitation

The concept of SE is based upon a model from sexual selection theory of the same name. In sexual selection, SE is a fairly recent model that specifically focuses on female preferences in mate choice. These female preferences result from pre-existing biases of the female psychosensory system (Footnote 1) that function in other contexts such as finding food or avoiding becoming food. Male display traits evolve to exploit these pre-existing female biases to achieve matings. Some scholars have defended SE as an alternative to indirect benefit models such as Good Genes Selection and Fisher’s Runaway Process (e.g., Ryan 1998). Almost all biologists agree today that SE may provide the initial nudge for the evolution of male displays, although they are still debating the relative roles of SE and indirect benefit models in the subsequent evolution and maintenance of female mating preferences and male display traits (Fuller et al. 2005). Some empirical data do seem to indicate that SE is also important in maintaining traits, for instance when male display traits are obviously mimicking signals, as is the case with the egg spots of cichlids. In that case, a runaway process would compromise the success of the mimic. Therefore, SE is a primary force in the evolution of male display traits, and selection through indirect benefits is merely secondary (Kokko et al. 2003).

Arnqvist (2006) usefully distinguishes two main types of sensory biases. First, females are adapted to respond in particular ways to a range of stimuli in order, for example, to successfully find food, avoid becoming food for predators and breed at optimal rates, times, and places. Such multi-dimensional response repertoires form a virtually infinite number of pre-existing sensory biases that are potential targets for novel male traits. Arnqvist (2006) refers to these biases as “adaptive sensory biases.” Notice that male traits that result from exploiting these adaptive sensory biases are often “mimics” (Footnote 2).

Secondly, pre-existing sensory biases need not be the direct result of selection. In theory, they can simply be incidental and selectively neutral consequences of how organisms are built (e.g., Endler and Basolo 1998). For example, artificial neural network models have shown that networks trained to recognize certain stimuli seem to generally produce various sensory biases for novel stimuli as a by-product (e.g., Arak and Enquist 1993). Similarly, research in “receiver psychology” (e.g., Guilford and Dawkins 1991) has also suggested that higher brain processes may incidentally produce pre-existing sensory biases for particular male traits. Following Arak and Enquist (1993), Arnqvist (2006) refers to such sensory biases as “hidden preferences.” These, then, can be seen as side-effects or contingencies of how the sensory system, defined in its widest sense, of the receiver is constructed. Usually, it results in abstract biases, for symmetrical or exaggerated traits, for instance (Ryan 1998).

Sexual selection models prove to apply well to the evolution of human artistic and esthetic behavior because of the crucial role perception plays in sexual selection as in art and because both function as intraspecies signaling systems. Moreover, there are conspicuous similarities between human artistic behavior and sexually evolved display behaviors in other animals (Darwin 1871), e.g., bower decoration by male bowerbirds. Miller (1998, 1999, 2000, 2001) applied the indirect benefit model to explain the evolution of art, explicitly excluding a possible role for pre-existing psychosensory biases. We have argued that, based on current findings in sexual selection, he thereby underestimates the explanatory power of SE regarding the evolution of art (Verpooten and Nelissen 2010). The above-mentioned facts, i.e., that SE provides the initial nudge and a primary force in the evolution of male display traits, equally apply to the evolution of artistic behavior (i.e., producing and experiencing art). It is important to note that, although the concept we use is based upon a model from sexual selection, we do not intend to hypothesize here that art production evolved as a sexually selected trait (nor do we exclude it as a possibility). We only use SE for its mechanism: the interaction between psychosensory biases and traits that evolved by exploiting these biases. In our view, this mechanism can also work in non-sexual contexts; we are only looking at sexual selection as a signaling system analogous to artistic behavior. It is clear from the evidence in sexual selection that the primary force of SE will always be present. The same applies to art. Secondary forces, such as indirect benefits (e.g., as a mating display; see Miller 1998, 1999, 2000, 2001), may be operating but are in principle not required for art to evolve. Therefore, here we will explore how far we can get without a priori invoking these secondary processes.

Sensory biases and art

Van Damme (2008, p. 30) describes art as follows: “Numerous contemporary definitions of the term “art” mention in one way or another both “esthetics” (denoting say, high quality or captivating visual appearance) and “meaning” (referring to some high quality or captivating referential content) as diagnostic features, although any clear-cut distinction between the two appears unwarranted, if only since there is no signified without a signifier.” This description is very well suited for our evolutionary approach from the SE perspective. The distinction Van Damme makes between esthetics and meaning roughly corresponds to the distinction made by Arnqvist (2006) mentioned above, between hidden preferences influencing the design of signals and adaptive sensory biases influencing the content of signals, resulting in mimicking signals, respectively. Thus, from a broad signal evolution perspective, we can state that what Van Damme has called esthetics, corresponds to design, and results from the exploitation of hidden preferences, and what he has called meaning corresponds to content and results from exploitation of adaptive sensory biases, by mimicking signals or traits.

The role of sensory or perceptual biases in the evolution of art has already been investigated extensively by several researchers (e.g., Hodgson 2006; Kohn and Mithen 1999; Ramachandran and Hirstein 1999). Essentially, they all have focused on the abstract, geometric aspect of visual art. They state that art emerged because its geometric patterns are supernormal stimuli to the neural areas of the early visual cortex. As such, (exaggerated) symmetry, contrast, repetition, and so on, in visual art hyperstimulate these early neural areas. Thus, they have focused on what we have called hidden preferences. We agree with these authors that hidden preferences probably play an important role in the design aspects of human visual representations as they do in the design of male display traits. Hodgson (2006) is particularly relevant to our discussion as his focus is also on the emergence of prehistoric art. He has made some observations that are very significant to our proposal (see further).

However, as indicated by Van Damme’s definition, design is only one aspect of human visual art—content, or meaning (mimics/iconic representations as the result of adaptive sensory biases) is at least as important in most cases. We will make this clear by way of example in the next section: a comparison between egg spots in cichlids and visual art in humans from a semiotic viewpoint. This is followed by an introduction to some of the human adaptive sensory biases exploitable by iconic representations.

Iconic representations as a result of adaptive sensory biases

Semioticists generally agree that biological mimicry is a semiotic phenomenon (Maran 2007). In his essay “Iconicity,” Sebeok (1989) demonstrates that mimicry is a case of iconicity in nature. “A sign is said to be iconic when the modeling process employed in its creation involves some form of simulation” (Sebeok and Danesi 2000), and this is exactly what happens when adaptive sensory biases are exploited. We suggest that this also works the other way around: not only are mimics icons, but visual art, or more specifically iconic representations (i.e., realistic art, figurative imagery), can be usefully perceived as mimics resulting from exploitation of human adaptive sensory biases.

Van Damme (2008, p. 38) defines iconic representations as: “The two- or three-dimensional rendering of humans and other animals, or to be more precise, the representation of things resembling those in the external world, or indeed imaginary worlds, fauna and flora especially, but also topographical features, built environments, and other human-made objects.” This definition is equally applicable to mimics. Many cichlid species independently evolved mouth breeding as a highly specialized brood care behavior. In different lineages of mouth breeding cichlids, we can find egg dummies, formed of various parts of the body, which resemble the ova of the corresponding species. The most abundant form of these is egg spots, which are conspicuously yellow spots on the anal fin of males. Females of mouth breeding cichlids undoubtedly evolved sensory capabilities to detect eggs and are supposed to have a strong affinity for them, as they pick them up immediately after spawning. In fact, the ability to detect the eggs directly affects the female’s fertility: every missed egg results in a reduction in fitness. Consequently, a pre-existing sensory bias may have been present in early mouth breeders and may still be present in mouth breeding species which lack egg dummies. As a consequence, males would have evolved egg spots in response to this female adaptive sensory bias. After the female (receiver) has picked up her eggs (model), the male displays in front of her showing the egg spots on his anal fin (mimic). The female responds to the life-like egg illusion with a sucking reaction, and obtains a mouthful of sperm from the canny male in the process. It may be that the female’s mating preference for a male with well-elaborated egg spots does not yield any direct benefits for the female nor any good genes for the viability of the female’s offspring. Runaway selection is also limited by the mimicking function of the egg spots: they may need to remain life-like to mislead the female. Thus, this may well be an example of the strong version of SE. The female’s preference could be solely maintained by the benefit of the detection of eggs after spawning (Tobler 2006) (Fig. 1).

Fig. 1. The mating system of mouth breeding cichlids. (A) After laying her eggs the female (right) sucks them up in her mouth. Her ability to detect the eggs is strongly selected for, since every missed egg results in a reduction of fitness. (B) This ability depends on a hair-trigger response to “egg signals.” Subsequently, males (left) evolved egg spots, accurate two-dimensional mimics of the eggs, to exploit this female response. Choice-display coevolution is inhibited by the fact that the female’s bias for eggs is vital for detecting the eggs, and there is no reason to a priori state that the effectiveness of the male egg spots is linked to genetic quality. So, this may well be an example of the strong version of sensory exploitation (artwork: Alexandra Crouwers and Jan Verpooten).

What is interesting for the problem of the evolution of human representational art, is that cases of mimicry like this one show how ordinary selection via SE can produce two-dimensional representations (the egg spots) on a surface (the anal fin of the male) of three-dimensional objects (the eggs). To a female cichlid both the signal from the egg and the signal from the egg spot mean “egg,” in the sense that she responds indiscriminately toward both those signals with a sucking reaction. In the same way, humans react toward iconic representations—even though we might “know” it is an illusion—as we react to the real thing. However, there is a difference between humans looking at art and the female cichlid looking at the egg spots: she really is deceived, whereas we know we are looking at a painting of a landscape and not at the real thing. However, does this distinction really matter? Not materially. For even though we know the movie or the novel, for instance, is not real, we still become deeply emotionally involved. Even though we might know it is fiction, we react as if it is not. Art exploits our visual system in the case of iconic representations and our emotions, regardless of our consciousness of the distinction between fiction and reality. Human iconic representations are mimics and as such also result from SE.

One of us (Mark Nelissen) has performed considerable research on cichlids and has described the system of the egg spots (in Tropheus and Simochromis). During courtship, males vibrate their body while showing the egg spots to the female. It could well be that by doing this they enhance the egg illusion, giving it a more three-dimensional effect in combination with the light–dark grading in color and the colorless outer ring the egg spots exhibit. Of course the female reacts toward formal features, design in other words, but this design is not “just design” but design designated to evoke meaning.

Rock art researchers throughout the world have explicitly or implicitly invoked ritual as an activity associated with rock art (Ross and Davidson 2006). Just as in the cichlid ritual, here too, the ritual might form an essential part of experiencing the iconic representation, providing an ideal context for deception of the senses. For instance, in the case of cave art, the illusion might have been enhanced by the use of lamps. Cave art must have required artificial illumination both to create it and to view it. The dim, flickering light provided by fat-burning lamps may have been integral to the intended appearance of these subterranean paintings (Debeaune and White 1993). Indeed, it has been suggested by several authors (e.g., Wachtel 1993) that the flickering artificial light created a cinematic effect, in combination with the use of the natural bumpiness of the cave’s walls, enhancing the illusion by bringing motion and depth into the depicted animals. In Lascaux, for instance, numerous lamps of this kind have been found (Delporte 1977).

Therefore, instead of focusing on geometrical patterns resulting from exploiting activation of early visual areas of the cortex, we focus on the exploitation of psychosensory or mental biases for iconic images, thus on a higher level of visual processing; for instance, face recognition. Humans have a hair-trigger response to faces. Everywhere we look, we see faces; in cloud formations, in Rorschach inkblots, and so on. The fusiform face area (FFA) is a part of the human visual system which may be specialized for facial recognition (first described by Sergent et al. 1992). It has recently been suggested that non-face objects may have certain features that are weakly triggering the face cells. In the same way, objects like rocky outcroppings and cloud formations may set off face radar if they bear enough resemblance to actual faces (Tsao and Livingstone 2008) (see “Enhancing accidental iconicity” section). Whether the hair-trigger response to faces is innate or learned, it represents a critical evolutionary adaptation, one that dwarfs side effects. The information faces convey is so rich, not just regarding another person’s identity, but also their mental state, health, and other factors. It is extremely beneficial for the brain to become good at the task of face recognition and not to be very strict in its inclusion criteria. The cost of missing a face is higher than the cost of declaring a non-face to be a face. Therefore, face recognition is an adaptive sensory bias, which is highly susceptible to exploitation by a depiction of a face as a side effect. If our brain had been less sensitive to faces and had stricter inclusion criteria, perhaps many fewer portraits would have been painted throughout art history.
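The cost asymmetry described above can be made concrete with a standard expected-cost decision rule. What follows is a minimal sketch, not a model taken from the cited literature: the function names, the probability, and the cost values are illustrative assumptions. It shows that when missing a face is costlier than a false alarm, the evidence required to report a face drops, so that weakly face-like stimuli (a cloud, a drawing) pass the criterion.

```python
def face_threshold(cost_miss: float, cost_false_alarm: float) -> float:
    """Lowest probability of 'face' at which reporting a face minimizes expected cost.

    Reporting a face costs (1 - p) * cost_false_alarm in expectation;
    reporting no face costs p * cost_miss. Reporting a face is the cheaper
    choice when p >= cost_false_alarm / (cost_false_alarm + cost_miss).
    """
    if cost_miss <= 0 or cost_false_alarm <= 0:
        raise ValueError("costs must be positive")
    return cost_false_alarm / (cost_false_alarm + cost_miss)


def reports_face(p_face: float, cost_miss: float, cost_false_alarm: float) -> bool:
    """Decide whether an observer with the given (assumed) costs reports a face."""
    return p_face >= face_threshold(cost_miss, cost_false_alarm)


if __name__ == "__main__":
    # Hypothetical stimulus that is only 20% likely to be a real face.
    p = 0.2
    # Symmetric costs: the criterion is strict and the stimulus is rejected.
    print(reports_face(p, cost_miss=1.0, cost_false_alarm=1.0))
    # Misses assumed nine times costlier: the criterion relaxes.
    print(reports_face(p, cost_miss=9.0, cost_false_alarm=1.0))
```

Under these assumed numbers the same stimulus is rejected by the symmetric observer and accepted by the observer for whom misses are expensive, which is the sense in which a lenient inclusion criterion leaves the system open to exploitation by depictions.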

However strong the bias for faces is, it is not always exploited. In fact, in many prehistoric iconic representations, the face is not extensively elaborated. This is probably due to the specific context in which the depiction is produced and experienced (analogously, it might be that female cichlids are much less sensitive to “egg-like signals” a long time after spawning or before spawning). In many representations of the human figure, much more attention is given to specific parts of the body. For instance, in the well-known Upper Paleolithic (UP) “Venus” figurines, the head is rather schematic, whereas breasts, buttocks, and belly are sculpted in great detail and disproportionately exaggerated. Many different hypotheses have been proposed to explain these distorted female representations (for an overview, see McDermott 1996). While speculative, McDermott’s (1996) interpretation is particularly interesting for our approach. He proposes that these disproportions resulted from egocentric or autogenous (self-generated) visual information obtained from a self-viewing perspective. In other words, the disproportions in Venus figurines result from the position of the female creators’ eyes relative to their own bodies. Indeed, we shall argue below in the “Enhancing accidental iconicity” section that self-exploitation of perceptual biases (Footnote 3) may have been the first step in the emergence of iconic art. Whether these Venus figurines were created as self-representations, as fertility symbols or as erotic items, and whether they were created by men and/or women, they may constitute material evidence of strong adaptive sensory biases for the above-mentioned parts of the female body.

We have already touched upon another frequently recurring theme in art history and even more so in art prehistory: the depiction of animals (large wild animals are among the most common themes in cave paintings). Again, a set of adaptive sensory biases might be one of the underlying causes of the tendency to depict animals. In particular, some have speculated that this could well be traced back to the shared human capacity for “biophilia” (Wilson 1984). Biophilia is defined as a biologically based or innate predisposition to attend to, or affiliate with, natural-like elements or processes (Kellert and Wilson 1993). This set of tendencies is claimed to be the result of human evolution in a natural world in which human survival significantly depended on interactions with natural elements and entities, such as animals (animals could be, for example, predator or prey). Leading biophilia theorists have characterized it as including both positive and negative affective states toward natural-like elements (Footnote 4). These affective states may be exploitable by artificial natural-like signals, such as iconic representations of natural elements. For instance, the depictions of large cats in Grotte Chauvet (believed to be one of the oldest two-dimensional iconic representations) might have elicited a fear response, drawing attention to the depiction. What art needs in order to be maintained, improved, and reproduced over different generations, in other words to become a “tradition,” is to have attention drawn to it.

Is iconic art production genetically and/or culturally transmitted?

In Miller’s model, artistic production is maintained by the genetic reproductive success it renders, which compensates for its costs. In our SE concept, transmission of art production as a human behavior is possible by both genetic and cultural selection, in principle. If visual art is seen as the manifestation of differing sensitivities based upon adaptive sensory biases and hidden preferences, then the persistence of its production can be the result of genetic level selection and/or cultural level selection. If costs are bearable or if any benefits (cultural or genetic) are involved, persistent psychosensory biases will bias genetic or cultural transmission. The impact of psychosensory biases will depend on several conditions (i.e., costs, benefits, context), but the upper limit is always determined by the costs. The model predicts that the better the costs can be borne, be it by direct benefits or by a greater carrying capacity of the population (Footnote 5), the more the psychosensory biases will manifest themselves.

There are some indications from the archeological record that iconic art production is a mainly culturally transmitted behavior, while the ability to experience and interpret art is not and does in fact predate art production, just as the origin of female sensory biases leading to mate preferences sometimes predates exploitation (e.g., Ryan 1998). One of these indications is provided by Hodgson (2006). He remarks that the “first art,” both (pre)historical and developmental (children’s first drawings are abstract patterns), is geometric. Therefore, what he calls “geometric primitives” predate iconic art. Hodgson further notices that no culture has ever been shown to have an iconic art tradition without a geometric tradition, whereas some cultures have only a geometric tradition. He concludes from this that the making of geometrics may be a more accessible process than the making of representational motifs and that knowledge of geometrics may be innate whereas, we could add, making representations is not and requires individual learning and social transmission of skills to be evolutionarily maintained. In the following section, we will explore how social learning could indeed have played a major role in the development of iconic art traditions. This hypothesis is supported by the coincidence between demographic transitions determining social transmission and the emergence of iconic art traditions.

The emergence of full-fledged iconic art traditions

Sensory exploitation may provide the initial nudge for the evolution of visual art as it does in sexual selection (Kokko et al. 2003). However, does it also provide a mechanism that is responsible for the persistence of visual art across cultures? If no indirect benefits are derived (that is, if an adaptive explanation is excluded), the evolution and maintenance of male ornaments may be driven exclusively by SE; the same goes for the evolution and maintenance of artistic production as a behavior. Here, we will investigate this theoretical possibility on the basis of empirical data. This section is primarily based on Powell et al. (2009).

It is only after the UP transition, which occurred in Europe and western Asia about 45,000 years ago (ka) (Bar-Yosef 2002; Mellars 2005), and later in southern and eastern Asia (James and Petraglia 2005; Petraglia 2007), Australia (Brumm and Moore 2005; O’Connell and Allen 2007), and Africa (Ambrose 1998a), that more complex figurative art appears consistently in the archeological record. This period is seen by many as marking the origins of modern human behavior. UP material culture, usually referred to as the Later Stone Age (LSA) in Africa, is characterized by a substantial increase in technological and cultural complexity, including not only the first consistent presence of iconic representations but also other symbolic behavior, such as abstract art and body decoration (e.g., threaded shell beads, teeth, ivory, ostrich egg shells, ocher, and tattoo kits); systematically produced microlithic stone tools; functional and ritual bone, antler, and ivory artifacts; grinding and pounding stone tools; improved hunting and trapping technology; an increase in the long-distance transfer of raw materials; and musical instruments, in the form of bone pipes (Mellars 2005, Bar-Yosef 2002, Brumm and Moore 2005, Ambrose 1998, McBrearty and Brooks 2000).

The oldest evidence of these UP iconic art traditions appears from around 35 ka on. There are the schematic, monochrome, red paintings on rock fragments from Fumane Cave in northern Italy and the impressive painted depictions of animals from Grotte Chauvet in the Ardèche in southern France (Floss and Rouquerol 2007). Human and animal figurines of approximately this age were found in Stratzing in the Wachau of Lower Austria (Floss and Rouquerol 2007) and in the Vogelherd and Hohle Fels caves in southwestern Germany (Conard 2003). In the latter, very recently, a Venus figurine was found which was produced at least 35 ka (predating the well-known Venuses from the Gravettian culture by at least 5,000 years) (Conard 2009). The oldest evidence for Middle Stone Age figurative art in Africa is seven paintings on mobile stone blocks from Apollo 11 Cave in southwestern Namibia, which date from between 25.5 and 27.5 ka (Vogelsang 1998).

How could SE help to explain that during the UP/LSA iconic representations became widespread, complex, and persistently present across continents and cultures? As we postulated that iconic representations evolved by exploiting pre-existing biases, one could wonder why it did not come to full bloom much earlier. Indeed, it follows that these biases predate the UP/LSA extensively.

Until very recently, the appearance of consistent and complex painting and sculpture in the UP was considered to be part of a more general “cognitive revolution,” with scholars employing such expressions as “creative explosion” (Van Damme 2008). Indeed, some have suggested (Klein 2000; Mithen 1996) that the main cause of behavioral modernity, one of whose hallmarks is considered to be the creation of complex figurative art, was a heritable biological change (mutation(s) with neurocognitive consequences) just before the UP/LSA. Meanwhile, many authors have argued that anatomically modern humans possessed the requisite capacities long before the UP/LSA (e.g., Mellars 2005, McBrearty and Brooks 2000). It is now widely accepted that anatomically modern humans evolved in Africa some 160–200 ka (e.g., McBrearty and Brooks 2000) and expanded into most habitable parts of the Old World between 90 and 40 ka (e.g., Ambrose 1998b). The findings mentioned above contradict the theory that a neurocognitive change had to take place to produce UP/LSA iconic representations. Moreover, in Africa, the idea of a single transition has been contested (McBrearty and Brooks 2000) because there is strong evidence of the sporadic appearance of many other markers of modern behavior at multiple sites as early as 70–90 ka (Bar-Yosef 2002, McBrearty and Brooks 2000) and possibly as far back as 160 ka (Marean et al. 2007). Therefore, again, how could the delay of some 100,000 years between anatomical modernity and the consistent presence of more complex iconic art be explained (Mellars 2005)?

We know how and why egg spots evolved in haplochromine cichlids: they have been shown to be a genetic trait that provides a selective advantage because they encourage females to participate in oral mating (Salzburger et al. 2007). However, as discussed above, the UP/LSA transition does not seem to be the result of immediate genetic level changes. Instead, we suggest, as have others (e.g., Shennan 2001; Powell et al. 2009), that it resulted from demographic changes which affected transmission on the cultural level. We propose that UP/LSA iconic representations evolved from exploitation of human psychosensory biases via the accumulation of more basic, culturally transmitted ingredients of artistic behavior. The ability to create an iconic representation such as the ones dating some 35 ka requires skills and knowledge which a solitary individual cannot acquire during one lifetime (see further). In other words, UP/LSA art requires a cultural tradition, a gradual accumulation of innovations built upon previous ones, maintained by social learning. Without imitation and observation of others, an individual will not acquire the skills and other innovations necessary to produce, for instance, a cave painting like the ones created around 35 ka. True, someone must have been the first to invent a particular relevant skill, but without incorporation into the cultural repertoire via cultural transmission, acquired skills will not be retained, nor be further improved upon over the generations.

Empirical evidence from different research fields suggests that the larger the interacting pool of social learners (i.e., the “effective population size”; Henrich 2004), the greater the number and complexity of cultural innovations in a population (in wild orangutans and chimpanzees: e.g., van Schaik and Knott 2001; in humans: e.g., Henrich 2004). Some data suggest that cultural changes which could have increased effective population sizes actually took place around 45 ka, for instance, the flowering of long-distance contact (e.g., White 1982), greater tendencies toward colonization (Stiner and Kuhn 2006), and an overall population increase (Bar-Yosef 2002). Which innovations were maintained by cultural transmission, and why they were maintained, are the next questions to be addressed.

Three recent cultural evolutionary models (Henrich 2004; Shennan 2001; Powell et al. 2009), which explicitly demonstrate the positive effect of increasing population size on the accumulation of beneficial culturally inherited skills, have been proposed as an integral explanatory component of the appearance of modern behavior.

Henrich’s model (2004) demonstrates that under certain critical conditions, directly biased transmission can lead to cumulative adaptation of a culturally inherited skill. He terms this “cumulative adaptive evolution,” which depends on a critical population size.
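The dependence on a critical population size can be illustrated numerically. The block below is a minimal sketch under stated assumptions, not a reproduction of Henrich’s analysis: it assumes that each of N learners tries to copy the most skilled model, and that a learner’s acquired skill follows a Gumbel distribution whose mode lies `alpha` below the model’s skill, with spread `beta`. The symbols `alpha` and `beta`, the function names, and the example values are illustrative assumptions. Because the expected maximum of N Gumbel draws exceeds the mode by beta(γ + ln N), with γ the Euler-Mascheroni constant, the best learner is expected to surpass the model only when N exceeds a threshold.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def expected_skill_change(n_learners: int, alpha: float, beta: float) -> float:
    """Expected skill of the best of n_learners imperfect copies, relative to the model.

    Assumption: each copy is Gumbel-distributed with mode (model skill - alpha)
    and scale beta, so the maximum of n draws has mean
    (model skill - alpha) + beta * (EULER_GAMMA + ln n).
    """
    if n_learners < 1:
        raise ValueError("n_learners must be at least 1")
    if beta <= 0:
        raise ValueError("beta must be positive")
    return -alpha + beta * (EULER_GAMMA + math.log(n_learners))


def critical_population_size(alpha: float, beta: float) -> float:
    """Pool size at which the expected change is zero under the same assumptions."""
    if beta <= 0:
        raise ValueError("beta must be positive")
    return math.exp(alpha / beta - EULER_GAMMA)


if __name__ == "__main__":
    # Hypothetical parameters: copying is hard (alpha) relative to its variability (beta).
    alpha, beta = 4.0, 1.0
    print("critical pool size:", critical_population_size(alpha, beta))
    for n in (10, 100):
        print(n, expected_skill_change(n, alpha, beta))
```

With these assumed parameters, a pool of 10 learners is expected to lose skill each generation while a pool of 100 is expected to gain it. A skill that is harder to copy (larger `alpha` relative to `beta`) raises the threshold, which fits the argument that complex iconic traditions require larger interacting populations.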

Powell et al. (2009) adapt and extend Henrich’s transmission model (2004) into a more realistic structured metapopulation, which reflects plausible late Pleistocene conditions, to investigate the effects of demographic factors on the accumulation (or loss) of cultural complexity. The results of their simulation demonstrate that the influence of demography on cultural transmission processes provides a mechanism to explain, among other things, the delay between the emergence of anatomically modern humans as a species and the material expression of modern behavioral traits.

A problem, however, with these models with respect to the subject at hand, i.e., iconic representations, is their basic assumption of adaptivity. Increased complexity of skills is associated with increased adaptivity. This is true for technological utilitarian skills, and perhaps also for the creation of symbols that function in identifying groups (i.e., ethnic markers) (Boyd and Richerson 2005), but not necessarily for iconic representations, which may not have a utilitarian purpose at all, nor a function in evolutionary terms. At certain times and places throughout human evolution, producing and experiencing iconic representations may have been neutral or even maladaptive, depending on specific conditions. The question as to whether visual art such as iconic representations is, or has been, adaptive or not is thus a tricky one, and hard to answer. Illustrative of this are the divided opinions on the adaptivity of visual art (e.g., Pinker 2002). Moreover, among the proponents of art as adaptive there is no consensus on the way in which it would be adaptive. To some, it is sexually adaptive (e.g., Miller 1998, 1999, 2000, 2001); to others, it is a group adaptation (Coe 2003; Dissanayake 1992, 2001). We conclude that if it can be shown that iconic representations evolve even when they are maladaptive, they definitely will when they confer some kind of benefit on any kind of unit of selection. Therefore, here we propose a model in which an iconic art tradition can evolve without any adaptivity assumptions, as a mere consequence of SE and demographic changes.

An iconic art tradition could only have evolved through the accumulation of several innovations in artistic production behavior, whereby more complex ones are built upon simpler ones. First, rock artists needed to know where to find pigments and how to process them for use. Possibly, knowledge of locations with usable surfaces for painting also needed to be maintained in the collective memory of the population by cultural transmission. Secondly, innovations concerning painting skills and methods needed to accumulate. These include intensive training in hand skills or fine motor skills for drawing and insight into how a real object can be translated into a two-dimensional representation that suggests three-dimensionality through shading and skillful use of colors (e.g., the rock art in Grotte Chauvet). Naturally, some of these innovations also function in other contexts, such as trained fine motor control in tool making and use, and pigments in ceremonial or ritual contexts (Power 1999). We probably need to distinguish two categories of innovative or accumulated skills: those that are retained solely for the purpose of iconic art production (e.g., drawing skills), and those that are primarily retained for other, utilitarian purposes. We expect that some of the complex skills resulting from Henrich’s (2004) “cumulative adaptive evolution” would enable, as a side effect of their effectiveness in technological practices, the exploitation of psychosensory biases through the production of iconic representations.

Thus, the demographic transition enabled the evolution of iconic art traditions through an increased capacity to maintain innovations in art production. Even if the resulting iconic art tradition is not adaptive, the general adaptivity of populations of social learners increased in the UP/LSA, which raised their capacity to tolerate costly practices. This allowed neutral and even maladaptive practices to evolve as a result of SE, instead of being eliminated by natural selection (Fig. 2).

Fig. 2

Sensory exploitation, cultural transmission and the influence of the size of the interacting pool of social learners on art. Four hypothetical populations of social learners and the artworks that they produce are shown. All arrows stand for the direction in which “information” is transmitted. When an arrow is black, that information directly determines the outward appearance of an artwork. This kind of information comes from the artists who created the work, who are also shown in black. Driven by the process of sensory exploitation, artists will create artworks that exploit their own and others’ pre-existing biases. Portraits result from exploitation of biases caused by face recognition, and animal depictions from biases caused by biophilia (or biophobia). Population 1 is a small and isolated population of social learners. As a result, the innovations required for its members to produce iconic art will not accumulate. They will, however, produce abstract art that does not require (much) social learning (Hodgson 2006). In populations 2–4, iconic art traditions will naturally and necessarily occur because these populations are large and interconnected, creating an interacting pool of social learners that is large enough for the innovations required for the production of iconic art to spontaneously accumulate and persist, regardless of any beneficial effects of the artworks (artwork: Alexandra Crouwers and Jan Verpooten)

Enhancing accidental iconicity

However, did SE leave no traces of its operation before the UP/LSA transition? There are some findings, albeit sparse and controversial, of stone collecting and protosculptural activity that seem to fit SE particularly well. The findings we refer to predate the appearance of a consistent iconic tradition. They point to the collection and enhancement of stone objects that are accidentally iconic, i.e., objects that coincidentally attract the attention of humans by playing upon their adaptive sensory biases.

The most recent finding is a large piece of rock—6 m long and 2 m high—found in a cave in the Tsodilo Hills in Botswana and resembling the body and head of a python (press reports, late 2006). The surface of the rock shows hundreds of artificial indentations that might have been applied to suggest a snake’s scales. The indentations appear to have been made by stone tools excavated in the cave and are provisionally dated to more than 70 ka.

Two modified stone objects date further back in time. The so-called Tan Tan figurine found in Morocco is a small stone whose natural shape resembles that of a human being. Some of the object’s natural grooves, which are in part responsible for its anthropomorphic appearance, seem to have been accentuated artificially in what is interpreted as an attempt to enhance the human resemblance. It has been provisionally dated to between 500 and 300 ka (Bednarik 2003). The “Berekhat Ram figurine” from Israel is dated to 233 ka and presents a similar case of semi- or protosculptural activity (Goren-Inbar 1986).

The oldest object found at a hominid occupation site is a naturally weathered pebble resembling a hominid face, without any of these anthropogenic enhancements. The site in Makapansgat, South Africa, where it was found is dated to 3 million years ago (Dart 1974).

However sparse and controversial these early findings are, it is significant for the application of the concept of SE to paleolithic art that all of them appear to be (enhanced) semblances. Concerning paintings, we mentioned earlier that the natural bumpiness of a rocky surface is often used to enhance the three-dimensionality of depictions. In fact, some of the paintings on natural bumps may have been created as enhanced semblances. From the SE perspective, one would expect that the first iconic representations originated from accidental exploitation by natural objects that elicit responses. Imagine an early human stumbling upon a stone that draws her attention because it triggers a strong response as a result of adaptive sensory biases, exhibiting “accidental iconicity.” If it resembles a human face, it could play upon the FFA. She might keep the stone and start a collection of objects that engage her adaptive sensory biases. Later, she might even start scratching at it with a harder stone, deepening its natural crevices, resulting in something that looks even more like a face. She is probably driven by her own responses to the ever-improving “mimic” of a human face. This specific case (an initial spark of artistic behavior) would be an incident of self-exploitation. Logically, the first person upon whom the “effectiveness” of an artwork is tested is the artist herself, not only when the work is “finished” but also during the several intermediate stages of the artistic process. When (the products of) these self-exploiting behaviors subsequently become part of a socially transmitted cultural repertoire, they evolve into traditions by the accumulation of innovative variants, as discussed in the previous section.

One might object that the analogy between biological signal evolution through SE and the evolution of human art behavior ends when considering self-exploitation. However, male fiddler crabs prove otherwise. Courting male fiddler crabs sometimes build mounds of sand, called hoods, at the entrances to their burrows. Males wave their single enlarged claws to attract females to their burrows for mating. It has been shown that burrows with hoods are more attractive to females and that females visually orient to these structures. Interestingly, a recent study showed that males themselves were also attracted toward their own hoods as a consequence of SE, or a sensory trap (Ribeiro et al. 2006). Hence, hood building, like art production, causes self-exploitation.

Another objection one could make is that in the anecdote of the early artist the artistic process through sensory (self-)exploitation occurs on the individual level, while SE as an evolutionary selection process typically occurs on the population level: evolution is a change in gene frequencies in a population that usually occurs over many generations. For instance, mound building in male fiddlers probably evolved gradually because exploiting the sensory biases of females yielded greater reproductive success for the mound builders than for the non-mound builders. Probably because males and females share many of the same sensory biases and responses, the males are equally attracted to their own mounds. Therefore, mound building evolves on the population level, that is, through sexual selection over many generations.

However, one should not exaggerate this distinction between the crab and the artist. First, the artistic process described here in an anecdotal form may in fact occur far more gradually and also spread over many generations of social learners as we proposed above.

Secondly, in mimicry and SE in other animals, the previous experience and learning of the individual play an important role as well (e.g., ten Cate and Rowe 2007). Also, male bowerbirds decorating their bowers are reported to inspect them from a distance during the process, like a painter who steps back from his canvas to check the intermediate result while painting.

Moreover, when individual learning has a social component, the cultural transmission of traits influencing behavior becomes possible. Cultural transmission has an evolutionary dynamic analogous to that of genetic transmission but can occur at a much higher rate, as transmission through social learning can happen all the time (Richerson and Boyd 2001). For instance, in bowerbirds, styles in bower decoration are said to spread over populations and even jump to other species of bowerbirds, via cultural transmission (http://www.life.umd.edu/biology/borgialab/).

Also, stone play in Japanese macaques is a well-documented example of animal behavior that, from our viewpoint, seems to have much in common with human artistic behavior. Like artistic behavior, stone play exhibits inventive variations transmitted in a context of social facilitation and observational learning; it does not seem to have any instrumental function, and it probably involves some form of self-exploitation (Huffman and Quiatt 1986).

We do not intend to dismiss the idea that certain capacities used in the production of iconic representations are unique to modern humans, but our approach shows that these differences are gradual rather than absolute. As noted, biological mimicry illustrates that icons are not produced only by the human species (Maran 2007); we merely produce more of them and in greater variety.

Modern culture

The gradual accumulation of the innovative skills and knowledge that affect artistic production, as described above, may have led to something that we might not perceive as art today but that nevertheless plays upon a whole range of psychosensory biases, namely multimedia products like movies, advertisements, and video games. These products of modern culture probably have more in common with cave art than cave art has with modern painting. As Marshall McLuhan said: “Ads are the cave art of the twentieth century.” While these products are directed at exploiting emotional and visual sensitivities, modern art often is not. Its aim is instead conceptual, analyzing and “deconstructing” its own underlying mechanisms. This distinction between modern art and rock art is one of the reasons the use of the term “art” is tricky in a scientific approach. However, as we hope to have shown, a bio-evolutionary account of art such as iconic representations is necessary and worthwhile, as it provides a framework in which ideas about more specific aspects of visualizations can be articulated.

Conclusion

We have proposed that the concept of SE combined with ideas about cultural transmission sheds light on the late emergence of iconic art traditions in human evolution during the UP/LSA. Scholars disagree on the adaptiveness of art. We have advanced a view in which art can evolve even if it is not adaptive. First, a demographic transition increased the capacity of UP/LSA cultures, enabling an increased tolerance for neutral and even maladaptive traits. Secondly, the same demographic transition led to “cumulative adaptive evolution” and as such to more complex adaptive skills. Subsequently, these skills could serve potential non-adaptive purposes as well, such as iconic art production. The evolution of art production is then driven solely by SE and is not hindered by its costs (i.e., by elimination through natural selection) because of the high adaptivity of cumulative culture itself. As such, indirect benefits of iconic art production are not a prerequisite; however, if present, they may additionally drive its evolution as a secondary force. Whether investigated from a biological, sociological, anthropological, or philosophical perspective, one cannot ignore the fact that iconic art draws upon sensory sensitivities. Our view based on SE could serve as a concept that enables the articulation and evaluation of all existing hypotheses about art.