Three Anecdotes to Start with

One. I am looking at a white dot, placed exactly at the center of a white piece of paper. I asked the students in my class on the psychology of creativity to draw me something that follows five of Ramachandran and Hirstein’s (1999) eight rules of beauty and tell me whether what they have is now a beautiful drawing. The dot hits five rules, the student argues: peak shift, isolation, symmetry, contrast extraction, as well as problem solving (“What is this dot doing here?”) and/or metaphor (“Poor lonely dot!”). The student does not argue that her piece is beautiful or artistic at all; she proclaims it “lame,” and I have to agree—it has no legs. Following the rules—even the rule of making something unexpected—has led to a thoroughly sterile result.

Two. I am in a gallery in Beijing, circa 1995, hoping to buy a painting I just fell in love with. The painter doesn’t speak any English, my Chinese vocabulary contains a mere 500 words; his daughter translates. “My father says you have good eye, but the painting is not for sale.” What about this one, I ask. “My father says you have good eye, this painting is also not for sale. But he will make an exception for you, because you have good eye.” When I pick up the painting the next day, it gets ceremoniously signed, in the presence of quite an entourage—both brush and chop get applied. When the painting is rolled up and handed to me, the artist, for the first time, looks me in the eye and addresses me, while his daughter translates: “He says he has done many duck-and-lotus-flower paintings, but this is the best one he ever made. He will now paint no more such paintings.”

Three. I am sitting in the dark nave of a gothic church. I am 8 years old. Bats swoop down, as bats do: erratically elegant, efficient, black, silent. I don’t mind; I am used to their presence. I sing in the church choir; every Friday night during our rehearsals the bats go hunting, occasionally offering their fluttering counterpoint to our Handel or Aichinger. But today I am down in the nave, for my first concert of any kind, listening to, looking up at the small band of musicians that has temporarily taken over our Friday bubble of light. Then, during a slow movement, within the space of a few seconds, I am transformed: My hair stands on end, I get goose bumps, my whole body shudders and shakes, and my soul (I am quite the religious little dude) soars to the ceiling—one with the music, one with the quivering bats. I am certain nobody has ever felt anything like this before. This must be a moment of divine provenance, so I commit the name of the violinist to memory, and the piece: Lola Bobesco, J. S. Bach’s violin concerto in A minor.

Three Thousand Words of Commentary

So, here we go: from trite all the way to being haunted into a vaulted ceiling by some Catholic God.

Allow me to start with these chills (as they are now most commonly known in the literature—some also call them thrills, shivers, or frissons). What do we know about them?

First, some people experience them, some do not—estimates vary from 53% (Goldstein, 1980) to 86% (Panksepp, 1995). Musicians have them more often than nonmusicians (Goldstein, 1980). It is important to note that they are intra-individually repeatable—whatever gives you the shivers today is likely to give you the shivers tomorrow (Blood & Zatorre, 2001; Sloboda, 1991). That is the main reason why I bring them up: it’s hard to argue that chills are not an important aesthetic emotion, and yet, their sheer replicability clearly runs afoul of Mechner’s (2017) claim of the tinge of surprise as a necessary element of the synergistic brew (most clearly in Section 1.12, where he lists surprise as one of the five common features of aesthetic phenomena, although other areas of the text are less sanguine on that point). What is less surprising than exposing oneself—often voluntarily—to the same stimulus over and over again, with more or less the same result? Mechner (2017) realizes this, and resorts to a Heraclitian gambit: You cannot step in the same river twice (Section 4.6), not because the river changes, but because you do. Simple observation suggests that this argument does not ring particularly true. There are enduring works of art that bring many people great joy whenever they revisit them. Many of us have photographs or paintings, original or reproduced, on our walls, and we don’t seem to get tired of them too easily; many people revisit the same music over and over again (Greasley, Lamont, & Sloboda, 2013); churches and orchestra halls keep programming the St. Matthew Passion every Easter season. If we were the avid neovores Mechner (2017) implies that we are (see Biederman & Vessel, 2006, for essentially the same point), why do 6 million people per year flock to the Mona Lisa, which is among the most reproduced art works of all time? Shouldn’t the mystery in her half-smile have faded by now? 
Why does Spotify (as far as I can tell) not have a randomization function over all its songs, which would truly expose us to the new? Why is there (as far as I can tell) no radio station that indiscriminately adheres to all styles of music? Why does my local egregiously expensive organic grocery store blanket its shoppers with college radio music that is 20 to 30 years old, hitting them right in the heart of their late-adolescent memories?

The answer to this, obviously, is that familiarity is a strong driver of aesthetic judgment. We may think we know what we like, but we certainly like what we know, and research shows that. If you ask people to nominate the greatest pop musicians of all time, they tend to cite musicians who were successful when the participants were in adolescence or early adulthood (North & Hargreaves, 2002). We prefer the songs that charted when we were in our early 20s (Hemming, 2013; Holbrook & Schindler, 1989, 2013). We also like songs from previous generations or from our early youth better than those of later generations (Holbrook & Schindler, 2013)—we’ve heard the former played on our radios or by friends, older siblings, or our parents when we were young. Children inherit (some of) the musical taste of their parents, probably through repeated listening rather than by genetic determination (ter Bogt, Delsing, van Zalk, Christenson, & Meeus, 2011). Part of this is due to engagement: Younger people listen more often to music than older folks do (Bonneville-Roussy, Rentfrow, Xu, & Potter, 2013), and so songs of your youth are more likely to end up in your long-term memory in the first place. This engagement, in turn, may be related to how music shapes self-identity and also projects it, both of which are areas of acute importance in this particular age group (Greasley et al., 2013, have some nice examples of people avoiding identification with a genre or band because they are “bland” or “lack creativity,” but secretly listening to and enjoying said genre or band).

More direct evidence for the relationship between familiarity and the aesthetic response comes from experimental work. For instance, in a study where 75 college students rated 60 excerpts of popular music, each 30 s long (mostly pop, ambient, and electronic pieces), the correlation between self-rated familiarity of a piece and how much the participant liked it was a rather astonishing 0.91 (North & Hargreaves, 1995).

Familiarity often does its work outside awareness. In cognitive psychology, the mere-exposure effect (Kunst-Wilson & Zajonc, 1980) is well-known—mere exposure to random stimuli (such as polygons, nonsense words, scribbles, or photographs of faces) increases people’s affective response to them, even in the complete absence of recognition memory for the stimulus. My favorite example of mere exposure in aesthetics is the Caillebotte effect (Cutting, 2003). Gustave Caillebotte (1848–1894) was an impressionist painter who was lucky enough to have been born into a wealthy family. Caillebotte helped out his more-or-less starving friends (Impressionism wasn’t exactly warmly received in its early days) by collecting their work. He bequeathed his entire collection (14 Monets, 10 Renoirs, 7 Degas, and so on) to the French government, on the condition that the works would be exhibited, together, in the (then) museum of contemporary art in Paris, the Palais du Luxembourg. The French government was not too keen on this arrangement, and negotiations ensued. The state eventually took about half of the paintings (39 of them) and exhibited them in the Palais; this collection later formed the backbone of the Jeu de Paume, and the collection is now at the Musée d’Orsay. The other part of Caillebotte’s collection disappeared into private hands. One implication is that the former set of 39 pieces has been widely reproduced in art history books and other media; the latter—much harder to track down—has not. For his study, Cutting paired the Caillebotte images with similar images by the same painter (thus showing side-by-side two Degas paintings of dancers, two Monets showing the Gare Saint-Lazare, two Renoir picnics, and so on). Then he sent his undergraduate research assistants to the library to tabulate how often each image was represented in art and art history books. 
Finally, he asked a random sample of students (a) which of the two side-by-side images they recognized, if any, and (b) which one they preferred. Few people recognized any of the images (3% on average, and expertise played a role—those who visited museums at least once a year recognized about 4%, and those who had ever visited the Musée d’Orsay recognized about 12%). Participants did, however, prefer the more frequent of the two images 59% of the time—better than chance, and independent of whether or not they were museumgoers or had ever visited the collection. The percentage liking of the more frequent image went up nicely with the ratio of frequency of reproduction in each pair. Older adults showed the same effect; crucially, 6-to-9-year-old children did not, suggesting that cultural exposure is indeed a critical determinant here. (This is the reason why, in the old days, record companies would pay DJs handsomely to have their new records played on the radio, and why the latest album of your favorite band, at first listen, always tends to sound less interesting than their previous work. You need that familiarization.)

Neuroimaging work likewise shows the power of familiarity. Music, unsurprisingly, activates the reward/pleasure centers of the brain, notably the nucleus accumbens (for a review, see Zatorre & Salimpoor, 2013), but expectation plays a large role in that pleasure. Chills, again, provide a prime example. In a combined PET/fMRI study, Salimpoor, Benovoy, Larcher, Dagher, and Zatorre (2011) had participants listen to their own favorite chill-inducing pieces of music; they pressed a button whenever an episode of chills started (let’s call this time zero). Chills were associated with a stark rise in activation in the right nucleus accumbens right at time zero, as one would expect, but there were also clear signs of anticipation: Activation in the nucleus accumbens was already above baseline 15 s before time zero, and at that time, activation also started building up in the right caudate (a structure often implicated in learning stimulus-response associations); the caudate then dropped out at time zero. PET scans showed that this activation was accompanied by dopamine release. We can see a two-step process here: a phase of anticipation or wanting or craving, evidenced by dopamine release in the caudate, and one of consummation or liking or fulfillment, evidenced by dopamine release in the nucleus accumbens. The anticipation is also visible in a ramp-up of heart rate and muscle tension half a minute or so before the chill proper (Blood & Zatorre, 2001). Salimpoor et al. (2011) assume that this anticipation phase is a reflection of familiarity, either because of direct memory effects (as is likely when you hear your favorite music), or through implicit knowledge of the rules of the musical language.

It is not only the audience that prefers the familiar. Artists also often revisit themes and subjects, and seem to be rewarded for it. Galenson (2003) found that artists who tentatively and incrementally grow in skill (think Cézanne and his endless series of views of Mont Ste.-Victoire, or think the Chinese painter in my second anecdote) tend to command higher prices at auction for their late-life work; artists who are driven by concepts and ideas, and thus work in breakthroughs (think Picasso and his many different successive painting styles), often peak early in their career. Cézanne’s most expensive work, for instance, was painted when he was in his mid-60s; the price profile for Picasso peaks at age 25.

Here, however, an interesting historic and cultural difference emerges. French (and maybe by extension Western) painters born before 1850 tend to produce their most valuable work late in life; painters born after that date tend to peak at earlier ages (Galenson, 1999), likely indicating that the modernist movement has driven painters off the path of incremental seeking to the cliff of finding. This emphasis on the new may also be a Western phenomenon: In Japanese woodblock artists, a positive relation between age and eminence emerges (Kozbelt & Durmysheva, 2007). These two examples suggest that Mechner’s (2017) idea of surprise as the hallmark of the synergistic brew might betray a Western and modernist viewpoint, rather than articulate a universal truth (note Section 6.5, however).

A final aspect of familiarity that bears mentioning is prototypicality. An example in aesthetics is the finding that composite faces (which average all the features of the faces that go into them, thus edging closer to the prototypical face) are in general better liked than individual faces; the more faces go into the composite, the higher the preference (e.g., Langlois & Roggman, 1990). This effect already operates in infants (Langlois, Roggman, & Rieser-Danner, 1990), suggesting that this is, indeed, a simple prototype effect and not, like the Caillebotte effect, the result of cultural learning. My favorite example is a demonstration from my own undergraduate class on the psychology of aesthetics. David Cope, an American composer, developed an AI system for music composition, called EMI (Experiments in Musical Intelligence). The system extracts patterns from a set of scores, and then reuses those patterns to create its own music. Thus, feeding EMI all of Bach’s inventions will “teach” EMI what a Bach invention generally sounds like, and how to write one. In my class, students listen to a real Bach invention, a real Beethoven sonata, a real Chopin nocturne, and so on, as well as an EMI Bach invention, an EMI Beethoven sonata, and an EMI Chopin nocturne; I ask them, for each one, how much they like the piece, and whether they think it was composed by a human or by EMI. Students invariably prefer EMI’s work over that by the corresponding human composer, but they also ascribe EMI’s compositions to the human, not “the computer.” What is likely happening here is that EMI’s pattern extraction algorithm smooths the noise (AKA the quirks, the realness, the unpredictable, the unexpected) out of Bach’s inventions, and then creates something that is devoid of that noise, and hence closer to a prototypical Bach invention. The end result is something that is ultimately more recognizably “Bach” than a real Bach piece, and hence more likeable.

Finally, it appears that self-relevance, maybe the ultimate instantiation of familiarity, plays a crucial role in the aesthetic experience too. A nice example is an fMRI study by Vessel, Starr, and Rubin (2012). Participants viewed 109 different works of art, imagining they were assisting the curator of a new museum in selecting new pieces. They were specifically requested to base their judgment on aesthetics, that is, how “powerful, moving, or profound” they found each image to be. These aesthetic judgments showed (as expected) nice linear relationships with brain activation within sensory regions and within the reward system, but (and this is interesting) a step function appeared for parts of the default mode network associated with self-relevance: These regions were active only for the images that received the highest recommendation. In other words, we do find art more beautiful and interesting the more it tickles our senses and our pleasure centers, but to truly blow us away, art needs to be personally relevant. (This confirms my suspicions about every curated art show I have ever been lured into, and just about every personal iTunes playlist anyone ever shared with me.)

Summarizing my argument so far: Familiarity (which is, in a sense, the opposite of surprise) is very much an essential part of the aesthetic experience.

The aesthetic response cannot be all about familiarity, of course. The first anecdote illustrates that meticulously following the rules may just be a shortcut to boredom. Thus, artists often employ unpredictability (what Mechner, 2017, calls the surprise tinge) to create or sustain interest. This unexpectedness needs to be Goldilocked: In the same study where North and Hargreaves (1995) found a strong linear relationship between familiarity and liking, they also found a curvilinear, inverted-J-shaped relationship between subjective complexity (a good stand-in for unpredictability) and liking; this nonlinear effect of complexity explained 50% of the variance. If music is too simple, it does not please; neither does it please when it gets too convoluted. Note that the shape is an inverted J, not an inverted U: In general, we prefer plainer songs over complicated ones. The effect stacks with that of familiarity: Familiarity explains 82% of the variance in liking; complexity adds another 3%.
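The two percentages can be checked against the familiarity-liking correlation of 0.91 quoted earlier, because a squared correlation gives the proportion of variance explained. The short Python sketch below does only that arithmetic; the variable names are illustrative, and the numbers are the ones quoted in this commentary, not a reanalysis of the original data.

```python
# Figures quoted in this commentary (not recomputed from raw data).
r_familiarity = 0.91            # familiarity-liking correlation
reported_familiarity = 0.82     # share of variance attributed to familiarity
added_by_complexity = 0.03      # extra share attributed to complexity

# A squared correlation is the proportion of variance explained.
r_squared = r_familiarity ** 2
combined = reported_familiarity + added_by_complexity

print(f"r squared: {r_squared:.3f}")
print(f"familiarity plus complexity: {combined:.2f}")
```

The squared correlation comes out within a percentage point of the 82% figure; the small gap is presumably due to rounding in the reported correlation. Together, the two predictors account for roughly 85% of the variance in liking.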

In the psychology of music, such irregularities are often called expectancy violations (Meyer, 1956). A first observation is that music has expectancy violations woven into its very fabric. Music (like speech, and many natural phenomena) can be considered a form of pink noise, also known as 1/f noise (e.g., Voss & Clarke, 1978). We can express an event within a musical piece (e.g., a particular pitch, or a particular interval, or the length of a note) in terms of its frequency of occurrence, and note that $P(f) = 1/f^{n}$, that is, the frequency of an event is inversely proportional to its rank in a frequency table (Manaris et al., 2005). If n = 0, every event happens with equal probability, that is, it is perfectly unpredictable; this situation is known as white noise. (If the event in question is the occurrence of a particular pitch, this would describe 12-tone music.) If n = 1, we have perfect pink noise; n = 2 constitutes brown noise, which sounds “boring” (Schroeder, 1991).
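To make the role of the exponent concrete, here is a minimal Python sketch of the rank-frequency rule just described. The function name `rank_probabilities` and the choice of 12 events (one per pitch class) are illustrative assumptions, not taken from the studies cited; the probabilities are normalized so that they sum to 1.

```python
def rank_probabilities(num_events, n):
    """Return P(rank) proportional to 1 / rank**n, normalized to sum to 1.

    Hypothetical helper, for illustration only.
    """
    weights = [1.0 / rank ** n for rank in range(1, num_events + 1)]
    total = sum(weights)
    return [w / total for w in weights]


# n = 0: every event is equally likely (the white-noise case).
white = rank_probabilities(12, 0)
# n = 1: the top-ranked event is twice as likely as the second (pink).
pink = rank_probabilities(12, 1)
# n = 2: the top-ranked event is four times as likely as the second (brown).
brown = rank_probabilities(12, 2)

for label, probs in [("n = 0", white), ("n = 1", pink), ("n = 2", brown)]:
    # Show the three most common events for each exponent.
    print(label, [round(p, 3) for p in probs[:3]])
```

As n grows, more of the probability mass piles up on the few most common events, which is one way to read the progression from unpredictable (white) to monotonous (brown).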

Manaris et al. (2005) analyzed a corpus of 196 musical pieces from the Western canon, and found that these pieces indeed conformed to pink noise (mean n = 1.2) across a wide variety of musical parameters (pitch, chromatic tone, duration, pitch distance, harmonic interval, etc.). Their explanation is that pink noise tends to be stable, that is, a system that is governed by values of n between 0 and 2 tends to return to its initial state when perturbed. Seen in this light, music is a process of stabilizing a complex hierarchy of pitches, durations, harmonic intervals, and so on. Manaris et al. also asked a group of participants to rate 12 musical excerpts on pleasantness (these ranged from Berg [unpleasant] to Debussy [pleasant]), and found that the 6 most pleasing pieces had a harmonic-consonance n of 1.2, and a chromatic-tone n of 1.4; unpleasant pieces averaged 0.5 and 0.6 on these metrics. This is then another example of Goldilocking: Music that becomes too unpredictable, too close to white noise, sounds unpleasant; pleasant music tends to move just a little more in the direction of predictability than true pink noise. It is interesting that Manaris et al. found that when musical pieces fall too far outside the pink-noise range, interpreters might spontaneously bring them back: Bach’s Two-Part Invention No. 13 in A minor, as written, has an undesirable n value close to 4 for note durations; in his recording of the piece, the harpsichordist John Sankey brought this n down to 1.5.
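For readers who want to see how an exponent like n might be obtained from a piece, the sketch below fits a straight line to log(count) against log(rank) and reports the negated slope. This is a common way of estimating a power-law exponent, offered here as an assumption; the text above does not describe the exact fitting procedure used by Manaris et al. (2005). The function name and the example counts are made up for illustration.

```python
import math


def estimate_exponent(counts):
    """Estimate n in count ~ 1 / rank**n from raw event counts.

    Fits log(count) against log(rank) by ordinary least squares and
    returns the negated slope. Hypothetical helper, illustration only.
    """
    ranked = sorted((c for c in counts if c > 0), reverse=True)
    if len(ranked) < 2:
        raise ValueError("need at least two nonzero counts")
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(c) for c in ranked]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    return -covariance / variance


# Made-up counts for ten event types, not data from any cited study.
pink_like = [1200 / rank for rank in range(1, 11)]        # built with n = 1
brown_like = [1200 / rank ** 2 for rank in range(1, 11)]  # built with n = 2
flat = [100] * 10                                         # built with n = 0

print(round(estimate_exponent(pink_like), 2))
print(round(estimate_exponent(brown_like), 2))
```

Applied to counts of pitches, durations, or intervals in a score, an estimate near 1 would indicate pink-noise-like structure, while values near 0 or well above 2 would fall outside the range described above.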

Note that the peak of unpredictability that leads to the highest level of aesthetic response is not fixed. For instance, North and Hargreaves (1995) found that their inverted-J function shifted with musical experience: More experienced subjects preferred more complex pieces of music. Simonton (1980) analyzed melodic originality of 15,618 themes from the classical repertoire, and found, on average, an increase in originality over historic time, culminating in Schoenberg, or pure melodic unpredictability. The reason is likely that the culture, over time, habituates to the new, and uses that new set point as its own baseline for originality.

Chills can be instructive here too. The specific pieces that produce chills are not predictable between individuals (your chill-provokers may not be mine; Grewe, Nagel, Kopiez, & Altenmüller, 2005; Laeng, Eidet, Sulutvedt, & Panksepp, 2016; Nusbaum et al., 2014), but we do know that there are certain musical characteristics that appear to be necessary for producing strong emotional reactions. These characteristics are, however, tied to the particular emotion (Sloboda, 1991). Chills are associated with musical shifts: new or unexpected harmonies, or sudden changes in dynamics or texture. In contrast, tears and lumps in the throat happen during melodic appoggiaturas, or in response to harmonic or melodic sequences such as moving through the cycle of fifths to the tonic. Finally, the heart races during acceleration and repeated syncopation. Thus, expectancy violations are a precondition for one type of strong emotional aesthetic response. Two others, however, have a different origin: Being moved to tears is more a question of the composer’s skillful and/or playful execution of a syntactic rule, and a pounding heart merely seems to be a sympathetic response to pounding music. Here, then, we come full circle: The aesthetic brew can be ignited both by surprising elements (provided they hit the sweet spot of uncertainty: not too predictable, not too random), and by following the old, familiar rules. Depending on what ignites it, the outcome is different: Violations of the expected make the skin crawl; familiarity makes the throat swell or the heart hammer. The North and Hargreaves (1995) study, the only study I am aware of that pits the two aspects against each other, suggests that familiarity might be the stronger component in the mix.

Summarizing this part, surprise/complexity/expectation violations can clearly be determinants of the aesthetic experience as well, acting in concert with familiarity.

By Way of Conclusion

Three anecdotes and 3,000 words of commentary—a long slog. I apologize for the density of this article and its many references. My argument is ultimately simple: the aesthetic brew is mainly driven by familiarity—in a rather straightforward way—and less so by surprise, which is also harder to engineer and seems to be the necessary condition for only a subset of aesthetic responses.

I must admit that my own conclusion in the previous sentence, a conclusion based on data, sits somewhat uneasily with my own experience (or maybe my self-concept): Certainly I go out and survey new horizons? Surely I am open-minded and have a taste for discovery? One meta-lesson may be that there might well be individual and cultural differences in what exact memory-cum-exploration mix each of us prefers. I can only speculate here, but I suspect that Mechner’s (2017) own penchant for exploration colored his idea of the primacy of surprise. If so, we are all better for it.