Introduction

Cross-modal correspondences have been defined as “compatibility effects between attributes or dimensions of a stimulus (i.e., an object or event) in different sensory modalities” (Spence 2011a, p. 973). Cross-modal correspondences have now been demonstrated between many different combinations of sensory modalities (e.g., audition–olfaction, Belkin et al. 1997; audition–gustation, Crisinel and Spence 2010; olfaction–touch, Demattè et al. 2006b; olfaction–vision, Demattè et al. 2006a; Gilbert et al. 1996; audition–vision, Marks 1974; although not necessarily between every possible pairing of features/dimensions, see Evans and Treisman 2010; see also Spence 2011a, for a review). Especially relevant to the aims of the present study is the recent growth in the number of studies that have investigated cross-modal correspondences involving the olfactory modality (see Table 1).

Table 1 Summary of studies that have reported cross-modal correspondences involving olfactory stimuli

All of these studies focused on cross-modal correspondences between a specific pair of modalities. Moreover, they used different stimuli and experimental designs, thus making any direct comparison of the results somewhat difficult. By contrast, in the study reported here, cross-modal correspondences were tested between the same set of olfactory stimuli and both audition (the pitch and timbre of musical notes) and vision (the angularity of shapes). Each of the latter features has been shown to correspond to a variety of features in other sensory modalities: Pitch, for example, is associated not only with odors (Belkin et al. 1997) but also with visual brightness (Marks 1974), size (Marks et al. 1987), tastes/flavors (Crisinel and Spence 2010), and elevation (Evans and Treisman 2010; Melara and O’Brien 1987; Pratt 1930), while associations with the angularity of shapes have been demonstrated for non-words (e.g., “takete” and “baluba,” Köhler 1929; see also Bremner et al. 2013) and for tastes/flavors (Deroy and Valentin 2011; Ngo et al. 2011) as well as odors (Hanson-Vaux et al. 2013; Seo et al. 2010) and a variety of concepts (such as, for example, angry, calm, or love; Lyman 1979).

Pitch and angularity have, however, rarely been compared or related in an experiment testing for the correspondence with a third feature (although see Crisinel and Spence 2012b, for an exception). Moreover, the correspondence between pitch and angularity has not itself been robustly tested: Some years ago, O’Boyle and Tarte (1980) demonstrated that an angular, star-like shape was more likely to be associated with a higher frequency tone than a circle or ellipse. However, in their study, a rounded shape made out of three ellipses also tended to be associated with higher frequency sounds. The possible correspondences between pitch and angularity therefore remain unclear at the present time.

In a more recent experiment, Walker et al. (2013) documented a cross-modal correspondence between “low pitch” and “blunt,” on the one hand, and “high pitch” and “sharp” on the other. This said, the latter demonstration depends on the assumption that all sensory dimensions can be aligned and that their polarity will correspond. Thus, knowing that a stimulus which is more A is also considered to be more B and that a dimension which is more B is considered to be more C, one should be able to predict with some degree of certainty how dimensions A and C would be aligned. As stated by the authors, “given that auditory pitch connotes variation in size, one would expect to find systematic mappings between any contrasting feature (e.g., fast-slow, bright-dark) that is represented or implied by the parallel poles of size (small-big) or pitch (high-low)” (Walker et al. 2013, p. 3; see also Palmer and Schloss 2012). However, other researchers have suggested that correspondences may be related in rather more complex, and hence less predictable, ways (Deroy et al. 2013). For example, while high pitch corresponds to smaller size, falling pitch corresponds to a decrease in size. A two-dimensional model of alignment between sensory dimensions might thus fail to explain all cross-modal correspondences satisfactorily. This might be specifically the case once one starts to investigate odors, as their many qualitative differences do not fit onto a polar scale.

In parallel to theoretical research on the topic of cross-modal correspondences, the use of so-called synesthetic messaging in the marketing of a variety of products has recently started to attract an increasing amount of attention (see, for example, Klink 2000; Spence 2012a, b; Spence and Gallace 2011). The marketplace constitutes a much more complex environment than the laboratories in which most theoretical research is conducted. In more ecologically valid situations, a network of cross-modal correspondences may come into play at one and the same time (e.g., think only of the shape, color, and texture of product packaging), and the stimuli are often more complex than those typically used in laboratory research (see Sester et al. 2013). For example, the auditory background in which a product is consumed or chosen is often made up of a complex piece of music—with various instruments, tempos, and pitches, not to mention human voices and lyrics. While the cross-modal correspondences described with musical notes have been extended to whole pieces of instrumental music (Mesz et al. 2011), musical pieces will often also have strong cultural or semantic connotations. For example, the playing of French or German music has been reported to influence the choice of French or German wine in a supermarket setting (North et al. 1997, 1999), while “powerful and heavy” or “zingy and refreshing” music has been shown to influence the evaluation of these same characteristics when tasting wine (North 2012; though see also Spence 2011b).

Recently, Courvoisier launched a marketing program underlining cross-modal correspondences which exist between their products and other sensory dimensions (see http://courvoisier.com/uk/le-nez-de-courvoisier/ downloaded on 6 August 2012). One part of this program involved the design and presentation to customers of soundtracks that had been composed specifically in order to correspond to particular olfactory notes found in Cognac (provided in a kit of six small glass bottles). The parallel is then made between the combination of the soundtracks, played by different musical instruments into a single musical piece, and the combination of the various olfactory notes into the complex aroma of Cognac. Is this just a useful didactic/marketing tool? Or do the various soundtracks really correspond cross-modally to the odors that they are designed to represent? If the latter claim turns out to be true, then the further study of cross-modal correspondences might well be expected to have increasingly important implications for marketing (e.g., in the design of advertising jingles).

Here, we deal with both a more theoretical aspect, looking at how correspondences between smell and musical notes compare to correspondences with shapes, and a more practical aspect, checking whether soundtracks composed to represent smells really do correspond to them cross-modally, thus using materials developed for marketing purposes and evaluating them in an experimental context.

From a theoretical point of view, if all sensory dimensions are aligned, then the odors should be matched to pitch and shape in the same way. That is, the odors matched to the extremes of one scale should also be the ones matched to the extremes of the other scale.

The study reported here comprises two experiments. In the first experiment, the participants sniffed seven odors (six from the Courvoisier kit, plus one unpleasant odor) and associated them with a musical note and with a shape. The participants also had to rate the odors on perceptual (e.g., intensity, sweetness) and affective (e.g., arousal, happiness) scales. There were two main aims of the present study: first, to test the cross-modal correspondences elicited by this new set of odors and to compare them to results from previous research and second, to compare associations with musical notes and shapes and their correlations with a variety of perceptual and affective ratings. In the second experiment, the participants matched three of the odors to the three soundtracks composed for Courvoisier and the results were compared to the intended matching.

Materials and Methods

Participants

Twenty-five participants took part in experiment 1 (aged 22–62 years, 13 females). The experiment was approved by the Central University Research Ethics Committee of Oxford University. The participants gave their informed consent, reported no cold or other impairment of their sense of smell, and no hearing impairment. The experiment lasted for approximately 10 min. The participants also took part in unrelated experiments for another 20 min and were compensated for their efforts with £5 (UK Sterling). Forty-six participants took part in experiment 2 (aged 22–84 years, 25 females), as part of the Scent and Sensibility conference held at the Institute of Philosophy, School of Advanced Study, University of London, on 1st June 2012.

Stimuli

Six samples from the Nez de Courvoisier® aroma kit (Courvoisier Import Company, Deerfield, IL USA), as well as one sample from the Nez du Vin aroma kit (Brizard & Co, Dorchester, UK), were used as olfactory stimuli in this study. These kits were designed to help those who would like to learn more about cognac or wine to experience the odors commonly found in these drinks individually. The samples consisted of odors identified as those of ginger cookies, dried plums, roasted coffee, crème brûlée, candied orange, iris flower, and musk. In experiment 1, the samples were presented in small glass bottles identified by a number written on the side of the bottle. In experiment 2, three of the odors (ginger cookies, crème brûlée, and candied orange) were presented on scent strips. The odors were used in the concentrations provided in the kits.

The auditory stimuli utilized in experiment 1 were identical to those used in several previous studies (Crisinel and Spence 2010, 2012a). They came from an online musical instrument samples database from the University of Iowa Electronic Music Studios (http://theremin.music.uiowa.edu/MIS.html, downloaded on 31 October 2009). They consisted of notes played by four types of instruments (piano, strings, woodwind, and brass). The pitch of the notes ranged from C2 (65.4 Hz) to C6 (1,046.5 Hz) in intervals of two tones. Thus, the participants had 52 different sounds (13 notes × 4 instruments) to choose from when selecting a sound to match to an odor. The sounds were edited to last for 1,500 ms and were presented over closed-ear headphones (Beyerdynamic DT 531) at a loudness of 70 dB.
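The structure of this stimulus set can be sketched in a few lines of code. The sketch below is illustrative only and rests on assumptions that are not stated in the text: equal temperament with A4 = 440 Hz, and the MIDI convention in which middle C (C4) is note 60, so that C2 is note 36 and C6 is note 84. The names `midi_to_hz`, `midi_notes`, and `stimuli` are hypothetical.

```python
# Sketch: rebuild the 13-note pitch set from MIDI note numbers.
# Assumptions (not stated in the text): equal temperament, A4 = 440 Hz,
# and the convention in which C4 (middle C) is MIDI note 60, so C2 = 36.

A4_MIDI = 69
A4_HZ = 440.0
INSTRUMENTS = ["piano", "strings", "woodwind", "brass"]


def midi_to_hz(midi_note: int) -> float:
    """Equal-temperament frequency of a MIDI note number."""
    return A4_HZ * 2 ** ((midi_note - A4_MIDI) / 12)


# C2 (36) to C6 (84) in steps of two whole tones, i.e. four semitones.
midi_notes = list(range(36, 85, 4))
frequencies = [round(midi_to_hz(m), 1) for m in midi_notes]

# Every instrument is crossed with every note.
stimuli = [(inst, m) for inst in INSTRUMENTS for m in midi_notes]

print(len(midi_notes), len(stimuli))
print(frequencies[0], frequencies[-1])
```

Under these assumptions, four octaves divided into steps of four semitones give 13 notes, and crossing them with the four instrument types gives the 52 sounds offered to the participants.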

The three musical soundtracks used in experiment 2 were designed by Laurent Assoulen to represent some of the aromas, partly through the use of different musical instruments (ginger cookies (strings), candied orange (harp), and crème brûlée (piano)). Each soundtrack lasted for 40 s.

Procedure

Experiment 1 was programmed in E-Prime (Version 2). The participants were first given the number of the sample that they were to sniff. After opening the glass bottle and sniffing its contents orthonasally, the participant had to choose a sound to match the smell. The sounds were presented on four scales corresponding to the four types of instruments. Pitch increased along the scales (horizontally), with the direction chosen randomly for each trial. The sounds could be heard by clicking on the scales. The participants were free to click on as many of the sounds as they wished before making their choice. After having made their response, the participants rated the brightness, complexity, and intensity of the odor on nine-point scales (anchored with “not at all” and “extremely so”). The participants also rated the odor on three nine-point bipolar scales: unpleasant–pleasant, relaxing–arousing, and sad–happy. The scales were presented one at a time in a random order. Finally, the participants had to try to identify the sample and note down their response on a sheet listing all sample numbers. The seven olfactory stimuli were presented once in a random order. The participants were free to sniff the sample as often as they wished during a trial.

In experiment 2, the participants were given a scent strip with one of the odors. They then listened to the three soundtracks and chose the one they thought best matched the odor. The same process was subsequently repeated with the two remaining odors.

Results

Pitch

A repeated-measures analysis of variance (ANOVA) was conducted in order to assess whether there were any differences between the average pitch matched to the odors. The results indicated that the odors affected the choice of pitch, F(6, 144) = 7.45, p < 0.001 (see Fig. 1a). Post hoc t tests (Bonferroni-corrected) revealed that the three odors associated with the lowest pitch (ginger cookies, musk, and roasted coffee) were significantly different from the two odors associated with the highest pitch (candied orange and iris flower).
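For readers who wish to see where the degrees of freedom come from, the following is a minimal sketch of a one-way repeated-measures ANOVA. It is not the analysis code used in the study: the data are randomly generated placeholders, no sphericity correction is applied, and the function name `rm_anova` is hypothetical. The count of 21 pairwise comparisons assumes that every pair of the seven odors was compared, which the text does not state.

```python
# Sketch of a one-way repeated-measures ANOVA (no sphericity correction).
# The data below are randomly generated placeholders, NOT the study's data.
import random


def rm_anova(data):
    """data[s][c] = score of subject s in condition c. Returns (F, df1, df2)."""
    n = len(data)        # number of subjects
    k = len(data[0])     # number of conditions (odors)
    grand = sum(sum(row) for row in data) / (n * k)
    cond_means = [sum(row[c] for row in data) / n for c in range(k)]
    subj_means = [sum(row) / k for row in data]

    # Partition the total variability into condition, subject, and error parts.
    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_error = ss_total - ss_cond - ss_subj

    df_cond = k - 1
    df_error = (n - 1) * (k - 1)
    f_value = (ss_cond / df_cond) / (ss_error / df_error)
    return f_value, df_cond, df_error


rng = random.Random(0)
# 25 hypothetical participants x 7 odors, with MIDI-coded pitch choices.
fake = [[rng.choice(range(36, 85, 4)) for _ in range(7)] for _ in range(25)]
f_value, df1, df2 = rm_anova(fake)
print(df1, df2)  # the degrees of freedom depend only on the design

# Assumption: all pairs of the seven odors enter the Bonferroni correction.
n_pairs = 7 * 6 // 2
bonferroni_alpha = 0.05 / n_pairs
```

With 25 participants and seven odors, the design yields 6 and 144 degrees of freedom, in agreement with the F ratio reported above.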

Fig. 1

Mean pitch matched to each odor in experiment 1 (a). MIDI (musical instrument digital interface) note numbers were used to code the pitch of the chosen notes. Western musical scale notation is shown on the right-hand y-axis. The odors represented by a black circle were significantly different from those represented by a white circle. Mean ratings on the shape scale (anchored with an angular and a rounded shape) for each of the seven odors in experiment 1 (b). Error bars represent the SEM. *p < 0.05

Types of Instruments

Chi-square tests for goodness of fit were conducted to determine which odors induced a distribution of instrument choice that was different from that expected by chance. Of the seven odors presented, four gave rise to significant preferences in the choice of instrument: candied orange (χ²(3, N = 25) = 10.04, p = 0.02), dried plums (χ²(3, N = 25) = 8.44, p = 0.04), iris flower (χ²(3, N = 25) = 8.44, p = 0.04), and musk (χ²(3, N = 25) = 9.08, p = 0.03). The piano was the preferred instrument for these odors, except for musk, which was mainly associated with brass instruments (see Fig. 2).
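The goodness-of-fit test used here can be sketched as follows, assuming that "chance" means an equal probability of choosing each of the four instrument types. The instrument counts in the sketch are made up purely for illustration and are not taken from Fig. 2; only the chi-square values passed to the p-value function at the end come from the text. The function names are hypothetical, and the closed-form tail probability is valid for three degrees of freedom only.

```python
# Sketch: chi-square goodness of fit against a uniform (chance) distribution.
import math


def chi_square_gof(observed):
    """Pearson chi-square statistic with equal expected counts per category."""
    n = sum(observed)
    expected = n / len(observed)
    return sum((o - expected) ** 2 / expected for o in observed)


def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


# Made-up counts for (piano, strings, woodwind, brass); they only sum to 25.
hypothetical_counts = [10, 6, 5, 4]
stat = chi_square_gof(hypothetical_counts)
p_value = chi2_sf_df3(stat)
print(round(stat, 2), round(p_value, 3))

# p values recomputed from the statistics reported in the text.
for reported_stat in (10.04, 8.44, 9.08):
    print(reported_stat, round(chi2_sf_df3(reported_stat), 3))
```

With four categories and 25 responses, each instrument type is expected to be chosen 6.25 times under chance, and the p values recomputed from the reported statistics round to the values given above.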

Fig. 2

Choice of instrument for each odor in experiment 1. The four odors on the left led to significant preferences in the choice of instrument by participants

Shape

A repeated-measures ANOVA was conducted in order to determine whether there were any differences between the average shape matched to each of the odors. The results indicated that the odors affected the choice of shape, F(6, 144) = 3.22, p < 0.005 (see Fig. 1b). Post hoc t tests (Bonferroni-corrected) revealed that the only significant difference in shape ratings was between the crème brûlée and the musk odors (p = 0.05).

Odor Ratings

Certain of the ratings seem to be correlated with both the choice of pitch and shape: Odors rated as happier, more pleasant, and sweeter tend to be associated with higher pitch and a more rounded shape. However, other ratings seem to be specifically correlated with the choice of either pitch or shape: Those odors that were rated as more arousing tended to be associated with the angular shape, but not with a particular pitch; odors judged as brighter were associated with higher pitch and, to a lesser extent, rounder shapes (Table 2).

Table 2 Correlations between the various ratings and the choice of pitch and shape in experiment 1

Matching

Chi-square tests for goodness of fit were conducted to determine which odors induced a distribution of soundtrack choices that was different from that expected by chance. Of the three odors presented, two gave rise to significant preferences in the choice of soundtrack: candied orange (χ²(2, N = 44) = 21.59, p < 0.001) and ginger cookies (χ²(2, N = 43) = 6.88, p = 0.03). While the preferred soundtrack for the odor of candied orange was actually the one designed to match it, this was not the case for the odor of ginger cookies, which was matched most often to the crème brûlée soundtrack (see Fig. 3).
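With three soundtracks, the test has two degrees of freedom, in which case the upper-tail probability of the chi-square distribution reduces to exp(−x/2). The short sketch below uses only the statistics and sample sizes reported in the text and assumes that chance corresponds to each soundtrack being chosen equally often; the names `chi2_sf_df2` and `reported` are hypothetical.

```python
# Sketch: chance-level expectations and p values for the soundtrack matching.
import math


def chi2_sf_df2(x: float) -> float:
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom."""
    return math.exp(-x / 2)


# Statistics and sample sizes as reported in the text.
reported = {"candied orange": (21.59, 44), "ginger cookies": (6.88, 43)}

for odor, (stat, n) in reported.items():
    expected_per_soundtrack = n / 3  # assumed chance level: equal choice
    print(odor, round(expected_per_soundtrack, 2), chi2_sf_df2(stat))
```

Under this assumption, roughly 14 to 15 choices per soundtrack would be expected by chance for each odor.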

Fig. 3

Choice of soundtrack for each odor in experiment 2

Discussion

The results of the present study build on the recent findings reported by Crisinel and Spence (2012a) and Hanson-Vaux et al. (2013). These studies demonstrated reliable cross-modal correspondences between specific odors and parameters of music (namely pitch and instrument class) and shape (angularity). The present results suggest both similarities and differences in the way in which odors are associated with pitch and shapes. Stimuli judged as happier, more pleasant, and sweeter tend to be associated with both higher pitch and a more rounded shape. This is somewhat surprising considering that the angular shape used in O’Boyle and Tarte’s (1980) study was associated with a higher pitch than a circle or ellipse (note, however, that O’Boyle and Tarte obtained contradictory results with another rounded shape). However, this result accords well with the results of a study by Lyman (1979). He demonstrated that a rounded shape was preferred to represent the concepts of calmness, happiness, and goodness.

Other ratings seem to be more specifically correlated with the choice of either pitch or shape: Odors rated as more arousing tend to be associated with the angular shape, but not with a particular pitch; odors judged as brighter were associated with higher pitch and, to a lesser extent, rounder shapes. These results therefore further support a more complex model of cross-modal correspondences, according to which various elements, such as a matching of perceptual dimensions but also the emotional similarity of stimuli, explain subtle variations in olfactory correspondences (see Deroy et al. 2013). They certainly suggest that a simpler model with all of the dimensions aligned (Walker et al. 2013) is not appropriate when it comes to odors. Odor–pitch and odor–shape matchings seem to be neither completely independent from one another, nor the same. For example, whether an odor is perceived as arousing will influence its correspondence to a shape, but not to pitch. Thus rather than an alignment, a network of correspondences, with different strengths/weights, might be a better way to represent cross-modal correspondences.

Note that this does not mean that conceptual or linguistic factors play no role in these more complex mappings. Such factors may explain (at least in part), for instance, why odors that make one feel happier or “high” are associated with higher pitch (see Spence 2011a for a discussion; and Lakoff and Johnson 2003, for this specific metaphor).

The strong correlations between emotional ratings (pleasantness, happiness, and arousal) and the choice of pitch and shape seem to indicate that emotional dimensions might need to be taken into account when trying to explain the matching between odors and other sensory dimensions. The results of the research outlined here therefore suggest that emotional/affective mapping of stimuli may be an additional class of cross-modal correspondence, one that was not really emphasized by Spence (2011a, b) in his review. It might come as no surprise that the role of emotions is particularly clear in cross-modal correspondences involving odors, as the hedonic value is reported to be a salient (or even the only, see Yeshurun and Sobel 2010) psychological dimension of odors (Berglund et al. 1973; Chrea et al. 2009; Schiffman et al. 1977; Zarzo 2008).

Rather than being seen as a two-dimensional alignment of sensory dimensions, cross-modal correspondences might be best explained by a multidimensional network. Such a network could include strong cross-modal correspondences, such as those found in the natural environment (e.g., pitch-size), but also others that might be culture-specific or even individual-specific (see Ernst 2007; Shankar et al. 2010). When asked to make unusual matchings, participants may try to find a path within the network to connect the two stimuli, for example, by matching the pleasantness of the two stimuli. Determining how these paths are chosen and whether they vary between cultures or even depending on the context in which the task is presented will be fascinating questions for future research in this area.

The mixed results of the cross-modal matching of soundtracks to odors underline the difficulty of composing music that the majority of people will associate with specific odors. The intuition of composers seems not to be sufficient here. Using sophisticated algorithms might provide a solution in the future, but for now such algorithms have only been developed for the basic tastes (Mesz et al. 2012). It might be that the associations between tastes/flavors or odors and music are not specific enough to represent an individual stimulus and could thus only be used to evoke broad categories such as basic tastes. Even if that were to be the case, cross-modally congruent music could still be used in marketing (e.g., in advertising jingles) to evoke a product (or product attribute), together with other parameters such as shape, color, the texture of the packaging, etc. Cross-modal correspondences might thus be used more fruitfully in a marketing context when combined with each other. That is, the combined use of shape, color, texture, and music might evoke another sensory characteristic of a product, such as its smell, with more precision than any element used on its own.

The paradigm outlined here could, in the future, be extended to a range of other olfactory stimuli (think of the complex aromas of coffee, whiskey, wine, or perfumes) in order to design cross-modally corresponding soundtracks (e.g., for marketing/advertising; Spence 2012b; Crisinel et al. 2012), as well as to determine other cross-modally congruent features in a range of modalities that could improve the overall experience of the product. One could also imagine testing other sets of musical compositions in order to see whether people find it easier to match them to the aromas used here.