Introduction

The effect of language on odor perception and processing has been a matter of scientific debate for decades. The general role of verbal processes in olfaction remains controversial (for a review, see Olofsson and Gottfried 2015) and is not yet fully understood. Meanwhile, it has been demonstrated repeatedly that verbal context information has a considerable impact on odor evaluations: Verbal cues bias pleasantness ratings (Bensafi et al. 2007; Distel and Hudson 2001; Djordjevic et al. 2008; Herz 2003; Herz and Clef 2001; Lorig and Roberts 1990; Lundström et al. 2006; Moskowitz 1979; Rolls et al. 2003) as well as quality evaluations (Herz and Clef 2001; Stevenson and Mahmut 2013), and brain activation varies with the label presented with an odor (Araujo et al. 2005; Lorig and Roberts 1990; Lundström et al. 2006). Studies in this research area have usually paid particular attention to language effects triggered by overtly available verbal or visual cues. Can comparable effects be found in the absence of explicit source information in the perceptual context? Herz (2005) took a clear position on this question and proposed two different mental processes, or more specifically, a dual–coding hypothesis: When a verbal identifier is present, odor processing is strongly language mediated. Without one, odors are processed in a sensation–driven manner. On this view, olfactory perception neither depends on verbal coding nor does an odor automatically trigger verbal equivalents. This assumption has been substantiated by works that have repeatedly demonstrated (1) a poor naming ability for even familiar odors (Cain 1979; Cain and Potts 1996; Cain et al. 1998; Desor and Beauchamp 1974; Wijk and Cain 1994a, b; de Wijk et al. 1995) and (2) that odor recognition is unaffected by access to verbal labels (Ayabe-Kanamura et al. 1997; Lehrner 1993; Rabin and Cain 1984). 
However, despite our difficulties in odor naming, smell sensations are expressed as a feature of an odorous object in most languages (“the smell of…” or “smells like…”) rather than as a discrete, object–independent sensation like, for example, a color (Berglund and Höglund 2012; Holley 2002; Majid 2015; Majid and Burenhult 2014; de Wijk et al. 1995). Several authors have argued that a central, if not the major, function of odor perception may be the determination of the source that emanates a specific smell (Auvray and Spence 2008; Gibson 1966; Holley 2002; Sugiyama et al. 2006). When source information is not readily available from the perceptual context, it may be actively retrieved from memory as well. That is, (1) perceptual processing may be regularly accompanied by verbal–semantic processing and (2) these mechanisms of odor naming may occur spontaneously, triggered by an olfactory sensation itself. These assumptions are in line with empirical findings from several areas in olfactory research:

  1. Odor classifications: Language effects have been found in empirical evaluations of odors despite missing source cues. In classification studies, odors from the same lexical category (fruits, flowers) have been arranged together regardless of apparently dissimilar sensory codes (Ayabe-Kanamura et al. 1998; Carrasco and Ridout 1993; Prost et al. 2001; Seo et al. 2011; Urdapilleta et al. 2006) and cultural differences in odor arrangements have complied with culture–specific uses of odors (Chrea et al. 2005; Chrea et al. 2004; Ueno 1993). Remarkably, these identification effects have been found although participants were neither provided with information on an odor’s source nor instructed to name the presented smells.

  2. Crossmodal associations between olfaction and vision: Several studies have shown that language mediates crossmodal associations between odors and stimuli of other sensory modalities, specifically vision. That is, expectations about an odor’s identity have affected associations to colors and shapes (Dematte et al. 2006; Gilbert et al. 1996; Jacquot et al. 2016; Kaeppler 2018; Maric and Jacquot 2013, 2013; Spector and Maurer 2008; Zellner et al. 2008). Interestingly, this effect has not been found in languages like Maniq, Malay, or Thai that use a more comprehensive and abstract (rather than source based) vocabulary to describe odors (Levitan et al. 2014; Valk et al. 2017) and whose speakers have fewer problems in naming odors correctly (Majid 2015; Majid and Burenhult 2014).

  3. Mental processing of odors: Olofsson and Gottfried (2015) proposed that source object representations are established at an early stage of olfactory processing, presumably even ahead of valence encoding. In a series of studies, they demonstrated that behavioral responses are slower when decisions are based on accessing the valence of an odor compared to odor object features (Olofsson 2014; Olofsson et al. 2013).

Taken together, these findings imply that odor sensations may elicit a verbal referent or, more generally, an object representation of an assumed source. Although this representation is (veridically) incorrect in many cases, it may still affect the perceptual process and odor evaluations. Thus, we assume that subjects build hypotheses about an odor’s identity and that these assumptions, whether correct or incorrect, bias perception and shape odor evaluations. To test this hypothesis, we adopted an approach used repeatedly to investigate the nature of mental odor representations: the comparison of evaluations of odor samples and mentally visualized odors (Breckler and Fried 1993; Carrasco and Ridout 1993; Chrea et al. 2005; Herz 2003). Applying this approach, previous studies have usually found differences between odor ratings and odor label ratings. These dissimilarities have been considered evidence for the sensation–driven processing of odors when verbal cues are absent and have thus bolstered Herz’s assumption of dual coding. Remarkably, language effects caused by spontaneous odor identifications could be found in the results of all odor–label studies. False identifications may cause incongruity between the rating of an odor and the rating of its correct label. Hence, the results of previous studies could have been affected by a comparison of apples and oranges, when participants actually evaluated (imagined) odors based on falsely generated labels. The differences may therefore have arisen not from different processing modalities (sensory versus verbal) but from different smells being rated. In the present study, we investigated whether the disparities in the way people rate odors and odor labels can be attributed to identification mechanisms. 
Specifically, we aimed to probe if evaluations of odors and equivalent (rather than correct) odor labels are similar—simply by matching each odor to both its true label and the name ascribed to it by each participant, in case it was identified incorrectly. In an odor condition, subjects were asked to rate 20 odorants on a 40–item attribute list (evaluation task) as well as on five perceptual dimensions (perceptual rating task) and to provide a verbal label for each odor presented (naming task). In a subsequent imagery condition, the same participants performed the evaluation task and rating task on a set of written verbal odor labels. For each participant, this set comprised both the correct names of the stimuli presented in the odor condition (true labels) and the labels generated by them in the naming task (identified labels). We compared the ratings of each odor to both its true and its identified label in order to evaluate the potential differences between objectively and subjectively matching odor–label pairs. If the assumption about an odor’s source affects odor evaluation as we expected, we should find a better agreement between ratings of an odor and its identified label than between an odor and its true label if it has not been associated with this odor before.

Material and Methods

Material

Odorants

We wanted to assess the impact of identification mechanisms. Thus, 20 common odorants were selected to cover a broad range of familiarity and identifiability (Cain et al. 1983; Chrea et al. 2009; Doty et al. 1984; Fornazieri et al. 2010; Hummel et al. 1997; Nordin et al. 1998) as well as different levels of typicality for an equivalent semantic category (Bueno and Megherbi 2009; Storms et al. 2001; van Overschelde et al. 2004). The set was not meant to represent the human olfactory space comprehensively. Nevertheless, we paid attention to the impact the odor selection would have on the scope of our results (for a review, see Crisinel et al. 2012). Odors and abbreviations used in the text are listed in Table 1. The majority of odorants was supplied as liquid solutions by Symrise (Holzminden, Germany). For COC, LAV, and VIO, natural aromatic oils were used (Aromell, Germany). Odors were presented in white pen–like devices that carried a cotton swab soaked with the diluted odorant. Pens were coded by a random two–digit number.

Table 1 Odorants used in the odor condition

Attribute List

The selection of attributes in verbal profiling approaches of odors has often been arbitrary. Usually, word lists have been derived from expert literature and applied with untrained subjects, who likely understood and used the terms differently (Lawless 1984; Solomon 1990, 1997). At the same time, an approach that tries to capture natural language necessarily has to build on an insufficient and predominantly source–based olfactory vocabulary, at least in Western languages (Majid and Burenhult 2014). We therefore applied a twofold method to derive a meaningful set of verbal descriptors in a systematic approach: We initially collected a comprehensive list of odor-related terms used by experts and in olfactory research and eventually applied a subset that was informative to untrained subjects.

An extensive literature review including odor classification studies (Coxon et al. 1978; Cunningham and Crady 1971; Dalton et al. 2008; Dravnieks 1985; Higuchi et al. 2004; Pilgrim and Schutz 1957; Prost et al. 2001; Zarzo 2008), odor profiles, and fragrance catalogs (Arctander 1969; Boelens and Haring 1981; Sigma-Aldrich Company 2011; Thiboud 1991) yielded a preliminary list of 414 English terms: These referred to odor sources (n = 252), non–olfactory qualities (n = 122), olfactory qualities (n = 15), effects (n = 11), hedonic qualities (n = 9), and perceptual context (n = 5). An overview of all terms and their sources is available in Online Resource 1. As all study parts were conducted in the subjects’ native language German, the complete list was translated into German and randomly divided into two subsets. Using an online questionnaire, these sets were presented to 100 participants each. Subjects were asked to rate the applicability of each attribute for describing the perceptual quality of an unspecified odor on a five-point scale (not at all applicable–very applicable). Ninety-six of these terms were judged as relevant by the majority of the subjects (selection criteria: rated with 4 or 5 by at least 50% of the respondents and rated with 1 or 2 by not more than 20% of the respondents) and further consolidated. (1) In order to weight different classes of characteristics equally, terms referring to odor sources with a common core characteristic were replaced by a single term. For example, lemon, lime, grapefruit, orange, mandarin, citrus, and fruity (citrus) were substituted by the term citrus. (2) Terms that referred to very specific odor sources (wet dog, fried chicken) were replaced by the most distinctive feature. (3) Terms that represented odor qualities not exemplified by the odor set of the main study were removed. This resulted in a final list of 40 attributes (Table 2). 
Despite thorough preliminaries, the set ultimately had two shortcomings: It still included a number of terms that indicated odor sources or referred to a perceptual context. Further, it was rather complex and potentially difficult to handle for untrained subjects in the main study. We found no principled basis for consolidating the list further. Therefore, we decided to apply the list as a compromise between a comprehensive set of expert terms and a sample of odor–related vocabulary found in Western cultures, corrected for the common predominance of crossmodal and source references.
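
The selection criterion used in the pretest can be illustrated with a minimal Python sketch. It assumes that each term comes with a list of five-point applicability ratings; the function names and the example ratings are hypothetical and are not taken from the study data.

```python
from typing import Dict, List


def is_relevant(ratings: List[int],
                min_high: float = 0.5,
                max_low: float = 0.2) -> bool:
    """Apply the pretest rule to one term.

    A term is kept if at least `min_high` of the respondents rated it
    4 or 5 and at most `max_low` of the respondents rated it 1 or 2.
    """
    n = len(ratings)
    if n == 0:
        return False
    share_high = sum(1 for r in ratings if r >= 4) / n
    share_low = sum(1 for r in ratings if r <= 2) / n
    return share_high >= min_high and share_low <= max_low


def select_terms(pretest: Dict[str, List[int]]) -> List[str]:
    """Return all terms that pass the rule, in their original order."""
    return [term for term, ratings in pretest.items() if is_relevant(ratings)]


# Hypothetical ratings from ten respondents per term (illustration only).
pretest = {
    "citrus": [5, 4, 4, 3, 5, 4, 2, 5, 4, 3],    # 70% high, 10% low
    "metallic": [4, 5, 1, 2, 1, 4, 4, 5, 3, 4],  # 60% high, 30% low
    "round": [1, 2, 3, 3, 2, 1, 4, 3, 2, 3],     # 10% high
}
selected = select_terms(pretest)
```

In this toy example only "citrus" satisfies both thresholds; "metallic" is rated highly by a majority but is also rejected by more than 20% of the respondents.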

Table 2 Attributes applied in the evaluation task

Procedure

Participants underwent two experimental sessions that were separated by approximately 2 weeks. Both conditions were conducted in the same well–ventilated room on university campus.

Odor Condition

Participants were instructed to place each odor pen under their nostrils at a distance of approximately 0.5 in. and smell the odor by breathing normally. The presentation order of odors was fully randomized for each subject. Odors were presented one at a time with a break of at least 90 s between two odors. Each session lasted approximately 120 min. Answers were recorded using a computer–based questionnaire. Participants performed three tasks on each of the 20 odorants.

Evaluation Task

Subjects were instructed to rate each odor against a 40–attribute list using a nine–point rating scale (“How applicable is each term to describe the odor?” not at all applicable–very applicable). Attributes were arranged randomly for each odor.

Naming Task

First, participants rated the familiarity of each odor on a nine–point rating scale (not at all familiar–very familiar). They were then asked to freely identify each odor by providing the most accurate source name and to judge the certainty of their answer on a nine–point rating scale (“How certain do you feel in having identified the correct odor source?” not at all–very certain). Subjects were not forced to produce a source name. If a participant could not provide a label for a given odor, this case was classified as misidentified without label.

Perceptual Rating Task

Finally, respondents assessed each odor on perceptual dimensions with high descriptive ability (Moss et al. 2016) using a nine–point rating scale: (1) intensity (low–high), (2) pleasantness (very unpleasant–very pleasant), and (3) edibility (not at all edible–edible).

Participants went through the tasks in this order (evaluation–naming–perceptual rating) and completed any given task for each of the 20 odorants before receiving instructions for the subsequent task. Note that subjects performed the evaluation task for all odors prior to the naming task and the perceptual rating task, i.e., attribute ratings were unaffected by an explicit demand to name the presented odorants.

Imagery Condition

In the imagery condition, participants were asked to rate a set of written verbal terms, each referring to a specific odor source. This set of terms was composed of 20 labels indicating the actual source of each odor (true label) plus up to 20 terms generated by each respondent in the naming task of the odor condition (identified labels). That is, for each participant the label set was composed individually in order to compare odor ratings and label ratings for both objectively and subjectively matching samples. When, for a given odor, the identification in the naming task was correct, nothing but the true label was presented. When, however, a subject misidentified the source, both the true and the identified source name were presented (separately). If, for example, a participant was able to name all odors correctly, the label set consisted of 20 true labels. If, on the contrary, a subject misidentified each odor, the label set consisted of 20 true labels and 20 identified (but incorrect) labels. If a participant could not produce a label for a particular odor, it was classified as misidentified without label. As an identified label was missing, only a true label could be presented in the imagery condition. These cases were considered accordingly in the data analysis.

Odor names were presented in writing at the top of a computer–based rating form. Respondents were instructed to imagine each odor as vividly as possible.

Evaluation Task

Subjects were instructed to rate each odor against the 40–attribute list using a nine–point rating scale. The order of attributes was fully randomized for each label. Labels were evaluated one after another in a fully randomized order.

Perceptual Rating Task

Additionally, respondents were asked to rate the odor specified by each label on intensity, pleasantness and edibility using a nine–point rating scale. Labels were evaluated one after another in a fully randomized order.

Participants went through both tasks in this order (evaluation–perceptual rating) and completed the evaluation of all labels before receiving instructions for the perceptual rating task.

The whole experimental procedure for each participant is illustrated in Fig. 1.

Fig. 1 Experimental procedure

Participants

In total, 56 participants (41 women; mean age = 21.73, SD = 2.64) were recruited from Leuphana University of Lüneburg and participated for course credit. They were tested individually. All respondents reported a normal sense of smell; they were free of respiratory infections or allergies at the time of testing. Participants were instructed not to use perfume, body lotions or odorous cosmetics on the day of testing, not to eat intensely spiced foods and not to smoke 1 h prior to the experimental session. Subjects neither had previous experience in olfactory testing nor were trained in odor evaluation or identification. The experiment was conducted in their native language German. Subjects provided verbal informed consent. They were informed that the experiment aimed to study odor perception and evaluation in general. At the end of the second session, subjects were fully debriefed. The study was conducted according to the Declaration of Helsinki–Ethical Principles for Medical Research Involving Human Subjects and approved by the Ethics Committee of Leuphana University of Lüneburg.

Results

Based on the naming accuracy in the naming task, we distinguished three odor–label pairings to assess the agreement between odor ratings and label ratings as a function of naming accuracy: ideal, congruent, and incongruent pair (Table 3).

Table 3 Possible pairings of odor and odor label between odor and imagery condition

Ideal pair: An odor was identified correctly in the naming task. For this odor, only the true label was presented in the imagery condition. Hence, odor ratings were compared to the ratings of a label that was not only correct, but had also been associated with this odor by the subject before. For example: presentation of PEA in odor condition, correct identification as “peanut” by the subject, presentation of odor label “peanut” in the imagery condition.

Ideal pairs required correct odor identification in the naming task. If, however, an odor was identified incorrectly, the subject was presented with two different labels in the imagery condition: the wrong label produced by the subject as well as the true label.

Congruent pair: An odor was identified incorrectly in the naming task. The identified label was presented in the imagery condition. For example: presentation of PEA in odor condition, identification as “chocolate” by the subject, presentation of odor label “chocolate” in the imagery condition. Although they were objectively different, odor and odor label subjectively referred to the same thing.

Incongruent pair: An odor was identified incorrectly in the naming task. The true label was presented in the imagery condition. For example: presentation of PEA in odor condition, identification as “chocolate” by the subject, presentation of odor label “peanut” in the imagery condition. Odor and odor label factually refer to the same thing, but may still not match for the subject that named the odor differently.

Usually, each incongruent pair had a corresponding congruent pair. If, however, a participant could not produce a label for a specific odor, this odor was classified as misidentified without label. In this case, only a true label could be presented in the imagery condition, although this had not been associated with the odor before. Odor and label made up an incongruent pair; a matching congruent odor–label pair was missing.
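
The pairing rules above can be summarized in a short Python sketch. It is an illustration only: the function name is hypothetical, and correctness of an identification is approximated here by a case-insensitive string match, whereas the study does not describe an automated coding procedure.

```python
from typing import List, Optional, Tuple


def label_pairs(true_label: str,
                named_label: Optional[str]) -> List[Tuple[str, str]]:
    """Return (pair type, label shown in the imagery condition) tuples
    for one odor and one participant.

    Assumption: a name counts as correct if it equals the true label
    after trimming whitespace and ignoring case.
    """
    name = (named_label or "").strip()
    if name.lower() == true_label.strip().lower():
        # Correct identification: only the true label is presented.
        return [("ideal", true_label)]
    # Misidentification: the true label forms the incongruent pair.
    pairs = [("incongruent", true_label)]
    if name:
        # A self-generated (incorrect) label forms the congruent pair.
        pairs.append(("congruent", name))
    # With no label at all, the congruent pair is simply missing.
    return pairs


correct = label_pairs("peanut", "Peanut")
wrong = label_pairs("peanut", "chocolate")
unnamed = label_pairs("peanut", None)
```

For the PEA example, a participant who answers "chocolate" contributes one incongruent pair (odor with "peanut") and one congruent pair (odor with "chocolate"), while a participant who gives no name contributes the incongruent pair only.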

Perceptual ratings of familiarity, intensity, pleasantness, and edibility of an odor were treated as metric data (cf. Seo et al. 2008). Attribute ratings were analyzed on an ordinal level as they represent the degree of appropriateness of a specific term in describing an odor quality. With respect to correlations, we calculated Spearman’s coefficients as the data were not normally distributed (details may be requested from the author). Testing revealed no significant gender differences. Hence, data were collapsed across gender. We applied an exploratory step-by-step approach to the data. We first analyzed the whole data set and subsequently focused on subsets based on identification accuracy or ambiguity. All statistical analyses were conducted with SPSS Statistics (version 24.0) for Windows.
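
For readers who want to reproduce the rank correlations outside SPSS, the following is a minimal Python sketch of Spearman's coefficient using only the standard library. It assumes the common definition as the Pearson correlation of average ranks (ties share their mean rank); the function names and the example ratings are hypothetical.

```python
import math
from statistics import mean
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n; tied values receive their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho as the Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y need the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical nine-point ratings of one odor and its label.
odor_ratings = [7, 3, 8, 5, 2, 6]
label_ratings = [6, 4, 9, 5, 1, 7]
rho = spearman(odor_ratings, label_ratings)
```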

Identifications

Rates of correct identification varied considerably across odors. Overall, odorants were identified correctly in 37.59% of the cases (across all odorants and subjects). Given that an incorrect label might still be reasonably close to the accurate source name, we adopted a scheme proposed by Cain (1979). Following this approach, we further categorized inaccurate labels into near misses, that is, names of substances that are perceptually or semantically similar to the true odor source (melon for pineapple), and far misses, that is, vague category labels (fruit for pineapple) or evidently incorrect labels (glue for pineapple). Near misses may be treated as correct identifications (Sulmont-Rosse 2005), which leads to a remarkable increase in correct identifications overall (58.75%) and especially for several odors like ANI, CIN, ISO, MUS, PAT, ROS, and TUR (Fig. 2).

Fig. 2 Identification rates: correct identification (dark gray bars), near miss (light gray bars), and far miss (shaded bars)

Note that with respect to odor–label pairs, both near and far misses were treated as incorrect identification.

A significant positive relationship was found between odor familiarity and naming certainty (rs = 0.732, p < 0.01). Subjects were more confident in finding the correct name for odors that appeared familiar to them (Online Resource 2).

Perceptual Ratings

We first compared odors and odor labels by directly matching the ratings for odor samples and associated true labels, as has been done in previous studies (Bonfigli et al. 2002; Breckler and Fried 1993; Chrea et al. 2005). Odor–label agreements for intensity, pleasantness, and edibility ratings were analyzed by a Mann–Whitney U test. Scores on all dimensions were found to vary considerably between perceived and imagined odors: Intensity ratings differed significantly for 11 of 20 odor–label pairs, pleasantness ratings for 16, and edibility ratings for 15 (Online Resource 3). These results match the findings of previous studies that applied the same approach (Bonfigli et al. 2002; Breckler and Fried 1993; Chrea et al. 2005; Herz 2003). Interestingly, highly significant differences between odor and odor label ratings were especially found for ambiguous odors, i.e., odors that were repeatedly matched with one of two very dissimilar source labels across participants (for example: PIN was identified as fruit in 30 cases, and as cleanser or solvent in 14 cases).

Ambiguous Odors

We specified five odors as ambiguous (PIN, ELD, ISO, LAV, MUS) and further analyzed these samples to assess whether high odor–label dissimilarities could be rooted in different assumptions about an odor’s source. Indeed, pleasantness as well as edibility ratings were significantly higher (p < 0.001) for PIN when identified as a fruit (n = 30) than as cleanser or solvent (n = 14). A comparable pattern of significant differences was found for ELD, ISO, LAV, and MUS for pleasantness as well as edibility ratings (Table 4).

Table 4 Mean scores for pleasantness and edibility ratings in odor condition for ambiguous odors (Mann–Whitney U test)

Note that these perceptual ratings followed the naming task. That is, the explicitly requested source labels biased these evaluations. By contrast, the evaluation of all odors on the 40 attribute terms preceded odor naming. However, if odor naming is triggered spontaneously, identification mechanisms would affect these ratings as well. For each ambiguous odor, the attribute ratings were contrasted between the two assigned labels. Results of this comparison are shown in Table 5. Differences were not found for all 40 terms, but for those descriptors that one would reasonably expect to differ between the two ambiguous odor labels. For example, ISO identified as candy should not differ from ISO identified as solvent on attributes like baked, bitter, burnt, earthy, etc. At the same time, both cases should differ on alcoholic, balsamic, fresh, solvent–containing, etc.

Table 5 Median of attribute ratings in odor condition for ambiguous odors (Mann–Whitney U test)

That is, the same odor was rated significantly differently on quality attributes when it was identified differently, although the rating task preceded the naming task.

The directions of these differences deviated from those we had expected in only two cases for LAV, where lavender received lower ratings on aromatic and higher ratings on alcoholic than disinfectant.

Odor-Label-Associations for Perceptual Ratings

In contrast to previous studies, we calculated odor–label agreements separately for ideal, congruent, and incongruent pairs. If differences between odor and odor–label evaluations are rooted in identification mechanisms, then false identifications should result in a better agreement for pairs of odor and identified label (i.e., congruent) than for pairs of odor and true label (i.e., incongruent). Spearman’s correlation coefficients for intensity, pleasantness, and edibility ratings for each of the 20 odorants are shown in Table 6.

Table 6 Spearman’s correlation coefficients for odor and label ratings sorted by pairings

Across all pairings, positive correlations were found for the majority of cases; 49.12% of these correlations reached significance. Average correlation coefficients were calculated by transformation of correlation scores to Fisher’s Z values; 95%–confidence intervals were calculated for each score in order to compare correlations pairwise (congruent–incongruent, congruent–ideal).
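
The averaging step can be sketched in Python as follows. This is an illustration under stated assumptions, not the SPSS procedure used in the study: correlations are averaged with equal weights in Fisher's Z space, and the confidence interval uses the textbook standard error 1/sqrt(n − 3), which is only an approximation for Spearman coefficients. Function names and example values are hypothetical.

```python
import math
from typing import List, Tuple


def mean_correlation(rs: List[float]) -> float:
    """Average correlations via Fisher's Z: transform, average, back-transform."""
    if not rs:
        raise ValueError("at least one correlation is required")
    zs = [math.atanh(r) for r in rs]   # Fisher's Z transformation
    return math.tanh(sum(zs) / len(zs))


def fisher_ci(r: float, n: int, z_crit: float = 1.96) -> Tuple[float, float]:
    """Approximate 95% confidence interval for a correlation based on n pairs.

    Assumption: standard error 1 / sqrt(n - 3) in Z space.
    """
    if n <= 3:
        raise ValueError("n must be larger than 3")
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


# Hypothetical per-odor correlations for one pairing type.
example_rs = [0.62, 0.48, 0.71, 0.55]
r_mean = mean_correlation(example_rs)
lower, upper = fisher_ci(r_mean, n=30)
```

Two mean correlations can then be compared by checking whether their intervals overlap, which is a conservative reading of the pairwise comparison described above.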

For all perceptual ratings, mean correlations for congruent pairs (intensity, rmean = 0.382; pleasantness, rmean = 0.593; edibility, rmean = 0.644) were significantly (p < 0.05) higher than for incongruent pairs (intensity, rmean = 0.237; pleasantness, rmean = 0.212, edibility, rmean = 0.184). At the same time, no significant differences were found between congruent and ideal (intensity, rmean = 0.421, pleasantness, rmean = 0.529, edibility, rmean = 0.544) odor–label pairs.

Interestingly, this pattern could be found for intensity ratings as well, though it was less pronounced than for pleasantness and edibility. One might question whether an untrained subject, or anybody, is able to rate the intensity of an imagined odor and whether the consistencies found are rooted in the congruence between odor and odor label. Identification mechanisms can help to understand these findings: When participants inferred the intensity of an odor from what they believed the source of the odor to be (in odor and imagery condition alike), an intensity rating may not reflect the actual (or imagined) strength of a smell. It may rather reveal associations of pleasantness or familiarity with an odor source that in turn influence intensity evaluations (Ayabe-Kanamura et al. 1998; Distel et al. 1999; Distel and Hudson 2001; Doty 1975; Henion 1971; Hudson and Distel 2002; Moskowitz et al. 1976; Royet et al. 1999; Sulmont et al. 2002).

Odor-Label-Associations for Attribute Ratings

To assess odor–label agreements for attribute ratings, Euclidean distances between odors and related labels were calculated across all 40 attributes, separately for each odor sample. For this purpose, attribute values were converted to normalized rank scores (between 0 and 1) that can be treated as interval–scaled data. The calculated Euclidean distances were again re–scaled into a 0–1 range, where a score of 0 indicates identical ratings of an odor and its corresponding label across all attributes and 1 indicates the maximum difference in ratings. Euclidean distance scores were averaged for ideal, congruent and incongruent pairs, respectively (Table 7).
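
One way to compute such a double-scaled distance is sketched below in Python. The exact normalization used in the study is not specified, so the sketch makes two assumptions: nine-point ratings are mapped linearly onto 0–1, and the Euclidean distance is divided by its theoretical maximum, the square root of the number of attributes. The function names are hypothetical.

```python
import math
from typing import Sequence


def normalize(rating: int, low: int = 1, high: int = 9) -> float:
    """Map a rating on the low..high scale onto the 0-1 range (assumed linear)."""
    return (rating - low) / (high - low)


def scaled_distance(odor: Sequence[int], label: Sequence[int]) -> float:
    """Euclidean distance between two attribute profiles, rescaled to 0-1.

    0 means identical ratings on every attribute; 1 means the maximum
    possible difference on every attribute.
    """
    if len(odor) != len(label) or len(odor) == 0:
        raise ValueError("profiles must be non-empty and of equal length")
    squared = sum((normalize(a) - normalize(b)) ** 2
                  for a, b in zip(odor, label))
    # The largest possible distance for k attributes is sqrt(k).
    return math.sqrt(squared) / math.sqrt(len(odor))


# Hypothetical 40-attribute profiles (illustration only).
odor_profile = [1, 5, 9, 3] * 10
label_profile = [2, 5, 7, 3] * 10
distance = scaled_distance(odor_profile, label_profile)
```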

Table 7 Double-scaled Euclidean distances of attributes ratings averaged across all subjects

As expected, averaged distances were found to be significantly smaller for congruent (M = 0.219, SD = 0.026) than for incongruent odor–label pairs (M = 0.251, SD = 0.042), t(19) = − 4.57, p < 0.001. A significant difference was also found between ideal (M = 0.174, SD = 0.047) and incongruent odor–label pairs, t(19) = − 5.09, p < 0.001.
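
The reported degrees of freedom (19 for 20 odors) are consistent with a paired comparison across odors. Under that assumption, the test statistic can be sketched in Python as follows; the function name and the example values are hypothetical, and the p value would still have to be looked up in a t distribution.

```python
import math
from statistics import mean, stdev
from typing import Sequence, Tuple


def paired_t(x: Sequence[float], y: Sequence[float]) -> Tuple[float, int]:
    """Return the paired-samples t statistic and its degrees of freedom."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y need the same length (at least 2)")
    diffs = [a - b for a, b in zip(x, y)]
    sd = stdev(diffs)                      # sample standard deviation
    if sd == 0:
        raise ValueError("t is undefined when all differences are equal")
    n = len(diffs)
    t = mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1


# Hypothetical mean distances for four odors (congruent vs. incongruent).
congruent = [0.21, 0.20, 0.24, 0.22]
incongruent = [0.25, 0.23, 0.27, 0.22]
t_value, df = paired_t(congruent, incongruent)
```

A negative t value in this setup indicates smaller distances, that is, better agreement, for congruent than for incongruent pairs.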

Discussion

The major aim of the study was to assess (1) whether subjects build hypotheses about an odor’s identity and (2) whether these assumptions, correct or incorrect, bias perceptual processes. We specifically investigated how differences between odor ratings and odor–label ratings might be attributed to odor identification mechanisms. We found a generally better agreement between the evaluation of an odor and its identified label than between the evaluation of an odor and its true (yet not associated) label. Our results indicate that basic perceptual as well as attribute–related odor ratings are affected by the mental image of an odor. Further, this mental image is probably built upon an explicit request and a spontaneous identification attempt alike.

More specifically, our findings provide an alternative explanation for the results of previous studies that have applied a comparable approach and substantiated Herz’s idea of dual coding in olfaction (Herz 2003, 2005). Herz proposed that people willingly gather information on an odor’s source from the external context and that these cues exert a considerable influence on mental odor processing. They may be equally willing to retrieve this information from memory with the help of self–generated source labels and without an explicit invitation to do so. Although these labels are incorrect in many cases, they may overwrite sensory information just as contextual cues can do. Thus, a language–based coding of odors might not be limited to situations where source cues are evidently available. This conclusion is supported by our data and previous research alike. Although comparable studies typically concluded that odors and odor labels induce different mental representations, they usually also found consistencies that could have been caused by unprompted identifications and subsequent language effects (Breckler and Fried 1993; Carrasco and Ridout 1993; Chrea et al. 2005; Herz 2003). Specific empirical evidence for the assumption that people build ideas about an odor’s identity spontaneously has been provided by Zellner et al. (2008). In a series of experiments, they asked subjects to (among other tasks) choose appropriate colors for six fine fragrances and to rate the odors on the dimensions masculinity and femininity. They found that the categorization of an odor as masculine or feminine significantly affected the matching of colors to these fragrances, even if subjects were asked to rate masculinity and femininity only after the color matching task. Zellner and colleagues concluded that a (gender) categorization of fragrances might be triggered automatically as fine fragrances are usually explicitly marketed as masculine or feminine. 
The impact of odor labels on perceptual ratings has been addressed by Stevenson and Mahmut (2013). They assessed the consistency of perceptual ratings (familiarity, intensity, edibility, pleasantness, activity, potency) over two test occasions separated by a 20–min intermission: Evaluations remained stable (in comparison to chance level) when odors were named consistently in both stages, with highest reliability scores for edibility and pleasantness ratings.

Our results suggest a generally lower impact of odor names on perceptual ratings (pleasantness, edibility, and particularly intensity) than on odor quality descriptions (attributes). This effect may be rooted in a process of preverbal identification (Herz and Eich 1995) or recognition without identification (Cleary et al. 2010), that is, the access to episodic memory content such as familiarity, likeability, or general source category ahead of odor naming. From an evolutionary perspective, it seems reasonable that basic olfactory assessments are quick and affected by semantic criteria to a lesser extent (Yeshurun and Sobel 2010). Olofsson and Gottfried (2015) contradicted this proposition and provided evidence that odor object representations are built very early in mental odor processing. Remarkably, these configural odor objects are a blending of single perceptual qualities where distinctive features become inaccessible as soon as an object representation is created. That is, although olfactory sensations are very likely classified instantly in order to reveal the presence of a source object (Holley 2002), in terms of an evolutionary survival strategy, these classifications could be narrowed to a judgment on familiarity, attraction, or rejection, respectively (Köster 2002, 2005). Hence, the specific odor source might or might not be relevant in the very first stages of (or throughout) the perceptual process. Whether this is true for odor perception in general or limited to the specific context of an odor study remains to be answered.

Limitations

We are aware of several limitations of the study: First, due to a complex setting, the study sample was rather small. Considering subsets of the data in an exploratory approach resulted in sometimes too few cases for a proper analysis or statements of practical significance.

Second, across all subjects and odors the most frequently applied rating on attributes was “0” (“not at all applicable”). This might express that an odor sample did not at all smell fruity, earthy, bitter, etc. to a rater’s mind. However, when running the experiments, we observed that subjects frequently rated attributes with “not at all applicable” when they expressed general problems in verbalizing their olfactory sensations. In this case, a “0” may reveal a rater’s insecurity rather than an odor quality. These artifacts certainly increase the ambiguity of the data and reduce its explanatory power. The subjects’ uncertainty may be rooted in the general difficulty of untrained subjects to verbally describe odors (Lawless 1984; Levinson and Majid 2014; Solomon 1990, 1997; for a review, see Crisinel et al. 2012). Beyond that, the large number of attributes might have hampered the rating process. As a result, the differences between odor and label ratings were found to be rather small for all types of pairings: The highest calculated Euclidean score was 0.330 on a 0–1 range. That means that completely different items (incongruent odors and labels) are still described rather similarly and not completely differently from identical items (ideal and congruent odor–label pairs) on verbal attributes. Additionally, Euclidean distance scores turned out to be a generally weak measure of (dis)similarity as they relied on a substantial and unweighted aggregation of data across attributes and subjects. While an assessment on attribute–level provided meaningful insights, this approach was limited to ambiguous odors with a sufficient number of cases.
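
The compression of distance scores by frequent “0” ratings can be illustrated with a minimal sketch. It is not the computation used in the study. It assumes that attribute ratings have been rescaled to the 0–1 range and that the Euclidean distance is divided by the square root of the number of attributes so that the score also lies between 0 and 1. The function name and the example profiles are hypothetical.

```python
import math


def normalized_euclidean(odor_profile, label_profile):
    """Unweighted Euclidean distance between two attribute profiles.

    Assumption: every rating is already rescaled to the 0-1 range.
    Dividing the summed squared differences by the number of attributes
    keeps the score between 0 (identical) and 1 (maximally different).
    """
    if len(odor_profile) != len(label_profile):
        raise ValueError("profiles must cover the same attributes")
    if not odor_profile:
        raise ValueError("profiles must contain at least one attribute")
    squared = sum((o - l) ** 2 for o, l in zip(odor_profile, label_profile))
    return math.sqrt(squared / len(odor_profile))


# Hypothetical ratings on ten attributes; most entries are 0
# ("not at all applicable"), and the two profiles share no rated attribute.
odor = [0, 0, 0, 0.8, 0, 0, 0, 0, 0.2, 0]
label = [0, 0, 0, 0, 0, 0, 0.6, 0, 0, 0]

print(round(normalized_euclidean(odor, label), 3))
```

In this invented example the two profiles have no rated attribute in common, yet the score stays far below 1 because the many shared zeros contribute nothing to the sum. Every attribute also enters with the same weight, which mirrors the unweighted aggregation described above.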

Third, non–experts are not only limited in verbalizing odor sensations. They often have difficulties in building olfactory mental images (Arshamian and Larsson 2014), especially when odors are hard to identify (Stevenson et al. 2007). Thus, we cannot determine with certainty which kind of object participants actually rated in the imagery condition. While some respondents might have generated vivid mental representations of olfactory events as requested, others may have relied on their crossmodal associations of a given source label.

Finally, we need to consider the universality of our findings carefully. The observed effects may strongly depend on the specific cultural context of the odor study in general as well as the odor lexicon of the participants. Our conclusions were drawn from a German sample and may (if at all) be generalized only to languages that rely on a comparable source–based odor vocabulary. Further research on cultural groups with more abstract odor languages is needed. Considering previous cross–cultural research (Levitan et al. 2014; Majid 2015; Majid and Burenhult 2014; Valk et al. 2017), it is questionable whether, for example, Thai, Maniq, or Jahai participants would show a similar propensity towards instant odor naming and comparable effects of identification mechanisms.