2.1 The WAT Proposal, in a Nutshell

In this book, we intend to outline a theory that can account for how different kinds of abstract concepts and words are represented. The theory is called WAT, an acronym for words as social tools (the initial formulation of this view was presented in Borghi and Cimatti 2009). Words can be seen as tools because, like physical tools, they allow us to act in the world, together with and in relation to other individuals; they are also social, since they are acquired and used in a social context. We will claim that conceiving of words as social tools allows us to explain how abstract concepts and word meanings (ACWs) are represented in the brain and how they are used. As we have seen, abstract concepts come in a great variety, and we will argue that one problem with the theories proposed so far is their difficulty in providing a framework general enough to deal with different kinds of abstract concepts.

The WAT theory we propose has five main tenets.

  1.

    Embodiment and grounding of ACWs. First, we assume that not only concrete concepts and words but also ACWs are embodied and grounded in our perception, action, and emotional systems. This means that not only concrete words such as “ball” but also abstract words such as “truth” reactivate sensorimotor networks in the brain. As we will see in the next chapter, a variety of theories share this assumption, which is supported by growing evidence.

  2.

    Importance of language for ACWs. The theory holds that linguistic mediation is more crucial for the representation of ACWs than for the representation of concrete concepts and words. This means that ACWs activate linguistic areas in the brain more than concrete concepts and words do. Some embodied theories share this assumption, but the reasons advanced for this relevance of language may differ. We argue that language is more important for the representation of abstract than of concrete word meanings for at least two reasons. First, members of ACWs differ more than members of concrete concepts and words (for example, experiences of freedom differ more than experiences of balls) and are often more complex; thus, a unifying label can work as a glue (see Chap. 1). Second, because ACWs do not have concrete referents, the mediation of language might be more crucial for their acquisition than for the acquisition of concrete concepts and words. We will develop this aspect in point 3; in the next chapters, we will present behavioral and neural data in keeping with this hypothesis.

  3.

    Acquisition modality of ACWs (see Fig. 2.1). Given that ACWs do not have concrete referents, their acquisition relies more on language than the acquisition of concrete concepts and words. Take for example “ball” or “bottle”: Infants can learn to differentiate bottles from glasses on the basis of their perceptual and motor experience. The presence of somebody using one label for bottles and a different label for glasses certainly changes categorization: It helps to better differentiate between categories and renders them more compact and cohesive (Cangelosi and Parisi 1998; Lupyan 2012; Yoshida and Smith 2005). However, it is probably not necessary to form them. But consider now “freedom” and “democracy.” The presence of somebody helping us, through the use of the same label, to assemble sparse and variable experiences, such as those related to freedom, might be crucial. The presence of an authoritative member of our community telling us what democracy is can be crucial for the acquisition of the notion. This authoritative member can be, for example, a parent, an expert, or a teacher. Literacy and schooling are obviously important for building up complexity around the skeleton of an abstract concept, as is independent reading of sources we trust. To clarify, concrete concepts still need tokens or labels. But, of course, those labels can be directly linked with the referent, a real object. For example, they can be pronounced in the presence of the object, and the object can be pointed to. The referent of an abstract word typically cannot be pointed to. This does not imply that abstract concepts such as “freedom” are not grounded: When activating them, we probably reactivate previous experiences, visualize scenes, etc.
But this visualization is probably not sufficient to create a category that holds together the Statue of Freedom, the experience of running on a field, that of exiting prison, and many others, without the support of other people and the help of linguistic mediation. By linguistic mediation, we do not mean exclusively the use of a given label in the presence of an entity/event/situation, but also the fact that our conspecifics can provide us with explanations and clarifications of meaning, and that in certain cases, we might need to have an idea of the social stratification of our in-group, in order to know the authoritative members of our community on whose opinion we intend to rely (Prinz 2002, 2012). To our knowledge, our theory is the only one, or the first, to propose this principle in combination with an embodied approach. In the next chapters, we will present some data on acquisition in adults and on conceptual development in children that support this hypothesis.

    Fig. 2.1

    The acquisition modality of concrete versus abstract words. The figure illustrates the tenet on acquisition of the WAT proposal: It shows that in order to learn a concrete concept such as “ball,” the social input is less relevant than in order to learn the abstract concept “phantasy,” which assembles many different experiences and states

  4.

    Brain representation of ACWs: We hypothesize that the different acquisition modality of concrete and abstract concepts and words is reflected in the way in which they are represented in the brain. While both should activate the sensorimotor network, the linguistic network should be activated more by the ACWs than by concrete concepts and words. In the next chapters, we will present some data in support of this.

  5.

    Linguistic diversity of ACWs: Because language plays a major role in the representation of ACWs, we hypothesize that they are more affected by differences between languages than concrete concepts and words, that is, that their meaning will change more depending on the cultural and linguistic milieu in which they are learned. This means that behavioral studies should show a higher variability in the meaning of ACWs and a stronger dependence of their meaning on the spoken (and written) language, compared to concrete concepts and words. In the next chapters, we will present some data in keeping with this principle.

Notice that, unlike other theories, our theory does not define ACWs negatively, by specifying what they are not, but positively, clarifying what they are and how they differ from concrete concepts and words: They do not differ in embodiment, but they differ in acquisition modality, in brain representation, and in variability across languages, and they are also likely to differ in the assessment of quantity (see Chap. 1 for this).

2.2 Some Reasons Why Language is so Important for ACWs

As anticipated, we propose that concrete and abstract concepts and words are grounded in perception, action, and emotion systems, as well as in linguistic systems, but to a different extent. The difference in grounding between concrete and abstract concepts is a matter of degree: The former are grounded primarily in the sensorimotor system, while ACWs are grounded primarily in the linguistic system. Below we detail three main reasons why we think that language plays a major role for ACWs, trying to scrutinize the different functions language might serve.

Language as glue. One of the reasons why language plays a major role in ACWs’ representation is the heterogeneity and variety of the experiences to which ACWs typically refer. Given the great variety and diversity of the members of abstract categories, using the same word to refer to them can contribute to category formation. In this sense, as anticipated in Chap. 1, language can work as a sort of glue. A comparison between basic-level concrete concepts and abstract concepts can help to clarify this point.

Consider basic-level concrete concepts, such as “flower” and “hammer”. Despite the diversity of the members of these two categories, we can form a summary mental image of flowers and of hammers (Rosch et al. 1976). In addition, even if all categories continuously change and are updated in light of the new exemplars we encounter, concrete concepts are more stable than ACWs. Consider now how the acquisition of a concrete word such as “flower” works. Children typically hear it in the presence of different kinds of flowers, and at the beginning, they might hypothesize that it refers to the petals, to the stalk, or to the flower’s scent. Then, progressively, they have to learn to apply the word “flower” to roses, cowslips, and daisies, i.e., to different flowers, thus abstracting from the idiosyncratic aspects of each exemplar they have encountered, and they also have to learn that the word refers to the flower as a whole, not to its parts. Once learned, the word will re-evoke the experience of a flower and will help predict possible actions to perform with flowers: people might be able to imagine a flower, its scent, and its color and might be able to prepare themselves to pick nice flowers. Obviously, the concept of flower to which the word refers will be continuously updated as new experiences of flowers are collected, but it is not difficult to form an image of the referent of “flower.”

Take now ACWs such as “truth” or “phantasy,” and consider how the acquisition of the word “phantasy” might take place. In this case, it will be much more difficult to link the word to a single object, or even to a single experience/event. The word “phantasy” might be heard in conjunction with very different situations. Each speaker will associate it with different experiences, but this is not the whole story: The word will be characterized not only by interindividual but also by intraindividual variability: What is “phantasy” for us now might be markedly different from what we associate with “phantasy” in a week. This example highlights that ACWs activate variable, different, and idiosyncratic experiences, and that their variability both within and across individuals is greater than the variability that characterizes concrete concepts.

There is, however, an exception to this variability, represented by nominal kinds (Keil 1989), which are often considered abstract concepts. Nominal kinds are definitional concepts, such as those defined using kinship terms. These concepts might indeed be more anchored to a dictionary definition than other concepts. For example, we all agree, since it has been established within our culture, that aunts are the sisters of one of our parents, or that physicians possess a degree in medicine. Obviously, instability characterizes these concepts as well, but to an extent comparable with that of concrete concepts. Indeed, concepts of this sort also differ from other abstract concepts because the referent of the corresponding words is typically perceivable through the senses: for example, following the conventions of our culture, we typically represent physicians as wearing a white coat and white shoes.

Overall, the examples we presented show that the meaning of abstract words is usually highly variable, with the exception of the meaning of nominal kinds. Language allows us to cope with this variability and facilitates the acquisition of abstract concepts. With concrete concepts, the environment provides a structure that helps children learn words (Malt et al. 2010). With abstract words, it is language, conveyed by others, that provides a scaffolding structure helping children understand the meaning. This is the first reason why the presence of a unifying label is particularly valuable for ACWs: Possessing a unifying label for diverse and sparse experiences can provide a sort of glue that helps keep them together, i.e., it can help to form the category.

Language as social tool. An additional reason why language plays a major role for ACWs lies in the social dimension that language incorporates. This social dimension might be more relevant for abstract concepts and words since, in order to learn them, we need the contribution of others and of language; this is partially true for all concepts and words, but in different proportions. Consider two verbs such as “to think” and “to pass”; the meaning of the verb “to pass” can be inferred by observing a scene, while the meaning of the verb “to think” cannot. To understand it, the help of others directing our attention and explaining to us what is going on becomes critical. This example clarifies that the social dimension we refer to is not due to the content of the words. For example, in the sentence “Pass me the salt,” the concrete verb “to pass” evokes the presence of another person, while in the sentence “I think to go,” the abstract verb “to think” does not. Still, according to WAT, the presence of others, their help, and their clarifications would be more crucial for learning the abstract verb “to think” than the concrete verb “to pass.”

Importantly, acquiring new abstract words implies a sophisticated social cognition ability: For example, we might need to select the people who have the authority to teach us the meaning of a new word, such as parents in most cases, or experts in a given domain for words we acquire later (Prinz 2002, 2012), or even technological supports such as the Internet or books for words we acquire late.

The fact that abstract words might be learned through the mediation of technological supports is not in contrast with the idea that their acquisition is social, for two reasons. The first is that these linguistically mediated supports are a social and collective product. This alone might not seem sufficient, since all kinds of artifacts, even cups or bottles, can be considered social products. The second relies on the theory of simulation: When we read language, we should reactivate the experiences we first had during word acquisition. Relying on a written authoritative source can evoke a situation similar to that in which a real person is explaining the word meaning to us, or using the word appropriately (according to Wittgenstein, word meaning has no other explanation than its use). In fact, Borghi et al. (2011) found an effect of written meaning explanations on the representation of ACWs. In the study, which will be described in detail in Chap. 4, pages 84–86 (see also Fig. 4.1), participants were presented with novel exemplars of concrete and abstract categories, which were learned by observing visual objects; first, they were invited to form categories on a sensorimotor basis, then they were presented with novel labels, and in one condition, they were provided with written explanations of the category meaning. Results of a subsequent property verification task showed that, unlike concrete words, abstract words were responded to faster with the mouth, using a microphone, than with the hand, pressing a key on a keyboard; the advantage of mouth responses with abstract words was particularly marked when explanations were provided. Thus, the written explanations influenced the representation of the meaning of abstract words.
This suggests that, in principle, the idea that the acquisition of ACWs is a social process would hold even in cases in which a person is alone in a cell and is learning new words from a written text (thanks to people at the ELSP conference for feedback on this point).

Language as material thing. As we have seen, abstract concepts refer to entities that are not perceived directly with the senses. But the words that designate them, i.e., abstract words, can obviously be perceived with the senses: They are perceived through vision if they are read, through audition if they are listened to, and through both audition and touch/proprioception if they are produced. Importantly, the way in which they are perceived involves actions: Words are produced (spoken, written, etc.) and actively received (listened to, read, etc.). Since concrete words have perceivable referents while abstract words do not, the sensory dimension provided by the materiality of words (as they are read, written, listened to, or produced) is more likely to influence ACWs’ representation than the representation of concrete concepts and words.

2.3 What is Crucial in Language? Sounds, Labels, Explanations?

We have seen some of the reasons why, according to WAT, language plays an important role, particularly for ACWs’ representation. However, it is important to specify which aspects of language play a role: The phonological properties of words, i.e., their sound? The auditory properties associated with the referents of words, for example, the sound “mou” associated with cows? The linguistic labels, i.e., the nouns, verbs, or adjectives associated with the concepts? The explanations of the word meaning? We will consider all four of these aspects.

2.3.1 Phonology

Let us start with phonology. Are abstract concepts associated with peculiar phonological characteristics, and if so, to what extent does this influence their representation?

The majority of the studies concern concepts differing in abstraction level, rather than concepts differing in abstractness. Research on categorization has shown that basic-level concepts are characterized by shorter names compared to superordinate ones (e.g., Rosch et al. 1976). Anthropologists have shown that generic species categories are typically named using a single label (e.g., “squirrel,” “cat,” “pine”), while levels subordinate to the basic one are typically named with compound names (e.g., “gray squirrel,” “Persian cat,” “maritime pine”; Berlin et al. 1973). Consistently, both adults and children tend to interpret compounds as referring to subordinate terms. Crucially, this phenomenon holds across languages. (Notice, however, that the majority of studies involve English-speaking samples; thus, the results might be at least in part biased.) Basic-level terms, which are referred to with short names, more typically refer to concrete items, while the hierarchical level is more difficult to determine for abstract terms. For example, “freedom” can hardly be categorized as either a basic- or a superordinate-level term.

Recent experimental work shows that phonological differences pertain not solely to concepts differing in abstraction level but also to concepts differing in abstractness. In a recent study, Reilly et al. (2012) asked participants to make semantic judgments for nonwords and found that they were more likely to associate an increase in word length and a decrease in wordlikeness with abstract concepts. When participants were asked to decide whether real words were abstract or concrete, they tended to wrongly categorize longer, inflected concrete words (e.g., “apartment”) as abstract and shorter, uninflected abstract words (e.g., “fate”) as concrete. Overall, these results suggest that we are sensitive to statistical regularities in the forms of words and that we distinguish concrete and abstract words also on this basis.

Despite its interest, this phenomenon is not central to our problem, which consists in identifying the aspects of language that are crucial for ACWs’ representation. Indeed, this result is not necessarily informative about how we represent the meaning of single words; it mainly concerns the metalinguistic knowledge we possess about classes of words. It tells us which aspects we rely on in representing ACWs as a class, clarifying which elements we take into account when, for example, we have to evaluate whether a word is abstract or not. Still, interestingly for us, this finding shows that ACWs’ representations are very likely characterized to some extent also by some knowledge of word form, since phonological aspects and meaning are not arbitrarily related.

2.3.2 Auditory Properties

Results of recent work by Lupyan and Thompson-Schill (2012), who investigated how visual categorization is influenced by verbal labels, help us identify the linguistic components that likely influence conceptual representation. In their experiments, participants heard a cue, then saw a picture and had to indicate whether or not it matched the cue. The cue could, for example, be a word (e.g., “dog”) or a characteristic sound (e.g., a barking sound). The authors demonstrate that verbal labels (e.g., “dog”) are more effective than sounds (e.g., the sound of a barking dog) in facilitating visual identification. Labels, the authors reason, could have an advantage over sounds for a variety of reasons: because they are words, because they refer to categories, and because they have easily reproducible phonological forms. They demonstrate that nouns are more effective than verbs; hence, the advantage is not due to the fact that verbal labels are words. In addition, nouns are more effective than nonword sound imitations, such as “arf arf” for dog; thus, the advantage of labels is not due exclusively to their sound. (Notice, however, that the way in which animal calls are encoded in specific languages varies depending on the phonology of languages; thus, a clear distinction should be made between animal sounds and speech sounds.) Note that this result might seem counterintuitive, since young children call cows “muh muh” and dogs “arf arf” before naming them “cows” and “dogs”. Finally, the advantage persists even when novel categories are taught, to which novel labels or sounds are associated. These results indicate that words are not only pointers that index referents (e.g., Glenberg and Robertson 2000), and cast doubt on a view that takes into account only the referential aspect of words.
Studies on gestures confirm that reference to a concrete object out in the world is not the default way to make meaning (Mittelberg, personal communication, 2013; Cienki and Mittelberg 2013). In order for a proposition to work and make claims about something, it needs both content and function words (icons and indices, following the terminology of Peirce 1931–1935).

Consider now abstract words. We argued that their representation keeps track of language more than the representation of concrete words. It remains to be determined which aspects of language are more crucial for abstract than for concrete concepts.

In principle, abstract concepts can be grounded in the auditory modality, similar to concrete concepts. Recent results reveal that abstract concepts such as socialism and conservatism are grounded in different modalities, including the auditory one (Farias et al. 2013). Participants were more likely to evaluate words associated with conservatism as louder when presented to the right ear than words associated with socialism, even if the sounds did not differ in intensity. This auditory pattern mirrored the spatial mapping of the terms, with the words related to conservatism more oriented to the right and those associated with socialism to the left: The semantic dimension overlaps with the visual and the auditory dimensions. Even if this result concerns a specific subset of abstract concepts, it reveals that abstract concepts can be grounded in the auditory modality, provided that there is an association between sound direction and word meaning. In the same vein, in a study with acronyms of Dutch political parties, van Elk et al. (2010) demonstrated that participants performed button-press responses earlier when the button location corresponded to the political orientation of the party (e.g., they provided faster left responses to left-oriented parties and vice versa); furthermore, responses were faster when a political acronym was displayed on the side of the screen corresponding to the political orientation of its party. The results of these two studies support our view that abstract concepts are both grounded in multimodal dimensions and deeply interwoven with language: As Farias et al. (2013) argue, “an opposition between symbolic representational and modality specific representations is misleading at best” (p. 5).

Even if abstract concepts can be grounded in the auditory modality, the possibility that a sound, such as a mooing sound for cows, is more crucial for abstract than for concrete concepts is quite remote, since abstract concepts typically do not have a single, concrete referent, and are therefore difficult to associate with a specific sound. The remoteness of this possibility is empirically confirmed by the negative correlation found by Connell and Lynott (2012) between auditory properties and abstract concepts.

This negative correlation does not contradict our proposal. Indeed, the perceptual dimension ratings obtained by Connell and Lynott (2012) concern the conceptual referent and the word meaning, not the word per se; for example, the auditory modality would be negatively correlated with the meaning of the abstract word “truth,” and not with the meaning of concrete words such as “dog” or “telephone.” According to our proposal, the acoustic modality is relevant for abstract concepts because their label, and possibly the verbal explanation of their meaning, would come to our mind, not the sounds produced by the conceptual referent.

In support of this view, we can briefly refer to the study by Borghi et al. (2011) that we introduced on page 24 and that we will illustrate extensively on pages 84–86 (see also Figs. 4.1, 4.2, 4.3). In this work, novel categories were used, which were learned by observing visual objects, and no sound was associated with them. Nevertheless, results with abstract concepts showed faster and more accurate responses when participants responded with the microphone, i.e., producing a sound, than when they pressed a key on a keyboard; this advantage was not present with concrete concepts. The advantage was more marked when participants were taught not only the label but also the explanation of the word meaning. This result indicates that the association between acoustic properties and abstract concepts pertains specifically to their labels and to the explanations of the word meaning, not to the sound elicited by the referent of the category.

In sum, in principle, abstract concepts and words can evoke auditory properties. However, typically, their referents are not associated with specific auditory properties, for the simple reason that it is easier to think of the sound of a telephone than of the sound of the truth. We propose that the linguistic aspects which count more for ACWs’ representation are not the sounds/auditory properties of their referents, but their labels.

2.3.3 Labels

The data we have presented lead us to argue that neither the phonological specificity of abstract terms nor the sound of the entities they refer to is at the core of the linguistic representation of ACWs, even if both factors might play a role. In contrast, although these hypotheses should be tested with further experiments, we predict that two aspects of language might be particularly relevant for ACWs’ representation: labels and explanations.

Labels are relevant for representation of all kinds of concepts and words, as shown by Lupyan and Thompson-Schill (2012). However, we hypothesize that they are particularly crucial for abstract ones, since they facilitate categorization of elements that otherwise would be difficult to classify together.

Literature on word acquisition in children can be informative as to the importance of labels. Many studies have investigated the role played by labels in categorization, with a special focus on how much a common name renders things similar and promotes inferences about their properties. According to one influential view, labels work from the very beginning as category markers, as children expect labels to designate categories and mark their distinctions (e.g., Waxman and Markow 1995). In a well-known study, Gelman and Markman (1986) demonstrated that children use labels as indicators of a category and then generalize properties to that category. In their study, the authors used triads of elements. They taught children hidden properties of one of the elements of the triad (e.g., “it has hollow bones”) and found that children tended to generalize the property to the element of the triad that had the same name but was perceptually dissimilar rather than to the perceptually similar element that had a different name. These results, however, were challenged by Sloutsky and Fisher (2004), who demonstrated with the same stimuli used by Gelman and Markman (1986) that children’s behavior cannot be predicted relying only on labels, and that only a model based on both labels and appearance can accurately predict their performance. Sloutsky and collaborators have also argued that labels contribute to increasing category similarity and have shown that early in development, labels work like other perceptual features such as shape, color, and size. Only later, in the course of development, do they start to be perceived as category markers. For example, Deng and Sloutsky (2012) investigated the role played by labels in categorization in children aged four and five as well as in adults. They found that adults rely on labels even when these conflict with perceptual similarity, while this is not the case for children.
In addition, they found that early in development, labels work like other features, such as shape and size, but later in development, in adulthood, they become crucial for indicating the category, marking the distinctions between different categories.

Now, consider abstract concepts. It is possible that during the acquisition of ACWs, labels also work as category markers. Since running on the grass, exiting prison, and making a decision without the influence of others do not have much in common, but can all be categorized as experiences of “freedom,” using the same label to designate them will be very helpful for building the category. The presence of the same name can indeed direct attention in a top-down manner (Gliga et al. 2010), guiding learning and thus helping people to collect the sparse and diverse experiences that can be associated with a specific category. To our knowledge, the debate on labels as category markers has focused on concrete words and has not dealt with the differences between kinds of words. This is probably due to the fact that children acquire ACWs later than concrete concepts and words.

We propose that the top-down mechanism according to which labels guide learning characterizes the acquisition of ACWs, for two reasons. The first is that their members can be very diverse from a perceptual and motor point of view; the second is that ACWs are acquired relatively late in the course of development, as data on age of acquisition reveal (Della Rosa et al. 2010). We do not fully agree with the labels-as-category-markers approach, though. In particular, we do not think it is very useful when it contrasts labels and perceptual similarity (see Deng and Sloutsky 2012, for a similar view), since typically the same label is correlated with a higher perceptual similarity between the category members. Even if referents of ACWs are not perceptually similar, they might have common characteristics derived from similar experiences, or they might rely on common image schemas (Barsalou 1999). For example, all experiences of freedom might include a reference to the self, and the crossing of a boundary or the absence of a boundary. As anticipated in the first part of the chapter, ACWs are grounded in multimodal experiences, and among these experiences, the linguistic one has a special status.

In support of this view, Borghi et al. (2011) and Granito et al. (in preparation) found with novel objects that the use of labels helps the formation of abstract categories more than that of concrete ones. More specifically, Borghi et al. (2011) found that the processing disadvantage of abstract relative to concrete concepts is maintained, but slightly reduced, when people are taught labels to apply to the categories.

Notice that Borghi et al. used written labels in their word acquisition study with adults, but acoustically presented labels are probably the most effective with children. In infancy (6–10 months), the acoustic modality indeed dominates over the visual one (for a review, see Lewkowicz 1994). Sloutsky and Napolitano (2003) demonstrated that not only infants but children as well (4-year-olds) have a preference for the acoustic over the visual modality: When presented with combinations of visual and acoustic stimuli (scenes associated with a sound), they made equivalence judgments on the basis of the auditory components rather than the visual ones, and they encoded the acoustic components more readily than the visual ones.

2.3.4 Explanations

Borghi et al. (2011) found that not only labels but also explanations of the word meaning play a role and influence object processing, and that explanations are more effective for learning abstract word meanings than concrete ones. Explanation is a function of language not considered by Lupyan and Thompson-Schill. There are cases in which the meaning of a category has to be learned thanks to the contribution of other members of our community. As argued by Prinz (2002), to learn the word “democracy,” we may visualize a series of scenes, but also rely on the opinion of authoritative members of our community. Other people can help us understand abstract concepts by providing us with explanations, or by furnishing us with a list of possible instances of the category. When hearing or reading new terms, we often search for their meaning in the dictionary, or look it up on Wikipedia. The role played by explanation might seem to be in contrast with the idea of language games advanced by Wittgenstein, since explanation anchors concepts to a specific meaning, and with the idea that what counts is not the explanation of word meanings but their use. We believe it is not. Providing explanations consists indeed in providing a context where the word can be found, as well as in highlighting the relationship between the elements that the word evokes.

Consider that the explanations needed to account for the meaning of abstract terms are typically longer than those that can be used to explain concrete word meanings, since in the latter case the external environment can provide much more scaffolding and support. For example, explaining the meaning of “democracy” requires many more words than explaining the meaning of “bottle,” also because in the latter case a bottle can be shown to the learner (see Chap. 4 for further details on studies on modality of acquisition (MOA); e.g., Wauters et al. 2003).

By underlining the role of explanations, we stress a peculiarity of language that is often neglected. Obviously, nobody would deny that language has a social nature. However, theorists belonging to the different approaches have not pointed out the sociality of language in the way we do in our proposal.

Theorists favoring a pragmatic view have focused mostly on the communicative aspects of language: Their claim that language can be conceived of as a form of action which changes and modifies the surrounding world is important for us. More recently, theorists favoring an embodied and grounded cognition perspective have emphasized mostly the fact that language is grounded in perception and action systems. According to a third recent approach, word meanings would be determined by the statistical distribution of words across language (e.g., Landauer and Dumais 1997; Griffiths et al. 2007). Recent proposals have shown that the last two approaches are not necessarily conflicting but can be reconciled (e.g., Andrews et al. 2013; Louwerse 2011; Meteyard et al. 2012; Borghi and Caruana in press). We are completely sympathetic with the view that embodied/grounded and distributional approaches can be reconciled (see, for example, Borghi and Cimatti 2012). However, here we intend to claim something more. In our view, language can work as a communicative/action device (pragmatic view) and as a pointer (embodied and grounded view), and its meaning can be determined by a network of associated words (distributional view). The social dimension of language enters into play in all these approaches. However, we think that the way we represent language, and abstract words in particular, keeps track not only of the frequency of occurrence of associated words, but also of the relevance/authority (for us) of the members (e.g., parents, authoritative members of our community) who explained to us the meaning of words. These “sociological” aspects would have a mental counterpart and cognitive consequences. In this sense, we believe that the social aspects intrinsic in language can influence the way concepts and words are acquired and therefore represented.

Summarizing, in Sect. 2.3, we have scrutinized different aspects of language that can be relevant for ACWs’ representation. We propose that, even if phonological and auditory properties might play some role, they are not as crucial as verbal labels and explanations in influencing ACWs’ representation.

2.4 Which Mechanisms?

So far, we have clarified the role that different linguistic aspects—phonology, acoustic properties, labels, and explanations—might have in influencing the representation of ACWs. Direct evidence in favor of the WAT view will be presented more extensively in the next chapters. Here, we outline a proposal concerning the different mechanisms that might underlie the activation of linguistic information for ACWs. This proposal is currently speculative and needs to be verified with appropriate experimental evidence.

Let us start with some recent evidence which needs to be accounted for. In a recent study, Ghio et al. (2013) analyzed three different kinds of abstract sentences (sentences referring to emotions, to mental states, and to math concepts) and compared them to concrete sentences describing actions with three different effectors: hands, legs, and mouth. Participants had to rate all sentences on concreteness, context availability, familiarity, and body part involvement using 7-point scales. The authors found that, when required, participants associated abstract sentences with effectors. Specifically, mental state and emotion sentences were more associated with the mouth than with the legs and hands, while math concepts preferentially evoked the hands. This activation of the mouth with the most typical abstract concepts is predicted by the WAT. In our view, it is likely due to the acquisition process of abstract concepts, which occurs mainly through the mediation of language, as discussed in the course of the book. The results by Borghi et al. (2011), by Scorolli et al. (2012), and by Granito et al. (in preparation) go in the same direction. The evidence by Ghio et al. concerns participants’ associations and is hence metalinguistic; the evidence by Scorolli et al. (2012) was obtained in a TMS study, while the evidence by Borghi et al. and Granito et al. pertains to the production of a motor response with the mouth.

To account for evidence such as that just described, we need to understand in more depth what happens and which phenomena are at the basis of the mouth activation with ACWs. We will outline below some possible mechanisms that might underlie the effects found. These mechanisms are not mutually exclusive; most probably they are all present. Two pertain more to the memory traces we keep of the way in which we acquired the concepts, and two to the ways in which these concepts are processed online.

It is possible that the motor activation of the mouth effector depends on traces of acoustic experiences evoked while listening to or while producing the verbal labels and the explanations of word meaning. Alternatively, the mouth activation can be due to a form of motor preparation, aimed at the rehearsal of the label or of the explanation associated with the word meaning. Finally, it can be due to some kind of inner language.

The notion of inner language requires some further clarification. As highlighted by Vygotsky (1978, 1986) and also by supporters of the extended mind view, such as the philosopher Andy Clark (Clark and Chalmers 1998; Clark 2008), language is initially a social and public phenomenon, which becomes internalized during the course of development. This internalization allows children and later adults to use a form of inner speech, which guides their actions and has the power to augment their computational abilities and abstract thought capabilities (see also Clark 1998). Consider for example the cases in which we use language to remember some dancing steps, or when verbalization helps us compute or solve some difficult problem. We propose that not only external, public language, but inner speech as well might play a role for ACWs. Abstract, difficult notions do indeed require more internal elaboration, hence more inner speech, as if we needed to retell and re-explain their meaning to ourselves. The advantage of this interpretation is that it helps reconcile the WAT account of abstract concepts with the idea, proposed by Barsalou and Wiemer Hastings (2005) and discussed in Chap. 3, that introspection plays a major role for abstract concepts, and with the data showing that introspective properties are more frequent with abstract than with concrete concepts. It extends Barsalou and Wiemer Hastings’s (2005) view by clarifying how introspection might occur, i.e., through the mediation of a form of inner speech which involves the mouth.

Since production and comprehension systems are two sides of the same coin (Pickering and Garrod 2013) and rely on the same neural substrates, as shown for example by recent literature on mirror neuron activation (e.g., D’Ausilio et al. 2009), the different mechanisms can be hard to disentangle. In fact, literature on the mirror neuron system has shown that part of the neural circuitry involved in the execution of motor actions is also activated during the comprehension of language referring to those actions (Gallese et al. 1996; Jirak et al. 2010; review in Rizzolatti and Craighero 2004, and many others).

The four mechanisms, labeled a–d in the list at the end of this section, are all compatible with three sets of findings: the evidence of Borghi et al. (2011) of an activation of the mouth effector, particularly when an explanation of the word meaning was provided; the TMS data by Scorolli et al. (2012), which suggest an activation of the mouth effector with noun–verb combinations where an abstract verb is present; and the fMRI results by Sakreida et al. (2013), which show an activation of the linguistic neural network with combinations composed of an abstract verb and an abstract noun (see Chap. 5 for further discussion of this evidence). One could object that the results found by Borghi et al. did not concern acoustically presented explanations, but written ones. However, a written text typically re-evokes the experience of its acoustic presentation. Furthermore, evidence by Granito et al. (in preparation) concerns verbally produced explanations more directly and is compatible with accounts a–d (see Chap. 4 for a detailed description of these experiments). Finally, the four mechanisms are compatible with the association Ghio et al. (2013) found between the mouth effector and ACWs.

All the mechanisms we have proposed so far share one problem. It is not clear why the association between ACWs and the mouth effector does not hold for number concepts, which according to Ghio et al. (2013) are more associated with the hand than with the mouth effector. One possible explanation is that number concepts are a very special kind of abstract concept, since the experience of finger counting (Fischer and Brugger 2011; Fischer 2008, 2011; Lindemann et al. 2011; Badets and Pesenti 2010; Ranzini et al. 2011) (see Fig. 2.2) provides a clear way to scaffold them.

Fig. 2.2 Study by Ranzini et al. (2011). Participants were presented with digits and with images of graspable and non-graspable objects of different sizes (large vs. small). Their task consisted in repeating aloud the odd or even digit within a pair depending on the object type. The digits could precede or follow the object presentation. Responses were faster for graspable than for non-graspable objects preceded by numbers; results also revealed an effect of numerical magnitude after the presentation of graspable objects. Overall, the results suggest that graspable objects facilitate number processing, supporting the view that abstract concepts such as numbers are grounded in sensorimotor experience.

A further issue, which might seem to favor the first two, memory-based mechanisms, is the evidence by Borghi et al. (2011), who showed a stronger motor activation of the mouth when written explanations were provided. However, the fact that explanations have a motor effect, activating the mouth, is not necessarily in conflict with the activation of inner language; it is indeed possible that we retell ourselves the meaning of the concept, as occurs in silent reading, when people pronounce each word they read.

Overall, the four possible mechanisms we propose are the following:

  a. memory traces of heard labels and explanations;

  b. memory traces of the experience of producing the label and the explanation;

  c. motor preparation, aimed at rehearsal of the label or of the explanation;

  d. motor activation due to inner language.

While current experimental evidence indicates that the mouth is more activated with ACWs, it does not allow us to disentangle these mechanisms. It is also possible that several mechanisms are activated at the same time. Further experimental evidence is necessary to investigate and better understand the processes that take place and the specific mechanisms responsible for the activation of the mouth with ACWs.

2.5 Conclusion: WAT and the Scaffolding Role of Language

In this chapter, we have sketched a proposal concerning the representation of ACWs, the WAT proposal. The most crucial distinguishing aspect of this proposal is that it holds that the acquisition of ACWs relies more on language, and on the contribution of other people to the clarification of word meaning, than the acquisition of concrete concepts and words does. Because the scaffolding function of the physical environment is less powerful for abstract than for concrete concepts, language helps fill this gap. This dominance of language is reflected in the way we represent ACWs in the brain, and it has a motor counterpart, i.e., it implies the activation of the mouth effector. We outlined some mechanisms that might underlie the activation of the mouth effector. Further research is needed to detail them and to get a better understanding of what is going on when we use an abstract concept. In addition, further research is needed to understand whether this proposal can hold for all abstract concepts or only for a subset of them. To this aim, a fine-grained analysis of the different kinds of abstract concepts is badly needed.

In the next chapter, we will distinguish the WAT view from other proposals in the field. In the following chapters, we will critically discuss the evidence obtained so far which favors the WAT view, and we will illustrate what kind of further evidence is needed to fully support it.