Recent discussions of animal communication and the evolution of language have advocated adopting a ‘pragmatics-first’ approach. The general idea behind this approach is that pragmatic phenomena are key to understanding certain continuities between animal and linguistic communication, and can thus aid our understanding of the emergence of human linguistic communication. In this spirit, Arnold & Zuberbühler recommend adopting “a pragmatics approach to exploring how primates extract information from … highly ambiguous, though discrete, signals” (2013: 2). And Seyfarth & Cheney (2017) likewise propose that “animal communication constitutes a rich pragmatic system” and that “the ubiquity of pragmatics, combined with the relative scarcity of semantics and syntax, suggest that, as language evolved, semantics and syntax were built upon a foundation of sophisticated pragmatic inference” (p. 340, emphases added). Relatedly, these (and other) authors propose a shift in perspective in the study of the evolution of language. Instead of looking from the start for the origins of human language–understood as a syntactically and semantically combinatorial, recursive system of discrete symbolic elements–we should begin by looking for the origins of human linguistic communication. This shift in focus is thought to allow us to recognize certain continuities in use–between what humans do with words and what nonhuman animals do with their communicative signals (‘nonwords’). Such continuities may exist alongside the admittedly deep syntactic and semantic differences between human languages and animal communication systems. Hence ‘pragmatics first’.

However, if we are to adopt a pragmatics-first approach as these authors recommend, we need a clearer understanding of which aspects of animal communication should count as pragmatic. I begin–in Sect. Pragmatics: Carnapian, Gricean, and ‘Intermediary’–by distinguishing two different conceptions of pragmatics that advocates of the pragmatics-first approach have implicitly relied on: one Carnapian, the other Gricean. I explain why Carnapian pragmatics sets the explanatory bar too low for pragmatics-first approaches, whereas Gricean pragmatics sets the bar too high. This motivates a need for developing an intermediary pragmatics that would apply less indiscriminately than Carnapian pragmatics yet more broadly than Gricean pragmatics. In Sect. Intermediary pragmatics and communication ‘from a psychological point of view’–pressing from above, as it were–I spell out what I take to be a key insight concerning linguistic communication and its emergence as it occurs in the work of a leading proponent of the Gricean approach, Michael Tomasello. In Sect. Intermediary pragmatics and biosemantics–pressing from below–I argue that this Gricean insight ought to be acknowledged by a view of communication which is Carnapian on its face: Ruth Millikan’s biosemantics. Combining the Gricean insight with elements from Millikan’s view, I articulate what I take to be a genuinely intermediary conception of pragmatics. In Sect. Intermediary pragmatics: how to do things with nonwords, I explain how this intermediary conception could better serve the purposes of those who look for potentially significant precursors of human linguistic communication in animals’ communicative behaviors. Along the way, I offer some possible illustrations from animal communication. I leave it for further empirical investigation to settle which forms of animal communication actually fall under the scope of intermediary pragmatics as characterized here.

Pragmatics: Carnapian, Gricean, and ‘Intermediary’

In their (2012), Wheeler & Fischer advocate looking for “continuities … between the communication systems of humans and our extant primate relatives” in “the flexible, learned responses of receivers”, since these presumably reveal capacities for contextual derivation of call contents (2012: 199). They therefore suggest that “a more productive framework” for primate communication research should be “pragmatics, the field of linguistics that examines the role of context in shaping the meaning of linguistic utterances” (2012: 203). Likewise, Seyfarth & Cheney suggest we should turn to “sophisticated pragmatic inference” as the “foundation” upon which language was built (2017: 340). These authors’ focus on ‘pragmatic inferences’ on the part of, specifically, the receivers of signals–and the kind of examples from primate communication that they appeal to–suggest that they are implicitly relying on a rather broad conception of pragmatics. On this conception, arguably found in Rudolf Carnap (1942) (see Bar-On & Moore 2018, Arnold & Bar-On 2020), pragmatics covers indiscriminately any phenomenon involving the use of contextual factors of whatever sort to derive the semantic content or significance of an utterance or signal.

Carnapian pragmatics

The study of the variations of the content of signal types with the context of production and its context-dependent apprehension by interpreters.

Carnapian pragmatics thus characterized covers a great variety of cases in which a signal or utterance requires fixing some contextual parameter–e.g. time, place, identity (of producer or receiver)–in order to receive a determinate interpretation. Carnapian pragmatics was initially introduced to accommodate the ‘indexical’ context-dependence of truth-conditions of natural language sentences such as “It’s snowing”, or “You are late”, and later extended to accommodate the context-dependence of many other types of linguistic expressions (proper names, definite descriptions, adjectives, possessives, etc.).Footnote 1 But Carnapian pragmatics can be readily seen to cover a host of animal signals: various vocalizations by non-primates (birds, prairie dogs, and suricates, among others), as well as bee dances, cicadas’ mating ‘songs’, firefly mating flashes, octopus color changes, and so on (see Fitch 2010: Ch. 4 for a relevant survey). In the case of animal signals, Carnapian context-dependence has predominantly to do with the fact that they are produced and received by particular individuals, at particular times, in particular places. So, for example, a baboon social grunt is issued and received in a specific situation by baboons of a certain relative social ‘rank’ and gender (Seyfarth & Cheney 2017). And a bee dance signals to observer bees the presence of nectar at a specific direction and distance from where the dance is performed (among other things).

However, the capacity for this type of ‘indexical’–or ‘narrow’–context-dependent interpretation (Recanati 2002) is not only widely shared across animal species; it is also manifested in animals’ derivation of information from non-communicative signs. All signs–including so-called ‘natural’ signs–have content whose interpretation is keyed to the context in which they occur. Tree rings indicate the age of this tree at a given time; racoon tracks will signify the recent presence of a (certain species of) racoon at a particular point in a path; and certain kinds of red spots signify a measles infection afflicting a particular individual at a specific time; and so on. The use of such signs to derive information about the environment requires an ability for context-dependent interpretation.Footnote 2 Granted, this marks some continuity between humans’ use of linguistic signs and animals’ use of signals, but this continuity seems hardly sufficient by itself to illuminate the origins of distinctively linguistic communication. For, e.g., monkey call interpretation to count as relevant to these origins it would have to be shown that it goes beyond the narrow kind characteristic of all animal interpretation.Footnote 3

It is telling that accounts that highlight flexible sensitivity to context in, say, insect signals (e.g. Maynard Smith & Harper 2003; Oller & Griebel 2008) also typically note crucial differences between insect (as well as nonhuman primate) and human communication. Key among these are differences pertaining to the psychology underlying linguistic interpretation (see e.g. Maynard Smith & Harper 2003: Ch.7 and Fitch 2010). Carnapian pragmatic analyses appear to be, by design, silent on the issue of underlying psychology. Their focus is on generating correct contextual mappings between signal types and the circumstances in which signal tokens are produced. It is consistent with a given Carnapian analysis of, e.g., monkey calls that call users ‘compute’ the context-dependent content of calls as though they were merely natural signs of some threat. This suggests that a purely Carnapian analysis of a form of communication can at best provide a starting point for a pragmatics-first approach. Arguably, any such analysis ought to be supplemented with an account of potentially relevant continuities in underlying psychology between the analyzed form of communication and human linguistic communication.Footnote 4

Emphasis on the underlying psychology of communication is very much at the heart of a much more restrictive conception of pragmatics due to Paul Grice (1957). As is well known, Grice distinguished sharply between ‘natural’ and ‘nonnatural’ meaning. Natural signs, such as dark clouds, or rings on a tree trunk, or deer tracks, possess (only) natural meaning. Natural meaning is ‘factive’; nonnatural meaning is not. Whereas it is not possible for the rings on a tree to mean that the tree is 300 years old unless that is the tree’s age, it is possible for three rings of the bell on a bus to mean that the bus is full, even if the bus is not full. Relatedly, unlike natural signs, signs with nonnatural meaning are intentionally issued by minded agents with a special kind of ‘reflexive’ intention that constitutes ‘speaker meaning’: “[t]o say that [a speaker] S meant something by U is to say that S intended the utterance of U to produce some effect in an audience by means of the audience’s recognition of this very intention” (Grice 1968: 46, emphasis added).Footnote 5 On a strictly Gricean conception of pragmatics, (properly) pragmatic phenomena include only utterances produced with speaker meaning and interpreted as such. But even ‘post-Gricean’ accounts of linguistic communication, which relax the strict conception, adhere to the core Gricean idea that human communication is distinctively ‘ostensive-inferential’.Footnote 6 Even on post-Gricean views, the hallmark of human communication is the rational production of an utterance by a speaker who intends “to make evident to an addressee the intention to make some thought(s) manifest to [them]” (Carston 2015: 454). So speakers overtly provide clues to enable their hearers to derive the specific contextual meanings the speakers have in mind, relying on hearers’ ability to ‘read their mind’. Gricean pragmatics as understood here has all and only forms of ostensive-inferential communication in its scope.

Gricean pragmatics

The study of rationally evaluable communicative utterances issued by producers ostensively (or “overtly”) and interpreted as such by their ‘mindreading’ interpreters.

Gricean pragmatics covers a much narrower range of phenomena than does Carnapian pragmatics, since it is only applicable to interactions involving ostensive-inferential communication. Accordingly, proponents of a Gricean pragmatics-first approach to the evolution of language have a very restrictive conception of the relevance of forms of animal communication to our understanding of the emergence of linguistic communication. To have such relevance, a form of animal communication would have to rely on at least some capacity for Gricean mindreading. It would have to be shown that the nonhuman producers issue signals ostensively, with certain kinds of audience-directed communicative intentions, and that their nonhuman receivers make inferences about those intentions when interpreting the signals. (See, inter alia, Origgi & Sperber 2000, Burling 2005, Hurford 2007, Tomasello 2008, Fitch 2010, and Scott-Phillips 2015.) Clearly, even if it is accepted that call receivers have a capacity to extract rich information from signals, it does not follow that their doing so depends on their employment of Gricean mindreading. After all, animals could exercise that capacity in processing non-communicative natural signs, which–by their nature–in no way involve speaker intentions or their attribution. The exercise of an ability for (some) context-dependent interpretation of, say, alarm calls is consistent with the calls’ being produced and interpreted as purely natural signs of the threats, so (by Gricean lights) can have no specific relevance to the emergence of linguistic communication.

The Carnapian and the Gricean conceptions of pragmatics can thus be seen to have very different implications for the relevance of behaviors such as primate alarm calls to the study of language evolution. On the Carnapian conception, primates’ communication via calls would indeed be relevant to the evolution of language, simply in virtue of the (narrow) context-dependence of calls’ content and interpretation. But, by the same token, so would any form of context-dependent interpretation, including animals’ ubiquitous interpretation of natural signs. This means that a Carnapian pragmatics-first approach would set an explanatory bar that is too low. By contrast, on the Gricean conception, establishing the evolutionary relevance of primate calls would require showing pragmatic continuities as understood by the Gricean conception. Insofar as animals do not exhibit a capacity for ostensive-inferential communication, their use of calls and other signals can be no more relevant to the evolution of human linguistic communication than any other forms of non-mindreading contextual decoding of signals, natural signs included. But this means that a Gricean pragmatics-first approach sets an explanatory bar that is too high.

That the Gricean approach yields implausibly strong requirements can be readily appreciated by considering the linguistic communication of young children. It is generally accepted that the sort of mindreading tasks involved in producing and processing utterances with Gricean meaning are too cognitively taxing for children under the age of 4 or 5–an age at which they already engage in rather sophisticated forms of linguistic communication. (See, e.g., Breheny 2006.) Adopting a Gricean pragmatics-first approach to the evolution of language would likewise appear to set an impossible standard; for, it implies that our ancestors would have had to engage in Gricean communication before language could begin to emerge. This would present us with a puzzle that seems entirely of a piece with the puzzle of language evolution itself. This puzzle concerns the question of how the psychological capacity required for ostensive-inferential communication–a capacity for thought that is language-like: viz. propositional-compositional, recursive, and metarepresentational–could have emerged before language. (See Bar-On 2013, 2018.)

A plausible retreat for the Gricean proponent is to suggest that, even before engaging in properly Gricean communication, young children nevertheless exhibit capacities for intentional and cooperative ‘pre-’ (or ‘minimally’) Gricean communication. And proponents of a Gricean pragmatics-first approach to the origins of language could similarly ‘lower the bar’, accepting that a form of non-Gricean nonhuman communication would be relevant to the emergence of linguistic communication, provided it could likewise be shown to manifest at least ‘proto’ Gricean capacities.Footnote 7 Note that this would mean making room for a conception of pragmatics whose scope is both narrower than that of Carnapian pragmatics and broader than that of Gricean pragmatics. On such an ‘intermediary’ conception of pragmatics, pragmatic phenomena would include many communicative interactions that are not properly Gricean. At the same time, they would not include all Carnapian context-dependent uses of signals. To a first approximation, we can characterize intermediary pragmatics schematically, in analogy with the way we earlier characterized Carnapian and Gricean pragmatics, as follows:

Intermediary pragmatics–first pass

The study of communicative interactions exhibiting capacities that a. go beyond ‘narrow’ context-dependence, b. fall short of being ostensive-inferential, but c. exploit ‘proto’ Gricean capacities.

Many phenomena covered by Carnapian pragmatics would fall outside the scope of intermediary pragmatics. But intermediary pragmatics would cover many phenomena that are excluded from Gricean pragmatics. In the next section, I motivate the need for considering a narrower range of phenomena than those covered by Carnapian pragmatics, drawing on the work of Michael Tomasello. In Sect. 3, I explain how the Gricean insight informing this work can be accommodated by the anti-Gricean, biosemantic perspective on communication associated with the work of Ruth Millikan. Intermediary pragmatics as I envisage it would integrate key Gricean and Millikanian insights. I conclude by articulating an intermediary pragmatics-first approach that would seek to identify legitimate psychological yet non-Gricean precursors of human linguistic communication in animal communication.

Intermediary pragmatics and communication ‘from a psychological point of view’

In an essay on the origins of human communication, a leading proponent of the Gricean view of language, Michael Tomasello (2008), argues that we humans engage in a form of communication that is essentially different from all paradigmatic forms of communication “in the biological world” (2008: 13), in being ostensive-inferential. Humans, Tomasello says, use “communicative signals that are chosen and produced … flexibly and strategically for particular social goals… adjusted … for particular circumstances”, and “intentional in the sense that the individual controls their use flexibly toward the goal of influencing” the behavior and psychological states of others; they intentionally inform others “for cooperative motives”, attending to their audience’s psychological states and relying on their ability to infer their communicative intentions (2008: 13). If we are to understand how things could “move in the human direction”, evolutionarily speaking, we must identify the origins of this “underlying psychological infrastructure of human cooperative communication” (2008: 9f.). Yet Tomasello thinks that this infrastructure is (almost) entirely absent from existing forms of animal communication.

Primate communication: minded and intentional yet not fully Gricean?

Alarm calls and other “vocal displays”, Tomasello argues, fail to constitute psychological communication in his sense, because of the lack of flexibility in primate call production: primates “do not learn to produce their vocal calls at all, and they have very little voluntary control over them” (2008: 16); and their vocalizations are “mostly very tightly tied to emotions” (2008: 17) (and compare Burling 2005, Hurford 2007, and Fitch 2010). Tomasello then goes on to propose that the gestures of nonhuman primates may be “the best place to look for the evolutionary roots” of human communication (2008: 15), since in the gestural domain there are hints of what he describes as ‘communication from a psychological point of view’: behaviors that involve producers who attempt to convey a message by trying to “influence the behavior or psychological states of recipients intentionally” (2008: 14, emphasis added). As an example of such behaviors in our closest relatives, the great apes, Tomasello considers the use of so-called attention-getters by chimpanzees, which include distinctive patterns of gestural, postural, and facial expressions (including ground-slap, poke-at, and throw-stuff, ‘play face and posture’ displays, and ‘leaf-clipping’ noises). (A prototypical example is that of a young chimpanzee producing a gesture to draw attention to her playful facial expression and posture; 2008: 27.) On Tomasello’s analysis, in instances of attention-getting, the significance of the complete communicative act does not reside in the attention-getting gesture itself. Rather, the gesture’s function is to draw the receiver’s attention to a behavioral display put on by the producer. In order to react appropriately, the recipient must attend to the display (2008: 27–8).Footnote 8 The use of attention-getters is flexible: once in an individual’s repertoire, the individual can use them to accomplish a wide array of social goals, such as play, grooming, nursing, and so on (ibid.). Importantly, the use of attention-getters exhibits a ‘two-tiered’ structure; it is (as I shall put it) psychologically mediated:

“[The] communicator has some action he wants from the recipient … and to attain this he attempts to draw the recipient’s attention to something… [This] indirectness [represents a] genuine evolutionary novelty–almost certainly confined to great apes and perhaps other primates–and may be considered the closest thing we have to a ‘missing link’ between nonhuman primate communication and … human referential [ostensive-inferential] communication.” (2008: 29)

Now, as I read him, Tomasello does not think great apes are capable of fully rational, ostensive-inferential communication. Still, he himself is prepared to regard at least some of their communicative behaviors as providing a potential evolutionary ‘missing link’ and thus as relevant to the emergence of linguistic communication, precisely because they exhibit the psychological mediation essential to the latter. This suggests that we ought to separate two main strands that are intertwined in Tomasello’s Gricean conception of human communication. Gricean communication is, first, intentional and minded communication: it depends for its success on communicators attending–and intentionally adjusting their communicative behavior–to each other’s mutually recognized states of mind. It thus exhibits a rather specific type of context-dependence: mind-dependent context-dependence (as I shall put it). But, secondly, Gricean communication is ostensive-inferential: it involves the production of utterances with overt intentions to affect the audience’s states of mind, relying on the audience’s ability to reflect on the producer’s intentions (and other states of mind).Footnote 9 As Tomasello himself seems to accept, communication can go beyond mere Carnapian narrow context-dependence in being intentional and minded in the relevant sense without yet also being ostensive-inferential. And such ‘proto’ Gricean communication, it seems, could have potential significance for our understanding of the emergence of fully mature human linguistic communication (whether in ontogeny or phylogeny) even from a Gricean perspective. In this way, I think Tomasello’s discussion opens up space for an intermediary conception of pragmatics which covers forms of communication that are intentional and minded though not fully Gricean. Thus:

Intermediary pragmatics–Gricean take

The study of psychologically mediated communicative uses of signals: the production and apprehension of signals that have intersubjectively recognized communicative purposes, and that depend for their success on animals’ recognition of each other’s states of mind.

This conception is intermediary, insofar as it covers many instances of communication that fall short of being ostensive-inferential, while excluding all forms of Carnapian context-dependent interpretation that fail to be psychologically mediated.Footnote 10

Psychologically mediated communication: signal repertoires versus signal uses

As mentioned earlier, Tomasello thinks that psychologically mediated communication is not only likely confined to the great apes; it is also limited to their gestural communication, which he takes to contrast sharply with all communication via unlearned calls. Production in primate vocalizations is said to be completely inflexible and constitute ‘individualistic expressions of emotions’, as opposed to ‘recipient-directed acts’; consequently, “vocal displays, with their genetically fixed and highly inflexible structure, would seem to be a very long way from human-style communication” (2008: 18–20). Thus, on Tomasello’s view, precious few existing forms of animal communication would fall under the scope of intermediary pragmatics as just characterized.

However, I believe Tomasello’s position here fails properly to draw an important distinction: between signal repertoires, on the one hand, and the way animals make use of them in communicative episodes, on the other. Primate call repertoires–understood as distinct patterns of vocalization–may well be unlearned and perhaps have acoustic and informational structure that is genetically fixed and inflexible. But from this it does not follow that primates’ use of their innate calls–what they do when producing and interpreting calls in communicative episodes–fails to exhibit some psychological continuity with aspects of human linguistic communication. More specifically, whether or not primates’ call use is psychologically mediated in the relevant sense cannot be settled by determining the etiology and structure of the calls, understood as elements in a system (the signal repertoire), or even whether they are issued as ‘expressions of emotions’. It depends, rather, on whether primates’ production and reception of calls essentially relies on their recognition of each other’s states of mind (such as attention, intentions, or various affective states). It is in principle possible for communicators to produce and interpret elements of unlearned, limited, constrained, and expressive repertoires in relevantly flexible ways. For, producers can be mindful of their audience’s psychological states in their use of such signals, and receivers can recognize the signals as addressed to them, and both can modify their use in light of their perception of each other’s psychological states, thereby manifesting a capacity for intentional and minded communication.Footnote 11

Tomasello himself appears implicitly to recognize the possibility of a dissociation between features of signal repertoires and of signal use, respectively, when he observes that the use of the pointing gesture, which arises spontaneously in human babies, already exhibits psychological mediation (2008: Ch.4). But, given this dissociation, it cannot be concluded that primates (or other species) cannot make psychologically mediated uses of signals purely on the basis of the fact that the signals belong to unlearned, limited, and rigidly structured repertoires. Indeed, recent experiments by Crockford et al. (2012) suggest that wild Ugandan chimpanzees make selective and strategic use of elements of an extremely limited and innately constrained repertoire. When producing a snake alert call, these chimpanzees manifest fine-tuned sensitivity to whether or not call receivers have themselves seen the snake or have previously heard the call (as well as how far away they were relative to the caller, and whether they were affiliated with the caller). What Crockford et al. were specifically attempting to determine is whether–and to what extent–chimpanzee callers and receivers engage in minded and intentional (= psychologically mediated) communication, despite having at their disposal a very limited and rigidly fixed repertoire of calls.Footnote 12 It remains in dispute whether the experiments are sufficient to establish that chimpanzees are mindreaders who reflect on receivers’ “state of knowledge” (as the authors themselves suggested). But the findings do seem to show that call producers consult and closely monitor, specifically, others’ attention to a potential threat, and that call receivers tailor their movements to the location of a threat that is invisible to them but which they recognize to be perceived by the caller, carefully skirting the location; and both appear to tailor their responses to their apprehension of each other’s perception, level of alarm, and so on. To the extent that this is so, these chimpanzees’ call communication may well fall within the scope of intermediary pragmatics as here understood. Being psychologically mediated, their communication would seem to go beyond merely narrow Carnapian context-dependence; and, though it falls short of being properly Gricean, it should still be of interest to proponents of a pragmatics-first approach.

Intermediary pragmatics and biosemantics

The need for an intermediary pragmatics, we have seen, can be motivated ‘from above’: from a broadly Gricean perspective of the need to identify psychologically mediated forms of communication. I now turn to an opposing perspective associated with the work of Ruth Millikan. Millikan’s ‘biosemantic’ approach (e.g. 1989, 2006) is designed to provide a single framework within which to account for both nonhuman and human communication, whereby neither need rely on ostensive-inferential abilities. Millikan’s denial that such abilities must play an essential role in our understanding of even linguistic communication has led critics (e.g. Origgi & Sperber 2000) to object that her view treats linguistic communication on the code model–which suggests that it would be friendly to a Carnapian pragmatics-first approach. However, after briefly outlining key features of Millikan’s biosemantics, I explain how the need for intermediary pragmatics can also be motivated ‘from below’: from within Millikan’s anti-Gricean view. I thus conclude that an intermediary pragmatics-first approach which integrates both Gricean and Millikanian insights would be especially suitable for the purposes of those who seek potential precursors of human linguistic communication in animal communication.

Millikan on natural versus intentional signs

Earlier, we noted that animal calls and other communicative signals are different from natural signs, such as clouds or deer-tracks, and various other physiological symptoms, such as sneezes or red measles spots. In their seminal work, Maynard Smith & Harper (2003) distinguish–within the category of animal signals–between cues and signals. A cue is “any feature of the world, animate or inanimate, that can be used by an animal as a guide to future action” (2003: 3); whereas a (communicative) signal is “any act or structure which alters the behaviour of other organisms, which evolved because of that effect, and which is effective because the receiver’s response has also evolved” (ibid.). For example, the CO2 emitted by a mammal, which conveys to a mosquito the location of something to bite, is a cue for the mosquito, but is not produced as a signal by the mammal (ibid.) (so cues are merely natural signs in Grice’s sense). By contrast, a funnel spider’s vibrating of its web, which conveys to an opponent information about the vibrating spider’s size, is a signal, since it presumably evolved to convey information about its size (ibid.; and see Ch. 1 and passim). Monkey alarm calls, social grunts and ‘chutters’, and other vocalizations in social animals are likewise signals in that they have been designed to communicate information to designated recipients (op. cit. Ch. 7).

On Maynard Smith & Harper’s analysis, what separates communicative from natural signs is a matter of their evolutionary history and biological purpose, rather than the informational content they carry or their reliability. Communicative signals, unlike natural signs, thus have what Millikan has described as proper functions, where an item’s proper function is some effect that instances of the item have had, historically, which accounts for why the item has continued to be reproduced (1984: 28). On Millikan’s biosemantic account (as on Maynard Smith & Harper’s), the communicative character of a wide variety of animal signals–e.g. bee dances, octopuses’ ‘angry’ flashes, beaver danger tail splashes, alarm calls–is to be understood in terms of the fact that they have evolved through a process of mutual adjustments between signalers and receivers. This is something Millikan thinks animals’ communicative signals and human linguistic signs in fact have in common. Both are ‘intentional representations’, in the sense that they “are supposed to represent things; this is their (natural) proper function–why they continue to be produced …” (1984: Ch. 6). What renders both intentional in Millikan’s sense is the fact that “they have been ‘designed’, in accordance with human or animal purposes, or by learning mechanisms, or by natural selection, to be interpreted according to predetermined (semantic) rules to which targeted interpreters are cooperatively adjusted” (2004: 15–16).Footnote 13 And this explains another commonality Millikan finds between animal and human communicative signals: in contrast with natural signs, they are non-factive–they can be false (see, e.g. Millikan 2006: Ch. 6). However, this does not render them nonnatural in Grice’s sense. For, although both animal and linguistic communicative signals have evolved via processes that involve mutual (‘cooperative’) adaptations between signalers and receivers (2006: 104f.), neither must rely on producers’ and interpreters’ Gricean (ostensive-inferential) mindreading abilities.

On Millikan’s account, all signs carry specific information only relative to the (‘local’) context in which they occur. What we earlier described as Carnapian narrow context-dependence is everywhere. (See her 2006: Chs. 3 & 4, and 2017: Part II.) When it comes to animal signals, Millikan also notes a certain evolutionary continuity between intentional and natural signs; animal communicative signals in many cases “evolve gradually” from natural signs or ‘cues’, such as preparatory ‘intention movements’ (2006: 103f.). Moreover, Millikan thinks that animal signals do not entirely lose their character as natural signs, once they become intentional (in her sense).Footnote 14 For an interpreter of animal signals, the mechanism and history of the signals make no difference–so long as they “correlate well enough with corresponding world affairs within some trackable domain”; “… [I]t doesn’t matter to the purposes of the chick whether its mother’s food call is merely a recurrent natural sign, or also an intentional sign” (2006: 105, 109). In other words, in general, the signals need not be treated by their users as communicative in order to accomplish their designed purposes. And, in particular, although successful communication often requires convergence between producers’ and receivers’ states of mind, it does not require them to think about each other’s states of mind.

Just as a gosling’s imprinting mechanism has the proper function of allowing it to fix on images of its mother so it can follow her, and bee dances have the proper function of directing fellow bees to where there is nectar, so linguistic utterances in the indicative mood have the proper function of producing beliefs in hearers, and utterances in the imperative mood have the proper function of producing in hearers desires to comply.Footnote 15 (Likewise for other linguistic ‘constructions’, which include words and phrases, as well as syntactic structures.) But serving these proper functions does not require that speakers intend their hearers to form beliefs and desires (1984: 58); and “[i]nterpreting speech does not require making any inference or having any beliefs … about speakers’ intentions” (1984: 62, emphasis added). Inference may well be needed to derive the contextual meaning of utterances; but–at least in basic cases of language use–there is no need for hearers to decipher what the speaker is ‘trying to say’, or what is ‘on her mind’.Footnote 16 What is more, even “conventional human signs”, when “used for their conventional purposes … usually are read the same way that natural signs are read” (2006: 109, emphasis added). Such linguistic signs can serve their proper function without their producers or receivers recognizing their proper function or being aware of the processes that have ‘stabilized’ them into conventional signs. So, for example, if I hear the doorbell and say to you: “There’s someone at the door”, I do not need to have–and you don’t need to recognize–a Gricean intention concerning your belief in order for my utterance successfully to communicate to you what it is conventionally designed to communicate (i.e., what historically accounts for the proliferation of utterances of that type), namely: that there is someone at the door. Millikan would thus deny that the communicative-intentional (in her sense of ‘intentional’) character of even linguistic signals normally depends on the presence and attribution of Gricean intentions. She does agree, however, that we sometimes need to consider speakers’ intentions when interpreting their communicative acts; for example, the interpretation of completely innovative uses of language may require considering the producer’s intentions. (For relevant discussion, see, e.g., her 2006: esp. 107f. and 131ff., and 2017: Chapters 12, 13, esp. pp. 174ff.)

This aspect of Millikan’s view puts it directly at odds with views that take Gricean mindreading to be not only uniquely but also essentially involved in linguistic communication. It has led critics to argue that her account purchases continuities between human and nonhuman communication at the cost of inappropriately applying a ‘code model’ to both. (See Origgi & Sperber 2000.) On a standard construal of the code model, senders produce signals that encode (context-dependent) messages, which receivers then contextually decode, where the mechanisms for pairing signals with messages are reflexive/automatic or sub-personal, or else–if they involve learning–are purely associative. (See, e.g., Scott-Phillips 2015: 5, 157.) Although at least some Griceans (e.g., Origgi & Sperber, Scott-Phillips) believe the code model is perfectly suitable for understanding all animal communication, they think it is entirely inadequate when it comes to human linguistic communication. Humans regularly communicate successfully using sounds and gestures that do not have pre-existing conventional (‘encoded’) meanings. But, moreover, successful linguistic communication typically goes beyond conventional meanings; it essentially exploits our distinctive capacity for ostensive-inferential mindreading. And these authors think this has direct evolutionary implications: any explanation of the evolution of linguistic communication must suppose that “language as we know it developed as an adaptation in a species already involved in [ostensive-] inferential communication, and therefore already capable of some serious degree of mindreading” (Origgi & Sperber 2000: 159, emphasis added).Footnote 17

Psychologically mediated communication: an integrated view

Now, suppose we were to deny–with Millikan–that ostensive-inferential Gricean mindreading is any more essential to linguistic communication than it is essential to all nonhuman communication. Suppose, moreover, we were to accept that animal signals and even conventional linguistic signs (at least in some of their uses) can be treated as natural signs. Does this obviate all need for intermediary pragmatics that would cover mind-dependent context-dependent communication? In other words, does accepting Millikan’s biosemantic framework mean there is no need to go beyond Carnapian pragmatics? What I want to argue next is that, appearances to the contrary, we ought to, and also can–by Millikan’s own lights–accommodate the Gricean insight regarding psychological mediation used here to motivate intermediary pragmatics.

Let us return to the distinction drawn earlier between signal repertoires and the ways signals are used by producers and receivers (2.2). I think we should agree with Millikan that (genuinely) communicative signals possessed of proper functions may nevertheless be produced and interpreted as natural signs of the states of affairs they represent (when correct). But note, too, that–as Griceans often observe (see e.g. Grice 1989)–a sign with natural meaning can be used intentionally and even ostensively to communicate a certain message and be interpreted as such. As Tomasello notes, a communicative gesture with a content that is unlearned and unstructured–such as pointing–can be modified and adjusted by a child depending on her audience and used flexibly to communicate variable messages. More generally, the way producers and interpreters use communicative signals is relatively independent of whether the signals, as elements in a repertoire, have acquired their content naturally, or rather via a learning process of mutual adjustments. It is also independent of whether the elements are unlearned or conventional, what informational content they have, as well as whether or not they have a Millikanian proper function. Whether communicators use signals in ways that go beyond narrow Carnapian context-dependence, and whether or not they engage only in coded communication, is not something that can be decided just by studying the properties of elements of signal repertoires and their history in abstraction from their use in communication. It requires examining more directly the psychological aspects of their use in given situations.

Accepting the Gricean insight derived from Tomasello’s discussion, what is of special interest for our purposes is the possibility that unlearned and rigidly structured animal signals whose content is fixed by their proper function may nevertheless be used in psychologically mediated ways. We can capture this possibility, I submit, while preserving key elements of Millikan’s biosemantic view. Distinguishing distal from proximal functions of communicative signals, I propose the following:

Psychologically mediated communication (a Biosemantic Take)

In a given species, the accomplishment of the proper function of signals with given representational contents may essentially rely on users’ apprehending and responding to features of each other’s psychological states–what they are attending or reacting to, where they are heading, what they intend to do, whether they are angry, playful, scared, and so on. If so, then we may say that, in that species, the signals’ distal proper function is accomplished through the fulfillment of a more proximal–and mind-dependent–proper function.

The signal’s distal proper function is itself not essentially mind-dependent; it consists in whatever beneficial effects for signalers and receivers explain why elements of call systems have emerged and continue to exist. The signal’s proximal proper function would be mind-dependent inasmuch as its accomplishment would rely on communicators’ evolved capacity for a certain kind of psychological give and take, of a sort that is absent from fully coded communication.

Let me first illustrate what I have in mind in terms of Millikan’s own characterization of certain linguistic phenomena.Footnote 18 As noted earlier, Millikan holds that what are in fact conventional linguistic signs may be treated as natural signs, rather than as intentionally produced communicative signals. However, when it comes to devices such as the demonstrative “this”, her account seems rather different. When using such devices, she notes, language speakers rely on non-conventional, “improvisational” techniques or methods: “[g]esturing toward something, pointedly looking at it, nodding toward it, …. Rolling one’s eyes toward it, … are common ways to assure that one’s hearer will think of the right thing”. In other words, the use of such devices appears mind-dependent (in our sense). Now, on traditional pragmatic accounts, this renders the acquisition and use of demonstratives (and indexicals such as “you”, “he”) difficult. Indeed, Griceans often cite our regular success in communicating via such linguistic ‘pointing’ devices as evidence for the use of Gricean mindreading capacities in linguistic communication (see, e.g., Tomasello 2008: 57ff.). Millikan disagrees. She thinks that, on the contrary, demonstratives “are among the most primitive of signs” (2006: 153). Demonstratives such as ‘this’ are different from conventional signs in having no specific referential content that they have been designed to convey (1984: 168). Instead, “‘this’ appears to be a peculiar sort of free variable–a place holder for something the speaker has in mind and that the hearer will easily gather … as what the speaker means”. It only ‘protorefers’, and what it refers to “must be improvised”. In general, “improvised signs do not themselves have referents … What they mean is, just, what the improviser intends them to mean” (1984: 166f.).Footnote 19 But even linguistic signs that have some conventional meaning exhibit this feature: to “know which John is meant when somebody says ‘John’… you may have to take into account with whom the speaker is acquainted, … or what general domain he or she has in mental focus…” (2006: 131). The same applies to knowing which dog is intended when one uses the definite description “the dog” to refer to a particular dog (2006: 127f.).

What this suggests is that, on Millikan’s own account, successful referential communication via certain linguistic devices may not only be independent of convention and heavily context-dependent; it may also essentially rely on users’ abilities to track and recognize each other’s states of mind. These are mind-dependent abilities–to disclose and apprehend aspects of what communicators have in mind. As I would put it: In the case of certain linguistic devices, the distal (referential) proper function–to pick out some relevant item of interest in the world–is to be accomplished by fulfilling a more proximal, mind-dependent function: to draw the hearer’s attention to what the speaker has in mind.

Does this not take us back to a Gricean (ostensive-inferential) view of linguistic communication? Doesn’t acknowledging mind-dependent aspects of communication necessarily re-introduce the idea that speakers and hearers must be thinking about each other’s minds when communicating? Millikan thinks it does not–correctly, I believe. She says:

“[I]f you understand the phrase ‘understand what the speaker intends to communicate’ transparently, it does not imply that the hearer thinks about the mind of the speaker at all. It describes the content of the hearer’s understanding, but not necessarily by using a description of that content that the hearer herself would employ or understand (…) It means, merely, that the hearer thinks the same content that the speaker purposefully communicates … no thoughts about other people’s minds are necessary in order to grasp their meanings during ordinary communication …” (2006: 131). (And compare Recanati 2002: 113.)

Unlike genuinely coded communication, successful linguistic communication often does require communicators to be able to draw each other’s attention to what they are attending to, and more generally, to know what is on each other’s minds. But that does not mean that speakers and hearers must be able to think about each other’s thoughts and other states of mind, or have intentions, beliefs, or desires directed at those states. Put differently, the enabling conditions for successful mind-dependent communication may include communicators’ ability to see what others see, or hear what they hear, or notice what they notice (as well as recognize each other’s other states of mind). But the ‘methods’ or ‘techniques’–the underlying psychological mechanisms–used in accomplishing such psychological coordination need not involve representing others’ mental states as such, attributing specific communicative intentions, beliefs, and desires, or having beliefs about those states–where this, in turn, is taken to require having conceptual understanding of mental states. And, moreover, the goals of such communication in no way need to include coming to understand, or know about, what is on others’ minds. Of course, it is consistent with this claim that more sophisticated human communicators sometimes do employ such metarepresentational attributions and engage in fully ostensive-inferential communication. What matters, in the present context, is whether the capacity for ostensive-inferential communication is necessary for the emergence of linguistic communication, as Griceans maintain. Millikan would deny that it is. If she is right, this opens up the possibility that the capacity for fully Gricean, ostensive-inferential communication is parasitic on linguistic capacities and that the former cannot precede the latter in either phylogeny or ontogeny.Footnote 20

Millikan notes in this connection a mistaken assumption that has driven contemporary debates concerning whether animals and young children possess a ‘theory of mind’. This is the assumption

“that to represent a mental state … requires knowing certain things about what it is… [so] must involve knowing things that our current theories take to be definitive of mental states. In particular … one would recognize that another individual harboured a false belief and could predict their behaviour accordingly.” (2017: 104).

Millikan thinks it is a mistake to suppose “that recognizing another’s mental states requires having a theory of mind … [t]hat a ‘theory’ would be required for awareness of another’s mental states” (2017: 103), so that, for example, for an infant to learn to look where a parent looks when she says “See that doggie?” the infant would have to “employ … concepts of mental states” and “understand the innards of minds” (2006: 133).

As an antidote to the familiar ‘theory’-theory construal of mindreading, Millikan proposes that recognition of mental states “might involve merely affording knowledge rather than factic knowledge” (2017: 104, emphasis added). To have affording knowledge of, say, another’s mental state, one does not need to possess factual information about–or any theoretical understanding of–what states of that type are. One does not need to possess ‘factic’ beliefs about the state as such, or to have a concept of the state (as traditionally understood). One only needs to have a kind of practical knowledge: an ability to “recognize [the state] in some way or ways so as either to collect information about it or to learn how to deal with it” (op. cit.). For example, a dog can “perceive, and thus represent” what is, in reality, a squirrel’s intention to escape up a tree. The dog can learn different ways of recognizing when a squirrel intends to escape. But, says Millikan, “[t]he dog no more needs to grasp the true nature of squirrel intentions in order to represent and take account of them than you or he needs to grasp the true nature of water–…–in order to represent it” (2017: 106).Footnote 21

Putting together Tomasello’s Gricean insight with key elements in Millikan’s anti-Gricean view, we can arrive at what I take to be a plausible and genuinely intermediary pragmatics-first approach. Unlike Gricean approaches, this intermediary approach would not limit its attention to communicators who can have informed beliefs about mental states, or possess conceptual or theoretical understanding of what having such states amounts to. It would study any aspects of animals’ use of calls or other communicative signals that rely on animals’ capacity to ‘perceive, and thus represent’ each other’s states of mind. Arguably, many animals exhibit this capacity in their uptake of expressive behavior–behavior that is designed openly to show psychological states, and thus enables non-inferential perception or recognition of those states (in the sense of Recanati 2002; Millikan 1984, 2006, 2017; see Bar-On 2013, 2018, Arnold & Bar-On 2020). Communication that exploits such a psychological capacity would be, in my sense, mind-dependent without being theory-of-mind-dependent. Coded communication does not exploit such a capacity; animals whose communication involves purely encoding and decoding signals have no need to rely on psychological mediation. When it comes to coded communication, the derivation of the contents of signals is accomplished automatically, or reflexively, or through associative learning.Footnote 22

In contrast with coded communication, when it comes to psychologically mediated communication, the accomplishment of the (distal) communicative function of signals essentially relies on the accomplishment of a more proximal psychological function. Thus:

Intermediary pragmatics–a synthesis

The study of communicative devices whose distal proper function is designed to be accomplished via the fulfillment of a proximal psychological function: devices whose uses are psychologically mediated, in that they rely on producers’ and receivers’ sensitivities to–or recognition of–each other’s states of mind and a non-theoretical ability to represent those states.

In keeping with the Gricean insight, explaining the emergence of linguistic communication requires explaining the emergence of psychologically mediated communication. The latter would represent an evolutionary innovation relative to coded communication. On my proposed account, this innovation would have appeared on the evolutionary scene when our nonhuman ancestors, who already had the capacity for openly showing and non-inferentially perceiving each other’s psychological states, began to harness this capacity to enhance or modify their use of communicative signals.

Intermediary pragmatics: how to do things with nonwords

One implication of the foregoing discussion is that advocates of pragmatics-first approaches should not assume from the outset that nonhuman-human communicative continuities depend, specifically, on whatever similarities can be found between animal signals and discrete natural-language words or phrases that encode symbolic, conventional meanings. As the above discussion of linguistic devices such as “this” illustrates, improvisation most clearly plays an indispensable role when it comes to communication via expressions that do not have fixed symbolic-conventional content. It can also play a crucial role when it comes to expressions that have highly open-ended, situation-specific content–e.g., indexicals such as “here” and “there”, exclamatives (e.g. “wow!”, “yay!”, “hey!”, and also “psst” or “shh”), expletives (e.g. “dammit!”), but also, inter alia, proper names and definite descriptions–devices whose successful use in communication depends, at a minimum, on communicators’ ability to attract, gauge, and modify each other’s attention to specific features of the given situation. Indeed, on one influential view of language evolution, due to linguist Ray Jackendoff (2002), a subset of these sorts of non-symbolic devices, which carry no or limited encoded meanings, constitute present-day linguistic “fossils” of Protolanguage: a hypothetical intermediate stage in the evolution of language from animal communication. What would have rendered Protolanguage an intermediate stage in our ancestors’ journey to language is the fact that it would have consisted exclusively of a small repertoire of non-symbolic devices–holophrastic elements, with very minimal semantics and no syntax (as exemplified by the fossils) (2002: 235ff.).Footnote 23 If the intermediary pragmatics-first approach envisaged here is on the right track, it could shed light on what would have rendered Protolanguage pragmatically–and not only semantically–intermediate: users of Protolanguage would have essentially relied on their non-Gricean capacities for situation-specific psychological give-and-take.Footnote 24

On the potential evolutionary significance of ‘nonwords’

This way of understanding the contribution of my proposed intermediary pragmatics-first approach to the origins of language gives rise to the following two questions:

(i) What might be the evolutionary advantage of having a Protolanguage that works in a psychologically mediated (yet non-Gricean) way? And

(ii) What non-Gricean capacities exhibited by extant nonhuman animals might have made it possible for such a Protolanguage to have emerged?

Attempting to provide full answers to these questions goes beyond the scope of this paper. But here are some initial thoughts.

As regards (i): The survival benefits of all animal signals, we may assume, are just the benefits (to producers and receivers) of the receivers’ taking an action that is appropriate to perceived situations (avoiding predation, getting food, securing a mate, increasing social bonds, and so on). In social creatures that have states of mind (such as sensation, perception, attention, but also affective states such as fear, agitation, and so on), being able to recognize and spontaneously respond to various features of these states can be very useful, insofar as it can greatly enhance their ability to engage in appropriate behavior and modify it flexibly in response to changing circumstances, by allowing individuals to take advantage of each other’s states of mind. (It can no doubt be useful for me to recognize your fear of something that I myself had failed to notice, and of which I, too, should be afraid.) To put it in Millikanian terms, in minded creatures, others’ states of mind constitute significant affordances. But, now, suppose this capacity for ‘minding other minds’ can be coopted (or ‘harnessed’) in their use of communicative signals. This would allow communicators to make more diverse and flexible uses of elements of their limited, noncombinatorial (and unlearned) repertoires, thereby significantly increasing the number of messages they can convey and understand. (Users of Jackendoff’s Protolanguage would have been able to convey a reasonably wide–even if by no means unlimited–range of situation-specific messages. Think of how many messages can be conveyed by pointing, for example.) Such an increase in expressive power and communicative agility would no doubt help animals navigate their environment, both physical and social. But suppose–as suggested earlier–that being attuned to key features of others’ states of mind, as openly displayed in some of their (expressive) behaviors, does not require thinking about, or attributing to others, mental states as such; that is, suppose it does not rely on a(n even minimally) metarepresentational theory of mind. Then, in contrast with narrowly context-dependent, rigidly coded, and ‘unminded’ communication, mind-dependent communication as described here could be seen to have some of the advantages of Gricean communication, yet without relying on the cognitive resources required for the latter.Footnote 25 To repeat, then, on the present proposal, the key evolutionary innovation that could have put our ancestors on their way to linguistic communication would have involved the coupling of a widespread capacity for context-dependent use of communicative signals with a less widespread capacity for non-theory-of-mind representation of others’ states of mind. It is when the latter capacity is harnessed so as to allow animals to modify and augment the communicative function of the former that we begin to have ‘communication from a psychological point of view’.

Psychologically mediated communication in monkeys?

We can now turn to question (ii) above. As we saw earlier, there are some indications in current studies of chimpanzees’ use of unlearned calls that primates exhibit some capacity for psychologically mediated communication. But it would be instructive to consider whether some origins of such communication could be found even earlier, phylogenetically speaking. What about monkey calls, which (as we have seen) have been used to motivate pragmatics-first approaches–and which are often dismissed by Griceans as entirely irrelevant to the origins of human communication? Here, again, adopting an intermediary pragmatics-first approach would mean establishing, at a minimum, that monkeys’ use of their unlearned calls (in production and reception) is not fully ‘scripted’, or dictated by monkeys’ perception of the non-social situation, and that in producing and interpreting calls monkeys directly rely on their perception of each other’s states of mind–that communication via calls involves psychological give and take as described above.

To illustrate, consider the call system of putty-nosed monkeys–an arboreal species of monkeys belonging to the genus Cercopithecus.Footnote 26 Putty-nosed males have a repertoire of three ‘loud’ call types that can carry over long distances: booms, pyows, and hacks. Booms are very rarely heard and occur in a wide range of contexts, whereas pyows and hacks are produced frequently. All these calls were initially thought to be functionally referential (just as the vervet alarm calls had been claimed to be) (Eckhardt & Zuberbühler 2004). However, in a reevaluation of the putty-nosed call system, Arnold & Zuberbühler (2013) argue that pyows and hacks–whether produced discretely or in sequences–simply do not behave like functionally referential labels for distinct types of predator (2013: 1). They note that call series were observed to be produced in a variety of contexts that did not involve predators at all (2013: 2). And even when produced discretely, neither hacks nor pyows seemed tightly linked to the presence of specific types of predators. Instead, their use seems to depend on “a high degree of flexibility in both call production and comprehension that is absent in context-bound, though potentially more informative, signals”. Pyows, especially, “appear to function primarily as an attention getter” (2013: 5, and compare 2012: 307). Given the relatively loose, unstable relationship between calls and the presence of (specific types of–or any) predators, the authors conclude that the use of both types of calls must rely “on listeners’ abilities to integrate information from a number of sources” (ibid.).

Notably, Arnold & Zuberbühler observe that “listeners … attempt to acquire additional information about the behavior of the caller” in order to determine the cause or target of the call (2013: 2, emphasis added). Thus, when a putty-nosed male produces a pyow call, his body posture and other features of his demeanor reveal aspects of his state of mind–whether he is alarmed or relaxed, and, if alarmed, how alarmed he is, what he is alarmed by, what his attention is directed at, and how he is preparing to act next. Upon hearing his call, other group members within sight can be observed to actively attempt to find out more about his state in order to learn what the male was calling about, rather than immediately reacting by reflexively engaging in a fixed pattern of anti-predator behavior. If females with visual access observe the male’s alert body posture and his gaze as fixated on the threat, they will subsequently chime in with their own chirp calls. And other group members who lack visual access to the male appear to be alerted to the threat upon hearing female chirp calls that accompany the male calls. Only then do they approach the threat and begin calling and mobbing. All in all, what Arnold & Zuberbühler seem to describe is an intricate and highly dynamic pattern of ‘division of communicative labor’ surrounding putty-nosed monkeys’ intragroup calling behavior.Footnote 27 Determining whether this description is correct would require more careful analysis. But if it is, this would suggest that the communicative work of at least some monkey call systems goes beyond the integration of environmental cues, general background knowledge, and context-dependent information that is tightly associated with innate call types. It relies crucially on monkeys’ recognition and uptake of multiple psychological aspects of the calling situation (as openly shown in monkeys’ expressive behaviors). In that case, putty-nosed monkeys’ use of calls, too, would fall under the scope of intermediary pragmatics as presented here.

***

In this paper, I have argued that, if we are to identify potential precursors of human linguistic communication in animal communication, we ought to adopt a genuinely intermediary pragmatics-first approach. Doing so, I have suggested, means looking beyond the Carnapian narrow context-dependence of the informational content of animals’ communicative signals, as well as beyond their Millikanian distal proper function. In keeping with the Gricean insight concerning communication ‘from a psychological point of view’, this requires investigating the extent to which the relevant forms of communication exploit a capacity for mind-dependent context-dependent uses of signals. Such uses would resemble and potentially foreshadow what Millikan describes as ‘improvisational’ uses of non-symbolic linguistic devices. An intermediary pragmatics-first investigation could help shed light on the emergence of linguistic communication by bringing into view a way of reconceiving the puzzle of the evolution of language. In approaching the puzzle, we should not be asking, in the first place: How could metarepresentational ostensive-inferential communication have emerged from merely coded animal communication? We should instead be asking: How could animals’ psychological capacity for non-Gricean recognition of each other’s states of mind come to be harnessed so as to enable psychologically mediated uses of communicative signals? Initially posing our question in this way would, I believe, open up promising new directions in the study of animal communication and the evolution of linguistic communication.