1 Introduction

Since it became a topic of empirical research, the study of children’s theory of mind – their understanding of the underlying psychological basis of behavior – has been dominated by the discovery that younger children systematically fail false-belief tasks, and start to succeed sometime after their fourth birthdays (Wimmer and Perner 1983; Wellman et al. 2001). The debate regarding the interpretation of this discovery has divided philosophers and psychologists along nativist and empiricist lines. Empiricists have claimed that the shift in performance on false-belief tasks around children’s fourth year signals their acquisition of a genuinely meta-representational concept of belief (Perner 1991; Gopnik and Wellman 1992). Nativists have argued that younger children’s failures reflect a performance error related to children’s underdeveloped executive and attentional resources and the processing demands inherent to the task, rather than a fundamental lack of competence with the concept of belief (Fodor 1992; Leslie et al. 2004).

In the two decades after the false belief task was first introduced as a measure of theory of mind development, both empiricist and nativist camps remained firmly entrenched (see, for example, Scholl and Leslie’s (2001) response to Wellman et al. (2001)). More recently, new methods for studying false belief understanding in preverbal infants appear to have vindicated the nativist position (Onishi and Baillargeon 2005; Buttelmann et al. 2009b; Kovács et al. 2010; Senju et al. 2011; Barrett et al. 2013; Buttelmann et al. 2014; Southgate and Vernetti 2014). These studies seem to show that while younger children do systematically fail false-belief tasks that attempt to elicit explicit, communicative responses, infants as young as 6 months of age understand false beliefs in tasks where success is measured by their spontaneous reactions to behavior, either with anticipatory looking, violation of expectation, active helping, or EEG paradigms.

Interpreting these findings has created a great deal of controversy. Many authors have argued that such implicit measures do not actually demonstrate genuine meta-representational abilities, and offered a variety of alternative interpretations that preserve the empiricist narrative (Perner 2010; Gallagher and Povinelli 2012; Butterfill and Apperly 2013; Heyes 2014). In response, nativists have produced a steady stream of new empirical results aimed at refuting these deflationary hypotheses (Senju et al. 2011; Buttelmann et al. 2014; Moll et al. 2015; Scott et al. 2015). Others have defended a rich, mentalistic interpretation of the new infancy data on theoretical grounds, pointing out that the post hoc nature of many of these deflationary proposals counts against their credibility, while other proposals seem to be ill-equipped to explain the sheer range and flexibility of infants’ socio-cognitive abilities (Baillargeon et al. 2010; Carruthers 2013; Scott 2014; Scott and Baillargeon 2014; Thompson 2014; Christensen and Michael 2015). Thus, in spite of this controversy, theory of mind nativism continues to enjoy substantial evidential support, and is currently driving a highly productive research program. There is, in short, good reason to think that children possess the concept of belief well before they pass the false belief task.

However, even if we accept this conclusion (as I will in this paper), nativism about theory of mind development still faces a significant challenge when it comes to explaining why younger children systematically fail the standard false belief task (hereafter, FBT). The standard nativist line is that these tasks impose severe demands on young children’s still-developing executive resources, which causes them to fail. But this account is ill-equipped to explain why certain forms of social experience and training affect when children succeed on the FBT, as it is not clear that the findings in question could be explained solely in terms of a child’s developing executive abilities.

The goal of the current paper is to show how these findings fit into a revised nativist framework. Typically, data showing a role for social experience in theory of mind development are cited in support of empiricist accounts. However, there is no inconsistency between nativism and a role for social learning. All contemporary nativist approaches to the mind are meant as explanations for how individual learning takes place; they do not deny that individuals ever learn at all, or that innate knowledge is ever enriched by experience (pace Fodor (1975)). But it is incumbent upon the nativist to explain how various types of experience lead to individual differences in theory of mind development.

This is the challenge that I take up in this paper. Given the strong evidence in its favor, and its growing acceptance in the field, I will be taking the claim that young children can represent beliefs as a point of departure. Those who are agnostic or skeptical of this claim are invited to view the account I’ll be laying out as a way of filling in the following conditional: if infants could represent beliefs, how would we go about explaining the influence of social experience on FBT performance? Providing the best answer to this conditional question should be important even for those who ultimately reject its antecedent.

My proposal, which I’ll call the pragmatic development account, is that while young children are capable of representing beliefs early on in development, they are not yet very good at understanding when facts about belief are relevant to conversation. In spite of the fact that they constantly attribute beliefs, desires, goals and intentions to other agents, understanding when these pre-linguistic concepts are implicated in conversation is not just a matter of acquiring the right vocabulary. Young children do not initially expect people’s beliefs to be a topic for conversation. They have to learn this through experience with the pragmatics of belief discourse – that is, during social interactions in which facts about beliefs are implicated in conversation. With this experience, children learn to adjust their prior expectations about the relevance of doxastic facts when interpreting particular speech acts.

As a result, different levels of exposure to belief discourse can affect how children interpret questions like the ones they must answer in FBTs. When they lack the requisite experience, children are prone to misinterpret the crucial false belief query as a kind of indirect speech act that is not about beliefs at all. But as they gain more experience with belief discourse, children start to recognize the true purpose of the experimenter’s question, and respond accordingly. In other words, younger children fail the FBT not because they lack the concept of belief or because the tasks are too executively demanding, but due to a mistaken Gricean inference.

The pragmatic development account is not wholly new. Siegal and Beattie (1991) proposed a Gricean account of younger children’s systematic failure on FBTs. They argued that three-year-olds are typically too inexperienced to pick up on experimenters’ conversational implicatures during the FBT; as a result, they fail to grasp the relevance of mentalistic factors to the experimenters’ questions, opting instead for a more familiar, world-oriented interpretation. Thus, when children hear “Where will Sally look for her marble?” they interpret it as, “Where will Sally have to look for the marble in order to find it?” rather than “Where will Sally look for her marble first?” Siegal and Beattie supported this interpretation by showing that three-year-olds tended to pass a modified version of the FBT in which they were asked the latter question, even though they would still fail when asked the former. Later, Surian and Leslie (1999) both replicated Siegal and Beattie’s findings and expanded upon them by showing that a similar manipulation failed to improve the performance of a control group of individuals with autism spectrum disorder (a population widely believed to suffer from a chronic theory of mind deficit). In support of a similar hypothesis, Hansen (2010) argued that children in the FBT might interpret the experimenter’s query as a question about the state of the world, rather than a question about the agent’s mental states. To this end, he showed that young children perform much better on FBTs in which the experimenter makes it clear that he is not asking about the state of the world: “You and I both know where Sally’s marble is, but where does Sally think it is?” Pursuing a different version of the pragmatic development strategy, Helming and colleagues have proposed that it is children’s propensity to be helpful which leads them to misinterpret the experimenter query during the FBT (Helming et al. 2014).

The current proposal builds upon these earlier pragmatic accounts in that it specifically engages with the social learning that goes into passing the FBT, rather than simply focusing on the on-line pragmatic demands of the task itself. It also makes a number of novel recommendations about how to tease apart the respective contributions of children’s executive abilities and their pragmatic understanding of the task.

In the next section, I describe current nativist explanations of children’s failure on the FBT, and then present findings that seem prima facie inconsistent with these explanations. In the third section, I lay the groundwork for the pragmatic development account, and present several arguments for why belief discourse poses pragmatic challenges for the novice speaker. In section 4, I present the core elements of the pragmatic development account, and show how it is able to explain children’s performance on a wide range of FBTs. In section 5, I show how the account is able to accommodate a particularly challenging set of findings from Call and Tomasello (1999). In section 6, I end by making several predictions that would distinguish my own view from other nativist accounts.

2 A Challenge for Existing Nativist Accounts

Many of the prominent nativist accounts of theory of mind development have focused on the processing load that the FBT places on younger children’s developing executive functioning. Baillargeon and her colleagues’ response account, for instance, posits that younger children are unable to cope with the demands of simultaneously attributing a false belief, selecting a response to the experimenter’s question, and inhibiting a prepotent tendency to answer the experimenter’s question with her own knowledge, perhaps due to still immature connections between mindreading and executive regions of the brain (Baillargeon et al. 2010). Along similar lines, Leslie and colleagues have argued that success on FBTs is modulated by the development of a domain general selection processor responsible for inhibiting the mindreading system’s tendency to attribute the subject’s own beliefs to others by default (Leslie and Polizzi 1998; Leslie et al. 2005). Carruthers (2013) also holds a “processing load” view, but emphasizes that all three components of FBTs – attributing a false belief, interpreting the experimenter’s question, and generating a response that will communicate the appropriate information to the experimenter – involve mindreading (see also Sperber and Wilson 2002). According to this triple mindreading account, executing each of these tasks simultaneously places heavy demands on both processing resources internal to the mindreading system and general executive resources, both of which may be insufficiently developed in younger children, which explains why younger children fail the FBT while still possessing the concept of belief.

In line with these “processing load” accounts, a number of studies have found that performance on various executive tasks predicts earlier success on the FBT (Carlson et al. 2002; Benson and Sabbagh 2005). However, the overall correlation between these two constructs is in fact quite weak. According to a recent meta-analysis by Devine and Hughes, the correlation between executive functioning tasks and performance on FBTs is only .22 after controlling for age and verbal ability, with differences in executive functioning accounting for only 8 % of the variance in performance on FBTs (Devine and Hughes 2014). Moreover, this correlation was consistent across diverse measures of executive functioning. This suggests that no single component of executive functioning accounts for its relation to FBT performance. Thus, although it does make a small contribution to children’s performance on the FBT, it seems unlikely that executive functioning holds the key to explaining why most children fail the task until after their fourth birthday.

Moreover, any account that appeals solely to the maturation of children’s executive abilities as an explanation of how they come to pass the FBT is ultimately underequipped when it comes to explaining the various experience-related factors that influence FBT performance. For instance, it’s been shown that the extent to which a child’s mother talks about mental states predicts how early that child will succeed on FBTs (Ruffman et al. 2002; Symons 2004; Symons et al. 2006). Beyond maternal interactions, children with older siblings also appear to have an advantage on the FBT (Perner et al. 1994; Ruffman et al. 1998). Further, interventions that train children on various aspects of mental state discourse have tended to improve children’s performance on FBTs (Slaughter and Gopnik 1996; Hale and Tager-Flusberg 2003; Lohmann and Tomasello 2003; Wellman 2012).

Exposure to language in general also has dramatic effects on when children are able to pass the FBT. Deaf children born to hearing parents who are exposed to sign-language late in life are significantly delayed on explicit false-belief tasks when compared to both hearing children and deaf children born to deaf parents (whose FBT performance is comparable to that of hearing children) (Peterson et al. 2005; Wellman et al. 2011). Notably, this delay is not the result of any sort of congenital neurological abnormality (as is the case with children on the autism spectrum) but is instead due to purely environmental factors. Nevertheless, late-signing deaf children still reliably display the same developmental progression through various types of theory of mind problems as typically developing children (e.g. succeeding on problems involving diverse desires before problems involving false beliefs; see Section 4.3). However, late-signing deaf children are able to succeed earlier on FBTs after they are exposed to theory-of-mind-based interventions using “thought bubbles” that draw attention to individuals’ beliefs (Wellman and Peterson 2013).

Some of the most striking evidence for the importance of experiential factors in theory of mind development comes from a natural experiment that took place in Nicaragua during the last few decades of the twentieth century. In 1977, an expanded elementary school for special-needs children was opened in the city of Managua. Here, for the first time, deaf children in Nicaragua came into extended contact with one another. Although their education was conducted in Spanish, amongst themselves the students began to develop their own novel system of gestural communication, an amalgam of the children’s various idiosyncratic home-sign gestures. This system of gestural communication was expanded as older students passed it on to new ones, and rapidly developed into a full-fledged sign language known today as Nicaraguan Sign Language, or NSL (Senghas et al. 2004).

Importantly, the version of NSL acquired by its earliest speakers was less complex than the one acquired by later speakers, and completely lacking in mental state vocabulary (Pyers and Senghas 2009). In a study with adult speakers of NSL that compared the performance of earlier “first cohort” and later “second cohort” speakers of NSL, Pyers and Senghas found that first cohort speakers systematically failed a non-verbal elicited-response version of the FBT, while second cohort speakers were generally successful. In a follow-up several years later, the performance of the first cohort speakers on the FBT had significantly improved. Pyers and Senghas attributed this improvement to an intermingling between first and second cohort speakers of NSL, leading the first cohort speakers to acquire a greater facility with mental state discourse. Note that one could not plausibly attribute the change in the first cohort speakers’ performance on the FBT to a development in executive abilities (as the nativist might for the parallel change in performance of 3–4 year olds), as these subjects were adults at the time of the first test, and likely possessed fully mature executive resources. Indeed, both the difference between first and second cohort NSL speakers and the change in first cohort speakers’ performance appear to be the result of social experiences specifically related to their acquisition of mental state vocabulary.

Explanations of FBT performance that appeal solely to the on-line demands that the task places on executive resources do not tell us much about why these kinds of experiences affect when an individual ultimately overcomes those demands. Even if important maturational changes to children’s executive resources do occur between the ages of three and four, and individual differences in executive functioning do correlate somewhat with individual differences on the FBT, it’s not obvious how these internal cognitive developments could explain why an individual’s social experiences also seem to matter for her performance on the FBT, particularly when that individual does not pass the FBT until long after her fifth birthday (as is the case with language-deprived children). This suggests that a child’s social environment makes an independent contribution to her performance on the FBT.

At this point, one might suggest that language might be the crucial factor in passing the FBT. Indeed, various authors have proposed a crucial role for various aspects of language in theory of mind development, including complementation syntax (de Villiers and Pyers 2002), mental state vocabulary (Montgomery 2005), and the social experience that comes with linguistic interactions (Tomasello and Rakoczy 2003; Dunn and Brophy 2005; Harris et al. 2005); however, in a recent meta-analysis of the theory of mind and language literature, Milligan et al. (2007) were unable to identify a special role for any single aspect of language independent of general language ability. Moreover, after controlling for age, they determined that linguistic factors accounted for roughly 10 % of the variance in theory of mind abilities. Thus, language, like executive functioning, makes only a small (albeit statistically significant) contribution to performance on the FBT.

However, one limitation of this meta-analysis was that it did not assess how pragmatic learning affects children’s performance on the FBT. In the next section, I explore the pragmatic factors that accompany both the FBT, and belief discourse in general.

3 The Pragmatic Challenges of Belief Discourse

To begin to make sense of all the above-mentioned findings within a nativist framework, we must first be clear about the basic problem that contemporary versions of theory of mind nativism are meant to solve, namely, explaining how even very young children’s spontaneous expectations about behavior seem to be sensitive to the mental states of others. Nativists posit that they are able to do this because they possess innately channeled inference mechanisms that take observable behaviors as input and generate mental state attributions as output. But this account only explains how young children come to possess mental state concepts. Learning to apply these concepts in an adult-like manner in linguistic interactions is another story. A novice speaker of a language, even one who is able to represent the mental states of others, may nevertheless demonstrate non-adult-like performance on tasks that require her to interpret other speakers’ utterances as being about mental states. After all, the nativist’s hypothesis is about where our conceptual understanding of mental states comes from, not how we learn to talk about them. The nativist about mindreading is silent when it comes to explaining how we learn to participate in mental-state discourse – which, it turns out, is surprisingly tricky for the novice speaker. In particular, it appears that younger children do not expect beliefs to be a likely topic of conversation.

3.1 References to Beliefs in the Explanation and Description of Behavior

To see why doxastic facts pose a particular difficulty for the novice speaker, consider first the asymmetrical roles that beliefs and desires play in ordinary folk-psychological explanation (Rakoczy et al. 2007; Steglich-Petersen and Michael 2015). Suppose, for instance, that we observe Sally walk over to the cookie jar and open the lid. When asked why Sally opened the lid to the cookie jar, a natural and perfectly informative response would be, “Because she wanted a cookie.” Note that this response makes no mention of Sally’s beliefs – just her desires. Now, consider an alternative response: “Because she wanted a cookie, and she believed that there would be cookies inside the jar when she opened it.” This explanation, while accurate, is a bit odd. To mention Sally’s belief in this context seems to provide too much information, a violation of Grice’s Maxim of Quantity (Grice 1991). Sally’s belief about the cookie jar is so obvious that it is simply not worth mentioning. This is because when we give explanations of this type, we tend to presuppose that facts about Sally’s beliefs are a part of the conversational common ground. Even when this is not in fact the case, and the listener actually does not take facts about Sally’s beliefs to be in the common ground, the speaker’s act of only referring to Sally’s desires is itself evidence that some fact about Sally’s beliefs has been presupposed. It is then incumbent on the listener to supply that fact herself in order to render the explanation coherent. [Footnote 1] Thus, overt reference to beliefs is notably absent from even this very simple instance of a folk psychological explanation; in its place, we find a subtle practice that relies upon presupposition and pragmatic inference.

Our descriptions of behavior also seem to frequently omit reference to beliefs. In an elegant series of experiments, Papafragou et al. (2007) presented both adults and children with short scenes, which the subjects were then asked to describe. In their control conditions, they found that both adults and children tended to make very few references to the actors’ beliefs when describing the scenes, opting instead to refer to agents’ goals, or simply to their overt physical behaviors. However, the experimenters hypothesized that both children and adults would be more likely to describe a scene in terms of actors’ beliefs when they are provided with additional cues that make doxastic factors more salient. Specifically, they predicted that the presence of syntactic cues from sentences with clausal complement structure (e.g. “Sally believes THAT the marble is in the box,”) or situational cues in which a character acts on a false belief would prompt subjects to use more belief words.

To test these predictions, the authors presented both adults and children between the ages of three and five with silent scenarios showing actors engaged in various activities. Some of these scenarios showed actors performing simple actions, while others showed the actors acting on false beliefs (e.g. absent-mindedly drinking from a flower vase that had been placed where their water glass was while they were not looking). In some cases, these scenes were accompanied by nonsense sentences containing either a clausal complement structure introduced by ‘that’ (e.g. “Vanissa LODS that she ziptorks the siltap”), a transitive structure with a direct object (“Vanissa VAMS the torp”), or an intransitive structure (“Vanissa TROMS”). Across their experiments, they found that both the false belief scenario and the clausal complement cue substantially increased both adults’ and children’s references to beliefs when describing what they saw. This effect was strongest when both cues were co-occurring; when such cues were absent, they tended to describe the scene using non-doxastic vocabulary.

These results show two things: first, that we do not spontaneously refer to beliefs in our behavioral descriptions; second, talk of beliefs seems more likely when some feature of the situation has raised the saliency of belief facts. Thus, in description, as with explanation, doxastic facts are not often mentioned under ordinary circumstances. Yet representations of belief facts still appear to be available, as overt references to them can be prompted by the presence of a syntactic cue. The fact that false belief scenarios do prompt references to beliefs is also telling, because it suggests that it is only in somewhat unusual circumstances that it becomes important for speakers to draw attention to beliefs. This suggests that while we do represent the beliefs of others, it is only in special circumstances that these representations get overtly mentioned in conversation.

If this asymmetry in the role of beliefs in the explanation and description of behavior were in fact pervasive in the novice speaker’s linguistic input, then we would expect a corresponding asymmetry in the frequency of overt references made to beliefs and desires in child-directed speech. There is some indication that this is in fact the case: according to the Child Language Data Exchange System (CHILDES) database, by age 4, children have heard the verb ‘think’ an average of 611,220 times, while they have heard ‘want’ 1.3 million times (MacWhinney 2014). [Footnote 2] We see something similar in a study conducted by Taumoepeau and Ruffman, in which mothers were made to tell a story to their children from a book containing only images: references to desires were roughly twice as frequent as references to beliefs (Taumoepeau and Ruffman 2006). These findings provide support for the claim that we frequently omit references to beliefs in our explanations and descriptions of behavior. They also highlight a more basic fact, namely that belief discourse input is relatively sparse for a novice speaker, at least when compared to desire input. [Footnote 3]

In both our explanations and our descriptions of behavior, then, facts about belief are often left implicit. For adults, this pragmatic dimension of belief discourse is barely noticeable, and engaging in these discursive practices is positively effortless. But for a child – even one who possesses the concept of belief – this might make belief discourse rather difficult. Not only must the child be able to grasp the role of beliefs in generating behavior, but she must also know that common knowledge of these facts is often being presupposed during conversation. But until she has learned this, she will only notice that talk of beliefs is comparatively rare. For the child, it will seem as though beliefs are not the sort of thing that people are often interested in talking about.

3.2 The Pragmatics of ‘Thinks’

Another factor adding to difficulties associated with belief discourse is that the verb that we most often use to express the belief concept, ‘think,’ is generally not used to attribute beliefs. Often, ‘think’ is used in indirect speech acts as a way of proffering a complement clause that the speaker takes to be true (Simons 2007). To illustrate, consider the following exchange:

Agnes: When does the game start?

Roberta: I think that it starts around 7 pm.

Interpreted literally, Roberta has responded to Agnes’ question by self-attributing a belief about the game. But this interpretation would be bizarre: facts about Roberta’s mental states are orthogonal to the question under discussion, and Roberta’s referring to them would seem to violate the maxim of quantity by bringing up irrelevant information. Of course, we do not interpret Roberta’s utterance in this manner because it is clear that the primary illocutionary act being performed is not, in fact, about Roberta’s mental states, but rather about the game itself. [Footnote 4] In the exchange above, Roberta is using ‘think’ as a way of indirectly endorsing the truth of the complement clause, namely, that the game starts at 7 pm. Used in this “parenthetical” manner (Hooper 1975; Simons 2007), sentences of the form “S thinks that P” become pragmatically enriched so that they imply that the speaker takes P to be true; in contrast, literal, attributive uses of “S thinks that P” are neutral with respect to the truth of P. [Footnote 5]

Thus, utterances containing ‘think’ often require an additional inference about speaker meaning to determine whether it is being used indirectly or attributively, which in turn impacts whether or not the complement clause is being asserted as true. Even worse (from the perspective of the novice speaker), indirect uses of ‘think’ appear to be far more common than attributive uses: corpus analyses of child-directed speech reveal that the overwhelming majority of adults’ uses of ‘think’ are of the indirect variety; correspondingly, most of younger children’s early uses of ‘think’ tend to be indirect and first-personal in nature, rather than genuine belief ascriptions (Shatz et al. 1983; Bloom et al. 1989; Diessel and Tomasello 2001). The combination of the infrequency with which we overtly refer to beliefs in explanation and description and the pragmatic noisiness of ‘think’ makes interpreting utterances containing ‘think’ quite challenging for the novice speaker. It is therefore unsurprising that children below the age of four also sometimes show non-adult-like comprehension of ‘think,’ and often seem to treat it as equivalent to ‘know’ (Johnson and Maratsos 1977; Moore et al. 1989).

Multiple authors have interpreted younger children’s difficulties with epistemic verbs as evidence of an underlying conceptual deficit: younger children fail to distinguish the meanings of ‘think’ and ‘know’ because they lack the concepts those words express (Tardif and Wellman 2000; Perner et al. 2003). However, recent experimental evidence suggests that, contrary to the above interpretation, children do in fact demonstrate an adult-like semantic understanding of ‘think’, provided that extraneous task demands have been sufficiently reduced. Specifically, while children do poorly on tasks requiring them to say what an individual thinks, they do much better when they are asked to make truth-value judgments about sentences in which ‘think’ is used attributively (Lewis et al. 2012; Lewis 2013; Hacquard 2014; Dudley et al. 2015).

Pursuing this idea, Lewis et al. (2012) proposed that children’s non-adult-like performance on other tasks involving ‘think’ is the product of pragmatic factors, not a conceptual or semantic deficit. According to this ‘pragmatic development hypothesis,’ three-year-olds do in fact have the appropriate semantics for ‘think’ and the corresponding concept of belief, but they tend to make incorrect inferences about the intentions behind utterances in which ‘think’ occurs, treating literal uses of ‘think’ verbs as indirect by default. This hypothesis predicts that experimental manipulations that make attributive interpretations of utterances containing ‘think’ more salient should lead to more adult-like performance on comprehension tasks.

To test this prediction, Lewis et al. (2012) presented a sample of four-year-olds with vignettes in which cartoon characters played a game of hide-and-seek. After watching one or more characters hide, participants first interacted with a puppet that would ascribe beliefs to the seeker (e.g. “Dora thinks Swiper is behind the toy box,”) and then were asked by the experimenter whether or not what the puppet said was correct. In their first experiment, participants tended to give incorrect truth-value judgments when the puppet accurately ascribed false beliefs to the seeker. However, in their next experiment, a second seeker with conflicting beliefs about the location of the hider was added to the vignette. In this experiment, participants’ truth-value judgments about the puppet’s belief ascriptions improved across all conditions, revealing an adult-like semantic understanding of the verb ‘to think’.

To explain this improvement, the authors suggest that children in the 1-seeker condition failed because they defaulted to an indirect interpretation of the puppet’s use of ‘think’, which led them to infer that the puppet was in fact proffering a false statement. By introducing another conflicting perspective to the scenario, the authors were able to highlight the relevance of the first seeker’s beliefs in the child’s conversation with the puppet, which led the children to interpret the puppet as using ‘think’ attributively and give the correct answer. This suggests that the subjects’ initial responses were not based on a failure to represent the character’s beliefs or an immature understanding of the meaning of ‘think’, but rather a failure to correctly interpret the speaker meaning behind the original belief ascription made by the puppet.

Notably, standard nativist accounts of children’s theory of mind development that stress the development of executive functioning would not have predicted this result. Such an account would have predicted that the addition of the second seeker would have made the task harder, since adding another perspective to the situation would have given the subjects yet another concurrent mindreading task and increased the executive burden of the task. The fact that adding the second seeker did not have this effect is further evidence that demands on executive functioning do not fully explain children’s systematic failures on the FBT.

Introducing a telling contrast in order to highlight the attributive interpretation of ‘think’ has also proven effective with three-year-olds. Arguing along similar lines as Lewis et al. (2012), Hansen (2010) showed that younger children’s success rates on FBTs surpass chance levels when experimenters ask, “You and I both know where Sally’s marble is, but where does Sally think it is?” This manipulation is particularly effective, since it actually introduces two pieces of contrastive information that serve to highlight the relevance of the subject’s doxastic state. First, by drawing attention to the knowledge she shares with the child, the experimenter’s query serves to highlight the fact that Sally does not share in this knowledge. Second, the query involves the use of both ‘know’, which has a factive semantics, and ‘think’, which does not. Contrasting ‘think’ and ‘know’ in this context is an effective way of eliminating the possibility that ‘think’ is being used indirectly, since such an interpretation (which tends to imply that the complement clause is true) would render the contrast with ‘know’ uninformative. Thus, both pieces of contrastive information contained in the experimenter’s query lead the child to interpret the topic of conversation to be Sally’s beliefs rather than reality. Both Hansen (2010) and Lewis et al. (2012) thus provide compelling evidence that children are able to grasp that ‘think’ expresses a belief attribution, provided that other, non-doxastic interpretations of ‘think’ are excluded by contextually relevant information.

4 The Pragmatic Development Account

One thing that the studies by Hansen and Lewis et al. tell us is that we should expect younger children to have difficulties on FBTs that ask them what a particular agent thinks (e.g. Jacques and Zelazo 2005; Low and Simpson 2012): in those tasks, children are likely defaulting to an indirect interpretation of the verb, rather than an attributive one. Notably, the subset of FBTs that employ the ‘thinks’-question includes the majority of unexpected-contents and deceptive-object versions of the FBT (e.g. Perner et al. 1987; Gopnik and Astington 1988) [Footnote 6]. This is because these tasks all try to draw children’s attention to their own or another agent’s prior expectations about the world, which requires an overt reference to beliefs. Since younger children do not expect beliefs to be a topic of conversation, and are used to ‘think’ being used in indirect speech acts, they naturally interpret this to be the experimenter’s true communicative purpose. They thus interpret the question “what will S think is in the box?” as an indirect question about the contents of the box, and respond accordingly.

4.1 “Where will Sally Look for the Apple?”

However, many standard FBTs do not ask children what a particular character will think, but rather where she will look (e.g. Wimmer and Perner 1983). It might be objected that the above results tell us nothing at all about why younger children fail this kind of task, since the word ‘think’ is never actually used. But this objection misses the point of the proposal. The frequency of indirect uses of ‘think’ and the absence of overt references to belief in explanations and descriptions of behavior would, according to this account, lower the probability that the child assigns to belief facts being relevant to interpreting the speech acts of others. This would hold true regardless of whether the word ‘think’ is used in a given utterance. Thus, when an experimenter asks a child, “Where will Sally look for her marble?” she wants the child to show that she knows that Sally believes that the marble is in its old location. But, if the child has had little experience with belief discourse, then she is unlikely to judge that this is the speaker’s true communicative intention. Because such an interpretation would implicate beliefs as a topic of conversation, its low prior probability would place it at a disadvantage relative to any other, competing interpretations that might be available. Thus, the child would be unlikely to attribute the doxastic interpretation to the experimenter’s speech act.
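The appeal to prior probabilities in this argument can be made concrete with a minimal sketch. It assumes, purely for illustration, that the child weighs just two candidate interpretations of the experimenter's question and combines priors and likelihoods by Bayes' rule. The function name `posterior` and every number below are hypothetical and are not drawn from any of the studies cited.

```python
def posterior(priors, likelihoods):
    """Normalized posterior over interpretations, computed with Bayes' rule."""
    unnormalized = {k: priors[k] * likelihoods[k] for k in priors}
    total = sum(unnormalized.values())
    return {k: v / total for k, v in unnormalized.items()}

# Hypothetical assumption: the question "Where will Sally look for her marble?"
# is equally compatible with both readings, so only the priors differ.
likelihoods = {"doxastic": 0.5, "non_doxastic": 0.5}

# Hypothetical priors: little versus more experience with belief discourse.
novice_priors = {"doxastic": 0.1, "non_doxastic": 0.9}
experienced_priors = {"doxastic": 0.6, "non_doxastic": 0.4}

novice = posterior(novice_priors, likelihoods)
experienced = posterior(experienced_priors, likelihoods)

print("novice:", novice)
print("experienced:", experienced)
```

Under these assumed values the utterance itself does not discriminate between the two readings, so the preferred interpretation is settled entirely by the prior. This is the sense in which a low prior for belief-related readings would put the doxastic interpretation at a disadvantage.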

Importantly, I do not claim that the pragmatics of the false-belief task cause children to lose track of the agent’s beliefs, or that pragmatic factors make this information cognitively inaccessible; this would render my view virtually indistinguishable from a processing-load account. Children, on my account, spontaneously infer and maintain mental representations of the agent’s beliefs and goals throughout the false-belief scenario. But although they represent the agent as having certain beliefs, they fail to infer that the experimenter is interested in those beliefs, and that this interest is motivating her communicative intention. Thus, while children are perfectly capable of representing false beliefs in this scenario, their inexperience with belief discourse leads them to err in their Gricean reasoning about what the experimenter wants from them.

But while younger children do, from our adult perspective, get the answer to the FBT wrong, it is important to note that from their perspective, their answer is perfectly justified. For given their experience with belief discourse, the actual communicative intention of the experimenter – which is to get children to show that they know that Sally thinks the marble is in the incorrect location – would seem quite unusual. It is only natural that children should instead attribute to the experimenter a more plausible communicative intention.

But, one might wonder, how else would the child interpret the experimenter’s query when she asks where Sally will look for the marble? As proponents of other pragmatic development accounts have pointed out, the standard change-of-location scenario creates a set of conditions in which other interpretations of the experimenter’s query would be highly salient to the child. For instance, Siegal and Beattie (1991) suggest that children may interpret the experimenter’s query as “Where will Sally find the marble?” given that obtaining the marble is Sally’s ultimate goal in the FBT scenario, and children treat the impending resolution of this goal as highly salient. Helming et al. (2014) offer a related explanation, drawing upon the well-established finding that children are highly motivated to engage in spontaneous helping behavior (Warneken and Tomasello 2007; Warneken and Tomasello 2009). They suggest that children in the change-of-location FBT would treat the fact that Sally has an unfulfilled goal as highly salient, and would be very motivated to help her fulfill that goal. Consequently, children infer that the experimenter must be indirectly asking them to help Sally; they thus interpret “Where will Sally look for her marble?” as “Where should Sally look for her marble? Let’s help her find it!” They respond by giving the most helpful answer possible – namely, by indicating the actual location of the marble. In other words, children fail to judge that the doxastic interpretation of the experimenter’s question is correct because they judge it far more likely that the experimenter is concerned with (what they see as) the most salient feature of the situation: that Sally needs help. Thus, according to the pragmatic development account, when children hear the experimenter ask, “Where will Sally look for her marble?” they think, “The experimenter wants me to help Sally find her marble.”

This explanation also enables us to make sense of the fact that children tend to do slightly better on FBTs in which the change of location is the result of a deliberate deception (Chandler et al. 1989; Hala et al. 1991; Wellman et al. 2001). In such a context the child may find it less likely that the experimenter (who is also the deceiver) would be inviting the child to undo his deception, which in turn raises the probability that the experimenter is doing something other than inviting the child to help. It also helps to explain why younger children succeed on false-belief tasks that use helping as a dependent measure: when children’s inclination to help is exploited by the false-belief task design, and does not interfere with it, children’s early false-belief competence is on full display (Buttelmann et al. 2009a; Southgate et al. 2010; Buttelmann et al. 2014).

If children’s failures on the FBT are due in part to the fact that they do not see doxastic facts as conversationally relevant, this would suggest that manipulations that raise the salience of these facts should improve performance. A number of findings in the literature support this prediction. As we saw, the Lewis et al. (2012) and Hansen (2010) studies showed that the presence of contrastive information can heighten the salience of belief facts and thus trigger a doxastic interpretation. Asking a child where an agent will look first also leads to improved performance (Siegal and Beattie 1991; Surian and Leslie 1999); perhaps this added specificity simply restricts the range of plausible interpretations, forcing the child to consider the doxastic one more carefully. Asking a child to play out a character’s actions using a toy rather than directly querying them about the character’s actions (i.e. the “Duplo Task”) also helps, perhaps because the play-acting activity naturally leads the child to treat the character’s beliefs as contextually relevant (Rubio-Fernández and Geurts 2013); unlike traditional FBTs, the Duplo Task presents children with a situation in which showing their understanding of beliefs actually makes sense from their point of view. All of these manipulations, whether they involve the verb ‘think’ or not, seem to change the features of the situation in a way that makes children regard doxastic facts as more salient, enabling them to demonstrate their knowledge of the character’s beliefs.

4.2 Social Experience and the FBT

Here, the importance of social experience for understanding the relevance of belief facts becomes clear: children who have had more opportunities to observe and participate in conversations about beliefs seem to be better attuned to the conversational relevance of psychological facts. They may, for instance, gradually encounter more situations in which non-doxastic interpretations of speech acts fail to explain speakers’ behavior, forcing them to entertain alternative, doxastic interpretations. In this manner, children may come to learn that the concept of belief that they deploy to interpret the behavior of others is also regularly implicated (either explicitly or implicitly) in everyday speech, especially in contexts involving diverse beliefs (Lewis et al. 2012) and false beliefs (Papafragou et al. 2007). This newly acquired knowledge prompts children to adjust their prior expectations about the potential relevance of belief-facts when drawing inferences about speaker meaning. They are then better able to disambiguate indirect and attributive uses of ‘think,’ and, most importantly for our current discussion, accurately interpret experimenter queries in the FBT.

This experience could be achieved via exposure to maternal “mind-minded” conversation (Ruffman et al. 2002), interactions with older siblings (Perner et al. 1994; Ruffman et al. 1998), or various forms of explicit training (Hale and Tager-Flusberg 2003; Lohmann and Tomasello 2003). Notably, the absence of these experiences would lead to corresponding delays on FBTs. Late-signing deaf children, for instance, are not exposed to belief discourse until primary school, and consequently they show delays in explicit false belief performance (Wellman et al. 2011); yet, when they are exposed to theory of mind-based training interventions, they rapidly improve (Wellman and Peterson 2013). The first cohort of Nicaraguan signers did not even possess mental state vocabulary when Pyers and Senghas (2009) first tested them on explicit false-belief tasks, which they systematically failed. Several years later, after being exposed to the mental state vocabulary of the second cohort, their performance markedly improved. According to the pragmatic account, what developed in the interim was not a new set of concepts; rather, it was their sensitivity to the contextual factors that rendered beliefs conversationally salient. For the late-signing deaf children, their general deficit in linguistic experience meant that they lacked crucial experience with belief discourse; Wellman and Peterson’s intervention succeeded in compensating for this deficit. For the first-cohort Nicaraguan signers, the language itself was impoverished with respect to mental state terms, which resulted in impoverished experience with belief discourse.

These findings, which resist explanation under accounts that appeal solely to the executive demands of the FBT to explain systematic failures, are convincingly explained under the pragmatic development account. But more importantly, they point to the specific importance of experience with mental state discourse in improving children’s performance on the FBT. Such experiences provide a developmental scaffold for the ability to understand when psychological facts are conversationally relevant.

At this point, one might object that while the pragmatic development account is well-equipped to explain experience-dependent individual differences in FBT-performance, it may seem less obvious that it can explain why most children suddenly pass this task around 4-and-a-half years of age. This sharp developmental shift was convincingly explained by the processing-load account, as it seemed indicative of a biologically-based, maturational change to a child’s executive abilities. But now that the processing-load account has been shown to be inadequate (for the reasons given in Section 2), it is not clear that the space left in its wake can be filled by appealing solely to a child’s experiences with belief discourse.

The strength of this objection depends upon the claim that there is a sharp shift in performance on the FBT around 4.5 years of age. But as we’ve already seen, the precise timing of this shift is in fact quite variable: children with more experience with belief discourse pass the FBT slightly earlier, while children with less experience pass it slightly later; children with significantly less experience (e.g. late-signing deaf children and first-cohort speakers of NSL) pass it significantly later. When we include populations from a broader range of cultures, the age of success on the FBT varies still further: for instance, children from Hong Kong do not pass the task until they are 6 (Wellman et al. 2011), and Samoan children do not pass the FBT until around 8 years of age (Mayer and Trauble 2012). Moreover, even in culturally homogeneous, Western populations, the shift in children’s performance is not particularly abrupt: at 30 months of age, the rate of success on the task is 20%; by 40 months, it rises to 50%; by 56 months, it rises to 74.6% (Wellman et al. 2001). There is, in other words, a gradual, linear change in the rate of FBT success, the timing of which varies substantially in different populations. This broad pattern is consistent with the hypothesis that FBT performance is related to gradually increasing exposure to belief-discourse, which itself may vary on individual and cultural bases [Footnote 7]. Thus, the pragmatic development account is fully able to explain why children pass the FBT when they do, whether or not this happens to occur at 4.5 years of age.
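The gradualness of this change can be checked with simple arithmetic on the three success rates cited from Wellman et al. (2001). The sketch below is only an illustration: it uses those three figures alone, and the helper name `rates_of_change` is hypothetical. It computes the average change in success rate per month between consecutive ages.

```python
# (age in months, percent passing), as cited in the text from Wellman et al. (2001)
points = [(30, 20.0), (40, 50.0), (56, 74.6)]

def rates_of_change(pts):
    """Percentage-point change per month between consecutive age points."""
    return [(p1 - p0) / (a1 - a0) for (a0, p0), (a1, p1) in zip(pts, pts[1:])]

rates = rates_of_change(points)
for (a0, _), (a1, _), rate in zip(points, points[1:], rates):
    print(f"{a0}-{a1} months: {rate:.2f} percentage points per month")
```

On these figures, the success rate rises by a few percentage points per month in each interval, with no step-like jump at any single age.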

4.3 Desire Discourse

The pragmatic development account also helps us understand another major developmental finding in the theory of mind literature, namely that children consistently succeed on verbal tasks that implicate the concept of desire well before those that involve false beliefs (Wellman and Woolley 1990; Hadwin and Perner 1991; Rakoczy et al. 2007). Explaining these findings has proven challenging for nativists, who hold that a basic conceptual understanding of both belief and desire emerges in the first year of life. Leslie and colleagues (Leslie et al. 2004) have argued that desire-based tasks are less demanding on a child’s executive resources than FBTs; however, Rakoczy et al. (2007) have shown that the gap between desire and false belief persists even when both types of task are matched for logical complexity.

An initial prediction of the pragmatic development account is that this phenomenon is likely to have its roots in children’s conversational experiences, and indeed, there is reason to believe that this is the case. As I argued in Section 3, both our explanations and descriptions of behavior tend to omit any overt reference to beliefs and refer only to desires, which leads references to desires to be roughly twice as frequent as uses of ‘think’ [Footnote 8]; thus, children have a much greater input for desire discourse than for belief discourse (see also Smiley and Huttenlocher 1989; Taumoepeau and Ruffman 2006). Moreover, the frequency of indirect uses of ‘think’ makes the input for belief discourse fairly noisy, whereas ‘want’ does not seem to pose the same kinds of pragmatic difficulties [Footnote 9]. One would expect, then, that proficiency with desire-discourse would precede proficiency with belief-discourse, as the input for the former would be both greater and more easily interpretable than the input for the latter. Thus, according to the current account, children succeed on tasks involving desire before they succeed on tasks involving belief because desire discourse lacks the pragmatic difficulties posed by belief discourse.

5 A Problem Case: Call and Tomasello (1999)

One set of findings that seems to raise doubts about the pragmatic development account is reported by Call and Tomasello (1999), who developed an entirely non-verbal change-of-location FBT. In this task, two groups of children (with mean ages of 4 and 5 years) were asked to sit in front of a large rectangular barrier. Behind the barrier were two identical boxes. One experimenter, the Hider, would place a sticker inside a box while it was behind the barrier, where the child could not see it. Seated behind the Hider was the Communicator, whose job was to point to the box containing the sticker after the barrier was raised. The child was told that her job was to point to the box containing the sticker, which she would then be able to keep.

On the crucial false-belief trials, the Communicator would watch the Hider place the sticker in one of the boxes, and then briefly leave the room. During the Communicator’s absence, the Hider would switch the locations of the two boxes. The Communicator would then return and point to the location in which the sticker was originally hidden prior to the switch. In order to pass this task, children had to recognize that the Communicator had a false belief about the location of the sticker, and then ignore the Communicator’s pointing gesture. Children’s performance on this task was then compared to their performance on a standard FBT.

Call and Tomasello found that children’s performance on this non-verbal task was highly similar to their performance on verbal versions of the FBT, with performance on both types of task improving with age. In other words, the younger children tended to fail both tasks, while the older children tended to pass them. These results are problematic for the pragmatic development account for two reasons. First, the task was non-verbal and made use of simple pointing gestures, which children understand well by this age (Behne et al. 2005; Liszkowski et al. 2007). This seems to eliminate the possibility that children were responding to an indirect speech act. Second, children of both age groups clearly seemed to understand that their goal in this task was to collect the stickers for themselves. This makes it unlikely that children in the crucial false-belief trial thought that they were being invited to help the ignorant Communicator. Thus, both the pragmatic development explanation of children’s misunderstanding in the standard FBT and the explanation for their systematic errors in that task are off the table.

However, there is reason to believe that the correlation between children’s performance on this non-verbal task and the standard FBT is illusory. While Call and Tomasello’s design did reduce the pragmatic demands of the FBT, it also increased its inferential complexity and executive demands. In an ordinary change-of-location FBT, children must simply recall where the agent thinks the object is in order to give a correct response. In this task, in contrast, the child must 1) track the visible displacement of the sticker in the original hiding phase; 2) track the invisible displacement of the sticker during the switching of the boxes; 3) remember which parts of the task the Communicator was present for; and 4) ignore the Communicator’s advice when she returns, on the basis of what the child remembers of 1)-3). Already, this exceeds the processing demands of an ordinary FBT.

In addition to this added executive burden, the child must be capable of reasoning with non-specific, quantified belief-attributions. Because the child does not know where the sticker is, she must not represent, “The Communicator knows the sticker is in Box A,” but rather, “The Communicator knows which box the sticker is in.” Then, when she sees the boxes switched, she must infer, “Whichever location the Communicator thought the sticker was in, it is now in the other one.” Then, when the Communicator points to one box, the child must select the other. This kind of reasoning is vastly more complex than what is required in the FBT. Thus, it appears that the above task design has simply traded pragmatic challenges for greater inferential complexity and executive demands.

Call and Tomasello do attempt to head off this line of criticism by administering a series of control tasks, testing children’s ability to track the visible and invisible displacements of the sticker, as well as the child’s ability to ignore the Communicator. The authors thus argue that the children were capable of surmounting all of the executive and inferential demands of the task. Crucially, however, the study only controlled for each of these factors independently of one another, whereas the real challenge of the task would have been coping with all of these demands simultaneously. The study thus failed to adequately control for the executive demands and inferential complexity of the task.

6 Predictions

One could empirically distinguish between the pragmatic development account and the standard nativist “processing load” account in several ways. The processing load account points to the information-processing demands of the FBT itself to explain younger children’s failures, and predicts that if we reduce these demands, children’s performance should improve. In the past, this approach has been successful, and several authors have developed simplified, less executively demanding versions of the FBT that children are able to pass before their fourth birthdays (Rubio-Fernández and Geurts 2013; Rubio-Fernández and Geurts 2015). The pragmatic development account, in contrast, points to the conversational context of the FBT as a crucial determiner of performance. It predicts that if children are better able to grasp the fact that in this context they are supposed to attend to belief facts, then their performance on a standard FBT should improve – even if no change is made to the task’s immediate information-processing demands. Thus, where the processing load account would predict that contextual manipulations that leave the basic executive demands of the FBT unchanged should have no effect on performance, the pragmatic development account predicts that these manipulations should lead to improvements in performance.

This contextual manipulation could be achieved using a between-subjects design with children aged 3.5 to 4 years (i.e. on the cusp of passing the FBT). Both the experimental group and the control group would complete a standard, change-of-location FBT as the dependent measure. But prior to completing the task, the experimental group would engage in a pre-test familiarization activity that would serve to heighten the saliency of beliefs. One might achieve this effect by having children answer a number of questions that draw a contrast between knowledge and belief. One way to do this would be to base the activity on Wellman and Peterson’s “thought bubble” intervention, originally used with deaf children (Wellman and Peterson 2013). Unlike prior research that used multiple training sessions to improve children’s performance on the FBT, the pragmatic account predicts that these manipulations should have an immediate effect.

Alternatively, one could use a contextual manipulation to diminish the saliency of alternative interpretations of the FBT query – for instance, by making the prospect of helping the agent seem undesirable (Helming et al. 2014). This could be achieved by borrowing from the design of Vaish et al. (2010). These authors found that three-year-old children would selectively avoid helping an agent if they previously saw that agent intentionally harm someone else. If children’s incorrect response in change-of-location FBTs is in fact a helping response, then seeing an agent commit an intentionally harmful act prior to completing the FBT should also diminish their inclination to give this response.

One could also use the same type of contextual manipulation to cause children who we would expect to pass the FBT (e.g. children aged 4.5 to 5) to fail the task. This could be achieved by making the needs of the agent especially salient and/or by emphasizing the prosocial nature of the agent, causing the motivation to help to override the still-fragile doxastic response. This manipulation could be implemented by showing children an agent with a false belief who has just been the victim of an unfair resource allocation. Since we know that children at this age are highly motivated to rectify such inequalities (Li et al. 2014), we might expect them to use the context of the FBT as a chance to rectify the inequality, and thus give the helpful but incorrect response.

Crucially, all of these designs would include a standard FBT, free of simplifying manipulations. However, one might also build on the last suggestion by taking a simplified FBT that younger children normally pass (e.g. Rubio-Fernández and Geurts 2013, 2015), and using a contextual manipulation to make it harder. The pragmatic development account predicts that even if the processing demands of the task have been sufficiently reduced, a contextual manipulation emphasizing the helping motivation should cause these children to fail. Something like this might explain one of the results of Rubio-Fernández and Geurts (2015). In their Experiment 2, they administered a simplified version of the FBT – the “Duplo task” – that three-year-olds have been known to pass with little difficulty, but with one modification: they made sure to mention the desired object [Footnote 10]. This caused three-year-olds overwhelmingly to fail the task. The authors attribute children’s failure in this manipulation to the fact that mentioning the object made it more salient to the child, disrupting the child’s perspective-taking in the process. But the pragmatic development account offers a different interpretation: drawing the child’s attention to the bananas qua object of desire triggered their motivation to help, causing them to lead the character to the bananas’ correct location instead of acting out the appropriate behavior.

An alternative to the contextual manipulation strategy would be to modify the immediate FBT context in a way that would increase the salience of belief facts. This kind of approach would probably make it more difficult to tease apart pragmatic factors from executive demands, since it could be claimed that any resulting improvement in performance might be due to a reduction in processing demands, rather than anything to do with the salience of belief facts. But this sort of obstacle could be overcome by following the example of Lewis et al. (2012). Recall that Lewis and colleagues were able to improve children’s performance on a truth-value judgment task for belief reports by adding a second seeker with contrasting beliefs (see Section 3.2). This manipulation would have also increased the processing demands of the task – reading two minds is harder than reading one – but this did not seem to hurt children’s performance. One could implement a similar manipulation in a more traditional FBT design. In such a task, children would be presented with a hide-and-seek scenario; in the 1-seeker condition, children would see just one seeker with a false belief about the location of the hider; in the 2-seeker condition, children would see two seekers, one with a false belief about the hider’s location, and the other with a true belief. In both conditions, the test question would be, “Where will [the seeker with the false belief] look for [the hider]?” The pragmatic development account would predict that having two seekers as opposed to one should improve performance, whereas the processing load account would predict that it would lead to a decrease in performance.

7 Conclusion

In this paper, I’ve illustrated how belief discourse poses substantial challenges for young children that have nothing to do with whether or not they possess the concept of belief. This highlights a new way of interpreting the relationship between children’s early social experiences and their performance on FBTs: even children who are able to represent false beliefs must still learn from their social environment how and when belief facts are implicated in conversation before they are able to pass the FBT. Thus, if a child’s social environment is enriched or impoverished with respect to belief discourse, this will affect when she passes the FBT. The pragmatic development account thus provides the theory of mind nativist with a framework for accommodating a wide range of variation in FBT performance brought on by differences in individuals’ social experiences, as well as a set of empirical predictions for testing and extending that framework and enriching our understanding of theory of mind development.