Here’s the latest twist on an old riddle: How many mindreading systems does it take to engage in basic forms of social cognition? Answers vary. Some say “Maybe two” (Apperly and Butterfill 2009; Butterfill and Apperly 2013). Others say “One, and only one—definitely” (Carruthers 2013, 2015). A better answer is, I contend, “None at all”. This paper shows how the last answer might be true and attempts to make accepting it attractive.

This paper argues that mind-reading hypotheses (MRHs), of any kind, neither best characterize nor best explain basic social cognition. Attention is given to the two most popular MRHs: one-ToM and two-ToM theories. These MRHs face competition in the form of complementary behaviour reading hypotheses (CBRHs). Following Buckner (2014), it is argued that the best strategy for putting complementary behaviour reading hypotheses out of play is to appeal to theoretical considerations about the psychosemantics of basic acts of social cognition. In particular, need-based accounts that satisfy a teleological criterion can plausibly put CBRHs out of play. So all looks good for MRHs. But there is a twist. For against this backdrop a new competitor for MRHs is revealed: mind minding hypotheses. Mind minding hypotheses assume that no mental states or contents are attributed during basic acts of social cognition, and yet they are capable of explaining all the known facts and they also satisfy the teleological criterion. In conclusion, some objections concerning the theoretical tenability of MMHs are addressed and prospects for further research are canvassed.

In making this case for mind minding hypotheses (MMHs) the first step is to agree terms and targets. Section 1 achieves this by clarifying the core theoretical commitments of MRHs as exemplified by the two most popular MRHs about basic social cognition: one-ToM and two-ToM theories. Section 2 examines strategies proponents of MRHs might employ in attempting to put competing behaviour reading hypotheses out of play. Analysis reveals that better experimental designs cannot deal with the so-called logical problem and this gives succor to behaviour reading hypotheses. Following Buckner (2014), it is argued that putting such hypotheses out of play can be best achieved by appeal to need-based psychosemantic considerations about the likely targets of acts of social cognition. Section 3 introduces a novel kind of non-mindreading proposal: mind minding hypotheses. Although MMHs satisfy, by their inherent design, the teleological criterion described in Sect. 2, they are non-representational through and through and, as such, assume that acts of basic social cognition do not involve the attribution of mental state concepts and contents. Section 4 rounds up the discussion by responding to some objections to an enactivist variant of MMH with the aim of demonstrating its theoretical tenability and motivating future research and development of MMHs.

1 One mindreading system or two?

Theory of mind (hereafter ToM) abilities, of some sort or other, are widely assumed to be the beating heart of interesting forms of social cognition (see Baron-Cohen et al. 2013 for a 30-year retrospective). This thought is usually coupled with a familiar explanatory hypothesis about how ToM abilities are carried off—namely, that ToM abilities are underwritten by capacities to mentalize or mind read. On the standard interpretation, mind reading minimally requires:

  1. Representing and attributing mental state concepts (minimally, belief and desire or that of other mental states that play similar sorts of cognitive and conative roles);

  2. Representing and attributing the contents of such attitudes;

  3. Appreciating (by some means) how such attitudes interrelate (this may, e.g. be by the possession and use of a theory or by more simulative means) (see Hutto et al. 2011, p. 376).

Conditions 1–3 can be weakened in various ways, allowing for the possibility of mindreaders with less than fully-fledged mindreading capacities. It is possible to represent and attribute mental state concepts and contents other than those of the canonical propositional attitudes and to still qualify as a mindreader. Nevertheless conditions 1–3 must be met in some form or other if the notion of mindreading is not to lose its theoretical substance.

It is easy to find proponents of MRHs explicitly subscribing to at least conditions 1–2 (see Spaulding 2011, p. 143; Butterfill and Apperly 2013; Buckner 2014, p. 566 for recent examples). This lends confidence that the conditions set out above are neither arbitrary nor overly restrictive (pace Overgaard and Michael 2015, p. 178, footnote 2). Even so, clarification concerning two further points will set the stage properly for the argument to come.

First, assessing any MRH requires clarifying exactly how that MRH understands the nature of the hypothesized mental concepts and contents that are allegedly attributed during these acts. This is important because there is scope for different MRHs to understand the nature of attributed concepts and contents and what MR attributions involve quite differently. Very demanding MRHs have sought, for example, to explain the recent findings concerning the basic social cognitive acts of infants by proposing, “the child must imagine a thought bubble in [the experimenter’s] head that has the actual cognitive content driving his behaviour” (Buttelmann et al. 2009, p. 341, emphasis added).

Taken at its word, this MRH appears to be asking too much. It is seemingly unnecessary that mind readers would need to attribute the actual content of another’s thoughts in order to keep track of what they are thinking about. Being sensitive to the philosophical distinction between intensional (with-an-s) content and extensional content helps us to see what is at issue here. Content construed as intensional (with-an-s) is understood to be the guise, manner or mode of presentation under which some particular object or state of affairs is grasped by a thinker. Bearing this in mind, we can imagine that the experimenter in the case described is thinking about a particular ball hidden in a box while, at the same time, allowing that she is thinking of the ball as or under the description of ‘The soft toy that we’ve used a dozen times in these trials’. We can also imagine that the experimenter has an equally idiosyncratic way of thinking of the box. In this light, any MRH that insists that mind readers need to attribute anything like actual, intensional (with-an-s) contents in order to keep track of what another is thinking about seems implausibly demanding: Even though it is possible to think about ‘the ball in the box’ in many ways, it seems enough for a MRH to assume mind readers have some way of attributing contentful attitudes such that those attributions pick out, in extension, what the other is thinking about. With respect to the case just imagined, it would be enough to explain the social cognitive feats of the child by assuming that it has some means of ascribing a co-extensive content to the experimenter—e.g. that ‘she thinks the ball is in the box’—even if such an ascription does not capture the exact, actual content of the experimenter’s thought. 
Notably, the most promising MRHs about basic social cognition (those that are the focus of this paper) make very modest assumptions about what is attributed during acts of basic social cognition and also what such attribution involves. For example, they do not assume that in the simplest cases mind readers are attributing intensional (with-an-s) contents or that they are consciously, or otherwise explicitly, aware of the contents being attributed.

The second issue concerns the target ambitions of MRHs and their rivals. Standardly MRHs are construed as inferences to the best explanation: they offer to explain how it is that mind readers manage to keep track of other minds in acts of basic social cognition. However, in offering such explanations MRHs typically assume a number of important things about what is being explained: specifically, that the basic acts of social cognition in question are best characterized as ones in which mental states and contents are in fact being attributed. Yet it is possible to pull explanans and explanandum apart. It might be that attributing contentful attitudes is not the best way to characterize what is going on in basic social cognition even if it turns out that such attributions feature in the best explanation of such acts, and vice versa. The strongest challenge to MRHs holds that MRHs are needed neither to best describe nor best explain basic acts of social cognition. That is precisely the challenge that MMHs pose by offering an alternative on both fronts.

While the basic ToM assumption about what lies at the heart of much social cognition still holds sway and MRHs are still very popular explanations of such abilities, there is a growing lack of consensus about how many ToMs humans might be operating with and how we should understand the exact properties of such ToMs.

Advocates of the one-ToM hypothesis (one-ToMH) propose that human beings operate with one and only one MR system—where this is understood as a cognitive architecture of particular design and a dedicated, domain-specific function (Fodor 1995; Segal 1996; Leslie et al. 2004; Baillargeon et al. 2010). Carruthers (2015) advances the most developed contemporary version of one-ToMH. It holds that one, and only one, MR system operates throughout the whole of human development—from early infancy to adulthood. Crucially, as Carruthers (2013) emphasises, “while the operations of this system probably become more streamlined and efficient with age, its representational capacities do not alter in any fundamental way” (p. 142).

One-ToMH assumes that we initially operate with primitive mentalistic concepts—e.g. THINKS, LIKES—and principles about how these concepts interact. By applying such ToM principles the child, so the story goes, is able to make meta-representational attributions to others, such as, borrowing directly from Carruthers (2015), THE AGENT THINKS THE TOY IS IN THE BOX. Importantly, even though this complete ascription can be true or false, and even though the embedded expression inside the THINKS-operator will be true or false, as infants we need not have any explicit understanding of the concepts of truth and falsity in order to make such attributions. Explicit mastery of the concept of truth can, and apparently does, come later (see also Southgate 2013, p. 14). This works if it can be safely assumed that “since infants have propositional thoughts from the outset themselves ... they can take whatever proposition they have used to conceptualize the situation seen by the target agent and embed that proposition into the scope of a ‘thinks that’ operator” (Carruthers 2013, p. 162). In this way infants have a means of ascribing propositional contents even if they are not capable of attributing the exact actual intensional (with-an-s) contents of another’s thoughts.

An important feature of one-ToMH is that basic conceptual primitives act as atomistic placeholders—they are stable linchpins, mental items that denote, refer or pick out items about which we can come to weave a wider set of conceptually-grounded inferences. These conceptual primitives can therefore be built upon through a process of on-going enrichment. On this model, when this happens no new concepts are introduced—instead we acquire expanded conceptions, wider understandings and new inferential connections tethered to our old concepts (see Hutto 2005, for fuller discussion). Hence there is—and can only be—one core MR theory that develops from childhood to adulthood, one theory that, as Apperly (2013) puts it, ‘grows up’ through the introduction of new principles and the expansion of inferential linkages.

This explains why our incrementally shifting conceptual-based ToM abilities are brought to bear in complex ways in performance. It explains why individuals exhibit varying degrees of success at various points and stages in their development as they respond to particular task demands that may tax their supporting or as yet poorly developed capacities for executive function, working memory and linguistic facility.

Well-known experiments show that children as young as 25 months (Southgate et al. 2007) and even 15 months (Onishi and Baillargeon 2005) can pass language-free versions of false-belief tasks. Taken at face value the infant data suggests that very young children must have some command of the concept of belief in place very early on. This is so even though much older children lack the capacity to pass standard, verbally based, false belief tasks—tasks that were previously taken to be the litmus test for possessing the concept of belief.

The one-ToMH easily accommodates the full range of social cognition data, while making short work of this so-called developmental ‘paradox’ (Southgate 2013, p. 10, see also De Bruin and Newen 2014). The situation no longer appears paradoxical if we accept one-ToMH: it neatly dissolves the apparent paradox by assuming, on the one hand, that conceptual capacities come in degrees (viz. that our grip on any given concept may be partial and piecemeal at any given time) while at the same time, on the other, holding that certain basic concepts, those that are gradually enriched, are nevertheless always in play from the start. So there is no need to posit any radical conceptual changes in the form of introducing new conceptual primitives in order to deal with the developmental paradox.

There is another ToMish way to explain all this. The two-ToMH proposes that humans, at least, may be operating with not one, but two functionally distinct mindreading systems. One of the main motivations behind two-ToMH is the apparent fact that mastery of propositional attitude concepts and their attribution requires a great deal of cognitive sophistication. Inspired by Davidson (1984, 1990), the authors of this account take attributing propositional attitudes to be an interpretatively tricky business. This is because such attitudes have “arbitrarily nestable contents, interact with each other in uncodifiably complex ways and are individuated by their causal and normative roles in explaining thoughts and actions” (Butterfill and Apperly 2013, p. 610).

On this basis, Butterfill and Apperly (2013) assume that “some feature, or combination of features, of the propositional attitudes makes full-blown theory of mind cognition demanding” (p. 610). This assumption is supported by a body of evidence about adult performance on ToM-related tasks (Apperly 2011, 2013). The two-ToMH posits that in addition to the mindreading system that underwrites adult capacities to ascribe propositional attitudes to humans (call it maximum ToM, or MaxToM), we may also make use of a minimal ToM (or MinToM) for fast and efficient but limited ToM tasks.

MinToM is distinctively different from MaxToM in only making use of the concepts ENCOUNTERING and REGISTERING. These are both non-representational and relational concepts. ENCOUNTERING denotes a relation between an agent and an object while REGISTERING denotes a relation between an individual, an object and a location (Butterfill and Apperly 2013, pp. 614–617). MinToM makes “use of objects and their relations to agents, rather than representations of objects, to predict others’ behaviours” (Butterfill and Apperly 2013, p. 622). There is a key, formal difference therefore between the concepts employed by MinToM and those employed in MaxToM: the former are extensional; the latter are intensional (with-an-s). As a result Apperly and Butterfill (2009) conclude that whatever MinToM users “represent, it is not a state with propositional content” (Apperly and Butterfill 2009, p. 957). Still, they hold that MinToM users have the “ability to ascribe simple forms of mental content, at least in the form of belief-like states” (p. 965). This is why positing MinToM still qualifies as a kind of MRH.

Unlike one-ToMH, two-ToMH assumes that MinToM never grows up and that radically new concepts and principles come into play with the advent of MaxToM. But this does not mean that MinToM fades away. Two-ToMH postulates that in normally developing human adults MinToM and MaxToM continue to exist intact and operate alongside one another (Butterfill and Apperly 2013, p. 629).

The two-ToMH has adequate means to deal with the developmental ‘paradox’. On the assumption that MinToM comes into play early on it is hypothesized that human infants use it, and it alone, when attributing mental states in cases of basic social cognition. Infants using MinToM make do with less. To the extent that very young children only employ MinToM to complete social cognition tasks they do so without representing and attributing propositional attitudes as such (see footnote 1). That capacity doesn’t kick in until much later, only after older children begin to pass explicit, verbally based false belief tests, a first sign of the emergence of MaxToM.

Apart from its capacity to deal with the developmental paradox, a second motivation for believing the two-ToMH stems from the need to explain a wealth of evidence about human adult performances that suggests ToM responding is sometimes fast and automatic and at other times slow and effortful (Butterfill and Apperly 2013, pp. 609–612). Moreover, it appears that the operation of MaxToM abilities of adult humans is sometimes affected by a more automatic tendency to engage in basic ToM tasks in certain experimental set-ups (see Apperly 2013, for a full and up-to-date review of the findings). Two-ToMH also has the advantage of being able to explain why basic ToM abilities are widespread, occurring not only in human adults under cognitive load but also in infants and in non-human animals, whereas, by contrast, MaxToM cognition is comparatively rare (Butterfill and Apperly 2013, p. 629; Hutto 2004).

2 Doubting ToMs

ToM interpretations and MR explanations of the interesting forms of social cognition are not the only live theoretical possibilities, however intuitively attractive they may be to some. There are a number of alternative interpretations and explanations of the same findings. In reviewing the full range of theoretical possibilities it is important to beware of conferring unfair advantage on any given one because of the way an explanandum is described and labeled. Doing so risks systematically confounding descriptions of what is being done with substantial accounts of how agents manage to do what they do, leading to evidence for the former being mistakenly treated as if it were direct evidence of the latter.

Taking some recent statements by Carruthers as salient examples, this danger becomes clear and present when it is reported that new findings indicate that infants, for instance, are capable of “representing and reasoning about false beliefs” (Carruthers 2015, p. 1). This way of setting things out can lead to overblown claims that there is “simply no other way of explaining our competence in this domain” (Carruthers 2009a, p. 167).

Elsewhere and more guardedly, Carruthers (2013) claims that non-MR explanations are implausible rivals, allegedly on purely empirical grounds. In light of the sum total of evidence he deems it unlikely that infants are engaged either in behaviour reading, thus deploying a set of behaviour-rules, or representing thoughts in a non-MR manner (p. 167, see also Fletcher and Carruthers 2013, for a developed argument along these lines). Such approaches, he claims, only stay in the ring by ad hoc defences and by piggybacking on more empirically fecund MR work. On his view, while such alternatives offer logical challenges to MRHs they generate no flourishing, interesting empirical results of their own and should not be taken seriously. While there may be some truth to the claim that some of these hypotheses are not independently empirically productive, it is far from obvious that alternatives have been or can be put to straight empirical test; it is even less obvious that they have already been experimentally refuted.

Greater caution is surely in order: at most, empirical findings provide, much more modestly and neutrally, solid evidence that in cases of basic social cognition at least the person’s “own action selections vary significantly with respectively true/false belief conditions of observed others” (Brincker 2014, p. 1). Undeniably, facts about the truth or falsity of others’ beliefs affect, for example, infantile expectations about the locations at which others will look for objects and these facts modulate infant actions such as helping and engaging with others cooperatively in on-line tasks (Onishi and Baillargeon 2005; Buttelmann et al. 2009; Knudsen and Liszkowski 2012; Southgate et al. 2010). Yet, as with the ToM literature on animals, all this evidence securely shows is that there is “tracking of correlations” (Buckner 2014, p. 567).

For ToMists, the gap between evidence and theory is what generates the “insurmountable ‘logical problem’” (Buckner 2014, p. 567). The logical problem arises because there is always theoretical space for non-ToM accounts pitched in deflationary non-MR terms of the same social cognitive capacities. Non-ToM accounts propose that so-called ToM abilities can be understood and explained equally well as capacities to represent and attribute behaviours (not mental states) and/or as capacities to target mental states but understand the way such mental states are targeted and responded to in non-ToM ways (see Heyes and Frith 2014, pp. 1243091–1243093; Hutto et al. 2011; Fenici 2014 for an overview of alternatives).

Not everyone is pessimistic about overcoming the logical problem by purely empirical means (yet see Thompson 2014, for an argument that MinToM is not distinguished from behaviour reading). Optimists believe that by adapting and developing new paradigms, using multiple tests and utilizing special experimental designs it should be possible to rule out non-ToM alternatives (see Butterfill and Apperly 2013, p. 628 for a discussion).

Consider Southgate’s (2013) attempt to rule out complementary behavioural reading hypotheses (CBRHs) (see footnote 2). She acknowledges that “simpler, non-mentalistic interpretations in terms of behaviour-reading are always possible ... one cannot know for sure whether infants are genuinely using mental state concepts such as ‘knows’ or ‘believes’ when generating predictions about what the other will do” (Southgate 2013, p. 7). She reviews and rejects various existing experimental strategies for dealing with MRH challengers (Southgate 2013, p. 8). In the end she proposes one experimental design that, by her lights, succeeds in ruling out all CBRHs. The set-up was that 12- to 18-month-olds were trained in the use of two different kinds of blindfolds: one group used ordinary opaque blindfolds (O-blindfolds) and the other used transparent, trick blindfolds (T-blindfolds). The assumption was that MRHs but not CBRHs would predict that infants would draw on their own perceptual experience, in this case their first-person experiences of seeing or not seeing through the particular kind of blindfold they had used, in generating their predictions about how another would respond in similar circumstances. The study revealed that children’s gaze-following behaviour did, in fact, significantly depend upon the type of blindfold to which they had become accustomed. Those who had T-blindfolds followed the gaze of others significantly more often than those who had only experienced O-blindfolds.

Does this settle the matter? Surely a crafty CBRH could be cooked up to accommodate these findings. Offering an in-depth analysis of Lurz’s (2009, 2011) attempts to deal with the logical problem, Buckner convincingly argues that it is always possible to devise more challenging CBRHs to defeat the particular MRHs that Lurz favours (see footnote 3). The overarching rule of this theoretical arms race is that “any experiment for which a CBRH ... can be constructed that makes the same predictions as (and is at least as plausible as) its MRH fails to overcome the logical problem” (Buckner 2014, p. 568). All that is required to create shadowing CBRHs is to ensure that none of the explanatory work need get done by attributing “mental state concepts or projections of perceptual experience” (Buckner 2014, p. 572).

With respect to Southgate’s (2013) design a CBRH will challenge the claim that the relevant predictions are in fact grounded on projections of the children’s first-person experiences as opposed to the self-observation of their own behaviours associated with the use of O- or T-blindfolds. This possibility can only be avoided if it is assumed not just that (A) we have transparent, non-behaviourally mediated access to our first-person experiences but also, more strongly, that (B) there are no observable behavioural consequences associated with such experiences (see Carruthers 2011 for an extended argument against the first possibility). Unless A and B are assumed, it is possible that the attribution of some combination of behavioural factors and not any attribution of mental states could be doing the predictive work in the Southgate cases. Sceptics will need more persuading.

The way to construct an effective CBRH is to focus on contextual factors and different behavioural consequences than those the proponent of the targeted MRH considers and protects against experimentally. One expands the set of observational evidence that the imagined behaviour reader might be using as evidence to fuel its predictions (Buckner 2014, p. 572). There seems to be no end to the resources that can be called upon for this purpose: “there is no in-principle limit to the complexity of the predictions which behavioural strategies might support” (Butterfill and Apperly 2013, p. 626).

The moral seems clear: There is no good reason to hope that the logical problem can be dealt with by coming up with ever more ingenious experimental designs. This will be so just in case, as Buckner claims, the root issue is “semantic rather than methodological” (Buckner 2014, p. 568). The logical problem is intractable precisely because it is an artifact of the ToM assumption that some or other behaviour must always be available to serve as the evidential basis for the attribution of any given mental state. As long as behaviour is a necessary evidential basis for ToM attributions it will always be possible that the relevant acts of basic social cognition are driven by direct responses to the behavioural evidence without positing anything that goes beyond it.

What is a fan of MRHs to do? Simply ignoring the logical problem doesn’t seem to be a legitimate option. For unless it is dealt with there can be no absolute, watertight confidence that MRHs are the best explanations of relevant acts of social cognition. Even the strong conviction that it is mental states, and not just behaviours, that are being targeted in basic social cognition is not justified as long as the logical problem is in force.

No knockdown deductive argument has been provided to show that no possible future design will succeed. It is just a good inductive bet, given what we know, that such attempts will fail. Still, one way forward is to keep on trying to develop more and more foolproof designs, perhaps with multiple measures, to rule out any possible CBRHs.

Another, better way of dealing with all CBRHs, right here and right now, is to take seriously Buckner’s (2014) recommendation that we should give greater attention to the psychosemantics of social cognition. This proposal is inspired by his analysis “that the appearance of ‘logical problems’ in any experimental design is a function of one’s underlying assumptions about the nature of representation” (Buckner 2014, p. 579).

Whether or not the connection is as strong as Buckner claims, a sufficient way to avoid the logical problem in the social cognition debate, in a wholesale manner, is by adhering to a particular version of psychosemantics. How so? Buckner (2014) insightfully compares the logical problem in the social cognition debate with the distality problem in the psychosemantics literature. The distality problem arises for naturalized theories of content of a purely causal or informational bent. This is because such theories, at least in their basic formulations, lack a non-arbitrary, principled means of designating where referents or contents are to be found within complex causal or information chains. Rendered as a question, the distality problem asks: What, in such-and-such a theory, justifies the assumption that mental representations stand for distal, environmental items as opposed to retinal stimuli or neural excitations?

In overcoming the distality problem, “the most popular ... devices appeal in one way or another to the organism’s needs” (Buckner 2014, p. 586). Some or other version of teleosemantics does the trick because of the central emphasis such theories of content place on teleology and the role representations play in service of answering cognizers’ needs (Buckner 2014, p. 578). Focusing on what serves organismic needs, for example, reveals that a frog’s representations should be about or should target flies and not merely the retinal stimuli that are the historically normal means of detecting flies.

For those willing to adopt some or other teleosemantic theory of content, there appears to be a way of dealing with the logical problem in the social cognition debate after all. Just as with the frog and fly case, any attention given to behaviours in basic acts of social cognition will only be a means to the end of targeting mental states. Or rather, this will be so, just in case the targeting of mental states and not associated behaviours answers the needs of social cognizers. This is plausible enough. With reference to the case of chimpanzees, for example, Buckner proposes that “only seeing stands at the right place in the dominant’s motivational psychology to control the sorts of behaviours that the dominant uses line-of-gaze to track, like pursuit of food items, awareness of a subordinate’s sexual advances on females, confrontational eye-gaze, and so on” (Buckner 2014, p. 578).

Against this backdrop, MRHs are a safe bet and should be preferred to CBRHs across the board. This will be the case so long as the following sort of counterfactual holds good and generalizes: “had line-of-gaze not been a reliable indicator for seeing during some critical period—if, for example, its conspecifics had been blind, and ‘saw’ using their hands—then line-of-gaze would not possess its current epistemic significance, such as being recruited as evidence to determine which food items are safe to eat” (Buckner 2014, p. 578).

By insisting on a teleological criterion as a test of adequacy for theories of basic social cognition Buckner’s (2014) strategy provides a decisive way of putting CBRHs out of play without the need for experimental innovation. Placing appropriate emphasis on the needs of social cognizers and adopting a variety of teleosemantics opens the way to “evaluate the prospects of a clear, ecumenical, and empirically productive psychosemantics of mindreading” (Buckner 2014, p. 579). At the very least, going teleosemantical is an adequate way to secure the idea that it is mental states and not mere behaviours that are being targeted in the acts of basic social cognition. A needs-based analysis gives us reason to think that behaviours associated with mental states are only tracked because they allow mental states to be tracked.

At first glance this appears to be excellent news for fans of MRHs: it looks like there is reason to believe some or other MRH must be true, even if non-ToM alternatives cannot be ruled out experimentally. The case for thinking that some or other MRH must be true can be made if we help ourselves to the plausible assumption that the best theory of psychosemantics needs to satisfy the teleological criterion.

This is a giant step in the right direction. But it leaves open the possibility that mental states are being targeted in non-MR ways in the relevant acts of basic social cognition.

3 Minding minds without reading them

There is a new kind of alternative to MRHs to consider if we go Buckner’s way. MRHs face a new type of challenger—MMHs. MMHs question the standard ToM characterizations of the relevant acts of basic social cognition and offer a different, non-MR explanation of how they are carried off. Hutto (2011) advanced an MMH according to which basic acts of social cognition involve being responsive to the intentional attitudes of others in ways that do not involve representing or attributing any kind of mental state concepts or contents.

As we saw, MinToM puts the non-representational and extensional notions of ENCOUNTERING and REGISTERING center stage in its account. In a similar vein MMH trades on a distinction between contentless but world-directed intentional attitudes and contentful, intensional propositional attitudes (Hutto 2008, 2011, 2013). Pure intentional attitudes exhibit a kind of basic intentionality in that they are directed at objects and states of affairs. Despite this, pure intentional attitudes differ from propositional attitudes, because they exhibit a kind of basic intentionality that is only target-focused but not content-involving. Basic forms of intentionality take the form of involvement relations that are not best understood in terms of reference and truth (Hutto 2013; Hutto and Myin 2013).

With the distinction between intentional and propositional attitudes in play it becomes possible to conceive of agents harbouring intentional attitudes towards the intentional attitudes of others. This might involve, say, attending to another’s attending—of targeting what the other is directed at in specific circumstances and how they are so directed (the action-relevant properties of a situation afforded to various agents). All of this can be achieved without conceptually or contentfully representing or attributing any states of mind whatsoever. Even so, it must be stressed, in all such cases mind minders would still be minding minds. In line with the teleological criterion it is not behaviours that are being targeted. MMHs must not be confused with CBRHs.

Also, like MinToM, the MMH comfortably allows that it is still possible to form expectations about a target’s contentful, fully-fledged propositional attitudes using only lesser MM means. The catch is that when MMing this is only ever done incidentally: no direct propositional attitude attributions are made. This is possible as long as tracking capacities need not depend on directly specifying what is being tracked as such: something can be reliably tracked by targeting properties or features that are co-extensive with whatever is tracked (for an extended discussion in relation to the infant data see Fenici 2013, 2014). This proposal marches in step with Butterfill and Apperly’s (2013) observation that it is perfectly possible to keep track of someone’s propositional attitudes without having a concept of a propositional attitude, just as it is possible to track toxicity, say, by noxious smell while lacking any concept of toxins per se (Butterfill and Apperly 2013, p. 607; see also Hutto 2008, Chap. 3).

It is also conceivable that there may be cases of quite complex MMing that do not involve any party harbouring contentful states of mind at all (for an extensive discussion see Zawidzki 2013). This will come to pass when the following two conditions are satisfied: (1) the minds minded are not content-involving; and (2) the ways and means by which the minds are minded do not involve representing and attributing contentful states of mind. MMers can achieve this by sensitively and appropriately responding to each other’s action-relevant, world-directed intentional attitudes without making any attributions and in wholly non-contentful ways and by wholly non-contentful means.

MMHs differ significantly from MinToM because, despite operating with non-representational notions of what is tracked, MinToM remains fully representationalist in the way it understands how what is tracked is tracked. It retains the idea that feats of social cognition entail a capacity to represent non-propositional, intentional states of mind. This is why Apperly and Butterfill (2009) and Butterfill and Apperly (2013) speak of ‘representing’ registrations.

Why assume registrations must be represented? Perhaps these theorists are adhering to the Fodorian fiat that “Tracking requires a way to represent the trackee” (Fodor 2003, p. 20). Perhaps they assume that tracking some X is necessarily a concept-involving capacity of some kind, and that tracking involves representing-as. After all, borrowing from Fodor again, we are told that, “To represent (e.g. mentally) Mr. James as a cat is to represent him falling under the concept CAT” (Fodor 2007, p. 105).

That the capacity to track items is best explained by structured conceptual representations is a familiar working assumption embraced by adherents of classical cognitivism. Indeed, this assumption apparently motivates and justifies belief in the atomistic conceptual primitives—concepts that one-ToMH holds are in play throughout all ToMish social cognition. Carruthers (2011) makes his commitment to this working assumption indelibly clear: “many mental states are realized discretely in the brain and possess causally relevant component structure ... they possess a discrete existence and are structured out of component concepts” (Carruthers 2011, p. xiv; see the preface of Fodor and Pylyshyn 2015 for an even more detailed list of related working assumptions).

It is not obvious why we should accept these fiats and framework assumptions. What justifies the claim that tracking always involves conceptual and contentful representations? Or that it need do so in the cases under scrutiny? We should be especially cautious in accepting these claims given general reasons to doubt that contentful representations—referential concepts or truth-conditional propositions—are necessary for, or even form part of the best explanation of, basic tracking capacities (Ramsey 2007; Chemero 2009; Hutto and Myin 2013). Certainly, cognitivist assumptions cannot simply be taken for granted in the context of evaluating competing non-representationalist—enactive, embodied—proposals that are being actively developed in the cognitive sciences. [Footnote 4] In the absence of arguments to support the above assumptions, there is no obvious basis for ruling out the possibility that relevant acts of basic social cognition might be best explained in terms of capacities to nonconceptually track attitudes in contentless ways (as per MMHs) as opposed to tracking attitudes by means of conceptually structured contentful attitudes (as MRHs would have it).

Indeed, it is no great stretch to imagine that organisms might be keeping track of things in their environments without relying on any concepts at all if one is prepared to allow that it is possible to track, say, toxicity without making use of the concept of a toxin as such. Being initially set up to be set off by certain environmental affordances in particular ways might suffice to explain how on-line forms of tracking take place by wholly non-conceptual means.

MMHs also differ from MinToM in another very important respect. MMHs assume that engaging basic social cognition does not require the possession and use of a represented set of principles. Ecologically shaped embodied know-how is all that is required for developing and improving capacities to keep track of the attitudes of others. The sorts of creatures capable of interesting forms of basic social cognition—certainly humans—develop and refine how they respond to ecologically available affordances through sustained embodied, engaged interactions. This enables greater context-sensitivity as skilled responding improves over time. A robust and fertile empirically inspired literature, drawing on the ecological dynamics framework, offers ways of understanding and explicating how dynamically embodied skills for social cognition can develop in constraint-led ways along the same lines that other structured skills, such as sporting abilities, are acquired. Importantly, this is an approach that shuns the intellectualist paradigm of classical cognitivism and its conceptual and contentful representationalist assumptions (Warren 2006; Araújo 2009; Davids and Araújo 2010a, b; Weast et al. 2011; Wilson and Golonka 2013).

Of course, principles or rules (and related concepts) can be used to describe MM activity. But this fact in no way warrants the assumption that descriptions capture the content of a set of rules operating at the subpersonal level that MMers literally call upon or which otherwise causally explain their acts of social cognition. Even ardent cognitivists are suspicious of such hyper-intellectualist claims in other domains. In particular, Burge (2010) insists: “To perceive, individuals need not represent their own states or operations, even ‘implicitly’” (p. 405). For defenders of the MMH, the same holds for social cognition.

In sum, it is surely at least conceivable that Mind Minding might best characterize and explain basic acts of social cognition, and that these acts neither qualify as nor are they best explained by any kind of Mind Reading. The MMH is surely a live possibility that is worth serious consideration.

4 Mind minding: brief replies to some objections

Larger, on-going debates in the cognitive sciences need to be settled before we can rationally choose between MMHs and MRHs. For example, the outcome of current debates about the explanatory value of positing contentful mental representations matters to which of these possible explanations we should prefer. Spaulding (2011) tries to rule out the MMH as a non-starter on precisely these grounds. She objects to deflationary radically enactive, embodied accounts of social cognition because she regards them as fundamentally incapable of explaining even very primitive forms of flexible, non-verbal cognitive activity. Radically enactive accounts flounder, in her view, precisely because they shun contentful representations. The overall logic is clear. If non-representationalist enactive explanations are inadequate as accounts of basic cognition, then MMHs that rely upon such explanations are surely not in the running when it comes to explaining basic social cognition (Spaulding 2011, p. 156).

Spaulding’s argument is, of course, double-edged. For if it turns out that simple non-verbal cognitive activity can be adequately explained without assuming the existence of contentful representations then the doors will be open to explaining basic social cognition using non-representational resources as well. It may be that the basic cognition of human infants is not, in fact, content involving. That would spell bad news for MRHs since it would mean that we cannot safely assume that infants have mental representations with propositional content that can be deployed when ascribing mental states to others. [Footnote 5]

Such are the stakes; what of the arguments? It is too much to hope to settle these big issues in the little space available. Still, we can get some useful traction on the debate by critically examining Spaulding’s (2011) headline test case of honeybees. By her lights, the best explanation of sophisticated honeybee navigational feats is that the bees “represent and protologically reason about the location of the hive” (Spaulding 2011, p. 156; see also Carruthers 2009b, p. 98). Spaulding (2011) has ‘difficulty seeing’ how anything short of a system of internal, well-structured contentful representations with denotational and truth-evaluable contents could do the relevant work.

Yet far from being an uncontentious, well-established fact that contentful representations must feature in the best explanations of the flexible nonverbal cognitive activity of bees, it is a matter of controversy whether the hypothesized mental maps of honeybees should be understood as content bearing at all. Focusing on Gallistel’s (1998) highly influential work on insect navigation, Rescorla (2013) puts his finger on the root problem. He observes that despite the fact that Gallistel talks ‘representational’ talk, the explanations he offers only lay stress on systematic structure-preserving correspondences—correspondences that hold between elements of these organisms and elements of their environments. Thus even if it is conceded that honeybee navigation is best explained in Gallistel’s way it does not follow that contentful representations are involved. As Rescorla makes clear:

Explanatory power resides solely in the “functioning isomorphism” between mind and world. There is no obvious reason why “functioning isomorphism” must have truth conditional content ... The burden of proof lies with those who claim that functioning isomorphism suffices for truth conditions (Rescorla 2013, p. 96). [Footnote 6]

Focusing on the flexible abilities of honeybees is therefore not, after all, a secure, non-question-begging way of making the case for the need to posit contentful representations, even if it is agreed that some kind of mental map account is required. The issue turns on whether the models and maps in question should be understood as having semantic properties.

In denying precisely that semantics comes into the story at this stage, the proposed non-representational account of basic cognition is even more radical than that proposed by those who argue that “there are no such things as word meanings or conceptual contents” (Fodor and Pylyshyn 2015, pp. 1, 50). For it is possible to argue that intensions have no place in a serious science of the mind while holding, nonetheless, that “the paradigm of explanation in cognitive psychology is the attribution of a creature’s actions to its ... propositional attitudes” (Fodor and Pylyshyn 2015, p. 2). For example, Rupert (2011) also rejects “a grand, roughly Fregean tradition” (p. 102). Yet he advocates a view of representational content according to which “a mental representation corresponds to, or is about, an individual, property, or kind” (Rupert 2011, p. 101). Elaborating on this idea, he says: “When I speak of representational content, I have in mind externalistic content ... Whatever else such content is, it is not intrinsic to vehicles; moreover, something’s having externalist content is not tantamount to the vehicle’s being associated with a sense or idea that then determines the vehicle’s referential or truth-conditional properties” (Rupert 2011, p. 102).

At this point the debate between representationalists and non-representationalists appears to pivot on whether we count the world as containing contents, as made up of structures with truth conditional properties and that are already conceptually articulated. One reason for thinking we cannot assume this is that:

Propositions are very different from states of affairs. In particular, propositions are true or false, while states of affairs are not the sort of things that can be either true or false. On many standard ways of thinking about propositions and states of affairs, states of affairs are the things that make propositions true or false (Bermúdez 2011, p. 404).

Closer to home with respect to basic social cognition, Lavelle (2012) argues that because MMHs rely on a non-representational enactivism they encounter special problems in explaining “the flexibility of pre-linguistic social interactions” (p. 469). Focusing on human infants, Lavelle finds it puzzling how, on such an account, an infant manages to respond appropriately without appealing to background knowledge in the form of rules or theories. She paints the following picture:

In one situation, the infant has the intentional attitude of wanting the ball for herself. In this situation, when she perceives her father’s behaviour of reaching for the ball, she responds to the natural sign that is his behaviour by grabbing the ball and moving it towards herself. But in another situation, the infant has the intentional attitude of wanting to assist her father. When she perceives his movement towards the ball, she responds by pushing the ball towards him. What this example is intended to illustrate is that the infant’s response to her father’s behaviour will vary depending on what her intentional attitude happens to be (Lavelle 2012, p. 468).

Lavelle (2012) wonders just how a radically enactive MMH approach can explain “how the infant can respond to her father’s intentional attitude in a way that concords with her own intentional attitude, whilst being unaware of what his intentional attitude is” (Lavelle 2012, p. 468, emphasis added). She takes it that this would require the infant to have some means of selecting which natural sign it would be appropriate to respond to “from the variety with which she is presented ... The puzzle deepens when one takes into account that the infant is not able to have any knowledge of the experimenter’s intentional attitudes” (Lavelle 2012, p. 465, emphasis added).

She puzzles over what the child’s criteria of choice could possibly be in such cases. Yet the real puzzle is why it should be assumed that this is a matter of intellectually based choice on the part of the child in the first place. This objection to MMHs appears to rest on an overly intellectualized characterization of what must be going on in acts of basic social cognition. The objection is only effective if it is assumed that the only way to modulate behaviour intelligently with respect to the attitudes of others is by means of knowledge or awareness of those attitudes.

The best retort for enactivist defenders of MMHs to make is not, as Lavelle (2012) considers, to “question the assumption that infants have no way of choosing which natural signs to respond to” (p. 466). It is to question whether infants need make any such choice in the first place. Enactivists assume that the cognition driving social engagements and interactions is not grounded in modes of awareness or rules based on representations of any kind. Rather, they assume that normally developing infants are already set up, non-accidentally, to target and tune into the attitudes of others. Crucially, these infantile ways of responding to the attitudes of others are not fixed and slavishly automatic; they are dynamic, spontaneous and context-sensitive.

Exactly how such interactions unfold in any particular situation will depend on a number of hard-to-predict factors including the participants’ past interactions, how these are molded by experience, and the current context. Such factors make a difference not only to the intentional attitudes of participants but also to how they respond to the intentional attitudes of others. There is no need for infants (or some subpersonal cognitive system within them) to be aware of, or have knowledge of, or represent what is targeted in the steps of these dynamical processes in order for them to respond flexibly or appropriately.

Arguably, the ToM framework imposes an overly neat, spectatorial picture of what goes on in basic social cognition. The model is derived from reflection on the sorts of principles that could adequately describe mature cases of explicitly propositional attitude attributions. Problems arise when it is further, tacitly supposed that only such a model, or something near enough, could possibly explain the mechanics of mindreading. This last assumption is surely the source of the philosophically inspired intellectual difficulties that accompany attempts to conceive of alternative explanations.

The ToM model is challenged by the fact that there is no evidence that we start our social cognitive careers with well-defined notions of self and other in play such that it would be necessary to be aware of, or to know, other minds by representing and assigning mental state concepts and contents to them—let alone being aware of or knowing ToM principles by which such ascriptions are allegedly achieved. Indeed, “recent research suggests that sometimes, we may not store our own or other’s perspectives as distinct versions of reality that we can keep separate and compare, and that our encodings can even interfere with our encodings of our own experience of an event” (Southgate 2013, p. 13 ff).

Speaking of the new infant data more generally, Brincker (2014) offers an enactivist-friendly reading: “On a theoretical level such findings complicate our notion of social perception, as we see not only the actual behaviors of others, but their potential and afforded action targets, and how these relate to our own current action in overall shared affordance space. The key is that affordances alert to potential outcomes as they relate to actual objects and agents in the spatial environment” (p. 2). [Footnote 7]

These short replies and observations only scratch the surface. The motivation for adopting MMHs in this paper has focused on issues concerning their psychosemantics. But much of the explanatory power of MMHs will come from their alternative account of psychotectonics. Non-representational, enactive accounts of cognition can step in at this point to provide an alternative to the familiar computational rules-and-representations account of how embodied know-how of the required sort is possible even without assuming the existence of a cognitive system of ToM principles (see Hutto and Myin 2013). On that score, there are, of course, many more questions to answer about MMHs and ways to refine them. [Footnote 8] Still, enough has hopefully been said to show that there is no reason not to take them seriously as candidates for understanding basic social cognition.

5 Conclusion

This paper does not provide a knockdown proof against MRHs. It does not prove once and for all that they are not the best way to characterize or best explain basic acts of social cognition. It is designed, rather, to invoke healthy scepticism about the tendency to treat MRHs as if they legitimately enjoy the status of the default view; as if they are the only game in town. The aim of this paper is to remove major conceptual barriers to the active investigation of alternatives—new kids on the block—that may fruitfully inform the future of social cognition research and reshape how we characterize and explain such activity.