Comparative metaphysics: the development of representing natural and normative regularities in humans and non-human primates

How do human children come to carve up and think of the world around them in its most general and abstract structure? And to what degree are these general forms of viewing the world shared by other animals, notably by non-human primates? These questions of what could be called comparative metaphysics are the focus of the present paper.

One influential assumption in recent comparative cognitive science has been that there is in fact a massive divide between the ways humans and other primates come to carve up the world in the following way: while humans think in essentialist ways about objects and general causal regularities, trying to carve up things deeply at their joints, non-human primates remain fundamentally limited to a shallow metaphysics of representing their surroundings in terms of superficial perceptible properties (Penn et al. 2008).

In contrast, the main claim I would like to put forward in this paper is the following: on the one hand, there are fundamental commonalities between human children and non-human primates in their natural ontology, that is, in the ways they carve up the natural world. This shared natural ontology is characterized by thinking in essentialist ways about natural kind objects and by thinking generically about natural regularities. On the other hand, however, there are fundamental differences when it comes to social ontology. Here human children, and only human children, come to think about their social environment in terms of socially constituted objects in essentialist ways and to think about general social prescriptive (rather than merely descriptive) rules.

1 Natural ontology

The natural world as we think of it consists of enduring natural kind objects which we conceive of in essentialist ways – distinguishing between their merely superficial and their deep, essential properties that make them what they are and determine their criteria of persistence. And the natural world as we think of it is governed by descriptive regularities which we conceive of in generic terms – describing what is in general the case.

In the following I will first describe recent research documenting commonalities between the natural ontology found early in human development and in non-human primates in two respects, regarding essentialist thinking about objects, and regarding generic thinking about statistical regularities.

1.1 Thinking about objects

The most fundamental form of thinking about an objective world, a world out there existing independently of us and our perceptions of it, is thinking about objects. As has long been noted, the most basic form of thinking about objects is tracking them through space and time even when they are currently not perceived, a capacity known as “object permanence” (Piaget 1952). And as has long been known in developmental and comparative psychology, this basic form of spatio-temporal tracking of objects develops very early in human ontogeny and is widespread in the animal kingdom (Baillargeon et al. 1985; Tomasello and Call 1997). In a typical test, for example, subjects would see two objects simultaneously moving from different directions into a box and would then be probed regarding their numerical expectations as to how many objects are in the box (indicated, for example, by the time they spend manually searching in the box after having found the first and after having found the second object). Very young children in their first year and many animal species master such tasks (in that they search longer after having found the first object compared to a control condition in which they only saw one object move into the box, and in that they search longer after having found the first object than after having found the second one).

However, the mature ways we think about objects go way beyond just spatio-temporally tracking them. We do not just conceive of objects as portions of stuff moving through space and time. Rather, we think of and keep track of objects as objects of certain natural kinds with deep essential properties that make them what they are and that determine criteria of identity over time. We engage, that is, in sortal (and not only in spatio-temporal) object individuation (Strawson 1959; Wiggins 1997). Now, there is a long tradition in philosophy of assuming that such sortal object individuation is a capacity that requires and builds on language. Quine, for example, famously claimed:

The mother, red, and water are for the infant all of a type: each is just a history of sporadic encounter, a scattered portion of what goes on. […] It is only when the child has got on to the full and proper use of individuative terms like “apple” that he can properly be said to have taken to using terms as terms, and speaking of objects. (Quine 1957, p. 9)

Interestingly, recent developmental findings seem to speak in favor of this claim. In a typical test, subjects would see the following scenario: at time 1, they see an object of kind A come out of a box and then re-enter the box, followed at time 2 by an object of kind B coming out of and then re-entering the box (Xu 1997; Xu and Carey 1996; Xu et al. 2004; for review, see Xu 2007). This scenario is very similar to the one described above in studies on spatio-temporal tracking, but the crucial difference is that in this adapted version, since the objects are never seen moving simultaneously, merely spatio-temporal information is of no use. Rather, what seems necessary for solving this task is the use of sortal concepts: “This is an A (time 1), and now there is this B (time 2), and since As do not normally turn into Bs, there must be two objects in the box”. Empirically, such tasks have been run both with more implicit looking time measures (after seeing these events, the infant sees the contents of the box: either 2 objects (expected) or 1 object (unexpected)) and with more active manual search measures (how long will infants keep on searching in the box in the condition described above compared to a condition in which they saw the same A come out of the box and re-enter the box at times 1 and 2). These studies have produced the following findings: (i) infants master these tasks considerably later than spatio-temporal tracking tasks, namely around 12 months; (ii) competence on such tasks is correlated with language comprehension; and (iii) performance in such tasks is boosted when the scenario is accompanied by sortal language (e.g. “look, an A” at time 1 etc.). This has led to an empirically based version of Quine’s claim in developmental and comparative psychology: sortal object individuation is based on the acquisition of language and therefore uniquely human (Xu 2002).

In a series of studies, we recently set out to test this “no language, no sortal object individuation” claim with non-human primates (Mendes et al. 2008a, b, 2011). The findings were very clear: with basically the same kinds of manual search tasks as used with 12-month-old infants, apes performed in absolutely comparable ways. So on the premise that the kinds of tasks taken as sortal object individuation tasks really do measure sortal object individuation, Quine’s claim seems to be wrong from a comparative point of view. But is the premise about the tasks justified? Does one really need to make use of sortal concepts in order to solve these tasks? Doubts are in order for the following reason: in these scenarios, information about the kinds of objects and information about their merely superficial properties are necessarily confounded. And so success in these tasks could be due to tracking of superficial (rather than essential) properties.

Disentangling representations of the essential properties that define what kind of thing an object is from representations of its merely superficial, accidental properties has been the focus of a separate research tradition on psychological essentialism with adults and much older children (Gelman 2003; Keil 1989). Building on the insights of natural kind semantics in the Kripke-Putnam tradition, this line of research has investigated whether and how children develop natural kind concepts that allow them to distinguish deep essential from superficial merely characteristic properties by confronting them with (picture book) scenarios in which objects undergo massive, yet identity-preserving changes in superficial features. For example, an animal of kind A is taken from its parents at birth and raised by animals of kind B, whose ways of behaving it learns etc. – now, will it turn out to be an A or a B? Children from age 4 (and adults, of course) claim that despite the massive superficial transformations of the animal, its essential properties remain untouched and it will thus turn out to be an A (Gelman 2004).

In a further series of studies, therefore, we drew on the insights of this research tradition on psychological essentialism in order to investigate whether infants and non-human primates really make use of sortal (and not just property-based) object individuation. In one study, we confronted infants with events of the following structure: at time 1, they saw an object with appearance 1 (e.g., a toy bunny) enter into a box, and at time 2 they either saw an object with appearance 1 (same bunny) or with appearance 2 (e.g. toy carrot) come out of the box (Cacchione et al. 2013). In fact, the two appearances belonged to one and the same object – a soft toy that could be turned inside out, with carrot-appearances on one side, and bunny-appearances on the other. Now, importantly, there were two groups, one familiar and the other unfamiliar with such dual-aspect objects. The crucial result was that the infants ignorant about such dual-aspect objects took the difference in superficial appearance as diagnostic for questions of numerical identity, as indicated by the fact that they searched longer in the bunny/carrot condition than in the bunny/bunny condition – in contrast to the infants in the other group, who ignored the superficial differences for their judgment of numerical identity, as indicated by the fact that they did not search differently in the two conditions. In another study with a related yet slightly different approach, great apes saw a food item of kind 1 (e.g. slice of banana) enter into a box and a food item either of kind 1 or of kind 2 (e.g. slice of carrot) come out of the box. Crucially, in some conditions, the food item entering the box was first changed in its superficial properties (e.g. the banana slice was painted orange) so that it was perceptually more similar to items of kind 2 than to other items of kind 1.
The findings were very clear: apes based their judgment of numerical identity and thus their numerical expectations solely on the question of whether the objects seen going in and out of the box were of the same kind – searching longer when they were not than when they were – and did not take into account similarities in surface features at all (Cacchione et al. 2014, unpublished manuscript).

So, what these studies taken together suggest is that, in fact, barely linguistic infants and non-linguistic apes share a conceptual framework of thinking about natural objects in sortal and essentialist ways (Rakoczy and Cacchione 2014).

1.2 Thinking about regularities

While the natural world as we think of it is made out of the building blocks of natural kind objects, it is organized around natural descriptive regularities. Picking up such regularities requires capacities for generic thought, thinking about what is generally the case, and in particular, given the probabilistic nature of most natural regularities, requires mastery of intuitive statistical reasoning. Such intuitive statistical reasoning – about what is likely or unlikely, what are random or non-random events etc. – has been studied in developmental and cognitive psychology since Piaget. Until very recently, basically all of this research suggested that reasoning about probabilities develops late in ontogeny, depends on language and formal education (Piaget and Inhelder 1975), remains fragile even in adulthood (Tversky and Kahneman 1974, 1981), and only works under special circumstances (Cosmides and Tooby 1996; Gigerenzer and Hoffrage 1995).

Against this background, it is all the more spectacular that exciting new research suggests that such reasoning capacities might well be in place much earlier than assumed even in the absence of language. Already preverbal infants engage in some intuitive statistics: most basically, they draw systematic inferences from populations to samples and vice versa, expecting randomly drawn samples to reflect the distribution in the population drawn from and the other way around (Denison and Xu 2010; Téglás et al. 2007; Xu and Garcia 2008).

Since until recently nothing was known regarding the comparative question of whether such intuitive statistics is unique to humans, we set out to test for analogous capacities in non-human primates (Rakoczy et al. 2014). And again, the findings from a series of experiments were very clear: great apes engage in much the same intuitive statistical inferences as preverbal human infants do. When confronted with populations consisting of food items of two kinds (one of which was more attractive to them) in varying distributions from which an experimenter randomly sampled, subjects inferred that the samples would likely reflect the distribution of the populations and thus consistently chose samples from the populations with the more favorable distributions, i.e. with the bigger relative frequency of preferred over non-preferred food items.
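The inference attributed to the subjects here can be made concrete in a minimal sketch. It assumes a simplified version of the task in which a single item is drawn at random from each population, so that the chance of obtaining a preferred item equals the relative frequency of preferred items in that population. The function names and the item counts are hypothetical illustrations, not the materials of the original experiments.

```python
from fractions import Fraction


def preferred_proportion(preferred: int, non_preferred: int) -> Fraction:
    """Relative frequency of preferred items in a population.

    Under random sampling of a single item, this is also the probability
    that the sampled item is a preferred one.
    """
    total = preferred + non_preferred
    if total <= 0:
        raise ValueError("population must contain at least one item")
    return Fraction(preferred, total)


def choose_population(populations: dict[str, tuple[int, int]]) -> str:
    """Pick the population whose random sample is most likely to be preferred.

    `populations` maps a (hypothetical) label to a pair of counts:
    (preferred items, non-preferred items).
    """
    return max(populations, key=lambda name: preferred_proportion(*populations[name]))


# Hypothetical counts: population B holds MORE preferred items in absolute
# terms (100 vs. 80) but a LOWER relative frequency (1/5 vs. 4/5).
populations = {"A": (80, 20), "B": (100, 400)}
best = choose_population(populations)
```

With these illustrative counts, a chooser that tracks relative frequency selects population A, whereas one that tracked only the absolute number of preferred items would select B. Contrasts of this kind are one way to separate reasoning about proportions from simpler quantity heuristics.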

While regarding human infants, recent work has uncovered more and more about how intuitive statistical reasoning plays a role in the inductive learning of all kinds of regularities such as causal laws (Gopnik and Wellman 2012), much less is currently known regarding non-human primates’ uses of their intuitive statistics. But what is clear from the developmental and comparative work taken together is that there is a core cognitive capacity, developing early in human ontogeny and shared by human and non-human primates, for generic thought, for tracking general statistical regularities.

More generally, there is thus a natural ontology shared by human and non-human primates characterized by essentialist ways of thinking about natural kind objects and by generic ways of thinking about general (statistical) regularities.

2 Social ontology

In contrast to this natural ontology shared by human children and non-human primates, I would like to argue now, there is a distinctively social ontology, a way of viewing one’s surroundings in terms of socially constituted objects and socially constituted prescriptive norms that is distinctively human: it is acquired early in human ontogeny but markedly absent in other species. This basic idea is nicely brought out in the following quote from John Searle: “My dog has very good vision, indeed much better than mine. But I can still see things he cannot see. We can both see, for example, a man crossing a line carrying a ball. But I can see the man score a touchdown and the dog cannot” (Searle 2005).

2.1 Thinking about socially constituted objects

We think about natural objects in essentialist, “deep” ways, as objects of a given natural kind constituted by their underlying essential properties that transcend their merely superficial properties. In structurally similar ways, we conceive of many objects of the social world in “deep” ways, as objects of a given socially constituted kind that are what they are due to their essential properties that transcend their merely superficial features. The difference, of course, is that the essential properties of natural kinds are natural properties that are inextricably linked to the natural regularities to which a given kind is subject and that we may – even collectively – actually not know very much about (Putnam 1975). The essential properties of socially constituted kinds, in contrast, are those properties that we collectively assign to them in a given practice and that are closely linked to the deontic, normative regularities to which they are subject (Searle 1995).

Now, what exactly is social constitution of objects and how might children come to participate in and understand such constitution? An influential account explicates the basic logical structure of social constitution in the following way: socially constituted objects are such that a natural object counts as something else: “X counts as a Y in a certain context C” (Searle 1969, 1995, 2010). For example, a piece of paper counts as money in a given currency area, an assembly of bricks counts as a University building, Peter counts as a teacher in certain contexts etc. Social constitution is thus a matter of super-imposition of institutional facts (the Y-facts like “this is a 5-dollar-bill”) on brute facts (the X-facts like “this is a slip of paper”). The essential properties that make a socially constituted object into what it is (qua socially constituted object) are the institutional properties going along with the Y-status of the object that are to a large degree independent of the X-properties (the same Y-properties, for example, of money, can be multiply realized in many X-properties: think of coins, bills, cheques, electronic money etc.). And so just like an understanding of natural kind objects involves a distinction between the essential natural properties that constitute the kind and its merely accidental surface features, an understanding of social kinds involves the distinction between the essential institutional properties (can be used to pay etc.) and the merely accidental natural properties (is made of paper etc.) of objects of a socially constituted kind.

And just like human infants from 1 year of age show evidence of thinking about natural kind objects in essentialist ways (coordinating information about essential and merely superficial properties), so they also begin to engage in structurally analogous thinking (coordinating information about natural and institutional properties) regarding socially constituted objects from their second year on. The first areas in which this can be clearly seen are different forms of social play such as simple rule games and notably pretend play. In joint pretence with others, in particular, children from one and a half years of age apply and understand very clearly the dual structure of brute and assigned facts typical of social constitution – for example, when they jointly pretend that a banana (X) is a fictional telephone (Y) in the context of a given game (Rakoczy 2008b; Walton 1990). Empirically, it has been shown that children from this age are already quite competent at picking up, following and respecting others’ fictional status assignments in joint pretending, tracking them over time and adapting their own actions in systematic ways accordingly (Rakoczy et al. 2004, 2005; Rakoczy 2006, 2008a; Rakoczy and Tomasello 2006). That social play is the first domain in which children’s understanding of and participation in social construction of objects becomes visible might not be a coincidence. One possibility is the following: most forms of institutional life and social construction are difficult to enter into and to understand because they are holistically structured. You cannot understand what money is, for example, unless you understand a lot more about related economic and political matters. Play, in contrast, is “non-serious” in the very sense that it is less holistically related with the rest of institutional reality – which might make it a perfect candidate for a cradle of social constitution.

2.2 Thinking about prescriptive norms

Just as descriptive regularities are the glue that organizes the natural world (as we think of it), prescriptive rules are the glue that holds together the socially constituted world (as we think of it). Full-blown understanding of the social constitution of objects involves an understanding of the deontic powers going along with being a certain kind of socially constituted object, an understanding of the rules coming along with the corresponding practices.

Practices of social constitution first and foremost involve so-called constitutive rules (Rawls 1955; Searle 1969, 1995), rules of the form “Xs count as Ys in context C”. These rules do not regulate an already existing activity but bring it into existence in the first place. For example, among the constitutive rules of chess are “wooden pieces of such and such shape count as kings” and “kings can move in such and such ways” (it is not as if there was chess already and then someone invented kings or determined how they could move). Now, what a king is is a matter of how one can and ought to use such pieces in the course of the game (how it can be moved, how one has to move when under attack etc.). And constitutive rules are always related to and backed up by so-called regulative rules that prescribe actions in an already existing activity (e.g. how to use one’s cutlery when eating, where eating predates cutlery logically and historically) [Footnote 1].

Such social norms or rules have a number of essential logical properties that participants of the social practices in question need to get a conceptual grip on. In particular, they

  • are agent-neutral. That is, they apply to any participant in the practice in the same ways, to oneself as much as to someone else.

  • have normative force. That is, they provide motivating reasons to follow the rule oneself, but also reasons for enforcing the rule towards third parties (e.g. by critique in the case of rule-incompatible actions).

  • apply in context-relative ways. That is, a given kind of behavior might be licensed in one context but inappropriate in another. This can be clearly seen when considering the “X counts as Y in C” formula. One and the same X (e.g. a middle-sized ball) can be used in different contexts in different activities (e.g. handball vs. football) and thereby have a different status – and with that come different norms: It is perfectly fine to touch the ball with the hand in one context (handball), but not in another (football).

Now, how does an understanding of these aspects of social norms develop? Recently, new research with novel methods – investigating spontaneous forms of critique, protest, and other kinds of interventions in response to third-party mistakes – has produced evidence that a grasp of the agent-neutral normative force of social norms develops very early in human ontogeny in a number of domains.

Children as young as 2–3 years understand the rules governing social games, both rule games (governed by explicit rules) and games of pretense (governed by implicit rules). Regarding rule games, children not only learn how to play novel board games quickly (games in which, for example, tokens need to be moved to certain places in certain ways like in pool), but they are equally quick in drawing normative, agent-neutral conclusions (Rakoczy et al. 2009; Rakoczy 2008a, b; Schmidt et al. 2011): When a third party (usually a puppet) announces she is joining the game and performs an act that violates the game’s rules (such as moving a token to the right place in the wrong way), children often spontaneously protest, criticize, and teach the wrongdoer (but they do not do so when she acts appropriately).

With respect to social games of pretence, children as young as 2 years understand the implicit norms governing such fictional activities. Social pretence games are characterized by implicit constitutive rules (Walton 1990): When two actors set up a pretence scenario together (e.g., pretending that a stone is soap), this defines the normative space of the game: The stone counts as a fictional ‘soap’ in the context of the game and is to be treated accordingly. Two- and 3-year-old children understand and enforce this normative structure in agent-neutral ways: They play the game appropriately themselves, and when a third party joins the game, they actively and spontaneously criticize, protest, and teach in response to actions violating the game norms (e.g., confusing the fictional identities of the objects) – but do not do so when the other person acts appropriately (Rakoczy 2008b; Wyman, Rakoczy and Tomasello 2009).

Children this age also understand some basic normative aspects of language use. They understand that different kinds of speech acts (e.g., assertions vs. imperatives) with the same propositional content can have different directions-of-fit and are thus subject to different normative constraints (Searle 1995). When confronted with a speaker making either assertions or imperatives with the same content about or toward a listener (“the listener is doing X” vs. “listener, do X!”), they respond very differently in cases of the nonfulfillment of the semantic content of the speech act (hearer is not doing X): They criticize the speaker for being wrong in the case of assertions (“No, listener isn’t doing X!”), but criticize the addressee for making action mistakes in the case of unfulfilled imperatives (“No! It is X you must do!”) (Lohse et al. 2014; Rakoczy and Tomasello 2009).

Finally, recent research suggests that young children understand some aspects of the normative institution of property. Property itself is a system of constitutive rules that define under which conditions individuals own something and which rights and obligations this engenders (Snare 1972). Toddlers begin to grasp some of the conditions under which ownership is established and altered over time (Blake and Harris 2009; Kim and Kalish 2009). A recent study showed that young children already conceive of property as normative and as agent-neutral: When confronted with an agent treating an object in property-relevant ways (e.g., taking it without asking and throwing it away), 3-year-old children protested when both their own property and someone else’s was affected (but not when the agent performed the same acts on her own property; Rossano et al. 2011).

In all of these domains, then, children very early in development (by ages 2 to 3) reveal basic forms of understanding of the agent-neutral normative force of social norms that both guides their own behaviour and sets evaluative standards for criticizing and teaching others.

And they also begin to grasp the context-relativity of social norms at this age. First of all, they understand that the same behaviour can count as a mistake if performed in the context of a game, but is perfectly appropriate outside of this context where no such norm is in operation. In both pretence and rule game scenarios, children protested against a given behaviour when performed in the course of the ongoing game, but did not do so when the agent was not part of the game anymore, for example, when she had announced beforehand that she would quit the game and do something else (Rakoczy 2008b; Rakoczy et al. 2008, 2009). Second, young children also understand that the same kind of behaviour can be subject to quite different norms in different contexts: In a recent study, children understood, for example, that different pretense games that went on in parallel and between which they switched back and forth licensed quite different acts with the same object (Wyman et al. 2009). And finally, children this age even understand that different kinds of norms differ with respect to the degree of their context-relativity or in the scope of the contexts in which they apply. While conventional social norms often have a rather well defined and limited context pertaining to the social group in question, more general norms of rationality and moral norms are often taken as applying universally to all rational beings (Korsgaard 1996; Turiel 1983). Young children already seem to share this intuition: In a recent study, 3-year-olds witnessed actions that violated (a) norms of instrumental rationality (using inefficient means to pursue an end), (b) conventional social norms of a game (playing a game wrongly), or (c) moral norms (inflicting harm without reason). The context-relativity was here operationalized by using different transgressors: in one condition (in a minimal group paradigm), the transgressor came from the same in-group as the child, in another from an out-group.
The rationale was the following: if children view a given kind of norm as more context-relative in the sense of applying to a local community only, this should reveal itself in a differentiation between transgressions by in- and out-group members (with more critique in the former case). And in fact, the results revealed that children did not differentiate between in- and out-group transgressors in the case of instrumental and moral mistakes (criticizing both at comparable levels), but criticized in-group members significantly more than out-group members for conventional mistakes (Schmidt et al. 2012). Relatedly, in a very recent study, we tested for children’s intuitions about the context-relativity of conventional and moral norms in the sense that agents can or cannot freely change such contexts when acting. Children saw an agent who was at first involved in a conventional practice (playing a sorting game with a given apparatus) or a moral practice (also playing with the apparatus, but now there were the beloved objects of a third person that needed to be protected against destruction), and who then performed an act that either was or was not in accordance with the practice. The crucial variation was the following: when the act was not in accordance with the practice, the agent had announced beforehand either an intention to go on with the practice (sort/protect the person’s objects) or an intention to leave the practice. And the crucial finding was the following: in the conventional case, children criticized behaviour that was not in accordance with the sorting rules only when the actor had announced an intention to play the game.
In the moral case, in contrast, children criticized equally any action that was against the moral practice (not protecting the person’s beloved objects so that they get destroyed), regardless of the agent’s announcement – indicating that they already share the adult intuition that one cannot opt out of a moral practice in a way that one can opt out of conventional practices (Josephs and Rakoczy 2014, unpublished manuscript).

2.3 And what about nonhuman primates?

All in all, thus, children by age 2–3 have developed a grasp of the basic ingredients and structure of our social ontology: they participate in and understand the logic of social construction, in particular in the areas of social games. And they understand that such social practices are governed by prescriptive social norms that they use to guide their own and others’ behaviour.

From a comparative point of view, this form of social cognition seems to be rather unique. True, other animals, notably non-human primates, are far from socially blind and do engage in some basic forms of theory of mind (Call and Tomasello 2008). Also, social learning (in the broad sense of learning via the observation of others’ actions) is widespread in non-human primates (e.g. Price and Whiten 2012), as is the instrumental use of objects as tools (Boesch and Boesch 1990). But all of this falls short of participating in social practices involving social constitution of objects and normative governance.

First of all, though non-human primates use objects as tools (e.g. to crack nuts or fish for termites), and though such tool use seems to spread socially via some forms of social learning, this is not yet a case of social constitution. In social constitution, an object X (e.g. slip of paper) is assigned in totally arbitrary ways [Footnote 2] (i.e. in ways not essentially tied to the physical-causal makeup of the object) a status function Y (money) in such a way that it becomes a socially recognized institutional fact that the object now has the Y properties (is worth such and such, can be used to buy etc.). And we have seen that very young children come to understand the basic structure of such social construction in joint games. Regarding ape tool use, the basic point is that the “using as” involved in making use of objects for instrumental purposes is simply a much weaker cognitive relation than the “counting as” of socially assigning status functions to objects. “Using as”, i.e. making use of the physical-causal properties of an object in order to bring about desired effects (to get nuts open, termites fished etc.) is simply not arbitrary and fact-creating in the way “counting as” is.

Second, aren’t there other domains of social behavior where we find instances of what looks suspiciously like social status assignment and social constitution? What, for example, about social play? Here it should be noted that despite some (rather dubious) anecdotal reports, there is no convincing empirical (let alone experimental) evidence whatsoever for pretence (or other rule-governed play) in non-human primates (Gómez 2008; Gómez and Martin-Andrade 2002). What then about dominance hierarchies? That some animal is dominant compared to some other animal, is that not like an institutional fact, recognized by a group and thereby brought about – much like the fact that some person is president is brought about and sustained by social recognition? In this case, the basic methodological problem is the following: dominance hierarchies (like many other social relations) come in two kinds, if you will, brute ones and institutional ones. Brute ones, widespread in most socially living animals, are in the end built on and can be cashed out in brute physical states of affairs (basically, who is stronger or has more allies). Institutional ones, in contrast, are not founded in this way, but depend on social assignment – think of the dominance hierarchy in universities, companies etc. Now, while there is no question about the existence and influence of non-human primate dominance hierarchies in the first sense, there is no convincing evidence for anything like socially constructed, institutional dominance hierarchies.

Third, one of the fundamental reasons why there is no such convincing evidence for understanding of and participation in practices of social construction in non-human primates is that we do not have evidence that they engage in the kinds of behavior that we take to be clear indicators of normative awareness – normative behaviours towards third parties such as critique, protest etc. (see Footnote 3).

3 Conclusion

None of the foregoing is to rule out, of course, that with better methods, with more fine-grained observations and with ecologically more valid and sensitive experiments we will one day find evidence for a more human-like social ontology in non-human primates. But the evidence as of today gives us good reason to believe in the following picture: Humans live as much in a world of trees and stones and nuts as in a world of touchdowns, marriages and money – and while we share the former with many other species, we are alone in the latter. That is to say, on the one hand, humans early in their ontogeny share with other primates a fundamental worldview when it comes to natural ontology – conceiving of the natural world in essentialist ways, as consisting of enduring objects with essential properties, and in generic ways, as governed by general descriptive regularities that can be picked up on the basis of intuitive statistical reasoning. On the other hand, human children then go on to develop a distinctive social ontology – thinking of the social world as made up of socially constituted objects and as governed by general prescriptive rules.

But why is that so? Are these differences in social ontology cognitively fundamental differences between humans and other primates, rooted in psychologically primitive capacities? Probably not. Rather, the following, more nuanced picture is probably more accurate: Humans and non-human primates share basic forms of individual intentionality, and basic forms of essentialist and generic thinking – resulting in the common natural ontology. Humans and non-human primates probably even share some of the cognitive ingredients required for social ontology, namely what could be called “dual-level thinking”. Social ontology is built on very special forms of dual-level thinking – on thinking about objects on two layers, both in brute terms (X) and in terms of the kinds of status functions we socially assign to them (Y). It is because in practices of shared intentionality we participants of a practice socially assign these functions together that they exist, and that they have binding normative force on us. But rudimentary versions of such dual-level thinking seem to be well present in non-human primates’ tool use and planning. “Using something as something” – for example, using a stone as a tool to crack nuts – as noted above, does not yet amount to “counting something as something”. Nevertheless, it constitutes a basic form of dual-level thinking about the object – in terms of its intrinsic nature and its causal role in a goal-directed individual activity. Similarly, in their sophisticated planning, apes seem to engage in dual-level thinking in the sense that they think about current states of affairs while imagining alternatives that could be brought about in the future (Mendes et al. 2008a). On the assumption that every form of intentionality is inherently minimally normative in the sense that it fixes conditions of satisfaction, criteria of success etc. (Searle 2001), non-human primates might even be said to engage in basic normative activities – activities with criteria of instrumental success (was the desired end for which the object was used, the cracking of the nut, reached?) that might even confer standards of evaluation on the object itself (is this a good nut-cracker stone? etc.). Two very exciting questions for future research in animal cognition in this context are what convincing indicators of such primitive normative awareness and evaluation might be in non-verbal creatures, and whether such indicators are to be found empirically.

But these shared capacities underlying common forms of natural ontology, when combined with distinctively human social-cognitive capacities (in particular, distinctively human forms of social intentionality), then become co-opted in subsequent human development to represent and carve up the human social world in very special ways. Most plausibly, when the tendency to think in essentialist and generic ways, dual-level thinking and primitive normativity meet specifically human forms of shared intentionality that develop from the second year on, it is this marriage that ultimately results in the social ontology of social constitution and social normativity (Rakoczy and Tomasello 2007; Tomasello and Rakoczy 2003).