The appeal to “causal mechanisms” has been a rallying cry for Critical Realism (CR) ever since its (re)emergence in the UK during the 1970s (Bhaskar 1975, 1979; Harré 1970; Harré and Madden 1975).Footnote 1 Causal mechanisms have likewise become a conceptual mainstay of the Morphogenetic/Morphostatic (MM) approach since its inception in the late 1970s (Archer 1979, 1985, 1995).

Over the last decade and a half, other voices have joined in as well, and the rallying cry has rapidly built into something of a chorus (Gerring 2008; Hedström and Ylikoski 2010; Mahoney 2001). In American social science, at least, mechanisms have now gone mainstream. Mechanisms talk pervades not only sociology but political science and economics as well.

The attraction of the mechanismic approach is clear enough. There is a widespread recognition that the search for social laws, even probabilistic ones, has proven futile. There is also a general if not universal sentiment that cultural interpretation does not exhaust social science; some form of causal explanation must also be a goal. In the present constellation, then, many social scientists are attracted to mechanismic explanation as a possible via media between nomothetic hubris and idiographic humility.

How should Critical Realists and Morphogenetic theorists respond to the sudden popularity of mechanisms talk? With some ambivalence, I will argue. On the one hand, they should welcome it, insofar as the turn towards mechanisms does involve a turn away from logical positivism. On the other hand, they should remain wary, because the mechanismic turn has not been as sharp or as final as it seems; many neo-mechanists are still half-positivist.

The remainder of the essay is in four parts. In Part I, I review the four most influential approaches to causal mechanisms within contemporary American sociology: mainstream, analytical, counterfactual and neo-pragmatist. I argue that they are neither particularly critical nor even fully realist. In Part II, I subject the mechanisms concept to a critical-historical analysis. I argue that the mechanisms metaphor still carries a good deal of ontological baggage from the mechanismic worldview of the seventeenth century. It must therefore be used with a great deal of caution. In Part III, I survey the recent discussion of causal mechanisms within the philosophy of biology. I highlight a number of commonalities between biological and social mechanisms, but caution against easy analogies. In the conclusion, I argue in favor of a thicker and more pluralistic understanding of causation, not only for the social sciences, but also for the “special sciences” more generally.

1 Causal Mechanisms in American Sociology: Four Approaches

There is a folk version of mechanisms talk that one encounters quite often in American sociology and political science these days. I will call it the generic approach (GA). In the GA, causal mechanisms are a supplement to the positivist approach. “Cause” and “effect” are still defined in Humean or positivist terms, i.e., as “events” or “variables.” “Causal mechanisms” are then construed as the “causal chain” that connects them. The causal links are likewise conceptualized as “variables” or “events” and frequently characterized as “mediating” or “intervening.” Even a casual inspection of recent articles in the leading journals in the United States will quickly turn up many examples of this “association plus theory” version of causal mechanisms.

The historical origins of the GA are somewhat unclear. Knight, Morgan and Winship trace them back to Paul Lazarsfeld’s notion of “M-accounts” (Knight and Winship 2013; Morgan and Winship 2007). Such accounts introduce a third variable that might cast further light on the causal connections that underlie a statistical association and help disambiguate the direction of causal influence (Kendall and Lazarsfeld 1950; Lazarsfeld 1955). For example, if one discovered a correlation between being male and having car accidents, one might introduce an additional variable such as “miles driven” and see what effect this had on the association (Hagenaar 2004).
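
The logic of such an elaboration can be sketched with a small table of counts. The numbers below are entirely made up for illustration, and the two mileage bands and the function name `accident_rate` are assumptions rather than anything taken from Lazarsfeld's work; the point is only that an association visible in the pooled data can vanish once the third variable is held constant.

```python
# Hypothetical counts, invented for illustration:
# (sex, mileage band) -> (number of drivers, number of accidents)
COUNTS = {
    ("male", "high"): (800, 80),
    ("male", "low"): (200, 4),
    ("female", "high"): (200, 20),
    ("female", "low"): (800, 16),
}


def accident_rate(sex, band=None):
    """Accidents per driver for one sex, optionally within one mileage band."""
    cells = [value for (s, b), value in COUNTS.items()
             if s == sex and (band is None or b == band)]
    drivers = sum(d for d, _ in cells)
    accidents = sum(a for _, a in cells)
    return accidents / drivers


# Pooled comparison: ignores how much each group drives.
pooled_gap = accident_rate("male") - accident_rate("female")

# Elaborated comparison: the same gap, computed within each mileage band.
band_gaps = {
    band: accident_rate("male", band) - accident_rate("female", band)
    for band in ("high", "low")
}

print("pooled gap:", round(pooled_gap, 3))
print("gap within bands:", band_gaps)
```

In this invented table the pooled gap between men and women is entirely accounted for by miles driven: within each mileage band the two rates are identical.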

A decade later, Hubert Blalock (1967), Otis Dudley Duncan (1966) and others drew on the pioneering work of biologist Sewall Wright (1921) in order to incorporate intervening variables into multiple regression analysis via path analytic techniques (Sewell et al. 1969, 1970). In this way, it was argued, one could indeed get from correlation to causation (Blalock 1968; Land 1969).

While another famous member of the Columbia Sociology Department did explicitly invoke the mechanisms idea (Merton 1949), Lazarsfeld himself did not, nor did later advocates of path analysis, with one important exception: Raymond Boudon. Hearkening back to Merton, Boudon would argue that the only way to get from statistical association to causal inference was via “generative mechanisms” derived from social theory (Boudon 1974, 1976, 1991). By “theory”, of course, Boudon meant rational choice theory.

But Boudon’s proto-GA was vehemently rejected by leading members of the Wisconsin School such as Robert Hauser, who insisted that a causal explanation just was a statistical model, nothing more, nothing less (Hauser 1976). Hauser’s view prevailed, at least within mainstream sociology, and in the decades that followed, most sociologists stopped talking about causal mechanisms. Why? One likely reason is that rapid advances in computing power and statistical software made it so easy to do “kitchen sink” regression analyses (i.e., to throw in every variable “including the kitchen sink”). Be that as it may, this much is certain: the GA long antedates the current-day revival of mechanismic thinking – indeed, it antedates any widespread use of the mechanisms concept.

Apart from a few neo-Marxists familiar with CR (Brooks 1989; Isaac 1987a, b; Wright 1987, 1997) there was relatively little explicit talk about causal mechanisms within American sociology between the mid-1970s and the mid-1990s.Footnote 2 This changed abruptly around the turn of the millennium, due mainly to the efforts of an interdisciplinary group of predominantly Scandinavian social scientists centered around Peter Hedström. Placing themselves in the lineage of Lazarsfeld and Merton, they sought to revive the agenda first set forth by Boudon two decades before, namely, a loose-jointed version of rational choice theory in which individual actors were the basic building blocks of all causal mechanisms, the “cogs and wheels” inside of the “black box” connecting causal variables (Elster 1983, 1989, 1999; Hedström and Swedberg 1996, 1998; Sørensen 1998). Perhaps because of the deep resistance to rational choice within American sociology, the Hedström group later restyled their approach as “Analytical Sociology” (Demeulenaere 2011; Hedström 2005, 2008; Hedström and Bearman 2009a; Hedström and Ylikoski 2010).

Unlike Boudon, Hedström and many of his followers do not envision causal mechanisms as a mere add-on for statistical analysis (i.e., as a way of giving greater “depth” to regression modeling). On the contrary, they have become increasingly opposed to variables-oriented sociology as such. Instead, following James Coleman (1994), they see mechanisms as a way of putting sociological analysis on firmer ontological foundations – namely, methodologically individualistic ones. All social phenomena, they insist, can ultimately be reduced to “micro-level” interactions between individuals, with their “desires, beliefs, and opportunities.” They do include “macro-level phenomena” in their basic model, but only as perceived “constraints” on individual action. They explicitly reject strong, ontological versions of social-structural emergence in favor of weak, epistemic understandings of property emergence. In other words, they regard social structures as real only if and insofar as social actors behave as if they are real. Accordingly, they distinguish between three basic categories of social mechanisms: (1) “macro-micro” or “situational”; (2) “micro-micro” or “individual action”; and (3) “micro-macro” or “transformational” (Hedström and Swedberg 1996: 297). While the AS approach appears similar to the MM approach at a schematic level, the resemblance is only superficial, not only because social structures are treated as weakly emergent but also because human persons are treated as “rational actors.”

The third approach to causal mechanisms currently on offer within American sociology is “counterfactual dependency” (CFDp). Its chief advocates have been Christopher Winship and his students (Elwert and Winship 2010; Elwert 2013; Morgan 2001, 2013; Morgan and Winship 2007). The principal architects of CFDp have been James Woodward (2002, 2003, 2004, 2011), on the philosophical side, and Judea Pearl, on the statistical side (Pearl 2009, 2010), though other philosophers and statisticians have lately begun contributing to this literature as well (Bollen and Pearl 2013; Hitchcock 2001; Hoerl et al. 2011; Spirtes et al. 1993).

David Hume is sometimes presented as the founder of this approach (Menzies 2009). In a key passage of the Enquiry Concerning Human Understanding, Hume offered the following definition of causation: “We may define a cause to be an object followed by another, and where all the objects, similar to the first, are followed by objects similar to the second. Or, in other words, where, if the first object had not been, the second never had existed.” Hume’s second locution may be read as implying a relation of counterfactual dependency. Other scholars (Hausman 1981; Sekhon 2004), including Pearl (2009) himself, trace the origins of CFDp to John Stuart Mill’s “method of difference” (Mill 1986). That method, to recall, involves comparing two similar cases that yield different outcomes to find the one major difference that distinguishes them; this will be the key cause.

But the revival of counterfactual reasoning within contemporary philosophy is mainly due to the influence of David Lewis. In his early writings on counterfactuals, Lewis defined causation as follows: “Where c and e are two distinct possible events, e causally depends on c if and only if, if c were to occur e would occur; and if c were not to occur e would not occur” (Lewis 1973a, b). Following Hume, then, Lewis understands cause and effect as “events.” He further stipulates that these events must be “independent” of one another. Critics immediately discovered a number of problems with Lewis’ approach. What if c and e are both caused by b (spuriousness)? What if c is caused by b (transitivity)? What if e can be caused by b and/or c (overdetermination)? What if c is prevented by b (preemption)? How can one be certain that c caused e, rather than the other way around (temporal asymmetry)? How can one be certain that c could not have occurred (possible worlds)?
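
Lewis's definition, as quoted above, can be written compactly in the box-arrow notation that is conventional for counterfactual conditionals. The symbols are assumptions introduced here rather than notation used in this essay: O(c) abbreviates "c occurs", and the box-arrow reads "if it were the case that ..., it would be the case that ...". The fragment assumes the amsmath and amssymb packages.

```latex
e \text{ causally depends on } c
\quad\Longleftrightarrow\quad
\bigl( O(c) \mathrel{\Box\!\!\rightarrow} O(e) \bigr)
\;\wedge\;
\bigl( \neg O(c) \mathrel{\Box\!\!\rightarrow} \neg O(e) \bigr)
```

Written this way, it is easier to see that the definition relates only the occurrence or non-occurrence of two events and says nothing about any third event b, which is where the objections listed above take hold.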

In his later writings, Lewis attempted to deal with these problems by redefining counterfactual dependency in terms of continuous variation rather than discrete events, such that whether, how and when c occurs affects whether, how and when e occurs (Lewis 2000). Conceived in this way, counterfactual semantics were easily combined with statistical analysis of “potential outcomes” (the so-called Neyman-Holland-Rubin model). CFDp was born.
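
The core quantities of the potential outcomes model can be stated in two lines. The notation is the standard textbook one and is an assumption here, not something this essay itself uses: each unit i is taken to have an outcome Y_i(1) under treatment and an outcome Y_i(0) without it, only one of which is ever observed. The fragment assumes the amsmath package.

```latex
\tau_i = Y_i(1) - Y_i(0)
\qquad \text{(unit-level effect, never directly observed)}

\mathrm{ATE} = \mathbb{E}\bigl[\, Y_i(1) - Y_i(0) \,\bigr]
\qquad \text{(average effect in the population)}
```

The counterfactual element lies in the unobserved term: for a treated unit, Y_i(0) describes what would have happened had the treatment not occurred.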

The potential outcomes approach was originally developed by horticulturalists and epidemiologists, who were interested in the average effects of specific interventions on a particular population (e.g., the use of a new fertilizer or drug). In other words, the goals were practical rather than scientific. But what if the analyst was interested in typical causes rather than average effects? Judea Pearl has argued that one can get from observed effects to underlying causes by combining counterfactual reasoning with three further elements: causal models, causal graphs and structural equations. By “causal models”, he understands pictorial representations of a causal system. His stock examples are human-made physical set-ups, such as electrical circuits. By causal graphs, he understands “directed acyclic graphs” (DAGs), which represent causal processes via arrow diagrams, with each node standing in for a variable and each arrow for a direct causal influence of one variable on another. Structural equation models are then used to test for the presence, strength and direction of the causal effects. It is very important to note that Pearl’s approach requires a number of highly restrictive assumptions that are not often realized in the social world. Otherwise, the statistical tests will not be sufficient to establish the direction and magnitude of the effects, nor can the DAG be assumed to be an accurate model of the actual causal process.
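
A minimal sketch may help to show how a causal graph, structural equations and an intervention fit together. It is not one of Pearl's own examples: the three variables, the coefficients, the probabilities and the function names are all hypothetical, chosen so that every possible "world" can be enumerated exactly. The sketch contrasts the gap one would observe in passively collected data with the effect of setting X by intervention.

```python
from itertools import product

# Hypothetical DAG: Z -> X, Z -> Y, X -> Y (Z is a common cause of X and Y).
# Exogenous background terms and their assumed probabilities.
P_UZ = {0: 0.5, 1: 0.5}
P_UX = {0: 0.8, 1: 0.2}


def solve(u_z, u_x, do_x=None):
    """Solve the structural equations for one setting of the exogenous terms."""
    z = u_z
    # An intervention replaces X's own equation, cutting the Z -> X arrow.
    x = (z ^ u_x) if do_x is None else do_x
    y = 2 * x + 3 * z
    return z, x, y


def worlds(do_x=None):
    """Yield (probability, z, x, y) for every combination of exogenous terms."""
    for (u_z, p_z), (u_x, p_x) in product(P_UZ.items(), P_UX.items()):
        yield (p_z * p_x, *solve(u_z, u_x, do_x))


def mean_y(given_x=None, do_x=None):
    """Expected Y, optionally conditioning on X or intervening on X."""
    rows = [(p, y) for p, _, x, y in worlds(do_x)
            if given_x is None or x == given_x]
    total = sum(p for p, _ in rows)
    return sum(p * y for p, y in rows) / total


observed_gap = mean_y(given_x=1) - mean_y(given_x=0)  # association in the data
causal_effect = mean_y(do_x=1) - mean_y(do_x=0)       # effect of setting X

print("observed gap:", round(observed_gap, 3))
print("effect under intervention:", round(causal_effect, 3))
```

The two numbers differ because Z influences both X and Y. The sketch recovers the coefficient written into the equation for Y only because the graph and the equations are stipulated to be correct, which is exactly the kind of assumption that is rarely secured in social research.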

The fourth and final approach to causal mechanisms that I wish to touch on is the neo-pragmatist one (NP) recently sketched out in a well-received article by Neil Gross (2009). There, Gross proposes that we define “social mechanisms” as “composed of chains or aggregations of actors confronting problem situations and mobilizing more or less habitual responses” (2009: 368). This leads to a research agenda which “entails breaking complex social phenomena into their component parts to see how aggregations or chains of actors employing habits to resolve problem situations to bring about systematic effects” (2009: 375). As Gross himself notes, the NP approach is fairly similar to AS, insofar as it tries to explain higher-order social properties in terms of lower-order individual-level processes. It also resembles AS in another respect, which Gross does not highlight: namely, in its rejection of strong social emergence. There are at least two important differences between NP and AS. One is that the NP rejects the utilitarian rational-actor model in favor of a practical, habitual-actor model. Another is that it invokes micro-macro explanations to explain stability (morphostasis), rather than transformation (morphogenesis), as in AS. The implication, never made explicit by Gross, is that system-level change – morphogenesis – is the result of “creative action” (Joas 1996) that responds to “problem situations” that challenge habitual routines. While the NP approach is much less developed at present than either AS or CFDp, both philosophically and methodologically, neo-pragmatism qua social theory is certainly very much en vogue amongst younger, theoretically minded American sociologists today.

For the Critical Realist, however, none of these approaches can be considered fully realist, if by “fully realist” we mean epistemologically, ontologically and ethically realist. For example, the AS approach is epistemologically realist but not ontologically realist. It squarely rejects empiricist and positivist understandings of causation as a constant conjunction or probabilistic association between events or variables. And it firmly embraces a realist view of causation in terms of mechanisms. However, it is ontologically irrealist to the extent that it allows only an epistemic form of social emergence understood as higher-order properties that are perceived by social actors. As I have shown elsewhere (Gorski 2009), this renders AS – and all such efforts to combine social realism with methodological individualism – ontologically incoherent. How so? AS admits the existence of non-observable sub-individual level entities and processes (e.g., conflicting desires and rational choices) while denying the existence of supra-individual entities and processes on the grounds that they are not observable. In this regard, AS is still empiricist and not fully realist.

With CFDp, we encounter the reverse situation: it is ontologically realist but epistemologically irrealist. It is ontologically realist to the degree that it at least tacitly allows for “downward” or “macro-micro” forms of causation. For example, CFDp analyses of social mobility often look at the impact that a “macro” variable such as education or neighborhood has on a “micro” variable such as life chances or average income (Morgan 2001; Sharkey and Elwert 2011). Still, CFDp remains epistemologically irrealist insofar as it conceptualizes causation as a probabilistic relationship between variables, rather than as a processual relationship between active entities. And this generates epistemological confusion within CFDp. For instance, empirical analyses within the CFDp framework frequently confuse mechanisms with models. Specifically, they present DAGs as causal mechanisms rather than as statistical models of those mechanisms.

Gross’ NP approach might be described as “Hume lite.” It is doubly irrealist but not strongly so. The epistemological irrealism manifests itself in the unspoken equation of mechanisms with regularities. Consequently, it cannot allow for suppressed, inactive or intermittent mechanisms. In other words, neo-pragmatism is another form of actualism. The ontological irrealism reveals itself in an easygoing form of methodological individualism, which is skeptical about the existence of supra-individual social structures. It leads to two programmatic difficulties. One is that social transformation cannot be explained in terms of causal mechanisms; it can only be accounted for in terms of “creative action” (Joas 1996). Of course, creative action can be a mechanism of social transformation (Sewell 1996). But it is hardly the only one. The other shortcoming is that it must explain morphostasis purely in terms of habitual action, because it lacks, or rather eschews, any notion of social structures that might generate or reproduce habits.

What is more, none of the four approaches is morally realist. They see causal mechanisms as an integral feature of a good explanation; but they do not attribute any critical function to them. Both CR and MM see the proper identification of causal mechanisms as the sine qua non of a good explanation. But they also believe that mechanismic analysis can function as a form of social critique, and in at least two different ways. The first is what Bhaskar has called “explanatory critique” (Bhaskar 1986, 2002). This involves the identification of social mechanisms whose operation is systematically misrecognized by, and therefore concealed from, the social actors themselves, where such misrecognition is crucial to the continued operation of the mechanism. This form of critique is hardly specific to CR of course. After all, the paradigmatic example of an explanatory critique is Marx’s analysis of the extraction of surplus value (Marx et al. 1976). Of course, social actors do not always have a “false consciousness” about the social structures they are enmeshed in; sometimes, they understand them quite well and enter into them more or less voluntarily, faute de mieux. I therefore propose that we distinguish a second form of mechanismic analysis. Let us call it “eudemonistic critique.” It involves showing that a particular form of life-conduct or social organization limits or prevents the realization of certain human capacities or relational goods – and that it does so unnecessarily.

2 Causal Mechanisms and the Physicalist Imaginary: A Critical-Historical Analysis

For the Critical Realist, the recent move towards mechanismic analysis in American sociology marks a welcome departure from the sort of positivist empiricism that so dominated the discipline for over half a century. However, as we have just seen, the realist train has not quite made it out of the Humean station; it remains half-stuck in various forms of skeptical irrealism – epistemological, ontological or moral. Why? There are many reasons, of course, including commitments to: certain positivist-inspired methodological techniques; a deeply individualistic ethico-political framework; and a sharp distinction between “facts” and “values.”Footnote 3 In this section, I would like to argue that there is also another, deeper and less obvious reason: contemporary approaches to social mechanisms are tacitly structured by a physicalist imaginary whose roots lie in the “mechanical philosophy” of the seventeenth century. In concluding, I will contend that CR itself has not entirely disentangled itself from this imaginary.

I borrow the notion of an “imaginary” from Charles Taylor. In A Secular Age, for example, Taylor defines a “social imaginary” as “the way that we collectively imagine, even pre-theoretically, our social life” (Taylor 2007: 146). Elsewhere, he defines it more colloquially as: “the ways people imagine their social existence, how they fit together with others, how things go on between them and their fellows, the expectations that are normally met, and the deeper normative notions and images that underlie these expectations” (Taylor 2002: 23).

Following Taylor we might also speak of a “natural imaginary.” By “natural imaginary”, I mean something like “the way that we collectively imagine, even pre-theoretically, the natural world.” Or, more expansively, “how we collectively and often pretheoretically envision the ontological furniture of the natural world, how it is ordered, and where human beings fit into these arrangements.”

Taylor argues that social imaginaries are historically and culturally variable. I would argue that this applies to natural imaginaries as well. The physicalist imaginary, for example, is an early modern revival of the ancient atomism of Democritus and Lucretius, supplemented by the geometrical formalism of Pythagoras and Plato (Funkenstein 1986; Gaukroger 2006; Shapin et al. 1985; Wilson 2008). Put simply, it presumes that the world is “really” composed of elementary particles that interact in a deterministic manner that can be captured mathematically, and that everything else in the world is ultimately epiphenomenal. I refer to this as the “physicalist imaginary” because the resurgence of atomistic metaphysics in the seventeenth century was largely a philosophical response to the triumph of celestial mechanics, which combined atomism and Pythagoreanism. I say “philosophical” because its chief proponents were men like Descartes and Hobbes, who may have fancied themselves physicists, but who are now known to us mainly as philosophers, not least because their physical theories were catastrophic failures. Those whom we nowadays remember as physicists, by contrast, did not firmly embrace this atomistic metaphysics, either because they thought scientific knowledge was founded on experimentation (e.g., Galileo and Boyle) and/or because they were ultimately committed to a non-mechanistic metaphysics of some sort (e.g., Newton, who believed that divine intervention was necessary to maintain cosmic order).

Though they are mostly implicit in modern-day social theory, the basic metaphysical assumptions of the physicalist imaginary are made very explicit in the atomistic physics of early modern mechanism. Four of the key assumptions are as follows: (1) all things really consist of atoms; (2) all change is just the motion and collision of atoms; (3) all such motions and collisions obey the laws of geometry; (4) therefore, all events are fully determined in advance.

While many early modern scientists came to believe that this was an accurate description of the physical world, Hobbes and other neo-Epicurean philosophers argued that the social world could also be understood in exactly the same way (Martinich 2005). It, too, was comprised of “atoms” (i.e., individuals). Individual behavior was driven by internal “motions” (i.e., desires for objects). Human interaction was subject to “natural law” (in the Grotian sense of “self-preservation”). Therefore, social life was also fully deterministic.

The central ambition of the neo-Epicureans was to do to the Aristotelian view of human society what mechanistic physics had done to the Aristotelian cosmos, namely to supplant it. In this way, they hoped to unify the sciences by placing them on the same metaphysical foundation: atomism. In so doing, they transformed a natural imaginary into a social one, giving birth to the physicalist worldview that still underpins much work in sociology and in the social sciences more broadly.

The physicalist imaginary is deeply embedded in the modern social imaginary, so deeply in fact that it is worth recalling the preceding natural imaginary it replaced, namely, the Aristotelian world picture that underwrote medieval natural philosophy (Feser 2004; Sachs 2004). Let me quickly draw out four important points of contrast. (1) Hylemorphic Ontology. In Aristotelian metaphysics, the world was comprised, not of atoms, but of “substances”, complex combinations of matter and form, which were hierarchically ordered. The physicalist imaginary was derived from a flat and monistic ontology in which there was only one substance. (2) Causal Pluralism. In Aristotelian natural philosophy, an adequate explanation invoked four different types of causation: material, efficient, formal and final. Early modern mechanism reduced the four types of causation to one: efficient. (3) Powerful Particulars. Different substances behave in characteristic ways. There are no “laws of motion” that apply equally to all realms of being (physical, biological, social and so on). In the physicalist imaginary, by contrast, particular powers are lumped together into the unifying category of “cause.” (4) Human Freedom. One of the characteristic powers of human persons is to act according to reason; another is to live in society. In the physicalist imaginary, by contrast, human beings are just so many billiard balls, jostling into one another.

Whereas the early modern mechanists projected a physicalist ontology onto the human world, Aristotelianism did more or less the reverse: it understood the cosmos as a living entity, suffused with agency and purpose (Lear 1988). In sum, the shift from the Aristotelian cosmos to the mechanical world-picture involved: (1) stripping away the ontological category of form and therefore also of substance; (2) stripping away the material, formal and final aspects of causation, so that all causation was reconceived as efficient causation; (3) a shift from a purposive to a deterministic view of the natural order; and (4) a shift from a biocentric to a physiocentric cosmos.

The continuing influence of the physicalist imaginary on the social sciences can be seen in various ways. One is the enduring power of certain metaphysical prejudices. Two are particularly consequential: ontological smallism and causal deflation. By “ontological smallism” (Wilson 2004: 22–24), I mean the pervasive tendency within scientific discourse to privilege the small over the large in all realms of study. The unspoken assumption is that things at larger scales can only be explained in terms of things at a smaller scale and never the reverse. The classic expressions of smallism in the social sciences are some form of “methodological individualism” and its Siamese twin, methodological reductionism.Footnote 4 By “causal deflation”, I mean the tendency to squeeze all forms of causation into the model of efficient causation and, even more, to (re)conceive of efficient causation in a purely mechanistic fashion (i.e., as a direct transfer of energy from one entity to another via physical contact). One common manifestation of deflation is the widespread practice of representing all causation in terms of nodes and arrows. Social science smallism creates an epistemological privilege for reductionism in all disciplines that in turn justifies a disciplinary hierarchy in which intellectual status is directly correlated with smallist commitments. The smaller the primary objects of study in your discipline or sub-field, the more scientific your research is, and the higher your status. Causal deflation, meanwhile, compels social scientists to translate all manner of causal relations into an efficientist language, renders any form of causation not involving physical contact (e.g., collective memory) inherently “spooky” and “unscientific”, and blinds researchers to the diversity and specificity of causal relations in the social world.

CR cannot be charged with smallism. It has been committed to strong emergence and ontological stratification since its inception. However, it has not entirely freed itself of the smallist prejudice. For example, the recurring trope of “underlying mechanisms” carries the unfortunate connotation that mechanisms operate at the micro-scale. There is, as well, a small remnant of causal deflation. The MM approach does draw a clear distinction between “macro” and “micro” causation, to be sure. But macro-to-micro causation is often represented in terms of efficient causation: structure at T1 impacts agency at T2. No doubt! But not only. Structure also influences agency synchronically by constraining and enabling certain agentic powers. What is needed, then, is: (1) an understanding of social mechanisms that is fully shorn of the mechanistic metaphysics of the physicalist imaginary; and (2) an understanding of social causation that is more attentive to: (a) different forms of social causation; and (b) specific types of causal powers in the social world. Recent work in the philosophy of biology can help move CR and MM towards all of these goals.

3 Causal Mechanisms in the Life Sciences: The Chicago School Approach

While most Critical Realists will happily agree that the physicalist imaginary is ontologically inadequate, few analytic metaphysicians would join them. Amongst philosophers, particularly philosophers of mind, ontological smallism and causal deflation are still very much the order of the day. In the philosophy of science, however, and especially in the philosophy of biology, other views have been gaining ground, heterodox views that are more consonant with CR and MM. Since much of this work has been done by William Wimsatt and his students at the University of Chicago, I will refer to this approach as “the Chicago School” – not to be confused with the Chicago School in Economics, of course, which is mechanism par excellence! The Chicago School provides very powerful arguments against the physicalist imaginary and a useful starting point for reconstructing our social imaginary along Critical Realist lines.

The area of philosophy where the validity of smallism and deflation has been most heavily debated is probably the philosophy of mind. The central question in this literature concerns the relationship between mind and brain. And the most popular answer is probably Jaegwon Kim’s notion of “supervenience” (Kim 1979, 1987, 1993, 2002). While some of the early twentieth-century emergentists used “supervenience” as a synonym for emergence (Broad 1929), present-day philosophers of mind typically present it as an alternative to emergence. Let us assume, as CR does, a “stratified” or “layer-cake” ontology. For simplicity’s sake, let us further assume two strata or layers, “A” and “B”, where “A” is higher and “B” is lower. “The core idea of supervenience is captured by the slogan that there cannot be an A-difference without a B-difference” (McLaughlin 1995, 2005, 2006). For example, let us imagine that mental states (level A) “supervene” on brain states (level B). This means that any change in mental state (feelings of pain or hunger, thoughts of exercise or dinner, and so on) will correspond to a change in brain state. The attraction of this approach for philosophers of mind is that it saves the qualia – the “secondary qualities” of subjective experience (e.g., sweetness, redness, perhaps even beauty) (Searle 1998, 2004) – but without abandoning physicalism. This is because supervenience allows for a “weak” or “epistemic” form of emergence. It allows for emergent properties (e.g., qualia) that can be exhaustively explained in terms of lower-order physical entities and processes (e.g., the firing of neurons). Some sociologists have also been attracted to supervenience for similar reasons: it allows one to defend “methodological individualism” without denying the existence of macro-social properties (Healy 1998; Hedström and Bearman 2009b; Sawyer 2002, 2005). On this account, there may be “social facts” (e.g., birth rates, crime rates and so on), but they will “supervene” on individual activity. In other words, higher order processes and properties are nothing but aggregations of lower order ones.
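The supervenience slogan can be stated a little more formally. What follows is only a sketch: the notation is introduced here for illustration and is not Kim’s or McLaughlin’s own, and it sets aside the modal qualifications (weak, strong, global) that the technical literature adds. Let A be the set of higher-level properties, B the set of lower-level properties, and x and y any two entities or states.

```latex
% A supervenes on B: indiscernibility with respect to B-properties
% implies indiscernibility with respect to A-properties.
\[
\forall x\,\forall y\;
\Big[\,\big(\forall G \in B:\; G(x) \leftrightarrow G(y)\big)
\;\rightarrow\;
\big(\forall F \in A:\; F(x) \leftrightarrow F(y)\big)\Big]
\]
% Contrapositive, i.e. the slogan "no A-difference without a B-difference":
\[
\forall x\,\forall y\;
\Big[\,\big(\exists F \in A:\; F(x) \not\leftrightarrow F(y)\big)
\;\rightarrow\;
\big(\exists G \in B:\; G(x) \not\leftrightarrow G(y)\big)\Big]
\]
```

Read this way, the formula says nothing about B-differences requiring A-differences. The same mental state, or the same crime rate, may be realized by many different lower-level configurations.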

The problem with supervenience, as Kim himself has recently concluded, is that it does not in fact provide the sort of stable middle ground between Cartesian dualism and reductive physicalism that it promises (Kim 1999, 2005, 2011). To see why, consider the mind/brain relationship again. If we accept that all mental states supervene on brain states in Kim’s sense, then why bother studying mental states at all? Why not just focus on the brain? After all, the supervenience account strongly implies that the “real” causal action will be at the level of the brain, anyway; mental states are ultimately just epiphenomenal. To suggest otherwise, Kim argues, entails the possibility that mental states might have causal powers independent of brain states, opening the door to “downward causation” (Andersen 2000; Campbell 1974; Murphy et al. 2009). By “downward causation”, Kim understands a form of efficient causation in which A properties cause B properties. For example, one’s mental state at T1 would exert “downward causation” on one’s brain state at T2 (Kim 2000). And this, says Kim with rather considerable alarm, would seem to threaten the “physical closure” of the world, because it implies that mental processes might sometimes overrule or even violate physical laws. With the dissolution of supervenience, Kim concludes, there are really only two stable positions left in the philosophy of mind: reductive physicalism and metaphysical dualism. And only one of these is scientifically legitimate, namely, physicalism. Is he right?

The recent work of the Chicago School suggests not. In a series of articles, William Wimsatt paves the way by turning the tables on reductive physicalists. He asks: What would it really mean for a higher order system property to be nothing but an aggregation of lower order processes? (Wimsatt 1985, 1997). Wimsatt enumerates four conditions which must all be fulfilled: (1) Inter-Substitution: internal rearrangements or external substitutions of system parts will not affect system properties; (2) Qualitative Similarity: increases in the size or scale of the system have no influence on its system properties; (3) De/Recomposition: the system can be disassembled and reassembled without any loss of system properties; and (4) Linearity: “There are no cooperative or inhibitory interactions amongst the parts of the system for this property” (1997: 386). As Wimsatt rightly notes, there are precious few systems that actually fulfill all four of these criteria. The proverbial heap of sand might come close. But even a pile of stones might not, since the exact shape and arrangement of the stones and the size of the pile might affect their stability (Mumford 2012). In Wimsatt’s words: “Very few system properties are aggregative, suggesting that emergence, defined as failure of aggregativity, is extremely common – the rule, rather than the exception” (1997: 382).
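A toy illustration may make the four conditions more concrete. The following Python sketch is not Wimsatt’s own formalization. The functions `total_mass`, `stable_stones` and `aggregativity_report` are hypothetical names introduced here, and each check is only a rough operationalization of the corresponding condition, applied to a “system” that is nothing more than a sequence of stone sizes listed from bottom to top.

```python
from itertools import permutations


def total_mass(parts):
    """A plausibly aggregative property: the sum of the parts' masses."""
    return sum(parts)


def stable_stones(parts):
    """A toy, arrangement-dependent property (an assumption, not from the text).

    Stones are listed bottom to top. A stone counts as stable if it rests on
    the ground or on a stone at least as large as itself.
    """
    return sum(1 for i, p in enumerate(parts) if i == 0 or p <= parts[i - 1])


def aggregativity_report(prop, parts):
    """Rough checks of the four conditions for one property of one system."""
    parts = tuple(parts)
    whole = prop(parts)
    return {
        # (1) Inter-substitution: rearranging the parts changes nothing.
        "inter_substitution": all(prop(p) == whole for p in permutations(parts)),
        # (2) Size scaling: doubling the system merely doubles the value.
        "size_scaling": prop(parts + parts) == 2 * whole,
        # (3) De/recomposition: any split into two subsystems adds back up.
        "decomposition": all(
            prop(parts[:k]) + prop(parts[k:]) == whole
            for k in range(1, len(parts))
        ),
        # (4) Linearity: the whole equals the sum over parts taken alone.
        "linearity": sum(prop((p,)) for p in parts) == whole,
    }


pile = (3, 1, 2)  # stone sizes, bottom to top
mass_report = aggregativity_report(total_mass, pile)
stability_report = aggregativity_report(stable_stones, pile)
print(mass_report)
print(stability_report)
```

On this toy system, total mass passes all four checks, whereas the stability count fails all of them. That is the pattern Wimsatt’s argument predicts for any property that depends on how the parts are arranged, and it is what he means by defining emergence as failure of aggregativity.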

While Wimsatt has argued strongly against reductive physicalism, other members of his Chicago School have strongly criticized nomothetic understandings of scientific knowledge. At least in the biological sciences, they contend, explanations usually appeal to mechanisms rather than laws. But just what is a biological mechanism? In a much-cited paper, Peter Machamer, Lindley Darden and Carl Craver (hereafter: MDC) offer the following definition: “Mechanisms are entities and activities organized such that they are productive of regular changes from start or set-up to finish or termination conditions” (Machamer et al. 2000: 3). Let us examine their definition a little more closely. The first thing to note is that it includes both “entities” and “activities.” By means of this “dual ontology”, MDC seek to incorporate the insights of both “substantialist” definitions of causal mechanisms that focus on the dispositional properties of natural kinds (Cartwright 1989; Ellis and Lierse 1994; Ellis 2001; Mumford 1998) and those of “process ontologies” that give relations pride of place (Latour 2013; Rescher 1996, 2000; Stengers 2011; Whitehead 1978). It is also worth noting that MDC themselves give priority to activities, and for much the same reasons as MM gives priority to practice in its conception of persons, namely: First, they argue that we learn about the causal nexus of the world through our own activity in the world, regardless of whether “we” means scientific researchers or young children. Second, they argue that entities exert their powers only via activities (Machamer 2004). This leads to a third point: causation is first and foremost about “production”, not conjunction, correlation, or relevance, as the Humeans and neo-Humeans have variously asserted (Glennan 1994). Now, as some critics pointed out (Bogen 2005), the reference to “regularity” might seem to put MDC back in the Humean camp, with its actualist prejudices. However, members of the Chicago School quickly clarified that mechanisms are always regular in their activities but not necessarily in their occurrences (Craver 2006; Darden 2006; Glennan 2010). Consider the “fight or flight response.” Its operation may be regular, but its initiation is irregular. Another attractive feature of MDC’s definition that bears emphasis is their notion of start-up and finishing conditions. The advantage of this formulation is that it captures the temporal dimension of causal mechanisms without recourse to an events-ontology.

If one commonality between the Chicago School and CR/MM approaches is a commitment to mechanismic explanation and a rejection of nomothetic explanation, another is a strong embrace of a layered ontology and a concomitant suspicion of ontological smallism. Biological mechanisms can rarely be fully described within a flat ontology. This is because the entities that comprise them often vary significantly in size or scale. What is more, the activities of some of these entities may depend upon those of various sub-mechanisms as well. Thus, descriptions of biological mechanisms routinely distinguish between various “levels” and frequently specify “inter-level” processes. While decomposition is often a helpful strategy for investigating mechanisms (Bechtel and Richardson 1993), so is re-composition: discovering what role a particular entity plays in a larger system can illuminate why it has the particular powers or structure that it does. Consequently, the direction of investigation in the life sciences is sometimes from large to small, rather than the other way around. Within most areas of the life sciences, however, scientific investigation operates within a certain scalar range. MDC refer to this operative range as “topping off” and “bottoming out.” The top and bottom levels in a given field are defined through the interplay of disciplinary convention and explanatory relevance. That is to say, researchers typically have a tacit feel for the scalar range within which they search for causal mechanisms, a “personal knowledge” based on their scientific training and theoretical tools. However, they will sometimes breach or move these ontological boundaries in search of a fuller description of the mechanisms they are investigating.

While the Chicago School approach provides a powerful critique of reductive physicalism and ontological smallism, premised on a mechanismic epistemology and a layered ontology, it has thus far been less successful in effecting a reflation of causality. To be sure, MDC’s distinction between entities and activities does point towards the Aristotelian distinction between material and efficient causation. What is more, MDC’s frequent references to the “organization” and “function” of mechanisms and systems gesture towards the categories of “formal” and “final” causation. But in the end, MDC understand causation exclusively in terms of activity, which is to say, in terms of efficient causation. Consider the following passage:

In our view, the phrase ‘top down causation’ is often used to describe a perfectly coherent and familiar relationship between the activities of wholes and the behaviors of their components, but the relationship is not a causal relationship. Likewise, the phrase ‘bottom-up causation’ does not, properly speaking, pick out a causal relationship. Rather, in unobjectionable cases both phrases describe mechanistically mediated effects. Mechanistically mediated effects are hybrids of constitutive and causal relations in a mechanism… (Craver and Bechtel 2007: 547)

Elsewhere, however, they note that (1) the operation of a mechanism typically depends upon the causal powers of its constituent parts; (2) the organizational form of a causal mechanism both constrains and enables the causal powers of its constituent parts; and (3) the analysis of a mechanism generally requires knowledge of its end state or function (Craver 2001). Why we should not see these relations as causal ones – specifically, material, formal and final – is not at all clear, at least not to me.

4 Ontological Dis/Analogies Between Biological and Social Mechanisms

The Chicago School approach provides some useful arguments against reductive physicalism. Specifically, it delivers an open challenge to the natural imaginary bequeathed to us by the mechanistic thinkers of the seventeenth century. For them, the work of science was like watching a game of billiards. All the action takes place in a two-dimensional closed system and consists of energy transfers between point particles resulting in motion that obeys the basic rules of Euclidean geometry. Or so the observer may infer after watching repeated rounds of the game. That such interactions presume not only a closed system but human intervention – that the interactions themselves are, in this sense, humanly created – is quickly forgotten. Let us call this the billiard-ball ontology.

Of course, it is no longer clear how far the billiard-ball ontology actually obtains for the atomic world, not to speak of the quantum world. Be that as it may, it is quite clear that the billiard-ball ontology generates a highly inadequate understanding of the biological realm. Let us simply note some gross contrasts to establish this point. To begin, no biological system is perfectly closed. Indeed, one definition of a living organism is that it absorbs external energy in order to sustain internal order. Further, interactions between biological systems typically involve much more than energy transfers. A cell can become infected by a malicious virus, for instance, and an eco-system can be invaded by a new species.

A second major point of difference is that biological processes occur in four dimensions, rather than two. The third dimension is the spatial dimension of physical scale. Biological entities vary enormously in size from small proteins through mid-sized organisms to vast ecosystems, and many important causal mechanisms are cross-scalar. The fourth dimension is the temporal one of historical time. Of course, time also matters in the mechanistic world of the billiard-ball ontology but in a purely physical rather than genuinely historical sense. Collisions between billiard balls occur in time. And they lead to new configurations. But the basic parameters of the system and the laws of interaction governing its components do not change. Not so in the biological realm. There, new entities may emerge over time (e.g., molecules, mutations, species, behaviors, niches and so on), creating the possibilities of fundamentally new types of powers and interactions: organisms that can walk or fly, populations that can split or migrate and so on. Meanwhile, the sorts of change that are likely to occur and endure are constrained by changes that have taken place in the past. For example, evolutionary adaptations are constrained by the body plans of organisms (Stadler et al. 2001). Finally, at least some biological entities engage in purposive activity, oriented, at minimum, to physical survival and biological reproduction. In short, material, formal and final causality play a much larger role in the biological world than in the physical world at least as that world is conceived in the mechanistic imaginary.Footnote 5

Let us now turn from the dis-analogies between the physical and biological realms to the analogies between the biological and social realms. There are many. They, too, can be conceived in terms of Aristotle’s four types of causation. Let us begin with the material. Human inventiveness continues to bring new entities into the world thereby creating the possibility of new structures and mechanisms. The transportation and communications revolutions of the modern era provide many illustrations (steamships, automobiles, telephones, the internet and so on). Of course, one can also think of artifacts in instrumental terms, as technical means to human ends and, in this way, fold them back into the category of efficient causation. And indeed, that is how most social scientists do tend to think about artifacts, perhaps for that very reason. However, one can – and should – also think of them as material causes of new forms of social organization. Didn’t the invention of the railroad contribute to the development of national consciousness? Wasn’t the mass produced automobile a material cause of American suburbanization? Wasn’t the creation of the internet a material cause of new forms of social networks?

The second analogy, already touched on, concerns formal causation. One of the most common and consequential types of formal causation in the social world is “path dependency” in which established forms of social organization place powerful downstream constraints on subsequent developmental trajectories (Mahoney 2000; Pierson 2004). The paradigmatic example is the QWERTY keyboard. But there are other types as well. Sociologists of organization have long noted the strong tendencies towards structural “isomorphism” within any given social “field” (higher education, automobile companies, etc.) (Powell and DiMaggio 1991). One reason may be that there are certain well-established ways of doing things in certain domains of social life and new entrants into the domain tend to imitate them to one degree or another. But again, social scientists often tend to conceive of social forms in instrumental or strategic terms so as to subsume them into models of efficient causation. But isn’t this too simple? Don’t social forms also constrain actors’ strategies? Indeed, don’t the dominant forms even “choose” or at least advantage some actors over others? If so, then perhaps it is best to speak of formal causation.

That we should find the Aristotelian schema helpful for thinking about causation in the biological world is hardly surprising. After all, it developed out of Aristotle’s zoological researches. More interesting, perhaps, is that we should find it generative in the social domain as well. Of course, biological analogies have a long history in social science. They were frequently deployed by earlier generations, from Spencer and Durkheim through Malinowski and Parsons. But the well-known shortcomings of evolutionary and functionalist approaches should also give us pause and prompt us to reflect on the dis-analogies as well. They, too, are not hard to find.

Let us begin with the material causes of biological and social structures. The building blocks of biological structures are primarily naturally occurring material substances, including animal species and populations. By contrast, even moderately complex social structures are minimally composed of: (1) human persons (qua “actors” or “agents”); (2) physical artifacts (e.g., machines, buildings); and (3) symbolic systems (e.g., rituals, rules). The contrast should not be overdrawn, of course. Some animals do build things, typically shelters. And some of the higher mammals also appear to be capable of a fairly high degree of intra-species communication. But these latter capacities are far more developed in human animals, opening qualitatively different possibilities. The crucial point is that social structures contain a much higher proportion of artifactual and symbolic elements than one finds even in the most highly developed communities of non-human social animals (e.g., social insects and primates).

Now, let us turn from matter to form. In the biological domain, the form of a structure is often the result of the spatio-temporal organization of naturally occurring matter, such that the microphysical arrangement of the component parts constrains the causal powers of those parts while creating new, higher-order causal powers. In the social domain, by contrast, the form of a structure (also) involves symbolically mediated relations between human persons and artifacts, which coordinate and magnify the causal powers of individual actors. As a result, social structures cannot be properly understood in a purely spatio-temporal manner. We could not understand or even categorize a human institution (e.g., a “bank” or a “college”) simply by observing the placement and movement of persons in a building. To this degree, the old interpretivist critique of “behavioral” social science was spot on. However, interpretivists sometimes imply that social structures are reducible to human interactions, and this is not quite right either. Why? Because of the artifactual element. Buildings, for example, are important to the operations of banks and schools, because they constrain and enable certain patterns of interaction and cooperation.

What about final causes? Since Darwin, final causation has been declared anathema within the biological sciences. Of course, as critical observers such as Etienne Gilson noted early on, Darwinian theory cannot really do without something very much like final causation (Gilson 1984). For instance, at the level of the organism, it must presume a will to survive and/or reproduce. Meanwhile, at the level of the species, it must presume something like a developmental tendency towards adaptation and/or fitness. Be that as it may, the reflective capacities generated by human language mean that human behavior is not fully intelligible without reference to some sort of finality (Sehon 2005). Why? Because human beings are forever making plans and telling stories (Ricœur 1990). In making plans, they reflexively seek to pursue their concerns and attempt to relate present actions to future purposes. And in telling stories, they relate past actions to future purposes.

In this section, I have argued that structures and mechanisms in the biological and social realms are not easily handled within the framework of the billiard-ball ontology with its deflated view of causation. More positively, I have argued that an Aristotelian approach to ontology and causation provides a far more fruitful starting point, because it restores the scalar and historical dimensions to structure and the material, formal and final aspects to causation. Whether it provides an appropriate ending point is beyond the scope of this essay. This much seems certain, however: a non-reductive ontology and a pluralist approach to causation would help to resolve some of the persistent aporias that the physicalist imaginary has bequeathed to the modern social sciences.

5 Conclusion: Mechanisms or Powers?

CR and MM first embraced the mechanisms concept as an alternative to the nomothetic model of scientific explanation championed by logical positivists. Should they continue to do so? I am of two minds about this. On the one hand, there are good intellectual reasons for abandoning the concept. On the other hand, there are also good pragmatic and theoretical ones for retaining it.

The fundamental problem with the mechanisms concept is that it primes a whole series of fallacious assumptions about social ontology, specifically: smallism, physicalism, invisibilism and sequentialism. We have already encountered the first two. By smallism, I mean the tendency to privilege smaller units of analysis over larger ones. By physicalism, I mean the tendency to conceptualize interactions in terms of physical contact and energy transfers. However fruitful these heuristics may have proven for the development of mechanistic physics, they have now outlived their usefulness within physics and have proven less useful in the biological domain, where causal mechanisms may include units that differ significantly in scale and even less useful in the social domain where interactions between units are symbolically as well as physically mediated. The mechanisms concept also tacitly implies that causal processes are invisible to social actors and can only be revealed by social analysts. This is not always the case. Social actors may be quite aware that they are enmeshed in an exploitative relationship. Indeed, they can and do create institutions for the express purpose of dominating others! The final fallacy I would like to touch on is sequentialism. The mechanisms concept implies that a past occurrence can only affect the present by means of a spatio-temporally connected series of physically specifiable interactions. This is manifestly untrue in social life. The most obvious and important counter-example concerns memory. Various forms of social memory – individual and collective, neuronal and historical, traumatic or foundational – may exert an ongoing effect across time (Assmann 2011; Halbwachs 1992; Olick and Robbins 1998). As a result, past events can become a part of a social mechanism, and dead people can be social actors.

The mechanismic approach was supposed to seal the break with logical positivism. In this, it has not succeeded. Why? Because logical positivism is also premised on the physicalist imaginary. This is why anti- and demi-realist versions of causal mechanisms have proliferated in recent years. Crypto-positivists, neo-Humeans, and semi-realists have all embraced the mechanisms concept as a halfway house between nomothetic and fully realist forms of social science.

And yet, it is perhaps for this very reason that proponents of CR and MM should think twice before letting go of the mechanisms concept. It has become an important focus of intellectual debate, with contending schools attempting to impose their preferred definitions on it. There are also important theoretical and political reasons not to let go of it just yet: the mechanisms concept reminds us that there are fairly regular but non-observable processes in the social world, even today. For example, however much the technological means of modern capitalism have been transformed, the inner logic of capital accumulation is not really as different as some observers suggest. Nor should the rapidity with which capital – and information – now circulate around the globe lead us to imagine that these mechanisms and structures have all dissolved into “contingencies” and “flows” whose only properties are “risk” and “acceleration.” Social life is not that fleeting, even in the morphogenetic society. For all these reasons, social scientists should not give up trying to identify the causal mechanisms that shape our world. To do so would be an abdication of their proper vocation.