Introduction

Setting the stage for modern conceptions of complexity

In an early paper on “Complexity”, Richard Levins (1970) makes a distinction between “aggregate”, “composed”, and “evolved” systems. This trichotomy is rich in methodological consequences for several philosophical issues. When philosophers believed that the way to explain phenomena or systematize a science was to find laws, these distinctions were easily overlooked. Now that we recognize the central importance of the search for mechanisms and construction of mechanistic explanations, Levins’ distinction has become far more central:

“There are many ways in which complex systems might be classified ... First consider the aggregate system in which the properties of the whole are statistics of the properties of individual parts ... [as] part of a mean, or variance, ... in which the parts are therefore affecting the properties of the whole in the same way, and are not directly acting on each other in the model. This does not mean that there is no physical interacting ... It is simply irrelevant ... for some kinds of problems, though not for others.” (Levins 1970. p. 76)

“After the aggregate system, the next kind would be a composed system, such as an engineer’s circuit. In this system, the way in which different kinds of parts are strung together into the system will determine system properties. The properties of the system therefore are no longer derivable from simple statistics of the components ... this is a composed system because the properties of the parts can be completely specified by study in isolation. They do not affect the mode of response of each other, but only the way that a signal is processed that passes through all of them.” (p. 77)

“A third kind of system no longer permits this kind of analysis. This is a kind of system in which the component subsystems have evolved together and are not even obviously separable; in which it may be conceptually difficult to decide what are the really relevant component subsystems.” (p. 77)

I had been working on functional organization, and concerned with how it mapped onto anatomically and physically defined systems (Wimsatt 1971, 1972, 2002). This occurs in ways that are rich, complex, and variable, and only more confusing if you ask how to pick out relevant parts. That interest and Levins’ concerns helped to drive the analysis I proposed (1971, 1974) for why biological organization could be so difficult to characterize. My analysis of “descriptive” and “interactional” complexity could be seen as an attempt to approach systems of the third kind (“evolved” systems) using tools appropriate to systems of the second kind (composed systems) supplemented by recognizing how multiple partial descriptions of the second kind all apply to the same system (my “perspectives”) and individuate parts in different ways so we are forced to articulate causal interactions between variables in different perspectives. Kauffman’s (1971) analysis of “articulation of parts explanations” (an important early account of how to construct mechanistic explanations) also grew out of interactions with Levins’ views in this period, but took them in a different direction—looking at the use of perspectives in generating models when they were taken as competing alternatives rather than complementary slices.Footnote 1

Levins’ views naturally engender questions that engage philosophers and biologists alike:

1. Is emergence compatible with reductionism? Can a thorough-going philosophical naturalistFootnote 2 be both an emergentist and accept reductionist explanations of system phenomena? How do we have to understand reductionism for this to be so? How should we understand emergence? As widely understood by scientists, emergence is not only compatible with mechanism but is the most common outcome of an inter-level mechanistic explanation of a phenomenon (Wimsatt 1997). This is the domain of what Levins calls “composed” or “engineered” systems. Such mechanistic explanations are also reductionist explanations of the behavior or a property of a system in terms of the interactions of its parts and properties. Such a reductionist need not deny the causal importance of higher-level phenomena, regularities, entities, structures, and mechanisms built upon them. The eliminative reductionism or “nothing-but-ism” that philosophers with Russellian or Quinean intuitions extended to encompass the causal structure of compositional hierarchies is unfounded and without merit.Footnote 3 So also are the imperialistic “desert ontologies” of a Steven Weinberg or Francis Crick when they suggest that (“ultimately” and “in principle”Footnote 4) everything can be done at their level: that there is thus no reason to engage in (and only new sources of error and imprecision to be found through) research pursued at higher levels of organization. This is unfounded for the systems they lay claim to, but everything they say seems appropriate for properties of systems that are aggregates of the parts’ properties. So I would argue that a major confusion in discussions of reductionism arises from a conflation of what Levins distinguishes as “aggregate” and “engineered” systems.

2. Dan Dennett (1995) talks about the importance of “reverse engineering”—the investigative procedures of industrial espionage that he claims are continuous with biological methods of investigation. Reverse engineering takes a (technological) object (usually designed and constructed by engineers of another company or another nationFootnote 5) and figures out how, and how well, it works so you can build another one, adapt it to another manufacturing or use environment, modify it for another application, or improve on it. Is figuring out how a biological organism, sub-system, or super-system (deme, or ecosystem) works any different from this? That is, can a mechanistic reductionist be methodologically confident that intuitions derived from the analysis of “engineered” systems will be adequate to the analysis of “evolved” systems?

3. Can methodologies bias scientific results? That is, can they lead in a systematic way to incorrect predictions and explanations, and to ineffective strategies for manipulating or controlling the system under study? Scientific investigations can go wrong, like any broadly ampliative inductive procedure. But systematic errors, like systematic failures in a machine, indicate a design flaw to be corrected.Footnote 6 Or they may indicate a very finicky operation with a tool under conditions for which errors are not well documented, so that some systematic checking is in order with each use. Or they might indicate a tool optimally designed for one task being used inappropriately for another. How might this bear on contexts where there is a danger of confusing, or of preferring or being forced to use, tools appropriate to one of these system types on a problem arising with another?

4. Reductionism, even in more mellow forms (e.g., non-eliminative forms recognizing the importance and causal potency of upper-level phenomena, regularities and entities), tends to de-emphasize context. Given widespread use of reductionistic methods, can we do anything to detect and control or reduce the frequency or magnitude of reductionistic biases?

These questions have an increasingly Levinsian flavor: the first starts with a degree of removal and abstraction from practice more characteristic of a philosopher, and they move towards increasing practical focus and concern with actual use. Whatever the fruits of philosophical analysis, one must be suspicious of views that pride themselves on their irrelevance to practice. To my mind the best philosophy would at least seek such relevance. My inclinations in this direction were richly encouraged by Dick Levins’ sensitivity to practice—especially striking from one of the most theoretical of biologists, and one of the things I most value in his work!Footnote 7 This cluster of research problems is robustly overdetermined for me. The several powerful tools and fructifying examples Levins has provided have guided me from the first: I have sought to advertise and broaden their usefulness for others (in my 1974, 1980a, 1981b).

Reductionism, mechanism, and emergence

I assume that a reductive explanation of a behavior or a property of a system is one that shows it to be mechanistically explicable in terms of the properties of and interactions among the parts of the systemFootnote 8 (see also Kauffman 1971). The relevant explanations are causal, but needn’t be deductive or involve laws—contrary to conventional wisdom (Wimsatt 1976b; Cartwright 1983). How system properties depend upon the organization and interactions of these parts then becomes a topic of primary interest. Emergence is one broader kind of pattern of relationships between a system property or relationship and the organization and properties of the parts, but here accounts of it often diverge.

Many accounts appeal to multiple realizability as a criterion for emergence, and claim it involves a failure of reduction. Multiple realizability is sometimes connected with emergence, as with most of the kinds of robustness discussed by Wagner (2005), but it is not necessarily connected with non-reducibility. I have argued (1981a, 1994, 2007) that multiple realizability is entailed by the existence of compositional levels of organization, is far broader than often supposed, and in these contexts is neither mysterious nor contrary to reductionism.

One can usefully classify concepts of emergence in terms of the kind of context-sensitivity they suppose for properties.Footnote 9 This account analyzes emergence in terms of the dependence of a system property on the arrangements of the parts—ultimately therefore on the context-sensitivity of relational parts’ properties to intra-systemic conditions. Some accounts of emergence suppose extra-systemic context-sensitivity of system properties.Footnote 10 Neither need be anti-reductionistic. The latter would require finding a larger embedding system including the initially extra-systemic properties engaging the broader context-sensitivities. If an adequate mechanistic account could then be provided, a reductionistic account in terms of the original system would have failed, but one at a higher level (including the original system together with more variables from its context or environment) would have succeeded.Footnote 11

Such level and context or scope switching is a well-hidden secret of reductionistic approaches, and often leads to claims of success that are only partially merited. Level switching (usually going up a level) can also help to remove modeling biases resulting from “perceptual focus” on objects at a preferred level. I explain in 1980b how these biases led to the invisibility of false simplifying assumptions made about the structure of groups in models of group selection. The models started by focusing on genes and individual organisms but in the process made standard simplifying assumptions appropriate for some questions at those levels, but inappropriate for almost any questions about higher levels of organization. Without explicitly or obviously stating it, one assumption was equivalent to assuming that groups did not exist. Not surprisingly, group selection was then found not to have a significant effect! Other assumptions were nearly as deleterious. We must work back and forth between levels to check that features crucial to a phenomenon at an upper level are not simplified out of existence when modeling it at the lower level. On the analysis offered below, we would have shown that these group properties were emergent at the higher level, although yielding to a more sensitive reductive analysis.

Many cases classically considered as involving emergence—cases motivating claims that “the whole is more than the sum of the parts”—are like an electronic oscillator circuit. There’s nothing anti-reductionistic, mysterious, or inexplicable about being an oscillator. You can make one by hooking up an inductance, a capacitor, and a resistor in the right way with a voltage source. The system has the property of being an oscillator although none of its parts in isolation exhibit properties anything like it. It is the way that these disparate parts are strung together that makes them an oscillator (an oscillator must contain a closed circuit with these components in series). A deductive theory relates properties of the parts to the frequency and amplitude of the oscillator. This is a reductionistic account both under the strong conditions of the formal model of reduction, and also under the weaker characterization given here. This is intuitively a case of emergence, though it can’t be if we tie emergence to non-reducibility.
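
To make the remark about a deductive theory concrete, here is a minimal numerical sketch (mine, not the paper’s): the circuit’s system-level properties follow from its parts’ properties via the standard series-RLC formulas. The component values are invented for illustration.

```python
import math

# Hypothetical component values for an (idealized) series RLC oscillator circuit.
L = 1e-3   # inductance in henries
C = 1e-9   # capacitance in farads
R = 10.0   # resistance in ohms

# Resonant frequency -- a system property fixed jointly by the parts' properties
# and their mode of connection: f = 1 / (2 * pi * sqrt(L * C)).
f = 1.0 / (2 * math.pi * math.sqrt(L * C))

# With the resistance in the loop, the oscillation is damped at rate R / (2 * L);
# neither the frequency nor the damping is a property of any part taken in isolation.
damping = R / (2 * L)

print(round(f), "Hz,", damping, "per second")
```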

More generally, emergence of a system property relative to the properties of the parts of that system indicates its dependence on their mode of organization. This relates it directly to Levins’ notion of a compositional system. It presupposes the system’s decomposition into parts and their properties, and its dependence is explicated via a mechanistic explanation. Eight out of ten of the examples from Table 1 below are prima facie examples of emergence, and depend in some way on the mode of organization of the parts. All of the cases are consistent with totally reductionistic accounts of the systemic phenomena in question. Not every counterexample kills an analysis, but too many which are too central do. These surely qualify. We need an analysis of emergence consistent with reductionism.

Table 1 Examples of failures of aggregativity (with indications of decompositions assumed and conditions met; a subset of cases in Wimsatt 2007)

The rest of the paper is organized as follows: I proceed indirectly by focusing on a classical question: When intuitively is a system “more than the sum of its parts”? This question has stimulated a variety of erroneous but revealing views contributing to vulgar reductionisms—“Nothing-but-isms”. I first analyze four conditions for aggregativity—circumstances where “nothing-but-ism” would be justified and system properties are nothing but sums or collections of parts’ properties. Emergence is then the failure of one or more of these conditions. There are many ways they can fail outright, or be met only partially or approximately, and these provide a rich set of tools for classifying modes of dependence of system properties on parts’ properties. The richness of classifications using these tools becomes much more interesting than classification of a dependency relation as “emergent”. After exploring how the conditions are applied in some simpler cases, I pick three more complicated cases that have involved interesting and sometimes controversial scientific issues. Applying these conditions can be more complex than one might have thought, but also quite useful. Each case adds new dimensions to the problem. I then return to the use of the conditions for aggregativity as tools for analyzing modes of organization, and argue that the order imposed by these conditions, even when some fail or are only partially met, makes us more inclined to regard properties as natural kinds. But unintentional biases can still arise from overemphasizing the partial aggregativity of systems. The pervasiveness of such biases is a back-handed tribute to the importance of approximations, idealizations, and limiting case arguments in science. I close by considering reductionistic biases and functional localization fallacies, and why reductionistic claims are usually formulated more strongly (appearing more “aggregative”) early in a new reductionistic research program than later.

Aggregativity

Emergence should involve some kind of organizational interdependence of diverse parts, but these seem to form an open list with no obvious way to classify them or to turn it into a general analysis of emergence. It is easier to discuss failures of emergence, so I proceed in a backwards fashion (Wimsatt 1986b), by figuring out what conditions should be met for the system property not to be emergent (i.e., for it to be a “mere aggregate” of its parts’ properties). This has a straightforward, revealing, and compact analysis that will follow and elaborate the significance of Levins’ line between aggregate and composed systems. I find four conditions for aggregativity. Emergence can then be classified and analyzed more systematically by looking at how these conditions can fail to be met. Examples for different properties and decompositions of the system into parts are given in Table 1.

Four conditions seem separately necessary and jointly sufficient for aggregativityFootnote 12 or non-emergence. Aggregativity and emergence concern the relationship between a property of a system under study, and properties of its parts. For each condition, the system property must remain invariant or stable under modifications of the system in the specified way—a kind of independence of the property over changes in the mode of organization of the parts. This invariance indicates that the system property is not affected by wide variation in relationships between and among the parts and their properties. To be aggregative, the system property would have to depend upon the parts’ properties in a very strongly atomistic manner, under all physically possible decompositions—an almost impossibly strong demand. It is rare indeed that all of these conditions are met (Table 1). Aggregativity is the complete antithesis of functional organization. Our reductionistic science to date has focused disproportionately upon such properties, or others that do somewhat less—meeting some of these conditions, approximately, some of the time—and studying them under conditions in which they are “well-behaved”. “In principle” analyses characteristically stop with the first very small set, but the second set is much larger, and the ways in which we relate to them are both more interesting and much more important for the practice of science. Aggregative or even such “pseudo-aggregative” properties are treated as relatively fundamental (Martinez 1992). In consequence, their import in the description of the natural world has been substantially exaggerated. It is methodologically crucial that we can come to understand how this happens.

These conditions are: (1) a condition on the intersubstitution or rearrangement of parts; (2) a condition on size scaling (primarily, though not exclusively, for quantitative properties) with addition or subtraction of parts; (3) a condition on invariance under the decomposition and reaggregation of parts; and (4) a linearity condition that there be no cooperative or inhibitory interactions among parts in the production or realization of the system property. These conditions are not independent of one another. There seem to be close connections between (1) and (3) and between (2) and (4). The conditions (Wimsatt 1986b) are as follows:

For a system property to be an aggregate with respect to a decomposition of the system into parts and their properties, the following four conditions must be met: Suppose $P(S_i) = F\{[p_1, p_2, \ldots, p_n(s_1)],\ [p_1, p_2, \ldots, p_n(s_2)], \ldots,\ [p_1, p_2, \ldots, p_n(s_m)]\}$ is a composition function for system property $P(S_i)$ in terms of parts’ properties $p_1, p_2, \ldots, p_n$ of parts $s_1, s_2, \ldots, s_m$. The composition function is an equation—an inter-level synthetic identity, with the lower level specification a realization or instantiation of the system property.Footnote 13

1. IS (InterSubstitution). Invariance of the system property under operations rearranging the parts in the system or interchanging any number of parts with a corresponding number of parts from a relevant equivalence class of parts (cf. commutativity of composition function).

2. QS (Size scaling). Qualitative Similarity of the system property (identity, or if a quantitative property, differing only in value) under addition or subtraction of parts (cf. recursive generability of a class of composition functions).

3. RA (Decomposition and ReAggregation). Invariance of the system property under operations involving decomposition and reaggregation of parts (cf. associativity of composition function).

4. CI (Linearity). There are no Cooperative or Inhibitory interactions among the parts of the system which affect this property.

Note that conditions IS and RA are obviously relative to given parts decompositions, as are (less obviously) QS and CI. A system property may meet these conditions for some decompositions, but not for others. Table 1 presents different examples of aggregativity and its failure—species of emergence (many of these examples are discussed in Wimsatt 1986b).
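
As a purely illustrative sketch (my gloss, not part of the original analysis), the conditions can be checked computationally for a candidate composition function. Total mass, the paradigmatically aggregative case 1 of Table 1, passes; the function names and part values below are invented.

```python
import itertools

def total_mass(part_masses):
    # Composition function for a paradigmatically aggregative property:
    # the system's mass is just the sum of the parts' masses.
    return sum(part_masses)

parts = [2.0, 3.5, 1.25, 0.75]

# IS (InterSubstitution): invariance under rearrangement of the parts --
# the composition function is commutative.
assert all(total_mass(p) == total_mass(parts)
           for p in itertools.permutations(parts))

# RA (decomposition and ReAggregation): composing subsystem masses gives the
# same result as composing the parts directly -- associativity.
subsystems = [parts[:2], parts[2:]]
assert total_mass([total_mass(s) for s in subsystems]) == total_mass(parts)

# QS (size scaling): adding or subtracting a part changes the value,
# but the property remains the same kind of property (a mass).
print(total_mass(parts), total_mass(parts + [4.0]))
```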

Figure 1 illustrates the first three conditions for the amplification ratio in an (idealized) multi-stage linear amplifier. The system property, total amplification ratio, is the product of the component amplification ratios (Fig. 1a), a composition function which is commutative (Fig. 1b, condition IS) and associative (Fig. 1d, condition RA), and which shows qualitative similarity when adding or subtracting parts (Fig. 1c, condition QS). (This example seems to violate the 4th condition (linearity), but we act as if it doesn’t: geometric rates of increase are treated linearly here (and in other relevant cases) because it is the exponent which is theoretically significant for the properties we are interested in.) Subjective volume grows linearly with the exponent (as our decibel scale reflects), and both of them grow linearly with addition of components to the chain. A failure of the 4th condition is found in the cooperative interactions of the hemoglobin molecule: the four subunits take up and release oxygen more efficiently by being organized into a tetramer than they would as four independent units (a monomeric hemoglobin can be found for comparison in the lamprey). The staged linear amplifiers are interesting because they show that aggregativity does not literally mean “additivity”—here multiplicative relations do equally wellFootnote 14 (exponential growth is also much more common in biology than linear relations).

Fig. 1 Conditions of Aggregativity illustrated with idealized linear unbounded amplifiers. (a) Total amplification ratio, At, is the product of the amplification ratios of the individual amplifiers: At = A1 × A2 × A3 × A4. (b) Total amplification ratio, At = A4 × A1 × A3 × A2, remains unchanged over intersubstitutions changing the order of the amplifiers (or commutation of the A’s in the composition function). (c) Total amplification ratio, At(n) = At(n−1) × A(n), remains qualitatively similar when adding or subtracting parts. (d) Total amplification ratio is invariant under subsystem aggregation—it is associative: A1 × A2 × A3 × A4 = (A1 × A2) × (A3 × A4). (e) The intersubstitutions of 1a–1d, which all preserve a strict serial organization of the amplifiers, hide the real organization dependence of the Total Amplification Ratio. This can be seen in the rearrangements of 4 components into series-parallel networks. Assume each box in each circuit has a different amplification ratio. Then to preserve the A.R., the boxes can be interchanged only within organizationally defined equivalence classes defined by crosshatch patterns. Interestingly, these classes can often be aggregated as larger components, as in these cases, where whole clusters with similar patterns can be permuted, as long as they are moved as a cluster (see Wimsatt 1986a, b)

The linear amplifier of Fig. 1a–d is case 10 of Table 1. To meet all criteria, we must assume that each sub-amplifier is exactly linear throughout the entire range—from the smallest input to the largest output—required of the entire system. No real-world amplifiers here: this is an idealization, which I will return to below.

Even this simple story has some important limits, however. Amplifiers are themselves integrated functional wholes with differentiated parts—which cannot be permuted with impunity. (That is why we need circuit diagrams to assemble and to understand them—we cannot put them together in just any fashion!) Even the parts are integrated wholes. If you cut randomly through a resistor or capacitor, the pieces do not perform like the original. Testing these conditions against different ways of decomposing the system is revealing of its organization. This suggests broader uses for the conditions and the analysis. I’ll return to this below.

We aren’t done yet: notice that all of the examples so far (Fig. 1a–d) have the amplifiers arranged in series. This is an implicit organizational constraint on the whole system—readily accepted because our common uses of amplifiers connect them in this way. But we could also connect them differently, as in the three series-parallel networks diagrammed in Fig. 1e, and then the invariances in the total amplification ratio disappear in ways discussed in my 1997 and 2007.

Finally, we have assumed that each sub-amplifier is exactly linear throughout the entire range—from the smallest input to the largest output—required of the entire system. They must multiply input signals of different frequencies and amplitudes by the same amount over this entire range. This is an idealization—case 10 of Table 1. Real-world amplifiers are approximately linear through given power and frequency ranges of input signals (see Fig. 2a and case 11 of Table 1). (Frequency correction curves are published so that linearity can be restored by the user by “boosting” different frequencies by different amounts, but these curves are themselves functions of the amplitude of the input signal.) The amplifiers—not perfectly linear to begin with—become increasingly non-linear outside these ranges. They are most commonly limited on the low side by insensitivity to inputs below a certain value, and on the high side by not having enough power to keep the transformation linear. So with real amplifiers the order of the amplifiers does matter, even in the serial circuit.
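
As a hedged illustration of that last point (my own sketch, with invented gain values and an invented clipping limit rather than any real amplifier model), compare an idealized unbounded linear chain with one whose stages saturate:

```python
import itertools

def ideal_amp(gain):
    # Idealized, unbounded linear amplifier: output = gain * input everywhere.
    return lambda x: gain * x

def clipping_amp(gain, max_out=10.0):
    # More realistic stage: linear only up to a maximum output level.
    return lambda x: max(-max_out, min(max_out, gain * x))

def chain(stages, signal):
    # Serial composition: the signal passes through each stage in order.
    for stage in stages:
        signal = stage(signal)
    return signal

gains, signal = [2.0, 5.0, 0.5, 3.0], 1.0

# Ideal case: total amplification is the product of the gains, so the output is
# invariant under any reordering of the stages (condition IS holds).
print({chain([ideal_amp(g) for g in order], signal)
       for order in itertools.permutations(gains)})   # one value: 15.0

# Clipping case: the order of the stages now matters, so condition IS fails
# even for a strictly serial circuit.
print({chain([clipping_amp(g) for g in order], signal)
       for order in itertools.permutations(gains)})   # several different values
```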

Indeed, the appearance of common and unqualified aggregativity is a chimera, and is usually a product of uninspected assumed constancies, idealizations, and overlooked possible dimensions of variation. The amplifier case illustrates this beautifully. We treated amplifiers and their parts as unbreakable modules because they are commonly treated as such in our theory and practice, and we considered only serially organized amplifiers because these modes are common for functional reasons. In the analysis of complex system properties, such kinds of errors are so easy to commit that they are almost the rule rather than the exception, contributing to design failures in engineering, modeling errors and errors of experimental design in science, and conceptual errors in philosophy.

But some properties at least seem like paradigmatic aggregative properties. The great conservation laws of physics—those of mass (case 1), energy (now replaced by the hybrid mass-energy), momentum, and net charge (if we include its sign)—in effect indicate that these properties actually do fill the bill. They appear aggregative under any and all decompositions. Indeed, that’s why there are conservation laws for them! Curiously, some properties you might have expected to measure up don’t. Thus, volume (Table 1, case 2) isn’t aggregative for solvent-solute interactions in chemistry. If you dissolve salt in water, the volume of the water + salt will be even less than that of the water alone. (So sometimes the whole is less than the sum of its parts!)

A system property may be aggregative for some decompositions but not for others. More generally, any of the conditions may be met for some decompositions, but not for others. This is probably the most common situation.

This fact has critical importance in theory construction. These variations allow for and suggest feedback between these criteria and the choice of decompositions of a system for further analysis. We tend to look for invariances, and these conditions are treated as desiderata, so in experimenting with alternative descriptions of and manipulations on the system we try to find ways to make them work—decomposing, cutting, pasting, and adjusting until they are satisfied to the greatest degree possible. And we will tend to regard decompositions meeting the aggregativity conditions, even approximately, as “natural”, because they provide simpler and less context-dependent regularities, theory, and mathematical models involving these aspects of their behavior. This is illustrated in the next section in considering chromosomes vs. genes as units of analysis.

There are other complementary grounds for regarding relationships, decompositions and the parts they produce as “natural”.Footnote 15 These include “robustness” (Levins 1966; Wimsatt 1980a, 1981a, and new analyses by Odenbaugh 2001; Weisberg 2003, 2006; Plutinski 2006; and Weisberg and Riesman 2007), and “generative entrenchment” (Wimsatt 1986a, b, 2001; Griffiths 1996). Other heuristics are also used with these conditions to construct and validate decompositions—see the (growing) list of reductionistic problem solving strategies in my 1980b, 1985, 2006 or 2007 (appendix). But heuristics have systematic biases, which may give misleading results. One of the most systematic biases is to generate behavior that is approximately aggregative (on one to all of the conditions) under very special conditions or strong constraints on the system and its environment, and then forget these qualifications in subsequent discussions. Quite different kinds of examples of this are discussed in the next two sections.

Adaptation to fine and coarse grained environments—derivational paradoxes for a formal account of aggregativity

The formality of the four conditions for aggregativity might suggest a direct analysis of properties of theoretical equations to determine whether a system property is aggregative or not. In the next two sections, I demonstrate that we can’t do so without considering at least (1) the choice of a parts-decomposition, (2) idealizations and assumptions made in the description of a system, and (3) further idealizations and approximations made in the derivation of equations relating system-level and parts-level properties.Footnote 16 I discuss the third here because the case is less complicated and requires less exposition. The first two, though apparently conceptually simpler, are discussed in the next section. The case I will discuss to illustrate them cuts to the core of the conceptualization of multi-locus population genetics, and thus bears centrally on the hotly debated units of selection controversy.

In evaluating the aggregativity of properties, it won’t suffice just to look at the equations posited for their composition. This is nicely illustrated in the paradoxical relation between Levins’ fine- and coarse-grained adaptive functions (Levins 1968: 17–18; Wimsatt 1980a, 1986b). These mathematical functions model the fitness of an organism in a sequence of environments in terms of its fitnesses in the individual environments. Both appear to be aggregative. But we can’t stop there, for the relationships they are taken to describe cannot be aggregative: they both describe fitness in sequences of environments (apparently possibly the same environments), but they aggregate in different ways and are not equivalent.

Levins’ fine-grained adaptive function is given by a sum of products:

$$W_{f}=\sum_{i} p_{i}W_{i}$$
(1)

In this equation, $W_f$ is the net fitness of the organism in a mixture of temporal sub-environments $E_1, \ldots, E_n$, in relative proportions $p_1, \ldots, p_n$, in which it has fitnesses $W_1, \ldots, W_n$. Levins supposes that organisms “experience” the composite environment as an “average” of component environments—thus their linear contributions to its fitness. The specified component fitnesses for sub-environments are those which would be realized by that organism in a pure environment of the corresponding type for the entire interval. The form of the equation is like that given in decision theory for “expected utility”, with the $W_i$’s as utilities, and the $p_i$’s as probabilities.Footnote 17 Note that the fitness specified by the equation depends upon the relative frequencies or proportions of the sub-environments, but not on their order.

His coarse-grained adaptive function, using the same variables, is given by:

$$W_{c}=\prod_{i} W_{i}^{p_{i}}$$
(2)

This fitness function is also independent of the order of the sub-environments. It suggests the multiplicative law for combination of probabilities, and conjures up an image of the organism jumping through a series of hoops, with its chance of getting through each being independent of whether it has passed through any of the others.Footnote 18

Each of these two “adaptive functions” has a mathematical form meeting all conditions for aggregativity (the first is additive, the second multiplicative), so fitness in both cases seems to be an aggregative property of the component fitnesses and the frequencies of their environments. But this would lead to a direct contradiction. Both functions start from the same general expression for fitness, and transform that expression in different ways making different mathematical assumptions. They are not equivalent, and produce different answers except under very special limiting cases (Strobeck 1975). This is common enough for approximations, but that doesn’t remove the paradox. If fitness were really aggregative as each equation—taken separately—suggests, then one should be able to transform situations meeting either equation into situations meeting the other by simply reordering the subenvironments! So one or the other or both of the equations must be false, and—despite what the equations say—the fitness of an organism cannot be an aggregative function of its fitnesses in the sub-environments. (Cases 1 and 2 of Fig. 2 depict “fine” and “coarse” grained environments for organisms with temporal integration ranges intermediate between 1 and 12, and not too close to either.)Footnote 19
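
A minimal numerical sketch (the fitness values are invented) makes the divergence plain; the comparison is just the arithmetic–geometric mean inequality, so the fine-grained value is never smaller than the coarse-grained one, with equality only when all component fitnesses are equal.

```python
import math

def fine_grained(p, w):
    # Levins' fine-grained adaptive function: W_f = sum_i p_i * W_i
    return sum(pi * wi for pi, wi in zip(p, w))

def coarse_grained(p, w):
    # Levins' coarse-grained adaptive function: W_c = prod_i W_i ** p_i
    return math.prod(wi ** pi for pi, wi in zip(p, w))

p = [0.5, 0.5]   # equal proportions of two sub-environments
w = [1.5, 0.5]   # hypothetical fitnesses in those sub-environments

print(fine_grained(p, w))     # 1.0    (an arithmetic mean)
print(coarse_grained(p, w))   # ~0.866 (a geometric mean), strictly smaller

# The two agree only in the limiting case where all component fitnesses are equal:
print(fine_grained(p, [1.0, 1.0]), coarse_grained(p, [1.0, 1.0]))
```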

The “coarse grained” adaptive function is derived by twice making an approximation which is literally false:Footnote 20 (1) The fitness of an organism in a sub-environment is assumed to be a function only of that (sub)-environment—there are no “historicity” effects. (2) Equivalently, it is later assumed that the fitness function does not change over the discrete period of integration used, even if this period (Δt in equation 2.2 of Levins 1968, p. 18) is allowed to be quite long. But if true, this would allow unlimited applications of conditions IS and RA (inter-substitution and reaggregation) to reorder the sub-environments, changing the “small” and “well mixed” sub-environments of the “fine-grained” adaptive function into the “large” and “well separated” sub-environments of the “coarse-grained” adaptive function. Suppose that all of the micro-environments of one type were lumped together, and followed by all of the micro-environments of the other type (transforming case 1 of Fig. 2 into case 2). But the “adaptive functions” for these two cases are different mathematical functions, and yield different answers when given the same fitnesses and frequencies for their components. So they can’t both be right at the same time. What has to give is unrestricted application of IS and RA. But with this goes the claim that either of the adaptive functions is truly aggregative.

Fig. 2 Patterns of Environmental Grain. Case 1: p(E1) = p(E2) = 0.5; regular repeat of sub-environments with unit length; fine grained (for both environments) for an organism with tolerance, integration range, or threshold ≥1. Case 2: same, but with variation on a different scale gotten by re-ordering sub-environments; coarse grained for an organism with thresholds (for both environments) of <12. Case 3: p(E1) = p(E2) = 0.5; sample trial random variable, with 13 E1, 11 E2. Fine grained for an organism with thresholds ≥3 for white, ≥2 for “gray”. (In calling this “gray” we are treating a regular 2-D repeat of black and white pixels within the squares as a (perceptually) fine grained property! To a “Laplacean demon” which calculates exactly and does no averaging, there is no gray—only arrays of black and white pixels showing a variety of “homogeneous” regularities on different size scales.) Also note (compare case 1) that random noise coarsens the grain

Consider three idealized environments which are checkerboards, all with equal proportions of black and white squares, in the air above which the temperature is 50°C and 0°C, respectively (this mimics solar heating effects). They differ only in the size of the squares, which are 10 mm, 10 m, and 10 km on a side. We will compare the fates of two organisms in them—a water buffalo and a Drosophila (or fruit fly), universal test instrument of classical genetics [the grains of their environments are compared for two different niche dimensions across a range of size scales in Table 2]. Assume that 0°C is too cold, and 50°C too hot, for either organism to stand for long periods of time, and that 25°C is about optimum for each. At roughly 3 mm and 3 m in length, the fly and water buffalo have a length ratio of about 1:1000. Thus they bear about the same relationships to neighboring checkerboard scalesFootnote 21.... Each is about a third of the length of the smallest and middle scales, respectively, and about 1/3000 of the length of the middle and largest scales. How would they do when moving through these different-scale checkerboards? Because their nervous systems are tuned to detecting things (including temperature differences) on their own size scales, each would detect those variations. Detection on that scale is relevant to locomotory decisions over that or much larger distances—avoiding local hot or cold spots or going up or down larger temperature gradients. But they also “buffer” physiologically on the scale to which they are perceptually sensitive: they have enough thermal mass (and low enough thermal conductivities) to be unaffected by air temperature variations for that combination of size and temperature ranges. (They wouldn’t notice variations in patches the next scale down—10 μm and 10 mm, respectively—but would perceive only their comfortable “average” 25°C.) But they would be in real trouble the next scale up (squares 10 m and 10 km on a side), dying or being sorely stressed before they could get to the other side. The smallest scale for each organism (10 μm and 10 mm) is both perceptually and physiologically fine grained (for thermoregulation). Their respective middle scales are perceptually coarse-grained, but physiologically fine grained for each. And their respective largest scales are coarse-grained for each organism in both respects. But, save for extremely large and small grains, the Drosophila and the water buffalo will experience any given environment differently because of their different scales relative to the grain of that environment. Figure 2 actually illustrates both perceptual and physiological graining. The scale of the “gray” denoting type E1 sub-environments was chosen to be fine enough to be conventionally treated as an average gray, but coarse enough still to be discriminable as an array of black and white dots (at least to some of us, or when I’m wearing my glasses!).

Table 2 Environmental grain for different niche dimensions, organisms, and size scales

Levins’ two adaptive functions are designed for different kinds of limiting cases, which make different approximations appropriate. Ignoring this can lead to explicit contradiction. He suggests (1968: 18–19) that real cases will fall somewhere on a continuum between them. The simplest mathematical assumption of unlimited re-orderability (through application of IS and RA) is too strong—stronger than Levins actually needs for deriving his “fine-grained” adaptive function. (The derivation only requires the weaker condition that the sub-environments can be arbitrarily re-ordered as long as that leaves a representative sample of the environments of the whole in any sequence of a length range important to determining fitness.) This length range is the tolerance, integration range, or threshold of Fig. 2. This constraint on the representativeness of any appropriate sub-sequence of the sequence allows some reordering of the sub-environments, but prevents unlimited reorderability, and is a function of the relevant (sensory and thermoregulatory) physiology of the organism. It prevents transforming a fine-grained environment into a coarse-grained one, but it also shows that the “fine grained” adaptive function is not a true aggregative property of the fitnesses of the sub-environments. Indeed, stochastic fluctuations from the average (illustrated in case 3 of Fig. 2) require higher thresholds or tolerances than the regular periodicity of otherwise comparable case 1, and the variance in length of “white” and “gray” subenvironments would be reflected in the larger biological thresholds we would find in nature.
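
To see in miniature how a finite integration range blocks unlimited reordering, here is a deliberately toy model of my own (not Levins’ derivation): within a window no longer than the organism’s integration range, sub-environmental fitnesses average; across windows they combine multiplicatively. All names and numbers are invented.

```python
import math

def graded_fitness(env_seq, fitness, window):
    # Toy model: average fitness within each window of length <= the integration
    # range (fine-grained behavior), then take the geometric mean across windows
    # (coarse-grained, hurdle-like behavior).
    means = [sum(fitness[e] for e in env_seq[i:i + window]) / len(env_seq[i:i + window])
             for i in range(0, len(env_seq), window)]
    return math.prod(means) ** (1.0 / len(means))

fitness = {"E1": 1.5, "E2": 0.5}
fine   = ["E1", "E2"] * 6          # regular alternation (like case 1 of Fig. 2)
coarse = ["E1"] * 6 + ["E2"] * 6   # same frequencies, lumped (like case 2)

for window in (2, 12):
    print(window, graded_fitness(fine, fitness, window),
                  graded_fitness(coarse, fitness, window))
# With a short integration range (2), reordering changes the result; with a long
# one (12), any ordering remains a representative sample and the results agree.
```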

The “sub-sequences of the environment important in determining fitness” may differ in length and composition for different adaptive problems (e.g., thermoregulation, mating, or predation) which determine an “environmental scale”. This is a temporal or spatial size scale for determining relevant changes in the environment as a function of properties of the organism—it could thus equally be thought of as an organismal scale, or most accurately as a scale relating organismal and environmental properties. As we have already seen, the relevant scale differs for different organisms as a function of their properties and capabilities, and also for different functional subsystems of the organism, and for the environmental variables in question, and for how far these variables deviate from their ideal values for those organisms. (The tolerable size for checkerboard squares for each organism would have been smaller—likely much smaller—if the temperatures had been −50 and 100°C, rather than 0 and 50°C—even with the same mean temperature.) Threshold or transition regions occur around this scale: environmental variations much smaller than it are averaged (added), and those much larger than it are treated as independent obstacles (in effect a multiplicative sieve) which must all be gotten through. These different scales or limiting cases provide motivations for the two distinct adaptive functions.

Thresholds are common, and inconsistent with the unrestricted application of either or both of conditions QS and CI. How far do temperatures fluctuate from my current (preferred) body temperature, how rapidly do they change it (a measure of my thermal mass, surface area, and surface conductivity), and how big is their spatial extent relative to my rate of travel through them? Or how great is the distance between prey captures, and how much net energy do I get per capture relative to how far I can go between captures? Is it large and concentrated enough to be worth claiming or defending?Footnote 22 For many adaptive problems there are couplings between size and time scales in terms of the rate and frequency of various energy flows, and inequalities which must be satisfied for the organism to survive. These different thresholds are rarely totally independent of one another. Larger mammals usually better survive the extended cold temperatures of the arctic than smaller ones, and for longer, but this depends on how well fed they are, and whether they can gain shelter or hibernate.

So we learn from this example that one cannot simply look at the form of an expression relating system and parts’ properties to tell whether a system property is aggregative. We must look also at the assumptions made in deriving the “composition function” for the system property, and make sure that all of them are empirically adequate for the case in question. This fact places important limitations on a formalistic account of aggregativity, for it isn’t enough to look at the form of equations in the finished empirically adequate theories. You also have to know how you got there, because the approximations you made along the way cannot be forgotten in evaluating aggregativity.

Perspectival, contextual and representational complexities, or “It ain’t quite so simple as that!” (An extended example from the genetics of multi-locus systems)

Chromosomal versus gametic linkage and other segregation analogues

I now want to demonstrate how aggregativity and assessments of aggregativity also depend upon (1) the choice of a parts-decomposition, and (2) idealizations and assumptions made in the description of a system. Examples #5a and #5b from Table 1 reveal how the apparent aggregativity of a system property depends upon the decomposition used. These will be discussed at length.

Consider a multi-locus genetic system with the genes organized into chromosomes (example #5b). A gamete is a haploid (or “half”) genotype gotten by taking one or the other of each of the homologous chromosome pairs of its parental genotype. Sperm and egg gametes contributed by males and females fuse at fertilization to form zygotes carrying whole genotypes. The expected frequency of a randomly drawn gamete is assumed to be the product of the frequencies in the population of the different chromosome types that compose it. With random assortment at the level of whole chromosomes, meiotic processes reliably (usually, but not universally) produce gametes with exactly one of each chromosome-type. So it looks as if gamete frequencies (with normal meiosis) could be an aggregative property of chromosome frequencies. Assume for now that they are. (They aren’t always, as I show below.)

We could also calculate gene frequencies and describe the genotype as a series of genes at each locus in each chromosome of the haploid genotype (example #5a). But we cannot assume that the frequency of a randomly drawn gamete is the product of the gene frequencies of all of its genes in all chromosomes making up a haploid genotype. Unless genes are randomly partitioned to start with (they rarely are), genetic linkage between genes in chromosomes entails that they won’t immediately distribute in this way. Without selection or other biasing forces, crossing over and recombination will gradually scramble genetic combinations among mating members of the population over successive generations, so that gametic frequencies exponentially asymptote to these multiplicative “linkage equilibrium” values, at rates proportional to their “linkage distance”—a function of their relative locations along the chromosome.Footnote 23
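
The standard two-locus recursion makes this decay explicit. The sketch below is a conventional textbook formulation rather than anything from the original text; the gamete labels and starting frequencies are invented, and D is the usual measure of linkage disequilibrium (the deviation of gamete frequencies from the products of allele frequencies).

```python
def next_gamete_freqs(freqs, r):
    # One generation of random mating with no selection: the deviation D of
    # gamete frequencies from the products of allele frequencies decays by (1 - r).
    D = freqs["AB"] * freqs["ab"] - freqs["Ab"] * freqs["aB"]
    return {"AB": freqs["AB"] - r * D,
            "Ab": freqs["Ab"] + r * D,
            "aB": freqs["aB"] + r * D,
            "ab": freqs["ab"] - r * D}

# Start far from linkage equilibrium: only the "coupling" gametes are present.
freqs = {"AB": 0.5, "Ab": 0.0, "aB": 0.0, "ab": 0.5}
r = 0.1   # tightly linked loci; r = 0.5 corresponds to free recombination/assortment

for generation in range(5):
    D = freqs["AB"] * freqs["ab"] - freqs["Ab"] * freqs["aB"]
    print(generation, round(D, 4))    # 0.25, 0.225, 0.2025, ... shrinking by (1 - r)
    freqs = next_gamete_freqs(freqs, r)
```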

So what? When genes come in chromosomes, as they do in the real world, one must recognize this when calculating gametic frequencies.Footnote 24 (This dependence both makes linkage mapping possible, and necessary—see any genetics text or Wimsatt 1992.) These dependencies lead to violations of condition IS which force recognition of multiple levels of organization: the gene or allele, and the structure of the chromosome, gamete, and haploid–diploid life cycle. In nature, even higher level conditions on population structure are commonly relevant (Wade 1996). Many applications of population genetic single-locus models implicitly assume an aggregativity that simply is not there. Two populations having identical arrays of gene frequencies but different arrays of chromosome frequencies will produce different gamete frequencies (and thus different genotype frequencies) in ways determined most immediately by their chromosome frequencies. So decompositions of the genotype into whole chromosomes are actually more aggregative than decompositions into genes. Chromosomes and their linkage structure are recognized in the theory as real natural objects via the structure of the relevant equations. These express gametic frequencies in a given generation as functions of recombination frequencies between genetic loci (yielding their relative locations in the linkage map) and whole gametic frequencies in the last generation. (Ernst Mayr’s charge that population genetics was “beanbag genetics”—viewing organisms as a “bag of genes”—is false. On the more accurate multi-locus theory, organisms (or genomes) are more like a can of worms than a bag of genes.)
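
A small worked example (mine, with invented frequencies) shows the point about identical gene frequencies and different chromosome frequencies:

```python
def allele_freqs(gametes):
    # Allele frequencies are simple sums over gamete (chromosome) frequencies.
    return {"A": gametes["AB"] + gametes["Ab"],
            "B": gametes["AB"] + gametes["aB"]}

def freq_AABB(gametes):
    # Under random union of gametes, the AABB genotype frequency is the square of
    # the A--B gamete frequency -- a chromosome-level, not a gene-level, quantity.
    return gametes["AB"] ** 2

pop1 = {"AB": 0.5, "Ab": 0.0, "aB": 0.0, "ab": 0.5}      # coupling gametes only
pop2 = {"AB": 0.25, "Ab": 0.25, "aB": 0.25, "ab": 0.25}  # linkage equilibrium

print(allele_freqs(pop1), allele_freqs(pop2))   # identical: A and B both at 0.5
print(freq_AABB(pop1), freq_AABB(pop2))         # different: 0.25 vs 0.0625
```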

But this is still too simple—there is more structure here. For some purposes, the gametes are kinds of meta-chromosomes: one inherits gametic complements of chromosomes, not chromosomes drawn independently at random from their homologues in the population at large. These are assembled into diploid genotypes, not arbitrarily large polyploid assemblages of chromosomes. And these diploid zygotes come from mating pairs. Gametes and diploid genotypes are each assemblages structured via a systematic combination of constraints and random elements, and the results reflect both. If we start with non-random statistical associations between chromosomes in individuals (so gamete frequencies are not simply products of the frequencies of component chromosomes), then independent assortment of chromosomes under random mating will go only half-way towards equilibrium in each successive generation. This failure to reach the maximally mixed state in the next generation (the Hardy–Weinberg multi-locus equilibrium) reflects structural relationships among chromosomes, gametes, and genotypes—a more complex failure of condition IS.

This structure is best viewed in the Punnett square diagram of Fig. 3 for two alternative alleles (A, a and B, b) at each of two loci (A and B). These come naturally packaged into the alternative gametic combinations A- -B, A- -b, a- -B, and a- -b, contributed by the male (alternatives across the top, invariant down the columns), and female (alternatives down the side, invariant across the rows). Resulting zygotic combinations are found in the squares at the intersection of their generating gametes.

Fig. 3 The Composition of 2-Allele, 2-Locus genotypes from gametes, and the production of new gametic types via independent assortment or recombination

If close together on the same chromosome, genes will show a continuing slowly decaying statistical association, given by the “linkage distance” r, ranging from 0 (very close) to 0.5 (very far) as a function of how frequently recombination events separate them—a property of their relative locations on the chromosome. Surprisingly, if they are either very far apart on a long chromosome, or are on entirely different chromosomes, they will show an identical “linkage” effect of 0.5. The identity of this effect (when r is 0.5) with the genes far apart on the same chromosome or on different chromosomes shows that this association is not—or not just—a product of chromosomal organization. It is a property of the genotypic-gametic life cycle, as can also be seen in Fig. 3.Footnote 25

Consider recombinations first. For this, assume that each gamete is a single chromosome, with the A and B loci separated by a dotted region of the chromosome in which crossing over and recombination occurs. Assume that recombinations are equal (chromosomes line up so that no loci are gained or lost in any reassortment) and that they occur with equal frequency, r, for each of the 16 possible pairings of gametes. In each square along the reverse diagonal we see the chromosomes before recombination above, and the results after recombination below. Recombinations happen in all squares, but only in the squares of the reverse diagonal do the recombination products differ from the recombination inputs.Footnote 26 In all others, either both chromosomes are identical (as on the forward diagonal), or they differ only at one locus (as for the rest). In these cases, recombination will not produce new combinations. So new recombination products—when post- and pre-recombination chromosomes differ—can occur in only 4 out of 16 squares.Footnote 27

But this story can be replayed, and with the same diagram, for independent assortment! Now A and B are on different chromosomes and the dotted line—indicating direct physical connection when they were on the same chromosome—now indicates that these chromosomes come into the union packaged in gametes. The “crossing over” now indicates free or random interchange due to independent assortment of different chromosomes and the genes they contain in the production of new gametic combinations. Independent assortment yields an equal probability that two chromosomes which came in together in the same gamete will stay together in outgoing gametes—reflected in an r of 0.5. This is the same situation again! Independent assortment happens in all squares but only in the squares along the reverse diagonal do the products of independent assortment differ from the inputs to independent assortment. New products of independent assortment may occur in only 4 out of the 16 squares. The similar results in these two cases, as compared with the single locus case, reflect the structural constraints of the diploid–haploid life-cycle.

Compare the single-locus case, again using the same diagram. Consider the B-locus, with B and b as alternative alleles in the four contiguous squares of the upper left quadrant. (This pattern for the B-locus is repeated in all four quadrants, so picking one gives no loss in generality.) These four squares are a 2 × 2 Punnett square for a single factor cross among heterozygotes at the B-locus—the one-locus analogue of the 4 × 4 two-locus Punnett square of the whole figure. Heterozygotes are formed in 2 out of the 4 squares (again along its reverse diagonal).Footnote 28 This different proportion of squares in which new arrangements of elements can occur—2/4 vs. 4/16—has consequences. It means that total mixing (and Hardy–Weinberg equilibrium) can be achieved instantaneously in one generation for the single locus case, rather than asymptotically over many generations as for two or more loci. The asymptotic rather than 1-generation approach to equilibrium reflects the presence of a structural condition—a higher-level “segregation analogue”—retarding the rate of loss of variance among larger genetic structures. Since this is a structural relation among parts with consequences in the equations for gamete production, it should not be surprising that it produces a failure of aggregativity in which the arrangement of the parts matters. (This is but one of several segregation analogues, reflecting different aspects of population structure (Wimsatt 1981b), and providing in effect a kind of “external genetics”, or exogenetics.Footnote 29)
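
The counting argument can be checked by brute enumeration. The sketch below is my own illustration (gametes represented as ordered pairs of alleles), not machinery from the text:

```python
import itertools

gametes = [("A", "B"), ("A", "b"), ("a", "B"), ("a", "b")]

def reassorted(maternal, paternal):
    # Recombination (or independent assortment) exchanges the alleles at the
    # second locus between the two incoming gametes.
    return (maternal[0], paternal[1]), (paternal[0], maternal[1])

# Two-locus case: in how many of the 16 pairings does reassortment yield a pair
# of gametes different from the pair that came in?
new_two_locus = sum(set(reassorted(m, p)) != {m, p}
                    for m, p in itertools.product(gametes, repeat=2))
print(new_two_locus, "of 16")   # 4 of 16 -- the reverse diagonal of Fig. 3

# One-locus analogue: in how many of the 4 pairings of B-locus alleles is the
# resulting zygote a heterozygote (the only source of new arrangements)?
new_one_locus = sum(b1 != b2 for b1, b2 in itertools.product(["B", "b"], repeat=2))
print(new_one_locus, "of 4")    # 2 of 4
```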

To illuminate the structural relationship producing “gametic linkage” in another way, consider what we would need to negate its effect—to make the 2-locus case come to equilibrium in one generation as the single locus case does. Suppose a population starts with equal numbers of AABB and aabb homozygous genotypes. Only gametes A- -B and a- -b would be produced, so the middle two rows and columns would be empty. Only the four corner squares would count, producing a “reduced” 2 × 2 table. Equilibrium in this population in one generation requires equal numbers of the four gametic types in the next generation. One of two counterfactual modifications of how chromosomes assort or recombine would do it: (1) Recombination or assortment could be made obligate rather than random, so that whenever reassortment could happen, it did. That is, if r = 1, then the two occupied squares in the reverse diagonal produce enough A- -b and a- -B gametes to balance the A- -B and a- -b gametes produced in the upper left and lower right squares. (2) Alternatively, we could leave recombination and independent assortment alone, and have obligate dissimilar matings rather than random matings. Then all of the matings are in the upper right and lower left squares, and an r of 0.5 will produce equal numbers of the four gametic types in the next generation. Of course neither condition holds! The thought experiments indicate structural aspects of the cycle producing gametes and genotypes that violate aggregativity assumptions. The second is a special case of assortative mating, indicating super-individual population structure. Assortative mating is common in nature. It has important evolutionary consequences, indicating yet another failure of aggregativity via failing condition IS! That it is commonly ignored reflects more about our entrenched idealizations (and common assumptions of Hardy–Weinberg equilibrium) than anything else.
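
Both counterfactuals can be checked with a small simulation of my own (the gamete unions and frequencies follow the thought experiment; the bookkeeping is otherwise invented):

```python
from collections import Counter

def gametes_from(g1, g2, r):
    # Gametes produced by the genotype formed from gametes g1 and g2, where r is
    # the probability that second-locus alleles are exchanged (recombination at
    # r < 0.5, independent assortment at r = 0.5).
    out = Counter()
    for g in (g1, g2):                              # parental-type gametes
        out[g] += 0.5 * (1 - r)
    for g in ((g1[0], g2[1]), (g2[0], g1[1])):      # reassorted gametes
        out[g] += 0.5 * r
    return out

def next_gamete_pool(unions, r):
    # unions: list of (gamete, gamete, frequency of that zygote-forming union).
    pool = Counter()
    for g1, g2, freq in unions:
        for g, p in gametes_from(g1, g2, r).items():
            pool[g] += freq * p
    return dict(pool)

AB, ab = ("A", "B"), ("a", "b")

# Counterfactual (1): random union of AB and ab gametes, but obligate reassortment.
print(next_gamete_pool([(AB, AB, 0.25), (AB, ab, 0.5), (ab, ab, 0.25)], r=1.0))

# Counterfactual (2): obligate dissimilar unions (AB only with ab), ordinary r = 0.5.
print(next_gamete_pool([(AB, ab, 1.0)], r=0.5))

# Both print the four gametic types at equal frequencies (0.25) after one generation.
```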

So to summarize: the level of structural decomposition of the genome (with genic vs. chromosomal partitions) affects the apparent aggregativity of the properties in question—the frequency of gametes and genotypes produced. Chromosomal decompositions are more aggregative than genic ones because they reflect intra-chromosomal linkage, but they still ignore the factor I have called gametic linkage. All these partitions—gene, locus, chromosome, gamete, and genotype—are needed in different combinations for different problems, as are super-organismal assemblages in other cases and conditions: mating pairs, families, groups, and demes. (I list only genetic assemblages. As Brandon (1982) argues, these won’t suffice for all questions of evolutionary dynamics—we need other properties of phenotypic units.) For sufficiently constrained problems and conditions, the smaller simpler partitions may appear to be aggregative, but this is usually misleading. And we aren’t yet done with this case: there are other idealizations hiding in the wings.

Gametic composition is still not aggregative at the level of whole chromosomes or gametes: there are also problems with conditions QS and RA. These problems are partially hidden by assumptions of standard models of the recombination process. These assumptions are occasionally violated (showing non-aggregativity of the relevant properties), but, even more interestingly, the fact that they are so regularly met is itself a special product of design features of the meiotic process. Thus meiosis operates so as to increase the apparent aggregativity of processes for producing gametes, thereby increasing both the average fitness and the heritability of traits and fitness in offspring. This apparent but in fact highly conditional aggregativity thus arises because of a "special hookup" of processes and parts—a special and quite complicated adaptation of the hereditary machinery.Footnote 30 This designed aggregativity is not aggregativity as we have discussed here at all but results from designed organization that is complex in proportion to its importance: it is designed to help to produce the kind of "quasi-independence" and apparent "modularity" of traits in a highly complex and interactive system, because this "quasi-independence", as Lewontin first argued (Lewontin 1978; Wimsatt 1981b; Wagner 2005), is crucial to evolution.

Genotype as phenotype, assumed conservation, and further complexities

Models of recombination and linkage commonly used in population genetics suppose that the number and arrangement of loci in chromosomes are invariant.Footnote 31 This (false) assumption rules out variation in genome size in ways that would test QS, and allows no ReAggregation except through recombination of homologous segments that preserve their orientations. Rarer but evolutionarily important translocations, deletions, duplications, and inversions violate this idealization. These standard models thus exaggerate the aggregativity of the actual physical processes. The larger changes produced by inversions and other more arcane reconstructions of the genome often cause major further changes both in fitness and in the types of gametes produced, in ways characteristic of speciation events.

Existing theories also can’t show whether gamete frequencies are aggregates of gene or chromosome frequencies for another reason—they are not conservative.Footnote 32 Current theories give equations for the characteristic products of these processes, not for all of the products, or perhaps more accurately, not for their products under all circumstances. This is because not all products would be classified as gametes. The lack of conservation in these theories is hidden by dealing with gamete frequencies, rather than numbers, so there is never a full balancing of the equations for producing gametes from parental genotypes as in, say, chemical reactions. It seems a plausible requirement for (real) aggregativity that the equations be conservative, as it surely is for all of the conserved quantities of physics. This case is discussed more fully in Chapter 12 of Wimsatt (2007).

Indeed one must impose constancy of size, composition, and arrangement as side conditions on the architecture of the genome in population genetic models of genome formation. This also reveals that we are not dealing with aggregative properties. This is like the case in the initial exposition of aggregativity, when properties of linear amplifiers seemed aggregative as long as we limited ourselves to strictly serial arrangements. (These conditions are so taken for granted that they are rarely stated—or studied. Interestingly, their violations, though usually pathological and idealized out of existence, are among the very kinds of changes that evolutionary developmental biology studies as leading to occasional adaptive macro-mutations.) With these conditions and theories, things may appear aggregative which are not: things which are highly sensitive to the arrangement of the mechanical parts necessary to produce—and which have the function of producing—this apparently aggregative behavior. (A more familiar analogy might help: that computers can do sums accurately doesn't make them "mere aggregates" either—even if we limit the case to special-purpose machines which can only do sums.)

This fictional or quasi-aggregativity is particularly pronounced for quantitative genetic multi-locus models of additive traits, where it is supposed that each of a number of genes contributes additively to that trait. The expression of any of the genes and the additivity of their contributions usually depend on phenotypic conditions produced by many other genes that are a normally presupposed part of the genetic background. (To worry about the intensity or additivity of eye pigment, you need eyes!) Additive fitness contributions of genes have played a central if contested and ambiguous role in discussions of higher-level units of selection (Wimsatt 1980b, 1981b; Sober 1981, 1985; Brandon 1982; Griesemer and Wade 1988; Lloyd 1988, 1989; Sarkar 1994; Wade 1996).Footnote 33 Following Lewontin (1978) on "quasi-independence", I have argued (1981b, 1986b) that such additivity of fitness components as exists is local and context-dependent (though it can appear to be context-independent for small and limited changes). This local additivity does not show that fitness is an aggregative property of genes.
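
As a purely illustrative toy model (my own, with hypothetical names and made-up numbers), consider two pigment loci whose contributions to eye-pigment intensity look perfectly additive against the usual genetic background, but vanish entirely when a background gene needed to build eyes is absent:

```python
# Toy model of "local" additivity: two pigment loci contribute additively to
# eye-pigment intensity, but only against a genetic background that builds eyes.

def pigment_intensity(dark_alleles_locus1, dark_alleles_locus2, eyes_present=True):
    """Each 'dark' allele adds 1 unit of pigment, but only if eyes develop."""
    if not eyes_present:
        return 0.0                      # background gene knocked out: no eyes, no pigment
    return 1.0 * dark_alleles_locus1 + 1.0 * dark_alleles_locus2

# Within the normal background the contributions look perfectly additive:
print(pigment_intensity(1, 0) + pigment_intensity(0, 1) == pigment_intensity(1, 1))  # True

# Change the background and the "additive" contributions vanish entirely:
print(pigment_intensity(1, 1, eyes_present=False))   # 0.0
```

Within the fixed background the additivity check succeeds, so the trait can look aggregative there; changing the background exposes the context-dependence, which is the sense in which the additivity is only local.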

Aggregativity and dimensionality

Aggregative or near aggregative relations reduce the dimensionality of equations and necessary theory—producing simpler theories in obvious ways. If relationships between parts can be ignored or their complexity can be reduced, then there are fewer alternative ways of composing entities that will behave in distinguishable manners, and the complexity and dimensionality of the set of equations necessary to describe the system's behavior are reduced—usually combinatorially. This gives an enormous computational advantage to simpler theories, and a strong prima facie preference for them that is a major (and largely unrecognized) source of bias in the units of selection controversy, as documented by Wade (1978) and Wimsatt (1980b, 1981b). This is nowhere better illustrated than in Lewontin's famous table of the dimensionality of population genetic theories under different assumptions about gene interactions (Lewontin 1974, p. 283, Wimsatt 1980b). As one might expect, the fewer the kinds of relevant interactions, and the fewer the alleles and loci involved in the relevant equations for fitness of evolutionary units and the determination of evolutionary trajectories, the more aggregative the phenomena appear.Footnote 34

The following modification of Lewontin’s table is from Wimsatt 1980b.Footnote 35

Consider the simplifying assumptions in Lewontin’s table:

Table 3 Sufficient dimensionality required for prediction of evolution at a single locus with a alleles when there are n segregating loci in the system

With either assortative mating or sex-linkage, the frequency of genotypes is required to determine the frequency of matings of different types, and from them the frequencies of offspring genotypes. If there is any populational heterogeneity, the specified constraints on pairing violate condition IS, not only at the genotypic level, but also (given diploidy) at the gametic level. But even if different genotypes pair randomly, genes may be clustered in a non-random fashion in gametes.Footnote 36 As we learned in section 3, this will result both in non-random production of gametes and non-random production of genotypes when individuals producing these gametes mate. With different genotypes having different fitnesses, different genotype frequencies will produce different net effects on gene frequencies, so these higher-level units—the frequencies of genotypes and of gametes—are required to predict the outcomes of selection correctly. Any epistatic effects on fitness (violating condition CI) will impact viability and reproductive output of different genotypes, and the frequencies of different gametes, genes, and genotypes in the next generation. Finally, if genes contribute additively to fitness, this contribution is statistically independent of genetic context—what genes are found at other loci. However, as noted in section 3, genome size (and arrangement) cannot be scaled at all (violating QS) without (almost always) massive epistatic (indeed lethal) effects on fitness. Nonetheless, if the right things are held invariant, the three assumptions in the table show different degrees of context-sensitivity assumed in the population genetic models, and less context-sensitivity (associated with meeting more assumptions) makes for simpler equations.

The dimensionality specified in the table is the number of independent equations that must be solved simultaneously (actually, iteratively, in each generation) to predict the outcome. This is less than the total number of frequency variables: at each locus, the gene frequencies must sum to 1, so if a−1 of the allele frequencies are specified, that determines the frequency of the last; similarly, the gametic and genotypic classes in the second and first columns must each sum to 1, so the frequency of the last gametic or genotypic type can be calculated as 1 minus the sum of the others.

The simplifying assumptions not only reduce the number of independent equations but also the complexity of each equation. Multi-locus gametes and genotypes can arise in different ways from other multi-locus gametes and genotypes. These ways of origination are reflected as terms in the equations for how the frequencies of gametes and genotypes change, and they grow combinatorially with the number of units that must be considered. Thus the computational complexity of the system grows combinatorially in two ways at once, with the overall complexity being the multiplicative product of these two modes of growth. This explosion is thus worse than combinatorial, and probably deserves a new name.
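
A rough sense of the scale of these dimensionalities can be had by simply counting frequencies. The following sketch is my own back-of-the-envelope calculation, not the published table: it counts the independent gene, gamete, and genotype frequencies for n loci with a alleles apiece, using only the facts that there are a^n gametic types, a^n(a^n + 1)/2 unordered genotypes, and that each set of frequencies sums to 1.

```python
# Back-of-the-envelope count of the independent frequencies needed at each
# level of description for n loci with a alleles apiece.

def dimensionality(n, a):
    gametic_types = a ** n                                # haplotypes
    genotypes = gametic_types * (gametic_types + 1) // 2  # unordered pairs of gametes
    return {
        "gene frequencies": n * (a - 1),          # a - 1 independent frequencies per locus
        "gamete frequencies": gametic_types - 1,  # frequencies sum to 1
        "genotype frequencies": genotypes - 1,
    }

for n in (1, 2, 5, 10):
    print(n, "loci:", dimensionality(n, a=2))
# With 10 two-allele loci: 10 gene frequencies, 1023 gamete frequencies,
# and over half a million genotype frequencies.
```

With ten two-allele loci, tracking gene frequencies requires ten numbers, gamete frequencies about a thousand, and genotype frequencies over half a million: roughly the kind of contrast in dimensionality that the table dramatizes.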

So as the relationships of the lowest-level parts in the larger structures become causally relevant (changes in these relationships change outcomes), the complexity of a dynamical theory grows extremely rapidly with both the number and the number of kinds of parts. Thus to find the simplest workable theory of a system, it is natural to start with one assuming the minimal number of causally relevant relational properties, because this lets you get by with the smallest number of relevant parts. In the logical extreme, this is to assume aggregative behavior, either directly, or by making equilibrium assumptions that suppress the relevance of context. If a property is aggregative, then the value of that property for the whole system is all that matters for its dynamical behavior. If the property of the system is invariant over aggregative operations on its parts, then it is independent of such variations in the parts, which is to say that their individual values do not matter as long as the value for the whole remains invariant. If that is not possible, structures are assumed which preserve the largest possible degree of invariance of system properties under organizational rearrangements of the parts' properties.

Aggregativity as a heuristic for evaluating decompositions, and our concepts of natural kinds

These cases have yielded many interesting features of claims of aggregativity or partial aggregativity. Seeing them together suggests interactions among the development of theory, methods of decomposition, and experimental design, and what we make of what we have found. Together they give a somewhat different picture of the nature and uses of aggregativity, and have further implications for the assessment (and biases) of reductionistic methodologies.

  1. (1)

    Table 1 shows that very few system properties are aggregative functions of parts’ properties, so emergence—as failure of aggregativity—is extremely common. It is the rule, rather than the exception. The conservation laws of physics pick out aggregative properties, but little else does. So this could—perhaps fairly—be criticized as yielding a very weak notion of emergence. But it accords with intuitions of most scientists I know, who are not willing to give up either their reductionism or their emergence, and who agree with its classification of particular cases. Even if a stronger notion of emergence is preferred, this one would be required for these kinds of cases.

  2. (2)

    So then why is the temptation for “Nothing but-ism”—the ontological war-cry of what Dennett (1995) calls “greedy reductionism”—so strong? We see statements quite regularly in science like “Genes are the only units of selection”, “Organisms are nothing but bags of genes”, “The mind is nothing but neural activity”, “Social behavior is reducible to the behavior of individuals.” If total aggregativity is so rare, why are claims like these so common? While true aggregativity requires invariance of the system property under all decompositions and reaggregations, I suggest that we often (fallaciously) think of behavior at higher levels as being aggregates of the behavior of parts for particular decompositions which do show this invariance—or show it only partially (for some of the criteria) or approximately (invariant within an ε for the criterion in question—see below). Such properties look aggregative for some decompositions or conditions, but reveal themselves as emergent or organization-dependent for others. So appearances are deceiving!Footnote 37 We saw this in the discussion of multi-locus genetic systems. Analyzing such practices naturally changes our focus from ontological to methodological questions: from how to specify relations between system and parts’ properties to looking at the reasons for, process of, and idealizations made in choosing and performing a decomposition, and the broader effects of those choices.

  3. (3)

    Properties may be aggregative for some decompositions but not for others, or more so for some than for others. The degree of aggregativity may then be used (consciously or unconsciously, but in any case quite rationally) as a criterion for choosing among decompositions: we will tend to see more aggregative decompositions as natural decompositions, and their parts as instances of natural kinds, because these provide simpler and less context-dependent regularities, theory, and mathematical models for the behavior they capture. The dimensionality reductions arising from the standard simplifying assumptions in classical multi-locus population genetics (in Table 3) show this clearly. These decompositions may be particularly revealing cuts on nature, but we must take care. (They may be the right cuts for the wrong question.) We will tend to see these parts as special, and to make "nothing but" style reductionistic claims for them. This is a particularly pervasive kind of functional localization fallacy: a move from the claim that a decomposition is particularly powerful or revealing to the claim that the entities and forces it yields are all that matter (Wimsatt 1974, 1985; Bechtel and Richardson 1992). Such claims are false or methodologically misleading if taken to suggest that one shouldn't bother to construct models or theories of the system at levels or with methods other than that of the parts in question, that these preferred entities are the only "real" ones (Wimsatt 1994), or that questions one can pursue with such decompositions are more important.

  4. (4)

    When partial aggregativity leads to greater physical or functional modularity of the parts (likely in evolving systems—see Lewontin 1978; Wimsatt 1981b, pp. 141–142; Schank and Wimsatt 2000, or Brandon 1999), it may promote (evolutionarily) a consilience or robustness of parts’ boundaries individuated using different properties of the system. Platt (1969) and Abbott (2001) note that system boundaries in one property may provide symmetry-breaking factors producing growing differences and boundaries in other properties along the same dividing lines. Greater robustness (Wimsatt 1981a) of parts under that decomposition—a standard criterion of objecthood—will also strongly contribute to the judgement that this is a decomposition into natural or real parts. Such claims are prima facie reasonable. (Robustness is a degree-property, so these claims are not an all-or-nothing affair.) These judgements are all context-sensitive, so they still don’t support “nothing-but” style claims.

  5. (5)

    The four conditions all specify invariance of the system property under operations on the parts. For quantitative properties one can easily produce a family of criteria for approximate or local aggregativity, in which variations of the system property within ±ε are tolerated for various values of ε (Wimsatt 1986b). In (1974) I used this strategy to describe different degrees of near-decomposability or modularity in systems. Tolerances are useful theoretical tools because we use quantitative or formal qualitative frameworks as templates which nature may meet in varying degrees. With a particularly adaptable framework that can be fitted to nature in different possible ways, we may try many such mappings, looking for "best fits". (The coefficient of determination (r²) of a linear regression is a particularly simple example.) Using "tolerances" for key qualitative concepts is particularly useful in a messy, inexact, and approximate world with many regularities and stable patterns, but few exceptionless generalizations. We already need tolerances for the "noise" we face in experimental situations, but this is an additional and very important reason for them. (A minimal computational sketch of such a tolerance test is given just after this list.)

  6. (6)

    Since approximations are frequently used to produce equations of aggregative form, we must investigate the accuracy of the approximation under relevant conditions. This may take two forms, with different consequences:

    1. (a)

      Emphasizing conditions for accuracy focuses attention upon the context-dependence of system properties, with failure of aggregativity often leading to using or studying the system under different conditions, seeking conditions under which its behavior is more aggregative or may be qualitatively different. (Here the search simplifies conceptualization and analysis of data, and may also generate better experimental control—indicating conditions one should attempt to realize or avoid.) If conditions for aggregative behavior are found, we will be tempted (see (3) above) to regard that decomposition, and the conditions which produced it, as reflecting the real nature of the system, however unjustified that may ultimately be.

    2. (b)

      Emphasizing required accuracy highlights our purposes and demands of our applications, adjusting demands or methods as required—using methods suited to non-aggregative behavior (for greater accuracy) or weaker methods treating it as aggregative (for less). Awareness of our role in classifying the property this way may make us more self-critical of our idealizations, and of possible biases in our problem-solving heuristics (Wimsatt 1987, 1980b, 1985). Such awareness is important but we should avoid the view that classification of the system property is merely instrumental, conventional, or socially determined. (“Determines” too easily equivocates between “plays a role in determining” and “is sufficient to uniquely determine.”)

  7. (7)

    One can also assess aggregativity in past and present theories of the relation of system and parts’ properties. In doing so, one must look not only at the final derived equations, but also at their derivation, to see if the idealizations and approximations used assume one or more of the conditions of aggregativity, and whether they are legitimate for the conditions at hand. One may thereby derive an aggregative model, and also come to understand its conditions of applicability. As the genetic and environmental examples discussed above show, apparent units of aggregation can be very misleading, and the respects in which they can be treated as aggregates quite limited.

  8. (8)

    One can track systematic changes in our view of the relation between system and parts' properties, both analyzing their status in historically important disputes, and comparing their changing status in successively better theories of a phenomenon. Does this raise Hempel or Nagel's worries (Wimsatt 1986b)—that aggregativity and emergence are really about our knowledge of the world, not about the world itself? Hardly so! This information is just what a sophisticated realist needs—to see how the world changes as viewed through our theories, when our theories change. Discussions of theory-dependence always suppose that the effects of this theory dependence are somehow impossibly confounded with those of the world. But this belief is ungrounded. As long as you can tell one from the other—object of study from tool of access—you're okay, and to do this you need to be able to vary both to assess and separate these effects. Criteria for aggregativity can be viewed either as statements about ontological relations in the world, or as tools for constructing and characterizing theories. It's just that these are different uses. If one makes these aims clear, there are no unavoidable threats to objectivity.

  9. (9)

    Earlier reductionistic theories of the behavior of a system will tend to have more simplifying assumptions, controlled variables, and assumed constancies than later ones, and more predicates treated as monadic or of reduced order (Wimsatt 1980b, 1985). More realistic models are often suggested by the failure modes of these simpler ones (1987). Higher order relational properties and more complex interactions between parts, resulting from increasingly detailed specification of the internal structure and environmental relations of the system, become accessible to analysis with increasingly powerful simulation tools. These kinds of progress should increase the degree and kinds of emergence postulated of system properties. This is the opposite of what is predicted by the classical positivist model of emergence, which saw emergence disappearing with the progress of science. Increased awareness of emergence is just what is happening with the recent increased interest in the study of complex systems. The increased talk of holism and emergence accompanying the rising interest in non-linear dynamics—systems which violate (at least) the fourth condition for aggregativity—often comes with loose (and often misleading) anti-reductionist rhetoric, but it is a clear confirmation of the kind of analysis pursued here.
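
To make the tolerance idea in (5) concrete, here is a minimal sketch, entirely my own and with hypothetical names throughout: it counts a system property as "aggregative within ε" under rearrangement if no permutation of the parts shifts the property's value by more than ε. (It tests only the rearrangement condition, not all four criteria.)

```python
# Treating an aggregativity criterion as a template with a tolerance:
# the property must be invariant, within epsilon, under rearrangement of parts.

import itertools

def aggregative_within(system_property, parts, epsilon):
    """Check invariance of system_property under all rearrangements of parts."""
    baseline = system_property(parts)
    return all(
        abs(system_property(list(arrangement)) - baseline) <= epsilon
        for arrangement in itertools.permutations(parts)
    )

# Total mass is invariant under rearrangement (epsilon = 0 suffices) ...
print(aggregative_within(sum, [1.0, 2.0, 3.0], epsilon=0.0))               # True

# ... but an order-sensitive property (a weighted series, say) counts as
# "aggregative" only if we tolerate a large enough epsilon.
def weighted_series(parts):
    return sum(x / (i + 1) for i, x in enumerate(parts))

print(aggregative_within(weighted_series, [1.0, 2.0, 3.0], epsilon=0.01))  # False
print(aggregative_within(weighted_series, [1.0, 2.0, 3.0], epsilon=2.0))   # True
```

Total mass passes with ε = 0, while the order-sensitive weighted series passes only once the tolerance is made large, which is exactly the sense in which approximate aggregativity is a matter of degree.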

Reductionisms and biases revisited

This analysis can help to address some new questions concerning reductionism—particularly how it is used and perceived in the context of an incomplete analysis—i.e. our usual situation! The four conditions of aggregativity provide a powerful adjustable framework to evaluate how well each condition is met across different decompositions of a system into parts. The better a decomposition meets these conditions, the more easily we can treat it as factoring the system into a set of modular parts having monadic, intrinsic, or context-independent properties. We saw in the discussion of multi-locus population genetics how aggregativity of properties differed for gene, chromosome, and gamete-level decompositions of the genome. With particularly simple and theoretically productive decompositions, we will tend to view these parts and properties as instances of natural kinds, as robust, and to regard the system as “nothing more” than the collection of its parts. We have here turned an architectonic distinction between kinds of properties into a search heuristic for finding preferred, simple, “maximally reductionistic” decompositions of systems into parts—decompositions which lead readily to extremes of “nothing but” talk and disciplinary imperialism.

A reductive explanation of a system property or behavior shows it to be mechanistically explicable in terms of properties of and interactions among the parts of the system. What does this kind of explanation have to do with "nothing but" style reductionism? In principle, nothing—but in practice they are temptingly connected and easily confused. With total knowledge of a system, the two species of "reduction" are clearly distinguishable, but we don't have that, and with increasing degrees of ignorance about a system they come to look more and more alike! A closer look at what we do in conditions of partial ignorance is especially important for fields and explanatory tasks whose major questions are still "in process", and it yields just the kinds of judgements that limited, fallible, and error-prone scientists should seek. Indeed this is our common situation!

A system which is aggregative for a given decomposition is almost trivially mechanistically explicable: the parts all have the property in question, and enter into the explanation of how the system has it in the same simple way. Relationships with other parts are nonexistent (the relevant properties are monadic) or of relatively low order, and would tend to meet strong conditions of symmetry and homogeneity. Such systems are relatively uninteresting: their parts show no functional differentiation. But none of these conditions follow from saying that the properties of a system are mechanistically explicable. To say that the behavior of a system is totally explicable in terms of the behavior of its parts is not to say that it is an aggregative function of the parts. (The inference does go the other way, contributing to the confusion, but the rarity of true aggregativity makes even this fallacious inference empty: there are too few occasions to invoke it.)

But suppose that early in the investigation of a system (say, an organism) we think we know a good set of parts (e.g., its genes). If we don’t yet know the diverse ways in which these genes may interact with each other and with the physical conditions in the organism (on all of the relevant size and time scales), we may treat their interactions as all alike. We may do so either in a first-order simplified model of the system that we simulate or analyze, or in the “out of sight out of mind” blissful ignorance that often accompanies our view of a complex task before we really get into it. In either case we are likely to overestimate how aggregative the system is, how simple it will be to understand its behavior, and to make the most simple-minded reductionistic claims about what can be learned from studying it at the lowest (or indeed only at the lowest!) level. (My 1979, 1980b, 1985, and 1997 provide examples of how these assumptions can emerge and cause trouble in seemingly benign applications of commonly effective reductionistic problem-solving heuristics).

This would explain (indeed, predict!) (1) the characteristic oversimplifications in early claims made for the human genome project, and (2) the subsequent (necessary) broadening of the project to make it viable. (This included adding parallel comparative genomic studies of other species at different phylogenetic distances to determine what varies across those species groups, and their significance; and developmental and physiological studies at a variety of levels of organization of the expression of the genetic traits of interest). It also explains (and predicts) (3) the increasing moderation of claims for what we will learn from it. (4) Explanations coming out of it will be far more contextual and qualified, and may involve the discovery of qualitatively new kinds of mechanisms and interactions. This pattern—these four changes in the character of the program and the claims made for it—is not only explicable after the fact, but predictable in advance, given normally applicable reductionistic research strategies and their biases (1980b, 1985, 1997). These apply not only to the human genome project, but chart the expected trajectory of any successful reductionistic research program in the empirical sciences.

I neglect here the obvious political purposes served by exaggerating how much could be achieved with how little, and I am delighted (and unsurprised) that Dick Levins deals with that in his paper here. This could arise simply from cognitive bias, but phenomena are also often overdetermined, and reductionistic claims often do serve political ends, inside and outside of academe. (Most of the offending claimants do (or should!) start with the disclaimer, “In principle,...”. Yes, the road to hell is paved with good intentions!)Footnote 38 To assess these claims fairly, we must recognize the limitations of our knowledge, the heuristic character of our tools, and specific biases likely to result from their application. Knowledge of how we tend to construct models and theories in contexts of partial ignorance might well have produced a better project, at a more reasonable pace, with more realistic ends, and at lower cost to the other fields, which we must support and develop in any case to decode the texts we find in “the book of life”. But the project would still have been taken up by those it has served.

But doesn’t this chastening skepticism about reductionism run counter to the facts? Why then should reductionist methodologies have been so successful? A crucial property (the fourth of six)Footnote 39 of heuristic principles is that they succeed in part by transforming a problem into a different but related problem (which is easier to solve). If the problem is solved effectively, we will tend to identify the new problem as the old one—saying, “Now that we’ve clarified the problem so that it can be solved,...” or some such thing! Quite substantial changes in a paradigm may thus be hidden—particularly a cumulative string of such changes, each too small to be regarded as “fundamental”. I hold this kind of ex post facto reification responsible for the exaggeratedly high opinion we have of reductionistic methodologies, and also for the largely mistaken belief that work elaborating a paradigm is merely playing out already given options.

But hasn’t the reduction succeeded? Well, it has and it hasn’t: its successes are often genuine, but quite misleading—it may succeed via a series of subtle shifts such that important aspects of the original question have not been answered. Once those aspects are shifted outside the bounds of the new science, there will be a natural tendency to downgrade their importance. We work on problems that yield to our methods, and we all have a tendency to overestimate their power, and the importance and centrality of our own field. (It’s what we know best after all!) In this respect, scientists share with others a cognitive bias which is quite general across fields and contexts. So questions that can’t be addressed using our own very successful methods must not be very scientific, not very important, or both! This sounds like it’s worth a good laugh, but it may be even more dangerous than failing to solve a problem—we may now even fail to recognize that it exists!

Abbott (2001) provides a revealing analysis of perceptual distortions leading individuals near the extremes of an income distribution to see themselves as closer to the middle than they really are. This is a generalizable phenomenon. If we discriminate differences among cognitive positions near our own, and lump together differences among positions further away, systematic metrical distortions will exaggerate the centrality and importance of our own position. This is a quite general property of perspectives and of any perspectival view of the world. Thus if we tend to assume unconsciously that the importance of our specialty is proportional to the fraction of our knowledge of the world that it represents, that assumption would both explain our systematic overestimates of the importance of our own areas, and predict that this distortion should become more extreme as we get more narrowly specialized. [This should apply both across individuals, and for an individual over time.]

The whole of embryology was differentiated out of the study of heredity with the rise of classical transmission genetics between 1900 and 1926, even though developmental phenomena had figured centrally there earlier, and were regarded through much of this period as important constraints on the form of acceptable theories of heredity.Footnote 40 Morgan was originally skeptical of Mendelism for just that reason—until the spectacular successes of his research group led them to simply ignore its quite paradoxical inadequacies on that score. Development again achieved the promise of a resurgence in genetics in the early 1960s with the discovery of the lac operon, the first account of how a gene could perform a complex and conditional control function in the expression of a trait; but that promise turned out to be simplistic and incorrect. The genetics of elephants (or any eukaryote) was not like the genetics of bacteria after all—especially not the developmental genetics of metazoa. Development is now again at center stage (accelerating since the late 1970s or early 1980s) for various reasons. Important aspects of it can now be studied molecularly, and we have discovered some extremely powerful and general large-scale gene complexes (the related HOM, HOX and DHOX families) which give us a handle on many more (but far from all) developmental phenomena. Is this a triumph for reductionism? In part, but it has succeeded in this by entraining and using a successively broader diversity of kinds of data and theories from other sources, and recognizing a whole new cast of causal entities. This very diversity of major players makes it much less reductionistic. The methodology (and even more the rhetoric!) is still quite reductionistic (reflecting the character of the heuristics used), but neither the ontology nor the epistemology is anywhere close!