
1 Introduction

The search for mechanisms that underlie the phenomena under study is ubiquitous in many biological fields. Physiologists seek to find the mechanism for muscle contraction, cancer scientists try to discover the mechanisms that cause cell proliferation, and ecologists aim at elucidating the various mechanisms that bring about the maintenance of species diversity – just to mention a few examples. In the last 15 years, the philosophical literature on mechanisms has dramatically increased. Among the major proponents of the “new mechanistic philosophy” (Skipper and Millstein 2005, p. 327) are Carl Craver (2007), William Bechtel (2006, 2008), Stuart Glennan (2002, 2005), Lindley Darden (2006, 2008), and Peter Machamer et al. (2000). According to the mechanist’s view, scientific practice consists in the discovery, representation, and manipulation of mechanisms. Scientific explanations are (exclusively or primarily) conceived as mechanistic explanations, that is, as descriptions of how the components of a mechanism work together to produce the phenomenon to be explained (Footnote 1).

Our primary interest in this chapter is the modeling of biological mechanisms. How are, can, and should mechanisms be represented? Are certain kinds of models of mechanisms advantageous with regard to particular scientific purposes like explanation, understanding, prediction, or manipulation? Previous philosophical literature on this topic (e.g., Glennan 2005; Craver 2007; Bechtel 2008) regards mechanistic models as being primarily qualitative representations. According to the mechanist’s view, adequate models of mechanisms describe all and only those factors that contribute to bringing about the mechanism’s behavior of interest (i.e., the “constitutively relevant” factors; cf. Craver 2007, pp. 139–159). These factors include the entities (or objects) that compose the mechanism, the activities (or operations or interactions) that these entities engage in, and the spatial and temporal organization of the entities and activities (i.e., how the entities are spatially distributed, which position shifts of entities take place, which activities initiate which other activities at what time). These qualitative models of biological mechanisms are typically depicted by diagrams (cf. Perini 2005), which scientists sometimes call “cartoon models” (Ganesan et al. 2009, p. 1621). Diagrams make it easier to understand how the steps of a mechanism together bring about the behavior in question. Hence, the representations of mechanisms that can be found in common biology textbooks are typically qualitative models.

However, biological research practice is much more diverse than what is depicted in biology textbooks. Whereas the models of mechanisms which are designed for textbooks aim at providing explanations and promoting understanding, modeling strategies that are pursued in contemporary scientific practice, by contrast, serve multiple purposes. Besides offering explanation, models of mechanisms are also used, for instance, to make (quantitative or qualitative) predictions, to guide hypothesis building in scientific discovery, and to design manipulation experiments or even computer simulations. In some research contexts, what is needed are not purely qualitative models of mechanisms but rather models that contain quantitative, probabilistic information. These models often have the virtue of being closer to the experiments and studies that are actually carried out in biological research practice. It is due to this closeness that probabilistic and quantitative models often allow for more usable predictions, in particular when it comes to predicting the probabilities of certain phenomena of interest under specific manipulations. Another advantage of models of mechanisms that combine qualitative with quantitative, probabilistic information might be that they allow for the integration of qualitative (e.g., molecular) studies and probabilistic (e.g., ecological or evolutionary) studies in a certain biological field. This is, for example, an urgent issue in epigenetics where the laboratory experiments performed by molecular epigeneticists and the observational studies and computer simulations conducted by ecologists and evolutionary biologists need to be brought together (cf. Baedke 2012).

With this chapter, we respond to the need of contemporary biology for models of mechanisms that include quantitative, probabilistic information. We argue that the formal framework of causal graph theory is well suited to provide us with probabilistic, (often) quantitative representations of biological mechanisms (Footnote 2). We illustrate this claim with an example from actual biological research, namely, feedback regulation of fatty acid biosynthesis in Brassica napus. Modeling this example allows us to show how causal graph theory is able to account for certain features of biological mechanisms that have been regarded as problematic (e.g., their multilevel character and the feedback relations that they frequently contain). However, besides these virtues, our analysis of this case study also reveals the difficulties that causal graph theoretical modeling strategies face when it comes to representing mechanisms. As a result, we argue for the balanced view that, even though causal graph theoretical models of mechanisms have advantages with respect to particular scientific purposes, they also have shortcomings with respect to other purposes.

We start with an introduction of the basic formal concepts of causal graph theory (Sect. 3.2). In Sect. 3.3, we present what can be regarded as the major characteristics of biological mechanisms, namely, their multilevel character, their two kinds of components, and the spatial and temporal organization of their components. Section 3.4 deals with the case study that is central to our analysis: the mechanism for feedback inhibition of ACCase by 18:1-ACP in Brassica napus. In Sect. 3.5, we discuss how this mechanism (as well as one of its submechanisms) can be modeled by using causal graph theory. In doing so, we also address the possible objection that causal graph theory can account neither for the feedback relations that many biological mechanisms contain nor for the fact that mechanisms are frequently organized in nested hierarchies. On the basis of this analysis, we can then specify, on the one hand, the virtues and, on the other hand, the shortcomings of modeling biological mechanisms within a causal graph framework (Sect. 3.6).

2 Causal Graph Theory

Causal graph theory is intended to model causality in a quite abstract and empirically meaningful way; it therefore provides principles which connect causal structures to empirical data. While causal structures are represented by graphs, empirical data is stored by means of probability distributions over sets of statistical variables. In this section we will introduce the basic formal concepts needed to investigate the question of whether a causal graph framework is capable of representing mechanisms. We start by giving some notational conventions and remarks concerning statistical variables and probability distributions (Sect. 3.2.1) before providing definitions for “probabilistic dependence” and “probabilistic independence” (Sect. 3.2.2). We introduce the concept of a causal graph (Sect. 3.2.3) and illustrate how such a causal graph, complemented by a probability distribution, becomes a causal model (Sect. 3.2.4).

2.1 Statistical Variables and Probability Distributions

A statistical variable X is a function that assigns exactly one of at least two mutually exclusive properties/possible values of X (“val(X)” designates the set of X’s possible values) to every individual in X’s domain D_X. Statistical variables can be used in a way quite similar to predicate constants. “X(a) = x” (where “a” is an individual constant), for instance, can be read as the token-level statement “individual a (e.g., a particular Drosophila fly) has property x (e.g., red eye color)” and “X(u) = x” (where “u” is an individual variable) as the type-level statement “having property x.” Formulae like “X(u) = x” can be abbreviated as “X = x” or, even shorter, as “x” whenever reference to individuals u is not needed. For the sake of simplicity, we shall only use discrete variables, that is, variables X whose set of possible values val(X) is finite. Continuous quantities can be captured by discrete variables whose values correspond to the accuracy of the measurement methods used.

Given a statistical variable X or a set of statistical variables X, then Pr is a probability distribution over X if and only if Pr is a function assigning a value r_i ∈ [0,1] to every x ∈ val(X), so that the sum of all assigned r_i equals 1. Since probability distributions should be capable of storing empirical data, we interpret probabilities as objective probabilities, that is, as inductively inferred limit tendencies of observed frequencies.
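These two definitions can be given a computational gloss. The following sketch is purely illustrative: the variable, its domain of Drosophila individuals, and the probability values are all invented for the example.

```python
# An illustrative sketch of the definitions above; the eye-color variable,
# its domain, and the probability values are invented for the example.

# val(X): a finite set of at least two mutually exclusive possible values
val_x = {"red", "white"}

# X as a function assigning exactly one value to every individual in D_X
x = {"fly_a": "red", "fly_b": "white", "fly_c": "red"}  # token level: X(a) = red

# Pr: a function from val(X) to [0, 1] whose assigned values sum to 1
pr = {"red": 0.75, "white": 0.25}

assert all(v in val_x for v in x.values())   # X is well defined on its domain
assert abs(sum(pr.values()) - 1.0) < 1e-9    # the assigned r_i sum to 1
print(pr["red"])  # the probability of the event X = red
```

On the objective interpretation adopted above, the value 0.75 would be read as an inductively inferred limit tendency of observed frequencies, not as a degree of belief.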

2.2 Probabilistic Dependence and Independence Relations

Given a probability distribution Pr over variable set V, conditional probabilistic dependence between two variables X and Y can be defined in the following way:

(1) DEP_Pr(X,Y|M) if and only if there are x, y, and m so that Pr(x|y,m) ≠ Pr(x|m), provided Pr(y,m) > 0 (Footnote 3).

Read “DEP_Pr(X,Y|M)” as “X and Y are probabilistically dependent conditional on M.” According to definition (1), two variables X and Y are probabilistically dependent conditional on M if the probability of at least one value of one of these two variables is probabilistically sensitive to at least one value of the other variable in at least one context M = m. So “probabilistic dependence” is a quite weak notion. “Probabilistic independence,” on the other hand, is a very strong notion. If two variables X and Y are probabilistically independent conditional on M, then there is not a single X-value x and not a single Y-value y so that x is probabilistically sensitive to y in any context M = m. Conditional probabilistic independence (INDEP_Pr) is defined as the negation of conditional probabilistic dependence:

(2) INDEP_Pr(X,Y|M) if and only if for all x, y, and m, Pr(x|y,m) = Pr(x|m), provided Pr(y,m) > 0.

Unconditional probabilistic dependence/independence (DEP_Pr(X,Y)/INDEP_Pr(X,Y)) turns out to be a special case of conditional probabilistic dependence/independence; it can be defined as conditional probabilistic dependence/independence given the empty context M = ∅:

(3) DEP_Pr(X,Y) if and only if DEP_Pr(X,Y|∅).
(4) INDEP_Pr(X,Y) if and only if INDEP_Pr(X,Y|∅).
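Definitions (1)–(4) can be checked directly against a joint distribution. The following sketch uses an invented joint distribution over three binary variables in which M is a common cause of X and Y; all probability values are made up for illustration.

```python
from itertools import product

# Illustrative joint distribution over binary variables X, Y, M in which
# M is a common cause of X and Y; all probability values are invented.
# Events are triples (x, y, m).
def p_given_m(v, m):
    return 0.8 if v == m else 0.2

joint = {(x, y, m): 0.5 * p_given_m(x, m) * p_given_m(y, m)
         for x, y, m in product([0, 1], repeat=3)}

def pr(event, cond=lambda e: True):
    """Pr(event | cond), computed from the joint table."""
    den = sum(p for e, p in joint.items() if cond(e))
    num = sum(p for e, p in joint.items() if event(e) and cond(e))
    return num / den

# Definition (3): DEP_Pr(X,Y) -- unconditionally, X and Y are correlated.
assert abs(pr(lambda e: e[0] == 1, lambda e: e[1] == 1)
           - pr(lambda e: e[0] == 1)) > 1e-9

# Definition (2): INDEP_Pr(X,Y|M) -- conditioning on the common cause M
# screens X off from Y for every value combination y, m.
for y, m in product([0, 1], repeat=2):
    lhs = pr(lambda e: e[0] == 1, lambda e, y=y, m=m: e[1] == y and e[2] == m)
    rhs = pr(lambda e: e[0] == 1, lambda e, m=m: e[2] == m)
    assert abs(lhs - rhs) < 1e-9
print("DEP_Pr(X,Y) holds, and INDEP_Pr(X,Y|M) holds")
```

This also illustrates the asymmetry noted above: a single sensitive value combination suffices for dependence, whereas independence must hold for every x, y, and m.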

2.3 Graphs and Causal Graphs

Let us turn to the concept of a causal graph. A graph G is an ordered pair 〈V,E〉, where V is a set of so-called vertices (which are statistical variables in causal graphs) while E is a set of so-called edges. Edges may be all kinds of arrows (e.g., “→,” “⋅⋅⋅>,” and “↔”) or undirected links (“—”) representing diverse binary relations among objects in V. Two variables in a graph’s variable set V are called adjacent if and only if they are connected by an edge. A chain of n ≥ 1 edges connecting two variables X and Y of a graph’s variable set V is called a path between X and Y. A path of the form X → … → Y is called a directed path from X to Y. Whenever a path contains a subpath of the form X → Z ← Y, then Z is called a collider on this path; the path is called a collider path in that case. X is called an ancestor of Y if and only if there is a directed path from X to Y; Y is called a descendant of X in that case. The set of all ancestors of a variable X is denoted by “Anc(X),” while the set of all descendants of X is indicated by “Des(X).” All X for which X → Y holds are called parents of Y; the set of all parents of Y is referred to via “Pa(Y).” All Y for which X → Y holds are called children of X; the set of all children of X is referred to via “Chi(X).” Variables to which no arrowhead is pointing are called exogenous variables. Non-exogenous variables are called endogenous variables. A graph G = 〈V,E〉 containing a path of the form X → … → X (with X ∈ V) is called a cyclic graph; an acyclic graph is a graph that is not a cyclic graph. A graph G = 〈V,E〉 is called a directed graph if E contains only directed edges.
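The graph vocabulary just introduced can be made concrete with a small example. The directed graph below is invented for illustration: X → Z ← Y plus Z → W, so that Z is a collider on the path X → Z ← Y, and X and Y are exogenous.

```python
# A sketch of the graph vocabulary above, applied to the invented
# directed graph X -> Z <- Y, Z -> W (Z is a collider on X -> Z <- Y).
edges = {("X", "Z"), ("Y", "Z"), ("Z", "W")}
nodes = {"X", "Y", "Z", "W"}

def pa(v):
    """Pa(v): all A with an edge A -> v (the parents of v)."""
    return {a for a, b in edges if b == v}

def chi(v):
    """Chi(v): all B with an edge v -> B (the children of v)."""
    return {b for a, b in edges if a == v}

def reach(v, step):
    """Transitive closure along `step`: chi yields Des(v), pa yields Anc(v)."""
    out, frontier = set(), set(step(v))
    while frontier:
        n = frontier.pop()
        if n not in out:
            out.add(n)
            frontier |= step(n)
    return out

des = lambda v: reach(v, chi)  # descendants: reachable via a directed path
anc = lambda v: reach(v, pa)   # ancestors: can reach v via a directed path
exogenous = {v for v in nodes if not pa(v)}  # no arrowhead points at them

print(sorted(des("X")), sorted(anc("W")), sorted(exogenous))
```

Here Des(X) = {Z, W}, Anc(W) = {X, Y, Z}, and the exogenous variables are X and Y, as the definitions require.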

A graph becomes a causal graph as soon as its edges are interpreted causally. We will interpret “X → Y” as “X is a direct cause of Y in causal graph G.” X is a cause (i.e., a direct/indirect cause) of Y in G if and only if there is a causal chain X → … → Y in G.

2.4 Bayesian Networks and Causal Models

A directed acyclic graph (DAG) G = 〈V,E〉 and a probability distribution Pr over G’s variable set V together become a so-called Bayesian network (BN) 〈G,Pr〉 if and only if G and Pr satisfy the Markov condition (MC) (Footnote 4). If G is an acyclic causal graph, then G and Pr become an acyclic causal model (CM) if and only if G and Pr satisfy the causal Markov condition (CMC) (Footnote 5) or d-separation (Footnote 6):

(MC/CMC): G = 〈V,E〉 and Pr satisfy the (causal) Markov condition if and only if for all X ∈ V, INDEP_Pr(X, V\Des(X) | Pa(X)) (Footnote 7).

V\Des(X) is the set of all non-descendants of X. Note that “Des(X)” and “Pa(X)” in CMC refer to X’s effects and X’s direct causes, respectively, while “Des(X)” and “Pa(X)” are not causally interpreted at all in MC. The main idea behind CMC can be traced back to Reichenbach’s The Direction of Time (1956) (Footnote 8). It captures the strong intuition that conditioning on all common causes as well as conditioning on intermediate causes breaks down the probabilistic influence between two formerly correlated variables X and Y. Or in other words, the direct causes of a variable X contain all the probabilistic information that can be found among the causes of event types X = x; knowing the values of X’s parents screens X off from all of its indirect causes.

We illustrate how CMC works by providing some examples. CMC implies for the DAG in Fig. 3.1, for instance, the following independence relations (as well as all probabilistic independence relations implied by them). These independence relations can directly be read off CMC applied to this DAG: INDEP_Pr(X1,X4), INDEP_Pr(X2,{X3,X4,X6}|X1), INDEP_Pr(X3,{X2,X4}|X1), INDEP_Pr(X4,{X1,X2,X3,X5}), INDEP_Pr(X5,{X1,X4,X6}|{X2,X3}), and INDEP_Pr(X6,{X1,X2,X5}|{X3,X4}).

Fig. 3.1 A simple exemplary causal graph
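The listed relations can also be generated mechanically from CMC. Since the figure is not reproduced here, the edge set below is an assumption: it is a DAG consistent with exactly the independence relations listed above.

```python
# We assume Fig. 3.1 has the edge set below -- a DAG consistent with the
# independence relations listed in the text (this reconstruction of the
# figure is an assumption, as the figure itself is not reproduced here).
edges = {(1, 2), (1, 3), (2, 5), (3, 5), (3, 6), (4, 6)}
nodes = {1, 2, 3, 4, 5, 6}

def parents(v):
    return {a for a, b in edges if b == v}

def descendants(v):
    out, frontier = set(), {b for a, b in edges if a == v}
    while frontier:
        n = frontier.pop()
        if n not in out:
            out.add(n)
            frontier |= {b for a, b in edges if a == n}
    return out

# CMC: every X is independent of its non-descendants given its parents.
for x in sorted(nodes):
    others = nodes - descendants(x) - {x} - parents(x)
    if others:
        lhs = ", ".join(f"X{o}" for o in sorted(others))
        cond = ", ".join(f"X{o}" for o in sorted(parents(x)))
        print(f"INDEP_Pr(X{x}, {{{lhs}}}" + (f" | {{{cond}}})" if cond else ")"))
```

Run on this edge set, the loop prints exactly the six independence relations given in the text, one for each variable.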

It follows from MC/CMC that the equation Pr(X1,…,Xn) = Π_i Pr(Xi|pa(Xi)) (Footnote 9) holds in every BN/acyclic CM 〈V,E,Pr〉 and, thus, that every BN/acyclic CM determines a fully defined probability distribution Pr(X1,…,Xn) over the variable set V of this BN/acyclic CM. Hence, BNs/acyclic CMs allow for probabilistic reasoning about events which can be described in terms of the variables in V. Because Pr(X1,…,Xn) = Π_i Pr(Xi|pa(Xi)) holds in acyclic CMs, the conditional probabilities Pr(Xi|pa(Xi)) – which are called Xi’s parameters – can represent the causal strengths of a variable Xi’s direct causes. Note that Pr(X1,…,Xn) = Π_i Pr(Xi|pa(Xi)) and thus MC/CMC do not generally hold in cyclic CMs. It is because of this that in cyclic CMs there are always some variables whose parameters are undefined (namely, the variables lying on a cyclic directed path) and, thus, that the causal strengths of their direct causes are also undefined in such models.
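The factorization can be sketched for the smallest nontrivial case, the chain X1 → X2. All conditional probabilities below are invented for illustration.

```python
from itertools import product

# Minimal sketch of Pr(X1,...,Xn) = Prod_i Pr(Xi | pa(Xi)) for the tiny
# chain X1 -> X2; all conditional probabilities are invented.
pr_x1 = {0: 0.6, 1: 0.4}                 # X1 is exogenous: Pr(X1)
pr_x2_given_x1 = {0: {0: 0.9, 1: 0.1},   # Pr(X2 | X1): X2's parameters
                  1: {0: 0.3, 1: 0.7}}

joint = {(x1, x2): pr_x1[x1] * pr_x2_given_x1[x1][x2]
         for x1, x2 in product([0, 1], repeat=2)}

# The factorization yields a fully defined distribution over {X1, X2} ...
assert abs(sum(joint.values()) - 1.0) < 1e-9
# ... which supports probabilistic reasoning, e.g. marginalizing to Pr(X2 = 1).
pr_x2_is_1 = joint[(0, 1)] + joint[(1, 1)]
print(round(pr_x2_is_1, 2))  # 0.6*0.1 + 0.4*0.7 = 0.34
```

In a cyclic graph no such parameterization is available for the variables on the cycle, which is exactly the difficulty returned to in Sect. 3.5.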

3 Biological Mechanisms

Before we can assess the strengths and shortcomings of causal graph theoretical models of biological mechanisms, we need to know what the main features of biological mechanisms are. In the last 15 years, philosophical interest in mechanisms has significantly increased. Those who endorse the mechanistic account place the concept of a mechanism at the heart of their philosophical analysis of scientific practice. They regard models of mechanisms as being involved in almost all scientific activities, be it explanation, discovery, prediction, generalization, or intervention. There are still controversies in the debate with regard to how the notion of a mechanism should be specified, for instance, to which ontological category the components of a mechanism belong (Machamer et al. 2000; Tabery 2004; Torres 2008), whether the regular occurrence of the mechanism’s behavior is a necessary condition (Bogen 2005; Craver and Kaiser 2013), or whether the concept of a mechanism can be extended such that it also accounts for the behavior of complex systems (Bechtel and Abrahamsen 2010, 2011) or for historical processes (Glennan 2010; see also Glennan’s chapter in this volume). Despite these differences there are also many points of accordance. In what follows we will briefly present what are regarded as the major characteristics of biological mechanisms in the debate.

To begin with, a mechanism is always a mechanism for a certain behavior (Glennan 2002), for instance, the mechanism for protein synthesis or the mechanism for cell division. This is crucial because only those factors (i.e., entities and activities/interactions) that contribute to producing the specific behavior of the mechanism are said to be components of this mechanism (Footnote 10). An important consequence is that, although, for example, protein synthesis is the behavior of a cell, not all parts of the cell are also components of the mechanism for protein synthesis. Some parts of the cell (e.g., the centrosome and the cytoskeleton) are causally irrelevant for synthesizing proteins and thus do not count as components of the mechanism for protein synthesis (Footnote 11). In other words, the decomposition of a mechanism into its components depends on how the behavior of the mechanism is characterized (Kauffmann 1970; Craver and Darden 2001).

A second major characteristic of mechanisms is their multilevel character. The notion “multilevel character” refers to two distinct but related features of mechanisms: first, it appeals to the part-whole relation that exists between a mechanism and its components. This part-whole relation gives rise to the ontological claim that the mechanism as a whole is located on a higher level of organization (Footnote 12) than the entities and activities/interactions that compose the mechanism. For instance, the mechanism for muscle contraction is said to be located on a higher level than the calcium ions, the sarcoplasmic reticulum, the myosin and actin molecules, etc., that interact with each other in a certain way (or that perform certain activities) in order to bring about the behavior of the mechanism as a whole (i.e., the contraction of the muscle fiber). Second, what is also meant by “multilevel character” is the fact that many mechanisms (in particular, in the biological realm) occur in nested hierarchies. Many mechanisms have components that are themselves (lower-level) mechanisms; and many mechanisms themselves constitute a component in a higher-level mechanism. For instance, the calcium pump that actively transports the calcium ions from the cytosol back into the sarcoplasmic reticulum is a part of the mechanism for muscle contraction. However, the calcium pump is also a mechanism on its own, namely, a mechanism for active transport of calcium ions. As such, it has its own components (e.g., A-, N-, and P-domain, transmembrane domain, calcium ions, ATP) with their own organization. Furthermore, the mechanism for muscle contraction itself constitutes a part in a higher-level mechanism, for instance, in the mechanism for crawling by peristalsis, a behavior that is exhibited, for example, by earthworms.

The third feature of mechanisms concerns their components. It is the one with respect to which there exists least conformity. The proponents of the mechanistic view concur that mechanisms consist of components, but they use different terminologies to classify the components, and some of them assign the components to different ontological kinds (whereas others are just not interested in metaphysical issues). For instance, Machamer et al. (2000) endorse the dualistic thesis that mechanisms are composed of entities and activities, which they conceive as two distinct ontological kinds. By contrast, Glennan (1996, 2002) characterizes mechanisms in a monist fashion, that is, as being constituted exclusively by entities that interact with each other and thereby change their properties. Other mechanists do not take a stand on this ontological dispute, but nevertheless draw the distinction between the spatial components of a mechanism and “what the spatial components are doing” or “the changes in which the spatial components are involved.” Moreover, these authors adopt a different terminology to describe this difference. Bechtel (2006, 2008), for example, speaks of component parts and component operations (or functions). We think that it is not necessary (although legitimate) to become engaged in the ontological dispute about whether mechanisms consist of components that belong to one or to two distinct ontological kinds. One can avoid this dispute and yet argue that the two concepts – be it entities and activities, entities and interactions, component parts and component operations, or whatever one likes – are descriptively adequate, that is, useful for representational purposes. When biologists represent mechanisms, they typically distinguish between the object itself (e.g., ribosome) and what the object is doing or the interactions in which the object is involved (e.g., binding, moving along the mRNA, releasing polypeptide).
Thus, one should account for this difference when one models biological mechanisms. This, however, leaves open the ontological question of whether activities can be reduced to property changes of entities (Footnote 13) or not. In sum, the third feature of mechanisms is that they are represented as having two kinds of components, entities and activities (or operations or interactions).

A fourth major characteristic of mechanisms is the importance of the spatial and temporal organization of their components for the functioning of the mechanism. Only if the components of a mechanism are organized in a specific way does the mechanism as a whole bring about the behavior in question. It is important to note that mechanisms are organized in a spatial as well as in a temporal manner. The spatial organization refers to the fact that certain entities are localized in certain regions of the mechanism, move from one region to another, and perform different activities in different regions. For instance, it is significant to the functioning of the mechanism of photosynthesis that the transport of electrons through the thylakoid membrane causes the transport of protons from the chloroplast stroma into the thylakoid lumen and that the resulting chemiosmotic potential is used for ATP synthesis by transporting the protons back into the stroma again. The temporal organization means that a mechanism is temporally divided into certain stages which have characteristic rates and durations as well as a particular order. Earlier stages give rise to later stages so that there exists a “productive continuity” (Machamer et al. 2000, p. 3) between the stages of a mechanism. In other words, the activities or interactions are “orchestrated” (Bechtel 2006, p. 33) such that they produce the phenomenon of interest. Consider the mechanism of photosynthesis again. This mechanism is also characterized by a specific sequence of activities. The first step is the absorption of a photon (by the photosystem II). This causes the excitation of an electron, which is followed by the transport of this electron down the electron transport chain. This transport brings about the transport of protons and so on.

At this point one could discuss further features of mechanisms, like the fact that most mechanisms produce a certain behavior in a regular way (given certain conditions) or that the components of mechanisms might be connected by a special kind of causal relations, namely, “productive causal relations” (Bogen 2008). However, these characteristics of mechanisms are far more controversial than the ones we have mentioned so far. This is why we do not take them for granted here. In what follows we examine the question of whether causal graph theoretical models of biological mechanisms are able to capture the major characteristics of mechanisms that we have presented in this section, namely, the multilevel character of mechanisms, their two kinds of components, and the spatial and temporal organization of their components. We do this by means of an extended analysis of an example from recent biological research. As announced before, the result of our analysis will be that causal graph theory succeeds in some respects while failing in others (Sect. 3.5). But first, we give a short introduction to the case study that we are concerned with (Sect. 3.4).

4 Feedback Inhibition of ACCase by 18:1-ACP in Brassica napus

Feedback inhibition is a common mode of metabolic control. Generally speaking, in feedback inhibition a product P produced late in a reaction pathway inhibits an enzyme E that acts earlier in the pathway and that transforms the substrate S into an intermediate product IP_1. Figure 3.2 illustrates this general connection.

Fig. 3.2 The general mechanism for feedback inhibition

Figure 3.2 shows that the substrate S is transformed in several steps into the product P (via the intermediate products IP_1, …, IP_n). As P accumulates, it slows down and finally switches off its own synthesis by inhibiting the regulatory enzyme E that often catalyzes the first committed step of the pathway. That way, feedback inhibition prevents the cell from wasting resources by synthesizing more P than necessary. Because enzyme activity can be rapidly changed by allosteric modulators, feedback inhibition of regulatory enzymes provides almost instantaneous control of the flux through the pathway.

Many instances of this general mechanism of feedback inhibition can be found in nature. In this chapter, we focus on an example from contemporary botanical research, namely, on the feedback regulation of fatty acid biosynthesis in canola (Brassica napus), which has only recently been identified by Andre et al. (2012) (Footnote 14). Fatty acid biosynthesis is a crucial process for both plants and animals, providing the cell with components for membrane biogenesis and repair and with energy reserves in specialized cells (such as epidermal cells or the cells of oilseeds). Since the need for fatty acids not only varies with the cell type but also depends on the stage of development, time of day, or rate of growth, fatty acid biosynthesis must be closely regulated to meet these changes. Although the biochemistry of plant fatty acid biosynthesis has been extensively studied (Footnote 15), comparatively little is known about its regulation and control (Ohlrogge and Jaworski 1997). However, knowing the mechanism of how fatty acid biosynthesis in plants is regulated is important, not least because it may give rise to the design of strategies for increasing fatty acid synthesis in plants (cf. Tan et al. 2011). This is particularly significant in light of the economic potential of genetically manipulated oil crops for improved nutritional quality or as renewable sources of petrochemical substitutes (Footnote 16).

The main aim of the experimental studies conducted by Andre et al. (2012) was to discover the feedback system that regulates the biosynthesis of fatty acids in the plastids of Brassica napus. The major results of their studies are twofold: first, they provide evidence for the hypothesis that plastidic acetyl-CoA carboxylase (in short, ACCase) is the enzymatic target of the feedback inhibition (i.e., the enzyme E that is inhibited). ACCase catalyzes the transformation of acetyl-CoA into malonyl-CoA. Second, their experiments indicate that the 18:1-acyl carrier protein (in short, 18:1-ACP) is the feedback signal, that is, the inhibitor of ACCase. On the basis of these findings, they proposed the mechanism for feedback inhibition of fatty acid synthesis in Brassica napus that is illustrated in Fig. 3.3.

Fig. 3.3 Mechanism for feedback inhibition of fatty acid synthesis in Brassica napus (Reproduced from Andre et al. 2012)

The mechanism for feedback inhibition that takes place in the plastid (depicted in the upper, inner box) can be characterized as an instance of the general mechanism presented in Fig. 3.2. The enzyme ACCase (E) converts the substrate acetyl-CoA (S) into the intermediate product malonyl-CoA (IP_1), which is then transformed into the product 18:1-ACP (P). If the concentration of 18:1-ACP increases, more and more 18:1-ACP molecules bind to ACCase molecules and inhibit them. This, in turn, slows down and finally switches off the synthesis of further 18:1-ACP.
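The qualitative behavior of this loop can be conveyed by a toy simulation. The following discrete-time sketch is purely illustrative: all rate constants and units are invented, and it is a qualitative caricature of the feedback loop, not the fitted kinetics of Andre et al. (2012).

```python
# Illustrative discrete-time sketch of the feedback loop: active ACCase
# drives 18:1-ACP synthesis, and accumulating 18:1-ACP binds and
# inhibits ACCase. All rate constants and units are invented; this is a
# qualitative toy model, not the fitted kinetics of Andre et al. (2012).
total_enzyme = 1.0    # E_active + E_P-bound (arbitrary units)
p = 0.0               # concentration of 18:1-ACP
synthesis = 0.5       # P produced per unit of active enzyme per step
binding = 2.0         # strength of P's inhibitory binding to ACCase
turnover = 0.1        # consumption/export of P per step

history = []
for t in range(200):
    e_bound = total_enzyme * binding * p / (1.0 + binding * p)  # inhibited
    e_active = total_enzyme - e_bound
    p = p + synthesis * e_active - turnover * p
    history.append(p)

# The negative feedback drives the concentration toward a bounded steady
# state instead of letting it grow without limit.
assert abs(history[-1] - history[-2]) < 1e-6
assert history[-1] < 5.0
print(round(history[-1], 2))
```

The point of the sketch is only the shape of the dynamics: as P accumulates, the bound fraction of the enzyme grows, synthesis slows, and the concentration settles at a bounded level, which is exactly the regulatory behavior the mechanism is credited with.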

5 Modeling the Mechanism for Feedback Inhibition

The mechanism presented in the previous section can be characterized as bringing about the regulation of the synthesis of 18:1-ACP (which is a fatty acid). One way to characterize this phenomenon in more detail is to specify it quantitatively: the concentration of 18:1-ACP is regulated such that it very likely does not reach a certain upper bound b (i.e., the probability for a concentration of 18:1-ACP lower than b is greater than a certain defined probability threshold r). Figure 3.4 shows an illustration.

Fig. 3.4 The explanandum phenomenon of 18:1-ACP regulation [The dots stand for the 18:1-ACP concentrations (C_18:1-ACP) measured over time (t). (To be precise, the empirical data that biologists actually gather are not concentrations. Rather, they measure, for instance, optical densities (in spectrophotometric studies) and then draw inferences from the density values about the concentrations.) More than r (95 % in this example) of the 18:1-ACP concentrations measured so far do not exceed b]

5.1 A Causal Graph Theoretical Model of the Mechanism for Feedback Inhibition

How can the mechanism that brings about the regulation of fatty acid synthesis (more precisely, the regulation of the synthesis of 18:1-ACP) be represented within a causal graph framework? First, we need to introduce a variable P, standing for the concentration of the product 18:1-ACP (Footnote 17). P shall be a discrete variable fine-grained enough to correspond to the given measurement accuracy. The phenomenon may then be described as Pr(p ≤ b) > r.
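How the condition Pr(p ≤ b) > r is checked against data can be sketched in a few lines. The measured concentrations, the bound b, and the threshold r below are all invented numbers, not data from Andre et al. (2012); the relative frequency stands in for the objective probability, in line with the interpretation of Pr adopted in Sect. 3.2.1.

```python
# Hypothetical sketch of the quantitative explanandum Pr(p <= b) > r:
# the measured 18:1-ACP concentrations, the bound b, and the threshold
# r below are all invented, not data from Andre et al. (2012).
measurements = [0.8, 1.1, 0.9, 1.3, 1.0, 0.7, 1.2, 0.95, 1.05, 2.4]
b, r = 2.0, 0.85

# Relative frequency of measurements that do not exceed the bound b,
# taken as an estimate of the objective probability Pr(p <= b).
freq_below_b = sum(1 for c in measurements if c <= b) / len(measurements)
assert freq_below_b > r  # the phenomenon obtains in this toy data set
print(freq_below_b)  # 9 of the 10 invented values stay below b
```
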

Furthermore, the concentration of the substrate acetyl-CoA (represented by variable S) is causally relevant for the 18:1-ACP concentration P: the higher the concentration of acetyl-CoA is, the higher will be the probability for higher 18:1-ACP concentrations. Another factor that is causally relevant for the 18:1-ACP concentration is the concentration of the regulatory enzyme ACCase. Here we have to distinguish between active enzymes and enzymes which bind the product 18:1-ACP (at the effector interaction site). We represent the former by the variable E_active and the latter by the variable E_P-bound. While the concentration of active enzymes is causally relevant to the concentration of the product 18:1-ACP (the higher E_active’s value, the higher the 18:1-ACP concentration), the 18:1-ACP concentration is causally relevant to the concentration of P-bound enzymes (the higher P’s value, the higher E_P-bound’s value) which is, again, causally relevant to the concentration of active enzymes (the higher E_P-bound’s value, the lower E_active’s value), etc. The negative causal influence of E_P-bound on E_active represents the fact that the binding of 18:1-ACP molecules to active ACCases causes the inhibition of the ACCases (i.e., the ACCases becoming inactive), and the negative causal influence of E_active on S stands for the fact that many active enzymes decrease the amount of the substrate acetyl-CoA. According to these considerations, we may illustrate the mechanism by the causal graph depicted in Fig. 3.5.

Fig. 3.5 Static cyclic CM of the mechanism for feedback inhibition [S and E_active are direct causes of P. P is a direct cause of E_P-bound, which is a direct cause of E_active, which is, again, a direct cause of S and P, etc. Direct causal influences are represented by arrows. A plus (“+”) above an arrow stands for a positive causal influence (i.e., high cause values lead to high effect values), and a minus (“−”) stands for a negative causal influence (i.e., high cause values lead to low effect values)] (One might object that this causal graph is inadequate because it contains two variables that are analytically dependent, namely, E_P-bound and E_active. We do not think that this is the case. E_P-bound and E_active are analytically independent variables because there is a temporal distance between the binding of P to E and the inactivation of E (i.e., the conformational change of the substrate binding site). In other words, the binding of P to E and the inactivation of E are not the same processes occurring at the same time; rather, the former causes the latter. This is also why there exists a submechanism that specifies this causal relation)

To get a causal model, we have to supplement the causal graph depicted in Fig. 3.5 with a probability distribution Pr over the variable set V = {S, P, Eactive, EP-bound}. Pr will imply that the probability of P ≤ b is greater than r (this is the phenomenon the mechanism brings about). The probabilities Pr will correspond to the positive/negative causal influences as described above. So the probability for high P-values, for example, will be high given high S- and Eactive-values, and low given low S- or Eactive-values.
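For readers who want to experiment with such models, the causal structure just described can be encoded as a signed directed graph. The following Python sketch (the encoding and variable names are ours, not part of the original model) writes down the edges of Fig. 3.5 and checks that the graph is cyclic:

```python
# Signed causal graph of the static cyclic CM (Fig. 3.5).
# "+" : high cause values make high effect values more probable;
# "-" : high cause values make low effect values more probable.
edges = {
    ("S", "P"): "+",               # substrate raises product concentration
    ("Eactive", "P"): "+",         # active enzyme raises product concentration
    ("Eactive", "S"): "-",         # active enzymes consume the substrate
    ("P", "EP_bound"): "+",        # product binds to the enzyme
    ("EP_bound", "Eactive"): "-",  # bound enzymes are inactivated
}

def has_cycle(edge_set):
    """Detect a directed cycle via depth-first search."""
    graph = {}
    for a, b in edge_set:
        graph.setdefault(a, []).append(b)
    visited, on_stack = set(), set()

    def dfs(v):
        visited.add(v)
        on_stack.add(v)
        for w in graph.get(v, []):
            if w in on_stack or (w not in visited and dfs(w)):
                return True
        on_stack.discard(v)
        return False

    return any(dfs(v) for v in list(graph) if v not in visited)

print(has_cycle(edges))  # True: P -> EP_bound -> Eactive -> P
```

The detected cycle is exactly the feedback loop of the inhibition mechanism, and it is the source of the parameter problem for cyclic CMs discussed below.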

The probabilities Pr are interpreted as inductively inferred limit tendencies of the observed frequencies of the diverse concentrations, as they are found under normal conditions. These normal conditions can be captured by adding a context C = c. This context is simply an instantiation of a variable or a set of variables which stand for the typical experimental setup and are not (or only slightly) changed during measuring or manipulating S, P, Eactive, or EP-bound. With regard to our case study, the context C = c will include a certain temperature (or range of tolerable temperatures), a particular level (or tolerable range) of salinity, and a certain pH value (or range of tolerable pH values). The conditional probabilities along the causal arrows should correspond to the causal strengths of the variables’ direct causes in context C = c.

Here we can observe the first problem of our causal model: while the parameters of a causal model are uniquely defined in an acyclic CM, this is not the case in cyclic CMs. This is a problem when it comes to explaining or predicting certain phenomena. We typically explain or predict a variable X’s taking value x by means of this variable’s direct or indirect causes and its parameters (or the parameters of the variables lying between X and its indirect causes). So we explain or predict X = x by reference to X’s causes and only to X’s causes, not to X’s effects. But in our cyclic CM, some variables’ causes are also their effects. P, for example, is both a cause and an effect of EP-bound. So conditioning on P does not correspond to the probabilistic influence of EP-bound’s direct causes alone, but rather to a mixture of the probabilistic influences one gets from EP-bound’s direct causes and some of its effects. In other words, conditioning on P does not give us the probabilistic influence of P on EP-bound transported only over the path P → EP-bound, but the mixed probabilistic influence of P transported over P → EP-bound and EP-bound → Eactive → P. A second problem of our causal model is that it does not capture the dynamic aspect of mechanisms: it does not show how the parts of the mechanism described influence each other over a period of time. A third deficit of our causal model is that it does not represent any hierarchic organization, that is, it does not account for the fact that mechanisms are often embedded in higher-level mechanisms and have parts that are (sub)mechanisms themselves (see Sect. 3.3). The above model just describes the causal relations that are responsible for bringing about the behavior of the mechanism, that is, it refers only to causes at one and the same ontological level and therefore (even if the first problem did not exist) does not, strictly speaking, allow for interlevel explanation/prediction.
In order to cope with these three problems, in the next two subsections, we expand our causal model that represents the mechanism for feedback inhibition of fatty acid synthesis in Brassica napus.

5.2 Dynamic Causal Models

The first two problems discussed in the last section can be solved by unrolling the causal model over a period of time and thereby constructing a dynamic CM.Footnote 18 In doing so, we quite plausibly presuppose that causal influences need some time to spread and do not occur instantaneously. We get a dynamic CM if we add time indices to the variables of our system V = {S,P,Eactive,EP-bound}, representing the mechanism’s diverse stages. By presupposing that causal influences need some time to take place, we can generate the dynamic CM whose causal graph is depicted in Fig. 3.6 (for five stages) on the basis of our static CM in Sect. 3.5.1.

Fig. 3.6

The causal graph of a five-stage dynamic CM representing the mechanism for feedback inhibition in Brassica napus

The dashed arrows transport probabilistic influences (the substrate concentration Si, for instance, is always probabilistically relevant to the substrate concentration at the next stage) in exactly the same way as their non-dashed counterparts. The only difference is that we interpret continuous arrows as direct causal connections while we want to leave it open whether the dashed arrows represent such causal connections. Dashed arrows could, for example, also be interpreted as analytic dependencies.Footnote 19 The variables of the five stages together with the continuous and the dashed arrows constitute the dynamic CM’s causal graph.

The corresponding static CM’s topological structure can be read off from the dynamic CM. One just has to abstract from the diverse stages of the dynamic CM and look at the continuous arrows: there has to be an arrow from S to P, from P to EP-bound, from EP-bound to Eactive, and from Eactive to S and to P in the corresponding static CM, and these all have to be causal arrows in this static CM.
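Both the unrolling of the static CM and the reading-off just described are mechanical operations. The following Python sketch (the encoding is ours; as Fig. 3.6 suggests, we assume that every variable sends a dashed arrow to its own next-stage copy) constructs the time-indexed edge list of a five-stage dynamic CM from the static edge list:

```python
def unroll(static_edges, variables, stages):
    """Unroll a static (possibly cyclic) causal graph into a dynamic one.
    Every causal arrow X -> Y becomes a solid arrow X_t -> Y_{t+1}, and
    every variable sends a dashed arrow to its own next-stage copy."""
    dynamic = []
    for t in range(1, stages):
        for a, b in static_edges:
            dynamic.append((f"{a}_{t}", f"{b}_{t + 1}", "solid"))
        for v in variables:
            dynamic.append((f"{v}_{t}", f"{v}_{t + 1}", "dashed"))
    return dynamic

static_edges = [("S", "P"), ("Eactive", "P"), ("Eactive", "S"),
                ("P", "EP_bound"), ("EP_bound", "Eactive")]
variables = ["S", "P", "Eactive", "EP_bound"]
dynamic = unroll(static_edges, variables, stages=5)

# Every edge points from stage t to stage t+1, so the dynamic graph is
# acyclic even though the static graph contains a feedback loop.
print(len(dynamic))  # 36 edges: (5 solid + 4 dashed) per stage transition
```

Dropping the time indices from the solid edges recovers exactly the static arrows listed above.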

Note that the time intervals between two stages of a dynamic CM should be suitably chosen. On the one hand, if they are too small, then the causal influence may not have enough time to spread from the cause to the effect variable and correlations between causes and effects will get lost. On the other hand, these intervals should not be too large, either. This may lead to violations of very basic causal intuitions. To give an example, suppose the causal model in Fig. 3.6 shows the correct causal structure of the mechanism for feedback inhibition of fatty acid synthesis in Brassica napus. Then S is an indirect but not a direct cause of EP-bound. S’s causal influence on EP-bound is mediated via P. But if the interval between two stages were too large, say, for example, it were chosen such that stage 3 in the dynamic CM in Fig. 3.6 would be the next stage after stage 1, then S and EP-bound would be correlated and this correlation would not break down under conditionalization on the intermediate cause P. Thus, conditioning on an effect’s direct causes would not screen it off from its indirect causes.
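The screening-off condition mentioned here can be checked numerically. The following toy computation (binary variables and invented probabilities, for illustration only) factorizes a chain S → P → EP-bound according to CMC and confirms that conditioning on the intermediate cause P screens the effect off from S:

```python
from itertools import product

def joint(s, p, e):
    """Joint probability of a binary chain S -> P -> E (E for EP-bound),
    factorized according to CMC; all numbers are invented."""
    p_s = 0.5                 # P(S = 1)
    p_p = 0.9 if s else 0.2   # P(P = 1 | S = s)
    p_e = 0.8 if p else 0.1   # P(E = 1 | P = p)
    return ((p_s if s else 1 - p_s)
            * (p_p if p else 1 - p_p)
            * (p_e if e else 1 - p_e))

def prob(pred):
    """Sum the joint distribution over all states satisfying pred."""
    return sum(joint(s, p, e)
               for s, p, e in product((0, 1), repeat=3) if pred(s, p, e))

# P(E = 1 | P = 1, S = 1) and P(E = 1 | P = 1, S = 0) coincide:
a = prob(lambda s, p, e: e and p and s) / prob(lambda s, p, e: p and s)
b = prob(lambda s, p, e: e and p and not s) / prob(lambda s, p, e: p and not s)
print(round(a, 6), round(b, 6))  # 0.8 0.8 -- S is screened off from E by P
```

If P were sampled only at a much later stage (the too-large-interval case), the measured P would no longer be the mediating cause and this equality would fail.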

Dynamic CMs have some advantages over static CMs. First of all, they are acyclic CMs, and, thus, we can use the same methods as in BNs to compute the probabilities we are interested in. Furthermore, CMC holds and the causal model’s parameters are uniquely defined. So we know the causal strengths of a variable’s causes, and we can thus use dynamic CMs to explain certain phenomena which can be described by means of endogenous variables. So the first problem discussed in Sect. 3.5.1 can be solved: we can generate explanations and predictions by referring to the causes of the event of interest and to the probabilistic influence of these causes on this event. In addition, we can predict the probabilities of certain effects of interventions. We can, for example, predict the probability of certain P-concentrations at stage 5 given certain S- and Eactive-concentrations at stage 1 when we change the concentration of S in a certain way at stage 3 via manipulation. The second problem can also be solved: the dynamic CM tells us how the parts of the mechanism described influence each other over a period of time, and we can thus also make predictions about what will (most likely) happen at later stages of the mechanism when we manipulate certain variables at earlier stages. Another nice feature of dynamic CMs is that, provided the time intervals between the diverse stages of the mechanism are suitably chosen, standard methods can be used for causal discovery, because CMC holds for dynamic CMs. Causal discovery is still a serious problem for cyclic CMs; there are only a few algorithms, and these, in general, do not lead to very detailed causal information (cf. Richardson 1996; Spirtes 1995). The third problem, however, still remains: our dynamic CM captures only causal information at one and the same ontological level and thus does not allow for interlevel mechanistic explanation, manipulation, and prediction.
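To illustrate the kind of stagewise prediction under intervention just mentioned, here is a deliberately crude sketch that propagates "concentrations" through the stages of the dynamic CM. The update rule and all rate constants are invented for illustration; they are not the real kinetics of fatty acid synthesis:

```python
def step(state):
    """One stage-to-stage update of the dynamic CM; made-up linear rates."""
    S, P, Ea, Eb = state["S"], state["P"], state["Ea"], state["Eb"]
    return {
        "S":  S - 0.3 * S * Ea + 0.2,      # consumption by active enzyme + influx
        "P":  P + 0.3 * S * Ea - 0.1 * P,  # synthesis minus binding/turnover
        "Ea": Ea - 0.1 * P + 0.05 * Eb,    # inhibition by product, slow recovery
        "Eb": Eb + 0.1 * P - 0.05 * Eb,    # product-bound (inactive) enzyme
    }

def predict_P_at_stage5(intervene_on_S_at_3):
    state = {"S": 1.0, "P": 0.0, "Ea": 1.0, "Eb": 0.0}  # stage-1 values
    for stage in range(2, 6):            # compute stages 2, 3, 4, 5
        state = step(state)
        if stage == 3 and intervene_on_S_at_3:
            state["S"] += 1.0            # manipulation: add substrate at stage 3
    return state["P"]

# Adding substrate at stage 3 raises the predicted 18:1-ACP level at stage 5:
print(predict_P_at_stage5(True) > predict_P_at_stage5(False))  # True
```

A full probabilistic treatment would attach conditional distributions rather than deterministic updates, but the stage-by-stage propagation works the same way.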

5.3 Hierarchically Ordered Causal Models

There are at least two possibilities for representing the hierarchic organization of mechanisms within causal graph theory, that is, for solving the third problem that we mentioned at the end of Sect. 3.5.1. Each of these approaches has its own merits and deficits. One of these possibilities is developed in detail in Casini et al. (2011), who provide a quite powerful formalism. They propose to start by representing a mechanism’s top level with a causally interpreted BN. Such a BN’s variable set V may then contain some so-called network variables. These are variables whose values are BNs themselves. Network variables (or, more precisely, the BNs which are their possible values) are intended to represent the possible states (e.g., “functioning” and “malfunctioning”) of a mechanism’s submechanisms. These BNs’ variable sets may then themselves contain network variables which stand for the possible states of a submechanism’s submechanisms, and so on. To connect the diverse levels of the mechanism represented by such BNs, Casini et al. suggest an additional modeling assumption: the recursive causal Markov condition (RCMC). Whenever this condition holds, Casini et al.’s formalism allows for probabilistic reasoning across the diverse levels of the represented mechanism.

In this chapter, we can discuss Casini et al.’s (2011) approach only very briefly; for a detailed discussion of their formalism, see Gebharter (forthcoming). Though their formalism is definitely powerful, their crucial modeling assumption RCMC is quite controversial. First of all, it is neither obvious that RCMC holds in general, nor is it clear how one could distinguish cases in which it holds from cases in which it does not. Secondly, RCMC leads to counterintuitive consequences. We have the strong intuition that learning information about a mechanism’s microstructure should at least sometimes lead to better (or at least different) predictions of the phenomena this mechanism will bring about. This should be the case, for example, when the macro-variable representing the possible states of the mechanism is quite coarse-grained, while more and more knowledge about the mechanism’s microstructure is collected. But, according to RCMC, a mechanism’s micro-variables are probabilistically screened off from its macro-variables whenever the state of the submechanism represented by a network variable is known. A third deficit of Casini et al.’s approach is that it does not provide any information about how a submechanism’s microstructure is connected to the macrostructure of the overlying mechanism, that is, about how exactly changes of some of the submechanism’s micro-variables’ values influence the mechanism’s macro-variables due to probabilistic influences transported over its causal microstructure. Such information is crucial when it comes to the question of how macro-phenomena can be controlled by manipulating some of their underlying mechanisms’ micro-variables.

In what follows we sketch an alternative approach for representing the hierarchic structure of mechanisms which avoids these problems. According to our approach, the submechanisms that a particular mechanism contains are, at least in most cases, adequately represented not via network variables, as Casini et al. (2011) propose, but via causal arrows. We will illustrate this claim on the basis of the case study that we have already introduced, namely, the mechanism for feedback inhibition of fatty acid synthesis in Brassica napus. This mechanism can be modeled within a causal graph framework as described in Sect. 3.5.1. An example of a submechanism of this mechanism is the mechanism for allosteric inhibition. This submechanism specifies the causal arrow between the variables EP-bound and Eactive (see Fig. 3.7). That is, it describes how exactly the binding of the product 18:1-ACP (i.e., P) to the regulatory enzyme ACCase (i.e., Eactive) causes the inhibition or inactivation of ACCase, with the effect that ACCase can no longer bind the substrate acetyl-CoA (i.e., S) and convert it into 18:1-ACP. In other words, this submechanism discloses why it is the case that the higher the concentration of 18:1-ACP, the lower the concentration of active ACCase.

Fig. 3.7

Static CM of the phenomenon that is brought about by the submechanism for allosteric inhibition

But how can such a submechanism be modeled within a causal graph framework, and how can it be related to the mechanism for feedback inhibition of which it is a part? In order to address these questions, we need to go into more scientific detail. Unfortunately, the biochemical submechanism that explains how the binding of 18:1-ACP to the enzyme ACCase (EP-bound) causes the inhibition of ACCase (Eactive) in Brassica napus has not been discovered yet (Andre et al. 2012). The same is true for the biochemical inhibition mechanisms in other species, for instance, in Escherichia coli (Heath and Rock 1995; Davis and Cronan 2001). However, in order to get an idea of what the model of the submechanism might look like, we will consider a different but analogous example, in which extensive molecular and structural studies have been carried out to unravel the biochemical mechanism of inhibition. In their recent work, Ganesan et al. (2009) investigated a different feedback system, namely, the allosteric inhibition of the enzyme serine protease (more precisely, of hepatocyte growth factor activator, in short “HGFA”) by an antibody (Ab40). Their goal was to unravel the molecular details of this inhibition mechanism. That is, they aimed at characterizing the molecular interactions and conformational changes that are caused by the binding of Ab40 (in general terms, of product P) to the effector interaction site of the enzyme HGFA (in general terms, to enzyme E) and that bring about the inhibition or deactivation of HGFA. Their work is very useful for our analysis because, on an abstract level, Ganesan et al. (2009) were interested in discovering the same submechanism as the one we singled out above, namely, the submechanism that explains how the binding of P to E causes the inhibition of E, in other words, why it is the case that the higher EP-bound’s value, the lower Eactive’s value.

The exact route by which the amino acids that compose E transmit the allosteric effect, that is, the intermediate steps by which the binding of P to the remote effector interaction site of E causes the altered catalytic activity of E, is in general very poorly understood (Sot et al. 2009). However, the structural and kinetic studies that Ganesan et al. (2009) performed provide some relief. One of their main results is that the binding of Ab40 (i.e., P) to the effector interaction site of HGFA (i.e., E) is accompanied by a major structural change (called the “allosteric switch”; Ganesan et al. 2009, p. 1620), namely, the movement of a certain part of the enzyme, the 99-loop, from the competent into the noncompetent conformation. This, in turn, obstructs the binding of the substrate to the enzyme E; more precisely, it causes a steric clash between P2-Leu and the S2 subsite of E and the loss of stabilizing interactions between P4-Lys and the S4 subsite of E. The diagram in Fig. 3.8 provides a general illustration of these changes (while leaving out most of the molecular details).

Fig. 3.8

Qualitative model of the mechanism for allosteric inhibition of HGFA by Fab40 (Fab40 is a special type of Ab40) (Adapted from Ganesan et al. 2009. With permission from Elsevier)

The molecular interactions could be described in far more detail. However, the foregoing description suffices for our purposes. How can this submechanism for allosteric inhibition of HGFA by Ab40 be modeled in a causal graph framework? We propose to model the submechanism with a static CM containing the variables and causal topology depicted in Fig. 3.9.

Fig. 3.9

Static CM of the submechanism for allosteric inhibition of HGFA by Fab40

The first thing to note is that B, 99-loop, S2, and S4 are binary (and, thus, qualitative) variables. B can take one of the two values “bindings between functional groups of Ab40 and the effector interaction site of HGFA are established” and “bindings between functional groups of Ab40 and the effector interaction site of HGFA are not established.” 99-loop can take one of the two values “being in the competent state” and “being in the noncompetent state.” S2 can take one of the two values “having an ideally shaped hydrophobic pocket to recognize P2-Leu” and “having a deformed pocket so that P2-Leu cannot be recognized.” S4 can take one of the two values “being able to perform stabilizing interactions to P4-Lys” and “being unable to perform stabilizing interactions to P4-Lys.” This model describes that if bindings between functional groups of Ab40 and the effector interaction site of HGFA are established, then the probability is high that 99-loop is in its noncompetent state, which is why the probability is high that S2 has a deformed pocket so that P2-Leu cannot be recognized and that S4 is unable to perform stabilizing interactions to P4-Lys. On the higher level, we would say that if P (Ab40) binds to E (HGFA), this submechanism brings about the behavior that E (HGFA) is inactive (which means, on the lower level, that the two amino acids P2-Leu and P4-Lys of the substrate cannot bind to the substrate binding sites S2 and S4 of the enzyme HGFA).
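A probabilistic version of this model is easy to write down. In the sketch below, all conditional probabilities are invented for illustration; only the qualitative direction of influence (binding makes the noncompetent state likely, which in turn makes the deformation of the substrate binding sites likely) comes from the description above:

```python
# Toy CPTs for the submechanism B -> 99-loop -> {S2, S4} of Fig. 3.9.
# All numbers are invented; value 1 means "bindings established",
# "noncompetent state", and "S2 pocket deformed", respectively.
def p_loop(b):            # P(99-loop noncompetent | B = b)
    return 0.95 if b else 0.05

def p_s2(loop):           # P(S2 pocket deformed | 99-loop = loop)
    return 0.9 if loop else 0.05

def p_s2_given_b(b):
    """Marginalize the 99-loop out of the chain B -> 99-loop -> S2."""
    return sum((p_loop(b) if loop else 1 - p_loop(b)) * p_s2(loop)
               for loop in (0, 1))

print(round(p_s2_given_b(1), 4))  # 0.8575: binding makes deformation likely
print(round(p_s2_given_b(0), 4))  # 0.0925
```

The analogous computation for S4 differs only in the leaf CPT.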

We are aware of the fact that it is very unlikely that the biochemical submechanism for the inhibition of ACCase by 18:1-ACP in Brassica napus looks exactly like the submechanism for the inhibition of HGFA by Ab40 which we just described. There are too many molecular differences between the two enzymes and the two inhibitory products. However, for the sake of the argument, suppose that in the case of the inhibition of ACCase, too, the binding of 18:1-ACP causes the movement of some part X of the enzyme from a competent state into a noncompetent state. Suppose further that this allosteric switch brings about certain molecular and conformational changes in two substrate binding sites S2 and S4 of the enzyme ACCase, which prevent the substrate from binding to the enzyme. A static CM of this hypothetical submechanism would look like the one in Fig. 3.10.

Fig. 3.10

Static CM of the hypothetical submechanism for allosteric inhibition of ACCase by 18:1-ACP (The corresponding possible values of the variables are the following: B can take one of the two values “bindings between functional groups of 18:1-ACP and the effector interaction site of ACCase are established” and “bindings between functional groups of 18:1-ACP and the effector interaction site of ACCase are not established.” X can take one of the two values “being in the competent state” and “being in the noncompetent state.” S2 and S4 can take one of the two values “having an ideal conformation that allows its binding to a certain part of 18:1-ACP” and “having a deformed conformation that inhibits its binding to a certain part of 18:1-ACP.”)

On this basis we can now tackle the crucial question of how the model of the mechanism for feedback inhibition, which we developed in Sect. 3.5.1, and the model of one of its submechanisms, namely, of the biochemical mechanism of allosteric inhibition, can be related within a causal graph framework. We propose to model the hierarchic order of this multilevel mechanism by means of a hierarchic static causal model with the topological structure depicted in Fig. 3.11.

Fig. 3.11

Hierarchic static CM of the mechanism for feedback inhibition and of one of its submechanisms, namely, the biochemical mechanism for allosteric inhibition

The two-headed arrows between EP-bound and B, as well as between S2 and Eactive and between S4 and Eactive, which connect the two levels of the two mechanisms, do not stand for causal relations, but rather for constitutive relevance relations, for instance, in the sense of Craver (2007). Hence, they transport probabilistic dependencies and the effects of manipulations in both directions, in the same way as direct causal loops in static CMs. Note that the causal arrow EP-bound → Eactive in our original static CM (Fig. 3.5) disappeared in the hierarchic causal model. It is replaced by the underlying mechanism of this causal arrow, that is, by a causal structure whose input and output variables are connected to EP-bound and Eactive, respectively, via constitutive relevance relations. Also note that it is not clear how the submechanism represented by EP-bound → Eactive could be analyzed in Casini et al.’s (2011) approach. They would need to add a network variable N between EP-bound and Eactive (EP-bound → N → Eactive). But then, because there is no intermediate (macro-level) cause between EP-bound and Eactive, it is unclear what this network variable N should represent at the mechanism’s macro-level.

Our hierarchic static CM can be used for mechanistic reasoningFootnote 20 across diverse levels. In contrast to Casini et al.’s (2011) models, our model also tells us how exactly probabilistic influence between macro-variables is transported over the underlying mechanism’s causal microstructure and how exactly (i.e., over which causal and/or constitutive relevance paths) manipulations of micro-variables influence certain macro-variables. For example, if we manipulate S4, this will change Eactive and S2 because S4 and S2 are constitutively relevant for Eactive. Since X is a direct cause of S4, changing S4 will, on the other hand, not have a direct influence on X’s value. But changing S4 will nevertheless have a quite indirect influence on X: a change of S4’s value will have an influence on Eactive’s value at the macro-level, which influences its macro-level effect EP-bound. Since B is constitutively relevant for EP-bound, EP-bound-changes will lead to B-changes, which will, since B is a direct cause of X at the micro-level, lead to certain X-changes.
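This path talk can be made precise with a small search over the mixed graph of Fig. 3.11. In the sketch below (the encoding is our own), causal arrows carry influence only forward, while constitutive relevance arrows carry it in both directions; enumerating the repetition-free routes from S4 to X recovers exactly the indirect influence just described:

```python
# Mixed graph of the hierarchic static CM (Fig. 3.11), our encoding.
causal = [("S", "P"), ("Eactive", "P"), ("Eactive", "S"), ("P", "EP_bound"),
          ("B", "X"), ("X", "S2"), ("X", "S4")]        # one-way influence
constitutive = [("EP_bound", "B"), ("S2", "Eactive"), ("S4", "Eactive")]

def neighbors(v):
    """Variables that a manipulation of v can reach in one step."""
    out = [b for a, b in causal if a == v]   # forward along causal arrows
    for a, b in constitutive:                # both ways along two-headed arrows
        if a == v:
            out.append(b)
        if b == v:
            out.append(a)
    return out

def influence_paths(src, dst, path=None):
    """Enumerate repetition-free routes from src to dst."""
    path = path or [src]
    if src == dst:
        yield path
        return
    for w in neighbors(src):
        if w not in path:
            yield from influence_paths(w, dst, path + [w])

for route in influence_paths("S4", "X"):
    print(" -> ".join(route))
# S4 -> Eactive -> P -> EP_bound -> B -> X
# S4 -> Eactive -> S -> P -> EP_bound -> B -> X
```

Both routes first move from S4 up to the macro-level via constitutive relevance and only then descend, via B, back to the micro-level cause X.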

Though such hierarchic models as the one depicted in Fig. 3.11 can be used for probabilistic reasoning across a mechanism’s diverse levels, they cannot generally be used for explanation and prediction. The reason is the same as in the case of static CMs, as illustrated in Sect. 3.5.1: a certain EP-bound-value, for example, can be explained or predicted only via reference to EP-bound’s causes, for example, P. But in our hierarchic static CM, P influences EP-bound not only as a cause but also as an effect: P influences EP-bound not only over the path P → EP-bound but also over the paths P ← Eactive ↔ S2 ← X ← B ↔ EP-bound and P ← Eactive ↔ S4 ← X ← B ↔ EP-bound (where “↔” stands for a two-headed constitutive relevance arrow). So the probabilistic influence of P on EP-bound does not correspond to P’s causal influence on EP-bound alone. We can solve this problem by rolling out our hierarchic model over time, as we have already done for our original static CM in Sect. 3.5.2. Figure 3.12 illustrates the result of this procedure.

Fig. 3.12

Hierarchic dynamic causal model of the mechanism for feedback inhibition and the biochemical mechanism for allosteric inhibition

Note that, while causal influences need some time to spread, value changes produced by constitutive relevance relations occur instantaneously. Because of this, the two-headed dashed arrows representing such constitutive relevance relations only connect variables at one and the same stage. This also corresponds to the fact that one cannot change one of two constitutively dependent variables without changing the other. Note also that the causal arrows from EP-bound to Eactive disappeared in the hierarchic dynamic CM. This is because these arrows represented a submechanism at work which is explicated in more detail in the hierarchic dynamic CM – the hierarchic dynamic CM tells us exactly (and, in contrast to our original dynamic CM developed in Sect. 3.5.2, in a mechanistic way)Footnote 21 how EP-bound influences Eactive and thus finally solves problem three, too: hierarchic dynamic CMs allow for probabilistic interlevel explanation and prediction of certain Eactive-values. Certain Eactive-values, for instance, can be mechanistically explained or predicted by certain EP-bound-values: EP-bound at stage 1 has some influence on its constitutive part B at stage 1. B at stage 1 causes X at the micro-level at stage 1.5 which causes S2 and S4 at the micro-level at stage 2, and, since S2 and S4 are constitutively relevant for Eactive, they have a direct probabilistic influence on Eactive at stage 2.

One could object that, since the two-headed dashed arrows in our hierarchic dynamic CM transport the influences of interventions in both directions, CMC does not hold in such models and, hence, they should have the same problems as static CMs when it comes to explanation and prediction. The first point of this objection is definitely true: CMC does not hold for hierarchic dynamic CMs.Footnote 22 However, this does not lead to the suspected consequence. The problem for explanation and prediction in static CMs was that the probabilities one gets when conditioning on some variables also carry information which could only be obtained if one also knew these variables’ effects (in other words, probabilistic information is transported not only over cause paths but also over effect paths). But the events that we want to explain do not occur because some of their effects occurred (i.e., because these effects had a probabilistic influence on them), and events we want to predict cannot be predicted via reference to some of their effects (which have not occurred yet). However, this problem does not arise for hierarchic dynamic CMs. In a hierarchic dynamic CM, cycles appear only due to constitutive relevance relations within certain stages, and, thus, conditioning on a variable’s causes provides only probabilistic information about this variable’s values transported over cause or constitutive relevance paths. It never provides probabilistic information transported over an effect path.

6 Merits and Limits of Causal Graph Theoretical Models

On the basis of the preceding analysis, we can now approach the question of whether causal graph theory is suited for modeling biological mechanisms and what the advantages and shortcomings of representing mechanisms within a causal graph framework are. In the previous literature the concern has been raised that, even if it is possible to provide causal graph theoretical models of biological mechanisms, they are deficient because they fail to comprise some important kinds of information. Along these lines, for instance, Weber (2012) argues that because causal graph theoretical models only encompass sets of variables and relations of causal dependence, they fail to include information about the structure of biological entities (such as information about the DNA double helix topology and the movements undergone by a replicating DNA molecule) and about their spatiotemporal organization. However, claims like these remain on a quite general level. Our goal in this section is to use the results of our analysis of the case study in the previous section in order to assess and to specify these claims. We do so by pointing out which kinds of information about biological mechanisms cannot, or can only insufficiently, be represented within a causal graph framework and what the reasons for these failures are. In addition to revealing the limitations of causal graph theoretical models of mechanisms, we also highlight the virtues they have with respect to certain scientific purposes.

To begin with, recall the major characteristics of biological mechanisms that we identified in Sect. 3.3. First, mechanisms possess a multilevel character, which means, on the one hand, that there exists a part-whole relation between the mechanism and its components and, on the other hand, that mechanisms frequently occur in nested hierarchies. Second, mechanisms are represented as having two different kinds of components: entities (having particular properties) and activities (or interactions, operations, etc.). Finally, a mechanism brings about a specific behavior only if its components are spatially and temporally organized in a certain way. Can all three of these features of biological mechanisms be adequately represented by causal graph theoretical models?

Consider first the multilevel character of mechanisms. As we have shown in the previous section, the fact that many mechanisms occur in nested hierarchies (i.e., that they are embedded in higher-level mechanisms and have components that are themselves submechanisms) can be represented in at least two ways. On the one hand, one can represent a mechanism’s submechanisms by so-called network variables, as, for instance, Casini et al. (2011) do. We, on the other hand, think that there are good reasons for representing such submechanisms by causal arrows between variables X and Y. In our approach one can generate a hierarchic causal model by replacing such a causal arrow by another causal structure. This causal structure should be on a lower ontological level than X and Y, it should contain at least one constitutively relevant part of X and at least one of Y, and there should be at least one causal path going from the former to the latter at the micro-level. Such hierarchic models allow, in contrast to purely qualitative models, for probabilistic mechanistic reasoning across different levels. Hierarchic dynamic CMs even allow for probabilistic mechanistic interlevel explanation and prediction. Contrary to Casini et al.’s models, they can also provide detailed information about how certain causal influences at the macro-level are realized by their underlying causal influences propagated at the micro-level. This is important when it comes to questions about how certain manipulations of macro- or micro-variables influence certain other macro- or micro-variables of interest and how a mechanism’s causal microstructure is connected to its macrostructure.

Let us now turn to the second feature of mechanisms. Do causal graph theoretical models succeed in representing mechanisms as being composed of two different kinds of components, namely, entities and activities (or operations, interactions, etc.)? It is quite clear that causal models represent entities. More precisely, the individuals in the domains DX1,…,DXn of the causal model’s variables X1,…,Xn represent the entities that are components of the mechanism. Furthermore, the variables X1,…,Xn taking certain values represent different properties or different behaviors of these entities. But can causal graph theoretical models represent activities, too?

A convenient first step towards an answer to this question is to scrutinize the activities that are involved in our case study. Examples of activities that are part of the mechanism for feedback inhibition of fatty acid synthesis in Brassica napus are the binding of 18:1-ACP (P) to ACCase (E), the transformation of acetyl-CoA (S) into 18:1-ACP (P) (via the intermediate product malonyl-CoA), and the inhibition of ACCase (E) by 18:1-ACP (P) (see description of Fig. 3.5). The submechanism that brings about the activity of the inhibition of ACCase by 18:1-ACP is, in turn, composed of the following micro-activities: the establishment of a certain kind of binding between a functional group of 18:1-ACP and the effector interaction site of ACCase, the shifting of the conformation of a particular part of ACCase, the deformation of the conformation of the S2 part of the substrate binding site of ACCase, etc. (see description of Fig. 3.9). What all these activities have in common is that they are temporally extended processes that involve some kind of change. Correspondingly, Machamer et al. have characterized activities as being “the producers of change” (2000, p. 3). It should be noted that not all activities must involve interactions between two or more distinct entities.Footnote 23 There might also be activities (so-called noninteractive activities (Tabery 2004, p. 9; Torres 2008, p. 246), like the shifting of the conformation of a particular part of ACCase) that involve only one entity (i.e., the particular part of ACCase) and a change of its properties (i.e., from the property “being in a competent state” to “being in a noncompetent state”).Footnote 24 In any case, activities involve the change of properties. In principle, the variables of a causal graph theoretical model could just be chosen in such a way that the different values they can take represent different processes or changes of properties.
However, such a choice of variables would be completely at odds with experimental practice in biology. In most cases it is difficult or even impossible to measure an entire process with a single measurement. Rather, what biologists do, for instance, to collect empirical data about the inhibition of ACCase by 18:1-ACP, is measure the concentration of the product (which is an indicator of ACCase’s activity and, thus, also of its inhibition) at different times. Against this background it would be inadequate to choose a variable in such a way that one of its values represents the entire process/activity of inhibition of ACCase by 18:1-ACP. The option of representing activities simply by variables taking certain values can also be ruled out by the following argument: if activities were represented by variables taking certain values, then activities would neither involve changes nor be productive – they would rather occur due to other productive causal relations. Since activities are productive and involve changes, they must be represented differently.

We think that there are two ways in which the activities that compose a mechanism can be captured in a causal graph theoretical model. First, they can be represented by causal arrows between variables. For instance, the causal arrow between S and P in Fig. 3.5 represents the activity “transformation of acetyl-CoA into 18:1-ACP.” This is the option that matches the neat picture that several authors seem to have in mind: in a causal model, the variables represent the entities (and their possible properties), and the arrows represent the activities. However, our analysis shows that things are not that neat. There is a second, equally adequate way to represent activities in causal graph theoretical models, namely, by a change in the value of a variable. For instance, the activity “shifting the conformation of a particular part of ACCase” is represented in Fig. 3.9 by the variable X changing its value from “being in a competent state” to “being in a noncompetent state.”
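The two options can be set side by side in a toy Python sketch (again illustrative only; the edge labels, the variable name `X`, and the value labels are our shorthand for the case study):

```python
# Option 1: an activity represented by a causal arrow between variables.
# (Toy illustration; edge and activity labels are our shorthand.)
arrow_activities = {
    ("S", "P"): "transformation of acetyl-CoA into 18:1-ACP",
}

# Option 2: an activity represented by the change of a variable's value.
def shift_conformation(state):
    """Noninteractive activity: a part of ACCase shifts its conformation."""
    updated = dict(state)
    updated["X"] = "noncompetent"  # value changes from "competent"
    return updated

before = {"X": "competent"}
after = shift_conformation(before)
print(arrow_activities[("S", "P")], "|", before["X"], "->", after["X"])
```

The point of the sketch is merely that both representational devices – an edge of the graph and a transition between values of one variable – are available in the same model.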

A related view of static CMs that we have to give up is the neat view that the different variables in a static CM always represent the possible properties and activities of distinct entities. The flexibility of the choice of variables allows a single static CM to contain variables that represent different possible properties (and activities) of the same entity. For instance, in our static CM depicted in Fig. 3.5, the variables EP-bound and Eactive both refer to the concentrations of enzymes but describe different properties of these enzymes, namely, “being bound to P” and “being active.” In other words, in causal graph theoretical models, the boundaries between different entities and between entities and activities often become fuzzier than in qualitative models. This fuzziness may have the disadvantage of impeding the understanding of how a mechanism brings about a certain phenomenon – when one looks at a static CM or at a dynamic CM, one does not recognize at first sight what the entities are and which activities they perform.

To conclude, we think that it is possible to represent mechanisms as being composed of entities and activities in a causal graph framework. However, what one does not get are neat static CMs in which each variable represents a distinct entity and the arrows represent activities. This might be disadvantageous for some purposes, but not for others.

Finally, how do things stand with the third main feature of mechanisms, namely, the spatial and temporal organization of their components? How much structural and spatial information one actually represents, and which, simply depends on one’s choice of variables. In our case study, for instance, the causal graph theoretical model depicted in Fig. 3.11 contains structural as well as spatial information: the variable S2, for example, refers to a particular entity, namely, the S2 part of the substrate binding site of ACCase, and to the two possible structural properties that this entity can exhibit, namely, “having an ideal conformation that allows its binding to a certain part of 18:1-ACP” and “having a deformed conformation that inhibits its binding to a certain part of 18:1-ACP.”Footnote 25 A different example is the variable EP-bound, which represents the concentration of those regulatory enzymes (ACCases) that are bound to, that is, spatially connected to, the product 18:1-ACP. Hence, it is possible to include certain crucial structural and spatial information about the components of a mechanism in a causal graph theoretical model – one just has to choose variables that refer to structural and spatial properties.
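Continuing the toy sketch, structural and spatial information enters simply through the choice of variables and value labels (the labels below are our shortened paraphrases of the conformations and the binding relation described above, not the authors' notation):

```python
# Structural information: the variable S2's values are two possible
# conformations of the S2 part of ACCase's substrate binding site.
# (Illustrative paraphrases of the properties named in the text.)
S2 = {
    "entity": "S2 part of the substrate binding site of ACCase",
    "domain": ("ideal conformation", "deformed conformation"),
}

# Spatial information: E_P_bound tracks the concentration of ACCases
# that are spatially connected (bound) to the product 18:1-ACP.
E_P_bound = {
    "entity": "ACCases bound to 18:1-ACP",
    "domain": "nonnegative concentration",
}

print(S2["domain"], "|", E_P_bound["entity"])
```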

Information about the temporal organization can be captured by and read off from the causal arrows of dynamic CMs: in the example we discussed in Sect. 3.5.2, for instance, S at stage 1 causes P at stage 2, which causes EP-bound at stage 3. So at first S interacts with P, then P interacts with EP-bound, etc. However, even if there are no in-principle reasons why it is impossible to include all the details of the spatial and temporal organization of a mechanism’s components in a causal graph theoretical model, there may still be heuristic reasons against doing so. For instance, including all the relevant spatial, structural, and dynamic information might give rise to a causal model that includes so many different variables that it is unmanageable and thus not useful.
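In a toy sketch of a dynamic CM (illustrative only; the stage numbers follow the example from Sect. 3.5.2, and the variable names are our shorthand), temporal organization can be encoded by indexing each variable with a time stage, so that the order of the mechanism's steps can be read off the arrows:

```python
# Toy dynamic CM: each node is a (variable, stage) pair, so the
# temporal order of the mechanism's steps can be read off the arrows.
# Stages follow the example above: S at stage 1 causes P at stage 2,
# which causes E_P_bound at stage 3 (names are our shorthand).
dynamic_edges = [
    (("S", 1), ("P", 2)),
    (("P", 2), ("E_P_bound", 3)),
]

def temporal_order(edges):
    """Return variable names sorted by the stage at which they occur."""
    nodes = {node for edge in edges for node in edge}
    return [name for name, stage in sorted(nodes, key=lambda n: n[1])]

print(temporal_order(dynamic_edges))  # earliest stage first
```

The sketch also makes the heuristic worry above vivid: every additional stage multiplies the number of time-indexed variables, so a fully detailed dynamic CM quickly grows unmanageable.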

In sum, causal graph theoretical models can account for the three main features of mechanisms. However, they do so in a quite abstract way, which is why they fare far worse than purely qualitative models with respect to the purpose of providing understanding. Qualitative models tell us in a very intelligible way how the components of a mechanism interact to bring about the phenomenon of interest. Contrary to probabilistic causal models, they make clear distinctions between the macro- and the micro-level (i.e., between mechanisms and their submechanisms) and between distinct entities and activities (or operations, interactions, etc.). Purely qualitative models of mechanisms can also be used to explain certain behaviors of systems by revealing how the components of a mechanism bring about the behavior in question. These qualitative models are, however, limited. They fail when it comes to explaining why certain systems frequently (but not always) bring about certain behaviors. In other words, they fail when it comes to explaining probabilistic phenomena like the phenomenon described in Sect. 3.5. Moreover, they do not allow for probabilistic prediction and (interlevel) manipulation. But knowing how we can bring about a particular phenomenon with high probability is crucial to investigative strategies in the biological sciences. Finally, purely qualitative models fail to integrate qualitative information with quantitative, probabilistic information. The latter is an important task in certain research areas, like epigenetics, where laboratory molecular experiments need to be brought together with ecological or evolutionary observational studies and computer simulations.

7 Conclusion

In this chapter, we have shown how the formal framework of causal graph theory can be used to model biological mechanisms in a probabilistic and quantitative way. Our analysis of the mechanism for feedback regulation of fatty acid biosynthesis in Brassica napus revealed that causal graph theoretical models can be extended such that they can also account for more complex forms of organization of the components of a mechanism (like feedback) as well as for the fact that mechanisms are frequently organized into nested hierarchies. We argued that, because causal graph theoretical models are not purely qualitative, but rather include probabilistic and quantitative information, they are useful in the context of causal discovery – in particular if one wants to make quantitative, probabilistic predictions or conduct manipulations. What is more, since causal graph theoretical models allow us to represent different levels of mechanisms in the same model (e.g., a mechanism, one of its submechanisms, and the relations between them), they enable us to carry out interlevel mechanistic manipulation and prediction, too.

However, our analysis of the case study did not disclose only advantages of representing biological mechanisms within a causal graph framework. Rather, it gave rise to the more balanced view that probabilistic, quantitative models of mechanisms – although they have clear merits with respect to some purposes – also have shortcomings with respect to others. Accordingly, our analysis revealed that causal graph theoretical models have the resources to represent the three main features of biological mechanisms, namely, their multilevel character, their two kinds of components, and the spatial and temporal organization of their components. However, it also became clear that in some respects probabilistic, quantitative models of mechanisms are insufficient (e.g., because the boundaries between entities, and between entities and activities, become fuzzy and because the amount of structural/spatial and dynamical information that can be represented is limited), which makes them inadequate for some purposes (in particular for providing understanding). With this analysis we hope to have shed some light on the merits and limitations of modeling biological mechanisms within a causal graph framework and to have provided some interesting prospects for future philosophical work.