Introduction

I want to begin with two observations about organic chemistry. First, diagrams are ubiquitous in both the professional and pedagogical texts of the discipline, and second, the discipline is less mathematically intensive than other physical sciences. I think that there is an interesting connection between these observations. The kinds of theoretical explanations that organic chemists produce and seek are different from those of most other physical sciences (furthermore, they are not comfortably accommodated by standard accounts of biological explanation either). Whereas mathematical laws and models play a central and well-documented (if not well understood) role in the explanations given in more typical physical sciences, they play a more limited role in organic chemistry. In spite of the limited role of mathematics (and laws), organic chemists have been able to develop a robust, theoretical understanding of the phenomena they study. The primary theoretical devices employed in organic chemistry are not mathematical equations; instead, they are diagrams. One major reason that diagrams are ubiquitous in the texts of organic chemistry, then, is that they play a central role in the explanations provided within the discipline. The relatively light mathematical demands of the discipline follow from this same fact: in organic chemistry it is diagrams rather than mathematics that carry the explanatory weight. To understand how this is so, it is necessary to investigate both the nature of the diagrams employed in organic chemistry and how these diagrams are used in the explanations of the discipline. I will begin this paper by describing and characterizing the roles of the most important sort of diagram used in organic chemistry. Next I will present a model of explanations in organic chemistry and describe how diagrams contribute to these explanations. 
This will be followed by two examples that will support my abstract account of the role of diagrams in the explanations of organic chemistry. I will conclude with some general remarks about the specific roles that diagrams play in the formulation and articulation of the explanations offered by organic chemists. I hope to have established the fruitfulness of reflection on the practice of organic chemistry for those philosophers of science who are interested, more generally, in the ways in which visual representations contribute to scientific practice (see Perini 2005 for a more general formulation of this interest).

There are many different types of diagrams used in organic chemistry, but in this paper I will focus on one broad class of diagram, structural formulas, and only briefly mention another, potential energy diagrams. This will be sufficient for the purpose of understanding how diagrams contribute to the explanations of organic chemistry because most of these explanations are presented and/or supported principally using diagrams of one or both of these sorts. Structural formulas (see Fig. 1 for an example) themselves come in a variety of forms, and what they depict or represent is context dependent. In all of these varieties and contexts, however, structural formulas are formally similar; they are two-dimensional arrangements of a fixed alphabet of signs. This alphabet will generally include letters, dots, and lines of various sorts. Typically, letters are used as atomic symbols, lines are used as signs for chemical bonds, and dots are used to indicate individual electrons. The spatial arrangement of these signs in any particular formula will, at least to a certain extent, be relevant to determining what the formula signifies. There is, of course, a complex set of conventions and constraints that determine both whether or not a particular spatial arrangement of signs from the alphabet is a structural formula and, if so, what that formula depicts.

Fig. 1. A structural formula and its corresponding systematic name. This compound is commonly known as caffeine.

The most basic use of structural formulas in chemistry is as labels, or perhaps more suggestively, as descriptive names (see Evans 1982, p. 31) for chemical kinds. Particular chemical compounds, or chemical kinds, are the entities whose properties and transformations are the objects of study in chemistry. By the end of the 19th century, it had become clear to organic chemists that chemical kinds were individuated not only by their atomic composition and connectivity, but also (to a certain extent, at any rate) by the spatial arrangement of atoms in a compound. Structural formulas developed along with this recognition and, at least initially, their principal role was as unambiguous denoting expressions for chemical kinds. With some minor caveats, structural formulas can be put into one-to-one correspondence with (potential) chemical kinds (see Benfey 1964 for the development of structural formulas). As a result, one way to ensure that claims about chemical compounds were (or are) not ambiguous is to use structural formulas to refer to chemical kinds. [Footnote 1] Structural formulas pick out the compounds that they denote by representing or depicting the characteristics that a compound would have to have in order to be appropriately named by the formula. The set of atomic symbols within a structural formula reveals the composition of the denoted compound, while the lines connecting these atomic symbols represent the connectivity and, when coupled with some additional conventions, the relevant three-dimensional structure of the depicted compound. The structural formula thus has a descriptive content (typically consisting of a specification of composition, connectivity, and some aspects of three-dimensional arrangement) in virtue of which it denotes a particular chemical compound (which may or may not actually exist).

The role of structural formulas in contemporary organic chemistry is not limited to being descriptive names for the chemical kinds. Instead, these formulas have taken on a more significant role as the principal sort of theoretical device used in the explanations of the discipline. In this capacity, structural formulas are used like models, in a fairly concrete sense of that term. A model in this sense is an object that is used to stand for a part of the world (see Giere 1999, 2004 for a general account of a related, representational notion of ‘model’). Being objects, there are a variety of claims true of any particular model. In virtue of the fact that these models stand for a part of the world, at least some of the claims that are true about the model can be leveraged into facts about the part of the world that they depict. To get a feel for what I mean by claiming that structural formulas are models, it will be useful to keep in mind a different, more obvious, sort of model that is closely related to the structural formula. Along with their textbooks, beginning organic chemistry students are often requested to purchase model kits. These model kits contain balls of various colors and sticks with which these balls can be connected together. These balls and sticks can be assembled into physical models of chemical compounds. One of the principal uses of these physical models is to teach the student how to draw and interpret structural formulas. Once a student has mastered these skills, it is generally possible for the student to move back and forth between a structural formula and a corresponding physical model. The structural formula in its use as a model can then, roughly, be thought of as a two-dimensional version of its corresponding physical model. As a result, in many of the uses of structural formulas in the explanations of organic chemistry, a ball-and-stick model could be substituted for the structural formula. [Footnote 2]

Many chemical and physical phenomena can be explained (and sometimes predicted) on the basis of facts recognizable by observation or manipulation of these physical models and/or their corresponding structural formulas. In such explanations, which will be examined in more detail below, certain readily recognizable features of the model are used to license conclusions about the chemical compound (or configuration thereof) that the model depicts. Of course, not all the features of the model are significant, and there are substantial, and evolving, theoretical commitments that underwrite these sorts of inferences. Much of the usefulness of these models in the explanations of organic chemistry derives from the fact that the features of the model that are relevant to explanation transcend the features used to convey the descriptive content of the model. Recall that by the descriptive content of a structural formula, I mean the content carried by the structural formula in virtue of which it characterizes the chemical compound that it depicts, that is, roughly, the composition, connectivity and to a certain extent the spatial arrangement of the atoms in the molecule. By extension, one can understand the descriptive content of a physical model to consist of this same information. When structural formulas or physical models are used in explanation, it is frequently facts about these models over and above this descriptive content that license the conclusions about the depicted compound which drive the explanation. For example, often a physical model of a chemical compound can be manipulated into a variety of different three-dimensional arrangements by rotating the groups connected by a single stick. The fact that a particular arrangement of balls and sticks is accessible by a sequence of such rotations would license the conclusion (in most cases) that the depicted compound could be found in a configuration corresponding to that arrangement of balls and sticks. 
Similarly, given a particular structural formula, the possibility of finding the depicted compound in a particular configuration could be established by showing that a corresponding structural formula could be generated by a specific series of transformations of the original structural formula. Whether it is established by manipulation of a physical model, or by transformations of a structural formula, the discovery that a chemical compound can assume a particular configuration can then go on to play an important role in explaining or predicting the chemical properties of that compound.

Explanation in organic chemistry

Before going on to describe more completely and concretely how structural formulas contribute to the explanations of organic chemistry, it will be useful to present an abstract model of the explanations in this discipline. With this characterization in hand, it will then be possible to offer a general description of how structural formulas, particularly in their role as models, facilitate these explanations. Then specific examples will be presented that both support the general account of explanation and showcase the central role of structural formulas and/or models in those explanations. The abstract model of explanation in organic chemistry that I will present has been more completely developed in other work (Goodwin 2003), so here I will just provide a sketch.

It is useful to think of the explanations offered in organic chemistry as answers to contrastive ‘why’ questions about chemical transformations. For example, an explanation of the product distribution of a particular reaction can be thought of as an answer to a question like: “Why is A the preferred (exclusive) product of reaction R (rather than B, or C, or D)?” Given that organic chemistry is primarily concerned with the transformations of organic compounds, it should come as no surprise that most (but not all) of the explanations offered in the discipline can be reconstructed in these terms. Acceptable responses to these ‘why’ questions, according to this reconstruction, have two distinct components. The first component of an acceptable response, which I will call the direct answer to the question, presupposes a background theoretical model of chemical transformations derived from thermodynamics (and transition-state theory). Against the backdrop provided by this model, the direct answer is an assertion about the energy of some of the chemical compounds (or configurations thereof) involved in the transformation relative to some of the compounds (or configurations thereof) mentioned in the contrast class. So, in the example above, the direct answer to the question might be something like: “Product A is more thermodynamically stable than any of B, C, or D (and the reaction is under thermodynamic control).” For most purposes, these direct answers can be thought of as qualitative claims about the relative positions of species mentioned in the question on a potential energy diagram. A potential energy diagram (see Fig. 2 for a schematic representation of a potential energy diagram) plots the potential energy of reactants, products, and intermediate structures in a chemical transformation against the ‘reaction coordinate’, which measures the degree of progress through the reaction. 
So returning to our example, the direct answer to the question amounts to pointing out that product A is lower in energy (as indicated on the y-axis) than any of B, C, or D in a potential energy diagram of the possible transformations. While a direct answer of this sort does constitute a formally sufficient (given the background theoretical model) answer to the question, it would not by itself be an acceptable explanation in organic chemistry. More is required because by accepting the question as well posed, an experienced explainer can infer the sorts of qualitative features of the potential energy diagrams that will constitute the direct answer to the question (the direct answer is, in some sense, contained in the presupposition of the question). As a result, it is best to think of these direct answers as revealing only the common, theoretical apparatus that underwrites how organic chemists think about transformations. The specific details that provide the explanatory power to any particular explanation come from the second component of the answer to the initial contrastive ‘why’ question.

Fig. 2. A schematic representation of an exothermic potential energy diagram.

The second component of explanation in organic chemistry (the component that allows the organic chemist to flesh out the formal, direct answers to ‘why’ questions) is what I call a structural account. The role of a structural account is to account for the relevant qualitative energy differences mentioned in the direct answer to the question in terms of structural features of the reactants, products, or intermediate structures. In terms of potential energy diagrams, then, the goal of a structural account would be to justify the relative positions of structures in the potential energy diagram that supports the direct answer to the question. This justification will typically make use of a limited set of robustly applicable concepts that are correlated (in a context-sensitive manner) with qualitative energy differences. These concepts can be seen to apply to a particular chemical compound or configuration on the basis of easily recognizable structural features. There are a variety of different sorts of structural features relevant to deciding when these concepts apply, but in almost all cases they can be recognized either in virtue of the descriptive content of a structural formula or by treating the structural formula as a model and discovering or observing the feature to apply. To return once again to our example, one could provide a structural account of the energy difference between A and the members of the contrast class by pointing out that the double bond in A is more highly substituted than the double bonds in any of the alternative possible structures, B, C, or D. Highly substituted double bonds are stabilized by hyperconjugation, and so tend to be lower in energy than analogous less highly substituted double bonds (Lowry 1987, p. 590). [Footnote 3] By identifying this structural feature you can recognize that there will be hyperconjugation in the product, and thereby account for the energy difference mentioned in the direct answer to the question. 
In order to find that the product contains a highly substituted double bond, one must assess the number of alkyl groups bound to the carbons in the double bond. This can be done by simply counting the number of lines emerging from the carbons of the double bond in the structural formula that terminate in the atomic symbol ‘C’. The information required to decide that the robustly applicable concept (hyperconjugation) applied in this case was contained in the descriptive content of the structural formula (since it is information about the connectivity of the atoms in the compound). As we shall see, however, there are many cases where the structural features that indicate the applicability of one of these robust concepts must be discovered by manipulation or inspection of the relevant model and/or structural formula and are not contained in the descriptive content of that formula.
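The counting procedure just described can be made concrete with a small sketch. The encoding used here is an assumption of this illustration, not a representation drawn from the chemical literature: atoms are a dictionary from arbitrary labels to element symbols, bonds are triples of two labels and a bond order, and hydrogens are left implicit. The function name `substitution_degree` and the two example alkenes are likewise hypothetical choices made only for the example.

```python
def substitution_degree(atoms, bonds, double_bond):
    """Count the bonds that leave the two double-bond carbons and end in a 'C'.

    atoms: dict mapping an atom label to its element symbol (assumed encoding)
    bonds: list of (label, label, order) triples (assumed encoding)
    double_bond: pair of labels naming the two carbons of the C=C bond
    """
    a, b = double_bond
    count = 0
    for x, y, _order in bonds:
        # Look at each bond from both ends; skip the double bond itself.
        for center, other in ((x, y), (y, x)):
            if center in (a, b) and other not in (a, b) and atoms[other] == "C":
                count += 1
    return count


# Illustrative molecules (hydrogens implicit); the labels c1..c5 are arbitrary.
atoms = {"c1": "C", "c2": "C", "c3": "C", "c4": "C", "c5": "C"}

# 2-methyl-2-butene: CH3-C(CH3)=CH-CH3, double bond between c2 and c3.
alkene_a = [("c1", "c2", 1), ("c2", "c5", 1), ("c2", "c3", 2), ("c3", "c4", 1)]
# 2-methyl-1-butene: CH2=C(CH3)-CH2-CH3, double bond between c1 and c2.
alkene_b = [("c1", "c2", 2), ("c2", "c5", 1), ("c2", "c3", 1), ("c3", "c4", 1)]

print(substitution_degree(atoms, alkene_a, ("c2", "c3")))  # 3
print(substitution_degree(atoms, alkene_b, ("c1", "c2")))  # 2
```

The function consults only composition and connectivity, which is the sense in which the relevant fact is already contained in the descriptive content of the formula. The higher count for the first alkene is the structural feature that, on the account above, would license an appeal to hyperconjugation.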

An example: Adolf Baeyer and strain theory

In this example, I will describe an explanation for the perceived prevalence of five and six-membered carbon rings that was suggested by one of the most eminent organic chemists of the 19th century. Adolf Baeyer won the Nobel Prize in 1905 for his work on organic dyes (he was responsible for the synthesis of indigo). During the course of his research he worked with many cyclic and aromatic compounds and noticed that most naturally occurring cyclic carbon compounds had six-membered rings; furthermore, five and six-membered rings were much easier to prepare in the laboratory than cyclic compounds with different ring sizes (Ihde 1966). In order to explain these facts about the prevalence of cyclic carbon compounds of various sizes, he suggested what has come to be known as strain theory. Baeyer’s explanation is interesting not only because it is one of the first examples where a structural formula is used as a model in order to account for energy differences, but also because it introduced the concept of ‘strain’, which is still one of the robustly applicable concepts that are used in the structural accounts provided in modern organic chemistry.

Baeyer’s strain theory was first described in a paper published in 1885 (available in Leicester 1963, pp. 465–467). He begins his account by asserting that there must be a spatial basis for facts about the relative difficulty of synthesizing rings with less than five or more than six members. Next, he proceeds to summarize the accepted facts about the bonding of carbon atoms. For our purposes, we need mention only two of these facts: carbon has a valence of four (so it typically forms four bonds) and those bonds are oriented towards the corners of a tetrahedron centered on the carbon. Because the angle between a pair of lines running from any two vertices of a tetrahedron to its center is about 109°, the second of these facts entails that the angle between bonds to the same carbon atom should be about 109°. To these accepted facts about carbon bonding, Baeyer adds just one more claim, “the direction of these attractions (bonds to carbon) can undergo a diversion which causes a strain which increases with the size of the diversion” (Leicester 1963, p. 466). Now, appealing to facts that can be discovered by manipulation of a physical model, Baeyer applies these accepted facts and his additional claim to explain the prevalence of five and six-membered rings. He says:

If, now, as can be shown clearly only by the use of a model, an attempt is made to join a greater number of carbon atoms without force, that is, in the direction of the tetrahedral axes, or the wires of the models, the result is either a zigzag line or a ring of five atoms ...When a larger or smaller ring is formed, the wires must be bent, i.e., there occurs a strain...

(Leicester 1963, p. 466)

Thus, by manipulations of physical models of carbon chains of various lengths, it is possible to discover that, when these chains have something other than five members, they cannot be closed into rings without bending the wires that represent the bonds between carbon atoms. Baeyer also suggests a way of quantifying the amount of strain involved in forming rings of various numbers of members. He presents this method by producing a series of structural formulas of carbon rings with two, three, four, five, and six members. Under each of these structural formulas he records the amount of deviation from 109° that would be required if the ring was a regular n-gon of the appropriate number of sides. In this case, it is facts about the structural formulas considered as geometrical figures (regular polygons of various numbers of sides) that Baeyer suggests can be correlated with energy differences in the compounds that they denote. As his simple computations reveal, it is five-membered rings that have the least strain, followed by six-membered rings. Rings with less than five members or more than six members are predicted to have bonding angles that deviate substantially from 109°, and thus to have significant strain, thereby being unstable and difficult to produce.
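Baeyer's computation can be reproduced with a few lines of arithmetic. The following is a minimal sketch under the planar, regular-polygon assumption described above; the function names are hypothetical, and the tetrahedral angle is computed exactly (about 109.47°) rather than rounded to 109°. The sketch reports the full difference between the tetrahedral angle and the interior angle of the polygon; Baeyer's own tabulated figures are usually quoted as half of this value, that is, as the deviation per bond.

```python
import math

# Angle between two lines from the center of a regular tetrahedron to its vertices.
TETRAHEDRAL_ANGLE = math.degrees(math.acos(-1.0 / 3.0))  # about 109.47 degrees


def interior_angle(n):
    """Interior angle, in degrees, of a regular polygon with n sides."""
    return 180.0 * (n - 2) / n


def planar_deviation(n):
    """Tetrahedral angle minus the interior angle of a planar regular n-ring.

    Positive values mean the bonds must be bent inward, negative values outward.
    """
    return TETRAHEDRAL_ANGLE - interior_angle(n)


for n in range(2, 8):
    print(f"{n}-membered ring: interior angle {interior_angle(n):6.2f}, "
          f"deviation {planar_deviation(n):+7.2f}")
```

On this calculation the five-membered ring has the smallest deviation in magnitude, and the six-membered ring the next smallest, which matches the ordering reported in the text.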

Baeyer’s explanation can be recast in the terms of the model of explanation suggested above. To the question, “Why are cyclic carbon compounds with five or six-membered rings readily producible (rather than carbon compounds with rings of different sizes)?” Baeyer’s direct answer is, “Because five and six-membered rings are more stable than rings of other sizes.” [Footnote 4] He provides a structural account of this direct answer by claiming that strain will be more substantial in the rings of other sizes than it will be in either five or six-membered rings. Baeyer’s strain theory amounts to the suggestion that facts about the physical model or structural formula of a compound license inferences about the relative stability of the chemical compounds that these models represent. If strain theory is accepted, then a structural fact discoverable by manipulation of a physical model, such as the fact that the creation of the model requires bending the wires, indicates that strain is a factor in the compound, and thus that it is likely to have a higher energy than comparable unstrained compounds. Or equivalently, examination of the structural formula of a ring compound can reveal that there will be substantial deviation from the ‘ideal’ bond angle of 109°, again indicating strain, and a relatively high energy. Strain theory therefore allows for an account of energy differences between compounds based on facts about the structure of the compound accessible by investigation of models or structural formulas of the compound. Such an account proceeds through the recognition that a robustly applicable concept, ‘strain’, is more important in one compound than in others. With strain theory in place, it becomes possible to provide structural accounts for some of the energy differences relevant to the explanations of organic chemistry.

A second example: Ernst Mohr and large rings

At the end of the paper in which Baeyer introduced strain theory, he suggested that the energetic impacts of strain might be measurable by thermochemical comparison of rings of different sizes (Ihde 1966, p. 148). Eventually, organic chemists succeeded in synthesizing compounds with three, four, and more than six carbons in a ring, and so it became possible to perform these thermochemical measurements. The idea behind the thermochemical measurement of strain is that since strain increases the potential energy of a compound (or makes it unstable, in Baeyer’s language), when strained compounds are decomposed (by combustion) they should give off more energy than do comparable unstrained compounds. In other words, strain provides a structural account of the differences in the heat of combustion of otherwise analogous compounds, and so by measuring these differences you can quantify the significance of strain in a cyclic compound. When these thermochemical measurements were actually performed, they corroborated what had been the experience of organic chemists working with cyclic compounds: (1) it was difficult to produce three or four-membered rings, (2) five and six-membered rings were easy to make and were stable, and (3) larger rings were not hard to make and they were stable, but they were produced in relatively low yields. [Footnote 5] In thermochemical terms, chemists found that there was significant strain associated with three and four-membered rings, but essentially no strain associated with rings of five or more members (Ihde 1966, pp. 148–151). So Baeyer’s strain theory gave the right predictions for three, four, and five-membered rings, but it predicted strain where none was evident for rings with more carbons.

The necessary adjustment to Baeyer’s strain theory was suggested five years after it was originally published (Ihde 1966, p. 150), but not generally recognized until the publication of a paper by Ernst Mohr in 1918 (Mohr 1918). In this paper, Mohr used models (and/or structural formulas) to demonstrate the limitations of Baeyer’s original account of strain theory. Recall that Baeyer had treated the structural formulas of cyclic compounds as if they were regular polygons when he computed the bond angle deviations upon which his strain predictions were based. By doing this, Baeyer was implicitly assuming that the compounds depicted by these structural formulas had all of their ring carbons in the same plane. If this restriction is relaxed, it is possible to arrange the tetrahedral carbon atoms in rings with six or more members in such a way that there is no ring strain. In other words, by moving the ring carbons out of the same plane, the ideal bond angles of 109° can be preserved for all of the carbon–carbon bonds in the ring, thereby eliminating any strain in the compound. Mohr was able to convincingly demonstrate this possibility with representations of three-dimensional models of the chair and boat forms of cyclohexane, which are two configurations of a six-membered ring hydrocarbon that preserve all the tetrahedral bond angles. Mohr’s demonstration can be understood, in the terms of the model of explanation expounded above, as a structural account of the fact that there is no elevation in the heats of combustion of cyclic compounds with ring sizes of six or more. By experimenting with models, Mohr found that there was a way to arrange the ball and stick model of cyclohexane that did not result in any strain on the sticks. This fact was leveraged into the conclusion that there were configurations of cyclohexane not subject to any ring strain. 
Again, then, facts about the structure of those compounds relevant to an explanation that are accessible by investigation of models or structural formulas (but not contained in the descriptive content) allow for an account of energy differences (or in this case the lack thereof) between compounds. Eventually, structural formulas were supplemented with conventions so that they could be used to indicate the three-dimensional arrangement of the carbon atoms in ring compounds. With these conventions in place, it became possible to explain the lack of strain in six and higher membered rings by producing structural formulas that entailed that the tetrahedral bond angles for these ring structures could be preserved.
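The geometrical point behind Mohr's correction can be checked numerically. The sketch below is an idealization introduced for this illustration, not Mohr's own construction: six ring atoms are placed at equal angular spacing on a circle, alternately raised and lowered by a fixed amount, which gives a chair-like ring. The function names are hypothetical, and the pucker value of radius divided by 4√2 is an assumption chosen because, for this construction, it makes the cosine of every ring angle equal to −1/3. The boat form is not modelled.

```python
import math


def ring_coordinates(n, radius, pucker):
    """Place n ring atoms on a circle, alternating +pucker and -pucker in height."""
    return [
        (radius * math.cos(2 * math.pi * k / n),
         radius * math.sin(2 * math.pi * k / n),
         pucker if k % 2 == 0 else -pucker)
        for k in range(n)
    ]


def bond_angles(coords):
    """Angle in degrees at each ring atom between its two ring bonds."""
    n = len(coords)
    angles = []
    for k in range(n):
        center = coords[k]
        # Vectors from the central atom to its two ring neighbours.
        u = [p - c for p, c in zip(coords[k - 1], center)]
        v = [p - c for p, c in zip(coords[(k + 1) % n], center)]
        dot = sum(a * b for a, b in zip(u, v))
        norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
        angles.append(math.degrees(math.acos(dot / norm)))
    return angles


# Planar six-membered ring: the regular hexagon assumed in Baeyer's calculation.
planar = bond_angles(ring_coordinates(6, 1.0, 0.0))
# Puckered six-membered ring with the assumed pucker of radius / (4 * sqrt(2)).
chair = bond_angles(ring_coordinates(6, 1.0, 1.0 / (4.0 * math.sqrt(2.0))))

print([round(a, 2) for a in planar])
print([round(a, 2) for a in chair])
```

Under these assumptions the planar ring has angles of 120° at every atom, while the puckered ring recovers the tetrahedral angle at every atom, which is the fact that the models of cyclohexane make accessible and that the planar polygon conceals.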

Conclusion

The real force of the explanations in organic chemistry originates in the structural accounts, which relate the energy differences appealed to by the background theoretical model to the structures of the compounds (or intermediates) involved in a transformation. These structural accounts proceed through a limited set of robustly applicable concepts, such as the ‘strain’ and ‘hyperconjugation’ encountered in this paper. By deciding which of these concepts apply to the compounds (or intermediates) of a particular transformation, it is often possible to explain, or even to predict, facts about the outcome, mechanism, and rates of the transformation, even when that transformation has never been encountered before. These concepts allow organic chemistry to be pushed into new territory, and therefore account, at least in part, for the theoretical explanatory and predictive power of the discipline. Crucial to the strategy of employing these robust concepts to account for the relevant energy differences is the ability to recognize when these concepts apply. It is in facilitating this recognition that the role of structural formulas in the explanations of organic chemistry becomes most apparent. In order to recognize that ‘hyperconjugation’ was relevant to the explanation of the stability of an alkene product, the number of carbons bonded to the double bond was assessed. Because this information is contained in the descriptive content of the structural formula, it is easily available by inspection of that formula. Similarly, in order to explain why there is no ring strain in cyclohexane, it suffices to establish the possibility of a configuration of the compound where all of the bond angles are 109°. That such a configuration is possible can be established by manipulations of a molecular model or by transformations of the structural formula of the compound. 
If a novel chemical compound were being considered, perhaps as a synthetic target, then its structural formula (or perhaps the structural formulas of its possible precursors) could be used to determine the relevance of ‘strain’ and ‘hyperconjugation’ to transformations that might result in its production. In its role as a descriptive name, the structural formula carries information about the composition, connectivity, and arrangement of a compound that is frequently relevant to the applicability of these robust structural concepts. Furthermore, in its role as a model, the structural formula is often essential for the chemist to discover, by manipulation or transformation, whether one of these concepts applies. As a result, the texts of organic chemists are replete with structural formulas. These structural formulas allow the reader of the text to recognize the facts that license the application of the robust concepts that will ultimately be appealed to in order to support a synthetic strategy, to explain, or to predict the outcome of an organic transformation.

The example of strain theory brings out another sense in which structural formulas are important to the explanations of organic chemistry. The concept of ‘strain’ is not only applied on the basis of facts about the transformations of structural formulas or the manipulations of molecular models, but it was also ‘discovered’ on the basis of facts such as these. Baeyer’s proposal seems to have been based on a mechanical analogy, taking the evident tension in wires or springs to indicate that the molecules denoted by such a model would also have heightened potential energy. The model, or formula, in a case like this, acts as a source of structural concepts that might be correlated with the sorts of energy differences essential to explanation in organic chemistry. Since organic chemists, at least in Baeyer’s time, had no independent access to the structures of chemical compounds, it is no surprise that attempts to identify the structural causes of reactivity began by searching for correlations between representations of structure and reactivity. When a feature of the representation was found to correlate well with energy differences and thus to allow explanations of reactivity, this was taken as an indication that the relevant feature of the representation was not arbitrary, but could be used to infer features of the denoted compound. Molecular models and structural formulas are creatures of the manifest image, and so their characteristics can be discovered by everyday perception or manipulation. As models, these objects are taken to stand for, or to represent, chemical kinds, which are creatures of the scientific image. Strain theory is an example of how features of the manifest image can be co-opted, by a sort of analogy, into supporting (roughly) causal explanations within the scientific image. Much more ought to be said about the use of models as a sort of bridge between the manifest and scientific images, but this will have to await a future paper.

One way to think about the theoretical content of strain theory is as the postulation that certain features of molecular models (the strain required to put them together) or structural formulas (the angles between the carbon-carbon bonds) permit an inference about the energy of the depicted compound. In other words, strain theory says that an additional feature of these models is significant, and it is significant because it has energetic implications for the compounds in whose models that feature occurs. By proposing strain theory, Baeyer was suggesting a modification in the role of these representations (structural formulas, or molecular models) as models, that is, he was suggesting a modification in the extent to which they stand for the chemical compounds that they depict. When originally proposed, strain theory permitted inferences that conflicted with the phenomena, and so it had to be adjusted. Mohr’s correction to the strain theory can be understood as a sort of tuning of the conditions under which it can be inferred that a chemical compound has increased energy due to strain. Now, organic chemists recognize that there will be strain in a compound (or configuration) only when no three-dimensional arrangement is possible that preserves the ideal bond angles. Understood in this more refined way [Footnote 6], ‘strain’ is still a useful concept that allows for structural accounts of the energy differences relevant to the explanations of organic chemistry (Ihde 1966, pp. 151–158). The evolution of strain theory is an example of how the theoretical apparatus of organic chemistry is adjusted in response to empirical data. In this case, the adjustment amounts to a refinement of the role of structural formulas (or molecular models) as models. 
So not only do structural formulas facilitate the discovery and recognition of the robust concepts that are crucial to the explanations of organic chemistry, but their significance as models is also part of the evolving theoretical apparatus with which organic chemists confront the world.