1 Introduction: why study the unideal?

Approximate symmetry is ubiquitous. For most of us, our first visual experiences after birth include the approximate bilateral symmetry of our mothers’ faces; cursory examination of nature finds approximate \(60{}^\circ \) rotational symmetry in snowflakes, honeycomb, and the Bénard convection cells of shallow boiling water in a saucepan (Stewart and Golubitsky 1992, pp. 21–22, 100, 149–150; Brading and Castellani 2007, pp. 1331–1332). By contrast, rarefied exact symmetries are seemingly found only in more fundamental domains of inquiry, where they have implications for laws of nature, ontology, the nature of identity, and the methodology of physics and the mathematical sciences. Perhaps that is one reason why philosophers of physics (and scientists more generally) have—idealistically—focused their analytic attention on exact symmetries. Yet associating the ideal and exact with the fundamental and that which is of foundational or philosophical significance can, I suggest, misguide us.

For example, symmetries have long been essential in particle physics as a means not only to classify and unify species of particles (Brading and Castellani 2007, §8.3), but also to make predictions in decay and scattering experiments (Bangu 2008, 2013, §5). The most famous example of this is isospin, a spin-like (i.e., SU(2)) quantity assigned to nucleons that treats protons and neutrons as different states of what is fundamentally the same sort of particle. When Heisenberg and Wigner developed it in the 1930s, it was not yet clear whether it was an exact symmetry, since measurements of the mass of the neutron relative to the proton were not yet precise enough to resolve the question. But this did not deter the fruitful application of isospin, for, even though it arises in fact from a merely approximate symmetry, it still facilitated good predictions, modulated by perturbation theory, and helped explain the spin structure of the nucleus. Besides isospin, the discrete symmetries of charge conjugation, parity transformation, and time reversal are all approximate in the Standard Model of particle physics, as are higher continuous groups such as SU(3) when applied to flavor quantum numbers—a historical but still useful precursor to the exact SU(3) for color charge in the Standard Model—and higher ones still, such as SU(4) and SU(6) (Costa and Fogli 2012; Sundermeyer 2014; Lichtenberg 1970, §1.5).

Any seemingly exact symmetry has the potential to be rendered merely approximate under the development of science. Indeed, “perhaps one would find that none of the symmetries [that our theories attribute to Nature] is really exact if our observational/experimental capabilities were sufficiently improved” (Sundermeyer 2014, p. 12). Nonetheless, even in such a highly asymmetric world our current theories would still have domains of applicability in which their descriptions and predictions would be accurate with a high, if not perfect, degree of approximation. In other words, a more accurate but less symmetric theory ought to explain how and when a less accurate but more symmetric theory can be so successful. Such an explanation might quite plausibly attribute approximate symmetry to the less symmetric theory. Approximate symmetry deserves attention at least because the notion of exact symmetry will not be sufficient for this explanatory role. Of course, there is much more to say on this topic, which is the subject of Sect. 3.

After this, I explore two further topics on which the role of approximate symmetries in intertheoretic reduction bears. The first (Sect. 4) concerns what Redhead (1975) calls the “Curie-Post Principle”, that (roughly) for two theories, one of which explains the success of the other, the more fundamental theory—the one serving as the explanans—cannot have more symmetries than the less fundamental theory—the one serving as the explanandum. The second (Sect. 5) concerns the identification of accidental symmetries, and their distinction from fundamental symmetries. In a word, I synthesize what I see as the best features of accounts by Kosso (2000), Lange (2011), and Redhead (1975), while avoiding their flaws, to describe a symmetry’s being accidental as inherently relational: it denotes the symmetry’s status as approximate or exact, or the expectation of such a status, in a less fundamental theory with respect to a more fundamental theory, one with broader scope, that explains its success.

In order to draw these consequences, however, one must first ask seriously: what exactly is approximate symmetry?

2 Defining approximate symmetry

2.1 Extant definitions

Commonly invoked, approximate symmetry is rarely defined. Some authors who do so just identify approximate symmetries with broken symmetries (Costa and Fogli 2012, pp. 116–117, 189–191, Rosen 1975, pp. 78–80, 1995, p. 128, 2008, p. 134).Footnote 1 However, there is good reason to consider these to be distinct concepts. Broken symmetries are typically divided into two further subtypes, explicit and spontaneous. As Brading et al. (2017, §4) remark, “Explicit symmetry breaking indicates a situation where the dynamical equations are not manifestly invariant under the symmetry group considered” while “Spontaneous symmetry breaking (SSB) occurs in a situation where, given a symmetry of the equations of motion, solutions exist which are not invariant under the action of this symmetry without any explicit asymmetric input (whence the attribute ‘spontaneous’)”.Footnote 2 In both cases, though, the broken symmetry is the result of a dynamical process or otherwise involves some comparison between two symmetry structures (e.g., symmetry groups). For instance, the parity transformation symmetry of Minkowski spacetime is explicitly broken by introducing a matter Lagrangian that couples differently to different chiralities, as is now understood to be the case with the weak nuclear force. The rotational symmetry of a Lagrangian \(L = (\nabla _a \phi ) (\nabla ^a \phi ) + V(\phi )\) for a scalar field \(\phi \) with a “Mexican hat” potential \(V(\phi ) = -10|\phi |^2 + |\phi |^4\) is spontaneously broken by the ground states \(\phi = \sqrt{5}e^{i\theta }\). In neither case, however, should the mere broken nature of the symmetry imply that it is still approximate. Some symmetries are so broken (whether explicitly or spontaneously) that there is no plausible sense in which they still hold approximately.Footnote 3
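As a quick check of the ground states quoted above, one can write \(r = |\phi |\) (a symbol introduced here only for convenience) and minimize the potential directly:

$$\begin{aligned} V(r)&= -10 r^2 + r^4, \\ \frac{dV}{dr}&= -20 r + 4 r^3 = 4 r (r^2 - 5) = 0 \quad \Rightarrow \quad r = 0 \text { or } r = \sqrt{5}, \\ \left. \frac{d^2 V}{dr^2} \right| _{r = \sqrt{5}}&= -20 + 12 \cdot 5 = 40 > 0, \qquad V(\sqrt{5}) = -50 + 25 = -25 < 0 = V(0). \end{aligned}$$

Since the minimum constrains only \(|\phi |\), every phase \(\theta \) yields a ground state \(\phi = \sqrt{5}e^{i\theta }\). A rotation \(\phi \mapsto e^{i\alpha }\phi \) sends the ground state with phase \(\theta \) to the one with phase \(\theta + \alpha \), so the set of ground states is invariant although no single ground state is.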

In contrast with this conflation, Castellani (2003a, p. 321) does clearly distinguish approximate from broken symmetries: “A symmetry can be exact, approximate, or broken. Exact means unconditionally valid; approximate means valid under certain conditions; broken can mean different things, depending on the object considered and its context”Footnote 4—here, by “different things” Castellani refers to the two types of broken symmetry, explicit and spontaneous, discussed above. However, she does not explicitly define what it means for a symmetry to be “valid”. One possibility follows Rosen (2008, pp. 43–44), who uses the descriptor of “valid” regarding approximate symmetries just to denote those situations or systems in which the symmetry is present. For example, charge conjugation and parity inversion are said to be “valid” concerning particle processes that do not essentially involve the weak nuclear force—e.g., those that involve only electromagnetism. But this interpretation just makes approximate symmetry a kind of exact symmetry that applies to fewer situations. By contrast, as the examples of the human face and honeycomb at the beginning of this essay illustrate, approximate symmetry should come in degrees that depend not on the range of situations in which the symmetry transformation is exact, but on how similar the situation to which it is applied is to one where the symmetry transformation is exact. In a word, an adequate definition of approximate symmetry should underwrite the claim that human faces are more or less bilaterally symmetric, depending on the case. If “valid under certain conditions” means exact under certain conditions, then real human faces are not even approximately bilaterally symmetric.

Rosen (1995, 2008) has proposed instead to define approximate symmetries and quantify their degree of approximation with a distance function defined on the abstract possibility space for the system under consideration, a class which is inclusive of spaces of states and histories.Footnote 5 Explaining how this works requires spelling out a bit more formal detail, as follows.Footnote 6 Let \(\mathcal {S}\) be the abstract possibility space of a physical system: it consists of all the possible ways that states of affairs could be arranged for the system, according to some theory. Let \(\mathcal {T}\) be the set of bijective transformations of \(\mathcal {S}\), i.e., the set of injective and surjective functions \(T: \mathcal {S} \rightarrow \mathcal {S}\). The elements of \(\mathcal {T}\) form a group under composition, but it is not yet appropriate to deem it the (exact) symmetry group of \(\mathcal {S}\)—indeed, the specification of such a group will only be relative to a certain notion of equivalence on the abstract possibility space. So, let \(\equiv \) be an equivalence relation on \(\mathcal {S}\), i.e., a reflexive, symmetric, and transitive relation thereon. Suppose that \(\equiv \) represents sameness with respect to some particular class of relevant properties shared by the elements within each of its equivalence classes and not shared between them. For instance, if \(\mathcal {S}\) were the set of kinematically possible histories for a mechanical system, \(\equiv \) could have exactly two equivalence classes, one consisting of the histories that are solutions to a certain equation (i.e., the dynamically possible histories), and the other consisting of the histories that are not solutions. Then, the \(\equiv \)-symmetry group of \(\mathcal {S}\) would be the largest subgroup \(\mathcal {T}_\equiv \) of \(\mathcal {T}\) such that for all \(s \in \mathcal {S}\) and \(T \in \mathcal {T}_\equiv \), \(T(s) \equiv s\). 
In a word, the \(\equiv \)-symmetry group of \(\mathcal {S}\) is the maximal group of abstract possibility space transformations that preserve \(\equiv \).Footnote 7
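To make the definition just given concrete, the following is a minimal Python sketch for a finite toy case. The four-element space and its two equivalence classes are hypothetical choices made only for illustration; they do not come from Rosen. The sketch enumerates all bijections of the space and keeps those satisfying \(T(s) \equiv s\) for every \(s\).

```python
from itertools import permutations

# Hypothetical toy possibility space and equivalence classes (illustrative only).
S = ("a", "b", "c", "d")
CLASSES = [{"a", "b"}, {"c", "d"}]

def equivalent(u, v):
    """u ≡ v iff both lie in the same equivalence class."""
    return any(u in c and v in c for c in CLASSES)

def bijections(space):
    """All bijective maps space -> space, each represented as a dict."""
    for image in permutations(space):
        yield dict(zip(space, image))

def is_exact_symmetry(T):
    """T is a ≡-symmetry when T(s) ≡ s for every s in S."""
    return all(equivalent(T[s], s) for s in S)

def compose(T2, T1):
    """The map s -> T2(T1(s))."""
    return {s: T2[T1[s]] for s in S}

symmetry_group = [T for T in bijections(S) if is_exact_symmetry(T)]

# Closure under composition, as expected of a group.
closed = all(compose(A, B) in symmetry_group
             for A in symmetry_group for B in symmetry_group)

print(len(symmetry_group), closed)
```

In this toy case the selected transformations are exactly those that permute elements within each class, which is why four of the twenty-four bijections survive.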

Thus does Rosen define (exact) symmetry.Footnote 8 To define approximate symmetry, he further supposes that \(\mathcal {S}\) is a pseudometric space, i.e., it is equipped with a pseudometric function \(d: \mathcal {S} \times \mathcal {S} \rightarrow [0,\infty )\), one that satisfies the following three conditions for all \(u,v,w \in \mathcal {S}\),

  • Null self-distance \(d(u,u) = 0\),

  • Symmetry \(d(u,v) = d(v,u)\),

  • Triangle inequality \(d(u,w) \le d(u,v) + d(v,w)\).Footnote 9

In fact, the pseudometric can replicate the role of the equivalence relation because it induces an equivalence relation \(\equiv _d\) on \(\mathcal {S}\) in the obvious way: for any \(u,v \in \mathcal {S}\), \(u \equiv _d v\) if and only if \(d(u,v) = 0\). But it includes more structure than an equivalence relation, in the sense that it also assigns positive numerical distances between non-equivalent possibilities. So for any \(\epsilon > 0\), one can check whether a given transformation \(T \in \mathcal {T}\) satisfies \(d(s,T(s)) < \epsilon \) for all \(s \in \mathcal {S}\). If it does, it is an (\(\epsilon \)-)approximate \(\equiv _d\)-symmetry of \(\mathcal {S}\).Footnote 10 Clearly any \(T \in \mathcal {T}_{\equiv _d}\) is an approximate \(\equiv _d\)-symmetry of \(\mathcal {S}\) (for any \(\epsilon \)), but the set of all \(\epsilon \)-approximate \(\equiv _d\)-symmetries may only form a groupoid instead of a group: two \(\epsilon \)-approximate \(\equiv _d\)-symmetries composed may not be an \(\epsilon \)-approximate \(\equiv _d\)-symmetry.
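The failure of closure under composition can be seen in a small Python sketch. Everything specific in it is an assumption made for illustration: the space is taken to be twelve positions on a circle, the distance is the circular distance, and the transformations are shifts. The check uses the strict inequality \(d(s,T(s)) < \epsilon \) from the definition above.

```python
N = 12  # hypothetical space: positions 0..11 arranged on a circle

def d(u, v):
    """Circular distance on the toy space."""
    diff = abs(u - v) % N
    return min(diff, N - diff)

def shift(k):
    """The bijection s -> s + k (mod N)."""
    return lambda s: (s + k) % N

def is_pseudometric(dist, space):
    """Brute-force check of the three conditions listed above."""
    pts = list(space)
    return all(
        dist(u, u) == 0
        and dist(u, v) == dist(v, u)
        and dist(u, w) <= dist(u, v) + dist(v, w)
        for u in pts for v in pts for w in pts
    )

def is_approx_symmetry(T, eps, space=range(N)):
    """T is an eps-approximate symmetry when d(s, T(s)) < eps for all s."""
    return all(d(s, T(s)) < eps for s in space)

eps = 1.5
T = shift(1)
TT = lambda s: T(T(s))  # T composed with itself

print(is_approx_symmetry(T, eps))   # a shift by one step stays within eps
print(is_approx_symmetry(TT, eps))  # the composite moves points by two steps
```

Here a single shift moves every point a distance of 1, within the tolerance 1.5, while the composite shift moves every point a distance of 2, outside it.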

Importantly, “what is or is not considered an approximate symmetry transformation depends on how great a deviation from equivalence one is willing to tolerate, where that is expressed by the greatest metric ‘distance’ one will accept” (Rosen 2008, p. 132). Such a deviation is from sameness with regard to the relevant physical features of the system. Rosen (1995, p. 159) even suggests that “The extent of deviation from exact symmetry that can still be considered approximate symmetry will depend on the context and application and could very well be a matter of personal taste”. However, there ought to be nothing subjective about the choice of d if this whole framework is to be fruitfully applied in physics: specifying the context of application should fully determine the relevant pseudometric. In other words, the numerical distance between two possibilities may depend on contextual relations between the system and its environment—e.g., what of it can be measured or observed and how precisely—that are at least intersubjectively fixed.

Even if Rosen were to accept this amendment promoting objectivity, one could still object to the plausibility of a contextual determination of the pseudometric. Rosen does not suggest anything about the form of the pseudometric, and there are hundreds of proposals (with more published every year) in the scientific literature on ways of quantifying symmetry in diverse applications (Petitjean 2003). Even a cursory examination of these ways suggests that scientists do not show how their proposed (a)symmetry metric is uniquely determined from their particular application’s context; they rather just argue for their proposal’s plausibility and utility. Simply put, requiring a particular pseudometric on the abstract possibility space may require more structure than is determined by this context. For instance, in previous work (Fletcher 2016, 2019), I have described how in general relativity, sometimes the relevant way to compare spacetimes involves an (uncountably) infinite collection of pseudometrics, one for each idealized spacetime observer, that cannot be reduced to a single choice. What’s more, it isn’t clear why certain features—e.g., the triangle inequality property—are really so essential to determining how similar two possibilities are.Footnote 11

2.2 A new proposal

This leads to a natural emendation, which shall constitute my own proposal for formalizing approximate symmetry. In particular, instead of beginning from a definition of exact symmetry and then generalizing, I shall begin with a definition of approximate symmetry, of which exact symmetry will turn out to be a special case, when it exists. One advantage of this approach to defining approximate symmetry is that it applies even when there are no exactly symmetric possibilities—i.e., when exact symmetry can only be considered an idealization.Footnote 12 Apropos of the guiding discussion in Sect. 1, a good definition of approximate symmetry should allow for the possibility of radical asymmetry that is nevertheless very like symmetry, even when the latter is ultimately not physically possible.

To do so, instead of a pseudometric I introduce a class \(\mathcal {R}\) of similarity relations on \(\mathcal {S}\), also called a similarity structure on \(\mathcal {S}\) (Fletcher forthcoming b). For present purposes, I shall hew to the traditional characterization thereof (Carnap 1967): a similarity relation \(\sim \) on \(\mathcal {S}\) is a reflexive and symmetric binary relation on \(\mathcal {S}\); for any \(u,v \in \mathcal {S}\), \(u \sim v\) is interpreted as “u is similar to v”. Here, reflexivity captures the observation that every possibility is similar to itself, and symmetry the intuition that the similarity relation has no “directionality” to it: asserting that u is similar to v is just to assert that v is similar to u.Footnote 13 However, unlike an equivalence relation, a similarity relation is not in general transitive. Sorites-like considerations illuminate why: each shade of a color may be quite similar to others a bit darker or brighter, but transitivity would then imply that the blackest shade is similar to the whitest.

Each relation in a similarity structure \(\mathcal {R}\) represents a relevant way that the possibilities of \(\mathcal {S}\) are similar to one another. A pseudometric d on \(\mathcal {S}\) induces such a family: for every \(\epsilon \ge 0\), let \(u \sim _\epsilon v\) just when \(d(u,v) \le \epsilon \).Footnote 14 This suggests the analogous definition of approximate symmetry for similarity structures: for any transformation \(T \in \mathcal {T}\), if \(T(s) \sim s\) for all \(s \in \mathcal {S}\), then T is a \(\sim \)-approximate symmetry of \(\mathcal {S}\).Footnote 15 Although \(\sim \) is not in general transitive for the reasons discussed above, if it is transitive, then this definition reduces to that of an exact symmetry. Thus, defining approximate symmetry through similarity structures as I have makes exact symmetry a special case of approximate symmetry.Footnote 16

Moreover, it provides a more flexible way of describing degrees of approximation. Any restriction of a similarity relation—that is, a subset of it—is also a similarity relation, for reflexivity and symmetry are preserved under the subset relation. If \(\sim ' \; \subseteq \; \sim \), then \(\sim '\) is at least as discriminating as \(\sim \), in the sense that it determines no more possibilities than \(\sim \) does to be similar to one another. And if \(\sim ' \; \subsetneq \; \sim \), then \(\sim '\) is (strictly) more discriminating than \(\sim \). This suggests an analogous definition of one approximate symmetry transformation being at least as (or more) discriminating than another. To see this, suppose the abstract possibility space \(\mathcal {S}\) is equipped with a similarity structure \(\mathcal {R}\) and consider transformations \(T,T' \in \mathcal {T}\) such that for any \(\sim \; \in \mathcal {R}\), if \(T(s) \sim s\) for all \(s \in \mathcal {S}\) then \(T'(s) \sim s\) for all \(s \in \mathcal {S}\). \(T'\) is then at least as discriminating as T because it preserves similarity relations at least as discriminating as those that T does. If further there is some \(\sim \; \in \mathcal {R}\) such that for all \(s \in \mathcal {S}\), \(T'(s) \sim s\), while for some \(s' \in \mathcal {S}\), \(T(s') \not \sim s'\), then \(T'\) is (strictly) more discriminating than T because it preserves similarity relations that are more discriminating than those that T does. In both of the cases of the similarity relations and the transformations, discrimination has the structure of a preorder, inherited directly from the structure of the subset relation. Some transformations will discriminate more than others, but others will discriminate in incomparable ways.
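The two comparisons just defined can be sketched in Python for a finite case. The space (twelve positions on a circle), the family of relations \(\sim _\epsilon \) for \(\epsilon \in \{0,1,2,3\}\), and the shift transformations are all assumptions chosen for illustration. Relations are represented as sets of ordered pairs, so that "at least as discriminating" is literally the subset relation.

```python
N = 12
S = range(N)  # hypothetical space: positions on a circle

def d(u, v):
    """Circular distance on the toy space."""
    diff = abs(u - v) % N
    return min(diff, N - diff)

def sim(eps):
    """The similarity relation ~_eps as a frozenset of ordered pairs."""
    return frozenset((u, v) for u in S for v in S if d(u, v) <= eps)

# A small similarity structure, indexed by tolerance.
R = {eps: sim(eps) for eps in (0, 1, 2, 3)}

def is_similarity(rel):
    """Reflexive and symmetric on S."""
    return (all((s, s) in rel for s in S)
            and all((v, u) in rel for (u, v) in rel))

def preserved_by(T):
    """Tolerances eps in R for which T(s) ~_eps s holds for every s."""
    return {eps for eps, rel in R.items() if all((T(s), s) in rel for s in S)}

def at_least_as_discriminating(T_prime, T):
    """T' preserves every relation in R that T preserves."""
    return preserved_by(T) <= preserved_by(T_prime)

def shift(k):
    return lambda s: (s + k) % N

print(R[1] < R[2])                                    # ~_1 strictly more discriminating than ~_2
print(preserved_by(shift(1)), preserved_by(shift(2)))
print(at_least_as_discriminating(shift(1), shift(2)))
```

In this sketch the one-step shift preserves every relation that the two-step shift does, plus \(\sim _1\), so it counts as strictly more discriminating.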

Fig. 1 Two sets of gears. Individual spur gears have a wider class of exact symmetries than individual helical gears, but the latter can have the same class, as approximate symmetries, for sufficiently small helix angles. Image by Ningbo Twirl Motor Co., Ltd.: http://www.twirlmotor.com/Blog/Industry-news/new-12.html

For example, consider again the case of a similarity structure on an abstract possibility space \(\mathcal {S}\) induced from a pseudometric thereon. Whenever \(\epsilon ' \le \epsilon \), the similarity relation \(\sim _{\epsilon '}\) is at least as discriminating as the similarity relation \(\sim _\epsilon \). Suppose further that the possibilities in question are positions in Euclidean space of a particular gear, such as those in Fig. 1. Spur gears—the sort of gear most have as their mental picture—are exactly symmetric about their center for rotations of multiples of \(360^\circ /n\), where n is the number of the gear’s teeth. A spur gear also has, for each of its teeth, a reflection symmetry, whose plane bisects the gear into two “half-circles” through the centers of the tooth and the gear (much like the reflection symmetries of an n-gon). Helical gears, by contrast, have only the same rotation symmetries exactly. Any reflection symmetry will map a right-handed gear into a left-handed one, and vice versa. But if the gear’s helix angle—the constant angle that the helical teeth make with the axis of rotation of the gear—is sufficiently small, then even a helical gear could have approximate reflection symmetry. This will be the case if the pseudometric on \(\mathcal {S}\) is, for example, the Hausdorff distance function \(d_H\), which sets the distance between two shapes to be the largest of all the distances from a point in one shape to the closest point in the other.Footnote 17 But for a fixed helix angle, there will be a sufficiently small \(\epsilon \) such that a helical gear whose teeth have that angle will not have \(\sim _\epsilon \)-approximate reflection symmetry (with respect to \(d_H\)). 
Thus, for helical gears, the rotation symmetries are more discriminating than the reflection symmetries because the rotation symmetries preserve \(\sim _0\), which is more discriminating than any \(\sim _{\epsilon }\) for \(\epsilon > 0\), but the reflection symmetries do not. In contrast, for spur gears, the rotation symmetries are just as discriminating as the reflection symmetries because the two types both preserve \(\sim _0\).
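The dependence of approximate reflection symmetry on the helix angle can be illustrated numerically. The Python sketch below rests on a strong simplifying assumption that is not in the text: it replaces the three-dimensional gear with a hypothetical two-dimensional stand-in, a finite set of points sampled along eight teeth, where a `skew` parameter twists each tooth and plays the role of the helix angle. The function names and numerical values are likewise illustrative. The Hausdorff distance is computed by brute force between finite point sets.

```python
import math

def gear(skew, n_teeth=8, r0=1.0, r1=1.3, samples=20):
    """Sample points along each tooth; the tip is twisted by `skew` radians."""
    pts = []
    for k in range(n_teeth):
        for j in range(samples + 1):
            t = j / samples
            r = r0 + t * (r1 - r0)
            a = 2 * math.pi * k / n_teeth + skew * t
            pts.append((r * math.cos(a), r * math.sin(a)))
    return pts

def rotate(pts, angle):
    """Rotate every point about the origin."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y) for x, y in pts]

def reflect(pts):
    """Reflect across the x-axis, a line through the center of tooth 0."""
    return [(x, -y) for x, y in pts]

def hausdorff(A, B):
    """Largest distance from a point of one set to the nearest point of the other."""
    def directed(P, Q):
        return max(min(math.dist(p, q) for q in Q) for p in P)
    return max(directed(A, B), directed(B, A))

spur = gear(0.0)
helical_small = gear(0.02)
helical_large = gear(0.2)
step = 2 * math.pi / 8  # rotation by 360/n degrees

d_spur_reflect = hausdorff(spur, reflect(spur))
d_large_rotate = hausdorff(helical_large, rotate(helical_large, step))
d_small_reflect = hausdorff(helical_small, reflect(helical_small))
d_large_reflect = hausdorff(helical_large, reflect(helical_large))

print(d_spur_reflect, d_large_rotate, d_small_reflect, d_large_reflect)
```

Under these assumptions, rotation leaves both shapes fixed up to rounding error, reflection leaves the untwisted shape fixed, and the reflected twisted shapes lie at a distance that grows with the twist. With a tolerance of, say, \(\epsilon = 0.1\), the slightly twisted shape is \(\sim _\epsilon \)-approximately reflection symmetric while the strongly twisted one is not.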

The foregoing example features similarity relations and transformations that have totally ordered discrimination, as they are determined from a distance function. But it is easy to modify the example to induce only partially ordered similarity relations and transformations. Suppose that in addition to position in Euclidean space, the abstract possibility space of the gears also includes their monochrome intensity, i.e., their grayscale color, ranging from black to white, modeled as a point on the interval \([-1,1]\), equipped with the standard distance function. Suppose also then that the similarity structure includes for every \(\epsilon \in [0,2]\) relations \(\sim '_\epsilon \)—defined so that for all \(u,v \in \mathcal {S}\), \(u \sim '_\epsilon v\) when the grayscale color of u and v are within \(\epsilon \)—and their intersections with the relations \(\sim _\epsilon \). Only the perfectly gray gears (regardless of shape), those with color value 0, are exactly symmetric under color inversion, which maps a gear with color \(x \mapsto - x\). Similarity relations regarding shape and color—ones that are the intersection of some \(\sim _\epsilon \) and some \(\sim '_{\epsilon '}\)—are more discriminating than ones that take into account shape or color only, such as \(\sim _\epsilon \) and \(\sim '_{\epsilon '}\), respectively. But none of the \(\sim _\epsilon \) are at least as discriminating as any of the \(\sim '_{\epsilon '}\) (for any \(\epsilon , \epsilon '\)). Analogously, none of the color inversion transformations are at least as discriminating as the rotations or reflections and vice versa, but transformations composed from the elements of these two groups are more discriminating than the ones from which they are composed.Footnote 18
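A small Python sketch can exhibit the incomparability at issue. It assumes a hypothetical discretization: three positions stand in for gear placements and three grayscale values for colors, and only the intermediate tolerance 1 is examined for each kind of relation. The transformation `flip_position` is an illustrative stand-in for a shape transformation and is not taken from the text.

```python
from itertools import product

POSITIONS = (0, 1, 2)   # hypothetical stand-in for gear placements
COLORS = (-1, 0, 1)     # sampled grayscale values in [-1, 1]
S = list(product(POSITIONS, COLORS))

def shape_sim(eps):
    """Pairs of possibilities whose positions differ by at most eps."""
    return frozenset((u, v) for u in S for v in S if abs(u[0] - v[0]) <= eps)

def color_sim(eps):
    """Pairs of possibilities whose colors differ by at most eps."""
    return frozenset((u, v) for u in S for v in S if abs(u[1] - v[1]) <= eps)

shape1, color1 = shape_sim(1), color_sim(1)
both = shape1 & color1  # similarity with respect to shape and color together

def preserves(T, rel):
    """T(s) is related to s for every possibility s."""
    return all((T(s), s) in rel for s in S)

invert_color = lambda s: (s[0], -s[1])      # color inversion x -> -x
flip_position = lambda s: (2 - s[0], s[1])  # mirror the placement

print(shape1 <= color1, color1 <= shape1)   # neither contains the other
print(both < shape1, both < color1)         # the intersection is finer than each
print(preserves(invert_color, shape_sim(0)), preserves(invert_color, color1))
print(preserves(flip_position, color_sim(0)), preserves(flip_position, shape1))
```

In this toy case color inversion preserves even the finest shape relation but not the color relation, while the position flip does the reverse, so neither transformation is at least as discriminating as the other.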

In the simple example determined from a distance function, there is a sense in which when one approximate symmetry is more discriminating than another, it is also more exact: it lies between the less discriminating symmetry and an exact symmetry in the discrimination partial order. But discrimination is a relation of broader scope than comparative exactness, for it applies even in contexts where there is no exact underlying symmetry, or where there is more than one. Formally, some similarity relation \(\sim \) may have a restriction \(\equiv \) which is also a non-trivial equivalence relation, i.e., an equivalence relation which is not the identity relation.Footnote 19 Thus in this case a \(\equiv \)-symmetry is also a \(\sim \)-approximate symmetry, and \(\equiv \) is more discriminating than \(\sim \). But in other cases there may be no such restriction \(\equiv \).

Example 1

Let \(\mathcal {S} = \mathbb {Z}\) and \(I_\mathbb {Z}\) be the identity relation on \(\mathbb {Z}\). Suppose that the similarity structure for \(\mathcal {S}\) consists of a single relation, \(\sim \; = \{ (i,j): |i - j| \le 1 \}\). I.e., the similarity relation’s extension consists in all pairs of numbers whose elements are identical or adjacent on the number line. Clearly the only restriction of \(\sim \) that is an equivalence relation is \(I_\mathbb {Z}\), while both the successor (\(i \mapsto i + 1\)) and predecessor (\(i \mapsto i - 1\)) transformations on \(\mathbb {Z}\) are \(\sim \)-approximate symmetries.
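The claims about the successor and predecessor maps can be spot-checked in Python. Since \(\mathbb {Z}\) is infinite, the sketch below tests only a finite window of integers, which is an assumption made for the sake of the check and not a proof.

```python
def similar(i, j):
    """The single similarity relation of Example 1: |i - j| <= 1."""
    return abs(i - j) <= 1

successor = lambda i: i + 1
predecessor = lambda i: i - 1

WINDOW = range(-50, 51)  # finite sample of the integers, for checking only

succ_ok = all(similar(successor(i), i) for i in WINDOW)
pred_ok = all(similar(predecessor(i), i) for i in WINDOW)

# The relation is not transitive: 0 ~ 1 and 1 ~ 2, yet 0 is not similar to 2.
transitivity_fails = similar(0, 1) and similar(1, 2) and not similar(0, 2)

# Applying the successor twice moves every integer by 2, outside the relation.
twice = lambda i: successor(successor(i))
twice_ok = all(similar(twice(i), i) for i in WINDOW)

print(succ_ok, pred_ok, transitivity_fails, twice_ok)
```

The last check also illustrates that composing approximate symmetries need not yield an approximate symmetry with respect to the same relation.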

It is also elementary to construct examples in which there is more than one restriction that is a non-trivial equivalence, without there being a single equivalence relation that contains them.

Example 2

Let \(\mathcal {S}\) be a four-element set, e.g., the set of cardinal directions \(\mathcal {S} = \{N,E,S,W\}\), and \(I_\mathcal {S}\) be the identity relation on \(\mathcal {S}\). Suppose that the similarity structure for \(\mathcal {S}\) consists of a single relation,

$$\begin{aligned} \sim \; = I_\mathcal {S} \cup \{ (N,E), (E,N), (E,S), (S,E), (S,W), (W,S), (W,N), (N,W) \}. \end{aligned}$$

Here are two distinct restrictions of \(\sim \) that are non-trivial equivalence relations:

$$\begin{aligned} \equiv _1 \;&= I_\mathcal {S} \cup \{ (N,E), (E,N), (S,W), (W,S)\}, \\ \equiv _2 \;&= I_\mathcal {S} \cup \{ (E,S), (S,E), (W,N), (N,W) \}. \end{aligned}$$

Clearly, neither is a subset of the other.

The restrictions of a similarity relation which are also equivalence relations thus form a partial order by set inclusion that always has a least element—the trivial equivalence relation—but does not necessarily have a greatest element.
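Example 2 is small enough to verify exhaustively. The following Python sketch encodes the three relations as sets of ordered pairs and checks the stated properties; the helper names are illustrative. It also confirms that the smallest equivalence relation containing both \(\equiv _1\) and \(\equiv _2\) is not a restriction of \(\sim \), since it relates N and S.

```python
S = ("N", "E", "S", "W")
I = {(s, s) for s in S}  # identity relation

def sym(pairs):
    """Close a set of ordered pairs under reversal."""
    return set(pairs) | {(v, u) for (u, v) in pairs}

sim = I | sym({("N", "E"), ("E", "S"), ("S", "W"), ("W", "N")})
eq1 = I | sym({("N", "E"), ("S", "W")})
eq2 = I | sym({("E", "S"), ("W", "N")})

def is_equivalence(rel):
    reflexive = all((s, s) in rel for s in S)
    symmetric = all((v, u) in rel for (u, v) in rel)
    transitive = all((u, x) in rel
                     for (u, v) in rel for (w, x) in rel if v == w)
    return reflexive and symmetric and transitive

def transitive_closure(rel):
    """Repeatedly add composites until nothing new appears."""
    closure = set(rel)
    while True:
        extra = {(u, x) for (u, v) in closure
                 for (w, x) in closure if v == w} - closure
        if not extra:
            return closure
        closure |= extra

joined = transitive_closure(eq1 | eq2)

print(is_equivalence(eq1), is_equivalence(eq2), is_equivalence(sim))
print(eq1 <= sim, eq2 <= sim, eq1 <= eq2, eq2 <= eq1)
print(joined <= sim, ("N", "S") in joined)
```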

Fig. 2 Simulated diffraction pattern of an icosahedral quasicrystal along one of the axes of (approximate) symmetry. The sizes of the dots represent intensity on the image as the light source passes through several layers of the quasicrystal. Image by Steffen Weber: http://www.jcrystal.com/steffenweber/qc.html

Part of the significance of the first observation, that there may be approximate symmetries without an exact symmetry that they approximate, is that it entails that systems may admit of approximate symmetry even when the most obvious exact symmetry to which they correspond is impossible. This underscores the claim, expressed in Sect. 1, that any of the observed symmetries of Nature may turn out only to be approximate.

One needn’t look to speculative fundamental physics to find it: intriguing concrete examples are already found in quasicrystals.Footnote 20 Although there are various definitions of quasicrystals, what is important for present purposes is that modern crystallography classifies crystals by their diffraction diagrams: light (such as X-rays) incident on a solid sample of atoms undergoes diffraction, with the spacings between the atoms serving as a kind of diffraction grating if the wavelength is carefully chosen so as not to fall within the atoms’ absorbency spectrum. This so-called Bragg diffraction produces an image opposite the sample from the light source that is essentially the Fourier spectrum of diffracted plane waves of light. When the atoms have some order to their spacing, the pattern displayed is highly concentrated at points on the image, hence “A crystal is any solid with an essentially discrete diffraction diagram” while “The symmetry of a crystal is the symmetry implied by its diffraction diagrams” (Senechal 1995, p. 27). Periodic crystals are those whose exact symmetry includes translations. It turns out that five-fold rotational symmetry, as depicted in Fig. 2, is impossible in periodic crystals, but can be observed to a high degree of approximation in aperiodic crystals—quasicrystals—such as certain aluminum–manganese alloys that nonetheless have long-range order.Footnote 21 Thus there is a definite sense in which the five-fold approximate symmetry of a quasicrystal has no exact crystalline symmetry which it approximates.

I will explore further implications of this observation for the role of symmetry in intertheoretical relations in the following sections.

3 Symmetry in intertheoretic reduction: a new proposal

An exemplary new physical theory of a particular system or phenomenon encompasses or explains the success of its predecessor. For physical theories actually used in the practice of science, this intertheoretic relationship is not typically a reduction in the usual philosophers’ sense of a deduction (of the old theory from the new), for the two are often incompatible when applied to the self-same phenomena. Rather, it is somewhat more in the vein of what Post (1971) calls “correspondence” or the “physicists’ sense of reduction” as described by Nickles (1973) and elaborated by Ehlers (1986): it shows how the (typically) newer, more expansive theory represents more phenomena successfully, and accounts for how the (typically) older, less expansive theory represented phenomena successfully by showing that it well approximates the former in these cases. For instance, this is so for special relativity’s explanation of the success of Galilean spacetime, and general relativity’s explanation of the success of special relativity. And it is also the case for effective field theories, where high-energy versions reproduce to good approximation the predictions of low-energy versions.

My present goal isn’t to discourse on the concepts of reduction or explanation, but instead to indicate how approximate symmetries are implicated therein. One theory may correspond to or reduce another via a limiting relation or approximation, yet the two theories may have different symmetries preserving important classes of features they have in common, such as their empirical features. If one were to think of reduction as a kind of deduction, this would not pose a problem if every empirical equivalence-preserving symmetry of the old theory is not only a definable transformation for the new theory, but also a symmetry thereof.Footnote 22

However, this is generally not the case: the Poincaré spacetime symmetries of the special theory of relativity, which preserve all empirical content of each of that theory’s models, not only fail to preserve this content in general relativity, but not all of those transformations are in general even definable there (Fletcher forthcoming a). (For instance, if the spacetime manifold is not diffeomorphic to \(\mathbb {R}^4\), there will be no sets of Killing vector fields that could form the Poincaré Lie group.) How can the one theory therefore explain the symmetries of another with an overlapping domain of application, particularly when those symmetries preserve the explanandum’s empirical content?Footnote 23

I argue that the pertinent empirical equivalence-preserving symmetries of the old theory can be explained as being merely approximately empirical equivalence-preserving in the reducing theory. Thus appreciation of the role of approximation in many cases of intertheoretic reduction undergirds the epistemic features of symmetries. But to do this, one must slightly expand the formalism for approximate symmetry introduced in Sect. 2. There, I assumed that transformations are functions on the whole abstract possibility space \(\mathcal {S}\), and that equivalence relations delineating which possibilities have shared properties are relations on \(\mathcal {S}\), too. Here, I allow the former to be partial functions, i.e., be defined on only a proper subset \(\mathcal {S}'\) of the whole abstract possibility space, and the latter to be equivalence relations on \(\mathcal {S}'\), too. Then the analysis of transformations as \(\equiv \)-symmetries or \(\sim \)-approximate symmetries proceeds as before, but restricting their definitional conditions to the domain of definition of the transformation and equivalence relation, \(\mathcal {S}'\). In other words, one allows a transformation to preserve an equivalence relation \(\equiv \) or a similarity relation \(\sim \) just on a subset of elements of \(\mathcal {S}\).

Additionally, one may consider an abstract possibility space whose members consist of the union of the possibilities ascribed to a system by two theories, one old and another new, the latter of which explains the success of the former. One supposes that the abstract possibility space is equipped with a similarity structure \(\mathcal {R}\) containing similarity relations relating possibilities of each of the two theories to one another, and which encode similarities of their relevant—usually, empirical—features. This will be the case when every successfully applied possibility of the old theory is empirically similar to one of the new theory. The idea is that successful applications of the possibilities of the old theory could have just as well used a corresponding possibility of the new theory.

Formally, let \(\mathcal {O},\mathcal {N} \subseteq \mathcal {S}\) denote the possibilities of the old and new theories, respectively.Footnote 24 Then \(\mathcal {N}\) explains \(\mathcal {O}\) relative to \(\mathcal {R}\) when for every \(o \in \mathcal {O}\) and every \(\sim \; \in \mathcal {R}\), there is some \(n \in \mathcal {N}\) such that \(n \sim o\).Footnote 25 Here, the new theory explains the old (relative to a similarity structure) via the possibilities that each of the two theories afford. For example, in the case considered in Sect. 2 where the similarity structure is generated from a pseudometric, one might restrict the structure to contain all the \(\sim _\epsilon \) such that \(\epsilon \ge c\) for some constant \( c > 0\). In the case of the gears, the new theory might contain flexible continuum states for the gears, while the old theory might only contain rigid continuum states, certain of which are well-approximated observationally (but not to arbitrary precision) by flexible states.
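The definition of explanation relative to a similarity structure can be checked mechanically on a finite toy model. The following Python sketch is only an illustration under stated assumptions: possibilities are represented by real numbers, the pseudometric is the absolute difference, \(\sim _\epsilon \) is read as "distance at most \(\epsilon \)" (the definition in Sect. 2 may differ in detail), and the infinite family of relations with \(\epsilon \ge c\) is sampled at finitely many tolerances. The names `similarity` and `explains` and all numerical values are hypothetical.

```python
from typing import Callable, Iterable, Sequence

Relation = Callable[[float, float], bool]


def similarity(eps: float) -> Relation:
    """Similarity relation ~_eps induced by the toy pseudometric |x - y| (an assumption)."""
    return lambda x, y: abs(x - y) <= eps


def explains(new: Sequence[float], old: Sequence[float],
             relations: Iterable[Relation]) -> bool:
    """N explains O relative to R: for every o and every ~ in R, some n satisfies n ~ o."""
    relations = list(relations)
    return all(any(sim(n, o) for n in new) for o in old for sim in relations)


# Hypothetical data: old-theory states and nearby, but never identical, new-theory states.
old_states = [0.0, 1.0]
new_states = [0.2, 1.3]

c = 0.5  # smallest tolerance admitted into the similarity structure R
coarse = [similarity(eps) for eps in (c, 1.0, 2.0)]  # finite sample of eps >= c
too_fine = [similarity(0.1)]                         # eps < c, so excluded from R

print(explains(new_states, old_states, coarse))    # True
print(explains(new_states, old_states, too_fine))  # False
```

The second call fails because no new-theory state lies within 0.1 of an old-theory state, which is the sense in which the old states are well-approximated, but not to arbitrary precision.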

With these preliminaries, one can define the formal conditions needed for the explanation of symmetries. Let \(\equiv \) be an equivalence relation defined at least on \(\mathcal {O}\), and suppose that T is a partial or total transformation on \(\mathcal {S}\) that is a \(\equiv \)-symmetry for \(\mathcal {O}\), i.e., for every \(o,o' \in \mathcal {O}\), if \(o \equiv o'\) then \(T(o),T(o') \in \mathcal {O}\) and \(T(o) \equiv T(o')\). Then \(\mathcal {N}\) explains this \(\equiv \)-symmetry when for every \(o \in \mathcal {O}\) and every \(\sim \; \in \mathcal {R}\) that is less discriminating than \(\equiv \), there exists some \(n \in \mathcal {N}\) such that \(T(n) \in \mathcal {N}\), \(n \sim o\), and \(T(n) \sim T(o)\). In other words, for \(\mathcal {N}\) to explain the \(\equiv \)-symmetry for \(\mathcal {O}\) is for two conditions to hold: first, \(\mathcal {N}\) explains \(\mathcal {O}\) relative to the similarity relations in \(\mathcal {R}\) that are less discriminating than \(\equiv \), and second, this explanation of \(\mathcal {O}\) by \(\mathcal {N}\) is invariant under T. The similarity requirement above describing how \(\mathcal {N}\) explains \(\mathcal {O}\) captures how individual successes of the possibilities in \(\mathcal {O}\) can be explained through the possibilities of \(\mathcal {N}\). The invariance requirement is intended to capture the further demand that, just as two possibilities of \(\mathcal {O}\) related by T that are in the same equivalence class of \(\equiv \) are equally well suited to represent the features they share in common in virtue of being in that class, there are possibilities in \(\mathcal {N}\) related by T that are adequately similar to those two of \(\mathcal {O}\). Note that this invariance is not required of all the witnesses in \(\mathcal {N}\) of the explanation of \(\mathcal {O}\): T may well not preserve \(\sim \) for some n and o for which \(n \sim o\), but all that matters is that it do so for some.

Interestingly, according to this definition, in order for \(\mathcal {N}\) to explain an \(\equiv \)-symmetry T of \(\mathcal {O}\), it is not necessary for T to be even a \(\sim \)-approximate symmetry of \(\mathcal {N}\), or even any subset thereof, including the witnesses to this explanation for any \(o \in \mathcal {O}\), when \(\sim \) is less discriminating than \(\equiv \). To see this, take again the case from Sect. 2 of spur and helical gears, dividing them into equivalence classes based on their shape. Helical gears with sufficiently (but not arbitrarily) small helix angle, whether left- or right-handed, may each be relevantly similar in shape (i.e., related by each \(\sim \; \in \mathcal {R}\)) to spur gears of the same size, but not to each other. Under the parity (mirror) transformation, the shape of spur gears is invariant, while left- and right-handed helical gears switch their handedness. So, even if there are truly no spur gears and only helical gears in a set of gears of interest, the small helix angle of those gears allows one to explain why a theory that treats any individual helical gear as a spur gear may be adequate, and why that adequacy does not change under parity transformations, even if the helical gears are not even approximately symmetric under this transformation.
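The gear case can be made concrete with a small numerical sketch. It rests on assumptions that are not in the text: a gear's shape is represented by its signed helix angle in degrees, a single similarity relation with tolerance 3 stands in for every \(\sim \; \in \mathcal {R}\), and "approximately symmetric" is read pointwise as \(n \sim T(n)\). The function names are hypothetical.

```python
# Toy model (assumed): a gear's shape is its signed helix angle in degrees.
# 0 = spur gear, positive = right-handed helical, negative = left-handed helical.

def similar(x: float, y: float, eps: float = 3.0) -> bool:
    """Hypothetical similarity relation ~: helix angles differ by at most eps."""
    return abs(x - y) <= eps


def parity(angle: float) -> float:
    """Mirror transformation T: flips handedness and leaves spur gears fixed."""
    return -angle


def explains_symmetry(new, old, T, sim) -> bool:
    """For every o there is some n with T(n) in N, n ~ o, and T(n) ~ T(o)."""
    return all(
        any(T(n) in new and sim(n, o) and sim(T(n), T(o)) for n in new)
        for o in old
    )


spur = [0.0]           # O: the theory that treats every gear as a spur gear
helical = [-2.0, 2.0]  # N: only helical gears, each with a small helix angle

# N explains the parity symmetry of O ...
print(explains_symmetry(helical, spur, parity, similar))  # True
# ... although no helical gear is similar to its own mirror image.
print([similar(n, parity(n)) for n in helical])           # [False, False]
```

Each helical gear lies within the tolerance of the spur gear (a difference of 2), while the two helical gears differ from each other by 4, so the witnesses are carried to witnesses by parity without being approximately invariant themselves.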

But, T being a \(\equiv \)-symmetry of any subset of \(\mathcal {N}\) that explains \(\mathcal {O}\) (relative to \(\mathcal {R}\)) is sufficient for that subset to explain the \(\equiv \)-symmetry of \(\mathcal {O}\), provided that \(\equiv \) is at least as discriminating as any \(\sim \; \in \mathcal {R}\). To see this, suppose that

1. T, a partial or total transformation on \(\mathcal {S}\), is a \(\equiv \)-symmetry for \(\mathcal {O}\);

2. \(\equiv \) is at least as discriminating as any \(\sim \; \in \mathcal {R}\);

3. \(\mathcal {N}\) explains \(\mathcal {O}\) (relative to \(\mathcal {R}\)), i.e., for every \(o \in \mathcal {O}\) and every \(\sim \; \in \mathcal {R}\), there is some \(n \in \mathcal {N}\) such that \(T(n) \in \mathcal {N}\) and \(n \sim o\); and

4. T is at least a \(\equiv \)-symmetry of the subset of \(\mathcal {N}\) that witnesses the explanation of \(\mathcal {O}\) (relative to \(\mathcal {R}\)).

Now pick any o, \(\sim \), and n over which condition 3 quantifies. By condition 1, \(o \equiv T(o)\); by condition 3, \(n \sim o\); and by condition 4, \(n \equiv T(n)\). Because \(\equiv \) is at least as discriminating as \(\sim \) (condition 2), these first two statements imply \(n \sim T(o)\), which, combined with the third, implies \(T(n) \sim T(o)\). Since o and \(\sim \) were arbitrary and some n exists for any such choice, by definition \(\mathcal {N}\)—really, the subset thereof witnessing the explanation—explains the \(\equiv \)-symmetry of \(\mathcal {O}\).Footnote 26

4 Curie-Post principle

Once one establishes a certain intertheoretic relationship between two theories, does it entail anything about the relationships between their symmetries? Redhead (1975), following Post (1971), introduces the “Curie-Post Principle” as an analog to Curie’s principle, which states, roughly, that the symmetries of causes are always to be found in the symmetries of their effects (Brading and Castellani 2007, §2.2). If one extrapolates this idea to theories, taking a newer theory as a “cause” whose “effect” is the older theory it explains, then one arrives at the claim that the symmetries of the newer theory must also be found in the theory it explains. Redhead (1975) defends a restricted version of this principle. Essentially, he considers an old and a new theory, as discussed above, where both the new theory’s possibilities \(\mathcal {N}\) and the old theory’s possibilities \(\mathcal {O}\) are restricted to those with their common overlapping domain of application and which have also been “well-confirmed”, i.e., successfully applied. Call these subsets \(\mathcal {N}'\) and \(\mathcal {O}'\), respectively. Furthermore, comparisons of the possibilities (e.g., through the similarity relations of a similarity structure \(\mathcal {R}\) on \(\mathcal {O} \cup \mathcal {N}\)) are restricted to those sensitive only to properties that are well-defined for the possibilities of the old theory.Footnote 27 In other words, one assumes that each similarity relation in \(\mathcal {R}\) is defined on \(\mathcal {O}\). Finally, there is a further restriction on the abstract possibility space of the new theory to just that subset which explains the successful possibilities of the old (in the sense given before), i.e., those \(n \in \mathcal {N}'\) such that there is some \(o \in \mathcal {O}'\) for which \(n \sim o\) for each \(\sim \; \in \mathcal {R}\). Call this subset \(\mathcal {N}''\).

With these preliminaries in hand, he proposes the following restricted version of a Curie-Post Principle:

  • Curie-Post(-Redhead) Principle (Informal) “The [possibilities of \(\mathcal {N}\)] cannot be more symmetric than [those of \(\mathcal {O}'\)] provided the symmetry transformations considered do not break the [restrictions on \(\mathcal {N}\)] used in formulating the correspondence relation” (Redhead 1975, p. 104).

In fact, he writes that “For transformations which do not break the [restrictions on \(\mathcal {N}\)], [\(\mathcal {O}'\)] will be approximately symmetric in a sense that could be specified in terms of the metric” (Redhead 1975, p. 109n24) that defines approximation of the possibilities of one theory by another. Redhead does not precisely say quite what it means to break any of the aforementioned restrictions, but here is one possible explication:

  • Curie-Post(-Redhead) Principle (Formal) For any \(\equiv \)-symmetry T of \(\mathcal {N}\), suppose that there is a subset \(\mathcal {N}''' \subseteq \mathcal {N}''\) such that \(\mathcal {N}'''\) explains \(\mathcal {O}'\) relative to the similarity relations \(\sim \) less discriminating than \(\equiv \), and \(T[\mathcal {N}'''] \subseteq \mathcal {N}''\).

    Then T is a \(\sim \)-approximate symmetry of \(\mathcal {O}'\) for each such \(\sim \).

The conditions stated—that T preserves membership in \(\mathcal {N}''\) for a set of possibilities that explain those of \(\mathcal {O}'\)—are intended to capture the demand that T not “break” the restrictions on the possibilities of \(\mathcal {N}\).

Redhead does not prove his version of this principle. Indeed, as I have explicated it, it is false, in general. Consider again the case of the gears, but now in which the states of the spur gears are \(\mathcal {N}\) and those of the left-handed helical gears (with a helix angle that is small, but not arbitrarily so) are \(\mathcal {O}'\). Both the shape of the spur gears and their membership in the class \(\mathcal {N}\) are invariant under parity (mirror) transformations, but neither the shape of the left-handed helical gears nor their membership in \(\mathcal {O}'\) is even approximately invariant under them. So, it is entirely possible for the possibilities of the newer theory that explains the success of the older theory to have symmetries that the possibilities of the older theory do not have, even approximately, because the action of T on \(\mathcal {O}'\) need not be contained in \(\mathcal {O}'\), and even if it is, it need not preserve the \(\sim \) relations. The case is not helped if one assumes that the similarity structure arises from a (pseudo)metric, as Redhead (1975, p. 101) does, for the case just described can arise from a metric on the space of shapes for gears.
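This counterexample can be verified on a finite toy model. The sketch below makes assumptions beyond the text: shapes are signed helix angles in degrees, the metric is the absolute difference, the tolerance 3 stands for "small, but not arbitrarily so", exact symmetry is read pointwise as \(T(n) = n\), and \(\mathcal {N}\), \(\mathcal {N}''\), and \(\mathcal {N}'''\) all coincide with the single spur-gear state. All names and numbers are hypothetical.

```python
def similar(x: float, y: float, eps: float = 3.0) -> bool:
    """Hypothetical metric-induced similarity: helix angles differ by at most eps."""
    return abs(x - y) <= eps


def parity(angle: float) -> float:
    """Mirror transformation T on signed helix angles."""
    return -angle


spur = [0.0]           # N (here also N'' and N'''): the new theory's possibilities
left_helical = [-2.0]  # O': the old theory's successfully applied possibilities

# Hypotheses of the formal principle, as explicated above.
exact_symmetry_of_N = all(parity(n) == n for n in spur)
N_explains_O = all(any(similar(n, o) for n in spur) for o in left_helical)
T_keeps_witnesses = all(parity(n) in spur for n in spur)

# Candidate conclusion: T is at least an approximate symmetry of O'.
T_closed_on_O = all(parity(o) in left_helical for o in left_helical)
shape_approx_invariant = all(similar(o, parity(o)) for o in left_helical)

print(exact_symmetry_of_N, N_explains_O, T_keeps_witnesses)  # True True True
print(T_closed_on_O, shape_approx_invariant)                 # False False
```

All three hypotheses hold while both readings of the conclusion fail: parity sends the left-handed gear to a right-handed one, which lies outside \(\mathcal {O}'\) and differs from the original by 4, beyond the tolerance.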

Although my explication of the principle does not hold in general, perhaps there are other explications, maybe with further qualifications, that do. After all, there are of course many cases in which an old theory does have a transformation as an approximate symmetry that is an exact symmetry for a new theory that explains its success: consider, e.g., Lorentz boosts of sufficiently small velocity, which are exact symmetries of Minkowski spacetime but only approximate symmetries in Galilean spacetime because they approximate Galilean boosts for sufficiently small bounded (compact) spacetime regions. This may require restricting attention even further to continuous symmetries—i.e., those that form a topological group—that are sufficiently close to the identity.Footnote 28 But, I leave the investigation of these possibilities to future work.Footnote 29

5 Accidental symmetries

I now turn to the characterization of accidental symmetries in physics, for which I propose a novel account in Sect. 5.4 using the foregoing formalism. It will help to motivate the need for this account by contrasting it with the developed accounts available, due to Kosso (2000) (Sect. 5.1), Lange (2011) (Sect. 5.2), and Redhead (1975) (Sect. 5.3).

5.1 Kosso’s (2000) account

To begin with Kosso’s, his is a holistic, interdependence account: “A symmetry is fundamental to the extent that other things depend on it and that it depends on other things” (Kosso 2000, p. 115). The more dependence relations between a symmetry and other structures in a theory, the more fundamental it is; it is accidental to the extent that it is not fundamental. The fundamentality of a symmetry is determined through its interdependence with these other structures. We find justification for a symmetry’s fundamentality, as described in a theory, through its theoretical rigidity:

A claim is theoretically rigid to the extent that it cannot be changed or abandoned without forcing changes or abandonment of other theoretical claims. ...Theoretical rigidity is being used here as an epistemic criterion to justify the belief that a symmetry is fundamental (Kosso 2000, p. 117).

Conversely, we find evidence for a symmetry being accidental to the extent that it exhibits “no theoretical connection to other events or properties in nature. ...It is accidental in virtue of its lack of interdependence, and it is clearly recognized as accidental by its theoretical alienation.” (Kosso 2000, p. 118). In Kosso’s holistic account, therefore, one has evidence that a symmetry is more or less fundamental (that is, less or more accidental) to the extent that it is well integrated and embedded in the theory of phenomena in which it is found.

One consequence of this is that one should expect physicists to make comparative judgments or claims of degree about a symmetry’s putative fundamentality. That is, if being fundamental or accidental is a matter of degree (of interdependence or integration), one would expect to find physicists claiming for some certain symmetry that it is somewhat accidental (or fundamental), or more accidental (or fundamental) than another. But one does not find this. One finds rather that physicists make categorical judgments about a symmetry being accidental. For instance, Weinberg (1995, p. 529) opines that

It often happens ...that the effective Lagrangian [at low energies] automatically obeys one or more symmetries, which are not symmetries of the underlying theory .... Indeed, most of the experimentally discovered symmetries of elementary particle physics are ‘accidental symmetries’ of this sort.

Such symmetries of the low-energy theory are accidental, simpliciter. Similarly, Polchinski (2017, p. 18) remarks that “In the context of the Standard Model, global SU(3) is now seen as an accident, which appears in an approximate way because the scale of the quark masses happens to be somewhat less than the scale of quark confinement”. There is no characterization of the degree of theoretical rigidity of the accidental symmetry, or qualifications about how weak that rigidity is.

One might defend Kosso’s account on the grounds of it being interesting despite not being an “actor’s category”, in the historians’ sense. But Kosso (2000, p. 109) is explicit that he intends his account to describe the practices of physicists, in particular those of particle physicists with regard to the Standard Model. So, Kosso’s account, whether or not it is interesting in its own right, does not seem to capture adequately what physicists mean by “accidental symmetry”.

5.2 Lange’s (2011) account

Now, one of the features of Kosso’s account is that

An accidental symmetry can be changed or even abandoned without any collateral changes in nature, since nothing depends on an accidental symmetry. ...Changing a fundamental symmetry, on the other hand, precipitates a widespread and radical alternation in the theoretical network, since other things depend on the fundamental symmetry. (Kosso 2000, p. 119)

Lange (2011, p. 349n7) elevates this featureFootnote 30 to a characterization of accidental symmetries, dropping the requirement on interdependence: “I understand accidental symmetries as symmetries that impose no constraints on the allowable interactions in that the symmetries might not still have held had there been different kinds of interactions”.Footnote 31 But the same issue about the mismatch between the account’s predictions and physicists’ usage arises: surely it is possible that some symmetries impose few (but more than zero) constraints on interactions, while others impose some more. These intermediate symmetries would not quite be as accidental, but it is difficult to find expressions of this sort—that is, of the degree of being accidental—among physicists.

5.3 Redhead’s (1975) account

The idea of imposing constraints on other physics is also found in the characterization of accidental symmetries by Redhead (1975, p. 81): “while expressing interesting features of some specialized phenomenon, they are in a sense dynamical accidents having no fundamental physical significance”. By this he means that they are symmetries that preserve the properties that they do, such as being a dynamically allowed possibility, only when applied to a subclass of possibilities—just like the relationship of \(\mathcal {N}'\) to \(\mathcal {N}\) in Sect. 3—and which are not held to constrain the development of future physics. Thus the constraint on allowable interactions provided by accidental symmetries for Redhead is primarily of heuristic potential for future physics, rather than providing counterlegal claims about present physics, as it was for Lange (2011). He emphasizes moreover the malleability of this ascription, that “What may appear initially as an accidental symmetry may later transpire to have heuristic potential. ...Alternatively putative heuristic symmetries may be downgraded to accidental status” (Redhead 1975, p. 81), giving the examples of gauge symmetry and unitary symmetry in hadron physics, respectively.

Recognizing the heuristic dimensions of accidental symmetries and how their status changes relative to theoretical developments importantly contrasts Redhead’s views with those of Kosso and Lange. But those two were right to see this classification as having some other (additional) role than a heuristic one: though accidental symmetries may guide future research, they also bear upon the interpretation of current theories.

5.4 A new account

In light of my account of approximate symmetries, I believe I can offer a new account of accidental symmetries that combines the virtues of the three foregoing accounts without their vices. Let \(\mathcal {O}\) be an abstract possibility space representing some physical phenomena from a theory with limited scope, such as one in particle physics restricted to a particular low-energy scale, and let \(\mathcal {N}\) be those from a theory of wider scope that explains \(\mathcal {O}\) relative to a similarity structure \(\mathcal {R}\) encoding, e.g., observable features of the possibilities. Further, suppose that a transformation T on their joint abstract possibility space is a symmetry or approximate symmetry (of some equivalence or similarity relation, respectively). Then T is accidental in \(\mathcal {O}\) relative to \(\mathcal {N}\) when, after T is restricted to \(\mathcal {N}\), it is not also a symmetry or approximate symmetry, respectively, of \(\mathcal {N}\); insofar as such an \(\mathcal {N}\) is not available but anticipated, one then anticipates classifying T as accidental (in this relational way). In a word, a symmetry, exact or approximate, is accidental for the possibilities from a theory relative to another that explains it when it is not a symmetry for the possibilities from that other theory.

Finding an approximate symmetry in a theory that one anticipates explaining with another can sometimes be a good heuristic constraint, therefore, on the search for that explaining theory—cf. footnote 23. As Weinberg (1997, p. 40) advises in the context of particle physics, “we would be suspicious of any approximate symmetry that could not be explained as an accidental consequence of the constraints imposed by renormalizability and the various exact symmetries”. (Here, the exact symmetries are those anticipated of the explaining theory, and the constraint of renormalizability restricts further the theories under consideration.) When one has reason to believe that a symmetry is accidental, even if it is exact, that may influence how one decides to run a surplus structure argument on it, for “what are ‘unphysical’ surplus features in the appropriate description at a determinate regime may become physically relevant features in a very different regime” (Castellani 2003b, pp. 435–436).

Moreover, my characterization of accidental symmetry is categorical, fitting with physicists’ usage. Although accidental symmetries do not put constraints on the form of the allowable interactions, just as Lange describes, this is a consequence, not a characterization, of what it means to be accidental. The reason they do not impose these constraints is the same reason that they are less integrated or interdependent, in the sense of Kosso: they arise only from merely effective state ascriptions to the phenomena, which are superseded by some theory with better accuracy or broader scope that ascribes possible states to the phenomena for which the putative symmetry does not hold, as is the case with global SU(3) symmetry, mentioned in the quotation from Polchinski (2017) in Sect. 5.1.

6 An invitation

With the formal account of approximate symmetry from Sect. 2 in hand, one can at last investigate conceptual questions regarding approximate symmetries with more precision. I did so for three sorts of cases pertaining to the role of symmetry in intertheoretic relations. Besides giving an account of what it means to explain a symmetry of a class of possibilities over and above explaining the success of those possibilities individually (Sect. 3), I also gave a more precise explication of the notion of an accidental symmetry (Sect. 5) and of the Curie-Post(-Redhead) Principle (Sect. 4), which affirms the greater symmetry of a theory than another which it explains. Even though that Principle was shown to be false, I suggested that perhaps further work will reveal a restriction of it that is true.Footnote 32

Two other subjects already invite themselves for investigation within this formal framework. One is the role of approximate symmetry in the more usual applications of Curie’s Principle to symmetry arguments concerning the origins of asymmetric phenomena, especially in considerations of cause and effect. There is considerable debate about the domain of application of such a principle (Brading et al. 2017, §3). But perhaps with the exception of somewhat informal remarks by Rosen (2008, §§5.2, 6.2), the role of approximate symmetries in arguments that invoke Curie-like principles, and the status of those arguments, have not received much attention.

The other subject is the role of approximate symmetry in the interpretation of theories, e.g., their suggested ontologies and role in scientific reasoning and theorizing. Already van Fraassen (1989) has suggested that exact symmetry can play such a role in place of laws, but what about approximate symmetry? In a realist vein, Kantorovich (2003) has argued that symmetries should be understood as physical relations or structures that are ontologically prior to (and thus, in certain respects explanatory of) particles. His presentation of the historical evidence acknowledges that approximate symmetries played a role in the development of particle physics and the Standard Model, but he does not substantially develop what implications this has for the ontological interpretation of symmetries.Footnote 33 Although the initially developed approximate symmetries were jettisoned for exact local symmetries, it remains entirely possible that these will only be approximate in some future physics. What status therefore does their ontological interpretation have now, in light of the defeasibility of our judgments of their exactness?

But these two subjects hardly exhaust the possibilities. For example: Some have contended that symmetry plays a role in characterizing what is objective in physics (Weyl 1952; Nozick 2001; Debs and Redhead 2007); do approximate symmetries then characterize what is approximately objective? If so, what is the function and interpretation of approximate objectivity? Could it be understood through approximate invariance, or a related concept? Others have contended that symmetries play a special (epistemic) role in discovering the laws of nature, restricting the possible variety of phenomena we observe: “if the correlations between events changed from day to day, and would be different for different points of space, it would be impossible to discover them” (Wigner 1967, p. 29). But it seems that merely approximate symmetry may be sufficient to play this role (Brading and Castellani 2007, p. 1363); consequently, is mere similarity, not symmetry, what’s needed to discover physical law? I hope that this essay has served as an invitation to consider these questions further.