"Die Energie der Welt ist constant. Die Entropie der Welt strebt einem Maximum zu."Footnote 1

-Rudolf Clausius (1822–1888)

"For Boltzmann…the probability calculus was primarily a technique for evading paradox; the mechanical approach to gas theory…exemplified by the H-theorem, was always his fundamental tool, the one to which he returned again and again."Footnote 2

-Thomas Kuhn (1922–1996)

"Let us now turn to the second matter in dispute between us. That the majority of students don't understand philosophy doesn't bother me. But can any two people understand philosophical questions? Is there any sense at all in breaking one's head over such questions? Shouldn't the irresistible pressure to philosophize be compared with the nausea caused by migraine headaches? As if something could still struggle its strangled way out, even though nothing is actually there at all?

My opinion about the high, majestic task of philosophy is to make things clear, in order to finally heal mankind from these terrible migraine headaches. Now, I am one who hopes not to make you angry by my forthrightness, but the first duty of philosophy as love of wisdom is complete frankness. Through my study of Schopenhauer, I am learning Greek ways of thinking again, but piecemeal."Footnote 3

-Ludwig Boltzmann’s (1844–1906) letter to Franz Brentano (Vienna, January 4th, 1905)

1 Introduction

With respect to an empirically successful physical theory T, it is believed that one can use T to acquire approximately true descriptions and explanations of phenomena in the world without appreciating T’s historical development if one has and uses a universally accepted formulation and partial interpretation of T (cf. the remarks in [226], p. 923). In contemporary statistical mechanics (SM) for the classical limit, everyone does what is right in their own eyes (borrowing some wording from Judges 21:25). Unlike contemporary Minkowskian special relativity, or classical Maxwellian electrodynamics, there is no generally agreed upon formulation or approach to non-equilibrium or equilibrium SM. Aside from the (a) Gibbsian approach,Footnote 4 there are (b) epistemic and information theoretic approaches including some with and some without the Shannon entropy,Footnote 5 (c) Boltzmannian approaches with and without ergodicity that use the Boltzmann entropy,Footnote 6 (d) stochastic dynamical approaches that modify the underlying classical microdynamics,Footnote 7 (e) the Brussels-Austin School,Footnote 8 and (f) the BBGKY Hierarchy approach, a type of chimera that includes both Gibbsian and Boltzmannian ideas.Footnote 9

Scientific realism is the view that most of the unobservables that are essential to our best physical theories exist and that most property attributions to the self-same unobservables expressed in statements essential to our best physical theories are at least approximately true. Given realism, the multifarious ways of formulating and interpreting SM should not be brushed off as harmless. Each formulation and interpretation of SM recommends a distinctive scientific ontology. For example, some theories provide competing characterizations or interpretations of quantities like entropy. The Boltzmannians claim that thermodynamic entropy is the Boltzmann entropy (\(S_{B}\)), an objective property of physical systems whose mathematical representative can change its value over time (q.v., n. 11). For many Gibbsians, \(S_{G}(\rho)\) or the Gibbs entropy is thermodynamic entropy.Footnote 10 \(S_{G}(\rho)\) is a constant of motion on the assumed classical Hamiltonian mechanics. It is a time-independent function of a density or probability distribution associated with modal system ensembles. Sometimes interpretations of one and the same formulation of SM can imply different scientific ontologies. For example, different assumptions about the interpretation of probability in one and the same approach recommend non-identical scientific ontologies. One might adopt (a) and yet understand the involved probabilities to be propensities that cause relative frequencies. Or one could remain Gibbsian and yet believe that probabilities in the theory just are frequencies solely (i.e., propensities are removed from the interpretation of the formulation of (a)). Who is right? Some will shout the answer: “That theory or approach which enjoys the most empirical success is the theory that is closest to the truth!” (remember that I am assuming realism).

Suppose the exclaimed answer is correct and that (a)–(f) could somehow be empirically distinguished. Let us further suppose that in point of fact, the deliverances of experimentation and scientific observation privilege (c) the Boltzmannian approach. Like many modern promulgators of (a)–(b) and (d)–(f), defenders of standard Boltzmannian SM (BSM) claim to possess a unique ideological solidarity with the fathers of modern kinetic theory and statistical mechanics, namely James Clerk Maxwell (1831–1879), Ludwig Eduard Boltzmann (1844–1906), and Josiah Willard Gibbs (1839–1903).Footnote 11 In light of the dizzying array of approaches, proponents of BSM have tried to distinguish their perspective, not just by pointing to their theory’s empirical success, but also by telling a story (the Standard Story) about how Boltzmann came to affirm a combinatorial characterization of entropy (q.v., n. 11) and a statistical statement of the second law of thermodynamics (q.v., Appendix 1).

According to the Standard Story, from 1866Footnote 12 to 1877,Footnote 13 Boltzmann hoped to provide a purely mechanical justification of the second law, eventually (as of 1872Footnote 14) relying upon his famous minimum theorem (later called the H-theorem). However, Boltzmann’s efforts were met by the famous reversibility objection articulated by his colleague Johann Josef Loschmidt (1821–1895) in 1876.Footnote 15 In 1877,Footnote 16 Boltzmann repented and turned to his combinatorial arguments wherein was featured combinatorial entropy (q.v., n. 11) and a statistical understanding of the second law. He subsequently (to quote one renowned historian of physics) turned “his attention to other matters, returning…only occasionally, to add a footnote or two to his earlier expositions, or to comment on some other physicist’s discussion…” (M. J. Klein, Ehrenfest [143], p. 108).Footnote 17

In what follows, I challenge the Standard Story while also providing the beginnings of a Boltzmannian approach that stands in true solidarity with Boltzmann’s corpus. Like Hans Christian Ørsted’s (1777–1851) reason for seeking a discovery of the interaction between electricity and magnetism (his Naturphilosophie),Footnote 18 or one of Maxwell’s reasons for preferring a field ontology in electrodynamics (viz., that causes must be spatiotemporally local),Footnote 19 or one of Albert Einstein’s (1879–1955) reasons for preferring the Lorentzian spacetime of general relativity to the Minkowski spacetime of special relativity (viz., the action-reaction principle),Footnote 20 my Boltzmannian outlook is motivated by a metaphysical thesis, a thesis that is friendly to (what I will show in Sect. 5.1 was) Boltzmann’s aim to mechanically explainFootnote 21 the process of entropic increase:

(Causal Collisions (CC)): Within the collisions that are quantified over by the hypothesis of molecular chaos (HMC) (Sect. 8.1) and that produce entropic increase thereby making true the Boltzmann equation (Sect. 4) and H-theorem (Sect. 5) are instances of an obtaining fundamental causal relation that is formally and temporally asymmetric. Particular instances of this fundamental relation in evolutions of thermodynamic systems necessitate one-sided chaos and produce the velocity correlations referenced by the (HMC).

I will detail precisely how (CC) does a surprising amount of explanatory work (it earns its keep) primarily by arguing that it enables one to respond to the reversibility objection without having to endorse Boltzmann’s combinatorial arguments.

2 The Maxwell Distribution

Let’s travel back in time to the year 1859 at a meeting of the British Association for the Advancement of Science in Aberdeen, Scotland. Motivated by the 1857 Adams Prize that was announced in March of 1855, James Clerk Maxwell has just completed his studies on the rings of Saturn.Footnote 22 Those studies involved the utilization of probabilistic reasoning in physics ([126], p. 128, [203], p. 168) as well as reflection upon complex systems of colliding bodies ([125], p. 25). What is more, that reasoning and reflection primed Maxwell for the development of contributions to, and investigations of, the early kinetic theory of gases.Footnote 23

Maxwell’s presentation at the aforementioned 1859 meeting is entitled “Illustrations of the Dynamical Theory of Gases” and it will be published in two parts in The Philosophical Magazine a year later.Footnote 24 Maxwell’s exposition of proposition IV (found later in [171], 22–24) includes a heuristic argument for a particular hypothesis concerning the velocity distribution for gas molecules understood as elastic spheres composing a gas at uniform pressure. A velocity distribution for a gas system is a quantitative description of the molecular velocities enjoyed by the constituents of the gas at particular temperatures. Velocity distributions can give one both the average and most probable molecular speeds of constituents of a gas system at various temperatures. From knowledge of average molecular speeds multifarious phenomenological properties can be inferred.

If we glide forward in time to 1867, we’ll find Maxwell at his family estate (Glenlair House) just before he’d become the first Cavendish Professor of Physics at Cambridge. There Maxwell publishes “On the Dynamical Theory of Gases”Footnote 25 in the Royal Society’s Philosophical Transactions after admitting the existence of problems with his 1860 theory of gas diffusion revealed in criticisms from Clausius in [74].Footnote 26 That paper sharpens some of his 1860 argumentation in that several of the assumptions of the 1860 work are abandoned in favor of more realistic assumptions, although both the 1860 and 1867 projects maintain the spirit of some earlier correspondence between Maxwell and Sir George Gabriel Stokes (1819–1903).Footnote 27

Contrary to the reigning paradigm of thought at the time (especially in the work of ClausiusFootnote 28), Maxwell hypothesized that collisions between gas molecules over time do not produce the same or close to the same velocities for every constituent molecule of a gas system, although the molecular kinetic energies are caused by those collisions to equal or closely approach the same value.Footnote 29 Rather, over time, collisions produce a distribution of speeds or velocities. All velocities and positions of the molecular constituents consistent with the conservation laws and the system’s total energy are assumed to be nomologically possible as the system evolves.

For Maxwell, collisions are causal phenomena, as are the processes of physical systems that lead to them.Footnote 30 Maxwell believes collisions are causal because within such processes forces act and those forces are causes.Footnote 31 Maxwell includes in the titles of the 1860 and 1867 projects the term ‘dynamical’. This is purposeful. As in his celebrated paper “A Dynamical Theory of the Electromagnetic Field” published in 1865,Footnote 32 Maxwell’s approach is dynamical because he’s trying to account for the motions of bodies by appeal to causal forces, except in the case of gas systems he does not involve the causal influences of fields. In December of 1866, Maxwell wrote to Stokes as follows: “I therefore call the theory a dynamical theory because it considers the motions of bodies as produced by certain forces”.Footnote 33

Maxwell is propelled into his particular way of studying gas systems by reading Clausius's [73] memoir “On the Mean Length of the Paths Described by the Separate Molecules of Gaseous Bodies”.Footnote 34 He learns that the key to discerning the properties of gas systems is to look to collisions of the constituents of those molecules, which were (as with Maxwell) causal phenomena in the mind and work of Clausius. Clausius believed that around each gas constituent (or center of gravity) is a “sphere of action”Footnote 35 determined by the capacity of such constituents to “drive” one another “asunder” (i.e., to repel one another).Footnote 36 When a constituent α approaches another constituent β thereby entering β’s domain of repulsive influence or sphere of action, a rebounding effect results from a repulsive force, and both α and β (because of Newton’s third law of motion) change their velocities (given appropriate inertial masses). The acting repulsive force is causal in that it produces its “effects…at very small distances”.Footnote 37 In a manner very much dependent upon Clausius, Maxwell’s [174] work maintained that gas systems attain velocity distributions indicative of thermal equilibrium (q.v., Eqs. (0) and (1) below) because of the collisions of their constituents, where again, the collisions were understood by Maxwell to be the causal mechanisms that produce velocity changes.Footnote 38 This is all encoded in the underlying mathematics.Footnote 39

Maxwell provided a quantitative statement of his velocity distribution. That is to say, he wrote down an equation for \(f\left(\mathbf{v}\right)\) or the average number of molecular constituents in a gas that enjoy a velocity between two limits (\(\mathbf{v}\) and \(\mathbf{v}+{d}^{3}\mathbf{v}\)) subsequent to a great many collisions between similar gas constituents (Maxwell, Part 1 [171], 22). The 1867 statement of \(f\left(\mathbf{v}\right)\) takes the general form \(f\left(\mathbf{v}\right)=\alpha {e}^{-\beta {u}^{2}}\) (where the velocity \(\mathbf{v}\) is a three-vector with three Cartesian components \(v_{x}\), \(v_{y}\), and \(v_{z}\), the distribution function \(f\left(\mathbf{v}\right)\) is isotropic,Footnote 40 α and β are constants, e is Euler’s number (the base of natural logarithms) such that \(e\approx 2.71828\), and u is mean velocity). Or more precisely,

(0) Maxwell’s Distribution Law (Vector Notation): \(f\left(\mathbf{v}\right)\propto \mathbf{v}^{2}e^{-m\mathbf{v}^{2}/2kT}\). This function was said to satisfy the relation \(f(\mathbf{v}_{1})f(\mathbf{v}_{2})=f(\mathbf{u}_{1})f(\mathbf{u}_{2})\) for two distinct gas constituents enjoying respective pre-collision velocities \(\mathbf{v}_{1}\) and \(\mathbf{v}_{2}\), and post-collision velocities \(\mathbf{u}_{1}\) and \(\mathbf{u}_{2}\).

Maxwell’s actual work (which was without modern vector notation) would affirm,

(1) Maxwell’s Distribution Law: \(f\left(v\right)=\frac{N}{\alpha^{3}\pi^{3/2}}e^{-v^{2}/\alpha^{2}}\) (where \(N\) is the number of gas molecules, and \(\alpha^{2}\) takes a value that is directly proportional to the gas’s absolute temperature)Footnote 41

Equation (1) (or (0)) implies that the distribution function is asymptotically Gaussian. It was understood by Maxwell to give a velocity distribution for the molecules of a gas in thermal equilibrium. He would try to show that his distribution is stable in the sense that collisions among molecules would not disrupt or otherwise change the distribution’s applicability to select gases in equilibrium. In other words, Maxwell attempted to quantitatively demonstrate that once gas systems achieve equilibrium status they stay there. His attempt failed.Footnote 42 Maxwell’s failure notwithstanding, important justifications of Maxwell’s distribution exist (Brush, vol. 2 [47], pp. 187–188), as do modern versions of his reasoning with suitable fixes ([88], pp. 81–84). We can now claim that for an appropriate restricted set of classical gas systems, Maxwell’s distribution is indeed the correct velocity distribution in that it accurately describes the distribution of velocities for those systems in equilibrium. Important experimental confirmation appears in the work of Nobel Laureate Otto Stern (1888–1969) and the 1927 experimentation of John A. Eldridge (b. 1891).Footnote 43
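As a consistency check on (1), the following short worked integral is offered as a sketch. It assumes (an assumption of this illustration) that \(f(v)\) in (1) is a density over three-dimensional velocity space with \(v\) the speed, so that the volume element is \(4\pi v^{2}\,dv\). On that reading, the prefactor in (1) is exactly what normalizes \(f\) to the number of molecules \(N\):

$$\int f\,d^{3}\mathbf{v}=\frac{4\pi N}{\alpha^{3}\pi^{3/2}}\int_{0}^{\infty}v^{2}e^{-v^{2}/\alpha^{2}}\,dv=\frac{4\pi N}{\alpha^{3}\pi^{3/2}}\cdot\frac{\sqrt{\pi}\,\alpha^{3}}{4}=N$$

On the same assumption, the number of molecules per unit speed is \(4\pi v^{2}f(v)\), which is stationary where \(\frac{d}{dv}\left(v^{2}e^{-v^{2}/\alpha^{2}}\right)=0\), that is, at \(v=\alpha\). The constant α can therefore be read as the most probable molecular speed.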

Maxwell attempted to show that his distribution is the only stable distribution under collisions. To do that he used a collision number over some time period of dynamical evolution of the choice gas system (qq.v., n. 38 and n. 39).Footnote 44 How Maxwell acquired his collision number thereby attempting to justify his claim regarding stability appears to be mysterious. Numerous commentators have expressed their inability to get past several obscurities and confusions in Maxwell's [174] argumentation.Footnote 45 However, everyone seems to agree that his 1860 and 1867 reasoning made use of other assumptions some of which are probabilistic. I have found at least five. I explicate three of them below leaving the last two assumptions about the nature of collisions for Sect. 8.

(2) The constituents of gas systems are centers of force and can be regarded as what we now call “Maxwell molecules”, i.e., point-like molecules (for all intents and purposes point-masses) or collections thereof that “move about as a single body”Footnote 46 and that interact by means of central repulsive forces inversely proportional to the fifth power of the distance between them.Footnote 47

(3) Every direction of particle rebound subsequent to a binary collision is equally probable.Footnote 48

(4) If (3), then both (a) all three velocity components of any involved velocity have independent probability distributions and (b) every displacement direction is as likely as every other.

Notice that assumptions (3) and (4) are at least in part about probabilities.Footnote 49

Are (2)-(4) good assumptions? Leaving aside Maxwell’s claim regarding fifth powers, no atomist would baulk at (2). Assumption (4) is proven in [174]. But what about (3)? I shall not appraise it. Maxwell already did. He called it a precarious assumption which he believed put his approach in danger of being altogether unrelated to actual world collisions and interactions.Footnote 50 I wish to add only that Maxwell’s justification of (3) rested upon the work of Sir John Herschel (1792–1871), specifically Herschel’s unsigned 1850 review of Adolphe Quetelet’s (1796–1874) work on probability in the 92nd volume of the Edinburgh Review [128].Footnote 51 This influence is important because we know that Herschel held an epistemic or Bayesian view of probability, maintaining that probabilities are degrees of belief or credences.Footnote 52 Herschel’s understanding of probability seemed to have rubbed off on Maxwell for one can clearly see an allegiance to an epistemic interpretation of probability in Maxwell’s corpus.Footnote 53 This should not surprise us. Frequentism was the interpretation of choice in the twentieth century, but Bayesianism reigned supreme in physics during the nineteenth century ([5], p. 629). These matters will become important later.

3 The Maxwell–Boltzmann Distribution

From 1868 to 1871, Boltzmann generalized (1) (i.e., the Maxwell distribution) for gas molecules in such a way that he obtained a velocity distribution for systems of gas molecules with internal and gravitational degrees of freedom (the generalizations eventually captured systems of polyatomic gas molecules).Footnote 54,Footnote 55 Boltzmann’s main result is called the Maxwell–Boltzmann distribution. While Maxwell's [174] distribution took the form: \(f\left(\mathbf{v}\right)=\alpha {e}^{-\beta {u}^{2}}\) (as noted above), Boltzmann's [16] distribution took the form: \(f\left(\mathbf{v}\right)=a{e}^{-\beta E}\) where \(a\) and \(\beta\) stand for constants, and \(E\) is energy. Its fuller content reads,

(5) Maxwell–Boltzmann Distribution: \(f\left(v\right)=Ae^{-h\left(\frac{1}{2}mv^{2}+V[x]\right)}\) (where A is the number of molecules such that that amount normalizes f; h is really just 1/kT in modern notation, k is Boltzmann’s constant)Footnote 56 or we could just write: \(f\left(v\right)=Ae^{-E/kT}\) (where E is total energy).Footnote 57
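To make the role of the potential term in (5) concrete, here is a minimal Python sketch. It rests on assumptions that are not in the source: the potential is taken to be that of uniform gravity, \(V = mgz\), the molecular mass is roughly that of a nitrogen molecule, and the function name `maxwell_boltzmann` and its parameters are hypothetical. Under those assumptions, the ratio of the distribution at two heights does not depend on the molecular speed and reduces to the Boltzmann factor \(e^{-mgz/kT}\).

```python
import math

K_B = 1.380649e-23  # Boltzmann's constant k in J/K


def maxwell_boltzmann(speed, height, mass, temperature, amplitude=1.0, g=9.81):
    """Evaluate f = A * exp(-h * (m v^2 / 2 + V)) with h = 1/kT.

    Assumption of this sketch: V = m * g * height (uniform gravity).
    """
    h = 1.0 / (K_B * temperature)
    energy = 0.5 * mass * speed ** 2 + mass * g * height
    return amplitude * math.exp(-h * energy)


MASS = 4.65e-26       # kg, approximately one N2 molecule (illustrative value)
TEMPERATURE = 300.0   # K
HEIGHT = 1000.0       # m

# Ratio of f at HEIGHT to f at ground level, for a slow and a fast molecule.
ratio_slow = (maxwell_boltzmann(100.0, HEIGHT, MASS, TEMPERATURE)
              / maxwell_boltzmann(100.0, 0.0, MASS, TEMPERATURE))
ratio_fast = (maxwell_boltzmann(500.0, HEIGHT, MASS, TEMPERATURE)
              / maxwell_boltzmann(500.0, 0.0, MASS, TEMPERATURE))

# The position-only Boltzmann factor exp(-m g z / kT).
boltzmann_factor = math.exp(-MASS * 9.81 * HEIGHT / (K_B * TEMPERATURE))

print(ratio_slow, ratio_fast, boltzmann_factor)
```

The three printed numbers agree up to rounding, which illustrates that the exponential in (5) factorizes into a velocity part and a position part.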

Due to objections from Francis Guthrie (1831–1898), Maxwell would himself try his hand at deriving this generalized distribution in [175] and then again in [177].Footnote 58

Boltzmann tried to prove that any distribution (for the types of gases with which he was concerned) would tend towards (5) (uniqueness), given a change in time, but would later (1898) state in volume two of his Lectures on Gas Theory that he could not actually prove its uniqueness ([35], pp. 313–340).

Although Maxwell did not seem to favor calling \(f\left(v\right)\) a probability, both he (subsequent to 1867) and Boltzmann interpreted (5) in such a way that it said that the most highly probable energy of a gas molecule takes a value equal to kT, where it is understood that molecules could take on energies with a great many other values (consistent with the total energy values and conservation laws) because the likelihood of such energy assignments is never zero. That the distribution given in (5) represents an appropriate gas in equilibrium and that it gives the unique distribution for such a gas system is generally agreed upon by even modern practitioners of what we now call classical statistical mechanics. It is therefore a bona fide law of classical theory. Some scholars also maintain that Boltzmann’s derivation of (5) is “impeccable” (at least for the non-polyatomic cases), given that the distribution faithfully represents the speeds of molecules in systems at equilibrium ([64], p. 88, and see also pp. 283–286).

4 die Fundamentalgleichung

After generalizing the Maxwell distribution so as to obtain the Maxwell–Boltzmann distribution, Boltzmann remarked that “[i]t has thus not yet been demonstrated that whatever the state of the gas may have been at the start, it must always approach the limit discovered by Maxwell”.Footnote 59 Boltzmann is here concerned with the missing proof of the uniqueness of the distribution function. In (Boltzmann, Further Studies on the Thermal Equilibrium of Gas Molecules [21]), Boltzmann turned to the task of finding an equation (what we would later call the Boltzmann equation) that tracks the evolution of the velocity distribution over time in irreversible processes so as to help reach the missing proof.Footnote 60 For cases involving systems with but one species of particle, the Boltzmann equation reads,

(6) The Boltzmann Equation or Boltzmann’s Transport Equation: \(\frac{\partial f}{\partial t}+\mathbf{a}\cdot \frac{\partial f}{\partial \mathbf{v}}+\mathbf{v}\cdot \frac{\partial f}{\partial \mathbf{r}}=\left(\frac{\partial f}{\partial t}\right)_{\mathrm{coll.}}\). Here f is dependent upon time t, position r, and velocity v, and it represents the distribution function describing the gas system’s state and also how that state evolves, \(\mathbf{a}\) represents the accelerations of the particles/molecules between their collisions, and the right-hand side of the equation, \(\left(\frac{\partial f}{\partial t}\right)_{\mathrm{coll.}}\), represents the collision-produced rate of change of the distribution function f.Footnote 61

Here is an expression closer to the original work,Footnote 62

(7) Early Boltzmann Equation:

$$\frac{\partial f}{\partial t}=\int \mathrm{d}\mathbf{v}_{2}\int \left\{f(\mathbf{u}_{1})f(\mathbf{u}_{2})-f(\mathbf{v}_{1})f(\mathbf{v}_{2})\right\}\left|\mathbf{v}_{1}-\mathbf{v}_{2}\right|\,d\Omega \,\sigma \left(\Omega \right)$$

where \(d\Omega \sigma \left(\Omega \right)\) is the differential collision cross section “for a collision in which the relative velocity” after the collision is “in the solid angle \(d\Omega\) at \(\Omega\) compared to the relative velocity before”.Footnote 63 The involved integrals are over every possible scattering angle and every possible velocity \({\mathbf{v}}_{2}\) of the collision partner. Function f is the distribution function, velocities \(\mathbf{u}_{1}\) and \(\mathbf{u}_{2}\) are final (post-collision) velocities, and \(\mathbf{v}_{1}\) and \(\mathbf{v}_{2}\) are initial (pre-collision) velocities.

The Boltzmann equation “completely determines the evolution of the distribution f from its initial value”.Footnote 64 It says how “the distribution” changes “in time under the action of the collisions”.Footnote 65 And if you can correctly solve for f, then with (7) or some form of (6) you’ll obtain all that’s needed to compute thermodynamic phenomenological properties of the relevant system. Boltzmann would add that the right side of Eq. (7) vanishes such that \(\frac{\partial f}{\partial t}=0\) when the distribution function is Maxwell’s, and all other functions tend toward Maxwell’s (uniqueness). Boltzmann also affirmed that the velocity distribution will cease to change once it becomes the Maxwell distribution (stability or stationarity).Footnote 66
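The claim that the right side of (7) vanishes for Maxwell’s distribution can be checked numerically. The sketch below is illustrative only and makes several assumptions that are not in the source: the two molecules have equal mass, collisions are elastic, the distribution is taken in the general Gaussian form \(\alpha e^{-\beta |\mathbf{v}|^{2}}\) with arbitrary constants, and the helper names (`gaussian_f`, `elastic_collision`, `collision_integrand`) are hypothetical. Because an elastic collision conserves \(|\mathbf{v}_{1}|^{2}+|\mathbf{v}_{2}|^{2}\), the bracketed integrand \(f(\mathbf{u}_{1})f(\mathbf{u}_{2})-f(\mathbf{v}_{1})f(\mathbf{v}_{2})\) should be zero, up to rounding, for every collision.

```python
import math
import random


def gaussian_f(v, alpha=1.0, beta=0.5):
    """Gaussian velocity distribution alpha * exp(-beta * |v|^2)."""
    return alpha * math.exp(-beta * sum(c * c for c in v))


def random_unit_vector(rng):
    """Draw an isotropically distributed direction in three dimensions."""
    while True:
        x = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in x))
        if norm > 0.0:
            return [c / norm for c in x]


def elastic_collision(v1, v2, direction):
    """Equal-mass elastic collision.

    The center-of-mass velocity is kept, and the relative velocity keeps
    its magnitude but is rotated to point along `direction`.
    """
    center = [(a + b) / 2.0 for a, b in zip(v1, v2)]
    half_rel = 0.5 * math.sqrt(sum((a - b) ** 2 for a, b in zip(v1, v2)))
    u1 = [c + half_rel * d for c, d in zip(center, direction)]
    u2 = [c - half_rel * d for c, d in zip(center, direction)]
    return u1, u2


def collision_integrand(v1, v2, direction):
    """The bracketed term of (7): f(u1) f(u2) - f(v1) f(v2)."""
    u1, u2 = elastic_collision(v1, v2, direction)
    return gaussian_f(u1) * gaussian_f(u2) - gaussian_f(v1) * gaussian_f(v2)


rng = random.Random(0)
worst = 0.0
for _ in range(1000):
    v1 = [rng.uniform(-2.0, 2.0) for _ in range(3)]
    v2 = [rng.uniform(-2.0, 2.0) for _ in range(3)]
    value = collision_integrand(v1, v2, random_unit_vector(rng))
    worst = max(worst, abs(value))

print("largest |integrand| over sampled collisions:", worst)
```

Since the integrand vanishes pointwise, the integrals over \(\mathbf{v}_{2}\) and \(\Omega\) vanish as well, whatever the cross section. The check says nothing about the converse (uniqueness), which is what the H-theorem is meant to address.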

The literature on the Boltzmann equation is immense. It has grown large for several reasons. First, it has numerous beneficial applications and uses in modern physics.Footnote 67 You can use it to figure out transport coefficients (hence “transport equation”) for heat conduction, gas interdiffusion, and gas viscosity.Footnote 68 And it is utilized in contemporary physics for the study of neutron transport as well as plasma systems. Second, the equation is not time-reversal invariant,Footnote 69 and the reason why is usually connected to an assumption of the justification of the equation, viz., the HMC or hypothesis of molecular chaos defined and discussed in Sect. 8 below [228]. Third, given several assumptions, including the supposition that the gas system under evaluation is dilute and that its constituents are approximated as hard shells, the Boltzmann equation was derived from the time-reversal invariant equations of motion in classical mechanics by Oscar Lanford III (1940–2013).Footnote 70 There’s some question as to how the irreversibility or asymmetry of the distribution evolution emerges in a way (and this has been demonstrated in [211, 212]) that avoids the reversibility objection of Loschmidt discussed in Sect. 8 below.Footnote 71 I will have more to say about time-reversal invariance and the emergence of irreversibility shortly. For now, let us turn our attention to Boltzmann’s minimum theorem (i.e., the H-theorem).

5 The H-Theorem

My discussion in this section shall pertain to monatomic gases.

With the Boltzmann equation in hand, Boltzmann thought himself properly equipped for proving the uniqueness of the Maxwell distribution. One can already see how \(\frac{\partial f}{\partial t}\) vanishes given that the distribution function is Maxwell’s because that function satisfies the relation: \(f(\mathbf{v}_{1})f(\mathbf{v}_{2})=f(\mathbf{u}_{1})f(\mathbf{u}_{2})\), as I have already noted. Justifying that conditional is not enough to secure uniqueness. One must also show that if \(\frac{\partial f}{\partial t}\) vanishes, then the distribution function must be Maxwell’s. To acquire the needed demonstration, Boltzmann introduced the time-dependent functional H (not to be confused with the Hamiltonian).Footnote 73 He defined that functional in terms of the distribution function f.

(8) $$\mathrm{H}\equiv \int f\,\log f\,d\mathbf{v}$$

On the assumption that the Boltzmann equation is omnitemporally true, and the assumption that the time and velocity dependent function f satisfies the Boltzmann equation, it can be rigorously proven that for any time t, the distribution function f is Maxwellian, just in case, \(\frac{d\mathrm{H}}{dt}=0\). On the same assumptions (i.e., f satisfies the Boltzmann equation and that that equation is omnitemporally true) it can also be proven that,

(9) $$\frac{d\mathrm{H}}{dt}\le 0\quad \text{for any time } t$$

But it will turn out that the relevant proofs make use of the hypothesis of molecular chaos (HMC) discussed and defined in Sect. 8 below. This was not realized by Boltzmann until sometime after his 1872 and 1875 work.

The conjunction of the above results is called Boltzmann’s minimum theorem or H-theorem ([226], p. 965), cf. ([63], pp. 137–140). The quantity H is a monotonically decreasing function in time unless the velocity distribution is the Maxwell distribution. And so, the theorem helps secure the uniqueness of the Maxwell distribution.Footnote 74 As Boltzmann’s 1896 summary of the H-theorem in his Lectures on Gas Theory stated, “[w]e have shown that the quantity we have called H can only decrease, so that the velocity distribution must necessarily approach Maxwell’s more and more closely”.Footnote 75
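The monotonic behavior of H can be illustrated with a toy model. The Python sketch below is not Boltzmann’s derivation. It assumes a spatially homogeneous gas whose molecules can only have one of four velocities in a plane (+x, −x, +y, −y), with a single collision type in which a head-on ±x pair is exchanged for a ±y pair and vice versa. The rate constant `sigma`, the initial populations, the forward-Euler integrator, and all function names are assumptions of this illustration. The discrete analogue of the H-functional is \(\sum_{i}n_{i}\log n_{i}\).

```python
import math


def h_functional(n):
    """Discrete analogue of H: the sum of n_i * log(n_i) over populations."""
    return sum(x * math.log(x) for x in n)


def step(n, sigma, dt):
    """One forward-Euler step of the four-velocity collision model.

    n = (n_plus_x, n_minus_x, n_plus_y, n_minus_y). The x-pair gains
    population at rate sigma * (n3 * n4 - n1 * n2); the y-pair loses it.
    """
    n1, n2, n3, n4 = n
    gain = sigma * (n3 * n4 - n1 * n2)
    return (n1 + dt * gain, n2 + dt * gain, n3 - dt * gain, n4 - dt * gain)


def relax(n0, sigma=1.0, dt=0.01, steps=2000):
    """Integrate the model and record H after every step."""
    n = tuple(n0)
    history = [h_functional(n)]
    for _ in range(steps):
        n = step(n, sigma, dt)
        history.append(h_functional(n))
    return n, history


# Start far from equilibrium: most molecules move along the x-axis.
final_n, h_history = relax((0.4, 0.4, 0.1, 0.1))

print("final populations:", final_n)
print("H at start:", h_history[0], "H at end:", h_history[-1])
```

In this model the populations relax to the state in which \(n_{1}n_{2}=n_{3}n_{4}\), the discrete counterpart of the relation \(f(\mathbf{v}_{1})f(\mathbf{v}_{2})=f(\mathbf{u}_{1})f(\mathbf{u}_{2})\), and the recorded values of H never increase along the way. The collision term itself already builds in the factorized (product) form of the collision rate, so the sketch presupposes an analogue of the HMC and does not justify it.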

The preceding discussion pertains to Boltzmann’s H-theorem for monatomic gases. In his 1872 work, he also tried to prove an H-theorem for polyatomic gases (see also (Boltzmann, On the Thermal Equilibrium of Gases on Which External Forces Act [23])). That proof did not fare well, as Lorentz found a problem with Boltzmann’s derivation.Footnote 76 Lorentz notes that part of Boltzmann’s derivation of the Boltzmann equation and the H-theorem is a commitment to the existence of reciprocal collisions. Boltzmann appears to assume that if there exists a collision \(\left[A,B\right]\to [{A}^{\prime},{B}^{\prime}]\), then there exists an inverse collision (following Lorentz’s way of characterizing sets of velocities of colliding molecules) that is \(\left[{A}^{\prime},{B}^{\prime}\right]\to [A,B]\). Lorentz proves that this assumption is false for polyatomic molecules that are non-spherical.Footnote 77 He then provides a simplified version of Boltzmann's [21] proof of the H-theorem for monatomic gases. This streamlined proof is later used by Boltzmann in both his Lectures on Gas Theory [35], and his 1887 response to Peter Guthrie Tait (1831–1901) entitled Über einige Fragen der Kinetischen Gastheorie (On Some Questions about Kinetic Gas Theory). Moreover, modern textbooks often choose to follow Lorentz’s proof for the monatomic case when presenting the derivation of the H-theorem for pedagogical purposes ([88], pp. 328–329, [152], p. 599, nn. 38–39).

Boltzmann would graciously accept Lorentz’s criticism and provide a follow-up proof that made use of cycles of collisions as that which drives H-decrease in the polyatomic cases (q.v., n. 76). His maneuver is both unrealistic and embraced by no one, save Lorentz.Footnote 78

So, Boltzmann’s attempts at proving an H-theorem for polyatomic gas types had problems. Not even his attempted demonstrations of the H-theorem for the monatomic cases are wholly without problems. A decisive and rigorous proof for the monatomic gas type would have to wait until the 1933 and 1957 work of Torsten Carleman (1892–1949).Footnote 79 In addition, Carlo Cercignani (1939–2010) taught us that that demonstration has a cousin yielding an H-theorem for polyatomic gases ([64], p. 96). Both Cercignani and Darrigol have proven an H-theorem for polyatomic gas types (ibid., 287–290; [88], pp. 493–496).Footnote 80 Thus, for both monatomic and polyatomic gas types, we have an H-theorem. How should we interpret it?

5.1 Interpreting the H-Theorem: Collisions and Causation

Boltzmann’s proposed mechanical explanations of the second law of thermodynamics characterize systems of colliding gas molecules as systems whose constituents causally interact. There are four reasons why one should accept this interpretation. First, mechanical explanations of natural phenomena for physicists such as Clausius, Maxwell, and Boltzmann are part of (to quote Christiaan Huygens’s (1629–1695) characterization of the mechanical approach, a characterization alive and well during the nineteenth century) “the true Philosophy, in which one conceives the causes of all natural effects in terms of mechanical motions”.Footnote 81 In other words, a mechanical explanation just is one involving a report on causes that are mechanical motions inter alia (qq.v., n. 30, n. 31, and n. 33). In Sect. 2, I detailed how this approach to mechanical explanation shows up in the work of Clausius and Maxwell. As I shall demonstrate in Sect. 6, Boltzmann’s H-theorem is part of his attempt to mechanically explain the second law. It is therefore highly likely that by Boltzmann’s lights, the type of explanation of entropic increase the H-theorem offers is a causal explanation.

Second, in part one of his Lectures on Gas Theory, Boltzmann presents a Lorentz-inspired derivation of the H-theorem. In his discussion of value changes of the H-functional, Boltzmann reports that changes in H over a small period of time are “due to…causes” later noting that the changes result from collisions ([35], p. 50, and see also 51–52). This suggests that for Boltzmann, the process of entropic increase is a causal process and that collisions for Boltzmann are causal phenomena.

Third, although Boltzmann seems dismissive of metaphysics (he calls metaphysics a “spiritual migraine”Footnote 82) and whilst he views physical hypotheses as pictures or images of the world that are not directly corresponding truths about it, Boltzmann does consistently interpret all forces (and so those forces at work in collisions) causally ([38], p. 54). That is to say, he believes that the image of the world supplied by physics depicts the world as a place endowed with causal forces. Commenting on Heinrich Hertz’s (1857–1894) 1887 discovery of a form of electromagnetic radiation (i.e., radio waves), Boltzmann causally interpreted the action of the electromagnetic field. He remarked, “…electric and magnetic forces do not act directly at a distance but are caused by changes of state that are propagated from one volume element to the next at the speed of light” ([38], p. 84). At the May 29th, 1886 meeting of the Imperial Academy of Science, and so well before gravitation would be reduced to spacetime curvature by Einstein in 1915, Boltzmann causally interpreted the gravitational force ([37], p. 17). At the same event, Boltzmann provided a causal characterization of pressure. He said that the molecules involved in thermodynamic systems impinge (or strike) “now more now less strongly, now head on now at an angle” maintaining that when the pressure produced by these impinging molecules is at a point “bigger…we shall at once look for an external cause that moves the molecules to flow preferentially to that point” (ibid., 20).

Some of the strongest evidence for my interpretation of Boltzmann comes from his 1899 Clark University lectures. In them Boltzmann describes the evolution of a gravitating system and says in that context that in general “the cause of motion…we call force”, concluding “that at least in this special case acceleration is the decisive feature of force…namely gravity” ([37], pp. 127–128). Boltzmann added that:

Kirchhoff rejected the notion that it was the task of science to unravel the true nature of phenomena and to state their first and fundamental metaphysical causes. On the contrary he confined the task of natural science to describing phenomena, a stipulation that he still called a restriction.Footnote 83

Boltzmann would connect Kirchhoff’s view to Hertz’s in his 1899 Munich lecture: “nobody has yet pointed out that a certain idea [apparently the same idea articulated in the lines just quoted] in Kirchhoff’s mechanics if followed to its logical conclusion leads directly to Hertz’s ideas” ([37], p. 89). Boltzmann would both reject Kirchhoff’s view ([38], p. 78) and distinguish his understanding of mechanics from that of Hertz by noting that his approach holds on to causal forces, while Hertz’s 1899 Principles of Mechanics in a New Form abandons them completely.Footnote 84 He said that “difficulties arise” for Hertz’s approach “as soon as one wants to represent the most ordinary processes of daily experience involving the action of force”.Footnote 85 Again, forces are at work in the collisions referenced by the Boltzmann equation. Therefore, according to Boltzmann, so too is causation.

Fourth, in some of Boltzmann’s notes on natural philosophy put together for a lecture to be given on November 23rd, 1904, Boltzmann said, “[i]t is just its own bad luck that changes in velocity must have a cause”, subsequently committing to a view about the relata of causation, i.e., that “[a] thing cannot be the cause of a thing, but merely of change”.Footnote 86 Colliding things produce velocity changes, according to the Boltzmann equation and H-theorem, and so these remarks support my reading. Boltzmann believes that the mechanism of velocity change in the process of entropic increase is a causal mechanism.

We can safely conclude that there is good evidence, drawn from Boltzmann’s Lectures on Gas Theory, his Lectures on Mechanics, his personal lecture notes, and his public lectures, that Boltzmann endorsed a causal approach to mechanistically explaining the second law.

5.2 Interpreting the H-Theorem: Applications and Exceptions

As early as his work in 1872 and 1875, Boltzmann recognized that there could be gas systems that have unique initial conditions such that they do not evolve to the Maxwell distribution. This is because such special systems start out precluding certain velocity and/or position values otherwise consistent with the conservation laws and energy totals. He conjectured that perhaps some constraints on very special systems keep their constituents from realizing all possible values consistent with those laws and totals. As Boltzmann himself put matters when discussing a gas confined to a container, “it is possible that only certain, and not all possible positions and velocities can occur in the course of time (e.g., if they were all initially in a line perpendicular to the vessel walls)”.Footnote 87 It is an assumption of the velocity/energy distribution approach of Maxwell and Boltzmann that every nomologically possible velocity be realizable by gas constituents. I believe Boltzmann was keen enough to realize the connection between special velocity precluding initial conditions and H-theorem inapplicability. My opinion is that as Boltzmann developed a mechanistic explanation of the second law, he knew of possible systems to which the H-theorem could not be applied. My reading is most, and not too, charitable. It entails that it did not take the articulation of Loschmidt’s reversibility paradox for Boltzmann to come to the realization that some monatomic gas systems escaped H-theorem application.Footnote 88 The principle of charity is not all that can be said for the proposed interpretation. It explains why (to quote Cercignani) Boltzmann, “when answering” Loschmidt’s paradox (discussed in Sect. 8 below):

did not indicate that he had changed his viewpoint, or that he had deepened his understanding of the subject, as a consequence of the reflections caused by the [reversibility] objection that had been raised against him, but acted as if he were simply re-elaborating his old ideas.Footnote 89

The fact that such a report is correct has perplexed Boltzmann scholars.Footnote 90 There exists a challenge to render that report consistent with Boltzmann’s judgment that deriving the H-theorem amounts to rigorously proving “that whatever the distribution of live force [kinetic energy] may have been at the beginning [initial time], subsequent to a very long time period it must always approach that [one] found by Maxwell”.Footnote 91 The best way to introduce coherence and consistency here is to insist that even in 1872 and 1875 Boltzmann was aware of systems that did not approach Maxwell’s distribution on account of the unique initial conditions they enjoyed (agreeing in part with [7, 235], though I am not claiming that Boltzmann’s H-theorem project was always statistical in the sense that at least Badino seems to have in mind). The necessity of approaching the Maxwell distribution rests upon the assumptions and antecedent of the H-theorem. As I’ve said several times now, one of the relevant assumptions is that the Boltzmann equation concerns all nomologically possible velocity and position values, i.e., all those that satisfy the laws of conservation ([88], p. 173, although I disagree with Darrigol’s presumption at n. 37). The cases that admit exceptions to the general claim that H always decreases, or that minus-H (where minus-H is proportional to entropy) always increases, are cases that prohibit some velocity and position values.

5.2.1 Maxwell’s (But Really Thomson’s) Demon and Loschmidt’s Exorcism

My reading is controversial. Let me add further lines of support. Boltzmann’s proof of the H-theorem appears after the articulation of the Maxwell “demon”Footnote 92 case, a case well-known for the trouble it produces for any non-statistical and exceptionless statement of the second law. Maxwell discussed it for the first time in a letter to Tait, dated December 11th, 1867,Footnote 93 restating it in several places including the appendix to his 1871 book Theory of Heat. It is there that he supposes that there’s a container filled with air that possesses uniform pressure and temperature (the system is in equilibrium). The container is divided into two sides. The two sides of the container are labeled A and B, and the division between them is wrought by a diaphragm with a large hole in it. Over the hole is a sliding plate with very small mass that is controlled by “a being whose senses are so acute that he can see every molecule of the air, at least when it is near the hole”.Footnote 94 Maxwell says the being follows the command: Open the plate over the hole when a molecule possessing “more than the mean velocity” in compartment A moves near the hole (n. 94). This allows for faster molecules to move into compartment B. The plate is to remain closed for all other molecules, with one exception: when a slower (i.e., slower than the mean velocity) molecule in compartment B draws near the opening, the plate is to be opened, allowing that molecule to pass from B to A. Maxwell infers that compartment B will begin to enjoy an increase in the mean velocities of its inhabitant molecules, while compartment A will enjoy a decrease of mean velocities. These changes all obtain without the expenditure of work. As a result, “the [non-statistical and exceptionless] second law of thermodynamics is no longer true”,Footnote 95 and “[t]he 2nd law of Thermodynamics has the same degree of truth as the statement that if you throw a tumblerful of water into the sea you cannot get the same water out again”.Footnote 96
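The sorting rule Maxwell describes can be illustrated with a toy simulation. What follows is only a sketch under loud simplifying assumptions of my own: molecules are represented by speeds alone (no positions, collisions, or dynamics), both compartments start with speeds drawn from the same distribution, and the name `run_demon` and its parameters are hypothetical.

```python
import random
import statistics


def run_demon(n_per_side=2000, attempts=20000, seed=0):
    """Toy version of the demon's rule; returns mean speeds in A and B."""
    rng = random.Random(seed)
    # Assumption: both sides start in the same (equilibrium-like) state.
    a = [abs(rng.gauss(0.0, 1.0)) for _ in range(n_per_side)]
    b = [abs(rng.gauss(0.0, 1.0)) for _ in range(n_per_side)]
    mean_speed = statistics.fmean(a + b)  # the "mean velocity" threshold
    for _ in range(attempts):
        # A randomly chosen molecule approaches the hole from a random side.
        if rng.random() < 0.5:
            i = rng.randrange(len(a))
            if a[i] > mean_speed:   # faster than the mean, in A: let it into B
                b.append(a.pop(i))
        else:
            i = rng.randrange(len(b))
            if b[i] < mean_speed:   # slower than the mean, in B: let it into A
                a.append(b.pop(i))
    return statistics.fmean(a), statistics.fmean(b)


mean_a, mean_b = run_demon()
print(f"mean speed in A: {mean_a:.3f}, mean speed in B: {mean_b:.3f}")
```

No molecule's speed is ever altered in this sketch; the plate only selects which molecules pass. That is the sense in which the difference between the compartments arises without the expenditure of work.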

Great. There were reasons to abandon a non-statistical and exceptionless statement of the second law before Boltzmann’s proof of the H-theorem. But why should one think that Boltzmann was aware of those reasons when he tried to prove the H-theorem? Loschmidt articulated a Maxwell-“demon”-like case (without a demon) in 1869.Footnote 97 Boltzmann would have known of Loschmidt’s version of the case since they were colleagues (q.v., n. 97). Indeed, Boltzmann responds to Loschmidt, providing a version of the case that resembles Maxwell’s. He wrote,

When for instance a gas at constant temperature is divided into two halves by a separating wall with a small hole on it, it would be possible to bring in front of the hole a contraption that guides the faster molecules preferably into one half and the slower ones preferably into the other half, which would contradict the second law.Footnote 98

5.2.2 The Reversibility Objection Before Loschmidt

I’ve argued that Boltzmann showed an awareness of the real possibility of the existence of gas systems to which the H-theorem could not be applied. My evidence for this resides beside the articulation of Boltzmann’s original proofs of the H-theorem in his 1875 work. My second reason for maintaining that Boltzmann knew of failures of H-theorem application to even monatomic gas systems had to do with his awareness of a Maxwell “demon” case prior to 1872. I now add that the reversibility objection (q.v., Sect. 8 below) was articulated and probably known to Boltzmann before his proof of the H-theorem. This fact constitutes my third justification for believing that Boltzmann knew of H-theorem inapplicable systems before at least 1875. Notice that all three justifications support the further claim that Boltzmann was aware of nomologically possible violations of the non-statistical version of the second law prior to the 1875 publication of the H-theorem.

The already referenced 1867 letter from Maxwell to Tait included a penciled annotation by Thomson which read, “Very good. Another way [to violate the non-statistical second law of thermodynamics] is to reverse the motion of every particle of the Universe and to preside over the unstable motion thus produced”.Footnote 99 The next year Maxwell said there was an apparent conflict between the existence of irreversible processes and the reversibility of all motion.Footnote 100 Two years later, Maxwell would write to Strutt arguing that reversing all of the motions (i.e., velocities) of every particle in the universe would “upset the 2nd law of Thermodynamics”.Footnote 101

Of course, Boltzmann was quite far removed from Scotland and probably did not have access to the correspondence of Maxwell, Tait, or Thomson. Moreover, I can find no evidence that he was aware of the reversibility objection before the publication of [21]. However, there is evidence that Boltzmann knew the work of Thomson, taking it seriously enough to cite it. For example, Boltzmann cited Thomson’s work with Tait on the principle of least action in hydrodynamics at (Boltzmann, “On the Compressive Forces” [19]). In [22], Boltzmann cites Thomson once again, although this time on work related to electricity. We also know that much later, Boltzmann corresponded with Thomson in 1892 and 1893, answering several of Thomson’s objections to Boltzmann’s kinetic theory from energy dissipation.Footnote 102 I therefore think it is likely that Boltzmann read Thomson’s reversibility objection in Nature, published in 1874. We know Boltzmann read Nature because Boltzmann published in it several times.Footnote 103 Why is this important? Because (again) in 1875, Boltzmann tried to prove his H-theorem in much the same way he did in 1872. However, this time, he’d seek to make his earlier work known to a broader audience ([88], p. 172). His remarks regarding the non-statistical nature of the second law as viewed through the lens of the H-theorem are univocal:

[We] have so far proceeded as follows: [we] have shown that the quantity H cannot increase during the evolution of the state of the gas; wherefrom [we] have concluded that it must be constant in the case of equilibrium since it evidently cannot constantly decrease in this case. I was thus able to derive the definitive equations that lead to the equilibrium distribution of states. This suggests that the value that H takes in the case of equilibrium is the smallest of all the values that H can take in agreement with the conservation of the total number of atoms and the conservation of the total live force.Footnote 104

If Boltzmann was aware of the reversibility objection before 1875, why would he assert the above if reversibility worries should cause him to abandon deterministic or non-statistical statements of minus-H increase over time? I believe the best charitable response to this query should not argue that Boltzmann had a statistical interpretation of the H-theorem all along (contra [7], q.v., my arguments against this in Sect. 5.3 below). Rather, the correct response suggests instead that Boltzmann believed that while the deterministic or non-statistical statement of the second law of thermodynamics admits exceptions, at least some of those exceptions (if not all of them) have to do with systems to which the H-theorem cannot be applied. However, if the H-theorem does apply to a system out of equilibrium, minus-H must increase monotonically for all time until equilibrium is obtained. There it will remain given that the Boltzmann equation holds for all time (anticipating here worries about fluctuations and recurrence about which I will say more in a part two essay). You see, proof of the H-theorem amounts to proof of a deterministic and non-statistical second law suitably restricted to systems that satisfy the antecedent of the H-theorem. Of course, my reading assumes that Boltzmann strongly associates H (or minus-H) with entropy. Let me now say more about that association in Boltzmann’s corpus.
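The restricted deterministic claim, that H never increases for systems satisfying the antecedent of the H-theorem, can be illustrated numerically. The sketch below rests on assumptions that are mine and not Boltzmann's: it uses the standard form \(H=\int f\ln f\,dv\) in one velocity dimension, units in which molecular mass and Boltzmann's constant equal 1, and a BGK-type relaxation toward the Maxwellian with the same density, mean velocity, and energy in place of Boltzmann's own collision integral. The function names are hypothetical.

```python
import math


def h_functional(f, dv):
    """Discretized H = sum of f*ln(f)*dv; cells with f == 0 contribute 0."""
    return sum(x * math.log(x) for x in f if x > 0.0) * dv


def relax_h(f0, v, times, tau=1.0):
    """H(t) along f(t) = M + (f0 - M)*exp(-t/tau), M the matching Maxwellian."""
    dv = v[1] - v[0]
    n = sum(f0) * dv                                              # density
    u = sum(f * x for f, x in zip(f0, v)) * dv / n                # mean velocity
    temp = sum(f * (x - u) ** 2 for f, x in zip(f0, v)) * dv / n  # kT/m
    norm = n / math.sqrt(2.0 * math.pi * temp)
    maxwell = [norm * math.exp(-(x - u) ** 2 / (2.0 * temp)) for x in v]
    h_values = []
    for t in times:
        s = math.exp(-t / tau)
        f_t = [m + (f - m) * s for f, m in zip(f0, maxwell)]
        h_values.append(h_functional(f_t, dv))
    return h_values


# Initial state far from equilibrium: two counter-propagating beams.
v = [-15.0 + 0.01 * i for i in range(3001)]
f0 = [math.exp(-(x - 2.0) ** 2 / 0.5) + math.exp(-(x + 2.0) ** 2 / 0.5) for x in v]
times = [0.1 * i for i in range(81)]
h_values = relax_h(f0, v, times)
print(h_values[0], h_values[-1])
```

Within this model H falls at every step and levels off as the distribution approaches the Maxwellian. The sketch says nothing about the special initial conditions discussed above, which fall outside the antecedent of the theorem.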

5.3 Interpreting the H-Theorem: \(-\mathbf{H}\) and Entropy

Boltzmann’s 1872 interpretation of the H-theorem is given in the following often-quoted passage which I will call PERICOPE,

It has thus been rigorously proved that, whatever may be the initial distribution of kinetic energy, in the course of a very long time it must always necessarily approach the one found by Maxwell [notice that he’s speaking here in terms of energy changes]. The procedure used so far is of course nothing more than a mathematical artifice employed in order to give a rigorous proof of a theorem whose exact proof has not previously been found. It gains meaning by its applicability to the theory of polyatomic gas molecules. There one can again prove that a certain quantity [H] can only decrease as a consequence of molecular motion, or in a limiting case can remain constant. One can also prove that for the atomic motion of a system of arbitrarily many material points there always exists a certain quantity which, in consequence of any atomic motion, cannot increase, and this quantity agrees up to a constant factor with the value found for the well-known integral \(\int\frac{dQ}{T}=0\) in my paper on the ‘Analytical proof of the 2nd law, etc.’. We have therefore prepared the way for an analytical proof of the second law in a completely different way from those previously investigated. Up to now the object has been to show that \(\int\frac{dQ}{T}=0\) for reversible cyclic processes, but it has not been proved analytically that this quantity is always negative for irreversible processes, which are the only ones that occur in nature. The reversible cyclic process is only an ideal, which one can more or less closely approach but never completely attain. Here, however, we have succeeded in showing that \(\int\frac{dQ}{T}\) is in general negative, and is equal to zero only for the limiting case, which is of course the reversible cyclic process (since if one can go through the process in either direction, \(\int\frac{dQ}{T}\) cannot be negative).Footnote 105, Footnote 106

PERICOPE suggests that minus-H is entropy, and that H is equal to minus entropy.Footnote 107 It implies that the H-theorem is Boltzmann’s attempt to ground the second law of thermodynamics in mechanics or microdynamics.Footnote 108 But why does Boltzmann use minus-H and \(\int \frac{\delta Q}{T}\) (which I will call the Clausius integral) interchangeably? The question is perplexing because the entropy of Boltzmann's [21] and 1875 work is that of a closed system, while the Clausius integral has to do with heat Q being exchanged with a system at absolute temperature T. To properly answer this question, we must first understand Rudolf Clausius’s theory of entropy.

6 Clausius Entropy

6.1 The Clausius Integral and Entropy

Clausius associated the entropy of physical systems, at least in certain contexts, with the Clausius integral in his 1865 memoir,Footnote 109 where the differential \(\delta Q\) is a differential form representative of exchanged heat or infinitesimal heat transfers into the system as part of a reversible process. The source of the heat is some external system enjoying absolute temperature T. For reversible cyclic evolutions, we have:

$$\oint \frac{\delta Q}{T}=0 \tag{10}$$

\(\frac{\delta Q}{T}\) is therefore an exact differential. Clausius likewise tried to capture what he understood to be irreversible processes involving systems that evolve from one equilibrium state to another equilibrium state in a manner that could be reversed by means of a reversible transformation. While the relevant transformations in the cycle obtain, the inequality given by (11) holds true:

$$\int \frac{\delta Q}{T}<0 \tag{11}$$

The total entropy of the global system, which includes the external sources for the needed heat exchange, was said to increase (hence, an irreversible process). The entropic increase on account of a reversible transition of a system from one equilibrium state S1 to a distinct equilibrium state S2 is less than the entropy increase on account of an irreversible transition from one equilibrium state S3 to a distinct equilibrium state S4.Footnote 110 In ([79], p. 5), Clausius would add (12) below understanding the conjunction of it with (10) as an expression of the two fundamental equations of thermodynamics.
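The step from the inequality (11) to a statement about entropy differences can be spelled out. This is the standard textbook reconstruction, not a quotation from Clausius, and the labels irr and rev are mine. Consider a cycle made up of an irreversible transition from equilibrium state S1 to equilibrium state S2 followed by a reversible return from S2 to S1. Applying (11) to the whole cycle gives

$$\int_{\mathrm{S1}\,(\mathrm{irr})}^{\mathrm{S2}}\frac{\delta Q}{T}+\int_{\mathrm{S2}\,(\mathrm{rev})}^{\mathrm{S1}}\frac{\delta Q}{T}<0$$

Because entropy is a state function whose differences are fixed by reversible paths, the second integral equals \(S(\mathrm{S1})-S(\mathrm{S2})\), and so

$$S(\mathrm{S2})-S(\mathrm{S1})>\int_{\mathrm{S1}\,(\mathrm{irr})}^{\mathrm{S2}}\frac{\delta Q}{T}$$

For a system that exchanges no heat with anything else, \(\delta Q=0\) and the right-hand side vanishes, so the entropy of such a system can only increase in an irreversible transition.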

$$\delta Q=dU+\delta W \tag{12}$$

where U is energyFootnote 111 and \(\delta W\) represents the infinitesimal process of work done while heat is transferred.Footnote 112 Q and W are here path dependent quantities, while U, like entropy S, is a path independent state function or property. Clausius would also add that,

$$\frac{\delta Q}{T}=\frac{dH}{T}+dZ \tag{13}$$

where H and Z are both state functions (the latter by stipulation really), H being the heat of the body/system although he would famously reduce that notion to vis viva (kinetic energy) thereby reducing it to motion,Footnote 113 and Z being disgregation. More on these two quantities soon. Reflect, for now, on the fact that because (13) holds, and because Clausius was strongly associating entropy with \(\frac{\delta Q}{T}\), in 1865, Clausius affirmed that:

$$dS=\frac{dH}{T}+dZ \tag{14}$$

And because (13) holds, we have shown precisely how Clausius strongly associates entropy with heat exchange. Indeed, one popular way of characterizing Clausius’s understanding of the second law is as follows:

Heat cannot pass spontaneously from a body of lower temperature to a body of higher temperature.Footnote 114

Clausius himself remarked under a section entitled “New Fundamental Principle concerning Heat”, “Heat cannot, of itself, pass from a colder to a hotter body,”Footnote 115 and as early as 1854, Clausius declared that he had found an analytic “expression of the second law” for reversible cyclic processes in Eq. (10).Footnote 116 These equations and remarks from Clausius flirt with Maxwell’s view of entropy and heat exchange in his Theory of Heat. Maxwell said, “when there is no communication of heat this quantity [entropy] remains constant, but when heat enters or leaves the body the quantity increases or diminishes”.Footnote 117 I believe it is a mistake to attribute Maxwell’s position to Clausius—Clausius’s strong association of entropy with the relevant integral notwithstanding—for two reasons. First, in 1865, Clausius’s statement of the second law referred to the entropy of the universe. The universe, however, is a closed isolated system that does not exchange heat with some other body. Moreover, he says that the entropy of the cosmos increases to a maximum. Not all subsystems of an entropy increasing cosmos such as ours will be in equilibrium. Thus, one cannot explain the increasing entropy of our cosmos by appeal to equilibrium subsystems exchanging heat with their environments.

Second, Maxwell’s statement was experimentally falsified by the Gay–Lussac–Joule experiment,Footnote 118 and there are good reasons to believe Clausius was aware of the relevant experimentation, for Clausius was familiar with the work of Joseph Louis Gay-Lussac (1778–1850), having discussed his work several times in The Mechanical Theory of Heat. He was also familiar with Joule’s experimentation, citing his results authoritatively throughout the same work.
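A short calculation indicates why free expansion bears on Maxwell's statement. It is a sketch under the assumption, mine and not the source's, that the gas is ideal: in a free expansion into a vacuum no heat enters or leaves and no work is done, yet the entropy difference between the end states, evaluated along a reversible isothermal path connecting them, is positive. The function name is hypothetical.

```python
import math

R = 8.314462618  # molar gas constant in J/(mol K)


def free_expansion_entropy_change(n_moles, v_initial, v_final):
    """Entropy change of an ideal gas in free expansion (Q = 0, W = 0).

    Computed along a reversible isothermal path between the same end
    states: delta_S = n * R * ln(V_final / V_initial).
    """
    return n_moles * R * math.log(v_final / v_initial)


# One mole doubling its volume: no heat is communicated, yet S increases.
delta_s = free_expansion_entropy_change(1.0, 1.0, 2.0)
print(f"delta S = {delta_s:.4f} J/K")
```

The entropy of the gas rises although there is "no communication of heat", which is the kind of case the quoted statement does not accommodate.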

6.2 The Complete Nature of Clausius Entropy

What then is the complete and fully general nature of entropy according to Clausius? He clearly had some more general conception, for again, in one place he characterizes the second law of thermodynamics as the principle that “[t]he entropy of the universe tends to a maximum”,Footnote 119 christening it (i.e., the law itself) with fundamental status.Footnote 120

Clausius’s seminal work on thermodynamics was his 1864 Abhandlungen über die mechanische Wärmetheorie (Treatises on the Mechanical Theory of Heat) which was later (1876) to become a more “connected whole” under the title The Mechanical Theory of Heat (I have worked and will work with the 1879 English translation).Footnote 121 In that work, the entropy of a thermodynamic system is not as fundamental as that system’s energy U suitably understood ([84], p. 195). I include the qualification because in the 1867 (first English edition) of The Mechanical Theory of Heat [80], Clausius agreed with William Rankine (1820–1872), maintaining that small changes of energy are given by the sum of small changes to H and small changes to internal work (I) due to internal molecular forces, and so, \(dU=dH+dI\).Footnote 122 In the improved and transmuted 1879 edition, U is “[t]he sum of the Vis Viva and of the Ergal…”,Footnote 123 but Ergal (J in Clausius’s work) is just potential energy (first coined by Rankine).Footnote 124 The type of energy involved here appears to be both kinetic and potential energy. And this notion of energy (U) is one and the same as that in the work of Thomson. Clausius related his conception to Thomson’s as early as 1866 ([79], p. 5).

More should be said because in the 1867 edition of The Mechanical Theory of Heat, and the original work behind it, there exists a more fundamental quantity lurking beneath entropy, viz., disgregation. Disgregation (Z in Clausius’s equations) is a quantity increased by heat. In fact, for Clausius, heat causally produces work as in [61] and it does this by increasing disgregation.Footnote 125 Disgregation itself is “the degree in which the molecules of a body are dispersed”.Footnote 126 In this earlier edition of Clausius’s major work, entropy, or infinitesimal changes thereof, was/were specified by appeal to, inter alia, disgregation or infinitesimal changes thereof. We already expressed the mathematics that encodes these facts via Eq. (14). As I’ve already said, the quantity H in that equation is the heat in the system, but by this Clausius meant vis viva or kinetic energy of molecular motion, as Maxwell ([176], p. 258) pointed out criticizing Tait’s misreading of Clausius.Footnote 127 This treatment (i.e., (14)) of entropy was lambasted by both Maxwell and Tait, the former providing the clearer and more measured response of the two.Footnote 128 As I already foreshadowed, Clausius removes disgregation from the later 1879 edition of The Mechanical Theory of Heat. But even if we keep Z in Clausius’s framework, the principal cause of the increase of disgregation is molecular motion responsible for increasing dispersion, and that motion can be understood in terms of kinetic energy. Thus, U (which for Clausius you’ll recall is Vis Viva and Ergal) resides beneath disgregation (i.e., it is more fundamental than disgregation) which is related to entropy in the way (14) suggests.

Having discovered Clausius’s mature view on the status of entropy, energy, and disgregation in the hierarchy of being, we should now answer the question: What, according to Clausius, is entropy? Entropy is that property of physical systems that tracks (or serves as a measure of) the processes of energy transformation, often, though not always, associated with heat exchange ([58], p. 273). He wrote,

We might call ‘S’ the transformational content of the body, just as we termed the magnitude ‘U’ the thermal and ergonal content. But as I hold it better to borrow terms for important magnitudes from the ancient languages so that they may be adopted unchanged in all modern languages, I propose to call the magnitude S, the entropy of the body, from the Greek word \(\tau \rho o\pi \eta\), transformation.Footnote 129

For Clausius, there “is a natural bias in the distribution of energy and in the direction which energy changes tend to take. Entropy gives us a measure of this bias in the case of material bodies or systems of bodies”.Footnote 130

6.3 Boltzmann and the Clausius Entropy

There can be no doubt that Boltzmann knew Clausius’s work on entropy, and that his understanding of entropy was, in some places, one and the same as that of Clausius. For example, Boltzmann’s second published paper was “On the Mechanical Significance [Meaning] of the Second Law of Heat Theory”.Footnote 131 There, perhaps under the influence of [200], Boltzmann tries to provide an extension of Clausius’s earlier conception of entropy to systems that feature molecules that enjoy periodic motions assuming all the while many aspects of Clausius’s early kinetic theory of gases. That it is Clausius’s concept of entropy that Boltzmann is extending is well justified by the fact that in 1871, Clausius would provide the same extension of his (i.e., Clausius’s) concept to periodic motions [81], realizing after Boltzmann’s rebuke, that Boltzmann’s extension or generalization of Clausius’s concept came before Clausius’s own generalization in [82].Footnote 132 Furthermore, \(dS=\frac{\delta Q}{T}\) holds in both of the aforementioned 1871 papers by Boltzmann and Clausius. Of course, this is the case in Clausius’s earlier work too.Footnote 133

Later review and correspondence included at least two concessions by Boltzmann. Clausius in [81] accomplished something that Boltzmann in [15] failed to. Clausius provided the more accurate mathematical characterization of entropy, and [15] had ignored changes in the potential.Footnote 134 These concessions suggest that Boltzmann took Clausius’s work on entropy quite seriously. But one might now ask:

(Maxwell’s Question): Why would Boltzmann worry about priority in this context [Boltzmann rebuked Clausius for reproducing Boltzmann’s earlier work] when Boltzmann had already begun exploring the Maxwellian distribution-based approach to thermodynamics and statistical mechanics?

I call this Maxwell’s question because in an 1873 letter to Tait, Maxwell would ponder a similar query, expressing his astonishment about the continued interest in mechanical approaches to the second law. He wrote,

It is a rare sport to see those learned Germans contending for the priority of the discovery that the 2nd law of [thermodynamics] is the [Hamilton Principle]… the [Hamilton Principle], the while, soars along in a region unvexed by statistical considerations, while the German Icari flap their waxen wings in nephelococcygia [i.e., cuckoo land] amid those cloudy forms which the ignorance and finitude of human science have invested with the incommunicable attributes of the Queen of Heaven.Footnote 135

The answer to Maxwell’s question is simple. Boltzmann cares about mechanical justifications of the second law and regards his H-theorem as a means whereby he can attain just such a mechanical justification. That is why in PERICOPE he speaks, albeit rather imprecisely and somewhat clumsily, of entropy as if it is represented by the Clausius integral. What Boltzmann is actually attempting to do is connect minus-H with Clausius’s energy transformational notion of entropy, the type of entropy which Clausius believed had a mechanical explanation. Boltzmann’s H-theorem should be viewed as the fulfillment of Clausius’s vision. The H-theorem does in fact provide a mechanical explanation of how energy tends to transform and, more derivatively, how entropy tends toward a maximum.Footnote 136

That Boltzmann is borrowing Clausius’s transformational conception of entropy in his work on the H-theorem is a conclusion other scholars have reached. Olivier Darrigol and Jürgen Renn state,

Boltzmann…noted that the value of \(-\mathrm{H}\) corresponding to Maxwell’s distribution was identical to Clausius’s entropy. For other distributions, he proposed to regard this function as an extension of the entropy concept to states out of equilibrium, since it was an ever-increasing function of time.Footnote 137
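The first claim in this passage can be checked in a few lines. The following is a sketch under assumptions not stated in the quotation: the three-dimensional form \(H=\int f\ln f\,d^{3}v\), a spatially uniform monatomic ideal gas of N molecules of mass m in volume V, and Boltzmann's constant k as the proportionality factor. For the Maxwell distribution

$$f_{M}(\mathbf{v})=\frac{N}{V}\left(\frac{m}{2\pi kT}\right)^{3/2}\exp\left(-\frac{m|\mathbf{v}|^{2}}{2kT}\right)$$

direct integration, using the fact that the mean kinetic energy per molecule is \(\frac{3}{2}kT\), gives

$$H[f_{M}]=\frac{N}{V}\left[\ln\frac{N}{V}+\frac{3}{2}\ln\frac{m}{2\pi kT}-\frac{3}{2}\right]$$

so that

$$-kVH[f_{M}]=Nk\ln V+\frac{3}{2}Nk\ln T+C(N,m)$$

where \(C(N,m)\) collects terms independent of V and T. On the thermodynamic side, \(dS=\frac{\delta Q}{T}\) together with (12), \(U=\frac{3}{2}NkT\), and \(\delta W=p\,dV\) with \(pV=NkT\) yields

$$dS=\frac{3}{2}Nk\frac{dT}{T}+Nk\frac{dV}{V},\qquad S=\frac{3}{2}Nk\ln T+Nk\ln V+\text{const}$$

The two expressions agree in their dependence on T and V, which is the sense in which the equilibrium value of \(-\mathrm{H}\) coincides, up to a constant factor and an additive constant, with the Clausius entropy.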

But I go further than Darrigol and Renn because there is an additional inference to make. Because the H-theorem has to do with Clausius entropy, and Clausius entropy is defined in terms of energy, tracking how energy transforms over time, we can agree with noted historian Stephen G. Brush when he says that “the H-theorem is a microscopic version of the general principle of dissipation of energy proposed by Kelvin in 1852, and reformulated by Clausius in 1865 in the phrase, ‘the entropy of the universe tends to a maximum.’”Footnote 138 In other words, we can affirm that according to both Clausius’s and Boltzmann’s deep conception of entropy, entropy is that quantity which tracks the way energy is changing or transforming over time. This deep conception of entropy understood as a quantity that tracks energy can be used to describe thermodynamic systems both in and out of equilibrium. The H-theorem not only helps facilitate such descriptions; it also helps provide a mechanistic explanation of entropic increase and stability after equilibrium is reached.

Understanding Boltzmann’s H-theorem as a theorem about a type of entropy that tracks transformations of energy is not new. Glimmers of that interpretation of Boltzmann appear in Edward P. Culverwell’s (1855–1931) 1890 article on Boltzmann’s kinetic theory. Culverwell explicates Boltzmann’s characterization of a gas in equilibrium as a system enjoying a status which entails that the gas’s “energy is equally distributed among all…[its] degrees of freedom”.Footnote 139 Moreover, this precise entropy-energy connection is used in the contemporary practice of thermodynamics. As Klein and Nellis put it in their recent textbook on thermodynamics, “the property entropy is introduced in order to quantify the quality of energy”.Footnote 140

7 The Probabilistic Interpretation of the H-Theorem

Given the reasoning of Sect. 6, it may be difficult to see room for the probability calculus and its accompanying interpretation since the explanation the H-theorem affords is mechanical. According to Sect. 5.1, for Boltzmann (as for Clausius and Maxwell) a mechanical explanation is one that explains features of a gas system SYS by appeal to the causal behavior of subsystems of SYS. As in Maxwell’s statistical mechanics, probability does enter Boltzmann’s reasoning. It does so in a way that manifests an epistemic view of probability. According to Boltzmann, we don’t know the precise state of the molecules of SYS, so we cook up our best understanding of how they are dancing (a statistical distribution law) and then, given that assumption, we look to mechanical features to see how the subsystem’s velocities are changing as they approach the state described by the distribution law. The approach to that state, as well as the mechanism whereby the system remains in that state has directly to do with causal influences among the subsystems of SYS. Appreciating the velocity changes due to causal influences revealed in the collisions that push a non-equilibrium system like SYS toward equilibrium is the means whereby we appreciate how the system’s entropy is changing over time. Our entire methodology is always removed from the precise details about actual world goings-on because the way we are modeling the involved causal influence through velocity change is through equations about how the distribution function itself changes over time. Our best efforts can only ever be approximate, and time has told us (or, from Boltzmann’s perspective, will tell us) that statistical hypotheses coupled with the right equations (i.e., the Boltzmann equation and a statement of the H-functional) bear fruit and aid us in our efforts to save the phenomena.

Obviously, the aforementioned statistical hypotheses include quantitative statements of equations revealing the contents of distribution functions like the Maxwell and Maxwell–Boltzmann distribution laws. That these laws were understood by Boltzmann to give probabilities is evidenced by both Boltzmann’s interpretation and reinterpretation of the Maxwell distribution in (Boltzmann, “Studies on the Equilibrium of Live Force Between Moving Material Points” [16]),Footnote 141 and in Boltzmann’s first attempt at proving the H-theorem in 1872. Beside his 1872 attempted proof is an agreement with Maxwell. Probabilistic methods are required (quoting Darrigol’s careful exegesis) “in order to deal with highly irregular processes involving a huge number of molecules. The irregularity and the law of large numbers explain the stability of macroscopic averages”.Footnote 142

Does the admission of probabilistic resources into the H-theorem project mean that there are exceptions to the theorem? That is a tricky question. Theorems are necessarily true, if true. Given that the antecedent of a theorem is satisfied, the consequent is strictly implied. There are no exceptions to the H-theorem in one important sense then. But while Boltzmann does show, for systems satisfying the antecedent of the theorem, that the distribution of a gas system SYS at a time t causally depends (the Boltzmann equation is a deterministic equation) upon its distribution at some prior time, (as I’ve said before) the notion of a distribution itself is statistical or probabilistic.Footnote 143 There is no guarantee that nature always gives us systems fit for the assumption that the distribution used is appropriate (q.v., my discussion along these lines earlier in Sect. 5.2), and it is for this reason that I (and more importantly Boltzmann) have already said that there can be systems that do not evolve in the way demanded by the Boltzmann equation and the H-theorem.

7.1 Loschmidt’s Reversibility Objection: Articulated

Perhaps there is a way of more directly objecting to the use of the H-theorem in attempts to mechanically explain appearances. Recall that the antecedent of the H-theorem is the conjunction that identity (8) holds, the Boltzmann equation is omnitemporally true, and the distribution function f satisfies that equation.Footnote 144 We can now appreciate the question: what if we could find a system that satisfied the antecedent of the theorem but which did not have a distribution that tended toward the Maxwell distribution? An objection along these lines was voiced by Loschmidt [169]. His objection (the reversibility objection) began with the correct assumption that the laws of classical mechanics (specifically Hamiltonian mechanics) are time-reversal invariant and that therefore the evolutions involving the increase of the minus-H function can be turned around, resulting in evolutions involving a decreasing minus-H function. These minus-H function decreasing evolutions are perfectly consistent with the underlying laws of Hamiltonian mechanics because those laws are time-reversal invariant. If, however, the mechanics drives the minus-H function increase, then how can minus-H decrease be driven by the same mechanics? The reversed evolutions contradict the H-theorem. This is the reversibility paradox.Footnote 145
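The time-reversal invariance on which the objection relies can be made explicit with a short sketch. It assumes, purely for illustration, a Hamiltonian \(\mathcal{H}(\mathbf{q},\mathbf{p})\) that is even in the momenta, as with \(\mathcal{H}=\sum_i p_i^2/2m+U(\mathbf{q})\); the symbol \(\mathcal{H}\) denotes the Hamiltonian and not Boltzmann’s H, and the names \(\mathbf{Q}\) and \(\mathbf{P}\) are introduced here only for the reversed history. Hamilton’s equations read

$$\dot{q}_i=\frac{\partial \mathcal{H}}{\partial p_i},\qquad \dot{p}_i=-\frac{\partial \mathcal{H}}{\partial q_i}$$

Given a solution \((\mathbf{q}(t),\mathbf{p}(t))\), define \(\mathbf{Q}(t)=\mathbf{q}(-t)\) and \(\mathbf{P}(t)=-\mathbf{p}(-t)\). Because \(\mathcal{H}\) is even in \(\mathbf{p}\), the derivative \(\partial \mathcal{H}/\partial p_i\) is odd in \(\mathbf{p}\) and \(\partial \mathcal{H}/\partial q_i\) is even in \(\mathbf{p}\), so that

$$\frac{dQ_i}{dt}=-\left.\frac{\partial \mathcal{H}}{\partial p_i}\right|_{(\mathbf{q}(-t),\,\mathbf{p}(-t))}=\left.\frac{\partial \mathcal{H}}{\partial p_i}\right|_{(\mathbf{Q}(t),\,\mathbf{P}(t))},\qquad \frac{dP_i}{dt}=-\left.\frac{\partial \mathcal{H}}{\partial q_i}\right|_{(\mathbf{q}(-t),\,\mathbf{p}(-t))}=-\left.\frac{\partial \mathcal{H}}{\partial q_i}\right|_{(\mathbf{Q}(t),\,\mathbf{P}(t))}$$

Under these assumptions the reversed history is again a solution of the same equations, which is the premise the reversibility objection requires.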

7.1.1 The H-Theorem Untarnished in Boltzmann’s Eyes

My response to Loschmidt’s famous objection will come later (Sect. 8). For now, I point out that contrary to what the Standard Story would have us believe, after Boltzmann engaged with Loschmidt’s reversibility objection, he continued to positively affirm the H-theorem as a means whereby one achieves insight into the deterministic and exceptionless increase of entropy for systems that satisfy the antecedent of the H-theorem. He continued on in this way late into his career, viewing his H-theorem as the mechanical justification of the second law, even after articulating his combinatorial definition of entropy and his combinatorial arguments for a statistical statement of the second law. The H-theorem was viewed by him to be a more fundamental justification of the second law, one which the combinatorial arguments illustrate.Footnote 146 There are four reasons in favor of this understanding of Boltzmann’s work.

First, in Boltzmann’s very reply to Loschmidt, Boltzmann affirms that “the existence of microstates for which the entropy decreases does not contradict the general endeavor to deduce the entropy law from atomistic considerations” ([88], 198 my emphasis). The very section immediately following Boltzmann’s reply to Loschmidt is entitled “Comments on the Mechanical Meaning of the Second Law of Heat Theory”. There Boltzmann invests time and energy discussing the mechanical justification of the second law, never giving up on it.

Second, as late as Boltzmann’s first volume of the Lectures on Gas Theory (1896), Boltzmann says, “if at the beginning of some time interval [the value of the distribution function] is on the average the same at each position in the gas…, the same will hold true at all future times”.Footnote 147 And he would add, “the quantity we have called H can only decrease”.Footnote 148 In the second volume of the Lectures on Gas Theory published in 1898, after churning through several points in a proof sketch, Boltzmann concluded, “[s]ince the same holds for all other kinds of molecules, and similarly for collisions of different molecules of the same kind with each other, we have proved that in this special case the value of H can only decrease as a result of collisions”.Footnote 149 Undoubtedly, throughout Boltzmann’s corpus, the way he views the H-theorem and its implications is “predominantly deterministic”.Footnote 150

Third, when Boltzmann was writing his Lectures on Gas Theory, he stopped midway through (volume 1 was published in 1896, volume 2 in 1898). Why did he do this? He did it because he thought it necessary to author a treatise on mechanics, since his gas theory was, or should be (he believed), grounded in mechanics. So, he published volume one of his treatise on mechanics in 1897. As he says in his Lectures on Gas Theory, the atomistic approach to the physics of matter provides the best mechanical explanation of phenomena ([35], pp. 26–27). This was no isolated supposition in Boltzmann’s more general corpus. As Jungnickel and McCormmach ([136], p. 191) state, “Boltzmann presented mechanics as the foundation of all theoretical physics.”Footnote 151

Fourth, Boltzmann’s general physical methodology distinguished between mechanical principles or laws, hypotheses, and the world. The laws are those of classical Hamiltonian mechanics. Hypotheses are principles like the second law of thermodynamics. Laws or mechanical principles are tested by the confirmation or disconfirmation of the hypotheses they entail. Hypotheses are confirmed in conjunction with the mechanical laws from which they follow.

…neither the Theory of Gases nor any other physical theory can be quite a congruent account of facts…Certainly, therefore, Hertz is right when he says: ‘The rigour of science requires, that we distinguish well the undraped figure of nature itself from the gay-coloured vesture with which we clothe it at our pleasure.’ But I think the predilection for nudity would be carried too far if we were to forego every hypothesis. Only we must not demand too much from hypotheses…Footnote 152

He continued:

Every hypothesis must derive indubitable results from mechanically well-defined assumptions by mathematically correct methods. If the results agree with a large series of facts, we must be content, even if the true nature of facts is not revealed in every respect. No one hypothesis has hitherto attained this last end, the Theory of Gases not excepted. But this theory agrees in so many respects with the facts, that we can hardly doubt that in gases certain entities, the number and size of which can roughly be determined, fly about pell-mell. Can it be seriously expected that they will behave exactly as aggregates of Newtonian centres of force, or as the rigid bodies of our Mechanics? And how awkward is the human mind in divining the nature of things, when forsaken by the analogy of what we see and touch directly?Footnote 153

This is a hypothetico-deductive method that includes the humble assertion that nature may not conform perfectly to our hypotheses and mechanical laws. According to this method, hypotheses like the second law must follow from mechanics.

While Boltzmann (Certain Questions [29]) provides me with some ammunition for my reading, the same source could be interpreted as completely taking it away:

It can never be proved from the equations of motion alone, that the minimum function H must always decrease. It can only be deduced from the laws of probability, that if the initial state is not specially arranged for a certain purpose, but haphazard…the probability that H decreases is always greater than that it increases.Footnote 154

This quotation gives my exegetical project the most serious kind of trouble. In it, Boltzmann admits to being unable to recover the H-theorem from mechanical considerations and suggests that the relationship between the antecedent of the theorem and its consequent is a probabilistic relation. I find that this series of remarks contains elements that are false, and worse, nonsensical. Again, the antecedents of theorems entail their consequents, and yet Boltzmann is quite clearly allowing for cases in which the consequent fails to follow from satisfaction of its antecedent. That is nonsensical. In addition, Boltzmann says that the H-theorem does not follow from mechanics. But I have already pointed out how Lanford showed that on the supposition that a choice gas system is dilute and that its constituents are approximated as hard shells (plus some further assumptions), the Boltzmann equation follows from the time-reversal invariant equations of motion in classical mechanics.Footnote 155 As Fields medal winner Cédric Villani stated at his 2010 Cambridge University lecture,

Probably the single most important theorem in the [kinetic] theory remains the Lanford theorem from 1973. Lanford rigorously derived the Boltzmann equation from Newtonian mechanics…[for an appropriate domain]…in…[the appropriate limit] you recover the Boltzmann equation…This was the first result showing that you could…get this Boltzmann equation out of the Newton equation[s].Footnote 156

Proofs of the H-theorem itself have been articulated in such a way that they satisfy the standards of rigor in contemporary mathematics (q.v., n. 79).

We should not take the passage quoted above and cited in note 154 too seriously despite how often it is quoted. Boltzmann contradicts it numerous times in his Lectures on Gas Theory, and those lectures are the best source for Boltzmann’s mature thought on thermodynamics and statistical mechanics. Here are the reasons for this:

(a) In 1900, the minister in Vienna described Boltzmann’s attempt to acquire recognition and leadership status among the community of physicists through the publication of his Lectures (both his lectures on mechanics and those on gas theory) as his “almost morbid ambition”.Footnote 157

(b) Boltzmann believed that experimental physicists were at a disadvantage when compared to theoretical physicists because the latter could publish books rooted in their lectures and thereby present theories, quoting Jungnickel and McCormmach,

    …from the perspective of their preferred methods. Boltzmann's published lectures on theoretical physics—covering his favorite parts of it, Maxwell's electromagnetic theory, gas theory, and analytical mechanics—were not syntheses of authoritative writings in the field but his version of theoretical physics.Footnote 158

(c) There is evidence that Boltzmann believed that the atomic theory was going to fall out of favor and become completely abandoned. One of his reasons for publishing the Lectures on Gas Theory was to produce a historical deposit of the best statement of an atomistic physics of thermodynamics and statistical mechanics (as they pertained to the physics of gases) that he could muster so that when atomic theory was (in his words) “again revived, not too much will have to be rediscovered”.Footnote 159

Points (a)-(c) clearly justify a high view of the Lectures on Gas Theory understood as the best avenue to Boltzmann’s mature thought on statistical mechanics. Interestingly, Boltzmann cites (Boltzmann, On the Relation between the Second Law and Probability Calculus [25]) only once in either of its two volumes. It seemed to have been a theme—not only in Boltzmann’s own corpus but also in the work of his contemporaries—that the H-theorem and mechanical approach take precedence.Footnote 160 Consider:

(d) Outside of the Lectures on Gas Theory and after 1877, there are only five papers/works in which Boltzmann uses the probability calculus, and among these five, only one of them applies the probability calculus to a real-world physical scenario. Among the remaining four papers, two are really just replies, and the last two are summaries of his earlier 1877 work.Footnote 161

(e) Boltzmann’s combinatorial work was almost entirely ignored by his contemporaries. The standard discussion of the work of both Maxwell and Boltzmann at the end of the nineteenth century was Rev. Henry William Watson’s (1827–1903) A Treatise on the Kinetic Theory of Gases.Footnote 162 That work never once cites Boltzmann’s 1877 paper in which he presents the two combinatorial arguments. Burbury’s A Treatise on the Kinetic Theory of Gases (Cambridge University Press, 1899) does not discuss Boltzmann’s combinatorial approach. Bryan mentions it in a footnote in his contribution to the Nature debates. And the principal concern of [100] (once an encyclopedia article on statistical mechanics published near the beginning of the twentieth century) was the status of Boltzmann’s H-theorem.Footnote 163

(f) That Boltzmann’s contemporaries understood him to prefer the mechanical approach to thermodynamics and statistical mechanics can be seen in the synopsis of one of his famous students, viz., Paul Ehrenfest (1880–1933). He wrote,

    "Mechanical representations, were the material from which Boltzmann preferred to fashion his creations…He obviously derived intense aesthetic pleasure from letting his imagination play over a confusion of interrelated motions, forces and reactions until the point was reached where they could actually be grasped. This can be recognized at many points in his lectures on mechanics, on the theory of gases, and especially on electromagnetism. In lectures and seminars Boltzmann was never satisfied with just a purely schematic or analytical characterization of a mechanical model. Its structure and its motion were always pursued to the last detail."Footnote 164

As I’ve already suggested, Boltzmann published numerous replies as part of a mid-1890s discussion of his work in the journal Nature. Discussants included George Hartley Bryan (1864–1928), Burbury, Culverwell, Joseph Larmor (1857–1942), and Watson.Footnote 165 That debate took place just after the publication of a new proofFootnote 166 of the H-theorem in (H. W. Watson [236], pp. 33–49). Although no one questioned the correctness of Watson’s suitably amended (by Culverwell) proof,Footnote 167 many objections and searching questions were raised about how Boltzmann used the H-theorem in his theorizing about irreversible thermodynamic processes and mechanics. In the face of those objections and questions, Boltzmann never once abandoned the theorem (even though he could have easily reverted to his 1877 combinatorial and probabilistic approach in which the H-theorem played no essential role).Footnote 168

8 The Reversibility Paradox Answered

In Sects. 2 and 5.1, I showed that Clausius, Maxwell, and Boltzmann thought of collisions as instances of causation that drive entropic increase (i.e., collisions are that which produces the transition from non-equilibrium states to equilibrium states). This fact underwrites the sense in which their way of explaining the second law was mechanical. With respect to Boltzmann and the H-theorem, ensuring minus-H increase requires special types of collisions. Not just any will do. Only collisions with a unique type of built in asymmetry get the job done. I turn now to exploring the full nature of that asymmetry. My exploration will reveal another way in which statistical considerations enter the mechanical explanation of the second law. It will also reveal the solution to the reversibility paradox.

8.1 The Hypothesis of Molecular Chaos

When, in 1895, Boltzmann said that the H-theorem only guarantees that it is highly likely both that (i) appropriate non-equilibrium gas systems increase in entropy over time and that (ii) suitable equilibrium gas systems stay in equilibrium, he said this in reply to the reversibility paradox as articulated, not by Loschmidt but by Culverwell. I rejected Boltzmann’s response in Sect. 7.1.1, because it makes both false and nonsensical claims. I showed, in the same section, that his remarks do not reflect the refined and mature views he communicated elsewhere. Am I preparing the way for a non-statistical statement of the second law? No. As in the work of both Maxwell and Boltzmann, my theory will admit probabilistic considerations in at least two places. First, (again) the Maxwell distribution is itself a statistical principle. Second, it was realized during the Nature debates in the mid-1890s that an important assumption about the nature of collisions, which I will call the hypothesis of molecular chaos (HMC), was required in order for the H-theorem to be applicable to real-world systems.Footnote 169 With this virtually everyone (whether mathematicians, historians of physics, philosophers of physics, or physicists themselves) agrees. Disagreement arises over the precise form of the assumption.Footnote 170 I maintain that the assumption is directly related to my explanation of how and in what way some systems avoid H-theorem application (q.v., Sect. 5.2). Systems that have very special initial conditions are not guaranteed to be the kind to which the H-theorem is applicable.Footnote 171 All positions and velocities consistent with the conservation laws must be allowed early on. One way to help ensure that the system does not begin in some special state is to suppose (and Bryan ([53], p. 29) made this explicit) that the molecular constituents of the system are statistically independent in that their motions are not correlated temporally prior to that which produces entropic increase (i.e., collisions). That is to say, HMC states that the pre-collision velocities of two colliding molecules in a gas system of the right kind are statistically independent, and that the post-collision velocities of those same molecules become correlated both after and because of the collision. This one-sided or asymmetric molecular chaos propagates for positive times in the sense that collisions that drive minus-H increase retain this correlation-creating ability throughout the system’s evolution toward equilibrium. When I say that the velocities after collisions are correlated, I shall at least mean that in order to retrieve the probability of the post-(binary) collision trajectory of one of the molecules in the collision, one should conditionalize on the post-(binary) collision trajectory of the other molecule, inter alia and vice versa.

Bryan was not the first to notice the HMC in Boltzmann’s project. Something close to it was recognized by Lorentz in his 1886 correspondence with Boltzmann about time-reversal invariance and the derivation of the Maxwell–Boltzmann distribution for polyatomic gas systems. He stated,

We may assume that in a natural gas the particles have completely irregular positions and phases, or at least that there is no definite relation between the positions that the particles have before time dt [in which a collision of a given kind occurs] and the number of collisions which they will experience [during this time]. In contrast, it is clear that the positions and phases of the particles are not completely irregular with respect to the past collisions, because they result precisely from the latter collisions. Now, if we revert all velocities as you wish to do, we get a state in which the positions and phases are prepared for the forthcoming collisions and therefore complete irregularity no longer holds.Footnote 172

Here Lorentz articulates the idea that before collisions during dt, gas particles are statistically independent and therefore “irregular” with respect to their positions and phases. He likewise affirms that collisions cause those positions and phases to become in some sense regular.

Through some persuasive efforts, Boltzmann came to accept (at least for a substantial period of time) the HMC. He wrote, “[w]e shall therefore, [he concludes in 1896] now explicitly make the assumption that the” pre-collision motions are “molecularly disordered and” remain “so throughout all future time”.Footnote 173 Elsewhere Boltzmann criticizes Gustav Kirchhoff’s (1824–1887) derivation of the Maxwell distribution in [140]. The basis of his critical review is that Kirchhoff has not properly assumed the HMC. Boltzmann turns out to be wrong about this, but the fact that he uses the HMC as a criterion for determining the threshold of a good derivation of the Maxwell distribution suggests a high view of the HMC.Footnote 174

The more technically inclined reader will desire a formal presentation of the HMC in the language of mathematics.Footnote 175 I will not provide one because there isn’t one. Brilliant mathematicians have given this issue much thought and have concluded that “the physical derivation of the Boltzmann equation is based on the propagation of one-sided chaos, but no one knows how this property should be expressed mathematically…”Footnote 176 Call this the No Mathematics Problem (NMP). This may strike one as a troubling situation. But matters are worse. The second law is commonly used as part of a solution to the problem of the arrow of time, which asks: Why do we perceptually behold irreversible processes when the laws of mechanics governing the micro-constituents of the systems in those processes are time-reversal invariant? If one’s answer in any way relies upon the H-theorem, then one’s answer will invite yet another problem of asymmetry: Why do the binary collisions that produce minus-H increase produce correlations only after those collisions obtain? As Brown et al. stated, “… there is no reason given as to why the…[HMC]…holds for pre-collision velocities rather than post-collision ones”.Footnote 177 Call this the Chaos Asymmetry Problem (CAP).

8.2 The Solution at Long Last

My proposed resolution of the reversibility paradox will also serve as the solution to the NMP and the CAP. The first step of the solution is to understand the HMC as an interpretive postulate about the nature of that which drives minus-H increase, viz., collisions. That the collisions are responsible for entropic increase in Boltzmann’s H-theorem-laden kinetic theory is acknowledged by virtually all scholars.Footnote 178 The standard story in kinetic theory is that collisions between molecules in non-equilibrium closed systems drive those systems into equilibrium, and that “equilibrium”, writes Thomas Kuhn, “is, by definition, the state in which the distribution is unaffected by collisions”.Footnote 179 But now we must ask, if collisions causally produce minus-H increase, then why do collisions among molecules of gas systems in equilibrium fail to increase minus-H even further? Of course, once the system reaches equilibrium it is characterized by the Maxwell distribution, in which case, the functional H vanishes thereby reaching its minimum value (the H-theorem used to be called the minimum theorem). It’s just a mathematical fact that, at that point, H cannot decrease any further and minus-H cannot increase any further. But mathematical facts can have metaphysical explanations. That is to say, there exists a reason why once H vanishes, entropy fails to increase, and that reason consists in the fact of energy dissipation. Recall Stephen G. Brush’s point (quoted previously) that “the H-theorem is a microscopic version of the general principle of [the] dissipation of energy proposed by Kelvin in 1852, and reformulated by Clausius in 1865…”Footnote 180 The way energy has transformed and dissipated (remember, entropy tracks this energy transformation) has left the system in equilibrium, no longer allowing it to further transform.Footnote 181 The capability of the system to perform work becomes attenuated.
Contemporary thermodynamicists such as Sanford Klein and Gregory Nellis interpret the second law “as a system for assigning quality to energy”. They continued,

Although energy is conserved, the quality of energy is always reduced during energy transformation processes. Lower quality energy is less useful to us in the sense that its capability for doing work has been diminished. The quality of energy is continuously degraded by all real processes; this observation can be expressed in lay terms as ‘running out of energy’.Footnote 182

The energy transformative process is a causal one. That interpretation is plausible for at least two reasons. First, at the heart of the process in thermodynamic or statistical mechanical evolutions are causally efficacious collisions producing entropic increase. Second, kinetic energy is a causal quantity. Rankine said that “actual”, or what, in 1862, Thomson would identify as kinetic energy “is a measurable, transferable, and transformable affection of a substance, the presence of which causes the substance to tend to change its state in one or more respects…”.Footnote 183 Modern statements do not differ, as contemporary classical (non-relativistic) physics universally characterizes kinetic energy in terms of work. Changes in kinetic energy \(dT\) (where T is not temperature but kinetic energy) are also specified by appeal to work done by net force, or \(\mathbf{F}\cdot d\mathbf{r}\).Footnote 184 But forces in classical mechanics are causal.Footnote 185
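The link between changes in kinetic energy and work can be spelled out in one line. As a sketch, assume a single particle of constant mass \(m\) with velocity \(\mathbf{v}=d\mathbf{r}/dt\), subject to net force \(\mathbf{F}=m\,d\mathbf{v}/dt\) (the symbols \(m\) and \(\mathbf{v}\) are introduced here only for illustration):

$$dT=d\left(\tfrac{1}{2}m\,\mathbf{v}\cdot \mathbf{v}\right)=m\,\mathbf{v}\cdot d\mathbf{v}=m\frac{d\mathbf{v}}{dt}\cdot \mathbf{v}\,dt=\mathbf{F}\cdot d\mathbf{r}$$

On these assumptions, any change in the particle’s kinetic energy equals the work done on it by the net force, which is why a causal reading of force carries over to kinetic energy.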

Some think we can forsake forces in a conceptually sophisticated enough classical (non-relativistic) mechanics if we appropriate Hamiltonian mechanics, an energy-based theory. Hamiltonian mechanics is an energy-based approach to the dynamics of classical (non-relativistic) systems because the laws of motion in Hamiltonian mechanics use the Hamiltonian or energy function (and here, I follow [218], pp. 528–531):

$$\mathcal{H}={\sum }_{i=1}^{n}\left({p}_{i}{\dot{q}}_{i}\right)-\mathcal{L} \tag{15}$$

given that [\(i=1,\dots ,n\)], and that the system is described by generalized momenta:

$${p}_{i}=\frac{\partial \mathcal{L}}{\partial {\dot{q}}_{i}} \tag{16}$$

and specified by generalized coordinates:

$$\mathbf{q}=({q}_{1},\dots ,{q}_{n}) \tag{17}$$

Here the Lagrangian \(\mathcal{L}\) is a function of \(\mathbf{q},\) \(\dot{\mathbf{q}}\) (specified below), and time. If the system is isolated, the generalized coordinates stand in a time-independent relationship to the Cartesian or rectangular coordinates tracking the system, and the potential energy of the system is velocity independent,

$$\mathcal{H}=T+U \tag{18}$$

and generalized momenta as well as generalized velocity can be written (respectively) as:

$$\mathbf{p}=\left({p}_{1},\dots ,{p}_{n}\right) \tag{19}$$

$$\dot{\mathbf{q}}=\left({\dot{q}}_{1},\dots ,{\dot{q}}_{n}\right) \tag{20}$$

Both potential and kinetic energy are analyzed (at least in part) in terms of work (force times displacement). When the Hamiltonian equals the sum of kinetic and potential energy, force thereby enters Hamiltonian mechanics. When it is appropriate to specify the Hamiltonian in terms of the Lagrangian \(\mathcal{L}\), force enters more indirectly.Footnote 186 The Lagrangian \(\mathcal{L}\) is (in many appropriate circumstances) equal to \(T-U\) (where U is here potential energy and not internal energy). So, the Lagrangian is (in appropriate circumstances) at least in part specified by appeal to the kinetic and potential energy of the system. But again, kinetic and potential energy, even in Hamiltonian mechanics, are, in part, standardly interpreted and analyzed in terms of work. But work is, in part, specified in terms of net force. Thus, forces are indispensable to any plausible interpretation of Hamiltonian mechanics, and therefore causation is as well since forces are causes (Hamilton would agree! Q.v., the end of n. 185).
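A one-particle example may make concrete the route by which force enters. As a sketch, assume a single particle of mass \(m\) moving in one dimension under a velocity-independent potential \(U(q)\) (the mass \(m\) and this choice of system are illustrative assumptions). With \(\mathcal{L}=\tfrac{1}{2}m{\dot{q}}^{2}-U(q)\), definition (16) gives \(p=m\dot{q}\), and (15) gives

$$\mathcal{H}=p\dot{q}-\mathcal{L}=m{\dot{q}}^{2}-\tfrac{1}{2}m{\dot{q}}^{2}+U(q)=\frac{{p}^{2}}{2m}+U(q)=T+U$$

in agreement with (18). Hamilton’s equation for the momentum then reads

$$\dot{p}=-\frac{\partial \mathcal{H}}{\partial q}=-\frac{dU}{dq}=F$$

so, in this simple case at least, the force on the particle reappears as the rate of change of momentum even though the formalism is written in terms of energy.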

If our interpretation of classical mechanics is causal, then it admits an asymmetry. Causation is formally asymmetric. How then do I meet the famous reversibility objection in the work of Thomson, Loschmidt, and Culverwell? Recall the gist of that objection. All minus-H increasing evolutions imply the possibility of minus-H decreasing reversed evolutions of an appropriate isolated gas system. This, thought Thomson, Loschmidt and Culverwell, is a consequence of the reversibility of the microdynamics ([89], p. 774). The reversibility of the microdynamics and therefore also the reversibility of the supervening macroscopic evolutions was thought to be a consequence of the reversal of the involved velocities. As Loschmidt wrote, “the entire course of events will be retraced if, at some instant, the velocities of all its parts are reversed”.Footnote 187 Or as Thomson put it, “[i]f, then, the motion of every particle of matter in the universe were precisely reversed at any instant, the course of nature would be simply reversed for ever after”.Footnote 188 Reversing velocities was deemed naturally possible because the underlying microdynamical equations of motion in Hamiltonian mechanics were correctly thought to be time-reversal invariant.
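The purely mathematical content of the velocity-reversal claim can be illustrated numerically. What follows is a minimal sketch under assumptions that are not in the text: a toy system of independent particles on a line, each bound by a harmonic force, integrated with the velocity Verlet scheme (which is itself time-reversible). All function names and parameter values are hypothetical. It illustrates only the reversibility of the equations of motion and says nothing about collisions or the HMC.

```python
# Minimal sketch of Loschmidt-style velocity reversal for a toy system.
# Assumptions (not from the text): independent particles on a line,
# each subject to a harmonic force F = -k x, unit mass, velocity Verlet steps.

def force(x, k=1.0):
    """Harmonic restoring force on each particle."""
    return [-k * xi for xi in x]

def step(x, v, dt, m=1.0):
    """One velocity Verlet step (a time-reversible integrator)."""
    a = [f / m for f in force(x)]
    v_half = [vi + 0.5 * dt * ai for vi, ai in zip(v, a)]
    x_new = [xi + dt * vh for xi, vh in zip(x, v_half)]
    a_new = [f / m for f in force(x_new)]
    v_new = [vh + 0.5 * dt * an for vh, an in zip(v_half, a_new)]
    return x_new, v_new

def evolve(x, v, dt, n_steps):
    """Advance the system n_steps forward in time."""
    for _ in range(n_steps):
        x, v = step(x, v, dt)
    return x, v

def reverse_and_return(x0, v0, dt=0.01, n_steps=1000):
    """Evolve forward, flip every velocity, then evolve the same number of steps."""
    x1, v1 = evolve(x0, v0, dt, n_steps)
    v1_reversed = [-vi for vi in v1]
    return evolve(x1, v1_reversed, dt, n_steps)

x0 = [0.3, -1.2, 0.7]
v0 = [1.0, 0.4, -0.5]
x_back, v_back = reverse_and_return(x0, v0)

# After reversal the system should retrace its path: positions return to
# their initial values and velocities to minus their initial values,
# up to floating-point round-off.
max_pos_error = max(abs(a - b) for a, b in zip(x_back, x0))
max_vel_error = max(abs(a + b) for a, b in zip(v_back, v0))
```

In this toy case the two error measures stay at the level of round-off, which is the sense in which “the entire course of events will be retraced”. Whether that mathematical fact settles the interpretive question is a separate matter.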

The idea that you get so much from simple velocity reversal of microconstituents of real-world classical systems is mistaken. Entropic increase as envisioned by the H-theorem-laden kinetic theory is not driven by an underlying microdynamical evolutionary process that is reversible. Am I denying that Hamilton’s equations of motion are time-reversal invariant? No. Recall that those equations are time-reversal invariant only if replacing t with minus-t (being careful to also flip the sign of all odd forms of t such as velocity) allows solutions to be taken to solutions (or as Thomson said, “any solution remains a solution” ([221], p. 441)). I am certainly not denying that. It’s a mathematical fact. However, the equations of motion, once fully interpreted and thereby rendered applicable to real-world classical systems, inform us about unfolding causal processes that possess an asymmetry even in the micro-processes. That asymmetry stems from the causation in collisions, the very engine of entropic increase and so also the source of the asymmetry in the HMC. This source is not directly represented by the mathematics expressing the microdynamical (Hamiltonian) laws and so it is no surprise that the HMC is not directly represented in that mathematics either. The HMC is not part of the formulation of mechanical principles that govern collisions (compare Uffink’s remarks in [226], p. 969). Rather, it is an understanding of how that formalism fits the real world (i.e., it is part of an interpretation of the mechanics).Footnote 189 But one might counter: The HMC is about collisions, and we have a classical mathematical collision theory.

8.2.1 Solving the Chaos Asymmetry and No Mathematics Problems

Go back to Maxwell’s “On the Dynamical Theory of Gases” [174]. There, Maxwell assumes that all collisions are elastic (total kinetic energy and momentum are conserved through collisions). This assumption is false for polyatomic molecules, and false for atomic collisions. The latter conjunct holds because in collisions between atoms some kinetic energy is converted into other forms of energy. But set these points aside. As in some modern accounts of classical collision theory, Maxwell accounted for constituent collisions between two arbitrary gas molecules by giving attention to their pre-collision velocities, their post-collision velocities, and those collision “parameters that are necessary to determine the final velocities of the molecules”.Footnote 190 As already revealed in preceding discussion (q.v., n. 39), the collision parameters are usually the azimuthal angle \(\phi\), and the impact parameter \(b\). With respect to binary collisions, the latter is nothing more than two modal entities, viz., the paths the two molecules would travel were they to fail to interact with one another (in the center-of-mass frame). The former is just an angle, viz., the angle that fixes the plane upon which sit the post-collision trajectories of both molecules. The type of interaction involved need not be restricted to a physical contact in a real-world collision because the types of entities interacting are not restricted to or always best approximated by point masses. The interaction may be complex involving various force-types. For example, it may include the exertion of non-contact forces made manifest in attractions or repulsions alongside or with contact forces. But even if the involved force impressions were purely contact forces, the impact parameter would not provide that which is sufficient for fully determining (in the sense of producing) the post-collision velocities.
For elastic collisions of molecules of gases (not unlike ideal gases) with the same masses, one can through straightforward mathematical reasoning, determine (in the (epistemic) sense that you can infer or derive) the post-collision velocities of the two colliding molecules from knowledge of the laws of conservation, the impact parameter, the azimuthal angle, and the pre-collision velocities. The sense of determination here is epistemic because it would be obviously shortsighted to judge that because a mathematical fact about the post-collision velocities follows from mathematical facts about conservation, the pre-collision velocities, the azimuthal angle, and the impact parameter, that therefore nothing more in the world metaphysically determines the post-collision velocities when the phenomenon under study is a collision phenomenon involving impact force impression. There was a collision! There was an interaction between the two molecules! What has happened in Maxwell’s treatment (and as we shall see in Boltzmann’s treatment too) is that Maxwell has chosen to model around the interactions, or the intimate details of the impact-laden collisions.Footnote 191 What has happened is that Maxwell has utilized a conceptual strategy that Mark Wilson calls physics avoidance [243]. Maxwell sought to model the world using quantitative walk-arounds that enabled his models to escape severe mathematical difficulties.Footnote 192
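The epistemic sense of determination at issue can be made concrete with a short sketch. It assumes equal-mass hard spheres of diameter `d`, a modeling choice made here for brevity and not taken from the text; the function names and the convention for orienting the azimuthal angle are likewise hypothetical. The post-collision velocities are inferred from the pre-collision velocities, the impact parameter `b`, the azimuthal angle `phi`, and the conservation laws. Nothing in the calculation represents what happens during the interaction itself.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def scale(c, a):
    return tuple(c * x for x in a)


def add(a, b):
    return tuple(x + y for x, y in zip(a, b))


def post_collision(v1, v2, b, phi, d=1.0):
    """Infer post-collision velocities for equal-mass hard spheres.

    v1, v2: pre-collision velocities (3-tuples); b: impact parameter
    with 0 <= b <= d; phi: azimuthal angle; d: sphere diameter (assumed).
    """
    g = add(v1, scale(-1.0, v2))          # relative velocity of 1 w.r.t. 2
    speed = math.sqrt(dot(g, g))
    if speed == 0.0 or not 0.0 <= b <= d:
        raise ValueError("no collision for these inputs")
    g_hat = scale(1.0 / speed, g)
    # Orthonormal pair (e1, e2) spanning the plane perpendicular to g_hat.
    i = min(range(3), key=lambda k: abs(g_hat[k]))
    helper = tuple(1.0 if k == i else 0.0 for k in range(3))
    e1 = cross(g_hat, helper)
    e1 = scale(1.0 / math.sqrt(dot(e1, e1)), e1)
    e2 = cross(g_hat, e1)
    # Unit vector along the line of centers at the moment of contact.
    s = b / d
    radial = add(scale(math.cos(phi), e1), scale(math.sin(phi), e2))
    n = add(scale(s, radial), scale(-math.sqrt(1.0 - s * s), g_hat))
    # Equal masses: exchange the velocity components along n.
    gn = dot(g, n)
    u1 = add(v1, scale(-gn, n))
    u2 = add(v2, scale(gn, n))
    return u1, u2
```

For any admissible `b` and `phi` the returned velocities conserve total momentum and total kinetic energy, and a head-on collision (`b = 0`) simply exchanges the two velocities.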

Boltzmann’s proof of the H-theorem treats collisions much the same way Maxwell modeled them, i.e., by using the conservation laws, plus the “initial value of the kinetic energies \({k}_{1}\) and \({k}_{2}\) of” two colliding molecules, “and by the value \({k{^{\prime}}}_{1}\) of the kinetic energy of the first molecule after the collision”.Footnote 193 Boltzmann’s combinatorial argument of 1877 practices a similar type of physics avoidance, except in that context it completely “neglects the contribution to the energy of the system that stems from interactions between the particles”.Footnote 194 Followers of Boltzmann who prefer the combinatorial method do not resist the neglect. In fact, the same Boltzmannians provide a means whereby we can empirically distinguish H-theorem-laden statistical mechanics from the more popular modern Boltzmannian approach found in places like [2, 115, 117, 162, 166]. For example, Goldstein et al. call the entropy discussed in Boltzmann’s combinatorial approach, Boltzmann entropy (SB). They provide sound justification for separating SB from the entropy that is minus-H, stating that:

[i]f interaction cannot be ignored, then the H functional does not correspond to the Boltzmann entropy…[w]hen interaction can be ignored there is only kinetic energy, so the Boltzmann macro states based on the empirical distribution alone determine the energy and hence the H functional corresponds to the Boltzmann entropy.Footnote 195

In modern classical mechanical approaches to Boltzmannian statistical mechanics that use an H-theorem and a Boltzmann collision operator Q, impact interactions are avoided or modeled around (as is implied in [230], p. 79). In that context too, collisions are all assumed to be binary, and the involved particles don’t really contact one another, for in that literature binary collisions are processes “in which two particles happen to come very close to each other, so that their respective trajectories are strongly deviated in a very short time”.Footnote 196 There is physics avoidance afoot here because modern theoreticians are engaging in modeling walk-arounds.

Ignoring interactive contact collisions between point-like objects is as old as Newtonian mechanics. Newton’s second law says that “[a] change in motion [not a rate] is proportional to the motive force impressed [where the proportionality constant is inertial mass] and takes place along the straight line in which that force is impressed” ([188], p. 416).Footnote 197 The masses of the objects to which the second law was intended to apply never equal zero. Even point masses have mass. However, when two point-like objects or point masses collide, their accelerations are obliterated, and as a result the second law fails as there is a force impressed but no resulting acceleration (and not because of a balance rendering the force vector equal to the zero vector). Newton was aware of this problem and saw no application of the second law of motion to contact interactions. Here is Wilson on Newton’s approach to the problem. Note the similarities to the methods of Maxwell and Boltzmann,

Whenever these radii contact one another (we shall only worry about the head on collision case), Newton abandons the requirement that the ‘a’ in ‘F = ma’ must make sense and shifts his focus to the two balls’ incoming stores of linear momentum and kinetic energy (as we now dub them), together with a purely empirical factor called a coefficient of restitution (it governs how much the total kinetic energy budget will diminish post-collision). In effect, this treatment blocks out the crucial interval of time \(\Delta t\) where ‘F = ma’ fails to make sense and glues together the incoming and outgoing events exterior to \(\Delta t\) through a mixture of conservation principles (conservation of linear momentum) and raw empirics (coefficients of restitution extracted from experiment). Formally, tactics that patch over problematic intervals or regions in this manner are frequently called matched asymptotics.Footnote 198
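The patching procedure Wilson describes can be sketched for the head-on case. This is a minimal illustration under assumptions made here: the function names are hypothetical, motion is one-dimensional, and the coefficient of restitution `e` is defined as the ratio of separation speed to closing speed. The code never consults a force law or an acceleration. It glues the incoming state to the outgoing state using conservation of linear momentum and the empirical factor alone.

```python
def restitution_collision(m1, v1, m2, v2, e):
    """Outgoing velocities for a head-on collision in one dimension.

    m1, m2: masses; v1, v2: incoming velocities; e: coefficient of
    restitution in [0, 1] (e = 1 is perfectly elastic).
    """
    if not 0.0 <= e <= 1.0:
        raise ValueError("e must lie between 0 and 1")
    total_mass = m1 + m2
    momentum = m1 * v1 + m2 * v2        # conserved across the gap
    closing = v1 - v2                   # relative velocity before contact
    u1 = (momentum - m2 * e * closing) / total_mass
    u2 = (momentum + m1 * e * closing) / total_mass
    return u1, u2


def kinetic_energy(m1, v1, m2, v2):
    return 0.5 * m1 * v1 * v1 + 0.5 * m2 * v2 * v2


if __name__ == "__main__":
    u1, u2 = restitution_collision(2.0, 1.0, 1.0, -1.0, 0.5)
    print(u1, u2)
    print(kinetic_energy(2.0, 1.0, 1.0, -1.0) - kinetic_energy(2.0, u1, 1.0, u2))
```

With `e = 1` and equal masses the two bodies exchange velocities; with `e < 1` momentum is still conserved while kinetic energy is lost, which is the diminished "budget" the quotation mentions.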

The problem is not unique to Newton’s formulation of classical mechanics. It reappears in Hamiltonian mechanics. Mathematician Robert Devaney stated,

…specific Hamiltonian systems which arise in applications often suffer singularities as well. By a singularity we mean a point where the differential equation itself is undefined. A typical example of a singularity is a collision between two or more of the point masses in the Newtonian n-body problem. At collision, the differential equation breaks down: the velocities of the particles involved become undefined. A singularity or collision can create havoc among nearby solution curves. Solutions which pass near a singularity may behave in an erratic or unstable manner, and solutions which start out close to one another can end up far apart after passing by a singularity.Footnote 199

That you should care about more than mere positions and post-collision velocities in such cases, and that you should give attention to the interactions during the relevant \(\Delta t\) (sometimes this time interval is referred to with the symbol \(\Delta {t}^{*}\)) was expressed very clearly by Gottfried Wilhelm Leibniz (1646–1716). The fact that you could recover so much while ignoring the details of the evolution during \(\Delta t\) was, for Leibniz, “a convenient trick”.Footnote 200 Leibniz thought that in order to get to the deep joints of nature, you need to pick up on what’s transpiring during the relevant \(\Delta t\). Like Leibniz, I maintain that what one will find there (i.e., in the relevant \(\Delta t\)) is efficient causation or causal interaction ([165], pp. 139–142).Footnote 201 That causation (call it fundamental causation or causationF) drives the engine of entropic or minus-H increase. It results in correlations (that’s why you can use correlations to find causal interactions), correlations that are one-sided precisely because causationF is asymmetric. That is to say, obtaining causalF relations in entropy producing collisions explain the HMC. The Chaos Asymmetry Problem (CAP) has been resolved. The propagating one-sided chaos referenced by the HMC is one-sided because the velocity correlations are the effects of temporally prior causes in temporally directed obtaining causalF relations. It is no surprise then that the Boltzmann equation breaks T-symmetry. It does this because the collisions it is about involve obtaining causalF relations that are temporally asymmetric.

The introduction of causationF into entropy increasing collisions during the relevant \(\Delta ts\) resolves the No Mathematics Problem (NMP) as well. There’s no mathematical representation of the HMC because its source is unrepresented by the formalism and because the correlations HMC references are set down during \(\Delta t\). Our best modeling of the collision process walks around those times since its chief concern is recovering post-collision velocities. We can nonetheless point to that best modeling as evidence of the existence of causationF in the \(\Delta ts\) because that modeling, while one step removed from the phenomena, nonetheless recognizes that forces and resulting accelerations obtain so as to get the velocity changes. Applying the time-reversal invariance operation will not change the directions of the forces (the causal structure) nor the directions of the resulting accelerations (though the displacement is reversed). Forces and accelerations are even forms of t.

8.2.2 Solving the Reversibility Paradox

To see the resolution of the reversibility paradox, recognize first that the time-reversal invariance operation in Hamiltonian mechanics is one that is applied to Hamilton's equations of motion (and appropriate deductive consequences thereof). Odd forms of t receive sign changes and solutions are still mapped to solutions. Execution of that operation together with the execution of an appropriate time-translation (so as to help us appreciate a temporally reversed evolution) will not entail that there exists an evolution that satisfies a temporally reversed HMC. This is because the HMC is not a part of the formulation of Hamiltonian mechanics, nor is it a deductive consequence of the equations of motion in Hamiltonian mechanics. Again, HMC is an interpretive hypothesis.

Suppose there's an elastic impact collision C between two molecules (1 and 2) of a monatomic gas system. Molecules 1 and 2 have velocities \({\mathbf{v}}_{1}\) and \({\mathbf{v}}_{2}\) (respectively) before C. After C, they take velocities \({\mathbf{u}}_{1}\) and \({\mathbf{u}}_{2}\) (respectively). \({\mathbf{v}}_{1}\) does not equal \({\mathbf{u}}_{1}\), and \({\mathbf{v}}_{2}\) does not equal \({\mathbf{u}}_{2}\), and I will assume that molecule 1's inertial mass is larger than molecule 2's inertial mass, but not significantly larger. Velocities \({\mathbf{v}}_{1}\) and \({\mathbf{v}}_{2}\) produce C. C produces post-collision velocities \({\mathbf{u}}_{1}\) and \({\mathbf{u}}_{2}\). The fact that the accelerations and force impressions in C are not reversed under time-reversal suggests that in the time-reversed evolution, the velocity transitions run from \(-{\mathbf{u}}_{1}\) and \(-{\mathbf{u}}_{2}\) to \(-{\mathbf{v}}_{1}\) and \(-{\mathbf{v}}_{2}\). So, in the reversed evolution, you won’t approach the Maxwell distribution precisely because the velocity transitions/changes go in the wrong direction. They go in the wrong direction because of the fundamental causal structure of the evolution. The (one-step-removed) evidence for this resides in the fact that in the reversed evolution, the forces are still pushing in the same directions as the actual world evolution, and the accelerations keep their actual world directions as well. Because the HMC is an interpretive postulate, the time-reversal operation alone will not change its one-sidedness either. Whatever is done with the HMC in the reversed evolution is done by hand. The causal structure of the world must be changed to realize reversed evolutions.Footnote 202

What of the classical possible world w at which monatomic gas systems of the right kind evolve in perfect accord with the models of Maxwell [174] and Boltzmann [21, 23]? At w, will there fail to be monotonic increase of \(-H\), if such gas systems begin their evolutions in low entropy states? At w, binary “collisions” never introduce problems of mathematical singularities because the constituent molecules never meet. My project seeks to causally interpret only those collisions quantified over by the HMC. My central thesis, CC in Sect. 1, made this clear. What I’m recommending is that we understand the HMC as an interpretive postulate about the types of collisions that transpire during the crucial \(\Delta ts\). As I’ve argued, both Maxwell and Boltzmann did not include specific reference to such impact collisions in their mathematical models because (on my interpretation) they were employing the conceptual strategy of physics avoidance (hence the NMP). They were able to discover the various distribution laws because the model walk-arounds “do the trick” (as Leibniz would say) of recovering the velocities of gas molecules after the collisions that are walked around.Footnote 203 They encounter a reversibility paradox precisely because the surface meaning of their modeling describes systems like those in w, systems whose dynamical evolutions are such that their time-reversal yields a past-directed evolution. In w, \(-H\) does not monotonically increase in accordance with the H-theorem. How could it? The evolutions there are completely time-reversible. Nonetheless, there is no violation of the H-theorem there because the HMC (a precondition of the theorem) fails to hold at w. The collisions the HMC quantifies over do not transpire there and so neither does entropic increase of the kind required by the H-theorem.
But as soon as we shift to the real world, where monatomic gases like helium (He), argon (Ar), xenon (Xe) and others, evolve in ways featuring real world impact collisions avoided by the Maxwell–Boltzmann modeling but targeted from afar by that modeling nonetheless (and so my project remains true to the spirit of Boltzmann’s work), the HMC becomes part of a true and correct (in the appropriate limit) interpretation of Hamiltonian mechanics being made true by the causal structure of the actual world. It is the contingent causal way the world is that determines the entropic asymmetry described by the H-theorem. It is a consequence of my framework that the proposed interpretation of Hamiltonian mechanics makes a detectable empirical difference. It is to that empirical difference that I now turn.

Is the HMC empirically justified? Yes. It is indirectly justified by all the fruit or empirical success produced by the H-theorem and Boltzmann equation in modern kinetic theory. For example, it should be obvious by now that the H-theorem predicts that if a classical monatomic gas system SYS satisfies certain conditions, then SYS will evolve to thermal equilibrium over a sufficiently long period of time. That is in fact what we observe. More generally, the H-theorem predicts the truth of the second law of thermodynamics for systems that satisfy the antecedent of the theorem. Consequently, in a restricted sense, the theorem “demonstrates the second law of thermodynamics”.Footnote 204 Nature’s obedience to the second law is what we observe. In addition, I have already indicated how the Boltzmann equation is used to great benefit in the study of neutron transport, plasma physics, and the kinetic theory of gases (q.v., Sect. 4; and see [63, 85, 239]). What is more, the H-theorem and Boltzmann equation bear much fruit in hydrodynamics as well [217]. The empirical successes of the Maxwell and Maxwell–Boltzmann distributions discussed in the sources at note 43 are also relevant indirect justifications of the HMC. Why believe the above constitutes indirect evidence for the HMC? The HMC “is a fundamental requirement for the application of the Boltzmann kinetic theory, the Boltzmann transport equation, and the presence of Maxwell–Boltzmann statistics”.Footnote 205

There is a recent and more direct justification as well. I call this other justification “more direct” and not “direct” tout court because we are not currently able to directly observe the correlated velocities of gas molecules. However, there is a class of granular media that are low-density media which approximate gas systems (they are called “granular gases” in light of this). G.W. Baxter and J.S. Olafsen gave attention to such systems in 2007. They discovered that these low-density granular systems exhibit molecular chaos, but that once the systems become sufficiently dense (i.e., once there are sufficiently many interactions (this is my gloss)), the velocities of the constituents of the relevant systems become correlated.Footnote 206

9 Conclusion

I have shown that the Standard Story is historically inaccurate. Once Boltzmann discovered the H-theorem it remained front and center in his mind. He always believed that some systems did not experience minus-H increase and he was in possession of reasons for delimiting the second law to a statistical claim well before the publication of Loschmidt’s reversibility objection in 1876. But even after wrestling with that objection, Boltzmann always remained pragmatically committed to the project of mechanically justifying the second law. It is therefore in a truly Boltzmannian spirit that I have tried to resolve the reversibility paradox in a way that remains true to mechanical natural philosophy.

There remains at least one puzzle to solve. How ought the probabilities in the proposed causal Boltzmannian approach to be interpreted? I’ve shown that both Maxwell and Boltzmann favored (at least at one time) epistemic interpretations of the involved probabilities, and I believe that is the best option in this context. Of course, a lot more needs to be said about these epistemic probabilities, but I hope to articulate my opinions about the matter in a part two essay that uses the framework of this project to tackle the famous recurrence objections.