
1 Introduction

The ergodic hypothesis was formulated by [1, 2] and famously discussed by [3] in their influential encyclopedia article on statistical mechanics, where they survey and comment on Boltzmann’s work in statistical physics.

Ever since, the ergodic hypothesis has been controversial. The controversy concerns not only its status within Boltzmann’s work (see, e.g., [4]), but more generally its applicability to realistic systems (see, e.g., [5, 6]) and its relevance for physics as such (see, e.g., [7,8,9]).

Despite its debatable status, the concept of ergodicity has attracted a lot of attention. Today there even exists a proper branch of mathematics, so-called ergodic theory, with a plentitude of rigorous mathematical results (most notably, the results of [10,11,12]; see [13] for an overview).

Interestingly enough, though, Boltzmann himself never highlighted the ergodic hypothesis. Although he introduces it in his early work, he does not mention it even once in his two volumes on gas theory, which constitute his magnum opus on statistical mechanics (cf. [14]). Still, he seems to rely on ergodicity, at least as an idealization, in his later work as well, for instance when he estimates the rate of fluctuations in the letter to Zermelo (cf. [15]).

That said, was ergodicity a fundamental assumption of Boltzmann’s, as the Ehrenfests suggest? If so, why didn’t he make this more explicit? This is all the more surprising as he does emphasize the explanatory value of other concepts. For instance, he stresses at several points throughout his work that equilibrium is a typical state, i.e., a state which is realized by an overwhelming number of micro configurations (see, e.g., [14,15,16]).

In this paper, I argue that ergodicity, as an idealization, or essential ergodicity, in the strict sense (as defined in Sect. 3.3 below), is a consequence rather than an assumption of Boltzmann’s approach. Based on this, I claim that the ergodic hypothesis should be read as a typicality statement, in a way analogous to how Boltzmann taught us to read the H-theorem (see [15, 16]). That is, just as a dynamical system of many particles doesn’t approach equilibrium for all, but for typical initial conditions (given a low-entropy initial macrostate) and stays there not for all, but for most times, in the case of ergodicity, not all, but typical systems behave not strictly, but essentially, that is qualitatively, as if they were ergodic.

To make this point precise, what can be shown is the following: On typical trajectories, the time and phase space averages of physical macrostates coincide in good approximation. This property of the dynamics, which I call ‘essential ergodicity’, follows from the stationarity of the measure and the typicality of the equilibrium state alone.

2 The Ergodic Hypothesis

To discuss the ergodic hypothesis, we need to introduce the realm of Boltzmann’s statistical mechanics: the theory of measure-preserving dynamical systems.

2.1 Measure-Preserving Dynamical Systems

Let (\(\Gamma , \mathcal {B}(\Gamma ), T, \mu \)) denote a Hamiltonian system. For N particles, \(\Gamma \cong \mathbb {R}^{6N}\) is called phase space. It is the space of all possible microstates X of the system, where a point \(X=(q,p)\) in \(\Gamma \) represents the positions and momenta of all the particles: \((q,p)=({q}_1, ..., {q}_{3N}, {p}_1, ..., {p}_{3N})\).

The Hamiltonian flow T is a one-parameter flow \(T^t(q,p)=(q, p)(t)\) on \(\Gamma \) with t representing time. It is connected to the Hamiltonian vector field \(v_H\) as follows: \( v_H(T^t(q,p)) = {dT^t(q,p)}/{dt}.\) In other words, the flow lines are the integral curves along the Hamiltonian vector field, where the latter is specified by \(v_H = (\partial H/\partial p, -\partial H/\partial q)\). This is the physical vector field of the system, generated by the Hamiltonian H, and the flow lines represent the possible trajectories of the system. Finally, \(\mu \) refers to the Liouville measure,

$$\begin{aligned} d\mu =\prod _{i=1}^{3N} dq_i dp_i,\end{aligned}$$
(1)

or to any other stationary measure derived thereof.

Note that we call a measure \(\mu \) stationary (with respect to T) if and only if the flow T is measure-preserving (with respect to \(\mu \)). Given a Hamiltonian system, it follows from Liouville’s theorem that the Liouville measure is conserved under the Hamiltonian phase flow. That is, for every \(A\in \mathcal {B}(\Gamma )\),

$$\begin{aligned} \mu (T^{-t}A)=\mu (A).\end{aligned}$$
(2)

Since the Liouville measure is just the 6N-dimensional Lebesgue measure, this says that phase space volume is conserved under time evolution.
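As a minimal numerical illustration of Eq. 2, consider the simplest Hamiltonian system, a one-dimensional harmonic oscillator with unit mass and frequency (an assumption made purely for this sketch; the function names are hypothetical). Its flow is a rotation of the (q, p) plane, so the area of any polygon of initial conditions should be unchanged under time evolution. Since the flow is invertible, checking the image of a set is equivalent to checking its preimage.

```python
import math

def flow(q, p, t):
    """Exact Hamiltonian flow of H = (p**2 + q**2) / 2 (unit mass and frequency)."""
    c, s = math.cos(t), math.sin(t)
    return (q * c + p * s, -q * s + p * c)

def polygon_area(points):
    """Shoelace formula for the area of a simple polygon in the (q, p) plane."""
    area = 0.0
    n = len(points)
    for i in range(n):
        q1, p1 = points[i]
        q2, p2 = points[(i + 1) % n]
        area += q1 * p2 - q2 * p1
    return abs(area) / 2.0

# A unit square of initial conditions and its image after time t = 0.7.
square = [(1.0, 0.0), (2.0, 0.0), (2.0, 1.0), (1.0, 1.0)]
evolved = [flow(q, p, 0.7) for q, p in square]

area_before = polygon_area(square)
area_after = polygon_area(evolved)
print(area_before, area_after)  # the two areas agree up to rounding error
```

Because this particular flow is linear, the evolved polygon is again a polygon and the comparison is exact up to floating-point rounding; for a nonlinear Hamiltonian one would have to approximate the evolved region.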

If we introduce the notion of the time-evolved measure, \(\mu _t (A) := \mu (T^{-t}A)\), we can reformulate the condition of stationarity as follows. A measure \(\mu \) is stationary if and only if, for every \(A\in \mathcal {B}(\Gamma )\),

$$\begin{aligned} \mu _t(A) = \mu (A).\end{aligned}$$
(3)

According to this equation, the measure itself is invariant under time translation, which is the main reason for physicists to accept it as the measure grounding a statistical analysis in physics (see, e.g., [3, 17, 18]). In practice, we are not concerned with the Liouville measure per se, but with appropriate stationary measures derived thereof (Footnote 1).

2.2 Variants of the Ergodic Hypothesis

Within the framework of Hamiltonian systems or, more generally, measure-preserving dynamical systems, we can analyze Boltzmann’s ergodic hypothesis.

Let again \((\Gamma , \mathcal {B}(\Gamma ), T, \mu )\) be a measure-preserving dynamical system and \(A\in \mathcal {B}(\Gamma )\). Let, in what follows, \(\mu (\Gamma )=1\) (Footnote 2). We call

$$\begin{aligned} \mu (A) = \int _\Gamma \chi _A(x) d\mu (x)\end{aligned}$$
(4)

the ‘phase space average’ of A with \(\chi _A\) being the characteristic function which is 1 if \(x\in A\) and 0 otherwise. Further we call

$$\begin{aligned} \hat{A}(x)= \lim _{\mathcal {T}\rightarrow \infty } \frac{1}{\mathcal {T}} \int _{0}^{\mathcal {T}} \chi _{A}(T^tx)dt\end{aligned}$$
(5)

the ‘time average’ of A for some \(x\in \Gamma \). Here it has been proven by [10] that the infinite-time limit exists pointwise almost everywhere on \(\Gamma \) and the limit function \(\hat{A}(x)\) is integrable.

A dynamical system is called ergodic if and only if, for all \(A\in \mathcal {B}(\Gamma )\) and almost all \(x\in \Gamma \) (i.e. for all x except a measure-zero set), the time and phase averages coincide:

$$\begin{aligned} \mu (A) =\hat{A}(x).\end{aligned}$$
(6)

In other words, a system is called ergodic if and only if, for almost all solutions, the fraction of time the system spends in a certain region in phase space (in the limit \(t\rightarrow \infty \)!) is precisely equal to the phase space average of that region.
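The coincidence of time and phase space averages can be observed numerically in a standard textbook example of an ergodic system, the rotation of the unit circle by an irrational angle. The sketch below is a discrete-time analogue of Eq. 5 (a sum over iterates replaces the integral over t); the map, the angle, and the set A = [0.2, 0.5) are illustrative assumptions and have nothing to do with a Hamiltonian gas.

```python
import math

ALPHA = (math.sqrt(5.0) - 1.0) / 2.0  # irrational rotation angle (illustrative choice)

def time_average(x0, a, b, steps):
    """Fraction of the first `steps` orbit points of x -> x + ALPHA (mod 1) lying in [a, b)."""
    hits = 0
    for n in range(steps):
        x = (x0 + n * ALPHA) % 1.0
        if a <= x < b:
            hits += 1
    return hits / steps

# Phase space average: Lebesgue measure of A = [0.2, 0.5).
phase_average = 0.5 - 0.2

# Finite-time averages along three different trajectories.
averages = [time_average(x0, 0.2, 0.5, 100_000) for x0 in (0.0, 0.123, 0.77)]
print(phase_average, averages)
```

For each starting point, the finite-time average comes out close to the measure of A, as Eq. 6 leads one to expect for an ergodic system in the infinite-time limit.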

Historically, the ergodic hypothesis has been formulated differently. In its original version due to [1] (cited by [3]), it refers to the assertion that a trajectory literally has to go through every point in phase space (more precisely, in the constant-energy hypersurface). But this would imply that there is only one solution with all possible microstates belonging to one and the same solution. This has been proven impossible by [21, 22].

In a weaker formulation, the so-called ‘quasi-ergodic hypothesis’ demands that a trajectory has to come arbitrarily close to every point in phase space (see [3]). Later, the results of [10, 11] established the precise conditions under which equality of the time and phase space average is obtained (Footnote 3).

For realistic physical systems, this equality of the time and phase space average—that is, ergodicity—turned out to be extremely hard to prove, if it can be proven at all (it took almost 50 years to extend the proof of [23] which held for the model of one billiard ball on a 2-dimensional table to the full model of \(N\ge 2\) hard spheres in a container with periodic boundary conditions, i.e. a torus, of dimension \(d\ge 2\); see [24]).

At this point, the question arises: What if we were not interested in the exact coincidence of the time and phase space average in the first place? What if all we need is an approximate equality of the time and phase space average on typical trajectories? The point I want to make is the following: Boltzmann, being concerned with the analysis of realistic physical systems, need not be and presumably was not interested in ergodicity in the strict sense. According to [3], Boltzmann used ergodicity to estimate the fraction of time a system spends in a certain macrostate. To obtain such an estimate, however, it suffices to establish a result qualitatively comparable to ergodicity: an almost equality of the time and phase space average of physical macrostates on typical trajectories. This is precisely where the notion of essential ergodicity comes into play.

3 Essential Ergodicity

We need one last ingredient to grasp the notion of essential ergodicity and that is the notion of typicality of macro- and microstates. We will then find that, given a stationary measure and a typical macrostate, that is, an equilibrium state in Boltzmann’s sense, a typical system behaves essentially as if it were ergodic.

3.1 Typicality and Boltzmann’s Notion of Equilibrium

Given a measure on the space of possible states of the system (like a volume measure on phase space), this is naturally a measure of probability or typicality (Footnote 4). Let again \(\mu \) denote the volume measure on \(\Gamma \). We call a measurable set \(A\subset \Gamma \) ‘typical’ (with respect to \(\Gamma \)) if and only if

$$\begin{aligned} \mu (A) = 1 -\varepsilon \end{aligned}$$
(7)

for \(0<\varepsilon <<1\). This definition of ‘typical sets’ directly entails a definition of ‘typical points’ (cf. [28]). We say that a point x is ‘typical’ (with respect to \(\Gamma \)) if and only if \(x\in A\) and A is typical with respect to \(\Gamma \).

In Boltzmann’s statistical mechanics, we are concerned with ‘points’ (microstates) and ‘sets’ (macro-regions). Macro-regions are regions of phase space corresponding to physical macrostates of the system. More precisely, every microstate X, represented by a point \((q,p)\) in \(\Gamma \), belongs to, and thereby determines, a certain macrostate M(X), represented by an entire region \(\Gamma _M\subset \Gamma \), the set of all microstates realizing that particular macrostate. While a microstate comprises the exact positions and momenta of all the particles, \(X=({q}_1, ..., {q}_N, {p}_1, ..., {p}_N)\), a macrostate M(X) is specified by the macroscopic, thermodynamic variables of the system, like volume V, temperature T, and so on. By definition, any two macrostates \(M_i\) and \(M_j\) are macroscopically distinct, hence there are only finitely many macrostates \(M_i\), and all macrostates together provide a partition of phase space into disjoint ‘macro regions’ \(\Gamma _{M_i}\) with \(\Gamma =\bigcup _{i=1}^n \Gamma _{M_i}\). Here it is a consequence of the large number of particles that every macrostate M(X) is realized by a huge number of microstates X and, hence, the precise way of partitioning doesn’t matter.

In this set-up, Boltzmann defined ‘equilibrium’ precisely as the typical macrostate of the system.

Definition 1

(Boltzmann equilibrium) Let \((\Gamma , \mathcal {B}(\Gamma ), T, \mu )\) be a dynamical system. Let \(\Gamma \) be partitioned into finitely many disjoint, measurable subsets \(\Gamma _{M_i}, i=1, \ldots , n\) by some (set of) physical macrovariable(s) \(M_i\), i.e., \(\Gamma =\bigcup _{i=1}^n \Gamma _{M_i}\). Then a set \(\Gamma _{Eq}\in \{\Gamma _{M_1}, ..., \Gamma _{M_n}\}\) with phase space average

$$\begin{aligned} \mu (\Gamma _{Eq})=1-\varepsilon \end{aligned}$$
(8)

where \(\varepsilon \in \mathbb {R}\), \( 0<\varepsilon <<1\), is called the ‘equilibrium set’ or ‘equilibrium region’. The corresponding macrostate \(M_{Eq}\) is called the ‘Boltzmann equilibrium’ of the system.

Be aware that this definition is grounded on a particular, physical macro partition of phase space. In other words, it is not an arbitrary value of \(\varepsilon \) which, when given, determines an equilibrium state—such a definition would be meaningless from the point of physics. Instead, it is a partition determined by the physical macrovariables of the theory, which is given, and it is with respect to that partition that a region of overwhelming phase space measure, if it exists, defines an equilibrium state in Boltzmann’s sense (and by the way determines the value of \(\varepsilon \)).

It was Boltzmann’s crucial insight that, for a realistic physical system of \(N \approx 10^{24}\) particles (where, for a medium-sized object, we take Avogadro’s constant) and a partition into macroscopically distinct states, there always exists a region of overwhelming phase space measure (see, e.g., [15]) (Footnote 5). This follows essentially from the vast gap between micro and macro description of the system and the fact that, for a large number of particles, small differences at the macroscopic level translate into huge differences in the corresponding phase space volumes.

To obtain an idea of the numbers, consider a gas in a medium-sized box. For that model, [30, 31] estimates the volume of all non-equilibrium regions together as compared to the equilibrium region to be:

$$\begin{aligned} \frac{\mu (\bigcup _{i=1}^n \Gamma _{M_i}\setminus \Gamma _{{Eq}})}{\mu (\Gamma _{{Eq}})}= \frac{\mu (\Gamma \setminus \Gamma _{{Eq}})}{\mu (\Gamma _{{Eq}})}\approx 1: 10^N\end{aligned}$$
(9)

with \(N\approx 10^{24}\). This implies, with \(\mu (\Gamma _{Eq})\approx \mu (\Gamma )\), that \(\varepsilon \) is of the order \(1:10^N\approx 1:10^{10^{24}}=\frac{1}{10^{1000000000000000000000000}}.\)

Both Boltzmann’s realization that equilibrium is a typical state and his understanding that any two distinct macrostates relate to macro-regions that differ vastly in size provided the grounds for his explanation of irreversible behaviour (cf. [14, 15]; see [9, 32,33,34] for further elaboration of this point). In the following, however, we are only concerned with ergodicity and, related to that, a system’s long-time behaviour.

3.2 Precise Bounds on the Time and Phase Space Average of the Equilibrium State

In what follows, we give precise bounds on the time average of the equilibrium state. To this end, consider a dynamical system with a stationary measure \(\mu \) and an equilibrium state \(\Gamma _{Eq}\) in the sense of Boltzmann, that is, \(\mu (\Gamma _{Eq})=1-\varepsilon \).

To be able to formulate the bound on the time average and, later, the notion of ‘essential ergodicity’, we have to distinguish between a ‘good’ set G and a ‘bad’ set B of points \(x \in \Gamma \). Let, in what follows, B be the ‘bad’ set of points for which the time average of equilibrium \(\hat{\Gamma }_{Eq}(x)\) is smaller than \(1-k\varepsilon \) (with \(1\le k\le 1/\varepsilon \)). All points in this set determine trajectories which spend a fraction of less than \(1-k\varepsilon \) of their time in equilibrium. Let further G be the ‘good’ set of points with a time average \(\hat{\Gamma }_{Eq}(x)\) of at least \(1-k\varepsilon \). All points in this set determine trajectories that spend a fraction of at least \(1-k\varepsilon \) of their time in equilibrium. To be precise,

$$\begin{aligned} B:=\{x\in \Gamma |\hat{\Gamma }_{Eq}(x)<1-k\varepsilon \}, \quad G:=\{x\in \Gamma |\hat{\Gamma }_{Eq}(x)\ge 1-k\varepsilon \}.\end{aligned}$$
(10)

While, for a realistic physical system, ergodicity is hard to prove (if it can be proven at all), essential ergodicity is not. In fact, it follows almost directly from the stationarity of the measure and the typicality of the equilibrium state. To be precise, with respect to the two sets B and G the following can be shown. For all \(\varepsilon , k\in \mathbb {R}\) with \(0<\varepsilon <<1\) and \(1\le k\le 1/\varepsilon \):

$$\begin{aligned} \mu (B) <1/k, \quad\mu (G)> 1- 1/k.\end{aligned}$$
(11)

The proof can be found in [35]. An essential ingredient entering the proof is the pointwise existence and integrability of the time average (cf. [10]). Hence, in the case of non-ergodic systems, the time average of equilibrium need not attain a fixed value on (almost all of) \(\Gamma \); in fact, it may have different values on different trajectories. Still, it exists (pointwise almost everywhere), and this suffices to estimate the size of the set of trajectories with a time average smaller (or larger) than a particular value.
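To see why a bound of the form of Eq. 11 is plausible, the following sketch uses Markov’s inequality. It is an illustration under the assumptions stated in this section and need not coincide with the proof given in [35]. Stationarity of \(\mu \), together with the integrability of the time average, implies that the phase space integral of \(\hat{\Gamma }_{Eq}\) equals \(\mu (\Gamma _{Eq})=1-\varepsilon \), so that

$$\begin{aligned} \int _\Gamma \big (1-\hat{\Gamma }_{Eq}(x)\big )\, d\mu (x) = 1-\mu (\Gamma _{Eq}) = \varepsilon . \end{aligned}$$

On B, the integrand exceeds \(k\varepsilon \) by the definition in Eq. 10. Hence, if \(\mu (B)>0\),

$$\begin{aligned} k\varepsilon \, \mu (B) < \int _B \big (1-\hat{\Gamma }_{Eq}(x)\big )\, d\mu (x) \le \int _\Gamma \big (1-\hat{\Gamma }_{Eq}(x)\big )\, d\mu (x) = \varepsilon , \end{aligned}$$

which gives \(\mu (B)<1/k\) (the case \(\mu (B)=0\) is trivial). Since B and G together cover \(\Gamma \) up to the measure-zero set on which the time average does not exist, \(\mu (G)=1-\mu (B)>1-1/k\).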

To grasp the full meaning of Eq. 11, consider a physically relevant value of k. Recall that, for a medium-sized macroscopic object, \(\varepsilon \) is tiny: \(\varepsilon \approx 10^{-N}\) with \(N\approx 10^{24}\). In that case, one can choose k within the given bounds (\(1\le k\le 1/\varepsilon \)) large enough for \(\mu (B)\) to be close to zero and \(\mu (G)\) to be close to one. Consider, for example,

$$\begin{aligned} k=1/\sqrt{\varepsilon }.\end{aligned}$$
(12)

In that case, we distinguish between the ‘good’ set G of trajectories which spend at least \(1-\sqrt{\varepsilon }\) of their time in equilibrium and the ‘bad’ set B of trajectories which spend less than \(1-\sqrt{\varepsilon }\) of their time in equilibrium. And we obtain:

$$\begin{aligned} \mu (B)<\sqrt{\varepsilon }, \mu (G)>1-\sqrt{\varepsilon }.\end{aligned}$$
(13)

Given the value of \(\varepsilon \) from above, \(\varepsilon \approx 10^{-10^{24}}\), it follows that \(\sqrt{\varepsilon }\approx 10^{-5\cdot 10^{23}}\), which is smaller still than \(10^{-10^{23}}\); we use the latter as a convenient round bound. Consequently, the equilibrium region is of measure \(\mu (\Gamma _{Eq})\approx 1-10^{-10^{24}}\) and the measures of the sets B and G are

$$\begin{aligned} \mu (B)<10^{-10^{23}}, \mu (G)>1-10^{-10^{23}}.\end{aligned}$$
(14)

Note that B is now the set of trajectories which spend less than \(1-10^{-10^{23}}\) and G the set of trajectories which spend at least \(1-10^{-10^{23}}\) (!) of their time in equilibrium. We thus find that trajectories which spend almost all of their time in equilibrium are typical whereas trajectories which spend less than almost all of their time in equilibrium are atypical!
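Numbers such as \(10^{-10^{24}}\) are far too small for floating-point arithmetic, but the estimate can be checked by working with base-10 exponents only. The short sketch below (variable names are hypothetical) does this with exact integers. It shows that the exact exponent of \(\sqrt{\varepsilon }\) is \(-5\cdot 10^{23}\), so the round bound \(10^{-10^{23}}\) used in Eq. 14 holds with room to spare.

```python
# All quantities are stored as base-10 exponents, since 10**(-10**24) underflows any float.
log10_eps = -(10 ** 24)            # eps ~ 10^(-10^24), the value quoted in the text
log10_k = -log10_eps // 2          # k = 1/sqrt(eps), Eq. 12
log10_bound_B = -log10_k           # Eq. 11: mu(B) < 1/k

# Admissibility of k: 1 <= k <= 1/eps, expressed in exponents.
k_within_bounds = 0 <= log10_k <= -log10_eps

# The bound 1/k is at least as strong as the round value 10^(-10^23).
round_bound_holds = log10_bound_B <= -(10 ** 23)

print(log10_k, k_within_bounds, round_bound_holds)
```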

The converse statement has been proven as well ([36, 37]; see [35] for a different proof). It says that if there exists a region \(\Gamma _{Eq'}\subset \Gamma \) in which by far most trajectories spend by far most of their time, then this region has very large phase space measure. To be precise, if there exists a region \(G'\) with \(\mu (G')=1-\delta \) such that \(\forall x\in G'\): \(\hat{\Gamma }_{Eq'}(x)\ge 1-\varepsilon '\), then the following holds:

$$\begin{aligned} \mu (\Gamma _{Eq'})\ge (1-\varepsilon ')(1-\delta ).\end{aligned}$$
(15)

Here we are again interested in those cases where \(\delta \) and \(\varepsilon '\) are very small, \(0<\delta <<1\) and \(0<\varepsilon '<<1\) (while the result holds for other values of \(\delta \) and \(\varepsilon '\) as well).

This converse result tells us that, if there exists a state in which a typical trajectory spends by far most of its time, then this state is of overwhelming phase space measure.
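The reasoning behind Eq. 15 can be sketched in one line. This is again an illustration that need not match the proofs in [35, 36, 37]; it uses only that the time average is non-negative and that, by stationarity, its phase space integral equals the measure of the region:

$$\begin{aligned} \mu (\Gamma _{Eq'}) = \int _\Gamma \hat{\Gamma }_{Eq'}(x)\, d\mu (x) \ge \int _{G'} \hat{\Gamma }_{Eq'}(x)\, d\mu (x) \ge (1-\varepsilon ')\,\mu (G') = (1-\varepsilon ')(1-\delta ). \end{aligned}$$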

Why is this converse statement interesting? It doesn’t start from Boltzmann’s notion of equilibrium. Instead, it starts from a thermodynamic or thermodynamic-like notion of equilibrium.

According to a standard thermodynamics textbook (like, e.g., [38] or [39]), a thermodynamic equilibrium is a state in which a system, once it is in that state, stays for all times. In what follows, we give a definition which relaxes that standard definition a little bit in that it allows for rare fluctuations out of equilibrium and for some atypical trajectories (all \(x \notin G'\)) that don’t behave thermodynamic-like (Footnote 6).

Definition 2

(Thermodynamic equilibrium) Let \((\Gamma , \mathcal {B}(\Gamma ), T, \mu )\) be a dynamical system. Let \(\Gamma \) be partitioned into finitely many disjoint, measurable subsets \(\Gamma _{M_i} (i=1,...,n)\) by some (set of) physical macrovariable(s) \(M_i\), i.e., \(\Gamma =\bigcup _{i=1}^n \Gamma _{M_i}\). Let \(G'\subset \Gamma \) with \(\mu (G')=1-\delta \) and \(0<\delta <<1\). Let \(0<\varepsilon ' << 1\). A set \(\Gamma _{Eq'}\in \{\Gamma _{M_1}, ..., \Gamma _{M_n}\}\) (connected to a macrostate \(M_{Eq'}\)) with time average

$$\begin{aligned} \hat{\Gamma }_{Eq'}(x) \ge 1-\varepsilon '\end{aligned}$$
(16)

for all \(x\in G'\) is called a ‘thermodynamic equilibrium’.

To summarize, we obtain that, for every dynamical system with a stationary measure and a state of overwhelming phase space measure, almost all trajectories spend almost all of their time in that state, and the other way round, given a state in which almost all trajectories spend almost all of their time, that state is of overwhelming phase space measure. Hence, an equilibrium state in Boltzmann’s sense is a thermodynamic equilibrium and the other way round! (Footnote 7)

The only two assumptions which enter the proofs in [35] are:

(a) that the measure is stationary (resp. the dynamics is measure-preserving), i.e., \(\mu _t(A)=\mu (A)\) for all \(A\in \mathcal {B}(\Gamma )\), and

(b) that there is a macrostate of overwhelming phase space measure, i.e., a Boltzmann equilibrium \(\Gamma _{Eq}\) with \(\mu (\Gamma _{Eq})=1-\varepsilon \),

or, for the reverse direction, (a) and

(c) that there is a state in which typical trajectories spend by far most of their time, i.e., a thermodynamic equilibrium \(\Gamma _{Eq'}\) with \(\hat{\Gamma }_{Eq'}\ge 1-\varepsilon '\).

Ergodicity doesn’t enter the proofs, nor do we get ergodicity out of it. However, we get something similar to ergodicity, what we call ‘essential ergodicity’.
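That ergodicity is not needed can be checked on a deliberately non-ergodic toy system. In the sketch below, phase space is replaced by 1000 discrete states with the uniform counting measure, and the dynamics is a permutation consisting of 100 disjoint 10-cycles, which preserves the measure but never mixes the cycles. All names, sizes, and the choice of non-equilibrium states are hypothetical and serve only to illustrate Eq. 11; exact rational arithmetic avoids rounding issues at the threshold.

```python
from fractions import Fraction

N_STATES = 1000
CYCLE = 10

def step(i):
    """Permutation of the states made of 100 disjoint 10-cycles (non-ergodic on purpose)."""
    base = (i // CYCLE) * CYCLE
    return base + (i - base + 1) % CYCLE

# Hypothetical non-equilibrium microstates; every other state counts as equilibrium.
NON_EQ = {0, 1, 2, 3, 4, 10, 11, 500, 777, 999}
eps = Fraction(len(NON_EQ), N_STATES)  # mu(Gamma_Eq) = 1 - eps

def time_average(i):
    """Exact long-time fraction spent in equilibrium: the average over the periodic orbit of i."""
    hits, j = 0, i
    for _ in range(CYCLE):
        hits += (j not in NON_EQ)
        j = step(j)
    return Fraction(hits, CYCLE)

averages = [time_average(i) for i in range(N_STATES)]

def measure_bad(k):
    """Uniform measure of B = {x : time average < 1 - k*eps}, cf. Eq. 10."""
    return Fraction(sum(a < 1 - k * eps for a in averages), N_STATES)

results = {k: measure_bad(k) for k in (2, 10, 20, 50)}
for k, mu_b in results.items():
    print(k, mu_b, mu_b < Fraction(1, k))
```

The time average takes different values on different cycles, so the system is not ergodic, yet the measure of the ‘bad’ set stays below 1/k for every tested k, and the mean of the time averages equals the measure of the equilibrium set.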

3.3 Essential Ergodicity

While, for an ergodic system, the time and phase space averages exactly coincide for all but a measure-zero set of solutions, for an essentially ergodic system, the time and phase space averages approximately coincide on typical solutions. To be precise, the following definition applies.

Definition 3

(Essential ergodicity) Let \((\Gamma , \mathcal {B}(\Gamma ), T, \mu )\) be a dynamical system. Let \(\Gamma \) be partitioned into finitely many disjoint, measurable subsets \(\Gamma _{M_i} (i=1,...,n)\) by some (set of) physical macrovariable(s) \(M_i\), i.e., \(\Gamma =\bigcup _{i=1}^n \Gamma _{M_i}\). Let \(0<\varepsilon <<1\). A system is called ‘essentially ergodic’ if and only if

$$\begin{aligned} |\hat{\Gamma }_{M_i}(x)-\mu (\Gamma _{M_i})|\le \varepsilon \end{aligned}$$
(17)

\(\forall i=1, ..., n\) and \(\forall x\in G\) with \(\mu (G)\ge 1-\delta \), \(0< \delta <<1\).

For a measure-preserving system with an equilibrium state (a Boltzmann or thermodynamic equilibrium), Eq. 17 follows in a straightforward way from the two definitions of equilibrium given in Eqs. 8 and 16 and the corresponding results on the time and phase space average, Eqs. 11 and 15, respectively (Footnote 8).

Theorem 1

(FAPP ergodic hypothesis) Let \((\Gamma , \mathcal {B}(\Gamma ), T, \mu )\) be a measure-preserving dynamical system. Let there be an equilibrium state \(M_{Eq}\) (a Boltzmann or thermodynamic equilibrium) with corresponding equilibrium region \(\Gamma _{Eq}\subset \Gamma \).

Then the system is essentially ergodic. In particular, there exists an \(\varepsilon \in \mathbb {R}\) with \(0<\varepsilon <<1\) such that

$$\begin{aligned} |\hat{\Gamma }_{Eq}(x)-\mu (\Gamma _{Eq})|\le \varepsilon \end{aligned}$$
(18)

\(\forall x\in G\) with \(\mu (G)\ge 1-\delta \), \(0< \delta <<1\).

Proof

We only prove Eq. 18. From that, Eq. 17 follows directly. Let \(0<\delta ', \varepsilon ', \varepsilon ''<<1\). For the first direction of proof, consider a thermodynamic equilibrium, i.e., \(\hat{\Gamma }_{Eq}(x) \ge 1-\varepsilon '\) for all \(x\in G'\) with \(\mu (G')=1-\delta '\). It follows from Eq. 15 that \(\mu (\Gamma _{Eq})\ge (1-\varepsilon ')(1-\delta ')\) and, hence,

$$\begin{aligned} |\hat{\Gamma }_{Eq}(x)-\mu (\Gamma _{Eq})|\le \varepsilon ' + \delta ' - \varepsilon ' \delta '. \end{aligned}$$

Now set \(G=G'\), \(\delta = \delta '\) and \(\varepsilon =\varepsilon ' + \delta ' - \varepsilon ' \delta '\).

For the other direction, consider a Boltzmann equilibrium, i.e., \(\mu ({\Gamma }_{Eq}) = 1-\varepsilon ''\). It follows from Eq. 11 that \(\mu (G'')>1-\sqrt{\varepsilon ''}\) with \(G''=\{x\in \Gamma |\hat{\Gamma }_{Eq}(x)\ge 1-\sqrt{\varepsilon ''}\}\). Hence, for all \(x\in G''\), both \(\hat{\Gamma }_{Eq}(x)\) and \(\mu (\Gamma _{Eq})\) lie in the interval \([1-\sqrt{\varepsilon ''}, 1]\), so that

$$\begin{aligned} |\hat{\Gamma }_{Eq}(x)-\mu (\Gamma _{Eq})|\le \sqrt{\varepsilon ''}. \end{aligned}$$

Now set \(G=G''\), \(\delta = \sqrt{\varepsilon ''}\) and \(\varepsilon =\sqrt{\varepsilon ''}\).

3.4 Scope and Limits of (Essential) Ergodicity

Although the notion of essential ergodicity is weaker than the notion of ergodicity, it predicts qualitatively the same long-time behaviour. In particular, it tells us that a typical trajectory spends by far most of its time in equilibrium, where equilibrium is defined in Boltzmann’s way in terms of the phase space average, and it makes this notion of ‘by far most’ mathematically precise (Footnote 9). This justifies, in a rigorous way, Boltzmann’s assumption of ergodicity as an idealization or FAPP truth in analyzing the system’s long-time behaviour (as done, e.g., in his estimate of the fluctuation rate [15]). In other words, based on Boltzmann’s account, the ergodic hypothesis is well-justified. It is a good working hypothesis for those time scales on which it begins to matter that trajectories wind around all of phase space.

Let us, at this point, use the above result on essential ergodicity to estimate the rate of fluctuations out of equilibrium. Recall that, according to Eq. 14, typical trajectories spend at least \(1-10^{-10^{23}}\) of their time in equilibrium, when equilibrium is of measure \(\mu (\Gamma _{Eq})=1-10^{-10^{24}}\) (which is a reasonable value for a medium-sized object). In other words, they spend a fraction of less than \(10^{-10^{23}}\) of their time out of equilibrium, that is, in a fluctuation. If we assume that fluctuations happen randomly, in accordance with a trajectory wandering around phase space erratically, we obtain the following estimate for typical trajectories: a fluctuation of 1 second occurs about every \(10^{10^{23}}\) seconds. But this means that a typical medium-sized system spends trillions of years in equilibrium as compared to one second in non-equilibrium, a time larger than the age of the universe! (Footnote 10)
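To put the waiting time of \(10^{10^{23}}\) seconds into perspective, one can convert it to years and compare it with the age of the universe, again working with base-10 exponents. The constants below (a Julian year of 365.25 days and an age of the universe of roughly \(1.38\cdot 10^{10}\) years) are assumptions made for this rough sketch.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600   # Julian year, an assumed convention
AGE_OF_UNIVERSE_YEARS = 1.38e10         # approximate value, assumed for illustration

# One 1-second fluctuation per 10^(10^23) seconds, stored as a base-10 exponent.
log10_wait_seconds = 10 ** 23

# Converting units only subtracts a small number from the exponent.
log10_wait_years = log10_wait_seconds - math.log10(SECONDS_PER_YEAR)
log10_ratio_to_universe = log10_wait_years - math.log10(AGE_OF_UNIVERSE_YEARS)

print(math.log10(SECONDS_PER_YEAR), math.log10(AGE_OF_UNIVERSE_YEARS))
print(log10_ratio_to_universe)
```

The unit conversion and the comparison with the age of the universe shift the exponent by less than 20, which is negligible next to \(10^{23}\). On this estimate, the waiting time exceeds the age of the universe by a factor whose exponent is itself of order \(10^{23}\).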

So far we have argued that essential ergodicity substantiates Boltzmann’s assertions about the long-time behaviour of macroscopic systems. What about the short-time behaviour? In physics and philosophy, several attempts have been made to use ergodicity in one way or another to explain a system’s evolution from non-equilibrium to equilibrium (see [43] or [44, 45]; for earlier attempts as well as a thorough critique, see [9] and the references therein).

In this paper, I argue that ergodicity—just like epsilon-ergodicity, essential ergodicity, or any other notion involving an infinite-time limit—does not and cannot tell us anything about the approach to equilibrium, which is a behaviour within short times. This is simply due to the fact that the notion of ergodicity (or any notion akin to that) involves an infinite-time limit. Because of that limit, ergodicity can, at best, tell us something about the system’s long-time behaviour where ‘long-time’ refers to time scales comparable to the recurrence times, where it begins to matter that the system’s trajectory winds around all of phase space. For those short time scales on which the system evolves from non-equilibrium to equilibrium, ergodicity (or any notion akin to that) doesn’t play any role. In fact, for a realistic gas, the equilibration time scale (i.e. the time scale of a system’s approach to equilibrium) is fractions of a second as compared to trillions of years for the recurrence time!

Boltzmann’s explanation of the irreversible approach to equilibrium is a genuine typicality result (see the discussion and references at the end of Sect. 3.1)—ergodicity doesn’t add to nor take anything from that.

At this point, a quote of the mathematician Schwartz fits well (Footnote 11). Schwartz writes with respect to Birkhoff’s ergodic theorem and the widespread conception that ergodicity might help to explain thermodynamic behaviour [8, pp. 23–24]:

The intellectual attractiveness of a mathematical argument, as well as the considerable mental labor involved in following it, makes mathematics a powerful tool of intellectual prestidigitation – a glittering deception in which some are entrapped, and some, alas, entrappers. Thus, for instance, the delicious ingenuity of the Birkhoff ergodic theorem has created the general impression that it must play a central role in the foundations of statistical mechanics. [...] The Birkhoff theorem in fact does us the service of establishing its own inability to be more than a questionably relevant superstructure upon [the] hypothesis [of typicality].

4 Conclusion

Based on typicality and stationarity as the two basic concepts of Boltzmann’s approach, it follows that ergodicity, as an idealization, or essential ergodicity, in the strict sense, is a consequence rather than an assumption of Boltzmann’s account.

I believe that Boltzmann was aware of this fact. In my opinion, he simply didn’t highlight the precise mathematical connection between the concepts of typicality, stationarity, and essential ergodicity because it was absolutely clear to him that, given a state of overwhelming phase space volume and a stationary measure, by far most trajectories would stay in that state by far most of their time—just like by far most trajectories starting from non-equilibrium would move into equilibrium very quickly. He didn’t need a mathematical theorem to make this more precise.

Let me end this paper with a variation on Tim Maudlin’s picturesque and paradigmatic example of typicality incidents occurring in the Sahara desert (Footnote 12). In what follows, I adapt this example to the case of essential ergodicity.

A person wandering through the Sahara is typically surrounded by sand by far most of her time. In other words, she is typically hardly ever in an oasis. This fact is independent of the exact form of her ‘wandering about’: whether she changes direction often or not, whether she moves fast or not, and so on. Even if she doesn’t move at all, she is typically surrounded by sand (in that case, for all times). In other words, independent of the dynamics, the long-time average of ‘being surrounded by sand’ is close to one on typical trajectories. This follows solely from the fact that all oases together constitute a vanishingly small part of the Sahara desert and continue to do so throughout all times.