Abstract
The Kelvin–Planck statement of the second law of thermodynamics is a stricture on the nature of heat receipt by any body suffering a cyclic process. It makes no mention of temperature or of entropy. Beginning with a Kelvin–Planck statement of the Second Law, we show that entropy and temperature—in particular, existence of functions that relate the local specific entropy and thermodynamic temperature to the local state in a material body—emerge immediately and simultaneously as consequences of the Hahn–Banach theorem. The existence of such functions of state requires no stipulation that their domains be restricted to equilibrium states. Further properties, including uniqueness, are addressed in a companion paper.
1 Introduction
There are several widely-accepted formulations of the Second Law in classical thermodynamics, some invoking notions of temperature and entropy, and at least one invoking neither of these explicitly.
In particular, the so-called Kelvin–Planck Second Law is an elemental stricture on the nature of heat receipt by a body during the course of any cyclic process the body might experience. It says, in effect, that during the course of a cyclic process the body cannot merely receive heat from its exterior without also emitting heat to it (in a manner qualitatively distinguishable from that of the heat receipt; see Footnote 1). The First Law then implies that, over the course of a cyclic process, the heat received by the body cannot be converted entirely into work; there must be some heat emission as well. In the Kelvin–Planck Second Law there is no explicit mention of entropy or of temperature, much less of a thermodynamic temperature scale.
As we indicated, other invocations of the Second Law are explicit in their use of a thermodynamic temperature scale and of an entropy. Indeed, the opening paragraph of Gibbs’s “On the Equilibrium of Heterogeneous Substances” [16] invokes an inequality of the form
\[ \Delta S \ge \int \frac{dq}{T}, \tag{1.1} \]
“dq denoting the element of heat received from external sources and T denoting the temperature of the part of the system receiving it.” (This interpretation of the right side of (1.1) is taken from that same Gibbs paragraph.)
Much of modern classical thermodynamics takes as its starting point a Second Law of the form (1.1), usually called the Clausius–Duhem inequality, deemed to obtain for any body suffering any process, even processes in which there is rapid heating or cooling, in which there are sharp temperature gradients, and in which there is rapid and severe deformation (Footnote 2). Neither at the start of the process nor at its end need the body be in equilibrium.
This raises some historical and, more importantly, conceptual questions. Entropy and a thermodynamic temperature scale are generally regarded to be derived entities, deduced from more fundamental statements of the Second Law (such as the Kelvin–Planck version) by means of brilliant arguments posited by the early thermodynamics pioneers. Those arguments, however, often invoke idealized slow reversible processes (for example, Carnot cycles) in which the body suffering the process is always in (or arbitrarily close to) a condition of equilibrium (Footnote 3). Because the classically derived notions of entropy and thermodynamic temperature rest upon arguments in which the only body-states visited are ones at or very close to equilibrium, it is reasonable to question whether these notions actually have rigorous logical extensions to non-equilibrium domains.
Gibbs seemed willing to embrace such extensions. A reading of Gibbs’s interpretation of the right side of (1.1) indicates that he had no reluctance to invoke a thermodynamic temperature scale in bodies having different local temperatures in different parts, and in an earlier, less read article [15], Gibbs clearly felt free to attribute an entropy to a body that is not in equilibrium:
When the body is not in a state of thermodynamic equilibrium, its state is not one of those which are represented by our surface. The body, however, as a whole has a certain volume, entropy, and energy, which are equal to the sums of the volumes, etc., of its parts.
Note, in particular, that Gibbs was not reluctant to assert the existence of a local entropy within an un-equilibrated body, its total entropy coming from a summing process.
Yet it is not easy to trace a clear path from the equilibrium arguments for entropy and thermodynamic temperature posited by the early pioneers to the non-equilibrium entropy and temperature invoked by Gibbs. Even less evident is a precise line of argument that begins with the pioneers and terminates with the free-wheeling modern use of local entropy and thermodynamic temperature in the Clausius–Duhem inequality, in particular when it is applied to bodies experiencing rapid, non-uniform heat transfer and deformation.
Our aim is to connect, in a precise way, an elemental Kelvin–Planck statement of the Second Law to the existence and properties of a thermodynamic temperature scale and an entropy scale, both viewed as functions of the local material state, that together satisfy the requirements of the Clausius–Duhem inequality (as it is invoked in modern classical physics; see Footnote 4) for all processes that material bodies under consideration are deemed to admit. The mathematical ideas we use, principally from functional analysis, were not available to the earliest pioneers of classical thermodynamics, nor were they available to Gibbs (Footnote 5). Our primary working tool is the Hahn–Banach Theorem, in particular a version that ensures that two non-empty disjoint closed convex sets in a locally convex topological vector space, at least one of them compact, can be strictly separated by a hyperplane [2, 4, 24, 33]. Along the way, the Hahn–Banach Theorem will have the additional benefit of imparting to thermodynamics an intuitive geometric flavor, different in substance and setting from the geometric one pioneered by Gibbs [14, 15].
2 Some Background
This article and its companion [11] constitute a major amplification of two much earlier ones by us, both drawing on the Hahn–Banach Theorem heavily. The first [9], published in 1983, was an extensive discussion of how the Hahn–Banach Theorem serves to connect a suitably formulated version of the Kelvin–Planck Second Law to the existence and properties (including uniqueness) of a thermodynamic temperature scale that conforms to the so-called Clausius inequality—that is, to the Clausius–Duhem inequality restricted to cyclic processes. For cyclic processes, the left side of (1.1) reduces to zero, so there was no involvement of entropy.
The second article [10] was published originally in 1984 as an appendix in [34] and soon after in a collection [28] of short essays about the foundations of thermodynamics. That article indicated how, beginning with a slightly stronger version of a Kelvin–Planck Second Law, the Hahn–Banach Theorem delivers simultaneously both a thermodynamic temperature and an entropy satisfying the requirements of the Clausius–Duhem inequality, not restricted to cyclic processes.
Although [10] contained a Hahn–Banach proof of the equivalence of the Kelvin–Planck Second Law with the existence of thermodynamic-temperature and entropy functions of state suited to the Clausius–Duhem inequality, several other theorems (including two about uniqueness) were merely stated without proofs. Those proofs we said would be forthcoming in a fuller article. Moreover, we promised a more compelling presentation of the existence argument, in which certain presumptions about the structure of the putative set of thermodynamical processes would be substantially weakened. This article and its companion are intended to fulfill those promises. The weakened, and more natural, assumptions about the structure of the process-set have required a deeper analysis, much of it deferred to the appendix of this article.
Remark 2.1
The 1986 volume [28], in which [10] appears, contains a wealth of chapters by different authors devoted to the study of the mathematical foundations of non-equilibrium classical thermodynamics. The same is true of [34]. Even beyond those, there are many schools of thought about how classical thermodynamics might be extended to non-equilibrium settings. These are surveyed amply and critically in the 2008 book by Lebon et al. [19] (although some work contained in [28, 34] and related articles escaped the book’s notice). Readers of this article might want to see very different work by Lieb and Yngvason [20], which in 1999 began as an exploration of the construction of classical entropy for bodies in equilibrium and then turned in 2013 to questions about the extent to which the same could be done for un-equilibrated bodies [21]. For a recent summary of some of their work see [35]; see also an article by Kammerlander and Renner [17].
Remark 2.2
To a great extent, discussions with James Serrin in the late 1970s and early 1980s, in particular his formulations of the Second Law in terms of a heat accumulation function, provided inspiration for our work (although not our reliance on the Hahn–Banach Theorem). Serrin’s views at the time are captured in [25, 26, 27, 29].
Remark 2.3
As in our earlier articles, we want to call particular attention to work [30, 31] by Miroslav Šilhavý (Footnote 6), who realized independently and at about the same time that Hahn–Banach separation theorems, taken with the Kelvin–Planck Second Law, might provide a basis for existence of a thermodynamic temperature scale consistent with the cyclic-process Clausius inequality. In [31] Šilhavý viewed the thermodynamic temperature scale to be a function having as its domain a pre-supposed empirical temperature scale. The most apt comparison to our work is with some preliminary notes [8] we wrote in 1978 for James Serrin. There, we also viewed a Clausius-inequality temperature scale to be a function having as its domain a pre-supposed empirical temperature scale, and we too used Hahn–Banach separation theorem arguments to demonstrate how the existence of such a Clausius-inequality temperature scale derives immediately from, and is equivalent to, the Kelvin–Planck Second Law.
Our subsequent published article [9] on the Clausius inequality was much more ambitious. There, we chose not to pre-suppose an empirical temperature scale, carrying a pre-ordained notion of “hotness” and “hotter than.” Rather, we regarded the desired Clausius-inequality temperature scale to be a “function of state,” the state domain depending on the material under consideration (Footnote 7). In this way, we could not only establish, via the Hahn–Banach Theorem, the equivalence of the Kelvin–Planck Second Law with the existence of a temperature scale satisfying the Clausius inequality, we could also tie relative values of that temperature to a “hotter than” relation on the set of states, a relation deriving solely from processes the material is deemed to admit. This is the position taken in [10] and here, where the entropy density, like the thermodynamic temperature scale, is a Hahn–Banach-derived function of the local material state (Footnote 8).
3 Thermodynamical Theories
To a great extent modern classical thermodynamics manifests itself as a collection of thermodynamical theories tailored to particular materials, these various theories sharing common premises and common methodologies. There are, for example, thermodynamical theories of elastic materials, of gases, of viscous fluids, of diffusive reacting mixtures, and so on. Each such theory presumably carries with it versions of the First and Second Laws, rendered concrete and precise within the context of the specific class of materials under study.
With this viewpoint in mind, we regard the theorems contained in this article and its companion to provide something like a “meta-thermodynamics” that sheds an overarching light on the structure of specific thermodynamical theories. In particular, almost all of the theorems contained here assert that a theory has Property A (usually a statement about the nature of heat transfer between bodies and their exteriors in processes the theory admits) if and only if it has Property B (usually a statement about entropy and thermodynamic temperature). The deeper and more difficult of those implications always derives from the Hahn–Banach Theorem.
We will regard a thermodynamical theory to be a mathematical object consisting of two sets: (i) a state space \(\Sigma \) that characterizes the set of (local) states that might be exhibited within a material body embraced by the theory and (ii) a set \(\mathscr {P}\) of processes that abstracts the essential features of physical processes that such bodies are deemed to admit. Taken together, these two sets will, for us, serve to constitute an instance \((\Sigma ,\mathscr {P})\) of a thermodynamical theory.
In this section and the next we will use terms such as body, material, material point, and physical process, but only in an informal way to guide thinking about the two sets \(\Sigma \) and \(\mathscr {P}\) that constitute a thermodynamical theory or to provide justification for the structure these sets are presumed to possess. Again, though, a thermodynamical theory \((\Sigma ,\mathscr {P})\) is a purely mathematical object suited to precisely stated questions and theorems. In particular, we will be in a position to say what we mean by a Kelvin–Planck theory, that is, a thermodynamical theory that complies with a precisely stated version of the Kelvin–Planck Second Law. And we will be in a position to ask about circumstances under which a particular thermodynamical theory \((\Sigma ,\mathscr {P})\) admits two functions of state, a specific-entropy \(\eta :\Sigma \rightarrow \mathbb {R}\) and a thermodynamic temperature scale \(T:\Sigma \rightarrow \mathbb {R}_+\), that together comply with the Clausius–Duhem inequality for all processes in the set \(\mathscr {P}\) the theory contains.
Remark 3.1
The mathematical objects and theorems contained here lend themselves to a variety of physical interpretations. At least at the outset, it will be helpful for the reader to think of a thermodynamical theory \((\Sigma ,\mathscr {P})\) as a description of a particular material (for example, carbon dioxide, water, rubber, a metal alloy, a diffusive reacting mixture). In this context, a specific-entropy function \(\eta :\Sigma \rightarrow \mathbb {R}\) will have an interpretation as an attribute of a particular material—in the parlance of continuum physics, a “constitutive function” for that material. Nevertheless, we intend the abstract idea of a thermodynamical theory to be broadly adaptable to a variety of circumstances and instances.
3.1 State Spaces
Central to virtually all classical theories of material body behavior is the idea of “functions of state” that serve to compute local values of certain material attributes. Indeed, one of our aims is to establish, from the Kelvin–Planck Second Law, the existence of specific-entropy and thermodynamic-temperature functions, suited to the Clausius–Duhem inequality, that permit the calculation of the local specific entropy (entropy per mass) and the local thermodynamic temperature once the local material “state” is specified.
Just how the “state of a material point” is specified will vary from one thermodynamical theory to another (Footnote 9). For a theory of a gas of fixed composition it might be supposed that the local state is captured completely by specification of the pair (p, v), where p is the local pressure and v is the local specific volume (the reciprocal of the density). For an elastic material, it might be supposed that the local state is captured by the pair (u, F), where u is the local specific internal energy (internal energy per mass) and F is the local deformation gradient. For a reacting and diffusive mixture having n chemical species, the local state might be described by the vector \([c_1,c_2,\dots ,c_n,\theta ] \in \mathbb {R}^{n+1}\), where \(c_i\) is the local molar concentration of the \(i^{th}\) species and \(\theta \) is the local temperature in degrees Fahrenheit.
In any case, we shall take for granted that a thermodynamical theory has associated with it a state space \(\Sigma \), understood to be the set of local states that might be exhibited within a material body during processes the theory purports to describe. It will be presumed that \(\Sigma \) carries with it a Hausdorff topology.
In fact, we will go further by supposing hereafter that \(\Sigma \) is compact. This supposition will simplify the mathematics greatly, and in most instances it will be physically apt: A well-grounded theory would suffer no loss from exclusion of processes that visit material states which are physically unreasonable. Excluded from consideration, for example, might be processes involving mass-densities so high as to be realized only in black holes or so low as to be inconsistent with the tenets of continuum models.
Remark 3.2
When the state space is merely presumed to be locally compact, realization of the objectives of this paper becomes more technically delicate, and certain theorems here become false without modification. In Appendix E of [9] we showed how this might proceed when attention is restricted solely to cyclic processes, with the aim of producing a thermodynamic temperature scale consistent with the Clausius inequality.
3.2 Processes
A process experienced by a particular body can be described in a variety of ways, some highly picturesque, involving pulleys and pistons. For our purposes, however, there will be only two aspects of the process that need be considered: (i) the change of condition of the body from the beginning of the process to its end and (ii) the heating measure for the process, which is an overall accounting of the nature of heat receipt the body experiences during the course of the process. We will describe each of these separately. For us, a process will be identified with specifications of both its change of condition and its heating measure.
3.2.1 The Change of Condition for a Process
Recall that members of \(\Sigma \) are understood to be local state descriptions—that is, candidates for describing the state of a material point within a body. If we consider a body at a fixed instant, its material points will be exhibited in various states of \(\Sigma \). Although there might be just one state exhibited throughout the body (in which case the body is thermodynamically uniform), the distribution of states over the body could be far more diffuse. In any case, we shall need a device to describe that distribution for a particular body at a fixed instant.
By the (instantaneous) condition of the body we mean a positive regular Borel measure on \(\Sigma \), denoted here by \(\mathcal {m}\), interpreted in the following way: For each Borel set \(\Lambda \subset \Sigma \), \(\mathcal {m}(\Lambda )\) is the mass of that part of the body consisting of all material points in states contained in \(\Lambda \). More colloquially, we can think of \(\mathcal {m}(\Lambda )\) to be determined by excising from the body only material in states contained within \(\Lambda \) and weighing that part of the body so removed. Note that \(\mathcal {m}(\Sigma )\) is the mass of the entire body. Note also that if a body of mass M is thermodynamically uniform, with all material in state \(\sigma \), then the body’s condition is \(M\delta _{\sigma }\), where \(\delta _{\sigma }\) is the Dirac measure concentrated at \(\sigma \) (Footnote 10).
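As an informal illustration only, the idea of a condition can be mimicked on a finite state space, where a positive measure reduces to a table of masses indexed by state. The state labels, the helper names `measure_of` and `uniform_condition`, and the numerical masses below are all hypothetical choices made for this sketch; they are not part of the formal development.

```python
from fractions import Fraction

# Hypothetical finite state space: labels standing in for local states in Sigma.
SIGMA = ("cold", "warm", "hot")

def measure_of(measure, subset):
    """Value of a finitely supported measure on a set of states."""
    return sum(weight for state, weight in measure.items() if state in subset)

def uniform_condition(mass, state):
    """Condition M * delta_state of a thermodynamically uniform body."""
    return {state: mass}

# A non-uniform body: 2 units of mass in state "cold" and 3 in state "hot".
m = {"cold": Fraction(2), "hot": Fraction(3)}

total_mass = measure_of(m, set(SIGMA))            # m(Sigma), the mass of the body
mass_warm_or_hot = measure_of(m, {"warm", "hot"})  # mass of material in those states
uniform = uniform_condition(Fraction(5), "warm")   # the condition 5 * delta_warm
```

In this finite setting every subset of states plays the role of a Borel set, which is why a plain dictionary suffices.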
Now consider a physical process suffered by a particular body, with both the body and the process presumably embraced by the thermodynamical theory under consideration. During the process, the body might experience rapid deformation and heat transfer, so that each material point within the body might present itself in a great variety of states as the process ensues. In particular, the body’s final condition \(\mathcal {m}_f\) might be very different from the body’s initial condition \(\mathcal {m}_i\). We associate with the process a change of condition \(\Delta \mathcal {m}\), defined by
\[ \Delta \mathcal {m} := \mathcal {m}_f - \mathcal {m}_i. \tag{3.1} \]
Here \(\Delta \mathcal {m}\) is understood to be a signed regular Borel measure on \(\Sigma \), which is to say that \(\Delta \mathcal {m}\) might take positive values on some Borel sets and negative values on others (Footnote 11). Note, however, that we always have
\[ \Delta \mathcal {m}(\Sigma ) = \mathcal {m}_f(\Sigma ) - \mathcal {m}_i(\Sigma ) = 0, \tag{3.2} \]
since each term on the right is the (conserved) total mass of the body suffering the process.
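A small numerical sketch may help fix the idea that a change of condition is a signed measure of total mass zero. It assumes a finite state space with hypothetical state labels and masses, and the helper name `subtract` is introduced here only for illustration.

```python
def subtract(m_final, m_initial):
    """Signed measure m_final - m_initial on a finite state space."""
    states = set(m_final) | set(m_initial)
    return {s: m_final.get(s, 0) - m_initial.get(s, 0) for s in states}

# Initially 2 units of mass are "cold" and 3 are "hot"; finally all 5 are "warm".
m_i = {"cold": 2, "hot": 3}
m_f = {"warm": 5}

delta_m = subtract(m_f, m_i)    # negative on some states, positive on others
total = sum(delta_m.values())   # delta_m(Sigma), which mass conservation forces to be 0
```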
3.2.2 The Heating Measure for a Process
During the course of the physical process under consideration, the body suffering the process might experience deformation and nonuniform transfer of heat to and from its exterior. Indeed, at a given instant there might be heat receipt in some parts of the body and heat removal in other parts. It should be kept in mind that each material point can be expected to visit a variety of states in \(\Sigma \) as time progresses.
With the process we associate a heating measure \(\mathcal {q}\), which is a signed regular Borel measure on \(\Sigma \) with the following interpretation: for each Borel set \(\Lambda \subset \Sigma \), \(\mathcal {q}(\Lambda )\) is the net amount of heat received over the course of the entire process (from the exterior of the body suffering the process) by material in states contained within \(\Lambda \) at the time of heat receipt. In colloquial terms, imagine viewing the evolving process through glasses that filter out material not in states contained in \(\Lambda \); some material might disappear and then reappear. The net heat received, over the entire process, by the visible material (from the exterior of the entire body) is \(\mathcal {q}(\Lambda )\).
3.2.3 Example: Change of Condition and Heating Measure Derived from a More Concrete Process Description
Because the abstract idea of a process’s change of condition and heating measure will be important hereafter (Footnote 12), we will indicate how these can be calculated from a somewhat more tangible description of a process. With the process (having a compact metric space as the state space \(\Sigma \)) we associate:
(i) a body \(\mathscr {B}\) that experiences the process. Here we regard \(\mathscr {B}\) to be a set (of material points), taken with a \(\sigma \)-algebra of subsets of \(\mathscr {B}\), called the parts of \(\mathscr {B}\). We presume that \(\mathscr {B}\) comes equipped with a positive mass measure \(\mu \) defined on its parts: for each part \(P \subset \mathscr {B}\), \(\mu (P)\) is the mass of part P.
(ii) a closed interval of the real line \(\mathscr {I}:= [t_i,t_f]\), identified with the time interval over which the process transpires.
(iii) a measurable (Footnote 13) function \({\hat{\sigma }}: \mathscr {B}\times \mathscr {I}\rightarrow \Sigma \), with \({\hat{\sigma }}(X,t)\) interpreted as the state of material point X at instant t.
(iv) a real-valued signed measure h on \(\mathscr {B}\times \mathscr {I}\), interpreted as follows: For each part \(P \subset \mathscr {B}\) and each Lebesgue-measurable set \(J \subset \mathscr {I}\), \(h(P \times J)\) is the net amount of heat received by part P from the exterior of the body during instants contained in J.
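For a process described this way, construction of the heating measure \(\mathcal {q}\) proceeds as follows: for each Borel set \(\Lambda \subset \Sigma \),
\[ \mathcal {q}(\Lambda ) := h\left( {\hat{\sigma }}^{-1}(\Lambda )\right) , \quad \text{where } {\hat{\sigma }}^{-1}(\Lambda ) = \{(X,t) \in \mathscr {B}\times \mathscr {I} : {\hat{\sigma }}(X,t) \in \Lambda \}. \]
To construct the change of condition for the process we begin by defining the initial and final state assignments to material points:
\[ {\hat{\sigma }}_i(\cdot ) := {\hat{\sigma }}(\cdot ,t_i), \qquad {\hat{\sigma }}_f(\cdot ) := {\hat{\sigma }}(\cdot ,t_f). \]
The initial condition and final condition of body \(\mathscr {B}\) are then defined by the requirement that, for each Borel set \(\Lambda \subset \Sigma \),
\[ \mathcal {m}_i(\Lambda ) := \mu \left( {\hat{\sigma }}_i^{-1}(\Lambda )\right) , \qquad \mathcal {m}_f(\Lambda ) := \mu \left( {\hat{\sigma }}_f^{-1}(\Lambda )\right) . \]
The change of condition for the process is then given by
\[ \Delta \mathcal {m} := \mathcal {m}_f - \mathcal {m}_i. \]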
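The construction described in this subsection can be mimicked in a fully discrete toy setting, sketched below under strong simplifying assumptions: the body consists of two hypothetical material points, time runs over three instants, and the heat supply is concentrated on (point, instant) pairs. The names `mu`, `sigma_hat`, `h`, `heating_measure`, and `condition_at`, and all numerical values, are illustrative assumptions. The heating measure is obtained by crediting each increment of heat to the state occupied by the receiving material at the time of receipt.

```python
from collections import defaultdict

# Hypothetical body: two material points with their masses (the measure mu).
mu = {"X1": 1.0, "X2": 2.0}
# Hypothetical discrete instants standing in for the interval [t_i, t_f].
times = (0, 1, 2)
# State of each material point at each instant (the function sigma_hat).
sigma_hat = {
    ("X1", 0): "cold", ("X1", 1): "hot", ("X1", 2): "cold",
    ("X2", 0): "warm", ("X2", 1): "warm", ("X2", 2): "hot",
}
# Heat received from the exterior by a point at an instant (the signed measure h).
h = {("X1", 1): 4.0, ("X1", 2): -1.0, ("X2", 1): 0.5, ("X2", 2): -2.0}

def heating_measure(sigma_hat, h):
    """Credit each heat increment to the state occupied at the time of receipt."""
    q = defaultdict(float)
    for (point, t), heat in h.items():
        q[sigma_hat[(point, t)]] += heat
    return dict(q)

def condition_at(sigma_hat, mu, t):
    """Distribution of mass over states at instant t."""
    m = defaultdict(float)
    for point, mass in mu.items():
        m[sigma_hat[(point, t)]] += mass
    return dict(m)

q = heating_measure(sigma_hat, h)
m_i = condition_at(sigma_hat, mu, times[0])
m_f = condition_at(sigma_hat, mu, times[-1])
delta_m = {s: m_f.get(s, 0.0) - m_i.get(s, 0.0) for s in set(m_i) | set(m_f)}
```

Here point X1 absorbs heat while hot and emits while cold, so the heating measure is positive on one state and negative on another even though it records only net amounts per state.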
3.2.4 The Set of Processes and Some of Its Properties
In a theory with state space \(\Sigma \), a process will be regarded to be a pair \((\Delta \mathcal {m},\mathcal {q})\), where \(\Delta \mathcal {m}\) is the change of condition for the process and \(\mathcal {q}\) is its heating measure. We can regard both of these as members of \(\mathscr {M}(\Sigma )\), the vector space of signed regular Borel measures on \(\Sigma \). In fact, from (3.2) it follows that \(\Delta \mathcal {m}\) is always a member of the linear subspace \(\mathscr {M}^{\circ }(\Sigma )\subset \mathscr {M}(\Sigma )\) defined by
\[ \mathscr {M}^{\circ }(\Sigma ) := \{\nu \in \mathscr {M}(\Sigma ) : \nu (\Sigma ) = 0\}. \]
Thus we can regard a process \(\mathcal {p} = (\Delta \mathcal {m},\mathcal {q})\) to be a member of the vector space
\[ \mathscr {V}(\Sigma ) := \mathscr {M}^{\circ }(\Sigma ) \times \mathscr {M}(\Sigma ). \]
Hereafter it will be understood that \(\mathscr {M}(\Sigma )\) carries the weak-star topology (Footnote 14), that \(\mathscr {M}^{\circ }(\Sigma )\) carries the topology it inherits as a subset of \(\mathscr {M}(\Sigma )\), and that \(\mathscr {V}(\Sigma )\) carries the resulting product topology. For a set \(X \subset \mathscr {V}(\Sigma )\) we denote by cl (X) its closure.
For a thermodynamical theory with state space \(\Sigma \), the set of processes, \(\mathscr {P}\ \subset \mathscr {V}(\Sigma )\), will be understood to consist of members of \(\mathscr {V}(\Sigma )\) that correspond to physical processes deemed to be admitted by material bodies in circumstances the theory purports to embrace. Physical considerations suggest that, for any reasonable theory, the set \(\mathscr {P}\) should carry a certain structure, in particular that it should share at least some of the attributes of a convex cone in \(\mathscr {V}(\Sigma )\). Recall that \(\mathscr {P}\) would be a convex cone were it to have both of the following properties:
(i) For each \(\mathcal {p}\) in \(\mathscr {P}\) and each non-negative number \(\alpha \), \(\alpha \mathcal {p}\) is a member of \(\mathscr {P}\).
(ii) For all \(\mathcal {p}\) and \(\mathcal {p}^*\) in \(\mathscr {P}\), \(\mathcal {p}+ \mathcal {p}^*\) is a member of \(\mathscr {P}\).
With respect to (i), it is not difficult to argue on physical grounds that the inclusion will be satisfied so long as \(\alpha \) is a non-negative integer: If \(\mathcal {p} = (\Delta \mathcal {m},\mathcal {q})\) is a physical process suffered by a body \(\mathscr {B}\), then for any positive integer n, we can simultaneously execute the same process on n copies of \(\mathscr {B}\), copies that are not in thermal communication. The n bodies, viewed as a single body, will have suffered a physical process for which the change of condition is \(n\Delta \mathcal {m}\) and the heating measure is \(n\mathcal {q}\). Thus, \(n\mathcal {p}= (n\Delta \mathcal {m}, n\mathcal {q})\) is a member of \(\mathscr {P}\), corresponding to the physical n-body process described.
Similarly, we can expect on physical grounds that the inclusion in (ii) will be satisfied so long as \(\mathcal {p}\) = \((\Delta \mathcal {m},\mathcal {q})\) and \(\mathcal {p}^* =(\Delta \mathcal {m}^*,\mathcal {q}^*)\) correspond to two physical processes having the same temporal duration: If these physical processes are suffered by bodies \(\mathscr {B}\) and \(\mathscr {B}^*\), then the two processes can be executed simultaneously, with \(\mathscr {B}\) and \(\mathscr {B}^*\) thermally isolated from one another, perhaps by large physical distance. This simultaneous execution can be viewed to be another physical process, suffered by the body composed of \(\mathscr {B}\) and \(\mathscr {B}^*\), having change of condition \(\Delta \mathcal {m}+ \Delta \mathcal {m}^*\) and heating measure \(\mathcal {q}+ \mathcal {q}^*\). In this case, the new physical process would have a representation in \(\mathscr {V}(\Sigma )\) (and in \(\mathscr {P}\)) given by \(\mathcal {p}+ \mathcal {p}^*\).
These considerations tell us that, in a reasonable theory, the process set \(\mathscr {P}\) can be expected to have some natural structure, including features that are suggestive of a convex cone in \(\mathscr {V}(\Sigma )\). In fact, in [10] we assumed that \(\mathscr {P}\) is a convex cone. Here we make no such assumption.
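The two physical constructions just described (running a process on n isolated copies of a body, and running two processes side by side on isolated bodies) amount to scaling and adding pairs of measures. A minimal sketch on a finite state space follows; the function names `n_copies` and `side_by_side` and the sample processes are hypothetical, chosen only to illustrate the arithmetic.

```python
def scale(n, measure):
    """The measure n * measure on a finite state space."""
    return {s: n * w for s, w in measure.items()}

def add(a, b):
    """The measure a + b on a finite state space."""
    return {s: a.get(s, 0) + b.get(s, 0) for s in set(a) | set(b)}

def n_copies(n, process):
    """Process suffered by n thermally isolated copies of a body, viewed as one body."""
    delta_m, q = process
    return (scale(n, delta_m), scale(n, q))

def side_by_side(p, p_star):
    """Process obtained by executing p and p_star simultaneously on isolated bodies."""
    return (add(p[0], p_star[0]), add(p[1], p_star[1]))

# Two hypothetical processes, each a pair (change of condition, heating measure).
p = ({"cold": -1, "hot": 1}, {"cold": 3})
p_star = ({"hot": -2, "warm": 2}, {"hot": -1})

three_p = n_copies(3, p)
combined = side_by_side(p, p_star)
```

Both constructions preserve the requirement that the change of condition have total mass zero, so the results remain in the same vector space of processes.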
We defer to the Appendix a far more nuanced discussion of the structure that we will suppose \(\mathscr {P}\) possesses. By \(\textrm{Cone} \,(\mathscr {P})\) we mean the set in \(\mathscr {V}(\Sigma )\) defined by
\[ \textrm{Cone} \,(\mathscr {P}) := \{\alpha \mathcal {p} \in \mathscr {V}(\Sigma ) : \mathcal {p} \in \mathscr {P},\ \alpha \ge 0\}. \]
Based on a few plausible physical assumptions, we argue in the Appendix that, in a reasonable theory, the set
\[ \textrm{cl}\,[\textrm{Cone} \,(\mathscr {P})] \]
should not only be a closed cone in \(\mathscr {V}(\Sigma )\), it should also be convex. This we will take for granted hereafter.
3.3 Definition of a Thermodynamical Theory
For the record, we posit the following definition:
Definition 3.3
A thermodynamical theory consists of a (compact) Hausdorff space \(\Sigma \), called the state space of the theory, and a set \(\mathscr {P}\subset \mathscr {V}(\Sigma )\) such that
\[ \textrm{cl}\,[\textrm{Cone} \,(\mathscr {P})] \]
is convex. Elements of \(\mathscr {P}\) are the processes of the theory.
Remark 3.4
The definition is formulated in such a way as to remind the reader of our presumption that \(\Sigma \) is compact. Recall Remark 3.2.
4 Kelvin–Planck Theories
In this section we will make precise what we mean by a Kelvin–Planck theory, that is, a thermodynamical theory that respects a form of the Kelvin–Planck Second Law. We want to capture the following idea: In every cyclic process in which the body suffering the process experiences a heat absorption from the body’s exterior, there must also be heat emission to the exterior, the emission being qualitatively different from the absorption. If there were no heat emission, the process would be perfectly efficient, for by the First Law the heat absorbed would be converted entirely into work.
By a cyclic process in the thermodynamical theory \((\Sigma ,\mathscr {P})\) we will mean a process in which the condition of the body at the end of the process is the same as it was at its beginning. That is, a cyclic process \(\mathcal {p}= (\Delta \mathcal {m},\mathcal {q})\) is a process such that the change of condition \(\Delta \mathcal {m}\) is 0.
Consider a cyclic process \(\mathcal {p}^*:= (0,\mathcal {q}^*)\) with \(\mathcal {q}^* \ne 0\). Recall that if \(\Lambda \subset \Sigma \) is a Borel set of states, then \(\mathcal {q}^*(\Lambda )\) is interpreted to be the net amount of heat absorbed during the course of the entire process by material while in states contained in \(\Lambda \). If \(\mathcal {q}^*\) is a non-negative Borel measure (that is, one that takes non-negative values on every Borel set), then there is no Borel set of states that, for the process, can be associated with net heat emission. Moreover, by supposition \(\mathcal {q}^*\) is not the zero measure, so there is at least one Borel set on which \(\mathcal {q}^*\) is positive, corresponding to heat absorption.
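For finitely supported measures the situation just described is easy to test mechanically, since such a measure is non-negative on every set of states exactly when each of its weights is non-negative. The sketch below is an illustration under that finite-support assumption; the function name `is_forbidden` and the tolerance parameter `tol` are hypothetical.

```python
def is_forbidden(process, tol=0.0):
    """True if the process is cyclic with a nonzero, non-negative heating measure.

    A process is a pair (delta_m, q) of finitely supported measures, each a
    dictionary mapping states to weights.
    """
    delta_m, q = process
    cyclic = all(abs(w) <= tol for w in delta_m.values())   # change of condition is 0
    nonnegative = all(w >= -tol for w in q.values())         # no net heat emission anywhere
    nonzero = any(abs(w) > tol for w in q.values())          # some heat is absorbed
    return cyclic and nonnegative and nonzero

absorb_only = ({}, {"hot": 2.0})                       # cyclic, absorption without emission
absorb_and_emit = ({}, {"hot": 2.0, "cold": -1.0})     # cyclic, with emission
not_cyclic = ({"cold": -1.0, "hot": 1.0}, {"hot": 2.0})
trivial = ({}, {})                                      # the zero process
```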
For these reasons, when \(\mathcal {q}^* \ne 0\) is a non-negative measure, we will regard the cyclic process \(\mathcal {p}^*:= (0,\mathcal {q}^*)\) to be inconsistent with the spirit of the Kelvin–Planck Second Law. For the thermodynamical theory \((\Sigma ,\mathscr {P})\) we denote by \(\mathscr {M}_+(\Sigma )\) the set of non-negative regular Borel measures on \(\Sigma \), and we also let
\[ (0,\mathscr {M}_+(\Sigma )) := \{(0,\nu ) \in \mathscr {V}(\Sigma ) : \nu \in \mathscr {M}_+(\Sigma )\}. \]
Thus, for a thermodynamical theory \((\Sigma ,\mathscr {P})\) we might regard the requirement
\[ \mathscr {P} \cap (0,\mathscr {M}_+(\Sigma )) \subset \{(0,0)\} \tag{4.1} \]
to be a full embodiment of the Kelvin–Planck Second Law. Or, if we want to assert that a nonzero element of \((0,\mathscr {M}_+(\Sigma ))\) cannot even be approximated by the theory’s processes, then we might strengthen (4.1) by requiring that
\[ \textrm{cl}\,(\mathscr {P}) \cap (0,\mathscr {M}_+(\Sigma )) \subset \{(0,0)\}. \tag{4.2} \]
However, two examples will reveal a sense in which even (4.2) falls a little short of capturing the Kelvin–Planck stricture against an approach to perfect conversion of heat into work in cyclic processes. The examples will indicate why we prefer to express the Kelvin–Planck Second Law in terms of a requirement that is somewhat stronger than (4.2).
Each example will be in the form of a toy thermodynamical theory in which the state space \(\Sigma \) is identified with the real interval [0, 1]. Recall that, for \(x \in \Sigma \), \(\delta _x\) denotes the Dirac measure at x. That is, if \(\Lambda \subset \Sigma \) is a Borel set then \(\delta _x(\Lambda ) = 1\) if x is in \(\Lambda \) and is zero otherwise.
Example 4.1
(A sequence of cyclic processes with small fixed heat emission but unbounded heat receipt) Consider a thermodynamical theory \((\Sigma ,\mathscr {P})\), in which \(\mathscr {P}\) contains the sequence of cyclic processes
In each process of the sequence there is heat absorbed (by material in state 1) and heat emitted (by material in state 0). Thus, no process of the sequence is a member of the forbidden set \((0,\mathscr {M}_{+}(\Sigma ))\), nor does the sequence converge to any nonzero member of the forbidden set. For this reason, a putative assertion of the Kelvin–Planck Second Law in the form (4.2) would not preclude for the theory \((\Sigma ,\mathscr {P})\) the presence of the sequence (4.3) in \(\mathscr {P}\).
Nevertheless, the sequence contains cyclic processes that come arbitrarily close to having perfect efficiency as n increases: in the nth process, the heat absorbed (all at state 1) is n, while the work done (equal, in a cyclic process, to the net amount of heat received) is \(n-1\). The efficiency, then, is \(\frac{n-1}{n}\), which approaches 1 as n gets large. Although members of the sequence (4.3) do not converge to a member of the forbidden set, they do come close to aligning in the vector space \(\mathscr {V}(\Sigma )\) with the forbidden element \((0,\delta _1)\).
Such an arbitrarily close approach to perfect efficiency would seem to violate the spirit of the Kelvin–Planck Second Law. The example reveals a sense in which the condition expressed by (4.2) is not a fully suitable reflection of that spirit.
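As an illustration only, suppose the processes of the sequence have the form \(\mathcal {p}_n = (0,\, n\delta _1 - \delta _0)\), \(n = 1,2,\dots \). This form is our assumption, inferred from the verbal description above (heat n absorbed at state 1, unit heat emitted at state 0); it is not a quotation of (4.3). Under that assumption the efficiency computation can be written out as follows:

```latex
\begin{aligned}
&\mathcal{q}_n := n\,\delta_1 - \delta_0 \qquad \text{(assumed form of the heating measure)}\\
&\text{heat absorbed} = \mathcal{q}_n(\{1\}) = n, \qquad
 \text{heat emitted} = -\,\mathcal{q}_n(\{0\}) = 1,\\
&\text{work done} = \mathcal{q}_n(\Sigma) = n - 1,\\
&\text{efficiency} = \frac{n-1}{n} = 1 - \frac{1}{n} \;\longrightarrow\; 1
 \quad \text{as } n \to \infty .
\end{aligned}
```

The emission stays fixed at one unit while the absorption grows without bound, which is what drives the efficiency toward 1.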
Example 4.2
(A sequence of almost-cyclic processes, each with heat receipt but no heat emission) Consider a thermodynamical theory \((\Sigma ,\mathscr {P})\), in which \(\mathscr {P}\) contains the sequence of processes
In each process of the sequence, the heating measure indicates no heat emission, only (unbounded) heat absorption, entirely at state \(\frac{1}{2}\). Still, no process of the sequence constitutes a violation of a Kelvin–Planck-type Second Law, as no process is cyclic. Nevertheless, as n increases the change of condition approaches 0 while the heat absorption becomes unbounded. Although the sequence does not converge to any member of the forbidden set \((0,\mathscr {M}_{+}(\Sigma ))\), its processes nevertheless violate the Kelvin–Planck spirit, for as n increases they increasingly resemble cyclic processes with (large) heat absorption but no heat emission.
Here, as in Example 4.1, a codification of the Kelvin–Planck Second Law in the form (4.2) does not suffice to preclude the presence in \(\mathscr {P}\) of a troubling process sequence, in this case (4.4).
Stated informally, the difficulty in both examples is that, while neither sequence converges to an element of the forbidden set \((0,\mathscr {M}_+(\Sigma ))\), members of each sequence come arbitrarily close to pointing along a “forbidden direction” in the vector space \(\mathscr {V}(\Sigma )\). For a thermodynamical theory \((\Sigma ,\mathscr {P})\), we will identify the direction of a process \(\mathcal {p}\in \mathscr {P}\) with the half-line
Note that \(\textrm{Cone} \,(\mathscr {P})\), given as before by
is the set of all directions generated by members of \(\mathscr {P}\). The condition
then says in effect that no nonzero element of the forbidden set \((0,\mathscr {M}_{+}(\Sigma ))\) can be approximated by vectors of \(\mathscr {V}(\Sigma )\) having directions associated with members of \(\mathscr {P}\).
Remark 4.3
(Examples 4.1 and 4.2 reconsidered) Although the problematic thermodynamical theories considered in Examples 4.1 and 4.2 were not precluded by the putative Kelvin–Planck Second Law in the form (4.2), they are precluded by the strengthened condition (4.7). In the case of Example 4.1 the sequence in \(\textrm{Cone} \,(\mathscr {P})\)
converges to \((0,\delta _1)\). In the case of Example 4.2 the sequence in \(\textrm{Cone} \,(\mathscr {P})\)
converges to \((0,\delta _{1/2})\).
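Both limits can be checked directly against the weak-star topology. The sketch below assumes, for Example 4.1, processes of the form \((0,\, n\delta _1 - \delta _0)\) and, for Example 4.2, processes of the form \((\mathcal {v}_n,\, n\delta _{1/2})\) with \(\mathcal {v}_n \rightarrow 0\). Both forms, and the scaling factor 1/n, are our reading of the verbal descriptions of the examples and are not quotations of the displayed sequences.

```latex
\begin{aligned}
&\text{Example 4.1 (assumed form): for every } \varphi \in \mathrm{C}(\Sigma,\mathbb{R}),\\
&\qquad \int_\Sigma \varphi \, d\Big(\tfrac{1}{n}\,(n\delta_1 - \delta_0)\Big)
   = \varphi(1) - \tfrac{1}{n}\,\varphi(0)
   \;\longrightarrow\; \varphi(1) = \int_\Sigma \varphi \, d\delta_1 ,\\
&\qquad \text{so } \tfrac{1}{n}\,(0,\; n\delta_1 - \delta_0) \to (0,\delta_1).\\[4pt]
&\text{Example 4.2 (assumed form): }
   \tfrac{1}{n}\,(\mathcal{v}_n,\; n\delta_{1/2})
   = \big(\tfrac{1}{n}\,\mathcal{v}_n,\; \delta_{1/2}\big) \to (0,\delta_{1/2}),\\
&\qquad \text{because } \tfrac{1}{n} \to 0 \text{ and } \mathcal{v}_n \to 0
   \text{ give } \tfrac{1}{n}\,\mathcal{v}_n \to 0
   \text{ by continuity of scalar multiplication.}
\end{aligned}
```

In each case the limit is a nonzero member of the forbidden set, reached only after rescaling; this is why the closure of the cone, rather than the closure of \(\mathscr {P}\) itself, detects the difficulty.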
For these reasons, our preferred codification of the Kelvin–Planck Second Law will take the form (4.7) rather than (4.2). Note that if \(\mathscr {P}\) is itself a cone then there is no difference between (4.7) and (4.2). Recall that in Definition 3.3 (the definition of a thermodynamical theory \((\Sigma ,\mathscr {P})\)) we let \({\hat{\mathscr {P}}}:= \text {cl}\,[\textrm{Cone} \,(\mathscr {P})]\).
Definition 4.4
A Kelvin–Planck theory is a thermodynamical theory \((\Sigma ,\mathscr {P})\) such that
5 Hahn–Banach Equivalence of the Kelvin–Planck Second Law and the Existence of Entropy-Temperature Functions of State
The following theorem asserts that, for a thermodynamical theory, compliance with the Kelvin–Planck Second Law is equivalent to the existence of two continuous functions of state, a specific-entropy function and a thermodynamic temperature scale that, taken together, satisfy the Clausius–Duhem inequality for all processes the theory contains. Entropy and thermodynamic temperature emerge simultaneously and almost immediately as a direct consequence of the Hahn–Banach theorem. There is no reliance at all on venerable thermodynamic conceptual machinery in the form of reversible processes, Carnot cycles, heat baths, or even the idea of equilibrium.
In the theorem statement \(\text {C}(\Sigma ,\mathbb {R})\) denotes the set of real-valued continuous functions on \(\Sigma \), and \(\text {C}(\Sigma ,\mathbb {R}_+)\) is the set of positive-valued continuous functions. \(\mathbb {R}_+\) denotes the set of strictly positive real numbers.
Theorem 5.1
(Existence of Entropy and Thermodynamic Temperature) For a thermodynamical theory \((\Sigma ,\mathscr {P})\) the following are equivalent:
(i) \((\Sigma ,\mathscr {P})\) is a Kelvin–Planck theory.
(ii) There exist functions \(\eta \in \text {C}(\Sigma ,\mathbb {R})\) and \(T \in \text {C}(\Sigma ,\mathbb {R}_+)\) such that
The proof of Theorem 5.1 will make use of some fairly straightforward adaptations of ideas (see, for example, [4]) in functional analysis that were unavailable to the thermodynamics pioneers: First, \(\mathscr {V}(\Sigma )\) is a locally convex Hausdorff topological vector space. Second, the compactness of \(\Sigma \) ensures that the convex set
is (weak-star) compact. Finally, if \(f:\mathscr {V}(\Sigma )\rightarrow \mathbb {R}\) is a continuous linear function, then there exist functions \(\alpha (\cdot )\) and \(\beta (\cdot )\) in \(\text {C}(\Sigma ,\mathbb {R})\) such that, for every \((\mathcal {v},\mathcal {w}) \in \mathscr {V}(\Sigma )\),
What follows is the version of the Hahn–Banach theorem that underlies almost all theorems in this article and its companion article, [11].
Theorem 5.2
(Hahn–Banach) Let V be a Hausdorff locally convex topological vector space, and let A and B be non-empty disjoint closed convex subsets of V, with B compact. There is a continuous linear function \(f: V \rightarrow \, \mathbb {R}\) and a number \(\gamma \in \mathbb {R}\) such that
and
In particular, if A is a cone, then
and
Remark 5.3
For proofs of this version of the Hahn–Banach theorem see Theorem 21.12 in [4], Theorem 1.7 in [2], or Corollary 14.4 in [18]. The last sentence of Theorem 5.2 is not usually stated explicitly, but it is an easy consequence of the preceding one.
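For the reader's convenience, here is a sketch of why the cone case follows from the general separation. We assume, as a labeling convention that may differ from the displays of Theorem 5.2, that the general separation reads \(f(a)< \gamma < f(b)\) for all \(a \in A\) and \(b \in B\), and that a cone is closed under multiplication by positive numbers.

```latex
\begin{aligned}
&\text{Fix } a \in A \text{ and } t > 0. \text{ Since } A \text{ is a cone, } ta \in A, \text{ so }
   t\,f(a) = f(ta) < \gamma .\\
&\text{Letting } t \to \infty: \quad f(a) \le 0
   \quad (\text{otherwise } t\,f(a) \text{ would eventually exceed } \gamma).\\
&\text{Letting } t \to 0^{+}: \quad 0 \le \gamma .\\
&\text{Hence, for all } a \in A \text{ and } b \in B: \quad
   f(a) \;\le\; 0 \;\le\; \gamma \;<\; f(b).
\end{aligned}
```

With the opposite labeling of the two sides, the same argument applies to \(-f\).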
We are now in a position to prove Theorem 5.1, the central theorem of this article.
Proof of Theorem 5.1
To prove that (i) implies (ii) we first note for the Kelvin–Planck theory \((\Sigma ,\mathscr {P})\) that, in the Hausdorff locally convex topological vector space \(\mathscr {V}(\Sigma )\), the closed convex cone \({\hat{\mathscr {P}}}\) is disjoint from the convex compact set \((0,\mathscr {M}_{+}^1(\Sigma ))\). From the Hahn–Banach theorem, then, there is a continuous linear function \(f: \mathscr {V}(\Sigma )\rightarrow \mathbb {R}\) such that
and
Moreover, there are functions \(\eta \,(\cdot )\) and \(\beta \,(\cdot )\) in \(\text {C}(\Sigma ,\mathbb {R})\) such that \(f(\cdot ,\cdot )\) has the representation (footnote 15)
Note that for each \(\sigma \in \Sigma \) the Dirac measure \(\delta _{\sigma }\) is a member of \(\mathscr {M}_{+}^1(\Sigma )\). From this, (5.5), and (5.6) it follows that \(\beta (\cdot )\) takes strictly positive values. Letting \(T(\cdot ) = 1/\beta (\cdot )\), we get (5.1) as a consequence of (5.4) and (5.6). This completes the proof that (i) implies (ii).
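The chain of implications in this half of the proof can be laid out step by step. The sign conventions below (f non-positive on the cone, strictly positive on \((0,\mathscr {M}_{+}^1(\Sigma ))\), and the particular form of the representation) are our assumptions, adopted only to make the steps concrete; an equivalent convention would replace f by \(-f\) throughout.

```latex
\begin{aligned}
&\text{Assumed: } f(\mathcal{v},\mathcal{w})
   = \int_\Sigma \beta \, d\mathcal{w} - \int_\Sigma \eta \, d\mathcal{v},
   \qquad f \le 0 \text{ on } \hat{\mathscr{P}},
   \qquad f > 0 \text{ on } (0,\mathscr{M}_+^1(\Sigma)).\\
&\text{Positivity: } \beta(\sigma) = \int_\Sigma \beta \, d\delta_\sigma
   = f(0,\delta_\sigma) > 0 \quad \text{for every } \sigma \in \Sigma .\\
&\text{For } (\Delta\mathcal{m},\mathcal{q}) \in \mathscr{P} \subset \hat{\mathscr{P}}: \quad
   \int_\Sigma \beta \, d\mathcal{q} - \int_\Sigma \eta \, d(\Delta\mathcal{m}) \le 0 .\\
&\text{With } T = 1/\beta: \quad
   \int_\Sigma \eta \, d(\Delta\mathcal{m}) \;\ge\; \int_\Sigma \frac{d\mathcal{q}}{T} .
\end{aligned}
```

Because \(\beta (\cdot )\) is continuous and strictly positive, \(T(\cdot ) = 1/\beta (\cdot )\) is a continuous positive-valued function, as required in (ii). The last line is the Clausius–Duhem form that we take (5.1) to have.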
To prove that (ii) implies (i) we first observe that if the inequality (5.1) is satisfied for a particular \((\Delta \mathcal {m},\mathcal {q})\in \mathscr {P}\), then the inequality is also satisfied by \(\alpha (\Delta \mathcal {m},\mathcal {q})\) for every non-negative number \(\alpha \). For this reason, (ii) implies that the inequality
is satisfied for all \((\mathcal {v},\mathcal {w})\) in \(\textrm{Cone} \,(\mathscr {P})\) and therefore, because both sides of the inequality depend continuously on \((\mathcal {v},\mathcal {w})\), for all \((\mathcal {v},\mathcal {w})\) in \({\hat{\mathscr {P}}}:= \text {cl}\,[\textrm{Cone} \,(\mathscr {P})]\). To show that \((\Sigma ,\mathscr {P})\) is a Kelvin–Planck theory we must show that \({\hat{\mathscr {P}}}\) can contain no member of the form \((0,\mathcal {w})\), where \(\mathcal {w}\) is a nonzero member of \(\mathscr {M}_{+}(\Sigma )\). Because \(T(\cdot )\) is positive-valued, such an element could not satisfy (5.7): for \((0,\mathcal {w})\) the entropy term vanishes, while \(\int _\Sigma (1/T)\, d\mathcal {w}\) is strictly positive. This completes the proof of Theorem 5.1. \(\square \)
Remark 5.4
(Interpretation of (ii)) In Theorem 5.1 (ii) we will, of course, regard (5.1) to be an expression of the Clausius–Duhem inequality, with \(\eta (\cdot )\) and \(T(\cdot )\) playing the roles of specific-entropy (entropy per mass) and thermodynamic temperature functions of state that assign to each \(\sigma \in \Sigma \) a specific-entropy \(\eta (\sigma )\) and a value \(T(\sigma )\) of the thermodynamic temperature.
If, for a physical process, \(\mathcal {m}_i\) and \(\mathcal {m}_f\) are the initial and final conditions of the body suffering the process then, with \(\Delta \mathcal {m}=\mathcal {m}_f - \mathcal {m}_i\), we have
In view of (5.8) we can interpret the integral on the left side of (5.1) to be the difference in the entropy of the body suffering the process between the end of the process and its beginning.
In this sense, Theorem 5.1 tells us that for any Kelvin–Planck theory, there is a notion of the entropy of a body (along with a thermodynamic temperature scale) that aligns with the Gibbs version (1.1) of the Clausius–Duhem inequality with which we began. Note, however, that Theorem 5.1 does much more, for it provides, in the spirit of modern classical physics, a local notion of specific entropy (entropy per mass), as a function of the local state within a body.
If a particular process \((\Delta \mathcal {m},\mathcal {q})\) derives from the data specified in the example of Sect. 3.2.3, the inequality in (ii) can be pulled back to a more traditional description of the Clausius–Duhem inequality, in effect an elaboration of the Gibbs version (1.1) suited to modern continuum physics:
Connections of entropy (with existence derived via [10]) to the theory of partial differential equations (in particular the canonical equations of continuum physics) are discussed by Evans [6].
In preparation for our concluding remarks and for the companion article [11], we record the following definition:
Definition 5.5
(Entropy, Thermodynamic Temperature) Let \((\Sigma ,\mathscr {P})\) be a Kelvin–Planck theory. An element \((\eta ,T)\) of \(\text {C}(\Sigma ,\mathbb {R}) \times \text {C}(\Sigma ,\mathbb {R}_+)\) that satisfies (5.1) is a Clausius–Duhem pair for the theory. A function \(T \in \text {C}(\Sigma ,\mathbb {R}_+)\) is a Clausius–Duhem temperature scale for the theory if there exists \(\eta \in \text {C}(\Sigma ,\mathbb {R})\) such that \((\eta ,T)\) is a Clausius–Duhem pair. In that case, \(\eta (\cdot )\) is a specific-entropy function for the theory (corresponding to the Clausius–Duhem temperature scale \(T(\cdot ))\).
Remark 5.6
(Differentiability of the specific-entropy function and the thermodynamic temperature scale) In applications of the Clausius–Duhem inequality, differentiability of the entropy and temperature with respect to state descriptors often plays a role. Here we focused solely on continuity of these functions. When, for a thermodynamical theory \((\Sigma ,\mathscr {P})\), the state space \(\Sigma \) is such that differentiability of real-valued functions on \(\Sigma \) has meaning, Theorem 5.1 remains true with \(\text {C}(\Sigma ,\mathbb {R})\) replaced by \(\text {C}^{\,k}(\Sigma ,\mathbb {R})\), so long as the same replacement is made in the definition of the topology on \(\mathscr {M}(\Sigma )\), given in footnote 14. That revised topology, which is coarser than the weak-star topology, exerts itself in the definition of \({\hat{\mathscr {P}}}:= \text {cl}\,(\textrm{Cone} \,(\mathscr {P}))\). This is discussed more fully, but in a narrower context, in Remark 10.2 of [9]. Similar considerations apply to the theorems of the companion article [11].
6 Concluding Remarks
In any thermodynamical theory that complies with the Kelvin–Planck Second Law, as expressed by (4.9), Theorem 5.1 asserts that there are invariably specific-entropy and thermodynamic-temperature functions (of the local material state) that together satisfy the Clausius–Duhem condition (5.1). Moreover, the two conditions are equivalent, so any theory for which there is a Clausius–Duhem entropy-temperature pair must comply with the form of the Kelvin–Planck Second Law given by (4.9).
Again, the proof that (i) implies (ii) is immediate. It relies only on the Hahn–Banach Theorem and functional analysis infrastructure unavailable to the brilliant founders of classical thermodynamics. It is worth emphasizing again that, with respect to the existence of Clausius–Duhem entropy-temperature pairs, there is no reliance on reversible processes or notions of thermodynamic equilibrium. There is no requirement that the set of processes contain certain ones of a specified kind. To some extent this will change in the companion article [11], where we consider properties (including uniqueness) of specific-entropy and thermodynamic-temperature functions of state, in particular the relation of those properties to the supply of processes.
Notes
We shall be more precise about this later on. Planck [23] requires that heat exchange between the body and its exterior cannot, on the whole, simply amount to extraction of heat from a single “heat reservoir.”
When we refer to classical or continuum physics, we mean that part of physics that embraces subjects such as fluid mechanics, heat transfer, elasticity theory, and the theory of diffusive and reacting mixtures, in which bodies are regarded as continuous media.
Henri Lebesgue was born in the year that Equilibrium of Heterogeneous Substances [16] was first published.
See also Šilhavý’s book [32].
For example, the local state of a gas might be specified by the local pressure and the local specific volume. For other examples see Section 3.1.
After establishing a thermodynamic temperature function (of the empirical temperature), one suited to the cyclic-process Clausius inequality, Šilhavý [31] went on to construct an entropy, but that entropy is an attribute of an entire body, not the entropy-density function of the local state established here via the Hahn–Banach Theorem.
The idea of a material point is basic to classical physics, wherein reference is freely made to the density, velocity, stress tensor, temperature, or species concentrations at a point within a body. We will always regard a state as an attribute of a material point within a body, not of the body as a whole. For the body as a whole we will refer to its condition.
The Dirac measure \(\delta _{\sigma }\) is defined by the requirement that, for each Borel set \(\Lambda \subset \Sigma \), \(\delta _{\sigma }(\Lambda )\) is either 1 or 0 according to whether \(\sigma \) is or is not a member of \(\Lambda \).
When the compact Hausdorff topology of \(\Sigma \) is given by a metric, in particular in the almost universal case in which the state space is taken to be a subset of \(\mathbb {R}^N\), every finite signed Borel measure is already regular. See Chapter 12 in [3].
The idea of expressing the condition of a body as a measure on a state space was inspired by a paper by Noll [22]. As far as we can recall from private conversations in 1978, James Serrin had invented what we call a heating measure, but on a one-dimensional “hotness manifold.” He later abandoned that in published works, as he came to favor what he called a heat “accumulation function” on the hotness manifold [27, 29]. We do not take hotness as a primitive notion.
Here \(\Sigma \) is understood to carry the Borel \(\sigma \)-algebra.
The weak-star topology on \(\mathscr {M}(\Sigma )\) is its coarsest topology such that, for every continuous function \(\varphi :\Sigma \rightarrow \mathbb {R}\), the map
$$\begin{aligned} \mathcal {v}\in \mathscr {M}(\Sigma )\mapsto \int _\Sigma \varphi \, d \mathcal {v}\end{aligned}$$ is continuous. Then \(\mathcal {v}_0 \in \mathscr {M}(\Sigma )\) is in the weak-star closure of a subset \(S \subset \mathscr {M}(\Sigma )\) if, for every finite sequence \(\varphi _1, \varphi _2,\dots ,\varphi _n\) in \(\textrm{C}(\Sigma ,\mathbb {R})\) and every \(\varepsilon > 0\), there is a \(\mathcal {v}\in S\) such that \(|\int _{\Sigma }\varphi _j d \mathcal {v}- \int _{\Sigma }\varphi _j d \mathcal {v}_0\,| < \varepsilon ,\ j=1,\dots ,n\). Unlike the norm topology, the weak-star topology on \(\mathscr {M}(\Sigma )\) reflects the topology of \(\Sigma \). For example, if \(\sigma _i \rightarrow \sigma \) in \(\Sigma \) as \(i \rightarrow \infty \), then \(\delta _{\sigma _i} \rightarrow \delta _{\sigma }\) with respect to the weak-star topology in \(\mathscr {M}(\Sigma )\), while \(\Vert \delta _{\sigma _i} - \delta _{\sigma }\Vert =2\) for each i with \(\sigma _i \ne \sigma \).
See, for example, §3.14 in [24]. Although for every continuous linear functional g on \(\mathscr {M}(\Sigma )\) there is a unique continuous function \(\varphi \in \text {C}(\Sigma ,\mathbb {R})\) such that \(g(\mu ) = \int _{\Sigma }\varphi \, d \mu ,\ \forall \mu \in \mathscr {M}(\Sigma )\), the situation for \(\mathscr {M}^{\circ }(\Sigma )\), with the topology given earlier, is a little different. In that case, the representing function \(\varphi \) is unique only up to an additive constant.
There might be several histories that different bodies can experience which nevertheless result in the same overall record carried by \(\mathcal {p}\). These different histories might have different durations. Our presumption here is that for each \(\mathcal {p}\) there is at least one such history.
Here it is understood that the functions \(\Delta {\bar{\mathcal {m}}}(\cdot )\) and \({\bar{\mathcal {q}}}(\cdot )\) are particular to the process history under consideration and that \(\Delta {\bar{\mathcal {m}}}(\tau ) = {\bar{\mathcal {m}}}(\tau ) - {\bar{\mathcal {m}}}(t^i_{\mathcal {p}})\), where \({\bar{\mathcal {m}}}(\tau )\) gives the condition at time \(\tau \) of the body suffering the process.
References
Bowen, R.M.: Thermochemistry of reacting materials. J. Chem. Phys. 49(4), 1625–1637 (1968)
Brézis, H.: Functional Analysis, Sobolev Spaces, and Partial Differential Equations. Springer (2011)
Aliprantis, C.D., Border, K.C.: Infinite Dimensional Analysis: A Hitchhiker’s Guide. Springer, Berlin (2013)
Choquet, G.: Lectures on Analysis. W. A. Benjamin (1969)
Coleman, B.D., Noll, W.: The thermodynamics of elastic materials with heat conduction and viscosity. Arch. Ration. Mech. Anal. 13(1), 167–178 (1963)
Evans, L.C.: Entropy and Partial Differential Equations, Lecture Notes UC Berkeley (2004). https://math.berkeley.edu/~evans/entropy.and.PDE.pdf
Feinberg, M.: Foundations of Chemical Reaction Network Theory. Springer, Berlin (2019)
Feinberg, M., Lavine, R.: Preliminary Notes on the Second Law of Thermodynamics. (July 2, 1978) https://zenodo.org/records/10635783
Feinberg, M., Lavine, R.: Thermodynamics based on the Hahn–Banach theorem: the Clausius inequality. Arch. Ration. Mech. Anal. 82(3), 203–293 (1983)
Feinberg, M., Lavine, R.: Foundations of the Clausius-Duhem Inequality. In: J. Serrin (ed.) New Perspectives in Thermodynamics, pp. 49–64. Springer (1986). Also available as Appendix 2A in Truesdell, C., Rational Thermodynamics, Springer (1984)
Feinberg, M., Lavine, R.: Entropy and thermodynamic temperature in nonequilibrium classical thermodynamics as immediate consequences of the Hahn-Banach Theorem: II. Properties. arXiv:2308.10676 [math-ph] (2023)
Fermi, E.: Thermodynamics. Dover Publications (1956)
Feynman, R., Leighton, R.B., Sands, M.; et al.: The Feynman Lectures on Physics, vol. 1. Addison-Wesley, Boston (1965)
Gibbs, J.W.: Graphical methods in the thermodynamics of fluids. In: The Scientific Papers of J. Willard Gibbs, Vol. 1. Dover Publications (1961). Originally published in Transactions of the Connecticut Academy, II, pp. 309-342, April (1873)
Gibbs, J.W.: A method of geometrical representation of the thermodynamic properties of substances by means of surfaces. In: The Scientific Papers of J. Willard Gibbs, Vol. 1. Dover Publications (1961). Originally published in Transactions of the Connecticut Academy, II, pp. 382–404 (1873)
Gibbs, J.W.: On the equilibrium of heterogeneous substances. In: The Scientific Papers of J. Willard Gibbs, Vol. 1. Dover Publications (1961). Originally published in Transactions of the Connecticut Academy, III, pp. 108-224, Oct, 1875-May, 1876 and pp. 343-524, May, 1877-July (1878)
Kammerlander, P., Renner, R.: Tangible phenomenological thermodynamics. arXiv preprint arXiv:2002.08968 (2020)
Kelley, J.L., Namioka, I.: Linear Topological Spaces. Springer, Berlin (1963)
Lebon, G., Jou, D., Casas-Vázquez, J.: Understanding Non-equilibrium Thermodynamics. Springer, Berlin (2008)
Lieb, E.H., Yngvason, J.: The physics and mathematics of the second law of thermodynamics. Phys. Rep. 310(1), 1–96 (1999)
Lieb, E.H., Yngvason, J.: The entropy concept for non-equilibrium states. Proc. R. Soc. A 469:20130408(2158) (2013)
Noll, W.: On certain convex sets of measures and on phases of reacting mixtures. Arch. Ration. Mech. Anal. 38(1), 1–12 (1970)
Planck, M.: Treatise on Thermodynamics, 5th edn. Dover Publications (1945)
Rudin, W.: Functional Analysis, 2nd edn. McGraw-Hill Publishing, New York (1991)
Serrin, J.: Foundations of Classical Thermodynamics: Lecture Notes. University of Chicago, Department of Mathematics (1975)
Serrin, J.: The concepts of thermodynamics. In: de La Penha, G.M., Madeiros, L.A. (eds.) Contemporary Developments in Continuum Mechanics and Partial Differential Equations, North-Holland Mathematics Studies, vol. 30, pp. 411–451. Elsevier, New York (1978)
Serrin, J.: Conceptual analysis of the classical second laws of thermodynamics. Arch. Ration. Mech. Anal. 70(4), 355–371 (1979)
Serrin, J. (ed.): New Perspectives in Thermodynamics. Springer, Berlin (1986)
Serrin, J.: An outline of thermodynamical structure. In: Serrin, J. (ed.) New Perspectives in Thermodynamics, pp. 3–32. Springer, Berlin (1986)
Šilhavý, M.: On measures, convex cones, and foundations of thermodynamics I. Systems with vector-valued actions. Czechoslovak J. Phys. B 30(8), 841–861 (1980)
Šilhavý, M.: On measures, convex cones, and foundations of thermodynamics II. Thermodynamic Systems. Czechoslovak J. Phys. 30(9), 961–991 (1980)
Šilhavý, M.: The Mechanics and Thermodynamics of Continuous Media. Springer, Berlin (1997)
Simon, B.: Convexity: An Analytic Viewpoint. Cambridge University Press, Cambridge (2011)
Truesdell, C.: Rational Thermodynamics, 2nd edn. Springer, Berlin (1984)
Yngvason, J.: A direct road to entropy and the second law of thermodynamics. arXiv preprint arXiv:2202.07982 (2022)
Additional information
Communicated by C. Dafermos.
Appendix: The Convexity of \({\hat{\mathscr {P}}} \)
In the main body of this article, the set \(\mathscr {P}\ \subset \mathscr {V}(\Sigma )\) carried information about the totality of outcomes admitted by processes within a particular thermodynamical theory. In this appendix we will argue that, in natural theories, \(\mathscr {P}\) can be expected to have a special structure. In particular, we will provide support for the presumption in the main text that \({\hat{\mathscr {P}}}: =\text {cl}\,[\text {Cone}\,(\mathscr {P})]\) is convex.
Recall that an element of \(\mathscr {P}\), say \(\mathcal {p}= (\Delta \mathcal {m}, \mathcal {q})\), provides information about the overall result of a particular process, with \(\Delta \mathcal {m}:= \mathcal {m}(t^f) - \mathcal {m}(t^i)\) giving the overall difference, from the initial time to the final time, in the condition of the body suffering the process and with \(\mathcal {q}\) giving the process’s overall heating measure. Although the emphasis has been on the overall outcome, a physical process nevertheless evolves over time in the instants between its inception and completion.
With this in mind, we take for granted that each process \(\mathcal {p}\in \mathscr {P}\) can be associated with a physical history experienced by a particular body (footnote 16). In particular, with \(\mathcal {p}\) we can associate a closed time interval \([t^i_{\mathcal {p}},t^f_{\mathcal {p}}]\), where \(t^i_{\mathcal {p}}\) is the initial time at which the process begins and \(t^f_{\mathcal {p}}\) is the final time at which it ends. The duration of the process history is the positive number \(t^f_{\mathcal {p}} - t^i_{\mathcal {p}}\).
Moreover, we assume that we can associate, at every instant in \([t_{\mathcal {p}}^i,t_{\mathcal {p}}^f]\), a specification of the difference between the body’s current condition and its condition at the process’s inception, and that we can also associate a specification of the current (cumulative) heating measure. That is, with process \(\mathcal {p} = (\Delta \mathcal {m}, \mathcal {q})\) we assume that there is a continuous process history (footnote 17),
such that \({\bar{\mathcal {q}}}(t^i_{\mathcal {p}}) =0\), \({\bar{\mathcal {q}}}(t^f_{\mathcal {p}}) = \mathcal {q}\), \(\Delta {\bar{\mathcal {m}}}(t^i_{\mathcal {p}}) = 0\), and \(\Delta {\bar{\mathcal {m}}}(t^f_{\mathcal {p}}) = \Delta \mathcal {m}\).
Remark A.1
Given the process history described above, we take for granted that, for each closed time interval contained within \([t^i_{\mathcal {p}},t^f_{\mathcal {p}}]\), there is another member of \(\mathscr {P}\), say \(\mathcal {p}^* = (\Delta \mathcal {m}^*, \mathcal {q}^*)\), corresponding to the restriction of the given process history to that smaller time interval. That is, if \([t^i_{\mathcal {p}^*},t^f_{\mathcal {p}^*}]\) is the smaller time interval, then
With this as background, what follows is a brief list of properties we assume to be possessed by the set of process histories associated with \(\mathscr {P}\). Each property will be accompanied by a rationale. Taken together, these properties will shed light on the geometric structure of \(\mathscr {P}\).
Property 1
If \(\mathcal {p}_1\) and \(\mathcal {p}_2\) are members of \(\mathscr {P}\) associated with process histories of identical duration, then \(\mathcal {p}_1 + \mathcal {p}_2\) is also a member of \(\mathscr {P}\) having associated with it a history of that same duration.
Rationale
If the two processes \(\mathcal {p}_1 = (\Delta \mathcal {m}_1, \mathcal {q}_1)\) and \(\mathcal {p}_2 = (\Delta \mathcal {m}_2, \mathcal {q}_2)\) are experienced by bodies \(\mathscr {B}_1\) and \(\mathscr {B}_2\) then those same processes can be run simultaneously with copies of \(\mathscr {B}_1\) and \(\mathscr {B}_2\) at remote locations (or, more generally, thermally insulated from each other). The union of the bodies is again a body. The process experienced by the union will have \(\Delta \mathcal {m}_1 + \Delta \mathcal {m}_2\) as the body’s change of condition and \(\mathcal {q}_1 + \mathcal {q}_2\) as its heating measure. Thus, \(\mathcal {p}_1 + \mathcal {p}_2\) is a member of \(\mathscr {P}\).
Remark A.2
If \(\mathcal {p}\) is a member of \(\mathscr {P}\) there is, by supposition, a process history associated with it. It is a consequence of Property 1 (and its rationale) that, for any positive integer n, \(n\mathcal {p}\) is also a member of \(\mathscr {P}\).
Property 2
If \(\mathcal {p}=(\Delta \mathcal {m}, \mathcal {q}) \in \mathscr {P}\) has associated with it a process history of duration d then, for any positive integer N, \(\mathcal {p}\) also has associated with it a history of duration d/N.
Rationale
The time interval for the given process history can be regarded to be the union of N sequential (closed) time intervals, each of duration d/N. With each such smaller interval we can associate, as in Remark A.1, a sub-process history. Using N copies of the original body suffering the process, we can execute those N sub-process histories simultaneously, as in the rationale for Property 1. The union of the N body-copies is again a body, this one suffering a process of duration d/N. By virtue of Property 1 the overall change of condition will again be \(\Delta \mathcal {m}\) and the overall heating measure will again be \(\mathcal {q}\).
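The bookkeeping behind this rationale is a telescoping sum. In the sketch below the subdivision times \(t_k\) and the sub-process labels \(\mathcal {p}^*_k\) are our notation, and we assume, following Remark A.1, that the sub-process on \([t_{k-1},t_k]\) records the increments of \(\Delta {\bar{\mathcal {m}}}(\cdot )\) and \({\bar{\mathcal {q}}}(\cdot )\) over that interval.

```latex
\begin{aligned}
t_k &:= t^i_{\mathcal{p}} + k\,\frac{d}{N}, \qquad k = 0,1,\dots,N,
   \qquad t_0 = t^i_{\mathcal{p}}, \quad t_N = t^f_{\mathcal{p}},\\
\mathcal{p}^*_k &:= \big(\Delta\bar{\mathcal{m}}(t_k) - \Delta\bar{\mathcal{m}}(t_{k-1}),\;
   \bar{\mathcal{q}}(t_k) - \bar{\mathcal{q}}(t_{k-1})\big), \qquad k = 1,\dots,N,\\
\sum_{k=1}^{N} \mathcal{p}^*_k
   &= \big(\Delta\bar{\mathcal{m}}(t_N) - \Delta\bar{\mathcal{m}}(t_0),\;
      \bar{\mathcal{q}}(t_N) - \bar{\mathcal{q}}(t_0)\big)
    = (\Delta\mathcal{m} - 0,\; \mathcal{q} - 0) = \mathcal{p}.
\end{aligned}
```

Each \(\mathcal {p}^*_k\) has a history of duration d/N, so running the N sub-processes simultaneously yields a history of duration d/N whose overall record is again \(\mathcal {p}\).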
Property 3
Suppose that two members of \(\mathscr {P}\), say \(\mathcal {p}_1\) and \(\mathcal {p}_2\), are associated with histories of durations \(d_1\) and \(d_2\). If \(d_1/d_2\) is rational, then \(\mathcal {p}_1+\mathcal {p}_2\) is also a member of \(\mathscr {P}\).
Rationale
This is just a consequence of Properties 1 and 2: Suppose that \(d_1/d_2 = N_1/N_2\), where \(N_1\) and \(N_2\) are positive integers. Then, from Property 2, \(\mathcal {p}_1\) and \(\mathcal {p}_2\) can be associated with two histories of identical duration, \(d_1/N_1 = d_2/N_2\). From Property 1 it follows that \(\mathcal {p}_1+\mathcal {p}_2\) is a member of \(\mathscr {P}\).
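For instance (the numbers are ours, chosen only for illustration), take \(d_1 = 6\) and \(d_2 = 4\) in any common unit of time:

```latex
\begin{aligned}
\frac{d_1}{d_2} &= \frac{6}{4} = \frac{3}{2} = \frac{N_1}{N_2},
   \qquad N_1 = 3, \quad N_2 = 2,\\
\frac{d_1}{N_1} &= \frac{6}{3} = 2 = \frac{4}{2} = \frac{d_2}{N_2}.
\end{aligned}
```

Property 2 then supplies histories of common duration 2 for both processes, and Property 1 applies to their sum.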
We will not assume that we can always associate with \(\mathcal {p}\in \mathscr {P}\) a process history of rational duration. Nevertheless, we will assume that there is invariably a nearby element of \(\mathscr {P}\) that can be associated with a rational-duration process history. This is made precise in the following way:
Property 4
If \(\mathcal {p}\) is an element of \(\mathscr {P}\) and \(\mathscr {O}\subset \mathscr {V}(\Sigma )\) is any open neighborhood of \(\mathcal {p}\), there is in \(\mathscr {O}\) an element of \(\mathscr {P}\) that can be associated with a process history of rational duration.
Rationale
This is just a consequence of the natural assumption that \(\mathcal {p}\) can be associated with a process history that is continuous in time. In fact, for Property 4 to obtain it is sufficient that \(\mathcal {p}\) can be associated with a process history that is merely continuous at its final time.
In the next proposition we assume that, for the thermodynamical theory \((\Sigma , \mathscr {P})\) under consideration, the set \(\mathscr {P}\) and the set of histories associated with its elements possess Properties 1–4.
Proposition A.3
\({\hat{\mathscr {P}}}: =\text {cl}\,[\text {Cone}\,(\mathscr {P})]\) is a convex subset of \(\mathscr {V}(\Sigma )\).
Proof
We need to show that if \(x_1\) and \(x_2\) are members of \({\hat{\mathscr {P}}}\) and \(\alpha \) is a number between 0 and 1, then \(\alpha x_1 + (1-\alpha )x_2\) is also a member of \({\hat{\mathscr {P}}}\). It is not difficult to show that, in a topological vector space, the closure of a cone is again a cone, whereupon \({\hat{\mathscr {P}}}\) is a cone. In particular \(\alpha x_1\) and \((1-\alpha )x_2\) are members of \({\hat{\mathscr {P}}}\). Therefore, to establish that \({\hat{\mathscr {P}}}\) is convex, it is enough to prove the following lemma:
Lemma A.4
If \(v_1\) and \(v_2\) are members of \({\hat{\mathscr {P}}}\) then so is \(v_1 + v_2\).
Proof
Our aim is to show that if, in the topological vector space \(\mathscr {V}(\Sigma )\), \(\mathscr {O}\) is an open neighborhood of \(v_1 + v_2\), then \(\mathscr {O}\) contains an element of \(\textrm{Cone} \,(\mathscr {P})\).
Because vector addition \(\mathscr {V}(\Sigma )\times \mathscr {V}(\Sigma )\rightarrow \mathscr {V}(\Sigma )\) is continuous, there are open sets \(\mathscr {O}_{\,1}\) and \(\mathscr {O}_{\,2}\) in \(\mathscr {V}(\Sigma )\) containing \(v_1\) and \(v_2\) respectively such that the set
\[\mathscr {O}_{\,1} + \mathscr {O}_{\,2} := \{x_1 + x_2 : x_1 \in \mathscr {O}_{\,1},\ x_2 \in \mathscr {O}_{\,2}\}\]
is contained in \(\mathscr {O}\).
By supposition, \(v_1\) is a member of \(\text {cl}\,[\text {Cone}\,(\mathscr {P})]\). Because \(\mathscr {O}_{\,1}\) is an open neighborhood of \(v_1\), there must be a member of \(\text {Cone}\,(\mathscr {P})\) in \(\mathscr {O}_{\,1}\). That is, \(\mathscr {O}_{\,1}\) must contain a member of the form \(\alpha _1\mathcal {p}_1\), with \(\alpha _1\) a positive number and \(\mathcal {p}_1\) a member of \(\mathscr {P}\). Because, in the topological vector space \(\mathscr {V}(\Sigma )\), scalar multiplication \(\mathbb {R} \times \mathscr {V}(\Sigma )\rightarrow \mathscr {V}(\Sigma )\) is continuous, there is an open interval \(I_1 \subset \mathbb {R}_+\) containing \(\alpha _1\) and an open neighborhood \({\hat{\mathscr {O}}}_1\) of \(\mathcal {p}_1\) such that the set
\[\{\alpha \mathcal {p} : \alpha \in I_1,\ \mathcal {p} \in {\hat{\mathscr {O}}}_1\}\]
is contained in \(\mathscr {O}_{\,1}\).
In particular, there is a rational number \(\alpha ^*_1 \in I_1\) and, from Property 4, an element \(\mathcal {p}^*_1 \in {\hat{\mathscr {O}}}_1\) associated with a process history of rational duration such that \(\alpha ^*_1\mathcal {p}^*_1\) is a member of \(\mathscr {O}_{\,1}\). Similarly, \(\mathscr {O}_{\,2}\) contains a member of the form \(\alpha ^*_2\mathcal {p}^*_2\), where \(\alpha ^*_2\) is a positive rational number and \(\mathcal {p}^*_2 \in \mathscr {P}\) has associated with it a process history of rational duration. Because \(\mathscr {O}_{\,1} + \mathscr {O}_{\,2}\) is contained in \(\mathscr {O}\), we have the inclusion
\[\alpha ^*_1\mathcal {p}^*_1 + \alpha ^*_2\mathcal {p}^*_2 \in \mathscr {O}.\]
It remains to be shown that \(\alpha ^*_1\mathcal {p}^*_1 + \alpha ^*_2\mathcal {p}^*_2\) is a member of \(\textrm{Cone} \,(\mathscr {P})\).
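Let \(n_1, m_1, n_2, m_2\) be positive integers such that \(\alpha _1^* = n_1/ m_1\) and \(\alpha _2^* = n_2/ m_2\). Thus, putting the two terms over the common denominator \(m_1m_2\), we have the inclusion
\[\frac{1}{m_1m_2}\left( n_1m_2\mathcal {p}^*_1 + n_2m_1\mathcal {p}^*_2\right) \in \mathscr {O}.\]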
From Remark A.2 it follows that \(n_1m_2\mathcal {p}^*_1\) and \(n_2m_1\mathcal {p}^*_2\) are members of \(\mathscr {P}\) having associated with them individual process histories of rational durations (identical to those associated with \(\mathcal {p}^*_1\) and \(\mathcal {p}^*_2\), respectively). From Property 3, then, their sum \(\mathcal {p}^{**}:= n_1m_2\mathcal {p}^*_1 + n_2m_1\mathcal {p}^*_2\) is a member of \(\mathscr {P}\), so we have the inclusion
\[\frac{1}{m_1m_2}\,\mathcal {p}^{**} \in \mathscr {O}.\]
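Thus, there is a member of \(\text {Cone}\,(\mathscr {P})\), namely the positive multiple \(\mathcal {p}^{**}/(m_1m_2)\) of \(\mathcal {p}^{**} \in \mathscr {P}\), that lies in \(\mathscr {O}\). This is what we wanted to prove. \(\square \)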
This completes the proof of Proposition A.3. \(\square \)
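The only arithmetic in the proof of Lemma A.4 is the passage from a positive rational combination \(\alpha ^*_1\mathcal {p}^*_1 + \alpha ^*_2\mathcal {p}^*_2\) to a single positive multiple of an integer combination. The following minimal sketch checks that step with exact rational arithmetic. It is an illustration only: the function name `common_denominator_form` is hypothetical, and the two-component tuples `p1` and `p2` are arbitrary stand-ins for elements of \(\mathscr {P}\), which in the paper live in \(\mathscr {V}(\Sigma )\) rather than in a two-dimensional space.

```python
from fractions import Fraction


def common_denominator_form(alpha1: Fraction, alpha2: Fraction):
    """Return (k1, k2, scale) such that
    alpha1*p1 + alpha2*p2 == scale * (k1*p1 + k2*p2),
    with k1, k2 positive integers and scale a positive rational.
    Assumes alpha1 and alpha2 are positive rationals, as in the lemma.
    """
    n1, m1 = alpha1.numerator, alpha1.denominator
    n2, m2 = alpha2.numerator, alpha2.denominator
    k1 = n1 * m2                    # integer weight on p1
    k2 = n2 * m1                    # integer weight on p2
    scale = Fraction(1, m1 * m2)    # the positive cone multiplier 1/(m1*m2)
    return k1, k2, scale


# Hypothetical stand-ins for p1* and p2*; any vectors over the rationals work.
p1 = (Fraction(1), Fraction(-2))
p2 = (Fraction(3), Fraction(1, 2))
alpha1, alpha2 = Fraction(2, 3), Fraction(5, 4)

k1, k2, scale = common_denominator_form(alpha1, alpha2)

# Left side: the rational combination; right side: scaled integer combination.
lhs = tuple(alpha1 * a + alpha2 * b for a, b in zip(p1, p2))
rhs = tuple(scale * (k1 * a + k2 * b) for a, b in zip(p1, p2))
assert lhs == rhs
```

The sketch verifies only the algebraic identity. That the integer combination itself belongs to \(\mathscr {P}\) is a consequence of Remark A.2 and Property 3, not of the arithmetic.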
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Feinberg, M., Lavine, R.B. Entropy and Thermodynamic Temperature in Nonequilibrium Classical Thermodynamics as Immediate Consequences of the Hahn–Banach Theorem: I. Existence. Arch Rational Mech Anal 248, 45 (2024). https://doi.org/10.1007/s00205-024-01986-w