
8.1 Introduction

In his Treatise on Electricity and Magnetism, James Clerk Maxwell closes the first section on “Description of Phenomena” (Ch. 1, Experiment I, Art. 27, last paragraph) with the following two statements [1]: “No force, either of attraction or of repulsion, can be observed between an electrified body and a body not electrified. When, in any case, bodies not previously electrified are observed to be acted on by an electrified body, it is because they have become electrified by induction” (Footnote 1) [italics in the original]. Although the former statement appears so trivial that it is rarely even stated explicitly in elementary textbooks, its correct interpretation requires extreme care.

In the following few pages of the Treatise (culminating with Art. 34), Maxwell recounts the methodology followed by Faraday (see Ref. [4], p. 279) “in his very admirable demonstration of the laws of electrical phenomena.” After summarizing experiments conducted by means of a hollow vessel connected to a gold leaf electrometer, he logically concludes that “the electrification of a body is therefore a physical quantity capable of measurement” and “we are therefore entitled to speak of any electrified body as ‘charged with a certain quantity of positive or negative electricity’.” Finally, the law determining the force between electrified “bodies of dimensions small compared with the distance between them” is given as established by means of the torsion balance devised by Michell, used by Cavendish and “successfully applied” (Art. 38) by Coulomb. That is the relationship referred to today as Coulomb’s law for the electrostatic force, \({\mathbf {F_{Coul}}}\), between two point-like charges, \(q_{1,2}\), at a mutual distance \(r_{12}\), that is, \({\mathbf {F_{Coul}}} = q_1 q_2\, {\hat {\mathbf {r}}_{12}}/r_{12}^2\), in appropriate units. In light of this reasoning, Maxwell’s cornerstone statement that “No force, either of attraction or of repulsion, can be observed between an electrified body and a body not electrified” appears irrefutable since, mathematically speaking, the force between any two point-like charges vanishes if the charge of either is zero, that is, if either body is electrically neutral (if \(q_i = 0\), with i = 1 or 2, then \({\mathbf {F_{Coul}}} = 0\)).

This might be sufficient, at least classically, if physical reality presented us only with two point-like particles. However, as Maxwell explains (Art. 28, Experiment II), charge distribution in extended objects is affected by external charges. If one finite body is neutral, existing charges within it will be redistributed either by moving freely on its surface (in the case of conductors), or by locally producing dipoles (in the case of insulators). Since the size of the objects involved does not vanish with respect to their mutual distances, this will produce a net force even if a body was “not previously electrified,” as Maxwell indicates in his latter statement (Art. 27). These electrostatic forces due to polarization are well known even to primary and secondary school pupils, who are taught why they can pick up (electrically neutral) pieces of paper with a comb that has been rubbed through their hair [5, 6]. However, such phenomena occur even on the sub-atomic scale and indeed recent computations have shown that the neutron-proton (Footnote 2) (and neutron-neutron) electrodynamical Casimir-Polder interactions are expected to be “quite relevant” [9]. Despite the existence of electrostatic forces even between one charged and one neutral object, an apparently inescapable reading of Maxwell’s latter statement is that, if two bodies separated by empty space are both electrically neutral, and if an independent means to produce polarization is absent, they should not interact with each other, that is, no electrical force should exist between them.

The present contribution is motivated by the fact that, in conflict with such an apparently straightforward conclusion, forces among neutral objects have been well known to exist for millennia from direct observation of the physical world and they have also played an early, vital role within atomistic philosophy, which obviously depends on interatomic forces in order to explain the universe as humans experience it [10]. During the emergence of modern science, the cohesion of polished marbles was reported by Boyle [11], who conducted fascinating experiments by means of the vacuum pump he had invented [12, 13], and by Newton [14]. In 1840, as polishing technology developed, strong adhesion was reported by Whitworth also in the case of metal surfaces [15], and this phenomenon was further investigated by Tyndall, who ruled out atmospheric pressure as the cause of attraction [16]. The strong interaction of highly polished metal surfaces eventually led to the invention of the so-called Johansson gauge blocks, which are mentioned by Richard Feynman as a “fairly direct” demonstration of intermolecular forces (Ref. [17], p. 12–6) and are possibly the earliest technology enabled by engineering dispersion forces (for a full account and significance of these developments, see Ref. [18]). It is important to stress, therefore, that Casimir’s discovery [19] was not, as often erroneously stated, that neutral metal plates attract but that there exists a deep physical meaning of that already well known fact [20].

Here we shall briefly touch on four different facets of this explosively expanding research field. Firstly, from the pedagogical standpoint, we intend to provide a relatively gentle introduction to the standard foundations of dispersion force theory. However, in order to provide motivated readers with powerful tools needed to reproduce results from the research literature, we shall pursue a computer algebra approach. The package adopted herein is Mathematica (v. 9.0.1.0), without any pretense whatsoever of programming expertise on the side of the author and indeed used with a fair amount of wilful unsophistication – and no guarantees – so as to leave ample room for improvement.

Secondly, we shall emphasize several aspects of the Casimir effect that are amenable to classical or semi-classical treatments. One goal of this approach is to remind readers that, contrary to what is routinely claimed, the Casimir effect does not provide direct proof of field quantization such as is given by single photon detection. This may dissatisfy some purists, who repeat (Footnote 3) that, for instance, since the Casimir force is proportional to Planck’s constant, ħ, its existence must necessarily be proof that the electromagnetic field is quantized. Such a conclusion is well known to be demonstrably incorrect (for this debate, see Sect. 8.4 herein, Ref. [18] and Ref. [22], Sect. 8.3). The interpretation of the Casimir force in terms of radiation pressure is derived from Casimir’s own original suggestion, in his foundational paper, that “This force may be interpreted as a zero point pressure of electromagnetic waves” [19]. The epistemological and ontological debates as to whether a zero-point field is compatible with classical electrodynamics are also briefly discussed below (Sect. 8.4, and References herein). A second goal of this approach, again pedagogically, is to contribute a working mental image of the physical origin of dispersion forces for use by educators as they help students at all levels develop their intuition for a concept that, as clearly shown by intense recent focus by physics and chemistry education researchers, is notoriously complex to grasp [23,24,25].

The third aspect, mainly left to Suggested Exercises, is concerned with application of dispersion forces to fundamental science and to technology on various scales including, of course, in nanotechnology, with an emphasis on processes leading to energy exchange by dispersion force manipulation.

Finally, as our fourth perspective, throughout this presentation we shall endeavor to provide insight into and further references regarding some historical developments surrounding the field, which is well known to be mired in multiple, hotly debated controversies.

It will quickly become apparent that dispersion force research has undergone a tremendous expansion in the last few decades. This is clearly neither the place for a technical review of the subject nor, indeed, for a review of the several reviews now available. Readers interested in becoming familiar with such reviews and the many subfields now developing may refer, for instance, to a past analysis by this author written within a space propulsion technology context [26]. Two reviews by the present author are also to appear, again considering the aerospace marketplace, especially focused on historical and future developments in the process of technology transfer of dispersion force enabled technologies – including the effect of controversies on their economic and industrial implications – with several hundred relevant references [18]. A non-mathematical introduction aimed at the level of a well-educated audience is also available [27].

8.2 Intermolecular Forces: Fundamental Results

In this section, we proceed by building a chain of arguments leading to an intuitive understanding of the physical forces between molecules as well as between macroscopic boundaries. Notice that here we follow a typesetting style similar to that used by Dubin [28], with input information (Mathematica [In] prompt) given within the LaTeX verbatim environment whereas the output (Mathematica [Out] prompt) is reproduced by exporting the Mathematica result to LaTeX and displaying it within the equation environment. In some cases, typographical needs mandated departures from this workflow without altering any results. Notice also that the appearance of some characters may be different in this document than within the notebook (e.g., \[Omega] → ω).

8.2.1 London Expression of the van der Waals Force

Let us commence with a standard non-relativistic treatment [29, 30] of the unretarded interaction of two polarizable one-electron atoms (Footnote 4) as originally given by London [32] with the further restriction to one dimension (1D) [33, 34]. Consider two atoms arranged along, for instance, the x-axis at a mutual distance R with their electrons at a distance \(z_1\) and \(z_2\) from their respective nuclei. The atomic electrostatic interaction potential, \(V_{\mathrm{int}}\), expanded to 1st order in \((z_1, z_2)\), is:

Clear["Global`*"];
Vint[z1_, z2_, R_] = e^2 ((1/R) + (1/(R + z2 - z1)) - (1/(R - z1)) - (1/(R + z2)))
VintExpanded[z1_, z2_, R_] = Normal[Series[Vint[z1, z2, R], {z1, 0, 1}, {z2, 0, 1}]]

$$\displaystyle \begin{aligned} -\frac{2 e^2 \text{z1}\, \text{z2}}{R^3} \end{aligned} $$
(8.1)

so that the total electrostatic potential to the same order, V tot, becomes:

VtotExpanded[z1_, z2_, R_] = (1/2) k z1^2 + (1/2) k z2^2 + VintExpanded[z1, z2, R]

$$\displaystyle \begin{aligned} -\frac{2 e^2\,\, \text{z1}\,\, \text{z2}}{R^3}+\frac{k\,\, (\text{z1})^2}{2}+\frac{k\,\, (\text{z2})^2}{2} \end{aligned} $$
(8.2)
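The leading term in Eq. (8.1) can also be checked numerically outside the computer algebra system. The following is a minimal Python sketch, not part of the original notebook; the function names and the numerical values (e = 1, R = 10 in arbitrary units) are illustrative assumptions. It compares the exact 1D interaction with the dipole-dipole term for equal displacements.

```python
# Numerical check of Eq. (8.1): exact 1D interaction vs. its leading term.
# Arbitrary units; e = 1 and the displacements below are illustrative assumptions.

def v_int_exact(z1, z2, R, e=1.0):
    """Exact electrostatic interaction of the two 1D atoms (same form as Vint)."""
    return e**2 * (1.0 / R + 1.0 / (R + z2 - z1) - 1.0 / (R - z1) - 1.0 / (R + z2))


def v_int_dipole(z1, z2, R, e=1.0):
    """Leading (dipole-dipole) term of the expansion, Eq. (8.1)."""
    return -2.0 * e**2 * z1 * z2 / R**3


R = 10.0
for z in (0.4, 0.2, 0.1):
    exact = v_int_exact(z, z, R)
    approx = v_int_dipole(z, z, R)
    rel_err = abs(exact - approx) / abs(approx)
    print(f"z/R = {z / R:.3f}  exact = {exact:.6e}  dipole = {approx:.6e}  rel. error = {rel_err:.2e}")
```

For equal displacements the relative error should shrink roughly as (z/R)², which is the expected size of the first neglected term.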

This result allows us to extract the secular (or characteristic) equation [35, 36] as follows:

Az1z2 = {{2 SeriesCoefficient[VtotExpanded[z1, z2, R], {z1, 0, 2}],
          SeriesCoefficient[VtotExpanded[z1, z2, R], {z1 z2, 0, 1}]},
         {SeriesCoefficient[VtotExpanded[z1, z2, R], {z1 z2, 0, 1}],
          2 SeriesCoefficient[VtotExpanded[z1, z2, R], {z2, 0, 2}]}};
Az1z2 // MatrixForm
SecularEquation[z1_, z2_, R_] = -\[Omega]^2 m IdentityMatrix[2] + Az1z2;
SecularEquation[z1, z2, R] // MatrixForm

$$\displaystyle \begin{aligned} \left( \begin{array}{cc} k-m \omega ^2 & -\displaystyle{\frac{2 e^2}{R^3}} \\ -\displaystyle{\frac{2 e^2}{R^3}} & k-m \omega ^2 \\ \end{array} \right) \end{aligned} $$
(8.3)

The (positive) normal mode frequencies, (ω1, ω2), are found from the determinant of the secular equation:

\[Omega]1[z1_, z2_, R_] = Solve[Det[SecularEquation[z1, z2, R]] == 0, \[Omega]][[2, 1, 2]]
\[Omega]2[z1_, z2_, R_] = Solve[Det[SecularEquation[z1, z2, R]] == 0, \[Omega]][[4, 1, 2]]

$$\displaystyle \begin{aligned} \frac{\sqrt{k R^3-2 e^2}}{\sqrt{m} R^{3/2}} \end{aligned} $$
(8.4)
$$\displaystyle \begin{aligned} \frac{\sqrt{2 e^2+k R^3}}{\sqrt{m} R^{3/2}} \end{aligned} $$
(8.5)

In order to extract the standard expression, it is necessary to expand this result to 2nd order in the dimensionless parameter \(\beta \equiv e^2/(k R^3)\) near β = 0 (the 1st-order terms of the two expansions cancel each other out when the two frequencies are added). For instance, for the first frequency, ω1, we find, upon remultiplying by the natural (unperturbed) frequency \(\omega_0=\sqrt {k/m}\):

\[Omega]1exp[z1_, z2_, R_] = Expand[(k/m)^(1/2) (Normal[
    Series[Sqrt[Apart[((\[Omega]1[z1, z2, R])^2/(k/m))] /. (e^2/(k R^3)) -> \[Beta]],
     {\[Beta], 0, 2}]] /. \[Beta] -> (e^2/(k R^3)))]

$$\displaystyle \begin{aligned} +\, \sqrt{\frac{k}{m}}-\frac{e^2 \sqrt{\displaystyle{\frac{k}{m}}}}{k R^3}-\frac{e^4 \sqrt{\displaystyle{\frac{k}{m}}}}{2 k^2 R^6} \end{aligned} $$
(8.6)

and analogously for ω2, which yields a positive sign in the second term. Finally, by writing the total energy for the two oscillators in their ground states and by introducing the classical static polarizability, \(\alpha = e^2/k\), we find, for the unretarded van der Waals interaction energy, \(V_{\mathrm{vdW}}\):

Eosc[z1_, z2_, R_] = Expand[(1/2) \[HBar] (\[Omega]1exp[z1, z2, R] + \[Omega]2exp[z1, z2, R])] /. Sqrt[k/m] -> \[Omega]0;
DeltaE[z1_, z2_, R_] = Eosc[z1, z2, R] - \[HBar] \[Omega]0 /. e^4/k^2 -> \[Alpha]^2

$$\displaystyle \begin{aligned} -\frac{\alpha ^2\, \omega \text{0}\, \hbar }{2\, R^6} {} \end{aligned} $$
(8.7)

Orientational averaging in the case of three dimensions replaces our numerical constant 1/2 by 3/4, which is the result by London [32].
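The 1D chain of results above, from the normal mode frequencies of Eqs. (8.4) and (8.5) to the energy shift of Eq. (8.7), can be cross-checked numerically. The sketch below is an independent illustration in Python rather than the author's notebook; the function name `london_1d` and the unit choices (k = m = e = ħ = 1) are assumptions made only for this example.

```python
import math


def london_1d(R, k=1.0, m=1.0, e=1.0, hbar=1.0):
    """Return (exact, perturbative) ground-state energy shifts of the coupled pair.

    exact:        (hbar/2) (omega1 + omega2) - hbar omega0, using Eqs. (8.4)-(8.5)
    perturbative: -hbar omega0 alpha^2 / (2 R^6), Eq. (8.7), with alpha = e^2 / k
    """
    omega0 = math.sqrt(k / m)
    beta = e**2 / (k * R**3)  # dimensionless coupling; requires 2 * beta < 1
    omega1 = omega0 * math.sqrt(1.0 - 2.0 * beta)
    omega2 = omega0 * math.sqrt(1.0 + 2.0 * beta)
    exact = 0.5 * hbar * (omega1 + omega2) - hbar * omega0
    alpha = e**2 / k
    perturbative = -hbar * omega0 * alpha**2 / (2.0 * R**6)
    return exact, perturbative


for R in (2.0, 4.0, 8.0):
    exact, pert = london_1d(R)
    print(f"R = {R:4.1f}  exact = {exact:.6e}  Eq. (8.7) = {pert:.6e}  ratio = {exact / pert:.6f}")
```

The ratio should approach 1 as R grows, since the neglected terms are of higher order in β.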

8.2.1.1 Suggested Exercise 1

By building upon the above syntax (or by developing your own), generalize the above approach to three atoms. This case is treated in a pedagogical manner by Farina, Santos, and Tort [37] who, in the process, recover a very important result by Axilrod and Teller [38]. Show that, for some particular configurations of the three atoms, the mutual force is repulsive. This result, obtained before Casimir’s papers and several years earlier than the development of the Lifshitz theory, demonstrated that unretarded van der Waals forces are not always attractive – a finding with important technological applications [18].

8.2.2 Van der Waals Forces Between Half-Infinite Semispaces

Under the assumption of pair-wise additivity, we shall now consider the van der Waals force between two parallel-plane, semi-infinite slabs separated by an empty gap of width s, as done by de Boer [39, 40]. The standard approach consists of choosing, for instance, the (x, y)-plane parallel to the two facing surfaces and of proceeding by multiple integrations. For generality and later use, we shall assume an interatomic potential of the form:

U[x_, y_, z_] = -B/(x^2 + y^2 + z^2)^(\[Gamma]/2)

With this definition, and by identifying s → R, the potential of Eq. (8.7) corresponds to γ = 6. Let us first consider one single atom at a distance s from one semi-infinite slab of atom number density N1. A triple integration over the entire (x, y)-plane and for z ∈ [s, +∞) yields the atom-slab potential, V(s), as:

V[s_] = N1 Integrate[
   Integrate[U[x, y, z], {x, -\[Infinity], +\[Infinity]}, {y, -\[Infinity], +\[Infinity]},
    Assumptions -> Re[\[Gamma]] > 2 && Re[z^2] > 0],
   {z, s, +\[Infinity]},
   Assumptions -> Re[\[Gamma]] > 3 && Re[s] > 0 && Im[s] == 0]

Notice that Mathematica must be told specific information about all quantities involved although this may be obvious from our specific physical application. One further integration over all atoms in the second slab, assumed to have atom number density N2, yields the slab-slab potential, and, finally, the van der Waals force is the opposite of the derivative with respect to the gap width:

U[s_] = N2 Integrate[V[s + r], {r, 0, +\[Infinity]},
   Assumptions -> Re[\[Gamma]] > 4 && Re[s] > 0];
u[s_] = U[s] (Denominator[U[s]])/Factor[Denominator[U[s]]]
Fplaneplane[s_] = -D[u[s], s]

$$\displaystyle \begin{aligned} -\frac{2 \pi B\, \text{N1}\, \text{N2}\, s^{4-\gamma }}{(\gamma -4) (\gamma -3) (\gamma -2)} \end{aligned} $$
(8.8)
$$\displaystyle \begin{aligned} \frac{2 \pi B\, (4-\gamma )\, \text{N1}\, \text{N2}\, s^{3-\gamma }}{(\gamma -4) (\gamma -3) (\gamma -2)} \end{aligned} $$
(8.9)
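As a cross-check on the symbolic output, the integrations can also be carried out by hand. The following is a sketch, assuming γ > 4 so that every integral converges; the in-plane polar radius ρ is introduced here only for convenience and does not appear in the notebook.

```latex
\begin{aligned}
\int_{-\infty}^{+\infty}\!\!\int_{-\infty}^{+\infty} U(x,y,z)\, dx\, dy
  &= -2\pi B \int_0^{+\infty} \frac{\rho\, d\rho}{(\rho^2+z^2)^{\gamma/2}}
   = -\frac{2\pi B\, z^{2-\gamma}}{\gamma-2}\, , \\
V(s) &= N_1 \int_s^{+\infty} \left(-\frac{2\pi B\, z^{2-\gamma}}{\gamma-2}\right) dz
   = -\frac{2\pi B\, N_1\, s^{3-\gamma}}{(\gamma-3)(\gamma-2)}\, , \\
U(s) &= N_2 \int_0^{+\infty} V(s+r)\, dr
   = -\frac{2\pi B\, N_1 N_2\, s^{4-\gamma}}{(\gamma-4)(\gamma-3)(\gamma-2)}\, , \\
F(s) &= -\frac{dU}{ds}
   = -\frac{2\pi B\, N_1 N_2\, s^{3-\gamma}}{(\gamma-3)(\gamma-2)}\, .
\end{aligned}
```

The third line reproduces Eq. (8.8), and the last line agrees with Eq. (8.9) once the factor (4 − γ)/(γ − 4) = −1 is cancelled.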

By defining the de Boer-Hamaker constant, \(A_{\mathrm{dBH}}\), and by reading out the values of γ and B from the London potential found in Sect. 8.2.1, we find the standard expression due to de Boer [39, 40]:

FvdW[s_] = (Fplaneplane[s] /. {\[Gamma] -> 6, B -> 3 \[HBar] \[Omega]0 \[Alpha]^2/4})/(N1 N2 3 \[Pi]^2 \[Alpha]^2 \[HBar] \[Omega]0/4) AdBH

$$\displaystyle \begin{aligned} -\frac{\text{AdBH}}{6 \pi s^3} \end{aligned} $$
(8.10)
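The normalization used in FvdW implies that the de Boer-Hamaker constant equals π² N1 N2 B, with B the coefficient of the London potential in three dimensions. A short Python sketch can confirm that Eq. (8.9) with γ = 6 then reduces to Eq. (8.10); the function names and all numerical values below are illustrative assumptions in arbitrary units, not data from the text.

```python
import math


def plane_plane_force(s, gamma, B, N1, N2):
    """Force per unit area between two half-spaces, Eq. (8.9), for U = -B / r^gamma."""
    return (2.0 * math.pi * B * (4.0 - gamma) * N1 * N2 * s**(3.0 - gamma)
            / ((gamma - 4.0) * (gamma - 3.0) * (gamma - 2.0)))


def de_boer_force(s, A):
    """Standard de Boer expression, Eq. (8.10)."""
    return -A / (6.0 * math.pi * s**3)


# Illustrative values in arbitrary units (assumptions made for this example only).
hbar, omega0, alpha = 1.0, 1.0, 0.1
N1, N2, s = 2.0, 3.0, 0.5
B = 3.0 * hbar * omega0 * alpha**2 / 4.0  # London coefficient in 3D
A = math.pi**2 * N1 * N2 * B              # constant implied by the FvdW normalization

print(plane_plane_force(s, 6.0, B, N1, N2), de_boer_force(s, A))
```

Both printed numbers should coincide up to rounding, and both are negative, that is, attractive.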

8.2.2.1 Suggested Exercise 2

  (a) Extend the previous approach to the case of two macroscopic homogeneous spherical particle distributions of equal radius R and prove that the particle-particle analytical solution can be cast in a form proportional to the result given by Hamaker [41]:

    f[x_] = (2/(x^2 - 4)) + (2/x^2) + Log[((x^2 - 4)/x^2)];

  (b) Demonstrate that, in the limit of distances s ≫ R, this solution converges back to the London potential. Find the next non-vanishing term in the series expansion.

  (c) Demonstrate that, in the limit for x → 2, the solution converges back to the plate-plate result by de Boer given above. Find the next non-vanishing term in the series expansion.

  (d) Create a plot showing your analytical solution at (a) compared to the two limits in (b) and (c). It may be helpful to create two plots, one for near range and one for far range. Due to the power-law nature of these interactions, the natural choice is a log-log diagram of the absolute value of the force.

8.2.3 The Casimir Effect

Although the first to prove that radiation pressure considerations lead to the correct expression for the Casimir force was González [42], below we adapt to the 1D case the full three dimensional (3D) mathematical treatment of Casimir’s suggestion given by Milonni, Cook, and Goggin [30, 43] (note that the statement attributed to Debye by these authors in their Ref. 3 is actually due to Casimir; see footnote on p. 7 of Ref. [22]).

In the radiation pressure interpretation of the Casimir effect, the Casimir force is deemed to arise from the competing outward and inward radiation pressures acting on two parallel, ideal reflectors facing each other across a gap of width s. We recall that the expression for the classical radiation pressure [44] due to light incident at angle θ from the normal is, for a perfect reflector, \(P_{\mathrm {rad}}=2 u\cos ^2\theta \), where u is the radiation energy density, and the factor 2 corresponds to perfect reflection. Assuming that the energy per mode is \(\textstyle {1\over 2}\hbar \omega \), and noticing that only \(\textstyle {1\over 2}\) of such energy contributes to radiation pressure in either direction, the radiation pressure becomes:

$$\displaystyle \begin{aligned} P_{\mathrm{rad}}=2 \left({1\over 2}\right) \left({\hbar\omega\over 2}\right)\left({1\over V}\right)\left({k_z\over k} \right)^2\, , {} \end{aligned} $$
(8.11)

where V is a quantization volume, k = ω/c is the magnitude of the wave vector, \(k_z\) is the normal component of the wave vector, and c is the speed of light.

8.2.3.1 The One-Dimensional (1D) Casimir Effect

In this 1D treatment [45], however, radiation is assumed only to be incident at θ = 0 so that no angular integrations are necessary and \(k_z \equiv k\). The quantization volume can be expressed as the product of a generic square plate area of side b multiplied by the gap width, so that \(V = b^2 s\). The boundary conditions corresponding to a perfect reflector require that the only modes present within the gap be those for which \(k = n\pi/s\), where n is an integer. In this 1D case, the replacement of sums by integrals is not apparent but we must still include a dimensional factor, \(b^2\), which cancels the same term in the quantization volume (see Ref. [30], Secs. 2.7 and 3.10). Therefore, by considering contributions from two independent polarizations, we find:

$$\displaystyle \begin{aligned} P_{\mathrm{out}} = {\pi\hbar c\over s^2}\sum_0^{+\infty}\, n \end{aligned} $$
(8.12)
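The step from Eq. (8.11) to Eq. (8.12) can be spelled out as follows. This is a sketch under the assumptions stated in the text: normal incidence, quantization volume b²s, gap modes with wave number nπ/s, the dimensional factor b², and two polarizations. The symbol ω_n for the mode frequencies is introduced here only for convenience.

```latex
\begin{aligned}
P_{\mathrm{out}}
 &= \underbrace{2}_{\text{polarizations}}\; b^2 \sum_{n=0}^{+\infty}
    2\left({1\over 2}\right)\left({\hbar\omega_n\over 2}\right)\left({1\over b^2 s}\right)
  = {1\over s}\sum_{n=0}^{+\infty} \hbar\omega_n \\
 &= {1\over s}\sum_{n=0}^{+\infty} \hbar c\, {n\pi\over s}
  = {\pi\hbar c\over s^2}\sum_{n=0}^{+\infty} n\, ,
  \qquad \omega_n = c\, k_n = {n\pi c\over s}\, .
\end{aligned}
```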

As for the inward pressure, the standard replacement \(\sum _n(\dots ) \rightarrow (s/\pi )\int (\dots )\) in Eq. (8.11) yields:

$$\displaystyle \begin{aligned} P_{\mathrm{in}} = -{\pi\hbar c\over s^2}\int_0^{+\infty}\, x\, dx\, . \end{aligned} $$
(8.13)

Therefore the Casimir pressure, P Cas, becomes:

$$\displaystyle \begin{aligned} P_{\mathrm{Cas}} = {\pi\hbar c\over s^2}\left( \sum_0^{+\infty}\, n - \int_0^{+\infty}\, x\, dx\right) \end{aligned} $$
(8.14)

As is typical in computations of this type [46], extracting physical content requires the evaluation of the difference of two formally infinite quantities [47]. One such method is the use of the Euler-Maclaurin formula [48, 49], which we write as:

$$\displaystyle \begin{aligned} \varDelta^N_0 = \sum_0^N\, n - \int_0^N\, x\, dx = \textstyle{1\over 2}\left[f(N)+f(0)\right]+ \displaystyle{\sum_{m=1}^{M(K)}}\displaystyle{B_{2m}\over (2m)!}\left[ f^{(2m-1)}(N)-f^{(2m-1)}(0)\right]\, , \end{aligned} $$
(8.15)

where f is the function being summed and integrated, \(f^{(2m-1)}\) denotes its derivative of order 2m − 1, \(B_{2m}\) are the Bernoulli numbers, and M(K) is equal to K/2 if K is even and to (K − 1)/2 if K is odd. In order to obtain a finite result, we follow Casimir and multiply the integrand and summand functions by an exponential function to introduce a cutoff so that the argument in the sum and integral becomes:

fcas[x_,  \[Lambda]_] = x Exp[- \[Lambda] x]

where λ is the cutoff parameter to be considered in the limit λ → 0+ and, in our application, N → +∞. In order to proceed with applying the Euler-Maclaurin summation formula, we evaluate the following quantity (m = 1), which is the only one contributing to the final result in the cutoff limit as can be verified directly:

Limit[D[fcas[x,  \[Lambda]], {x, 1} ], x -> 0,  Assumptions -> Im[ \[Lambda]] == 0 && Re[ \[Lambda]] > 0]

$$\displaystyle \begin{aligned} 1 \end{aligned} $$
(8.16)

Hence the final result is:

PCas1D[s_] = -(BernoulliB[2]/(2!)) Pi \[HBar] c/s^2

$$\displaystyle \begin{aligned} - \frac{\pi c \hbar }{12 s^2} \end{aligned} $$
(8.17)

which is the result given by Kupiszewska and Mostowski [45]. Notice that the result given by Bordag, Mohideen, and Mostepanenko [50] for this case refers to the potential energy and pressure per polarization hence accounting for the difference by a factor of 2 (see footnote (3) on p. 74 of Ellingsen’s thesis [51]).
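The factor −1/12 multiplying πħc/s² in Eq. (8.17) can be checked by brute force with the exponential cutoff. The Python sketch below is only a numerical illustration; the function name, the truncation rule for the sum, and the sample values of λ are assumptions made for this example. It uses the elementary integral of x exp(−λx) over [0, +∞), which equals 1/λ².

```python
import math


def regularized_difference(lam, n_max=None):
    """Sum_{n>=0} n exp(-lam n) minus Integral_0^inf x exp(-lam x) dx (= 1/lam^2)."""
    if n_max is None:
        n_max = int(60.0 / lam)  # neglected terms carry a factor below exp(-60)
    total = math.fsum(n * math.exp(-lam * n) for n in range(n_max + 1))
    return total - 1.0 / lam**2


for lam in (0.5, 0.1, 0.02):
    print(f"lambda = {lam:5.2f}   sum - integral = {regularized_difference(lam):+.8f}")
print(f"Euler-Maclaurin value:   {-1.0 / 12.0:+.8f}")
```

As λ decreases, the difference should approach −1/12, although both the sum and the integral grow as 1/λ², so for very small λ the subtraction eventually becomes sensitive to rounding.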

8.2.3.2 Suggested Exercise 3

Both the above sum and integral can be calculated analytically for the chosen analytical cutoff function and with N → +∞. Find such functions, study their difference as a function of λ and take the limit for λ → 0+ noticing whether any numerical instabilities arise in the numerical limit. Repeat this study considering the difference between the sum and the integral for finite N with N ≫ 1.

8.2.3.3 Suggested Exercise 4

In the case of one infinitely conducting and one infinitely magnetically permeable plate [52, 53], the boundary conditions change and the allowed modes within the gap are defined by \(k = (n + \textstyle {1\over 2}) \pi /s\). Apply the above approach to this case and discuss your conclusions.

8.2.3.4 Suggested Exercise 5

Generalize the approach of this Section to the full 3D case and recover the result found by Casimir.

8.2.4 Electrodynamical Casimir-Lifshitz Force

Here we present a derivation of the important result by Lifshitz [54] for the force between two semi-infinite, parallel-plane slabs without any assumption of additivity and including retardation. This calculation has been given many times by different authors but, in order to provide users with the mathematical tools to extend the validity of this treatment further, here we present a Mathematica formulation inspired by the matrix approach of Podgornik, Hansen, and Parsegian [55]. Unlike the original derivation by Lifshitz, this will later enable one to consider very efficiently the algebraically far more complex cases of slabs of finite thickness, multilayers, and even anisotropic materials [56,57,58,59,60,61,62]. In order to limit complications, we shall restrict ourselves to the vanishing absolute temperature limit (T = 0 K). Extension to the case of finite temperatures is standard, once the correct dielectric function is adopted, and it is treated in the References.

The most general geometry of the system is therefore that of two multilayer stacks described by layers of dielectric permittivity and magnetic permeability \((\epsilon_i, \mu_i)\) constant within each layer, separated by a gap of width s, and with a semi-infinite space bounding the system on the side of each multilayer opposite to the gap. Upon choosing the (x, y) plane as parallel to the multilayer surfaces and the z-axis as perpendicular to those surfaces and oriented to the right, the Maxwell equations provide the general solutions for the bound state fields, which are assumed to decay exponentially for z → ±∞. With this geometry, the solutions for the harmonic components of the electric field within the i-th layer can be written as (we shall assume non-magnetic media, or \(\mu_i = 1\)) [30, 50]:

$$\displaystyle \begin{aligned} {\mathbf{E_i}}({\mathbf{r}}) = \left[e_{x,i}(z)\, {\hat{\mathbf{i}}}+e_{y,i}(z)\,{\hat{\mathbf{j}}}+e_{z,i}(z)\, {\hat{\mathbf{k}}}\right] e^{i(k_x x + k_y y)}\, , \end{aligned} $$
(8.18)

where

$$\displaystyle \begin{aligned} {d^2 e_{z,i}\over dz^2} - K_i^2\, e_{z,i} = 0\, , \end{aligned} $$
(8.19)

and analogously for \(e_{x,i}\) and \(e_{y,i}\), so that the general solution is:

$$\displaystyle \begin{aligned} e_{z,i}(z) = A_i e^{K_i z} + B_i e^{-K_i z} {} \end{aligned} $$
(8.20)

where

$$\displaystyle \begin{aligned} K_i^2 = k_\parallel^2 - \epsilon_i {\omega^2\over c^2} \end{aligned} $$
(8.21)

with \(k_\parallel ^2=k_x^2+k_y^2\) and ω a real quantity (clearly \(i=\sqrt {-1}\) must not be confused either with the layer number index or with the x-axis versor). Notice also that various layer numbering conventions exist in the literature; here we shall number all layers in the left multistack by negative integers increasing in magnitude towards the left and all layers in the right multistack by positive integers increasing towards the right, that is, \(i = \dots, -2, -1, \dots, +1, +2, \dots\) Also, from Eq. (8.20), the field to the left (right) of the leftmost (rightmost) boundary must decay for z → −∞ (z → +∞), which requires \(B_{LL} = 0\) and \(A_{RR} = 0\):

$$\displaystyle \begin{aligned} e_{z,LL}(z) = A_{LL}e^{K_{LL}\, z} \end{aligned} $$
(8.22)
$$\displaystyle \begin{aligned} e_{z,RR}(z) = B_{RR}e^{-K_{RR}\, z} \end{aligned} $$
(8.23)

8.2.4.1 Boundary Transition Matrix: Electrical Field Continuity

We can now write the electric field in two adjacent layers, generically referred to as left (L) and right (R):

ezL[kparal_, \[Omega]_, z_] = AL Exp[KL[kparal, \[Omega]] z] + BL Exp[-KL[kparal, \[Omega]] z];
ezR[kparal_, \[Omega]_, z_] = AR Exp[KR[kparal, \[Omega]] z] + BR Exp[-KR[kparal, \[Omega]] z];

with obvious definitions, such as \(K_{L,R} = K_{i-1,i} > 0\) and similarly for \(\epsilon_{L,R}\), and kparal ← \(k_\parallel\). Let us now apply the continuity condition to the normal component of the D field, which is guaranteed by imposing the continuity of \(\epsilon e_z\) and \(de_z/dz\) at the boundary.

$$\displaystyle \begin{aligned} \left( \begin{array}{c} A_R \\ B_R \\ \end{array} \right)= \left( \begin{array}{cc} \mathbb{D} \text{E}_{11} & \mathbb{D} \text{E}_{12} \\ \mathbb{D} \text{E}_{21} & \mathbb{D} \text{E}_{22} \\ \end{array} \right) \left( \begin{array}{c} A_L \\ B_L \\ \end{array} \right)=\mathbb{D}_{\text{LR}}^E\left( \begin{array}{c} A_L \\ B_L \\ \end{array} \right) \end{aligned} $$
(8.24)

Requiring such continuity yields the coefficients on the right (R) of the left-most boundary, assumed conveniently placed at z = 0, in terms of the coefficients on the left (L), as expressed in matrix form by Eq. (8.24):

Solve[\[Epsilon]L[\[Omega]] ezL[kparal, \[Omega], z] == \[Epsilon]R[\[Omega]] ezR[kparal, \[Omega], z] &&
   D[ezL[kparal, \[Omega], z], z] == D[ezR[kparal, \[Omega], z], z],
  {AR, BR}][[1]] /. z -> 0

With these solutions, it is possible to build the boundary transition matrix, \(\mathbb {D}_{LR}^E\), which yields the \((A_R, B_R)\) coefficients on the right of the boundary by extracting the coefficients of the \((A_L, B_L)\) constants, and we finally find:

\[ScriptCapitalD]E11[kparal_, \[Omega]_] = Expand[Coefficient[
    Solve[\[Epsilon]L[\[Omega]] ezL[kparal, \[Omega], z] == \[Epsilon]R[\[Omega]] ezR[kparal, \[Omega], z] &&
       D[ezL[kparal, \[Omega], z], z] == D[ezR[kparal, \[Omega], z], z],
      {AR, BR}][[1, 1, 2]] /. z -> 0, AL]];
\[ScriptCapitalD]E12[kparal_, \[Omega]_] = Expand[Coefficient[
    Solve[\[Epsilon]L[\[Omega]] ezL[kparal, \[Omega], z] == \[Epsilon]R[\[Omega]] ezR[kparal, \[Omega], z] &&
       D[ezL[kparal, \[Omega], z], z] == D[ezR[kparal, \[Omega], z], z],
      {AR, BR}][[1, 1, 2]] /. z -> 0, BL]];
\[ScriptCapitalD]E21[kparal_, \[Omega]_] = Expand[Coefficient[
    Solve[\[Epsilon]L[\[Omega]] ezL[kparal, \[Omega], z] == \[Epsilon]R[\[Omega]] ezR[kparal, \[Omega], z] &&
       D[ezL[kparal, \[Omega], z], z] == D[ezR[kparal, \[Omega], z], z],
      {AR, BR}][[1, 2, 2]] /. z -> 0, AL]];
\[ScriptCapitalD]E22[kparal_, \[Omega]_] = Expand[Coefficient[
    Solve[\[Epsilon]L[\[Omega]] ezL[kparal, \[Omega], z] == \[Epsilon]R[\[Omega]] ezR[kparal, \[Omega], z] &&
       D[ezL[kparal, \[Omega], z], z] == D[ezR[kparal, \[Omega], z], z],
      {AR, BR}][[1, 2, 2]] /. z -> 0, BL]];
\[ScriptCapitalD]ELR[kparal_, \[Omega]_] = {{\[ScriptCapitalD]E11[kparal, \[Omega]], \[ScriptCapitalD]E12[kparal, \[Omega]]},
   {\[ScriptCapitalD]E21[kparal, \[Omega]], \[ScriptCapitalD]E22[kparal, \[Omega]]}};
MatrixForm[\[ScriptCapitalD]ELR[kparal, \[Omega]]]

$$\displaystyle \begin{aligned} \displaystyle{ \left( \begin{array}{cc} \frac{\text{KL}(\text{kparal},\omega )}{2 \text{KR}(\text{kparal},\omega )}+\frac{\epsilon \text{L}(\omega )}{2 \epsilon \text{R}(\omega )} & \frac{\epsilon \text{L}(\omega )}{2 \epsilon \text{R}(\omega )}-\frac{\text{KL}(\text{kparal},\omega )}{2 \text{KR}(\text{kparal},\omega )} \\ \frac{\epsilon \text{L}(\omega )}{2 \epsilon \text{R}(\omega )}-\frac{\text{KL}(\text{kparal},\omega )}{2 \text{KR}(\text{kparal},\omega )} & \frac{\text{KL}(\text{kparal},\omega )}{2 \text{KR}(\text{kparal},\omega )}+\frac{\epsilon \text{L}(\omega )}{2 \epsilon \text{R}(\omega )} \\ \end{array} \right)} \end{aligned} $$
(8.25)
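A quick numerical sanity check of Eq. (8.25) is to apply the matrix to arbitrary left-hand coefficients and confirm that both continuity conditions hold at z = 0. The Python sketch below is an independent illustration; the helper names `boundary_matrix` and `mat_vec` and the real sample values are assumptions, whereas in the actual calculation the permittivities and decay constants depend on frequency.

```python
def boundary_matrix(eps_L, eps_R, K_L, K_R):
    """2x2 boundary transition matrix of Eq. (8.25), as nested lists."""
    d_sum = K_L / (2.0 * K_R) + eps_L / (2.0 * eps_R)
    d_diff = eps_L / (2.0 * eps_R) - K_L / (2.0 * K_R)
    return [[d_sum, d_diff], [d_diff, d_sum]]


def mat_vec(matrix, vec):
    """Multiply a 2x2 matrix by a 2-component vector."""
    return [matrix[0][0] * vec[0] + matrix[0][1] * vec[1],
            matrix[1][0] * vec[0] + matrix[1][1] * vec[1]]


# Illustrative real values (assumptions made for this example only).
eps_L, eps_R, K_L, K_R = 2.5, 1.0, 1.7, 0.9
A_L, B_L = 0.3, -1.2
A_R, B_R = mat_vec(boundary_matrix(eps_L, eps_R, K_L, K_R), [A_L, B_L])

# Residuals of the two continuity conditions at z = 0 (both should vanish).
res_eps_ez = eps_L * (A_L + B_L) - eps_R * (A_R + B_R)  # continuity of eps * e_z
res_dez_dz = K_L * (A_L - B_L) - K_R * (A_R - B_R)      # continuity of d e_z / dz
print(res_eps_ez, res_dez_dz)
```

Both residuals should be zero to within rounding for any choice of the input coefficients.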

8.2.4.2 Translation Matrix

The next transformation matrix needed is a translation within a homogeneous (H) medium by a length equal to the thickness T of the generic layer so as to shift the next discontinuity placed at a generic location to the right to the origin z = 0:

$$\displaystyle \begin{aligned} \left( \begin{array}{c} A_{z+T} \\ B_{z+T} \\ \end{array} \right)= \left( \begin{array}{cc} \varPi_{11} & \varPi_{12} \\ \varPi_{21} & \varPi_{22} \\ \end{array} \right) \left( \begin{array}{c} A_z \\ B_z \\ \end{array} \right)=\left( \begin{array}{cc} \varPi_{11} & 0 \\ 0 & \varPi_{22} \\ \end{array} \right) \left( \begin{array}{c} A_z \\ B_z \\ \end{array} \right)={\Pi}_{z,z+T}\left(\begin{array}{c} A_z \\ B_z \\ \end{array}\right) \end{aligned} $$
(8.26)

Let us consider the solutions within such a homogeneous medium at generic coordinates z and at z + T:

ezH[kparal_, \[Omega]_, z_] = AH Exp[KH[kparal, \[Omega]] z] + BH Exp[-KH[kparal, \[Omega]] z]
ezH[kparal, \[Omega], z + T]

where KH is equal to \(K_i\) within the layer. Therefore the translation matrix, which yields the solution at z + T in terms of that at z, is, straightforwardly:

\[CapitalPi]11[kparal_, \[Omega]_, T_] = Simplify[Coefficient[ezH[kparal, \[Omega], z + T], AH]/Exp[KH[kparal, \[Omega]] z]];
\[CapitalPi]12[kparal_, \[Omega]_, T_] = 0.;
\[CapitalPi]21[kparal_, \[Omega]_, T_] = 0.;
\[CapitalPi]22[kparal_, \[Omega]_, T_] = Simplify[Coefficient[ezH[kparal, \[Omega], z + T], BH]/Exp[-KH[kparal, \[Omega]] z]];
\[CapitalPi][kparal_, \[Omega]_, T_] = {{\[CapitalPi]11[kparal, \[Omega], T], \[CapitalPi]12[kparal, \[Omega], T]},
   {\[CapitalPi]21[kparal, \[Omega], T], \[CapitalPi]22[kparal, \[Omega], T]}};
MatrixForm[\[CapitalPi][kparal, \[Omega], T]]

$$\displaystyle \begin{aligned} \left( \begin{array}{cc} e^{T \text{KH}(\text{kparal},\omega )} & 0. \\ 0. & e^{-T \text{KH}(\text{kparal},\omega )} \\ \end{array} \right) \end{aligned} $$
(8.27)
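The action of the translation matrix can be cross-checked numerically outside Mathematica. The following is a minimal Python sketch, not part of the original notebook: the helper names (`translation_matrix`, `shift_amplitudes`, `field`) and the numerical values are hypothetical choices made only for illustration. It checks that the shifted amplitudes reproduce the field at z + T, and that two successive translations compose into one.

```python
import cmath

def translation_matrix(K, T):
    """Diagonal translation matrix of Eq. (8.27) for decay constant K and shift T."""
    return [[cmath.exp(K * T), 0.0], [0.0, cmath.exp(-K * T)]]

def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def shift_amplitudes(K, T, A, B):
    """Amplitudes (A', B') describing the same field with the origin moved by T."""
    m = translation_matrix(K, T)
    return m[0][0] * A + m[0][1] * B, m[1][0] * A + m[1][1] * B

def field(K, z, A, B):
    """Homogeneous-medium solution A exp(K z) + B exp(-K z)."""
    return A * cmath.exp(K * z) + B * cmath.exp(-K * z)

# Illustrative (hypothetical) numbers, chosen only to exercise the identities
K, T, z = 0.7 + 0.2j, 1.3, 0.4
A, B = 1.0 - 0.5j, 0.3 + 0.1j

A_shift, B_shift = shift_amplitudes(K, T, A, B)
# The field built from the shifted amplitudes at z equals the original field at z + T
mismatch = abs(field(K, z, A_shift, B_shift) - field(K, z + T, A, B))
print(mismatch)
```

Because the matrix is diagonal, translations by T1 and T2 simply multiply the exponentials, which is why stacking layers reduces to a plain matrix product.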

8.2.4.3 Modes Due to Electric Field Continuity

In the archetypal case of two semi-infinite slabs, described by dielectric functions 𝜖 −1 and 𝜖 +1, separated by a gap of medium 𝜖 Gap, moving from left to right, the zones to consider are fewer than in the full case of stacks of finite thickness (we do not use the “0” subscript, as in 𝜖 0, to avoid confusion with the SI symbol for the vacuum permittivity). Let us consider the modes associated with electric field continuity. The first boundary discontinuity is located at z = 0, between the left slab and the gap medium (assumed by Lifshitz to be the vacuum), described as:

$$\displaystyle \begin{aligned} \left( \begin{array}{c} A_0 \\ B_0 \\ \end{array} \right)= \mathbb{D}_{-1,0}^E\left( \begin{array}{c} A_{-1} \\ B_{-1} \\ \end{array} \right)\end{aligned} $$
(8.28)

That is:

 \[ScriptCapitalD]ELR1[kparal_, \[Omega]_] = {{\[ScriptCapitalD]E11[kparal, \[Omega]], \[ScriptCapitalD]E12[kparal, \[Omega]]},
     {\[ScriptCapitalD]E21[kparal, \[Omega]], \[ScriptCapitalD]E22[kparal, \[Omega]]}} /.
    {\[Epsilon]L[\[Omega]] -> \[Epsilon]m1[\[Omega]], \[Epsilon]R[\[Omega]] -> \[Epsilon]Gap[\[Omega]],
     KL[kparal, \[Omega]] -> Km1[kparal, \[Omega]], KR[kparal, \[Omega]] -> KGap[kparal, \[Omega]]};
 MatrixForm[\[ScriptCapitalD]ELR1[kparal, \[Omega]]]

This is followed by translation through the gap assumed to be of width T → s, that is:

$$\displaystyle \begin{aligned} \left( \begin{array}{c} A_{s+0} \\ B_{s+0} \\ \end{array} \right) =\varPi _{0,s+0} \left( \begin{array}{c} A_0 \\ B_0 \\ \end{array} \right)= \varPi _{0,s+0}\, \mathbb{D}_{-1,0}^E\left( \begin{array}{c} A_{-1} \\ B_{-1} \\ \end{array} \right) \end{aligned} $$
(8.29)

and, finally, upon transforming over the boundary to the right of the gap, the overall transfer matrix is:

$$\displaystyle \begin{aligned} \mathbb{M}_{-1,+1}^E= \mathbb{D}_{0,+1}^E\, \varPi_{0,0+s}\, \mathbb{D}_{-1,0}^E \end{aligned} $$
(8.30)

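The conditions of field decay away from the gap require that the (1,1) element of this matrix vanish [58]. With the amplitudes defined as above and for Re K > 0, only the A −1 term remains bounded to the left of the gap (B −1 = 0), and demanding that the growing amplitude A +1 vanish to the right then forces the (1,1) element to be zero. For reasons of space we omit obvious steps and do not reproduce the unwieldy elements of the total matrix product. Standard Mathematica manipulations that bring this condition into a familiar form (also omitted for brevity) yield (see Ref. [30], Eq. (7.27)):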

$$\displaystyle \begin{aligned} {(\epsilon_{\mathrm{Gap}}K_{-1}+ \epsilon_{-1}K_{\mathrm{Gap}}) (\epsilon_{\mathrm{Gap}}K_{+1}+ \epsilon_{+1}K_{\mathrm{Gap}}) \over (\epsilon_{\mathrm{Gap}}K_{-1}- \epsilon_{-1}K_{\mathrm{Gap}}) (\epsilon_{\mathrm{Gap}}K_{+1}- \epsilon_{+1}K_{\mathrm{Gap}})}e^{2K_{\mathrm{Gap}} s}-1=0\, . {} \end{aligned} $$
(8.31)
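The step from the vanishing (1,1) element of Eq. (8.30) to Eq. (8.31) can be spot-checked numerically before attempting the symbolic manipulation. The Python sketch below is only an illustration under stated assumptions: the boundary matrix is transcribed from Eq. (8.25), the function names and test values are hypothetical, and the rescaling factor relating the two expressions was worked out by hand for this check rather than taken from the text.

```python
import cmath

def d_matrix_e(eps_l, eps_r, k_l, k_r):
    """Boundary matrix of Eq. (8.25), electric-continuity case."""
    a = k_l / (2 * k_r)
    b = eps_l / (2 * eps_r)
    return [[a + b, b - a], [b - a, a + b]]

def pi_matrix(k, t):
    """Translation matrix of Eq. (8.27)."""
    return [[cmath.exp(k * t), 0.0], [0.0, cmath.exp(-k * t)]]

def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][n] * b[n][j] for n in range(2)) for j in range(2)]
            for i in range(2)]

def transfer_m11(eps_m1, eps_gap, eps_p1, k_m1, k_gap, k_p1, s):
    """(1,1) element of the product D(gap -> +1) . Pi(s) . D(-1 -> gap), Eq. (8.30)."""
    d_left = d_matrix_e(eps_m1, eps_gap, k_m1, k_gap)
    d_right = d_matrix_e(eps_gap, eps_p1, k_gap, k_p1)
    return matmul2(d_right, matmul2(pi_matrix(k_gap, s), d_left))[0][0]

def mode_condition(eps_m1, eps_gap, eps_p1, k_m1, k_gap, k_p1, s):
    """Left-hand side of Eq. (8.31) and the denominator of its ratio."""
    plus = (eps_gap * k_m1 + eps_m1 * k_gap) * (eps_gap * k_p1 + eps_p1 * k_gap)
    minus = (eps_gap * k_m1 - eps_m1 * k_gap) * (eps_gap * k_p1 - eps_p1 * k_gap)
    return plus / minus * cmath.exp(2 * k_gap * s) - 1, minus

# Hypothetical test values (not physical parameters)
args = (2.0, 1.0, 3.0, 1.5, 1.0, 2.0, 0.3)
eps_m1, eps_gap, eps_p1, k_m1, k_gap, k_p1, s = args

m11 = transfer_m11(*args)
lhs, minus = mode_condition(*args)
# Hand-derived proportionality: Eq. (8.31) is M11 times a finite, non-zero factor
rescaled = 4 * m11 * k_p1 * eps_p1 * k_gap * eps_gap * cmath.exp(k_gap * s) / minus
print(abs(lhs - rescaled))
```

Since the two expressions differ only by a finite, non-vanishing factor, they share the same zeros. Setting all three permittivities equal in `mode_condition` reproduces the algebraic form of Eq. (8.32).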

The modes associated with magnetic field continuity are found in a completely analogous fashion, leading to (see Ref. [30], Eq. (7.28)):

$$\displaystyle \begin{aligned} {(K_{-1}+ K_{\mathrm{Gap}}) (K_{+1}+ K_{\mathrm{Gap}}) \over (K_{-1}- K_{\mathrm{Gap}}) (K_{+1}- K_{\mathrm{Gap}})}e^{2K_{\mathrm{Gap}} s}-1=0\, . {} \end{aligned} $$
(8.32)

8.2.4.4 Lifshitz Expression

A reasoning in the complex plane, also due to Lifshitz, leads to an expression for the dispersion force between two semi-infinite slabs separated by an empty gap (𝜖 Gap = 1) as a double integral. A few standard changes of variable, which can also be implemented within Mathematica, finally lead to the familiar expression for the Lifshitz pressure:

$$\displaystyle \begin{aligned} \begin{aligned} F_{\mathrm{Lif}}(s) = & - {\hbar\over 2\pi^2 c^3}\int_1^{+\infty} dp\, p^2 \int_0^{+\infty} d\omega_I \, \omega_I^3 \\ & \times \left(\left[ {(s_{-1}+ \epsilon_{-1}p)(s_{+1}+ \epsilon_{+1}p) \over (s_{-1}- \epsilon_{-1}p) (s_{+1}- \epsilon_{+1}p)} e^{2\omega_I p s/c}-1\right]^{-1} \right.\\ & {}\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\, + \left. \left[ {(s_{-1}+ p)(s_{+1}+ p) \over (s_{-1}- p) (s_{+1}- p)} e^{2\omega_I p s/c}-1\right]^{-1} \right) \end{aligned} {} \end{aligned} $$
(8.33)

where ω I is the imaginary part of the complex frequency, ω C = ω R + iω I, and the following variables were introduced:

$$\displaystyle \begin{aligned} s_{\pm 1}=+\sqrt{p^2 -1 +\epsilon_{\pm 1}}\, ,\end{aligned} $$
(8.34)

and, on causality considerations, the dielectric function 𝜖 ±1(ω I) is always real.

8.2.4.5 Suggested Exercise 6

Carry out the Mathematica manipulations leading to the results given above at Eqs. (8.31) and (8.32).

8.2.4.6 Suggested Exercise 7

Extend the above treatment to that of two possibly unequal slabs of finite thicknesses, a L and a R (compare your results to those in Ref. [50], Sec. 4.1.1.). Show that you can recover the above results in the limit of infinite thicknesses, a L, a R → +∞. What happens if the thickness of both slabs vanishes?

8.2.4.7 Suggested Exercise 8

Consider two bi-layers interacting across a gap. Show that you can recover the standard Lifshitz expression if the thickness of one layer vanishes while the other one diverges in each bi-layer.

8.2.4.8 Suggested Exercise 9

Recover the Casimir pressure expression by taking appropriate analytical limits (𝜖 ±1 → ∞) in the integrand at Eq. (8.33) and by using Mathematica to compute the integral. Show that:

$$\displaystyle \begin{aligned} F_{\mathrm{Cas}}(s) = -{\hbar c \pi^2\over 240\, s^4}\, . \end{aligned} $$
(8.35)
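As a hedged sketch of the analytical route (the intermediate steps below are not taken from the original text, and the limit is assumed to be taken inside the integrand at fixed p), note that for 𝜖 ±1 → ∞ both ratios multiplying the exponentials in Eq. (8.33) tend to unity, so that the two bracketed terms become equal. Substituting x = 2ω I p s∕c and using the standard integral of x³∕(eˣ − 1) over (0, +∞), which equals π⁴∕15, one finds:

$$\displaystyle \begin{aligned} F_{\mathrm{Lif}}(s) & \to -{\hbar\over \pi^2 c^3}\int_1^{+\infty} dp\, p^2 \int_0^{+\infty} d\omega_I \, {\omega_I^3 \over e^{2\omega_I p s/c}-1} \\ & = -{\hbar\over \pi^2 c^3}\, {\pi^4\over 15} \left({c\over 2 s}\right)^4 \int_1^{+\infty} {dp\over p^2} = -{\hbar c \pi^2\over 240\, s^4}\, . \end{aligned} $$

The remaining integral over p equals unity, which is what makes the numerical factor 240 = 16 × 15 appear.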

8.2.4.9 Suggested Exercise 10

  1. (a)

    Investigate the dependence of our results for the Lifshitz pressure and the unretarded Hamaker constant on the numerical values of the constants in the analytical expression of the dielectric function.

  2. (b)

    How would you manipulate the dielectric function?

  3. (c)

    By building upon the early treatment by Arnold, Hunklinger, and Dransfeld [63], and the more modern approach by Chen, Klimchitskaya, Mostepanenko, and Mohideen [64], formulate a model of illumination-dependent dielectric function (some possibly useful information is provided in Exercise 11).

  4. (d)

    Calculate the Lifshitz pressure as a function of illumination for realistic values of parameters of interest, for instance, by using:

    LifshitzPressure[s_] := NIntegrate[LifshitzIntegrand[\[Omega]I, p, s],
      {\[Omega]I, 0, \[Infinity]}, {p, 1, \[Infinity]},
      Method -> {"GlobalAdaptive", "SingularityHandler" -> "DuffyCoordinates"},
      WorkingPrecision -> 14, PrecisionGoal -> 10, MaxRecursion -> 50]

  5. (e)

    Comment on the possibility to drive nano-oscillators by dispersion force manipulation (an example is given in Refs. [65,66,67]). Under what conditions can the system be driven into parametric resonance by this approach? Compare your result to that for a mechanically driven Casimir force parametric amplifier [68].

8.2.4.10 Suggested Exercise 11

Use the following naive model of the dielectric function of amorphous silicon (a:Si) (or any other Kramers-Kronig consistent one):

 \[Epsilon]aSi[\[Omega]_] = 1 + (\[Epsilon]1/(1 - (\[Omega]/\[CapitalOmega]1)^2 - I (\[Omega]/\[Gamma])));

with the following choices (ω is expressed in s −1):

 \[Epsilon]1 = 11.; \[CapitalOmega]1 = 5.8 10^15; \[Gamma] = 16.0 10^15;

  1. (a)

    Using values of the natural constants in the MKS system, calculate the Lifshitz pressure in the range s = 0.01–10 μm. Identify the unretarded and retarded limits and the onset of retardation. Use the unretarded behavior to estimate the Hamaker constant defined at Sect. 8.2.2.

  2. (b)

    It has been said that the prediction that the Casimir pressure between two ideal surfaces at a distance of approximately 10 nm should be approximately equal to 1 atmosphere is unphysical. What is the Lifshitz pressure between the two surfaces considered above across a 10 nm gap, expressed in atmospheres? What is the ratio of the Lifshitz to Casimir pressures in that case?

  3. (c)

    What should the length of the side of four square pads be, if all placed within the same 10 nm distance of a highly polished ceiling, to hold the weight of one human being against the gravitational force of the earth? (assume a mass m human = 102 kg) [69].
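For readers who wish to cross-check their Mathematica output for this exercise, the following Python sketch evaluates Eq. (8.33) for two identical slabs with a plain midpoint rule. It is an illustration under stated assumptions, not the chapter's implementation. The integral is first rescaled with u = 1∕p and x = 2ω I p s∕c, and each bracket is rewritten in the algebraically equivalent form r² e⁻ˣ∕(1 − r² e⁻ˣ) to avoid overflow. The function names, grid sizes, cut-off `x_max` and the CODATA-style constants are choices made here. No numerical result is quoted, since that is the point of the exercise.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant, J s
C = 2.99792458e8        # speed of light, m/s
ATM = 101325.0          # standard atmosphere, Pa

def eps_asi_imag(xi, eps1=11.0, omega1=5.8e15, gamma=16.0e15):
    """Naive a-Si model of the text evaluated at omega = i*xi (real, >= 1)."""
    return 1.0 + eps1 / (1.0 + (xi / omega1) ** 2 + xi / gamma)

def casimir_pressure(s):
    """Ideal-conductor result of Eq. (8.35), in Pa (negative = attractive)."""
    return -HBAR * C * math.pi ** 2 / (240.0 * s ** 4)

def lifshitz_pressure(s, eps_of_xi, n_u=200, n_x=400, x_max=40.0):
    """Midpoint-rule estimate of Eq. (8.33) for two identical slabs, in Pa."""
    du, dx = 1.0 / n_u, x_max / n_x
    total = 0.0
    for i in range(n_u):
        u = (i + 0.5) * du
        p = 1.0 / u
        for j in range(n_x):
            x = (j + 0.5) * dx
            xi = C * x * u / (2.0 * s)          # imaginary frequency at this node
            eps = eps_of_xi(xi)
            root = math.sqrt(p * p - 1.0 + eps)  # s_{+-1} of Eq. (8.34)
            r_e = (root - eps * p) / (root + eps * p)
            r_m = (root - p) / (root + p)
            decay = math.exp(-x)
            total += x ** 3 * (r_e * r_e * decay / (1.0 - r_e * r_e * decay)
                               + r_m * r_m * decay / (1.0 - r_m * r_m * decay))
    return -HBAR * C / (32.0 * math.pi ** 2 * s ** 4) * total * du * dx

if __name__ == "__main__":
    gap = 10e-9  # 10 nm
    p_lif = lifshitz_pressure(gap, eps_asi_imag)
    p_cas = casimir_pressure(gap)
    print("Lifshitz pressure (atm):", p_lif / ATM)
    print("Lifshitz/Casimir ratio:", p_lif / p_cas)
```

Passing a very large constant permittivity, for instance `lambda xi: 1e14`, should bring the result close to `casimir_pressure(s)`, which is a convenient sanity check on the quadrature before trusting the a-Si numbers.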

8.3 “Perpetuum Mobile” Considerations

Given the non-trivial nature of the systems typically involved in the study of Casimir effects, it is not unusual to employ conservation arguments to uncover theoretical limitations, possible novel applications, or pitfalls leading to paradoxes connected to incompletely understood energy exchange processes in systems such as the one at Exercise 10(e) above [27, 70,71,72]. An additional example is the fierce debate (Footnote 5) regarding the existence of “quantum friction,” that is, a force expected to damp the motion of two plane, parallel slabs moving transversally with respect to each other [74,75,76,77,78]. This friction can be interpreted as yet another fascinating manifestation of the electromagnetic vacuum state responsible for the static Casimir effect emphasized in this contribution. A suggestive description of this process is that “Qualitatively, with the inclusion of quantum fluctuations, the vacuum behaves as a complex fluid that hinders and influences the bodies moving through it” [79]. Although different viewpoints are possible [80], the fundamental origin of the friction, which requires neither roughness nor contact of any kind, can be seen to lie in the asymmetric reflection of virtual photons incident from different directions because of the relative motion of the two slabs [81]. This area of research continues to attract intense attention [82] with the earliest references cited [83] being traditionally those to the work by Teodorovich [84] and Levitov [85] along with important early contributions reported [86] to be due to Mahanti [87] and to Schaich and Harris [88].

In the most heated phase of the quantum friction debate, an appeal was made to an elementary mechanical gedanken experiment designed to logically demonstrate that the very existence of any such force would imply that “an unlimited amount of useful energy could be extracted from the quantum vacuum” [76].

The great apparent strength of this reasoning [76] lies in the fact that, if its conclusions were correct, hardly any treatment of quantum electrodynamics would be necessary to decisively rule out the existence of quantum friction in the face of such strong energy conservation arguments. In fact, such proof, as presently formulated, appears to be of such broad applicability as to be able to rule out the existence of any friction – and even any lateral forces – between parallel moving plane surfaces in nature, regardless of their origin. However, such implications have gone completely unchallenged since the rebuttal stated: “There follows a discourse proving that there is no frictional force between two stationary media. I agree with this statement” [77].

Here we only consider this objection from an elementary mechanics standpoint and, while taking no position regarding all well known field theoretic treatments, we prove that the argument presented to show that quantum friction cannot exist [76] is logically flawed. As the entire reasoning hinges on simple mechanics and is completely accessible to undergraduate students, appreciating this error has implications not only for our understanding of quantum electrodynamics but also from the historical and pedagogical standpoints.

In that proof, the typical problem of a glass plate initially moving at a speed u with respect to a substratum held at rest in the laboratory frame is treated. A change in the reference frame is proposed on relativity grounds (Fig. 2, therein) as an artifice to expose a “paradox.” Therefore it is stated that “a substrate … assumed to be infinitely heavy” moves at a constant velocity u while constrained to be parallel to and under a glass plate of finite weight, initially at rest in the reference frame of the observer, and separated from the substratum by an empty gap of constant width (the author here employs the adjective “heavy” where massive is required).

The proposed proof is a classical Reductio ad Absurdum (Proof by Contradiction) [89]. If such a force as quantum friction existed, it is argued, “… the glass would be accelerated until its velocity matches the apparent velocity u of the substrate.” This fact is employed to prove two statements: (A) “One could put an arbitrary number of cleverly designed glass pieces … and let them become accelerated by the quantum vacuum.” (B) “Quantum friction thus leads to the paradox that an unlimited amount of useful energy could be extracted from the quantum vacuum.” Since (B) contradicts the principle of conservation of energy, it is concluded that quantum friction may not exist.

In order to explore the veracity of such statements, let us provide a more detailed and physically realistic description of this gedanken experiment. This can be done by assuming that the mass M sub of the substratum be finite and possibly, but not necessarily, much larger than that of the glass plate, M plate.

In principle, the quantity M sub includes the entire subsystem to which the substratum is constrained, such as the Earth, or just the mass of a layer of material if the experiment is conducted in outer space. Since, as is acknowledged in the proof, Newton’s Third Law applies in this case, the frictional force F fric acting on the glass plate must be at all times equal and opposite to that acting on the substratum, −F fric, so that the total momentum of the system, absent any other external forces with non-vanishing horizontal component, will be rigorously conserved throughout the process.

As is well-known from the elementary mechanics treatment of one-dimensional inelastic collisions, the final speed of the substratum-glass-bar system in the new reference frame will eventually approach the asymptotic value v fin = M sub u∕(M sub + M plate), and the total energy change, ΔE, expressed in terms of the total initial energy, E in, will always be negative: ΔE = −E in M plate∕(M sub + M plate) < 0.

These results show that, as the practically unattainable limit M sub∕M plate → +∞ is approached, regardless of any quantum-electrodynamical details, the final velocity of the substratum-glass-bar system will indeed be v fin → u, consistently with (A). However, there will be no energy gain, since ΔE remains negative and ΔE∕E in → 0 in that limit, unlike claimed at (B), due to a correspondingly small, but here logically critical, negative difference in the speed of the final system with respect to the initial speed of the massive substratum.
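The momentum and energy bookkeeping of this finite-mass version of the gedanken experiment is simple enough to tabulate. The short Python sketch below is only illustrative: the function names and the numerical masses and speed are hypothetical. It evaluates the final common velocity and the energy change for increasing mass ratios.

```python
def final_velocity(m_sub, m_plate, u):
    """Common final velocity from momentum conservation (plate initially at rest)."""
    return m_sub * u / (m_sub + m_plate)

def energy_change(m_sub, m_plate, u):
    """Return (Delta E, E_in) for the perfectly inelastic substratum-plate process."""
    e_in = 0.5 * m_sub * u ** 2
    v_fin = final_velocity(m_sub, m_plate, u)
    e_fin = 0.5 * (m_sub + m_plate) * v_fin ** 2
    return e_fin - e_in, e_in

# Hypothetical values: plate of unit mass, substratum velocity u = 2 (arbitrary units)
m_plate, u = 1.0, 2.0
for ratio in (1.0, 1e2, 1e4, 1e6):
    d_e, e_in = energy_change(ratio * m_plate, m_plate, u)
    print(ratio, final_velocity(ratio * m_plate, m_plate, u) / u, d_e / e_in)
```

For every finite mass ratio the printed velocity ratio stays below unity and the energy change stays negative, so no choice of masses turns the process into a net source of mechanical energy.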

The reasoning error may have been caused by the assumption that, in the more typical reference frame in which the glass plate is moving and the substratum is at rest, the final state of the system should correspond to a substratum-glass-bar system at rest after a long interaction. Although this assumption is practically correct in a ground-based laboratory, the final speed of the substratum-glass-bar system rigorously speaking does not vanish. In order to eliminate such mistaken assumptions, it may be helpful to visualize the substratum as a wedge-shaped “glider” of the type used in some elementary mechanics demonstrations with the glass plate represented by another glider of finite mass constrained to freely slide on top of it while they both ride an “airtrack” – a variation on the theme of a well-known experiment [90].

A different gedanken experiment, which adheres even more strictly to the typical theoretical treatments of quantum friction, consists of assuming that an external force F ext be acting directly on the substratum so as to keep it at a constant velocity u in the laboratory reference frame. In this case, the substratum-glass-bar system, of total mass M sub + M plate, will, after a long interaction time, by hypothesis already be moving with a velocity u. However, in order to maintain the substratum at a constant velocity, Newton’s Second Law requires that F ext = −F fric. Therefore, although after a long interaction the final kinetic energy of the glass plate will indeed have increased by \(\varDelta E_{\mathrm {plate}}=+\textstyle {1\over 2}M_{\mathrm {plate}}|{\mathbf u}|{ }^2\) as claimed at (B), the total mechanical work done by the force F ext on the substratum will equal \(W_{\mathrm {ext}}= -\textstyle {1\over 2}M_{\mathrm {plate}}|{\mathbf u}|{ }^2\). Therefore, the net total energy change will be exactly ΔE = 0.

Historically, powerful gedanken experiments in which the correct solution to a mechanical problem is obtained by requiring that an absurd state of perpetuum mobile be avoided were made famous by Simon Stevin (1548/49–1620), who even placed one on the title page of his Hypomnemata Mathematica (1605–1608) [91,92,93] to the enthusiastic approval of Ernst Mach (Ref. [94], p. 24). In the present case, however, it appears that the proposed reasoning fails to allow for any conclusions against, or in favor of, the existence of frictional forces, whether they be due to quantum fields or to any other interaction.

8.4 Conclusions

It is appropriate to end by returning to the two opening statements by Maxwell quoted in the Introduction. As we have explored in this contribution, the concept that “No force, either of attraction or of repulsion, can be observed between an electrified body and a body not electrified” [1] requires a far more sophisticated interpretation. Although Coulomb’s law – a cornerstone in the edifice of electrodynamics coded into our minds ever since primary school – predicts that a neutral point-like particle shall not interact with a charged point-like particle, that need not clash with the existence of dispersion forces. In this contribution, we have explored much evidence to eliminate misconceptions so as to develop the opposite expectation, that is, that polarizable bodies always electrodynamically interact even if they are neutral and such interaction may well be dominant. To that end, we had to explore the profoundest implications of the latter statement by Maxwell, that “When, in any case, bodies not previously electrified are observed to be acted on by an electrified body, it is because they have become electrified by induction.” This is the first link in the long logical chain leading to an operational understanding of the technological opportunity represented by dispersion force engineering.

An important historical question deserves to be considered at last: “Given the above statements, could Maxwell contribute to our modern understanding of intermolecular forces?” And, more explicitly: “If the existence of dispersion forces can be accommodated, or at least hypothesized, within the structure of classical electrodynamics, did Maxwell make that logical connection?” In what follows, we analyze this issue often by directly quoting from the writings of the protagonists of the period.

Our motivation in exploring this issue is the recurring theme among scientists and historians of science alike – whether implicitly or explicitly stated – as to whether Maxwell could have made further progress towards elucidating those fundamental questions by reaping a fuller harvest of the mathematical physics machinery of the Treatise. In the opinion of the present author, as regards Maxwell’s abilities and persistence in probing difficult issues, this line of inquiry should realistically consider his death from abdominal cancer in 1879 at the relatively young age of 48 after “bringing to bear on a subject still full of obscurity the steady light of patient thought and expending upon it all the resources of a never failing ingenuity.” (Ref. [95], Vol. I, Preface) Indeed, Freeman Dyson, referring to Maxwell’s Presidential Address to Section A of the British Association in 1870, only points out that “It is difficult to read Maxwell’s address without being infuriated by his excessive modesty …” and, writing on Missed Opportunities, he turns the tables on the critics with a scorching criticism of his own: “But the mathematicians of the nineteenth century failed miserably to grasp the equally great opportunity offered to them in 1865 by Maxwell. If they had taken Maxwell’s equations to heart as Euler took Newton’s, they would have discovered, among other things, Einstein’s theory of special relativity, the theory of topological groups and their linear representations, and probably large pieces of the theory of hyperbolic differential equations and functional analysis. A great part of twentieth century physics and mathematics could have been created in the nineteenth century, simply by exploring to the end the mathematical concepts to which Maxwell’s equations naturally lead” [96].

However, apart from the temptation to engage in counterfactual history [97] speculations – what Carr referred to as “parlour games” [98] – about Maxwell’s alleged omissions, there is merit in investigating classical electrodynamics as a logical tool to clarify the role of quantization in dispersion force theory, to strengthen semi-classical arguments and to develop new expository approaches. The logical pathways presented herein suggestively illustrate that Coulomb’s law with point-like, non-polarizable neutral particles does not imply that polarizable neutral particles should not interact. A description of dispersion forces as a natural consequence of classical electrodynamics can greatly help to provide much-needed pedagogical devices for use by educators [25], to dispel the widespread concept of the Casimir effect as a “mystery,” to enhance effective communication with investors, the media and the public, and to stimulate confidence in the viability of dispersion force engineering startup and spin-off companies [18].

Chronologically, it is relevant to notice that the first edition of the Treatise dates to 1873, or thirty-three years after Whitworth’s report at the Glasgow meeting of the British Association (Ref. [99], p. 4) and only two years before Tyndall’s paper read at the Royal Institution [16], which means that the contemporary state of the art in cohesion experimentation had to be well known to Maxwell. As far as atomistic theory – regarded as the indispensable framework of dispersion force physics – in the words of Maxwell’s early biographer, “we are indebted for all the modern developments of the molecular theory of gases, as well as for its establishment on a sound dynamical basis,” [100] mainly to Clausius, Boltzmann, and Maxwell. More specifically about molecular interactions, as shown even just by his famous entry on Capillary Action in the 9th edition of the Encyclopædia Britannica (Ref. [95], Vol. II, p. 541), Maxwell’s contribution to early explorations in the nature of molecular forces was nothing short of substantial and is, by itself, the subject of extensive studies [101]. Coulson, also cited by El’yashevich and Prot’ko [102], provides a list of related “problems advanced by Maxwell: (1) What is a molecule and what is the nature of the aggregate of atoms of which it consists? (2) What is the origin of intermolecular or interatomic forces? And what is their law of dependence on distance and orientation? (3) Why are molecules so invariable in character with no evolutionary or continuously varying properties? (4) How does a molecule form?” [103].

Coulson summarizes the situation by stating that “Maxwell had almost got to the limit of what he could have done in the discussion of interatomic forces” and he names a formidable list of items unavailable to Maxwell including “the discovery of the electron…the nuclear atom, …electron shells, …stationary states, …the wave equation, and …the Pauli exclusion principle” [103]. Use of the adverb “almost” is due to the opinion that “Maxwell could have been expected to make further progress than he did” only in the latter of items (2), that is, in ascertaining the “form of the interatomic and intermolecular force …by making more use of Clausius’ virial theorem.” If so, what can be said about “the origin of intermolecular or interatomic forces” by means of Maxwell’s equations? Coulson reflects on “how impossible it was that Maxwell should have been able to describe either the dispersion attractive forces, or the spin repulsion and attractions.” The assessment is that accounts of interatomic force physics before and after the development of modern quantum theory “appear to have almost nothing in common” [103].

This peremptory position is justifiable from the standpoint of modern quantum electrodynamics but it fails to capture the subtler mathematical implications of Maxwell’s equations. For instance, shortly after Maxwell’s early death, Lebedev could already discern such connections with extraordinary clarity and rare intuition in his doctoral dissertation: “Hidden in Hertz’s research, in the interpretation of light oscillations as electromagnetic processes, is still another as yet undealt with question, that of the sources of light emission …such a problem leads us …quite unexpectedly as it were, to one of the most complicated problems of modern physics – the study of molecular forces. Adopting the point of view of the electromagnetic theory of light, we must state that between two radiating molecules, just as between two vibrators in which electromagnetic oscillations are excited, there exist ponderomotive forces: They are due to the electrodynamics interaction between the alternating electric current in the molecules (according to Ampere’s laws) or the alternating charges in them (in accord to Coulomb’s laws); we must therefore state that there exist between the molecules in such a case molecular forces whose cause is inseparably linked with the radiation processes …” (Footnote 6). The obvious modern objection to any argument based on the “radiation process” is that atoms, in their stationary states, are in fact not radiating. This same severe limitation is mentioned much later by Casimir in his critique of Overbeek’s intuitive, but “misleading” [107, 108] model – an imperfect model that, as Casimir nevertheless generously repeats, provided the initial impetus towards the expression for the Casimir-Polder force.

Disregarding this issue of principle, in order to anticipate the connection to dispersion forces, Maxwell would have had to speculate about the inner structure of atoms to conclude that their charged constituents, if driven from their positions of equilibrium, will oscillate, radiate, and interact as Lebedev suggested. In the present work, we took advantage of Lorentz atomic models “to go further than Maxwell” (Ref. [109], §122) but, as remarked by Coulson [103], Maxwell’s description of polarized matter had to be formulated prior to the discovery of the electron by J. J. Thomson in 1897 [110, 111]. Although the historical development of our understanding of molecular structure following Maxwell is a complex subject [2, 112,113,114], Yaghjian judges that “the closest he seems to approach the idea of dielectrics containing dipoles is …in explaining the theory of Mossotti that a dielectric contains small conducting elements insulated from one another and capable of charge separation (forming dipoles)” [115].

Mossotti’s results [116, 117] regarding atomic structure “were based on an ether concept typical of his epoch, and are hence difficult to follow for the modern reader” [118]. Remarkably, however, Mossotti “utilised a mathematical method which had been developed by Poisson for the examination of a similar question in magnetism.” (Ref. [119], §142–151; see also Ref. [112], pp. 188–189). As explained much later by Van Vleck, “this concept of the polarization of the molecule as the cause of the departures of 𝜖 and μ from unity is by no means a purely twentieth-century concept, and was intimated by Faraday” [120]. Indeed, Mossotti’s starting point had been Faraday’s conclusion that “it is the molecules of the substances that polarize as wholes …and that however complicated the composition of a body may be, all those particles or atoms which are held together by chemical affinity to form one molecule of the resulting body, act as one conducting mass or particle …” (Ref. [121], §1699–1700). Van Vleck continues: “In 1836, Mossotti pictured the molecule as a conducting sphere of radius a, on which the charge would, of course, readjust or ‘polarize’ itself under the influence of an applied field, thus making the molecular moment different from zero. If the electric susceptibility χ e is small compared to unity, he thereby showed that χ e = Na 3. It seems almost too hackneyed to mention that the values of a obtained from this simple equation (together with the observed N and χ e) are comparable in magnitude with the molecular radii in kinetic theory” [120] (N is the number of atoms in the unit volume; see also Ref. [122], Sec. 10.12).

Maxwell was obviously deeply affected by Mossotti’s work, which he repeatedly cites and critiques over a period of several years, somewhat appearing to vacillate between acceptance and doubt but never fully endorsing it. For instance, in 1841, in a short communication devoted to that subject, Maxwell writes that “although M. Mossotti’s general view may be correct, I believe it will be found that his analysis is erroneous” [123]. In 1864, in A dynamical theory of the electromagnetic field, Maxwell lucidly explains that “in a dielectric under the action of electromotive force, we may conceive that the electricity in each molecule is so displaced that one side is rendered positively and the other negatively electrical, but that the electricity remains entirely connected with the molecule, and does not pass from one molecule to the other” (Footnote 7) [95] (Vol. I, p. 526). In 1869, in his paper On the Mathematical Classification of Physical Quantities, he judges rather ungenerously that “Mossotti …was enabled to make use of the mathematical investigation of Poisson relative to magnetic induction, merely translating it from the magnetic language into the electric, and from French into Italian” [95] (Vol. II, p. 258). Finally, in the Treatise, Maxwell comments that “This theory of dielectrics is consistent with the laws of electricity, and may be actually true.” (Vol. I, §62).

The early proof by Mossotti [117], recast by Jeans in modern notation (Ref. [119], §149), remains a standard textbook calculation in the electrostatics of dielectric media to this day. Of particular interest in our case are the even simpler model with one central, point-like positive charge surrounded by a cloud with homogeneous negative charge density [122] and that of two homogeneously and oppositely charged spheres superimposed on each other in the absence of an external field [124]. In both cases, the polarizability of the system is shown to be α ≈ a 3. As pointed out by Lorentz (Ref. [109], §124), such results are only rigorously valid for the static polarizability. The extension to the dynamic polarizability, obtained by treating the time-dependent case, was carried out first by Lorenz [125] and, independently, by Lorentz himself – “certainly a curious case of coincidence” [109].

In the Introduction, we stated that two neutral, polarizable particles will not interact “if an independent means to produce polarization is absent.” In fact, Spruch considered the dipole-dipole interaction of two particles of dynamic polarizabilities α 1,2(ω) within a volume \(\mathbb {V}\) much larger than the size of the particles, and immersed within a background classical field E 0(ω, r), where r is the position vector [126]. On realizing that – consistently with Lebedev’s and Overbeek’s intuitions – only the “radiation” term proportional to 1∕r contributes to the potential, \(V^{\mathrm {class}}_{\mathrm {pol\ pol}}(r)\), and by introducing the energy density u(ω) = |E 0(ω, r)|2 of the mode of frequency ω of the smoothly varying electric field, this treatment leads to the following result:

$$\displaystyle \begin{aligned} V^{\mathrm{class}}_{\mathrm{pol\ pol}}(r) = {\mathbb{V}\over c^5 r}\int^{c/r}_0\,\,\, \alpha_1(\omega)\alpha_2(\omega)\, u(\omega)\, \omega^4\, d\omega\, . {} \end{aligned} $$
(8.36)

Crucially, Spruch comments that the above is “a result that Maxwell could have derived, and perhaps did” [126]. The first part of this statement – once again in the tradition of results Maxwell allegedly missed – is correct but quite bold given the logical chain we have seen is needed to reach the concept of dynamic polarizability even if just with a rudimentary model for a “molecule.” The latter part of the statement, although an effective ‘narrative hook,’ is probably fictional.

The connection between the above equation and quantum electrodynamics is made simply by writing the energy density per mode of the zero-point field as \(\mathbb {V}u(\omega ) = \textstyle {1\over 2}\hbar \omega \), thus leading to the Casimir-Polder potential, \(V_{\mathrm{C-P}}\), only as “the last step” [126]:

$$\displaystyle \begin{aligned} V_{\mathrm{C-P}} \sim -{\hbar c\, \alpha_1(0)\alpha_2(0)\over r^7}\, . \end{aligned} $$
(8.37)
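
As a rough check of how Eq. (8.36) produces the power law of Eq. (8.37), the substitution can be carried out explicitly. The sketch below assumes, beyond what is stated above, that the polarizabilities may be replaced by their static values \(\alpha_{1,2}(0)\) over the whole integration range \(0 \le \omega \le c/r\). It tracks only the magnitude, since Eq. (8.36) as written does not fix the overall sign:

$$\displaystyle \begin{aligned} \left|V^{\mathrm{class}}_{\mathrm{pol\ pol}}(r)\right| \approx {\alpha_1(0)\alpha_2(0)\over c^5 r}\,{\hbar\over 2}\int^{c/r}_0 \omega^5\, d\omega = {\alpha_1(0)\alpha_2(0)\over c^5 r}\,{\hbar\over 2}\,{1\over 6}\left({c\over r}\right)^6 = {1\over 12}\,{\hbar c\, \alpha_1(0)\alpha_2(0)\over r^7}\, . \end{aligned} $$

The \(r^{-7}\) dependence and the combination \(\hbar c\, \alpha_1(0)\alpha_2(0)\) follow directly. The numerical factor 1/12 is a by-product of this order-of-magnitude estimate and should not be read as the coefficient of the full Casimir-Polder result.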

However, the presence of this random field of intensity proportional to ħ in Spruch’s approach does not imply that quantization of the electromagnetic field is needed to reach the Casimir-Polder expression. In fact, the mere introduction of a fluctuating classical field of appropriate specific energy density leads to the same mathematical results obtained by the methods of QED. This was stressed by Casimir, who, writing one year before his death in 2000, commented on this approach that “The problem in quantum electrodynamics is then reduced to a problem in classical electrodynamics” [20].

As Spruch and Kelsey explained: “Why vacuum-fluctuation arguments worked in the past in the problems to which they were applied is, to our knowledge, not completely understood, but the simplicity of the approach gives it considerable appeal, as a means of providing physical insight into known results and as a means of suggesting new results” [127]. Consequently, a very extensive literature reporting dispersion force calculations now exists based not on standard field quantization but on the injection into the system of an appropriate classical field – an approach sometimes referred to as ‘random’ or ‘stochastic’ electrodynamics (SED) [128,129,130,131,132].

A succinct and lucid presentation of the logical premises of SED has been given by Milonni (Ref. [30], Sec. 8.12; see also Ref. [133], Sec. 5). He explains: “In QED we cannot arbitrarily set to zero the homogeneous, source-free solution of the Maxwell operator equations in the Heisenberg picture. This “vacuum” field is necessary for the formal consistency of QED…Classically, however, we generally assume implicitly that the homogeneous solution of the Maxwell equations is that in which the electric and magnetic fields vanish identically. That is, we assume that there are no fields in the absence of any sources…This difference between classical and quantum electrodynamics, together with the evident importance of the fluctuating vacuum field in QED, suggests the adoption of a different boundary condition in classical electrodynamics: instead of assuming that the classical field vanishes in the absence of sources, we can assume that there is a fluctuating classical field with zero-point energy \(\textstyle {1\over 2}\hbar \omega \) per mode. Whether it is a better working assumption than the standard, “obvious” one is a matter to be ultimately decided by comparison with experiment” [30]. From this point of view, important parallels and connections exist with the cosmological constant problem, which has been speculated to be connected to gravitating zero-point energy affecting the expansion of the universe [134,135,136].

Fascinatingly, as first shown by Marshall [128], any classical zero-point field of spectral energy density \(\rho_0(\omega) \propto \omega^3\) is Lorentz invariant; furthermore, the specific choice \(\rho_0(\omega) = \hbar\omega^3/2\pi^2c^3\), well known from QED, causes a mean square displacement fluctuation in a classical charged harmonic oscillator equal to that of the corresponding quantum problem. Shortly after these findings, Boyer proposed to elevate the Lorentz invariance of the zero-point spectrum to the role of an SED postulate [137, 138]. Therefore, “…we require that the spectrum of the radiation shall look the same to all observers moving at constant relative velocity with respect to each other” [137] and “…ħ enters our theory, not as any quantum of action, but solely as the constant setting the scale of the zero-point electromagnetic radiation spectrum” [139].
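
The specific spectrum quoted above can be related to the zero-point energy per mode in one line. The sketch assumes the standard count of electromagnetic modes in free space per unit volume and unit angular frequency, \(\omega^2/\pi^2c^3\) with both polarizations included; this mode density is not given in the text and is supplied here as an assumption:

$$\displaystyle \begin{aligned} \rho_0(\omega) = {\omega^2\over \pi^2 c^3}\times {1\over 2}\hbar\omega = {\hbar\omega^3\over 2\pi^2 c^3}\, . \end{aligned} $$

Under that assumption, an energy of \(\textstyle {1\over 2}\hbar \omega \) per mode yields precisely the \(\omega^3\) dependence singled out by the invariance argument, with ħ entering only as the overall scale.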

As Boyer reported, “Some readers of this classical electromagnetic analysis are distressed, even indignant at the idea of a “classical” electromagnetic zero-point radiation. They insist that zero-point radiation is a “quantum” idea which can not be used as part of classical physics. However, surely this objection is without merit” [131]. Such strong opinions also show through in the language used in the scientific literature. For instance, Boyer judges that “The idea of quanta forms a subterfuge for what is a natural part of a theory of classical statistical thermodynamics including electromagnetism,” [139] using the noun subterfuge three more times in the course of the same paper. As though in direct response, Milonni, typically quite impartial, turns the same terminology against random electrodynamics, which he describes as “at best an interesting subterfuge” (Ref. [133], Sec. 5.2) before using the same term twice more in the same paper (Ref. [133], Sec. 5.5). The sometimes acrimonious debate as to why theories that are not fully quantum lead to correct results for dispersion forces – only partially within our scope herein – continues unabated to this day [18, 140], as this approach is now also being tested to probe spacetime fluctuations in the weak field limit of the gravitational field, for which no quantization scheme is yet known [22, 141].

The adoption of a classical random field described by ħ “…as a multiplicative constant chosen by comparison of theoretical predictions with experiment” [30] changes none of the mathematical findings by Spruch. In the fully retarded regime in which the only significant contribution to the integral in Eq. (8.36) comes from the static polarizabilities \(\alpha_{1,2}(0)\) – provided by experimental measurements – we finally recover the quantum electrodynamical Casimir-Polder expression, \(V_{\mathrm{C-P}}\), from completely classical considerations (see Ref. [142], Sec. IV). The classical picture of dispersion forces to emerge is therefore that of an interaction caused by a Lorentz invariant stochastic field postulated to fill the universe and driving the process of mutual atomic polarization.Footnote 8

On the one hand, from the historical point of view, we can now ask: “Was such a description within Maxwell’s hypothetical reach?” As we have seen, this would have required both a model for the dynamical polarizability – or at least its static limit – and the concept referred to today as the Lorentz covariance of Maxwell’s equations as the framework to accommodate a classical zero-point field. Strictly speaking, covariance would not be understood until after Maxwell’s death, that is, at the very earliest, until the little-cited discovery of the “Voigt transformations” [143,144,145,146] in 1887 and the later critical re-elaborations by Lorentz and Poincaré [147, 148]. Even with such machinery, however, the existence of homogeneous solutions of Maxwell’s equations different from the “true vacuum” (all fields equal to zero) was not truly appreciated until the use by Lifshitz [54] of Rytov’s “random field” [149] and the much later work by Marshall cited above. Of course, an appreciation of the fact that any external field, invariant or not, could “drive” dispersion forces as shown by a rudimentary model of polarizability was within Maxwell’s potential reach, and that might have led to further speculations as to the form of the intermolecular potential even earlier than the discovery of invariance. In fact, as we have explored in this work, the introduction of external fields is a common strategy to engineer dispersion forces [132]. If the existence of random fields had been speculated and if dynamical polarization had been at least tentatively modeled, Maxwell could have logically connected electromagnetism to the existence of dispersion forces and to cohesion. All such suppositions, however, must deal with the reality of Maxwell’s early death, and one is left speculating about the many ways in which he would have further contributed to physics if, as for Casimir, his life had spanned 90 years, that is, until 1921.