1 Introduction: The Standard Model and Gravity

Throughout its history, theoretical physics has sought to identify the fundamental constituents of matter and understand how they behave. This objective has led to the construction and operation of increasingly large particle colliders, with which these constituents have been studied with ever greater precision, and ultimately to the discovery and validation of the Standard Model of particle physics. This model stands as a testament to the work of many people, and some might claim it would constitute a theory of everything once a consensus is reached on how to incorporate the gravitational force [39].

Fig. 1 Standard Model and gravitational force strength versus distance

Interestingly, the coupling constants determining the strength with which fundamental particles within the Standard Model interact imply the weak force is \(10^{24}\) times stronger than the force of gravity, as calculated between a pair of protons in an atomic nucleus [47]. In fact, the strength of the weak force between point particles is not expected to compare to the strength of the gravitational force until the distance between the particles decreases by 20 orders of magnitude from the Fermi scale (\(\approx 10^{-15}\,\hbox {m}\)) to the Planck scale (\(\approx 10^{-35}\,\hbox {m}\)). Prior to the advent of string theory, this fact made unification of the Standard Model forces with gravity awkward, because particle collider experiments suggest the Standard Model forces unify in strength at electroweak and grand unification distance scales (\(10^{-18}\,\hbox {m}\) and \(10^{-31}\,\hbox {m}\)), without suggesting why the gravitational force should exist at all. Figure 1 plots the variation of the Standard Model force strengths with distance, and marks the Planck scale, at which a theory of everything (e.g. string theory) has been hypothesized to provide a unified description of the forces of Nature [58].

Visible in Fig. 1 is the symmetry breaking of the electroweak force at the electroweak scale (\(10^{-18}\,\hbox {m}\)), where it branches into the weak nuclear force and electromagnetic (EM) force described over distance scales relevant to the chemistry of atoms [57]. Technically, these forces are described by computing the probabilities of quantum state transitions in terms of Feynman diagrams, which, as shown in Fig. 2, represent forces as squiggly lines. Importantly, while it is tempting to think of Feynman diagrams as pictures of real processes occurring in 3+1 dimensional Minkowski spacetime, they are in fact mathematical bookkeeping devices used for purposes of calculation, and for this reason the squiggly lines representing the electromagnetic force in Fig. 2 are called virtual photons. This Standard Model description of the electromagnetic force is markedly different from the classical description given by Maxwell, in which the electromagnetic force is mediated by real electromagnetic waves propagating between charged particles.

From a reductionist perspective, the quantum description of the electromagnetic force should give rise to the classical description at the scale of atomic radii (\(10^{-10}\,\hbox {m}\)). Technically, this requires a distance-dependent parameter \(\alpha \), quantifying the strength of the quantum electromagnetic interaction, to account for the strength of the classical electromagnetic force at atomic distance scales. This parameter \(\alpha \), typically understood to encompass the total effect of virtual photon interactions at sub-atomic length scales, is called the fine structure constant, and derives its name from its appearance in formulae for special relativistic corrections \(\delta E_{n,l}\) to the energy levels \(E_{n,l}\) of the hydrogen atom [53]:

$$\begin{aligned} \delta E_{n,l}=\alpha ^{2}E_{n,l}\left( \frac{1}{n\left( l+\frac{1}{2}\right) }-\frac{3}{4n^{2}}\right) . \end{aligned}$$
(1)
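To give a sense of the size of the corrections in Eq. (1), the following minimal sketch evaluates them for the lowest hydrogen levels. It assumes the unperturbed energies are the Bohr energies \(E_{n,l}=-13.605693\,\hbox {eV}/n^{2}\) and uses the approximate value \(\alpha \approx 1/137.036\); the function names are hypothetical and chosen for illustration only.

```python
ALPHA = 1.0 / 137.036       # approximate fine structure constant (assumed input)
RYDBERG_EV = 13.605693      # hydrogen ground state binding energy in eV (assumed input)

def bohr_energy(n):
    """Unperturbed Bohr energy E_n in eV; it does not depend on l."""
    return -RYDBERG_EV / n**2

def relativistic_correction(n, l):
    """Evaluate Eq. (1) for principal quantum number n and orbital quantum number l, in eV."""
    if n < 1 or not 0 <= l < n:
        raise ValueError("quantum numbers must satisfy n >= 1 and 0 <= l < n")
    bracket = 1.0 / (n * (l + 0.5)) - 3.0 / (4.0 * n**2)
    return ALPHA**2 * bohr_energy(n) * bracket

for n in (1, 2, 3):
    for l in range(n):
        print(f"n={n} l={l}  delta E = {relativistic_correction(n, l):.3e} eV")
```

Because the bracket in Eq. (1) is positive for every allowed pair (n, l) and the Bohr energies are negative, each level is shifted downward by an amount of relative order \(\alpha ^{2}\).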

Numerically, this constant is expressible in terms of the electric charge e, speed of light c, Planck’s constant \(\hbar \), and permittivity of free space \(\epsilon _{0}\), as:

$$\begin{aligned} \alpha =\frac{e^{2}}{(4\pi \epsilon _{0})\hbar c}\approx 1/137.036, \end{aligned}$$
(2)
Fig. 2 Feynman diagram of a virtual photon mediating the electromagnetic force

and a well-reasoned explanation of its value has long been a subject of interest and controversy amongst physicists. For example, while many modern-day physicists accept an anthropic explanation of \(\alpha \)’s experimentally measured value, which suggests its value could be different in other spacetime regions inhospitable to human existence, one of the progenitors of quantum physics, Max Born, had this to say about its value [30, 42, 46]:

If \(\alpha \) were bigger than it really is, we should not be able to distinguish matter from ether, and our task to disentangle the natural laws would be hopelessly difficult. The fact however that alpha has just its value 1/137 is certainly no chance, but itself a law of Nature.
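The numerical value quoted in Eq. (2) can be reproduced directly from the constants appearing in that formula. The sketch below assumes CODATA 2018 values in SI units as inputs; the variable names are ours.

```python
import math

# CODATA 2018 values in SI units (inputs assumed for this sketch)
E_CHARGE = 1.602176634e-19     # elementary charge e, in coulombs
HBAR = 1.054571817e-34         # reduced Planck constant, in joule seconds
C_LIGHT = 299792458.0          # speed of light, in metres per second
EPSILON_0 = 8.8541878128e-12   # permittivity of free space, in farads per metre

def fine_structure_constant():
    """Evaluate Eq. (2): alpha = e^2 / (4 pi epsilon_0 hbar c), a dimensionless number."""
    return E_CHARGE**2 / (4.0 * math.pi * EPSILON_0 * HBAR * C_LIGHT)

alpha = fine_structure_constant()
print(f"alpha   = {alpha:.6e}")
print(f"1/alpha = {1.0 / alpha:.3f}")
```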

Computationally, inserting the fine structure constant into the relativistic Schrödinger equation, also known as the Dirac equation, allows relativistic corrections to the Bohr emission spectrum of atomic hydrogen to be calculated [40]. Interestingly, upon separation of radial and angular polar coordinates, this equation reduces to a confluent hypergeometric differential equation, a fact that links the mathematical formalism of quantum physics to number theory, because confluent hypergeometric functions appear in the theory of automorphic forms [8]. Of course, absent any further physical insight, this link is dismissible as a mathematical coincidence with no relevance to physics. Our objective in this paper is therefore to review how different number theoretic concepts have been found relevant to the study of physics, with the underlying motive of clarifying whether or not number theory has an essential role to play in physical modeling that could account for the value of fundamental physical constants such as \(\alpha \).

To this end, Chapter 2 begins by discussing the statistical relationship between the experimentally measured quantum mechanical spectra of atomic nuclei and the zeroes of Langlands L-functions discovered in the 1950’s. Chapter 3 follows by explaining how the mathematical formalism of classical mechanics relates to the definition of Hasse–Weil L-functions. Chapter 4 discusses how another class of number theoretic functions, called modular forms, arise in the study of thermodynamics as statistical mechanical partition functions. Chapters 5 and 6 introduce the subjects of twistor theory and conformal field theory as a means of relating the discussion in the previous chapters to the study of gravity. Chapter 7 concludes with some remarks regarding the relevance of number theory to better understanding the fine tuning of fundamental physical constants, and the relevance of number theory to applied physics.

2 Nuclear Physics and Random Matrices

Just as the discrete absorption and emission of electromagnetic radiation by atoms and molecules reveals their quantum mechanical quality, the discrete absorption of neutrons reveals the quantum mechanical quality of atomic nuclei. To appreciate what is meant by this statement, Fig. 3 shows the experimentally measured gamma ray emission from \({}^{68}\)Zn nuclei bombarded with neutrons, plotted against incident neutron energy. Theoretically, the emission peaks in this plot indicate the capture of incident neutrons by \({}^{68}\)Zn nuclei, which produces excited \({}^{69}\)Zn nuclear states that relax via emission of electromagnetic radiation [37]. Accordingly, it is reasonable to associate a different excited state of the \({}^{69}\)Zn nucleus with each emission peak, and take Fig. 3 as evidence there are numerous closely spaced energy levels of the \({}^{69}\)Zn nucleus.

Fig. 3 Experimentally measured gamma ray emission from \({}^{68}\)Zn nuclei bombarded with accelerated neutrons, plotted against incident neutron energy [37]

In general, large numbers of closely spaced gamma ray emission peaks are experimentally observed when heavy nuclei are bombarded with neutrons, and the irregularity of their spacing makes identification of their energy levels with a small set of quantum numbers impossible [11]. Nevertheless, in 1956, after having studied the statistical distribution of these energy levels, Wigner proposed a probability distribution for the spacing of adjacent energy levels of heavy nuclei in terms of the distribution of eigenvalues of random real symmetric \(2\times 2\) matrices [14]. This probability distribution is:

$$\begin{aligned} p(x)dx = {\pi \over 2}x \exp (-\pi x^2 / 4)dx \end{aligned}$$
(3)

where x is the energy level spacing normalized by the average spacing, and attention is restricted to real symmetric matrices because nuclear interactions exhibiting time reversal invariance are described by Hermitian Hamiltonian operators that equate with their complex conjugates [54]. A histogram of normalized energy level spacings of Uranium 238 nuclei plotted against the Wigner distribution is shown in Fig. 4.
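Distribution (3) can be checked against a direct simulation. The following sketch draws random real symmetric \(2\times 2\) matrices, assuming independent Gaussian entries with variance 1 on the diagonal and 1/2 off the diagonal (the usual Gaussian orthogonal ensemble convention, which is our assumption here), and compares the cumulative distribution of their normalized eigenvalue spacings with the integral of Eq. (3).

```python
import math
import random

def wigner_cdf(x):
    """Cumulative form of Eq. (3): integral of (pi/2) t exp(-pi t^2 / 4) from 0 to x."""
    return 1.0 - math.exp(-math.pi * x * x / 4.0)

def sample_normalized_spacings(n, seed=0):
    """Eigenvalue spacings of n random real symmetric 2x2 matrices, divided by their mean."""
    rng = random.Random(seed)
    raw = []
    for _ in range(n):
        a = rng.gauss(0.0, 1.0)              # diagonal entries, variance 1
        c = rng.gauss(0.0, 1.0)
        b = rng.gauss(0.0, math.sqrt(0.5))   # off-diagonal entry, variance 1/2
        # The two eigenvalues of [[a, b], [b, c]] differ by sqrt((a - c)^2 + 4 b^2).
        raw.append(math.sqrt((a - c) ** 2 + 4.0 * b * b))
    mean = sum(raw) / n
    return [s / mean for s in raw]

def empirical_cdf(samples, x):
    """Fraction of samples below x."""
    return sum(1 for s in samples if s < x) / len(samples)

spacings = sample_normalized_spacings(100_000)
for x in (0.25, 0.5, 1.0, 2.0):
    print(f"x={x:4.2f}  sampled={empirical_cdf(spacings, x):.4f}  Eq.(3)={wigner_cdf(x):.4f}")
```

The linear factor x in Eq. (3) shows up in the simulation as a scarcity of very small spacings, a feature usually referred to as level repulsion.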

Fig. 4 Histogram of Uranium 238 nuclear energy level spacings plotted against the Wigner distribution [14]

Because heavy nuclei have more than 2 excited states, Wigner also studied the eigenvalue statistics of random real symmetric \(M\times M\) matrices, whose distribution of adjacent eigenvalue spacings turns out to be very similar to distribution (3), and relevant to the statistical distribution of Langlands L-function zeroes along the critical line Re \(s = {1\over 2}\) [18, 35]. In fact, number theorists have conjectured that statistical metrics describing the distribution of L-function zeroes are equivalent to statistical metrics describing the eigenvalues of random \(M\times M\) matrices of various symmetry classes (e.g. real symmetric). Henceforth, we’ll restrict our attention to the case where the random matrices are Hermitian, with complex entries selected at random according to the Gaussian unitary ensemble (GUE) [44]. In this case, as \(M\rightarrow \infty \), the normalized spacing x between adjacent random matrix eigenvalues is distributed, to a very good approximation, according to:

$$\begin{aligned} p(x)dx = {32 \over {\pi }^2}x^2 \exp (-4 x^2 / \pi )dx, \end{aligned}$$
(4)

and conjecturally, this distribution coincides with the distribution of normalized spacings between non-trivial zeroes of primitive Langlands L-functions \(\mathcal {L}(\pi ,s)\) along the critical line Re \(s = {1\over 2}\) as Im \(s\rightarrow \infty \) [4, 48]. According to this conjecture, the precise definition of the primitive Langlands L-function determines the manner in which the GUE distribution limit is approached as Im \(s\rightarrow \infty \), but not the distribution itself [28]. Figure 5 shows a graph of the normalized eigenvalue spacing distribution of GUE random matrices plotted against the normalized spacing of the ten thousand non-trivial zeroes of the Riemann zeta function (the prototypical example of a Langlands L-function) above the \(10^{12}\)th non-trivial zero up the critical line [43].
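As a sanity check on the GUE spacing law, the sketch below samples random Hermitian \(2\times 2\) matrices and compares their normalized eigenvalue spacings with the cumulative form of the normalized density \((32/\pi ^2)x^2\exp (-4x^2/\pi )\), which integrates to one and has unit mean. The entry variances (1 on the diagonal, 1/2 for the real and imaginary off-diagonal parts) are the usual GUE convention and are an assumption of this sketch; the \(2\times 2\) case only approximates the \(M\rightarrow \infty \) limit discussed above.

```python
import math
import random

def gue_cdf(x):
    """Integral of (32/pi^2) t^2 exp(-4 t^2 / pi) from 0 to x, in closed form."""
    return (math.erf(2.0 * x / math.sqrt(math.pi))
            - (4.0 * x / math.pi) * math.exp(-4.0 * x * x / math.pi))

def sample_gue_spacings(n, seed=0):
    """Eigenvalue spacings of n random Hermitian 2x2 matrices, divided by their mean."""
    rng = random.Random(seed)
    sd_off = math.sqrt(0.5)
    raw = []
    for _ in range(n):
        a = rng.gauss(0.0, 1.0)       # real diagonal entries
        d = rng.gauss(0.0, 1.0)
        b = rng.gauss(0.0, sd_off)    # real part of the off-diagonal entry
        c = rng.gauss(0.0, sd_off)    # imaginary part of the off-diagonal entry
        # Eigenvalues of [[a, b + ic], [b - ic, d]] differ by sqrt((a - d)^2 + 4 (b^2 + c^2)).
        raw.append(math.sqrt((a - d) ** 2 + 4.0 * (b * b + c * c)))
    mean = sum(raw) / n
    return [s / mean for s in raw]

def empirical_cdf(samples, x):
    """Fraction of samples below x."""
    return sum(1 for s in samples if s < x) / len(samples)

gue_spacings = sample_gue_spacings(100_000)
for x in (0.25, 0.5, 1.0, 2.0):
    print(f"x={x:4.2f}  sampled={empirical_cdf(gue_spacings, x):.4f}  surmise={gue_cdf(x):.4f}")
```

Compared with the real symmetric case, the quadratic factor \(x^2\) suppresses small spacings more strongly.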

Fig. 5 Probability distribution of the normalized spacing between eigenvalues of random \(M\times M\) Hermitian matrices in the limit \(M\rightarrow \infty \), plotted against the distribution of normalized spacings between ten thousand non-trivial zeroes of the Riemann zeta function [43]

Notably, beyond its relevance to nuclear physics and number theory, random matrix theory also describes the normalized energy level spacings of quantum systems defined by “quantizing” classical physical systems with chaotic dynamics [12]. These quantum systems are defined by interpreting classical Hamiltonian functions as quantum operators with discrete spectra, and their normalized energy level spacing statistics agree with the predictions of random matrix theory in the limit \(\hbar \rightarrow 0\) [6]. In the next chapter, we’ll explain how the mathematical formalism of classical physics relates to the definition of a second class of L-functions, called Hasse–Weil L-functions.

3 Classical Dynamics and L-Functions

In classical physics, the real time dynamics of a closed system of finitely many massive point particles in 3 dimensions are specified by a single Hamiltonian function from a classical phase space \(\mathcal {J}_{\mathbf {x},\mathbf {p}}\) to the real numbers:

$$\begin{aligned} H:\mathcal {J}_{\mathbf {x},\mathbf {p}}\rightarrow \mathbb {R}, \end{aligned}$$
(5)

whose associated vector field determines a 1 dimensional classical system trajectory in \(\mathcal {J}_{\mathbf {x},\mathbf {p}}\) exhibiting periodic and/or chaotic behavior [20]. Observing that the conserved value \(\langle H\rangle \) of the Hamiltonian function H along a trajectory in \(\mathcal {J}_{\mathbf {x},\mathbf {p}}\), identifying the system’s total kinetic and potential energy, specifies a characteristic frequency \(\langle H\rangle /\hbar \) at which to track the system trajectory in real time, we can define a map:

$$\begin{aligned} \mathcal {F}^*_{\langle H\rangle /\hbar }:\mathcal {J}_{\mathbf {x},\mathbf {p}}\rightarrow \mathcal {J}_{\mathbf {x},\mathbf {p}}, \end{aligned}$$
(6)

from \(\mathcal {J}_{\mathbf {x},\mathbf {p}}\) to itself, induced by integrating the flow of the Hamiltonian vector field over a time period \(2\pi \hbar /\langle H\rangle \).

A special case of physical interest occurs when \(\mathcal {J}_{\mathbf {x},\mathbf {p}}\) is the Jacobian of a Riemann surface \(\varSigma \), and \(\mathcal {F}^*_{\langle H\rangle /\hbar }\) is the pullback of a map from the Riemann surface to itself [19]:

$$\begin{aligned} \mathcal {F}_{\langle H\rangle /\hbar }:\varSigma \rightarrow \varSigma . \end{aligned}$$
(7)

In this case, the dynamic L-function:

$$\begin{aligned} \mathcal {L}_{Dyn}(\mathcal {F}_{\langle H\rangle /\hbar })=\exp \sum _{j=1}^{\infty } {\frac{e^{-js}}{j}}|Fix \mathcal {F}_{\langle H\rangle /\hbar }^j|, \end{aligned}$$
(8)

is defined by the values of the coefficients \(|Fix \mathcal {F}_{\langle H\rangle /\hbar }^j|\), which count the number of fixed points of the iterated map \(\mathcal {F}_{\langle H\rangle /\hbar }^j\) in \(\varSigma \) [49]. In general, unlike number theoretic Langlands L-functions, conjectured to satisfy the generalized Riemann hypothesis, the zeroes of this dynamic L-function in the complex s-plane do not lie on the critical line Re \(s=\frac{1}{2}\) [51]. However, in the event \(\varSigma \) is an algebraic manifold, it can, by virtue of the existence of number theoretic maps called Frobenius morphisms, be associated with infinitely many dynamic L-functions proven to independently satisfy the Riemann hypothesis [41].
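To make the fixed point counting in Eq. (8) concrete for a Frobenius morphism, consider the following sketch. It assumes a hypothetical genus 1 example, the curve \(y^2=x^3+x+1\) over \(\mathbb {F}_5\), writes \(u=e^{-s}\), and uses the standard trace recurrence for the point counts over \(\mathbb {F}_{5^j}\), which are the fixed points of the j-th iterate of Frobenius. The check that the zeroes satisfy \(|u|=5^{-1/2}\) corresponds to Re \(s=\frac{1}{2}\) under the conventional identification \(u=5^{-s}\), which is our assumption about normalization.

```python
import cmath
import math

P = 5                  # small prime chosen for illustration
COEF_A, COEF_B = 1, 1  # hypothetical curve y^2 = x^3 + x + 1 over F_5

def count_points(p, a, b):
    """Points of the curve over F_p (fixed points of Frobenius), including the point at infinity."""
    count = 1
    for x in range(p):
        rhs = (x**3 + a * x + b) % p
        count += sum(1 for y in range(p) if (y * y - rhs) % p == 0)
    return count

def fixed_point_counts(p, n1, jmax):
    """|Fix F^j| for j = 1..jmax from the trace recurrence t_j = t t_{j-1} - p t_{j-2}."""
    t = p + 1 - n1
    traces = [2, t]
    for _ in range(2, jmax + 1):
        traces.append(t * traces[-1] - p * traces[-2])
    return [p**j + 1 - traces[j] for j in range(1, jmax + 1)]

def l_series(u, counts):
    """Truncation of Eq. (8) with u = exp(-s)."""
    return math.exp(sum(n * u**j / j for j, n in enumerate(counts, start=1)))

def l_closed_form(u, p, n1):
    """Rational form (1 - t u + p u^2) / ((1 - u)(1 - p u)) of the same function."""
    t = p + 1 - n1
    return (1.0 - t * u + p * u * u) / ((1.0 - u) * (1.0 - p * u))

n1 = count_points(P, COEF_A, COEF_B)
counts = fixed_point_counts(P, n1, 60)
trace = P + 1 - n1
zero = (trace + cmath.sqrt(trace * trace - 4 * P)) / (2 * P)  # a root of 1 - t u + p u^2

print("fixed points of the first iterates:", counts[:4])
print("series vs closed form at u = 0.05:", l_series(0.05, counts), l_closed_form(0.05, P, n1))
print("|u| at the zero vs p^(-1/2):", abs(zero), P ** -0.5)
```

For a curve of genus g the numerator has degree 2g, so this genus 1 example has exactly two zeroes, forming a complex conjugate pair.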

Beyond Riemann surfaces, Frobenius morphisms and dynamic L-functions can be associated with any algebraic manifold \(\mathcal {S}_1^{\star }\), and the product of these dynamic L-functions is called a Hasse–Weil L-function \(\mathcal {L}_{\mathcal {S}_1^{\star }}(s)\). Interestingly, it has been conjectured that every Hasse–Weil L-function is identical to a “reciprocal” Langlands L-function \(\mathcal {L}(\pi ,s)\), and this conjecture has been proven in the event the algebraic manifold \(\mathcal {S}_1^{\star }\) is a Shimura variety [2, 52]. To highlight the importance of this mathematical fact, Fig. 6 depicts a mnemonic of L-function reciprocity \(\mathcal {L}_{\mathcal {S}_1^{\star }}(s)=\mathcal {L}(\pi ,s)\) in which the letter “R” stands for “Reciprocal”. The symbols \(\pi \) and \(\rho \) denote the automorphic and Galois representations associated with formal definition of the Langlands and Hasse–Weil L-functions, and the symbols \(\varOmega \) and \(\tau \) denote the period matrix of the Riemann surface \(\varSigma \) and the functional parameter of modular forms introduced in the next chapter [7].

Fig. 6 Mnemonic of L-function reciprocity [7]

4 Partition Functions and Modular Forms

In this chapter, we’ll consider how classical physical systems are modeled when they are open to interact with their environment. In this circumstance, the set of material particles constituting the classical system must be distinguished from those in its environment, at which point the effect of environmental dynamics on the system is modeled statistically. For example, when a classical gas of N identical particles trapped in a 3 dimensional chamber arrives at thermal equilibrium with its environment, Boltzmann statistics model the likelihood of the system occupying a particular region of classical phase space, without reference to any environmental variables other than temperature T. In mathematical terms, this means the classical partition function:

$$\begin{aligned} Z_{C}(T)={\frac{1}{{N!\cdot h^{3N}}} }\int _{\mathcal {J}_{\mathbf {x},\mathbf {p}}}e^{-H/T}\cdot d\mathbf {x}d\mathbf {p}, \end{aligned}$$
(9)

assigns a probability to the system occupying a hypervolume element of \(\mathcal {J}_{\mathbf {x},\mathbf {p}}\) that is weighted by a negative exponential Boltzmann factor \(e^{-H/T}\), to account for variations in the entropy of the environment as a function of its total energy. Based on this assignment, the system is most likely to have a total energy \(H=\langle H \rangle \) where its free energy:

$$\begin{aligned} -T\log Z_{C}(T), \end{aligned}$$
(10)

is minimized [34].

Mathematically, as a function of inverse temperature \(T^{-1}\), the right hand side of Eq. (9) is a decomposition of \(Z_{C}(T)\) into a sum of exponential transients with different decay rates, whose amplitudes can be resolved by taking a Laplace transform (in \(T^{-1}\)):

$$\begin{aligned} Laplace[Z_{C}](s)=\int _0^\infty Z_{C}(T)\cdot e^{-s/T}\cdot d(1/T). \end{aligned}$$
(11)

For example, if we approximate the integral in Eq. (9) as a sum of finitely many transients, its Laplace transform has a simple pole along the real axis in the complex s-plane for each transient, and its residue at each pole specifies the transient amplitude. Therefore, we might suspect there is a natural number theoretic class of classical partition functions whose Laplace transforms are simply related to L-functions, and as it happens, if we’re willing to replace the linear Laplace transform in Eq. (11) with its multiplicative cousin called a Mellin transform (in \(T^{-1}\)):

$$\begin{aligned} Mellin[\varTheta ](s)&=\int _0^\infty \varTheta (T)\cdot \left( \frac{1}{T}\right) ^{s-1}\cdot d(1/T) \end{aligned}$$
(12)
$$\begin{aligned}&=\int _0^\infty \varTheta (T)\cdot e^{(1-s)\log T}\cdot d(1/T), \end{aligned}$$
(13)

there is a class of functions familiar to number theorists, called modular forms, whose Mellin transforms can be L-functions [31].
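The statement about poles and residues of the Laplace transform in Eq. (11) can be illustrated with a finite sum of transients. The sketch below assumes a hypothetical three-term partition function \(\sum _k a_k e^{-E_k/T}\), with amplitudes \(a_k\) and decay rates \(E_k\) chosen arbitrarily, and compares a numerical Laplace transform in \(\beta =T^{-1}\) with the closed form \(\sum _k a_k/(s+E_k)\).

```python
import math

# Hypothetical amplitudes a_k and decay rates E_k of a three-term model partition function
TRANSIENTS = [(1.0, 0.5), (2.0, 1.5), (0.5, 3.0)]

def partition(beta):
    """Finite sum of exponential transients in the inverse temperature beta = 1/T."""
    return sum(a * math.exp(-e * beta) for a, e in TRANSIENTS)

def laplace_numeric(s, beta_max=60.0, steps=60_000):
    """Trapezoid rule approximation of Eq. (11) on the interval [0, beta_max]."""
    h = beta_max / steps
    total = 0.5 * (partition(0.0) + partition(beta_max) * math.exp(-s * beta_max))
    for i in range(1, steps):
        beta = i * h
        total += partition(beta) * math.exp(-s * beta)
    return total * h

def laplace_closed_form(s):
    """Each transient contributes a simple pole at s = -E_k with residue a_k."""
    return sum(a / (s + e) for a, e in TRANSIENTS)

def residue_estimate(k, eps=1e-6):
    """Estimate the residue at the k-th pole as (s + E_k) times the transform near s = -E_k."""
    _, e_k = TRANSIENTS[k]
    return eps * laplace_closed_form(-e_k + eps)

print("numeric vs closed form at s = 1:", laplace_numeric(1.0), laplace_closed_form(1.0))
for k, (a_k, e_k) in enumerate(TRANSIENTS):
    print(f"pole at s = {-e_k}: residue estimate {residue_estimate(k):.6f}, amplitude {a_k}")
```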

Fig. 7 Illustration of a 2 dimensional fermionic wave function \(\psi _\varSigma \) defined as a vector bundle over each point of a genus 3 Riemann surface

Importantly, modular forms are known to arise in physics as quantum partition functions \(Z_{Q}(T)\) of open quantum systems at thermal equilibrium with their environment, and exhibit modular invariance in a parameter \(\tau =iT^{-1}\), where T is temperature [25]. That is, rather than being defined as integrals over classical position and momentum coordinates as in Eq. (9), quantum partition functions \(Z_{Q}(T)\) are defined as sums of Boltzmann factors over the energy eigenstates of quantum Hamiltonian operators. From a 3 dimensional reductionist point of view, this quantum definition is fundamental, because Hamiltonian operator dynamics of atoms should account for Hamiltonian vector field dynamics of larger material objects constructed out of these atoms, and classical partition function (9) is an approximation to a statistical mechanical partition function of a quantum system defined in 3 spatial dimensions. That said, this reductionist view is hypothetical, and the requisite correspondence between quantum and classical descriptions of Nature whereby approximations of this sort are valid remains a subject of considerable physical interest. For example, it has been suggested that quantum-classical correspondence can be understood in terms of coherent quantum states whose real time quantum dynamics most closely resemble the real time classical dynamics of points in classical phase space [60].

Leaving aside the question of whether or not this 3 dimensional reductionist view is fundamentally correct, we’ll point out here that a general class of modular forms arise as partition functions of quantum systems in 2 spatial dimensions. For example, the ratio of theta constant and eta modular forms:

$$\begin{aligned} \frac{\theta _{a,b}(\tau )}{\eta (\tau )} \end{aligned}$$
(14)

is the quantum partition function of a fermionic system defined over a genus 1 Riemann surface, and similar modular invariant partition functions are associated with fermionic systems defined over higher genus Riemann surfaces [1]. As a visual aid, Fig. 7 shows a representation of a 2 dimensional fermionic wave function, labeled \(\psi _{\varSigma }\), defined as a vector bundle over each point of a genus 3 Riemann surface.

In passing, we note that while quantum partition function (14) is a modular form of weight 0, it is the weight 1/2 theta constant modular form whose Mellin transform is simply related to an L-function. That is:

$$\begin{aligned} \int _0^\infty \theta _{0,0}(\tau )\cdot \tau ^{s-1} d\tau -\int _0^1 \tau ^{s-1}\frac{1}{\sqrt{\tau }}d\tau -\int _1^\infty \tau ^{s-1}d\tau =\pi ^{-s}\varGamma (s)\zeta (2s), \end{aligned}$$
(15)

where \(\zeta (s)\) is the Riemann zeta function, and the second and third integrals on the left hand side of the equation subtract divergences of the Mellin transform at small and large values of \(\tau \).
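The right hand side of Eq. (15) can be reproduced numerically in a region where no subtraction is needed. The sketch below assumes the one-sided theta sum \(\sum _{n\ge 1}e^{-\pi n^2 t}\) in a real variable t, whose Mellin transform equals \(\pi ^{-s}\varGamma (s)\zeta (2s)\) for Re \(s>\frac{1}{2}\); this normalization is our assumption, and the conventions for \(\theta _{0,0}\) and \(\tau \) used in the text may differ from it. The check is made at \(s=2\).

```python
import math

def theta_sum(t, tol=1e-18):
    """One-sided theta sum over n >= 1 of exp(-pi n^2 t), truncated once terms fall below tol."""
    total, n = 0.0, 1
    while True:
        term = math.exp(-math.pi * n * n * t)
        if term < tol:
            return total
        total += term
        n += 1

def mellin_of_theta(s, u_min=-12.0, u_max=4.0, steps=1600):
    """Trapezoid rule for the integral of theta_sum(t) t^(s-1) dt after substituting t = exp(u)."""
    h = (u_max - u_min) / steps
    total = 0.0
    for i in range(steps + 1):
        u = u_min + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * theta_sum(math.exp(u)) * math.exp(s * u)
    return total * h

def zeta(x, cutoff=1000):
    """Riemann zeta for real x > 1: direct sum plus an Euler-Maclaurin tail estimate."""
    head = sum(n ** -x for n in range(1, cutoff))
    return head + cutoff ** (1.0 - x) / (x - 1.0) + 0.5 * cutoff ** -x

s = 2.0
mellin_value = mellin_of_theta(s)
closed_form = math.pi ** -s * math.gamma(s) * zeta(2.0 * s)
print("Mellin transform:", mellin_value)
print("pi^-s Gamma(s) zeta(2s):", closed_form)
```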

5 Gauge Fields and Twistors

To begin explaining how our discussion in the previous chapters is related to the study of gravity, we’ll now turn to the subject of gauge fields, and their description in twistor theory. Historically, the first gauge field of interest to physicists was the electromagnetic field in 3+1 dimensional spacetime, whose real time classical dynamics are described by Maxwell’s equations. In this context, the term gauge refers to the mathematical invariance of the electric and magnetic fields when the electric and magnetic potential functions, whose space and time derivatives define the fields, are adjusted in a particular way at every point of spacetime. Such an adjustment is called a gauge transformation, and beyond the electromagnetic field, Yang-Mills fields, including the weak and strong nuclear fields, are invariant under gauge transformations of their respective potential functions [24].

In quantum physics, gauge transformations of Yang-Mills fields occur in conjunction with rotations of material particle wave functions at each point of spacetime [59]. For example, in the case of the strong nuclear field, a quark wave function with 3 components:

$$\begin{aligned} \psi _{quark}=\left( \begin{array}{c}\psi _1\\ \psi _2\\ \psi _3\end{array}\right) , \end{aligned}$$
(16)

representing three possible “colors” red, green, and blue of the quark, is rotated by \(3\times 3\) matrix multiplication at each point of spacetime when the strong field undergoes a gauge transformation. Technically, this is because the Dirac equation, describing the quark wave function in the presence of the strong field, expresses the interaction between the quark and the strong field in terms of the strong field potential, not the field itself [13].
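As a minimal sketch of the color rotation described above, the following code applies a \(3\times 3\) unitary matrix with unit determinant to a three component wave function at a single spacetime point and confirms that the total probability density is unchanged. The particular matrix, built from one mixing rotation and one diagonal phase rotation, and the component values are hypothetical; a full gauge transformation would also vary from point to point and transform the field potential.

```python
import cmath
import math

def matmul(a, b):
    """Product of two 3x3 complex matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def apply(matrix, vector):
    """Multiply a 3x3 matrix into a 3 component column vector."""
    return [sum(matrix[i][k] * vector[k] for k in range(3)) for i in range(3)]

def determinant(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def color_rotation(theta, phi1, phi2):
    """Unitary matrix with unit determinant: a mixing of components 1 and 2 times diagonal phases."""
    c, s = math.cos(theta), math.sin(theta)
    mixing = [[c, 1j * s, 0.0], [1j * s, c, 0.0], [0.0, 0.0, 1.0]]
    phases = [[cmath.exp(1j * phi1), 0.0, 0.0],
              [0.0, cmath.exp(1j * phi2), 0.0],
              [0.0, 0.0, cmath.exp(-1j * (phi1 + phi2))]]
    return matmul(mixing, phases)

def norm_squared(vector):
    """Sum of squared moduli of the components."""
    return sum(abs(z) ** 2 for z in vector)

psi_quark = [0.6, 0.0, 0.8j]             # hypothetical color components (psi_1, psi_2, psi_3)
rotation = color_rotation(0.7, 0.3, -1.1)
psi_rotated = apply(rotation, psi_quark)

print("determinant:", determinant(rotation))
print("norm before:", norm_squared(psi_quark), "norm after:", norm_squared(psi_rotated))
```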

Fig. 8 Simplified conception of twistor theory in which light ray intersections locate material particles at points of Minkowski spacetime

In twistor theory, gauge fields have a mathematical description in twistor space that serves as an alternative to their description in spacetime. However, before stating what this description is, we should first clarify that twistor theory is a non-local approach to physical modeling, in which physical systems, rather than being described as configurations of particles and fields satisfying equations of motion referencing coordinates of points in 3 dimensional Euclidean space, are described in terms of equations of motion referencing the coordinates of points in a space of light rays [45]. In other words, rather than identifying spacetime as the fundamental arena in which to model physical processes, the space of light rays in spacetime is identified as fundamental, with the understanding that light ray intersections identify spacetime points. In this approach to physical modeling, for the sake of intuition, the term “twistor” can be thought of as being synonymous with “light ray”. Figure 8 illustrates a set of light ray intersections in Minkowski spacetime.

In mathematical terms, the space of light rays in 3+1 dimensional spacetime is \(\mathbb {R}^3\times \mathbb {S}^2\), because a light ray at time \(t=0\) can be incident with any point in 3 dimensional space and point in any direction, and differs from twistor space, which is defined as a complex manifold \(\mathbb {C}P^3\) with 6 real dimensions. According to this definition, each twistor is incident with a set of points in “complexified” Minkowski spacetime, and each point in spacetime is incident with a set of twistors constituting a Riemann sphere \(\mathbb {C}P^1\) in twistor space. While providing physical motivation for the definition of twistor space in terms of complex numbers is not our concern here, the definition itself is of immediate importance because it implies twistor space has a natural dual space, in which the coordinates of points are expressed as complex conjugates of the coordinates of points in twistor space. For our purposes, this is important, because it turns out Yang-Mills fields can be conveniently described in terms of vector bundles fibered over twistor space or its dual, but not both. That is to say, in more technical terms, the Yang-Mills field equations in complexified spacetime \(\mathbb {C}^4\), with gauge group \(GL(m,\mathbb {C})\), are mathematically equivalent to a patching condition on rank m holomorphic vector bundles fibered over twistor space [10].

Happily, we can explain how the twistor theoretic description of gauge fields relates to both modular forms and L-functions without detailing the proof of the previous statement. To do this, let’s focus our attention on the set of twistors incident with a point E in real Minkowski spacetime, which is a Riemann sphere \(\mathbb {C}P^1\) in twistor space, and consider a rank m vector bundle \(\psi \) fibered over this sphere. Intuitively, we can think of this vector bundle as defining a wave function over a 2 dimensional subspace of twistor space, in analogy to the way rank 3 quark wave functions are defined as vector bundles over 3+1 dimensional spacetime. Based on the presence of matter-antimatter particle interactions in the Standard Model of quantum physics, let’s also consider a rank m vector bundle \(\overline{\psi }\) fibered over a second copy of the Riemann sphere, which in the language of twistor theory, would be the space of dual twistors incident with E. Figure 9 illustrates a point E in spacetime, and two Riemann spheres in twistor space and its dual, representing the sets of twistors and dual twistors incident with E.

Fig. 9 A point E in spacetime corresponds to Riemann spheres in both twistor space and its dual. These spheres are branch covered by a single Riemann surface \(\varSigma \) on which 2 dimensional fermion wave functions are defined

Also illustrated in Fig. 9 is a Riemann surface \(\varSigma \) fibered by two vector bundles \(\psi _{\varSigma }\) and \(\overline{\psi }_{\varSigma }\), and two branched covering maps from \(\varSigma \) to the aforementioned Riemann spheres over which \(\psi \) and \(\overline{\psi }\) are defined. With respect to the theory of integrable systems, we can identify \(\varSigma \) as a spectral curve, and the stereographic projection of each Riemann sphere as a complex spectral parameter space. In this context, \(\varSigma \) is defined by an algebraic equation relating complex eigenvalues \(\lambda \) and \(\mu \) of two commuting ordinary differential operators [26]. The eigenfunctions of these differential operators, called Baker-Akhiezer wave functions, should not be confused with the fermionic wave functions constituting the vector bundles \(\psi \) and \(\overline{\psi }\), which for our purposes can be regarded as eigenfunctions of the Dirac operator defined over \(\varSigma \) [33].

As noted in Chapter 3, if \(\varSigma \) is defined by an algebraic equation with coefficients in an algebraic number field, its reduction at each prime location is defined by an algebraic equation with coefficients in a finite field \(\mathbb {F}_{q=p^f}\), and its associated dynamic L-function has 2g zeroes along the critical line Re \(s = {1\over 2}\), where g is the genus of \(\varSigma \). Importantly, the spacing statistics of the zeroes associated with a typical curve \(\varSigma \) defined over \(\mathbb {F}_{q=p^f}\) approach GUE eigenvalue spacing statistics as \(q\rightarrow \infty \), and can therefore be identified with the non-trivial zero spacing statistics of a Langlands L-function \(\mathcal {L}(\pi ,s)\), as suggested by Fig. 6 [27]. In the next chapter, we’ll review how conformal field theory provides a description of these statistics, and how this description relates to gravitational physics.

6 Gauge/Gravity Correspondence and Conformal Fields

By way of introducing the subject of conformal field theory, let’s recall the Yang–Lee theorem as it applies to the ferromagnetic Ising model [9, 38]. This theorem states that for a general class of ferromagnetic Ising models, the values of an external magnetic field where the statistical mechanical partition function assigned to the model is zero are pure imaginary complex numbers, and therefore lie along the critical line Re \(s = {0}\) in a manner resembling the way in which the zeroes of number theoretic L-functions are hypothesized to lie along the critical line Re \(s = {1\over 2}\). Specifically, if we define the Ising energy of an array of g spins as:

$$\begin{aligned} \mathcal {H}=-\mathcal {B}\sum _{j=1}^g \sigma _{j}-\mathcal {C}\sum _{<j_1,j_2>}\sigma _{j_1}\sigma _{j_2}, \end{aligned}$$
(17)

where \(\mathcal {C}\in \mathbb {R}\) specifies the strength of interaction between neighboring spins, and \(\mathcal {B}\in \mathbb {R}\) specifies the strength of interaction between the spins and an external magnetic field, then for fixed values of \(\mathcal {C}\) and temperature \(\mathcal {T}\), the Yang–Lee theorem states that the quantum partition function:

$$\begin{aligned} \mathcal {Z}_Q(\mathcal {T})=\sum _{[\sigma _{j}]}e^{-\mathcal {H}/\mathcal {T}}, \end{aligned}$$
(18)

has all its zeroes in the complex plane at pure imaginary values of the ratio \(\mathcal {B}/\mathcal {T}\).
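The Yang–Lee theorem can be verified by brute force for a small system. The sketch below assumes a hypothetical ring of \(g=6\) spins with nearest neighbor coupling \(\mathcal {C}/\mathcal {T}=0.5\), evaluates the partition function (18) at imaginary field \(\mathcal {B}/\mathcal {T}=i\theta \), and counts its sign changes over one period \(0\le \theta \le \pi \). Since the partition function is, up to an overall factor, a polynomial of degree g in \(e^{2\mathcal {B}/\mathcal {T}}\), finding g zeroes on the imaginary axis accounts for all of them.

```python
import itertools
import math

G = 6           # number of spins on the ring (hypothetical small example)
COUPLING = 0.5  # ratio C/T, ferromagnetic for positive values

def spin_levels(g, coupling):
    """List of (magnetization, bond Boltzmann weight) over all 2^g spin configurations."""
    levels = []
    for spins in itertools.product((-1, 1), repeat=g):
        bonds = sum(spins[j] * spins[(j + 1) % g] for j in range(g))
        levels.append((sum(spins), math.exp(coupling * bonds)))
    return levels

def partition_on_imaginary_axis(theta, levels):
    """Eq. (18) at B/T = i theta; spin flip symmetry cancels the imaginary part, leaving cosines."""
    return sum(weight * math.cos(theta * m) for m, weight in levels)

def count_sign_changes(levels, samples=2000):
    """Number of sign changes of the partition function for theta sampled on [0, pi]."""
    values = [partition_on_imaginary_axis(math.pi * k / samples, levels)
              for k in range(samples + 1)]
    return sum(1 for left, right in zip(values, values[1:]) if left * right < 0.0)

levels = spin_levels(G, COUPLING)
zeros_found = count_sign_changes(levels)
print(f"zeroes found at pure imaginary B/T: {zeros_found} of {G}")
```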

Mathematically, the Yang–Lee theorem holds true independently of the dimension of the space in which the Ising model is defined, and remains true in the thermodynamic limit \(g\rightarrow \infty \). Therefore, if we regard the lattice site spin polarizations as one dimensional degrees of freedom analogous to position coordinates, which alongside their conjugate momenta define 2g real coordinates of a Jacobian \(\mathcal {J}\) associated with a genus g Riemann surface \(\varSigma \), we can regard this thermodynamic limit as occurring in conjunction with the number theoretic limit \(q\rightarrow \infty \) referenced in the previous chapter. For our purposes, this is important, because it allows us to regard L-function zeroes as limit sets of Yang–Lee zero flows in the complex fugacity plane:

$$\begin{aligned} z=e^{\mathcal {B}/\mathcal {T}}, \end{aligned}$$
(19)

whose statistics are describable in terms of conformal field theory [36].

To understand what is meant by the previous statement, let’s first think of the local density of Yang–Lee zeroes in the complex z-plane as specifying a 2 dimensional static electric charge distribution and an associated 2 dimensional electrostatic potential \(\phi (z)\) [5]. By definition, such a potential is the real part of an analytic function of the complex variable \(z=x+iy\) away from the Yang–Lee zero locations, and is therefore a harmonic function there:

$$\begin{aligned} {{\partial ^2\phi }\over {\partial x^2}}+{{\partial ^2\phi }\over {\partial y^2}}=0, \end{aligned}$$
(20)

whose gradient defines an electrostatic field. Because analytic (i.e. conformal) maps of the complex z-plane leave the harmonicity of \(\phi \) unchanged, this electrostatic field is sometimes called a conformal field [50]. Notable examples of conformal maps include Möbius transformations, whose action on the complex z-plane is mathematically equivalent to the action of the restricted Lorentz group fixing a point E in spacetime [22].
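The invariance of harmonicity under conformal maps can be illustrated with a finite difference check. The sketch below rests on stated assumptions: six unit point charges are placed on the unit circle (a hypothetical stand-in for a set of Yang–Lee zeroes), the potential is taken to be \(\phi (z)=\sum _k \log |z-z_k|\), and the Möbius coefficients and the test point are arbitrary illustrative values. It evaluates a five-point discrete version of the Laplacian in (20) for \(\phi \) and for \(\phi \) composed with the Möbius map.

```python
import numpy as np

# Hypothetical charge locations: six points on the unit circle.
CHARGES = np.exp(1j * np.pi * (2 * np.arange(6) + 1) / 6)


def potential(z):
    """2D electrostatic potential of unit point charges (log kernel)."""
    return float(np.sum(np.log(np.abs(z - CHARGES))))


def mobius(z, a=2.0, b=1.0, c=1.0, d=3.0):
    """Mobius map (a z + b)/(c z + d); coefficients are illustrative, ad - bc != 0."""
    return (a * z + b) / (c * z + d)


def laplacian(f, z, h=1e-4):
    """Five-point finite difference approximation of f_xx + f_yy at z."""
    return (f(z + h) + f(z - h) + f(z + 1j * h) + f(z - 1j * h) - 4.0 * f(z)) / h**2


z0 = 0.3 + 0.2j  # test point away from the charges and from the pole of the map
lap_plain = laplacian(potential, z0)
lap_mapped = laplacian(lambda z: potential(mobius(z)), z0)
```

Both `lap_plain` and `lap_mapped` should vanish up to discretization and rounding error, consistent with the claim that analytic maps preserve solutions of (20) away from the charges.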

Presently, the study of 2 dimensional conformal fields is of interest to physicists as a consequence of a theoretical relationship discovered between the quantum physics of gauge fields and gravitational physics called the gauge/gravity correspondence [23]. According to this correspondence, statistical mechanical metrics of 2 dimensional conformal fields (e.g. 2 point functions) have an alternative interpretation as descriptors of gravitational physics in spacetime [21]. Therefore, to end this review, we will introduce the mathematical formalism of conformal field theory as it relates to the gauge/gravity correspondence.

For intuition’s sake, let’s first observe that the mathematical description of a 2 dimensional electrostatic field is equivalent to the description of the 2 dimensional velocity field of a fluid in 2 dimensional space, whereby the potential function \(\phi (z)\) is referred to as a stream function [16]. In the event the fluid is inviscid, its flow conserves kinetic energy, and can be defined in terms of a stream function \(\phi (x,y)\) and a Hamiltonian density functional:

$$\begin{aligned} H(\phi )=\int (\nabla \phi \cdot \nabla \phi +V(\nabla \phi )) {d}x{d}y \end{aligned}$$
(21)

in much the same way as the classical dynamics of material particles are defined in terms of Hamiltonian functions. Necessarily, this Hamiltonian definition of fluid flow is equivalent to a definition given in terms of the Navier–Stokes equation (which reduces to the Euler equation for inviscid flow), and statistical mechanical metrics of the flow can be calculated by directly solving this equation and time averaging properties of the solution [32].

In the context of the gauge/gravity correspondence, the description of conformal fields can also be given in terms of a Hamiltonian density functional, but instead of focusing on conformal fields \(\nabla \phi =(\phi _x,\phi _y)\) with two components, the mathematical entities of interest are conformal fields \((\phi _1,\phi _2,\ldots ,\phi _d)\) with d components, where d is the real dimension of a symmetric space such as a Shimura variety [17]. A clear and compelling physical interpretation of what these fields represent remains wanting, but it has been put forth that the statistics of these fields (e.g. correlation functions) describe the gravitational motion of mass across 2 dimensional “screens” (i.e. surfaces) embedded in the 3 spatial dimensions of Minkowski spacetime [15, 56]. In passing, we note that the correlation functions of these fields, invariant under conformal transformations of the screen, are hypergeometric functions [15, 55].

7 Conclusion

In this paper, the scientific relationship between modern physics and number theory has been reviewed as a means of better understanding whether or not number theory has an essential role to play in physical modeling, and motivating more general interest in number theoretic concepts among physicists. Specifically, the experimental and theoretical relevance of L-functions to nuclear physics and thermodynamics is addressed in Chapters 2 and 3, the theoretical relevance of modular forms to quantum physics is addressed in Chapter 4, and the relationship between these number theoretic concepts and the gauge/gravity correspondence is addressed in Chapters 5 and 6.

Philosophically, the presentation of ideas in this paper is important, because a deeper understanding of the laws of physics in number theoretic terms could help provide a better understanding of why fundamental physical constants adopt their experimentally measured values, and thereby inform the debate surrounding the anthropic principle. From a distance, this is not unreasonable, because the history of scientific publication bridging the disciplines of theoretical physics and number theory is long and storied, and modern day physicists have proposed grand unified theories of Standard Model quantum physics and gravity in which number theoretic groups such as PSL(2, 7) play a fundamental role [29]. Upon close review, the possibility of a fruitful union between modern physics and number theory shedding light on this issue is even more compelling, because the hypothetical alignment of L-function zeroes along the critical line Re \(s = {1\over 2}\) appears central to the definition of conformal scalar field theories describing physics in spacetime, and may therefore place sharp constraints on fundamental constants describing physical processes therein.

Truth be told, the examples of number theory’s relevance to modern physics reviewed in this paper may well be far removed from any technological application of physical law that could improve the material quality of life of human beings. However, if the history of science and engineering is our guide, the gain of scientific knowledge, through theory and experiment, necessarily precedes unforeseen technological application of this knowledge, and it may serve some greater purpose to hazard a guess as to what economic value a number theoretic understanding of physical law might have. To this end, at the risk of guessing incorrectly, we’ll suggest L-functions and modular forms have an essential role to play in modeling self organized critical systems, which in broad scientific terms, are dynamical systems with critical point attractors [3].

Interestingly, while many examples of self organized critical systems in Nature, running the gamut from solar flares to earthquakes, have been studied on a case by case basis, a comprehensive theory describing self organized critical systems in general does not exist. This absence leaves room for a number theoretic description of self organized critical point attractors that could give human civilization a better capability to predict natural phenomena such as atmospheric turbulence and earthquakes, and thereby help mitigate the negative consequences of natural disasters.