
16.1 Units of Measurement for Electromagnetic Phenomena

The units of measurement relative to electromagnetic phenomena have been introduced through a long and complex historical process. Without going into the details of this process, we try here to summarise the fundamental points in the light of a modern view of the phenomena themselves. We stress that these notes are aimed at a reader who is already familiar with the basic phenomenology of electromagnetism.

Within electrostatics, the fundamental law is Coulomb’s law which is written, in vacuum, in the general form

$$\mathbf{F} = k_{\mathrm{C}} { q_1 q_2 \over r^2} \operatorname{vers} \mathbf{r} , $$

where \(\mathbf{F}\) is the force that a point charge \(q_1\) exerts on the point charge \(q_2\) placed at the relative position \(\mathbf{r} = \mathbf{r}_2 - \mathbf{r}_1\), and where \(k_{\mathrm{C}}\) is a constant that implicitly defines the unit of measurement of the charge (assuming, of course, that the units of measurement of the mechanical quantities have already been set). We note that \(k_{\mathrm{C}}\) can be chosen to be dimensional or dimensionless. The electric field vector \(\mathbf{E}\) is defined at an arbitrary point using the equation

$$\mathbf{E} = {\mathbf{F} \over q_{\mathrm{p}}} , $$

where \(\mathbf{F}\) is the electric force exerted on the “test” charge \(q_{\mathrm{p}}\) placed at the same location. From this definition and Coulomb’s law we can deduce the expression of the electric field due to a point charge q, which is

$$\mathbf{E} = k_{\mathrm{C}} {q \over r^2} \operatorname{vers} \mathbf{r} , $$

from which the Gauss theorem (in its integral form) follows

$$\varPhi(\mathbf{E}) = \int_\varSigma \mathbf{E} \cdot \mathbf{n} \,\mathrm{d} S = 4 \pi k_{\mathrm{C}} Q , $$

where Q is the charge contained within the surface Σ. In its differential form, we have

$$\operatorname{div} \mathbf{E} = 4 \pi k_{\mathrm{C}} \rho , $$

where ρ is the density of the electric charge (the charge contained within the unit volume).

Regarding the definition of the electric field vector, we note that it is not the only possible one, since we could have defined the electric field created by the charge q as

$$\mathbf{E} = k_{\mathrm{C}} \delta {q \over r^2} \operatorname{vers} \mathbf{r} , $$

with δ an arbitrary constant (possibly dimensional), as long as the electric force that the field exerts on the test charge \(q_{\mathrm{p}}\) is written in the form

$$\mathbf{F} = {1 \over\delta} q_{\mathrm{p}} \mathbf{E} . $$

Fortunately, the constant δ has (historically) always been set to unity. The same is not true for magnetic phenomena.

Regarding magnetostatics, we have equations that are similar to those of electrostatics. In these equations, for historical reasons, the fictitious concept of “magnetic mass” (or “magnetic pole”) is introduced. The equations corresponding to those previously written are the following ones (the symbol m denoting the magnetic mass):

  • Gilbert’s law (analogous to Coulomb’s law)

    $$\mathbf{F} = k_{\mathrm{G}} {m_1 m_2 \over r^2} \operatorname{vers} \mathbf{r} . $$
  • Definition of the magnetic induction vector generated by the magnetic mass m

    $$\mathbf{B} = k_{\mathrm{G}} \gamma {m \over r^2} \operatorname{vers} \mathbf{r} , $$

    where γ is an arbitrary constant (possibly dimensional).

  • Force acting on the test magnetic mass \(m_{\mathrm{p}}\)

    $$ \mathbf{F} = {1 \over\gamma} m_{\mathrm{p}} \mathbf{B} . $$
    (16.1)
  • Equivalent of the Gauss theorem (integral form)

    $$\varPhi(\mathbf{B}) = 0 , $$

    since isolated magnetic masses (magnetic monopoles) do not exist.

  • Equivalent of the Gauss theorem (differential form)

    $$\operatorname{div} \mathbf{B} = 0 . $$

The first quantitative relations between electric and magnetic phenomena were established with experiments based on electric currents. The intensity of the electric current i flowing in a conductor is defined by the simple equation

$$i = {\mathrm{d} q \over{\mathrm{d}} t} , $$

from which we can define the current density j as a vector directed along the direction of motion of the (positive) charges, with magnitude

$$j = {i \over\sigma} , $$

where σ is the cross-sectional area of the conductor. The experiments performed during the first half of the nineteenth century, especially by Ørsted, Ampère, and Faraday, led to the idea that electric currents create magnetic fields in their surroundings and that, at the same time, a magnetic field is able to exert a force on electric currents. During the same period, a new idea clearly emerged: that permanent magnets contain, at the microscopic level, a large number of elementary electric currents. These currents would ultimately be responsible for magnetostatic phenomena.

In modern terms, the magnetic properties of the currents can be summarised by a single law which states that, in stationary conditions, the current element \(i_1\,\mathrm{d}\boldsymbol{\ell}_1\) of an elementary circuit (microscopic or macroscopic) acts on the current element \(i_2\,\mathrm{d}\boldsymbol{\ell}_2\) of another elementary circuit with an infinitesimal force \(\mathrm{d}\mathbf{F}\) given by

$$\mathrm{d} \mathbf{F} = k_{\mathrm{A}} i_2 \,\mathrm{d} \boldsymbol{\ell}_2 \times \biggl( i_1 \,\mathrm{d} \boldsymbol{\ell}_1 \times {\operatorname{vers} \mathbf{r} \over r^2} \biggr) , $$

where \(k_{\mathrm{A}}\) is a new constant (which cannot be independent of those already introduced), and where \(\mathbf{r}\) is the radius vector that goes from the current element \(i_1\,\mathrm{d}\boldsymbol{\ell}_1\) to the current element \(i_2\,\mathrm{d}\boldsymbol{\ell}_2\). This law allows the introduction of the magnetic induction vector. The definition of this vector is somewhat arbitrary and it is assumed, in general, that the current element \(i\,\mathrm{d}\boldsymbol{\ell}\) creates the elementary induction vector \(\mathrm{d}\mathbf{B}\) given by (first law of Laplace or Biot and Savart’s law)

$$ \mathrm{d} \mathbf{B} = k_{\mathrm{A}} \beta i \,\mathrm{d} \boldsymbol{\ell}\times {\operatorname{vers} \mathbf{r} \over r^2} , $$
(16.2)

and that a current element \(i\,\mathrm{d}\boldsymbol{\ell}\) is subject, in the presence of an induction vector \(\mathbf{B}\), to a force \(\mathrm{d}\mathbf{F}\) given by (second law of Laplace)

$$\mathrm{d} \mathbf{F} = {1 \over\beta} i \,\mathrm{d} \boldsymbol{\ell }\times \mathbf{B} . $$

The quantity β introduced in these equations is arbitrary.

Let’s see the mathematical consequences of Eq. (16.2) for a closed circuit. The magnetic induction vector is given by

$$\mathbf{B} = k_{\mathrm{A}} \beta \oint_{\mathrm{C}} i \,\mathrm{d} \boldsymbol{\ell }\times{\operatorname{vers} \mathbf{r} \over r^2} , $$

where C is the curve describing the closed circuit. Using standard mathematical methods, one finds the following equations

$$\operatorname{div} \mathbf{B} =0 , $$

which confirms the analogous equation for magnetostatics, and

$$\operatorname{rot} \mathbf{B} = 4 \pi k_{\mathrm{A}} \beta \mathbf{j} , $$

where j is the current density.

This equation, known as Ampère’s law, applies only to stationary phenomena. As shown by Maxwell, it can be transformed into a more general equation that is also valid for phenomena that are variable in time. To do this we observe that, taking the divergence of both sides, we have

$$\operatorname{div} \mathbf{j}= 0 , $$

while, in general, the continuity equation must hold

$$\operatorname{div} \mathbf{j}+ {\partial\rho\over\partial t} = 0 , $$

ρ being the charge density. In order to reconcile these two equations, we take the derivative (with respect to time) of the differential expression of Coulomb’s law

$${\partial\rho\over\partial t} = {1 \over4 \pi k_{\mathrm{C}}} {\partial\over\partial t} ( \operatorname{div} \mathbf{E}) , $$

so that in general the following equation holds

$$\operatorname{div} \biggl( \mathbf{j}+ {1 \over4 \pi k_{\mathrm{C}}} {\partial \mathbf{E} \over\partial t} \biggr) = 0 . $$

The second term in parentheses is the so-called displacement current density. With its introduction, the equation for \(\operatorname{rot} \mathbf{B}\), corrected to include non-stationary phenomena, is

$$\operatorname{rot} \mathbf{B} - {k_{\mathrm{A}} \beta\over k_{\mathrm{C}}} {\partial \mathbf{E} \over\partial t} = 4 \pi k_{\mathrm{A}} \beta \mathbf{j} . $$

Finally, we need to consider the phenomena of magnetic induction. The law that describes them can be deduced, at least in a particular case, from the second law of Laplace. We have

$$\operatorname{rot} \mathbf{E} = -{1 \over\beta} {\partial \mathbf{B} \over \partial t} , $$

so that, in summary, the laws governing the electromagnetic phenomena can all be enclosed in the following four Maxwell’s equations

$$\begin{aligned} & \operatorname{div} \mathbf{E} = 4\pi k_{\mathrm{C}} \rho , \\ & \operatorname{rot} \mathbf{B} - {k_{\mathrm{A}} \beta\over k_{\mathrm {C}}} {\partial \mathbf{E} \over\partial t} = 4 \pi k_{\mathrm{A}} \beta \mathbf{j} , \end{aligned} \qquad \begin{aligned} & \operatorname{div} \mathbf{B} = 0 , \\ & \operatorname{rot} \mathbf{E} + {1 \over\beta} {\partial \mathbf{B} \over \partial t} =0 . \end{aligned} $$

We now consider Maxwell’s equations in vacuum. Taking the curl of the third equation and substituting the fourth, we obtain the wave equation

$$\nabla^2 \mathbf{B} = {k_{\mathrm{A}} \over k_{\mathrm{C}}} {\partial^2 \mathbf{B} \over\partial t^2} . $$

On the other hand we know that electromagnetic waves propagate in vacuum with velocity c, so that we must have

$${k_{\mathrm{A}} \over k_{\mathrm{C}}} = {1 \over c^2} , $$

or

$$k_{\mathrm{A}} = {k_{\mathrm{C}} \over c^2} , $$

that is, a relation between the quantities \(k_{\mathrm{A}}\) and \(k_{\mathrm{C}}\) that is independent of the unit system under consideration.
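As a quick symbolic check of this relation, one can insert a plane-wave ansatz into the wave equation obtained above; the following sketch (in Python, using the sympy library; the variable names are our own) recovers the propagation speed \(\sqrt{k_{\mathrm{C}}/k_{\mathrm{A}}}\):

```python
import sympy as sp

# k_C, k_A are the constants of this section; a plane-wave ansatz for one
# Cartesian component of B is inserted into the wave equation derived above
kC, kA, z, t, k, omega = sp.symbols('k_C k_A z t k omega', positive=True)

B = sp.exp(sp.I * (k * z - omega * t))
wave_eq = sp.diff(B, z, 2) - (kA / kC) * sp.diff(B, t, 2)

# Dispersion relation: the positive root gives omega/k = sqrt(kC/kA) = c
print(sp.solve(sp.simplify(wave_eq / B), omega))   # [k*sqrt(k_C)/sqrt(k_A)]
```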

Let’s see how we proceed in the two most common systems of units, the cgs system of Gauss (sometimes also called the Gauss-Hertz system) and the International System of Units (SI). In the cgs system, we assume \(k_{\mathrm{C}}=1\), so that the unit of charge is defined as the charge that repels an equal charge, at a distance of one centimeter, with the force of one dyne. This unit of charge is called the Franklin or statcoulomb. Since \(k_{\mathrm{C}}=1\), it follows that \(k_{\mathrm{A}}=1/c^2\). Within this system we also assume that \(\beta=c\), so that Maxwell’s equations are written as

$$\begin{aligned} & \operatorname{div} \mathbf{E} = 4\pi \rho , \\ & \operatorname{rot} \mathbf{B} - {1 \over c} {\partial \mathbf{E} \over\partial t} = 4 \pi {\mathbf{j}\over c} , \end{aligned} \qquad \begin{aligned} & \operatorname{div} \mathbf{B} = 0 , \\ & \operatorname{rot} \mathbf{E} + {1 \over c} {\partial \mathbf{B} \over \partial t} =0 . \end{aligned} $$

Moreover, the first and the second law of Laplace, together with the law that summarises them, can be written in the form

$$\begin{aligned} \mathrm{d} \mathbf{B} & = {i \over c} \mathrm{d} \boldsymbol{\ell}\times {\operatorname{vers} \mathbf{r} \over r^2} , \qquad\mathrm{d} \mathbf{F} = {i \over c} \mathrm{d} \boldsymbol{\ell }\times \mathbf{B} , \\ \mathrm{d} \mathbf{F} & = {i_2 \over c} \,\mathrm{d} \boldsymbol{\ell}_2 \times \biggl( {i_1 \over c} \,\mathrm{d} \boldsymbol{\ell}_1 \times {\operatorname{vers} \mathbf{r} \over r^2} \biggr) . \end{aligned} $$

Within the International System, instead, two new constants are introduced. They are the vacuum permittivity (also called dielectric permittivity of the vacuum) \(\epsilon_0\) and the vacuum permeability (magnetic permeability of the vacuum) \(\mu_0\), such that

$$\epsilon_0 \mu_0 = {1 \over c^2} . $$

Using these quantities, we put

$$k_{\mathrm{C}} = {1 \over4 \pi \epsilon_0} , $$

so that we have

$$k_{\mathrm{A}} = {k_{\mathrm{C}} \over c^2} = {1 \over4 \pi \epsilon _0 c^2} = {\mu_0 \over4 \pi} . $$

Within this system we also put \(\beta=1\), so that Maxwell’s equations are written as

$$\begin{aligned} & \operatorname{div} \mathbf{E} = {\rho\over\epsilon_0} , \\ & \operatorname{rot} \mathbf{B} - {1 \over c^2} {\partial \mathbf{E} \over\partial t} = \mu_0 \mathbf{j} , \end{aligned} \qquad \begin{aligned} & \operatorname{div} \mathbf{B} = 0 , \\ & \operatorname{rot} \mathbf{E} + {\partial \mathbf{B} \over\partial t} =0 . \end{aligned} $$

Moreover, the first and the second law of Laplace and the law that summarises them are written, respectively, in the form

$$ \begin{aligned}[c] \mathrm{d} \mathbf{B} & = {\mu_0 \over4 \pi} i \,\mathrm{d} \boldsymbol{\ell}\times {\operatorname{vers} \mathbf{r} \over r^2} , \qquad \mathrm{d} \mathbf{F} = i \,\mathrm{d} \boldsymbol{\ell }\times \mathbf{B} , \\ \mathrm{d} \mathbf{F} & = {\mu_0 \over4 \pi} i_2 \,\mathrm {d} \boldsymbol{\ell}_2 \times \biggl( i_1 \,\mathrm{d} \boldsymbol{\ell}_1 \times {\operatorname{vers} \mathbf{r} \over r^2} \biggr) . \end{aligned} $$
(16.3)

With respect to the numerical values of \(\epsilon_0\) and \(\mu_0\), the Ampère (unit of measurement of the current) is defined as the current that, flowing along an infinite straight wire of negligible thickness in vacuum, attracts an equal wire, located at a distance of one meter, with a force per unit length equal to \(2\times10^{-7}\ \mathrm{N\,m^{-1}}\). Using Eq. (16.3) we deduce that in such a geometry the force per unit length that acts on one of the two conductors is attractive and has a magnitude given by the following expression

$${\mathrm{d}F \over{\mathrm{d}}l} = 2 {\mu_0 \over4 \pi} {i^2 \over r} , $$

so we must have

$$\mu_0 = 4 \pi\times10^{-7}\ \mathrm{N}\,\mathrm{A}^{-2} = 1.256637 \times10^{-6} \ \mathrm{N}\,\mathrm{A}^{-2} , $$

and then, recalling that the Coulomb is the charge transported in one second by a current of one Ampère,

$$\epsilon_0 = {1 \over\mu_0 c^2} = 8.854188 \times10^{-12} \ \mathrm{C}^2\,\mathrm{N}^{-1}\,\mathrm{m}^{-2} . $$
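These numerical values are easily reproduced; a minimal Python sketch follows (note that since the 2019 SI revision \(\mu_0\) is no longer exactly \(4\pi\times10^{-7}\), although the difference is irrelevant at this level of precision):

```python
import math

c = 2.99792458e8            # speed of light in m/s
mu0 = 4 * math.pi * 1e-7    # N A^-2 (exact in the pre-2019 SI)

# epsilon_0 follows from epsilon_0 * mu_0 = 1/c^2
eps0 = 1.0 / (mu0 * c**2)
print(f"mu0  = {mu0:.6e} N A^-2")            # 1.256637e-06
print(f"eps0 = {eps0:.6e} C^2 N^-1 m^-2")    # 8.854188e-12

# Defining property of the Ampere: two parallel wires, i = 1 A, r = 1 m,
# attract with dF/dl = 2 (mu0/4pi) i^2 / r = 2e-7 N/m
print(2 * (mu0 / (4 * math.pi)) * 1.0**2 / 1.0)   # 2.0e-07
```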

Finally, it remains to analyse the relation between magnetic masses and currents. We can infer from Laplace’s laws that a filiform planar circuit of area σ and current i behaves, at distances much larger than its size, as a magnetic dipole directed along the unit vector n perpendicular to the plane of the circuit. The direction of n is specified by the rule of the corkscrew (or the right screw). This is the so-called Ampère principle of equivalence, which is expressed by the formula

$$\boldsymbol{\mu}= k_{\mathrm{P}} i \sigma \mathbf{n} , $$

where \(k_{\mathrm{P}}\) is a new constant to be related to those previously introduced. To establish this relation, we evaluate, for example, the moment of the forces acting on an elementary dipole located at a point in space where the field \(\mathbf{B}\) is present. Using Eq. (16.1), we have

$$\mathbf{M} = {1 \over\gamma} \boldsymbol{\mu}\times \mathbf{B} = {1 \over\gamma} k_{\mathrm{P}} i \sigma \mathbf{n} \times \mathbf{B} . $$

Instead, from the second law of Laplace we have

$$\mathbf{M} = {1 \over\beta} \oint i \mathbf{r} \times( \mathrm{d} \boldsymbol{\ell }\times \mathbf{B} ) , $$

which can be rewritten as

$$\mathbf{M} = {1 \over\beta} i \sigma \mathbf{n} \times \mathbf{B} . $$

Equating the two expressions for M we have

$$k_{\mathrm{P}} = {\gamma\over\beta} . $$

Finally, considering the force exerted between two infinitesimal circuits, treated in the first instance as elementary dipoles and then as coils carrying a current, we obtain the relation

$$k_{\mathrm{G}} k_{\mathrm{P}}^2 = k_{\mathrm{A}} , $$

which allows us to write \(k_{\mathrm{G}}\) in the form

$$k_{\mathrm{G}} = {k_{\mathrm{A}} \over k_{\mathrm{P}}^2} = {k_{\mathrm {C}} \beta^2 \over c^2 \gamma^2} . $$

In the cgs system, since \(k_{\mathrm{C}}=1\) and \(\beta=c\), and assuming \(\gamma=1\), we obtain

$$k_{\mathrm{P}} = {1 \over c} , \qquad k_{\mathrm{G}} = 1 . $$

Ampère’s principle of equivalence is therefore

$$\boldsymbol{\mu}= {i \over c} \sigma \mathbf{n} , $$

and Gilbert’s law

$$\mathbf{F} = {m_1 m_2 \over r^2} \operatorname{vers} \mathbf{r} . $$

In the International System, instead, since \(k_{\mathrm{C}}=1/(4\pi\epsilon_0)\) and \(\beta=1\), assuming \(\gamma=\mu_0\) and recalling that \(c^2=1/(\epsilon_0\mu_0)\), we obtain

$$k_{\mathrm{P}} = \mu_0 , \qquad k_{\mathrm{G}} = {1 \over4 \pi\mu_0} . $$

In this case Ampère’s principle of equivalence is

$$\boldsymbol{\mu}= \mu_0 i \sigma \mathbf{n} , $$

and Gilbert’s law is

$$\mathbf{F} = {1 \over4 \pi \mu_0} {m_1 m_2 \over r^2} \operatorname{vers} \mathbf{r} . $$

Finally, we note that, besides the two systems introduced here, other systems have also been used for electromagnetic phenomena. In particular, it is worth mentioning the electrostatic cgs system, the electromagnetic cgs system, and the cgs system of Heaviside.

16.2 Tensor Algebra

In this volume, we often need to deal with vectors and tensors, together with their differential expressions such as divergences, curls and gradients. It is therefore useful to give a brief introduction to this topic in order to make the reader familiar with a compact formalism that allows one to deduce easily a series of vector and tensor identities, as well as various transformation formulae.

The traditional definition of a tensor that is commonly given in physics is based on the generalisation of the definition of a vector. In a Cartesian orthogonal reference system, the vector v is defined as an entity with three components \((v_x, v_y, v_z)\) (or \(v_1\), \(v_2\), \(v_3\)) which, under an arbitrary rotation of the reference system, are modified according to the law

$$v'_i = \sum_j C_{ij} v_j , $$

where the coefficients \(C_{ij}\) are the direction cosines of the new axes with respect to the old ones. In close analogy, we define a tensor T of rank n as an entity with \(3^n\) components (\(T_{i\ldots j}\), with \(i,\ldots,j=1,2,3\)) which, under a rotation of the reference system, are transformed according to the law

$$T'_{i \ldots j} = \sum_{k, \ldots,l} C_{ik} \cdots C_{jl} T_{k \ldots l} . $$

The tensor most commonly known in physics is the stress tensor, which characterises, inside an elastic material, the force \(\mathrm{d}\mathbf{F}\) exerted on a surface \(\mathrm{d}S\) with normal \(\mathbf{n}\). In components we have

$$\mathrm{d}F_i = \sum_j T_{ij} n_j \,\mathrm{d}S . $$

In addition to the stress tensor we can also mention, for their importance in various fields of physics, the deformation tensor, the inertia tensor, and the dielectric tensor.

A particular tensor of rank two is the so-called dyad that is obtained from two vectors u and v when the direct product of their components is considered. The dyad is indicated simply by the symbol uv, and we have by definition

$$(\mathbf{u} \mathbf{v} )_{ij} = u_i v_j \quad(i,j=1,2,3) . $$

Obviously, in general

$$\mathbf{u} \mathbf{v} \neq \mathbf{v} \mathbf{u} . $$

A scalar quantity is, by definition, a tensor of rank zero, while a vector is, by definition, a tensor of rank one. Tensors of higher rank may be obtained by considering the direct product of tensors of lower rank. For example, by the direct product of two tensors of rank two a tensor of rank four is obtained.

The tensor algebra covers all operations that can be performed on tensors. We now provide some definitions:

  1.

    Given two tensors T and V, the first of rank n (n≥1) and the second of rank n′ (n′≥1), we define the scalar product (or internal product) of the two tensors as the tensor of rank (n+n′−2) obtained by a sum (or saturation) over the last index of the first tensor and the first index of the second tensor. For example, if n and n′ are both equal to 2, defining W as the tensor obtained by the scalar product, we have that W is also a tensor of rank two, defined by

    $$W_{ij} = \sum_k T_{ik} V_{kj} . $$
  2.

    Given a tensor of rank n (with n≥1), the divergence of such a tensor is a tensor of rank (n−1) obtained by saturating its first index with the formal vector \(\boldsymbol{\nabla}\) (called the “nabla” or “del” operator) defined by

    $$\boldsymbol{\nabla}\equiv \biggl({\partial\over\partial x}, {\partial\over\partial y}, {\partial\over\partial z} \biggr) . $$

    For example, for a tensor T of rank two, \(\operatorname{div} \mathbf{T}\) is a vector whose components are given by

    $$(\operatorname{div} \mathbf{T})_i = \sum _j {\partial\over\partial x_j} T_{ji} = (\boldsymbol{\nabla }\cdot{\mathbf{T}})_i . $$
  3.

    Given a tensor of rank n (with n≥0), the gradient of such a tensor is a tensor of rank (n+1) obtained by applying to it the formal vector \(\boldsymbol{\nabla}\) in such a way that the first index of the resulting tensor is the one arising from the derivative. For example, for a tensor of rank 1, i.e. for a vector v, we have

    $$(\operatorname{grad} \mathbf{v} )_{ij} = (\boldsymbol{\nabla }\mathbf{v} )_{ij} = {\partial\over\partial x_i} v_j . $$

    It should be noted that this convention is not universally adopted. Some authors prefer to indicate with the symbol \(\operatorname{grad} \mathbf{v}\) the quantity

    $$(\operatorname{grad} \mathbf{v} )_{ij} = {\partial\over\partial x_j} v_i . $$

    The reader should therefore pay attention to the conventions used by each author before using the vector identities that are found in different books. For example, using our conventions, we have

    $$\sum_i u_i {\partial v_j \over\partial x_i} = (\mathbf{u} \cdot \operatorname{grad} \mathbf{v} )_j , \qquad \sum _i u_i {\partial v_i \over\partial x_j} = \bigl[ (\operatorname{grad} \mathbf{v} ) \cdot \mathbf{u} \bigr]_j . $$

    Using the formal vector \(\boldsymbol{\nabla}\), the quantities on the right-hand side can also be written, respectively, as

    $$\bigl[( \mathbf{u} \cdot \boldsymbol{\nabla}) \mathbf{v} ) \bigr]_j , \qquad \bigl[( \boldsymbol{\nabla }\mathbf{v} ) \cdot \mathbf{u} \bigr]_j . $$
  4.

    Given a tensor of rank n (n≥1), the curl (also known as rotor) of such a tensor is a tensor of the same rank n, obtained by saturating the first index of the given tensor with the completely antisymmetric tensor \(\epsilon_{ijk}\) (known as the Ricci, or Ricci-Levi Civita, tensor) and with the formal vector \(\boldsymbol{\nabla}\). For example, for a vector v we have

    $$(\operatorname{rot} \mathbf{v} )_i = \sum_{jk} \epsilon_{ijk} {\partial v_k \over \partial x_j} = (\boldsymbol{\nabla }\times \mathbf{v} )_i , $$

    and for a tensor T of rank two

    $$(\operatorname{rot} \mathbf{T})_{ij} = \sum _{kl} \epsilon_{ikl} {\partial T_{lj} \over \partial x_k} = ( \boldsymbol{\nabla}\times\mathbf{T})_{ij} . $$

    The antisymmetric tensor of rank three \(\epsilon_{ijk}\), introduced in these expressions, is defined by the equation \(\epsilon_{ijk}=0\) if at least two of the three indices i,j,k are equal; by the equation \(\epsilon_{ijk}=1\) if the ordered triad (i,j,k) is an even permutation of the fundamental triad (1,2,3); and by the equation \(\epsilon_{ijk}=-1\) if the ordered triad (i,j,k) is an odd permutation of the fundamental triad (1,2,3). Ultimately, only 6 of the 27 components of the tensor are different from zero. Note that the usual vector product between two vectors can be conveniently expressed through the antisymmetric tensor. If \(\mathbf{w}=\mathbf{u}\times\mathbf{v}\), we have

    $$w_i = \sum_{jk} \epsilon_{ijk} u_j v_k . $$

    Note also that the vector product operation and the curl operator (which involve the antisymmetric tensor) imply a choice about the chirality of the Cartesian orthogonal system in which the components of the vectors (and tensors) are defined. The convention that is now almost universally accepted (and that we use) is to choose a right-handed triad, i.e. to suppose that, if the axes x and y are directed respectively along the thumb and index finger of the right hand, the z axis is directed along the middle finger.

    The antisymmetric tensor has a number of properties. The first concerns the permutation of its indices. For an even permutation the tensor remains unchanged, while for an odd permutation the tensor changes sign. In formulae

    $$\epsilon_{ijk} = \epsilon_{jki} = \epsilon_{kij} = -\epsilon_{jik} = -\epsilon_{ikj} = -\epsilon_{kji} . $$

    In addition, the following saturation properties hold

    $$\begin{aligned} \sum_k \epsilon_{ijk} \epsilon_{lmk} =& \delta_{il} \delta_{jm} - \delta_{im} \delta_{jl} , \\ \sum_{jk} \epsilon_{ijk} \epsilon_{ljk} =& 2 \delta_{il} , \\ \sum_{ijk} \epsilon_{ijk} \epsilon_{ijk} =& 6 , \end{aligned}$$

    where \(\delta_{ij}\) is the so-called Kronecker delta, i.e. the symbol defined by

    $$\delta_{ij} = 1 \quad\mathrm{if}\ i=j , \qquad \delta_{ij} = 0 \quad\mathrm{if}\ i \ne j . $$
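The definition of \(\epsilon_{ijk}\) and the saturation properties listed above lend themselves to a direct numerical verification. A short sketch with NumPy (the explicit construction of the tensor and the choice of test vectors are our own):

```python
import numpy as np

# Completely antisymmetric (Ricci) tensor: 6 non-zero components out of 27
eps = np.zeros((3, 3, 3))
eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = 1
eps[0, 2, 1] = eps[2, 1, 0] = eps[1, 0, 2] = -1

delta = np.eye(3)

# sum_k eps_ijk eps_lmk = d_il d_jm - d_im d_jl
lhs = np.einsum('ijk,lmk->ijlm', eps, eps)
rhs = np.einsum('il,jm->ijlm', delta, delta) - np.einsum('im,jl->ijlm', delta, delta)
assert np.allclose(lhs, rhs)

# sum_jk eps_ijk eps_ljk = 2 d_il   and   sum_ijk eps_ijk^2 = 6
assert np.allclose(np.einsum('ijk,ljk->il', eps, eps), 2 * delta)
assert np.einsum('ijk,ijk->', eps, eps) == 6

# Vector product via the antisymmetric tensor: w_i = sum_jk eps_ijk u_j v_k
u, v = np.random.rand(3), np.random.rand(3)
assert np.allclose(np.einsum('ijk,j,k->i', eps, u, v), np.cross(u, v))
```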

The above definitions and properties can be used to obtain a number of vector identities that are listed below. In these equations, the quantities f and g are scalars, a and b are vectors, and T is a tensor of rank 2.

$$ \bullet \quad \operatorname{div} (f \mathbf{a} ) = \mathbf{a} \cdot \operatorname{grad} f + f \operatorname{div} \mathbf{a} . $$
(16.4)

In fact we have

$$\operatorname{div}(f \mathbf{a} ) = \sum_i {\partial\over\partial x_i} (f a_i) = \sum_i a_i {\partial f \over\partial x_i} + f \sum_i {\partial a_i \over\partial x_i} . $$
$$ \bullet \quad \operatorname{grad}(fg) = g \operatorname{grad} f + f \operatorname{grad} g . $$
(16.5)

In fact we have, for the i-th component

$$\bigl[ \operatorname{grad} (fg) \bigr]_i = {\partial\over\partial x_i} (fg) = g {\partial f \over\partial x_i} + f {\partial g \over\partial x_i} . $$
$$ \bullet \quad \operatorname{rot} (f \mathbf{a}) = \operatorname{grad} f \times \mathbf{a} + f \operatorname{rot} \mathbf{a} . $$
(16.6)

In fact we have, for the i-th component

$$\begin{aligned} \bigl[ \operatorname{rot} (f \mathbf{a} ) \bigr]_i &= \sum _{jk} \epsilon_{ijk} {\partial\over \partial x_j} (f a_k) = \sum_{jk} \epsilon_{ijk} \biggl[ \biggl( {\partial f \over\partial x_j} \biggr) a_k + f {\partial a_k \over\partial x_j} \biggr] = \\ &= \bigl[ (\operatorname{grad} f) \times \mathbf{a} \bigr]_i + f [ \operatorname{rot} \mathbf{a} ]_i . \end{aligned} $$
$$ \bullet \quad \operatorname{div} (\mathbf{a} \times \mathbf{b} ) = \mathbf{b} \cdot \operatorname{rot} \mathbf{a} - \mathbf{a} \cdot \operatorname{rot} \mathbf{b} . $$
(16.7)

In fact we have

$$\begin{aligned} \operatorname{div}(\mathbf{a} \times \mathbf{b} ) & = \sum_i {\partial\over\partial x_i} \biggl( \sum_{jk} \epsilon_{ijk} a_j b_k \biggr) = \sum _{ijk} \epsilon_{ijk} \biggl[ \biggl( {\partial a_j \over\partial x_i} \biggr) b_k + a_j \biggl( {\partial b_k \over\partial x_i} \biggr) \biggr] \\ & = \sum_{ijk} b_k \epsilon_{kij} { \partial a_j \over\partial x_i} - \sum_{ijk} a_j \epsilon_{jik} { \partial b_k \over\partial x_i} = \sum _k b_k (\operatorname{rot} \mathbf{a})_k - \sum_j a_j ( \operatorname {rot} \mathbf{b})_j . \end{aligned} $$
$$ \bullet \quad \operatorname{grad}(\mathbf{a} \cdot \mathbf{b} ) = (\operatorname{grad} \mathbf{a} ) \cdot \mathbf{b} +(\operatorname{grad} \mathbf{b} ) \cdot \mathbf{a} . $$
(16.8)

In fact we have, for the i-th component

$$\begin{aligned} \bigl[ \operatorname{grad}(\mathbf{a} \cdot \mathbf{b} ) \bigr]_i & = {\partial\over \partial x_i} \biggl( \sum_j a_j b_j \biggr) = \sum_j \biggl( {\partial a_j \over\partial x_i} \biggr) b_j + \sum _j a_j \biggl( {\partial b_j \over\partial x_i} \biggr) \\ & = \bigl[ (\operatorname{grad} \mathbf{a} ) \cdot \mathbf{b} \bigr]_i + \bigl[ (\operatorname{grad} \mathbf{b} ) \cdot \mathbf{a} \bigr]_i . \end{aligned} $$
$$ \bullet \quad \operatorname{rot} (\mathbf{a} \times \mathbf{b} ) = \mathbf{b} \cdot \operatorname{grad} \mathbf{a} -\mathbf{a} \cdot \operatorname{grad} \mathbf{b} + \mathbf{a} \operatorname{div} \mathbf{b} - \mathbf{b} \operatorname{div} \mathbf{a} . $$
(16.9)

In fact we have, for the i-th component

$$\begin{aligned} \bigl[ \operatorname{rot} (\mathbf{a} \times \mathbf{b} ) \bigr]_i =& \sum_{jk} \epsilon_{ijk} {\partial\over\partial x_j} (\mathbf{a} \times \mathbf{b} )_k = \sum_{jklm} \epsilon_{ijk} \epsilon_{klm} {\partial\over \partial x_j} (a_l b_m) \\ =& \sum_{jlm} (\delta_{il} \delta_{jm} - \delta_{im} \delta_{jl}) \biggl[ \biggl( {\partial a_l \over\partial x_j} \biggr) b_m + a_l {\partial b_m \over\partial x_j} \biggr] \\ =& \sum_{ij} \biggl( b_j {\partial a_i \over\partial x_j} - b_i {\partial a_j \over\partial x_j} + a_i {\partial b_j \over\partial x_j} - a_j {\partial b_i \over\partial x_j} \biggr) \\ =& [ \mathbf{b} \cdot \operatorname{grad} \mathbf{a} ]_i - b_i \operatorname{div} \mathbf{a} + a_i \operatorname{div} \mathbf{b} - [ \mathbf{a} \cdot \operatorname{grad} \mathbf{b} ]_i . \end{aligned}$$
$$ \bullet \quad \operatorname{grad} (f \mathbf{a} ) = (\operatorname{grad}f) \mathbf{a} + f \operatorname{grad} \mathbf{a} . $$
(16.10)

In fact we have, for the ij-th component

$$\bigl[ \operatorname{grad}(f \mathbf{a} ) \bigr]_{ij} = {\partial\over\partial x_i} (f a_j) = \biggl( {\partial f \over\partial x_i} \biggr) a_j + f {\partial a_j \over\partial x_i} = (\operatorname{grad} f)_i a_j + f (\operatorname{grad} \mathbf{a} )_{ij} . $$
$$ \bullet \quad \operatorname{div} ( \mathbf{a} \mathbf{b} ) = \mathbf{b} \operatorname{div} \mathbf{a} + \mathbf{a} \cdot\operatorname{grad} \mathbf{b} . $$
(16.11)

In fact we have, for the i-th component

$$\begin{aligned} \bigl[ \operatorname{div} ( \mathbf{a} \mathbf{b} ) \bigr]_i & = \sum_j {\partial\over\partial x_j} (a_j b_i) = \sum _j \biggl[ \biggl( {\partial a_j \over \partial x_j} \biggr) b_i + a_j {\partial b_i \over\partial x_j} \biggr] \\ & = b_i \operatorname{div} \mathbf{a} + [ \mathbf{a} \cdot \operatorname{grad} \mathbf{b} ]_i . \end{aligned} $$
$$ \bullet \quad \mathbf{a} \times\operatorname{rot} \mathbf{b} = (\operatorname{grad} \mathbf{b} ) \cdot \mathbf{a} - \mathbf{a} \cdot \operatorname{grad} \mathbf{b} . $$
(16.12)

In fact we have, for the i-th component

$$\begin{aligned}{} [ \mathbf{a} \times\operatorname{rot} \mathbf{b} ]_i =& \sum _{jk} \epsilon_{ijk} a_j ( \operatorname{rot} \mathbf{b} )_k = \sum_{jklm} \epsilon_{ijk} \epsilon_{klm} a_j {\partial b_m \over\partial x_l} \\ =&\sum_{jlm} (\delta_{il} \delta_{jm} -\delta_{im} \delta_{jl}) a_j {\partial b_m \over\partial x_l} \\ =& \sum_j \biggl( a_j {\partial b_j \over\partial x_i} - a_j {\partial b_i \over\partial x_j} \biggr) = \bigl[ ( \operatorname{grad} \mathbf{b} ) \cdot \mathbf{a} \bigr]_i - [ \mathbf{a} \cdot \operatorname{grad} \mathbf{b} ]_i . \end{aligned}$$
$$ \bullet \quad \operatorname{div} (f \mathbf{T}) = (\operatorname{grad} f) \cdot{ \mathbf{T}} + f \operatorname{div} \mathbf{T} . $$
(16.13)

In fact we have, for the i-th component

$$\begin{aligned} \bigl[ \operatorname{div} (f \mathbf{T}) \bigr]_i & = \sum_j {\partial\over\partial x_j} (f T_{ji}) = \sum _j \biggl[ \biggl( {\partial f \over\partial x_j} \biggr) T_{ji} + f {\partial T_{ji} \over\partial x_j} \biggr] \\ & = \bigl[ (\operatorname{grad} f ) \cdot{\mathbf{T}} \bigr]_i + f ( \operatorname{div} \mathbf{T} )_i . \end{aligned} $$
$$ \bullet \quad \operatorname{rot} (\operatorname{rot} \mathbf{a} ) = \operatorname{grad} \operatorname{div} \mathbf{a} - \nabla^2 \mathbf{a} . $$
(16.14)

In fact we have, for the i-th component

$$\begin{aligned} \bigl[ \operatorname{rot} (\operatorname{rot} \mathbf{a} ) \bigr]_i & = \sum_{jk} \epsilon_{ijk} {\partial\over\partial x_j} (\operatorname{rot} \mathbf{a} )_k = \sum_{jklm} \epsilon_{ijk} \epsilon_{klm} {\partial\over\partial x_j} {\partial\over\partial x_l} a_m \\ & = \sum_{jlm} (\delta_{il} \delta_{jm} - \delta_{im} \delta_{jl}) {\partial^2 a_m \over\partial x_j \partial x_l} \\ & = \sum_j \biggl( {\partial^2 a_j \over\partial x_j \partial x_i} - {\partial^2 a_i \over\partial x_j \partial x_j} \biggr) = [\operatorname{grad} \operatorname{div} \mathbf{a} ]_i - \bigl[ \nabla^2 \mathbf{a} \bigr]_i . \end{aligned} $$
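Each of these identities can also be checked mechanically with a computer algebra system. As an illustration, the following sympy sketch verifies Eq. (16.14) on a concrete, arbitrarily chosen smooth vector field (the field itself is our own example):

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
X = (x, y, z)

# An arbitrary smooth vector field
a = sp.Matrix([x**2 * y * z, sp.sin(x) * z**2, sp.exp(y) * x])

def grad_s(f):   # gradient of a scalar
    return sp.Matrix([sp.diff(f, xi) for xi in X])

def div_v(v):    # divergence of a vector
    return sum(sp.diff(v[i], X[i]) for i in range(3))

def curl_v(v):   # curl of a vector
    return sp.Matrix([
        sp.diff(v[2], y) - sp.diff(v[1], z),
        sp.diff(v[0], z) - sp.diff(v[2], x),
        sp.diff(v[1], x) - sp.diff(v[0], y),
    ])

def lap_v(v):    # Laplacian, component by component
    return sp.Matrix([sum(sp.diff(v[i], xi, 2) for xi in X) for i in range(3)])

lhs = curl_v(curl_v(a))
rhs = grad_s(div_v(a)) - lap_v(a)
print(sp.simplify(lhs - rhs))   # zero matrix: identity (16.14) holds
```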

There are also other vector identities that apply only in integral form. They result from the theorems of Gauss and Stokes-Ampère, which we now recall.

Gauss theorem: if Σ is a closed surface enclosing the volume V and if n is the outward normal to the surface, Gauss theorem is expressed by the equation

$$\bullet \quad \int_\varSigma \mathbf{a} \cdot \mathbf{n} \,\mathrm{d} S = \int _V \operatorname{div} \mathbf{a} \,\mathrm{d} V , $$

where a is an arbitrary vector that is a function of the position.

Stokes-Ampère theorem: if ℓ is a closed circuit and if Σ is a surface bounded by this circuit, the Stokes-Ampère theorem is stated by the equation

$$\bullet \quad \oint_\ell \mathbf{a} \cdot{\mathrm{d}} \boldsymbol{\ell}= \int _\varSigma \operatorname{rot} \mathbf{a} \cdot \mathbf{n} \,\mathrm{d} S , $$

where n is the outward normal to the surface. We note that the validity of this equation implies a convention about the direction of integration along the circuit, which in turn depends on the convention implicit in the definition of the curl operator. When the (x,y,z) system used to define the vector components is a right-handed system, the direction of integration along the circuit follows the corkscrew (or right screw) rule, for which the direction of n coincides with the direction in which the corkscrew advances.

Various identities can be obtained from the Gauss and Stokes-Ampère theorems. Some of them are collected below.

$$\bullet \quad \oint_\ell f \,\mathrm{d} \boldsymbol{\ell}= \int_\varSigma \mathbf{n} \times\operatorname{grad} f \,\mathrm{d} S . $$

This identity can be proven by noting that, if c is an arbitrary constant vector, we have

$$\mathbf{c} \cdot\oint_\ell f \,\mathrm{d} \boldsymbol{\ell}= \oint_\ell(f \mathbf{c} ) \cdot{\mathrm{d}} \ell , $$

and, applying the Stokes-Ampère theorem

$$\mathbf{c} \cdot\oint_\ell f \,\mathrm{d} \boldsymbol{\ell}= \int _\varSigma \operatorname{rot} (f \mathbf{c} ) \cdot \mathbf{n} \, \mathrm{d} S . $$

Recalling the vector identity of Eq. (16.6), and taking into account that c is a constant vector, we have

$$\mathbf{c} \cdot\oint_\ell f \,\mathrm{d} \boldsymbol{\ell}= \int _\varSigma \bigl[ (\operatorname{grad} f) \times \mathbf{c} \bigr] \cdot \mathbf{n} \,\mathrm {d} S = \mathbf{c} \cdot\int_\varSigma \mathbf{n} \times\operatorname{grad} f \, \mathrm{d} S . $$

The identity therefore follows, because c is an arbitrary vector.

With entirely similar procedures and taking into account the vector identities demonstrated previously, we obtain the additional identities

$$\begin{aligned} & \bullet \quad \oint_\ell \mathbf{a} \times\mathrm{d} \boldsymbol{\ell}= \int _\varSigma \bigl[ \mathbf{n} \operatorname{div} \mathbf{a} -( \operatorname{grad} \mathbf{a} ) \cdot \mathbf{n} \bigr] \,\mathrm{d} S . \\ &\bullet \quad \int_\varSigma \mathbf{n} \times \mathbf{a} \,\mathrm{d} S = \int _V \operatorname{rot} \mathbf{a} \,\mathrm{d} V . \\ &\bullet \quad \int_\varSigma f \mathbf{n} \,\mathrm{d} S = \int _V \operatorname{grad} f \,\mathrm{d} V . \end{aligned}$$

In particular, if we put f=1 in this last identity, we get

$$\bullet \quad \int_\varSigma \mathbf{n} \,\mathrm{d} S = 0 , $$

which is an important geometrical relation valid for an arbitrary closed surface.

16.3 The Dirac Delta Function

The Dirac delta function, traditionally indicated by the symbol δ(x), can be thought of as a function which is null for any value of x, except for an infinitesimal interval centered at the origin where the function has a very high peak which tends to infinity, but such that the integral of the function in dx is equal to 1. Obviously, it is not a function in the strict mathematical sense, but it can be thought of as the limit of a family of functions depending on a suitable parameter. For example, if we consider the family of functions f(x,a)

$$f(x,a) = \begin{cases} {1 \over a} & \mbox{for } |x| \le{a \over2}, \cr 0 & \mbox{for } |x| > {a \over2}, \end{cases} $$

we have that

$$\delta(x) = \lim_{a \rightarrow0} f(x,a) . $$

Similarly, if we consider the family

$$g(x,a) = {1 \over\sqrt{2 \pi}\, a}\, \mathrm{e}^{-x^2/(2a^2)} , $$

we also have

$$\delta(x) = \lim_{a \rightarrow0} g(x,a) . $$

There are endless possibilities for representing the Dirac delta as the limit of suitable families of functions. The most common representations in mathematical physics are the following ones

$$\begin{aligned} \delta(x) =& \lim_{\varOmega\rightarrow\infty} {1 \over\pi} {\sin(\varOmega x) \over x} , \\ \delta(x) =& \lim_{\varOmega\rightarrow\infty} {1 \over\pi} {\sin^2 (\varOmega x) \over\varOmega x^2} . \end{aligned}$$

The fundamental property of the Dirac delta is summarised in the following expression, which constitutes its formal definition

$$\int_{-\infty}^\infty F(x) \delta(x) \,\mathrm{d}x = F(0) , $$

and from which, by means of simple changes of variable, the following two relations are found

$$\int_{-\infty}^\infty F(x) \delta(x- x_0) \, \mathrm{d}x = F(x_0) , $$
$$\int_{-\infty}^\infty F(x) \delta(ax) \,\mathrm{d}x = {1 \over | a |} F(0) , $$

where a is any real number different from zero. From these equations we can get an important generalisation concerning the Dirac delta whose argument is an arbitrary real function g(x). Denoting this quantity by the symbol δ[g(x)] and denoting by \(x_i\) the zeroes (if any) of the function g(x), we have

$$\int_{-\infty}^\infty F(x) \delta\bigl[g(x)\bigr] \, \mathrm{d} x = \sum_i {1 \over |g'(x_i)|} F(x_i) , $$

where g′(x) is the derivative of the function g(x) with respect to its argument. Further generalisations to the case of the three-dimensional Dirac delta are described directly in the text (see Sect. 3.2).
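The property involving δ[g(x)] lends itself to a simple numerical illustration: we replace the delta with a narrow normalised Gaussian (one of the families introduced above) and integrate on a fine grid. A Python sketch, with our own choice of test functions:

```python
import numpy as np

# A narrow normalised Gaussian as a model of delta(x)
def delta_a(x, a=1e-3):
    return np.exp(-x**2 / (2 * a**2)) / (np.sqrt(2 * np.pi) * a)

# F(x) = cos(x) and g(x) = x^2 - 1: zeros at x = +/-1 with |g'(+/-1)| = 2,
# so the integral should equal cos(1)/2 + cos(-1)/2 = cos(1)
x, dx = np.linspace(-5.0, 5.0, 2_000_001, retstep=True)
integral = np.sum(np.cos(x) * delta_a(x**2 - 1.0)) * dx
print(integral, np.cos(1.0))   # both ~ 0.540302
```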

Finally, we can give a meaning to the derivative of the Dirac delta function, δ′(x), defined by the usual relation

$$\delta'(x) = \lim_{\Delta x \rightarrow0} {\delta(x+ \Delta x) - \delta (x) \over\Delta x} . $$

Using this definition we have, for an arbitrary function F(x)

$$\int_{-\infty}^\infty F(x) \delta'(x) \, \mathrm{d} x = \lim_{\Delta x \rightarrow0} \int_{-\infty}^\infty F(x) {\delta(x+ \Delta x) - \delta(x) \over\Delta x} \,\mathrm{d} x , $$

from which we obtain

$$\int_{-\infty}^\infty F(x) \delta'(x) \, \mathrm{d} x = \lim_{\Delta x \rightarrow0} {F(- \Delta x) - F(0) \over\Delta x} = - F'(0) . $$

16.4 Recovering the Elementary Laws of Electromagnetism

In Chap. 3, starting from the Liénard and Wiechert potentials, we calculated the expressions of the electric and magnetic field at an arbitrary point in space, due to a single moving charge. The results are contained in Eqs. (3.19) and (3.20). We are now going to show how the basic equations of electromagnetism valid for stationary phenomena can be derived from these equations in the non-relativistic limit. The purpose of this appendix is a simple consistency check, since it is obvious that the equations from which we start, being a consequence of Maxwell’s equations, must already contain those results that, even historically, are the basis of Maxwell’s equations themselves.

Consider a particle with electric charge e, moving within an electric conductor having a constant transverse section. Its velocity is much lower than the velocity of light. To fix ideas, we can think that the velocity is of the order of \(10^{-2}\ \mathrm{cm\,s^{-1}}\), which represents the order of magnitude of the drift velocities of electrons inside a conductor in a typical macroscopic electric circuit. The corresponding value of β is of the order of \(10^{-12}\), so that the approximation \(\beta^2 \ll 1\) is certainly verified. Furthermore, the effects of the curvature of the conductor (causing very small accelerations) can certainly be neglected, so that we can assume that the electric field is given only by the Coulomb term of Eq. (3.19). Neglecting terms of the order of \(\beta^2\), such field is written in the form

$$\mathbf{E}(\mathbf{r}, t) = {e \over\kappa^3 R^2} (\mathbf{n} - \boldsymbol{\beta }) , $$

where κ, R, n are the quantities introduced in Chap. 3, which need to be evaluated at the retarded time t′. The magnetic field is then given by Eq. (3.20), i.e.

$$\mathbf{B}(\mathbf{r}, t) = \mathbf{n} \times \mathbf{E}(\mathbf{r}, t) . $$

We can immediately notice that, if we put β=0 (i.e. we consider an electric charge at rest), we obviously do not need to consider the difference between real time and retarded time, so, since κ=1, we obtain

$$\mathbf{E}(\mathbf{r} ) = {e \mathbf{n} \over R^2} , \qquad \mathbf{B}(\mathbf{r} ) = 0 . $$

These are the ordinary equations of electrostatics which represent, in terms of fields, Coulomb’s law.

We are now going to see what we get at first order in β. With simple considerations it can be shown that the electric field E(r,t) is exactly equal to what one would calculate using Coulomb’s law and assuming, hypothetically, that the velocity of light were infinite (i.e. neglecting the difference between real and retarded time). In fact, referring to Fig. 16.1 and denoting with a prime the quantities measured at the retarded time t′ and without prime the same quantities at time t, we have

$$t' = t - {R' \over c} , \qquad \mathbf{R}' = \mathbf{R} + \bigl(t-t'\bigr) \mathbf{v} =\mathbf{R} + R' \boldsymbol{\beta }, $$

from which it follows, dividing by \(R'\),

$$ \mathbf{n}' - \boldsymbol{\beta}= {\mathbf{R} \over R'} . $$
(16.15)

Introducing the new notations in the expression for the electric field and recalling that β is constant we obtain

$$\mathbf{E}(\mathbf{r}, t) = {e \over\kappa^{\prime 3} R^{\prime 2}} \bigl(\mathbf{n}' - \boldsymbol{\beta}\bigr) = {e \mathbf{R} \over\kappa^{\prime 3} R^{\prime 3}} . $$

On the other hand we have by definition that

$$\kappa' = 1 - \boldsymbol{\beta}\cdot \mathbf{n}' , $$

and applying Carnot’s theorem to the triangle \(P P_t P_{t'}\)

$$ R = R' \sqrt{1 -2 \boldsymbol{\beta}\cdot \mathbf{n}' + \beta^2} . $$
(16.16)

Substituting in the expression of the electric field, we obtain the result we anticipated. In fact we obtain, apart from terms of the order of \(\beta^2\),

$$\mathbf{E}(\mathbf{r},t) = { e \mathbf{R} (1 - 2 \boldsymbol{\beta}\cdot \mathbf{n}' + \beta^2 )^{3/2} \over R^3 (1 - \boldsymbol{\beta}\cdot \mathbf{n}')^3} \simeq {e \mathbf{n} \over R^2} . $$
Fig. 16.1 We want to evaluate the electric field at the point P at time t. \(P_t\) is the position of the particle at that same time, while \(P_{t'}\) is the position of the particle at the retarded time.
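This first-order equivalence can be illustrated numerically: for a charge in uniform motion one can solve the retarded-time condition, evaluate the Coulomb term of the field with the exact retarded quantities, and compare it with the instantaneous Coulomb field. A Python sketch (units with c = 1; the geometry is our own choice):

```python
import numpy as np
from scipy.optimize import brentq

c, e, beta = 1.0, 1.0, 1e-3                  # units with c = 1; slow charge
r_obs = np.array([1.0, 2.0, 0.5])            # charge is at the origin at t = 0
b = np.array([beta, 0.0, 0.0])               # uniform velocity along x

# Retarded-time condition |r_obs - x_p(t')| = -c t' for t' < 0,
# with particle trajectory x_p(s) = (beta c s, 0, 0)
f = lambda tp: np.linalg.norm(r_obs - np.array([beta * c * tp, 0, 0])) + c * tp
tp = brentq(f, -100.0, -1e-12)

R_vec = r_obs - np.array([beta * c * tp, 0, 0])
Rp = np.linalg.norm(R_vec)
n_p = R_vec / Rp
kappa = 1.0 - b @ n_p

E_ret = e * (n_p - b) / (kappa**3 * Rp**2)        # retarded Coulomb term
E_inst = e * r_obs / np.linalg.norm(r_obs)**3     # instantaneous Coulomb field

rel = np.linalg.norm(E_ret - E_inst) / np.linalg.norm(E_inst)
print(rel, beta**2)    # the relative difference is of order beta^2
```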

It remains to evaluate the contribution of the magnetic field. We have

$$\mathbf{B}(\mathbf{r}, t) = \mathbf{n}' \times \mathbf{E}(\mathbf{r}, t) = {e \mathbf{n}' \times \mathbf{n} \over R^2} . $$

On the other hand, again apart from terms of the order of \(\beta^2\), we have, from Eqs. (16.15) and (16.16),

$$\mathbf{n}' = \boldsymbol{\beta}+ \mathbf{n} (1 - \boldsymbol{\beta}\cdot \mathbf{n}) , $$

so that

$$\mathbf{B}(\mathbf{r}, t) = { e \boldsymbol{\beta}\times \mathbf{n} \over R^2} . $$

Now we apply this equation to the case of an element of a conductor, of length \(\mathrm{d}\ell\). Denoting by N the number density of the moving charged particles and by S the transverse section of the conductor, the element contains a number of particles given by \(N S\,\mathrm{d}\ell\), with velocity \(\mathbf{v} = c\boldsymbol{\beta}\) parallel to \(\mathrm{d}\boldsymbol{\ell}\). There is an equal number of fixed particles of opposite charge, so that the resulting electric field vanishes by the property previously demonstrated. For the magnetic field we have instead

$$\mathbf{B}(\mathbf{r}, t) = e {N S v \over c} \mathrm{d} \boldsymbol{\ell}\times {\mathbf{n} \over R^2} . $$

On the other hand, if we denote by i the intensity of the current flowing in the conductor

$$i = e N S v , $$

so that the equation for the magnetic field is written

$$\mathbf{B}(\mathbf{r}, t) = {i \over c} \mathrm{d} \boldsymbol{\ell}\times {\mathbf{n} \over R^2} . $$

This is just the Biot and Savart law expressing the magnetic field generated by a current element. As is clear from our deduction, although the electric charges move within the conductor at very low speed, they nevertheless produce a relativistic effect, which manifests itself as the magnetic field.

16.5 The Relativistic Larmor Equation

Within the radiation zone, Eqs. (3.18) and (3.20) provide the expressions for the electric and magnetic field due to a moving charge

$$\mathbf{E}( \mathbf{r}, t) = {e \over c^2 \kappa^3 R} \mathbf{n} \times\bigl[( \mathbf{n} - \boldsymbol{\beta }) \times \mathbf{a} \bigr] , \qquad \mathbf{B}( \mathbf{r}, t) = \mathbf{n} \times \mathbf{E} ( \mathbf{r}, t) , $$

where e is the value of the electric charge, c is the speed of light, n is the unit vector along the direction of R, the vector that goes from the charge to the point of coordinates r, β=v/c is the velocity of the charge in units of the speed of light, a is the acceleration, and κ is defined by the equation

$$\kappa= 1 - \mathbf{n} \cdot \boldsymbol{\beta }. $$

We recall that the quantities R, κ, n, β, and a that appear in the previous equations must be evaluated at the retarded time t′, related to the time t by the equation

$$t' = t - {R \over c} . $$

Expanding the double vector product, we obtain

$$\mathbf{E} (\mathbf{r}, t) = {e \over c^2 \kappa^3 R} \bigl[ (\mathbf{n} \cdot \mathbf{a} ) ( \mathbf{n} - \boldsymbol{\beta }) - \kappa \mathbf{a} \bigr] . $$

On the other hand, we know that the Poynting vector is given by

$$\mathbf{S} (\mathbf{r}, t) = {c \over4 \pi} E^2(\mathbf{r}, t) \mathbf{n} , $$

and expanding the square of the electric field we obtain with simple algebra

$$\mathbf{S}(\mathbf{r}, t) = {e^2 \over4 \pi c^3 R^2} \biggl[ {a^2 \over\kappa^4} + 2 {(\mathbf{n} \cdot \mathbf{a} ) (\boldsymbol{\beta}\cdot \mathbf{a} ) \over \kappa^5} - {(1 - \beta^2) (\mathbf{n} \cdot \mathbf{a} )^2 \over\kappa^6} \biggr] \mathbf{n} . $$

This expression shows that, in the general case, the angular distribution of the emitted radiation (i.e. the radiation diagram) is quite complex. The special cases where the acceleration is either parallel or perpendicular to the velocity have been discussed in the text. Here, it is sufficient to emphasize the fact that, for any velocity and acceleration, there are always two directions where the Poynting vector is zero. This can be shown simply from the expression of the electric field. The electric field is obviously zero along the directions characterised by those unit vectors \(\mathbf{n}_0\) such that the vector \(\mathbf{n}_0 - \boldsymbol{\beta}\) is parallel to the vector a. The same holds for the Poynting vector. The directions \(\mathbf{n}_0\) are then contained in the plane defined by the vectors β and a, and are given by the solutions of the equation

$$(\mathbf{n}_0 - \boldsymbol{\beta }) \times \mathbf{a} = 0 . $$

Denoting by α the angle between the velocity and the acceleration vectors, the unit vectors \(\mathbf{n}_0\) are defined by the angles \(\theta_\pm\) (which are measured from the acceleration vector and increase in the same direction as α) given by

$$\theta_+ = \arcsin(\beta\sin \alpha) , \qquad\theta_-= \pi- \arcsin(\beta\sin \alpha) . $$

For example, if \(\alpha=45^\circ\) and \(\beta=0.8\), we have \(\theta_+=34.45^\circ\) and \(\theta_-=145.55^\circ\).
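A two-line numerical check of this example, together with a direct verification that \((\mathbf{n}_0-\boldsymbol{\beta})\times\mathbf{a}=0\) along the two directions, follows (a Python sketch, working in the plane defined by β and a, with the acceleration along the x axis):

```python
import numpy as np

beta, alpha = 0.8, np.radians(45.0)

theta_p = np.degrees(np.arcsin(beta * np.sin(alpha)))
theta_m = 180.0 - theta_p
print(theta_p, theta_m)                  # 34.45 and 145.55 (degrees)

# Check (n0 - beta_vec) x a = 0 in the plane of beta and a,
# with the acceleration along x and angles measured from it
a_hat = np.array([1.0, 0.0])
b_vec = beta * np.array([np.cos(alpha), np.sin(alpha)])
for th in np.radians([theta_p, theta_m]):
    u = np.array([np.cos(th), np.sin(th)]) - b_vec   # n0 - beta
    print(u[0] * a_hat[1] - u[1] * a_hat[0])         # z-component of u x a ~ 0
```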

Let us now move on to the calculation of the power. We note that if the integral

$$ \mathcal{I} = \oint \mathbf{S}(\mathbf{r}, t) \cdot \mathbf{n} R^2 \,\mathrm{d} \varOmega , $$
(16.17)

were simply executed over a sphere of radius R centred on the position of the charge at the retarded time \((t-R/c)\), we would obtain the ratio between the energy that flows across the sphere in a time interval dt and the interval dt itself. This quantity is however not of much interest. It is more interesting to obtain the power emitted by the charged particle. In order to do this, we need to take into account the fact that the energy that flows across the sphere in a time dt was emitted by the particle in the time dt′, which depends on the direction and is related to dt by

$$\mathrm{d}t = \kappa \,\mathrm{d}t' . $$

To find the power W emitted by the charged particle we therefore need to calculate the integral

$$W = \oint \mathbf{S}(\mathbf{r}, t) \cdot \mathbf{n} {\mathrm{d}t \over {\mathrm{d}}t'} R^2 \,\mathrm{d} \varOmega= \oint \mathbf{S}(\mathbf{r}, t) \cdot \mathbf{n} \kappa R^2 \,\mathrm{d} \varOmega . $$

Substituting the above expression of the Poynting vector, we find

$$W = {e^2 \over4 \pi c^3} \oint \biggl[ {a^2 \over\kappa^3} + 2 {(\mathbf{n} \cdot \mathbf{a} ) (\boldsymbol{\beta}\cdot \mathbf{a} ) \over\kappa^4} - {(1 - \beta^2) (\mathbf{n} \cdot \mathbf{a} )^2 \over\kappa^5} \biggr] \,\mathrm{d} \varOmega . $$

To calculate this integral, we introduce a system of polar coordinates (ψ,χ) with the polar axis directed along the velocity vector and the azimuth χ measured from the plane containing the velocity and the acceleration. With obvious notations, the three vectors β, a, and n in this system of coordinates are given by

$$\boldsymbol{\beta}= \beta \mathbf{k} , \qquad \mathbf{a} = a_\perp \mathbf{i} + a_\parallel \mathbf{k} , \qquad \mathbf{n} = \sin \psi\cos \chi \mathbf{i}+ \sin \psi\sin \chi \mathbf{j}+ \cos \psi \mathbf{k} , $$

so that the integrand can be written in the form

$$\begin{aligned} & {a_\parallel^2 + a_\perp^2 \over(1- \beta\cos \psi)^3} + 2 \beta a_\parallel{\sin \psi\cos \chi a_\perp+ \cos \psi a_\parallel\over(1 - \beta\cos \psi)^4} \\ &\quad {}- \bigl(1- \beta^2\bigr) {\sin^2 \psi\cos^2 \chi a_\perp^2 + 2 \sin \psi\cos \psi\cos\chi a_\perp a_\parallel+ \cos^2 \psi a_\parallel^2 \over(1 -\beta\cos \psi)^5} , \end{aligned} $$

and dΩ is given by \(\sin\psi\,\mathrm{d}\psi\,\mathrm{d}\chi\). By integrating in dχ over the interval (0,2π), the terms that do not contain any function of χ produce a factor 2π, those containing cosχ produce zero, while the term containing \(\cos^2\chi\) gives π. In summary we have

$$\begin{aligned} W & = {e^2 \over2 c^3} \int_0^\pi \biggl[ {a_\parallel^2 + a_\perp^2 \over(1- \beta\cos \psi)^3} + { 2 \beta\cos\psi a_\parallel^2 \over(1 - \beta\cos \psi)^4} \\ &\quad {}- \bigl(1- \beta^2\bigr) {{1 \over2} \sin^2 \psi a_\perp^2 + \cos^2 \psi a_\parallel^2 \over(1 -\beta\cos\psi)^5} \biggr] \sin \psi \, \mathrm{d} \psi . \end{aligned} $$

The integrals in dψ appearing in this expression are simple and can be evaluated either by integrating by parts or by changing the integration variable from ψ to x=1−βcosψ. We obtain

$$\begin{aligned} {1 \over2} \int_0^\pi {1 \over(1 - \beta\cos\psi)^3} \sin \psi \,\mathrm{d} \psi =& {1 \over(1 - \beta^2)^2} , \\ {1 \over2} \int_0^\pi {\cos\psi\over(1 - \beta\cos\psi)^4} \sin \psi \,\mathrm{d} \psi =& {4 \over3} { \beta\over(1 - \beta^2)^3} , \\ {1 \over2} \int_0^\pi {\sin^2 \psi\over(1 - \beta\cos\psi)^5} \sin \psi \,\mathrm{d} \psi =& {2 \over3} {1 \over(1 - \beta^2)^3} , \\ {1 \over2} \int_0^\pi {\cos^2 \psi\over(1 - \beta\cos\psi)^5} \sin \psi \,\mathrm{d} \psi =& {1 \over3} {1 + 5 \beta^2 \over (1 - \beta^2)^4} . \end{aligned}$$
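These four results can be verified symbolically; a sympy sketch, using the substitution \(u=\cos\psi\) so that each integrand becomes rational (the helper function is our own):

```python
import sympy as sp

beta = sp.Symbol('beta', positive=True)
u = sp.Symbol('u', real=True)        # u stands for cos(psi)

def half_integral(f):
    """(1/2) * integral of f in du between -1 and 1, via the antiderivative."""
    F = sp.integrate(f, u)
    return sp.simplify((F.subs(u, 1) - F.subs(u, -1)) / 2)

k = 1 - beta * u
print(half_integral(1 / k**3))            # 1/(1 - beta^2)^2
print(half_integral(u / k**4))            # 4 beta / (3 (1 - beta^2)^3)
print(half_integral((1 - u**2) / k**5))   # 2 / (3 (1 - beta^2)^3)
print(half_integral(u**2 / k**5))         # (1 + 5 beta^2) / (3 (1 - beta^2)^4)
```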

Substituting these expressions and grouping separately the terms in \(a_{\parallel}^{2}\) and in \(a_{\perp}^{2}\), we get

$$\begin{aligned} W & = {e^2 \over2 c^3} \biggl\{ a_\parallel^2 \biggl[ {1 \over (1 -\beta^2)^2} + {8 \over3} {\beta^2 \over(1 -\beta^2)^3} - {1 \over3} {1 + 5 \beta^2 \over(1 -\beta^2)^3} \biggr] \\ &\quad {} + a_\perp^2 \biggl[ {1 \over (1 - \beta^2)^2} - {1 \over3} {1 \over(1-\beta^2)^2} \biggr] \biggr\} , \end{aligned} $$

or, expanding,

$$W = {2 e^2 \over3 c^3} \biggl[ {a_\parallel^2 \over(1 - \beta^2)^3} + {a_\perp^2 \over(1 - \beta^2)^2} \biggr] . $$

Recalling the definition of the relativistic factor γ

$$\gamma= {1 \over\sqrt{1 - \beta^2}} , $$

the expression for the power emitted by a relativistic charge in accelerated motion can also be written in the more representative form

$$W = {2 e^2 \over3 c^3} \bigl( \gamma^6 a_\parallel^2 + \gamma^4 a_\perp^2\bigr) . $$

This formula is a generalisation of the Larmor equation (3.23) to the relativistic case. Obviously, for γ=1 we recover the Larmor equation, since \(a_{\parallel}^{2} + a_{\perp}^{2} = a^{2}\).

To conclude, we note that, if we had executed the integral of the Poynting vector on the sphere without taking into account the difference between dt and dt′ (i.e. the integral \(\mathcal{I}\) of Eq. (16.17)), we would obviously have obtained a different expression. Taking into account that

$$\begin{aligned} {1 \over2} \int_0^\pi {1 \over(1 - \beta\cos\psi)^4} \sin \psi \,\mathrm{d} \psi =& {1 \over3} { 3 + \beta^2 \over(1 - \beta^2)^3} , \\ {1 \over2} \int_0^\pi {\cos\psi\over(1 - \beta\cos\psi)^5} \sin\psi \,\mathrm{d} \psi =& {1 \over3} { \beta (5 +\beta^2) \over(1 - \beta^2)^4} , \\ {1 \over2} \int_0^\pi {\sin^2 \psi\over(1 - \beta\cos\psi)^6} \sin \psi \,\mathrm{d} \psi =& {2 \over15} { 5 + \beta^2 \over(1 - \beta^2)^4} , \\ {1 \over2} \int_0^\pi {\cos^2 \psi\over(1 - \beta\cos\psi)^6} \sin \psi \,\mathrm{d} \psi =& {1 \over15} {5 + 38 \beta^2 + 5 \beta^4 \over(1 - \beta^2)^5} , \end{aligned}$$

we have, in fact, that

$$\mathcal{I} = {2 e^2 \over3 c^3} \biggl[ \gamma^8 \biggl(1 + {1 \over5} \beta^2 \biggr) a_\parallel^2 + \gamma^6 \biggl(1 + {2 \over5} \beta^2 \biggr) a_\perp^2 \biggr] . $$

This difference between the power emitted by the particle (W) and the power received on the sphere (\(\mathcal{I}\)) is a simple kinematic effect and has nothing to do with relativity. A similar effect occurs in the case of acoustic waves emitted, for example, by an airplane travelling at a speed close to the velocity of sound. While the power emitted by the plane into acoustic waves is fixed, the received power can be very large and, in the limit, almost infinite if the plane travels for a long time at exactly the speed of sound (the so-called sonic boom is due precisely to this phenomenon).

16.6 Gravitational Waves

The equations that we have obtained for the radiation of electromagnetic waves can also be applied, with some slight modifications, to treat gravitational radiation. Obviously, this is not rigorous, since the laws of gravitational radiation should be derived from the general theory of relativity. The approach followed here is however sufficient to describe the fundamental properties of the mechanisms for the generation of gravitational waves and leads to formulae that are substantially correct (as can be verified a posteriori).

We start by performing a formal transformation of the equations for electromagnetic radiation described in Sect. 3.10,

$$e_i \rightarrow m_i \quad(i=1, \ldots ,N) , $$

i.e. we replace, for each particle, the charge with the mass. Furthermore, in the equations that express the Poynting vector (i.e. in those which express the radiated power), we multiply the right-hand side by the universal gravitational constant G. We note, incidentally, that in these equations the dimensional factor \([e^2]\) is replaced by the dimensional factor \([Gm^2]\), which has the same dimensions. The various quantities introduced in Sect. 3.10, i.e. the electric dipole moment D (Eq. (3.31)), the magnetic dipole moment M (Eq. (3.32)), and the symmetric tensor of rank two related to the electric quadrupole moment (Eq. (3.33)), are transformed into corresponding quantities, for which we use, respectively, the symbols \(\mathbf{D}_{\mathrm{G}}\), \(\mathbf{M}_{\mathrm{G}}\), and \(\mathcal{I}_{\mathrm{G}}\).

We now note that the quantity \(\mathbf{D}_{\mathrm{G}}\), the analogue of the electric dipole, is, by definition, the coordinate of the centre of mass of the system of N particles, \(\mathbf{r}_{\mathrm{G}}\), multiplied by the total mass. We therefore have

$$\mathbf{D}_{\mathrm{G}} = \sum_{i=1}^N m_i \mathbf{s}_i = \mathcal{M} \mathbf{r}_{\mathrm{G}} , $$

where

$$\mathcal{M} = \sum_{i=1}^N m_i . $$

We then have, for an isolated system,

$$\ddot{\mathbf{D}}_{\mathrm{G}} = \mathcal{M} {\mathrm{d}^2 \over{\mathrm{d}}t^2} \mathbf{r}_{\mathrm{G}} = 0 . $$

Furthermore, the analogue of the magnetic dipole, the quantity \(\mathbf{M}_{\mathrm{G}}\), is proportional to the total angular momentum of the system J, since

$$2 c \mathbf{M}_{\mathrm{G}} = \sum_{i=1}^N m_i \mathbf{s}_i \times \mathbf{v}_i = \mathbf{J} . $$

We then obtain, for an isolated system,

$$\dot{\mathbf{M}}_{\mathrm{G}} = {\mathrm{d} \over{\mathrm{d}}t} \mathbf{J} = 0 , $$

and therefore also

$$\ddot{\mathbf{M}}_{\mathrm{G}}= 0 . $$

According to our analogy, we can then conclude that for gravitational waves there is neither the analogue of electric dipole radiation nor the analogue of magnetic dipole radiation. There only remains the analogue of the electric quadrupole radiation (in addition, obviously, to the radiation due to higher multipoles). The tensor \(\mathcal{I}_{\mathrm{G}}\) is essentially an inertia tensor. It is however not to be confused with the ordinary inertia tensor \(\mathcal{I}\) that is used to describe the dynamics of a rigid body, and that is defined as

$$\mathcal{I} = \sum_{i=1}^N m_i \bigl( s_i^2 {\mathbf{U}} - \mathbf{s}_i \mathbf{s}_i \bigr) , $$

where U is the unitary tensor. The trace of \(\mathcal{I}\) follows immediately; recalling the definition of the trace of a tensor, we have

$$\operatorname{Tr} \mathcal{I} = \sum_i m_i \bigl(3 s_i^2 - x_i^2 - y_i^2 -z_i^2\bigr) = 2 \sum _i m_i s_i^2 . $$

The two tensors \(\mathcal{I}\) and \(\mathcal{I}_{\mathrm{G}}\) differ by a quantity which is proportional to the unitary tensor. This property is strictly analogous to the one that exists between the corresponding quadrupole tensors in electrodynamics. Therefore, within our analogy, the power emitted in gravitational waves at the lowest order (of the multipolar expansion) can be obtained from Eq. (3.34) and is given by

$$W_{\mathrm{G}} = {G \over20 c^5} \sum_{jk} ( \dddot{\mathcal{I}}_{jk} )^{ 2} . $$

This formula is correct in all respects, aside from the numerical factor. The calculations based on general relativity produce a similar result, where the factor \({1 \over 20}\) is replaced by the factor \({1 \over 5}\). Intuitively, one can justify this multiplication by a factor of four by noting that an electromagnetic wave is described by two vectors E and B that are not independent and are perpendicular to the direction of propagation, say z. Only two components of one of the two fields, such as \(E_x\) and \(E_y\), are sufficient to describe the wave. A gravitational wave is instead described by two independent tensors, also perpendicular to the direction of propagation. If we denote these tensors by the traditional symbols \(e^{+}\) and \(e^{\times}\), the wave is described by the eight components (\(e_{xx}^{+}, e_{xy}^{+}, e_{yx}^{+}, e_{yy}^{+}, e_{xx}^{\times}, e_{xy}^{\times}, e_{yx}^{\times}, e_{yy}^{\times}\)). The factor of four is therefore associated with the degrees of freedom of the polarisation. The correct formula for the power emitted in gravitational waves is then

$$W_{\mathrm{G}} = {G \over5 c^5} \sum_{jk} ( \dddot{\mathcal{I}}_{jk} )^{ 2} . $$

Finally, we note that if we change the origin of coordinates putting

$$\mathbf{s}^{ \prime}_i = \mathbf{b} + \mathbf{s}_i , $$

with b a constant vector, we obtain that the new inertia tensor \(\mathcal{I}^{ \prime}\) is

$$\mathcal{I}^{\prime}= \mathcal{I} + \mathcal{M} \bigl[ \bigl(2 \mathbf{b} \cdot \mathbf{r}_{\mathrm{G}} + b^2\bigr) \mathbf{U} - \mathbf{b} \mathbf{b} - \mathbf{b} \mathbf{r}_{\mathrm{G}} - \mathbf{r}_{\mathrm{G}} \mathbf{b} \bigr] , $$

so that, for an isolated system,

$$\ddot{\mathcal{I}}^{\prime} = \ddot{\mathcal{I}} , $$

and, all the more so, \({\dddot{\mathcal{I}}}^{ \prime} = \dddot{\mathcal{I}}\). This equation allows one to calculate the inertia tensor in a coordinate system having an arbitrary origin in order to determine the power emitted in gravitational waves.
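As a numerical aside (not part of the original argument), the last two formulae are easy to check on the textbook case of two point masses in a circular orbit. The following minimal Python sketch, assuming NumPy is available and using arbitrary example values for the masses and the separation, builds the inertia tensor defined above, takes its third time derivative by finite differences, and compares the resulting power with the standard closed-form result 32Gμ²d⁴Ω⁶/(5c⁵) for a circular binary, μ being the reduced mass:

import numpy as np

# cgs constants, as in Sect. 16.16
G, c = 6.67428e-8, 2.99792e10
m1 = m2 = 1.989e33                     # two solar masses (example values)
d = 1.0e11                             # orbital separation in cm (example value)
mu = m1*m2/(m1 + m2)                   # reduced mass
Omega = np.sqrt(G*(m1 + m2)/d**3)      # Kepler angular frequency

def inertia(t):
    # ordinary inertia tensor I_jk = sum_i m_i (s_i^2 delta_jk - s_ij s_ik)
    r1 = (m2/(m1 + m2))*d*np.array([np.cos(Omega*t), np.sin(Omega*t), 0.0])
    r2 = -(m1/(m1 + m2))*d*np.array([np.cos(Omega*t), np.sin(Omega*t), 0.0])
    I = np.zeros((3, 3))
    for m, s in ((m1, r1), (m2, r2)):
        I += m*(np.dot(s, s)*np.eye(3) - np.outer(s, s))
    return I

# third time derivative by central finite differences
h = 1.0e-3/Omega
dddI = (inertia(2*h) - 2*inertia(h) + 2*inertia(-h) - inertia(-2*h))/(2*h**3)

W_num = G/(5*c**5)*np.sum(dddI**2)
W_ref = 32*G*mu**2*d**4*Omega**6/(5*c**5)
print(W_num, W_ref)                    # the two values agree to several digits

For a circular orbit the trace of the inertia tensor is constant, so using \(\mathcal{I}\) or \(\bar{\mathcal{I}}\) in the sum of squares makes no difference.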

16.7 Calculation of the Thomas-Fermi Integral

Some applications of atomic physics based on the Thomas-Fermi model require the calculation of the following integral

$$\mathcal{I} = \int_0^\infty{(1+\chi) \chi^{3/2} \over x^{1/2}} \,\mathrm{d} x , $$

where χ(x) is the solution of the Thomas-Fermi equation

$$x^{1/2} \chi'' =\chi^{3/2} , $$

which satisfies the boundary conditions

$$\chi(0)=1 , \qquad\lim_{x \to\infty} \chi(x) = 0 . $$

The integral is split into the sum of two integrals

$$ \mathcal{I} = \mathcal{I}_1 + \mathcal{I}_2 , $$
(16.18)

where

$$\mathcal{I}_1 = \int_0^\infty { \chi^{3/2} \over x^{1/2}} \,\mathrm{d} x , \qquad\mathcal{I}_2 = \int _0^\infty{ \chi^{5/2} \over x^{1/2}} \,\mathrm{d} x . $$

The first integral is trivial since, taking into account the Thomas-Fermi equation and the boundary conditions of the function χ, we have

$$ \mathcal{I}_1 = \int_0^\infty \chi'' \,\mathrm{d}x = -\chi'(0) . $$
(16.19)

The calculation of the second integral is more complex. It can be done in the following way. On the one hand, we have

$$\mathcal{I}_2 = \int_0^\infty { \chi^{5/2} \over x^{1/2}} \,\mathrm{d} x = \int_0^\infty \chi \chi'' \,\mathrm{d} x , $$

and, integrating by parts and taking into account that χ(0)=1

$$ \mathcal{I}_2 = - \chi'(0) - \int_0^\infty \chi ^{\prime 2} \,\mathrm{d} x . $$
(16.20)

On the other hand, considering the quantity x −1/2 dx as a differential factor, by integrating again by parts and recalling the Thomas-Fermi equation, we obtain

$$\mathcal{I}_2 = \int_0^\infty { \chi^{5/2} \over x^{1/2}} \,\mathrm{d} x = -5 \int_0^\infty x^{1/2} \chi^{3/2} \chi' \,{\mathrm{d}} x = -5 \int_0^\infty x \chi' \chi'' \,\mathrm{d} x . $$

Now we note that the product χ′χ″ can be expressed in the form

$$\chi' \chi'' = {1 \over2} {\mathrm{d} \chi^{\prime 2}\over{\mathrm {d}} x} , $$

and integrating again by parts we get

$$\mathcal{I}_2 = {5 \over2} \int_0^\infty \chi^{\prime 2} \,\mathrm{d} x . $$

Comparing this expression with Eq. (16.20), we obtain

$$\int_0^\infty\chi^{\prime 2} \,\mathrm{d} x = -{2 \over7} \chi'(0) , \quad\mathrm{or}, \quad \mathcal{I}_2 = - {5 \over7} \chi'(0) . $$

Finally, recalling Eqs. (16.18) and (16.19) we get

$$ \mathcal{I} = \int_0^\infty{(1+\chi) \chi^{3/2} \over x^{1/2}} \,\mathrm{d} x = -{12 \over7} \chi'(0) . $$
(16.21)
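The result (16.21) lends itself to a simple numerical check. The following sketch, assuming NumPy and SciPy are available, integrates the Thomas-Fermi equation by bisecting on the unknown initial slope χ′(0) and accumulates the integral along the solution; the truncation of the integration range limits the agreement to roughly three digits:

import numpy as np
from scipy.integrate import solve_ivp

# y = (chi, chi', I), with I(x) the running value of the integral in Eq. (16.21)
def rhs(x, y):
    chi = max(y[0], 0.0)                 # guard against tiny negative overshoots
    src = chi**1.5/np.sqrt(x)            # Thomas-Fermi equation: chi'' = chi^(3/2)/x^(1/2)
    return [y[1], src, (1.0 + chi)*src]

def crossed(x, y): return y[0]           # chi reaches zero: chi'(0) too negative
crossed.terminal = True
def blew_up(x, y): return y[0] - 10.0    # chi diverges: chi'(0) not negative enough
blew_up.terminal = True

def shoot(s, x_max=30.0):
    x0 = 1e-10                           # start slightly off the singular point x = 0
    y0 = [1.0 + s*x0 + (4.0/3.0)*x0**1.5, s + 2.0*np.sqrt(x0), 0.0]
    return solve_ivp(rhs, (x0, x_max), y0, rtol=1e-10, atol=1e-12,
                     events=(crossed, blew_up))

lo, hi = -1.7, -1.5                      # bracket for chi'(0)
for _ in range(45):
    s = 0.5*(lo + hi)
    if shoot(s).t_events[0].size:        # chi crossed zero
        lo = s
    else:                                # chi diverged (or survived to x_max)
        hi = s
s = 0.5*(lo + hi)
sol = shoot(s)
print(s)                                 # chi'(0) = -1.588...
print(sol.y[2, -1], -12.0/7.0*s)         # the two sides of Eq. (16.21)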

16.8 Energy of the Ground Configuration of the Silicon Atom

As an application of the results obtained in Chap. 8, we evaluate the energy of the ground configuration of the silicon atom, i.e. of the 1s 22s 22p 63s 23p 2 configuration. Before performing these calculations, we need to evaluate some 3-j symbols. By means of the analytical formula given in Eq. (7.16), we have

$$ \begin{pmatrix} 0 & 0 & 0 \cr0 & 0 & 0\end{pmatrix}^{2} = 1 , \qquad \begin{pmatrix} 0 & 1 & 1 \cr0 & 0 & 0 \end{pmatrix}^{ 2} = { {1 \over3}} , \qquad \begin{pmatrix} 1 & 1 & 2 \cr0 & 0 & 0 \end{pmatrix}^{ 2} = { {2 \over15}} . $$
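These values are immediately verified with a computer-algebra system; for instance, assuming SymPy is available,

from sympy.physics.wigner import wigner_3j

# squares of the three 3-j symbols quoted above: 1, 1/3 and 2/15
print(wigner_3j(0, 0, 0, 0, 0, 0)**2)
print(wigner_3j(0, 1, 1, 0, 0, 0)**2)
print(wigner_3j(1, 1, 2, 0, 0, 0)**2)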

The configuration contains four closed subshells and one open subshell. We start by evaluating the degenerate contribution to the energy. The Hamiltonian H 0 (defined in Eq. (7.3)) and the \(\mathcal{F}\) part of the Hamiltonian \(\mathcal{H}_{1}\) (defined in Eqs. (8.2) and (8.3)) produce five terms, one for each subshell (open or closed). The corresponding energy (that we denote by \(\mathcal{E}_{1}\)) is obtained from Eq. (8.7) and is given by

$$\begin{aligned} \mathcal{E}_1 & = 2 W_0(1s) + 2 W_0(2s) +6 W_0(2p) + 2 W_0(3s) + 2 W_0(3p) \\ &\quad {} + 2 I(1s) + 2 I(2s) +6 I(2p) + 2 I(3s) + 2 I(3p) , \end{aligned} $$

where W 0 is defined by Eq. (7.11) and I(n,l) is the integral defined in Eq. (8.6). The energy of the Coulomb interaction (i.e. the part \(\mathcal{G}\) of the Hamiltonian \(\mathcal{H}_{1}\)) resulting from closed subshells contributes four terms. Denoting by \(\mathcal{E}_{2}\) the corresponding energy, we have, using Eq. (8.17),

$$\mathcal{E}_2 = F^0(1s,1s) + F^0(2s,2s) + 15 F^0(2p,2p) - {{6 \over5}} F^2(2p,2p) + F^0(3s,3s) , $$

where the quantities F k(n a l a ,n b l b ) are defined in Eq. (8.9). Considering the energy of the Coulomb interaction between different closed subshells, we have six contributions, as many as the number of distinct pairs that can be formed with the four closed subshells. Denoting by \(\mathcal{E}_{3}\) the corresponding energy, we have, using Eqs. (8.14) and (8.16)

$$\begin{aligned} \mathcal{E}_3 & = 4 F^0(1s,2s) - 2 G^0(1s,2s) + 12 F^0(1s,2p) - 2 G^1(1s,2p) + 4 F^0(1s,3s) \\ &\quad {} - 2 G^0(1s,3s) + 12 F^0(2s,2p) - 2 G^1(2s,2p) + 4 F^0(2s,3s) - 2 G^0(2s,3s) \\ &\quad {} + 12 F^0(2p,3s) - 2 G^1(2p,3s) , \end{aligned} $$

where the quantities G k(n a l a ,n b l b ) are defined in Eq. (8.10). Finally, we need to evaluate the contribution of the Coulomb interaction between the open subshell 3p and the four closed subshells. Denoting by \(\mathcal{E}_{4}\) the corresponding energy, we have, using Eqs. (8.13) and (8.15),

$$\begin{aligned} \mathcal{E}_4 & = 4 F^0(1s,3p) - {{2 \over3}} G^1(1s,3p) + 4 F^0(2s,3p) - {{2 \over3}} G^1(2s,3p) + 12 F^0 (2p,3p) \\ &\quad {} - 2 G^0(2p,3p) - {{4 \over5}} G^2(2p,3p) + 4 F^0(3s,3p) - {{2 \over3}} G^1(3s,3p) . \end{aligned} $$

The four contributions to the energy that we have calculated are degenerate with respect to all the states of the configuration. For the degenerate part of the energy of the ground configuration of the silicon atom, \(\mathcal {E}\), we then have

$$\mathcal{E} = \mathcal{E}_1 + \mathcal{E}_2 + \mathcal{E}_3 + \mathcal {E}_4 . $$

What remains to be calculated is given by Eq. (8.11), with the sum extended to the only pair of electrons belonging to the open subshell 3p. The explicit computation is done in Sect. 8.6. The two 3p electrons give rise to three terms that, in order of increasing energy, are 3 P, 1 D, and 1 S. The ratio between the intervals (1 S−1 D) and (1 D−3 P) is equal to 3/2.

16.9 Calculation of the Fine-Structure Constant of a Term

The calculation of the constant ζ(α,LS), which characterises the fine-structure intervals of the terms belonging to a given configuration, can be carried out with a process based on the diagonal sum rule. A similar process was followed in Sect. 8.1 to determine the energy of the terms. The starting point is Eq. (9.8) which, in the case of diagonal matrix elements, is

$$\sum_i \langle \alpha L S M_L M_S | \xi(r_i) \boldsymbol{\ell}_i \cdot \mathbf{s}_i | \alpha L S M_L M_S \rangle = \zeta( \alpha, LS) M_L M_S . $$

On the other hand, for any eigenstate of the configuration of the form Ψ A(a 1,a 2,…,a N ) of Eq. (7.1), the diagonal matrix element of the same operator is given by

$$\sum_i \bigl\langle \varPsi^{\mathrm{A}}(a_1,a_2, \ldots,a_N) \bigr| \xi(r_i) \boldsymbol{\ell}_i \cdot \mathbf{s}_i \bigl| \varPsi^{\mathrm{A}}(a_1,a_2, \ldots,a_N) \bigr\rangle = \sum_i \zeta_{n_i l_i} m_i m_{si} , $$

where \(\zeta_{n_{i} l_{i}}\) is the quantity defined in Eq. (9.10).

We now consider the particular case of the pf configuration which, as shown in Table 7.3, gives rise to the six terms 1 D, 1 F, 1 G, 3 D, 3 F, and 3 G. We start from a state having the highest values of the quantum numbers M L and M S , i.e. M L =4, M S =1. This state can only originate from the 3 G term. Considering instead single particle states, this state is of the type m 1=1, \(m_{s1}={{1 \over2}}\), m 2=3, \(m_{s2}={{1 \over2}}\), where the indices 1 and 2 refer, respectively, to the p and f electron. Using the same notations as in Sect. 8.1 we can write the equalityFootnote 5

$$[4,1] = \bigl(1^+,3^+\bigr) , $$

that, according to the previous equation, isFootnote 6

$$4 \zeta\bigl( {}^3\!G\bigr) = {{1 \over2}} \zeta_{np} + {{3 \over2}} \zeta_{nf} . $$

We therefore obtain the result

$$\zeta\bigl( {}^3\!G\bigr) = {{1 \over8}} \zeta_{np} + {{3 \over8}} \zeta_{nf} . $$

We then proceed by lowering the value of M L (maintaining M S =1). We obtain the equations

$$[3,1] = \bigl(0^+,3^+\bigr) + \bigl(1^+,2^+\bigr) , \qquad[2, 1] = \bigl(-1^+,3^+\bigr) + \bigl(0^+, 2^+\bigr) + \bigl(1^+, 1^+\bigr) , $$

from which we have, noting that the combination [M L =3, M S =1] can originate from the 3 G and 3 F terms, and that the combination [M L =2, M S =1] can originate from the 3 G, 3 F, and 3 D terms,

$$\begin{aligned} 3 \bigl[\zeta\bigl({}^3\!G\bigr) + \zeta\bigl({}^3\!F \bigr)\bigr] =& {{3 \over2}} \zeta_{nf} + { {1 \over2}} \zeta_{np} + \zeta_{nf} , \\ 2 \bigl[\zeta\bigl({}^3\!G\bigr) + \zeta\bigl({}^3\!F \bigr) + \zeta\bigl({}^3\!D\bigr)\bigr] =& - {{1 \over2}} \zeta_{np} + {{3 \over2}} \zeta_{nf} + \zeta_{nf} + {{1 \over2}} \zeta_{np} + { {1 \over2}} \zeta_{nf} . \end{aligned}$$

By solving the system, we arrive at the following expressions (which can also be obtained from Eq. (9.11))

$$\zeta\bigl({}^3\!F\bigr) = {{1 \over24}} \zeta_{np} + {{11 \over24}} \zeta_{nf} , \qquad \zeta\bigl({}^3\!D\bigr) = - {{1 \over6}} \zeta_{np} + {{2 \over3}} \zeta_{nf} . $$
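The elimination procedure used above is easy to mechanise. The following sketch (assuming SymPy; the symbol names are of course arbitrary) enumerates the microstates of the pf configuration, applies the diagonal sum rule, and reproduces the three triplet constants:

import sympy as sp

zp, zf = sp.symbols('zeta_np zeta_nf')
half = sp.Rational(1, 2)

# microstates (m_p, ms_p, m_f, ms_f) of the pf configuration
states = [(m1, s1, m2, s2)
          for m1 in (-1, 0, 1) for s1 in (half, -half)
          for m2 in range(-3, 4) for s2 in (half, -half)]

def diag_sum(ML, MS):
    # sum of zeta_np m1 ms1 + zeta_nf m2 ms2 over microstates with given M_L, M_S
    return sum(zp*m1*s1 + zf*m2*s2 for (m1, s1, m2, s2) in states
               if m1 + m2 == ML and s1 + s2 == MS)

zG = sp.expand(diag_sum(4, 1)/4)                  # [4,1]: only 3G contributes
zF = sp.expand((diag_sum(3, 1) - 3*zG)/3)         # [3,1]: 3G and 3F
zD = sp.expand((diag_sum(2, 1) - 2*zG - 2*zF)/2)  # [2,1]: 3G, 3F and 3D
print(zG, zF, zD)
# zeta_np/8 + 3*zeta_nf/8,  zeta_np/24 + 11*zeta_nf/24,  -zeta_np/6 + 2*zeta_nf/3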

In principle, we could also consider the value M S =0. For example,

$$[4,0] = \bigl(1^+, 3^-\bigr) + \bigl(1^-, 3^+\bigr) . $$

However, in so doing we obtain equations of the form 0=0 and the value of ζ(1 G) is undetermined. This is entirely consistent, since the singlet states do not have fine structure and the constant ζ is not defined.

The cases of the configurations of equivalent electrons are also interesting, because, by repeating the same arguments, we obtain directly the third Hund’s rule. For example, consider the configuration p 2 which produces, as shown in Table 7.4, the three terms 1 S, 1 D, and 3 P. For the singlet terms, the fine structure constant remains undetermined, as usual. For the triplet term we have instead

$$[1,1] = \bigl(0^+, 1^+\bigr) , $$

from which we obtain

$$\zeta\bigl({}^3\!P\bigr) = {{1 \over2}} \zeta_{np} . $$

If we consider the complementary configuration p 4, we get the same structure of terms. This time, to find the fine structure constant of the 3 P term, we need to consider the equationFootnote 7

$$[1,1] = \bigl(1^+,1^-,0^+,-1^+\bigr) , $$

and we obtain

$$\zeta\bigl({}^3\!P\bigr) = - {{1 \over2}} \zeta_{np} , $$

i.e. a value that is exactly the same in magnitude (but opposite in sign) as the one of the configuration p 2. These arguments can be repeated for any configuration of equivalent electrons and for the corresponding complementary configuration, and lead to the third Hund's rule. In the particular case of configurations that fill half of a subshell (such as p 3, d 5, and f 7), the configuration coincides with the complementary one, and the fine structure constant is zero for all the terms.

16.10 The Fundamental Principle of Statistical Thermodynamics

Consider, in all generality, a macroscopic physical system. We suppose that the system is in thermal equilibrium with an ideal heat reservoir having temperature T (canonical ensemble). We also suppose that all possible microscopic states of the system are labelled by the index i, and we denote by E i the energy of the i-th state. Macroscopically, the system is in a steady state. On the other hand, from the microscopic point of view, we can think of it as constantly evolving from one microscopic state to another. We can then introduce a statistical description, denoting by p i the probability that the system is in the i-th microscopic state. The following normalisation property must obviously hold

$$\sum_i p_i = 1 . $$

We now need to relate the probability p i to the energy E i . To do so, we define the entropy by putting, according to a hypothesis originally due to Boltzmann

$$S = - k_{\mathrm{B}} \sum_i p_i \ln p_i , $$

where k B is the Boltzmann constant.

This definition can be justified by considering that the entropy of a system measures the amount of “disorder” contained in the system itself, and by noting that the function defined above has the mathematical property of assuming its maximum value when all the probabilities p i are equal to one another, and its minimum value (which is equal to 0) when a single p i is equal to 1 and all the others are equal to 0. The proof of the second property is trivial. To prove the first property we note that, giving an arbitrary variation δp i to the probabilities, the corresponding change δS of the entropy is

$$\delta S = - k_{\mathrm{B}} \sum_i (\ln p_i + 1) \delta p_i . $$

On the other hand, since

$$\sum_i \delta p_i = 0 , $$

it follows that if lnp i is constant (i.e. independent of i), δS is null and therefore the entropy presents an extremum. It is then easy to verify that such an extremum is actually a maximum, since

$${\mathrm{d}^2 S \over{\mathrm{d}} p_i^2} = - k_{\mathrm{B}} {1 \over p_i} < 0 . $$

Having justified the definition of the entropy, we now take into account that the internal energy of the system is given by the expression

$$U = \sum_i p_i E_i . $$

If we consider an infinitesimal thermodynamical transformation of the system, the internal energy will vary, in general, because both the probabilities p i and the energies E i change. We then have

$$\delta U = \sum_i (\delta p_i) E_i + \sum_i p_i ( \delta E_i) . $$

If, however, the external conditions of the system are not varied, the quantities E i remain fixed at their initial values and the second term on the right-hand side vanishes. On the other hand, keeping the external conditions of the system constant means that the system performs no mechanical work on the surrounding medium, so we can write, according to the first principle of thermodynamics

$$\delta U = \sum_i (\delta p_i) E_i = \delta Q = T \delta S , $$

where δQ is the heat exchanged with the reservoir, and taking into account that

$$\delta S = - k_{\mathrm{B}} \sum_i (\delta p_i) \ln p_i , $$

we obtain the equation

$$\sum_i (\delta p_i) E_i = - k_{\mathrm{B}} T \sum_i ( \delta p_i) \ln p_i . $$

This equation must be satisfied for an arbitrary thermodynamic transformation (as long as no work is done), i.e. for arbitrary variations δp i subject only to the constraint that their sum vanishes. The following relation therefore must hold

$$E_i = -k_{\mathrm{B}} T \ln p_i + \mathrm{const.} , $$

which leads to the relation

$$p_i = A \mathrm{e}^{- \beta E_i } , $$

where A is a constant and where we have put

$$\beta= {1 \over k_{\mathrm{B}} T} . $$

The constant A is determined by imposing the normalisation condition. Since we must have

$$\sum_i p_i = \sum _i A \mathrm{e}^{- \beta E_i } = 1 , $$

it follows that

$$A = {1 \over{\mathcal{Z}}} , $$

where the quantity \(\mathcal{Z}\), known as the sum over states (or partition function), is given by

$$\mathcal{Z} = \sum_i \mathrm{e}^{- \beta E_i } . $$

The expression of p i can therefore be written in its final form

$$ p_i = {1 \over{\mathcal{Z}}} \mathrm{e}^{- \beta E_i } = {\mathrm{e}^{- \beta E_i } \over \sum_j \mathrm{e}^{- \beta E_j } } . $$
(16.22)

This expression, often referred to as Gibbs principle, is of extreme generality and can rightly be considered the basis of all statistical thermodynamics. It can be written in an alternative form by assuming that the microscopic states of the system are not discrete (and therefore countable) but are identified by the representative point in the phase space of the system having dimension \(2 \mathcal{N}\), where \(\mathcal{N}\) is the number of degrees of freedom of the system itself. In this case, denoting by dP the probability that the representative point of the system is in the cell \(\mathrm{d} \varGamma=\mathrm{d}q_{1} \,\mathrm{d}q_{2} \cdots{\mathrm{d}}q_{\mathcal{N}} \,\mathrm{d}p_{1} \,\mathrm{d}p_{2} \cdots{\mathrm{d}}p_{\mathcal{N}}\) of the phase space centered around the values (q i ,p i ), and denoting by H(q i ,p i ) the Hamiltonian of the system, we have

$$ \mathrm{d}P = {1 \over{\mathcal{Z}}} \mathrm{e}^{- \beta H(q_i,p_i)} \,\mathrm{d} \varGamma= {\mathrm{e}^{- \beta H(q_i,p_i)} \,\mathrm{d} \varGamma \over \int{\mathrm{e}}^{- \beta H(q_i,p_i)} \,\mathrm{d} \varGamma} , $$
(16.23)

where the integral is over the entire volume of the phase space available to the system. Equations (16.22) and (16.23) coincide, respectively, with Eqs. (10.2) and (10.1) which we have assumed as the basic principles for the deduction of the various laws of thermodynamical equilibrium in Chap. 10.
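As a simple numerical illustration of Eq. (16.22) (a sketch, not part of the original text), the canonical probabilities and the corresponding entropy of a hypothetical three-level system can be computed as follows:

import numpy as np

k_B = 1.38065e-16                      # erg/K, as in Sect. 16.16

def gibbs(E, T):
    # Eq. (16.22): p_i = exp(-E_i/(k_B T))/Z, shifted for numerical stability
    w = np.exp(-(E - E.min())/(k_B*T))
    return w/w.sum()

E = np.array([0.0, 1.0e-14, 2.0e-14])  # erg; hypothetical level energies
for T in (50.0, 300.0, 5000.0):
    p = gibbs(E, T)
    S = -k_B*np.sum(p*np.log(p))       # entropy of the distribution
    print(T, p.round(4), S)

As T grows, the three probabilities approach equality and the entropy approaches its maximum value k B ln3.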

16.11 Transition Probability for the Coherences

In Chap. 11, we have introduced the so-called random phase approximation and we have determined the kinetic equations for the diagonal matrix elements ρ α of the density matrix operator of the physical system. The result that we found is the kinetic equation (11.9), which is interpreted by introducing the transition probability per unit time between different states of the system. This probability is given by Fermi’s golden rule, expressed by Eq. (11.10). We now want to generalise these results by determining the kinetic equations for the so-called coherences, i.e. for the non-diagonal matrix elements of the density matrix operator.

We start again from Eq. (11.8) and introduce the hypothesis, less restrictive than that of the random phases, that in the physical system there might exist coherences, even if only within pairs of states, |α〉 and |α′〉, having the same energy eigenvalue (degenerate states) and such that the matrix element of the interaction Hamiltonian between them, \(\mathcal{H}^{\mathrm{I}}_{\alpha\alpha'}\), is zero. Taking into account this hypothesis, when evaluating the product \(c_{\alpha}(t) c_{\alpha'}^{*}(t)\) we obtain, considering only the terms that are at most quadratic in the matrix elements of \(\mathcal {H}^{\mathrm{I}}\),

$$\begin{aligned} c_\alpha^{\phantom{*}}(t) c_{\alpha'}^*(t) & = c_\alpha^{\phantom{*}}(0) c_{\alpha'}^*(0) \\ &\quad {} + {1 \over\hbar^2} \sum_{\beta\beta'} { \mathcal{H}}^{\mathrm{I}}_{\alpha\beta} {\mathcal {H}}^{\mathrm{I}}_{\beta' \alpha'} c_\beta^{\phantom{*}}(0) c_{\beta'}^*(0) {\mathrm{e}^{ \mathrm{i} \omega_{\alpha\beta} t} -1 \over\omega_{\alpha\beta}} {\mathrm{e}^{-\mathrm{i} \omega_{\alpha' \beta'} t} -1 \over\omega_{\alpha' \beta'}} \\ & \quad {}+ {1 \over\hbar^2} \biggl[ \sum_{\beta\gamma} { \mathcal{H}}^{\mathrm{I}}_{\alpha\beta} {\mathcal{H}}^{\mathrm{I}}_{\beta\gamma} c_\gamma^{\phantom{*}}(0) c_{\alpha'}^*(0) \biggl( {\mathrm{e}^{ \mathrm{i} \omega_{\alpha\gamma} t} -1 \over \omega_{\alpha\gamma} \omega_{\beta\gamma}} - {\mathrm{e}^{ \mathrm{i} \omega_{\alpha\beta} t} -1 \over \omega_{\alpha\beta} \omega_{\beta\gamma}} \biggr) \\ &\quad {} + \mathcal{C}. \mathcal{C.} \bigl( \alpha\leftrightarrow\alpha' \bigr) \biggr] , \end{aligned} $$

where the symbol \([\cdots+ \mathcal{C}.\mathcal{C.} ( \alpha \leftrightarrow \alpha' ) ]\) means that we need to add to the term in brackets its complex conjugate (with the exchange of the indices α and α′).

We now need to recall the approximation we have introduced, namely that the states between which coherences exist are iso-energetic. As regards the second line of the previous equation, this implies that ω α′β′ =ω αβ , so that the two temporal factors are one the complex conjugate of the other. Regarding the third line, we can consider the limit (ω αγ →0), and the temporal function between round brackets, which we indicate with \(\mathcal{F}(t)\), is equal to

$$\begin{aligned} \mathcal{F}(t) & = {\mathrm{e}^{ \mathrm{i} \omega_{\alpha\gamma} t} -1 \over \omega_{\alpha\gamma} \omega_{\beta\gamma}} - {\mathrm{e}^{ \mathrm{i} \omega_{\alpha\beta} t} -1 \over \omega_{\alpha\beta} \omega_{\beta\gamma}} = - {\mathrm{i} t \over\omega_{\alpha\beta}} + {\mathrm{e}^{ \mathrm{i} \omega_{\alpha\beta} t } -1 \over \omega_{\alpha\beta}^2} \\ & = - { 2 \sin^2 (\omega_{\alpha\beta} t /2 ) \over \omega^2_{\alpha\beta}} + \mathrm{i} {\sin(\omega_{\alpha\beta} t) - \omega_{\alpha\beta} t \over\omega_{\alpha\beta}^2} . \end{aligned} $$

We now proceed by evaluating the statistical average over the physical system. We introduce the notation of the density matrix by puttingFootnote 8 \(\rho_{\alpha\alpha'} = \langle c_{\alpha}^{\phantom{*}}(t) c_{\alpha'}^{*} (t) \rangle \). Renaming the summation index γ as α″, the kinetic equation for the coherences becomes

$$ \begin{aligned}[b] \rho_{\alpha\alpha'}(t) & = \rho_{\alpha\alpha'}(0) + {1 \over\hbar^2} \sum_{\beta\beta'} {\mathcal{H}}^{\mathrm{I}}_{\alpha\beta} {\mathcal {H}}^{\mathrm{I}}_{\beta' \alpha'} \rho_{\beta\beta'}(0) {4 \sin^2 (\omega_{\alpha\beta} t /2) \over\omega_{\alpha\beta}^2} \\ &\quad {}+ {1 \over\hbar^2} \biggl[ \sum_{\beta\alpha''} { \mathcal{H}}^{\mathrm{I}}_{\alpha\beta} {\mathcal{H}}^{\mathrm{I}}_{\beta\alpha''} \rho_{\alpha'' \alpha'}(0) \mathcal{F}(t) + \mathcal{C}.\mathcal{C.} \bigl( \alpha \leftrightarrow\alpha' \bigr) \biggr] . \end{aligned} $$
(16.24)

We consider the limit of this equation for t→∞. As we have seen on various occasions within the text (cf. Fig. 11.1)

$$\lim_{t \rightarrow\infty} {4 \sin^2 (\omega_{\alpha\beta} t /2) \over\omega_{\alpha\beta}^2} = 2 \pi t \delta ( \omega_{\alpha \beta}) = 2 \pi \hbar t \delta(E_\alpha- E_\beta) , $$

where we have used the definition of the Bohr frequencies in terms of the energies of the states of the physical system. Regarding the function \(\mathcal{F}(t)\), while its real part produces again a Dirac delta over the energy, the imaginary part behaves, in the limit t→∞, as shown in Fig. 16.2. It can be shown rigorously within distribution theory that we have

$$\lim_{t \rightarrow\infty} {\mathcal{F}}(t) = - \pi t \delta ( \omega_{\alpha\beta}) - \mathrm{i} t \mathrm{PP} {1 \over \omega_{\alpha\beta}} = - \pi \hbar t \biggl[ \delta(E_\alpha- E_\beta) + {\mathrm{i} \over\pi} \mathrm{PP} {1 \over E_\alpha- E_\beta} \biggr] , $$

where the symbol PP means the Cauchy principal value.
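The first of these limits can also be checked numerically (a sketch, assuming SciPy is available): for a fixed large t, the integral of 4sin²(ω αβ t/2)/ω αβ 2 over the whole real axis must approach 2πt, as required for a representation of 2πtδ(ω αβ ).

import numpy as np
from scipy.integrate import quad

t = 5.0
f = lambda w: 4.0*np.sin(0.5*w*t)**2/w**2

# the integrand is even: integrate over (0, W) and add the averaged tail 2/W
W = 100.0
val, _ = quad(f, 1e-9, W, limit=500)
print(2.0*(val + 2.0/W), 2.0*np.pi*t)   # both close to 31.42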

Fig. 16.2 The imaginary part of \(\mathcal{F}(t)\), plotted as a function of ω αβ for a fixed time t. As t increases, the behaviour of the function becomes more and more similar to that of the function −t/ω αβ (dotted line), except at the origin, where it is zero

We are now able to write the kinetic equation that generalises Eq. (11.9), valid for the diagonal elements of the density matrix, to the case of coherences. Starting with Eq. (16.24) and noting that all the terms on the right-hand side grow linearly with t, we can write, for t→∞

$$ {\mathrm{d} \over{\mathrm{d}}t} \rho_{\alpha\alpha'} = \sum _{\beta\beta'} T_{\alpha\alpha' \beta\beta'} \rho_{\beta\beta'} - \sum _{\alpha''} \bigl( R_{\alpha\alpha''} \rho_{\alpha'' \alpha'} + R^*_{\alpha' \alpha''} \rho_{\alpha\alpha''} \bigr) , $$
(16.25)

where T αα′ββ′ , the rate of transfer from the coherence ρ ββ′ to the coherence ρ αα′ , is given by

$$T_{\alpha\alpha' \beta\beta'} = {2 \pi\over\hbar} \mathcal{H}^{\mathrm{I}}_{\alpha\beta} {\mathcal{H}}^{\mathrm {I}}_{\beta' \alpha'} \delta(E_\alpha- E_\beta) , $$

and where R αα″ , the relaxation rate that relates the coherence ρ α″α′ to the coherence ρ αα′ , is given by

$$R_{\alpha\alpha''} = {\pi\over\hbar} \sum_\beta {\mathcal{H}}^{\mathrm{I}}_{\alpha\beta} {\mathcal{H}}^{\mathrm {I}}_{\beta\alpha''} \biggl[ \delta(E_\alpha- E_\beta) + {\mathrm{i} \over\pi} \mathrm{PP} {1 \over E_\alpha- E_\beta} \biggr] . $$

It is easy to show that Eq. (16.25) coincides with Eq. (11.9) in the case of the random phase approximation, i.e. when we consider only the diagonal elements of the density matrix. We have, in fact,

$$T_{\alpha\alpha\beta\beta} = {2 \pi\over\hbar} \bigl|{\mathcal{H}}^{\mathrm{I}}_{\alpha\beta}\bigr|^2 \delta(E_\alpha- E_\beta) = P_{\alpha\beta} , $$

where P αβ is the transition probability per unit time between the states |α〉 and |β〉 (or between the states |β〉 and |α〉) given by Eq. (11.10) (Fermi’s golden rule). Similarly,

$$R_{\alpha\alpha} + R^*_{\alpha\alpha} = {2 \pi\over\hbar} \sum _\beta\bigl|{\mathcal{H}}^{\mathrm{I}}_{\alpha\beta}\bigr|^2 \delta(E_\alpha- E_\beta) = \sum_\beta P_{\alpha\beta} . $$

Equation (16.25) can therefore be rightly considered the generalisation of Fermi’s golden rule to the case of coherences. It is important to stress the presence of the imaginary factor in the relaxation rates. This factor is responsible for some phenomena typical of the interaction between material systems and the radiation field such as, in particular, the anomalous dispersion phenomena that occur in the propagation of polarised radiation in an anisotropic medium (Faraday effect, Macaluso-Corbino effect, etc.).

16.12 Sums over the Magnetic Quantum Numbers

Here, we want to prove Eq. (11.17). That is, we want to show that, for any polarisation unit vector e, having defined the averages of the square moduli of the dipole matrix elements \(\mathcal{A}\) and \(\mathcal {A'}\) over the magnetic quantum numbers by the equations

$$\begin{aligned} \mathcal{A} =& {1 \over g_a g_b} \sum_{\alpha, \beta} | \mathbf{r}_{b\beta,a\alpha} \cdot \mathbf{e} |^2 , \\ \mathcal{A}' =& | \mathbf{r}_{ba} |^2 = {1 \over g_a g_b} \sum_{\alpha, \beta} | \mathbf{r}_{b\beta,a\alpha} |^2 = {1 \over g_a g_b} \sum _{\alpha, \beta} \langle u_{b\beta} | \mathbf{r} | u_{a \alpha} \rangle \cdot \langle u_{a \alpha} | \mathbf{r} | u_{b \beta} \rangle , \end{aligned}$$

we have

$$\mathcal{A} = {{1 \over3}} \mathcal{A}' . $$

The indices a and b in the previous equations denote any two energy levels of the atomic system while the indices α and β denote the respective magnetic sublevels, which are degenerate with respect to the energy. To demonstrate the equation, we need to introduce a more detailed notation which takes into account the fact that the atomic levels are normally characterised not only by a set of internal quantum numbers γ (which specify the configuration and the term), but also by the quantum number for the angular momentum J and the magnetic quantum number M. Applying the formal substitutions

$$\begin{aligned} & |u_{a \alpha} \rangle \rightarrow|\gamma_a J_a M_a \rangle ,\qquad |u_{b \beta} \rangle \rightarrow| \gamma_b J_b M_b \rangle , \\ & g_a \rightarrow2 J_a + 1 , \qquad g_b \rightarrow2 J_b + 1 , \end{aligned} $$

we obtain

$$\begin{aligned} \mathcal{A} & = \sum_{M_a M_b} {| \langle \gamma_b J_b M_b | \mathbf{r} \cdot \mathbf{e} | \gamma_a J_a M_a \rangle |^2 \over(2 J_a +1)(2 J_b+1)} , \\ \mathcal{A}' & = \sum_{M_a M_b} { \langle \gamma_b J_b M_b | \mathbf{r} | \gamma_a J_a M_a \rangle \cdot \langle \gamma_a J_a M_a | \mathbf{r} | \gamma_b J_b M_b \rangle \over(2 J_a +1)(2 J_b+1)} . \end{aligned} $$

To calculate \(\mathcal{A}\) we apply the Wigner-Eckart theorem, noting that the scalar product re can be expressed in terms of the spherical components of the two vectors. We have in fact (cf. Eq. (9.5))

$$\mathbf{r} \cdot \mathbf{e} = \sum_q (-1)^q r_q e_{-q} . $$

Using Eq. (9.4) we have

$$\begin{aligned} & \langle \gamma_b J_b M_b | r_q | \gamma_a J_a M_a \rangle \\ & \quad = (-1)^{J_a+M_b+1} \sqrt{2J_b+1} \begin{pmatrix}J_b & J_a & 1 \cr-M_b & M_a & q \end{pmatrix} \langle \gamma_b J_b \| \mathbf{r} \| \gamma_a J_a \rangle , \end{aligned} $$

and we obtain

$$\begin{aligned} \mathcal{A} & = \sum_{q q'} (-1)^{q+q'} e_{-q} (e_{-q'})^* \\ & \quad {}\times\sum_{M_a M_b} \begin{pmatrix} J_b & J_a & 1 \cr-M_b & M_a & q \end{pmatrix} \begin{pmatrix} J_b & J_a & 1 \cr-M_b & M_a & q' \end{pmatrix} {| \langle \gamma_b J_b \| \mathbf{r} \| \gamma_a J_a \rangle |^2 \over2 J_a + 1 } . \end{aligned} $$

The sum over M a and M b of the product of the two 3-j symbols can be calculated using the property of the 3-j symbols of Eq. (7.18). We have

$$\sum_{M_a M_b} \begin{pmatrix} J_b & J_a & 1 \cr-M_b & M_a & q \end{pmatrix} \begin{pmatrix} J_b & J_a & 1 \cr-M_b & M_a & q' \end{pmatrix} = {{1 \over3}} \delta_{qq'} , $$

and we obtain, since \(\sum_q e_{-q} (e_{-q})^* = 1\),

$$\mathcal{A} = {{1 \over3}} { | \langle \gamma_b J_b \| \mathbf{r} \| \gamma_a J_a \rangle |^2 \over2 J_a + 1} . $$
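The 3-j sum rule invoked above is easily checked for specific values of the angular momenta (a sketch, assuming SymPy is available):

from sympy.physics.wigner import wigner_3j

Ja, Jb = 2, 1
for q in (-1, 0, 1):
    for qp in (-1, 0, 1):
        s = sum(wigner_3j(Jb, Ja, 1, -Mb, Ma, q)*wigner_3j(Jb, Ja, 1, -Mb, Ma, qp)
                for Ma in range(-Ja, Ja + 1) for Mb in range(-Jb, Jb + 1))
        print(q, qp, s)    # 1/3 when q = q', 0 otherwise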

To calculate the quantity \(\mathcal{A}'\) we proceed in a similar way, first noting that, the operator r being Hermitian, 〈γ a J a M a |r|γ b J b M b 〉=〈γ b J b M b |r|γ a J a M a 〉∗. We obtain

$$\mathcal{A}' = \sum_q \sum _{M_a M_b} \begin{pmatrix} J_b & J_a & 1 \cr-M_b & M_a & q \end{pmatrix} \begin{pmatrix} J_b & J_a & 1 \cr-M_b & M_a & q \end{pmatrix} {| \langle \gamma_b J_b \| \mathbf{r} \| \gamma_a J_a \rangle |^2 \over2 J_a + 1 } , $$

and using the same property of the 3-j symbols and summing over q we arrive at the result that we wanted to prove, i.e.

$$\mathcal{A}' = { | \langle \gamma_b J_b \| \mathbf{r} \| \gamma_a J_a \rangle |^2 \over2 J_a + 1} = 3 \mathcal{A} . $$

The above results can be used to express the quantity |r ba |2 that we introduced within the text in terms of the reduced matrix elements of the spherical tensor r. Since \(|\mathbf{r}_{ba}|^{2} = \mathcal {A'}\), we have

$$|\mathbf{r}_{ba}|^2 = { | \langle \gamma_b J_b \| \mathbf{r} \| \gamma_a J_a \rangle |^2 \over2 J_a + 1} . $$

On the other hand, since |r ba |2=|r ab |2, we obtain by symmetry

$$|\mathbf{r}_{ba}|^2 = |\mathbf{r}_{ab}|^2 = { | \langle \gamma_b J_b \| \mathbf{r} \| \gamma_a J_a \rangle |^2 \over2 J_a + 1} = { | \langle \gamma_a J_a \| \mathbf{r} \| \gamma_b J_b \rangle |^2 \over2 J_b + 1} , $$

an equation that relates the reduced matrix elements under the exchange of the bra with the ket.

In spectroscopy, the concept of line (or transition) strength is commonly used. This quantity is invariant with respect to the exchange of the lower and upper level, and is defined by

$$\mathcal{S} = g_b \bigl| \langle \gamma_b J_b \| \mathbf{d} \| \gamma_a J_a \rangle \bigr|^2 = g_a \bigl| \langle \gamma_a J_a \| \mathbf{d} \| \gamma_b J_b \rangle \bigr|^2 , $$

where d=−e 0 r is the electric dipole operator. The quantities introduced in the text are therefore related to the line strength via the relation

$$|\mathbf{r}_{ba}|^2 = |\mathbf{r}_{ab}|^2 = {1 \over e_0^2} {\mathcal{S} \over g_a g_b } . $$

These relations can then be used to express the Einstein coefficients in terms of the line strength instead of in terms of the dipole matrix elements. For example, recalling Eq. (11.20), the Einstein coefficient A ab can be written in the form

$$g_a A_{ab} = {64 \pi^4 \nu_{ab}^3 \over3 h c^3} \mathcal{S} . $$

An alternative quantity that is also used to characterise the strength of a line (or a transition) is the so-called oscillator strength. This quantity is introduced in the following way. The absorption coefficient of a plasma of “classical” atoms, described by the Lorentz atomic model and integrated in frequency, is given by

$$\bigl[k_{\mathrm{R}}^{(\mathrm{a})} \bigr]_{\mathrm{class}} = \mathcal{N} {\pi e_0^2 \over m c} , $$

where \(\mathcal{N}\) is the number density of atoms. Comparing this expression with the one for \(k_{\mathrm{R}}^{(\mathrm{a})}\) obtained in Sect. 11.9 (Eq. (11.33)), we see that the two quantities coincide if we identify \(\mathcal{N}\) with \(\mathcal{N}_{b}\) and multiply the classical expression by the dimensionless quantity f ba , known as the oscillator strength of the transition, given by

$$f_{ba} = {8 \pi^2 m \nu_{ab} \over3 h} g_a |\mathbf{r}_{ba} |^2 . $$

The oscillator strength may be considered as a parameter measuring the efficiency of the transition, since it represents a sort of “equivalent number” of classical oscillators. Typically, it is a relatively small number that can reach values of the order of unity only for the strongest spectral lines. The relations between oscillator strength, line strength, and Einstein coefficients are easily obtained using the previous relations. For example, we have

$$g_b f_{ba} = {8 \pi^2 m \nu_{ab} \over3 h e_0^2} \mathcal{S} , \qquad g_a A_{ab} = {8 \pi^2 e_0^2 \nu_{ab}^2 \over m c^3} g_b f_{ba} . $$
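As a worked example of these relations (a sketch using the well-known hydrogen Lyman-α data, which are external to the text), the Einstein coefficient follows at once from the oscillator strength:

import numpy as np

# cgs constants from Sect. 16.16
e0, m, c = 4.80320e-10, 9.10938e-28, 2.99792e10

# hydrogen Lyman-alpha: b = 1s (lower, g_b = 2), a = 2p (upper, g_a = 6)
nu = c/1215.67e-8          # Hz
f_ba = 0.4162              # absorption oscillator strength (external datum)
g_b, g_a = 2, 6

A_ab = 8*np.pi**2*e0**2*nu**2/(m*c**3)*(g_b/g_a)*f_ba
print(A_ab)                # ~6.3e8 s^-1, the known Einstein A of Lyman-alpha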

16.13 Calculation of a Matrix Element

We wish to calculate the probability per unit time that the following elementary process occurs: a non-relativistic free electron of momentum q undergoes a transition to a free state of momentum q′ due to absorption of a photon with wave vector k. According to Fermi’s golden rule, repeating the arguments presented in Sect. 11.4 but without introducing the dipole approximation,Footnote 9 such probability is proportional to the squared modulus of the matrix element \(\mathcal{M}\) given by

$$\mathcal{M} = \langle u_{ \mathrm{f}} | \mathrm{e}^{ \mathrm{i} \mathbf{k} \cdot \mathbf{r}} \mathbf{p} \cdot \mathbf{e} | u_{ \mathrm{i}} \rangle , $$

where |u i〉 and |u f〉 are the eigenvectors of the atomic system (in our case of the free electron) in the initial and final state, respectively, e is the polarisation unit vector of the absorbed photon, and p is the momentum operator of the electron. Within the representation of the wavefunctions, where the operator p is given by \(-\mathrm{i} \hbar \operatorname{grad}\), the matrix element \(\mathcal{M}\) is

$$\mathcal{M} = - \mathrm{i} \hbar \mathbf{e} \cdot\int\psi_{ \mathrm{f}}^* (\mathbf{r} ) \mathrm{e}^{ \mathrm{i} \mathbf{k} \cdot \mathbf{r}} \operatorname{grad} \bigl[\psi_{ \mathrm{i}}(\mathbf{r} )\bigr] \,\mathrm{d}^3 \mathbf{r} . $$

On the other hand, the eigenfunctions ψ f and ψ i are plane waves, i.e.

$$\psi_{ \mathrm{f}}(\mathbf{r} ) = {1 \over\sqrt{\mathcal{V}}} \mathrm{e}^{ \mathrm{i} \mathbf{q}^{ \prime} \cdot \mathbf{r} / \hbar} , \qquad \psi_{ \mathrm{i}}(\mathbf{r} ) = {1 \over\sqrt{\mathcal{V}}} \mathrm{e}^{ \mathrm{i} \mathbf{q} \cdot \mathbf{r} / \hbar} , $$

where \(\mathcal{V}\) is the normalisation volume. Substituting in the integral we have

$$\mathcal{M} = {\mathbf{e} \cdot \mathbf{q} \over{\mathcal{V}}} \int{\mathrm{e}}^{- \mathrm{i} (\mathbf{q}^{ \prime} - \hbar \mathbf{k} - \mathbf{q} ) \cdot \mathbf{r} / \hbar} \, \mathrm{d}^3 \mathbf{r} . $$

The integral is null unless the argument of the exponential is zero. This leads to the equality q′=ħ k+q, which represents the conservation of momentum. In such a case, the integral is simply equal to \(\mathcal{V}\), so we obtain

$$\mathcal{M} = \mathbf{e} \cdot \mathbf{q} = \mathbf{e} \cdot\bigl(\mathbf{q}^{ \prime} - \hbar \mathbf{k} \bigr) . $$
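The mechanism by which the integral selects the momentum-conserving final state can be made concrete with a one-dimensional periodic-box sketch (ħ=1 and unit polarisation; the wavenumbers are arbitrary example values):

import numpy as np

# <q'| e^{ikx} p |q> on a periodic box: equals q when q' = q + k, zero otherwise
L, N = 2*np.pi, 256
x = np.arange(N)*(L/N)
n, nk = 3, 5                          # integer wavenumbers of electron and photon
psi_i = np.exp(1j*n*x)/np.sqrt(L)
dpsi_i = 1j*n*psi_i                   # gradient of the initial plane wave
for nf in (n + nk, n + nk + 1):       # momentum-conserving / violating final states
    psi_f = np.exp(1j*nf*x)/np.sqrt(L)
    M = np.sum(np.conj(psi_f)*np.exp(1j*nk*x)*(-1j)*dpsi_i)*(L/N)
    print(nf, np.round(M, 12))        # M = n for nf = n + nk, M = 0 otherwise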

16.14 Gauge Invariance in Quantum Electrodynamics

Consider the quantity \(\mathcal{R}_{\mathrm{f}\,\mathrm{i}}\) defined in Eq. (15.22) of the text that we rewrite here in the form

$$\mathcal{R}_{\mathrm{f}\,\mathrm{i}} = P + Q , $$

where

$$\begin{aligned} P =& W^\dagger_s \bigl(\mathbf{p} '\bigr) \bigl(\boldsymbol{\alpha}\cdot \mathbf{e}^{ \prime *}\bigr) {H_{\mathbf{g}} + \epsilon+ \hbar\omega\over(\epsilon+ \hbar\omega )^2 - \epsilon_{\mathbf{g}}^2 } (\boldsymbol{\alpha}\cdot \mathbf{e} ) W_r^{\phantom{\dagger}} (\mathbf{p} ) , \\ Q =& W^\dagger_s \bigl(\mathbf{p} '\bigr) (\boldsymbol{\alpha}\cdot \mathbf{e} ) {H_{\mathbf{h}} + \epsilon- \hbar\omega' \over(\epsilon- \hbar\omega ')^2 - \epsilon_{\mathbf{h}}^2 } \bigl(\boldsymbol{\alpha}\cdot \mathbf{e}^{ \prime *}\bigr) W_r^{\phantom{\dagger}} (\mathbf{p} ) . \end{aligned}$$

We want to demonstrate that \(\mathcal{R}_{\mathrm{f}\,\mathrm{i}}\) is invariant with respect to the transformation

$$ \boldsymbol{\alpha}\cdot \mathbf{e} \rightarrow \boldsymbol{\alpha}\cdot \mathbf{e} + C (\boldsymbol{\alpha}\cdot \mathbf{u} - 1) , $$
(16.26)

where C is an arbitrary constant and where u is the unit vector of the direction of the initial photon (u=k/k). Performing this transformation, the quantities P and Q are transformed according to the equations

$$P \rightarrow P + C P' , \qquad Q \rightarrow Q + C Q' , $$

where

$$\begin{aligned} P' =& W^\dagger_s \bigl(\mathbf{p} '\bigr) \bigl(\boldsymbol{\alpha}\cdot \mathbf{e}^{ \prime *} \bigr) {H_{\mathbf{g}} + \epsilon+ \hbar\omega\over(\epsilon+ \hbar \omega)^2 - \epsilon_{\mathbf{g}}^2 } \bigl[(\boldsymbol{\alpha}\cdot \mathbf{u} ) -1 \bigr] W_r^{\phantom{\dagger}} (\mathbf{p} ) , \\ Q' =& W^\dagger_s \bigl(\mathbf{p} '\bigr) \bigl[ (\boldsymbol{\alpha}\cdot \mathbf{u} ) -1 \bigr] {H_{\mathbf{h}} + \epsilon- \hbar\omega' \over(\epsilon- \hbar\omega')^2 - \epsilon_{\mathbf{h}}^2 } \bigl(\boldsymbol{\alpha}\cdot \mathbf{e}^{ \prime *}\bigr) W_r^{\phantom{\dagger}} (\mathbf{p} ) . \end{aligned}$$

We multiply the two quantities P′ and Q′ by the product cħk and note that

$$c \hbar k \bigl[(\boldsymbol{\alpha}\cdot \mathbf{u} ) -1 \bigr] = c \hbar (\boldsymbol{\alpha}\cdot \mathbf{k} ) - \hbar \omega . $$

Recalling the kinematic relations of the Compton effect and noting that the quantities g and h, contained respectively in P′ and Q′, are given by (see Eq. (15.15))

$$\mathbf{g} = \mathbf{p} + \hbar \mathbf{k} , \qquad \mathbf{h} = \mathbf{p} - \hbar \mathbf{k}^{ \prime} = \mathbf{p}^{ \prime} - \hbar \mathbf{k} , $$

we can perform the following substitution in the expression of P′

$$\hbar \mathbf{k} = \mathbf{g} - \mathbf{p} , $$

and in the expression of Q′

$$\hbar \mathbf{k} = \mathbf{p}^{ \prime} - \mathbf{h} . $$

Substituting, and recalling also that ϵ−ħω′=ϵ′−ħω, we obtain

$$\begin{aligned} c \hbar k P' =& W^\dagger_s \bigl(\mathbf{p} '\bigr) \bigl(\boldsymbol{\alpha}\cdot \mathbf{e}^{ \prime *} \bigr) {H_{\mathbf{g}} + \epsilon+ \hbar\omega\over(\epsilon+ \hbar\omega)^2 - \epsilon_{\mathbf{g}}^2 } [c \boldsymbol{\alpha}\cdot \mathbf{g} -c \boldsymbol{\alpha}\cdot \mathbf{p} - \hbar \omega ] W_r^{\phantom{\dagger}} (\mathbf{p} ) , \\ c \hbar k Q' =& W^\dagger_s \bigl(\mathbf{p} '\bigr) \bigl[ c \boldsymbol{\alpha}\cdot \mathbf{p}^{ \prime} - c \boldsymbol{\alpha}\cdot \mathbf{h} - \hbar\omega \bigr] {H_{\mathbf{h}} + \epsilon' - \hbar\omega \over(\epsilon' - \hbar\omega)^2 - \epsilon_{\mathbf{h}}^2 } \bigl(\boldsymbol{\alpha }\cdot \mathbf{e}^{ \prime *}\bigr) W_r^{\phantom{\dagger}} (\mathbf{p} ) . \end{aligned}$$

Within the square brackets, we add and subtract the factor βmc 2 and recall that an expression of the type (c αq+βmc 2), with q arbitrary, is the Dirac Hamiltonian H q . We obtain

$$\begin{aligned} c \hbar k P' =& W^\dagger_s \bigl(\mathbf{p} '\bigr) \bigl(\boldsymbol{\alpha}\cdot \mathbf{e}^{ \prime *} \bigr) {H_{\mathbf{g}} + \epsilon+ \hbar \omega\over(\epsilon+ \hbar \omega)^2 - \epsilon_{\mathbf{g}}^2 } [H_{\mathbf{g}} -H_{\mathbf{p}} - \hbar \omega ] W_r^{\phantom{\dagger}} (\mathbf{p} ) , \\ c \hbar k Q' =& W^\dagger_s \bigl(\mathbf{p} '\bigr) [ H_{ \mathbf{p}^{ \prime}} - H_{ \mathbf{h}} - \hbar \omega ] {H_{\mathbf{h}} + \epsilon' - \hbar \omega\over(\epsilon' - \hbar \omega)^2 - \epsilon_{\mathbf{h}}^2 } \bigl(\boldsymbol{\alpha}\cdot \mathbf{e}^{ \prime *}\bigr) W_r^{\phantom{\dagger}} (\mathbf{p} ) . \end{aligned}$$

Now, noting that

$$H_{\mathbf{p}} W_r^{\phantom{\dagger}} (\mathbf{p} ) = \epsilon W_r^{\phantom{\dagger}} ( \mathbf{p} ) , \qquad W^\dagger_s \bigl( \mathbf{p}^{ \prime} \bigr) H_{\mathbf{p}^{ \prime}} = \epsilon' W^\dagger_s\bigl(\mathbf{p}^{ \prime}\bigr) , $$

we have

$$\begin{aligned} c \hbar k P' =& W^\dagger_s \bigl(\mathbf{p} '\bigr) \bigl(\boldsymbol{\alpha}\cdot \mathbf{e}^{ \prime *} \bigr) {H_{\mathbf{g}} + \epsilon+ \hbar\omega\over(\epsilon+ \hbar\omega)^2 - \epsilon_{\mathbf{g}}^2 } [H_{\mathbf{g}} -\epsilon- \hbar \omega ] W_r^{\phantom{\dagger}} (\mathbf{p} ) , \\ c \hbar k Q' =& W^\dagger_s \bigl(\mathbf{p} '\bigr) \bigl[ \epsilon' - H_{ \mathbf{h}} - \hbar \omega \bigr] {H_{\mathbf{h}} + \epsilon' - \hbar \omega\over(\epsilon' - \hbar \omega)^2 - \epsilon_{\mathbf{h}}^2 } \bigl(\boldsymbol{\alpha}\cdot \mathbf{e}^{ \prime *}\bigr) W_r^{\phantom{\dagger}} (\mathbf{p} ) . \end{aligned}$$

Finally, taking into account that

$$\begin{aligned}[c] [ H_{\mathbf{g}} + \epsilon+ \hbar\omega ] [ H_{\mathbf{g}} - \epsilon- \hbar\omega ] & = \epsilon_{\mathbf{g}}^2 - (\epsilon+ \hbar \omega)^2 , \\ \bigl[ \epsilon' - H_{\mathbf{h}} - \hbar\omega \bigr] \bigl[ H_{\mathbf{h}} + \epsilon' - \hbar\omega \bigr] & = \bigl( \epsilon' - \hbar \omega\bigr)^2 - \epsilon_{\mathbf{h}}^2, \end{aligned} $$

we obtain

$$c \hbar k P' = - W^\dagger_s \bigl(\mathbf{p} '\bigr) \bigl(\boldsymbol{\alpha}\cdot \mathbf{e}^{ \prime *}\bigr) W_r^{\phantom{\dagger}} (\mathbf{p} ) , \qquad c \hbar k Q' = W^\dagger_s \bigl(\mathbf{p} '\bigr) \bigl(\boldsymbol{\alpha}\cdot \mathbf{e}^{ \prime *}\bigr) W_r^{\phantom{\dagger}} (\mathbf{p} ) , $$

from which it follows that (P′+Q′)=0. This shows that the quantity \(\mathcal{R}_{\mathrm{f}\,\mathrm{i}}\) is invariant with respect to the transformation (16.26). In a completely similar way, it can be shown that \(\mathcal{R}_{\mathrm{f}\,\mathrm{i}}\) is also invariant with respect to the transformation

$$\boldsymbol{\alpha}\cdot \mathbf{e}^{ \prime *} \rightarrow \boldsymbol{\alpha}\cdot \mathbf{e}^{ \prime *} + C' \bigl(\boldsymbol{\alpha}\cdot \mathbf{u}^{ \prime} - 1\bigr) , $$

where C′ is an arbitrary constant and where u′ is the unit vector of the direction of the final photon (u′=k′/k′).

An alternative way to express these invariance properties is to formally consider the quantity \(\mathcal{R}_{\mathrm{f}\,\mathrm{i}}\) as a function of the matrix (αe), or, alternatively, of the matrix (αe ′∗). From the above proof it follows that

$$ \begin{aligned}[c] \mathcal{R}_{\mathrm{f}\,\mathrm{i}} \{ \boldsymbol{\alpha}\cdot \mathbf{e} \rightarrow \boldsymbol{\alpha}\cdot \mathbf{u} \} &= \mathcal{R}_{\mathrm{f}\,\mathrm{i}} \{ \boldsymbol{\alpha }\cdot \mathbf{e} \rightarrow1 \}, \\ \mathcal{R}_{\mathrm{f}\,\mathrm{i}} \bigl\{ \boldsymbol{\alpha}\cdot \mathbf{e}^{ \prime *} \rightarrow \boldsymbol{\alpha}\cdot \mathbf{u}^{ \prime} \bigr\} &= \mathcal{R}_{\mathrm{f}\,\mathrm{i}} \bigl\{ \boldsymbol{\alpha }\cdot \mathbf{e} ^{ \prime *}\rightarrow1 \bigr\} . \end{aligned} $$
(16.27)

16.15 The Gamma Matrices and the Relativistic Invariants

The relation between energy ϵ p and momentum p of a relativistic particle of mass m is

$$\epsilon_{\mathbf{p}}^2 = c^2 p^2 + m^2 c^4 . $$

In particular, for a photon (m=0) we have

$$\epsilon_{\mathbf{p}} = c p , \quad\mathrm{with}\ p = | \mathbf{p} | , $$

or, in terms of frequency and wavenumber

$$\hbar \omega= c \hbar k , \quad\mathrm{with} \ k = | \mathbf{k} | . $$

Such relations may be formally simplified if we adopt a system of units in which ħ=c=1. The introduction of this convention is equivalent to defining the unit time interval as the time needed by light to travel the unit of length. With this definition, the energy, the momentum, and the mass (and similarly, for a photon, the angular frequency and the wavenumber) all assume the dimensions of the reciprocal of a length (or of a time). The relation between momentum and energy is written, in this system of units, in the form

$$\epsilon_{\mathbf{p}}^2 = p^2 + m^2 , \quad\mathrm{or} \quad \epsilon_{\mathbf{p}}^2 - p^2 = m^2 , $$

and for a photon

$$\epsilon_{\mathbf{p}} = p , \quad\mathrm{or} \quad\omega= k . $$

We now denote by the symbol \(\mathcal{P}_{\mu}\) (μ=0,1,2,3) the energy-momentum quadrivector of the particle. It is an entity with four components that are defined in this way

$$\mathcal{P}_0 = \epsilon_{\mathbf{p}} , \qquad \mathcal{P}_1 = p_1 = p_x , \qquad \mathcal{P}_2 = p_2 = p_y , \qquad \mathcal{P}_3 = p_3 = p_z , $$

or, in a more compact form

$$\mathcal{P} = ( \epsilon_{\mathbf{p}} , \mathbf{p} ) . $$

Defining the metric tensor g μν as

$$g_{00}=1 , \qquad g_{0i} = g_{i0} = 0 \quad (i=1,2,3) , \qquad g_{jk}=-\delta_{jk} \quad (j,k=1,2,3) , $$

the scalar product of two quadrivectors \(\mathcal{P}\) and \(\mathcal{Q}\) is

$$(\mathcal{P} {\mathcal{Q}}) = \sum_{\mu\nu} g_{\mu\nu} {\mathcal {P}}_\mu{\mathcal{Q}}_\nu= \epsilon_{\mathbf{p}} \epsilon_{\mathbf{q}} - \mathbf{p} \cdot \mathbf{q} . $$

In particular we have

$$\mathcal{P}^2 = (\mathcal{P} {\mathcal{P}}) = \sum _{\mu\nu} g_{\mu\nu} {\mathcal{P}}_\mu { \mathcal{P}}_\nu= \epsilon_{\mathbf{p}}^2 - p^2 = m^2 . $$

These quantities (the scalar product of two quadrivectors defined by the above metric tensor and, in particular, the square of a quadrivector) are relativistic invariants, i.e. do not change under Lorentz transformations. We are now going to show how the probability amplitudes of Compton scattering can be expressed in terms of these invariants.
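In components these definitions are immediate to implement; the following small sketch (assuming NumPy, with arbitrary example numbers) also checks that the square of a quadrivector is left unchanged by a Lorentz boost:

import numpy as np

g = np.diag([1.0, -1.0, -1.0, -1.0])       # metric tensor g_{mu nu}

def dot4(P, Q):
    # scalar product (PQ) = sum_{mu nu} g_{mu nu} P_mu Q_nu
    return P @ g @ Q

m = 0.8
p = np.array([0.3, -0.1, 0.5])
P = np.array([np.sqrt(m**2 + p @ p), *p])  # on-shell quadrivector (hbar = c = 1)

v = 0.6                                    # boost along x
gamma = 1.0/np.sqrt(1.0 - v**2)
B = np.array([[gamma, -gamma*v, 0.0, 0.0],
              [-gamma*v, gamma, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0]])
print(dot4(P, P), dot4(B @ P, B @ P))      # both equal m**2 = 0.64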

Consider the quantity \(\mathcal{R}_{\mathrm{f}\,\mathrm{i}}\) defined in Eq. (15.22). This quantity is composed of two terms that we denote by P and Q. For the first one we have, taking into account the system of units we have introduced (c=ħ=1)

$$P = W^\dagger_s\bigl(\mathbf{p}^{ \prime}\bigr) \bigl(\boldsymbol{\alpha}\cdot \mathbf{e}^{ \prime *}\bigr) {H_{\mathbf{g}} + \epsilon+ \omega\over(\epsilon+ \omega)^2 - \epsilon_{\mathbf{g}}^2 } (\boldsymbol{\alpha}\cdot \mathbf{e} ) W_r^{\phantom{\dagger}} (\mathbf{p} ) , $$

where

$$H_{\mathbf{g}} = \boldsymbol{\alpha}\cdot \mathbf{g} + \beta m , $$

with

$$\mathbf{g} = \mathbf{p} + \mathbf{k} , \qquad\epsilon_{\mathbf{g}} = \sqrt{ g^2 + m^2} . $$

Recalling that the square of the Dirac matrix β is unity, we can write

$$P = W^\dagger_s\bigl(\mathbf{p}^{ \prime}\bigr) \bigl(\boldsymbol{\alpha}\cdot \mathbf{e}^{ \prime *}\bigr) \beta^2 {H_{\mathbf{g}} + \epsilon+ \omega\over(\epsilon+ \omega)^2 - \epsilon_{\mathbf{g}}^2 } \beta^2 (\boldsymbol{\alpha}\cdot \mathbf{e} ) W_r^{\phantom{\dagger}} ( \mathbf{p} ) . $$

If we now also recall that the Dirac matrix β anticommutes with any of the α matrices, we obtain

$$P = W^\dagger_s\bigl(\mathbf{p}^{ \prime}\bigr) \beta \bigl(\boldsymbol{\alpha}\cdot \mathbf{e}^{ \prime *}\bigr) {\beta(H_{\mathbf{g}} + \epsilon+ \omega) \over(\epsilon+ \omega)^2 - \epsilon_{\mathbf{g}}^2 } \beta( \boldsymbol{\alpha}\cdot \mathbf{e} ) \beta W_r^{\phantom{\dagger}} (\mathbf{p} ) . $$

We now define the matrices γ μ (μ=0,1,2,3)

$$\gamma_0 = \beta , \qquad\gamma_1 = \beta \alpha_1 , \qquad \gamma_2 = \beta \alpha_2 , \qquad\gamma_3 = \beta \alpha_3 . $$

The fundamental property of these matrices concerns their anticommutator, which is (as easily derived from the properties of the α and β matrices)

$$ \{ \gamma_\mu, \gamma_\nu\} = \gamma_\mu \gamma_\nu+ \gamma_\nu \gamma_\mu= 2 g_{\mu\nu} , $$
(16.28)

where g μν is the metric tensor that we have previously defined. Moreover, given an arbitrary quadrivector \(\mathcal{V}\), we define by the symbol \(\slashed{\mathcal{V}}\) the matrix

$$\slashed{\mathcal{V}} = \sum_{\mu\nu} g_{\mu\nu} \gamma_\mu {\mathcal{V}}_\nu = \gamma_0 {\mathcal{V}}_0 - \sum_{i=1,2,3} \gamma_i {\mathcal{V}}_i . $$

With these definitions, the quantity P can be written in the form

$$ P = W^\dagger_s\bigl(\mathbf{p}^{ \prime}\bigr) \gamma_0 \slashed{\mathcal{E}}^{ \prime *} {\slashed{\mathcal{G}} + m \over {\mathcal{G}}^2 - m^2 } \slashed{\mathcal{E}} W_r^{\phantom{\dagger}} (\mathbf{p} ) , $$
(16.29)

where the quadrivectors \(\mathcal{G}\), \(\mathcal{E}\) and \(\mathcal {E}^{\prime}\) are given by

$$\mathcal{G} = (\omega+ \epsilon, \mathbf{g} ) , \qquad \mathcal{E} = (0, \mathbf{e} ) , \qquad \mathcal{E}^{\prime} = \bigl(0, \mathbf{e}^{ \prime}\bigr) . $$

We note that the quadrivector \(\mathcal{G}\) can also be written in the form

$$\mathcal{G} = \mathcal{P} + \mathcal{K} , $$

where

$$\mathcal{K} = (\omega, \mathbf{k}) . $$

If we now consider the quantity P , complex conjugate of P, we need to proceed carefully because the γ matrices (except γ 0) are not Hermitian. We have in fact

$$\begin{aligned} \gamma_0^\dagger &= \beta^\dagger= \beta= \gamma_0 , \\ \gamma_i^\dagger &= (\beta \alpha_i^{\phantom{\dagger}})^\dagger = \alpha_i^\dagger\beta^\dagger= \alpha_i^{\phantom{\dagger}} \beta= - \beta \alpha_i^{\phantom{\dagger}} = - \gamma_i \quad(i=1,2,3) . \end{aligned} $$

These properties can be summarised in a single relation

$$\gamma_\mu^\dagger= \gamma_0 \gamma_\mu\gamma_0 , $$

which implies, for an arbitrary quadrivector,

$$\slashed{\mathcal{V}}^\dagger = \gamma_0 \slashed{\mathcal{V}}^{ *} \gamma_0 , $$

where \(\slashed{\mathcal{V}}^{ *}\) is the matrix associated with the complex-conjugate quadrivector. We therefore obtain

$$ P^* = W^\dagger_r(\mathbf{p}) \gamma_0 \slashed{\mathcal{E}}^{ *} {\slashed{\mathcal{G}} + m \over {\mathcal{G}}^2 - m^2 } \slashed{\mathcal{E}}^{ \prime} W_s^{\phantom{\dagger}} \bigl(\mathbf{p}^{ \prime}\bigr) . $$
(16.30)

Similar considerations can be repeated for the other term Q of Eq. (15.22) defined by

$$Q = W^\dagger_s\bigl(\mathbf{p}^{ \prime}\bigr) (\boldsymbol{\alpha}\cdot \mathbf{e} ) {H_{\mathbf{h}} + \epsilon- \omega' \over(\epsilon- \omega')^2 - \epsilon_{\mathbf{h}}^2 } \bigl(\boldsymbol{\alpha}\cdot \mathbf{e}^{ \prime *} \bigr) W_r^{\phantom{\dagger}} (\mathbf{p} ) , $$

where

$$H_{\mathbf{h}} = \boldsymbol{\alpha}\cdot \mathbf{h} + \beta m , $$

with

$$\mathbf{h} = \mathbf{p} - \mathbf{k}^{ \prime} , \qquad\epsilon_{\mathbf{h}} = \sqrt{ h^2 + m^2} . $$

We have

$$ Q = W^\dagger_s\bigl(\mathbf{p}^{ \prime}\bigr) \gamma_0 \slashed{\mathcal{E}} {\slashed{\mathcal{H}} + m \over {\mathcal{H}}^2 - m^2 } \slashed{\mathcal{E}}^{ \prime *} W_r^{\phantom{\dagger}} (\mathbf{p} ) , $$
(16.31)

where the quadrivector \(\mathcal{H}\) is defined by

$$\mathcal{H} = \bigl(\epsilon- \omega', \mathbf{h} \bigr) = \mathcal{P} - \mathcal {K'} , $$

being

$$\mathcal{K}' = \bigl(\omega', \mathbf{k}^{ \prime} \bigr) . $$

Still in analogy to what was discussed before, we also have

$$ Q^* = W^\dagger_r(\mathbf{p}) \gamma_0 \slashed{\mathcal{E}}^{ \prime} {\slashed{\mathcal{H}} + m \over {\mathcal{H}}^2 - m^2 } \slashed{\mathcal{E}}^{ *} W_s^{\phantom{\dagger}} \bigl(\mathbf{p}^{ \prime}\bigr) . $$
(16.32)

We can now evaluate the square of the modulus of the quantity \(\mathcal {R}_{\mathrm{f}\,\mathrm{i}}\). It is

$$|{\mathcal{R}}_{\mathrm{f}\,\mathrm{i}} |^2 = (P+Q) \bigl(P^* + Q^*\bigr) = PP^*+PQ^*+QP^*+ QQ^* , $$

where, using Eqs. (16.29)–(16.32), the first of the four terms, for instance, is given by

$$PP^* = W^\dagger_s\bigl(\mathbf{p}^{ \prime}\bigr) \gamma_0 \slashed{\mathcal{E}}^{ \prime *} {\slashed{\mathcal{G}} + m \over {\mathcal{G}}^2 - m^2 } \slashed{\mathcal{E}} W_r^{\phantom{\dagger}} (\mathbf{p} ) W^\dagger_r(\mathbf{p}) \gamma_0 \slashed{\mathcal{E}}^{ *} {\slashed{\mathcal{G}} + m \over {\mathcal{G}}^2 - m^2 } \slashed{\mathcal{E}}^{ \prime} W_s^{\phantom{\dagger}} \bigl(\mathbf{p}^{ \prime}\bigr) , $$

with analogous expressions for the remaining three terms.
These expressions can be simplified when one considers the average over the initial spin states of the electron and the sum over the final spin states of the electron. Taking into account the results we have obtained in Sect. 15.5 (Eqs. (15.18) and (15.20)) we have

$$\sum_{r=1,2} W_r^{\phantom{\dagger}} (\mathbf{p} ) W^\dagger_r(\mathbf{p}) = { \epsilon_{\mathbf{p}} + H_{\mathbf{p}} \over2 \epsilon_{\mathbf{p}}} , \qquad \sum_{s=1,2} W_s^{\phantom{\dagger}} \bigl( \mathbf{p}^{ \prime}\bigr) W^\dagger_s\bigl(\mathbf{p}^{ \prime}\bigr) = { \epsilon_{\mathbf{p}^{ \prime}} + H_{\mathbf{p}^{ \prime}} \over2 \epsilon_{\mathbf{p}^{ \prime}}} , $$

and defining the quadrivectors

$$\mathcal{P} = (\epsilon_{\mathbf{p}}, \mathbf{p} ) , \qquad \mathcal{P}' = \bigl(\epsilon_{\mathbf{p}^{ \prime}}, \mathbf{p}^{ \prime} \bigr) , $$

we obtain

$$\sum_{r=1,2} W_r^{\phantom{\dagger}} (\mathbf{p} ) W^\dagger_r(\mathbf{p}) \gamma_0 = { \slashed{\mathcal{P}} + m \over2 \epsilon_{\mathbf{p}}} , \qquad \sum_{s=1,2} W_s^{\phantom{\dagger}} \bigl( \mathbf{p}^{ \prime}\bigr) W^\dagger_s\bigl(\mathbf{p}^{ \prime}\bigr) \gamma_0 = { \slashed{\mathcal{P}}' + m \over2 \epsilon_{\mathbf{p}^{ \prime}}} . $$
By denoting with the symbol 〈⋯〉 the average over the spin states and using the definition of the trace of a matrix, according to which a scalar product of the form \(W_{1}^{\dagger} \mathcal{X} W_{2}^{\phantom{\dagger}}\), with \(W_{1}^{\phantom{\dagger}} \) and \(W_{2}^{\phantom{\dagger}} \) arbitrary spinors and \(\mathcal{X}\) an arbitrary matrix, can be written in the form \(\operatorname{Tr}(W_{2}^{\phantom{\dagger}} W_{1}^{\dagger}{\mathcal {X}})\), we have

$$\bigl\langle PP^* \bigr\rangle = {1 \over2} \operatorname{Tr} \biggl\{ { \slashed{\mathcal{P}}' + m \over2 \epsilon_{\mathbf{p}^{ \prime}}} \slashed{\mathcal{E}}^{ \prime *} {\slashed{\mathcal{G}} + m \over {\mathcal{G}}^2 - m^2 } \slashed{\mathcal{E}} { \slashed{\mathcal{P}} + m \over2 \epsilon_{\mathbf{p}}} \slashed{\mathcal{E}}^{ *} {\slashed{\mathcal{G}} + m \over {\mathcal{G}}^2 - m^2 } \slashed{\mathcal{E}}^{ \prime} \biggr\} , $$
with similar expressions for the other three terms 〈PQ 〉, 〈QP 〉, and 〈QQ 〉.

This last result can be greatly simplified if we sum over the polarisation states of the final photon and we average over the polarisation states of the initial photon. The average over the polarisation states of the initial photon, for example, is obtained by applying to the previous formula the formal substitution

$$\slashed{\mathcal{E}} \cdots \slashed{\mathcal{E}}^{ *} \rightarrow {1 \over2} \sum_{i=1,2} \slashed{\mathcal{E}}^{ (i)} \cdots \slashed{\mathcal{E}}^{ (i)} , $$
where

$$\mathcal{E}^{(i)} = \bigl(0, \mathbf{e}^{ (i)}\bigr) , $$

e (i) (i=1,2) being two unit vectors that we can assume real, perpendicular to each other and perpendicular to the direction of the initial photon. Taking into account the invariance under gauge transformations described in Sect. 16.14 and, in particular, recalling Eq. (16.27), the sum can be modified by extending it to a third “unit quadrivector” (that we denote by \(\mathcal{E}^{(3)}\)) and by then subtracting the contribution of another unit quadrivector. According to special relativity, this unit quadrivector, which we denote by \(\mathcal{E}^{(0)}\), is of the purely temporal type. Defining

$$\mathcal{E}^{(3)} = \bigl(0, \mathbf{e}^{ (3)}\bigr) , \qquad \mathcal{E}^{(0)} = (1 , \mathbf{0} ) , $$

where e (3) is a unit vector directed along the direction of the incoming photon (e (3)=k/k), and recalling the definition of the metric tensor, we apply the following transformation

$$\sum_{i=1,2} \slashed{\mathcal{E}}^{ (i)} \cdots \slashed{\mathcal{E}}^{ (i)} \rightarrow \mathcal{S} = \sum_{i=1,2,3} \slashed{\mathcal{E}}^{ (i)} \cdots \slashed{\mathcal{E}}^{ (i)} - \slashed{\mathcal{E}}^{ (0)} \cdots \slashed{\mathcal{E}}^{ (0)} . $$

We note that the sum \(\mathcal{S}\) can also be written in the form

$$\mathcal{S} = \sum_{i=1,2,3} \bigl(\boldsymbol{\gamma}\cdot \mathbf{e}^{ (i)}\bigr) \cdots \bigl(\boldsymbol{\gamma}\cdot \mathbf{e}^{ (i)}\bigr) - \gamma_0 \cdots \gamma_0 , $$

where γ is the formal vector defined by γ=(γ 1,γ 2,γ 3). The right-hand side can then be transformed to get

$$\mathcal{S} = \sum_{jk} \gamma_j \cdots \gamma_k \biggl[ \sum_{i=1,2,3} e^{ (i)}_j e^{ (i)}_k \biggr] - \gamma_0 \cdots \gamma_0 . $$

On the other hand, taking into account Eq. (15.7), the quantity in square brackets is equal to the Kronecker delta δ jk , so we obtain

$$\mathcal{S} = \sum_{j} \gamma_j \cdots \gamma_j - \gamma_0 \cdots \gamma_0 = - \sum_{\mu\nu} g_{\mu\nu} \gamma_\mu \cdots \gamma_\nu . $$
Finally, we take into account the properties of the γ matrices. From Eq. (16.28) we have

$$\gamma_\mu\gamma_\nu= - \gamma_\nu \gamma_\mu+ 2 g_{\mu\nu} . $$

Moreover, it is easy to verify that the following relation holds

$$\sum_{\mu\nu} g_{\mu\nu} \gamma_\mu \gamma_\nu= 4 , $$

and that, given the properties of the metric tensor,

$$\sum_\mu g_{\mu\nu} g_{\mu\rho} = \delta_{\nu\rho} , $$

so that

$$\sum_{\mu\nu} g_{\mu\nu} g_{\mu\rho} \gamma_\nu= \gamma_\rho . $$

Taking advantage of these properties, we get after some algebra

$$\sum_{\mu\nu} g_{\mu\nu} \gamma_\mu \slashed{\mathcal{A}} \gamma_\nu = -2 \slashed{\mathcal{A}} , \qquad \sum_{\mu\nu} g_{\mu\nu} \gamma_\mu \slashed{\mathcal{A}} \slashed{\mathcal{B}} \gamma_\nu = 4 (\mathcal{A} \mathcal{B}) , \qquad \sum_{\mu\nu} g_{\mu\nu} \gamma_\mu \slashed{\mathcal{A}} \slashed{\mathcal{B}} \slashed{\mathcal{C}} \gamma_\nu = -2 \slashed{\mathcal{C}} \slashed{\mathcal{B}} \slashed{\mathcal{A}} . $$

Summarising the foregoing, the average on the states of polarisation of the initial photon is obtained by performing the formal transformation

$$\slashed{\mathcal{E}} \cdots \slashed{\mathcal{E}}^{ *} \rightarrow -{1 \over2} \sum_{\mu\nu} g_{\mu\nu} \gamma_\mu \cdots \gamma_\nu . $$

Similarly, the sum over the polarisation states of the final photon is obtained by performing the formal transformation

$$\slashed{\mathcal{E}}^{ \prime *} \cdots \slashed{\mathcal{E}}^{ \prime} \rightarrow - \sum_{\mu\nu} g_{\mu\nu} \gamma_\mu \cdots \gamma_\nu . $$
Now we denote by the symbol 〈〈PP ∗ 〉〉 the quantity obtained by taking the average of 〈PP ∗ 〉 over the states of initial polarisation and the sum of the same quantity over the states of final polarisation.Footnote 10 We have

$$\bigl\langle\bigl\langle PP^* \bigr\rangle\bigr\rangle = {1 \over4} \sum_{\mu\nu} \sum_{\rho\sigma} g_{\mu\nu} g_{\rho\sigma} \operatorname{Tr} \biggl\{ \gamma_\mu { \slashed{\mathcal{P}}' + m \over2 \epsilon_{\mathbf{p}^{ \prime}}} \gamma_\nu {\slashed{\mathcal{G}} + m \over {\mathcal{G}}^2 - m^2 } \gamma_\rho { \slashed{\mathcal{P}} + m \over2 \epsilon_{\mathbf{p}}} \gamma_\sigma {\slashed{\mathcal{G}} + m \over {\mathcal{G}}^2 - m^2 } \biggr\} . $$
At this point it is necessary to briefly discuss the traces of the products of the γ matrices. It is easy to verify that the trace of the product of an odd number of γ matrices is null. When instead the number of γ matrices is zero or even, the result is different from zero. Denoting by a an arbitrary constant, with \(\mathcal{A}\), \(\mathcal{B}\), \(\mathcal{C}\), and \(\mathcal{D}\) four arbitrary quadrivectors, and recalling the definition of the scalar product of quadrivectors, we have

$$\operatorname{Tr} \{ a \} = 4 a , \qquad \operatorname{Tr} \bigl\{ \slashed{\mathcal{A}} \slashed{\mathcal{B}} \bigr\} = 4 (\mathcal{A} \mathcal{B}) , \qquad \operatorname{Tr} \bigl\{ \slashed{\mathcal{A}} \slashed{\mathcal{B}} \slashed{\mathcal{C}} \slashed{\mathcal{D}} \bigr\} = 4 \bigl[ (\mathcal{A} \mathcal{B}) (\mathcal{C} \mathcal{D}) - (\mathcal{A} \mathcal{C}) (\mathcal{B} \mathcal{D}) + (\mathcal{A} \mathcal{D}) (\mathcal{B} \mathcal{C}) \bigr] . $$
The first relation is obvious. For the second one we have

$$\operatorname{Tr} \bigl\{ \slashed{\mathcal{A}} \slashed{\mathcal{B}} \bigr\} = \sum_{\mu\nu} \sum_{\rho\sigma} g_{\mu\rho} g_{\nu\sigma} {\mathcal{A}}_\rho {\mathcal{B}}_\sigma \operatorname{Tr} \{ \gamma_\mu \gamma_\nu \} , $$
and using the anticommutation property of the γ matrices

$$\operatorname{Tr} \{ \gamma_\mu\gamma_\nu\} = 8 g_{\mu\nu} - \operatorname{Tr} \{ \gamma_\nu \gamma_\mu\} . $$

From the cyclic property of the trace it then follows that

$$\operatorname{Tr} \{ \gamma_\mu\gamma_\nu\} = 4 g_{\mu\nu} , $$

which proves by simple substitution the second relation. For the third relation we have

$$\operatorname{Tr} \bigl\{ \slashed{\mathcal{A}} \slashed{\mathcal{B}} \slashed{\mathcal{C}} \slashed{\mathcal{D}} \bigr\} = \sum_{\mu\nu\rho\sigma} \sum_{\mu'\nu'\rho'\sigma'} g_{\mu\mu'} g_{\nu\nu'} g_{\rho\rho'} g_{\sigma\sigma'} {\mathcal{A}}_{\mu'} {\mathcal{B}}_{\nu'} {\mathcal{C}}_{\rho'} {\mathcal{D}}_{\sigma'} \operatorname{Tr} \{ \gamma_\mu \gamma_\nu \gamma_\rho \gamma_\sigma \} , $$
and, for the anticommutation property of the γ matrices,

$$\begin{aligned} \operatorname{Tr} \{ \gamma_\mu \gamma_\nu\gamma_\rho\gamma_\sigma \} & = 2 g_{\mu\nu} \operatorname{Tr} \{ \gamma_\rho \gamma_\sigma\} - \operatorname{Tr} \{ \gamma_\nu \gamma_\mu\gamma_\rho\gamma_\sigma \} \\ & = 2 g_{\mu\nu} \operatorname{Tr} \{ \gamma_\rho \gamma_\sigma\} - 2 g_{\mu\rho} \operatorname{Tr}\{ \gamma_\nu\gamma_\sigma \} + \operatorname{Tr} \{ \gamma_\nu\gamma_\rho\gamma_\mu \gamma_\sigma \} \\ & = 2 g_{\mu\nu} \operatorname{Tr} \{ \gamma_\rho \gamma_\sigma\} - 2 g_{\mu\rho} \operatorname{Tr} \{ \gamma_\nu\gamma_\sigma \} + 2 g_{\mu\sigma} { \operatorname{Tr}} \{ \gamma_\nu\gamma_\rho\} \\ &\quad {}-\operatorname{Tr} \{ \gamma_\nu\gamma_\rho \gamma_\sigma\gamma_\mu\} . \end{aligned} $$

Using the cyclic property of the trace we then have

$$\operatorname{Tr} \{ \gamma_\mu\gamma_\nu \gamma_\rho\gamma_\sigma\}= g_{\mu\nu} { \operatorname{Tr}} \{ \gamma_\rho\gamma_\sigma\} - g_{\mu\rho} \operatorname{Tr} \{ \gamma_\nu \gamma_\sigma \} + g_{\mu\sigma} \operatorname{Tr} \{ \gamma_\nu\gamma_\rho\} , $$

and using the result previously obtained

$$\operatorname{Tr} \{ \gamma_\mu\gamma_\nu \gamma_\rho\gamma_\sigma\}= 4 g_{\mu\nu} g_{ \rho\sigma} - 4 g_{\mu\rho} g_{ \nu\sigma} + 4 g_{\mu\sigma} g_{ \nu\rho} . $$

The third relation is then finally obtained by simple substitution of this identity.
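All the γ-matrix properties used in this section can be verified directly with an explicit 4×4 representation (a sketch, assuming NumPy and using the standard Dirac representation of the α and β matrices):

import numpy as np

I2, Z = np.eye(2), np.zeros((2, 2))
sig = [np.array([[0, 1], [1, 0]], complex),
       np.array([[0, -1j], [1j, 0]]),
       np.array([[1, 0], [0, -1]], complex)]

beta = np.block([[I2, Z], [Z, -I2]])
alpha = [np.block([[Z, s], [s, Z]]) for s in sig]
gam = [beta] + [beta @ a for a in alpha]   # gamma_0 = beta, gamma_i = beta alpha_i
g = np.diag([1.0, -1.0, -1.0, -1.0])

# anticommutation relation (16.28): {gamma_mu, gamma_nu} = 2 g_{mu nu}
for mu in range(4):
    for nu in range(4):
        acom = gam[mu] @ gam[nu] + gam[nu] @ gam[mu]
        assert np.allclose(acom, 2*g[mu, nu]*np.eye(4))

def slash(V):
    # sum_{mu nu} g_{mu nu} gamma_mu V_nu (the metric is diagonal)
    return sum(g[mu, mu]*V[mu]*gam[mu] for mu in range(4))

def dot4(A, B):
    return A @ g @ B

rng = np.random.default_rng(1)
A, B, C, D = (rng.normal(size=4) for _ in range(4))
print(np.trace(slash(A) @ slash(B)).real, 4*dot4(A, B))
lhs = np.trace(slash(A) @ slash(B) @ slash(C) @ slash(D)).real
rhs = 4*(dot4(A, B)*dot4(C, D) - dot4(A, C)*dot4(B, D) + dot4(A, D)*dot4(B, C))
print(lhs, rhs)                            # the two pairs of numbers coincide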

The result obtained for 〈〈PP 〉〉 shows that the trace contained in this quantity can be expressed exclusively in terms of scalar products of quadrivectors, i.e. in terms of relativistic invariants. Similar considerations can then be repeated for the other quantities 〈〈PQ 〉〉, 〈〈QP 〉〉, and 〈〈QQ 〉〉, which, once calculated, allow one to obtain the transition probability per unit time and the cross section. Obviously, in the particular case in which the electron is initially at rest, one finds again for the cross section the Klein-Nishina equation in the form of Eq. (15.37), which refers to the average over the polarisation states of the initial photon and the sum over the polarisation states of the final photon.

The formalism of the γ matrices presented in this chapter is very powerful and elegant. It allows one to deal with relative ease even with the most complex problems in quantum electrodynamics. In any case, we emphasize that the formalism that we have used in the text to deduce the Klein-Nishina equation, which does not make use of the γ matrices, was the first to be used in the applications.

16.16 Physical Constants

The constants are expressed in the cgs system of units with at most six significant digits.

  • Constant of gravitation: G=6.67428×10−8 cm3 g−1 s−2

  • Velocity of light in vacuum: c=2.99792×1010 cm s−1

  • Planck constant: h=6.62607×10−27 erg s

  • Reduced Planck constant: ħ=h/(2π)=1.05457×10−27 erg s

  • Boltzmann constant: k B=1.38065×10−16 erg K−1

  • Charge of the electron (absolute value): e 0=4.80320×10−10 esu

  • Electron mass: m=9.10938×10−28 g

  • Reduced electron mass: m r=mM p/(m+M p)=9.10442×10−28 g

  • Atomic mass unit (amu): m H=1.66054×10−24 g

  • Proton mass: M p=1.67262×10−24 g

  • Proton/electron mass ratio: M p/m=1.83615×103

  • Avogadro constant: N A=6.02214×1023 mol−1

  • Fine-structure constant: \(\alpha=e_{0}^{2}/(\hbar c)=7.29735 \times10^{-3}\)

  • Reciprocal of the fine-structure constant: \(1/\alpha=\hbar c/e_{0}^{2}= 137.036\)

  • Classical radius of the electron: \(r_{\mathrm{c}}=e_{0}^{2}/(m c^{2})=2.81794 \times 10^{-13}\mbox{ cm}\)

  • Compton wavelength of the electron: λ C=h/(mc)=2.42631×10−10 cm

  • Radius of the first Bohr orbit: \(a_{0}=\hbar^{2} /(m e_{0}^{2})=5.29177 \times10^{-9}\mbox{ cm}\)

  • Rydberg constant: \(R = m e_{0}^{4} /(4 \pi c \hbar^{3}) = 1.09737 \times10^{5}\mbox{ cm}^{-1}\)

  • Rydberg constant (hydrogen atom): \(R_{\mathrm{H}} = m_{\mathrm{r}} e_{0}^{4} /(4 \pi c \hbar^{3}) = 1.09677 \times10^{5}\mbox{ cm}^{-1}\)

  • Bohr magneton: μ 0=e 0 ħ/(2mc)=9.27401×10−21 erg G−1

  • Thomson cross section: \(\sigma_{\mathrm{T}} = 8 \pi r_{\mathrm{c}}^{2}/3 = 6.65246 \times10^{-25} \mbox{ cm}^{2}\)

  • Stefan-Boltzmann constant:Footnote 11 σ=5.67040×10−5 erg cm−2 s−1 K−4

  • Radiation density constant:11 a=7.56577×10−15 erg cm−3 K−4

  • First radiation constant: c 1=2πhc 2=3.74177×10−5 erg cm2 s−1

  • Second radiation constant: c 2=hc/k B=1.43877 cm K
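Several of the derived constants in this list can be recomputed directly from the base values, which provides a quick consistency check (a sketch, not part of the original table):

import numpy as np

# base cgs values from the list above
h, c, e0, m = 6.62607e-27, 2.99792e10, 4.80320e-10, 9.10938e-28
hbar = h/(2*np.pi)

print(hbar*c/e0**2)                 # 1/alpha ~ 137.036
print(e0**2/(m*c**2))               # r_c ~ 2.81794e-13 cm
print(h/(m*c))                      # lambda_C ~ 2.42631e-10 cm
print(hbar**2/(m*e0**2))            # a_0 ~ 5.29177e-9 cm
print(m*e0**4/(4*np.pi*c*hbar**3))  # Rydberg constant ~ 1.09737e5 cm^-1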