Abstract
This chapter is dedicated to introducing the formalism of tensor algebra used throughout the book. It also contains some mathematical proofs and further complementary material useful for deepening some of the topics covered in the book.
16.1 Units of Measurement for Electromagnetic Phenomena
The units of measurement relative to electromagnetic phenomena have been introduced through a long and complex historical process. Without going into the details of this process, we try here to summarise the fundamental points in light of a modern vision of the phenomena themselves. We stress that these notes are aimed at a reader who is already familiar with the basic phenomenology of electromagnetism.
Within electrostatics, the fundamental law is Coulomb’s law which is written, in vacuum, in the general form
$$\mathbf{F} = k_{\mathrm{C}} {q_1 q_2 \over r^2} \operatorname{vers} \mathbf{r} , $$
where F is the force that a point charge \(q_1\) exerts on the point charge \(q_2\) placed at the relative position \(\mathbf{r} = \mathbf{r}_2 - \mathbf{r}_1\), and where \(k_{\mathrm{C}}\) is a constant that implicitly defines the unit of measurement of the charge (assuming, of course, that the units of measurement of the mechanical quantities have already been set). We note that \(k_{\mathrm{C}}\) can be chosen to be dimensional or dimensionless. The electric field vector E is defined at an arbitrary point using the equation
$$\mathbf{E} = {\mathbf{F} \over q_{\mathrm{p}}} , $$
where F is the electric force exerted on the “test” charge \(q_{\mathrm{p}}\) placed at the same location. From this definition and Coulomb’s law we can deduce the expression of the electric field due to a point charge q, which is
$$\mathbf{E} = k_{\mathrm{C}} {q \over r^2} \operatorname{vers} \mathbf{r} , $$
from which the Gauss theorem (in its integral form) follows
$$\varPhi(\mathbf{E}) = 4 \pi k_{\mathrm{C}} Q , $$
where Q is the charge contained within the surface Σ. In its differential form, we have
$$\operatorname{div} \mathbf{E} = 4 \pi k_{\mathrm{C}} \rho , $$
where ρ is the density of the electric charge (the charge contained within the unit volume).
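As a numerical aside (a Python sketch, not part of the original derivation, with the illustrative choice \(k_{\mathrm{C}} = q = 1\)), one can check by finite differences that the divergence of the point-charge field vanishes away from the charge, where ρ=0:

```python
import math

def coulomb_E(x, y, z, q=1.0, k_C=1.0):
    """Field of a point charge at the origin: E = k_C q vers(r) / r^2."""
    r = math.sqrt(x * x + y * y + z * z)
    return (k_C * q * x / r**3, k_C * q * y / r**3, k_C * q * z / r**3)

def div_E(x, y, z, h=1e-5):
    """Central finite-difference estimate of div E at a point away from the charge."""
    dEx = (coulomb_E(x + h, y, z)[0] - coulomb_E(x - h, y, z)[0]) / (2 * h)
    dEy = (coulomb_E(x, y + h, z)[1] - coulomb_E(x, y - h, z)[1]) / (2 * h)
    dEz = (coulomb_E(x, y, z + h)[2] - coulomb_E(x, y, z - h)[2]) / (2 * h)
    return dEx + dEy + dEz

# div E = 4 pi k_C rho vanishes wherever rho = 0, i.e. everywhere but the origin
print(abs(div_E(1.0, 0.5, -0.3)) < 1e-6)   # True
```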
Regarding the definition of the electric field vector, we note that it is not the only possible one, since we could have defined the electric field created by the charge q as
$$\mathbf{E} = k_{\mathrm{C}} \delta {q \over r^2} \operatorname{vers} \mathbf{r} , $$
with δ an arbitrary constant (possibly dimensional), as long as the electric force that the field exerts on the test charge \(q_{\mathrm{p}}\) is written in the form
$$\mathbf{F} = {1 \over \delta} q_{\mathrm{p}} \mathbf{E} . $$
Fortunately, the constant δ has (historically) always been set to unity. The same is not true for magnetic phenomena.
Regarding magnetostatics, we have equations that are similar to those of electrostatics. In these equations, for historical reasons, the fictitious concept of “magnetic mass” (or “magnetic pole”) is introduced. The equations corresponding to those previously written are the following ones (the symbol m denoting the magnetic mass):
- Gilbert’s law (analogous to Coulomb’s law)
$$\mathbf{F} = k_{\mathrm{G}} {m_1 m_2 \over r^2} \operatorname{vers} \mathbf{r} . $$
- Definition of the vector of the magnetic induction generated by the magnetic mass m
$$\mathbf{B} = k_{\mathrm{G}} \gamma {m \over r^2} \operatorname{vers} \mathbf{r} , $$
where γ is an arbitrary constant (possibly dimensional).
- Force acting on the test magnetic mass \(m_{\mathrm{p}}\)
$$ \mathbf{F} = {1 \over\gamma} m_{\mathrm{p}} \mathbf{B} . $$(16.1)
- Equivalent of the Gauss theorem (integral form)
$$\varPhi(\mathbf{B}) = 0 , $$
since isolated magnetic masses (magnetic monopoles) do not exist.
- Equivalent of the Gauss theorem (differential form)
$$\operatorname{div} \mathbf{B} = 0 . $$
The first quantitative relations between electric and magnetic phenomena were established with experiments based on electric currents. The intensity of the electric current i flowing in a conductor is defined by the simple equation
$$i = {\mathrm{d}q \over \mathrm{d}t} , $$
from which we can define the current density j as a vector directed along the direction of motion of the (positive) charges, of magnitude
$$j = {i \over \sigma} , $$
where σ is the cross-sectional area of the conductor. The experiments performed during the first half of the nineteenth century, especially by Ørsted, Ampère, and Faraday, led to the idea that electric currents create magnetic fields in their surroundings and that, at the same time, a magnetic field is able to exert a force on electric currents. During the same period, a new idea clearly emerged: that permanent magnets contain, at the microscopic level, a large number of elementary electric currents. These currents would be responsible, ultimately, for magnetostatic phenomena.
In modern terms, the magnetic properties of the currents can be summarised by a single law that is expressed by saying that, in stationary conditions, the current element of an elementary circuit (microscopic or macroscopic), \(i_1\,\mathrm{d}\boldsymbol{\ell}_1\), acts on the current element of another elementary circuit, \(i_2\,\mathrm{d}\boldsymbol{\ell}_2\), with an infinitesimal force dF given by
$$\mathrm{d}\mathbf{F} = k_{\mathrm{A}} {i_1 i_2 \over r^2} \, \mathrm{d}\boldsymbol{\ell}_2 \times ( \mathrm{d}\boldsymbol{\ell}_1 \times \operatorname{vers} \mathbf{r} ) , $$
where \(k_{\mathrm{A}}\) is a new constant (which cannot be independent of those already introduced), and where r is the radius vector that goes from the current element \(i_1\,\mathrm{d}\boldsymbol{\ell}_1\) to the current element \(i_2\,\mathrm{d}\boldsymbol{\ell}_2\). This law allows the introduction of the magnetic induction vector. The definition of this vector is somewhat arbitrary and it is assumed, in general, that the current element i dℓ creates the elementary induction vector dB given by (first law of Laplace, or the Biot and Savart law)
$$\mathrm{d}\mathbf{B} = k_{\mathrm{A}} \beta {i \over r^2} \, \mathrm{d}\boldsymbol{\ell} \times \operatorname{vers} \mathbf{r} , $$(16.2)
and that a current element i dℓ is subject, in the presence of an induction vector B, to a force dF given by (second law of Laplace)
$$\mathrm{d}\mathbf{F} = {1 \over \beta} \, i \, \mathrm{d}\boldsymbol{\ell} \times \mathbf{B} . $$
The quantity β introduced in these equations is arbitrary.
Let’s see the mathematical consequences of Eq. (16.2) for a closed circuit. The magnetic induction vector is given by
$$\mathbf{B} = k_{\mathrm{A}} \beta \, i \oint_C {\mathrm{d}\boldsymbol{\ell} \times \operatorname{vers} \mathbf{r} \over r^2} , $$
where C is the curve describing the closed circuit. Using standard mathematical methods, one finds the following equations
$$\operatorname{div} \mathbf{B} = 0 , $$
which confirms the analogous equation of magnetostatics, and
$$\operatorname{rot} \mathbf{B} = 4 \pi k_{\mathrm{A}} \beta \, \mathbf{j} , $$
where j is the current density.
This equation, known as Ampère’s law, applies only to stationary phenomena. As shown by Maxwell, it can be transformed into a more general equation that is also valid for phenomena that are variable in time. To do this we observe that, taking the divergence of both sides, we have
$$\operatorname{div} \mathbf{j} = 0 , $$
while, in general, the continuity equation must hold
$$\operatorname{div} \mathbf{j} + {\partial \rho \over \partial t} = 0 , $$
ρ being the charge density. In order to rearrange things, we take the derivative (with respect to time) of the differential expression of Coulomb’s law
$${\partial \rho \over \partial t} = {1 \over 4 \pi k_{\mathrm{C}}} {\partial \over \partial t} \operatorname{div} \mathbf{E} , $$
so that in general the following equation holds
$$\operatorname{div} \biggl( \mathbf{j} + {1 \over 4 \pi k_{\mathrm{C}}} {\partial \mathbf{E} \over \partial t} \biggr) = 0 . $$
The second term in parentheses is the so-called displacement current density. With its introduction, the equation for \(\operatorname{rot} \mathbf{B}\), corrected to include non-stationary phenomena, is
$$\operatorname{rot} \mathbf{B} = 4 \pi k_{\mathrm{A}} \beta \, \mathbf{j} + {k_{\mathrm{A}} \beta \over k_{\mathrm{C}}} {\partial \mathbf{E} \over \partial t} . $$
Finally, we need to consider the phenomena of magnetic induction. The law that describes them can be deduced, at least in a particular case, from the second law of Laplace. We have
$$\operatorname{rot} \mathbf{E} = - {1 \over \beta} {\partial \mathbf{B} \over \partial t} , $$
so that, in summary, the laws governing electromagnetic phenomena can all be collected in the following four Maxwell’s equations
$$\operatorname{div} \mathbf{E} = 4 \pi k_{\mathrm{C}} \rho , \qquad \operatorname{div} \mathbf{B} = 0 , $$
$$\operatorname{rot} \mathbf{E} = - {1 \over \beta} {\partial \mathbf{B} \over \partial t} , \qquad \operatorname{rot} \mathbf{B} = 4 \pi k_{\mathrm{A}} \beta \, \mathbf{j} + {k_{\mathrm{A}} \beta \over k_{\mathrm{C}}} {\partial \mathbf{E} \over \partial t} . $$
We now consider Maxwell’s equations in vacuum. Taking the curl of the third equation and substituting the fourth, we obtain the wave equation
$$\nabla^2 \mathbf{E} = {k_{\mathrm{A}} \over k_{\mathrm{C}}} {\partial^2 \mathbf{E} \over \partial t^2} . $$
On the other hand we know that electromagnetic waves propagate in vacuum with velocity c, so that we must have
$${k_{\mathrm{C}} \over k_{\mathrm{A}}} = c^2 , $$
or
$$k_{\mathrm{A}} = {k_{\mathrm{C}} \over c^2} , $$
that is, a relation between the quantities \(k_{\mathrm{A}}\) and \(k_{\mathrm{C}}\) that is independent of the unit system under consideration.
Let’s see how we proceed in the two more common systems of units, the cgs system of Gauss (sometimes also called the Gauss-Hertz system) and the International System of Units (SI). In the cgs system, we assume \(k_{\mathrm{C}}=1\), so that the unit of charge is defined as the charge that repels an equal charge, at a distance of one centimeter, with the force of one dyne. This unit of charge is called the franklin, or statcoulomb. Since \(k_{\mathrm{C}}=1\), it follows that \(k_{\mathrm{A}}=1/c^2\). Within this system we also assume that β=c, so that Maxwell’s equations are written as
$$\operatorname{div} \mathbf{E} = 4 \pi \rho , \qquad \operatorname{div} \mathbf{B} = 0 , $$
$$\operatorname{rot} \mathbf{E} = - {1 \over c} {\partial \mathbf{B} \over \partial t} , \qquad \operatorname{rot} \mathbf{B} = {4 \pi \over c} \mathbf{j} + {1 \over c} {\partial \mathbf{E} \over \partial t} . $$
Moreover, the first and the second law of Laplace, together with the law that summarises them, can be written in the form
$$\mathrm{d}\mathbf{B} = {i \over c} {\mathrm{d}\boldsymbol{\ell} \times \operatorname{vers} \mathbf{r} \over r^2} , \qquad \mathrm{d}\mathbf{F} = {i \over c} \, \mathrm{d}\boldsymbol{\ell} \times \mathbf{B} , \qquad \mathrm{d}\mathbf{F} = {i_1 i_2 \over c^2 r^2} \, \mathrm{d}\boldsymbol{\ell}_2 \times ( \mathrm{d}\boldsymbol{\ell}_1 \times \operatorname{vers} \mathbf{r} ) . $$
Within the International System, instead, two new constants are introduced. They are the vacuum permittivity (also called dielectric permittivity of the vacuum) \(\epsilon_0\) and the vacuum permeability (magnetic permeability of the vacuum) \(\mu_0\), such that
$$\epsilon_0 \mu_0 = {1 \over c^2} . $$
Using these quantities, we put
$$k_{\mathrm{C}} = {1 \over 4 \pi \epsilon_0} , \qquad k_{\mathrm{A}} = {\mu_0 \over 4 \pi} , $$
so that we have
$${k_{\mathrm{C}} \over k_{\mathrm{A}}} = {1 \over \epsilon_0 \mu_0} = c^2 . $$
Within this system we also put β=1, so that Maxwell’s equations are written as
$$\operatorname{div} \mathbf{E} = {\rho \over \epsilon_0} , \qquad \operatorname{div} \mathbf{B} = 0 , $$
$$\operatorname{rot} \mathbf{E} = - {\partial \mathbf{B} \over \partial t} , \qquad \operatorname{rot} \mathbf{B} = \mu_0 \mathbf{j} + \epsilon_0 \mu_0 {\partial \mathbf{E} \over \partial t} . $$
Moreover, the first and the second law of Laplace and the law that summarises them are written, respectively, in the form
$$\mathrm{d}\mathbf{B} = {\mu_0 \over 4 \pi} {i \, \mathrm{d}\boldsymbol{\ell} \times \operatorname{vers} \mathbf{r} \over r^2} , \qquad \mathrm{d}\mathbf{F} = i \, \mathrm{d}\boldsymbol{\ell} \times \mathbf{B} , \qquad \mathrm{d}\mathbf{F} = {\mu_0 \over 4 \pi} {i_1 i_2 \over r^2} \, \mathrm{d}\boldsymbol{\ell}_2 \times ( \mathrm{d}\boldsymbol{\ell}_1 \times \operatorname{vers} \mathbf{r} ) . $$(16.3)
With respect to the numerical values of \(\epsilon_0\) and \(\mu_0\), the Ampère (unit of measurement of the current) is defined as the current that, flowing along an infinite straight wire of negligible thickness in vacuum, attracts an equal wire, located at a distance of one meter, with a force per unit length equal to \(2 \times 10^{-7}\ \mathrm{N\,m^{-1}}\). Using Eq. (16.3) we deduce that in such a geometry the force per unit length that acts on one of the two conductors is attractive and has a magnitude given by the following expression
$${F \over \ell} = {\mu_0 \over 2 \pi} {i_1 i_2 \over d} , $$
where d is the distance between the wires, so we must have
$$\mu_0 = 4 \pi \times 10^{-7}\ \mathrm{N\,A^{-2}} , $$
and then, recalling that the Coulomb is the charge transported in one second by a current of one Ampère,
$$\epsilon_0 = {1 \over \mu_0 c^2} \simeq 8.854 \times 10^{-12}\ \mathrm{C^2\,N^{-1}\,m^{-2}} . $$
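These numerical relations are easy to verify. The following Python sketch uses the pre-2019 SI definitions (with \(\mu_0 = 4\pi \times 10^{-7}\ \mathrm{N\,A^{-2}}\) exact) to recover both \(\epsilon_0\) and the defining force between parallel wires:

```python
import math

c = 2.99792458e8                # speed of light, m/s (exact)
mu_0 = 4 * math.pi * 1e-7       # vacuum permeability, N A^-2 (pre-2019 definition)
eps_0 = 1 / (mu_0 * c**2)       # vacuum permittivity from eps_0 mu_0 = 1/c^2

def force_per_length(i1, i2, d):
    """Force per unit length between two parallel straight wires: mu_0 i1 i2 / (2 pi d)."""
    return mu_0 * i1 * i2 / (2 * math.pi * d)

print(force_per_length(1.0, 1.0, 1.0))   # 2e-07 N/m, the defining value of the Ampere
print(eps_0)                             # about 8.854e-12, in C^2 N^-1 m^-2
```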
Finally, it remains to analyse the relation between magnetic masses and currents. We can infer from Laplace’s laws that a filiform planar circuit of area σ carrying a current i behaves, at distances much larger than its size, as a magnetic dipole directed along the unit vector n perpendicular to the plane of the circuit. The direction of n is specified by the rule of the corkscrew (or the right screw). This is the so-called Ampère principle of equivalence, which is expressed by the formula
$$\mathbf{m} = k_{\mathrm{P}} \, i \, \sigma \, \mathbf{n} , $$
where k P is a new constant to be related to those previously introduced. To establish this relation, we evaluate, for example, the moment of the forces acting on an elementary dipole located at a point in space where the field B is present. Using Eq. (16.1), we have
Instead, from the second law of Laplace we have
which can be rewritten as
Equating the two expressions for M we have
Finally, considering the force exerted between two infinitesimal circuits, treated in the first instance as elementary dipoles and then as coils carrying a current, we obtain the relation
which allows us to write \(k_{\mathrm{G}}\) in the form
In the cgs system, since k C=1 and β=c, and assuming γ=1, we obtain
Ampère’s principle of equivalence is therefore
and Gilbert’s law
In the International System, instead, since \(k_{\mathrm{C}}=1/(4\pi\epsilon_0)\) and β=1, assuming \(\gamma=\mu_0\) and recalling that \(c^2=1/(\epsilon_0 \mu_0)\), we obtain
In this case Ampère’s principle of equivalence is
and Gilbert’s law is
Finally, we note that, besides the two systems introduced here, other systems have been used for electromagnetic phenomena. In particular, it is worth mentioning the electrostatic cgs system, the electromagnetic cgs system, and the cgs system of Heaviside.
16.2 Tensor Algebra
In this volume, we often need to deal with vectors and tensors, together with their differential expressions such as divergences, curls and gradients. It is therefore useful to give a brief introduction to this topic in order to make the reader familiar with a compact formalism that allows one to easily deduce a series of vector and tensor identities, as well as various transformation formulae.
The traditional definition of a tensor that is commonly given in physics is based on the generalisation of the definition of a vector. In a Cartesian orthogonal reference system, the vector v is defined as an entity with three components \((v_x, v_y, v_z)\) (or \(v_1\), \(v_2\), \(v_3\)) which, under an arbitrary rotation of the reference system, are modified according to the law
$$v_i' = \sum_j C_{ij} v_j , $$
where the coefficients \(C_{ij}\) are the direction cosines of the new axes with respect to the old ones. In close analogy, we define a tensor T of rank n as an entity with \(3^n\) components (\(T_{i \ldots j}\), with each of the n indices \(i, \ldots, j\) running from 1 to 3) which, under a rotation of the reference system, are transformed according to the law
$$T'_{i \ldots j} = \sum_{l \ldots m} C_{il} \cdots C_{jm} \, T_{l \ldots m} . $$
The tensor most commonly known in physics is the stress tensor, which characterises, inside an elastic material, the force dF that is exerted on a surface dS with normal n. In components we have
$$\mathrm{d}F_i = \sum_j T_{ij} n_j \, \mathrm{d}S . $$
In addition to the stress tensor we can also mention, for their importance in various fields of physics, the deformation tensor, the inertia tensor, and the dielectric tensor.
A particular tensor of rank two is the so-called dyad, which is obtained from two vectors u and v by taking the direct product of their components. The dyad is indicated simply by the symbol uv, and we have by definition
$$( \mathbf{u} \mathbf{v} )_{ij} = u_i v_j . $$
Obviously, in general,
$$\mathbf{u} \mathbf{v} \ne \mathbf{v} \mathbf{u} . $$
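As an illustration (a Python sketch using an arbitrary rotation about the z axis), the transformation law can be checked numerically: the dyad built from the rotated vectors coincides with the rotated dyad, and its trace, being a scalar, is rotation invariant:

```python
import numpy as np

def rotation_z(theta):
    """Matrix of direction cosines C_ij for axes rotated by theta about z."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]])

C = rotation_z(0.7)
u = np.array([1.0, -2.0, 0.5])
v = np.array([3.0, 1.0, -1.0])
T = np.outer(u, v)                 # the dyad uv: (uv)_ij = u_i v_j

u_new, v_new = C @ u, C @ v        # v'_i = sum_j C_ij v_j
T_new = C @ T @ C.T                # T'_ij = sum_kl C_ik C_jl T_kl

print(np.allclose(T_new, np.outer(u_new, v_new)))   # True: uv transforms as a rank-2 tensor
print(np.isclose(np.trace(T_new), np.trace(T)))     # True: the trace is a scalar
```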
A scalar quantity is, by definition, a tensor of rank zero, while a vector is, by definition, a tensor of rank one. Tensors of higher rank may be obtained by considering the direct product of tensors of lower rank. For example, by the direct product of two tensors of rank two a tensor of rank four is obtained.
The tensor algebra covers all operations that can be performed on tensors. We now provide some definitions
1. Given two tensors T and V, the first of rank n (n≥1) and the second of rank n′ (n′≥1), we define the scalar product (or internal product) of the two tensors as the tensor of rank (n+n′−2) obtained by a sum (or saturation) over the last index of the first tensor and the first index of the second tensor. For example, if n and n′ are both equal to 2, defining W as the tensor obtained by the scalar product, we have that W is also a tensor of rank two, defined by
$$W_{ij} = \sum_k T_{ik} V_{kj} . $$
2. Given a tensor of rank n (with n≥1), the divergence of such a tensor is a tensor of rank (n−1) obtained by saturating its first index with the formal vector ∇ (called the “nabla” operator or “del” operator) defined by
$$\boldsymbol{\nabla}\equiv \biggl({\partial\over\partial x}, {\partial\over\partial y}, {\partial\over\partial z} \biggr) . $$For example, for a tensor T of rank two, \(\operatorname{div} \mathbf{T}\) is a vector whose components are given by
$$(\operatorname{div} \mathbf{T})_i = \sum _j {\partial\over\partial x_j} T_{ji} = (\boldsymbol{\nabla }\cdot{\mathbf{T}})_i . $$
3. Given a tensor of rank n (with n≥0), the gradient of such a tensor is a tensor of rank (n+1) obtained by applying to it the formal vector ∇ in such a way that the first index of the resulting tensor is the “derivation” one. For example, for a tensor of rank 1, i.e. for a vector v, we have
$$(\operatorname{grad} \mathbf{v} )_{ij} = (\boldsymbol{\nabla }\mathbf{v} )_{ij} = {\partial\over\partial x_i} v_j . $$It should be noted that this convention is not universally adopted. Some authors prefer to indicate with the symbol \(\operatorname{grad} \mathbf{v}\) the quantity
$$(\operatorname{grad} \mathbf{v} )_{ij} = {\partial\over\partial x_j} v_i . $$The reader should therefore pay attention to the conventions used by each author before using the vector identities that are found in different books. For example, using our conventions, we have
$$\sum_i u_i {\partial v_j \over\partial x_i} = (\mathbf{u} \cdot \operatorname{grad} \mathbf{v} )_j , \qquad \sum _i u_i {\partial v_i \over\partial x_j} = \bigl[ (\operatorname{grad} \mathbf{v} ) \cdot \mathbf{u} \bigr]_j . $$Using the formal vector ∇, the quantities in the right-hand side can also be written, respectively, as
$$\bigl[( \mathbf{u} \cdot \boldsymbol{\nabla}) \mathbf{v} \bigr]_j , \qquad \bigl[( \boldsymbol{\nabla }\mathbf{v} ) \cdot \mathbf{u} \bigr]_j . $$
4. Given a tensor of rank n (n≥1), the curl (also known as the rotor) of such a tensor is a tensor of the same rank n, obtained by saturating the first index of the given tensor with one index of the completely antisymmetric tensor (known as the Ricci, or Ricci-Levi Civita, tensor) and with the formal vector ∇. For example, for a vector v we have
$$(\operatorname{rot} \mathbf{v} )_i = \sum_{jk} \epsilon_{ijk} {\partial v_k \over \partial x_j} = (\boldsymbol{\nabla }\times \mathbf{v} )_i , $$and for a tensor T of rank two
$$(\operatorname{rot} \mathbf{T})_{ij} = \sum _{kl} \epsilon_{ikl} {\partial T_{lj} \over \partial x_k} = ( \boldsymbol{\nabla}\times\mathbf{T})_{ij} . $$The antisymmetric tensor of rank three, \(\epsilon_{ijk}\), introduced in these expressions, is defined as follows: \(\epsilon_{ijk}=0\) if at least two of the three indices i,j,k are equal; \(\epsilon_{ijk}=1\) if the ordered triad (i,j,k) is an even permutation of the fundamental triad (1,2,3); and \(\epsilon_{ijk}=-1\) if the ordered triad (i,j,k) is an odd permutation of the fundamental triad (1,2,3). Ultimately, only 6 of the 27 components of the tensor are different from zero. Note that the usual vector product between two vectors can be conveniently expressed through the antisymmetric tensor. If w=u×v, we have
$$w_i = \sum_{jk} \epsilon_{ijk} u_j v_k . $$Note also that the vector product operation and the curl operator (which involve the antisymmetric tensor) imply a choice about the chirality of the Cartesian orthogonal system in which the components of the vectors (and tensors) are defined. The convention that is now almost universally accepted (and that we use) is to choose a right-handed triad, i.e. to suppose that, if the axes x and y are directed respectively along the thumb and index finger of the right hand, the z axis is directed along the middle finger.
The antisymmetric tensor has a number of properties. The first concerns the permutation of its indices. For an even permutation the tensor remains unchanged, while for an odd permutation the tensor changes sign. In formulae
$$\epsilon_{ijk} = \epsilon_{jki} = \epsilon_{kij} = -\epsilon_{jik} = -\epsilon_{ikj} = -\epsilon_{kji} . $$In addition, the following saturation properties hold
$$\begin{aligned} \sum_k \epsilon_{ijk} \epsilon_{lmk} =& \delta_{il} \delta_{jm} - \delta_{im} \delta_{jl} , \\ \sum_{jk} \epsilon_{ijk} \epsilon_{ljk} =& 2 \delta_{il} , \\ \sum_{ijk} \epsilon_{ijk} \epsilon_{ijk} =& 6 , \end{aligned}$$where \(\delta_{ij}\) is the so-called Kronecker delta, i.e. the symbol defined by
$$\delta_{ij} = 1 \quad\mathrm{if}\ i=j , \qquad \delta_{ij} = 0 \quad\mathrm{if}\ i \ne j . $$
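The properties above are easy to verify by brute force. The following Python sketch builds \(\epsilon_{ijk}\) explicitly (with 0-based indices) and checks the cross product together with the first and third saturation identities:

```python
import itertools
import numpy as np

# Build the rank-3 antisymmetric (Levi-Civita) tensor: 6 nonzero components out of 27
eps = np.zeros((3, 3, 3))
for i, j, k in itertools.permutations(range(3)):
    # sign of the permutation (i, j, k) of (0, 1, 2)
    eps[i, j, k] = np.sign((j - i) * (k - i) * (k - j))

u = np.array([1.0, -2.0, 0.5])
v = np.array([3.0, 1.0, -1.0])

# w_i = sum_jk eps_ijk u_j v_k reproduces the ordinary (right-handed) cross product
w = np.einsum('ijk,j,k->i', eps, u, v)
print(np.allclose(w, np.cross(u, v)))        # True

# Saturation identities
delta = np.eye(3)
lhs = np.einsum('ijk,lmk->ijlm', eps, eps)
rhs = np.einsum('il,jm->ijlm', delta, delta) - np.einsum('im,jl->ijlm', delta, delta)
print(np.allclose(lhs, rhs))                 # True
print(np.einsum('ijk,ijk->', eps, eps))      # 6.0
```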
The above definitions and properties can be used to obtain a number of vector identities that are listed below. In these equations, the quantities f and g are scalars, a and b are vectors, and T is a tensor of rank 2.
In fact we have
In fact we have, for the i-th component
In fact we have, for the i-th component
In fact we have
In fact we have, for the i-th component
In fact we have, for the i-th component
In fact we have, for the ij-th component
In fact we have, for the i-th component
In fact we have, for the i-th component
In fact we have, for the i-th component
In fact we have, for the i-th component
There are also other vector identities that apply only in integral form. They result from the theorems of Gauss and Stokes-Ampère, which we recall now.
Gauss theorem: If Σ is a closed surface enclosing the volume V and if n is the external normal to the surface, the Gauss theorem is expressed by the equation
$$\int_V \operatorname{div} \mathbf{a} \, \mathrm{d}V = \oint_\Sigma \mathbf{a} \cdot \mathbf{n} \, \mathrm{d}S , $$
where a is an arbitrary vector that is a function of the position.
Stokes-Ampère theorem: if ℓ is a closed circuit and if Σ is a surface bounded by this circuit, the Stokes-Ampère theorem is stated by the equation
$$\oint_\ell \mathbf{a} \cdot \mathrm{d}\boldsymbol{\ell} = \int_\Sigma \operatorname{rot} \mathbf{a} \cdot \mathbf{n} \, \mathrm{d}S , $$
where n is the normal external to the surface. We note that the validity of this equation implies a convention about the direction of integration along the circuit, which in turn depends on the implicit convention in the definition of the curl operator. When the (x,y,z) system used to define the vector components is a right-handed system, then the direction of integration along the circuit follows the corkscrew (or the right screw) rule, for which the direction of n coincides with the direction of advancement of the corkscrew.
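As a concrete check (a Python sketch with the illustrative field a = (−y, x, 0), for which rot a = (0, 0, 2)), the circulation along the unit circle, traversed counter-clockwise as the corkscrew rule requires for n directed along z, reproduces the flux 2π through the unit disk:

```python
import math

# Field a = (-y, x, 0); rot a = (0, 0, 2), so the flux of rot a through the
# unit disk (with n along z, right-handed orientation) is 2 * pi.
N = 200000
circulation = 0.0
for k in range(N):
    t0, t1 = 2 * math.pi * k / N, 2 * math.pi * (k + 1) / N
    x, y = math.cos(t0), math.sin(t0)
    dx, dy = math.cos(t1) - x, math.sin(t1) - y
    circulation += -y * dx + x * dy    # a . dl along the unit circle

flux = 2.0 * math.pi                   # integral of (rot a) . n over the disk
print(abs(circulation - flux) < 1e-4)  # True
```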
Various identities can be obtained from the Gauss and Stokes-Ampère theorems. Some of them are collected below.
This identity can be proven by noting that, if c is an arbitrary constant vector, we have
and, applying the Stokes-Ampère theorem
Recalling the vector identity of Eq. (16.6), and taking into account that c is a constant vector, we have
The identity therefore follows, because c is an arbitrary vector.
With entirely similar procedures and taking into account the vector identities demonstrated previously, we obtain the additional identities
In particular, if we put f=1 in this last identity, we get
$$\oint_\Sigma \mathbf{n} \, \mathrm{d}S = 0 , $$
which is an important geometrical relation valid for an arbitrary closed surface.
16.3 The Dirac Delta Function
The Dirac delta function, traditionally indicated by the symbol δ(x), can be thought of as a function which is null for any value of x, except for an infinitesimal interval centered at the origin where the function has a very high peak which tends to infinity, but such that the integral of the function in dx is equal to 1. Obviously, it is not a function in the strict mathematical sense, but it can be thought of as the limit of a family of functions depending on a suitable parameter. For example, if we consider the family of functions f(x,a)
$$f(x,a) = {1 \over a \sqrt{\pi}} \, \mathrm{e}^{-x^2 / a^2} , $$
we have that
$$\lim_{a \to 0} f(x,a) = \delta(x) . $$
Similarly, if we consider the family
$$f(x,a) = {1 \over \pi} {a \over x^2 + a^2} , $$
we also have
$$\lim_{a \to 0} f(x,a) = \delta(x) . $$
There are endless possibilities to represent the Dirac delta as the limit of suitable families of functions. The most common representations in mathematical physics are the following ones
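Numerically, any such family reproduces the defining properties of δ(x) for small a. Here is a Python sketch using the Lorentzian family f(x,a) = (1/π) a/(x²+a²), one of the standard representations:

```python
import math

def lorentzian(x, a):
    """f(x, a) = (1/pi) a / (x^2 + a^2): a family tending to delta(x) as a -> 0."""
    return a / (math.pi * (x * x + a * a))

def integrate(F, a, lo=-50.0, hi=50.0, n=400000):
    """Trapezoidal integral of F(x) * f(x, a) over [lo, hi]."""
    h = (hi - lo) / n
    s = 0.5 * (F(lo) * lorentzian(lo, a) + F(hi) * lorentzian(hi, a))
    for k in range(1, n):
        x = lo + k * h
        s += F(x) * lorentzian(x, a)
    return s * h

# As a -> 0: the integral of f is 1, and the integral of F(x) f(x,a) tends to F(0)
print(abs(integrate(lambda x: 1.0, 1e-3) - 1.0) < 1e-3)   # True
print(abs(integrate(math.cos, 1e-3) - 1.0) < 1e-2)        # True (cos(0) = 1)
```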
The fundamental property of the Dirac delta is summarised in the following expression, which constitutes its formal definition
$$\int_{-\infty}^{+\infty} F(x) \, \delta(x) \, \mathrm{d}x = F(0) , $$
and from which, by means of simple changes of variable, the following two relations are found
$$\int_{-\infty}^{+\infty} F(x) \, \delta(x - x_0) \, \mathrm{d}x = F(x_0) , \qquad \delta(a x) = {1 \over |a|} \, \delta(x) , $$
where a is any real number different from zero. From these equations we can get an important generalisation concerning the Dirac delta whose argument is an arbitrary real function g(x). Denoting this quantity by the symbol δ[g(x)] and denoting by \(x_i\) the zeroes (if any) of the function g(x), we have
$$\delta \bigl[ g(x) \bigr] = \sum_i {\delta(x - x_i) \over | g'(x_i) |} , $$
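This rule can be checked numerically. In the Python sketch below (our illustrative choice is g(x) = x² − 1, with zeros ±1 and |g′(±1)| = 2), a narrow Gaussian stands in for the delta:

```python
import math

def delta_gauss(y, a=1e-4):
    """Narrow Gaussian representation of delta(y)."""
    return math.exp(-y * y / (a * a)) / (a * math.sqrt(math.pi))

def F(x):
    return 2.0 + math.sin(x)

def g(x):
    return x * x - 1.0   # zeros at x = +/-1, |g'(+/-1)| = 2

lo, hi, n = -2.0, 2.0, 400000
h = (hi - lo) / n
lhs = sum(F(lo + k * h) * delta_gauss(g(lo + k * h)) for k in range(n + 1)) * h
rhs = (F(1.0) + F(-1.0)) / 2.0     # sum over the zeros of F(x_i) / |g'(x_i)|

print(abs(lhs - rhs) < 1e-3)       # True
```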
where g′(x) is the derivative of the function g(x) with respect to its argument. Further generalisations to the case of the three-dimensional Dirac delta are described directly in the text (see Sect. 3.2).
Finally, we can give a meaning to the derivative of the Dirac delta function, δ′(x), defined by the usual relation
$$\delta'(x) = {\mathrm{d} \over \mathrm{d}x} \, \delta(x) . $$
Using this definition we have, for an arbitrary function F(x),
$$\int_{-\infty}^{+\infty} F(x) \, \delta'(x) \, \mathrm{d}x = \bigl[ F(x) \, \delta(x) \bigr]_{-\infty}^{+\infty} - \int_{-\infty}^{+\infty} F'(x) \, \delta(x) \, \mathrm{d}x , $$
from which we obtain
$$\int_{-\infty}^{+\infty} F(x) \, \delta'(x) \, \mathrm{d}x = - F'(0) . $$
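The same limiting procedure verifies this last relation. In the Python sketch below, δ′(x) is modelled by the derivative of a narrow Gaussian, and the illustrative test function F(x) = e^{2x} has F′(0) = 2:

```python
import math

def ddelta(x, a=1e-3):
    """Derivative of a narrow Gaussian: a model of delta'(x) as a -> 0."""
    return -2.0 * x / (a * a) * math.exp(-x * x / (a * a)) / (a * math.sqrt(math.pi))

def F(x):
    return math.exp(2.0 * x)       # F'(0) = 2

lo, hi, n = -0.05, 0.05, 200000
h = (hi - lo) / n
integral = sum(F(lo + k * h) * ddelta(lo + k * h) for k in range(n + 1)) * h

print(abs(integral - (-2.0)) < 1e-3)   # True: the integral tends to -F'(0)
```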
16.4 Recovering the Elementary Laws of Electromagnetism
In Chap. 3, starting from the Liénard and Wiechert potentials, we calculated the expressions of the electric and magnetic field at an arbitrary point in space, due to a single moving charge. The results are contained in Eqs. (3.19) and (3.20). We are now going to show how the basic equations of electromagnetism valid for stationary phenomena can be derived from these equations in the non-relativistic limit. The purpose of this appendix is a simple consistency check, since it is obvious that the equations from which we start, being a consequence of Maxwell’s equations, must already contain those results that, even historically, are at the basis of Maxwell’s equations themselves.
Consider a particle with electric charge e, moving within an electric conductor having a constant transverse section. Its velocity is much lower than the velocity of light. To fix ideas, we can think that the velocity is of the order of \(10^{-2}\ \mathrm{cm\,s^{-1}}\), which represents the order of magnitude of the drift velocities of electrons inside a conductor in a typical macroscopic electric circuit. The corresponding value of β is of the order of \(10^{-12}\), so that the approximation \(\beta^2 \ll 1\) is certainly verified. Furthermore, the effects of the curvature of the conductor (causing very small accelerations) can certainly be neglected, so that we can assume that the electric field is given only by the Coulomb term of Eq. (3.19). Neglecting terms of the order of \(\beta^2\), such field is written in the form
where κ, R, n are the quantities introduced in Chap. 3 and that need to be calculated at the retarded time t′. The magnetic field is then given by Eq. (3.20), i.e.
We can immediately notice that, if we put β=0 (i.e. if we consider an electric charge at rest), we obviously do not need to consider the difference between real time and retarded time, and, since κ=1, we obtain
These are the ordinary equations of electrostatics which represent, in terms of fields, Coulomb’s law.
We are now going to see what we get at first order in β. With simple considerations it can be shown that the electric field E(r,t) is exactly equal to the one that would be obtained from Coulomb’s law by assuming, hypothetically, that the velocity of light were infinite (i.e. by neglecting the difference between real and retarded time). In fact, referring to Fig. 16.1 and denoting with a prime the quantities measured at the retarded time t′, and without a prime the same quantities at time t, we have
from which it follows, dividing by R′
Introducing the new notations in the expression for the electric field and recalling that β is constant we obtain
On the other hand we have by definition that
and applying Carnot’s theorem to the triangle PP t′P t
Substituting in the expression of the electric field, we obtain the result we anticipated. In fact we obtain, apart from terms of the order of β 2
It remains to evaluate the contribution of the magnetic field. We have
On the other hand, also apart from terms of the order of β 2, we have, from Eqs. (16.15) and (16.16)
so that
Now we apply this equation to the case of an element of a conductor of length dℓ. Denoting by N the number density of the moving charged particles and by S the transverse section of the conductor, the element contains a number of particles given by NS dℓ, with velocity v=cβ parallel to dℓ. There is an equal number of fixed particles of opposite charge, so the resulting electric field is null, for the property previously demonstrated. For the magnetic field we have instead
On the other hand, if we denote by i the intensity of the current flowing in the conductor
so that the equation for the magnetic field is written
This is just the Biot and Savart law expressing the magnetic field generated by a current element. As is clear from our derivation, although the electric charges move within the conductor at very low speed, they are nevertheless able to produce a relativistic effect, which manifests itself in the presence of the magnetic field.
16.5 The Relativistic Larmor Equation
Within the radiation zone, Eqs. (3.18) and (3.20) provide the expressions for the electric and magnetic field due to a moving charge
where e is the value of the electric charge, c is the speed of light, n is the unit vector along the direction of R (the vector that goes from the charge to the point of coordinates r), β=v/c is the velocity of the charge in units of the speed of light, a is the acceleration, and κ is defined by the equation
$$\kappa = 1 - \mathbf{n} \cdot \boldsymbol{\beta} . $$
We recall that the quantities R, κ, n, β, and a that appear in the previous equations must be evaluated at the retarded time t′, related to the time t by the equation
$$t' = t - {R(t') \over c} . $$
Expanding the double vector product, we obtain
On the other hand, we know that the Poynting vector is given by
and expanding the square of the electric field we obtain with simple algebra
This expression shows that, in the general case, the angular distribution of the emitted radiation (i.e. the radiation diagram) is quite complex. The special cases where the acceleration is either parallel or perpendicular to the velocity have been discussed in the text. Here, it is sufficient to emphasise the fact that, for any velocity and acceleration, there are always two directions along which the Poynting vector is zero. This can be shown simply from the expression of the electric field. The electric field is obviously zero along the directions characterised by those unit vectors \(\mathbf{n}_0\) such that the vector \(\mathbf{n}_0 - \boldsymbol{\beta}\) is parallel to the vector a. The same holds for the Poynting vector. The directions \(\mathbf{n}_0\) are then contained in the plane defined by the vectors β and a, and are given by the solutions of the equation
$$( \mathbf{n}_0 - \boldsymbol{\beta} ) \times \mathbf{a} = 0 . $$
Denoting by α the angle between the velocity and the acceleration vectors, the unit vectors \(\mathbf{n}_0\) are defined by the angles \(\theta_\pm\) (which start from the acceleration vector and increase in the same direction as α) given by
$$\theta_+ = \arcsin ( \beta \sin\alpha ) , \qquad \theta_- = \pi - \arcsin ( \beta \sin\alpha ) . $$
For example, if α=45° and β=0.8, we have \(\theta_+ = 34.45^\circ\) and \(\theta_- = 145.55^\circ\).
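A small Python sketch reproduces these numbers from the geometry. The explicit relation sin θ± = β sin α is our reading of the parallelism condition on n₀ − β; it matches the worked example quoted above:

```python
import math

def zero_directions(beta, alpha_deg):
    """Angles theta_+/- (degrees, from the acceleration vector) where E and S vanish."""
    s = beta * math.sin(math.radians(alpha_deg))
    tp = math.degrees(math.asin(s))
    return tp, 180.0 - tp

tp, tm = zero_directions(0.8, 45.0)
print(round(tp, 2), round(tm, 2))   # 34.45 145.55

# Consistency check: n0 - beta has no component perpendicular to a,
# working in the beta-a plane with a along the x axis and beta at angle alpha
alpha = math.radians(45.0)
beta_vec = (0.8 * math.cos(alpha), 0.8 * math.sin(alpha))
n0 = (math.cos(math.radians(tp)), math.sin(math.radians(tp)))
print(abs((n0[1] - beta_vec[1])) < 1e-12)   # True: n0 - beta is parallel to a
```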
Let us now move on to the calculation of the power. We note that if the integral
were simply executed over a sphere of radius R centred on the position of the charge at the retarded time (t−R/c), we would obtain the ratio between the energy that flows across the sphere in a time interval dt and the dt itself. This quantity, however, is of little interest. It is more interesting to obtain the power emitted by the charged particle. In order to do this, we need to take into account the fact that the energy that flows across the sphere in a time dt was emitted by the particle during the time dt′, which depends on the direction and is related to dt by
$$\mathrm{d}t = \kappa \, \mathrm{d}t' . $$
To find the power W emitted by the charged particle we therefore need to calculate the integral
Substituting the above expression of the Poynting vector, we find
To calculate this integral, we introduce a system of polar coordinates (ψ,χ) with the polar axis directed along the velocity vector and the azimuth χ measured from the plane containing the velocity and the acceleration. With obvious notations, the three vectors β, a, and n in this system of coordinates are given by
so that the integrand can be written in the form
and dΩ is given by sinψ dψ dχ. By integrating in dχ within the interval (0,2π), the factors that do not contain any function of χ result in 2π, those containing cosχ produce zero, while the factor containing cos2 χ gives π. In summary we have
The integrals in dψ appearing in this expression are simple and can be evaluated either by integrating by parts or by changing the integration variable from ψ to x=1−βcosψ. We obtain
Substituting these expressions and grouping separately the terms in \(a_{\parallel}^{2}\) and in \(a_{\perp}^{2}\), we get
or, expanding,
Recalling the definition of the relativistic factor γ
$$\gamma = {1 \over \sqrt{1 - \beta^2}} , $$
the expression for the power emitted by a relativistic charge in accelerated motion can also be written in the more representative form
$$W = {2 e^2 \over 3 c^3} \, \gamma^6 \biggl( a_{\parallel}^2 + {a_{\perp}^2 \over \gamma^2} \biggr) . $$
This formula is a generalisation of the Larmor equation (3.23) to the relativistic case. Obviously, for γ=1 we recover the Larmor equation, since \(a_{\parallel}^{2} + a_{\perp}^{2} = a^{2}\).
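The relativistic Larmor (Liénard) formula, in the form W = (2e²/3c³)γ⁶(a∥² + a⊥²/γ²) (our reconstruction, Gaussian units), can be checked numerically in a Python sketch: at γ=1 it reduces to the Larmor equation, and for any γ it agrees with the equivalent form γ⁶(a² − |β×a|²):

```python
import math

def power(e, c, gamma, a_par, a_perp):
    """W = (2 e^2 / 3 c^3) gamma^6 (a_par^2 + a_perp^2 / gamma^2), Gaussian units."""
    return (2.0 * e * e / (3.0 * c**3)) * gamma**6 * (a_par**2 + a_perp**2 / gamma**2)

def larmor(e, c, a):
    """Non-relativistic Larmor equation W = 2 e^2 a^2 / (3 c^3)."""
    return 2.0 * e * e * a * a / (3.0 * c**3)

e, c = 4.8e-10, 3e10               # electron charge (esu) and speed of light (cgs), for scale
a_par, a_perp = 3.0e15, 4.0e15     # arbitrary acceleration components, cm s^-2
a2 = a_par**2 + a_perp**2

# gamma = 1 recovers the Larmor equation (a^2 = a_par^2 + a_perp^2)
print(math.isclose(power(e, c, 1.0, a_par, a_perp), larmor(e, c, math.sqrt(a2))))   # True

# Equivalence with the form gamma^6 (a^2 - |beta x a|^2), taking beta along the
# parallel direction so that |beta x a| = beta * a_perp
gamma = 5.0
beta = math.sqrt(1.0 - 1.0 / gamma**2)
lienard = (2.0 * e * e / (3.0 * c**3)) * gamma**6 * (a2 - (beta * a_perp)**2)
print(math.isclose(power(e, c, gamma, a_par, a_perp), lienard))                     # True
```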
To conclude, we note that, if we had executed the integral of the Poynting vector on the sphere without taking into account the difference between the dt and the dt′ (i.e. the integral \(\mathcal{I}\) of Eq. (16.17)), we would have obviously obtained a different expression. Taking into account that
we have, in fact, that
This difference between the power emitted by the particle (W) and the power received on the sphere (\(\mathcal{I}\)) is a simple kinematic effect and has nothing to do with relativity. A similar effect occurs in the case of acoustic waves emitted, for example, by an airplane travelling at a speed close to the velocity of sound. While the power emitted by the plane in acoustic waves is fixed, the received power can be very large and, in the limit, almost infinite if the plane travels for a long time at exactly the speed of sound (the so-called sonic boom is precisely due to this phenomenon).
16.6 Gravitational Waves
The equations that we have obtained for the radiation of electromagnetic waves can also be applied, with some slight modifications, to treat gravitational radiation. Obviously, this is not rigorous, since the laws of gravitational radiation should be derived from the general theory of relativity. The approach followed here is however sufficient to describe the fundamental properties of the mechanisms for the generation of gravitational waves and leads to formulae that are substantially correct (as can be verified a posteriori).
We start by performing a formal transformation to the equations for electromagnetic radiation described in Sect. 3.10
i.e. we replace, for each particle, the charge with the mass. Furthermore, in the equations that express the Poynting vector (i.e. in those which express the radiated power), we multiply the right-hand side by the universal gravitational constant G. We note, incidentally, that in these equations the dimensional factor \([e^2]\) is replaced by the dimensional factor \([G m^2]\), which has the same dimensions. The various quantities introduced in Sect. 3.10, i.e. the electric dipole moment D (Eq. (3.31)), the magnetic dipole moment M (Eq. (3.32)), and the symmetric tensor of rank two (related to the electric quadrupole moment) (Eq. (3.33)) are transformed into as many quantities for which we use, respectively, the symbols \(\mathbf{D}_{\mathrm{G}}\), \(\mathbf{M}_{\mathrm{G}}\), and , i.e.
We now note that the quantity D G, the analogue of the electric dipole, is, by definition, the coordinate of the centre of mass of the system of N particles, r G, multiplied by the total mass. We therefore have
where
We then have, for an isolated system,
Furthermore, the analogue of the magnetic dipole, the quantity M G, is proportional to the total angular momentum J of the system, since
We then obtain, for an isolated system,
and therefore also
According to our analogy, we can then conclude that for gravitational waves there exists neither the analogue of electric dipole radiation nor the analogue of magnetic dipole radiation. There remains only the analogue of the electric quadrupole radiation (in addition, obviously, to the radiation due to higher multipoles). The tensor is traditionally denoted by the symbol , because it is essentially an inertia tensor. It is not, however, to be confused with the ordinary inertia tensor \(\mathcal{I}\) that is used to describe the dynamics of a rigid body, and that is defined as
where U is the unit tensor. We have, obviously,
since, recalling the definition of the trace of a tensor
The two tensors \(\mathcal{I}\) and differ by a quantity proportional to the unit tensor. This property is strictly analogous to the one relating the tensors \(\mathcal{Q}\) and in electrodynamics. Therefore, within our analogy, the power emitted in gravitational waves at the lowest order (of the multipolar expansion) can be obtained from Eq. (3.34) and is given by
This formula is correct in all respects, aside from the numerical factor. The calculations based on general relativity produce a similar result, in which the factor \({1 \over 20}\) is replaced by the factor \({1 \over5}\). Intuitively, one can justify this multiplication by a factor of four by noting that an electromagnetic wave is described by two vectors E and B that are not independent and are perpendicular to the direction of propagation, say z. Two components of one of the two fields, such as E x and E y , are sufficient to describe the wave. A gravitational wave is instead described by two independent tensors, also perpendicular to the direction of propagation. If we denote these tensors by the traditional symbols e + and e ×, the wave is described by the eight components (\(e_{xx}^{+}, e_{xy}^{+}, e_{yx}^{+}, e_{yy}^{+}, e_{xx}^{\times}, e_{xy}^{\times}, e_{yx}^{\times}, e_{yy}^{\times}\)). The factor of four is therefore associated with the number of degrees of freedom of the polarisation. The correct formula for the power emitted in gravitational waves is then
Finally, we note that if we change the origin of the coordinates by putting
with b a constant vector, we obtain that the new inertia tensor \(\mathcal{I}^{ \prime}\) is
so that, for an isolated system,
and, all the more so, \({\dddot{\mathcal{I}}}^{ \prime} = \dddot{\mathcal{I}}\). This equation allows one to compute the inertia tensor in a coordinate system with an arbitrary origin when determining the power emitted in gravitational waves.
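The quadrupole formula discussed above is easy to check numerically. The following sketch (an illustrative equal-mass circular binary in units with G = c = 1; all parameter values are assumptions, not taken from the text) evaluates the third time derivative of the traceless inertia tensor by finite differences, averages \((1/5)\,\dddot{I}_{ij}\dddot{I}_{ij}\) over an orbit, and compares the result with the standard circular-binary expression P = (32/5) G⁴m₁²m₂²M/(c⁵d⁵):

```python
import math

# Illustrative equal-mass circular binary, units G = c = 1 (assumed values)
m1, m2 = 1.0, 1.0
M, mu = m1 + m2, m1 * m2 / (m1 + m2)
d = 20.0                          # orbital separation
omega = math.sqrt(M / d ** 3)     # Kepler's third law

def quad(t):
    """Traceless quadrupole tensor mu*(x_i x_j - delta_ij r^2/3),
    written in the relative coordinate (the centre of mass is fixed)."""
    x, y = d * math.cos(omega * t), d * math.sin(omega * t)
    r2 = x * x + y * y
    q = [[mu * x * x, mu * x * y, 0.0],
         [mu * y * x, mu * y * y, 0.0],
         [0.0, 0.0, 0.0]]
    for i in range(3):
        q[i][i] -= mu * r2 / 3.0
    return q

def dddot(f, t, h=1e-2):
    """Third time derivative via a central finite-difference stencil."""
    a, b, c, e = f(t + 2 * h), f(t + h), f(t - h), f(t - 2 * h)
    return [[(a[i][j] - 2 * b[i][j] + 2 * c[i][j] - e[i][j]) / (2 * h ** 3)
             for j in range(3)] for i in range(3)]

# P = (1/5) < dddot(I)_ij dddot(I)_ij >, averaged over one orbital period
T, n = 2 * math.pi / omega, 200
P_num = 0.0
for kstep in range(n):
    q3 = dddot(quad, kstep * T / n)
    P_num += sum(q3[i][j] ** 2 for i in range(3) for j in range(3)) / 5.0
P_num /= n

# standard result for a circular binary (G = c = 1)
P_analytic = 32.0 / 5.0 * m1 ** 2 * m2 ** 2 * M / d ** 5
print(P_num, P_analytic)
```

The two numbers agree to better than a part in ten thousand, the residual being the finite-difference truncation error.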
16.7 Calculation of the Thomas-Fermi Integral
Some applications of atomic physics based on the Thomas-Fermi model require the calculation of the following integral
where χ(x) is the solution of the Thomas-Fermi equation
which satisfies the boundary condition
The integral is split into the sum of two integrals
where
The first integral is trivial since, taking into account the Thomas-Fermi equation and the boundary conditions of the function χ, we have
The calculation of the second integral is more complex. It can be done in the following way. On one hand, we have
and, integrating by parts and taking into account that χ(0)=1
On the other hand, considering the quantity x −1/2 dx as a differential factor, by integrating again by parts and recalling the Thomas-Fermi equation, we obtain
Now we note that the product χ′χ″ can be expressed in the form
and integrating again by parts we get
Comparing this expression with Eq. (16.20), we obtain
Finally, recalling Eqs. (16.18) and (16.19) we get
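Although the integrals of this section are quoted only through equation numbers, the underlying Thomas-Fermi problem, χ″ = χ^{3/2}/√x with χ(0) = 1 and χ(∞) = 0, is easy to explore numerically. The sketch below (step sizes, integration range, and the starting small-x series are illustrative choices, not from the text) finds the initial slope χ′(0) ≈ −1.588 by shooting and verifies the elementary identity ∫₀^∞ χ^{3/2} x^{−1/2} dx = −χ′(0), which is of the same kind as the "trivial" first integral above:

```python
import math

def rhs(x, chi, dchi):
    # Thomas-Fermi equation: chi'' = chi^(3/2) / sqrt(x)
    return dchi, max(chi, 0.0) ** 1.5 / math.sqrt(x)

def shoot(s, x_max=25.0, h=5e-3):
    """Integrate outwards from the small-x series solution and classify the
    trial slope s, accumulating int_0^x chi^(3/2) x^(-1/2) dx = int chi'' dx."""
    x0 = 0.05
    chi = 1.0 + s * x0 + (4/3) * x0**1.5 + (2*s/5) * x0**2.5 + (1/3) * x0**3
    dchi = s + 2.0 * math.sqrt(x0) + s * x0**1.5 + x0**2
    integral = 2.0 * math.sqrt(x0) + s * x0**1.5 + x0**2 + 0.15 * s * s * x0**2.5
    x = x0
    while x < x_max:
        k1 = rhs(x, chi, dchi)                      # classical RK4 step
        k2 = rhs(x + h/2, chi + h/2 * k1[0], dchi + h/2 * k1[1])
        k3 = rhs(x + h/2, chi + h/2 * k2[0], dchi + h/2 * k2[1])
        k4 = rhs(x + h, chi + h * k3[0], dchi + h * k3[1])
        chi += h/6 * (k1[0] + 2*k2[0] + 2*k3[0] + k4[0])
        dchi += h/6 * (k1[1] + 2*k2[1] + 2*k3[1] + k4[1])
        x += h
        integral += h/2 * (k1[1] + rhs(x, chi, dchi)[1])   # trapezoidal rule
        if chi <= 0.0:
            return 'low', integral      # crossed zero: slope too negative
        if dchi > 0.0:
            return 'high', integral     # turned upwards: slope too large
    return 'ok', integral

lo, hi = -1.8, -1.4                     # bracket for the initial slope
for _ in range(45):
    mid = 0.5 * (lo + hi)
    if shoot(mid)[0] == 'low':
        lo = mid
    else:
        hi = mid
s = 0.5 * (lo + hi)
integral = shoot(s)[1]
print(s, integral)   # s close to -1.588; integral close to -s = -chi'(0)
```

The known value of the slope at the origin is χ′(0) ≈ −1.588071, and the accumulated integral reproduces −χ′(0) up to the truncation of the integration range.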
16.8 Energy of the Ground Configuration of the Silicon Atom
As an application of the results obtained in Chap. 8, we evaluate the energy of the ground configuration of the silicon atom, i.e. of the 1s²2s²2p⁶3s²3p² configuration. Before performing these calculations, we need to evaluate some 3-j symbols. By means of the analytical formula given in Eq. (7.16), we have
The configuration contains four closed subshells and one open subshell. We start by evaluating the degenerate contribution to the energy. The Hamiltonian H 0 (defined in Eq. (7.3)) and the \(\mathcal{F}\) part of the Hamiltonian \(\mathcal{H}_{1}\) (defined in Eqs. (8.2) and (8.3)) produce five terms, one for each subshell (open or closed). The corresponding energy (that we denote by \(\mathcal{E}_{1}\)) is obtained from Eq. (8.7) and is given by
where W 0 is defined by Eq. (7.11) and I(n,l) is the integral defined in Eq. (8.6). The energy of the Coulomb interaction (i.e. the part \(\mathcal{G}\) of the Hamiltonian \(\mathcal{H}_{1}\)) resulting from closed subshells contributes four terms. Denoting by \(\mathcal{E}_{2}\) the corresponding energy, we have, using Eq. (8.17),
where the quantities F k(n a l a ,n b l b ) are defined in Eq. (8.9). Considering the energy of the Coulomb interaction between different closed subshells, we have six contributions, as many as the number of distinct pairs that can be formed with the four closed subshells. Denoting by \(\mathcal{E}_{3}\) the corresponding energy, we have, using Eqs. (8.14) and (8.16)
where the quantities G k(n a l a ,n b l b ) are defined in Eq. (8.10). Finally, we need to evaluate the contribution of the Coulomb interaction between the open subshell 3p and the four closed subshells. Denoting by \(\mathcal{E}_{4}\) the corresponding energy, we have, using Eqs. (8.13) and (8.15),
The four contributions to the energy that we have calculated are degenerate with respect to all the states of the configuration. For the degenerate part of the energy of the ground configuration of the silicon atom, \(\mathcal {E}\), we then have
What remains to be calculated is given by Eq. (8.11), with the sum restricted to the single pair of electrons belonging to the open subshell 3p. The explicit computation is done in Sect. 8.6. The two 3p electrons give rise to three terms that, in order of increasing energy, are 3 P, 1 D, and 1 S. The ratio between the intervals (1 S−1 D) and (1 D−3 P) is equal to 3/2.
16.9 Calculation of the Fine-Structure Constant of a Term
The calculation of the constant ζ(α,LS), which characterises the fine-structure intervals of the terms belonging to a given configuration, can be carried out with a process based on the diagonal sum rule. A similar process was followed in Sect. 8.1 to determine the energy of the terms. The starting point is Eq. (9.8) which, in the case of diagonal matrix elements, is
On the other hand, for any eigenstate of the configuration of the form Ψ A(a 1,a 2,…,a N ) of Eq. (7.1), the diagonal matrix element of the same operator is given by
where \(\zeta_{n_{i} l_{i}}\) is the quantity defined in Eq. (9.10).
We now consider the particular case of the pf configuration which, as shown in Table 7.3, gives rise to the six terms 1 D, 1 F, 1 G, 3 D, 3 F, and 3 G. We start from the state having the highest values of the quantum numbers M L and M S , i.e. M L =4, M S =1. This state can only originate from the 3 G term. In terms of single-particle states, it is of the type m 1=1, \(m_{s1}={{1 \over2}}\), m 2=3, \(m_{s2}={{1 \over2}}\), where the indices 1 and 2 refer, respectively, to the p and f electron. Using the same notations as in Sect. 8.1 we can write the equality
which, according to the previous equation, is
We therefore obtain the result
We then proceed by lowering the value of M L (maintaining M S =1). We obtain the equations
from which we have, noting that the combination [M L =3, M S =1] can originate from the 3 G and 3 F terms, and that the combination [M L =2, M S =1] can originate from the 3 G, 3 F, and 3 D terms,
By solving the system, we arrive at the following expressions (which can also be obtained from Eq. (9.11))
In principle, we could also consider the value M S =0. For example,
However, in so doing we obtain equations of the form 0=0 and the value of ζ(1 G) is undetermined. This is entirely consistent, since the singlet states do not have fine structure and the constant ζ is not defined.
The cases of the configurations of equivalent electrons are also interesting, because, by repeating the same arguments, we obtain directly the third Hund’s rule. For example, consider the configuration p 2 which produces, as shown in Table 7.4, the three terms 1 S, 1 D, and 3 P. For the singlet terms, the fine structure constant remains undetermined, as usual. For the triplet term we have instead
from which we obtain
If we consider the complementary configuration p 4, we get the same structure of terms. This time, to find the fine-structure constant of the 3 P term, we need to consider the equation
and we obtain
i.e. a value that is exactly equal in magnitude (but opposite in sign) to that of the configuration p 2. These arguments can be repeated for any configuration of equivalent electrons and for the corresponding complementary configuration, and lead to the third Hund's rule. In the particular case of configurations that fill half of a subshell (such as p 3, d 5, and f 7), the configuration coincides with its complementary one, and the fine-structure constant is zero for all the terms.
16.10 The Fundamental Principle of Statistical Thermodynamics
Consider, in all generality, a macroscopic physical system. We suppose that the system is in thermal equilibrium with an ideal heat reservoir at temperature T (canonical ensemble). We also suppose that all the possible microscopic states of the system can be labelled by an index i, and we denote by E i the energy of the i-th state. Macroscopically, the system is in a steady state. From the microscopic point of view, on the other hand, we can think of it as constantly evolving from one microscopic state to another. We can then introduce a statistical description, denoting by p i the probability that the system is in the i-th microscopic state. The following normalisation property must obviously hold
We now need to relate the probability p i to the energy E i . To do so, we give a definition of the entropy by putting, according to a hypothesis originally due to Boltzmann
where k B is the Boltzmann constant.
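The extremal properties of this definition, discussed next, are easy to check numerically. This is an illustrative sketch (with k_B set to 1 and randomly generated probability vectors, both assumptions made here for the test):

```python
import math
import random

def entropy(p):
    # S / k_B = - sum_i p_i ln p_i  (terms with p_i = 0 contribute 0)
    return -sum(x * math.log(x) for x in p if x > 0.0)

n = 8
S_max = entropy([1.0 / n] * n)        # uniform probabilities: S/k_B = ln(n)

random.seed(0)
for _ in range(1000):
    w = [random.random() for _ in range(n)]
    total = sum(w)
    p = [x / total for x in w]        # a random normalised distribution
    assert entropy(p) <= S_max + 1e-12

# the minimum S = 0 is reached when a single p_i equals 1
print(S_max, math.log(n), entropy([1.0] + [0.0] * (n - 1)))
```

No random distribution exceeds the uniform-case entropy ln(n), while the fully ordered distribution gives zero.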
This definition can be justified by considering that the entropy of a system measures the amount of "disorder" contained in the system itself, and by noting that the function defined above has the mathematical property of assuming its maximum value when all the probabilities p i are equal to one another, and its minimum value (equal to 0) when a single p i is equal to 1 and all the others are equal to 0. The proof of the second property is trivial. To prove the first, we note that, giving an arbitrary variation δp i to the probabilities, the corresponding change δS of the entropy is
On the other hand, being
it follows that if lnp i is constant (i.e. independent of i), then δS vanishes and the entropy therefore has an extremum. It is then easy to verify that this extremum is actually a maximum, since
Having justified the definition of the entropy, we now take into account that the internal energy of the system is given by the expression
If we consider an infinitesimal thermodynamical transformation of the system, the internal energy will vary, in general, because both the probabilities p i and the energies E i change. We then have
If, however, the external conditions of the system are not varied, the quantities E i remain fixed at their initial values and the second term on the right-hand side is null. On the other hand, keeping the external conditions of the system constant means that the system performs no mechanical work on the surrounding medium, so that, according to the first principle of thermodynamics, we can write
where δQ is the heat exchanged with the reservoir, and taking into account that
we obtain the equation
This equation must be satisfied for an arbitrary thermodynamic transformation (as long as no work is done). The following relation therefore must hold
which leads to the relation
where A is a constant and where we have put
The constant A is determined by imposing the normalisation condition. Since we must have
it follows that
where the quantity \(\mathcal{Z}\), known as the sum over states (or partition function), is given by
The expression of p i can therefore be written in its final form
This expression, often referred to as Gibbs principle, is of extreme generality and can rightly be considered the basis of all statistical thermodynamics. It can be written in an alternative form by assuming that the microscopic states of the system are not discrete (and therefore countable) but are identified by the representative point in the phase space of the system having dimension \(2 \mathcal{N}\), where \(\mathcal{N}\) is the number of degrees of freedom of the system itself. In this case, denoting by dP the probability that the representative point of the system is in the cell \(\mathrm{d} \varGamma=\mathrm{d}q_{1} \,\mathrm{d}q_{2} \cdots{\mathrm{d}}q_{\mathcal{N}} \,\mathrm{d}p_{1} \,\mathrm{d}p_{2} \cdots{\mathrm{d}}p_{\mathcal{N}}\) of the phase space centered around the values (q i ,p i ), and denoting by H(q i ,p i ) the Hamiltonian of the system, we have
where the integral is over the entire volume of the phase space available to the system. Equations (16.22) and (16.23) coincide, respectively, with Eqs. (10.2) and (10.1) which we have assumed as the basic principles for the deduction of the various laws of thermodynamical equilibrium in Chap. 10.
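The Gibbs distribution can be illustrated with a minimal numerical example (assumed units with ħω = k_B = 1, an illustrative choice): for a harmonic oscillator, the sum over states reproduces Planck's expression for the mean energy.

```python
import math

beta = 1.0                                 # inverse temperature, 1/(k_B T)
N = 200                                    # truncation; the tail ~ e^{-N} is negligible
E = [n + 0.5 for n in range(N)]            # harmonic-oscillator levels (hbar*omega = 1)

Z = sum(math.exp(-beta * En) for En in E)       # sum over states
p = [math.exp(-beta * En) / Z for En in E]      # Gibbs probabilities

U_num = sum(pn * En for pn, En in zip(p, E))    # internal energy U = sum p_i E_i
U_exact = 0.5 + 1.0 / (math.exp(beta) - 1.0)    # zero point + Planck mean energy
print(U_num, U_exact)
```

The numerically summed mean energy coincides with the closed-form result to machine precision once the level sum is truncated far enough.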
16.11 Transition Probability for the Coherences
In Chap. 11, we have introduced the so-called random phase approximation and we have determined the kinetic equations for the diagonal matrix elements ρ α of the density matrix operator of the physical system. The result that we found is the kinetic equation (9.11), which is interpreted by introducing the transition probability per unit time between different states of the system. This probability is given by Fermi’s golden rule, expressed by Eq. (11.10). We now want to generalise these results by determining the kinetic equations for the so-called coherences, i.e. for the non-diagonal matrix elements of the density matrix operator.
We start again from Eq. (8.11) and introduce the hypothesis, less restrictive than that of the random phases, that in the physical system there might exist coherences, even if only within pairs of states, |α〉 and |α′〉, having the same energy eigenvalue (degenerate states) and such that the matrix element of the interaction Hamiltonian between them, \(\mathcal{H}^{\mathrm{I}}_{\alpha\alpha'}\), is zero. Taking into account this approximation, when evaluating the product \(c_{\alpha}(t) c_{\alpha'}^{*}(t)\) we obtain, considering only the terms that are at most quadratic in the matrix elements of \(\mathcal {H}^{\mathrm{I}}\),
where the symbol \([\cdots+ \mathcal{C}.\mathcal{C.} ( \alpha \leftrightarrow \alpha' ) ]\) means that we need to add to the term in brackets its complex conjugate (with the exchange of the indices α and α′).
We now need to recall the approximation we have introduced, namely that the states between which coherences exist are iso-energetic. As regards the second line of the previous equation, this implies that ω αβ =ω α′β′ , so that the two temporal factors are the complex conjugates of each other. Regarding the third line, we can consider the limit (ω αγ →0), and the temporal function between round brackets, which we denote by \(\mathcal{F}(t)\), is equal to
We now proceed by evaluating the statistical average over the physical system. We introduce the notation of the density matrix by putting \(\rho_{\alpha\alpha'} = \langle c_{\alpha}^{\phantom{*}}(t) c_{\alpha'}^{*} (t) \rangle \). Renaming the summation index γ as α″, the kinetic equation for the coherences becomes
We consider the limit of this equation for t→∞. As we have seen on various occasions within the text (cf. Fig. 11.1)
where we have used the definition of the Bohr frequencies in terms of the energies of the states of the physical system. Regarding the function \(\mathcal{F}(t)\), while its real part again produces a Dirac delta over the energy, the imaginary part behaves, in the limit t→∞, as shown in Fig. 16.2. It can be shown rigorously within distribution theory that we have
where the symbol PP means the Cauchy principal value.
We are now able to write the kinetic equation that generalises Eq. (11.9), valid for the diagonal elements of the density matrix, to the case of the coherences. Starting from Eq. (16.24) and noting that all the terms on the right-hand side grow linearly with t, we can write, for t→∞
where T αα′ββ′, the rate of transfer from the coherence ρ ββ′ to the coherence ρ αα′, is given by
and where R αα″, the relaxation rate that relates the coherence ρ αα′ to the coherence ρ α″α′, is given by
It is easy to show that Eq. (16.25) coincides with Eq. (11.9) in the case of the random phase approximation, i.e. when we consider only the diagonal elements of the density matrix. We have, in fact,
where P αβ is the transition probability per unit time between the states |α〉 and |β〉 (or between the states |β〉 and |α〉) given by Eq. (11.10) (Fermi’s golden rule). Similarly,
Equation (16.25) can therefore rightly be considered the generalisation of Fermi's golden rule to the case of coherences. It is important to stress the presence of the imaginary factor in the relaxation rates. This factor is responsible for some phenomena typical of the interaction between material systems and the radiation field, such as, in particular, the anomalous dispersion phenomena that occur in the propagation of polarised radiation in an anisotropic medium (the Faraday effect, the Macaluso-Corbino effect, etc.).
16.12 Sums over the Magnetic Quantum Numbers
Here, we want to prove Eq. (11.17). That is, we want to show that, for any polarisation unit vector e, having defined the averages of the square moduli of the dipole matrix elements \(\mathcal{A}\) and \(\mathcal {A'}\) over the magnetic quantum numbers by the equations
we have
The indices a and b in the previous equations denote any two energy levels of the atomic system while the indices α and β denote the respective magnetic sublevels, which are degenerate with respect to the energy. To demonstrate the equation, we need to introduce a more detailed notation which takes into account the fact that the atomic levels are normally characterised not only by a set of internal quantum numbers γ (which specify the configuration and the term), but also by the quantum number for the angular momentum J and the magnetic quantum number M. Applying the formal substitutions
we obtain
To calculate \(\mathcal{A}\) we apply the Wigner-Eckart theorem, noting that the scalar product r⋅e can be expressed in terms of the spherical components of the two vectors. We have in fact (cf. Eq. (9.5))
Using Eq. (9.4) we have
and we obtain
The sum over M a and M b of the product of the two 3-j symbols can be calculated using the property of the 3-j symbols of Eq. (7.18). We have
and, since ∑ q e q (e q )∗=1, we obtain
To calculate the quantity \(\mathcal{A}'\) we proceed in a similar way first noting that 〈γ a J a M a |r|γ b J b M b 〉=〈γ b J b M b |r|γ a J a M a 〉∗. We obtain
and using the same property of the 3-j symbols and summing over q we arrive at the result that we wanted to prove, i.e.
The above results can be used to express the quantity |r ba |2 that we introduced within the text in terms of the reduced matrix elements of the spherical tensor r. Since \(|\mathbf{r}_{ba}|^{2} = \mathcal {A'}\), we have
On the other hand, since |r ba |2=|r ab |2, we obtain by symmetry
an equation that relates the reduced matrix elements under the exchange of the bra with the ket.
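The orthogonality property of the 3-j symbols invoked above (Eq. (7.18)) can be verified with a self-contained implementation of the Racah formula. This sketch handles integer arguments only (half-integer angular momenta would require the same formula with suitably doubled indices); the particular j values chosen are illustrative:

```python
from math import factorial, sqrt

def three_j(j1, j2, j3, m1, m2, m3):
    """Wigner 3-j symbol for integer arguments, via the Racah formula."""
    if m1 + m2 + m3 != 0:
        return 0.0
    if not abs(j1 - j2) <= j3 <= j1 + j2:
        return 0.0
    if abs(m1) > j1 or abs(m2) > j2 or abs(m3) > j3:
        return 0.0
    f = factorial
    pref = sqrt(f(j1 + j2 - j3) * f(j1 - j2 + j3) * f(-j1 + j2 + j3)
                / f(j1 + j2 + j3 + 1)
                * f(j1 + m1) * f(j1 - m1) * f(j2 + m2) * f(j2 - m2)
                * f(j3 + m3) * f(j3 - m3))
    kmin = max(0, j2 - j3 - m1, j1 - j3 + m2)
    kmax = min(j1 + j2 - j3, j1 - m1, j2 + m2)
    s = sum((-1) ** k / (f(k) * f(j1 + j2 - j3 - k) * f(j1 - m1 - k)
            * f(j2 + m2 - k) * f(j3 - j2 + m1 + k) * f(j3 - j1 - m2 + k))
            for k in range(kmin, kmax + 1))
    return (-1) ** (j1 - j2 - m3) * pref * s

# known value: (1 1 2; 0 0 0)^2 = 2/15
print(three_j(1, 1, 2, 0, 0, 0) ** 2)

# orthogonality: for fixed m3, the sum over m1 (with m2 = -m3 - m1)
# of the squared symbol equals 1/(2*j3 + 1)
j1, j2, j3, m3 = 1, 2, 2, 0
total = sum(three_j(j1, j2, j3, m1, -m3 - m1, m3) ** 2
            for m1 in range(-j1, j1 + 1))
print(total)   # 1/(2*j3 + 1) = 0.2
```

The same routine can be used to spot-check the analytical 3-j values quoted elsewhere in this chapter.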
In spectroscopy, the concept of line (or transition) strength is commonly used. This quantity is invariant with respect to the exchange of the lower and upper levels, and is defined by
where d=−e 0 r is the electric dipole operator. The quantities introduced in the text are therefore related to the line strength via the relation
These relations can then be used to express the Einstein coefficients in terms of the line strength instead of in terms of the dipole matrix elements. For example, recalling Eq. (11.20), the Einstein coefficient A ab can be written in the form
An alternative quantity that is also used to characterise the strength of a line (or a transition) is the so-called oscillator strength. This quantity is introduced in the following way. The absorption coefficient of a plasma of "classical" atoms, described by the Lorentz atomic model and integrated over frequency, is given by
where \(\mathcal{N}\) is the number density of atoms. Comparing this expression with the one for \(k_{\mathrm{R}}^{(\mathrm{a})}\) obtained in Sect. 11.9 (Eq. (11.33)), we see that the two quantities coincide if we identify \(\mathcal{N}\) with \(\mathcal{N}_{b}\) and multiply the classical expression by the dimensionless quantity f ba , known as the oscillator strength of the transition, given by
The oscillator strength may be considered a parameter measuring the efficiency of the transition, since it represents a sort of "equivalent number" of classical oscillators. Typically, it is a relatively small number that reaches values of the order of unity only for the strongest spectral lines. The relations between the oscillator strength, the line strength, and the Einstein coefficients are easily obtained using the previous relations. For example, we have
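As a numerical illustration of these relations, the standard cgs formula connecting the Einstein A coefficient and the oscillator strength can be evaluated for a concrete line. The relation below is the standard one, and the hydrogen Lyman-α data are typical literature values assumed here for illustration, not taken from the text:

```python
import math

# cgs constants (Sect. 16.16)
e0 = 4.80320e-10      # electron charge, esu
m = 9.10938e-28       # electron mass, g
c = 2.99792e10        # speed of light, cm/s

# Illustrative data for hydrogen Lyman-alpha (assumed values)
lam = 1215.67e-8      # wavelength, cm
f_lu = 0.4162         # absorption oscillator strength
g_l, g_u = 2, 6       # statistical weights of 1s and 2p

nu = c / lam
# standard cgs relation between the Einstein coefficient and f:
#   g_u * A_ul = (8 pi^2 e0^2 nu^2 / (m c^3)) * g_l * f_lu
A_ul = 8 * math.pi**2 * e0**2 * nu**2 / (m * c**3) * (g_l / g_u) * f_lu
print(A_ul)           # ~6.3e8 s^-1, the well-known Lyman-alpha rate
```

The result, about 6.3×10⁸ s⁻¹, matches the tabulated spontaneous-emission rate of the line.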
16.13 Calculation of a Matrix Element
We wish to calculate the probability per unit time that the following elementary process occurs: a non-relativistic free electron of momentum q undergoes a transition to a free state of momentum q′ due to the absorption of a photon with wave vector k. According to Fermi's golden rule, repeating the arguments presented in Sect. 11.4 but without introducing the dipole approximation, this probability is proportional to the squared modulus of the matrix element \(\mathcal{M}\) given by
where |u i〉 and |u f〉 are the eigenvectors of the atomic system (in our case of the free electron) in the initial and final state, respectively, e is the polarisation unit vector of the absorbed photon, and p is the momentum operator of the electron. Within the representation of the wavefunctions, where the operator p is given by \(-\mathrm{i} \hbar \operatorname{grad}\), the matrix element \(\mathcal{M}\) is
On the other hand, the eigenfunctions ψ f and ψ i are plane waves, i.e.
where \(\mathcal{V}\) is the normalisation volume. Substituting in the integral we have
The integral is null unless the argument of the exponential is zero. This leads to the equality q′=(ħ k+q), which expresses the conservation of momentum. In that case, the integral is simply equal to \(\mathcal{V}\), so we obtain
16.14 Gauge Invariance in Quantum Electrodynamics
Consider the quantity \(\mathcal{R}_{\mathrm{f}\,\mathrm{i}}\) defined in Eq. (15.22) of the text that we rewrite here in the form
where
We want to demonstrate that \(\mathcal{R}_{\mathrm{f}\,\mathrm{i}}\) is invariant with respect to the transformation
where C is an arbitrary constant and where u is the unit vector along the direction of the initial photon (u=k/k). Performing this transformation, the quantities P and Q transform according to the equations
where
We multiply the two quantities P′ and Q′ by the product cħk and note that
Recalling the kinematic relations of the Compton effect and noting that the quantities g and h, contained respectively in P′ and Q′, are given by (see Eq. (15.15))
we can perform the following substitution in the expression of P′
and in the expression of Q′
Substituting, and recalling also that ϵ−ħω′=ϵ′−ħω, we obtain
Within the square brackets, we add and subtract the factor βmc 2 and recall that an expression of the type (c α⋅q+βmc 2), with q arbitrary, is the Dirac Hamiltonian H q . We obtain
Now, noting that
we have
Finally, taking into account that
we obtain
from which it follows that (P′+Q′)=0. This shows that the quantity \(\mathcal{R}_{\mathrm{f}\,\mathrm{i}}\) is invariant with respect to the transformation (16.26). In a completely similar way, it can be shown that \(\mathcal{R}_{\mathrm{f}\,\mathrm{i}}\) is also invariant with respect to the transformation
where C′ is an arbitrary constant and where u′ is the unit vector of the direction of the final photon (u′=k′/k′).
An alternative way to express these invariance properties is to formally consider the quantity \(\mathcal{R}_{\mathrm{f}\,\mathrm{i}}\) as a function of the matrix (α⋅e) or, alternatively, of the matrix (α⋅e ′∗). From the above proof it follows that
16.15 The Gamma Matrices and the Relativistic Invariants
The relation between energy ϵ p and momentum p of a relativistic particle of mass m is
In particular, for a photon (m=0) we have
or, in terms of frequency and wavenumber
Such relations may be formally simplified if we adopt a system of units in which ħ=c=1. Introducing this convention is equivalent to defining the unit time interval as the time needed by light to travel the unit of length. With this definition, the energy, the momentum, and the mass (and similarly, for a photon, the angular frequency and the wavenumber) all assume the dimensions of the reciprocal of a length (or of a time). The relation between momentum and energy is written, in this system of units, in the form
and for a photon
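As a small numerical illustration of the ħ = c = 1 convention, the electron mass can be quoted as an inverse length or as an inverse time, using the cgs constants of Sect. 16.16:

```python
# Constants in cgs units (from Sect. 16.16)
hbar = 1.05457e-27     # erg s
c = 2.99792e10         # cm/s
m = 9.10938e-28        # electron mass, g

# In units with hbar = c = 1, a mass can be quoted as an inverse length,
# m c / hbar (the reciprocal of the reduced Compton wavelength), or as an
# inverse time, m c^2 / hbar:
inv_length = m * c / hbar      # cm^-1
inv_time = m * c ** 2 / hbar   # s^-1
print(inv_length, inv_time)    # the two differ exactly by a factor c
```

The value mc/ħ ≈ 2.59×10¹⁰ cm⁻¹ is the reciprocal of the reduced Compton wavelength of the electron.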
We now introduce, with the symbol \(\mathcal{P}_{\mu}\) (μ=0,1,2,3), the energy-momentum quadrivector of the particle. It is an entity with four components, defined in this way
or, in a more compact form
Defining the metric tensor g μν as
the scalar product of two quadrivectors \(\mathcal{P}\) and \(\mathcal{Q}\) is
In particular we have
These quantities (the scalar product of two quadrivectors defined through the above metric tensor and, in particular, the square of a quadrivector) are relativistic invariants, i.e. they do not change under Lorentz transformations. We now show how the probability amplitudes of Compton scattering can be expressed in terms of these invariants.
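The metric-tensor scalar product is straightforward to implement directly. In this sketch (illustrative numbers, and assumed units with ħ = c = m = 1) it is used to check the Compton-scattering kinematics: the photon quadrivectors are null, and the quadrivector of the scattered electron, fixed by conservation, lies on the mass shell:

```python
import math

# metric tensor g = diag(1, -1, -1, -1)
g = [[0.0] * 4 for _ in range(4)]
g[0][0] = 1.0
for i in (1, 2, 3):
    g[i][i] = -1.0

def dot(P, Q):
    """Scalar product of two quadrivectors with the metric above."""
    return sum(g[mu][nu] * P[mu] * Q[nu] for mu in range(4) for nu in range(4))

# Compton scattering off an electron at rest (illustrative numbers):
# photon of energy w scattered by an angle theta, units hbar = c = m = 1
w, theta = 0.8, 1.1
w1 = w / (1.0 + w * (1.0 - math.cos(theta)))       # Compton formula
k  = [w,  w, 0.0, 0.0]                             # incoming photon
k1 = [w1, w1 * math.cos(theta), w1 * math.sin(theta), 0.0]   # outgoing photon
p  = [1.0, 0.0, 0.0, 0.0]                          # electron at rest, mass 1
p1 = [p[mu] + k[mu] - k1[mu] for mu in range(4)]   # four-momentum conservation

print(dot(k, k), dot(k1, k1), dot(p, p), dot(p1, p1))
# the photons are null (0, 0); both electrons are on shell (1, 1)
```

That dot(p1, p1) returns exactly m² = 1 is a compact restatement of the Compton frequency-shift formula.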
Consider the quantity \(\mathcal{R}_{\mathrm{f}\,\mathrm{i}}\) defined in Eq. (15.22). This quantity is composed of two terms that we denote by P and Q. For the first one we have, taking into account the system of units we have introduced (c=ħ=1)
where
with
Recalling that the square of the Dirac matrix β is unity, we can write
If we now also recall that the Dirac matrix β anticommutes with any of the α matrices, we obtain
We now define the matrices γ μ (μ=0,1,2,3)
The fundamental property of these matrices regards their anticommutator which is (as easily derived from the properties of the α and β matrices)
where g μν is the metric tensor that we have previously defined. Moreover, given an arbitrary quadrivector \(\mathcal{V}\), we define by the symbol the matrix
With these definitions, the quantity P can be written in the form
where the quadrivectors \(\mathcal{G}\), \(\mathcal{E}\) and \(\mathcal {E}^{\prime}\) are given by
We note that the quadrivector \(\mathcal{G}\) can also be written in the form
where
If we now consider the quantity P ∗, complex conjugate of P, we need to proceed carefully because the γ matrices (except γ 0) are not Hermitian. We have in fact
These properties can be summarised in only one relation
which implies, for an arbitrary quadrivector
We therefore obtain
or
Similar considerations can be repeated for the other term Q of Eq. (15.22) defined by
where
with
We have
where the quadrivector \(\mathcal{H}\) is defined by
being
Again in analogy with what was discussed before, we also have
We can now evaluate the square of the modulus of the quantity \(\mathcal {R}_{\mathrm{f}\,\mathrm{i}}\). It is
where, using Eqs. (16.29)–(16.32), the four terms are given by
These expressions can be simplified when one considers the average over the initial spin states of the electron and the sum over the final spin states of the electron. Taking into account the results we have obtained in Sect. 15.5 (Eqs. (15.18) and (15.20)) we have
and defining the quadrivector
we obtain
By denoting with the symbol 〈⋯〉 the average over the spin states and using the definition of the trace of a matrix, according to which a scalar product of the form \(W_{1}^{\phantom{\dagger}} \mathcal{X} W_{2}^{\dagger}\), with \(W_{1}^{\phantom{\dagger}} \) and \(W_{2}^{\phantom{\dagger}} \) arbitrary spinors and \(\mathcal{X}\) an arbitrary matrix, can be written in the form \(\operatorname{Tr}(W_{2}^{\phantom{\dagger}} W_{1}^{\dagger}{\mathcal {X}})\), we have
with similar expressions for the other three terms 〈PQ ∗〉, 〈QP ∗〉, and 〈QQ ∗〉.
This last result can be greatly simplified if we sum over the polarisation states of the final photon and we average over the polarisation states of the initial photon. The average over the polarisation states of the initial photon, for example, is obtained by applying to the previous formula the formal substitution
where
e (i) (i=1,2) being two unit vectors that we can assume to be real, perpendicular to each other, and perpendicular to the direction of the initial photon. Taking into account the invariance under gauge transformations described in Sect. 16.14 and, in particular, recalling Eq. (16.27), the sum can be modified by extending it to a third "unit quadrivector" (which we denote by \(\mathcal{E}^{(3)}\)) and then subtracting the contribution of another unit quadrivector. According to special relativity, this unit quadrivector, which we denote by \(\mathcal{E}^{(0)}\), is of the purely temporal type. Defining
where e (3) is a unit vector directed along the direction of the incoming photon (e (3)=k/k), and recalling the definition of the metric tensor, we apply the following transformation
We note that the sum \(\mathcal{S}\) can also be written in the form
where γ is the formal vector defined by γ=(γ 1,γ 2,γ 3). The right-hand side can then be transformed to get
On the other hand, taking into account Eq. (15.7), the quantity in square brackets is equal to the Kronecker delta δ ij , so we obtain
Finally, we take into account the properties of the γ matrices. From Eq. (16.28) we have
Moreover, it is easy to verify that the following relation holds
and that, given the properties of the metric tensor,
so that
Taking advantage of these properties, we get after some algebra
Summarising the foregoing, the average over the polarisation states of the initial photon is obtained by performing the formal transformation
Similarly, the sum over the polarisation states of the final photon is obtained by performing the formal transformation
Now we denote by the symbol 〈〈PP ∗〉〉 the quantity obtained by taking the average of 〈PP ∗〉 over the states of initial polarisation and the sum of the same quantity over the states of final polarisation. We have
At this point it is necessary to briefly discuss the traces of products of the γ matrices. It is easy to verify that the trace of the product of an odd number of γ matrices is null. When instead the number of γ matrices is zero or even, the result is in general different from zero. Denoting by a an arbitrary constant and by \(\mathcal{A}\), \(\mathcal{B}\), \(\mathcal{C}\), and \(\mathcal{D}\) four arbitrary quadrivectors, and recalling the definition of the scalar product of quadrivectors, we have
The first relation is obvious. For the second one we have
and using the anticommutation property of the γ matrices
From the cyclic property of the trace it then follows that
which proves by simple substitution the second relation. For the third relation we have
and, for the anticommutation property of the γ matrices,
Using the cyclic property of the trace we then have
and using the result previously obtained
The third relation is then finally obtained by simple substitution of this identity.
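For reference, the three trace relations discussed above can be written compactly in the standard Feynman slash notation, \(\slashed{\mathcal{A}}\equiv\gamma_\mu \mathcal{A}^\mu\). The following is the textbook form of these identities, stated under the assumption that Eq. (16.28) is the usual anticommutation relation \(\{\gamma^\mu,\gamma^\nu\}=2g^{\mu\nu}\mathbb{1}\) (the book's own notation for the slashed quantities may differ):

```latex
% Standard trace identities for products of gamma matrices
% (\slashed{A} = \gamma_\mu A^\mu; requires the "slashed" LaTeX package)
\begin{align}
  \operatorname{Tr}(a\,\mathbb{1}) &= 4a, \\
  \operatorname{Tr}\bigl(\slashed{\mathcal{A}}\,\slashed{\mathcal{B}}\bigr)
    &= 4\,\mathcal{A}\cdot\mathcal{B}, \\
  \operatorname{Tr}\bigl(\slashed{\mathcal{A}}\,\slashed{\mathcal{B}}\,
                   \slashed{\mathcal{C}}\,\slashed{\mathcal{D}}\bigr)
    &= 4\bigl[(\mathcal{A}\cdot\mathcal{B})(\mathcal{C}\cdot\mathcal{D})
      -(\mathcal{A}\cdot\mathcal{C})(\mathcal{B}\cdot\mathcal{D})
      +(\mathcal{A}\cdot\mathcal{D})(\mathcal{B}\cdot\mathcal{C})\bigr].
\end{align}
```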
The result obtained for 〈〈PP ∗〉〉 shows that the trace contained in this quantity can be expressed exclusively in terms of scalar products of quadrivectors, i.e. in terms of relativistic invariants. Similar considerations can then be repeated for the other quantities 〈〈PQ ∗〉〉, 〈〈QP ∗〉〉, and 〈〈QQ ∗〉〉, which, once calculated, allow one to obtain the transition probability per unit time and the cross section. Obviously, in the particular case in which the electron is initially at rest, one finds again for the cross section the Klein-Nishina equation in the form of Eq. (15.37), which refers to the average over the polarisation states of the initial photon and the sum over the polarisation states of the final photon.
The formalism of the γ matrices presented in this chapter is very powerful and elegant. It allows one to deal with relative ease even with the most complex problems in quantum electrodynamics. In any case, we emphasize that the formalism that we have used in the text to deduce the Klein-Nishina equation, which does not make use of the γ matrices, was historically the first to be used in applications.
16.16 Physical Constants
The constants are expressed in the cgs system of units with at most six significant digits.
- Constant of gravitation: G=6.67428×10^−8 cm^3 g^−1 s^−2
- Velocity of light in vacuum: c=2.99792×10^10 cm s^−1
- Planck constant: h=6.62607×10^−27 erg s
- Reduced Planck constant: ħ=h/(2π)=1.05457×10^−27 erg s
- Boltzmann constant: k_B=1.38065×10^−16 erg K^−1
- Charge of the electron (absolute value): e_0=4.80320×10^−10 esu
- Electron mass: m=9.10938×10^−28 g
- Reduced electron mass: m_r=mM_p/(m+M_p)=9.10442×10^−28 g
- Atomic mass unit (amu): m_H=1.66054×10^−24 g
- Proton mass: M_p=1.67262×10^−24 g
- Proton/electron mass ratio: M_p/m=1.83615×10^3
- Avogadro constant: N_A=6.02214×10^23 mol^−1
- Fine-structure constant: \(\alpha=e_{0}^{2}/(\hbar c)=7.29735 \times10^{-3}\)
- Reciprocal of the fine-structure constant: \(1/\alpha=\hbar c/e_{0}^{2}= 137.036\)
- Classical radius of the electron: \(r_{\mathrm{c}}=e_{0}^{2}/(m c^{2})=2.81794 \times 10^{-13}\mbox{ cm}\)
- Compton wavelength of the electron: λ_C=h/(mc)=2.42631×10^−10 cm
- Radius of the first Bohr orbit: \(a_{0}=\hbar^{2} /(m e_{0}^{2})=5.29177 \times10^{-9}\mbox{ cm}\)
- Rydberg constant: \(R = m e_{0}^{4} /(4 \pi c \hbar^{3}) = 1.09737 \times10^{5}\mbox{ cm}^{-1}\)
- Rydberg constant (hydrogen atom): \(R_{\mathrm{H}} = m_{\mathrm{r}} e_{0}^{4} /(4 \pi c \hbar^{3}) = 1.09677 \times10^{5}\mbox{ cm}^{-1}\)
- Bohr magneton: μ_0=e_0 ħ/(2mc)=9.27401×10^−21 erg G^−1
- Thomson cross section: \(\sigma_{\mathrm{T}} = 8 \pi r_{\mathrm{c}}^{2}/3 = 6.65246 \times10^{-25} \mbox{ cm}^{2}\)
- Stefan-Boltzmann constant (Footnote 11): σ=5.67040×10^−5 erg cm^−2 s^−1 K^−4
- Radiation density constant (Footnote 11): a=7.56577×10^−15 erg cm^−3 K^−4
- First radiation constant: c_1=2πhc^2=3.74177×10^−5 erg cm^2 s^−1
- Second radiation constant: c_2=hc/k_B=1.43877 cm K
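As a quick consistency check, the derived constants in this table can be recomputed from the fundamental ones. The following is a minimal Python sketch (the variable names are ours, not from the text), using the cgs values listed above:

```python
import math

# Fundamental constants in cgs units, as tabulated above
e0 = 4.80320e-10      # electron charge (esu)
hbar = 1.05457e-27    # reduced Planck constant (erg s)
h = 6.62607e-27       # Planck constant (erg s)
c = 2.99792e10        # speed of light (cm/s)
m = 9.10938e-28       # electron mass (g)

# Derived quantities appearing in the table
alpha = e0**2 / (hbar * c)          # fine-structure constant
r_c = e0**2 / (m * c**2)            # classical electron radius (cm)
sigma_T = 8 * math.pi * r_c**2 / 3  # Thomson cross section (cm^2)
lambda_C = h / (m * c)              # Compton wavelength (cm)

print(f"alpha    = {alpha:.5e}")     # ≈ 7.29735e-03 (table value)
print(f"r_c      = {r_c:.5e}")       # ≈ 2.81794e-13 cm
print(f"sigma_T  = {sigma_T:.5e}")   # ≈ 6.65246e-25 cm^2
print(f"lambda_C = {lambda_C:.5e}")  # ≈ 2.42631e-10 cm
```

The recomputed values reproduce the tabulated ones to the six significant digits quoted, within rounding of the inputs.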
Notes
- 1.
This law is not universally attributed to Gilbert. In fact, it was discovered experimentally by Coulomb himself and could therefore rightly be called the "second Coulomb's law". William Gilbert (1564–1603) was an English physician who lived well before Coulomb. He is remembered for his studies on terrestrial magnetism and for having realised that the magnetic force increases with decreasing distance.
- 2.
These experiments were made possible thanks to the discovery of the electric battery by Alessandro Volta.
- 3.
With the introduction of capacity and inductance, together with their units, the Farad (F) and the Henry (H), the units in which μ 0 and ϵ 0 are expressed are, respectively, H m−1 and F m−1.
- 4.
This convention is not universally accepted. Some authors prefer to assume γ=1 also in the International System. In this case Ampère's principle of equivalence is written as μ=iσ n, while in Gilbert's law the factor μ_0 appears in the numerator rather than in the denominator.
- 5.
The symbol [M_L, M_S] means the sum of the diagonal matrix elements of the Hamiltonian of the spin-orbit interaction over all the states Ψ_A for which M_L and M_S are the eigenvalues of L_z and S_z, respectively. Similarly, the notation \((m_{1}^{\pm},m_{2}^{\pm}) \) is used to denote the diagonal matrix element of the same Hamiltonian on the state where electron 1 has the magnetic quantum number m_1 and spin quantum number +1/2 or −1/2 and, similarly, electron 2 has the magnetic quantum number m_2 and spin quantum number +1/2 or −1/2.
- 6.
We note that electrons in closed subshells, even when present, do not contribute to the equation.
- 7.
The quantity (1+,1−,0+,−1+), relative to the configuration p 4, is obtained from the corresponding quantity (0+,1+), relative to the configuration p 2, taking the “complementary” of the latter, i.e. (−1+,−1−,0−,1−), and then changing sign to all the values of m and m s .
- 8.
We note that in Chap. 11 we only introduced the diagonal elements of the density matrix, denoted for simplicity by the symbol ρ α instead of ρ αα .
- 9.
The dipole approximation is appropriate when considering the interaction between radiation and electrons that are bound in an atom. For free electrons, described by eigenfunctions of the type of a plane wave, the approximation cannot be applied.
- 10.
Recall that the first average, 〈PP ∗〉, has a similar meaning with respect to the spin states of the electron.
- 11.
\(\sigma= {2 \pi^{5} k_{\mathrm{B}}^{4} \over15 h^{3} c^{2}}\), \(a={ 4 \sigma\over c} = {8 \pi^{5} k_{\mathrm{B}}^{4} \over15 h^{3} c^{3}}\).
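The two expressions in this footnote can be checked numerically against the cgs values tabulated in Sect. 16.16. The following is a small Python sketch (variable names are ours):

```python
import math

# cgs values from the table of physical constants (Sect. 16.16)
kB = 1.38065e-16  # Boltzmann constant (erg/K)
h = 6.62607e-27   # Planck constant (erg s)
c = 2.99792e10    # speed of light (cm/s)

# Stefan-Boltzmann constant: sigma = 2 pi^5 kB^4 / (15 h^3 c^2)
sigma = 2 * math.pi**5 * kB**4 / (15 * h**3 * c**2)

# Radiation density constant: a = 4 sigma / c
a = 4 * sigma / c

print(f"sigma = {sigma:.5e}")  # ≈ 5.67040e-5 erg cm^-2 s^-1 K^-4
print(f"a     = {a:.5e}")      # ≈ 7.56577e-15 erg cm^-3 K^-4
```

Both results agree with the tabulated values to the quoted six significant digits.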
© 2014 Springer-Verlag Italia
Landi Degl’Innocenti, E. (2014). Appendix. In: Atomic Spectroscopy and Radiative Processes. UNITEXT for Physics. Springer, Milano. https://doi.org/10.1007/978-88-470-2808-1_16
Print ISBN: 978-88-470-2807-4
Online ISBN: 978-88-470-2808-1