
Integration of functions in higher dimensions is much more difficult than it is in one dimension. The basic reason is that in order to integrate a function, one has to know how to measure the volume of sets. In one dimension, most sets can be decomposed into intervals (cf. Exercise 1.21), and we took the length of an interval to be its volume. However, already in \({\mathbb R}^2\) there is a vastly more diverse menagerie of shapes. Thus knowing how to integrate over one shape does not immediately tell you how to integrate over others. A second reason is that, even if one knows how to define the integral of functions in \({\mathbb R}^N\), in higher dimensions there is no comparable deus ex machina to replace The Fundamental Theorem of Calculus.

A thoroughly satisfactory theory that addresses the first issue was developed by Lebesgue but, because it would take too much time to explain, it is not the theory presented here. Instead, we will stay with Riemann’s approach.

5.1 Integration Over Rectangles

The simplest analog in \({\mathbb {R}^N}\) of a closed interval is a closed rectangle R, a set of the form

$$\prod _{j=1}^N[a_j,b_j]=[a_1,b_1]\,\times \,\cdots \,\times \, [a_N,b_N]=\{\mathbf x\in {\mathbb {R}^N}:\,a_j\le x_j\le b_j\text { for }1\le j\le N\},$$

where \(a_j\le b_j\) for each j. Such rectangles have three great virtues. First, if one includes the empty set \(\emptyset \) as a rectangle, then the intersection of any two rectangles is again a rectangle. Second, there is no question about how to assign the volume |R| of a rectangle: it has to be \(\prod _{j=1}^N(b_j-a_j)\), the product of the lengths of its sides. Finally, rectangles are easily subdivided into other rectangles. Indeed, every subdivision of the intervals making up its sides leads to a subdivision of the rectangle into sub-rectangles. With this in mind, we will now mimic the procedure that we carried out in Sect. 3.1.

Much of what follows relies on the following, at first sight obvious, lemma. In its statement, and elsewhere, two sets are said to be non-overlapping if their interiors are disjoint.

Lemma 5.1.1

If   \(\mathcal C\) is a finite collection of non-overlapping rectangles each of which is contained in the rectangle R, then \(|R|\ge \sum _{S\in \mathcal C}|S|\). On the other hand, if \(\mathcal C\) is any finite collection of rectangles whose union contains a rectangle R, then \(|R|\le \sum _{S\in \mathcal C}|S|\).

Proof

Since \(|S\cap R |\le |S|\), we may and will assume throughout that \(R\supseteq \bigcup _{S\in \mathcal C}S\). Also, without loss in generality, we will assume that \(\mathrm {int}(R)\ne \emptyset \).

The proof is by induction on N. Thus, suppose that \(N=1\). Given a closed interval I, use \(a_I\) and \(b_I\) to denote its left and right endpoints. Determine the points \(a_R\le c_0<\cdots <c_\ell \le b_R\) so that

$$\{c_k:\,0\le k\le \ell \}=\{a_I:\,I\in \mathcal C\}\cup \{b_I:\,I\in \mathcal C\},$$

and set \(\mathcal C_k=\{I\in \mathcal C:\,[c_{k-1},c_k]\subseteq I\}\). Clearly \(|I|= \sum _{\{k:\,I\in \mathcal C_k\}}(c_k-c_{k-1})\) for each \(I\in \mathcal C\).

When the intervals in \(\mathcal C\) are non-overlapping, no \(\mathcal C_k\) contains more than one \(I\in \mathcal C\), and so

$$\begin{aligned}\sum _{I\in \mathcal C}|I|&=\sum _{I\in \mathcal C}\sum _{\{k:I\in \mathcal C_k\}}(c_k-c_{k-1}) =\sum _{k=1}^\ell \mathrm {card}(\mathcal C_k) (c_k-c_{k-1})\\ {}&\le \sum _{k=1}^\ell (c_k-c_{k-1}) \le (b_R-a_R)=|R|.\end{aligned}$$

If \(R=\bigcup _{I\in \mathcal C}I\), then \(c_0=a_R\), \(c_\ell =b_R\), and, for each \(1\le k\le \ell \), there is an \(I\in \mathcal C\) for which \(I\in \mathcal C_k\). To prove this last assertion, simply note that if \(x\in (c_{k-1},c_k)\) and \(\mathcal C\ni I\ni x\), then \([c_{k-1},c_k]\subseteq I\) and therefore \(I\in \mathcal C_k\). Knowing this, we have

$$\begin{aligned}\sum _{I\in \mathcal C}|I|&=\sum _{I\in \mathcal C}\sum _{\{k:I\in \mathcal C_k\}}(c_k-c_{k-1})=\sum _{k=1}^\ell \mathrm {card}(\mathcal C_k)(c_k-c_{k-1}) \\ {}&\ge \sum _{k=1}^\ell (c_k-c_{k-1})=(b_R-a_R)=|R|.\end{aligned}$$

Now assume the result for N. Given a rectangle S in \({\mathbb R}^{N+1}\), determine \(a_S,\,b_S\in {\mathbb R}\) and the rectangle \(Q_S\) in \({\mathbb {R}^N}\) so that \(S=Q_S\times [a_S,b_S]\). As before, choose points \(a_R\le c_0<\cdots <c_\ell \le b_R\) for \(\{[a_S,b_S]:\,S\in \mathcal C\}\), and define

$$\mathcal C_k=\{S\in \mathcal C:\,[c_{k-1},c_k]\subseteq [a_S,b_S]\}.$$

Then, for each \(S\in \mathcal C\),

$$|S|=|Q_S|(b_S-a_S)=|Q_S|\sum _{\{k:S\in \mathcal C_k\}}(c_k-c_{k-1}).$$

If the rectangles in \(\mathcal C\) are non-overlapping, then, for each k, the rectangles in \(\{Q_S:\,S\in \mathcal C_k\}\) are non-overlapping. Hence, since \(\bigcup _{S\in \mathcal C_k}Q_S\subseteq Q_R\), the induction hypothesis implies \(\sum _{S\in \mathcal C_k}|Q_S|\le |Q_R|\) for each \(1\le k\le \ell \), and therefore

$$\begin{aligned}\sum _{S\in \mathcal C}|S|&=\sum _{S\in \mathcal C}|Q_S|\sum _{\{k:\,S\in \mathcal C_k\}}(c_k-c_{k-1})\\ {}&= \sum _{k=1}^\ell (c_k-c_{k-1})\sum _{S\in \mathcal C_k}|Q_S|\le (b_R-a_R)|Q_R|=|R|.\end{aligned}$$

Finally, assume that \(R=\bigcup _{S\in \mathcal C}S\). In this case, \(c_0=a_R\) and \(c_\ell =b_R\). In addition, for each \(1\le k\le \ell \), \(Q_R=\bigcup _{S\in \mathcal C_k}Q_S\). To see this, note that if \(\mathbf x=(x_1,\ldots ,x_{N+1})\in R\) and \(x_{N+1}\in (c_{k-1},c_k)\), then \(S\ni \mathbf x\implies [c_{k-1},c_k]\subseteq [a_S,b_S]\) and therefore that \(S\in \mathcal C_k\). Hence, by the induction hypothesis, \(|Q_R|\le \sum _{S\in \mathcal C_k}|Q_S|\) for each \(1\le k\le \ell \), and therefore

$$\begin{aligned}\sum _{S\in \mathcal C}|S|&=\sum _{S\in \mathcal C}|Q_S|\sum _{\{k:S\in \mathcal C_k\}}(c_k-c_{k-1})\\ {}&=\sum _{k=1}^\ell (c_k-c_{k-1})\sum _{S\in \mathcal C_k} |Q_S|\ge (b_R-a_R)|Q_R|=|R|.\end{aligned}$$

\(\square \)

Given a rectangle \(\prod _{j=1}^N[a_j,b_j]\), throughout this section \(\mathcal C\) will be a finite collection of non-overlapping, closed rectangles R whose union is \(\prod _{j=1}^N[a_j,b_j]\), and the mesh size \(\Vert \mathcal C\Vert \) will be \(\max \{\mathrm {diam}(R):\,R\in \mathcal C\}\), where the diameter \(\mathrm {diam}(R)\) of \(R=\prod _{j=1}^N[r_j,s_j]\) equals \(\sqrt{\sum _{j=1}^N(s_j-r_j)^2}\). For instance, \(\mathcal C\) might be obtained by subdividing each of the sides \([a_j,b_j]\) into n equal parts and taking \(\mathcal C\) to be the set of \(n^N\) rectangles

$$\prod _{j=1}^N\bigl [a_j+\tfrac{m_j-1}{n}(b_j-a_j),a_j+\tfrac{m_j}{n}(b_j-a_j)\bigr ] \quad \text {for }1\le m_1,\ldots ,m_N\le n.$$

Next, say that \({\varvec{\varXi }}:\mathcal C\longrightarrow {\mathbb {R}^N}\) is a choice function if \({\varvec{\varXi }}(R)\in R\) for each \(R\in \mathcal C\), and define the Riemann sum

$$\mathcal R(f;\mathcal C,{\varvec{\varXi }})=\sum _{R\in \mathcal C}f\bigl ({\varvec{\varXi }}(R)\bigr )|R|$$

for bounded functions \(f:\prod _{j=1}^N[a_j,b_j]\longrightarrow {\mathbb R}\). Again, we say that f is Riemann integrable if there exists a number \(I\in {\mathbb R}\) to which the Riemann sums \(\mathcal R(f;\mathcal C,{\varvec{\varXi }})\) converge, in the same sense as before, as \(\Vert \mathcal C\Vert \rightarrow 0\), in which case \(I\) is denoted by \(\int _{\prod _1^N[a_j,b_j]}f(\mathbf x)\,d\mathbf x\) and called the Riemann integral or just the integral of f on \(\prod _{j=1}^N[a_j,b_j]\).
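In computational terms, the uniform cover described above together with, say, the midpoint choice function already suffices to approximate integrals. The following minimal Python sketch (an illustration added here, not part of the original text; the test function is an arbitrary choice) computes such Riemann sums over a rectangle in \({\mathbb R}^N\):

```python
import math

def riemann_sum(f, a, b, n):
    """Midpoint Riemann sum of f over the rectangle prod_j [a_j, b_j],
    using the uniform cover by n^N congruent sub-rectangles."""
    N = len(a)
    h = [(b[j] - a[j]) / n for j in range(N)]   # side lengths of each sub-rectangle
    vol = math.prod(h)                          # common volume |R|
    total = 0.0
    for idx in range(n ** N):                   # enumerate multi-indices (m_1,...,m_N)
        m, x = idx, []
        for j in range(N):
            m, d = divmod(m, n)
            x.append(a[j] + (d + 0.5) * h[j])   # midpoint choice function Xi(R)
        total += f(x) * vol
    return total

# Example: the integral of exp(x1 + x2) over [0,1]^2 is (e - 1)^2.
f = lambda x: math.exp(x[0] + x[1])
for n in (4, 16, 64):
    print(n, riemann_sum(f, [0.0, 0.0], [1.0, 1.0], n), (math.e - 1) ** 2)
```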

There are no essentially new ideas needed to analyze when a function is Riemann integrable. As we did in Sect. 3.1, one introduces the upper and lower Riemann sums

$$\mathcal U(f;\mathcal C)=\sum _{R\in \mathcal C}\left( \sup _Rf\right) |R|\quad \text {and} \quad \mathcal L(f;\mathcal C)=\sum _{R\in \mathcal C}\left( \inf _Rf\right) |R|,$$

and, using the same reasoning as we did in the proof of Lemma 3.1.1, checks that \(\mathcal L(f;\mathcal C)\le \mathcal R(f;\mathcal C,{\varvec{\varXi }})\le \mathcal U(f;\mathcal C)\) for any \({\varvec{\varXi }}\) and \(\mathcal L(f;\mathcal C)\le \mathcal U(f;\mathcal C^{\prime })\) for any \(\mathcal C^{\prime }\). Further, one can show that for each \(\mathcal C\) and \(\epsilon >0\), there exists a \(\delta >0\) such that

$$\Vert \mathcal C^{\prime }\Vert <\delta \implies \mathcal U(f;\mathcal C^{\prime })\le \mathcal U(f;\mathcal C)+\epsilon \text { and }\mathcal L(f;\mathcal C^{\prime })\ge \mathcal L(f;\mathcal C)-\epsilon .$$

The proof that such a \(\delta \) exists is basically the same as, but somewhat more involved than, the corresponding one in Lemma 3.1.1. Namely, given \(\delta >0\) and a rectangle \(R=\prod _{j=1}^N[c_j,d_j]\in \mathcal C\), define \(R^-_k(\delta )\) and \(R^+_k(\delta )\) to be the rectangles

$$\left( \prod _{1\le j<k}[a_j,b_j]\right) \times \bigl [a_k\vee (c_k-\delta ),b_k\wedge (c_k+\delta )\bigr ]\times \left( \prod _{k<j\le N}[a_j,b_j]\right) $$

and

$$\left( \prod _{1\le j<k}[a_j,b_j]\right) \times \bigl [a_k\vee (d_k-\delta ),b_k\wedge (d_k+\delta )\bigr ]\times \left( \prod _{k<j\le N}[a_j,b_j]\right) $$

for \(1\le k\le N\), with the understanding that the first factor is absent if \(k=1\) and the last factor is absent if \(k=N\). Now suppose that \(\Vert \mathcal C^{\prime }\Vert <\delta \) and \(R^{\prime }\in \mathcal C^{\prime }\). Then either \(R^{\prime }\subseteq R\) for some \(R\in \mathcal C\) or there is a \(1\le k\le N\) and an \(R\in \mathcal C\) such that the interior of the kth side of \(R^{\prime }\) contains one of the endpoints of the kth side of R, in which case \(R^{\prime }\subseteq R^-_k(\delta )\cup R^+_k(\delta )\). Thus, if \(\mathcal D\) is the set of \(R^{\prime }\in \mathcal C^{\prime }\) that are not contained in any \(R\in \mathcal C\), then, because \(\sup _{R^{\prime }}f\le \sup _Rf\) if \(R^{\prime }\subseteq R\), one can use Lemma 5.1.1 to see that

$$\begin{aligned}\mathcal U(f;\mathcal C^{\prime })&-\mathcal U(f;\mathcal C)=\sum _{R^{\prime }\in \mathcal C^{\prime }}\sum _{R\in \mathcal C}\left( \sup _{R^{\prime }}f-\sup _Rf\right) |R^{\prime }\cap R|\\&\le \sum _{R^{\prime }\in \mathcal D}\sum _{R\in \mathcal C}\left( \sup _{R^{\prime }}f-\sup _Rf\right) |R^{\prime }\cap R|\le 2\Vert f\Vert _{\prod _1^N[a_j,b_j]}\sum _{R^{\prime }\in \mathcal D}|R^{\prime }|\\ {}&\le 2\Vert f\Vert _{\prod _1^N[a_j,b_j]}\sum _{k=1}^N\sum _{R\in \mathcal C} \bigl (|R^-_k(\delta )|+|R^+_k(\delta )|\bigr ).\end{aligned}$$

Since \(|R^\pm _k(\delta )|\le \delta \prod _{j\ne k}(b_j-a_j)\), it follows that there exists a constant \(A<\infty \) such that \(\mathcal U(f;\mathcal C^{\prime })\le \mathcal U(f;\mathcal C)+A\delta \) if \(\Vert \mathcal C^{\prime }\Vert <\delta \).

With these preparations, we now have the following analog of Theorem 3.1.2. However, before stating the result, we need to make another definition. Namely, we will say that a subset \(\varGamma \) of the rectangle \(\prod _{j=1}^N[a_j,b_j]\) is Riemann negligible if, for each \(\epsilon >0\) there is a \(\mathcal C\) such that

$$\sum _{\begin{array}{c} R\in \mathcal C\\ \varGamma \cap R\ne \emptyset \end{array}}|R|<\epsilon .$$

Riemann negligible sets will play an important role in our considerations.

Theorem 5.1.2

Let \(f:\prod _{j=1}^N[a_j,b_j]\longrightarrow {\mathbb C}\) be a bounded function. Then f is Riemann integrable if and only if for each \(\epsilon >0\) there is a \(\mathcal C\) such that

$$\sum _{\begin{array}{c} R\in \mathcal C\\ \sup _Rf-\inf _Rf\ge \epsilon \end{array}}|R|<\epsilon .$$

In particular, f is Riemann integrable if it is continuous off of a Riemann negligible set. Finally, if f is Riemann integrable and takes all its values in a compact set \(K\subseteq {\mathbb C}\) and \(\varphi :K\longrightarrow {\mathbb C}\) is continuous, then \(\varphi \circ f\) is Riemann integrable.

Proof

Except for the one that says f is Riemann integrable if it is continuous off of a Riemann negligible set, all these assertions are proved in exactly the same way as the analogous statements in Theorem 3.1.2.

Now suppose that f is continuous off of the Riemann negligible set \(\varGamma \). Given \(\epsilon >0\), choose \(\mathcal C\) so that \(\sum _{R\in \mathcal D}|R|<\epsilon \), where \(\mathcal D=\{R\in \mathcal C:\,R\cap \varGamma \ne \emptyset \}\). Then \(K=\bigcup _{R\in \mathcal C{\setminus } \mathcal D}R\) is a compact set on which f is continuous. Hence, we can find a \(\delta >0\) such that \(|f(y)-f(x)|<\epsilon \) for all \(x,\,y\in K\) with \(|y-x|\le \delta \). Finally, subdivide each \(R\in \mathcal C{\setminus } \mathcal D\) into rectangles of diameter less than \(\delta \), and take \(\mathcal C^{\prime }\) to be the cover consisting of the elements of \(\mathcal D\) and the sub-rectangles into which the elements of \(\mathcal C{\setminus }\mathcal D\) were subdivided. Then

$$\sum _{\begin{array}{c} R^{\prime }\in \mathcal C^{\prime }\\ \sup _{R^{\prime }}f-\inf _{R^{\prime }}f\ge \epsilon \end{array}}|R^{\prime }|\le \sum _{R\in \mathcal D}|R|<\epsilon .$$

\(\square \)

We now have the basic facts about Riemann integration in \({\mathbb R}^N\), and from them follow the Riemann integrability of linear combinations and products of bounded Riemann integrable functions as well as the obvious analogs of (3.1.1), (3.1.5), (3.1.4), and Theorem 3.1.4. The replacement for (3.1.2) is

$$\begin{aligned} \int _{\prod _1^N[\lambda a_j,\lambda b_j]}f(\mathbf x)\,d\mathbf x=\lambda ^N\int _{\prod _1^N[a_j,b_j]}f(\lambda \mathbf x)\,d\mathbf x\end{aligned}$$
(5.1.1)

for bounded, Riemann integrable functions on \(\prod _{j=1}^N[\lambda a_j,\lambda b_j]\). It is also useful to note that Riemann integration is translation invariant in the sense that if f is a bounded, Riemann integrable function on \(\prod _{j=1}^N[c_j+a_j,c_j+b_j]\) for some \(\mathbf c=(c_1,\ldots ,c_N)\in {\mathbb {R}^N}\), then \(\mathbf x\rightsquigarrow f(\mathbf c+\mathbf x)\) is Riemann integrable on \(\prod _{j=1}^N[a_j,b_j]\) and

$$\begin{aligned} \int _{\prod _{1}^N[c_j+a_j,c_j+b_j]}f(\mathbf x)\,d\mathbf x=\int _{\prod _1^N[a_j,b_j]}f(\mathbf c+\mathbf x)\,d\mathbf x,\end{aligned}$$
(5.1.2)

a property that follows immediately from the corresponding fact for Riemann sums. In addition, by the same procedure as we used in Sect. 3.1, we can extend the definition of the Riemann integral to cover situations in which either the integrand f or the region over which the integration is performed is unbounded. Thus, for example, if f is a function that is bounded and Riemann integrable on bounded rectangles, then one defines

$$\int _{\mathbb {R}^N}f(\mathbf x)\,d\mathbf x=\lim _{\begin{array}{c} a_1\vee \cdots \vee a_N\rightarrow -\infty \\ b_1\wedge \cdots \wedge b_N\rightarrow \infty \end{array}}\int _{\prod _1^N[a_j,b_j]}f(\mathbf x)\,d\mathbf x$$

if the limit exists.

5.2 Iterated Integrals and Fubini’s Theorem

Evaluating integrals in N variables is hard and usually possible only if one can reduce the computation to integrals in one variable. One way to make such a reduction is to write an integral in N variables as N iterated integrals in one variable, one for each dimension, and the following theorem, known as Fubini’s Theorem, shows this can be done. In its statement, if \(\mathbf x=(x_1,\ldots ,x_N)\in {\mathbb {R}^N}\) and \(1\le M<N\), then \(\mathbf x^{(M)}_1\equiv (x_1,\ldots ,x_M)\) and \(\mathbf x^{(M)}_2\equiv (x_{M+1},\ldots ,x_N)\).

Theorem 5.2.1

Suppose that \(f:\prod _{j=1}^N[a_j,b_j]\longrightarrow {\mathbb C}\) is a bounded, Riemann integrable function. Further, for some \(1\le M<N\) and each \(\mathbf x^{(M)}_2\in \prod _{j=M+1}^N[a_j,b_j]\), assume that \(\mathbf x^{(M)}_1\in \prod _{j=1}^M[a_j,b_j]\longmapsto f(\mathbf x^{(M)}_1,\mathbf x^{(M)}_2)\in {\mathbb C}\) is Riemann integrable. Then

$$\mathbf x^{(M)}_2\in \prod _{j=M+1}^N[a_j,b_j]\longmapsto f^{(M)}_1(\mathbf x^{(M)}_2) \equiv \int _{\prod _1^M[a_j,b_j]}f(\mathbf x^{(M)}_1,\mathbf x^{(M)}_2)\,d\mathbf x^{(M)}_1$$

is Riemann integrable and

$$\int _{\prod _1^N[a_j,b_j]}f(\mathbf x)\,d\mathbf x=\int _{\prod _{M+1}^N[a_j,b_j]}f^{(M)}_1(\mathbf x^{(M)}_2)\,d\mathbf x^{(M)}_2.$$

In particular, this result applies if f is a bounded, Riemann integrable function with the property that, for each \(\mathbf x^{(M)}_2\in \prod _{j=M+1}^N[a_j,b_j]\), \(\mathbf x^{(M)}_1\in \prod _{j=1}^M[a_j,b_j]\longmapsto f(\mathbf x^{(M)}_1,\mathbf x^{(M)}_2)\in {\mathbb C}\) is continuous at all but a Riemann negligible set of points.

Proof

Given \(\epsilon >0\), choose \(\delta >0\) so that

$$\Vert \mathcal C\Vert <\delta \implies \left| \int _{\prod _1^N[a_j,b_j]}f(\mathbf x)\,d\mathbf x-\mathcal R(f;\,\mathcal C,{\varvec{\varXi }})\right| <\epsilon $$

for every choice function \({\varvec{\varXi }}\). Next, let \(\mathcal C^{(M)}_2\) be a cover of \(\prod _{j=M+1}^N[a_j,b_j]\) with \(\Vert \mathcal C^{(M)}_2\Vert <\frac{\delta }{2}\), and let \({\varvec{\varXi }}^{(M)}_2\) be an associated choice function. Finally, because \(\mathbf x^{(M)}_1\rightsquigarrow f\bigl (\mathbf x_1^{(M)},{\varvec{\varXi }}^{(M)}_2(R_2)\bigr )\) is Riemann integrable for each \(R_2\in \mathcal C_2^{(M)}\), we can choose a cover \(\mathcal C^{(M)}_1\) of \(\prod _{j=1}^M[a_j,b_j]\) with \(\Vert \mathcal C^{(M)}_1\Vert <\frac{\delta }{2}\) and an associated choice function \({\varvec{\varXi }}^{(M)}_1\) such that

$$\sum _{R_2\in \mathcal C^{(M)}_2}\left| \sum _{R_1\in \mathcal C^{(M)}_1}f\bigl ({\varvec{\varXi }}^{(M)}_1(R_1),{\varvec{\varXi }}^{(M)}_2(R_2)\bigr )|R_1|-f^{(M)}_1\bigl ({\varvec{\varXi }}^{(M)}_2(R_2)\bigr )\right| |R_2|<\epsilon .$$

If

$$ \mathcal C=\{R_1\times R_2:\,R_1\in \mathcal C^{(M)}_1\; \& \;R_2\in \mathcal C^{(M)}_2\}$$

and \({\varvec{\varXi }}\bigl (R_1\times R_2\bigr )=\bigl ({\varvec{\varXi }}^{(M)}_1(R_1),{\varvec{\varXi }}^{(M)}_2(R_2)\bigr )\), then \(\Vert \mathcal C\Vert <\delta \) and

$$\mathcal R(f;\mathcal C,{\varvec{\varXi }})=\sum _{R_1\in \mathcal C^{(M)}_1}\sum _{R_2\in \mathcal C^{(M)}_2}f\bigl ({\varvec{\varXi }}^{(M)}_1(R_1),{\varvec{\varXi }}^{(M)}_2(R_2)\bigr )|R_1||R_2|,$$

and so

$$\begin{aligned}&\left| \int _{\prod _1^N[a_j,b_j]}f(\mathbf x)\,d\mathbf x-\mathcal R(f^{(M)}_1;\mathcal C^{(M)}_2,{\varvec{\varXi }}^{(M)}_2)\right| \\ {}&\le \left| \int _{\prod _1^N[a_j,b_j]}f(\mathbf x)\,d\mathbf x-\mathcal R(f;\mathcal C,{\varvec{\varXi }})\right| \\ {}&+\sum _{R_2\in \mathcal C^{(M)}_2}\left| \sum _{R_1\in \mathcal C^{(M)}_1}f\bigl ({\varvec{\varXi }}^{(M)}_1(R_1),{\varvec{\varXi }}^{(M)}_2 (R_2)\bigr )|R_1|-f^{(M)}_1\bigl ({\varvec{\varXi }}^{(M)}_2(R_2)\bigr )\right| |R_2|\end{aligned}$$

is less than \(2\epsilon \). Hence, \(\mathcal R\bigl (f^{(M)}_1;\mathcal C^{(M)}_2,{\varvec{\varXi }}^{(M)}_2\bigr )\) converges to \(\int _{\prod _1^N[a_j,b_j]}f(\mathbf x)\,d\mathbf x\) as \(\Vert \mathcal C^{(M)}_2\Vert \rightarrow 0\).\(\square \)

It should be clear that the preceding result holds equally well when the roles of \(\mathbf x^{(M)}_1\) and \(\mathbf x^{(M)}_2\) are reversed. Thus, if \(f:\prod _{j=1}^N[a_j,b_j]\longrightarrow {\mathbb C}\) is a bounded, Riemann integrable function such that \(\mathbf x^{(M)}_1\in \prod _{j=1}^M[a_j,b_j]\longmapsto f(\mathbf x^{(M)}_1,\mathbf x^{(M)}_2)\in {\mathbb C}\) is Riemann integrable for each \(\mathbf x^{(M)}_2\in \prod _{j=M+1}^N[a_j,b_j]\) and \(\mathbf x^{(M)}_2\in \prod _{j=M+1}^N[a_j,b_j]\longmapsto f(\mathbf x^{(M)}_1,\mathbf x^{(M)}_2)\) is Riemann integrable for each \(\mathbf x^{(M)}_1\in \prod _{j=1}^M[a_j,b_j]\), then

$$\begin{aligned} \int _{\prod _1^N[a_j,b_j]}f(\mathbf x)\,d\mathbf x={\left\{ \begin{array}{ll} \int _{\prod _{M+1}^N[a_j,b_j]}f^{(M)}_1(\mathbf x^{(M)}_2)\,d\mathbf x^{(M)}_2\\ \int _{\prod _{1}^M[a_j,b_j]}f^{(M)}_2(\mathbf x^{(M)}_1)\,d\mathbf x^{(M)}_1,\end{array}\right. }\end{aligned}$$
(5.2.1)

where

$$f^{(M)}_1(\mathbf x^{(M)}_2)=\int _{\prod _{1}^M[a_j,b_j]}f(\mathbf x^{(M)}_1,\mathbf x^{(M)}_2)\,d\mathbf x^{(M)}_1$$

and

$$f^{(M)}_2(\mathbf x^{(M)}_1)=\int _{\prod _{M+1}^N[a_j,b_j]}f(\mathbf x^{(M)}_1,\mathbf x^{(M)}_2)\,d\mathbf x^{(M)}_2.$$

Corollary 5.2.2

Let f be a continuous function on \(\prod _{j=1}^N[a_j,b_j]\). Then for each \(1\le M<N\),

$$\mathbf x^{(M)}_2\in \prod _{j=M+1}^N[a_j,b_j]\longmapsto f^{(M)}_1(\mathbf x^{(M)}_2)\equiv \int _{\prod _1^M[a_j,b_j]}f(\mathbf x^{(M)}_1,\mathbf x^{(M)}_2)\,d\mathbf x^{(M)}_1\in {\mathbb C}$$

is continuous. Furthermore,

$$f^{(M+1)}_1(\mathbf x^{(M+1)}_2)=\int _{[a_{M+1},b_{M+1}]}f^{(M)}_1(x_{M+1},\mathbf x^{(M+1)}_2)\,dx_{M+1}\quad \text {for } 1\le M<N-1$$

and

$$\int _{\prod _1^N[a_j,b_j]}f(\mathbf x)\,d\mathbf x=\int _{[a_N,b_N]}f^{(N-1)}_1(x_N)\,dx_N.$$

Proof

Once the first assertion is proved, the others follow immediately from Theorem 5.2.1. But, because f is uniformly continuous, the first assertion follows from the obvious higher dimensional analog of Theorem 3.1.4.\(\square \)

By repeated applications of Corollary 5.2.2, one sees that

$$\begin{aligned}\begin{aligned}&\int _{\prod _{j=1}^N[a_j,b_j]}f(\mathbf x)\,d\mathbf x\\ {}&=\int _{a_N}^{b_N}\left( \cdots \left( \int _{a_1}^{b_1}f(x_1,\ldots ,x_{N-1},x_N)\,dx_1\right) \cdots \right) dx_N.\end{aligned} \end{aligned}$$

The expression on the right is called an iterated integral. Of course, there is nothing sacrosanct about the order in which one does the integrals. Thus

$$\begin{aligned} \begin{aligned}&\int _{\prod _{j=1}^N[a_j,b_j]}f(\mathbf x)\,d\mathbf x\\ {}&= \int \limits _{a_{\pi (N)}}^{b_{\pi (N)}}\left( \cdots \left( \int \limits _{a_{\pi (1)}}^{b_{\pi (1)}}f(x_1,\ldots ,x_{N-1},x_N)\,dx_{\pi (1)}\right) \cdots \right) dx_{\pi (N)}\end{aligned}\end{aligned}$$
(5.2.2)

for any permutation \(\pi \) of \(\{1,\ldots ,N\}\). In that it shows integrals in N variables can be evaluated by doing N integrals in one variable, (5.2.2) makes it possible to bring Theorem 3.2.1 to bear on the problem. However, it is hard enough to find one indefinite integral on \({\mathbb R}\), much less a succession of N of them. Nonetheless, there is an important consequence of (5.2.2). Namely, if \(f(\mathbf x)=\prod _{j=1}^Nf_j(x_j)\), where, for each \(1\le j\le N\), \(f_j\) is a continuous function on \([a_j,b_j]\), then

$$\begin{aligned} \int _{\prod _1^N[a_j,b_j]}f(\mathbf x)\,d\mathbf x=\prod _{j=1}^N\int _{[a_j,b_j]}f_j(x_j)\,dx_j.\end{aligned}$$
(5.2.3)

In fact, starting from Theorem 5.2.1, it is easy to check that (5.2.3) holds when each \(f_j\) is bounded and Riemann integrable.
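As a numerical illustration of (5.2.3) (a sketch added here, not from the original text; the factors \(\cos x_1\) and \(e^{x_2}\) are arbitrary choices), one can compare a two-dimensional midpoint Riemann sum with the product of the corresponding one-dimensional integrals:

```python
import math

def midpoint_1d(f, a, b, n=2000):
    """Midpoint-rule approximation of the 1-d Riemann integral of f on [a,b]."""
    h = (b - a) / n
    return sum(f(a + (k + 0.5) * h) for k in range(n)) * h

# f(x1, x2) = cos(x1) * exp(x2) on [0,1]^2: by (5.2.3) the 2-d integral
# should equal the product of the two 1-d integrals, sin(1) * (e - 1).
prod_of_1d = midpoint_1d(math.cos, 0.0, 1.0) * midpoint_1d(math.exp, 0.0, 1.0)

n = 400
h = 1.0 / n
two_d = sum(math.cos((i + 0.5) * h) * math.exp((j + 0.5) * h)
            for i in range(n) for j in range(n)) * h * h
print(prod_of_1d, two_d, math.sin(1.0) * (math.e - 1.0))
```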

Looking at (5.2.2), one might be tempted to think that there is an analog of the Fundamental Theorem of Calculus for integrals in several variables. Namely, taking \(\pi \) to be the identity permutation that leaves the order unchanged and thinking of the expression on the right as a function F of \((b_1,\ldots ,b_N)\), it becomes clear that \(\partial _{\mathbf e_1}\ldots \partial _{\mathbf e_N}F=f\). However, what made this information valuable when \(N=1\) is the fact that a function on \({\mathbb R}\) can be recovered, up to an additive constant, from its derivative, and that is why we could say that \(F(b)-F(a)=\int _a^bf(x)\,dx\) for any F satisfying \(F^{\prime }=f\). When \(N\ge 2\), the equality \(\partial _{\mathbf e_1}\ldots \partial _{\mathbf e_N}F=f\) provides much less information. Indeed, even when \(N=2\), if F satisfies \(\partial _{\mathbf e_1}\partial _{\mathbf e_2}F=f\), then so does \(F(x_1,x_2)+F_1(x_1)+F_2(x_2)\) for any choice of differentiable functions \(F_1\) and \(F_2\), and the ambiguity gets worse as N increases. Thus finding an F that satisfies \(\partial _{\mathbf e_1}\ldots \partial _{\mathbf e_N}F=f\) does little to advance one toward finding the integral of f.

To provide an interesting example of the way in which Fubini’s Theorem plays an important role, define Euler’s Beta function \(B:(0,\infty )^2\longrightarrow (0,\infty )\) by

$$\begin{aligned}B(\alpha ,\beta )=\int _{(0,1)}x^{\alpha -1}(1-x)^{\beta -1}\,dx.\end{aligned}$$

It turns out that the Beta function is intimately related to his Gamma function (cf. Exercise 3.3). In fact,

$$\begin{aligned} B(\alpha ,\beta )=\frac{\varGamma (\alpha )\varGamma (\beta )}{\varGamma (\alpha +\beta )},\end{aligned}$$
(5.2.4)

which means that \(\frac{1}{B(\alpha ,\beta )}\) is closely related to the binomial coefficients in the same sense that \(\varGamma (t)\) is related to factorials. Although (5.2.4) holds for all \((\alpha ,\beta )\in (0,\infty )^2\), in order to avoid distracting technicalities, we will prove it only for \((\alpha ,\beta )\in [1,\infty )^2\). Thus let \(\alpha ,\,\beta \ge 1\) be given. Then, by (5.2.3) and (5.2.1), \(\varGamma (\alpha )\varGamma (\beta )\) equals

$$\begin{aligned}\lim _{r\rightarrow \infty }&\int _{[0,r]^2}x_1^{\alpha -1}x_2^{\beta -1}e^{-x_1-x_2}\,d\mathbf x\\&=\lim _{r\rightarrow \infty }\int _0^r x_2^{\beta -1}\left( \int _0^r x_1^{\alpha -1}e^{-(x_1+x_2)}\,dx_1\right) dx_2.\end{aligned}$$

By (5.1.2),

$$\int _0^r x_1^{\alpha -1}e^{-(x_1+x_2)}\,dx_1=\int _{x_2}^{r+x_2}(y_1-x_2)^{\alpha -1}e^{-y_1}\,dy_1,$$

and so

$$\begin{aligned}\int _0^r&x_2^{\beta -1}\left( \int _0^r x_1^{\alpha -1}e^{-(x_1+x_2)}\,dx_1\right) dx_2\\ {}&=\int _0^rx_2^{\beta -1}\left( \int _{x_2}^{r+x_2}(y_1-x_2)^{\alpha -1}e^{-y_1}\,dy_1\right) dx_2.\end{aligned}$$

Now consider the function

$$ f(y_1,x_2)={\left\{ \begin{array}{ll}(y_1-x_2)^{\alpha -1}x_2^{\beta -1}e^{-y_1}&{}\text {if }x_2\in [0,r]\; \& \;x_2\le y_1\le r+x_2\\ 0&{}\text {otherwise}\end{array}\right. }$$

on \([0,2r]\times [0,r]\). Because the only discontinuities of f lie in the Riemann negligible set \(\{(x_2,x_2):\,x_2\in [0,r]\}\cup \{(r+x_2,x_2):\,x_2\in [0,r]\}\), it is Riemann integrable on \([0,2r]\times [0,r]\). In addition, for each \(y_1\in [0,2r]\), \(x_2\rightsquigarrow f(y_1,x_2)\) and, for each \(x_2\in [0,r]\), \(y_1\rightsquigarrow f(y_1,x_2)\) have at most two discontinuities. We can therefore apply (5.2.1) to justify

$$\begin{aligned}&\int _0^rx_2^{\beta -1}\left( \int _{x_2}^{r+x_2}(y_1-x_2)^{\alpha -1}e^{-y_1}\,dy_1\right) dx_2= \int _0^r\left( \int _0^{2r}f(y_1,x_2)\,dy_1\right) dx_2 \\&=\int _0^{2r}\left( \int _0^{r}f(y_1,x_2)\,dx_2\right) dy_1=\int _0^{2r} e^{-y_1}\left( \int \limits _{(y_1-r)^+}^{r\wedge y_1}(y_1-x_2)^{\alpha -1}x_2^{\beta -1}\,dx_2\right) dy_1.\end{aligned}$$

Further, by (3.1.2)

$$\int _{(y_1-r)^+}^{r\wedge y_1}(y_1-x_2)^{\alpha -1}x_2^{\beta -1}\,dx_2=y_1^{\alpha +\beta -1}\int _{(1-y_1^{-1}r)^+}^{1\wedge (y_1^{-1}r)}(1-y_2)^{\alpha -1}y_2^{\beta -1}\,dy_2.$$

Collecting these together, we have

$$\varGamma (\alpha )\varGamma (\beta )=\lim _{r\rightarrow \infty }\int _0^{2r}y_1^{\alpha +\beta -1}e^{-y_1}\left( \int _{(1-y_1^{-1}r)^+}^{1\wedge (y_1^{-1}r)}(1-y_2)^{\alpha -1}y_2^{\beta -1}\,dy_2\right) dy_1.$$

Finally,

$$\begin{aligned}\int _0^{2r}&y_1^{\alpha +\beta -1}e^{-y_1}\left( \int _{(1-y_1^{-1}r)^+}^{1\wedge (y_1^{-1}r)}(1-y_2)^{\alpha -1}y_2^{\beta -1}\,dy_2\right) dy_1\\ {}&= B(\alpha ,\beta )\int _0^{r}y_1^{\alpha +\beta -1}e^{-y_1}\,dy_1\\&+ \int _r^{2r}y_1^{\alpha +\beta -1}e^{-y_1}\left( \int _{(1-y_1^{-1}r)}^{y_1^{-1}r}(1-y_2)^{\alpha -1}y_2^{\beta -1}\,dy_2\right) dy_1,\end{aligned}$$

and, as \(r\rightarrow \infty \), the first term on the right tends to \(\varGamma (\alpha +\beta )\) whereas the second term is dominated by \(\int _r^\infty y_1^{\alpha +\beta -1}e^{-y_1}\,dy_1\) and therefore tends to 0.

The preceding computation illustrates one of the trickier aspects of proper applications of Fubini’s Theorem. When one reverses the order of integration, it is very important to determine the correct limits of integration for the new order. As in the application above, the correct limits can look very different after the order of integration is changed.
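Before moving on, here is a quick numerical sanity check of (5.2.4) (an added sketch, not part of the original text): approximate \(B(\alpha ,\beta )\) by a midpoint rule and compare it with the Gamma ratio, with \(\varGamma \) supplied by Python’s math.gamma; the test values of \((\alpha ,\beta )\) are arbitrary choices in \([1,\infty )^2\).

```python
import math

def beta_numeric(alpha, beta, n=100000):
    """Midpoint-rule approximation of B(alpha, beta); for alpha, beta >= 1
    the integrand x^(alpha-1) (1-x)^(beta-1) is bounded on [0,1]."""
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        x = (k + 0.5) * h
        total += x ** (alpha - 1) * (1 - x) ** (beta - 1)
    return total * h

for a, b in [(1.0, 1.0), (2.0, 3.0), (1.5, 2.5)]:
    print(a, b, beta_numeric(a, b),
          math.gamma(a) * math.gamma(b) / math.gamma(a + b))
```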

Equation (5.2.4) provides a proof of Stirling’s formula for the Gamma function as a consequence of (1.8.7). Indeed, by (5.2.4), \(\varGamma (n+1+\theta )=\frac{n!\varGamma (\theta )}{B(n+1,\theta )}\) for \(n\in \mathbb Z^+\) and \(\theta \in [1,2)\), and, by (3.1.2),

$$B(n+1,\theta )=n^{-\theta }\int _0^ny^{\theta -1}\bigl (1-\tfrac{y}{n}\bigr )^n\,dy.$$

Further, because \(1-x\le e^{-x}\) for all \(x\in {\mathbb R}\),

$$\int _0^ny^{\theta -1}\bigl (1-\tfrac{y}{n}\bigr )^n\,dy\le \int _0^\infty y^{\theta -1}e^{-y}\,dy=\varGamma (\theta ),$$

and, for all \(r>0\),

$$\varliminf _{n\rightarrow \infty }\int _0^ny^{\theta -1}\bigl (1-\tfrac{y}{n}\bigr )^n\,dy\ge \varliminf _{n\rightarrow \infty } \int _0^ry^{\theta -1}\bigl (1-\tfrac{y}{n}\bigr )^n\,dy=\int _0^ry^{\theta -1}e^{-y}\,dy.$$

Since the final expression tends to \(\varGamma (\theta )\) as \(r\rightarrow \infty \) uniformly fast for \(\theta \in [1,2]\), we now know that

$$\frac{\varGamma (\theta )}{n^\theta B(n+1,\theta )}\longrightarrow 1$$

uniformly fast for \(\theta \in [1,2]\). Combining this with (1.8.7) we see that

$$\lim _{n\rightarrow \infty }\frac{\varGamma (n+\theta +1)}{\sqrt{2\pi n}\left( \frac{n}{e}\right) ^nn^\theta }=1$$

uniformly fast for \(\theta \in [1,2]\). Given \(t\ge 3\), determine \(n_t\in \mathbb Z^+\) and \(\theta _t\in [1,2)\) so that \(t=n_t+\theta _t\). Then the preceding says that

$$\lim _{t\rightarrow \infty }\frac{\varGamma (t+1)}{\sqrt{2\pi t}\left( \tfrac{t}{e}\right) ^t}\sqrt{\frac{t}{t-\theta _t}}\left( \frac{t}{t-\theta _t}\right) ^te^{-\theta _t}=1.$$

Finally, it is obvious that, as \(t\rightarrow \infty \), \(\sqrt{\frac{t}{t-\theta _t}}\) tends to 1 and, because, by (1.7.5),

$$\log \left( \left( \frac{t}{t-\theta _t}\right) ^te^{-\theta _t}\right) =-t\log \left( 1-\tfrac{\theta _t}{t}\right) -\theta _t\longrightarrow 0,$$

so does \(\left( \frac{t}{t-\theta _t}\right) ^te^{-\theta _t}\). Hence we have shown that

$$\begin{aligned} \varGamma (t+1)\sim \sqrt{2\pi t}\left( \frac{t}{e}\right) ^t\quad \text {as } t\rightarrow \infty \end{aligned}$$
(5.2.5)

in the sense that \(\lim _{t\rightarrow \infty }\frac{\varGamma (t+1)}{\sqrt{2\pi t}\left( \tfrac{t}{e}\right) ^t}=1.\)
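A short numerical check of (5.2.5) (an added sketch, not in the original): using math.lgamma to avoid overflow, the ratio \(\varGamma (t+1)/\sqrt{2\pi t}\,\bigl (\tfrac{t}{e}\bigr )^t\) is seen to approach 1.

```python
import math

# Ratio Gamma(t+1) / (sqrt(2 pi t) (t/e)^t); computed with logarithms (lgamma)
# so that nothing overflows even for large t.
for t in (10.0, 100.0, 1000.0, 10000.0):
    log_ratio = math.lgamma(t + 1.0) - (0.5 * math.log(2.0 * math.pi * t)
                                        + t * (math.log(t) - 1.0))
    print(t, math.exp(log_ratio))   # tends to 1 as t grows
```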

5.3 Volume of and Integration Over Sets

We motivated our initial discussion of integration by computing the area under the graph of a non-negative function, and as we will see in this section, integration provides a method for computing the volume of more general regions. However, before we begin, we must first be more precise about what we will mean by the volume of a region.

Although we do not know yet what the volume of a general set \(\varGamma \) is, we know a few properties that volume should possess. In particular, we know that the volume of a subset should be no larger than that of the set containing it. In addition, volume should be additive in the sense that the volume of the union of disjoint sets should be the sum of their volumes. Taking these comments into account, for a given bounded set \(\varGamma \subseteq {\mathbb {R}^N}\), we define the exterior volume \(|\varGamma |_{\mathrm e}\) of \(\varGamma \) to be the infimum of the sums \(\sum _{R\in \mathcal C}|R|\) as \(\mathcal C\) runs over all finite collections of non-overlapping rectangles whose union contains \(\varGamma \). Similarly, define the interior volume \(|\varGamma |_{\mathrm i}\) to be the supremum of the sums \(\sum _{R\in \mathcal C}|R|\) as \(\mathcal C\) runs over finite collections of non-overlapping rectangles each of which is contained in \(\varGamma \). Clearly the notion of exterior volume is consistent with the properties that we want volume to have. To see that the same is true of interior volume, note that an equivalent description would have been that \(|\varGamma |_{\mathrm i}\) is the supremum of \(\sum _{R\in \mathcal C}|R|\) as \(\mathcal C\) runs over finite collections of rectangles that are mutually disjoint and each of which is contained in \(\varGamma \). Indeed, given a \(\mathcal C\) of the sort in the definition of interior volume, shrink the sides of each \(R\in \mathcal C\) with \(|R|>0\) by a factor \(\theta \in (0,1)\) and eliminate the ones with \(|R|=0\). The resulting rectangles will be mutually disjoint and the sum of their volumes will be \(\theta ^N\) times that of the original ones. Hence, by taking \(\theta \) close enough to 1, we can get arbitrarily close to the original sum.
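To make the two quantities concrete, here is a small Python sketch (added for illustration, not part of the original text) that brackets the area of the closed unit disk in \({\mathbb R}^2\) between a sum over grid squares contained in the disk (a lower bound for the interior volume) and a sum over grid squares meeting it (an upper bound for the exterior volume):

```python
# Bracket the area of the closed unit disk D in R^2 between grid-square sums.
def disk_bounds(n):
    h = 2.0 / n                              # uniform grid on [-1,1]^2
    inner = outer = 0
    for i in range(n):
        for j in range(n):
            xs = (-1.0 + i * h, -1.0 + (i + 1) * h)
            ys = (-1.0 + j * h, -1.0 + (j + 1) * h)
            # farthest corner inside D  =>  the whole square is inside D
            if max(x * x + y * y for x in xs for y in ys) <= 1.0:
                inner += 1
            # nearest point of the square inside D  =>  the square meets D
            nx = 0.0 if xs[0] * xs[1] <= 0 else min(abs(xs[0]), abs(xs[1]))
            ny = 0.0 if ys[0] * ys[1] <= 0 else min(abs(ys[0]), abs(ys[1]))
            if nx * nx + ny * ny <= 1.0:
                outer += 1
    return inner * h * h, outer * h * h

for n in (10, 50, 250):
    print(n, disk_bounds(n))                 # both bounds approach pi
```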

Obviously \(\varGamma \) is Riemann negligible if and only if \(|\varGamma |_{\mathrm e}=0\). Next notice that \(|\varGamma |_{\mathrm i}\le |\varGamma |_{\mathrm e}\) for all bounded \(\varGamma \)’s. Indeed, suppose that \(\mathcal C_1\) is a finite collection of non-overlapping rectangles contained in \(\varGamma \) and that \(\mathcal C_2\) is a finite collection of rectangles whose union contains \(\varGamma \). Then, by Lemma 5.1.1,

$$\sum _{R_2\in \mathcal C_2}|R_2|\ge \sum _{R_2\in \mathcal C_2}\sum _{R_1\in \mathcal C_1}|R_1\cap R_2|=\sum _{R_1\in \mathcal C_1}\sum _{R_2\in \mathcal C_2}|R_1\cap R_2| \ge \sum _{R_1\in \mathcal C_1}|R_1|.$$

In addition, it is easy to check that

$$\begin{aligned}&|\varGamma _{\!1}|_{\mathrm e}\le |\varGamma _{\!2}|_{\mathrm e}\text { and } |\varGamma _{\!1}|_{\mathrm i}\le |\varGamma _{\!2}|_{\mathrm i}\quad \text {if }\varGamma _{\!1}\subseteq \varGamma _{\!2},\\&|\varGamma _{\!1}\cup \varGamma _{\!2}|_{\mathrm e}\le |\varGamma _{\!1}|_{\mathrm e}+|\varGamma _{\!2}|_{\mathrm e}\quad \text {for all }\varGamma _{\!1}\; \& \;\varGamma _{\!2},\\&\text {and } |\varGamma _{\!1}\cup \varGamma _{\!2}|_{\mathrm i}\ge |\varGamma _{\!1}|_{\mathrm i}+|\varGamma _{\!2}|_{\mathrm i}\quad \text {if } \varGamma _{\!1}\cap \varGamma _{\!2}=\emptyset . \end{aligned}$$

We will say that \(\varGamma \) is Riemann measurable if \(|\varGamma |_{\mathrm i}=|\varGamma |_{\mathrm e}\), in which case we will call \(\mathrm {vol}(\varGamma )\equiv |\varGamma |_{\mathrm e}\) the volume of \(\varGamma \). Clearly \(|\varGamma |_{\mathrm e}=0\) implies that \(\varGamma \) is Riemann measurable and \(\mathrm {vol}(\varGamma )=0\); in particular, Riemann negligible sets are Riemann measurable and have volume 0. In addition, if R is a rectangle, then taking \(\mathcal C=\{R\}\) in the definitions of exterior and interior volume shows that \(|R|_{\mathrm e}\le |R|\le |R|_{\mathrm i}\), and therefore R is Riemann measurable and \(\mathrm {vol}(R)=|R|\).

One suspects that these considerations are intimately related to Riemann integration, and the following theorem justifies that suspicion. In its statement and elsewhere, \({\mathbf 1}_\varGamma \) denotes the indicator function of a set \(\varGamma \). That is, \({\mathbf 1}_\varGamma (\mathbf x)\) is 1 if \(\mathbf x\in \varGamma \) and is 0 if \(\mathbf x\notin \varGamma \).

Theorem 5.3.1

Let \(\varGamma \) be a subset of \(\prod _{j=1}^N[a_j,b_j]\). Then \(\varGamma \) is Riemann measurable if and only if \({\mathbf 1}_\varGamma \) is Riemann integrable on \(\prod _{j=1}^N[a_j,b_j]\), in which case

$$\mathrm {vol}(\varGamma )=\int _{\prod _{j=1}^N[a_j,b_j]}{\mathbf 1}_\varGamma (\mathbf x)\,d\mathbf x.$$

Proof

First observe that, without loss in generality, we may assume that the collections \(\mathcal C\) entering the definitions of exterior and interior volume are subsets of non-overlapping covers of \(\prod _{j=1}^N[a_j,b_j]\).

Now suppose that \(\varGamma \) is Riemann measurable. Given \(\epsilon >0\), choose a non-overlapping cover \(\mathcal C_1\) of \(\prod _{j=1}^N[a_j,b_j]\) such that

$$\sum _{\begin{array}{c} R\in \mathcal C_1\\ R\cap \varGamma \ne \emptyset \end{array}}|R|\le \mathrm {vol}(\varGamma )+\tfrac{\epsilon }{2}.$$

Then

$$\mathcal U({\mathbf 1}_\varGamma ;\mathcal C_1)=\sum _{\begin{array}{c} R\in \mathcal C_1\\ R\cap \varGamma \ne \emptyset \end{array}}|R|\le \mathrm {vol}(\varGamma )+\tfrac{\epsilon }{2}.$$

Next, choose \(\mathcal C_2\) so that

$$\sum _{\begin{array}{c} R\in \mathcal C_2\\ R\subseteq \varGamma \end{array}}|R|\ge \mathrm {vol}(\varGamma )-\tfrac{\epsilon }{2},$$

and observe that then \(\mathcal L({\mathbf 1}_\varGamma ;\mathcal C_2)\ge \mathrm {vol}(\varGamma )-\tfrac{\epsilon }{2}\). Hence if

$$ \mathcal C=\{R_1\cap R_2:\,R_1\in \mathcal C_1\; \& \;R_2\in \mathcal C_2\},$$

then

$$\mathcal U({\mathbf 1}_\varGamma ;\mathcal C)\le \mathcal U({\mathbf 1}_\varGamma ;\mathcal C_1)\le \mathrm {vol}(\varGamma )+\tfrac{\epsilon }{2}\le \mathcal L({\mathbf 1}_\varGamma ;\mathcal C_2)+\epsilon \le \mathcal L({\mathbf 1}_\varGamma ;\mathcal C)+\epsilon ,$$

and so not only is \({\mathbf 1}_\varGamma \) Riemann integrable but also its integral is equal to \(\mathrm {vol}(\varGamma )\).

Conversely, if \({\mathbf 1}_\varGamma \) is Riemann integrable and \(\epsilon >0\), choose \(\mathcal C\) so that \(\mathcal U({\mathbf 1}_\varGamma ;\mathcal C)\le \mathcal L({\mathbf 1}_\varGamma ;\mathcal C)+\epsilon \). Define associated choice functions \({\varvec{\varXi }}_1\) and \({\varvec{\varXi }}_2\) so that \({\varvec{\varXi }}_1(R)\in \varGamma \) if \(R\cap \varGamma \ne \emptyset \) and \({\varvec{\varXi }}_2(R)\notin \varGamma \) unless \(R\subseteq \varGamma \). Then

$$|\varGamma |_{\mathrm e}\le \sum _{\begin{array}{c} R\in \mathcal C\\ R\cap \varGamma \ne \emptyset \end{array} }|R|=\mathcal R({\mathbf 1}_\varGamma ;\mathcal C,{\varvec{\varXi }}_1)\le \mathcal R({\mathbf 1}_\varGamma ;\mathcal C,{\varvec{\varXi }}_2)\,+\,\epsilon =\sum _{\begin{array}{c} R\in \mathcal C\\ R\subseteq \varGamma \end{array}}|R|\,+\,\epsilon \le |\varGamma |_{\mathrm i}\,+\,\epsilon ,$$

and so \(\varGamma \) is Riemann measurable.\(\square \)

Corollary 5.3.2

If \(\varGamma _{\!1}\) and \(\varGamma _{\!2}\) are bounded, Riemann measurable sets, then so are \(\varGamma _{\!1}\cup \varGamma _{\!2}\), \(\varGamma _{\!1}\cap \varGamma _{\!2}\), and \(\varGamma _{\!2}{\setminus } \varGamma _{\!1}\). In addition,

$$\mathrm {vol}\bigl (\varGamma _{\!1}\cup \varGamma _{\!2}\bigr )=\mathrm {vol}(\varGamma _{\!1})+\mathrm {vol}(\varGamma _{\!2})-\mathrm {vol}(\varGamma _{\!1}\cap \varGamma _{\!2})$$

and

$$\mathrm {vol}\bigl (\varGamma _{\!2}{\setminus } \varGamma _{\!1}\bigr )=\mathrm {vol}\bigl (\varGamma _{\!2}\bigr )-\mathrm {vol}\bigl (\varGamma _{\!1}\cap \varGamma _{\!2}\bigr ).$$

In particular, if \(\mathrm {vol}(\varGamma _{\!1}\cap \varGamma _{\!2})=0\), then \(\mathrm {vol}(\varGamma _{\!1}\cup \varGamma _{\!2})=\mathrm {vol}(\varGamma _{\!1})+\mathrm {vol}(\varGamma _{\!2})\). Finally, \(\varGamma \subseteq \prod _{j=1}^N[a_j,b_j]\) is Riemann measurable if and only if for each \(\epsilon >0\) there exist Riemann measurable subsets A and B of \(\prod _{j=1}^N[a_j,b_j]\) such that \(A\subseteq \varGamma \subseteq B\) and \(\mathrm {vol}(B{\setminus } A)<\epsilon \).

Proof

By Theorem 5.3.1, \({\mathbf 1}_{\varGamma _{\!1}}\) and \({\mathbf 1}_{\varGamma _{\!2}}\) are Riemann integrable. Thus, since

$${\mathbf 1}_{\varGamma _{\!1}\cap \varGamma _{\!2}}={\mathbf 1}_{\varGamma _{\!1}}{\mathbf 1}_{\varGamma _{\!2}}\text { and }{\mathbf 1}_{\varGamma _{\!1}\cup \varGamma _{\!2}}={\mathbf 1}_{\varGamma _{\!1}}+{\mathbf 1}_{\varGamma _{\!2}}-{\mathbf 1}_{\varGamma _{\!1}\cap \varGamma _{\!2}},$$

that same theorem implies that \(\varGamma _{\!1}\cup \varGamma _{\!2}\) and \(\varGamma _{\!1}\cap \varGamma _{\!2}\) are Riemann measurable. At the same time,

$${\mathbf 1}_{\varGamma _{\!2}{\setminus } \varGamma _{\!1}}={\mathbf 1}_{\varGamma _{\!2}}-{\mathbf 1}_{\varGamma _{\!1}\cap \varGamma _{\!2}},$$

and so \(\varGamma _{\!2}{\setminus } \varGamma _{\!1}\) is also Riemann measurable. Also, by Theorem 5.3.1, the equations relating their volumes follow immediately from the equations relating their indicator functions.

Turning to the final assertion, there is nothing to do if \(\varGamma \) is Riemann measurable since we can then take \(A=\varGamma =B\) for all \(\epsilon >0\). Now suppose that for each \(\epsilon >0\) there exist Riemann measurable sets \(A_\epsilon \) and \(B_\epsilon \) such that \(A_\epsilon \subseteq \varGamma \subseteq B_\epsilon \subseteq \prod _{j=1}^N[a_j,b_j]\) and \(\mathrm {vol}(B_\epsilon {\setminus } A_\epsilon )<\epsilon \). Then

$$|\varGamma |_{\mathrm i}\ge \mathrm {vol}(A_\epsilon )\ge \mathrm {vol}(B_\epsilon )-\epsilon \ge |\varGamma |_{\mathrm e}-\epsilon ,$$

and so \(\varGamma \) is Riemann measurable.\(\square \)

It is reassuring that the preceding result is consistent with our earlier computation of the area under a graph. In fact, we now have the following more general result.

Theorem 5.3.3

Assume that \(f:\prod _{j=1}^N[a_j,b_j]\longrightarrow {\mathbb R}\) is continuous. Then the graph

$$G(f)=\left\{ \bigl (\mathbf x,f(\mathbf x)\bigr ):\,\mathbf x\in \prod _{j=1}^N[a_j,b_j]\right\} $$

is a Riemann negligible subset of \({\mathbb R}^{N+1}\). Moreover, if, in addition, f is non-negative and \(\varGamma =\bigl \{(\mathbf x,y)\in {\mathbb R}^{N+1}:\,0\le y\le f(\mathbf x)\bigr \}\), then \(\varGamma \) is Riemann measurable and

$$\mathrm {vol}(\varGamma )=\int _{\prod _1^N[a_j,b_j]}f(\mathbf x)\,d\mathbf x.$$

Proof

Set \(r=\Vert f\Vert _{\prod _1^N[a_j,b_j]}\), and for each \(\epsilon >0\) choose \(\delta _\epsilon >0\) so that

$$|f(\mathbf y)-f(\mathbf x)|<\epsilon \quad \text {if } |\mathbf y-\mathbf x|\le \delta _\epsilon .$$

Next let \(\mathcal C\) with \(\Vert \mathcal C\Vert <\delta _\epsilon \) be a cover of \(\prod _1^N[a_j,b_j]\) by non-overlapping rectangles, and choose \(K\in \mathbb Z^+\) so that \(\frac{r}{K+1}<\epsilon \le \frac{r}{K}\). Then for each \(R\in \mathcal C\) there is a \(1\le k_R\le 2(K-1)\) such that

$$\bigl \{\bigl (\mathbf x,f(\mathbf x)\bigr ):\,\mathbf x\in R\bigr \}\subseteq R\times \bigl [-r+\tfrac{(k_R-1)r}{K},-r+\tfrac{(k_R+2)r}{K}\bigr ],$$

and therefore

$$|G(f)|_{\mathrm e}\le \frac{3r}{K}\sum _{R\in \mathcal C}|R|\le 6\left( \prod _{j=1}^N(b_j-a_j)\right) \epsilon ,$$

which proves that G(f) is Riemann negligible.

Turning to the second assertion, note that all the discontinuities of \({\mathbf 1}_\varGamma \) on \(\prod _{j=1}^N[a_j,b_j]\times [0,r]\) are contained in G(f), and therefore \({\mathbf 1}_\varGamma \) is Riemann integrable. In addition, for each \(\mathbf x\in \prod _{j=1}^N[a_j,b_j]\), \(y\in [0,r]\longmapsto {\mathbf 1}_\varGamma (\mathbf x,y)\in \{0,1\}\) has at most one discontinuity. Hence, by Theorem 5.3.1 and (5.2.1),

$$\mathrm {vol}(\varGamma )=\int _{\prod _1^N[a_j,b_j]}\left( \int _0^r{\mathbf 1}_\varGamma (\mathbf x,y)\,dy\right) \,d\mathbf x= \int _{\prod _1^N[a_j,b_j]}f(\mathbf x)\,d\mathbf x.$$

\(\square \)
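As a numerical illustration of Theorem 5.3.3 (an added sketch, not from the original text; the function \(f(x)=1+\sin x\) is an arbitrary choice), one can estimate the volume under a graph by a Riemann sum for the indicator function of \(\varGamma \):

```python
import math

# The region Gamma = {(x, y) : 0 <= y <= 1 + sin x, x in [0, pi]} should have
# area int_0^pi (1 + sin x) dx = pi + 2.
f = lambda x: 1 + math.sin(x)
n = 1000
hx, hy = math.pi / n, 2.0 / n            # f takes values in [0, 2]
area = 0.0
for i in range(n):
    fx = f((i + 0.5) * hx)               # height of the graph over this column
    # count the cells of the column whose midpoints lie under the graph
    area += hx * hy * sum(1 for j in range(n) if (j + 0.5) * hy <= fx)
print(area, math.pi + 2)
```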

Theorem 5.3.3 allows us to confirm that the volume (i.e., the area) of the closed unit ball \(\overline{B(\mathbf 0,1)}\) in \({\mathbb R}^2\) is \(\pi \), the half period of the trigonometric sine function. Indeed, \(\overline{B(\mathbf 0,1)}=H_+\cup H_-\), where

$$H_\pm =\Bigl \{(x_1,x_2):\,0\le \pm x_2\le \sqrt{1-x_1^2}\Bigr \}.$$

By Theorem 5.3.3 both \(H_+\) and \(H_-\) are Riemann measurable, and each has area

$$\int _{-1}^1\sqrt{1-x^2}\,dx=2\int _0^{\frac{\pi }{2}}\cos ^2 \theta \,d\theta =\int _0^{\frac{\pi }{2}}\bigl (1+\cos 2\theta \bigr )\,d\theta =\frac{\pi }{2}.$$

Finally, \(H_+\cap H_-=[-1,1]\,\times \,\{0\}\) is a rectangle with area 0. Hence, by Corollary 5.3.2, the desired conclusion follows. Moreover, because \({\mathbf 1}_{\overline{B(\mathbf 0,r)}}(\mathbf x)={\mathbf 1}_{\overline{B(\mathbf 0,1)}}(r^{-1}\mathbf x)\), we can use (5.1.1) and (5.1.2) to see that

$$\begin{aligned} \mathrm {vol}\bigl (\overline{B(\mathbf c,r)}\bigr )=\pi r^2\end{aligned}$$
(5.3.1)

for balls in \({\mathbb R}^2\).

Having defined what we mean by the volume of a set, we now define what we will mean by the integral of a function on a set. Given a bounded, Riemann measurable set \(\varGamma \), we say that a bounded function \(f:\varGamma \longrightarrow {\mathbb C}\) is Riemann integrable on \(\varGamma \) if the function

$${\mathbf 1}_\varGamma f\equiv {\left\{ \begin{array}{ll}f&{}\text {on }\varGamma \\ 0 &{}\text {off }\varGamma \end{array}\right. }$$

is Riemann integrable on some rectangle \(\prod _{j=1}^N[a_j,b_j]\supseteq \varGamma \), in which case the Riemann integral of f on \(\varGamma \) is

$$\int _\varGamma f(\mathbf x)\,d\mathbf x\equiv \int _{\prod _1^N[a_j,b_j]}{\mathbf 1}_\varGamma (\mathbf x)f(\mathbf x)\,d\mathbf x.$$

In particular, if \(\varGamma \) is a bounded, Riemann measurable set, then every bounded, Riemann integrable function on \(\prod _{j=1}^N[a_j,b_j]\) is Riemann integrable on \(\varGamma \). Furthermore, notice that if \(\partial \varGamma \) is Riemann negligible and f is a bounded function on \(\varGamma \) that is continuous off of a Riemann negligible set, then f is Riemann integrable on \(\varGamma \). Obviously, the choice of the rectangle \(\prod _{j=1}^N[a_j,b_j]\) is irrelevant as long as it contains \(\varGamma \).

The following simple result gives an integral version of the intermediate value theorem, Theorem 1.3.6.

Theorem 5.3.4

Suppose that \(K\subseteq \prod _{j=1}^N[a_j,b_j]\) is a compact, connected, Riemann measurable set. If \(f:K\longrightarrow {\mathbb R}\) is continuous, then there exists a \({\varvec{\xi }}\in K\) such that

$$\int _Kf(\mathbf x)\,d\mathbf x=f({\varvec{\xi }})\mathrm {vol}(K).$$

Proof

If \(\mathrm {vol}(K)=0\) there is nothing to do. Now assume that \(\mathrm {vol}(K)>0\). Then \(\frac{1}{\mathrm {vol}(K)}\int _Kf(\mathbf x)\,d\mathbf x\) lies between the minimum and maximum values that f takes on K, and therefore, by Exercise 4.5 and Lemma 4.1.2, there exists a \({\varvec{\xi }}\in K\) such that \(f({\varvec{\xi }})=\frac{1}{\mathrm {vol}(K)}\int _Kf(\mathbf x)\,d\mathbf x\).\(\square \)

5.4 Integration of Rotationally Invariant Functions

One of the reasons for our introducing the concepts in the preceding section is that they encourage us to get away from rectangles when computing integrals. Indeed, if \(\varGamma =\bigcup _{m=0}^n\varGamma _{\!m}\) where the \(\varGamma _{\!m}\)’s are bounded Riemann measurable sets, then, for any bounded, Riemann integrable function f,

$$\begin{aligned} \int _\varGamma f(\mathbf x)\,d\mathbf x=\sum _{m=0}^n\int _{\varGamma _m}f(\mathbf x)\,d\mathbf x\quad \text {if } \mathrm {vol}\bigl (\varGamma _{\!m}\cap \varGamma _{\!m^{\prime }}\bigr )=0 \text { for } m^{\prime }\ne m,\end{aligned}$$
(5.4.1)

since

$$\Bigl |\sum _{m=0}^n{\mathbf 1}_{\varGamma _{\! m}}f-{\mathbf 1}_\varGamma f\Bigr |\le 2\Vert f\Vert _{\mathrm u}\sum _{0\le m<m^{\prime }\le n}{\mathbf 1}_{\varGamma _{\! m}\cap \varGamma _{\!m^{\prime }}}.$$

The advantage afforded by (5.4.1) is that a judicious choice of the \(\varGamma _{\!m}\)’s can simplify computations. For example, suppose that f is a function on the closed ball \(\overline{B(\mathbf 0,r)}\) in \({\mathbb R}^2\), and assume that \(f(\mathbf x)=\tilde{f}(|\mathbf x|)\) for some continuous function \(\tilde{f}:[0,r]\longrightarrow {\mathbb C}\). For each \(n\ge 1\), set \(\varGamma _{\!0,n}=\{\mathbf 0\}\) and \(\varGamma _{\!m,n}=\overline{B\bigl (\mathbf 0,\frac{mr}{n}\bigr )}{\setminus }\overline{B\bigl (\mathbf 0,\frac{(m-1)r}{n}\bigr )}\) if \(1\le m\le n\). By Corollary 5.3.2 and the considerations leading up to (5.3.1), we know that the \(\varGamma _{\!m,n}\)’s are Riemann measurable, and, obviously, for each \(n\ge 1\) they are a cover of \(\overline{B(\mathbf 0,r)}\) by mutually disjoint sets. If we define

$$f_n(\mathbf x)=\tilde{f}(0)\,{\mathbf 1}_{\varGamma _{\!0,n}}(\mathbf x)+\sum _{m=1}^n\tilde{f}\left( \frac{(2m-1)r}{2n}\right) {\mathbf 1}_{\varGamma _{\!m,n}}(\mathbf x),$$

then \(f_n\) is Riemann integrable, \(f_n\longrightarrow f\) uniformly, and therefore

$$\int _{\overline{B(\mathbf 0,r)}}f(\mathbf x)\,d\mathbf x=\lim _{n\rightarrow \infty }\int _{\overline{B(\mathbf 0,r)}}f_n(\mathbf x)\,d\mathbf x= \lim _{n\rightarrow \infty }\sum _{m=1}^n\tilde{f} \bigl (\tfrac{(2m-1)r}{2n}\bigr )\mathrm {vol}(\varGamma _{\!m,n}).$$

Finally, by Corollary 5.3.2 and (5.3.1), \(\mathrm {vol}(\varGamma _{\!m,n})=\frac{(2m-1)\pi r^2}{n^2}\), and so

$$\int _{\overline{B(\mathbf 0,r)}}f_n(\mathbf x)\,d\mathbf x=\frac{2\pi r}{n}\sum _{m=1}^n\tilde{f} \bigl (\tfrac{(2m-1)r}{2n}\bigr )\tfrac{(2m-1)r}{2n} =2\pi \mathcal R(g;\mathcal C_n,\varXi _n),$$

where \(g(\rho )=\rho \tilde{f} (\rho )\), \(\mathcal C_n=\bigl \{\bigl [\frac{(m-1)r}{n},\frac{mr}{n}\bigr ]:\,1\le m\le n\bigr \}\), and \(\varXi _n\bigl (\bigl [\frac{(m-1)r}{n},\frac{mr}{n}\bigr ]\bigr )=\frac{(2m-1)r}{2n}\). Hence, we have now proved that

$$\begin{aligned} \int _{\overline{B(\mathbf 0,r)}}f (\mathbf x)\,d\mathbf x=2\pi \int _0^r\tilde{f} (\rho )\rho \,d\rho \quad \text {if } f(\mathbf x)=\tilde{f}(|\mathbf x|)\end{aligned}$$
(5.4.2)

when \(\tilde{f} :[0,r]\longrightarrow {\mathbb C}\) is continuous. The preceding is an example of how, by taking advantage of symmetry properties, one can sometimes reduce the computation of an integral in higher dimensions to one in lower dimensions. In this example the symmetry was the rotational invariance of both the region of integration and the integrand.
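Here is a quick numerical check of (5.4.2) (an added sketch, not part of the original text; \(\tilde f(\rho )=\cos \rho \) and \(r=2\) are arbitrary choices), comparing a two-dimensional midpoint sum over the disk with the one-dimensional integral on the right-hand side:

```python
import math

# (5.4.2) with f(x) = cos|x| on the disk of radius r = 2 in R^2:
# midpoint Riemann sum of cos|x| over the disk vs. 2 pi int_0^r rho cos(rho) drho.
r, n = 2.0, 600
h = 2 * r / n
lhs = 0.0
for i in range(n):
    for j in range(n):
        rho = math.hypot(-r + (i + 0.5) * h, -r + (j + 0.5) * h)
        if rho <= r:
            lhs += math.cos(rho) * h * h
# an antiderivative of rho cos(rho) is cos(rho) + rho sin(rho)
rhs = 2 * math.pi * (math.cos(r) + r * math.sin(r) - 1.0)
print(lhs, rhs)
```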

Here is a beautiful application of (5.4.2) to a famous calculation. It is known that the function \(x\rightsquigarrow e^{-\frac{x^2}{2}}\) does not admit an indefinite integral that can be written as a concatenation of polynomials, trigonometric functions, and exponentials. Nonetheless, by combining (5.2.1) with (5.4.2), we will now show that

$$\begin{aligned} \int _{\mathbb R}e^{-\frac{x^2}{2}}\,dx=\sqrt{2\pi }.\end{aligned}$$
(5.4.3)

Given \(r>0\), use (5.2.3) to write

$$\left( \int _{-r}^re^{-\frac{x^2}{2}}\,dx\right) ^2=\int _{-r}^r\left( \int _{-r}^re^{-\frac{x_1^2+x_2^2}{2}}\,dx_1 \right) dx_2=\int _{[-r,r]^2} e^{-\frac{|\mathbf x|^2}{2}}\,d\mathbf x.$$

Next observe that

$$\int _{\overline{B(\mathbf 0,\sqrt{2}r)}}e^{-\frac{|\mathbf x|^2}{2}}\,d\mathbf x\ge \int _{[-r,r]^2} e^{-\frac{|\mathbf x|^2}{2}}\,d\mathbf x\ge \int _{\overline{B(\mathbf 0,r)}}e^{-\frac{|\mathbf x|^2}{2}}\,d\mathbf x,$$

and that, by (5.4.2),

$$\int _{\overline{B(\mathbf 0,R)}}e^{-\frac{|\mathbf x|^2}{2}}\,d\mathbf x=2\pi \int _0^R e^{-\frac{\rho ^2}{2}}\rho \,d\rho =2\pi \bigl (1-e^{-\frac{R^2}{2}}\bigr ).$$

Thus, after letting \(r\rightarrow \infty \), we arrive at (5.4.3). Once one has (5.4.3), there are lots of other computations which follow. For example, one can compute (cf.  Exercise 3.3) \(\varGamma \bigl (\frac{1}{2}\bigr )=\int _0^\infty x^{-\frac{1}{2}}e^{-x}\,dx\). To this end, make the change of variables \(y=(2x)^{\frac{1}{2}}\) to see that

$$\begin{aligned}\varGamma \bigl (\tfrac{1}{2}\bigr )&=\lim _{r\rightarrow \infty }\int _{r^{-1}}^rx^{-\frac{1}{2}}e^{-x}\,dx=\lim _{r\rightarrow \infty }2^{\frac{1}{2}}\int _{(2r^{-1})^{\frac{1}{2}}} ^{(2r)^{\frac{1}{2}}}e^{-\frac{y^2}{2}}\,dy\\ {}&=2^{\frac{1}{2}}\int _{[0,\infty )} e^{-\frac{y^2}{2}}\,dy=2^{-\frac{1}{2}}\int _{\mathbb R}e^{-\frac{y^2}{2}}\,dy,\end{aligned}$$

and conclude that

$$\begin{aligned} \varGamma \bigl (\tfrac{1}{2}\bigr )=\sqrt{\pi }.\end{aligned}$$
(5.4.4)
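Both (5.4.3) and (5.4.4) are easy to confirm numerically; the following sketch (added here for illustration, not in the original) truncates the Gaussian integral at \(|x|\le 10\), beyond which the tail is negligible:

```python
import math

# (5.4.3): int_R exp(-x^2/2) dx = sqrt(2 pi); truncate at |x| <= 10, where the
# neglected tail is far below the discretization error.
n, R = 100000, 10.0
h = 2.0 * R / n
total = sum(math.exp(-0.5 * (-R + (k + 0.5) * h) ** 2) for k in range(n)) * h
print(total, math.sqrt(2.0 * math.pi))

# (5.4.4): Gamma(1/2) = sqrt(pi).
print(math.gamma(0.5), math.sqrt(math.pi))
```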

We will now develop the analog of (5.4.2) for other dimensions \(N\ge 1\). Obviously, the 1-dimensional analog is simply the statement that

$$\int _{-r}^rf(x)\,dx=2\int _0^rf(\rho )\,d\rho $$

for even functions f on \([-r,r]\). Thus, assume that \(N\ge 3\), and begin by noting that the closed ball \(\overline{B(\mathbf 0,r)}\) of radius \(r\ge 0\) centered at the origin is Riemann measurable. Indeed, \(\overline{B(\mathbf 0,r)}\) is the union of the hemispheres

$$H_\pm \equiv \left\{ \mathbf x:\,\sum _{j=1}^{N-1}x_j^2\le r^2\; \& \;0\le \pm x_N\le \sqrt{r^2-\sum _{j=1}^{N-1}x_j^2}\right\} ,$$

and so, by Theorem 5.3.3 and Corollary 5.3.2, \(\overline{B(\mathbf 0,r)}\) is Riemann measurable. Further, by (5.1.2) and (5.1.1), for any \(\mathbf c\in {\mathbb {R}^N}\), \(\overline{B(\mathbf c,r)}\) is Riemann measurable and \(\mathrm {vol}\bigl (\overline{B(\mathbf c,r)}\bigr )=\mathrm {vol}\bigl (\overline{B(\mathbf 0,r)}\bigr )=\varOmega _Nr^N\), where \(\varOmega _N\) is the volume of the closed unit ball \(\overline{B(\mathbf 0,1)}\) in \({\mathbb {R}^N}\).

Proceeding in precisely the same way as we did in the derivation of (5.4.2) and using the identity \(b^N-a^N=(b-a)\sum _{k=0}^{N-1}a^kb^{N-1-k}\), we see that, for any continuous \(\tilde{f} :[0,r]\longrightarrow {\mathbb C}\),

$$\int _{\overline{B(\mathbf 0,r)}}\tilde{f} (|\mathbf x|)\,d\mathbf x=\lim _{n\rightarrow \infty }\frac{N\varOmega _Nr}{n}\sum _{m=1}^n\tilde{f} \bigl (\xi _{m,n}\bigr )\xi _{m,n}^{N-1},$$

where

$$\xi _{m,n}=\frac{r}{n}\left( \frac{1}{N}\sum _{k=0}^{N-1}m^k(m-1)^{N-1-k}\right) ^{\frac{1}{N-1}}\in \bigl [\tfrac{(m-1)r}{n},\tfrac{mr}{n}\bigr ],$$

and conclude from this that

$$\begin{aligned} \int _{\overline{B(\mathbf 0,r)}}\tilde{f} (|\mathbf x|)\,d\mathbf x=N\varOmega _N\int _0^r\tilde{f} (\rho )\rho ^{N-1}\,d\rho .\end{aligned}$$
(5.4.5)
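A numerical check of (5.4.5) for \(N=3\) (an added sketch, not part of the original text; \(\tilde f(\rho )=e^{-\rho }\) and \(r=1\) are arbitrary choices, and \(\varOmega _3=\tfrac{4\pi }{3}\), consistent with the formula for \(\varOmega _N\) derived below):

```python
import math

# (5.4.5) with N = 3, f(rho) = exp(-rho), r = 1: compare a 3-d midpoint sum
# over the unit ball with 3 * Omega_3 * int_0^1 rho^2 e^(-rho) drho, Omega_3 = 4 pi / 3.
r, n = 1.0, 100
h = 2.0 * r / n
pts = [-r + (k + 0.5) * h for k in range(n)]
lhs = sum(math.exp(-math.sqrt(x * x + y * y + z * z)) * h ** 3
          for x in pts for y in pts for z in pts
          if x * x + y * y + z * z <= r * r)
# an antiderivative of rho^2 e^(-rho) is -(rho^2 + 2 rho + 2) e^(-rho)
rhs = 4.0 * math.pi * (2.0 - math.exp(-r) * (r * r + 2.0 * r + 2.0))
print(lhs, rhs)
```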

Combining (5.4.5) with (5.4.3), we get an expression for \(\varOmega _N\). By the same reasoning as we used to derive (5.4.3), one finds that

$$(2\pi )^{\frac{N}{2}}=\left( \int _{\mathbb R}e^{-\frac{x^2}{2}}\,dx\right) ^N=\lim _{r\rightarrow \infty }\int _{\overline{B(\mathbf 0,r)}}e^{-\frac{|\mathbf x|^2}{2}}\,d\mathbf x=N\varOmega _N\int _0^\infty \rho ^{N-1}e^{-\frac{\rho ^2}{2}}\,d\rho .$$

Next make the change of variables \(\rho =(2t)^{\frac{1}{2}}\) to see that

$$\int _{[0,\infty )} \rho ^{N-1}e^{-\frac{\rho ^2}{2}}\,d\rho =2^{\frac{N}{2}-1}\int _0^\infty t^{\frac{N}{2}-1}e^{-t}\,dt=2^{\frac{N}{2}-1}\varGamma \bigl (\tfrac{N}{2}\bigr ).$$

Thus, we now know that

$$\varOmega _N=\frac{2\pi ^{\frac{N}{2}}}{N\varGamma \bigl (\tfrac{N}{2}\bigr )}= \frac{\pi ^{\frac{N}{2}}}{\varGamma \bigl (\tfrac{N}{2}+1\bigr )}.$$
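This formula is easy to test (an added sketch, not in the original): compare \(\pi ^{\frac{N}{2}}/\varGamma \bigl (\tfrac{N}{2}+1\bigr )\) with a crude Monte Carlo estimate of the fraction of the cube \([-1,1]^N\) occupied by the unit ball; the rapidly shrinking fraction also previews the remark below about the corners of the cube.

```python
import math, random

def omega(N):
    """Volume of the unit ball in R^N from the formula just derived."""
    return math.pi ** (N / 2) / math.gamma(N / 2 + 1)

random.seed(0)
for N in range(1, 9):
    trials = 100000
    hits = sum(1 for _ in range(trials)
               if sum(random.uniform(-1.0, 1.0) ** 2 for _ in range(N)) <= 1.0)
    # Omega_N versus 2^N * (fraction of the cube [-1,1]^N landing in the ball)
    print(N, omega(N), 2 ** N * hits / trials)
```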

By (5.4.4) and induction on N,

$$\varGamma \bigl (\tfrac{2N+1}{2}\bigr )=\pi ^{\frac{1}{2}}2^{-N}\prod _{k=1}^N(2k-1)=\pi ^{\frac{1}{2}}\frac{(2N)!}{4^NN!},$$

and therefore

$$\begin{aligned} \varOmega _{2N}=\frac{\pi ^N}{N!}\quad \text {and} \quad \varOmega _{2N-1}=\frac{4^N\pi ^{N-1}N!}{(2N)!}\quad \text {for } N\ge 1 .\end{aligned}$$

Applying (3.2.4), we find that

$$\varOmega _{2N}\sim (2\pi N)^{-\frac{1}{2}}\left( \frac{\pi e}{N}\right) ^N\quad \text {and} \quad \varOmega _{2N-1}\sim (\sqrt{2}\pi )^{-1} \left( \frac{\pi e}{N}\right) ^N.$$

Thus, as N gets large, \(\varOmega _N\), the volume of the unit ball in \({\mathbb {R}^N}\), tends to 0 at a very fast rate. Since the cube \([-1,1]^N\) that circumscribes \(\overline{B(\mathbf 0,1)}\) has volume \(2^N\), this means that the \(2^N\) corners of \([-1,1]^N\) that are not in \(\overline{B(\mathbf 0,1)}\) take up the lion’s share of the available space. Hence, if we lived in a large dimensional universe, the biblical tradition that a farmer leave the corners of his field to be harvested by the poor would be very generous.

5.5 Rotation Invariance of Integrals

Because they fit together so nicely, thus far we have been dealing exclusively with rectangles whose sides are parallel to the standard coordinate axes. However, this restriction obscures a basic property of integrals, the property of rotation invariance . To formulate this property, recall that \((\mathbf e_1,\ldots ,\mathbf e_N)\in ({\mathbb {R}^N})^N\) is called an orthonormal basis in \({\mathbb {R}^N}\) if \((\mathbf e_i,\mathbf e_j)_{\mathbb {R}^N}=\delta _{i,j}\). The standard orthonormal basis \((\mathbf e_1^0,\ldots ,\mathbf e^0_N)\) is the one for which \((\mathbf e^0_i)_j=\delta _{i,j}\), but there are many others. For example, in \({\mathbb R}^2\), for each \(\theta \in [0,2\pi )\), \(\bigl ((\cos \theta ,\sin \theta ),(\mp \sin \theta ,\pm \cos \theta )\bigr )\) is an orthonormal basis, and every orthonormal basis in \({\mathbb R}^2\) is one of these.

A rotation in \({\mathbb {R}^N}\) is a map \(\mathfrak R:{\mathbb {R}^N}\longrightarrow {\mathbb {R}^N}\) of the form \(\mathfrak R(\mathbf x)=\sum _{j=1}^Nx_j\mathbf e_j\) where \((\mathbf e_1,\ldots ,\mathbf e_N)\) is an orthonormal basis. Obviously \(\mathfrak R\) is linear in the sense that

$$\mathfrak R(\alpha \mathbf x+\beta \mathbf y)=\alpha \mathfrak R(\mathbf x)+\beta \mathfrak R(\mathbf y).$$

In addition, \(\mathfrak R\) preserves inner products: \(\bigl (\mathfrak R(\mathbf x),\mathfrak R(\mathbf y)\bigr )_{\mathbb {R}^N}=(\mathbf x,\mathbf y)_{\mathbb {R}^N}\). To check this, simply note that

$$\bigl (\mathfrak R(\mathbf x),\mathfrak R(\mathbf y)\bigr )_{\mathbb {R}^N}=\sum _{i,j=1}^Nx_iy_j(\mathbf e_i,\mathbf e_j)_{\mathbb {R}^N}=\sum _{i=1}^Nx_iy_i=(\mathbf x,\mathbf y)_{\mathbb {R}^N}.$$

In particular, \(|\mathfrak R(\mathbf y)-\mathfrak R(\mathbf x)|=|\mathbf y-\mathbf x|\), and so it is clear that \(\mathfrak R\) is one-to-one and continuous. Further, if \(\mathfrak R\) and \(\mathfrak R^{\prime }\) are rotations, then so is \(\mathfrak R^{\prime }\circ \mathfrak R\). Indeed, if \((\mathbf e_1,\ldots ,\mathbf e_N)\) is the orthonormal basis for \(\mathfrak R\), then

$$\mathfrak R^{\prime }\circ \mathfrak R(\mathbf x)=\sum _{j=1}^Nx_j\mathfrak R^{\prime }(\mathbf e_j),$$

and, since

$$\bigl (\mathfrak R^{\prime }(\mathbf e_i),\mathfrak R^{\prime }(\mathbf e_j)\bigr )_{\mathbb {R}^N}=(\mathbf e_i,\mathbf e_j)_{\mathbb {R}^N}=\delta _{i,j},$$

\(\bigl (\mathfrak R^{\prime }(\mathbf e_1),\ldots ,\mathfrak R^{\prime }(\mathbf e_N)\bigr )\) is an orthonormal basis. Finally, if \(\mathfrak R\) is a rotation, then there is a unique rotation \(\mathfrak R^{-1}\) such that \(\mathfrak R\circ \mathfrak R^{-1}=\mathbf I=\mathfrak R^{-1}\circ \mathfrak R\), where \(\mathbf I\) is the identity map: \(\mathbf I(\mathbf x)=\mathbf x\). To see this, let \((\mathbf e_1,\ldots ,\mathbf e_N)\) be the orthonormal bases for \(\mathfrak R\), and set \(\tilde{\mathbf e}_i=\bigl ((\mathbf e_1)_i,\ldots ,(\mathbf e_N)_i\bigr )\) for \(1\le i\le N\). Using \((\mathbf e_1^0,\ldots ,\mathbf e^0_N)\) to denote the standard orthonormal basis, we have that \((\tilde{\mathbf e}_i,\tilde{\mathbf e}_j)_{\mathbb {R}^N}\) equals

$$\sum _{k=1}^N(\mathbf e_k)_i(\mathbf e_k)_j=\sum _{k=1}^N(\mathbf e_k,\mathbf e^0_i)_{\mathbb {R}^N}(\mathbf e_k,\mathbf e^0_j)_{\mathbb {R}^N}=\bigl (\mathfrak R(\mathbf e^0_i),\mathfrak R(\mathbf e^0_j)\bigr )_{\mathbb {R}^N}=\delta _{i,j},$$

and so \((\tilde{\mathbf e}_1,\ldots ,\tilde{\mathbf e}_N)\) is an orthonormal basis. Moreover, if \(\tilde{\mathfrak R}\) is the corresponding rotation, then

$$\begin{aligned}\tilde{\mathfrak R}\circ \mathfrak R(\mathbf x)&=\sum _{i=1}^Nx_i\tilde{\mathfrak R}(\mathbf e_i)=\sum _{i,j=1}^Nx_i(\mathbf e_i,\mathbf e^0_j)_{\mathbb {R}^N}\tilde{\mathbf e}_j \\&=\sum _{i,j,k=1}^Nx_i(\mathbf e_i,\mathbf e^0_j)_{\mathbb {R}^N}(\mathbf e_k,\mathbf e^0_j)_{\mathbb {R}^N}\mathbf e^0_k=\sum _{i,k=1}^Nx_i(\mathbf e_i,\mathbf e_k)_{\mathbb {R}^N}\mathbf e^0_k=\mathbf x. \end{aligned}$$

A similar computation shows that \(\mathfrak R\circ \tilde{\mathfrak R}=\mathbf I\), and so we can take \(\mathfrak R^{-1}=\tilde{\mathfrak R}\).
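
In matrix language (an observation we add for orientation, not used below): if E is the \(N\times N\) matrix whose jth column is \(\mathbf e_j\), then \(\mathfrak R(\mathbf x)=E\mathbf x\), and the computation above says that \(\tilde{\mathfrak R}\) is given by the transpose \(E^\top \). A quick Python check of this (our own sketch; the orthonormal basis is produced by a QR factorization):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 4
# the Q factor of a random square matrix has orthonormal columns e_1, ..., e_N
E, _ = np.linalg.qr(rng.standard_normal((N, N)))

print(np.allclose(E.T @ E, np.eye(N)))  # tilde-R composed with R is the identity
print(np.allclose(E @ E.T, np.eye(N)))  # R composed with tilde-R is the identity
x = rng.standard_normal(N)
print(np.allclose(np.linalg.norm(E @ x), np.linalg.norm(x)))  # lengths are preserved
```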

Because \(\mathfrak R\) preserves lengths, it is clear that \(\mathfrak R\bigl (\overline{B(\mathbf c,r)}\bigr )=\overline{B\bigl (\mathfrak R(\mathbf c),r\bigr )}\). \(\mathfrak R\) also takes a rectangle into a rectangle, but unfortunately the image rectangle may no longer have sides parallel to the standard coordinate axes. Instead, they are parallel to the axes for the corresponding orthonormal basis. That is,

$$\begin{aligned} \mathfrak R\left( \prod _{j=1}^N[a_j,b_j]\right) =\left\{ \sum _{j=1}^Nx_j\mathbf e_j:\,\mathbf x\in \prod _{j=1}^N[a_j,b_j]\right\} .\qquad ({*}) \end{aligned}$$

Of course, we should expect that \(\mathrm {vol}\left( \mathfrak R\left( \prod _{j=1}^N[a_j,b_j]\right) \right) =\prod _{j=1}^N(b_j-a_j)\), but this has to be checked, and for that purpose we will need the following lemma.

Lemma 5.5.1

Let G be a non-empty, bounded, open subset of \({\mathbb {R}^N}\), and assume that

$$\lim _{r\searrow 0}\bigl |(\partial G)^{(r)}\bigr |_{\mathrm i}=0$$

where \((\partial G)^{(r)}\) is the set of \(\mathbf y\) for which there exists an \(\mathbf x\in \partial G\) such that \(|\mathbf y-\mathbf x|<r\). Then \(\bar{G}\) is Riemann measurable and, for each \(\epsilon >0\), there exists a finite set \(\mathcal B\) of mutually disjoint closed balls \(\bar{B}\subseteq G\) such that \(\mathrm {vol}(\bar{G})\le \sum _{B\in \mathcal B}\mathrm {vol}(\bar{B})+\epsilon \).

Proof

First note that \(\partial G\) is Riemann negligible and therefore that \(\bar{G}\) is Riemann measurable. Next, given a closed cube \(Q=\prod _{j=1}^N[c_j-r,c_j+r]\), let \(\bar{B}_Q\) be the closed ball \(\overline{B\bigl (\mathbf c,\frac{r}{2}\bigr )}\).

For each \(n\ge 1\), let \(\mathcal K _n\) be the collection of closed cubes Q of the form \(2^{-n}\mathbf k+[0,2^{-n}]^N\), where \(\mathbf k\in {\mathbb Z}^N\). Obviously, for each n, the cubes in \(\mathcal K_n\) are non-overlapping and \({\mathbb {R}^N}=\bigcup _{Q\in \mathcal K_n}Q\).

Now choose \(n_1\) so that \(\bigl |(\partial G)^{(2^{\frac{N}{2}-n_1})}\bigr |_{\mathrm i}\le \frac{1}{2}\mathrm {vol}(\bar{G})\), and set

$$\mathcal C_1=\{Q\in \mathcal K_{n_1}:\,Q\subseteq G\}\text { and }\mathcal C_1^{\prime }=\{Q\in \mathcal K_{n_1}:\,Q\cap G\ne \emptyset \}.$$

Then \(\bar{G}\subseteq \bigcup _{Q\in \mathcal C_1^{\prime }}Q\), \(\bigcup _{Q\in \mathcal C_1^{\prime }{\setminus } \mathcal C_1}Q\subseteq (\partial G)^{(2^{\frac{N}{2}-n_1})}\), and therefore

$$\sum _{Q\in \mathcal C_1}|Q|=\sum _{Q\in \mathcal C_1^{\prime }}|Q|-\sum _{Q\in \mathcal C_1^{\prime }{\setminus } \mathcal C_1}|Q|\ge \mathrm {vol}(\bar{G})-\frac{\mathrm {vol}(\bar{G})}{2}= \frac{\mathrm {vol}(\bar{G})}{2}.$$

Clearly the \(\bar{B}_Q\)’s for \(Q\in \mathcal C_1\) are mutually disjoint, closed balls contained in G. Furthermore, \(\mathrm {vol}(\bar{B}_Q)=\alpha |Q|\), where \(\alpha \equiv 4^{-N}\varOmega _N\), and therefore

$$\mathrm {vol}\left( G{\setminus } \bigcup _{Q\in \mathcal C_1}\bar{B}_Q\right) =\mathrm {vol}(G)\,-\,\sum _{Q\in \mathcal C_1}\mathrm {vol}(\bar{B}_Q) =\mathrm {vol}(G)\,-\,\alpha \sum _{Q\in \mathcal C_1}|Q|\le \beta \mathrm {vol}(G),$$

where \(\beta \equiv 1\,-\,\frac{\alpha }{2}\). Finally, set \(\mathcal B_1=\{\bar{B}_Q:\,Q\in \mathcal C_1\}\).

Set \(G_1=G{\setminus } \bigcup _{\bar{B}\in \mathcal B_1}\bar{B}\). Then \(G_1\) is again a non-empty, bounded, open set. Furthermore, since (cf. Exercise 4.1) \(\partial G_1\subseteq \partial G\cup \bigcup _{\bar{B}\in \mathcal B_1}\partial \bar{B}\), it is easy to see that \(\lim _{r\searrow 0}\bigl |(\partial G_1)^{(r)}\bigr |_{\mathrm i}=0\). Hence we can apply the same argument to \(G_1\) and thereby produce a set \(\mathcal B_2\supseteq \mathcal B_1\) of mutually disjoint, closed balls \(\bar{B}\) such that \(\bar{B}\subseteq G_1\) for \(\bar{B}\in \mathcal B_2{\setminus } \mathcal B_1\) and

$$\mathrm {vol}\left( G{\setminus } \bigcup _{\bar{B}\in \mathcal B_2}\bar{B}\right) = \mathrm {vol}\left( G_1{\setminus } \bigcup _{\bar{B}\in \mathcal B_2{\setminus } \mathcal B_1}\bar{B}\right) \le \beta \mathrm {vol}(G_1)\le \beta ^2\mathrm {vol}(G).$$

After m iterations, we produce a collection \(\mathcal B_m\) of mutually disjoint closed balls \(\bar{B}\subseteq G\) such that \(\mathrm {vol}\left( G{\setminus } \bigcup _{\bar{B}\in \mathcal B_m}\bar{B}\right) \le \beta ^m\mathrm {vol}(G)\). Thus, all that remains is to choose m so that \(\beta ^m\mathrm {vol}(G)<\epsilon \) and then do m iterations.\(\square \)

Lemma 5.5.2

If R is a rectangle and \(\mathfrak R\) is a rotation, then \(\mathfrak R(R)\) is Riemann measurable and has the same volume as R.

Proof

When it is non-empty, \(\mathrm {int}(R)\) obviously satisfies the hypotheses of Lemma 5.5.1, and, by using (\(*\)), it is easy to check that \(\mathrm {int}\bigl (\mathfrak R(R)\bigr )\) does also. In particular, both R and \(\mathfrak R(R)\) are then Riemann measurable.

Next assume that \(G\equiv \mathrm {int}(R)\ne \emptyset \). By Lemma 5.5.1, for each \(\epsilon >0\) we can find a collection \(\mathcal B\) of mutually disjoint closed balls \(\bar{B}\subseteq G\) such that \(\sum _{\bar{B}\in \mathcal B}\mathrm {vol}(\bar{B})+\epsilon \ge \mathrm {vol}(\bar{G})=\mathrm {vol}(R)\). Thus, if \(\mathcal B^{\prime }=\{\mathfrak R(\bar{B}):\,\bar{B}\in \mathcal B\}\), then \(\mathcal B^{\prime }\) is a collection of mutually disjoint closed balls \(\bar{B}^{\prime }\subseteq \mathfrak R(R)\) such that

$$\mathrm {vol}(R)\le \sum _{\bar{B}\in \mathcal B}\mathrm {vol}(\bar{B})+\epsilon =\sum _{\bar{B}^{\prime }\in \mathcal B^{\prime }}\mathrm {vol}(\bar{B}^{\prime })+\epsilon \le \mathrm {vol}\bigl (\mathfrak R(R)\bigr )+\epsilon ,$$

and so \(\mathrm {vol}\bigl (\mathfrak R(R)\bigr )\ge |R|\). To prove that this inequality is an equality, apply the same line of reasoning to \(G^{\prime }=\mathrm {int}\bigl (\mathfrak R(R)\bigr )\) and \(\mathfrak R^{-1}\) acting on \(\mathfrak R(R)\), and thereby obtain

$$\mathrm {vol}(R)=\mathrm {vol}\bigl (\mathfrak R^{-1}\circ \mathfrak R(R)\bigr )\ge \mathrm {vol}\bigl (\mathfrak R(R)\bigr ).$$

Finally, if \(R=\emptyset \) there is nothing to do. On the other hand, if \(R\ne \emptyset \) but \(\mathrm {int}(R)=\emptyset \), for each \(\epsilon >0\) let \(R(\epsilon )\) be the set of points \(\mathbf y\in {\mathbb R}^N\) such that \(\max _{1\le j\le N}|y_j-x_j|\le \epsilon \) for some \(\mathbf x\in R\). Then \(R(\epsilon )\) is a closed rectangle with non-empty interior containing R, and so \(\mathrm {vol}\bigl (\mathfrak R(R)\bigr )\le \mathrm {vol}\bigl (\mathfrak R(R(\epsilon ))\bigr )=|R(\epsilon )|\). Since \(\mathrm {vol}(R)=0=\lim _{\epsilon \searrow 0}|R(\epsilon )|\), it follows that \(\mathrm {vol}(R)=\mathrm {vol}\bigl (\mathfrak R(R)\bigr )\) in this case also.\(\square \)

Theorem 5.5.3

If \(\varGamma \) is a bounded, Riemann measurable subset of \({\mathbb {R}^N}\) and \(\mathfrak R\) is a rotation, then \(\mathfrak R(\varGamma )\) is Riemann measurable and \(\mathrm {vol}\bigl (\mathfrak R(\varGamma )\bigr )=\mathrm {vol}(\varGamma )\).

Proof

Given \(\epsilon >0\), choose \(\mathcal C_1\) to be a collection of non-overlapping rectangles contained in \(\varGamma \) such that \(\mathrm {vol}(\varGamma ) \le \sum _{R\in \mathcal C_1}|R|+\epsilon \), and choose \(\mathcal C_2\) to be a cover of \(\varGamma \) by non-overlapping rectangles such that \(\mathrm {vol}(\varGamma )\ge \sum _{R\in \mathcal C_2}|R|-\epsilon \). Then

$$\begin{aligned}|\mathfrak R(\varGamma )|_{\mathrm i}&\ge \sum _{R\in \mathcal C_1}\mathrm {vol}\bigl (\mathfrak R(R)\bigr )=\sum _{R\in \mathcal C_1}|R|\ge \mathrm {vol}(\varGamma )-\epsilon \ge \sum _{R\in \mathcal C_2}|R|-2\epsilon \\ {}&=\sum _{R\in \mathcal C_2}\mathrm {vol}\bigl (\mathfrak R(R)\bigr )-2\epsilon \ge |\mathfrak R(\varGamma )|_{\mathrm e}-2\epsilon .\end{aligned}$$

Hence, \(|\mathfrak R(\varGamma )|_{\mathrm e}\le \mathrm {vol}(\varGamma )+2\epsilon \) and \(|\mathfrak R(\varGamma )|_{\mathrm i}\ge \mathrm {vol}(\varGamma )-2\epsilon \) for all \(\epsilon >0\). Since \(\epsilon >0\) is arbitrary, it follows that \(|\mathfrak R(\varGamma )|_{\mathrm e}=|\mathfrak R(\varGamma )|_{\mathrm i}=\mathrm {vol}(\varGamma )\), which is the asserted conclusion.\(\square \)

Corollary 5.5.4

Let \(f:\overline{B(\mathbf 0,r)}\longrightarrow {\mathbb C}\) be a bounded function that is continuous off of a Riemann negligible set. Then, for each rotation \(\mathfrak R\), \(f\circ \mathfrak R\) is continuous off of a Riemann negligible set and

$$\int _{\overline{B(\mathbf 0,r)}}f\circ \mathfrak R(\mathbf x)\,d\mathbf x=\int _{\overline{B(\mathbf 0,r)}}f(\mathbf x)\,d\mathbf x.$$

Proof

Without loss in generality, we will assume throughout that f is real-valued.

If D is a Riemann negligible set off of which f is continuous, then \(\mathfrak R^{-1}(D)\) contains the set where \(f\circ \,\mathfrak R\) is discontinuous. Hence, since \(\mathrm {vol}\bigl (\mathfrak R^{-1}(D)\bigr )=\mathrm {vol}(D)=0\), \(f\circ \mathfrak R\) is continuous off of a Riemann negligible set.

Set \(g={\mathbf 1}_{\overline{B(\mathbf 0,r)}}f\). Then, by the preceding, both g and \(g\circ \mathfrak R\) are Riemann integrable. By (5.4.1), for any cover \(\mathcal C\) of \([-r,r]^N\) by non-overlapping rectangles and any associated choice function \({\varvec{\varXi }}\),

$$\begin{aligned}\int _{\overline{B(\mathbf 0,r)}} f\circ \mathfrak R(\mathbf x)\,d\mathbf x&=\sum _{R\in \mathcal C}\int _{\mathfrak R^{-1}(R)}g\circ \mathfrak R(\mathbf x)\,d\mathbf x\\ {}&= \mathcal R(g;\mathcal C,{\varvec{\varXi }})+\sum _{R\in \mathcal C}\int _{\mathfrak R^{-1}(R)}\varDelta _R(\mathbf x)\,d\mathbf x,\end{aligned}$$

where \(\varDelta _R(\mathbf x)=g\bigl (\mathfrak R(\mathbf x)\bigr )-g\bigl ({\varvec{\varXi }}(R)\bigr )\) for \(\mathbf x\in \mathfrak R^{-1}(R)\). Since \(\mathcal R\bigl (g;\mathcal C,{\varvec{\varXi }}\bigr )\) tends to \(\int _{\overline{B(\mathbf 0,r)}}f(\mathbf x)\,d\mathbf x\) as \(\Vert \mathcal C\Vert \rightarrow 0\), what remains to be shown is that the final term tends to 0 as \(\Vert \mathcal C\Vert \rightarrow 0\). But, because \(\mathfrak R(\mathbf x)\in R\) for \(\mathbf x\in \mathfrak R^{-1}(R)\), \(|\varDelta _R(\mathbf x)|\le \sup _Rg-\inf _Rg\), and therefore

$$\left| \sum _{R\in \mathcal C}\int _{\mathfrak R^{-1}(R)}\varDelta _R(\mathbf x)\,d\mathbf x\right| \le \sum _{R\in \mathcal C}\left( \sup _R g-\inf _R g\right) |R|=\mathcal U(g;\mathcal C)-\mathcal L(g;\mathcal C),$$

which tends to 0 as \(\Vert \mathcal C\Vert \rightarrow 0\).\(\square \)

Here is an example of the way in which one can use rotation invariance to make computations.

Lemma 5.5.5

Let \(0\le r_1<r_2\) and \(0\le \theta _1<\theta _2<2\pi \) be given. Then the region

$$\bigl \{(r\cos \theta ,r\sin \theta ):\,(r,\theta )\in [r_1,r_2]\times [\theta _1,\theta _2]\bigr \}$$

has a Riemann negligible boundary and volume \(\frac{r_2^2-r_1^2}{2}(\theta _2-\theta _1)\).

Proof

Because this region can be constructed by taking the intersection of differences of balls with half spaces, its boundary is Riemann negligible. Furthermore, to compute its volume, it suffices to treat the case when \(r_1=0\) and \(r_2=1\), since the general case can be reduced to this one by taking differences and scaling.

Now define \(u(\theta )=\mathrm {vol}\bigl (W(\theta )\bigr )\) where

$$W(\theta )\equiv \bigl \{(r\cos \omega ,r\sin \omega ):\,(r,\omega )\in [0,1]\times [0,\theta ]\bigr \}.$$

Obviously, u is a non-decreasing function of \(\theta \in [0,2\pi ]\) that is equal to 0 when \(\theta =0\) and \(\pi \) when \(\theta =2\pi \). In addition, \(u(\theta _1+\theta _2)=u(\theta _1)+u(\theta _2)\) if \(\theta _1+\theta _2\le 2\pi \). To see this, let \(\mathfrak R_{\theta _1}\) be the rotation corresponding to the orthonormal basis \(\bigl ((\cos \theta _1,\sin \theta _1),(-\sin \theta _1,\cos \theta _1)\bigr )\), and observe that

$$W(\theta _1+\theta _2)=W(\theta _1)\cup \mathfrak R_{\theta _1}\bigl (W(\theta _2)\bigr )$$

and that \(\mathrm {int}\bigl (W(\theta _1)\bigr )\cap \mathrm {int}\bigl (\mathfrak R_{\theta _1}\bigl (W(\theta _2)\bigr )\bigr )=\emptyset \). Hence, the equality follows from the facts that the boundaries of \(W(\theta _1)\) and \(\mathfrak R_{\theta _1}\bigl (W(\theta _2)\bigr )\) are Riemann negligible and that \(\mathfrak R_{\theta _1}\bigl (W(\theta _2)\bigr )\) has the same volume as \(W(\theta _2)\). After applying this repeatedly, we get \(n\,u\bigl (\frac{2\pi }{n}\bigr )=\pi \) and then that \(u\bigl (\frac{2\pi m}{n}\bigr )=m\,u\bigl (\frac{2\pi }{n}\bigr )\) for \(n\ge 1\) and \(0\le m\le n\). Hence, \(u\bigl (\frac{2\pi m}{n}\bigr )=\frac{\pi m}{n}\) for all \(n\ge 1\) and \(0\le m\le n\). Now, given any \(\theta \in (0,2\pi )\), choose \(\{m_n\in \mathbb N:\,n\ge 1\}\) so that \(0\le \theta -\frac{2\pi m_n}{n}<\frac{2\pi }{n}\). Then, for all \(n\ge 1\),

$$\bigl |u(\theta )-\tfrac{\theta }{2}\bigr |\le \bigl |u(\theta )-u\bigl (\tfrac{2\pi m_n}{n}\bigr )\bigr |+ \bigl |\tfrac{\pi m_n}{n}-\tfrac{\theta }{2}\bigr |\le u\bigl (\tfrac{2\pi }{n}\bigr )+\tfrac{\pi }{n}\le \tfrac{2\pi }{n},$$

and so \(u(\theta )=\frac{\theta }{2}\).

Finally, given any \(0\le \theta _1<\theta _2\le 2\pi \), set \(\theta =\theta _2-\theta _1\), and observe that \(W(\theta _2){\setminus } \mathrm {int}\bigl (W(\theta _1)\bigr )=\mathfrak R_{\theta _1}\bigl (W(\theta )\bigr )\) and therefore that \(W(\theta _2){\setminus } \mathrm {int}\bigl (W(\theta _1)\bigr )\) has the same volume as \(W(\theta )\).\(\square \)
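
The conclusion of Lemma 5.5.5 can also be checked empirically. Here is a small Monte Carlo sketch (our own; the radii, angles, and sample size are arbitrary choices) that estimates the volume of the annular sector and compares it with \(\frac{r_2^2-r_1^2}{2}(\theta _2-\theta _1)\):

```python
import numpy as np

rng = np.random.default_rng(2)
r1, r2, th1, th2 = 0.5, 1.25, 0.3, 2.1

pts = rng.uniform(-r2, r2, size=(400_000, 2))          # uniform points of [-r2, r2]^2
rho = np.hypot(pts[:, 0], pts[:, 1])
phi = np.mod(np.arctan2(pts[:, 1], pts[:, 0]), 2 * np.pi)
inside = (r1 <= rho) & (rho <= r2) & (th1 <= phi) & (phi <= th2)

estimate = inside.mean() * (2 * r2) ** 2               # fraction of the square times its area
exact = 0.5 * (r2**2 - r1**2) * (th2 - th1)
print(estimate, exact)
```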

5.6 Polar and Cylindrical Coordinates

Changing variables in multi-dimensional integrals is more complicated than in one dimension. From the standpoint of the theory that we have developed, the primary reason is that even linear changes of coordinates take rectangles into parallelograms that, in general, are not rectangles with respect to any orthonormal basis. Starting from the formula in terms of determinants for the volume of parallelograms, Jacobi worked out a general formula that says how integrals transform under continuously differentiable changes that satisfy a suitable non-degeneracy condition, but his theory relies on a familiarity with quite a lot of linear algebra and matrix theory. Thus, we will restrict our attention to changes of variables for which his general theory is not required.

We will begin with polar coordinates for \({\mathbb R}^2\). For every point \(\mathbf x\in {\mathbb R}^2{\setminus } \{\mathbf 0\}\) there exists a unique pair \((\rho ,\varphi )\in (0,\infty )\times [0,2\pi )\) such that \(x_1=\rho \cos \varphi \) and \(x_2=\rho \sin \varphi \). Indeed, if \(\rho =|\mathbf x|\), then \(\frac{\mathbf x}{\rho }\in \mathbb S^1(\mathbf 0,1)\), and so \(\varphi \) is the distance, measured counterclockwise, one travels along \(\mathbb S^1(\mathbf 0,1)\) to get from (1, 0) to \(\frac{\mathbf x}{\rho }\). Thus we can use the variables \((\rho ,\varphi )\in (0,\infty )\times [0,2\pi )\) to parameterize \({\mathbb R}^2{\setminus } \{\mathbf 0\}\). We have restricted our attention to \({\mathbb R}^2{\setminus } \{\mathbf 0\}\) because this parameterization breaks down at \(\mathbf 0\). Namely, \(\mathbf 0=(0\cos \varphi ,0\sin \varphi )\) for every \(\varphi \in [0,2\pi )\). However, this flaw will not cause us problems here.

Given a continuous function \(f:\overline{B(\mathbf 0,r)}\longrightarrow {\mathbb C}\), it is reasonable to ask whether the integral of f over \(\overline{B(\mathbf 0,r)}\) can be written as an integral with respect to the variables \((\rho ,\varphi )\). In fact, we have already seen in (5.4.2) that this is possible when f depends only on \(|\mathbf x|\), and we will now show that it is always possible. To this end, for \(\theta \in {\mathbb R}\), let \(\mathfrak R_\theta \) be the rotation in \({\mathbb R}^2\) corresponding to the basis \(\bigl ((\cos \theta ,\sin \theta ),(-\sin \theta ,\cos \theta )\bigr )\). That is,

$$\mathfrak R_\theta \mathbf x=\bigl (x_1\cos \theta -x_2\sin \theta ,x_1\sin \theta +x_2\cos \theta \bigr ).$$

Using (1.5.1), it is easy to check that \(\mathfrak R_{\theta }\circ \mathfrak R_{\varphi }=\mathfrak R_{\theta +\varphi }\). In particular, \(\mathfrak R_{2\pi +\varphi }=\mathfrak R_{\varphi }\).

Lemma 5.6.1

Let \(f:\overline{B(\mathbf 0,r)}\longrightarrow {\mathbb C}\) be a continuous function, and define

$$\tilde{f}(\rho )=\frac{1}{2\pi }\int _0^{2\pi }f\bigl (\rho \cos \varphi ,\rho \sin \varphi \bigr )\,d\varphi \quad \text {for } \rho \in [0,r].$$

Then, for all \(\mathbf x\in \overline{B(\mathbf 0,r)}\),

$$\tilde{f}(|\mathbf x|)=\frac{1}{2\pi }\int _0^{2\pi }f\bigl (\mathfrak R_\varphi \mathbf x\bigr )\,d\varphi .$$

Proof

Set \(\rho =|\mathbf x|\) and choose \(\theta \in [0,1)\) so that \(\mathbf x=\bigl (\rho \cos (2\pi \theta ) ,\rho \sin (2\pi \theta )\bigr )\). Equivalently, \(\mathbf x=\mathfrak R_{2\pi \theta }(\rho ,0)\). Then by the preceding remarks about rotations in \({\mathbb R}^2\) and (3.3.3) applied to the periodic function \(\xi \rightsquigarrow f\bigl (\mathfrak R_{2\pi \xi } (\rho ,0)\bigr )\),

$$\begin{aligned} \frac{1}{2\pi }&\int _0^{2\pi }f\bigl (\mathfrak R_\varphi \mathbf x\bigr )\,d\varphi =\frac{1}{2\pi }\int _0^{2\pi }f\bigl (\mathfrak R_{2\pi \theta +\varphi }(\rho ,0)\bigr )\,d\varphi \\ {}&=\int _0^1 f\bigl (\mathfrak R_{2\pi (\theta +\varphi )}(\rho ,0)\bigr )\,d\varphi =\int _0^1 f\bigl (\mathfrak R_{2\pi \varphi }(\rho ,0)\bigr )\,d\varphi =\tilde{f}(\rho ).\end{aligned}$$

\(\square \)

Theorem 5.6.2

If f is a continuous function on the ball \(\overline{B(\mathbf 0,r)}\) in \({\mathbb R}^2\), then

$$\begin{aligned}\int _{\overline{B(\mathbf 0,r)}}f(\mathbf x)\,d\mathbf x&=\int _0^r \rho \left( \int _0^{2\pi }f\bigl (\rho \cos \varphi ,\rho \sin \varphi \bigr )\,d\varphi \right) d\rho \\ {}&=\int _0^{2\pi }\left( \int _0^r f\bigl (\rho \cos \varphi ,\rho \sin \varphi \bigr )\rho \,d\rho \right) d\varphi .\end{aligned}$$

Proof

By Corollary 5.5.4,

$$\int _{\overline{B(\mathbf 0,r)}}f(\mathbf x)\,d\mathbf x=\int _{\overline{B(\mathbf 0,r)}}f\bigl (\mathfrak R_\varphi \mathbf x\bigr )\,d\mathbf x$$

for all \(\varphi \). Hence, by (5.2.1), Lemma 5.6.1, and (5.4.2),

$$\begin{aligned}\int _{\overline{B(\mathbf 0,r)}}f(\mathbf x)\,d\mathbf x&=\frac{1}{2\pi }\int _0^{2\pi }\left( \int _{\overline{B(\mathbf 0,r)}}f\bigl (\mathfrak R_\varphi \mathbf x\bigr )\,d\mathbf x\right) d\varphi \\&=\int _{\overline{B(\mathbf 0,r)}}\tilde{f}(|\mathbf x|)\,d\mathbf x= 2\pi \int _0^r \tilde{f} (\rho )\rho \,d\rho ,\end{aligned}$$

which, since \(2\pi \tilde{f}(\rho )=\int _0^{2\pi }f\bigl (\rho \cos \varphi ,\rho \sin \varphi \bigr )\,d\varphi \), is the first equality. The second equality follows from the first by another application of (5.2.1).\(\square \)
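
To see Theorem 5.6.2 in action numerically, one can compare a Cartesian Riemann sum for \(\int _{\overline{B(\mathbf 0,1)}}e^{x_1}\,d\mathbf x\) with the corresponding polar Riemann sum (a sketch of our own; the integrand and mesh size are arbitrary):

```python
import numpy as np

r, n = 1.0, 600

# Cartesian midpoint sum over [-r, r]^2, keeping only the points of the disk
t = (np.arange(n) + 0.5) / n * 2 * r - r
X, Y = np.meshgrid(t, t)
cart = np.sum(np.exp(X)[X**2 + Y**2 <= r**2]) * (2 * r / n) ** 2

# polar midpoint sum of f(rho cos phi, rho sin phi) times the factor rho
rho = (np.arange(n) + 0.5) / n * r
phi = (np.arange(n) + 0.5) / n * 2 * np.pi
P, F = np.meshgrid(rho, phi)
polar = np.sum(np.exp(P * np.cos(F)) * P) * (r / n) * (2 * np.pi / n)

print(cart, polar)  # both approximately 3.55
```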

As a preliminary application of this theorem, we will use it to compute integrals over a star shaped region, a region G for which there exist a \(\mathbf c\in {\mathbb R}^2\), known as the center, and a continuous function \(r:[0,2\pi ]\longrightarrow (0,\infty )\), known as the radial function, such that \(r(0)=r(2\pi )\) and

$$ \begin{aligned} G=\bigl \{\mathbf c+r\mathbf e(\varphi ):\,\varphi \in [0,2\pi )\; \& \;r\in \bigl [0,r(\varphi )\bigr )\bigr \},\end{aligned}$$
(5.6.1)

where \(\mathbf e(\varphi ) \equiv (\cos \varphi ,\sin \varphi )\). For instance, if G is a non-empty, bounded, convex open set, then for any \(\mathbf c\in G\), G is star shaped with center at \(\mathbf c\) and

$$r(\varphi )=\max \{r>0:\,\mathbf c+r\mathbf e(\varphi )\in \bar{G}\}.$$

Observe that

$$\begin{aligned}\partial G= \bigl \{\mathbf c+r(\varphi )\mathbf e(\varphi ) :\,\varphi \in [0,2\pi )\bigr \}\end{aligned}$$

and, as a consequence, we can show that \(\partial G\) is Riemann negligible. Indeed, for a given \(\epsilon \in (0,1]\) choose \(n\ge 1\) so that \(|r(\varphi _2)-r(\varphi _1)|<\epsilon \) if \(|\varphi _2-\varphi _1|\le \frac{2\pi }{n}\) and, for \(1\le m\le n\), set

$$ A_m=\bigl \{\mathbf c+\rho \mathbf e(\varphi ):\,\tfrac{2\pi (m-1)}{n}\le \varphi <\tfrac{2\pi m}{n}\; \& \;\bigl |\rho -r\bigl (\tfrac{2\pi m}{n}\bigr )\bigr |\le \epsilon \bigr \}.$$

Then \(\partial G\subseteq \bigcup _{m=1}^nA_{m}\) and, by Lemma 5.5.5,

$$\mathrm {vol}(A_m)\le \frac{\pi }{n}\Bigl (\bigl (r\bigl (\tfrac{2\pi m}{n}\bigr )+\epsilon \bigr )^2 -\bigl (\bigl (r\bigl (\tfrac{2\pi m}{n}\bigr )-\epsilon \bigr )\vee 0\bigr )^2\Bigr ) \le \frac{4\pi \bigl (\Vert r\Vert _{[0,2\pi ]}+1\bigr )\epsilon }{n},$$

and therefore there is a constant \(K<\infty \) such that \(|\partial G|_{\mathrm e}\le K\epsilon \) for all \(\epsilon \in (0,1]\), which means that \(\partial G\) is Riemann negligible. Finally, notice that G is path connected and therefore, by Exercise 4.5, connected.

The following is a significant extension of Theorem 5.6.2.

Corollary 5.6.3

If G is the region in (5.6.1) and \(f:\bar{G}\longrightarrow {\mathbb C}\) is continuous, then

$$\int _{\bar{G}}f(\mathbf x)\,d\mathbf x=\int _0^{2\pi }\left( \int _0^{r(\varphi )}f(\mathbf c+\rho \mathbf e(\varphi ) )\, \rho d\rho \right) d\varphi .$$

Proof

Without loss in generality, we will assume that \(\mathbf c=\mathbf 0\). Set \(r_-=\min \{r(\varphi ):\,\varphi \in [0,2\pi ]\}\) and \(r_+=\max \{r(\varphi ):\,\varphi \in [0,2\pi ]\}\). Given \(n\ge 1\), define \(\eta _n:{\mathbb R}\longrightarrow [0,1]\) by

$$\eta _n(t)={\left\{ \begin{array}{ll} 0&{}\text {if }t\le 0\\ \frac{nt}{r_-}&{}\text {if }0<t\le \frac{r_-}{n} \\ 1 &{}\text {if }t>\frac{r_-}{n},\end{array}\right. }$$

and define \(\alpha _n\) and \(\beta _n\) on \({\mathbb R}^2\) by

$$\alpha _n(\rho \mathbf e(\varphi ) )=\eta _n\bigl (r(\varphi )-\rho \bigr )\text { and } \beta _n(\rho \mathbf e(\varphi ) )=\eta _n\bigl (r(\varphi )+\tfrac{r_-}{n}-\rho \bigr ).$$

Then both \(\alpha _n\) and \(\beta _n\) are continuous functions, \(\alpha _n\) vanishes off of G, and \(\beta _n\) equals 1 on \(\bar{G}\). Finally, define

$$f_n(\mathbf x)\equiv {\left\{ \begin{array}{ll}\alpha _n(\mathbf x)f(\mathbf x)&{}\text {if }\mathbf x\in G\\ 0&{}\text {if }\mathbf x\notin G.\end{array}\right. }$$

Then \(f_n\) is continuous and therefore, by Theorem 5.6.2,

$$\begin{aligned}\int _{\bar{G}}f_n(\mathbf x)\,d\mathbf x&=\int _{\overline{B(\mathbf 0,r_+)}}f_n(\mathbf x)\,d\mathbf x=\int _0^{2\pi }\left( \int _0^{r_+}f_n(\rho \mathbf e(\varphi ) )\rho \,d\rho \right) d\varphi \\ {}&=\int _0^{2\pi }\left( \int _0^{r(\varphi )}f_n(\rho \mathbf e(\varphi ) )\rho \,d\rho \right) d\varphi .\end{aligned}$$

Clearly, again by Theorem 5.6.2,

$$\begin{aligned}&\left| \int _{\bar{G}}f(\mathbf x)\,d\mathbf x-\int _{\bar{G}}f_n(\mathbf x)\,d\mathbf x\right| \le \Vert f\Vert _{\bar{G}}\int _{\bar{G}}\bigl (1-\alpha _n(\mathbf x)\bigr )\,d\mathbf x\\ {}&\quad \le \Vert f\Vert _{\bar{G}}\int _{\overline{B(\mathbf 0,2r_+)}}\beta _n(\mathbf x)\bigl (1-\alpha _n(\mathbf x)\bigr )\,d\mathbf x\\ {}&\quad =\Vert f\Vert _{\bar{G}}\int _0^{2\pi } \left( \int _0^{2r_+}\eta _n\bigl (r(\varphi )+\tfrac{r_-}{n}-\rho \bigr )\Bigl (1-\eta _n\bigl (r(\varphi )-\rho \bigr )\Bigr )\rho \,d\rho \right) d\varphi \\ {}&\quad =\Vert f\Vert _{\bar{G}}\int _0^{2\pi }\left( \int _{r(\varphi )-\frac{r_-}{n}}^{r(\varphi )+\frac{r_-}{n}} \eta _n\bigl (r(\varphi )+\tfrac{r_-}{n}-\rho \bigr )\Bigl (1-\eta _n\bigl (r(\varphi )-\rho \bigr )\Bigr )\rho \,d\rho \right) d\varphi \\ {}&\quad \le \frac{8\pi \Vert f\Vert _{\bar{G}}r_+r_-}{n}.\end{aligned}$$

At the same time,

$$\begin{aligned}&\left| \int _0^{2\pi }\left( \int _0^{r(\varphi )}f(\rho \mathbf e(\varphi ) )\rho \,d\rho \right) d\varphi -\int _0^{2\pi }\left( \int _0^{r(\varphi )}f_n(\rho \mathbf e(\varphi ) )\rho \,d\rho \right) d\varphi \right| \\ {}&\le \Vert f\Vert _{\bar{G}}\int _0^{2\pi }\left( \int _{r(\varphi )-\frac{r_-}{n}}^{r(\varphi )}\rho \,d\rho \right) d\varphi \le \frac{4\pi \Vert f\Vert _{\bar{G}}r_+r_-}{n}.\end{aligned}$$

Thus, the asserted equality follows after one lets \(n\rightarrow \infty \).\(\square \)
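
For a concrete instance of Corollary 5.6.3, take \(\mathbf c=\mathbf 0\), \(r(\varphi )=1+\frac{1}{2}\cos \varphi \), and \(f=1\). The inner integral is then \(\frac{r(\varphi )^2}{2}\), so the corollary says that the area of the region is \(\int _0^{2\pi }\frac{r(\varphi )^2}{2}\,d\varphi =\frac{9\pi }{8}\). A short numerical confirmation (our own sketch):

```python
import numpy as np

n = 2000
phi = (np.arange(n) + 0.5) / n * 2 * np.pi
r = 1 + 0.5 * np.cos(phi)
area = np.sum(r**2 / 2) * (2 * np.pi / n)   # midpoint sum of r(phi)^2 / 2
print(area, 9 * np.pi / 8)
```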

We turn next to cylindrical coordinates in \({\mathbb R}^3\). That is, we represent points in \({\mathbb R}^3\) as \((\rho \mathbf e(\varphi ) ,\xi )\), where \(\rho \ge 0\), \(\varphi \in [0,2\pi )\), and \(\xi \in {\mathbb R}\). Again the correspondence fails to be one-to-one at certain points. Namely, \(\varphi \) is not uniquely determined for \(\mathbf x\in {\mathbb R}^3\) with \(x_1=x_2=0\), but, as before, this will not prevent us from representing integrals in terms of the variables \((\rho ,\varphi ,\xi )\).

Theorem 5.6.4

Let \(\psi :[a,b]\longrightarrow [0,\infty )\) be a continuous function, and set

$$ \varGamma =\{\mathbf x\in {\mathbb R}^3:\,x_3\in [a,b]\; \& \;x_1^2+x_2^2\le \psi (x_3)^2\}.$$

Then \(\varGamma \) is Riemann measurable and

$$\int _\varGamma f(\mathbf x)\,d\mathbf x=\int _a^b\left( \int _0^{\psi (\xi )}\rho \left( \int _0^{2\pi }f\bigl (\rho \mathbf e(\varphi ) ,\xi \bigr )\,d\varphi \right) d\rho \right) d\xi $$

for any continuous function \(f:\varGamma \longrightarrow {\mathbb C}\).

Proof

Given \(n\ge 1\), define \(c_{m,n}=\bigl (1-\frac{m}{n}\bigr )a+\frac{m}{n} b\) for \(0\le m\le n\), and set \(I_{m,n}=[c_{m-1,n},c_{m,n}]\) and \(\varGamma _{\!m,n}=\{\mathbf x\in \varGamma :\,x_3\in I_{m,n}\}\) for \(1\le m\le n\). Next, for each \(1\le m\le n\), set \(\kappa _{\!m,n}=\min _{I_{m,n}}\psi \), \(K_{m,n}=\max _{I_{m,n}}\psi \), and

$$ D_{m,n}=\{\mathbf x:\,\kappa _{m,n}^2\le x_1^2+x_2^2\le K_{m,n}^2\; \& \;x_3\in I_{m,n}\}.$$

To see that \(\varGamma \) is Riemann measurable, we will show that its boundary is Riemann negligible. Indeed, \(\partial \varGamma _{\!m,n}\subseteq D_{m,n}\), and therefore, by Theorem 5.2.1 and Lemma 5.5.5, \(|\partial \varGamma _{\!m,n}|_{\mathrm e}\le \frac{\pi (K_{m,n}^2-\kappa _{m,n}^2)(b-a)}{n}\). Since

$$\lim _{n\rightarrow \infty }\max _{1\le m\le n}(K_{m,n}-\kappa _{m,n})=0\text { and }|\partial \varGamma |_{\mathrm e}\le \sum _{m=1}^n |\partial \varGamma _{\!m,n}|_{\mathrm e},$$

it follows that \(|\partial \varGamma |_{\mathrm e}=0\). Of course, since each \(\varGamma _{\!m,n}\) is a set of the same form as \(\varGamma \), each of them is also Riemann measurable.

Now let f be given. Then

$$\int _\varGamma f(\mathbf x)\,d\mathbf x=\sum _{m=1}^n\int _{\varGamma _{m,n}}f(\mathbf x)\,d\mathbf x= \sum _{m=1}^n\int _{C _{m,n}}f(\mathbf x)\,d\mathbf x-\sum _{m=1}^n\int _{\varGamma _{m,n}{\setminus } C_{m,n}}f(\mathbf x)\,d\mathbf x,$$

where \( C_{m,n}\equiv \{\mathbf x:\,x_1^2+x_2^2\le \kappa _{m,n}^2\; \& \;x_3\in I_{m,n}\}\). Since \(\varGamma _{\!m,n}{\setminus } C_{m,n}\subseteq D_{m,n}\), the computation in the preceding paragraph shows that

$$\left| \sum _{m=1}^n\int _{\varGamma _{\!m,n}{\setminus } C_{m,n}}f(\mathbf x)\,d\mathbf x\right| \le \frac{\Vert f\Vert _\varGamma \pi (b-a)}{n}\sum _{m=1}^n\bigl (K_{m,n}^2-\kappa _{m,n}^2\bigr )\longrightarrow 0$$

as \(n\rightarrow \infty \). Next choose \(\xi _{m,n}\in I_{m,n}\) so that \(\psi (\xi _{m,n})=\kappa _{m,n}\), and set

$$\epsilon _n=\max _{1\le m\le n}\sup _{\mathbf x\in C_{m,n}}|f(\mathbf x)-f(x_1,x_2,\xi _{m,n})|.$$

Then

$$\left| \sum _{m=1}^n\int _{C _{m,n}}f(\mathbf x)\,d\mathbf x-\sum _{m=1}^n\int _{C _{m,n}}f(x_1,x_2,\xi _{m,\,n})\,d\mathbf x\right| \le \epsilon _n\mathrm {vol}(\varGamma )\longrightarrow 0.$$

Finally, observe that \(\int _{C _{m,n}}f(x_1,x_2,\xi _{m,n})\,d\mathbf x=\frac{b-a}{n} g(\xi _{m,n})\) where g is the continuous function on \([a,b]\) given by

$$g(\xi )\equiv \int _0^{\psi (\xi )}\rho \left( \int _0^{2\pi } f\bigl (\rho \mathbf e(\varphi ) ,\xi \bigr )\,d\varphi \right) d\rho .$$

Hence, \(\sum _{m=1}^n\int _{C _{m,n}}f(\mathbf x)\,d\mathbf x=\mathcal R(g;\mathcal C_n,\varXi _n)\) where \(\mathcal C_n=\{I_{m,n}:\,1\le m\le n\}\) and \(\varXi _n(I_{m,n})=\xi _{m,n}\). Now let \(n\rightarrow \infty \) to get the desired conclusion.\(\square \)

Integration over balls in \({\mathbb R}^3\) is a particularly important example to which Theorem 5.6.4 applies. Namely, take \(a=-r\), \(b=r\), and \(\psi (\xi )=\sqrt{r^2-\xi ^2}\) for \(\xi \in [-r,r]\). Then Theorem 5.6.4 says that

$$\begin{aligned} \begin{aligned} \int _{\overline{B(\mathbf 0,r)}}&f(\mathbf x)\,d\mathbf x\\ {}&=\int _{-r}^r\left( \int _0^{\sqrt{r^2-\xi ^2}}\rho \left( \int _0^{2\pi } f\bigl (\rho \cos \varphi ,\rho \sin \varphi ,\xi \bigr )\,d\varphi \right) d\rho \right) d\xi .\end{aligned}\end{aligned}$$
(5.6.2)
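
As a sanity check of (5.6.2) (ours, not the text's), take \(f=1\): the \(\varphi \)-integral contributes \(2\pi \), the \(\rho \)-integral contributes \(\frac{r^2-\xi ^2}{2}\), and the \(\xi \)-integral should then reproduce the volume \(\frac{4}{3}\pi r^3\) of the ball:

```python
import numpy as np

r, n = 2.0, 4000
xi = (np.arange(n) + 0.5) / n * 2 * r - r           # midpoints of [-r, r]
vol = np.sum(np.pi * (r**2 - xi**2)) * (2 * r / n)  # 2*pi times (r^2 - xi^2)/2, summed
print(vol, 4 * np.pi * r**3 / 3)
```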

There is a beautiful application of (5.6.2) to a famous observation made by Newton about his law of gravitation. According to his law, the force exerted by a particle of mass \(m_1\) at \(\mathbf y\in {\mathbb R}^3\) on a particle of mass \(m_2\) at \(\mathbf b\in {\mathbb R}^3{\setminus } \{\mathbf y\}\) is equal to

$$\frac{Gm_1m_2}{|\mathbf y-\mathbf b|^3}(\mathbf y-\mathbf b),$$

where G is the gravitational constant. Next, suppose that \(\varOmega \) is a bounded, closed, Riemann measurable region on which mass is continuously distributed with density \(\mu \). Then the force that the mass in \(\varOmega \) exerts on a particle of mass m at \(\mathbf b\notin \varOmega \) is given by

$$\int _\varOmega \frac{Gm\mu (\mathbf y)}{|\mathbf y-\mathbf b|^3}(\mathbf y-\mathbf b)\,d\mathbf y.$$

Newton’s observation was that if \(\varOmega \) is a ball and the mass density depends only on the distance from the center of the ball, then the force felt by a particle outside the ball is the same as the force exerted on it by a particle at the center of the ball with mass equal to the total mass of the ball. That is, if \(\varOmega =\overline{B(\mathbf c,r)}\) and \(\mu :[0,r]\longrightarrow [0,\infty )\) is continuous, then for \(\mathbf b\notin \overline{B(\mathbf c,r)}\),

$$\begin{aligned} \begin{aligned}&\int _{\overline{B(\mathbf c,r)}}\frac{Gm\mu (|\mathbf y-\mathbf c|)}{|\mathbf y-\mathbf b|^3}(\mathbf y-\mathbf b)\,d\mathbf y=\frac{GMm}{|\mathbf c-\mathbf b|^3}(\mathbf c-\mathbf b)\\&\qquad \text {where }M=\int _{\overline{B(\mathbf c,r)}}\mu \bigl (|\mathbf y-\mathbf c|\bigr )\,d\mathbf y.\end{aligned}\end{aligned}$$
(5.6.3)

(See Exercise 5.8 for the case when \(\mathbf b\) lies inside the ball).

Using translations and rotations, one can reduce the proof of (5.6.3) to the case when \(\mathbf c=\mathbf 0\) and \(\mathbf b=(0,0,-D)\) for some \(D>r\). Further, without loss in generality, we will assume that \(Gm=1\). Next observe that, by rotation invariance applied to the rotations that take \((y_1,y_2,y_3)\) to \((\mp y_1,\pm y_2,y_3)\),

$$\int _{\overline{B(\mathbf 0,r)}}\frac{\mu (|\mathbf y|)}{|\mathbf y-\mathbf b|^3} y_i\,d\mathbf y=-\int _{\overline{B(\mathbf 0,r)}}\frac{\mu (|\mathbf y|)}{|\mathbf y-\mathbf b|^3}y_i\,d\mathbf y$$

and therefore

$$\int _{\overline{B(\mathbf 0,r)}}\frac{\mu (|\mathbf y|)}{|\mathbf y-\mathbf b|^3}y_i\,d\mathbf y=0\quad \text {for } i\in \{1,2\}.$$

Thus, it remains to show that

$$\int _{\overline{B(\mathbf 0,r)}}\frac{\mu (|\mathbf y|)}{|\mathbf y-\mathbf b|^3}y_3\,d\mathbf y=D^{-2}\int _{\overline{B(\mathbf 0,r)}}\mu (|\mathbf y|)\,d\mathbf y.\qquad ({*})$$

To prove (\(*\)), we apply (5.6.2) to the function

$$f(\mathbf x)=\frac{\mu (|\mathbf x|)(x_3+D)}{\bigl (x_1^2+x_2^2+(x_3+D)^2\bigr )^{\frac{3}{2}}}$$

to write the left hand side as \(2\pi J\) where

$$J\equiv \int _{-r}^r\left( \int _0^{\sqrt{r^2-\xi ^2}}\frac{\rho \mu (\sqrt{\rho ^2+\xi ^2}) (\xi +D)}{\bigl (\rho ^2+(\xi +D)^2\bigr )^{\frac{3}{2}}}\,d\rho \right) d\xi .$$

Now make the change of variables \(\sigma =\sqrt{\rho ^2+\xi ^2}\) in the inner integral to see that

$$J=\int _{-r}^r(\xi +D)\left( \int _{|\xi |}^r\frac{\sigma \mu (\sigma )}{(\sigma ^2+2\xi D+D^2)^{\frac{3}{2}}}\,d\sigma \right) d\xi ,$$

and then apply (5.2.1) to obtain

$$J=\int _0^r\sigma \mu (\sigma )\left( \int _{-\sigma }^\sigma \frac{D+\xi }{(\sigma ^2+2\xi D+D^2)^{\frac{3}{2}}}\,d\xi \right) d\sigma .$$

Use the change of variables \(\eta =\sigma ^2+2\xi D+D^2\) in the inner integral to write it as

$$\frac{1}{4D^2}\int _{(D-\sigma )^2}^{(D+\sigma )^2}\bigl (\eta ^{-\frac{1}{2}}+(D^2-\sigma ^2)\eta ^{-\frac{3}{2}}\bigr )\,d\eta =\frac{2\sigma }{D^2}.$$

Hence,

$$2\pi J=\frac{4\pi }{D^2}\int _0^r\mu (\sigma )\sigma ^2\,d\sigma .$$

Finally, note that \(3\varOmega _3=4\pi \), and apply (5.4.5) with \(N=3\) to see that

$$4\pi \int _0^r\mu (\sigma )\sigma ^2\,d\sigma =\int _{\overline{B(\mathbf 0,r)}}\mu (|\mathbf x|)\,d\mathbf x.$$
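
Before moving on, here is a Monte Carlo confirmation of (\(*\)) (our own sketch, with \(\mu =1\), \(r=1\), and \(D=2\) as arbitrary test values); the integral should equal \(D^{-2}\) times the volume \(\frac{4\pi }{3}\) of the unit ball:

```python
import numpy as np

rng = np.random.default_rng(3)
D = 2.0

y = rng.uniform(-1.0, 1.0, size=(2_000_000, 3))
y = y[(y * y).sum(axis=1) <= 1.0]                  # keep the points of the unit ball

dist3 = (y[:, 0]**2 + y[:, 1]**2 + (y[:, 2] + D)**2) ** 1.5
integrand = (y[:, 2] + D) / dist3                  # the integrand of (*) with mu = 1

estimate = integrand.mean() * (4 * np.pi / 3)      # mean over the ball times its volume
print(estimate, (4 * np.pi / 3) / D**2)
```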

We conclude this section by using (5.6.2) to derive the analog of Theorem 5.6.2 for integrals over balls in \({\mathbb R}^3\). One way to introduce polar coordinates for \({\mathbb R}^3\) is to think about the use of latitude and longitude to locate points on a globe. To begin with, one has to choose a reference axis, which in the case of a globe is chosen to be the one passing through the north and south poles. Given a point \(\mathbf q\) on the globe, consider a plane \(P_\mathbf q\) containing the reference axis that passes through \(\mathbf q\). (There will be only one unless \(\mathbf q\) is a pole.) Thinking of points on the globe as vectors with base at the center of the earth, the latitude of a point is the angle that \(\mathbf q\) makes in \(P_\mathbf q\) with the north pole \(\mathbf N\). Before describing the longitude of \(\mathbf q\), one has to choose a reference point \(\mathbf q_0\) that is not on the reference axis. In the case of a globe, the standard choice is Greenwich, England. Then the longitude of \(\mathbf q\) is the angle between the projections of \(\mathbf q\) and \(\mathbf q_0\) in the equatorial plane, the plane that passes through the center of the earth and is perpendicular to the reference axis.

Now let \(\mathbf x\in {\mathbb R}^3{\setminus } \{\mathbf 0\}\). With the preceding in mind, we say that the polar angle of \(\mathbf x=(x_1,x_2,x_3)\) is the \(\theta \in [0,\pi ]\) such that \(\cos \theta =\frac{(\mathbf x,\mathbf N)_{{\mathbb R}^3}}{|\mathbf x|}\), where \(\mathbf N=(0,0,1)\). Assuming that \(\sigma =\sqrt{x_1^2+x_2^2}>0\), the azimuthal angle of \(\mathbf x\) is the \(\varphi \in [0,2\pi )\) such that \((x_1,x_2)=(\sigma \cos \varphi ,\sigma \sin \varphi )\). In other words, in terms of the globe model, we have taken the center of the earth to lie at the origin, the north and south poles to be (0, 0, 1) and \((0,0,-1)\), and “Greenwich” to be located at (1, 0, 0). Thus the polar angle plays the role of the latitude, except that it is measured from the north pole rather than from the equator, and the azimuthal angle gives the longitude.

The preceding considerations lead to the parameterization

$$\begin{aligned}(\rho ,\theta ,\varphi )&\in [0,\infty )\times [0,\pi ]\times [0,2\pi ) \\ {}&\longmapsto \mathbf x_{(\rho ,\theta ,\varphi )}\equiv \bigl (\rho \sin \theta \cos \varphi ,\rho \sin \theta \sin \varphi ,\rho \cos \theta \big )\in {\mathbb R}^3\end{aligned}$$

of points in \({\mathbb R}^3\). Assuming that \(\rho >0\), \(\theta \) is the polar angle of \(\mathbf x_{(\rho ,\theta ,\varphi )}\), and, assuming that \(\rho >0\) and \(\theta \notin \{0,\pi \}\), \(\varphi \) is its azimuthal angle. On the other hand, when \(\rho =0\), then \(\mathbf x_{(\rho ,\theta ,\varphi )}=\mathbf 0\) for all \((\theta ,\varphi )\in [0,\pi ]\times [0,2\pi )\), and when \(\rho >0\) but \(\theta \in \{0,\pi \}\), \(\theta \) is uniquely determined but \(\varphi \) is not. In spite of these ambiguities, if \(\mathbf x=\mathbf x_{(\rho ,\theta ,\varphi )}\), then \((\rho ,\theta ,\varphi )\) are called the polar coordinates of \(\mathbf x\), and as we are about to show, integrals of functions over balls in \({\mathbb R}^3\) can be written as integrals with respect to the variables \((\rho ,\theta ,\varphi )\).

Let \(f:\overline{B(\mathbf 0,r)}\longrightarrow {\mathbb C}\) be a continuous function. Then, by (5.6.2) and (5.2.2), the integral of f over \(\overline{B(\mathbf 0,r)}\) equals

$$\int _0^{2\pi }\left( \int _{-r}^r\left( \int _0^{\sqrt{r^2-\xi ^2}} \sigma f_\varphi \bigl (\sigma ,\xi \bigr )\,d\sigma \right) d\xi \right) d\varphi ,$$

where \(f_\varphi (\sigma ,\xi )=f\bigl (\sigma \cos \varphi ,\sigma \sin \varphi ,\xi \bigr )\). Observe that

$$\begin{aligned}\int _{-r}^r&\left( \int _0^{\sqrt{r^2-\xi ^2}} \sigma f_\varphi \bigl (\sigma ,\xi \bigr )\,d\sigma \right) d\xi \\ {}&= \int _{0}^r\left( \int _0^{\sqrt{r^2-\xi ^2}} \sigma f_\varphi \bigl (\sigma ,\xi \bigr )\,d\sigma \right) d\xi +\int _{-r}^0\left( \int _0^{\sqrt{r^2-\xi ^2}} \sigma f_\varphi \bigl (\sigma ,\xi \bigr )\,d\sigma \right) d\xi \\ {}&= \int _{0}^r\sigma \left( \int _0^{\sqrt{r^2-\sigma ^2}} f_\varphi \bigl (\sigma ,\xi \bigr )\,d\xi \right) d\sigma +\int _{0}^r\sigma \left( \int _0^{\sqrt{r^2-\sigma ^2}} f_\varphi \bigl (\sigma ,-\xi \bigr )\,d\xi \right) d\sigma ,\end{aligned}$$

and make the change of variables \(\xi =\sqrt{\rho ^2-\sigma ^2}\) to write

$$\int _0^{\sqrt{r^2-\sigma ^2}} f_\varphi \bigl (\sigma ,\pm \xi \bigr )\,d\xi =\int _{(\sigma ,r]}f_\varphi \bigl (\sigma ,\pm \sqrt{\rho ^2-\sigma ^2}\bigr )\frac{\rho }{\sqrt{\rho ^2-\sigma ^2}}\,d\rho .$$

Hence, we now know that

$$\begin{aligned}\int _{-r}^r&\left( \int _0^{\sqrt{r^2-\xi ^2}}\sigma f_\varphi \bigl (\sigma ,\xi \bigr )\,d\sigma \right) d\xi \\ {}&=\int _0^r\sigma \left( \int _{(\sigma ,r]}\Bigl (f_\varphi \bigl (\sigma ,\sqrt{\rho ^2-\sigma ^2}\bigr )+ f_\varphi \bigl (\sigma ,-\sqrt{\rho ^2-\sigma ^2}\bigr )\Bigr )\frac{\rho }{\sqrt{\rho ^2-\sigma ^2}}\,d\rho \right) d\sigma \\ {}&=\int _0^r\rho \left( \int _{[0,\rho )} \Bigl (f_\varphi \bigl (\sigma ,\sqrt{\rho ^2-\sigma ^2}\bigr )+ f_\varphi \bigl (\sigma ,-\sqrt{\rho ^2-\sigma ^2}\bigr )\Bigr )\frac{\sigma }{\sqrt{\rho ^2-\sigma ^2}}d\sigma \right) d\rho ,\end{aligned}$$

where we have made use of the obvious extension of Fubini’s Theorem to integrals that are limits of Riemann integrals. Finally, use the change of variables \(\sigma =\rho \sin \theta \) in the inner integral to conclude that

$$\int _{-r}^r\left( \int _0^{\sqrt{r^2-\xi ^2}} \sigma f_\varphi \bigl (\sigma ,\xi \bigr )\,d\sigma \right) d\xi =\int _0^r\rho ^2\left( \int _0^\pi f_\varphi \bigl (\rho \sin \theta ,\rho \cos \theta \bigr )\sin \theta \,d\theta \right) d\rho $$

and therefore, after an application of (5.2.2), that

$$\begin{aligned} \begin{aligned}&\int _{\overline{B(\mathbf 0,r)}}f(\mathbf x)\,d\mathbf x\\&=\int _0^{r}\rho ^2\biggl (\int _0^\pi \sin \theta \biggl (\int _0^{2\pi }f\bigl (\rho \sin \theta \cos \varphi , \rho \sin \theta \sin \varphi ,\rho \cos \theta \bigr )\,d\varphi \biggr )d\theta \biggr )d\rho .\end{aligned}\end{aligned}$$
(5.6.4)
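
As a check of (5.6.4), and in particular of the factor \(\sin \theta \), take \(f(\mathbf x)=x_3^2\) on the unit ball, for which the iterated integral evaluates to \(2\pi \int _0^1\rho ^4\,d\rho \int _0^\pi \cos ^2\theta \,\sin \theta \,d\theta =\frac{4\pi }{15}\). The following sketch (ours) reproduces this value:

```python
import numpy as np

n = 400
rho = (np.arange(n) + 0.5) / n                   # midpoints of [0, 1]
theta = (np.arange(n) + 0.5) / n * np.pi         # midpoints of [0, pi]
R, T = np.meshgrid(rho, theta)

# f = rho^2 cos^2(theta) is independent of phi, so the phi-integral gives 2*pi
vals = R**4 * np.cos(T) ** 2 * np.sin(T)         # rho^2 times f times sin(theta)
spherical = vals.sum() * (1.0 / n) * (np.pi / n) * (2 * np.pi)
print(spherical, 4 * np.pi / 15)
```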

5.7 The Divergence Theorem in \({\mathbb R}^2\)

Integration by parts in more than one dimension takes many forms, and in order to even state these results in generality one needs more machinery than we have developed. Thus, we will deal with only a couple of examples and not attempt to derive a general statement.

The most basic result is a simple application of The Fundamental Theorem of Calculus, Theorem 3.2.1. Namely, consider a rectangle \(R=\prod _{j=1}^N[a_j,b_j]\), where \(N\ge 2\), and let \(\varphi \) be a \(\mathbb C\)-valued function which is continuously differentiable on \(\mathrm {int}(R)\) and has bounded derivatives there. Given \({\varvec{\xi }}\in {\mathbb {R}^N}\), one has

$$\begin{aligned} \begin{aligned}&\int _R\partial _{\varvec{\xi }}\varphi (\mathbf x)\,d\mathbf x=\sum _{j=1}^N\xi _j\left( \int _{R_j(b_j)}\varphi (\mathbf y)\, d\sigma _\mathbf y-\int _{R_j(a_j)}\varphi (\mathbf y)\,d\sigma _\mathbf y\right) \\&\qquad \text {where }R_j(c)\equiv \left( \prod _{i=1}^{j-1}[a_i,b_i]\right) \times \{c\}\times \left( \prod _{i=j+1}^N[a_i,b_i]\right) ,\end{aligned}\end{aligned}$$
(5.7.1)

where the integral \(\int _{R_j(c)}\psi (\mathbf y)\,d\sigma _\mathbf y\) of a function \(\psi \) over \(R_j(c)\) is interpreted as the \((N-1)\)-dimensional integral

$$\int \limits _{\prod _{i\ne j}[a_i,b_i]}\psi (y_1,\ldots ,y_{j-1},c,y_{j+1},\ldots ,y_N)\,dy_1\cdots dy_{j-1}dy_{j+1}\cdots dy_{N}.$$

Verification of (5.7.1) is easy. First write \(\partial _{\varvec{\xi }}\varphi \) as \(\sum _{j=1}^N\xi _j\partial _{\mathbf e_j}\varphi \). Second, use (5.2.2) with the permutation that exchanges j and 1 but leaves the ordering of the other indices unchanged, and apply Theorem 3.2.1.

In many applications one is dealing with an \({\mathbb R}^N\)-valued function \(\mathbf F\) and is integrating its divergence

$$\mathrm {div}\mathbf F\equiv \sum _{j=1}^N\partial _{\mathbf e_j}F_j$$

over R. By applying (5.7.1) to each coordinate, one arrives at

$$\begin{aligned} \int _R\mathrm {div}\mathbf F(\mathbf x)\,d\mathbf x=\sum _{j=1}^N\int _{R_j(b_j)}F_j(\mathbf y)\,d\sigma _\mathbf y-\sum _{j=1}^N\int _{R_j(a_j)}F_j(\mathbf y) \,d\sigma _\mathbf y,\end{aligned}$$
(5.7.2)

but there is a more revealing way to write (5.7.2). To explain this alternative version, let \(\partial G\) be the boundary of a bounded open subset G in \({\mathbb {R}^N}\). Given a point \(\mathbf x\in \partial G\), say that \({\varvec{\xi }}\in {\mathbb {R}^N}\) is a tangent vector to \(\partial G\) at \(\mathbf x\) if there is a continuously differentiable path \(\gamma :(-1,1)\longrightarrow \partial G\) such that \(\mathbf x=\gamma (0)\) and \({\varvec{\xi }}=\dot{\gamma } (0)\equiv \frac{d\gamma }{dt}(0)\). That is, \({\varvec{\xi }}\) is the velocity at the time when a path on \(\partial G\) passes through \(\mathbf x\). For instance, when \(G=\mathrm {int}(R)\) and \(\mathbf x\in R_j(a_j)\cup R_j(b_j)\) is not on one of the edges, then it is obvious that \({\varvec{\xi }}\) is tangent to \(\partial G\) at \(\mathbf x\) if and only if \(({\varvec{\xi }},\mathbf e_j)_{\mathbb {R}^N}=0\). If \(\mathbf x\) is at an edge, \(({\varvec{\xi }},\mathbf e_j)_{\mathbb {R}^N}\) will be 0 for every tangent vector \({\varvec{\xi }}\), but there will be \({\varvec{\xi }}\)’s for which \(({\varvec{\xi }},\mathbf e_j)_{\mathbb {R}^N}=0\) and yet there is no continuously differentiable path that stays on \(\partial G\), passes through \(\mathbf x\), and has derivative \({\varvec{\xi }}\) when it does. When \(G=B(\mathbf 0,r)\) and \(\mathbf x\in \mathbb S^{N-1}(\mathbf 0,r)\equiv \partial B(\mathbf 0,r)\), then \({\varvec{\xi }}\) is tangent to \(\partial G\) at \(\mathbf x\) if and only if \(({\varvec{\xi }},\mathbf x)_{\mathbb {R}^N}=0\).Footnote 5 To see this, first suppose that \({\varvec{\xi }}\) is tangent to \(\partial G\) at \(\mathbf x\), and let \(\gamma \) be an associated path. Then

$$0=\partial _t|\gamma (t)|^2=2\bigl (\gamma (t),\dot{\gamma } (t)\bigr )_{\mathbb {R}^N}=2(\mathbf x,{\varvec{\xi }})_{\mathbb {R}^N}\quad \text {at } t=0.$$

Conversely, suppose that \((\mathbf x,{\varvec{\xi }})_{\mathbb {R}^N}=0\). If \({\varvec{\xi }}=\mathbf 0\), then we can take \(\gamma (t)=\mathbf x\) for all t. If \({\varvec{\xi }}\ne \mathbf 0\), define \(\gamma (t)=\bigl (\cos (r^{-1}|{\varvec{\xi }}|t)\bigr )\mathbf x+\frac{r}{|{\varvec{\xi }}|}\bigl (\sin (r^{-1}|{\varvec{\xi }}|t)\bigr ){\varvec{\xi }}\), and check that \(\gamma (t)\in \mathbb S^{N-1}(\mathbf 0,r)\) for all t, \(\gamma (0)=\mathbf x\), and \(\dot{\gamma } (0)={\varvec{\xi }}\).

Having defined what it means for a vector to be tangent to \(\partial G\) at \(\mathbf x\), we now say that a vector \({\varvec{\eta }}\) is a normal vector to \(\partial G\) at \(\mathbf x\) if \(({\varvec{\eta }},{\varvec{\xi }})_{\mathbb {R}^N}=0\) for every tangent vector \({\varvec{\xi }}\) at \(\mathbf x\). For nice regions like balls, there is essentially only one normal vector at a point. Indeed, as we saw, \({\varvec{\xi }}\) is tangent to \(\partial B(\mathbf 0,r)\) at \(\mathbf x\) if and only if \(({\varvec{\xi }},\mathbf x)_{\mathbb {R}^N}=0\), and so every normal vector there will have the form \(\alpha \mathbf x\) for some \(\alpha \in {\mathbb R}\). In particular, there is a unique unit vector, known as the outward pointing unit normal vector \(\mathbf n(\mathbf x)\), that is normal to \(\partial B(\mathbf 0,r)\) at \(\mathbf x\) and is pointing outward in the sense that \(\mathbf x\,+\,t\mathbf n(\mathbf x)\notin B(\mathbf 0,r)\) for \(t>0\). In fact, \(\mathbf n(\mathbf x)=\frac{\mathbf x}{|\mathbf x|}\). Similarly, when \(\mathbf x\in R_j(a_j)\cup R_j(b_j)\) is not on an edge, every normal vector will be of the form \(\alpha \mathbf e_j\), and the outward pointing unit normal at \(\mathbf x\) will be \(-\mathbf e_j\) or \(\mathbf e_j\) depending on whether \(\mathbf x\in R_j(a_j)\) or \(\mathbf x\in R_j(b_j)\). However, when \(\mathbf x\) is at an edge, there are too few tangent vectors to uniquely determine an outward pointing unit normal vector at \(\mathbf x\).


Fortunately, because this flaw is present only on a Riemann negligible set, it is not fatal for the application that we will make of these concepts to (5.7.2). To be precise, define \(\mathbf n(\mathbf x)=\mathbf 0\) for \(\mathbf x\in \partial R\) that are on an edge, note that \(\mathbf n\) is continuous off of a Riemann negligible subset of \({\mathbb R}^{N-1}\), and observe that (5.7.2) can be rewritten as

$$\begin{aligned} \int _R\mathrm {div}\mathbf F(\mathbf x)\,d\mathbf x=\int _{\partial R} \bigl (\mathbf F(\mathbf y),\mathbf n(\mathbf y)\bigr )_{\mathbb {R}^N}\,d\sigma _\mathbf y,\end{aligned}$$
(5.7.3)

where

$$\int _{\partial R}\psi (\mathbf y)\,d\sigma _\mathbf y\equiv \sum _{j=1}^N\left( \int _{R_j(a_j)}\psi (\mathbf y)\,d\sigma _\mathbf y+ \int _{R_j(b_j)}\psi (\mathbf y)\,d\sigma _\mathbf y\right) .$$

Besides being more aesthetically pleasing than (5.7.2), (5.7.3) has the advantage that it is in a form that generalizes and has a nice physical interpretation. In fact, once one knows how to interpret integrals over the boundary of more general regions, one can show that

$$\begin{aligned} \int _G\mathrm {div}\bigl (\mathbf F(\mathbf x)\bigr )\,d\mathbf x=\int _{\partial G}\bigl (\mathbf F(\mathbf y),\mathbf n(\mathbf y)\bigr )_{\mathbb {R}^N}\,d\sigma _\mathbf y\end{aligned}$$
(5.7.4)

holds for quite general regions, and this generalization is known as the divergence theorem . Unfortunately, understanding of the physical interpretation requires one to know the relationship between \(\mathrm {div}\mathbf F\) and the flow that \(\mathbf F\) determines, and, although a rigorous explanation of this connection is beyond the scope of this book, here is the idea. In Sect. 4.5 we showed that if \(\mathbf F\) satisfies (4.5.4), then it determines a map \(\mathbf X:{\mathbb R}\times {\mathbb {R}^N}\longrightarrow {\mathbb {R}^N}\) by the equation

$$\dot{\mathbf X}(t,\mathbf x)=\mathbf F\bigl (\mathbf X(t,\mathbf x)\bigr )\quad \text {with } \mathbf X(0,\mathbf x)=\mathbf x.$$

In other words, for each \(\mathbf x\), \(t\rightsquigarrow \mathbf X(t,\mathbf x)\) is the path that passes through \(\mathbf x\) at time \(t=0\) and has velocity \(\mathbf F\bigl (\mathbf X(t,\mathbf x)\bigr )\) for all t. Now think about mass that is initially uniformly distributed in a bounded region G and that is flowing along these paths. If one monitors the region to determine how much mass is lost or gained as a consequence of the flow, one can show that the rate at which change is taking place is given by the integral of \(\mathrm {div}\mathbf F\) over G. If instead of monitoring the region, one monitors the boundary and measures how much mass is passing through it in each direction, then one finds that the rate of change is given by the integral of \(\bigl (\mathbf F(\mathbf x),\mathbf n(\mathbf x)\bigr )_{\mathbb {R}^N}\) over the boundary. Thus, (5.7.4) is simply stating that these two methods of measurement give the same answer.

We will now verify (5.7.4) for a special class of regions in \({\mathbb R}^2\). The main reason for working in \({\mathbb R}^2\) is that regions there are likely to have boundaries that are piecewise parameterized curves, which, by the results in Sect. 4.4, means that we know how to integrate over them. The regions G with which we will deal are piecewise smooth star shaped regions in \({\mathbb R}^2\) given by (5.6.1) with a continuous function \(\varphi \in [0,2\pi ]\longmapsto r(\varphi )\in (0,\infty )\) that satisfies \(r(0)=r(2\pi )\) and is piecewise smooth. Clearly the boundary of such a region is a piecewise parameterized curve. Indeed, consider the path \(\mathbf p(\varphi )=\mathbf c +r(\varphi )\mathbf e(\varphi )\) where, as before, \(\mathbf e(\varphi )=(\cos \varphi ,\sin \varphi )\). Then the restriction \(\mathbf p_1\) of \(\mathbf p\) to \([0,\pi ]\) and the restriction \(\mathbf p_2\) of \(\mathbf p\) to \([\pi ,2\pi ]\) parameterize non-overlapping parameterized curves whose union is \(\partial G\). Moreover, by (4.4.2), since

$$\dot{\mathbf p}(\varphi )=r^{\prime }(\varphi )\mathbf e(\varphi )+r(\varphi )\dot{\mathbf e}(\varphi ),\;\bigl (\mathbf e(\varphi ),\dot{\mathbf e}(\varphi )\bigr )_{{\mathbb R}^2}=0,\text { and }|\mathbf e(\varphi )|=|\dot{\mathbf e}(\varphi )|=1,$$

we have that

$$\int _{\partial G}f(\mathbf y)\,d\sigma _\mathbf y=\int _0^{2\pi }f\bigl (\mathbf c+r(\varphi )\mathbf e(\varphi ) \bigr ) \sqrt{r(\varphi )^2+r^{\prime }(\varphi )^2}\,d\varphi .$$

Next observe that

$$\mathbf t(\varphi )\equiv \bigl (r^{\prime }(\varphi )\cos \varphi -r(\varphi )\sin \varphi ,r^{\prime }(\varphi )\sin \varphi +r(\varphi )\cos \varphi \bigr )$$

is tangent to \(\partial G\) at \(\mathbf p(\varphi )\), and therefore that the outward pointing unit normal to \(\partial G\) at \(\mathbf p(\varphi )\) is

$$\begin{aligned} \mathbf n\bigl (\mathbf p(\varphi )\bigr ) =\pm \frac{\bigl (r(\varphi )\cos \varphi +r^{\prime }(\varphi )\sin \varphi ,r(\varphi )\sin \varphi -r^{\prime }(\varphi )\cos \varphi \bigr )}{\sqrt{r(\varphi )^2+r^{\prime }(\varphi )^2}}\end{aligned}$$
(5.7.5)

for \(\varphi \in [0,2\pi ]\) in intervals where r is continuously differentiable. Further, since

$$\bigl (\mathbf n\bigl (\mathbf p(\varphi )\bigr ),\mathbf p(\varphi )-\mathbf c\bigr )_{{\mathbb R}^2}=\frac{r(\varphi )^2}{\sqrt{r(\varphi )^2+r^{\prime }(\varphi )^2}}>0$$

and therefore

$$|\mathbf p(\varphi )+t\mathbf n\bigl (\mathbf p(\varphi )\bigr )-\mathbf c|^2>r(\varphi )^2\quad \text {for } t>0,$$

we know that the plus sign is the correct one. Taking all of this into account, we see that (5.7.4) for G is equivalent to

$$\begin{aligned} \begin{aligned} \int _{G}\mathrm {div}\mathbf F(\mathbf x)\,d\mathbf x=&\int _0^{2\pi }\bigl (r(\varphi )\cos \varphi +r^{\prime }(\varphi )\sin \varphi \bigr )F_1\bigl (\mathbf c+r(\varphi )\mathbf e(\varphi ) \bigr )\,d\varphi \\ {}&+\int _0^{2\pi }\bigl (r(\varphi )\sin \varphi -r^{\prime }(\varphi )\cos \varphi \bigr )F_2\bigl (\mathbf c+r(\varphi )\mathbf e(\varphi ) \bigr )\,d\varphi . \end{aligned}\end{aligned}$$
(5.7.6)

In proving (5.7.6), we will assume, without loss in generality, that \(\mathbf c=\mathbf 0\). Hence, what we have to show is that

$$\begin{aligned} \int _{G}\partial _{\mathbf e_1}f(\mathbf x)\,d\mathbf x&=\int _0^{2\pi }\bigl (r(\varphi )\cos \varphi +r^{\prime }(\varphi )\sin \varphi \bigr ) f\bigl (r(\varphi )\mathbf e(\varphi )\bigr )\,d\varphi \\ \int _{G}\partial _{\mathbf e_2}f(\mathbf x)\,d\mathbf x&=\int _0^{2\pi }\bigl (r(\varphi )\sin \varphi -r^{\prime }(\varphi )\cos \varphi \bigr ) f\bigl (r(\varphi )\mathbf e(\varphi )\bigr )\,d\varphi .\end{aligned}\qquad ({*})$$

To perform the required computation, it is important to write derivatives in terms of the variables \(\rho \) and \(\varphi \). For this purpose, suppose that f is a continuously differentiable function on an open subset of \({\mathbb R}^2\), and set \(g(\rho ,\varphi )=f(\rho \cos \varphi ,\rho \sin \varphi )\). Then

$$\partial _{\rho }g(\rho ,\varphi )=\cos \varphi \,\partial _{\mathbf e_1}f(\rho \cos \varphi ,\rho \sin \varphi )+ \sin \varphi \,\partial _{\mathbf e_2}f(\rho \cos \varphi ,\rho \sin \varphi )$$

and

$$\partial _\varphi g(\rho ,\varphi )=-\rho \sin \varphi \, \partial _{\mathbf e_1}f(\rho \cos \varphi ,\rho \sin \varphi )+\rho \cos \varphi \, \partial _{\mathbf e_2}f(\rho \cos \varphi ,\rho \sin \varphi ),$$

and therefore

$$\rho \partial _{\mathbf e_1}f(\rho \mathbf e(\varphi ) )=\rho \cos \varphi \,\partial _\rho g(\rho ,\varphi )-\sin \varphi \,\partial _\varphi g(\rho ,\varphi )$$

and

$$\rho \partial _{\mathbf e_2}f(\rho \mathbf e(\varphi ) )=\rho \sin \varphi \,\partial _{\rho }g(\rho ,\varphi )+\cos \varphi \,\partial _\varphi g(\rho ,\varphi ).$$

Thus, if f is a continuous function on \(\bar{G}\) that has bounded, continuous first order derivatives on G, then

$$\int _{G}\partial _{\mathbf e_1}f(\mathbf x)\,d\mathbf x=I-J,$$

where

$$I=\int _0^{2\pi }\cos \varphi \left( \int _0^{r(\varphi )}\rho \,\partial _\rho g(\rho ,\varphi )\,d\rho \right) d\varphi $$

and

$$J=\int _0^{2\pi }\sin \varphi \left( \int _0^{r(\varphi )}\partial _\varphi g(\rho ,\varphi )\,d\rho \right) d\varphi .$$

Applying integration by parts to the inner integral in I, we see that

$$I=\int _0^{2\pi }\cos \varphi \,g\bigl (r(\varphi ),\varphi \bigr )r(\varphi )\,d\varphi -\int _0^{2\pi }\cos \varphi \left( \int _0^{r(\varphi )}g(\rho ,\varphi )\,d\rho \right) d\varphi .$$

Dealing with J is more challenging. The first step is to write it as \(J_1+J_2\), where \(J_1\) and \(J_2\) are, respectively,

$$\int _0^{2\pi }\sin \varphi \left( \int _0^{r_-}\partial _\varphi g(\rho ,\varphi )\,d\rho \right) d\varphi \text { and } \int _0^{2\pi }\sin \varphi \left( \int _{r_-}^{r(\varphi )}\partial _\varphi g(\rho ,\varphi )\,d\rho \right) \,d\varphi ,$$

and \(r_-\equiv \min \{r(\varphi ):\,\varphi \in [0,2\pi ]\}\). By (5.2.1) and integration by parts,

$$J_1=\int _0^{r_-}\left( \int _0^{2\pi }\sin \varphi \,\partial _\varphi g(\rho ,\varphi )\,d\varphi \right) d\rho =-\int _0^{r_-}\left( \int _0^{2\pi }\cos \varphi \,g(\rho ,\varphi )\,d\varphi \right) d\rho .$$

To handle \(J_2\), choose \(\theta _0\in [0,2\pi ]\) so that \(r(\theta _0)=r_-\), and choose \(\theta _1,\ldots ,\theta _\ell \in [0,2\pi ]\) so that \(r^{\prime }\) is continuous on each of the open intervals with end points \(\theta _k\) and \(\theta _{k+1}\), where \(\theta _{\ell +1}=\theta _0\). Now use (3.1.6) to write \(J_2\) as \(\sum _{k=0}^{\ell }J_{2,k}\), where

$$J_{2,k}=\int _0^{2\pi }\sin \varphi \left( \int _{r(\varphi \wedge \theta _k)}^{r(\varphi \wedge \theta _{k+1})} \partial _\varphi g(\rho ,\varphi )\,d\rho \right) d\varphi ,$$

and then make the change of variables \(\rho =r(\theta )\) and apply (5.2.1) to obtain

$$\begin{aligned}J_{2,k}&=\int _0^{2\pi }\sin \varphi \left( \int _{\varphi \wedge \theta _k}^{\varphi \wedge \theta _{k+1}}\partial _\varphi g\bigl (r(\theta ),\varphi \bigr )r^{\prime }(\theta )\,d\theta \right) d\varphi \\ {}&= \int _{\theta _k}^{\theta _{k+1}}r^{\prime }(\theta )\left( \int _\theta ^{2\pi }\sin \varphi \,\partial _\varphi g\bigl (r(\theta ),\varphi \bigr )\,d\varphi \right) d\theta .\end{aligned}$$

Hence,

$$J_2=\int _0^{2\pi }r^{\prime }(\theta )\left( \int _\theta ^{2\pi }\sin \varphi \, \partial _\varphi g\bigl (r(\theta ),\varphi \bigr )\,d\varphi \right) d\theta , $$

which, after integration by parts is applied to the inner integral, leads to

$$J_2=-\int _0^{2\pi }\sin \theta \,g\bigl (r(\theta ),\theta \bigr )r^{\prime }(\theta )\,d\theta -\int _0^{2\pi } r^{\prime }(\theta )\left( \int _\theta ^{2\pi }\cos \varphi \,g\bigl (r(\theta ),\varphi \bigr )\,d\varphi \right) d\theta .$$

After applying (5.2.1) and undoing the change of variables in the second integral on the right, we get

$$J_2=-\int _0^{2\pi }\sin \theta \,g\bigl (r(\theta ),\theta \bigr )r^{\prime }(\theta )\,d\theta - \int _0^{2\pi }\cos \varphi \left( \int _{r_-}^{r(\varphi )}g(\rho ,\varphi )\,d\rho \right) \,d\varphi $$

and therefore

$$J=-\int _0^{2\pi }\sin \varphi \,g\bigl (r(\varphi ) ,\varphi \bigr )r^{\prime }(\varphi )\,d\varphi -\int _0^{2\pi }\cos \varphi \left( \int _0^{r(\varphi )}g(\rho ,\varphi )\,d\rho \right) d\varphi .$$

Finally, when we subtract J from I, we arrive at

$$\int _{G}\partial _{\mathbf e_1}f(\mathbf x)\,d\mathbf x=\int _0^{2\pi }\bigl (r(\varphi )\cos \varphi +r^{\prime }(\varphi )\sin \varphi \bigr )g\bigl (r(\varphi ),\varphi \bigr )\,d\varphi .$$
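As a sanity check, the two sides of this formula can be compared numerically for a concrete region. The following sketch (an illustration, not part of the text) uses SciPy quadrature; the boundary \(r(\varphi )=2+\cos \varphi \) and the test function \(f(x,y)=xy^2\) are arbitrary choices, with \(g(\rho ,\varphi )=f\bigl (\rho \cos \varphi ,\rho \sin \varphi \bigr )\) as above.

```python
# Numerical sanity check (illustrative) of
#   ∫_G ∂_{e_1} f dx = ∫_0^{2π} (r(φ)cos φ + r'(φ)sin φ) g(r(φ), φ) dφ
# for the star shaped region with boundary r(φ) = 2 + cos φ and f(x, y) = x y².
import numpy as np
from scipy.integrate import dblquad, quad

r   = lambda phi: 2.0 + np.cos(phi)   # radial boundary function
rp  = lambda phi: -np.sin(phi)        # r'(φ)
f   = lambda x, y: x * y**2           # test function
df1 = lambda x, y: y**2               # ∂f/∂x

# Left-hand side, written in polar coordinates (dx = ρ dρ dφ).
lhs, _ = dblquad(
    lambda rho, phi: df1(rho*np.cos(phi), rho*np.sin(phi)) * rho,
    0.0, 2*np.pi, lambda phi: 0.0, r)

# Right-hand side: the boundary integral above, with g(r(φ), φ) = f on the boundary.
rhs, _ = quad(
    lambda phi: (r(phi)*np.cos(phi) + rp(phi)*np.sin(phi))
                * f(r(phi)*np.cos(phi), r(phi)*np.sin(phi)),
    0.0, 2*np.pi)

print(lhs, rhs)   # the two values agree to quadrature accuracy
```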

Proceeding in exactly the same way, one can derive the second equation in (\(*\)), and so we have proved the following theorem.

Theorem 5.7.1

If \(G\subseteq {\mathbb R}^2\) is a piecewise smooth star shaped region and if \(\mathbf F:\bar{G}\longrightarrow {\mathbb R}^2\) is continuous on \(\bar{G}\) and has bounded, continuous first order derivatives on G, then (5.7.5) holds, and therefore so does (5.7.4) with \(\mathbf n:\partial G\longrightarrow \mathbb S^1(0,1)\) given by (5.7.5).

Corollary 5.7.2

Let G be as in Theorem 5.7.1, and suppose that \(\mathbf a_1,\ldots ,\mathbf a_\ell \in G\) and \(r_1,\ldots ,r_\ell \in (0,\infty )\) have the properties that \(\overline{B(\mathbf a_k,r_k)}\subseteq G\) for each \(1\le k\le \ell \) and that \(\overline{B(\mathbf a_k,r_k)}\cap \overline{B(\mathbf a_{k^{\prime }},r_{k^{\prime }})}=\emptyset \) for \(1\le k<k^{\prime }\le \ell \). Set \(H=G{\setminus } \bigcup _{k=1}^\ell \overline{B(\mathbf a_k,r_k)}\). If \(\mathbf F:\bar{H}\longrightarrow {\mathbb R}^2\) is a continuous function that has bounded, continuous first order derivatives on H, then \(\int _{H}\mathrm {div}\mathbf F(\mathbf x)\,d\mathbf x\) equals

$$\begin{aligned} \int _{\partial G}\bigl (&\mathbf F(\mathbf y),\mathbf n(\mathbf y)\bigr )_{{\mathbb R}^2}\,d\sigma _\mathbf y\\ {}&-\sum _{k=1}^\ell r_k\int _0^{2\pi }\Bigl (F_1\bigl (\mathbf a_k+r_k\mathbf e(\varphi )\bigr )\cos \varphi +F_2\bigl (\mathbf a_k+r_k\mathbf e(\varphi )\bigr )\sin \varphi \Bigr )\,d\varphi .\end{aligned}$$

Proof

First assume that \(\mathbf F\) has bounded, continuous first order derivatives on the whole of G. Then Theorem 5.7.1 applies to \(\mathbf F\) on \(\bar{G}\) as well as to its restriction to each ball \(\overline{B(\mathbf a_k,r_k)}\), and so the result follows when one writes the integral of \( \mathrm {div}\mathbf F\) over \(\bar{H}\) as

$$\int _{\bar{G}}\mathrm {div}\mathbf F(\mathbf x)\,d\mathbf x-\sum _{k=1}^\ell \int _{\overline{B(\mathbf a_k,r_k)}}\mathrm {div}\mathbf F(\mathbf x)\,d\mathbf x.$$

To handle the general case, define \(\eta :{\mathbb R}\longrightarrow [0,1]\) by

$$\eta (t)={\left\{ \begin{array}{ll}0&{}\text {if }t\le 0\\ \frac{1+\sin \bigl (\pi (t-\frac{1}{2})\bigr )}{2} &{}\text {if }0<t\le 1\\ 1&{}\text {if }t>1.\end{array}\right. }$$

Then \(\eta \) is continuously differentiable. For each \(1\le k\le \ell \), choose \(R_k>r_k\) so that \(\overline{B(\mathbf a_k,R_k)}\subseteq G\) and \(\overline{B(\mathbf a_k,R_k)}\cap \overline{B(\mathbf a_{k^{\prime }},R_{k^{\prime }})}=\emptyset \) for \(1\le k<k^{\prime }\le \ell \). Define

$$\psi _k(\mathbf x)=\eta \left( \frac{|\mathbf x-\mathbf a_k|^2-r_k^2}{R_k^2-r_k^2}\right) \quad \text {for } \mathbf x\in {\mathbb R}^2$$

and

$$\tilde{\mathbf F}(\mathbf x)=\Bigl (\prod _{k=1}^\ell \psi _k(\mathbf x)\Bigr )\mathbf F(\mathbf x)$$

if \(\mathbf x\in \bar{H}\) and \(\tilde{\mathbf F}(\mathbf x)=0\) if \(\mathbf x\in \bigcup _{k=1}^\ell B(\mathbf a_k,r_k)\). Then \(\tilde{\mathbf F}\) is continuous on \(\bar{G}\) and has bounded, continuous first order derivatives on G. In addition, \(\tilde{\mathbf F}=\mathbf F\) on \(G{\setminus } \bigcup _{k=1}^\ell \overline{B(\mathbf a_k,R_k)}\). Hence, if \(H^{\prime }=\bar{G}{\setminus } \bigcup _{k=1}^\ell \overline{B(\mathbf a_k,R_k)}\), then, by the preceding, \(\int _{H^{\prime }}\mathrm {div}\mathbf F(\mathbf x)\,d\mathbf x\) equals

$$\begin{aligned}\int _{\partial G}\bigl (&\mathbf F(\mathbf y),\mathbf n(\mathbf y)\bigr )_{{\mathbb R}^2}\,d\sigma _\mathbf y\\ {}&-\sum _{k=1}^\ell R_k\int _0^{2\pi }\Bigl (F_1\bigl (\mathbf a_k+R_k\mathbf e(\varphi ) \bigr )\cos \varphi +F_2\bigl (\mathbf a_k+R_k\mathbf e(\varphi ) \bigr )\sin \varphi \Bigr )\,d\varphi ,\end{aligned}$$

and so the asserted result follows after one lets each \(R_k\) decrease to \(r_k\). \(\square \)

We conclude this section with an application of Theorem 5.7.1 that plays a role in many places. One of the consequences of the Fundamental Theorem of Calculus is that every continuous function f on an interval \((a,b)\) is the derivative of a continuously differentiable function F on \((a,b)\). Indeed, simply set \(c=\frac{a+b}{2}\) and take \(F(x)=\int _c^xf(t)\,dt\). With this in mind, one should ask whether an analogous statement holds in \({\mathbb R}^2\). In particular, given a connected open set \(G\subseteq {\mathbb R}^2\) and a continuous function \(\mathbf F:G\longrightarrow {\mathbb R}^2\), is it true that there is a continuously differentiable function \(f:G\longrightarrow {\mathbb R}\) such that \(\mathbf F\) is the gradient of f? That the answer is no in general can be seen by assuming that \(\mathbf F\) is continuously differentiable and noticing that if f exists then

$$\partial _{\mathbf e_2}F_1=\partial _{\mathbf e_2}\partial _{\mathbf e_1}f=\partial _{\mathbf e_1}\partial _{\mathbf e_2}f=\partial _{\mathbf e_1}F_2.$$

Hence, a necessary condition for the existence of f is that \(\partial _{\mathbf e_2}F_1=\partial _{\mathbf e_1}F_2\), and when this condition holds \(\mathbf F\) is said to be exact. It is known that exactness is sufficient as well as necessary for a continuously differentiable \(\mathbf F\) on G to be the gradient of a function when G is what is called a simply connected region, but, to avoid technical difficulties, we will restrict ourselves to star shaped regions.
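For a concrete illustration, take \(\mathbf F(x,y)=(2xy,x^2)\): then \(\partial _{\mathbf e_2}F_1=2x=\partial _{\mathbf e_1}F_2\), so \(\mathbf F\) is exact, and indeed \(\mathbf F=\nabla f\) for \(f(x,y)=x^2y\). By contrast, \(\mathbf F(x,y)=(-y,x)\) has \(\partial _{\mathbf e_2}F_1=-1\ne 1=\partial _{\mathbf e_1}F_2\) and therefore cannot be the gradient of any continuously differentiable function.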

Corollary 5.7.3

Assume that G is a star shaped region in \({\mathbb R}^2\) and that \(\mathbf F:G\longrightarrow {\mathbb R}^2\) is a continuously differentiable function. Then there exists a continuously differentiable function \(f:G\longrightarrow {\mathbb R}\) such that \(\mathbf F=\nabla f\) if and only if \(\mathbf F\) is exact.

Proof

Without loss in generality, we will assume that \(\mathbf 0\) is the center of G. Further, since the necessity has already been shown, we will assume that \(\mathbf F\) is exact.

Define \(f:G\longrightarrow {\mathbb R}\) by \(f(\mathbf 0)=0\) and

$$f\bigl (r\mathbf e(\varphi )\bigr )=\int _0^r\bigl (F_1(\rho \mathbf e(\varphi ))\cos \varphi +F_2(\rho \mathbf e(\varphi ))\sin \varphi \bigr )\,d\rho $$

for \(\varphi \in [0,2\pi )\) and \(0<r<r(\varphi )\). Clearly \(F_1(\mathbf 0)=\partial _{\mathbf e_1}f(\mathbf 0)\) and \(F_2(\mathbf 0)=\partial _{\mathbf e_2}f(\mathbf 0)\).

We will now show that \(F_1=\partial _{\mathbf e_1}f\) at any point \((\xi _0,\eta _0)\in G{\setminus } \{\mathbf 0\}\). This is easy when \(\eta _0=0\), since \(f(\xi ,0)=\int _0^\xi F_1(t,0)\,dt\). Thus assume that \(\eta _0\ne 0\), and consider points \((\xi ,\eta _0)\) not equal to \((\xi _0,\eta _0)\) but sufficiently close that \((t,\eta _0)\in G\) if \(\xi \wedge \xi _0\le t\le \xi \vee \xi _0\). What we need to show is that

$$f(\xi ,\eta _0)-f(\xi _0,\eta _0)=\int _{\xi _0}^\xi F_1(t,\eta _0)\,dt.\qquad (*)$$

To this end, define \(\tilde{\mathbf F}=(F_2,-F_1)\). Then, because \(\mathbf F\) is exact, \(\mathrm {div}\tilde{\mathbf F}=0\) on G. Next consider the region H that is the interior of the triangle whose vertices are \(\mathbf 0\), \((\xi _0,\eta _0)\), and \((\xi ,\eta _0)\). Then H is a piecewise smooth star shaped region and so, by Theorem 5.7.1, the integral of \((\tilde{\mathbf F},\mathbf n)_{{\mathbb R}^2}\) over \(\partial H\) is 0. Thus, if we write \((\xi _0,\eta _0)\) and \((\xi ,\eta _0)\) as \(r_0\mathbf e(\varphi _0)\) and \(r\mathbf e(\varphi )\), then

$$\begin{aligned}0=\int _{\partial H}\bigl (\tilde{\mathbf F}(\mathbf y),\mathbf n(\mathbf y)\bigr )_{{\mathbb R}^2}\,d\sigma _\mathbf y=\int _0^{r_0}&\Bigl (\tilde{\mathbf F}\bigl (\rho \mathbf e(\varphi _0)\bigr ),\mathbf n\bigl (\rho \mathbf e(\varphi _0)\bigr )\Bigr )_{{\mathbb R}^2}\,d\rho \\&+\int _0^{r}\Bigl (\tilde{\mathbf F}\bigl (\rho \mathbf e(\varphi )\bigr ),\mathbf n\bigl (\rho \mathbf e(\varphi )\bigr )\Bigr )_{{\mathbb R}^2}\,d\rho \\&\quad +\int _{\xi \wedge \xi _0}^{\xi \vee \xi _0}\bigl (\tilde{\mathbf F}(t,\eta _0),\mathbf n(t,\eta _0)\bigr )_{{\mathbb R}^2}\,dt.\end{aligned}$$

If \(\eta _0>0\) and \(\xi >\xi _0\), then

$$\mathbf n\bigl (\rho \mathbf e(\varphi )\bigr ) =(\sin \varphi ,-\cos \varphi ),\;\mathbf n\bigl (\rho \mathbf e(\varphi _0) \bigr )=(-\sin \varphi _0,\cos \varphi _0),$$

and \(\mathbf n(t,\eta _0)=(0,1)\), and therefore

$$\begin{aligned}f(\xi ,\eta _0)&=\int _0^{r}\Bigl (\tilde{\mathbf F}\bigl (\rho \mathbf e(\varphi )\bigr ),\mathbf n\bigl (\rho \mathbf e(\varphi )\bigr )\Bigr )_{{\mathbb R}^2}\,d\rho ,\\ f(\xi _0,\eta _0)&=-\int _0^{r_0}\Bigl (\tilde{\mathbf F}\bigl (\rho \mathbf e(\varphi _0)\bigr ),\mathbf n\bigl (\rho \mathbf e(\varphi _0)\bigr )\Bigr )_{{\mathbb R}^2}\,d\rho , \\ \int _{\xi _0}^\xi F_1(t,\eta _0)\,dt&=-\int _{\xi \wedge \xi _0}^{\xi \vee \xi _0}\bigl (\tilde{\mathbf F}(t,\eta _0),\mathbf n(t,\eta _0)\bigr )_{{\mathbb R}^2}\,dt,\end{aligned}$$

and so (\(*\)) holds. If \(\eta _0>0\) and \(\xi <\xi _0\), then the sign of \(\mathbf n\) changes in each term, and therefore we again get (\(*\)), and the cases when \(\eta _0<0\) are handled similarly.

The proof that \(F_2=\partial _{\mathbf e_2}f\) follows the same line of reasoning and is left as an exercise. \(\square \)
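The radial formula for f in the preceding proof is explicit enough to compute with directly. The following sketch (an illustration, not part of the text; the exact field \(\mathbf F(x,y)=(2xy,x^2)=\nabla (x^2y)\) is an arbitrary choice) builds f by one-dimensional quadrature and checks \(\nabla f=\mathbf F\) by finite differences.

```python
# Illustration of the construction in the proof of Corollary 5.7.3:
# build f by  f(r e(φ)) = ∫_0^r (F₁(ρ e(φ)) cos φ + F₂(ρ e(φ)) sin φ) dρ
# for the exact field F = (2xy, x²) = ∇(x²y), then test ∇f = F numerically.
import numpy as np
from scipy.integrate import quad

F1 = lambda x, y: 2*x*y   # ∂_{e_2}F₁ = 2x = ∂_{e_1}F₂, so F is exact
F2 = lambda x, y: x**2

def f(x, y):
    r, phi = np.hypot(x, y), np.arctan2(y, x)
    c, s = np.cos(phi), np.sin(phi)
    integrand = lambda rho: F1(rho*c, rho*s)*c + F2(rho*c, rho*s)*s
    return quad(integrand, 0.0, r)[0]

x0, y0, h = 0.7, -0.4, 1e-5
grad = ((f(x0+h, y0) - f(x0-h, y0)) / (2*h),
        (f(x0, y0+h) - f(x0, y0-h)) / (2*h))
print(grad, (F1(x0, y0), F2(x0, y0)))  # finite-difference ∇f matches F
print(f(x0, y0), x0**2 * y0)           # and f agrees with x²y
```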

5.8 Exercises

Exercise 5.1

Let \(\mathbf x,\,\mathbf y\in {\mathbb R}^2{\setminus } \{\mathbf 0\}\). Then Schwarz’s inequality says that the ratio

$$\rho \equiv \frac{(\mathbf x,\mathbf y)_{{\mathbb R}^2}}{|\mathbf x||\mathbf y|}$$

is in the open interval \((-1,1)\) unless \(\mathbf x\) and \(\mathbf y\) lie on the same line, in which case \(\rho \in \{-1,1\}\). Euclidean geometry provides a good explanation for this. Indeed, consider the triangle whose vertices are \(\mathbf 0\), \(\mathbf x\), and \(\mathbf y\). The sides of this triangle have lengths \(|\mathbf x|\), \(|\mathbf y|\), and \(|\mathbf y-\mathbf x|\). Thus, by the law of cosines, \(|\mathbf y-\mathbf x|^2=|\mathbf x|^2+|\mathbf y|^2-2|\mathbf x||\mathbf y|\cos \theta \), where \(\theta \) is the angle in the triangle between \(\mathbf x\) and \(\mathbf y\). Use this to show that \(\rho =\cos \theta \). The same explanation applies in higher dimensions since there is a plane in which \(\mathbf x\) and \(\mathbf y\) lie and the analysis can be carried out in that plane.
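A quick numerical illustration (not part of the exercise) of the identity \(\rho =\cos \theta \), with \(\cos \theta \) recovered from the law of cosines:

```python
# For random nonzero x, y ∈ ℝ², the ratio ρ = (x,y)/(|x||y|) coincides with
# cos θ obtained by solving the law of cosines for the angle at the origin.
import numpy as np

rng = np.random.default_rng(0)
x, y = rng.normal(size=2), rng.normal(size=2)

nx, ny = np.linalg.norm(x), np.linalg.norm(y)
rho = (x @ y) / (nx * ny)
cos_theta = (nx**2 + ny**2 - np.linalg.norm(y - x)**2) / (2 * nx * ny)
print(rho, cos_theta)   # identical up to rounding
```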

Exercise 5.2

Show that

$$\int _{\mathbb R}e^{\lambda x}e^{-\frac{x^2}{2}}\,dx=\sqrt{2\pi }e^{\frac{\lambda ^2}{2}}\quad \text {for all } \lambda \in {\mathbb R}.$$

One way to do this is to make the change of variables \(y=x-\lambda \) to see that

$$\int _{\mathbb R}e^{\lambda x}e^{-\frac{x^2}{2}}\,dx=e^{\frac{\lambda ^2}{2}}\int _{\mathbb R}e^{-\frac{y^2}{2}}\,dy,$$

and then use (5.1.2) and (5.4.3).
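For readers who want a numerical confirmation (not asked for by the exercise), the identity is easy to test with SciPy quadrature:

```python
# Check  ∫_ℝ e^{λx} e^{-x²/2} dx = √(2π) e^{λ²/2}  for a few sample λ.
import numpy as np
from scipy.integrate import quad

for lam in (0.0, 0.5, 2.0):
    val, _ = quad(lambda x: np.exp(lam*x - x**2/2), -np.inf, np.inf)
    print(val, np.sqrt(2*np.pi) * np.exp(lam**2/2))   # the pairs agree
```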

Exercise 5.3

Show that \(|\varGamma |_{\mathrm e}=|\mathfrak R(\varGamma )|_{\mathrm e}\) and \(|\varGamma |_{\mathrm i}=|\mathfrak R(\varGamma )|_{\mathrm i}\) for all bounded sets \(\varGamma \subseteq {\mathbb {R}^N}\) and all rotations \(\mathfrak R\), and conclude that \(\mathfrak R(\varGamma )\) is Riemann measurable if and only if \(\varGamma \) is. Use this to show that if \(\varGamma \) is a bounded subset of \({\mathbb {R}^N}\) for which there exists a \(\mathbf x_0\in {\mathbb {R}^N}\) and an \(\mathbf e\in \mathbb S^{N-1}(\mathbf 0,1)\) such that \((\mathbf x-\mathbf x_0,\mathbf e)_{\mathbb {R}^N}=0\) for all \(\mathbf x\in \varGamma \), then \(\varGamma \) is Riemann negligible.

Exercise 5.4

An integral that arises quite often is one of the form

$$I(a,b)\equiv \int _{(0,\infty )} t^{-\frac{1}{2}}e^{-a^2t-\frac{b^2}{t}}\,dt,$$

where \(a,\,b\in (0,\infty )\). To evaluate this integral, make the change of variables \(\xi =at^{\frac{1}{2}}-bt^{-\frac{1}{2}}\). Then \(\xi ^2=a^2t-2ab+b^2t^{-1}\) and \(t^{\frac{1}{2}}=\frac{\xi +\sqrt{\xi ^2+4ab}}{2a}\), the plus sign being dictated by the requirement that \(t^{\frac{1}{2}}\ge 0\). After making this substitution, arrive at

$$I(a,b)=\frac{e^{-2ab}}{a}\int _{\mathbb R}e^{-\xi ^2}\bigl (1+(\xi ^2+4ab)^{-\frac{1}{2}}\xi \bigr )\,d\xi =\frac{e^{-2ab}}{a}\int _{\mathbb R}e^{-\xi ^2}\,d\xi ,$$

from which it follows that \(I(a,b)=\frac{\pi ^{\frac{1}{2}}e^{-2ab}}{a}\). Finally, use this to show that

$$\int _{(0,\infty )} t^{-\frac{3}{2}}e^{-a^2t-\frac{b^2}{t}}\,dt=\frac{\pi ^{\frac{1}{2}}e^{-2ab}}{b}.$$
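Both closed forms are easy to confirm numerically (an illustration, not part of the exercise):

```python
# Check  I(a,b) = √π e^{-2ab}/a  and  ∫_0^∞ t^{-3/2} e^{-a²t-b²/t} dt = √π e^{-2ab}/b.
import numpy as np
from scipy.integrate import quad

for a, b in ((1.0, 1.0), (0.5, 2.0)):
    I, _ = quad(lambda t: t**-0.5 * np.exp(-a**2*t - b**2/t), 0.0, np.inf)
    K, _ = quad(lambda t: t**-1.5 * np.exp(-a**2*t - b**2/t), 0.0, np.inf)
    print(I, np.sqrt(np.pi) * np.exp(-2*a*b) / a)
    print(K, np.sqrt(np.pi) * np.exp(-2*a*b) / b)
```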

Exercise 5.5

Recall the cardioid described in Exercise 2.2, and consider the region that it encloses. Using the expression \(z(\theta )=2R(1-\cos \theta )e^{i\theta }\) for the boundary of this region after translating by \(-R\), show that the area enclosed by the cardioid is \(6\pi R^2\). Finally, show that the arc length of the cardioid is 16R, a computation in which you may want to use (1.5.1) to write \(1-\cos \theta \) as a square.
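Both answers can be confirmed numerically with the standard polar formulas \(\text {area}=\frac{1}{2}\int r^2\,d\theta \) and \(\text {length}=\int \sqrt{r^2+(r^{\prime })^2}\,d\theta \) (an illustration, not part of the exercise):

```python
# Check the cardioid's area (6πR²) and arc length (16R) for r(θ) = 2R(1 - cos θ).
import numpy as np
from scipy.integrate import quad

R  = 1.3                              # an arbitrary sample radius
r  = lambda t: 2*R*(1 - np.cos(t))
rp = lambda t: 2*R*np.sin(t)          # r'(θ)

area, _   = quad(lambda t: 0.5 * r(t)**2, 0.0, 2*np.pi)
length, _ = quad(lambda t: np.hypot(r(t), rp(t)), 0.0, 2*np.pi)
print(area,   6*np.pi*R**2)   # ≈ 6πR²
print(length, 16*R)           # ≈ 16R
```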

Exercise 5.6

Given \(a_1,\,a_2,\,a_3\in (0,\infty )\), consider the region \(\varOmega \) in \({\mathbb R}^3\) enclosed by the ellipsoid \(\sum _{i=1}^3\frac{x_i^2}{a_i^2}=1\). Show that the boundary of \(\varOmega \) is Riemann negligible and that the volume of \(\varOmega \) is \(\frac{4\pi a_1a_2a_3}{3}\). When computing the volume V, first show that

$$V=2a_3\int _{\tilde{\varOmega } }\sqrt{1-\tfrac{x_1^2}{a_1^2}-\tfrac{x_2^2}{a_2^2}}\,dx_1dx_2\quad \text {where } \tilde{\varOmega } =\Bigl \{\mathbf x\in {\mathbb R}^2:\,\tfrac{x_1^2}{a_1^2}+\tfrac{x_2^2}{a_2^2}\le 1\Bigr \}.$$

Next, use Fubini’s Theorem and a change of variables in each of the coordinates to write the integral as \(a_1a_2\int _{\overline{B(\mathbf 0,1)}}\sqrt{1-|\mathbf x|^2}\,d\mathbf x\), where \(B(\mathbf 0,1)\) is the unit ball in \({\mathbb R}^2\). Finally, use (5.6.2) to complete the computation.
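The final step is easy to check numerically (an illustration): in polar coordinates \(\int _{\overline{B(\mathbf 0,1)}}\sqrt{1-|\mathbf x|^2}\,d\mathbf x=2\pi \int _0^1\sqrt{1-\rho ^2}\,\rho \,d\rho =\frac{2\pi }{3}\), which gives \(V=\frac{4\pi a_1a_2a_3}{3}\).

```python
# Check the reduction: V = 2 a₃ · a₁a₂ · ∫_{B(0,1)} √(1-|x|²) dx = 4π a₁a₂a₃ / 3.
import numpy as np
from scipy.integrate import quad

a1, a2, a3 = 1.0, 2.0, 0.5                                 # sample semi-axes
disk, _ = quad(lambda rho: np.sqrt(1 - rho**2) * rho, 0.0, 1.0)
V = 2*a3 * a1*a2 * 2*np.pi * disk                          # 2π from the angular integral
print(V, 4*np.pi*a1*a2*a3/3)   # both ≈ 4π a₁a₂a₃ / 3
```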

Exercise 5.7

Let \(\varOmega \) be a bounded, closed region in \({\mathbb R}^3\) with Riemann negligible boundary, and let \(\mu :\varOmega \longrightarrow [0,\infty )\) be a continuous function. Thinking of \(\mu \) as a mass density, one says that the center of gravity of \(\varOmega \) with mass distribution \(\mu \) is the point \(\mathbf c\in {\mathbb R}^3\) such that \(\int _\varOmega \mu (\mathbf y)(\mathbf y-\mathbf c)\,d\mathbf y=\mathbf 0\). The reason for the name is that if \(\varOmega \) is supported at this point \(\mathbf c\), then the net effect of gravity will be 0 and so the region will be balanced there. Of course, \(\mathbf c\) need not lie in \(\varOmega \), in which case one should think of \(\varOmega \) being attached to \(\mathbf c\) by weightless wires.

Obviously, \(\mathbf c=\frac{\int _\varOmega \mu (\mathbf y)\mathbf y\,d\mathbf y}{M}\), where \(M=\int _\varOmega \mu (\mathbf y)\,d\mathbf y\) is the total mass. Now suppose that the cone \(\varOmega =\{\mathbf x\in {\mathbb R}^3:\,\sqrt{x_1^2+x_2^2}\le x_3\le h\}\), where \(h>0\), has a constant mass density. Show that \(\mathbf c=\bigl (0,0,\frac{3h}{4}\bigr )\).
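The answer rests on a slice computation that can be checked numerically (an illustration, not part of the exercise): the cone's cross section at height \(x_3=z\) is a disk of radius z, hence of area \(\pi z^2\).

```python
# Check c₃ = 3h/4 for the solid cone √(x₁²+x₂²) ≤ x₃ ≤ h with constant density:
#   c₃ = ∫₀^h z·πz² dz / ∫₀^h πz² dz.
import numpy as np
from scipy.integrate import quad

h = 2.0                                              # a sample height
num, _ = quad(lambda z: z * np.pi * z**2, 0.0, h)
den, _ = quad(lambda z: np.pi * z**2, 0.0, h)
print(num/den, 3*h/4)   # both ≈ 1.5
```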

Exercise 5.8

We showed that for a ball \(\overline{B(\mathbf c,r)}\) in \({\mathbb R}^3\) with a continuous mass distribution that depends only on the distance to \(\mathbf c\), the gravitational force it exerts on a particle of mass m at a point \(\mathbf b\notin \overline{B(\mathbf c,r)}\) is given by (5.6.3). Now suppose that \(\mathbf b\in \overline{B(\mathbf c,r)}\), and set \(D=|\mathbf b-\mathbf c|\). Show that

$$\begin{aligned}\int _{\overline{B(\mathbf c,r)}}\frac{Gm\mu (|\mathbf y-\mathbf c|)(\mathbf y-\mathbf b)}{|\mathbf y-\mathbf b|^3}\,d\mathbf y=\frac{GmM_D}{D^3}(\mathbf c-\mathbf b)\\ \text {where }M_D=\int _{\overline{B(\mathbf c,D)}}\mu (|\mathbf y-\mathbf c|)\,d\mathbf y.\end{aligned}$$

In other words, the forces produced by the mass that lies further than \(\mathbf b\) from \(\mathbf c\) cancel out, and so the particle feels only the force coming from the mass between it and the center of the ball.
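A numerical check (illustrative; it assumes constant \(\mu \equiv 1\), \(Gm=1\), \(\mathbf c=\mathbf 0\), and \(\mathbf b=(0,0,D)\)) can be carried out in spherical coordinates, where by symmetry only the third component of the force survives.

```python
# The z-component of ∫ μ(|y|)(y−b)/|y−b|³ dy over B(0,r) in spherical coordinates:
#   ∫₀^r ∫₀^π (s cos θ − D)(s² + D² − 2sD cos θ)^{-3/2} · 2π s² sin θ dθ ds,
# which should equal −M_D/D² with M_D = (4/3)π D³ (only the inner mass counts).
import numpy as np
from scipy.integrate import quad

r, D = 1.0, 0.5

def shell(s):   # contribution of the spherical shell of radius s
    g = lambda th: (s*np.cos(th) - D) * np.sin(th) \
                   / (s**2 + D**2 - 2*s*D*np.cos(th))**1.5
    return 2*np.pi * s**2 * quad(g, 0.0, np.pi)[0]

Fz = quad(shell, 0.0, D)[0] + quad(shell, D, r)[0]   # split at the delicate radius s = D
print(Fz, -(4/3)*np.pi*D**3 / D**2)   # shells with s > D contribute ≈ 0
```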

Exercise 5.9

Let \(B(\mathbf 0,r)\) be the ball of radius r in \({\mathbb {R}^N}\) centered at the origin. Using rotation invariance, show that

$$\int _{\overline{B(\mathbf 0,r)}}x_i\,d\mathbf x=0\quad \text {and}\quad \int _{\overline{B(\mathbf 0,r)}}x_ix_j\,d\mathbf x=\frac{\varOmega _Nr^{N+2}}{N+2}\delta _{i,j}\text { for } 1\le i,j\le N.$$

Next, suppose that \(f:{\mathbb {R}^N}\longrightarrow {\mathbb R}\) is twice continuously differentiable, and let \(\mathcal A(f,r)=(\varOmega _Nr^N)^{-1}\int _{\overline{B(\mathbf 0,r)}}f(\mathbf x)\,d\mathbf x\) be the average value of f on \(B(\mathbf 0,r)\). As an application of the preceding, show that \(\frac{\mathcal A(f,r)-f(\mathbf 0)}{r^2}\longrightarrow \frac{1}{2(N+2)}\sum _{i=1}^N\partial _{\mathbf e_i}^2f(\mathbf 0)\) as \(r\searrow 0\).
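A numerical illustration (not part of the exercise) in \(N=2\): for a quadratic f the ratio is in fact independent of r and equal to the limit.

```python
# For f(x,y) = x² + 3y² + xy + x one has Σᵢ ∂²f = 8, so with N = 2 the ratio
# (𝒜(f,r) − f(0))/r² should be 8/(2·(N+2)) = 1; for quadratic f it is exactly 1.
import numpy as np
from scipy.integrate import dblquad

f = lambda x, y: x**2 + 3*y**2 + x*y + x

def average(r):   # mean of f over B(0, r), computed in polar coordinates
    val, _ = dblquad(lambda rho, phi: f(rho*np.cos(phi), rho*np.sin(phi))*rho,
                     0.0, 2*np.pi, 0.0, r)
    return val / (np.pi * r**2)

for r in (0.5, 0.1, 0.02):
    print((average(r) - f(0.0, 0.0)) / r**2)   # ≈ 1 for every r
```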

Exercise 5.10

Suppose that G is an open subset of \({\mathbb R}^2\) and that \(\mathbf F:G\longrightarrow {\mathbb R}^2\) is continuous. If \(\mathbf F=\nabla f\) for some continuously differentiable \(f:G\longrightarrow {\mathbb R}\) and \(\mathbf p:[a,b]\longrightarrow G\) is a piecewise smooth path, show that

$$f\bigl (\mathbf p(b)\bigr )-f\bigl (\mathbf p(a)\bigr )=\int _a^b\Bigl (\mathbf F\bigl (\mathbf p(t)\bigr ),\dot{\mathbf p}(t)\Bigr )_{{\mathbb R}^2}\,dt$$

and therefore that \(\int _a^b\Bigl (\mathbf F\bigl (\mathbf p(t)\bigr ),\dot{\mathbf p}(t)\Bigr )_{{\mathbb R}^2}\,dt=0\) if \(\mathbf p\) is closed (i.e., \(\mathbf p(b)=\mathbf p(a)\)). Now assume that G is connected and that \(\int _a^b\Bigl (\mathbf F\bigl (\mathbf p(t)\bigr ),\dot{\mathbf p}(t)\Bigr )_{{\mathbb R}^2}\,dt=0\) for all piecewise smooth, closed paths \(\mathbf p:[a,b]\longrightarrow G\). Using Exercise 4.5, show that for each \(\mathbf x,\,\mathbf y\in G\) there is a piecewise smooth path in G that starts at \(\mathbf x\) and ends at \(\mathbf y\). Given a reference point \(\mathbf x_0\in G\) and an \(\mathbf x\in G\), show that

$$f(\mathbf x)\equiv \int _a^b\Bigl (\mathbf F\bigl (\mathbf p(t)\bigr ),\dot{\mathbf p}(t)\Bigr )_{{\mathbb R}^2}\,dt$$

is the same for all piecewise smooth paths \(\mathbf p:[a,b]\longrightarrow G\) such that \(\mathbf p(a)=\mathbf x_0\) and \(\mathbf p(b)=\mathbf x\). Finally, show that f is continuously differentiable and that \(\mathbf F=\nabla f\).
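The first identity is easy to see in action (an illustration, not part of the exercise; the potential \(f(x,y)=x^2y\) and the path \(\mathbf p(t)=(\cos t,\sin 2t)\) are arbitrary choices):

```python
# For F = ∇f, the line integral ∫_a^b (F(p(t)), ṗ(t)) dt equals f(p(b)) − f(p(a)).
import numpy as np
from scipy.integrate import quad

f  = lambda x, y: x**2 * y
F  = lambda x, y: np.array([2*x*y, x**2])          # ∇f
p  = lambda t: np.array([np.cos(t), np.sin(2*t)])  # a smooth path in ℝ²
pd = lambda t: np.array([-np.sin(t), 2*np.cos(2*t)])   # ṗ(t)

a, b = 0.0, 1.0
line, _ = quad(lambda t: F(*p(t)) @ pd(t), a, b)
print(line, f(*p(b)) - f(*p(a)))   # the two numbers agree
```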