
The purpose of this chapter is to discuss various geometric problems which are informed by orthogonality and related considerations. We begin with Hurwitz’s proof of the isoperimetric inequality using Fourier series. We prove Wirtinger’s inequality, both by Fourier series and by compact operators. We continue with a theorem comparing areas of the images of the unit disk under complex analytic mappings. We again give two proofs, one using power series and one using Green’s (Stokes’) theorem. The maps \(z\mapsto {z}^{d}\) from the circle to itself play a prominent part in our story. We naturally seek the higher-dimensional versions of some of these results. It turns out, not surprisingly, that one can develop the ideas in many directions. We limit ourselves here to a small number of possible paths, focusing on the unit sphere in \({\mathbf{C}}^{n}\), and we travel only a small distance along each of them.

Complex analytic mappings sending the unit sphere in \({\mathbf{C}}^{n}\) to the unit sphere in some \({\mathbf{C}}^{N}\) play a major role in this chapter. For example, we study polynomial mappings that are also invariant under finite subgroups of the unitary group, and we discover a surprising connection to Chebyshev polynomials. We also compute many explicit integrals. The author’s technique of orthogonal homogenization is introduced and is used to prove a sharp inequality about volumes (with multiplicity accounted for) of complex analytic images of the unit ball. To prove this inequality we develop needed information about differential forms and complex vector fields. This material leads us to the Cauchy–Riemann (CR) geometry of the unit sphere. We close with a generalization of the Riesz–Fejér theorem on nonnegative trig polynomials to a result on Hermitian polynomials that are positive on the unit sphere. This chapter thus provides many ways to extend results from the unit circle to higher dimensions, all informed by orthogonality and Hermitian analysis.

We do not consider the Fourier transform in higher dimensions. Many books on partial differential equations and harmonic analysis tell that story well.

1 The Isoperimetric Inequality

Geometric inequalities range from easy observations to deep assertions. One of the easiest such inequalities is that the rectangle of a given perimeter with maximum area is a square. The proof follows from \((x + y)(x - y) = {x}^{2} - {y}^{2} \leq {x}^{2}\), with equality when y = 0. One of the most famous inequalities solves the isoperimetric problem; given a closed curve in the plane of length L, the area A enclosed satisfies \(A \leq { {L}^{2} \over 4\pi }\). Equality happens only if the curve is a circle. We use Fourier series to prove this isoperimetric inequality, assuming that the curve is smooth.
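Though the proof below is analytic, the inequality is easy to test numerically. The following sketch (an illustration only; the ellipse axes and grid size are arbitrary choices) approximates L by a Riemann sum and compares the enclosed area A with L²∕4π:

```python
import numpy as np

# Numerical check of A <= L^2/(4*pi) for ellipses x = a cos t, y = b sin t.
def ellipse_L_A(a, b, n=200000):
    t = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    dt = 2.0 * np.pi / n
    # arc length: Riemann sum of ||gamma'(t)||
    L = np.sum(np.hypot(-a * np.sin(t), b * np.cos(t))) * dt
    A = np.pi * a * b          # enclosed area of the ellipse
    return L, A

deficits = []
for a, b in [(1.0, 1.0), (2.0, 1.0), (3.0, 0.5)]:
    L, A = ellipse_L_A(a, b)
    deficits.append(L**2 / (4.0 * np.pi) - A)

# the isoperimetric deficit vanishes (up to quadrature error) only for a = b
print(deficits)
```

The deficit L²∕4π − A is essentially zero for the circle and strictly positive for the genuine ellipses, as the theorem predicts.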

Recall from calculus that a smooth planar curve is a smooth function \(\gamma: [a,b] \rightarrow {\mathbf{R}}^{2}\) for which γ′(t) does not vanish. Officially speaking, the curve is the function, but it is natural to think also of the curve as the image of the function traced out in some order. The curve is called closed if γ(a) = γ(b) and simple if \(\gamma (t_{1})\neq \gamma (t_{2})\) for \(t_{1}\neq t_{2}\) unless \(t_{1} = a,t_{2} = b\) or \(t_{1} = b,t_{2} = a\). This complicated-sounding condition is clear in geometric terms; if one thinks of the curve as its image, then the curve is simple if it neither crosses itself nor covers itself multiple times. Note, for example, that the curve γ: [0, 2π] → C given by \(\gamma (t) = {e}^{2it}\) is closed but not simple, because it covers the circle twice.

The length of γ is the integral \(\int _{\gamma }ds\), where ds is the arc-length form. In terms of the function \(t\mapsto \gamma (t)\), we have the equivalent formula \(L =\int _{a}^{b}\vert \vert \gamma ^{\prime}(t)\vert \vert dt\); this value is unchanged if we reparametrize the curve. It is often convenient to parametrize using arc length; in this case \(\vert \vert \gamma ^{\prime}(s)\vert \vert = \vert \vert \gamma ^{\prime}(s)\vert {\vert }^{2} = 1\).

We can integrate 1-forms along nice curves γ. We give a precise definition of 1-form in Sect. 5. For now we assume the reader knows the meaning of the line integral \(\int _{\gamma }Pdx + Qdy\), assuming P and Q are continuous functions on γ. This integral measures the work done in moving along γ against a force given by (P, Q). We also assume Green’s theorem from calculus. In Green’s theorem, the curve γ is assumed to be positively oriented. Intuitively, this condition means the (image of the) curve is traversed counterclockwise as the parameter t increases from a to b.

Proposition 4.1 (Green’s theorem).

Let γ be a piecewise-smooth, positively oriented, simple closed curve in R 2, bounding a region Ω. Assume that P and Q are continuously differentiable on Ω and continuous on Ω ∪γ. Then

$$\displaystyle{\int _{\gamma }Pdx + Qdy =\int _{\Omega }\left ({\partial Q \over \partial x} -{ \partial P \over \partial y} \right )dxdy.}$$

The area A enclosed by γ is of course given by a double integral. Assume that γ is positively oriented. Using Green’s theorem, we see that A is also given by a line integral:

$$\displaystyle{ A =\int _{\Omega }dxdy ={ 1 \over 2}\int _{\gamma }xdy - ydx ={ 1 \over 2}\int _{a}^{b}\left (x(t)y^{\prime}(t) - x^{\prime}(t)y(t)\right )dt. }$$
(1)

Notice the appearance of the Wronskian.
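As a quick check of (1), we can evaluate the line integral numerically for an ellipse, whose area πab is known in closed form (the specific axes are illustrative choices):

```python
import numpy as np

# Check formula (1): A = (1/2) * integral of (x y' - x' y) dt, for the
# ellipse x = a cos t, y = b sin t, whose area is pi*a*b.
a, b = 2.0, 1.0
n = 100000
t = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
dt = 2.0 * np.pi / n
x, y = a * np.cos(t), b * np.sin(t)
xp, yp = -a * np.sin(t), b * np.cos(t)
A_line = 0.5 * np.sum(x * yp - xp * y) * dt   # the Wronskian integrand
print(A_line, np.pi * a * b)
```

For this curve the Wronskian x y′ − x′ y is identically ab, so the quadrature is essentially exact.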

Exercise 4.1.

Graph the set of points where \({x}^{3} + {y}^{3} = 3xy\). Use a line integral to find the area enclosed by the loop. Solve the same problem when the defining equation is \({x}^{2k+1} + {y}^{2k+1} = (2k + 1){x}^{k}{y}^{k}\). Comment: Set y = tx to parametrize the curve. Then \(xdy - ydx = x(tdx + xdt) - txdx = {x}^{2}dt\).
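The hinted substitution can also be carried out numerically. The following sketch approximates the loop area \({ 1 \over 2}\int _{0}^{\infty }x{(t)}^{2}dt\) for the folium \({x}^{3} + {y}^{3} = 3xy\) (the cutoff T and grid size are arbitrary choices; the tail beyond T is of size O(T −3)):

```python
import numpy as np

# With y = t x on x^3 + y^3 = 3xy, one finds x = 3t/(1 + t^3), and the
# loop corresponds to t in [0, infinity).  Since x dy - y dx = x^2 dt,
# the enclosed area is (1/2) * integral of x(t)^2 dt, computed here by
# plain trapezoidal quadrature on [0, T].
T, n = 200.0, 400001
t = np.linspace(0.0, T, n)
x = 3.0 * t / (1.0 + t**3)
h = t[1] - t[0]
integral = h * (np.sum(x**2) - 0.5 * (x[0]**2 + x[-1]**2))
area = 0.5 * integral
print(area)   # close to 3/2
```

The numerical value agrees with what the antiderivative of x(t)² gives exactly.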

Exercise 4.2.

Verify Green’s theorem when Ω is a rectangle. Explain how to extend Green’s theorem to a region whose boundary consists of finitely many sides, each parallel to one of the coordinate axes.

Theorem 4.1 (Isoperimetric inequality, smooth version).

Let γ be a smooth simple closed curve in R 2 of length L and enclosing a region of area A. Then \(A \leq { {L}^{2} \over 4\pi }\) and equality holds only when γ defines a circle.

Proof.

This proof goes back to Hurwitz in 1901. After a change of scale, we may assume that the length L of γ is 2π. After mapping [a, b] to [0, 2π], we parametrize by arc length s and thus assume γ: [0, 2π] → R 2 and || γ′(s) || = 1.

Since the curve is closed, γ may be thought of as periodic of period 2π. In terms of Fourier series we may therefore write:

$$\displaystyle{ \gamma (s) = (x(s),y(s)) = \left (\sum _{-\infty }^{\infty }a_{ n}{e}^{ins},\sum _{ -\infty }^{\infty }b_{ n}{e}^{ins}\right ) }$$
(2)
$$\displaystyle{ \gamma ^{\prime}(s) = (x^{\prime}(s),y^{\prime}(s)) = \left (\sum _{-\infty }^{\infty }ina_{ n}{e}^{ins},\sum _{ -\infty }^{\infty }inb_{ n}{e}^{ins}\right ). }$$
(3)

Since (x′(s), y′(s)) is a unit vector, we have \(2\pi =\int _{ 0}^{2\pi }{(x^{\prime}(s))}^{2} + {(y^{\prime}(s))}^{2}ds\). The only term that matters in computing the integral of a trigonometric series is the constant term. Constant terms in x′(s)2 and y′(s)2 arise precisely when the term with index n is multiplied by the term with index − n. It therefore follows that

$$\displaystyle{ \sum _{-\infty }^{\infty }{n}^{2}(\vert a_{ n}{\vert }^{2} + \vert b_{ n}{\vert }^{2}) = 1. }$$
(4)

We do a similar computation of \(xy^{\prime} - yx^{\prime}\) to find the area A. We have

$$\displaystyle{A ={ 1 \over 2}\big\vert \int _{0}^{2\pi }\left (x(s)y^{\prime}(s) - x^{\prime}(s)y(s)\right )ds\big\vert ={ 1 \over 2}2\pi \big\vert \sum in(a_{n}\overline{b_{n}} - b_{n}\overline{a_{n}})\big\vert }$$
$$\displaystyle{ =\pi \vert \sum in(a_{n}\overline{b_{n}} - b_{n}\overline{a_{n}})\vert \leq 2\pi \sum \vert n\vert \ \vert a_{n}\vert \vert b_{n}\vert. }$$
(5)

Next we use | n | ≤ n 2 and

$$\displaystyle{ \vert a_{n}b_{n}\vert \leq { 1 \over 2}(\vert a_{n}{\vert }^{2} + \vert b_{ n}{\vert }^{2}) }$$
(6)

in the last term in (5). These inequalities and (4) yield

$$\displaystyle{A \leq \pi \sum {n}^{2}(\vert a_{ n}{\vert }^{2} + \vert b_{ n}{\vert }^{2}) =\pi ={ {L}^{2} \over 4\pi },}$$

where we have also used the value L = 2π.

We check when equality holds in the inequality. It must be that the only nonzero terms are those with | n | = n 2, that is, n = 0, ± 1. We must also have equality in (6), and hence | a n | = | b n |. By (4) we then must have \(\vert a_{1}\vert = \vert b_{1}\vert ={ 1 \over 2}\). Put \(a_{1} ={ 1 \over 2}{e}^{i\mu }\) and \(b_{1} ={ 1 \over 2}{e}^{i\nu }\). Since x(s) and y(s) are real, \(a_{-1} = \overline{a_{1}}\) and \(b_{-1} = \overline{b_{1}}\). In other words we must have

$$\displaystyle{\left (x(s),y(s)\right ) = \left (a_{0} + a_{1}{e}^{is} + \overline{a_{ 1}}{e}^{-is},b_{ 0} + b_{1}{e}^{is} + \overline{b_{ 1}}{e}^{-is}\right ).}$$

Under these conditions we get \((x - a_{0},y - b_{0}) = (\mathrm{cos}(s+\mu ),\mathrm{cos}(s+\nu ))\). Finally, remembering that \({(x^{\prime})}^{2} + {(y^{\prime})}^{2} = 1\), we conclude that \(\mathrm{cos}(s+\nu ) = \pm \mathrm{sin}(s+\mu )\). Hence, γ defines a circle of radius 1. □

Exercise 4.3.

Given an ellipse E, create a family E t of ellipses such that the following all hold:

  1. \(E = E_{0}\).

  2. Each \(E_{t}\) has the same perimeter.

  3. The area enclosed by \(E_{t}\) is nondecreasing as a function of t.

  4. \(E_{1}\) is a circle.

Exercise 4.4.

A region Ω in the plane is convex if, whenever p, q ∈ Ω, the line segment connecting p and q also lies in Ω. Assume that Ω is bounded, has a nice boundary, but is not convex. Find, by a simple geometric construction, a region Ω ′ with the same perimeter as Ω but with a larger area. (Reflect a dent across a line segment. See Fig. 4.1.)

Figure 4.1: Convexity and the isoperimetric inequality

Remark 4.1.

The isoperimetric inequality holds in higher dimensions. For example, of all simple closed surfaces in R 3 with a given surface area, the sphere encloses the maximum volume.

2 Elementary L 2 Inequalities

In this section we prove several inequalities relating L 2 norms of functions and their derivatives. The setting for the first example is an interval on the real line, whereas the setting for the second example is the unit disk in C.

We begin with the Wirtinger inequality, an easy one-dimensional version of various higher-dimensional inequalities relating functions and their derivatives. We give two proofs to help unify topics in this book.

Theorem 4.2 (Wirtinger inequality).

Assume f is continuously differentiable on [0,1] and \(f(0) = f(1) = 0\) . The following inequality holds and is sharp:

$$\displaystyle{\vert \vert f\vert \vert _{{L}^{2}}^{2} \leq { 1 \over {\pi }^{2}} \vert \vert f^{\prime}\vert \vert _{{L}^{2}}^{2}.}$$

Proof.

First we show that there is a function for which equality occurs. Put f(x) = sin(π x). Then \(\vert \vert f^{\prime}\vert \vert _{{L}^{2}}^{2} = {\pi }^{2}\vert \vert f\vert \vert _{{L}^{2}}^{2}\) because

$$\displaystyle{\vert \vert f\vert \vert _{{L}^{2}}^{2} =\int _{ 0}^{1}{\mathrm{sin}}^{2}(\pi x)dx ={ 1 \over 2}}$$
$$\displaystyle{\vert \vert f^{\prime}\vert \vert _{{L}^{2}}^{2} =\int _{ 0}^{1}{\pi }^{2}{\mathrm{cos}}^{2}(\pi x)dx ={{ \pi }^{2} \over 2}.}$$

Next we use Fourier series to prove the inequality. By putting \(f(-x) = -f(x)\), we extend f to be an odd function (still called f) on the interval [ − 1, 1]. The extended f is still continuously differentiable, even at 0. Then f equals its Fourier series, which involves only the functions sin(nπ x). Put \(f(x) =\sum c_{n}\mathrm{sin}(n\pi x)\). Since f is odd, \(c_{0} =\hat{ f}(0) = 0\). Let L 2 continue to denote L 2([0, 1]). By either the Parseval formula or by orthonormality, we get

$$\displaystyle{\vert \vert f\vert \vert _{{L}^{2}}^{2} ={ 1 \over 2}\sum _{-\infty }^{\infty }\vert c_{ n}{\vert }^{2} =\sum _{ n=1}^{\infty }\vert c_{ n}{\vert }^{2}}$$
$$\displaystyle{\vert \vert f^{\prime}\vert \vert _{{L}^{2}}^{2} =\sum _{n=1}^{\infty }{n}^{2}{\pi }^{2}\vert c_{n}{\vert }^{2} = {\pi }^{2}\sum _{n=1}^{\infty }{n}^{2}\vert c_{n}{\vert }^{2}.}$$

Since 1 ≤ n 2 for all n ≥ 1, we obtain

$$\displaystyle{{\pi }^{2}\vert \vert f\vert \vert _{{ L}^{2}}^{2} \leq \vert \vert f^{\prime}\vert \vert _{{ L}^{2}}^{2}.}$$

The proof also tells us when equality occurs! Put c n = 0 unless n = 1; that is, put f(x) = sin(π x) (Fig. 4.2). □
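One can test the inequality numerically on random finite sine series, which automatically vanish at both endpoints (the series length, sample count, and tolerance below are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)

# Check pi^2 ||f||^2 <= ||f'||^2 on random f(x) = sum c_n sin(n pi x),
# each of which satisfies f(0) = f(1) = 0.
x = np.linspace(0.0, 1.0, 20001)
h = x[1] - x[0]

def l2sq(g):  # trapezoidal approximation of the squared L^2 norm on [0,1]
    return h * (np.sum(g**2) - 0.5 * (g[0]**2 + g[-1]**2))

ok = True
for _ in range(20):
    c = rng.normal(size=8)
    f = sum(cn * np.sin((k + 1) * np.pi * x) for k, cn in enumerate(c))
    fp = sum(cn * (k + 1) * np.pi * np.cos((k + 1) * np.pi * x)
             for k, cn in enumerate(c))
    ok = ok and (np.pi**2 * l2sq(f) <= l2sq(fp) * (1 + 1e-9))
print(ok)
```

Equality would require all the energy to sit in the n = 1 mode, exactly as the proof shows.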

Proof.

We sketch a different proof using compact operators. Define a linear operator T on the continuous functions in L 2([0, 1]) by \(Tf(x) =\int _{0}^{x}f(t)dt\). We work on the subspace where \(f(0) = f(1) = 0\). Computation (see Exercise 4.5) shows that \({T}^{{\ast}}f(x) =\int _{x}^{1}f(u)du\). The operator \({T}^{{\ast}}T\) is compact and self-adjoint. It is easy to check that each eigenvalue of \({T}^{{\ast}}T\) is nonnegative. By the first part of the proof of the spectral theorem, the maximal eigenvalue λ M of \({T}^{{\ast}}T\) satisfies \(\lambda _{M} = \vert \vert {T}^{{\ast}}T\vert \vert = \vert \vert T\vert {\vert }^{2}\). We find all eigenvalues.

Figure 4.2: Wirtinger inequality

Set \({T}^{{\ast}}Tf =\lambda f\) to get

$$\displaystyle{\int _{x}^{1}\int _{ 0}^{t}f(u)dudt =\lambda f(x).}$$

Differentiating twice and using the fundamental theorem of calculus gives

$$\displaystyle{-f(x) =\lambda f^{\prime \prime}(x).}$$

Since \(f(0) = f(1) = 0\), we conclude that \(f(x) = c\ \mathrm{sin}({ x \over \sqrt{\lambda }})\), where \({ 1 \over \sqrt{\lambda }} = n\pi\). Thus, \(\lambda ={ 1 \over {n}^{2}{\pi }^{2}}\). The maximum happens when n = 1. We get \(\vert \vert T\vert {\vert }^{2} ={ 1 \over {\pi }^{2}}\), which is equivalent to saying that \(\vert \vert Tg\vert \vert _{{L}^{2}} \leq { 1 \over \pi } \vert \vert g\vert \vert _{{L}^{2}}\) for all g. Setting g = f′ gives the desired conclusion. □
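The operator argument can be imitated in finite dimensions. Interchanging the order of integration shows that \({T}^{{\ast}}Tf(x) =\int _{0}^{1}(1 -\mathrm{max}(x,u))f(u)du\). When g = f′ with f(0) = f(1) = 0, the functions g range over the mean-zero functions, so the Wirtinger constant is the top eigenvalue of \({T}^{{\ast}}T\) compressed to that subspace. This is a numerical sketch of that idea (the grid size is an arbitrary choice), not part of the text’s proof:

```python
import numpy as np

# Discretize the kernel k(x,u) = 1 - max(x,u) of T*T on a midpoint grid,
# project onto the mean-zero functions, and find the top eigenvalue,
# which should approximate 1/pi^2.
m = 1000
x = (np.arange(m) + 0.5) / m                       # midpoint grid on [0,1]
K = (1.0 - np.maximum.outer(x, x)) / m             # discretized T*T
P = np.eye(m) - np.ones((m, m)) / m                # projection onto mean zero
lam_max = np.linalg.eigvalsh(P @ K @ P).max()
print(lam_max, 1.0 / np.pi**2)
```

The computed eigenvalue matches 1∕π², and the corresponding eigenvector samples cos(πx), the derivative of the extremal sin(πx).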

Corollary 4.1.

Assume f is continuously differentiable with \(f(a) = f(b) = 0\) . Then

$$\displaystyle{\int _{a}^{b}\vert f(x){\vert }^{2}dx \leq {({b - a \over \pi } )}^{2}\int _{ a}^{b}\vert f^{\prime}(x){\vert }^{2}dx.}$$

Proof.

The result follows from changing variables (Exercise 4.6). □

Exercise 4.5.

Put \(Tf(x) =\int _{a}^{b}K(x,y)f(y)dy\). Express \({T}^{{\ast}}\) as an integral operator. Check your answer when T is as in the second proof of Theorem 4.2.

Exercise 4.6.

Prove Corollary 4.1.

Higher-dimensional analogues of the Wirtinger inequality are called Poincaré inequalities. Given a region Ω in R n, a Poincaré inequality is an estimate of the form (for some constant C)

$$\displaystyle{ \vert \vert u\vert \vert _{{L}^{2}}^{2} \leq {C}^{2}\left ({\left \vert \int _{ \Omega }u\right \vert }^{2} + \vert \vert \nabla u\vert \vert _{{ L}^{2}}^{2}\right ). }$$
(P)

Let A denote the volume of Ω and let \(u_{0}={ 1 \over A}\int _{\Omega }u\) denote the average value of u on Ω. We can rewrite (P) in the form

$$\displaystyle{ \vert \vert u - u_{0}\vert \vert _{{L}^{2}}^{2} \leq {C}^{2}\vert \vert \nabla u\vert \vert _{{L}^{2}}^{2}. }$$
(P.1)

By expanding the squared norm on the left-hand side of (P.1) and doing some simple manipulations, we can also rewrite (P) in the form

$$\displaystyle{ \vert \vert u\vert \vert _{{L}^{2}}^{2} \leq { 1 \over A}{\left \vert \int u\right \vert }^{2} + {C}^{2}\vert \vert \nabla u\vert \vert _{{ L}^{2}}^{2}. }$$
(P.2)

The technique of subtracting the average value and expanding the squared norm appears, in various guises, many times in this book. This reasoning is standard in elementary probability, as used in Proposition 5.4. Observe also, for f, f 0 in a Hilbert space, that

$$\displaystyle{\vert \vert f - f_{0}\vert {\vert }^{2} = \vert \vert f\vert {\vert }^{2} -\vert \vert f_{ 0}\vert {\vert }^{2}}$$

whenever \(f - f_{0}\perp f_{0}\). This version of the Pythagorean theorem was used in the proof of Bessel’s inequality, where f 0 was the orthogonal projection of f onto the subspace spanned by a finite orthonormal system.

Poincaré estimates do not hold for all domains. When such an inequality does hold, the smallest value of C that works is called the Poincaré constant for the domain.

We make one additional observation. In our proof of the Wirtinger inequality, we assumed that f vanished at both endpoints. We could have assumed that f vanished at only one endpoint, or instead that the average value of f was 0, and in each case proved similar results. The condition that the average value of f vanishes means that f is orthogonal to the one-dimensional subspace of constant functions. The condition that f vanish at the endpoints means that f lies in a subspace of codimension two.

Exercise 4.7.

Find the Poincaré constant for the interval [ − A, A]. (The function \(\mathrm{sin}({ \pi x \over 2A})\) achieves the bound. The answer is \({2A \over \pi }\).)

Remark 4.2.

The Wirtinger inequality provides a bound on the L 2 norm of a function in terms of the L 2 norm of its derivative. Various inequalities that bound the maximum of the derivative p′ of a polynomial in terms of the maximum of p (thus going in the other direction) and its degree are called Bernstein inequalities and Markov inequalities. We do not consider such results in this book.

We next prove a simple geometric inequality in one complex dimension. It motivates a more difficult higher-dimensional analogue which we prove in Sect. 9. The orthogonality of the functions e inθ again features prominently.

Let f be a complex analytic function on the unit disk B 1. Let A f denote the area of the image, with multiplicity counted. For example, if \(f(z) = {z}^{m}\), then f covers the disk m times and \(A_{f} = m\pi\). The formula for A f involves the L 2 norm of the derivative. We make the concept of counting multiplicity precise by defining A f as follows:

Definition 4.1.

Let Ω be open in C. Assume f: Ω → C is complex analytic. The area, written A f (Ω) or A f , of the image of f, with multiplicity counted, is defined by

$$\displaystyle{ A_{f} = \vert \vert f^{\prime}\vert \vert _{{L}^{2}(\Omega )}^{2}. }$$
(7)
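Definition 4.1 can be sanity-checked on the example \(f(z) = {z}^{m}\) above, where the image covers the disk m times; computing (7) in polar coordinates by a midpoint rule should return mπ:

```python
import numpy as np

# Check Definition 4.1 on f(z) = z^m: compute ||f'||^2 over the unit disk
# in polar coordinates; the answer should be m*pi.
def area_with_multiplicity(m, nr=4000):
    r = (np.arange(nr) + 0.5) / nr            # midpoint rule on [0,1]
    # |f'(z)|^2 = m^2 r^(2m-2) is independent of theta
    return 2.0 * np.pi * np.sum(m**2 * r**(2 * m - 2) * r) / nr

vals = [area_with_multiplicity(m) for m in (1, 2, 3)]
print(vals)    # approximately pi, 2*pi, 3*pi
```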

We next note that this concept agrees with what we expect when f is injective.

Lemma 4.1.

Let f: Ω → C be complex analytic and injective. Then the area of f(Ω) equals \(\vert \vert f^{\prime}\vert \vert _{{L}^{2}(\Omega )}^{2}\) .

Proof.

Let A(f) denote the area of the image of f. Write \(f = u + iv\) and define F(x, y) = (u(x, y), v(x, y)). The Cauchy–Riemann equations and the definition of f′ imply \(\mathrm{det}(F^{\prime}) = u_{x}v_{y} - u_{y}v_{x} = u_{x}^{2} + u_{y}^{2} = \vert f^{\prime}{\vert }^{2}\). Since F is injective, the change of variables formula for double integrals applies and gives

$$\displaystyle{A(f) =\int _{F(\Omega )}dudv =\int _{\Omega }\vert \mathrm{det}(F^{\prime})\vert dxdy =\int _{\Omega }\vert f^{\prime}(z){\vert }^{2}dxdy = \vert \vert f^{\prime}\vert \vert _{{ L}^{2}}^{2}.}$$

Versions of the change of variables formula hold more generally. Suppose that f is m to one for some fixed m. The change of variables formula gives

$$\displaystyle{m\int _{F(\Omega )}dudv =\int _{\Omega }\vert \mathrm{det}(F^{\prime})\vert dxdy =\int _{\Omega }\vert f^{\prime}(z){\vert }^{2}dxdy = \vert \vert f^{\prime}\vert \vert _{{ L}^{2}}^{2}.}$$

In general, the multiplicity varies from point to point. For complex analytic functions, things are nonetheless quite nice. See [A] for the following result. Suppose that f is complex analytic near z 0 and the function zf(z) − f(z 0) has a zero of order m. Then, for w sufficiently close to f(z 0), there is a (deleted) neighborhood of z 0 on which the equation f(z) = w has precisely m solutions. By breaking Ω into sets on which f has constant multiplicity, we justify the definition of A f .

We return to the unit disk. The natural Hilbert space here is the set \({\mathcal{A}}^{2}\) of square-integrable complex analytic functions f on the unit disk. The inner product on \({\mathcal{A}}^{2}\) is given by

$$\displaystyle{\langle f,g\rangle =\int _{B_{1}}f(z)\overline{g(z)}dxdy.}$$

The subspace \({\mathcal{A}}^{2}\) is closed in L 2 and hence is itself a Hilbert space. See, for example, pages 70–71 in [D1] for a proof. The main point of the proof is that, on any compact subset K of the disk, we can estimate (the \({L}^{\infty }\) norm) \(\mathrm{sup}_{K}\vert f\vert \) by a constant times (the L 2 norm) || f ||. Hence, if {f n } is Cauchy in L 2, then {f n } converges uniformly on compact subsets. By a standard fact in complex analysis (see [A]), the limit function is also complex analytic.

We are also concerned with the subspace of \({\mathcal{A}}^{2}\) consisting of those f for which f′ is square-integrable.

Lemma 4.2.

The functions z n for n = 0,1,2,… form a complete orthogonal system for \({\mathcal{A}}^{2}\) .

Proof.

Using polar coordinates we have

$$\displaystyle{ \langle {z}^{n},{z}^{m}\rangle =\int _{ 0}^{1}\int _{ 0}^{2\pi }{r}^{n+m+1}{e}^{i(n-m)\theta }d\theta dr. }$$
(8)

By (8), the inner product vanishes unless m = n. To check completeness, we observe that a complex analytic function in the unit disk has a power series based at 0 that converges in the open unit disk. If f is orthogonal to each monomial, then each Taylor coefficient of f vanishes and f is identically 0. □

Lemma 4.2 illustrates a beautiful aspect of Hilbert spaces of complex analytic functions. Let f be complex analytic in the unit disk, with power series \(\sum a_{n}{z}^{n}\). By basic analysis, the partial sums S N of this series converge uniformly to f on compact subsets of the unit disk. By Lemma 4.2, the partial sum S N can also be interpreted as the orthogonal projection of f onto the subspace of polynomials of degree at most N. Hence the partial sums also converge to f in the Hilbert space sense.

In Proposition 4.2 we relate \(\vert \vert f\vert \vert _{{L}^{2}}^{2}\) to the l 2 norm of the Taylor coefficients of f. By (9) below we can identify elements of \({\mathcal{A}}^{2}\) with sequences {b n } such that \(\sum {\vert b_{n}{\vert }^{2} \over n+1}\) converges.

Consider the effect on the area of the image if we multiply f by z. Since | z | < 1, the inequality | zf(z) | ≤ | f(z) | is immediate. But the area of the image under zf exceeds the area of the image under f, unless f is identically 0. In fact we can explain and determine precisely how the area grows.

Proposition 4.2.

Let \(f(z) =\sum _{ n=0}^{\infty }b_{n}{z}^{n}\) be a complex analytic function on the unit disk B 1 . We assume that both f and f′ are in \({L}^{2}(B_{1})\) . Then

$$\displaystyle{ \vert \vert f\vert \vert _{{L}^{2}}^{2} =\pi \sum _{ n=0}^{\infty }{\vert b_{n}{\vert }^{2} \over n + 1} }$$
(9)
$$\displaystyle{ \vert \vert f^{\prime}\vert \vert _{{L}^{2}}^{2} =\pi \sum _{ n=0}^{\infty }(n + 1)\vert b_{ n+1}{\vert }^{2} }$$
(10)
$$\displaystyle{ \vert \vert (zf)^{\prime}\vert \vert _{{L}^{2}}^{2} = \vert \vert f^{\prime}\vert \vert _{{ L}^{2}}^{2} +\pi \sum _{ n=0}^{\infty }\vert b_{ n}{\vert }^{2}. }$$
(11.1)

Thus A zf ≥ A f and equality occurs only when f vanishes identically.

Proof.

The proof of (9) is an easy calculation in polar coordinates, using the orthogonality of e inθ. Namely, we have

$$\displaystyle{\vert \vert f\vert \vert _{{L}^{2}}^{2} =\int _{ B_{1}}\vert \sum b_{n}{z}^{n}{\vert }^{2}dxdy =\int _{ 0}^{1}\int _{ 0}^{2\pi }\sum b_{ n}\overline{b}_{m}{r}^{m+n}{e}^{i\theta (m-n)}rdrd\theta.}$$

The only terms that matter are those for which m = n and we see that

$$\displaystyle{\vert \vert f\vert \vert _{{L}^{2}}^{2} = 2\pi \sum \vert b_{ n}{\vert }^{2}\int _{ 0}^{1}{r}^{2n+1}dr =\pi \sum _{ n=0}^{\infty }{\vert b_{n}{\vert }^{2} \over n + 1}.}$$

Formula (10) follows immediately from (9). To prove (11.1), observe that \((zf)^{\prime}(z) =\sum _{ n=0}^{\infty }(n + 1)b_{n}{z}^{n}\). By (10), we have

$$\displaystyle{\vert \vert (zf)^{\prime}\vert \vert _{{L}^{2}}^{2} =\pi \sum _{ n=0}^{\infty }(n + 1)\vert b_{ n}{\vert }^{2} =\pi \sum _{ n=0}^{\infty }n\vert b_{ n}{\vert }^{2} +\pi \sum _{ n=0}^{\infty }\vert b_{ n}{\vert }^{2}}$$
$$\displaystyle{= \vert \vert f^{\prime}\vert \vert _{{L}^{2}}^{2} +\pi \sum _{ n=0}^{\infty }\vert b_{ n}{\vert }^{2}.}$$
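Formulas (9), (10), and (11.1) can be checked numerically for a random polynomial by comparing each series with two-dimensional quadrature over the unit disk (the degree and grid sizes are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)

# f(z) = sum b_n z^n with random complex coefficients.
b = rng.normal(size=6) + 1j * rng.normal(size=6)
n = np.arange(6)

def l2sq_disk(coeffs, nr=2000, nt=256):
    # squared L^2 norm over B_1, by midpoint rule in r and trapezoid in theta
    r = (np.arange(nr) + 0.5) / nr
    th = 2.0 * np.pi * np.arange(nt) / nt
    z = np.outer(r, np.exp(1j * th))
    vals = sum(c * z**k for k, c in enumerate(coeffs))
    return np.sum(np.abs(vals)**2 * r[:, None]) * (1.0 / nr) * (2.0 * np.pi / nt)

f_sq   = l2sq_disk(b)
fp     = b[1:] * n[1:]                     # coefficients of f'
fp_sq  = l2sq_disk(fp)
zfp    = b * (n + 1)                       # coefficients of (zf)'
zfp_sq = l2sq_disk(zfp)

rhs9  = np.pi * np.sum(np.abs(b)**2 / (n + 1))
rhs10 = np.pi * np.sum((n[:-1] + 1) * np.abs(b[1:])**2)
rhs11 = fp_sq + np.pi * np.sum(np.abs(b)**2)
print(f_sq - rhs9, fp_sq - rhs10, zfp_sq - rhs11)
```

All three differences vanish to quadrature accuracy.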

We express (11.1) in operator-theoretic language. Let \(D ={ d \over dz}\) with domain \(\{f \in {\mathcal{A}}^{2}: f^{\prime} \in {\mathcal{A}}^{2}\}\). Then D is an unbounded linear operator. Let M denote the bounded operator of multiplication by z. When f extends continuously to the circle, we write Sf for its restriction to the circle, that is, its boundary values. Thus \(\vert \vert Sf\vert {\vert }^{2} ={ 1 \over 2\pi }\int _{0}^{2\pi }\vert f{\vert }^{2}\). The excess area has a simple geometric interpretation:

Corollary 4.2.

Suppose Mf is in the domain of D. Then Sf is square integrable on the circle and

$$\displaystyle{ \vert \vert DMf\vert \vert _{{L}^{2}}^{2} -\vert \vert Df\vert \vert _{{ L}^{2}}^{2} ={ 1 \over 2}\int _{0}^{2\pi }\vert f({e}^{i\theta }){\vert }^{2}d\theta =\pi \vert \vert Sf\vert {\vert }^{2}. }$$
(11.2)

Proof.

The result is immediate from (11.1). □

Corollary 4.2 suggests an alternate way to view (11.1) and (11.2). We can use Green’s theorem to relate the integral over the unit disk to the integral over the circle. The computation uses the notation of differential forms. We discuss differential forms in detail in Sects. 5 and 6. For now we need to know less. In particular \(dz = dx + idy\) and \(d\overline{z} = dx - idy\). We can differentiate in these directions. See Sect. 1 for detailed discussion. For any differentiable function f, we write ∂ f for \({\partial f \over \partial z} dz\) and \(\overline{\partial }f\) for \({\partial f \over \partial \overline{z}}d\overline{z}\). If f is complex analytic, then \(\overline{\partial }f = 0\) (the Cauchy–Riemann equations), and we have

$$\displaystyle{df = (\partial + \overline{\partial })f = \partial f = f^{\prime}(z)dz.}$$

The area form in the plane becomes

$$\displaystyle{dx \wedge dy ={ -1 \over 2i} dz \wedge d\overline{z} ={ i \over 2}dz \wedge d\overline{z}.}$$

Finally, we use Green’s theorem, expressed in complex notation, in formula (12) of the geometric proof below. We generalize this proof in Sect. 9.

Exercise 4.8.

Express Green’s theorem in complex notation: express the line integral of \(Adz + Bd\overline{z}\) around γ as an area integral over the region bounded by γ.

Exercise 4.9.

Use Exercise 4.8 to show that \(\int _{\gamma }f(z)dz = 0\) when f is complex analytic and γ is a closed curve as in Green’s theorem. (This result is an easy form of the Cauchy integral theorem.)

Here is a beautiful geometric proof of (11.2), assuming f′ extends continuously to the circle:

Proof.

For any complex analytic f, we have

$$\displaystyle{A_{f} = \vert \vert f^{\prime}\vert \vert _{{L}^{2}}^{2} ={ i \over 2}\int _{B_{1}}\partial f \wedge \overline{\partial f} ={ i \over 2}\int _{B_{1}}d(f\overline{\partial f}).}$$

We apply this formula also to (zf). The difference in areas satisfies

$$\displaystyle{A_{zf} - A_{f} = \vert \vert (zf)^{\prime}\vert \vert _{{L}^{2}}^{2} -\vert \vert f^{\prime}\vert \vert _{{ L}^{2}}^{2} ={ i \over 2}\int _{B_{1}}d\left (zf\overline{\partial (zf)} - f\overline{\partial f}\right ).}$$

Assuming f′ extends continuously to the circle, we may use Green’s theorem to rewrite this integral as an integral over the circle:

$$\displaystyle{ A_{zf} - A_{f} ={ i \over 2}\int _{{S}^{1}}zf\overline{\partial (zf)} - (f\overline{\partial f}). }$$
(12)

By the product rule, \(\partial (zf) = fdz + z\partial f\). We plug this formula into (12) and simplify, getting

$$\displaystyle{A_{zf} - A_{f} ={ i \over 2}\int _{{S}^{1}}(\vert z{\vert }^{2} - 1)f\overline{\partial f} +{ i \over 2}\int _{{S}^{1}}z\vert f(z){\vert }^{2}d\overline{z}.}$$

The first integral vanishes because | z |2 = 1 on the circle. We rewrite the second integral by putting \(z = {e}^{i\theta }\) to obtain

$$\displaystyle{{ i \over 2}\int _{{S}^{1}}{e}^{i\theta }\vert f({e}^{i\theta }){\vert }^{2}(-i){e}^{-i\theta }d\theta ={ 1 \over 2}\int _{{S}^{1}}\vert f({e}^{i\theta }){\vert }^{2}d\theta =\pi { 1 \over 2\pi }\int _{{S}^{1}}\vert f({e}^{i\theta }){\vert }^{2}d\theta =\pi \vert \vert Sf\vert {\vert }^{2}.}$$
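Formula (11.2) is easy to test for a polynomial f, since both areas are given by the series in Proposition 4.2 and the boundary term is a single integral over the circle:

```python
import numpy as np

rng = np.random.default_rng(2)

# Check A_zf - A_f = (1/2) * integral of |f(e^{i theta})|^2 d theta
# for a random polynomial f(z) = sum b_n z^n.
b = rng.normal(size=5) + 1j * rng.normal(size=5)
n = np.arange(5)

A_f  = np.pi * np.sum(n * np.abs(b)**2)         # pi * sum n |b_n|^2
A_zf = np.pi * np.sum((n + 1) * np.abs(b)**2)   # pi * sum (n+1) |b_n|^2

nt = 4096
th = 2.0 * np.pi * np.arange(nt) / nt
fth = sum(c * np.exp(1j * k * th) for k, c in enumerate(b))
boundary = 0.5 * np.sum(np.abs(fth)**2) * (2.0 * np.pi / nt)
print(A_zf - A_f, boundary)
```

The trapezoid rule is exact here, so the two printed numbers agree to rounding error, and both equal π||Sf||².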

Exercise 4.10.

Show that Corollary 4.2 can be stated as \({M}^{{\ast}}{D}^{{\ast}}DM - {D}^{{\ast}}D =\pi {S}^{{\ast}}S\).

Exercise 4.11.

What are the eigenfunctions and eigenvalues of DM and of MD? Show that the commutator \([D,M] = DM - MD\) is the identity. This example arises in quantum mechanics.

Exercise 4.12.

Find a closed formula for \(\sum _{j=0}^{\infty }{\vert z{\vert }^{2j} \over c_{j}}\), where \(c_{j} = \vert \vert {z}^{j}\vert {\vert }^{2}\) is the squared norm in \({\mathcal{A}}^{2}\). The answer is the Bergman kernel function of the unit disk.

Exercise 4.13.

For 0 ≤ a ≤ 1 and for | z | < 1, put \(f_{a}(z) = \sqrt{1 - {a}^{2}}z + a{z}^{2}\). Find \(\vert \vert f_{a}^{\prime}\vert \vert _{{L}^{2}}^{2}\) in terms of a. For several values of a, graph the image of the unit disk under \(f_{a}\). For what values of a is \(f_{a}\) injective? See Figs. 4.3 and 4.4.

Exercise 4.14.

Put \(f(z) = z + {z}^{2} + {z}^{3}\). Describe or graph the image of the set | z | = r under f for several values of r. Suggestion: Use polar coordinates.

3 Unitary Groups

We now begin studying geometric problems in several complex variables. Recall that ⟨z, w⟩ denotes the Hermitian inner product of points in complex Euclidean space C n. The unitary group \(\mathcal{U}(n)\) consists of the linear transformations T which preserve the inner product; ⟨Tz, Tw⟩ = ⟨z, w⟩. Setting z = w shows that such transformations also preserve norms. The converse is also true: if || Tz ||2 = || z ||2 for all z, then ⟨Tz, Tw⟩ = ⟨z, w⟩ for all z and w, by Proposition 2.6.

Figure 4.3: Injective image of unit disk

Figure 4.4: Overlapping image of unit disk

The group law in \(\mathcal{U}(n)\) is composition. Let U, V be unitary transformations on C N. Then the composition UV is also unitary, because

$$\displaystyle{{(UV )}^{{\ast}} = {V }^{{\ast}}{U}^{{\ast}} = {V }^{-1}{U}^{-1} = {(UV )}^{-1}.}$$

It follows that the collection \(\mathcal{U}(N)\) of unitary transformations on C N is a subgroup of the group of invertible linear transformations.

We will often deal with complex Euclidean spaces of different dimensions. It is convenient to omit the dimension in the notation for the inner products and norms. When doing so, we must be careful. Suppose \(L:{ \mathbf{C}}^{n} \rightarrow {\mathbf{C}}^{n+1}\) is given by L(z) = (z, 0). Then L is linear and || L(z) || = || z ||, but L is not unitary. Distance-preserving maps are called isometries. In this setting, when N > n, we often identify \({\mathbf{C}}^{n}\) with the subspace \({\mathbf{C}}^{n} \oplus 0 \subseteq {\mathbf{C}}^{N}\).
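The example L(z) = (z, 0) can be written as an (n+1) × n matrix; then \({L}^{{\ast}}L\) is the identity while \(L{L}^{{\ast}}\) is only a projection, which is exactly the failure of unitarity:

```python
import numpy as np

# L(z) = (z, 0) from C^n to C^(n+1) preserves norms (an isometry) but is
# not unitary: L*L = I_n while L L* is a proper orthogonal projection.
n = 3
L = np.vstack([np.eye(n), np.zeros((1, n))]).astype(complex)  # (n+1) x n

LstarL = L.conj().T @ L
LLstar = L @ L.conj().T
print(np.allclose(LstarL, np.eye(n)), np.allclose(LLstar, np.eye(n + 1)))
```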

Our first result (which holds much more generally than we state here) provides a polarization technique that gets used several times in the sequel, usually in the special case when f and g are polynomial mappings.

Theorem 4.3.

Let B be a ball centered at 0 in C n . Suppose \(f: B \rightarrow {\mathbf{C}}^{N_{1}}\) and \(g: B \rightarrow {\mathbf{C}}^{N_{2}}\) are complex analytic mappings and ||f(z)|| 2 = ||g(z)|| 2 for all z ∈ B. Assume that the image of g lies in no lower-dimensional subspace and that N 1 ≥ N 2 . Then there is an isometry \(U:{ \mathbf{C}}^{N_{2}} \rightarrow {\mathbf{C}}^{N_{1}}\) such that f(z) = Ug(z) for all z. When f and g are as above and also N 2 = N 1 , then U is unitary.

Proof.

We expand f and g as convergent power series about 0, writing \(f(z) =\sum _{\alpha }A_{\alpha }{z}^{\alpha }\) and \(g(z) =\sum _{\alpha }B_{\alpha }{z}^{\alpha }\); the coefficient A α lies in \({\mathbf{C}}^{N_{1}}\) and the coefficient B α lies in \({\mathbf{C}}^{N_{2}}\). Equating the Taylor coefficients in || f(z) ||2 = || g(z) ||2 leads, for each pair α and β of multi-indices, to

$$\displaystyle{ \langle A_{\alpha },A_{\beta }\rangle =\langle B_{\alpha },B_{\beta }\rangle. }$$
(13)

It follows from (13) that \(A_{\alpha _{1}},\ldots,A_{\alpha _{K}}\) is a linearly independent set if and only if \(B_{\alpha _{1}},\ldots,B_{\alpha _{K}}\) is a linearly independent set. We then define U by U(B α ) = A α for a maximal linearly independent set of the B α . If B μ is a linear combination of some B α , then we define U(B μ ) as the same linear combination of the A α . The relations (13) guarantee that U is well defined. Furthermore, these relationships imply that U preserves inner products. Hence, U is an isometry on the span of the B α . When N 1 = N 2, an isometry U must be unitary. □

Example 4.1.

The parallelogram law provides an example of Theorem 4.3. Suppose \(g(z_{1},z_{2}) = (\sqrt{2}z_{1},\sqrt{2}z_{2})\) and \(f(z_{1},z_{2}) = (z_{1} + z_{2},z_{1} - z_{2})\). Then

$$\displaystyle{\vert \vert g(z)\vert {\vert }^{2} = 2\vert z_{ 1}{\vert }^{2} + 2\vert z_{ 2}{\vert }^{2} = \vert z_{ 1} + z_{2}{\vert }^{2} + \vert z_{ 1} - z_{2}{\vert }^{2} = \vert \vert f(z)\vert {\vert }^{2}.}$$

In this case f = Ug, where U is given by

$$\displaystyle{U = \left (\begin{array}{ccc} { 1 \over \sqrt{2}} & & { 1 \over \sqrt{2}} \\ { 1 \over \sqrt{2}} & & { -1 \over \sqrt{2}} \end{array} \right ).}$$

Our next example illustrates the situation when N 1 > N 2 in Theorem 4.3.

Example 4.2.

Put \(f(z) = (z_{1}^{2},z_{1}z_{2},z_{1}z_{2},z_{2}^{2})\) and \(g(z) = (z_{1}^{2},\sqrt{2}z_{1}z_{2},z_{2}^{2})\). Then f: C 2 → C 4 and g: C 2 → C 3. Also,

$$\displaystyle{\vert \vert f(z)\vert {\vert }^{2} = \vert z_{ 1}{\vert }^{4} + 2\vert z_{ 1}{\vert }^{2}\ \vert z_{ 2}{\vert }^{2} + \vert z_{ 2}{\vert }^{4} = {(\vert z_{ 1}{\vert }^{2} + \vert z_{ 2}{\vert }^{2})}^{2} = \vert \vert g(z)\vert {\vert }^{2}.}$$

The map U: C 3C 4 for which f = Ug is given by the matrix (with respect to the usual bases)

$$\displaystyle{U = \left (\begin{array}{ccccc} 1&& 0 &&0 \\ 0&&{ 1 \over \sqrt{2}} & & 0 \\ 0&&{ 1 \over \sqrt{2}} & & 0 \\ 0&& 0 &&1 \end{array} \right ).}$$

If \(\zeta = (\zeta _{1},\zeta _{2},\zeta _{3})\), then \(\vert \vert U\zeta \vert {\vert }^{2} = \vert \zeta _{1}{\vert }^{2} + \vert \zeta _{2}{\vert }^{2} + \vert \zeta _{3}{\vert }^{2} = \vert \vert \zeta \vert {\vert }^{2}\). Thus, U is an isometry, but U is not unitary.
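Example 4.2 is easy to test numerically. The following Python sketch (the helper names `f`, `g`, `U`, and `sq_norm` are ours, introduced only for this check) verifies that the squared norms agree, that f = Ug, and that U preserves norms.

```python
import random

# The maps from Example 4.2.
def f(z1, z2):
    return [z1**2, z1*z2, z1*z2, z2**2]

def g(z1, z2):
    return [z1**2, (2**0.5)*z1*z2, z2**2]

def sq_norm(v):
    return sum(abs(c)**2 for c in v)

def U(zeta):
    # The 4x3 matrix from the text, applied to zeta in C^3.
    r = 1 / 2**0.5
    z1, z2, z3 = zeta
    return [z1, r*z2, r*z2, z3]

random.seed(1)
z = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(2)]
fz, gz = f(*z), g(*z)
err_norm = abs(sq_norm(fz) - sq_norm(gz))            # ||f||^2 = ||g||^2
err_map = max(abs(a - b) for a, b in zip(fz, U(gz)))  # f = Ug
err_iso = abs(sq_norm(U(gz)) - sq_norm(gz))           # U is an isometry
```

Note that U cannot be unitary: its image lies in a three-dimensional subspace of C 4.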

Observe that each of the maps f and g from Example 4.2 sends the unit sphere in the domain to the unit sphere in the target. We will now consider such mappings in detail.

We begin with some examples of symmetries of the unit sphere. If e iθ lies on the unit circle S 1, and z lies on the unit sphere S 2n − 1, the scalar multiple e iθ z lies on S 2n − 1 as well. Thus, S 1 acts on S 2n − 1. We can replace S 1 with the n-torus S 1 ×⋯× S 1. In this case we map \(z = (z_{1},z_{2},\ldots,z_{n})\) to \(({e}^{i\theta _{1}}z_{1},{e}^{i\theta _{2}}z_{2},\ldots,{e}^{i\theta _{n}}z_{n})\). Furthermore, for z ∈ S 2n − 1 and \(U \in \mathcal{U}(n)\), we have of course that Uz ∈ S 2n − 1.

The next example of a symmetry is a bit more complicated. Choose a point a in the open unit ball B n . First define a linear mapping L a : C n → C n by

$$\displaystyle{L_{a}(z) = sz +{ 1 \over s + 1}\langle z,a\rangle a,}$$

where \(s = \sqrt{1 - \vert \vert a\vert {\vert }^{2}}\). Then define ϕ a by

$$\displaystyle{\phi _{a}(z) ={ a - L_{a}(z) \over 1 -\langle z,a\rangle }.}$$

The mapping ϕ a is a complex analytic automorphism of the unit ball, and it maps the unit sphere to itself. See Exercise 4.15, Exercise 4.16, and the discussion in Sect. 4 for more information.
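The defining properties of ϕ a can be checked numerically. Here is a Python sketch (the helpers `inner` and `phi` are our names) verifying ϕ a (0) = a, ϕ a (a) = 0, and that ϕ a maps the unit sphere to itself, for a sample point a in the ball of C 2.

```python
import random

def inner(z, w):
    # Hermitian inner product <z, w> = sum_j z_j * conj(w_j)
    return sum(zj * wj.conjugate() for zj, wj in zip(z, w))

def phi(a, z):
    # phi_a(z) = (a - L_a(z)) / (1 - <z, a>), with s = sqrt(1 - ||a||^2)
    s = (1 - inner(a, a).real) ** 0.5
    La = [s * zj + (inner(z, a) / (s + 1)) * aj for zj, aj in zip(z, a)]
    d = 1 - inner(z, a)
    return [(aj - lj) / d for aj, lj in zip(a, La)]

a = [0.3 + 0.1j, -0.2 + 0.4j]       # a point in the open unit ball of C^2
err0 = max(abs(u - v) for u, v in zip(phi(a, [0, 0]), a))  # phi_a(0) = a
erra = max(abs(u) for u in phi(a, a))                      # phi_a(a) = 0
random.seed(0)
z = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(2)]
r = sum(abs(c) ** 2 for c in z) ** 0.5
z = [c / r for c in z]                                     # a point of S^3
err_sphere = abs(sum(abs(c) ** 2 for c in phi(a, z)) - 1)
```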

Exercise 4.15.

Verify the following properties of the mapping ϕ a :

  • ϕ a (0) = a.

  • ϕ a (a) = 0.

  • \(\phi _{a}: {S}^{2n-1} \rightarrow {S}^{2n-1}\).

  • ϕ a ∘ ϕ a is the identity.

Exercise 4.16.

Carefully compute ϕ b ∘ ϕ a . The result is not of the form ϕ c for any c with || c || < 1. Show, however, that the result can be written Uϕ c for some such c and some unitary U. Suggestion: First do the computation when n = 1.

Remark 4.3.

In complex analysis or harmonic analysis, it is natural to consider the group of all complex analytic automorphisms preserving the sphere. Each element of this group can be written as Uϕ a for some unitary U and some ϕ a . Rather than considering the full group, we will focus on the unitary group \(\mathcal{U}(n)\) and its finite subgroups. Various interesting combinatorial and number-theoretic issues arise in this setting.

We start in one dimension with an elementary identity (Lemma 4.3) involving roots of unity. The proof given reveals the power of geometric reasoning; one can also prove this identity by factoring 1 − t m over the complex numbers.

Definition 4.2.

A complex number ω is called a primitive m-th root of unity if ω m = 1 and m is the smallest such positive integer.

The imaginary unit i is a primitive fourth root of unity. Given a primitive m-th root of unity ω, the powers ω j for \(j = 0,1,\ldots,m - 1\) are equally spaced on the unit circle S 1. These m points define a cyclic subgroup of S 1 of order m. Note that the inverse of ω is ω m − 1, which also equals \(\overline{\omega }\). Note also that \({S}^{1} = \mathcal{U}(1)\).

Lemma 4.3.

Let ω be a primitive m-th root of unity. Then

$$\displaystyle{ 1 -\prod _{j=0}^{m-1}(1 {-\omega }^{j}t) = {t}^{m}. }$$
(14)

Proof.

The expression on the left-hand side is a polynomial in t of degree m. It is invariant under the map tω t. The only invariant monomials of degree at most m are constants and constants times t m. Hence, this expression must be of the form α + β t m. Setting t = 0 shows that α = 0 and setting t = 1 shows that β = 1. □
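Identity (14) is easy to confirm numerically. The short Python sketch below (helper name `lhs` is ours) evaluates the left-hand side for several m and several values of t, real and complex.

```python
import cmath

def lhs(m, t):
    # 1 - prod_{j=0}^{m-1} (1 - omega^j t), omega a primitive m-th root of unity
    omega = cmath.exp(2j * cmath.pi / m)
    prod = 1
    for j in range(m):
        prod *= 1 - omega**j * t
    return 1 - prod

errors = []
for m in (2, 3, 5, 8):
    for t in (0.3, -1.7, 0.2 + 0.9j):
        errors.append(abs(lhs(m, t) - t**m))
max_err = max(errors)   # should vanish up to rounding
```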

This proof relies on the cyclic subgroup Γ m of the unit circle, or of \(\mathcal{U}(1)\), generated by ω. We will generalize this lemma and related ideas to higher dimensions, where things become more interesting.

We extend the notion of Hermitian symmetry (Definition 1.2) to higher dimensions in the natural way. A polynomial \(R(z,\overline{\zeta })\) on C n ×C n is Hermitian symmetric if \(R(z,\overline{\zeta }) = \overline{R(\zeta,\overline{z})}\). The higher-dimensional version of Lemma 1.3 holds; it is useful in the solution of Exercise 4.19.

Let Γ be a finite subgroup of \(\mathcal{U}(n)\). The analogue of the left-hand side of (14) is the following Hermitian polynomial:

$$\displaystyle{ \Phi _{\Gamma }(z,\overline{\zeta }) = 1 -\prod _{\gamma \in \Gamma }(1 -\langle \gamma z,\zeta \rangle ). }$$
(15)

One can show (we do not use the result, and hence, we omit the proof) that Φ Γ is uniquely determined by the following properties:

  1. (1)

    Φ Γ is Hermitian symmetric.

  2. (2)

    Φ Γ(0, 0) = 0.

  3. (3)

    Φ Γ is Γ-invariant.

  4. (4)

    \(\Phi _{\Gamma }(z,\overline{z})\) has degree in z at most the order of Γ.

  5. (5)

    \(\Phi _{\Gamma }(z,\overline{z}) = 1\) for z on the unit sphere.

In the special case when Γ is the group generated by a primitive m-th root of unity times the identity operator, (14) generalizes to the identity (16):

$$\displaystyle{ 1 -\prod _{j=0}^{m-1}\left (1 {-\omega }^{j}\sum _{ k=1}^{n}t_{ k}\right ) ={ \left (\sum _{k=1}^{n}t_{ k}\right )}^{m} =\sum _{ \vert \alpha \vert =m}{m\choose \alpha }{t}^{\alpha }. }$$
(16)

In this case the multinomial coefficients \({m\choose \alpha }\) make an appearance:

$$\displaystyle{{m\choose \alpha } ={ m! \over \alpha _{1}!\ldots \alpha _{n}!}.}$$

See Sects. 4 and 8 for more information about multi-index notation and the multinomial theorem, which is the far right equality in (16).
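Identity (16) can also be checked by machine. The Python sketch below (helper names ours) compares the product side against the multinomial sum, enumerating all multi-indices α with |α| = m.

```python
import cmath
import math
from itertools import product as iproduct

def multinomial(m, alpha):
    # m! / (alpha_1! ... alpha_n!)
    c = math.factorial(m)
    for a in alpha:
        c //= math.factorial(a)
    return c

def lhs(m, t):
    # 1 - prod_j (1 - omega^j * sum_k t_k)
    omega = cmath.exp(2j * cmath.pi / m)
    s = sum(t)
    prod = 1
    for j in range(m):
        prod *= 1 - omega**j * s
    return 1 - prod

def rhs(m, t):
    # sum over multi-indices alpha with |alpha| = m of (m choose alpha) t^alpha
    total = 0
    for alpha in iproduct(range(m + 1), repeat=len(t)):
        if sum(alpha) == m:
            term = multinomial(m, alpha)
            for tk, ak in zip(t, alpha):
                term *= tk**ak
            total += term
    return total

t = (0.2, -0.5, 0.3 + 0.1j)
err16 = abs(lhs(4, t) - rhs(4, t))
```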

Interesting generalizations of (16) result from more complicated representations of cyclic groups. The product in (17) gives one collection of nontrivial examples:

$$\displaystyle{ 1 -\prod _{j=0}^{m-1}\left (1 -\sum _{ k=1}^{n}{\omega }^{kj}t_{ k}\right ). }$$
(17)

The coefficients of the expansion are integers with many interesting properties.

Exercise 4.17.

Prove Lemma 4.3 by factoring 1 − t m.

Exercise 4.18.

Prove that \(\Phi _{\Gamma }(z,\overline{w})\) is Hermitian symmetric.

Exercise 4.19.

Let \(R(z,\overline{z}) =\sum _{\alpha,\beta }c_{\alpha,\beta }{z}^{\alpha }{\overline{z}}^{\beta }\) be a Hermitian symmetric polynomial. Prove that there are linearly independent polynomials A j (z) and B k (z) such that

$$\displaystyle{R(z,\overline{z}) =\sum _{j}\vert A_{j}(z){\vert }^{2} -\sum _{ k}\vert B_{k}(z){\vert }^{2} = \vert \vert A(z)\vert {\vert }^{2} -\vert \vert B(z)\vert {\vert }^{2}.}$$

Exercise 4.20.

Write \(\Phi _{\Gamma } = \vert \vert A\vert {\vert }^{2} -\vert \vert B\vert {\vert }^{2}\) as in the previous exercise. Show that we may choose A and B to be Γ-invariant.

In the rest of this section, we consider several cyclic subgroups of \(\mathcal{U}(2)\). Write (z, w) for a point in C 2. Let η be a primitive p-th root of unity. We next study the polynomial Φ Γ when Γ = Γ(p, q) is the cyclic subgroup of \(\mathcal{U}(2)\) of order p generated by the matrix

$$\displaystyle{\left (\begin{array}{cc} \eta &0\\ 0 &{ \eta }^{q } \end{array} \right ).}$$

Remark 4.4.

The quotient space S 3 ∕ Γ(p, q) is called a lens space. These spaces are important in topology.

The definition of Φ Γ(p, q) yields

$$\displaystyle{\Phi _{\Gamma (p,q)} = 1 -\prod _{j=0}^{p-1}(1 {-\eta }^{j}\vert z{\vert }^{2} {-\eta }^{qj}\vert w{\vert }^{2}).}$$

This expression depends only upon the expressions | z |2 and | w |2; we simplify notation by defining the polynomial f p, q (x, y) by

$$\displaystyle{ f_{p,q}(x,y) = 1 -\prod _{j=0}^{p-1}(1 {-\eta }^{j}x {-\eta }^{qj}y). }$$
(18)

The factor in the product with j = 0 is 1 − x − y, which vanishes on the line \(x + y = 1\). Hence the product vanishes there, and it follows that f p, q (x, y) = 1 on the line \(x + y = 1\).

Lemma 4.4.

\(f_{p,1}(x,y) = {(x + y)}^{p}\) .

Proof.

The result follows by replacing t by x + y in Lemma 4.3. □

The (binomial) coefficients of f p, 1 are integers which satisfy an astonishing number of identities and properties. More is true. For each q, the coefficients of f p, q are also integers, and they satisfy many interesting combinatorial and number-theoretic properties as well. We mention one of the properties now. Most people know the so-called freshman’s dream that \({(x + y)}^{p} \equiv {x}^{p} + {y}^{p}\) modulo p if and only if p is prime. The same result holds for each f p, q , although we omit the proof here.

The polynomials f p, 2 are more complicated than \(f_{p,1} = {(x + y)}^{p}\). When p is odd, all the coefficients of f p, 2 are nonnegative. Here are the first few f p, 2:

$$\displaystyle{f_{1,2}(x,y) = x + y}$$
$$\displaystyle{f_{2,2}(x,y) = {x}^{2} + 2y - {y}^{2}}$$
$$\displaystyle{f_{3,2}(x,y) = {x}^{3} + 3xy + {y}^{3}}$$
$$\displaystyle{f_{4,2}(x,y) = {x}^{4} + 4{x}^{2}y + 2{y}^{2} - {y}^{4}}$$
$$\displaystyle{ f_{5,2}(x,y) = {x}^{5} + 5{x}^{3}y + 5x{y}^{2} + {y}^{5}. }$$
(19)
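The closed forms listed above can be checked directly against definition (18). The Python sketch below (helper names ours) evaluates both sides at a sample point.

```python
import cmath

def f_pq(p, q, x, y):
    # definition (18), with eta a primitive p-th root of unity
    eta = cmath.exp(2j * cmath.pi / p)
    prod = 1
    for j in range(p):
        prod *= 1 - eta**j * x - eta**(q * j) * y
    return 1 - prod

def table(p, x, y):
    # the displayed closed forms for f_{p,2}, p = 1..5
    return {1: x + y,
            2: x**2 + 2*y - y**2,
            3: x**3 + 3*x*y + y**3,
            4: x**4 + 4*x**2*y + 2*y**2 - y**4,
            5: x**5 + 5*x**3*y + 5*x*y**2 + y**5}[p]

x, y = 0.7, -0.4
errs19 = [abs(f_pq(p, 2, x, y) - table(p, x, y)) for p in range(1, 6)]
```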

We can find all these polynomials by solving a single difference equation. We offer two proofs of the following explicit formula for f p, 2. The key idea in the first proof is to interchange the order in a double product. See [D5] and its references for general results about group-invariant polynomials, proved by similar methods.

Proposition 4.3.

For all nonnegative integers p, we have

$$\displaystyle{ f_{p,2}(x,y) = {({x + \sqrt{{x}^{2 } + 4y} \over 2} )}^{p} + {({x -\sqrt{{x}^{2 } + 4y} \over 2} )}^{p} - {(-y)}^{p}. }$$
(20)

Proof.

Set q = 2 in (18). Each factor in the product is a quadratic in η j, which we also factor. We obtain

$$\displaystyle{1 - f(x,y) =\prod _{ j=0}^{p-1}(1 {-\eta }^{j}x {-\eta }^{2j}y) =\prod _{ j=0}^{p-1}(1 - c_{ 1}{(x,y)\eta }^{j})(1 - c_{ 2}{(x,y)\eta }^{j})}$$
$$\displaystyle{=\prod _{ j=0}^{p-1}(1 - c_{ 1}{(x,y)\eta }^{j})\prod _{ j=0}^{p-1}(1 - c_{ 2}{(x,y)\eta }^{j}).}$$

Here c 1 and c 2 are the reciprocals of the roots of the quadratic \(1 - x\eta - {y\eta }^{2}\). Each of the two products is familiar from Lemma 4.3. Using that result we obtain

$$\displaystyle{1 - f(x,y) = (1 - c_{1}{(x,y)}^{p})(1 - c_{ 2}{(x,y)}^{p}).}$$

It follows that f has the following expression in terms of the c j :

$$\displaystyle{f(x,y) = c_{1}{(x,y)}^{p} + c_{ 2}{(x,y)}^{p} - {(c_{ 1}(x,y)c_{2}(x,y))}^{p}.}$$

The product c 1(x, y)c 2(x, y) equals − y. The sum c 1(x, y) + c 2(x, y) equals x. Solving this system for c 1 and c 2 using the quadratic formula determines the expressions arising in (20). □
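Formula (20) can be verified numerically against definition (18); the following Python sketch (helper names ours) does so for several p, using complex square roots so that x² + 4y may be negative.

```python
import cmath

def f_def(p, x, y):
    # definition (18) with q = 2
    eta = cmath.exp(2j * cmath.pi / p)
    prod = 1
    for j in range(p):
        prod *= 1 - eta**j * x - eta**(2 * j) * y
    return 1 - prod

def f_closed(p, x, y):
    # formula (20): c1^p + c2^p - (-y)^p
    s = cmath.sqrt(x * x + 4 * y)
    c1, c2 = (x + s) / 2, (x - s) / 2
    return c1**p + c2**p - (-y)**p

err20 = max(abs(f_def(p, 0.6, -0.8) - f_closed(p, 0.6, -0.8))
            for p in range(1, 9))
```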

We sketch a second proof based on recurrence relations.

Proof.

(Sketch). It follows by setting x = 0 in formula (18) that the term \(-{(-y)}^{p}\) appears in f p, 2. Let g p (x, y) denote the other terms. The recurrence relation \(g_{p+2}(x,y) = xg_{p+1}(x,y) + yg_{p}(x,y)\) also follows from (18). To solve this recurrence, we use the standard method. The characteristic equation is \({\lambda }^{2} - x\lambda - y = 0\). Its roots are \({x\pm \sqrt{{x}^{2 } +4y} \over 2}\). Using the initial conditions that g 1(x, y) = x and \(g_{2}(x,y) = {x}^{2} + 2y\), we determine g p (x, y). Adding in the term \(-{(-y)}^{p}\) yields (20). □
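The recurrence in the sketch produces the exact integer coefficients. Below is a Python sketch (our own representation: a polynomial in x, y is a dict mapping (i, j) to the coefficient of x^i y^j) that runs the recurrence and recovers f 3, 2 and f 5, 2 from the list (19).

```python
def padd(p, q):
    # add two polynomials, dropping zero coefficients
    r = dict(p)
    for k, v in q.items():
        r[k] = r.get(k, 0) + v
    return {k: v for k, v in r.items() if v != 0}

def shift(p, dx, dy):
    # multiply a polynomial by x^dx * y^dy
    return {(i + dx, j + dy): v for (i, j), v in p.items()}

g1 = {(1, 0): 1}               # g_1 = x
g2 = {(2, 0): 1, (0, 1): 2}    # g_2 = x^2 + 2y
gs = [None, g1, g2]
for p in range(3, 6):          # g_{p} = x g_{p-1} + y g_{p-2}
    gs.append(padd(shift(gs[-1], 1, 0), shift(gs[-2], 0, 1)))

def f_p2(p):
    # f_{p,2} = g_p - (-y)^p
    return padd(gs[p], {(0, p): -(-1)**p})

f32 = f_p2(3)   # expect x^3 + 3xy + y^3
f52 = f_p2(5)   # expect x^5 + 5x^3 y + 5x y^2 + y^5
```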

These polynomials are related to some classical mathematics.

Definition 4.3.

The n-th Chebyshev polynomial T n is defined by

$$\displaystyle{T_{n}(x) = \mathrm{cos}(n\ {\mathrm{cos}}^{-1}(x)).}$$

Although it is not instantly obvious, the n-th Chebyshev polynomial is a polynomial of degree n. Hence these polynomials are linearly independent.

Example 4.3.

The first few Chebyshev polynomials:

  • T 0(x) = 1.

  • T 1(x) = x.

  • \(T_{2}(x) = 2{x}^{2} - 1\).

  • \(T_{3}(x) = 4{x}^{3} - 3x\).

  • \(T_{4}(x) = 8{x}^{4} - 8{x}^{2} + 1\).

  • \(T_{5}(x) = 16{x}^{5} - 20{x}^{3} + 5x\).
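One standard way to see that T n is a polynomial is the three-term recurrence T n+1 (x) = 2xT n (x) − T n−1 (x), which follows from the cosine addition formula. The Python sketch below (helper name `T` is ours) uses this recurrence, assumed here without proof, and compares it with the defining formula cos(n cos⁻¹ x).

```python
import math

def T(n, x):
    # three-term recurrence T_{n+1} = 2x T_n - T_{n-1}, T_0 = 1, T_1 = x
    t0, t1 = 1.0, x
    if n == 0:
        return t0
    for _ in range(n - 1):
        t0, t1 = t1, 2 * x * t1 - t0
    return t1

xs = (-0.9, -0.3, 0.1, 0.77)
err_cheb = max(abs(T(n, x) - math.cos(n * math.acos(x)))
               for n in range(6) for x in xs)
```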

Exercise 4.21.

Verify that T n (x) is a polynomial. (See Exercise 4.23 for one approach.) Verify the formulas for T j (x) for j = 1, 2, 3, 4, 5.

Remark 4.5.

The polynomials T n (x) are eigenfunctions of a Sturm–Liouville problem. The differential equation, (SL) from Chap. 2, is \((1 - {x}^{2})y^{\prime \prime} - xy^{\prime} +\lambda y = 0\). The T n are orthogonal on the interval [ − 1, 1] with respect to the weight function \(w(x) ={ 1 \over \sqrt{1-{x}^{2}}}\). By Theorem 2.13, they form a complete orthogonal system for L 2([ − 1, 1], w).

Exercise 4.22.

Verify that T n is an eigenfunction as described in the remark; what is the corresponding eigenvalue λ?

Proposition 4.4.

The f p,2 have the following relationship to the Chebyshev polynomials T p (x):

$$\displaystyle{f_{p,2}(x,{ -1 \over 4} ) + {({1 \over 4})}^{p} = {2}^{1-p}\left (\mathrm{cos}(p\ {\mathrm{cos}}^{-1}(x))\right ) = {2}^{1-p}T_{ p}(x).}$$

Proof.

See Exercise 4.23. □
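Before attempting Exercise 4.23, one can at least confirm Proposition 4.4 numerically. The Python sketch below (helper names ours) evaluates f p, 2 (x, −1/4) via definition (18) and compares with 2^{1−p} T p (x).

```python
import cmath
import math

def f_p2(p, x, y):
    # definition (18) with q = 2
    eta = cmath.exp(2j * cmath.pi / p)
    prod = 1
    for j in range(p):
        prod *= 1 - eta**j * x - eta**(2 * j) * y
    return 1 - prod

def gap(p, x):
    lhs = f_p2(p, x, -0.25) + 0.25**p
    rhs = 2**(1 - p) * math.cos(p * math.acos(x))
    return abs(lhs - rhs)

err44 = max(gap(p, x) for p in range(1, 7) for x in (-0.8, 0.1, 0.6))
```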

Remark 4.6.

Evaluating the f p, 2 at other points also leads to interesting things. For example, let ϕ denote the golden ratio. Then

$$\displaystyle{f_{p,2}(1,1) = {({1 + \sqrt{5} \over 2} )}^{p} + {({1 -\sqrt{5} \over 2} )}^{p} + {(-1)}^{p+1} {=\phi }^{p} + {(1-\phi )}^{p} + {(-1)}^{p+1}.}$$

The first two terms give the p-th Lucas number, and hence, f p, 2(1, 1) differs from the p-th Lucas number by ± 1. The p-th Fibonacci number F p has a similar formula:

$$\displaystyle{F_{p} ={ 1 \over \sqrt{5}}\ \left ({({1 + \sqrt{5} \over 2} )}^{p} - {({1 -\sqrt{5} \over 2} )}^{p}\right ) ={ 1 \over \sqrt{5}}\ \left ({(\phi )}^{p} - {(1-\phi )}^{p}\right ).}$$
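The Lucas-number claim can be checked by machine. The Python sketch below (helper names ours) evaluates f p, 2 (1, 1) from definition (18) and compares against the Lucas recurrence L p+1 = L p + L p−1 with L 1 = 1, L 2 = 3; the difference should be (−1)^{p+1}.

```python
import cmath

def f_p2_at_11(p):
    # f_{p,2}(1,1) from definition (18) with q = 2
    eta = cmath.exp(2j * cmath.pi / p)
    prod = 1
    for j in range(p):
        prod *= 1 - eta**j - eta**(2 * j)
    return (1 - prod).real

# Lucas numbers, indexed so lucas[p] = L_p (with L_0 = 2)
lucas = [2, 1]
while len(lucas) < 12:
    lucas.append(lucas[-1] + lucas[-2])

diffs = [round(f_p2_at_11(p)) - lucas[p] for p in range(1, 11)]
```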

It is remarkable that our considerations of group-invariant mappings connect so closely with classical mathematics. The polynomials f p, 2 arise for additional reasons in several complex variables. When p is odd, all the coefficients of f p, 2 are nonnegative. Put x = | z |2 and y = | w |2 and write \(p = 2r + 1\). Then

$$\displaystyle{f_{2r+1,2}(\vert z{\vert }^{2},\vert w{\vert }^{2}) =\sum _{ b}c_{b}\vert z{\vert }^{2(2r+1-2b)}\vert w{\vert }^{2b} = \vert \vert g(z,w)\vert {\vert }^{2}.}$$

Since \(f_{2r+1,2}(x,y) = 1\) on \(x + y = 1\), we see that || g(z, w) ||2 = 1 on the unit sphere. Hence, g(z, w) maps the unit sphere S 3 to the unit sphere S 2N − 1, where \(N = r + 2\). Thus, g provides a far from obvious example of a group-invariant mapping between spheres.

The functions f p, 2 satisfy an extremal property. If a polynomial f of degree d in x and y has N terms, all with nonnegative coefficients, and f(x, y) = 1 on \(x + y = 1\), then the inequality d ≤ 2N − 3 holds and is sharp. We omit the proof of this difficult result. Equality holds for the f 2r + 1, 2.

Exercise 4.23.

Prove Proposition 4.4. Suggestion: First find a formula for cos− 1(s) using \(\mathrm{cos}(t) ={ {e}^{it}+{e}^{-it} \over 2} = s\) and solving a quadratic equation for e it.

Exercise 4.24.

Show that \(T_{nm}(x) = T_{n}(T_{m}(x))\).

Exercise 4.25.

Find a formula for the generating function \(\sum _{n=0}^{\infty }T_{n}(x){t}^{n}\). Do the same for \(\sum _{n=0}^{\infty }f_{n,2}(x,y){t}^{n}\).

The next exercise is intentionally a bit vague. See [D3] and the references there for considerably more information.

Exercise 4.26.

Use Mathematica or something similar to find f p, 3 and f p, 4 for 1 ≤ p ≤ 11. See what you can discover about these polynomials.

4 Proper Mappings

Consider the group-invariant polynomial (15) above when ζ = z. The factor 1 − ⟨γ z, z⟩ vanishes on the sphere when γ is the identity of the group. Hence \(\Phi _{\Gamma }(z,\overline{z}) = 1\) when z is on the sphere. By Exercises 4.19 and 4.20, we may write

$$\displaystyle{\Phi _{\Gamma }(z,\overline{z}) =\sum _{j}\vert A_{j}(z){\vert }^{2} -\sum _{ k}\vert B_{k}(z){\vert }^{2} = \vert \vert A(z)\vert {\vert }^{2} -\vert \vert B(z)\vert {\vert }^{2}}$$

where the polynomials A j and B k are invariant. If B = 0 (thus Φ Γ is a squared norm), then A is an invariant polynomial mapping between spheres. If B ≠ 0, then the target is a hyperquadric.

The group-invariant situation, where the target is a sphere, is completely understood and beautiful. It is too restrictive for our current aims. In this section we therefore consider polynomial mappings between spheres, without the assumption of group invariance.

In one dimension, the functions z ↦ z m have played an important part in our story. On the circle, of course, z m = e imθ. The function z ↦ z m is complex analytic and maps the unit circle S 1 to itself. One of many generalizations of these functions to higher dimensions results from considering complex analytic functions sending the unit sphere S 2n − 1 into some unit sphere, perhaps in a different dimension. We discuss these ideas here and relate them to the combinatorial considerations from the previous section.

Definition 4.4.

Let Ω and Ω ′ be open, connected subsets of complex Euclidean spaces. Suppose f: Ω → Ω ′ is continuous. Then f is called proper if whenever K ⊆ Ω ′ is compact, then f − 1(K) is compact in Ω.

Lemma 4.5.

A continuous map f: Ω → Ω′ between bounded domains is proper if and only if the following holds: whenever {z ν } is a sequence tending to the boundary bΩ, then {f(z ν )} tends to bΩ′.

Proof.

We prove both statements by proving their contrapositives. First let {z ν } tend to bΩ. If {f(z ν )} does not tend to bΩ ′, then it has a subsequence which stays in a compact subset K of Ω ′. But then f − 1(K) is not compact, and f is not proper. Thus properness implies the sequence property. Now suppose f is not proper. Find a compact set K such that f − 1(K) is not compact in Ω. Then there is a sequence {z ν } in f − 1(K) tending to bΩ, but the image sequence stays within the compact set K. □

Lemma 4.5 states informally that f is proper if whenever z is close to bΩ, then f(z) is close to bΩ ′. Hence, it has an ε-δ version which we state and use only when Ω and Ω ′ are open unit balls.

Corollary 4.3.

A continuous map f: B n → B N is proper if and only if for all ε > 0, there is a δ > 0 such that 1 −δ < ||z|| < 1 implies 1 −ε < ||f(z)|| < 1.

Our main interest is complex analytic mappings, especially such polynomial mappings, sending the unit sphere in C n to the unit sphere in some C N. Consider mappings that are complex analytic on the open ball and continuous on the closed ball. The maximum principle implies that if such a mapping sends the unit sphere in the domain to some unit sphere, then it must actually be a proper mapping from ball to ball. On the other hand, a (complex analytic) polynomial mapping between balls is also defined on the boundary sphere, and Lemma 4.5 implies that such mappings send the boundary to the boundary. It would thus be possible never to mention the term proper map, and we could still do everything we are going to do. We continue to work with proper mappings because of the intuition they provide.

Remark 4.7.

Proper complex analytic mappings must be finite-to-one, although not all points in the image must have the same number of inverse images. By definition of proper, the inverse image of a point must be a compact set. Because of complex analyticity, the inverse image of a point must also be a complex variety. Together these facts show that no point in the target can have more than a finite number of inverse images.

Exercise 4.27.

Which of the following maps are proper from R 2 to R?

  1. (1)

    \(f(x,y) = {x}^{2} + {y}^{2}\).

  2. (2)

    \(g(x,y) = {x}^{2} - {y}^{2}\).

  3. (3)

    h(x, y) = x.

Exercise 4.28.

Under what circumstances is a linear map L: C n → C N proper?

Our primary concern will be complex analytic proper mappings between balls. We start with the unit disk B 1 contained in C. Let us recall a simple version of the maximum principle. Suppose f is complex analytic in the open unit disk B 1 and | f(z) | ≤ M on the boundary of a compact set K ⊆ B 1. Then the same estimate holds in the interior of K.

Proposition 4.5.

Suppose f: B 1 → B 1 is complex analytic and proper. Then f is a finite Blaschke product: there are points a 1 ,…,a d in the unit disk, possibly repeated, and a point e iθ on the circle, such that

$$\displaystyle{f(z) = {e}^{i\theta }\prod _{ j=1}^{d}{ a_{j} - z \over 1 -\overline{a_{j}}z}.}$$

If also either \({f}^{-1}(0) = 0\) or f is a polynomial, then f(z) = e iθ z m for some positive integer m.

Proof.

Because f is proper, the set f − 1(0) is compact. We first show that it is not empty. If it were empty, then both f and \({ 1 \over f}\) would be complex analytic on the unit disk, and the values of \({ 1 \over \vert f(z)\vert }\) would tend to 1 as z tends to the circle. The maximum principle would then force \(\vert { 1 \over f(z)}\vert \leq 1\) on the disk, which contradicts | f(z) | < 1 there.

Thus, the compact set f − 1(0) is not empty. Because f is complex analytic, this set must be discrete. Therefore, it is finite, say a 1 ,…,a d (with multiplicity allowed). Let B(z) denote the product \(\prod { a_{j}-z \over 1-\overline{a_{j}}z}\). We show that \(z\mapsto { f(z) \over B(z)}\) is a constant map of modulus one. Then f = e iθ B.

By Corollary 4.3, applied to both f and B, for each ε > 0 we can find a δ > 0 such that 1 − ε < | f(z) | ≤ 1 and 1 − ε < | B(z) | ≤ 1 for | z | > 1 − δ. It follows by the maximum principle that these estimates hold for all z with | z | ≤ 1 − δ as well. The function \(g ={ f \over B}\) is complex analytic in the disk, as the zeros of B and of f correspond and thus cancel in g. By the maximum principle applied to g, we have for all z that \(1-\epsilon < \vert g(z)\vert <{ 1 \over 1-\epsilon }\). Since ε is arbitrary, we may let ε tend to 0 and conclude that | g(z) | = 1. It follows (by either Theorem 4.3 or the maximum principle) that g is a constant e iθ of modulus one. Thus f(z) = e iθ B(z). □
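The two key properties of a finite Blaschke product, modulus one on the circle and modulus less than one inside, are easy to test numerically. Here is a Python sketch (helper name `blaschke` is ours) with a sample choice of zeros.

```python
import cmath
import random

def blaschke(zeros, theta, z):
    # e^{i theta} * prod_j (a_j - z) / (1 - conj(a_j) z)
    w = cmath.exp(1j * theta)
    for a in zeros:
        w *= (a - z) / (1 - a.conjugate() * z)
    return w

zeros = [0.5 + 0j, -0.3 + 0.4j, 0.1 - 0.6j]   # points in the open unit disk
theta = 0.7
# |B| = 1 on the unit circle ...
err_bdry = max(abs(abs(blaschke(zeros, theta, cmath.exp(1j * t))) - 1)
               for t in (0.0, 1.1, 2.9, 4.4))
# ... and |B| < 1 at interior points, consistent with properness
random.seed(2)
inside = all(abs(blaschke(zeros, theta,
                          complex(random.uniform(-0.7, 0.7),
                                  random.uniform(-0.7, 0.7)))) < 1
             for _ in range(50))
```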

Exercise 4.29.

Suppose f: B 1 → B 1 is complex analytic and proper. Find another proof that there is a z with f(z) = 0. One possible proof composes f with an automorphism of the disk, preserving properness while creating a zero.

Consider next the proper complex analytic self-mappings of the unit ball B n in C n for n ≥ 2. We do not prove the following well-known result in several complex variables: the only proper complex analytic maps from the unit ball B n to itself (when n ≥ 2) are automorphisms. These mappings are analogues of the individual factors in Proposition 4.5. They have the form

$$\displaystyle{f(z) = U\ {a - L_{a}(z) \over 1 -\langle z,a\rangle }.}$$

Here U is unitary, and L a is a linear transformation depending on a, for a an arbitrary point in B n . These rational maps were mentioned in Sect. 3; see the discussion near Exercises 4.15 and 4.16. The only polynomial proper self-mappings of a ball are the unitary mappings f(z) = Uz. In order to obtain analogues of z ↦ z d, we must increase the target dimension.

The analogue of z ↦ z d in one dimension will be the tensor product z ↦ z ⊗d. We will make things concrete, but completely rigorous, by first identifying C M ⊗ C N with C MN. The reader may simply regard the symbol ⊗ as notation.

Definition 4.5.

Let f = (f 1, , f M ) and g = (g 1, , g N ) be mappings taking values in C M and C N. Their tensor product fg is the mapping taking values in C MN defined by \((f_{1}g_{1},\ldots,f_{j}g_{k},\ldots,f_{M}g_{N})\).

In Definition 4.5 we did not precisely indicate the order in which the terms f j g k are listed. The reason is that we do not care; nearly everything we do in this section does not distinguish between h and Lh when || Lh || = || h ||. The following formula suggests why the tensor product is relevant to proper mappings between balls:

$$\displaystyle{ \vert \vert f \otimes g\vert {\vert }^{2} = \vert \vert f\vert {\vert }^{2}\vert \vert g\vert {\vert }^{2}. }$$
(21)

To verify (21), simply note that

$$\displaystyle{\vert \vert f\vert {\vert }^{2}\ \vert \vert g\vert {\vert }^{2} =\sum _{ j}\vert f_{j}{\vert }^{2}\ \sum _{ k}\vert g_{k}{\vert }^{2} =\sum _{ j,k}\vert f_{j}g_{k}{\vert }^{2}.}$$
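Identity (21) admits a one-line numerical check. The Python sketch below (helper names ours) tests it for random complex vectors.

```python
import random

def tensor(f, g):
    # all products f_j g_k, in some fixed order
    return [a * b for a in f for b in g]

def sq_norm(v):
    return sum(abs(c)**2 for c in v)

random.seed(3)
f = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(3)]
g = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(4)]
err21 = abs(sq_norm(tensor(f, g)) - sq_norm(f) * sq_norm(g))
```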

Let m be a positive integer. We write z ⊗m for the tensor product of the identity map with itself m times. We show momentarily that \(\vert \vert {z}^{\otimes m}\vert {\vert }^{2} = \vert \vert z\vert {\vert }^{2m}\); in particular the polynomial map z ↦ z ⊗m takes the unit sphere in its domain to the unit sphere in its target. It exhibits many of the properties satisfied by the mapping z ↦ z m in one dimension. The main difference is that the target dimension is much larger than the domain dimension when n ≥ 2 and m ≠ 1.

In much of what we do, the mapping zf(z) is less important than the real-valued function z ↦ || f(z) ||2. It is therefore sometimes worthwhile to introduce the concept of norm equivalence. Consider two maps f, g with the same domain, but with possibly different dimensional complex Euclidean spaces as targets. We say that f and g are norm-equivalent if the functions || f ||2 and || g ||2 are identical.

We are particularly interested in the norm equivalence class of the mapping z ↦ z ⊗m. One member of this equivalence class is the monomial mapping described in (22), and henceforth, we define z ⊗m by the formula in (22). The target dimension is \({n + m - 1\choose m}\), and the components are the monomials of degree m in n variables. Thus we put

$$\displaystyle{ H_{m}(z) = {z}^{\otimes m} = (\ldots,c_{\alpha }{z}^{\alpha },\ldots ). }$$
(22)

In (22), z α is multi-index notation for \(\prod _{j=1}^{n}{(z_{j})}^{\alpha _{j}}\); each \(\alpha = (\alpha _{1},\ldots,\alpha _{n})\) is an n-tuple of nonnegative integers which sum to m, and all such α appear. There are \({n + m - 1\choose m}\) such multi-indices; see Exercise 4.30. For each α, c α is the positive square root of the multinomial coefficient \({m\choose \alpha }\). We write | z |2α as an abbreviation for the product

$$\displaystyle{\prod _{j}\vert z_{j}{\vert }^{2\alpha _{j} }.}$$

See Sect. 8 for more information about multi-index notation and for additional properties of this mapping.

By the multinomial expansion we see that

$$\displaystyle{\vert \vert {z}^{\otimes m}\vert {\vert }^{2} =\sum _{\alpha }\vert c_{\alpha }{\vert }^{2}\vert z{\vert }^{2\alpha } =\sum _{\alpha }{m\choose \alpha }\vert z{\vert }^{2\alpha } = {(\sum _{ j}\vert z_{j}{\vert }^{2})}^{m} = \vert \vert z\vert {\vert }^{2m}.}$$

The crucial formula \(\vert \vert {z}^{\otimes m}\vert {\vert }^{2} = \vert \vert z\vert {\vert }^{2m}\) explains why c α was defined as above. Furthermore, by Theorem 4.4 below, \({n + m - 1\choose m}\) is the smallest possible dimension k for which there is a polynomial mapping f: C nC k such that || f(z) ||2 = || z ||2m. In other words, if f is norm-equivalent to z m, then the target dimension must be at least \({n + m - 1\choose m}\).

Example 4.4.

Put n = 2 and m = 3. We identify the map z ⊗3 with the map H 3 defined by

$$\displaystyle{(z_{1},z_{2}) \rightarrow H_{3}(z_{1},z_{2}) = (z_{1}^{3},\sqrt{3}z_{ 1}^{2}z_{ 2},\sqrt{3}z_{1}z_{2}^{2},z_{ 2}^{3}).}$$

Note that \(\vert \vert H_{3}(z_{1},z_{2})\vert {\vert }^{2} = {(\vert z_{1}{\vert }^{2} + \vert z_{2}{\vert }^{2})}^{3}\).
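Example 4.4 can be confirmed at a sample point. The Python sketch below (helper names ours) checks the norm identity for H 3.

```python
def H3(z1, z2):
    # the homogenized cubic map from Example 4.4
    r3 = 3 ** 0.5
    return [z1**3, r3 * z1**2 * z2, r3 * z1 * z2**2, z2**3]

def sq_norm(v):
    return sum(abs(c)**2 for c in v)

z1, z2 = 0.3 + 0.5j, -0.7 + 0.2j
err_h3 = abs(sq_norm(H3(z1, z2)) - (abs(z1)**2 + abs(z2)**2)**3)
```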

Definition 4.6.

Let p: C n → C N be a polynomial mapping. Then p is called homogeneous of degree m if, for all t ∈ C, p(tz) = t m p(z).

Homogeneity is useful for many reasons. For example, a homogeneous polynomial is determined by its values on the unit sphere. Unless the degree of homogeneity is zero, in which case p is a constant, we have p(0) = 0. For z ≠ 0, we have

$$\displaystyle{p(z) = p(\vert \vert z\vert \vert \ { z \over \vert \vert z\vert \vert }) = \vert \vert z\vert {\vert }^{m}p({ z \over \vert \vert z\vert \vert }).}$$

This simple fact leads to the next lemma, which we use in proving Theorem 4.6.

Lemma 4.6.

Let p j and p k denote homogeneous polynomial mappings, of the indicated degrees, from C n to C N . Assume that ⟨p j (z),p k (z)⟩ = 0 for all z on the unit sphere. Then this inner product vanishes for all z ∈ C n .

Proof.

The statement is trivial if \(j = k = 0\), as p 0 is a constant. Otherwise the inner product vanishes at z = 0. For z ≠ 0, put \(w ={ z \over \vert \vert z\vert \vert }\). Homogeneity yields

$$\displaystyle{\langle p_{j}(z),p_{k}(z)\rangle = \vert \vert z\vert {\vert }^{j+k}\langle p_{ j}(w),p_{k}(w)\rangle,}$$

which vanishes by our assumption, because w is on the sphere. □

Exercise 4.30.

Show that the dimension of the vector space of homogeneous (complex-valued) polynomials of degree m in n variables equals \({n + m - 1\choose m}\).
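For small n and m, the count in Exercise 4.30 can be verified by brute-force enumeration. The Python sketch below (helper name ours) counts multi-indices of total degree m in n variables and compares with the binomial coefficient.

```python
import math
from itertools import product

def num_monomials(n, m):
    # count multi-indices alpha in Z_{>=0}^n with |alpha| = m
    return sum(1 for alpha in product(range(m + 1), repeat=n)
               if sum(alpha) == m)

counts_ok = all(num_monomials(n, m) == math.comb(n + m - 1, m)
                for n in range(1, 5) for m in range(0, 6))
```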

Exercise 4.31.

Give an example of a polynomial \(r(z,\overline{z})\) that vanishes on the sphere, also vanishes at 0, but does not vanish everywhere.

Recall formula (22) defining the mapping z ⊗m. In particular, \({z}^{\otimes m}:{ \mathbf{C}}^{n} \rightarrow {\mathbf{C}}^{N}\), where N is the binomial coefficient \(N ={ n + m - 1\choose m}\), the number of linearly independent monomials of degree m in n variables. This integer is the minimum possible target dimension for any map f for which || f(z) ||2 = || z ||2m.

Theorem 4.4.

Let \(h_{m}:{ \mathbf{C}}^{n} \rightarrow {\mathbf{C}}^{N}\) be a homogeneous polynomial mapping of degree m which maps S 2n−1 to S 2N−1 . Then z ⊗m and h m are norm-equivalent. Assume in addition that the components of h m are linearly independent. Then \(N ={ n + m - 1\choose m}\) , and there is a unitary transformation U such that

$$\displaystyle{h_{m}(z) = U{z}^{\otimes m}.}$$

Proof.

By linear independence of the components of h m , the target dimension N of h m is at most \({n + m - 1\choose m}\). We claim that \(N ={ n + m - 1\choose m}\). We are given that \(\vert \vert h_{m}(z)\vert \vert = \vert \vert z\vert \vert = 1\) on the sphere. Hence \(\vert \vert h_{m}(z)\vert {\vert }^{2} = \vert \vert z\vert {\vert }^{2m} = \vert \vert {z}^{\otimes m}\vert {\vert }^{2}\) on the sphere as well. By homogeneity, this equality holds everywhere, and the maps are norm-equivalent. Theorem 4.3 then implies the existence of an isometry V such that z ⊗m = Vh m (z). Since z ⊗m includes all the monomials of degree m, so does h m . Hence the dimensions are equal, and V is unitary. Put \(U = {V }^{-1}\). □

A variant of the tensor product operation allows us to construct more examples of polynomial mappings between spheres. By also allowing an inverse operation, we will find all polynomial mappings between spheres.

Let A be a subspace of C N, and let π A be orthogonal projection onto A. Then we have \(\vert \vert f\vert {\vert }^{2} = \vert \vert \pi _{A}f\vert {\vert }^{2} + \vert \vert (1 -\pi _{A})f\vert {\vert }^{2}\) by the Pythagorean theorem. Combining this fact with (21) leads to the following:

Proposition 4.6.

Suppose f: C n → C M and g: C n → C N satisfy \(\vert \vert f\vert {\vert }^{2} = \vert \vert g\vert {\vert }^{2} = 1\) on some set S. Then, for any subspace A of C M , the map \(E_{A,g}f = (1 -\pi _{A})f \oplus (\pi _{A}f \otimes g)\) satisfies ||E A,g f|| 2 = 1 on S.

Proof.

By definition of orthogonal sum and (21), we have

$$\displaystyle{ \vert \vert E_{A,g}f\vert {\vert }^{2} = \vert \vert (1 -\pi _{ A})f \oplus (\pi _{A}f \otimes g)\vert {\vert }^{2} = \vert \vert (1 -\pi _{ A})f\vert {\vert }^{2} + \vert \vert \pi _{ A}f\vert {\vert }^{2}\vert \vert g\vert {\vert }^{2}. }$$
(23)

Since \(\vert\vert g\vert\vert^{2} = 1\) on S, formula (23) becomes \(\vert \vert (1 -\pi _{A})f\vert {\vert }^{2} + \vert \vert \pi _{A}f\vert {\vert }^{2} = \vert \vert f\vert {\vert }^{2} = 1\) on S. □

When g(z) = z, we can write the computation in (23) as follows:

$$\displaystyle{\vert \vert E_{A}(f)\vert {\vert }^{2} = \vert \vert f\vert {\vert }^{2} + (\vert \vert z\vert {\vert }^{2} - 1)\vert \vert \pi _{ A}(f)\vert {\vert }^{2}.}$$

This tensor operation evokes our discussion of spherical harmonics, where we multiplied polynomials by the squared norm in R n. The operation E A is more subtle for two reasons: first, our map f is vector valued; second, we perform the multiplication (now a tensor product) only on a proper subspace A of the target.
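The operation of Proposition 4.6 can be checked numerically. The following sketch (our own example, with g(z) = z in one variable) takes the sphere map \(f(z) = (1/\sqrt{2}, z/\sqrt{2})\) from the circle to S 3, tensors its first component with z, and confirms that the result still has norm 1 on the circle.

```python
import numpy as np

# Sketch of Proposition 4.6 with g(z) = z.  The map f below satisfies
# ||f(z)|| = 1 when |z| = 1.  A is the span of the first basis vector.
def f(z):
    return np.array([1 / np.sqrt(2), z / np.sqrt(2)])

def E_A(fz, z, A_mask):
    """Keep the components outside A; tensor the components in A with z."""
    kept = fz[~A_mask]                  # (1 - pi_A) f
    tensored = fz[A_mask] * z           # pi_A f (x) z  (n = 1, so z is a scalar)
    return np.concatenate([kept, tensored])

z = np.exp(1j * 0.7)                    # a point on the unit circle
A_mask = np.array([True, False])
g = E_A(f(z), z, A_mask)
print(np.linalg.norm(g))                # ~1.0, as the proposition predicts
```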

We now begin studying nonconstant (complex analytic) polynomial mappings taking S 2n − 1 to S 2N − 1. By Proposition 4.5, when \(n = N = 1\), the only possibilities are \(z\mapsto {e}^{i\theta }{z}^{m}\). When n = N ≥ 2, the only nonconstant examples are unitary maps. When N < n, the only polynomial maps are constants. The proofs of these facts use several standard ideas from the theory of analytic functions of several complex variables; we omit them to maintain our focus and because we do not use them to prove any of our results. We summarize these facts without proof, and we include a simple consequence of Proposition 4.5 in this collection of statements about polynomial mappings between spheres.

Theorem 4.5.

Assume that \(p:{ \mathbf{C}}^{n} \rightarrow {\mathbf{C}}^{N}\) is a polynomial mapping with \(p({S}^{2n-1}) \subseteq {S}^{2N-1}\) . If \(N = n = 1\) , then \(p(z) = {e}^{i\theta }{z}^{m}\) for some m. If N < n, then p is a constant. If n ≤ N ≤ 2n − 2, then p is either a constant or an isometry.

When N is much larger than n, there are many maps. We can understand them via a process of orthogonal homogenization.

Let \(p:{ \mathbf{C}}^{n} \rightarrow {\mathbf{C}}^{N}\) be a polynomial mapping. Let \(\vert\vert \cdot \vert\vert\) denote the Euclidean norm in either the domain or the target. We expand p in terms of homogeneous parts. Thus \(p =\sum _{ k=0}^{d}p_{k}\), where each \(p_{k}:{ \mathbf{C}}^{n} \rightarrow {\mathbf{C}}^{N}\) is homogeneous of degree k. That is, \(p_{k}(tz) = {t}^{k}p_{k}(z)\) for all \(t \in \mathbf{C}\). Suppose in addition that \(p: {S}^{2n-1} \rightarrow {S}^{2N-1}\). Then, if || z ||2 = 1, we have

$$\displaystyle{ \vert \vert p(z)\vert {\vert }^{2} = \vert \vert \sum p_{ k}(z)\vert {\vert }^{2} =\sum _{ k,j}\langle p_{k}(z),p_{j}(z)\rangle = 1. }$$
(24)

Replacing z by e z and using the homogeneity yields

$$\displaystyle{ 1 =\sum _{k,j}{e}^{i\theta (k-j)}\langle p_{ k}(z),p_{j}(z)\rangle. }$$
(25)

But the right-hand side of (25) is a trig polynomial; hence, all its coefficients vanish except for the constant term. We conclude that p must satisfy certain identities when || z || = 1:

$$\displaystyle{ \sum \vert \vert p_{k}\vert {\vert }^{2} = 1, }$$
(26)
$$\displaystyle{ \sum _{k}\langle p_{k},p_{k+l}\rangle = 0\ \ \ (l\neq 0). }$$
(27)
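The identities (26) and (27) can be seen concretely in the simplest non-homogeneous example (ours, not from the text): the map \(p(z) = (c, sz)\) with \(c^{2} + s^{2} = 1\) sends the circle to S 3, and its homogeneous parts p 0 and p 1 satisfy both identities.

```python
import numpy as np

# Sketch of (26)-(27) for p(z) = (c, s*z), c^2 + s^2 = 1, on |z| = 1.
# Homogeneous parts: p0 = (c, 0) (degree 0), p1 = (0, s*z) (degree 1).
c, s = 0.6, 0.8
z = np.exp(1j * 1.3)                    # a point on the unit circle
p0 = np.array([c, 0.0])
p1 = np.array([0.0, s * z])

sum_norms = np.vdot(p0, p0).real + np.vdot(p1, p1).real   # identity (26)
cross = np.vdot(p1, p0)                                   # identity (27), l = 1
print(sum_norms, abs(cross))            # 1.0 and 0.0
```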

Let d be the degree of p. When l = d in (27), the only term in the sum is when k = 0, and we conclude that p 0 and p d are orthogonal. Let π A denote the projection of C N onto the span A of p 0. We can write

$$\displaystyle{ p = (1 -\pi _{A})p \oplus \pi _{A}p. }$$
(28)

Consider a new map g, defined by

$$\displaystyle{g = E_{A}(p) = (1 -\pi _{A})p \oplus (\pi _{A}p \otimes z).}$$

By Proposition 4.6, E A (p) also takes the sphere to the sphere in a larger target dimension. The map g = E A (p) has no constant term and is of degree d. Thus g 0 = 0. Now we apply (27) to g, obtaining the following conclusion. Either g is homogeneous of degree 1 or its first-order part g 1 is orthogonal to its highest order part g d . We apply the same reasoning to g, letting π B denote the orthogonal projection onto the span of the homogeneous part g 1. We obtain a map \(E_{B}(E_{A}(p))\), still of degree d, whose homogeneous expansion now has no terms of order 0 or 1.

Proceeding in this fashion, we increase the order of vanishing without increasing the degree, stopping when the result is homogeneous. Thus we obtain a sequence of subspaces \(A_{0},\ldots,A_{d-1}\) such that composing these tensor product operations yields something homogeneous of degree d. As the last step, we compose with a linear map to guarantee that the components are linearly independent. Applying Theorem 4.3, we obtain the following result about orthogonal homogenization.

Theorem 4.6.

Let p be a polynomial mapping of degree d such that \(p({S}^{2n-1}) \subseteq {S}^{2N-1}\) . Then there is a linear map L and a finite sequence of subspaces and tensor product operations such that

$$\displaystyle{ {z}^{\otimes d} = L(E_{ A_{d-1}}(\ldots (E_{A_{0}}(p))\ldots )). }$$
(29)

Here L = qU, where U is unitary and q is a projection.

Proof.

We repeat the previous discussion in more concise language. If p is homogeneous, then the conclusion follows from Theorem 4.3. Otherwise, let ν denote the order of vanishing of p. Thus ν < d and \(p =\sum _{ j=\nu }^{d}p_{j}\), where p j is homogeneous of degree j. By (27), p ν is orthogonal to p d on the sphere. By Lemma 4.6, they are orthogonal everywhere. Let A denote the span of the coefficient vectors in p ν . By Proposition 4.6, the polynomial mapping E A (p) sends the unit sphere in its domain C n to the unit sphere in its target. This mapping is also of degree d, but its order of vanishing exceeds ν. After finitely many steps of this sort, we reach a homogeneous mapping of degree d. We then apply Theorem 4.3. □

In the next section we will use this result to prove a geometric inequality concerning the maximum volume (with multiplicity counted) of the image of the ball under a proper polynomial map, given its degree.

Next we illustrate Theorem 4.6 by way of a polynomial mapping from S 3 to S 7.

Example 4.5.

Put z = (w, ζ) and \(p(w,\zeta ) = ({w}^{3},{w}^{2}\zeta,w\zeta,\zeta )\). Then A 0 = 0. Also A 1 is the span of (0, 0, 0, 1), and \(E_{A_{1}}(p) = ({w}^{3},{w}^{2}\zeta,w\zeta,w\zeta {,\zeta }^{2})\). Now A 2 is the span of the three standard basis vectors e 3, e 4, and e 5 in C 5. Tensoring on the subspace A 2 yields

$$\displaystyle{f = E_{A_{2}}(E_{A_{1}}(p)) = ({w}^{3},{w}^{2}\zeta,{w}^{2}\zeta,{w\zeta }^{2},{w}^{2}\zeta,{w\zeta }^{2},{w\zeta }^{2}{,\zeta }^{3}).}$$

The image of f is contained in a 4-dimensional subspace of C 8. We can apply a unitary map U to f to get

$$\displaystyle{Uf = ({w}^{3},\sqrt{3}{w}^{2}\zeta,\sqrt{3}{w\zeta }^{2}{,\zeta }^{3},0,0,0,0).}$$

Finally we project onto C 4 and identify the result with the map z ⊗ 3. In the notation (29), L = qU is the composition of the unitary map U and the projection q.
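Each stage of Example 4.5 can be verified numerically. The sketch below (our notation) evaluates the original map p, the twice-tensored map f, and the final map Uf at a random point of the unit sphere S 3 in C 2 and confirms that each lands on a unit sphere.

```python
import numpy as np

# Numerical check of Example 4.5: each stage of the orthogonal
# homogenization maps the unit sphere S^3 in C^2 into a unit sphere.
rng = np.random.default_rng(1)
v = rng.normal(size=2) + 1j * rng.normal(size=2)
w, zeta = v / np.linalg.norm(v)        # (w, zeta) on the unit sphere

p  = np.array([w**3, w**2*zeta, w*zeta, zeta])
f  = np.array([w**3, w**2*zeta, w**2*zeta, w*zeta**2,
               w**2*zeta, w*zeta**2, w*zeta**2, zeta**3])
Uf = np.array([w**3, np.sqrt(3)*w**2*zeta, np.sqrt(3)*w*zeta**2, zeta**3])

for m in (p, f, Uf):
    print(np.linalg.norm(m))           # each ~1.0
```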

5 Vector Fields and Differential Forms

Our second proof of Corollary 4.2 used the differential forms dz and \(d\overline{z}\) in one dimension. In order to extend the result to higher dimensions, we must discuss complex vector fields and complex differential forms. We begin by reviewing the real case. See [Dar] for an alternative treatment of the basics of differential forms and interesting applications.

As a first step, we clarify one of the most subtle points in elementary calculus. What do we mean by dx in the first place? High school teachers often say that dx means an infinitesimal change in the x direction, but these words are too vague to have any meaning. We proceed in the standard manner.

A vector field on R n is simply a function V: R nR n. We think geometrically of placing the vector V (x) at the point x. We make a conceptual leap by regarding the two copies of R n as different spaces. (Doing so is analogous to regarding the x and y axes as different copies of the real line.) For \(j = 1,\ldots,n\), we let e j denote the j-th standard basis element of the first copy of R n. We write \({ \partial \over \partial x_{j}}\) for the indicated partial differential operator; \({ \partial \over \partial x_{j}}\) will be the j-th standard basis vector of the second copy of R n.

Thus, at each point \(x = (x_{1},\ldots,x_{n})\) of R n, we consider a real vector space T x (R n) called the tangent space at x. The vector space T x (R n) is also n-dimensional. Here is the precise definition of \({ \partial \over \partial x_{j}}\):

$$\displaystyle{{ \partial \over \partial x_{j}}(f)(x) ={ \partial f \over \partial x_{j}}(x) =\lim _{t\rightarrow 0}{f(x + te_{j}) - f(x) \over t}. }$$
(30)

The \({ \partial \over \partial x_{j}}\), for \(j = 1,\ldots,n\), form a basis for T x (R n). Thus an element of T x (R n) is a vector of the form \(\sum _{j=1}^{n}c_{j}{ \partial \over \partial x_{j}}\).

Partial derivatives are special cases of directional derivatives. We could therefore avoid (30) and instead start with (31), the definition of the directional derivative of f in the direction \(v = (v_{1},\ldots,v_{n})\):

$$\displaystyle{{ \partial f \over \partial v} (x) =\lim _{t\rightarrow 0}{f(x + tv) - f(x) \over t} =\sum _{ j=1}^{n}v_{ j}{ \partial f \over \partial x_{j}}(x) = V [f](x). }$$
(31)

In this definition (31) of directional derivative, we do not assume that v is a unit vector. Given a vector field V, we write \(V =\sum v_{j}{ \partial \over \partial x_{j}}\). Then V can be applied to a differentiable function f, and V [f] means the directional derivative of f in the direction v, as suggested by the notation. Thus, T x (R n) is the set of directions in which we can take a directional derivative at x.
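A small numerical sketch (our example) makes (31) concrete: for \(f(x,y) = {x}^{2}y\), direction v = (2, 1), and base point (1, 3), the formula gives \(2 \cdot (2xy) + 1 \cdot {x}^{2} = 13\), and the difference quotient approaches this value.

```python
import numpy as np

# Sketch of (31): directional derivative of f(x,y) = x^2*y in the
# direction v = (2,1) at (1,3) is 2*(2xy) + 1*(x^2) = 13.
def f(p):
    x, y = p
    return x**2 * y

x0 = np.array([1.0, 3.0])
v = np.array([2.0, 1.0])
t = 1e-6
quotient = (f(x0 + t * v) - f(x0)) / t  # the limit definition, small t
print(quotient)                         # ~13
```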

Remark 4.8.

The viewpoint expressed by the previous sentence is useful when we replace R n by a smooth submanifold M. The tangent space T x (M) is then precisely the set of such directions. See [Dar].

Remark 4.9.

The expression \({ \partial \over \partial x_{j}}\) is defined such that \({ \partial \over \partial x_{j}}(f)\) equals the directional derivative of f in the j-th coordinate direction. Warning! The expression \({ \partial \over \partial x_{j}}\) depends on the full choice of basis. We cannot define \({ \partial \over \partial x_{1}}\) until we have chosen all n coordinate directions. See Exercise 4.33.

The beauty of these ideas becomes apparent when we allow the base point x to vary. A vector field becomes a function whose value at each x is an element of T x (R n). Thus a vector field is a function

$$\displaystyle{x\mapsto V (x) =\sum _{ j=1}^{n}v_{ j}(x){ \partial \over \partial x_{j}}.}$$

A vector field is called smooth if each v j is a smooth function.

We pause to restate the definition of vector field in modern language. Let T(R n), called the tangent bundle, denote the disjoint union over x of all the spaces T x (R n). (To be precise, the definition of T(R n) includes additional information, but we can safely ignore this point here.) A point in T(R n) is a pair (x, v x ), where x is the base point and v x is a vector at x. A vector field is a map V: R nT(R n) such that V (x) ∈ T x (R n) for all x. In other words, V (x) = (x, v x ). In modern language, a vector field is a section of the tangent bundle T(R n). At each x, we regard V (x) as a direction in which we can differentiate functions defined near x.

Now what is a differential 1-form? We begin by defining df for a smooth function f. Here smooth means infinitely differentiable.

Let f: R nR be a smooth function. Let V be a vector field; v = V (x) is a vector based at x; thus V (x) ∈ T x (R n). We define df as follows:

$$\displaystyle{ df(x)[v] = \left (df(x),v\right ) ={ \partial f \over \partial v} (x) =\lim _{t\rightarrow 0}{f(x + tv) - f(x) \over t}. }$$
(32)

The formula on the far right-hand side of (32) is the definition. The other expressions are different notations for the same quantity. In the first formula, df(x) is a function, seeking a vector v as the input and producing a real number as the output. In the second formula, df(x) and v appear on equal footing. The third formula means the rate of change of f in the direction v at x. In coordinates, we have \(V (x) =\sum v_{j}{ \partial \over \partial x_{j}}\), where \(v = (v_{1},\ldots,v_{n})\) and

$$\displaystyle{ df(x)[v] =\sum _{ j=1}^{n}v_{ j}(x){ \partial f \over \partial x_{j}}(x). }$$
(33)

Formula (32) gives a precise, invariant definition of df for any smooth function f. In particular we can finally say what dx k means. Let f = x k be the function that assigns to a point x in R n its k-th coordinate, and consider df. The equation dx k = df gives a precise meaning to dx k . (Confusion can arise because x k denotes both the k-th coordinate and the function whose value is the k-th coordinate.)

The expression df is called the exterior derivative or total differential of f. We discuss the exterior derivative in detail in the next section. We can regard df as a function. Its domain consists of pairs (x, v), where xR n and vT x (R n). By (32), df(x)[v] is the directional derivative of f in the direction v at x. Since taking directional derivatives depends linearly on the direction, the object df(x) is a linear functional on T x (R n). It is natural to call the space \(T_{x}^{{\ast}}({\mathbf{R}}^{n})\) of linear functionals on T x (R n) the cotangent space at x. The cotangent space also has dimension n, but it is distinct both from the domain R n and from the tangent space. The disjoint union of all the cotangent spaces is called the cotangent bundle and written T (R n). A point in T (R n) is a pair (x, ξ x ), where x is the base point and ξ x is a co-vector at x. A differential 1-form is a section of the cotangent bundle. Not all 1-forms can be written in the form df for some function f. See the discussion after Stokes’ theorem.

Remark 4.10.

Assume f is defined near x, for some xR n. Then f is differentiable at x if it is approximately linear there. In other words, we can write \(f(x + h) = f(x) + df(x)(h) + \mathrm{error}\), where the error tends to 0 faster than || h || as h → 0. The same definition makes sense if f is vector valued. In that case we write Df(x) for the linear approximation. In this setting, Df(x) is a linear map from the tangent space at x to the tangent space at f(x).

We summarize the discussion, expressing things in an efficient order. For each xR n we presume the existence of a vector space T x (R n), also of dimension n. The union T(R n) over x of the spaces T x (R n) is called the tangent bundle. A vector field is a section of the tangent bundle. For each smooth real-valued function f, defined near x, we define df by (32). In particular, when f is the coordinate function x j , we obtain a definition of dx j . For each smooth f and each x, df(x) is an element of the dual space \(T_{x}^{{\ast}}({\mathbf{R}}^{n})\). The union of these spaces is the cotangent bundle. A 1-form is a section of the cotangent bundle.

We define the operators \({ \partial \over \partial x_{j}}\) by duality. Thus the differentials dx j precede the operators \({ \partial \over \partial x_{j}}\) in the logical development. A 1-form is a combination \(\sum b_{j}(x)dx_{j}\) and a vector field is a combination \(\sum a_{j}(x){ \partial \over \partial x_{j}}\).

5.1 Complex Differential Forms and Vector Fields

Our work requires complex vector fields and complex differential forms. In terms of real coordinates, a complex vector field on R m is an expression \(\sum _{j=1}^{m}g_{j}(x){ \partial \over \partial x_{j}}\) where the functions g j are smooth and complex valued. Similarly, a complex 1-form on R m is an expression \(\sum _{j=1}^{m}h_{j}(x)dx_{j}\) where the functions h j are smooth and complex valued.

We can identify complex Euclidean space C n with R 2n. Write \(z = (z_{1},\ldots,z_{n})\), and put \(z_{j} = x_{j} + iy_{j}\) (where i is the imaginary unit). We can express vector fields in terms of the \({ \partial \over \partial x_{j}}\) and \({ \partial \over \partial y_{j}}\) and differential forms in terms of the dx j and dy j . Complex geometry is magic; things simplify by working with complex (note the double entendre) objects. Everything follows easily from one obvious definition.

Definition 4.7.

Suppose Ω is an open set in C n and f: Ω → C is smooth. Write \(f = u + iv\) where u and v are real valued. We define df by \(df = du + idv\).

Corollary 4.4.

Let \(z_{j} = x_{j} + iy_{j}\) denote the j-th coordinate function on C n . Then \(dz_{j} = dx_{j} + idy_{j}\) and \(d\overline{z}_{j} = dx_{j} - idy_{j}\) .

We define complex differentiation by duality in Definition 4.8. We could also use the formulas in Corollary 4.5 as definitions.

Definition 4.8.

For \(j = 1,\ldots n\), let \(\{{ \partial \over \partial z_{j}},{ \partial \over \partial \overline{z}_{j}}\}\) denote the dual basis to the basis \(\{dz_{j},d\overline{z}_{j}\}\). Thus \({ \partial \over \partial z_{j}}\) is defined by \(dz_{k}[{ \partial \over \partial z_{j}}] = 0\) if jk and by \(dz_{k}[{ \partial \over \partial z_{k}}] = 1\). Also, \({ \partial \over \partial \overline{z}_{j}}\) is defined by \(dz_{k}[{ \partial \over \partial \overline{z}_{j}}] = 0\) for all j, k and \(d\overline{z}_{k}[{ \partial \over \partial \overline{z}_{j}}] = 0\) for jk, but \(d\overline{z}_{k}[{ \partial \over \partial \overline{z}_{k}}] = 1\).

Differentiable functions \(g_{1},\ldots,g_{m}\) form a coordinate system on an open set Ω in R m if their differentials are linearly independent on Ω and the mapping \(g = (g_{1},\ldots,g_{m})\) is injective there. This concept makes sense when these functions are either real or complex valued. For example, the functions z and \(\overline{z}\) define a coordinate system on R 2, because dx + idy and dxidy are linearly independent and the map \((x,y)\mapsto (x + iy,x - iy)\), embedding R 2 into C 2, is injective.

We can regard the 2n functions \(z_{1},\ldots,z_{n},\overline{z}_{1},\ldots,\overline{z}_{n}\) as complex-valued coordinates on R 2n. The exterior derivative df is invariantly defined, independent of the coordinate system, by (32) and Definition 4.7. Hence, the following equality holds:

$$\displaystyle{ df =\sum _{ j=1}^{n}{ \partial f \over \partial x_{j}}dx_{j} +\sum _{ j=1}^{n}{ \partial f \over \partial y_{j}}dy_{j} =\sum _{ j=1}^{n}{ \partial f \over \partial z_{j}}dz_{j} +\sum _{ j=1}^{n}{ \partial f \over \partial \overline{z}_{j}}d\overline{z}_{j}. }$$
(34)

The following formulas then follow by equating coefficients. See Exercise 4.32.

Corollary 4.5.

$$\displaystyle{{ \partial \over \partial z_{j}} ={ 1 \over 2}\left ({ \partial \over \partial x_{j}} - i{ \partial \over \partial y_{j}}\right ) }$$
(35.1)
$$\displaystyle{{ \partial \over \partial \overline{z}_{j}} ={ 1 \over 2}\left ({ \partial \over \partial x_{j}} + i{ \partial \over \partial y_{j}}\right ). }$$
(35.2)

Suppose f is differentiable on an open set in C n. By (34), we can decompose its exterior derivative df into two parts:

$$\displaystyle{ df = \partial f + \overline{\partial }f =\sum _{ j=1}^{n}{ \partial f \over \partial z_{j}}dz_{j} +\sum _{ j=1}^{n}{ \partial f \over \partial \overline{z}_{j}}d\overline{z}_{j}. }$$
(36)

Formula (36) defines the splitting of the 1-form df into the sum of a (1, 0)-form and a (0, 1)-form. The important thing for us is the definition of complex analyticity in this language.

Definition 4.9.

Let Ω be an open subset of C n. Assume that f: Ω → C and f is continuously differentiable. Then f is complex analytic if and only if \(\overline{\partial }f = 0\). Equivalently, if and only if \({ \partial f \over \partial \overline{z}_{j}} = 0\) for all j.

The differential equations in Definition 4.9 are called the Cauchy–Riemann equations. Thus complex analytic functions are the solutions to a first-order system of partial differential equations. As in one variable, complex analytic functions are given locally by convergent power series. In Theorem 4.3 we used the power series expansion of a complex analytic mapping in a ball. For most of what we do, the crucial point is that the Cauchy–Riemann equations have the simple expression \(\overline{\partial }f = 0\). By (36), \(\overline{\partial }f = 0\) means that f is independent of each \(\overline{z}_{j}\). Part of the magic of complex analysis stems from regarding z and its conjugate \(\overline{z}\) as independent variables.

Corollary 4.6.

A continuously differentiable function, defined on an open set in C n , is complex analytic if and only if df = ∂f.

In the rest of this chapter most of the complex analytic functions we will encounter are polynomials. We emphasize the intuitive statement: f is complex analytic if and only if f is independent of the conjugate variable \(\overline{z} = (\overline{z}_{1},\ldots,\overline{z}_{n})\).
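The Cauchy–Riemann criterion of Definition 4.9 is easy to test symbolically. In this sketch (our example, in one variable) the polynomial \(z^{2}\) passes the test, while \(\vert z\vert^{2} = z\overline{z}\), which depends on \(\overline{z}\), fails it.

```python
import sympy as sp

# Sketch of Definition 4.9 in one variable: f is complex analytic iff
# df/dzbar = 0.  Test f1 = z^2 (analytic) and f2 = z*zbar (not analytic).
x, y = sp.symbols('x y', real=True)
z = x + sp.I*y

def d_dzbar(f):
    return (sp.diff(f, x) + sp.I*sp.diff(f, y)) / 2

f1 = z**2
f2 = z * sp.conjugate(z)                # equals x^2 + y^2
print(sp.simplify(d_dzbar(f1)), sp.simplify(d_dzbar(f2)))   # 0 and x + i*y
```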

Exercise 4.32.

Use (34) to verify (35.1) and (35.2).

Exercise 4.33.

This exercise asks you to explain Remark 4.9. Consider the functions x and y as coordinates on R 2. Then by definition, \({\partial y \over \partial x} = 0\). Suppose instead we choose u = x and \(v = x + y\) as coordinates. Then we would have \({\partial v \over \partial u} = 0\). But \({\partial (x+y) \over \partial x} = 1\). Explain!

6 Differential Forms of Higher Degree

Our work in higher dimensions relies on differential forms of higher degree. This discussion presumes that the reader has had some exposure to the wedge product of differential forms and therefore knows intuitively what we mean by a k-form. We also use the modern Stokes’ theorem, which in our setting expresses an integral of a 2n-form over the unit ball as an integral of a (2n − 1)-form over the unit sphere. We develop enough of this material to enable us to do various volume computations.

Definition 4.10.

Let V be a (real or) complex vector space of finite dimension. A function \(F: V \times \ldots \times V \rightarrow \mathbf{C}\) (with k factors) is called a multi-linear form if F is linear in each variable when the other variables are held fixed. We often say F is k-linear. It is called alternating if \(F(v_{1},\ldots,v_{k}) = 0\) whenever v i = v j for some i, j with ij.

Example 4.6.

Consider a k-by-k matrix M of (real or) complex numbers. Think of the rows (or columns) of M as elements of C k. The determinant function is an alternating k-linear form on \({\mathbf{C}}^{k} \times \ldots \times {\mathbf{C}}^{k}\).

Example 4.7.

Given vectors \(a = (a_{1},a_{2},a_{3})\) and \(b = (b_{1},b_{2},b_{3})\) in R 3, define \(F(a,b) = a_{1}b_{3} - a_{3}b_{1}\). Then F is an alternating 2-linear form.
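Example 4.7 can be checked in a few lines; the sketch below confirms that F vanishes on repeated arguments and changes sign under interchange, anticipating Lemma 4.7.

```python
import numpy as np

# Example 4.7 checked numerically: F(a,b) = a1*b3 - a3*b1 is an
# alternating bilinear form on R^3.
def F(a, b):
    return a[0]*b[2] - a[2]*b[0]

rng = np.random.default_rng(2)
a, b = rng.normal(size=3), rng.normal(size=3)
print(F(a, a), F(a, b) + F(b, a))       # both 0
```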

Lemma 4.7.

A multi-linear form F (over R n or C n ) is alternating if and only if the following holds. For each pair i,j of distinct indices, the value of F is multiplied by − 1 if we interchange the i-th and j-th slots:

$$\displaystyle{ F(v_{1},\ldots,v_{i},\ldots,v_{j},\ldots v_{k}) = -F(v_{1},\ldots,v_{j},\ldots v_{i},\ldots,v_{k}). }$$
(37)

Proof.

It suffices to ignore all but two of the slots and then verify the result when F is 2-linear. By multi-linearity we have

$$\displaystyle{ F(v + w,v + w) = F(v,v) + F(v,w) + F(w,v) + F(w,w). }$$
(38)

If F is alternating, then all terms in (38) vanish except F(v, w) + F(w, v). Hence this term must vanish as well. Conversely, if this term always vanishes, then (38) gives \(F(v + w,v + w) = F(v,v) + F(w,w)\). Put \(w = -v\). We get

$$\displaystyle{0 = F(0,0) = F(v,v) + F(-v,-v) = F(v,v) + {(-1)}^{2}F(v,v) = 2F(v,v).}$$

Hence F(v, v) = 0 for all v. □

Remark 4.11.

The reader might wonder why we chose the definition of alternating to be the vanishing condition rather than the change of sign condition. The reason is suggested by the proof. Over R or C, the conditions are the same. If we were working over more general fields, however, we could not rule out the possibility that \(1 + 1 = 0\). In this case the two conditions are not equivalent.

We note that 0 is the only alternating k-linear form on V if k exceeds the dimension of V. When k equals the dimension of V, the only alternating k-linear form is a multiple of the determinant.

Exercise 4.34.

Verify the statements in the previous paragraph.

We can now introduce differential forms of higher degree.

Definition 4.11.

Let V be a (real or) complex vector space of finite dimension n with dual space V . The collection \({\Lambda }^{k}({V }^{{\ast}})\) of all k-linear alternating forms on V is itself a vector space of dimension \({n\choose k}\). It is called the k-th exterior power of V .

Note that \({\Lambda }^{1}({V }^{{\ast}})\) consists of all 1-linear forms on V; thus, it is the dual space of V and \({\Lambda }^{1}({V }^{{\ast}}) = {V }^{{\ast}}\). By convention, \({\Lambda }^{0}({V }^{{\ast}})\) equals the ground field R or C.

Definition 4.12.

Let Ω be an open subset of R n. A differential form of degree k on Ω (or a differential k-form) is a (smooth) section of the k-th exterior power of the cotangent bundle \({T}^{{\ast}}({\mathbf{R}}^{n})\).

At each point x ∈ Ω, we have the vector space T x (R n) and its dual space \(T_{x}^{{\ast}}({\mathbf{R}}^{n})\). A differential k-form assigns to each x an element of \({\Lambda }^{k}(T_{x}^{{\ast}}({\mathbf{R}}^{n}))\). The value of the k-form at x is an alternating k-linear form.

By convention, a 0-form is a function. A 1-form assigns to each x a linear functional on T x (R n), as we have seen already. The value of a 2-form at x is a machine which seeks two vectors at x as inputs and returns a number. If we switch the order of the two inputs, we multiply the output by − 1.

Forms of all degrees can be generated from 1-forms using the wedge product. Before giving the definition of the wedge product, we express the idea informally using bases. Suppose \(e_{1},\ldots,e_{n}\) form a basis for the 1-forms at a point x. For each k with 1 ≤ kn, and each increasing sequence of indices \(i_{1} < i_{2} <\ldots < i_{k},\) we define a formal expression e I , written

$$\displaystyle{ e_{I} = e_{i_{1}} \wedge e_{i_{2}} \wedge \ldots \wedge e_{i_{k}}. }$$
(39)

Note that there are exactly \({n\choose k}\) such expressions. We decree that the collection of these objects form a basis for the space of k-forms. Thus the space of k-forms on an n-dimensional space has dimension \({n\choose k}\).

We can regard e I as an alternating k-linear form. As written, the index I satisfies \(i_{1} <\ldots < i_{k}\). We extend the notation by demanding the alternating property. For example, when k = 2 and l, m are either 1 or 2, we put

$$\displaystyle{(e_{l} \wedge e_{m})(v,w) = e_{l}(v)e_{m}(w) - e_{l}(w)e_{m}(v).}$$

Then \(e_{2} \wedge e_{1} = -e_{1} \wedge e_{2}\). More generally we put

$$\displaystyle{ (e_{1} \wedge \ldots \wedge e_{k})(v_{1},\ldots,v_{k}) = \mathrm{det}(e_{i}(v_{j})). }$$
(40)

Example 4.8.

Consider R 3 with basis \(e_{1},e_{2},e_{3}\). The zero forms are spanned by the constant 1. The 1-forms are spanned by \(e_{1},e_{2},e_{3}\). The 2-forms are spanned by e 1e 2, e 1e 3, and e 2e 3. The 3-forms are spanned by \(e_{1} \wedge e_{2} \wedge e_{3}\).
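Formula (40) can be evaluated concretely. In the sketch below (ours), the e i are the coordinate 1-forms on R 3, so \((e_{1} \wedge e_{2} \wedge e_{3})(v_{1},v_{2},v_{3})\) is the determinant of the matrix whose columns are the v j ; swapping two inputs flips the sign.

```python
import numpy as np

# Sketch of (40): with e_i the coordinate 1-forms, the wedge of all
# of them evaluates to det(e_i(v_j)).
def wedge_eval(vectors):
    return np.linalg.det(np.column_stack(vectors))

v1 = np.array([1.0, 0.0, 0.0])
v2 = np.array([1.0, 2.0, 0.0])
v3 = np.array([1.0, 2.0, 3.0])
val = wedge_eval([v1, v2, v3])
swapped = wedge_eval([v2, v1, v3])
print(val, swapped)                     # 6.0 and -6.0: swapping flips the sign
```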

Exercise 4.35.

For 0 ≤ k ≤ 4, list bases for the k-forms on a 4-dimensional space.

A relationship between wedge products and determinants is evident. It is therefore no surprise that we define the wedge product in a manner similar to the Laplace expansion of a determinant.

First we recall the algebraic definition of the determinant. The motivation is geometric; \(\mathrm{det}(v_{1},\ldots,v_{n})\) measures the oriented volume of the n-dimensional box spanned by these vectors. We normalize by assuming that the volume of the unit n-cube is 1.

Definition 4.13.

Let V be either R n or C n. The determinant, written det, is the unique alternating n-linear form whose value on \(e_{1},\ldots,e_{n}\) is 1.

The Laplace expansion of the determinant follows from the definition. Suppose \(v_{j} =\sum c_{jk}e_{k}\). We compute \(\mathrm{det}(v_{1},\ldots,v_{n})\) by the definition. Multi-linearity yields

$$\displaystyle{\mathrm{det}(v_{1},\ldots,v_{n}) =\sum _{ k_{1}=1}^{n}\sum _{ k_{2}=1}^{n}\ldots \sum _{ k_{n}=1}^{n}\prod _{ j=1}^{n}c_{ jk_{j}}\mathrm{det}(e_{k_{1}},\ldots,e_{k_{n}}).}$$

Next we apply the alternating property to rewrite the determinant of each \((e_{k_{1}},\ldots e_{k_{n}})\). If indices are repeated, we get 0. Otherwise we get ± 1, depending on the signum of the permutation of the indices. We obtain the standard Laplace expansion of the determinant

$$\displaystyle{ \mathrm{det}(c_{jk}) =\sum _{\tau }\mathrm{sgn}(\tau )\prod _{j=1}^{n}c_{ j\ \tau (j)}. }$$
(41)

A permutation τ on n objects is a bijection on the set of these objects. The expression sgn(τ) is either 1 or − 1; it equals 1 when τ is an even permutation and − 1 when τ is an odd permutation. That is, sgn(τ) records the parity of the number of interchanges (of pairs of indices) required to put the indices in the order \(1,2,\ldots,n\).

Exercise 4.36.

Show that \(\mathrm{sgn}(\tau ) =\prod _{1\leq i<j\leq n}{\tau (i)-\tau (j) \over i-j}\).

Exercise 4.37.

Show that \(\mathrm{sgn}(\tau _{1} \circ \tau _{2}) = \mathrm{sgn}(\tau _{1})\mathrm{sgn}(\tau _{2})\). Suggestion: Use the previous exercise.
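For Exercise 4.36, it is instructive to compare the product formula with the usual inversion count. The sketch below (ours) computes sgn both ways and checks that they agree on every permutation of four objects.

```python
from itertools import permutations
from math import prod

# Sketch for Exercise 4.36: sgn via counting inversions versus the
# product formula sgn(t) = prod_{i<j} (t(i)-t(j))/(i-j).
def sgn_inversions(t):
    n = len(t)
    return (-1) ** sum(1 for i in range(n) for j in range(i+1, n)
                       if t[i] > t[j])

def sgn_product(t):
    n = len(t)
    return round(prod((t[i] - t[j]) / (i - j)
                      for i in range(n) for j in range(i+1, n)))

ok = all(sgn_inversions(t) == sgn_product(t) for t in permutations(range(4)))
print(ok)                               # True: the two formulas agree on S_4
```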

The wedge product is defined in a similar fashion:

Definition 4.14.

The wedge product of a k-form α and an l-form β is the (k + l)-form αβ defined by

$$\displaystyle{ (\alpha \wedge \beta )(v_{1},\ldots,v_{k+l}) ={ 1 \over k!\,l!}\sum _{\tau }\mathrm{sgn}(\tau )\alpha (v_{\tau (1)},\ldots,v_{\tau (k)})\beta (v_{\tau (k+1)},\ldots,v_{\tau (k+l)}). }$$
(42)

The sum in (42) is taken over all permutations τ of k + l objects. The factor 1∕(k! l!) compensates for the k! permutations of the first k slots and the l! permutations of the last l slots, each of which contributes the same term; with this normalization, (42) agrees with (40).

Proposition 4.7 (Properties of the wedge product).

Let α,β,β 1 2 be differential forms. Then:

  1. (1)

    \(\alpha \wedge (\beta _{1} +\beta _{2}) = (\alpha \wedge \beta _{1}) + (\alpha \wedge \beta _{2})\) .

  2. (2)

    \(\alpha \wedge (\beta _{1} \wedge \beta _{2}) = (\alpha \wedge \beta _{1}) \wedge \beta _{2}\) .

  3. (3)

    \(\alpha \wedge \beta = {(-1)}^{kl}\beta \wedge \alpha\) if α is a k-form and β is an l-form.

Proof.

Left to the reader as Exercise 4.38. □

The exterior derivative d is one of the most important and elegant operations in mathematics. When η is a k-form, dη is a (k + 1)-form. When η is a function (a 0-form), dη agrees with our definition from (32). We can extend d to forms of all degrees by proceeding inductively on the degree of the form. After stating Theorem 4.7, we mention a more elegant approach.

If f is a function, then df is defined as in (32) by \(df[v] ={ \partial f \over \partial v}\). In coordinates, \(df =\sum { \partial f \over \partial x_{j}}dx_{j}\). When \(g =\sum _{j}g_{j}dx_{j}\) is an arbitrary 1-form, we define dg by

$$\displaystyle{ dg =\sum _{j}dg_{j} \wedge dx_{j} =\sum _{j}\sum _{k}{ \partial g_{j} \over \partial x_{k}}dx_{k} \wedge dx_{j} =\sum _{k<j}({ \partial g_{j} \over \partial x_{k}} -{ \partial g_{k} \over \partial x_{j}})dx_{k} \wedge dx_{j}. }$$
(43)

On the far right-hand side of (43), we have rewritten dg using \(dx_{k} \wedge dx_{j} = -dx_{j} \wedge dx_{k}\) to make the indices increase. The terms \(dx_{j} \wedge dx_{j}\) drop out. For example,

$$\displaystyle{ d(Pdx + Qdy) ={ \partial P \over \partial y} dy \wedge dx +{ \partial Q \over \partial x} dx \wedge dy = ({\partial Q \over \partial x} -{ \partial P \over \partial y} )dx \wedge dy. }$$
(44)

Suppose in (44) that \(Pdx + Qdy = df\) for some smooth function f. Then the equality of mixed second partial derivatives and (44) show that d(df) = 0. This statement in the language of differential forms is equivalent to the classical statement “the curl of a gradient is 0.” In fact d 2 = 0 in general; see Theorem 4.7 and Exercise 4.38.
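The statement d(df) = 0 is easy to verify symbolically. In this sketch (our example), we take an arbitrary smooth f, form df = P dx + Q dy with P = f x and Q = f y , and check via (44) that the coefficient \(Q_{x} - P_{y}\) vanishes by equality of mixed partials.

```python
import sympy as sp

# Sketch of d(df) = 0: for df = P dx + Q dy with P = f_x and Q = f_y,
# the dx ^ dy coefficient in (44) is Q_x - P_y, which vanishes.
x, y = sp.symbols('x y')
f = sp.exp(x) * sp.sin(y) + x**3 * y**2      # any smooth f works here
P, Q = sp.diff(f, x), sp.diff(f, y)
coeff = sp.simplify(sp.diff(Q, x) - sp.diff(P, y))
print(coeff)                                 # 0: the curl of a gradient is 0
```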

Let η be a k-form. We wish to define dη in coordinates. To simplify the notation, write

$$\displaystyle{d{x}^{J} = dx_{ j_{1}} \wedge dx_{j_{2}} \wedge \ldots \wedge dx_{j_{k}}.}$$

Then we can write \(\eta =\sum _{J}\eta _{J}d{x}^{J}\) where the η J are functions and each J is a k-tuple of indices. We proceed as we did for 1-forms and put

$$\displaystyle{d\eta =\sum _{J}d\eta _{J} \wedge d{x}^{J} =\sum _{ J}\sum _{k}{ \partial \eta _{J} \over \partial x_{k}}dx_{k} \wedge d{x}^{J}.}$$

Thus \(d\eta =\sum g_{L}d{x}^{L}\), where now L is a (k + 1)-tuple of indices.

The following standard result, which applies in the setting of smooth manifolds, characterizes d. We omit the simple proof, which can be summarized as follows. Choose coordinates, use the properties to check the result in that coordinate system, and then use the chain rule to see that d is defined invariantly.

Theorem 4.7.

There is a unique operator d mapping smooth k-forms to smooth (k + 1)-forms satisfying the following properties:

  1. (1)

    If f is a function, then df is defined by ( 32 ).

  2. (2)

    \(d(\alpha +\beta ) = d\alpha + d\beta\) .

  3. (3)

    \(d(\alpha \wedge \beta ) = d\alpha \wedge \beta +{(-1)}^{p}\alpha \wedge d\beta\) if α is a p-form.

  4. (4)

    d 2 = 0.

It is possible to define d without resorting to a coordinate system. The definition on 0-forms is as in (32). We give the definition only for 1-forms. Let η be a 1-form; the 2-form \(d\eta\) requires two vector fields as inputs, and it must be alternating and multi-linear. Thus we will define \(d\eta (v,w)\) for vector fields v and w.

We regard v and w as differential operators by recalling that v(f) = df(v) for smooth functions f. Earlier we wrote df[v], but henceforth we will use the symbol [, ] in another manner. We therefore use parentheses for the application of a 1-form on a vector field and for the action of a vector field on a function. We wish to define the expression \(d\eta (v,w)\).

Definition 4.15.

Let v and w be vector fields. Their Lie bracket, or commutator, is the vector field [v, w] defined by \([v,w](f) = v(w(f)) - w(v(f))\). Here f is a smooth function, and we regard a vector field as a differential operator. (Exercise 4.39 asks you to check that the commutator is a vector field.)

We can now define \(d\eta\). Given vector fields v and w, we put

$$\displaystyle{d\eta (v,w) = v(\eta (w)) - w(\eta (v)) -\eta ([v,w]).}$$

The notation v(η(w)) here means the derivative of the function η(w) in the direction v. The full expression is alternating in v and w. The term involving commutators is required to make certain that \(d\eta\) is linear over the functions. See Exercise 4.40. This formula (and its generalization to forms of all degrees) is known as the Cartan formula for the exterior derivative.

Exercise 4.38.

Show that d 2 = 0. Recall, for smooth functions f, we have

$$\displaystyle{{ {\partial }^{2}f \over \partial x_{j}\partial x_{k}} ={ {\partial }^{2}f \over \partial x_{k}\partial x_{j}}.}$$

Exercise 4.39.

Verify that the commutator of two vector fields is a vector field. Suggestion: Use coordinates.

Exercise 4.40.

Suppose we tried to define a 2-form ζ by ζ(v, w) = v(η(w)) − w(η(v)). Show for a smooth function g that \(\zeta (gv,w)\neq g\zeta (v,w)\) in general, and thus linearity over the functions fails. Then show that the commutator term in the definition of \(d\eta\) enables linearity to hold.

Equation (44) fits nicely with Green’s theorem. The line integral of the 1-form \(\eta = Pdx + Qdy\) around a simple closed curve equals the double integral of \(d\eta\) over the curve’s interior. The generalization of this result to forms of all degrees is known as the modern Stokes’ theorem. This theorem subsumes many results, including the fundamental theorem of calculus, Green’s theorem, Gauss’s divergence theorem, and the classical Stokes’ theorem, and it illuminates results such as Maxwell’s equations from the theory of electricity and magnetism. We state it only for domains in R N, but it holds much more generally. We will apply Stokes’ theorem only when the surface in question is the unit sphere, which is oriented by the outward normal vector.

Theorem 4.8 (Stokes’ theorem).

Let S = bΩ be a piecewise-smooth, oriented (N − 1)-dimensional surface bounding an open subset Ω of R N . Let η be an (N − 1)-form that is smooth on Ω and continuous on Ω ∪ bΩ. Then

$$\displaystyle{\int _{b\Omega }\eta =\int _{\Omega }d\eta.}$$

Corollary 4.7.

If dη = 0 on Ω, then \(\int _{b\Omega }\eta = 0\) .

Theorem 4.8 holds whether or not Ω is connected, as long as one is careful with orientation. If Ω is the region between concentric spheres, for example, then the spheres must be oppositely oriented.

Each 1-form η on an open subset of R N can be written \(\eta =\sum _{ j=1}^{N}g_{j}dx_{j}\), where the g j are smooth functions. A 1-form η is called exact if there is a smooth function f such that η = df; thus \(g_{j} ={ \partial f \over \partial x_{j}}\). Readers who are familiar with using line integrals to compute work will recognize that exact 1-forms correspond to conservative force fields. More generally, a k-form η is exact if there is a (k − 1)-form α with \(d\alpha =\eta\). A necessary condition for being exact arises from the equality of mixed partial derivatives. A form η is called closed if \(d\eta = 0\). That exact implies closed follows directly from d 2 = 0.

If a form is closed on an open set, it need not be exact there. The standard examples are of course

$$\displaystyle{ \eta ={ -ydx + xdy \over {x}^{2} + {y}^{2}} }$$
(45.1)
$$\displaystyle{ \eta ={ x\ dy \wedge dz + y\ dz \wedge dx + z\ dx \wedge dy \over {({x}^{2} + {y}^{2} + {z}^{2})}^{{ 3 \over 2} }}. }$$
(45.2)

These are defined on the complement of the origin in R 2 and R 3, respectively. The form in (45.2) provides essentially the same information as the electrical or gravitational field due to a charge or mass at the origin.

Such forms lead to the subject of de Rham cohomology. One relates the existence and number of holes in a space to whether closed forms are exact.
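A numerical experiment (mine, not the author's) confirms that the form in (45.1) is not exact: its line integral around any loop encircling the origin equals 2π rather than 0, which would be forced by Stokes' theorem if the form were exact.

```python
import math

# Integrate eta = (-y dx + x dy)/(x^2 + y^2) around an ellipse
# enclosing the origin.  For an exact form the answer would be 0;
# here it is 2*pi, the winding number times 2*pi.
n = 2000
total = 0.0
for k in range(n):
    t = 2 * math.pi * (k + 0.5) / n          # midpoint rule in t
    x, y = 2 * math.cos(t), math.sin(t)      # ellipse around the origin
    dx, dy = -2 * math.sin(t), math.cos(t)   # x'(t), y'(t)
    total += (-y * dx + x * dy) / (x * x + y * y)
total *= 2 * math.pi / n
print(total)  # ~ 6.28318... = 2*pi
```

The midpoint rule converges extremely fast here because the integrand is smooth and periodic; the same computation on a loop not enclosing the origin returns 0.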

Exercise 4.41.

Prove Proposition 4.7.

Exercise 4.42.

For 0 < r < ∞ and 0 ≤ θ < 2π, put (x, y) = (rcos(θ), rsin(θ)). Show that \(dx \wedge dy = r\,dr \wedge d\theta\).

Exercise 4.43.

For 0 < ρ < ∞, for 0 ≤ θ < 2π, and for 0 ≤ ϕ < π, put

$$\displaystyle{(x,y,z) = (\rho \cos (\theta )\sin (\phi ),\rho \sin (\theta )\sin (\phi ),\rho \cos (\phi )).}$$

Compute \(dx \wedge dy \wedge dz\) in terms of ρ, θ, ϕ, dρ, dθ, dϕ.

Exercise 4.44.

Express the complex 1-form \({dz \over z}\) in terms of x, y, dx, dy. Express the form in (45.1) in terms of dz and \(d\overline{z}\).

Exercise 4.45.

Show that \(dz \wedge d\overline{z} = -2idx \wedge dy\).

Exercise 4.46.

Put \(z = r{e}^{i\theta }\). Compute \(dz \wedge d\overline{z}\).

Exercise 4.47.

Put \(\eta = dx_{1} \wedge dx_{2} + dx_{3} \wedge dx_{4}\). Find \(\eta \wedge \eta\). The answer is not 0. Explain.

Exercise 4.48.

Verify that the forms in (45.1) and (45.2) are closed but not exact. (To show they are not exact, use Stokes’ theorem on concentric circles or concentric spheres.) For n ≥ 3, what is the analogue of (45.2) for the complement of the origin in R n?

Exercise 4.49.

Use wedge products to give a test for deciding whether a collection of 1-forms is linearly independent.

Exercise 4.50.

For \(n \geq k \geq 2\), let \(r_{1},\ldots,r_{k}\) be smooth real-valued functions on C n. Show that it is possible for \(dr_{1},\ldots,dr_{k}\) to be linearly independent while \(\partial r_{1},\ldots,\partial r_{k}\) are linearly dependent. Here \(\partial r =\sum { \partial r \over \partial z_{j}}dz_{j}\). This problem is even easier if we drop the assumption that the r j are real valued. Why?

7 Volumes of Parametrized Sets

Our next geometric inequality extends the ideas of Proposition 4.2 to higher dimensions. Things are more complicated for several reasons, but we obtain a sharp inequality on volumes of images of proper polynomial mappings between balls. We will also perform some computations from multivariable calculus which are useful in many contexts.

We begin with a quick review of higher-dimensional volume. Let Ω be an open subset of R k. Let \(u_{1},\ldots,u_{k}\) be coordinates on R k. The ordering of the u j , or equivalently the du j , defines the orientation on R k. We write

$$\displaystyle{dV = dV _{k} = dV _{k}(u) = du_{1} \wedge \ldots \wedge du_{k}}$$

for the Euclidean volume form. When u = F(x) is a change of variables, preserving the orientation, we obtain

$$\displaystyle{dV (u) = \mathrm{det}(DF(x))dV (x).}$$

Suppose F: Ω → R N is continuously differentiable and injective, except perhaps on a small set. Let us also assume that the derivative map DF(x): R k → R N is injective, again except perhaps on a small set. At each x, DF(x) is a linear map from \(T_{x}({\mathbf{R}}^{k})\) to \(T_{F(x)}({\mathbf{R}}^{N})\). Let \({(DF)}^{{\ast}}(x)\) denote the transpose of DF(x). Then \({(DF)}^{{\ast}}(x): T_{F(x)}({\mathbf{R}}^{N}) \rightarrow T_{x}({\mathbf{R}}^{k})\). The composition \({(DF)}^{{\ast}}(x)DF(x)\) is then a linear mapping from the space \(T_{x}({\mathbf{R}}^{k})\) to itself, and hence, its determinant is defined. The k-dimensional volume of the set F(Ω) is then given by an integral

$$\displaystyle{ \mathrm{Vol}(F(\Omega )) =\int _{\Omega }\sqrt{\mathrm{det } ({(DF)}^{{\ast} } DF)}dV _{k}. }$$
(46)

Example 4.9.

Let Ω denote the unit disk in R 2. Define F α : Ω → R 4 by

$$\displaystyle{F_{\alpha }(x,y) = (\mathrm{cos}(\alpha )x,\mathrm{cos}(\alpha )y,\mathrm{sin}(\alpha )({x}^{2} - {y}^{2}),\mathrm{sin}(\alpha )2xy).}$$

Computation shows that

$$\displaystyle{ DF_{\alpha } = \left (\begin{array}{ccc} \mathrm{cos}(\alpha ) && 0 \\ 0 && \mathrm{cos}(\alpha ) \\ 2x\mathrm{sin}(\alpha )&& - 2y\mathrm{sin}(\alpha ) \\ 2y\mathrm{sin}(\alpha ) && 2x\mathrm{sin}(\alpha ) \end{array} \right ). }$$
(47)

Matrix multiplication shows that \(DF_{\alpha }^{{\ast}}(x,y)DF_{\alpha }(x,y)\) is the matrix in (48):

$$\displaystyle{ \left (\begin{array}{ccc} {\mathrm{cos}}^{2}(\alpha ) + 4({x}^{2} + {y}^{2}){\mathrm{sin}}^{2}(\alpha )&& 0 \\ 0 &&{\mathrm{cos}}^{2}(\alpha ) + 4({x}^{2} + {y}^{2}){\mathrm{sin}}^{2}(\alpha ) \end{array} \right ). }$$
(48)

Hence, \(\sqrt{\mathrm{det } (DF_{\alpha }^{{\ast} }DF_{\alpha } )} ={ \mathrm{cos}}^{2}(\alpha ) + 4({x}^{2} + {y}^{2}){\mathrm{sin}}^{2}(\alpha )\). Thus, the area of the image of the unit disk B 1 under F α is the integral

$$\displaystyle{ \int _{B_{1}}({\mathrm{cos}}^{2}(\alpha ) + 4({x}^{2} + {y}^{2}){\mathrm{sin}}^{2}(\alpha ))dxdy =\pi (1 +{ \mathrm{sin}}^{2}(\alpha )). }$$
(49)
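As a quick numerical check (my own, not part of the text), the integral in (49) can be approximated with a midpoint rule in polar coordinates; the area element contributes the factor r:

```python
import math

# Approximate the integral in (49) over the unit disk using polar
# coordinates: integrand (cos^2(a) + 4 r^2 sin^2(a)) times the area
# element r dr dtheta; the theta-integral contributes a factor 2*pi.
a = 0.9          # an arbitrary test value of the parameter alpha
nr = 2000        # midpoint-rule points in the radial direction
total = 0.0
for i in range(nr):
    r = (i + 0.5) / nr
    total += (math.cos(a) ** 2 + 4 * r * r * math.sin(a) ** 2) * r
total *= 2 * math.pi / nr
expected = math.pi * (1 + math.sin(a) ** 2)
print(total, expected)  # the two values agree to about six digits
```

Any value of α gives the same agreement, since the exact radial integral of cr + 4sr³ is c∕2 + s.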

Example 4.10.

To anticipate a later development, we find the 3-dimensional volume of S 3. Let Ω denote the open subset of R 3 defined by the inequalities 0 < r < 1, 0 < θ < 2π, 0 < ϕ < 2π. We parametrize (most of) S 3 by

$$\displaystyle{(r,\theta,\phi )\mapsto F(r,\theta,\phi ) = (r\ \mathrm{cos}(\theta ),r\ \mathrm{sin}(\theta ),s\ \mathrm{cos}(\phi ),s\ \mathrm{sin}(\phi )).}$$

Here \(s = \sqrt{1 - {r}^{2}}\). Note that both θ and ϕ range from 0 to 2π; they are not the usual spherical coordinates on S 2. Computing DF and its transpose \({(DF)}^{{\ast}}\) gives

$$\displaystyle{{(DF)}^{{\ast}} = \left (\begin{array}{cccc} \mathrm{cos}(\theta ) & \mathrm{sin}(\theta ) & { -r \over s} \mathrm{cos}(\phi )&{ -r \over s} \mathrm{sin}(\phi ) \\ - r\ \mathrm{sin}(\theta )&r\ \mathrm{cos}(\theta )& 0 & 0 \\ 0 & 0 & - s\ \mathrm{sin}(\phi ) & s\ \mathrm{cos}(\phi ) \end{array} \right ).}$$

Multiplying \({(DF)}^{{\ast}}\) by DF and taking the square root of the determinant yields the 3-dimensional volume form \(r\,dr \wedge d\theta \wedge d\phi\) on the sphere. Thus

$$\displaystyle{\mathrm{Vol}({S}^{3}) =\int _{ 0}^{2\pi }\int _{ 0}^{2\pi }\int _{ 0}^{1}r\,dr\,d\theta \,d\phi = {(2\pi )}^{2}{ 1 \over 2} = 2{\pi }^{2}.}$$

We are interested in images of sets in C n under complex analytic mappings. When f is a complex-analytic and equi-dimensional mapping, we write f′ for its derivative and Jf for its Jacobian determinant. Thus

$$\displaystyle{Jf = \mathrm{det}\left ({\partial f_{j} \over \partial z_{k}}\right ).}$$

Volume computations simplify in the complex-analytic case, even when f is not equi-dimensional. We could express Example 4.9 using the complex analytic map f α defined by \(f_{\alpha }(z) = (\mathrm{cos}(\alpha )z,\mathrm{sin}(\alpha ){z}^{2})\), and we easily obtain (49). The following result in the equi-dimensional case explains why:

Lemma 4.8.

Suppose f: Ω ⊆ C n → C n is complex analytic. Define F: R 2n → R 2n by \(F(x,y) = (\mathrm{Re}(f(x + iy)),\mathrm{Im}(f(x + iy)))\) . Then \(\mathrm{det}(DF) = \vert \mathrm{det}(f^{\prime}){\vert }^{2} = \vert Jf{\vert }^{2}\) . In particular, F preserves orientation.

Proof.

When u = F(x) is a change of variables on R k, then dV (u) = ± det((DF)(x))dV (x). The proof amounts to rewriting this equality using complex variables and their conjugates and using the relationship between wedge products and determinants.

Put w = f(z), where both z and w are in C n. Put \(w = u + iv\) and \(z = x + iy\). In real variables we have

$$\displaystyle{ dV _{2n}(u,v) = du_{1} \wedge dv_{1} \wedge \ldots \wedge du_{n} \wedge dv_{n} = \mathrm{det}(DF)dx_{1} \wedge dy_{1} \wedge \ldots \wedge dx_{n} \wedge dy_{n}. }$$
(50)

We will write the volume forms in the \(z,\overline{z}\) variables in the domain and the \(w,\overline{w}\) variables in the target. Note that

$$\displaystyle{dw_{j} =\sum { \partial f_{j} \over \partial z_{k}}dz_{k}.}$$

Hence \(dw_{1} \wedge \ldots \wedge dw_{n} = \mathrm{det}({\partial f_{j} \over \partial z_{k}})\ dz_{1} \wedge \ldots \wedge dz_{n} = (Jf)\ dz_{1} \wedge \ldots \wedge dz_{n}\).

Recall from Exercise 4.45 that \(dz_{j} \wedge d\overline{z}_{j} = (-2i)dx_{j} \wedge dy_{j}\) and similarly for the w variables. Putting everything together we get

$$\displaystyle{dV _{2n}(u,v) = du_{1} \wedge dv_{1} \wedge \ldots \wedge du_{n} \wedge dv_{n} = {({ i \over 2})}^{n}dw_{ 1} \wedge d\overline{w}_{1} \wedge \ldots \wedge dw_{n} \wedge d\overline{w}_{n}}$$
$$\displaystyle{= \vert \mathrm{det}(f^{\prime}(z)){\vert }^{2}{({ i \over 2})}^{n}dz_{ 1} \wedge d\overline{z}_{1} \wedge \ldots \wedge dz_{n} \wedge d\overline{z}_{n}}$$
$$\displaystyle{ = \vert \mathrm{det}(f^{\prime}(z)){\vert }^{2}dx_{ 1} \wedge dy_{1} \wedge \ldots \wedge dx_{n} \wedge dy_{n} = \vert \mathrm{det}(f^{\prime}(z)){\vert }^{2}dV _{ 2n}(x,y). }$$
(51)

Comparing (50) and (51) finishes the proof. □
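Lemma 4.8 can be spot-checked numerically. The sketch below (my own; the map f is an arbitrary example) builds the real 4 × 4 Jacobian of a holomorphic map C² → C² by central differences and compares its determinant with |Jf|²:

```python
# Check det(DF) = |Jf|^2 for the holomorphic map
# f(z1, z2) = (z1^2 + z2, z1*z2), whose Jacobian is Jf = 2*z1^2 - z2.

def f(z1, z2):
    return (z1 * z1 + z2, z1 * z2)

def F(p):  # real form of f: (x1, y1, x2, y2) -> (u1, v1, u2, v2)
    w1, w2 = f(complex(p[0], p[1]), complex(p[2], p[3]))
    return [w1.real, w1.imag, w2.real, w2.imag]

def det(M):  # determinant via Gaussian elimination with partial pivoting
    M = [row[:] for row in M]
    n, d = len(M), 1.0
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        if piv != c:
            M[c], M[piv] = M[piv], M[c]
            d = -d
        d *= M[c][c]
        for r in range(c + 1, n):
            m = M[r][c] / M[c][c]
            for k in range(c, n):
                M[r][k] -= m * M[c][k]
    return d

p0 = [0.3, -0.5, 1.1, 0.2]   # an arbitrary base point
h = 1e-6
DF = [[0.0] * 4 for _ in range(4)]
for j in range(4):           # central differences, column by column
    plus, minus = p0[:], p0[:]
    plus[j] += h
    minus[j] -= h
    Fp, Fm = F(plus), F(minus)
    for i in range(4):
        DF[i][j] = (Fp[i] - Fm[i]) / (2 * h)

z1, z2 = complex(p0[0], p0[1]), complex(p0[2], p0[3])
Jf = 2 * z1 * z1 - z2
print(det(DF), abs(Jf) ** 2)  # the two values agree
```

Since the components of F are quadratic in the real coordinates, the central differences here are exact up to rounding.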

Exercise 4.51.

Prove (51) using the real form of the Cauchy–Riemann equations. The computation is somewhat punishing; do it only in two complex variables where you will deal with four-by-four matrices.

We continue discussing higher-dimensional volumes of complex analytic images. Let Ψ denote the differential form on C N defined by

$$\displaystyle{\Psi ={ i \over 2}\sum _{j=1}^{N}d\zeta _{ j} \wedge d\overline{\zeta }_{j}.}$$

The factor \({ i \over 2}\) arises because \(dz \wedge d\overline{z} = -2idx \wedge dy\) in one dimension. See Exercise 4.45. The form Ψ k, where we wedge Ψ with itself k times, is used to define 2k-dimensional volume. As before we take multiplicity into account.

Definition 4.16.

(2k-dimensional volume) Let Ω be an open subset in C k, and suppose that f: Ω → C N is complex analytic. We define V 2k (f, Ω), the (2k)-dimensional volume with multiplicity counted, by (52):

$$\displaystyle{ V _{2k}(f,\Omega ) =\int _{\Omega }{{({f}^{{\ast}}\Psi )}^{k} \over k!} ={ 1 \over k!}{({ i \over 2})}^{k}\int _{ \Omega }{(\sum _{j=1}^{N}\partial f_{ j} \wedge \overline{\partial f_{j}})}^{k}. }$$
(52)

Remark 4.12.

Equation (52) is the natural definition based on our L 2 perspective. When f is not injective, the formula takes multiplicity into account. For \(w \in {\mathbf{C}}^{N}\), let #(f, w) denote the number of points in \(\Omega \cap {f}^{-1}(w)\). Then we could define V 2k (f, Ω) by

$$\displaystyle{V _{2k}(f,\Omega ) =\int _{{\mathbf{C}}^{N}}\#(f,w)d{\mathbf{h}}^{2k}(w).}$$

Here d h 2k(w) is the 2k-dimensional Hausdorff measure. The so-called area formula from geometric measure theory shows under rather general hypotheses, met in our context, that this computation agrees with (52).

We are primarily interested in the case when Ω is the unit ball B k ; in this case we abbreviate V 2k (f, Ω) by V f . In (52) the upper star notation denotes pullback, and the k! arises because there are k! ways to permute the indices from 1 to k. The form \({{({f}^{{\ast}}\Psi )}^{k} \over k!}\) equals r dV for some function r depending on f, where dV = dV 2k is the Euclidean volume form in k complex dimensions. The next section provides techniques for evaluating the resulting integrals.

Remark 4.13.

Caution! In the complex 2-dimensional case, the volume form is h dV 4, where \(h = EG -\vert F{\vert }^{2}\) and

$$\displaystyle{E = \vert \vert {\partial f \over \partial z} \vert {\vert }^{2},}$$
$$\displaystyle{G = \vert \vert {\partial f \over \partial w}\vert {\vert }^{2},}$$
$$\displaystyle{F =\langle { \partial f \over \partial z},{ \partial f \over \partial w}\rangle.}$$

No square root appears here. By contrast, in the real case, the classical formula for the surface area form is \(\sqrt{EG - {F}^{2}}\), where E, G, F have analogous definitions.

Example 4.11.

We consider several maps from B 2 to C 3. Using (52) and the methods of the next section, we obtain the following values:

  1. (1)

Put g(z, w) = (z, 0, w). Then \(V _{g} ={ {\pi }^{2} \over 2}\).

  2. (2)

For \(0 \leq \lambda \leq \sqrt{2}\), put \(f(z,w) = ({z}^{2},\lambda zw,{w}^{2})\). Then \(V _{f} ={ 2({\lambda }^{2} + 1) \over 3}{ \pi }^{2}\).

The first map is injective, and V g gives the volume of its image. For λ ≠ 0, the second map is generically two-to-one: if (a, b, c) is in the image of f and is not the origin, then f − 1(a, b, c) has precisely two points. When λ 2 = 2, we obtain 4 times the volume of the unit ball. See Theorem 4.9. When λ = 0, the answer is \({4 \over 3}\) times the volume of the unit ball.
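The value in item (2) can be recovered exactly from (53) and (64). The sketch below (my own computation, using exact rational arithmetic) sums the three squared 2 × 2 Jacobians of f(z, w) = (z², λzw, w²):

```python
from fractions import Fraction
from math import factorial

# For f = (z^2, L*z*w, w^2) the 2x2 Jacobians of pairs of components are
#   J(f1, f2) = 2*L*z^2,  J(f1, f3) = 4*z*w,  J(f2, f3) = 2*L*w^2,
# so ||Jf||^2 = 4*L^2*|z|^4 + 16*|z|^2*|w|^2 + 4*L^2*|w|^4.
# By (64), the integral of |z|^(2a) |w|^(2b) over B_2 is pi^2 * a!b!/(a+b+2)!.

def moment(a, b):  # coefficient of pi^2 in (64)
    return Fraction(factorial(a) * factorial(b), factorial(a + b + 2))

def Vf_over_pi2(L2):  # L2 = lambda^2; the result is V_f divided by pi^2
    return 4 * L2 * moment(2, 0) + 16 * moment(1, 1) + 4 * L2 * moment(0, 2)

for L2 in (Fraction(0), Fraction(1), Fraction(2)):
    assert Vf_over_pi2(L2) == Fraction(2) * (L2 + 1) / 3
print(Vf_over_pi2(Fraction(2)))  # 2, i.e. V_f = 2*pi^2 when lambda^2 = 2
```

The λ² = 2 case reproduces 4 times the volume π²∕2 of the unit ball, matching the two-to-one covering described above.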

Example 4.12.

Define h: C 2 → C 3 by h(z, w) = (z, zw, w 2). This map and its generalization to higher dimensions will play an important role in our work, because h maps the unit sphere in C 2 into the unit sphere in C 3. Here it illustrates the subtleties involved in computing multiplicities. Let p = (a, b, c) be a point in C 3. Suppose first that a ≠ 0. Then h − 1(p) is empty unless b 2 = ca 2, in which case h − 1(p) is a single point. When a = 0, things change. If b ≠ 0, then h − 1(p) is empty. If \(a = b = 0\), then h − 1(p) consists of two points for c ≠ 0 and one point with multiplicity two if c = 0.

We will use the expanded version of the far right-hand side of (52) to compute volumes. Let Ω be an open set in C k, and assume that f: Ω → C N is complex analytic. Here we allow the target dimension to differ from the domain dimension. We define the pointwise squared Jacobian || Jf ||2 by

$$\displaystyle{ \vert \vert Jf\vert {\vert }^{2} =\sum \vert J(f_{ i_{1}},\ldots,f_{i_{k}}){\vert }^{2} =\sum \vert J(f_{ I}){\vert }^{2}. }$$
(53)

The sum in (53) is taken over all increasing k-tuples. Equivalently, we form all possible Jacobian determinants of k of the component functions and sum their squared moduli. Recall, in the equi-dimensional case, that

$$\displaystyle{Jg = \mathrm{det}\left ({\partial g_{j} \over \partial z_{k}}\right ).}$$

Exercise 4.52.

Let \(\alpha =\sum _{ j=1}^{3}\partial f_{j} \wedge \overline{\partial f_{j}}\). Find \(\alpha \wedge \alpha \wedge \alpha\) by expanding, and compare with (53).

The next lemma provides another method for finding V f . Let r be a twice differentiable function of several complex variables. The complex Hessian of r is the matrix \((r_{j\overline{k}}) = \left ({ {\partial }^{2}r \over \partial z_{j}\partial \overline{z_{k}}}\right )\). Lemma 4.9 relates the determinant of the Hessian of || f ||2 to the Jacobian Jf, when f is a complex analytic mapping. This lemma allows us to compute one determinant, rather than many, even when N > k.

Lemma 4.9.

If f: C k → C N is complex analytic, then \(\vert \vert Jf\vert {\vert }^{2} = \mathrm{det}\left ((\vert \vert f\vert {\vert }^{2})_{j\overline{k}}\right )\) .

Proof.

See Exercise 4.53. □

To find the volume (with multiplicity accounted for) of the image of a complex analytic mapping \(f: \Omega \subseteq {\mathbf{C}}^{k} \rightarrow {\mathbf{C}}^{N}\), we must either integrate the determinant of the Hessian of || f ||2 or sum the squared L 2 norms of each Jacobian \(J(f_{j_{1}},\ldots,f_{j_{k}})\) formed from the components of f:

$$\displaystyle{ V _{2k}(f,\Omega ) =\int _{\Omega }\vert \vert Jf\vert {\vert }^{2}dV _{ 2k} =\int _{\Omega }\mathrm{det}\left ((\vert \vert f\vert {\vert }^{2})_{ j\overline{k}}\right )dV _{2k}. }$$
(54)

Exercise 4.53.

Put \(r(z,\overline{z}) =\sum _{ j=1}^{N}\vert f_{j}(z){\vert }^{2} = \vert \vert f(z)\vert {\vert }^{2}\). Use differential forms to prove Lemma 4.9.

8 Volume Computations

Our next goal is to compute the 2n-dimensional volume of the image of the unit ball in C n under the mapping \(z\mapsto {z}^{\otimes m}\). As a warm-up, suppose n = 1. Then the map \(z\mapsto {z}^{m}\) covers the ball m times, and hence the area of the image with multiplicity counted is πm. We get the same answer using integrals:

$$\displaystyle{ A =\int _{B_{1}}\vert m{z}^{m-1}{\vert }^{2}dV = {m}^{2}\int _{ 0}^{2\pi }\int _{ 0}^{1}{r}^{2m-1}drd\theta = {m}^{2}{ 2\pi \over 2m} =\pi m. }$$
(55)

In order to help us do computations and to simplify the notation, we recall and extend our discussion of multi-index notation from Sect. 4. A multi-index α is an n-tuple \(\alpha = (\alpha _{1},\ldots,\alpha _{n})\) of nonnegative numbers, not necessarily integers. When the α j are integers, we write \(\vert \alpha \vert =\sum _{ j=1}^{n}\alpha _{j}\) and \(\alpha ! =\prod _{ j=1}^{n}\alpha _{j}!\). In case d = | α |, we write multinomial coefficients using multi-indices:

$$\displaystyle{{d\choose \alpha } ={ d! \over \alpha !} ={ d! \over \alpha _{1}!\ldots \alpha _{n}!}.}$$

Multi-indices are especially useful for writing polynomials and power series. If zC n, we write

$$\displaystyle{{z}^{\alpha } =\prod _{ j=1}^{n}{(z_{ j})}^{\alpha _{j} }}$$
$$\displaystyle{\vert z{\vert }^{2\alpha } =\prod _{ j=1}^{n}\vert z_{ j}{\vert }^{2\alpha _{j} }.}$$

The multinomial theorem gives the following result from Sect. 4:

$$\displaystyle{\vert \vert z\vert {\vert }^{2d} = {(\sum _{ j=1}^{n}\vert z_{ j}{\vert }^{2})}^{d} =\sum _{ \vert \alpha \vert =d}{d\choose \alpha }\vert z{\vert }^{2\alpha }.}$$

In order to help us find volumes in higher dimensions, we introduce the Γ-function. For x > 0, we let Γ(x) denote the Gamma function:

$$\displaystyle{\Gamma (x) =\int _{ 0}^{\infty }{e}^{-t}{t}^{x-1}dt.}$$

The integral is improper at t = 0 for x < 1, but it converges there for x > 0. When n is an integer and n ≥ 0, then \(\Gamma (n + 1) = n!\). More generally, \(\Gamma (x + 1) = x\Gamma (x)\). This property enables one to extend the definition of the Γ-function. The integral defining Γ(x) converges when x is complex and Re(x) > 0. The formula \(\Gamma (x + 1) = x\Gamma (x)\) provides a definition when − 1 < Re(x) < 0 and, by induction, a definition whenever Re(x) is not a negative integer or zero (Fig. 4.5).

Figure 4.5
figure 5

The Gamma function

It is useful to know that \(\Gamma ({1 \over 2}) = \sqrt{\pi }\). Exercise 4.55 asks for a proof; the result is equivalent to the evaluation of the Gaussian integral from Proposition 3.4. One squares the integral and changes variables appropriately.

Let K + denote the part of the unit ball in R n lying in the first orthant; that is, \(K_{+} =\{ x:\sum x_{j}^{2} \leq 1\ \mathrm{and}\ x_{j} \geq 0\ \mathrm{for\ all}\ j\}\). Let α be an n-tuple of positive real numbers. We define an n-dimensional analogue of the Euler Beta function by

$$\displaystyle{ \mathcal{B}(\alpha ) ={ \prod \Gamma (\alpha _{j}) \over \Gamma (\vert \alpha \vert )}. }$$
(56)

The expression (56) is the value of a certain integral

$$\displaystyle{ \mathcal{B}(\alpha ) = {2}^{n}\vert \alpha \vert \int _{ K_{+}}{\mathbf{r}}^{2\alpha -1}dV (\mathbf{r}). }$$
(57)

Note the use of multi-index notation in (57); 2α − 1 means the multi-index whose j-th entry is 2α j − 1. Thus r 2α − 1 means

$$\displaystyle{\prod _{j=1}^{n}\mathbf{r}_{ j}^{2\alpha _{j}-1}.}$$

The notation \(\mathbf{r} = (\mathbf{r}_{1},\ldots,\mathbf{r}_{n})\) has a specific purpose. Certain integrals over balls in C n (See Lemma 4.10) reduce to integrals such as (57) when we use polar coordinates in each variable separately; that is, \(z_{j} = \mathbf{r}_{j}{e}^{i\theta _{j}}\).

Corollary 4.8.

The volume of the unit ball in R n is \({ \Gamma {({ 1 \over 2} )}^{n} \over \Gamma ({n \over 2} +1)}\) .

Proof.

Put \(\alpha = ({1 \over 2},{ 1 \over 2},\ldots,{ 1 \over 2})\) in (57) and use (56). □
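As a quick sanity check (not from the text), the formula of Corollary 4.8 can be evaluated with the standard library's gamma function and compared with familiar low-dimensional volumes:

```python
import math

# Vol(B^n) = Gamma(1/2)^n / Gamma(n/2 + 1), as in Corollary 4.8.
def ball_volume(n):
    return math.gamma(0.5) ** n / math.gamma(n / 2 + 1)

print(ball_volume(1))  # 2.0        (the interval [-1, 1])
print(ball_volume(2))  # 3.14159... (pi, the unit disk)
print(ball_volume(3))  # 4.18879... (4*pi/3)
```

Exercise 4.56 asks for the same formula rewritten with explicit powers of π.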

Exercise 4.54.

Verify that \(\Gamma (x + 1) = x\Gamma (x)\) and \(\Gamma (n + 1) = n!\).

Exercise 4.55.

Show that \(\Gamma ({1 \over 2}) = \sqrt{\pi }\).

Exercise 4.56.

Express the formula for the volume of the unit ball in R n in the form c n π n. (Use the previous two exercises.)

Exercise 4.57.

Put \(\beta (a,b) =\int _{ 0}^{1}{t}^{a-1}{(1 - t)}^{b-1}dt\) for a, b > 0. This integral is the classical Euler Beta function. By first computing Γ(a)Γ(b), evaluate it in terms of the Γ-function. Explain the relationship with (57).

Exercise 4.58.

Prove that (56) and (57) are equivalent.

Remark 4.14.

Integrals of the form \(\int _{0}^{2\pi }{\mathrm{cos}}^{k}(\theta ){\mathrm{sin}}^{l}(\theta )d\theta\) (for integer exponents) are easily evaluated by using the complex form of the exponential. Integrals of the form \(\int _{0}^{{ \pi \over 2} }{\mathrm{cos}}^{k}(\theta ){\mathrm{sin}}^{l}(\theta )d\theta\) are harder. Such integrals reduce to Beta functions:

$$\displaystyle{\beta (a,b) =\int _{ 0}^{1}{t}^{a-1}{(1 - t)}^{b-1}dt = 2\int _{ 0}^{{ \pi \over 2} }{\mathrm{sin}}^{2a-1}(\theta ){\mathrm{cos}}^{2b-1}(\theta )d\theta,}$$

even when a and b are not integers.
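The reduction to Beta functions can be tested numerically for non-integer exponents. The sketch below (mine, not from the text) compares a midpoint-rule quadrature of the trigonometric integral with the Γ-quotient for a = 3∕2, b = 5∕2, where β(a, b) = π∕16:

```python
import math

# Check  beta(a, b) = Gamma(a)Gamma(b)/Gamma(a+b)
#                   = 2 * int_0^{pi/2} sin^(2a-1) cos^(2b-1) dtheta
# with the non-integer values a = 3/2, b = 5/2.
a, b = 1.5, 2.5
n = 100000
total = 0.0
for k in range(n):
    t = (math.pi / 2) * (k + 0.5) / n        # midpoint rule
    total += math.sin(t) ** (2 * a - 1) * math.cos(t) ** (2 * b - 1)
total *= 2 * (math.pi / 2) / n
exact = math.gamma(a) * math.gamma(b) / math.gamma(a + b)
print(total, exact)  # both ~ 0.1963495... = pi/16
```

The same comparison works for any a, b > 0; only the exact value π∕16 is special to this choice.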

Exercise 4.59.

Use the Euler Beta function to verify the following duplication formula for the Γ function.

$$\displaystyle{{ \Gamma (x) \over \Gamma (2x)} = {2}^{1-2x}{ \Gamma ({1 \over 2}) \over \Gamma (x +{ 1 \over 2})}. }$$
(58)

Suggestion: First multiply both sides by Γ(x). The left-hand side of the result is then β(x, x). Write it as a single integral over [0, 1] as in Exercise 4.57. Rewrite by symmetry as twice the integral over \([0,{ 1 \over 2}]\). Then change variables by \(2t = 1 -\sqrt{s}\). You will obtain \({2}^{1-2x}\beta (x,{ 1 \over 2})\) and (58) follows.

Exercise 4.60.

Put \(\phi (x,y) ={ \Gamma (x)\Gamma (x+y) \over \Gamma (2x)\Gamma (y)}\). Find \(\phi (x,{ 1 \over 2})\) and \(\phi (x,{ 3 \over 2})\). Show that

$$\displaystyle{\phi (x,{ 5 \over 2}) = {2}^{1-2x}{(1 + 2x)(3 + 2x) \over 3}.}$$

Exercise 4.61.

(Difficult) Verify the following formula for Γ(z)Γ(1 − z):

$$\displaystyle{\Gamma (z)\Gamma (1 - z) ={ \pi \over \mathrm{sin}(\pi z)}.}$$

Suggestion: First obtain a Beta function integral. Convert it to an integral over \([0,\infty )\). Then use contour integration. The computation is valid for all complex numbers z except the integers. See also Exercise 3.45.

The Γ function also arises naturally in the following exercise.

Exercise 4.62 (For those who know probability).

Let X be a Gaussian random variable with mean 0 and variance σ 2. Use the fundamental theorem of calculus to find the density of the random variable X 2. The answer is called the Γ-density with parameters \({1 \over 2}\) and \({ 1 \over {2\sigma }^{2}}\). Use this method to show that \(\Gamma ({1 \over 2}) = \sqrt{\pi }\).

We will evaluate several integrals using the n-dimensional Beta function. Recall the notation \(\vert z{\vert }^{2\alpha } =\prod \vert z_{j}{\vert }^{2\alpha _{j}}\) used in (60).

Lemma 4.10.

Let d be a nonnegative integer, and let α be a multi-index of nonnegative real numbers. Let B n denote the unit ball in C n . Then

$$\displaystyle{ \int _{B_{n}}\vert \vert z\vert {\vert }^{2d}dV ={ {\pi }^{n} \over (n - 1)!(n + d)}. }$$
(59)
$$\displaystyle{ \int _{B_{n}}\vert z{\vert }^{2\alpha }dV ={ {\pi }^{n} \over n + \vert \alpha \vert }\mathcal{B}(\alpha +1). }$$
(60)

Proof.

We use polar coordinates in each variable separately; to evaluate (59), we have

$$\displaystyle{I =\int _{B_{n}}\vert \vert z\vert {\vert }^{2d}dV _{ 2n} = {(2\pi )}^{n}\int _{ K_{+}}\vert \vert \mathbf{r}\vert {\vert }^{2d}\prod \mathbf{r}_{ j}dV _{n}.}$$

We then expand || r ||2d using the multinomial theorem to obtain (61)

$$\displaystyle{ I ={ \pi }^{n}{2}^{n}\sum _{ \vert \gamma \vert =d}{d\choose \gamma }\int _{K_{+}}{\mathbf{r}}^{2\gamma +1}dV _{ n}. }$$
(61)

Using formulas (56) and (57) for the Beta function in (61) we obtain

$$\displaystyle{ I ={ \pi }^{n}\sum _{ \vert \gamma \vert =d}{d\choose \gamma }{\mathcal{B}(\gamma +1) \over \vert \gamma + 1\vert } ={ \pi }^{n}\sum _{ \vert \gamma \vert =d}{ d! \over \prod \gamma _{j}!}\,{ \prod \gamma _{j}! \over (d + n)\Gamma (d + n)} ={ \pi }^{n}{ d! \over (d + n)!}\sum _{\vert \gamma \vert =d}1. }$$
(62)

By Exercise 4.30, the number of independent homogeneous monomials of degree d in n variables is \({n + d - 1\choose d}\). We replace the sum in the last term in (62) with this number to obtain the desired result:

$$\displaystyle{ I ={ \pi }^{n}{ d! \over (d + n)!}\,{ (n + d - 1)! \over (n - 1)!\,d!} ={ {\pi }^{n} \over (n - 1)!(n + d)}. }$$
(63)

The calculation of (60) is similar but easier, as there is no summation to compute:

$$\displaystyle{\int _{B_{n}}\vert z{\vert }^{2\alpha }dV _{ 2n} = {(2\pi )}^{n}\int _{ K_{+}}{\mathbf{r}}^{2\alpha +1}dV _{ n} ={ \pi }^{n}{\mathcal{B}(\alpha +1) \over \vert \alpha \vert + n}.}$$

For convenience we write (60) when n = 2 and a, b are integers:

$$\displaystyle{ \int _{B_{2}}\vert z{\vert }^{2a}\vert w{\vert }^{2b}dV _{ 4} ={ {\pi }^{2}a!\,b! \over (a + b + 2)!}. }$$
(64)
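Formula (59) can be confirmed numerically for n = 2. Following the proof, polar coordinates in each complex variable and then polar coordinates (ρ, φ) in (r₁, r₂) reduce the integral to a product of one-dimensional integrals; the sketch below (my own) evaluates these by the midpoint rule for d = 3:

```python
import math

# Check (59) for n = 2:  int_{B_2} ||z||^(2d) dV_4 = pi^2 / ((2-1)!(2+d)).
# With polar coordinates in each complex variable and then (rho, phi)
# polar coordinates in (r1, r2), the integral becomes
#   (2*pi)^2 * int_0^1 rho^(2d+3) drho * int_0^{pi/2} cos(phi) sin(phi) dphi.
d = 3
n = 4000  # midpoint-rule points for each one-dimensional factor
radial = sum(((i + 0.5) / n) ** (2 * d + 3) for i in range(n)) / n
angular = sum(math.cos((math.pi / 2) * (j + 0.5) / n)
              * math.sin((math.pi / 2) * (j + 0.5) / n)
              for j in range(n)) * (math.pi / 2) / n
total = (2 * math.pi) ** 2 * radial * angular
print(total, math.pi ** 2 / (2 + d))  # both ~ 1.97392
```

Here the two factors are ∫₀¹ρ⁹dρ = 1∕10 and ∫₀^{π∕2}cos φ sin φ dφ = 1∕2, so the exact value is π²∕5.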

We return to the homogeneous mapping H m (z). We consider \(H_{m}: B_{k} \rightarrow {\mathbf{C}}^{N}\), where \(N ={ k + m - 1\choose k - 1}\), the dimension of the space of homogeneous polynomials of degree m in k variables. We use the following lemma to find an explicit formula (Theorem 4.9) for the 2k-dimensional volume (with multiplicity counted) of the image of the unit ball under H m .

Lemma 4.11.

The k-th power of the pullback, \({(H_{m}^{{\ast}}(\Psi ))}^{k}\), satisfies the following:

$$\displaystyle{ {(H_{m}^{{\ast}}(\Psi ))}^{k} = {m}^{k+1}k!\vert \vert z\vert {\vert }^{2k(m-1)}dV _{ 2k}. }$$
(65)

Proof.

Note first that \({(H_{m}^{{\ast}}(\Psi ))}^{k}\) is a smooth (2k)-form, and hence a multiple τ of dV 2k . Note next that H m is invariant under unitary transformations, and therefore τ must be a function of || z ||2. Since H m is homogeneous of degree m, each first derivative is homogeneous of degree m − 1. The (1, 1) form H m (Ψ) must then have coefficients that are bihomogeneous of degree \((m - 1,m - 1)\). The coefficient τ of its k-th power must be homogeneous of degree 2k(m − 1). Combining the homogeneity with the dependence on || z ||2 gives the desired expression, except for evaluating the constant m k + 1 k!.

For simplicity we write \(\vert dz_{j}{\vert }^{2}\) for \(dz_{j} \wedge d\overline{z}_{j}\). To evaluate the constant it suffices to compute the coefficient of \(\vert z_{1}{\vert }^{2k(m-1)}\). To do so, we compute dH m and then work modulo \(z_{2},\ldots,z_{k}\). Thus, in the formula for \({(H_{m}^{{\ast}}(\Psi ))}^{k}\), we set all variables equal to zero except the first. Doing so yields

$$\displaystyle{ H_{m}^{{\ast}}(\Psi ) = {m}^{2}\vert z_{ 1}{\vert }^{2m-2}\vert dz_{ 1}{\vert }^{2} + m\vert z_{ 1}{\vert }^{2m-2}\sum _{ j=2}^{k}\vert dz_{ j}{\vert }^{2}. }$$
(66)

From (66) it suffices to compute

$$\displaystyle{ {({m}^{2}\vert dz_{ 1}{\vert }^{2} + m\sum _{ j=2}^{k}\vert dz_{ j}{\vert }^{2})}^{k}. }$$
(67)

Expanding (67) yields

$$\displaystyle{k!{m}^{k+1}dz_{ 1} \wedge d\overline{z}_{1} \wedge \ldots \wedge dz_{k} \wedge d\overline{z}_{k},}$$

and (65) follows by putting the factor \(\vert z_{1}{\vert }^{(2m-2)k}\) from (66) back in. □

Theorem 4.9.

Let f: B n → B K be a proper complex analytic homogeneous polynomial mapping of degree m. The 2n-dimensional volume V f (with multiplicity counted) is given by

$$\displaystyle{ V _{f} ={ {m}^{n}{\pi }^{n} \over n!}. }$$
(68)

Proof.

Consider the function || f ||2. Since

$$\displaystyle{\vert \vert f(z)\vert {\vert }^{2} = 1 = \vert \vert z\vert {\vert }^{2m} = \vert \vert H_{ m}(z)\vert {\vert }^{2}}$$

on the unit sphere, and both f and H m are homogeneous, this equality holds everywhere. Hence, \(\vert \vert f\vert {\vert }^{2} = \vert \vert H_{m}\vert {\vert }^{2}\), and these two functions have the same complex Hessian determinant. By Lemma 4.9 they determine the same volume form:

$$\displaystyle{\sum _{I}\vert J(f_{I}){\vert }^{2} =\sum _{ I}\vert J((H_{m})_{I}){\vert }^{2},}$$

and hence by Lemma 4.11

$$\displaystyle{V _{f} =\int _{B_{n}}{{(H_{m}^{{\ast}}(\Psi ))}^{n} \over n!} =\int _{B_{n}}{m}^{n+1}\vert \vert z\vert {\vert }^{2n(m-1)}dV _{ 2n}.}$$

Lemma 4.10 yields

$$\displaystyle{V _{f} ={ {m}^{n+1}{\pi }^{n} \over n(m - 1) + n}\,{ 1 \over (n - 1)!} ={ {m}^{n}{\pi }^{n} \over n!}.}$$

As a check we observe, when m = 1, that \(V _{f} ={ {\pi }^{n} \over n!}\), which is the volume of B n . When n = 1, we obtain V f = πm, also the correct result, as noted in (55). □
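The last step of the proof is an identity among factorials. The following sketch (mine, using exact rational arithmetic) checks that the coefficient of π^n in the integral of m^{n+1}||z||^{2n(m−1)}, computed from (59), equals m^n∕n! for a range of n and m:

```python
from fractions import Fraction
from math import factorial

# By (59) with d = n(m-1), the coefficient of pi^n in
#   int_{B_n} m^(n+1) ||z||^(2n(m-1)) dV
# is m^(n+1) / ((n-1)! * (n + d)), which should simplify to m^n / n!.
def coeff(n, m):
    d = n * (m - 1)
    return Fraction(m ** (n + 1), factorial(n - 1) * (n + d))

for n in range(1, 6):
    for m in range(1, 6):
        assert coeff(n, m) == Fraction(m ** n, factorial(n))
print(coeff(2, 2))  # 2, i.e. V_f = 2*pi^2 for n = m = 2
```

The simplification is just m^{n+1}∕((n−1)! · nm) = m^n∕n!, but the exhaustive check guards against slips in the bookkeeping.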

The factor of m n in (68) arises because the image of the unit sphere in C n covers a subset of the unit sphere in the target m times. Compare with item (2) of Example 4.11.

9 Inequalities

We are now ready to state a sharp inequality in Theorem 4.10. The proof of this volume comparison result combines Theorems 4.6, 4.9, and 4.11 (proved below). Theorem 4.11 generalizes Proposition 4.2 to higher dimensions. Our proof here uses differential forms; the result can also be proved by elaborate computation. See [D4] for the computational proof.

Theorem 4.10.

Let \(p: {\mathbf{C}}^{n} \rightarrow {\mathbf{C}}^{N}\) be a polynomial mapping of degree m. Assume that \(p({S}^{2n-1}) \subseteq {S}^{2N-1}\) . Then \(V _{p} \leq { {m}^{n}{\pi }^{n} \over n!}\) . Equality holds if and only if p is homogeneous of degree m.

Proof.

If p is a constant mapping, then m = 0 and the conclusion holds. When p is homogeneous of degree m, the result is Theorem 4.9. When p is not homogeneous, we apply the process from Theorem 4.6 until we obtain a homogeneous mapping. The key point is that the operation of tensoring with z on a subspace A increases the volume of the image, in analogy with Proposition 4.2. Since tensoring on a k-dimensional subspace gives the same result as tensoring k times on one-dimensional subspaces, we need only show that the volume of the image increases if we tensor on a one-dimensional space.

We must therefore establish the following statement, which we state and prove as Theorem 4.11 below. Put \(f = (f_{1},\ldots,f_{N})\). Put

$$\displaystyle{ g = (z_{1}f_{1},\ldots,z_{n}f_{1},f_{2},\ldots,f_{N}). }$$
(69)

Then \(V _{f} \leq V _{g}\), with equality only if \(f_{1} = 0\).

Each tensor operation from Theorem 4.6 then increases the volume. We stop when we reach a homogeneous map. Theorem 4.9 then gives the volume \({ {m}^{n}{\pi }^{n} \over n!}\), the stated upper bound. □

With g as in (69), we need to verify that \(V _{f} \leq V _{g}\). We proved this result (Corollary 4.2) when \(n = N = 1\), in two ways. As noted above, one can prove the general result in both fashions. We give the proof involving a boundary integral.

Let us first recall what we mean by the volume form on the unit sphere in R N. It is convenient to introduce the notion of interior multiplication. Assume η is a k-form, and write

$$\displaystyle{\eta = dx_{j} \wedge \tau +\mu,}$$

where μ does not contain dx j . The contraction in the j-th direction, or interior product with \({ \partial \over \partial x_{j}}\), is the (k − 1)-form I j (η), defined by I j (η) = τ. Informally speaking, we are eliminating dx j from η. More precisely, we define I j (η) by its action on vectors \(v_{2},\ldots,v_{k}\):

$$\displaystyle{I_{j}(\eta )(v_{2},\ldots,v_{k}) =\eta ({ \partial \over \partial x_{j}},v_{2},\ldots,v_{k}).}$$

We use this notation to write a standard expression from calculus. The Euclidean (N − 1)-dimensional volume form on the sphere is given by

$$\displaystyle{\sigma _{N-1} =\sum _{ j=1}^{N}x_{ j}{(-1)}^{j+1}I_{ j}(dx_{1} \wedge \ldots \wedge dx_{N}).}$$

For example, when N = 2 (and x, y are the variables), we have \(\sigma _{1} = xdy - ydx\). When N = 3 (and x, y, z are the variables), we have

$$\displaystyle{\sigma _{2} = x\ dy \wedge dz - y\ dx \wedge dz + z\ dx \wedge dy.}$$

Note that \(d\sigma _{N-1} = N\ dV _{N}\), where dV N is the volume form on Euclidean space. It follows immediately from Stokes’ theorem that the (N − 1)-dimensional volume of the unit sphere is N times the N-dimensional volume of the unit ball.

Remark 4.15.

In the previous paragraph, σ N − 1 is a differential form, and dσ N − 1 is its exterior derivative. Calculus books often write dS for the surface area form (and ds for the arc-length form), even though these objects are not differential forms. The symbol d is simply irresistible.

Exercise 4.63.

Verify the following formulas for the (N − 1)-dimensional volume W N of the unit sphere in R N:

  • W 1 = 2.

  • W 2 = 2π.

  • W 3 = 4π.

  • W 4 = 2π 2.

  • \(W_{5} ={ 8 \over 3}{\pi }^{2}\).
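These values follow from the standard formula \(W_{N} = 2{\pi }^{N/2}/\Gamma (N/2)\), which also exhibits the relation \(W_{N} = N\,\mathrm{Vol}(B_{N})\) noted above. The Gamma-function formula is not derived in the text; we use it here only for a machine check:

```python
import sympy as sp

def sphere_volume(N):
    # (N-1)-dimensional volume of the unit sphere in R^N
    return 2 * sp.pi**sp.Rational(N, 2) / sp.gamma(sp.Rational(N, 2))

def ball_volume(N):
    # N-dimensional volume of the unit ball in R^N
    return sp.pi**sp.Rational(N, 2) / sp.gamma(sp.Rational(N, 2) + 1)

expected = [2, 2*sp.pi, 4*sp.pi, 2*sp.pi**2, sp.Rational(8, 3)*sp.pi**2]
for N, w in enumerate(expected, start=1):
    assert sp.simplify(sphere_volume(N) - w) == 0
    # Stokes' theorem: sphere volume = N * ball volume
    assert sp.simplify(sphere_volume(N) - N * ball_volume(N)) == 0
print("all five values check")
```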

Put ρ(z) = || z ||2. The unit sphere S 2n − 1 is the set of points where ρ = 1. The differential form dρ is orthogonal to the sphere at each point, and the cotangent space to the sphere is the orthogonal complement of dρ. The decomposition \(d\rho = \partial \rho + \overline{\partial }\rho\) will be crucial to our proof. Since dρ is orthogonal to the sphere, we may use the relation \(\partial \rho = -\overline{\partial }\rho\) when doing integrals over the sphere.

We can express the form σ 2n − 1 in terms of complex variables. Let \(W_{j\overline{j}}\) denote the (2n − 2)-form defined by eliminating \(dz_{j} \wedge d\overline{z}_{j}\) from \(dz_{1} \wedge d\overline{z}_{1} \wedge \ldots \wedge dz_{n} \wedge d\overline{z}_{n}\). For 1 ≤ j ≤ n, put \(z_{j} = x_{j} + iy_{j}\). Write \(x_{j} ={ z_{j}+\overline{z}_{j} \over 2}\) and \(y_{j} ={ z_{j}-\overline{z}_{j} \over 2i}\). Substituting in the form σ 2n − 1 and collecting terms, we obtain

$$\displaystyle{ \sigma _{2n-1} = {({ i \over 2})}^{n}\sum _{ j=1}^{n}\left (z_{ j}d\overline{z}_{j} -\overline{z}_{j}dz_{j}\right ) \wedge W_{j\overline{j}}. }$$
(70)

As a check, we note when n = 1 that this expression equals \({ i \over 2}(zd\overline{z} -\overline{z}dz)\). Putting \(z = {e}^{i\theta }\) then yields dθ, as expected. As a second check, we compute d of the right-hand side of (70), using \(dz_{j} \wedge d\overline{z}_{j} = -2i\ dx_{j} \wedge dy_{j}\), obtaining

$$\displaystyle{{({ i \over 2})}^{n}(2n){(-2i)}^{n}dV _{ 2n} = 2n\ dV _{2n},}$$

as expected (since we are in 2n real dimensions).

With these preparations we can finally show that the tensor product operation increases volumes; in other words, V Ef > V f (unless f 1 = 0).

Theorem 4.11.

Assume that f = (f 1 ,…,f N ) is complex analytic on the unit ball B n in C n . Define the partial tensor product Ef by

$$\displaystyle{Ef = (z_{1}f_{1},z_{2}f_{1},\ldots,z_{n}f_{1},f_{2},\ldots,f_{N}).}$$

Then V Ef > V f unless f 1 = 0.

Proof.

We prove the result assuming f has a continuously differentiable extension to the boundary sphere. [D4] has a proof without this assumption.

Recall that \(V _{f} =\int _{B_{n}}\vert \vert Jf\vert {\vert }^{2}dV\). Here, as in (53), Jf denotes all possible Jacobians formed by selecting n of the components of f. In case f is an equi-dimensional mapping, we also have

$$\displaystyle{ V _{f} = c_{n}\int _{B_{n}}\partial f_{1} \wedge \overline{\partial f_{1}} \wedge \partial f_{2} \wedge \overline{\partial f_{2}} \wedge \ldots \wedge \partial f_{n} \wedge \overline{\partial f_{n}}. }$$
(71)

In general V f is a sum of integrals, as in (71), over all choices of n components. The constant c n equals \({({ i \over 2})}^{n}\); see the discussion near Definition 4.16.

We want to compute \(V _{Ef} =\int _{B_{n}}\vert \vert J(Ef)\vert {\vert }^{2}dV\). Many terms arise. We partition these terms into three types. Type I terms are those for which the n functions selected among the components of Ef include none of the functions z j f 1 for 1 ≤ j ≤ n. These terms also arise when computing V f . Hence terms of type I drop out when computing the difference \(V _{Ef} - V _{f}\), and we may ignore them. Type II terms are those for which we select at least two of the functions z j f 1. These terms arise in the computation of V Ef , but not in the computation of V f . All of these terms thus contribute nonnegatively. The type III terms remain. They are of the form \((z_{j}f_{1},f_{i_{2}},\ldots,f_{i_{n}})\). We will show, for each choice \((f_{i_{2}},\ldots,f_{i_{n}})\) of n − 1 of the functions \(f_{2},\ldots,f_{N}\), that the sum on j of the volumes of the images of \((z_{j}f_{1},f_{i_{2}},\ldots,f_{i_{n}})\) is at least as large as the volume of the image of the map \((f_{1},f_{i_{2}},\ldots,f_{i_{n}})\). Combining these conclusions shows that \(V _{Ef} \geq V _{f}\).

For simplicity of notation, let us write the (n − 1)-tuple as \((f_{2},\ldots,f_{n})\). By the above paragraph, it suffices to prove the result when \(f = (f_{1},\ldots,f_{n})\) is an equi-dimensional mapping. In the rest of the proof, we let f denote this n-tuple.

Since f 1 is complex analytic, df 1 = ∂ f 1. We can therefore write the form in (71) as an exact form and then apply Stokes’ theorem to get

$$\displaystyle{V _{f} = c_{n}\int _{B_{n}}d(f_{1} \wedge \overline{\partial f_{1}} \wedge \partial f_{2} \wedge \overline{\partial f_{2}} \wedge \ldots \wedge \partial f_{n} \wedge \overline{\partial f_{n}})}$$
$$\displaystyle{ = c_{n}\int _{{S}^{2n-1}}f_{1} \wedge \overline{\partial f_{1}} \wedge \partial f_{2} \wedge \overline{\partial f_{2}} \wedge \ldots \wedge \partial f_{n} \wedge \overline{\partial f_{n}}. }$$
(72)

For 1 ≤ j ≤ n we replace f 1 in (72) with z j f 1 and sum, obtaining

$$\displaystyle{ V _{Ef} \geq c_{n}\sum _{j=1}^{n}\int _{{ S}^{2n-1}}z_{j}f_{1} \wedge \overline{\partial (z_{j}f_{1})} \wedge \partial f_{2} \wedge \overline{\partial f_{2}} \wedge \ldots \wedge \partial f_{n} \wedge \overline{\partial f_{n}}. }$$
(73)

Note that \(\partial (z_{j}f_{1}) = f_{1}dz_{j} + z_{j}df_{1}\) by the product rule. Using this formula in (73) and then subtracting (72) from (73) shows that the excess is at least

$$\displaystyle{V _{Ef} - V _{f} \geq c_{n}\int _{{S}^{2n-1}}(\sum _{j=1}^{n}\vert z_{ j}{\vert }^{2} - 1)f_{ 1}\overline{\partial f_{1}} \wedge \partial f_{2} \wedge \overline{\partial f_{2}} \wedge \ldots \wedge \partial f_{n} \wedge \overline{\partial f_{n}}}$$
$$\displaystyle{ +c_{n}\int _{{S}^{2n-1}}\vert f_{1}{\vert }^{2}(\sum _{ j=1}^{n}z_{ j}d\overline{z}_{j}) \wedge \partial f_{2} \wedge \overline{\partial f_{2}} \wedge \ldots \wedge \partial f_{n} \wedge \overline{\partial f_{n}}. }$$
(74)

Since \(\sum \vert z_{j}{\vert }^{2} = 1\) on the sphere, the expression in the top line of (74) vanishes. We claim that the other term is nonnegative. We will show that the form

$$\displaystyle{c_{n}\vert f_{1}{\vert }^{2}(\sum _{ j=1}^{n}z_{ j}d\overline{z}_{j}) \wedge \partial f_{2} \wedge \overline{\partial f_{2}} \wedge \ldots \wedge \partial f_{n} \wedge \overline{\partial f_{n}}}$$

arising in (74) is a nonnegative multiple of the real (2n − 1)-dimensional volume form on the sphere, and hence, its integral is nonnegative.

It suffices to prove that the form

$$\displaystyle{ \eta = c_{n}\overline{\partial }\rho \wedge \partial f_{2} \wedge \overline{\partial f_{2}} \wedge \ldots \wedge \partial f_{n} \wedge \overline{\partial f_{n}} }$$
(75)

is a nonnegative multiple of the volume form on the sphere.

Note that ∂ f j = df j , because f j is complex analytic. We wish to write df j in terms of a particular basis of 1-forms. We would like to find independent differential 1-forms \(\omega _{1},\ldots,\omega _{n-1}\) with the following properties. Each of these forms involves only the dz j (not the \(d\overline{z}_{j}\)). Each ω j is in the cotangent space to the sphere. Finally, these forms, their conjugates, and the additional forms ∂ ρ and \(\overline{\partial }\rho\) are linearly independent at each point. Doing so is not generally possible, but we can always find \(\omega _{1},\ldots,\omega _{n-1}\) such that linear independence holds except on a small set. After the proof (Remark 4.17), we explain how to do so.

Given these forms, we work on the set U where linear independence holds. We compute the exterior derivatives of the f j for 2 ≤ j ≤ n in terms of this basis:

$$\displaystyle{df_{j} = \partial f_{j} =\sum _{ k=1}^{n-1}B_{ jk}\omega _{k} + B_{j}\partial \rho.}$$

On the intersection of U and the sphere, we obtain

$$\displaystyle{df_{j} = \partial f_{j} =\sum _{ k=1}^{n-1}B_{ jk}\omega _{k} + B_{j}\partial \rho =\sum _{ k=1}^{n-1}B_{ jk}\omega _{k} - B_{j}\overline{\partial }\rho.}$$
$$\displaystyle{\overline{\partial f_{j}} =\sum _{ k=1}^{n-1}\overline{B_{ jk}}\overline{\omega }_{k} + \overline{B}_{j}\overline{\partial }\rho.}$$

In these formulas, B jk denotes the coefficient function; B jk can be written \(L_{k}(f_{j})\) for complex vector fields L k dual to the ω k .

These formulas allow us to compute the wedge product in (75) very easily. We can ignore all the terms involving the functions B j , because the wedge product of \(\overline{\partial }\rho\) with itself is 0. We obtain

$$\displaystyle{ \eta = c_{n}\vert \mathrm{det}(B_{jk}){\vert }^{2}\overline{\partial }\rho \wedge \omega _{ 1} \wedge \overline{\omega }_{1} \wedge \ldots \wedge \omega _{n-1} \wedge \overline{\omega }_{n-1}. }$$
(76)

In (76), the index k runs from 1 to n − 1, and the index j runs from 2 to n. Hence, it makes sense to take the determinant of the square matrix B jk of functions. Since the ω k and their conjugates are orthogonal to the normal direction dρ, the form in (76) is a nonnegative multiple of σ 2n − 1.

We have verified that V Ef V f ≥ 0. □

Remark 4.16.

Let f and Ef be as in Theorem 4.11. Assume f 1 is not identically 0. For all z in the ball, || (Ef)(z) ||2 ≤ || f(z) ||2, with strict inequality except where f 1(z) = 0. There is no pointwise inequality relating \(\mathrm{det}\left ((\vert \vert Ef\vert {\vert }^{2})_{j\overline{k}}\right )\) and \(\mathrm{det}\left ((\vert \vert f\vert {\vert }^{2})_{j\overline{k}}\right )\). But, Theorem 4.11 and Lemma 4.9 yield

$$\displaystyle{\int \mathrm{det}\left ((\vert \vert Ef\vert {\vert }^{2})_{ j\overline{k}}\right )dV >\int \mathrm{det}\left ((\vert \vert f\vert {\vert }^{2})_{ j\overline{k}}\right )dV.}$$

Thus || Ef ||2 is (pointwise) smaller than || f ||2; there is no pointwise inequality between their Hessian determinants; but the average value (integral) of the Hessian determinant of || Ef ||2 is larger than that of || f ||2.

Remark 4.17.

We show how to construct the 1-forms used in the proof. First consider S 3C 2. We can put \(\omega _{1} = z\ dw - w\ dz\). Then, except at the origin, the four 1-forms \(\omega _{1},\overline{\omega }_{1},\partial \rho,\overline{\partial }\rho\) do the job. The three 1-forms \(\omega _{1},\overline{\omega }_{1},\partial \rho -\overline{\partial }\rho\) form a basis for the cotangent space at each point of the unit sphere.

In the higher-dimensional case, we work on the set U where z n ≠ 0. The complement of U in the sphere is a lower-dimensional sphere, and hence, a small set as far as integration is concerned. For 1 ≤ j ≤ n − 1, we define ω j by

$$\displaystyle{\omega _{j} ={ z_{n}\ dz_{j} - z_{j}\ dz_{n} \over \vert z_{j}{\vert }^{2} + \vert z_{n}{\vert }^{2}}.}$$

The forms ω j are linearly independent on U, and each is orthogonal to dρ. See the next section and Exercise 4.71 for their role in CR geometry.

We now discuss in more detail why η is a nonnegative multiple of the (2n − 1)-dimensional volume form on the sphere. One way to verify this fact is to introduce polar coordinates in each variable separately and compute. Thus, \(z_{j} = \mathbf{r}_{j}{e}^{i\theta _{j}}\), where each r j is nonnegative. On the unit sphere we have the relation \(\sum \mathbf{r}_{j}^{2} = 1\); it follows that \(\sum \mathbf{r}_{j}d\mathbf{r}_{j} = 0\) on the sphere. We therefore use all the θ j as coordinates, but we use only \(\mathbf{r}_{1},\ldots,\mathbf{r}_{n-1}\). The (2n − 1)-dimensional volume form on the sphere turns out to be (where the product is a wedge product)

$$\displaystyle{\left (\prod _{j=1}^{n-1}\mathbf{r}_{ j}d\mathbf{r}_{j} \wedge d\theta _{j}\right ) \wedge d\theta _{n}.}$$
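For n = 2, this form is \(\mathbf{r}_{1}\,d\mathbf{r}_{1} \wedge d\theta _{1} \wedge d\theta _{2}\); integrating it over \(0 \leq \mathbf{r}_{1} \leq 1\) and both angles over \([0,2\pi )\) should recover \(W_{4} = 2{\pi }^{2}\). A quick sympy check (ours, not the text's):

```python
import sympy as sp

r, theta1, theta2 = sp.symbols('r theta1 theta2', positive=True)
# 3-volume of S^3 computed from the polar-coordinate volume form
# r dr ^ dtheta1 ^ dtheta2 (here r = r_1, and r_2 = sqrt(1 - r^2)).
total = sp.integrate(r, (r, 0, 1), (theta1, 0, 2*sp.pi), (theta2, 0, 2*sp.pi))
print(total)  # 2*pi**2
```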

We continue this geometric approach by noting the following simple lemma, which expresses the Cauchy–Riemann equations in polar coordinates.

Lemma 4.12.

Assume h is complex analytic in one variable. Use polar coordinates \(z = r{e}^{i\theta }\). Then \({\partial h \over \partial \theta } = ri{\partial h \over \partial r}\) .

Proof.

We will use subscripts to denote partial derivatives in this proof. Since h is complex analytic, \(h_{\overline{z}} ={ \partial h \over \partial \overline{z}} = 0\). It follows that

$$\displaystyle{h_{r} ={ \partial h \over \partial r} ={ \partial h \over \partial z}{ \partial z \over \partial r} = h_{z}{e}^{i\theta }.}$$

Similarly,

$$\displaystyle{h_{\theta } ={ \partial h \over \partial \theta } ={ \partial h \over \partial z}{ \partial z \over \partial \theta } = h_{z}ri{e}^{i\theta } = rih_{ r}.}$$

Remark 4.18.

One can also prove Lemma 4.12 by observing that it suffices to check it for h(z) = z k, for each k.

Exercise 4.64.

Prove Lemma 4.12 as suggested in the Remark.
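A symbolic verification (which is not a proof, but complements the exercise) checks the polar Cauchy–Riemann relation on the monomials \(h(z) = {z}^{k}\); a sympy sketch of ours:

```python
import sympy as sp

r, theta = sp.symbols('r theta', positive=True)
for k in range(1, 6):
    h = (r * sp.exp(sp.I * theta))**k  # h(z) = z^k in polar coordinates
    # Lemma 4.12: dh/dtheta = r * i * dh/dr
    assert sp.simplify(sp.diff(h, theta) - r * sp.I * sp.diff(h, r)) == 0
print("Lemma 4.12 holds for z^k, k = 1..5")
```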

A continuously differentiable function of several complex variables is complex analytic if and only if it is complex analytic in each variable separately. (The same conclusion holds without the hypothesis of continuous differentiability, but this result, which we do not need, is much harder to prove.) The geometry of the sphere suggests, and the easier implication justifies, working in polar coordinates in each variable separately.

Put \(z_{j} = \mathbf{r}_{j}{e}^{i\theta _{j}}\) for 1 ≤ j ≤ n. Computation yields

$$\displaystyle{dz_{j} = {e}^{i\theta _{j} }d\mathbf{r}_{j} + i\mathbf{r}_{j}{e}^{i\theta _{j} }d\theta _{j}.}$$

Note that \(\sum _{1}^{n}\mathbf{r}_{j}d\mathbf{r}_{j} = 0\) on the sphere. We compute \(\overline{\partial }\rho =\sum _{ 1}^{n}z_{j}d\overline{z_{j}}\) as follows:

$$\displaystyle{\overline{\partial }\rho =\sum _{ j=1}^{n}z_{ j}d\overline{z}_{j} =\sum _{ j=1}^{n}\mathbf{r}_{ j}d\mathbf{r}_{j} - i\sum _{j=1}^{n}\mathbf{r}_{ j}^{2}d\theta _{ j} = -i(\sum _{j=1}^{n}\mathbf{r}_{ j}^{2}d\theta _{ j}).}$$

We can express the form η from (75) in terms of these new variables. We provide the details only when n = 2. For ease of notation, we write \(z = r{e}^{i\theta }\) and \(w = s{e}^{i\phi }\). We obtain

$$\displaystyle{ zd\overline{z} + wd\overline{w} = -i({r}^{2}d\theta + {s}^{2}d\phi ). }$$
(77)

We compute \(\partial g \wedge \overline{\partial g}\), where g = f 2 in (75). Now that we do not have subscripts on the functions, we can use subscripts to denote partial derivatives. Since g is complex analytic, we have

$$\displaystyle{\partial g = dg = g_{r}dr + g_{\theta }d\theta + g_{s}ds + g_{\phi }d\phi.}$$

The Cauchy–Riemann equations in polar coordinates give g θ = rig r and g ϕ = sig s . From these equations we find

$$\displaystyle{ \partial g = g_{r}(dr + ird\theta ) + g_{s}(ds + isd\phi ). }$$
(78)

We need to compute \(\partial g \wedge \overline{\partial g}\). Since \(rdr + sds = 0\) on the sphere, all terms involving \(dr \wedge ds\) vanish there, and we obtain

$$\displaystyle{\partial g \wedge \overline{\partial g} = \vert g_{r}{\vert }^{2}(-2ir\ dr \wedge d\theta ) + \vert g_{s}{\vert }^{2}(-2is\ ds \wedge d\phi )}$$
$$\displaystyle{+g_{r}\overline{g_{s}}(-is\ dr \wedge d\phi + ir\ d\theta \wedge ds + rs\ d\theta \wedge d\phi )}$$
$$\displaystyle{ +g_{s}\overline{g_{r}}(-is\ dr \wedge d\phi + ir\ d\theta \wedge ds - rs\ d\theta \wedge d\phi ). }$$
(79)

We wedge (77) with (79) and collect terms in the order \(dr \wedge d\theta \wedge d\phi\). The result is

$$\displaystyle{ (zd\overline{z} + wd\overline{w}) \wedge \partial g \wedge \overline{\partial g} = -2r\vert sg_{r} - rg_{s}{\vert }^{2}drd\theta d\phi. }$$
(80)

The form η in question is \({({ i \over 2})}^{2}\) times the expression in (80). Hence, we see that

$$\displaystyle{ \eta = \vert sg_{r} - rg_{s}{\vert }^{2}{r \over 2}drd\theta d\phi, }$$
(81)

which is a nonnegative multiple of the volume form rdrdθ d ϕ for the sphere.

We gain considerable insight by expressing \(sg_{r} - rg_{s}\) in terms of g z and g w . Using the chain rule and some manipulation, we get

$$\displaystyle{ \vert sg_{r} - rg_{s}{\vert }^{2} = \vert sg_{ z}z_{r} - rg_{w}w_{s}{\vert }^{2} = \vert s{e}^{i\theta }g_{ z} - r{e}^{i\phi }g_{ w}{\vert }^{2} = \vert \overline{w}g_{ z} -\overline{z}g_{w}{\vert }^{2}. }$$
(82)
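Identity (82) is easy to test symbolically on a sample analytic g; here we take g(z, w) = z 2 w (the sample and variable names are ours):

```python
import sympy as sp

r, s, theta, phi = sp.symbols('r s theta phi', positive=True)
z = r * sp.exp(sp.I * theta)  # z = r e^{i theta}
w = s * sp.exp(sp.I * phi)    # w = s e^{i phi}

g = z**2 * w                  # sample complex analytic g
g_r, g_s = sp.diff(g, r), sp.diff(g, s)
g_z, g_w = 2 * z * w, z**2    # complex derivatives of g, computed by hand

def sqmod(expr):              # |expr|^2 via expr * conjugate(expr)
    return sp.simplify(sp.expand_complex(expr * sp.conjugate(expr)))

lhs = sqmod(s * g_r - r * g_s)
rhs = sqmod(sp.conjugate(w) * g_z - sp.conjugate(z) * g_w)
assert sp.simplify(lhs - rhs) == 0
print("identity (82) checks for g = z^2 w")
```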

We can interpret (82) geometrically. Define a complex vector field L by

$$\displaystyle{ L = \overline{w}{ \partial \over \partial z} -\overline{z}{ \partial \over \partial w}. }$$
(83)

Then L is tangent to the unit sphere, and (81) and (82) yield \(\eta ={ 1 \over 2}\vert L(g){\vert }^{2}\ \sigma _{ 3}\). In the next section, we will interpret L in the context of CR Geometry.

Exercise 4.65.

Use polar coordinates to compute the form η from (75) in 3 complex dimensions.

Exercise 4.66.

Show that {z α}, as α ranges over all nonnegative integer multi-indices, is a complete orthogonal system for \({\mathcal{A}}^{2}\). Here \({\mathcal{A}}^{2}\) denotes the complex analytic functions in L 2(B n ).

Exercise 4.67.

Let \(c_{\alpha } = \vert \vert {z}^{\alpha }\vert \vert _{{L}^{2}}^{2}\) for the unit ball B n . Find a simple formula for the Bergman kernel \(B(z,\overline{z})\) for the ball, defined by

$$\displaystyle{B(z,\overline{z}) =\sum _{\alpha }{\vert z{\vert }^{2\alpha } \over c_{\alpha }}.}$$

Exercise 4.68.

Compute \(V _{f}\) if \(f(z,w) = ({z}^{a},{w}^{b})\). Also compute \(V _{g}\) if \(g(z,w) = ({z}^{a},z{w}^{b},{w}^{b+1})\).

Exercise 4.69.

Express the (2n − 1)-dimensional volume of the unit sphere S 2n − 1 in terms of the 2n-dimensional volume of B n . Suggestion: Use (71) and (72) when f(z) = z.

Exercise 4.70.

Consider the Hilbert space \(\mathcal{H}\) consisting of complex analytic functions on C n that are square-integrable with respect to the Gaussian weight function exp( − || z ||2). Show that the monomials form a complete orthogonal system for \(\mathcal{H}\).

10 CR Geometry

CR Geometry considers the geometry of real objects in complex spaces. The name itself has an interesting history, which we do not discuss here, other than to say that CR stands both for Cauchy–Riemann and for Complex–Real. See [DT] for a survey of CR Geometry and its connections with other branches of mathematics. See [BER] for a definitive treatment of the subject. In this section we mostly consider simple aspects of the CR Geometry of the unit sphere in C n.

Let S 2n − 1 denote the unit sphere in R 2n. Consider a point p in S 2n − 1. If we regard p as a unit vector v (from 0 to p) in R 2n, then v is orthogonal to the sphere at p. Hence, any vector w orthogonal to v is tangent to the sphere. Put \(r(x) =\sum _{ j=1}^{2n}x_{j}^{2} - 1\). Then the unit sphere is the zero-set of r, and furthermore, dr(x) ≠ 0 for x on the sphere. We call such a function a defining function for the sphere. The 1-form dr annihilates the tangent space T p (S 2n − 1) at each point. It defines the normal direction to the sphere.

In this section we write ⟨η, L⟩ for the contraction of a 1-form η with a vector field L. Previously we have been writing η(L). A vector field \(L =\sum _{ j=1}^{2n}a_{j}{ \partial \over \partial x_{j}}\) on R 2n is tangent to S 2n − 1 if and only if

$$\displaystyle{0 =\langle dr,L\rangle = dr(L) = L(r) =\sum _{ j=1}^{2n}a_{ j}{ \partial r \over \partial x_{j}}}$$

on the sphere.

Given the focus of this book, we regard R 2n as C n and express these geometric ideas using complex vector fields. A new phenomenon arises. Not all directions in the tangent space behave the same, from the complex variable point of view.

Let X be a complex vector field on C n. We can write

$$\displaystyle{X =\sum _{ j=1}^{n}a_{ j}{ \partial \over \partial z_{j}} +\sum _{ j=1}^{n}b_{ j}{ \partial \over \partial \overline{z}_{j}}}$$

where the coefficient functions a j , b j are smooth and complex valued. Each complex vector field is the sum of two vector fields, one of which involves differentiations in only the unbarred directions, while the other involves differentiations in only the barred directions. Let T 1, 0(C n) denote the bundle whose sections are vector fields of the first kind and T 0, 1(C n) the bundle whose sections are of the second kind. The only vector field of both kinds is the 0 vector field. We therefore write

$$\displaystyle{ T({\mathbf{C}}^{n}) \otimes \mathbf{C} = {T}^{1,0}({\mathbf{C}}^{n}) \oplus {T}^{0,1}({\mathbf{C}}^{n}). }$$
(84)

The tensor product on the left-hand side of (84) arises because we are considering complex (rather than real) vector fields. The left-hand side of (84) means the bundle whose sections are the complex vector fields on C n. We next study how the decomposition in (84) applies to vector fields tangent to S 2n − 1.

Let T 1, 0(S 2n − 1) denote the bundle whose sections are complex vector fields of type (1, 0) and tangent to S 2n − 1. Then T 0, 1(S 2n − 1) denotes the complex conjugate bundle. For p on the sphere, each of the vector spaces \(T_{p}^{1,0}({S}^{2n-1})\) and \(T_{p}^{0,1}({S}^{2n-1})\) has complex dimension n − 1. But T p (S 2n − 1) ⊗ C has dimension 2n − 1. Hence, there is a missing direction. How can we describe and interpret this missing direction?

Observe first that the commutator [L, K] of vector fields L, K, each of type (1, 0) and tangent to S 2n − 1 , also satisfies these properties. That [L, K] is of type (1, 0) follows easily from the formula \([L,K] = LK - KL\). That [L, K] is tangent follows by applying this formula to a defining function r:

$$\displaystyle{[L,K](r) = L(K(r)) - K(L(r)) = 0 - 0 = 0.}$$

Since K is tangent, K(r) = 0 on the sphere. Since L is tangent, L(K(r)) = 0 there. By symmetry, K(L(r)) = 0 as well. Note Remark 4.19. By symmetry considerations, the commutator of two (0, 1) tangent vector fields is also of type (0, 1) and tangent. On the sphere, however, the commutator of each nonzero (1, 0) vector field L with its conjugate \(\overline{L}\) will have a nonvanishing component in the missing direction.

Remark 4.19.

Warning! Is the derivative of a constant zero? The function \(R(x,y) = {x}^{2} + {y}^{2} - 1\) equals 0 everywhere on the unit circle, but \({\partial R \over \partial x} = 2x\) and hence is NOT zero at most points. The problem is that the differentiation with respect to x is not tangent to the unit circle.

We can abstract the geometry of the sphere as follows:

Definition 4.17.

The CR structure on S 2n − 1 is given by the subbundle \(V = {T}^{1,0}({S}^{2n-1})\), which has the following properties:

  1. (1)

    \(V \cap \overline{V } =\{ 0\}\).

  2. (2)

    The set of smooth sections of V is closed under the Lie bracket.

  3. (3)

    \(V \oplus \overline{V }\) has codimension one in T(S 2n − 1) ⊗ C.

Definition 4.18.

A CR manifold of hypersurface type is a real manifold M for which there is a subbundle VT(M) ⊗ C satisfying the three properties from Definition 4.17.

Any real hypersurface M in C n is a CR manifold of hypersurface type. Since \(V \oplus \overline{V }\) has codimension one in T(M) ⊗ C, there is a nonvanishing 1-form η, defined up to a multiple, annihilating \(V \oplus \overline{V }\). By convention, we assume that this form is purely imaginary. (See Exercise 4.76 for an explanation of this convention.) Thus, ⟨η, L⟩ = 0 whenever L is a vector field of type (1, 0) and similarly for vector fields of type (0, 1).

Definition 4.19.

Let M be a CR manifold of hypersurface type. The Levi form λ is the Hermitian form on sections of T 1, 0(M) defined by

$$\displaystyle{\lambda (L,\overline{K}) =\langle \eta,[L,\overline{K}]\rangle.}$$

Let us return to the unit sphere. Near a point where z n ≠ 0, for 1 ≤ j ≤ n − 1, we define n − 1 vector fields of type (1, 0) by

$$\displaystyle{ L_{j} = \overline{z}_{n}{ \partial \over \partial z_{j}} -\overline{z}_{j}{ \partial \over \partial z_{n}}. }$$
(85)

A simple check shows that each L j is tangent to the sphere. Similarly the complex conjugate vector fields \(\overline{L}_{j}\) are tangent. These vector fields are linearly independent (as long as we are working where z n ≠ 0). There are 2n − 2 of them. The missing direction requires both unbarred and barred derivatives. We can fill out the complex tangent space by setting

$$\displaystyle{ \mathbf{T} = z_{n}{ \partial \over \partial z_{n}} -\overline{z}_{n}{ \partial \over \partial \overline{z}_{n}}. }$$
(86)

Then \(L_{1},\ldots,L_{n-1},\overline{L}_{1},\ldots,\overline{L}_{n-1},\mathbf{T}\) span the complex tangent space to S 2n − 1 at each point where z n ≠ 0.

Exercise 4.71.

Verify that the L j from (85) and T from (86) are tangent to the sphere. Let ω j be as in Remark 4.17. Verify that ⟨L j , ω j ⟩ = 1.

Exercise 4.72.

Find a purely imaginary 1-form annihilating \({T}^{1,0} \oplus {T}^{0,1}\) on the sphere.

Exercise 4.73.

Compute the commutator \([L_{j},\overline{L}_{k}]\).

Exercise 4.74.

Use the previous two exercises to show that the Levi form on the sphere is positive definite.

Exercise 4.75.

Show that translating the sphere leads to the defining function

$$\displaystyle{ r(\zeta,\overline{\zeta }) =\sum _{ j=1}^{n-1}\vert \zeta _{ j}{\vert }^{2} + \vert \zeta _{ n}{\vert }^{2} + 2\mathrm{Re}(\zeta _{ n}). }$$
(87)

Show that a more elaborate change of variables leads to the defining function:

$$\displaystyle{ r(w,\overline{w}) =\sum _{ j=1}^{n-1}\vert w_{ j}{\vert }^{2} + 2\mathrm{Re}(w_{ n}). }$$
(88)

Suggestion: First do the case n = 1.

Exercise 4.76.

Show that \(\lambda (L,\overline{K}) =\lambda (K,\overline{L})\).

Exercise 4.77.

Let r be a smooth real-valued function on C n. Assume that dr does not vanish on M, the zero-set of r. Then M is a real hypersurface and hence a CR manifold. Compute the Levi form λ on M in terms of derivatives of r. The answer, in terms of the basis {L j } given below for sections of T 1, 0(M), is the following formula:

$$\displaystyle{\lambda _{jk} = r_{j\overline{k}}\vert r_{n}{\vert }^{2} - r_{ j\overline{n}}r_{n}r_{\overline{k}} - r_{n\overline{k}}r_{j}r_{\overline{n}} + r_{n\overline{n}}r_{j}r_{\overline{k}}.}$$

Suggestion: Work near a point where \(r_{z_{n}}\neq 0\). For 1 ≤ j ≤ n − 1, define L j by

$$\displaystyle{L_{j} ={ \partial \over \partial z_{j}} -{ r_{z_{j}} \over r_{z_{n}}}{ \partial \over \partial z_{n}}}$$

and define \(\overline{L}_{k}\) in a similar manner. Find the 1-form η, and compute \([L_{j},\overline{L}_{k}]\).

Remark 4.20.

The answer to Exercise 4.77 is the restriction of the complex Hessian of r to the space T 1, 0(M).

Exercise 4.78.

Find the Levi form on the hyperplane defined by Re(z n ) = 0.

The zero-set of (88), a biholomorphic image of the sphere, is an unbounded object H, commonly known as the Heisenberg group. Put n = 2 and define A by

$$\displaystyle{A ={ \partial \over \partial w_{1}} -\overline{w}_{1}{ \partial \over \partial w_{2}}.}$$

Then A, \(\overline{A}\), and \([A,\overline{A}]\) form a basis for the sections of T(H) ⊗ C at each point. See [DT] and its references for considerable information about the role of the Heisenberg group in complex analysis, geometry, and PDE.

We next use the CR geometry of the unit sphere to briefly study harmonic polynomials. For simplicity we work on S 3, where the vector field L from (83) defines the CR structure. Recall that (z, w) denotes the variable in C 2. We also recall from Sect. 11 of Chap. 1 that a smooth function is harmonic if its Laplacian is 0. We can express the Laplace operator in terms of complex partial derivatives; a (possibly complex-valued) smooth function u is harmonic on C 2 if and only if

$$\displaystyle{u_{z\overline{z}} + u_{w\overline{w}} = 0.}$$

As in Sect. 13 from Chap. 2, it is natural to consider harmonic homogeneous polynomials. Here we allow our harmonic functions to be complex valued. The complex vector space V d , consisting of homogeneous polynomials of degree d (with complex coefficients) in the underlying 2n real variables, decomposes into a sum of spaces V p, q . Here \(p + q = d\) and the elements of V p, q are homogeneous of degree p in z and of degree q in \(\overline{z}\). We obtain a decomposition \(\mathbf{H_{d}} =\sum \mathbf{H}_{p,q}\) of the space of harmonic homogeneous polynomials.

Example 4.13.

Put n = 2 and d = 2. By our work in Chap. 2, the space H 2 is 9-dimensional. We have the following:

  • H 2, 0 is spanned by z 2, zw, w 2.

  • H 1, 1 is spanned by \(z\overline{w},\overline{z}w,\vert z{\vert }^{2} -\vert w{\vert }^{2}\).

  • H 0, 2 is spanned by \({\overline{z}}^{2},\overline{z}\overline{w},{\overline{w}}^{2}\).

As in Chap. 2, the sum of these three spaces is the orthogonal complement of the (span of the) function | z |2 + | w |2 in the space of polynomials of degree 2.
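Treating \(z,\overline{z},w,\overline{w}\) as independent variables (the Wirtinger convention), one can confirm by machine that the listed polynomials are harmonic; a sketch of ours:

```python
import sympy as sp

# Wirtinger convention: z, zb, w, wb are independent symbols;
# u is harmonic iff u_{z zb} + u_{w wb} = 0.
z, zb, w, wb = sp.symbols('z zb w wb')

def complex_laplacian(u):
    return sp.diff(u, z, zb) + sp.diff(u, w, wb)

H20 = [z**2, z*w, w**2]
H11 = [z*wb, zb*w, z*zb - w*wb]
H02 = [zb**2, zb*wb, wb**2]
assert all(complex_laplacian(u) == 0 for u in H20 + H11 + H02)
# by contrast, |z|^2 + |w|^2 is not harmonic:
assert complex_laplacian(z*zb + w*wb) == 2
print("all nine spanning polynomials are harmonic")
```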

Interesting results about eigenvalues and the CR vector fields also hold. We give a simple example. For each pair a, b of nonnegative integers, observe that the monomials \({z}^{a}{\overline{w}}^{b}\) and \({\overline{z}}^{a}{w}^{b}\) are harmonic. Elementary calculus yields

$$\displaystyle{L({z}^{a}{\overline{w}}^{b}) = a{z}^{a-1}{\overline{w}}^{b+1}}$$
$$\displaystyle{\overline{L}({z}^{a}{\overline{w}}^{b}) = -b{z}^{a+1}{\overline{w}}^{b-1}.}$$

Combining these results shows that

$$\displaystyle{L\overline{L}({z}^{a}{\overline{w}}^{b}) = -b(a + 1){z}^{a}{\overline{w}}^{b}.}$$
$$\displaystyle{\overline{L}L({z}^{a}{\overline{w}}^{b}) = -a(b + 1){z}^{a}{\overline{w}}^{b}.}$$

Thus the harmonic monomials \({z}^{a}{\overline{w}}^{b}\) are eigenfunctions of the differential operators \(L\overline{L}\) and \(\overline{L}L\), with eigenvalues \(-b(a + 1)\) and \(-a(b + 1)\). Hence, they are also eigenfunctions of the commutator \(\mathbf{T} = [L,\overline{L}]\), with eigenvalue \(a - b\).
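These identities can be confirmed for sample exponents. The sketch below assumes the standard form \(L = \overline{w}\partial _{z} -\overline{z}\partial _{w}\) on \(S^{3}\), which is consistent with the displayed formulas (formula (83) itself is not reproduced in this section); the commutator eigenvalue \(a - b\) comes out of the last assertion.

```python
import sympy as sp

z, zb, w, wb = sp.symbols('z zbar w wbar')
a, b = 3, 5                                  # sample nonnegative exponents

# Assumed forms: L = wbar d/dz - zbar d/dw, and its conjugate.
L    = lambda u: wb*sp.diff(u, z) - zb*sp.diff(u, w)
Lbar = lambda u: w*sp.diff(u, zb) - z*sp.diff(u, wb)

m = z**a * wb**b                             # the harmonic monomial z^a wbar^b
assert sp.expand(L(m) - a*z**(a-1)*wb**(b+1)) == 0
assert sp.expand(Lbar(m) + b*z**(a+1)*wb**(b-1)) == 0
assert sp.expand(L(Lbar(m)) + b*(a+1)*m) == 0
assert sp.expand(Lbar(L(m)) + a*(b+1)*m) == 0
# The commutator T = [L, Lbar] therefore has eigenvalue a - b on m.
assert sp.expand(L(Lbar(m)) - Lbar(L(m)) - (a - b)*m) == 0
```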

11 Positivity Conditions for Hermitian Polynomials

This section offers a glimpse of recent research directions along the lines of this book. We state and discuss, but do not prove, an analogue of the Riesz–Fejer theorem for positive polynomials on the unit sphere. We offer an application of this result to proper mappings between balls.

The Riesz–Fejer theorem (Theorem 1.1) characterizes nonnegative trig polynomials; each such polynomial agrees on the circle with the squared absolute value of a single polynomial in the complex variable z. We naturally seek to extend this result from the unit circle to the unit sphere in C n. Things become more complicated but also more interesting.

We start with a Hermitian symmetric polynomial \(r(z,\overline{z}) =\sum _{\alpha,\beta }c_{\alpha \beta }{z}^{\alpha }{\overline{z}}^{\beta }\) of degree d in \(z \in {\mathbf{C}}^{n}\). We can always bihomogenize r by adding a variable as follows. We put \(r_{H}(0,0) = 0\). For \(t\neq 0\) we put

$$\displaystyle{r_{H}(z,t,\overline{z},\overline{t}) = \vert t{\vert }^{2d}r({z \over t},{ \overline{z} \over \overline{t}} ).}$$

Then \(r_{H}\) is homogeneous of degree d in the variables z, t and also homogeneous of degree d in their conjugates. The polynomial \(r_{H}\) is thus determined by its values on the unit sphere in \({\mathbf{C}}^{n+1}\). Conversely, we can dehomogenize a bihomogeneous polynomial in two or more variables by setting one of its variables (and its conjugate!) equal to the number 1.

Example 4.14.

Put n = 1 and put \(r(z,\overline{z}) = {z}^{2} +{ \overline{z}}^{2}\). We compute \(r_{H}\):

$$\displaystyle{r_{H}(z,t,\overline{z},\overline{t}) = \vert t{\vert }^{4}({({z \over t} )}^{2} + {({\overline{z} \over \overline{t}} )}^{2}) ={ \overline{t}}^{2}{z}^{2} +{ \overline{z}}^{2}{t}^{2}.}$$
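The computation in Example 4.14 is easy to reproduce symbolically; a minimal sympy sketch, again treating conjugates as independent variables:

```python
import sympy as sp

z, zb, t, tb = sp.symbols('z zbar t tbar')
d = 2
r = z**2 + zb**2                             # r(z, zbar) of degree d = 2

# |t|^{2d} r(z/t, zbar/tbar); clearing denominators leaves a polynomial.
rH = sp.expand((t*tb)**d * r.subs({z: z/t, zb: zb/tb}))
assert sp.expand(rH - (tb**2*z**2 + t**2*zb**2)) == 0   # matches the example
```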

Example 4.15.

Put \(r = {(\vert zw{\vert }^{2} - 1)}^{2} + \vert z{\vert }^{2}\). Then r is positive everywhere, but \(r_{H}\), while nonnegative, has many zeroes.

There is no loss of generality in our discussion if we restrict our attention to the bihomogeneous case. Let R be a bihomogeneous polynomial in n variables (and their conjugates). Assume \(R(z,\overline{z}) \geq 0\) on the unit sphere. As a generalization of the Riesz–Fejer theorem, we naturally ask whether there exist homogeneous polynomials \(f_{1}(z),\ldots,f_{K}(z)\) such that

$$\displaystyle{R(z,\overline{z}) = \vert \vert f(z)\vert {\vert }^{2} =\sum _{ j=1}^{K}\vert f_{ j}(z){\vert }^{2}.}$$

We call R a Hermitian sum of squares or a Hermitian squared norm. Of course we cannot expect K to be any smaller than the dimension n. For example, the polynomial \(\sum _{j=1}^{n}\vert z_{j}{\vert }^{4}\) is positive on the sphere but cannot be written as a Hermitian squared norm with fewer than n terms. Furthermore, not every nonnegative R is a Hermitian squared norm. Even when we require equality only on the unit sphere, such a representation can fail, and hence the analogue of the Riesz–Fejer theorem is more subtle.

Example 4.16.

Put \(R(z,\overline{z}) = {(\vert z_{1}{\vert }^{2} -\vert z_{2}{\vert }^{2})}^{2}.\) Then R is bihomogeneous and nonnegative. Its underlying matrix \((c_{\alpha \beta })\) of coefficients is diagonal with eigenvalues 1, − 2, 1. Suppose for some f that \(R(z,\overline{z}) = \vert \vert f(z)\vert {\vert }^{2}\). Then f would vanish on the subset of the unit sphere defined by \(\vert z_{1}{\vert }^{2} = \vert z_{2}{\vert }^{2} ={ 1 \over 2}\) (a torus), because R does. A complex analytic function vanishing there would also vanish for \(\vert z_{1}{\vert }^{2} \leq { 1 \over 2}\) and \(\vert z_{2}{\vert }^{2} \leq { 1 \over 2}\) by the maximum principle. Hence f would have to be identically zero. Thus R does not agree with a squared norm of any complex analytic mapping; the zero-set of R fails the appropriate necessary conditions here.
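Both claims in Example 4.16, the signature of the coefficient matrix and the vanishing on the torus, are easy to check; the sketch below (our illustration, not from the text) uses sympy for the matrix and a numerical point for the torus.

```python
import sympy as sp
import cmath

# Coefficient matrix of R = |z1|^4 - 2|z1|^2|z2|^2 + |z2|^4 in the
# monomial basis (z1^2, z1 z2, z2^2): diagonal, with a negative eigenvalue,
# so R is not a Hermitian squared norm as written.
C = sp.diag(1, -2, 1)
assert sorted(C.eigenvals()) == [-2, 1]

# R vanishes on the torus |z1|^2 = |z2|^2 = 1/2, which lies on S^3.
z1 = cmath.exp(0.3j) / cmath.sqrt(2)
z2 = cmath.exp(1.1j) / cmath.sqrt(2)
assert abs(abs(z1)**2 + abs(z2)**2 - 1) < 1e-12       # point on the sphere
assert abs((abs(z1)**2 - abs(z2)**2)**2) < 1e-24      # R vanishes there
```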

The following elaboration of Example 4.16 clarifies the matter. Consider the family of polynomials R ε defined by

$$\displaystyle{R_{\epsilon }(z,\overline{z}) = {(\vert z_{1}{\vert }^{2} -\vert z_{ 2}{\vert }^{2})}^{2} +\epsilon \vert z_{ 1}{\vert }^{2}\vert z_{ 2}{\vert }^{2}.}$$

For each ε > 0, we have \(R_{\epsilon }(z,\overline{z}) > 0\) on the sphere. By Theorem 4.12 below, there is a polynomial mapping \(f_{\epsilon }\) such that \(R_{\epsilon } = \vert \vert f_{\epsilon }\vert {\vert }^{2}\) on the sphere. Both the degree and the number of components of \(f_{\epsilon }\) must tend to infinity as ε tends to 0. See [D1] for a lengthy discussion of this sort of issue.
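The positivity of \(R_{\epsilon }\) on the sphere is easy to confirm: writing \(s = \vert z_{1}\vert ^{2}\), so that \(\vert z_{2}\vert ^{2} = 1 - s\) there, reduces it to a one-variable calculus check, sketched below in sympy.

```python
import sympy as sp

s, eps = sp.symbols('s epsilon', positive=True)

# On the sphere: R_eps = (2s - 1)^2 + eps*s*(1 - s), with 0 <= s <= 1.
R = (2*s - 1)**2 + eps*s*(1 - s)

# The only interior critical point is s = 1/2, where R_eps = eps/4 > 0;
# at the endpoints s = 0 and s = 1 the value is 1.
assert sp.simplify(sp.diff(R, s).subs(s, sp.Rational(1, 2))) == 0
assert sp.simplify(R.subs(s, sp.Rational(1, 2))) == eps/4
assert R.subs(s, 0) == 1 and R.subs(s, 1) == 1
```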

From Example 4.15 we discover that nonnegativity is too weak a condition to imply that R agrees with a Hermitian squared norm. See also Example 4.18. On the other hand, when \(R(z,\overline{z}) > 0\) on the sphere, the conclusion does hold. See [D1] for detailed proofs of Theorem 4.12 and Theorem 4.13 below. The proof of Theorem 4.12 there uses the theory of compact operators, but other proofs have been found.

Theorem 4.12.

Let r be a Hermitian symmetric bihomogeneous polynomial in n variables and their conjugates. Suppose \(r(z,\overline{z}) > 0\) on \({S}^{2n-1}\). Then there are positive integers d and K, and a polynomial mapping \(g: {\mathbf{C}}^{n} \rightarrow {\mathbf{C}}^{K}\), such that

$$\displaystyle{\vert \vert z\vert {\vert }^{2d}r(z,\overline{z}) = \vert \vert g(z)\vert {\vert }^{2}.}$$

We can remove the assumption of bihomogeneity if we want equality to hold only on the unit sphere.

Theorem 4.13.

Let r be a Hermitian symmetric polynomial in n variables and their conjugates. Assume that \(r(z,\overline{z}) > 0\) on \({S}^{2n-1}\). Then there are an integer N and a polynomial mapping \(h: {\mathbf{C}}^{n} \rightarrow {\mathbf{C}}^{N}\) such that, for \(z \in {S}^{2n-1}\),

$$\displaystyle{r(z,\overline{z}) = \vert \vert h(z)\vert {\vert }^{2}.}$$

Proof.

We sketch the derivation of Theorem 4.13 from Theorem 4.12. First we bihomogenize r to get \(r_{H}(z,t,\overline{z},\overline{t})\), bihomogeneous of degree m in the z, t variables. We may assume m is even. The polynomial \(r_{H}\) could have negative values on the sphere \(\vert \vert z\vert {\vert }^{2} + \vert t{\vert }^{2} = 1\). To correct for this possibility, we define a bihomogeneous polynomial \(F_{C}\) by

$$\displaystyle{F_{C}(z,\overline{z},t,\overline{t}) = r_{H}(z,t,\overline{z},\overline{t}) + C{(\vert \vert z\vert {\vert }^{2} -\vert t{\vert }^{2})}^{m}.}$$

It is easy to show that we can choose C large enough to make F C strictly positive away from the origin. By Theorem 4.12, we can find an integer d such that

$$\displaystyle{{(\vert \vert z\vert {\vert }^{2} + \vert t{\vert }^{2})}^{d}F_{ C}(z,\overline{z},t,\overline{t}) = \vert \vert g(z,t)\vert {\vert }^{2}.}$$

Setting t = 1 and then \(\vert \vert z\vert {\vert }^{2} = 1\) shows, for \(z \in {S}^{2n-1}\), that

$$\displaystyle{{2}^{d}r(z,\overline{z}) = \vert \vert g(z,1)\vert {\vert }^{2}.}$$

The following Corollary of Theorem 4.13 connects these ideas with proper complex analytic mappings between balls.

Corollary 4.9.

Let \(f ={ p \over q}:{ \mathbf{C}}^{n} \rightarrow {\mathbf{C}}^{N}\) be a rational mapping. Assume that the image of the closed unit ball under f lies in the open unit ball in \({\mathbf{C}}^{N}\). Then there are an integer K and a polynomial mapping \(g: {\mathbf{C}}^{n} \rightarrow {\mathbf{C}}^{K}\) such that \({p\oplus g \over q}\) maps the unit sphere \({S}^{2n-1}\) to the unit sphere \({S}^{2(N+K)-1}\).

Proof.

The hypothesis implies that | q |2 − || p ||2 is strictly positive on the sphere. By Theorem 4.13 there is a polynomial map g such that \(\vert q{\vert }^{2} -\vert \vert p\vert {\vert }^{2} = \vert \vert g\vert {\vert }^{2}\) on the sphere. Then \({p\oplus g \over q}\) does the job. □

This corollary implies that there are many rational mappings taking the unit sphere in the domain into the unit sphere in some target. We may choose the first several components to be anything we want, as long as the closed ball is mapped into the open ball. Then we can find additional components, using the same denominator, such that the resulting map takes the sphere to the sphere. The following simple example already indicates the depth of these ideas.

Example 4.17.

Consider the maps \(p_{\lambda }: {\mathbf{C}}^{2} \rightarrow \mathbf{C}\) given by \(p_{\lambda }(z,w) =\lambda zw\). Then \(p_{\lambda }\) maps the closed ball in \({\mathbf{C}}^{2}\) inside the unit disk if \(\vert \lambda {\vert }^{2} < 4\). If this condition is met, then we can include additional components to make \(p_{\lambda }\) into a component of a polynomial mapping sending \({S}^{3}\) to some unit sphere. In case \(\lambda = \sqrt{3}\), we obtain the map \((\sqrt{3}zw,{z}^{3},{w}^{3})\), which is one of the group-invariant examples from Sect. 3. If \(\sqrt{3} <\lambda < 2\), then we must map into a dimension higher than 3. As λ approaches 2, the minimum possible target dimension approaches infinity.
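The two computational facts in Example 4.17, that \(\vert zw\vert \) has maximum \(1/2\) on the closed ball and that \((\sqrt{3}zw,{z}^{3},{w}^{3})\) sends \(S^{3}\) to \(S^{5}\), can be checked with the substitution \(s = \vert z\vert ^{2}\), \(\vert w\vert ^{2} = 1 - s\); a sympy sketch:

```python
import sympy as sp

s = sp.symbols('s', nonnegative=True)        # s = |z|^2; on S^3, |w|^2 = 1 - s

# ||(sqrt(3) z w, z^3, w^3)||^2 = 3|z|^2|w|^2 + |z|^6 + |w|^6 on the sphere.
norm_sq = 3*s*(1 - s) + s**3 + (1 - s)**3
assert sp.expand(norm_sq) == 1               # identically 1: sphere to sphere

# |zw|^2 = s(1 - s) is maximized at s = 1/2 with value 1/4, so p_lambda
# maps the closed ball into the open unit disk exactly when |lambda| < 2.
g = s*(1 - s)
assert sp.solve(sp.diff(g, s), s) == [sp.Rational(1, 2)]
assert g.subs(s, sp.Rational(1, 2)) == sp.Rational(1, 4)
```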

We conclude with a surprising example that combines ideas from many parts of this book.

Example 4.18.

([D1]). There exists a bihomogeneous polynomial \(r(z,\overline{z})\), in three variables, with the following properties:

  • \(r(z,\overline{z}) \geq 0\) for all z.

  • The zero set of r is a copy of C (a one-dimensional subspace of \({\mathbf{C}}^{3}\)).

  • 0 is the only polynomial s for which rs is a Hermitian squared norm.

We put \(r(z,\overline{z}) = {(\vert z_{1}z_{2}{\vert }^{2} -\vert z_{3}{\vert }^{4})}^{2} + \vert z_{1}{\vert }^{8}\). The nonnegativity is evident. The zero-set of r is the set of z of the form \((0,z_{2},0)\) and hence a copy of C. Assume that rs is a Hermitian squared norm \(\vert \vert A\vert {\vert }^{2}\). Consider the map from C to \({\mathbf{C}}^{3}\) given by \(t\mapsto ({t}^{2},1 + t,t) = z(t)\). Pulling back yields the equation

$$\displaystyle{r(z(t),\overline{z(t)})\ s(z(t),\overline{z(t)}) = \vert \vert c_{m}{t}^{m} + \cdots \vert {\vert }^{2},}$$

where ⋯ denotes higher-order terms. Hence, the product of the lowest order terms in the pullback of s with the lowest order terms in the pullback of r is \(\vert \vert c_{m}\vert {\vert }^{2}\vert t{\vert }^{2m}\). A simple computation shows that the lowest order terms in the pullback of r are

$$\displaystyle{ {t}^{4}{\overline{t}}^{6} + 2\vert t{\vert }^{10} + {t}^{6}{\overline{t}}^{4} = 2\vert t{\vert }^{10}(1 + \mathrm{cos}(2\theta )). }$$
(89)

There is no trig polynomial p other than 0 for which multiplying the right-hand side of (89) by an expression of the form | t |2k p(θ) yields a result independent of θ.
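The lowest-order computation leading to (89) can be reproduced symbolically; in the sketch below (ours), conjugation is modeled by an independent variable \(\overline{t}\).

```python
import sympy as sp

t, tb = sp.symbols('t tbar')

# Pull back r = (|z1 z2|^2 - |z3|^4)^2 + |z1|^8 along z(t) = (t^2, 1 + t, t).
z1, z2, z3 = t**2, 1 + t, t
z1b, z2b, z3b = tb**2, 1 + tb, tb
pull = sp.expand(((z1*z1b)*(z2*z2b) - (z3*z3b)**2)**2 + (z1*z1b)**4)

# Collect the terms of lowest total degree in t, tbar.
degree = lambda term: sp.Poly(term, t, tb).total_degree()
lowest = min(degree(term) for term in pull.as_ordered_terms())
low = sum(term for term in pull.as_ordered_terms() if degree(term) == lowest)

assert lowest == 10
# Matches (89): t^6 tbar^4 + 2|t|^10 + t^4 tbar^6.
assert sp.expand(low - (t**6*tb**4 + 2*t**5*tb**5 + t**4*tb**6)) == 0
```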

No such example is possible in one dimension, because the only bihomogeneous polynomials are of the form \(c\vert t{\vert }^{2m}\). It is easy to find a nonnegative polynomial \(g(t,\overline{t})\) that does not divide any Hermitian squared norm (other than 0); for example,

$$\displaystyle{2\vert t{\vert }^{2} + {t}^{2} +{ \overline{t}}^{2} = 2\vert t{\vert }^{2}(1 + \mathrm{cos}(2\theta ))}$$

does the job. Our example is surprising because r is bihomogeneous.

The theorems, examples, and geometric considerations in this chapter illustrate the following theme. When passing from analysis on the unit circle to analysis in higher dimensions, the mathematics becomes both more complicated and more beautiful. Ideas revolving around Hermitian symmetry appear throughout. This perspective leads naturally to CR geometry. We refer again to [DT] for an introduction to CR geometry and to its references for the many directions in which Hermitian analysis is developing.