
We have encountered a number of types of spaces consisting of points (affine, affine Euclidean, projective). For all of these spaces, an interesting and important question has been the study of quadrics contained in such spaces, that is, sets of points with coordinates $(x_1, \ldots, x_n)$ that in some coordinate system satisfy the single equation

$$ F (x_1, \ldots, x_n) = 0, $$
(11.1)

where F is a second-degree polynomial in the variables $x_1, \ldots, x_n$. Let us emphasize that by the definition of a polynomial, equation (11.1) may in general contain both first- and second-degree monomials as well as a constant term.

For each of the spaces of the above-mentioned types, a trivial verification shows that the property of a set of points being a quadric does not depend on the choice of coordinate system. Or in other words, a nonsingular affine transformation, motion, or projective transformation (depending on the type of space under consideration) takes a quadric to a quadric.

11.1 Quadrics in Projective Space

By the definition given above, a quadric Q in the projective space ℙ(L) is given by equation (11.1) in homogeneous coordinates. However, as we saw in Chap. 9, such an equation is satisfied by the homogeneous coordinates of a point of the projective space ℙ(L) only if its left-hand side is homogeneous.

Definition 11.1

A quadric in a projective space ℙ(L) is a set Q consisting of points defined by equation (11.1), where F is a homogeneous second-degree polynomial, that is, a quadratic form in the coordinates x 0,x 1,…,x n .

In Sect. 6.2, it was proved that in some coordinate system (that is, in some basis of the space L), equation (11.1) is reduced to the canonical form

$$\lambda _0 x_0^2 + \lambda _1 x_1^2 + \cdots+ \lambda _r x_r^2 = 0, $$

where all the coefficients $\lambda_i$ are nonzero. Here the number $r \le n$ is determined by the rank of the quadratic form F (the rank is equal to $r+1$), and it is the same for every system of coordinates in which the form F is reduced to canonical form. In the sequel, we shall assume that the quadratic form F is nonsingular, that is, that $r = n$. We shall also call the associated quadric Q nonsingular. The canonical form of its equation can then be written as follows:

$$ \alpha _0 x_0^2 + \alpha _1 x_1^2 + \cdots+ \alpha _n x_n^2 = 0, $$
(11.2)

where all the coefficients α i are nonzero. The general case differs from (11.2) only in the omission of terms containing x i with i=r+1,…,n. It is therefore easily reduced to the case of a nonsingular quadric.
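The rank computation behind this reduction is easy to carry out in practice. The following is a minimal sketch, assuming the sympy library is available; the sample form F and all variable names are illustrative choices only, not part of the text above.

```python
import sympy as sp

xs = sp.symbols('x0 x1 x2')
x0, x1, x2 = xs
F = x0**2 + 4*x0*x1 + x1**2 - 2*x1*x2 + 3*x2**2   # an illustrative quadratic form

# Symmetric matrix (a_ij) with F = sum_{i,j} a_ij x_i x_j and a_ij = a_ji
A = sp.Matrix(3, 3, lambda i, j: sp.Rational(1, 2) * sp.diff(F, xs[i], xs[j]))

print(A)              # the matrix of F
print(A.rank())       # the rank of the quadratic form
print(A.det() != 0)   # True: F is nonsingular, i.e. r = n in the notation of the text
```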

We have already encountered the concept of a tangent space to an arbitrary smooth hypersurface (in Chap. 7) or to a projective algebraic variety (in Chap. 9). Now we move on to a consideration of the notion of the tangent space to a quadric.

Definition 11.2

If A is a point on the quadric Q given by equation (11.1), then the tangent space to Q at the point $A \in Q$ is defined as the projective space $T_AQ$ given by the equation

$$ \sum_{i=0}^n \frac{\partial F}{\partial x_i} (A) x_i = 0. $$
(11.3)

The tangent space is an important general mathematical concept, and we shall now discuss it in the greatest possible generality. Within the framework of a course in algebra, it is natural to limit ourselves to the case in which F is a homogeneous polynomial of arbitrary degree k>0. Then equation (11.1) defines in the space ℙ(L) some hypersurface X, and if not all the partial derivatives \(\frac{\partial F}{\partial x_{i}} (A)\) are equal to zero, then equation (11.3) gives the tangent hyperplane to the hypersurface X at the point A. We see that on the left-hand side of equation (11.3) appears the differential $d_AF(\boldsymbol{x})$ (see Example 3.86 on p. 130), and since this notion was defined so as to be invariant with respect to the choice of coordinate system, the notion of tangent space is also independent of such a choice. The tangent space to the hypersurface X at the point A is denoted by $T_AX$.

In the sequel, we shall always assume that quadrics are viewed as lying in spaces over a field \({\mathbb{K}}\) of characteristic different from 2 (for example, for definiteness, we may assume that the field \({\mathbb{K}}\) is either ℝ or ℂ). If F(x) is a quadratic form, then by the assumptions we have made, we can write it in the form

$$ F({\boldsymbol{x} }) = \sum_{i,j=0}^n a_{ij} x_i x_j, $$
(11.4)

where the coefficients satisfy a ij =a ji . In other words, F(x)=φ(x,x), where

$$ \varphi ({\boldsymbol{x} },{\boldsymbol{y} }) = \sum_{i,j=0}^n a_{ij} x_i y_j $$
(11.5)

is a symmetric bilinear form (Theorem 6.6). If the point A corresponds to the vector a with coordinates (α 0,α 1,…,α n ), then

$$\frac{\partial F}{\partial x_i}(A) = 2 \sum_{j=0}^n a_{ij} \alpha _j, $$

and therefore, equation (11.3) takes the form

$$\sum_{i,j=0}^n a_{ij} \alpha _j x_i = 0, $$

or equivalently, φ(a,x)=0. Thus in this case, the tangent hyperplane at the point A coincides with the orthogonal complement $\langle\boldsymbol{a}\rangle^{\perp}$ to the vector $\boldsymbol{a} \in \mathsf{L}$ with respect to the bilinear form φ(x,y).
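The agreement between equation (11.3) and the condition φ(a,x)=0 can be checked numerically. Here is a small sketch, assuming sympy; the particular quadric and the point A are illustrative choices only.

```python
import sympy as sp

xs = sp.symbols('x0 x1 x2')
A = sp.Matrix([[1, 0, 0], [0, 1, 0], [0, 0, -1]])     # Gram matrix of F = x0^2 + x1^2 - x2^2
x = sp.Matrix(xs)
F = (x.T * A * x)[0]

a = sp.Matrix([3, 4, 5])                              # a point A of Q: 9 + 16 - 25 = 0
assert F.subs(dict(zip(xs, a))) == 0

# Left-hand side of (11.3): sum_i dF/dx_i(A) * x_i
grad_eq = sum(sp.diff(F, xi).subs(dict(zip(xs, a))) * xi for xi in xs)
# phi(a, x), phi being the symmetric bilinear form with F(x) = phi(x, x)
phi_eq = (a.T * A * x)[0]

print(sp.expand(grad_eq - 2*phi_eq))                  # 0: equation (11.3) is phi(a, x) = 0
```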

The definition of tangent space (11.3) loses sense if all derivatives \(\frac{\partial F}{\partial x_{i}} (A)\) are equal to zero:

$$ \frac{\partial F}{\partial x_i} (A) = 0,\quad i= 0,1, \ldots, n. $$
(11.6)

A point A of the hypersurface X given by equation (11.1) for which equalities (11.6) are satisfied is called a singular or critical point. If a hypersurface has no singular points, then it is said to be smooth. When the hypersurface X is a quadric, that is, the polynomial F is a quadratic form (11.4), then equations (11.6) assume the form

$$\sum_{j=0}^n a_{ij} \alpha _j = 0,\quad i= 0,1, \ldots, n. $$

Since the point A is in ℙ(L), it follows that not all of its coordinates α i are equal to zero. Thus singular points of a quadric Q are the nonzero solutions of the system of equations

$$ \sum_{j=0}^n a_{ij} x_j = 0,\quad i= 0,1, \ldots, n. $$
(11.7)

As was shown in Chap. 2, such solutions exist only if the determinant of the matrix (a ij ) is equal to zero, and that is equivalent to saying that the quadric Q is singular. Thus a nonsingular quadric is the same thing as a smooth quadric.
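A short sketch of this criterion, again assuming sympy: for a singular conic, the linear system (11.7) has a nontrivial solution, which is exactly the singular point. The sample matrix below is chosen only for illustration.

```python
import sympy as sp

# A singular conic: F = x0^2 - x1^2 = (x0 - x1)(x0 + x1), a pair of lines
A = sp.Matrix([[1, 0, 0], [0, -1, 0], [0, 0, 0]])

print(A.det())          # 0: the quadratic form (and hence the quadric) is singular
print(A.nullspace())    # [Matrix([[0], [0], [1]])]: the singular point (0 : 0 : 1) of Q
```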

Let us consider the possible mutual relationships between a quadric Q and a line l in projective space ℙ(L). First, let us show that either the line l has not more than two points in common with the quadric Q, or else it lies entirely in Q.

Indeed, if a line l is not contained entirely in Q, then one can choose a point $A \in l$, $A \notin Q$. Let the line l correspond to some plane L′⊂L, that is, l=ℙ(L′). If $A=\langle\boldsymbol{a}\rangle$, then $\mathsf{L}'=\langle\boldsymbol{a},\boldsymbol{b}\rangle$, where the vector $\boldsymbol{b} \in \mathsf{L}$ is not collinear with the vector $\boldsymbol{a}$. In other words, the plane L′ consists of all vectors of the form $x\boldsymbol{a}+y\boldsymbol{b}$, where x and y range over all possible scalars. The points of intersection of the line l and the quadric Q are found from the equation $F(x\boldsymbol{a}+y\boldsymbol{b})=0$, that is, from the equation

$$ F(x{\boldsymbol{a} }+ y{\boldsymbol{b} }) = F({\boldsymbol{a} }) x^2 + 2\varphi ({\boldsymbol{a} },{\boldsymbol{b} }) xy + F({\boldsymbol{b} }) y^2 = 0 $$
(11.8)

in the variables x,y. The vectors $x\boldsymbol{a}+y\boldsymbol{b}$ with $y=0$ give us the point $A \notin Q$. Assuming, therefore, that $y\neq 0$, we obtain $t=x/y$. Then (11.8) gives us a quadratic equation in the variable t:

$$F(x{\boldsymbol{a} }+ y{\boldsymbol{b} }) = y^2 \bigl(F({\boldsymbol{a} }) t^2 + 2\varphi ({\boldsymbol{a} },{\boldsymbol{b} }) t + F( {\boldsymbol{b} })\bigr) = 0. $$

The condition $A \notin Q$ has the form $F(\boldsymbol{a})\neq 0$. Consequently, the leading coefficient of the quadratic trinomial $F(\boldsymbol{a})t^2+2\varphi(\boldsymbol{a},\boldsymbol{b})t+F(\boldsymbol{b})$ is nonzero, and therefore, the quadratic trinomial itself is not identically zero and cannot have more than two roots.
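The quadratic trinomial above can be used directly to find the points in which a line meets a quadric. A hedged sketch, assuming sympy, with an arbitrarily chosen conic, a point $A \notin Q$, and a second spanning vector $\boldsymbol{b}$:

```python
import sympy as sp

t = sp.symbols('t')
A = sp.Matrix([[1, 0, 0], [0, 1, 0], [0, 0, -1]])      # the conic x0^2 + x1^2 - x2^2 = 0

def F(v):
    return (v.T * A * v)[0]

def phi(u, v):
    return (u.T * A * v)[0]

a = sp.Matrix([1, 1, 1])        # a point with F(a) = 1, so A is not on Q
b = sp.Matrix([0, 0, 1])        # a second point spanning the line l

# Intersection points of l with Q correspond to the roots in t
roots = sp.solve(sp.Eq(F(a)*t**2 + 2*phi(a, b)*t + F(b), 0), t)
print(roots)                    # [1 - sqrt(2), 1 + sqrt(2)]: two common points of l and Q
```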

Let us now consider the mutual arrangement of Q and l if the line l passes through a point $A \in Q$. Then, as in the previous case, the points of intersection of l and Q correspond to the solutions of the quadratic equation (11.8), in which now $F(\boldsymbol{a})=0$, since $A \in Q$. Thus we obtain the equation

$$ F(x{\boldsymbol{a} }+ y{\boldsymbol{b} }) = 2\varphi ({\boldsymbol{a} },{\boldsymbol{b} }) xy + F({\boldsymbol{b} }) y^2 = y \bigl(2\varphi ({\boldsymbol{a} },{\boldsymbol{b} }) x + F({\boldsymbol{b} }) y\bigr) = 0. $$
(11.9)

One solution of equation (11.9) is obvious: $y=0$. It precisely corresponds to the point $A \in Q$. This solution is unique if and only if $\varphi(\boldsymbol{a},\boldsymbol{b})=0$, that is, if $\boldsymbol{b} \in T_AQ$. In the latter case, clearly $l \subset T_AQ$, and one says that the line l is tangent to the quadric Q at the point A.

Thus there are four possible cases of the relationship between a nonsingular quadric Q and a line l:

  (1) The line l has no points in common with the quadric Q.

  (2) The line l has precisely two distinct points in common with the quadric Q.

  (3) The line l has exactly one point A in common with the quadric Q, which is possible if and only if $l \subset T_AQ$.

  (4) The line l lies entirely in Q.

Of course, there also exist smooth hypersurfaces defined by equation (11.1) of arbitrary degree k≥1. For example, such a hypersurface is given by the equation \(c_{0} x_{0}^{k} + c_{1} x_{1}^{k} + \cdots+ c_{n} x_{n}^{k} = 0\), where all the c i are nonzero. In the sequel, we shall consider only smooth hypersurfaces. For these, the left-hand side of equation (11.3) is a nonnull linear form on the vector space L, and this means that it determines a hyperplane in L and in ℙ(L).

Let us verify that this hyperplane contains the point A. This means that if the point A corresponds to the vector a=(α 0,α 1,…,α n ), then

$$\sum_{i=0}^n \frac{\partial F}{\partial x_i} (A) \alpha _i = 0. $$

If the degree of the homogeneous polynomial F is equal to k, then by Euler’s identity (3.68), we have the equality

$$\sum_{i=0}^n \frac{\partial F}{\partial x_i} (A) \alpha _i = \Biggl( \sum_{i=0}^n \frac{\partial F}{\partial x_i} x_i \Biggr) (A) = k F(A). $$

The value of F(A) is equal to zero, since the point A lies on the hypersurface X given by the equation $F = 0$.

Now to switch to a more familiar situation, let us consider an affine subspace of ℙ(L), given by the condition \(x_{0} \not= 0\), and let us introduce in it the inhomogeneous coordinates

$$ y_i = {x_i}/{x_0},\quad i= 1, \ldots, n. $$
(11.10)

Let us assume that the point A lies in this subset (that is, its coordinate α 0 is nonzero) and let us write equation (11.3) in coordinates y i . To do so, we must move from the variables x 0,x 1,…,x n to the variables y 1,…,y n and rewrite equation (11.3) accordingly. Here we must set

$$ F (x_0, x_1, \ldots, x_n) = x_0^k f(y_1, \ldots, y_n), $$
(11.11)

where f(y 1,…,y n ) is a polynomial of degree k≥1, already not necessarily homogeneous (in contrast to F). In accord with formula (11.10), let us denote by a 1,…,a n the inhomogeneous coordinates of the point A, that is,

$$a_i = {\alpha _i}/{\alpha _0},\quad i= 1, \ldots, n. $$

Using general rules for the calculation of partial derivatives, from the representation (11.11), taking into account (11.10), we obtain the formulas

$$\frac{\partial F}{\partial x_0} = k x_0^{k-1} f + x_0^{k} \sum_{l=1}^n \frac{\partial f}{\partial y_l} \frac{\partial y_l}{\partial x_0} = k x_0^{k-1} f - x_0^{k-1} \sum_{l=1}^n \frac{\partial f}{\partial y_l}\, y_l $$

and

$$\frac{\partial F}{\partial x_i} = x_0^{k} \sum _{l=1}^n \frac{\partial f}{\partial y_l} \frac{\partial y_l}{\partial x_i} = x_0^{k} \sum_{l=1}^n \frac{\partial f}{\partial y_l} \biggl( x_0^{-1} \frac {\partial x_l}{\partial x_i} \biggr) = x_0^{k-1} \frac{\partial f}{\partial y_i},\quad i = 1, \ldots, n. $$

Now let us find the values at the point A, with inhomogeneous coordinates $a_1, \ldots, a_n$, of the derivatives of F calculated above. The value of F(A) is zero, since the point A lies on the hypersurface X and $x_0 \neq 0$. By virtue of the representation (11.11), we obtain from this that $f(a_1, \ldots, a_n)=0$. For brevity, we shall employ the notation $f(A)=f(a_1, \ldots, a_n)$ and \(\frac{\partial f}{\partial y_{i}}(A) = \frac{\partial f}{\partial y_{i}}(a_{1}, \ldots, a_{n})\). Thus from the two previous relationships, we obtain

$$ \frac{\partial F}{\partial x_0} (A) = -\alpha_0^{k-1} \sum_{i=1}^n \frac{\partial f}{\partial y_i} (A)\, a_i,\qquad \frac{\partial F}{\partial x_i} (A) = \alpha_0^{k-1}\, \frac{\partial f}{\partial y_i} (A),\quad i = 1, \ldots, n. $$
(11.12)

On substituting expression (11.12) into (11.3), and taking into account (11.10), we obtain the equation

$$\alpha_0^{k-1} \sum_{i=1}^n \frac{\partial f}{\partial y_i} (A)\, (x_i - a_i x_0) = \alpha_0^{k-1} x_0 \sum_{i=1}^n \frac{\partial f}{\partial y_i} (A)\, (y_i - a_i) = 0. $$
Canceling the nonzero common factor \(\alpha _{0}^{k-1} x_{0}\), we finally obtain

$$ \sum_{i=1}^n \frac{\partial f}{\partial y_i} (A) (y_i - a_i) = 0. $$
(11.13)

This is precisely the equation of the tangent hyperplane T A X in inhomogeneous coordinates. In analysis and geometry, it is written in the form (11.13) for a function f of a much more general class than that of polynomials.
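One can verify the passage from (11.3) to (11.13) on a concrete example. The following sketch (sympy assumed; the circle $f = y_1^2 + y_2^2 - 1$ and the point A are illustrative) dehomogenizes equation (11.3) and compares the result with (11.13).

```python
import sympy as sp

x0, x1, x2, y1, y2 = sp.symbols('x0 x1 x2 y1 y2')
F = x1**2 + x2**2 - x0**2                               # homogeneous equation of a conic, k = 2
f = sp.expand(F.subs({x1: y1*x0, x2: y2*x0}) / x0**2)   # f = y1^2 + y2^2 - 1, as in (11.11)

a1, a2 = sp.Rational(3, 5), sp.Rational(4, 5)           # point A with alpha_0 = 1; f(A) = 0

# Equation (11.13) in the inhomogeneous coordinates y1, y2
tangent_aff = sum(sp.diff(f, y).subs({y1: a1, y2: a2}) * (y - a)
                  for y, a in [(y1, a1), (y2, a2)])

# Equation (11.3), dehomogenized by x_i = y_i * x_0 and divided by x_0
tangent_hom = sum(sp.diff(F, v).subs({x0: 1, x1: a1, x2: a2}) * v for v in (x0, x1, x2))
tangent_hom = sp.expand(tangent_hom.subs({x1: y1*x0, x2: y2*x0}) / x0)

print(sp.simplify(tangent_aff - tangent_hom))           # 0: the two forms of the equation agree
```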

We may now return to the case in which the hypersurface X=Q is a nonsingular (and therefore smooth) quadric. Then for every point $A \in Q$, equation (11.3) determines a hyperplane in L, that is, some line in the dual space $\mathsf{L}^*$, and therefore a point belonging to the space ℙ(L*), which we shall denote by Φ(A). Thus we define the mapping

$$ \varPhi: Q \to {\mathbb{P}}\bigl({\mathsf{L}}^*\bigr). $$
(11.14)

Our first task consists in determining what the set Φ(Q)⊂ℙ(L*) in fact is. For this, we express the quadratic form F(x) in the form F(x)=φ(x,x), where the symmetric bilinear form φ(x,y) has the form (11.5). By Theorem 6.3, we can write φ(x,y) uniquely as $\varphi(\boldsymbol{x},\boldsymbol{y}) = \mathcal{A}(\boldsymbol{x})(\boldsymbol{y})$, where $\mathcal{A}: \mathsf{L} \to \mathsf{L}^*$ is some linear transformation. From the definitions, it follows that here, the radical of the form φ coincides with the kernel of the linear transformation $\mathcal{A}$. Since in the case of a nonsingular form F, the radical of φ is equal to (0), it follows that the kernel of $\mathcal{A}$ is also equal to (0). Since dimL=dimL*, we have by Theorem 3.68 that the linear transformation $\mathcal{A}$ is an isomorphism, and there is thereby determined a projective transformation $\mathbb{P}(\mathcal{A}): {\mathbb{P}}(\mathsf{L}) \to {\mathbb{P}}(\mathsf{L}^*)$.

Let us now write down our mapping (11.14) in coordinates. If the quadratic form F(x) is written in the form (11.4), then

$$\frac{\partial F}{\partial x_i} = 2 \sum_{j=0}^n a_{ij} x_j,\quad i = 0,1, \ldots, n. $$

On the other hand, in some basis $\boldsymbol{e}_0, \boldsymbol{e}_1, \ldots, \boldsymbol{e}_n$ of the space L, the bilinear form φ(x,y) has the form (11.5), where the vectors x and y are given by $\boldsymbol{x}=x_0\boldsymbol{e}_0+\cdots+x_n\boldsymbol{e}_n$ and $\boldsymbol{y}=y_0\boldsymbol{e}_0+\cdots+y_n\boldsymbol{e}_n$. From this, it follows that the matrix of the transformation $\mathcal{A}$ in the basis $\boldsymbol{e}_0, \boldsymbol{e}_1, \ldots, \boldsymbol{e}_n$ of the space L and in the dual basis $\boldsymbol{f}_0, \boldsymbol{f}_1, \ldots, \boldsymbol{f}_n$ of the space $\mathsf{L}^*$ is equal to $(a_{ij})$. Therefore, to the quadratic form F(x) is associated the isomorphism $\mathcal{A}$, and the mapping (11.14) that we constructed coincides with the restriction of the projective transformation $\mathbb{P}(\mathcal{A})$ to Q, that is, $\varPhi = \mathbb{P}(\mathcal{A})|_Q$.

From this arises an unexpected consequence: since the transformation $\mathbb{P}(\mathcal{A})$ is a bijection, the mapping (11.14) is also a bijection. In other words, the tangent hyperplanes to the nonsingular quadric Q at distinct points $A, B \in Q$ are distinct. Thus we obtain the following result.

Lemma 11.3

The same hyperplane cannot coincide with the tangent hyperplanes to a nonsingular quadric Q at two distinct points.

This means that in writing a hyperplane of the space ℙ(L) in the form $T_AQ$, we may omit the point A: for a nonsingular quadric Q, it makes sense simply to say that a hyperplane is tangent to the quadric, and the point of tangency $A \in Q$ is then uniquely determined.

Let us now consider more concretely what the set Φ(Q) looks like. We shall show that it is also a nonsingular quadric, that is, that in some (and therefore in any) basis of the space $\mathsf{L}^*$ it is determined by the equation q(x)=0, where q is a nonsingular quadratic form.

We saw above that there is an isomorphism $\mathbb{P}(\mathcal{A})$ that bijectively maps Q to Φ(Q). Therefore, there exists as well an inverse transformation $\mathbb{P}(\mathcal{A})^{-1} = \mathbb{P}(\mathcal{A}^{-1})$, which is also an isomorphism. Then the condition $\boldsymbol{y} \in \varPhi(Q)$ is equivalent to the condition that the point corresponding to the vector $\mathcal{A}^{-1}(\boldsymbol{y})$ belongs to Q. Let us choose an arbitrary basis

$$ {\boldsymbol{f} }_0, {\boldsymbol{f} }_1, \ldots, {\boldsymbol{f} }_n $$
(11.15)

in the space $\mathsf{L}^*$. The isomorphism $\mathcal{A}^{-1}$ carries this basis to the basis

$$ \mathcal{A}^{-1}({\boldsymbol{f}}_0),\ \mathcal{A}^{-1}({\boldsymbol{f}}_1),\ \ldots,\ \mathcal{A}^{-1}({\boldsymbol{f}}_n) $$
(11.16)

of the space L. Here obviously the coordinates of the vector $\mathcal{A}^{-1}(\boldsymbol{y})$ in the basis (11.16) coincide with the coordinates of the vector $\boldsymbol{y}$ in the basis (11.15). As we saw above, the condition $\mathcal{A}^{-1}(\boldsymbol{y}) \in Q$ is equivalent to the relationship

$$ F(\alpha _0,\alpha _1, \ldots, \alpha _n) = 0, $$
(11.17)

where F is a nonsingular quadratic form, and $(\alpha_0, \alpha_1, \ldots, \alpha_n)$ are the coordinates of the vector $\mathcal{A}^{-1}(\boldsymbol{y})$ in some basis of the space L, for instance, in the basis (11.16). This means that the condition $\boldsymbol{y} \in \varPhi(Q)$ can be expressed by the same relationship (11.17). Thus we have proved the following statement.

Theorem 11.4

If Q is a nonsingular quadric in the space ℙ(L), then the set of tangent hyperplanes to it forms a nonsingular quadric in the space ℙ(L*).
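In coordinates, Theorem 11.4 amounts to the statement that if Q has Gram matrix $A = (a_{ij})$, then the coefficient vectors of its tangent hyperplanes, viewed as points of ℙ(L*), satisfy the quadric with matrix $A^{-1}$; this is a standard coordinate reformulation of the correspondence between points of Q and their tangent hyperplanes, not spelled out explicitly above. A quick numerical check (sympy assumed, sample data illustrative):

```python
import sympy as sp

A = sp.Matrix([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1]])   # a nonsingular form
a = sp.Matrix([1, 2, 2, 3])                 # a point of Q: 1 + 4 + 4 - 9 = 0
assert (a.T * A * a)[0] == 0

u = A * a                                   # coefficient vector of the tangent hyperplane T_A Q
print((u.T * A.inv() * u)[0])               # 0: the tangent hyperplanes lie on the quadric
                                            #    in P(L*) with Gram matrix A^{-1}
```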

Repeating verbatim the arguments presented in Sect. 9.1, we may extend the duality principle formulated there. Namely, we can add to it the following pairs of mutually dual notions, which can be interchanged in such a way that the general assertion formulated on p. 326 remains valid:

$$\begin{array}{r@{\quad}||@{\quad}l} \mbox{\small nonsingular quadric in ${\mathbb{P}}({\mathsf{L}})$} & \mbox{\small nonsingular quadric in ${\mathbb{P}}({\mathsf{L}}^{*})$} \\ \mbox{\small point in a nonsingular quadric} & \mbox{\small hyperplane tangent to a nonsingular quadric} \end{array} $$

This (seemingly small) extension of the duality principle leads to completely unexpected results. By way of an example, we shall introduce two famous theorems that are duals of each other, that is, equivalent on the basis of the duality principle. Yet the second of them was published 150 years after the first. These theorems relate to quadrics in two-dimensional projective space, that is, in the projective plane. In this case, a quadric is called a conic.

In the sequel, we shall use the following terminology. Let Q be a nonsingular conic, and let $A_1, \ldots, A_6$ be six distinct points of Q. This ordered (that is, their order is significant) collection of points is called a hexagon inscribed in the conic Q. For two distinct points A and B of the projective plane, their projective cover (that is, the line passing through them) is denoted by \(\overline {AB}\) (cf. the definition on p. 325). The six lines \(\overline {A_{1} A_{2}}, \overline {A_{2} A_{3}}, \ldots, \overline {A_{5} A_{6}}, \overline {A_{6} A_{1}}\) are called the sides of the hexagon. Here the following pairs of sides will be called opposite sides: \(\overline {A_{1} A_{2}}\) and \(\overline {A_{4} A_{5}}\), \(\overline {A_{2} A_{3}}\) and \(\overline {A_{5} A_{6}}\), \(\overline {A_{3} A_{4}}\) and \(\overline {A_{6} A_{1}}\).

Theorem 11.5

(Pascal’s theorem)

Pairs of opposite sides of an arbitrary hexagon inscribed in a nonsingular conic intersect in three collinear points. See Fig. 11.1.

Fig. 11.1 Hexagon inscribed in a conic

Before formulating the dual theorem to Pascal’s theorem, let us make a few comments.

With the selection of a homogeneous system of coordinates (x 0:x 1:x 2) in the projective plane, the equation of the conic Q can be written in the form

$$F(x_0 : x_1 : x_2) = a_1 x_0^2 + a_2 x_0 x_1 + a_3 x_0 x_2 + a_4 x_1^2 + a_5 x_1 x_2 + a_6 x_2^2 = 0. $$

This equation involves six coefficients. If we have k points $A_1, \ldots, A_k$, then the condition that they belong to the conic Q reduces to the relationships

$$ F (A_i) = 0,\quad i = 1, \ldots, k, $$
(11.18)

which yield a system consisting of k linear homogeneous equations in the six unknowns $a_1, \ldots, a_6$. We must find a nontrivial solution to this system. If k=6, then this question falls under Corollary 2.13 as a special case (and this explains our interest in hexagons inscribed in a conic). By this corollary, it remains to verify that the determinant of the system (11.18) for k=6 is equal to zero. It is Pascal's theorem that gives a geometric interpretation of this condition.
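The determinant condition can be tested directly: pick six points on a conic, form the 6 × 6 matrix of the system (11.18), and check that its determinant vanishes. A sketch assuming sympy, with an arbitrarily chosen conic and points:

```python
import sympy as sp

def monomials(p):
    x0, x1, x2 = p
    return [x0**2, x0*x1, x0*x2, x1**2, x1*x2, x2**2]

# Six points on the conic x0^2 - x1*x2 = 0, taken as (t : t^2 : 1) for six parameter values
pts = [(t, t**2, 1) for t in (1, 2, 3, 4, 5, 6)]

M = sp.Matrix([monomials(p) for p in pts])
print(M.det())      # 0: the system (11.18) with k = 6 admits a nontrivial solution
```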

It is not difficult to show that the condition of Pascal's theorem gives a necessary and sufficient condition for six points $A_1, \ldots, A_6$ to lie on some conic if we restrict ourselves, first, to nonsingular conics, and second, to collections of six points no three of which are collinear (this is proved in any sufficiently rigorous course in analytic geometry).

Now let us formulate the dual theorem to Pascal's theorem. Here six distinct lines $L_1, \ldots, L_6$ tangent to a conic Q will be called a hexagon circumscribed about the conic. The points $L_1 \cap L_2$, $L_2 \cap L_3$, $L_3 \cap L_4$, $L_4 \cap L_5$, $L_5 \cap L_6$, and $L_6 \cap L_1$ are called the vertices of the hexagon. Here the following pairs of vertices will be called opposite: $L_1 \cap L_2$ and $L_4 \cap L_5$, $L_2 \cap L_3$ and $L_5 \cap L_6$, $L_3 \cap L_4$ and $L_6 \cap L_1$.

Theorem 11.6

(Brianchon’s theorem)

The lines connecting opposite vertices of an arbitrary hexagon circumscribed about a nonsingular conic intersect at a common point. See Fig. 11.2.

Fig. 11.2 Hexagon circumscribed about a conic

It is obvious that Brianchon’s theorem is obtained from Pascal’s theorem if we replace in it all the concepts by their duals according to the rules given above. Thus by virtue of the general duality principle, Brianchon’s theorem follows from Pascal’s theorem. Pascal’s theorem itself can be proved easily, but we will not present a proof, since its logic is connected with another area, namely algebraic geometry. Here it is of interest to observe only that the duality principle makes it possible to obtain certain results from others that appear at first glance to be entirely unrelated. Indeed, Pascal proved his theorem in the seventeenth century (when he was 16 years old), while Brianchon proved his theorem in the nineteenth century, more than 150 years later. And moreover, Brianchon used entirely different arguments (the general duality principle was not yet understood at the time).

11.2 Quadrics in Complex Projective Space

Let us now consider the projective space ℙ(L), where L is a complex vector space, and as before, let us limit ourselves to the case of nonsingular quadrics. As we saw in Sect. 6.3 (formula (6.27)), a nonsingular quadratic form in a complex space has the canonical form \(x_{0}^{2} + x_{1}^{2} + \cdots+ x_{n}^{2}\). This means that in some coordinate system, the equation of a nonsingular quadric can be written as

$$ x_0^2 + x_1^2 + \cdots+ x_n^2 = 0, $$
(11.19)

that is, every nonsingular quadric can be transformed into the quadric (11.19) by some projective transformation. In other words, in a complex projective space there exists (defined up to a projective transformation) only one nonsingular quadric (11.19). It is this quadric that we shall now investigate.

In view of what we have said above, it suffices to consider any one arbitrary nonsingular quadric on the projective space ℙ(L) of a given dimension. For example, we may choose the quadric given by the equation F(x)=0, where the matrix of the quadratic form F(x) has the form

(11.20)

A simple calculation shows that the determinant of the matrix (11.20) is equal to +1 or −1, that is, it is nonzero.

A fundamental topic that we shall study in this and the following sections is projective subspaces contained in a quadric. Let the quadric Q be given by the equation F(x)=0, where xL, and let a projective subspace have the form ℙ(L′), where L′ is a subspace of the vector space L. Then the projective subspace ℙ(L′) is contained in Q if and only if F(x)=0 for all vectors xL′.

Definition 11.7

A subspace $\mathsf{L}' \subset \mathsf{L}$ is said to be isotropic with respect to a quadratic form F if $F(\boldsymbol{x})=0$ for all vectors $\boldsymbol{x} \in \mathsf{L}'$.

Let φ be the symmetric bilinear form associated with the quadratic form F, according to Theorem 6.6. Then by virtue of (6.14), we see that φ(x,y)=0 for all vectors $\boldsymbol{x},\boldsymbol{y} \in \mathsf{L}'$. Therefore, we shall also say that the subspace L′⊂L is isotropic with respect to the bilinear form φ.

We have already encountered the simplest example of isotropic subspaces, in Sect. 7.7 in our study of pseudo-Euclidean spaces. There we encountered lightlike (also called isotropic) vectors on which a quadratic form (x 2) defining a pseudo-Euclidean space becomes zero. Every nonnull lightlike vector e clearly determines a one-dimensional subspace 〈e〉.

The basic technique that will be used in this and the following sections consists in reformulating our questions about subspaces contained in a quadric F(x)=0 in terms of a vector space L, a symmetric bilinear form φ(x,y) defined on L and corresponding to the quadratic form F(x), and subspaces isotropic with respect to F and φ. Then everything is settled almost trivially on the basis of the simplest properties of linear and bilinear forms.
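In matrix terms, a subspace spanned by the rows of a matrix B is isotropic for the form with Gram matrix A exactly when $B A B^{\mathsf{T}} = 0$. A small numerical sketch (numpy assumed; the matrix A and the subspaces are illustrative):

```python
import numpy as np

# Gram matrix of the form x0*y0 + x1*y1 (up to a factor), coordinates (x0, x1, y0, y1)
A = np.array([[0, 0, 1, 0],
              [0, 0, 0, 1],
              [1, 0, 0, 0],
              [0, 1, 0, 0]], dtype=float)

def is_isotropic(basis, A):
    B = np.array(basis, dtype=float)
    return np.allclose(B @ A @ B.T, 0)      # phi(u, v) = 0 for all u, v in the span

print(is_isotropic([[1, 0, 0, 0], [0, 1, 0, 0]], A))   # True:  the plane y0 = y1 = 0 is isotropic
print(is_isotropic([[1, 0, 0, 0], [0, 0, 1, 0]], A))   # False: this plane is not
```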

Theorem 11.8

The dimension of an arbitrary isotropic subspace L′⊂L relative to an arbitrary nonsingular quadratic form F does not exceed half of dimL.

Proof

Let us consider $(\mathsf{L}')^{\perp}$, the orthogonal complement of the subspace L′⊂L with respect to the bilinear form φ(u,v) associated with F(x). The quadratic form F(x) and bilinear form φ(u,v) are nonsingular. Therefore, we have relationship (7.75), from which follows the equality $\dim(\mathsf{L}')^{\perp} = \dim\mathsf{L} - \dim\mathsf{L}'$.

That the space L′ is isotropic means that $\mathsf{L}' \subset (\mathsf{L}')^{\perp}$. From this we obtain the inequality

$$\dim {\mathsf{L}}' \le\dim\bigl({\mathsf{L}}'\bigr)^{\perp} = \dim {\mathsf{L}}- \dim {\mathsf{L}}', $$

from which it follows that \(\dim {\mathsf{L}}' \le\frac{1}{2} \dim {\mathsf{L}}\), as asserted in the theorem. □

In the sequel, we shall limit our study of isotropic subspaces to those of the greatest possible dimension, namely \(\frac{1}{2} \dim {\mathsf{L}}\) when the number dimL is even and \(\frac{1}{2} (\dim {\mathsf{L}}-1)\) when it is odd. The general case \(\dim {\mathsf{L}}' \le\frac{1}{2} \dim {\mathsf{L}}\) is easily reduced to this limiting case and is studied completely analogously.

Let us consider some of the simplest cases, known from analytic geometry.

Example 11.9

The simplest case of all is dimL=2, and therefore, dimℙ(L)=1. In coordinates (x 0:x 1), the quadratic form with matrix (11.20) has the form x 0 x 1. Clearly, the quadric x 0 x 1=0 consists of two points (0:1) and (1:0), corresponding to the vectors e 1=(0,1) and e 2=(1,0) in the plane L. Each of the two points determines an isotropic subspace \({\mathsf{L}}'_{i} = \langle {\boldsymbol{e} }_{i}\rangle\).

Example 11.10

Next in complexity is the case dimL=3, and correspondingly, dimℙ(L)=2. In this case, we are dealing with quadrics in the projective plane; their points determine one-dimensional isotropic subspaces in L that therefore form a continuous family. (If the equation of the quadric is F(x 0,x 1,x 2)=0, then in the space L, it determines a quadratic cone whose generatrices are isotropic subspaces.)

Example 11.11

The following case corresponds to dimL=4 and dimℙ(L)=3. These are quadrics in three-dimensional projective space. For isotropic subspaces L′⊂L, Theorem 11.8 gives dimL′≤2. Isotropic subspaces of maximal dimension are obtained for dimL′=2, that is, dimℙ(L′)=1. These are projective lines lying on the quadric. In coordinates (x 0:x 1:y 0:y 1), the quadratic form with matrix (11.20) gives the equation

$$ x_0 y_0 + x_1 y_1 = 0. $$
(11.21)

We must find all two-dimensional isotropic subspaces L′⊂L. Let a basis of the two-dimensional subspace L′ consist of vectors e=(a 0,a 1,b 0,b 1) and \({\boldsymbol{e} }' = (a_{0}',a_{1}',b_{0}',b_{1}')\). Then the fact that L′ is isotropic is expressed, in view of formula (11.21), by the relationship

$$ \bigl(a_0 u + a_0' v\bigr) \bigl(b_0 u + b_0' v\bigr) + \bigl(a_1 u + a_1' v\bigr) \bigl(b_1 u + b_1' v\bigr) = 0, $$
(11.22)

which is satisfied identically for all u and v. The left-hand side of equation (11.22) represents a quadratic form in the variables u and v, which can be identically equal to zero only in the case that all its coefficients are equal to zero. Removing parentheses in (11.22), we obtain

$$ a_0 b_0 + a_1 b_1 = 0,\qquad a_0 b_0' + a_0' b_0 + a_1 b_1' + a_1' b_1 = 0,\qquad a_0' b_0' + a_1' b_1' = 0. $$
(11.23)

The first equation from (11.23) means that the rows (a 0,a 1) and (b 1,−b 0) are proportional. Since they cannot both be equal to zero simultaneously (then all coordinates of the basis vector e would be equal to zero, which is impossible), it follows that one of them is the product of the other and some (uniquely determined) scalar β. For definiteness, let a 0=βb 1, a 1=−βb 0 (the case b 1=βa 0, b 0=−βa 1 is considered analogously). In just the same way, from the third equation of (11.23), we obtain that \(a_{0}' = \gamma b_{1}'\), \(a_{1}' = -\gamma b_{0}'\) with some scalar γ. Substituting the relationships

$$ a_0 = \beta b_1,\qquad a_1 = -\beta b_0,\qquad a_0' = \gamma b_1',\qquad a_1' = -\gamma b_0' $$
(11.24)

into the second equation of (11.23), we obtain the equality \((\beta- \gamma) (b_{0}' b_{1} - b_{0} b_{1}') = 0\). Therefore, either \(b_{0}' b_{1} - b_{0} b_{1}' = 0\) or γ=β.

In the first case, from the equality \(b_{0}' b_{1} - b_{0} b_{1}' = 0\) it follows that the rows \((b_{0}, b_{0}')\) and \((b_{1}, b_{1}')\) are proportional, and we obtain the relationships $b_1 = -\alpha b_0$ and $b_1' = -\alpha b_0'$ with some scalar α (the case $b_0 = -\alpha b_1$ and $b_0' = -\alpha b_1'$ is considered similarly). Let us assume that $b_1$ and $b_1'$ are not both equal to zero. Then α≠0, and taking into account the relationships (11.24), we obtain

$$ a_0 = \alpha a_1,\qquad b_0 = -\alpha^{-1} b_1,\qquad a_0' = \alpha a_1',\qquad b_0' = -\alpha^{-1} b_1'. $$
In the second case, let us suppose that $a_0$ and $a_1$ are not both equal to zero. Then β≠0, and taking into account relationship (11.24), we obtain

$$ b_1 = \beta^{-1} a_0,\qquad b_0 = -\beta^{-1} a_1,\qquad b_1' = \beta^{-1} a_0',\qquad b_0' = -\beta^{-1} a_1'. $$
Thus with the assumptions made, for an arbitrary vector of the subspace L′ with coordinates $(x_0, y_0, x_1, y_1)$, we have either

$$ x_0 = \alpha x_1,\qquad y_0 = -\alpha ^{-1}y_1 $$
(11.25)

or

$$ x_0 = \beta y_1,\qquad y_0 = - \beta^{-1} x_1, $$
(11.26)

where α and β are certain nonzero scalars.

In order to consider the excluded cases, namely α=0 (\(b_{1} = b_{1}' = 0\)) and β=0 (a 0=a 1=0), let us introduce points (a:b)∈ℙ1 and (c:d)∈ℙ1, that is, pairs of numbers that are not simultaneously equal to zero, and let us consider them as defined up to multiplication by one and the same nonzero scalar. Then as is easily verified, a homogeneous representation of relationships (11.25) and (11.26) that also includes both previously excluded cases will have the form

$$ a x_0 = b x_1,\qquad b y_0 = -a y_1 $$
(11.27)

and

$$ c x_0 = d y_1,\qquad d y_0 = -c x_1 $$
(11.28)

respectively. Indeed, equality (11.25) is obtained from (11.27) for a=1 and b=α, while (11.26) is obtained from (11.28) for c=1 and d=β.

Relationships (11.27) give the isotropic plane L′⊂L or the line ℙ(L′) in ℙ(L), which belongs to the quadric (11.21). It is determined by the point (a:b)∈ℙ1. Thus we obtain one family of lines. Similarly, relationships (11.28) determine a second family of lines. Together, they give all the lines contained in our quadric (called a hyperboloid of one sheet). These lines are called the rectilinear generatrices of the hyperboloid.

On the basis of the formulas we have written down, it is easy to verify some properties known from analytic geometry: two distinct lines from one family of rectilinear generatrices do not intersect, while two lines from different families do intersect (at a single point). For every point of the hyperboloid, there is a line from each of the two families that passes through it.
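These intersection properties are easy to confirm from the relationships (11.27) and (11.28) themselves. A sketch assuming sympy; the particular parameter values (a : b) and (c : d) are arbitrary choices.

```python
import sympy as sp

x0, x1, y0, y1 = sp.symbols('x0 x1 y0 y1')

def family1(a, b):   # relationships (11.27): a*x0 = b*x1, b*y0 = -a*y1
    return [a*x0 - b*x1, b*y0 + a*y1]

def family2(c, d):   # relationships (11.28): c*x0 = d*y1, d*y0 = -c*x1
    return [c*x0 - d*y1, d*y0 + c*x1]

# Two distinct lines of the first family have only the zero solution in common,
# i.e. no common projective point
print(sp.solve(family1(1, 2) + family1(1, 3), [x0, x1, y0, y1]))

# A line from each family meets in a one-dimensional subspace, i.e. a single projective point
print(sp.solve(family1(1, 2) + family2(1, 5), [x0, x1, y0, y1]))
```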

In the following section, we shall consider the general case of projective subspaces of maximum possible dimension on a nonsingular quadric of arbitrary dimension in complex projective space.

11.3 Isotropic Subspaces

Let Q be a nonsingular quadric in a complex projective space ℙ(L) given by the equation F(x)=0, where F(x) is a nonsingular quadratic form on the space L. In analogy to what we discussed in the previous section, we shall study m-dimensional subspaces L′⊂L that are isotropic with respect to F, assuming that dimL=2m if dimL is even, and dimL=2m+1 if dimL is odd.

The special cases that we studied in the preceding section show that isotropic subspaces look different for different values of dimL. Thus for dimL=3, we found one family of isotropic subspaces, continuously parameterized by the points of the quadric Q. For dimL=2 or 4, we found two such families. This leads to the idea that the number of continuously parameterized families of isotropic subspaces on a quadric depends on the parity of the number dimL. As we shall now see, such is indeed the case.

The cases of even and odd dimension will be treated separately.

Case 1. Let us assume that dimL=2m. Consequently, we are interested in isotropic subspaces $\mathsf{M} \subset \mathsf{L}$ of dimension m. (This is the most interesting case, since here we shall see how the families of lines on a hyperboloid of one sheet are generalized.)

Theorem 11.12

For every m-dimensional isotropic subspace $\mathsf{M} \subset \mathsf{L}$, there exists another m-dimensional isotropic subspace $\mathsf{N} \subset \mathsf{L}$ such that

$$ {\mathsf{L}}= {\mathsf{M}}\oplus {\mathsf{N}}. $$
(11.29)

Proof

Our proof is by induction on the number m. For m=0, the statement of the theorem is vacuously true.

Let us assume now that m>0, and let us consider an arbitrary nonnull vector $\boldsymbol{e} \in \mathsf{M}$. Let φ(x,y) be the symmetric bilinear form associated with the quadratic form F(x). Since the subspace M is isotropic, it follows that φ(e,e)=0. In view of the nonsingularity of F(x), the bilinear form φ(x,y) is likewise nonsingular, and therefore, its radical is equal to (0). Then the linear function φ(e,x) of a vector $\boldsymbol{x} \in \mathsf{L}$ is not identically equal to zero (otherwise, the vector e would be in the radical of φ(x,y), which is equal to (0)).

Let $\boldsymbol{f} \in \mathsf{L}$ be a vector such that φ(e,f)≠0. Clearly, the vectors e,f are linearly independent. Let us consider the plane W=〈e,f〉 and denote by φ′ the restriction of the bilinear form φ to W. In the basis e,f, the matrix of the bilinear form φ′ has the form

$$\varPhi' = \begin{pmatrix} 0 & \varphi(\boldsymbol{e},\boldsymbol{f}) \\ \varphi(\boldsymbol{e},\boldsymbol{f}) & \varphi(\boldsymbol{f},\boldsymbol{f}) \end{pmatrix}. $$

It is obvious that $|\varPhi'| = -\varphi(\boldsymbol{e},\boldsymbol{f})^2 \neq 0$, and therefore, the bilinear form φ′ is nonsingular.

Let us define the vector

$${\boldsymbol{g} }= {\boldsymbol{f} }- \frac{\varphi ({\boldsymbol{f} },{\boldsymbol{f} })}{2\varphi ({\boldsymbol{e} },{\boldsymbol{f} })} {\boldsymbol{e} }. $$

Then as is easily verified, φ(g,g)=0, φ(e,g)=φ(e,f)≠0, and the vectors e,g are linearly independent, that is, W=〈e,g〉. In the basis e,g, the matrix of the bilinear form φ′ has the form

$$\varPhi' = \begin{pmatrix} 0 & \varphi(\boldsymbol{e},\boldsymbol{f}) \\ \varphi(\boldsymbol{e},\boldsymbol{f}) & 0 \end{pmatrix}. $$
As a result of the nondegeneracy of the bilinear form φ′, we have by Theorem 6.9 the decomposition

$$ {\mathsf{L}}= {\mathsf{W}}\oplus {\mathsf{L}}_1,\qquad {\mathsf{L}}_1 = {\mathsf{W}}^{\perp}_{\varphi }, $$
(11.30)

where $\dim\mathsf{L}_1 = 2m-2$. Let us set $\mathsf{M}_1 = \mathsf{L}_1 \cap \mathsf{M}$ and show that $\mathsf{M}_1$ is a subspace of dimension m−1 isotropic with respect to the restriction of the bilinear form φ to $\mathsf{L}_1$.

By construction, the subspace $\mathsf{M}_1$ consists of the vectors $\boldsymbol{x} \in \mathsf{M}$ such that φ(x,e)=0 and φ(x,g)=0. But the first equality holds automatically for all $\boldsymbol{x} \in \mathsf{M}$, since $\boldsymbol{e} \in \mathsf{M}$ and M is isotropic with respect to φ. Thus in the definition of the subspace $\mathsf{M}_1$, there remains only the second equality, which means that $\mathsf{M}_1 \subset \mathsf{M}$ is the set of vectors of M annihilated by the linear function $f(\boldsymbol{x}) = \varphi(\boldsymbol{x},\boldsymbol{g})$, which is not identically equal to zero on M (since $f(\boldsymbol{e}) = \varphi(\boldsymbol{e},\boldsymbol{g}) \neq 0$). Therefore, $\dim\mathsf{M}_1 = \dim\mathsf{M} - 1 = m-1$.

Thus M 1 is a subspace of L 1 of half the dimension of L 1, defined by formula (11.30), and we can apply the induction hypothesis to it to obtain the decomposition

$$ {\mathsf{L}}_1 = {\mathsf{M}}_1 \oplus {\mathsf{N}}_1, $$
(11.31)

where $\mathsf{N}_1 \subset \mathsf{L}_1$ is some other (m−1)-dimensional isotropic subspace.

Let us note that $\mathsf{M} = \langle\boldsymbol{e}\rangle \oplus \mathsf{M}_1$, and let us set $\mathsf{N} = \langle\boldsymbol{g}\rangle \oplus \mathsf{N}_1$. The subspace N is isotropic in L, since the subspace $\mathsf{N}_1$ is isotropic in $\mathsf{L}_1$, $\varphi(\boldsymbol{g},\boldsymbol{g})=0$, and $\varphi(\boldsymbol{g},\boldsymbol{x})=0$ for all vectors $\boldsymbol{x} \in \mathsf{N}_1$ (because $\mathsf{N}_1 \subset \mathsf{L}_1 = \mathsf{W}^{\perp}_{\varphi}$ and $\boldsymbol{g} \in \mathsf{W}$). Formulas (11.30) and (11.31) together give the decomposition

$${\mathsf{L}}= \langle {\boldsymbol{e} }\rangle \oplus\langle {\boldsymbol{g} }\rangle \oplus {\mathsf{M}}_1 \oplus {\mathsf{N}}_1 ={\mathsf{M}}\oplus {\mathsf{N}}, $$

which is what was to be proved. □

In the notation of Theorem 11.12, an arbitrary vector $\boldsymbol{z} \in \mathsf{N}$ determines a linear function $f(\boldsymbol{x}) = \varphi(\boldsymbol{z},\boldsymbol{x})$ on the vector space L, that is, an element of the dual space $\mathsf{L}^*$. The restriction of this function to the subspace $\mathsf{M} \subset \mathsf{L}$ is obviously a linear function on M, that is, an element of the space $\mathsf{M}^*$. This defines a mapping $\psi: \mathsf{N} \to \mathsf{M}^*$. A trivial verification shows that $\psi$ is a linear transformation.

The decomposition (11.29) established by Theorem 11.12 has an interesting consequence.

Lemma 11.13

The linear transformation $\psi$ constructed above is an isomorphism.

Proof

Let us determine the kernel of the transformation $\psi$. Let us assume that $\psi(\boldsymbol{z}_0) = 0$ for some $\boldsymbol{z}_0 \in \mathsf{N}$, that is, $\varphi(\boldsymbol{z}_0,\boldsymbol{y})=0$ for all vectors $\boldsymbol{y} \in \mathsf{M}$. But by Theorem 11.12, every vector $\boldsymbol{x} \in \mathsf{L}$ can be represented in the form $\boldsymbol{x} = \boldsymbol{y} + \boldsymbol{z}$, where $\boldsymbol{y} \in \mathsf{M}$ and $\boldsymbol{z} \in \mathsf{N}$. Thus

$$\varphi ({\boldsymbol{z} }_0,{\boldsymbol{x} }) = \varphi ({\boldsymbol{z} }_0,{\boldsymbol{y} }) + \varphi ( {\boldsymbol{z} }_0,{\boldsymbol{z} }) = \varphi ({\boldsymbol{z} }_0,{\boldsymbol{z} }) = 0, $$

since both vectors $\boldsymbol{z}$ and $\boldsymbol{z}_0$ belong to the isotropic subspace N. From the nonsingularity of the bilinear form φ, it then follows that $\boldsymbol{z}_0 = \boldsymbol{0}$, that is, the kernel of $\psi$ consists of only the null vector. Since dimM=dimN, we have by Theorem 3.68 that the linear transformation $\psi$ is an isomorphism. □

Let $\boldsymbol{e}_1, \ldots, \boldsymbol{e}_m$ be some basis in M, and $\boldsymbol{f}_1, \ldots, \boldsymbol{f}_m$ the dual basis in $\mathsf{M}^*$. The isomorphism $\psi$ that we constructed creates a correspondence between this dual basis and a certain basis $\boldsymbol{g}_1, \ldots, \boldsymbol{g}_m$ in the space N according to the formula $\psi(\boldsymbol{g}_i) = \boldsymbol{f}_i$, $i = 1, \ldots, m$. From the decomposition (11.29) established in Theorem 11.12, it follows that the vectors $\boldsymbol{e}_1, \ldots, \boldsymbol{e}_m, \boldsymbol{g}_1, \ldots, \boldsymbol{g}_m$ form a basis in L. In this basis, the bilinear form φ has the simplest possible matrix Φ. Indeed, recalling the definitions of the concepts that we have used, we obtain that

$$ \varPhi = \begin{pmatrix} 0 & E \\ E & 0 \end{pmatrix}, $$
(11.32)

where E and 0 are the identity and zero matrices of order m. For the corresponding quadratic form F and vector

$${\boldsymbol{x} }= x_1{\boldsymbol{e} }_1 + \cdots+ x_m{\boldsymbol{e} }_m + x_{m+1}{\boldsymbol{g} }_1 + \cdots+ x_{2m}{\boldsymbol{g} }_m, $$

we obtain

$$ F({\boldsymbol{x} }) = \sum_{i=1}^m x_i x_{m+i}. $$
(11.33)

Conversely, if in some basis e 1,…,e 2m of the vector space L, the bilinear form φ has matrix (11.32), then the space L can be represented in the form

$${\mathsf{L}}= {\mathsf{M}}\oplus {\mathsf{N}},\quad {\mathsf{M}}= \langle {\boldsymbol{e} }_1, \ldots, {\boldsymbol{e} }_{m}\rangle, {\mathsf{N}}= \langle {\boldsymbol{e} }_{m+1}, \ldots, {\boldsymbol{e} }_{2m}\rangle, $$

in accordance with Theorem 11.12. Let us recall that in our case (in a complex projective space), all nonsingular bilinear forms are equivalent, and therefore, every nonsingular bilinear form φ has matrix (11.32) in some basis. In particular, we see that in the 2m-dimensional space L, there exists an m-dimensional isotropic subspace M.

In order to generalize known results from analytic geometry for m=2 to the case of arbitrary m (see Example 11.11), we shall provide several definitions that naturally generalize some concepts about Euclidean spaces familiar to us from Chap. 7.

Definition 11.14

Let φ(x,y) be a nonsingular symmetric bilinear form on the space L of arbitrary dimension. A linear transformation $\mathcal{U}: \mathsf{L} \to \mathsf{L}$ is said to be orthogonal with respect to φ if

$$ \varphi \bigl(\mathcal{U}(\boldsymbol{x}), \mathcal{U}(\boldsymbol{y})\bigr) = \varphi (\boldsymbol{x},\boldsymbol{y}) $$
(11.34)

for all vectors x,yL.

This definition generalizes the notion of an orthogonal transformation of a Euclidean space and of a Lorentz transformation of a pseudo-Euclidean space. Similarly, we shall call a basis $\boldsymbol{e}_1, \ldots, \boldsymbol{e}_n$ of a space L orthonormal with respect to a bilinear form φ if $\varphi(\boldsymbol{e}_i,\boldsymbol{e}_i)=1$ and $\varphi(\boldsymbol{e}_i,\boldsymbol{e}_j)=0$ for all $i \neq j$. Every orthogonal transformation takes an orthonormal basis into an orthonormal basis, and for any two orthonormal bases, there exists a unique orthogonal transformation taking the first of them to the second. The proofs of these assertions coincide word for word with the proofs of the analogous assertions from Sect. 7.2, since there we nowhere used the positive definiteness of the bilinear form (x,y), but only its nonsingularity.

The condition (11.34) can be expressed in matrix form. Let the bilinear form φ have matrix Φ in some basis $\boldsymbol{e}_1, \ldots, \boldsymbol{e}_n$ of the space L. Then the transformation $\mathcal{U}$ will be orthogonal with respect to φ if and only if its matrix U in this basis satisfies the relationship

$$ U^* \varPhi U = \varPhi. $$
(11.35)

This is proved just as was the analogous equality (7.18) for orthogonal transformations of Euclidean spaces, and (7.18) is a special case of formula (11.35) for Φ=E.

It follows from formula (11.35) that $|U^*| \cdot |\varPhi| \cdot |U| = |\varPhi|$, and taking into account the nonsingularity of the form φ (that is, $|\varPhi| \neq 0$), that $|U^*| \cdot |U| = 1$, that is, $|U|^2 = 1$. From this we finally obtain the equality $|U| = \pm 1$, in which $|U|$ can be replaced by $|\mathcal{U}|$, since the determinant of a linear transformation does not depend on the choice of basis in the space, and consequently coincides with the determinant of the matrix of this transformation.
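Relationship (11.35) and the conclusion $|U| = \pm 1$ are straightforward to check numerically. In the sketch below (numpy assumed), U is taken to be the transformation interchanging the two halves of a hyperbolic basis, whose matrix happens to coincide with (11.32); this particular choice is only an illustration.

```python
import numpy as np

m = 2
Phi = np.block([[np.zeros((m, m)), np.eye(m)],
                [np.eye(m), np.zeros((m, m))]])     # the matrix (11.32)

U = Phi.copy()        # the transformation exchanging e_i and g_i, i = 1, ..., m

print(np.allclose(U.T @ Phi @ U, Phi))              # True: relationship (11.35) holds
print(np.linalg.det(U))                             # (-1)^m = 1.0 here; |U| = +-1 in general
```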

The equality $|\mathcal{U}| = \pm 1$ generalizes a well-known property of orthogonal transformations of a Euclidean space and provides justification for an analogous definition.

Definition 11.15

A linear transformation $\mathcal{U}$ orthogonal with respect to a symmetric bilinear form φ is said to be proper if $|\mathcal{U}| = +1$ and improper if $|\mathcal{U}| = -1$.

It follows at once from Theorem 2.54 on the determinant of the product of matrices that proper and improper transformations multiply just like the numbers +1 and −1. Similarly, the transformation $\mathcal{U}^{-1}$ is of the same type (proper or improper) as $\mathcal{U}$.

The concepts that we have introduced can be applied to the theory of isotropic subspaces on the basis of the following result.

Theorem 11.16

For any two m-dimensional isotropic subspaces M and Mof a 2m-dimensional space L, there exists an orthogonal transformation taking one of the subspaces to the other.

Proof

Since Theorem 11.12 can be applied to each of the subspaces M and M′, there exist m-dimensional isotropic subspaces N and N′ such that

$${\mathsf{L}}= {\mathsf{M}}\oplus {\mathsf{N}}= {\mathsf{M}}' \oplus {\mathsf{N}}'. $$

As we have noted above, from the decomposition L=MN, it follows that in the space L, there exists a basis e 1,…,e 2m comprising the bases of the subspaces M and N in which the matrix of the bilinear form φ is equal to (11.32). The second decomposition L=M′⊕N′ gives us a similar basis \({\boldsymbol{e} }'_{1}, \ldots, {\boldsymbol{e} }'_{2m}\).

Let us define the transformation $\mathcal{U}$ by its action on the vectors of the basis $\boldsymbol{e}_1, \ldots, \boldsymbol{e}_{2m}$ according to the formula $\mathcal{U}(\boldsymbol{e}_i) = \boldsymbol{e}'_i$ for all $i = 1, \ldots, 2m$. It is obvious that then the image $\mathcal{U}(\mathsf{M})$ is equal to $\mathsf{M}'$. Furthermore, for any two vectors $\boldsymbol{x} = x_1\boldsymbol{e}_1 + \cdots + x_{2m}\boldsymbol{e}_{2m}$ and $\boldsymbol{y} = y_1\boldsymbol{e}_1 + \cdots + y_{2m}\boldsymbol{e}_{2m}$, their images $\mathcal{U}(\boldsymbol{x})$ and $\mathcal{U}(\boldsymbol{y})$ have, in the basis $\boldsymbol{e}'_1, \ldots, \boldsymbol{e}'_{2m}$, decompositions with the same coordinates: $\mathcal{U}(\boldsymbol{x}) = x_1\boldsymbol{e}'_1 + \cdots + x_{2m}\boldsymbol{e}'_{2m}$ and $\mathcal{U}(\boldsymbol{y}) = y_1\boldsymbol{e}'_1 + \cdots + y_{2m}\boldsymbol{e}'_{2m}$. From this it follows that

$$\varphi\bigl(\mathcal{U}(\boldsymbol{x}), \mathcal{U}(\boldsymbol{y})\bigr) = \varphi(\boldsymbol{x}, \boldsymbol{y}),$$

showing that $\mathcal{U}$ is an orthogonal transformation. □

Let us note that Theorem 11.16 does not assert the uniqueness of such a transformation $\mathcal{U}$; in fact, it is not unique. Let us consider this question in more detail. Let $\mathcal{U}_1$ and $\mathcal{U}_2$ be two orthogonal transformations of the kind considered in Theorem 11.16, that is, $\mathcal{U}_1(\mathsf{M}) = \mathsf{M}'$ and $\mathcal{U}_2(\mathsf{M}) = \mathsf{M}'$. Applying to both sides of the equality $\mathcal{U}_1(\mathsf{M}) = \mathcal{U}_2(\mathsf{M})$ the transformation $\mathcal{U}_2^{-1}$, we obtain $\mathcal{U}_2^{-1}\mathcal{U}_1(\mathsf{M}) = \mathsf{M}$, where $\mathcal{U}_2^{-1}\mathcal{U}_1$ is also an orthogonal transformation. Our further considerations are based on the following result.

Lemma 11.17

Let M be an m-dimensional isotropic subspace of a 2m-dimensional space L, and let $\mathcal{U}_0$ be an orthogonal transformation taking M to itself. Then the transformation $\mathcal{U}_0$ is proper.

Proof

By assumption, M is an invariant subspace of the transformation $\mathcal{U}_0$. This means that in an arbitrary basis of the space L whose first m vectors form a basis of M, the matrix of the transformation $\mathcal{U}_0$ has the block form

$$ U_0 = \begin{pmatrix} A & B \\ 0 & C \end{pmatrix}, $$
(11.36)

where A, B, C are square matrices of order m.

The orthogonality of the transformation $\mathcal{U}_0$ is expressed by the relationship (11.35), in which, as we have seen, with the selection of a suitable basis we may assume that the matrix Φ has the form (11.32). Setting in (11.35) the matrix (11.36) in place of U, we obtain

$$\begin{pmatrix} A^* & 0 \\ B^* & C^* \end{pmatrix} \begin{pmatrix} 0 & E \\ E & 0 \end{pmatrix} \begin{pmatrix} A & B \\ 0 & C \end{pmatrix} = \begin{pmatrix} 0 & E \\ E & 0 \end{pmatrix}. $$
Multiplying the matrices on the left-hand side of this equality brings it into the form

$$\begin{pmatrix} 0 & A^* C \\ C^* A & C^* B + B^* C \end{pmatrix} = \begin{pmatrix} 0 & E \\ E & 0 \end{pmatrix}. $$
From this, we obtain in particular $A^* C = E$, and this means that $|A^*| \cdot |C| = 1$. But in view of $|A^*| = |A|$, from (11.36) we have $|U_0| = |A| \cdot |C| = 1$, as asserted. □
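The mechanism of the proof can be seen on a small example: any matrix of the block form (11.36) satisfying (11.35) with Φ as in (11.32) must have $C = (A^*)^{-1}$ and $A^{-1}B$ skew-symmetric, and then its determinant is automatically 1. A sketch with numpy; the blocks A and S below are arbitrary choices.

```python
import numpy as np

m = 2
Phi = np.block([[np.zeros((m, m)), np.eye(m)],
                [np.eye(m), np.zeros((m, m))]])        # matrix (11.32)

A = np.array([[2.0, 1.0], [1.0, 1.0]])                 # arbitrary invertible block acting on M
S = np.array([[0.0, 3.0], [-3.0, 0.0]])                # arbitrary skew-symmetric block
B = A @ S
C = np.linalg.inv(A).T                                 # forced by A* C = E
U = np.block([[A, B], [np.zeros((m, m)), C]])          # block form (11.36)

print(np.allclose(U.T @ Phi @ U, Phi))                 # True: U is orthogonal with respect to phi
print(np.linalg.det(U))                                # approximately 1.0, as Lemma 11.17 predicts
```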

From Lemma 11.17 we deduce the following important corollary.

Theorem 11.18

If M and Mare two m-dimensional isotropic subspaces of a 2m-dimensional space L, then the orthogonal transformations taking one of these subspaces into the other are either all proper or all improper.

Proof

Let $\mathcal{U}_1$ and $\mathcal{U}_2$ be two orthogonal transformations such that $\mathcal{U}_1(\mathsf{M}) = \mathsf{M}'$ and $\mathcal{U}_2(\mathsf{M}) = \mathsf{M}'$. It is clear that then $\mathcal{U}_2^{-1}\mathcal{U}_1(\mathsf{M}) = \mathsf{M}$. Setting $\mathcal{U}_0 = \mathcal{U}_2^{-1}\mathcal{U}_1$, from the equality $\mathcal{U}_1 = \mathcal{U}_2\,\mathcal{U}_0$ we obtain that $|\mathcal{U}_1| = |\mathcal{U}_2| \cdot |\mathcal{U}_0|$. By Lemma 11.17, $|\mathcal{U}_0| = 1$, and from the relationship $|\mathcal{U}_1| = |\mathcal{U}_2| \cdot |\mathcal{U}_0|$, it follows that $|\mathcal{U}_1| = |\mathcal{U}_2|$. □

Theorem 11.18 determines in an obvious way a partition of the set of all m-dimensional isotropic subspaces M of a 2m-dimensional space L into two families $\mathfrak{M}_1$ and $\mathfrak{M}_2$. Namely, M and M′ belong to one family if an orthogonal transformation $\mathcal{U}$ taking one of these subspaces into the other (such a transformation always exists, by Theorem 11.16) is proper (it follows from Theorem 11.18 that this definition does not depend on the choice of the specific transformation $\mathcal{U}$).

Now we can easily prove the following property, which was established in the previous section for m=2, for any m.

Theorem 11.19

Two m-dimensional isotropic subspaces M and Mof a 2m-dimensional space L belong to one family \({\mathfrak{M}}_{i}\) if and only if the dimension of their intersection MMhas the same parity as m.

Proof

Let us recall that natural numbers k and m have the same parity if k+m is even, or equivalently, if $(-1)^{k+m} = 1$. Recalling now the definition of the partition of the set of m-dimensional isotropic subspaces into families $\mathfrak{M}_1$ and $\mathfrak{M}_2$ and setting $k = \dim(\mathsf{M} \cap \mathsf{M}')$, we may formulate the assertion of the theorem as follows:

$$ |\mathcal{U}| = (-1)^{k+m}, $$
(11.37)

where $\mathcal{U}$ is an arbitrary orthogonal transformation taking M to M′, that is, a transformation such that $\mathcal{U}(\mathsf{M}) = \mathsf{M}'$.

Let us begin the proof of relationship (11.37) with the case k=0, that is, the case that $\mathsf{M} \cap \mathsf{M}' = (0)$. Then in view of the equality dimM+dimM′=dimL, the sum of subspaces $\mathsf{M} + \mathsf{M}' = \mathsf{M} \oplus \mathsf{M}'$ coincides with the entire space L. This means that M′ exhibits all the properties of the isotropic subspace N constructed for the proof of Theorem 11.12. In particular, there exist bases $\boldsymbol{e}_1, \ldots, \boldsymbol{e}_m$ in M and $\boldsymbol{f}_1, \ldots, \boldsymbol{f}_m$ in M′ such that

$$\varphi ({\boldsymbol{e} }_i, {\boldsymbol{f} }_i) = 1\quad \mbox{for } i = 1,\ldots, m,\qquad \varphi ({\boldsymbol{e} }_i, {\boldsymbol{f} }_j) = 0 \quad\mbox{for } i \neq j. $$

We shall define the transformation $\mathcal{U}$ by the conditions $\mathcal{U}(\boldsymbol{e}_i) = \boldsymbol{f}_i$ and $\mathcal{U}(\boldsymbol{f}_i) = \boldsymbol{e}_i$ for all $i = 1, \ldots, m$. It is clear that $\mathcal{U}(\mathsf{M}) = \mathsf{M}'$ and $\mathcal{U}(\mathsf{M}') = \mathsf{M}$. It is equally easy to see that in the basis $\boldsymbol{e}_1, \ldots, \boldsymbol{e}_m, \boldsymbol{f}_1, \ldots, \boldsymbol{f}_m$, the matrices of the transformation $\mathcal{U}$ and of the bilinear form φ coincide and have the form (11.32). Substituting the matrix (11.32) in place of both U and Φ into formula (11.35), we see that it is converted into a true equality, that is, the transformation $\mathcal{U}$ is orthogonal.

On the other hand, we therefore have the equality $|\mathcal{U}| = |\varPhi|$. It is easy to convince oneself that $|\varPhi| = (-1)^m$ by transposing the rows of the matrix (11.32) with indices i and m+i for all $i = 1, \ldots, m$: here we carry out m transpositions and obtain the identity matrix of order 2m, with determinant 1. As a result, we arrive at the equality $|\mathcal{U}| = (-1)^m$, that is, at relationship (11.37) for k=0.

Now let us examine the case k>0. Let us define the subspace $\mathsf{M}_1 = \mathsf{M} \cap \mathsf{M}'$. Then $k = \dim\mathsf{M}_1$. By Theorem 11.12, there exists an m-dimensional isotropic subspace $\mathsf{N} \subset \mathsf{L}$ such that $\mathsf{L} = \mathsf{M} \oplus \mathsf{N}$. Let us choose in the subspace M a basis $\boldsymbol{e}_1, \ldots, \boldsymbol{e}_m$ such that its first k vectors $\boldsymbol{e}_1, \ldots, \boldsymbol{e}_k$ form a basis in $\mathsf{M}_1$. Then clearly, we have the decomposition

$${\mathsf{M}}= {\mathsf{M}}_1 \oplus {\mathsf{M}}_2, \quad\mbox{where } {\mathsf{M}}_1 = \langle {\boldsymbol{e} }_1, \ldots , {\boldsymbol{e} }_k\rangle, {\mathsf{M}}_2 = \langle {\boldsymbol{e} }_{k+1}, \ldots, {\boldsymbol{e} }_m\rangle. $$

Above (see Lemma 11.13), we constructed the isomorphism $\psi: \mathsf{N} \to \mathsf{M}^*$ and with its help defined a basis $\boldsymbol{g}_1, \ldots, \boldsymbol{g}_m$ in the space N by the formula $\psi(\boldsymbol{g}_i) = \boldsymbol{f}_i$, where $\boldsymbol{f}_1, \ldots, \boldsymbol{f}_m$ is a basis of the space $\mathsf{M}^*$, the dual basis to $\boldsymbol{e}_1, \ldots, \boldsymbol{e}_m$. We obviously have the decomposition

$${\mathsf{N}}= {\mathsf{N}}_1 \oplus {\mathsf{N}}_2, \quad\mbox{where } {\mathsf{N}}_1 = \langle {\boldsymbol{g} }_1, \ldots , {\boldsymbol{g} }_k\rangle, {\mathsf{N}}_2 = \langle {\boldsymbol{g} }_{k+1}, \ldots, {\boldsymbol{g} }_m\rangle, $$

where by our construction, $\varphi(\boldsymbol{e}_i, \boldsymbol{g}_j) = 1$ for $i = j$ and $\varphi(\boldsymbol{e}_i, \boldsymbol{g}_j) = 0$ for $i \neq j$.

Let us consider the linear transformation $\mathcal{U}_0$ defined by the formula

$$\mathcal{U}_0(\boldsymbol{e}_i) = \boldsymbol{g}_i,\quad \mathcal{U}_0(\boldsymbol{g}_i) = \boldsymbol{e}_i \quad\mbox{for } i = 1, \ldots, k,\qquad \mathcal{U}_0(\boldsymbol{e}_i) = \boldsymbol{e}_i,\quad \mathcal{U}_0(\boldsymbol{g}_i) = \boldsymbol{g}_i \quad\mbox{for } i = k+1, \ldots, m. $$

It is obvious that the transformation $\mathcal{U}_0$ is orthogonal, and also that $\mathcal{U}_0(\mathsf{M}_2) = \mathsf{M}_2$ and

$$ \mathcal{U}_0(\mathsf{M}) = \mathsf{N}_1 \oplus \mathsf{M}_2. $$
(11.38)

In the basis $\boldsymbol{e}_1, \ldots, \boldsymbol{e}_m, \boldsymbol{g}_1, \ldots, \boldsymbol{g}_m$ that we constructed in the space L, the matrix of the transformation $\mathcal{U}_0$ has the block form

$$U_0 = \begin{pmatrix} 0 & 0 & E_k & 0 \\ 0 & E_{m-k} & 0 & 0 \\ E_k & 0 & 0 & 0 \\ 0 & 0 & 0 & E_{m-k} \end{pmatrix}, $$

where $E_k$ and $E_{m-k}$ are the identity matrices of orders k and m−k. As is evident, $U_0$ becomes the identity matrix after the transposition of its rows with indices i and m+i, $i = 1, \ldots, k$. Therefore, $|\mathcal{U}_0| = (-1)^k$.

Let us prove that $\mathcal{U}_0(\mathsf{M}) \cap \mathsf{M}' = (0)$. Since $\mathcal{U}_0(\mathsf{M}) = \mathsf{N}_1 \oplus \mathsf{M}_2$, this is equivalent to $(\mathsf{N}_1 \oplus \mathsf{M}_2) \cap \mathsf{M}' = (0)$. Let us assume that $\boldsymbol{x} \in \mathcal{U}_0(\mathsf{M}) \cap \mathsf{M}'$. From the membership $\boldsymbol{x} \in \mathcal{U}_0(\mathsf{M})$ and the decomposition $\mathsf{M} = \mathsf{M}_1 \oplus \mathsf{M}_2$, taking into account (11.38), it follows that $\boldsymbol{x} \in \mathsf{N}_1 \oplus \mathsf{M}_2$, that is,

$$ {\boldsymbol{x} }= {\boldsymbol{z} }_1 + {\boldsymbol{y} }_2, \quad\mbox{where } {\boldsymbol{z} }_1 \in {\mathsf{N}}_1, {\boldsymbol{y} }_2 \in {\mathsf{M}}_2. $$
(11.39)

Thus for every vector $\boldsymbol{y}_1 \in \mathsf{M}_1$, we have the equality

$$ \varphi ({\boldsymbol{x} },{\boldsymbol{y} }_1) = \varphi ({\boldsymbol{z} }_1,{\boldsymbol{y} }_1) + \varphi ( {\boldsymbol{y} }_2,{\boldsymbol{y} }_1). $$
(11.40)

The left-hand side of equality (11.40) equals zero, since $\boldsymbol{x} \in \mathsf{M}'$, $\boldsymbol{y}_1 \in \mathsf{M}_1 \subset \mathsf{M}'$, and the subspace M′ is isotropic with respect to φ. The second term $\varphi(\boldsymbol{y}_2,\boldsymbol{y}_1)$ on the right-hand side is equal to zero, since $\boldsymbol{y}_i \in \mathsf{M}_i \subset \mathsf{M}$, $i = 1, 2$, and the subspace M is isotropic with respect to φ. Thus from relationship (11.40), it follows that $\varphi(\boldsymbol{z}_1,\boldsymbol{y}_1)=0$ for every vector $\boldsymbol{y}_1 \in \mathsf{M}_1$.

This last conclusion means that under the isomorphism $\psi$, there corresponds to the vector $\boldsymbol{z}_1 \in \mathsf{N}_1$ a linear function on $\mathsf{M}_1$ that is identically equal to zero. But that can be the case only if the vector $\boldsymbol{z}_1$ itself is equal to $\boldsymbol{0}$. Thus in the decomposition (11.39), we have $\boldsymbol{z}_1 = \boldsymbol{0}$, and therefore, the vector $\boldsymbol{x} = \boldsymbol{y}_2$ is contained in the subspace $\mathsf{M}_2$. On the other hand, by virtue of the inclusion $\mathsf{M}_2 \subset \mathsf{M}$ and the membership $\boldsymbol{x} \in \mathsf{M}'$, taking into account the definition of the subspace $\mathsf{M}_1 = \mathsf{M} \cap \mathsf{M}'$, this vector is also contained in $\mathsf{M}_1$. As a result, we obtain that $\boldsymbol{x} \in \mathsf{M}_1 \cap \mathsf{M}_2$, while by virtue of the decomposition $\mathsf{M} = \mathsf{M}_1 \oplus \mathsf{M}_2$, this means that $\boldsymbol{x} = \boldsymbol{0}$.

Thus the subspaces $\mathcal{U}_0(\mathsf{M})$ and $\mathsf{M}'$ are included in the case k=0 already considered, and relationship (11.37) has been proved for them. By Theorem 11.16, there exists an orthogonal transformation $\mathcal{U}_1$ such that $\mathcal{U}_1(\mathsf{M}') = \mathcal{U}_0(\mathsf{M})$. Then, as we have proved, $|\mathcal{U}_1| = (-1)^m$. The orthogonal transformation $\mathcal{U}_0^{-1}\mathcal{U}_1$ takes the isotropic subspace M′ to M, and for it we have the relationship

$$\bigl|\mathcal{U}_0^{-1}\mathcal{U}_1\bigr| = |\mathcal{U}_0|^{-1}\cdot|\mathcal{U}_1| = (-1)^k(-1)^m = (-1)^{k+m}, $$
which completes the proof of the theorem. □

We note two corollaries to Theorem 11.19.

Corollary 11.20

The families \({\mathfrak{M}}_{1}\) and \({\mathfrak{M}}_{2}\) do not have an m-dimensional isotropic subspace in common.

Proof

Let us assume that two such m-dimensional isotropic subspaces \({\mathsf{M}}_{1} \in {\mathfrak{M}}_{1}\) and \({\mathsf{M}}_{2} \in {\mathfrak{M}}_{2}\) are to be found such that M 1=M 2. Then we clearly have the equality dim(M 1M 2)=m, and by Theorem 11.19, M 1 and M 2 cannot belong to different families \({\mathfrak{M}}_{1}\) and \({\mathfrak{M}}_{2}\). □

Corollary 11.21

If two m-dimensional isotropic subspaces intersect in a subspace of dimension m−1, then they belong to different families \({\mathfrak{M}}_{1}\) and \({\mathfrak{M}}_{2}\).

This follows from the fact that m and m−1 have opposite parity.
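For m = 2 the parity criterion of Theorem 11.19 reproduces the familiar picture of the two families of lines on a quadric in ℙ³. The sketch below (numpy assumed; the spanning vectors are read off from (11.27) and (11.28), and the parameter values are arbitrary) computes the dimensions of the intersections.

```python
import numpy as np

# Coordinates (x0, x1, y0, y1); planes of the two families of Example 11.11
def plane_family1(a, b):      # a*x0 = b*x1, b*y0 = -a*y1
    return np.array([[b, a, 0, 0], [0, 0, a, -b]], dtype=float)

def plane_family2(c, d):      # c*x0 = d*y1, d*y0 = -c*x1
    return np.array([[d, 0, 0, c], [0, -d, c, 0]], dtype=float)

def dim_intersection(M1, M2):
    # dim(M1 ∩ M2) = dim M1 + dim M2 - dim(M1 + M2)
    return M1.shape[0] + M2.shape[0] - np.linalg.matrix_rank(np.vstack([M1, M2]))

M, M1, M2 = plane_family1(1, 2), plane_family1(3, 5), plane_family2(1, 4)
print(dim_intersection(M, M1))   # 0: same parity as m = 2, hence the same family
print(dim_intersection(M, M2))   # 1: opposite parity, hence different families
```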

Case 2. Now we may proceed to an examination of the second case, in which the dimension of the space L is odd. It is considerably easier and can be reduced to the already considered case of even dimensionality.

In order to retain the previous notation used in the even-dimensional case, let us denote by \({\overline{{\mathsf{L}}}}\) the space of odd dimension 2m+1 under consideration and let us embed it as a hyperplane in a space L of dimension 2m+2. Let us denote by F a nonsingular quadratic form on L and by \({\overline{F}}\) its restriction to \({\overline{{\mathsf{L}}}}\). Our further reasoning will be based on the following fact.

Lemma 11.22

For every nonsingular quadratic form F there exists a hyperplane \({\overline{{\mathsf{L}}}}\subset {\mathsf{L}}\) such that the quadratic form \({\overline{F}}\) is nonsingular.

Proof

In a complex projective space, all nonsingular quadratic forms are equivalent. Therefore, it suffices to prove the required assertion for any one form F. For F, let us take the nonsingular form (11.33) that we encountered previously, with m replaced by m+1. Thus for a vector $\boldsymbol{x} \in \mathsf{L}$ with coordinates $(x_1, \ldots, x_{2m+2})$, we have

$$ F({\boldsymbol{x} }) = \sum_{i=1}^{m+1} x_i x_{m+1+i}. $$
(11.41)

Let us define a hyperplane \({\overline{{\mathsf{L}}}}\subset {\mathsf{L}}\) by the equation x 1=x m+2. The coordinates in \({\overline{{\mathsf{L}}}}\) are collections \((x_{1}, \ldots, x_{m+1}, \breve{x}_{m+2}, x_{m+3}, \ldots, x_{2m+2})\), where the symbol \(\breve{ }\) indicates the omission of the coordinate underneath it, and the quadratic form \({\overline{F}}\) in these coordinates takes the form

$$ {\overline{F}}({\boldsymbol{x} }) = x_1^2 + \sum_{i=2}^{m+1} x_i x_{m+1+i}. $$
(11.42)

The matrix of the quadratic form (11.42) has the block form

$$\begin{pmatrix} 1 & 0 \\ 0 & \varPhi \end{pmatrix}, $$

where Φ is the matrix from formula (11.32). Since the determinant |Φ| is nonzero, it follows that the quadratic form (11.42) is nonsingular. □
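The computation in this proof is easy to repeat symbolically. A sketch assuming sympy, for the illustrative value m = 2: it builds the form (11.41), restricts it to the hyperplane $x_1 = x_{m+2}$, and checks that the Gram matrix of the restriction (11.42) is nonsingular.

```python
import sympy as sp

m = 2
n = 2*m + 2                                   # dim L = 6, so the hyperplane has dimension 5
xs = sp.symbols(f'x1:{n+1}')                  # x1, ..., x6
F = sum(xs[i] * xs[m + 1 + i] for i in range(m + 1))      # the form (11.41)

# Restrict to the hyperplane x1 = x_{m+2} and drop the coordinate x_{m+2}
Fbar = F.subs(xs[m + 1], xs[0])
coords = [v for j, v in enumerate(xs) if j != m + 1]
G = sp.Matrix(len(coords), len(coords),
              lambda i, j: sp.Rational(1, 2) * sp.diff(Fbar, coords[i], coords[j]))
print(G.det() != 0)    # True: the restriction (11.42) is again nonsingular
```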

We shall further investigate the m-dimensional subspaces \({\overline{{\mathsf{M}}}}\subset {\overline{{\mathsf{L}}}}\), isotropic with respect to the nonsingular quadratic form \({\overline{F}}\), which is the restriction to the hyperplane \({\overline{{\mathsf{L}}}}\) of the nonsingular quadratic form F given in the surrounding space L. Since in the complex projective space \({\overline{{\mathsf{L}}}}\) all nonsingular quadratic forms are equivalent, it follows that all our results will be valid for an arbitrary nonsingular quadratic form on \({\overline{{\mathsf{L}}}}\).

Let us consider an arbitrary (m+1)-dimensional subspace $\mathsf{M} \subset \mathsf{L}$, isotropic with respect to F, and let us set \({\overline{{\mathsf{M}}}}= {\mathsf{M}}\cap {\overline{{\mathsf{L}}}}\). It is obvious that the subspace \({\overline{{\mathsf{M}}}}\subset {\overline{{\mathsf{L}}}}\) is isotropic with respect to \({\overline{F}}\). Since in the space L, the hyperplane \({\overline{{\mathsf{L}}}}\) is defined by a single linear equation, it follows that either \({\mathsf{M}}\subset {\overline{{\mathsf{L}}}}\) (and then \({\overline{{\mathsf{M}}}}={\mathsf{M}}\)), or \(\dim {\overline{{\mathsf{M}}}}= \dim {\mathsf{M}}- 1 =m\). But the first case is impossible, since \(\dim {\overline{{\mathsf{M}}}}\le\frac{1}{2} \dim {\overline{{\mathsf{L}}}}= \frac{1}{2} (2m+1)\), while dimM=m+1. Thus there remains the second case: \(\dim {\overline{{\mathsf{M}}}}= m\). Let us show that this association of an m-dimensional isotropic subspace \({\overline{{\mathsf{M}}}}\subset {\overline{{\mathsf{L}}}}\) with an (m+1)-dimensional isotropic subspace \({\mathsf{M}}\subset{\mathsf{L}}\) yields all the subspaces \({\overline{{\mathsf{M}}}}\) of interest to us and is, in a certain sense, unique.

Theorem 11.23

For every m-dimensional subspace \({\overline{{\mathsf{M}}}}\subset {\overline{{\mathsf{L}}}}\) isotropic with respect to \({\overline{F}}\), there exists an (m+1)-dimensional subspace ML, isotropic with respect to F, such that \({\overline{{\mathsf{M}}}}= {\mathsf{M}}\cap {\overline{{\mathsf{L}}}}\). Moreover, in each of the families \({\mathfrak{M}}_{1}\) and \({\mathfrak{M}}_{2}\) of subspaces isotropic with respect to F, there exists such an M, and it is unique.

Proof

Let us consider an arbitrary m-dimensional subspace \({\overline{{\mathsf{M}}}}\subset {\overline{{\mathsf{L}}}}\), isotropic with respect to \({\overline{F}}\), and let us denote by \({\overline{{\mathsf{M}}}}^{\perp }\) its orthogonal complement with respect to the symmetric bilinear form φ associated with the quadratic form F in the surrounding space L. According to our previous notation, it should have been denoted by \({\overline{{\mathsf{M}}}}^{\perp}_{\varphi }\), but we shall suppress the subscript, since the bilinear form φ will be always one and the same. From relationship (7.75), which is valid for a nondegenerate (with respect to the form φ) space L and an arbitrary subspace of it (p. 267), it follows that

$$\dim {\overline{{\mathsf{M}}}}^{\perp} = \dim {\mathsf{L}}- \dim {\overline{{\mathsf{M}}}}= 2m+2 - m = m+2. $$

Let us denote by \({ \widetilde{\varphi }}\) the restriction of the bilinear form φ to \({\overline{{\mathsf{M}}}}^{\perp}\), and by \({ \widetilde{F} }\) the restriction of the quadratic form F to \({\overline{{\mathsf{M}}}}^{\perp}\). The forms \({ \widetilde{\varphi }}\) and \({ \widetilde{F} }\) are singular in general. By definition (p. 198), the radical of the bilinear form \(\widetilde{\varphi }\) is equal to \({\overline{{\mathsf{M}}}}^{\perp} \cap ({\overline{{\mathsf{M}}}}^{\perp} )^{\perp} = {\overline{{\mathsf{M}}}}^{\perp } \cap {\overline{{\mathsf{M}}}}\). But since \({\overline{{\mathsf{M}}}}\) is isotropic, it follows that \({\overline{{\mathsf{M}}}}\subset {\overline{{\mathsf{M}}}}^{\perp}\), and therefore, the radical of the bilinear form \(\widetilde{\varphi }\) coincides with \({\overline{{\mathsf{M}}}}\). By relationship (6.17) from Sect. 6.2, the rank of the bilinear form \(\widetilde{\varphi }\) is equal to

$$\dim {\overline{{\mathsf{M}}}}^{\perp} - \dim \bigl({\overline{{\mathsf{M}}}}^{\perp} \bigr)^{\perp} = \dim {\overline{{\mathsf{M}}}}^{\perp} - \dim {\overline{{\mathsf{M}}}}= (m+2)-m = 2, $$

and in the subspace \({\overline{{\mathsf{M}}}}^{\perp}\), we may choose a basis e 1,…,e m+2 such that its last m vectors are contained in \({\overline{{\mathsf{M}}}}\) (that is, in the radical of \(\widetilde{\varphi }\)), and the restriction of φ to 〈e 1,e 2〉 has matrix \(\bigl(\begin{smallmatrix} 0 & 1/2 \\ 1/2 & 0 \end{smallmatrix}\bigr)\).

Thus we have the decomposition \({\overline{{\mathsf{M}}}}^{\perp} = \langle {\boldsymbol{e} }_{1}, {\boldsymbol{e} }_{2}\rangle \oplus {\overline{{\mathsf{M}}}}\), where the restriction of the quadratic form F to 〈e 1,e 2〉 in our basis has the form x 1 x 2, and the restriction of F to \({\overline{{\mathsf{M}}}}\) is identically equal to zero.

Let us set \({\mathsf{M}}_{i} = {\overline{{\mathsf{M}}}}\oplus\langle {\boldsymbol{e} }_{i}\rangle\), i=1,2. Then M 1 and M 2 are (m+1)-dimensional subspaces in L. It follows from this construction that the M i are isotropic with respect to the bilinear form φ. Here \({\mathsf{M}}_{i} \cap {\overline{{\mathsf{L}}}}= {\overline{{\mathsf{M}}}}\), since on the one hand, from considerations of dimensionality, \({\mathsf{M}}_{i} \not\subset {\overline{{\mathsf{L}}}}\), and on the other hand, \({\overline{{\mathsf{M}}}}\subset {\mathsf{M}}_{i}\) and \({\overline{{\mathsf{M}}}}\subset {\overline{{\mathsf{L}}}}\). We have thus constructed two isotropic subspaces M i L such that \({\mathsf{M}}_{i} \cap {\overline{{\mathsf{L}}}}= {\overline{{\mathsf{M}}}}\). That they belong to different families \({\mathfrak{M}}_{i}\) and that in neither of these families are there any other subspaces with these properties, follows from Corollary 11.21. □

Thus we have shown that there exists a bijection between the set of m-dimensional isotropic subspaces \({\overline{{\mathsf{M}}}}\subset {\overline{{\mathsf{L}}}}\) and each of the families \({\mathfrak{M}}_{i}\) of (m+1)-dimensional isotropic subspaces ML. This fact is expressed by saying that m-dimensional subspaces \({\overline{{\mathsf{M}}}}\subset {\overline{{\mathsf{L}}}}\) isotropic with respect to a nonsingular quadratic form \({\overline{F}}\) form a single family.
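By way of illustration, consider the simplest case m=1, taking for F the form \(x_{0}x_{1} + x_{2}x_{3}\) on a four-dimensional space L and for \({\overline{{\mathsf{L}}}}\) the hyperplane x 2=x 3, on which F restricts to the nonsingular form \({\overline{F}} = y_{0}y_{1} + y_{2}^{2}\) (writing a point of \({\overline{{\mathsf{L}}}}\) as (y 0,y 1,y 2,y 2)). The one-dimensional subspace \({\overline{{\mathsf{M}}}} = \langle {\boldsymbol{e} }_{0}\rangle\) is isotropic with respect to \({\overline{F}}\), and one checks directly that the two-dimensional subspaces \({\mathsf{M}}_{1} = \langle {\boldsymbol{e} }_{0}, {\boldsymbol{e} }_{2}\rangle\) and \({\mathsf{M}}_{2} = \langle {\boldsymbol{e} }_{0}, {\boldsymbol{e} }_{3}\rangle\) are isotropic with respect to F and satisfy \({\mathsf{M}}_{i} \cap {\overline{{\mathsf{L}}}} = {\overline{{\mathsf{M}}}}\); by Theorem 11.23, these are the unique such subspaces in the two families \({\mathfrak{M}}_{1}\) and \({\mathfrak{M}}_{2}\).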

Of course, our partition of the set of isotropic subspaces into families is a matter of convention. It is mostly a tribute to tradition originating in the special cases considered in analytic geometry. However, it is possible to give a more precise meaning to this partition by describing these subspaces in terms of Plücker coordinates.

In the previous chapter, we showed that k-dimensional subspaces M of an n-dimensional space L are in one-to-one correspondence with the points of some projective algebraic variety G(k,n), called the Grassmannian. Suppose we are given some nonsingular quadratic form F on the space L. Let us denote by I(k,n) the subset of points of the Grassmannian G(k,n) that correspond to the k-dimensional isotropic subspaces.

We shall state the following propositions without proof, since they relate not to linear algebra, but rather to algebraic geometry.

Proposition 11.24

The set I(k,n) is a projective algebraic variety.

In other words, this proposition asserts that the property of a subspace being isotropic can be described by certain homogeneous relationships among its Plücker coordinates.
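In the simplest case k=1, for example, the Plücker coordinates of a one-dimensional subspace 〈x〉 are simply the homogeneous coordinates of the corresponding point of ℙ(L), and such a subspace is isotropic precisely when F(x)=0; thus I(1,n) is just the quadric defined by F, and the homogeneous relationship in this case is the single equation F=0.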

A projective algebraic variety X is said to be irreducible if it cannot be represented in the form of a union X=X 1X 2, where X i are projective algebraic varieties different from X itself.

Suppose the space L has odd dimension n=2m+1.

Proposition 11.25

The set I(m,2m+1) is an irreducible projective algebraic variety.

Now let the space L have even dimension n=2m. We shall denote by I i (m,2m) the subset of the projective algebraic variety I(m,2m) whose points correspond to m-dimensional isotropic subspaces of the family \({\mathfrak{M}}_{i}\). Theorem 11.19 and its corollaries show that

$$I(m,2m) = I_1(m,2m) \cup I_2(m,2m),\qquad I_1(m,2m) \cap I_2(m,2m) = \varnothing. $$

This suggests the idea that the projective algebraic variety I(m,2m) is reducible.

Proposition 11.26

The sets I i (m,2m), i=1,2, are irreducible projective algebraic varieties.

Finally, we have the following assertion, which relates to the isotropism of a subspace whose dimension is less than maximal.

Proposition 11.27

For all k<n/2, the projective algebraic variety I(k,n) is irreducible.

11.4 Quadrics in a Real Projective Space

Let us consider a projective space ℙ(L), where L is a real vector space. As before, we shall restrict our attention to the case of nonsingular quadrics. As we saw in Sect. 6.3 (formula (6.28)), a nonsingular quadratic form in a real space has the canonical form

$$ x_0^2 + x_1^2 + \cdots+ x_s^2 - x_{s+1}^2 - \cdots- x_n^2 = 0. $$
(11.43)

Here the index of inertia r=s+1 will be the same in every coordinate system in which the quadric is given by the canonical equation.

If we multiply equation (11.43) by −1, we obviously do not change the quadric that it defines, and therefore, we may assume that s+1≥ns, that is, s≥(n−1)/2. Moreover, sn, but in the case s=n, from equation (11.43) we obtain x 0=0, x 1=0, …, x n =0, and there is no such point in projective space.

Thus, in contrast to a complex projective space, in a real projective space of given dimension n, there exists (up to a projective transformation) not one, but several nonsingular quadrics. However, there is only a finite number of them; they correspond to various values s, where we may assume that

$$ \frac{n-1}{2} \le s \le{n-1}. $$
(11.44)

To be sure, it is still necessary to prove that the quadrics corresponding to the various values of s are not projectively equivalent. But we shall consider this question (in an even more complex situation) in the next section.

Thus the number of projectively inequivalent nonsingular quadrics in a real projective space of dimension n is equal to the number of integers s satisfying inequality (11.44). If n is odd, n=2m+1, then inequality (11.44) gives ms≤2m, and the number of projectively inequivalent quadrics is equal to m+1. And if n is even, n=2m, then there are m of them. In particular, for n=2, all nonsingular quadrics in the projective plane are projectively equivalent. The most typical example is the circle x 2+y 2=1, which is contained entirely in the affine part x 2≠0 if the equation is written as \(x_{0}^{2} + x_{1}^{2} - x_{2}^{2} = 0\) in homogeneous coordinates (x 0:x 1:x 2) (here inhomogeneous coordinates are expressed by the formulas x=x 0/x 2, y=x 1/x 2).
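To verify this, substitute x=x 0/x 2 and y=x 1/x 2 into the equation x 2+y 2=1 and clear the denominator:

$$\biggl(\frac{x_0}{x_2} \biggr)^2 + \biggl(\frac{x_1}{x_2} \biggr)^2 = 1, \quad\mbox{that is,}\quad x_0^2 + x_1^2 - x_2^2 = 0. $$

Moreover, a point of this conic with x 2=0 would have to satisfy \(x_{0}^{2} + x_{1}^{2} = 0\), which has no nonzero real solutions, so that the conic indeed lies entirely in the affine part x 2≠0.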

In three-dimensional projective space, there exist two types of projectively inequivalent quadrics. In homogeneous coordinates (x 0:x 1:x 2:x 3), one of them is given by the equation \(x_{0}^{2} + x_{1}^{2} + x_{2}^{2} - x_{3}^{2} = 0\). Here we always have x 3≠0, the quadric lies in the affine part, and it is given in inhomogeneous coordinates (x,y,z) by the equation x 2+y 2+z 2=1, where x=x 0/x 3, y=x 1/x 3, z=x 2/x 3. This quadric is a sphere. The second type is given by the equation \(x_{0}^{2} + x_{1}^{2} - x_{2}^{2} - x_{3}^{2} = 0\). This is a hyperboloid of one sheet.

Their projective inequivalence can be seen at the very least from the fact that not a single real line lies on the first of them (the sphere), while on the second (hyperboloid of one sheet), there are two families each consisting of an infinite number of lines, called the rectilinear generatrices.

Of course, we can embed a real space L into a complex space \({\mathsf{L}}^{\mathbb{C}}\), and similarly, embed ℙ(L) into \({\mathbb{P}}({\mathsf{L}}^{\mathbb{C}})\). Therefore, everything that was said in Sect. 11.3 about isotropic subspaces is applicable in our case. However, although our quadric is real, the isotropic subspaces obtained in this way can turn out to be complex. The single exception is the case s=(n−1)/2 for odd n and the case s=n/2 for even n.

In the first instance, we may combine the coordinates into pairs (x i ,x s+1+i ) and set u i =x i +x s+1+i and v i =x i x s+1+i . Then taking into account the equalities

$$x_i^2 - x^2_{s+1+i} = (x_i + x_{s+1+i}) (x_i - x_{s+1+i}), $$

equation (11.43) can be written in the form

$$ u_0 v_0 + u_1 v_1 + \cdots+ u_s v_s = 0. $$
(11.45)

But this is the case of the quadric (11.33), which we considered in the previous section. It is easy to see that the reasoning used in Sect. 11.3 gives us a description of the real subspaces of a quadric.

The case s=n/2 for even n also does not remove us from the realm of real subspaces and also leads to the case considered in the previous section. Moreover, if the equation of a quadric has the form (11.45) over an arbitrary field \({\mathbb{K}}\) of characteristic different from 2, then the reasoning from the previous section remains in force.

In the general case, it is still possible to determine the dimensions of the spaces contained in a quadric. For this, we may make use of considerations already used in the proof of the law of inertia (Theorem 6.17 from Sect. 6.3). There we observed that the index of inertia (in the given case, the index of inertia of the quadratic form from (11.43), equal to s+1) coincides with the maximal dimension of the subspaces L′ on which the restriction of the form is positive definite. (Let us note that this condition gives a geometric characteristic of the index of inertia, that is, it depends only on the set of solutions of the equation F(x)=0, and not on the form F that defines it.)

Indeed, let the quadric Q be given by the equation F(x)=0. If the restriction F′ of the form F to the subspace L′ is positive definite, then it is clear that Q∩ℙ(L′)=∅. Thus if we are dealing with a projective space ℙ(L), where dimL=n+1, then in L there exists a subspace \({\overline{{\mathsf{L}}}}\) of dimension s+1 such that the restriction of the form F to it is positive definite. This means that \(Q \cap {\mathbb{P}}({\overline{{\mathsf{L}}}}) = \varnothing\) (however, such a subspace \({\overline{{\mathsf{L}}}}\) is also easily determined explicitly on the basis of equation (11.43)). If L′⊂L is a subspace such that ℙ(L′)⊂Q, then \({\mathsf{L}}' \cap {\overline{{\mathsf{L}}}}= ({\boldsymbol{0} })\). Hence by Corollary 3.42, we obtain the inequality \(\dim {\overline{{\mathsf{L}}}}+ \dim {\mathsf{L}}' \le\dim {\mathsf{L}}= n+1\). Consequently, dimL′+s+1≤n+1, and this means that dimL′≤ns. Thus for the space ℙ(L′) belonging to the quadric given by equation (11.43), we obtain dimL′≤ns and therefore dimℙ(L′)≤ns−1.
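For example, for the sphere \(x_{0}^{2} + x_{1}^{2} + x_{2}^{2} - x_{3}^{2} = 0\) considered above (n=3, s=2), this bound gives dimℙ(L′)≤ns−1=0; that is, the sphere can contain no projective subspaces of positive dimension, in agreement with the fact that not a single real line lies on it.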

On the other hand, it is easy to produce a subspace of dimension ns−1 actually belonging to the quadric (11.43). To this end, let us combine in pairs the unknowns appearing in equation (11.43) with different signs and let us equate the unknowns in one pair, for example x 0=x s+1, and so on. Since we have assumed that s+1≥ns, we may form ns such pairs, and therefore, we obtain ns linear equations. How many unknowns remain? Since we have combined 2(ns) unknowns into pairs, and in all there were n+1 of them, there remain n+1−2(ns) unknowns (it is possible that this number will be equal to zero). Let us set each of these remaining unknowns equal to zero. Thus we obtain

$$(n-s) + n+1 - 2(n-s) = n+1 - (n-s) $$

linear equations in coordinates in the space L. Since different unknowns occur in all these equations, these equations are linearly independent and determine in L a subspace L′ of dimension ns. Then dimℙ(L′)=ns−1. Of course, since ℙ(L′) is contained in Q, an arbitrary subspace ℙ(L″)⊂ℙ(L′) for L″⊂L′ is also contained in Q. Thus in the quadric Q are contained subspaces of all dimensions rns−1.
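For example, for n=3 and s=1, equation (11.43) becomes \(x_{0}^{2} + x_{1}^{2} - x_{2}^{2} - x_{3}^{2} = 0\); the pairs give the ns=2 equations x 0=x 2 and x 1=x 3, and no unpaired unknowns remain, since n+1−2(ns)=0. These equations determine the subspace L′ consisting of the vectors (a,b,a,b), on which the left-hand side of (11.43) vanishes identically, and ℙ(L′) is a line (of dimension ns−1=1) lying on the quadric; it is one of the rectilinear generatrices of the hyperboloid of one sheet mentioned above.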

We have therefore proved the following result.

Theorem 11.28

If a nonsingular quadric Q in a real projective space of dimension n is given by the equation F(x 0,…,x n )=0 and the index of inertia of the quadratic form F is equal to s+1, then in Q are contained projective subspaces only of dimension rns−1, and for each such number r there can be found in Q a projective subspace of dimension r (when s+1≥ns, which is always possible to attain without changing the quadric Q, but changing only the quadratic form F that determines it toF).

We have already considered an example of a quadric in real three-dimensional projective space (n=3). Let us note that in this space there are only two nonempty quadrics: for s=1 and s=2.

For s=2, equation (11.43) can be written in the form

$$ x_0^2 + x_1^2 + x_2^2 = x_3^2. $$
(11.46)

As we have already said, for points of a real quadric, we have x 3≠0. This means that our quadric is entirely contained in this affine subset. Setting x=x 0/x 3, y=x 1/x 3, z=x 2/x 3, we shall write its equation in the form

$$x^2 + y^2 + z^2 = 1. $$

This is the familiar two-dimensional sphere S 2 in three-dimensional Euclidean space. Let us discover what lines lie on it. Of course, no real line can lie on a sphere, since every line has points that are arbitrarily distant from the center of the sphere, while for all points of the sphere, their distance from the center of the sphere is equal to 1. Therefore, we can speak only about complex lines of the space \({\mathbb{P}}({\mathsf{L}}^{\mathbb{C}})\). If in equation (11.46) we make the substitution x 2=iy, where i is the imaginary unit, we obtain the equation \(x_{0}^{2} + x_{1}^{2} - y^{2} - x_{3}^{2} = 0\), which in the new coordinates

$$u_0 = x_0 + y,\qquad v_0 = x_0 - y,\qquad u_1 = x_1 + x_3,\qquad v_1 = x_1 - x_3 $$

takes the form

$$ u_0 v_0 + u_1 v_1 = 0. $$
(11.47)

We studied such an equation in Sect. 11.2 (see Example 11.11). As an example of a line lying in the given quadric, we may take the line given by equations (11.25): u 0=λu 1, v 0=−λ −1 v 1 with arbitrary complex number λ≠0 and arbitrary u 1,v 1. In general, such a line contains not a single real point of our quadric (that is, points corresponding to real values of the coordinates x 0,…,x 3). Indeed, if the number λ is not real, then the equality u 0=λu 1 contradicts the fact that u 0 and u 1 are real. The case u 0=u 1=0 would correspond to a point with coordinates x 1=x 3=0, for which \(x_{0}^{2} + x_{2}^{2} = 0\), that is, all x i are equal to zero.

Thus on the sphere lies a set of complex lines containing not a single real point. If desired, all of them could be described by formulas (11.27) and (11.28) after changes in coordinates that we described earlier. However, of greater interest are the complex lines lying on the sphere and containing at least one real point. For each such line l containing a real point of the sphere P, the complex conjugate line \(\overline{l}\) (that is, consisting of points \(\overline{Q}\), where Q takes values on the line l) also lies on the sphere and contains the point P. But by Theorem 11.19, through every point P pass exactly two lines (even if complex). We see that through every point of the sphere there pass exactly two complex lines, which are the complex conjugates of each other.

Finally, the case s=1 leads to the equation

$$ x_0^2 + x_1^2 - x_2^2 - x_3^2 = 0, $$
(11.48)

which after a change of coordinates

$$u_0 = x_0 + x_1,\qquad v_0 = x_0 - x_1,\qquad u_1 = x_2 + x_3,\qquad v_1 = x_2 - x_3, $$

also assumes the form (11.47). For this equation, we have described all the lines contained in a quadric by formulas (11.27) and (11.28), where clearly, real values must be assigned to the parameters a,b,c,d in these formulas. In this case, the obtained quadric is a hyperboloid of one sheet, and the lines are its rectilinear generatrices. See Fig. 11.3.

Fig. 11.3 Hyperboloid of one sheet

Let us visualize what this surface looks like; that is, let us find a more familiar set that is homeomorphic to this surface. To this end, let us choose one line in each family of rectilinear generatrices: in the first, l 0; in the second, l 1. As we saw in Sect. 9.4, every projective line is homeomorphic to the circle S 1. On the other hand, every line in the second family of generatrices is uniquely determined by its point of intersection with the line l 0, and similarly, every line of the first family is determined by its point of intersection with the line l 1. Finally, through every point of the surface pass exactly two lines: one from the first family of generatrices, and the other from the second.

Thus is established a bijection between the points of a quadric given by equation (11.48) and pairs of points (x,y), where xl 0, yl 1, that is, the set S 1×S 1. It is easily ascertained that this bijection is a homeomorphism. The set S 1×S 1 is called a torus. It is most simply represented as the surface obtained by rotating a circle about an axis lying in the same plane as the circle but not intersecting it. See Fig. 11.4. Such a surface looks like the surface of a bagel. As a result, we obtain that the quadric given by equation (11.48) in three-dimensional real projective space is homeomorphic to a torus. See Fig. 11.4.

Fig. 11.4 A torus

11.5 Quadrics in a Real Affine Space

Now we proceed to the study of quadrics in a real affine space (V,L). Let us choose in this space a frame of reference (O;e 1,…,e n ). Then every point AV is given by its coordinates (x 1,…,x n ). A quadric is the set of all points AV such that

$$ F(x_1, \ldots, x_n) = 0, $$
(11.49)

where F is some second-degree polynomial. There is now no reason to consider the polynomial F to be homogeneous (as was the case in a projective space).

Collecting in F(x) terms of the second, first, and zeroth degrees, we shall write them in the form

$$ F({\boldsymbol{x} }) = \psi({\boldsymbol{x} }) + f({\boldsymbol{x} }) + c, $$
(11.50)

where ψ(x) is a quadratic form, f(x) is a linear form, and c is a scalar. The quadrics F(x)=0 thus obtained for n=2 and 3 represent the curves and surfaces of order two studied in courses in analytic geometry.

Let us note that according to our definition of a quadric as a set of points satisfying relationship (11.49), we obtain even in the simplest cases, n=2 and 3, sets that can hardly be regarded as curves or surfaces of degree two. The same “strange” examples show that dissimilar-looking second-degree polynomials can define one and the same quadric, that is, the solution set of equation (11.49).

For example, in real three-dimensional space with coordinates x,y,z, the equation x 2+y 2+z 2+c=0 has no solution in x,y,z if c>0, and therefore for any c>0, it defines the empty set. Another example is the equation x 2+y 2=0, which is satisfied only with x=y=0 but for all z, that is, this equation defines a line, namely the z-axis. But the same line (z-axis) is defined, for example, by the equation ax 2+by 2=0 with any numbers a and b of the same sign.

Let us prove that if we exclude such “pathological” cases, then every quadric is defined by an equation that is unique up to a nonzero constant factor. Here it will be convenient to consider the empty set a special case of an affine subspace.

Theorem 11.29

If a quadric Q does not coincide with a set of points of any affine subspace and can be given by two different equations F 1(x)=0 and F 2(x)=0, where the F i are second-degree polynomials, then F 2=λF 1, where λ is some nonzero real number.

Proof

Since by the given condition, the quadric Q is not empty, it must contain some point A. By Theorem 8.14, there exists another point BQ such that the line l passing through A and B does not lie entirely in Q.

Let us select in the affine space V a frame of reference (O;e 1,…,e n ) in which the point O is equal to A and the vector e 1 is equal to \(\overrightarrow {AB}\). The line passing through the points A and B consists of points with coordinates (x 1,0,…,0) for all possible real values x 1. Let us write down the equation F i (x)=0, i=1,2, defining our quadric after arranging terms in order of the degree of x 1. As a result, we obtain the equations

$$F_i(x_1, \ldots, x_n) = a_i x_1^2 + f_i(x_2, \ldots, x_n) x_1 + \psi_i(x_2, \ldots, x_n) = 0,\quad i=1,2, $$

where f i (x 2,…,x n ) and ψ i (x 2,…,x n ) are inhomogeneous polynomials of first and second degree in the variables x 2,…,x n . After defining \(f_{i}(0, \ldots, 0)=f_{i}(\overline{O})\) and \(\psi_{i}(0, \ldots, 0)=\psi_{i}(\overline{O})\), we may say that the relationship

$$ a_i x_1^2 + f_i(\overline{O}) x_1 + \psi_i(\overline{O}) = 0 $$
(11.51)

holds for x 1=0 (point A) and for x 1=1 (point B), but does not hold identically for all real values x 1. From this it follows that \(\psi_{i}(\overline{O}) = 0\) and \(a_{i} + f_{i}(\overline{O}) = 0\). This means that a i ≠0, for otherwise, we would obtain that relationship (11.51) was satisfied for all x 1, that is, the line l would lie entirely in Q, contrary to its choice. By multiplying the polynomial F i by \(a_{i}^{-1}\), we may assume that a i =1.

Let us denote by \(\overline{{\boldsymbol{x} }}\) the projection of the vector x onto the subspace 〈e 2,…,e n 〉 parallel to the subspace 〈e 1〉, that is, \({\overline{{\boldsymbol{x} }}} = (x_{2}, \ldots, x_{n})\). Then we may say that the two equations

$$ x_1^2 + f_1(\overline{{\boldsymbol{x} }}) x_1 + \psi_1(\overline{{\boldsymbol{x} }}) = 0 \quad\mbox{and}\quad x_1^2 + f_2(\overline{{\boldsymbol{x} }}) x_1 + \psi_2(\overline{{\boldsymbol{x} }}) = 0, $$
(11.52)

where \(f_{i}(\overline{{\boldsymbol{x} }})\) are first-degree polynomials and \(\psi_{i}(\overline{{\boldsymbol{x} }})\) are second-degree polynomials of the vector \(\overline{{\boldsymbol{x} }}\), have identical solutions. Furthermore, we know that they both have two solutions, x 1=0 and x 1=1, for \(\overline{{\boldsymbol{x} }}= {\boldsymbol{0} }\), that is, the discriminant of each quadratic trinomial

$$p_i(x_1) = x_1^2 + f_i(\overline{{\boldsymbol{x} }}) x_1 + \psi_i(\overline{{\boldsymbol{x} }}),\quad i=1,2, $$

with coefficients depending on the vector \(\overline{{\boldsymbol{x} }}\), for \(\overline{{\boldsymbol{x} }}= {\boldsymbol{0} }\), is positive.

The coefficients of the trinomial p i (x 1) can be viewed as polynomials in the variables x 2,…,x n , that is, the coordinates of the vector \(\overline{{\boldsymbol{x} }}\). Consequently, the discriminant of the trinomial p i (x 1) is also a polynomial in the variables x 2,…,x n , and therefore, it depends on them continuously. From the definition of continuity, it follows that there exists a number ε>0 such that the discriminant of each trinomial p i (x 1) is positive for all \(\overline{{\boldsymbol{x} }}\) such that |x 2|<ε, …, |x n |<ε. This condition can be written compactly in the form of the single inequality \(|\overline{{\boldsymbol{x} }}| < \varepsilon \), assuming that the space of vectors \(\overline{{\boldsymbol{x} }}\) is somehow converted into a Euclidean space in which is defined the length of a vector \(|\overline{{\boldsymbol{x} }}|\). For example, it can be defined by the relationship \(|\overline{{\boldsymbol{x} }}|^{2} = x_{2}^{2} + \cdots+ x_{n}^{2}\).

Thus the quadratic trinomials p i (x 1) with leading coefficient 1 and coefficients \(f_{i}(\overline{{\boldsymbol{x} }})\) and \(\psi_{i}(\overline{{\boldsymbol{x} }})\), depending continuously on \(\overline{{\boldsymbol{x} }}\), each have two distinct roots for all \(|\overline{{\boldsymbol{x} }}| < \varepsilon \), and since equations (11.52) have identical solutions, these roots are the same for both trinomials. But as is known from elementary algebra, two monic quadratic trinomials having the same two distinct roots coincide. Therefore, \(f_{1} (\overline{{\boldsymbol{x} }}) = f_{2} (\overline{{\boldsymbol{x} }})\) and \(\psi_{1} (\overline{{\boldsymbol{x} }}) = \psi_{2} (\overline{{\boldsymbol{x} }})\) for all \(|\overline{{\boldsymbol{x} }}| < \varepsilon \). Hence on the basis of the following lemma, we obtain that these equalities are satisfied not only for \(|\overline{{\boldsymbol{x} }}| < \varepsilon \), but in general for all vectors \(\overline{{\boldsymbol{x} }}\). □

Lemma 11.30

If for some number ε>0, the polynomials \(f(\overline{{\boldsymbol{x} }})\) and \(g(\overline{{\boldsymbol{x} }})\) coincide for all \(\overline{{\boldsymbol{x} }}\) such that \(|\overline{{\boldsymbol{x} }}| < \varepsilon \), then they coincide identically for all \(\overline{{\boldsymbol{x} }}\).

Proof

Let us represent each of the polynomials \(f(\overline{{\boldsymbol{x} }})\) and \(g(\overline{{\boldsymbol{x} }})\) as a sum of homogeneous terms:

$$ f(\overline{{\boldsymbol{x} }}) = \sum_{k=0}^N f_k ( \overline{{\boldsymbol{x} }}),\qquad g(\overline{{\boldsymbol{x} }}) = \sum_{k=0}^N g_k (\overline{{\boldsymbol{x} }}). $$
(11.53)

Let us set \(\overline{{\boldsymbol{x} }}= \alpha \overline{{\boldsymbol{y} }}\), where \(|\overline{{\boldsymbol{y} }}| < \varepsilon \) and the number α is in [0,1]. Then the condition \(|\overline{{\boldsymbol{x} }}| < \varepsilon \) is clearly satisfied, and this means that \(f(\overline{{\boldsymbol{x} }}) = g(\overline{{\boldsymbol{x} }})\). Setting \(\overline{{\boldsymbol{x} }}= \alpha \overline{{\boldsymbol{y} }}\) in equality (11.53), we obtain

$$ \sum_{k=0}^N \alpha ^k f_k (\overline{{\boldsymbol{y} }}) = \sum_{k=0}^N \alpha ^k g_k (\overline{{\boldsymbol{y} }}). $$
(11.54)

On the one hand, equality (11.54) holds for all α∈[0,1], of which there are infinitely many. On the other hand, (11.54) represents an equality between two polynomials in the variable α. As is well known, polynomials of a single variable taking the same values for an infinite number of values of the variable coincide identically, that is, they have the same coefficients. Therefore, we obtain the equalities \(f_{k} (\overline{{\boldsymbol{y} }}) = g_{k} (\overline{{\boldsymbol{y} }})\) for all k=0,…,N and all \(\overline{{\boldsymbol{y} }}\) for which \(|\overline{{\boldsymbol{y} }}| <\varepsilon \). But since the polynomials f k and g k are homogeneous, it follows that these equalities hold in general for all \(\overline{{\boldsymbol{y} }}\).

Indeed, every vector \(\overline{{\boldsymbol{y} }}\) can be represented in the form \(\overline{{\boldsymbol{y} }}= \alpha \overline{{\boldsymbol{z} }}\) with some scalar α and vector \(\overline{{\boldsymbol{z} }}\) for which \(|\overline{{\boldsymbol{z} }}|< \varepsilon \). For example, it suffices to set \(\alpha = ({2}/{\varepsilon }) |\overline{{\boldsymbol{y} }}|\). Consequently, we obtain \(f_{k} (\overline{{\boldsymbol{z} }}) = g_{k} (\overline{{\boldsymbol{z} }})\). But if we multiply both sides of this equality by α k and invoke the homogeneity of f k and g k , we obtain the equality \(f_{k} (\alpha \overline{{\boldsymbol{z} }}) = g_{k} (\alpha \overline{{\boldsymbol{z} }})\), that is, \(f_{k} (\overline{{\boldsymbol{y} }}) = g_{k} (\overline{{\boldsymbol{y} }})\), which is what was to be proved. □

Let us note that we might have posed this same question about the uniqueness of the correspondence between quadrics and their defining equations with regard to quadrics in projective space. But in projective space, the polynomial defining a quadric is homogeneous, and this question can be resolved even more easily. So that we wouldn’t have to repeat ourselves, we have considered the question in the more complex situation.

Let us now investigate a question that is considered already in a course in analytic geometry for spaces of dimension 2 and 3: into what simplest form can equation (11.49) be brought by a suitable choice of frame of reference in an affine space of arbitrary dimension n? This question is equivalent to the following: under what conditions can two quadrics be transformed into each other by a nonsingular affine transformation?

We shall consider quadrics in an affine space (V,L) of dimension n, assuming that for smaller values of n, this problem has already been solved. In this regard, we shall not consider quadrics that are cylinders, that is, having the form

$$Q = h^{-1}\bigl(Q'\bigr), $$

where h is an affine transformation of the space (V,L) into the affine space (V′,L′) of dimension m<n, and Q′ is some subset of V′. Let us ascertain that in this case, Q′ is a quadric in V′.

Let the quadric Q in a coordinate system associated with some frame of reference of the affine space V be defined by the second-degree equation F(x 1,…,x n )=0. Let us choose in the m-dimensional affine space V′ some frame of reference \((O'; {\boldsymbol{e} }_{1}', \ldots, {\boldsymbol{e} }_{m}')\). Then \({\boldsymbol{e} }_{1}', \ldots, {\boldsymbol{e} }_{m}'\) is a basis in the vector space L′. In the definition of a cylinder, we may assume that the mapping h takes V onto V′. Let us denote by e 1,…,e m vectors e i L that are taken to the vectors \({\boldsymbol{e} }_{i}'\) by the linear part of the mapping h, i=1,…,m, and let us consider the subspace M=〈e 1,…,e m 〉 that they span. By Corollary 3.31, there exists a subspace NL such that L=MN. Let OV be an arbitrary point such that h(O)=O′. Then in the coordinate system associated with the frame of reference \((O'; {\boldsymbol{e} }_{1}', \ldots, {\boldsymbol{e} }_{m}')\), the projection of the space L onto M parallel to the subspace N and the associated projection h of the affine space V onto V′ are defined by the condition

$$h(x_1, \ldots, x_n) = \bigl(x_1', \ldots, x_m'\bigr), $$

where \(x_{i}'\) are the coordinates with respect to the frame of reference \((O'; {\boldsymbol{e} }_{1}', \ldots, {\boldsymbol{e} }_{m}')\). Then the fact that Q is a cylinder means that its second-degree equation F(x 1,…,x n )=0 is satisfied irrespective of the values that we have substituted for the variables x m+1,…,x n if the point with coordinates (x 1,…,x m ) belongs to the set Q′. For example, we may set x m+1=0,…,x n =0. Then the equation \(F(x_{1}', \ldots, x_{m}', 0, \ldots, 0) = 0\) will be precisely the equation of the quadric Q′.

The same reasoning shows that if a polynomial F depends on fewer than n unknowns, then the quadric Q defined by the equation F(x)=0 is a cylinder. Therefore, in the sequel we shall consider only quadrics that are not cylinders. Our goal will be the classification of these quadrics using nonsingular affine transformations. Two quadrics that can be mapped one into the other by such a transformation are said to be affinely equivalent.

First of all, let us consider the effect of a translation on the equation of a quadric. Let the equation of the quadric Q in coordinates associated with some frame of reference (O;e 1,…,e n ) have the form

$$ F({\boldsymbol{x} }) = \psi({\boldsymbol{x} }) + f({\boldsymbol{x} }) + c = 0, $$
(11.55)

where ψ(x) is a quadratic form, f(x) is a linear form, and c is a number. If we perform a translation by the vector aL, then the quadric is given by the equation

$$\psi({\boldsymbol{x} }+{\boldsymbol{a} }) + f({\boldsymbol{x} }+{\boldsymbol{a} }) + c = 0. $$

Let us consider how the equation of a quadric is transformed under these conditions. Let φ(x,y) be the symmetric bilinear form associated with the quadratic form ψ(x), that is, ψ(x)=φ(x,x). Then

$$\psi({\boldsymbol{x} }+{\boldsymbol{a} }) + f({\boldsymbol{x} }+{\boldsymbol{a} }) + c = \varphi ({\boldsymbol{x} }+{\boldsymbol{a} }, {\boldsymbol{x} }+{\boldsymbol{a} }) + f({\boldsymbol{x} }) + f({\boldsymbol{a} }) + c = \psi({\boldsymbol{x} }) + 2\varphi ({\boldsymbol{x} },{\boldsymbol{a} }) + f({\boldsymbol{x} }) + \psi({\boldsymbol{a} }) + f({\boldsymbol{a} }) + c. $$

As a result, we obtain that after a translation by the vector a:

(a) The quadratic part ψ(x) does not change.

(b) The linear part f(x) is replaced by f(x)+2φ(x,a).

(c) The constant term c is replaced by c+f(a)+ψ(a).
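As a simple check of rules (a), (b), and (c), consider in the plane the quadric \(F({\boldsymbol{x} }) = x_{1}^{2} + x_{2}^{2} - 2x_{1} = 0\), for which \(\psi({\boldsymbol{x} }) = x_{1}^{2} + x_{2}^{2}\), \(f({\boldsymbol{x} }) = -2x_{1}\), c=0, and \(\varphi ({\boldsymbol{x} },{\boldsymbol{y} }) = x_{1}y_{1} + x_{2}y_{2}\). After the translation by the vector a=(1,0), the linear part becomes \(f({\boldsymbol{x} }) + 2\varphi ({\boldsymbol{x} },{\boldsymbol{a} }) = -2x_{1} + 2x_{1} = 0\), and the constant term becomes \(c + f({\boldsymbol{a} }) + \psi({\boldsymbol{a} }) = 0 - 2 + 1 = -1\), so that the equation takes the form \(x_{1}^{2} + x_{2}^{2} - 1 = 0\). This agrees with the direct substitution \(F({\boldsymbol{x} }+{\boldsymbol{a} }) = (x_{1}+1)^{2} + x_{2}^{2} - 2(x_{1}+1) = x_{1}^{2} + x_{2}^{2} - 1\).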

Using statement (b), we see that with the aid of a suitable translation, it is sometimes possible to eliminate the first-degree terms in the equation of a quadric. More precisely, this is possible if there exists a vector aL such that

$$ f({\boldsymbol{x} }) = -2 \varphi ({\boldsymbol{x} }, {\boldsymbol{a} }) $$
(11.56)

for an arbitrary xL. By Theorem 6.3, the bilinear form φ(x,y) determines a linear transformation of the space L into the dual space L, taking a vector yL to the linear function φ(x,y) of the argument x. Then condition (11.56) says that f coincides with the linear function x↦−2φ(x,a), that is, that f is the image of the vector −2a under this transformation. This means that the condition (11.56) amounts to the linear function fL being contained in the image of this transformation.
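For example, for the parabola \(x_{1}^{2} + x_{2} = 0\) we have \(\psi({\boldsymbol{x} }) = x_{1}^{2}\), \(f({\boldsymbol{x} }) = x_{2}\), and \(\varphi ({\boldsymbol{x} },{\boldsymbol{y} }) = x_{1}y_{1}\), so that \(2\varphi ({\boldsymbol{x} },{\boldsymbol{a} }) = 2a_{1}x_{1}\) for every vector a; no choice of a makes this equal to \(-x_{2}\), condition (11.56) cannot be satisfied, and the first-degree term cannot be eliminated. For the circle of the preceding example, on the contrary, \(f({\boldsymbol{x} }) = -2x_{1} = -2\varphi ({\boldsymbol{x} },{\boldsymbol{a} })\) with a=(1,0), and the first-degree term disappears after the corresponding translation.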

First of all, let us investigate those quadrics for which condition (11.56) is satisfied. In this case, there exists a frame of reference of the affine space in which the quadric can be represented by the equation

$$ F({\boldsymbol{x} }) = \psi({\boldsymbol{x} }) + c = 0. $$
(11.57)

This equation exhibits a remarkable symmetry: it is invariant under a change of the vector x into −x. Let us investigate this further.

Definition 11.31

Let V be an affine space and A a point of V. A central symmetry with respect to a point A is a mapping VV that maps each point BV to the point B′∈V such that \(\overrightarrow {AB'} = - \overrightarrow {AB}\).

It is obvious that by this condition, the point B′, and therefore the mapping, is uniquely determined. A trivial verification shows that this mapping is an affine transformation and that its linear part is equal to minus the identity transformation.

Definition 11.32

A set QV is said to be centrally symmetric with respect to a point AV if it is invariant under a central symmetry with respect to the point A, which in this case is called the center of the set Q.

It follows from the definition that a point A is a center of a quadric if and only if the quadric is transformed into itself by the mapping x↦−x, where \({\boldsymbol{x} }= \overrightarrow {AX}\) for every point X of this quadric.

Theorem 11.33

If a quadric does not coincide with an affine subspace, is not a cylinder, and has a center, then the center is unique.

Proof

Let A and B be two distinct centers of the quadric Q. This means, by definition, that for every point XQ, there exists a point X′∈Q such that

$$ \overrightarrow {AX} = - \overrightarrow {AX'}, $$
(11.58)

and for every point YQ, there exists a point Y′∈Q such that

$$ \overrightarrow {BY} = - \overrightarrow {BY'}. $$
(11.59)

Let us apply relationship (11.58) to an arbitrary point XQ, and relationship (11.59) to the associated point X′=Y. Let us denote the point Y′ obtained as a result of these actions by X″. It is obvious that

$$ \overrightarrow {XX''} = \overrightarrow {XA} + \overrightarrow {AB} + \overrightarrow {BX''}, $$
(11.60)

and from relationships (11.58) and (11.59), it follows that \(\overrightarrow {XA} = \overrightarrow {AX'}\) and \(\overrightarrow {BX''} = \overrightarrow {X'B}\). Substituting the last expressions into (11.60), we obtain that \(\overrightarrow {XX''} = 2 \overrightarrow {AB}\). In other words, this means that if the vector e is equal to \(2 \overrightarrow {AB}\), then the quadric Q is invariant under the translation by the vector e; see Fig. 11.5. This assertion also follows from an examination of the similar triangles ABX′ and XXX′ in Fig. 11.5.

Fig. 11.5 Similar triangles

Since AB, the vector e is nonnull. Let us choose an arbitrary frame of reference (O;e 1,…,e n ), where e 1=e. Let us set L′=〈e 2,…,e n 〉 and consider the affine space V′=(L′,L′) and mapping h:VV′, defined by the following conditions: h(O)=O, h(A)=O if \(\overrightarrow {OA} = \lambda {\boldsymbol{e} }\), and h(A i )=e i if \(\overrightarrow {O A_{i}} = {\boldsymbol{e} }_{i}\) (i=2,…,n). It is obvious that the mapping h is a projection and that the set Q is a cylinder. Since by our assumption, the quadric Q is not a cylinder, we have obtained a contradiction. □

Thus we obtain that by choosing a system of coordinates with the origin at the center of the quadric, one can define an arbitrary quadric satisfying the conditions of Theorem 11.33 by the equation

$$ \psi(x_1, \ldots, x_n) = c, $$
(11.61)

where ψ is a nonsingular quadratic form (in the case of a singular form ψ, the quadric would be a cylinder).

If c≠0, then we may assume that c=1 by multiplying both sides of equality (11.61) by c −1. Finally, we may execute a linear transformation that preserves the origin and brings the quadratic form ψ into canonical form (6.22). As a result, the equation of the quadric takes the form

$$ x_1^2 + \cdots+ x_r^2 - x_{r+1}^2 - \cdots- x_n^2 = c, $$
(11.62)

where c=0 or 1, and the number r is the index of inertia of the quadratic form ψ.

If c=0 and r=0 or r=n, then it follows that x 1=0, …, x n =0, that is, the quadric consists of a single point, the origin, which contradicts the assumption made above that it does not coincide with some affine subspace. Likewise, for c=1 and r=0, we obtain that \(-x_{1}^{2} - \cdots- x_{n}^{2} = 1\), and this is impossible for real x 1,…,x n , so that the quadric is the empty set, which again contradicts our assumption.

We have thus proved the following assertion.

Theorem 11.34

If a quadric does not coincide with an affine subspace, is not a cylinder, and has a center, then in some coordinate system, it is defined by equation (11.62). Moreover, 0<rn, and if c=0, then r<n.

In the case c=0, it is possible, by multiplying the equation of a quadric by −1, to obtain that in (11.62), the number of positive terms is not less than the number of negative terms, that is, rnr, or equivalently, rn/2. In the sequel, we shall always assume that in the case c=0, this condition is satisfied.

Theorem 11.34 asserts that every quadric that is not an affine subspace or a cylinder and that has a center can be transformed with the help of a suitable nonsingular affine transformation into a quadric given by equation (11.62). For c=0 (and only in this case), the quadric (11.62) is a cone (with its vertex at the origin), that is, for every one of its points x, it also contains the entire line 〈x〉. It is possible to indicate another characteristic property of a quadric given by equation (11.62) for c=0: it is not smooth, while in the case c=1, the quadric is smooth. This follows at once from the definition of singular points (the equalities F=0 and \(\frac{\partial F}{\partial x_{i}} = 0\)).
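This can also be seen directly: for \(F = x_{1}^{2} + \cdots+ x_{r}^{2} - x_{r+1}^{2} - \cdots- x_{n}^{2} - c\) we have \(\frac{\partial F}{\partial x_{i}} = \pm 2x_{i}\), so that all the partial derivatives vanish only at the origin, and the origin lies on the quadric precisely when F(0)=−c=0. Thus for c=0 the vertex of the cone is a singular point, while for c=1 the quadric contains no point at which all the partial derivatives vanish.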

Let us now consider quadrics without a center. Such a quadric Q is defined by the equation

$$ F({\boldsymbol{x} }) = \psi({\boldsymbol{x} }) + f({\boldsymbol{x} }) + c = 0, $$
(11.63)

where ψ(x) is a quadratic form, f(x) a linear form, and c a scalar. As earlier, we associate with the quadratic form ψ(x) its symmetric bilinear form φ(x,y) and the linear transformation of L into L that takes a vector y to the linear function φ(x,y). We have seen that for a quadric Q not to have a center is equivalent to the condition that the linear function f is not contained in the image of this transformation.

Let us choose an arbitrary basis e 1,…,e n−1 in the hyperplane L′ defined in the space L by the linear homogeneous equation f(x)=0, and let us extend this basis to a basis of the entire space L by means of a vector \({\boldsymbol{e} }_{n} \in ({\mathsf{L}}')^{\perp}\) such that f(e n )=1 (here, of course, orthogonality is understood with respect to the bilinear form φ(x,y)). In the obtained frame of reference (O;e 1,…,e n ), equation (11.63) can be written in the form

$$ F({\boldsymbol{x} }) = \psi' (x_1, \ldots, x_{n-1}) + \alpha x_n^2 + x_n + c = 0, $$
(11.64)

where ψ′ is the restriction of the quadratic form ψ to the hyperplane L′.

Let us now choose in L′ a new basis \({\boldsymbol{e} }'_{1}, \ldots, {\boldsymbol{e} }'_{n-1}\), in which the quadratic form ψ′ has the canonical form

$$ \psi' (x_1, \ldots, x_{n-1}) = x_1^2 + \cdots+ x_r^2 - x_{r+1}^2 - \cdots- x_{n-1}^2. $$
(11.65)

It is obvious that in this case, the coordinate origin O and the vector e n remain unchanged. If as a result, the quadratic form ψ′ turned out to depend on fewer than n−1 variables, then the polynomial F in equation (11.63) would depend on fewer than n variables, and that, as we have seen, means that the quadric Q is a cylinder.

Let us show that in formula (11.64), the number α is equal to 0. If α≠0, then by virtue of the obvious relationship \(\alpha x_{n}^{2} + x_{n} + c = \alpha (x_{n} + \beta)^{2} + c'\), where β=1/(2α) and c′=cβ/2, we obtain that via the translation by the vector a=−β e n , equation (11.64) is transformed into

$$F({\boldsymbol{x} }) = \psi' (x_1, \ldots, x_{n-1}) + \alpha x_n^2 + c' = 0, $$

where ψ′ has the form (11.65). But such an equation, as is easily seen, gives a quadric with a center.

Thus assuming that the quadric Q is not a cylinder and does not have a center, we obtain that its equation has the form

$$x_1^2 + \cdots+ x_r^2 - x_{r+1}^2 - \cdots- x_{n-1}^2 + x_n + c = 0. $$

Now let us perform a translation by the vector a=−c e n . As a result, the coordinates x 1,…,x n−1 are unchanged, while x n is changed to x n c. In the new coordinates, the equation of the quadric assumes the form

$$ x_1^2 + \cdots+ x_r^2 - x_{r+1}^2 - \cdots- x_{n-1}^2 + x_n = 0. $$
(11.66)

By multiplying the equation of the quadric by −1 and changing the coordinate x n to −x n , we can obtain that the number of positive squares in equation (11.66) is not less than the number of negative squares, that is, rnr−1, or equivalently, r≥(n−1)/2.

We have thereby obtained the following result.

Theorem 11.35

Every quadric that is not an affine subspace or a cylinder and does not have a center can be given in some coordinate system by equation (11.66), where r is a number satisfying the condition (n−1)/2≤rn−1.

Thus by combining Theorems 11.34 and 11.35, we obtain the following result: Every quadric that is not an affine subspace or a cylinder can be given in some coordinate system by equation (11.62) if it has a center and by equation (11.66) if it does not have a center. We call these equations canonical.
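For example, for n=2 these canonical equations are the following: with a center, \(x_{1}^{2} + x_{2}^{2} = 1\) (an ellipse), \(x_{1}^{2} - x_{2}^{2} = 1\) (a hyperbola), and \(x_{1}^{2} - x_{2}^{2} = 0\) (a pair of intersecting lines); without a center, \(x_{1}^{2} + x_{2} = 0\) (a parabola). These are precisely the second-order curves of analytic geometry that are neither affine subspaces nor cylinders.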

Theorems 11.34 and 11.35 do more than give the simplest form into which the equation of a quadric can be transformed through a suitable choice of coordinate system. Beyond that, it follows from these theorems that quadrics having a canonical form (11.62) or (11.66) can be affinely equivalent (that is, transformable into each other by a nonsingular affine transformation) only if their equations coincide.

On the way to proving this assertion, we shall first establish that quadrics defined by equation (11.66) never have a center. Indeed, writing the equation of a quadric in the form (11.50), we may say that it has a center only if the linear function f is contained in the image of the linear transformation associated with ψ, that is, only if condition (11.56) can be satisfied. But a simple verification shows that this condition is not satisfied for quadrics defined by equation (11.66). Indeed, if in some basis e 1,…,e n of the space L, the quadratic form ψ(x) is given as

$$x_1^2 + \cdots+ x_r^2 - x_{r+1}^2 - \cdots- x_{n-1}^2, $$

then on choosing the dual basis f 1,…,f n of the dual space L, we obtain that the linear transformation associated with ψ, which takes a vector y to the linear function φ(x,y), where φ(x,y) is the symmetric bilinear form determined by the quadratic form ψ, takes e i to f i for i=1,…,r, takes e i to −f i for i=r+1,…,n−1, and takes e n to 0, while the linear form x n coincides with f n . Thus the image of this transformation is 〈f 1,…,f n−1〉, and the linear function f n is not contained in it.

We may now formulate the fundamental theorem on the classification of quadrics with respect to nonsingular affine transformations.

Theorem 11.36

Any quadric that is not an affine subspace or cylinder can be represented in some coordinate system by the canonical equation (11.62) or (11.66), where the number r satisfies the conditions indicated in Theorems 11.34 and 11.35 respectively. Conversely, two quadrics given in some coordinate systems by canonical equations (11.62) or (11.66) can be transformed into each other by a nonsingular affine transformation only if their canonical equations coincide.

Proof

Only the second part of the theorem remains to be proved. We have already seen that quadrics given by equations (11.62) and (11.66) cannot be mapped into each other by nonsingular affine transformations, since in the first case, the quadric has a center, while in the second case, it does not. Therefore, we may consider each case separately.

Let us begin with the first case. Let there be given two quadrics Q 1 and Q 2, given by different canonical equations of the form (11.62) (we note that the canonical equations in this case differ by the value c=0 or 1 and index r), and where Q 2=g(Q 1) for some nonsingular affine transformation g. By assumption, each quadric has a unique center, which in its chosen coordinate system coincides with the point O=(0,…,0).

Let us write down the transformation g in the form (8.19), as the composition of a transformation g 0 with g 0(O)=O and a translation by some vector a. By assumption, Q 2=g(Q 1), and since g must take the center of the quadric Q 1 to the center of Q 2, this means that g(O)=O, that is, the vector a is equal to 0. In the equations of the quadrics, which we may write in the form F i (x)=ψ i (x)+c i =0, i=1 and 2, it is clear that F i (0)=c i . Since g(O)=O, the point O belongs to Q 1 if and only if it belongs to Q 2, and therefore the constant c 1 vanishes if and only if c 2 does; this means that the constants c i coincide (in the sequel, we shall denote them by c). Thus the equations of the quadrics Q 1 and Q 2 differ only in the quadratic part ψ i (x).

By Theorem 11.29, the transformation g takes the polynomial F 1(x)−c into λ(F 2(x)−c), where λ is some nonzero real number. Consequently, the quadratic form ψ 1(x) is transformed into λψ 2(x) by the linear transformation g. If we denote the indices of inertia of the quadratic forms ψ i (x) by r i , then from the law of inertia, it follows that either r 2=r 1 (for λ>0) or r 2=nr 1 (for λ<0). In the case c=0, we may assume that r i n/2, and the equality r 2=nr 1 is possible only for r 2=r 1. In the case c=1, this same result follows from the fact that the transformation g takes the polynomial ψ 1(x)−1 into λ(ψ 2(x)−1). Comparing the constant terms, we obtain λ=1.

In the case that the quadric has no center, we may repeat the same arguments. We again obtain that the quadratic form ψ 1(x) is carried into λψ 2(x) by a nonsingular linear transformation. Since each form ψ i (x) contains by assumption the term \(x_{1}^{2}\), it follows that λ=1, and from the law of inertia, it follows that r 2=r 1 (for λ>0), or r 2=n−1−r 1 (for λ<0). Since by assumption, r i ≥(n−1)/2, the equality r 2=n−1−r 1 is possible only for r 2=r 1. □

Thus we see that in a real affine space of dimension n, there exists only a finite number of affinely inequivalent quadrics that are not affine subspaces or cylinders. Each of them is equivalent to a quadric that can be represented in the form of equation (11.62) or equation (11.66).

It is possible to compute the number of types of affinely inequivalent quadrics. Equation (11.62) for c=1 gives n possibilities. The remaining cases depend on the parity of the number n. If n=2m, then equation (11.62) for c=0 gives m different types, and the same number is given by equation (11.66). Altogether, we obtain n+2m=2n different types in the case of even n. If n=2m+1, then equation (11.62) for c=0 gives m different types, while equation (11.66) gives m+1 types, since for odd n the value r=(n−1)/2=m is admissible. Altogether in this case we obtain n+2m+1=2n different types as well. Thus in a real affine space of dimension n, the number of types of affinely inequivalent quadrics that are not affine subspaces or cylinders is equal to 2n.
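For example, for n=3 the quadrics that are not affine subspaces or cylinders comprise six affinely inequivalent types, namely the quadric surfaces familiar from analytic geometry: the ellipsoid \(x_{1}^{2}+x_{2}^{2}+x_{3}^{2}=1\), the hyperboloid of one sheet \(x_{1}^{2}+x_{2}^{2}-x_{3}^{2}=1\), the hyperboloid of two sheets \(x_{1}^{2}-x_{2}^{2}-x_{3}^{2}=1\), the cone \(x_{1}^{2}+x_{2}^{2}-x_{3}^{2}=0\), the elliptic paraboloid \(x_{1}^{2}+x_{2}^{2}+x_{3}=0\), and the hyperbolic paraboloid \(x_{1}^{2}-x_{2}^{2}+x_{3}=0\).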

Remark 11.37

It is easy to see that the content of this section is reduced to the classification of second-degree polynomials F(x 1,…,x n ) up to a nonsingular affine transformation of the variables and multiplication by a nonzero scalar coefficient. The connection with the geometric object—the quadric—is established by Theorem 11.29. That we excluded from consideration the case of affine subspaces is related to the fact that we wished to emphasize the differences among the geometric objects that arise.

The assumption that the quadric was not a cylinder was made exclusively to emphasize the inductive nature of the classification. The limitations that we introduced could have been dispensed with. By repeating precisely the same arguments, we obtain that an arbitrary set in n-dimensional affine space given by equating a second-degree polynomial in n variables—the coordinates of a point—to zero is affinely equivalent to one of the sets defined by the following equations:

$$ x_1^2 + \cdots+ x_r^2 - x_{r+1}^2 - \cdots- x_m^2 = 1, $$
(11.67)

$$ x_1^2 + \cdots+ x_r^2 - x_{r+1}^2 - \cdots- x_m^2 = 0, $$
(11.68)

$$ x_1^2 + \cdots+ x_r^2 - x_{r+1}^2 - \cdots- x_m^2 + x_{m+1} = 0, $$
(11.69)

where in each case 0≤rmn, and in equation (11.69), m<n.

After this, it is easy to see that in the case of (11.67) for r=0, the empty set is obtained, while in the case (11.68) for r=0 or r=m, the result is an affine subspace. In the remaining cases, it is easy to find a line that intersects the given set in two distinct points and is not entirely contained in it. By virtue of Theorem 8.14, this means that such a set is not an affine subspace.

In conclusion, let us say a bit about the topological properties of affine quadrics.

If in equation (11.62), we have c=1 and the index of inertia r is equal to 1, then this equation can be rewritten in the form \(x_{1}^{2} = 1 + x_{2}^{2} + \cdots+ x_{n}^{2}\), from which it follows that \(x_{1}^{2} \ge1\), that is, x 1≥1 or x 1≤−1. Clearly, it is impossible for a point of the quadric whose coordinate x 1 is not less than 1 to be continuously deformed into a point whose coordinate x 1 is less than or equal to −1 while remaining on the quadric (see the definition on p. xx). Therefore, a quadric in this case consists of two components, that is, of two subsets such that no two points lying one in each of these subsets can be continuously deformed into each other while remaining on the quadric. It can be shown that each of these components is path connected (see the definition on p. xx), just as is every quadric given by equation (11.66).

The simplest example of a quadric consisting of two path-connected components is a hyperbola in the plane; see Fig. 11.6.

Fig. 11.6 A hyperbola

The topological property that we described above has a generalization to quadrics defined by equation (11.62) for c=1 with arbitrary values of the index r, but still assuming that r≥1. Here we shall say a few words about them, without giving a rigorous formulation and also omitting proofs.

For r=1 we can find two points, (1,0,…,0) and (−1,0,…,0), that cannot be transformed into each other by a continuous motion along the quadric (together they form the zero-dimensional sphere given by the equation \(x_{1}^{2} = 1\) in a one-dimensional space). For an arbitrary value of r, the quadric contains the sphere

$$x_1^2 + \cdots+ x_r^2 = 1,\qquad x_{r+1} = 0,\qquad \ldots,\qquad x_n = 0. $$

One can prove that this sphere cannot be contracted to a single point by continuous motion along the surface of the quadric. But for every m<r and continuous mapping f of the sphere \(S^{m-1}: y_{1}^{2} + \cdots+ y_{m}^{2} = 1\) into the quadric, the image of the sphere f(S m−1) can be contracted to a point by continuous motion along the quadric (it should be clear to the reader what is meant by continuous motion of a set along a quadric, something that we have already encountered in the case r=1).

11.6 Quadrics in an Affine Euclidean Space

It remains for us to consider nonsingular quadrics in an affine Euclidean space V. We shall, as before, exclude the cases in which the quadrics are affine subspaces or cylinders. The classification of such quadrics up to metric equivalence uses precisely the same arguments as those used in Sect. 11.5. To some extent, the results of that section can be applied in our case, since motions are affine transformations. Therefore, we shall only cursorily recall the line of reasoning.

Generalizing the statement of the problem, which goes back to analytic geometry (where cases dimV=2 and 3 are considered), we shall say that two quadrics are metrically equivalent if they can be transformed into each other by some motion of the space V. This definition is a special case of metric equivalence of arbitrary metric spaces (see p. xxi), to which belong, as is easily verified, all quadrics in an affine Euclidean space.

First of all, let us consider quadrics given by equations whose linear part can be annihilated by a translation. These are quadrics that have a center (which, as we have seen, is unique). Choosing a coordinate origin (that is, a point O of the frame of reference (O;e 1,…,e n )) at the center of the quadric, we bring its equation into the form

$$\psi(x_1, \ldots, x_n) = c, $$

where ψ(x 1,…,x n ) is a nonsingular quadratic form, c a number. If c≠0, then by multiplying the equation by c −1, we may assume that c=1. For c=0, the quadric is a cone.

Using an orthogonal transformation, the quadratic form ψ can be brought into canonical form

$$\psi(x_1, \ldots, x_n) = \lambda _1 x_1^2 + \lambda _2 x_2^2 + \cdots+ \lambda _n x_n^2, $$

where all the numbers λ 1,…,λ n are nonzero, since by assumption, our quadric is nonsingular and is neither an affine subspace nor a cylinder, which means that the quadratic form ψ is nonsingular. Let us separate the positive numbers from the negative: suppose λ 1,…,λ k >0 and λ k+1,…,λ n <0. By tradition going back to analytic geometry, we shall set \(\lambda _{i} = a_{i}^{-2}\) for i=1,…,k and \(\lambda _{j} = - a_{j}^{-2}\) for j=k+1,…,n, where all numbers a 1,…,a n are positive.

Thus every quadric having a center is metrically equivalent to a quadric with equation

$$ \biggl(\frac{x_1}{a_1} \biggr)^2 + \cdots+ \biggl( \frac {x_k}{a_k} \biggr)^2 - \biggl(\frac{x_{k+1}}{a_{k+1}} \biggr)^2 - \cdots- \biggl(\frac {x_n}{a_n} \biggr)^2 = c, $$
(11.70)

where c=0 or 1. For c=0, multiplying equation (11.70) by −1, we may, as in the affine case, assume that kn/2.

Now let us consider the case that the quadric

$$\psi(x_1, \ldots, x_n) + f(x_1, \ldots, x_n) + c = 0 $$

does not have a center, that is, the linear function f is not contained in the image of the linear transformation associated with the quadratic form ψ, namely the transformation of L into L taking a vector y to the linear function φ(x,y), in which φ(x,y) is the symmetric bilinear form that gives the quadratic form ψ. In this case, it is easy to verify that as in Sect. 11.5, we can find an orthonormal basis e 1,…,e n of the space L such that

$$f({\boldsymbol{e} }_1) = 0,\qquad \ldots,\qquad f({\boldsymbol{e} }_{n-1}) = 0,\qquad f({\boldsymbol{e} }_n) = 1, $$

and in the coordinate system determined by the frame of reference (O;e 1,…,e n ), the quadric is given by the equation

$$\lambda _1 x_1^2 + \lambda _2 x_2^2 + \cdots+ \lambda _{n-1} x_{n-1}^2 + x_n + c = 0. $$

Through a translation by the vector −c e n , this equation can be brought into the form

$$\lambda _1 x_1^2 + \lambda _2 x_2^2 + \cdots+ \lambda _{n-1} x_{n-1}^2 + x_n = 0, $$

in which all the coefficients λ i are nonzero, since the quadric is nonsingular and is not a cylinder.

If λ 1,…,λ k >0 and λ k+1,…,λ n−1<0, then by multiplying the equation of the quadric and the coordinate x n by −1 if necessary, we may assume that k≥(n−1)/2. Setting, as previously, \(\lambda _{i} = a_{i}^{-2}\) for i=1,…,k and \(\lambda _{j} = - a_{j}^{-2}\) for j=k+1,k+2,…,n−1, where a 1,…,a n−1>0, we bring the previous equation into the form

$$ \biggl(\frac{x_1}{a_1} \biggr)^2 + \cdots+ \biggl( \frac {x_k}{a_k} \biggr)^2 - \biggl(\frac{x_{k+1}}{a_{k+1}} \biggr)^2 - \cdots- \biggl(\frac {x_{n-1}}{a_{n-1}} \biggr)^2 + x_n = 0. $$
(11.71)

Thus every quadric in an affine Euclidean space is metrically equivalent to a quadric given by equation (11.70) (type I) or (11.71) (type II). Let us verify (under the given conditions and restriction on r) that two quadrics of the form (11.70) or of the form (11.71) are metrically equivalent only if all the numbers a 1,…,a n (for type I) and a 1,…,a n−1 (for type II) in their equations are the same. Here we may consider separately quadrics of type I and of type II, since they differ even from the viewpoint of affine equivalence.

By Theorem 8.39, every motion of an affine Euclidean space is the composition of a translation and an orthogonal transformation. As we saw in Sect. 11.5, a translation does not alter the quadratic part of the equation of a quadric. By Theorem 11.29, two quadrics are affinely equivalent only if the polynomials appearing in their equations differ by a constant factor. But for quadrics of type I for c=1, this factor must be equal to 1. In the case of a quadric of type I for c=0, multiplication by μ>0 means that all the numbers a i are multiplied by μ −1/2. For a quadric of type II, this factor must also be equal to 1 in order to preserve the coefficient 1 in the linear term x n .

Thus we see that if we exclude quadrics of type I with constant term c=0 (a cone), then the quadratic parts of the equations must be quadratic forms equivalent with respect to orthogonal transformations. But the numbers λ i are defined as the eigenvalues of the associated symmetric linear transformation, and therefore, this also determines the numbers a i . In the case of a cone (quadric of type I for c=0), all the numbers λ i can be multiplied by a common positive factor (because of the assumptions made about r). This means that the numbers a i can be multiplied by an arbitrary positive common factor.

Let us note that although our line of reasoning was precisely the same as in the case of affine equivalence, the result that we obtained was different. We obtained relative to affine equivalence only a finite number of different types of inequivalent quadrics, while with respect to metric equivalence, the number is infinite: they are determined not only by a finite number of values of the index r, but also by arbitrary numbers a i (which in the case of a cone are defined up to multiplication by a common positive factor). This fact is presented in a course in analytic geometry; for example, an ellipse with equation

$$\biggl(\frac{x}{a} \biggr)^2 + \biggl( \frac{y}{b} \biggr)^2 = 1 $$

is defined by its semiaxes a and b, and if for two ellipses these are different, then the ellipses cannot be transformed into each other by a motion of the plane.

For arbitrary n, quadrics having a canonical equation (11.70) with k=n and c=1 are called ellipsoids. The equation of an ellipsoid can be rewritten in the form

$$ \sum_{i=1}^n \biggl(\frac{x_i}{a_i} \biggr)^2 = 1, $$
(11.72)

from which it follows that |x i /a i |≤1 and hence |x i |≤a i . If the largest of these numbers a 1,…,a n is denoted by a, then we obtain that |x i |≤a. This property is expressed by saying that the ellipsoid is a bounded set. The interested reader can easily prove that among all quadrics, only ellipsoids have this property.
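Indeed, for every other canonical equation it is easy to exhibit an unbounded family of points on the quadric: if in equation (11.70) we have c=1 and k<n, then the points \((a_{1}\cosh t, 0, \ldots, 0, a_{k+1}\sinh t, 0, \ldots, 0)\) lie on the quadric for all t, since \(\cosh^{2} t - \sinh^{2} t = 1\); if c=0, then the quadric is a cone and together with every nonzero point contains the entire line through it; and the quadric (11.71) contains the points \((t, 0, \ldots, 0, -(t/a_{1})^{2})\) for all t. In every such case the quadric is unbounded.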

If we renumber the coordinates in such a way that in the equation of the ellipsoid (11.72), the coefficients are a 1a 2≥⋯≥a n , then we obtain

$$\biggl(\frac{x_i}{a_1} \biggr)^2 \le \biggl(\frac{x_i}{a_i} \biggr)^2 \le \biggl(\frac{x_i}{a_n} \biggr)^2, $$

whence for every point x=(x 1,…,x n ) lying on the ellipsoid, we have the inequality a n ≤|x|≤a 1. This means that the distance from the center O of the ellipsoid to the point x is not greater than its distance to the point A=(a 1,0,…,0) and not less than its distance to the point B=(0,…,0,a n ). These two points, or more precisely, the segments OA and OB, are called the semimajor and semiminor axes of the ellipsoid.
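
In more detail, the passage from the displayed inequalities to the estimate a n ≤|x|≤a 1 is obtained by summing them over i=1,…,n and using (11.72):

$$\frac{|x|^2}{a_1^2} = \sum_{i=1}^n \biggl(\frac{x_i}{a_1} \biggr)^2 \le \sum_{i=1}^n \biggl(\frac{x_i}{a_i} \biggr)^2 = 1 \le \sum_{i=1}^n \biggl(\frac{x_i}{a_n} \biggr)^2 = \frac{|x|^2}{a_n^2}. $$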

11.7 Quadrics in the Real Plane*

In this section, we shall not be proving any new facts. Rather, our goal is to establish a connection between results obtained earlier and facts familiar from analytic geometry, in particular, the interpretation of quadrics in the real plane as conic sections, which was known already to the ancient Greeks.

Let us begin by considering the simplest example, in which it is possible to see the difference between the affine and projective classifications of quadrics, that is, quadrics in the real affine and real projective planes. But for this, we must first refine (or recall) the statement of the problem.

By the definition from Sect. 9.1, we may represent a projective space of arbitrary dimension n in the form ℙ(L), where L is a vector space of dimension n+1. An affine space of the same dimension n can be considered the affine part of ℙ(L), determined by the condition φ≠0, where φ is some nonnull linear function on L. It can also be identified with the set W φ , defined by the condition φ(x)=1. This set is an affine subspace of L (we may view L as its own space of vectors). In the sequel, we shall make use of precisely this construction of an affine space.

A quadric \(\overline{Q}\) in a projective space ℙ(L) is given by an equation F(x)=0, where F is a homogeneous second-degree polynomial. In the space L, the collection of all vectors for which F(x)=0 forms a cone K. Let us recall that a cone is a set K such that for every vector x∈K, the entire line 〈x〉 containing x is also contained in K. A cone associated in this way with a quadric is called a quadratic cone. From this point of view, the projective classification of quadrics coincides with the classification of quadratic cones with respect to nonsingular linear transformations.

Thus an affine quadric Q can be represented in the form W φ ∩K using the previously given notation W φ and K. Quadrics \(Q_{1} \subset W_{\varphi _{1}}\) and \(Q_{2} \subset W_{\varphi _{2}}\) are by definition affinely equivalent if there exists a nonsingular affine transformation \(W_{\varphi _{1}} \to W_{\varphi _{2}}\) mapping Q 1 to Q 2. This means that we have a nonsingular linear transformation \(\mathcal{A}\) of the vector space L for which

$$\mathcal{A}(W_{\varphi _{1}}) = W_{\varphi _{2}},\qquad \mathcal{A}(K_{1}) = K_{2}, $$

where K 1 and K 2 are the quadratic cones associated with the quadrics Q 1 and Q 2.

First of all, let us examine how the mapping \(\mathcal{A}\) acts on the set W φ . To this end, let us recall that in the space L ∗ of linear functions on L there is defined the dual transformation \(\mathcal{A}^{*}\), for which

$$\bigl(\mathcal{A}^{*}(\varphi )\bigr) (x) = \varphi \bigl(\mathcal{A}(x)\bigr) $$

for all vectors x∈L and φ∈L ∗. In other words, this means that if \(\psi = \mathcal{A}^{*}(\varphi )\), then the linear function ψ(x) is equal to \(\varphi (\mathcal{A}(x))\). Since the transformation \(\mathcal{A}\) is nonsingular, the dual transformation \(\mathcal{A}^{*}\) is also nonsingular, and therefore, there exists an inverse transformation \((\mathcal{A}^{*})^{-1}\). By definition, if φ(x)=1, then \(((\mathcal{A}^{*})^{-1}(\varphi )) (\mathcal{A}(x)) = 1\); that is, \(\mathcal{A}\) takes W φ into the set \(W_{(\mathcal{A}^{*})^{-1}(\varphi )}\).
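
In coordinates this is a routine verification (the matrix notation here is introduced only for this check): if in some basis of L the transformation \(\mathcal{A}\) has matrix A and the linear function φ is written as \(\varphi (x) = f^{\top} x\), where f is the column of its coefficients and x also denotes the coordinate column of a vector, then

$$\bigl(\mathcal{A}^*(\varphi )\bigr) (x) = \varphi (A x) = \bigl(A^{\top} f\bigr)^{\top} x, \qquad \bigl( \bigl(\mathcal{A}^*\bigr)^{-1}(\varphi ) \bigr) \bigl(\mathcal{A}(x)\bigr) = \bigl( \bigl(A^{\top}\bigr)^{-1} f \bigr)^{\top} A x = f^{\top} x, $$

so that vectors with φ(x)=1 are indeed taken by \(\mathcal{A}\) to vectors of the set \(W_{(\mathcal{A}^{*})^{-1}(\varphi )}\).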

Since in previous sections we considered only nonsingular projective quadrics, it is natural to impose corresponding restrictions in the affine case as well. To this end, we shall use, as earlier, the representation of affine quadrics in the form Q=W φ ∩K. The quadratic cone K determines a projective quadric \(\overline{Q}\). It is easy to express this correspondence in coordinates. If we choose in L a system of coordinates (x 0,x 1,…,x n ), then in \(W_{x_{0}}\) inhomogeneous coordinates y 1,…,y n are defined by the formula y i =x i /x 0. If the quadric Q is given by the second-degree equation

$$f(y_1, \ldots,y_n) = 0, $$

then the quadric \(\overline{Q}\) (and cone K) is given by the equation

$$F(x_0,x_1, \ldots,x_n) = 0, \quad\mbox{where } F = x_0^2 f \biggl( \frac {x_1}{x_0}, \ldots, \frac{x_n}{x_0} \biggr). $$

Thus the projective quadric \(\overline{Q}\) is uniquely defined by the affine quadric Q.
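
For instance, for the circle given in the inhomogeneous coordinates y 1,y 2 by the equation \(y_{1}^{2} + y_{2}^{2} - 1 = 0\), we obtain

$$F(x_0,x_1,x_2) = x_0^2 \biggl( \biggl(\frac{x_1}{x_0} \biggr)^2 + \biggl(\frac{x_2}{x_0} \biggr)^2 - 1 \biggr) = x_1^2 + x_2^2 - x_0^2, $$

so that the associated quadratic cone K is given by the equation \(x_{1}^{2} + x_{2}^{2} = x_{0}^{2}\).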

Definition 11.38

An affine quadric Q is said to be nonsingular if the associated projective quadric \(\overline{Q}\) is nonsingular.

In a space of arbitrary dimension n, all quadrics with canonical equations (11.67)–(11.69) for m<n are singular. Furthermore, a quadric of type (11.68) is singular as well for m=n. Both these assertions can be verified directly from the definitions; we have only to denote the coordinates x 1,…,x n by y 1,…,y n , introduce homogeneous coordinates x 0:x 1:⋯:x n , setting y i =x i /x 0, and multiply all the equations by \(x_{0}^{2}\). It is then very easy to write down the matrix of the resulting quadratic form F(x 0,x 1,…,x n ).

In particular, for n=2, we obtain three equations:

$$ y_1^2 + y_2^2 = 1,\qquad y_1^2 - y_2^2 = 1,\qquad y_1^2 + y_2 = 0. $$
(11.73)

From the results of Sect. 11.5, it follows that for n=2, every nonsingular affine quadric is affinely equivalent to a quadric of one (and only one) of these three types. The corresponding quadrics are called ellipses, hyperbolas, and parabolas.
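
For instance, for the parabola, that is, for the third equation in (11.73), passing to homogeneous coordinates gives the quadratic form \(F = x_{1}^{2} + x_{0} x_{2}\), whose matrix in the coordinates x 0,x 1,x 2 is

$$\begin{pmatrix} 0 & 0 & \frac{1}{2} \\ 0 & 1 & 0 \\ \frac{1}{2} & 0 & 0 \end{pmatrix}. $$

Its determinant equals −1/4≠0, so that the parabola is indeed a nonsingular affine quadric; the same check applies to the other two equations of (11.73).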

On the other hand, in Sect. 11.4, we saw that all nonsingular projective quadrics are projectively equivalent. This result can be used to give a graphic representation of affine quadrics. As we have seen, every affine quadric can be represented in the form Q=W φ ∩K, where K is some quadratic cone. It is affinely equivalent to the quadric

$$W_{(\mathcal{A}^{*})^{-1}(\varphi )} \cap \mathcal{A}(K), $$

where \(\mathcal{A}\) is an arbitrary nonsingular linear transformation of the space L.

Here the specific nature of the case n=2 (dim L=3) manifests itself. By what has been proved earlier, every cone K associated with a nonsingular quadric can be mapped to every other such cone by a nonsingular linear transformation \(\mathcal{A}\). In particular, we may assume that \(\mathcal{A}(K) = K_{0}\), where the cone K 0 is given in some coordinate system x 0,x 1,x 2 of the space L by the equation \(x_{1}^{2} +x_{2}^{2} = x_{0}^{2}\). This cone is obtained by the rotation of one of its generatrices, that is, a line lying entirely on the cone (for example, the line x 1=x 0, x 2=0), about the axis x 0 (that is, the line x 1=x 2=0). For the cone K 0 that we have chosen, the angle between a generatrix and the axis x 0 is equal to π/4. In other words, this means that each pole of the cone K 0 is obtained by rotating the sides of an isosceles right triangle about the bisector of its right angle.

Setting \(\psi = (\mathcal{A}^{*})^{-1}(\varphi )\), we obtain that an arbitrary nonsingular affine quadric is affinely equivalent to the quadric W ψ ∩K 0. Here W ψ is an arbitrary plane in the space L not passing through the vertex of the cone K 0, that is, through the point O=(0,0,0). Thus every nonsingular affine quadric is affinely equivalent to a planar section of a right circular cone. This explains the terminology conic used for quadrics in the plane.

It is well known from analytic geometry how the three conics that we have found (ellipses, hyperbolas, and parabolas) are obtained from a single (from the point of view of projective classification) curve. If we begin with equations (11.73), then the difference in the three types is revealed by writing these equations in homogeneous coordinates. Setting y 1=x 1/x 0 and y 2=x 2/x 0, we obtain the equations

$$ x_1^2 + x_2^2 = x_0^2,\qquad x_1^2 - x_2^2 = x_0^2,\qquad x_1^2 + x_0 x_2 = 0. $$
(11.74)

The differences among these equations can be found in the different natures of the sets of intersection with the infinite line l ∞ given by the equation x 0=0. For an ellipse, this set is empty; for a hyperbola, it consists of two points, (0:1:1) and (0:1:−1); and for a parabola, it consists of the single point (0:0:1) (substitution of x 0=0 into equation (11.74) shows that the line l ∞ is tangent to the parabola at the point of intersection); see Fig. 11.7.
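
The tangency can also be verified directly from formula (11.3): for \(F = x_{1}^{2} + x_{0} x_{2}\) and the point A=(0:0:1), we have

$$\frac{\partial F}{\partial x_0}(A)\, x_0 + \frac{\partial F}{\partial x_1}(A)\, x_1 + \frac{\partial F}{\partial x_2}(A)\, x_2 = 1\cdot x_0 + 0\cdot x_1 + 0\cdot x_2 = 0, $$

which is precisely the equation x 0=0 of the line l ∞.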

Fig. 11.7 Intersection of a conic with an infinite line

We saw in Sect. 9.2 that an affine transformation coincides with a projective transformation that preserves the line l ∞. Therefore, the type of the set \({\overline{Q}} \cap l_{\infty}\) (empty set, two points, one point) must be the same for affinely equivalent quadrics Q. In our case, the actual content of what we proved in Sect. 11.4 is that the type of the set \({\overline{Q}} \cap l_{\infty}\) determines the quadric Q up to affine equivalence.

But if we begin with the representation of a conic as the intersection of the cone K 0 with the plane W ψ , then the different types arise from the different possible dispositions of the plane W ψ with respect to the cone K 0. Let us recall that the vertex O of the cone K 0 partitions it into two poles. If the equation of the cone has the form \(x_{1}^{2} + x_{2}^{2} = x_{0}^{2}\), then each pole is determined by the sign of x 0.

Let us denote by L ψ the plane parallel to W ψ and passing through the point O. This plane is given by the equation ψ=0. If L ψ has no points of intersection with the cone K 0 other than O, then W ψ intersects only one of its poles (namely, the one containing the point of intersection of W ψ with the axis x 0). In this case, the conic W ψ ∩K 0 lies within one pole and is an ellipse.

For example, in the special case in which the plane W ψ is orthogonal to the axis x 0, we obtain a circle. If we tilt the plane W ψ (that is, decrease its angle with the axis x 0), then its intersection with the cone K 0 is an ellipse whose eccentricity increases as the angle decreases; see Fig. 11.8(a). The limiting position is reached when the plane L ψ is tangent to the cone K 0 along a generatrix. Then W ψ again intersects only one pole (the one containing its point of intersection with the axis x 0). This intersection is a parabola; see Fig. 11.8(b). And if the plane L ψ intersects K 0 in two different generatrices, then W ψ intersects both of its poles (in the parts lying on that side of the plane L ψ on which the parallel plane W ψ is located). This intersection is a hyperbola; see Fig. 11.8(c).
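
As a concrete illustration (the three planes below are chosen purely by way of example), consider the cone K 0 given by \(x_{1}^{2} + x_{2}^{2} = x_{0}^{2}\) and the planes W ψ defined by x 0=1, x 0=x 2+1, and x 2=1. Substituting these relations into the equation of the cone, we obtain in turn

$$x_1^2 + x_2^2 = 1,\qquad x_1^2 = 2 x_2 + 1,\qquad x_0^2 - x_1^2 = 1, $$

that is, a circle, a parabola, and a hyperbola. Accordingly, the parallel planes L ψ passing through O are given by x 0=0, which meets K 0 only at the point O; by x 0=x 2, which is tangent to K 0 along the generatrix x 1=0, x 0=x 2; and by x 2=0, which intersects K 0 in the two generatrices x 1=±x 0.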

Fig. 11.8 Conic sections

The connection between planar quadrics and conic sections is revealed particularly clearly by the metric classification of such quadrics, which forms part of any sufficiently rigorous course in analytic geometry. Let us recall only the main results.

As was done in Sect. 11.5, we must exclude from consideration those conics that are cylinders and those that are unions of vector subspaces (that is, in our case, lines or points). Then the results obtained in Sect. 11.5 give us (in coordinates x,y) the following three types of conic:

$$ \frac{x^2}{a^2} + \frac{y^2}{b^2} = 1,\qquad \frac{x^2}{a^2} - \frac {y^2}{b^2} = 1,\qquad x^2 + a^2 y = 0, $$
(11.75)

where a>0 and b>0. From the point of view of affine classification presented above, curves of the first type are ellipses, those of the second type are hyperbolas, and those of the third type are parabolas.

Let us recall that in a course in analytic geometry, these curves are defined as geometric loci of points of the plane satisfying certain conditions. Namely, an ellipse is the geometric locus of points the sum of whose distances from two given points in the plane is constant. A hyperbola is defined analogously, with the sum replaced by the absolute value of the difference. A parabola is the geometric locus of points equidistant from a given point and a given line not passing through that point.
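
For example, for an ellipse given by the first equation of (11.75) with a≥b, set \(c = \sqrt{a^{2} - b^{2}}\) and take the two given points (the foci) to be (±c,0). For a point (x,y) of the ellipse, a direct computation using \(y^{2} = b^{2}(1 - x^{2}/a^{2})\) gives

$$(x \mp c)^2 + y^2 = \biggl(a \mp \frac{c}{a}\, x \biggr)^2, $$

and since |x|≤a and c<a, the distances from (x,y) to the two foci equal a−(c/a)x and a+(c/a)x, so that their sum is the constant 2a.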

There is an elegant and elementary proof of the fact that all ellipses, hyperbolas, and parabolas are not only affinely, but also metrically, that is, as geometric loci of points, equivalent to planar sections of a right circular cone. Let us recall that by a right circular cone we mean a cone K in three-dimensional space obtained as the result of a rotation of a line about some other line intersecting it, called the axis of the cone. The lines forming the cone are called its generatrices; they intersect the axis of the cone in one common point, called its vertex.

In other words, this result means that the section of a right circular cone by a plane not passing through the vertex of the cone is either an ellipse, a hyperbola, or a parabola, and conversely, every ellipse, hyperbola, and parabola coincides with the intersection of a right circular cone with a suitable plane.