
10.1 Plücker Coordinates of a Subspace

The fundamental idea of analytic geometry, which goes back to Fermat and Descartes, consists in the fact that every point of the two-dimensional plane or three-dimensional space is defined by its coordinates (two or three, respectively). Of course, there must also be present a particular choice of coordinate system. In this course, we have seen that this very principle is applicable to many spaces of more general types: vector spaces of arbitrary dimension, as well as Euclidean, affine, and projective spaces. In this chapter, we shall show that it can be applied to the study of vector subspaces M of fixed dimension m in a given vector space L of dimension n ≥ m. Since there is a bijection between the m-dimensional subspaces M ⊂ L and (m−1)-dimensional projective subspaces ℙ(M) ⊂ ℙ(L), we shall therefore also obtain a description of the projective subspaces of fixed dimension of a projective space with the aid of “coordinates” (certain collections of numbers).

The case of points of a projective space (subspaces of dimension 0) was already analyzed in the previous chapter: they are given by homogeneous coordinates. The same holds in the case of hyperplanes of a projective space ℙ(L): they correspond to the points of the dual space ℙ(L∗). The simplest case in which the problem is not reduced to these two cases given above is the set of projective lines in three-dimensional projective space. Here a solution was proposed by Plücker, and therefore, in the most general case, the “coordinates” corresponding to a subspace are called Plücker coordinates. Following the course of history, we shall begin in Sects. 10.1 and 10.2 by describing these using some coordinate system, and then investigate the construction we have introduced in an invariant way, in order to determine which of its elements depend on the choice of coordinate system and which do not.

Therefore, we now assume that some basis has been chosen in the vector space L. Since dimL=n, every vector a ∈ L has in this basis n coordinates. Let us consider a subspace M ⊂ L of dimension m ≤ n. Let us choose an arbitrary basis a 1,…,a m of the subspace M. Then M=〈a 1,…,a m 〉, and the vectors a 1,…,a m are linearly independent. The vector a i has, in the chosen basis of the space L, coordinates a i1,…,a in (i=1,…,m), which we can arrange in the form of a matrix M of type (m,n), writing them in row form:

$$ M = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix} $$
(10.1)

The condition that the vectors a 1,…,a m are linearly independent means that the rank of the matrix M is equal to m, that is, one of its minors of order m is nonzero. Since the number of rows of the matrix M is equal to m, a minor of order m is uniquely defined by the indices of its columns. Let us denote by \(M_{i_{1}, \ldots, i_{m}}\) the minor consisting of columns with indices i 1,…,i m , which assume the various values from 1 to n.

We know that not all of the minors \(M_{i_{1}, \ldots, i_{m}}\) can be equal to zero at the same time. Let us examine how they depend on the choice of basis a 1,…,a m in M. If b 1,…,b m is some other basis of this subspace, then

$${\boldsymbol{b} }_i = b_{i 1} {\boldsymbol{a} }_1 + \cdots+ b_{i m} {\boldsymbol{a} }_m,\quad i=1, \ldots, m. $$

Since the vectors b 1,…,b m are linearly independent, the determinant |(b ij )| is nonzero. Let us set c=|(b ij )|. If \(M'_{i_{1}, \ldots, i_{m}}\) is a minor of the matrix M′, constructed analogously to M using the vectors b 1,…,b m , then by formula (3.35) and Theorem 2.54 on the determinant of a product of matrices, we have the relationship

$$ M'_{i_1, \ldots, i_m} = c M_{i_1, \ldots, i_m}. $$
(10.2)

The numbers \(M_{i_{1}, \ldots, i_{m}}\) that we have determined are not independent. Namely, if the unordered collection of numbers j 1,…,j m coincides with i 1,…,i m (that is, comprises the same numbers, perhaps arranged in a different order), then as we saw in Sect. 2.6, we have the relationship

$$ M_{j_1, \ldots, j_m} = \pm M_{i_1, \ldots, i_m}, $$
(10.3)

where the sign + or − appears depending on whether the number of transpositions necessary to effect the passage from the collection (i 1,…,i m ) to (j 1,…,j m ) is even or odd. In other words, the function \(M_{i_{1}, \ldots, i_{m}}\) of m arguments i 1,…,i m assuming the values 1,…,n is antisymmetric.

In particular, we may take as the collection (j 1,…,j m ) the arrangement of the numbers i 1,…,i m such that i 1<i 2<⋯<i m , and the corresponding minor \(M_{j_{1}, \ldots, j_{m}}\) will coincide with either \(M_{i_{1}, \ldots, i_{m}}\) or \(-M_{i_{1}, \ldots, i_{m}}\). In view of this, in the original notation, we shall assume that i 1<i 2<⋯<i m , and we shall set

$$ p_{i_1, \ldots, i_m} = M_{i_1, \ldots, i_m} $$
(10.4)

for all collections i 1<i 2<⋯<i m of the numbers 1,…,n. Thus we assign to the subspace M as many of the numbers \(p_{i_{1}, \ldots, i_{m}}\) as there are combinations of n things taken m at a time, that is, \(\nu= \mathrm{C}_{n}^{m}\). From formula (10.3) and the condition that the rank of the matrix M is equal to m, it follows that these numbers \(p_{i_{1}, \ldots, i_{m}}\) cannot all become zero simultaneously. On the other hand, formula (10.2) shows that in replacing the basis a 1,…,a m of the subspace M by some other basis b 1,…,b m of this subspace, all these numbers are simultaneously multiplied by some number c≠0. Thus the numbers \(p_{i_{1}, \ldots, i_{m}}\) for i 1<i 2<⋯<i m can be taken as the homogeneous coordinates of a point of the projective space ℙν−1=ℙ(N), where dimN=ν and dimℙ(N)=ν−1.
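For readers who wish to experiment, the numbers (10.4) are easy to compute. The following sketch (ours, not part of the text; it assumes NumPy is available, and the name pluecker_coordinates is illustrative) returns all ν = C(n,m) minors of an m×n coordinate matrix of the form (10.1):

import numpy as np
from itertools import combinations

def pluecker_coordinates(M):
    """All minors p_{i1..im} with i1 < ... < im of an m x n matrix M."""
    m, n = M.shape
    return {I: np.linalg.det(M[:, list(I)]) for I in combinations(range(n), m)}

# A 2-dimensional subspace of a 4-dimensional space, given by two basis rows:
M = np.array([[1.0, 0.0, 2.0, 3.0],
              [0.0, 1.0, 4.0, 5.0]])
p = pluecker_coordinates(M)   # keys are 0-based increasing index tuples

Passing to another basis of the same subspace (that is, multiplying M on the left by an invertible m×m matrix) multiplies every value in p by the same nonzero factor c, in accordance with (10.2).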

Definition 10.1

The totality of numbers \(p_{i_{1}, \ldots, i_{m}}\) in (10.4) for all collections i 1<i 2<⋯<i m taking the values 1,…,n is called the Plücker coordinates of the m-dimensional subspace M ⊂ L.

As we have seen, Plücker coordinates are defined only up to a common nonzero factor; the collection of them must be understood as a point in the projective space ℙν−1.

The simplest special case m=1 returns us to the definition of projective space, whose points correspond to one-dimensional subspaces 〈a〉 of some vector space L. The numbers \(p_{i_{1}, \ldots, i_{m}}\) in this case become the homogeneous coordinates of a point. It is therefore not surprising that all of these depend on the choice of a coordinate system (that is, a basis) of the space L. Following tradition, in the sequel we shall allow for a certain imprecision and call “Plücker coordinates” of the subspace M both a point of the projective space ℙν−1 and the collection of numbers \(p_{i_{1}, \ldots, i_{m}}\) specified in this definition.

Theorem 10.2

The Plücker coordinates of a subspace M ⊂ L uniquely determine the subspace.

Proof

Let us choose an arbitrary basis a 1,…,a m of the subspace M. It determines the minors \(M_{i_{1}, \ldots, i_{m}}\) uniquely (and not merely up to a common factor), for any order of the indices i 1,…,i m . By formula (10.3), these minors are in turn determined by the Plücker coordinates (10.4).

A vector x ∈ L belongs to the subspace M=〈a 1,…,a m 〉 if and only if the rank of the matrix

$$ \overline{M} = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \\ x_{1} & x_{2} & \cdots & x_{n} \end{pmatrix}, $$

consisting of the coordinates of the vectors a 1,…,a m ,x in some (arbitrary) basis of the space L, is equal to m, that is, if all the minors of order m+1 of the matrix \(\overline{M}\) are equal to zero. Let us consider the minor that comprises the columns with indices forming the subset X={k 1,…,k m+1} of the set ℕ n ={1,…,n}, where we may assume that k 1<k 2<⋯<k m+1. Expanding it along the last row, we obtain the equality

$$ \sum_{\alpha \in X} x_{\alpha } A_{\alpha } = 0, $$
(10.5)

where A α is the cofactor of the element x α in the minor under consideration. But by definition, the minor corresponding to A α is obtained from the matrix \(\overline{M}\) by deleting the last row and the column with index α. Therefore, it coincides with one of the minors of the matrix M, and the indices of its columns are obtained by deleting the element α from the set X. For writing the sets thus obtained, one frequently uses the convenient notation

$$\{k_1, \ldots, \breve{k_{\alpha }}, \ldots, k_{m+1}\}, $$

where the notation \(\breve{\phantom{o}}\) signifies the omission of the element so indicated. Thus relationship (10.5) can be written in the form

$$ \sum_{j=1}^{m+1} (-1)^j x_{k_j} M_{k_1, \ldots, \breve{k_{j}}, \ldots, k_{m+1}} = 0. $$
(10.6)

Since the minors \(M_{i_{1}, \ldots, i_{m}}\) of the matrix M are expressed in Plücker coordinates by formula (10.4), relationships (10.6), obtained from all possible subsets X={k 1,…,k m+1} of the set ℕ n , also give expressions in terms of Plücker coordinates of the condition xM, which completes the proof of the theorem. □
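The membership condition (10.6) is easy to test numerically. Here is a small sketch (ours, assuming NumPy; the helper name in_subspace is illustrative) that checks whether x ∈ M by verifying that all minors of order m+1 of the matrix \(\overline{M}\) vanish:

import numpy as np
from itertools import combinations

def in_subspace(M, x, tol=1e-10):
    """True if the row x lies in the row space of the m x n matrix M."""
    Mbar = np.vstack([M, x])   # the matrix with the additional row x
    k = M.shape[0] + 1         # the order m + 1 of the minors to be tested
    return all(abs(np.linalg.det(Mbar[:, list(X)])) < tol
               for X in combinations(range(M.shape[1]), k))

M = np.array([[1.0, 0.0, 2.0, 3.0],
              [0.0, 1.0, 4.0, 5.0]])
print(in_subspace(M, 2 * M[0] - M[1]))              # True
print(in_subspace(M, np.array([1., 1., 1., 1.])))   # False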

By Theorem 10.2, Plücker coordinates uniquely define the subspace M, but as a rule, they cannot assume arbitrary values. It is true that for m=1, the homogeneous coordinates of a point of projective space can be arbitrary numbers (of course, with the exception of the one collection consisting of all zeros). Another equally simple case is m=n−1, in which subspaces are hyperplanes corresponding to points of ℙ(L∗). Hyperplanes are defined by their coordinates in this projective space, which also can be chosen as arbitrary collections of numbers (again with the exclusion of the collection consisting of all zeros). It is not difficult to verify that these homogeneous coordinates can differ from Plücker coordinates only by their signs, that is, by the factor ±1. However, as we shall now see, for an arbitrary number m<n, the Plücker coordinates are connected to one another by certain specific relationships.

Example 10.3

Let us consider the next case in order of complexity: n=4, m=2. If we pass to projective spaces corresponding to L and M, then this will give us a description of the totality of projective lines in three-dimensional projective space (the case considered by Plücker).

Since n=4, m=2, we have \(\nu= \mathrm{C}^{2}_{4}= 6\), and consequently, each plane M ⊂ L has six Plücker coordinates:

$$ p_{12}, p_{13}, p_{14}, p_{23}, p_{24}, p_{34}. $$
(10.7)

It is easy to see that, renumbering the vectors of the basis of L if necessary so that p 12 ≠0, we may always choose a basis a,b in the subspace M in such a way that the matrix M given by formula (10.1) will have the form

$$ M = \begin{pmatrix} 1 & 0 & a_{3} & a_{4} \\ 0 & 1 & b_{3} & b_{4} \end{pmatrix}. $$

From this follow easily the values of the Plücker coordinates (10.7):

$$ p_{12} = 1,\qquad p_{13} = b_3,\qquad p_{14} = b_4,\qquad p_{23} = -a_3,\qquad p_{24} = -a_4,\qquad p_{34} = a_3 b_4 - a_4 b_3, $$

which yields the relationship p 34 − p 13 p 24 + p 14 p 23 =0. In order to make this homogeneous, we will use the fact that p 12 =1, and write it in the form

$$ p_{12} p_{34} - p_{13} p_{24} + p_{14} p_{23} = 0. $$
(10.8)

The relationship (10.8) is already homogeneous, and therefore, it is preserved under multiplication of all the Plücker coordinates (10.7) by an arbitrary nonzero factor c. Thus relationship (10.8) remains valid for an arbitrary choice of Plücker coordinates, and this means that it defines a point in some projective algebraic variety in 5-dimensional projective space. In the following section, we shall study an analogous question in the general case, for arbitrary dimension m<n.
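Relation (10.8) can be confirmed numerically for any plane. A quick check (ours, reusing the illustrative function pluecker_coordinates from above; the keys are 0-based, so p 12 corresponds to (0, 1)):

import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((2, 4))   # a random 2-dimensional subspace of L
p = pluecker_coordinates(M)
lhs = p[(0, 1)] * p[(2, 3)] - p[(0, 2)] * p[(1, 3)] + p[(0, 3)] * p[(1, 2)]
assert abs(lhs) < 1e-12           # (10.8) holds up to rounding error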

10.2 The Plücker Relations and the Grassmannian

We shall now describe the relationships satisfied by Plücker coordinates of an m-dimensional subspace M of an n-dimensional space L for arbitrary n and m. Here we shall use the following notation and conventions. Although in the definition of Plücker coordinates \(p_{i_{1}, \ldots, i_{m}}\) it was assumed that i 1<i 2<⋯<i m , now we shall consider numbers \(p_{i_{1}, \ldots, i_{m}}\) also with other collections of indices. Namely, if (j 1,…,j m ) is an arbitrary collection of m indices taking the values 1,…,n, then we set

$$ p_{j_1, \ldots, j_m} = 0 $$
(10.9)

if some two of the numbers j 1,…,j m are equal, while if all the numbers j 1,…,j m are distinct and (i 1,…,i m ) is their arrangement in ascending order, then we set

$$ p_{j_1, \ldots, j_m} = \pm p_{i_1, \ldots, i_m}, $$
(10.10)

where the sign + or − depends on whether the permutation that takes (j 1,…,j m ) to (i 1,…,i m ) is even or odd (that is, whether the number of transpositions is even or odd), according to Theorem 2.25.

In other words, in view of equality (10.3), let us set

$$ p_{j_1, \ldots, j_m} = M_{j_1, \ldots, j_m}, $$
(10.11)

where (j 1,…,j m ) is an arbitrary collection of indices assuming the values 1,…,n.

Theorem 10.4

For every m-dimensional subspace M of an n-dimensional space L and for any two sets (j 1,…,j m−1) and (k 1,…,k m+1) of indices taking the values 1,…,n, the following relationships hold:

$$ \sum_{r=1}^{m+1} (-1)^r p_{j_1, \ldots, j_{m-1}, k_r} \cdot p_{k_1, \ldots, \breve{k_r}, \ldots, k_{m+1}} = 0. $$
(10.12)

These are called the Plücker relations.

The notation \({k_{1}, \ldots, \breve{k_{r}}, \ldots, k_{m+1}}\) means that we omit k r in the sequence k 1,…,k r ,…,k m+1.

Let us note that the indices among the numbers \(p_{\alpha _{1}, \ldots, \alpha _{m}}\) entering relationship (10.12) are not necessarily in ascending order, so they are not Plücker coordinates. But with the aid of relationships (10.9) and (10.10), we can easily express them in terms of Plücker coordinates. Therefore, relationship (10.12) may also be viewed as a relationship among Plücker coordinates.
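In computations it is convenient to realize the conventions (10.9) and (10.10) directly. A sketch (ours; the sign is computed by counting inversions, whose parity equals the parity of the number of transpositions):

def extended_p(p, J):
    """The symbol p_{j1..jm} for an arbitrary tuple J of m indices."""
    if len(set(J)) < len(J):
        return 0.0                     # a repeated index: convention (10.9)
    I = tuple(sorted(J))
    inversions = sum(1 for a in range(len(J)) for b in range(a + 1, len(J))
                     if J[a] > J[b])
    return (-1) ** inversions * p[I]   # the sign rule: convention (10.10)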

Proof of Theorem 10.4

Returning to the definition of Plücker coordinates in terms of the minors of the matrix (10.1) and using relationship (10.11), we see that equality (10.12) can be rewritten in the form

$$ \sum_{r=1}^{m+1} (-1)^r M_{j_1, \ldots, j_{m-1}, k_r} \cdot M_{k_1, \ldots, \breve{k_r}, \ldots, k_{m+1}} = 0. $$
(10.13)

Let us show that relationship (10.13) holds for the minors of an arbitrary matrix of type (m,n). To this end, let us expand the determinant \(M_{j_{1}, \ldots, j_{m-1} k_{r}}\) along the last column. Let us denote the cofactor of the element \(a_{l k_{r}}\) of the last column of this determinant by A l , l=1,…,m. Thus the cofactor A l corresponds to the minor located in the rows and columns with indices \((1, \ldots, \breve{l}, \ldots, m)\) and (j 1,…,j m−1) respectively. Then

$$M_{j_1, \ldots, j_{m-1} ,k_r} = \sum_{l=1}^{m} a_{l k_r} A_l. $$

On substituting this expression into the left-hand side of relationship (10.13), we arrive at the equality

$$\sum_{r=1}^{m+1} (-1)^r M_{j_1, \ldots, j_{m-1}, k_r} \cdot M_{k_1, \ldots, \breve{k_r}, \ldots, k_{m+1}} = \sum_{r=1}^{m+1} (-1)^r \Biggl( \sum_{l=1}^{m} a_{l k_r} A_l \Biggr) M_{k_1, \ldots, \breve{k_r}, \ldots, k_{m+1}}. $$

Changing the order of summation, we obtain

$$\sum_{l=1}^{m} A_l \Biggl( \sum_{r=1}^{m+1} (-1)^r a_{l k_r} M_{k_1, \ldots, \breve{k_r}, \ldots, k_{m+1}} \Biggr). $$

But the sum in parentheses is equal, up to sign, to the result of the expansion along the first row of the determinant of the square matrix of order m+1 consisting of the columns of the matrix (10.1) numbered k 1,…,k m+1 and rows numbered l,1,…,m. This determinant is equal to

$$\begin{vmatrix} a_{l k_1} & a_{l k_2} & \cdots & a_{l k_{m+1}} \\ a_{1 k_1} & a_{1 k_2} & \cdots & a_{1 k_{m+1}} \\ \vdots & \vdots & & \vdots \\ a_{m k_1} & a_{m k_2} & \cdots & a_{m k_{m+1}} \end{vmatrix} = 0. $$

Indeed, for arbitrary l=1,…,m, two of its rows (numbered 1 and l+1) coincide, and this means that the determinant is equal to zero. □
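Since (10.13) holds for the minors of an arbitrary matrix of type (m,n), the relations can be confirmed numerically. A check (ours, reusing the illustrative helpers pluecker_coordinates and extended_p from above; with 0-based indices the alternating sign starts at r = 0, which changes the whole sum only by a factor ±1 and so does not affect its vanishing):

import numpy as np
from itertools import combinations

m, n = 3, 5
rng = np.random.default_rng(1)
p = pluecker_coordinates(rng.standard_normal((m, n)))
for J in combinations(range(n), m - 1):       # the collection (j1,...,j_{m-1})
    for K in combinations(range(n), m + 1):   # the collection (k1,...,k_{m+1})
        s = sum((-1) ** r * extended_p(p, J + (K[r],))
                * extended_p(p, K[:r] + K[r + 1:])
                for r in range(m + 1))
        assert abs(s) < 1e-10                 # the Plücker relation (10.12)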

Example 10.5

Let us return once more to the case n=4, m=2 considered in the previous section. Relationships (10.12) are here determined by subsets (k) and (l,m,n) of the set {1,2,3,4}. If, for example, k=1 and l=2, m=3, n=4, then we obtain relationship (10.8) introduced earlier. It is easily verified that if all the numbers k,l,m,n are distinct, then we obtain the same relationship (10.8), while if among them there are two that are equal, then relationship (10.12) is an identity (for the proof of this, we can use the antisymmetry of p ij with respect to i and j). Therefore, in the general case, too (for arbitrary m and n), relationships (10.12) among the Plücker coordinates are called the Plücker relations.

We have seen that to each subspace M of given dimension m of the space L of dimension n, there correspond its Plücker coordinates

$$ p_{i_1, \ldots, i_m},\quad i_1 < i_2 < \cdots< i_m, $$
(10.14)

satisfying the relationships (10.12). Thus an m-dimensional subspace M ⊂ L is determined by its Plücker coordinates (10.14), completely analogously to how points of a projective space are determined by their homogeneous coordinates (this is in fact a special case of Plücker coordinates for m=1). However, for m>1, the coordinates of the subspace M cannot be assigned arbitrarily: it is necessary that they satisfy relationships (10.12). Below, we shall prove that these relationships are also sufficient for the collection of numbers (10.14) to be Plücker coordinates of some m-dimensional subspace M ⊂ L. For this, we shall find the following geometric interpretation of Plücker coordinates useful.

Relationships (10.12) are homogeneous (of degree 2) with respect to the numbers \(p_{i_{1}, \ldots, i_{m}}\). After substitution on the basis of formulas (10.9) and (10.10), each of these relationships remains homogeneous, and thus they define a certain projective algebraic variety in the projective space ℙν−1, called a Grassmann variety or simply Grassmannian and denoted by G(m,n).

We shall now investigate the Grassmannian G(m,n) in greater detail.

As we have seen, G(m,n) is contained in the projective space ℙν−1, where \(\nu= \mathrm{C}_{n}^{m}\) (see p. 351), and the homogeneous coordinates are written as the numbers (10.14) with all possible increasing collections of indices taking the values 1,…,n. The space ℙν−1 is the union of affine subsets \(U_{i_{1}, \ldots, i_{m}}\), each of which is defined by the condition \(p_{i_{1}, \ldots, i_{m}} \neq0\) for some choice of indices i 1,…,i m . From this we obtain

$$G(m,n) = \bigcup_{i_1, \ldots, i_m} \bigl( G(m,n) \cap U_{i_1, \ldots, i_m} \bigr). $$

We shall investigate separately one of these subsets \(G(m,n) \cap U_{i_{1}, \ldots, i_{m}}\), for example, for simplicity, the subset with indices (i 1,…,i m )=(1,…,m). The general case is considered completely analogously and differs only in the numeration of the coordinates in the space ℙν−1. We may assume that for points of our affine subset U 1,…,m , the number p 1,…,m is equal to 1.

Relationships (10.12) make it possible to express the Plücker coordinates (10.14) of the subspace M (or equivalently, the minors \(M_{i_{1}, \ldots, i_{m}}\) of the matrix (10.1)) as polynomials in those coordinates \(p_{i_{1}, \ldots, i_{m}}\) for which, among the indices i 1<i 2<⋯<i m , not more than one exceeds m. Any such collection of indices obviously has the form \((1, \ldots, \breve{r}, \ldots, m, l)\), where r ≤ m and l>m. Let us denote the Plücker coordinate corresponding to this collection by \({\overline{p}}_{r l}\), that is, we set \({\overline{p}}_{r l} = p_{1, \ldots, \breve{r}, \ldots, m, l}\).

Let us consider an arbitrary ordered collection j 1<j 2<⋯<j m of numbers between 1 and n. If the indices j k are less than or equal to m for all k=1,…,m, then the collection (j 1,j 2,…,j m ) coincides with the collection (1,2,…,m), and since the Plücker coordinate p 1,…,m is equal to 1, there is nothing to prove. Thus we have only to consider the remaining case.

Let j k >m be one of the numbers j 1<j 2<⋯<j m . Let us use relationship (10.12), corresponding to the collection \((j_{1}, \ldots, \breve {j_{k}}, \ldots, j_{m})\) of m−1 numbers and the collection (1,…,m,j k ) of m+1 numbers. In this case, relationship (10.12) assumes the form

$$\sum_{r=1}^{m} (-1)^r p_{j_1, \ldots, \breve{j_k}, \ldots, j_m, r} \cdot p_{1,2, \ldots, \breve{r}, \ldots, m, j_k} + (-1)^{m+1} p_{j_1, \ldots, \breve {j_k}, \ldots, j_m, j_k} = 0, $$

since p 1,…,m =1. In view of the antisymmetry of the expression \(p_{j_{1}, \ldots, j_{m}}\), we have \(p_{j_{1}, \ldots, j_{m}} = \pm p_{j_{1}, \ldots, \breve{j_{k}}, \ldots, j_{m}, j_{k}}\), and it follows that \(p_{j_{1}, \ldots, j_{m}}\) is equal, up to sign, to the sum (with alternating signs) of the products \(p_{j_{1}, \ldots, \breve{j_{k}}, \ldots, j_{m}, r} \, {\overline{p}}_{r j_k}\). If among the numbers j 1,…,j m there were s numbers exceeding m, then among the numbers \(j_{1}, \ldots, \breve{j_{k}}, \ldots, j_{m}\), there would be already s−1 of them.

Repeating this process as many times as necessary, we will obtain as a result an expression of the chosen Plücker coordinate \(p_{j_{1}, \ldots, j_{m}}\) in terms of the coordinates \({\overline{p}}_{r l}\), rm, l>m. We have thereby obtained the following important result.

Theorem 10.6

For each point in the set G(m,n)∩U 1,…,m , all the Plücker coordinates (10.14) are polynomials in the coordinates \({\overline{p}}_{r l} = p_{1, \ldots, \breve{r},\ldots, m, l}\), r ≤ m, l>m.

Since the numbers r and l satisfy 1 ≤ r ≤ m and m < l ≤ n, it follows that all possible collections of coordinates \({\overline{p}}_{r l}\) form an affine space V of dimension m(n−m). By Theorem 10.6, all the remaining Plücker coordinates \(p_{i_{1}, \ldots, i_{m}}\) are polynomials in \({\overline{p}}_{r l}\), and therefore the coordinates \({\overline{p}}_{r l}\) uniquely define a point of the set G(m,n)∩U 1,…,m . Thus is obtained a natural bijection (given by these polynomials) between points of the set G(m,n)∩U 1,…,m and points of the affine space V of dimension m(n−m). Of course, the same is true as well for points of any other set \(G(m,n) \cap U_{i_{1}, \ldots, i_{m}}\). In algebraic geometry, this fact is expressed by saying that the Grassmannian G(m,n) is covered by affine spaces of dimension m(n−m).

Theorem 10.7

Every point of the Grassmannian G(m,n) corresponds to some m-dimensional subspace M ⊂ L as described in the previous section.

Proof

Since the Grassmannian G(m,n) is the union of sets \(G(m,n) \cap U_{i_{1}, \ldots, i_{m}}\), it suffices to prove the theorem for each set separately. We shall carry out the proof for the set G(m,n)∩U 1,…,m , since the rest differ from it only in the numeration of coordinates.

Let us choose an m-dimensional subspace M ⊂ L and basis a 1,…,a m in it so that in the associated matrix M given by formula (10.1), the elements residing in its first m columns take the form of the identity matrix E of order m. Then the matrix M has the form

$$ M = \begin{pmatrix} 1 & 0 & \cdots & 0 & a_{1\,m+1} & \cdots & a_{1n} \\ 0 & 1 & \cdots & 0 & a_{2\,m+1} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 1 & a_{m\,m+1} & \cdots & a_{mn} \end{pmatrix}. $$
(10.15)

By Theorem 10.6, the Plücker coordinates (10.14) are polynomials in \({\overline{p}}_{r l} = p_{1, \ldots, \breve{r}, \ldots, m, l}\). Moreover, by the definition of Plücker coordinates (10.4), we have \(p_{1, \ldots, \breve{r}, \ldots, m, l}=M_{1, \ldots, \breve{r}, \ldots, m, l}\). Here, in the rth row of the minor \(M_{1, \ldots, \breve{r}, \ldots, m, l}\) of the matrix (10.15), all elements are equal to zero, except for the element in the last column (the column with index l), which is equal to a rl . Expanding the minor \(M_{1, \ldots, \breve{r}, \ldots, m, l}\) along the rth row, we see that it is equal to \((-1)^{r+m} a_{rl}\). In other words, \({\overline{p}}_{r l}=(-1)^{r+m}a_{rl}\).

By our construction, all elements a rl of the matrix (10.15) can assume arbitrary values by the choice of a suitable subspace M ⊂ L and basis a 1,…,a m in it. Thus the Plücker coordinates \({\overline{p}}_{r l}\) also assume arbitrary values. It remains to observe that by Theorem 10.6, all remaining Plücker coordinates are polynomials in \({\overline{p}}_{r l}\), and consequently, for the constructed subspace M, they determine the given point of the set G(m,n)∩U 1,…,m . □
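The computation in this proof is easy to reproduce. A sketch (ours, for m=2, n=4, reusing the illustrative function pluecker_coordinates) builds a matrix of the form (10.15) and verifies that p 1,…,m =1 and \({\overline{p}}_{r l} = (-1)^{r+m} a_{rl}\):

import numpy as np

m, n = 2, 4
A = np.array([[2.0, 3.0],
              [4.0, 5.0]])        # an arbitrary m x (n-m) block
M = np.hstack([np.eye(m), A])     # the matrix (10.15)
p = pluecker_coordinates(M)
assert np.isclose(p[tuple(range(m))], 1.0)   # p_{1,...,m} = 1 on this chart
for r0 in range(m):               # 0-based: r = r0 + 1, l = l0 + 1
    for l0 in range(m, n):
        cols = tuple(i for i in range(m) if i != r0) + (l0,)
        assert np.isclose(p[cols], (-1) ** (r0 + 1 + m) * A[r0, l0 - m])

Since the block A may be chosen arbitrarily, the chart coordinates \({\overline{p}}_{r l}\) take arbitrary values, and all remaining Plücker coordinates are polynomial in them, exactly as the proof asserts.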

10.3 The Exterior Product

Now we shall attempt to understand the sense in which the subspace M ⊂ L is related to its Plücker coordinates, after separating out those parts of the construction that depend on the choice of bases e 1,…,e n in L and a 1,…,a m in M from those that do not depend on the choice of basis.

Our definition of Plücker coordinates was connected with the minors of the matrix M given by formula (10.1), and since minors (like all determinants) are multilinear and antisymmetric functions of the rows (and columns), let us begin by recalling the appropriate definitions from Sect. 2.6 (especially because now we shall need them in a somewhat changed form). Namely, while in Chap. 2, we considered only functions of rows, now we shall consider functions of vectors belonging to an arbitrary vector space L. We shall assume that the space L is finite-dimensional. Then by Theorem 3.64, it is isomorphic to the space of rows of length n=dimL, and so we might have used the definitions from Sect. 2.6. But such an isomorphism itself depends on the choice of basis in the space L, and our goal is precisely to study the dependence of our construction on the choice of basis.

Definition 10.8

A function F(x 1,…,x m ) in m vectors of the space L taking numeric values is said to be multilinear if for every index i in the range 1 to m and arbitrary fixed vectors \({\boldsymbol{a} }_{1}, \ldots, \breve{{\boldsymbol{a} }}_{i}, \ldots, {\boldsymbol{a} }_{m}\),

$$F({\boldsymbol{a} }_1, \ldots, {\boldsymbol{a} }_{i-1}, {\boldsymbol{x} }_i, {\boldsymbol{a} }_{i+1}, \ldots, {\boldsymbol{a} }_m) $$

is a linear function of the vector x i .

For m=1, we arrive at the notion of linear function introduced in Sect. 3.7, and for m=2, this is the notion of bilinear form, introduced in Sect. 6.1.

The definition of antisymmetric function given in Sect. 2.6 was valid for every set, and in particular, we may apply it to the set of all vectors of the space L. According to this definition, for every pair of distinct indices r and s in the range 1 to m, the relationship

$$ F({\boldsymbol{x} }_1, \ldots, {\boldsymbol{x} }_r, \ldots, {\boldsymbol{x} }_s, \ldots, {\boldsymbol{x} }_m) = - F({\boldsymbol{x} }_1, \ldots, {\boldsymbol{x} }_s, \ldots, {\boldsymbol{x} }_r, \ldots, {\boldsymbol{x} }_m) $$
(10.16)

must be satisfied for every collection of vectors x 1,…,x m ∈ L. As proved in Sect. 2.6, it suffices to prove property (10.16) for s=r+1, that is, for a transposition of two neighboring vectors of the collection x 1,…,x m . Then property (10.16) will also be satisfied for arbitrary indices r and s. In view of this, we shall often formulate the condition of antisymmetry only for “neighboring” indices and use the fact that it then holds for two arbitrary indices r and s.

If the values of our functions are elements of a field of characteristic different from 2, then it follows that F(x 1,…,x m )=0 if any two of the vectors x 1,…,x m coincide.

Let us denote by Π m(L) the collection of all multilinear functions of m vectors of the space L, and by Ω m(L) the collection of all antisymmetric functions in Π m(L). The sets Π m(L) and Ω m(L) become vector spaces if for all F,G ∈ Π m(L) we define their sum H=F+G ∈ Π m(L) by the formula

$$H({\boldsymbol{x} }_1, \ldots, {\boldsymbol{x} }_m) = F({\boldsymbol{x} }_1, \ldots, {\boldsymbol{x} }_m) + G({\boldsymbol{x} }_1, \ldots, {\boldsymbol{x} }_m) $$

and define for every function F ∈ Π m(L) the product by the scalar α as the function H=αF ∈ Π m(L) according to the formula

$$H({\boldsymbol{x} }_1, \ldots, {\boldsymbol{x} }_m) = \alpha F({\boldsymbol{x} }_1, \ldots, {\boldsymbol{x} }_m). $$

It directly follows from these definitions that Π m(L) is thereby converted to a vector space, and Ω m(L)⊂Π m(L) is a subspace of Π m(L).

Let dimL=n, and let e 1,…,e n be some basis of the space L. It follows from the definition that the multilinear function F(x 1,…,x m ) is determined for all collections of vectors (x 1,…,x m ) once it is defined for those collections whose vectors x i belong to our basis. Indeed, repeating the arguments from Sect. 2.7 verbatim that we used in the proof of Theorem 2.29, we obtain for F(x 1,…,x m ) the same formulas (2.40) and (2.43). Thus for the chosen basis e 1,…,e n , the multilinear function F(x 1,…,x m ) is determined by its values \(F({\boldsymbol{e} }_{i_{1}}, \ldots, {\boldsymbol{e} }_{i_{m}})\), where i 1,…,i m are all possible collections of numbers from the set ℕ n ={1,…,n}.

The previous line of reasoning shows that the space Π m(L) is isomorphic to the space of functions on the set \({\mathbb{N}}_{n}^{m} = {\mathbb{N}}_{n} \times\cdots\times {\mathbb{N}}_{n}\) (m-fold product). It follows that the dimension of the space Π m(L) is finite and coincides with the number of elements of the set \({\mathbb{N}}_{n}^{m}\). It is easy to verify that this number is equal to n m, and so dimΠ m(L)=n m.

As we observed in Example 3.36 (p. 94), in a space of functions f on a finite set \({\mathbb{N}}_{n}^{m}\), there exists a basis consisting of δ-functions assuming the value 1 on one element of \({\mathbb{N}}_{n}^{m}\) and the value 0 on all the other elements. In our case, we shall introduce a special notation for such a basis. Let I=(i 1,…,i m ) be an arbitrary element of the set \({\mathbb{N}}_{n}^{m}\). Then we denote by f I the function taking the value 1 at the element I and the value 0 on all remaining elements of the set \({\mathbb{N}}_{n}^{m}\).

We now move on to an examination of the subspace of antisymmetric multilinear functions Ω m(L), assuming as previously that there has been chosen in L some basis e 1,…,e n . To verify that a multilinear function F is antisymmetric, it is necessary and sufficient that property (10.16) be satisfied for the vectors e i of the basis. In other words, this reduces to the relationships

$$F({\boldsymbol{e} }_{i_1}, \ldots, {\boldsymbol{e} }_{i_r}, \ldots, {\boldsymbol{e} }_{i_s}, \ldots, {\boldsymbol{e} }_{i_m}) = - F({\boldsymbol{e} }_{i_1}, \ldots, {\boldsymbol{e} }_{i_s}, \ldots, {\boldsymbol{e} }_{i_r}, \ldots, {\boldsymbol{e} }_{i_m}) $$

for all collections of vectors \({\boldsymbol{e} }_{i_{1}}, \ldots, {\boldsymbol{e} }_{i_{m}}\) in the chosen basis e 1,…,e n of the space L. Therefore, for every function F ∈ Ω m(L) and every collection \((j_{1}, \ldots,\allowbreak j_{m}) \in {\mathbb{N}}_{n}^{m}\), we have the equality

$$ F ({\boldsymbol{e} }_{j_1}, \ldots, {\boldsymbol{e} }_{j_m}) = \pm F ( {\boldsymbol{e} }_{i_1}, \ldots, {\boldsymbol{e} }_{i_m}), $$
(10.17)

where the numbers i 1,…,i m are the same as j 1,…,j m , but arranged in ascending order i 1<i 2<⋯<i m , while the sign + or − in (10.17) depends on whether the number of transpositions necessary for passing from the collection (i 1,…,i m ) to the collection (j 1,…,j m ) is even or odd (we note that if any two of the numbers j 1,…,j m are equal, then both sides of equality (10.17) become equal to zero).

Reasoning just as in the case of the space Π m(L), we conclude that the space Ω m(L) is isomorphic to the space of functions on the set \(\overrightarrow{{\mathbb{N}}}_{n}^{m} \subset {\mathbb{N}}_{n}^{m}\), which consists of all increasing sets I=(i 1,…,i m ), that is, those for which i 1<i 2<⋯<i m . From this it follows in particular that Ω m(L)=(0) if m>n. It is easy to see that the number of such increasing sets I is equal to \(\mathrm{C}^{m}_{n}\), and therefore,

$$ \dim \varOmega ^m({\mathsf{L}}) = \mathrm{C}^m_n. $$
(10.18)

We shall denote by F I the δ-function of the space Ω m(L), taking the value 1 on the set \({\boldsymbol{I} }\in \overrightarrow{{\mathbb{N}}}_{n}^{m}\) and the value 0 on all the remaining sets in \(\overrightarrow{{\mathbb{N}}}_{n}^{m}\).

The vectors a 1,…,a m ∈ L determine on the space Ω m(L) a linear function φ given by the relationship

$$ {\boldsymbol{\varphi }}(F) = F({\boldsymbol{a} }_1, \ldots, {\boldsymbol{a} }_m) $$
(10.19)

for an arbitrary element F ∈ Ω m(L). Thus φ is a linear function on Ω m(L), that is, an element of the dual space Ω m(L)∗.

Definition 10.9

The dual space Λ m(L)=Ω m(L)∗ is called the space of m-vectors or the mth exterior power of the space L, and its elements are called m-vectors. A vector φ ∈ Λ m(L) constructed with the help of relationship (10.19) involving the vectors a 1,…,a m is called the exterior product (or wedge product) of a 1,…,a m and is denoted by

$${\boldsymbol{\varphi }}= {\boldsymbol{a} }_1 \wedge {\boldsymbol{a} }_2 \wedge\cdots\wedge {\boldsymbol{a} }_m. $$

Now let us explore the connection between the exterior product and Plücker coordinates of the subspace M ⊂ L. To this end, it is necessary to choose some basis e 1,…,e n in L and some basis a 1,…,a m in M. The Plücker coordinates of the subspace M take the form (10.4), where \(M_{i_{1}, \ldots, i_{m}}\) is the minor of the matrix (10.1) that resides in columns i 1,…,i m and is an antisymmetric function of its columns. Let us introduce for the Plücker coordinates and associated minors the notation

$$p_{{\boldsymbol{I} }} = p_{i_1, \ldots, i_m},\qquad M_{{\boldsymbol{I} }} = M_{i_1, \ldots, i_m},\quad \mbox{where } {\boldsymbol{I} }= (i_1, \ldots, i_m) \in \overrightarrow{{\mathbb{N}}}_n^m. $$

To the basis of the space Ω m(L) consisting of δ-functions F I , there corresponds the dual basis of the dual space Λ m(L), whose vectors we shall denote by φ I . Using the notation that we introduced in Sect. 3.7, we may say that the dual basis is defined by the condition

$$ (F_{{\boldsymbol{I} }}, {\boldsymbol{\varphi }}_{{\boldsymbol{I} }}) = 1 \quad\mbox{for all } {\boldsymbol{I} }\in \overrightarrow{{\mathbb{N}}}_n^m, \qquad(F_{{\boldsymbol{I} }}, {\boldsymbol{\varphi }}_{{\boldsymbol{J}}}) = 0 \quad\mbox{for all } {\boldsymbol{I} }\neq {\boldsymbol{J}}. $$
(10.20)

In particular, the vector φ=a 1a 2∧⋯∧a m of the space Λ m(L) can be expressed as a linear combination of vectors in this basis:

$$ {\boldsymbol{\varphi }}= \sum_{{\boldsymbol{I} }\in \overrightarrow{{\mathbb{N}}}_n^m} \lambda_{{\boldsymbol{I} }} {\boldsymbol{\varphi }}_{{\boldsymbol{I} }} $$
(10.21)

with certain coefficients λ I . Using formulas (10.19) and (10.20), we obtain the following equality:

$$\lambda_{{\boldsymbol{I} }} = {\boldsymbol{\varphi }}(F_{{\boldsymbol{I} }}) = F_{{\boldsymbol{I} }} ( {\boldsymbol{a} }_1, \ldots, {\boldsymbol{a} }_m). $$

For determining the values F I (a 1,…,a m ), we may make use of Theorem 2.29; see formulas (2.40) and (2.43). Since \(F_{{\boldsymbol{I} }}({\boldsymbol{e} }_{j_{1}}, \ldots, {\boldsymbol{e} }_{j_{m}}) = 0\) when the indices of \({\boldsymbol{e} }_{j_{1}}, \ldots, {\boldsymbol{e} }_{j_{m}}\) form the collection JI, then from formula (2.43), it follows that the values F I (a 1,…,a m ) depend only on the elements appearing in the minor M I . The minor M I is a linear and antisymmetric function of its rows. In view of the fact that by definition, \(F_{{\boldsymbol{I} }}({\boldsymbol{e} }_{i_{1}}, \ldots, {\boldsymbol{e} }_{i_{m}}) = 1\), we obtain from Theorem 2.15 that F I (a 1,…,a m )=M I =p I . In other words, we have the equality

$$ {\boldsymbol{\varphi }}= {\boldsymbol{a} }_1 \wedge {\boldsymbol{a} }_2 \wedge\cdots\wedge {\boldsymbol{a} }_m = \sum_{{\boldsymbol{I} }\in \overrightarrow{{\mathbb{N}}}_n^m} M_{{\boldsymbol{I} }} {\boldsymbol{\varphi }}_{{\boldsymbol{I} }} = \sum_{{\boldsymbol{I} }\in \overrightarrow{{\mathbb{N}}}_n^m} p_{{\boldsymbol{I} }} {\boldsymbol{\varphi }}_{{\boldsymbol{I} }}. $$
(10.22)

Thus any collection of m vectors a 1,…,a m uniquely determines the vector a 1∧⋯∧a m in the space Λ m(L), where the Plücker coordinates of the subspace 〈a 1,…,a m 〉 are the coordinates of this vector a 1∧⋯∧a m with respect to the basis φ I , \({\boldsymbol{I} }\in \overrightarrow{{\mathbb{N}}}_{n}^{m}\), of the space Λ m(L). Like all coordinates, they depend on this basis, which itself is constructed as the dual basis to some basis of the space Ω m(L).

Definition 10.10

A vector x ∈ Λ m(L) is said to be decomposable if it can be represented as an exterior product

$$ {\boldsymbol{x} }= {\boldsymbol{a} }_1 \wedge {\boldsymbol{a} }_2 \wedge\cdots\wedge {\boldsymbol{a} }_m $$
(10.23)

with some a 1,…,a m ∈ L.

Let the m-vector x have coordinates \(x_{i_{1}, \ldots, i_{m}}\) in some basis φ I , \({\boldsymbol{I} }\in \overrightarrow{{\mathbb{N}}}_{n}^{m}\), of the space Λ m(L). As in the case of an arbitrary vector space, the coordinates \(x_{i_{1}, \ldots, i_{m}}\) can assume arbitrary values in the associated field. In order for an m-vector x to be decomposable, that is, for it to satisfy relationship (10.23) with some vectors a 1,…,a m ∈ L, it is necessary and sufficient that its coordinates \(x_{i_{1}, \ldots, i_{m}}\) coincide with the Plücker coordinates \(p_{i_{1}, \ldots, i_{m}}\) of the subspace M=〈a 1,…,a m 〉 in L. But as we established in the previous section, the collection of Plücker coordinates of a subspace M ⊂ L cannot be an arbitrary collection of ν numbers, but only one that satisfies the Plücker relations (10.12). Consequently, the Plücker relations give necessary and sufficient conditions for an m-vector x to be decomposable.

Thus for the specification of m-dimensional subspaces M ⊂ L, we need only the decomposable m-vectors (the indecomposable m-vectors correspond to no m-dimensional subspace). However, generally speaking, the decomposable vectors do not form a vector space (the sum of two decomposable vectors might be an indecomposable vector), and also, as is easily verified, the set of decomposable vectors is not contained in any subspace of the space Λ m(L) other than Λ m(L) itself. In many problems, it is more natural to deal with vector spaces, and this is the reason for introducing the notion of a space Λ m(L) that contains all m-vectors, including those that are indecomposable.
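A concrete illustration (ours): in coordinates on Λ 2(L) with dimL=4 (written with 0-based index pairs), the 2-vectors e 1∧e 2 and e 3∧e 4 are decomposable, but their sum violates relation (10.8) and is therefore indecomposable:

x = {(0, 1): 1.0, (0, 2): 0.0, (0, 3): 0.0,   # coordinates of e1^e2 + e3^e4
     (1, 2): 0.0, (1, 3): 0.0, (2, 3): 1.0}
lhs = x[(0, 1)] * x[(2, 3)] - x[(0, 2)] * x[(1, 3)] + x[(0, 3)] * x[(1, 2)]
print(lhs)   # 1.0, not 0: the Plücker relation fails, so x is indecomposable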

Let us note that the basis vectors φ I themselves are decomposable: they are determined by the conditions (10.20), which, as is easily verified, taking into account equality \((F_{{\boldsymbol{J}}}, {\boldsymbol{\varphi }}_{{\boldsymbol{I} }}) =F_{{\boldsymbol{J}}} ({\boldsymbol{e} }_{i_{1}}, \ldots, {\boldsymbol{e} }_{i_{m}})\), means that for a vector x=φ I , we have the representation (10.23) for \({\boldsymbol{a} }_{1} = {\boldsymbol{e} }_{i_{1}}, \ldots, {\boldsymbol{a} }_{m} = {\boldsymbol{e} }_{i_{m}}\), that is,

$${\boldsymbol{\varphi }}_{{\boldsymbol{I} }} = {\boldsymbol{e} }_{i_1} \wedge {\boldsymbol{e} }_{i_2} \wedge\cdots \wedge {\boldsymbol{e} }_{i_m},\quad {\boldsymbol{I} }= (i_1, \ldots,i_m). $$

If e 1,…,e n is a basis of the space L, then the vectors \({\boldsymbol{e} }_{i_{1}} \wedge\cdots \wedge {\boldsymbol{e} }_{i_{m}}\) for all possible increasing collections of indices (i 1,…,i m ) form a basis of the space Λ m(L), dual to the basis F I of the space Ω m(L) that we considered above. Thus every m-vector is a linear combination of decomposable vectors.

The exterior product a 1∧⋯∧a m is a function of m vectors a i ∈ L with values in the space Λ m(L). Let us now establish some of its properties. The first two of these are an analogue of multilinearity, and the third is an analogue of antisymmetry, but taking into account that the exterior product is not a number, but a vector of the space Λ m(L).

Property 10.11

For every i∈{1,…,m} and all vectors a i ,b,c ∈ L the following relationship is satisfied:

$$ {\boldsymbol{a} }_1 \wedge\cdots\wedge {\boldsymbol{a} }_{i-1} \wedge({\boldsymbol{b} }+{\boldsymbol{c} }) \wedge {\boldsymbol{a} }_{i+1} \wedge\cdots\wedge {\boldsymbol{a} }_m = {\boldsymbol{a} }_1 \wedge\cdots\wedge {\boldsymbol{b} }\wedge\cdots\wedge {\boldsymbol{a} }_m + {\boldsymbol{a} }_1 \wedge\cdots\wedge {\boldsymbol{c} }\wedge\cdots\wedge {\boldsymbol{a} }_m. $$
(10.24)

Indeed, by definition, the exterior product

$${\boldsymbol{a} }_1 \wedge\cdots\wedge {\boldsymbol{a} }_{i-1} \wedge({\boldsymbol{b} }+{\boldsymbol{c} }) \wedge {\boldsymbol{a} }_{i+1} \wedge\cdots\wedge {\boldsymbol{a} }_m $$

is a linear function on the space Ω m(L) associating with each function F ∈ Ω m(L) the number F(a 1,…,a i−1 ,b+c,a i+1 ,…,a m ). Since the function F is multilinear, it follows that

$$ F({\boldsymbol{a} }_1, \ldots, {\boldsymbol{b} }+{\boldsymbol{c} }, \ldots, {\boldsymbol{a} }_m) = F({\boldsymbol{a} }_1, \ldots, {\boldsymbol{b} }, \ldots, {\boldsymbol{a} }_m) + F({\boldsymbol{a} }_1, \ldots, {\boldsymbol{c} }, \ldots, {\boldsymbol{a} }_m), $$

which proves equality (10.24).

The following two properties are just as easily verified.

Property 10.12

For every number α and all vectors a i ∈ L, the following relationship holds:

$$ {\boldsymbol{a} }_1 \wedge\cdots\wedge(\alpha {\boldsymbol{a} }_i) \wedge\cdots\wedge {\boldsymbol{a} }_m = \alpha ({\boldsymbol{a} }_1 \wedge\cdots\wedge {\boldsymbol{a} }_i \wedge\cdots\wedge {\boldsymbol{a} }_m). $$
(10.25)

Property 10.13

For every pair of distinct indices r,s∈{1,…,m} and all vectors a i ∈ L, the following relationship holds:

$$ {\boldsymbol{a} }_1 \wedge\cdots\wedge {\boldsymbol{a} }_r \wedge\cdots\wedge {\boldsymbol{a} }_s \wedge\cdots\wedge {\boldsymbol{a} }_m = - {\boldsymbol{a} }_1 \wedge\cdots\wedge {\boldsymbol{a} }_s \wedge\cdots\wedge {\boldsymbol{a} }_r \wedge\cdots\wedge {\boldsymbol{a} }_m, $$
(10.26)

that is, if any two vectors from among a 1,…,a m change places, the exterior product changes sign.

If (as we assume) the numbers are elements of a field of characteristic different from 2 (for example, ℝ or ℂ), then Property 10.13 yields the following corollary.

Corollary 10.14

If any two of the vectors a 1,…,a m are equal, then a 1∧⋯∧a m =0.

Generalizing the definition given above, we may express Properties 10.11, 10.12, and 10.13 by saying that the exterior product a 1∧⋯∧a m is a multilinear antisymmetric function of the vectors a 1,…,a m L taking values in the space Λ m(L).

Property 10.15

Vectors a 1,…,a m are linearly dependent if and only if

$$ {\boldsymbol{a} }_1 \wedge\cdots\wedge {\boldsymbol{a} }_m = {\boldsymbol{0} }. $$
(10.27)

Proof

Let us assume that the vectors a 1,…,a m are linearly dependent. Then one of them is a linear combination of the rest. Let it be the vector a m (the other cases are reduced to this one by a change in numeration). Then

$${\boldsymbol{a} }_m = \alpha _1 {\boldsymbol{a} }_1 + \cdots+ \alpha _{m-1} {\boldsymbol{a} }_{m-1}, $$

and on the basis of Properties 10.11 and 10.12, we obtain that

$$ {\boldsymbol{a} }_1 \wedge\cdots\wedge {\boldsymbol{a} }_{m-1} \wedge {\boldsymbol{a} }_m = \sum_{i=1}^{m-1} \alpha _i ({\boldsymbol{a} }_1 \wedge\cdots\wedge {\boldsymbol{a} }_{m-1} \wedge {\boldsymbol{a} }_i). $$

In view of Corollary 10.14, each term on the right-hand side of this equality is equal to zero, and consequently, we have a 1∧⋯∧a m =0.

Let us assume now that the vectors a 1,…,a m are linearly independent. We must prove that a 1∧⋯∧a m ≠ 0. Equality (10.27) would mean that the function a 1∧⋯∧a m (as an element of the space Λ m(L)) assigns to an arbitrary function F ∈ Ω m(L) the value F(a 1,…,a m )=0. However, in contradiction to this, it is possible to produce a function F ∈ Ω m(L) for which F(a 1,…,a m )≠0. Indeed, let us represent the space L as a direct sum

$${\mathsf{L}}= \langle {\boldsymbol{a} }_1, \ldots, {\boldsymbol{a} }_m\rangle\oplus {\mathsf{L}}', $$

where L′⊂L is some subspace of dimension n−m, and for every vector z ∈ L, let us consider the corresponding decomposition z=x+y, where x∈〈a 1,…,a m 〉 and y ∈ L′. Finally, for vectors

$${\boldsymbol{z} }_i = \alpha _{i1}{\boldsymbol{a} }_1 + \cdots+ \alpha _{im}{\boldsymbol{a} }_m + {\boldsymbol{y} }_i,\quad {\boldsymbol{y} }_i \in {\mathsf{L}}', i=1, \ldots, m, $$

let us define a function F by the condition F(z 1,…,z m )=|(α ij )|. As we saw in Sect. 2.6, the determinant is a multilinear antisymmetric function of its rows. Moreover, F(a 1,…,a m )=|E|=1, which proves our assertion. □
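By (10.22), the coordinates of a 1∧⋯∧a m are the m×m minors of the matrix of coordinate rows, so Property 10.15 states that all these minors vanish exactly when that matrix has rank less than m. A quick numerical check (ours, reusing the illustrative function pluecker_coordinates):

import numpy as np

a1 = np.array([1.0, 2.0, 3.0, 4.0])
a2 = 2.0 * a1                      # a2 is linearly dependent on a1
p = pluecker_coordinates(np.vstack([a1, a2]))
assert all(abs(v) < 1e-12 for v in p.values())   # hence a1 ^ a2 = 0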

Let L and M be arbitrary vector spaces, and let \(\mathcal{A}: {\mathsf{L}} \to {\mathsf{M}}\) be a linear transformation. It defines the transformation

$$ \varOmega ^p(\mathcal{A}): \varOmega ^p({\mathsf{M}}) \to \varOmega ^p({\mathsf{L}}), $$
(10.28)

which assigns to each antisymmetric function F(y 1,…,y p ) in the space Ω p(M), an antisymmetric function G(x 1,…,x p ) in the space Ω p(L) by the formula

$$ G({\boldsymbol{x} }_1, \ldots, {\boldsymbol{x} }_p) = F\bigl(\mathcal{A}({\boldsymbol{x} }_1), \ldots, \mathcal{A}({\boldsymbol{x} }_p)\bigr). $$
(10.29)

A simple verification shows that this transformation is linear. Let us note that we have already met with such a transformation in the case m=1, namely the dual transformation \(\mathcal{A}^{*}\) (see Sect. 3.7). In the general case, passing to the dual spaces Λ p(L)=Ω p(L)∗ and Λ p(M)=Ω p(M)∗, we define the linear transformation

$$ \varLambda ^p(\mathcal{A}): \varLambda ^p({\mathsf{L}}) \to \varLambda ^p({\mathsf{M}}), $$
(10.30)

dual to the transformation (10.28).

Let us note the most important properties of the transformation (10.30).

Lemma 10.16

Let \(\mathcal{A}: {\mathsf{L}} \to {\mathsf{M}}\) and \(\mathcal{B}: {\mathsf{M}} \to {\mathsf{N}}\) be linear transformations of arbitrary vector spaces L,M,N. Then

$$\varLambda ^p(\mathcal{B} \mathcal{A}) = \varLambda ^p(\mathcal{B}) \varLambda ^p(\mathcal{A}). $$

Proof

In view of the definition (10.30) and the properties of dual transformations (formula (3.61)) established in Sect. 3.7, it suffices to ascertain that

$$ \varOmega ^p(\mathcal{B} \mathcal{A}) = \varOmega ^p(\mathcal{A}) \varOmega ^p(\mathcal{B}). $$
(10.31)

But equality (10.31) follows directly from the definition. Indeed, the transformation \(\varOmega ^p(\mathcal{A})\) maps the function F(y 1,…,y p ) in the space Ω p(M) to the function G(x 1,…,x p ) in Ω p(L) by formula (10.29). In just the same way, the transformation \(\varOmega ^p(\mathcal{B})\) maps the function H(z 1,…,z p ) in Ω p(N) to the function F(y 1,…,y p ) in Ω p(M) by the analogous formula

$$ F({\boldsymbol{y} }_1, \ldots, {\boldsymbol{y} }_p) = H\bigl(\mathcal{B}({\boldsymbol{y} }_1), \ldots, \mathcal{B}({\boldsymbol{y} }_p)\bigr). $$
(10.32)

Finally, the transformation \(\varOmega ^p(\mathcal{B} \mathcal{A})\) takes the function H(z 1,…,z p ) in the space Ω p(N) to the function G(x 1,…,x p ) in the space Ω p(L) by the formula

$$ G({\boldsymbol{x} }_1, \ldots, {\boldsymbol{x} }_p) = H\bigl(\mathcal{B} \mathcal{A}({\boldsymbol{x} }_1), \ldots, \mathcal{B} \mathcal{A}({\boldsymbol{x} }_p)\bigr). $$
(10.33)

Substituting into (10.33) the vectors \({\boldsymbol{y} }_i = \mathcal{A}({\boldsymbol{x} }_i)\) and comparing the relationship thus obtained with (10.32), we obtain the required equality (10.31). □

Lemma 10.17

For all vectors x 1,…,x p ∈ L, we have the equality

$$ \varLambda ^p(\mathcal{A}) ({\boldsymbol{x} }_1 \wedge\cdots\wedge {\boldsymbol{x} }_p) = \mathcal{A}({\boldsymbol{x} }_1) \wedge\cdots\wedge \mathcal{A}({\boldsymbol{x} }_p). $$
(10.34)

Proof

Both sides of equality (10.34) are elements of the space Λ p(M)=Ω p(M)∗, that is, they are linear functions on Ω p(M). It suffices to verify that their application to any function F(y 1,…,y p ) in the space Ω p(M) gives one and the same result. But as follows from the definition, in both cases, this result is equal to \(F(\mathcal{A}({\boldsymbol{x} }_1), \ldots, \mathcal{A}({\boldsymbol{x} }_p))\). □

Finally, we shall prove a property of the exterior product that is sometimes called universality.

Property 10.18

Any mapping that assigns to m vectors a 1,…,a m of the space L a vector [a 1,…,a m ] of some space M and satisfies Properties 10.11, 10.12, 10.13 (p. 362) can be obtained from the exterior product a 1∧⋯∧a m by applying some uniquely defined linear transformation \(\mathcal{F}: \varLambda ^m({\mathsf{L}}) \to {\mathsf{M}}\).

In other words, there exists a linear transformation \(\mathcal{F}: \varLambda ^m({\mathsf{L}}) \to {\mathsf{M}}\) such that for every collection a 1,…,a m of vectors of the space L, we have the equality

$$ [{\boldsymbol{a} }_1, \ldots, {\boldsymbol{a} }_m] = \mathcal{F}({\boldsymbol{a} }_1 \wedge\cdots\wedge {\boldsymbol{a} }_m), $$
(10.35)

which can be represented by the following diagram:

$$ \begin{array}{ccc} {\mathsf{L}}^m & \stackrel{\wedge}{\longrightarrow} & \varLambda ^m({\mathsf{L}}) \\ & \searrow & \downarrow \mathcal{F} \\ & & {\mathsf{M}} \end{array} $$
(10.36)

In this diagram, the horizontal arrow is the exterior product (a 1,…,a m )↦a 1∧⋯∧a m , and the diagonal arrow is the mapping (a 1,…,a m )↦[a 1,…,a m ].

Let us note that although L m=L×⋯×L (m-fold product) is clearly a vector space, we by no means assert that the mapping

$${\boldsymbol{a} }_1, \ldots, {\boldsymbol{a} }_m \mapsto[{\boldsymbol{a} }_1, \ldots, {\boldsymbol{a} }_m] $$

discussed in Property 10.18 is a linear transformation L m → M. In general, such is not the case. For example, the exterior product ∧:L m → Λ m(L) itself is not a linear transformation in the case that dimL>m+1 and m>1. Indeed, the image of the exterior product is the set of decomposable vectors described by the Plücker relations, which is not a vector subspace of Λ m(L).

Proof of Property 10.18

We can construct a linear transformation Ψ:M ∗ → Ω m(L) such that it maps every linear function f ∈ M ∗ to the function Ψ(f)∈Ω m(L) defined by the relationship

$$ \varPsi({\boldsymbol{f} }) = {\boldsymbol{f} }\bigl([{\boldsymbol{a} }_1, \ldots, {\boldsymbol{a} }_m]\bigr). $$
(10.37)

By Properties 10.11–10.13, which, by assumption, are satisfied by [a 1,…,a m ], the mapping Ψ(f) thus constructed is a multilinear and antisymmetric function of a 1,…,a m . Therefore, Ψ:M ∗ → Ω m(L) is a linear transformation. Let us define \(\mathcal{F}\) as the dual mapping

$$ \mathcal{F} = \varPsi^{*}: \varLambda ^m({\mathsf{L}}) = \varOmega ^m({\mathsf{L}})^{*} \to {\mathsf{M}}^{**}. $$

By definition of the dual transformation (formula (3.58)), for every linear function F on the space Ω m(L), its image \(\varPsi^{*}(F)\) is a linear function on the space M ∗ such that \(\varPsi^{*}(F)({\boldsymbol{f} }) = F(\varPsi({\boldsymbol{f} }))\) for all f ∈ M ∗. Applying formula (10.37) to the right-hand side of the last equality, we obtain the equality

$$ \varPsi^{*}(F) ({\boldsymbol{f} }) = F \bigl( {\boldsymbol{f} }\bigl([{\boldsymbol{a} }_1, \ldots, {\boldsymbol{a} }_m]\bigr) \bigr). $$
(10.38)

Setting in (10.38) the function F defined by F(Ψ)=Ψ(a 1,…,a m ), that is, F=a 1∧⋯∧a m , we arrive at the relationship

$$ \varPsi^{*}({\boldsymbol{a} }_1 \wedge\cdots\wedge {\boldsymbol{a} }_m) ({\boldsymbol{f} }) = {\boldsymbol{f} }\bigl([{\boldsymbol{a} }_1, \ldots, {\boldsymbol{a} }_m]\bigr), $$
(10.39)

whose left-hand side is an element of the space M ∗∗, which is isomorphic to M.

Let us recall that the identification (isomorphism) of the spaces M ∗∗ and M can be obtained by mapping each vector ψ∈M ∗∗ to the vector x ∈ M for which the equality f(x)=ψ(f) is satisfied for every linear function f ∈ M ∗. Then formula (10.39) gives the relationship

$$ {\boldsymbol{f} }\bigl(\mathcal{F}({\boldsymbol{a} }_1 \wedge\cdots\wedge {\boldsymbol{a} }_m)\bigr) = {\boldsymbol{f} }\bigl([{\boldsymbol{a} }_1, \ldots, {\boldsymbol{a} }_m]\bigr), $$

which is valid for every function f ∈ M ∗. Consequently, from this we obtain the required relationship

$$ \mathcal{F}({\boldsymbol{a} }_1 \wedge\cdots\wedge {\boldsymbol{a} }_m) = [{\boldsymbol{a} }_1, \ldots, {\boldsymbol{a} }_m]. $$
(10.40)

Equality (10.40) defines the linear transformation \(\mathcal{F}\) on all decomposable vectors x ∈ Λ m(L). But above, we saw that every m-vector is a linear combination of decomposable vectors. The transformation \(\mathcal{F}\) is linear, and therefore, it is uniquely defined for all m-vectors. Thus we obtain the required linear transformation \(\mathcal{F}: \varLambda ^m({\mathsf{L}}) \to {\mathsf{M}}\). □

10.4 Exterior Algebras*

In many branches of mathematics, an important role is played by the expression

$${\boldsymbol{a} }_1 \wedge\cdots\wedge {\boldsymbol{a} }_m, $$

understood not so much as a function of m vectors a 1,…,a m of the space L with values in Λ m(L), but more as the result of repeated (m-fold) application of the operation consisting in mapping two vectors x ∈ Λ p(L) and y ∈ Λ q(L) to the vector x∧y ∈ Λ p+q(L). For example, the expression a∧b∧c can then be calculated “by parts.” That is, it can be represented in the form a∧b∧c=(a∧b)∧c and computed by first calculating a∧b, and then (a∧b)∧c.

To accomplish this, we have first to define the function mapping two vectors x ∈ Λ p(L) and y ∈ Λ q(L) to the vector x∧y ∈ Λ p+q(L). As a first step, such a function x∧y will be defined for the case that the vector y ∈ Λ q(L) is decomposable, that is, representable in the form

$$ {\boldsymbol{y} }= {\boldsymbol{a} }_1 \wedge {\boldsymbol{a} }_2 \wedge\cdots\wedge {\boldsymbol{a} }_q,\quad {\boldsymbol{a} }_i \in {\mathsf{L}}. $$
(10.41)

Let us consider the mapping that assigns to p vectors b 1,…,b p of the space L the vector

$$[{\boldsymbol{b} }_1, \ldots, {\boldsymbol{b} }_p] = {\boldsymbol{b} }_1 \wedge\cdots \wedge {\boldsymbol{b} }_p \wedge {\boldsymbol{a} }_1 \wedge\cdots\wedge {\boldsymbol{a} }_q, $$

and let us apply to it Property 10.18 (universality) from the previous section. We thereby obtain the diagram

$$ \begin{array}{ccc} {\mathsf{L}}^p & \stackrel{\wedge}{\longrightarrow} & \varLambda ^p({\mathsf{L}}) \\ & \searrow & \downarrow \mathcal{F} \\ & & \varLambda ^{p+q}({\mathsf{L}}) \end{array} $$
(10.42)

In this diagram, \(\mathcal{F}: \varLambda ^p({\mathsf{L}}) \to \varLambda ^{p+q}({\mathsf{L}})\) is the uniquely determined linear transformation for which \(\mathcal{F}({\boldsymbol{b} }_1 \wedge\cdots\wedge {\boldsymbol{b} }_p) = [{\boldsymbol{b} }_1, \ldots, {\boldsymbol{b} }_p]\).

Definition 10.19

Let y be a decomposable vector, that is, it can be written in the form (10.41). Then for every vector x ∈ Λ p(L), its image \(\mathcal{F}({\boldsymbol{x} })\) under the transformation \(\mathcal{F}\) constructed above is denoted by x∧y=x∧(a 1∧⋯∧a q ) and is called the exterior product of vectors x and y.

Thus as a first step, we defined x∧y in the case that the vector y is decomposable. In order to define x∧y for an arbitrary vector y ∈ Λ q(L), it suffices simply to repeat the same argument. Indeed, for a fixed vector x ∈ Λ p(L), let us consider the mapping that assigns to q vectors a 1,…,a q of the space L the vector in Λ p+q(L) defined by the formula

$$[{\boldsymbol{a} }_1, \ldots, {\boldsymbol{a} }_q] = {\boldsymbol{x} }\wedge({\boldsymbol{a} }_1 \wedge \cdots\wedge {\boldsymbol{a} }_q). $$

We again obtain, on the basis of Property 10.18, the same diagram:

$$ \begin{array}{ccc} {\mathsf{L}}^q & \stackrel{\wedge}{\longrightarrow} & \varLambda ^q({\mathsf{L}}) \\ & \searrow & \downarrow \mathcal{G} \\ & & \varLambda ^{p+q}({\mathsf{L}}) \end{array} $$
(10.43)

where the transformation \(\mathcal{G}: \varLambda ^q({\mathsf{L}}) \to \varLambda ^{p+q}({\mathsf{L}})\) is defined by the formula \(\mathcal{G}({\boldsymbol{a} }_1 \wedge\cdots\wedge {\boldsymbol{a} }_q) = {\boldsymbol{x} }\wedge({\boldsymbol{a} }_1 \wedge \cdots\wedge {\boldsymbol{a} }_q)\).

Definition 10.20

For any vectors x ∈ Λ p(L) and y ∈ Λ q(L), the exterior product x∧y is the vector \(\mathcal{G}({\boldsymbol{y} })\), where \(\mathcal{G}\) is the transformation in diagram (10.43) constructed above.

Let us note some properties of the exterior product that follow from this definition.

Property 10.21

For any vectors x 1 ,x 2 ∈ Λ p(L) and y ∈ Λ q(L), we have the relationship

$$({\boldsymbol{x} }_1 + {\boldsymbol{x} }_2) \wedge {\boldsymbol{y} }= {\boldsymbol{x} }_1 \wedge {\boldsymbol{y} }+ {\boldsymbol{x} }_2 \wedge {\boldsymbol{y} }. $$

Similarly, for any vectors x ∈ Λ p(L) and y ∈ Λ q(L) and any scalar α, we have the relationship

$$(\alpha {\boldsymbol{x} }) \wedge {\boldsymbol{y} }= \alpha ({\boldsymbol{x} }\wedge {\boldsymbol{y} }). $$

Both equalities follow immediately from the definitions and the linearity of the transformation in diagram (10.43).

Property 10.22

For any vectors x ∈ Λ p(L) and y 1 ,y 2 ∈ Λ q(L), we have the relationship

$${\boldsymbol{x} }\wedge({\boldsymbol{y} }_1 + {\boldsymbol{y} }_2) = {\boldsymbol{x} }\wedge {\boldsymbol{y} }_1 + {\boldsymbol{x} }\wedge {\boldsymbol{y} }_2. $$

Similarly, for any vectors x ∈ Λ p(L) and y ∈ Λ q(L) and any scalar α, we have the relationship

$${\boldsymbol{x} }\wedge(\alpha {\boldsymbol{y} }) = \alpha ({\boldsymbol{x} }\wedge {\boldsymbol{y} }). $$

Both equalities follow immediately from the definitions and the linearity of the transformations in diagrams (10.42) and (10.43).

Property 10.23

For decomposable vectors x=a 1∧⋯∧a p and y=b 1∧⋯∧b q , we have the relationship

$${\boldsymbol{x} }\wedge {\boldsymbol{y} }= {\boldsymbol{a} }_1 \wedge\cdots\wedge {\boldsymbol{a} }_p \wedge {\boldsymbol{b} }_1 \wedge\cdots\wedge {\boldsymbol{b} }_q. $$

This follows at once from the definition.

Let us note that we have actually defined the exterior product in such a way that Properties 10.21–10.23 are satisfied. Indeed, Property 10.23 defines the exterior product of decomposable vectors. And since every vector is a linear combination of decomposable vectors, it follows that Properties 10.21 and 10.22 define it in the general case. The property of universality of the exterior product has been necessary for verifying that the result xy does not depend on the choice of linear combinations of decomposable vectors that we use to represent the vectors x and y.

Finally, let us make note of the following equally simple property.

Property 10.24

For any vectors x ∈ Λ p(L) and y ∈ Λ q(L), we have the relationship

$$ {\boldsymbol{x} }\wedge {\boldsymbol{y} }= (-1)^{pq} {\boldsymbol{y} }\wedge {\boldsymbol{x} }. $$
(10.44)

Both vectors on the right- and left-hand sides of equality (10.44) belong to the space Λ p+q(L), that is, by definition, they are linear functions on Ω p+q(L). Since every vector is a linear combination of decomposable vectors, it suffices that we verify equality (10.44) for decomposable vectors.

Let x=a 1∧⋯∧a p , y=b 1∧⋯∧b q , and let F be any vector of the space Ω p+q(L), that is, F is an antisymmetric function of the vectors x 1,…,x p+q in L. Then equality (10.44) means that

$$ F({\boldsymbol{a} }_1, \ldots, {\boldsymbol{a} }_p, {\boldsymbol{b} }_1, \ldots, {\boldsymbol{b} }_q) = (-1)^{pq} F({\boldsymbol{b} }_1, \ldots, {\boldsymbol{b} }_q, {\boldsymbol{a} }_1, \ldots, {\boldsymbol{a} }_p). $$
(10.45)

But equality (10.45) is an obvious consequence of the antisymmetry of the function F. Indeed, in order to place the vector b 1 in the first position on the left-hand side of (10.45), we must change the position of b 1 with each vector a 1,…,a p in turn. One such transposition reverses the sign, and altogether, the transpositions multiply F by (−1)p. Similarly, in order to place the vector b 2 in the second position on the left-hand side of (10.45), we also must execute p transpositions, and the value of F is again multiplied by (−1)p. And in order to place all vectors b 1,…,b q at the beginning, it is necessary to multiply F by (−1)p a total of q times, which yields precisely (10.45).
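In coordinates, Property 10.24 for decomposable vectors says that moving the block of rows b 1,…,b q past the block a 1,…,a p in the matrix of (10.22) costs pq transpositions of rows. A numerical check (ours, reusing the illustrative function pluecker_coordinates; here p = 3 and q = 1, so that (−1)^{pq} = −1):

import numpy as np

rng = np.random.default_rng(2)
a = rng.standard_normal((3, 5))   # x = a1 ^ a2 ^ a3
b = rng.standard_normal((1, 5))   # y = b1
xy = pluecker_coordinates(np.vstack([a, b]))
yx = pluecker_coordinates(np.vstack([b, a]))
assert all(np.isclose(xy[I], -yx[I]) for I in xy)   # x ^ y = -(y ^ x)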

Our next step consists in uniting all the sets Λ p(L) into a single set Λ(L) and defining the exterior product for its elements. Here we encounter a special case of a very important algebraic notion, that of an algebra.

Definition 10.25

An algebra (over some field \({\mathbb{K}}\), which we shall consider to consist of numbers) is a vector space A on which, besides the operations of addition of vectors and multiplication of a vector by a scalar, is also defined the operation A×A → A, called the product, assigning to every pair of elements a,b ∈ A the element ab ∈ A and satisfying the following conditions:

  1. the distributive property: for all a,b,c ∈ A, we have the relationship

    $$ ({\boldsymbol{a} }+{\boldsymbol{b} }) {\boldsymbol{c} }= {\boldsymbol{a} }{\boldsymbol{c} }+ {\boldsymbol{b} }{\boldsymbol{c} },\qquad {\boldsymbol{c} }({\boldsymbol{a} }+{\boldsymbol{b} }) = {\boldsymbol{c} }{\boldsymbol{a} }+ {\boldsymbol{c} }{\boldsymbol{b} }; $$
    (10.46)

  2. for all a,b ∈ A and every scalar \(\alpha \in {\mathbb{K}}\), we have the relationship

    $$ (\alpha {\boldsymbol{a} }) {\boldsymbol{b} }= {\boldsymbol{a} }(\alpha {\boldsymbol{b} }) = \alpha ({\boldsymbol{a} }{\boldsymbol{b} }); $$
    (10.47)

  3. there exists an element e ∈ A, called the identity, such that for every a ∈ A, we have ea=a and ae=a.

Let us note that there can be only one identity element in an algebra. Indeed, if there existed another identity element e′, then by definition, we would have the equalities ee′=e′ and ee′=e, from which it follows that e=e′.

As in any vector space, in an algebra we have, for every a ∈ A, the equality 0⋅a=0 (here the 0 on the left denotes the scalar zero in the field \({\mathbb{K}}\), while the 0 on the right denotes the null element of the vector space A that is an algebra).

If an algebra A is finite-dimensional as a vector space and e 1,…,e n is a basis of A, then the elements e 1,…,e n are said to form a basis of the algebra A, where the number n is called its dimension and is denoted by dimA=n. For an algebra A of finite dimension n, the product of two of its basis elements can be represented in the form

$$ {\boldsymbol{e} }_i {\boldsymbol{e} }_j = \sum_{k=1}^{n} \alpha _{ij}^k {\boldsymbol{e} }_k,\quad i,j = 1, \ldots, n, $$
(10.48)

where \(\alpha _{ij}^{k} \in {\mathbb{K}}\) are certain scalars.

The totality of all scalars \(\alpha _{ij}^{k}\) for all i,j,k=1,…,n is called the multiplication table of the algebra A, and it uniquely determines the product for all the elements of the algebra. Indeed, if x=λ 1 e 1+⋯+λ n e n and y=μ 1 e 1+⋯+μ n e n , then repeatedly applying the rules (10.46) and (10.47) and taking into account (10.48), we obtain

$$ {\boldsymbol{x} }{\boldsymbol{y} }= \sum_{i,j,k=1}^{n} \lambda_i \mu_j \alpha _{ij}^k {\boldsymbol{e} }_k, $$
(10.49)

that is, the product xy is uniquely determined by the coordinates of the vectors x,y and the multiplication table of the algebra A. And conversely, it is obvious that for any given multiplication table, formula (10.49) defines in an n-dimensional vector space an operation of multiplication satisfying all the requirements entering into the definition of an algebra, except, perhaps, property 3, which requires further consideration; that is, it converts this vector space into an algebra of the same dimension n.
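Formula (10.49) is straightforward to implement. A minimal sketch (ours, assuming NumPy; the names are illustrative), with the multiplication table stored as an n×n×n array of structure constants and tried on the two-dimensional algebra \({\mathbb{K}}[t]/(t^2)\) with basis e 1=1, e 2=t:

import numpy as np

def algebra_product(alpha, x, y):
    """(xy)_k = sum over i, j of x_i y_j alpha_{ij}^k, i.e. formula (10.49)."""
    return np.einsum('i,j,ijk->k', x, y, alpha)

alpha = np.zeros((2, 2, 2))
alpha[0, 0, 0] = 1.0   # e1 e1 = e1
alpha[0, 1, 1] = 1.0   # e1 e2 = e2
alpha[1, 0, 1] = 1.0   # e2 e1 = e2   (e2 e2 = 0, since t^2 = 0)
x = np.array([1.0, 2.0])              # the element 1 + 2t
y = np.array([3.0, 4.0])              # the element 3 + 4t
print(algebra_product(alpha, x, y))   # [ 3. 10.], i.e. (1+2t)(3+4t) = 3+10t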

Definition 10.26

An algebra A is said to be associative if for every collection of three elements a, b, and c, we have the relationship

$$ ({\boldsymbol{a} }{\boldsymbol{b} }) {\boldsymbol{c} }= {\boldsymbol{a} }({\boldsymbol{b} }{\boldsymbol{c} }). $$
(10.50)

The associative property makes it possible to calculate the product of any number of elements a 1,…,a m of an algebra A without indicating the arrangement of parentheses among them; see the discussion on p. xv. Clearly, it suffices to verify the associative property of a finite-dimensional algebra for elements of some basis.

We have already encountered some examples of algebras.

Example 10.27

The algebra of all square matrices of order n. It has the finite dimension \(n^2\), and as we saw in Sect. 2.9, it is associative.

Example 10.28

The algebra of all polynomials in n>0 variables with numeric coefficients. This algebra is also associative, but its dimension is infinite.

Now we shall define for a vector space L of finite dimension n its exterior algebra Λ(L). This algebra has many different applications (some of them will be discussed in the following section); its introduction is one more reason why in Sect. 10.3, we did not limit our consideration to decomposable vectors only, which were sufficient for describing vector subspaces.

Let us define the exterior algebra Λ(L) as the direct sum of those spaces Λ p(L), p≥0, that contain more than just the null vector, where Λ 0(L) is by definition equal to \({\mathbb{K}}\). Since by the antisymmetry of the exterior product we have Λ p(L)=(0) for all p>n, we obtain the following definition of the exterior algebra:

$$ \varLambda ({\mathsf{L}}) = \varLambda ^0 ({\mathsf{L}}) \oplus \varLambda ^1 ({\mathsf{L}}) \oplus\cdots\oplus \varLambda ^n ({\mathsf{L}}). $$
(10.51)

Thus every element u of the constructed vector space Λ(L) can be represented in the form u=u 0+u 1+⋯+u n , where u i Λ i(L).

Our present goal is the definition of the exterior product in Λ(L), which we denote by \({\boldsymbol{u}} \wedge {\boldsymbol{v}}\) for arbitrary vectors u,v∈Λ(L). We shall define the exterior product u∧v of vectors

$${\boldsymbol{u} }= {\boldsymbol{u} }_0 + {\boldsymbol{u} }_1 + \cdots+ {\boldsymbol{u} }_n,\qquad { \boldsymbol{v} }= {\boldsymbol{v} }_0 + {\boldsymbol{v} }_1 + \cdots+ {\boldsymbol{v} }_n,\quad {\boldsymbol{u} }_i, {\boldsymbol{v} }_i \in \varLambda ^i({\mathsf{L}}), $$

as the element

$${\boldsymbol{u} }\wedge{\boldsymbol{v} }= \sum_{i,j=0}^n {\boldsymbol{u} }_i \wedge{\boldsymbol{v} }_j, $$

where we use the fact that the exterior product \({\boldsymbol{u}}_i \wedge {\boldsymbol{v}}_j\) is already defined as an element of the space Λ i+j(L). Thus

$${\boldsymbol{u} }\wedge{\boldsymbol{v} }= {\boldsymbol{w} }_0 + {\boldsymbol{w} }_1 + \cdots+ {\boldsymbol{w} }_n, \quad\mbox{where } {\boldsymbol{w} }_k = \sum _{i+j=k} {\boldsymbol{u} }_i \wedge{\boldsymbol{v} }_j, {\boldsymbol{w} }_k \in \varLambda ^k({\mathsf{L}}). $$

A simple verification shows that for the exterior product thus defined, all the conditions in the definition of an algebra are satisfied. This follows at once from the properties of the exterior product \({\boldsymbol{x}} \wedge {\boldsymbol{y}}\) of vectors x∈Λ i(L) and y∈Λ j(L) proved earlier. By definition, \(\varLambda ^{0} ({\mathsf{L}}) = {\mathbb{K}}\), and the number 1 (the identity in the field \({\mathbb{K}}\)) is the identity element of the exterior algebra Λ(L).

Definition 10.29

A finite-dimensional algebra A is called a graded algebra if there is given a decomposition of the vector space A into a direct sum of subspaces A i A,

$$ {\mathsf{A}}= {\mathsf{A}}_0 \oplus {\mathsf{A}}_1 \oplus\cdots\oplus {\mathsf{A}}_k, $$
(10.52)

and the following conditions are satisfied: for all vectors xA i and yA j , the product xy is in A i+j if i+jk, and xy=0 if i+j>k. Here the decomposition (10.52) is called a grading.

In this case, dimA=dimA 0+⋯+dimA k , and taking the union of the bases of the subspaces A i , we obtain a basis of the space A. The decomposition (10.51) and the definition of the exterior product show that the exterior algebra Λ(L) is graded if the space L has finite dimension n. Since Λ p(L)=(0) for all p>n, it follows that

$$\dim \varLambda ({\mathsf{L}}) = \sum_{p=0}^n \dim \varLambda ^p({\mathsf{L}}) = \sum_{p=0}^n \mathrm{C}_n^p = 2^n. $$

In an arbitrary graded algebra A with grading (10.52), the elements of the subspace A i are called homogeneous elements of degree i, and for every uA i , we write i=degu. One often encounters graded algebras of infinite dimension, and in this case, the grading (10.52) contains, in general, not a finite, but an infinite number of terms. For example, in the algebra of polynomials (Example 10.28), a grading is defined by the decomposition of a polynomial into homogeneous components.

Property (10.44) of the exterior product that we have proved shows that in an exterior algebra Λ(L), we have for all homogeneous elements u and v the relationship

$$ {\boldsymbol{u} }\wedge{\boldsymbol{v} }= (-1)^{d} {\boldsymbol{v} }\wedge {\boldsymbol{u} }, \quad\mbox{where } d= \deg {\boldsymbol{u} }\deg{\boldsymbol{v} }. $$
(10.53)

Let us prove that for every finite-dimensional vector space L, the exterior algebra Λ(L) is associative. As we noted above, it suffices to prove the associative property for the elements of some basis of the algebra. Such a basis can be constructed out of homogeneous elements, and we may even choose them to be decomposable. Thus we may suppose that the elements a,b,c∈Λ(L) are equal to

$${\boldsymbol{a} }= {\boldsymbol{a} }_1 \wedge\cdots\wedge {\boldsymbol{a} }_p,\qquad {\boldsymbol{b} }= {\boldsymbol{b} }_1 \wedge\cdots\wedge {\boldsymbol{b} }_q,\qquad {\boldsymbol{c} }= {\boldsymbol{c} }_1 \wedge\cdots\wedge {\boldsymbol{c} }_r, $$

and in this case, using the properties proved above, we obtain

$${\boldsymbol{a} }\wedge({\boldsymbol{b} }\wedge {\boldsymbol{c} }) = {\boldsymbol{a} }_1 \wedge\cdots\wedge {\boldsymbol{a} }_p \wedge {\boldsymbol{b} }_1 \wedge\cdots\wedge {\boldsymbol{b} }_q \wedge {\boldsymbol{c} }_1 \wedge\cdots\wedge {\boldsymbol{c} }_r = ({\boldsymbol{a} }\wedge {\boldsymbol{b} }) \wedge {\boldsymbol{c} }. $$

An associative graded algebra that satisfies relationship (10.53) for all pairs of homogeneous elements is called a superalgebra. Thus an exterior algebra Λ(L) of an arbitrary finite-dimensional vector space L is a superalgebra, and it is the most important example of this concept.

Let us now return to the exterior algebra Λ(L) of the finite-dimensional vector space L. Let us choose in it a convenient basis and determine its multiplication table.

Let us fix in the space L an arbitrary basis e 1,…,e n . Since the elements \(\varphi _{{\boldsymbol{I} }} = {\boldsymbol{e} }_{i_{1}} \wedge\cdots\wedge {\boldsymbol{e} }_{i_{m}}\) for all possible collections I=(i 1,…,i m ) in \(\overrightarrow{{\mathbb{N}}}_{n}^{m}\) form a basis of the space Λ m(L), m>0, it follows from the decomposition (10.51) that a basis of Λ(L) is obtained as the union of the bases of the subspaces Λ m(L) for all m=1,…,n and the basis of the subspace \(\varLambda ^{0}({\mathsf{L}}) = {\mathbb{K}}\), consisting of a single nonnull scalar, for example 1. This means that all the elements φ I , \({\boldsymbol{I} }\in \overrightarrow{{\mathbb{N}}}_{n}^{m}\), m=1,…,n, together with 1 form a basis of the exterior algebra Λ(L). Since the exterior product with 1 is trivial, it follows that in order to compose a multiplication table in the constructed basis, we must find the exterior product \(\varphi _{{\boldsymbol{I} }} \wedge \varphi _{{\boldsymbol{J}}}\) for all possible collections of indices \({\boldsymbol{I} }\in \overrightarrow{{\mathbb{N}}}_{n}^{p}\) and \({\boldsymbol{J}}\in \overrightarrow{{\mathbb{N}}}_{n}^{q}\) for all 1≤p,q≤n.

In view of Property 10.23 on page 369, the exterior product \(\varphi _{{\boldsymbol{I} }} \wedge \varphi _{{\boldsymbol{J}}}\) is equal to

$$ \varphi _{{\boldsymbol{I} }} \wedge \varphi _{{\boldsymbol{J}}} = {\boldsymbol{e} }_{i_1} \wedge \cdots\wedge {\boldsymbol{e} }_{i_p} \wedge {\boldsymbol{e} }_{j_1} \wedge\cdots\wedge {\boldsymbol{e} }_{j_q}. $$
(10.54)

Here there are two possibilities. If the collections I and J contain at least one index in common, then by Corollary 10.14 (p. 363), the product (10.54) is equal to zero.

If, on the other hand, I∩J=∅, then we shall denote by K the collection in \(\overrightarrow{{\mathbb{N}}}_{n}^{p+q}\) comprising the indices belonging to the set I∪J; in other words, K is obtained by arranging the collection (i 1,…,i p ,j 1,…,j q ) in ascending order. Then, as is easily verified, the exterior product (10.54) differs from the basis element φ K , \({\boldsymbol{K}}\in \overrightarrow{{\mathbb{N}}}_{n}^{p+q}\), of the exterior algebra Λ(L) constructed above only in that its indices, the indices of the collection I∪J, are not necessarily arranged in ascending order. In order to obtain from (10.54) the element φ K , it is necessary to interchange the indices (i 1,…,i p ,j 1,…,j q ) in such a way that the resulting collection is increasing. Then by Theorems 2.23 and 2.25 from Sect. 2.6 and Property 10.13, according to which the exterior product changes sign under the transposition of any two vectors, we obtain that

$$\varphi _{{\boldsymbol{I} }} \wedge \varphi _{{\boldsymbol{J}}} = \varepsilon ({\boldsymbol{I} },{\boldsymbol{J}}) \varphi _{{\boldsymbol{K}}},\quad {\boldsymbol{K}}\in \overrightarrow{{\mathbb{N}}}_n^{p+q}, $$

where the number ε(I,J) is equal to +1 or −1 depending on whether the number of transpositions necessary for passing from (i 1,…,i p ,j 1,…,j q ) to the collection \({\boldsymbol{K}}\in \overrightarrow{{\mathbb{N}}}_{n}^{p+q}\) is even or odd.

As a result, we see that in the constructed basis of the exterior algebra Λ(L), the multiplication table assumes the following form:

$$ \varphi _{{\boldsymbol{I} }} \wedge \varphi _{{\boldsymbol{J}}} = \begin{cases} {\boldsymbol{0} },& \mbox{if } {\boldsymbol{I} }\cap {\boldsymbol{J}}\neq\varnothing, \\ \varepsilon ({\boldsymbol{I} },{\boldsymbol{J}}) \varphi _{{\boldsymbol{K}}}, & \mbox{if } {\boldsymbol{I} }\cap {\boldsymbol{J}}= \varnothing. \end{cases} $$
(10.55)
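
The table (10.55) is easy to realize computationally. In the Python sketch below (our own illustration; the indexing conventions, with basis elements labeled by increasing tuples, mirror the text), the sign ε(I,J) is computed as the parity of the number of inversions in the concatenated collection; the final lines also confirm the count \(2^n\) of basis elements and the supercommutativity rule (10.53).

```python
# The multiplication table (10.55) of the exterior algebra: basis elements
# phi_I are labeled by increasing tuples of indices, and the sign eps(I, J)
# is the parity of the number of inversions in the concatenation of I and J.
from itertools import combinations

def eps(I, J):
    seq = I + J
    inversions = sum(1 for a in range(len(seq)) for b in range(a + 1, len(seq))
                     if seq[a] > seq[b])
    return (-1) ** inversions

def wedge_basis(I, J):
    """phi_I wedge phi_J: a pair (sign, K), or (0, None) if the product is 0."""
    if set(I) & set(J):
        return 0, None                     # a common index kills the product
    return eps(I, J), tuple(sorted(I + J))

n = 4
basis = [I for m in range(n + 1) for I in combinations(range(1, n + 1), m)]
assert len(basis) == 2 ** n                # dim Lambda(L) = 2^n

# e_1 ^ e_3 wedged with e_2: one inversion, hence the sign -1.
assert wedge_basis((1, 3), (2,)) == (-1, (1, 2, 3))

# Supercommutativity (10.53): phi_I ^ phi_J = (-1)^{pq} phi_J ^ phi_I.
I, J = (1, 4), (2, 3)
sIJ, K = wedge_basis(I, J)
sJI, _ = wedge_basis(J, I)
assert sIJ == (-1) ** (len(I) * len(J)) * sJI
```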

10.5 Appendix*

The exterior product \({\boldsymbol{x}} \wedge {\boldsymbol{y}}\) of vectors x∈Λ p(L) and y∈Λ q(L) defined in the previous section makes it possible in many cases to give simple proofs of assertions that we encountered earlier.

Example 10.30

Let us consider the case p=n, using the notation and results of the previous section. As we have seen, \(\dim \varLambda ^{p}({\mathsf{L}}) = \mathrm{C}^{p}_{n}\), and therefore, the space Λ n(L) is one-dimensional, and each of its nonzero vectors constitutes a basis. If e is such a vector, then an arbitrary vector of the space Λ n(L) can be written in the form α e with a suitable scalar α. Thus for any n vectors x 1,…,x n of the space L, we obtain the relationship

$$ {\boldsymbol{x} }_1 \wedge\cdots\wedge {\boldsymbol{x} }_n = \alpha ({\boldsymbol{x} }_1, \ldots, {\boldsymbol{x} }_n) {\boldsymbol{e} }, $$
(10.56)

where α(x 1,…,x n ) is some function of n vectors taking numeric values from the field \({\mathbb{K}}\). By Properties 10.11, 10.12, and 10.13, this function is multilinear and antisymmetric.

Let us choose in the space L some basis e 1,…,e n and set

$${\boldsymbol{x} }_i = x_{i 1}{\boldsymbol{e} }_1 + \cdots+ x_{i n} {\boldsymbol{e} }_n,\quad i=1, \ldots, n. $$

The choice of a basis defines an isomorphism of the space L and the space \({\mathbb{K}}^{n}\) of rows of length n, in which the vector x i corresponds to the row (x i1,…,x in ). Thus α becomes a multilinear and antisymmetric function of n rows taking numeric values. By Theorem 2.15, the function α(x 1,…,x n ) coincides up to a scalar multiple k(e) with the determinant of the square matrix of order n consisting of the coordinates x ij of the vectors x 1,…,x n :

$$ \alpha ({\boldsymbol{x}}_1, \ldots, {\boldsymbol{x}}_n) = k({\boldsymbol{e}}) \begin{vmatrix} x_{11} & x_{12} & \cdots & x_{1n} \\ x_{21} & x_{22} & \cdots & x_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{nn} \end{vmatrix}. $$
(10.57)

The arbitrariness of the choice of coefficient k(e) in formula (10.57) corresponds to the arbitrariness of the choice of basis e in the one-dimensional space Λ n(L) (let us recall that the basis e 1,…,e n of the space L is fixed).

In particular, let us choose as basis of the space Λ n(L) the vector

$$ {\boldsymbol{e} }= {\boldsymbol{e} }_1 \wedge\cdots\wedge {\boldsymbol{e} }_n. $$
(10.58)

The vectors e 1,…,e n are linearly independent. Therefore, by Property 10.15 (p. 363), the vector e is nonnull. Let us show that then k(e)=1. Indeed, since the coefficient k(e) in formula (10.57) is one and the same for all collections of vectors x 1,…,x n , we can calculate it by setting x i =e i , i=1,…,n. Comparing in this case formulas (10.56) and (10.58), we see that α(e 1,…,e n )=1. Substituting this value into relationship (10.57) for x i =e i , i=1,…,n, and noting that the determinant on the right-hand side of (10.57) is then the determinant of the identity matrix, that is, equal to 1, we conclude that k(e)=1.
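
Expanding x 1∧⋯∧x n by multilinearity makes this explicit: only the terms in which all n column indices are distinct survive, and each surviving term contributes the sign of the corresponding permutation, so that α(x 1,…,x n ) is exactly the Leibniz sum for the determinant. A short Python sketch (our own illustration; the random test matrix is arbitrary) computes α this way and compares it with the determinant:

```python
# Expanding x_1 ^ ... ^ x_n by multilinearity leaves only the terms in which
# all n column indices are distinct; each contributes the sign of the
# corresponding permutation, so alpha(x_1, ..., x_n) is the Leibniz sum,
# i.e., the determinant of the coordinate matrix (k(e) = 1).
from itertools import permutations
import numpy as np

def perm_sign(p):
    inversions = sum(1 for a in range(len(p)) for b in range(a + 1, len(p))
                     if p[a] > p[b])
    return (-1) ** inversions

def alpha(X):
    """Coefficient of e_1 ^ ... ^ e_n in x_1 ^ ... ^ x_n (rows of X)."""
    n = X.shape[0]
    return sum(perm_sign(p) * np.prod([X[i, p[i]] for i in range(n)])
               for p in permutations(range(n)))

X = np.random.default_rng(1).standard_normal((4, 4))
assert np.isclose(alpha(X), np.linalg.det(X))
```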

Using definitions given earlier, we may associate with the linear transformation \(\mathcal{A}: {\mathsf{L}} \to {\mathsf{L}}\) the linear transformation \(\varLambda^n(\mathcal{A}): \varLambda^n({\mathsf{L}}) \to \varLambda^n({\mathsf{L}})\). The transformation \(\mathcal{A}\) can be defined by indicating to which vectors x 1,…,x n it takes the basis e 1,…,e n of the space L, that is, by specifying the vectors \({\boldsymbol{x}}_i = \mathcal{A}({\boldsymbol{e}}_i)\), i=1,…,n. By Lemma 10.17 (p. 365), we have the equality

$$ \varLambda^n(\mathcal{A}) ({\boldsymbol{e}}_1 \wedge\cdots\wedge {\boldsymbol{e}}_n) = \mathcal{A}({\boldsymbol{e}}_1) \wedge\cdots\wedge \mathcal{A}({\boldsymbol{e}}_n) = {\boldsymbol{x}}_1 \wedge\cdots\wedge {\boldsymbol{x}}_n. $$
(10.59)

On the other hand, as we know, all linear transformations of a one-dimensional space have the form x↦α x, where α is some scalar equal to the determinant of the given transformation and independent of the choice of basis e in Λ n(L). Thus we obtain that \(\varLambda^n(\mathcal{A})({\boldsymbol{u}}) = \alpha {\boldsymbol{u}}\) for all u∈Λ n(L), where the scalar α is equal to the determinant \(|\varLambda^n(\mathcal{A})|\) and clearly depends only on the transformation \(\mathcal{A}\) itself, that is, it is determined by the collection of vectors \({\boldsymbol{x}}_i = \mathcal{A}({\boldsymbol{e}}_i)\), i=1,…,n. It is not difficult to see that this scalar α coincides with the function α(x 1,…,x n ) defined above. Indeed, let us choose in the space Λ n(L) the basis e=e 1∧⋯∧e n . Then the required equality follows directly from formula (10.59).

Further, substituting into (10.59) expression (10.57) for α(x 1,…,x n ), taking into account that k(e)=1 and that the determinant on the right-hand side of (10.57) coincides with the determinant of the transformation \(\mathcal{A}\), we obtain the following result:

$$ \mathcal{A}({\boldsymbol{e}}_1) \wedge\cdots\wedge \mathcal{A}({\boldsymbol{e}}_n) = |\mathcal{A}| \, ({\boldsymbol{e}}_1 \wedge\cdots\wedge {\boldsymbol{e}}_n). $$
(10.60)

This relationship gives the most invariant definition of the determinant of a linear transformation among all those that we have encountered.

We obtained relationship (10.60) for an arbitrary basis e 1,…,e n of the space L, that is, for any n linearly independent vectors of the space. But it is also true for any n linearly dependent vectors a 1,…,a n of this space. Indeed, in this case, the vectors \(\mathcal{A}({\boldsymbol{a}}_1), \ldots, \mathcal{A}({\boldsymbol{a}}_n)\) are clearly also linearly dependent, and by Property 10.15, both exterior products a 1∧⋯∧a n and \(\mathcal{A}({\boldsymbol{a}}_1) \wedge\cdots\wedge \mathcal{A}({\boldsymbol{a}}_n)\) are equal to zero. Thus for any n vectors a 1,…,a n of the space L and any linear transformation \(\mathcal{A}: {\mathsf{L}} \to {\mathsf{L}}\), we have the relationship

$$ \mathcal{A}({\boldsymbol{a}}_1) \wedge\cdots\wedge \mathcal{A}({\boldsymbol{a}}_n) = |\mathcal{A}| \, ({\boldsymbol{a}}_1 \wedge\cdots\wedge {\boldsymbol{a}}_n). $$
(10.61)

In particular, if \(\mathcal{B}: {\mathsf{L}} \to {\mathsf{L}}\) is some other linear transformation, then formula (10.60) for the transformation \(\mathcal{B}\mathcal{A}\) gives the analogous equality

$$ (\mathcal{B}\mathcal{A}) ({\boldsymbol{e}}_1) \wedge\cdots\wedge (\mathcal{B}\mathcal{A}) ({\boldsymbol{e}}_n) = |\mathcal{B}\mathcal{A}| \, ({\boldsymbol{e}}_1 \wedge\cdots\wedge {\boldsymbol{e}}_n). $$

On the other hand, applying formula (10.61) to the transformation \(\mathcal{B}\) and the vectors \(\mathcal{A}({\boldsymbol{e}}_1), \ldots, \mathcal{A}({\boldsymbol{e}}_n)\), and then formula (10.60) to the transformation \(\mathcal{A}\), we obtain that

$$ \mathcal{B}\bigl(\mathcal{A}({\boldsymbol{e}}_1)\bigr) \wedge\cdots\wedge \mathcal{B}\bigl(\mathcal{A}({\boldsymbol{e}}_n)\bigr) = |\mathcal{B}| \, \bigl(\mathcal{A}({\boldsymbol{e}}_1) \wedge\cdots\wedge \mathcal{A}({\boldsymbol{e}}_n)\bigr) = |\mathcal{B}| |\mathcal{A}| \, ({\boldsymbol{e}}_1 \wedge\cdots\wedge {\boldsymbol{e}}_n). $$

Hence it follows that \(|\mathcal{B}\mathcal{A}| = |\mathcal{B}| |\mathcal{A}|\). This is almost a “tautological” proof of Theorem 2.54 on the determinant of the product of square matrices.

The arguments that we have presented acquire a more concrete character if L is an oriented Euclidean space. Then as the basis e 1,…,e n in L we may choose an orthonormal and positively oriented basis. In this case, the basis (10.58) in Λ n(L) is uniquely defined, that is, it does not depend on the choice of basis e 1,…,e n . Indeed, if \({\boldsymbol{e} }'_{1}, \ldots, {\boldsymbol{e} }'_{n}\) is another such basis in L, then as we know, there exists a linear transformation \(\mathcal{C}: {\mathsf{L}} \to {\mathsf{L}}\) such that \({\boldsymbol{e}}'_i = \mathcal{C}({\boldsymbol{e}}_i)\), i=1,…,n, and furthermore, the transformation \(\mathcal{C}\) is orthogonal and proper. But then \(|\mathcal{C}| = 1\), and formula (10.60) shows that \({\boldsymbol{e} }'_{1} \wedge \cdots\wedge {\boldsymbol{e} }'_{n} = {\boldsymbol{e} }_{1} \wedge\cdots\wedge {\boldsymbol{e} }_{n}\).

Example 10.31

Let us show how from the given considerations, we obtain a proof of the Cauchy–Binet formula, which was stated but not proved in Sect. 2.9.

Let us recall that in that section, we considered the product of two matrices B and A, the first of type (m,n), and the second of type (n,m), so that BA is a square matrix of order m. We are required to obtain an expression for the determinant |BA| in terms of the associated minors of the matrices B and A. Minors of the matrices B and A are said to be associated if they are of the same order, namely the minimum of n and m, and are located in the columns (of matrix B) and rows (of matrix A) of identical indices. The Cauchy–Binet formula asserts that the determinant |BA| is equal to 0 if n<m, and that |BA| is equal to the sum of the pairwise products over all the associated minors of order m if nm.

Since every matrix is the matrix of some linear transformation of vector spaces of suitable dimensions, we may formulate this problem as a question of the determinant of the product of linear transformations \(\mathcal{A}: {\mathsf{M}} \to {\mathsf{L}}\) and \(\mathcal{B}: {\mathsf{L}} \to {\mathsf{M}}\), where dimL=n and dimM=m. Here it is assumed that we have chosen a basis e 1,…,e m in the space M and a basis f 1,…,f n in the space L such that the transformations \(\mathcal{A}\) and \(\mathcal{B}\) have matrices A and B respectively in these bases. Then \(\mathcal{B}\mathcal{A}\) will be a linear transformation of the space M into itself with determinant \(|\mathcal{B}\mathcal{A}| = |BA|\).

Let us first prove that |BA|=0 if n<m. Since the image of the transformation \(\mathcal{B}\mathcal{A}\) is a subset of the image of the transformation \(\mathcal{B}\), and \(\dim \mathcal{B}({\mathsf{L}}) \le \dim {\mathsf{L}}\), it follows that in the case under consideration, we have the inequality

$$ \dim (\mathcal{B}\mathcal{A}) ({\mathsf{M}}) \le \dim {\mathsf{L}} = n < m, $$

from which it follows that the image of the transformation \(\mathcal{B}\mathcal{A}\) is not equal to the entire space M, that is, the transformation \(\mathcal{B}\mathcal{A}\) is singular. This means that \(|\mathcal{B}\mathcal{A}| = 0\), that is, |BA|=0.

Now let us consider the case n≥m. Using Lemmas 10.16 and 10.17 from Sect. 10.3 with p=m, we obtain for the vectors of the basis e 1,…,e m of the space M the relationship

$$ \varLambda^m(\mathcal{B}\mathcal{A}) ({\boldsymbol{e}}_1 \wedge\cdots\wedge {\boldsymbol{e}}_m) = \varLambda^m(\mathcal{B}) \bigl(\mathcal{A}({\boldsymbol{e}}_1) \wedge\cdots\wedge \mathcal{A}({\boldsymbol{e}}_m)\bigr). $$
(10.62)

The vectors \(\mathcal{A}({\boldsymbol{e}}_1), \ldots, \mathcal{A}({\boldsymbol{e}}_m)\) are contained in the space L of dimension n, and their coordinates in the basis f 1,…,f n , being written in column form, form the matrix A of the transformation \(\mathcal{A}\). Let us now write the coordinates of the vectors \(\mathcal{A}({\boldsymbol{e}}_i)\) in row form. We thereby obtain the transpose matrix \(A^{\top}\) of type (m,n). Applying formula (10.22) to the vectors \(\mathcal{A}({\boldsymbol{e}}_1), \ldots, \mathcal{A}({\boldsymbol{e}}_m)\), we obtain the equality

$$ \mathcal{A}({\boldsymbol{e}}_1) \wedge\cdots\wedge \mathcal{A}({\boldsymbol{e}}_m) = \sum_{{\boldsymbol{I}}\in \overrightarrow{{\mathbb{N}}}_n^m} M_{{\boldsymbol{I}}}\, {\boldsymbol{\varphi}}_{{\boldsymbol{I}}} $$
(10.63)

with the functions φ I defined by formula (10.20). In the expression (10.63), according to our definition, M I is the minor of the matrix \(A^{\top}\) occupying the columns with indices i 1,…,i m . It is obvious that such a minor of the matrix \(A^{\top}\) coincides with the minor of the matrix A occupying the rows with the same indices i 1,…,i m . Thus we may assume that in the sum on the right-hand side of (10.63), the M I are the minors of order m of the matrix A corresponding to all possible ordered collections I=(i 1,…,i m ) of indices of its rows.

Relationships (10.62) and (10.63) together give the equality

$$ \varLambda^m(\mathcal{B}\mathcal{A}) ({\boldsymbol{e}}_1 \wedge\cdots\wedge {\boldsymbol{e}}_m) = \sum_{{\boldsymbol{I}}\in \overrightarrow{{\mathbb{N}}}_n^m} M_{{\boldsymbol{I}}}\, \varLambda^m(\mathcal{B}) ({\boldsymbol{\varphi}}_{{\boldsymbol{I}}}). $$
(10.64)

Let us denote by M I and N I the associated minors of the matrices A and B. This means that the minor M I occupies the rows of the matrix A with indices I=(i 1,…,i m ), and the minor N I occupies the columns of the matrix B with the same indices. Let us consider the restriction of the linear transformation \(\mathcal{B}\) to the subspace \(\langle {\boldsymbol{f} }_{i_{1}}, \ldots, {\boldsymbol{f} }_{i_{m}} \rangle\). By the definition of the functions φ I , we obtain that

$$ \varLambda^m(\mathcal{B}) ({\boldsymbol{\varphi}}_{{\boldsymbol{I}}}) = \mathcal{B}({\boldsymbol{f}}_{i_1}) \wedge\cdots\wedge \mathcal{B}({\boldsymbol{f}}_{i_m}) = N_{{\boldsymbol{I}}}\, ({\boldsymbol{e}}_1 \wedge\cdots\wedge {\boldsymbol{e}}_m). $$

From this, taking into account formula (10.64), follows the relationship

$$ \varLambda^m(\mathcal{B}\mathcal{A}) ({\boldsymbol{e}}_1 \wedge\cdots\wedge {\boldsymbol{e}}_m) = \biggl(\,\sum_{{\boldsymbol{I}}\in \overrightarrow{{\mathbb{N}}}_n^m} M_{{\boldsymbol{I}}} N_{{\boldsymbol{I}}} \biggr) ({\boldsymbol{e}}_1 \wedge\cdots\wedge {\boldsymbol{e}}_m). $$

On the other hand, by Lemma 10.17 and formula (10.60), we have

$$ \varLambda^m(\mathcal{B}\mathcal{A}) ({\boldsymbol{e}}_1 \wedge\cdots\wedge {\boldsymbol{e}}_m) = |\mathcal{B}\mathcal{A}| \, ({\boldsymbol{e}}_1 \wedge\cdots\wedge {\boldsymbol{e}}_m). $$

The last two equalities give us the relationship

$$ |\mathcal{B}\mathcal{A}| = \sum_{{\boldsymbol{I}}\in \overrightarrow{{\mathbb{N}}}_n^m} M_{{\boldsymbol{I}}} N_{{\boldsymbol{I}}}, $$

which, taking into account the equality \(|\mathcal{B}\mathcal{A}| = |BA|\), coincides with the Cauchy–Binet formula for the case n≥m.
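
The formula just obtained is easy to test numerically. In the following Python sketch (our own illustration, with arbitrary random test data), the associated minors are enumerated with itertools.combinations:

```python
# Numerical check of the Cauchy-Binet formula: for B of type (m, n) and A of
# type (n, m) with n >= m, |BA| is the sum over all increasing collections I
# of products of associated minors: columns I of B times rows I of A.
from itertools import combinations
import numpy as np

def cauchy_binet(B, A):
    m, n = B.shape
    if n < m:
        return 0.0                          # the case n < m treated above
    return sum(np.linalg.det(B[:, list(I)]) * np.linalg.det(A[list(I), :])
               for I in combinations(range(n), m))

rng = np.random.default_rng(2)
B = rng.standard_normal((3, 5))
A = rng.standard_normal((5, 3))
assert np.isclose(np.linalg.det(B @ A), cauchy_binet(B, A))
```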

Example 10.32

Let us derive the formula for the determinant of a square matrix A that generalizes the well-known formula for the expansion of the determinant along the jth column:

$$ |A| = a_{1j} A_{1j} + a_{2j} A_{2j} + \cdots+ a_{nj} A_{nj}, $$
(10.65)

where A ij is the cofactor of the element a ij , that is, the number \((-1)^{i+j} M_{ij}\), and M ij is the minor obtained by deleting this element from the matrix A along with the entire row and column at whose intersection it is located. The generalization consists in the fact that now we shall write down an analogous expansion of the determinant not along a single column, but along several, thereby generalizing in a suitable way the notion of the cofactor.

Let us consider a certain collection \({\boldsymbol{I} }\in \overrightarrow{{\mathbb{N}}}_{n}^{m}\), where m is a natural number in the range 1 to n−1. Let us denote by \(\overline{{\boldsymbol{I} }}\) the collection obtained from (1,…,n) by discarding all indices entering into I. Clearly, \(\overline{{\boldsymbol{I} }}\in \overrightarrow{{\mathbb{N}}}_{n}^{n-m}\). Let us denote by |I| the sum of all indices entering into the collection I, that is, we shall set |I|=i 1+⋯+i m .

Let A be an arbitrary square matrix of order n, and let I=(i 1,…,i m ) and J=(j 1,…,j m ) be two collections of indices in \(\overrightarrow{{\mathbb{N}}}_{n}^{m}\). For the minor M IJ occupying the rows with indices i 1,…,i m and columns with indices j 1,…,j m , let us call the number

$$ A_{{\boldsymbol{I} }{\boldsymbol{J}}} = (-1)^{|{\boldsymbol{I} }|+|{\boldsymbol{J}}|} M_{\overline{{\boldsymbol{I} }}\overline{{\boldsymbol{J}}}} $$
(10.66)

the cofactor. It is easy to see that the given definition is indeed a generalization of that given in Chap. 2 of the cofactor of a single element a ij for which m=1 and the collections I=(i), J=(j) each consist of a single index.

Theorem 10.33

(Laplace’s theorem)

The determinant of a matrix A is equal to the sum of the products of all minors occupying any m given columns (or rows) by their cofactors:

$$|A| = \sum_{{\boldsymbol{J}}\in \overrightarrow{{\mathbb{N}}}_n^m} M_{{\boldsymbol{I} }{\boldsymbol{J}}} A_{{\boldsymbol{I} }{\boldsymbol{J}}} = \sum_{{\boldsymbol{I} }\in \overrightarrow{{\mathbb{N}}}_n^m} M_{{\boldsymbol{I} }{\boldsymbol{J}}} A_{{\boldsymbol{I} }{\boldsymbol{J}}}, $$

where the number m can be arbitrarily chosen in the range 1 to n−1.

Remark 10.34

For m=1 and m=n−1, Laplace’s theorem gives formula (10.65) for the expansion of the determinant along a column and the analogous formula for expansion along a row. However, it is only in the general case that the symmetry between the minors of order m and those of order n−m comes clearly into view.

Proof of Theorem 10.33

Let us first of all note that since under transposition of a matrix its rows are converted into columns while the determinant is unchanged, it suffices to provide a proof for only one of the given equalities. For definiteness, let us prove the first of them, the formula for the expansion of the determinant |A| along the m rows with indices in the collection I.

Let us consider a vector space L of dimension n and an arbitrary basis e 1,…,e n of L. Let \(\mathcal{A}: {\mathsf{L}} \to {\mathsf{L}}\) be a linear transformation having in this basis the matrix A. Let us apply to the vectors of this basis a permutation such that the first m positions are occupied by the vectors \({\boldsymbol{e} }_{i_{1}}, \ldots, {\boldsymbol{e} }_{i_{m}}\), and the remaining n−m positions by the vectors \({\boldsymbol{e} }_{i_{m+1}}, \ldots, {\boldsymbol{e} }_{i_{n}}\). In the basis thus obtained, the determinant of the transformation \(\mathcal{A}\) will again be equal to |A|, since the determinant of the matrix of a transformation does not depend on the choice of basis. Using formula (10.60), we obtain

$$ \mathcal{A}({\boldsymbol{e}}_{i_1}) \wedge\cdots\wedge \mathcal{A}({\boldsymbol{e}}_{i_n}) = |A| \, ({\boldsymbol{e}}_{i_1} \wedge\cdots\wedge {\boldsymbol{e}}_{i_n}). $$
(10.67)

Let us calculate the left-hand side of relationship (10.67), applying formula (10.22) to the two different groups of vectors.

First, let us set \({\boldsymbol{x}}_1 = \mathcal{A}({\boldsymbol{e}}_{i_1})\), …, \({\boldsymbol{x}}_m = \mathcal{A}({\boldsymbol{e}}_{i_m})\). Then from (10.22), we obtain

$$ \mathcal{A}({\boldsymbol{e}}_{i_1}) \wedge\cdots\wedge \mathcal{A}({\boldsymbol{e}}_{i_m}) = \sum_{{\boldsymbol{J}}\in \overrightarrow{{\mathbb{N}}}_n^m} M_{{\boldsymbol{I}}{\boldsymbol{J}}}\, {\boldsymbol{\varphi}}_{{\boldsymbol{J}}}, $$
(10.68)

where I=(i 1,…,i m ), and J runs through all collections from the set \(\overrightarrow{{\mathbb{N}}}_{n}^{m}\).

Now let us replace the number m by n−m in (10.22) and apply the formula thus obtained to the vectors \(\mathcal{A}({\boldsymbol{e}}_{i_{m+1}})\), …, \(\mathcal{A}({\boldsymbol{e}}_{i_n})\). As a result, we obtain the equality

$$ \mathcal{A}({\boldsymbol{e}}_{i_{m+1}}) \wedge\cdots\wedge \mathcal{A}({\boldsymbol{e}}_{i_n}) = \sum_{{\boldsymbol{J}}'\in \overrightarrow{{\mathbb{N}}}_n^{n-m}} M_{\overline{{\boldsymbol{I}}}{\boldsymbol{J}}'}\, {\boldsymbol{\varphi}}_{{\boldsymbol{J}}'}, $$
(10.69)

where \(\overline{{\boldsymbol{I} }}= (i_{m+1}, \ldots, i_{n})\), and J′ runs through all collections in the set \(\overrightarrow{{\mathbb{N}}}_{n}^{n-m}\).

Substituting the expressions (10.68) and (10.69) into the left-hand side of (10.67), we obtain the equality

$$ \sum_{{\boldsymbol{J}}\in \overrightarrow{{\mathbb{N}}}_n^m} \sum_{{\boldsymbol{J}}' \in \overrightarrow{{\mathbb{N}}}_n^{n-m}} M_{{\boldsymbol{I} }{\boldsymbol{J}}} M_{\overline{{\boldsymbol{I} }}{\boldsymbol{J}}'} {\boldsymbol{\varphi }}_{{\boldsymbol{J}}} \wedge {\boldsymbol{\varphi }}_{{\boldsymbol{J}}'} = |A| ({\boldsymbol{\varphi }}_{{\boldsymbol{I} }} \wedge {\boldsymbol{\varphi }}_{\overline{{\boldsymbol{I} }}}). $$
(10.70)

Let us calculate the exterior product \({\boldsymbol{\varphi }}_{{\boldsymbol{I} }} \wedge {\boldsymbol{\varphi }}_{\overline{{\boldsymbol{I} }}}\) for p=m and q=n−m, making use of the multiplication table (10.55) that was obtained at the end of the previous section. In this case, it is obvious that the collection K obtained by the union of I and \(\overline{{\boldsymbol{I} }}\) is equal to (1,…,n), and we have only to calculate the number \(\varepsilon ({\boldsymbol{I} }, \overline{{\boldsymbol{I} }}) = \pm1\), which depends on whether the number of transpositions necessary to pass from (i 1,…,i m ,i m+1,…,i n ) to K=(1,…,n) is even or odd. It is not difficult to see (using, for example, the same reasoning as in Sect. 2.6) that the parity of this number of transpositions coincides with the parity of the number of pairs \((i, \overline{\imath})\), where i∈I and \(\overline{\imath}\in \overline{{\boldsymbol{I} }}\), for which the indices i and \(\overline{\imath}\) are in reverse order (form an inversion), that is, \(i > \overline{\imath}\). By definition, all indices less than i 1 appear in \(\overline{{\boldsymbol{I} }}\), and consequently, they form an inversion with i 1. This gives us i 1−1 pairs. Further, all numbers less than i 2 and belonging to \(\overline{{\boldsymbol{I} }}\) form an inversion with the index i 2, that is, all numbers less than i 2 with the exception of i 1, which belongs to I and not to \(\overline{{\boldsymbol{I} }}\). This gives i 2−2 pairs.

Continuing in this way to the end, we obtain that the number of pairs \((i, \overline{\imath})\) forming an inversion is equal to (i 1−1)+(i 2−2)+⋯+(i m m), that is, equal to |I|−μ, where \(\mu= 1 + \cdots+ m =\frac{1}{2}m(m+1)\). Consequently, we finally obtain the formula \({\boldsymbol{\varphi }}_{{\boldsymbol{I} }} \wedge {\boldsymbol{\varphi }}_{\overline{{\boldsymbol{I} }}} = (-1)^{|{\boldsymbol{I} }| - \mu} {\boldsymbol{\varphi }}_{{\boldsymbol{K}}}\), where K=(1,…,n).

The exterior product \(\varphi_{{\boldsymbol{J}}} \wedge \varphi_{{\boldsymbol{J}}'}\) is equal to zero for all J and J′, with the exception only of the case that \({\boldsymbol{J}}' = \overline{{\boldsymbol{J}}}\), that is, the collections J and J′ are disjoint and complement each other. By what we have said above, \(\varphi _{{\boldsymbol{J}}} \wedge \varphi _{\overline{{\boldsymbol{J}}}} =(-1)^{|{\boldsymbol{J}}| - \mu} {\boldsymbol{\varphi }}_{{\boldsymbol{K}}}\). Thus from (10.70) we obtain the equality

$$ \sum_{{\boldsymbol{J}}\in \overrightarrow{{\mathbb{N}}}_n^m} M_{{\boldsymbol{I} }{\boldsymbol{J}}} M_{\overline{{\boldsymbol{I} }}\overline{{\boldsymbol{J}}}} (-1)^{|{\boldsymbol{J}}| - \mu} {\boldsymbol{\varphi }}_{{\boldsymbol{K}}} = |A| (-1)^{|{\boldsymbol{I} }| - \mu} {\boldsymbol{\varphi }}_{{\boldsymbol{K}}}. $$
(10.71)

Multiplying both sides of equality (10.71) by the number \((-1)^{|{\boldsymbol{I}}|+\mu}\) and taking into account the obvious identity \((-1)^{2|{\boldsymbol{I}}|}=1\), we finally obtain

$$\sum_{{\boldsymbol{J}}\in \overrightarrow{{\mathbb{N}}}_n^m} M_{{\boldsymbol{I} }{\boldsymbol{J}}} M_{\overline{{\boldsymbol{I} }}\overline{{\boldsymbol{J}}}} (-1)^{|{\boldsymbol{I} }|+|{\boldsymbol{J}}|} = |A|, $$

which, taking into account definition (10.66), gives us the required equality. □
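
Laplace’s theorem, too, admits a direct numerical check. In the Python sketch below (our own illustration), the indices are 0-based, which shifts |I|+|J| by the even number 2m and therefore leaves the sign \((-1)^{|{\boldsymbol{I}}|+|{\boldsymbol{J}}|}\) unchanged:

```python
# Sketch of Laplace's theorem: |A| = sum over J of M_IJ * A_IJ for a fixed
# collection I of m rows, with the cofactor A_IJ = (-1)^{|I|+|J|} times the
# minor in the complementary rows and columns, as in definition (10.66).
from itertools import combinations
import numpy as np

def laplace_det(A, I):
    n = A.shape[0]
    m = len(I)
    I_bar = [r for r in range(n) if r not in I]        # complementary rows
    total = 0.0
    for J in combinations(range(n), m):
        J_bar = [c for c in range(n) if c not in J]    # complementary columns
        minor = np.linalg.det(A[np.ix_(list(I), list(J))])
        cominor = np.linalg.det(A[np.ix_(I_bar, J_bar)])
        # 0-based indices shift |I| + |J| by the even number 2m, so the
        # sign (-1)^{|I|+|J|} is unaffected.
        total += (-1) ** (sum(I) + sum(J)) * minor * cominor
    return total

A = np.random.default_rng(3).standard_normal((5, 5))
for I in [(0,), (0, 2), (1, 3, 4)]:
    assert np.isclose(laplace_det(A, I), np.linalg.det(A))
```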

Example 10.35

We began this section with Example 10.30, in which we investigated in detail the space Λ p(L) for p=n. Let us now consider the case p=n−1. As a result of the general relationship \(\dim \varLambda ^{p}({\mathsf{L}}) = \mathrm{C}^{p}_{n}\), we obtain that dimΛ n−1(L)=n.

Having chosen an arbitrary basis e 1,…,e n in the space L, we assign to every vector zΛ n−1(L) the linear function f(x) on L defined by the condition

$${\boldsymbol{z} }\wedge {\boldsymbol{x} }= f({\boldsymbol{x} }) ({\boldsymbol{e} }_1 \wedge\cdots\wedge {\boldsymbol{e} }_n),\quad {\boldsymbol{x} }\in {\mathsf{L}}. $$

For this, it is necessary to recall that z∧x belongs to the one-dimensional space Λ n(L), and the vector e 1∧⋯∧e n constitutes there a basis. The linearity of the function f(x) follows from the properties of the exterior product proved above. Let us verify that the linear transformation

$$ \varLambda^{n-1}({\mathsf{L}}) \to {\mathsf{L}}^*,\qquad {\boldsymbol{z}}\mapsto f, $$

thus constructed is an isomorphism. Since \(\dim \varLambda^{n-1}({\mathsf{L}}) = \dim {\mathsf{L}}^* = n\), to show this, it suffices to verify that the kernel of the transformation is equal to (0). As we know, it is possible to select as the basis of the space Λ n−1(L) the vectors

$${\boldsymbol{e} }_{i_1} \wedge {\boldsymbol{e} }_{i_2} \wedge\cdots\wedge {\boldsymbol{e} }_{i_{n-1}},\quad i_k \in\{1, \ldots, n\}, $$

where each such vector is determined by the collection (i 1,…,i n−1), which consists of all the numbers 1,…,n except for one and is determined uniquely up to permutation. This means that as a basis of Λ n−1(L) we can choose the vectors

$$ {\boldsymbol{u} }_i = {\boldsymbol{e} }_{1} \wedge\cdots\wedge {\boldsymbol{e} }_{i-1} \wedge\breve{{\boldsymbol{e} }}_{i} \wedge {\boldsymbol{e} }_{i+1} \wedge\cdots\wedge {\boldsymbol{e} }_n,\quad i=1, \ldots, n. $$
(10.72)

It is clear that u i ∧e j =0 if i≠j, while \({\boldsymbol{u}}_i \wedge {\boldsymbol{e}}_i = (-1)^{n-i}\, {\boldsymbol{e}}_1 \wedge\cdots\wedge {\boldsymbol{e}}_n \ne {\boldsymbol{0}}\) for all i=1,…,n.

Let us assume that z∈Λ n−1(L) is a vector whose associated linear function f(x) is equal to zero for every x∈L, and let us write z=z 1 u 1+⋯+z n u n . Then from our assumption, it follows that z∧x=0 for all x∈L, and in particular, for the vectors e 1,…,e n . It is easy to see that from this follow the equalities z 1=0, …, z n =0, and hence z=0. Thus the kernel of our transformation is equal to (0), and it is indeed an isomorphism.

The constructed isomorphism is a refinement of the following fact that we encountered earlier: the Plücker coordinates of a hyperplane can be arbitrary numbers; in this dimension, the Plücker relations do not yet appear.

Let us now assume that the space L is an oriented Euclidean space. On the one hand, this determines a fixed basis (10.58) in Λ n(L) if e 1,…,e n is an arbitrary positively oriented orthonormal basis of L, so that the isomorphism constructed above is uniquely determined. On the other hand, for a Euclidean space, there is defined the standard isomorphism \({\mathsf{L}}^* \stackrel{\sim}{\to} {\mathsf{L}}\), which does not require the selection of any basis at all in L (see p. 214). Combining these two isomorphisms, we obtain the isomorphism

$$ \varLambda^{n-1}({\mathsf{L}}) \stackrel{\sim}{\to} {\mathsf{L}}, $$

which assigns to the element z∈Λ n−1(L) the vector x∈L such that

$$ {\boldsymbol{z} }\wedge {\boldsymbol{y} }= ({\boldsymbol{x} },{\boldsymbol{y} }) ({\boldsymbol{e} }_1 \wedge\cdots\wedge {\boldsymbol{e} }_n) $$
(10.73)

for every vector yL and for the positively oriented orthonormal basis e 1,…,e n , where (x,y) denotes the inner product in the space L.

Let us consider this isomorphism in greater detail. We saw earlier that the vectors u i determined by formula (10.72) form a basis of the space Λ n−1(L). To describe the constructed isomorphism, it suffices to determine which vector bL corresponds to the vector a 1∧⋯∧a n−1, a i L. We may suppose that the vectors a 1,…,a n−1 are linearly independent, since otherwise, the vector a 1∧⋯∧a n−1 would equal 0, and therefore to it would correspond the vector b=0. Taking into account formula (10.73), this correspondence implies the equality

$$ ({\boldsymbol{b} },{\boldsymbol{y} }) ({\boldsymbol{e} }_1 \wedge\cdots\wedge {\boldsymbol{e} }_n) = {\boldsymbol{a} }_1 \wedge\cdots\wedge {\boldsymbol{a} }_{n-1} \wedge {\boldsymbol{y} }, $$
(10.74)

satisfied for all y∈L. Since the vector on the right-hand side of (10.74) is the null vector whenever y belongs to the subspace L 1=〈a 1,…,a n−1〉, it follows that (b,y)=0 for all y∈L 1, that is, \({\boldsymbol{b} }\in {\mathsf{L}}_{1}^{\perp}\).

Now we must recall that we have an orientation and consider L and L 1 to be oriented (it is easy to ascertain that the orientation of the space L does not determine a natural orientation of the subspace L 1, and so we must choose and fix the orientation of L 1 separately). Then we may choose the basis e 1,…,e n in such a way that it is orthonormal and positively oriented and such that the first n−1 vectors e 1,…,e n−1 belong to the subspace L 1 and form in it an orthonormal and positively oriented basis (this can always be attained, possibly after replacing the vector e n by its opposite).

Since the vector b is contained in the one-dimensional subspace \({\mathsf{L}}_{1}^{\perp} = \langle {\boldsymbol{e} }_{n}\rangle\), it follows that b=β e n . Using the previous arguments, we obtain that

$${\boldsymbol{a} }_1 \wedge\cdots\wedge {\boldsymbol{a} }_{n-1} = v({\boldsymbol{a} }_1, \ldots, {\boldsymbol{a} }_{n-1})\, {\boldsymbol{e} }_1 \wedge\cdots\wedge {\boldsymbol{e} }_{n-1}, $$

where v(a 1,…,a n−1) is the oriented volume of the parallelepiped spanned by the vectors a 1,…,a n−1 (see the definition on p. 221). This observation determines the number β.

Indeed, substituting the vector y=e n into (10.74) and taking into account the fact that the basis e 1,…,e n was chosen to be orthonormal and positively oriented (from which follows, in particular, the equality v(e 1,…,e n )=1), we obtain the relationship

$$ \beta = v ({\boldsymbol{a} }_1, \ldots, {\boldsymbol{a} }_{n-1}, {\boldsymbol{e} }_n) = v( {\boldsymbol{a} }_1, \ldots, {\boldsymbol{a} }_{n-1}). $$

Thus the isomorphism constructed above assigns to the vector a 1∧⋯∧a n−1 the vector b=v(a 1,…,a n−1)e n , where e n is the unit vector on the line \({\mathsf{L}}_{1}^{\perp}\), chosen with the sign making the basis e 1,…,e n of the space L orthonormal and positively oriented. As is easily verified, this is equivalent to the requirement that the basis a 1,…,a n−1,e n be positively oriented.

The final result is contained in the following theorem.

Theorem 10.36

For every oriented Euclidean space L, the isomorphism

$$ \varLambda^{n-1}({\mathsf{L}}) \stackrel{\sim}{\to} {\mathsf{L}} $$

constructed above assigns to the vector a 1∧⋯∧a n−1 the vector b∈L, which is orthogonal to the vectors a 1,…,a n−1 and whose length is equal to the unoriented volume V(a 1,…,a n−1), or more precisely,

$$ {\boldsymbol{b} }= V({\boldsymbol{a} }_1, \ldots, {\boldsymbol{a} }_{n-1}) {\boldsymbol{e} }, $$
(10.75)

where eL is a vector of unit length orthogonal to the vectors a 1,…,a n−1 and chosen in such a way that the basis a 1,…,a n−1,e is positively oriented.

The vector b determined by the relationship (10.75) is called the vector product of the vectors a 1,…,a n−1 and is denoted by [a 1,…,a n−1]. In the case n=3, this definition gives us the vector product of two vectors [a 1,a 2] familiar from analytic geometry.
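
In coordinates, the vector b of Theorem 10.36 can be computed by the familiar device of expanding a formal determinant whose last row consists of the basis vectors. The following Python sketch (our own illustration, assuming the standard orthonormal, positively oriented basis of ℝⁿ) implements this cofactor expansion and checks it against the classical cross product for n=3 and against the orthogonality assertion of the theorem for n=4:

```python
# Sketch of the vector product [a_1, ..., a_{n-1}] in R^n with its standard
# orthonormal positively oriented basis: cofactor expansion along the formal
# last row (the basis vectors) of the n x n matrix whose first n-1 rows are
# the coordinate rows of a_1, ..., a_{n-1}.
import numpy as np

def vector_product(*a):
    A = np.asarray(a)                      # matrix of type (n-1, n)
    n = A.shape[1]
    b = np.empty(n)
    for k in range(n):
        cols = [c for c in range(n) if c != k]
        # cofactor of the position (n, k+1) in 1-based numbering
        b[k] = (-1) ** (n + k + 1) * np.linalg.det(A[:, cols])
    return b

# n = 3: the classical cross product of analytic geometry.
a1, a2 = np.array([1.0, 2.0, 0.5]), np.array([0.0, 1.0, -1.0])
assert np.allclose(vector_product(a1, a2), np.cross(a1, a2))

# n = 4: the result is orthogonal to each a_i, as Theorem 10.36 asserts.
v = np.random.default_rng(4).standard_normal((3, 4))
assert np.allclose(v @ vector_product(*v), 0.0)
```

With this sign convention, the basis a 1,…,a n−1,b is positively oriented whenever the a i are linearly independent, in agreement with the choice of the unit vector e in (10.75).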