
This text is based on the lectures given at the Arizona Winter School on “Quadratic forms”. The aim of the text is to give a brief introduction to the algebraic theory of quadratic forms. We explain invariants associated to quadratic forms—invariants with values in Galois cohomology as well as numerical invariants. We explain some open questions concerning these invariants and recent progress related to these questions.

There are many good references for this material on the algebraic theory of quadratic forms including [EKM, K, L, Pf] and [S].

1 Quadratic Forms

Let k be a field with char k ≠ 2.

Definition 1.1.

A quadratic form \(q: V \rightarrow k\) on a vector space V over k is a map satisfying:

  1. (1)

    \(q(\lambda v) = \lambda^{2}q(v)\) for \(v \in V\), \(\lambda \in k\).

  2. (2)

    The map \(b_{q}: V \times V \rightarrow k\), defined by

    $$\displaystyle{b_{q}(v,w) = \frac{1} {2}[q(v + w) - q(v) - q(w)]}$$

    is bilinear.

We denote a quadratic form by (V, q), or simply by q. Throughout, we restrict ourselves to the study of quadratic forms on finite-dimensional vector spaces.

The bilinear form \(b_{q}\) is symmetric; q determines \(b_{q}\), and \(q(v) = b_{q}(v,v)\) for all \(v \in V\).

For a choice of basis \(\{e_{1},\ldots,e_{n}\}\) of V, \(b_{q}\) is represented by a symmetric matrix \(A(q) = (a_{ij})\) with \(a_{ij} = b_{q}(e_{i},e_{j})\). If \(v =\sum _{1\leq i\leq n}X_{i}e_{i} \in V\), \(X_{i} \in k\), then

$$\displaystyle{q(v) =\sum _{1\leq i,j\leq n}a_{ij}X_{i}X_{j} =\sum _{1\leq i\leq n}a_{ii}X_{i}^{2} + 2\sum _{ i<j}a_{ij}X_{i}X_{j}.}$$

Thus q is represented by a homogeneous polynomial of degree 2. Clearly, every homogeneous polynomial of degree 2 corresponds to a quadratic form on V with respect to the chosen basis.
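The passage from q to its symmetric matrix can be checked numerically. The following is a minimal sketch over the rationals; the helper names `gram_matrix` and `evaluate` and the sample form `q_example` are illustrative assumptions, not notation from the text.

```python
from fractions import Fraction

def gram_matrix(q, n):
    """Matrix A(q) = (b_q(e_i, e_j)) for the standard basis of Q^n,
    computed from q by the polarization formula of Definition 1.1."""
    def e(i):
        return [Fraction(1) if j == i else Fraction(0) for j in range(n)]

    def add(v, w):
        return [a + b for a, b in zip(v, w)]

    def b(v, w):
        return Fraction(1, 2) * (q(add(v, w)) - q(v) - q(w))

    return [[b(e(i), e(j)) for j in range(n)] for i in range(n)]

def evaluate(A, v):
    """Evaluate the homogeneous polynomial sum_{i,j} a_ij X_i X_j at v."""
    n = len(v)
    return sum(A[i][j] * v[i] * v[j] for i in range(n) for j in range(n))

# Hypothetical sample form q(X, Y) = X^2 + 3XY - 2Y^2 over Q.
def q_example(v):
    x, y = v
    return x * x + 3 * x * y - 2 * y * y

A = gram_matrix(q_example, 2)   # off-diagonal entries are half the XY coefficient
```

Evaluating `evaluate(A, v)` and `q_example(v)` at the same vector should give the same value, which is the content of the displayed identity.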

Definition 1.2.

Two quadratic forms \((V_{1},q_{1})\), \((V_{2},q_{2})\) are isometric if there is an isomorphism \(\phi: V _{1}{ \sim \atop \rightarrow } V _{2}\) such that \(q_{2}(\phi (v)) = q_{1}(v)\) for all \(v \in V_{1}\).

If \(A(q_{1})\), \(A(q_{2})\) are the matrices representing \(q_{1}\) and \(q_{2}\) with respect to bases \(B_{1}\) and \(B_{2}\) of \(V_{1}\) and \(V_{2}\) respectively, then ϕ yields a matrix \(T \in \mathrm{GL}_{n}(k)\), n = dim V, such that

$$\displaystyle{TA(q_{2})\,{T}^{t} = A(q_{ 1}).}$$

In other words, the symmetric matrices \(A(q_{1})\) and \(A(q_{2})\) are congruent. Thus isometry classes of quadratic forms yield congruence classes of symmetric matrices.

Definition 1.3.

The form \(q: V \rightarrow k\) is said to be regular if \(b_{q}: V \times V \rightarrow k\) is nondegenerate.

Thus q is regular if and only if the map \(V \rightarrow V^{\ast} = \mathrm{Hom}(V,k)\), defined by \(v\mapsto (w\mapsto b_{q}(v,w))\), is an isomorphism. This is the case if and only if A(q) is invertible.

Let (V, q) be a quadratic form. Then

$$\displaystyle{V _{0} =\{ v \in V \,:\, b_{q}(v,w) = 0\text{ for all }w \in V \}}$$

is called the radical of V. If \(V_{1}\) is any complementary subspace of \(V_{0}\) in V, then \(q\vert _{V _{1}}\) is regular and \((V,q) = (V _{0},0) \perp (V _{1},q\vert _{V _{1}})\). Note that (V, q) is regular if and only if the radical of V is zero.

Henceforth, we shall only be concerned with regular quadratic forms.

Definition 1.4.

Let W be a subspace of V and \(q: V \rightarrow k\) be a quadratic form. The orthogonal complement of W, denoted \(W^{\perp}\), is the subspace

$$\displaystyle{{W}^{\perp } =\{ v \in V: b_{ q}(v,w) = 0\text{ for all }w \in W\}.}$$

Exercise 1.5.

Let (V, q) be a regular quadratic form and W a subspace of V. Then

  1. (1)

    \(\dim (W) +\dim ({W}^{\perp }) =\dim (V ).\)

  2. (2)

    \({({W}^{\perp })}^{\perp } = W.\)

1.1 Orthogonal Sums

Let \((V_{1},q_{1})\), \((V_{2},q_{2})\) be quadratic forms. The form

$$\displaystyle{(V _{1},q_{1}) \perp (V _{2},q_{2}) = (V _{1} \oplus V _{2},q_{1} \perp q_{2}),}$$

with \(q_{1} \perp q_{2}\) defined by

$$\displaystyle{(q_{1} \perp q_{2})(v_{1},v_{2}) = q_{1}(v_{1}) + q_{2}(v_{2}),\ v_{1} \in V _{1},\ v_{2} \in V _{2}}$$

is called the orthogonal sum of \((V_{1},q_{1})\) and \((V_{2},q_{2})\).

1.2 Diagonalization

Let (V, q) be a quadratic form. There exists a basis \(\{e_{1},\ldots,e_{n}\}\) of V such that \(b_{q}(e_{i},e_{j}) = 0\) for \(i \neq j\). Such a basis is called an orthogonal basis for q. With respect to an orthogonal basis, \(b_{q}\) is represented by a diagonal matrix.

If \(\{e_{1},\ldots,e_{n}\}\) is an orthogonal basis of q and \(q(e_{i}) = d_{i}\), we write \(q =\langle d_{1},\ldots,d_{n}\rangle\). In this case, \(V = ke_{1} \oplus \cdots \oplus ke_{n}\) is an orthogonal sum and \(q\vert _{ke_{i}}\) is represented by \(\langle d_{i}\rangle\). Thus every quadratic form is diagonalizable.
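An orthogonal basis can be found by simultaneous row and column operations on the matrix of q. The sketch below works over the rationals (so the characteristic is not 2) and is one possible procedure, not a method stated in the text; the function names `diagonalize` and `congruent_transform` are assumptions made for illustration.

```python
from fractions import Fraction

def diagonalize(A):
    """Return (d, P) with P * A * P^T = diag(d) for a symmetric matrix A over Q.
    The rows of P are the coordinates of an orthogonal basis."""
    n = len(A)
    A = [[Fraction(x) for x in row] for row in A]
    P = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

    def add_row_col(i, j, c):
        # Replace the basis vector e_i by e_i + c * e_j.
        for t in range(n):
            A[i][t] += c * A[j][t]
        for t in range(n):
            A[t][i] += c * A[t][j]
        for t in range(n):
            P[i][t] += c * P[j][t]

    def swap(i, j):
        A[i], A[j] = A[j], A[i]
        for row in A:
            row[i], row[j] = row[j], row[i]
        P[i], P[j] = P[j], P[i]

    for i in range(n):
        if A[i][i] == 0:
            for j in range(i + 1, n):
                if A[j][j] != 0:          # a later basis vector with q != 0
                    swap(i, j)
                    break
            else:
                for j in range(i + 1, n):
                    if A[i][j] != 0:      # q(e_i + e_j) = 2 b(e_i, e_j) != 0
                        add_row_col(i, j, Fraction(1))
                        break
        if A[i][i] == 0:
            continue                      # e_i lies in the radical of the remaining block
        for j in range(i + 1, n):
            if A[j][i] != 0:
                add_row_col(j, i, -A[j][i] / A[i][i])
    return [A[i][i] for i in range(n)], P

def congruent_transform(P, A):
    """Compute P * A * P^T."""
    n = len(A)
    PA = [[sum(P[i][k] * A[k][j] for k in range(n)) for j in range(n)] for i in range(n)]
    return [[sum(PA[i][k] * P[j][k] for k in range(n)) for j in range(n)] for i in range(n)]
```

For the matrix with rows (0, 1) and (1, 0), this procedure returns the diagonal form ⟨2, −1/2⟩, which is isometric to ⟨1, −1⟩ since the two entries differ from 1 and −1 by squares.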

1.3 Hyperbolic Forms

Definition 1.6.

A quadratic form (V, q) is said to be isotropic if there is a nonzero \(v \in V\) such that q(v) = 0. It is anisotropic if q is not isotropic. A quadratic form (V, q) is said to be universal if it represents every element of \(k^{\ast}\); i.e., given \(\lambda \in k^{\ast}\), there is a vector \(v \in V\) such that q(v) = λ.

Example 1.7.

The quadratic form \(X^{2} - Y^{2}\) is isotropic over k. Suppose (V, q) is a regular form which is isotropic. Let \(v \in V\) be such that q(v) = 0, v ≠ 0. Since q is regular, there exists \(w \in V\) such that \(b_{q}(v,w) \neq 0\). After scaling we may assume \(b_{q}(v,w) = 1\). If q(w) ≠ 0, we may replace w by w + λv, \(\lambda = -\frac{1} {2}q(w)\), and assume that q(w) = 0. Thus \(W = kv \oplus kw\) is a 2-dimensional subspace of V and \(q\vert _{W}\) is represented by \(\left (\begin{array}{cc} 0&1\\ 1 &0 \end{array} \right )\) with respect to {v, w}.

Definition 1.8.

A binary quadratic form isometric to \(({k}^{2},\left (\begin{array}{cc} 0&1\\ 1 &0 \end{array} \right ))\) is called a hyperbolic plane. A quadratic form (V, q) is hyperbolic if it is isometric to an orthogonal sum of hyperbolic planes. A subspace W of V such that q restricts to zero on W and \(\dim W = \frac{1} {2}\dim V\) is called a Lagrangian.

Every regular quadratic form which admits a Lagrangian can easily be seen to be hyperbolic.

Exercise 1.9.

Let (V, q) be a regular quadratic form and \((W,q\vert _{W})\) a regular form on the subspace W. Then \((V,q) = (W,q\vert _{W}) \perp ({W}^{\perp },q\vert _{{W}^{\perp }})\).

Theorem 1.10 ( Witt’s Cancellation Theorem).

Let \((V_{1},q_{1})\), \((V_{2},q_{2})\), (V, q) be quadratic forms over k. Suppose

$$\displaystyle{(V _{1},q_{1}) \perp (V,q)\mathop{\cong}(V _{2},q_{2}) \perp (V,q).}$$

Then \((V _{1},q_{1})\mathop{\cong}(V _{2},q_{2})\).

The key ingredient of Witt’s cancellation theorem is the following.

Proposition 1.11.

Let (V,q) be a quadratic form and v,w ∈ V with q(v) = q(w)≠0. Then there is an isometry \(\tau: (V,q){ \sim \atop \rightarrow } (V,q)\) such that τ(v) = w.

Proof.

Let \(q(v) = q(w) = d\neq 0\). Then

$$\displaystyle{q(v + w) + q(v - w) = 2q(v) + 2q(w) = 4d\neq 0.}$$

Thus q(v + w) ≠ 0 or q(v − w) ≠ 0. For any vector \(u \in V\) with q(u) ≠ 0, define \(\tau_{u}: V \rightarrow V\) by

$$\displaystyle{\tau _{u}(z) = z -\frac{2b_{q}(z,u)u} {q(u)}.}$$

The map \(\tau_{u}\) is an isometry, called the reflection with respect to u.

Suppose q(v − w) ≠ 0. Then \(\tau_{v-w}: V \rightarrow V\) is an isometry of V which sends v to w. Suppose q(v + w) ≠ 0. Then \(\tau_{v+w}\) sends v to −w, so \(\tau_{w}\tau_{v+w}\) sends v to w. □
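The two cases of the proof can be traced on explicit vectors. The following sketch works over the rationals with a form given by its symmetric matrix; the names `bilinear`, `reflect` and `isometry_sending` are illustrative assumptions.

```python
from fractions import Fraction

def bilinear(A, v, w):
    """b_q(v, w) for the form with symmetric matrix A."""
    return sum(A[i][j] * v[i] * w[j] for i in range(len(v)) for j in range(len(w)))

def reflect(A, u, z):
    """tau_u(z) = z - 2 b_q(z, u) u / q(u), defined when q(u) != 0."""
    qu = bilinear(A, u, u)
    if qu == 0:
        raise ValueError("reflection is undefined when q(u) = 0")
    c = Fraction(2) * bilinear(A, z, u) / qu
    return [zi - c * ui for zi, ui in zip(z, u)]

def isometry_sending(A, v, w):
    """Return an isometry tau with tau(v) = w, assuming q(v) = q(w) != 0."""
    diff = [a - b for a, b in zip(v, w)]
    summ = [a + b for a, b in zip(v, w)]
    if bilinear(A, diff, diff) != 0:
        # Case q(v - w) != 0: a single reflection suffices.
        return lambda z: reflect(A, diff, z)
    # Case q(v + w) != 0: tau_{v+w} sends v to -w, and tau_w sends -w to w.
    return lambda z: reflect(A, w, reflect(A, summ, z))
```

For the form ⟨1, 1, −1⟩ with v = (1, 0, 0) and w = (1, 1, 1), the difference v − w is isotropic, so the second branch is the one exercised.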

Remark 1.12.

The orthogonal group of (V, q) denoted by O(q) is the set of isometries of V onto itself. This group is generated by reflections. This is seen by an inductive argument on dim(q), using the above proposition.

Theorem 1.13 ( Witt’s decomposition).

Let (V,q) be a quadratic form (not necessarily regular). Then there is a decomposition

$$\displaystyle{(V,q) = (V _{0},0) \perp (V _{1},q_{1}) \perp (V _{2},q_{2})}$$

where \(V_{0}\) is the radical of q, \(q_{1} = q\vert _{V _{1}}\) is anisotropic and \(q_{2} = q\vert _{V _{2}}\) is hyperbolic. If \((V,q) = (V _{0},0) \perp (W_{1},f_{1}) \perp (W_{2},f_{2})\) with \(f_{1}\) anisotropic and \(f_{2}\) hyperbolic, then

$$\displaystyle{(V _{1},q_{1})\mathop{\cong}(W_{1},f_{1}),\ (V _{2},q_{2})\mathop{\cong}(W_{2},f_{2}).}$$

Remark 1.14.

A hyperbolic form (W, f) is determined by dim(W); for if dim(W) = 2n, then \((W,f) \cong nH\), where \(H = ({k}^{2},\left (\begin{array}{cc} 0&1\\ 1 &0 \end{array} \right ))\) is the hyperbolic plane.

From now on, we shall assume (V, q) is a regular quadratic form. We denote by \(q_{\mathrm{an}}\) the quadratic form \((V_{1},q_{1})\) in Witt’s decomposition, which is determined by q up to isometry. We call \(\frac{1} {2}\dim (V _{2})\) the Witt index of q. Thus any regular quadratic form q admits a decomposition \(q \cong q_{\mathrm{an}} \perp (nH)\), with \(q_{\mathrm{an}}\) anisotropic and H denoting the hyperbolic plane. We also sometimes denote by \(H^{n}\) the sum of n hyperbolic planes.

2 Witt Group of Forms

2.1 Witt Groups

We set

$$\displaystyle{W(k) =\{\mathrm{isomorphism\ classes\ of\ regular\ quadratic\ forms\ over}\ k\}/ \sim }$$

where the Witt equivalence ∼ is given by:

$$\displaystyle{(V _{1},q_{1}) \sim (V _{2},q_{2})\;\Longleftrightarrow\;\begin{array}{l} \mathrm{there\ exist}\ r,\,s \in \mathbb{Z}\ \mathrm{such\ that}\\ (V _{1},q_{1}) \perp {H}^{r}\mathop{\cong}(V _{2},q_{2}) \perp {H}^{s} \end{array}.}$$

W(k) is a group under orthogonal sum:

$$\displaystyle{[(V _{1},q_{1})] \perp [(V _{2},q_{2})] = [(V _{1},q_{1}) \perp (V _{2},q_{2})].}$$

The zero element in W(k) is represented by the class of hyperbolic forms. For a regular quadratic form (V, q), (V, q) ⊥ (V, − q) has Lagrangian

$$\displaystyle{W =\{ (v,v)\,:\, v \in V \}}$$

so that \((V,q) \perp (V,-q) \cong H^{n}\), n = dim(V). Thus, \([(V,-q)] = -[(V,q)]\) in W(k).

It follows from Witt’s decomposition theorem that every element in W(k) is represented by a unique anisotropic quadratic form up to isometry. Thus W(k) may be thought of as a group made out of isometry classes of anisotropic quadratic forms over k.

The abelian group W(k) admits a ring structure induced by tensor product on the associated bilinear forms. For example, if \(q_{1}\mathop{\cong}\langle a_{1},\ldots,a_{n}\rangle\) and q 2 is a quadratic form, then \(q_{1} \otimes q_{2}\mathop{\cong}a_{1}q_{2} \perp a_{2}q_{2} \perp \cdots \perp a_{n}q_{2}\).

Definition 2.1.

Let I(k) denote the ideal of classes of even-dimensional quadratic forms in W(k). The ideal I(k) is called the fundamental ideal. \(I^{n}(k)\) stands for the nth power of the ideal I(k).

Definition 2.2.

Let \(P_{n}(k)\) denote the set of isomorphism classes of forms of the type

$$\displaystyle{\langle \!\langle a_{1},\ldots,a_{n}\rangle \!\rangle:=\langle 1,a_{1}\rangle \otimes \cdots \otimes \langle 1,a_{n}\rangle.}$$

Elements in \(P_{n}(k)\) are called n-fold Pfister forms.

The ideal I(k) is generated additively by the forms ⟨1, a⟩, \(a \in k^{\ast}\). Moreover, the ideal \(I^{n}(k)\) is generated additively by n-fold Pfister forms. For instance, for n = 2, the generators of \(I^{2}(k)\) are of the form

$$\displaystyle{\langle a,b\rangle \otimes \langle c,d\rangle \mathop{\cong}\langle 1,ac,ad,cd\rangle -\langle 1,cd,-bc,-bd\rangle =\langle \!\langle ac,ad\rangle \!\rangle -\langle \!\langle cd,-bc\rangle \!\rangle }$$
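The tensor product of diagonal forms and the diagonal entries of a Pfister form are easy to list explicitly. The sketch below represents a diagonal form by the list of its entries; the function names `tensor` and `pfister` are assumptions made for illustration, and the order of the entries is immaterial for the isometry class.

```python
def tensor(d1, d2):
    """Diagonal entries of <d1> tensor <d2>: all products a * b."""
    return [a * b for a in d1 for b in d2]

def pfister(*a):
    """Diagonal entries of the n-fold Pfister form <<a_1, ..., a_n>>
    = <1, a_1> tensor ... tensor <1, a_n> (Definition 2.2)."""
    form = [1]
    for ai in a:
        form = tensor(form, [1, ai])
    return form

two_fold = pfister(2, 3)        # entries 1, 3, 2, 6, i.e. the form <1, 2, 3, 6>
three_fold = pfister(-1, -1, -1)
```

An n-fold Pfister form has dimension 2^n, which is visible here as the length of the returned list.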

Example 2.3.

If \(k = \mathbb{C}\), every 2-dimensional quadratic form over k is isotropic. Hence the map

$$\displaystyle{W(k) \rightarrow \mathbb{Z}/2\mathbb{Z},\qquad [(V,q)]\mapsto \dim (V )\ (\text{mod}\ 2)}$$

is an isomorphism.

Example 2.4.

Let \(k = \mathbb{F}_{{p}^{n}}\), p ≠ 2, be a finite field. Then \(k^{\ast} = k \setminus \{0\}\) has two square classes, {1, u}. Every 3-dimensional quadratic form over k is isotropic. Further, \(W(k)\mathop{\cong}\mathbb{Z}/4\mathbb{Z}\) if −1 is not a square in \(\mathbb{F}_{{p}^{n}}\) and \(W(k)\mathop{\cong}\mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/2\mathbb{Z}\) if −1 is a square in \(\mathbb{F}_{{p}^{n}}\) (cf. [L], Corollary 3.6).
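For small prime fields the first two claims can be confirmed by exhaustive search. The sketch below only treats \(\mathbb{F}_{p}\) with p an odd prime (not general \(\mathbb{F}_{p^{n}}\)), and the helper names `squares_mod` and `is_isotropic_mod_p` are assumptions made for illustration.

```python
from itertools import product

def squares_mod(p):
    """Nonzero squares in F_p, for p an odd prime."""
    return {(x * x) % p for x in range(1, p)}

def is_isotropic_mod_p(diag, p):
    """Brute-force test: does the diagonal form <diag> have a nonzero zero over F_p?"""
    n = len(diag)
    for v in product(range(p), repeat=n):
        if any(v) and sum(d * x * x for d, x in zip(diag, v)) % p == 0:
            return True
    return False

# The squares form a subgroup of index 2, so there are two square classes.
index_two = all(len(squares_mod(p)) == (p - 1) // 2 for p in (3, 5, 7))

# Every ternary diagonal form with nonzero coefficients is isotropic for p = 3, 5, 7.
ternary_isotropic = all(
    is_isotropic_mod_p(d, p)
    for p in (3, 5, 7)
    for d in product(range(1, p), repeat=3)
)
```

In contrast, the binary form ⟨1, −u⟩ with u a non-square is anisotropic, so dimension 3 cannot be lowered to 2.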

Example 2.5.

If \(k = \mathbb{R}\), every quadratic form q is represented by

$$\displaystyle{\langle 1,\ldots,1,-1,\ldots,-1\rangle }$$

with respect to an orthogonal basis. The number r of +1’s and the number s of −1’s in the diagonalization above are uniquely determined by the isomorphism class of q. The signature of q is defined as r − s. The signature yields a homomorphism \(\mathrm{sgn}: W(\mathbb{R}) \rightarrow \mathbb{Z}\) which is an isomorphism.

2.2 Quadratic Forms Over p-Adic Fields

Let k be a finite extension of the field \(\mathbb{Q}_{p}\) of p-adic numbers. We call k a non-dyadic p-adic field if p≠2. The field k has a discrete valuation v extending the p-adic valuation on \(\mathbb{Q}_{p}\). Let π be a uniformizing parameter for v and κ the residue field for v. The field κ is a finite field of characteristic p≠2. Let u be a unit in k such that \(\overline{u} \in \kappa\) is not a square. Then

$$\displaystyle{{k}^{{\ast}}/{{k}^{{\ast}}}^{2} =\{ 1,u,\pi,u\pi \}.}$$

Since κ is finite, every 3-dimensional quadratic form over κ is isotropic. By Hensel’s lemma, every 3-dimensional form \(\langle u_{1},u_{2},u_{3}\rangle\) over k, with the \(u_{i}\) units in k, is isotropic. Since every form q over k has a diagonal representation

$$\displaystyle{\langle u_{1},\ldots,u_{r}\rangle \perp \pi \langle v_{1},\ldots,v_{s}\rangle,}$$

with \(u_{i}\), \(v_{j}\) units, q is isotropic as soon as r or s is at least 3. In particular every 5-dimensional quadratic form over k is isotropic. Further, up to isometry, there is a unique quadratic form in dimension 4 which is anisotropic, namely,

$$\displaystyle{\langle 1,-u,-\pi,u\pi \rangle.}$$

This is the norm form of the unique quaternion division algebra H(u, π) over k (cf. Sect. 2.3).

2.3 Central Simple Algebras and the Brauer Group

Recall that a finite-dimensional algebra A over a field k is a central simple algebra over k if A is simple (has no nonzero proper two-sided ideals) and the center of A is k. Recall also that for a field k,

$$\displaystyle{\mathrm{Br}(k) = \left \{\mathrm{isomorphism\ classes\ of\ central\ simple\ algebras\ over}\ k\right \}/ \sim }$$

where the Brauer equivalence ∼ is given by: A ∼ B if and only if \(M_{n}(A) \cong M_{m}(B)\) for some integers m, n. The pair (Br(k), ⊗) is a group. The inverse of [A] is \([A^{\mathrm{op}}]\), where \(A^{\mathrm{op}}\) is the opposite algebra of A: the multiplication ∗ on \(A^{\mathrm{op}}\) is given by a ∗ b = ba. We have a k-algebra isomorphism \(\phi: A \otimes {A}^{\mathrm{op}}{ \sim \atop \longrightarrow } \mathrm{End}_{k}(A)\) induced by ϕ(a ⊗ b)(c) = acb. The identity element in Br(k) is given by [k]. By Wedderburn’s theorem on central simple algebras, the elements of Br(k) parametrize the isomorphism classes of finite-dimensional central division algebras over k.

For elements \(a,b \in k^{\ast}\), we define the quaternion algebra H(a, b) to be the 4-dimensional central simple algebra over k generated by {i, j} with the relations \(i^{2} = a\), \(j^{2} = b\), \(ij = -ji\). This is a generalization of Hamilton’s quaternion algebra \(H(-1,-1)\) over the field of real numbers. The algebra H(a, b) admits a canonical involution \(\bar{}\,: H(a,b) \rightarrow H(a,b)\) given by

$$\displaystyle{\overline{\alpha + i\beta + j\gamma + ij\delta } =\alpha -i\beta - j\gamma - ij\delta }$$

This involution gives an isomorphism \(H(a,b) \cong H(a,b)^{\mathrm{op}}\); in particular, the class of H(a, b) has order at most 2 in Br(k). Let \({}_{2}\mathrm{Br}(k)\) denote the 2-torsion subgroup of the Brauer group of k. The norm form for this algebra is given by \(N(x) = x\overline{x}\), which is a quadratic form on H(a, b) represented with respect to the orthogonal basis {1, i, j, ij} by \(\langle 1,-a,-b,ab\rangle =\langle \!\langle -a,-b\rangle \!\rangle\).
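The relations defining H(a, b) determine its multiplication table, and the norm form can then be checked on explicit elements. In the sketch below an element α + iβ + jγ + ijδ is stored as the tuple (α, β, γ, δ); the function names `qmul`, `conj` and `norm` are assumptions made for illustration.

```python
def qmul(a, b, x, y):
    """Product in H(a, b), using i^2 = a, j^2 = b, ij = -ji (so (ij)^2 = -ab)."""
    x0, x1, x2, x3 = x
    y0, y1, y2, y3 = y
    return (
        x0 * y0 + a * x1 * y1 + b * x2 * y2 - a * b * x3 * y3,
        x0 * y1 + x1 * y0 - b * x2 * y3 + b * x3 * y2,
        x0 * y2 + x2 * y0 + a * x1 * y3 - a * x3 * y1,
        x0 * y3 + x3 * y0 + x1 * y2 - x2 * y1,
    )

def conj(x):
    """Canonical involution: negate the i, j and ij coordinates."""
    return (x[0], -x[1], -x[2], -x[3])

def norm(a, b, x):
    """Norm form <1, -a, -b, ab> evaluated at x."""
    return x[0] ** 2 - a * x[1] ** 2 - b * x[2] ** 2 + a * b * x[3] ** 2

# Sample parameters and elements, chosen arbitrarily for illustration.
a, b = 2, 3
x, y = (1, 2, 3, 4), (5, -6, 7, 8)
x_times_conj = qmul(a, b, x, conj(x))   # equals (N(x), 0, 0, 0)
```

Since the involution reverses products, N(xy) = N(x)N(y); comparing `norm(a, b, qmul(a, b, x, y))` with `norm(a, b, x) * norm(a, b, y)` illustrates this on the sample elements.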

2.4 Classical Invariants for Quadratic Forms

Let (V, q) be a regular quadratic form. We define dim(q) = dim(V) and \(\dim _{2}(q) = \dim (V )\) modulo 2. We have a ring homomorphism \(\dim _{2}: W(k) \rightarrow \mathbb{Z}/2\mathbb{Z}\). We note that I(k) is the kernel of \(\dim _{2}\). This gives an isomorphism

$$\displaystyle{\dim _{2}: W(k)/I(k){ \sim \atop \longrightarrow } \mathbb{Z}/2\mathbb{Z}.}$$

Let \(\mathrm{disc}(q) = {(-1)}^{n(n-1)/2}[\det (A(q))] \in {k}^{{\ast}}/{k}^{{\ast}2}\). Since A(q) is determined up to congruence, det(A(q)) is determined modulo squares. We have disc(H) = 1, where H is the hyperbolic plane. The discriminant induces a group homomorphism

$$\displaystyle{\mathrm{disc}: I(k) \rightarrow {k}^{{\ast}}/{k}^{{\ast}2}}$$

which is clearly onto. It is easy to verify that \(\ker (\mathrm{disc}) = I^{2}(k)\). Thus the discriminant homomorphism induces an isomorphism \(I(k)/{I}^{2}(k) \rightarrow {k}^{{\ast}}/{k}^{{\ast}2}\).

Example 2.6.

Let ⟨a, b⟩ be a binary quadratic form. Then \(\mathrm{disc}\langle a,b\rangle = -ab\). The discriminant is trivial if and only if ⟨a, b⟩ ≅ ⟨1, −1⟩ is a hyperbolic plane. Further, if ⟨a, b⟩ represents a value \(c \in k^{\ast}\), then ⟨a, b⟩ ≅ ⟨c, abc⟩.

The next invariant for quadratic forms is the Clifford invariant. To each quadratic form (V, q) we wish to construct a central simple algebra containing V whose multiplication on elements of V satisfies v · v = q(v). The smallest such algebra (defined by a universal property) will be the Clifford algebra.

Definition 2.7.

The Clifford algebra C(q) of the quadratic form (V, q) is \(T(V )/I_{q}\), where \(I_{q}\) is the two-sided ideal in the tensor algebra T(V) generated by \(\{v \otimes v - q(v)\mid v \in V \}\).
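Over \(\mathbb{Q}\), a square class has a unique squarefree integer representative, so the discriminant of a diagonal form with integer entries can be computed concretely. The sketch below makes that assumption (nonzero integer entries); the function names `squarefree_part` and `disc` are chosen for illustration.

```python
def squarefree_part(n):
    """Squarefree integer representing the class of a nonzero integer n in Q*/Q*^2."""
    if n == 0:
        raise ValueError("0 has no square class")
    sign = -1 if n < 0 else 1
    n = abs(n)
    result = 1
    p = 2
    while p * p <= n:
        exponent = 0
        while n % p == 0:
            n //= p
            exponent += 1
        if exponent % 2 == 1:
            result *= p
        p += 1
    return sign * result * n   # any remaining factor n > 1 is a prime to the first power

def disc(diag):
    """disc<d_1, ..., d_n> = (-1)^{n(n-1)/2} d_1 ... d_n modulo squares."""
    n = len(diag)
    prod = 1
    for d in diag:
        prod *= d
    sign = -1 if (n * (n - 1) // 2) % 2 else 1
    return squarefree_part(sign * prod)
```

For example, `disc([1, -1])` is 1, in line with disc(H) = 1, while `disc([2, 3])` is −6, matching disc⟨a, b⟩ = −ab.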

The algebra C(q) has a \(\mathbb{Z}/2\mathbb{Z}\) gradation \(C(q) = C_{0}(q) \oplus C_{1}(q)\) induced by the gradation \(T(V ) = T_{0}(V ) \oplus T_{1}(V )\), where

$$\displaystyle{T_{0}(V ) =\bigoplus _{i\geq 0,\,\,i\ \mathrm{even}}{V }^{\otimes i}\quad \mathrm{and}\quad T_{1}(V ) =\bigoplus _{i\geq 1,\,\,i\ \mathrm{odd}}{V }^{\otimes i}.}$$

If dim(q) is even, then C(q) is a central simple algebra over k. If dim(q) is odd, \(C_{0}(q)\) is a central simple algebra over k. The Clifford algebra C(q) comes equipped with an involution τ defined by \(\tau (v) = -v\) for \(v \in V\). Thus, if dim(q) is even, C(q) determines a 2-torsion element in Br(k).

Definition 2.8.

The Clifford invariant c(q) of (V, q) in Br(k) is defined as

$$\displaystyle{c(q) = \left \{\begin{array}{@{}l@{\quad }l@{}} [C(q)], \quad &\mathrm{if}\ \dim (q)\ \mathrm{is\ even}\\ {} [C_{ 0}(q)],\quad &\mathrm{if}\ \dim (q)\ \mathrm{is\ odd} \end{array} \right.}$$

Example 2.9.

Let \(q\mathop{\cong} \perp _{i=1}^{n}\langle \!\langle - a_{i},-b_{i}\rangle \!\rangle \in {I}^{2}(k)\). Then

$$\displaystyle{c(q) = [\otimes _{1\leq i\leq n}H_{i}]}$$

where \(H_{i} = H(a_{i},b_{i})\).

Exercise 2.10.

Given \(\bigotimes _{1\leq i\leq n}H_{i}\), a tensor product of n quaternion algebras over k, show that there is a quadratic form q over k of dimension 2n + 2 such that \(c(q) = [\bigotimes _{1\leq i\leq n}H_{i}]\).

The Clifford invariant induces a homomorphism \(c: I^{2}(k) \rightarrow {}_{2}\mathrm{Br}(k)\), where \({}_{2}\mathrm{Br}(k)\) denotes the 2-torsion in the Brauer group of k. The very first case of the Milnor conjecture (see Sect. 3) states: c is surjective and \(\ker (c) = I^{3}(k)\).

Theorem 2.11 ( Merkurjev [M1]).

The map c induces an isomorphism

$$\displaystyle{{I}^{2}(k)/{I}^{3}(k)\mathop{\cong}_{ 2}\!\mathrm{Br}(k)}$$

Thus the image of \(I^{2}(k)\) in \({}_{2}\mathrm{Br}(k)\) is spanned by quaternion algebras. It was a longstanding question whether \({}_{2}\mathrm{Br}(k)\) is spanned by quaternion algebras. Merkurjev’s theorem answers this question in the affirmative; further, it gives precise relations between quaternion algebras in \({}_{2}\mathrm{Br}(k)\).

3 Galois Cohomology and the Milnor Conjecture

Let \(\bar{k}\) be a separable closure of k. Let \(\Gamma _{k} = \mathrm{Gal}(\bar{k}\vert k)\) be the absolute Galois group of k. The group \(\Gamma _{k}\) is a profinite group:

$$\displaystyle{\Gamma _{k} =\varprojlim _{L\subset \bar{k},\,L/k\ \mathrm{finite\ Galois}}\mathrm{Gal}(L/k).}$$

A discrete \(\Gamma _{k}\)-module M is a continuous \(\Gamma _{k}\)-module for the discrete topology on M and the profinite topology on \(\Gamma _{k}\). A \(\Gamma _{k}\)-module M is discrete if and only if the stabilizer of each \(m \in M\) is an open subgroup of \(\Gamma _{k}\) (in particular, of finite index). For a discrete \(\Gamma _{k}\)-module M, we define \(H^{n}(k,M)\) as the direct limit of the cohomology of the finite quotients

$$\displaystyle{{{H}^{n}}(k,M) = \mathop{\lim }\limits_{\longrightarrow \atop { L\subset \bar{k},\,L/k \mathrm{finite\ Galois}}}{H}^{n}(\mathrm{Gal}(L/k),{M}^{\Gamma _{L} }).}$$

Suppose char(k) ≠ 2 and \(M =\mu _{2}\). The module \(\mu _{2}\) has trivial \(\Gamma _{k}\) action and is isomorphic to \(\mathbb{Z}/2\mathbb{Z}\). We have

$$\displaystyle\begin{array}{rcl} & {H}^{0}(k, \mathbb{Z}/2\mathbb{Z}) = \mathbb{Z}/2\mathbb{Z}& {}\\ & {H}^{1}(k, \mathbb{Z}/2\mathbb{Z})\mathop{\cong}{k}^{{\ast}}/{k}^{{\ast}2} & {}\\ & {H}^{2}(k, \mathbb{Z}/2\mathbb{Z})\mathop{\cong}_{ 2}\!\mathrm{Br}(k) & {}\\ \end{array}$$

These can be seen from the Kummer exact sequence of \(\Gamma _{k}\)-modules:

$$\displaystyle{0\longrightarrow \mu _{2}\longrightarrow \bar{{k}}^{{\ast}}{ \cdot 2 \atop \longrightarrow } \bar{{k}}^{{\ast}}\longrightarrow 0}$$

and noting that \({H}^{1}(\Gamma _{k},\bar{{k}}^{{\ast}}) = 0\) (Hilbert’s Theorem 90) and \({H}^{2}(\Gamma _{k},\bar{{k}}^{{\ast}}) = \mathrm{Br}(k)\).

For an element \(a \in k^{\ast}\), we denote by (a) its class in \({H}^{1}(k, \mathbb{Z}/2\mathbb{Z})\) and for \(a_{1},\ldots,a_{n} \in {k}^{{\ast}}\), the cup product \((a_{1}) \cup \cdots \cup (a_{n}) \in {H}^{n}(k, \mathbb{Z}/2\mathbb{Z})\) is denoted by \((a_{1}) \cdot \cdots \cdot (a_{n})\).

For \(a,b \in k^{\ast}\), the element (a) · (b) represents the class of H(a, b) in \({}_{2}\mathrm{Br}(k)\). The map

$$\displaystyle{c: {I}^{2}(k) \rightarrow {H}^{2}(k, \mathbb{Z}/2\mathbb{Z})}$$

sends \(\langle 1,-a,-b,ab\rangle\) to the class of H(a, b) in \({H}^{2}(k, \mathbb{Z}/2\mathbb{Z})\). The forms \(\langle 1,-a,-b,ab\rangle\) additively generate \(I^{2}(k)\). Merkurjev’s theorem asserts that \({H}^{2}(k, \mathbb{Z}/2\mathbb{Z})\) is generated by the elements (a) · (b), with \(a,b \in k^{\ast}\). The Milnor conjecture (quadratic form version) proposes higher invariants \({I}^{n}(k) \rightarrow {H}^{n}(k, \mathbb{Z}/2\mathbb{Z})\) extending the classical invariants.

Milnor Conjecture.

The assignment

$$\displaystyle{\langle 1,a_{1}\rangle \otimes \cdots \otimes \langle 1,a_{n}\rangle \mapsto (a_{1}) \cdot \cdots \cdot (a_{n})}$$

yields a map \(e_{n}: P_{n}(k) \rightarrow {H}^{n}(k, \mathbb{Z}/2\mathbb{Z})\) . This map extends to a homomorphism \(e_{n}: {I}^{n}(k) \rightarrow {H}^{n}(k, \mathbb{Z}/2\mathbb{Z})\) which is onto and \(\ker (e_{n}) = {I}^{n+1}(k)\).

The maps dimension mod 2, discriminant and Clifford invariant coincide with \(e_{0}\), \(e_{1}\) and \(e_{2}\). Unlike these classical invariants, which are defined on all quadratic forms, conjecturally \(e_{n}\), n ≥ 3, are defined only on elements in \(I^{n}(k)\), on which the invariants \(e_{i}\), i ≤ n − 1, vanish. In 1975, Arason [Ar] proved that \(e_{3}: {I}^{3}(k) \rightarrow {H}^{3}(k, \mathbb{Z}/2\mathbb{Z})\) is well defined and is injective on \(P_{3}(k)\). As we mentioned earlier, the first nontrivial case of the Milnor conjecture was proved by Merkurjev for n = 2. The Milnor conjecture (quadratic form version) is now a theorem due to Orlov–Vishik–Voevodsky [OVV].

The Milnor conjecture gives a classification of quadratic forms by their Galois cohomology invariants: Given anisotropic quadratic forms \(q_{1}\) and \(q_{2}\), suppose \(e_{i}(q_{1} \perp -q_{2}) = 0\) for i ≥ 0. Then \(q_{1} = q_{2}\) in W(k). We need only verify \(e_{i}(q_{1} \perp -q_{2}) = 0\) for \(i \leq N\), where \(N \leq 2^{n}\) and \(\dim (q_{1} \perp -q_{2}) \leq {2}^{n}\), by the following theorem of Arason and Pfister.

Theorem 3.1 ( Arason–Pfister Hauptsatz).

Let k be a field. The dimension of an anisotropic quadratic form in \(I^{n}(k)\) is at least \(2^{n}\).

4 Pfister Forms

The theory of Pfister forms (or multiplicative forms, as Pfister called them) evolved from questions on classification of quadratic forms whose nonzero values form a group (hereditarily).

Definition 4.1.

A regular quadratic form q over k is called multiplicative if the nonzero values of q over any extension field L over k form a group.

We have the following examples of quadratic forms which are multiplicative.

Example 4.2.

⟨1⟩: nonzero squares are multiplicatively closed in \(k^{\ast}\).

Example 4.3.

⟨1, −a⟩: \(x^{2} - ay^{2}\), \(a \in k^{\ast}\), is the norm from the quadratic algebra \(k[t]/({t}^{2} - a)\) over k and the norm is multiplicative.

Example 4.4.

\(\langle 1,-a\rangle \otimes \langle 1,-b\rangle\): \({x}^{2} - a{y}^{2} - b{z}^{2} + ab{t}^{2}\) is a norm form from the quaternion algebra H(a, b): \(N(\alpha +i\beta + j\gamma + ij\delta ) =\alpha ^{2} - a\beta ^{2} - b\gamma ^{2} + ab\delta ^{2}\). The norm once again is multiplicative.

Example 4.5.

\(\langle 1,-a\rangle \otimes \langle 1,-b\rangle \otimes \langle 1,-c\rangle\): \(({x}^{2} - a{y}^{2} - b{z}^{2} + ab{t}^{2}) - c({u}^{2} - a{v}^{2} - b{w}^{2} + ab{s}^{2})\) is the norm form from an octonion algebra associated to the triple (a, b, c); it is a non-associative algebra obtained from the quaternion algebra H(a, b) by a doubling process (see [J, Sect. 7.6]). The norm is once again multiplicative.

Theorem 4.6 ( Pfister).

An anisotropic quadratic form q over k is multiplicative if and only if q is isomorphic to a Pfister form.

We shall sketch a proof of this theorem. The main ingredients are the Cassels–Pfister Theorem 4.7 and the Subform Theorem 4.10, which will not be proved in the text. We refer to [L, Chap. IX, Theorems 1.3 and 2.8] for the proofs.

Theorem 4.7 ( Cassels–Pfister).

Let \(q =\langle a_{1},\ldots,a_{n}\rangle\) be a regular quadratic form over k and f(X) ∈ k[X] a polynomial over k which is a value of q over k(X). Then there exist polynomials \(g_{1},\ldots,g_{n} \in k[X]\) such that \(f(X) = a_{1}g_{1}^{2} + \cdots + a_{n}g_{n}^{2}\).

Corollary 4.8 ( Specialization Lemma).

Let \(q =\langle a_{1},\ldots,a_{n}\rangle\) be a quadratic form over k, \(X =\{ X_{1},\ldots,X_{n}\}\), and p(X) ∈ k(X) a rational function represented by q over k(X). Then for any \(v \in k^{n}\) where p(v) is defined, p(v) is represented by q over k.

Proof.

We may assume, by multiplying p(X) by a square, that p(X) ∈ k[X]. Let \(p(X) = p_{1}(X_{n})\), where \(p_{1}\) is a polynomial in \(X_{n}\) with coefficients in \(k[X_{1},\ldots,X_{n-1}]\). By the Cassels–Pfister theorem, \(p_{1}(X_{n})\) is represented by q over \(k(X_{1},\ldots,X_{n-1})[X_{n}]\). Let \(v = (v_{1},\ldots,v_{n})\). Then specializing \(X_{n}\) to \(v_{n}\), we see that \(p_{1}(v_{n}) \in k[X_{1},\ldots,X_{n-1}]\) is represented by q over \(k(X_{1},\ldots,X_{n-1})\). By an induction argument, one concludes that \(p(v_{1},\ldots,v_{n})\) is a value of q over k. □

Corollary 4.9.

Let q be an anisotropic quadratic form over k of dimension n. Then q is multiplicative if and only if, for indeterminates \(X = (X_{1},\ldots,X_{n})\), \(Y = (Y _{1},\ldots,Y _{n})\), q(X) q(Y) is a value of q over \(k(X_{1},\ldots,X_{n},Y _{1},\ldots,Y _{n})\).

Proof.

The only non-obvious part is “if”. Suppose L/k is a field extension and \(v,w \in L^{n}\). Let q(v) = c and q(w) = d. Since q(X) q(Y) is a value of q over k(X, Y), by the Specialization Lemma, q(X) q(w) is a value of q over L(X) and, by the same lemma, q(v) q(w) is a value of q over L. □

Theorem 4.10 ( Subform Theorem).

Let \(q =\langle a_{1},\ldots,a_{n}\rangle\), \(\gamma =\langle b_{1},\ldots,b_{m}\rangle\) be quadratic forms over k with q anisotropic. Then γ is a subform of q (i.e., \(q \cong \gamma \perp \gamma '\) for some form γ′ over k) if and only if \(b_{1}X_{1}^{2} + \cdots + b_{m}X_{m}^{2}\) is a value of q over \(k(X_{1},\ldots,X_{m})\).

Corollary 4.11.

Let q be an anisotropic quadratic form over k of dimension n. Let \(X =\{ X_{1},\ldots,X_{n}\}\) be a list of n indeterminates. Then q is multiplicative if and only if \(q \cong q(X)\,q\) over k(X).

Proof.

Suppose \(q \cong q(X)\,q\) over k(X). Let A be the matrix representing q over k. There exists \(W \in \mathrm{GL}_{n}(k(X))\) such that \(q(X)A = WAW^{t}\). Let \(Y =\{ Y _{1},\ldots,Y _{n}\}\) be a list of n indeterminates. Over k(X, Y),

$$\displaystyle{q(X)\,q(Y ) = Y (q(X)A){Y }^{t} = (Y W)A{(Y W)}^{t} = q(Z)}$$

where Z = Y W. Thus q(X) q(Y ) is a value of q over k(X, Y ) and by Corollary 4.9, q is multiplicative.

Suppose conversely that q is multiplicative. Then q(X) q(Y) is a value of q over k(X, Y). By the Subform Theorem, q(X) q is a subform of q over k(X). A dimension count yields \(q \cong q(X)\,q\). □

Proof of Pfister’s Theorem 4.6.

Let \(q =\langle 1,a_{1}\rangle \otimes \cdots \otimes \langle 1,a_{n}\rangle\) be an anisotropic quadratic form over k. Over any field extension L/k, q is either an anisotropic Pfister form or isotropic, in which case it is universal. Thus it suffices to show that the nonzero values of q form a subgroup of \(k^{\ast}\) for any anisotropic n-fold Pfister form q. The proof is by induction on n; for n = 1, q is the norm form from a quadratic extension of k (see Example 4.3) and we are done. Let n ≥ 2. We have \(q\mathop{\cong}q_{1} \perp a_{n}q_{1}\), where \(q_{1} =\langle 1,a_{1}\rangle \otimes \cdots \otimes \langle 1,a_{n-1}\rangle\) is an anisotropic (n − 1)-fold Pfister form. Let \(X =\{ X_{1},\ldots,X_{{2}^{n-1}}\}\), \(Y =\{ Y _{1},\ldots,Y _{{2}^{n-1}}\}\) be two lists of \(2^{n-1}\) indeterminates. Since \(q_{1}\) is multiplicative, by Corollary 4.11, \(q_{1}(X)\,q_{1}\mathop{\cong}q_{1}\) over k(X) and \(q_{1}(Y )\,q_{1}\mathop{\cong}q_{1}\) over k(Y). We have, over k(X, Y),

$$\displaystyle{q\mathop{\cong}q_{1}(X)\,q_{1} \perp a_{n}q_{1}(Y )\,q_{1}\mathop{\cong}\langle q_{1}(X),a_{n}q_{1}(Y )\rangle \otimes q_{1}.}$$

Since \(q(X,Y ) = q_{1}(X) + a_{n}q_{1}(Y )\), \(\langle q_{1}(X),a_{n}q_{1}(Y )\rangle\) represents q(X, Y ). Therefore, by a comparison of discriminants,

$$\displaystyle\begin{array}{rcl} \langle q_{1}(X),a_{n}q_{1}(Y )\rangle & \mathop{\cong}& \langle q(X,Y ),a_{n}q(X,Y )q_{1}(X)q_{1}(Y )\rangle {}\\ & \mathop{\cong}& q(X,Y )(1 \perp a_{n}q_{1}(X)q_{1}(Y )) {}\\ \end{array}$$

In particular,

$$\displaystyle\begin{array}{rcl} q& \mathop{\cong}& q(X,Y )\langle 1,a_{n}q_{1}(X)q_{1}(Y )\rangle \otimes q_{1} {}\\ & \mathop{\cong}& q(X,Y )(q_{1} \perp a_{n}q_{1}) {}\\ & \mathop{\cong}& q(X,Y )\,q {}\\ \end{array}$$

Thus by Corollary 4.11, q is multiplicative.

Conversely, let q be an anisotropic quadratic form over k which is multiplicative. Let n be the largest integer such that q contains an n-fold Pfister form \(q_{1} =\langle 1,a_{1}\rangle \otimes \cdots \otimes \langle 1,a_{n}\rangle\) as a subform. Suppose \(q \cong q_{1} \perp \gamma\), \(\gamma =\langle b_{1},\ldots,b_{m}\rangle\), with m ≥ 1. Let \(Z =\{ Z_{1},\ldots,Z_{{2}^{n}}\}\). Over k(Z),

$$\displaystyle{q\mathop{\cong}q(Z,0)\,q\mathop{\cong}q_{1}(Z)(q_{1} \perp \gamma )\mathop{\cong}q_{1}(Z)\,q_{1} \perp q_{1}(Z)\,\gamma \mathop{\cong}q_{1} \perp q_{1}(Z)\,\gamma.}$$

By Witt’s cancellation, \(\gamma \cong q_{1}(Z)\,\gamma\) over k(Z). Thus γ represents \(b_{1}\,q_{1}(Z)\) over k(Z) and by the Subform Theorem, \(\gamma \mathop{\cong}b_{1}\,q_{1} \perp \gamma _{1}\). Then \(q\mathop{\cong}q_{1} \perp b_{1}\,q_{1} \perp \gamma _{1}\mathop{\cong}\langle 1,b_{1}\rangle \otimes q_{1} \perp \gamma _{1}\) contains an (n + 1)-fold Pfister form \(\langle 1,b_{1}\rangle \otimes q_{1}\), contradicting the maximality of n. Thus \(q \cong q_{1}\). □

An important property of Pfister forms is stated in the following.

Proposition 4.12.

Let ϕ be an n-fold Pfister form. If ϕ is isotropic then ϕ is hyperbolic.

Proof.

Suppose ϕ is isotropic but not hyperbolic. Let \(\phi = r\,\langle 1,-1\rangle \perp \phi _{0}\), with \(\phi _{0}\) anisotropic, \(\dim (\phi _{0}) \geq 1\) and r ≥ 1. Let dim(ϕ) = m and \(X =\{ X_{1},\ldots,X_{m}\}\) be a list of m indeterminates. Over \(k(X_{1},\ldots,X_{m})\)

$$\displaystyle{r\,\langle 1,-1\rangle \perp \phi _{0} =\phi \mathop{\cong}\phi (X_{1},\ldots,X_{m})\,\phi \mathop{\cong}r\,\langle 1,-1\rangle \perp \phi (X_{1},\ldots,X_{m})\,\phi _{0}.}$$

By Witt’s cancellation theorem

$$\displaystyle{\phi _{0}\mathop{\cong}\phi (X_{1},\ldots,X_{m})\,\phi _{0}.}$$

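If b is a value of \(\phi _{0}\), then \(b\,\phi (X_{1},\ldots,X_{m})\) is a value of \(\phi _{0}\) over \(k(X_{1},\ldots,X_{m})\) and, by the Subform Theorem, bϕ is a subform of \(\phi _{0}\), contradicting \(\dim (\phi _{0}) <\dim (\phi )\). Thus \(\phi \cong r\,\langle 1,-1\rangle\) is hyperbolic. □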

Corollary 4.13.

The only integers n such that a product of sums of n squares is again a sum of n squares over every field of characteristic zero are \(n = 2^{m}\), m ≥ 0.

Proof.

Consider the quadratic form \(\phi _{n} = x_{1}^{2} + x_{2}^{2} + \cdots + x_{n}^{2}\) over \(\mathbb{Q}\). The form \(\phi _{n}\) is anisotropic. The condition that a product of sums of n squares is again a sum of n squares over any field of characteristic zero is equivalent to \(\phi _{n}\) being a Pfister form. Thus \(\dim (\phi _{n}) = n = {2}^{m}\) for some m. □

5 Level of a Field

Definition 5.1.

The level of a field k is the least positive integer n such that − 1 is a sum of n squares in k. We denote the level of k by s(k).

If the field is formally real (i.e., − 1 is not a sum of squares), then the level is defined to be infinite. It was a longstanding open question since the 1950s whether the level of a field, if finite, is always a power of 2. Pfister’s theory of quadratic forms leads to an affirmative answer to this question.

Theorem 5.2 ( [Pf1]).

The level of a field is a power of 2 if it is finite.

Proof.

Let n = s(k). We choose an integer m such that \(2^{m} \leq n < 2^{m+1}\). Suppose

$$\displaystyle{ -1 = (u_{1}^{2} + u_{2}^{2} + \cdots + u_{{2}^{m}}^{2}) + (u_{{2}^{m}+1}^{2} + \cdots + u_{n}^{2}) }$$
(5.3)

The element \(u_{1}^{2} + u_{2}^{2} + \cdots + u_{{2}^{m}}^{2}\neq 0\) since \(s(k) \geq 2^{m}\). Every ratio of sums of \(2^{m}\) squares is again a sum of \(2^{m}\) squares since \(\langle 1,1\rangle ^{\otimes m}\) is a multiplicative form. Thus, from (5.3) we see that

$$\displaystyle\begin{array}{rcl} 0& =& 1 + \frac{u_{{2}^{m}+1}^{2} + \cdots + u_{n}^{2} + 1} {u_{1}^{2} + \cdots + u_{{2}^{m}}^{2}} {}\\ & =& 1 + (v_{1}^{2} + \cdots + v_{{ 2}^{m}}^{2}) {}\\ \end{array}$$

Therefore, \(-1 = v_{1}^{2} + \cdots + v_{{2}^{m}}^{2}\), so that \(s(k) \leq {2}^{m}\); since \(s(k) = n \geq {2}^{m}\), we conclude \(s(k) = {2}^{m}\). □

Remark 5.4.

There exist fields with level \({2}^{n}\) for any n ≥ 1. For instance, \(\mathbb{R}(X_{1},\ldots,X_{{2}^{n}})(\sqrt{-(X_{1 }^{2 } + \cdots + X_{{2}^{n } }^{2 })}\,)\) is a field of level \({2}^{n}\) (cf. [L], Sect. XI.2).

Exercise 5.5.

Let k be a p-adic field with p≠2 and with residue field \(\mathbb{F}_{q}\). Prove the following:

  1. (1)

    s(k) = 1 if q ≡ 1 (mod 4).

  2. (2)

    s(k) = 2 if q ≡ − 1 (mod 4).
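The residue-field side of Exercise 5.5 can be explored numerically. The sketch below computes, by brute force, the level of a prime field \(\mathbb{F}_{p}\) for an odd prime p (so q = p); passing from the residue field to the p-adic field k is the content of the exercise and is not done by the code. The function name is an illustrative assumption.

```python
def level_of_prime_field(p):
    """Least n such that -1 is a sum of n squares in F_p (p an odd prime)."""
    squares = {(x * x) % p for x in range(p)}  # includes 0 = 0^2
    target = (-1) % p
    # reachable = elements that are sums of n squares (zeros allowed),
    # i.e. sums of at most n squares.
    reachable = set(squares)
    n = 1
    while target not in reachable:
        reachable = {(a + b) % p for a in reachable for b in squares}
        n += 1
    return n


# Levels of the first few odd prime fields, keyed by p.
levels = {p: level_of_prime_field(p) for p in (3, 5, 7, 11, 13, 17, 19)}
```

In this sample the level is 1 exactly when p ≡ 1 (mod 4) and 2 otherwise, in line with the two cases of the exercise.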

6 The u-Invariant

Definition 6.1.

The u-invariant of a field k, denoted by u(k), is defined to be the largest integer n such that every (n + 1)-dimensional quadratic form over k is isotropic and there is an anisotropic form in dimension n over k; if no such integer exists, the u-invariant is said to be infinite. In other words,

$$\displaystyle{u(k) =\max \,\{\dim (q)\,:\, q\mathrm{\ anisotropic\ form\ over\ }k\}.}$$

If k admits an ordering, then sums of nonzero squares are never zero and there is a refined u-invariant for fields with orderings, due to Elman–Lam [EL]. In this article, we do not discuss this refined invariant.

Example 6.2.

  1. (1)

    \(u(\mathbb{F}_{q}) = 2\), if q is odd.

  2. (2)

    u(k(X)) = 2, if k is algebraically closed and X is an integral curve over k (Tsen’s theorem).

  3. (3)

    u(k) = 4 for k a p-adic field. For p≠2, see Sect. 2.2. For p = 2, see [L, Sect. XI.6].

  4. (4)

    u(k) = 4 for k a totally imaginary number field. This follows from the Hasse–Minkowski theorem.

  5. (5)

    Suppose u(k) = n < ∞. Let k((t)) denote the field of Laurent series over k. Then u(k((t))) = 2n. In fact, the square classes in k((t)) are \(\{u_{\alpha },tu_{\alpha }\}_{\alpha \in I}\) where \(\{u_{\alpha }\}_{\alpha \in I}\) are the square classes in \({k}^{{\ast}}\). As in the p-adic field case, every form over k((t)) is isometric to \(\langle u_{1},\ldots,u_{r}\rangle \perp t\langle v_{1},\ldots,v_{s}\rangle\), \(u_{i},v_{i} \in {k}^{{\ast}}\) and this form is anisotropic if and only if \(\langle u_{1},\ldots,u_{r}\rangle\) and \(\langle v_{1},\ldots,v_{s}\rangle\) are anisotropic.

  6. (6)

    More generally, if K is a complete discretely valued field with residue field κ of u-invariant n, then u(K) = 2n. For the case char(κ) = 2, we refer to [Sp].
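Item (1) of Example 6.2 can be checked by brute force for small prime fields. The following Python sketch verifies that over \(\mathbb{F}_{p}\) the binary form ⟨1, −u⟩ with u a nonsquare is anisotropic while every diagonal ternary form is isotropic. It only treats prime fields and diagonal forms (which suffices in characteristic ≠ 2), and the function names are illustrative assumptions.

```python
from itertools import product


def is_isotropic(coeffs, p):
    """True if sum(c_i * x_i^2) = 0 has a nonzero solution in F_p^n."""
    n = len(coeffs)
    for x in product(range(p), repeat=n):
        if any(x) and sum(c * t * t for c, t in zip(coeffs, x)) % p == 0:
            return True
    return False


def check_u_invariant_is_two(p):
    """Check u(F_p) = 2 for an odd prime p by exhaustive search."""
    units = range(1, p)
    squares = {(x * x) % p for x in units}
    nonsquare = next(a for a in units if a not in squares)
    # <1, -u> with u a nonsquare has no nontrivial zero.
    binary_anisotropic = not is_isotropic((1, -nonsquare), p)
    # Every diagonal form <a, b, c> with a, b, c nonzero has one.
    ternary_isotropic = all(
        is_isotropic(c, p) for c in product(units, repeat=3)
    )
    return binary_anisotropic and ternary_isotropic
```

The search is exponential in the dimension and is only meant for very small p.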

Definition 6.3.

A field k is \(\boldsymbol{C}_{i}\) if every homogeneous polynomial in N variables of degree d with \(N > {d}^{i}\) has a nontrivial zero.

Example 6.4.

Finite fields and function fields in one variable over algebraically closed fields are C 1.

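If k is a \(C_{i}\) field, then \(u(k) \leq {2}^{i}\): a quadratic form is a homogeneous polynomial of degree 2, so any form in more than \({2}^{i}\) variables is isotropic. Further, the property \(C_{i}\) behaves well with respect to function field extensions. If l∕k is finite and k is \(C_{i}\) then l is \(C_{i}\); further, if \(t_{1},\ldots,t_{n}\) are indeterminates, \(k(t_{1},\ldots,t_{n})\) is \(C_{i+n}\).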

Example 6.5.

The u-invariant of transcendental extensions:

  1. (1)

    \(u(k(t_{1},\ldots,t_{n})) = {2}^{n}\) if k is algebraically closed. In fact,

    $$\displaystyle{u(k(t_{1},\ldots,t_{n})) \leq {2}^{n}}$$

    since \(k(t_{1},\ldots,t_{n})\) is a \(C_{n}\) field. Further, the form

    $$\displaystyle{\langle \!\langle t_{1},\ldots,t_{n}\rangle \!\rangle =\langle 1,t_{1}\rangle \otimes \cdots \otimes \langle 1,t_{n}\rangle }$$

    is anisotropic over \(k((t_{1}))((t_{2}))\ldots ((t_{n}))\) and hence also over \(k(t_{1},\ldots,t_{n})\).

  2. (2)

    \(u(\mathbb{F}_{q}(t_{1},\ldots,t_{n})) = {2}^{n+1}\) if q is odd.
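The Pfister form \(\langle \!\langle t_{1},\ldots,t_{n}\rangle \!\rangle\) used in item (1) has \({2}^{n}\) diagonal entries, one for each product of a subset of the \(t_{i}\). A small Python sketch that lists these entries symbolically; the tuple encoding of monomials is an assumption made for illustration, with the empty tuple standing for 1.

```python
def pfister_diagonal(gens):
    """Diagonal entries of <<g_1, ..., g_n>> = <1, g_1> (x) ... (x) <1, g_n>.

    Each entry is returned as a tuple of generator names whose product
    is the coefficient; the empty tuple stands for the coefficient 1.
    """
    entries = [()]
    for g in gens:
        # Tensoring with <1, g> keeps every entry and adds g times it.
        entries = entries + [e + (g,) for e in entries]
    return entries


two_fold = pfister_diagonal(["t1", "t2"])
three_fold = pfister_diagonal(["t1", "t2", "t3"])
```

Here `two_fold` encodes \(\langle 1,t_{1},t_{2},t_{1}t_{2}\rangle\), and the length of the list doubles with each additional generator.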

All fields of known u-invariant in the 1950s happened to have u-invariant a power of 2. Kaplansky raised the question whether the u-invariant of a field is always a power of 2.

Proposition 6.6.

The u-invariant does not take the values 3,5,7.

Proof.

Let q be an anisotropic form of dimension 3. By scaling, we may assume that q ≅ ⟨1, a, b⟩. Then the form ⟨1, a, b, ab⟩ is anisotropic; if ⟨1, a, b, ab⟩ is isotropic, it is hyperbolic and Witt’s cancellation yields \(\langle a,b,ab\rangle \mathop{\cong}\langle 1,-1,-1\rangle\) which is isotropic, so that q ≅ a⟨a, b, ab⟩ is isotropic, leading to a contradiction. Thus u(k) ≠ 3.

Let u(k) < 8. Every three-fold Pfister form (which has dimension 8) is isotropic and hence hyperbolic. Thus \({I}^{3}(k)\), which is generated by three-fold Pfister forms, is zero. Let \(q \in {I}^{2}(k)\) be any quadratic form. For any \(c \in {k}^{{\ast}}\), \(\langle 1,-c\rangle \otimes q \in {I}^{3}(k)\) is zero and cq is Witt equivalent to q, hence isometric to q by Witt’s cancellation. We conclude that every quadratic form whose class is in \({I}^{2}(k)\) is universal.

Suppose u(k) = 5 or 7. Let q be an anisotropic form of dimension u(k). Since every form in dimension u(k) + 1 is isotropic, if disc(q) = d, q ⊥ ⟨−d⟩ is isotropic and therefore q represents d. We may write \(q\mathop{\cong}q_{1} \perp \langle d\rangle\) where \(q_{1}\) is even-dimensional with trivial discriminant. Hence \([q_{1}] \in {I}^{2}(k)\) so that \(q_{1}\) is universal. This in turn implies that \(q_{1} \perp \langle d\rangle \mathop{\cong}q\) is isotropic, leading to a contradiction. □

In the 1990s Merkurjev [M2] constructed examples of fields k with u(k) = 2n for any n ≥ 1, n = 3 being the first open case, answering Kaplansky’s question in the negative. Since then, it has been shown that the u-invariant could be odd. In [I], Izhboldin proves that there exist fields k with u(k) = 9 and in [V] Vishik has shown that there exist fields k with \(u(k) = {2}^{r} + 1\) for all r ≥ 3.

Merkurjev’s construction yields fields k which are not of arithmetic type, i.e., not finitely generated over a number field or a p-adic field. It is still an interesting question whether u(k) is a power of 2 if k is of arithmetic type.

The behavior of the u-invariant is very little understood under rational function field extensions. For instance, it is an open question if u(k) < ∞ implies u(k(t)) < ∞ for the rational function field in one variable over k. This was unknown for \(k = \mathbb{Q}_{p}\) until the late 1990s. Conjecturally, \(u(\mathbb{Q}_{p}(t)) = 8\), in analogy with the positive characteristic local field case; the field \(\mathbb{F}_{p}((X))(t)\) is \(C_{3}\) (see [G]) so that \(u(\mathbb{F}_{p}((X))(t)) \leq 8\) for p odd. If u is a nonsquare in \(\mathbb{F}_{p}\), \(\langle 1,-u\rangle \otimes \langle 1,-X\rangle \otimes \langle 1,-t\rangle\) is anisotropic over \(\mathbb{F}_{p}((X))(t)\), so that \(u(\mathbb{F}_{p}((X))(t)) = 8\).

We indicate some ways of bounding the u-invariant of a field k once we know how efficiently the Galois cohomology groups \({H}^{n}(k, \mathbb{Z}/2\mathbb{Z})\) are generated by symbols for all n.

We set

$$\displaystyle{H_{\mathrm{dec}}^{n}(k, \mathbb{Z}/2\mathbb{Z}) =\{ (a_{ 1}) \cdot \cdots \cdot (a_{n}): a_{i} \in {k}^{{\ast}}\}}$$

and call elements in this set symbols. By Voevodsky’s theorem on the Milnor conjecture, \({H}^{n}(k, \mathbb{Z}/2\mathbb{Z})\) is additively generated by \(H_{\mathrm{dec}}^{n}(k, \mathbb{Z}/2\mathbb{Z})\).

Proposition 6.7.

Let k be a field such that \({H}^{n+1}(k, \mathbb{Z}/2\mathbb{Z}) = 0\) and for 2 ≤ i ≤ n, there exist integers N i such that every element in \({H}^{i}(k, \mathbb{Z}/2\mathbb{Z})\) is a sum of N i symbols. Then u(k) is finite.

Proof.

Let q be a quadratic form over k of dimension m and discriminant d. Let \(q_{1} =\langle d\rangle\) if m is odd and \(\langle 1,-d\rangle\) if m is even. Then \(q \perp -q_{1}\) has even dimension and trivial discriminant. Hence \(q \perp -q_{1} \in {I}^{2}(k)\). Let \(e_{2}(q \perp -q_{1}) =\sum _{j\leq N_{2}}\xi _{2j}\) where \(\xi _{2j} \in H_{\mathrm{dec}}^{2}(k, \mathbb{Z}/2\mathbb{Z})\). Let \(\phi _{2j}\) be two-fold Pfister forms such that \(e_{2}(\phi _{2j}) =\xi _{2j}\). Then \(q_{2} =\sum _{j\leq N_{2}}\phi _{2j}\) has dimension at most \(4N_{2}\) and \(e_{2}(q \perp -q_{1} \perp -q_{2}) = 0\) and \(q \perp -q_{1} \perp -q_{2} \in {I}^{3}(k)\), by Merkurjev’s theorem. Repeating this process and using the Milnor conjecture, we get \(q_{i} \in {I}^{i}(k)\) which is a sum of \(N_{i}\) i-fold Pfister forms and \(q -\sum _{1\leq i\leq n}q_{i} \in {I}^{n+1}(k) = 0\), since \({H}^{n+1}(k, \mathbb{Z}/2\mathbb{Z}) = 0\). Thus \([q] =\sum _{1\leq i\leq n}q_{i}\) and \(\dim (q_{an}) \leq \sum _{1\leq i\leq n}{2}^{i}N_{i}\), where we set \(N_{1} = 1\) since \(\dim (q_{1}) \leq 2\). Thus \(u(k) \leq \sum _{1\leq i\leq n}{2}^{i}N_{i}\). □

Definition 6.8.

A field k is said to have cohomological dimension at most \(\boldsymbol{n}\) (in symbols, cd(k) ≤ n) if \({H}^{i}(k,M) = 0\) for i ≥ n + 1 for all finite discrete \(\Gamma _{k}\)-modules M (cf. [Se, §3]).

Example 6.9.

Finite fields and function fields in one variable over algebraically closed fields have cohomological dimension 1. Totally imaginary number fields and p-adic fields are of cohomological dimension 2. If k is a p-adic field, and k(X) a function field in one variable over k, cd(k(X)) ≤ 3. In particular, \({H}^{4}(k(X), \mathbb{Z}/2\mathbb{Z}) = 0\).

Theorem 6.10 ( Saltman [Sa]).

Let k be a non-dyadic p-adic field and k(X) a function field in one variable over k. Every element in \({H}^{2}(k(X), \mathbb{Z}/2\mathbb{Z})\) is a sum of two symbols.

Theorem 6.11 ( Parimala–Suresh [PS1]).

Let k(X) be as in the previous theorem. Then every element in \({H}^{3}(k(X), \mathbb{Z}/2\mathbb{Z})\) is a symbol.

Corollary 6.12.

For k(X) as above, \(u(k(X)) \leq 2 + 8 + 8 = 18\).
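The number 18 in Corollary 6.12 is the bound \(\sum _{i}{2}^{i}N_{i}\) of Proposition 6.7 evaluated with the symbol lengths from Theorems 6.10 and 6.11. A short Python sketch of this arithmetic follows; the function name is hypothetical, and the value \(N_{1} = 1\) is an assumption read off from the proof of Proposition 6.7, where the form \(q_{1}\) has dimension at most 2.

```python
def u_invariant_bound(symbol_lengths):
    """Evaluate the bound sum_i 2^i * N_i from Proposition 6.7.

    symbol_lengths maps a degree i to N_i, the number of symbols needed
    to write an arbitrary class in H^i(k, Z/2Z).
    """
    return sum((2 ** i) * n_i for i, n_i in symbol_lengths.items())


# N_1 = 1 (assumed from the proof), N_2 = 2 (Saltman),
# N_3 = 1 (Parimala-Suresh).
bound_for_p_adic_curves = u_invariant_bound({1: 1, 2: 2, 3: 1})
```

The three contributions are 2, 8 and 8, matching the sum displayed in the corollary.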

It is not hard to show from the above theorems that u(k(X)) ≤ 12. With some further work it was proved in [PS1] that u(k(X)) ≤ 10. More recently in [PS2] the estimated value u(k(X)) = 8 was proved. For an alternate approach to u(k(X)) = 8, we refer to [HH, HHK, CTPS]. More recently, Heath-Brown and Leep [HB] have proved the following spectacular theorem: If k is any p-adic field and k(X) the function field in n variables over k, then \(u(k(X)) = {2}^{n+2}\).

7 Hilbert’s Seventeenth Problem

An additional reference for sums of squares is [C].

Definition 7.1.

An element \(f \in \mathbb{R}(X_{1},\ldots,X_{n})\) is called positive semi-definite if f(a) ≥ 0 for all \(a = (a_{1},\ldots,a_{n}) \in {\mathbb{R}}^{n}\) where f is defined.

Hilbert’s seventeenth problem:

Let \(\mathbb{R}(X_{1},\ldots,X_{n})\) be the rational function field in n variables over the field \(\mathbb{R}\) of real numbers. Hilbert’s seventeenth problem asks whether every positive semi-definite \(f \in \mathbb{R}(X_{1},\ldots,X_{n})\) is a sum of squares in \(\mathbb{R}(X_{1},\ldots,X_{n})\). E. Artin settled this question in the affirmative and Pfister gave an effective version of Artin’s result (cf. [Pf, Chap. 6]).

Theorem 7.2 ( Artin, Pfister).

Every positive semi-definite function \(f \in \mathbb{R}(X_{1},\ldots,X_{n})\) can be written as a sum of \({2}^{n}\) squares in \(\mathbb{R}(X_{1},\ldots,X_{n})\).

For n ≤ 2 the above was due to Hilbert himself. If one asks for expressions of positive semi-definite polynomials in \(\mathbb{R}[X_{1},\ldots,X_{n}]\) as sums of \({2}^{n}\) squares in \(\mathbb{R}[X_{1},\ldots,X_{n}]\), there are counterexamples for n = 2; the Motzkin polynomial

$$\displaystyle{f(X_{1},X_{2}) = 1 - 3X_{1}^{2}X_{ 2}^{2} + X_{ 1}^{4}X_{ 2}^{2} + X_{ 1}^{2}X_{ 2}^{4}}$$

is positive semi-definite but not a sum of squares in \(\mathbb{R}[X_{1},X_{2}]\). In fact, Pfister’s result has the following precise formulation.
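Nonnegativity of the Motzkin polynomial follows from the arithmetic-geometric mean inequality applied to the three terms 1, \(X_{1}^{4}X_{2}^{2}\) and \(X_{1}^{2}X_{2}^{4}\), whose geometric mean is \(X_{1}^{2}X_{2}^{2}\). The Python sketch below is only a numerical sanity check on a rational grid, not a proof; the grid and the names are arbitrary choices made for illustration.

```python
from fractions import Fraction


def motzkin(x, y):
    """Evaluate 1 - 3 x^2 y^2 + x^4 y^2 + x^2 y^4."""
    return 1 - 3 * x**2 * y**2 + x**4 * y**2 + x**2 * y**4


# Exact rational grid on [-3, 3] x [-3, 3] with step 1/4.
grid = [Fraction(i, 4) for i in range(-12, 13)]
minimum = min(motzkin(x, y) for x in grid for y in grid)
zeros = [(x, y) for x in grid for y in grid if motzkin(x, y) == 0]
```

On this grid the minimum is 0, attained exactly at the four points (±1, ±1), where the three terms coincide.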

Theorem 7.3 ( Pfister).

Let \(\mathbb{R}(X)\) be a function field in n variables over \(\mathbb{R}\) . Then every n-fold Pfister form in \(\mathbb{R}(X)\) represents every sum of squares in \(\mathbb{R}(X)\).

We sketch a proof of this theorem below.

Definition 7.4.

Let ϕ be an n-fold Pfister form with ϕ = 1 ⊥ ϕ ′. The form ϕ ′ is called the pure subform of ϕ.

Proposition 7.5 ( Pure Subform Theorem).

Let k be any field of characteristic not 2, ϕ an anisotropic n-fold Pfister form over k and ϕ′ its pure subform. If \(b_{1}\) is any value of ϕ′, then \(\phi \mathop{\cong}\langle \!\langle b_{1},\ldots,b_{n}\rangle \!\rangle\) for some \(b_{2},\ldots,b_{n} \in {k}^{{\ast}}\).

Proof.

The proof is by induction on n; for n = 1 the statement is clear. Let n > 1. We assume the statement holds for all (n − 1)-fold Pfister forms. Let \(\phi =\langle \!\langle a_{1},\ldots,a_{n}\rangle \!\rangle\), \(\psi =\langle \!\langle a_{1},\ldots,a_{n-1}\rangle \!\rangle\), and let ϕ′, ψ′ denote the pure subforms of ϕ and ψ respectively. We have \(\phi =\psi \perp a_{n}\psi\), \(\phi ^{\prime} =\psi ^{\prime}\perp a_{n}\psi\). Let \(b_{1}\) be a value of ϕ′. We may write \(b_{1} = b_{1}^{\prime} + a_{n}b\), with \(b_{1}^{\prime}\) a value of ψ′ and b a value of ψ. The only nontrivial case to discuss is when b ≠ 0 and \(b_{1}^{\prime}\neq 0\). By induction, \(\psi \mathop{\cong}\langle \!\langle b_{1}^{\prime},b_{2},\ldots,b_{n-1}\rangle \!\rangle\) and, since ψ is multiplicative, \(b\psi \mathop{\cong}\psi\). We thus have

$$\displaystyle\begin{array}{rcl} \phi & \mathop{\cong}& \langle \!\langle b_{1}^{\prime},b_{2},\ldots,b_{n-1},a_{n}\rangle \!\rangle \mathop{\cong}\langle \!\langle b_{1}^{\prime},b_{2},\ldots,b_{n-1},a_{n}b\rangle \!\rangle {}\\ & \mathop{\cong}& \langle \!\langle b_{1}^{\prime},a_{n}b\rangle \!\rangle \otimes \langle \!\langle b_{2},\ldots,b_{n-1}\rangle \!\rangle {}\\ \end{array}$$

Since \(b_{1} = b_{1}^{\prime} + a_{n}b\), \(\langle b_{1}^{\prime},a_{n}b\rangle \mathop{\cong}\langle b_{1},b_{1}b_{1}^{\prime}a_{n}b\rangle\) and we have

$$\displaystyle\begin{array}{rcl} \langle \!\langle b_{1}^{\prime},a_{n}b\rangle \!\rangle & =& \langle 1,b_{1}^{\prime},a_{n}b,a_{n}bb_{1}^{\prime}\rangle {}\\ & =& \langle 1,b_{1},b_{1}b_{1}^{\prime}a_{n}b,a_{n}bb_{1}^{\prime}\rangle {}\\ & =& \langle \!\langle b_{1},c_{1}\rangle \!\rangle, {}\\ \end{array}$$

where \(c_{1} = b_{1}b_{1}^{\prime}a_{n}b\). Thus,

$$\displaystyle{\phi \mathop{\cong}\langle \!\langle b_{1},c_{1},b_{2},\cdots \,,b_{n-1}\rangle \!\rangle.}$$

Proof of Pfister’s Theorem 7.3.

Let ϕ be an anisotropic n-fold Pfister form over \(K = \mathbb{R}(X)\). Let \(b = b_{1}^{2} + \cdots + b_{m}^{2}\), \(b_{i} \in {K}^{{\ast}}\). We show that ϕ represents b by induction on m. For m = 1, b is a square and is represented by ϕ. Suppose m = 2, \(b = b_{1}^{2} + b_{2}^{2}\), \(b_{1}\neq 0\), \(b_{2}\neq 0\). The field \(K(\sqrt{-1})\) is a function field in n variables over \(\mathbb{C}\) and is \(C_{n}\). Then ϕ is universal over \(K(\sqrt{-1})\) and hence represents \(\beta = b_{1} + ib_{2}\). Let \(v,w \in {K}^{{2}^{n} }\) be such that \(\phi _{K(\sqrt{-1})}(v +\beta w) =\beta\). Hence

$$\displaystyle{\phi (v) +{\beta }^{2}\phi (w) +\beta (2\phi (v,w) - 1) = 0.}$$

The irreducible polynomial of β over K is

$$\displaystyle{\phi (w){X}^{2} + (2\phi (v,w) - 1)X +\phi (v)}$$

and hence \(N(\beta ) = b = \frac{\phi (v)} {\phi (w)}\) is a value of ϕ since ϕ is multiplicative.

Suppose m > 2. We argue by induction on m. Suppose ϕ represents all sums of m − 1 squares. Let b be a sum of m squares. After scaling b by a square, we may assume that \(b = 1 + c\), \(c = c_{1}^{2} + \cdots + c_{m-1}^{2}\), c≠0. Let ϕ ≅1 ⊥ ϕ ′. By induction hypothesis, ϕ represents c. Let \(c = c_{0}^{2} + c^{\prime}\), c ′ a value of ϕ ′. Let \(\psi =\phi \otimes \langle 1,-b\rangle\) and ψ = 1 ⊥ ψ ′ with \(\psi ^{\prime} =\langle -b\rangle \perp \phi ^{\prime}\perp -b\phi ^{\prime}\). The form ψ ′ represents \(c^{\prime} - b = (c - c_{0}^{2}) - (1 + c) = -1 - c_{0}^{2}\). Thus, by the Pure Subform theorem,

$$\displaystyle{\psi \mathop{\cong}\langle \!\langle - 1 - c_{0}^{2},d_{ 1},\ldots,d_{n}\rangle \!\rangle =\langle 1,-1 - c_{0}^{2}\rangle \otimes \langle \!\langle d_{ 1},\ldots,d_{n}\rangle \!\rangle.}$$

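By induction, the n-fold Pfister form \(\langle \!\langle d_{1},\ldots,d_{n}\rangle \!\rangle\) represents \(1 + c_{0}^{2}\), which is a sum of 2 squares; thus ψ is isotropic, hence hyperbolic. Thus \(\phi \mathop{\cong}b\phi\) and, since ϕ represents 1, ϕ represents b. □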

Corollary 7.6.

Let \(K = \mathbb{R}(X)\) be a function field in n variables over \(\mathbb{R}\). Then every sum of squares in K is a sum of \({2}^{n}\) squares.

Proof.

Set \(\phi =\langle 1,{1\rangle }^{\otimes n}\) in the above theorem. □

8 Pythagoras Number

Definition 8.1.

The Pythagoras number p(k) of a field k is the least positive integer n such that every sum of squares in k is a sum of at most n squares; if no such n exists, p(k) is defined to be infinity.

Example 8.2.

If \(\mathbb{R}\) is the field of real numbers, \(p(\mathbb{R}) = 1\).

Example 8.3.

If \(\mathbb{R}(X_{1},\ldots,X_{n})\) is a function field in \(n\) variables over \(\mathbb{R}\), by Pfister’s theorem (Corollary 7.6), \(p(\mathbb{R}(X_{1},\ldots,X_{n})) \leq {2}^{n}\).

Let \(K = \mathbb{R}(X_{1},\ldots,X_{n})\) be the rational function field in n variables over \(\mathbb{R}\). We discuss the effectiveness of the bound \(p(K) \leq {2}^{n}\). For n = 1 the bound is sharp. For n = 2 the Motzkin polynomial

$$\displaystyle{f(X_{1},X_{2}) = 1 - 3X_{1}^{2}X_{ 2}^{2} + X_{ 1}^{4}X_{ 2}^{2} + X_{ 1}^{2}X_{ 2}^{4}}$$

is positive semi-definite; Cassels–Ellison–Pfister [CEP] show that this polynomial is not a sum of three squares in \(\mathbb{R}(X_{1},X_{2})\) (see also [CT]). Therefore \(p(\mathbb{R}(X_{1},X_{2})) = 4\).

Lemma 8.4 ( Key Lemma).

Let k be a field and \(n = {2}^{m}\). Let \(u = (u_{1},\ldots,u_{n})\) and \(v = (v_{1},\ldots,v_{n}) \in {k}^{n}\) be such that \(u \cdot v =\sum _{1\leq i\leq n}u_{i}v_{i} = 0\). Then there exist \(w_{j} \in k\), 1 ≤ j ≤ n − 1, such that

$$\displaystyle{\bigg(\sum _{1\leq i\leq n}u_{i}^{2}\bigg)\bigg(\sum _{ 1\leq i\leq n}v_{i}^{2}\bigg) =\sum _{ 1\leq j\leq n-1}w_{j}^{2}.}$$

Proof.

Let \(\lambda =\sum _{1\leq i\leq n}u_{i}^{2}\), \(\mu =\sum _{1\leq i\leq n}v_{i}^{2}\). We may assume without loss of generality that u ≠ 0 and v ≠ 0. The elements λ and μ are values of \(\phi _{m} =\langle 1,{1\rangle }^{\otimes m}\) and \(\lambda \phi _{m}\mathop{\cong}\phi _{m}\), \(\mu \phi _{m}\mathop{\cong}\phi _{m}\). We choose isometries \(f:\lambda \phi _{m}\mathop{\cong}\phi _{m}\), \(g:\mu \phi _{m}\mathop{\cong}\phi _{m}\) such that \(f(1,0,\ldots,0) = u\) and \(g(1,0,\ldots,0) = v\). If U and V are matrices representing f, g respectively, we have

$$\displaystyle{U{U}^{t} {=\lambda }^{-1},\ \ V {V }^{t} {=\mu }^{-1},\ \ {\lambda }^{-1}{\mu }^{-1} {=\lambda }^{-1}V {V }^{t} = (V {U}^{t}){(V {U}^{t})}^{t}.}$$

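The first row of \(V {U}^{t}\) is of the form \((0,w_{2},\ldots,w_{n})\) since u ⋅ v = 0. Thus \({\lambda }^{-1}{\mu }^{-1} =\sum _{2\leq i\leq n}w_{i}^{2}\), and multiplying by \({(\lambda \mu )}^{2}\) shows that λμ is a sum of n − 1 squares. □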
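For n = 2 and n = 4 the Key Lemma can be made explicit with the two-square and four-square identities: the term that drops out is exactly the dot product u ⋅ v. The Python sketch below uses these classical identities rather than the matrix argument of the proof, and its function names are illustrative assumptions.

```python
def key_lemma_n2(u, v):
    """For u.v = 0 in k^2, return (w,) with |u|^2 |v|^2 = w^2.

    Uses (u1^2 + u2^2)(v1^2 + v2^2) = (u.v)^2 + (u1 v2 - u2 v1)^2.
    """
    (u1, u2), (v1, v2) = u, v
    if u1 * v1 + u2 * v2 != 0:
        raise ValueError("u and v must be orthogonal")
    return (u1 * v2 - u2 * v1,)


def key_lemma_n4(u, v):
    """For u.v = 0 in k^4, return (w1, w2, w3) with |u|^2 |v|^2 = sum w_j^2.

    The w_j are the imaginary components of the quaternion product
    u * conj(v); its real component is the dot product u.v, which is 0.
    """
    u1, u2, u3, u4 = u
    v1, v2, v3, v4 = v
    if u1 * v1 + u2 * v2 + u3 * v3 + u4 * v4 != 0:
        raise ValueError("u and v must be orthogonal")
    return (
        -u1 * v2 + u2 * v1 - u3 * v4 + u4 * v3,
        -u1 * v3 + u2 * v4 + u3 * v1 - u4 * v2,
        -u1 * v4 - u2 * v3 + u3 * v2 + u4 * v1,
    )


def norm(x):
    """Sum of the squares of the entries of x."""
    return sum(t * t for t in x)
```

For example, with u = (1, 1, 0, 0) and v = (1, −1, 1, 1) the product of the norms is 8, and `key_lemma_n4(u, v)` expresses it as a sum of three squares.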

Corollary 8.5.

Let k be an ordered field with p(k) = n. Then p(k(t)) ≥ n + 1.

Proof.

Let λ ∈ k be such that λ is a sum of n squares and not a sum of less than n squares. Suppose \(\lambda + {t}^{2}\) is a sum of n squares in k(t). By the Cassels–Pfister theorem,

$$\displaystyle{\lambda +{t}^{2} = {(\mu _{ 1} +\nu _{1}t)}^{2} + \cdots + {(\mu _{ n} +\nu _{n}t)}^{2}}$$

with \(\mu _{i},\nu _{i} \in k\). If \(u = (\mu _{1},\ldots,\mu _{n})\), \(v = (\nu _{1},\ldots,\nu _{n})\), then comparing coefficients of t gives u ⋅ v = 0, \(\sum _{1\leq i\leq n}\mu _{i}^{2} =\lambda\), \(\sum _{1\leq i\leq n}\nu _{i}^{2} = 1\). Thus \(\lambda = (\sum _{1\leq i\leq n}\mu _{i}^{2})(\sum _{1\leq i\leq n}\nu _{i}^{2})\) is a sum of n − 1 squares by the Key Lemma 8.4, contradicting the choice of λ. □

Corollary 8.6.

For n ≥ 2,

$$\displaystyle{n + 2 \leq p(\mathbb{R}(X_{1},\ldots,X_{n})) \leq {2}^{n}.}$$

Proof.

By [CEP], we know that \(p(\mathbb{R}(X_{1},X_{2})) = 4\). The fact that \(n + 2 \leq p(\mathbb{R}(X_{1},\ldots,X_{n}))\) now follows by Corollary 8.5 and induction. □

Remark 8.7.

It is open whether \(p(\mathbb{R}(X_{1},X_{2},X_{3})) = 5,6,7\) or 8.

Remark 8.8.

The possible values of the Pythagoras number of a field have all been listed ([H], [Pf, p. 97]).

Proposition 8.9.

If k is a non-formally real field, p(k) = s(k) or s(k) + 1.

Proof.

If s(k) = n, then −1 is not a sum of less than n squares, so that p(k) ≥ s(k). For \(a \in {k}^{{\ast}}\),

$$\displaystyle{a ={ \left (\frac{a + 1} {2} \right )}^{2} + (-1){\left (\frac{a - 1} {2} \right )}^{2}}$$

is a sum of n + 1 squares if − 1 is a sum of n squares. Thus p(k) ≤ s(k) + 1. □
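The identity in the proof of Proposition 8.9 is constructive. The following Python sketch carries it out in a prime field \(\mathbb{F}_{p}\), p odd: given a representation of −1 as a sum of n squares, it writes an arbitrary element a as a sum of n + 1 squares. The function name and the choice of \(\mathbb{F}_{7}\) in the example are assumptions made for illustration.

```python
def sum_of_squares_in_Fp(a, p, minus_one_as_squares):
    """Write a in F_p as a sum of len(minus_one_as_squares) + 1 squares.

    minus_one_as_squares is a tuple (x_1, ..., x_n) with
    x_1^2 + ... + x_n^2 = -1 in F_p; p must be an odd prime.
    Uses a = ((a+1)/2)^2 + (-1) * ((a-1)/2)^2.
    """
    half = pow(2, -1, p)  # inverse of 2 modulo p (Python 3.8+)
    plus = (a + 1) * half % p
    minus = (a - 1) * half % p
    return (plus,) + tuple(x * minus % p for x in minus_one_as_squares)


# In F_7 one has 2^2 + 3^2 = 13 = -1, so the level is at most 2.
p = 7
representations = {a: sum_of_squares_in_Fp(a, p, (2, 3)) for a in range(p)}
```

Each value in `representations` is a triple whose squares sum to the corresponding key modulo 7.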

Let k be a p-adic field and \(K = k(X_{1},\ldots,X_{n})\) a rational function field in n variables over k. Then s(k) = 1, 2 or 4 so that s(K) = 1, 2 or 4. Thus p(K) ≤ 5. (In fact it is easy to see that if s(k) = s, \(p(K) = s + 1\).)

Thus we have bounds for \(p(k(X_{1},\ldots,X_{n}))\) if k is the field of real or complex numbers or the field of p-adic numbers. The natural questions concern a number field k.

9 Function Fields Over Number Fields

Let k be a number field and F = k(t) the rational function field in one variable over k. In this case p(k(t)) = 5 is a theorem [La]. The fact that p(k(t)) ≤ 8 can be easily deduced from the following injectivity in the Witt groups [CTCS, Proposition 1.1]:

$$\displaystyle{W(k(t))\longrightarrow \prod _{w\in \Omega (k)}W(k_{w}(t)),}$$

with \(\Omega (k)\) denoting the set of places of k. In fact, if f ∈ k(t) is a sum of squares, f is a sum of at most two squares in \(k_{w}(t)\) for a real place w, by Pfister’s theorem (which in the case of function fields of curves goes back to Witt). Further, for a finite place w of k or a complex place, \(\langle 1,{1\rangle }^{\otimes 3} = 0\) in \(W(k_{w})\). Thus \(\langle 1,{1\rangle }^{\otimes 3} \otimes \langle 1,-f\rangle\) is hyperbolic over \(k_{w}(t)\) for all \(w \in \Omega (k)\).

By the above injectivity, this form is hyperbolic over k(t), leading to the fact that f is a sum of at most eight squares in k(t).

We have the following conjecture due to Pfister for function fields over number fields.

Conjecture ( Pfister).

Let k be a number field and F = k(X) a function field in d variables over k. Then

  1. (1)

    For d = 1, p(F) ≤ 5.

  2. (2)

    For d ≥ 2, \(p(F) \leq {2}^{d+1}\).

For a function field k(X) in one variable over k, (d = 1), the best known result is due to F. Pop, p(F) ≤ 6 [P]. For d = 2, the conjecture is settled in [CTJ]. We sketch some results and conjectures from the arithmetic side which imply Pfister’s conjecture for d ≥ 3 (see Colliot-Thélène and Jannsen [CTJ] for more details).

For any field k, by Voevodsky’s theorem, we have an injection

$$\displaystyle{e_{n}: P_{n}(k) \rightarrow {H}^{n}(k, \mathbb{Z}/2\mathbb{Z}).}$$

In fact, for any field k, if \(\phi _{1},\phi _{2} \in P_{n}(k)\) have the same image under \(e_{n}\) then \(\phi _{1} \perp -\phi _{2} \in \ker (e_{n}) = {I}^{n+1}(k)\). In W(k), \(\phi _{1} \perp -\phi _{2} =\phi ^{\prime}_{1} \perp -\phi ^{\prime}_{2}\) where \(\phi _{1}^{\prime}\) and \(\phi _{2}^{\prime}\) are the pure subforms of \(\phi _{1}\) and \(\phi _{2}\). Moreover, \(\dim (\phi _{1}^{\prime} \perp -\phi _{2}^{\prime})_{\mathrm{an}} \leq {2}^{n+1} - 2 < {2}^{n+1}\). By the Arason–Pfister Hauptsatz (Theorem 3.1), anisotropic forms in \({I}^{n+1}(k)\) must have dimension at least \({2}^{n+1}\). Therefore \(\phi _{1} \perp -\phi _{2}\) is zero in W(k) and \(\phi _{1} =\phi _{2}\).

Let k be a number field and F = k(X) be a function field in d variables over k. Let f ∈ F be a function which is a sum of squares in F. One would like to show that f is a sum of \({2}^{d+1}\) squares. Let \(\phi _{d+1} =\langle 1,{1\rangle }^{\otimes (d+1)}\) and \(q =\phi _{d+1} \otimes \langle 1,-f\rangle\). This is a (d + 2)-fold Pfister form and \(\phi _{d+1}\) represents f if and only if q is hyperbolic or equivalently, by the injectivity of \(e_{n}\) above, \(e_{d+2}(\phi _{d+1} \otimes \langle 1,-f\rangle ) = 0\).

We look at this condition locally at all completions \(k_{v}\) at places v of k. Let \(k_{v}(X)\) denote the function field of X over \(k_{v}\). (We may assume that X is geometrically integral.) Let v be a complex place. The field \(k_{v}(X)\) has cohomological dimension d so that \({H}^{m}(k_{v}(X), \mathbb{Z}/2\mathbb{Z}) = 0\) for m ≥ d + 1. Hence \(e_{d+2}(\phi _{d+1} \otimes \langle 1,-f\rangle ) = 0\) over \(k_{v}(X)\). Let v be a real place. Over \(k_{v}(X)\), f is a sum of squares, hence a sum of at most \({2}^{d}\) squares (by Pfister’s Theorem 7.3) so that \(\phi _{d+1} \otimes \langle 1,-f\rangle\) is hyperbolic over \(k_{v}(X)\). Hence \(e_{d+2}(\phi _{d+1} \otimes \langle 1,-f\rangle ) = 0\).

Let v be a non-dyadic p-adic place of k. Then \(\phi _{2}\) is hyperbolic over \(k_{v}\) so that \(\phi _{d+1} \otimes \langle 1,-f\rangle = 0\) and \(e_{d+2}(\phi _{d+1} \otimes \langle 1,-f\rangle ) = 0\).

Let v be a dyadic place of k. Over \(k_{v}\), \(\phi _{3}\) is hyperbolic so that \(e_{d+2}(\phi _{d+1} \otimes \langle 1,-f\rangle ) = 0\). Thus for all completions v of k, \(e_{d+2}(\phi _{d+1} \otimes \langle 1,-f\rangle )\) is zero. The following conjecture of Kato implies Pfister’s conjecture for d ≥ 2.

Conjecture ( Kato).

Let k be a number field, X a geometrically integral variety over k of dimension d. Then the map

$$\displaystyle{{H}^{d+2}(k(X), \mathbb{Z}/2\mathbb{Z}) \rightarrow \prod _{ v\in \Omega _{k}}{H}^{d+2}(k_{ v}(X), \mathbb{Z}/2\mathbb{Z})}$$

has trivial kernel.

The above conjecture is the classical Hasse–Brauer–Noether theorem if the dimension of X is zero, i.e., the injectivity of the Brauer group map:

$$\displaystyle{\mathrm{Br}(k)\hookrightarrow \bigoplus _{v\in \Omega _{k}}\mathrm{Br}(k_{v}).}$$

For dim X = 1, the conjecture is a theorem of Kato [Ka]. Kato’s conjecture is now a theorem due to Jannsen [Ja1, Ja2] for dim X ≥ 2. Thus for every function field k(X) in d variables over a number field k, d ≥ 2, we have \(p(k(X)) \leq {2}^{d+1}\).

We now explain how Kato’s theorem was used by Colliot-Thélène to derive p(k(X)) ≤ 7 for a curve X over a number field. We note that this bound is weaker than the bound established by F. Pop.

Suppose K = k(X) has no ordering. We claim that s(K) ≤ 4. To show this it suffices to show that \(\langle 1,{1\rangle }^{\otimes 3}\) is zero over \(k_{v}(X)\) for every place v of k. At finite places v, \(\langle 1,{1\rangle }^{\otimes 3}\) is already zero over \(k_{v}\). If v is a real place of k, \(k_{v}(X)\) is the function field of a real curve over the field of real numbers which has no orderings. By a theorem of Witt, \(\mathrm{Br}(k_{v}(X)) = 0\) and every sum of squares is a sum of two squares in \(k_{v}(X)\). Thus −1 is a sum of two squares in \(k_{v}(X)\) and \(\langle 1,{1\rangle }^{\otimes 3} = 0\) over \(k_{v}(X)\). Since \({H}^{3}(k(X), \mathbb{Z}/2\mathbb{Z}) \rightarrow \prod _{v\in \Omega _{k}}{H}^{3}(k_{v}(X), \mathbb{Z}/2\mathbb{Z})\) is injective by Kato’s theorem, \(e_{3}(\langle 1,{1\rangle }^{\otimes 3}) = 0\) in \({H}^{3}(k(X), \mathbb{Z}/2\mathbb{Z})\). Since \(e_{3}\) is injective on three-fold Pfister forms, \(\langle 1,{1\rangle }^{\otimes 3} = 0\) over k(X). Thus s(k(X)) ≤ 4. In this case, p(k(X)) ≤ 5 by Proposition 8.9.

Suppose K has an ordering. Let f ∈ K be a sum of squares in K. Then \(K(\sqrt{-f})\) has no orderings and hence −1 is a sum of 4 squares in \(K(\sqrt{-f})\). Let \(a_{i},b_{i} \in K\) be such that

$$\displaystyle{-1 =\sum _{1\leq i\leq 4}{(a_{i} + b_{i}\sqrt{-f})}^{2},\ \ a_{ i},b_{i} \in K.}$$

Then

$$\displaystyle{1 +\sum _{1\leq i\leq 4}a_{i}^{2} = f\big(\sum _{ 1\leq i\leq 4}b_{i}^{2}\big),\ \ \sum _{ 1\leq i\leq 4}a_{i}b_{i} = 0.}$$

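By the Key Lemma 8.4, applied with n = 8 to the vectors \(u = (1,a_{1},\ldots,a_{4},0,0,0)\) and \(v = (0,b_{1},\ldots,b_{4},0,0,0)\), which satisfy u ⋅ v = 0, the product \((1 +\sum _{1\leq i\leq 4}a_{i}^{2})\sum _{1\leq i\leq 4}b_{i}^{2}\) is a sum of at most 7 squares.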