1 Introduction

There are many results relating the zero sets of linear or multilinear forms to their linear dependence. For example, it is well-known that if \(f_1, \ldots , f_k\) and \(g\) are linear forms such that

$$\begin{aligned} f_1(x)=0, \ldots , f_k(x)=0 \text { imply } g(x)=0, \end{aligned}$$
(1)

then \(g=a_1 f_1 + \cdots + a_k f_k\) for some scalars \(a_1, ..., a_k.\)

Also, if

$$\begin{aligned} f_1(x)\ge 0, \ldots , f_k(x)\ge 0 \text { imply } g(x)\ge 0, \end{aligned}$$
(2)

then \(g=a_1 f_1 + \cdots + a_k f_k\), with \(a_i \ge 0\) for all \(i\). This is known as Farkas’ Lemma [4]. Note that condition (2) above is stronger than condition (1), as can be checked by replacing \(x\) with \(-x\).
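As a numerical aside, not part of the original argument, the conclusion of Farkas’ Lemma can be checked on concrete data by testing whether g lies in the cone generated by \(f_1, \ldots , f_k\). The following Python sketch does this with non-negative least squares for an illustrative choice of forms, assuming numpy and scipy are available.

```python
# Illustration of the linear Farkas' Lemma with made-up data: if implication (2)
# holds, g must be a non-negative combination of the f_i.  We test membership of
# g in the cone generated by f_1, f_2 with non-negative least squares.
import numpy as np
from scipy.optimize import nnls

F = np.array([[1.0, 0.0, 1.0],      # coefficient vector of f_1 on R^3
              [0.0, 1.0, -1.0]])    # coefficient vector of f_2 on R^3
g = 2.0 * F[0] + 3.0 * F[1]         # g = 2 f_1 + 3 f_2 lies in the cone

a, residual = nnls(F.T, g)          # minimize ||F^t a - g|| subject to a >= 0
print("coefficients:", a)           # ~ [2., 3.]
print("g in the cone:", residual < 1e-10)
```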

Published in 1902, Farkas’ Lemma leads to linear programming duality and has proved to be a central result in the development of linear and non-linear mathematical optimization. According to MathSciNet, over 250 papers cite Farkas’ Lemma (one of them contains the particularly beautiful statement by J. Franklin [7] that “I hope to convince you that every mathematician should know the Farkas theorem and should know how to use it”). One good, fairly recent source for such papers can be found among the opening articles in [5]. However, very few of these papers deal with non-linear versions of the lemma, which is our interest here. In the non-linear setting, a short article describing some open problems involving polynomial matrices is contained in the work of A. A. ten Dam and J. W. Nieuwenhuis [2].

In the multilinear setting, positive results are known only in the case of two m-linear forms A and B as follows:

$$\begin{aligned} \text { If } A(x_1, \ldots ,x_m)=0 \text { implies } B(x_1, \ldots ,x_m)=0, \end{aligned}$$

then \(B=a A\) [1].

In addition, the following weak form of Farkas’ Lemma is given in [3]:

$$\begin{aligned} \text { If } A(x_1, \ldots ,x_m)\ge 0 \text { implies } B(x_1, \ldots ,x_m) \ge 0, \end{aligned}$$

then \(B=a A\), with \(a\ge 0 .\)

On the other hand, in [1] the authors give an example of symmetric bilinear forms \(A_1, A_2\) and B such that

$$\begin{aligned} A_1(x,y)=0 \text { and } A_2(x,y)=0 \text { imply } B(x,y)=0, \text { but }B\not =a_1 A_1+a_2 A_2, \end{aligned}$$

and in [3] the author exhibits (non-symmetric) bilinear forms \(A_1, A_2\) and B such that

$$\begin{aligned} A_1(x,y)\ge 0 \text { and } A_2(x,y)\ge 0 \text { imply } B(x,y)\ge 0, \text { but }B\not =a_1 A_1+a_2 A_2, \end{aligned}$$

for any non-negative \(a_i\).

Farkas’ Lemma also fails for 2-homogeneous polynomials. Consider \(P:\mathbb {R}^2 \rightarrow \mathbb {R}\) and \(Q:\mathbb {R}^2 \rightarrow \mathbb {R}\) given by \(P(x,y)=x^2\) and \(Q(x,y)=y^2\). Then \(P(x,y)\ge 0\) implies \(Q(x,y)\ge 0\), since both are always non-negative, but P and Q are linearly independent.
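A two-line check of this counterexample, illustrative only and assuming a Python environment:

```python
# P(x, y) = x^2 and Q(x, y) = y^2 are non-negative everywhere, so the implication
# "P >= 0 implies Q >= 0" holds trivially, yet Q is not a scalar multiple of P.
P = lambda x, y: x ** 2
Q = lambda x, y: y ** 2
grid = [(x, y) for x in range(-3, 4) for y in range(-3, 4)]
print(all(P(x, y) >= 0 and Q(x, y) >= 0 for x, y in grid))   # True
print(Q(1, 2) / P(1, 2), Q(2, 1) / P(2, 1))                  # 4.0 vs 0.25: not proportional
```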

In view of these examples, there appears to be little room for positive results. However, in Sect. 2 below, we give a version of Farkas’ Lemma for simultaneously diagonalizable bilinear forms (see Theorem 1) as well as a result analogous to (1) above (see Theorem 2).

We note that a similar, but weaker, result:

$$\begin{aligned}&\text {If } A_1(x,y)\ge 0, \ldots , A_k(x,y)\ge 0 \text { imply } B(x,y)\ge 0, \nonumber \\&\qquad \text {then } B \succeq a_1A_1 + \cdots + a_kA_k \text { with all } a_i \ge 0, \end{aligned}$$
(3)

(where \(C\succeq D\) means \(C-D\) is positive semidefinite), is known and can be obtained through the S-procedure as in [6]. We require, however, equality in (3) for our version of Farkas’ Lemma.

Moreover, in Sect. 3, we take a closer look at the reason for the lack of positive results. Farkas’ Lemma may be viewed as an application of the Hahn-Banach Theorem (unavailable to Farkas in 1902): indeed, given linear forms \(f_1, \ldots , f_k\), their positive cone

$$\begin{aligned} \mathcal {F}=\{a_1 f_1+ \cdots + a_k f_k : a_i \ge 0 \} \end{aligned}$$

is a closed convex set, so if \(g\not \in \mathcal {F}\) then it can be separated from \(\mathcal {F}\) by a linear functional on \({\mathbb {R}^n}^*\). But all linear functionals on \({\mathbb {R}^n}^*\) are given by “evaluation maps”. Thus there is a point \(x\in \mathbb {R}^n\) such that

$$\begin{aligned} g(x)< 0 \le f(x) \text { for all }f\in \mathcal {F}, \end{aligned}$$

and in particular, \(f_1(x)\ge 0, \ldots , f_k(x)\ge 0\) and \(g(x)<0\). This is the crucial point behind the lack of Farkas-type results in the multilinear (and other) settings: few linear functionals are “evaluation maps”. We study evaluation maps on spaces of real-valued bilinear forms on \(\mathbb {R}^n \ (n \ge 2)\) (see Propositions 3 and 4), and this enables us to produce new examples (see Examples 1 and 2) of bilinear forms \(A_1, A_2\) and B such that

$$\begin{aligned} A_1(x,y)\ge 0 \text { and } A_2(x,y)\ge 0 \text { imply } B(x,y)\ge 0, \text { but }B\not =a_1 A_1+a_2 A_2, \end{aligned}$$

for any non-negative \(a_i\).
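To illustrate the role of evaluation maps in the linear case, the separating functional can be exhibited explicitly on concrete data: when g lies outside the positive cone of the \(f_i\), a small linear program finds a point x with \(f_1(x)\ge 0, \ldots , f_k(x)\ge 0\) and \(g(x)<0\). The sketch below uses illustrative forms and assumes scipy is available.

```python
# When g is outside the cone of the f_i, the separating functional on (R^n)^*
# is an evaluation map: a point x with f_i(x) >= 0 for all i and g(x) < 0.
# We search for such a witness with a small linear program (illustrative data).
import numpy as np
from scipy.optimize import linprog

F = np.array([[1.0, 0.0],           # f_1(x) = x_1
              [0.0, 1.0]])          # f_2(x) = x_2
g = np.array([1.0, -1.0])           # g(x) = x_1 - x_2, not in the cone of f_1, f_2

# minimize g(x) subject to f_i(x) >= 0, with box bounds to keep the LP bounded
res = linprog(c=g, A_ub=-F, b_ub=np.zeros(2), bounds=[(-1, 1), (-1, 1)])
print("witness x:", res.x)                          # e.g. [0., 1.]
print("f_i(x):", F @ res.x, "  g(x):", g @ res.x)   # non-negative vs. negative
```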

2 Farkas’ Lemma for simultaneously diagonalizable bilinear forms

In this section, \(A_1,\ldots ,A_k\), and B will be bilinear forms on \(\mathbb {R}^n \times \mathbb {R}^n\). We will use the notation \(B_x(y)=B(x,y)\) (and similarly for the bilinear forms \(A_i\)). Note that a bilinear form \(A(x,y)\) can be written as \(x^t[A]y,\) where \([A]\in \mathbb {R}^{n \times n}\) is the coefficient matrix of A in the canonical basis: \([A]_{ij}=A(e_i,e_j)\). We shall prove Farkas’ Lemma in the following form:

Theorem 1

For \(j = 1, ..., k,\) let \(A_j:\mathbb {R}^n\times \mathbb {R}^n \rightarrow \mathbb {R}\) be simultaneously diagonalizable bilinear forms. Then

$$\begin{aligned} B=a_1 A_1 + \cdots + a_k A_k, \text { with all } a_i \ge 0 \end{aligned}$$

if and only if

$$\begin{aligned} A_1(x,y) \ge 0, \ldots , A_k(x,y) \ge 0 \text { imply } B(x,y) \ge 0. \quad \quad \quad (*) \end{aligned}$$

Proof

The implication \(\Rightarrow )\) is trivial, so we prove \(\Leftarrow )\).

Note that for any \(x\in \mathbb {R}^n\), we have

$$\begin{aligned} A_{1x}(y) \ge 0, \ldots , A_{kx}(y) \ge 0 \text { imply } B_x(y) \ge 0, \end{aligned}$$

so that by the linear Farkas’ Lemma there are \(a_1(x), \ldots , a_k(x) \ge 0\) such that

$$\begin{aligned} B_x = a_1(x) A_{1x} + \cdots + a_k(x) A_{kx}. \quad \quad \quad (**) \end{aligned}$$

The \(A_i\)’s are diagonalized simultaneously by a matrix U whose columns form a basis of eigenvectors \(\{u_1, \ldots , u_n\}\). Set

$$\begin{aligned} U=\left( \begin{array}{ccccc} &{} \vdots &{} &{} \vdots &{} \\ u_1 &{} \vdots &{} \cdots &{} \vdots &{} u_n\\ &{} \vdots &{} &{} \vdots &{} \end{array} \right) , \end{aligned}$$

and

$$\begin{aligned} U^{-1}=\left( \begin{array}{ccccc} &{} \ldots &{} v_1 &{} \ldots &{} \\ &{} \cdots &{} \vdots &{} \cdots &{} \\ &{} \ldots &{} v_n &{} \ldots &{} \end{array} \right) . \end{aligned}$$

We have \(\langle v_i,u_j \rangle =\delta _{ij}\), and for each \(j=1,\ldots , k\),

$$\begin{aligned} U^{-1}[A_j]U=\left( \begin{array}{cccc} A_j(v_1,u_1) &{} 0 &{} \ldots &{} 0\\ 0 &{} A_j(v_2,u_2) &{} \ldots &{} 0\\ \vdots &{} \vdots &{} \ddots &{} \vdots \\ 0 &{} 0 &{} \ldots &{} A_j(v_n,u_n) \end{array} \right) . \end{aligned}$$

Note that if \(r\not =s\), then for all j, \(A_j(v_r,u_s)=0.\) Therefore, applying \((*)\) to both \((v_r,u_s)\) and \((-v_r,u_s)\), we deduce that \(B(v_r,u_s)=0\) for \(r\not =s\) as well. Thus the same basis diagonalizes B. We have

$$\begin{aligned} U^{-1}[B]U=\left( \begin{array}{cccc} B(v_1,u_1) &{} 0 &{} \ldots &{} 0\\ 0 &{} B(v_2,u_2) &{} \ldots &{} 0\\ \vdots &{} \vdots &{} \ddots &{} \vdots \\ 0 &{} 0 &{} \ldots &{} B(v_n,u_n) \end{array} \right) . \end{aligned}$$

Consider \({\textsc {1}}\!\!\!{{\textsc {1}}}=(1,\ldots ,1)\), and \(z=(z_1,\ldots ,z_n)\). Then for each \(j=1,\ldots , k\),

$$\begin{aligned} A_j((U^{-1})^t {\textsc {1}}\!\!\!{{\textsc {1}}} , Uz)&={\textsc {1}}\!\!\!{{\textsc {1}}}^t U^{-1} [A_j] Uz \\&={\textsc {1}}\!\!\!{{\textsc {1}}}^t \left( \begin{array}{cccc} A_j(v_1,u_1) &{} 0 &{} \ldots &{} 0\\ 0 &{} A_j(v_2,u_2) &{} \ldots &{} 0\\ \vdots &{} \vdots &{} \ddots &{} \vdots \\ 0 &{} 0 &{} \ldots &{} A_j(v_n,u_n) \end{array} \right) z \\&= A_j(v_1,u_1)z_1 + \cdots + A_j(v_n,u_n)z_n, \end{aligned}$$

and similarly, \(B((U^{-1})^t {\textsc {1}}\!\!\!{{\textsc {1}}} , Uz)=B(v_1,u_1)z_1 + \cdots + B(v_n,u_n)z_n\). Now let \(a_1=a_1((U^{-1})^t {\textsc {1}}\!\!\!{{\textsc {1}}})\ge 0, \ldots , a_k=a_k((U^{-1})^t {\textsc {1}}\!\!\!{{\textsc {1}}})\ge 0\). Since, by \((**)\),

$$\begin{aligned} B_{(U^{-1})^t {\textsc {1}}\!\!\!{{\textsc {1}}}}=a_1 {A_{1}}_{(U^{-1})^t {\textsc {1}}\!\!\!{{\textsc {1}}}}+\cdots +a_k {A_{k}}_{(U^{-1})^t {\textsc {1}}\!\!\!{{\textsc {1}}}}, \end{aligned}$$

we have, for any \(z \in \mathbb {R}^n\),

$$\begin{aligned} B((U^{-1})^t {\textsc {1}}\!\!\!{{\textsc {1}}},Uz)=a_1 A_1((U^{-1})^t {\textsc {1}}\!\!\!{{\textsc {1}}},Uz)+\cdots +a_k A_k((U^{-1})^t {\textsc {1}}\!\!\!{{\textsc {1}}},Uz). \end{aligned}$$

Now, for \(r=1,\ldots ,n\), set \(z=e_r.\) We have

$$\begin{aligned} B(v_r,u_r)&= B((U^{-1})^t {\textsc {1}}\!\!\!{{\textsc {1}}},Ue_r)\\&= a_1 A_1((U^{-1})^t {\textsc {1}}\!\!\!{{\textsc {1}}},Ue_r)+\cdots +a_k A_k((U^{-1})^t {\textsc {1}}\!\!\!{{\textsc {1}}},Ue_r)\\&= a_1 A_1(v_r,u_r)+\cdots +a_k A_k(v_r,u_r). \end{aligned}$$

Thus

$$\begin{aligned} U^{-1}[B]U =&\left( \begin{array}{cccc} B(v_1,u_1) &{} 0 &{} \ldots &{} 0\\ 0 &{} B(v_2,u_2) &{} \ldots &{} 0\\ \vdots &{} \vdots &{} \ddots &{} \vdots \\ 0 &{} 0 &{} \ldots &{} B(v_n,u_n) \end{array} \right) \\ =&\, a_1 \left( \begin{array}{cccc} A_1(v_1,u_1) &{} 0 &{} \ldots &{} 0\\ 0 &{} A_1(v_2,u_2) &{} \ldots &{} 0\\ \vdots &{} \vdots &{} \ddots &{} \vdots \\ 0 &{} 0 &{} \ldots &{} A_1(v_n,u_n) \end{array} \right) + \cdots \\&\ldots + \, a_k \left( \begin{array}{cccc} A_k(v_1,u_1) &{} 0 &{} \ldots &{} 0\\ 0 &{} A_k(v_2,u_2) &{} \ldots &{} 0\\ \vdots &{} \vdots &{} \ddots &{} \vdots \\ 0 &{} 0 &{} \ldots &{} A_k(v_n,u_n) \end{array} \right) \\ =&\, a_1 U^{-1}[A_1]U + \cdots + a_k U^{-1}[A_k]U \\ =&U^{-1} [a_1 A_1 + \cdots + a_k A_k]U. \end{aligned}$$

Therefore

$$\begin{aligned} B=a_1 A_1 + \cdots + a_k A_k, \text { with all } a_i \ge 0, \end{aligned}$$

and the proof is complete. \(\square \)
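The statement of Theorem 1 can be sanity-checked numerically. The sketch below, in Python with numpy/scipy and made-up data in the commuting symmetric setting of Corollary 1, diagonalizes two forms in a common orthogonal basis and recovers non-negative coefficients from the diagonal entries; it is in the spirit of the proof, though it uses non-negative least squares rather than the specific vector constructed above.

```python
# Numerical companion to Theorem 1 (illustrative data, symmetric commuting case):
# two forms sharing an orthogonal eigenbasis Q, and B a non-negative combination.
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(0)
n = 4
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))    # common orthogonal eigenbasis
A1 = Q @ np.diag(rng.standard_normal(n)) @ Q.T
A2 = Q @ np.diag(rng.standard_normal(n)) @ Q.T
B = 2.0 * A1 + 0.5 * A2                             # B lies in the cone by construction

# In the common basis all three forms are diagonal, so the coefficients a_i can
# be recovered from the diagonal entries (here via non-negative least squares).
D = np.column_stack([np.diag(Q.T @ A1 @ Q), np.diag(Q.T @ A2 @ Q)])
b = np.diag(Q.T @ B @ Q)
a, _ = nnls(D, b)
print("recovered a_i:", a)                          # ~ [2.0, 0.5]
print("B == a1*A1 + a2*A2:", np.allclose(B, a[0] * A1 + a[1] * A2))
```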

Note that if we relax condition \((*)\) to

$$\begin{aligned} A_1(x,y) = 0, \ldots , A_k(x,y) = 0 \text { imply } B(x,y) = 0, \end{aligned}$$

then, with the same proof but using (1) instead of the linear Farkas’ Lemma, we obtain the following.

Theorem 2

For \(j = 1, ..., k,\) let \(A_j:\mathbb {R}^n\times \mathbb {R}^n \rightarrow \mathbb {R}\) be simultaneously diagonalizable bilinear forms. Then

$$\begin{aligned} B=a_1 A_1 + \cdots + a_k A_k, \end{aligned}$$

if and only if

$$\begin{aligned} A_1(x,y) = 0, \ldots , A_k(x,y) = 0 \text { imply } B(x,y) = 0. \end{aligned}$$

Also, since any family of commuting symmetric matrices is simultaneously diagonalizable [8, Theorem 1.3.21], we have the following corollary.

Corollary 1

If \(A_1,\ldots ,A_k\) are symmetric bilinear forms on \(\mathbb {R}^n \times \mathbb {R}^n\) defined by commuting matrices, then

$$\begin{aligned} B=a_1 A_1 + \cdots + a_k A_k, \text { with all } a_i \ge 0 \end{aligned}$$

if and only if

$$\begin{aligned} A_1(x,y) \ge 0, \ldots , A_k(x,y) \ge 0 \text { imply } B(x,y) \ge 0. \end{aligned}$$

Remark 1

We note that an analogous result can be obtained in the setting of a Hilbert space H as follows:

If \(A_1,\ldots ,A_k\) are symmetric bilinear forms on \(H \times H\) defined by \(A_i(x,y)=\langle y,T_i x \rangle \), where, for \(i=1,\ldots ,k\), the \(T_i\)'s are compact self-adjoint commuting operators, then

$$\begin{aligned} B=a_1 A_1 + \cdots + a_k A_k, \text { with all } a_i \ge 0 \end{aligned}$$

if and only if

$$\begin{aligned} A_1(x,y) \ge 0, \ldots , A_k(x,y) \ge 0 \text { imply } B(x,y) \ge 0. \end{aligned}$$

3 Evaluation maps over spaces of bilinear forms

As we have mentioned above, linear forms on \({\mathbb {R}^n}^*\) are identified with elements of \(\mathbb {R}^n\), and thus are all evaluation maps: \(\gamma \mapsto \gamma (x)\). This is far from the case in most function spaces. For example, continuous linear functionals on \(C[0,1]\) are identified with (signed) regular Borel measures, while the evaluation maps are the deltas: \(\delta _x (f)=f(x)\). The two convex sets

$$\begin{aligned} A&=\{f \in C[0,1]: f \text { is strictly increasing }\}\\ \text {and } B&=\{g \in C[0,1]: g \text { is strictly decreasing }\} \end{aligned}$$

can be separated by the measure \(\delta _x - \delta _y\) for any \(0 \le x \ne y \le 1,\) although they cannot be separated by any evaluation map \(\delta _x\).

We now characterize the evaluation maps on the space of bilinear forms \(\mathcal {L}^2 (\mathbb {R}^n)\) and on the space of symmetric bilinear forms \(\mathcal {L}_s^2 (\mathbb {R}^n)\) in order to construct examples of positive cones

$$\begin{aligned} \mathcal {F}=\{a_1 A_1 + \cdots + a_k A_k : a_i \ge 0 \} \end{aligned}$$

and bilinear forms \(B\not \in \mathcal {F}\) which cannot be separated from \(\mathcal {F}\) by evaluation maps.

We consider \(\mathcal {L}^2 (\mathbb {R}^n)\) as \(\mathbb {R}^{n \times n}\), the space of \(n \times n\) matrices with inner product:

$$\begin{aligned} \langle A,B \rangle =\sum _i \sum _j a_{ij} b_{ij}=tr(AB^t), \end{aligned}$$

where \(A=(a_{ij})\) and \(B=(b_{ij})\). For a matrix A, we denote by tr(A) the trace of A and by rk(A) the rank of A. Any \(\varphi \in \mathcal {L}^2 (\mathbb {R}^n)^*\) can be represented as \(\varphi (A)=tr(AN^t)\) for some \(n\times n\) matrix N. As we will see below, in the symmetric setting, any \(\varphi \in \mathcal {L}_s^2 (\mathbb {R}^n)^*\) can be represented as \(\varphi (A)=tr(AS)\), with a symmetric matrix S. We say that \(\varphi \) is an evaluation map if there are x and y in \(\mathbb {R}^n\) such that \(\varphi (A)=A(x,y)=x^tAy\).
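For instance, with arbitrary data and assuming numpy, the evaluation map \(A\mapsto A(x,y)\) is represented by the rank-one matrix \(N=(x_iy_j)\); Proposition 3 below characterizes the evaluation maps in exactly these terms.

```python
# The evaluation map A -> A(x, y) = x^t A y is represented by the rank-one matrix
# N with entries N_ij = x_i y_j, since tr(A N^t) = sum_ij A_ij x_i y_j = x^t A y.
import numpy as np

rng = np.random.default_rng(1)
n = 3
A = rng.standard_normal((n, n))                  # an arbitrary bilinear form on R^n
x, y = rng.standard_normal(n), rng.standard_normal(n)

N = np.outer(x, y)                               # N_ij = x_i y_j, so rk(N) <= 1
print("rk(N):", np.linalg.matrix_rank(N))        # 1
print(np.isclose(np.trace(A @ N.T), x @ A @ y))  # True
```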

We characterize the evaluation maps in \(\mathcal {L}^2 (\mathbb {R}^n)^*\) and \(\mathcal {L}_s^2 (\mathbb {R}^n)^*\) as matrices N and S as follows.

Proposition 3

\(\varphi \in \mathcal {L}^2 (\mathbb {R}^n)^*\) is an evaluation map if and only if \(\varphi (A)=tr(AN^t)\), where N has rank less than or equal to one.

Proof

\(x^tAy=\sum _i \sum _j a_{ij} x_i y_j\), so the representing matrix N has entries \(n_{ij}=x_i y_j\). Thus

$$\begin{aligned} rk(N)=rk\left( \begin{array}{ccc} x_1y_1 &{} \ldots &{} x_1y_n \\ x_2y_1 &{} \ldots &{} x_2y_n \\ \vdots &{} \ddots &{} \vdots \\ x_ny_1 &{} \ldots &{} x_ny_n \end{array} \right) \le rk\left( \begin{array}{cccc} y_1 &{} y_2 &{} \ldots &{} y_n \\ y_1 &{} y_2 &{} \ldots &{} y_n \\ \vdots &{} \vdots &{} \ddots &{} \vdots \\ y_1 &{} y_2 &{} \ldots &{} y_n \end{array} \right) \le 1. \end{aligned}$$

N has rank 0 if \(\varphi =0\) and rank one otherwise.

Conversely, if N has rank one, we can write

$$\begin{aligned} N=\left( \begin{array}{ccc} x_1y_1 &{} \ldots &{} x_1y_n \\ x_2y_1 &{} \ldots &{} x_2y_n \\ \vdots &{} \ddots &{} \vdots \\ x_ny_1 &{} \ldots &{} x_ny_n \end{array} \right) , \end{aligned}$$

for some \(x, y \in \mathbb {R}^n,\) so that \(n_{ij}=x_i y_j\), and N produces the evaluation map \(\varphi (A)=x^tAy\). \(\square \)

We now discuss the symmetric setting. We note that we can associate \(\mathcal L_s^2(\mathbb {R}^n)\) with the space \(\mathbb {R}_s^{n \times n}\) of symmetric \(n\times n\) matrices. Also, any continuous linear functional \(\varphi :\mathcal L^2_s(\mathbb {R}^n) \rightarrow \mathbb {R}\) is given by \(\varphi (M) = tr(M[\varphi ]^t)\) for a suitable symmetric \(n \times n\) matrix \([\varphi ]\). Indeed, since M is symmetric, if \(\varphi (M)=tr(MN^t)\), we can replace N by a symmetric matrix as follows.

$$\begin{aligned} \varphi (M)=tr(MN^t)&=\frac{tr(MN^t) + tr((MN^t)^t)}{2} \\&=\frac{tr(MN^t) + tr(NM)}{2} \\&=\frac{tr(MN^t) + tr(MN)}{2} \\&=tr\left( M \frac{N^t + N}{2}\right) \\&=tr(M[\varphi ]), \text { where } [\varphi ]=\frac{N^t + N}{2}. \end{aligned}$$

For each \(x, y \in \mathbb {R}^n,\) we define the evaluation functional \(E(x,y)\) on \(\mathbb {R}_s^{n\times n}\) by

$$M \in \mathbb {R}^{n\times n}_s \leadsto E(x,y)(M) \equiv x^tMy.$$

We can associate \(E(x,y)\) with the symmetric \(n \times n\) matrix \([E(x,y)] \in \mathbb {R}_s^{n\times n}\) where

$$[E(x,y)]_{i,j} = \frac{x_iy_j + x_jy_i}{2}.$$

Observe that if a matrix \(N \in \mathbb {R}_s^{n \times n}\) represents an evaluation map, then so does the matrix \(UNU^t,\) where \(U \in \mathcal O(n)\) (the orthogonal group in \(\mathbb {R}^{n\times n}\)). To see this, if \(N = [E(x,y)]\) represents the evaluation map \(E(x,y)\), so that \(E(x,y)(M) = tr(MN^t),\) then for any such U

$$\begin{aligned} tr(M(UNU^t)^t)&= tr(M UN^tU^t) = tr(M UNU^t) \\&= tr(U^tMUN) = E(x,y)(U^tMU) \\&= x^tU^tMUy = (Ux)^tM(Uy) = E(Ux,Uy)(M), \end{aligned}$$

where the second equality holds since N is symmetric.

Therefore, if N is a symmetric \(n \times n\) matrix, we can find an orthogonal matrix U so that \(UNU^t\) is diagonal; and by the observation above, if N represents an evaluation map then so does \(UNU^t\). Thus, in classifying the evaluation maps we may assume that the matrix \([E(x,y)]\) is diagonal, say

$$\begin{aligned} \left[ \begin{array}{cccc} x_1y_1 &{} 0 &{} \cdots &{} 0\\ 0 &{} x_2y_2 &{} \cdots &{} 0 \\ \vdots &{} \vdots &{} \ddots &{} \vdots \\ 0 &{} 0 &{} \cdots &{} x_ny_n \end{array}\right] . \end{aligned}$$

In fact, at most two of the diagonal entries can be non-zero. Indeed, suppose that there are three non-zero entries, \(\lambda _i = x_iy_i, \lambda _j = x_jy_j,\) and \(\lambda _k = x_ky_k.\) Since the off-diagonal entries are all 0, it follows that

$$\begin{aligned} \lambda _j\frac{x_i}{x_j} + \lambda _i\frac{x_j}{x_i}&= 0, \\ \lambda _k\frac{x_i}{x_k} + \lambda _i\frac{x_k}{x_i}&= 0, \\ \mathrm{and }\,\, \lambda _k\frac{x_j}{x_k} + \lambda _j\frac{x_k}{x_j}&= 0. \end{aligned}$$

As a consequence, \(\lambda _jx_i^2 + \lambda _ix_j^2 = 0, \lambda _kx_i^2 + \lambda _ix_k^2 = 0,\) and \(\lambda _kx_j^2 + \lambda _jx_k^2 = 0.\) Since none of these entries is 0, it follows that \(\lambda _i, \lambda _j,\) and \(\lambda _k\) are pairwise of opposite signs, a clear impossibility. In addition, the argument shows that if there are just two non-zero diagonal entries, say with \(i = 1\) and \(j = 2,\) then \(\lambda _1\lambda _2 < 0.\)

Consequently, if \(\lambda _1 = \lambda _2 = 0,\) then the linear form \(\varphi = E(\vec {0},\vec {0}),\) and if only \(\lambda _1 \ne 0,\) then \(\varphi = E(\lambda _1 e_1, e_1).\) The third and last case occurs if \(\lambda _1 \ne 0 \ne \lambda _2.\) If, say, \(\lambda _1> 0 > \lambda _2,\) then a direct computation (using that \(\lambda _2\) is negative) shows that \(\varphi = E(\sqrt{\frac{-\lambda _1}{\lambda _2}} \ e_1 + e_2, \sqrt{-\lambda _1 \lambda _2}\ e_1 +\lambda _2e_2)\): indeed, with this choice \(x_1y_1=\lambda _1\), \(x_2y_2=\lambda _2\), and \(x_1y_2+x_2y_1=0\).

Thus, given an evaluation map \(E(x,y)\), we have now seen that for the associated matrix \([E(x,y)] \in \mathbb {R}_s^{n\times n}\) there is an orthogonal matrix U such that \(U[E(x,y)]U^t\) is diagonal and also represents an evaluation map. Summarizing, we have the following.

Proposition 4

A symmetric matrix \(S \in \mathbb {R}_s^{n\times n}\) represents an evaluation map if and only if one of the following holds:

(i) \(S = 0,\) or

(ii) S has only one non-zero eigenvalue, or

(iii) S has exactly two non-zero eigenvalues, which are of opposite sign.
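Proposition 4 can be illustrated numerically for generic x and y; the sketch below uses arbitrary data and assumes numpy. The symmetrized matrix \([E(x,y)]\) reproduces the evaluation map via the trace pairing and has at most two non-zero eigenvalues, of opposite signs when there are two.

```python
# Illustration of Proposition 4 (arbitrary data): the matrix [E(x,y)], with
# entries (x_i y_j + x_j y_i)/2, represents the map M -> x^t M y on symmetric
# forms, and has at most two non-zero eigenvalues, which have opposite signs.
import numpy as np

rng = np.random.default_rng(2)
n = 5
x, y = rng.standard_normal(n), rng.standard_normal(n)

S = 0.5 * (np.outer(x, y) + np.outer(y, x))      # the matrix [E(x, y)]
M = rng.standard_normal((n, n))
M = 0.5 * (M + M.T)                              # a random symmetric form
print(np.isclose(np.trace(M @ S), x @ M @ y))    # the trace identity holds: True
eigs = np.linalg.eigvalsh(S)
print("non-zero eigenvalues:", eigs[np.abs(eigs) > 1e-10])   # two, of opposite signs
```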

We now construct counterexamples to Farkas’ Lemma in both cases, the four-dimensional \(\mathcal {L}^2(\mathbb {R}^2)\) and the three-dimensional \(\mathcal {L}_s^2(\mathbb {R}^2)\). Note that according to our characterizations, evaluation functionals over \(\mathcal {L}^2(\mathbb {R}^2)\) are of the form \(\varphi (A)=\langle A,N\rangle \) where \(\det N=0\), while evaluation functionals over \(\mathcal {L}_s^2(\mathbb {R}^2)\) are of the form \(\varphi (A)=\langle A,S\rangle \) where \(\det S\le 0\). We begin with counterexamples in \(\mathcal {L}^2(\mathbb {R}^2)\).

Example 1

Let \([I]^\perp \subset \mathcal {L}^2(\mathbb {R}^2)\) be the orthogonal complement of the line spanned by the identity matrix, and \(\{A_1, A_2, A_3 \}\) an orthonormal basis of \([I]^\perp \). We consider \(B=A_1+A_2+A_3-\frac{\varepsilon }{\sqrt{2}} I\) where \(\varepsilon > 0\) will be chosen later. Clearly \(B\not \in \mathcal {F} \equiv \{a_1 A_1+a_2 A_2+a_3 A_3 : a_i \ge 0 \}\). Let \(\varphi \) be a linear form separating B from \(\mathcal {F}\) chosen so that

$$\begin{aligned} \varphi (B)<0\le \varphi (X) \text { for all }X\in \mathcal {F}. \end{aligned}$$

\(\varphi \) is given by the matrix N: \(\varphi (X)=\langle X,N\rangle \). We may suppose N has norm one. We will show that if \(\varepsilon >0\) is chosen to be sufficiently small, then \(\varphi \) cannot be an evaluation map. We have

$$\begin{aligned} 0 > \varphi (B)=\langle B,N\rangle =\langle A_1+A_2+A_3,N\rangle - \frac{\varepsilon }{\sqrt{2}}\langle I,N\rangle , \end{aligned}$$

so

$$\begin{aligned} \langle A_1,N\rangle + \langle A_2,N\rangle + \langle A_3,N\rangle= & {} \langle A_1+A_2+A_3,N\rangle \\&< \frac{\varepsilon }{\sqrt{2}}\ \langle I,N\rangle \ \ \ \le \varepsilon . \end{aligned}$$

Thus

$$\begin{aligned} \langle A_1,N\rangle ^2 + \langle A_2,N\rangle ^2 + \langle A_3,N\rangle ^2&\le \left( \langle A_1,N\rangle + \langle A_2,N\rangle + \langle A_3,N\rangle \right) ^2 \\&< \varepsilon ^2. \end{aligned}$$

The first inequality follows from the fact that the right-hand side equals the left-hand side plus the cross terms \(2\langle A_i, N\rangle \langle A_j, N\rangle \) with \(i \ne j\), which are non-negative since \(0 \le \varphi (X)\) for all \(X \in \mathcal {F}\).

Also,

$$\begin{aligned} 1=\Vert N \Vert ^2=\langle A_1,N\rangle ^2 + \langle A_2,N\rangle ^2 + \langle A_3,N\rangle ^2 + \left\langle \frac{I}{\sqrt{2}},N\right\rangle ^2, \end{aligned}$$

so

$$\begin{aligned} \left\langle \frac{I}{\sqrt{2}},N\right\rangle ^2 = 1 - \left( \langle A_1,N\rangle ^2 + \langle A_2,N\rangle ^2 + \langle A_3,N\rangle ^2 \right) > 1-\varepsilon ^2. \end{aligned}$$

Thus \(\langle I,N\rangle > \sqrt{2}\sqrt{1-\varepsilon ^2}\) (note that \(\langle I,N\rangle >0\), since \(0\le \langle A_1,N\rangle + \langle A_2,N\rangle + \langle A_3,N\rangle < \frac{\varepsilon }{\sqrt{2}}\langle I,N\rangle \)). Now by the parallelogram law we have

$$\begin{aligned} \Vert I-N \Vert ^2&= 2\left( \Vert I \Vert ^2 + \Vert N \Vert ^2 \right) - \Vert I+N \Vert ^2 \\&= 6 - \left( \Vert I \Vert ^2 + 2\langle I,N\rangle + \Vert N \Vert ^2 \right) \\&= 3 - 2 \langle I,N\rangle \\&< 3 - 2\sqrt{2}\sqrt{1-\varepsilon ^2}, \end{aligned}$$

which can be made smaller than one for small \(\varepsilon \) because \(3 - 2\sqrt{2} < 0.18\). Thus, for such \(\varepsilon \), \(\Vert I-N \Vert \) is smaller than one, and N is invertible: since the Frobenius norm dominates the operator norm, any matrix at distance less than one from the identity is invertible. By Proposition 3, \(\varphi \) cannot be an evaluation map.
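The separation step can also be carried out numerically for one concrete choice of orthonormal basis of \([I]^\perp \) (a sketch assuming numpy and scipy; the basis below is one possible choice, not the one fixed in the text). Projecting B onto \(\mathcal {F}\) produces a separating matrix proportional to the identity, which is invertible and therefore, by Proposition 3, not an evaluation map.

```python
# Numerical companion to Example 1 (one concrete choice of orthonormal basis).
# Projecting B onto the cone F yields a separating matrix N which turns out to
# be a multiple of the identity: invertible, hence not an evaluation map.
import numpy as np
from scipy.optimize import nnls

I = np.eye(2)
A1 = np.array([[0.0, 1.0], [0.0, 0.0]])       # an orthonormal basis of [I]^perp
A2 = np.array([[0.0, 0.0], [1.0, 0.0]])       # w.r.t.  <A, B> = tr(A B^t)
A3 = np.array([[1.0, 0.0], [0.0, -1.0]]) / np.sqrt(2)
eps = 0.1
B = A1 + A2 + A3 - (eps / np.sqrt(2)) * I

G = np.column_stack([M.ravel() for M in (A1, A2, A3)])
a, _ = nnls(G, B.ravel())                     # projection of B onto the cone F
N = (G @ a).reshape(2, 2) - B                 # separating matrix: <N,B> < 0 <= <N,X> on F
N /= np.linalg.norm(N)
print(N)                                      # ~ I / sqrt(2)
print("phi(B) < 0:", np.trace(B @ N.T) < 0)             # True
print("N invertible:", abs(np.linalg.det(N)) > 1e-10)   # True: not an evaluation map
```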

For the symmetric case we have the following.

Example 2

Let \([I]^\perp \subset \mathcal {L}_s^2(\mathbb {R}^2)\) be the orthogonal complement of the line spanned by the identity matrix, and \(\{A_1, A_2 \}\) be an orthonormal basis of \([I]^\perp \). We consider \(B=A_1+A_2-\frac{\varepsilon }{\sqrt{2}} I\) where \(\varepsilon > 0\) will be chosen later. Clearly B is not in \(\mathcal {F} \equiv \{a_1 A_1+a_2 A_2 : a_i \ge 0 \}\). Let \(\varphi \) be such that

$$\begin{aligned} \varphi (B)<0\le \varphi (X) \text { for all }X\in \mathcal {F}. \end{aligned}$$

\(\varphi \) is given by the symmetric matrix S, which we may suppose has norm one: \(\varphi (X)=\langle X,S\rangle \). Picking a sufficiently small \(\varepsilon \), as in the previous example, we have \(\Vert I-S \Vert < 1\). But then the determinant of S must be positive: if \(\det S \le 0\), the line segment joining I and S would, by the intermediate value theorem, contain a matrix X with \(\det X = 0\) at a distance smaller than one from the identity, which is impossible since, as noted in Example 1, any matrix at distance less than one from the identity is invertible. By Proposition 4, \(\varphi \) cannot be an evaluation map.

Note that no basis of \([I]^\perp \) can be diagonalized simultaneously: otherwise, since the identity is diagonal in every basis, all symmetric bilinear forms on \(\mathbb {R}^2\) would be simultaneously diagonalizable, which is false.