
3.1 Flats and Affine Functions

In this section, we present the basic properties of flats and affine functions. Let E be a linear space (over \(\mathbb{R}\)). We call F ⊂ E a flat if the straight line through every two distinct points of F is contained in F, i.e.,

$$\displaystyle{(1 - t)x + ty \in F\;\mbox{ for each $x,y \in F$ and $t \in \mathbb{R}$.}}$$
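For example, the line \(F =\{ (s,1 - s)\mid s \in \mathbb{R}\} \subset {\mathbb{R}}^{2}\) is a flat that is not a linear subspace; indeed, for \(x = (s_{1},1 - s_{1})\), \(y = (s_{2},1 - s_{2}) \in F\) and \(t \in \mathbb{R}\),

$$\displaystyle{(1 - t)x + ty = \big((1 - t)s_{1} + ts_{2},\ 1 - ((1 - t)s_{1} + ts_{2})\big) \in F.}$$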

Evidently, the intersection and the product of flats are also flats. We have the following characterization of flats:

Proposition 3.1.1.

Let E be a linear space. For each non-empty subset F ⊂ E, the following conditions are equivalent:

  1. (a)

    F is a flat;

  2. (b)

    For each \(n \in \mathbb{N}\) , if \(v_{1},\ldots,v_{n} \in F\) and \(\sum _{i=1}^{n}t_{i} = 1\) , then \(\sum _{i=1}^{n}t_{i}v_{i} \in F\);

  3. (c)

    F − x is a linear subspace of E for any x ∈ F;

  4. (d)

    F − x 0 is a linear subspace of E for some x 0 ∈ E.

Proof.

By induction on \(n \in \mathbb{N}\), we can obtain (a) ⇒ (b). Condition (c) follows from the case n = 3 of (b) because, for each x, y, z ∈ F and \(a,b \in \mathbb{R}\),

$$\displaystyle{a(y - x) + b(z - x) + x = (1 - a - b)x + ay + bz.}$$

To see (c) ⇒ (a), let x, y ∈ F and \(t \in \mathbb{R}\). Since F − x is a linear subspace of E by (c), we have \(t(y - x) \in F - x\), which means \((1 - t)x + ty \in F\). The implication (c) ⇒ (d) is obvious.

(d) ⇒ (c): It suffices to show that if F − x 0 is a linear subspace of E, then \(F - x = F - x_{0}\) for any x ∈ F. For every z ∈ F, we have

$$\displaystyle{z - x = (z - x_{0}) - (x - x_{0}) \in F - x_{0}.}$$

Here, take z′ ∈ F so that \((z - x_{0}) + (x - x_{0}) = z^\prime - x_{0}\). Then, we have

$$\displaystyle{z - x_{0} = (z^\prime - x_{0}) - (x - x_{0}) = z^\prime - x \in F - x.}$$

Consequently, we have \(F - x = F - x_{0}\).

In the proof of the implication (d) ⇒ (c), we actually proved the following:

Corollary 3.1.2.

Let F be a flat in a linear space E. Then, \(F - x = F - y\) for any x,y ∈ F. □

A maximal proper flat H⊊E is called a hyperplane in E. The following proposition shows the relationship between hyperplanes and linear functionals.

Proposition 3.1.3.

Let E be a linear space.

  1. (1)

    For each hyperplane H ⊂ E, there is a linear functional \(f : E \rightarrow \mathbb{R}\) such that \(H = {f}^{-1}(s)\) for some \(s \in \mathbb{R}\);

  2. (2)

    For each non-trivial linear functional \(f : E \rightarrow \mathbb{R}\) and \(s \in \mathbb{R}\), \({f}^{-1}(s)\) is a hyperplane in E;

  3. (3)

    For linear functionals \(f_{1},f_{2} : E \rightarrow \mathbb{R}\), if \(f_{1}^{-1}(s_{1}) = f_{2}^{-1}(s_{2})\) for some \(s_{1},s_{2} \in \mathbb{R}\), then \(f_{2} = rf_{1}\) for some \(r \in \mathbb{R}\).

Proof.

(1): For a given x 0 ∈ H, \(H_{0} = H - x_{0}\) is a maximal proper linear subspace of E (Proposition 3.1.1). Let \(x_{1} \in E \setminus H_{0}\). For each x ∈ E, there exists a unique \(t \in \mathbb{R}\) such that \(x - tx_{1} \in H_{0}\). Indeed, \(E = H_{0} + \mathbb{R}x_{1}\) because of the maximality of H 0. Hence, we can write \(x = z + tx_{1}\) for some z ∈ H 0 and \(t \in \mathbb{R}\). Then, \(x - tx_{1} \in H_{0}\). Moreover, if \(x - t^\prime x_{1} \in H_{0}\) and \(t^\prime \in \mathbb{R}\), then \((t - t^\prime) x_{1} \in H_{0}\). Since \(x_{1}\not\in H_{0}\), it follows that t = t′. Therefore, we have a function \(f : E \rightarrow \mathbb{R}\) such that \(x - f(x)x_{1} \in H_{0}\). For each x, y ∈ E and \(a,b \in \mathbb{R}\),

$$\displaystyle{(ax + by) - (af(x) + bf(y))x_{1} = a(x - f(x)x_{1}) + b(y - f(y)x_{1}) \in H_{0},}$$

which means \(f(ax + by) = af(x) + bf(y)\), i.e., f is linear. Observe that \({f}^{-1}(0) = H_{0} = H - x_{0}\), hence it follows that \(H = {f}^{-1}(f(x_{0}))\).

(2): From the non-triviality of f, it follows that \(f(E) = \mathbb{R}\), and hence \(\emptyset\subsetneq {f}^{-1}(s) \subsetneq E\). A simple calculation shows that f  − 1(s) is a flat. To prove the maximality, let F ⊂ E be a flat with \({f}^{-1}(s) \subsetneq F\). Take x 0 ∈ f  − 1(s) and \(x_{1} \in F \setminus {f}^{-1}(s)\). Since f(x 1) ≠ f(x 0) and F is a flat, it follows that \(f(F) = \mathbb{R}\). For each x ∈ E, we can choose y ∈ F ∖ f  − 1(s) so that f(y) ≠ f(x). Note that \(s = tf(x) + (1 - t)f(y)\) for some \(t \in \mathbb{R} \setminus \{ 0\}\). Let \(z = tx + (1 - t)y \in {f}^{-1}(s) \subset F\). Then, \(x = (1 - {t}^{-1})y + {t}^{-1}z \in F\). Accordingly, we have F = E.

(3): When \(f_{1}^{-1}(s_{1}) = f_{2}^{-1}(s_{2}) = \emptyset\), both f 1 and f 2 are trivial (i.e., \(f_{1}(E) = f_{2}(E) =\{ 0\}\)), and hence f 1 = f 2. If \(f_{1}^{-1}(s_{1}) = f_{2}^{-1}(s_{2})\not =\emptyset\), take \(x_{0} \in f_{1}^{-1}(s_{1}) = f_{2}^{-1}(s_{2})\). Then, it follows that

$$\displaystyle{f_{1}^{-1}(0) = f_{ 1}^{-1}(s_{ 1}) - x_{0} = f_{2}^{-1}(s_{ 2}) - x_{0} = f_{2}^{-1}(0).}$$

Let \(H_{0} = f_{1}^{-1}(0) = f_{2}^{-1}(0)\) and x 1 ∈ E ∖ H 0. Analogous to (1), each x ∈ E can be uniquely written as \(x = y + tx_{1}\), where y ∈ H 0 and \(t \in \mathbb{R}\). Then, \(f_{1}(x) = tf_{1}(x_{1})\) and \(f_{2}(x) = tf_{2}(x_{1})\), hence \(f_{2}(x) = f_{1}(x)f_{1}{(x_{1})}^{-1}f_{2}(x_{1})\). Let \(r = f_{1}{(x_{1})}^{-1}f_{2}(x_{1})\). It follows that \(f_{2} = rf_{1}\).
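For instance, in \({\mathbb{R}}^{2}\), the linear functional \(f(x,y) = x + y\) gives the hyperplane

$$\displaystyle{H = {f}^{-1}(1) =\big\{ (x,y) \in {\mathbb{R}}^{2}\bigm |x + y = 1\big\},}$$

and, by (3), every linear functional \(g : {\mathbb{R}}^{2} \rightarrow \mathbb{R}\) with \({g}^{-1}(s) = H\) for some \(s \in \mathbb{R}\) is a scalar multiple of f.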

It is said that finitely many distinct points \(v_{1},\ldots,v_{n} \in E\) are affinely (or geometrically) independent provided that, for \(t_{1},\ldots,t_{n} \in \mathbb{R}\),

$$\displaystyle{\sum _{i=1}^{n}t_{ i}v_{i} = \mathbf{0},\ \sum _{i=1}^{n}t_{ i} = 0\; \Rightarrow \; t_{1} = \cdots = t_{n} = 0,}$$

i.e., \(v_{1} - v_{n},\ldots,v_{n-1} - v_{n}\) are linearly independent. In this case, the subset \(\{v_{1},\ldots,v_{n}\} \subset E\) is also said to be affinely (or geometrically) independent. An (infinite) subset A ⊂ E is said to be affinely (or geometrically) independent if every finite subset of A is affinely independent. This condition is equivalent to the condition that (A − v) ∖ {0} is linearly independent for some/any v ∈ A.
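For example, the points \((0,0),(1,0),(0,1) \in {\mathbb{R}}^{2}\) are affinely independent, whereas \((1,0),(0,1),(\tfrac{1}{2},\tfrac{1}{2})\) are not; indeed, taking \(t_{1} = t_{2} = 1\) and \(t_{3} = -2\), we have

$$\displaystyle{(1,0) + (0,1) - 2\big(\tfrac{1}{2},\tfrac{1}{2}\big) = (0,0)\;\text{ and }\;t_{1} + t_{2} + t_{3} = 0.}$$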

The smallest flat containing A ⊂ E is called the flat hull of A and is denoted by fl A. Then, \({\mathbb{R}}^{n} = \mathrm{fl}\{\mathbf{0},\mathbf{e}_{1},\ldots,\mathbf{e}_{n}\}\), where \(\{\mathbf{e}_{1},\ldots,\mathbf{e}_{n}\}\) is the canonical orthonormal basis for \({\mathbb{R}}^{n}\) (i.e., e i (i) = 1 and e i (j) = 0 for j ≠ i). Observe that

$$\displaystyle\begin{array}{rcl} & & \mathrm{fl}\,\{v_{1},\ldots,v_{n}\} =\big\{\sum _{ i=1}^{n}t_{i}v_{i}\bigm |\sum _{i=1}^{n}t_{i} = 1\big\}\;\text{ and} {}\\ & & \mathrm{fl}A =\bigcup \big\{ \mathrm{fl}\,\{x_{1},\ldots,x_{n}\}\bigm |n \in \mathbb{N},\ x_{1},\ldots,x_{n} \in A\big\}. {}\\ \end{array}$$

By Zorn’s Lemma, every non-empty subset A ⊂ E contains a maximal affinely independent subset A 0 ⊂ A. Then, fl A 0 = fl A and each x ∈ fl A can be uniquely written as \(x =\sum _{ i=1}^{n}t_{i}v_{i}\), where \(v_{1},\ldots,v_{n} \in A_{0}\) and \(t_{1},\ldots,t_{n} \in \mathbb{R} \setminus \{ 0\}\) such that \(\sum _{i=1}^{n}t_{i} = 1\). In fact, for some/any v ∈ A 0, (A 0 − v) ∖ {0} (\(= (A_{0} \setminus \{ v\}) - v\)) is a Hamel basis for the linear subspace fl A − v (\(= \mathrm{fl}\,A_{0} - v\)) of E.

The dimension of a flat F ⊂ E is denoted by dimF, and is defined by the dimension of the linear space F − x for some/any x ∈ F, i.e., \(\dim F =\dim (F - x)\). When dim F = n (resp. \(\dim F < \infty\) or \(\dim F = \infty\)), it is said that F is \(\boldsymbol{n}\)-dimensional (resp. finite-dimensional (abbrev. f.d.) or infinite-dimensional (abbrev. i.d.)). Therefore, every n-dimensional flat F ⊂ E contains n + 1 points \(v_{1},\ldots,v_{n+1}\) such that \(F = \mathrm{fl}\,\{v_{1},\ldots,v_{n+1}\}\). In this case, \(v_{1},\ldots,v_{n+1}\) are affinely independent. Conversely, if \(F = \mathrm{fl}\,\{v_{1},\ldots,v_{n+1}\}\) for some n + 1 affinely independent points \(v_{1},\ldots,v_{n+1} \in F\), then dimF = n.

Let F and F′ be flats in linear spaces E and E′, respectively. A function f : F → F′ is said to be affine if it satisfies the following condition:

$$\displaystyle{f((1 - t)x + ty) = (1 - t)f(x) + tf(y)\;\mbox{ for each $x,y \in F$ and $t \in \mathbb{R}$,}}$$

which is equivalent to the following:

$$\displaystyle\begin{array}{rcl} f\big(\sum _{i=1}^{n}t_{i}v_{i}\big)& =& \sum _{i=1}^{n}t_{ i}f(v_{i}) {}\\ & & \mbox{ for each $n \in \mathbb{N}$, $v_{i} \in F$, $t_{i} \in \mathbb{R}$ with $\sum _{i=1}^{n}t_{i} = 1$.} {}\\ \end{array}$$

Recall that F ⊂ E is a flat if and only if F − x 0 is a linear subspace of E for some/any x 0 ∈ F (Proposition 3.1.1).

Proposition 3.1.4.

Let f : F → F′ be a function between flats F and F′ in linear spaces E and E′, respectively. In order that f is affine, it is necessary and sufficient that the following \({f}^{x_{0}} : F - x_{0} \rightarrow F^\prime - f(x_{0})\) is linear for some/any x 0 ∈ F:

$$\displaystyle{{f}^{x_{0} }(x) = f(x + x_{0}) - f(x_{0})\;\mbox{ for each $x \in F - x_{0}$.}}$$

Proof.

(Necessity) For each x, y ∈ F − x 0 and \(a,b \in \mathbb{R}\),

$$\displaystyle\begin{array}{rcl}{ f}^{x_{0} }(ax + by)& =& f(ax + by + x_{0}) - f(x_{0}) {}\\ & =& f(a(x + x_{0}) + b(y + x_{0}) + (1 - a - b)x_{0}) - f(x_{0}) {}\\ & =& af(x + x_{0}) + bf(y + x_{0}) + (1 - a - b)f(x_{0}) - f(x_{0}) {}\\ & =& a(f(x + x_{0}) - f(x_{0})) + b(f(y + x_{0}) - f(x_{0})) {}\\ & =& a{f}^{x_{0} }(x) + b{f}^{x_{0} }(y). {}\\ \end{array}$$

(Sufficiency) For each x, y ∈ F and \(t \in \mathbb{R}\),

$$\displaystyle\begin{array}{rcl} f((1 - t)x + ty)& =& {f}^{x_{0} }((1 - t)x + ty - x_{0}) + f(x_{0}) {}\\ & =& {f}^{x_{0} }((1 - t)(x - x_{0}) + t(y - x_{0})) + f(x_{0}) {}\\ & =& (1 - t){f}^{x_{0} }(x - x_{0}) + t{f}^{x_{0} }(y - x_{0}) + f(x_{0}) {}\\ & =& (1 - t)({f}^{x_{0} }(x - x_{0}) + f(x_{0})) + t({f}^{x_{0} }(y - x_{0}) + f(x_{0})) {}\\ & =& (1 - t)f(x) + tf(y). {}\\ \end{array}$$
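For example, the affine functions \(f : {\mathbb{R}}^{n} \rightarrow {\mathbb{R}}^{m}\) are exactly the functions of the form

$$\displaystyle{f(x) = Ax + b,\;\mbox{ where $A$ is an $m \times n$ matrix and $b = f(\mathbf{0}) \in {\mathbb{R}}^{m}$,}}$$

where A represents the linear function \({f}^{\mathbf{0}}\) of Proposition 3.1.4.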

Proposition 3.1.5.

Let A be a non-empty affinely independent subset of a linear space E. Then, every function g : A → E′ to another linear space E′ uniquely extends to an affine function \(\tilde{g} : \mathrm{fl}\,A \rightarrow E ^\prime\) such that \(\tilde{g}(\mathrm{fl}\,A) = \mathrm{fl}\,g(A)\) . Accordingly, every affine function f defined on F = fl  A is uniquely determined by f|A and the image f(F) is a flat.

Proof.

Let F′ = fl g(A) and take v 0 ∈ A. Since \((A \setminus \{ v_{0}\}) - v_{0}\) is a Hamel basis of the linear subspace fl A − v 0 of E, we have the unique linear function \(h : \mathrm{fl}\,A - v_{0} \rightarrow F ^\prime - g(v_{0})\) such that

$$\displaystyle{h(v - v_{0}) = g(v) - g(v_{0})\;\mbox{ for each $v \in A \setminus \{ v_{0}\}$.}}$$

Then, g uniquely extends to the affine function \(\tilde{g} : \mathrm{fl}\,A \rightarrow F ^\prime\) defined by

$$\displaystyle{\tilde{g}(x) = h(x - v_{0}) + g(v_{0})\;\mbox{ for each $x \in \mathrm{fl}\,A$.}}$$

It is easy to see that \(\tilde{g}(\mathrm{fl}\,A) = \mathrm{fl}\,g(A)\).

Additional Properties of Flats and Affine Functions 3.1.6.

In the following, let E and E′ be linear spaces and f : F → E′ be a function of a flat F in E.

  1. (1)

    If f is affine and F′ is a flat in E′, then f(F) and \({f}^{-1}(F^\prime)\) are flats in E′ and E, respectively.

  2. (2)

    A function f is affine if and only if the graph Gr (f) = { (x, f(x))∣x ∈ F} of f is a flat in E ×E′.

3.2 Convex Sets

In this section, we introduce the basic concepts of convex sets. A subset C ⊂ E is said to be convex if the line segment with end points in C is contained in C, i.e.,

$$\displaystyle{(1 - t)x + ty \in C\;\mbox{ for each $x,y \in C$ and $t \in \mathbf{I}$.}}$$

By induction on n, it can be proved that every convex set C ⊂ E satisfies the following condition:

$$\displaystyle{\sum _{i=1}^{n}z(i)v_{ i} \in C\;\mbox{ for each $n \in \mathbb{N}$, $v_{i} \in C$ and $z \in {\Delta }^{n-1}$,}}$$

where \({\Delta }^{n-1} =\{ z \in {\mathbf{I}}^{n}\mid \sum _{i=1}^{n}z(i) = 1\}\) is the standard (n − 1)-simplex. The following is easy:

  • If A, B ⊂ E are convex, then aA + bB is also convex for each \(a,b \in \mathbb{R}\).

The dimension of a convex set C ⊂ E is defined by the dimension of the flat hull fl C, i.e., \(\dim C =\dim \mathrm{fl}\,C\). Concerning the flat hull of a convex set, we have the following proposition:

Proposition 3.2.1.

For each convex set C ⊂ E,

$$\displaystyle{\mathrm{fl}\,C =\big\{ (1 - t)x + ty\bigm |x,y \in C,\ t \in \mathbb{R}\big\}.}$$

Proof.

Each z ∈ fl C can be written \(z =\sum _{ i=1}^{n}t_{i}x_{i}\), where \(x_{i} \in C\) and \(\sum _{i=1}^{n}t_{i} = 1\). We may assume that \(t_{1} \leq \cdots \leq t_{n} \in \mathbb{R} \setminus \{ 0\}\). If \(t_{1} \geq 0\) then z ∈ C. Otherwise, \(t_{k} < 0\) and \(t_{k+1} > 0\) for some \(k = 1,\ldots,n - 1\). Then, we have \(t =\sum _{ i=1}^{n-k}t_{k+i} > 0\), where \(1 - t =\sum _{ i=1}^{k}t_{i} < 0\). Let

$$\displaystyle{x =\sum _{ i=1}^{k}{(1 - t)}^{-1}t_{ i}x_{i},\ y =\sum _{ i=1}^{n-k}{t}^{-1}t_{ k+i}x_{k+i} \in C.}$$

Then, \(z = (1 - t)x + ty\). Accordingly, we have

$$\displaystyle{\mathrm{fl}\,C \subset \big\{ (1 - t)x + ty\bigm |x,y \in C,\ t \in \mathbb{R}\big\}.}$$

The converse inclusion is obvious.

The smallest convex set containing A ⊂ E is called the convex hull of A and is denoted by ⟨A⟩. We simply write \(\langle v_{1},\ldots,v_{n}\rangle =\langle \{ v_{1},\ldots,v_{n}\}\rangle\). Then, \({\Delta }^{n-1} =\langle \mathbf{e}_{1},\ldots,\mathbf{e}_{n}\rangle\). Observe that

$$\displaystyle\begin{array}{rcl} & & \langle v_{1},\ldots,v_{n}\rangle =\big\{\sum _{ i=1}^{n}z(i)v_{i}\bigm |z \in {\Delta }^{n-1}\big\}\;\text{ and} {}\\ & & \langle A\rangle =\bigcup \big\{\langle x_{1},\ldots,x_{n}\rangle \bigm |n \in \mathbb{N},\ x_{1},\ldots,x_{n} \in A\big\}. {}\\ \end{array}$$
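For example, in \({\mathbb{R}}^{2}\),

$$\displaystyle{\langle (0,0),(1,0),(0,1)\rangle =\big\{ (s,t) \in {\mathbb{R}}^{2}\bigm |s,t \geq 0,\ s + t \leq 1\big\},}$$

whereas \(\mathrm{fl}\,\{(0,0),(1,0),(0,1)\} = {\mathbb{R}}^{2}\).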

For each two non-empty subsets A, B ⊂ E,

$$\displaystyle\begin{array}{rcl} \langle A \cup B\rangle & =& \big\{(1 - t)x + ty\bigm |x \in \langle A\rangle,\ y \in \langle B\rangle,\ t \in \mathbf{I}\big\}\;\text{ and} {}\\ & & \langle aA + bB\rangle = a\langle A\rangle + b\langle B\rangle \;\mbox{ for $a,b \in \mathbb{R}$.} {}\\ \end{array}$$

The second equality can be proved as follows: Because \(a\langle A\rangle + b\langle B\rangle\) is convex and \(aA + bB \subset a\langle A\rangle + b\langle B\rangle\), we have \(\langle aA + bB\rangle \subset a\langle A\rangle + b\langle B\rangle\). To show that \(a\langle A\rangle + b\langle B\rangle \subset \langle aA + bB\rangle\), let x ∈ ⟨A⟩ and y ∈ ⟨B⟩. Then, \(x =\sum _{ i=1}^{n}t_{i}x_{i}\) and \(y =\sum _{ j=1}^{m}s_{j}y_{j}\) for some x i  ∈ A, y j  ∈ B, and t i , s j  > 0 with \(\sum _{i=1}^{n}t_{i} =\sum _{ j=1}^{m}s_{j} = 1\). Since \(ax_{i} + by_{j} \in aA + bB\) and \(\sum _{i=1}^{n}\sum _{j=1}^{m}t_{i}s_{j} = 1\), it follows that

$$\displaystyle\begin{array}{rcl} ax + by& =& \sum _{i=1}^{n}t_{ i}(ax_{i} + by) =\sum _{ i=1}^{n}t_{ i}\bigg(\sum _{j=1}^{m}s_{ j}(ax_{i} + by_{j})\bigg) {}\\ & =& \sum _{i=1}^{n}\sum _{ j=1}^{m}t_{ i}s_{j}(ax_{i} + by_{j}) \in \langle aA + bB\rangle . {}\\ \end{array}$$

Let C and C′ be non-empty convex sets in the linear spaces E and E′, respectively. A function f : C → C′ is said to be affine (or linear in the affine sense) provided

$$\displaystyle{f((1 - t)x + ty) = (1 - t)f(x) + tf(y)\;\mbox{ for each $x,y \in C$ and $t \in \mathbf{I}$.}}$$

As in the definition of a flat, I can be replaced by \(\mathbb{R}\), i.e.,

$$\displaystyle\begin{array}{rcl} & & x,y \in C,\ t \in \mathbb{R},\ (1 - t)x + ty \in C {}\\ & & \quad \Rightarrow f((1 - t)x + ty) = (1 - t)f(x) + tf(y). {}\\ \end{array}$$

Indeed, let \(z = (1 - t)x + ty \in C\) in the above expression. When t < 0, consider

$$\displaystyle{x = \frac{1} {1 - t}z + \frac{-t} {1 - t}y,\ \frac{1} {1 - t} \in \mathbf{I},\ \frac{-t} {1 - t} = 1 - \frac{1} {1 - t}.}$$

When t > 1, consider

$$\displaystyle{y = \frac{1} {t} z + \frac{t - 1} {t} x,\ \frac{1} {t} \in \mathbf{I},\ \frac{t - 1} {t} = 1 -\frac{1} {t} .}$$

As is easily seen, f : C → C′ is affine if and only if

$$\displaystyle{f\big(\sum _{i=1}^{n}z(i)v_{i}\big) =\sum _{ i=1}^{n}z(i)f(v_{ i})\ \mbox{ for each $n \in \mathbb{N}$, $v_{i} \in C$ and $z \in {\Delta }^{n-1}$,}}$$

which is equivalent to the following:

$$\displaystyle{v_{i} \in C,\ t_{i} \in \mathbb{R},\ \sum _{i=1}^{n}t_{ i}v_{i} \in C,\ \sum _{i=1}^{n}t_{ i} = 1 \Rightarrow f\big(\sum _{i=1}^{n}t_{ i}v_{i}\big) =\sum _{ i=1}^{n}t_{ i}f(v_{i}).}$$

For every affine function f : C → E′ of a convex set C ⊂ E into another linear space E′, the image f(C) is also convex.

Proposition 3.2.2.

Let C and D be non-empty convex sets in the linear spaces E and E′, respectively. Every affine function f : C → D uniquely extends to an affine function \(\tilde{f} : \mathrm{fl}\,C \rightarrow \mathrm{fl}\,D\) . Moreover, if f is injective (or surjective) then so is \(\tilde{f}\) .

Proof.

Let C 0 be a maximal affinely independent subset of C. Then, fl C = fl C 0. Due to Proposition 3.1.5, f | C 0 uniquely extends to an affine function \(\tilde{f} : \mathrm{fl}\,C \rightarrow \mathrm{fl}\,D\). From the above remark, we can see that \(\tilde{f}\vert C = f\).

If f is injective, we show that \(\tilde{f}\) is also injective. By the definition of \(\tilde{f}\) in the proof of Proposition 3.1.5, it suffices to show that f(C 0) is affinely independent. Assume that f(C 0) is not affinely independent, i.e., there are distinct points \(v_{1},\ldots,v_{n} \in C_{0}\) and \(t_{1},\ldots,t_{n} \in \mathbb{R} \setminus \{ 0\}\) such that \(\sum _{i=1}^{n}t_{i}f(v_{i}) = \mathbf{0}\) and \(\sum _{i=1}^{n}t_{i} = 0\). Without loss of generality, it can be assumed that \(t_{1},\ldots,t_{k} > 0\) and \(t_{k+1},\ldots,t_{n} < 0\). Note that \(1 \leq k < n\) and \(\sum _{i=1}^{k}t_{i} = -\sum _{j=k+1}^{n}t_{j} > 0\). Let

$$\displaystyle{x =\sum _{ i=1}^{k}\frac{t_{i}} {s} v_{i}\;\text{ and }\;y =\sum _{ j=k+1}^{n} -\frac{t_{j}} {s} v_{j},\;\mbox{ where $s =\sum _{ i=1}^{k}t_{i} > 0$.}}$$

Then, x, y ∈ C and f(x) = f(y) because

$$\displaystyle{f(x) - f(y) = \frac{1} {s}\sum _{i=1}^{n}t_{ i}f(v_{i}) = \mathbf{0}.}$$

Since f is injective, we have x = y. Hence, it follows that \(\sum _{i=1}^{k}t_{i}v_{i} = -\sum _{j=k+1}^{n}t_{j}v_{j}\), i.e., \(\sum _{i=1}^{n}t_{i}v_{i} = \mathbf{0}\). Because C 0 is affinely independent, \(t_{1} = \cdots = t_{n} = 0\), which is a contradiction.

Finally, we show that if f is surjective then so is \(\tilde{f}\). By Proposition 3.2.1, each z ∈ fl D can be written as follows:

$$\displaystyle{z = (1 - t)y + ty ^\prime,\ y,y ^\prime \in D,\ t \in \mathbb{R}.}$$

Since f is surjective, we have x, x′ ∈ C such that f(x) = y and f(x′) = y′. Then, \((1 - t)x + tx ^\prime \in \mathrm{fl}\,C\) and

$$\displaystyle{\tilde{f}((1 - t)x + tx ^\prime) = (1 - t)y + ty^\prime = z.}$$

Therefore, \(\tilde{f}\) is also surjective.

Let C be a convex set in a linear space E. The following set is called the radial interior of C:

$$\displaystyle{\mathrm{rint}\,C =\big\{ x \in C\bigm |\forall y \in C,\ \exists \delta > 0\;\text{ such that }\;(1+\delta )x-\delta y \in C\big\}}$$

In the case \(C =\langle v_{1},\ldots,v_{n}\rangle\), observe that

$$\displaystyle{\mathrm{rint}\,\langle v_{1},\ldots,v_{n}\rangle =\big\{\sum _{ i=1}^{n}z(i)v_{i}\bigm |z \in {\Delta }^{n-1} \cap {(0,\infty )}^{n}\big\}.}$$

Indeed, let \(x_{0} =\sum _{ i=1}^{n}{n}^{-1}v_{i} \in \langle v_{1},\ldots,v_{n}\rangle\). For each \(x \in \mathrm{rint}\,\langle v_{1},\ldots,v_{n}\rangle\), we have \(y \in \langle v_{1},\ldots,v_{n}\rangle\) such that \(x \in \langle x_{0},y\rangle\), i.e., \(x = (1 - t)x_{0} + ty\) for some t ∈ (0, 1). Then, \(y =\sum _{ i=1}^{n}z(i)v_{i}\) for some \(z \in {\Delta }^{n-1}\). It follows that \(x =\sum _{ i=1}^{n}((1 - t){n}^{-1} + tz(i))v_{i}\), where \(\sum _{i=1}^{n}((1 - t){n}^{-1} + tz(i)) = 1\) and \((1 - t){n}^{-1} + tz(i) > 0\) for all \(i = 1,\ldots,n\). Thus, x is a point of the right-hand side set. Conversely, it is straightforward to prove that each point of the right-hand side set belongs to \(\mathrm{rint}\,\langle v_{1},\ldots,v_{n}\rangle\).

In particular, \(\mathrm{rint}\,\langle v_{1},v_{2}\rangle =\{ (1 - t)v_{1} + tv_{2}\mid 0 < t < 1\}\), and hence \(\mathrm{rint}\,\langle v_{1},v_{2}\rangle =\langle v_{1},v_{2}\rangle \setminus \{ v_{1},v_{2}\}\) if v 1 ≠ v 2. The radial interior of C can also be defined as

$$\displaystyle{\mathrm{rint}\,C =\big\{ x \in C\bigm |\forall y \in C,\ \exists z \in C\;\mbox{ such that $x \in \mathrm{rint}\,\langle y,z\rangle $}\big\}.}$$

For each x ∈ C, the following subset C x  ⊂ C is called the face of C at x:

$$ \begin{array}{rcl} C_{x}& =& \big\{y \in C\bigm |\exists \delta > 0\;\text{ such that }\;(1+\delta )x -\delta y \in C\big\} {}\\ & =& \big\{y \in C\bigm |\exists z \in C\;\mbox{ such that $ x \in \mathrm{rint}\,\langle y,z\rangle $}\big\} \end{array}$$

By an easy observation, we have

$$\displaystyle{\mathrm{rint}\,C =\{ x \in C\mid C_{x} = C\},\quad \text{i.e., }\;x \in \mathrm{rint}\,C \Leftrightarrow C_{x} = C.}$$
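For example, for the square \(C = {\mathbf{I}}^{2} \subset {\mathbb{R}}^{2}\), we have \(C_{x} = C\) for \(x \in {(0,1)}^{2}\), \(C_{x} =\{ 0\} \times \mathbf{I}\) for \(x \in \{ 0\} \times (0,1)\), and \(C_{(0,0)} =\{ (0,0)\}\).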

When C x  = { x}, we call x an extreme point of C. It is said that x ∈ E is linearly accessible from C if there is some y ∈ C such that

$$\displaystyle{\mathrm{rint}\,\langle x,y\rangle \subset C\quad (\text{i.e., }\langle x,y\rangle \setminus \{ x\} \subset C).}$$

The radial closure rcl C of C is the set of all linearly accessible points from C. It should be noted that rcl C ⊂ fl C by Proposition 3.2.1, hence fl rcl C = fl C. Consequently, we have the following inclusions:

$$\displaystyle{\mathrm{rint}\,C \subset C \subset \mathrm{rcl}\,C \subset \mathrm{fl}\,C.}$$

The set ∂C = rcl C ∖ rint C is called the radial boundary of C.

Remark 1.

Note that A ⊂ B implies rcl A ⊂ rcl B, but it does not imply rint A ⊂ rint B. For example, consider \(A ={ \mathbf{I}}^{n} \times \{\mathbf{0}\} \subset B ={ \mathbf{I}}^{n+1}\). Then, \(A \cap \mathrm{rint}\,B = \emptyset\).

For the Hilbert cube \(\boldsymbol{Q} = {[-1,1]}^{\mathbb{N}}\), we have

$$\displaystyle{\mathrm{rint}\,\boldsymbol{Q} =\big\{ x \in \boldsymbol{ Q}\bigm |\sup _{i\in \mathbb{N}}\vert x(i)\vert < 1\big\} \subsetneq {(-1,1)}^{\mathbb{N}}.}$$
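Indeed, for \(y = \mathbf{0} \in \boldsymbol{Q}\), we have \((1+\delta )x -\delta \mathbf{0} = (1+\delta )x \in \boldsymbol{Q}\) for some \(\delta > 0\) if and only if \(\sup _{i\in \mathbb{N}}\vert x(i)\vert \leq {(1+\delta )}^{-1} < 1\). Conversely, if \(\sup _{i\in \mathbb{N}}\vert x(i)\vert < 1\), then for each \(y \in \boldsymbol{Q}\) and sufficiently small \(\delta > 0\), \(\vert (1+\delta )x(i) -\delta y(i)\vert \leq (1+\delta )\sup _{j\in \mathbb{N}}\vert x(j)\vert +\delta \leq 1\) for every \(i \in \mathbb{N}\). For example, the point x defined by \(x(i) = 1 - {2}^{-i}\) belongs to \({(-1,1)}^{\mathbb{N}} \setminus \mathrm{rint}\,\boldsymbol{Q}\).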

Observe that \(\mathrm{rint}\,[-1,1]_{f}^{\mathbb{N}} = (-1,1)_{f}^{\mathbb{N}}\) but \(\mathrm{rint}\,\mathbf{I}_{f}^{\mathbb{N}} = \emptyset\), where

$$\displaystyle{[-1,1]_{f}^{\mathbb{N}} = \mathbb{R}_{ f}^{\mathbb{N}}\cap {[-1,1]}^{\mathbb{N}},\ (-1,1)_{ f}^{\mathbb{N}} = \mathbb{R}_{ f}^{\mathbb{N}}\cap {(-1,1)}^{\mathbb{N}},\text{ and }\;\mathbf{I}_{ f}^{\mathbb{N}} = \mathbb{R}_{ f}^{\mathbb{N}}\cap {\mathbf{I}}^{\mathbb{N}}}.$$

As is easily observed, \(\mathbf{I}_{f}^{\mathbb{N}} = \mathrm{rcl}\,(\mathbf{I}_{f}^{\mathbb{N}} \setminus \{\mathbf{0}\})\). It will be shown in Remark 3 that \(\mathbf{I}_{f}^{\mathbb{N}} \setminus \{\mathbf{0}\} = \mathrm{rcl}\,C\) for some convex set \(C \subset \mathbb{R}_{f}^{\mathbb{N}}\).

Remark 2.

The unit closed ball \(\mathbf{B}_{\boldsymbol{c}_{0}}\) of the Banach space \(\boldsymbol{c}_{0}\) has no extreme points. In fact, every \(x \in \mathbf{B}_{\boldsymbol{c}_{0}}\) is the midpoint of two distinct points \(y,z \in \mathbf{B}_{\boldsymbol{c}_{0}}\), i.e., \(x = \frac{1} {2}y + \frac{1} {2}z\). For example, choose \(n \in \mathbb{N}\) so that \(\vert x(n)\vert < \frac{1} {2}\) and let \(y,z \in \mathbf{B}_{\boldsymbol{c}_{0}}\) such that \(y(i) = z(i) = x(i)\) for i ≠ n, \(y(n) = x(n) + \frac{1} {2}\), and \(z(n) = x(n) -\frac{1} {2}\).

Proposition 3.2.3.

Let C ⊂ E be a convex set. If x ∈rint  C, y ∈rcl  C, and 0 ≤ t < 1, then \((1 - t)x + ty \in \mathrm{rint}\,C\) , i.e., ⟨x,y⟩∖{ y} ⊂rint  C.

Proof.

For each z ∈ C, we have to find v ∈ C and 0 < s < 1 such that

$$\displaystyle{(1 - t)x + ty = (1 - s)z + sv \in \mathrm{rint}\,\langle z,v\rangle .}$$

Take w ∈ C so that rint ⟨w, y⟩ ⊂ C, and choose 0 < r < 1 so that

$$\displaystyle{z^\prime = (1 + r)x - rz,\ w^\prime = (1 + r)x - rw \in C.}$$
Fig. 3.1 \((1 - t)x + ty \in \mathrm{rint}\,C\)

The desired v is to be written as

$$\displaystyle{v = t_{1}y + t_{2}w + t_{3}w^\prime + t_{4}z^\prime = (t_{1} + t_{2})u + (t_{3} + t_{4})u^\prime \in C,}$$

where \(t_{1} + t_{2} + t_{3} + t_{4} = 1\), \(t_{1},t_{2},t_{3},t_{4} > 0\),

$$\displaystyle{u = \frac{t_{1}} {t_{1} + t_{2}}y + \frac{t_{2}} {t_{1} + t_{2}}w,\ u^\prime = \frac{t_{3}} {t_{3} + t_{4}}w^\prime + \frac{t_{4}} {t_{3} + t_{4}}z^\prime \in C.}$$

Then, we have

$$\displaystyle\begin{array}{rcl} (1 - s)z + sv& =& (1 - s)z + s(t_{1}y + t_{2}w + t_{3}w^\prime + t_{4}z^\prime) {}\\ & =& st_{1}y + s(t_{2} - t_{3}r)w + s(t_{3} + t_{4})(1 + r)x + (1 - s - st_{4}r)z. {}\\ \end{array}$$

To obtain \((1 - s)z + sv = (1 - t)x + ty\), it is enough to find \(t_{1},t_{2},t_{3},t_{4} > 0\) and 0 < s < 1 satisfying the simultaneous equations: st 1 = t, t 2 = t 3 r, \(s(t_{3}+t_{4})(1+r) = 1 - t\), and \(1 - s = st_{4}r\), i.e.,

$$\displaystyle{({\ast})\qquad \qquad \qquad \quad t_{1} = \frac{t} {s},\ t_{4} = \frac{1 - s} {rs},\ t_{3} = \frac{1} {r} - \frac{1 + rt} {(1 + r)rs},\ t_{2} = 1 - \frac{1 + rt} {(1 + r)s}.}$$

Since t 1, t 4 < 1 and 0 < t 2 ( < t 3), it is necessary to satisfy

$$\displaystyle{\max \bigg\{t,\ \frac{1} {1 + r},\ \frac{1 + rt} {1 + r} \bigg\} < s < 1.}$$

We can take such an s because the left side of the above inequality is less than 1. Then, we can define \(t_{1},t_{2},t_{3},t_{4} > 0\) as in ( ∗ ), which satisfies \(t_{1} + t_{2} + t_{3} + t_{4} = 1\). Thus, we have the desired \(v = t_{1}y + t_{2}w + t_{3}w^\prime + t_{4}z^\prime \in C\) — Fig. 3.1.

Although we verified in Remark 1 that A ⊂ B does not imply rint A ⊂ rint B in general, we do have the following corollary:

Corollary 3.2.4.

Let A and B be non-empty convex sets in E. If A ⊂ B and A ∩rint  B≠∅, then rint  A ⊂rint  B.

Proof.

Let x ∈ A ∩ rint B. For each y ∈ rint A, we have z ∈ A such that \(y \in \mathrm{rint}\,\langle x,z\rangle\). Since \(\mathrm{rint}\,\langle x,z\rangle \subset \mathrm{rint}\,B\) by Proposition 3.2.3, it follows that y ∈ rint B.

Proposition 3.2.5.

For each convex set C ⊂ E, the following statements hold:

  1. (1)

    Both rint  C and rcl  C are convex;

  2. (2)

    rint  rint  C = rint  C ⊂rint  rcl  C;

  3. (3)

    \(\mathrm{rint}\,C\not =\emptyset\Rightarrow \mathrm{rint}\,\mathrm{rcl}\,C = \mathrm{rint}\,C,\ \mathrm{rcl}\,\mathrm{rint}\,C = \mathrm{rcl}\,\mathrm{rcl}\,C = \mathrm{rcl}\,C\), in which case \(\partial \mathrm{rint}\,C = \partial \mathrm{rcl}\,C = \partial C\);

  4. (4)

    \(\mathrm{rint}\,C\not =\emptyset\Rightarrow \mathrm{fl}\,C = \mathrm{fl}\,\mathrm{rint}\,C\);

  5. (5)

    \(\mathrm{rint}\,C\not =\varnothing,\ \mathrm{rcl}\,C = \mathrm{fl}\,C \Rightarrow \mathrm{rint}\,C = C = \mathrm{fl}\,C\);

  6. (6)

    \(\partial C\not =\emptyset\Leftrightarrow \emptyset\not =C \subsetneq \mathrm{fl}\,C\);

  7. (7)

    C x is convex and C x = C ∩fl  C x for x ∈ C;

  8. (8)

    x ∈rint  C x for x ∈ C, hence \((C_{x})_{x} = C_{x}\);

  9. (9)

    \((C_{x})_{y} = C_{y}\) for x ∈ C and y ∈ C x;

  10. (10)

    C x = C y for x ∈ C and y ∈rint  C x .

Proof.

(1): To prove the convexity of rint C, we can apply Proposition 3.2.3. It is now quite straightforward to show the convexity of rcl C.

(2): To show rint C ⊂ rint rint C, we can apply Proposition 3.2.3. Because rint (rint C) ⊂ rint C by Corollary 3.2.4, we have rint rint C = rint C.

For each x ∈ rint C and y ∈ rcl C, \(\frac{1} {2}x + \frac{1} {2}y \in \mathrm{rint}\,C\) by Proposition 3.2.3. Then, we have δ > 0 such that \((1+\delta )x -\delta (\frac{1} {2}x + \frac{1} {2}y) \in C\), i.e., \((1 + \frac{1} {2}\delta )x -\frac{1} {2}\delta y \in C\). Hence, x ∈ rint rcl C.

(3): Let x 0 ∈ rint C. For each x ∈ rint rcl C, we have y ∈ rcl C such that \(x \in \mathrm{rint}\,\langle x_{0},y\rangle\), which implies that x ∈ rint C by Proposition 3.2.3. Combining this with (2) yields rint rcl C = rint C.

We now have x 0 ∈ rint C = rint rcl C. If x ∈ rcl rcl C, then \(\mathrm{rint}\,\langle x_{0},x\rangle \subset \mathrm{rint}\,\mathrm{rcl}\,C = \mathrm{rint}\,C\) by Proposition 3.2.3, which means that x ∈ rcl rint C. Since \(\mathrm{rcl}\,\mathrm{rint}\,C \subset \mathrm{rcl}\,C \subset \mathrm{rcl}\,\mathrm{rcl}\,C\), we have \(\mathrm{rcl}\,\mathrm{rint}\,C = \mathrm{rcl}\,C = \mathrm{rcl}\,\mathrm{rcl}\,C\).

(4): Let x 0 ∈ rint C. For each x ∈ C, \(\frac{1} {2}x + \frac{1} {2}x_{0} \in \mathrm{fl}\,\mathrm{rint}\,C\) by Proposition 3.2.3. Then, it follows from Proposition 3.2.1 that \(x = 2(\frac{1} {2}x + \frac{1} {2}x_{0}) - x_{0} \in \mathrm{fl}\,\mathrm{rint}\,C\). Accordingly, we have C ⊂ fl rint C, which implies fl C ⊂ fl rint C. Since fl rint C ⊂ fl C, we have fl C = fl rint C.

(5): Let x 0 ∈ rint C. For each x ∈ fl C, \(2x - x_{0} \in \mathrm{fl}\,C = \mathrm{rcl}\,C\). Then, \(x = \tfrac{1} {2}x_{0} + \tfrac{1} {2}(2x - x_{0}) \in \mathrm{rint}\,C \subset C\) by Proposition 3.2.3.

(6): Assume \(\emptyset\not =C \subsetneq \mathrm{fl}\,C\). Then, we have x ∈ fl C ∖ C, which can be written as \(x = (1 + t)y - tz\) for some y ≠ z ∈ C and t > 0 by Proposition 3.2.1. Let

$$\displaystyle{s =\inf \big\{ t > 0\bigm |(1 + t)y - tz\not\in C\big\} \geq 0.}$$

Then, \((1 + s)y - sz \in \mathrm{rcl}\,C \setminus \mathrm{rint}\,C = \partial C\).

When C = fl C, i.e., C is a flat, we have \(\mathrm{rcl}\,C = \mathrm{rint}\,C = C\) by definition, which means \(\partial C = \emptyset\). Therefore, \(\partial C\not =\emptyset\) implies \(\emptyset\not =C \subsetneq \mathrm{fl}\,C\).

(7): First, we show that C x is convex. For each y, z ∈ C x , we can choose δ > 0 so that \((1+\delta )x -\delta y \in C\) and \((1+\delta )x -\delta z \in C\). Then, for each t ∈ I,

$$\displaystyle\begin{array}{rcl} & & (1+\delta )x -\delta \big ((1 - t)y + tz\big) {}\\ & & \qquad \qquad \qquad \qquad \qquad = (1 - t)\big((1+\delta )x -\delta y\big) + t\big((1+\delta )x -\delta z\big) \in C, {}\\ \end{array}$$

which means \((1 - t)y + tz \in C_{x}\).

Because C x  ⊂ C ∩ fl C x , it remains to show C ∩ fl C x  ⊂ C x . By Proposition 3.2.1, each y ∈ C ∩ fl C x can be written as \( y = (1 - t)y^\prime + ty^{\prime\prime} \) for some \(y^\prime,y^{\prime\prime} \in C_{x}\) and \(t \in \mathbb{R}\). Because of the convexity of C x , we have y ∈ C x if t ∈ I. Then, we may assume that t < 0 (if t > 1, exchange y′ with y″). Since \(y^\prime \in C_{x}\), we have δ > 0 such that \(z^\prime = (1+\delta )x -\delta y^\prime \in C\). Observe that

$$ \begin{array}{rcl} (1 + s)x - sy& =& (1 + s)\bigg(\frac{\delta } {1+\delta }y^\prime + \frac{1} {1+\delta}z^\prime\bigg) - s\big((1 - t)y^\prime + ty^{\prime\prime}\big) {}\\ & =& \bigg(\frac{(1 + s)\delta } {1+\delta } - s(1 -t)\bigg)y^\prime + \frac{1 + s} {1+\delta } z^\prime -sty^{\prime\prime}. {}\\ \end{array}$$

Let \(s =\delta /(1 - t - t\delta ) > 0\). Then, since \(1 + s = (1 - t)(1+\delta )/(1 - t - t\delta )\), it follows that

$$ {(1 + s)x - sy = \frac{1 - t} {1 - t - t\delta }z^\prime+ \frac{-t\delta } {1 - t - t\delta }y^{\prime\prime} \in C,}$$

which implies that y ∈ C x (Fig. 3.2).

Fig. 3.2 \(C \cap \mathrm{fl}\,C_{x} \subset C_{x}\)

(8): From the definition of rint C x , it easily follows that x ∈ rint C x .

(9): Because C x  ⊂ C, we have \((C_{x})_{y} \subset C_{y}\) by definition. We will show that C y  ⊂ C x , which implies \(C_{y} = (C_{y})_{y} \subset (C_{x})_{y}\) by (8) and the definition. For each z ∈ C y , choose δ 1 > 0 so that \(u = (1 +\delta _{1})y -\delta _{1}z \in C\). On the other hand, since y ∈ C x , we have δ 2 > 0 such that \(v = (1 +\delta _{2})x -\delta _{2}y \in C\). Then,

$$\displaystyle{\frac{(1 +\delta _{1})(1 +\delta _{2})} {1 +\delta _{1} +\delta _{2}} x - \frac{\delta _{1}\delta _{2}} {1 +\delta _{1} +\delta _{2}}z = \frac{1 +\delta _{1}} {1 +\delta _{1} +\delta _{2}}v + \frac{\delta _{2}} {1 +\delta _{1} +\delta _{2}}u \in C,}$$

which means that z ∈ C x .

(10): Since y ∈ rint C x , we have \((C_{x})_{y} = C_{x}\). On the other hand, \((C_{x})_{y} = C_{y}\) by (9).

Remark 3.

It should be noted that, in general, rcl rcl C ≠ rcl C. For example, let C be the convex set in \(\mathbb{R}_{f}^{\mathbb{N}}\) defined as follows:

$$\displaystyle\begin{array}{rcl} C =\big\{ x \in \mathbf{I}_{f}^{\mathbb{N}}\bigm |\exists k \in \mathbb{N}\text{ such that }\;& & \sum _{ i\in \mathbb{N}}x(i) \geq {k}^{-1}, {}\\ & & x(i)\not =0\;\mbox{ for at least $k$ many $i \in \mathbb{N}$}\big\}. {}\\ \end{array}$$

It is easy to see that 0 ∉ rcl C, i.e., \(\mathrm{rcl}\,C \subset \mathbf{I}_{f}^{\mathbb{N}} \setminus \{\mathbf{0}\}\). For each \(x \in \mathbf{I}_{f}^{\mathbb{N}} \setminus \{\mathbf{0}\}\), choose \(k \in \mathbb{N}\) so that \({k}^{-1} \leq \sum _{i\in \mathbb{N}}x(i)\), and let y ∈ C such that \(y(i) = {k}^{-2}\) for i ≤ k and y(i) = 0 for i > k. If 0 < t ≤ 1, then \((1 - t)x + ty \in C\) because \((1 - t)x(i) + ty(i)\not =0\) for at least k many \(i \in \mathbb{N}\) and

$$\displaystyle{\sum _{i\in \mathbb{N}}\big((1 - t)x(i) + ty(i)\big) = (1 - t)\sum _{i\in \mathbb{N}}x(i) + t\sum _{i\in \mathbb{N}}y(i) \geq {k}^{-1}.}$$

Therefore, \(\mathrm{rcl}\,C = \mathbf{I}_{f}^{\mathbb{N}} \setminus \{\mathbf{0}\}\). As observed in Remark 1, \(\mathrm{rcl}\,\big(\mathbf{I}_{f}^{\mathbb{N}} \setminus \{\mathbf{0}\}\big) = \mathbf{I}_{f}^{\mathbb{N}}\). Hence, we have rcl rcl C ≠ rcl C. It should also be noted that \(\mathrm{rint}\,C = \emptyset\).

In the finite-dimensional case, we have the following proposition:

Proposition 3.2.6.

Every non-empty finite-dimensional convex set C has a non-empty radial interior, i.e., rint  C≠∅, and therefore

$$\displaystyle{\mathrm{rcl}\,\mathrm{rint}\,C = \mathrm{rcl}\,\mathrm{rcl}\,C = \mathrm{rcl}\,C\;\text{ and }\;\partial \mathrm{rint}\,C = \partial \mathrm{rcl}\,C = \partial C.}$$

Proof.

We have a maximal affinely independent finite subset \(\{v_{1},\ldots,v_{n}\} \subset C\). Then, \(v_{0} =\sum _{ i=1}^{n}{n}^{-1}v_{i} \in \mathrm{rint}\,C\). Indeed, since \(C \subset \mathrm{fl}\,\{v_{1},\ldots,v_{n}\}\), each x ∈ C can be written as \(x =\sum _{ i=1}^{n}t_{i}v_{i}\), where \(\sum _{i=1}^{n}t_{i} = 1\). Observe that

$$\displaystyle\begin{array}{rcl} (1+\delta )v_{0} -\delta x& =& (1+\delta )\sum _{i=1}^{n}{n}^{-1}v_{ i} -\delta \sum _{i=1}^{n}t_{ i}v_{i} {}\\ & =& \sum _{i=1}^{n}({n}^{-1} +\delta ({n}^{-1} - t_{ i}))v_{i}. {}\\ \end{array}$$

When v 0 ≠ x, we have \(s =\min \{ {n}^{-1} - t_{i}\mid i=1,\ldots,n\} < 0\). Let \(\delta =1/(-sn) > 0\). Then, \({n}^{-1} +\delta ({n}^{-1} - t_{i}) \geq 0\) for every \(i = 1,\ldots,n\), which implies that \((1+\delta )v_{0} -\delta x \in C\).

Additional Results for Convex Sets 3.2.7.

  1. (1)

    For every two convex sets C and D,

    $$\displaystyle{(C \cap D)_{x} = C_{x} \cap D_{x}\;\mbox{ for each $x \in C \cap D$.}}$$
  2. (2)

    For every two convex sets C and D with \(\mathrm{rint}\,C \cap \mathrm{rint}\,D\not =\emptyset\),

    $$\displaystyle{\mathrm{rint}\,(C \cap D) = \mathrm{rint}\,C \cap \mathrm{rint}\,D.}$$

    In general, rint C ∩ rint D ⊂ rint (C ∩ D).

    Sketch of Proof. To show that rint (C ∩ D) ⊂ rint C ∩ rint D, let x 0 ∈ rint C ∩ rint D. For each x ∈ rint (C ∩ D), take y ∈ C ∩ D so that x ∈ rint ⟨x 0, y⟩. Since rint ⟨x 0, y⟩ ⊂ rint C by Proposition 3.2.3, it follows that x ∈ rint C. Hence, rint (C ∩ D) ⊂ rint C. Similarly, we have rint (C ∩ D) ⊂ rint D.

  3. (3)

    Let C and D be convex sets in the linear spaces E and E′, respectively. Then, C ×D is also convex,

    $$\displaystyle{\mathrm{rint}\,(C \times D) = \mathrm{rint}\,C \times \mathrm{rint}\,D\;\text{ and }\;\mathrm{rcl}\,(C \times D) = \mathrm{rcl}\,C \times \mathrm{rcl}\,D.}$$

    Moreover, \((C \times D)_{(x,y)} = C_{x} \times D_{y}\) for each (x, y) ∈ C ×D.

  4. (4)

    Let f : C → E′ be an affine function of a convex set C in a linear space E into another linear space E′, and D be a convex set in E′. Then, f(C) and f  − 1(D) are convex and

    $$\displaystyle{{f}^{-1}(D)_{ x} = C_{x} \cap {f}^{-1}(D_{ f(x)})\;\mbox{ for each $x \in {f}^{-1}(D)$ ($ \subset C$).}}$$

    In particular, \(C_{x} \subset {f}^{-1}(f(C)_{f(x)})\) (i.e., f(C x ) ⊂ f(C) f(x)) for each x ∈ C. When f is injective, f(C x ) = f(C) f(x) for each x ∈ C.

    Sketch of Proof. It is easy to see that \(f({f}^{-1}(D)_{x}) \subset D_{f(x)}\), hence \({f}^{-1}(D)_{x} \subset {f}^{-1}(D_{f(x)})\). Also, \({f}^{-1}(D)_{x} \subset C_{x}\) because f  − 1(D) ⊂ C. Accordingly, \({f}^{-1}(D)_{x} \subset C_{x} \cap {f}^{-1}(D_{f(x)})\). To prove the converse inclusion, for each \(y \in {f}^{-1}(D_{f(x)}) \cap C_{x}\), choose δ > 0 so that \((1+\delta )f(x) -\delta f(y) \in D\) and \((1+\delta )x -\delta y \in C\). Then, \((1+\delta )x -\delta y \in {f}^{-1}(D)\).

  5. (5)

    For every (bounded) subset A of a normed linear space \(E = (E,\|\cdot \|)\), the following hold:

    1. (i)

      \(\|x - y\| \leq \sup _{z\in A}\|x - z\|\) for each x ∈ E and y ∈ ⟨A⟩;

    2. (ii)

      diam ⟨A⟩ = diam A.

    Sketch of Proof.

    1. (i):

      Write \(y =\sum _{ i=1}^{n}z(i)x_{i}\) for some \(x_{1},\ldots,x_{n} \in A\) and z ∈ Δ n − 1.

    2. (ii):

      For each x, y ∈ ⟨A⟩,

      $$\displaystyle{\|x - y\| \leq \sup _{z\in A}\|x - z\| \leq \sup _{z\in A}\sup _{z^\prime\in A}\|z - z^\prime\| = \mathrm{diam}\,A.}$$

Remark 4.

In (2) above, rint (C ∩ D) ≠ rint C ∩ rint D in general. Consider the case that \(C \cap D\not =\emptyset\) but \(\mathrm{rint}\,C \cap \mathrm{rint}\,D = \emptyset\).

In (4) above, \(f(C_{x})\not =f(C)_{f(x)}\) in general. For instance, let \(C =\{ (s,t) \in {\mathbb{R}}^{2}\mid \vert s\vert \leq t \leq 1\} \subset {\mathbb{R}}^{2}\) and \(f =\mathrm{ pr}_{1}\). Then, \(\mathrm{pr}_{1}(C) = [-1,1]\), \(\mathrm{pr}_{1}(C_{\mathbf{0}}) =\{ 0\}\), and \(\mathrm{pr}_{1}(C)_{0} =\mathrm{ pr}_{1}(C)\).

3.3 The Hahn–Banach Extension Theorem

We now prove the Hahn–Banach Extension Theorem and present a relationship between sublinear functionals and convex sets.

Let E be a linear space. A functional \(p : E \rightarrow \mathbb{R}\) is sublinear if it satisfies the following conditions:

  • (SL1) \(p(x + y) \leq p(x) + p(y)\) for each \(x,y \in E\), and

  • (SL2) \(p(tx) = tp(x)\) for each \(x \in E\) and \(t > 0\).

Note that if \(p : E \rightarrow \mathbb{R}\) is sublinear then p(0) = 0 and \(-p(-x) \leq p(x)\). For each x, y ∈ E and t ∈ I,

$$\displaystyle{p((1 - t)x + ty) \leq (1 - t)p(x) + tp(y).}$$

When \(p : E \rightarrow \mathbb{R}\) is a non-negative sublinear functional, \({p}^{-1}([0,r)) = r{p}^{-1}([0,1))\) and \({p}^{-1}([0,r]) = r{p}^{-1}(\mathbf{I})\) are convex for each r > 0.
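For example, every norm on E and every linear functional on E is sublinear, and

$$\displaystyle{p(x) =\max \{ x(1),x(2)\}\;\mbox{ for $x \in {\mathbb{R}}^{2}$}}$$

defines a sublinear functional on \({\mathbb{R}}^{2}\) that is neither linear nor non-negative.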

In the following Hahn–Banach Extension Theorem, no topological concepts appear (even in the proof). Nevertheless, this theorem is very important in the study of topological linear spaces.

Theorem 3.3.1 (Hahn–Banach Extension Theorem). 

Let \(p : E \rightarrow \mathbb{R}\) be a sublinear functional of a linear space E and F be a linear subspace of E. If \(f : F \rightarrow \mathbb{R}\) is a linear functional such that f(x) ≤ p(x) for every x ∈ F, then f extends to a linear functional \(\tilde{f} : E \rightarrow \mathbb{R}\) such that \(\tilde{f}(x) \leq p(x)\) for every x ∈ E.

Proof.

Let \(\mathcal{F}\) be the collection of all linear functionals \(f^\prime : F^\prime \rightarrow \mathbb{R}\) of a linear subspace F′ ⊂ E such that F ⊂ F′, f′ | F = f, and f′(x) ≤ p(x) for every x ∈ F′. For \(f^\prime,f^{\prime\prime} \in \mathcal{F}\), we define \(f^\prime \leq f^{\prime\prime}\) if \(f^{\prime\prime}\) is an extension of \(f^\prime\). Then, \(\mathcal{F} = (\mathcal{F},\leq )\) is an inductive ordered set, i.e., every totally ordered subset of \(\mathcal{F}\) is upper bounded. By Zorn’s Lemma, \(\mathcal{F}\) has a maximal element \(f_{0} : F_{0} \rightarrow \mathbb{R}\). It suffices to show that F 0 = E.

Assume that F 0 ≠ E. Taking x 1 ∈ E ∖ F 0, we have a linear subspace \(F_{1} = F_{0} + \mathbb{R}x_{1} \supsetneq F_{0}\). We show that f 0 has a linear extension \(f_{1} : F_{1} \rightarrow \mathbb{R}\) in \(\mathcal{F}\), which contradicts the maximality of f 0. Assigning a value \(\alpha \in \mathbb{R}\) to \(x_{1}\) defines such an extension, i.e., \(f_{1}(x + tx_{1}) = f_{0}(x) + t\alpha\) for x ∈ F 0 and \(t \in \mathbb{R}\). In order that \(f_{1} \in \mathcal{F}\), we have to choose α so that for every x ∈ F 0 and t > 0,

$$\displaystyle{f_{0}(x) + t\alpha \leq p(x + tx_{1})\;\text{ and }\;f_{0}(x) - t\alpha \leq p(x - tx_{1}).}$$

Dividing by t, we obtain the following equivalent condition:

$$\displaystyle{f_{0}(y) - p(y - x_{1}) \leq \alpha \leq p(y + x_{1}) - f_{0}(y)\;\mbox{ for every $y \in F_{0}$.}}$$

Hence, such an \(\alpha \in \mathbb{R}\) exists if

$$\displaystyle{\sup \{f_{0}(y) - p(y - x_{1})\mid y \in F_{0}\} \leq \inf \{ p(y + x_{1}) - f_{0}(y)\mid y \in F_{0}\}.}$$

This inequality can be proved as follows: for each y, y′ ∈ F 0,

$$\displaystyle{f_{0}(y) + f_{0}(y^\prime) = f_{0}(y + y^\prime) \leq p(y + y^\prime) \leq p(y - x_{1}) + p(y^\prime + x_{1}),}$$

hence \(f_{0}(y) - p(y - x_{1}) \leq p(y^\prime + x_{1}) - f_{0}(y^\prime)\), which implies the desired inequality.
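For example, let \(E = {\mathbb{R}}^{2}\), \(p(x) = \vert x(1)\vert + \vert x(2)\vert \), \(F = \mathbb{R}\mathbf{e}_{1}\), and \(f(t\mathbf{e}_{1}) = t\). Then, f(x) ≤ p(x) for every x ∈ F, and the linear functionals \(\tilde{f} : {\mathbb{R}}^{2} \rightarrow \mathbb{R}\) extending f with \(\tilde{f} \leq p\) are exactly

$$\displaystyle{\tilde{f}(x) = x(1) +\alpha x(2),\;\mbox{ where $\vert \alpha \vert \leq 1$,}}$$

which shows that the extension in Theorem 3.3.1 need not be unique.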

Let F be a flat in a linear space E and A ⊂ F. The following set is called the core of A in F:

$$\displaystyle\begin{array}{rcl} \mathrm{core}\,_{F}A& =& \big\{x \in A\bigm |\forall \quad y \in F,\ \exists \delta > 0\;\text{ such that } {}\\ & & \qquad \quad \vert t\vert \leq \delta \Rightarrow (1 - t)x + ty \in A\big\}, {}\\ \end{array}$$

where | t | ≤ δ can be replaced by − δ ≤ t ≤ 0 (or 0 ≤ t ≤ δ). Each point of core  F A is called a core point of A in F. In the case that A is convex,

$$\displaystyle\begin{array}{rcl} x \in \mathrm{core}\,_{F}A& & \Leftrightarrow \forall y \in F,\ \exists \delta > 0\;\text{ such that }\;(1+\delta )x -\delta y \in A {}\\ & & \Leftrightarrow \forall y \in F,\ \exists \delta > 0\;\text{ such that }\;(1-\delta )x +\delta y \in A. {}\\ \end{array}$$

When F = E, we can omit the phrase “in E” and simply write core A by removing the subscript E. By definition, A ⊂ B ⊂ F implies core  F A ⊂ core  F B. We also have the following fact:

Fact.

For each A ⊂ F, if \(\mathrm{core}\,_{F}A\not =\emptyset\), then \(\mathrm{fl}\,A = F\).

Indeed, let x ∈ core  F A. For each y ∈ F, we have δ > 0 such that \(z = (1+\delta )x -\delta y \in A\). Then, \(y {=\delta }^{-1}(1+\delta )x {-\delta }^{-1}z \in \mathrm{fl}\,A\). Note that fl A ⊂ F because A ⊂ F. Consequently, fl A = F. The converse implication holds when A is a finite-dimensional convex set, but not in general (see Remark 5 below).
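For example, \(\mathrm{core}\,_{{\mathbb{R}}^{2}}{\mathbf{I}}^{2} = {(0,1)}^{2}\), whereas \(\mathrm{core}\,_{{\mathbb{R}}^{2}}(\mathbf{I} \times \{ 0\}) = \emptyset\) because \(\mathrm{fl}\,(\mathbf{I} \times \{ 0\}) = \mathbb{R} \times \{ 0\}\not ={\mathbb{R}}^{2}\); on the other hand, \(\mathrm{core}\,_{\mathbb{R}\times \{0\}}(\mathbf{I} \times \{ 0\}) = (0,1) \times \{ 0\}\).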

Proposition 3.3.2.

For every convex set A ⊂ E, core   fl  AA = rint  A, which is also convex. Hence, core  A≠∅ implies core  A = rint  A and core  core  A = core  A.

Proof.

Because core fl A A ⊂ rint A by definition, it suffices to show that rint A ⊂ core fl A A. For each x ∈ rint A and y ∈ fl A, we need to find some s > 0 such that \((1 + s)x - sy \in A\). This can be done by the same argument as for the inclusion C ∩ fl C x  ⊂ C x in Proposition 3.2.5(7).

Remark 5.

When A is a finite-dimensional convex set, \(\mathrm{core}\,_{F}A\not =\emptyset\) if and only if F = fl A according to Propositions 3.3.2 and 3.2.6. However, this does not hold for an infinite-dimensional convex set. For example, consider the convex set \(\mathbf{I}_{f}^{\mathbb{N}}\) in \({\mathbb{R}}^{\mathbb{N}}\). Then, \(\mathbb{R}_{f}^{\mathbb{N}} = \mathrm{fl}\,\mathbf{I}_{f}^{\mathbb{N}}\) and \(\mathrm{core}\,_{\mathbb{R}_{f}^{\mathbb{N}}}\mathbf{I}_{f}^{\mathbb{N}} = \mathrm{rint}\,\mathbf{I}_{f}^{\mathbb{N}} = \emptyset\).

With regard to convex sets defined by a non-negative sublinear functional, we have the following proposition:

Proposition 3.3.3.

Let \(p : E \rightarrow \mathbb{R}\) be a non-negative sublinear functional of a linear space E. Then,

$$\displaystyle{{p}^{-1}([0,1)) = \mathrm{core}\,{p}^{-1}([0,1)) = \mathrm{core}\,{p}^{-1}(\mathbf{I}).}$$

Proof.

The inclusion \(\mathrm{core}\,{p}^{-1}([0,1)) \subset \mathrm{core}\,{p}^{-1}(\mathbf{I})\) is obvious.

Let x ∈ p  − 1([0, 1)). For each y ∈ E, we can choose δ > 0 so that \(\delta p(x - y) < 1 - p(x)\). Then,

$$\displaystyle{0 \leq p((1+\delta )x -\delta y) = p(x +\delta (x - y)) \leq p(x) +\delta p(x - y) < 1,}$$

i.e., x ∈ core p  − 1([0, 1)). Hence, \({p}^{-1}([0,1)) \subset \mathrm{core}\,{p}^{-1}([0,1))\).

If p(x) ≥ 1, then x ∉ core p  − 1(I) because

$$\displaystyle{p((1 + t)x - t\mathbf{0}) = (1 + t)p(x) > 1\;\mbox{ for any $t > 0$.}}$$

This means that \(\mathrm{core}\,{p}^{-1}(\mathbf{I}) \subset {p}^{-1}([0,1))\).

For each A ⊂ E with 0 ∈ core A, the Minkowski functional \(p_{A} : E \rightarrow \mathbb{R}_{+}\) can be defined as follows:

$$\displaystyle{p_{A}(x) =\inf \big\{ s > 0\bigm |x \in sA\big\} =\inf \big\{ s > 0\bigm |{s}^{-1}x \in A\big\}.}$$

Then, for each x ∈ E and t > 0,

$$\displaystyle\begin{array}{rcl} p_{A}(tx)& =& \inf \big\{s > 0\bigm |{s}^{-1}tx \in A\big\} =\inf \big\{ ts > 0\bigm |{(ts)}^{-1}tx \in A\big\} {}\\ & =& t\inf \big\{s > 0\bigm |{s}^{-1}x \in A\big\} = tp_{ A}(x), {}\\ \end{array}$$

i.e., p A satisfies (SL2). In the above, \(p_{A}(tx) = p_{{t}^{-1}A}(x)\). Then, it follows that \(p_{{t}^{-1}A} = tp_{A}\) for each t > 0. Replacing t by t  − 1, we have

$$\displaystyle{p_{tA} = {t}^{-1}p_{ A}\;\mbox{ for each $t > 0$.}}$$
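For example, for the square \(A = {[-1,1]}^{2} \subset {\mathbb{R}}^{2}\), we have \(\mathbf{0} \in \mathrm{core}\,A\) and

$$\displaystyle{p_{A}(x) =\inf \big\{ s > 0\bigm |{s}^{-1}x \in A\big\} =\max \{ \vert x(1)\vert,\vert x(2)\vert \},}$$

i.e., the Minkowski functional of \({[-1,1]}^{2}\) is the maximum norm on \({\mathbb{R}}^{2}\).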

If A ⊂ E is convex, the Minkowski functional p A has the following desirable properties:

Proposition 3.3.4.

Let A ⊂ E be a convex set with \(\mathbf{0} \in \mathrm{core}\,A\). Then, the Minkowski functional p A is sublinear and

$$\displaystyle{\mathrm{rint}\,A = \mathrm{core}\,A = p_{A}^{-1}([0,1)) \subset A \subset p_{ A}^{-1}(\mathbf{I}) = \mathrm{rcl}\,A,}$$

so \(\partial A = p_{A}^{-1}(1)\) . Moreover,

$$\displaystyle{p_{A}(x) = 0\; \Leftrightarrow \; \mathbb{R}_{+}x \subset A.}$$

In order that p A is a norm on E, it is necessary and sufficient that \(\mathbb{R}_{+}x\not\subset A\) if x≠ 0 and tA ⊂ A if |t| < 1.

Proof.

First, we prove that p A is sublinear. As already observed, p A satisfies (SL2). To show that p A satisfies (SL1), let x, y ∈ E. Since A is convex, we have

$$\displaystyle{{s}^{-1}x,\ {t}^{-1}y \in A\; \Rightarrow \; {(s + t)}^{-1}(x + y) = \frac{s} {s + t}{s}^{-1}x + \frac{t} {s + t}{t}^{-1}y \in A,}$$

which implies that \(p_{A}(x + y) \leq p_{A}(x) + p_{A}(y)\).

The first equality rint A = core A has been stated in Proposition 3.3.2. It easily follows from the definitions that \(\mathrm{core}\,A \subset p_{A}^{-1}([0,1)) \subset A \subset p_{A}^{-1}(\mathbf{I})\) and p A  − 1(1) ⊂ rcl A, so p A  − 1(I) ⊂ rcl A. By Propositions 3.3.2 and 3.3.3, we have

$$\displaystyle{\mathrm{core}\,A = \mathrm{core}\,\mathrm{core}\,A \subset \mathrm{core}\,p_{A}^{-1}([0,1)) = p_{ A}^{-1}([0,1)) \subset \mathrm{core}\,A,}$$

which means the second equality \(\mathrm{core}\,A = p_{A}^{-1}([0,1))\). To obtain the third equality \(p_{A}^{-1}(\mathbf{I}) = \mathrm{rcl}\,A\), it remains to show that rcl A ⊂ p A  − 1(I). Let x ∈ rcl A. Since 0 ∈ rint A, it follows from Proposition 3.2.3 that \({s}^{-1}x \in \mathrm{rint}\,A \subset A\) for each s > 1, which implies that p A (x) ≤ 1, i.e., x ∈ p A  − 1(I).

By definition, p A (x) = 0 if and only if tx ∈ A for an arbitrarily large t > 0, which means that \(\mathbb{R}_{+}x \subset A\) because A is convex.

Because p A is sublinear, p A is a norm if and only if p A (x) ≠ 0 and \(p_{A}(x) = p_{A}(-x)\) for every x ∈ E ∖ {0}. Because p A (x) ≠ 0 if and only if \(\mathbb{R}_{+}x\not\subset A\), it remains to show that \(p_{A}(x) = p_{A}(-x)\) for every x ∈ E ∖ {0} if and only if tA ⊂ A whenever | t |  < 1.

Assume that \(p_{A}(x) = p_{A}(-x)\) for each x ∈ E. If x ∈ A and | t |  < 1 then \(p_{A}(tx) = p_{A}(\vert t\vert x) = \vert t\vert p_{A}(x) < 1\), which implies that tx ∈ A. Hence, tA ⊂ A whenever | t |  < 1.

Conversely, assume that tA ⊂ A whenever | t |  < 1. For each s > p A (x), r  − 1 x ∈ A for some 0 < r < s, and we have \({s}^{-1}(-x) = (-{s}^{-1}r){r}^{-1}x \in A\), hence p A ( − x) ≤ p A (x). Replacing x with − x, we have p A (x) ≤ p A ( − x). Therefore, \(p_{A}(x) = p_{A}(-x)\).

When the Minkowski functional p A is a norm on E, we call it the Minkowski norm. In this case, rcl A, rint A, and ∂A are the unit closed ball, the unit open ball, and the unit sphere, respectively, of the normed linear space E = (E, p A ). Then, rcl A and rint A are symmetric about 0, i.e., \(\mathrm{rcl}\,A = -\mathrm{rcl}\,A\) and \(\mathrm{rint}\,A = -\mathrm{rint}\,A\). We should note that a convex set A ⊂ E is symmetric about 0 if and only if tA ⊂ A whenever | t | ≤ 1 (in the next section, A is said to be circled).

A subset W ⊂ E is called a wedge if x + y ∈ W for each x, y ∈ W and tx ∈ W for each x ∈ W, t ≥ 0, or equivalently, W is convex and tW ⊂ W for every t ≥ 0. Note that if A ⊂ E is convex then \(\mathbb{R}_{+}A\) is a wedge. For a wedge W ⊂ E, the following statements are true:

  1. (1)

    0 ∈ core W ⇔ W = E;

  2. (2)

    \(W\not =E,\ x \in \mathrm{core}\,W \Rightarrow -x\not\in W\).

A cone C ⊂ E is a wedge with \(C \cap (-C) =\{ \mathbf{0}\}\). Each translation of a cone is also called a cone.
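For example, \(\mathbb{R}_{+}^{2} =\{ x \in {\mathbb{R}}^{2}\mid x(1),x(2) \geq 0\}\) is a cone in \({\mathbb{R}}^{2}\), while the closed half-plane \(W =\{ x \in {\mathbb{R}}^{2}\mid x(2) \geq 0\}\) is a wedge that is not a cone because \(W \cap (-W) = \mathbb{R} \times \{ 0\}\not =\{\mathbf{0}\}\).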

Using the Hahn–Banach Extension Theorem, we can prove the following separation theorem:

Theorem 3.3.5 (Separation Theorem). 

Let A and B be convex sets in E such that core  A≠∅ and (core  A) ∩ B = ∅. Then, there exists a linear functional \(f : E \rightarrow \mathbb{R}\) such that f(x) < f(y) for every x ∈core  A and y ∈ B, and sup f(A) ≤ inf f(B).

Proof.

Recall that core A = rint A (Proposition 3.3.2). For a linear functional \(f : E \rightarrow \mathbb{R}\), if f(x) < f(y) for every x ∈ core A and y ∈ B, then supf(A) ≤ inff(B). Indeed, let x ∈ A, y ∈ B, v ∈ core A, and 0 ≤ t < 1. Since \((1 - t)v + tx \in \mathrm{core}\,A\) by Proposition 3.2.3, we have

$$\displaystyle{(1 - t)f(v) + tf(x) = f((1 - t)v + tx) < f(y),}$$

where the left side tends to f(x) as t → 1, and hence f(x) ≤ f(y).

Note that \(W = \mathbb{R}_{+}(A - B)\) is a wedge. Moreover, (core A) − B ⊂ core W. Indeed, let x ∈ core A and y ∈ B. For each z ∈ E, choose δ > 0 so that \((1+\delta )x -\delta (y + z) \in A\). Then,

$$\displaystyle{(1+\delta )(x - y) -\delta z = (1+\delta )x -\delta (y + z) - y \in A - B \subset W.}$$

Therefore, it suffices to construct a linear functional \(f : E \rightarrow \mathbb{R}\) such that f(core W) ⊂ ( − , 0).

Now, we shall show that \(W \cap (B -\mathrm{core}\,A) = \emptyset\). Assume that there exist x 0 ∈ A, x 1 ∈ core A, \(y_{0},y_{1} \in B\), and t 0 ≥ 0 such that \(t_{0}(x_{0} - y_{0}) = y_{1} - x_{1}\). Note that \(\mathrm{rint}\,\langle x_{0},x_{1}\rangle \subset \mathrm{rint}\,A = \mathrm{core}\,A\) by Proposition 3.2.3. Hence,

$$\displaystyle{ \frac{t_{0}} {t_{0} + 1}x_{0} + \frac{1} {t_{0} + 1}x_{1} = \frac{t_{0}} {t_{0} + 1}y_{0} + \frac{1} {t_{0} + 1}y_{1} \in (\mathrm{core}\,A) \cap B,}$$

which contradicts the fact that \((\mathrm{core}\,A) \cap B = \emptyset\).

Take v 0 ∈ (core A) − B ⊂ core W. Then, note that − v 0 ∉ W. For each x ∈ E, we have δ > 0 such that \((1+\delta )v_{0} -\delta (-x) \in W\), which implies \(x {+\delta }^{-1}(1+\delta )v_{0} \in W\). Then, we can define \(p : E \rightarrow \mathbb{R}\) by

$$\displaystyle{p(x) =\inf \big\{ t \geq 0\bigm |x + tv_{0} \in W\big\}.}$$

Because W is a wedge, we see that p is sublinear. Since − v 0 ∉ W, it follows that \(p(s(-v_{0})) = s\) and p(sv 0) = 0 for every s ≥ 0. Applying the Hahn–Banach Extension Theorem 3.3.1, we can obtain a linear functional \(f : E \rightarrow \mathbb{R}\) such that \(f(s(-v_{0})) = s\) for each \(s \in \mathbb{R}\) and f(x) ≤ p(x) for every x ∈ E (see Fig. 3.3). For each z ∈ core W, we have δ > 0 such that \((1+\delta )z -\delta (z + v_{0}) \in W\), i.e., z − δv 0 ∈ W. Accordingly, \((z -\delta v_{0}) + tv_{0} \in W\) for every t ≥ 0, which means \(p(z -\delta v_{0}) = 0\). Thus, we have

$$\displaystyle{\qquad \qquad f(z) < f(z)+\delta = f(z -\delta v_{0}) \leq p(z -\delta v_{0}) = 0.\qquad \qquad \qquad \qquad }$$

Remark 6.

Using the Hahn–Banach Extension Theorem, we have proved the Separation Theorem. Conversely, the Hahn–Banach Extension Theorem can be derived from the Separation Theorem. Indeed, under the assumption of the Hahn–Banach Extension Theorem 3.3.1, we define

$$\displaystyle{A =\big\{ (x,t) \in E \times \mathbb{R}\bigm |t > p(x)\big\}\;\text{ and }\;B =\big\{ (x,f(x)) \in E \times \mathbb{R}\bigm |x \in F\big\},}$$

where B = Gr (f) is the graph of f. Then, A and B are disjoint convex sets in \(E \times \mathbb{R}\). It is straightforward to show that \(\mathrm{core}\,A = A\not =\emptyset\). By the Separation Theorem 3.3.5, we have a linear functional \(\varphi : E \times \mathbb{R} \rightarrow \mathbb{R}\) such that \(A {\subset \varphi }^{-1}((-\infty,r])\) and \(B {\subset \varphi }^{-1}([r,\infty ))\) for some \(r \in \mathbb{R}\). Then, r ≤ 0 because \(0 =\varphi (0,0) \in \varphi (B)\). If \(\varphi (z) < 0\) for some z ∈ B, then \(\varphi (tz) = t\varphi (z) < r\) for sufficiently large t > 0. This is a contradiction because tz ∈ B. If \(\varphi (z) > 0\) for some z ∈ B, then − z ∈ B and \(\varphi (-z) = -\varphi (z) < 0\), which is a contradiction. Therefore, \(B {\subset \varphi }^{-1}(0)\). Note that \(\varphi (0,1) < 0\) because (0, 1) ∈ A. Since \(\varphi (x,t) =\varphi (x,0) + t\varphi (0,1)\) for each x ∈ E, we have \(\varphi (\{x\} \times \mathbb{R}) = \mathbb{R}\). Observe that \((\{x\} \times \mathbb{R}) {\cap \varphi }^{-1}(0)\) is a singleton. Then, f extends to the linear functional \(\tilde{f} : E \rightarrow \mathbb{R}\) whose graph is \({\varphi }^{-1}(0)\), i.e., \((x,\tilde{f}(x)) {\in \varphi }^{-1}(0)\) for each x ∈ E. Since \({\varphi }^{-1}(0) \subset (E \times \mathbb{R}) \setminus A\), it follows that \(\tilde{f}(x) \leq p(x)\) for every x ∈ E.

The Separation Theorem 3.3.5 can also be obtained as a corollary of the following two theorems, where we do not use the Hahn–Banach Extension Theorem 3.3.1.

Theorem 3.3.6.

For each pair of disjoint non-empty convex sets A,B ⊂ E, there exists a pair of disjoint convex sets \(\widetilde{A},\widetilde{B} \subset E\) such that \(A \subset \widetilde{ A}\), \(B \subset \widetilde{ B}\) , and \(\widetilde{A} \cup \widetilde{ B} = E\) .

Proof.

Let \(\mathcal{P}\) be the collection of pairs (C, D) of disjoint convex sets such that A ⊂ C and B ⊂ D. For \((C,D),(C^\prime,D^\prime) \in \mathcal{P}\), we define (C, D) ≤ (C′, D′) if C ⊂ C′ and D ⊂ D′. Then, it is easy to see that \(\mathcal{P} = (\mathcal{P},\leq )\) is an inductive ordered set. Due to Zorn’s Lemma, \(\mathcal{P}\) has a maximal element \((\widetilde{A},\widetilde{B})\).

Fig. 3.3 The graphs of p and f

To show that \(\widetilde{A} \cup \widetilde{ B} = E\), assume the contrary, i.e., there exists a point \(v_{0} \in E \setminus (\widetilde{A} \cup \widetilde{ B})\). By the maximality of \((\widetilde{A},\widetilde{B})\), we can obtain two points

$$\displaystyle{x \in \widetilde{ A} \cap \langle \widetilde{ B} \cup \{ v_{0}\}\rangle \;\text{ and }\;y \in \widetilde{ B} \cap \langle \widetilde{ A} \cup \{ v_{0}\}\rangle .}$$

Then, \(x \in \langle v_{0},y_{1}\rangle\) for some \(y_{1} \in \widetilde{ B}\) and y ∈ ⟨v 0, x 1⟩ for some \(x_{1} \in \widetilde{ A}\). Note that \(x \in \mathrm{rint}\,\langle v_{0},y_{1}\rangle\) and y ∈ rint ⟨v 0, x 1⟩. Consider the triangle \(\langle v_{0},x_{1},y_{1}\rangle\). It is easy to see that ⟨x 1, x⟩ and ⟨y 1, y⟩ meet at a point v 1. Since \(\langle x_{1},x\rangle \subset \widetilde{ A}\) and \(\langle y_{1},y\rangle \subset \widetilde{ B}\), it follows that \(v_{1} \in \widetilde{ A} \cap \widetilde{ B}\), which is a contradiction.

Theorem 3.3.7.

For each pair of disjoint non-empty convex sets C,D ⊂ E with C ∪ D = E, rcl  C ∩rcl  D is a hyperplane if rcl  C ∩rcl  D≠E.

Proof.

First, we show that \(\mathrm{rcl}\,C \cap \mathrm{rcl}\,D = \partial C = \partial D\). To prove that ∂C ⊂ ∂D, let x ∈ ∂C. It suffices to find y ∈ C such that

$$\displaystyle\begin{array}{rcl} & & (1 - t)x + ty \in C\;\mbox{ for $0 < t \leq 1$ and} {}\\ & & (1 + t)x - ty \in E \setminus C = D\;\mbox{ for $t > 0$.} {}\\ \end{array}$$

To this end, take \(y^\prime,y^{\prime\prime} \in C\) such that \((1 - t)x + ty^\prime \in C\) for 0 < t ≤ 1 and \((1 + t)x - ty^{\prime\prime}\not\in C\) for t > 0. Then, \(y = \frac{1} {2}y^\prime + \frac{1} {2}y^{\prime\prime} \in C\) is the desired point. Indeed, for each 0 < t ≤ 1,

$$ \begin{array}{rcl} (1 - t)x + ty& =& (1 - t)x +\tfrac{1} {2}ty^\prime + \tfrac{1} {2}ty^{\prime\prime} {}\\ & =&\left (1 -\dfrac{1} {2}t\right )\left ( \frac{1 - t} {1 -\tfrac{1} {2}t}x + \frac{\tfrac{1} {2}t} {1 -\tfrac{1}{2}t}y^\prime\right ) + \tfrac{1} {2}ty^{\prime\prime} \in C. {}\\\end{array}$$

Moreover, note that

$$ \begin{array}{rcl} & & (1 - s)((1 + t)x - ty) + sy^\prime {}\\ & & \qquad = (1 - s)(1 + t)x -\tfrac{1} {2}(1 - s)ty^\prime -\tfrac{1} {2}(1 - s)ty^{\prime\prime} + sy^\prime. {}\\ \end{array}$$

For each t > 0, let \(s = t/(2 + t) \in (0,1)\). Then, \((1 - s)t = 2s\). Therefore, we have

$$ {(1 - s)((1 + t)x - ty) + sy^\prime = (1 + s)x - sy^{\prime\prime}\not\in C,}$$

which means that \((1 + t)x - ty\not\in C\) (Fig. 3.4). Similarly, we have ∂D ⊂ ∂C. Hence, ∂C = ∂D. Since \(\mathrm{rint}\,C \cap \mathrm{rint}\,D = \emptyset\), it follows that \(\mathrm{rcl}\,C \cap \mathrm{rcl}\,D = \partial C = \partial D\).

Next, we show that ∂C is a flat. It suffices to show that if x, y ∈ ∂C and t > 0, then \(x^\prime = (1 + t)x - ty \in \partial C\). If x′ ∉ ∂C, then x′ ∈ rint C or x′ ∈ rint D. In this case, x ∈ rint ⟨x′, y⟩ ⊂ rint C or x ∈ rint ⟨x′, y⟩ ⊂ rint D by Proposition 3.2.3. This is a contradiction. Therefore, x′ ∈ ∂C.

It remains to show that if ∂C ≠ E then ∂C is a hyperplane. Take v ∈ E ∖ ∂C. It suffices to prove that E = fl (∂C ∪{ v}). Without loss of generality, we may assume that v ∈ rint C. On the other hand, ∂C ≠ ∅ because C ≠ E. Let z ∈ ∂C. Then, \(w = z - (v - z) = 2z - v \in \mathrm{rint}\,D\). Otherwise, w ∈ rcl C, from which, using Proposition 3.2.3, it would follow that \(z = \frac{1} {2}v + \frac{1} {2}w \in \mathrm{rint}\,\langle v,w\rangle \subset \mathrm{rint}\,C\), which is a contradiction.

For each x ∈ E ∖ ∂C, x ∈ rint C or x ∈ rint D. When x ∈ rint C, let

$$\displaystyle{s =\sup \big\{ t \in \mathbf{I}\bigm |(1 - t)x + tw \in C\big\}.}$$
Fig. 3.4 ∂C ⊂ ∂D

Fig. 3.5 The case x ∈ rint C

Refer to Fig. 3.5. Then, \(y = (1 - s)x + sw \in \partial C\), which implies that

$$\displaystyle{x = \frac{1} {1 - s}y - \frac{2s} {1 - s}z + \frac{s} {1 - s}v \in \mathrm{fl}\,(\partial C \cup \{ v\}).}$$
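Indeed, note that w = 2z − v and that s < 1 (the points (1 − t)x + tw with t close to 1 lie in D because w ∈ rint D); solving y = (1 − s)x + sw for x gives

$$\displaystyle{x = \frac{1} {1 - s}(y - sw) = \frac{1} {1 - s}y - \frac{2s} {1 - s}z + \frac{s} {1 - s}v,}$$

and the three coefficients sum to 1, so x is an affine combination of y, z ∈ ∂C and v.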

In the case that x ∈ rint D, let

$$\displaystyle{s =\sup \big\{ t \in \mathbf{I}\bigm |(1 - t)x + tv \in D\big\}.}$$
Fig. 3.6 The case x ∈ rint D

Now, refer to Fig. 3.6. Then, \(y = (1 - s)x + sv \in \partial D = \partial C\), which implies that

$$\displaystyle{x = \frac{1} {1 - s}y + \frac{-s} {1 - s}v \in \mathrm{fl}\,(\partial C \cup \{ v\}).}$$

Consequently, it follows that E = fl (∂C ∪{ v}).

Remark 7.

In the above, the condition rcl C ∩ rcl D ≠ E is necessary. For example, define the convex set C in the linear space \(\mathbb{R}_{f}^{\mathbb{N}}\) as follows:

$$\displaystyle{C =\big\{ x \in \mathbb{R}_{f}^{\mathbb{N}}\bigm |n =\max \{ i\mid x(i)\not =0\} \Rightarrow x(n) > 0\big\}.}$$

Let \(D = \mathbb{R}_{f}^{\mathbb{N}} \setminus C = (-C) \setminus \{\mathbf{0}\}\). Then, D is also convex. As is easily observed, \(\mathrm{rcl}\,C = \mathrm{rcl}\,D = \mathbb{R}_{f}^{\mathbb{N}}\), hence \(\mathrm{rcl}\,C \cap \mathrm{rcl}\,D = \mathbb{R}_{f}^{\mathbb{N}}\).
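The membership rule defining C and the equality \(\mathrm{rcl}\,C = \mathbb{R}_{f}^{\mathbb{N}}\) can be illustrated concretely. The following Python sketch (finitely supported sequences are modelled as finite lists; the helper names are ad hoc) tests random convex combinations of members of C and exhibits, for a point x ∈ D, a segment whose half-open part lies in C:

import random

def last_nonzero(x):
    """Index of the last nonzero entry of x, or None if x is the zero sequence."""
    return max((i for i, v in enumerate(x) if v != 0), default=None)

def in_C(x):
    """Membership in C: x = 0 or the last nonzero coordinate of x is positive."""
    n = last_nonzero(x)
    return n is None or x[n] > 0

def combine(x, y, t):
    """The convex combination (1 - t)x + t y of two finitely supported sequences."""
    m = max(len(x), len(y))
    x, y = x + [0.0] * (m - len(x)), y + [0.0] * (m - len(y))
    return [(1 - t) * a + t * b for a, b in zip(x, y)]

def sample_C(rng):
    """A random element of C (the last nonzero entry is forced to be positive)."""
    x = [rng.uniform(-1, 1) for _ in range(rng.randint(1, 5))]
    n = last_nonzero(x)
    if n is not None:
        x[n] = abs(x[n])
    return x

rng = random.Random(0)
# C is convex: random convex combinations of members of C stay in C.
assert all(in_C(combine(sample_C(rng), sample_C(rng), rng.random()))
           for _ in range(1000))
# rcl C = R_f^N: even for x in D, the half-open segment from x towards
# y = x + e_4 (a coordinate beyond the support of x) lies in C.
x = [1.0, -2.0]                      # the last nonzero entry is negative, so x is in D
y = x + [0.0, 1.0]
assert not in_C(x) and in_C(y)
assert all(in_C(combine(x, y, t)) for t in (0.01, 0.5, 1.0))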

The Separation Theorem 3.3.5 can also be obtained as a corollary of Theorems 3.3.6 and 3.3.7. In fact, let A, B ⊂ E be convex sets with core A ≠ ∅ and (core A) ∩ B = ∅. Then, core A = rint A is convex. We apply Theorem 3.3.6 to obtain disjoint non-empty convex sets C and D such that core A ⊂ C, B ⊂ D, and C ∪ D = E. Observe that core A ∩ rcl D = ∅, hence rcl D ≠ E. It follows from Theorem 3.3.7 that rcl C ∩ rcl D is a hyperplane. Then, we have a linear functional \(f : E \rightarrow \mathbb{R}\) such that \(\mathrm{rcl}\,C \cap \mathrm{rcl}\,D = {f}^{-1}(s)\) for some \(s \in \mathbb{R}\) (Proposition 3.1.3(1)). Since core A ⊂ E ∖ f  − 1(s), we have core A ⊂ f  − 1((s,∞)) or \(\mathrm{core}\,A \subset {f}^{-1}((-\infty,s))\). If core A ⊂ f  − 1((s,∞)), by replacing f and s by − f and − s, it can be assumed that \(\mathrm{core}\,A \subset {f}^{-1}((-\infty,s))\).

We now show that \(\mathrm{rcl}\,C \subset {f}^{-1}((-\infty,s])\). Let x ∈ core A ( ⊂ rint C). Then, x ∈ rint C and f(x) < s. If f(y) > s for some y ∈ rcl C, we have z ∈ rint ⟨x, y⟩ ∩ f  − 1(s). Because z ∈ rcl D, rint ⟨w, z⟩ ⊂ D for some w ∈ D. On the other hand, z ∈ rint ⟨x, y⟩ ⊂ rint C (Proposition 3.2.3). Because rint C = core C, ⟨v, z⟩ ⊂ C = E ∖ D for some v ∈ rint ⟨w, z⟩, which is a contradiction.

Since \(C \subset {f}^{-1}((-\infty,s])\), it follows that D ⊃ f  − 1((s,∞)). Observe that rint D ⊃ f  − 1((s,∞)). So, we can take x ∈ rint D with f(x) > s. Likewise for rcl D, we can show that rcl D ⊂ f  − 1([s,∞)). Accordingly, we have

$$\displaystyle{\mathrm{rcl}\,C = {f}^{-1}((-\infty,s])\;\text{ and }\;\mathrm{rcl}\,D = {f}^{-1}([s,\infty )).}$$

Since \(\mathrm{core}\,A \subset {f}^{-1}((-\infty,s))\) and B ⊂ f  − 1([s, )), we have the desired result.

3.4 Topological Linear Spaces

A topological linear space E is a linear space with a topology such that the algebraic operations of addition (x, y)↦x + y and scalar multiplication (t, x)↦tx are continuous.Footnote 8 Every linear space E has such a topology. In fact, E has a Hamel basis B. As a linear subspace of the product space \({\mathbb{R}}^{B}\), \(\mathbb{R}_{f}^{B}\) is a topological linear space that is linearly isomorphic to E by the linear isomorphism \(\varphi : \mathbb{R}_{f}^{B} \rightarrow E\) defined by \(\varphi (x) =\sum _{v\in B}x(v)v\). Then, \(\varphi\) induces a topology that makes E a topological linear space. In the next section, it will be seen that if E is finite-dimensional, then such a topology is unique. However, an infinite-dimensional linear space has various topologies for which the algebraic operations are continuous.

In the following proposition, we present the basic properties of a neighborhood basis at 0 in a topological linear space.

Proposition 3.4.1.

Let E be a topological linear space and \(\mathcal{U}\) be a neighborhood basis at 0 in E. Then, \(\mathcal{U}\) has the following properties:

  1. (1)

    For each \(U,V \in \mathcal{U}\) , there is some \(W \in \mathcal{U}\) such that W ⊂ U ∩ V ;

  2. (2)

    For each \(U \in \mathcal{U}\) , there is some \(V \in \mathcal{U}\) such that V + V ⊂ U;

  3. (3)

    For each \(U \in \mathcal{U}\) , there is some \(V \in \mathcal{U}\) such that [−1,1]V ⊂ U;

  4. (4)

    For each x ∈ E and \(U \in \mathcal{U}\) , there is some a > 0 such that x ∈ aU;

  5. (5)

    \(\bigcap \mathcal{U} =\{ \mathbf{0}\}\) .

Conversely, let E be a linear space with \(\mathcal{U}\) a collection of subsets satisfying these conditions. Then, E has a topology such that addition and scalar multiplication are continuous and \(\mathcal{U}\) is a neighborhood basis at 0 .

Sketch of Proof. Property (1) is trivial; (2) comes from the continuity of addition at (0, 0) ∈ E ×E; (3) is obtained by the continuity of scalar multiplication at each (t, 0) ∈ [ − 1, 1] ×E and the compactness of [ − 1, 1]; (4) follows from the continuity of scalar multiplication at \((0,x) \in \mathbb{R} \times E\); the Hausdorffness of E implies (5).

Given \(\mathcal{U}\) with these properties, an open set in E is defined as a subset W ⊂ E satisfying the condition that, for each x ∈ W, there is some \(U \in \mathcal{U}\) such that x + U ⊂ W. (Verify the axioms of open sets, i.e., the intersection of finitely many open sets is open and every union of open sets is open.)

For each x ∈ E and \(U \in \mathcal{U}\), x + U is a neighborhood of x in this topology.Footnote 9 Indeed, let

$$\displaystyle{W =\{ y \in E\mid \exists V \in \mathcal{U}\;\mbox{ such that $y + V \subset x + U$}\}.}$$

Then, x ∈ W ⊂ x + U because of (5). For each y ∈ W, we have \(V \in \mathcal{U}\) such that \(y + V \subset x + U\). Take \(V ^\prime \in \mathcal{U}\) so that V′ + V′ ⊂ V as in (2). Then, y + V′ ⊂ W because \((y + y^\prime) + V ^\prime \subset y + V \subset x + U\) for every y′ ∈ V′. Therefore, W is open in E, so x + U is a neighborhood of x in E. By the definition of the topology, \(\{x + U\mid U \in \mathcal{U}\}\) is a neighborhood basis at x. In particular, \(\mathcal{U}\) is a neighborhood basis at 0.

Since \(\{x + U\mid U \in \mathcal{U}\}\) is a neighborhood basis at x, the continuity of addition follows from (2). Using (3), we can show that the operation x↦ − x is continuous.

For scalar multiplication, let x ∈ E, \(\alpha \in \mathbb{R}\), and \(U \in \mathcal{U}\). Because of the continuity of x↦ − x, it can be assumed that α ≥ 0. Then, we can write \(\alpha = n + t\), where n ∈ ω and 0 ≤ t < 1. Using (2) inductively, we can find \(V _{1} \supset \cdots \supset V _{n} \supset V _{n+1}\) in \(\mathcal{U}\) such that

$$\displaystyle{V _{1} + \cdots + V _{n} + (V _{n+1} + V _{n+1}) \subset U.}$$

By (3), we have \(W \in \mathcal{U}\) such that \([-1, 1]W \subset V _{n+1}\). Then, x ∈ rW for some r > 0 by (4). Choose δ > 0 so that \(\delta <\min \{ 1/r, 1 - t\}\). Let y ∈ x + W and | α − β |  < δ. Then, we can write \(\beta = n + s\), where \(t-\delta < s < t+\delta\). It follows that

$$\displaystyle\begin{array}{rcl} \beta y -\alpha x& =& (n + s)y - (n + t)x = n(y - x) + s(y - x) + (s - t)x {}\\ & \in & nW + [-1, 1]W +\delta [-1, 1](rW) {}\\ & \subset & nV _{n+1} + V _{n+1} + V _{n+1} {}\\ & \subset &\underbrace{\mathop{V _{n+1} + \cdots + V _{n+1}}}\limits _{\mbox{ $n + 2$ many}} \subset U, {}\\ \end{array}$$

hence βy ∈ αx + U.Footnote 10

To see the Hausdorffness, let x ≠ y ∈ E. By (5), we have \(U \in \mathcal{U}\) such that x − y ∉ U. By (2) and (3), we can find \(V \in \mathcal{U}\) such that V − V ⊂ U. Then, x + V and y + V are neighborhoods of x and y, respectively. Observe that \((x + V ) \cap (y + V ) = \emptyset\).

It is said that A ⊂ E is circled if tA ⊂ A for every t ∈ [ − 1, 1]. It should be noted that the closure of a circled set A is also circled.

Indeed, let x ∈ cl A and t ∈ [ − 1, 1]. If t = 0, then tx = 0 ∈ A ⊂ cl A. When t ≠ 0, for each neighborhood U of tx in E, since t  − 1 U is a neighborhood of t  − 1 x, \({t}^{-1}U \cap A\not =\emptyset\), which implies that U ∩ tA ≠ ∅. Because tA ⊂ A, U ∩ A ≠ ∅. Thus, it follows that tx ∈ cl A.

In (3) above, \(W = [-1,1]V\) is a neighborhood of 0 ∈ E that is circled, i.e., tW ⊂ W for every t ∈ [ − 1, 1]. Consequently, (3) is equivalent to the following condition:

  1. (3)’

    0 ∈ E has a neighborhood basis consisting of circled (open) sets.

A topological group G is a group with a topology such that the algebraic operations of multiplication (x, y)↦xy and taking inverses xx  − 1 are both continuous.Footnote 11 Then, G is homogeneous, that is, for each distinct x 0, x 1 ∈ G, there is a homeomorphism h : G → G such that h(x 0) = x 1. Such an h can be defined by \(h(x) = x_{0}{x}^{-1}x_{1}\), where not only h(x 0) = x 1 but also h(x 1) = x 0. Every topological linear space is a topological group with respect to addition, so it is homogeneous.

Proposition 3.4.2.

Every topological group G has a closed neighborhood basis at each g ∈ G, i.e., it is regular. Footnote 12 For a topological linear space E, 0 ∈ E has a circled closed neighborhood basis.

Sketch of Proof. Each neighborhood U of the unit 1 ∈ G contains a neighborhood V of 1 such that V  − 1 V ⊂ U. For each x ∈ cl V, we have y ∈ Vx ∩ V . Consequently, \(x \in {V }^{-1}y \subset {V }^{-1}V \subset U\), so we have cl V ⊂ U.

For the additional statement, recall that if V is circled then cl V is also circled.

Proposition 3.4.3.

Let G be a topological group and H be a subgroup of G.

  1. (1)

    If H is open in G then H is closed in G.

  2. (2)

    The closure cl  H of H is a subgroup of G.

Sketch of Proof. (1): For each x ∈ G ∖ H, Hx is an open neighborhood of x in G and Hx ⊂ G ∖ H.

(2): For each x, y ∈ cl H, show that x  − 1 y ∈ cl H, i.e., each neighborhood W of x  − 1 y meets H. To this end, choose neighborhoods U and V of x and y, respectively, so that U  − 1 V ⊂ W.

Due to Proposition 3.4.3(1), a connected topological group G has no open subgroups except for G itself. Observe that every topological linear space E is path-connected. Consequently, E has no open linear subspaces except for E itself, i.e., no proper linear subspace of E is open in E.

The continuity of linear functionals is characterized as follows:

Proposition 3.4.4.

Let E be a topological linear space. For a linear functional \(f : E \rightarrow \mathbb{R}\) with f(E)≠{0}, the following are equivalent:

  1. (a)

    f is continuous;

  2. (b)

    f −1 (0) is closed in E;

  3. (c)

    f −1 (0) is not dense in E;

  4. (d)

    f(V ) is bounded for some neighborhood V of 0 ∈ E.

Proof.

The implication (a) ⇒ (b) is obvious, and (b) ⇒ (c) follows from f(E) ≠ {0} (i.e., \({f}^{-1}(0)\not =E\)).

(c) ⇒ (d): We have x ∈ E and a circled neighborhood V of 0 ∈ E such that \((x + V ) \cap {f}^{-1}(0) = \emptyset\). Then, f(V ) is bounded. Indeed, if f(V ) is unbounded, then there is some z ∈ V such that | f(z) |  >  | f(x) | . In this case, \(f(tz) = tf(z) = -f(x)\) for some t ∈ [ − 1, 1], which implies that − f(x) ∈ f(V ). It follows that \(0 \in f(x) + f(V ) = f(x + V )\), which contradicts the fact that \((x + V ) \cap {f}^{-1}(0) = \emptyset\).

(d) ⇒ (a): For each \(\varepsilon > 0\), we have \(n \in \mathbb{N}\) such that \(f(V ) \subset (-n\varepsilon,n\varepsilon )\). Then, n  − 1 V is a neighborhood of 0 in E and \(f({n}^{-1}V ) \subset (-\varepsilon,\varepsilon )\). Therefore, f is continuous at 0 ∈ E. Since f is linear, it follows that f is continuous at every point of E.

Proposition 3.4.5.

Let E be a topological linear space and A,B ⊂ E.

  1. (1)

    If B is open in E then A + B is open in E.

  2. (2)

    If A is compact and B is closed in E then A + B is also closed in E.

Sketch of Proof. (1): Note that \(A + B =\bigcup _{x\in A}(x + B)\).

(2): To show that E ∖ (A + B) is open in E, let z ∈ E ∖ (A + B). For each x ∈ A, because z − x ∈ E ∖ B, we have open neighborhoods U x , V x of x, z in E such that V x  − U x  ⊂ E ∖ B. Since A is compact, \(A \subset \bigcup _{i=1}^{n}U_{x_{i}}\) for some \(x_{1},\ldots,x_{n} \in A\). Then, \(V =\bigcap _{ i=1}^{n}V _{x_{i}}\) is an open neighborhood of z in E. We can show that \(V \cap (A + B) = \emptyset\), i.e., V ⊂ E ∖ (A + B).

Remark 8.

In (2) above, we cannot assert that A + B is closed in E even if both A and B are closed and convex in E. For example, \(A = \mathbb{R} \times \{ 0\}\) and \(B =\{ (x,y) \in {\mathbb{R}}^{2}\mid x > 0,\ y \geq {x}^{-1}\}\) are closed convex sets in \({\mathbb{R}}^{2}\), but \(A + B = \mathbb{R} \times (0,\infty )\) is not closed in \({\mathbb{R}}^{2}\). Indeed, \((-n,0) + (n,{n}^{-1}) = (0,{n}^{-1}) \in A + B\) converges to \((0,0)\not\in A + B\).

Proposition 3.4.6.

Let F be a closed linear subspace of a topological linear space E. Then, the quotient linear space E∕F with the quotient topology is also a topological linear space, and the quotient map q : E → E∕F (i.e., \(q(x) = x + F \in E/F\) ) is open, hence if \(\mathcal{U}\) is a neighborhood basis at 0 in E, then \(q(\mathcal{U}) =\{ q(U)\mid U \in \mathcal{U}\}\) is a neighborhood basis at 0 in E∕F.

Sketch of Proof. Apply Proposition 3.4.5(1) to show that the quotient map q : E → E ∕ F is open. Then, q ×q and \(q \times \mathrm{ id}_{\mathbb{R}}\) are open, so they are quotient maps; accordingly, the continuity of addition and scalar multiplication on E ∕ F is clear. Note that E ∕ F is Hausdorff if and only if F is closed in E.

For convex sets in a topological linear space, we have the following:

Proposition 3.4.7.

For each convex set C in a topological linear space E, the following hold:

  1. (1)

    cl  C is convex and rcl  C ⊂cl  C, hence rcl  C = C if C is closed in E;

  2. (2)

    int   F C = ∅ for any flat F with fl  C ⊊ F;

  3. (3)

    int   fl C C≠∅ implies \(\mathrm{int}\,_{\mathrm{fl}\,C}C = \mathrm{core}\,_{\mathrm{fl}\,C}C = \mathrm{rint}\,C\) .

Proof.

By the definition and the continuity of algebraic operations, we can easily obtain (1). For (2), observe int  F C ⊂ core  F C. If int  F C ≠ ∅ then fl C = F by the Fact stated in the previous section, which contradicts fl C ⊊ F.

(3): Due to Proposition 3.3.2, core fl C C = rint C. Note that int fl C C ⊂ core fl C C. Without loss of generality, we may assume that 0 ∈ int fl C C. Then, for each x ∈ rint C, we can find 0 < s < 1 such that x ∈ sC. Since (1 − s)C is a neighborhood of \(\mathbf{0} = x - x\) in fl C, we have a neighborhood U of x in fl C such that \(U - x \subset (1 - s)C\). Then, it follows that \(U \subset (1 - s)C + sC = C\). Therefore, x ∈ int fl C C.

Remark 9.

In the above, we cannot assert any one of cl C = rcl C, int fl C C = core fl C C, or int fl C C ≠ ∅. For example, \([-1,1]_{f}^{\mathbb{N}}\) is a convex set in \({\mathbb{R}}^{\mathbb{N}}\) such that \(\mathrm{rcl}\,[-1,1]_{f}^{\mathbb{N}} = [-1,1]_{f}^{\mathbb{N}}\) but \(\mathrm{cl}\,[-1,1]_{f}^{\mathbb{N}} = {[-1,1]}^{\mathbb{N}}\). Note that \(\mathrm{fl}\,[-1,1]_{f}^{\mathbb{N}} = \mathbb{R}_{f}^{\mathbb{N}}\). Regard \([-1,1]_{f}^{\mathbb{N}}\) as a convex set in \(\mathbb{R}_{f}^{\mathbb{N}}\). Then,

$$\displaystyle{\mathrm{int}\,_{\mathbb{R}_{f}^{\mathbb{N}}}[-1,1]_{f}^{\mathbb{N}} = \emptyset\;\text{ but }\;\mathrm{core}\,_{ \mathbb{R}_{f}^{\mathbb{N}}}[-1,1]_{f}^{\mathbb{N}} = \mathrm{rint}\,[-1,1]_{ f}^{\mathbb{N}} = (-1,1)_{ f}^{\mathbb{N}}.}$$

By Proposition 3.4.7(1), if A is a subset of a topological linear space E, then cl ⟨A⟩ is the smallest closed convex set containing A, which is called the closed convex hull of A.

Remark 10.

In general, ⟨A⟩ is not closed in E even if A is compact. For example, let A = { a n  ∣ n ∈ ω} ⊂ ℓ 1, where \(a_{0}(i) = {2}^{-i}\) for every \(i \in \mathbb{N}\) and, for each \(n \in \mathbb{N}\), \(a_{n}(i) = {2}^{-i}\) if i ≤ n and a n (i) = 0 if i > n. Then, A is compact and \(\langle A\rangle =\bigcup _{n\in \mathbb{N}}\langle a_{0},a_{1},\ldots,a_{n}\rangle\). For each \(n \in \mathbb{N}\), let

$$\displaystyle{x_{n} = {2}^{-n}a_{ 0} + {2}^{-1}a_{ 1} + \cdots + {2}^{-n}a_{ n} \in \langle a_{0},a_{1},\ldots,a_{n}\rangle .}$$

Then, \(x_{n}(i) = {2}^{-2i+1}\) if i ≤ n and \(x_{n}(i) = {2}^{-n-i}\) if i > n. Hence, \((x_{n})_{n\in \mathbb{N}}\) converges to x 0 ∈ ℓ 1, where \(x_{0}(i) = {2}^{-2i+1}\) for each \(i \in \mathbb{N}\). However, x 0 ∉ ⟨A⟩. Otherwise, \(x_{0} \in \langle a_{0},a_{1},\ldots,a_{n}\rangle\) for some \(n \in \mathbb{N}\), where we can write

$$\displaystyle{x_{0} =\sum _{ i=0}^{n}z(i + 1)a_{ i},\ z \in {\Delta }^{n}.}$$

Then, we have the following:

$$\displaystyle\begin{array}{rcl} z(1)a_{0}(n + 1)& =& x_{0}(n + 1) = {2}^{-2n-1} = {2}^{-n}a_{ 0}(n + 1)\;\text{ and} {}\\ z(1)a_{0}(n + 2)& =& x_{0}(n + 2) = {2}^{-2n-3} = {2}^{-n-1}a_{ 0}(n + 2), {}\\ \end{array}$$

hence \(z(1) = {2}^{-n}\) and \(z(1) = {2}^{-n-1}\). This is a contradiction. Therefore, ⟨A⟩ is not closed in ℓ 1.
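The computations in this remark are easy to check numerically. The following Python sketch (coordinates are truncated to finitely many entries, which is harmless here since all tails are geometric; the names are ad hoc) verifies the coordinate formulas for x n and the convergence to x 0 in the ℓ 1 -norm:

def a(n, length):
    """The point a_n of the remark, truncated to the first `length` coordinates."""
    return [2.0 ** (-i) if (n == 0 or i <= n) else 0.0 for i in range(1, length + 1)]

def x(n, length):
    """x_n = 2^{-n} a_0 + sum_{k=1}^n 2^{-k} a_k, a convex combination of a_0,...,a_n."""
    coeffs = [2.0 ** (-n)] + [2.0 ** (-k) for k in range(1, n + 1)]
    assert abs(sum(coeffs) - 1.0) < 1e-12              # the coefficients sum to 1
    pts = [a(k, length) for k in range(n + 1)]
    return [sum(c * p[i] for c, p in zip(coeffs, pts)) for i in range(length)]

LENGTH = 30
x0 = [2.0 ** (-2 * i + 1) for i in range(1, LENGTH + 1)]        # the claimed limit
for n in (5, 10, 20):
    xn = x(n, LENGTH)
    for i in range(1, LENGTH + 1):       # x_n(i) = 2^{-2i+1} (i <= n) and 2^{-n-i} (i > n)
        expected = 2.0 ** (-2 * i + 1) if i <= n else 2.0 ** (-n - i)
        assert abs(xn[i - 1] - expected) < 1e-15
    print(n, sum(abs(u - v) for u, v in zip(xn, x0)))           # the l_1-distance to x_0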

The following is the topological version of the Separation Theorem 3.3.5:

Theorem 3.4.8 (Separation Theorem). 

Let A and B be convex sets in a topological linear space E such that int  A≠∅ and (int  A) ∩ B = ∅. Then, there is a continuous linear functional \(f : E \rightarrow \mathbb{R}\) such that f(x) < f(y) for each x ∈int  A and y ∈ B, and sup f(A) ≤ inf f(B).

Proof.

First, int A ≠ ∅ implies \(\mathrm{core}\,A = \mathrm{int}\,A\not =\emptyset\) by Proposition 3.4.7(3). Then, by the Separation Theorem 3.3.5, we have a linear functional \(f : E \rightarrow \mathbb{R}\) such that f(x) < f(y) for every x ∈ int A and y ∈ B, and sup f(A) ≤ inf f(B). Note that B − int A is open in E and f(z) > 0 for every z ∈ B − int A. Thus, f  − 1(0) is not dense in E. Therefore, f is continuous by Proposition 3.4.4.

A topological linear space E is locally convex if 0 ∈ E has a neighborhood basis consisting of (open) convex sets; equivalently, open convex sets make up an open basis for E. It follows from Proposition 3.4.6 that for each locally convex topological linear space E and each closed linear subspace F ⊂ E, the quotient linear space E ∕ F is also locally convex. For locally convex topological linear spaces, we have the following separation theorem:

Theorem 3.4.9 (Strong Separation Theorem). 

Let A and B be disjoint closed convex sets in a locally convex topological linear space E. If at least one of A and B is compact, then there is a continuous linear functional \(f : E \rightarrow \mathbb{R}\) such that sup f(A) < inf f(B).

Proof.

By Proposition 3.4.5(2), B − A is closed in E. Since A ∩ B = ∅, it follows that 0 ∉ B − A. Choose an open convex neighborhood U of 0 so that \(U \cap (B - A) = \emptyset\). By the Separation Theorem 3.4.8, we have a nontrivial continuous linear functional \(f : E \rightarrow \mathbb{R}\) such that sup f(U) ≤ inf f(B − A). Then, sup f(A) + sup f(U) ≤ inf f(B), where sup f(U) > 0 by the non-triviality of f. Thus, we have the result.

As a particular case, we have the following:

Corollary 3.4.10.

Let E be a locally convex topological linear space. For each pair of distinct points x,y ∈ E, there exists a continuous linear functional \(f : E \rightarrow \mathbb{R}\) such that f(x)≠f(y). □

Concerning the continuity of sublinear functionals, we have the following:

Proposition 3.4.11.

Let \(p : E \rightarrow \mathbb{R}\) be a non-negative sublinear functional on a topological linear space E. Then, p is continuous if and only if p −1 ([0,1)) is a neighborhood of 0 ∈ E.

Proof.

The “only if” part follows from \({p}^{-1}([0,1)) = {p}^{-1}((-1,1))\). To see the “if” part, let \(\varepsilon > 0\). Since \({p}^{-1}([0,\varepsilon )) =\varepsilon {p}^{-1}([0,1))\) is a neighborhood of 0 ∈ E, each x ∈ E has the following neighborhood:

$$\displaystyle{U =\big (x + {p}^{-1}([0,\varepsilon ))\big) \cap \big (x - {p}^{-1}([0,\varepsilon ))\big).}$$

For each y ∈ U, since \(p(y - x) <\varepsilon\) and \(p(x - y) <\varepsilon\), it follows that

$$\displaystyle\begin{array}{rcl} p(y)& \leq & p(y - x) + p(x) < p(x) +\varepsilon \; \text{ and} {}\\ p(y)& \geq & p(x) - p(x - y) > p(x)-\varepsilon, {}\\ \end{array}$$

which means that p is continuous at x.

For each convex set C ⊂ E with 0 ∈ int C, we have \(\mathrm{int}\,C = \mathrm{core}\,C = p_{C}^{-1}([0,1))\) by Propositions 3.3.4 and 3.4.7(3). Then, the following is obtained from Proposition 3.4.11.

Corollary 3.4.12.

Let E be a topological linear space. For each convex set C ⊂ E with 0 ∈ int  C, the Minkowski functional \(p_{C} : E \rightarrow \mathbb{R}\) is continuous. Moreover, \(p_{C}^{-1}([0,1)) = \mathrm{int}\,C = \mathrm{rint}\,C\) and \(p_{C}^{-1}(\mathbf{I}) = \mathrm{cl}\,C = \mathrm{rcl}\,C\) , hence \(p_{C}^{-1}(1) = \mathrm{bd}\,C = \partial C\) . □

Boundedness is a metric concept, but it can be extended to subsets of a topological linear space E. A subset A ⊂ E is topologically bounded Footnote 13 provided that, for each neighborhood U of 0 ∈ E, there exists some r > 0 such that A ⊂ rU. If A ⊂ E is topologically bounded and B ⊂ A, then B is also topologically bounded. Recall that every neighborhood U of 0 ∈ E contains a circled neighborhood V of 0 ∈ E (cf. Proposition 3.4.1(3)). Since sV ⊂ tV for 0 < s < t, it is easy to see that every compact subset of E is topologically bounded. When E is a normed linear space, A ⊂ E is topologically bounded if and only if A is bounded in the metric sense. Applying Minkowski functionals, we can show the following:

Theorem 3.4.13.

Let E be a topological linear space. Any two topologically bounded closed convex sets C,D ⊂ E with int  C≠∅ and int  D≠∅ are mapped onto each other by a homeomorphism of E onto itself, hence (C,bd  C) ≈ (D,bd  D) and int  C ≈ int  D.

Proof.

Without loss of generality, we may assume that 0 ∈ int C ∩ int D. Let p C and p D be the Minkowski functionals for C and D, respectively. By the topological boundedness of C and D, it is easy to see that p C (x),  p D (x) > 0 for every x ∈ E ∖ {0}. Then, we can define maps \(\varphi,\psi : E \rightarrow E\) as follows: \(\varphi (\mathbf{0}) =\psi (\mathbf{0}) = \mathbf{0}\),

$$\displaystyle{\varphi (x) = \frac{p_{C}(x)} {p_{D}(x)}x\;\text{ and }\;\psi (x) = \frac{p_{D}(x)} {p_{C}(x)}x\;\mbox{ for each $x \in E \setminus \{\mathbf{0}\}$.}}$$

It follows from the continuity of p C and p D (Corollary 3.4.12) that \(\varphi\) and ψ are continuous at each x ∈ E ∖ {0}.

To verify the continuity of \(\varphi\) at 0 ∈ E, let U be a neighborhood of 0 ∈ E. Since D is topologically bounded and C is a neighborhood of 0, there is an r > 0 such that D ⊂ rC. Then, p C (x) ≤ rp D (x) for every x ∈ E. Choose a circled neighborhood V of 0 ∈ E so that rV ⊂ U. Then, \(\varphi (V ) \subset U\). Indeed, for each x ∈ V ∖ {0},

$$\displaystyle{\varphi (x) = \frac{p_{C}(x)} {p_{D}(x)}x \in \frac{p_{C}(x)} {p_{D}(x)}V \subset rV \subset U.}$$

Similarly, ψ is continuous at 0 ∈ E.

For each x ∈ E ∖ {0}, since \(\varphi (x)\not =\mathbf{0}\),

$$\displaystyle{\psi \varphi (x) = \frac{p_{D}(\varphi (x))} {p_{C}(\varphi (x))}\varphi (x) = \frac{\dfrac{p_{C}(x)} {p_{D}(x)}p_{D}(x)} {\dfrac{p_{C}(x)} {p_{D}(x)}p_{C}(x)} \cdot \frac{p_{C}(x)} {p_{D}(x)}x = x.}$$

Hence, \(\psi \varphi =\mathrm{ id}\). Similarly, \(\varphi \psi =\mathrm{ id}\). Therefore, \(\varphi\) is a homeomorphism with \({\varphi }^{-1} =\psi\). Moreover, observe that \(\varphi (C) \subset D\) and ψ(D) ⊂ C, hence \(\varphi (C) = D\). Thus, we have the result.
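As a concrete finite-dimensional illustration of this proof (an illustrative sketch; the set names and helper functions below are ad hoc), take C = [−1,1] × [−1,1] and D the Euclidean unit disc in \({\mathbb{R}}^{2}\); their Minkowski functionals are the sup-norm and the Euclidean norm, and the map \(\varphi\) carries the square onto the disc:

import numpy as np

def p_square(x):
    """Minkowski functional of C = [-1, 1]^2, i.e., the sup-norm."""
    return np.max(np.abs(x))

def p_disc(x):
    """Minkowski functional of D = Euclidean unit disc, i.e., the 2-norm."""
    return np.linalg.norm(x)

def phi(x):
    """phi(x) = (p_C(x) / p_D(x)) x for x != 0 and phi(0) = 0."""
    x = np.asarray(x, dtype=float)
    return x if not x.any() else (p_square(x) / p_disc(x)) * x

def psi(y):
    """The inverse map psi(y) = (p_D(y) / p_C(y)) y for y != 0 and psi(0) = 0."""
    y = np.asarray(y, dtype=float)
    return y if not y.any() else (p_disc(y) / p_square(y)) * y

rng = np.random.default_rng(0)
pts = rng.uniform(-1.0, 1.0, size=(1000, 2))               # random points of the square C
images = np.array([phi(p) for p in pts])
assert np.all(np.linalg.norm(images, axis=1) <= 1 + 1e-12)     # phi(C) is contained in D
assert np.allclose(np.array([psi(q) for q in images]), pts)    # psi is the inverse of phi
print(phi([1.0, 1.0]))     # a corner of the square is sent to a point of the unit circle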

The norm of a normed linear space E is the Minkowski functional for the unit closed ball B E of E. Since bd B E is the unit sphere S E of E, we have the following:

Corollary 3.4.14.

Let \(E = (E,\|\cdot \|)\) be a normed linear space. For every bounded closed convex set C ⊂ E with int  C≠∅, the pair (C,bd  C) is homeomorphic to the pair ( B E, S E ) of the unit closed ball and the unit sphere of E.

It is easy to see that every normed linear space \(E = (E,\|\cdot \|)\) is homeomorphic to the unit open ball B(0, 1) = B E  ∖ S E of E.

In fact, the following are homeomorphisms (each of them is the inverse of the other):

$$\displaystyle{E \ni x\mapsto \frac{1} {1 +\| x\|}x \in \mathrm{ B}(\mathbf{0}, 1); \quad \mathrm{B}(\mathbf{0}, 1) \ni y\mapsto \frac{1} {1 -\| y\|}y \in E.}$$
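A quick numerical check of these two maps for E = ℝ 3 with the Euclidean norm (an illustrative sketch; the function names are ad hoc):

import numpy as np

def to_ball(x):
    """E -> B(0, 1): x maps to x / (1 + ||x||)."""
    x = np.asarray(x, dtype=float)
    return x / (1.0 + np.linalg.norm(x))

def from_ball(y):
    """B(0, 1) -> E: y maps to y / (1 - ||y||)."""
    y = np.asarray(y, dtype=float)
    return y / (1.0 - np.linalg.norm(y))

rng = np.random.default_rng(1)
xs = 10.0 * rng.normal(size=(1000, 3))                      # arbitrary points of E = R^3
ys = np.array([to_ball(x) for x in xs])
assert np.all(np.linalg.norm(ys, axis=1) < 1.0)             # images lie in the open unit ball
assert np.allclose(np.array([from_ball(y) for y in ys]), xs)    # the maps are mutually inverse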

By applying the Minkowski functional, this can be extended as follows:

Theorem 3.4.15.

Every non-empty open convex set V in a topological linear space E is homeomorphic to E itself.

Proof.

Without loss of generality, it can be assumed that 0 ∈ int V = V . Then, we have \(V = \mathrm{int}\,V = p_{V }^{-1}([0,1))\) by Corollary 3.4.12. Using the Minkowski functional p V , we can define maps \(\varphi : V \rightarrow E\) and ψ : E → V as follows:

$$\displaystyle{\varphi (x) = \frac{1} {1 - p_{V }(x)}x\quad \mbox{ for $x \in V $;}\quad \psi (y) = \frac{1} {1 + p_{V }(y)}y\quad \mbox{ for $y \in E$.}}$$

Observe that \(\psi \varphi =\mathrm{ id}_{V }\) and \(\varphi \psi =\mathrm{ id}_{E}\). This means that \(\varphi\) is a homeomorphism with \(\psi {=\varphi }^{-1}\).

3.5 Finite-Dimensionality

Here, we prove that every finite-dimensional linear space has the unique topology that is compatible with the algebraic operations, and that a topological linear space is finite-dimensional if and only if it is locally compact.

First, we show the following proposition:

Proposition 3.5.1.

Every finite-dimensional flat F in an arbitrary linear space E has the unique (Hausdorff) topology such that the following operation is continuous:

$$\displaystyle{F \times F \times \mathbb{R} \ni (x,y,t)\mapsto (1 - t)x + ty \in F.}$$

With respect to this topology, every affine bijection \(f : {\mathbb{R}}^{n} \rightarrow F\) is a homeomorphism, where n = dim F. Then, F is affinely homeomorphic to \({\mathbb{R}}^{n}\) . Moreover, if E is a topological linear space then F is closed in E.

Proof.

As mentioned at the beginning of Sect. 3.4, E has a topology that makes E a topological linear space. With respect to the topology of F inherited from this topology, the above operation is continuous.

Note that there exists an affine bijection \(f : {\mathbb{R}}^{n} \rightarrow F\), where dimF = n. We shall show that any affine bijection \(f : {\mathbb{R}}^{n} \rightarrow F\) is a homeomorphism with respect to any other topology of F such that the above operation is continuous, which implies that such a topology is unique and F is affinely homeomorphic to \({\mathbb{R}}^{n}\).

Since f is affine, we have

$$\displaystyle{f(z) =\bigg (1 -\sum _{i=1}^{n}z(i)\bigg)f(\mathbf{0}) +\sum _{ i=1}^{n}z(i)f(\mathbf{e}_{ i})\;\mbox{ for each $z \in {\mathbb{R}}^{n}$.}}$$

Note that the following function is continuous:

$$\displaystyle{{\mathbb{R}}^{n} \ni z\mapsto \bigg(1 -\sum _{ i=1}^{n}z(i),z(1),\ldots,z(n)\bigg) \in \mathrm{fl}\,{\Delta }^{n} \subset {\mathbb{R}}^{n+1}.}$$

Then, the continuity of f follows from the claim:

Claim.

Given \(v_{1},\ldots,v_{k} \in F\), k ≤ n, the following function is continuous:

$$\displaystyle{\varphi _{k} : \mathrm{fl}\,{\Delta }^{k-1} \ni z\mapsto \sum _{ i=1}^{k}z(i)v_{ i} \in F.}$$

Since fl Δ 0 = Δ 0 is a singleton, the continuity of \(\varphi _{1}\) is obvious. Assuming the continuity of \(\varphi _{k}\), we shall show the continuity of \(\varphi _{k+1}\). Let \(\psi : \mathrm{fl}\,{\Delta }^{k-1} \times \mathbb{R} \rightarrow \mathrm{fl}\,{\Delta }^{k}\) be the map defined by \(\psi (z,t) = ((1 - t)z,t)\). Observe that

$$\displaystyle{\varphi _{k+1}\psi (z,t) = (1 - t)\sum _{i=1}^{k}z(i)v_{ i} + tv_{k+1} = (1 - t)\varphi _{k}(z) + tv_{k+1}.}$$

From the property of the topology of F and the continuity of \(\varphi _{k}\), it follows that \(\varphi _{k+1}\psi\) is continuous. For each \(i = 1,\ldots,k + 1\), let \(p_{i} =\mathrm{ pr}_{i}\vert \mathrm{fl}\,{\Delta }^{k} : \mathrm{fl}\,{\Delta }^{k} \rightarrow \mathbb{R}\) be the restriction of the projection onto the i-th factor. Note that

$$\displaystyle{\psi \vert \mathrm{fl}\,{\Delta }^{k-1} \times (\mathbb{R} \setminus \{ 1\}) : \mathrm{fl}\,{\Delta }^{k-1} \times (\mathbb{R} \setminus \{ 1\}) \rightarrow \mathrm{fl}\,{\Delta }^{k} \setminus p_{ k+1}^{-1}(1)}$$

is a homeomorphism. Hence, \(\varphi _{k+1}\vert \mathrm{fl}\,{\Delta }^{k} \setminus p_{k+1}^{-1}(1)\) is continuous. Replacing the (k + 1)-th coordinates with the i-th coordinates, we can see the continuity of \(\varphi _{k+1}\vert \mathrm{fl}\,{\Delta }^{k} \setminus p_{i}^{-1}(1)\). Since \(\mathrm{fl}\,{\Delta }^{k} =\bigcup _{ i=1}^{k+1}(\mathrm{fl}\,{\Delta }^{k} \setminus p_{i}^{-1}(1))\), it follows that \(\varphi _{k+1}\) is continuous. Thus, the claim can be obtained by induction.

It remains to show the openness of f. On the contrary, assume that f is not open. Then, we have \(x \in {\mathbb{R}}^{n}\) and \(\varepsilon > 0\) such that \(f(\mathrm{B}(x,\varepsilon ))\) is not a neighborhood of f(x) in F. Since \(\mathrm{bd}\,\mathrm{B}(x,\varepsilon )\) is a bounded closed set of \({\mathbb{R}}^{n}\), it is compact, hence \(f(\mathrm{bd}\,\mathrm{B}(x,\varepsilon ))\) is closed in F. Then, \(F \setminus f(\mathrm{bd}\,\mathrm{B}(x,\varepsilon ))\) is a neighborhood of f(x) in F. Using the compactness of I, we can find an open neighborhood U of f(x) in F such that

$$\displaystyle{(1 - t)f(x) + tU \subset F \setminus f(\mathrm{bd}\,\mathrm{B}(x,\varepsilon ))\;\mbox{ for every $t \in \mathbf{I}$.}}$$

Then, \(U \cap f(\mathrm{bd}\,\mathrm{B}(x,\varepsilon )) = \emptyset\). Since \(f(\mathrm{B}(x,\varepsilon ))\) is not a neighborhood of f(x), it follows that \(U\not\subset f(\mathrm{B}(x,\varepsilon ))\), and so we can take a point \(y \in U \setminus f(\overline{\mathrm{B}}(x,\varepsilon ))\). Now, we define a linear path \(g : \mathbf{I} \rightarrow {\mathbb{R}}^{n}\) by \(g(t) = (1 - t)x + t{f}^{-1}(y)\). Since f is affine and y ∈ U, it follows that

$$\displaystyle{fg(t) = (1 - t)f(x) + ty \in F \setminus f(\mathrm{bd}\,\mathrm{B}(x,\varepsilon ))\;\mbox{ for every $t \in \mathbf{I}$.}}$$

Since f is a bijection, we have

$$\displaystyle{g(\mathbf{I}) \subset {\mathbb{R}}^{n} \setminus \mathrm{bd}\,\mathrm{B}(x,\varepsilon ) =\mathrm{ B}(x,\varepsilon ) \cup ({\mathbb{R}}^{n} \setminus \overline{\mathrm{B}}(x,\varepsilon )).}$$

Then, \(g(0) = x \in \mathrm{ B}(x,\varepsilon )\) and \(g(1) = {f}^{-1}(y) \in {\mathbb{R}}^{n} \setminus \overline{\mathrm{B}}(x,\varepsilon )\), which contradicts the connectedness of I. Thus, f is open.

In the case when E is a topological linear space, to prove that F is closed in E, take a point x ∈ E ∖ F and consider the flat F x  = fl (F ∪{ x}). It is easy to construct an affine bijection \(f : {\mathbb{R}}^{n+1} \rightarrow F_{x}\) such that \(f({\mathbb{R}}^{n} \times \{\mathbf{0}\}) = F\). As we saw in the above, f is a homeomorphism, hence F is closed in F x . Since F x  ∖ F is open in F x , we have an open set U in E such that U ∩ F x  = F x  ∖ F. Then, U is a neighborhood of x in E and U ⊂ E ∖ F. Therefore, E ∖ F is open in E, that is, F is closed in E.

If a linear space E has a topology such that the operation

$$\displaystyle{E \times E \times \mathbb{R} \ni (x,y,t)\mapsto (1 - t)x + ty \in E}$$

is continuous, then scalar multiplication and addition are also continuous with this topology because they can be written as follows:

$$\displaystyle\begin{array}{rcl} E \times \mathbb{R} \ni (x,t)& \mapsto & tx = (1 - t)\mathbf{0} + tx \in E; {}\\ E \times E \ni (x,y)& \mapsto & x + y = 2\left (\tfrac{1} {2}x + \tfrac{1} {2}y\right ) \in E. {}\\ \end{array}$$

Then, the following is obtained by Proposition 3.5.1:

Corollary 3.5.2.

Every finite-dimensional linear space E has the unique (Hausdorff) topology compatible with the algebraic operations (addition and scalar multiplication), and then it is linearly homeomorphic to \({\mathbb{R}}^{n}\) , where n = dim E.

Moreover, we have the following:

Corollary 3.5.3.

Let E be a topological linear space and F a finite-dimensional flat in another topological linear space. Then, every affine function f : F → E is continuous, and if f is injective then f is a closed embedding.

Proof.

By Proposition 3.5.1, F can be replaced with \({\mathbb{R}}^{n}\), where n = dimF. Then, we can write

$$\displaystyle{f(x) =\Bigg (1 -\sum _{i=1}^{n}x(i)\Bigg)f(\mathbf{0}) +\sum _{ i=1}^{n}x(i)f(\mathbf{e}_{ i})\;\mbox{ for each $x \in {\mathbb{R}}^{n}$,}}$$

where \(\mathbf{e}_{1},\ldots,\mathbf{e}_{n}\) is the canonical orthonormal basis for \({\mathbb{R}}^{n}\). Thus, the continuity of f is obvious. Since \(f({\mathbb{R}}^{n})\) is a finite-dimensional flat in E, \(f({\mathbb{R}}^{n})\) is closed in E by Proposition 3.5.1. If f is injective then \(f : {\mathbb{R}}^{n} \rightarrow f({\mathbb{R}}^{n})\) is an affine bijection, which is a homeomorphism by Proposition 3.5.1. Hence, f is a closed embedding.

Combining Proposition 3.2.2 and Corollary 3.5.3, we have

Corollary 3.5.4.

Let E be a topological linear space and C a finite-dimensional convex set in another topological linear space. Then, every affine function f : C → E is continuous. Moreover, if f is injective then f is an embedding.

For finite-dimensional convex sets in a linear space, we have the following:

Proposition 3.5.5.

Let C be a finite-dimensional convex set in an arbitrary linear space E. Then, rint  C = int   fl C C with respect to the unique topology for fl  C as in Proposition  3.5.1 .

Proof.

We may assume that E is a topological linear space. By Proposition 3.4.7(3), it suffices to show that int fl C C ≠ ∅. We have affinely independent \(v_{0},v_{1},\ldots,v_{n} \in C\) with \(\mathrm{fl}\,C = \mathrm{fl}\,\{v_{0},v_{1},\ldots,v_{n}\}\), where n = dimC. We have an affine bijection \(f : {\mathbb{R}}^{n} \rightarrow \mathrm{fl}\,C\) such that f(0) = v 0, f(e 1) = v 1, …, f(e n ) = v n . Then, f is a homeomorphism by Proposition 3.5.1, hence

$$\displaystyle{\mathrm{int}\,_{\mathrm{fl}\,C}C \supset \mathrm{int}\,_{\mathrm{fl}\,C}\langle v_{0},v_{1},\ldots,v_{n}\rangle = f(\mathrm{int}\,_{{\mathbb{R}}^{n}}\langle \mathbf{0},\mathbf{e}_{1},\ldots,\mathbf{e}_{n}\rangle )\not =\emptyset.}$$

Note that every compact set in a topological linear space is topologically bounded and closed. For an n-dimensional convex set C in a linear space, the flat hull fl C is affinely isomorphic to \({\mathbb{R}}^{n}\). Combining Propositions 3.5.1 and 3.5.5 with Corollary 3.4.14, we have the following:

Corollary 3.5.6.

For every n-dimensional compact convex set C in an arbitrary topological linear space E, the pair (C,∂C) is homeomorphic to the pair ( B n, S n−1 ) of the unit closed n-ball and the unit (n − 1)-sphere.

Remark 11.

It should be noted that every bounded closed set in Euclidean space \({\mathbb{R}}^{n}\) is compact. More generally, we can prove the following:

Proposition 3.5.7.

Let E be an arbitrary topological linear space and A ⊂ E with dim fl  A < ∞. Then, A is compact if and only if A is topologically bounded and closed in E.

Sketch of Proof. Using Proposition 3.5.1, this can be reduced to the case of \({\mathbb{R}}^{n}\).

The following convex version of Proposition 3.5.1 is not trivial.

Proposition 3.5.8.

Let C be an n-dimensional convex set in an arbitrary linear space E. If (1) C is the convex hull of a finite set Footnote 14 or (2) C = rint  C, then C has the unique (Hausdorff) topology such that the following operation is continuous:

$$\displaystyle{C \times C \times \mathbf{I} \ni (x,y,t)\mapsto (1 - t)x + ty \in C.}$$

In case (1), rcl  C = C and (C,∂C) ≈ ( B n, S n−1 ); in case (2), \(C \approx {\mathbb{R}}^{n}\) .

Proof.

As in Proposition 3.5.1, it suffices to see the uniqueness and the additional statements. To this end, suppose that C has such a topology; note that it is not known in advance whether this topology is induced from a topology on fl C.

Case (1):

Let \(C =\langle v_{1},\ldots,v_{k}\rangle\) and define f : Δ k − 1 → C by \(f(z) =\sum _{ i=1}^{k}z(i)v_{i}\). In the same way as for the claim in the proof of Proposition 3.5.1, we can see that the continuity of the operation above induces the continuity of f. Since Δ k − 1 is compact, f is a closed map, hence it is quotient. Thus, the topology of C is unique and C is compact with respect to this topology. Giving E any topology that makes it a topological linear space, we have rcl C = C by Proposition 3.4.7(1) (C is compact, hence closed in E) and (C, ∂C) ≈ (B n, S n − 1) by Corollary 3.5.6.

Case (2):

Let \(f : {\mathbb{R}}^{n} \rightarrow \mathrm{fl}\,C\) be an affine bijection, where \(n =\dim \mathrm{fl}\,C =\dim C\). Since \(D = {f}^{-1}(C)\) is an n-dimensional convex set in \({\mathbb{R}}^{n}\), \(D = \mathrm{rint}\,D = \mathrm{int}\,D\) is open in \({\mathbb{R}}^{n}\) by Proposition 3.5.5, hence \(D \approx {\mathbb{R}}^{n}\) by Theorem 3.4.15. Then, it suffices to show that f | D : D → C is a homeomorphism. For each x ∈ D, choose δ > 0 so that \(x +\delta { \mathbf{B}}^{n} = \overline{\mathrm{B}}(x,\delta ) \subset D\). Let \(v_{0} = x -\delta \hat{ {\Delta }}^{n-1}\), where \(\hat{{\Delta }}^{n-1}\) is the barycenter of the standard (n − 1)-simplex \({\Delta }^{n-1} =\langle \mathbf{e}_{1},\mathbf{e}_{2},\ldots,\mathbf{e}_{n}\rangle \subset {\mathbb{R}}^{n}\). For each \(i = 1,\ldots,n\), let \(v_{i} = x +\delta \mathbf{e}_{i}\). Then, \(v_{0},v_{1},\ldots,v_{n}\) are affinely independent and

$$\displaystyle{x \in \mathrm{int}\,_{{\mathbb{R}}^{n}}\langle v_{0},v_{1},\ldots,v_{n}\rangle \subset x +\delta { \mathbf{B}}^{n} \subset D,}$$

hence \(\langle v_{0},v_{1},\ldots,v_{n}\rangle\) is a neighborhood of x in D. On the other hand, we have the affine homeomorphism \(\varphi : {\Delta }^{n} \rightarrow \langle v_{0},v_{1},\ldots,v_{n}\rangle\) defined by \(\varphi (z) =\sum _{ i=0}^{n}z(i + 1)v_{i}\). Since \(f\varphi (z) =\sum _{ i=0}^{n}z(i + 1)f(v_{i})\), the continuity of the operation above implies that of \(f\varphi\), hence \(f\vert \langle v_{0},v_{1},\ldots,v_{n}\rangle\) is continuous at x. Then, it follows that f | D is continuous at x.
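The membership \(x \in \mathrm{int}\,_{{\mathbb{R}}^{n}}\langle v_{0},v_{1},\ldots,v_{n}\rangle\) used above can be checked directly (a short verification): since \(v_{0} = x -\delta \hat{ {\Delta }}^{n-1}\) and \(v_{i} = x +\delta \mathbf{e}_{i}\),

$$\displaystyle{\tfrac{1} {2}v_{0} +\sum _{ i=1}^{n} \tfrac{1} {2n}v_{i} = x - \tfrac{\delta } {2}\hat{{\Delta }}^{n-1} + \tfrac{\delta } {2n}\sum _{i=1}^{n}\mathbf{e}_{ i} = x,}$$

so x is a convex combination of \(v_{0},v_{1},\ldots,v_{n}\) with all coefficients positive, i.e., x lies in the interior of the n-simplex \(\langle v_{0},v_{1},\ldots,v_{n}\rangle\).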

Since D is open in \({\mathbb{R}}^{n}\), we can apply the same argument as in the proof of Proposition 3.5.1 to prove that f | D : D → C is open. Consequently, f | D : D → C is a homeomorphism.

Remark 12.

For an arbitrary finite-dimensional convex set C, Proposition 3.5.8 does not hold in general. For example, let

$$\displaystyle{C =\{ \mathbf{0}\} \cup \{ (x,y) \in (0,1{]}^{2}\mid x \geq y\} \subset {\mathbb{R}}^{2}.}$$

Then, C is a convex set that has a finer topology than usual such that the operation in Proposition 3.5.8 is continuous. Such a topology is generated by open sets in the usual topology and the following sets:

$$\displaystyle{D_{r} =\{ \mathbf{0}\} \cup (\mathrm{B}((0,r),r) \cap C),\ r > 0.}$$

Note that this topology induces the same relative topology on C ∖ {0} as usual. Since \(D_{\varepsilon /\sqrt{2}} \subset \mathrm{ B}(\mathbf{0},\varepsilon )\) for each \(\varepsilon > 0\), {D r r > 0} is a neighborhood basis at 0 ∈ C with respect to this topology.

We shall show that the operation

$$\displaystyle{C \times C \times \mathbf{I} \ni (p,q,t)\mapsto (1 - t)p + tq \in C}$$

is continuous at (p, q, t) ∈ C ×C ×I. If \((1 - t)p + tq\not =\mathbf{0}\), it follows from the continuity with respect to the usual topology. The continuity at (0, 0, t) follows from the convexity of D r , r > 0.

Fig. 3.7 The continuity of the operation at (0, q, 0) ∈ C ×C ×I

To see the continuity at (0, q, 0) (q ≠ 0), let q = (x, y), where 0 < y ≤ x ≤ 1. Choose s > 0 so that q ∈ D s (i.e., \(s > ({x}^{2} + {y}^{2})/2y\)). For each 0 < r < min{1, s}, let 0 ≤ t ≤ r ∕ 2s, p′ ∈ D r ∕ 2, and q′ ∈ D s (Fig. 3.7). Observe that

$$\displaystyle{ \frac{1 - t} {r/s - t}(r/s)p^\prime \in \frac{1 - t} {r/s - t}(r/s)D_{r/2} \subset \frac{1} {r/2s}(r/s)D_{r/2} = D_{r}}$$

and \((r/s)q^\prime \in (r/s)D_{s} = D_{r}\). Since D r is convex, it follows that

$$\displaystyle{(1 - t)p^\prime + tq^\prime =\bigg (1 - \frac{t} {r/s}\bigg) \frac{1 - t} {r/s - t}(r/s)p^\prime + \frac{t} {r/s}(r/s)q^\prime \in D_{r}.}$$

Thus, the operation is continuous at (0, q, 0). The continuity at (p, 0, 1) (p ≠ 0) is the same.

A subset A of a topological linear space E is totally bounded provided, for each neighborhood U of 0 ∈ E, there exists some finite set M ⊂ E such that A ⊂ M + U. In this definition, M can be taken as a subset of A.

Indeed, for each neighborhood U of 0 ∈ E, we have a circled neighborhood V such that V + V ⊂ U. Then, A ⊂ M + V for some finite set M ⊂ E, where it can be assumed that \((x + V ) \cap A\not =\emptyset\) for every x ∈ M. For each x ∈ M, choose a x  ∈ A so that a x  ∈ x + V . Then, \(x \in a_{x} - V = a_{x} + V\). It follows that \(A \subset \bigcup _{x\in M}(x + V ) \subset \bigcup _{x\in M}(a_{x} + V + V ) \subset \bigcup _{x\in M}(a_{x} + U)\).

If A ⊂ E is totally bounded and B ⊂ A, then B is also totally bounded. It is easy to see that every compact subset of E is totally bounded and every totally bounded subset of E is topologically bounded. In other words, we have:

$$\displaystyle{\text{compact} \Rightarrow \text{totally bounded} \Rightarrow \text{topologically bounded}}$$

For topological linear spaces, the finite-dimensionality can be simply characterized as follows:

Theorem 3.5.9.

Let E be a topological linear space. The following are equivalent:

  1. (a)

    E is finite-dimensional;

  2. (b)

    E is locally compact;

  3. (c)

    0 ∈ E has a totally bounded neighborhood in E.

Proof.

Since each n-dimensional topological linear space is linearly homeomorphic to \({\mathbb{R}}^{n}\) (Corollary 3.5.2), we have (a) ⇒ (b). Since every compact subset of E is totally bounded, the implication (b) ⇒ (c) follows.

(c) ⇒ (a): Let U be a totally bounded neighborhood of 0 ∈ E. By Proposition 3.4.1, we have a circled neighborhood V of 0 such that V + V ⊂ U. Then, V is also totally bounded. First, we show the following:

Claim.

For each closed linear subspace F⊊E, there is some x ∈ U such that \((x + V ) \cap F = \emptyset\).

Contrary to the claim, suppose that \((x + V ) \cap F\not =\emptyset\) for every x ∈ U. Since \(V = -V\), it follows that U ⊂ F + V, so we have \(V + V \subset F + V\). If \((n - 1)V \subset F + V\) then

$$\displaystyle{nV \subset (n - 1)V + V \subset F + V + V \subset F + F + V = F + V.}$$

By induction, we have nV ⊂ F + V for every \(n \in \mathbb{N}\), which implies that \(V \subset \bigcap _{n\in \mathbb{N}}(F + {n}^{-1}V )\).

Take z ∈ E ∖ F. Since F is closed in E, we have a circled neighborhood W of 0 ∈ E such that W ⊂ V and \((z + W) \cap F = \emptyset\). The total boundedness of V implies the topological boundedness, hence V ⊂ mW for some \(m \in \mathbb{N}\). On the other hand, k  − 1 z ∈ V for some \(k \in \mathbb{N}\). Since \({k}^{-1}z \in V \subset F + {(km)}^{-1}V\), it follows that \(z \in F + {m}^{-1}V \subset F + W\). This contradicts the fact that \((z + W) \cap F = \emptyset\).

Now, assume that E is infinite-dimensional. Let v 1 ∈ U ∖ {0} and \(F_{1} = \mathbb{R}v_{1}\). Then, F 1 is closed in E (Proposition 3.5.1) and F 1 ≠ E. Applying the claim above, we have v 2 ∈ U such that \((v_{2} + V ) \cap F_{1} = \emptyset\). Note that \(v_{2}\not\in v_{1} + V\). Let \(F_{2} = \mathbb{R}v_{1} + \mathbb{R}v_{2}\). Since F 2 is closed in E (Proposition 3.5.1) and F 2 ≠ E, we can again apply the claim to find v 3 ∈ U such that \((v_{3} + V ) \cap F_{2} = \emptyset\). Then, note that v 3 ∉ v i  + V for i = 1, 2. By induction, we have v n  ∈ U, \(n \in \mathbb{N}\), such that v n  ∉ v i  + V for i < n. Then, \(\{v_{n}\mid n \in \mathbb{N}\}\) is not totally bounded. This is a contradiction. Consequently, E is finite-dimensional.

By Theorem 3.5.9, no infinite-dimensional topological linear space is locally compact.

3.6 Metrizability and Normability

In this section, we prove metrization and normability theorems for topological linear spaces. The metrizability of a topological linear space has the following very simple characterization:

Theorem 3.6.1.

A topological linear space E is metrizable if and only if 0 ∈ E has a countable neighborhood basis.

In a more general setting, we shall prove a stronger result. A metric d on a group G is said to be left (resp. right) invariant if d(x, y) = d(zx, zy) (resp. d(x, y) = d(xz, yz)) for each x, y, z ∈ G; equivalently, \(d(x,y) = d({x}^{-1}y,\mathbf{1})\) (resp. \(d(x,y) = d(x{y}^{-1},\mathbf{1})\)) for each x, y ∈ G. When both of two metrics d and d′ on a group G are left (or right) invariant, they are uniformly equivalent to each other if and only if they induce the same topology. It is said that d is invariant if it is left and right invariant. Every invariant metric d on a group G induces the topology on G that makes G a topological group. In fact,

$$ \begin{array}{rcl} d(x,y)& =& d({x}^{-1}x{y}^{-1},{x}^{-1}y{y}^{-1}) = d({y}^{-1},{x}^{-1}) = d({x}^{-1},{y}^{-1})\;\text{ and} {}\\ & & d(xy,x^\prime y^\prime) \leq d(xy,x^\prime y) + d(x^\prime y,x^\prime y^\prime) = d(x,x^\prime) + d(y,y^\prime). {}\\ \end{array}$$

It is easy to verify that a left (or right) invariant metric d on a group G is invariant if \(d(x,y) = d({x}^{-1},{y}^{-1})\) for each x, y ∈ G. Theorem 3.6.1 comes from the following:

Theorem 3.6.2.

For a topological group G, the following are equivalent:

  1. (a)

    G is metrizable;

  2. (b)

    The unit 1 ∈ G has a countable neighborhood basis;

  3. (c)

    G has an admissible bounded left invariant (right invariant) metric.

Proof.

Since the implications (a) ⇒ (b) and (c) ⇒ (a) are obvious, it suffices to show the implication (b) ⇒ (c).

(b) ⇒ (c):Footnote 15 We shall construct a left invariant metric ρ ∈ Metr (G). Then, a right invariant metric ρ′ ∈ Metr (G) can be defined by \(\rho ^\prime(x,y) =\rho ({x}^{-1},{y}^{-1})\). By condition (b), we can find an open neighborhood basis \(\{V _{n}\mid n \in \mathbb{N}\}\) at 1 ∈ G such that

$$\displaystyle{V _{n}^{-1} = V _{ n}\;\text{ and }\;V _{n+1}V _{n+1}V _{n+1} \subset V _{n}\;\mbox{ for each $n \in \mathbb{N}$.}}$$

Footnote 16 Let V 0 = G, and define

$$\displaystyle{p(x) =\inf \{ {2}^{-i}\mid x \in V _{ i}\} \in \mathbf{I}\;\mbox{ for each $x \in G$.}}$$

Since \(V _{n} = V _{n}^{-1}\) for each \(n \in \mathbb{N}\), it follows that \(p(x) = p({x}^{-1})\) for every x ∈ G. Note that \(\bigcap _{n\in \omega }V _{n} =\{ \mathbf{1}\}\).Footnote 17 Hence, for every x ∈ G,

$$\displaystyle{p(x) = 0\; \Leftrightarrow \; x = \mathbf{1}.}$$

By induction on n, we shall prove the following:

  • ( ∗ ) \(p(x_{0}^{-1}x_{n}) \leq 2\sum _{i=1}^{n}p(x_{ i-1}^{-1}x_{ i})\) for each \(x_{0},x_{1},\ldots,x_{n} \in G\).Footnote 18

The case n = 1 is obvious. Assume ( ∗ ) for m < n. If \(\sum _{i=1}^{n}p(x_{i-1}^{-1}x_{i}) = 0\) or \(\sum _{i=1}^{n}p(x_{i-1}^{-1}x_{i}) \geq \frac{1} {2}\), it is trivial. When \({2}^{-k-1} \leq \sum _{i=1}^{n}p(x_{i-1}^{-1}x_{i}) < {2}^{-k}\) for some \(k \in \mathbb{N}\), choose 1 ≤ m ≤ n so that

$$\displaystyle{\sum _{i=1}^{m-1}p(x_{ i-1}^{-1}x_{ i}) < {2}^{-k-1}\;\text{ and }\;\sum _{ i=m+1}^{n}p(x_{ i-1}^{-1}x_{ i}) < {2}^{-k-1}.}$$

Note that \(p(x_{m-1}^{-1}x_{m}) < {2}^{-k}\). By the inductive assumption, \(p(x_{0}^{-1}x_{m-1}) < {2}^{-k}\) and \(p(x_{m}^{-1}x_{n}) < {2}^{-k}\). Then, \(x_{0}^{-1}x_{m-1},\ x_{m-1}^{-1}x_{m},\ x_{m}^{-1}x_{n} \in V _{k+1}\). Since \(V _{k+1}V _{k+1}V _{k+1} \subset V _{k}\), it follows that \(x_{0}^{-1}x_{n} \in V _{k}\), hence

$$\displaystyle{p(x_{0}^{-1}x_{ n}) \leq {2}^{-k} \leq 2\sum _{ i=1}^{n}p(x_{ i-1}^{-1}x_{ i}).}$$

Now, we can define a metric ρ on G as follows:

$$\displaystyle{\rho (x,y) =\inf \big\{\sum _{ i=1}^{n}p(x_{i-1}^{-1}x_{i})\bigm |n \in \mathbb{N},\ x_{i} \in G,\ x_{0} = x,\ x_{n} = y\big\}.}$$

By the definition, ρ is left invariant. Note that \(\rho (x,y) \leq p({x}^{-1}y) \leq 1\). Then, x  − 1 y ∈ V n implies \(\rho (x,y) \leq p({x}^{-1}y) \leq {2}^{-n} < {2}^{-n+1}\), which means \(xV _{n} \subset \mathrm{ B}_{\rho }(x,{2}^{-n+1})\). On the other hand, if ρ(x, y) < 2 − n then \(p({x}^{-1}y) \leq 2\rho (x,y) < {2}^{-n+1}\) by ( ∗ ), which implies x  − 1 y ∈ V n . Thus, \(\mathrm{B}_{\rho }(x,{2}^{-n}) \subset xV _{n}\). Therefore, ρ is admissible.

In the above proof, a right invariant metric ρ ∈ Metr (G) can be directly defined as follows:

$$\displaystyle{\rho (x,y) =\inf \big\{\sum _{ i=1}^{n}p(x_{i-1}x_{i}^{-1})\bigm |n \in \mathbb{N},\ x_{i} \in G,\ x_{0} = x,\ x_{n} = y\big\}.}$$

Every metrizable topological linear space E has an admissible (bounded) metric ρ that is not only invariant but also satisfies the following:

  • (\(\sharp \)) \(\vert t\vert \leq 1 \Rightarrow \rho (tx,\mathbf{0}) \leq \rho (x,\mathbf{0})\).

To verify this, let us recall how to define the metric ρ in the above proof. Taking a neighborhood basis \(\{V _{n}\mid n \in \mathbb{N}\}\) at 0 ∈ E so that \(V _{n} = -V _{n}\) and \(V _{n+1} + V _{n+1} + V _{n+1} \subset V _{n}\) for each \(n \in \mathbb{N}\), we define the admissible invariant metric ρ as follows:

$$\displaystyle{\rho (x,y) =\inf \big\{\sum _{ i=1}^{n}p(x_{i} - x_{i-1})\bigm |n \in \mathbb{N},\ x_{i} \in E,\ x_{0} = x,\ x_{n} = y\big\},}$$

where \(p(x) =\inf \{ {2}^{-i}\mid x \in V _{i}\}\). Since E is a topological linear space, the condition that \(V _{n} = -V _{n}\) can be replaced by a stronger condition that V n is circled, i.e., tV n  ⊂ V n for t ∈ [ − 1, 1]. Then, p(tx) ≤ p(x) for each x ∈ E and t ∈ [ − 1, 1], which implies that ρ(tx, 0) ≤ ρ(x, 0) for each x ∈ E and t ∈ [ − 1, 1].

Let d be an invariant metric on a linear space E. Addition on a linear space E is clearly continuous with respect to d. On the other hand, scalar multiplication on E is continuous with respect to d if and only if d satisfies the following three conditions:

  1. (i)

    \(d(x_{n},\mathbf{0}) \rightarrow 0\; \Rightarrow \;\forall t \in \mathbb{R},\ d(tx_{n},\mathbf{0}) \rightarrow 0\);

  2. (ii)

    \(t_{n} \rightarrow 0\; \Rightarrow \;\forall x \in E,\ d(t_{n}x,\mathbf{0}) \rightarrow 0\);

  3. (iii)

    \(d(x_{n},\mathbf{0}) \rightarrow 0,\ t_{n} \rightarrow 0\; \Rightarrow \; d(t_{n}x_{n},\mathbf{0}) \rightarrow 0\).

Indeed, the “only if” part is trivial. To show the “if” part, observe

$$\displaystyle{t_{n}x_{n} - tx = (t_{n} - t)(x_{n} - x) + t(x_{n} - x) + (t_{n} - t)x.}$$

Since d is invariant, it follows that

$$\displaystyle\begin{array}{rcl} d(t_{n}x_{n},tx)& =& d((t_{n} - t)(x_{n} - x) + t(x_{n} - x) + (t_{n} - t)x,\mathbf{0}) {}\\ & \leq & d((t_{n} - t)(x_{n} - x),\mathbf{0}) + d(t(x_{n} - x),\mathbf{0}) + d((t_{n} - t)x,\mathbf{0}), {}\\ \end{array}$$

where d(t n x n , tx) → 0 if t n  → t and d(x n , x) → 0. Thus, the above three conditions imply the continuity of scalar multiplication on E with respect to d.

It should be remarked that condition (\(\sharp \)) implies condition (iii).

An invariant metric d on E satisfying these conditions is called a linear metric. A linear space with a linear metric is called a metric linear space. Then, every metric linear space is a metrizable topological linear space. Conversely, we have the following fact:

Fact.

Every admissible invariant metric for a metrizable topological linear space is a linear metric.

For subsets of a metric linear space, the total boundedness coincides with that in the metric sense. On the other hand, the topological boundedness does not coincide with the metric boundedness. In fact, every metrizable topological linear space E has an admissible bounded invariant metric. For instance, given an admissible invariant metric d for E, the following are admissible bounded invariant metrics:

$${\min \big\{1,\ d(x,y)\big\},\ \frac{d(x,y)} {1 + d(x,y)}.}$$
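For the second metric, the triangle inequality can be verified as follows (a standard check, included for convenience): since \(t\mapsto t/(1 + t)\) is increasing on [0,∞),

$$\displaystyle{ \frac{d(x,z)} {1 + d(x,z)} \leq \frac{d(x,y) + d(y,z)} {1 + d(x,y) + d(y,z)} \leq \frac{d(x,y)} {1 + d(x,y)} + \frac{d(y,z)} {1 + d(y,z)},}$$

while invariance and boundedness (by 1) are immediate; the check for min{1, d(x,y)} is even simpler.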

For a linear metric ρ on E with the condition (\(\sharp \)), the functional \(E \ni x\mapsto \rho (x,\mathbf{0}) \in \mathbb{R}\) is called an \(\boldsymbol{F}\)-norm. In other words, a functional \(\|\cdot \| : E \rightarrow \mathbb{R}\) on a linear space E is called an \(\boldsymbol{F}\)-norm if it satisfies the following conditions:

  • (F 1) \(\|x\| \geq 0\) for every x ∈ E;

  • (F 2) \(\|x\| = 0\; \Rightarrow \; x = \mathbf{0}\);

  • (F 3) \(\vert t\vert \leq 1\; \Rightarrow \;\| tx\| \leq \| x\|\) for every x ∈ E;

  • (F 4) \(\|x + y\| \leq \| x\| +\| y\|\) for every x, y ∈ E;

  • (F 5) \(\|x_{n}\| \rightarrow 0\; \Rightarrow \;\| tx_{n}\| \rightarrow 0\) for every \(t \in \mathbb{R}\);

  • (F 6) \(t_{n} \rightarrow 0\; \Rightarrow \;\| t_{n}x\| \rightarrow 0\) for every x ∈ E.

Conditions (F 3), (F 5), and (F 6) correspond to conditions (\(\sharp \)), (i), and (ii), respectively. The converse of (F 2) is true because \(\|\mathbf{0}\| = 0\) by (F 6). Then, \(\|x\| = 0\) if and only if x = 0. Condition (F 3) implies that \(\|-x\| =\| x\|\) for every x ∈ E. Furthermore, conditions (F 3) and (F 4) imply condition (F 5). Indeed, using (F 4) inductively, we have \(\|nx\| \leq n\|x\|\) for every \(n \in \mathbb{N}\). Each t ∈ [0,∞) can be written as \(t = [t] + s\) for some s ∈ [0, 1), where [t] is the greatest integer ≤ t. Since \(\|sx\| \leq \| x\|\) by (F 3), it follows that \(\|tx\| \leq ([t] + 1)\|x\|\). Because \(\|-x\| =\| x\|\), \(\|tx\| \leq ([\vert t\vert ] + 1)\|x\|\) for every \(t \in \mathbb{R}\). This implies condition (F 5). Thus, condition (F 5) is unnecessary.

A linear space E given an F-norm \(\|\cdot \|\) is called an \(\boldsymbol{F}\)-normed linear space. Every norm is an F-norm, hence every normed linear space is an F-normed space. An F-norm \(\|\cdot \|\) induces the linear metric \(d(x,y) =\| x - y\|\). Then, every F-normed linear space is a metric linear space. An F-norm on a topological linear space E is said to be admissible if it induces the topology for E. As we saw above, if E is metrizable, then E has an admissible invariant metric ρ satisfying (\(\sharp \)), which induces an admissible F-norm. Therefore, we have the following:

Theorem 3.6.3.

A topological linear space has an admissible F-norm if and only if it is metrizable.

For each metrizable topological linear space, there exists an F-norm with the following stronger condition than (F 3):

  • (F 3  ∗ ) \(x\not =\mathbf{0},\ \vert t\vert < 1 \Rightarrow \| tx\| <\| x\|\),

which implies that \(\|sx\| <\| tx\|\) for each x ≠ 0 and 0 < s < t. The following proposition guarantees the existence of an F-norm with the condition (F 3  ∗ ):

Proposition 3.6.4.

Every (completely) metrizable topological linear space E has an admissible invariant (complete) metric d such that d(tx, 0 ) < d(x, 0 ) if x≠ 0 and |t| < 1, which induces an admissible F-norm satisfying (F 3  ∗ ). If an admissible invariant metric ρ for E is given, d can be chosen so that d ≥ρ (hence, if ρ is complete, then so is d). Moreover, if ρ is bounded, d can be chosen to be bounded.

Proof.

Given an admissible (bounded) invariant metric ρ for E, we define \(d_{1}(x,y) =\sup _{0<s\leq 1}\rho (sx,sy)\). Then, d 1 is an invariant metric on E with d 1 ≥ ρ (if ρ is bounded then so is d 1). For each \(\varepsilon > 0\), since the scalar multiplication \(E \times \mathbb{R} \ni (x,s) \rightarrow sx \in E\) is continuous at (0, s) and I is compact, we can find δ > 0 such that ρ(x, 0) < δ implies \(\rho (sx,\mathbf{0}) <\varepsilon\) for every s ∈ I, hence ρ(x, y) < δ implies \(d_{1}(x,y) =\sup _{0<s\leq 1}\rho (sx,sy) \leq \varepsilon\). Thus, d 1 is uniformly equivalent to ρ. In particular, d 1 is admissible. For r > 0, we define an admissible invariant metric d r for E by d r (x, y) = d 1(rx, ry) ( = sup0 < s ≤ r ρ(sx, sy)). Observe that d r (tx, 0) ≤ d r (x, 0) for each x ∈ E and t ∈ I.

Now, let \(\mathbb{Q} \cap (0,1] =\{ r_{n}\mid n \in \mathbb{N}\}\), where r 1 = 1. We define \(d(x,y) =\sum _{n\in \mathbb{N}}{2}^{-n+1}d_{r_{n}}(x,y)\). Then, d is an invariant metric on E and

$${\rho (x,y) \leq d_{1}(x,y) \leq d(x,y) \leq 2d_{1}(x,y),}$$

hence d is admissible (if ρ is bounded then so is d). It also follows that d(tx, 0) ≤ d(x, 0) for each x ∈ E and t ∈ I. It remains to show that d(tx, 0) ≠ d(x, 0) for each x ∈ E ∖ {0} and 0 < t < 1. Since \(\mathbb{Q} \cap (0,1)\) is dense in (0, 1), it suffices to show that d(tx, 0) ≠ d(x, 0) for each x ∈ E ∖ {0} and \(t \in \mathbb{Q} \cap (0,1)\). Assume that there exists some x ∈ E ∖ {0} and \(t \in \mathbb{Q} \cap (0,1)\) such that d(tx, 0) = d(x, 0). Note that d r (tx, 0) = d r (x, 0) for each \(r \in \mathbb{Q} \cap (0,1)\). Then, it follows that

$$\begin{array}{rcl} d_{t}(x,\mathbf{0})& =& d_{t}(tx,\mathbf{0}) = d_{{t}^{2}}(x,\mathbf{0}) = d_{{t}^{2}}(tx,\mathbf{0}) {}\\ & =& d_{{t}^{3}}(x,\mathbf{0}) = d_{{t}^{3}}(tx,\mathbf{0}) = \cdots \,, {}\\ \end{array}$$

so \(d_{t}(x,\mathbf{0}) = d_{{t}^{n+1}}(x,\mathbf{0}) = d_{t}({t}^{n}x,\mathbf{0})\) for every \(n \in \mathbb{N}\). Since \(\lim _{n\rightarrow \infty }{t}^{n} = 0\), it follows that \(d_{t}(x,\mathbf{0}) =\lim _{n\rightarrow \infty }d_{t}({t}^{n}x,\mathbf{0}) = 0\), hence x = 0, which is a contradiction.
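
To see the construction of d at work in the simplest possible setting, the following numerical sketch (not taken from the text) uses \(E = \mathbb{R}\) with the bounded invariant metric ρ(x,y) = min{1, |x − y|}, for which \(d_{r}(x,y) =\min \{1,r\vert x - y\vert \}\) in closed form; the enumeration of \(\mathbb{Q} \cap (0,1]\) and the truncation of the infinite sum are choices made only for this illustration.

    from fractions import Fraction

    # Toy example: E = R with the bounded invariant metric rho(x, y) = min{1, |x - y|}.
    # For this rho, d_r(x, y) = sup_{0 < s <= r} rho(sx, sy) = min{1, r|x - y|} in closed form.

    def d_r(x, y, r):
        return min(1.0, r * abs(x - y))

    def rationals(n_terms):
        # a simple enumeration of Q ∩ (0, 1] starting with r_1 = 1
        out, seen = [Fraction(1)], set()
        q = 2
        while len(out) < n_terms:
            for p in range(1, q):
                f = Fraction(p, q)
                if f not in seen:
                    seen.add(f)
                    out.append(f)
                    if len(out) == n_terms:
                        break
            q += 1
        return out

    def d(x, y, n_terms=200):
        # truncation of d = sum_{n>=1} 2^{-n+1} d_{r_n}; n_terms is an illustrative cut-off
        return sum(2.0 ** (-n) * d_r(x, y, float(r))
                   for n, r in enumerate(rationals(n_terms)))

    x = 3.0
    for t in (0.9, 0.5, 0.1):
        print(t, d(t * x, 0.0), d(x, 0.0))   # the first value is strictly smaller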

The topological linear space \({\mathbb{R}}^{\mathbb{N}} =\boldsymbol{ s}\) (the space of sequences) has the following admissible F-norms:

$${\sup _{i\in \mathbb{N}}\min \big\{1/i,\vert x(i)\vert \big\},\ \sum _{i\in \mathbb{N}}\min \big\{{2}^{-i},\vert x(i)\vert \big\},\ \sum _{ i\in \mathbb{N}} \frac{{2}^{-i}\vert x(i)\vert } {1 + \vert x(i)\vert },\ \ldots .}$$

The first two do not satisfy condition (F 3  ∗ ), but the third does.
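
As a numerical aside (not part of the text), the three F-norms above can be compared on truncated sequences; the truncation length and the test vector below are illustrative choices only, and the indices are shifted to start from 0.

    # Truncated versions of the three F-norms on s = R^N quoted above.
    # N is a truncation length chosen only for this numerical illustration.
    N = 50

    def fnorm1(x):
        return max(min(1.0 / (i + 1), abs(xi)) for i, xi in enumerate(x))

    def fnorm2(x):
        return sum(min(2.0 ** (-(i + 1)), abs(xi)) for i, xi in enumerate(x))

    def fnorm3(x):
        return sum(2.0 ** (-(i + 1)) * abs(xi) / (1.0 + abs(xi)) for i, xi in enumerate(x))

    x = [10.0] * N            # a sequence with large entries
    t = 0.5                   # a scalar with |t| < 1
    tx = [t * xi for xi in x]

    for f in (fnorm1, fnorm2, fnorm3):
        print(f.__name__, f(tx), f(x))
    # For fnorm1 and fnorm2 the two values coincide (the minimum is attained at 1/i
    # and 2^{-i}, respectively), so (F3*) fails; for fnorm3 the first value is
    # strictly smaller.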

We now consider the completion of metric linear spaces (cf. 2.3.10).

Proposition 3.6.5.

Let G be a topological group whose topology is induced by an invariant metric d. The completion \(\widetilde{G} = (\widetilde{G},\tilde{d})\) of (G,d) is a topological group containing G as a subgroup, and \(\tilde{d}\) is invariant. Similarly, the completion of a metric (F-normed or normed) linear space E is a metric (F-normed or normed) linear space containing E as a linear subspace.

Proof.

We define the algebraic operations on \(\widetilde{G}\) as follows: for each \(x,y \in \widetilde{ G}\), choose sequences \((x_{i})_{i\in \mathbb{N}}\) and \((y_{i})_{i\in \mathbb{N}}\) in G so as to converge to x and y, respectively. Since d is invariant, \((x_{i}y_{i})_{i\in \mathbb{N}}\) and \((x_{i}^{-1})_{i\in \mathbb{N}}\) are Cauchy sequences in G. Then, define xy and x  − 1 as the limits of \((x_{i}y_{i})_{i\in \mathbb{N}}\) and \((x_{i}^{-1})_{i\in \mathbb{N}}\), respectively. It is easily verified that these are well-defined. Since \(\tilde{d}(x,y) =\lim _{i\rightarrow \infty }d(x_{i},y_{i})\), it is also easy to see that \(\tilde{d}\) is invariant, which implies the continuity of the algebraic operations \((x,y)\mapsto xy\) and \(x\mapsto {x}^{-1}\).

For the completion \(\widetilde{E}\) of a metric linear space E, we can define not only addition but also scalar multiplication in the same way. To see the continuity of scalar multiplication, let \(x \in \widetilde{ E}\) and \(t \in \mathbb{R}\). Choose a sequence \((x_{i})_{i\in \mathbb{N}}\) in E so as to converge to x. For each \(\varepsilon > 0\), we can choose δ 0 > 0 (depending on t) so that

$${z \in E,\ d(z,\mathbf{0}) <\delta _{0},\ \vert t - t^\prime\vert <\delta _{0}\ \Rightarrow \ d(t^\prime z,\mathbf{0}) <\varepsilon /4.}$$

Then, we have \(n_{0} \in \mathbb{N}\) such that \(d(x_{n},x_{n_{0}}) <\delta _{0}\) for every n ≥ n 0. Choose δ 1 > 0 so that δ 1 < δ 0 and

$${\vert s\vert <\delta _{1} \Rightarrow d(sx_{n_{0}},\mathbf{0}) <\varepsilon /4.}$$

Now, let \(x^\prime \in \widetilde{ E}\) and \(t^\prime \in \mathbb{R}\) such that \(\tilde{d}(x,x^\prime) <\delta _{0}\) and | t − t′ |  < δ 1. Take a sequence \((x^\prime_{i})_{i\in \mathbb{N}}\) in E so as to converge to x′ and choose \(n_{1} \in \mathbb{N}\) so that n 1 ≥ n 0 and \(d(x_{n},x^\prime_{n}) <\delta _{0}\) for every n ≥ n 1. Then, for every n ≥ n 1, it follows that

$$\begin{array}{rcl} d(tx_{n},t^\prime x^\prime_{n})& \leq & d(tx_{n},tx_{n_{0}}) + d(tx_{n_{0}},t^\prime x_{n_{0}}) + d(t^\prime x_{n_{0}},t^\prime x_{n}) + d(t^\prime x_{n},t^\prime x^\prime_{n}) {}\\ & =& d(t(x_{n} - x_{n_{0}}),\mathbf{0}) + d((t - t^\prime)x_{n_{0}},\mathbf{0}) {}\\ & & \qquad \qquad + d(t^\prime(x_{n_{0}} - x_{n}),\mathbf{0}) + d(t^\prime(x_{n} - x^\prime_{n}),\mathbf{0}) {}\\ & <& \varepsilon /4 +\varepsilon /4 +\varepsilon /4 +\varepsilon /4 =\varepsilon . {}\\ \end{array}$$

When E is an F-normed (or normed) linear space, it is easy to see that the F-norm (or norm) for E naturally extends to \(\widetilde{E}\).

Concerning the completeness of admissible invariant metrics, we have the following:

Theorem 3.6.6.

Let G be a completely metrizable topological group. Every admissible invariant metric for G is complete. In particular, a metric linear space is complete if it is absolutely G δ (i.e., completely metrizable).

Proof.

Let d be an admissible invariant metric for G and \(\widetilde{G}\) be the completion of (G, d). Note that \(\widetilde{G}\) is a topological group by Proposition 3.6.5. It suffices to show that \(\widetilde{G} = G\). Since G is completely metrizable, G is a dense G δ -set in \(\widetilde{G}\) (Theorem 2.5.2), hence we can write \(\widetilde{G} \setminus G =\bigcup _{n\in \mathbb{N}}F_{n}\), where each F n is a nowhere dense closed set in \(\widetilde{G}\). Assume \(\widetilde{G} \setminus G\not =\emptyset\) and take \(x_{0} \in \widetilde{ G} \setminus G\). Since \(x_{0}x \in \widetilde{ G} \setminus G\) for every x ∈ G, it follows that \(G \subset \bigcup _{n\in \mathbb{N}}x_{0}^{-1}F_{n}\), where each \(x_{0}^{-1}F_{n}\) is also a nowhere dense closed set in \(\widetilde{G}\). Then, we have

$${\widetilde{G} =\bigcup _{n\in \mathbb{N}}F_{n} \cup \bigcup _{n\in \mathbb{N}}x_{0}^{-1}F_{ n},}$$

which is the countable union of nowhere dense closed sets. This contradicts the complete metrizability of \(\widetilde{G}\) (the Baire Category Theorem 2.5.1).

Corollary 3.6.7.

Let G be a metrizable topological group. Every completely metrizable Abelian subgroup H of G is closed in G. Hence, in a metrizable topological linear space, every completely metrizable linear subspace is closed.

Proof.

By Theorem 3.6.2, G has an admissible left invariant metric d. Because H is an Abelian subgroup of G, the restriction of d on H is an admissible invariant metric for H, which is complete by Theorem 3.6.6. Hence, it follows that H is closed in G.

It is said that an F-norm (or an F-normed space) is complete if the metric induced by the F-norm is complete. It should be noted that every metrizable topological linear space has an admissible F-norm (Proposition 3.6.4) and that every admissible F-norm for a completely metrizable topological linear space is complete (Theorem 3.6.6). A completely metrizable topological linear space (or a complete F-normed linear space) is called an \(\boldsymbol{F}\)-space. A Fréchet space is a locally convex F-space, that is, a completely metrizable locally convex topological linear space. Every Banach space is a Fréchet space, but the converse does not hold. In fact, \(\boldsymbol{s} = {\mathbb{R}}^{\mathbb{N}}\) is a Fréchet space but it is not normable (Proposition 1.2.1).

Concerning the quotient of an F-normed (or normed) linear space, we have the following:

Proposition 3.6.8.

Let \(E = (E,\|\cdot \|)\) be an F-normed (or normed) linear space and F a closed linear subspace of E. Then, the quotient space E∕F has the admissible F-norm (or norm) \(\vert \!\vert \!\vert \xi \vert \!\vert \!\vert =\inf _{x\in \xi }\|x\|\) , where if \(\|\cdot \|\) is complete then so is |​|​|⋅|​|​|. Hence, if E is (completely) metrizable or (completely) normable then so is E∕F.

Proof.

It is easy to see that | ​ | ​ |  ⋅ | ​ | ​ | is an F-norm (or a norm). It should be noted that the closedness of F is necessary for condition (F 2). Let q : E → E ∕ F be the natural linear surjection, i.e., \(q(x) = x + F\). Then, for each \(\varepsilon > 0\),

$${\big\{q(x)\bigm |\|x\| <\varepsilon \big\}=\big\{\xi \in E/F\bigm |\vert \!\vert \!\vert \xi \vert \!\vert \!\vert <\varepsilon \big\},}$$

which means that q : E → (E ∕ F, | ​ | ​ |  ⋅ | ​ | ​ | ) is open and continuous, so it is a quotient map. Then, | ​ | ​ |  ⋅ | ​ | ​ | induces the quotient topology, i.e., | ​ | ​ |  ⋅ | ​ | ​ | is admissible for the quotient topology. It also follows that if E is locally convex then so is E ∕ F.

We should remark the following fact:

Fact.

\(\vert \!\vert \!\vert \xi -\xi ^\prime\vert \!\vert \!\vert =\inf \big\{\| x - x^\prime\|\bigm |x^\prime \in \xi ^\prime\big\}\) for each x ∈ξ.

Indeed, the left side is not greater than the right side by definition. For each x, y ∈ ξ and y′ ∈ ξ′,

$${\|y - y^\prime\| =\| x - (y^\prime + x - y)\| \geq \inf \big\{\| x - x^\prime\|\bigm |x^\prime \in \xi ^\prime\big\}}$$

because \(y^\prime + x - y \in \xi ^\prime\). Thus, the left side is not less than the right side.

We shall show that if \(\|\cdot \|\) is complete then so is | ​ | ​ |  ⋅ | ​ | ​ | . To this end, it suffices to prove that each Cauchy sequence \((\xi _{i})_{i\in \mathbb{N}}\) in E ∕ F contains a convergent subsequence. Thus, replacing \((\xi _{i})_{i\in \mathbb{N}}\) with a subsequence, we may assume that \(\vert \!\vert \!\vert \xi _{i} -\xi _{i+1}\vert \!\vert \!\vert < {2}^{-i}\) for each \(i \in \mathbb{N}\). Using the fact above, we can inductively choose \(x_{i} \in \xi _{i}\) so that \(\|x_{i} - x_{i+1}\| < {2}^{-i}\). Then, \((x_{i})_{i\in \mathbb{N}}\) is a Cauchy sequence in E, which converges to some x ∈ E. It follows that \((\xi _{i})_{i\in \mathbb{N}}\) converges to x + F.

In the above, E ∕ F is called the quotient \(\boldsymbol{F}\)-normed (or normed) linear space with the F-norm (or norm) | ​ | ​ |  ⋅ | ​ | ​ | , which is called the quotient \(\boldsymbol{F}\)-norm (or norm). Note that E ∕ F is locally convex if so is E. If E is a Banach space, a Fréchet space, or an F-space, then so is E ∕ F for any closed linear subspace F of E.
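
For a Euclidean norm, the quotient norm of a coset x + F is the distance from x to F, which can be computed by orthogonal projection. The following sketch is only an illustration with \(E = {\mathbb{R}}^{3}\) and a particular one-dimensional F (both chosen for this example), not part of the text.

    import numpy as np

    # E = R^3 with the Euclidean norm, F = span{(1, 1, 0)} (a closed linear subspace).
    # The quotient norm of the coset x + F is inf_{y in F} ||x - y|| = ||x - proj_F(x)||.

    def quotient_norm(x, basis_of_F):
        x = np.asarray(x, dtype=float)
        B = np.asarray(basis_of_F, dtype=float).T          # columns span F
        # least-squares solution gives the orthogonal projection of x onto F
        coeffs, *_ = np.linalg.lstsq(B, x, rcond=None)
        return np.linalg.norm(x - B @ coeffs)

    F_basis = [[1.0, 1.0, 0.0]]
    x = [2.0, 0.0, 3.0]
    print(quotient_norm(x, F_basis))   # distance from x to the line spanned by (1,1,0)
    # Adding any element of F to x leaves the value unchanged, as a coset norm should:
    print(quotient_norm(np.add(x, [5.0, 5.0, 0.0]), F_basis))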

Recall that A ⊂ E is topologically bounded if, for each neighborhood U of 0 ∈ E, there exists some \(r \in \mathbb{R}\) such that A ⊂ rU.

Theorem 3.6.9.

A topological linear space E is normable if and only if there is a topologically bounded convex neighborhood of 0 ∈ E.

Proof.

The “only if” part is trivial. To see the “if” part, let V be a topologically bounded convex neighborhood of 0 ∈ E. Then, \(W = V \cap (-V )\) is a topologically bounded circled convex neighborhood of 0 ∈ E. Hence, the Minkowski functional p W is a norm on E by Proposition 3.3.4. By Corollary 3.4.12,

$${\big\{x \in E\bigm |p_{W}(x) <\varepsilon \big\}=\varepsilon p_{W}^{-1}([0,1)) =\varepsilon \mathrm{int}\,W\;\mbox{ for each $\varepsilon > 0$.}}$$

For each neighborhood U of 0 ∈ E, we can choose r > 0 such that W ⊂ rU. Then,

$${\big\{x \in E\bigm |p_{W}(x) < {r}^{-1}\big\} = {r}^{-1}\mathrm{int}\,W \subset {r}^{-1}W \subset U,}$$

hence p W induces the topology for E.
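
The Minkowski functional appearing in this proof can be evaluated directly in a finite-dimensional toy case. The sketch below (with \(E = {\mathbb{R}}^{2}\) and a particular topologically bounded circled convex neighborhood W, both illustrative choices, not from the text) computes \(p_{W}(x) =\inf \{ t > 0\mid x \in tW\}\) by bisection and checks the norm properties numerically.

    import numpy as np

    # W = {(u, v) : |u| + 2|v| <= 1} is a bounded, circled (symmetric), convex
    # neighborhood of 0 in R^2, so p_W should be a norm; here p_W(x) = |x1| + 2|x2|.

    def in_W(p):
        return abs(p[0]) + 2.0 * abs(p[1]) <= 1.0

    def minkowski(x, tol=1e-10):
        # p_W(x) = inf{t > 0 : x in tW}, computed by bisection on t
        x = np.asarray(x, dtype=float)
        if np.allclose(x, 0.0):
            return 0.0
        lo, hi = 0.0, 1.0
        while not in_W(x / hi):          # enlarge hi until x lies in hi*W
            hi *= 2.0
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            if mid > 0 and in_W(x / mid):
                hi = mid
            else:
                lo = mid
        return hi

    x, y = np.array([1.0, 1.0]), np.array([-0.5, 2.0])
    print(minkowski(x), abs(x[0]) + 2 * abs(x[1]))             # both approximately 3
    print(minkowski(x + y) <= minkowski(x) + minkowski(y))     # triangle inequality
    print(np.isclose(minkowski(2.5 * x), 2.5 * minkowski(x)))  # positive homogeneity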

For the local convexity, we have the following:

Theorem 3.6.10.

A (metrizable) topological linear space E is locally convex if and only if E is linearly homeomorphic to a linear subspace of the (countable) product \(\prod _{\lambda \in \Lambda }E_{\lambda }\) of normed linear spaces E λ .

Proof.

As is easily observed, the product of locally convex topological linear spaces is locally convex, and so is any linear subspace of a locally convex topological linear space. Moreover, the countable product of metrizable spaces is metrizable. Then, the “if” part follows.

We show the “only if” part. By the local convexity, E has a neighborhood basis \(\{V _{\lambda }\mid \lambda \in \Lambda \}\) of 0 ∈ E consisting of circled closed convex sets (cf. Proposition 3.4.2), where \(\mathrm{card}\,\Lambda = \aleph _{0}\) if E is metrizable (Theorem 3.6.1). For each λ ∈ Λ, let F λ be a maximal linear subspace of E contained in V λ . (The existence of F λ is guaranteed by Zorn’s Lemma.) Then, F λ is closed in E, because cl F λ is a linear subspace contained in V λ and hence equals F λ by maximality. Let q λ : E → E ∕ F λ be the natural linear surjection, where we do not give E ∕ F λ the quotient topology but instead define a norm on it as follows.

Observe that q λ (V λ ) is a circled convex set in E ∕ F λ and \(\mathbf{0} \in \mathrm{core}\,q_{\lambda }(V _{\lambda })\). Moreover, \(\mathbb{R}_{+}\xi \not\subset q_{\lambda }(V _{\lambda })\) for each ξ ∈ (E ∕ F λ ) ∖ {0}. Indeed, take x ∈ E ∖ F λ so that q λ (x) = ξ. By the maximality of F λ , \(\mathbb{R}x + F_{\lambda }\not\subset V _{\lambda }\), i.e., tx + y ∉ V λ for some \(t \in \mathbb{R}\) and y ∈ F λ , where we can take t > 0 because V λ is circled. For each z ∈ F λ ,

$${tx + y = \tfrac{1} {2}(2tx + z) + \tfrac{1} {2}(2y - z).}$$

Since \(2y - z \in F_{\lambda } \subset V _{\lambda }\), it follows that 2tx + z ∉ V λ . Then, \(2t\xi = q_{\lambda }(2tx)\not\in q_{\lambda }(V _{\lambda })\).

By Proposition 3.3.4, the Minkowski functional \(p_{\lambda } = p_{q_{\lambda }(V _{\lambda })} : E/F_{\lambda } \rightarrow \mathbb{R}\) for q λ (V λ ) is a norm. Thus, we have a normed linear space \(E_{\lambda } = (E/F_{\lambda },p_{\lambda })\). Observe that

$$\begin{array}{rcl} \mathbf{0} \in \mathrm{int}\,V _{\lambda }& =& \mathrm{core}\,V _{\lambda } \subset q_{\lambda }^{-1}(\mathrm{core}\,q_{\lambda }(V _{\lambda })) {}\\ & =& q_{\lambda }^{-1}\big(p_{ q_{\lambda }(V _{\lambda })}^{-1}([0,1))\big) = {(p_{\lambda }q_{\lambda })}^{-1}([0,1)). {}\\ \end{array}$$

By Proposition 3.4.11, the sublinear functional \(p_{\lambda }q_{\lambda } : E \rightarrow \mathbb{R}\) is continuous, which implies that q λ : E → E λ is continuous.

Let \(h : E \rightarrow \prod _{\lambda \in \Lambda }E_{\lambda }\) be the linear mapFootnote 19 defined by \(h(x) = (q_{\lambda }(x))_{\lambda \in \Lambda }\). If x ≠ 0 ∈ E then x ∉ V λ (so x ∉ F λ ) for some λ ∈ Λ, which implies q λ (x) ≠ 0, hence h(x) ≠ 0. Thus, h is a continuous linear injection. To see that h is an embedding, it suffices to show that

$${h(V _{\lambda }) \supset h(E) \cap \mathrm{ pr}_{\lambda }^{-1}(p_{\lambda }^{-1}([0, \tfrac{1} {2})))\;\mbox{ for each $\lambda \in \Lambda $.}}$$

If \(p_{\lambda }(\mathrm{pr}_{\lambda }h(x)) < \frac{1} {2}\) then

$${q_{\lambda }(2x) =\mathrm{ pr}_{\lambda }h(2x) \in p_{\lambda }^{-1}([0,1)) \subset q_{\lambda }(V _{\lambda }),}$$

hence 2x − y ∈ F λ for some y ∈ V λ . Then, it follows that

$${x = \tfrac{1} {2}(2x - y) + \tfrac{1} {2}y \in V _{\lambda },}$$

so h(x) ∈ h(V λ ). This completes the proof.

Combining Theorem 3.6.10 with Proposition 3.6.5 and Corollary 3.6.7, we have the following:

Corollary 3.6.11.

A topological linear space E is a Fréchet space if and only if E is linearly homeomorphic to a closed linear subspace of the countable product \(\prod _{i\in \mathbb{N}}E_{i}\) of Banach spaces E i .

3.7 The Closed Graph and Open Mapping Theorems

This section is devoted to two very important theorems, the Closed Graph Theorem and the Open Mapping Theorem. They are proved using the Baire Category Theorem 2.5.1.

Theorem 3.7.1 (Closed Graph Theorem).

Let E and F be completely metrizable topological linear spaces and f : E → F be a linear function. If the graph of f is closed in E × F, then f is continuous.

Proof.

It suffices to show the continuity of f at 0 ∈ E. Let d and ρ be admissible complete invariant metrics for E and F, respectively (cf. Proposition 3.6.4).

First, we show that for each \(\varepsilon > 0\), there is some \(\delta (\varepsilon ) > 0\) such that \(\mathrm{B}_{d}(\mathbf{0},\delta (\varepsilon )) \subset \mathrm{cl}\,{f}^{-1}(\mathrm{B}_{\rho }(\mathbf{0},\varepsilon ))\). Since \(F =\bigcup _{n\in \mathbb{N}}n\mathrm{B}_{\rho }(\mathbf{0},\varepsilon /2)\) and f is linear, it follows that \(E =\bigcup _{n\in \mathbb{N}}n{f}^{-1}(\mathrm{B}_{\rho }(\mathbf{0},\varepsilon /2))\). By the Baire Category Theorem 2.5.1, \(\mathrm{int}\,\mathrm{cl}\,n{f}^{-1}(\mathrm{B}_{\rho }(\mathbf{0},\varepsilon /2))\not =\emptyset\) for some \(n \in \mathbb{N}\), which implies that \(\mathrm{int}\,\mathrm{cl}\,{f}^{-1}(\mathrm{B}_{\rho }(\mathbf{0},\varepsilon /2))\not =\emptyset\). Let \(z \in \mathrm{int}\,\mathrm{cl}\,{f}^{-1}(\mathrm{B}_{\rho }(\mathbf{0},\varepsilon /2))\) and choose \(\delta (\varepsilon ) > 0\) so that

$${z +\mathrm{ B}_{d}(\mathbf{0},\delta (\varepsilon )) =\mathrm{ B}_{d}(z,\delta (\varepsilon )) \subset \mathrm{cl}\,{f}^{-1}(\mathrm{B}_{\rho }(\mathbf{0},\varepsilon /2)).}$$

Then, it follows that

$${\mathrm{B}_{d}(\mathbf{0},\delta (\varepsilon )) \subset \mathrm{cl}\,{f}^{-1}(\mathrm{B}_{\rho }(\mathbf{0},\varepsilon /2)) - z \subset \mathrm{cl}\,{f}^{-1}(\mathrm{B}_{\rho }(\mathbf{0},\varepsilon )).}$$

The second inclusion can be proved as follows: for each \(y \in \mathrm{cl}\,{f}^{-1}(\mathrm{B}_{\rho }(\mathbf{0},\varepsilon /2))\) and η > 0, we have \(y^\prime,z^\prime \in {f}^{-1}(\mathrm{B}_{\rho }(\mathbf{0},\varepsilon /2))\) such that d(y, y′), d(z, z′) < η ∕ 2, which implies \(d(y - z,y^\prime - z^\prime) <\eta\). Observe that

$${\rho (f(y^\prime - z^\prime),\mathbf{0}) =\rho (f(y^\prime),f(z^\prime)) \leq \rho (f(y^\prime),\mathbf{0}) +\rho (f(z^\prime),\mathbf{0}) <\varepsilon,}$$

which means \(y^\prime - z^\prime \in {f}^{-1}(\mathrm{B}_{\rho }(\mathbf{0},\varepsilon ))\). Therefore, \(y - z \in \mathrm{cl}\,{f}^{-1}(\mathrm{B}_{\rho }(\mathbf{0},\varepsilon ))\).

Now, for each \(\varepsilon > 0\) and \(x \in \mathrm{ B}_{d}(\mathbf{0},\delta (\varepsilon /2))\), we can inductively choose x n  ∈ E, \(n \in \mathbb{N}\), so that \(x_{n} \in {f}^{-1}(\mathrm{B}_{\rho }(\mathbf{0},{2}^{-n}\varepsilon ))\) and

$${d\big(x,\sum _{i=1}^{n}x_{i}\big) = d\big(x -\sum _{i=1}^{n}x_{i},\ \mathbf{0}\big) <\min \big\{ {2}^{-n},\delta ({2}^{-n-1}\varepsilon )\big\}.}$$

Indeed, if \(x_{1},\ldots,x_{n-1}\) have been chosen, then

$${x -\sum _{i=1}^{n-1}x_{ i} \in \mathrm{ B}_{d}(\mathbf{0},\delta ({2}^{-n}\varepsilon )) \subset \mathrm{cl}\,{f}^{-1}(\mathrm{B}_{\rho }(\mathbf{0},{2}^{-n}\varepsilon )),}$$

hence we can choose \(x_{n} \in {f}^{-1}(\mathrm{B}_{\rho }(\mathbf{0},{2}^{-n}\varepsilon ))\) so that

$${d\big(x,\sum _{i=1}^{n}x_{i}\big) = d\big(x -\sum _{i=1}^{n-1}x_{i},\ x_{n}\big) <\min \big\{ {2}^{-n},\delta ({2}^{-n-1}\varepsilon )\big\}.}$$

Since \(\rho (f(x_{n}),\mathbf{0}) < {2}^{-n}\varepsilon\) for each \(n \in \mathbb{N}\), it follows that \((f(\sum _{i=1}^{n}x_{i}))_{n\in \mathbb{N}}\) is a Cauchy sequence, which converges to some y ∈ F. For each \(n \in \mathbb{N}\),

$${\rho \big(f(\sum _{i=1}^{n}x_{i}),\ \mathbf{0}\big) \leq \sum _{i=1}^{n}\rho (f(x_{ i}),\ \mathbf{0}) <\sum _{ i=1}^{n}{2}^{-i}\varepsilon <\varepsilon,}$$

hence \(y \in \overline{\mathrm{B}}_{\rho }(\mathbf{0},\varepsilon )\). On the other hand, \(\sum _{i=1}^{n}x_{i}\) converges to x. Since the graph of f is closed in E ×F, the point (x, y) belongs to the graph of f, which means \(f(x) = y \in \overline{\mathrm{B}}_{\rho }(\mathbf{0},\varepsilon )\). Thus, we have \(f(\mathrm{B}_{d}(\mathbf{0},\delta (\varepsilon /2))) \subset \overline{\mathrm{B}}_{\rho }(\mathbf{0},\varepsilon )\). Therefore, f is continuous.

Corollary 3.7.2.

Let E and F be completely metrizable topological linear spaces. Then, every continuous linear isomorphism f : E → F is a homeomorphism.

Proof.

In general, the continuity of f implies the closedness of the graph of f in E ×F. By changing coordinates, the graph of f can be regarded as the graph of f  − 1. Then, it follows that the graph of f  − 1 is closed in F ×E, which implies the continuity of f  − 1 by Theorem 3.7.1.

Theorem 3.7.3 (Open Mapping Theorem). 

Let E and F be completely metrizable topological linear spaces. Then, every continuous linear surjection f : E → F is open.

Proof.

Since f  − 1(0) is a closed linear subspace of E, the quotient linear space \(E/{f}^{-1}(\mathbf{0})\) is completely metrizable by Proposition 3.6.8. Then, f induces the continuous linear isomorphism \(\tilde{f} : E/{f}^{-1}(\mathbf{0}) \rightarrow F\). By Corollary 3.7.2, \(\tilde{f}\) is a homeomorphism. Note that the quotient map \(q : E \rightarrow E/{f}^{-1}(\mathbf{0})\) is open. Indeed, for every open set U in E, \({q}^{-1}(q(U)) = U + {f}^{-1}(\mathbf{0})\) is open in E, which means that q(U) is open in \(E/{f}^{-1}(\mathbf{0})\). Hence, f is also open.

Note.

In the above, the Closed Graph Theorem is first proved and then the Open Mapping Theorem is obtained as a corollary of the Closed Graph Theorem. Conversely, we can directly prove the Open Mapping Theorem and then obtain the Closed Graph Theorem as a corollary of the Open Mapping Theorem.

Direct Proof of the Open Mapping Theorem. Let d and ρ be admissible complete invariant metrics for E and F, respectively.

First, we show that for each \(\varepsilon > 0\), there is some \(\delta (\varepsilon ) > 0\) such that \(\mathrm{B}_{\rho }(\mathbf{0},\delta (\varepsilon )) \subset \mathrm{cl}\,f(\mathrm{B}_{d}(\mathbf{0},\varepsilon ))\). Since \(E =\bigcup _{n\in \mathbb{N}}n\mathrm{B}_{d}(\mathbf{0},\varepsilon /2)\), it follows that \(F = f(E) =\bigcup _{n\in \mathbb{N}}nf(\mathrm{B}_{d}(\mathbf{0},\varepsilon /2))\). By the Baire Category Theorem 2.5.1, \(\mathrm{int}\,\mathrm{cl}\,nf(\mathrm{B}_{d}(\mathbf{0},\varepsilon /2))\not =\emptyset\) for some \(n \in \mathbb{N}\), which implies that \(\mathrm{int}\,\mathrm{cl}\,f(\mathrm{B}_{d}(\mathbf{0},\varepsilon /2))\not =\emptyset\). Let \(z \in \mathrm{int}\,\mathrm{cl}\,f(\mathrm{B}_{d}(\mathbf{0},\varepsilon /2))\) and choose \(\delta (\varepsilon ) > 0\) so that

$${z +\mathrm{ B}_{\rho }(\mathbf{0},\delta (\varepsilon )) =\mathrm{ B}_{\rho }(z,\delta (\varepsilon )) \subset \mathrm{cl}\,f(\mathrm{B}_{d}(\mathbf{0},\varepsilon /2)).}$$

Then, it follows that

$${\mathrm{B}_{\rho }(\mathbf{0},\delta (\varepsilon )) \subset \mathrm{cl}\,f(\mathrm{B}_{d}(\mathbf{0},\varepsilon /2)) - z \subset \mathrm{cl}\,f(\mathrm{B}_{d}(\mathbf{0},\varepsilon )),}$$

where the second inclusion can be seen as follows: for \(y \in \mathrm{cl}\,f(\mathrm{B}_{d}(\mathbf{0},\varepsilon /2))\) and η > 0, choose \(y^\prime,z^\prime \in \mathrm{ B}_{d}(\mathbf{0},\varepsilon /2)\) so that ρ(y, f(y′)), ρ(z, f(z′)) < η ∕ 2. Then, observe that \(\rho (y - z,f(y^\prime - z^\prime)) <\eta\) and \(d(y^\prime - z^\prime,\mathbf{0}) = d(y^\prime,z^\prime) <\varepsilon\), hence \(y - z \in \mathrm{cl}\,f(\mathrm{B}_{d}(\mathbf{0},\varepsilon ))\).

Next, we prove that \(\mathrm{cl}\,f(\mathrm{B}_{d}(\mathbf{0},\varepsilon /2)) \subset f(\overline{\mathrm{B}}_{d}(\mathbf{0},\varepsilon ))\) for each \(\varepsilon > 0\). For each \(y \in \mathrm{cl}\,f(\mathrm{B}_{d}(\mathbf{0},\varepsilon /2))\), choose \(x_{1} \in \mathrm{ B}_{d}(\mathbf{0},\varepsilon /2)\) so that

$${\rho (y,f(x_{1})) <\min \{ {2}^{-1},\delta ({2}^{-2}\varepsilon )\}.}$$

By induction, we can choose \(x_{n} \in \mathrm{ B}_{d}(\mathbf{0}, {2}^{-n}\varepsilon )\), \(n \in \mathbb{N}\), so that

$${\rho \big(y,f\big(\sum _{i=1}^{n}x_{i}\big)\big) =\rho \big (y -\sum _{i=1}^{n}f(x_{i}),\ \mathbf{0}\big) <\min \{ {2}^{-n},\delta ({2}^{-n-1}\varepsilon )\}.}$$

Indeed, if \(x_{1},\ldots,x_{n-1}\) have been chosen, then

$${y -\sum _{i=1}^{n-1}f(x_{ i}) \in \mathrm{ B}_{\rho }(\mathbf{0},\delta ({2}^{-n}\varepsilon )) \subset \mathrm{cl}\,f(\mathrm{B}_{ d}(\mathbf{0}, {2}^{-n}\varepsilon )),}$$

hence we can choose \(x_{n} \in \mathrm{ B}_{d}(\mathbf{0}, {2}^{-n}\varepsilon )\) so that

$$\begin{array}{rcl} \rho \big(y,f\big(\sum _{i=1}^{n}x_{i}\big)\big)& =& \rho \big(y -\sum _{i=1}^{n-1}f(x_{i}),\ f(x_{n})\big) {}\\ & <& \min \{{2}^{-n},\delta ({2}^{-n-1}\varepsilon )\}. {}\\ \end{array}$$

Since \((\sum _{i=1}^{n}x_{i})_{n\in \mathbb{N}}\) is a Cauchy sequence in E, it converges to some x ∈ E. On the other hand, \((f(\sum _{i=1}^{n}x_{i}))_{n\in \mathbb{N}}\) converges to y. By the continuity of f, we have f(x) = y. For each \(n \in \mathbb{N}\),

$${d\big(\sum _{i=1}^{n}x_{i},\ \mathbf{0}\big) \leq \sum _{i=1}^{n}d(x_{ i},\mathbf{0}) <\sum _{ i=1}^{n}{2}^{-i}\varepsilon <\varepsilon,}$$

hence \(x \in \overline{\mathrm{B}}_{d}(\mathbf{0},\varepsilon )\). Thus, it follows that \(\mathrm{cl}\,f(\mathrm{B}_{d}(\mathbf{0},\varepsilon /2)) \subset f(\overline{\mathrm{B}}_{d}(\mathbf{0},\varepsilon ))\).

To see that f is open, let U be an open set in E. For each x ∈ U, choose \(\varepsilon > 0\) so that \(\overline{\mathrm{B}}_{d}(\mathbf{0},\varepsilon ) \subset -x + U\). Since

$${\mathrm{B}_{\rho }(\mathbf{0},\delta (\varepsilon /2)) \subset \mathrm{cl}\,f(\mathrm{B}_{d}(\mathbf{0},\varepsilon /2)) \subset f(\overline{\mathrm{B}}_{d}(\mathbf{0},\varepsilon )) \subset -f(x) + f(U),}$$

it follows that \(\mathrm{B}_{\rho }(f(x),\delta (\varepsilon /2)) \subset f(U)\). Hence, f(U) is open in F.

Now, using the Open Mapping Theorem, we shall prove the Closed Graph Theorem.

Proof of the Closed Graph Theorem. The product space E ×F is a completely metrizable topological linear space. The graph G of f is a linear subspace of E ×F that is completely metrizable because it is closed in E ×F. Since p = pr E  | G : G → E is a homeomorphism by the Open Mapping Theorem, \(f =\mathrm{ pr}_{F} \circ {p}^{-1}\) is continuous.

Remark 13.

In both the Closed Graph Theorem and the Open Mapping Theorem, the completeness is essential. Let \(E = (\ell_{1},\|\cdot \|_{2})\), where \(\ell_{1} \subset \ell_{2}\) as sets and \(\|\cdot \|_{2}\) is the norm inherited from \(\ell_{2}\). Then, E is not completely metrizable. Indeed, if it were, it would be closed in \(\ell_{2}\) by Corollary 3.6.7, but E is dense in \(\ell_{2}\) and \(E\not =\ell_{2}\). The linear bijection \(f =\mathrm{ id} : \ell_{1} \rightarrow E\) is continuous, but is not a homeomorphism, so it is not an open map. It follows from the continuity of f that the graph of f is closed in \(\ell_{1} \times E\), hence the graph of f  − 1 is closed in \(E \times \ell_{1}\). However, \({f}^{-1} : E \rightarrow \ell_{1}\) is not continuous.
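
The failure of continuity of f  − 1 in this example can be seen on concrete vectors: the finitely supported sequences with n entries equal to \(n^{-1/2}\) have ℓ 2 -norm 1 but ℓ 1 -norm \(\sqrt{n}\). A small numerical check (illustrative only, with truncated sequences):

    import math

    def l1(x):
        return sum(abs(t) for t in x)

    def l2(x):
        return math.sqrt(sum(t * t for t in x))

    for n in (10, 100, 1000, 10000):
        # y_n has n entries equal to n^{-1/2}; it lies in l_1 and in l_2
        y = [n ** -0.5] * n
        print(n, l2(y), l1(y))
    # l2(y) stays equal to 1 while l1(y) = sqrt(n) -> infinity, so the identity map
    # (l_1, ||.||_2) -> (l_1, ||.||_1) is unbounded on the unit ball, i.e., not continuous.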

3.8 Continuous Selections

Let X and Y be spaces and \(\varphi : X \rightarrow \mathfrak{P}(Y )\) be a set-valued function, where \(\mathfrak{P}(Y )\) is the power set of Y . We denote \(\mathfrak{P}_{0}(Y ) = \mathfrak{P}(Y ) \setminus \{\emptyset\}\). A (continuous) selection for \(\varphi\) is a map f : X → Y such that \(f(x) \in \varphi (x)\) for each x ∈ X. For a topological linear space Y, we denote by Conv (Y ) the set of all non-empty convex sets in Y . In this section, we consider the problem of when a convex-valued function \(\varphi : X \rightarrow \mathrm{Conv}\,(Y )\) has a selection.

It is said that \(\varphi : X \rightarrow \mathfrak{P}(Y )\) is lower semi-continuous (l.s.c.) (resp. upper semi-continuous (u.s.c.)) if, for each open set V in Y,

$${\big\{x \in X\bigm |\varphi (x) \cap V \not =\emptyset\big\}\;\big(\text{resp. }\big\{x \in X\bigm |\varphi (x) \subset V \big\}\big)\;\mbox{ is open in $X$;}}$$

equivalently, for each open set V in Y and x 0 ∈ X such that \(\varphi (x_{0}) \cap V \not =\emptyset\) (resp. \(\varphi (x_{0}) \subset V\)), there exists a neighborhood U of x 0 in X such that \(\varphi (x) \cap V \not =\emptyset\) (resp. \(\varphi (x) \subset V\)) for every x ∈ U. We say that \(\varphi\) is continuous if \(\varphi : X \rightarrow \mathfrak{P}(Y )\) is l.s.c. and u.s.c. The continuity of \(\varphi\) coincides with that in the usual sense when \(\mathfrak{P}(Y )\) is regarded as a space with the topology generated by the following sets:

$${{U}^{-} =\big\{ A \in \mathfrak{P}(Y )\bigm |A \cap U\not =\emptyset\big\}\;\text{ and }\;{U}^{+} =\big\{ A \in \mathfrak{P}(Y )\bigm |A \subset U\big\},}$$

where U is non-empty and open in Y . This topology is called the Vietoris topology, where ∅ is isolated because \(\{\emptyset\} = {\emptyset}^{+}\) (\(\emptyset\not\in {U}^{-}\) for any open set U in Y ). The Vietoris topology has an open basis consisting of the following sets: \(\mathrm{V}\,(\emptyset) =\{\emptyset\}\) and

$$\begin{array}{rcl} \mathrm{V}(U_{1},\ldots,U_{n})& =& \big\{A \subset Y \bigm |A \subset \bigcup _{i=1}^{n}U_{i},\ \forall i = 1,\ldots,n,\ A \cap U_{i}\not =\emptyset\big\} {}\\ & =& \bigg(\bigcup _{i=1}^{n}U_{ i}\bigg)^{+} \cap \bigcap _{ i=1}^{n}U_{ i}^{-}, {}\\ \end{array}$$

where \(n \in \mathbb{N}\) and \(U_{1},\ldots,U_{n}\) are open in Y . In fact, \({U}^{-} = \mathrm{V}\,(U,Y )\) and \({U}^{+} = \mathrm{V}\,(U) \cup \mathrm{V}\,(\emptyset)\). The subspace F1(Y ) = {{y}∣y ∈ Y } of \(\mathfrak{P}_{0}(Y )\) consisting of all singletons is homeomorphic to Y because \({U}^{+} \cap \mathrm{ F}_{1}(Y ) = {U}^{-}\cap \mathrm{ F}_{1}(Y ) =\mathrm{ F}_{1}(U)\) for each open set U in Y . It should be noted that \(\mathfrak{P}_{0}(Y )\) with the Vietoris topology is not T 1 in general.

For example, the space \(\mathfrak{P}_{0}(\mathbf{I})\) is not T 1. Indeed, for any neighborhood \(\mathcal{U}\) of \(\mathbf{I} \in \mathfrak{P}_{0}(\mathbf{I})\), there are open sets \(U_{1},\ldots,U_{n}\) in I such that \(\mathbf{I} \in \mathrm{V}\,(U_{1},\ldots,U_{n}) \subset \mathcal{U}\). Then, \(D \in \mathrm{V}\,(U_{1},\ldots,U_{n}) \subset \mathcal{U}\) for every dense subset D ⊂ I. In particular, \(\mathbf{I} \cap \mathbb{Q} \in \mathcal{U}\).

The subspace Comp(Y ) of \(\mathfrak{P}(Y )\) consisting of all non-empty compact sets is Hausdorff.Footnote 20 Indeed, for each A ≠ B ∈ Comp(Y ), we may assume that B ∖ A ≠ ∅. Take y 0 ∈ B ∖ A. Because of the compactness of A, we have disjoint open sets U and V in Y such that A ⊂ U and y 0 ∈ V . Then, A ∈ U  + , B ∈ V  − , and \({U}^{+} \cap {V }^{-} = \emptyset\). It will be proved that Comp(Y ) is metrizable if Y is metrizable (Proposition 5.12.4). Moreover, Cld(Y ) is metrizable if and only if Y is compact and metrizable (cf. Note after Proposition 5.12.4).

By the same argument as above, it follows that if Y is regular then the subspace Cld(Y ) of \(\mathfrak{P}(Y )\) consisting of all non-empty closed sets is Hausdorff. One should note that the converse is also true, that is, if Cld(Y ) is Hausdorff then Y is regular. When Y is not regular, we have a closed set A ⊂ Y and y 0 ∈ Y ∖ A such that if U and V are open sets with A ⊂ U and y 0 ∈ V then U ∩ V ≠ . Let B = A ∪{ y 0} ∈ Cld(Y ) and let \(U_{1},\ldots,U_{n}\), \(U^\prime_{1},\ldots,U^\prime_{n^\prime}\) be open sets in Y such that

$${A \in \mathrm{V}\,(U_{1},\ldots,U_{n})\;\text{ and }\;B \in \mathrm{V}\,(U^\prime_{1},\ldots,U^\prime_{n^\prime}).}$$

Let \(U_{0} =\bigcap \{ U^\prime_{i}\mid U^\prime_{i} \cap A = \emptyset\}\). Since B ∈ V (U′ 1 ,…,U′ n′ ), each U′ i with U′ i ∩ A = ∅ contains y 0, hence y 0 ∈ U 0. Now U 0 is an open set containing y 0 and \(\bigcup _{i=1}^{n}U_{i}\) is an open set containing A, so by the choice of A and y 0 we have \(y_{1} \in U_{0} \cap \bigcup _{i=1}^{n}U_{i}\). It follows that

$${A \cup \{ y_{1}\} \in \mathrm{V}\,(U_{1},\ldots,U_{n}) \cap \mathrm{V}\,(U^\prime_{1},\ldots,U^\prime_{n^\prime}).}$$

Thus, Cld(Y ) is not Hausdorff.

Proposition 3.8.1.

For a function g : Y → X, the set-valued function \({g}^{-1} : X \rightarrow \mathfrak{P}(Y )\) is l.s.c. (resp. u.s.c.) if and only if g is open (resp. closed).

Proof.

This follows from the fact that, for V ⊂ Y,

$$\begin{array}{rcl} \big\{x \in X\bigm |{g}^{-1}(x) \cap V \not =\emptyset\big\}& =& g(V )\quad \text{and} {}\\ \big\{x \in X\bigm |{g}^{-1}(x) \subset V \big\}& =& X \setminus \big\{ x \in X\bigm |{g}^{-1}(x) \cap (Y \setminus V )\not =\emptyset\big\} {}\\ & =& X \setminus g(Y \setminus V ).\qquad \qquad \qquad \qquad \qquad \qquad \ \ \ \ \square {}\\ \end{array}$$

Because of the following proposition, we consider the selection problem for l.s.c. set-valued functions.

Proposition 3.8.2.

Let \(\varphi : X \rightarrow \mathfrak{P}_{0}(Y )\) be a set-valued function. Assume that, for each x 0 ∈ X and \(y_{0} \in \varphi (x_{0})\) , there exists a neighborhood U of x 0 in X and a selection f : U → Y for \(\varphi \vert U\) such that f(x 0 ) = y 0 . Then, \(\varphi\) is l.s.c.

Proof.

Let V be an open set in Y and x 0 ∈ X such that \(\varphi (x_{0}) \cap V \not =\emptyset\). Take any \(y_{0} \in \varphi (x_{0}) \cap V\). From the assumption, there is a neighborhood U of x 0 in X with a selection f : U → Y for \(\varphi \vert U\) such that \(f(x_{0}) = y_{0}\). Then, f  − 1(V ) is a neighborhood of x 0 in X and \(f(x) \in \varphi (x) \cap V\) for each x ∈ f  − 1(V ).

Lemma 3.8.3.

Let \(\varphi,\psi : X \rightarrow \mathfrak{P}(Y )\) be set-valued functions such that \(\mathrm{cl}\,\varphi (x) = \mathrm{cl}\,\psi (x)\) for each x ∈ X. If \(\varphi\) is l.s.c. then so is ψ.

Sketch of Proof. This follows from the fact that, for each open set V in Y and B ⊂ Y, \(V \cap B\not =\emptyset\) if and only if \(V \cap \mathrm{cl}\,B\not =\emptyset\).

Lemma 3.8.4.

Let \(\varphi : X \rightarrow \mathfrak{P}(Y )\) be l.s.c., A be a closed set in X, and f : A → Y be a selection for \(\varphi \vert A\) . Define \(\psi : X \rightarrow \mathfrak{P}(Y )\) by

$${\psi (x) = \left \{\begin{array}{@{}l@{\quad }l@{}} \{f(x)\}\quad &\text{if }\;x \in A,\\ \varphi (x) \quad &\text{otherwise}. \end{array} \right .}$$

Then, ψ is also l.s.c.

Proof.

For each open set V in Y, f  − 1(V ) is open in A and

$${{f}^{-1}(V ) \subset \big\{ x \in X\bigm |\varphi (x) \cap V \not =\emptyset\big\},}$$

where the latter set is open in X because \(\varphi\) is l.s.c. Then, we can choose an open set U in X so that \({f}^{-1}(V ) = U \cap A\) and \(U \subset \{ x \in X\mid \varphi (x) \cap V \not =\emptyset\}\). Observe that

$${\big\{x \in X\bigm |\psi (x) \cap V \not =\emptyset\big\} = U \cup \big (\big\{x \in X\bigm |\varphi (x) \cap V \not =\emptyset\big\}\setminus A\big).}$$

Thus, it follows that ψ is l.s.c.

For each W ⊂ Y 2 and y 0 ∈ Y, we denote

$${W(y_{0}) =\big\{ y \in Y \bigm |(y_{0},y) \in W\big\}.}$$

If W is a neighborhood of the diagonal \(\Delta _{Y } =\{ (y,y)\mid y \in Y \}\) in Y 2, then W(y 0) is a neighborhood of y 0 in Y .

Lemma 3.8.5.

Let \(\varphi : X \rightarrow \mathfrak{P}(Y )\) be l.s.c., f : X → Y be a map, and W be a neighborhood of Δ Y in Y 2 . Define a set-valued function \(\psi : X \rightarrow \mathfrak{P}(Y )\) by \(\psi (x) =\varphi (x) \cap W(f(x))\) for each x ∈ X. Then, ψ is l.s.c.

Proof.

Let V be an open set in Y and x 0 ∈ X such that \(\psi (x_{0}) \cap V \not =\emptyset\). Take any \(y_{0} \in \varphi (x_{0}) \cap W(f(x_{0})) \cap V\). Since (f(x 0), y 0) ∈ W, there are open sets V 1 and V 2 in Y such that \((f(x_{0}),y_{0}) \in V _{1} \times V _{2} \subset W\). Then, x 0 has the following open neighborhood in X:

$${U = {f}^{-1}(V _{ 1}) \cap \big\{ x \in X\bigm |\varphi (x) \cap V _{2} \cap V \not =\emptyset\big\}.}$$

For each x ∈ U, we have \(y \in \varphi (x) \cap V _{2} \cap V\). Since \((f(x),y) \in V _{1} \times V _{2} \subset W\), it follows that \(y \in \varphi (x) \cap W(f(x)) \cap V\), hence ψ(x) ∩ V ≠ . Therefore, ψ is l.s.c.

Let E be a linear space. The set of all non-empty convex sets in E is denoted by Conv (E). Recall that ⟨A⟩ denotes the convex hull of A ⊂ E.

Lemma 3.8.6.

Let E be a topological linear space and \(\varphi : X \rightarrow \mathfrak{P}_{0}(E)\) be an l.s.c. set-valued function. Define a convex-valued function ψ : X →Conv  (E) by \(\psi (x) =\langle \varphi (x)\rangle\) for each x ∈ X. Then, ψ is also l.s.c.

Proof.

Let V be an open set in E and x 0 ∈ X such that ψ(x 0) ∩ V ≠ . Choose any \(y_{0} =\sum _{ i=1}^{n}t_{i}y_{i} \in \psi (x_{0}) \cap V\), where \(y_{1},\ldots,y_{n} \in \varphi (x_{0})\) and \(t_{1},\ldots,t_{n} \geq 0\) with \(\sum _{i=1}^{n}t_{i} = 1\). Then, each y i has an open neighborhood V i such that \(t_{1}V _{1} + \cdots + t_{n}V _{n} \subset V\). Since \(\varphi\) is l.s.c.,

$${U =\bigcap _{ i=1}^{n}\big\{x \in X\bigm |\varphi (x) \cap V _{ i}\not =\emptyset\big\}}$$

is an open neighborhood of x 0 in X. For each x ∈ U, let \(z_{i} \in \varphi (x) \cap V _{i}\), \(i = 1,\ldots,n\). Then, \(\sum _{i=1}^{n}t_{i}z_{i} \in \psi (x) \cap V\), hence ψ(x) ∩ V ≠ . Therefore, ψ is l.s.c.

Lemma 3.8.7.

Let X be paracompact, E be a topological linear space, and \(\varphi : X \rightarrow \mathrm{Conv}\,(E)\) be an l.s.c. convex-valued function. Then, for each convex open neighborhood V of 0 in E, there exists a map f : X → E such that \(f(x) \in \varphi (x) + V\) for each x ∈ X.

Proof.

For each y ∈ E, let

$${U_{y} =\big\{ x \in X\bigm |\varphi (x) \cap (y - V )\not =\emptyset\big\}.}$$

Since \(\varphi\) is l.s.c., we have \(\mathcal{U} =\{ U_{y}\mid y \in E\} \in \mathrm{cov}\,(X)\). From paracompactness, X has a locally finite partition of unity (f λ ) λ ∈ Λ subordinated to \(\mathcal{U}\). For each λ ∈ Λ, choose y λ  ∈ E so that \(\mathrm{supp}\,f_{\lambda } \subset U_{y_{\lambda }}\). We define a map f : X → E by \(f(x) =\sum _{\lambda \in \Lambda }f_{\lambda }(x)y_{\lambda }\). If f λ (x) ≠ 0 then \(x \in U_{y_{\lambda }}\), which means that \(\varphi (x) \cap (y_{\lambda } - V )\not =\emptyset\), i.e., \(y_{\lambda } \in \varphi (x) + V\). Since each \(\varphi (x) + V\) is convex, \(f(x) \in \varphi (x) + V\).
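
The construction in this proof can be carried out explicitly in a one-dimensional toy case. In the sketch below (not from the text), X = [0,1], \(E = \mathbb{R}\), \(\varphi (x) = [x,x + 1]\), and V = (−0.1, 0.1); the grid of points y λ and the hat-function partition of unity are choices made only for this illustration.

    import numpy as np

    # X = [0,1], E = R, phi(x) = [x, x+1], V = (-0.1, 0.1).
    # U_y = {x : phi(x) meets y - V} = (y - 1.1, y + 0.1); the hat function below is
    # supported in (y - 1, y), which is contained in U_y.

    ys = np.arange(0.1, 2.0, 0.2)          # grid of points y_lambda (illustrative choice)

    def hat(x, y):
        return max(0.0, 0.5 - abs(x - (y - 0.5)))   # support (y - 1, y), a subset of U_y

    def f(x):
        weights = np.array([hat(x, y) for y in ys])
        weights = weights / weights.sum()           # partition of unity evaluated at x
        return float(weights @ ys)                  # f(x) = sum f_lambda(x) * y_lambda

    for x in np.linspace(0.0, 1.0, 6):
        fx = f(x)
        # f(x) should lie in phi(x) + V = (x - 0.1, x + 1.1)
        print(round(x, 2), round(fx, 3), x - 0.1 < fx < x + 1.1)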

Now, we can prove the following:

Theorem 3.8.8 (Michael Selection Theorem). 

Let X be a paracompact space and E = (E,d) be a locally convex metric linear space. Footnote 21 Then, every l.s.c. convex-valued function \(\varphi : X \rightarrow \mathrm{Conv}\,(E)\) admits a selection if each \(\varphi (x)\) is d-complete. Moreover, if A is a closed set in X, then every selection f : A → E for \(\varphi \vert A\) extends to a selection \(\tilde{f} : X \rightarrow E\) for \(\varphi\) .

Proof.

Let \(\{V _{i}\mid i \in \mathbb{N}\}\) be a neighborhood basis of 0 in E such that each V i is symmetric, convex, and \(\mathrm{diam}\,V _{i} < {2}^{-(i+1)}\). By induction, we construct maps f i : X → E, \(i \in \mathbb{N}\), so that, for each x ∈ X and \(i \in \mathbb{N}\),

  1. (1)

    \(f_{i}(x) \in \varphi (x) + V _{i}\) and

  2. (2)

    \(d(f_{i+1}(x),f_{i}(x)) < {2}^{-i}\).

The existence of f 1 is guaranteed by Lemma 3.8.7. Assume we have maps \(f_{1},\ldots,f_{n}\) satisfying (1) and (2). Define ψ : X → Conv (E) by

$${\psi (x) =\varphi (x) \cap (f_{n}(x) + V _{n})\quad \text{for each }x \in X.}$$

Since V n is symmetric, we have \(\psi (x)\not =\emptyset\) by (1). Consider the neighborhood \(W =\{ (x,y) \in {E}^{2}\mid y - x \in V _{n}\}\) of Δ E in E 2. Then, \(W(f_{n}(x)) = f_{n}(x) + V _{n}\). By Lemma 3.8.5, ψ is l.s.c. We can apply Lemma 3.8.7 to obtain a map f n + 1 : X → E such that

$${f_{n+1}(x) \in \psi (x) + V _{n+1}\quad \text{for each }x \in X.}$$

Then, as is easily observed, f n + 1 satisfies (1) and (2). Thus, we have the desired sequence of maps f i , \(i \in \mathbb{N}\).

Using maps f i : X → E, \(i \in \mathbb{N}\), we shall define a selection f : X → E for \(\varphi\). For each x ∈ X and \(i \in \mathbb{N}\), we have \(x_{i} \in \varphi (x)\) such that \(d(f_{i}(x),x_{i}) < {2}^{-(i+1)}\) by (1). Then, \((x_{i})_{i\in \mathbb{N}}\) is Cauchy in \(\varphi (x)\). Since \(\varphi (x)\) is complete, \((x_{i})_{i\in \mathbb{N}}\) converges to \(f(x) \in \varphi (x)\). Thus, we have f : X → E. Note that \((f_{i})_{i\in \mathbb{N}}\) uniformly converges to f, so f is continuous. Hence, f is a selection for \(\varphi\).

For the additional statement, apply Lemma 3.8.4.

Concerning factors of a metric linear space, we have the following:

Corollary 3.8.9 (Bartle–Graves–Michael). 

Let E be a locally convex metric linear space and F be a linear subspace of E that is complete (so a Fréchet space). Then, E ≈ F × E∕F. In particular, \(E \approx \mathbb{R} \times G\) for some metric linear space G.

Proof.

Note that the quotient space E ∕ F is metrizable (Proposition 3.6.8) and the natural map g : E → E ∕ F is open, hence \({g}^{-1} : E/F \rightarrow \mathrm{Conv}\,(E)\) is l.s.c. by Proposition 3.8.1. Since \({g}^{-1}g(x) = x + F\) is complete for each x ∈ E, we apply the Michael Selection Theorem 3.8.8 to obtain a map f : E ∕ F → E that is a selection for g  − 1, i.e., gf = id. Then, x − fg(x) ∈ F for each x ∈ E. Hence, a homeomorphism h : E → F ×(E ∕ F) can be defined by

$${h(x) = (x - fg(x),g(x))\;\mbox{ for each $x \in E$.}}$$

In fact, \({h}^{-1}(y,z) = y + f(z)\) for each (y, z) ∈ F ×E ∕ F.

By combining the Michael Selection Theorem 3.8.8 and the Open Mapping Theorem 3.7.3, the following Bartle–Graves Theorem can be obtained as a corollary:

Theorem 3.8.10 (Bartle–Graves). 

Let E and F be Fréchet spaces and f : E → F be a continuous linear surjection. Then, there is a map g : F → E such that fg = id . Therefore, E ≈ F ×ker  f by the homeomorphism h defined as follows:

$${h(x) = (f(x),x - gf(x))\;\mbox{ for each $x \in E$.}\quad \square }$$

We show that each Banach space is a (topological) factor of \(\ell_{1}(\Gamma )\). To this end, we need the following:

Theorem 3.8.11 (Banach–Mazur, Klee). 

 For every Banach space E, there is a continuous linear surjection q : ℓ 1 (Γ) → E, where card  Γ = dens  E.

Proof.

The closed unit ball B E of E has a dense subset \(\{\mathbf{e}_{\gamma }\mid \gamma \in \Gamma \}\). Since \(\sum _{\gamma \in \Gamma }\vert x(\gamma )\vert =\| x\| < \infty \) for each \(x \in \ell_{1}(\Gamma )\) and E is complete, we can define a linear map q : \(\ell_{1}(\Gamma ) \rightarrow E\) as follows:

$${q(x) =\sum _{\gamma \in \Gamma }x(\gamma )\mathbf{e}_{\gamma }\;\mbox{ for each $x \in \ell_{1}(\Gamma )$.}}$$

Footnote 22 Since \(\|q(x)\| \leq \sum _{\gamma \in \Gamma }\vert x(\gamma )\vert =\| x\|\), it follows that q is continuous.

To see that q is surjective, it suffices to show \(B_{E} \subset q(\ell_{1}(\Gamma ))\). For each y ∈ B E , we can inductively choose \(\mathbf{e}_{\gamma _{i}}\), \(i \in \mathbb{N}\), so that γ i  ≠ γ j if i ≠ j, and

$$\begin{array}{rcl} & & \|y -\mathbf{e}_{\gamma _{1}}\| < {2}^{-1},\;\|y -\mathbf{e}_{\gamma _{ 1}} - {2}^{-1}\mathbf{e}_{\gamma _{ 2}}\| < {2}^{-2}, {}\\ & & \|y -\mathbf{e}_{\gamma _{1}} - {2}^{-1}\mathbf{e}_{\gamma _{ 2}} - {2}^{-2}\mathbf{e}_{\gamma _{ 3}}\| < {2}^{-3},\;\ldots . {}\\ \end{array}$$

We have x ∈  1(Γ) defined by

$${x(\gamma ) = \left \{\begin{array}{@{}l@{\quad }l@{}} {2}^{1-i}\quad &\text{if }\;\gamma =\gamma _{ i}, \\ 0 \quad &\text{otherwise.} \end{array} \right .}$$

Then, it follows that \(y =\sum _{ i=1}^{\infty }{2}^{1-i}\mathbf{e}_{\gamma _{i}} = q(x)\). This completes the proof.
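
The inductive choice of the \(\mathbf{e}_{\gamma _{i}}\) in this proof is a greedy approximation scheme that can be simulated in a finite-dimensional toy case. In the sketch below (not from the text), \(E = {\mathbb{R}}^{2}\), the dense set {e γ } is replaced by a finite net in the unit ball that is ½-dense, and the requirement that the indices γ i be distinct is ignored; all of these are illustrative simplifications.

    import numpy as np

    rng = np.random.default_rng(0)

    # A finite net in the closed unit ball of R^2 that is (more than) 1/2-dense,
    # standing in for the dense set {e_gamma} of the proof.
    grid = np.array([(u, v) for u in np.arange(-1, 1.01, 0.1)
                            for v in np.arange(-1, 1.01, 0.1)
                     if u * u + v * v <= 1.0])

    y = rng.random(2) - 0.5              # a target point in the unit ball
    y = y / max(1.0, np.linalg.norm(y))

    approx = np.zeros(2)
    for i in range(1, 12):
        residual = y - approx
        target = (2.0 ** (i - 1)) * residual      # lies in the unit ball by induction
        # choose e_{gamma_i} within distance 1/2 of the rescaled residual
        e = grid[np.argmin(np.linalg.norm(grid - target, axis=1))]
        approx = approx + (2.0 ** (1 - i)) * e
        print(i, np.linalg.norm(y - approx))      # error < 2^{-i} at every step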

As a combination of the Bartle–Graves Theorem 3.8.10 and Theorem 3.8.11 above, we have the following:

Corollary 3.8.12.

For any Banach space E, there exists a Banach space F such that E × F ≈ ℓ 1 (Γ), where card  Γ = dens  E.

In the Michael Selection Theorem 3.8.8, the paracompactness of X is necessary. Actually, we have the following characterization:

Theorem 3.8.13.

A space X is paracompact if and only if the following holds for any Banach space E: if \(\varphi : X \rightarrow \mathrm{Conv}\,(E)\) is l.s.c. and each \(\varphi (x)\) is closed, then \(\varphi\) has a selection.

Proof.

Since the “only if” part is simply Theorem 3.8.8, it suffices to prove the “if” part. For each \(\mathcal{U}\in \mathrm{cov}\,(X)\), we define \(\varphi : X \rightarrow \mathfrak{P}_{0}(\ell_{1}(\mathcal{U}))\) as follows:

$${\varphi (x) =\big\{ z \in \ell_{1}(\mathcal{U})\bigm |\|z\| = 1,\ \forall U \in \mathcal{U},\ z(U) \geq 0,z(U) = 0\;\mbox{ if $x\not\in U$}\big\}.}$$

Clearly, each \(\varphi (x)\) is a closed convex set.

To see that \(\varphi\) is l.s.c., let W be an open set in \(\ell_{1}(\mathcal{U})\) and \(z \in \varphi (x) \cap W\). Choose δ > 0 so that B(z, 2δ) ⊂ W. Then, we have \(V _{1},\ldots,V _{n} \in \mathcal{U}[x]\) such that \(\sum _{i=1}^{n}z(V _{i}) > 1-\delta\), where \(\bigcap _{i=1}^{n}V _{i}\) is a neighborhood of x in X. We define \(z^\prime \in \ell_{1}(\mathcal{U})\) as follows:

$${z^\prime(V _{i}) = \frac{z(V _{i})} {\sum _{j=1}^{n}z(V _{j})}\;\text{ and }\;z^\prime(U) = 0\;\mbox{ for $U\not =V _{1},\ldots,V _{n}$.}}$$

It is easy to see that \(z^\prime \in \varphi (x^\prime) \cap W\) for every \(x^\prime \in \bigcap _{i=1}^{n}V _{i}\). Thus, \(\varphi\) is l.s.c.

By the assumption, \(\varphi\) has a selection \(f : X \rightarrow \ell_{1}(\mathcal{U})\). For each \(U \in \mathcal{U}\), let f U : X → I be the map defined by f U (x) = f(x)(U) for x ∈ X. Then, \((f_{U})_{U\in \mathcal{U}}\) is a partition of unity such that \(f_{U}^{-1}((0,1]) \subset U\) for every \(U \in \mathcal{U}\). The result follows from Theorem 2.7.5.

Remark 14.

Let \(g,h : X \rightarrow \mathbb{R}\) be real-valued functions on a paracompact space X such that g is u.s.c., h is l.s.c., and g(x) ≤ h(x) for each x ∈ X. We define the convex-valued function \(\varphi : X \rightarrow \mathrm{Conv}\,(\mathbb{R})\) by \(\varphi (x) = [g(x),h(x)]\) for each x ∈ X. Then, \(\varphi\) is l.s.c. Indeed, for each open set V in \(\mathbb{R}\), suppose \(\varphi (x) \cap V \not =\emptyset\). Take \(y \in \varphi (x) \cap V\) and a < y < b so that [a, b] ⊂ V . Since g is u.s.c. and h is l.s.c., x has a neighborhood U in X such that x′ ∈ U implies g(x′) < b and h(x′) > a. Since g(x′) ≤ h(x′), it follows that

$${\varphi (x^\prime) \cap V \supset [g(x^\prime),h(x^\prime)] \cap [a,b] = [\max \{a,g(x^\prime)\},\min \{b,h(x^\prime)\}]\not =\emptyset.}$$

Now, we can apply the Michael Selection Theorem 3.8.8 to obtain a map \(f : X \rightarrow \mathbb{R}\) such that g(x) ≤ f(x) ≤ h(x) for each x ∈ X. This is analogous to Theorem 2.7.6.

3.9 Free Topological Linear Spaces

The free topological linear space over a space X is a topological linear space L(X) that contains X as a subspace and has the following extension property:

  • (LE) For an arbitrary topological linear space F, every map f : X → F of X uniquely extends to a linear mapFootnote 23 \(\tilde{f} : L(X) \rightarrow F\).

If such a space L(X) exists, then it is uniquely determined up to linear homeomorphism, that is, if E is a topological linear space that contains X and has the property (LE), then E is linearly homeomorphic to L(X).

Indeed, there exist linear maps \(\varphi : L(X) \rightarrow E\) and ψ : E → L(X) such that \(\varphi \vert X =\psi \vert X =\mathrm{ id}_{X}\). Since id L(X) is a linear map extending id X , it follows from the uniqueness that \(\psi \varphi =\mathrm{ id}_{L(X)}\). Similarly, we have \(\varphi \psi =\mathrm{ id}_{E}\). Therefore, \(\varphi\) is a linear homeomorphism with \(\psi {=\varphi }^{-1}\).

Lemma 3.9.1.

If X is a Tychonoff space,

  1. (1)

    X is a Hamel basis for L(X);

  2. (2)

    L(X) is regular.

Proof.

(1): First, let F be the linear span of X. Applying (LE), we have a linear map r : L(X) → F such that r | X = id X . Since r : L(X) → L(X) is a linear map extending id X , we have r = id L(X), which implies F = L(X), that is, L(X) is generated by X.

To see that X is linearly independent in L(X), let \(x_{1},\ldots,x_{n} \in X\), where \(x_{i}\not =x_{j}\) if i ≠ j. For each \(i = 1,\ldots,n\), there is a map f i : X → I such that \(f_{i}(x_{i}) = 1\) and f i (x j ) = 0 for j ≠ i. Let \(f : X \rightarrow {\mathbb{R}}^{n}\) be the map defined by \(f(x) = (f_{1}(x),\ldots,f_{n}(x))\). Then, by (LE), f extends to a linear map \(\tilde{f} : L(X) \rightarrow {\mathbb{R}}^{n}\), where \(\tilde{f}(x_{i}) = f(x_{i}) = \mathbf{e}_{i}\) for each \(i = 1,\ldots,n\). Since \(\mathbf{e}_{1},\ldots,\mathbf{e}_{n}\) is linearly independent in \({\mathbb{R}}^{n}\), it follows that \(x_{1},\ldots,x_{n} \in X\) is linearly independent in L(X).

(2): Due to the Fact in Sect. 3.4 and Proposition 3.4.2, it suffices to show that {0} is closed in L(X). Each z ∈ L(X) ∖ {0} can be uniquely represented as follows:

$${z =\sum _{ i=1}^{n}t_{ i}x_{i},\ x_{i} \in X,\ t_{i} \in \mathbb{R} \setminus \{ 0\},}$$

where \(x_{i}\not =x_{j}\) if i ≠ j. There is a map f : X → I such that f(x 1) = 1 and f(x i ) = 0 for each \(i = 2,\ldots,n\). By (LE), f extends to a linear map \(\tilde{f} : L(X) \rightarrow \mathbb{R}\). Then, \(\tilde{f}(z) = t_{1}f(x_{1}) = t_{1}\not =0 =\tilde{ f}(\mathbf{0})\). Hence, \(\tilde{{f}}^{-1}(\mathbb{R} \setminus \{ 0\})\) is an open neighborhood of z in L(X) that misses 0.

Remark 15.

In the definition of a free topological linear space L(X), one may specify a map η : X → L(X) instead of assuming X ⊂ L(X) and replace the property (LE) with the following universality:

  1. (*)

    For each map f : X → F of X to an arbitrary topological linear space F, there exists a unique linear map \(\tilde{f} : L(X) \rightarrow F\) such that \(\tilde{f}\eta = f\).

Then, we can show that η is an embedding if X is a Tychonoff space.

To see that η is injective, let x ≠ y ∈ X. Then, there is a map f : X → I with f(x) = 0 and f(y) = 1. By (*), we have a linear map \(\tilde{f} : L(X) \rightarrow \mathbb{R}\) such that \(\tilde{f}\eta = f\). Since \(\tilde{f}\eta (x) = 0\not =1 =\tilde{ f}\eta (y)\), it follows that η(x) ≠ η(y).

To show that η : X → η(X) is open, let U be an open set in X. For each x ∈ U, there is a map g : X → I such that g(x) = 0 and g(X ∖ U) = 1. By (*), we have a linear map \(\tilde{g} : L(X) \rightarrow \mathbb{R}\) such that \(\tilde{g}\eta = g\). Then, \(V =\tilde{ {g}}^{-1}((-\frac{1} {2}, \frac{1} {2}))\) is an open neighborhood of η(x) in L(X). Since \({\eta }^{-1}(V ) = {g}^{-1}([0, \frac{1} {2})) \subset U\), it follows that V ∩ η(X) ⊂ η(U), hence η(U) is a neighborhood of η(x) in η(X). This means that η(U) is open in η(X). Thus, η : X → η(X) is open.

Since η is an embedding, X can be identified with η(X), which is a subspace of L(X). Then, (*) is equivalent to (LE). Here, it should be noted that the uniqueness of \(\tilde{f}\) in (*) is not used to prove that η is an embedding. Moreover, the linear map \(\tilde{f}\) in (*) is unique if and only if L(X) is generated by η(X). (For the “only if” part, refer to the proof of Lemma 3.9.1(1).)

Theorem 3.9.2.

For every Tychonoff space X, there exists the free topological linear space L(X) over X.

Proof.

There exists a collection \(\mathcal{F} =\{ f_{\lambda } : X \rightarrow F_{\lambda }\mid \lambda \in \Lambda \}\) such that, for an arbitrary topological linear space F and each continuous map f : X → F, there exist λ ∈ Λ and a linear embedding \(\varphi : F_{\lambda } \rightarrow F\) such that \(\varphi f_{\lambda } = f\).

Indeed, for each cardinal τ ≤ card X, let \(\mathfrak{T}_{\tau }\) be the set of all topologies \(\mathcal{T}\) on \(\mathbb{R}_{f}^{\tau }\) such that \((\mathbb{R}_{f}^{\tau },\mathcal{T} )\) is a topological linear space. Then, the desired collection is

$${\mathcal{F} =\bigcup _{\tau \leq \mathrm{card}\,X}\bigcup _{\mathcal{T}\in \mathfrak{T}_{\tau }}\mathrm{C}(X, (\mathbb{R}_{f}^{\tau },\mathcal{T} )).}$$

Consequently, for an arbitrary topological linear space F and each continuous map f : X → F, the linear span F′ of f(X) has algebraic dimension τ ≤ card f(X) ≤ card X, hence F′ is linearly homeomorphic to \((\mathbb{R}_{f}^{\tau },\mathcal{T} )\) for some \(\mathcal{T} \in \mathfrak{T}_{\tau }\). Let \(\psi : F^\prime \rightarrow (\mathbb{R}_{f}^{\tau },\mathcal{T} )\) be a linear homeomorphism. Accordingly, we have \(g =\psi f \in \mathrm{ C}(X, (\mathbb{R}_{f}^{\tau },\mathcal{T} ))\), and thus \(f {=\psi }^{-1}g\).

The product space \(\prod _{\lambda \in \Lambda }F_{\lambda }\) is a topological linear space. Let \(\eta : X \rightarrow \prod _{\lambda \in \Lambda }F_{\lambda }\) be the map defined by \(\eta (x) = (f_{\lambda }(x))_{\lambda \in \Lambda }\). We define L(X) as the linear span of η(X) in \(\prod _{\lambda \in \Lambda }F_{\lambda }\). Then, (L(X), η) satisfies the condition (*) in the above remark. In fact, for an arbitrary topological linear space F and each map f : X → F, there exists λ ∈ Λ and a linear embedding \(\varphi : F_{\lambda } \rightarrow F\) such that \(\varphi f_{\lambda } = f\). Consequently, we have a linear map \(\tilde{f} =\varphi \mathrm{ pr}_{\lambda }\vert L(X) : L(X) \rightarrow F\) and

$${\tilde{f}\eta (x) =\varphi \mathrm{ pr}_{\lambda }\eta (x) =\varphi f_{\lambda }(x) = f(x)\;\mbox{ for every $x \in X$.}}$$

Because L(X) is generated by η(X), a linear map \(\tilde{f} : L(X) \rightarrow F\) is uniquely determined by the condition that \(\tilde{f}\eta = f\). As observed in the above remark, η is an embedding, hence X can be identified with η(X). Then, L(X) satisfies (LE), i.e., L(X) is the free topological linear space over X.

Let X and Y be Tychonoff spaces. For each map f : X → Y, we have a unique linear map \(f_{\natural } : L(X) \rightarrow L(Y )\) that is an extension of f by (LE). This is functorial, i.e., \((gf)_{\natural } = g_{\natural }f_{\natural }\) for every pair of maps f : X → Y and g : Y → Z, and \(\mathrm{id}_{L(X)} = (\mathrm{id}_{X})_{\natural }\). Accordingly, we have a covariant functor from the category of Tychonoff spaces into the category of topological linear spaces. Consequently, every homeomorphism f : X → Y extends to a linear homeomorphism \(f_{\natural } : L(X) \rightarrow L(Y )\).

In Sect. 7.12, we will construct a metrizable linear space that is not an absolute extensor for metrizable spaces. The free topological linear space L(X) over a compactum X has an important role in the construction. The topological and geometrical structures of L(X) will be studied in Sect. 7.11.

3.9.1 Notes for Chap. 3

There are lots of good textbooks for studying topological linear spaces. The following classical book of Köthe is still a very good source on this subject. The textbook by Kelley and Namioka is also recommended by many people. Besides these two books, the textbook by Day is a good reference for normed linear spaces, as is Valentine’s book for convex sets. Concerning non-locally convex F-spaces and Roberts’ example (a compact convex set with no extreme points), one can refer to the book by Kalton, Peck and Roberts.

  • G. Köthe, Topological Vector Spaces, I, English edition, GMW 159 (Springer-Verlag, New York, 1969)

  • J.L. Kelley and I. Namioka, Linear Topological Spaces, Reprint edition, GTM 36 (Springer-Verlag, New York, 1976)

  • M.M. Day, Normed Linear Spaces, 3rd edition, EMG 21 (Springer-Verlag, Berlin, 1973)

  • F.A. Valentine, Convex Sets (McGraw-Hill Inc., 1964); Reprint of the 1964 original (R.E. Krieger Publ. Co., New York, 1976)

  • N.J. Kalton, N.T. Peck and J.W. Roberts, An F-space Sampler, London Math. Soc. Lecture Note Ser. 89 (Cambridge Univ. Press, Cambridge, 1984)

For a systematic and comprehensive study of continuous selections, refer to the following book by Repovš and Semenov, which is written in an instructive style.

  • D. Repovš and P.V. Semenov, Continuous Selections of Multivalued Mappings, MIA 455 (Kluwer Acad. Publ., Dordrecht, 1998)

In Proposition 3.6.4, the construction of the metric d from d 1 is due to Eidelheit and Mazur [1].

The results of Sect. 3.8 are contained in the first part of Michael’s paper [2], which consists of three parts. For the finite-dimensional case, refer to the second and third parts of [2] (cf. [3]) and the book of Repovš and Semenov. The finite-dimensional case is deeply related to the concept discussed in Sect. 6.11 but will not be treated in this book. The 0-dimensional case will be treated in Sect. 7.2.