15.1 Introduction

Unless explicitly stated otherwise, throughout this paper X will always denote a (real) normed linear space, and C a closed convex set in X. For any two distinct points x, v in X, we define interval notation analogous to that on the real line by

$$\displaystyle\begin{array}{rcl} [x,v]:& =& \{\lambda x + (1-\lambda )v\mid 0 \leq \lambda \leq 1\}, {}\\ \left [x,v\right [:& =& \{\lambda x + (1-\lambda )v\mid 0 <\lambda \leq 1\}, {}\\ \left ]x,v\right ]:& =& \{\lambda x + (1-\lambda )v\mid 0 \leq \lambda < 1\} = \left [v,x\right [,\;\mbox{ and} {}\\ \left ]x,v\right [:& =& \{\lambda x + (1-\lambda )v\mid 0 <\lambda < 1\}. {}\\ \end{array}$$

In other words, [x, v] is just the closed line segment joining x and v, [x, v[ is the same line segment but excluding the end point v, and ]x, v[ is the line segment [x, v] with both end points x and v excluded.

Definition 15.1.

Let x ∈ X. A point v ∈ C is said to be visible to x with respect to C if and only if [x, v] ∩ C = {v} or, equivalently, [x, v[ ∩ C = ∅. The set of all points visible to x with respect to C is denoted by V_C(x).

Thus

$$\displaystyle{ V _{C}(x) =\{ v \in C\mid [x,v] \cap C =\{ v\}\} =\{ v \in C\mid [x,v[\,\cap \,C = \varnothing \}. }$$
(15.1)

Geometrically, one can regard V_C(x) as the “light” that would be cast on the set C by a light source at the point x emanating in all directions. Alternatively, one can regard the set C as an “obstacle” in X and a “robot” located at a point x ∈ X; the directions determined by the intervals [x, v], where v ∈ V_C(x), are then the directions the robot must avoid so as not to collide with the obstacle C.

In this paper we begin a study of visible sets. In Sect. 15.2, we give some characterizations of visible sets (see Lemmas 15.4 and 15.10 and Theorem 15.15 below). We show that the visible set mapping V_C satisfies a translation property just like the well-known metric projection P_C (see Lemma 15.6 below). Recall that the generally set-valued metric projection (or nearest point mapping) P_C is defined on X by

$$\displaystyle{P_{C}(x):=\{ y \in C\mid \|x - y\| =\inf _{c\in C}\|x - c\|\}.}$$

Those closed convex sets C such that the set of visible points to each point not in C is the whole set C are precisely the affine sets (Proposition 15.7). In Sect. 15.3 we study the connection between visible points and best approximations. Finally, in Sect. 15.4 we consider characterizing best approximations to a point in a Hilbert space from a polytope, i.e., the convex hull of a finite set of points.

15.2 Visibility from Convex Sets

The first obvious consequence of the definition of visibility is the following.

Lemma 15.2.

Let C be a closed convex set in X. If x ∈ C, then V_C(x) = {x}.

This lemma shows that the most interesting case is when x ∈ X ∖ C, and the main results to follow actually require this condition as part of their hypotheses. Indeed, when x ∉ C, there are additional useful criteria that characterize visible points. For any set C, let bd C denote the boundary of C.

Unlike the metric projection, the visibility operator is never empty-valued.

Lemma 15.3.

Let C be a closed convex set in X. Then

  1.

    V C (x) ≠ ∅ for each x ∈ X, and

  2.

    V C (x) ⊂ bdC for each x ∈ X ∖ C.

Proof.

  1.

    Let x ∈ X. By Lemma 15.2 we may assume that x ∉ C. Fix any y ∈ C. Then the interval [x,y] contains points in C (e.g., y) and points not in C (e.g., x). Let

    $$\displaystyle{\lambda _{0}:=\sup \{\lambda \in [0,1]\mid \lambda x + (1-\lambda )y \in C\}.}$$

    Since C is closed, it follows that \(v_{0}:=\lambda _{0}x + (1 -\lambda _{0})y \in C\). Hence λ₀ < 1 (since x ∉ C), and [x, v₀] ∩ C = {v₀}. That is, v₀ ∈ V_C(x).

  2.

    Fix any x ∈ X ∖ C. We must show that v ∈ bd C for each v ∈ V_C(x). If not, then there exists some v ∈ V_C(x) such that v ∈ C ∖ bd C. Hence v is in the interior of C, so there is a subinterval [v₀, v] of the interval [x, v], with v₀ ≠ v, which lies in C. Hence [x, v] ∩ C ≠ {v}, a contradiction to v ∈ V_C(x).

Lemma 15.4 (Characterization of visible points).

Let C be a closed convex set in X, x ∈ X ∖ C, and v ∈ C. Then the following statements are equivalent:

  1.

    v is visible to x with respect to C.

  2.

    \(\lambda x + (1-\lambda )v\notin C\) for each 0 < λ ≤ 1.

  3.

    \(\max \{\lambda \in [0,1]\mid \;\lambda x + (1-\lambda )v \in C\} = 0\) .

Proof.

(1) ⇒ (2): If (1) holds, then [x, v[ ∩ C = ∅. Since \([x,v[\,=\{\lambda x + (1-\lambda )v\mid 0 <\lambda \leq 1\}\), (2) follows.

(2) ⇒ (3): Since vC, (3) is an obvious consequence of (2).

(3) ⇒ (1): If (3) holds, then [x, v[ ∩ C = ∅. That is, v ∈ V_C(x).
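Criterion (3) turns visibility into a one-dimensional search: by convexity, the set {λ ∈ [0,1] ∣ λx + (1−λ)v ∈ C} is an interval [0, λ₀], so λ₀ can be located by bisection when C is available only through a membership oracle. A minimal numerical sketch (the oracle, the tolerance, and the unit-square example are our own illustrative choices, not from the text):

```python
def is_visible(x, v, in_C, iters=60, tol=1e-9):
    """Test v in V_C(x) via Lemma 15.4(3), assuming v in C and x not in C.

    By convexity, {lam in [0,1] : lam*x + (1-lam)*v in C} is an interval
    [0, lam0]; bisection keeps membership at `lo` and non-membership at `hi`.
    """
    point = lambda l: tuple(l * xi + (1 - l) * vi for xi, vi in zip(x, v))
    lo, hi = 0.0, 1.0          # point(0) = v is in C, point(1) = x is not
    for _ in range(iters):
        mid = (lo + hi) / 2
        if in_C(point(mid)):
            lo = mid
        else:
            hi = mid
    return lo <= tol           # lam0 = 0 means [x, v[ misses C

# Unit square C = [0,1]^2 viewed from x = (2, 2): the corner (1, 1) is
# visible, while the corner (0, 0) is hidden behind the square.
in_square = lambda p: 0 <= p[0] <= 1 and 0 <= p[1] <= 1
print(is_visible((2, 2), (1, 1), in_square))  # True
print(is_visible((2, 2), (0, 0), in_square))  # False
```

The same bisection recovers the supremum λ₀ used in the proof of Lemma 15.3(1), so it also yields a visible point on any segment [x, y] with y ∈ C.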

Simple examples in the Euclidean plane (e.g., a box) show that V_C(x) is not convex in general, even though C is convex. These simple examples might also seem to indicate that V_C(x) is always closed. However, the following example in three dimensions shows that this is false in general.

Consider the subset of Euclidean 3-space ℓ₂(3) defined by

$$\displaystyle{ C:= (1,0,0) +\mathrm{ cone\,}\{(1,\alpha,\beta ){\mid \alpha }^{2} + {(\beta -1)}^{2} \leq 1\}. }$$
(15.2)

Example 15.5.

The set C defined by (15.2) is a closed convex subset of ℓ₂(3) such that 0 ∉ C and V_C(0) is not closed.

Proof.

The result is geometrically obvious (see Fig. 15.1) by observing that the points (2, sin t, 1 + cos t) are in V_C(0) for each 0 < t < π, but that the limit point (2, 0, 0) (as t → π) is not. However, the formal proof of this fact is a bit lengthy. Clearly, 0 ∉ C since the first component of any element of C is at least 1. We first verify the following claim.

Fig. 15.1 The set C from Example 15.5

Claim.

The points \(v(t):= (2,\,\sin t,\,1 +\cos t)\) are in V C (0) for each 0 < t < π.

Using the classical trig identity \({\sin }^{2}\,t {+\cos }^{2}\,t = 1\), it is clear that v(t) ∈ C for each 0 < t < π. To complete the proof of the claim, it is enough to show that [0, v(t)[ ∩ C = ∅ for each 0 < t < π. By way of contradiction, suppose the claim is false. Then there exists 0 < t₀ < π such that [0, v(t₀)[ ∩ C ≠ ∅. Since 0 ∉ C, it follows that there exists 0 < λ < 1 such that λ v(t₀) ∈ C. That is,

$$\displaystyle\begin{array}{rcl} \lambda (2,\,\sin t_{0},1 +\cos t_{0}) \in C& =& (1,0,0) +\mathrm{ cone\,}\{(1,\alpha,\beta ){\mid \alpha }^{2} + {(\beta -1)}^{2} \leq 1\} {}\\ & =& (1,0,0) + \cup _{\rho \geq 0}\rho \{(1,\alpha,\beta ){\mid \alpha }^{2} + {(\beta -1)}^{2} \leq 1\}. {}\\ \end{array}$$

Since λ sin t₀ ≠ 0, it follows that for some ρ > 0,

$$\displaystyle{ \lambda (2,\,\sin t_{0},1 +\cos t_{0}) = (1,0,0) +\rho (1,\alpha,\beta ) }$$
(15.3)

for some α and β such that

$$\displaystyle{{ \alpha }^{2} + {(\beta -1)}^{2} \leq 1. }$$
(15.4)

By equating the corresponding components in (15.3), we obtain

$$\displaystyle\begin{array}{rcl} 2\lambda = 1+\rho & &{}\end{array}$$
(15.5)
$$\displaystyle\begin{array}{rcl} \lambda \sin t_{0} =\rho \alpha & &{}\end{array}$$
(15.6)
$$\displaystyle\begin{array}{rcl} \lambda (\cos t_{0} + 1) =\rho \beta & &{}\end{array}$$
(15.7)

From (15.5) we deduce that \(\rho = 2\lambda - 1 < 2 - 1 = 1\) and hence that

$$\displaystyle{ 0 <\rho < 1. }$$
(15.8)

Also, from (15.6) and (15.7) we deduce that α = μ sin t₀ and \(\beta =\mu (1 +\cos t_{0})\), where \(\mu:=\lambda /\rho\). Substituting these values for α and β into (15.4), we deduce after some algebra that \(1 \geq {2\mu }^{2}(1 +\cos t_{0}) - 2\mu (1 +\cos t_{0}) + 1\). Subtracting 1 from both sides of this inequality and then dividing the result by the positive number 2μ(1 + cos t₀), we obtain μ ≤ 1, i.e., λ ≤ ρ. From (15.5), it follows that ρ ≥ 1, which contradicts (15.8). This proves the claim.

It remains to note that the limit point \(\lim _{t\rightarrow \pi }v(t) = v(\pi ) = (2,0,0)\) is not in V_C(0). For this, it is enough to note that [0, v(π)[ ∩ C ≠ ∅. And for this, it suffices to show that (3 ∕ 4)v(π) ∈ C. But

$$\displaystyle{\frac{3} {4}v(\pi ) = \left (\dfrac{6} {4},0,0\right ) = (1,0,0) + \frac{1} {2}(1,0,0) \in C.}$$
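The membership test for the set C of (15.2) can be made explicit: p ∈ C iff q := p − (1,0,0) is the zero vector (the apex, ρ = 0) or has q₁ > 0 with (q₂/q₁)² + (q₃/q₁ − 1)² ≤ 1. The following spot-check (tolerances and sample values are our own choices) confirms the two facts used above: points of ]0, v(t)[ stay outside C for 0 < t < π, while (3/4)v(π) ∈ C.

```python
from math import sin, cos

def in_C(p, tol=1e-12):
    # p in (1,0,0) + cone{(1, a, b) : a^2 + (b-1)^2 <= 1} ?
    q = (p[0] - 1.0, p[1], p[2])
    if max(abs(c) for c in q) <= tol:
        return True                      # the apex, rho = 0
    if q[0] <= tol:
        return False                     # every other member has q1 > 0
    a, b = q[1] / q[0], q[2] / q[0]
    return a * a + (b - 1.0) ** 2 <= 1.0 + tol

v = lambda t: (2.0, sin(t), 1.0 + cos(t))

# v(t) lies in C, but lam * v(t) does not for 0 < lam < 1:
# v(t) is visible from the origin.
t = 1.0
assert in_C(v(t))
assert all(not in_C(tuple(l * c for c in v(t))) for l in (0.1, 0.5, 0.9))

# The limit point v(pi) = (2,0,0) is NOT visible: (3/4) * v(pi) lies in C.
assert in_C((1.5, 0.0, 0.0))
```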

The following simple fact will be useful to us. It shows that the visible set mapping V_C satisfies a translation property that is also satisfied by the (generally set-valued) metric projection P_C.

Lemma 15.6.

Let C be a closed convex set and x,y ∈ X. Then

$$\displaystyle{ V _{C}(x) = V _{C+y}(x + y) - y. }$$
(15.9)

Proof.

Let v ∈ C. Note that v ∈ V_C(x) ⇔ [x, v[ ∩ C = ∅ ⇔ \([x + y,v + y[\,\cap \,(C + y) = \varnothing \)\(v + y \in V _{C+y}(x + y)\)\(v \in V _{C+y}(x + y) - y\).

It is natural to ask which closed convex sets C have the property that V_C(x) = C for each x ∉ C. That is, for which sets is the whole set visible from any point outside the set? The next result shows that this is precisely the class of affine sets. Recall that a set A is affine if the line through each pair of points in A lies in A; that is, if the line \(\mathrm{aff\,}\{a_{1},a_{2}\}:=\{\alpha _{1}a_{1} +\alpha _{2}a_{2}\mid \alpha _{1} +\alpha _{2} = 1\} \subset A\) for each pair a₁, a₂ ∈ A. Equivalently, A is affine if and only if \(A = M + a\) for some (unique) linear subspace M (namely, \(M = A - A\)) and (any) a ∈ A. Finally, the affine hull of a set C, aff (C), is the intersection of all affine sets which contain C. As is well known,

$$\displaystyle{ \mathrm{aff\,}(C) = \left \{\sum _{j\in J}\alpha _{j}x_{j}\biggm |J\mbox{ finite, }\sum _{j\in J}\alpha _{j} = 1,\,x_{j} \in C\;\right \}. }$$
(15.10)

Proposition 15.7.

Let C be a closed convex set in X. Then the following statements are equivalent:

  1.

    C is affine.

  2.

    V C (x) = C for each x ∈ X ∖ C.

Proof.

(1) ⇒ (2): Let us assume first that C = M is actually a subspace, i.e., that 0 ∈ C. Fix any x ∉ M. Since V_M(x) ⊂ M, it suffices to show that M ⊂ V_M(x). To this end, let m ∈ M. If m ∉ V_M(x), then [x, m[ ∩ M ≠ ∅. Hence there exists 0 < λ ≤ 1 such that \(\lambda x + (1-\lambda )m \in M\). Since m ∈ M, this implies that λx ∈ M and hence x ∈ M, a contradiction. This proves (2) in case C is a subspace.

In general, suppose C is affine. Then \(C = M + c\) for some subspace M and c ∈ C. For any x ∈ X ∖ C, we see that x − c ∉ M, and by the above proof and Lemma 15.6 we obtain

$$\displaystyle{V _{C}(x) = V _{M+c}(x) = V _{M}(x - c) + c = M + c = C.}$$

(2) ⇒ (1): Assume (2) holds. If C is not affine, then there exist distinct points c₁, c₂ in C such that \(\mathrm{aff\,}\{c_{1},c_{2}\}\not\subset C\). Since C is closed convex and \(\mathrm{aff\,}\{c_{1},c_{2}\}\) is a line, it follows that either \(\mathrm{aff\,}\{c_{1},c_{2}\} \cap C = [y_{1},y_{2}]\) or \(\mathrm{aff\,}\{c_{1},c_{2}\} \cap C = y_{1} +\{\rho (y_{2} - y_{1})\mid \rho \geq 0\}\) for some distinct points y₁, y₂ in C. In either case, it is easy to verify that \(x:= 2y_{1} - y_{2}\notin C\). Also, \(y_{1} = \frac{1} {2}x + \frac{1} {2}y_{2} \in [x,y_{2}[\,\cap \,C\), which proves that y₂ ∉ V_C(x) and hence contradicts the hypothesis that V_C(x) = C. Thus C must be affine.

Definition 15.8.

Let C be a closed convex subset of X. For any point y ∈ X, we define the translated cone C_y of C by

$$\displaystyle{C_{y}:=\mathrm{ cone\,}(C - y) + y.}$$

Some basic facts about the translated cone follow.

Lemma 15.9.

Let C be a closed convex set in X. Then the following statements hold:

  1.

    C y ⊃ C for each y ∈ X.

  2.

    The set cone (C − y), and hence also C y , is not closed in general.

  3.

    If y ∈ C and the set cone (C − y) is closed, then \(C_{y} = T_{C}(y) + y\) , where T C (y) is the tangent cone to C at y.

Proof.

  1.

    \(C_{y} =\mathrm{ cone\,}(C - y) + y \supset C - y + y = C\).

  2.

    Consider the closed ball C of radius one in the Euclidean plane centered at the point (0, 1) and let y denote the origin (0, 0). Then C y is the open upper half-plane plus the origin, which is not closed.

  3.

    This follows since the definition of the tangent cone to C at the point yC is given by \(T_{C}(y) = \overline{\mathrm{cone\,}}(C - y)\) (see, e.g., [1, p. 100]).

One can also characterize the visible points via the translated cone.

Lemma 15.10.

Let C be a closed convex set in X, x ∈ X ∖ C, and v ∈ C. Then v ∈ V C (x) if and only if x ∉ C v . Equivalently, v ∉ V C (x) if and only if x ∈ C v .

Proof.

If v ∉ V_C(x), then [x, v[ ∩ C ≠ ∅. Thus there exists 0 < λ < 1 (λ ≠ 1 since x ∉ C) such that \(y:=\lambda x + (1-\lambda )v \in C\). Hence \(x - v = (1/\lambda )(y - v) \in \mathrm{ cone}\,(C - v)\) and therefore x ∈ C_v.

Conversely, if x ∈ C_v, then there exist ρ ≥ 0 and y ∈ C such that \(x =\rho (y - v) + v =\rho y + (1-\rho )v\). If ρ ≤ 1, then x, being a convex combination of two points in C, must lie in C, a contradiction. It follows that ρ > 1 and \(y = (1/\rho )x + ((\rho -1)/\rho )v \in [x,v[\,\cap \,C\). Thus [x, v[ ∩ C ≠ ∅, and so v ∉ V_C(x) by (15.1).
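For a concrete instance of Lemma 15.10, take the closed unit disc C = {y ∈ ℝ² : ‖y‖ ≤ 1} and a boundary point v. An elementary computation (ours, not from the text) gives cone(C − v) = {0} ∪ {d : ⟨d, v⟩ < 0}, so for x outside the disc the lemma reduces visibility of v to the sign test ⟨x − v, v⟩ ≥ 0. The sketch below cross-checks this against the defining condition [x, v[ ∩ C = ∅ by dense sampling of the segment (sample counts and test points are our own choices):

```python
from math import cos, sin, hypot

def visible_by_cone(x, v):
    # Lemma 15.10 for the unit disc: v in V_C(x) iff x not in C_v,
    # i.e. iff <x - v, v> >= 0 (x outside the disc, v on its boundary).
    return (x[0] - v[0]) * v[0] + (x[1] - v[1]) * v[1] >= 0

def visible_by_definition(x, v, samples=2000):
    # [x, v[ meets C?  Test a dense sample of the half-open segment.
    for k in range(1, samples + 1):
        l = k / samples                  # lam in (0, 1], point lam*x + (1-lam)*v
        p = (l * x[0] + (1 - l) * v[0], l * x[1] + (1 - l) * v[1])
        if hypot(p[0], p[1]) <= 1.0:
            return False
    return True

x = (2.0, 0.0)
for t in [0.1 * k for k in range(1, 63)]:    # boundary points v = (cos t, sin t)
    v = (cos(t), sin(t))
    assert visible_by_cone(x, v) == visible_by_definition(x, v)
```

From x = (2, 0) the visible arc is exactly {v : v₁ ≥ 1/2}, the arc between the two tangency points, which is what both tests report.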

The following proposition shows that the translated cones of C form the external building blocks for C.

Proposition 15.11.

Let C be a closed convex set in X. Then

$$\displaystyle{\bigcap _{y\in \mathrm{bd}C}C_{y} =\bigcap _{y\in C}C_{y} =\bigcap _{y\in X}C_{y} = C.}$$

Proof.

By Lemma 15.9, ∩_{y∈X} C_y ⊃ C. Thus to complete the proof, it suffices to show that ∩_{y∈bd C} C_y ⊂ C. If not, then there exists x ∈ ∩_{y∈bd C} C_y ∖ C. Thus x ∈ C_y for each y ∈ bd C. By Lemma 15.10, y ∉ V_C(x) for all y ∈ bd C. But V_C(x) ⊂ bd C by Lemma 15.3(2). This shows that V_C(x) = ∅, which contradicts Lemma 15.3(1).

A somewhat deeper characterization of visible points is available by using the strong separation theorem. Recall that two sets C₁ and C₂ in the normed linear space X can be strongly separated by a continuous linear functional x* ∈ X* if

$$\displaystyle{ \sup _{y\in C_{1}}{x}^{{\ast}}(y) <\inf _{ z\in C_{2}}{x}^{{\ast}}(z). }$$
(15.11)

One can also interpret strong separation geometrically. Suppose C₁ and C₂ are strongly separated by the functional x* such that (15.11) holds. Let b be any scalar such that

$$\displaystyle{\sup _{y\in C_{1}}{x}^{{\ast}}(y) \leq b \leq \inf _{ z\in C_{2}}{x}^{{\ast}}(z).}$$

Define the hyperplane H and the (open) half-spaces H + and H by

$$\displaystyle\begin{array}{rcl} & & H:=\{ y \in X\mid {x}^{{\ast}}(y) = b\},\quad {H}^{+}:=\{ y \in X\mid {x}^{{\ast}}(y) > b\},\mbox{ and } {}\\ & & {H}^{-}:=\{ y \in X\mid {x}^{{\ast}}(y) < b\}. {}\\ \end{array}$$

(Note that H, H⁻, and H⁺ are disjoint sets such that \(X = H \cup {H}^{-}\cup {H}^{+}\).) Then H is said to strongly separate the sets C₁ and C₂ in the sense that C₁ ⊂ H ∪ H⁻, C₂ ⊂ H ∪ H⁺, and (at least) one of the sets C₁ or C₂ is disjoint from H.

Fact 15.12 (Strong Separation Theorem; see [4, Theorem V.2.10, p. 417]).

Let C 1 and C 2 be two disjoint closed convex sets in X, one of which is compact. Then the sets can be strongly separated by a continuous linear functional.

Definition 15.13.

Let K be a convex subset of X. A point e ∈ K is called an extreme point of K if k₁ ∈ K, k₂ ∈ K, 0 < λ < 1, and \(e =\lambda k_{1} + (1-\lambda )k_{2}\) imply that \(k_{1} = k_{2} = e\). The set of extreme points of K is denoted by ext K.

The following fact is well known (see, e.g., [4, pp. 439–440]), and it will be needed in this section and the next.

Fact 15.14 (Krein–Milman).

Let K be a nonempty compact convex subset of X. Then:

  1.

    K has extreme points and K is the closed convex hull of its extreme points: \(K = \overline{\mathrm{conv}\,}(\mathrm{ext}K)\).

  2.

    If x* ∈ X*, then x* attains its maximum (resp., minimum) value over K at an extreme point of K.

Theorem 15.15 (Another characterization of visible points).

Let C be a closed convex subset of X, x ∈ X ∖ C, and v ∈ C. Then the following statements are equivalent:

  1.

    v is visible to x with respect to C.

  2.

    For each point y ∈]x,v[, there exists a functional x* ∈ X* that strongly separates [x,y] and C, and \({x}^{{\ast}}(y) =\max _{z\in [x,y]}{x}^{{\ast}}(z)\) .

  3.

    For each point y ∈]x,v[, there exists a hyperplane H = H y that contains y and strongly separates [x,y] and C.

Proof.

(1) ⇒ (2): Suppose v is visible to x with respect to C. Then [x, v[ ∩ C = ∅. In particular, for each y ∈ [x, v[, [x,y] ∩ C ⊂ [x, v[ ∩ C = ∅. Thus [x,y] and C are disjoint closed convex sets, and [x,y] is compact. By Fact 15.12, there exists x* ∈ X* such that

$$\displaystyle{ b:=\sup _{z\in [x,y]}{x}^{{\ast}}(z) <\inf _{ c\in C}{x}^{{\ast}}(c). }$$
(15.12)

To verify (2), it remains to show that x*(y) = b. If x = y, this is clear. Thus we may assume that x ≠ y. Since [x,y] is compact, the supremum on the left side of (15.12) is attained. Further, this maximum must be attained at an extreme point of [x,y] by Fact 15.14(2). Since x and y are the only two extreme points of [x,y], we must have x*(x) = b or x*(y) = b.

Suppose x*(x) = b. Since v ∈ C, we have x*(v) > b by (15.12). Since y ∈ ]x, v[, there exists 0 < λ < 1 such that \(y =\lambda x + (1-\lambda )v\). Then

$$\displaystyle{{x}^{{\ast}}(y) =\lambda {x}^{{\ast}}(x) + (1-\lambda ){x}^{{\ast}}(v) >\lambda b + (1-\lambda )b = b,}$$

which contradicts the definition of b. Thus the condition x*(x) = b is not possible, and we must have x*(y) = b, which verifies (2).

(2) ⇒ (3): Assume (2) holds. Let y ∈ ]x, v[. Choose x* ∈ X* as in (2), and define \(H:=\{ z \in X\mid {x}^{{\ast}}(z) = b\}\), where b = max_{z∈[x,y]} x*(z). Then H strongly separates [x,y] and C, x*(y) = b, and so y ∈ H. Thus (3) holds.

(3) ⇒ (1): Suppose (3) holds but (1) fails. Then [x, v[ ∩ C ≠ ∅. Choose any y ∈ ]x, v[ ∩ C. By (3), there is a hyperplane H that strongly separates [x,y] and C such that y ∈ H. Writing \(H =\{ z \in X\mid {x}^{{\ast}}(z) = b\}\), we see that [x,y] ⊂ {z ∈ X ∣ x*(z) ≤ b}, C ⊂ {z ∈ X ∣ x*(z) > b}, and x*(y) = b. But y ∈ C and hence x*(y) > b, which is a contradiction.

15.3 Visibility and Best Approximation

In this section we explore the connection between visibility and best approximation. The first such result states that the set of best approximations to x from C is always contained in the set of visible points to x with respect to C.

Lemma 15.16.

Let C be a closed convex subset of X. Then P C (x) ⊂ V C (x) for each x ∈ X.

Proof.

The result is trivial if P_C(x) = ∅. If x ∈ C, then clearly P_C(x) = {x} and V_C(x) = {x} by Lemma 15.2.

Now suppose x ∈ X ∖ C and let x₀ ∈ P_C(x). Then x₀ ∈ C, so x₀ ≠ x. If [x, x₀[ ∩ C ≠ ∅, then there exists 0 < λ < 1 such that \(x_{\lambda }:=\lambda x + (1-\lambda )x_{0} \in C\). Hence

$$\displaystyle{\|x - x_{\lambda }\| =\| (1-\lambda )(x - x_{0})\| = (1-\lambda )\|x - x_{0}\| <\| x - x_{0}\|,}$$

which is a contradiction to x₀ being a closest point in C to x. This shows that [x, x₀[ ∩ C = ∅ and hence that x₀ ∈ V_C(x).

Recall that if X is a strictly convex reflexive Banach space, then each closed convex subset C is Chebyshev (see, e.g., [7]). That is, for each xX, there is a unique best approximation (i.e., nearest point) P C (x) to x from C. As is well known, the most important example of a strictly convex reflexive Banach space is a Hilbert space. It is convenient to use the following notation. If S is any subset of X, then the convex hull of S is denoted by conv (S) and the closed convex hull of S is denoted by \(\overline{\mathrm{conv\,}}(S)\).

Another such relationship between visibility and best approximation is the following.

Lemma 15.17.

Let X be a strictly convex reflexive Banach space and C a closed convex subset of X. Then C is a Chebyshev set and if x ∈ X ∖ C, then

$$\displaystyle{ P_{C}(x) = P_{V _{C}(x)}(x) = P_{\overline{\mathrm{conv}}\,V _{C}(x)}(x). }$$
(15.13)

Proof.

By Lemma 15.16, P C (x) ∈ V C (x). Since \(V _{C}(x) \subset \overline{\mathrm{conv}}\,V _{C}(x) \subset C\), it follows that \(P_{C}(x) \in P_{V _{C}(x)}(x)\) and \(P_{C}(x) = P_{\overline{\mathrm{conv}}\,V _{C}(x)}(x)\). Thus \(P_{V _{C}(x)}(x)\) is a singleton and (15.13) holds.

While the Krein–Milman theorem [Fact 15.14(1)] shows that the set of extreme points extC of a compact convex set C forms the internal building blocks of C, the next result shows that the sets C e , where e ∈ extC, form the external building blocks for C. It is a sharpening of Proposition 15.11 in the special case when the closed convex set C is actually compact.

Theorem 15.18.

Let C be a compact convex set in X. Then

$$\displaystyle{ C =\bigcap \{ C_{e}\mid e \in \mathrm{ ext}C\} =\bigcap \{ C_{y}\mid y \in C\}. }$$
(15.14)

Proof.

Using Proposition 15.11, it suffices to show that ∩{C_e ∣ e ∈ ext C} ⊂ C. If not, then there exists x ∈ ∩{C_e ∣ e ∈ ext C} ∖ C. By Fact 15.12, there exists x* ∈ X* such that

$$\displaystyle{ s:=\sup _{c\in C}{x}^{{\ast}}(c) < {x}^{{\ast}}(x). }$$
(15.15)

By compactness of C, the supremum of x* over C is attained, i.e., there exists c₀ ∈ C such that x*(c₀) = s. As is easily verified, the set

$$\displaystyle{ \tilde{C} = C \cap \{ y \in X\mid {x}^{{\ast}}(y) = s\} }$$
(15.16)

is extremal in C and has extreme points (since it is a closed, hence compact, convex subset of C), and each extreme point of \(\tilde{C}\) is an extreme point of C (see, e.g., [4, pp. 439–440]). Choose any extreme point \(\tilde{c}\) in \(\tilde{C}\). Then \(\tilde{c} \in \mathrm{ ext}C\). Also, \(x \in C_{\tilde{c}} =\mathrm{ cone\,}(C -\tilde{ c}) +\tilde{ c}\) implies that \(x =\rho (c -\tilde{ c}) +\tilde{ c}\) for some ρ > 0 and cC (see, e.g., [3, Theorem 4.4(5), p. 45]). Hence

$$\displaystyle\begin{array}{rcl} s& <& {x}^{{\ast}}(x) =\rho [{x}^{{\ast}}(c) - {x}^{{\ast}}(\tilde{c})] + {x}^{{\ast}}(\tilde{c}) \leq {x}^{{\ast}}(\tilde{c}) = s, {}\\ \end{array}$$

which is impossible. This contradiction completes the proof.

Proposition 15.19.

Let C be a closed convex set in X, x ∈ X ∖ C, and let x 0 ∈ C be a proper convex combination of points e i in C. That is, \(x_{0} =\sum _{ 1}^{k}\lambda _{i}e_{i}\) for some λ i > 0 with \(\sum _{1}^{k}\lambda _{i} = 1\) . If x 0 is visible to x with respect to C, then each e i is also visible to x.

Proof.

If k = 1 the result is trivial. Assume that k = 2. (We will reduce the general case to this case.)

If the result were false, then we may assume without loss of generality that e₁ is not visible to x. Thus [x, e₁[ ∩ C ≠ ∅, and since x ∉ C, in fact ]x, e₁[ ∩ C ≠ ∅. Hence there exists 0 < μ < 1 such that \(x_{1}:=\mu x + (1-\mu )e_{1} \in C\). It follows that

$$\displaystyle{ e_{1} = \frac{1} {1-\mu }x_{1} - \frac{\mu } {1-\mu }x. }$$
(15.17)

Next consider, for each ρ ∈ [0, 1], the expression \(x(\rho ):=\rho x_{1} + (1-\rho )e_{2}\). Clearly, x(ρ) ∈ C for all such ρ since both x 1 and e 2 are in C and C is convex. Omitting some simple algebra, we deduce that

$$\displaystyle\begin{array}{rcl} x(\rho )& =& \rho [\mu x + (1-\mu )e_{1}] + (1-\rho )e_{2} {}\\ & =& \rho \mu x + (1-\rho \mu )x_{0} +\rho (1-\mu )e_{1} + (1-\rho )e_{2} - (1-\rho \mu )x_{0} {}\\ & =& \rho \mu x + (1-\rho \mu )x_{0} + [\rho (1 -\mu +\lambda _{1}\mu ) -\lambda _{1}]e_{1} + [-\rho (1 -\mu +\lambda _{1}\mu ) +\lambda _{1}]e_{2}. {}\\ \end{array}$$

In particular, if we choose

$$\displaystyle{ \tilde{\rho }:= \frac{\lambda _{1}} {1 -\mu +\lambda _{1}\mu }\,, }$$
(15.18)

it is not hard to check that \(0 <\tilde{\rho }< 1\). Thus \(0 <\tilde{\rho }\mu < 1\) and

$$\displaystyle{ x(\tilde{\rho }) =\tilde{\rho }\mu x + (1-\tilde{\rho }\mu )x_{0} \in C. }$$
(15.19)

This proves that \(x(\tilde{\rho }) \in \,]x,x_{0}[\,\cap \,C\), which contradicts the fact that x 0 is visible to x.

Finally, consider the case when k ≥ 3. If the result were false, then without loss of generality, we may assume that e 1 fails to be visible to x. Write

$$\displaystyle{x_{0} =\lambda _{1}e_{1} +\mu \sum _{ i=2}^{k}\frac{\lambda _{i}} {\mu } e_{i},}$$

where \(\mu:=\sum _{ 2}^{k}\lambda _{i} = 1 -\lambda _{1}\). Then 0 < μ < 1, \(\lambda _{1} = 1-\mu\), and \(x_{0} = (1-\mu )e_{1} +\mu y\), where \(y =\sum _{ 2}^{k}\frac{\lambda _{i}} {\mu } e_{i} \in C\) by convexity. By the case when k = 2 that we proved above, we get that e 1 (as well as y) is visible to x, which is a contradiction.

Remark 15.20.

Simple examples in the plane (e.g., a triangle) show that the converse to Proposition 15.19 is false! That is, one could have a closed convex set C, a point x ∈ X ∖ C, and points e_i ∈ V_C(x) for \(i = 1,2,\ldots,k\), k ≥ 2, such that \(x_{0} = \frac{1} {k}\sum _{1}^{k}e_{ i} \in C\) is not visible to x.
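One such example, worked out in coordinates of our own choosing: for the triangle C = conv{(0,0), (1,0), (0,1)} and x = (−1,−1), each vertex is visible from x, yet their average, the centroid (1/3, 1/3), is not, since the segment from x toward the centroid passes through the vertex (0,0) ∈ C first. A direct check against definition (15.1), sampling the half-open segment:

```python
def in_triangle(p, tol=1e-12):
    # C = conv{(0,0), (1,0), (0,1)}:  x >= 0, y >= 0, x + y <= 1
    return p[0] >= -tol and p[1] >= -tol and p[0] + p[1] <= 1 + tol

def visible(x, v, in_C, samples=2000):
    # v in V_C(x) iff [x, v[ misses C; sample lam in (0, 1]
    for k in range(1, samples + 1):
        l = k / samples
        p = (l * x[0] + (1 - l) * v[0], l * x[1] + (1 - l) * v[1])
        if in_C(p):
            return False
    return True

x = (-1.0, -1.0)
vertices = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
assert all(visible(x, e, in_triangle) for e in vertices)

centroid = (1.0 / 3.0, 1.0 / 3.0)
assert not visible(x, centroid, in_triangle)   # the average is hidden
```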

Theorem 15.21.

Let C be a closed and bounded convex set in an n-dimensional normed linear space X. Then

$$\displaystyle{ C = \left \{\sum _{1}^{k}\lambda _{ i}e_{i}\biggm |1 \leq k \leq n + 1,\;\lambda _{i} \geq 0,\;\sum _{1}^{k}\lambda _{ i} = 1,\;e_{i} \in \mathrm{ ext}C\right \}. }$$
(15.20)

Further, let x ∈ X ∖ C. Then each point in P C (x) is a proper convex combination of no more than n + 1 extreme points of C all of which are visible to x with respect to C. That is,

$$\displaystyle\begin{array}{rcl} P_{C}(x) \subset \left \{\sum _{1}^{k}\lambda _{ i}e_{i}\biggm |1 \leq k \leq n + 1,\;\lambda _{i} \geq 0,\;\sum _{1}^{k}\lambda _{ i} = 1,\;e_{i} \in (\mathrm{ext}C) \cap V _{C}(x)\right \}.& & \\ & &{}\end{array}$$
(15.21)

Proof.

By [6, Corollary 18.5.1], C = conv (ext C). By Carathéodory’s theorem (see, e.g., [2, p. 17]), each point in conv (ext C) may be expressed as a convex combination of at most n + 1 points of ext C. That is,

$$\displaystyle{ \mathrm{conv}\,(\mathrm{ext}C) = \left \{\sum _{1}^{n+1}\lambda _{ i}e_{i}\biggm |e_{i} \in \mathrm{ ext}C,\lambda _{i} \geq 0,\sum _{1}^{n+1}\lambda _{ i} = 1\right \}. }$$
(15.22)

This proves (15.20).

Now let xXC. By the first part, each point of P C (x) is in conv (extC). By Proposition 15.19 and Lemma 15.16, (15.21) follows.

15.4 Best Approximation from a Simplex

In this section we investigate the problem of finding best approximations from a polytope, i.e., the convex hull of a finite number of points in a Hilbert space X. Such sets are compact (because they are closed and bounded in a finite-dimensional subspace).

Let \(E:=\{ e_{0},e_{1},\ldots,e_{n}\}\) be a set of n + 1 points in X that is affinely independent, i.e., \(\{e_{1} - e_{0},e_{2} - e_{0},\ldots,e_{n} - e_{0}\}\) is linearly independent. This implies that each point in the convex hull \(C =\mathrm{ conv}\,\{e_{0},e_{1},\ldots,e_{n}\}\) has a unique representation as a convex combination of the points of E. In this case, C is also called an n-dimensional simplex with vertices e_i, since the dimension of the affine hull aff (C) of C is n. Further, the relative interior of C, that is, the interior of C relative to aff (C), is given by

$$\displaystyle{ \mathrm{ri}\,(C):={\biggl \{\sum _{ i=0}^{n}\lambda _{ i}e_{i}\bigm |\lambda _{i} > 0,\;\sum _{i=0}^{n}\lambda _{ i} = 1\biggr \}}. }$$
(15.23)

It follows that the relative boundary of C, rbd(C): = C ∖ ri (C), is given by

$$\displaystyle{ \mathrm{rbd\,}(C) ={\biggl \{\sum _{ i=0}^{n}\lambda _{ i}e_{i}\bigm |\lambda _{i} \geq 0,\;\sum _{i=0}^{n}\lambda _{ i} = 1,\;\lambda _{j} = 0\mbox{ for at least one $j$}\biggr \}}. }$$
(15.24)

(See [6, p. 44ff] and [5, p. 7ff] for more detail and proofs about the facts stated in this paragraph.)

We consider sets of affinely independent points, since this case captures the essence of our constructions and arguments. The convex hull of an affinely dependent finite set of points can be split into a finite union of convex hulls of affinely independent subsets. Thus the problem of finding best approximations from the convex hull of an affinely dependent set of points can be reduced to a finite number of problems analogous to the one that we consider below in detail.

Under the above hypothesis that C is an n-dimensional simplex, we wish to compute P C (x) for any x ∈ X.

We give an explicit formula for P C (x) in the case when n = 1, that is, when C = [e 0, e 1] is a line segment. Then, by a recursive argument, we will indicate how to compute P C (x) when C is an n-dimensional simplex for any n ≥ 2. First we recall that the truncation function [ ⋅]0 1 is defined on the set of real numbers by

$$\displaystyle{[\alpha ]_{0}^{1} = \left \{\begin{array}{ll} 0&\mbox{ if $\alpha < 0$},\\ \alpha &\mbox{ if $0 \leq \alpha \leq 1$}, \\ 1&\mbox{ if $\alpha > 1$}. \end{array} \right.}$$

(Note that in the space \(X = \mathbb{R}\), \([\alpha ]_{0}^{1} = P_{[0,1]}(\alpha )\) for all \(\alpha \in \mathbb{R}\).)

Proposition 15.22.

Let \(C =\mathrm{ conv}\,\{e_{0},e_{1}\} = [e_{0},e_{1}]\) be a 1-dimensional simplex. Then, for each x ∈ X,

$$\displaystyle{ P_{C}(x) = e_{0} + \left [\frac{\langle x - e_{0},e_{1} - e_{0}\rangle } {\|e_{1} - e{_{0}\|}^{2}} \right ]_{0}^{1}(e_{ 1} - e_{0}). }$$
(15.25)

Proof.

Let \(\alpha:= \langle x - e_{0},e_{1} - e_{0}\rangle \|e_{1} - e{_{0}\|}^{-2}\) and \(c_{0}:= e_{0} + [\alpha ]_{0}^{1}(e_{1} - e_{0})\). Then c 0C, and by the well-known characterization of best approximations from convex sets in Hilbert space (see, e.g., [3, p. 43]) it suffices to show that

$$\displaystyle{ \langle x - c_{0},y - c_{0}\rangle \leq 0\mbox{ for each $y \in C$.} }$$
(15.26)

Let yC. Then \(y = e_{0} +\lambda (e_{1} - e_{0})\) for some λ ∈ [0, 1]. Hence

$$\displaystyle\begin{array}{rcl} \langle x - c_{0},y - c_{0}\rangle & =& \langle x - e_{0} - [\alpha ]_{0}^{1}(e_{ 1} - e_{0}),\lambda (e_{1} - e_{0}) - [\alpha ]_{0}^{1}(e_{ 1} - e_{0})\rangle {}\\ & =& (\lambda -[\alpha ]_{0}^{1})[\langle x - e_{ 0},e_{1} - e_{0}\rangle - [\alpha ]_{0}^{1}\|e_{ 1} - e{_{0}\|}^{2}] {}\\ & =& (\lambda -[\alpha ]_{0}^{1})\|e_{ 1} - e{_{0}\|}^{2}\left [\alpha -[\alpha ]_{ 0}^{1}\right ]. {}\\ \end{array}$$

By considering the three possible cases: α < 0, α ∈ [0, 1], and α > 1, it is easy to see that the last expression is always ≤ 0. Hence (15.26) is verified.
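In coordinates, formula (15.25) is just a clamped scalar projection. A direct transcription (the helper name is ours):

```python
import numpy as np

def project_segment(x, e0, e1):
    """P_C(x) for the segment C = [e0, e1], via formula (15.25):
    the line parameter alpha is truncated to [0, 1]."""
    x, e0, e1 = map(np.asarray, (x, e0, e1))
    d = e1 - e0
    alpha = np.dot(x - e0, d) / np.dot(d, d)
    return e0 + min(max(alpha, 0.0), 1.0) * d  # [alpha]_0^1 = P_[0,1](alpha)

# The foot of the perpendicular may land inside the segment, or alpha is
# truncated and the nearer endpoint is returned:
project_segment([0.5, 1.0], [0, 0], [1, 0])  # (0.5, 0): interior case
project_segment([2.0, 1.0], [0, 0], [1, 0])  # (1, 0): alpha > 1, clamped
```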

Before considering the cases when n ≥ 2, let us first consider the problem of computing P_A(x) for any x ∈ X, where A = aff (C).

Fact 15.23.

Let \(C =\mathrm{ conv}\,\{e_{0},e_{1},\ldots,e_{n}\}\) be an n-dimensional simplex, and let A = aff (C). For any xX, we have

$$\displaystyle{ P_{A}(x) = e_{0} +\sum _{ j=1}^{n}\alpha _{ j}(e_{j} - e_{0}), }$$
(15.27)

where the scalars α i satisfy the “normal” equations:

$$\displaystyle{ \sum _{j=1}^{n}\alpha _{ j}\langle e_{j} - e_{0},e_{i} - e_{0}\rangle =\langle x - e_{0},e_{i} - e_{0}\rangle \qquad (i = 1,2,\ldots,n). }$$
(15.28)

The proof of this fact can be found e.g., in [1, p. 418] or [3, p. 215]. Moreover, the “reduction principle” that was established in [3, p. 80] (where it was stated in the particular case of a subspace) can be easily extended to affine sets as follows.

Fact 15.24 (Reduction principle).

Let C be a closed convex set in the Hilbert space X and let \(A = \overline{\mathrm{aff\,}}(C)\). Then \(P_{C} = P_{C} \circ P_{A}\). That is, for each xX,

$$\displaystyle{P_{C}(x) = P_{C}(P_{A}(x))\mbox{ and }{d}^{2}(x,C) = {d}^{2}(x,A) + {d}^{2}(P_{ A}(x),C).}$$

We are going to use the Reduction Principle as follows. We assume that it is straightforward to find the best approximation to any x in the set A = aff C, where C is an n-dimensional simplex (since it involves only solving a linear system of n equations in n unknowns by Fact 15.23). The Reduction Principle states that (by replacing x with P A (x) if necessary) we may as well assume that our point x is in A to begin with, and we shall do this in what follows. We will see that the case when n = 2 can be reduced to the case when n = 1 (i.e., Proposition 15.22 above) for which there is an explicit formula.
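Concretely, (15.27)–(15.28) amount to solving an n × n Gram system. A sketch in Euclidean coordinates (the function name is ours); the residual x − P_A(x) is then orthogonal to every difference e_i − e_0, which is exactly what the normal equations encode:

```python
import numpy as np

def project_affine(x, verts):
    """P_A(x) for A = aff{e_0, ..., e_n}, by solving the normal
    equations (15.28); assumes the e_i are affinely independent, so the
    Gram matrix of the differences e_j - e_0 is invertible."""
    x = np.asarray(x, float)
    e = [np.asarray(v, float) for v in verts]
    D = np.array([ej - e[0] for ej in e[1:]])  # rows: e_j - e_0
    G = D @ D.T                                # entries <e_j - e_0, e_i - e_0>
    rhs = D @ (x - e[0])                       # entries <x - e_0, e_i - e_0>
    alpha = np.linalg.solve(G, rhs)
    return e[0] + alpha @ D

# aff{(0,0,0), (1,0,0), (0,1,0)} is the xy-plane in R^3, so P_A simply
# drops the z-coordinate:
p = project_affine([0.3, 0.7, 5.0], [[0, 0, 0], [1, 0, 0], [0, 1, 0]])
# p = (0.3, 0.7, 0)
```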

Proposition 15.25.

Let \(C =\mathrm{ conv}\,\{e_{0},e_{1},e_{2}\}\) be a 2-dimensional simplex. Then for each x ∈ aff (C), either x ∈ C, in which case P C (x) = x, or x ∉ C, in which case

$$\displaystyle{ P_{C}(x) = P_{[e_{i},e_{i+1}]}(x)\mbox{ for any $i \in \{ 0,1,2\}$ that satisfies } }$$
(15.29)
$$\displaystyle{ \|x - P_{[e_{i},e_{i+1}]}(x)\| =\min _{j}\|x - P_{[e_{j},e_{j+1}]}(x)\|. }$$
(15.30)

(Here e 3 := e 0 .)

Proof.

If xC, then obviously P C (x) = x. Thus we can assume that x ∈ aff (C) ∖ C. It follows that P C (x) must lie on \(\mathrm{rbd\,}C = \cup _{i=0}^{2}[e_{i},e_{i+1}]\). That is, \(P_{C}(x) \in [e_{i},e_{i+1}]\) for some i = 0, 1, or 2.

Claim.

\(P_{C}(x) = P_{[e_{i},e_{i+1}]}(x)\) for each i such that \(P_{C}(x) \in [e_{i},e_{i+1}]\) .

To see this, we observe that since \(P_{C}(x) \in [e_{i},e_{i+1}]\), we have

$$\displaystyle{\|x - P_{C}(x)\| = d(x,C) \leq d(x,[e_{i},e_{i+1}]) \leq \| x - P_{C}(x)\|}$$

which implies that \(\|x - P_{[e_{i},e_{i+1}]}(x)\| = d(x,[e_{i},e_{i+1}]) =\| x - P_{C}(x)\|\). By uniqueness of best approximations from convex sets in Hilbert space, the claim is proved.

If k is any index such that \(\|x - P_{[e_{k},e_{k+1}]}(x)\| =\min _{j}\|x - P_{[e_{j},e_{j+1}]}(x)\|\), then it is clear that we must have \(P_{C}(x) = P_{[e_{k},e_{k+1}]}(x)\).

It is now straightforward to apply the idea of Proposition 15.25 to any n-dimensional simplex to describe how to determine P_C(x).

Let \(C =\mathrm{ conv}\,\{e_{0},e_{1},\ldots,e_{n}\}\) be an n-dimensional simplex in X and x ∈ aff (C). If xC, we have P C (x) = x. Thus we may assume that x ∈ aff (C) ∖ C. It follows that P C (x) ∈ rbd(C). From (15.24) we see

$$\displaystyle{ \mathrm{rbd\,}(C) ={\biggl \{\sum _{ 0}^{n}\lambda _{ i}e_{i}\bigm |\lambda _{i} \geq 0,\;\sum _{0}^{n}\lambda _{ i} = 1,\;\lambda _{j} = 0\mbox{ for some $j$}\biggr \}}. }$$

Since every y ∈ rbdC is contained in (at least) one of the sets

$$\displaystyle{ C_{j}:={\biggl \{\sum _{ i=0}^{n}\lambda _{ i}e_{i}\bigm |\lambda _{i} \geq 0\mbox{ for all $i$, }\lambda _{j} = 0,\,\mbox{ and }\sum _{0}^{n}\lambda _{ i} = 1\biggr \}}, }$$
(15.31)

it follows that

$$\displaystyle{ \mathrm{rbd\,}C =\bigcup _{ j=0}^{n}C_{ j}. }$$

Further, each C j is a simplex of dimension n − 1 in C, P C (x) ∈ C j for at least one j, and for all such j, we have that

$$\displaystyle{\|x - P_{C}(x)\| = d(x,C) \leq \| x - P_{C_{j}}(x)\| = d(x,C_{j}) \leq \| x - P_{C}(x)\|.}$$

This implies that equality holds throughout these inequalities, and hence by the uniqueness of best approximations, we have \(P_{C}(x) = P_{C_{j}}(x)\). If \(J =\{ j\mid \|x - P_{C_{j}}(x)\| =\min _{i}\|x - P_{C_{i}}(x)\|\}\), then clearly \(P_{C}(x) = P_{C_{j}}(x)\) for each jJ.

This discussion suggests the following recursive algorithm for computing P_C(x) when \(C =\mathrm{ conv}\,\{e_{0},e_{1},\ldots,e_{n}\}\) is an n-dimensional simplex. Let C_j be the (n − 1)-dimensional simplices defined in (15.31). Let A = aff C, A_j = aff C_j for each \(j = 0,1,\ldots,n\), x ∈ A ∖ C, and \(x_{j} = P_{A_{j}}(x)\) for all j. The algorithm below defines a function P(n, x, C) which takes as input n, x, and the set C and returns the best approximation P_C(x).

Algorithm

  1.

    If n = 1, then find P(1, x, C) by using the formula given in Proposition 15.22.

  2.

    If n > 1, then compute \(x_{j} = P_{A_{j}}(x)\) and \(P_{C_{j}}(x_{j}) = P(n - 1,x_{j},C_{j})\) for \(j = 0,1,\ldots,n\).

  3.

    Set \(P_{C}(x) = P_{C_{j}}(x_{j})\) for any \(j \in \mathrm{ argmin\,}_{k}\|x - P_{C_{k}}(x_{k})\|\).
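The recursion can be sketched in Euclidean coordinates as follows (a straightforward transcription of Steps 1–3, with helper names of our own choosing; a barycentric-coordinate test short-circuits the recursion when the affine projection already lands in C, and among the facet candidates we keep the one closest to x itself, i.e. minimize ‖x − P_{C_j}(x_j)‖ = d(x, C_j), consistent with the set J in the discussion above):

```python
import numpy as np

def project_segment(x, e0, e1):
    # Step 1: the explicit formula of Proposition 15.22
    d = e1 - e0
    alpha = np.dot(x - e0, d) / np.dot(d, d)
    return e0 + min(max(alpha, 0.0), 1.0) * d

def project_affine(x, verts):
    # P_{aff(verts)}(x) via the normal equations of Fact 15.23
    D = np.array([v - verts[0] for v in verts[1:]])
    alpha = np.linalg.solve(D @ D.T, D @ (x - verts[0]))
    return verts[0] + alpha @ D

def barycentric(x, verts):
    # coordinates of x in aff(verts); exact once x lies in that affine hull
    D = np.array([v - verts[0] for v in verts[1:]])
    lam = np.linalg.solve(D @ D.T, D @ (x - verts[0]))
    return np.concatenate(([1.0 - lam.sum()], lam))

def project_simplex(x, verts):
    """P_C(x) for the simplex C = conv(verts), verts affinely independent."""
    x = np.asarray(x, float)
    verts = [np.asarray(v, float) for v in verts]
    if len(verts) == 2:
        return project_segment(x, verts[0], verts[1])
    xa = project_affine(x, verts)          # reduction principle (Fact 15.24)
    if np.all(barycentric(xa, verts) >= -1e-12):
        return xa                          # P_A(x) already lies in C
    # Steps 2-3: recurse on the facets C_j and keep the closest candidate
    candidates = (project_simplex(x, verts[:j] + verts[j + 1:])
                  for j in range(len(verts)))
    return min(candidates, key=lambda p: np.linalg.norm(x - p))

# Project (2, 2) onto the triangle conv{(0,0), (1,0), (0,1)}:
p = project_simplex([2, 2], [[0, 0], [1, 0], [0, 1]])
# p = (0.5, 0.5), the midpoint of the hypotenuse
```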