Abstract
We give new characterizations for the subdifferential of the supremum of an arbitrary family of convex functions, dropping the standard assumptions of compactness of the index set and upper semi-continuity of the functions with respect to the index (J. Convex Anal. 26, 299–324, 2019). We develop an approach based on the compactification of the index set, giving rise to an appropriate enlargement of the original family. Moreover, in contrast to previous results in the literature, our characterizations are formulated exclusively in terms of exact subdifferentials at the nominal point. Fritz–John and KKT conditions are derived for convex semi-infinite programming.
1 Introduction
Consider the pointwise supremum \(f:=\sup _{t\in T}f_{t}\) of a collection of convex functions \(f_{t}:X\rightarrow \mathbb {R}\cup \{\pm \infty \}\), \(t\in T\neq \emptyset \), with T arbitrary, defined on a separated locally convex space X. A challenging problem throughout the recent history of optimization (especially in the 1960s and 1970s) has been to obtain formulas for the subdifferential of the supremum, \(\partial f(x)\), at any point x of the effective domain of f, in terms of the subdifferentials of the data functions, \(\partial f_{t}(x)\), \(t\in T\).
Since many convex functions, such as the Fenchel conjugate, the sum, the composition with affine mappings, etc., can be expressed as the supremum of affine or convex functions, formulas characterizing the subdifferential of the supremum were expected to play a crucial role in convex and variational analysis, leading to a variety of calculus rules and allowing a deeper analysis of some relevant problems in this area. For instance, any formula for the subdifferential of the supremum function is a useful tool for deriving KKT-type optimality conditions for a convex optimization problem. This is due to the fact that any set of convex constraints, even an infinite one, can be replaced by a single convex constraint involving the supremum function. An alternative approach consists of replacing the constraints by the indicator function of the feasible set. It turns out that, under certain constraint qualifications, its subdifferential (i.e., the normal cone to the feasible set) appears in the so-called Fermat optimality principle, and its relation with the subdifferential of the supremum function can then be conveniently exploited.
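The two reductions just mentioned can be written out explicitly. The display below is only an illustration in the notation of this paper, assuming a proper function f on X with Fenchel conjugate \(f^{\ast }\): the conjugate is a pointwise supremum of affine functions of \(x^{\ast }\), and an arbitrary system of convex inequalities has the same solution set as a single inequality for the supremum function.

$$ f^{\ast }(x^{\ast })=\sup _{x\in \text {dom} f}\{\langle x^{\ast },x\rangle -f(x)\},\qquad \{x\in X:\ f_{t}(x)\leq 0\ \text { for all }t\in T\}=\left \{x\in X:\ \sup _{t\in T}f_{t}(x)\leq 0\right \}. $$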
Let us quote the following paragraph extracted from [15]: “One of the most specific constructions in convex or nonsmooth analysis is certainly taking the supremum of a (possibly infinite) collection of functions. In the years 1965–1970, various calculus rules concerning the subdifferential of sup-functions started to emerge; working in that direction and using various assumptions, several authors contributed to this calculus rule: B.N. Pshenichnyi, A.D. Ioffe, V.L. Levin, R.T. Rockafellar, A. Sotskov, etc.; however, the most elaborated results of that time were due to M. Valadier (1969); he made use of ε-active indices in taking the supremum of the collection of functions.”
Therefore, it is clear that the mathematical interest of this topic has been widely recognized from the very beginning of the history of convex and variational analysis. Remarkable contributions to this topic include: Brøndsted [1], Ekeland and Temam [10], Ioffe [17], Ioffe and Levin [18], Ioffe and Tikhomirov [19], Levin [20], Pschenichnyi [23], Rockafellar [26], Valadier [30], etc. See, for instance, Tikhomirov [29] to trace the historical origins of the issue.
In a series of papers ([3,4,5,6,12,13,14], etc.) we provided alternative characterizations of the subdifferential of the supremum in various settings, and applied them to derive calculus rules in convex analysis.
In [7] we addressed the problem of characterizing the subdifferential of the supremum of a compactly-indexed family of extended real-valued convex functions. The assumptions made there, which are standard in the literature of convex analysis and non-differentiable semi-infinite programming, are the compactness of the index set T and the upper semi-continuity of the functions with respect to the index t. Two questions arise naturally. The first, basic one is: Is it possible to remove these assumptions? The second, more precise one is: By using a compactification of the index set and an appropriate enlargement of the original family of data functions, can we get rid of these assumptions while still being able to apply the theory developed under them?
In this framework, we propose in the current paper an approach based on the Stone–Čech compactification of the index set T, as well as a natural procedure for building an appropriate enlargement of the original family ensuring the fulfillment of the minimal requirements of continuity of the functions with respect to the index. Moreover, in contrast to previous approaches, our characterizations are formulated exclusively in terms of exact subdifferentials at the nominal point.
Formula (10) constitutes the main result of the paper. It provides an explicit expression of the subdifferential of the supremum function for any family of convex functions, dropping the usual standard assumptions in the literature (upper semi-continuity and compactness conditions; see, e.g. [1, 7, 19, 27, 30]). Namely, compared with the formula
(see (2) and (3) for the definition of \(\mathcal {F}(x)\) and Tε(x), respectively), which can be easily derived from the main result in [14, Theorem 4], formula (10) involves the convex hull of the union of the exact subdifferentials of exclusively the active functions, up to an appropriate enlargement of the original family of functions.
The paper is structured as follows. After a short section introducing the notation, Section 3 is devoted to preliminaries; its main result is formula (4) in Proposition 1, which slightly improves Proposition 2 in [6] as it uses the convex hull instead of the closed convex hull. In Section 4 the compactification process is described in detail, and an appropriate enlargement of the original family \(\{f_{t},t\in T\}\) is built, through formula (6), in order to guarantee the (upper semi-) continuity requirements with respect to the index t which allow us to apply the results in [7]. Our main result in Section 4, Theorem 1, provides the desired characterization of the subdifferential of f in non-compact frameworks. It comes after some technical lemmas, and some corollaries are also established under specific assumptions. An example illustrates the compactification approach, and the last section provides Fritz–John and KKT-type optimality conditions for the convex semi-infinite optimization problem, in which the compactness/continuity assumptions in [6, Theorem 5 and Corollary 6] are again dropped.
2 Notation
Let X be a (real) separated locally convex space, whose topological dual space is X∗, which is endowed with the w∗-topology. The spaces X and X∗ are paired in duality by the bilinear form (x∗,x) ∈ X∗× X↦〈x∗,x〉 := 〈x,x∗〉 := x∗(x). The zero vectors in X and X∗ are denoted by 𝜃. Closed, convex and balanced neighborhoods of 𝜃 are called 𝜃-neighborhoods. We use the notation \(\overline {\mathbb {R}}:=\mathbb {R}\cup \{-\infty ,+\infty \}\) and \(\mathbb {R}_{\infty }:=\mathbb {R}\cup \{+\infty \}\), and adopt the convention \((+\infty ) + (-\infty ) = (-\infty ) + (+\infty )=+\infty \).
Given two nonempty sets A and B in X (or in X∗), we define the algebraic (or Minkowski) sum by
$$ A+B:=\{a+b:\ a\in A,\ b\in B\}. $$
By co(A), cone(A), and aff(A), we denote the convex, the conical convex (i.e., \(\text {cone} A:=\mathbb {R}_{+}(\text {co} A))\), and the affine hulls of the set A, respectively. Moreover, int(A) is the interior of A and clA and \(\overline {A}\) are indistinctly used for denoting the closure of A. We use ri(A) to denote the (topological) relative interior of A (i.e., the interior of A in the topology relative to aff(A) if aff(A) is closed, and the empty set otherwise).
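Associated with A≠∅, A ⊂ X, we consider the orthogonal subspace given by
$$ A^{\perp }:=\{x^{\ast }\in X^{\ast }:\ \langle x^{\ast },x\rangle =0\ \text { for all }x\in A\}. $$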
The following relation is fulfilled
where \(\mathcal {F}\) is the family of finite-dimensional linear subspaces of X.
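If A ⊂ X is convex and x ∈ X, we define the normal cone to A at x as
$$ \mathrm {N}_{A}(x):=\{x^{\ast }\in X^{\ast }:\ \langle x^{\ast },y-x\rangle \leq 0\ \text { for all }y\in A\} $$
if x ∈ A, and the empty set otherwise.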
Given a function \(f:X\longrightarrow \overline {\mathbb {R}}\), its (effective) domain is
$$ \text {dom} f:=\{x\in X:\ f(x)<+\infty \}. $$
We say that f is proper when domf≠∅ and \(f(x)>-\infty \) for all x ∈ X.
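Given x ∈ X and ε ≥ 0, the ε-subdifferential of f at x is
$$ \partial _{\varepsilon }f(x):=\{x^{\ast }\in X^{\ast }:\ f(y)\geq f(x)+\langle x^{\ast },y-x\rangle -\varepsilon \ \text { for all }y\in X\} $$
when x ∈ dom f, and \(\partial _{\varepsilon }f(x):=\emptyset \) when \(f(x)\notin \mathbb {R}\). The elements of \(\partial _{\varepsilon }f(x)\) are called ε-subgradients of f at x. The subdifferential of f at x is \(\partial f(x):=\partial _{0}f(x)\), whose elements are called subgradients of f at x.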
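For illustration, consider the following one-dimensional example, which is ours and not part of the cited results. Take \(X=\mathbb {R}\) and \(f(x)=\frac {1}{2}x^{2}\), and recall that \(x^{\ast }\in \partial _{\varepsilon }f(x)\) means \(f(y)\geq f(x)+\langle x^{\ast },y-x\rangle -\varepsilon \) for all y. Minimizing the difference of both sides over y (the minimum is attained at \(y=x^{\ast }\)) gives

$$ \inf _{y\in \mathbb {R}}\left \{\tfrac {1}{2}y^{2}-\tfrac {1}{2}x^{2}-x^{\ast }(y-x)\right \}=-\tfrac {1}{2}(x^{\ast }-x)^{2},\qquad \text {so that}\qquad \partial _{\varepsilon }f(x)=\left [x-\sqrt {2\varepsilon },\ x+\sqrt {2\varepsilon }\right ]. $$

In particular, \(\partial f(x)=\{x\}\) for ε = 0, while for ε > 0 the ε-subdifferential is an interval around the exact subgradient.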
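The support and the indicator functions of A ⊂ X are respectively defined as
$$ \sigma _{A}(x^{\ast }):=\sup \{\langle x^{\ast },a\rangle :\ a\in A\},\quad x^{\ast }\in X^{\ast }, $$
and
$$ \mathrm {I}_{A}(x):=0\ \text { if }x\in A,\qquad \mathrm {I}_{A}(x):=+\infty \ \text { if }x\in X\setminus A. $$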
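As a quick check of these notions on a concrete set (an illustrative example of ours), take \(X=\mathbb {R}\) and \(A=[-1,1]\), with the support function \(\sigma _{A}(x^{\ast })=\sup _{a\in A}\langle x^{\ast },a\rangle \), the indicator \(\mathrm {I}_{A}\) equal to 0 on A and \(+\infty \) elsewhere, and the normal cone \(\mathrm {N}_{A}(x)=\{x^{\ast }:\ \langle x^{\ast },y-x\rangle \leq 0\ \text { for all }y\in A\}\) for x ∈ A. Then

$$ \sigma _{A}(x^{\ast })=|x^{\ast }|,\qquad \mathrm {N}_{A}(1)=[0,+\infty ),\qquad \mathrm {N}_{A}(0)=\{0\},\qquad \partial \mathrm {I}_{A}(x)=\mathrm {N}_{A}(x)\ \text { for all }x\in A. $$

The last identity follows directly from the definitions and explains why normal cones appear alongside subdifferentials in the formulas of this paper.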
3 Preliminary Results
We give a first characterization of the subdifferential of the supremum
$$ f:=\sup _{t\in T}f_{t} $$
of a family of extended real-valued convex functions \(\{f_{t},\ t\in T\}\), defined on a (separated) real locally convex space X, and indexed by an arbitrary (possibly, infinite) set T.
We shall need the following result, which slightly improves Proposition 2 in [6], as it uses the convex hull instead of the closed convex hull. Our main result, given in Theorem 1, provides the general characterization of the subdifferential of f in not necessarily compact frameworks.
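Given x ∈ X and ε ≥ 0, we shall denote
$$ \mathcal {F}(x):=\{L\in \mathcal {F}:\ x\in L\} $$ (2)
and
$$ T_{\varepsilon }(x):=\{t\in T:\ f_{t}(x)\geq f(x)-\varepsilon \}, $$ (3)
with \(T(x):=T_{0}(x)\).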
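To see how the active and the ε-active index sets can differ, consider the following example (ours, for illustration only), where \(T_{\varepsilon }(x)\) is understood as the set of indices t with \(f_{t}(x)\geq f(x)-\varepsilon \) and \(T(x):=T_{0}(x)\). Let \(T=\mathbb {N}=\{1,2,\dots \}\), \(X=\mathbb {R}\) and \(f_{n}(x):=(1-\frac {1}{n})|x|\), so that \(f(x)=\sup _{n}f_{n}(x)=|x|\). For x ≠ 0 and ε > 0,

$$ T(x)=\emptyset ,\qquad T_{\varepsilon }(x)=\left \{n\in \mathbb {N}:\ n\geq |x|/\varepsilon \right \}, $$

whereas \(T(0)=\mathbb {N}\). Thus no function of the family is active at x ≠ 0, although every \(T_{\varepsilon }(x)\) is nonempty, and a formula built only on \(T(x)\) cannot recover \(\partial f(x)=\{x/|x|\}\) at such points.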
Proposition 1
Fix x ∈ X. We assume there is some ε0 > 0 such that (i) \(T_{\varepsilon _{0}}(x)\) is compact and (ii) for each net \((t_{i})_{i}\subset T_{\varepsilon _{0}}(x)\) converging to \(t\in T_{\varepsilon _{0}}(x)\) we have that
Then
Proof
According to [6, Proposition 2], where (4) was established with \(\overline {\text {co}}\) instead of co, we only need to prove that the sets
are closed under the current hypothesis. Let us denote, for t ∈ T(x) and \(L\in \mathcal {F}(x)\),
so that
Take a net \((u_{i}^{\ast })_{i}\subset E_{L}\) such that \(u_{i}^{\ast }\rightarrow u^{\ast }\in X^{\ast }\). We denote by \(z_{i}^{\ast }\) the restriction of \(u_{i}^{\ast }\) to the finite-dimensional subspace L, so that
where L∗ is the dual of L. Then, by applying Carathéodory’s Theorem in L∗, for each i there are some \(\lambda _{i,1},\dots ,\lambda _{i,n+1}\geq 0\) with \(\lambda _{i,1}+\cdots +\lambda _{i,n+1}=1\), and elements \(z_{i,k}^{\ast }\in \partial g_{t_{i,k}}(x)\) with \(t_{i,k}\in T(x)\) and \(k\in K:=\{1,\dots ,n+1\}\) (hence, the functions \(g_{t_{i,k}}\), k ∈ K, are all proper), such that
where n is the dimension of L.
Passing to subnets if necessary, we may assume that each \((\lambda _{i,k})_{i}\), k ∈ K, converges to some \(\lambda _{k}\geq 0\) such that \(\lambda _{1}+\cdots +\lambda _{n+1}=1\). Also, since \((t_{i,k})_{i}\subset T(x)\subset T_{\varepsilon _{0}}(x)\) and this last set is compact by assumption, we may assume that \(t_{i,k}\rightarrow t_{k}\in T_{\varepsilon _{0}}(x)\), k ∈ K. Moreover, using again the assumption, we have
in particular, \(g_{t_{k}}(z)>-\infty \) for all z ∈ L ∩ dom f, k ∈ K, and (recall that \(t_{i,k}\in T(x)\))
showing that \(t_{k}\in T(x)\) for all k ∈ K. Consequently, taking into account that \((t_{i,k})_{i}\subset T(x)\) for all k ∈ K, for every \(z\in L\cap \text {dom} f\) (\(=\text {dom} g_{t_{k}}\), k ∈ K) we obtain
where \(K_{+}:=\{k\in K:\ \lambda _{k}>0\}\). Hence, using Rockafellar’s subdifferential sum rule [25], as \(g_{t_{k}}(z)+\mathrm {I}_{L\cap \text {dom} f}(z)=g_{t_{k}}(z)\) and \(\text {ri}(\text {dom} g_{t_{k}})=\text {ri}(L\cap \text {dom} f)\neq \emptyset \) for all z ∈ L and all k ∈ K, we obtain
Then, using the extension theorem, we can take an extension v∗ of z∗ to X∗ such that
satisfying u∗− v∗∈ L⊥. Therefore
□
4 Compactification Approach
Given a non-empty family of extended real-valued convex functions
defined on a (separated) real locally convex space X, and indexed by an arbitrary (possibly, infinite) set T, we consider the corresponding supremum function
Here, in order to apply the methodology proposed in [7], we endow the index set T with some topology. When no topology is known on T, the discrete one is frequently used. We denote by \(\mathcal {C}(T,[0,1])\) the set of continuous functions from T to [0,1], and consider the product space \([0,1]^{\mathcal {C}(T,[0,1])}\), which is compact for the product topology (by Tychonoff’s theorem). We shall regard the index set T as a subset of \([0,1]^{\mathcal {C}(T,[0,1])}\), and write \(T\subset [0,1]^{\mathcal {C}(T,[0,1])}\), by using the mapping \(\mathfrak {d}:T\rightarrow [0,1]^{\mathcal {C}(T,[0,1])}\), which assigns to each t ∈ T the evaluation function \(\mathfrak {d}(t)\equiv \gamma _{t}\in [0,1]^{\mathcal {C}(T,[0,1])}\), defined as
$$ \gamma _{t}(\varphi ):=\varphi (t),\quad \varphi \in \mathcal {C}(T,[0,1]). $$
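The closure of T in \([0,1]^{\mathcal {C}(T,[0,1])}\) for the product topology is the compact set
$$ \widehat {T}:=\text {cl}(\mathfrak {d}(T)), $$ (5)
and is referred to as the Stone–Čech compactification of T, usually denoted by βT. Recall that for \(\gamma \in \widehat {T}\) and a net \((\gamma _{i})_{i}\subset \widehat {T}\), we have \(\gamma _{i}\rightarrow \gamma \) when
$$ \gamma _{i}(\varphi )\rightarrow \gamma (\varphi )\quad \text { for all }\varphi \in \mathcal {C}(T,[0,1]). $$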
When T is completely regular (e.g., compact Hausdorff), \(\widehat {T}\) is Hausdorff (see, e.g., [22, §38]), and the convergences in \(\mathfrak {d} (T)\) and T are the same.
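The following elementary example (ours) may help to visualize convergence in \(\widehat {T}\), understood coordinatewise: \(\gamma _{i}\rightarrow \gamma \) when \(\gamma _{i}(\varphi )\rightarrow \gamma (\varphi )\) for every \(\varphi \in \mathcal {C}(T,[0,1])\). Take \(T=\mathbb {N}\) with the discrete topology, so that every function from \(\mathbb {N}\) to [0,1] is continuous, write \(\gamma _{n}\) for the evaluation at n, and let \(\varphi _{0}\) be the indicator of the even numbers. Then

$$ \gamma _{n}(\varphi _{0})=\varphi _{0}(n)=\begin {cases}1, & n\ \text {even},\\ 0, & n\ \text {odd},\end {cases} $$

so the sequence \((\gamma _{n})_{n}\) does not converge in \(\widehat {\mathbb {N}}\). By compactness it nevertheless has convergent subnets, and every limit γ of such a subnet satisfies \(\gamma (\varphi _{0})\in \{0,1\}\): a limit with \(\gamma (\varphi _{0})=1\) can only be reached along indices that are eventually even, and a limit with \(\gamma (\varphi _{0})=0\) along indices that are eventually odd.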
Next, we enlarge the original family {ft, t ∈ T} by introducing the functions \(f_{\gamma }:X\rightarrow \overline {\mathbb {R}}\), \(\gamma \in \widehat {T}\), defined by
that is,
Observe that the family \(\{f_{\gamma },\gamma \in \widehat {T}\}\) includes the elements of the form \(f_{\gamma _{t}}\), t ∈ T, given by
which may not belong to the original family {ft, t ∈ T}, as well as the functions fγ with \(\gamma \in \widehat {T}\setminus \mathfrak {d} (T)\).
Remark 1
Observe that, for all t ∈ T and z ∈ X,
and that the first inequality may be strict. Indeed, one may have that \(f_{\gamma _{t}}(z)=\lim _{i}f_{t_{i}}(z)\) for some \(\gamma _{t_{i}} \rightarrow \gamma _{t}\) such that \((t_{i})_{i}\) does not converge to t. This may happen, for instance, when T is compact but not Hausdorff. On the other hand, if T is completely regular, for example compact Hausdorff, then
The new functions fγ, \(\gamma \in \widehat {T}\), provide the same supremum f as the original ones ft, t ∈ T:
Lemma 1
The functions fγ, \(\gamma \in \widehat {T}\), are convex, and we have
Proof
The convexity of the fγ’s follows easily from the convexity of the ft’s. Next, for each \(\gamma \in \widehat {T}\) and z ∈ X, we have
entailing that \(\sup _{\gamma \in \widehat {T}}f_{\gamma }\leq f\). In addition, if the sequence (tn)n ⊂ T is such that \(f(z)=\lim _{n}f_{t_{n}}(z)\), with z ∈ X, then there exist a subnet (ti)i of (tn)n and \(\gamma \in \widehat {T}\) such that \(\gamma _{t_{i}}\rightarrow \gamma \), and we get
showing that \(\sup _{\gamma \in \widehat {T}}f_{\gamma }\geq f\). □
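For a concrete picture of the enlarged family, here is an illustrative computation of ours. It assumes that (6) is read as \(f_{\gamma }(z)=\limsup _{\gamma _{t}\rightarrow \gamma }f_{t}(z)\), that is, the largest limit of \(f_{t_{i}}(z)\) along nets \((t_{i})_{i}\subset T\) with \(\gamma _{t_{i}}\rightarrow \gamma \). Take \(T=\mathbb {N}\) with the discrete topology and \(f_{n}(z):=(1-\frac {1}{n})|z|\) on \(\mathbb {R}\), so that \(f(z)=|z|\). Let \(\gamma \in \widehat {\mathbb {N}}\setminus \mathfrak {d}(\mathbb {N})\) and \(\gamma _{t_{i}}\rightarrow \gamma \). For each \(m\in \mathbb {N}\), the indicator of \(\{m\}\) is continuous, so its values along \((t_{i})_{i}\) converge to 0 or to 1; the limit 1 would give \(t_{i}=m\) eventually and hence \(\gamma =\gamma _{m}\), which is excluded. Therefore \(t_{i}\neq m\) eventually for every m, that is, \(t_{i}\rightarrow \infty \), and we obtain

$$ f_{\gamma _{n}}(z)=\left (1-\tfrac {1}{n}\right )|z|\quad (n\in \mathbb {N}),\qquad f_{\gamma }(z)=|z|\quad (\gamma \in \widehat {\mathbb {N}}\setminus \mathfrak {d}(\mathbb {N})),\qquad \sup _{\gamma \in \widehat {\mathbb {N}}}f_{\gamma }(z)=|z|=f(z), $$

in agreement with Lemma 1. The enlargement thus adds the limiting function \(|\cdot |\), which does not belong to the original family.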
Now, given x ∈ X, with \(f(x)\in \mathbb {R}\), and ε ≥ 0, we introduce the extended ε-active index set of f at x by
and the extended active index set of f at x
Moreover, taking into account (7), for each t ∈ T(x) we have that
that is,
The set \(\widehat {T}(x)\) is a nonempty set in spite of the possible emptiness of T(x). More generally, we have:
Lemma 2
The sets \(\widehat {T}_{\varepsilon }(x)\), ε ≥ 0 and x ∈domf, are nonempty and compact.
Proof
It is enough to prove that \(\widehat {T}(x)\) is nonempty and closed; the general case when ε > 0 is similar. Fix x ∈domf. For a sequence (tn)n ⊂ T such that \(\lim _{n}f_{t_{n}}(x)=f(x)\) there will exist, due to the compactness of \(\widehat {T}\), a subnet (ti)i ⊂ T such that \(\gamma _{t_{i}}\rightarrow \gamma \in \widehat {T}\), and then (7) ensures that
that is, \(\gamma \in \widehat {T}(x)\) and this set is nonempty.
Next, we show that \(\widehat {T}(x)\) is closed. We take a net \((\gamma _{i})_{i}\subset \widehat {T}(x)\) that converges to γ \((\in \widehat {T})\). Then, by the definition of the \(f_{\gamma }\)’s, for each i we find a net \((t_{ij})_{j}\subset T\) such that \(\gamma _{t_{ij}}\rightarrow _{j}\gamma _{i}\) and
Thus, there exists a diagonal net \(\left (\gamma _{t_{ij_{i}}},f_{t_{ij_{i}}}(x)\right )_{i}\subset \widehat {T}\times \mathbb {R}\) such that \(\gamma _{t_{ij_{i}}}\rightarrow _{i}\gamma \) and \(f_{t_{ij_{i}}}(x)\rightarrow _{i}f(x)\); that is,
and so \(\gamma \in \widehat {T}(x)\). □
Lemma 3
If x ∈domf, then
Proof
Take \(\gamma \in \widehat {T}(x)\). Then there exists a net (ti)i ⊂ T such that \(\gamma _{t_{i}}\rightarrow \gamma \) and
Hence, for each ε > 0 there exists an i0 such that
where ≽ defines the order in the directed set. In other words, \(\gamma _{t_{i}}\in \mathfrak {d} (T_{\varepsilon }(x))\) for all i ≽ i0. This entails that \(\gamma \in \text {cl} (\mathfrak {d} (T_{\varepsilon }(x)))\), and we get \(\gamma \in \bigcap _{\varepsilon >0} \text {cl}\)(\(\mathfrak {d} (T_{\varepsilon }(x)))\), by the arbitrariness of ε > 0.
Conversely, take \(\gamma \in \bigcap _{\varepsilon >0} \text {cl}\)(\(\mathfrak {d} (T_{\varepsilon }(x)))\). Then, for each integer number k and each neighborhood U of γ, there exists some \(\gamma _{t_{(k,U)}}\in U\) with \(t_{(k,U)}\in T_{\frac {1}{k}}(x)\); that is (by (7)),
Since \(\widehat {T}\) is compact Hausdorff (which follows from the complete regularity of T), the net \((\gamma _{t_{(k,U)}})_{(k,U)}\) converges and its limit must be equal to γ. Then
and so \(\gamma \in \widehat {T}(x)\). □
Let us examine the concepts introduced above in a compact (possibly, non-Hausdorff) framework. We denote by \(\sim \) the equivalence relation on T given by
and by \(\tilde {t}\) the equivalence class of t ∈ T. It is known that \(\widehat {T}\) and T are homeomorphic when T is compact Hausdorff.
Lemma 4
Assume that T is compact (possibly, non-Hausdorff). Then, provided that the mapping t↦ft(x) is continuous on T, the following assertions hold true for each x ∈domf :
(i) \(f_{t}(x)=f_{s}(x)\) for all s, t ∈ T such that \(s\sim t\).
(ii)
$$ \widehat{T}(x)=\left\{\tilde{t}\in T/\sim~:~f_{\tilde{t}}(x)=\limsup_{\widetilde{s}\rightarrow\tilde{t}}f_{s}(x)=f(x)\right\} = \left\{\tilde{t}\in T/\sim~:~t\in T(x)\right\}, $$
where \(\widetilde {s}\rightarrow \tilde {t}\) means that \(\varphi (s)\rightarrow \varphi (t)\) for all \(\varphi \in \mathcal {C}(T,[0,1])\).
Proof
Under the current hypothesis it can be proved that \(\widehat {T}\) and the quotient space \(T/\sim \) are homeomorphic, by means of the mapping \(\tilde {t}\in T/\sim ~\longmapsto \gamma _{t}\in \widehat {T}\).
(i) Since \(f_{(\cdot )}(x)\) is continuous and T is compact, we can easily prove the existence of m > 0 such that
Thus, using the positive and the negative parts of f(⋅)(x), \(f_{(\cdot )}^{+}(x)\) and \(f_{(\cdot )}^{-}(x)\), we have \(m^{-1}f_{(\cdot )}^{+}(x)\), \(m^{-1}f_{(\cdot )}^{-}(x)\in \mathcal {C}(T,[0,1])\) and so, for all s,t ∈ T such that \(s\sim t\),
(ii) If \(\tilde {t}\in \widehat {T}(x)\), then there exists a net \((t_{i})_{i}\subset T\) such that
Since T is compact we may assume that \(t_{i}\rightarrow s\in T\), and the continuity of \(f_{(\cdot )}(x)\) entails
that is, s ∈ T(x). Now, fix \(\varphi \in \mathcal {C}(T,[0,1])\). On the one hand, since \(\widetilde {t_{i}}\rightarrow \tilde {t}\), we have that
On the other hand, the continuity of φ yields
and we get φ(t) = φ(s); that is, \(\tilde {t}=\tilde {s}\).
Conversely, if \(\tilde {t}\in T/\sim \) is such that t ∈ T(x), then by (7) we get
and we are done. □
Now we give the main result of the paper, for general index sets and dropping both the upper semi-continuity-type condition and the compactness assumption required in Proposition 1.
Theorem 1
Let {ft,t ∈ T} be a nonempty family of extended real-valued convex functions, and consider \(f=\sup _{t\in T}f_{t}\). Then, for every x ∈domf, we have
where fγ, \(\widehat {T}(x)\) and \(\mathcal {F}(x)\) are defined in (6), (9), and (2), respectively, and T is equipped with a completely regular topology.
In order to prove Theorem 1, we first establish the following key lemma, which constitutes the bridge with the compact framework.
Lemma 5
Assume that f is proper and take x ∈domf, with \(f(x)\in \mathbb {R}\), and ε ≥ 0.
(i) Every net \((\gamma _{i})_{i}\subset \widehat {T}_{\varepsilon }(x)\) has an accumulation point \(\gamma \in \widehat {T}_{\varepsilon }(x)\) such that
$$ \limsup_{i}f_{\gamma_{i}}(z)\leq f_{\gamma}(z)\quad\text{ for all }z\in\text{dom} f. $$ (11)
(ii) If T is completely regular, then (11) holds for every net \((\gamma _{i})_{i}\subset \widehat {T}_{\varepsilon }(x)\) converging to \(\gamma \in \widehat {T}_{\varepsilon }(x)\).
Proof
(i) Fix a net \((\gamma _{i})_{i}\subset \widehat {T}_{\varepsilon }(x)\) and, due to the compactness of \(\widehat {T}_{\varepsilon }(x)\) established in Lemma 2, let \(\gamma \in \widehat {T}_{\varepsilon }(x)\) be an accumulation point of \((\gamma _{i})_{i}\); passing to a subnet, we may assume without loss of generality that \(\gamma _{i}\rightarrow \gamma \). Take z ∈ dom f, so that \(f_{\gamma }(z)\leq f(z)<+\infty \). Next, for each i there will exist a net \((t_{ij})_{j}\subset T\) such that
For every fixed δ > 0 we may suppose, without loss of generality, that for all i
Then there exists a diagonal net \((t_{ij_{i}})_{i}\subset T\) such that \(\gamma _{t_{ij_{i}}}\rightarrow _{i}\gamma \) and
Consequently,
and we get, as δ ↓ 0,
(ii) Fix a net \((\gamma _{i})_{i\in I}\subset \widehat {T}_{\varepsilon }(x)\) such that \(\gamma _{i}\rightarrow \gamma \in \widehat {T}_{\varepsilon }(x)\), and take z ∈domf with \(f_{\gamma }(z)<+\infty \). By assertion (i) the inequality (11) holds for some accumulation point of (γi)i, which must be γ (because \(\widehat {T}\) is Hausdorff).
□
Proof
(of Theorem 1) By Lemma 1, the functions \(\{f_{\gamma },\gamma \in \widehat {T}\}\) are convex and satisfy
According to Lemma 2, the sets \(\widehat {T}_{\varepsilon }(x)\) are compact for every ε ≥ 0, while Lemma 5(ii) entails the upper semi-continuity of the mappings γ↦fγ(z), z ∈domf. Consequently, Proposition 1 applies and yields the desired formula. □
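The role of the enlargement in Theorem 1 can be checked by hand on a small example (ours). It assumes that (6) defines \(f_{\gamma }\) as the upper limit of \(f_{t}\) along \(\gamma _{t}\rightarrow \gamma \), and that \(\widehat {T}(x)\) collects the indices γ with \(f_{\gamma }(x)=f(x)\). Let \(T=\mathbb {N}\) be discrete and \(f_{n}(z):=(1-\frac {1}{n})|z|\) on \(\mathbb {R}\), so that \(f=|\cdot |\), \(f_{\gamma _{n}}=f_{n}\), and \(f_{\gamma }=|\cdot |\) for every \(\gamma \in \widehat {\mathbb {N}}\setminus \mathfrak {d}(\mathbb {N})\), because a net \((t_{i})_{i}\) with \(\gamma _{t_{i}}\) converging to such a γ must tend to infinity. At x = 1 no original function is active, \(T(1)=\emptyset \), but

$$ \widehat {T}(1)=\widehat {\mathbb {N}}\setminus \mathfrak {d}(\mathbb {N})\neq \emptyset ,\qquad \text {co}\left \{\bigcup \nolimits _{\gamma \in \widehat {T}(1)}\partial f_{\gamma }(1)\right \}=\{1\}=\partial f(1). $$

Since \(X=\mathbb {R}\) and \(\text {dom} f=\mathbb {R}\), the only subspace in \(\mathcal {F}(1)\) is \(\mathbb {R}\) itself, so the intersection over \(L\in \mathcal {F}(x)\) in the general formula is trivial in this example.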
Corollary 1
If \(f_{\mid \text {aff}(\text {dom} f)}\) is continuous on \(\text {ri}(\text {dom} f)\) (assumed to be nonempty), then for every x ∈ X
Proof
Under the current assumption, for every \(L\in \mathcal {F}(x)\) and \(\gamma \in \widehat {T}(x)\) such that L ∩ri(domf)≠∅ we have that (see, e.g., [3, Theorem 15(iii)])
Now, given a convex neighborhood U ⊂ X∗ of the origin, we choose \(L\in \mathcal {F}(x)\) such that L ∩ri(domf)≠∅ and L⊥⊂ U. Then Theorem 1 yields
and we get, by intersecting over the U’s,
The conclusion follows as the opposite inclusion is straightforward. □
Theorem 2
If f is finite and continuous at some point, then for every x ∈ X
Proof
Fix x ∈domf. By taking into account that fγ ≤ f, Corollary 1 yields
Additionally, for a neighborhood \(U_{x_{0}}\) of x0 ∈int(domf) such that \(U_{x_{0}}\subset \text {dom} f\), we have
showing that \(\sigma _{\mathrm {N}_{\text {dom} f}(x)}\) is continuous at x0 − x. At the same time, we have
and so, thanks to the Moreau–Rockafellar sum rule, (14) implies that
□
The following corollary is a straightforward consequence of Theorems 1 and 2. We introduce the functions \(\tilde {f}_{t}:X\rightarrow \overline {\mathbb {R}}\), t ∈ T, given by
and denote
Corollary 2
Assume that T is compact Hausdorff. Then, for every x ∈domf, we have
If, in addition, f is finite and continuous at some point, then for every x ∈domf
Proof
Since T is compact Hausdorff, hence completely regular, we have that \(T\equiv \widehat {T}\) and, for all t ∈ T and z ∈ X,
Hence, the first formula is a consequence of Theorem 1. Similarly, the second statement of the corollary follows from Theorem 2. □
The following corollary shows how to deduce Valadier’s formula ([30]), given in the compact setting (see [19, Theorem 3, p. 201] and [31, Theorem 2.4.18]).
Corollary 3
Assume that T is compact Hausdorff. Let U ⊂ X be an open set such that:
(i) \(f_{t}(x)\in \mathbb {R}\) for all t ∈ T and x ∈ U,
(ii) \(t\in T\mapsto f_{t}(x)\) is upper semi-continuous for each x ∈ U,
(iii) \(x\in U\mapsto f_{t}(x)\) is continuous for each t ∈ T.
Then for every x ∈ U we have
Proof
Assume first that X is a Banach space. Then, using classical arguments (see, e.g., [19, 31]), it is shown that the supremum function \(f=\sup _{t\in T}f_{t}\) is finite and, so, continuous on U. Thus, by Corollary 2, for each x ∈ U we have
where \(\tilde {f}_{t}\) and \(\widetilde {T}(x)\) are defined in (15) and (16), respectively.
Take \(t\in \widetilde {T}(x)\). On the one hand, using the compactness assumption, there exist some net (ti)i ⊂ T and t ∈ T such that \(f(x)=\lim _{i}f_{t_{i}}(x)\) and (ti)i converges to t ∈ T. But we have, due to assumption (ii),
and so t ∈ T(x).
On the other hand, also by assumption (ii), for all z ∈ U we have
and both functions \(\tilde {f}_{t}\) and ft coincide at x. Consequently, \(\partial \tilde {f}_{t}(x)\subset \partial f_{t}(x)\) and (17) yields
Thus, we are done since the opposite inclusion is straightforward.
We consider now the case when X is any locally convex space. We fix x ∈ U and x∗∈ ∂f(x). Given an \(L\in \mathcal {F}(x)\), we introduce the convex functions \(g_{t}:L\rightarrow \overline {\mathbb {R}}\), t ∈ T, defined as
that is, gt is the restriction of ft + IL to L, and consider the associated supremum
Therefore, since the family {gt,t ∈ T} satisfies the requirements of the paragraph above, we obtain
Now, take x∗ ∈ ∂f(x), so that \(\hat {x}^{\ast }:=x_{\mid L}^{\ast }\in \partial g(x)=\overline {\text {co}}\left \{\bigcup _{t\in T(x)}\partial g_{t}(x)\right \}\). Then, thanks to the fact that L∗ is isomorphic to the quotient space X∗/L⊥, for every 𝜃-neighborhood V ⊂ X∗ we have that
where \(V_{\mid L}:=\{u_{\mid L}^{\ast }:u^{\ast }\in V\}\) is a 𝜃-neighborhood in X∗/L⊥. In other words, there are u∗∈ V, \(\lambda _{1},\dots ,\lambda _{k}\geq 0\), \(t_{1},\dots ,t_{k}\in T(x)\) and \(\hat {x}_{1}^{\ast },\cdots ,\hat {x}_{k}^{\ast }\in L^{\ast }\) such that λ1 + ⋯ + λk = 1, \(\hat {x}_{j}^{\ast }\in \partial g_{t_{j}}(x)\), \(j=1,\dots ,k\), k ≥ 1, and
Moreover, by the Hahn–Banach theorem, we extend \(\hat {x}_{1}^{\ast },\dots ,\hat {x}_{k}^{\ast }\) to \(x_{1}^{\ast },\dots ,x_{k}^{\ast }\in X^{\ast }\), which satisfy
that is, \(x^{\ast }\in \lambda _{1}x_{1}^{\ast }+\cdots +\lambda _{k}x_{k}^{\ast }+u^{\ast }+L^{\perp }\). But \(x_{j}^{\ast }\in \partial (f_{t_{j}}+\mathrm {I}_{L})(x)\), \(j=1,\dots ,k\), and so
where the last equality follows by applying the Moreau–Rockafellar sum rule (thanks to assumption (iii)). Finally, because L and V were arbitrarily chosen, we deduce that \(x^{\ast }\in \overline {\text {co}}\left \{\bigcup _{t\in T(x)}\partial f_{t}(x)\right \}\) (see (1)), and the inclusion “⊂” follows. □
Corollary 4
Assume that \(X=\mathbb {R}^{n}\). Then (12) and (13) hold with co instead of \(\overline {\text {co}}\).
Proof
Similarly to the proof of Proposition 1, we can prove that the set
is closed and (12) holds with co instead of \(\overline {\text {co}}\); that is,
In addition, if f is finite and continuous at some point in domf, then each function fγ (≤ f), \(\gamma \in \widehat {T}(x)\), is finite and continuous at the same point, and (18) yields (13) with co instead of \(\overline {\text {co}}\),
□
Example 1
Consider the family of convex functions \(g_{2n+1}\), \(h_{2n}\), \(n\in \mathbb {N}\), defined on \(\mathbb {R}\) as
We introduce the family \(\{f_{n}, n\in \mathbb {N}\}\) such that \(f_{2n+1}:=g_{2n+1}\) and \(f_{2n}:=h_{2n}\), together with the supremum function
Obviously,
and
Thus, if we apply (4) in Proposition 1, we reach a false conclusion as the assumption there is not satisfied in this case:
The Stone–Čech compactification of \(\mathbb {N}\) is given by
whereas the fγ’s, \(\gamma \in \widehat {\mathbb {N}}\), take the form
for \(\gamma \in \mathbb {N}\), and
for \(\gamma \in \widehat {\mathbb {N}}\setminus \mathbb {N}\). Equivalently, we consider the family
where \(g_{\bar {\gamma }}\), \(h_{\bar {\gamma }}: \mathbb {R}\rightarrow \mathbb {R}\) are defined as
It is easily checked that this new family has the same properties as the original one, \(\{f_{\gamma },\gamma \in \widehat {\mathbb {N}}\}\). In other words, we have enlarged the original family of functions by adding \(g_{\bar {\gamma }}\) and \(h_{\bar {\gamma }}\). Therefore, applying (10), we get
and, for x≠ 0, say x = 1,
Observe that the presence of the new functions \(g_{\bar {\gamma }}\) and \(h_{\bar {\gamma }}\) is necessary, since the subdifferentials at 0 of the data functions \(g_{2n+1}\) and \(h_{2n}\) do not lead us to the whole subdifferential of the supremum function f, as they do not include the subgradients −1 and 1.
In order to decompose the subdifferential term involved in formula (10) we need to impose some additional continuity or lower semi-continuity conditions on the initial functions. The assumption in Theorem 2 gives the first example, where the continuity of the supremum function allows us to characterize \(\partial f(x)\) by means only of the sets \(\partial f_{\gamma }(x)\). We give next an alternative representation of \(\partial f(x)\) by means of the ε-subdifferentials of the \(f_{\gamma }\)’s, under the condition
where clf and clft are the closed hulls (lower semi-continuous regularizations) of the respective functions.
Proposition 2
If (19) holds, then for every x ∈domf
where fγ, \(\widehat {T}(x)\) and \(\mathcal {F}(x)\) are defined in (6), (9), and (2), respectively, and T is a completely regular topological space.
Proof
It suffices to apply [7, Theorem 3.8] to the family \(\{f_{\gamma }, \gamma \in \widehat {T}\}\). □
We discuss next a nonconvex counterpart of formula (10), under the following condition introduced in [21],
where \(f^{\ast \ast }\) and \(f_{t}^{\ast \ast }\) are the biconjugates of the respective functions. In the convex case, and assuming that the conjugates \(f^{\ast }\) and \(f_{t}^{\ast }\) are proper, (19) is equivalent to the last relation.
Proposition 3
Let \(\{f_{t},t\in T\}\) be a nonempty family of extended real-valued, not necessarily convex functions, and consider \(f=\sup _{t\in T}f_{t}\). If condition (20) holds, then for every x ∈ dom f
where fγ, \(\widehat {T}(x)\), and \(\mathcal {F}(x)\) are defined in (6), (9), and (2), respectively, and T is equipped with a completely regular topology.
Proof
Assume that ∂f(x)≠∅, so that f(x) = f∗∗(x) and ∂f(x) = ∂f∗∗(x). Then, by applying Theorem 1 to the family \(\{f_{t}^{\ast \ast },t\in T\}\), we obtain
where \(g_{\gamma }:X\rightarrow \overline {\mathbb {R}}\), \(\gamma \in \widehat {T}\), are defined by
and
Observe that for every \(\gamma \in \widehat {T}^{1}(x)\) we have that
and so \(\gamma \in \widehat {T}(x)=\{\gamma \in \widehat {T}:f_{\gamma }(x)=f(x)\}\). Moreover, since
we deduce that for all \(L\in \mathcal {F}(x)\)
Thus, the inclusion “⊂” in the first statement follows from (21), and we are done since the opposite inclusion is easily verified.
The second statement follows similarly by using Proposition 2 instead of Theorem 1. □
5 An Application to Optimality Conditions
In this section, we revise the optimality conditions for convex semi-infinite programming established in [6], by removing the compactness of the set indexing the constraints.
Aside from [6], a significant precedent of the results in this section can be found in [11, Chapter 7], where KKT conditions are established for convex semi-infinite optimization with finite-valued functions, using a closedness condition which is implied by some version of Slater’s qualification. Many KKT conditions exist in the literature which are obtained via different approaches: approximate subdifferentials of the data functions ([3, 16]), exact subdifferentials at close points [28], Farkas–Minkowski-type closedness criteria [8] in convex semi-infinite optimization, and strong CHIP-like qualifications for convex optimization with not necessarily convex \(\mathcal {C}^{1}\)-constraints [2] (see also [9] for locally Lipschitz constraints), among others.
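Here we consider the following optimization problem
$$ (\mathcal {P})\qquad \text {Min}\ f_{0}(x)\quad \text {subject to}\quad f_{t}(x)\leq 0,\ t\in T, $$
where T is a completely regular topological space, and \(f_{t}:\mathbb {R}^{n}\rightarrow \mathbb {R}_{\infty }\), for t ∈ T ∪{0} (we assume, without loss of generality, that 0∉T), are proper and convex. Problem \((\mathcal {P})\) is equivalent to
$$ \text {Min}\ f_{0}(x)\quad \text {subject to}\quad f(x)\leq 0, $$
where
$$ f:=\sup _{t\in T}f_{t}. $$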
Let the set \(\widehat {T}\) and the convex functions \(f_{\gamma }:\mathbb {R}^{n}\rightarrow \mathbb {R}_{\mathbb {\infty }}\), \(\gamma \in \widehat {T}\), be as defined in (5) and (6), respectively. We also denote
so that, by Lemma 3, for every feasible point \(x\in \mathbb {R}^{n}\) for \(\mathcal {(P)}\) we have
where
The following theorem establishes Fritz–John-type necessary optimality conditions for problem \((\mathcal {P})\). The main feature of this result and of the subsequent corollary is the absence of any compactness or continuity assumptions on the index set and on the mappings \(t\mapsto f_{t}(z)\), which were required in [6, Theorem 5].
Theorem 3
Assume that \(\bar {x}\) is an optimal solution of \(\mathcal {(P)}\). Then we have
(a)
$$ 0_{n}\in\text{co}\left\{\partial(f_{0}+\mathrm{I}_{\text{dom} f})(\bar{x})\cup \bigcup\nolimits_{\gamma\in\widehat{A}(\bar{x})} \partial(f_{\gamma}+\mathrm{I}_{\text{dom} f_{0}\cap \text{dom} f})(\bar{x})\right\}. $$
(b) Moreover, under the condition
$$ \text{ri}(\text{dom} f_{\gamma})\cap\text{ri}(\text{dom} f)\neq\emptyset\quad\text{ for all }\gamma\in\widehat{A}(\bar{x})\cup\{0\}, $$
we have
$$ 0_{n}\in\text{co}\left\{\partial f_{0}(\bar{x})\cup \bigcup\nolimits_{\gamma\in\widehat{A}(\bar{x})} \partial f_{\gamma}(\bar{x})\right\} + \mathrm{N}_{\text{dom} f} (\bar{x})+\mathrm{N}_{\text{dom} f_{0}}(\bar{x}). $$
Proof
We consider the supremum function \(g:\mathbb {R}^{n} \rightarrow \mathbb {R}_{\infty }\), defined as
$$ g(x):=\max\{f_{0}(x)-f_{0}(\bar{x}),\ f(x)\}, $$
so that \(\text{dom}\, g=\text{dom} f_{0}\cap \text{dom} f\). It is easily verified that \(\bar {x}\) is a global minimizer of g; that is, \(0_{n}\in \partial g(\bar {x})\).
We endow the set T ∪{0} with the topology generated by the open sets of T and {0}, which makes it completely regular. Then the compactification of T ∪{0} can be identified with \(\widehat {T}\cup \{0\}\). Consequently, and according to Corollary 4, \(\bar {x}\) satisfies
which is condition (a).
(b) Under the current assumptions, the classical sum rule ([25]) gives, on the one hand,
and, on the other hand, since
we obtain
Thus, the conclusion follows from (a). □
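The minimality of \(\bar {x}\) invoked at the start of the proof can be checked by a short case analysis. The following is only a sketch: it assumes that g has the form \(g=\max \{f_{0}-f_{0}(\bar {x}),f\}\), which is consistent with \(\text{dom}\, g=\text{dom} f_{0}\cap \text{dom} f\), and that \(f_{0}(\bar {x})\) is finite. Since \(\bar {x}\) is feasible,
$$ g(\bar{x})=\max\{0,f(\bar{x})\}=0, $$
while for an arbitrary \(x\in \mathbb {R}^{n}\), using the optimality of \(\bar {x}\) in the feasible case,
$$ f(x)\leq 0\ \Longrightarrow\ g(x)\geq f_{0}(x)-f_{0}(\bar{x})\geq 0,\qquad f(x)>0\ \Longrightarrow\ g(x)\geq f(x)>0. $$
Hence \(g\geq 0=g(\bar {x})\) on \(\mathbb {R}^{n}\) under these assumptions.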
Remark 2
In particular, if \(f(\bar {x})<0\), then the last condition reads
as \(f_{\gamma }(\bar {x})\leq f(\bar {x})<0\) for all \(\gamma \in \widehat {T}\), and so \(\widehat {A}(\bar {x})=\emptyset \).
Remark 3
Observe that the strong Slater condition, i.e., the existence of some \(x_{0}\in \text{dom} f_{0}\) such that \(f(x_{0})<0\), does not imply that \(x_{0}\) is an interior point of the feasible set, as the following example shows. Take \(T:=[0,+\infty \lbrack \), \(f_{0}\equiv 0\) and let \(f_{t}:\mathbb {R}\rightarrow \mathbb {R}\), t ∈ T, be defined as
The point 0 is a strong Slater point, but \(0\notin \text {int}(\{x\in \mathbb {R}:~f_{t}(x)\leq 0,~t\in T\})\).
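The phenomenon in Remark 3 can be checked on a concrete family. The sketch below assumes the hypothetical choice \(f_{t}(x):=tx-1\), t ∈ T, which is picked only for illustration and is not necessarily the family used in the remark. For this choice the supremum is −1 on \(]-\infty ,0]\) and +∞ on \(]0,+\infty \lbrack \), so 0 is a strong Slater point while every ε > 0 violates some constraint. The function names are illustrative.

```python
import math
from typing import Optional


def f_t(t: float, x: float) -> float:
    """Hypothetical constraint family f_t(x) = t*x - 1 (an assumption for illustration)."""
    return t * x - 1.0


def sup_f(x: float) -> float:
    """Exact supremum of f_t(x) over t in [0, +inf): -1 if x <= 0, +inf otherwise."""
    return -1.0 if x <= 0 else math.inf


def violating_index(x: float) -> Optional[float]:
    """Return some t >= 0 with f_t(x) > 0, or None when x is feasible."""
    if x <= 0:
        return None
    return 2.0 / x  # then f_t(x) is about 2 - 1 = 1 > 0


# Value of the supremum at the candidate Slater point x0 = 0.
slater_value = sup_f(0.0)

# Points arbitrarily close to 0 from the right, each with a violated constraint.
witnesses = {eps: violating_index(eps) for eps in (1.0, 1e-3, 1e-6)}
```

Here `slater_value` is negative, whereas each entry of `witnesses` exhibits an index t for which the constraint fails at a point to the right of 0, so 0 lies on the boundary of the feasible set of this assumed family.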
We derive next the KKT conditions for problem \(\mathcal {(P)}\) under the Slater qualification.
Corollary 5
Under the strong Slater condition, that is,
$$ f(x_{0})<0\quad \text{for some } x_{0}\in \text{dom} f_{0}, $$
the point \(\bar {x}\) is optimal for \(\mathcal {(P)}\) if and only if
Proof
Assume first that \(f(\bar {x})=0\). By Theorem 3(a), \(\bar {x}\) is optimal if and only if either
or (22) holds.
Moreover, by Theorem 1 we have that
and so relation (23) is equivalent to
equivalently, \(f(x)\geq f(\bar {x})=0\) for all \(x\in \text{dom} f_{0}\), which contradicts the strong Slater condition.
Finally, if \(f(\bar {x})<0\), then (22) follows by Theorem 3(a). □
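To illustrate why active indices may have to be sought outside T when T is not compact, consider a toy instance, which is an assumption made here and not an example from the paper: minimize \(f_{0}(x)=x^{2}\) subject to \(f_{t}(x)=t-x\leq 0\) for t ∈ T := [0,1[. The supremum \(f(x)=1-x\) is not attained, the minimizer is \(\bar {x}=1\), and no t ∈ T is active there. Only the limiting function x ↦ 1 − x, obtained as t → 1, is active. Whether this limit coincides with some \(f_{\gamma }\) from (6) is not verified in this sketch.

```python
# Toy instance (all choices below are illustrative assumptions):
#   minimize f0(x) = x^2  subject to  f_t(x) = t - x <= 0 for all t in T = [0, 1).

def f0(x: float) -> float:
    return x * x


def f_t(t: float, x: float) -> float:
    return t - x


def f_sup(x: float) -> float:
    """Exact supremum of t - x over t in [0, 1); it equals 1 - x and is not attained."""
    return 1.0 - x


x_bar = 1.0  # minimizer of x^2 over the feasible set [1, +inf)

# Indices approaching the missing endpoint t = 1 of T.
ts = [1.0 - 10.0 ** (-k) for k in range(1, 8)]

# Slack f(x_bar) - f_t(x_bar) = 1 - t: positive for every t in T, so no index is active.
slacks = [f_sup(x_bar) - f_t(t, x_bar) for t in ts]

# The limiting function x -> 1 - x (as t -> 1) is active at x_bar and has slope -1.
grad_f0 = 2.0 * x_bar
grad_limit = -1.0
multiplier = -grad_f0 / grad_limit  # solves grad_f0 + multiplier * grad_limit = 0
residual = grad_f0 + multiplier * grad_limit
```

In this instance the stationarity equation is solved with a nonnegative multiplier attached to the limiting function, although no original constraint \(f_{t}\), t ∈ T, is active at \(\bar {x}\).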
References
Brøndsted, A.: On the subdifferential of the supremum of two convex functions. Math. Scand. 31, 225–230 (1972)
Chieu, N.H., Jeyakumar, V., Li, G., Mohebi, H.: Constraint qualifications for convex optimization without convexity of constraints: new connections and applications to best approximation. Eur. J. Oper. Res. 265, 19–25 (2018)
Correa, R., Hantoute, A., López, M.A.: Weaker conditions for subdifferential calculus of convex functions. J. Funct. Anal. 271, 1177–1212 (2016)
Correa, R., Hantoute, A., López, M.A.: Towards supremum-sum subdifferential calculus free of qualification conditions. SIAM J. Optim. 26, 2219–2234 (2016)
Correa, R., Hantoute, A., López, M.A.: Valadier-like formulas for the supremum function I. J. Convex Anal. 25, 1253–1278 (2018)
Correa, R., Hantoute, A., López, M.A.: Moreau–Rockafellar type formulas for the subdifferential of the supremum function. SIAM J. Optim. 29, 1106–1130 (2019)
Correa, R., Hantoute, A., López, M.A.: Valadier-like formulas for the supremum function II: the compactly indexed case. J. Convex Anal. 26, 299–324 (2019)
Dinh, N., Goberna, M.A., López, M.A.: From linear to convex systems: consistency, Farkas’ lemma and applications. J. Convex Anal. 13, 113–133 (2006)
Dutta, J., Lalitha, C.S.: Optimality conditions in convex optimization revisited. Optim. Lett. 7, 221–229 (2013)
Ekeland, I., Témam, R.: Convex Analysis and Variational Problems. North-Holland & American Elsevier, Amsterdam (1976)
Goberna, M.A., López, M.A.: Linear Semi-infinite Optimization. J. Wiley, Chichester (1998)
Hantoute, A.: Subdifferential set of the supremum of lower semi-continuous convex functions and the conical hull property. Top 14, 355–374 (2006)
Hantoute, A., López, M.A.: A complete characterization of the subdifferential set of the supremum of an arbitrary family of convex functions. J. Convex Anal. 15, 831–858 (2008)
Hantoute, A., López, M.A., Zălinescu, C.: Subdifferential calculus rules in convex analysis: a unifying approach via pointwise supremum functions. SIAM J. Optim. 19, 863–882 (2008)
Hiriart-Urruty, J.-B.: Convex analysis and optimization in the past 50 years: some snapshots. In: Demyanov, V.F., Pardalos, P.M., Batsyn, M. (eds.) Constructive Nonsmooth Analysis and Related Topics. Springer Optimization and Its Applications, vol. 87, pp. 245–253. Springer, New York (2014)
Hiriart-Urruty, J.-B., Phelps, R.R.: Subdifferential calculus using ε-subdifferentials. J. Funct. Anal. 118, 154–166 (1993)
Ioffe, A.D.: A note on subdifferentials of pointwise suprema. Top 20, 456–466 (2012)
Ioffe, A.D., Levin, V.L.: Subdifferentials of convex functions. Tr. Moskov Mat. Obshch 26, 3–73 (1972). (In Russian)
Ioffe, A.D., Tikhomirov, V.M.: Theory of Extremal Problems. Studies in Mathematics and Its Applications, vol. 6. North-Holland, Amsterdam (1979)
Levin, V.L.: An application of Helly’s theorem in convex programming, problems of best approximation and related questions. Mat. Sb., Nov. Ser. 79, 250–263 (1969). Engl. Trans.: Math. USSR, Sb. 8, 235–247 (1969)
López, M.A., Volle, M.: A formula for the set of optimal solutions of relaxed minimization problems. Applications to subdifferential calculus. J. Convex Anal. 17, 1057–1075 (2010)
Munkres, J.: Topology, 2nd edn. Prentice Hall, Upper Saddle River (2000)
Pschenichnyi, B.N.: Convex programming in a normalized space. Kibernetika 5, 46–54 (1965). (Russian); Engl. Trans.: Cybernetics 1, 46–57 (1965)
Rockafellar, R.T., Brøndsted, A.: On the subdifferentiability of convex functions. Proc. Am. Math. Soc. 16, 605–611 (1965)
Rockafellar, R.T.: Convex Analysis. Princeton University Press, Princeton (1970)
Rockafellar, R.T.: Directionally Lipschitzian functions and subdifferential calculus. Proc. Lond. Math. Soc. 39, 331–355 (1979)
Solov’ev, V.N.: The subdifferential and the directional derivatives of the maximum of a family of convex functions. Izvestiya RAN: Ser. Mat. 65, 107–132 (2001)
Thibault, L.: Sequential convex subdifferential calculus and sequential Lagrange multipliers. SIAM J. Control Optim. 35, 1434–1444 (1997)
Tikhomirov, V.M.: Analysis II: convex analysis and approximation theory. In: Gamkrelidze, R.V. (ed.) Encyclopaedia of Mathematical Sciences, vol. 14. Springer, Berlin (1990)
Valadier, M.: Sous-différentiels d’une borne supérieure et d’une somme continue de fonctions convexes. C. R. Acad. Sci. Paris Sér. A-B 268, A39–A42 (1969)
Zălinescu, C.: Convex Analysis in General Vector Spaces. World Scientific, River Edge (2002)
Acknowledgements
Research supported by CONICYT (Fondecyt 1190012 and 1190110), Proyecto/Grant PIA AFB-170001, MICIU of Spain and Universidad de Alicante (Grant Beatriz Galindo BEAGAL 18/00205), and Research Project PGC2018-097960-B-C21 from MICINN, Spain. The research of the third author is also supported by the Australian ARC - Discovery Projects DP 180100602.
The authors wish to thank the referee for the valuable comments and suggestions, which have contributed to improving the first version of this paper.
Additional information
Dedicated by his coauthors to Prof. Marco A. López on his 70th birthday.
Cite this article
Correa, R., Hantoute, A. & López, M.A. Subdifferential of the Supremum via Compactification of the Index Set. Vietnam J. Math. 48, 569–588 (2020). https://doi.org/10.1007/s10013-020-00403-5
DOI: https://doi.org/10.1007/s10013-020-00403-5
Keywords
- Supremum of convex functions
- Subdifferentials
- Stone–Čech compactification
- Convex semi-infinite programming
- Optimality conditions