1 Introduction

If we consider the pointwise supremum \(f:=\sup _{t\in T}f_{t}\) of a collection of convex functions \(f_{t}:X\rightarrow \mathbb {R}\cup \{\pm \infty \}\), \(t\in T\), with T arbitrary, defined on a separated locally convex space X, a challenging problem in the recent history of optimization (especially in the 1960s and 1970s) has been to obtain formulas for the subdifferential of the supremum, \(\partial f(x)\), at any point x of the effective domain of f, in terms of the subdifferentials of the data functions, \(\partial f_{t}(x)\), \(t\in T\).

Since many convex functions, such as the Fenchel conjugate, the sum, the composition with affine mappings, etc., can be expressed as the supremum of affine or convex functions, formulas characterizing the subdifferential of the supremum were expected to play a crucial role in convex and variational analysis, leading to a variety of calculus rules and allowing a deeper analysis of some relevant problems in this area. For instance, any formula for the subdifferential of the supremum function can be seen as a useful tool in deriving KKT-type optimality conditions for a convex optimization problem. This is due to the fact that any set of convex constraints, even an infinite one, can be replaced by a single convex constraint involving the supremum function. An alternative approach consists of replacing the constraints by the indicator function of the feasible set. It turns out that, under certain constraint qualifications, its subdifferential (i.e., the normal cone to the feasible set) appears in the so-called Fermat optimality principle, and its relation with the subdifferential of the supremum function can then be conveniently exploited.
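
To make this reduction explicit, the following sketch uses a generic objective \(g\) and a generic feasible set \(C\); both symbols are illustrative assumptions and are not notation taken from the cited works. Writing \(f:=\sup_{t\in T}f_{t}\), the (possibly infinite) constraint system collapses to one constraint,

$$ C:=\{x\in X:~f_{t}(x)\leq 0~\text{ for all }t\in T\}=\{x\in X:~f(x)\leq 0\}, $$

while the indicator-function approach rewrites the constrained problem as an unconstrained one, whose minimizers \(\bar{x}\) (with finite optimal value) are characterized by the Fermat principle:

$$ \inf_{x\in C}g(x)=\inf_{x\in X}\left( g+\mathrm{I}_{C}\right) (x),\qquad \theta\in\partial(g+\mathrm{I}_{C})(\bar{x}). $$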

Let us quote the following paragraph extracted from [15]: “One of the most specific constructions in convex or nonsmooth analysis is certainly taking the supremum of a (possibly infinite) collection of functions. In the years 1965–1970, various calculus rules concerning the subdifferential of sup-functions started to emerge; working in that direction and using various assumptions, several authors contributed to this calculus rule: B.N. Pshenichnyi, A.D. Ioffe, V.L. Levin, R.T. Rockafellar, A. Sotskov, etc.; however, the most elaborated results of that time were due to M. Valadier (1969); he made use of ε-active indices in taking the supremum of the collection of functions.”

Therefore, it is clear that the mathematical interest of this topic was widely recognized since the very beginning of the convex and variational analysis history. A sample of remarkable contributions to this topic are: Brøndsted [1], Ekeland and Temam [10], Ioffe [17], Ioffe and Levin [18], Ioffe and Tikhomirov [19], Levin [20], Pschenichnyi [23], Rockafellar [26], Valadier [30], etc. See, for instance, Tikhomirov [29] to trace out the historical origins of the issue.

In a series of papers ([3,4,5,6, 12,13,14], etc.) we provided alternative characterizations of the subdifferential of the supremum in various settings, and applied them to derive calculus rules in convex analysis.

In [7] we addressed the problem of characterizing the subdifferential of the supremum of a compactly-indexed family of extended real-valued convex functions. The assumptions made there, which are standard in the literature of convex analysis and non-differentiable semi-infinite programming, are the compactness of the index set T and the upper semi-continuity of the constraint functions with respect to the index t. Two questions arise naturally. The first, basic one is: Is it possible to remove these assumptions? The second, more precise one is: By using a compactification of the index set and an appropriate enlargement of the original family of data functions, can we get rid of these assumptions while still being able to apply the theory developed under them?

In this framework, we propose in the current paper an approach based on the Stone–Čech compactification of the index set T, as well as a natural procedure for building an appropriate enlargement of the original family ensuring the fulfillment of the minimal requirements of continuity of the functions with respect to the index. Moreover, in contrast to previous approaches, our characterizations are formulated exclusively in terms of exact subdifferentials at the nominal point.

Formula (10) constitutes the main result of the paper. It provides an explicit expression of the subdifferential of the supremum function for any family of convex functions, dropping the usual standard assumptions in the literature (upper semi-continuity and compactness conditions; see, e.g. [1, 7, 19, 27, 30]). Namely, compared with the formula

$$ \partial f(x)=\bigcap\nolimits_{L\in\mathcal{F}(x),\varepsilon>0} \overline{\text{co}}\left\{\bigcup\nolimits_{t\in T_{\varepsilon }(x)}\partial_{\varepsilon}(f_{t}+\mathrm{I}_{L\cap\text{dom} f})(x)\right\} $$

(see (2) and (3) for the definition of \(\mathcal {F}(x)\) and \(T_{\varepsilon }(x)\), respectively), which can be easily derived from the main result in [14, Theorem 4], formula (10) involves the convex hull of the union of the exact subdifferentials of exclusively the active functions, up to an appropriate enlargement of the original family of functions.

The paper is structured as follows. After a short section introducing the notation, the main result in the section devoted to preliminaries is formula (4) in Proposition 1, which slightly improves Proposition 2 in [6] as it uses the convex hull instead of the closed convex hull. In Section 4 the compactification process is described in detail, and an appropriate enlargement of the original family \(\{f_{t},t\in T\}\) is built, through formula (6), in order to guarantee the (upper semi-)continuity requirements with respect to the index t, which allow us to apply the results in [7]. Our main result in Section 4, Theorem 1, provides the desired characterization of the subdifferential of f in non-compact frameworks. It comes after some needed technical lemmas, and some corollaries are also established under certain specific assumptions. An example illustrates the compactification approach, and the last section provides Fritz–John and KKT-type optimality conditions for the convex semi-infinite optimization problem such that the compactness/continuity assumptions in [6, Theorem 5 and Corollary 6] are again dropped.

2 Notation

Let X be a (real) separated locally convex space, whose topological dual space is \(X^{\ast}\), which is endowed with the \(w^{\ast}\)-topology. The spaces X and \(X^{\ast}\) are paired in duality by the bilinear form \((x^{\ast},x)\in X^{\ast}\times X\mapsto \langle x^{\ast},x\rangle :=\langle x,x^{\ast}\rangle :=x^{\ast}(x)\). The zero vectors in X and \(X^{\ast}\) are denoted by 𝜃. Closed, convex and balanced neighborhoods of 𝜃 are called 𝜃-neighborhoods. We use the notation \(\overline {\mathbb {R}}:=\mathbb {R}\cup \{-\infty ,+\infty \}\) and \(\mathbb {R}_{\infty }:=\mathbb {R}\cup \{+\infty \}\), and adopt the convention \((+\infty ) + (-\infty ) = (-\infty ) + (+\infty )=+\infty \).

Given two nonempty sets A and B in X (or in \(X^{\ast}\)), we define the algebraic (or Minkowski) sum by

$$ A+B:=\{a+b:~a\in A, b\in B\},\quad A+\emptyset=\emptyset+A=\emptyset. $$

By co(A), cone(A), and aff(A), we denote the convex, the conical convex (i.e., \(\text {cone} A:=\mathbb {R}_{+}(\text {co} A)\)), and the affine hulls of the set A, respectively. Moreover, int(A) is the interior of A, and cl A and \(\overline {A}\) are used interchangeably to denote the closure of A. We use ri(A) to denote the (topological) relative interior of A (i.e., the interior of A in the topology relative to aff(A) if aff(A) is closed, and the empty set otherwise).

Associated with A we consider the orthogonal subspace given by

$$ A^{\perp}:=\{x^{\ast}\in X^{\ast}:~\langle x^{\ast},x\rangle = 0~\text{ for all }x\in A\}. $$

The following relation is fulfilled

$$ \bigcap\nolimits_{L\in\mathcal{F}}(A+L^{\perp})\subset\text{cl} A, $$
(1)

where \(\mathcal {F}\) is the family of finite-dimensional linear subspaces of X.

If \(A\subset X\) is convex and \(x\in X\), we define the normal cone to A at x as

$$ \mathrm{N}_{A}(x):=\{x^{\ast}\in X^{\ast}:~\langle x^{\ast},y-x\rangle \leq 0~\text{ for all }y\in A\}, $$

if \(x\in A\), and the empty set otherwise.

Given a function \(f:X\longrightarrow \overline {\mathbb {R}}\), its (effective) domain is

$$ \text{dom} f:=\{x\in X:~f(x)<+\infty\}. $$

We say that f is proper when \(\text{dom} f\neq \emptyset \) and \(f(x)>-\infty \) for all \(x\in X\).

Given \(x\in X\) and \(\varepsilon \geq 0\), the ε-subdifferential of f at x is

$$ \partial_{\varepsilon}f(x)=\{x^{\ast}\in X^{\ast}:~f(y)\geq f(x)+\langle x^{\ast},y-x\rangle-\varepsilon~\text{ for all }y\in X\} $$

when \(x\in \text{dom} f\), and \(\partial_{\varepsilon }f(x):=\emptyset \) when \(f(x)\notin \mathbb {R}\). The elements of \(\partial_{\varepsilon }f(x)\) are called ε-subgradients of f at x. The subdifferential of f at x is \(\partial f(x):=\partial_{0}f(x)\), whose elements are called subgradients of f at x.
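
As a quick illustration of these definitions (a standard one-dimensional computation, included here only as a sketch), take \(X=\mathbb{R}\) and \(f(x)=|x|\). For every \(\varepsilon\geq0\) one finds

$$ \partial_{\varepsilon}f(0)=[-1,1],\qquad \partial_{\varepsilon}f(x)=\left[ \max\left\{ -1,~1-\frac{\varepsilon}{x}\right\} ,~1\right] \quad\text{ for }x>0. $$

In particular, \(\varepsilon=0\) gives \(\partial f(0)=[-1,1]\) and \(\partial f(x)=\{1\}\) for \(x>0\), while for \(\varepsilon\geq2x\) the ε-subdifferential at \(x>0\) is the whole interval \([-1,1]\).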

The support and the indicator functions of \(A\subset X\) are respectively defined as

$$ \sigma_{A}(x^{\ast}):=\sup\{\langle x^{\ast},x\rangle:~x\in A\}\quad\text{ for }x^{\ast}\in X^{\ast}, $$

and

$$ \mathrm{I}_{A}(x):=\left\{ \begin{array}{ll} 0 &~\text{ if }x\in A,\\ +\infty &~\text{ if }x\in X\setminus A. \end{array} \right. $$
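
For instance (a standard one-dimensional computation, given here only as an illustration), if \(X=\mathbb{R}\) and \(A=[-1,1]\), then

$$ \sigma_{A}(x^{\ast})=|x^{\ast}|,\qquad \mathrm{N}_{A}(1)=[0,+\infty),\qquad \mathrm{N}_{A}(0)=\{0\}. $$

More generally, comparing the definitions of the subdifferential and of the normal cone gives \(\partial\mathrm{I}_{A}(x)=\mathrm{N}_{A}(x)\) for every convex set \(A\subset X\) and every \(x\in A\).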

3 Preliminary Results

We give a first characterization of the subdifferential of the supremum

$$ f:=\sup_{t\in T}f_{t}, $$

of a family of extended real-valued convex functions \(\{f_{t},t\in T\}\), defined on a (separated) real locally convex space X, and indexed by an arbitrary (possibly, infinite) set T.

We shall need the following result, which slightly improves Proposition 2 in [6], as it uses the convex hull instead of the closed convex hull. Our main result, given in Theorem 1, provides the general characterization of the subdifferential of f in not necessarily compact frameworks.

Given \(x\in X\) and \(\varepsilon \geq 0\), we shall denote

$$ \begin{array}{@{}rcl@{}} \mathcal{F}(x)&:=&\{L\text{ is a finite-dimensional linear subspace of }X\text{ containing }x\}, \end{array} $$
(2)
$$ \begin{array}{@{}rcl@{}} T_{\varepsilon}(x)&:=&\{t\in T:~f_{t}(x)\geq f(x)-\varepsilon\} \quad\text{ and }\quad T(x):=T_{0}(x). \end{array} $$
(3)
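
To see why the ε-active sets in (3) are needed, consider the following toy family (an illustration with data chosen here for convenience, not an example taken from the cited literature): \(X=\mathbb{R}\), \(T=\{1,2,\ldots\}\) and \(f_{t}(z):=z-1/t\). Then \(f(z)=\sup_{t\in T}(z-1/t)=z\), and for every \(x\in\mathbb{R}\) and \(\varepsilon>0\)

$$ T(x)=\emptyset ,\qquad T_{\varepsilon}(x)=\{t\in T:~1/t\leq\varepsilon\}=\{t\in T:~t\geq 1/\varepsilon\}. $$

Thus no index is active, although \(\partial f(x)=\{1\}\neq\emptyset\); any formula built only on \(T(x)\) would produce the empty set here.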

Proposition 1

Fix \(x\in X\). We assume that there is some \(\varepsilon _{0}>0\) such that (i) \(T_{\varepsilon _{0}}(x)\) is compact and (ii) for each net \((t_{i})_{i}\subset T_{\varepsilon _{0}}(x)\) converging to \(t\in T_{\varepsilon _{0}}(x)\) we have that

$$ \limsup_{i}f_{t_{i}}(z)\leq f_{t}(z)\quad\text{ for all }z\in\text{dom} f. $$

Then

$$ \partial f(x)=\bigcap\nolimits_{L\in\mathcal{F}(x)}\text{co}\left\{\bigcup\nolimits_{t\in T(x)}\partial(f_{t}+\mathrm{I}_{L\cap \text{dom} f})(x)\right\}. $$
(4)

Proof

According to [6, Proposition 2], where (4) was established with \(\overline {\text {co}}\) instead of co, we only need to prove that the sets

$$ E_{L}:=\text{co}\left\{\bigcup\nolimits_{t\in T(x)}\partial (f_{t}+\mathrm{I}_{L\cap\text{dom} f})(x)\right\},\quad L\in\mathcal{F}(x), $$

are closed under the current hypothesis. Let us denote, for \(t\in T(x)\) and \(L\in \mathcal {F}(x)\),

$$ \tilde{g}_{t}:=f_{t}+\mathrm{I}_{L\cap\operatorname*{dom}f},\text{ } g_{t}:=\tilde{g}_{t\mid_{L}}, $$

so that

$$ \text{dom} \tilde{g}_{t}=\text{dom} f_{t}\cap(L\cap \text{dom} f)=L\cap\text{dom} f. $$

Take a net \((u_{i}^{\ast })_{i}\subset E_{L}\) such that \(u_{i}^{\ast }\rightarrow u^{\ast }\in X^{\ast }\). We denote by \(z_{i}^{\ast }\) the restriction of \(u_{i}^{\ast }\) to the finite-dimensional subspace L, so that

$$ (z_{i}^{\ast})_{i}\subset\text{co}\left\{\bigcup\nolimits_{t\in T(x)}\partial g_{t}(x)\right\} \subset L^{\ast}, $$

where \(L^{\ast}\) is the dual of L. Then, by applying Carathéodory’s Theorem in \(L^{\ast}\), for each i there are some \(\lambda _{i,1},\ldots ,\lambda _{i,n+1}\geq 0\) with \(\lambda _{i,1}+\cdots +\lambda _{i,n+1}=1\), and elements \(z_{i,k}^{\ast }\in \partial g_{t_{i,k}}(x)\) with \(t_{i,k}\in T(x)\) and \(k\in K:=\{1,\ldots ,n+1\}\) (hence, the functions \(g_{t_{i,k}}\), \(k\in K\), are all proper), such that

$$ z_{i}^{\ast}=\lambda_{i,1}z_{i,1}^{\ast}+\ldots+\lambda_{i,n+1}z_{i,n+1}^{\ast}, $$

where n is the dimension of L.

We may assume that each \((\lambda _{i,k})_{i}\), \(k\in K\), converges to some \(\lambda _{k}\geq 0\) such that \(\lambda _{1}+\cdots +\lambda _{n+1}=1\). Also, since \((t_{i,k})_{i}\subset T(x)\subset T_{\varepsilon _{0}}(x)\) and this last set is compact by assumption, we may assume that \(t_{i,k}\rightarrow t_{k}\in T_{\varepsilon _{0}}(x)\), \(k\in K\). Moreover, using again the assumption, we have

$$ \limsup_{i}g_{t_{i,k}}(z)\leq g_{t_{k}}(z)\quad\text{ for all }z\in L\cap\text{dom} f,~k\in K; $$

in particular, \(g_{t_{k}}(z)>-\infty \) for all \(z\in L\cap \text{dom} f\), \(k\in K\), and (recall that \(t_{i,k}\in T(x)\))

$$ f(x)=\limsup_{i} g_{t_{i,k}}(x)\leq g_{t_{k}}(x)=f_{t_{k}}(x)\leq f(x)\quad\text{ for all }k\in K, $$

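showing that \(t_{k}\in T(x)\) for all \(k\in K\). Consequently, denoting by \(z^{\ast}\) the restriction of \(u^{\ast}\) to L (so that \(z_{i}^{\ast}\rightarrow z^{\ast}\)) and taking into account that \((t_{i,k})_{i}\subset T(x)\) for all \(k\in K\), for every \(z\in L\cap \text {dom} f\) (\(=\text {dom} g_{t_{k}}\), \(k\in K\)) we obtain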

$$ \begin{array}{@{}rcl@{}} \langle z^{\ast},z-x\rangle & =& \lim_{i}\langle \lambda_{i,1}z_{i,1}^{\ast}+\cdots+\lambda_{i,n+1}z_{i,n+1}^{\ast},z-x\rangle \\ &\leq& \lambda_{1}\limsup_{i} g_{t_{i,1}}(z) + {\cdots} + \lambda_{n+1}\limsup_{i}g_{t_{i,n+1}}(z)\\ &&+\limsup_{i}(-\lambda_{1}g_{t_{i,1}}(x)-\cdots-\lambda_{n+1}g_{t_{i,n+1}}(x))\\ &\leq&\lambda_{1}g_{t_{1}}(z)+\cdots+\lambda_{n+1}g_{t_{n+1}}(z)+\limsup_{i}(-\lambda_{1}g_{t_{i,1}}(x) - \cdots - \lambda_{n+1}g_{t_{i,n+1}}(x))\\ &=&\lambda_{1}g_{t_{1}}(z)+\cdots+\lambda_{n+1}g_{t_{n+1}}(z)-f(x)\\ &=&\sum\limits_{k\in K_{+}} \lambda_{k}g_{t_{k}}(z)- \sum\limits_{k\in K_{+}} \lambda_{k}g_{t_{k}}(x), \end{array} $$

where \(K_{+}:=\{k\in K:~\lambda _{k}>0\}\). Hence, using Rockafellar’s subdifferential sum rule [25], as \(g_{t_{k}}(z)+\mathrm {I}_{L\cap \text {dom} f}(z)=g_{t_{k}}(z)\) and \(\text {ri}(\text {dom} g_{t_{k}})=\text {ri}(L\cap \text {dom} f)\neq \emptyset \) for all \(z\in L\) and all \(k\in K\), we obtain

$$ z^{\ast}\in\partial\left( \sum\limits_{k\in K_{+}} \lambda_{k}g_{t_{k}}\right) (x)= \sum\limits_{k\in K_{+}} \lambda_{k}\partial g_{t_{k}}(x). $$

Then, using the extension theorem, we can take an extension \(v^{\ast}\) of \(z^{\ast}\) to X such that

$$ v^{\ast}\in \sum\limits_{k\in K_{+}} \lambda_{k}\partial\tilde{g}_{t_{k}}(x), $$

satisfying \(u^{\ast}-v^{\ast}\in L^{\bot}\). Therefore

$$ \begin{array}{@{}rcl@{}} u^{\ast}\in v^{\ast}+L^{\bot} & \subset& \sum\limits_{k\in K_{+}} \lambda_{k}\partial\tilde{g}_{t_{k}}(x)+L^{\bot}\\ &\subset& \sum\limits_{k\in K_{+}} \lambda_{k}\partial(\tilde{g}_{t_{k}}+\mathrm{I}_{L})(x)= \sum\limits_{k\in K_{+}} \lambda_{k}\partial\tilde{g}_{t_{k}}(x)\subset E_{L}. \end{array} $$

Hence \(u^{\ast}\in E_{L}\), which shows that \(E_{L}\) is closed. □
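
The indicator term in (4) cannot be dropped in general, as the following two-function sketch suggests (the data are chosen here for illustration). Let \(X=\mathbb{R}\), \(T=\{1,2\}\) with the discrete topology, \(f_{1}(z):=z\) and \(f_{2}:=\mathrm{I}_{(-\infty,0]}-1\). Then \(\text{dom} f=(-\infty,0]\), \(f(z)=\max\{z,-1\}\) on \(\text{dom} f\), \(T(0)=\{1\}\) and \(\partial f(0)=[1,+\infty)\), although \(\partial f_{1}(0)=\{1\}\). The two subspaces in \(\mathcal{F}(0)\), namely \(\mathbb{R}\) and \(\{0\}\), give

$$ \partial(f_{1}+\mathrm{I}_{\mathbb{R}\cap\text{dom} f})(0)=1+\mathrm{N}_{(-\infty,0]}(0)=[1,+\infty),\qquad \partial(f_{1}+\mathrm{I}_{\{0\}\cap\text{dom} f})(0)=\mathbb{R}, $$

and their intersection is \([1,+\infty)=\partial f(0)\), as (4) predicts.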

4 Compactification Approach

Given a non-empty family of extended real-valued convex functions

$$ f_{t}:X\rightarrow\overline{\mathbb{R}},\quad t\in T, $$

defined on a (separated) real locally convex space X, and indexed by an arbitrary (possibly, infinite) set T, we consider the corresponding supremum function

$$ f:=\sup_{t\in T}f_{t}. $$

Here, in order to apply the methodology proposed in [7], we endow the index set T with some topology. When no topology is known on T we frequently use the discrete one. We denote by \(\mathcal {C}(T,[0,1])\) the set of continuous functions from T to [0,1], and consider the product space \([0,1]^{\mathcal {C}(T,[0,1])}\), which is compact for the product topology (by Tychonoff’s theorem). We shall regard the index set T as a subset of \([0,1]^{\mathcal {C}(T,[0,1])}\), and write \(T\subset [0,1]^{\mathcal {C}(T,[0,1])}\), by using the mapping \(\mathfrak {d}:T\rightarrow [0,1]^{\mathcal {C}(T,[0,1])}\), which assigns to each \(t\in T\) the evaluation function \(\mathfrak {d}(t)\equiv \gamma _{t}\in [0,1]^{\mathcal {C}(T,[0,1])}\), defined as

$$ \gamma_{t}(\varphi):=\varphi(t),\quad\varphi\in\mathcal{C}(T,[0,1]). $$

The closure of T in \([0,1]^{\mathcal {C}(T,[0,1])}\) for the product topology is the compact set

$$ \widehat{T}:=\text{cl}(\mathfrak{d}(T)), $$
(5)

and is referred to as the Stone–Čech compactification of T, usually denoted by \(\beta T\). Recall that for \(\gamma \in \widehat {T}\) and a net \((\gamma _{i})_{i}\subset \widehat {T}\), we have \(\gamma _{i}\rightarrow \gamma \) when

$$ \gamma_{i}(\varphi)\rightarrow\gamma(\varphi)\quad \text{ for all }\varphi \in\mathcal{C}(T,[0,1]). $$

When T is completely regular (e.g., compact Hausdorff), \(\widehat {T}\) is Hausdorff (see, e.g., [22, §38]), and the convergences in \(\mathfrak {d} (T)\) and T are the same.

Next, we enlarge the original family \(\{f_{t},t\in T\}\) by introducing the functions \(f_{\gamma }:X\rightarrow \overline {\mathbb {R}}\), \(\gamma \in \widehat {T}\), defined by

$$ f_{\gamma}(z):=\limsup_{\gamma_{t}\rightarrow\gamma,~t\in T} f_{t}(z); $$
(6)

that is,

$$ f_{\gamma}(z)=\sup\left\{\limsup_{i}f_{t_{i}}(z)\left| \begin{array}{l} (t_{i})_{i}\subset T,~\varphi(t_{i})\rightarrow\gamma(\varphi),\\ \forall\varphi\in\mathcal{C}(T,[0,1]) \end{array} \right. \right\}. $$

Observe that the family \(\{f_{\gamma },\gamma \in \widehat {T}\}\) includes the elements of the form \(f_{\gamma _{t}}\), \(t\in T\), given by

$$ f_{\gamma_{t}}(z)=\limsup_{\gamma_{s}\rightarrow\gamma_{t}, s\in T}f_{s}(z), $$

which may not belong to the original family \(\{f_{t},t\in T\}\), as well as the functions \(f_{\gamma }\) with \(\gamma \in \widehat {T}\setminus \mathfrak {d} (T)\).
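
As a sketch of how (6) acts in a concrete case (the data below are chosen here purely for illustration), let \(X=\mathbb{R}\), \(T=\{1,2,\ldots\}\) with the discrete topology, and \(f_{t}(z):=z-1/t\). Every \(t\in T\) is isolated, since the indicator of \(\{t\}\) belongs to \(\mathcal{C}(T,[0,1])\), whereas any net \((t_{i})_{i}\subset T\) with \(\gamma_{t_{i}}\rightarrow\gamma\in\widehat{T}\setminus\mathfrak{d}(T)\) eventually leaves every finite subset of T. Hence

$$ f_{\gamma_{t}}(z)=z-\frac{1}{t}=f_{t}(z)\quad (t\in T),\qquad f_{\gamma}(z)=\lim_{i}\left( z-\frac{1}{t_{i}}\right) =z\quad (\gamma\in\widehat{T}\setminus\mathfrak{d}(T)). $$

In this example the enlargement adds exactly one new function, \(z\mapsto z\), which coincides with \(f=\sup_{t\in T}f_{t}\).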

Remark 1

Observe that, for all \(t\in T\) and \(z\in X\),

$$ f_{\gamma_{t}}(z)\geq\limsup_{s\rightarrow t, s\in T} f_{s}(z)\geq f_{t}(z), $$
(7)

and that the first inequality may be strict. Indeed, one may have that \(f_{\gamma _{t}}(z)=\lim _{i}f_{t_{i}}(z)\) for some \(\gamma _{t_{i}} \rightarrow \gamma _{t}\) such that \((t_{i})_{i}\) does not converge to t. This may happen, for instance, when T is compact but not Hausdorff. On the other hand, if T is completely regular, for example compact Hausdorff, then

$$ f_{\gamma_{t}}(z)=\limsup_{s\rightarrow t, s\in T}f_{s}(z). $$

The new functions \(f_{\gamma }\), \(\gamma \in \widehat {T}\), provide the same supremum f as the original ones \(f_{t}\), \(t\in T\):

Lemma 1

The functions \(f_{\gamma }\), \(\gamma \in \widehat {T}\), are convex, and we have

$$ \sup_{\gamma\in\widehat{T}}f_{\gamma}=\sup_{t\in T}f_{t}=f. $$

Proof

The convexity of the \(f_{\gamma }\)’s follows easily from the convexity of the \(f_{t}\)’s. Next, for each \(\gamma \in \widehat {T}\) and \(z\in X\), we have

$$ f_{\gamma}(z)=\limsup_{\gamma_{s}\rightarrow\gamma, s\in T}f_{s}(z)\leq f(z), $$

entailing that \(\sup _{\gamma \in \widehat {T}}f_{\gamma }\leq f\). In addition, if the sequence \((t_{n})_{n}\subset T\) is such that \(f(z)=\lim _{n}f_{t_{n}}(z)\), with \(z\in X\), then there exist a subnet \((t_{i})_{i}\) of \((t_{n})_{n}\) and \(\gamma \in \widehat {T}\) such that \(\gamma _{t_{i}}\rightarrow \gamma \), and we get

$$ f_{\gamma}(z)\geq\limsup_{i}f_{t_{i}}(z)=\lim_{n}f_{t_{n}}(z)=f(z), $$

showing that \(\sup _{\gamma \in \widehat {T}}f_{\gamma }\geq f\). □

Now, given \(x\in X\), with \(f(x)\in \mathbb {R}\), and \(\varepsilon \geq 0\), we introduce the extended ε-active index set of f at x by

$$ \widehat{T}_{\varepsilon}(x):=\left\{\gamma\in\widehat{T}:~f_{\gamma}(x)\geq f(x)-\varepsilon\right\}; $$
(8)

and the extended active index set of f at x

$$ \widehat{T}(x):=\widehat{T}_{0}(x)=\left\{\gamma\in\widehat{T}:~f_{\gamma}(x)=f(x)\right\}. $$
(9)

Moreover, taking into account (7), for each \(t\in T(x)\) we have that

$$ f(x)\geq f_{\gamma_{t}}(x)\geq f_{t}(x)=f(x); $$

that is,

$$ \mathfrak{d}(T(x))\subset\widehat{T}(x). $$

The set \(\widehat {T}(x)\) is a nonempty set in spite of the possible emptiness of T(x). More generally, we have:

Lemma 2

The sets \(\widehat {T}_{\varepsilon }(x)\), \(\varepsilon \geq 0\) and \(x\in \text{dom} f\), are nonempty and compact.

Proof

It is enough to prove that \(\widehat {T}(x)\) is nonempty and closed; the general case when \(\varepsilon >0\) is similar. Fix \(x\in \text{dom} f\). For a sequence \((t_{n})_{n}\subset T\) such that \(\lim _{n}f_{t_{n}}(x)=f(x)\) there will exist, due to the compactness of \(\widehat {T}\), a subnet \((t_{i})_{i}\subset T\) such that \(\gamma _{t_{i}}\rightarrow \gamma \in \widehat {T}\), and then (7) ensures that

$$ f(x)=\lim_{i}f_{t_{i}}(x) \leq \lim_{i}f_{\gamma_{t_{i}}}(x)\leq \limsup_{\gamma_{t}\rightarrow\gamma, t\in T} f_{t}(x)=f_{\gamma}(x)\leq f(x); $$

that is, \(\gamma \in \widehat {T}(x)\) and this set is nonempty.

Next, we show that \(\widehat {T}(x)\) is closed. We take a net \((\gamma _{i})_{i}\subset \widehat {T}(x)\) that converges to γ \((\in \widehat {T})\). Then, by the definition of the \(f_{\gamma }\)’s, for each i we find a net \((t_{ij})_{j}\subset T\) such that \(\gamma _{t_{ij}}\rightarrow _{j}\gamma _{i}\) and

$$ f(x)=f_{\gamma_{i}}(x)=\lim_{j}f_{t_{ij}}(x). $$

Thus, there exists a diagonal net \(\left (\gamma _{t_{ij_{i}}},f_{t_{ij_{i}}}(x)\right )_{i}\subset \widehat {T}\times \mathbb {R}\) such that \(\gamma _{t_{ij_{i}}}\rightarrow _{i}\gamma \) and \(f_{t_{ij_{i}}}(x)\rightarrow _{i}f(x)\); that is,

$$ f_{\gamma}(x)\geq\limsup_{i}f_{t_{ij_{i}}}(x)=\lim_{i}f_{t_{ij_{i}}}(x)=f(x), $$

and so \(\gamma \in \widehat {T}(x)\). □

Lemma 3

If x ∈domf, then

$$ \widehat{T}(x)=\bigcap\nolimits_{\varepsilon>0} \text{cl}\big(\mathfrak{d}(T_{\varepsilon}(x))\big). $$

Proof

Take \(\gamma \in \widehat {T}(x)\). Then there exists a net \((t_{i})_{i}\subset T\) such that \(\gamma _{t_{i}}\rightarrow \gamma \) and

$$ f_{\gamma}(x)=\lim_{i}f_{t_{i}}(x)=f(x). $$

Hence, for each \(\varepsilon >0\) there exists an \(i_{0}\) such that

$$ t_{i}\in T_{\varepsilon}(x)\quad\text{for all }i\succeq i_{0}, $$

where ≽ defines the order in the directed set. In other words, \(\gamma _{t_{i}}\in \mathfrak {d} (T_{\varepsilon }(x))\) for all \(i\succeq i_{0}\). This entails that \(\gamma \in \text {cl} (\mathfrak {d} (T_{\varepsilon }(x)))\), and we get \(\gamma \in \bigcap _{\varepsilon >0} \text {cl}(\mathfrak {d} (T_{\varepsilon }(x)))\), by the arbitrariness of \(\varepsilon >0\).

Conversely, take \(\gamma \in \bigcap _{\varepsilon >0} \text {cl}(\mathfrak {d} (T_{\varepsilon }(x)))\). Then, for each positive integer k and each neighborhood U of γ, there exists some \(\gamma _{t_{(k,U)}}\in U\) with \(t_{(k,U)}\in T_{\frac {1}{k}}(x)\); that is (by (7)),

$$ f(x)-\frac{1}{k}\leq f_{t_{(k,U)}}(x)\leq f_{\gamma_{t_{(k,U)}}}(x)\leq f(x). $$

Since \(\widehat {T}\) is compact Hausdorff (coming from the complete regularity of T), the net \((\gamma _{t_{(k,U)}})_{(k,U)}\) converges and its limit must be equal to γ. Then

$$ f(x)\geq f_{\gamma}(x)\geq\limsup_{(k,U)}f_{t_{(k,U)}}(x)=f(x), $$

and so \(\gamma \in \widehat {T}(x)\). □
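
Lemma 3 can be checked by hand on a simple instance (an illustrative sketch; the family is chosen here for convenience). Let \(X=\mathbb{R}\), \(T=\{1,2,\ldots\}\) with the discrete topology, and \(f_{t}(z):=z-1/t\), so that \(f(z)=z\) and \(T_{\varepsilon}(x)=\{t\in T:~t\geq1/\varepsilon\}\) for \(\varepsilon>0\). The closure of the image of a cofinite subset of T adds all the points of \(\widehat{T}\setminus\mathfrak{d}(T)\) and nothing else, which gives

$$ \bigcap\nolimits_{\varepsilon>0}\text{cl}\big(\mathfrak{d}(T_{\varepsilon}(x))\big)=\bigcap\nolimits_{\varepsilon>0}\Big(\mathfrak{d}(T_{\varepsilon}(x))\cup\big(\widehat{T}\setminus\mathfrak{d}(T)\big)\Big)=\widehat{T}\setminus\mathfrak{d}(T). $$

This agrees with \(\widehat{T}(x)\): the functions \(f_{\gamma_{t}}=f_{t}\), \(t\in T\), are never active, while \(f_{\gamma}(z)=z\) for every \(\gamma\in\widehat{T}\setminus\mathfrak{d}(T)\).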

Let us examine the concepts introduced above in a compact (possibly, non-Hausdorff) framework. We denote by \(\sim \) the equivalence relation on T given by

$$ t_{1}\sim t_{2}\quad\Longleftrightarrow\quad\varphi(t_{1})=\varphi(t_{2})\quad\text{ for all }\varphi\in\mathcal{C}(T,[0,1]), $$

and by \(\tilde {t}\) the equivalence class of \(t\in T\). It is known that \(\widehat {T}\) and T are homeomorphic when T is compact Hausdorff.

Lemma 4

Assume that T is compact (possibly, non-Hausdorff). Then, provided that the mapping \(t\mapsto f_{t}(x)\) is continuous on T, the following assertions hold true for each \(x\in \text{dom} f\):

(i) \(f_{t}(x)=f_{s}(x)\) for all \(s,t\in T\) such that \(s\sim t\).

(ii) We have

$$ \widehat{T}(x)=\left\{\tilde{t}\in T/\sim~:~f_{\tilde{t}}(x)=\limsup_{\widetilde{s}\rightarrow\tilde{t}}f_{s}(x)=f(x)\right\} = \left\{\tilde{t}\in T/\sim~:~t\in T(x)\right\}, $$

where \(\widetilde {s}\rightarrow \tilde {t}\) means that \(\varphi (s)\rightarrow \varphi (t)\) for all \(\varphi \in \mathcal {C}(T,[0,1])\).

Proof

Under the current hypothesis it can be proved that \(\widehat {T}\) and the quotient space \(T/\sim \) are homeomorphic, by means of the mapping \(\tilde {t}\in T/\sim ~\longmapsto \gamma _{t}\in \widehat {T}\).

(i) Since \(f_{(\cdot )}(x)\) is continuous and T is compact we can easily prove the existence of \(m>0\) such that

$$ |f_{t}(x)| \leq m\quad\text{ for all }t\in T. $$

Thus, using the positive and the negative parts of \(f_{(\cdot )}(x)\), \(f_{(\cdot )}^{+}(x)\) and \(f_{(\cdot )}^{-}(x)\), we have \(m^{-1}f_{(\cdot )}^{+}(x)\), \(m^{-1}f_{(\cdot )}^{-}(x)\in \mathcal {C}(T,[0,1])\) and so, for all \(s,t\in T\) such that \(s\sim t\),

$$ f_{t}(x)=m\left( m^{-1}f_{t}^{+}(x)-m^{-1}f_{t}^{-}(x)\right) = m\left( m^{-1}f_{s}^{+}(x)-m^{-1}f_{s}^{-}(x)\right)=f_{s}(x). $$

(ii) If \(\tilde {t}\in \widehat {T}(x)\), then there exists a net \((t_{i})_{i}\subset T\) such that

$$ \widetilde{t_{i}}\rightarrow\tilde{t}\quad\text{ and }\quad f(x)=f_{\tilde{t}}(x)=\lim_{i}f_{t_{i}}(x). $$

Since T is compact we may assume that \(t_{i}\rightarrow s\in T\), and the continuity of \(f_{(\cdot )}(x)\) entails

$$ f(x)=\lim_{i}f_{t_{i}}(x)=f_{s}(x); $$

that is, \(s\in T(x)\). Now, fix \(\varphi \in \mathcal {C}(T,[0,1])\). On the one hand, since \(\widetilde {t_{i}}\rightarrow \tilde {t}\), we have that

$$ \varphi(t_{i})\rightarrow\varphi(t). $$

On the other hand, the continuity of φ yields

$$ \varphi(t_{i})\rightarrow\varphi(s), $$

and we get \(\varphi (t)=\varphi (s)\); that is, \(\tilde {t}=\tilde {s}\).

Conversely, if \(\tilde {t}\in T/\sim \) is such that \(t\in T(x)\), then by (7) we get

$$ f(x)\geq\limsup_{\widetilde{s}\rightarrow\tilde{t}}f_{s}(x)=f_{\tilde{t}}(x)\geq f_{t}(x)=f(x); $$

and we are done. □

Now we give the main result of the paper, for general index sets and dropping both the upper semi-continuity-type condition and the compactness assumption required in Proposition 1.

Theorem 1

Let \(\{f_{t},t\in T\}\) be a nonempty family of extended real-valued convex functions, and consider \(f=\sup _{t\in T}f_{t}\). Then, for every \(x\in \text{dom} f\), we have

$$ \partial f(x)=\bigcap\nolimits_{L\in\mathcal{F}(x)}\text{co}\left\{\bigcup\nolimits_{\gamma\in\widehat{T}(x)}\partial(f_{\gamma}+\mathrm{I}_{L\cap\text{dom} f})(x)\right\}, $$
(10)

where \(f_{\gamma }\), \(\widehat {T}(x)\) and \(\mathcal {F}(x)\) are defined in (6), (9), and (2), respectively, and T is equipped with a completely regular topology.
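
Formula (10) can be tested on a case where the exact active set of the original family is empty (a sketch with data chosen here for illustration). Take \(X=\mathbb{R}\), \(T=\{1,2,\ldots\}\) with the discrete topology, and \(f_{t}(z):=z-1/t\). Then \(f(z)=z\), \(T(x)=\emptyset\), and a direct computation from (6) and (9) gives \(f_{\gamma}(z)=z\) for all \(\gamma\in\widehat{T}(x)=\widehat{T}\setminus\mathfrak{d}(T)\). For \(L=\mathbb{R}\), which belongs to \(\mathcal{F}(x)\) for every x,

$$ \text{co}\left\{\bigcup\nolimits_{\gamma\in\widehat{T}(x)}\partial(f_{\gamma}+\mathrm{I}_{L\cap\text{dom} f})(x)\right\}=\text{co}\left\{\bigcup\nolimits_{\gamma\in\widehat{T}(x)}\partial f_{\gamma}(x)\right\}=\{1\}, $$

while the only other possible subspace, \(L=\{0\}\) (admissible when \(x=0\)), contributes the whole line \(\mathbb{R}\). The intersection is \(\{1\}=\partial f(x)\), whereas the union over \(T(x)\) would be empty.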

In order to prove Theorem 1, we first establish the following key lemma, which constitutes the bridge with the compact framework.

Lemma 5

Assume that f is proper and take \(x\in \text{dom} f\), with \(f(x)\in \mathbb {R}\), and \(\varepsilon \geq 0\).

(i) Every net \((\gamma _{i})_{i}\subset \widehat {T}_{\varepsilon }(x)\) has an accumulation point \(\gamma \in \widehat {T}_{\varepsilon }(x)\) such that

$$ \limsup_{i}f_{\gamma_{i}}(z)\leq f_{\gamma}(z)\quad\text{ for all }z\in\text{dom} f. $$
(11)

(ii) If T is completely regular, then (11) holds for every net \((\gamma _{i})_{i}\subset \widehat {T}_{\varepsilon }(x)\) converging to \(\gamma \in \widehat {T}_{\varepsilon }(x)\).

Proof

(i) Fix a net \((\gamma _{i})_{i}\subset \widehat {T}_{\varepsilon }(x)\) and, due to the compactness of \(\widehat {T}_{\varepsilon }(x)\) established in Lemma 2, let \(\gamma \in \widehat {T}_{\varepsilon }(x)\) be such that \(\gamma _{i}\rightarrow \gamma \) (without loss of generality). Take \(z\in \text{dom} f\), so that \(f_{\gamma }(z)\leq f(z)<+\infty \). Next, for each i there will exist a net \((t_{ij})_{j}\subset T\) such that

$$ \gamma_{t_{ij}}\rightarrow_{j}\gamma_{i},\quad f_{\gamma_{i}} (z)=\lim_{j}f_{t_{ij}}(z). $$

For every fixed δ > 0 we may suppose, without loss of generality, that for all i

$$ f_{t_{ij}}(z)\geq f_{\gamma_{i}}(z)-\delta\quad\text{ eventually on }j. $$

Then there exists a diagonal net \((t_{ij_{i}})_{i}\subset T\) such that \(\gamma _{t_{ij_{i}}}\rightarrow _{i}\gamma \) and

$$ f_{t_{ij_{i}}}(z)\geq f_{\gamma_{i}}(z)-\delta\quad\text{ for all }i. $$

Consequently,

$$ f_{\gamma}(z)\geq\limsup_{i}f_{t_{ij_{i}}}(z)\geq\limsup_{i}f_{\gamma_{i}}(z)-\delta, $$

and we get, as \(\delta \downarrow 0\),

$$ f_{\gamma}(z)\geq\limsup_{i}f_{\gamma_{i}}(z). $$

(ii) Fix a net \((\gamma _{i})_{i\in I}\subset \widehat {T}_{\varepsilon }(x)\) such that \(\gamma _{i}\rightarrow \gamma \in \widehat {T}_{\varepsilon }(x)\), and take \(z\in \text{dom} f\) with \(f_{\gamma }(z)<+\infty \). By assertion (i) the inequality (11) holds for some accumulation point of \((\gamma _{i})_{i}\), which must be γ (because \(\widehat {T}\) is Hausdorff). □

Proof

(of Theorem 1) By Lemma 1, the functions \(\{f_{\gamma },\gamma \in \widehat {T}\}\) are convex and satisfy

$$ f=\sup_{\gamma\in\widehat{T}}f_{\gamma}. $$

According to Lemma 2, the sets \(\widehat {T}_{\varepsilon }(x)\) are compact for every \(\varepsilon \geq 0\), while Lemma 5(ii) entails the upper semi-continuity of the mappings \(\gamma \mapsto f_{\gamma }(z)\), \(z\in \text{dom} f\). Consequently, Proposition 1 applies and yields the desired formula. □

Corollary 1

If \(f_{\mid \text{aff}(\text{dom} f)}\) is continuous on \(\text{ri}(\text{dom} f)\) (assumed to be nonempty), then for every \(x\in X\)

$$ \partial f(x)=\overline{\text{co}}\left\{\bigcup\nolimits_{\gamma \in\widehat{T}(x)}\partial(f_{\gamma}+\mathrm{I}_{\text{dom} f})(x)\right\}. $$
(12)

Proof

Under the current assumption, for every \(L\in \mathcal {F}(x)\) and \(\gamma \in \widehat {T}(x)\) such that \(L\cap \text{ri}(\text{dom} f)\neq \emptyset \) we have that (see, e.g., [3, Theorem 15(iii)])

$$ \partial(f_{\gamma}+\mathrm{I}_{L\cap\text{dom} f})(x)=\partial\left( (f_{\gamma}+\mathrm{I}_{\text{dom} f})+\mathrm{I}_{L}\right)(x)=\text{cl}(\partial(f_{\gamma}+\mathrm{I}_{\text{dom} f})(x)+L^{\bot}). $$

Now, given a convex neighborhood \(U\subset X^{\ast}\) of the origin, we choose \(L\in \mathcal {F}(x)\) such that \(L\cap \text{ri}(\text{dom} f)\neq \emptyset \) and \(L^{\bot }\subset U\). Then Theorem 1 yields

$$ \begin{array}{@{}rcl@{}} \partial f(x) & \subset&\text{co}\left\{\bigcup\nolimits_{\gamma\in\widehat{T}(x)}\partial(f_{\gamma}+\mathrm{I}_{L\cap\text{dom} f})(x)\right\} \\ & =&\text{co}\left\{\bigcup\nolimits_{\gamma\in\widehat{T}(x)}\text{cl}\left( \partial(f_{\gamma}+\mathrm{I}_{\text{dom} f})(x)+L^{\bot}\right)\right\} \\ &\subset&\text{co}\left\{\bigcup\nolimits_{\gamma\in \widehat{T}(x)}\partial(f_{\gamma}+\mathrm{I}_{\text{dom} f})(x)\right\} + U + U, \end{array} $$

and we get, by intersecting over the U’s,

$$ \partial f(x)\subset\overline{\text{co}}\left\{\bigcup\nolimits_{\gamma\in\widehat{T}(x)}\partial(f_{\gamma}+\mathrm{I}_{\text{dom} f})(x)\right\}. $$

The conclusion follows as the opposite inclusion is straightforward. □

Theorem 2

If f is finite and continuous at some point, then for every \(x\in X\)

$$ \partial f(x)=\overline{\text{co}}\left\{\bigcup\nolimits_{\gamma \in\widehat{T}(x)}\partial f_{\gamma}(x)\right\} + \mathrm{N}_{\text{dom} f}(x). $$
(13)

Proof

Fix \(x\in \text{dom} f\). By taking into account that \(f_{\gamma }\leq f\), Corollary 1 yields

$$ \begin{array}{@{}rcl@{}} \partial f(x) & =& \overline{\text{co}}\left\{\bigcup\nolimits_{\gamma\in\widehat{T}(x)}\partial(f_{\gamma}+\mathrm{I}_{\text{dom} f})(x)\right\} \\ & =&\overline{\text{co}}\left\{\bigcup\nolimits_{\gamma\in\widehat{T}(x)}\partial f_{\gamma}(x)+\mathrm{N}_{\text{dom} f}(x)\right\} \\ & =&\partial\sigma_{\bigcup_{\gamma\in\widehat{T}(x)}\partial f_{\gamma}(x)+\mathrm{N}_{\text{dom} f}(x)}(\theta)\\ & =&\partial(\sigma_{\bigcup_{\gamma\in\widehat{T}(x)}\partial f_{\gamma}(x)}+\sigma_{\mathrm{N}_{\text{dom} f}(x)})(\theta). \end{array} $$
(14)

Additionally, for a neighborhood \(U_{x_{0}}\) of \(x_{0}\in \text{int}(\text{dom} f)\) such that \(U_{x_{0}}\subset \text {dom} f\), we have

$$ \sigma_{\mathrm{N}_{\text{dom} f}(x)}(U_{x_{0}}-x)\leq0, $$

showing that \(\sigma _{\mathrm {N}_{\text {dom} f}(x)}\) is continuous at \(x_{0}-x\). At the same time, we have

$$ \sigma_{\bigcup_{\gamma\in\widehat{T}(x)}\partial f_{\gamma}(x)}(x_{0}-x)\leq\sup_{\gamma\in\widehat{T}(x)}(f_{\gamma}(x_{0})-f_{\gamma}(x))\leq f(x_{0})-f(x)<+\infty, $$

and so, thanks to the Moreau–Rockafellar sum rule, (14) implies that

$$ \begin{array}{@{}rcl@{}} \partial f(x) & =&\partial\left( \sigma_{\bigcup_{\gamma\in\widehat{T}(x)}\partial f_{\gamma}(x)}+\sigma_{\mathrm{N}_{\text{dom} f}(x)}\right)(\theta)\\ & =&\partial\sigma_{\bigcup_{\gamma\in\widehat{T}(x)}\partial f_{\gamma}(x)}(\theta)+\partial\sigma_{\mathrm{N}_{\text{dom} f}(x)}(\theta)\\ & =&\overline{\text{co}}\left\{\bigcup\nolimits_{\gamma \in \widehat{T}(x)}\partial f_{\gamma}(x)\right\} +\mathrm{N}_{\text{dom} f}(x). \end{array} $$
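
As a sanity check of (13) (an illustrative computation with data chosen here, not taken from the cited references), let \(X=\mathbb{R}\), \(T=\{1,2,\ldots\}\) with the discrete topology, and \(f_{t}(z):=tz\). Then \(f(z)=z\) for \(z\leq0\) and \(f(z)=+\infty\) for \(z>0\), so that \(\text{dom} f=(-\infty,0]\) and f is finite and continuous at \(-1\). At \(x=0\) every index is active and \(\partial f_{\gamma_{t}}(0)=\partial f_{t}(0)=\{t\}\), while for \(\gamma\in\widehat{T}\setminus\mathfrak{d}(T)\) one gets \(f_{\gamma}(0)=0\) and \(f_{\gamma}(z)=-\infty\) for \(z<0\), hence \(\partial f_{\gamma}(0)=\emptyset\). Therefore

$$ \overline{\text{co}}\left\{\bigcup\nolimits_{\gamma\in\widehat{T}(0)}\partial f_{\gamma}(0)\right\}+\mathrm{N}_{\text{dom} f}(0)=\overline{\text{co}}\{1,2,\ldots\}+[0,+\infty)=[1,+\infty), $$

which coincides with \(\partial f(0)=\{x^{\ast}\in\mathbb{R}:~x^{\ast}y\leq y~\text{ for all }y\leq0\}=[1,+\infty)\).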

The following corollary is a straightforward consequence of Theorems 1 and 2. We introduce the functions \(\tilde {f}_{t}:X\rightarrow \overline {\mathbb {R}}\), \(t\in T\), given by

$$ \tilde{f}_{t}(z):=\limsup_{s\rightarrow t, s\in T}f_{s}(z), $$
(15)

and denote

$$ \widetilde{T}(x):=\{t\in T:~\tilde{f}_{t}(x)=f(x)\}. $$
(16)

Corollary 2

Assume that T is compact Hausdorff. Then, for every \(x\in \text{dom} f\), we have

$$ \partial f(x)=\bigcap\nolimits_{L\in\mathcal{F}(x)}\text{co}\left\{\bigcup\nolimits_{t\in\widetilde{T}(x)}\partial(\tilde{f}_{t}+\mathrm{I}_{L\cap\text{dom} f})(x)\right\}. $$

If, in addition, f is finite and continuous at some point, then for every \(x\in \text{dom} f\)

$$ \partial f(x)=\mathrm{N}_{\text{dom} f}(x)+\overline{\text{co}}\left\{\bigcup\nolimits_{t\in\widetilde{T}(x)}\partial\tilde{f}_{t}(x)\right\}. $$

Proof

Since T is compact Hausdorff and, hence, completely regular, we have that \(T\equiv \widehat {T}\) and, for all \(t\in T\) and \(z\in X\),

$$ f_{\gamma_{t}}(z)=\limsup_{\gamma_{s}\rightarrow\gamma_{t}, s\in T} f_{s}(z)=\limsup_{s\rightarrow t, s\in T}f_{s}(z)=\tilde{f}_{t}(z). $$

Hence, the first formula follows from Theorem 1. Similarly, the second statement of the corollary follows from Theorem 2. □

The following corollary shows how to deduce Valadier’s formula ([30]), given in the compact setting (see [19, Theorem 3, p. 201] and [31, Theorem 2.4.18]).

Corollary 3

Assume that T is compact Hausdorff. Let \(U\subset X\) be an open set such that:

  1. (i)

    \(f_{t}(x)\in \mathbb {R}\) for all \(t\in T\) and \(x\in U\),

  2. (ii)

    \(t\in T\mapsto f_{t}(x)\) is upper semi-continuous for each \(x\in U\),

  3. (iii)

    \(x\in U\mapsto f_{t}(x)\) is continuous for each \(t\in T\).

Then for every \(x\in U\) we have

$$ \partial f(x)=\overline{\text{co}}\left\{\bigcup\nolimits_{t\in T(x)}\partial f_{t}(x)\right\}. $$

Proof

Assume first that X is a Banach space. Then, using classical arguments (see, e.g., [19, 31]), it is shown that the supremum function \(f=\sup _{t\in T}f_{t}\) is finite and, so, continuous on U. Thus, by Corollary 2, for each \(x\in U\) we have

$$ \partial f(x)=\mathrm{N}_{\text{dom} f}(x)+\overline{\text{co}}\left\{\bigcup\nolimits_{t\in\widetilde{T}(x)} \partial\tilde{f}_{t}(x)\right\} = \overline{\text{co}}\left\{\bigcup\nolimits_{t\in\widetilde{T}(x)}\partial\tilde{f}_{t}(x)\right\}, $$
(17)

where \(\tilde {f}_{t}\) and \(\widetilde {T}(x)\) are defined in (15) and (16), respectively.

Take \(t\in \widetilde {T}(x)\). On the one hand, using the compactness assumption, there exists a net \((t_{i})_{i}\subset T\) such that \(f(x)=\lim _{i}f_{t_{i}}(x)\) and \((t_{i})_{i}\) converges to \(t\in T\). But we have, due to assumption (ii),

$$ f(x)=\limsup_{i}f_{t_{i}}(x)\leq f_{t}(x), $$

and so \(t\in T(x)\).

On the other hand, also by assumption (ii), for all \(z\in U\) we have

$$ \tilde{f}_{t}(z)=\limsup_{s\rightarrow t}f_{s}(z)\leq f_{t}(z), $$

and both functions \(\tilde {f}_{t}\) and \(f_{t}\) coincide at x. Consequently, \(\partial \tilde {f}_{t}(x)\subset \partial f_{t}(x)\) and (17) yields

$$ \partial f(x)=\overline{\text{co}}\left\{\bigcup\nolimits_{t\in \widetilde{T}(x)}\partial\tilde{f}_{t}(x)\right\} \subset\overline{\text{co}}\left\{ \bigcup\nolimits_{t\in T(x)}\partial f_{t}(x)\right\}. $$

Thus, we are done since the opposite inclusion is straightforward.

We consider now the case when X is any locally convex space. We fix \(x\in U\) and \(x^{\ast}\in \partial f(x)\). Given an \(L\in \mathcal {F}(x)\), we introduce the convex functions \(g_{t}:L\rightarrow \overline {\mathbb {R}}\), \(t\in T\), defined as

$$ g_{t}:=(f_{t}+\mathrm{I}_{L})_{\mid L}; $$

that is, \(g_{t}\) is the restriction of \(f_{t}+\mathrm{I}_{L}\) to L, and consider the associated supremum

$$ g:=\sup_{t\in T}g_{t}=(f+\mathrm{I}_{L})_{\mid L}. $$

Therefore, since the family \(\{g_{t},t\in T\}\) satisfies the requirements of the paragraph above, we obtain

$$ \partial g(x)=\overline{\text{co}}\left\{\bigcup\nolimits_{t\in T(x)}\partial g_{t}(x)\right\}. $$

Now, take \(x^{\ast}\in \partial f(x)\), so that \(\hat {x}^{\ast }:=x_{\mid L}^{\ast }\in \partial g(x)=\overline {\text {co}}\left \{\bigcup _{t\in T(x)}\partial g_{t}(x)\right \}\). Then, thanks to the fact that \(L^{\ast}\) is isomorphic to the quotient space \(X^{\ast}/L^{\perp}\), for every 𝜃-neighborhood \(V\subset X^{\ast}\) we have that

$$ \hat{x}^{\ast}\in\text{co}\left\{\bigcup\nolimits_{t\in T(x)}\partial g_{t}(x)\right\} +V_{\mid L}, $$

where \(V_{\mid L}:=\{u_{\mid L}^{\ast }:u^{\ast }\in V\}\) is a 𝜃-neighborhood in \(X^{\ast}/L^{\perp}\). In other words, there are \(k\geq 1\), \(u^{\ast}\in V\), \(\lambda _{1},\dots ,\lambda _{k}\geq 0\), \(t_{1},\dots ,t_{k}\in T(x)\) and \(\hat {x}_{1}^{\ast },\dots ,\hat {x}_{k}^{\ast }\in L^{\ast }\) such that \(\lambda_{1}+\cdots+\lambda_{k}=1\), \(\hat {x}_{j}^{\ast }\in \partial g_{t_{j}}(x)\), \(j=1,\dots ,k\), and

$$ \hat{x}^{\ast}=\lambda_{1}\hat{x}_{1}^{\ast}+\cdots+\lambda_{k}\hat{x}_{k}^{\ast}+u_{\mid L}^{\ast}. $$

Moreover, by the Hahn–Banach theorem, we extend \(\hat {x}_{1}^{\ast },\dots ,\hat {x}_{k}^{\ast }\) to \(x_{1}^{\ast },\dots ,x_{k}^{\ast }\in X^{\ast }\), which satisfy

$$ \langle x^{\ast},u\rangle = \lambda_{1}\langle x_{1}^{\ast},u\rangle +\cdots+\lambda_{k}\langle x_{k}^{\ast},u\rangle + \langle u^{\ast},u\rangle\quad \text{ for all }u\in L; $$

that is, \(x^{\ast }\in \lambda _{1}x_{1}^{\ast }+\cdots +\lambda _{k}x_{k}^{\ast }+u^{\ast }+L^{\perp }\). But \(x_{j}^{\ast }\in \partial (f_{t_{j}}+\mathrm {I}_{L})(x)\), \(j=1,\dots ,k\), and so

$$ \begin{array}{@{}rcl@{}} x^{\ast} &\in&\text{co}\left\{\bigcup\nolimits_{t\in T(x)}\partial(f_{t}+\mathrm{I}_{L})(x)\right\} +V+L^{\perp}\\ & =&\text{co}\left\{\bigcup\nolimits_{t\in T(x)}\partial f_{t}(x)\right\} +V+L^{\perp}, \end{array} $$

where the last equality follows by applying the Moreau–Rockafellar sum rule (thanks to assumption (iii)). Finally, because L and V were arbitrarily chosen, we deduce that \(x^{\ast }\in \overline {\text {co}}\left \{\bigcup _{t\in T(x)}\partial f_{t}(x)\right \}\) (see (1)), and the inclusion “⊂” follows. □
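As a numerical illustration of the formula in Corollary 3, the following minimal Python sketch treats a finite (hence compact) index set and functions on \(\mathbb{R}\) that are affine in x, so that each \(\partial f_{t}(x)\) reduces to the slope of \(f_{t}\) and the closed convex hull of the active slopes is an interval. The family \(f_{t}(x)=tx\), the tolerance `tol`, and all function names are assumptions chosen for illustration; they do not come from the text.

```python
def active_indices(T, f, x, tol=1e-12):
    """Return f(x) = max_t f(t, x) and the active index set T(x)."""
    values = [f(t, x) for t in T]
    fx = max(values)
    return fx, [t for t, v in zip(T, values) if v >= fx - tol]


def valadier_interval(T, f, slope, x):
    """Closed convex hull of the active slopes, returned as (lower, upper)."""
    _, active = active_indices(T, f, x)
    slopes = [slope(t) for t in active]
    return min(slopes), max(slopes)


# Hypothetical data: f_t(x) = t * x on a finite index set, so that f(x) = |x|.
T = [-1.0, -0.5, 0.0, 0.5, 1.0]
f = lambda t, x: t * x
slope = lambda t: t

print(valadier_interval(T, f, slope, 0.0))  # every index is active at 0
print(valadier_interval(T, f, slope, 2.0))  # only t = 1 is active
```

In this toy setting the hull of the active slopes at 0 is the whole interval between the extreme slopes, which agrees with the subdifferential of the absolute value at the origin.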

Corollary 4

Assume that \(X=\mathbb {R}^{n}\). Then (12) and (13) hold with co instead of \(\overline {\text {co}}\).

Proof

As in the proof of Proposition 1, we can prove that the set

$$ \text{co}\left\{\bigcup\nolimits_{\gamma\in\widehat{T}(x)}\partial(f_{\gamma}+\mathrm{I}_{\text{dom} f})(x)\right\} $$

is closed and (12) holds with co instead of \(\overline {\text {co}}\); that is,

$$ \partial f(x)=\text{co}\left\{\bigcup\nolimits_{\gamma\in\widehat{T}(x)}\partial(f_{\gamma}+\mathrm{I}_{\text{dom} f})(x)\right\}. $$
(18)

In addition, if f is finite and continuous at some point in \(\text{dom} f\), then each function \(f_{\gamma}\) (\(\leq f\)), \(\gamma \in \widehat {T}(x)\), is finite and continuous at the same point, and (18) yields (13) with co instead of \(\overline {\text {co}}\),

$$ \partial f(x)=\text{co}\left\{\bigcup\nolimits_{\gamma\in\widehat{T}(x)}\partial(f_{\gamma}+\mathrm{I}_{\text{dom} f})(x)\right\} = \text{co}\left\{\bigcup\nolimits_{\gamma\in\widehat{T}(x)}\partial f_{\gamma}(x)\right\} +\mathrm{N}_{\text{dom} f}(x). $$

Example 1

Consider the family of convex functions \(g_{2n+1}\), \(h_{2n}\), \(n\in \mathbb {N}\), defined on \(\mathbb {R}\) as

$$ g_{2n+1}(z):=\max\left\{\frac{nz}{n+1},0\right\},\quad h_{2n}(z):=\max\left\{\frac{-nz}{n+1},0\right\}. $$

We introduce the family \(\{f_{n}, n\in \mathbb {N}\}\) such that \(f_{2n+1}:=g_{2n+1}\) and \(f_{2n}:=h_{2n}\), together with the supremum function

$$ f=\sup_{n\in\mathbb{N}}f_{n}=\sup_{n\in\mathbb{N}}\left\{g_{2n+1},h_{2n}\right\}. $$

Obviously,

$$ f(x)=|x|\quad \text{ and }\quad\partial f(x)=\left\{ \begin{array}{ll} \lbrack-1,1] &~\text{ if } x=0,\\ \{-1\} &~\text{ if }x<0,\\ \{1\} &~\text{ if }x>0, \end{array} \right. $$

and

$$ T(x)=\left\{ \begin{array}{ll} \mathbb{N}&~\text{ if }x=0,\\ \emptyset&~\text{ if }x\neq0. \end{array} \right. $$

Thus, if we apply (4) in Proposition 1, we reach a false conclusion as the assumption there is not satisfied in this case:

$$ \partial f(x)=\left\{ \begin{array}{ll} ]-1,1[&~\text{ if }x=0,\\ \emptyset&~\text{ if }x\neq0. \end{array} \right. $$

The Stone–Čech compactification of \(\mathbb {N}\) is given by

$$ \begin{array}{@{}rcl@{}} \widehat{\mathbb{N}} & =&\mathbb{N}\cup\left\{\lim_{i}\gamma_{n_{i}}:~(n_{i})_{i}\subset\mathbb{N}, n_{i}\rightarrow+\infty\right\} \\ & =&\mathbb{N}\cup\left\{\lim_{i}\gamma_{2n_{i}},~\lim_{i}\gamma_{2n_{i}+1}:~(n_{i})_{i}\subset\mathbb{N}, n_{i}\rightarrow +\infty\right\}, \end{array} $$

whereas the fγ’s, \(\gamma \in \widehat {\mathbb {N}}\), take the form

$$ f_{\gamma}=\left\{ \begin{array}{ll} g_{2n+1} &\quad\text{if }\gamma=\gamma_{2n+1}\equiv2n+1,\\ h_{2n} &\quad \text{if }\gamma=\gamma_{2n}\equiv2n, \end{array} \right. $$

for \(\gamma \in \mathbb {N}\), and

$$ f_{\gamma}=\limsup_{\gamma_{n}\rightarrow\gamma}f_{n} $$

for \(\gamma \in \widehat {\mathbb {N}}\setminus \mathbb {N}\). Equivalently, we consider the family

$$ \left\{g_{2n+1},h_{2n}, n\in\mathbb{N};~g_{\bar{\gamma}}, h_{\bar{\gamma}}\right\}, $$

where \(g_{\bar {\gamma }}\), \(h_{\bar {\gamma }}: \mathbb {R}\rightarrow \mathbb {R}\) are defined as

$$ \begin{array}{@{}rcl@{}} g_{\bar{\gamma}}(z)&=&\limsup_{n\rightarrow\infty}g_{2n+1}(z)=\max\{z,0\},\\ h_{\bar{\gamma}}(z)&=&\limsup_{n\rightarrow\infty}h_{2n}(z)=\max\{-z,0\}. \end{array} $$

It is easily checked that this new family has the same properties as the original one, \(\{f_{\gamma },\gamma \in \widehat {\mathbb {N}}\}\). In other words, we have enlarged the original family of functions by adding \(g_{\bar {\gamma }}\) and \(h_{\bar {\gamma }}\). Therefore, applying (10), we get

$$ \begin{array}{@{}rcl@{}} \partial f(0) & =& \text{co}\left\{\bigcup\nolimits_{n\in T(0)}\partial g_{2n+1}(0)\bigcup\partial g_{\bar{\gamma}}(0)\bigcup\nolimits_{n\in T(0)}\partial h_{2n}(0)\bigcup\partial h_{\bar{\gamma}}(0)\right\} \\ &=&\text{co}\left\{\bigcup\nolimits_{n\geq1}\left[0,\frac{n}{n+1}\right] \bigcup [0,1] \bigcup\nolimits_{n\geq1}\left[\frac{-n}{n+1},0\right] \bigcup [-1,0]\right\} = [-1,1], \end{array} $$

and, for \(x\neq 0\), say \(x=1\),

$$ \partial f(1)=\partial g_{\bar{\gamma}}(1)=\{1\}. $$

Observe that the presence of the new functions \(g_{\bar {\gamma }}\) and \(h_{\bar {\gamma }}\) is necessary, since the subdifferentials at 0 of the data functions \(g_{2n+1}\) and \(h_{2n}\) do not lead us to the whole subdifferential of the supremum function f, as they do not include the subgradients \(-1\) and \(1\).
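The behaviour described in Example 1 can be checked with exact rational arithmetic. The sketch below is only an illustration under stated assumptions: the indices are taken as \(n\geq 1\), the family is truncated to its first N pairs, and the names `partial_sup` and `subgradient_hull_at_zero` are hypothetical helpers, not notation from the text.

```python
from fractions import Fraction


def g(n, z):
    """g_{2n+1}(z) = max{n z / (n + 1), 0}."""
    return max(Fraction(n, n + 1) * z, Fraction(0))


def h(n, z):
    """h_{2n}(z) = max{-n z / (n + 1), 0}."""
    return max(-Fraction(n, n + 1) * z, Fraction(0))


def partial_sup(N, z):
    """Supremum over the first N pairs (g_{2n+1}, h_{2n}), n = 1, ..., N."""
    return max(max(g(n, z), h(n, z)) for n in range(1, N + 1))


def subgradient_hull_at_zero(N):
    """Convex hull of the subdifferentials at 0 of the first N pairs.

    Uses the subdifferentials [0, n/(n+1)] of g_{2n+1} and
    [-n/(n+1), 0] of h_{2n} at the origin.
    """
    hi = max(Fraction(n, n + 1) for n in range(1, N + 1))
    return (-hi, hi)


for N in (1, 10, 1000):
    print(N, partial_sup(N, 1), subgradient_hull_at_zero(N))
```

Since \(n/(n+1)<1\) for every n, no truncated family attains the value \(|z|\) at a point \(z\neq 0\), and the hull at 0 never contains the subgradients \(-1\) and \(1\). This is consistent with \(T(x)=\emptyset\) for \(x\neq 0\) and with the role played by the limit functions \(g_{\bar {\gamma }}\) and \(h_{\bar {\gamma }}\).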

In order to decompose the subdifferential term involved in formula (10) we need to impose some additional continuity or lower semi-continuity conditions on the initial functions. The assumption in Theorem 2 gives the first example, where the continuity of the supremum function allows us to characterize \(\partial f(x)\) by means only of the sets \(\partial f_{\gamma}(x)\). We give next an alternative representation of \(\partial f(x)\) by means of the ε-subdifferentials of the \(f_{\gamma}\)'s, under the condition

$$ \text{cl} f=\sup_{t\in T}(\text{cl}f_{t}), $$
(19)

where \(\text{cl} f\) and \(\text{cl} f_{t}\) are the closed hulls (lower semi-continuous regularizations) of the respective functions.

Proposition 2

If (19) holds, then for every \(x\in \text{dom} f\)

$$ \partial f(x)=\bigcap\nolimits_{\varepsilon>0,L\in\mathcal{F}(x)} \overline{\text{co}}\left\{\bigcup\nolimits_{\gamma\in \widehat{T}(x)}\partial_{\varepsilon}f_{\gamma}(x)+\mathrm{N}_{L\cap\text{dom} f}(x)\right\}, $$

where \(f_{\gamma}\), \(\widehat {T}(x)\) and \(\mathcal {F}(x)\) are defined in (6), (9), and (2), respectively, and T is a completely regular topological space.

Proof

It suffices to apply [7, Theorem 3.8] to the family \(\{f_{\gamma }, \gamma \in \widehat {T}\}\). □

We discuss next a nonconvex counterpart of formula (10), under the following condition introduced in [21],

$$ f^{\ast\ast}=\sup_{t\in T}f_{t}^{\ast\ast}, $$
(20)

where \(f^{\ast\ast}\) and \(f_{t}^{\ast \ast }\) are the biconjugates of the respective functions. In the convex case, and assuming that the conjugates \(f^{\ast}\) and \(f_{t}^{\ast }\) are proper, (19) is equivalent to the last relation.

Proposition 3

Let \(\{f_{t},t\in T\}\) be a nonempty family of extended real-valued, not necessarily convex functions, and consider \(f=\sup _{t\in T}f_{t}\). If condition (20) holds, then for every \(x\in \text{dom} f\)

$$ \begin{array}{@{}rcl@{}} \partial f(x) & =& \bigcap\nolimits_{L\in\mathcal{F}(x)}\text{co}\left\{\bigcup\nolimits_{\gamma\in\widehat{T}(x)}\partial(f_{\gamma}+\mathrm{I}_{L\cap\text{dom} f})(x)\right\} \\ & =&\bigcap\nolimits_{\varepsilon>0,L\in\mathcal{F}(x)}\overline{\text{co}}\left\{\bigcup\nolimits_{\gamma\in\widehat{T}(x)}\partial_{\varepsilon}f_{\gamma}(x)+\mathrm{N}_{L\cap\text{dom} f}(x)\right\}, \end{array} $$

where \(f_{\gamma}\), \(\widehat {T}(x)\), and \(\mathcal {F}(x)\) are defined in (6), (9), and (2), respectively, and T is equipped with a completely regular topology.

Proof

Assume that \(\partial f(x)\neq\emptyset\), so that \(f(x)=f^{\ast\ast}(x)\) and \(\partial f(x)=\partial f^{\ast\ast}(x)\). Then, by applying Theorem 1 to the family \(\{f_{t}^{\ast \ast },t\in T\}\), we obtain

$$ \partial f(x)=\partial f^{\ast\ast}(x)=\bigcap\nolimits_{L\in\mathcal{F}(x)}\text{co}\left\{\bigcup\nolimits_{\gamma\in\widehat{T}^{1}(x)}\partial(g_{\gamma}+\mathrm{I}_{L\cap\text{dom} f})(x)\right\}, $$
(21)

where \(g_{\gamma }:X\rightarrow \overline {\mathbb {R}}\), \(\gamma \in \widehat {T}\), are defined by

$$ g_{\gamma}(z):=\limsup_{\gamma_{t}\rightarrow\gamma, t\in T}f_{t}^{\ast\ast}(z), $$

and

$$ \widehat{T}^{1}(x):=\left\{\gamma\in\widehat{T}:g_{\gamma}(x)=f(x)\right\}. $$

Observe that for every \(\gamma \in \widehat {T}^{1}(x)\) we have that

$$ f(x)=\limsup_{\gamma_{t}\rightarrow\gamma, t\in T}f_{t}^{\ast\ast}(x)\leq\limsup_{\gamma_{t}\rightarrow\gamma, t\in T}f_{t}(x)=f_{\gamma}(x)\leq f(x), $$

and so \(\gamma \in \widehat {T}(x)=\{\gamma \in \widehat {T}:f_{\gamma }(x)=f(x)\}\). Moreover, since

$$ g_{\gamma}(z)=\limsup_{\gamma_{t}\rightarrow\gamma, t\in T}f_{t}^{\ast\ast}(z)\leq\limsup_{\gamma_{t}\rightarrow\gamma, t\in T}f_{t}(z)=f_{\gamma}(z)\quad\text{ for all }z\in X, $$

we deduce that for all \(L\in \mathcal {F}(x)\)

$$ \partial\left( g_{\gamma}+\mathrm{I}_{L\cap\text{dom} f}\right)(x)\subset \partial\left( f_{\gamma}+\mathrm{I}_{L\cap\text{dom} f}\right)(x). $$

Thus, the inclusion “⊂” in the first statement follows from (21), and we are done since the opposite inclusion is easily verified.

The second statement follows similarly by using Proposition 2 instead of Theorem 1. □

5 An Application to Optimality Conditions

In this section, we revise the optimality conditions for convex semi-infinite programming established in [6], by removing the compactness of the set indexing the constraints.

Aside from [6], a significant precedent of the results in this section can be found in [11, Chapter 7], where KKT conditions are established for convex semi-infinite optimization with finite-valued functions, using a closedness condition which is implied by some version of Slater’s qualification. Many KKT conditions exist in the literature which are obtained via different approaches: approximate subdifferentials of the data functions ([3, 16]), the exact subdifferentials at close points [28], Farkas–Minkowski-type closedness criteria [8] in convex semi-infinite optimization, strong CHIP-like qualifications for convex optimization with not necessarily convex \(\mathcal {C}^{1}\)-constraints [2] (see, also, [9] for locally Lipschitz constraints), among others.

Here we consider the following optimization problem

$$ \mathcal{(P)}:\qquad \inf_{f_{t}(x)\leq0, t\in T}f_{0}(x), $$

where T is a completely regular topological space, and \(f_{t}:\mathbb {R}^{n}\rightarrow \mathbb {R}_{\mathbb {\infty }}\), for \(t\in T\cup \{0\}\) (we assume, without loss of generality, that \(0\notin T\)), are proper and convex. Problem \((\mathcal {P})\) is equivalent to

$$ \inf_{f(x)\leq0}f_{0}(x), $$

where

$$ f:=\sup_{t\in T}f_{t}. $$

Let the set \(\widehat {T}\) and the convex functions \(f_{\gamma }:\mathbb {R}^{n}\rightarrow \mathbb {R}_{\mathbb {\infty }}\), \(\gamma \in \widehat {T}\), be as defined in (5) and (6), respectively. We also denote

$$ \widehat{A}(x):=\{\gamma\in\widehat{T}:f_{\gamma}(x)=0\}, $$

so that, by Lemma 3, for every feasible point \(x\in \mathbb {R}^{n}\) for \(\mathcal {(P)}\) we have

$$ \widehat{A}(x)=\bigcap\nolimits_{\varepsilon>0} \text{cl}\left( \mathfrak{d}(A_{\varepsilon}(x))\right), $$

where

$$ A_{\varepsilon}(x) := \{t\in T:~f_{t}(x)\geq-\varepsilon\},\quad\varepsilon>0. $$

The following theorem establishes Fritz–John-type necessary optimality conditions for problem \(\mathcal {(P)}\). The main feature of this result and the subsequent corollary is the absence of any compactness and continuity assumptions on the index set and the mappings \(t\mapsto f_{t}(z)\), as they were required in [6, Theorem 5].

Theorem 3

Assume that \(\bar {x}\) is an optimal solution of \(\mathcal {(P)}\). Then we have

  1. (a)
    $$ 0_{n}\in\text{co}\left\{\partial(f_{0}+\mathrm{I}_{\text{dom} f})(\bar{x})\cup \bigcup\nolimits_{\gamma\in\widehat{A}(\bar{x})} \partial(f_{\gamma}+\mathrm{I}_{\text{dom} f_{0}\cap \text{dom} f})(\bar{x})\right\}. $$
  2. (b)

    Moreover, under the condition

    $$ \text{ri}(\text{dom} f_{\gamma})\cap\text{ri}(\text{dom} f)\neq\emptyset\quad\text{ for all }\gamma\in\widehat{A}(\bar{x})\cup\{0\}, $$

    we have

    $$ 0_{n}\in\text{co}\left\{\partial f_{0}(\bar{x})\cup \bigcup\nolimits_{\gamma\in\widehat{A}(\bar{x})} \partial f_{\gamma}(\bar{x})\right\} + \mathrm{N}_{\text{dom} f} (\bar{x})+\mathrm{N}_{\text{dom} f_{0}}(\bar{x}). $$

Proof

We consider the supremum function \(g:\mathbb {R}^{n} \rightarrow \mathbb {R}_{\mathbb {\infty }}\), defined as

$$ g(x):=\sup\{f_{0}(x)-f_{0}(\bar{x}),~f_{t}(x),~t\in T\}=\max\left\{f_{0}(x)-f_{0}(\bar{x}),~f(x)\right\}, $$

so that \(\text{dom} g=\text{dom} f_{0}\cap \text{dom} f\). It is easily verified that \(\bar {x}\) is a global minimum of g; that is, \(0_{n}\in \partial g(\bar {x})\).

We endow the set T ∪{0} with the topology generated by the open sets of T and {0}, which makes it completely regular. Then the compactification of T ∪{0} can be identified with \(\widehat {T}\cup \{0\}\). Consequently, and according to Corollary 4, \(\bar {x}\) satisfies

$$ 0_{n}\in\partial g(\bar{x})=\text{co}\left\{\partial(f_{0} + \mathrm{I}_{\text{dom} f})(\bar{x})\cup \bigcup\nolimits_{\gamma\in\widehat{A}(\bar{x})} \partial(f_{\gamma}+\mathrm{I}_{\text{dom} f_{0}\cap \text{dom} f})(\bar{x})\right\}, $$

which is condition (a).

(b) Under the current assumptions, by using the classical sum rule ([25]), we get on the one hand

$$ \partial\left( f_{0} + \mathrm{I}_{\text{dom} f}\right)(\bar{x})=\partial f_{0}(\bar{x})+\mathrm{N}_{\text{dom} f}(\bar{x}), $$

and on the other hand, since

$$ f_{\gamma}+\mathrm{I}_{\text{dom} f_{0}\cap\text{dom} f}=f_{\gamma}+\mathrm{I}_{\text{dom} f}+\mathrm{I}_{\text{dom} f_{0}}\quad\text{ and }\quad\text{dom} (f_{\gamma}+\mathrm{I}_{\text{dom} f})=\text{dom} f, $$

we obtain

$$ \partial\left( f_{\gamma}+\mathrm{I}_{\text{dom} f_{0}\cap\text{dom} f}\right)(\bar{x})=\partial f_{\gamma}(\bar{x})+\mathrm{N}_{\text{dom} f_{0}}(\bar{x})+\mathrm{N}_{\text{dom} f}(\bar{x}). $$

Thus, the conclusion follows from (a). □

Remark 2

In particular, if \(f(\bar {x})<0\), then the last condition reads

$$ 0_{n}\in\partial\left( f_{0}+\mathrm{I}_{\text{dom} f}\right)(\bar{x}), $$

as \(f_{\gamma }(\bar {x})\leq f(\bar {x})<0\) for all \(\gamma \in \widehat {T}\), and so \(\widehat {A}(\bar {x})=\emptyset \).

Remark 3

Observe that the strong Slater condition, i.e., the existence of some \(x_{0}\in \text{dom} f_{0}\) such that \(f(x_{0})<0\), does not imply that \(x_{0}\) is an interior point of the feasible set. This is what happens in the following example. Take \(T:=[0,+\infty \lbrack \), \(f_{0}\equiv 0\) and let \(f_{t}:\mathbb {R}\rightarrow \mathbb {R}\), \(t\in T\), be defined as

$$ f_{t}(x):=\max\{tx-1,-tx-1\}. $$

The point 0 is a strong Slater point, but \(0\notin \text {int}(\{x\in \mathbb {R}:~f_{t}(x)\leq 0,~t\in T\})\).
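The claim in Remark 3 can be illustrated numerically. The following Python sketch uses the identity \(f_{t}(x)=t|x|-1\), so that \(f_{t}(0)=-1\) for every t while any \(x\neq 0\) violates some constraint. The helper `violating_index` and the choice \(t=2/|x|\) are assumptions made only for this illustration.

```python
def f_t(t, x):
    """Constraint function f_t(x) = max{t x - 1, -t x - 1} = t|x| - 1."""
    return max(t * x - 1.0, -t * x - 1.0)


def violating_index(x):
    """Return some t in T = [0, +inf) with f_t(x) > 0, or None when x = 0."""
    if x == 0:
        return None
    return 2.0 / abs(x)  # then f_t(x) = t|x| - 1 is approximately 1


for x in (0.0, 1e-3, -1e-6):
    t = violating_index(x)
    print(x, t, None if t is None else f_t(t, x))
```

Every nonzero point, however close to 0, is infeasible, so the feasible set reduces to \(\{0\}\) and has empty interior even though \(f(0)=-1<0\).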

We derive next the KKT conditions for problem \(\mathcal {(P)}\) under the Slater qualification.

Corollary 5

Under the strong Slater condition; that is,

$$ f(x_{0})<0\quad\text{ for some }x_{0}\in\text{dom} f_{0}, $$

the point \(\bar {x}\) is optimal for \(\mathcal {(P)}\) if and only if

$$ 0_{n}\in\partial\left( f_{0}+\mathrm{I}_{\text{dom} f}\right)(\bar{x})+\text{cone}\left\{\bigcup\nolimits_{\gamma\in\widehat{A}(\bar{x})} \partial\left( f_{\gamma}+\mathrm{I}_{\text{dom} f_{0}\cap \text{dom} f}\right)(\bar{x})\right\}. $$
(22)

Proof

Assume first that \(f(\bar {x})=0\). By Theorem 3(a), \(\bar {x}\) is optimal if and only if either

$$ 0_{n}\in\text{co}\left\{\bigcup\nolimits_{\gamma\in\widehat{A}(\bar{x})} \partial\left( f_{\gamma}+\mathrm{I}_{\text{dom} f_{0}\cap \text{dom} f}\right)(\bar{x})\right\} $$
(23)

or (22) holds.

Moreover, by Theorem 1 we have that

$$ \begin{array}{@{}rcl@{}} \text{co}\left\{\bigcup\nolimits_{\gamma\in\widehat{A}(\bar{x})} \partial\left( f_{\gamma}+\mathrm{I}_{\text{dom} f_{0}\cap \text{dom} f}\right)(\bar{x})\right\} & =& \partial\left( \sup_{t\in T}\left( f_{t}+\mathrm{I}_{\text{dom} f_{0}\cap\text{dom} f}\right)\right)(\bar{x})\\ & =& \partial\left( f+\mathrm{I}_{\text{dom} f_{0}}\right)(\bar{x}), \end{array} $$

and so relation (23) is equivalent to

$$ 0_{n}\in\text{co}\left\{\bigcup\nolimits_{\gamma\in\widehat{A}(\bar{x})} \partial\left( f_{\gamma}+\mathrm{I}_{\text{dom} f_{0}\cap \text{dom} f}\right)(\bar{x})\right\} = \partial\left( f+\mathrm{I}_{\text{dom} f_{0}}\right)(\bar{x}), $$

equivalently, \(f(x)\geq f(\bar {x})=0\) for all \(x\in \text{dom} f_{0}\); and this contradicts the strong Slater condition.

Finally, if \(f(\bar {x})<0\), then (22) follows by Theorem 3(a). □
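To see conditions of the type in Theorem 3(a) and (22) at work, consider a hypothetical one-dimensional problem that is not taken from the text: minimize \(f_{0}(x)=x\) subject to \(f_{t}(x)=-tx\leq 0\), \(t\in T=\{1,2\}\). All functions are finite, so the indicator and normal-cone terms vanish, and \(x_{0}=1\) is a strong Slater point. The sketch below, with assumed helper names, tests membership of 0 in the convex hull and in the translated cone, which in dimension one reduces to sign checks.

```python
def f0_subgradient(x):
    """The objective f0(x) = x has the single subgradient 1 everywhere."""
    return 1.0


# Hypothetical constraints f_t(x) = -t * x <= 0 for t in T, i.e. x >= 0.
T = [1.0, 2.0]


def active_slopes(x, tol=1e-12):
    """Slopes of the constraints that are active (f_t(x) = 0) at x."""
    return [-t for t in T if abs(-t * x) <= tol]


def fritz_john_holds(x):
    """Check 0 in co({f0'(x)} U active slopes); in 1-D a sign-change test."""
    gens = [f0_subgradient(x)] + active_slopes(x)
    return min(gens) <= 0.0 <= max(gens)


def kkt_holds(x):
    """Check 0 in f0'(x) + cone(active slopes) in dimension one."""
    target = -f0_subgradient(x)
    if target == 0.0:
        return True
    # A nonzero target lies in the cone iff some generator shares its sign.
    return any(s * target > 0 for s in active_slopes(x))


print(fritz_john_holds(0.0), kkt_holds(0.0))  # optimal point x = 0
print(fritz_john_holds(1.0), kkt_holds(1.0))  # feasible but not optimal
```

At the minimizer \(\bar{x}=0\) both constraints are active and the multiplier 1 on the constraint \(t=1\) gives \(0=1+1\cdot(-1)\); at the feasible point \(x=1\) no constraint is active and neither condition holds.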