Abstract
Under the assumption of prox-regularity and the presence of a tilt stable local minimum we are able to show that a \({\mathcal {VU}}\) like decomposition gives rise to the existence of a smooth manifold on which the function in question coincides locally with a smooth function.
Similar content being viewed by others
Avoid common mistakes on your manuscript.
1 Introduction
The study of substructure of nonsmooth functions has led to an enrichment of fundamental theory of nonsmooth functions [19,20,21, 24,25,26, 32]. Fundamental to this substructure is the presence of manifolds along which the restriction of the nonsmooth function exhibits some kind of smoothness. In the case of “partially smooth function” [25] an axiomatic approach is used to describe the local structure that is observed in a number of important examples [25, 26]. In [26] it is shown that the study of tilt stability can be enhanced for the class of partially smooth functions. In the theory of the “\({\mathcal {U}}\)-Lagrangian” and the associated “\({\mathcal {VU}}\) decomposition” [24, 28] the existence of a smooth manifold substructure is proven for some special classes of functions [28, 31]. In the extended theory the presence of so called “fast tracks” is assumed and these also give rise to similar manifold substructures [29, 32]. The \({\mathcal {U}}\)-Lagrangian is reminiscent of a partial form of “tilt minimisation” [36] and this observation has motivated this study. As fast tracks and related concepts such as “identifiable constraints” are designed to aid the design of methods for the solution of nonsmooth minimization problems [18, 27, 29, 31, 32, 40], it seems appropriate to ask what additional structure does the existence of a tilt stable local minimum give to the study of the \({\mathcal {VU}}\) decomposition [24]? This is the subject of the paper. In the following discussion we denote the extended reals by \({\mathbb {R}}_{\infty }:={\mathbb {R}}\cup \left\{ +\infty \right\} \). If not otherwise stated we will consider a lower semi-continuous, extended-real-valued function \(f:{\mathbb {R}}^{n}\rightarrow {\mathbb {R}}_{\infty }\). We denote the limiting subdifferential of Mordukhovich, Ioffe and Kruger by \(\partial f\).
Tilt stability was first studied in [36] for the case of f being both “prox-regular” at \(\bar{x}\ \) for \(\bar{z}\in \partial f(\bar{x})\) and “subdifferentially continuous” at \(\left( \bar{x},\bar{z}\right) \), in the sense of Rockafellar and Poliquin [35]. In [36] a characterisation of tilt stability is made in terms of certain second order sufficient optimality conditions. Such optimality conditions have been studied in [10, 11, 15, 36]. In [11, 13] it is shown that second order information provided by the coderivative is closely related to another second order condition framed in terms of the “limiting subhessian” [13, 14, 34]. These may be thought of as the robust\(\backslash \)limiting version of symmetric matrices associated with a lower, supporting Taylor expansion with a first order component z and second order component Q (a symmetric matrix). The limiting pairs \((\bar{z}, \bar{Q})\) are contained in the so called “subjet” [5] and the second order components \(\bar{Q}\) associated with a given \(\bar{z} \in \partial f(\bar{x})\) are contained in the limiting subhessian \(\underline{\partial }^2 f(\bar{x}, \bar{z})\), [10, 23, 34]. These have been extensively studied and possess a robust calculus similar to that which exists for the limiting subdifferential [9, 23]. One can view the “best curvature” approximation in the direction h for the function f at \((\bar{x},\bar{z})\) to be \(q(\underline{\partial }^2 f(\bar{x},\bar{z}))(h):= \sup \{\langle Qh, h\rangle \mid Q \in \underline{\partial }^2 f(\bar{x},\bar{z}) \}\), where we denote by \(\langle u, h\rangle \) the usual Euclidean inner product of two vectors \(u, h \in {\mathbb {R}}^n\).
To complete our discussion we consider the \({\mathcal {VU}}\) decomposition [24]. When \({\text {rel}}\)-\({\text {int}} \partial f\left( \bar{x}\right) \ne \emptyset \) we can take \(\bar{z}\in {\text {rel}}\)-\({\text {int}} \partial f\left( \bar{x}\right) \) and define \({\mathcal {V}}:={\text { span}}\left\{ \partial f\left( \bar{x}\right) -\bar{z}\right\} \) and \({\mathcal {U}}:={\mathcal {V}}^{\perp }\). The \({\mathcal {V}}\)-space is thought to capture the directions of nonsmoothness of f at \(\bar{x}\) while the \({\mathcal {U}}\) is thought to capture directions of smoothness. When \({\mathcal {U}}^{2}:= {\text {dom}} q(\underline{\partial }^{2}f(\bar{x},\bar{z}))(\cdot )\) is a linear subspace that is contained in \({\mathcal {U}}\), we call \({\mathcal {U}}^{2}\) the second order component of \({\mathcal {U}}\) and in Lemma 18 we give quite mild condition under which this is indeed the case. When \({\mathcal {U}}^{2}={\mathcal {U}}\) we say that a fast-track exists at \(\bar{x}\) for \(\bar{z}\in \partial f\left( \bar{x}\right) \).
In this paper we investigate whether the existence of a tilt stable local minimum provides extra information regarding the existence of a smooth manifold within which a smooth function interpolates the values of the f. We are able to show the following positive results. Recall that we say f is quadratically minorised when there exists a quadratic function \(q\left( x\right) :=\alpha -\frac{R}{2}\left\| x-\bar{x}\right\| ^{2}\) such that \(q\le f\) (globally). All balls \(B^{X}_{\varepsilon } (0) := \{ x \in X \mid \Vert x\Vert \le \varepsilon \}\) are closed.
Theorem 1
Consider \(f:{\mathbb {R}}^{n}\rightarrow {\mathbb {R}}_{\infty }\) is a proper lower semi-continuous function, quadratically minorised, and prox-regular at \(\bar{x}\) for \(0\in \partial f(\bar{x})\). Suppose in addition f admits a nontrivial subspace \({\mathcal {U}}^{2}:= {\text {dom}} \left( \underline{\partial }^{2}f\left( \bar{x},0\right) \right) (\cdot )\) and that f has a tilt stable local minimum at \(\bar{x}\). Then \({\mathcal {U}}^2 \subseteq {\mathcal {U}}\), \( {\mathcal {V}}^{2}:=({\mathcal {U}}^2)^{\perp }\) and for \(g\left( w\right) := \left[ {\text {co}}h\right] \left( w\right) \), placing \(h(w):=f(\bar{x}+w) +\delta _{B_{\varepsilon }^{{\mathcal {U}}^{2 }}\left( 0\right) \oplus B_{\varepsilon }^{{\mathcal {V}}^{2 }}\left( 0\right) }\left( w\right) \) we have \(\{v\left( u\right) \}= {\text {argmin}}_{v^{\prime }\in {\mathcal {V}}^{2}\cap B_{\varepsilon }\left( 0\right) }f\left( \bar{x}+u+v^{\prime }\right) :{\mathcal {U}}^{2}\rightarrow {\mathcal {V}}^{2}\) and there exists a \(\delta >0\) such that we have \(g\left( u+v\left( u\right) \right) =f\left( \bar{x}+u+v\left( u\right) \right) \) and \(\nabla _{u}g\left( u+v\left( u\right) \right) \) existing as Lipschitz function for \(u\in B_{\delta }^{{\mathcal {U}}^{2}}\left( 0\right) \).
That is, \({\mathcal {M}}:=\left\{ u+ v(u) \mid u\in B_{\varepsilon }^{{\mathcal {U}}^{2}}\left( 0\right) \right\} \) is a manifold on which the restriction to \({\mathcal {M}}\) of function g coincides with a smooth \(C^{1,1}\) function of \(u \in {\mathcal {U}}\) [tilt stability ensures local uniqueness of the function \(v(\cdot )\)]. Assuming a little more we obtain the smoothness of v and in addition the smoothness of the manifold.
Theorem 2
Consider \(f:{\mathbb {R}}^{n}\rightarrow {\mathbb {R}}_{\infty }\), a proper lower semi-continuous function, quadratically minorised and prox-regular at \(\bar{x}\) for \(0\in \partial f(\bar{x})\). Let \(g\left( w\right) :=\left[ {\text {co}}h\right] \left( w\right) \) for \(h(w):=f(\bar{x}+w) +\delta _{B_{\varepsilon }^{{\mathcal {U}}}\left( 0\right) \oplus B_{\varepsilon }^{{\mathcal {V}}}\left( 0\right) }\left( w\right) \). Suppose in addition that \({\mathcal {U}}^{2} ={\mathcal {U}}\) is a linear subspace (i.e. \({\mathcal {U}}\) admits a fast track), f has a tilt stable local minimum at \(\bar{x}\) for \(0\in {\text {rel}}\)-\({\text {int}}\partial f\left( \bar{x}\right) \) and \(\partial ^{\infty }f\left( \bar{x}+u+v\left( u\right) \right) =\left\{ 0\right\} \) for \( v\left( u\right) \in {\text {argmin}}_{v^{\prime }\in {\mathcal {V}}\cap B_{\varepsilon }\left( 0\right) }\left\{ g\left( u+v^{\prime }\right) \right\} : {\mathcal {U}}\rightarrow {\mathcal {V}}\), \(u \in B_{\varepsilon }^{{\mathcal {U}}}\left( 0\right) \). Then there exists a \(\varepsilon >0\) such that the function in (1) is \(C^{1,1}\left( B_{\varepsilon }^{{\mathcal {U}}}\left( 0\right) \right) \) smooth:
(\(e_{{\mathcal {U}}}\) is the identity operator on \({\mathcal {U}}\)). Moreover, suppose we have a \(\delta >0\) (with \(\delta \le \varepsilon \)) such that for all \(z_{{\mathcal {V}}}\in B_{\delta }\left( 0\right) \cap {\mathcal {V}}\subseteq \partial _{{\mathcal {V}}}f\left( \bar{x}\right) \) and \(u\in B_{\varepsilon }\left( 0\right) \cap {\mathcal {U}}\) we have
Then \({\mathcal {M}}:=\left\{ u+ v\left( u\right) \mid u\in B_{\varepsilon }^{{\mathcal {U}}}\left( 0\right) \right\} \) is a \(C^1\)-smooth manifold on which \(u\mapsto f\left( \bar{x}+u+v\left( u\right) \right) \) is \(C^{1,1} \left( B_{\delta }^{{\mathcal {U}}}\left( 0\right) \right) \) smooth and \(u\mapsto v\left( u\right) \) is continuously differentiable.
We are also able to produce a lower Taylor approximation for f that holds locally at all points inside \({\mathcal {M}}\), see Corollary 54. These results differ from those present in the literature in that we impose common structural assumptions on f found elsewhere in the literature on stability of local minima [7, 36], rather than imposing very special structural properties, as is the approach of [18, 28, 29, 40]. Moreover, we do not assume the a-priori existence of any kind of smoothness of the underlying manifold, as is done in the axiomatic approach in [26], but let smoothness arise from a graded set of assumptions which progressively enforce greater smoothness. In this way the roles of these respective assumptions are clarified. Finally we note that it is natural in this context to study \(C^{1,1}\) smoothness rather than the \(C^2\) smoothness used in other works such as [26, 27, 32].
2 Preliminaries
The following basic concepts are used repeatedly throughout the paper.
Definition 3
Suppose \(f:{\mathbb {R}}^{n}\rightarrow {\mathbb {R}}_{\infty }\) is a lower semi-continuous function.
-
1.
Denote by \(\partial _{p}f(\bar{x})\) the proximal subdifferential, which consists of all vectors z satisfying \(f(x)\ge f(\bar{x})+\langle z,x-\bar{x}\rangle -\frac{r}{2}\Vert x-\bar{x}\Vert ^{2}\) in some neighbourhood of \(\bar{x}\), for some \(r\ge 0\), where \(\Vert \cdot \Vert \) denotes the Euclidean norm. Denote by \(S_{p}(f)\) the points in the domain of f at which \(\partial _{p}f(x)\ne \emptyset \).
-
2.
The limiting subdifferential [33, 38] at x is given by
$$\begin{aligned} \partial f(x)=\limsup _{x^{\prime }\rightarrow _{f}x}\partial _{p}f(x^{\prime }):=\{z\mid \exists z_{v}\in \partial _{p}f(x_{v}),x_{v}\rightarrow _{f}x \text {, with }z_{v}\rightarrow z\}, \end{aligned}$$where \(x^{\prime }\rightarrow _{f}x\) means that \(x^{\prime }\rightarrow x\) and \(f(x^{\prime })\rightarrow f(x)\).
-
3.
The singular limiting subdifferential is given by
$$\begin{aligned} \partial ^{\infty }f(x)&=\limsup _{x^{\prime }\rightarrow _{f}x}\!{}^{\infty }\,\partial _{p}f(x^{\prime }) \\&:=\{z\mid \exists z_{v}\in \partial _{p}f(x_{v}),x_{v}\rightarrow _{f}x \text {, with }\lambda _{v}\downarrow 0\text { and }\lambda _{v}z_{v}\rightarrow z\}. \end{aligned}$$
2.1 The \({\mathcal {VU}}\) decomposition
Denote the convex hull of a set \(C\subseteq {\mathbb {R}}^{n}\) by \({\text {co}}C\). The convex hull of a function \(f:{\mathbb {R}}^{n}\rightarrow {\mathbb {R}}_{\infty }\) is denoted by \({\text {co}}f\) and corresponds to the proper lower-semi-continuous function whose epigraph is given by \(\overline{{\text {co }}{\text {epi}}f}\). In this section we will use a slightly weaker notion of the \({\mathcal {VU}}\) decomposition. When \({\text {rel}}\)-\({\text {int co}}\partial f\left( \bar{x}\right) \ne \emptyset \) we can take \(\bar{z}\in {\text {rel}}\)-\({\text {int co}}\partial f\left( \bar{x}\right) \) and define \({\mathcal {V}}:={\text { span}} \left\{ {\text {co}}\partial f\left( \bar{x}\right) -\bar{z}\right\} \) and \({\mathcal {U}}:={\mathcal {V}}^{\perp }\).
Under the \({\mathcal {VU}}\) decomposition [24] for a given \(\bar{z}\in {\text {rel}}\)-\({\text {int co}}\partial f(\bar{x})\) we have, by definition,
One can then decompose \(\bar{z}=\bar{z}_{{\mathcal {U}}}+\bar{z}_{{\mathcal {V}}}\) so that when \(w=u+v\in {\mathcal {U}}\oplus {\mathcal {V}}\) we have \(\langle \bar{z},w\rangle =\langle \bar{z}_{{\mathcal {U}}}, u\rangle +\langle \bar{z}_{{\mathcal {V}}},v\rangle .\) Indeed we may decompose into the direct sum \(x=x_{{\mathcal {U}}}+x_{{\mathcal {V}}}\in {\mathcal {U}}\oplus {\mathcal {V}}\) and use the following norm for this decomposition \(\left\| x-\bar{x}\right\| ^{2}:=\left\| x_{{\mathcal {U}}}-\bar{x}_{{\mathcal {U}}}\right\| ^{2}+\left\| x_{{\mathcal {V}}}-\bar{x}_{{\mathcal {V}}}\right\| ^{2}.\) As all norms are equivalent we will at times prefer to use \(\{B^{{\mathcal {U}}}_{\varepsilon } (\bar{x}_{{\mathcal {U}}}) \oplus B^{{\mathcal {V}}}_{\varepsilon } (\bar{x}_{{\mathcal {V}}}) \}_{\varepsilon > 0}\) which more directly reflects the direct sum \({\mathcal {U}}\oplus {\mathcal {V}}\), where each \(B^{(\cdot )}_{\varepsilon } (\cdot )\) is a closed ball of radius \(\varepsilon >0\), in their respective space.
Denote the projection onto the subspaces \({\mathcal {U}}\) and \({\mathcal {V}}\) by \(P_{{\mathcal {U}}}\left( \cdot \right) \) and \(P_{{\mathcal {V}}}\left( \cdot \right) \), respectively. Denote by \(f|_{{\mathcal {U}}}\) the restriction of f to the subspace \({\mathcal {U}}\), \(\partial _{{\mathcal {V}}}f(\bar{x}):=P_{{\mathcal {V}}}(\partial f(\bar{x}))\) and \(\partial _{{\mathcal {U}}}f(\bar{x} ):=P_{{\mathcal {U}}}(\partial f(\bar{x}))\). Let \(\delta _{C}(x)\) denote the indicator function of a set C, \(\delta _{C}(x)=0\) iff \(x\in C\) and \(+\infty \) otherwise. Let \(f^{*}\) denote the convex conjugate of a function f.
Remark 4
The condition (3) implies one can take \({\mathcal {V}}:={\text {span}} \left\{ {\text {co}}\partial f\left( \bar{x}\right) -\bar{z}\right\} ={\text {affine}}\)-\({\text {hull}} \left[ {\text {co}}\partial f\left( \bar{x}\right) \right] -\bar{z}\) which is independent of the choice of \(\bar{z}\in {\text {co}}\partial f\left( \bar{x}\right) \). Moreover, as was observed in [32, Lemma 2.4] we have \(\bar{z}_{{\mathcal {U}}}=P_{{\text {affine}}\text {-}{\text {hull co}}\partial f\left( \bar{x}\right) }\left( 0\right) \) (see part 2 below).
Proposition 5
Suppose \(f:{\mathbb {R}}^{n}\rightarrow {\mathbb {R}}_{\infty }\) is a proper lower semi-continuous function with (3) holding.
-
1.
We have
$$\begin{aligned} {\mathcal {U}}=\left\{ u\mid -\delta _{\partial f(\bar{x})}^{*}(-u)=\delta _{\partial f(\bar{x})}^{*}(u)\right\} . \end{aligned}$$(4) -
2.
We have
$$\begin{aligned} \partial f\left( \bar{x}\right) =\left\{ \bar{z}_{{\mathcal {U}}}\right\} \oplus \partial _{{\mathcal {V}}}f\left( \bar{x}\right) . \end{aligned}$$(5) -
3.
Suppose there exists \(\varepsilon >0\) such that for all \(z_{{\mathcal {V}}}\in B_{\varepsilon }\left( \bar{z}_{{\mathcal {V}}}\right) \cap {\mathcal {V}}\subseteq \partial _{{\mathcal {V}}}f\left( \bar{x}\right) \) there is a common
$$\begin{aligned} v\left( u\right) \in {\text {argmin}}_{v\in {\mathcal {V}}\cap B_{\varepsilon }\left( 0\right) }\left\{ f\left( \bar{x}+u+v\right) -\langle z_{{\mathcal {V}}},v\rangle \right\} \cap {\text {int}} B_{\varepsilon } (0) \end{aligned}$$for all \(u\in B_{\varepsilon }\left( 0\right) \cap {\mathcal {U}}\). Then we have
$$\begin{aligned} {\text {cone}}\left[ \partial _{{\mathcal {V}}}f\left( \bar{x}+u+v\left( u\right) \right) -\bar{z}_{{\mathcal {V}}}\right] \supseteq {\mathcal {V}}. \end{aligned}$$(6) -
4.
If we impose the addition assumption that f is (Clarke) regular at \(\bar{x}\), \(\bar{z}\in \partial f\left( \bar{x}\right) \) and \(\partial ^{\infty }f\left( \bar{x}\right) \cap {\mathcal {V}} =\left\{ 0\right\} \). Then the function
$$\begin{aligned} H_{{\mathcal {U}}}\left( \cdot \right) :=f\left( \bar{x}+\cdot \right) :{\mathcal {U}} \rightarrow {\mathbb {R}}_{\infty } \end{aligned}$$is strictly differentiable at 0 and single valued with \(\partial H_{{\mathcal {U}}}\left( 0\right) =\left\{ \bar{z}_{{\mathcal {U}}}\right\} \) and \(H_{{\mathcal {U}}}\) (as a function defined on \({\mathcal {U}}\)) is continuous with \(H_{{\mathcal {U}}}\) and \(-H_{{\mathcal {U}}}\) (Clarke) regular functions at 0 (in the sense of [38]).
Proof
-
(1)
If \(u\in {\mathcal {U}}\) then by construction we have
$$\begin{aligned} -\delta _{\partial f(\bar{x})}^{*}(-u)=-\delta _{{\text {co}}\partial f(\bar{x})}^{*}(-u)=\delta _{{\text {co}}\partial f(\bar{x})}^{*}(u)=\delta _{\partial f(\bar{x})}^{*}(u) = \langle u , \bar{z} \rangle \end{aligned}$$(7)giving the containment of \({\mathcal {U}}\) in the right hand side of (4). For u satisfying (7) then \(\langle z-\bar{z},u\rangle =0\) for all \(z\in {\text {co}}\partial f(\bar{x})\). That is, \(u\perp [{\text {co}}\partial f(\bar{x})-\bar{z}]\) and hence \(u\perp {\mathcal {V}}={\mathcal {U}}^{\perp }\) verifying \(u\in {\mathcal {U}}\).
-
(2)
Since \(\partial f(\bar{x})\subseteq \bar{z}+{\mathcal {V}}=\bar{z}_{{\mathcal {U}}}+{\mathcal {V}}\) always have \(\partial f\left( \bar{x}\right) =\left\{ \bar{z}_{{\mathcal {U}}}\right\} \oplus \partial _{{\mathcal {V}}}f\left( \bar{x}\right) .\)
-
(3)
When \(v\left( u\right) \in {\text {argmin}}_{v\in {\mathcal {V}}\cap B_{\varepsilon }\left( 0\right) }\left\{ f\left( \bar{x}+u+v\right) -\langle z_{{\mathcal {V}}},v\rangle \right\} \) for all \(u\in B_{\varepsilon }\left( 0\right) \cap {\mathcal {U}}\) and \(z_{{\mathcal {V}}}\in B_{\varepsilon }\left( \bar{z}_{{\mathcal {V}}}\right) \cap {\mathcal {V}}\) we have, due to the necessary optimality conditions, that
$$\begin{aligned} z_{{\mathcal {V}}}\in \partial _{{\mathcal {V}}}f\left( \bar{x}+u+v\left( u\right) \right) \end{aligned}$$and hence \(B_{\varepsilon }\left( \bar{z}_{{\mathcal {V}}}\right) \cap {\mathcal {V}}\subseteq \partial _{{\mathcal {V}}}f\left( \bar{x}+u+v\left( u\right) \right) \) giving (6).
-
(4)
For \(h\left( \cdot \right) :=f\left( \bar{x}+\cdot \right) \) define \(H =h +\delta _{{\mathcal {U}}} \) so \(h\left( u\right) =H\left( u\right) \) when \(u\in {\mathcal {U}}\). Then as \(\partial ^{\infty }f\left( \bar{x}\right) \cap {\mathcal {V}} =\left\{ 0\right\} \), by [38, Corollary 10.9] we have
$$\begin{aligned} \partial H\left( 0\right) \subseteq \partial f\left( \bar{x}\right) + N_{{\mathcal {U}}}\left( 0 \right) =\partial f\left( \bar{x}\right) +\mathcal {\ V}. \end{aligned}$$Then restricting to \({\mathcal {U}}\) we have \(P_{{\mathcal {U}}} \partial H\left( 0\right) \subseteq \partial _{{\mathcal {U}}} f\left( \bar{x} \right) \). Then for \(u \in {\mathcal {U}}\) we have \(\delta _{\partial H(0)}^{*} (u) = \delta _{P_{{\mathcal {U}}}\partial H(0)}^{*} (u) \le \delta _{\partial f(\bar{x})}^{*}(u)\) and so
$$\begin{aligned} -\delta _{\partial f(\bar{x})}^{*}(-u)\le -\hat{d}H(0)(-u)\le \hat{d} H(0)(u)\le \delta _{\partial f(\bar{x})}^{*}(u). \end{aligned}$$As f is regular at \(\bar{x}\) we have \(\partial ^{\infty } f (\bar{x}) = 0^{+} (\partial f (\bar{x}))\) where the later corresponds to the recession directions of the convex set \(\partial f (\bar{x})\) (see [38, Theorem 8.49]). Then we have \(0^+ (\partial f (\bar{x})) \subseteq {\mathcal {V}}\). [Take \(u \in 0^+ (\partial f (\bar{x}))\) and \(z \in {\text {rel-int}} \partial f (\bar{x})\). Then by [37, Theorem 6.1] we have \(z + u \in {\text {rel-int}} \partial f (\bar{x})\) and hence \(u \in {\mathcal {V}}\).] Thus for \(u \in {\mathcal {U}} \subseteq (0^+ (\partial f (\bar{x})))^{\circ }\) we have
$$\begin{aligned} \hat{d}H (0)(u) := \limsup _{x \rightarrow 0, t \downarrow 0} \inf _{u^{\prime }\rightarrow u} \frac{1}{t} (f(x+tu^{\prime })-f(x) ) =\delta _{\partial H(0)}^{*} (u)=\delta _{P_{{\mathcal {U}}}\partial H(0)}^{*} (u), \end{aligned}$$see [38, Definition 8.16, Exercise 8.23]. It follows that \(-\hat{d}H(0)(-u)=\hat{d}H(0)(u)\) for all \(u\in {\mathcal {U}}\). Restriction of H to the subspace \({\mathcal {U}}\), (denoted this function by \(H_{{\mathcal {U}}}\)) we have \(\partial ^{\infty } H_{{\mathcal {U}}} (0) \subseteq \partial ^{\infty }f\left( \bar{x}\right) \cap {\mathcal {U}} = \{0\}\) then by [38, Theorem 9.18] we have \(\partial H_{{\mathcal {U}}} \left( 0\right) \) a singleton with \(H_{{\mathcal {U}}}\) continuous at 0 and \(H_{{\mathcal {U}}}\) and \(-H_{{\mathcal {U}}}\) (Clarke) regular. As \(\bar{z}_{{\mathcal {U}} }\in \partial H_{{\mathcal {U}}}\left( 0\right) \) we have \(\partial H_{{\mathcal {U}}}\left( 0\right) =\left\{ \bar{z}_{{\mathcal {U}}}\right\} \), so \(\partial _{{\mathcal {U}}} f (\bar{x}) =\left\{ \bar{z}_{{\mathcal {U}}}\right\} \). \(\square \)
3 A Primer on Subjets and Subhessians
We will have need to discuss second order behaviour in this paper and as a consequence it will be useful to define a refinement of this decomposition that takes into account such second order variations. In most treatments of the \({\mathcal {VU}}\) decomposition one finds that by restricting f to \({\mathcal {M}}:=\{(u,v(u)) \mid u \in {\mathcal {U}}\}\) not only do we find f is smooth we also find that there is better second order behaviour as well [24]. This is also often associated with smooth manifold substructures. Let \(\mathcal {S}(n)\) denote the set of symmetric \(n\times n\) matrices (endowed with the Frobenius norm and inner product) for which \(\langle Q,hh^{T}\rangle =h^{T}Qh \). Denote the cone of positive semi-definite matrices by \(\mathcal {P}(n)\) and \(\Delta _{2}f(x,t,z,u):=2\frac{f(x+tu)-f(x)-t\langle z,u\rangle }{t^{2}}\).
Definition 6
Suppose \(f:{\mathbb {R}}^{n}\rightarrow {\mathbb {R}}_{\infty }\) is a lower semi-continuous function.
-
1.
The function f is said to be twice sub-differentiable (or possess a subjet) at x if the following set is nonempty;
$$\begin{aligned} \partial ^{2,-}f(x)= & {} \{(\nabla \varphi (x),\nabla ^{2}\varphi (x))\,:\,f-\varphi \,\text { has a local minimum at }\,\,x\text { with } \,\,\\&\varphi \in {\mathcal {C}}^{2}{\mathcal {(}}{\mathbb {R}}^{n})\}. \end{aligned}$$The subhessians at \((x,z)\in {\text {graph}}\partial f\) are given by \(\partial ^{2,-}f(x,z):=\{Q\in \mathcal {S}(n)\mid (z,Q)\in \partial ^{2,-}f(x)\}\).
-
2.
The limiting subjet of f at x is defined to be: \(\underline{\partial }^{2}f(x)=\limsup _{u\rightarrow ^{f}x}\partial ^{2,-}f(u)\) and the associated limiting subhessians for \(z\in \partial f\left( x\right) \) are \( \underline{\partial }^{2}f(x,z)=\left\{ Q\in \mathcal {S}\left( n\right) \mid \left( z,Q\right) \in \underline{\partial }^{2}f(x)\right\} \).
-
3.
We define the rank one barrier cone for \(\underline{\partial }^{2}f(x,z)\) as
$$\begin{aligned} b^{1}(\underline{\partial }^{2}f(x,z)):= & {} \{h\in {\mathbb {R}}^{n}\mid q\left( \underline{\partial }^{2}f(x,z)\right) (h)\\:= & {} \sup \left\{ \langle Qh,h\rangle \mid Q\in \underline{\partial }^{2}f(x,z)\right\} <\infty \}. \end{aligned}$$ -
4.
Denoting \(S_{2}(f)=\{x\in {\text {dom}}\,(f)\mid \nabla ^{2}f(x) \text { exists} \} \), then the limiting Hessians at \((\bar{x}, \bar{z})\) are given by:
$$\begin{aligned}&\overline{D}^{2}f(\bar{x},\bar{z}) =\{Q\in \mathcal {S}(n)\mid Q=\lim _{n\rightarrow \infty }\nabla ^{2}f(x_{n}) \\ \end{aligned}$$where \(\{x_{n}\}\subseteq S_{2}(f)\text {, }x_{n}\rightarrow ^{f}\bar{x}\) and \(\nabla f(x_{n})\rightarrow \bar{z}\}\).
-
5.
Define the second order Dini-directional derivative of f by \(f_{\_}^{\prime \prime }(\bar{x},z,h)=\liminf _{t\downarrow 0,u\rightarrow h}\Delta _{2}f(\bar{x},t,z,u)\).
Define \(\partial ^{2,+}f(x,z):= -\partial ^{2,-}(-f)(x,-z)\) then when \(Q\in \partial ^{2,-}f(x,z)\cap \partial ^{2,+}f(x,z)\) it follows that \(Q=\nabla ^{2}f\left( x\right) \) and \(z=\nabla f\left( x\right) \). If \(f_{\_}^{\prime \prime }(\bar{x},z,h)\) is finite then \(f_{\_}^{\prime }(\bar{x},h):=\liminf _{{{t\downarrow 0}} \mathop {{u\rightarrow h }}}\frac{1}{t}(f(\bar{x}+tu)-f(\bar{x}))=\langle z,h\rangle \). It must be stressed that these second order objects may not exist everywhere but as \(\partial ^{2,-}f(x)\) is non-empty on a dense subset of its domain [5] when f is lower semi-continuous then at worst so are the limiting objects. In finite dimensions this concept is closely related to the proximal subdifferential (as we discuss below). The subhessian is always a closed convex set of matrices while \(\underline{\partial }^{2}f(\bar{x},z)\) may not be convex (just as \(\partial _{p}f(\bar{x})\) is convex while \(\partial f(\bar{x})\) often is not).
A function f is para-concave around \(\bar{x}\) when there exists a \(c>0\) and a ball \(B_{\varepsilon }\left( \bar{x}\right) \) within which the function \(x\mapsto \)\(f\left( x\right) -\frac{c}{2}\left\| x\right\| ^{2}\) is finite concave (conversely f is para-convex around \(\bar{x}\) iff \(-f\) is para-concave around \(\bar{x}\)). If a function is para-concave or para-convex we have (by Alexandrov’s theorem) the set \(S_{2}(f)\) is of full Lebesgue measure in \({\text {dom}}\,f\). A function is \(C^{1,1}\) when \(\nabla f \) exists and satisfies a Lipschitz property. In [13, Lemma 2.1], it is noted that f is locally \(C^{1,1}\) iff f is simultaneously a locally para-convex and para-concave function. The next observation was first made in [34, Prposition 4.2] and later used in [23, Proposition 6.1].
Proposition 7
([23], Proposition 6.1) If f is lower semi-continuous then for \(z\in \partial f(\bar{x})\) we have
If we assume in addition that f is continuous and a para-concave function around \(\bar{x}\) then equality holds in (8).
A weakened form of para-convexity is prox-regularity.
Definition 8
([35]) Let the function \(f:{\mathbb {R}}^{n}\rightarrow {\mathbb {R}}_{\infty }\) be finite at \(\bar{x}\).
-
1.
The function f is prox-regular at \(\bar{x}\) for \(\bar{z}\) with respect to \(\varepsilon >0\) and \(r \ge 0 \), where \(\bar{z}\in \partial f(\bar{x})\), if f is locally lower semi-continuous at \(\bar{x}\) and
$$\begin{aligned} f(x^{\prime })\ge f(x)+\langle z,x^{\prime }-x\rangle -\frac{r}{2}\Vert x^{\prime }-x\Vert ^{2} \end{aligned}$$whenever \(\Vert x^{\prime }-\bar{x}\Vert \le \varepsilon \) and \(\Vert x-\bar{x}\Vert \le \varepsilon \) and \(\left| f(x)-f(\bar{x})\right| \le \varepsilon \ \) with \(\Vert z-\bar{z}\Vert \le \varepsilon \) and \(z\in \partial f(x)\).
-
2.
The function f is subdifferentially continuous at \(\bar{x}\) for \(\bar{z}\), where \(\bar{z}\in \partial f(\bar{x})\), if for every \(\delta >0\) there exists \(\varepsilon >0\) such that \(\left| f(x)-f(\bar{x} )\right| \le \delta \) whenever \(|x-\bar{x}|\le \varepsilon \) and \(|z-\bar{z}|\le \varepsilon \) with \(z\in \partial f(x).\)
Remark 9
In this paper we adopt the convention that limiting subgradients must exist at \(\bar{x}\) to invoke this definition. We say that f is prox-regular at \(\bar{x}\) iff it is prox-regular with respect to each \(\bar{z}\in \partial f\left( \bar{x}\right) \) (with respect to some \(\varepsilon >0\) and \(r\ge 0\)).
Remark 10
We shall now discuss a well known alternative characterisation of \((z,Q)\in \partial ^{2,-}f(\bar{x})\), see [34]. By taking the \(\varphi \in C^{2}({\mathbb {R}}^{n})\) in Definition 6 and expanding using a Taylor expansion we may equivalently assert that there exists a \(\delta >0\) for which
where \(o\left( \cdot \right) \) is the usual Landau small order notation. It is clear from (9) that we have \((z,Q)\in \partial ^{2,-}f(\bar{x}) \) implies \(z\in \partial _{p}f(\bar{x})\) as
when \(r>\Vert Q\Vert _{F}\) and \(\delta >0\) sufficiently reduced. Moreover \(z\in \partial _{p}f(\bar{x})\) implies \((z,-rI)\in \partial ^{2,-}f(\bar{x})\). From the definition of prox-regularity at \(\bar{x}\) for \(\bar{z}\) (and the choice of \(x=\bar{x}\)) we conclude that we must have \(\bar{z}\in \partial _{p}f(\bar{x})\) and hence \(\partial ^{2,-}f(\bar{x},\bar{z})\ne \emptyset \). Moreover the definition of prox-regularity implies the limiting subgradients are actually proximal subgradients locally i.e. within an “f-attentive neighbourhood of \(\bar{z}\)” [35]. When f is subdifferentially continuous we may drop the f-attentiveness and claim \(B_{\delta }(\bar{z})\cap \partial f(\bar{x})=B_{\delta }(\bar{z})\cap \partial _{p}f(\bar{x})\) for some sufficiently small \(\delta >0\). The Example 4.1 of [26] show that this neighbourhood can reduce to a singleton \(\{\bar{z}\}\). When we have a tilt stable local minimum at \(\bar{x}\) or \(\bar{z} \in {\text {rel-int}} \partial f (\bar{x})\) then this situation cannot occur.
Remark 11
We denote \((x^{\prime },z^{\prime })\rightarrow _{S_{p}(f)}(\bar{x},z)\) to mean \(x^{\prime }\rightarrow ^{f}\bar{x}\), \(\ z^{\prime }\in \partial _{p}f(x^{\prime })\) and \(z^{\prime }\rightarrow z\). As \(\partial ^{2,-}f(x^{\prime },z^{\prime })\ne \emptyset \) iff \(z^{\prime }\in \partial _{p}f\left( x^{\prime }\right) \) it follows via an elementary argument that
Denote the recession directions of a convex set C by \(0^{+}C\). Noting that \(\langle Q,uv^{T}\rangle =v^{T}Qu\) one may see the motivation for the introduction of the rank-1 support in (9). The rank-1 support is given by \(q\left( \mathcal {A}\right) (u,v):=\sup \left\{ \langle Q,uv^{T}\rangle \mid Q\in \mathcal {A}\right\} \) for a subset \(\mathcal {A}\subseteq \mathcal {S}\left( n\right) \). We see from (9) that when we have \(Q\in \partial ^{2,-}f(\bar{x},\bar{z})\) then \(Q-P\in \partial ^{2,-}f(\bar{x},\bar{z})\) for any \(n\times n\) positive semi-definite matrix \(P \in \mathcal {P}(n)\). Thus we always have \(-\mathcal {P}(n)\subseteq 0^{+}\partial ^{2,-}f(\bar{x},\bar{z})\) where \(\partial ^{2,-}f(\bar{x},\bar{z}) \subseteq \underline{\partial }^{2}f(\bar{x},\bar{z})\).
Theorem 12
([14], Theorem 1) Let \(g:{\mathbb {R}}^{n}\rightarrow {\mathbb {R}}_{\infty }\) be proper (i.e. \(g(u)\ne -\infty \) anywhere) and \({\text {dom}}g\ne \emptyset \). For \(u,v\in {\mathbb {R}}^{n}\), define \(q(u,v)=\infty \) if u is not a positive scalar multiple of v or vice versa, and \(q(\alpha u,u)=q(u,\alpha u)=\alpha g(u)\) for any \(\alpha \ge 0\). Then q is a rank one support of a set \(\mathcal {A}\subseteq \mathcal {S}(n)\) with \(-\mathcal {P}(n)\subseteq 0^{+}\mathcal {A}\) if and only if
-
1.
g is positively homogeneous of degree 2.
-
2.
g is lower semicontinuous.
-
3.
\(g(-u)=g(u)\) (symmetry).
For the sets \(\mathcal {A}\subseteq \mathcal {S}(n)\) described in Theorem 12 one only needs to consider the support defined on \({\mathbb {R}}^{n}\) by \(q\left( \mathcal {A}\right) (h):=\sup \left\{ \langle Q,hh^{T}\rangle \mid Q\in \mathcal {A}\right\} \). On reflection it is clear that all second order directional derivative possess properties 1. and 3. of the above theorem and those that are topologically well defined possess 2. as well. We call
the symmetric rank-1 hull of \(\mathcal {A}\subseteq \mathcal {S}(n)\). Note that by definition \(q(\mathcal {A})(h)=q(\mathcal {A}^{1})(h)\). When \(\mathcal {A}=\mathcal {A}^{1}\), we say \(\mathcal {A}\) is a symmetric rank-1 representer. Note that if \(Q\in \mathcal {A}^{1}\), then \(Q-P\in \mathcal {A}^{1}\) for \(P\in \mathcal {P}(n)\) so always \(-\mathcal {P}(n)\subseteq 0^{+}\mathcal {A} \). The rank one barrier cone for a symmetric rank-1 representer is denoted by \(b^{1}(\mathcal {A}):=\{h\in {\mathbb {R}}^{n}\mid q\left( \mathcal {A}\right) (h)<\infty \}\). Theorem 12 show that in general the rank-1 support is an even and positively homogeneous degree 2 function. Moreover these two properties imply the domain of \(q(\mathcal {A})(\cdot )\) is the union of some cone C and its negative i.e.
In the first order case we have \(\delta ^{*}_{\partial _{p}f(\bar{x})} (h) \le f_{\_}^{\prime }(\bar{x},h)\). In [14] it was first observed that we have an analogous identity involving the rank-1 support of the subhessians in that
Hence if we work with subjets we are in effect dealing with objects dual to the lower, symmetric, second-order epi-derivative \(f_{\_}^{\prime \prime }(\bar{x},z,\cdot )\). Many text book examples of these quantities can be easily constructed. Moreover there exists a robust calculus for the limiting subjet [9, 23]. Furthermore as noted in Example 51 of [11] the qualification condition for the sum rule for the limiting subjet can hold while for the same problem the basic qualification condition for the sum rule for the limiting (first order) subdifferential can fail to hold. This demonstrates the value of considering pairs (z, Q).
Example 13
Consider the convex function on \({\mathbb {R}}^{2}\) given by \( f(x,y)=\left| x-y\right| . \) Take \(\left( x,y\right) =(0,0)\) and \(z=(0,0)\in \partial f(0,0)\) then \(Q=\left( \begin{array} [c]{cc} \alpha &{} \gamma \\ \gamma &{} \beta \end{array} \right) \in \partial ^{2,-}f \left( (0,0)\,(0,0)\right) \) iff locally around (0, 0) we have
This inequality only bites when \(x=y\) in which case
Consequently
The extreme case is when \(\alpha +2\gamma +\beta =0\) and two examples of Q attaining this extremal value are:
Also
Remark 14
Knowing the rank-1 barrier cone of a rank-1 representer \(\mathcal {A}\) tells us a lot about it’s structure. This is no small part to the fact that it consists only of symmetric matrices. This discussion has been carried out in quite a bit of detail in [9]. From convex analysis we know that the barrier cone (the points at which the support function is finite valued) is polar to the recession directions. In [9, Lemma 14] it is shown that for a rank-1 representer (using the Frobenious inner product on \(\mathcal {S}\) (n)) this corresponds to \((0^+ \mathcal {A})^{\circ } = \mathcal {P} (b^1 (\mathcal {A})) := \{ \sum _{i \in F} u_i u_i^T \mid u_i \in b^1 (\mathcal {A})\, \text {for a3 finite index set} \)F\(\}\). Moreover in [9, Lemma 24] it is shown that \( \mathcal {P} (b^1 (\mathcal {A}))^{\circ } \cap \mathcal {P} (n) = \mathcal {P} (b^1 (\mathcal {A})^{\perp })\). Denoting \({\mathcal {U}}^2 := b^1 (\mathcal {A})\) and \({\mathcal {V}}^2 = ({\mathcal {U}}^2)^{\perp }\) we deduce that \(\mathcal {P}({\mathcal {V}}^2) = (0^+ \mathcal {A}) \cap \mathcal {P} (n)\). This explains why \(q(\mathcal {A}) (w) = +\infty \) when \(w \notin {\mathcal {U}}^2\). Since we always have \(-\mathcal {P} ({\mathcal {V}}^2) \subseteq -\mathcal {P} (n) \subseteq 0^{+} \mathcal {A}\) it follows that \(\mathcal {P}({\mathcal {V}}^2) -\mathcal {P}({\mathcal {V}}^2) \subseteq 0^{+} \mathcal {A}\). Furthermore we find that for any \(w = w_{{\mathcal {U}}^2} + w_{{\mathcal {V}}^2}\) we then have for \(\mathcal {S}({\mathcal {V}}^2)\), denoting the symmetric linear mapping from \({\mathcal {V}}^2\) into \({\mathcal {V}}^2\), that
3.1 A second order \({\mathcal {VU}}\) decomposition
The result [13], Corollary 6.1 contains a number of observations that characterise the rank-1 support of the limiting subhessians. We single out the following which is of particular interest for this paper.
Proposition 15
([13], Corollary 6.1) Suppose that \(f:{\mathbb {R}}^{n}\rightarrow {\mathbb {R}}_{\infty }\) is quadratically minorised and is prox-regular at \(\bar{x}\ \) for \(\bar{z}\in \partial f(\bar{x})\) with respect to \(\varepsilon \) and r. Then \(h\mapsto q\left( \underline{\partial }^{2}f(\bar{x},\bar{z})\right) (h)+r\Vert h\Vert ^{2}\) is convex.
Proof
For the convenience of the reader we provide a self contained proof of this in the Appendix A. \(\square \)
Corollary 16
Suppose that f is quadratically minorised and is prox-regular at \(\bar{x}\) for \(\bar{z}\in \partial f(\bar{x})\) with respect to \(\varepsilon \) and r. Then \(b^{1}(\underline{\partial }^{2}f(\bar{x},\bar{z}))\) is a linear subspace of \({\mathbb {R}}^{n}\).
Proof
Note that \(b^{1}(\underline{\partial }^{2}f(\bar{x},\bar{z}))={\text {dom}}[q\left( \underline{\partial }^{2}f(\bar{x},\bar{z})\right) (\cdot )]\) is convex under the assumption of Proposition 15. Let C be the cone given in (11) then \(b^{1}(\underline{\partial }^{2}f(\bar{x},\bar{z}))={\text {co}}(C\cup (-C))={\text {span}}C\). As \(b^{1}(\underline{\partial }^{2}f(\bar{x},\bar{z}))\) is a symmetric convex cone it is a subspace. \(\square \)
Definition 17
Let the function \(f:{\mathbb {R}}^{n}\rightarrow {\mathbb {R}}_{\infty }\) be finite at \(\bar{x}\). When \(b^{1}(\underline{\partial }^{2}f(\bar{x},\bar{z} )) \) is a linear subspace of \({\mathbb {R}}^{n}\) and \(b^{1}(\underline{\partial }^{2}f(\bar{x},\bar{z}))\subseteq {\mathcal {U}}\) we call \({\mathcal {U}} ^{2}:=b^{1}(\underline{\partial }^{2}f(\bar{x},\bar{z}))\) a second order component of the \({\mathcal {U}}\)-space.
We will now justify this definition via the following results.
Lemma 18
Suppose \(f:{\mathbb {R}}^{n}\rightarrow {\mathbb {R}}_{\infty }\) is quadratically minorised and is prox-regular at \(\bar{x}\ \) for \(\bar{z}\in \partial f(\bar{x})\) with respect to \(\varepsilon \) and r. Suppose in addition that \(\bar{z}\in {\text {rel}}\)-\({\text {int}}\partial f(\bar{x})\). Then for any \(\beta \ge 0\) there is \(\varepsilon ^{\prime }>0\) (independent of \(\beta \)) and a \(\epsilon _{\beta }>0\) (\(\beta \) dependent) such that we have \(f\left( \bar{x}+u+v\right) \ge f\left( \bar{x}\right) +\langle \bar{z},u+v\rangle +\frac{\beta }{2}\left\| v\right\| ^{2}-\frac{r}{2}\left\| u\right\| ^{2}\) whenever \(v\in B_{\epsilon _{\beta }}\left( 0\right) \) and \(u\in B_{\varepsilon ^{\prime }}\left( 0\right) \).
Moreover we have
Proof
By the prox-regularity of f at \(\bar{x}\) for \(\bar{z}\in \partial f(\bar{x})\) with respect to \(\varepsilon \) and \(r>0\) we have \(B_{\delta }(\bar{z} )\cap \partial f(\bar{x})=B_{\delta }(\bar{z})\cap \partial _{p}f(\bar{x})\) for some sufficiently small \(\delta >0\). Thus \(\bar{z}\in {\text {rel}}\)-\({\text {int}}\partial _{p}f(\bar{x})\) and there exists a \(\varepsilon ^{\prime }\le \min \{\varepsilon ,\delta \}\) such that \(\bar{z}+\varepsilon ^{\prime }B_{1}\left( 0\right) \cap {\mathcal {V}}\subseteq \partial _{p}f(\bar{x})\) and \(r>0\) such that for \(u+v\in B_{\varepsilon ^{\prime }}^{{\mathcal {U}}}\left( 0\right) \times B_{\varepsilon ^{\prime }}^{{\mathcal {V}}}\left( 0\right) \) we have
where the last inequality holds due to the fact that \(\varepsilon ^{\prime }-\frac{r \Vert v \Vert }{2} \ge \beta \Vert v \Vert \). Now choose \(\epsilon _{\beta }=\min \{\varepsilon ^{\prime },\frac{2\varepsilon ^{\prime }}{2\beta +r}\}\).
This inequality implies that for all \(\beta >0\) we have \( \beta I\in P_{{\mathcal {V}}}^{T}\partial ^{2,-}f(\bar{x},\bar{z})P_{{\mathcal {V}}} \) and hence when \(P_{{\mathcal {V}}} h \ne 0\) (or \(h \notin {\mathcal {U}}\)) we have \( q\left( \underline{\partial }^{2}f(\bar{x},\bar{z})\right) (h) = +\infty \) and so \(h \notin {\mathcal {U}}^2\).
Remark 19
This result may hold trivially with both \({\mathcal {U}} = {\mathcal {U}}^2 =\{0\}\). Consider the function \(f :{\mathbb {R}}^2 \rightarrow {\mathbb {R}}\) given by:
and take \(\bar{x} = (0,0)\). Then \(\partial f\left( 0,0\right) \supseteq \left\{ \left( 0,0\right) ,\left( 1,1\right) ,\left( -\,1,1\right) ,\left( 1,-\,1\right) ,\left( -\,1,-\,1\right) \right\} \) and \({\mathcal {U}}=\left\{ 0\right\} \) with \({\mathcal {V}} ={\mathbb {R}}^{2}.\) We have f is prox-regular at \(\bar{x}=(0,0)\) for \(\bar{z} = (0,0)\) and quadratically memorised (by the zero quadratic). We have \({\mathcal {U}}^{2}=\left\{ 0\right\} \) as we have \(Q_{1} =\pm \beta \left( 1,1\right) \left( \begin{array} [c]{c} 1\\ 1 \end{array} \right) =\pm \beta \left( \begin{array} [c]{cc} 1 &{} 1\\ 1 &{} 1 \end{array} \right) \) and \(Q_{2}=\pm \beta \left( -\,1,1\right) \left( \begin{array} [c]{c} -\,1\\ 1 \end{array} \right) =\pm \beta \left( \begin{array} [c]{cc} 1 &{} -\,1\\ -\,1 &{} 1 \end{array} \right) \) with \(Q_{1}\), \(Q_{2}\in \underline{\partial }^{2}f\left( ( 0,0), (0,0) \right) \) for all \(\beta \ge 0\) (approach (0, 0) along \(x=y\) and \(y=-x\) for \(z\rightarrow 0\)). Then \(q\left( \underline{\partial }^{2}f\left( (0,0), (0,0) \right) \right) \left( u,w\right) =+\infty \ge \beta \max \left\{ \left( -u+w\right) ^{2},\left( u+w\right) ^{2}\right\} \) for all \(\left( u,w\right) \ne \left( 0,0\right) \) and \(\beta \ge 0\).
We note that the examples developed in [30, Examples 2, 3] show that the assumption that \(\bar{z}\in {\text {rel}}\)-\({\text {int}}\partial f(\bar{x})\) is necessary for Lemma 18 to hold.
We finish by generalizing the notion of “fast track” [24].
Definition 20
We say f possesses a “fast track” at \(\bar{x}\) iff there exists \(\bar{z} \in \partial f\left( \bar{x}\right) \) for which
In the next section after we have introduced the localised \({\mathcal {U}}\)-Lagrangian we will justify this definition further. From Proposition 7 we see that \({\mathcal {U}}^{2}=b^{1}(\underline{\partial }^{2}f(\bar{x},\bar{z}))\) provides the subspace within which the eigen-vectors of the limiting Hessians remain bounded.
Lemma 21
Suppose f is quadratically minorised and prox-regular at \(\bar{x}\) for \(\bar{z}\in \partial f(\bar{x})\) which possesses a nontrivial second order component \({\mathcal {U}}^{2}\subseteq {\mathcal {U}}\). Then for all \(\left\{ x_{k}\right\} \subseteq S_{2}(f)\), \(x_{k}\rightarrow ^{f}\bar{x}\) with \(z_{k}\rightarrow \bar{z}\) and all \(h\in {\mathcal {U}}^{2}\) there is a uniform bound \(M>0\) such that for \(Q_{k}\in \partial ^{2,-}f\left( x_{k},z_{k}\right) \) we have
Proof
We have for all \(Q\in \underline{\partial }^{2}f(\bar{x},\bar{z})\) and any \(h\in {\mathcal {U}}^{2}\) that
As f is prox-regular, by Proposition 15\(q\left( \underline{\partial }^{2}f(\bar{x},\bar{z})\right) (\cdot )+r\Vert \cdot \Vert ^{2}\) is convex and finite valued on \({\mathcal {U}}^{2}\), a closed subspace and therefore is locally Lipschitz. Thus \(q\left( \underline{\partial }^{2}f(\bar{x},\bar{z})\right) (\cdot )\) is locally Lipschitz continuous on \({\mathcal {U}}^{2}\). Moreover a compactness argument allows us to claim it is Lipschitz continuous on the unit ball inside the space \({\mathcal {U}}^{2}\) and thus obtains a maximum, over the unit ball restricted to the space \({\mathcal {U}}^{2}\). Hence
for some \(K>0\). On multiplying by \(\Vert h\Vert ^{2}\) for \(h\in {\mathcal {U}}^{2}\) and using the positive homogeneity of degree 2 of the rank-1 support results in following inequality
for all \(Q\in \underline{\partial }^{2}f(\bar{x},\bar{z})\) and any \(h\in {\mathcal {U}}^{2}\). Take an arbitrary sequence \((x_{k},z_{k})\rightarrow _{S_{p}(f)}(\bar{x},\bar{z})\) and \(Q_{k}\in \partial ^{2,-}f(x_{k},z_{k})\) with \(Q_{k}\rightarrow Q\in \underline{\partial }^{2}f(\bar{x},\bar{z})\) then by taking \(M=2K\) we have
Moreover any sequence \(\left\{ x_{k}\right\} \subseteq S_{2}(f)\), \(x_{k}\rightarrow ^{f}\bar{x}\) with \(z_{k}\rightarrow \bar{z}\) has \((x_{k},z_{k})\rightarrow _{S_{p}(f)}(\bar{x},\bar{z})\). \(\square \)
3.2 Some consequences for coderivatives of \(C^{1,1}\) functions
As usual we have denoted the indicator function of a set \(\mathcal {A}\) by \(\delta _{\mathcal {A}}(Q)\) which equals zero if \(Q\in \mathcal {A}\) and \(+\infty \) otherwise. Recall the definition of the rank-1 hull \(\mathcal {A}^1\) given in (10). In general for the recession directions \(0^{+}\mathcal {A}^{1}\supseteq -\mathcal {P}(n)\). Consequently the convex support function \(\delta _{\mathcal {A}^{1}}^{*}\left( P\right) :=\sup \left\{ \langle Q,P\rangle :={\text {tr}}QP\mid Q\in \mathcal {A}^{1}\right\} =+\infty \) if \(P\notin \mathcal {P}(n)\). It is noted in [14, Proposition 4] that \(0^{+}\mathcal {A}^{1}=-\mathcal {P}(n)\) iff \(q(\mathcal {A})(h)<+\infty \) for all h.
Lemma 22
([12], Lemma 7) For any \(\mathcal {A}\subseteq \mathcal {S}(n)\), then \({\text {co}} \left( \mathcal {A}-\mathcal {P}(n)\right) =\mathcal {A}^{1}\text {.}\)
For any multi-function \(F:{\mathbb {R}}^{n}\rightrightarrows {\mathbb {R}}^{m}\) we denote its graph by \({\text {Graph}}F:=\left\{ (x,y)\mid y\in F(x)\right\} \). The Mordukhovich coderivative is defined as
and a second order object \(D^{*}\left( \partial f\right) (\bar{x},\bar{z})(h)\) is obtained by applying this construction to \(F\left( x\right) =\partial f\left( x\right) \) for \(\bar{z} \in \partial f (\bar{x})\). We can combine this observation with [38, Theorem 13.52] that gives a characterisation of the convex hull of the coderivative in terms of limiting Hessians for a \(C^{1,1}\) function f.
Corollary 23
Suppose f is locally \(C^{1,1}\) around x then the Mordukhovich coderivative satisfies
and
Proof
The first equality of (15) follows from [38, Theorem 13.52] and the second a restatement in terms of \(\overline{D}^{2}f(x,z)\). The third equality follows from preservation of convexity under a linear mapping. Clearly \({\text {co}}\overline{D}^{2}f(x,z)\subseteq {\text {co}}\left[ \overline{D}^{2}f(x,z)-\mathcal {P}(n)\right] =\overline{D}^{2}f(x,z)^{1}\) by Lemma 22. Moreover we must have by Proposition 7 and the linearity of \(Q\mapsto \langle Q,hh^{T}\rangle \) that
\(\square \)
A central assumption in this paper will be the presence of the following notion of local minimizer.
Definition 24
([36]) A point \(\bar{x}\) gives a tilt stable local minimum of a function \(f:{\mathbb {R}}^{n}\rightarrow {\mathbb {R}}_{\infty }\) if \(f\left( \bar{x}\right) \) is finite and there exists an \(\varepsilon >0\) such that the mapping
is single valued and Lipschitz on some neighbourhood of 0 with \(m_{f}\left( 0\right) =\bar{x}\).
In [36, Theorem 1.3] a criterion for tilt stability was given in terms of second order construction based on the coderivative of the subdifferential. Assume the first-order condition \(0\in \partial f(\bar{x})\) holds. In [36] the second order sufficiency condition
is studied and shown to imply a tilt-stable local minimum when f is both subdifferentially continuous and prox-regular at \(\bar{x}\) for \(\bar{z}\in \partial f(\bar{x})\). We may reinterpreting the condition (17) for \(C^{1,1}\) functions. Indeed thanks to Corollary 23 condition (17) is equivalent to the following.
Corollary 25
If f is locally \(C^{1,1}\) around x then condition (17) is equivalent to the existence of \(\beta >0\) such that:
Proof
By a simple convexity argument (17) is equivalent to \(\langle v,h\rangle >0\) for all \(v\in {\text {co}}D^{*}(\partial f)(x|0)(h)=\left[ {\text {co}}\overline{D}^{2}f(x,0)\right] h\) from which we have an equivalent condition that \(\langle Qh,h\rangle >0\) for all \(Q\in {\text {co}} \overline{D}^{2}f(x,0).\) But \(\langle Qh,h\rangle =\langle Q,hh^{T}\rangle \) (the Frobenius inner product) and linearity in Q gives \(\langle Qh,h\rangle >0\) for all \(Q\in \overline{D}^{2}f(x,0)\) as an equivalent condition. Finally we note that \(\overline{D}^{2}f(x,0)\) is closed and uniformly bounded due to the local Lipschitzness of the gradient \(x \mapsto \nabla f(x)\) so via a compactness argument \(\langle Qh,h\rangle \ge \beta >0\) for some \(\beta >0\).\(\square \)
Remark 26
It would be interesting to have a characterisation of subjets for functions other than those that are \(C^{1,1}\) smooth, in order to compare with their corresponding second order coderivative. Consider a characterisation of the coderivative for a class of functions found in [26, Corollary 5.4, Theorem 5.3] (which are not \(C^{1,1}\) functions). Then we have:
In this context of [26] (\(C^2\)-partially smooth functions around a tilt stable local minimum \(\bar{x}\)) we have \({\mathcal {M}} := \{ \bar{x} + u + v (u) \mid u \in {\mathcal {U}} \}\) is a \(C^2\) smooth manifold and in fact we have \(\nabla ^2_{{\mathcal {M}}} f( \bar{x}) w = \frac{d^2}{dt^2} f(\bar{x} + tw + v(tw) )|_{t=0}\) with \(T_{{\mathcal {M}}} ( \bar{x}) = {\mathcal {U}}^2\) and \(N_{{\mathcal {M}}} (\bar{x}) = {\mathcal {V}}^2\). These last claims are a consequence of [25, Corollary 2.3] (or [32, Corollary 2.3]) and the application of [24, Theorem 3.9] (or [32, Corollary 2.6]), after noting that a local convexification \(g( \cdot - \overline{x} )\) of f provides a convex “representative function” for f relative to \({\mathcal {M}}\), [26, Defintions 2.10 and 2.11]. How the construction of this convex representative function is carried out is the subject of the next Sect. 4. It seems possible that the calculus provided by [9, 23] could provide an avenue to calculate \(\underline{\partial }^2 f (\bar{x}, 0 )\) for this class of functions and shed some further light on their relationship.
4 The localised \({\mathcal {U}}^{\prime }\)-Lagrangian
For the remainder of the paper we will assume \(\bar{z} \in {\text {rel}}\)-\({\text {int}}\partial f\left( \bar{x}\right) \ne \emptyset \) and so \({\mathcal {V}}:={\text {span}} \left\{ \partial f\left( \bar{x}\right) -\bar{z}\right\} \), \({\mathcal {U}} = ({\mathcal {V}})^{\perp }\), as defined in [24, 28] and coinciding with the space defined in Sect. 2.1. When discussing tilt stability we will to assume \(\bar{z}=0\in \partial f\left( \bar{x}\right) \). Then we define the localised \({\mathcal {U}}^{\prime }\)-Lagrangian, for any subspace \({\mathcal {U}}^{\prime }\subseteq {\mathcal {U}}\) and some \(\varepsilon >0\), to be the function
where \(\mathcal {V^{\prime }}:=\mathcal {U^{\prime }}^{\perp }\). Let
This Lagrangian differs from the modification introduced by Hare [20] in that \(L_{\mathcal {U^{\prime }}}^{\varepsilon }\left( \cdot \right) \) is locally well defined on \(\mathcal {U^{\prime }}\) due to the introduction of the ball \(B_{\varepsilon }^{\mathcal {V^{\prime }}}\left( 0\right) =\mathcal {V^{\prime }}\cap B_{\varepsilon }\left( 0\right) \) over which the infimum is taken. Hare assumes a quadratic minorant to justify a finite value for a sufficiently large regularization parameter used in the so-called quadratic sub-Lagrangian. Define for \(u\in \mathcal {U^{\prime }}\) and \(v\left( \cdot \right) :\mathcal {U^{\prime }}\rightarrow B_{\varepsilon }^{\mathcal {V^{\prime }}}\left( 0\right) \) the auxiliary functions
Then
When \(v(\cdot )\) is chosen as in (18) we have \(L_{\mathcal {U^{\prime }}}^{\varepsilon }\left( u\right) =k_{v}(u)\) with both infinite outside \(B_{\varepsilon }^{\mathcal {U^{\prime }}}\left( 0\right) \).
Lemma 27
Suppose \(f:{\mathbb {R}}^{n}\rightarrow {\mathbb {R}}_{\infty }\) is a proper lower semi-continuous function and assume \(v(\cdot )\) is chosen as in (18). The conjugate of \(k_{v}:{\mathcal {U}}^{\prime }\rightarrow {\mathbb {R}}_{\infty }\) with respect to \(\mathcal {U^{\prime }}\) is given by
Proof
By direct calculation we have
as \(\langle z_{\mathcal {U^{\prime }}},v^{\prime }\rangle =0\) for all \(v^{\prime }\in \mathcal {V^{\prime }}\). Also
\(\square \)
When we assume \(\bar{x}\) gives a tilt stable local minimum of a function \(f:{\mathbb {R}}^{n}\rightarrow {\mathbb {R}}_{\infty }\) we shall choose the \(\varepsilon >0\) to be consistent with the definition of tilt stability at \(\bar{x}\) for the neighbourhood
where \(\varepsilon \) is reduced to contain the above neighbourhood in a larger ball \(\{x \in {\mathbb {R}}^n \mid \Vert x - \bar{x} \Vert \le \hat{\varepsilon } \}\) on which tilt stability holds. We will rely on the results of [7]. From definition 24 we have on \(B_{\varepsilon }^{\mathcal {U^{\prime }}}\left( \bar{x}_{\mathcal {U^{\prime }}}\right) \oplus B_{\varepsilon }^{\mathcal {V^{\prime } }}\left( \bar{x}_{\mathcal {V^{\prime }}}\right) \) that
where \(m_{f}(\cdot )\) is as defined in (16). That is, we have a supporting tangent plane to the epigraph of \(f +\delta _{B_{\varepsilon }^{\mathcal {U^{\prime }}}\left( \bar{x}_{\mathcal {U^{\prime }}}\right) \oplus B_{\varepsilon }^{\mathcal {V^{\prime } }}\left( \bar{x}_{\mathcal {V^{\prime }}}\right) }\). As the convex hull of any set (including the epigraph of \(f +\delta _{B_{\varepsilon }^{\mathcal {U^{\prime }}}\left( \bar{x}_{\mathcal {U^{\prime }}}\right) \oplus B_{\varepsilon }^{\mathcal {V^{\prime } }}\left( \bar{x}_{\mathcal {V^{\prime }}}\right) }\)) must remain on the same side of any supporting hyperplane (in this case the hyperplane \((x,\alpha ) \mapsto \langle (x,\alpha ) -(m_{f}\left( v\right) , f(m_{f}\left( v\right) ) , (v, -1)\rangle \le 0\)) we may deduce that (again locally)
This observation leads to the following minor rewording of the result from [7]. It shows that there is a strong convexification process involved with tilt stability.
Proposition 28
([7], Proposition 2.6) Consider \(f:{\mathbb {R}}^{n}\rightarrow {\mathbb {R}}_{\infty }\) is a proper lower semi-continuous function and suppose that \(\bar{x}\) give a tilt stable local minimum of f. Then for all sufficiently small \(\varepsilon >0,\) in terms of the function \(h\left( w\right) :=f\left( \bar{x}+w\right) +\delta _{B_{\varepsilon }^{\mathcal {U^{\prime }}}\left( 0\right) \oplus B_{\varepsilon }^{\mathcal {V^{\prime }}}\left( 0\right) }\left( w\right) \) we have
for all z sufficiently close to 0. Consequently 0 is a tilt stable local minimum of \({\text {co}}h\).
We now study the subgradients of the \(\mathcal {U^{\prime }}\)-Lagrangian. In order to simplify statements we introduce the following modified function:
then we have \(m_{h}\left( z\right) +\bar{x}=m_{f}\left( z\right) \) for \(m_{f}\left( z\right) :={\text {argmin}}_{x\in B_{\varepsilon }^{\mathcal {U^{\prime }}}\left( \bar{x}_{\mathcal {U^{\prime }}}\right) \oplus B_{\varepsilon }^{\mathcal {V^{\prime }}}\left( \bar{x}_{\mathcal {V^{\prime }}}\right) }\left[ f\left( x\right) -\langle x,z\rangle \right] \). The next result shows that under the assumption of tilt stability we have \(u:=P_{\mathcal {U^{\prime }}}\left[ m_{h}\left( z_{\mathcal {U^{\prime }}}+\bar{z}_{\mathcal {V^{\prime }}}\right) \right] \) iff
where \(\partial _{{\text {co}}}g\left( u\right) :=\left\{ z\mid g\left( u^{\prime }\right) -g\left( u\right) \ge \langle z,u^{\prime }-u\rangle \text { for all }u^{\prime }\right\} \) corresponds to the subdifferential of convex analysis. In passing we note that tilt stability of f at \(\bar{x}\) implies \(\partial _{{\text {co}}} f (\bar{x}) \ne \emptyset \).
Remark 29
When \(f:{\mathbb {R}}^{n}\rightarrow {\mathbb {R}}_{\infty }\) has a tilt-stable local minimum at \(\bar{x}\) for \(\bar{z}\) sufficiently small we must also have \(g\left( x\right) :=f\left( x\right) -\langle \bar{z},x\rangle \) possessing a tilt stable local minimum at \(\left\{ \bar{x}\right\} =m_{f}\left( \bar{z}\right) \). In this way we may obtain a unique Lipschitz continuous selection
in a neighbourhood of \(z_{\mathcal {U^{\prime }}} \in B_{\varepsilon } (\bar{z}_{\mathcal {U^{\prime }}}) \) (where \(\bar{z}_{\mathcal {U^{\prime }}} \ne 0\)).
Proposition 30
Let \(f:{\mathbb {R}}^{n}\rightarrow {\mathbb {R}}_{\infty }\) be a proper lower semi-continuous function with \(f-\langle \bar{z},\cdot \rangle \) having a tilt-stable local minimum at \(\bar{x}\).
-
1.
Then \(L_{\mathcal {U^{\prime }}}^{\varepsilon }\left( \cdot \right) \) is closed, proper convex function that is finite valued for \(u \in B_{\varepsilon }^{\mathcal {U^{\prime }}}\left( 0\right) \).
-
2.
Let \(u:=P_{\mathcal {U^{\prime }}}\left[ m_{h}\left( z_{\mathcal {U^{\prime }}}+\bar{z}_{\mathcal {V^{\prime }}}\right) \right] \in {\text {int}} B_{\varepsilon }^{\mathcal {U^{\prime }}}\left( 0\right) \) (where \(z_{{\mathcal {U}}^{\prime }}\in {\mathcal {U}}^{\prime }\)) then
$$\begin{aligned} L_{\mathcal {U^{\prime }}}^{\varepsilon }\left( u^{\prime }\right) -L_{\mathcal {U^{\prime }}}^{\varepsilon }\left( u\right) \ge \langle z_{\mathcal {U^{\prime }}},u^{\prime }-u\rangle \quad \text {for }u^{\prime }\in B_{\varepsilon }^{\mathcal {U^{\prime }}}\left( 0\right) . \end{aligned}$$(22)Moreover \(L_{\mathcal {U^{\prime }}}^{\varepsilon }\left( u\right) =\min _{v^{\prime }\in \mathcal {V^{\prime }}}\left[ {\text {co}}h\left( u+v^{\prime }\right) -\langle v^{\prime },\bar{z}_{\mathcal {V^{\prime }} }\rangle \right] \) for which the minimum is attained at \(v\left( u\right) =P_{\mathcal {V^{\prime }}}\left[ m_{h}\left( z_{\mathcal {U^{\prime }}}+\bar{z}_{\mathcal {V^{\prime }}}\right) \right] \) where \(v\left( 0\right) =0\).
-
3.
Conversely suppose (21) holds at any given \(u\in B_{\varepsilon }^{\mathcal {U^{\prime }}}\left( 0\right) \) and let v(u) be as defined in (18). Then we have \(u=P_{\mathcal {U^{\prime }}}\left[ m_{h}\left( z_{\mathcal {U^{\prime }}}+\bar{z}_{\mathcal {V^{\prime }}}\right) \right] \) and \(v\left( u\right) =P_{\mathcal {V^{\prime }}}\left[ m_{h}\left( z_{\mathcal {U^{\prime }}}+\bar{z}_{\mathcal {V^{\prime }}}\right) \right] \in {\text {int}}B_{\varepsilon }^{\mathcal {V^{\prime }}}\left( 0\right) \) for \(\Vert u\Vert \) sufficiently small.
Proof
Consider 1. By Proposition 28 we have
Hence \(L_{\mathcal {U^{\prime }}}^{\varepsilon }\left( u^{\prime }\right) \) is a “marginal mapping” corresponding to a coercive closed convex function \(F(u^{\prime },v^{\prime }):={\text {co}}h\left( u^{\prime }+v^{\prime }\right) -\langle v^{\prime },\bar{z}_{\mathcal {V^{\prime }}}\rangle \). Applying [37, Theorem 9.2] the result follows on viewing \(L_{\mathcal {U^{\prime }}}^{\varepsilon }\left( u^{\prime }\right) \) as the “image of F under the linear mapping A” given by the projection \(u^{\prime } := A (u^{\prime }, v^{\prime }) :=P_{{\mathcal {U}}}(u^{\prime }, v^{\prime })\) onto \({\text {int}} B_{\varepsilon }^{\mathcal {U^{\prime }}}\left( 0\right) \).
For the second part we have \(z=z_{\mathcal {U^{\prime }}}+\bar{z}_{\mathcal {V^{\prime }}}\), where only the \(\mathcal {U^{\prime }}\) component varies. The following minimum attained at the unique point \(m_{h}\left( z_{\mathcal {U^{\prime }}}+\bar{z}_{\mathcal {V^{\prime }}}\right) \) that uniquely determines the value of \(u\in \mathcal {U^{\prime }}\):
where \(u:=P_{\mathcal {U^{\prime }}}\left[ m_{h}\left( z_{\mathcal {U^{\prime }} }+\bar{z}_{\mathcal {V^{\prime }}}\right) \right] \) and \(v\left( u\right) :=P_{\mathcal {V^{\prime }}}\left[ m_{h}\left( z_{\mathcal {U^{\prime }}}+\bar{z}_{\mathcal {V^{\prime }}}\right) \right] \). As \(m_{h}\left( \cdot \right) \) is a single valued Lipschitz function and \({\text {co}}h\) has a local minimum at 0 then \(v\left( 0\right) =0\) because \(\left\{ 0\right\} ={\text {argmin}}_{v^{\prime }\in \mathcal {V^{\prime }}} \left[ {\text {co}}h\left( v^{\prime }\right) -\langle v^{\prime },\bar{z}_{\mathcal {V^{\prime }}}\rangle \right] \). Hence by continuity \(v(u) \in {\text {int}} B_{\varepsilon }^{\mathcal {U^{\prime }}}\) for \(\Vert u\Vert \) sufficiently small. The objective value on this minimization problem equals
giving (22).
For the third part we note that (21) is equivalent to (22) and hence equivalent to the identity (24), which affirms that the minimizer in the \(\mathcal {U^{\prime }}\) space is attained at u and thus the minimizer in the \(\mathcal {V^{\prime }}\) space in the definition of \(L_{\mathcal {U^{\prime }}}^{\varepsilon }\left( u\right) \) is attained at v(u). This in turn can be equivalently written as (23) which affirms that \(u=P_{\mathcal {U^{\prime }}}\left[ m_{h}\left( z_{\mathcal {U^{\prime }}}+\bar{z}_{\mathcal {V^{\prime }}}\right) \right] \) and \(v\left( u\right) =P_{\mathcal {V^{\prime }}}\left[ m_{h}\left( z_{\mathcal {U^{\prime }}}+\bar{z}_{\mathcal {V^{\prime }}}\right) \right] \in B_{\varepsilon }^{\mathcal {V^{\prime }}}\left( 0\right) \). \(\square \)
Remark 31
In principle the knowledge of \(m_f\) and \({\mathcal {U}}\) should allow one to construct the function \(v(\cdot )\). One can perform a rotation of coordinates and a translation of \(\bar{x}\) to zero so that we have then f represented as \(h: {\mathcal {U}} \times {\mathcal {V}} \rightarrow {\mathbb {R}}_{\infty }\) and correspondingly obtain \(m_h\). Now decompose \(m_h (z_{{\mathcal {U}}} + \bar{z}_{{\mathcal {V}}}) = m^h_{{\mathcal {U}}}(z_{{\mathcal {U}}} ) + m^h_{{\mathcal {V}}}(z_{{\mathcal {U}}} )\) (where we have drop the reference to \(\bar{z}_{{\mathcal {V}}}\) as it’s value is fixed). Then eliminate the variable \(z_{{\mathcal {U}}} \) from the system of equations \(u = m^h_{{\mathcal {U}}}(z_{{\mathcal {U}}} ) \) and \(v = m^h_{{\mathcal {V}}}(z_{{\mathcal {U}}} )\) to obtain v(u). This solution is unique under the assumption of a tilt stable local minimum. Indeed one can interpret \(v(\cdot )\) as an implicit function. This point of view has been used by numerous authors [32, Theorem 2.2], [25, Theorem 6.1] and with regard to \(C^2\)-smooth manifolds see [26, Theorem 2.6]. This last result indicates that when f is “partially smooth” with respect to a \(C^2\)-smooth manifold \({\mathcal {M}}\) then the form of \(v( \cdot )\) is accessible via the implicit function theorem. Moreover there is a local description \({\mathcal {M}} = \{ u + v(u) \equiv (u, v(u)) \mid u \in {\mathcal {U}} \cap B_{\varepsilon } (0)\}\). An interesting example of this sort of approach can be found in [24, Theorem 4.3]. Here the exact penalty function of a convex nonlinear optimisation problem is studied where \(\bar{x}\) is chosen to be the minimizer. The function \(v(\cdot )\) is characterised as the solution to a system of equation associated with the active constraints at \(\bar{x}\) for the associated nonlinear programming problem. A similar analysis may be applied to the illustrative example of \(C^2\) smooth function f restricted to a polyhedral set \(P := \{ x \in {\mathbb {R}}^n \mid l_i (x) \le 0 \text { for } i \in I:=\{1,\dots ,m\}\}\), where \(l_i\) are affine functions and \(I(\bar{x}):=\{ i \in I \mid l_i (\bar{x}) = 0 \}\) are the active constraints. Assume \(\{\nabla l_i (\bar{x})\}_{i \in I(\bar{x})}\) are linearly independent. When the optimal solution \(\bar{x} \in {\text {int}} P\) then \({\mathcal {V}} = \{0\}\) and \({\mathcal {U}} = {\mathbb {R}}^n\) giving \({\mathcal {M}} = {\mathbb {R}}^n \times \{0\}\), a smooth manifold. When the active constraints \(I(\bar{x})\) are nonempty then \({\mathcal {V}} = {\text {lin}} \{\nabla l_i (\bar{x})\}_{i \in I(\bar{x})}\}\) and \({\mathcal {U}} = \{d \in {\mathbb {R}}^n \mid \langle \nabla l_i (\bar{x}) ,d \rangle = 0 \text { for } i \in I(\bar{x}) \}\). Then v(u) is the solution (or implicit function) associated with the system of equation \(l_i (\bar{x} + (u,v)) =0\) for \(i \in I(\bar{x})\), in the unknowns \(v \in {\mathcal {V}}\). The implicit function theorem now furnishes existence, uniqueness and differentiability. Given this clear connection to implicit functions it would be interesting to relate these ideas to a more modern theory of implicit functions [6]. \(\square \)
Existence of convex subgradients indicates a hidden convexification.
Lemma 32
Consider \(h:\mathcal {U^{\prime }}\rightarrow {\mathbb {R}}_{\infty }\) is a proper lower semi-continuous function. Then
When \(\partial _{{\text {co}}}h\left( u\right) \ne \emptyset \) then \({\text {co}}h\left( u\right) =h\left( u\right) \) and we have \(\partial _{{\text {co}} }h\left( u\right) =\partial \left[ {\text {co}}h\right] \left( u\right) \subseteq \partial _{p}h\left( u\right) \ne \emptyset .\) If in addition h is differentiable we have \(\nabla h\left( u\right) =\nabla \left( {\text {co}} h\right) \left( u\right) \).
Proof
If \(z_{{\mathcal {U}}^{\prime }}\in \partial _{{\text {co}}}h\left( u\right) \) then
and so for \(u^{\prime }=u\) we have \({\text {co}}h\left( u\right) \ge h\left( u\right) \ge {\text {co}}h\left( u\right) \) giving equality. Thus
Hence \(\partial _{{\text {co}}}h\left( u\right) \subseteq \partial \left[ {\text {co}}h\right] \left( u\right) .\) When \(\partial _{{\text {co}}}h\left( u\right) \ne \emptyset \) then \({\text {co}}h\left( u\right) =h\left( u\right) \) and (26) gives (25) as \(h\left( u^{\prime }\right) \ge {\text {co}}h\left( u^{\prime }\right) \) is always true. In particular (25) implies \(z_{\mathcal {U^{\prime }}}\in \partial _{p}h\left( u\right) \) and when h is actually differentiable at u then\(\ \partial _{{\text {co}}}h\left( u\right) =\partial \left[ {\text {co}}h\right] \left( u\right) \subseteq \partial _{p}h\left( u\right) =\left\{ \nabla h\left( u\right) \right\} .\)\(\square \)
Remark 33
Assume \(g\left( x\right) :=f\left( x\right) -\langle \bar{z} ,x\rangle \) possessing a tilt stable local minimum at \(\left\{ \bar{x} \right\} =m_{f}\left( \bar{z} \right) \) (and hence \(\partial _{{\text {co}}}f\left( \bar{x}\right) ) \ne \emptyset \)). In [24, Theorem 3.3] it is observed that the optimality condition applied to the minimization problem that defines \(L_{\mathcal {U^{\prime }}}^{\varepsilon }\left( u\right) =\inf _{v^{\prime }\in \mathcal {V^{\prime }}}\left\{ {\text {co}}h\left( u+v^{\prime }\right) -\langle \bar{z}_{{\mathcal {V}}^{\prime }},v^{\prime }\rangle \right\} \) (which attains its minimum at v(u)) gives rise to
assuming \((u,v(u)) \in {\text {int}} B_{\varepsilon }^{\mathcal {U^{\prime }}}\left( 0\right) \times {\text {int}} B_{\varepsilon }^{\mathcal {V^{\prime }}}\left( 0\right) \). Applying (5) and Lemma 32 we have
Thus \(\nabla L_{\mathcal {U^{\prime }}}^{\varepsilon }\left( 0\right) =\bar{z}_{{\mathcal {U}}^{\prime }}\) exists (as was first observed in [24, Theorem 3.3] for convex functions). Moreover we also have \(L_{\mathcal {U^{\prime }}}^{\varepsilon }\left( 0\right) =\inf _{v^{\prime }\in \mathcal {V^{\prime }}}\left\{ {\text {co}}h\left( v^{\prime }\right) -\langle \bar{z}_{{\mathcal {V}}^{\prime }},v^{\prime }\rangle \right\} ={\text {co}}h\left( 0\right) =f\left( \bar{x}\right) \) because \(m_{h}\left( \bar{z}_{\mathcal {U^{\prime }}}+\bar{z}_{\mathcal {V^{\prime }}}\right) =\left\{ 0\right\} .\) Furthermore, due to the inherent Lipschitz continuity implied by tilt stability (see Proposition 30) we must have for \(\delta \) sufficiently small \(\partial L_{\mathcal {U^{\prime }}}^{\varepsilon }\left( u\right) \ne \emptyset \) for all \(u\in B_{\delta }^{\mathcal {U^{\prime }}}\left( 0\right) \).
Even without the assumption of tilt stability we have the following.
Proposition 34
Consider \(f:{\mathbb {R}}^{n}\rightarrow {\mathbb {R}}_{\infty }\) is a proper lower semi-continuous function and
Then when \(z_{\mathcal {U^{\prime }}}\in \partial _{{\text {co}}} L_{\mathcal {U^{\prime }}}^{\varepsilon } \left( u\right) \) we have for \(g\left( w\right) :={\text {co}}h\left( w\right) \) that
Proof
As \(z_{\mathcal {U^{\prime }}}\in \partial _{{\text {co}}} L_{\mathcal {U^{\prime }}}^{\varepsilon } \left( u\right) \) we have for any \(u^{\prime }\in B_{\varepsilon }^{\mathcal {U^{\prime }}}\left( 0\right) \) that
Hence for all \(v^{\prime }\in \mathcal {V^{\prime }}\) we have
or for all \(\left( u^{\prime },v^{\prime }\right) \in B_{\varepsilon }^{\mathcal {U^{\prime }}}\left( 0\right) \oplus \mathcal {V^{\prime }}\) (using orthogonality of the spaces), we have
That is \(\left( u,v\left( u\right) \right) \in m_{h}\left( z_{\mathcal {U^{\prime }}}+\bar{z}_{\mathcal {V^{\prime }}}\right) \) and we may now apply [7, Lemmas 2.4–2.5] to obtain the result. \(\square \)
In the following we repeatedly use the fact that when a function has a supporting tangent plane to its epigraph one can take the convex closure of the epigraph and the resultant set will remain entirely to that same side of that tangent hyperplane. This will be true for partial convexifications as convex combinations cannot violate the bounding plane.
Proposition 35
Consider \(f:{\mathbb {R}}^{n}\rightarrow {\mathbb {R}}_{\infty }\) is a proper lower semi-continuous function and \(v\left( u\right) \in {\text {argmin}}_{v^{\prime }\in \mathcal {V^{\prime }}\cap B_{\varepsilon }\left( 0\right) }\left\{ f\left( \bar{x}+u+v^{\prime }\right) -\langle \bar{z}_{\mathcal {V^{\prime }} },v^{\prime }\rangle \right\} .\) Then when \(z_{\mathcal {U^{\prime }}}\in \partial _{{\text {co}}} L_{\mathcal {U^{\prime }} }^{\varepsilon } \left( u\right) \) we have
where \(k_{v}\left( u\right) :=h\left( u+v\left( u\right) \right) -\langle \bar{z}_{\mathcal {V^{\prime }}},u+v\left( u\right) \rangle \) i.e. \(z_{\mathcal {U^{\prime }}}\in \partial _{{\text {co}}}k_{v}\left( u\right) \) and in particular \(\bar{z}_{\mathcal {U^{\prime }}}\in \partial _{{\text {co}} }k_{v}\left( u\right) =\partial {\text {co}} k_{v}\left( u\right) \) and \(k_{v}\left( u\right) ={\text {co}}k_{v}\left( u\right) \). Moreover for \(u\in \mathcal {U^{\prime }}\) we have
Proof
By (29) we have
So \(z_{\mathcal {U^{\prime }}}+\bar{z}_{\mathcal {V^{\prime }}} \in \partial _{{\text {co}}}h\left( u+v\left( u\right) \right) \ne \emptyset \) and by Lemma 32 we have \({\text {co}} h\left( u+v\left( u\right) \right) =h\left( u+v\left( u\right) \right) \). Hence
On placing \(v^{\prime }=v\left( u^{\prime }\right) \) we have \(h\left( u+v\left( u\right) \right) =\left[ {\text {co}}h\right] \left( u+v\left( u\right) \right) \) when \(u^{\prime }=u\) and otherwise
or by orthogonality we have for all \(u^{\prime }\in \mathcal {U^{\prime }}\) that
Hence \(-k_{v}^{*}\left( z_{\mathcal {U^{\prime }}}\right) \ge k_{v}\left( u\right) -\langle z_{\mathcal {U^{\prime }}},u\rangle \) implying \(\langle z_{\mathcal {U^{\prime }} },u\rangle \ge k_{v}\left( u\right) +k_{v}^{*}\left( z_{\mathcal {U^{\prime }} }\right) \). The reverse inequality is supplied by the Fenchel inequality which gives the result \(z_{\mathcal {U^{\prime }}}\in \partial _{{\text {co}}}k_{v}\left( u\right) =\partial {\text {co}}k_{v}\left( u\right) \) and \(k_{v}\left( u\right) ={\text {co}}k_{v}\left( u\right) \) follows from Lemma 32.
Moreover we have from (32) that
and hence (using orthogonality)
On placing \(v^{\prime }=v\left( u^{\prime }\right) \) we have
and \(u^{\prime }=u\) and using the identities \(k_{v}\left( u\right) ={\text {co}} k_{v}\left( u\right) \) and \(\left[ {\text {co}}h\right] \left( u+v\left( u\right) \right) =h\left( u+v\left( u\right) \right) \) for \(u\in \mathcal {U^{\prime }}\) we have (30). \(\square \)
4.1 Subhessians and the localised \({\mathcal {U}}^{\prime }\)-Lagrangian
Now that we have some theory of the localised \({\mathcal {U}}^{\prime }\)-Lagrangian we may study its interaction with the notion of subhessian. As we will be applying these results locally around a tilt stable local minimum we are going to focus on the case when we have \(L_{{\mathcal {U}}}^{\varepsilon }\left( u\right) =\inf _{v\in {\mathcal {V}}}\left\{ {\text {co}}h\left( u+v\right) -\langle \bar{z}_{{\mathcal {V}}},v\rangle \right\} \) and \({\mathcal {U}}^{\prime } = {\mathcal {U}}\). The following is a small variant of [24, Corollary 3.5].
Lemma 36
Suppose \(f:{\mathbb {R}}^{n}\rightarrow {\mathbb {R}}_{\infty } \) is quadratically minorised and is prox-regular at \(\bar{x}\ \) for \(\bar{z}\in \partial f(\bar{x})\) with respect to \(\varepsilon \) and r. Suppose in addition that \(f-\langle \bar{z},\cdot \rangle \) possesses a tilt stable local minimum at \(\bar{x}\), where \(\bar{z}\in {\text {rel}}\)-\({\text {int}}\partial f(\bar{x})\), \({\mathcal {U}}^{\prime } ={\mathcal {U}}\), \(v\left( u\right) \in {\text {argmin}}_{v\in {\mathcal {V}}\cap B_{\varepsilon }\left( 0\right) } \left[ f\left( \bar{x}+u+v\right) -\langle \bar{z}_{{\mathcal {V}}},v\rangle \right] \) and \(L_{{\mathcal {U}}}^{\varepsilon }\left( u\right) =\inf _{v\in {\mathcal {V}}}\left\{ {\text {co}}h\left( u+v\right) -\langle \bar{z}_{{\mathcal {V}} },v\rangle \right\} \). Then we have \(v\left( u\right) =o\left( \left\| u\right\| \right) \) in the following sense:
Proof
As noted in Remark 33 we have \(\nabla L_{{\mathcal {U}}}^{\varepsilon }\left( 0\right) =\bar{z}_{{\mathcal {U}}}\) existing where \(L_{{\mathcal {U}}}^{\varepsilon }\left( u\right) =\inf _{v\in {\mathcal {V}}}\left\{ {\text {co}}h\left( u+v\right) -\langle \bar{z}_{{\mathcal {V}}},v\rangle \right\} \) is a convex function finite locally around \(u=0.\) Consequently we have for \(u\in B^{{\mathcal {U}}}_{\varepsilon }\left( 0\right) \) we have
Invoking (13) in the proof of Lemma 18 we have an \(\varepsilon ^{\prime }>0\) such that for \(u\in B_{\varepsilon ^{\prime }}\left( 0\right) \cap {\mathcal {U}}\)
When we choose \(v\in B_{\min \{\varepsilon ^{\prime },\varepsilon ^{\prime }/r\}}\left( 0\right) \) we have
As \(v(\cdot )\) is Lipschitz continuous with \(v(0)=0\) there exists \(\delta '>0\) such that \(\Vert u \Vert < \delta '\) implies \(v(u) \in B_{\min \{\varepsilon ^{\prime },\varepsilon ^{\prime }/r\}}\left( 0\right) \). Then we have
Hence
and given any \(\varepsilon ^{\prime \prime } >0\) we choose \(\delta >0\) with \(\delta \le \delta '\) such that \(\left\| u\right\| \le \delta \) implies \(\frac{2}{\varepsilon ^{\prime }}[ \frac{o\left( \left\| u\right\| \right) }{\left\| u\right\| }+ \frac{r}{2}\left\| u\right\| ] \le \min \{ \varepsilon ^{\prime \prime } ,\varepsilon ^{\prime }/r,\varepsilon ^{\prime }\} .\)\(\square \)
We may now further justify our definition of “fast track” at \(\bar{x}.\) In [19] and other works “fast tracks” are specified as a subspace on which both \(u\mapsto L_{{\mathcal {U}}}^{\varepsilon }\left( u\right) \) and \(u\mapsto v\left( u\right) \) are twice continuously differentiable. In particular \(L_{{\mathcal {U}}}^{\varepsilon }\left( \cdot \right) \) admits a Taylor expansion.
Proposition 37
Suppose \(f:{\mathbb {R}}^{n}\rightarrow {\mathbb {R}}_{\infty }\) is quadratically minorised and is prox-regular at \(\bar{x}\ \) for \(\bar{z}\in \partial f(\bar{x})\) with respect to \(\varepsilon \) and r. Suppose in addition that \(f-\langle \bar{z},\cdot \rangle \) possesses a tilt stable local minimum at \(\bar{x}\), \(\bar{z}+B_{\varepsilon }\left( 0\right) \cap {\mathcal {V}}\subseteq \partial f(\bar{x})\), \({\mathcal {U}}^{\prime }={\mathcal {U}}\), \(L_{{\mathcal {U}}}^{\varepsilon }\left( u\right) =\inf _{v\in {\mathcal {V}}}\left\{ {\text {co}}h\left( u+v\right) -\langle \bar{z}_{{\mathcal {V}}},v\rangle \right\} \) and \(\left\{ v\left( u\right) \right\} = {\text {argmin}}_{v\in {\mathcal {V}}\cap B_{\varepsilon }\left( 0\right) }\left[ f\left( \bar{x}+u+v\right) -\langle \bar{z}_{{\mathcal {V}}},v\rangle \right] \) for \(u \in B^{{\mathcal {U}}}_{\varepsilon } (0)\). Then for \(\bar{z}_{{\mathcal {U}}}=\nabla L_{{\mathcal {U}}}^{\varepsilon }\left( 0\right) \) we have
Proof
Consider the second order quotient
Apply Lemma 36, for an arbitrary \(\delta >0\) by taking \(\varepsilon ^{\prime \prime } = \frac{\delta }{\Vert u\Vert }\) and obtaining the existence of \(\gamma >0\) such that for \(\Vert t u\Vert \le t[ \delta + \Vert w\Vert ] \le \gamma \) we have \(\Vert v (t u ) \Vert \le \varepsilon ^{\prime \prime } \Vert t u \Vert \). This implies for all \(\delta > 0\) that when \(t[ \delta + \Vert w\Vert ] \le \gamma \) and \(u \in B_{\delta } (w )\) we have \(\frac{\Vert v\left( tu\right) \Vert }{t} \le \delta \) and so \(\frac{v\left( tu\right) }{t}\in B_{\delta }\left( 0\right) \). Thus for \(t<\eta := \frac{\gamma }{\delta + \Vert w\Vert }\) we have
and so
Taking the infimum over \(\delta >0\) gives \(\left( L_{{\mathcal {U}}}^{\varepsilon }\right) _{\_}^{\prime \prime }\left( 0,\bar{z}_{{\mathcal {U}}},w\right) \ge f_{\_}^{\prime \prime }\left( \bar{x},\left( \bar{z}_{{\mathcal {U}}},\bar{z}_{{\mathcal {V}}}\right) , w\right) =\left( {\text {co}} h\right) _{\_}^{\prime \prime }\left( 0+v\left( 0\right) , \left( \bar{z}_{{\mathcal {U}}},\bar{z}_{{\mathcal {V}}}\right) , w\right) \) (because \(f\left( \bar{x}+\cdot \right) \) and \({\text {co}}h\left( \cdot \right) \) agree locally). Conversely consider
and on taking a limit infimum as \(t\downarrow 0\) and then an infimum over \(\delta >0\) gives \(\left( {\text {co}}h\right) _{\_}^{\prime \prime }\left( 0+v\left( 0\right) ,\left( \bar{z}_{{\mathcal {U}}},\bar{z}_{{\mathcal {V}}}\right) ,w\right) \ge \left( L_{{\mathcal {U}}}^{\varepsilon }\right) _{\_}^{\prime \prime }\left( 0,\bar{z}_{{\mathcal {U}}},w\right) \) and thus equality. \(\square \)
Denote \(\partial _{\mathcal {U^{\prime }}}^{2,-}\left( {\text {co}}h\right) \left( x,z\right) =P_{\mathcal {U^{\prime }}}^{T}\partial ^{2,-}\left( {\text {co}}h\right) \left( x,z\right) P_{\mathcal {U^{\prime }}}\).
Corollary 38
Posit the assumption of Proposition 37. Then we have
and \(\partial _{{\mathcal {U}}}^{2,-}\left( {\text {co}}h\right) \left( \bar{x}, \bar{z}\right) \subseteq \partial ^{2,-}L_{{\mathcal {U}}}^{\varepsilon }\left( 0,\bar{z}_{{\mathcal {U}}}\right) \).
Proof
As \({\text {dom}}\left( {\text {co}}h\right) \subseteq {\mathcal {U}}\) we have \({\text {dom}}\left( L_{{\mathcal {U}}}^{\varepsilon }\right) _{\_}^{\prime \prime }\left( 0,\bar{z}_{{\mathcal {U}}},\cdot \right) \subseteq {\mathcal {U}}\) and hence by Proposition 37 we have \({\text {dom}}\left( L_{{\mathcal {U}}}^{\varepsilon }\right) _{\_}^{\prime \prime }\left( 0,\bar{z}_{{\mathcal {U}}},\cdot \right) \ \subseteq {\text {dom}}f_{\_}^{\prime \prime }\left( \bar{x},\left( \bar{z}_{{\mathcal {U}}},\bar{z}_{{\mathcal {V}}}\right) ,\cdot \right) ={\text {dom}}\left( {\text {co}}h\right) _{\_}^{\prime \prime }\left( 0,\left( \bar{z}_{{\mathcal {U}}},\bar{z}_{{\mathcal {V}}}\right) ,\cdot \right) \). Now take \(Q\in \partial ^{2,-}\left( {\text {co}}h\right) \left( \bar{x},\bar{z}\right) \) and so we have \(\langle Qu,u\rangle \le \left( {\text {co}}h\right) _{\_}^{\prime \prime }\left( 0,\left( \bar{z}_{{\mathcal {U}}},\bar{z}_{{\mathcal {V}}}\right) ,u\right) \) for all \(u\in {\text {dom}}\left( {\text {co}}h\right) _{\_}^{\prime \prime }\left( 0,\left( \bar{z}_{{\mathcal {U}}},\bar{z}_{{\mathcal {V}}}\right) ,\cdot \right) \). Hence for all \(u\in {\text {dom}}\left( L_{{\mathcal {U}}}^{\varepsilon }\right) _{\_}^{\prime \prime }\left( 0,\bar{z}_{{\mathcal {U}}},\cdot \right) \subseteq {\text {dom}}\left( {\text {co}}h\right) _{\_}^{\prime \prime }\left( 0,\left( \bar{z}_{{\mathcal {U}}},\bar{z}_{{\mathcal {V}}}\right) ,\cdot \right) \cap {\mathcal {U}}\) we have
and so \(P_{{\mathcal {U}}}^{T}QP_{{\mathcal {U}}}\in \partial ^{2,-}L_{{\mathcal {U}}}^{\varepsilon }\left( 0,\bar{z}_{{\mathcal {U}}}\right) \). That is \(\partial _{{\mathcal {U}}}^{2,-}\left( {\text {co}}h\right) \left( \bar{x},\bar{z}\right) =P_{{\mathcal {U}}}^{T}\partial ^{2,-}\left( {\text {co}}h\right) \left( \bar{x},\bar{z}\right) P_{{\mathcal {U}}}\subseteq \partial ^{2,-}L_{{\mathcal {U}}}^{\varepsilon }\left( 0,\bar{z}_{{\mathcal {U}}}\right) .\)\(\square \)
If we assume more (which is very similar to the “Partial Smoothness” of [26]) we obtain the following which can be viewed as a less stringent version of the second order expansions studied in [24, Theorem 3.9], [30, Equation (7)] and [32, Theorem 2.6]. This result suggests that the role of assumptions like that of Proposition 5 part 3, which are also a consequence of the definition of partial smoothness (via the continuity of the \(w\mapsto \partial f(w)\) at x relative to \({\mathcal {M}}\)) could be to build a bridge to the identity \({\mathcal {U}}={\mathcal {U}}^{2}\) (see the discussion in Remark 40 below).
Corollary 39
Posit the assumption of Proposition 37 and assume the assumption of Proposition 5 part 3 i.e. suppose we have \(\varepsilon >0\) such that for all \(z_{{\mathcal {V}}}\in B_{\varepsilon }\left( \bar{z}_{{\mathcal {V}}}\right) \cap {\mathcal {V}}\subseteq \partial _{{\mathcal {V}}}f\left( \bar{x}\right) \) there is a common
for all \(u\in B^{{\mathcal {U}}}_{\varepsilon }\left( 0\right) \). In addition suppose there exists \(\varepsilon >0\) such that for all \(u\in B^{{\mathcal {U}}}_{\varepsilon }\left( 0\right) \) we have \(u \mapsto \nabla L_{{\mathcal {U}}}^{\varepsilon }\left( u\right) :=z_{{\mathcal {U}}}(u)\) existing, and is a continuous function. Then
Proof
All the assumption of Proposition 37 are local in nature except for the assumption that \(\bar{z}+B_{\varepsilon }\left( 0\right) \cap {\mathcal {V}}\subseteq \partial f(\bar{x}).\) Discounting this assumption for now we note that we can perturb \(\bar{x}\) (to \(\bar{x}+u+v\left( u\right) \)) and \(\bar{z}\) (to \(\left( z_{{\mathcal {U}}},\bar{z}_{{\mathcal {V}}}\right) \in \partial \left( {\text {co}}h\right) \left( u+v\left( u\right) \right) \)) within a sufficiently small neighbourhood \(u\in B^{{\mathcal {U}}}_{\varepsilon }\left( 0\right) \) and still have the assumption of prox-regularity, tilt stability (around a different minimizer of our tilted function) and still use the same selection function \(v\left( \cdot \right) \). Regarding this outstanding assumption the optimality conditions associated with (33) imply \(\bar{z}_{{\mathcal {V}}} +B_{\varepsilon }\left( 0\right) \cap {\mathcal {V}}\subseteq \partial _{{\mathcal {V}}} f(\bar{x}+u+v\left( u\right) )\). As we have \(\nabla L_{{\mathcal {U}}}^{\varepsilon }\left( u\right) =z_{{\mathcal {U}}}(u)\) existing from (27) that \( \partial {\text {co}}h ( u+v\left( u\right) ) - ({z}_{{\mathcal {U}}}(u) , \bar{z}_{{\mathcal {V}}}) \subseteq \{0\} \oplus {\mathcal {V}}\) and so \( \partial {\text {co}}h ( u+v\left( u\right) ) = \left\{ {z}_{{\mathcal {U}}}(u)\right\} \oplus \partial _{{\mathcal {V}}}{\text {co}}h( u+v\left( u\right) ) = \left\{ {z}_{{\mathcal {U}}} (u)\right\} \oplus \partial _{{\mathcal {V}}}f\left( \bar{x} + u+v\left( u\right) \right) . \) Hence
This furnishes the final assumption that is required to invoke Proposition 37 at points near to \(\left( \bar{x},\bar{z}\right) \). \(\square \)
Remark 40
We omit the details here, as they are not central to this paper, but one may invoke [13, Corollary 3.3] to deduce from Corollary 39 that
It follows that \({\mathcal {U}}\supseteq {\text {dom}}q\left( \underline{\partial } ^{2}L_{{\mathcal {U}}}^{\varepsilon }\left( 0,\bar{z}_{{\mathcal {U}}}\right) \right) \left( \cdot \right) \ ={\text {dom}}q(\underline{\partial }^{2}\left( {\text {co}}h|_{{\mathcal {U}}}\right) (0,\bar{z}))\left( \cdot \right) \cap {\mathcal {U}}={\mathcal {U}}^{2}\cap {\mathcal {U}}\). Then the imposition of the equality \({\mathcal {U}}={\mathcal {U}}^{2}\mathcal {\ }\)(our definition of a fast track) implies \(\underline{\partial }^{2}L_{{\mathcal {U}}}^{\varepsilon }\left( 0,\bar{z}_{{\mathcal {U}}}\right) =\underline{\partial }^{2}\left( {\text {co}}h|_{{\mathcal {U}}}\right) (0,\bar{z})=\underline{\partial }^{2}\left( f|_{{\mathcal {U}}}\right) (\bar{x},\bar{z}).\) It would be enlightening to have a result that establishes this identity without a-priori assuming \({\mathcal {U}}={\mathcal {U}}^{2}\) but having this as a consequence.
Recall that for a convex function \(\left( {\text {co}}h\right) ^{*}\) its subjet is nonempty at every point at which it is subdifferentiable. Moreover \(\left( {\text {co}}h\right) ^{*}\) is finite on \({\text {co}} ({\text {dom}} f - \overline{x})\). We will study its restriction \(\left( {\text {co}}h\right) ^{*}|_{\mathcal {U^{\prime }}}\) to \({\mathcal {U}}^{\prime } \equiv ({\mathcal {U}}^{\prime })^{*}\) (or more precisely to \({\mathcal {U}}^{\prime } \simeq ( {{\mathbb {R}}^{n}}\slash {{\mathcal {V}}^{\prime }})^{*}\), where the later is a “canonical isomorphism of two linear spaces”, one being the dual of a factorisation of the whole space with a subspace \({\mathcal {V}}^{\prime } \) (see, pages 2–5 of [22])). The following result will be required later in the paper.
Lemma 41
Suppose \(f:{\mathbb {R}}^{n}\rightarrow {\mathbb {R}}_{\infty }\) is a proper lower semi-continuous function possessing a tilt stable local minimum at \(\bar{x}\). Suppose in addition that \(\bar{z}=0\in {\text {rel}}\)-\({\text {int}}\partial f\left( \bar{x}\right) \), \(z_{\mathcal {U^{\prime }}} \in \mathcal {U^{\prime }\subseteq U}\),
and \(u:=P_{\mathcal {U^{\prime }}}\left[ m_f \left( z_{\mathcal {U^{\prime }}}+ \bar{z}_{\mathcal {V^{\prime }}}\right) \right] .\)
-
1.
Then \(z_{\mathcal {U^{\prime }}}+0_{{\mathcal {V}}}\in \partial \left( {\text {co}}h\right) \left( u+v\left( u\right) \right) =\partial _{{\text {co}}}h\left( u+v\left( u\right) \right) \) and consequently \(z_{\mathcal {U^{\prime }}}\in \partial L_{\mathcal {U^{\prime }}}^{\varepsilon }\left( u\right) =\partial k_{v}\left( u\right) .\)
-
2.
Suppose \(\left( \left( u+v\left( u\right) \right) ,Q\right) \in \partial ^{2,-}\left( {\text {co}}h\right) ^{*}\left( z_{\mathcal {U^{\prime }}}+0_{\mathcal {V^{\prime }}}\right) \) then \(P_{\mathcal {U^{\prime }}}^{T}QP_{\mathcal {U^{\prime }}}\in \partial ^{2,-}\left( {\text {co}}h\right) ^{*}|_{\mathcal {U^{\prime }}}\left( z_{\mathcal {U^{\prime }}}\right) =\partial ^{2,-}k_{v}^{*}\left( z_{\mathcal {U^{\prime }}}\right) .\) Consequently when \(\nabla ^{2}k_{v}^{*}\left( z_{\mathcal {U^{\prime }} }\right) \) exists
$$\begin{aligned}&\partial _{\mathcal {U^{\prime }}}^{2,-}\left( {\text {co}}h\right) ^{*}\left( u+v\left( u\right) ,z_{\mathcal {U^{\prime }}}+0_{\mathcal {V^{\prime } }}\right) \nonumber \\&\quad :=P_{\mathcal {U^{\prime }}}^{T}\partial ^{2,-}\left( {\text {co}} h\right) ^{*}\left( z_{\mathcal {U^{\prime }}}+0_{\mathcal {V^{\prime }} }\right) P_{\mathcal {U^{\prime }}}=\nabla ^{2}k_{v}^{*}\left( z_{\mathcal {U^{\prime }}}\right) -\mathcal {P}({\mathcal {U}}^{\prime }). \end{aligned}$$(34)
Proof
Note that as \(\bar{z}=0\) by (19) we have \(k_{v}^{*}\left( z_{\mathcal {U^{\prime }}}\right) =h^{*}\left( z_{\mathcal {U^{\prime }}}+0_{\mathcal {V^{\prime }}}\right) =({\text {co}}h)^{*}\left( z_{\mathcal {U^{\prime }}}+0_{\mathcal {V^{\prime }}}\right) \). Invoking Proposition 35 we have \(k_{v}\left( u\right) :=h\left( u+v\left( u\right) \right) ={\text {co}}h\left( u+v\left( u\right) \right) \), and by Lemma 27 we then have
and hence \(z_{\mathcal {U^{\prime }}}+0_{\mathcal {V^{\prime }}}\in \partial ({\text {co}} h)\left( u+v\left( u\right) \right) =\partial _{{\text {co}}}h\left( u+v\left( u\right) \right) \). Taking into account Remark 33 we have for all \(u\in \mathcal {U^{\prime }\cap B}_{\varepsilon }\left( 0\right) \) that
For the second part we have from the definition of \(\left( \left( u+v\left( u\right) \right) ,Q\right) \in \partial ^{2,-}\left( {\text {co}}h\right) ^{*}\left( z_{\mathcal {U^{\prime }}}+0_{\mathcal {V^{\prime }}}\right) \) locally around \(z_{\mathcal {U^{\prime }}}+0_{\mathcal {V^{\prime }}}\) that
Restricting to \(\mathcal {U^{\prime }}\) we have the following locally around \(z_{\mathcal {U^{\prime }}}\)
and by (19) we have \(k_{v}^{*}\left( z_{\mathcal {U^{\prime }} }\right) =\left( {\text {co}}h\right) ^{*}\left( z_{\mathcal {U^{\prime }} }+0_{\mathcal {V^{\prime }}}\right) =\left( {\text {co}}h\right) ^{*}|_{\mathcal {U^{\prime }}}\left( z_{\mathcal {U^{\prime }}}\right) .\) Hence
When \(\nabla ^{2}k_{v}^{*}\left( z_{\mathcal {U^{\prime }}}\right) \) exists we have \(\partial ^{2,-}k_{v}^{*}\left( z_{\mathcal {U^{\prime }}}\right) =\nabla ^{2}k_{v}^{*}\left( z_{\mathcal {U^{\prime }}}\right) -\mathcal {P}({\mathcal {U}})\) giving (34). \(\square \)
In the following we shall at times use the alternate notation \(z_{\mathcal {U^{\prime }}}+z_{\mathcal {V^{\prime }}}=(z_{\mathcal {U^{\prime }}},z_{\mathcal {V^{\prime }}})\) to contain the notational burden of the former. The proof of the next proposition follows a similar line of argument as in [34, Proposition 3.1].
Proposition 42
Suppose \(f:{\mathbb {R}}^{n}\rightarrow {\mathbb {R}}_{\infty }\) is a proper lower semi-continuous function and suppose that \(\bar{x}\) is a tilt stable local minimum of f. In addition suppose \(\bar{z}=0\in {\text {rel}}\)-\({\text {int}}\partial f\left( \bar{x}\right) \), \(\mathcal {U^{\prime }\subseteq U}\) and \(v\left( u\right) \in {\text {argmin}}_{v^{\prime }\in \mathcal {V^{\prime }}\cap B_{\varepsilon }\left( 0\right) }f\left( \bar{x}+u+v^{\prime }\right) \) with \(u:=P_{\mathcal {U^{\prime }}}\left[ m\left( z_{\mathcal {U^{\prime }}}+\bar{z}_{\mathcal {V^{\prime }}}\right) \right] \) or \(z_{\mathcal {U^{\prime }}}\in \partial L_{\mathcal {U^{\prime }}}^{\varepsilon }\left( u\right) =\partial k_{v}\left( u\right) \). Suppose \(k_{v}^{*}:\mathcal {U^{\prime }}\rightarrow {\mathbb {R}}_{\infty }\) is a \(C^{1,1}\left( B_{\varepsilon }\left( 0\right) \right) \) function for some \(\varepsilon >0\) with \(\nabla ^{2}k_{v}^{*}\left( z_{\mathcal {U^{\prime }}}\right) \) existing as a positive definite form. Then for \(u:=\nabla k_{v}^{*}\left( z_{\mathcal {U^{\prime }}}\right) \) we have
Proof
When \(u:=P_{\mathcal {U^{\prime }}}\left[ m\left( z_{\mathcal {U^{\prime }}}+ \bar{z}_{\mathcal {V^{\prime }}}\right) \right] \) or \(z_{\mathcal {U^{\prime }} }\in \partial L_{\mathcal {U^{\prime }}}^{\varepsilon }\left( u\right) =\partial k_{v}\left( u\right) \) (see Proposition 30) we have \(z_{\mathcal {U^{\prime }}}+0_{{\mathcal {V}}^{\prime }}\in \partial \left( {\text {co}}h\right) \left( u+v\left( u\right) \right) \). We show that \(\left( \begin{array}{cc} Q^{-1} &{} 0 \\ 0 &{} 0 \end{array} \right) \) is a subhessian of \({\text {co}}h\) at (u, v(u)) for \(z_{\mathcal {U^{\prime }}}+0_{\mathcal {V^{\prime }}}\in \partial ({\text {co}}h)\left( u+v\left( u\right) \right) \) and hence deduce that (by definition) \(Q^{-1}\in \partial _{\mathcal {U^{\prime }}}^{2,-}\left( {\text {co}}h\right) \left( u+v\left( u\right) ,z_{\mathcal {U^{\prime }}}+0_{\mathcal {V^{\prime }}}\right) \). Expanding \(k_{v}^{*}\) via a second order Taylor expansion around \(z_{\mathcal {U^{\prime }}}\) we have for all \(y_{\mathcal {U^{\prime }}}-z_{\mathcal {U^{\prime }}}\in B_{\varepsilon }\left( 0\right) \) a function \(\delta \left( \varepsilon \right) \rightarrow 0\) as \(\varepsilon \rightarrow 0\) with
Then as \({\text {co}}h\left( u+v\left( u\right) \right) =\langle z_{\mathcal {U^{\prime }}},u\rangle -\left( {\text {co}}h\right) ^{*}\left( \left( z_{\mathcal {U^{\prime }}},0_{\mathcal {V^{\prime }}}\right) \right) \) and \(\langle Q\left( y_{\mathcal {U^{\prime }}}-z_{\mathcal {U^{\prime }}}\right) ,\left( y_{\mathcal {U^{\prime }}}-z_{\mathcal {U^{\prime }}}\right) \rangle \ge \alpha \left\| y_{\mathcal {U^{\prime }}}-z_{\mathcal {U^{\prime }} }\right\| ^{2}\) we have
and when \(u^{\prime }-u\in \left( 1+\alpha ^{-1}\delta \left( \varepsilon \right) \right) \varepsilon \left\| Q^{-1}\right\| ^{-1}B_{1}\left( 0\right) \) we have the supremum attained at
Hence when \(\left( u^{\prime },v^{\prime }\right) \in B_{\gamma \left( \varepsilon \right) }\left( 0\right) \), for \(\gamma \left( \varepsilon \right) :=\left( 1+\alpha ^{-1}\delta \left( \varepsilon \right) \right) \varepsilon \left\| Q^{-1}\right\| ^{-1}\), we have
where \(\beta \left( \varepsilon \right) =\left[ 1-\left( 1+\alpha ^{-1}\delta \left( \varepsilon \right) \right) ^{-1}\right] \left\| Q^{-1}\right\| \rightarrow 0\) as \(\varepsilon \rightarrow 0\). That is
from which the result follows. \(\square \)
5 The main result
The main tools we use to establish our results are the convexification that tilt stable local minimum enable us to utilise [7], the correspondence between tilt stability and the strong metric regularity of the locally restricted inverse of the subdifferential and the connection conjugacy has to inversion of subdifferentials of convex functions [1, 7]. These tools and the coderivative characterisation (17) of tilt stability (being applicable to convex functions) allows a chain of implications to be forged. The differentiability properties we seek may be deduced via strong metric regularity or alternatively via the results of [2] after invoking the Mordukhovich coderivative criteria for the Aubin property for the associated subdifferential.
Once again we will consider subspaces \(\mathcal {U^{\prime }} \subseteq {\mathcal {U}}\). We now show that tilt stability is inherited by \(k_{v}\).
Proposition 43
Consider \(f:{\mathbb {R}}^{n}\rightarrow {\mathbb {R}}_{\infty } \) is a proper lower semi-continuous function and \(v\left( u\right) \in {\text {argmin}}_{v^{\prime }\in {\mathcal {V}}\cap B_{\varepsilon }\left( 0\right) }f\left( \bar{x}+u+v^{\prime }\right) .\) Suppose that f has a tilt stable local minimum at \(\bar{x}\) for \(0\in \partial f\left( \bar{x}\right) \) then \(v\left( \cdot \right) :\mathcal {U^{\prime }}\rightarrow \mathcal {V^{\prime }}\) is uniquely defined and the associated function \(k_{v}\left( \cdot \right) : \mathcal {U^{\prime }} \rightarrow {\mathbb {R}}_{\infty }\) has a tilt stable local minimum at 0.
Proof
In this case we have \(\left( \bar{z}_{{\mathcal {U}}^{\prime }}, \bar{z}_{\mathcal {V^{\prime }} }\right) =\left( 0,0\right) \). By tilt stability we have \(m\left( \cdot \right) \) a single valued Lipschitz functions and hence \(v\left( \cdot \right) \) is unique. From Proposition 30 and \(\left\{ u\right\} =P_{\mathcal {U^{\prime }}}\left[ m\left( z_{\mathcal {U^{\prime }}}\right) \right] \) we have \(z_{\mathcal {U^{\prime }}}\in \partial _{{\text {co}}}\left[ L_{\mathcal {U^{\prime }}}^{\varepsilon }+\delta _{B_{\varepsilon }^{\mathcal {U^{\prime }}}\left( 0\right) }\right] \left( u\right) \) and from Propositions 34 and 28 that
implying \(\left\{ u\right\} =P_{\mathcal {U^{\prime }}}\left[ m\left( z_{\mathcal {U^{\prime }} }+0\right) \right] \subseteq {\text {argmin}}_{u^{\prime }\in \mathcal {U^{\prime }}}\left\{ k_{v}\left( u^{\prime }\right) -\langle z_{\mathcal {U^{\prime }}},u^{\prime }\rangle \right\} =\left\{ u\right\} \). Hence
is clearly a single valued, locally Lipschitz function of \(z_{ \mathcal {U^{\prime }} }\in B_{\varepsilon }^{\mathcal {U^{\prime }}}\left( 0\right) \subseteq \mathcal {U^{\prime }}\). \(\square \)
Remark 44
Clearly Proposition 43 implies \(k_{v}\left( \cdot \right) : {\mathcal {U}}^{2}\rightarrow {\mathbb {R}}_{\infty }\) has a tilt stable local minimum at 0 relative to \({\mathcal {U}}^{2}\subseteq {\mathcal {U}}\).
The following will help connect the positive definiteness of the densely defined Hessians of the convexification h with the associated uniform local strong convexity of f. This earlier results [11, Theorem 24, Corollary 39] may be compared with Theorem 3.3 of [7] in that it links “stable strong local minimizers of f at \(\bar{x}\)” to tilt stability. We say \(f_{z}:=f-\langle z,\cdot \rangle \) has a strict local minimum order two at \(x^{\prime }\) relative to \(B_{\delta }(\bar{x})\ni x^{\prime }\) when \(f_{z}(x)\ge f_{z}\left( x^{\prime }\right) +\beta \left\| x-x^{\prime }\right\| ^{2}\) for all \(x\in B_{\delta }(\bar{x})\). It is a classical fact that this is characterised by the condition \(\left( f_{z}\right) _{\_}^{\prime \prime }\left( x,0,h\right) >0\) for all \(\left\| h\right\| =1,\) see [39, Theorem 2.2].
The following result gives conditions on f, in finite dimensions, such that the coderivative in the second order sufficiency condition (17) is uniformly bounded away from zero by a constant \(\beta >0\). Then indeed (17) is equivalent to this strengthened condition. This follows from a uniform bound on the associated quadratic minorant associated with the strong stable local minimum. This phenomena was also observed in [3, Theorem 5.36] in the case of infinite dimensions for a class of optimisation problems. As we already know this is true for \(C^{1,1}\) functions (see Corollary 25) and as we know that application of the infimal convolution to prox-regular functions produces a \(C^{1,1}\) function, there is a clear path to connect these results. Indeed this is the approach used in [10, 11].
Theorem 45
([11], Theorem 34 part 1.) Suppose \(f:{\mathbb {R}}^{n}\rightarrow {\mathbb {R}}_{\infty }\) is lower-semicontinuous, prox-bounded (i.e. minorised by a quadratic) and \(0\in \partial _{p}f(\bar{x})\).
Suppose in addition there exists \(\delta >0\) and \(\beta >0\) such that for all \((x,z)\in B_{\delta }(\bar{x},0)\cap {\text {Graph}}\,\partial _{p}f\) the function \(f-\langle z,\cdot \rangle \) has a strict local minimum order two at x in the sense that there exists \(\gamma >0\) (depending on x, y) such that for each \(x^{\prime }\in {B}_{\gamma }(x)\) we have
Then we have for all \(\Vert w\Vert =1\) and \(0\ne p\in {D}^{*}(\partial _{p}f)(\bar{x},0)(w)\) that \(\langle w,p\rangle \ge \beta >0\).
The following may also be found in [16, Theorem 3.5–3.6].
Corollary 46
Suppose \(f:{\mathbb {R}} ^{n}\rightarrow \overline{{\mathbb {R}}}\) a is lower semi-continuous, prox-bounded and f is both prox-regular at \(\bar{x}\) with respect to \(0\in \partial _{p}f(\bar{x})\) and subdifferentially continuous there. Then the following are equivalent:
-
1.
For all \(\Vert w\Vert =1\) and \(p\in {D}^{*}(\partial _{p}f)(\bar{x},0)(w)\) we have \(\langle w,p\rangle >0\).
-
2.
There exists \(\beta >0\) such that for all \(\Vert w\Vert =1\) and \(p\in {D}^{*}(\partial _{p}f)(\bar{x},0)(w)\) we have \(\langle w,p\rangle \ge \beta >0\).
Proof
We only need show 1 implies 2. By [36, Theorem 1.3] we have 1 implying a tilt stable local minimum at \(\bar{x}\). Now apply [7, Theorem 3.3] to deduce the existence of a \(\delta >0\) such that for all \((x,z)\in B_{\delta }(\bar{x},0)\cap {\text {Graph}} \,\partial _{p}f\) we have x a strict local minimizer order two of the function \(f-\langle z,\cdot \rangle \) in the sense that (35) holds for some uniform value \(\beta >0\) for all \(x^{\prime } \in B_{\gamma } (x)\). Now apply Theorem 45 to obtain 2. \(\square \)
Another condition equivalent to all of those in [36, Theorem 1.3] is the following
which is motivated by the classical observation that \(f^{\prime \prime }\left( x,z,u\right) >0\) implies \(f-\langle z,\cdot \rangle \) has a strict local minimum order 2 at x (see [39, Theorem 2.2]). We will show that a stronger version gives an equivalent characterisation in Corollary 25 below. The following construction is also standard. Denote
where \(\hat{N}_{{\text {Graph}}\,\partial _{p}f}(x,z)=\left( \limsup _{t\downarrow 0}\frac{{\text {Graph}}\,\partial _{p}f-(x,p)}{t}\right) ^{\circ }\) is the contingent normal cone. Then we have \(D^{*}\left( \partial _{p}f\right) (\bar{x},0)(w)=g\)-\(\limsup _{\left( x,z\right) \rightarrow _{S_{p}\left( f\right) }\left( \bar{x},0\right) }\hat{D}^{*}\left( \partial _{p}f\right) (x,z)(w)\) (the graphical limit supremum [38, page 327]).
Corollary 47
Suppose \(f:{\mathbb {R}}^{n}\rightarrow {{\mathbb {R}}_{\infty }}\) a is lower semi-continuous, prox-bounded and f is both prox-regular at \(\bar{x}\) with respect to \(0\in \partial _{p}f(\bar{x})\) and subdifferentially continuous there. Then the following are equivalent:
-
1.
For all \(\Vert w\Vert =1\) and \(p\in {D}^{*}(\partial _{p}f)(\bar{x},0)(w)\) we have \(\langle w,p\rangle >0\).
-
2.
There exists \(\beta >0\) such that for all \(\Vert w\Vert =1\) we have \(f_{s}^{\prime \prime }\left( x,z,w\right) \ge \beta >0\) for all \(\left( x,z\right) \in B_{\delta }(\bar{x},0)\cap {\text {Graph}}\,\partial _{p}f\), for some \(\delta >0\).
Moreover the \(\beta \) in part 2 may be taken as that in Corollary 46 part 2.
Proof
(1.\(\implies \) 2.) By Corollary 46 we have 1 equivalent to condition 2 of Corollary 46 (for some fixed \(\beta >0\)). Now define \(G := f - \frac{\beta '}{2}\Vert \cdot - \bar{x} \Vert ^2\) (for \(0<\beta ' < \beta \)). Apply the sum rule for the limiting subgradient and that for the coderivatives [38, Theorem 10.41] to deduce that \(0 \in \partial G (\bar{x}) = \partial f(\bar{x})-\beta ' \times 0\) and also \({D}^{*} (\partial G)(\bar{x},0) (w) \subseteq {D}^{*} (\partial f )(\bar{x},0) (w) -\beta ' w\). Then for any \(v \in {D}^{*} (\partial G)(\bar{x},0) (w)\) we have \(\langle v, w \rangle = \langle p , w \rangle -\beta ' \Vert w \Vert ^2 >0\). Now apply Theorem 3.3 of [7] to deduce there exists a strict local minimum order two for \(G_z := G - \langle z , \cdot \rangle \) at each \(\left( x,z\right) \in B_{\delta }(\bar{x},0)\cap {\text {Graph}}\,\partial G\) for some \(\delta >0\). Noting that \(\partial G\) and \(\partial _p G\) locally coincide around \((\bar{x}, 0)\), after possibly reducing \(\delta >0\), we apply (see [39, Theorem 2.2]) to deduce that \((G_z)^{\prime \prime } (x,0,w) = f^{\prime \prime } (x,z,w) - \beta ' \Vert w\Vert ^2 > 0\) for all \(\beta '< \beta \). This implies 2.
(2.\(\implies \) 1.) Let \( \left( x,z\right) \in B_{\delta }(\bar{x},0)\cap {\text {Graph}}\,\partial _{p}f\). We use the fact that \(f_{s}^{\prime \prime }\left( x,z,w\right)> \beta ' >0\) for all \(0< \beta ' < \beta \) and \(\Vert w\Vert =1\) implies
has x as a strict local minimum order 2 at x. We may now apply [10, Theorem 67] to deduce that for all \(y\in {\hat{D}}^{*}(\partial _{p}f_{z})(x,0)(w)\) we have \(\langle w,y\rangle \ge 0\). By direct calculation from definitions one may show that \({\hat{D}}^{*}(\partial _{p}f_{z})(x,0)(w)={\hat{D}}^{*}(\partial _{p}f)(x,z)(w) - \beta ' w\) and hence \(\langle p , w \rangle \ge \beta ' \Vert w\Vert ^2\) for all \(p \in {\hat{D}}^{*}(\partial _{p}f)(x,z)(w)\). Taking the graphical limit supremum [38, identity 8(18)] of \({\hat{D}}^{*}(\partial _{p}f)(x,z)(\cdot )\) as \(\left( x,z\right) \rightarrow _{S_{p}\left( f\right) }\left( \bar{x},0\right) \) gives 1. \(\square \)
One of the properties that follows from [36, Theorem 1.3] is that the Aubin Property (or pseudo-Lipschitz property) holds for the mapping \(z \mapsto B_{\delta } (\bar{x}) \cap (\partial f)^{-1} (z)\). The Aubin property is related to differentiability via the following result.
Theorem 48
([2], Theorem 5.3) Suppose H is a Hilbert space and \(f:H\mapsto {\mathbb {R}}_{\infty }\) is lower semi-continuous, prox-regular, and subdifferentially continuous at \(\bar{x}\in {\text {int dom}}\partial f\) for some \(\bar{v}\in \partial f(\bar{x})\). In addition, suppose \(\partial f\) is pseudo-Lipschitz (i.e. possess the Aubin property) at a Lipschitz rate L around \(\bar{x}\) for \(\bar{v}\). Then there exists \(\varepsilon >0\) such that \(\partial f(x)=\{\nabla f(x)\}\) for all \(x\in B_{\varepsilon }(\bar{x})\) with \(x\mapsto \nabla f(x)\) Lipschitz at the rate L.
Corollary 49
Under the assumption of Proposition 43 we have \(z\mapsto \partial k_{v}^{*}(z)\) a single valued Lipschitz continuous mapping in some neighbourhood of 0.
Proof
We invoke Theorems 45 and 48. As \(({\text {co}} k_{v})^{*}= k_{v}^{*}\) and being a convex function it is prox-regular and subdifferentially continuous so \((\partial _{p}{\text {co}} k_{v})^{-1}=(\partial {\text {co}}k_{v})^{-1}=\partial k_{v}^{*}\) is single valued and Lipschitz continuous by Theorem 48, noting that the tilt stability supplies the Aubin property for \(\left( \partial {\text {co}} k_{v}\right) ^{-1}\) via [36, Theorem 1.3] . \(\square \)
We include the following for completeness. We wish to apply this in conjunction with Alexandrov’s theorem and this is valid due to the equivalence of the existence of a Taylor expansion and twice differentiability in the extended sense (see [38, Corollary 13.42, Theorem 13.51]).
Lemma 50
Suppose \(f: {\mathbb {R}}^n \rightarrow {\mathbb {R}}_{\infty }\) is a locally finite convex function at \(B_{\varepsilon } (x)\) which is twice differentiable at \(\bar{x} \in B_{\varepsilon } (x)\) with \(\bar{z} := \nabla f(\bar{x})\) and \(Q:= \nabla ^2 f(\bar{x})\) positive definite. Then we have
Proof
In [17] it is shown that when g is convex with \(g(0)=0\), \(\nabla g (0) = 0\) and twice differentiable \(x=0\) in the sense that the following Taylor expansion exists: \( g(y) = \frac{1}{2} (Qy )T y + o(\Vert y \Vert ^2). \) Then we have the corresponding Taylor expansion: \( g^{*} (x) = \frac{1}{2} (Q^{-1} x)^T x + o(\Vert x\Vert ^2). \) We may apply [38, Corollary 13.42] to claim these expansions are equivalent to the existence of a Hessian for both functions (twice differentiability in the extended sense) where \(0 = \nabla g(0)\), \(Q = \nabla ^2 g(0)\) and \(0=\nabla g^{*} (0)\), \(Q^{-1} = \nabla ^2 g^{*} (0)\). Now apply this to the function \(g(y):=f(y+\nabla f(\bar{x})) - \langle \nabla f (\bar{x}) , y+\nabla f(\bar{x}) \rangle \) noting that \(g^{*} (z) = f^{*} (z + \nabla f(\bar{x})) - \langle \bar{x}, z \rangle \). We have \(\nabla ^2 g (0) = \nabla ^2 f(\bar{x})\), \(\nabla g^{*} (0) = 0\) implies \(f^{*} (\nabla f(\bar{x})) = \bar{x}\) and \(\nabla ^2 g^{*} (0) = Q^{-1} = \nabla f^{*} (\nabla f(\bar{x}))\), demonstrating the forward implication (\(\implies \)) in (37). To obtain the reverse implication we apply the proven result to the convex function \(f^{*}\) using the bi-conjugate formula \(f^{**} = f\). \(\square \)
Our main goal is to demonstrate that the restriction of f to the set \({\mathcal {M}}:=\left\{ \left( u,v\left( u\right) )\mid u\in {\mathcal {U}}^{2}\right) \right\} \) coincides with a \(C^{1,1}\) smooth function of \(u \in {\mathcal {U}}\). Consequently we will be focusing on the case when \({\mathcal {U}}^2\) is a linear subspace and so take \(\mathcal {U^{\prime }} \equiv {\mathcal {U}}^2\) in our previous results. The next result demonstrates when there is a symmetry with respect to conjugation in the tilt stability property for the auxiliary function \(k_{v}\).
Theorem 51
Consider \(f:{\mathbb {R}}^{n}\rightarrow {\mathbb {R}}_{\infty }\) is a proper lower semi-continuous function, which is a prox-regular function at \(\bar{x}\) for \(0\in \partial f(\bar{x})\) with a nontrivial subspace \({\mathcal {U}}^{2}=b^{1}\left( \underline{\partial }^{2}f\left( \bar{x},0\right) \right) \subseteq {\mathcal {U}}\). Denote \({\mathcal {V}}^{2}=({\mathcal {U}}^{2})^{\perp }\), let \(v\left( u\right) \in {\text {argmin}}_{v^{\prime }\in {\mathcal {V}}^{2}\cap B_{\varepsilon }\left( 0\right) }f\left( \bar{x}+u+v^{\prime }\right) :{\mathcal {U}}^{2}\rightarrow {\mathcal {V}}^{2}\) and \(k_{v}\left( u\right) :=h\left( u+v\left( u\right) \right) :{\mathcal {U}}^{2}\rightarrow {\mathbb {R}}_{\infty }\). Suppose also that f has a tilt stable local minimum at \(\bar{x}\) for \(0\in \partial f\left( \bar{x}\right) \) then for \(p\ne 0\) we have
and hence \(k_{v}^{*}\) has a tilt stable local minimum at \(0\in \partial k_{v}^{*}\left( 0\right) .\)
Proof
On application of Propositions 43 and 28 we have \({\text {co}}k_{v}\left( \cdot \right) :{\mathcal {U}}^2\rightarrow {\mathbb {R}}_{\infty }\) possessing a tilt stable local minimum at 0. As \({\text {co}} k_{v}\left( \cdot \right) \) is convex it is prox-regular at 0 for \(0\in \partial {\text {co}} k_{v}\left( 0\right) \ \)and subdifferentially continuous at 0 [38, Proposition 13.32]. Hence we may apply [36, Theorem 1.3] to obtain the equivalent condition for tilt stability. For all \(q \ne 0\)
Now apply Corollary 46 to deduce the existence of \(\beta >0\) such that \(\langle p,q\rangle \ge \beta >0\) for all (p, q) taken in (39) with \(\Vert q \Vert =1\).
For this choice of \(v(\cdot )\) we have \(k_v = L^{\varepsilon }_{{\mathcal {U}}^2}\). From Proposition 5 part 2, Remark 33 and Lemma 32 we see that \(\nabla k_{v}(0)=\{0\}=\nabla {\text {co}} k_{v}(0)\). Then whenever \(x^{k}\in S_{2}({\text {co}}k_{v})\) with \(x^{k}\rightarrow 0\) (as we always have \(z^{k}=\nabla {\text {co}} k_{v}(x^{k})\rightarrow 0=\nabla {\text {co}}k_{v}(0)\)) it follows from Corollary 47 that we have
for k sufficiently large. By Alexandrov’s theorem this positive definiteness of Hessians must hold on a dense subset of some neighbourhood of zero. By the choice of \(v(\cdot )\) we have \(k_{v}(u)=L_{{\mathcal {U}}^{2}}^{\varepsilon }(u)\) and hence we may assert that \(\partial _{{\text {co}}}L_{{\mathcal {U}}^{2}}^{\varepsilon }(u)=\partial {\text {co}}k_{v}(u)\ne \emptyset \) in some neighbourhood of the origin in \({\mathcal {U}}^{2}\).
Since \(\left[ {\text {co}}k_{v}\right] ^{*}=k_{v}^{*}\) and \(\nabla k_{v}^{*}\)\(=\left[ \partial {\text {co}}k_{v}\right] ^{-1}\) we may apply [38, identity 8(19)] to deduce that for \(\left\| q\right\| =1\) we have
Hence we can claim that for \(q\ne 0\), after a sign change, that \(\langle p,q\rangle =\langle -p,-q\rangle \ge \beta >0\). We need to rule out the possibility that \(0\in D^{*}\left( \nabla k_{v}^{*}\right) \left( 0,0\right) \left( p\right) \) for some \(p\ne 0\). To this end we may use the fact that \(k_{v}^{*}\) is \(C^{1,1}\) (and convex) and apply [38, Theorem 13.52] to obtain the following characterisation of the convex hull of the coderivative in terms of limiting Hessians. Denote \(S_{2}(k_{v}^{*}):=\{x\mid \nabla ^{2}k_{v}^{*}(x)\text { exists}\}\) then
Now suppose \(0\in D^{*}\left( \nabla k_{v}^{*}\right) \left( 0,0\right) \left( p\right) \) then there exists \(A^{i}=\lim _{k}\nabla ^{2}k_{v}^{*}(z_{i}^{k})\) for \(z_{i}^{k}\rightarrow 0\) such that \(0=q:=\sum _{i=1}^{m}\lambda _{i}A^{i}p\in {\text {co}} D^{*}\left( \nabla k_{v}^{*}\right) \left( 0,0\right) \left( p\right) \). As \(p\ne 0\) we must then have \(\langle p,q\rangle =p^{T}(\sum _{i=1}^{m}\lambda _{i}A^{i})p=0 \) where \(B:=\sum _{i=1}^{m}\lambda _{i}A^{i}\) is a symmetric positive semi-definite matrix. The inverse \((A_{k}^{i})^{-1}\) exists (relative to \({\mathcal {U}}^2\)) due to (40). Now apply the duality formula for Hessians Lemma 50 to deduce that when \(x_{i}^{k}:=\nabla k_{v}^{*}(z_{i}^{k})\) then \(A_{k}^{i}=\nabla ^{2}k_{v}^{*}(z_{i}^{k})\) iff \((A_{k}^{i})^{-1}=\nabla ^{2}({\text {co}} k_{v})(x_{i}^{k})\).
We now apply Lemma 21 to deduce that the limiting subhessians of \(h(w):=f(\bar{x}+w)\) satisfy (14). We will want to apply this bound to the limiting subhessians of \({\text {co}}h\) at \(x_{i}^{k}+v(x_{i}^{k})\). To this end we demonstrate that \(\Delta _{2}h\left( x_{i}^{k}+v(x_{i}^{k}),\left( z_{i}^{k},0\right) ,t,w\right) \ge \Delta _{2}\left( {\text {co}}h\right) \left( x_{i}^{k}+v(x_{i}^{k}),\left( z_{i}^{k},0\right) ,t,w\right) \) for all \(t\in {\mathbb {R}}\) and any w. This follows from Lemma 32, Proposition 35 in that \(\left( z_{i}^{k},0\right) \in \partial {\text {co}} h\left( x_{i}^{k}+v(x_{i}^{k})\right) =\partial h\left( x_{i}^{k}+v(x_{i}^{k})\right) ,\)\({\text {co}}h\left( x_{i}^{k}+v(x_{i}^{k})\right) =h\left( x_{i}^{k}+v(x_{i}^{k})\right) \) and \({\text {co}}h\left( u+v\right) \le h\left( u+v\right) \) for all \(\left( u,v\right) \in {\mathcal {U}}^2\times {\mathcal {V}}^2\). On taking the a limit infimum for \(t\rightarrow 0\) and \(w\rightarrow u\in {\mathcal {U}}^2\) we obtain
Hence the bound in (14) involving the constant \(M>0\) applies to any \(Q_k \in \partial ^{2,-} \left( {\text {co}}h\right) \left( x_{i}^{k}+v(x_{i}^{k}),\left( z_{i}^{k},0\right) \right) \) for k large.
As \(A_{k}^{i}=\nabla ^{2}k_{v}^{*}(z_{i}^{k})\) by Proposition 42 we have \((A_{k}^{i})^{-1}=(\nabla ^2 _{{\mathcal {U}}^{2}}h^{*}(z_{i}^{k}+0_{{\mathcal {V}}^{2}}))^{-1}\in \partial _{{\mathcal {U}}^{2}}^{2,-}\left( {\text {co}}h\right) (x_{i}^{k}+v(x_{i}^{k}))\) and on restricting to the \({\mathcal {U}}^{2}\) space and using (19), (14) and (31) we get for all \(p\in {\mathcal {U}}^{2}\) that
Thus \(\{A_{k}^{i}\}\) are uniformly positive definite. By [17] we have \((A_{k}^{i})^{-1}=\nabla ^{2}({\text {co}}k_{v})(x_{i}^{k})\) existing at \(x_{i}^{k}\) and hence
Then, for \(u\ne 0\), by Theorem 45 we have \(\langle \nabla ^{2}({\text {co}}k_{v})(x_{i}^{k})u,u\rangle \ge \frac{\beta }{2}>0\) for k large implying \(\left\{ (A_{k}^{i})^{-1}\right\} \) remain uniformly positive definite on \({\mathcal {U}}^{2}\). Hence \(\left\{ A_{k}^{i}\right\} \) remain uniformly bounded within a neighbourhood of the origin within \({\mathcal {U}}^{2}\). Thus on taking the limit we get \(A^{i}=\lim _{k}A_{k}^{i}\) is positive definite and hence \(B:=\sum _{i=1}^{m}\lambda _{i}A^{i}\) is actually positive definite, a contradiction.
As \(k_{v}^{*}\) is convex and finite at 0, it is prox-regular and subdifferentially continuous at 0 for \(0\in \partial k_{v}^{*}\left( 0\right) \) by [38, Proposition 13.32]. Another application of [36, Theorem 1.3] allows us to deduce that \(k_{v}^{*}\) has a tilt stable local minimum at \(0\in \nabla k_{v}^{*}\left( 0\right) .\)\(\square \)
We may either use the strong metric regularity property to obtain the existence of a smooth manifold or utilizes the Mordukhovich criteria for the Aubin property [38] and the results of [2] on single valuedness of the subdifferential satisfying a pseudo-Lipschitz property, namely:
Proof of Theorem 1
using strong metric regularity
Note first that \({\mathcal {U}}^{2}\subseteq {\mathcal {U}}\) corresponds to (12) for \(\bar{z}=0\). Let \(\{v\left( u\right) \} = {\text {argmin}}_{v^{\prime }\in {\mathcal {V}}^{2}\cap B_{\varepsilon }\left( 0\right) }f\left( \bar{x}+u+v^{\prime }\right) \). We apply either [36, Theorem 1.3] or [7, Theorem 3.3] that asserts that as \(k_{v}^{*}\) is prox-regular and subdifferentially continuous at 0 for \(0\in \partial k_{v}^{*}\left( 0\right) \) then \(\partial k_{v}^{*}\) is strongly metric regular at \(\left( 0,0\right) .\) That is there exists \(\varepsilon >0\) such that
is single valued and locally Lipschitz for \(u\in {\mathcal {U}}^{2}\) sufficiently close to 0. But as \(\left( \partial k_{v}^{*}\right) ^{-1}=\partial k_{v}^{**}=\partial \left[ {\text {co}}k_{v}\right] \) is a closed convex valued mapping (and hence has connected images) we must have the existence of \(\delta >0\) such that for \(u\in B_{\delta }^{{\mathcal {U}}^{2}}\left( 0\right) \) we have \(\partial \left[ {\text {co}}k_{v}\right] \left( \cdot \right) \) a singleton locally Lipschitz mapping (giving differentiability). As \(\{v\left( u\right) \} = {\text {argmin}}_{v^{\prime }\in {\mathcal {V}}^{2}\cap B_{\varepsilon }\left( 0\right) }\left\{ h\left( u+v^{\prime }\right) -\langle \bar{z}_{{\mathcal {V}}^{2}},v^{\prime }\rangle \right\} :{\mathcal {U}}^{2}\cap B_{\varepsilon }\left( 0\right) \rightarrow {\mathcal {V}}^{2}\) we have \(k_{v}(u)=L_{{\mathcal {U}}^{2}}^{\varepsilon }(u)\) for \(u\in {\text {int}}B_{\varepsilon }^{{\mathcal {U}}^{2}}\left( 0\right) \). Hence \(\nabla {\text {co}} L_{{\mathcal {U}}^{2}}^{\varepsilon }(u)\in \partial _{{\text {co}}}L_{{\mathcal {U}}^{2}}^{\varepsilon }(u)\ne \emptyset \) and by Corollary 35 we have on \({\mathcal {U}}^{2}\) that \(h\left( u+v\left( u\right) \right) =\left[ {\text {co}}h\right] \left( u+v\left( u\right) \right) \) and hence
is single valued implying \(\nabla _{u}g\left( u+v\left( u\right) \right) \) exists where \(g\left( \cdot \right) :=\left[ {\text {co}}h\right] \ \left( \cdot \right) \) . \(\square \)
Corollary 52
Under the assumptions of Theorem 1 we have \(\nabla L_{{\mathcal {U}}^{2}}^{\varepsilon }(u)\) existing as a Lipschitz function locally on \(B^{{\mathcal {U}}}_{\varepsilon } (0)\).
Proof
Applying Corollary 35 again we can assert that under our current assumptions that locally we have \({\text {co}}k_{v} = k_{v} = L_{{\mathcal {U}}^{2}}^{\varepsilon }\) and hence \(\nabla k_{v} (u) = \nabla L_{{\mathcal {U}}^{2}}^{\varepsilon }(u)\) exists as a Lipschitz function locally on \(B^{{\mathcal {U}}}_{\varepsilon } (0)\). \(\square \)
Proof of Theorem 1
using the single valuedness of the subdifferential satisfying a pseudo-Lipschitz property.
We show that \(D^{*}(\partial [{\text {co}}k_{v}])(0 , 0)(0)=\{0\}\). To this end we use (38). Indeed this implies that \(q\ne 0\) for any \(p\ne 0\) for all \(q \in D^{*} (\nabla k^{*}_v ) (0,0)(0) (p)\). Applying the result [38, identity 8(19)] on inverse functions and coderivatives we have \(q=0\) implies \(p=0\) for all \(p\in D^{*}(\partial [{\text {co}} k_{v}])(0,0)(q)\). Hence we have \(D^{*}(\partial [{\text {co}} k_{v}])(0, 0)(0)=\{0\}\). Now apply the Mordukhovich criteria for the Aubin property [38, Theorem 9.40] to deduce that \(\partial [{\text {co}}k_{v}]\) has the Aubin property at 0 for \(0\in \partial [{\text {co}}k_{v}](0)\). Now apply Theorem 48 to deduce that \(u\mapsto \nabla [{\text {co}}k_{v}](u)\) exists a single valued Lipschitz mapping in some ball \(B_{\delta }^{{\mathcal {U}}^2}\left( 0\right) \) in the space \({\mathcal {U}}^2\). We now finish the proof as before in the first version. \(\square \)
If we assume more, essentially what is needed to move towards partial smoothness we get a \(C^{1,1}\) smooth manifold.
Proof of Theorem 2
First note that when we have (2) holding using f then we must have (2) holding using \(g := {\text {co}} h \). Thus by Proposition 5 part 3 have (6) holding using g (via the convexification argument). As \(g\left( w\right) :=\left[ {\text {co}}h \right] \left( w\right) \) for \(w\in B_{\varepsilon }\left( 0\right) \) is a convex function we have g a regular in \(B_{\varepsilon }\left( 0\right) \). Moreover as \(g\left( u+v\left( u\right) \right) =f\left( \bar{x}+u+v\left( u\right) \right) \) (and \(g\left( w\right) \le f\left( \bar{x}+w\right) \) for all w ) we have the regular subdifferential of g (at \(u + v(u)\)) contained in that of f (at \(\bar{x} + u + v(u)\)). As g is regular the singular subdifferential coincides with the recession directions of the regular subdifferential [38, Corollary 8.11] and so are contained in the recession direction of the regular subdifferential of f. We are thus able to write down the following inclusion
By the tilt stability we have v a locally Lipschitz single valued mapping. Thus by the basic chain rule of subdifferential calculus we have
is a single valued Lipschitz mapping. Under the additional assumption we have via Proposition 5 part that, \({\text {cone}}\left[ \partial _{{\mathcal {V}}}g\left( u+v\left( u\right) \right) \right] \supseteq {\mathcal {V}}\) for \(u\in B_{\varepsilon }\left( 0\right) \cap {\mathcal {U}}\). As \(\partial v\left( u\right) \subseteq {\mathcal {V}}\) it cannot be multi-valued and still have \(\left( e_{{\mathcal {U}}}\oplus \partial v\left( u\right) \right) ^{T}\partial g\left( u\oplus v\left( u\right) \right) \) single valued. This implies the limiting subdifferential \(\partial v\left( u\right) \) is locally single valued and hence \(\nabla v\left( u\right) \) exists locally. The upper-semi-continuity of the subdifferential and the single-valuedness implies \(u \mapsto \nabla v(u)\) is a continuous mapping. \(\square \)
The following example demonstrates the fact that even if \(\partial _{w}g\left( u+v\left( u\right) \right) \) is multi-valued we still have \(\left( e_{{\mathcal {U}}}, \nabla v\left( u\right) \right) ^{T}\partial _{w}g\left( u+v\left( u\right) \right) \) single valued.
Example 53
If \(f:{\mathbb {R}}^{2}\rightarrow {\mathbb {R}}\) is given by \(f=\max \{ f_1, f_2\} \) where \(f_1=w_1^2+(w_2-1)^2\) and \(f_2=w_2\), then \(\partial _{w}g\left( u+v\left( u\right) \right) \) is multi valued but \(\left( e_{{\mathcal {U}}},\partial v\left( u\right) \right) ^{T}\partial _{w}g\left( u+v\left( u\right) \right) \) is single valued.
Using the notation in the Theorem, we put \(\bar{w}=0,\) find that \(\partial f\left( 0\right) =\{\alpha \left( 0,1-\sqrt{5}\right) +(1-\alpha )\left( 0,1\right) \;|\;0\le \alpha \le 1\}\) so we have \({\mathcal {U}}=\left\{ \alpha \left( 1,0\right) \mid \alpha \in {\mathbb {R}}\right\} \) and \({\mathcal {V}}=\left\{ \alpha \left( 0,1\right) \mid \alpha \in {\mathbb {R}}\right\} \). With \(\epsilon <1/2\) then
It follows that \( \nabla v\left( u\right) =\frac{2u}{\sqrt{9-4u^{2}}}\) and \(\left( e_{{\mathcal {U}}},\partial v\left( u\right) \right) ^{T}=\left( 1,\frac{2u}{\sqrt{9-4u^{2}}}\right) ^{T}\). Now we consider \(\partial _{w}g\left( u+v\left( u\right) \right) =\partial _{w}f\left( u+v\left( u\right) \right) \). At \(u+v\left( u\right) \), from \(f_{1}\) we know
and from \(f_{2}\) we know
Thus
that is, \(\partial _{w}g\left( u+v\left( u\right) \right) \) is multi valued. However, for all such \(\alpha \), we have
Therefore \(\left( e_{{\mathcal {U}}},\partial v\left( u\right) \right) ^{T}\partial _{w}g\left( u+v\left( u\right) \right) \) is single valued.
We may now demonstrate that we have arrived at a weakening of the second order expansions studied in [24, Theorem 3.9], [30, Equation (7)] and [32, Theorem 2.6].
Corollary 54
Under the assumption of Theorem 2 we have the following local lower Taylor estimate holding: there exists \(\delta >0\) such that for all \(u \in B_{\delta } (0) \cap {\mathcal {U}}\) we have for all \(u'+v' \in B_{\delta } (u + v(u))\)
for all \(Q \in \partial ^{2,-} L_{{\mathcal {U}}}^{\varepsilon }(u, z_{{\mathcal {U}}}(u))\), where \( z_{{\mathcal {U}}}(u) := \nabla L_{{\mathcal {U}}}^{\varepsilon } (u) \).
Proof
We apply Corollary 39 taking note of the observation in remark 29 to obtain the following chain of inequalities. As \(Q \in \partial ^{2,-} L_{{\mathcal {U}}}^{\varepsilon }(u, z_{{\mathcal {U}}}(u))\) we have
where we have used Corollary 52 to deduce that \(\nabla L_{{\mathcal {U}}}^{\varepsilon } (u) = z_{{\mathcal {U}}} (u)\) exists locally as a Lipschitz continuous function. The result now follows using the orthogonality of the \({\mathcal {U}}\) and \({\mathcal {V}}\).\(\square \)
Remark 55
The function f described in Theorem 2 is quite closely related to the partial smooth class introduced by Lewis [25, 26]. Lewis calls f partially smooth at \(\bar{x}\) relative to a manifold \({\mathcal {M}}\) iff
-
1.
We have \(f|_{{\mathcal {M}}}\) is smooth around \(\bar{x}\);
-
2.
for all points in \({\mathcal {M}}\) close to \(\bar{x}\) we have f is regular and has a subgradient;
-
3.
we have \({f_{\_}}^{\prime }(\bar{x},h) >- {f_{\_}}^{\prime }(\bar{x},-h)\) for all nonzero \(h \in N_{{\mathcal {M}}} (\bar{x})\) and
-
4.
the subgradient mapping \(w \mapsto \partial f (w)\) is continuous at \(\bar{x}\) relative to \({\mathcal {M}}\).
It is not difficult to see that \(\{0\}\times {\mathcal {V}}=N_{{\mathcal {M}}}(\bar{x})\) (for \(\bar{x}\) as in Theorem 2) and so 3 of [26, Definition 3.2] is known to hold. Thus clearly we have 1 and 3 holding for the function f in Theorem 2. As functions that are prox-regular at a point \((x,0)\in {\text {Graph}}\partial f\) are not necessarily regular at x then 2 is not immediately obvious, although a subgradient must exist. Fortunately the “convex representative” given by \(g(\bar{x} + \cdot )\), where \(g:= {\text {co}}h\), is regular thanks to convexity. This implies that the restriction of f to the manifold is indeed regular. This leaves the issue of whether \(w\mapsto \partial g(w)\) is continuous at 0 (relative to \({\mathcal {M}}\)). As \(0\in {\text {int}}\partial _{{\mathcal {V}}}g\left( u+v\left( u\right) \right) \) for \(u\in B_{\varepsilon }\left( 0\right) \cap {\mathcal {U}}\) this problem may be reduced to investigating whether \(u\mapsto {\text {int}}\partial _{{\mathcal {V}}}g\left( u+v\left( u\right) \right) \) is lower semi-continuous at 0. This is not self evident due to the possibility of a increase in the dimension of the affine hull of \(\partial f(\cdot )\) at \(\bar{x}\). So the question as to whether g is partially smooth is still open. The solution to this issue may lie in the underlying assumption that \({\mathcal {U}}= {\mathcal {U}}^{2}\) in Theorem 2 (see the discussion in Remark 40). On balance the authors would conjecture that the functions we described in Theorem 2 are most likelihood partially smooth, despite failing to engineer a proof.
We would like to finish this section with some remarks regarding the related work in [26]. Because of the gap we still currently have in providing a bridge to the concept of partial smoothness we can’t make direct comparisons with the results of [26]. Moreover in [26] the authors deal with \(C^2\)-smooth manifolds while the natural notion of smoothness for this work is of type \(C^{1,1}\). It would be interesting to see to what degree the very strong results of [26] carry over to this context. That is, a study of tilt stability of partially smooth functions under pinned by a \(C^{1,1}\)-smooth manifold. This may be another avenue to close the gap that still exists.
References
Artacho Aragón, F.J.: Characterization of metric regularity of subdifferentials. J. Convex Anal. 15(2), 365–380 (2008)
Bačák, M., Borwein, J.M., Eberhard, A., Mordukhovich, B.S.: Infimal convolutions and Lipschitzian properties of subdifferentials for prox-regular functions in Hilbert spaces. J. Convex Anal. 17(3–4), 737–763 (2010)
Bonnans, J.F., Shapiro, A.: Perturbation Analysis of Optimization Problems. Springer Series in Operations Research. Springer, New York (2000)
Cominetti, R., Correa, R.: A generalized second-order derivative in nonsmooth optimization. SIAM J. Control Optim. 28(4), 789–809 (1990)
Crandall, M.G., Ishii, H., Lions, P.-L.: User’s guide to viscosity solutions of second order partial differential equations. Bull. Am. Math. Soc. (N.S.) 27(1), 1–67 (1992)
Dontchev, A.L., Rockafellar, R.T.: Implicit Functions and Solution Mappings. Springer Series in Operations Research and Financial Engineering, A view from variational analysis, 2nd edn. Springer, New York (2014)
Drusvyatskiy, D., Lewis, A.S.: Tilt stability, uniform quadratic growth, and strong metric regularity of the subdifferential. SIAM J. Optim. 23(1), 256–267 (2013)
Eberhard, A., Sivakumaran, R., Wenczel, R.: On the variational behaviour of the subhessians of the Lasry-Lions envelope. J. Convex Anal. 13(3–4), 647–685 (2006)
Eberhard, A., Wenczel, R.: On the calculus of limiting subhessians. Set Valued Anal. 15(4), 377–424 (2007)
Eberhard, A., Wenczel, R.: Some sufficient optimality conditions in nonsmooth analysis. SIAM J. Optim. 20(1), 251–296 (2009)
Eberhard, A., Wenczel, R.: A study of tilt-stable optimality and sufficient conditions. Nonlinear Anal. 75(3), 1260–1281 (2012)
Eberhard, A.C., Pearce, C.E.M.: A sufficient optimality condition for nonregular problems via a nonlinear Lagrangian. Numer. Algebra Control Optim. 2(2), 301–331 (2012)
Eberhard, A.: Prox-regularity and subjets. In: Optimization and Related Topics (Ballarat/Melbourne, 1999), vol. 47 of Appl. Optim., pp. 237–313. Kluwer Acad. Publ., Dordrecht (2001)
Eberhard, A., Nyblom, M., Ralph, D.: Applying generalised convexity notions to jets. In: Generalized Convexity, Generalized Monotonicity: Recent Results (Luminy, 1996), vol. 27 of Nonconvex Optim. Appl., pp. 111–157. Kluwer Acad. Publ., Dordrecht (1998)
Eberhard, A.C., Mordukhovich, B.S.: First-order and second-order optimality conditions for nonsmooth constrained problems via convolution smoothing. Optimization 60(1–2), 253–275 (2011)
Mordukhovich, B.S., Nghai, T.T.A.: Second-order characterizations of tilt stability with applications to nonlinear programming. Math. Program Ser. A 149, 83–104 (2015)
Gorni, G.: Conjugation and second-order properties of convex functions. J. Math. Anal. Appl. 158(2), 293–315 (1991)
Hare, W.: Numerical analysis of \({\cal{VU}}\)-decomposition, \({\cal{U}}\)-gradient, and \({\cal{U}}\)-Hessian approximations. SIAM J. Optim. 24(4), 1890–1913 (2014)
Hare, W.L.: Functions and sets of smooth substructure: relationships and examples. Comput. Optim. Appl. 33(2–3), 249–270 (2006)
Hare, W.L., Poliquin, R.A.: The quadratic sub-Lagrangian of a prox-regular function. In: Proceedings of the Third World Congress of Nonlinear Analysts, Part 2 (Catania, 2000), vol. 47, pp. 1117–1128 (2001)
Hare, W.L., Poliquin, R.A.: Prox-regularity and stability of the proximal mapping. J. Convex Anal. 14(3), 589–606 (2007)
Holmes, R.B.: Geometric Functional Analysis and Its Applications. Graduate Texts in Mathematics, vol. 24. Springer, New York (1975)
Ioffe, A.D., Penot, J.-P.: Limiting sub-Hessians, limiting subjets and their calculus. Trans. Am. Math. Soc. 349(2), 789–807 (1997)
Lemaréchal, C., Oustry, F., Sagastizábal, C.: The \({\cal{U}}\)-Lagrangian of a convex function. Trans. Am. Math. Soc. 352(2), 711–729 (2000)
Lewis, A.S.: Active sets, nonsmoothness, and sensitivity. SIAM J. Optim 13(3), 702–725 (2002)
Lewis, A.S., Zhang, S.: Partial smoothness, tilt stability, and generalized Hessians. SIAM J. Optim. 23(1), 74–94 (2013)
Mifflin, R., Sagastizábal, C.: Proximal points are on the fast track. J. Convex Anal 9(2), 563–579 (2002). Special issue on optimization (Montpellier, 2000)
Mifflin, R., Sagastizábal, C.: Primal-dual gradient structured functions: second-order results; links to epi-derivatives and partly smooth functions. SIAM J. Optim. 13(4), 1174–1194 (2003)
Mifflin, R., Sagastizábal, C.: \({\cal{VU}}\)-smoothness and proximal point results for some nonconvex functions. Optim. Methods Softw. 19(5), 463–478 (2004)
Mifflin, R., Sagastizábal, C.: On the relation between \({\cal{U}}\)-Hessians and second-order epi-derivatives. Eur. J. Oper. Res. 157(1), 28–38 (2004)
Mifflin, R., Sagastizábal, C.: A \({mathcal VU }\)-algorithm for convex minimization. Math. Program. 104(2–3, Ser. B), 583–608 (2005)
Miller, S.A., Malick, J.: Newton methods for nonsmooth convex minimization: connections among \({mathcal{U}}\)-Lagrangian, Riemannian Newton and SQP methods. Math. Program. 104(2–3, Ser. B), 609–633 (2005)
Mordukhovich, B.S.: Variational Analysis and Generalized Differentiation. I, volume 330 of Grundlehren der Mathematischen Wissenschaften (Fundamental Principles of Mathematical Sciences). Springer, Berlin (2006)
Penot, J.-P.: Sub-Hessians, super-Hessians and conjugation. Nonlinear Anal. 23(6), 689–702 (1994)
Poliquin, R.A., Rockafellar, R.T.: Prox-regular functions in variational analysis. Trans. Am. Math. Soc. 348(5), 1805–1838 (1996)
Poliquin, R.A., Rockafellar, R.T.: Tilt stability of a local minimum. SIAM J. Optim. 8(2), 287–299 (1998). (electronic),
Rockafellar, R.T.: Convex Analysis. Princeton Mathematical Series, vol. 28. Princeton University Press, Princeton (1970)
Rockafellar, R.T., Wets, R.J.-B.: Variational Analysis, volume 317 of Grundlehren der Mathematischen Wissenschaften (Fundamental Principles of Mathematical Sciences). Springer, Berlin (1998)
Studniarski, M.: Necessary and sufficient conditions for isolated local minima of nonsmooth functions. SIAM J. Control Optim. 24(5), 1044–1049 (1986)
Wright, S.J.: Identifiable surfaces in constrained optimization. SIAM J. Control Optim. 31(4), 1063–1079 (1993)
Author information
Authors and Affiliations
Corresponding author
Additional information
This research was in part supported by the ARC Discovery Grant no. DP120100567.
Appendix A
Appendix A
The prove Proposition 15 we need the following results regarding the variation limits of rank-1 supports.
Proposition 56
([13], Corollary 3.3) Let \(\{\mathcal {A}(v)\}_{v\in W}\) be a family of non-empty rank-1 representers (i.e. \(\mathcal {A}(v)\subseteq \mathcal {S}\left( n\right) \) and \(-\mathcal {P}\left( n\right) \subseteq 0^{+}\mathcal {A}(v)\) for all v) and W a neighbourhood of w. Suppose that \(\limsup _{v\rightarrow w}\mathcal {A}(v)=\mathcal {A}(w)\). Then
Recall that \((x^{\prime },z^{\prime })\rightarrow _{S_{p}(f)}(\bar{x},z)\) means \(x^{\prime }\rightarrow ^{f}\bar{x}\), \(\ z^{\prime }\in \partial _{p}f(x^{\prime })\) and \(z^{\prime }\rightarrow z\).
Corollary 57
Let \(f:{\mathbb {R}}^{n}\rightarrow {\mathbb {R}}_{\infty }\) be proper and lower semicontinuous with \(h\in b^{1}(\underline{\partial }^{2}f(\bar{x},\bar{z} )). \) Then
Proof
Use Proposition 56 and Remark 11. \(\square \)
Denote the infimal convolution of f by \(f_{\lambda }(x):=\inf _{u\in {\mathbb {R}}^{n}}\left( f(u)+\frac{1}{2\lambda }\Vert x-u\Vert ^{2}\right) \). Recall that \(f_{\lambda }\left( x\right) -\frac{1}{2\lambda }\left\| x\right\| ^{2}=-\left( f\ +\frac{\lambda }{2}\Vert \cdot \Vert ^{2}\right) ^{*}(\lambda x)\) and this \(f_{\lambda }\) is always para-concave. Recall that in [13, Lemma 2.1], it is observed that f is locally \(C^{1,1}\) iff f is simultaneously a locally para-convex and para-concave function. Recall [38, Proposition 4.15] that states that the limit infimum of a collection of convex sets is also convex and that the upper epi-limit of a family of functions has an epi-graph that is the limit infimum of the family of epi-graphs. Consequently the epi-limit supremum of a family of convex functions give rise to convex function.
Proof
(of Proposition 15) Begin by assuming f is locally para-convex. Let \(\frac{c}{2}>0\) be the modulus of para-convexity of f on \(B_{\delta }( \bar{x})\), \(x\in B_{\delta }(\bar{x})\) with \(z\in \partial f\left( x\right) \) and \(\partial ^{2,-}f\left( x,z\right) \ne \emptyset \). Let \(C_{t}(x)=\{h\mid x+th\in B_{\delta }(\bar{x})\}\) then we have
convex on \(C_{t}(x)\) since \(x\mapsto f(x)+\frac{c}{2}\Vert x\Vert ^{2}\) is convex on \(B_{\delta }(\bar{x})\). Next note that for every \(K>0\) there exists a \(\bar{t}>0\) such that for \(0<t<\bar{t}\) we have \(B_{K}(0)\subseteq C_{t}(x)\). Once again restricting f to \(B_{\delta }(\bar{x})\) we get a family
of convex functions with domains containing \(C_{t}(x)\,\)(for each t) and whose convexity (on their common domain of convexity) will be preserved under an upper epi-limit as \(t\downarrow 0\). Thus, using the fact that \(\frac{c}{t^{2}}\left( \Vert x+th\Vert ^{2}-\Vert x\Vert ^{2}-t\langle 2x,h\rangle \right) \) converges uniformly on bounded sets to \(c\Vert h\Vert ^{2}\), we have the second order circ derivative (introduced in [23]) given by:
which is convex on \(B_{K}(0)\), for every \(K>0\), being obtained by taking an epi-limit supremum of a family of convex functions given in (43). We then have \(h\mapsto f^{\uparrow \uparrow }(x,z,h)+c\Vert h\Vert ^{2}\) convex (with \(f^{\uparrow \uparrow }(x,z,\cdot )\) having a modulus of para-convexity of c).
From [4], Proposition 4.1 particularized to \(C^{1,1}\) functions f we have that there exists a \(\eta \in [x,y]\) such that
Using (44), Proposition 7 and the variational result corollary 56, we have when the limit is finite (for \(\bar{z}:=\nabla f(\bar{x})\))
where the last inequality follows from [23, Proposition 6.5].
Now assuming f is quadratically minorised and is prox-regular at \(\bar{x}\ \) for \(\bar{p}\in \partial f(\bar{x})\) with respect to \(\varepsilon \) and r. Let \(g(x):=f(x+\bar{x})-\langle \bar{z},x+\bar{x}\rangle \). Then \(0\in \partial g(0)\) and we now consider the infimal convolution \(g_{\lambda }(x)\) which is para-convex locally with a modulus \(c:=\frac{\lambda r}{2(\lambda -r)}\), prox-regular at 0 (see [35, Theorem 5.2]). We may now use the first part of the proof to deduce that \(g_{\lambda }^{\uparrow \uparrow }(0,0,\cdot )\) is para-convex with modulus \(c=\frac{2\lambda r}{(\lambda -r)}\) and \(g_{\lambda }^{\uparrow \uparrow }(0,0,h)=q\left( \underline{\partial }^{2}g_{\lambda }(0,0)\right) (h)\) since \(g_{\lambda }\) is \(C^{1,1}\) (being both para-convex and para-concave). Using Corollary 56 and [8, Proposition 4.8 part 2.] we obtain
Thus \(q\left( \underline{\partial }^{2}g(0,0)\right) (h)+r\Vert h\Vert ^{2}=\limsup _{\lambda \rightarrow \infty }\inf _{h^{\prime }\rightarrow h}\left( g_{\lambda }^{\uparrow \uparrow }(0,0,h^{\prime })+\frac{\lambda r}{(\lambda -r)}\Vert h\Vert ^{2}\right) \) is convex, being the variational upper limit of convex functions. One can easily verify that \(\underline{\partial }^{2}g(0,0)=\underline{\partial }^{2}f(\bar{x},\bar{z})\) and \(g^{\uparrow \uparrow }(0,0,h)=f^{\uparrow \uparrow }(\bar{x},\bar{z},h)\). \(\square \)
Rights and permissions
About this article
Cite this article
Eberhard, A.C., Luo, Y. & Liu, S. On partial smoothness, tilt stability and the \({\mathcal {VU}}\)-decomposition. Math. Program. 175, 155–196 (2019). https://doi.org/10.1007/s10107-018-1238-8
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10107-018-1238-8