1 Introduction

The primary focus of the paper is on Karush–Kuhn–Tucker-type (KKT) optimality conditions for optimization problems with inequality constraints in cases when the regularity assumptions (constraint qualifications) known in the literature (see, for example, [1]) are not satisfied at a solution. One goal of the paper is to present KKT-type optimality conditions in Banach spaces under new regularity assumptions. Another goal is to analyze problems for which the KKT form of optimality conditions does not hold (see Example 7.1) and to propose necessary and sufficient conditions for those problems.

The regularity assumptions proposed in the paper extend constraint qualifications known in the literature (e.g., [1,2,3,4]) to new classes of optimization problems. While [1] gives a careful comparison and classification of the existing constraint qualifications, all of them are stated using at most the first derivatives of the constraints. The regularity assumptions proposed in this paper use higher-order derivatives and thereby cover new classes of problems. As a result, our approach allows us to analyze problems, for example, ones in which the first-order derivatives of all constraints vanish, that are not covered by any of the regularity assumptions given in [1,2,3,4]. Also, in addition to the classical KKT-type optimality conditions analyzed in [1,2,3,4], our approach is applicable to cases when the KKT Theorem fails but generalized forms of the KKT conditions can be derived.

There is an extensive literature on generalizations of the KKT Theorem to an infinite-dimensional setting (relevant references can be found, for example, in [5, p. 159] or in [6,7,8]). Our approach is based on the construction of p-regularity introduced earlier in [9,10,11,12]. The main idea of p-regularity is to replace the operator of the first derivative, which is not surjective, with a special operator that is onto. Various optimality conditions proposed for the degenerate case are given, for example, in [13,14,15,16,17,18]. An approach similar to ours is used in [19,20,21,22]. The main differences between the optimality conditions proposed in this paper and those in [19,20,21,22] are that we consider the more general case of \(p \ge 2\) and do not make some of the additional assumptions introduced in [19,20,21,22]. We give a more detailed comparison in Sect. 8.

The paper is organized as follows. We formulate the problem in Sect. 2. We start with an absolutely degenerate case in Sect. 3, in which the Karush–Kuhn–Tucker necessary conditions reduce to a specific form containing the objective function only. In Sect. 4.1, we analyze some cases in which the KKT conditions hold for nonregular problems with a nonzero multiplier corresponding to the objective function and present new KKT-type optimality conditions in Theorems 4.1 and 4.2. After that, in Sect. 4.2, we analyze problems for which the Karush–Kuhn–Tucker form of optimality conditions does not hold. The necessary and sufficient conditions derived in Theorem 4.4 can be viewed as generalized KKT optimality conditions. As auxiliary results, we derive new geometric necessary conditions in Lemmas 3.1 and 4.1. In Sect. 5, we consider a general case of degeneracy, where we do not make assumption (10), which is one of the main assumptions in Sect. 4. The new approach presented in Sect. 5 was briefly announced in our paper [23] and is used to reduce degenerate optimization problems to new forms, so that one can use simpler ways to analyze those problems. Some directions for future work are briefly described in Sect. 6. We illustrate the optimality conditions by some examples in Sect. 7, give an additional comparison with other results in Sect. 8, and conclude the paper with Sect. 9.

2 Formulation of the Problem

We consider a nonlinear optimization problem with inequality constraints

$$\begin{aligned} {\mathop {\hbox {min}}\limits _{x \in X}} \; f (x) \quad \mathrm{s.t.} \quad g(x)=(g_1(x), \ldots , g_m(x)) \le 0, \end{aligned}$$
(1)

where the functions f and \(g_i\) are sufficiently smooth functions from the Banach space X to \({\mathbb R}\). In the case when the Linear Independence Constraint Qualification is not satisfied at a solution \(\bar{x}\) of the problem (1), we call the problem degenerate (nonregular) at \(\bar{x}\). The Karush–Kuhn–Tucker (KKT) Theorem states that if \(\bar{x}\) is a local solution of problem (1) and a regularity assumption holds, then there exist Lagrange multipliers \(\lambda ^*_1, \ldots , \lambda ^*_m\) such that

$$\begin{aligned} {f^{\prime }(\bar{x}) + {\mathop {\sum }\limits _{j=1}^{m}} \lambda _j^* g^{\prime }_j(\bar{x}) = 0}, \quad g(\bar{x}) \le 0, \quad \lambda _j^* \ge 0, \quad \lambda _j^* g_j(\bar{x}) =0, \quad j=1,\ldots , m. \end{aligned}$$
(2)

It is interesting to note that, while the first equation in (2) holds, the requirement that the Lagrange multipliers \(\lambda _j^*\) are nonnegative can be violated for degenerate optimization problems. For example, note that \(\bar{x}=0\) is a local minimizer for the problem:

$$\begin{aligned} {\mathop {\hbox {min}}\limits _{x \in X}} \; f(x) = - x_2 \quad \mathrm{s.t.} \quad g_1(x) = - x_1^2 - x_2 \le 0 , \; g_2(x) = x_1^{12} + x_2^3 \le 0 . \end{aligned}$$

Then, the first equation in (2) reduces to \(\left( \begin{array}{c} 0\\ -1\end{array} \right) + \lambda _1 \left( \begin{array}{c} 0\\ -1\end{array} \right) + \lambda _2 \left( \begin{array}{c} 0\\ 0\end{array} \right) = \left( \begin{array}{c} 0\\ 0\end{array} \right) \) and yields \(\lambda _1 = -1 < 0\), contradicting \(\lambda _j^* \ge 0\) in (2). For problems of this type, Theorems 4.4 and 5.1 give necessary optimality conditions, which can be viewed as a generalized form of the KKT conditions and guarantee that all Lagrange multipliers are of the same sign. Example 7.1 illustrates this case.
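This computation is easy to check symbolically. The following short sketch (added here for illustration; it is not part of the original development) evaluates the gradients at \(\bar{x}=0\) with SymPy and solves the stationarity equation:

```python
import sympy as sp

x1, x2, lam1, lam2 = sp.symbols('x1 x2 lambda1 lambda2')
f = -x2
g1 = -x1**2 - x2
g2 = x1**12 + x2**3

grad = lambda F: sp.Matrix([F.diff(x1), F.diff(x2)])
at0 = {x1: 0, x2: 0}

# Stationarity: f'(0) + lam1*g1'(0) + lam2*g2'(0) = 0.
eq = grad(f).subs(at0) + lam1 * grad(g1).subs(at0) + lam2 * grad(g2).subs(at0)
print(sp.solve(eq, [lam1, lam2], dict=True))  # [{lambda1: -1}]: lambda1 < 0, lambda2 free

# Feasibility near 0: g2 <= 0 forces x2 <= -x1**4 <= 0, so f = -x2 >= 0 = f(0).
```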

Notation: For a set C, we denote the span of C by span(C) and the set of all nonnegative combinations of vectors in C by cone\(\, C\). We let \(g_i^{(p)}(x)\) be the pth derivative of \(g_i: X \rightarrow {\mathbb R}\) at the point x; the associated p-form is \(g^{(p)}_i(x) [h]^p := g^{(p)}_i (x) ( h, h, \ldots , h).\) The notation \(g_i^{(p)} ({x}) [h]^{p -1}\) means \(\left( g_i^{(p-1)} ({x}) [h]^{p-1}\right) ^\prime _x\) (see [24] for additional details). Other notation will be introduced below as needed.
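In computations, the p-form \(g_i^{(p)}(\bar{x})[h]^p\) can be evaluated as the pth derivative of the scalar function \(t \mapsto g_i(\bar{x} + t h)\) at \(t = 0\). A minimal SymPy sketch (ours; the function and the direction are illustrative only):

```python
import sympy as sp

x1, x2, t = sp.symbols('x1 x2 t')
g = x1**2 * x2 - x2**3            # an illustrative function with g'(0) = g''(0) = 0
h = (1, 2)                        # an illustrative direction
p = 3

# g^(p)(0)[h]^p equals d^p/dt^p g(0 + t*h) at t = 0.
phi = g.subs({x1: t * h[0], x2: t * h[1]})
print(sp.diff(phi, t, p).subs(t, 0))   # -36 = 3! * (h1**2*h2 - h2**3)
```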

3 New Optimality Conditions for an Absolutely Degenerate Case and an Even p

Throughout this section, we assume that p is an even number and

$$\begin{aligned} g_i^{(j)} (\bar{x}) = 0, \quad j=1, \ldots , p-1, \quad \forall i \in I(\bar{x}), \end{aligned}$$
(3)

where \(I(\bar{x})\) is the set of indices of active constraints at \(\bar{x}\), \(I(\bar{x}):= \{ i = 1, \ldots , m \, : \, g_i(\bar{x}) =0\} \). The case of an odd p was covered in our paper [25].

We need the following additional notation.

Let \(S := \{ x \in X : g_{i} (x) \le 0, \; i=1, \ldots , m \}\) denote the feasible set for problem (1), \( G_p (\bar{x}, h):= \{ d \in X \, : \, \langle g_i^{(p)} (\bar{x}) [h]^{p -1}, d \rangle \le 0, \; \forall i \in I(\bar{x})\} \) for some \(h \in X\), and

$$\begin{aligned} F_0( \bar{x}) := \{d\in X \, : \, \langle f'(\bar{x}), d \rangle <0\}. \end{aligned}$$
(4)

The following theorem presents one of the main results of this section.

Theorem 3.1

Let \(\bar{x}\) be a local minimizer of problem (1), \(f(x) \in C^2(X)\), and \(g_i(x) \in C^{p+1}(X)\), \(i = 1, \ldots , m\). Assume that (3) holds with some even p and that there exist vectors \(h \in X\), \(\Vert h\Vert =1\), and \(\xi \in X\), \(\Vert \xi \Vert = 1\), such that for all \(i \in I(\bar{x}) \),

$$\begin{aligned} g_i^{(p)} (\bar{x}) [h]^p =0 \quad \text{ and } \quad \langle g_i^{(p)} (\bar{x}) [h]^{p -1}, \xi \rangle < 0. \end{aligned}$$
(5)

Then \(\; f^\prime (\bar{x}) =0. \;\)

Note that assumption (5) can be viewed as a new generalization of the Mangasarian–Fromovitz constraint qualification.
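To see Theorem 3.1 at work on a toy instance (our construction, not taken from the paper), let \(X = {\mathbb R}^2\), \(p = 2\), \(f(x) = x_1^2 + x_2^2\), and \(g_1(x) = x_1^2 - x_2^2 \le 0\). Then \(\bar{x} = 0\) is a local minimizer, \(g_1'(0) = 0\), so (3) holds, and the vectors h and \(\xi \) below satisfy (5); the theorem then asserts \(f'(0) = 0\), which indeed holds:

```python
import numpy as np

# Toy instance (ours): f = x1^2 + x2^2, g1 = x1^2 - x2^2 <= 0, xbar = 0, p = 2.
H_g1 = np.array([[2.0, 0.0], [0.0, -2.0]])    # g1''(0); g1'(0) = 0, so (3) holds
h = np.array([1.0, 1.0]) / np.sqrt(2)
xi = np.array([-1.0, 1.0]) / np.sqrt(2)

print(h @ H_g1 @ h)        # 0.0: the first condition in (5)
print((H_g1 @ h) @ xi)     # -2.0 < 0: the second condition in (5)

grad_f0 = np.array([0.0, 0.0])   # f'(0) = (2*x1, 2*x2) at x = 0
print(grad_f0)                   # [0. 0.], as Theorem 3.1 asserts
```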

Without loss of generality, we may assume that \(I(\bar{x}) = \{1,\ldots ,m\}\) throughout the paper: for \(i\notin I(\bar{x})\), the continuity of \(g_i(x)\) guarantees that \(g_i(x) < 0\) on some neighborhood of \(\bar{x}\), so inactive constraints can be ignored in a local analysis. We need the following lemmas to prove Theorem 3.1.

Lemma 3.1

(Geometric Necessary Condition) Let the assumptions of Theorem 3.1 hold. Then

$$\begin{aligned} F_0 (\bar{x}) \cap G_p (\bar{x}, h) = \emptyset .\end{aligned}$$
(6)

Proof

Assume on the contrary that there exists \(d\in F_0 (\bar{x}) \cap G_p (\bar{x}, h)\). Without loss of generality, let \(\Vert d \Vert = 1\). Since \(d \in G_p (\bar{x}, h)\),

$$\begin{aligned} \langle g_i^{(p)} (\bar{x}) [h]^{p -1}, d \rangle \le 0 \quad \forall \; i \in I(\bar{x}). \end{aligned}$$
(7)

First, we will prove that \(\langle f'(\bar{x}), h \rangle =0\). Assume on the contrary that \(\langle f'(\bar{x}), h \rangle \ne 0\). Let \(\xi \in X\) satisfy (5) and consider \(\bar{x} + t h + t^{3/2} \xi \) and \(\bar{x} - t h - t^{3/2} \xi \) for all sufficiently small \(t>0\). Then, for \(i \in I(\bar{x})\), we get the following inequalities with remainders \(\bar{r}_i(t)\) and \(\tilde{r}_i(t)\) satisfying \(|\bar{r}_i(t)| = o(t^{p+1/2})\), \(|\tilde{r}_i(t)| = o(t^{p+1/2})\),

$$\begin{aligned} g_i(\bar{x} + t h + t^{3/2} \xi )=\frac{1}{p!} \left( g_i^{(p)}(\bar{x})[t h]^p + p\, t^{p+1/2} \langle g_i^{(p)}(\bar{x}) [ h]^{p-1}, \xi \rangle \right) + \bar{r}_i(t) \le 0,\\ g_i(\bar{x} - t h - t^{3/2} \xi )=\frac{1}{p!} \left( g_i^{(p)}(\bar{x})[t h]^p + p\, t^{p+1/2} \langle g_i^{(p)}(\bar{x}) [-h]^{p-1}, -\xi \rangle \right) + \tilde{r}_i(t) \le 0. \end{aligned}$$

The inequalities imply that \(\bar{x} + t h + t^{3/2} \xi \in S\) and \(\bar{x} - t h - t^{3/2} \xi \in S\) for all sufficiently small t. Then \(\langle f'(\bar{x}), h \rangle \ne 0\) yields \(f(\bar{x} + t h + t^{3/2} \xi ) < f(\bar{x})\) or \(f(\bar{x} - t h - t^{3/2} \xi ) < f(\bar{x})\), which contradicts the assumption that \(\bar{x}\) is a local minimizer and proves \(\langle f'(\bar{x}), h \rangle =0\).

Now, consider \(x(t) = \bar{x}+ t h + t^{3/2} d + t^{7/4} \xi \). For every \(i = 1, \ldots , m\), since \(I(\bar{x}) = \{1, \ldots , m\}\) and (3) holds, there exist \(\delta _i >0\) and \(r_i: ]0, \delta _i[ \rightarrow {\mathbb R}\) such that \(|r_i(t)| = o(t^{p+3/4})\) and

$$\begin{aligned} g_i(x(t)) &= g_i(\bar{x}+ t h + t^{3/2} d + t^{7/4} \xi )\\ &= \frac{1}{p!} \left( g_i^{(p)}(\bar{x})[t h]^p + p\, t^{p+1/2} \langle g_i^{(p)}(\bar{x}) [h]^{p-1}, d \rangle + p\, t^{p+3/4} \langle g_i^{(p)}(\bar{x}) [h]^{p-1}, \xi \rangle \right) \\ &\quad + r_i(t), \quad \forall t \in ]0, \delta _i[. \end{aligned}$$

Hence, by (5) and (7), there exists \(\varepsilon _i \in ]0, \delta _i[\) such that \( g_i(x(t) ) \le 0\) for all \(t \in ]0, \varepsilon _i[. \) Taking \(\varepsilon =\mathop { \min }\limits _{i= 1, \ldots , m} \varepsilon _i\), we get \(g_i(x(t) ) \le 0\) for all \(i=1, \ldots , m,\) and, therefore, x(t) is feasible for problem (1) for any \( t \in ]0, \varepsilon [\). Then \(d\in F_0 (\bar{x})\) and \(\langle f'(\bar{x}), h \rangle =0\) yield \(f(x(t)) < f(\bar{x})\) for all \(t \in ]0, \varepsilon [\), which contradicts the assumption that \(\bar{x}\) is a local minimizer and proves (6). \(\square \)

Lemma 3.2

Let X be a Banach space and \(X^*\) be its dual space. Given a set of vectors \(\eta _i \in X^*\), \(i=1, \ldots , r\), let \(z \in X\) be a vector such that \(\langle z, \eta _i \rangle < 0\), \(i=1, \ldots , r\). Assume also that for some vector \(\eta \in X^*\), there exist numbers \(\alpha _i \ge 0\) and \(\beta _i \le 0\), \(i=1, \ldots , r\), such that \( \eta = \mathop {\sum }\limits _{i=1}^r \alpha _i \eta _i\) and \( \eta = \mathop {\sum }\limits _{i=1}^r \beta _i \eta _i.\) Then \(\eta = 0\).

Proof

By assumptions of the lemma, \(\, \mathop {\sum }\limits _{i=1}^r \alpha _i \eta _i = \mathop {\sum }\limits _{i=1}^r \beta _i \eta _i \, \) and

$$\begin{aligned} 0 \ge \mathop {\sum }\limits _{i=1}^r \alpha _i \langle z, \eta _i \rangle = \mathop {\sum }\limits _{i=1}^r \beta _i \langle z, \eta _i \rangle \ge 0, \end{aligned}$$

since \(\langle z, \eta _i \rangle < 0\), \(\alpha _i \ge 0\), and \(\beta _i \le 0\), \(i=1, \ldots , r.\) Therefore, \( \alpha _i = \beta _i =0,\) \( i=1, \ldots , r,\) and \(\eta =0,\) which proves the lemma. \(\square \)

We will need the following generalization of the Farkas Lemma in Banach spaces.

Lemma 3.3

(Farkas) Consider \(c, \eta _1, \ldots , \eta _r \in X^*\). Exactly one of the following holds:

(*):

there exists \(x\in X\) with \(\langle \eta _i, x \rangle \le 0\) \(\forall i =1, \ldots , r\), and \(\langle c, x \rangle >0\).

(**):

there exist nonnegative scalars \(\mu _1, \ldots , \mu _r\) such that \(c = \mu _1 \eta _1 + \ldots + \mu _r \eta _r\).
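In finite dimensions, deciding which of the two alternatives holds is a linear-programming feasibility question: (**) asks whether c lies in the cone generated by \(\eta _1, \ldots , \eta _r\). A small SciPy sketch (ours; the data are illustrative):

```python
import numpy as np
from scipy.optimize import linprog

def farkas_branch(etas, c):
    """Return '(**)' with multipliers mu >= 0 if c = sum_i mu_i * eta_i is
    feasible; otherwise '(*)' holds and a separating x exists."""
    A = np.asarray(etas, dtype=float).T              # columns are the eta_i
    res = linprog(c=np.zeros(A.shape[1]), A_eq=A, b_eq=np.asarray(c, dtype=float),
                  bounds=[(0, None)] * A.shape[1])
    return ('(**)', res.x) if res.success else ('(*)', None)

etas = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
print(farkas_branch(etas, np.array([1.0, 2.0])))     # ('(**)', array([1., 2.]))
print(farkas_branch(etas, np.array([-1.0, 0.0])))    # ('(*)', None)
```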

Now we are ready to prove Theorem 3.1.

Proof

(Theorem 3.1) Let \(\eta _i =g_i^{(p)} (\bar{x}) [h]^{p -1}\), \(i \in I(\bar{x})\), and \(c = - f^{\prime } (\bar{x})\). Then, by Lemma 3.1, part (*) in Lemma 3.3 does not hold, so, by part (**), there exist scalars \(\beta _i \le 0\), \(i \in I(\bar{x}),\) such that

$$\begin{aligned} f^{\prime } (\bar{x}) = \mathop {\sum }\limits _{i\in I(\bar{x})} \beta _i g_i^{(p)} (\bar{x}) [h]^{p -1} . \end{aligned}$$
(8)

Note that the assumptions of the theorem also hold with the vector \(-h\). Indeed, (5) can be written as \( g_i^{(p)} (\bar{x}) [-h]^p =0\) and \(\langle g_i^{(p)} (\bar{x}) [-h]^{p -1}, - \xi \rangle < 0 . \) Then, similarly to the considerations above, by Lemma 3.3, there exist \(\gamma _i \le 0\), \(i \in I(\bar{x}),\) such that

$$\begin{aligned} f^{\prime } (\bar{x}) = \mathop {\sum }\limits _{i\in I(\bar{x})} \gamma _i g_i^{(p)} (\bar{x}) [- h]^{p -1}= \mathop {\sum }\limits _{i\in I(\bar{x})} (-\gamma _i) g_i^{(p)} (\bar{x}) [ h]^{p -1}. \end{aligned}$$
(9)

Introducing \(\alpha _i = - \gamma _i \ge 0\) and using Lemma 3.2 with \(\eta _i \) defined above, \(\eta = f^\prime (\bar{x})\), and \(z=\xi \) (note that \(\langle z, \eta _i \rangle < 0\) by (5)), we get \(f^\prime (\bar{x}) = 0\), which finishes the proof of the theorem. \(\square \)

Note that conditions (8) and (9) can be viewed as generalizations of the KKT-type optimality conditions. However, the constraint qualification in the form (5) implies \(f^\prime (\bar{x}) = 0\). The following theorem is a simple corollary of Theorem 3.1.

Theorem 3.2

Let \(\bar{x}\) be a local minimizer of problem (1), \(f(x) \in C^2(X)\), and \(g_i(x) \in C^{p+1}(X)\), \(i = 1, \ldots , m\). Assume that (3) holds with some even p and that the vectors \( g_i^{(p)} (\bar{x}) [h]^{p -1},\) \(i \in I(\bar{x}),\) are linearly independent for some h satisfying \( g_i^{(p)} (\bar{x}) [h]^p =0,\) \(i \in I(\bar{x}) . \) Then \( f^\prime (\bar{x}) =0. \)

4 Optimality Conditions in the General Case of Degeneracy with \(p=2\)

In this section, we analyze some cases when the KKT Theorem holds for nonregular problems. Then, we introduce generalizations of the KKT conditions for some cases when the KKT Theorem does not hold. For now, we assume that there exists a number \(r \in \{1, \ldots , m-1\} \) such that

$$\begin{aligned} g^\prime _i (\bar{x}) \ne 0, \; i = 1, \ldots , r, \quad {g}_i '(\bar{x}) = 0, \; i=r+1, \ldots , m. \end{aligned}$$
(10)

A more general case without assumption (10) is considered in Sect. 5 of the paper.

4.1 When the KKT Theorem Holds

We start with a special case when there exists a vector \(h \in X\), \(h \ne 0\), such that

$$\begin{aligned} \langle g_i ' (\bar{x}), h \rangle = 0,\; i=1, \ldots , r , \quad \langle {g}_i ^{\prime \prime } (\bar{x}) h , h \rangle = 0, \; i=r+1, \ldots , m . \end{aligned}$$
(11)

We use the following notation and assumptions in this section:

$$\begin{aligned}&G_2 (\bar{x}, h) := \{ d \in X \, : \, \langle g_i^\prime (\bar{x}), d \rangle \le 0, \; i=1, \ldots , r; \; \nonumber \\&\langle {g}_i ^{\prime \prime } (\bar{x}) [h] , d \rangle \le 0, \; i=r+1, \ldots , m \}. \end{aligned}$$
(12)

Assumption 1

(A generalized MFCQ-type 2-regularity assumption) For a vector h satisfying (11), assume the following

Part A. There exists a vector \(\xi \in X\), \(\Vert \xi \Vert =1\), such that

$$\begin{aligned} \langle g_i ' (\bar{x}), \xi \rangle< 0, i=1, \ldots , r, \quad \langle {g}_i ^{\prime \prime } (\bar{x}) h , \xi \rangle < 0, \; i=r+1, \ldots , m . \end{aligned}$$
(13)

Part B. There exists a vector \(\eta \in X\), \(\Vert \eta \Vert =1\), such that

$$\begin{aligned} \langle g_i ' (\bar{x}), \eta \rangle < 0, \; i=1, \ldots , r ,\quad \langle {g}_i ^{\prime \prime } (\bar{x}) h , \eta \rangle > 0, \; i=r+1, \ldots , m . \end{aligned}$$

Assumption 2

For some h satisfying (11), assume that \(C_1 \cap C_2 = \{0\},\) where \(C_1 := \mathrm{span}\{g_1^\prime (\bar{x}), \ldots , g_r^\prime (\bar{x}) \} \) and \(C_2 := \mathrm{cone} \{{g}_{r+1}^{\prime \prime } (\bar{x}) [h], \ldots , {g}_m^{\prime \prime } (\bar{x}) [h] \}.\)

The following theorem can be viewed as a generalization of the KKT Theorem.

Theorem 4.1

(Necessary optimality conditions) Assume that \(\bar{x}\) is a local minimizer of problem (1), \(f(x) \in C^1(X)\), and \(g_i(x) \in C^{2}(X)\), \(i = 1, \ldots , m\). Assume that (10) holds and that there exists a vector \(h \in X\), \(h \ne 0\), such that (11) holds. Suppose that Assumptions 1 and 2 hold for problem (1). Then there exist \(\lambda _i^* \ge 0\), \(i=1, \ldots , r\), such that

$$\begin{aligned} f'(\bar{x}) + \mathop {\sum }\limits _{i =1}^r \lambda ^*_i g_i'(\bar{x}) = 0. \end{aligned}$$
(14)
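Before turning to the proof, here is a quick numerical illustration on a toy instance of ours (not from the paper): \(f(x) = x_1\), \(g_1(x) = -x_1\), \(g_2(x) = x_2^2 - x_3^2\), \(\bar{x} = 0\). Here \(f = x_1 \ge 0\) on the feasible set, (10) holds with \(r = 1\), \(h = (0,1,1)\) satisfies (11), the vectors \(\xi \) and \(\eta \) below give Parts A and B of Assumption 1, Assumption 2 holds, and (14) is solved by \(\lambda _1^* = 1\):

```python
import numpy as np

# Toy instance (ours): f = x1, g1 = -x1, g2 = x2^2 - x3^2, xbar = 0, r = 1.
grad_f = np.array([1.0, 0.0, 0.0])
grad_g1 = np.array([-1.0, 0.0, 0.0])
H_g2 = np.diag([0.0, 2.0, -2.0])         # g2''(0); g2'(0) = 0, so (10) holds

h = np.array([0.0, 1.0, 1.0])            # (11): <g1', h> = 0 and <g2'' h, h> = 0
xi = np.array([1.0, 0.0, 1.0])           # Part A: <g1', xi> < 0, <g2'' h, xi> < 0
eta = np.array([1.0, 1.0, 0.0])          # Part B: <g1', eta> < 0, <g2'' h, eta> > 0
print(grad_g1 @ h, h @ H_g2 @ h)         # 0.0 0.0
print(grad_g1 @ xi, (H_g2 @ h) @ xi)     # -1.0 -2.0
print(grad_g1 @ eta, (H_g2 @ h) @ eta)   # -1.0 2.0

# Assumption 2: span{g1'} is the x1-axis, cone{g2''[h]} is the ray t*(0,1,-1),
# t >= 0, so the two sets meet only at 0.  Conclusion (14) with lambda_1^* = 1:
print(grad_f + 1.0 * grad_g1)            # [0. 0. 0.]
```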

Remark 4.1

Note the following:

  1. The multipliers \(\lambda ^*_i\) in Theorem 4.1 do not depend on the vector h.

  2. Only the constraints \(g_1, \ldots , g_r\) enter equation (14).

  3. In the case when \(g_i ' (\bar{x}) = 0\), \(i=1, \ldots , r\), equation (14) reduces to \(f '(\bar{x}) = 0\).

  4. The generalized MFCQ-type 2-regularity assumption (Assumption 1) is a new constraint qualification.

We will need the following lemma to prove Theorem 4.1. Recall that the sets \(F_0 (\bar{x})\) and \(G_2(\bar{x}, h)\) are introduced in (4) and (12), respectively.

Lemma 4.1

(Geometric Necessary Condition) Let the assumptions of Theorem 4.1 hold. Then

$$\begin{aligned} F_0 (\bar{x}) \cap G_2 (\bar{x}, h) = \emptyset .\end{aligned}$$
(15)

Proof

First, we will prove that \(\; \langle f ' (\bar{x}) , h \rangle = 0\;\) for h that satisfies Assumption 1 and \(\Vert h\Vert =1\). Assume on the contrary that \( \langle f ' (\bar{x}) , h \rangle \ne 0\).

1. If \(i \in \{1, \ldots , r\}\), then \(g_i(\bar{x})=0\) and, by Assumption 1, (11), and Taylor’s expansion, there exist vectors \(\xi \) and \(\eta \), \(\Vert \xi \Vert =1\), \(\Vert \eta \Vert =1\), a sufficiently small \(\delta > 0\), and functions \(\omega _{i_j}: ]0, \delta [ \rightarrow {\mathbb R}\) with \(|\omega _{i_j} (t)| = o(t^{3/2}),\) \(j=1,2,\) such that, for all \(t \in ]0, \delta [\),

$$\begin{aligned} g_i(\bar{x} + t h + t^{3/2} \xi ) &= \langle g_i'(\bar{x}), t h \rangle + \langle g_i^\prime (\bar{x}), t^{3/2} \xi \rangle + \omega _{i_1}(t) = \langle g_i^\prime (\bar{x}), t^{3/2} \xi \rangle + \omega _{i_1}(t)< 0, \\ g_i(\bar{x} - t h + t^{3/2} \eta ) &= - \langle g_i'(\bar{x}), t h \rangle + \langle g_i^\prime (\bar{x}), t^{3/2} \eta \rangle + \omega _{ i_2}(t) = \langle g_i^\prime (\bar{x}), t^{3/2} \eta \rangle + \omega _{i_2}(t) < 0. \end{aligned}$$

2. If \(i \in \{r+1, \ldots , m\}\), then \(g_i(\bar{x})=0\), \(g_i'(\bar{x}) = 0\), and, similarly to the above, there exist functions \(\omega _{i_j}: ]0, \delta [ \rightarrow {\mathbb R}\) with \(|\omega _{i_j} (t)| = O(t^{3}),\) \(j=3, 4,\) such that, for all \(t \in ]0, \delta [\),

$$\begin{aligned} g_i(\bar{x} + t h + t^{3/2} \xi ) &= g_i(\bar{x}) + \langle g_i^{\prime \prime }(\bar{x}) t h, t^{3/2} \xi \rangle + \omega _{i_3}( t) < 0, \\ g_i(\bar{x} - t h + t^{3/2} \eta ) &= g_i(\bar{x}) - \langle g_i^{\prime \prime }(\bar{x}) t h, t^{3/2} \eta \rangle + \omega _{i_4}( t) < 0. \end{aligned}$$

Thus, \(\bar{x} + t h + t^{3/2} \xi \in S\) and \(\bar{x} - t h + t^{3/2} \eta \in S\) for all \(t \in ]0, \delta [\). Then the assumption \( \langle f ' (\bar{x}) , h \rangle \ne 0\) above implies that either \( f(\bar{x} + t h + t^{3/2} \xi ) < f(\bar{x}) \) or \( f(\bar{x} - t h + t^{3/2} \eta ) < f(\bar{x}) , \) which contradicts the minimality of \(\bar{x}\) and proves \(\; \langle f ' (\bar{x}) , h \rangle = 0.\;\)

To prove (15), assume on the contrary that there exists \(d\in F_0 (\bar{x}) \cap G_2 (\bar{x}, h)\). Then similarly to the above and by using Assumption 1, there exists a sufficiently small \(\delta >0\) such that \(x(t)= \bar{x} + t h + t^{3/2} {d} + t^{7/4} \xi \in S\) for all \(t \in ]0, \delta [\). This inclusion, together with \(d\in F_0 (\bar{x})\) and \( \langle f ' (\bar{x}) , h \rangle = 0\), yields \(f(x(t)) < f(\bar{x})\) for all \(t \in ]0, \delta [\), which contradicts the minimality of \(\bar{x}\) and proves (15). \(\square \)

Now we are ready to prove Theorem 4.1.

Proof

(Theorem 4.1) Let \( \eta _1 =g_1^\prime (\bar{x}), \ldots , \eta _r =g_r^\prime (\bar{x}),\) \(\eta _{r+1} =~g_{r+1}^{\prime \prime } (\bar{x})[h], \ldots ,\) \( \eta _{m} = g_{m}^{\prime \prime } (\bar{x})[h], \) and \( c = - f^{\prime } (\bar{x}) . \) Then, by Lemma 4.1, part (*) in Lemma 3.3 does not hold, so by part (**), there exist scalars \({\lambda }_i^* \ge 0\), \(i=1, \ldots , r\), and \(\gamma _i^* \ge 0\), \(i=r+1, \ldots , m\), such that

$$\begin{aligned} f'(\bar{x}) = - \mathop {\sum }\limits _{i =1}^r {\lambda }^*_i g_i'(\bar{x}) - \mathop {\sum }\limits _{i =r+1}^m {\gamma }^*_i g_i^{\prime \prime }(\bar{x}) [h]. \end{aligned}$$
(16)

Note that the assumptions of Theorem 4.1 also hold with the vector \(-h\). Then, using similar arguments, there exist \(\bar{\lambda }_i^* \ge 0\), \(i=1, \ldots , r\), and \(\bar{\gamma }_i^* \ge 0\), \(i=r+1, \ldots , m\), such that

$$\begin{aligned} f'(\bar{x}) = - \mathop {\sum }\limits _{i =1}^r \bar{\lambda }^*_i g_i'(\bar{x}) - \mathop {\sum }\limits _{i =r+1}^m \bar{\gamma }^*_i g_i^{\prime \prime }(\bar{x})[- h]. \end{aligned}$$
(17)

Equations (16) and (17) imply

$$\begin{aligned} \mathop {\sum }\limits _{i =1}^r \bar{\lambda }^*_i g_i'(\bar{x}) - \mathop {\sum }\limits _{i =1}^r {\lambda }^*_i g_i'(\bar{x})= \mathop {\sum }\limits _{i =r+1}^m \bar{\gamma }^*_i g_i^{\prime \prime }(\bar{x})[h] + \mathop {\sum }\limits _{i =r+1}^m {\gamma }^*_i g_i^{\prime \prime }(\bar{x}) [h] . \end{aligned}$$
(18)

Consider two cases. In the first case, we assume that

$$\begin{aligned} \mathop {\sum }\limits _{i =1}^r {\lambda }^*_i g_i^\prime (\bar{x}) = \mathop {\sum }\limits _{i =1}^r \bar{\lambda }^*_i g_i^\prime (\bar{x}) . \end{aligned}$$
(19)

Then by Lemma 3.2, \(\gamma _i^* = \bar{\gamma }_i^* = 0\), \(i=r+1, \ldots , m\), so (14) holds and we are done.

In the second case, suppose that (19) does not hold and recall that the sets \(C_1\) and \(C_2\) are defined in Assumption 2. In this case, equation (18) implies that there exists a nonzero vector \(d \in C_1 \cap C_2\), namely \(\; d = \mathop {\sum }\limits _{i =1}^r (\bar{\lambda }^*_i - {\lambda }^*_i ) g_i'(\bar{x}) = \mathop {\sum }\limits _{i =r+1}^m (\bar{\gamma }^*_i + {\gamma }^*_i ) g_i^{\prime \prime }(\bar{x}) [h], \;\) which contradicts Assumption 2. Hence, (19) holds, which finishes the proof of the theorem. \(\square \)

In the following theorem, we propose necessary conditions for optimality without using Assumption 2. Instead, without loss of generality, we assume that the vectors \(g^\prime _1 (\bar{x}), \ldots ,\) \(g^\prime _r (\bar{x})\), \(g^{\prime \prime }_{r+1} (\bar{x}) [h]\), \(\ldots \), \(g^{\prime \prime }_{m} (\bar{x}) [h]\) are the extreme directions for the cone

$$\begin{aligned} K(\bar{x}) := \mathrm{cone} \{ g^\prime _1 (\bar{x}), \ldots , g^\prime _r (\bar{x}), g^{\prime \prime }_{r+1} (\bar{x}) [h], \ldots , g^{\prime \prime }_{m} (\bar{x}) [h]\}. \end{aligned}$$
(20)

(Recall that an extreme direction is the direction of a ray that cannot be expressed as a conic combination of any ray directions in the cone distinct from it.)

Theorem 4.2

(Necessary optimality conditions) Assume that \(\bar{x}\) is a local minimizer of problem (1), \(f(x) \in C^1(X)\), and \(g_i(x) \in C^{2}(X)\), \(i = 1, \ldots , m\). Assume that (10) holds and that there exists a vector \(h \in X\), \(h \ne 0\), such that relations (11) and Assumption 1 hold. Then there exist \(\lambda _i^* \ge 0\), \(i=1, \ldots , r\), such that

$$\begin{aligned} f'(\bar{x}) + \mathop {\sum }\limits _{i =1}^r \lambda ^*_i g_i'(\bar{x}) = 0. \end{aligned}$$
(21)

Proof

Let h, \(h \ne 0 \), satisfy (11) and (13). Note that since Assumption 1 holds, the proof of Lemma 4.1 implies \( \langle f ' (\bar{x}) , h \rangle = 0. \) Also, (16) was proved using Assumption 1 only; hence, there exist scalars \({\lambda }_i^* \ge 0\), \(i=1, \ldots , r\), and \(\gamma _i^* \ge 0\), \(i=r+1, \ldots , m\), such that

$$\begin{aligned} f'(\bar{x}) = - \mathop {\sum }\limits _{i =1}^r {\lambda }^*_i g_i'(\bar{x}) - \mathop {\sum }\limits _{i =r+1}^m {\gamma }^*_i g_i^{\prime \prime }(\bar{x}) [h]. \end{aligned}$$
(22)

Assume on the contrary that (21) does not hold. Then \(- f ' (\bar{x}) \notin \mathrm{cone} \{ g^\prime _1 (\bar{x}), \ldots , g^\prime _r (\bar{x})\}.\) Also, by (20) and (22), \(- f ' (\bar{x}) \in K(\bar{x})\) and there exists an index j such that \({\gamma }^*_j >0\) in (22). Then by Part B of Assumption 1, there exists a vector \(\bar{h} \ne 0\) such that

$$\begin{aligned} \langle f ' (\bar{x}) , \bar{h}\rangle> 0, \quad \langle g^\prime _i (\bar{x}) , \bar{h} \rangle > 0, \; i=1, \ldots , r, \quad \langle g^{\prime \prime }_j (\bar{x}) [h], \bar{h} \rangle < 0, \; j=r+1, \ldots , m. \end{aligned}$$
(23)

Hence, by (11) and (23), for \(i=1, \ldots , r\), there exists a sufficiently small \(\varepsilon \in ]0, 1[\) such that, for all sufficiently small \(\alpha > 0\),

$$\begin{aligned} g_i (\bar{x} - \alpha h - \alpha ^{1+ \varepsilon } \bar{h}) = -\langle g_i^{\prime }(\bar{x}), \alpha h \rangle - \langle g_i^{\prime }(\bar{x}), \alpha ^{1+ \varepsilon } \bar{h} \rangle + \omega _i (\alpha ) <0, \end{aligned}$$

where \(|\omega _i (\alpha )| = O(\alpha ^2)\). Similarly, by (10), (11), and (23), for \(i=r+1, \ldots , m,\)

$$\begin{aligned} g_i (\bar{x} - \alpha h - \alpha ^{1+ \varepsilon } \bar{h}) = \frac{1}{2} \langle g_i^{\prime \prime }(\bar{x}) \alpha ^{1+ \varepsilon } \bar{h}, \alpha ^{1+ \varepsilon } \bar{h} \rangle + \langle g_i^{\prime \prime }(\bar{x}) \alpha h, \alpha ^{1+ \varepsilon } \bar{h} \rangle + \xi _i (\alpha ) <0, \end{aligned}$$

where \(|\xi _i (\alpha )| = O(\alpha ^3)\). Therefore, \( \bar{x} - \alpha h - \alpha ^{1+ \varepsilon } \bar{h} \in S \), and, by using (23) and \( \langle f ' (\bar{x}) , h \rangle = 0\), we get

$$\begin{aligned} f (\bar{x} - \alpha h - \alpha ^{1+ \varepsilon } \bar{h}) = f(\bar{x}) - \langle f^{\prime }(\bar{x}), \alpha h \rangle - \langle f^{\prime }(\bar{x}), \alpha ^{1+ \varepsilon } \bar{h} \rangle + \eta (\alpha ) < f(\bar{x}), \end{aligned}$$

where \(|\eta (\alpha )| = O(\alpha ^2)\), which contradicts the assumption that \(\bar{x}\) is a local minimizer. Hence, (21) holds. \(\square \)

In the following theorem, we present sufficient conditions for optimality. To simplify the consideration, we derive the sufficient conditions for problem (1) in the case when X is a finite-dimensional space. However, a similar result also holds in a Banach space X under an assumption of strong p-regularity (see [26]).

To formulate the next theorem, we introduce a Lagrange function, \( \mathcal {L} (x, \lambda ) := f(x) + \mathop {\sum }\limits _{i=1}^{r} \lambda _i g_i(x), \) and a set

$$\begin{aligned} H_2 (\bar{x}) := \{ h \in X \, : \, \langle g_i ' (\bar{x}), h \rangle \le 0, \; i=1, \ldots , r , \quad \langle {g}_i ^{\prime \prime } (\bar{x}) h , h \rangle \le 0, \; i=r+1, \ldots , m \}. \end{aligned}$$
(24)

Remark 4.2

Note that we do not make Assumption 1 or any regularity assumption in Theorem 4.3.

Theorem 4.3

(Sufficient optimality conditions) Let \(X = {\mathbb R}^n\) and \(f(x) , g_i(x) \in C^{2}(X)\), \(i = 1, \ldots , m\). Assume that there exist \(\lambda _i^* \ge 0\), \(i=1, \ldots , r\), such that \( \mathcal {L}^{\prime }_x (\bar{x}, \lambda ^*) = f'(\bar{x}) + \mathop {\sum }\limits _{i =1}^r \lambda ^*_i g_i'(\bar{x}) = 0. \) Assume also that there exists \(\beta > 0\) such that, for any \(h \in H_2 (\bar{x})\), the following holds: \( \langle \mathcal {L}_{xx}^{\prime \prime } (\bar{x}, \lambda ^*) h, h \rangle \ge \beta \Vert h \Vert ^2 . \) Then \(\bar{x}\) is a strict local minimizer of problem (1).

Proof

Assume on the contrary that \(\bar{x}\) is not a strict local minimizer of problem (1). Then there exists a sequence \(\{ x_k \}_{k=1}^\infty \) such that \(x_k \in S \cap U(\bar{x})\) for some neighborhood \(U(\bar{x})\) of \(\bar{x}\), \( f(x_k) \le f(\bar{x}), \) and \(x_k \rightarrow \bar{x}\) as \(k \rightarrow \infty \). Since \(x_k \in S\), it can be represented (passing to a subsequence if necessary) as \( x_k = \bar{x} + \alpha _k h + \omega (\alpha _k), \) where \(h \in H_2(\bar{x})\), \(\Vert h\Vert =1\), \(\Vert \omega (\alpha _k)\Vert = o(\alpha _k)\), and \(\alpha _k \rightarrow 0\) as \(k \rightarrow \infty \). Indeed, \(x_k\) can be written as \( x_k = \bar{x} + h_k, \) where \(h_k/\Vert h_k\Vert \rightarrow h\) as \(k \rightarrow \infty \), so that \(h_k = \alpha _k h + \omega (\alpha _k)\) and \(\Vert h_k\Vert = \alpha _k\). Then the inclusion \(h \in H_2(\bar{x})\) follows from \(\bar{x} + h_k \in S\) and we get

$$\begin{aligned} f(x_k) &\ge f(x_k) + \mathop {\sum }\limits _{i =1}^r {\lambda }^*_i g_i (x_k) = \mathcal {L} (x_k, \lambda ^*)\\ &= \mathcal {L} (\bar{x}, \lambda ^*) + \langle \mathcal {L}_{x}^{\prime } (\bar{x}, \lambda ^*), x_k - \bar{x} \rangle + \frac{1}{2} \langle \mathcal {L}_{xx}^{\prime \prime } (\bar{x}, \lambda ^*) (x_k - \bar{x} ), (x_k - \bar{x}) \rangle + o(\alpha _k^2)\\ &\ge f(\bar{x}) + \frac{\beta }{2} \Vert \alpha _k h + \omega (\alpha _k)\Vert ^2 + o(\alpha _k^2) > f(\bar{x}) . \end{aligned}$$

Getting the contradiction finishes the proof. \(\square \)
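The Hessian condition of Theorem 4.3 can be probed numerically by sampling directions and keeping those in \(H_2(\bar{x})\) from (24). A sketch (ours) on a toy instance of our own construction, \(f = x_1^2 + x_2^2\), \(g_1 = -x_1\), \(g_2 = x_1^2 - x_2^2\), \(\bar{x} = 0\), \(\lambda _1^* = 0\):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy instance (ours): f = x1^2 + x2^2, g1 = -x1, g2 = x1^2 - x2^2, xbar = 0.
grad_g1 = np.array([-1.0, 0.0])
H_g2 = np.diag([2.0, -2.0])     # g2''(0); g2'(0) = 0
H_L = np.diag([2.0, 2.0])       # L''_xx(0, lambda*) with lambda_1^* = 0, i.e. f''(0)

# Sample unit directions and keep those in H_2(xbar) from (24).
hs = rng.normal(size=(10000, 2))
hs /= np.linalg.norm(hs, axis=1, keepdims=True)
in_H2 = (hs @ grad_g1 <= 0) & (np.einsum('ij,jk,ik->i', hs, H_g2, hs) <= 0)

# Empirical lower bound on <L'' h, h> over sampled h in H_2: here exactly 2.0,
# so (with beta = 2) Theorem 4.3 certifies a strict local minimizer at 0.
print(np.min(np.einsum('ij,jk,ik->i', hs[in_H2], H_L, hs[in_H2])))
```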

4.2 When the KKT Theorem Fails

In this section, we consider some classes of problems, for which either the KKT conditions do not hold or there is no h satisfying (11) (see Example 7.1), and present generalized KKT conditions for those problems. We assume that (10) holds and introduce additional sets: \(I_{1} (h) := \{i \in \{1, \ldots , r\} : \langle g_i^{\prime } (\bar{x}), h \rangle =0 \}\), \(I_{2} (h) := \{i \in \{r+1, \ldots , m\} : \langle g_i^{\prime \prime } (\bar{x}) h, h \rangle = 0\}, \) and \( H_f (\bar{x}) := \{ h \in H_2 (\bar{x}) : \langle f ^{\prime } (\bar{x}), h \rangle \ge 0 \},\) where \(H_2 (\bar{x})\) is defined in (24).

Definition 4.1

We say that a mapping g(x) is 2-regular at the point \(\bar{x} \in X\) along a vector \(h\in H_2 (\bar{x})\) if either \(I_{1} (h) = \emptyset \) and \(I_{2} (h) = \emptyset \), or there exists an element \(\xi = \xi (h) \in X\) such that

$$\begin{aligned} \langle g_i ' (\bar{x}), \xi \rangle< 0,\; \forall \; i \in I_{1} (h), \quad \langle {g}_i ^{\prime \prime } (\bar{x}) h , \xi \rangle < 0 ,\; \forall \; i \in I_{2} (h). \end{aligned}$$
(25)

Note that Definition 4.1 introduces a new 2-regularity constraint qualification, which can be viewed as another generalization of the MFCQ. Note that Assumption 1 given in the previous section is a special case of Definition 4.1 for \(I_{1} (h) = \{1, \ldots , r\}\) and \( I_{2} (h) = \{r+1, \ldots , m\} .\)
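In \({\mathbb R}^n\), checking condition (25) along a given h is a linear feasibility problem: since the inequalities are strict and positively homogeneous in \(\xi \), a solution exists iff the system \(\langle a_i, \xi \rangle \le -1\) is feasible, where the \(a_i\) collect \(g_i'(\bar{x})\), \(i \in I_1(h)\), and \(g_i''(\bar{x})h\), \(i \in I_2(h)\). A SciPy sketch (ours), run on the data of Example 7.1:

```python
import numpy as np
from scipy.optimize import linprog

def is_2_regular_along_h(rows):
    """rows: the vectors g_i'(xbar), i in I_1(h), and g_i''(xbar)h, i in I_2(h).
    (25) is solvable iff A xi <= -1 is feasible (strict inequalities rescale)."""
    A = np.asarray(rows, dtype=float)
    res = linprog(c=np.zeros(A.shape[1]), A_ub=A, b_ub=-np.ones(A.shape[0]),
                  bounds=[(None, None)] * A.shape[1])
    return res.success

# Example 7.1 with h = (1, 1, 1): rows are g_1'(0), g_2''(0)h, g_3''(0)h.
rows = [[-1, -1, 2], [-4, 2, 2], [2, -4, 2]]
print(is_2_regular_along_h(rows))   # True; e.g. xi = (2, 2, 1) works after rescaling
```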

Definition 4.2

We say that a mapping g(x) is 2-regular at the point \(\bar{x} \in X\) if, for every \(h\in H_2 (\bar{x})\), either \(I_{1} (h) = \emptyset \) and \(I_{2} (h) = \emptyset \), or there exists \(\xi = \xi (h) \in X\), such that (25) holds.

We illustrate Definition 4.2 in Example 7.1.

For \(x, h \in X\), \(\Vert h\Vert =1\), and \(\lambda (h) = (\lambda _i(h))_{i \in I_{1} (h) \cup I_{2} (h) }\), introduce a 2-factor-Lagrange function as

$$\begin{aligned} {L}_2 (x, \lambda (h), h) := f(x) +\mathop {\sum }\limits _{i \in I_{1} (h)} \lambda _i(h) g_i (x) + \mathop {\sum }\limits _{i \in I_{2} (h)} \lambda _i (h) g_i^\prime (x) h . \end{aligned}$$
(26)

Now we are ready to present necessary and sufficient conditions for problem (1).

Theorem 4.4

Let \(X ={\mathbb R}^n\), \(f(x) \in C^1(X)\), and \(g_i(x) \in C^{2}(X)\), \(i = 1, \ldots , m\). Assume that (10) holds and that mapping g(x) is 2-regular at the point \(\bar{x}\) along \(h \in H_2 (\bar{x})\).

Necessary conditions: If \(\bar{x}\) is a local minimizer of (1), then either

$$\begin{aligned} \langle f^\prime (\bar{x}), h \rangle > 0 \end{aligned}$$
(27)

or there exists \(\lambda ^*(h) = (\lambda ^*_i(h))_{i \in I_{1} (h) \cup I_{2} (h)}\) such that

$$\begin{aligned} {{L}_{2}}^\prime _x (\bar{x},\lambda ^*(h), h) = 0, \quad \lambda ^*(h) \ge 0. \end{aligned}$$
(28)

Sufficient conditions: If, in addition, g(x) is 2-regular at \(\bar{x}\) and, for any \(h \in H_2(\bar{x})\), either (27) holds or there exists \(\beta > 0\) such that (28) holds and

$$\begin{aligned} {{L}_{2}}^{\prime \prime }_{xx} (\bar{x}, \tilde{\lambda }^*(h), h) [h]^2 \ge \beta \Vert h\Vert ^2 , \qquad \tilde{\lambda }_i^*(h) = \left\{ \begin{array}{ll} \lambda ^*_i(h), &{}\quad \text{ if } i \in I_{1} (h)\\ \frac{ \lambda _i^*(h)}{3}, &{}\quad \text{ if } i \in I_{2} (h), \end{array} \right. \end{aligned}$$
(29)

then \(\bar{x}\) is a strict local minimizer of (1).

Proof

Necessary conditions. Consider an element \(h \in H_2(\bar{x})\) such that g(x) is 2-regular at the point \(\bar{x}\) along the vector h and divide our consideration into the following two cases:

Case 1: If \(I_{1} (h)=I_{2} (h) = \emptyset \), then \(\langle g_i^{\prime } (\bar{x}), h \rangle < 0\) for all \(i =1, \ldots , r\) and \(\langle g_i^{\prime \prime } (\bar{x}) h, h \rangle < 0\) for all \(i =r+1, \ldots , m\). First, we will prove that \(\langle f^\prime (\bar{x}), h \rangle \ge 0.\) Assume on the contrary that \(\langle f^\prime (\bar{x}), {h} \rangle < 0\).

Then, by Taylor expansion, \(f(\bar{x} + t {h}) < f(\bar{x})\) and \(g_i (\bar{x} + t {h}) < 0\), \(i =1, \ldots , m\), for all sufficiently small \(t>0\), which contradicts the assumption that \(\bar{x}\) is a local minimizer and proves \(\langle f^\prime (\bar{x}), h \rangle \ge 0\). If \(\langle f^\prime (\bar{x}), h \rangle > 0\), then (27) holds and we are done with the proof in Case 1. Otherwise, \(\langle f^\prime (\bar{x}), h \rangle = 0\).

If \( f^\prime (\bar{x}) \ne 0 \), there exists \(\bar{\xi }\) such that \(\langle f^\prime (\bar{x}), \bar{\xi }\rangle < 0\). Then \(g_i (\bar{x} + t {h} + t^{3/2}\bar{\xi }) < 0\), \(i =1, \ldots , m\), and \(f(\bar{x} + t {h} + t^{3/2}\bar{\xi }) < f(\bar{x})\) for all sufficiently small \(t>0\), which contradicts the assumption that \(\bar{x}\) is a local minimizer, so \( f^\prime (\bar{x}) = 0\), and, hence, (28) holds with \(\lambda _i^*(h) = 0\), \(i =1, \ldots , m\). Thus, in Case 1, either (27) or (28) holds.

Case 2: Consider the case when \(I_{1} (h) \cup I_{2} (h) \ne \emptyset \). First, we will prove that \(\langle f^\prime (\bar{x}), h \rangle \ge 0\). Assume on the contrary that \(\langle f^\prime (\bar{x}), {h} \rangle < 0\). By (25), there exists \(\xi = \xi (h)\) such that \(\langle g_i^\prime (\bar{x}), \xi \rangle < 0\), \(i \in I_1( h)\), and \(\langle g_i^{\prime \prime }(\bar{x}) {h}, \xi \rangle < 0\), \(i \in I_2( h)\). Hence, \(g_i (\bar{x} + t{h} + t^{3/2} \xi ) < 0\), \(i =1, \ldots , m\), and \(f(\bar{x} + t{h} + t^{3/2} \xi ) < f(\bar{x})\) for all sufficiently small \(t>0\), which contradicts the assumption that \(\bar{x}\) is a local minimizer. Hence, \(\langle f^\prime (\bar{x}), h \rangle \ge 0\). If \(\langle f^\prime (\bar{x}), h \rangle > 0\), then (27) holds, and we are done with the proof in Case 2. If not, then \(\langle f^\prime (\bar{x}), {h} \rangle = 0\). In this case, we will prove that \(\langle f^\prime (\bar{x}), d \rangle \ge 0\) for every d such that

$$\begin{aligned} \langle g_i^\prime (\bar{x}), d\rangle \le 0,\; i \in I_1 ( h), \quad \langle g_i^{\prime \prime }(\bar{x}) {h}, d \rangle \le 0, \; i \in I_2( h). \end{aligned}$$
(30)

Assume on the contrary that there exists d satisfying (30) such that \(\langle f^\prime (\bar{x}), d \rangle < 0\). Then by (25), there exists \(\xi = \xi (h)\) such that \(g_i (\bar{x} + t{h} + t^{3/2} d +t^{7/4} \xi ) \le 0\), \(i =1, \ldots , m\), and \(f(\bar{x} + t{h} + t^{3/2} d +t^{7/4} \xi ) < f(\bar{x})\) for all sufficiently small \(t>0\), which contradicts the assumption that \(\bar{x}\) is a local minimizer. Hence, \(\langle f^\prime (\bar{x}), d \rangle \ge 0\) for every d satisfying (30). Then, similarly to the proof of Theorem 4.1, we get (28) by using Lemma 3.3, which finishes the proof in the second case.

Sufficient conditions. Assume on the contrary that \(\bar{x}\) is not a strict local minimizer. Then there exists a sequence \(\{x_k\} \rightarrow \bar{x}\) such that \(f(x_k) \le f(\bar{x})\) and \(g(x_k) \le 0\). Using the same notation for a convergent subsequence, let \( \left\{ \frac{x_k - \bar{x}}{\Vert x_k - \bar{x} \Vert } \right\} \) converge to some \(\tilde{h}\). Then \(x_k = \bar{x} + \Vert x_k - \bar{x} \Vert \tilde{h} + w(x_k) = \bar{x} + t_k \tilde{h} + w(x_k), \;\) where \(\Vert w(x_k) \Vert = o ( \Vert x_k - \bar{x} \Vert )\) and \(t_k = \Vert x_k - \bar{x}\Vert \). Note that \(\tilde{h}= \frac{x_k - \bar{x} - w(x_k)}{t_k } \) satisfies the following:

$$\begin{aligned}&\langle g_i^{\prime } (\bar{x}), \tilde{h} \rangle =\frac{1}{t_k} \langle g_i^{\prime } (\bar{x}), x_k - \bar{x} - w(x_k) \rangle \\&\quad = \frac{1}{t_k} \left( g_i(x_k) - g_i(\bar{x}) \right) + o(t_k)/t_k , \quad i = 1, \ldots , r, \end{aligned}$$

so that, when \(k \rightarrow \infty \), \(\langle g_i^{\prime } (\bar{x}), \tilde{h} \rangle \le 0,\) \(i = 1, \ldots , r, \) and similarly, \( \langle g_i^{\prime \prime } (\bar{x}) \tilde{h}, \tilde{h} \rangle \le 0,\) \(i = r+1, \ldots , m.\) Hence, \(\tilde{h}\in H_2(\bar{x})\) and, by the assumption of the theorem, either (27) holds or there exists \(\beta > 0\) such that (28) and (29) hold. Consider two cases.

Case 1. If (27) holds, that is, \(\langle f^\prime (\bar{x}), \tilde{h} \rangle > 0\), then \( f(x_k) = f(\bar{x}) +\left\langle f^\prime (\bar{x}), t_k \tilde{h} + w(x_k)\right\rangle + o(t_k) > f(\bar{x}) \) for all sufficiently large k. This contradicts \(f(x_k) \le f(\bar{x})\), so this case cannot occur.

Case 2. If (28) holds, then \( 0 = \langle f^{\prime }(\bar{x}), \tilde{h} \rangle + \mathop {\sum }\limits _{i \in I_{1} (\tilde{h})} \lambda ^*_i(\tilde{h}) \langle g_i^\prime (\bar{x}), \tilde{h} \rangle + \mathop {\sum }\limits _{i \in I_{2} (\tilde{h})} \lambda ^*_i (\tilde{h}) \langle g_i^{\prime \prime } (\bar{x}) \tilde{h}, \tilde{h}\rangle =\langle f^{\prime } (\bar{x}), \tilde{h} \rangle , \) so that \(\langle f^\prime (\bar{x}), \tilde{h} \rangle = 0\). To simplify the notation, we let \(w = w(x_k)\), \(h_k = t_k \tilde{h} + w\), \(I_1= I_1 ( \tilde{h})\), and \(I_2= I_2 (\tilde{h})\). By the consideration above, \(g_i (x_k) \le 0\) and since there exists \(\lambda ^*(\tilde{h})\ge 0\) such that (28) holds, we get

$$\begin{aligned} f(x_k) - f(\bar{x}) &\ge f(x_k) - f(\bar{x}) + \mathop {\sum }\limits _{i \in I_1} \lambda ^*_i(\tilde{h}) g_i (x_k) + \mathop {\sum }\limits _{i \in I_2 } \frac{ \lambda ^*_i(\tilde{h}) g_i (x_k)}{t_k}\\ &= \langle f^\prime (\bar{x}), h_k \rangle + \frac{1}{2} \langle f^{\prime \prime }(\bar{x})h_k, h_k \rangle \\ &\quad + \mathop {\sum }\limits _{i \in I_1} \lambda ^*_i(\tilde{h}) \left( \langle g_i^\prime (\bar{x}), h_k \rangle + \frac{1}{2} \langle g_i^{\prime \prime }(\bar{x})h_k, h_k \rangle \right) \\ &\quad + \mathop {\sum }\limits _{i \in I_2} \lambda ^*_i(\tilde{h}) \left( \frac{1}{2} g_i^{\prime \prime } (\bar{x}) \frac{[h_k]^2}{t_k} + \frac{1}{3!} g_i^{\prime \prime \prime } (\bar{x}) \frac{[h_k]^3}{t_k} + \frac{o(t_k^3)}{t_k}\right) + o(t_k^2). \end{aligned}$$

Then \(\langle f^\prime (\bar{x}), \tilde{h} \rangle = 0\), \(\langle g^\prime _i (\bar{x}), \tilde{h} \rangle = 0\), \(i \in I_1\), and \(g_i^{\prime \prime } (\bar{x}) [\tilde{h}]^2 = 0\), \(i \in I_2\), yield

$$\begin{aligned} f(x_k) - f(\bar{x}) &\ge \left\langle f^\prime (\bar{x}) +\mathop {\sum }\limits _{i \in I_1} \lambda ^*_i (\tilde{h}) g_i^\prime (\bar{x}) + \mathop {\sum }\limits _{i \in I_2} \lambda ^*_i (\tilde{h}) g_i^{\prime \prime } (\bar{x})\tilde{h} , w \right\rangle \\ &\quad +\frac{t_k^2}{2} \left( f^{\prime \prime }(\bar{x}) [\tilde{h}]^2 + \mathop {\sum }\limits _{i \in I_1} \lambda ^*_i (\tilde{h}) g_i^{\prime \prime }(\bar{x})[\tilde{h}]^2 + \mathop {\sum }\limits _{i \in I_2} \frac{\lambda ^*_i (\tilde{h})}{3} g_i^{\prime \prime \prime } (\bar{x}) [\tilde{h}]^3 \right) + o(t_k^2). \end{aligned}$$

The assumptions of the theorem and the last inequalities imply

$$\begin{aligned}&f(x_k) - f(\bar{x}) \ge \langle {{L}_{2}}^\prime _x (\bar{x},\lambda ^*(\tilde{h}), \tilde{h}) , w \rangle \\&\quad + \frac{t_k^2}{2} {{L}_{2}}^{\prime \prime }_{xx} (\bar{x},\tilde{\lambda }^*(\tilde{h}), \tilde{h}) \; [\tilde{h}]^2 + o(t_k^2) \ge \frac{\beta }{2} t_k^2 \Vert \tilde{h}\Vert ^2 + o(t_k^2) > 0, \end{aligned}$$

which contradicts the assumption \(f(x_k) \le f(\bar{x})\). Hence, \(\bar{x}\) is a strict local minimizer. \(\square \)

Corollary 4.1

If \(I_1(h)\cup I_2(h)= \emptyset \), then equation (28) in the necessary conditions of Theorem 4.4 reduces to \(f^\prime (\bar{x})=0\).

The proof of the corollary follows from the proof of necessary conditions in Theorem 4.4.

5 Optimality Conditions in the General Case of Degeneracy without Assumption (10)

To simplify considerations in this part of the paper, assume that \(X = {\mathbb R}^n\). In this section, we consider a general case of degeneracy without assumption (10).

We need the following additional notation.

Let \(H_g(\bar{x}) := \{ h \in {\mathbb R}^n \, : \, \langle g_i^\prime (\bar{x}), h \rangle \le 0, \, i \in I(\bar{x}) \}.\) For some fixed element \(h \in H_g(\bar{x})\), we define a set of indices \(\; I_1(\bar{x}, h) := \{ i \in I(\bar{x}) \, : \, \langle g_i^\prime (\bar{x}), h \rangle = 0\} \) and assume that \( | I_1(\bar{x}, h) | =m_1(h) = m_1 \ne 0\). We start with the construction of \(l \le m_1\) proper cones generated by the vectors \( g_j^\prime (\bar{x})\) with indices j from the set \(I_1(\bar{x}, h)\). The cones are determined in such a way that, for every \(j \in I_1(\bar{x}, h)\), the corresponding \( g_j^\prime (\bar{x})\) is used in defining at least one cone and all cones are different. For constructing the cone with number k, \(k=1, \ldots , l\), pairwise distinct indices \(k_1, \ldots , k_{r_k} \in I_1(\bar{x}, h)\) are used so that the corresponding vectors \(g^\prime _{k_1} (\bar{x}), \ldots , g^\prime _{k_{r_k}} (\bar{x})\) generate the largest proper cone. As a result, there exists an element \(\gamma _k \in {\mathbb R}^n\) such that \(\langle g_j^\prime (\bar{x}), \gamma _k \rangle < 0\), \(j = k_1, \ldots , k_{r_k}\), and, for every \(j \in J_k (\bar{x}, h)\), where \( J_k (\bar{x}, h) := I_1(\bar{x}, h) \backslash \{k_1, \ldots , k_{r_k}\},\) the following holds: \( -g^\prime _j (\bar{x}) = \alpha _{j k_1} g^\prime _{k_1} (\bar{x}) + \ldots + \alpha _{j k_{r_k}} g^\prime _{k_{r_k}} (\bar{x}) , \) where \(\alpha _{j k_1} \ge 0, \ldots , \alpha _{j k_{r_k}} \ge 0\). For each \(j \in J_k (\bar{x}, h) \), introduce \(\, \tilde{g}_{j} (x) := {g}_{j}(x) +\alpha _{j k_1} g_{k_1} (x) + \ldots + \alpha _{j k_{r_k}} g_{k_{r_k}} (x), \) and get l sets consisting of functions \( g_{k_1} (x), \ldots , g_{k_{r_k}} (x), \tilde{g}_{j} (x)\), \(j \in J_k (\bar{x}, h)\), such that

$$\begin{aligned} \tilde{g}^\prime _{j} (\bar{x}) = 0, \quad j \in J_k (\bar{x}, h). \end{aligned}$$
(31)

Note that conditions (31) resemble conditions (10). For every \(k = 1, \ldots , l\), define a set \(S_k\) as follows:

$$\begin{aligned}&S_k := \{ x \in {\mathbb R}^n \, : \, g_{k_1} (x) \le 0, \ldots , g_{k_{r_k}} (x)\le 0,\; \tilde{g}_{j} (x)\le 0,\, j \in J_k (\bar{x}, h), \; \nonumber \\&g_i(x)\le 0, \, i \in I(\bar{x}) \backslash I_1(\bar{x}, h) \}. \end{aligned}$$
(32)

Consider an example.

Example 5.1

Let \(X={\mathbb R}^3\), \(g_1(x) = - x_1\), \(g_2(x) = - x_2\), \(g_3(x) = x_2 - x_1^2 + x_2^2 + x_3^2\), and \(\bar{x} = (0, 0, 0)^T\). In this example, \(h = (1, 0, 1)^T\), \(m=3\), \(m_1 = 2\), \(l=2\), and \(I_1 (0, h)= \{2, 3\}\). For \(k=1\), we have \(g_{1_1}(x) = g_2 (x)\), \(r_1 = 1\), and \(\tilde{g}_{3}(x) \) \(= g_3 (x)+ g_2(x) = - x_1^2 + x_2^2 + x_3^2\). For \(k=2\), we get \(r_2 = 1\), \(g_{2_1}(x) = g_3 (x)\), and \(\tilde{g}_{2}(x) = g_2 (x) + g_3(x)= - x_1^2 + x_2^2 + x_3^2\). As a result, there are two following systems of inequalities, (A) and (B), that define the sets \(S_1\) and \(S_2\), respectively:

$$\begin{aligned} \text{(A) } \quad \left\{ \begin{array}{l} g_1(x) = - x_1 \le 0 \\ g_{2}(x) = - x_2 \le 0 \\ \tilde{g}_{3}(x) = - x_1^2 + x_2^2 + x_3^2 \le 0. \end{array} \right. \qquad \text{(B) } \quad \left\{ \begin{array}{l} g_1(x) = - x_1 \le 0 \\ g_{3}(x) = x_2 - x_1^2 + x_2^2 + x_3^2 \le 0 \\ \tilde{g}_{2}(x) = - x_1^2 + x_2^2 + x_3^2 \le 0. \end{array} \, \right. \end{aligned}$$
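A quick numerical sanity check (ours) of the decomposition \(S = S_1 \cap S_2\) established in Lemma 5.1 below: sample random points and compare membership in S with joint membership in systems (A) and (B).

```python
import numpy as np

rng = np.random.default_rng(1)

g1 = lambda x: -x[0]
g2 = lambda x: -x[1]
g3 = lambda x: x[1] - x[0]**2 + x[1]**2 + x[2]**2
gt = lambda x: -x[0]**2 + x[1]**2 + x[2]**2     # tilde g_3 = tilde g_2 = g_2 + g_3

for x in rng.uniform(-1.0, 1.0, size=(20000, 3)):
    in_S  = g1(x) <= 0 and g2(x) <= 0 and g3(x) <= 0
    in_S1 = g1(x) <= 0 and g2(x) <= 0 and gt(x) <= 0    # system (A)
    in_S2 = g1(x) <= 0 and g3(x) <= 0 and gt(x) <= 0    # system (B)
    assert in_S == (in_S1 and in_S2)
print("S == S1 ∩ S2 verified on all samples")
```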

We will need the following lemma.

Lemma 5.1

\(S =\mathop {\bigcap }\limits _{k=1}^l S_k\), where \(S_k\) are defined in (32).

Proof

The proof of the lemma follows from the property that, for any \(k=1, \ldots , l-1\), the definition of the cone with number \(k+1\) implies that at least one function \(g_j (x)\), \(j \in I_1(\bar{x}, h)\), is used in the definition of the set \(S_k\) and not in that of \(S_{k+1}\), and at least one function \(g_j (x)\), \(j \in I_1(\bar{x}, h)\), is used in defining \(S_{k+1}\) and not \(S_k\). The process of defining the cones also implies that each index j from the set \(I_1 (\bar{x}, h)\) is used at least once. Then, by the definition of \(S_k\), \( \mathop {\bigcap }\limits _{k=1}^l S_k \subseteq \mathop {\bigcap }\limits _{i=1}^m A_i = S\), where \(A_i = \{ x \in {\mathbb R}^n \, : \, g_{i} (x) \le 0\},\) \(i = 1, \ldots , m\). At the same time, for every \(k = 1, \ldots , l\), \(S \subseteq S_k\), and, hence, \(S \subseteq \mathop {\bigcap }\limits _{k=1}^l S_k.\) Thus \(S = \mathop {\bigcap }\limits _{k=1}^l S_k\) holds. \(\square \)

Note that the functions used in the definition of the sets \(S_k\) satisfy conditions (31) and \(\langle g_j^\prime (\bar{x}), h \rangle < 0\), \(j \in I(\bar{x})\backslash I_1 (\bar{x}, h)\). Since Lemma 5.1 implies that problem (1) can be written as

$$\begin{aligned} \mathrm{min} \; f (x), \quad \text{ s.t. } \quad x \in \bigcap _{k=1}^l S_k, \end{aligned}$$

optimality conditions for problem (1), given below in Theorem 5.1, are formulated in terms of the functions used in the definition of the sets \(S_k\) under an assumption that guarantees \( \mathop {\bigcap }\limits _{k=1}^l S_k \ne \{\bar{x}\}\). To state the theorem, we need to introduce some additional notation and definitions.

Assume that there exists a vector \(h \in H_g(\bar{x})\), satisfying the following inequalities,

$$\begin{aligned} \langle \tilde{g}_{j}^{\prime \prime } (\bar{x})h, h \rangle \le 0 , \quad j\in J_k (\bar{x}, h), \quad k=1, \ldots , l. \end{aligned}$$

If no such h exists, then \(\bar{x}\) is an isolated feasible point for problem (1). Recall that we consider indices \(k_j\) from the set \(I_1(\bar{x}, h)\) and, for every \(k = 1, \ldots , l\), define

$$\begin{aligned}&I_0^{1 \, k} (\bar{x}, h) := \{k_1, \ldots , k_{r_k}\},\quad I_0^{2\, k} (\bar{x}, h) := \{i \in J_k (\bar{x}, h) \, : \, \langle \tilde{g}_i^{\prime \prime } (\bar{x}) h, h \rangle = 0\},\\&I_0^{1} (\bar{x}, h) := \mathop {\bigcup }\limits _{k=1}^l I_0^{1 \, k} (\bar{x}, h) ,\quad I_0^{2} (\bar{x}, h) := \mathop {\bigcup }\limits _{k=1}^l I_0^{2 \, k} (\bar{x}, h). \end{aligned}$$

Definition 5.1

We say that a mapping \(g(x): {\mathbb R}^n \rightarrow {\mathbb R}^m\) is 2–regular at the point \(\bar{x} \in {\mathbb R}^n\) along a vector \(h\in H_g(\bar{x})\) if there exists an element \(\xi \in {\mathbb R}^n\) satisfying the following inequalities,

$$\begin{aligned}&\langle g_{k_1} ' (\bar{x}), \xi \rangle< 0, \ldots , \langle g_{k_{r_k}} ' (\bar{x}), \xi \rangle< 0 ,\nonumber \\&\quad \langle \tilde{g}_{j} ^{\prime \prime } (\bar{x}) h , \xi \rangle < 0, \quad j \in I_0^{2} (\bar{x}, h) , \quad k=1, \ldots , l. \end{aligned}$$

Note that Definition 5.1 introduces another 2-regularity constraint qualification, which can be viewed as a generalization of the MFCQ.

Definition 5.2

We say that a mapping \(g(x): {\mathbb R}^n \rightarrow {\mathbb R}^m\) is tangent 2–regular at the point \(\bar{x} \in {\mathbb R}^n\) along a vector \(h\in H_g(\bar{x})\) if, for any \(\xi \in {\mathbb R}^n\), satisfying the following inequalities,

$$\begin{aligned}&\langle g_{k_1} ' (\bar{x}), \xi \rangle \le 0, \ldots , \langle g_{k_{r_k}} ' (\bar{x}), \xi \rangle \le 0 ,\nonumber \\&\quad \langle \tilde{g}_{j} ^{\prime \prime } (\bar{x}) h , \xi \rangle \le 0, \quad j \in I_0^{2} (\bar{x}, h) , \quad k=1, \ldots , l, \end{aligned}$$
(33)

there exists a set of feasible points \(x(\alpha ) \in S\) in the form \(x (\alpha )= \bar{x} + \alpha h + \omega (\alpha ) \xi + \eta (\alpha )\), where \(\alpha > 0\) is sufficiently small, \(\omega (\alpha ) = o(\alpha )\), \(\alpha ^2 / \omega (\alpha ) \rightarrow 0\) as \(\alpha \rightarrow 0^+\), and \(\Vert \eta (\alpha ) \Vert = o( \omega (\alpha ))\).

An example of \(\omega (\alpha )\) in Definition 5.2 is \(\omega (\alpha ) = \alpha ^{1+\varepsilon }\) with \(\varepsilon \in ]0, 1[\).

To formulate the next theorem, we introduce a generalized 2-factor-Lagrange function in the form:

\( L_2(x, \lambda (h), h) := f(x) + \mathop {\sum }\limits _{i \in I_0^{1} (\bar{x}, h)} \lambda _i (h) g_i(x) + \mathop {\sum }\limits _{i \in I_0^{2} (\bar{x}, h)} \tilde{\lambda }_i (h) \langle \tilde{g}_i^\prime (x), h \rangle . \)

Theorem 5.1

Assume that \(\bar{x}\) is a local minimizer of the problem (1), \(f \in C^1({\mathbb R}^n)\), and \(g \in C^2({\mathbb R}^n)\). Assume that g(x) is tangent 2-regular at the point \(\bar{x}\) along a vector \(h \in H_g(\bar{x})\) and \(\langle f^\prime (\bar{x}), h \rangle = 0\). Then there exist coefficients \( \lambda _i^* (h)\ge 0\), \(i \in I_0^{1} (\bar{x}, h)\), \(\tilde{\lambda }_i^* (h)\ge 0\), \(i \in I_0^{2} (\bar{x}, h)\), such that

$$\begin{aligned} {L_2}^\prime _x (\bar{x}, \lambda ^*(h), h) = f^\prime (\bar{x}) + \mathop {\sum }\limits _{i \in I_0^{1} (\bar{x}, h)} \lambda ^*_i (h) g^\prime _i(\bar{x}) + \mathop {\sum }\limits _{i \in I_0^{2} (\bar{x}, h)} \tilde{\lambda }^*_i (h) \tilde{g}_i^{\prime \prime } (\bar{x}) [h] =0. \end{aligned}$$
(34)

The proof of Theorem 5.1 is similar to the proof of necessary conditions in Theorem 4.4 with an assumption that g(x) is tangent 2-regular at the point \(\bar{x}\) along a vector \(h \in H_g(\bar{x})\) and an additional property:

\(\left( T_S (\bar{x})\right) ^* =\left( \mathop {\bigcap }\limits _{k=1}^l T_{S_k} (\bar{x}) \right) ^* = \mathrm{cone} \left\{ g^\prime _i(\bar{x}) ,\tilde{g}_j^{\prime \prime } (\bar{x}) [h], \; {i \in I_0^{1} (\bar{x}, h)}, \; j \in I_0^{2} (\bar{x}, h)\right\} ,\) where \(T_M (\bar{x})\) is the tangent cone to the set M at \(\bar{x}\), and \((T_M (\bar{x}))^*\) is its conjugate.

Remark 5.1

Note that \(I_0^{1} (\bar{x}, h) \subseteq I_1 (\bar{x}, h)\) and \(I_0^{2} (\bar{x}, h) \subseteq I_1 (\bar{x}, h)\), and recall the definition of functions \(\; \tilde{g}_j (x) = {g}_j (x) +\alpha _{j k_1} g_{k_1} (x) + \ldots + \alpha _{j k_{r_k}} g_{k_{r_k}} (x) ,\) \(j \in J_k (\bar{x}, h) \), where \(\alpha _{j k_1} \ge 0, \ldots , \alpha _{j k_{r_k}} \ge 0\). Then the statement of Theorem 5.1 can be written in the form: \(\, f^\prime (\bar{x}) + \mathop {\sum }\limits _{i \in I_1 (\bar{x}, h)} \lambda _i^* g_i^\prime (\bar{x}) + \mathop {\sum }\limits _{i \in I_1 (\bar{x}, h)} \gamma ^*_i {g}_i^{\prime \prime } (\bar{x}) [h] =0, \,\) where \( \lambda ^*_i \ge 0\) and \(\gamma ^*_i \ge 0\).

Example 5.1 (continued). Let \(f=x_1 + x_2 - x_3\). In this example, we have \(\bar{x} = 0\), \(h = (1, 0, 1)^T\), \(\langle f^\prime (\bar{x}), h \rangle = 0\), and \(\tilde{g}_{2}^{\prime \prime } (\bar{x})[h] = \tilde{g}_{3}^{\prime \prime } (\bar{x})[h] = (-2, 0, 2)^T\). Also, using the introduced notation, we get the following sets: \(I_0^{1 \, 1} (\bar{x}, h) = \{2\},\) \( I_0^{1 \, 2} (\bar{x}, h) = \{3\},\) \( I_0^{2 \, 1} (\bar{x}, h) = \{3\}, \) \( I_0^{2 \, 2} (\bar{x}, h) = \{2\},\) \(I_0^{1} (\bar{x}, h) = \{2, 3 \},\) and \(I_0^{2} (\bar{x}, h) = \{2, 3\}\). Note that the mapping g(x) is tangent 2-regular at the point \(\bar{x}\) along the vector h; hence, all conditions of Theorem 5.1 are satisfied. Then, there exist multipliers \(\lambda ^*_2(h)=1\) and \(\lambda ^*_3(h)=0\) for the indices in \(I_0^{1} (\bar{x}, h)\) and multipliers \(\tilde{\lambda }^*_3 (h) = 0\) and \(\tilde{\lambda }^*_2 (h) = \frac{1}{2}\) for the indices in \( I_0^{2}(\bar{x}, h)\) such that condition (34) holds in the following form:

\( f^\prime (0) + \lambda ^*_2 (h) g_2^\prime (0) + \lambda ^*_3 (h) g_3^\prime (0) + \tilde{\lambda }^*_3 (h) \tilde{g}_{3}^{\prime \prime } (0) [h] + \tilde{\lambda }^*_2 (h) \tilde{g}_{2}^{\prime \prime } (0) [h] =0. \)
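Condition (34) for this example can be verified directly; a short numerical check (ours):

```python
import numpy as np

grad_f = np.array([1.0, 1.0, -1.0])     # f'(0) for f = x1 + x2 - x3
grad_g2 = np.array([0.0, -1.0, 0.0])    # g_2'(0)
grad_g3 = np.array([0.0, 1.0, 0.0])     # g_3'(0)
gt_h = np.array([-2.0, 0.0, 2.0])       # tilde g_2''(0)[h] = tilde g_3''(0)[h], h = (1,0,1)

lam2, lam3 = 1.0, 0.0                   # multipliers for the indices in I_0^1(0, h)
lt3, lt2 = 0.0, 0.5                     # multipliers for the indices in I_0^2(0, h)
print(grad_f + lam2 * grad_g2 + lam3 * grad_g3 + (lt3 + lt2) * gt_h)   # [0. 0. 0.]
```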

6 Future Work for \(p>2\)

Similar to the approach described in Sect. 4.2, the constraints of Problem (1) can be reduced to equivalent ones that satisfy the following relations (without notation change):

$$\begin{aligned} \begin{array}{ll} g^\prime _i (\bar{x}) \ne 0, \; i = 1, \ldots , r_1, &{} {g}_i '(\bar{x}) = 0, \; i=r_1+1, \ldots , m,\\ g^{\prime \prime }_i (\bar{x}) \ne 0, \; i = r_1+1, \ldots , r_2, &{} g^{\prime \prime }_i (\bar{x}) = 0, \; i = r_2+1, \ldots , m, \quad \\ \cdots &{}\cdots \\ {g}_i ^{(p-1)}(\bar{x}) \ne 0, \; i=r_{p-2}+1, \ldots , r_{p-1}, \qquad &{} {g}_i ^{(p-1)}(\bar{x}) = 0, \; i=r_{p-1}+1, \ldots , m\\ {g}_i ^{(p)}(\bar{x}) \ne 0, \; i=r_{p-1}+1, \ldots , m. \end{array} \end{aligned}$$

Introduce the sets:

\(I_1 (\bar{x}):= \{ 1, \ldots , r_1\}, \) \(I_2 (\bar{x}):= \{ r_1+1, \ldots , r_2 \}\), \(\ldots ,\) and \(I_p(\bar{x}):= \{ r_{p-1}+1, \ldots , m\}.\)

Definition 6.1

Assume that there exists h such that \( \langle g_i^\prime (\bar{x}), h \rangle = 0\) for all \(i \in I_1 (\bar{x})\), \(g_i^{\prime \prime } (\bar{x}) [h ]^2 = 0 \) for all \(i \in I_2 (\bar{x})\), \(\ldots \), and \(g_i^{(p)} (\bar{x}) [h ]^p = 0\) for all \(i \in I_p (\bar{x})\). We say that mapping \(g(x): {\mathbb R}^n \rightarrow {\mathbb R}^m \) is p–regular at the point \(\bar{x} \in {\mathbb R}^n\) along the vector h, if there exists \(\xi \in {\mathbb R}^n\), which satisfies the following inequalities: \(\langle g_{i} ' (\bar{x}), \xi \rangle < 0\) for all \( i \in I_1(\bar{x}),\) \( \langle g_{i} ^{\prime \prime } (\bar{x}) h , \xi \rangle < 0 \) for all \( i \in I_2(\bar{x}),\) \( \ldots \), and \(\langle g_{i} ^{(p)}(\bar{x}) [h]^{p-1} , \xi \rangle < 0\) for all \(i \in I_p(\bar{x}) .\)

Theorem 6.1

Assume that \(\bar{x}\) is a local minimizer of problem (1), \(f \in C^1({\mathbb R}^n)\), and \(g \in C^{p+1}({\mathbb R}^n)\). Assume that mapping g(x) is p–regular at \(\bar{x}\) along a vector \(h \in H_g(\bar{x})\) and \(\langle f^\prime (\bar{x}), h \rangle = 0\). Then there exists \(\lambda ^*(h)\) \( = \left( \lambda _i^* (h) \right) _{i \in I_{1} (\bar{x}) \bigcup I_{2} (\bar{x}) \bigcup \cdots \bigcup I_p(\bar{x})}\) such that \( \lambda ^*(h) \ge 0\) and

$$\begin{aligned}&f^\prime (\bar{x}) + \mathop {\sum }\limits _{i \in I_1 (\bar{x})} \lambda _i^* (h) g_i^\prime (\bar{x}) + \mathop {\sum }\limits _{i \in I_2 (\bar{x})} \lambda _i^* (h) {g}_i^{\prime \prime } (\bar{x}) h \\&\quad + \cdots +~\mathop {\sum }\limits _{i \in I_p (\bar{x})} \lambda _i^* (h) {g}_i^{(p)} (\bar{x}) [h]^{p-1}=0. \end{aligned}$$

The proof of Theorem 6.1 is similar to the proof of necessary conditions in Theorem 4.4 with an additional property:

\(\left( T_S (\bar{x})\right) ^* = \mathrm{cone} \Bigg \{g^\prime _i(\bar{x}), i \in I_1 (\bar{x}), {g}_i^{\prime \prime } (\bar{x}) h, i \in I_2 (\bar{x}),\ldots , {g}_i^{(p)} (\bar{x}) [h]^{p-1}, i \in I_p (\bar{x})\Bigg \}.\)

A more general version of optimality conditions given in Theorem 6.1 can be derived under an assumption that \(g(x): {\mathbb R}^n \rightarrow {\mathbb R}^m \) is tangent p–regular (\(p>2\)) at \(\bar{x} \in {\mathbb R}^n\) along a vector \(h\in H_g(\bar{x})\), which is a generalization of Definition 5.2. Optimality conditions given in Theorem 4.2 can also be expanded to the case \(p>2\) under a generalized version of Assumption 1:

There exist vectors \(\xi , \eta \in X\), \(\Vert \xi \Vert =\Vert \eta \Vert =1\), such that \(\langle g_i ' (\bar{x}), \xi \rangle< 0, \langle g_i ' (\bar{x}), \eta \rangle < 0,\) \(i \in I_1 (\bar{x})\); \( \langle {g}_i ^{\prime \prime } (\bar{x}) h , \xi \rangle < 0,\) \( \langle {g}_i ^{\prime \prime } (\bar{x}) h , \eta \rangle > 0,\) \(i \in I_2 (\bar{x})\); \(\ldots , \) \( \langle {g}_i ^{(p)} (\bar{x}) [h]^{p-1} , \xi \rangle < 0,\) and \( \langle {g}_i ^{(p)} (\bar{x}) [h]^{p-1} , \eta \rangle <0\), if p is odd, or \( \langle {g}_i ^{(p)} (\bar{x}) [h]^{p-1} , \eta \rangle >0\), if p is even, \(i \in I_p (\bar{x})\).

7 Examples

Example 7.1. This example illustrates Theorem 4.4. Consider the following problem:

$$\begin{aligned} \begin{array}{ll} {\mathop {\hbox {min}}\limits _{x \in {\mathbb {R}}^3}} & 3x_1 - 3x_3 + x_1^2 + x_2^2 \quad \mathrm{s.t.} \quad g_1(x) = - x_1 - x_2 + 2x_3 + |x_1|^{5/2} \le 0, \\ & g_2(x) = - 2 x_1^2 + x_2^2 + x_3^2 + |x_2|^{7/2} \le 0, \quad g_3(x) = x_1^2 - 2 x_2^2 + x_3^2 + |x_3|^{7/2} \le 0. \end{array} \end{aligned}$$
(35)

Necessary conditions. Note that \(\bar{x}=0\) is a minimizer of Problem (35). To verify the assumptions of Theorem 4.4, notice that \(r=1\) in (10) and vectors \(h = (h_1, h_2, h_3)\) in the set \(H_2(0)\) satisfy the following inequalities: \(\langle g_1'(0), h \rangle = -h_1 - h_2 + 2h_3 \le 0\), \(\langle g_2''(0) h, h \rangle = -4h_1^2 + 2h_2^2 + 2h_3^2 \le 0\), and \(\langle g_3''(0) h, h \rangle = 2h_1^2 - 4h_2^2 + 2h_3^2 \le 0\). Consider an element \(h = (a, a, a)\), \(a \ne 0\), which is in the set \(H_2(0)\) and satisfies \(\langle f'(0), h \rangle = 0\), \(\langle g_1'(0), h \rangle = 0\), and \(\langle g_2''(0) h, h \rangle = \langle g_3''(0) h, h \rangle = 0\). (For vectors h for which some of these relations hold as strict inequalities, the necessary optimality conditions hold in the form (27).) The mapping \(g(x) = (g_1(x), g_2(x), g_3(x))\) is 2-regular at \(\bar{x}=0\) along the vector h because there exists a vector \(\xi\), for example, \(\xi = (2, 2, 1)\), such that \(\langle g_1'(\bar{x}), \xi \rangle = -\xi_1 - \xi_2 + 2\xi_3 < 0\), \(\langle g_2''(\bar{x}) h, \xi \rangle = -4a\xi_1 + 2a\xi_2 + 2a\xi_3 < 0\), and \(\langle g_3''(\bar{x}) h, \xi \rangle = 2a\xi_1 - 4a\xi_2 + 2a\xi_3 < 0\), where \(a > 0\). Therefore, all assumptions of Theorem 4.4 hold. Hence, for \(h = (a, a, a) \in H_f(0)\), there exist multipliers \(\lambda_i^*(h) \ge 0\), \(i \in I_1(h) \cup I_2(h) = \{1, 2, 3\}\), such that (28) holds. Indeed, taking \(\lambda_1^*(h) = 1\), \(\lambda_2^*(h) = 1/(2a)\), \(\lambda_3^*(h) = 0\), we get \(f'(0) + (1)\, g_1'(0) + (1/(2a))\, g_2''(0)[h] = 0\).
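As a quick numerical sanity check of the last identity (with \(a = 1\)):

```python
import numpy as np

a = 1.0
h = np.array([a, a, a])
fprime  = np.array([3.0, 0.0, -3.0])    # f'(0)
g1prime = np.array([-1.0, -1.0, 2.0])   # g_1'(0)
G2      = np.diag([-4.0, 2.0, 2.0])     # g_2''(0)
print(fprime + 1.0 * g1prime + (1.0 / (2 * a)) * (G2 @ h))   # [0. 0. 0.]
```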

Sufficient conditions. For \(\bar{x}=0\), consider \(h = (a, a, a) \in H_f(0)\), where a is a fixed real number, \(a \ne 0\), and, using (26), define a 2-factor-Lagrange function with \(\tilde{\lambda}_1^*(h) = 1\), \(\tilde{\lambda}_2^*(h) = 1/(6a)\), and \(\tilde{\lambda}_3^*(h) = 0\) as \(L_2(x, \tilde{\lambda}^*(h), h) = 3x_1 - 3x_3 + x_1^2 + x_2^2 + (1)(-x_1 - x_2 + 2x_3 + |x_1|^{5/2}) + \frac{1}{6a}\left(-4x_1 + 2x_2 + 2x_3 + (7/2)|x_2|^{5/2}\,\mathrm{sgn}(x_2)\right) a\). Note that, by the above, (28) holds and there exists \(\beta > 0\) such that

\((L_2)^{\prime\prime}_{xx}(0, \tilde{\lambda}^*(h), h)[h]^2 = 4a^2 \ge \beta \Vert h \Vert^2\), so (29) is satisfied. Then, by Theorem 4.4, \(\bar{x}=0\) is a strict local minimizer of Problem (35).
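The value \((L_2)^{\prime\prime}_{xx}(0, \tilde{\lambda}^*(h), h)[h]^2 = 4a^2\) can be reproduced by a central finite difference along h; the \(|x_i|^{5/2}\)-terms contribute nothing to the second derivative at 0. A short sketch with \(a = 1\) (the computed value equals \(4a^2\) up to \(O(\sqrt{t})\)):

```python
import numpy as np

a = 1.0
h = np.array([a, a, a])

def L2(x):
    # 2-factor-Lagrange function of Example 7.1 with lambda_1 = 1,
    # lambda_2 = 1/(6a), lambda_3 = 0, as in the text
    f  = 3*x[0] - 3*x[2] + x[0]**2 + x[1]**2
    g1 = -x[0] - x[1] + 2*x[2] + abs(x[0])**2.5
    dg2_h = (-4*x[0] + 2*x[1] + 2*x[2]
             + 3.5 * abs(x[1])**2.5 * np.sign(x[1])) * a
    return f + g1 + dg2_h / (6 * a)

t = 1e-4   # central difference for the second directional derivative at 0
d2 = (L2(t * h) - 2 * L2(np.zeros(3)) + L2(-t * h)) / t**2
print(d2)  # ~ 4.02, i.e. 4*a**2 as in (29)
```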

Example 7.2. Consider the following problem that illustrates Theorem 5.1:

$$\begin{aligned} \begin{array}{ll} {\mathop {\hbox {min}}\limits _{x \in {\mathbb {R}}^3}} & x_1 + x_2 - x_3 + x_1^2 + x_2^2 + x_3^2 \quad \mathrm{s.t.} \quad g_1(x) = - x_1 - |x_2|^{5/2} \le 0, \quad g_2(x) = - x_2 - |x_3|^{5/2} \le 0, \\ & g_3(x) = x_2 - x_1^2 + x_2^2 + x_3^2 - |x_1|^{5/2} \le 0, \quad g_4(x) = x_1 - x_3 \le 0. \end{array} \end{aligned}$$

Results presented in [16] cannot be applied here since there is no \(p>1\) such that the constraints are 2p–times continuously differentiable. Optimality conditions given in [19, 22] are also not applicable to this problem since their assumptions are not satisfied at \(\bar{x} = (0, 0, 0)^T\). Consider \(h = (1, 0, 1)^T \in H_g (0)\). Then \(I_1 (0, h) = \{ 2, 3, 4\}\), so \(m_1 = |I_1 (0, h)| =3\). For \(k=1\), we get \(r_1 = 2\), \(g_{1_1}(x) = g_2 (x)\), \(g_{1_2}(x) = g_4 (x)\), and \(\tilde{g}_{3} (x)= g_3 (x)+g_2(x)= - x_1^2 +x_2^2+ x_3^2- |x_1|^{5/2} - |x_3|^{5/2} .\) For \(k=2\), we get \(r_2 = 2\), \(g_{2_1}(x) = g_3 (x)\), \(g_{2_2}(x) = g_4 (x)\), and \(\tilde{g}_{2} (x)= - x_1^2 +x_2^2+ x_3^2- |x_1|^{5/2} - |x_3|^{5/2}.\) Note that \(\langle f^\prime (\bar{x}), h \rangle = 0\), \(\tilde{g}_{3}^{\prime \prime } (\bar{x})[h] = \tilde{g}_{2}^{\prime \prime } (\bar{x})[h] = (-2, 0, 2)^T\), and \(I_0^{2}(0, h) = \{2, 3\}\).

To verify Definition 5.2, consider \(\xi = (\xi_1, \xi_2, \xi_3)\). The inequalities (33) in this example, \(\langle g_{1_1}'(0), \xi \rangle \le 0\), \(\langle g_{1_2}'(0), \xi \rangle \le 0\), \(\langle g_{2_1}'(0), \xi \rangle \le 0\), \(\langle \tilde{g}_3''(0) h, \xi \rangle \le 0\), and \(\langle \tilde{g}_2''(0) h, \xi \rangle \le 0\), reduce to \(\xi_2 = 0\) and \(\xi_1 = \xi_3\). Now let \(\xi = (\xi_1, 0, \xi_1)\), \(\omega(\alpha) = \alpha^{3/2}\), and \(\eta(\alpha) = 0\) to define \(x(\alpha) = \bar{x} + \alpha h + \omega(\alpha)\xi + \eta(\alpha) = (\alpha + \alpha^{3/2}\xi_1, 0, \alpha + \alpha^{3/2}\xi_1)\) and get

$$\begin{aligned} \begin{array}{l} g_1(x(\alpha )) = -\alpha - \alpha ^{3/2} \xi _1 \le 0, \quad g_2(x(\alpha )) = - |\alpha + \alpha ^{3/2} \xi _1|^{5/2} \le 0, \\ g_3(x(\alpha )) = - |\alpha + \alpha ^{3/2} \xi _1|^{5/2}\le 0, \quad g_4(x(\alpha )) = 0, \end{array} \end{aligned}$$

for all sufficiently small \(\alpha > 0\). Therefore, g(x) is tangent 2-regular at the point \(\bar{x}\) along the vector h. Since all conditions of Theorem 5.1 are satisfied, there exist \(\lambda_i^*(h) \ge 0\), \(i \in I_0^1(0, h) = \{2, 3, 4\}\), and \(\tilde{\lambda}_i^*(h) \ge 0\), \(i \in I_0^2(0, h) = \{2, 3\}\), such that (34) holds:

$$\begin{aligned}&f^\prime (0) + \lambda ^*_2 (h) g_{2}^\prime (0) + \lambda ^*_3 (h) g_{3}^\prime (0) + \lambda ^*_4 (h) g_{4}^\prime (0) + \tilde{ \lambda }^*_{3} (h) \tilde{g}_{3}^{\prime \prime } (0) [h]\nonumber \\&\quad + \tilde{\lambda }^*_{2} (h) \tilde{g}_{2}^{\prime \prime }(0) [h]=0. \end{aligned}$$

Indeed, the last equation is satisfied, for example, with \(\lambda ^*_2 (h)=1,\) \(\lambda ^*_3 (h) = 0\), \(\lambda ^*_4 (h)= 0\), \( \tilde{ \lambda }^*_{3} (h) =1/2,\) and \(\tilde{\lambda }^*_{2} (h) =0\).
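Both the identity (34) with these multipliers and the feasibility of the curve \(x(\alpha)\) from the tangent 2-regularity argument are easy to confirm numerically. In the sketch below, \(\xi_1 = 1\) is one admissible choice:

```python
import numpy as np

h = np.array([1.0, 0.0, 1.0])
fprime  = np.array([1.0, 1.0, -1.0])     # f'(0)
g2prime = np.array([0.0, -1.0, 0.0])     # g_2'(0)
G3t     = np.diag([-2.0, 2.0, 2.0])      # tilde g_3''(0)
# (34) with lambda_2 = 1, lambda_3 = lambda_4 = 0, tilde-lambda_3 = 1/2,
# tilde-lambda_2 = 0:
print(fprime + g2prime + 0.5 * (G3t @ h))          # [0. 0. 0.]

def g(x):  # constraints of Example 7.2
    return np.array([-x[0] - abs(x[1])**2.5,
                     -x[1] - abs(x[2])**2.5,
                      x[1] - x[0]**2 + x[1]**2 + x[2]**2 - abs(x[0])**2.5,
                      x[0] - x[2]])

xi1 = 1.0                                # one admissible choice of xi_1
for alpha in (1e-1, 1e-2, 1e-3):
    x = np.array([alpha + alpha**1.5 * xi1, 0.0, alpha + alpha**1.5 * xi1])
    assert np.all(g(x) <= 0.0), (alpha, g(x))
print("x(alpha) is feasible for the sampled alpha")
```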

8 Comparison with Other Results

We start this section by comparing our results with the Fritz John-type conditions proposed in [19]. The main difference is in the coefficient \(\lambda_0\) of the objective function, which is not guaranteed to be nonzero in [19], while the optimality conditions presented in this paper have \(\lambda_0 = 1\). If \(\lambda_0 = 0\), then the optimality conditions do not provide qualitative information about the optimization problem. Moreover, the authors in [19] make an additional assumption that a vector h used in the statements of their results belongs to the set \(H_2(\bar{x}) := \{ h \in X \, : \, \langle g_i'(\bar{x}), h \rangle \le 0 \; \forall i \in I(\bar{x})\) and \(\exists x\) such that \(\langle g_i'(\bar{x}), x \rangle + g_i''(\bar{x})[h, h] \le 0 \; \forall i \in I(\bar{x}, h) \},\) where \(I(\bar{x}, h) := \{ i \in I(\bar{x}) \, : \, \langle g_i'(\bar{x}), h \rangle = 0\}\); there is no similar requirement in the classical case. For the case when \(\lambda_0 \ne 0\) in [19], there is an additional requirement that there exist \(\xi\) and \(\hat{\xi}\) such that \(\langle g_i'(\bar{x}), \xi \rangle + g_i''(\bar{x})[h, \hat{\xi}] < 0\). This restricts the class of problems where the optimality conditions from [19] can be applied and guarantee \(\lambda_0 \ne 0\). Example 7.2 illustrates a case where these assumptions do not hold but the optimality conditions presented in this paper are satisfied. Another difference between the optimality conditions given in [19] and those in Theorem 4.1 lies in the last term of the generalized Lagrange function \(\mathcal{L}(x, \lambda) := \lambda_0 f(x) + \sum_{j=1}^{r} \lambda_j g_j(x) + \sum_{j=r+1}^{m} \lambda_j g_j'(x) h\). Namely, Theorem 4.1 yields \(\lambda_j^* = 0\), \(j = r+1, \ldots, m\); hence, the optimality conditions derived in Theorem 4.1 reduce to the classical form of the KKT conditions. Note that Theorem 4.1 also implies that either \(\lambda_0 = 0\) or \(\lambda_j = 0\), \(j = r+1, \ldots, m\), in the optimality conditions presented in [19] in terms of the function \(\mathcal{L}(x, \lambda)\).

The results in [22] are KKT-type optimality conditions. However, the main assumption in [22] is stronger than the regularity conditions proposed in this paper. In addition, [22] requires that \(\langle f'(\bar{x}), h \rangle = 0\) for a vector h from some special set, where \(\bar{x}\) is a local minimizer of problem (1); in our paper, we prove that \(\langle f'(\bar{x}), h \rangle = 0\) holds in the relevant results without making it an assumption. Moreover, the regularity assumption in [22] has the form: \(\exists \, h, \bar{h}\) such that \(\langle g_i'(\bar{x}), h \rangle \le 0 \; \forall i \in I(\bar{x})\) and \(g_i''(\bar{x})[h, \bar{h}] < 0 \; \forall i \in I(\bar{x}, h)\). This restricts the class of problems for which the optimality conditions given in [22] hold and \(\lambda_0 \ne 0\) is guaranteed. Example 7.2 illustrates a case where this assumption does not hold but the optimality conditions presented in this paper are satisfied. All results derived in [19,20,21,22] are given for the case \(p=2\) only, while we consider the more general case \(p \ge 2\). The results presented in [16] also differ from the ones given in this paper, since [16] requires the constraints to be 2p–times continuously differentiable.

9 Conclusions

We showed that, in an absolutely degenerate case when (3) holds with an even p, the Karush–Kuhn–Tucker necessary conditions reduce to a specific form containing the objective function only. We then analyzed classes of nonregular problems for which the KKT conditions hold with a nonzero multiplier corresponding to the objective function. After that, we turned our attention to degenerate optimization problems for which the KKT Theorem fails and presented necessary and sufficient conditions that can be viewed as generalized KKT-type optimality conditions. As auxiliary results, we derived new geometric necessary conditions. The new approach presented in Sect. 5 can be used to reduce degenerate optimization problems to new forms and thus simplify the analysis of nonregular optimization problems. The proposed optimality conditions were illustrated by examples and compared with existing optimality conditions.

Most of the constraint qualifications (CQs) proposed in the paper can be viewed as generalizations of either MFCQ or LICQ. The main difference between the CQs known in the literature (see, for example, [1,2,3,4] and the references therein) and the ones presented in this paper is that our regularity assumptions allowed us to derive not only classical KKT-type optimality conditions but also generalized forms of the KKT conditions.

Some directions for future research are described in Sect. 6. It would be interesting to extend the results presented in the paper to the case \(p>2\). It also remains an open problem to generalize the approach described in Sect. 5 to new classes of optimization problems and to the case \(p>2\).