1 Introduction

This paper focuses mainly on the following multiobjective semi-infinite programming problem:

$$\begin{aligned} \text{(MOSIP) }\quad&\inf \;\big (f_1(x), f_2(x), \ldots , f_m(x)\big ) \\ \text{ s.t. } \quad \quad&g_j(x)\le 0,\quad \ j\in J,\\&x \in {\mathbb {R}}^n, \end{aligned}$$

where \(f_i, i\in I:=\{1,2,\ldots ,m\}\), and \(g_j, j\in J\), are locally Lipschitz functions from \({\mathbb {R}}^n\) to \({\mathbb {R}}\), and \(J\) is an arbitrary (but nonempty) index set.

Theoretical aspects and a wide range of applications of semi-infinite programming (both scalar and vector problems) have been studied intensively by many researchers; see [2–5, 8, 12, 14–17, 20, 25] and the references therein. To the best of our knowledge, only a few works deal with optimality conditions for (MOSIP); see [1, 27] for the differentiable case, [6, 11] for the convex case, and [9, 10, 22, 23, 26] for other cases. Recently, in [18] we obtained Karush-Kuhn-Tucker (KKT, for short) optimality conditions for the non-differentiable non-convex (MOSIP).

In many situations, one cannot guarantee that the KKT multipliers associated with the vector-valued objective function \( \big (f_1(x), f_2(x), \ldots , f_m(x)\big )\) are all positive; namely, some of these multipliers may be equal to zero. We say that the strong KKT condition holds for (MOSIP) when the KKT multipliers are positive for all components of the objective function. The aim of this paper is to derive strong KKT type necessary and sufficient optimality conditions for (MOSIP). Our results are expressed in terms of the Clarke subdifferential.

The paper is organized as follows. In Sect. 2, we introduce some notations, basic definitions, and preliminaries, which are used throughout the paper. In Sect. 3, we give a constraint qualification and derive the strong KKT type necessary conditions for (MOSIP). In Sect. 4, a strong KKT type sufficient condition for (MOSIP) is obtained.

2 Notations and preliminaries

In this section we present some definitions and auxiliary results that will be needed in the sequel.

Let \(A\) be a nonempty subset of \({\mathbb {R}}^{n}\). Denote by \(\bar{A}, conv(A)\), and \(cone(A)\) the closure of \(A\), the convex hull of \(A\), and the convex cone (containing the origin) generated by \(A\), respectively. The polar cone and the strict polar cone of \(A\) are defined, respectively, by:

$$\begin{aligned}&A^{-}:=\big \{d\in {\mathbb {R}}^{n}\mid \langle x,d \rangle \le 0\quad \forall x\in A\big \}, \\&A^{s}:=\big \{d\in {\mathbb {R}}^{n}\mid \langle x,d \rangle <0\quad \forall x\in A\big \}, \end{aligned}$$

where \(\langle .,. \rangle \) denotes the standard inner product in \({\mathbb {R}}^n\). Notice that \(A^-\) is always a closed convex cone. It is easy to show that if \(A^{s}\ne \emptyset \) then \(\overline{A^{s}}=A^{-}\). The bipolar theorem states that \((A^{-})^-=\overline{cone}(A)\); see [13].
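When \(A\) is finite, membership in \(A^{-}\) or \(A^{s}\) amounts to finitely many sign checks on inner products, so these cones are easy to probe numerically. A minimal sketch of our own (the function names are ours, and \(A\) is assumed to be given by finitely many generators stored as matrix rows):

```python
import numpy as np

def in_polar(d, A, tol=1e-12):
    """Test d in A^- for a finite generator set A (one generator per row)."""
    return bool(np.all(A @ d <= tol))

def in_strict_polar(d, A):
    """Test d in A^s for a finite generator set A."""
    return bool(np.all(A @ d < 0.0))

A = np.array([[1.0, 0.0],
              [0.0, 1.0]])                # generators of the nonnegative orthant
d = np.array([-1.0, -2.0])
print(in_polar(d, A), in_strict_polar(d, A))   # True True
print(in_polar(np.array([0.0, 1.0]), A))       # False: <(0,1),(0,1)> = 1 > 0
```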

Let us recall the following theorems which will be used in the sequel.

Theorem 1

([13]) Let \(A\) be a nonempty compact subset of \({\mathbb {R}}^n\). Then

(I) \(conv(A)\) is a closed set.

(II) \(cone(A)\) is a closed cone if \(0\notin conv(A)\).

We recall that for \(A\subseteq {\mathbb {R}}^n\) and \(\hat{x}\in \overline{A}\), the contingent cone and the Clarke tangent cone to \(A\) at \(\hat{x}\) are respectively defined by

$$\begin{aligned}&\varGamma (A,\hat{x}):=\Big \{d \in {\mathbb {R}}^n \mid \exists \left\{ (t_k,d_k)\right\} \rightarrow (0^+,d),\ \text{ such } \text{ that }\ \hat{x}+t_k d_k \in A \ \ \forall k\in \mathbb {N}\ \Big \},\\&T(A,\hat{x}):=\Big \{d \in {\mathbb {R}}^n \mid \forall \left\{ (t_k,x_k)\right\} \rightarrow (0^+,\hat{x}) \ \exists d_k \rightarrow d,\ \text{ such } \text{ that }\ x_k+t_k d_k \in A \quad \forall k\in \mathbb {N}\ \Big \}. \end{aligned}$$

Notice that \(\varGamma (A,\hat{x})\) is a closed cone (generally nonconvex) in \({\mathbb {R}}^{n}\).

Let \(\hat{x} \in {\mathbb {R}}^n\) and let \(\varphi :{\mathbb {R}}^n \rightarrow {\mathbb {R}}\) be a locally Lipschitz function. The Clarke directional derivative of \(\varphi \) at \(\hat{x}\) in the direction \(v \in {\mathbb {R}}^n\), and the Clarke subdifferential of \(\varphi \) at \(\hat{x}\) are respectively given by

$$\begin{aligned} \varphi ^\circ (\hat{x};v)&:= \limsup _{y\rightarrow \hat{x},\ t\downarrow 0} \frac{\varphi (y+tv)- \varphi (y)}{t},\\ \partial ^c \varphi (\hat{x})&:= \big \{\xi \in {\mathbb {R}}^n \mid \left<\xi ,v\right> \le \varphi ^\circ (\hat{x};v) \quad \text{ for } \text{ all }\ v\in {\mathbb {R}}^n \big \}. \end{aligned}$$

The Clarke subdifferential is a natural generalization of the classical derivative: it is known that when \(\varphi \) is continuously differentiable at \(\hat{x}\), \(\partial ^c \varphi (\hat{x})=\{\nabla \varphi (\hat{x})\}\). Moreover, when \(\varphi \) is convex, the Clarke subdifferential coincides with the subdifferential in the sense of convex analysis.
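The limsup in the definition of \(\varphi ^\circ \) can be probed numerically by sampling base points \(y\) near \(\hat{x}\) and small step sizes \(t\). The following sketch is our own illustration (not part of the paper's development) for \(\varphi =|\cdot |\), where \(\partial ^c \varphi (0)=[-1,1]\) and hence \(\varphi ^\circ (0;v)=|v|\):

```python
import numpy as np

def clarke_dd_estimate(phi, xhat, v, radius=1e-4, tmax=1e-4, samples=2000):
    """Crude estimate of the Clarke directional derivative phi°(xhat; v):
    take the largest sampled difference quotient over base points y near
    xhat and step sizes t near 0, mimicking the limsup in the definition."""
    rng = np.random.default_rng(0)
    best = -np.inf
    for _ in range(samples):
        y = xhat + radius * rng.uniform(-1.0, 1.0)
        t = tmax * rng.uniform(1e-3, 1.0)
        best = max(best, (phi(y + t * v) - phi(y)) / t)
    return best

# phi = |.|: the Clarke subdifferential at 0 is [-1, 1], so phi°(0; v) = |v|.
print(clarke_dd_estimate(abs, 0.0, 1.0))    # ~ 1.0
print(clarke_dd_estimate(abs, 0.0, -1.0))   # ~ 1.0 as well
```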

In the following theorem we summarize some important properties of the Clarke directional derivative and the Clarke subdifferential from [7], which are widely used in what follows; a worked instance of the max rule in part (i) is given after the theorem.

Theorem 2

Let \(\varphi \) and \(\phi \) be functions from \({\mathbb {R}}^n \) to \({\mathbb {R}}\) which are Lipschitz near \(\hat{x}\). Then,

(i) the following assertions hold:

$$\begin{aligned}&\varphi ^\circ (\hat{x};v)=\max \big \{\left<\xi ,v\right> \mid \xi \in \partial ^c \varphi (\hat{x}) \big \},\\&\partial ^c \big (\max \{\varphi ,\phi \} \big )(\hat{x})\subseteq conv \big (\partial ^c \varphi (\hat{x}) \cup \partial ^c \phi (\hat{x}) \big ), \\&\partial ^c(\lambda \varphi +\phi )(\hat{x}) \subseteq \lambda \partial ^c \varphi (\hat{x})+\partial ^c \phi (\hat{x}), \qquad \forall \; \lambda \in {\mathbb {R}}. \end{aligned}$$
(ii) the function \(v \rightarrow \varphi ^\circ (\hat{x};v)\) is finite, positively homogeneous, and subadditive on \({\mathbb {R}}^n\), and

$$\begin{aligned} \partial \big (\varphi ^\circ (\hat{x};.)\big )(0)=\partial ^c \varphi (\hat{x}), \end{aligned}$$

where \(\partial \) denotes the subdifferential in the sense of convex analysis.

(iii) \( \partial ^c \varphi (\hat{x})\) is a nonempty, convex, and compact subset of \({\mathbb {R}}^n\).

(iv) \(\varphi ^\circ (x;v)\) is upper semicontinuous as a function of \((x,v)\).
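As a worked instance of the max rule in Theorem 2(i), take \(\varphi (x):=x\) and \(\phi (x):=-x\) on \({\mathbb {R}}\), so that \(\max \{\varphi ,\phi \}=|\cdot |\). Then

$$\begin{aligned} \partial ^c \big (\max \{\varphi ,\phi \} \big )(0)=\partial ^c |\cdot |(0)=[-1,1]=conv\big (\{1\}\cup \{-1\}\big )=conv \big (\partial ^c \varphi (0) \cup \partial ^c \phi (0) \big ), \end{aligned}$$

so here the inclusion holds with equality (in general it can be strict, e.g., when one of the two functions strictly dominates the other near \(\hat{x}\)).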

Theorem 3

(Mean value) Let \(x,y \in {\mathbb {R}}^n\), and let \(\varphi \) be a locally Lipschitz function from \({\mathbb {R}}^n\) to \({\mathbb {R}}\). Then, there exists a point \(u\) in the open line segment \((x,y)\) such that

$$\begin{aligned} \varphi (y)-\varphi (x) \in \langle \partial ^c \varphi (u),y-x\rangle . \end{aligned}$$

3 Strong KKT necessary condition

As a starting point of this section, we denote by \(S\) the feasible region of (MOSIP), i.e.,

$$\begin{aligned} S:=\big \{x\in {\mathbb {R}}^n \mid g_j(x)\le 0 \quad \forall j\in J \big \}. \end{aligned}$$

For a given \(\hat{x} \in S\), let \(J(\hat{x})\) denote the index set of all active constraints at \(\hat{x}\):

$$\begin{aligned} J(\hat{x}):=\big \{ j\in J \mid g_j(\hat{x})=0 \big \}. \end{aligned}$$

A point \(\hat{x}\in S\) is said to be a weakly efficient solution of (MOSIP) if there is no \(x\in S\) satisfying \(f_i(x)< f_i(\hat{x})\) for all \( i\in I\). The set of all weakly efficient solutions of (MOSIP) is denoted by \(W\).

For each \(\hat{x} \in S\), set

$$\begin{aligned} \mathcal {A}(\hat{x}):=\bigcup _{i\in I} \partial ^c f_i(\hat{x}) \quad \text{ and }\quad \mathcal {B}(\hat{x}):=\bigcup _{j\in J(\hat{x})}\partial ^c g_j(\hat{x}). \end{aligned}$$

Recall the following definition from [18, Definition 3.2]: we say that (MOSIP) satisfies the regular constraint qualification (RCQ, for short) at \(\hat{x} \in S\) if

$$\begin{aligned} \big (\mathcal {A}(\hat{x})\big )^s \cap \big (\mathcal {B}(\hat{x})\big )^- \subseteq \varGamma (S,\hat{x}). \end{aligned}$$

The following theorem is proved in [18, Theorem 3.4].

Theorem 4

(KKT necessary condition) Let \(x_0\) be a weakly efficient solution of (MOSIP) at which RCQ holds. If, in addition, \(cone\big (\mathcal {B}(x_0)\big )\) is a closed cone, then there exist \(\alpha _{i}\ge 0\) (for \(i\in I\)), not all zero, and \(\beta _{j}\ge 0\) (for \(j \in J(x_0)\)) with \(\beta _{j} \ne 0\) for at most finitely many indices, such that

$$\begin{aligned} 0\in \sum _{i=1}^m \alpha _i \partial ^c f_i(x_0)+\sum _{j\in J(x_0)}\beta _{j}\partial ^c g_{j}(x_0). \end{aligned}$$

It is shown in [19, Example 5.1] that the strong KKT condition does not necessarily hold at a weakly efficient solution under RCQ, even if \(|J|=1\). The aim of this paper is to derive the strong KKT necessary condition at \(\hat{x} \in W\) under the following constraint qualification \(\big (\)with the convention \(\bigcup _{\alpha \in \emptyset } X_\alpha = \emptyset \big )\):

$$\begin{aligned}&\text{(CQ): }&\quad \big (\mathcal {A}_k(\hat{x})\big )^s\cap \big (\mathcal {B}(\hat{x})\big )^s \ne \emptyset \quad \text{ for } \text{ all }\ \ k\in I, \end{aligned}$$

where

$$\begin{aligned} \mathcal {A}_k(\hat{x}):= \bigcup _{i\in I_k} \partial ^c f_i(\hat{x}) \quad \text{ and }\quad I_k:=I{\setminus }\{ k \}. \end{aligned}$$

Observe that (CQ) is the nonsmooth analog of the qualification introduced by Maeda in [21] for differentiable multiobjective problems with finitely many constraints (i.e., \(|J| < \infty \)). If \(m=1\), then (CQ) reduces to the Cottle constraint qualification, which was studied in [16] in the nonsmooth semi-infinite case.
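To illustrate (CQ) on a toy instance of our own, let \(m=2\), \(f_1(x):=x_1\), \(f_2(x):=x_2\), and \(g_j(x):=-x_1-x_2\) for every \(j\in J:=[0,1]\). The point \(\hat{x}=(0,0)\) is weakly efficient, \(\mathcal {B}(\hat{x})=\{(-1,-1)\}\), \(\mathcal {A}_1(\hat{x})=\{(0,1)\}\), and \(\mathcal {A}_2(\hat{x})=\{(1,0)\}\), so that

$$\begin{aligned} (2,-1)\in \big (\mathcal {A}_1(\hat{x})\big )^s \cap \big (\mathcal {B}(\hat{x})\big )^s \quad \text{ and }\quad (-1,2)\in \big (\mathcal {A}_2(\hat{x})\big )^s \cap \big (\mathcal {B}(\hat{x})\big )^s. \end{aligned}$$

Hence (CQ) holds at \(\hat{x}\), and the strong KKT condition is indeed satisfied there: \(0=1\cdot (1,0)+1\cdot (0,1)+1\cdot (-1,-1)\).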

Throughout this section we assume that the following condition holds:

Assumption A

The index set \(J\) is a nonempty compact subset of \({\mathbb {R}}^l\), the function \((x,j)\rightarrow g_j(x)\) is upper semicontinuous on \({\mathbb {R}}^n \times J\), and the set-valued mapping \(j\rightarrow \partial ^c g_j(x)\) is upper semicontinuous for each \(x\).

Theorem 5

(Strong KKT necessary condition) Suppose that (CQ) is satisfied at \(\hat{x} \in W\) and that Assumption A holds. Then there exist \(\lambda _{i}>0,\ i\in I\), and \(\gamma _j \ge 0,\ j\in J(\hat{x})\), with \(\gamma _j \ne 0\) for at most finitely many indices, such that

$$\begin{aligned}&0\in \sum _{i=1}^m \lambda _i \partial ^c f_i(\hat{x})+\sum _{j\in J(\hat{x})}\gamma _j \partial ^c g_{j}(\hat{x}). \end{aligned}$$

Proof

We present the proof in five steps.

Step 1. We prove that \(conv\big (\mathcal {B}(\hat{x})\big )\) and \(cone\big (\mathcal {B}(\hat{x})\big )\) are closed sets. First, we claim that \(\mathcal {B}(\hat{x})\) is a compact set. Let \(\{ \xi _k \}_{k=1}^\infty \) be a sequence in \(\mathcal {B}(\hat{x})\). If \(| \partial ^c g_{j_*}(\hat{x}) \cap \{ \xi _k \}_{k=1}^\infty | = \infty \) for some \(j_* \in J(\hat{x})\), then, by the compactness of \(\partial ^c g_{j_*}(\hat{x})\) (Theorem 2(iii)), there exists a subsequence \(\{ \xi _{k_p} \}\) converging to some \(\hat{\xi }\in \partial ^c g_{j_*}(\hat{x})\). If \(| \partial ^c g_{j}(\hat{x}) \cap \{ \xi _k \}_{k=1}^\infty | < \infty \) for all \(j \in J(\hat{x})\), then without loss of generality we may assume that \(\xi _k \in \partial ^c g_{j_k}(\hat{x})\) for all \(k \in \mathbb {N}\). Since \(J(\hat{x})\) is compact (it is a closed subset of the compact set \(J\), by the upper semicontinuity of \(j\rightarrow g_j(\hat{x})\)), we have \(j_{k_p} \rightarrow \hat{j} \in J(\hat{x})\) for some subsequence \(\{j_{k_p}\}\) of \(\{j_{k}\}\). Since the mapping \(j \rightarrow \partial ^c g_j(\hat{x})\) is upper semicontinuous, there exists a subsequence of \(\{\xi _{k_p}\}\) converging to some \(\hat{\xi }\in \partial ^c g_{\hat{j}}(\hat{x})\). This proves the claim: \(\mathcal {B}(\hat{x})\) is a compact set. Consequently, \(conv\big ( \mathcal {B}(\hat{x}) \big )\) is closed by Theorem 1(I). Now, (CQ) implies

$$\begin{aligned} \Big (conv\big (\mathcal {B}(\hat{x})\big )\Big )^s = \big (\mathcal {B}(\hat{x})\big )^s \ne \emptyset , \end{aligned}$$

and hence \(0\notin conv\big (\mathcal {B}(\hat{x}) \big )\); otherwise any \(d\) in the above set would satisfy the impossible inequality \(\langle 0,d\rangle <0\). Therefore, \(cone\big (\mathcal {B}(\hat{x}) \big )\) is a closed set by Theorem 1(II).

Step 2. Let

$$\begin{aligned} G(x):=\max _{j\in J} g_j(x) \quad \ \ \forall x\in {\mathbb {R}}^n. \end{aligned}$$

Since each \(g_j\) is locally Lipschitz, it follows readily that \(G\) is locally Lipschitz. The proof of the estimate

$$\begin{aligned} G^\circ (\hat{x};d)\le \max _{j\in J(\hat{x})} g^\circ _j(\hat{x};d) \quad \forall d\in {\mathbb {R}}^n, \end{aligned}$$
(1)

is presented in [7, Theorem 2.8.2, Step 1]. Note that the function \(j\rightarrow g^\circ _j(\hat{x};d)\) is upper semicontinuous and \(J(\hat{x})\) is compact (by Assumption A; see [7, pp. 78–79]), so the maximum in (1) is attained.

Let \(\xi \in \partial ^c G(\hat{x})\). The inequality (1) implies that

$$\begin{aligned} \max _{j\in J(\hat{x})}\hat{g_j}(d)\ge \langle \xi ,d\rangle \quad \ \forall d\in {\mathbb {R}}^n, \end{aligned}$$

where \(\hat{g_j}(d):=g^\circ _j(\hat{x};d)\). Since each \(\hat{g_j}(\cdot )\) is convex (by Theorem 2(ii)) and \(\hat{g_j}(0)=0\), we conclude that \(\xi \in \partial \widehat{G}(0)\), where \(\widehat{G}\) is defined by \(\widehat{G}(d):= \max _{j\in J(\hat{x})} \hat{g_j}(d)\). On the other hand, for every \(j\), the function \(\hat{g_j}\) is continuous at \(\hat{d}:=0\), and for every \(d\), the function \(j\rightarrow \hat{g_j}(d)\) is upper semicontinuous. So, the well-known Pshenichnyi–Levin–Valadier theorem ([13, p. 267]) can be applied to obtain

$$\begin{aligned} \partial \widehat{G}(0)=\overline{conv}\left( \bigcup _{j\in \widehat{J}(0)}\partial \hat{g_j}(0) \right) , \end{aligned}$$

where \(\widehat{J}(0):=\big \{j\in J(\hat{x}) \mid \hat{g_j}(0)=\widehat{G}(0)=0\big \}\). Now, since \(\widehat{J}(0)=J(\hat{x})\), \(\partial \hat{g_j}(0)=\partial ^c g_j(\hat{x})\) by Theorem 2(ii), and \(conv \big (\mathcal {B}(\hat{x})\big )\) is closed by Step 1, we obtain

$$\begin{aligned} \partial ^c G(\hat{x}) \subseteq conv\big (\mathcal {B}(\hat{x})\big ). \end{aligned}$$
(2)

Step 3. Since \(\big (\mathcal {B}(\hat{x}) \big )^s \ne \emptyset \) under the (CQ) assumption, we can choose \(d \in \big (\mathcal {B}(\hat{x}) \big )^s\). Now, owing to

$$\begin{aligned} \big (\mathcal {B}(\hat{x}) \big )^s =\Big ( conv\big (\mathcal {B}(\hat{x})\big ) \Big )^s, \end{aligned}$$

we conclude that \(d\in \Big (conv\big (\mathcal {B}(\hat{x})\big )\Big )^s\). Hence, by (2) and Theorem 2(i), \(G^\circ (\hat{x};d)=\max \big \{\langle \xi ,d\rangle \mid \xi \in \partial ^c G(\hat{x})\big \}<0,\) and consequently there exists a scalar \(\delta >0\) such that

$$\begin{aligned} G(\hat{x}+\beta d)< G(\hat{x}) \le 0,\qquad \forall \ \beta \in (0, \delta ]. \end{aligned}$$

Thus, for all \(j\in J\) and all \(\beta \in (0, \delta ]\), we have \( g_j(\hat{x}+\beta d)<0.\) Therefore, \( \hat{x}+\beta d\in S\) for all \(\beta \in (0, \delta ]\), which implies \(d\in \varGamma (S,\hat{x}).\) Since \(d\) was an arbitrary element of \(\big (\mathcal {B}(\hat{x}) \big )^s\), it follows that \(\big (\mathcal {B}(\hat{x}) \big )^s \subseteq \varGamma (S,\hat{x})\). This inclusion and the fact that \(\big (\mathcal {B}(\hat{x})\big )^s \ne \emptyset \) imply that

$$\begin{aligned} \big (\mathcal {B}(\hat{x})\big )^-=\overline{\big (\mathcal {B}(\hat{x})\big )^s}\subseteq \overline{\varGamma (S,\hat{x})}=\varGamma (S,\hat{x}), \end{aligned}$$

and hence, the RCQ is satisfied at \(\hat{x}\).

Step 4. By Step 1, \(cone\big (\mathcal {B}(\hat{x})\big )\) is closed, and by Step 3, RCQ holds at \(\hat{x}\). Therefore, Theorem 4 yields

$$\begin{aligned}&0\in \sum _{i= 1} ^ m \alpha _i \partial ^c f_i(\hat{x})+\sum _{j\in J(\hat{x})}\beta _{j}\partial ^c g_{j}(\hat{x}), \end{aligned}$$
(3)

for some \(\alpha _{i}\ge 0,\ i\in I\), not all zero, and \(\beta _{j}\ge 0,\ j\in J(\hat{x})\), with \(\beta _{j} \ne 0\) for at most finitely many indices.

Step 5. Inclusion (3) implies

$$\begin{aligned} \sum _{i=1}^m \alpha _i \xi _i + \sum _{j\in J(\hat{x})} \beta _j \zeta _j= 0, \end{aligned}$$
(4)

for some \(\xi _i \in \partial ^c f_i(\hat{x})\) and \(\zeta _j \in \partial ^c g_j(\hat{x})\) with \((i,j)\in I \times J(\hat{x})\). Suppose, by way of contradiction, that \(\alpha _k =0\) for some \(k\in I\). Since (CQ) holds, there exists \(d\in {\mathbb {R}}^n\) such that

$$\begin{aligned} \left\{ \begin{array}{ll} \langle \xi _i,d \rangle <0, \quad i\in I_k,\\ \langle \zeta _j,d \rangle \le 0, \quad j\in J(\hat{x}). \end{array}\right. \end{aligned}$$

The inequalities above, together with (4) and the fact that \(\alpha _i>0\) for at least one \(i\in I_k\) (recall that the \(\alpha _i\) are not all zero), imply that

$$\begin{aligned} \underbrace{\sum _{i\in I_k } \alpha _i \langle \xi _i,d \rangle }_{<0} + \underbrace{ \sum _{j\in J(\hat{x})} \beta _j \langle \zeta _j,d \rangle }_{\le 0}= 0, \end{aligned}$$

a contradiction. This means \(\alpha _i >0\) for all \(i \in I\), and hence, the proof of the theorem is complete. \(\square \)

The following example shows that if (CQ) is not satisfied, then a weakly efficient solution may fail to be a strong KKT point.

Example 1

Suppose that \(f_1(x_1,x_2):=-x_1,\ \hat{x}=(0,0),\ g_j(x_1,x_2):=-x_1-j\) for all \(j\in J:=[0,1],\) and \(f_2\) is the support function of \(P:=\left\{ (y_1,y_2)\in {\mathbb {R}}^2\mid y_1^2 + (y_2 + 1)^2 \le 1\right\} \), i.e.,

$$\begin{aligned} f_2(x)=\sup _{b\in P} \left<b,x\right>. \end{aligned}$$

Assumption A is satisfied, and \(\hat{x}\) is a weakly efficient solution for the problem. We have:

$$\begin{aligned}&J(\hat{x})=\{0\},\\&\big (\mathcal {B}(\hat{x}) \big )^s=\{(-1,0)\}^s=(0,+\infty ) \times {\mathbb {R}},\\&\big (\mathcal {A}_1(\hat{x}) \big )^s=P^s = \emptyset . \end{aligned}$$

Thus, (CQ) is not satisfied at \(\hat{x}\); note that \(P^s=\emptyset \) because \(0\in P\). It is easy to see that there do not exist \(\lambda _1 >0, \lambda _2 >0\), and \(\gamma _0 \ge 0\) satisfying

$$\begin{aligned} (0,0) \in \lambda _1 (-1,0) + \lambda _2 P +\gamma _0 (-1,0). \end{aligned}$$
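Indeed, the inclusion above would force some \(p\in P\) with \(\lambda _2 p=(\lambda _1+\gamma _0,0)\) and \(\lambda _1+\gamma _0>0\), whereas \(p_2=0\) together with \(p_1^2+(p_2+1)^2\le 1\) forces \(p=(0,0)\). A small numerical confirmation of this obstruction (our own sketch, not part of the original example):

```python
import numpy as np

# P is the disk p1^2 + (p2 + 1)^2 <= 1.  The strong KKT inclusion would
# require a point p in P with p2 = 0 and p1 = (lambda1 + gamma0)/lambda2 > 0,
# but the only point of P on the line p2 = 0 is the origin.  We check that
# the largest p1 over P shrinks to 0 as the slab |p2| <= eps narrows.
rng = np.random.default_rng(0)
p = rng.uniform([-1.0, -2.0], [1.0, 0.0], size=(10**6, 2))   # box around P
p = p[p[:, 0]**2 + (p[:, 1] + 1.0)**2 <= 1.0]                # keep points of P
for eps in (1e-1, 1e-2, 1e-3):
    sel = p[np.abs(p[:, 1]) <= eps, 0]
    print(eps, float(sel.max()) if sel.size else "no samples")  # ~ sqrt(2*eps) -> 0
```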

4 Strong KKT sufficient condition

In this section we establish a sufficient optimality result under invexity hypotheses imposed on the functions involved. Let us recall the following definition from [24].

Let \(\varphi :{\mathbb {R}}^{n}\longrightarrow {\mathbb {R}}\) be a locally Lipschitz function. We say that \(\varphi \) is invex at \(\hat{x} \in A \subseteq {\mathbb {R}}^n\) on \(A\) if for every \(y\in A\) there is some vector \(\eta (\hat{x},y)\in T(A,\hat{x})\), called the kernel of \(\varphi \), such that

$$\begin{aligned} \varphi (y)-\varphi (\hat{x})\ge \varphi ^{\circ }\big (\hat{x};\eta (\hat{x},y)\big ). \end{aligned}$$
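For instance, every convex function \(\varphi :{\mathbb {R}}^n\rightarrow {\mathbb {R}}\) is invex on a convex set \(A\) with the kernel \(\eta (\hat{x},y):=y-\hat{x}\in T(A,\hat{x})\): convexity and Theorem 2(i) give

$$\begin{aligned} \varphi (y)-\varphi (\hat{x})\ge \max \big \{\langle \xi ,y-\hat{x}\rangle \mid \xi \in \partial ^c \varphi (\hat{x})\big \}=\varphi ^{\circ }\big (\hat{x};y-\hat{x}\big ). \end{aligned}$$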

Theorem 6

Let \(\hat{x}\) be a feasible solution of (MOSIP). Suppose that \(f_i, i\in I\), and \(g_j, j\in J(\hat{x})\), are invex functions on \(S\) with the same kernel \(\eta \). If there exist scalars \(\lambda _i > 0,\ i \in I\), and \(\gamma _j \ge 0,\ j\in J(\hat{x})\), with \(\gamma _j \ne 0\) for at most finitely many indices, such that

$$\begin{aligned} 0\in \partial ^c \left( \sum _{i=1}^m \lambda _i f_i+\sum _{j\in J(\hat{x})}\gamma _{j} g_{j} \right) (\hat{x}), \end{aligned}$$
(5)

holds, then \(\hat{x}\) is a weakly efficient solution of the problem.

Proof

Suppose on the contrary that \(\hat{x}\) is not a weakly efficient solution for (MOSIP). Then there exists a feasible point \(y\) for (MOSIP) such that

$$\begin{aligned} f_i(y)< f_i(\hat{x})\qquad \text{ for } \text{ all } \quad i \in I. \end{aligned}$$

Since \(\lambda _i >0\) for each \(i\in I\), we obtain that

$$\begin{aligned} \sum _{i=1}^m \lambda _i f_i(y) < \sum _{i=1}^m \lambda _i f_i(\hat{x}). \end{aligned}$$

Due to the feasibility of \(y\) and the last relation, the following inequalities are fulfilled:

$$\begin{aligned}&\sum _{i=1}^m \lambda _i f_i(y)+\sum _{j\in K(\hat{x})}\gamma _{j} g_{j}(y) \le \sum _{i=1}^m \lambda _i f_i(y)<\sum _{i=1}^m \lambda _i f_i(\hat{x}) \nonumber \\&\quad \le \sum _{i=1}^m \lambda _i f_i(\hat{x})+\sum _{j\in K(\hat{x})}\gamma _{j} g_{j}(\hat{x}), \end{aligned}$$
(6)

where \(K(\hat{x}):=\{j\in J(\hat{x}) \mid \gamma _{j} \ne 0 \}.\) For each \(x\in {\mathbb {R}}^n\), we define

$$\begin{aligned} \varphi (x):= \sum _{i=1}^m \lambda _i f_i(x)+\sum _{j\in K(\hat{x})}\gamma _{j} g_{j}(x). \end{aligned}$$

Observe that \(\varphi \) is an invex function with kernel \(\eta \), being a combination, with positive coefficients, of invex functions sharing the kernel \(\eta \) (use the positive homogeneity and subadditivity of the Clarke directional derivative; see Theorem 2(ii)). Since \(\gamma _j=0\) for \(j\in J(\hat{x}){\setminus } K(\hat{x})\), relations (5) and (6) imply that

$$\begin{aligned} 0\in \partial ^c \varphi (\hat{x}) \quad \text{ and }\quad \varphi (y)-\varphi (\hat{x})<0. \end{aligned}$$

Combining this with Theorem 2(i) yields

$$\begin{aligned} 0>\varphi (y)-\varphi (\hat{x})\ge \varphi ^\circ \big (\hat{x};\eta (\hat{x},y)\big )=\max \big \{\langle \xi ,\eta (\hat{x},y)\rangle \mid \xi \in \partial ^c \varphi (\hat{x}) \big \} \ge \langle 0,\eta (\hat{x},y)\rangle =0. \end{aligned}$$

This is a contradiction, and hence, the proof of the theorem is complete. \(\square \)
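To see Theorem 6 in action, consider the following toy instance of our own: \(n=1\), \(f_1(x):=x^2\), \(f_2(x):=(x-1)^2\), and \(g_j(x):=-x-j\) for \(j\in J=[0,1]\), so that \(S=[0,+\infty )\). At \(\hat{x}=0.5\) no constraint is active, all functions involved are convex (hence invex with the common kernel \(\eta (\hat{x},y)=y-\hat{x}\)), and \(\lambda _1=\lambda _2=1\) satisfies (5), so the theorem asserts that \(\hat{x}\) is weakly efficient. A short numerical check:

```python
import numpy as np

# Toy instance: f1(x) = x^2, f2(x) = (x - 1)^2 on S = [0, inf).
xhat = 0.5
f1 = lambda x: x**2
f2 = lambda x: (x - 1.0)**2

# Condition (5) with lambda1 = lambda2 = 1 and no active constraints:
print(2.0 * xhat + 2.0 * (xhat - 1.0))                        # 0.0

# Weak efficiency: no feasible x improves both objectives strictly.
x = np.linspace(0.0, 5.0, 10**6)                              # grid over (part of) S
print(bool(((f1(x) < f1(xhat)) & (f2(x) < f2(xhat))).any()))  # False
```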