1 Introduction

Interval-valued optimization problems have attracted ample interest, since real-world optimization problems often involve inexact or imprecise data in the objective function and/or in the constraints, due to unpredictable circumstances or measurement errors. In recent years, interval optimization theory and its applications have been studied by many authors. For instance, Cao et al. [7] investigated a vehicle routing problem with interval demands, where the demands of customers are uncertain. Costa et al. [9] used interval analysis in the calculation of the 3D structure of a protein molecule: the measurements of the distance between atoms obtained from nuclear magnetic resonance are not precise (and such measures are important to determine the protein structure). Das et al. [10] developed an imperfect production inventory model in an interval environment. Kumar, Behera, and Bhurjee [16] studied a portfolio selection model with interval-type random parameters, with value-at-risk as the risk measure. Wen et al. [25] developed an interval optimization method to determine the optimal size of an energy storage system in a hybrid ship power system, so as to reduce the fuel cost, the capital cost of the system, and greenhouse gas emissions.

With respect to theoretical developments, see, e.g., Ahmad et al. [1, 2], Antczak [3], Chalco-Cano et al. [8], Kummari and Ahmad [17], Singh et al. [19], Stefanini and Arana-Jiménez [21], Tung [23], Van Luu and Mai [24], Wu [27], Zhang et al. [28] and Zhao and Bin [29]. In all of the aforementioned studies, optimality conditions for interval-valued optimization problems were derived. To express such conditions, a suitable concept of derivative for interval-valued functions must be chosen. Most authors use Hukuhara differentiability (see Hukuhara [13]) or generalized Hukuhara differentiability (see Chalco-Cano et al. [8]). The former is known to be very restrictive. Although generalized Hukuhara differentiability is less restrictive than Hukuhara differentiability, in the case of many variables it does not generalize the one-dimensional generalized Hukuhara differentiability given in Stefanini and Bede [22]. Recently, Stefanini and Arana-Jiménez [21] gave a new definition of generalized Hukuhara differentiability for functions of many variables which does extend the one-dimensional case.

Necessary optimality conditions of Karush–Kuhn–Tucker (KKT) type for interval optimization problems were given, for instance, in [2, 8, 17, 21, 24, 28, 29] in the mono-objective (scalar) case. KKT necessary conditions for the multi-objective case were furnished in [1, 3, 23].

Sufficient optimality conditions under convexity can be found, among others, in [21] for scalar problems and in [3, 19, 23, 27] for multi-objective problems; under generalized convexity, they can be found in [2, 17, 24] for scalar problems.

Duality theory for interval problems was developed, for example, in [17, 24] in the scalar case, and in [3, 23] in the vector case.

Hukuhara differentiability was used in [27, 28], while in [1, 2, 19] the generalized Hukuhara differentiability given in [8] was utilized; in all of the latter papers, the constraint functions are assumed to be differentiable (in the classic sense). Problems in which the constraint functions are non-smooth were treated in [3, 17, 23, 24, 29], where neither Hukuhara nor generalized Hukuhara differentiability was used; instead, assumptions were made directly on the extremum functions that define the interval function. The Clarke subdifferential was used in [3, 29], the limiting subdifferential in [17], the subdifferential of convex analysis in [23], and convexificators (a generalized notion of subdifferential) in [24].

Most of the papers mentioned so far deal with interval problems posed in finite-dimensional Euclidean spaces with a finite number of constraints. Stefanini and Arana-Jiménez [21] actually developed necessary and sufficient optimality conditions for fuzzy optimization problems with inequality constraints; however, such conditions are easily converted to interval problems. Tung [23] studied a semi-infinite interval problem, that is, a problem posed in \(\mathbb {R}^n\) with a possibly infinite number of (inequality) constraints. Van Luu and Mai [24] considered a very general interval problem, posed in a real Banach space (possibly infinite dimensional), with equality, inequality and set constraints, in a non-smooth context; although the problem may have infinitely many variables, the number of equality and inequality constraints is finite. Zhao and Bin [29] also considered an interval optimization problem posed in a Banach space in the non-smooth context, but with (a finite number of) inequality constraints only. The novelty there is that the constraints are allowed to be parametrized over compact subsets of given topological spaces; such problems are called robust optimization problems.

The main goal of this work is to present necessary optimality conditions for interval-valued optimization problems in several variables with functional (inequality and equality) constraints as well as an abstract (or set) constraint. We use the new gH-differentiability given in Stefanini and Arana-Jiménez [21] and the LU order relation, and present necessary optimality conditions in the form of a multiplier rule of KKT type. A specification of the Dubovitskii–Milyutin formalism is utilized to establish such optimality conditions.

The Dubovitskii–Milyutin formalism can be found, for example, in Girsanov [12]. It is a unified functional-analytic approach to so-called extremum problems. A great diversity of problems can be attacked through this approach, from classical mathematical programming to optimal control; very general problems fit into the framework, since the only requirement on the space in which the problem is posed is that it be a locally convex topological linear space. However, the theory described in [12] can deal only with problems in which the objective function is a functional, which is not the case in interval optimization. We therefore propose a modification of the Dubovitskii–Milyutin formalism that comprises optimization problems in which the objective function is interval-valued. To be specific, we define an appropriate notion of directions of decrease for interval-valued functions, by using the LU order relation, in such a way that optimal solutions can be characterized geometrically. The set of all directions of decrease is then described by means of the gH-gradient of the objective function.

The KKT-type necessary optimality conditions furnished here are similar to those given in Stefanini and Arana-Jiménez [21], being expressed in terms of the gH-gradient of the objective function. In the other studies cited above, the KKT conditions involve not the gH-gradient but the classic gradient (or a subdifferential, in the non-smooth case) of the extremum functions which define the interval objective function. A distinction between our results and those of Stefanini and Arana-Jiménez resides in the constraint qualification used: while we require a positive linear independence condition, they assume a linear independence one (which is stronger). Moreover, the interval problem we work on is more general, since equality, inequality, and abstract constraints are allowed, whereas in [21] only inequality constraints are present.

As mentioned, Van Luu and Mai [24] worked with a very general problem in the non-smooth context. They made, notwithstanding, strong assumptions on the data. Although the objective extremum functions and equality constraints are merely assumed to be locally Lipschitz and the inequality constraints continuous, it is supposed that the objective extremum functions and inequality constraints admit certain convexificators and that the absolute values of the equality constraints are Clarke regular. Further, a regularity condition in the sense of Ioffe is also needed. Their KKT necessary conditions are obtained under a Mangasarian–Fromovitz-type constraint qualification and formulated by means of subdifferentials of the functions that define the problem. Although their results are established in a more general setting, no kind of derivative for interval-valued functions is used. The KKT conditions we derive here are stated through the gH-derivative concept (given in [21]), which is a genuine derivative concept for interval-valued functions. Moreover, owing to the methodology we employ, the Dubovitskii–Milyutin formalism, our results are easily extended to problems posed in Banach spaces.

The work is organized as follows. In Sect. 2, we give basic definitions and important results from the literature, and present the interval optimization problem. Section 3 is dedicated to obtaining the necessary optimality conditions. Some examples and possible applications are discussed in Sect. 4. Finally, in Sect. 5 we present some conclusions of this study.

2 Preliminaries

Here, we set the notation, give some basic definitions, state some important results from the literature and pose the interval-valued optimization problem the paper is concerned with.

2.1 Basic Definitions and Important Properties

The interval-valued objective function we work with is defined from \(\mathbb {R}^n\) into the space \(\mathcal {K}_C = \{[\underline{a},\overline{a}] : \underline{a},\overline{a} \in \mathbb {R}, ~ \underline{a} \le \overline{a}\}\). In such a space, the following interval arithmetic operations are defined. Given \(A = [\underline{a},\overline{a}]\), \(B = [\underline{b},\overline{b}] \in \mathcal {K}_C\) and \(\lambda \in \mathbb {R}\),

$$\begin{aligned} A + B = \left[ \underline{a}+\underline{b},\overline{a}+\overline{b} \right] \quad \text {and} \quad \lambda \cdot A = \left[ \min \{ \lambda \underline{a},\lambda \overline{a} \},\max \{ \lambda \underline{a},\lambda \overline{a} \} \right] . \end{aligned}$$

In particular, \((-1) \cdot A = [-\overline{a},-\underline{a}]\).

The gH-difference of two intervals \(A\), \(B \in \mathcal {K}_C\) (see Stefanini [20]) is defined as the interval \(C \in \mathcal {K}_C\) such that

$$\begin{aligned} A \ominus _{gH} B = C \Leftrightarrow {\left\{ \begin{array}{ll} A = B + C, ~ \text {or} \\ B = A + (-1) \cdot C. \end{array}\right. } \end{aligned}$$

Note that \(A \ominus _{gH} A = [0,0]\). Moreover, the gH-difference of two intervals always exists and, if \(A = [\underline{a},\overline{a}]\), \(B = [\underline{b},\overline{b}] \in \mathcal {K}_C\), it holds that \(A \ominus _{gH} B = \left[ (\underline{a}-\underline{b}) \vee (\overline{a} - \overline{b}) \right] \), where, for \(\alpha ,\beta \in \mathbb {R}\), \([\alpha \vee \beta ] := \left[ \min \{ \alpha ,\beta \},\max \{ \alpha ,\beta \} \right] \).
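
For illustration, the min/max formula above is immediate to implement. The following Python sketch (ours, not part of the original presentation) encodes an interval as a (lo, hi) pair:

```python
# Sketch: intervals as (lo, hi) pairs; gH-difference via the min/max formula.
def gh_diff(A, B):
    """A gH-minus B = [(a_lo - b_lo) v (a_hi - b_hi)]."""
    d_lo = A[0] - B[0]  # difference of lower endpoints
    d_hi = A[1] - B[1]  # difference of upper endpoints
    return (min(d_lo, d_hi), max(d_lo, d_hi))

A, B = (1.0, 4.0), (0.0, 2.0)
print(gh_diff(A, B))  # (1.0, 2.0); indeed B + (1, 2) = (1, 4) = A
print(gh_diff(A, A))  # (0.0, 0.0), as noted above
```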

The Cartesian product \(\mathcal {K}_C \times \cdots \times \mathcal {K}_C\), of n-factors \(\mathcal {K}_C\), is denoted by \(\mathcal {K}_C^n\). If \(A = (A_1,\ldots ,A_n)\), \(B = (B_1,\ldots ,B_n) \in \mathcal {K}_C^n\) and \(\lambda \in \mathbb {R}\), the sum and the scalar product in \(\mathcal {K}_C^n\) are defined as follows:

$$\begin{aligned} A \oplus B= & {} (A_1,\ldots , A_n) \oplus (B_1,\ldots , B_n) = (A_1+B_1,\ldots , A_n+B_n), \\ \lambda \odot A= & {} \lambda \odot (A_1,\ldots , A_n) = (\lambda \cdot A_1,\ldots , \lambda \cdot A_n). \end{aligned}$$

Note that any real number a can be seen as the degenerate interval \(\left[ a,a \right] \). Then, for all \(v = (v_{1}, \ldots ,v_{n}) \in \mathbb {R}^{n}\) and \(A = (A_1,\ldots , A_n) \in \mathcal {K}_C^n\), in which \(A_i = \left[ \underline{a}_i,\overline{a}_i \right] \in \mathcal {K}_C\), \(i = 1,\ldots ,n\), it follows that

$$\begin{aligned} A \oplus v= & {} \left( \left[ \underline{a}_1,\overline{a}_1 \right] , \ldots , \left[ \underline{a}_n,\overline{a}_n \right] \right) + ([v_1,v_1], \ldots , [v_n,v_n]) \\= & {} \left( \left[ \underline{a}_1,\overline{a}_1 \right] +[v_1,v_1], \ldots , \left[ \underline{a}_n,\overline{a}_n \right] +[v_n,v_n] \right) . \end{aligned}$$

We write \(v \in A\) to mean that \(v_{i} \in A_{i}\) for all \(i = 1,\ldots , n\). Given \(S \subset \mathbb {R}^n\), we define the sum \(A \oplus S\) as

$$\begin{aligned} A \oplus S = \{ a+s : a \in A, ~ s \in S \}. \end{aligned}$$

Next, we have the definition of the LU order relation which will be used here.

Definition 2.1

(Kulish and Miranker [15]) Let \(A = \left[ \underline{a},\overline{a} \right] \) and \(B = \left[ \underline{b},\overline{b}\right] \in \mathcal {K}_C\). The lower and upper order relation, LU for short, is defined by

  1. (i)

    \(A \leqq _{LU} B\) if, and only if, \(\underline{a} \le \underline{b}\) and \(\overline{a} \le \overline{b}\);

  2. (ii)

    \(A \le _{LU} B\) if, and only if, either \(\underline{a} < \underline{b}\) and \(\overline{a} \le \overline{b}\) or \(\underline{a} \le \underline{b}\) and \(\overline{a} < \overline{b}\);

  3. (iii)

    \(A <_{LU} B\) if, and only if, \(\underline{a} < \underline{b}\) and \(\overline{a} <\overline{b}\).
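
For illustration, the three relations of Definition 2.1 translate directly into endpoint comparisons; the sketch below (our naming) encodes them for intervals given as (lo, hi) pairs:

```python
# Sketch: the three LU comparisons of Definition 2.1.
def leq_LU(A, B):  # (i)  A <=_LU B: componentwise <=
    return A[0] <= B[0] and A[1] <= B[1]

def le_LU(A, B):   # (ii) A <_LU B with at least one strict endpoint inequality
    return leq_LU(A, B) and (A[0] < B[0] or A[1] < B[1])

def lt_LU(A, B):   # (iii) strictly smaller in both endpoints
    return A[0] < B[0] and A[1] < B[1]

print(le_LU((0.0, 3.0), (0.0, 5.0)))  # True: equal lower, strictly smaller upper
print(lt_LU((0.0, 3.0), (0.0, 5.0)))  # False: the lower endpoints coincide
```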

A map \(F : S \subseteq \mathbb {R}^n \rightarrow \mathcal {K}_C\) such that \(F(x) = \left[ \underline{f}(x),\overline{f}(x) \right] \) for all \(x \in S\), where \(\underline{f}\), \(\overline{f} : S \rightarrow \mathbb {R}\), with \(\underline{f}(x) \le \overline{f}(x)\) for all \(x \in S\), is called an interval-valued function of several variables. The functions \(\underline{f}\) and \(\overline{f}\) are called its extremum functions.

The Pompeiu–Hausdorff metric between \(A = [\underline{a},\overline{a}]\) and \(B = [\underline{b},\overline{b}] \in \mathcal {K}_C\) is defined as \(d_{H}(A,B) := \max \{ \vert \underline{a} - \underline{b} \vert , \vert \overline{a} - \overline{b} \vert \}\). In Diamond and Kloeden [11] it is shown that \((\mathcal {K}_C, d_H)\) is a complete and separable metric space. It is natural, then, to define the limit and continuity concepts for interval-valued functions by means of the Pompeiu–Hausdorff metric \(d_H\). Before enunciating definitions and results involving these concepts, let us state the following technical lemma, which will be useful later in the paper.

Lemma 2.1

Given \(A = [\underline{a},\overline{a}]\), \(B = [\underline{b},\overline{b}] \in \mathcal {K}_{C}\), then

$$\begin{aligned} A \ominus _{gH} B \leqq _{LU} \left[ d_H(A,B),d_H(A,B) \right] . \end{aligned}$$

Proof

We have that

$$\begin{aligned} A \ominus _{gH} B&= \left[ (\underline{a} - \underline{b}) \vee (\overline{a} - \overline{b}) \right] \leqq _{LU} \left[ \vert \underline{a}-\underline{b} \vert \vee \vert \overline{a} - \overline{b} \vert \right] \\&\leqq _{LU} \left[ \max \{ \vert \underline{a} - \underline{b} \vert , \vert \overline{a} - \overline{b} \vert \} , \max \{ \vert \underline{a} - \underline{b} \vert , \vert \overline{a} - \overline{b} \vert \} \right] \\&= \left[ d_H(A,B),d_H(A,B) \right] . \end{aligned}$$

\(\square \)
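
As a sanity check, Lemma 2.1 is easy to verify numerically; the following sketch (our helper names) tests the stated \(\leqq _{LU}\) comparison on random interval pairs:

```python
# Sketch: randomized check of Lemma 2.1.
import random

def gh_diff(A, B):
    d_lo, d_hi = A[0] - B[0], A[1] - B[1]
    return (min(d_lo, d_hi), max(d_lo, d_hi))

def d_H(A, B):  # Pompeiu-Hausdorff distance between two intervals
    return max(abs(A[0] - B[0]), abs(A[1] - B[1]))

random.seed(0)
for _ in range(10_000):
    lo_a, lo_b = random.uniform(-5, 5), random.uniform(-5, 5)
    A = (lo_a, lo_a + random.uniform(0, 5))
    B = (lo_b, lo_b + random.uniform(0, 5))
    C, r = gh_diff(A, B), d_H(A, B)
    assert C[0] <= r and C[1] <= r  # A gH-minus B <=_LU [d_H, d_H]
print("Lemma 2.1 verified on 10,000 random pairs")
```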

Theorem 2.1

(Aubin and Cellina [4]) Let \(F : S \subseteq \mathbb {R} \rightarrow \mathcal {K}_C\) be an interval-valued function such that \(F(x) = \left[ \underline{f}(x),\overline{f}(x) \right] \) for all \(x \in S\). Given \(x_0 \in S\), the limit \(\lim _{x \rightarrow x_0} F(x)\) exists if, and only if, both \(\lim _{x \rightarrow x_0} \underline{f}(x)\) and \(\lim _{x \rightarrow x_0} \overline{f}(x)\) exist, in which case

$$\begin{aligned} \lim _{x \rightarrow x_0} F(x) = \left[ \lim _{x \rightarrow x_0} \underline{f}(x) , \lim _{x \rightarrow x_0} \overline{f}(x) \right] . \end{aligned}$$

The usual inner product in \(\mathbb {R}^ n\) is denoted by \(\langle \cdot , \cdot \rangle \). The Euclidean norm is denoted by \(\Vert \cdot \Vert \). Given \(\epsilon >0\), \(N_{\epsilon }(x_0)\) denotes the \(\epsilon \)-neighborhood of \(x_0 \in \mathbb {R}^n\), that is, \(N_{\epsilon }(x_0) = \{ x \in \mathbb {R}^n : \Vert x-x_0 \Vert < \epsilon \}\).

Definition 2.2

(Luhandjula and Rangoaga [18]) Let \( F : S \subseteq \mathbb {R}^n \rightarrow \mathcal {K}_{C}\), \(F(x) = \left[ \underline{f}(x),\overline{f}(x) \right] \). F is said to be \(d_{H}\)-continuous at \(x_0 \in S\) if for every \(\epsilon > 0\), there exists \(\delta > 0\) such that \(\Vert x-x_0\Vert < \delta \) implies that \(d_H(F(x),F(x_0)) < \epsilon \).

Definition 2.3

Let \(S \subseteq \mathbb {R}^n\) be open and nonempty and \(F: S \rightarrow \mathcal {K}_C\).

  1. (i)

    F is said to be Lipschitz continuous of rank \(L > 0\) if

    $$\begin{aligned} d_H(F(x_1),F(x_2)) \le L \Vert x_1 - x_2 \Vert ~ \forall x_1, x_2 \in S. \end{aligned}$$
  2. (ii)

    F is said to be Lipschitz continuous locally near a given point \(x \in S\) of rank \(L > 0\) if, for some \(\epsilon > 0\),

    $$\begin{aligned} d_H(F(x_1),F(x_2)) \le L \Vert x_1 - x_2 \Vert ~\forall x_1, x_2 \in N_\epsilon (x). \end{aligned}$$

Every interval \(A = [\underline{a},\overline{a}] \in \mathcal {K}_{C}\) can be expressed through its center-radius representation \(A = (a^C;a^R)\) (see Ishibuchi and Tanaka [14] and Stefanini and Arana-Jiménez [21]), where \(a^C = \frac{\underline{a}+\overline{a}}{2}\), \(a^R = \frac{\overline{a}-\underline{a}}{2}\), \(\underline{a} = a^C-a^R\) and \(\overline{a} = a^C+a^R\). Then, the concepts of limit and continuity of interval-valued functions defined by means of the Pompeiu–Hausdorff metric \(d_H\) can be interpreted through their respective versions for real-valued functions (see Stefanini and Bede [22]). Based on the center-radius representation of intervals, the concepts of gH-derivative, gH-gradient, gH-directional derivative and gH-partial derivatives are defined in Stefanini and Arana-Jiménez [21]. In what follows, we state the definitions of such concepts and some of their properties.

Definition 2.4

(Stefanini and Arana-Jiménez [21]) Let \(F : S \subseteq \mathbb {R}^n \rightarrow \mathcal {K}_C\), \(F(x) = (f^C(x);f^R(x))\), and let \(x_0 \in S\) be such that \(x_0 + h \in S\) for all \(h \in \mathbb {R}^n\) with \(\Vert h \Vert < \delta \) for a given \(\delta > 0\). F is said to be gH-differentiable at \(x_0\) if there exist vectors \(w^C, w^R \in \mathbb {R}^n\) and functions \(\epsilon ^C,\epsilon ^R : \mathbb {R}^n \rightarrow \mathbb {R}\) with \(\epsilon ^C(h) \rightarrow 0\) and \(\epsilon ^R(h) \rightarrow 0\), when \(h \rightarrow 0\), such that, for all \(h \ne 0\),

$$\begin{aligned} f^C(x_0+h) - f^C(x_0) = \langle h , w^C \rangle + \Vert h \Vert \epsilon ^C(h) \end{aligned}$$

and

$$\begin{aligned} \left| f^R(x_0+h) - f^R(x_0) \right| = \left| \langle h , w^R \rangle + \Vert h \Vert \epsilon ^R(h) \right| . \end{aligned}$$

The interval-valued function \(D_{gH}F(x_0) : \mathbb {R}^n \rightarrow \mathcal {K}_C\) defined, for \(h \in \mathbb {R}^n\), by

$$\begin{aligned} D_{gH}F(x_0)(h) = \left( \langle h , w^C \rangle ; \left| \langle h , w^R \rangle \right| \right) \end{aligned}$$

is called the total gH-derivative of F at \(x_0\), and \(D_{gH}F(x_0)(h)\) is the interval-valued differential of F at \(x_0\) with respect to h.
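
To make Definition 2.4 concrete, the sketch below numerically checks both conditions for the illustrative example (ours, not from [21]) \(F(x) = (f^C(x);f^R(x))\) with \(f^C(x) = x_1^2+x_2\) and \(f^R(x) = \vert x_1-x_2 \vert \) at \(x_0 = (0,0)\), taking \(w^C = (0,1)\) and \(w^R = (1,-1)\); note that \(f^R\) is not classically differentiable at \(x_0\):

```python
# Sketch: numerical check of Definition 2.4 for an assumed example.
import numpy as np

fC = lambda x: x[0]**2 + x[1]          # center function
fR = lambda x: abs(x[0] - x[1])        # radius function
x0, wC, wR = np.zeros(2), np.array([0.0, 1.0]), np.array([1.0, -1.0])

rng = np.random.default_rng(0)
for t in [1e-2, 1e-4, 1e-6]:
    h = t * rng.standard_normal(2)
    # residual of the classic first-order expansion of f^C
    errC = abs(fC(x0 + h) - fC(x0) - h @ wC) / np.linalg.norm(h)
    # residual of the modulus condition on f^R (here it is exactly zero)
    errR = abs(abs(fR(x0 + h) - fR(x0)) - abs(h @ wR)) / np.linalg.norm(h)
    print(f"||h|| ~ {t:.0e}: center residual {errC:.2e}, radius residual {errR:.2e}")
```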

Definition 2.5

(Stefanini and Arana-Jiménez [21]) Let S be a nonempty open subset of \(\mathbb {R}^n\) and \(F: S \rightarrow \mathcal {K}_C\), and let \(x_0 \in S\) be such that \(x_0 + h \in S\) for all \(h \in \mathbb {R}^n\) with \(\Vert h \Vert < \delta \) for a given \(\delta > 0\). The directional generalized Hukuhara derivative (directional gH-derivative, for short) of F at \(x_0\) in the direction \(d \in \mathbb {R}^n\) is defined by

$$\begin{aligned} F_{gH}^\prime (x_0;d) = \lim _{h \rightarrow 0^{+}} \frac{1}{h} \cdot (F(x_0+hd) \ominus _{gH}F(x_0)), \end{aligned}$$

if it exists.

The partial interval-valued gH-derivative of F at \(x_0 \in S\) with respect to \(x_j\), \( j\in \{ 1,\ldots ,n \}\), is defined by

$$\begin{aligned} \frac{\partial _{gH}F}{\partial x_j}(x_0) = \lim _{h \rightarrow 0} \frac{1}{h} \cdot (F(x_0+he_j) \ominus _{gH}F(x_0)), \end{aligned}$$

if it exists. (Above, \(e_j\) denotes the j-th canonical vector in \(\mathbb {R}^n\).)
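
The difference quotient in Definition 2.5 can be approximated directly. The sketch below (our code, reusing the example \(f^C(x) = x_1^2+x_2\), \(f^R(x) = \vert x_1-x_2 \vert \) from above) does so and matches the value predicted by Theorem 2.2 below:

```python
# Sketch: directional gH-derivative via a gH-difference quotient.
import numpy as np

def F(x):  # interval value as (lo, hi) = (f^C - f^R, f^C + f^R)
    c, r = x[0]**2 + x[1], abs(x[0] - x[1])
    return (c - r, c + r)

def gh_diff(A, B):
    d_lo, d_hi = A[0] - B[0], A[1] - B[1]
    return (min(d_lo, d_hi), max(d_lo, d_hi))

def dir_gh_deriv(F, x0, d, h=1e-7):
    q = gh_diff(F(x0 + h * d), F(x0))
    return (q[0] / h, q[1] / h)

x0, d = np.zeros(2), np.array([1.0, 2.0])
print(dir_gh_deriv(F, x0, d))
# Theorem 2.2 predicts (<w^C,d>; |<w^R,d>|) = (2; 1), i.e. the interval [1, 3].
```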

Proposition 2.1

(Stefanini and Arana-Jiménez [21]) Let \(F: S\rightarrow \mathcal {K}_C\), where S is an open subset of \(\mathbb {R}^n\). Assume that \(F = [\underline{f},\overline{f}]\) is gH-differentiable at \(x_0 \in S\) and suppose that the lateral partial derivatives of \(\underline{f}\) and \(\overline{f}\) exist at \(x_0\) for a given index \(j \in \{ 1,\ldots ,n \}\). Denote \(\underline{u}_{j}^{-} = \frac{\partial \underline{f}}{\partial x_j^-}(x_0)\), \(\underline{u}_{j}^{+} = \frac{\partial \underline{f}}{\partial x_j^+}(x_0)\), \(\overline{u}_{j}^{-} = \frac{\partial \overline{f}}{\partial x_j^-}(x_0)\) and \(\overline{u}_{j}^{+} = \frac{\partial \overline{f}}{\partial x_j^+}(x_0)\).

  1. (i)

If \(\underline{u}_j^- = \underline{u}_j^+ =: m_j\) and \(\overline{u}_j^- = \overline{u}_j^+ =: n_j\), then

    $$\begin{aligned} \frac{\partial _{gH} F}{\partial x_j}(x_0) = \left[ m_j \vee n_j \right] = \left( \frac{m_j+n_j}{2};\left| \frac{m_j-n_j}{2} \right| \right) . \end{aligned}$$
  2. (ii)

    If \(\underline{u}_j^- = \overline{u}_j^+ =: p_j\) and \(\overline{u}_j^- = \underline{u}_j^+ =: q_j\), then

    $$\begin{aligned} \frac{\partial _{gH} F}{\partial x_j}(x_0) = \left[ p_j \vee q_j \right] = \left( \frac{p_j+q_j}{2};\left| \frac{p_j-q_j}{2} \right| \right) . \end{aligned}$$

Theorem 2.2

(Stefanini and Arana-Jiménez [21]) Let \(F=(f^C;f^R)\) be an interval-valued function defined on an open set \(S \subseteq \mathbb {R}^n\). If F is gH-differentiable at \(x_0\), then all the interval-valued partial gH-derivatives of F exist and, following the notation used in Definition 2.4,

$$\begin{aligned} \frac{\partial _{gH} F}{\partial x_j}(x_0) = F_{gH}^\prime (x_0;e_j) = \left( w_j^C; \left| w_j^R \right| \right) \end{aligned}$$

for \(j \in \{ 1,\ldots ,n \}\). Furthermore, all directional gH-derivatives of F exist and

$$\begin{aligned} F_{gH}^\prime (x_0;d) = \left( \langle w^C , d \rangle ; \left| \langle w^R , d \rangle \right| \right) . \end{aligned}$$

Let \(f : S \subseteq \mathbb {R}^n \rightarrow \mathbb {R}\) and \(x_0 \in S\), where S is open, be such that every lateral partial derivative of f exists at \(x_0\). We will use the following notation:

$$\begin{aligned}&\left( \nabla f \right) _{-}(x_0) = \left( \frac{\partial f}{\partial x_1^-}(x_0),\ldots ,\frac{\partial f}{\partial x_n^-}(x_0) \right) \\&\left( \nabla f \right) _{+}(x_0) = \left( \frac{\partial f}{\partial x_1^+}(x_0),\ldots ,\frac{\partial f}{\partial x_n^+}(x_0) \right) . \end{aligned}$$

Theorem 2.3

Let \(S \subseteq \mathbb {R}^n\) be an open and nonempty set and \(F : S \rightarrow \mathcal {K}_C\), with \(F(x) = \left[ \underline{f}(x),\overline{f}(x) \right] \) for all \(x \in S\). If F is gH-differentiable at \(x_0 \in S\), then for all \(d \in \mathbb {R}^n\), one of the following cases holds:

  1. (i)

    \(\nabla \underline{f}(x_0)\) and \(\nabla \overline{f}(x_0)\) exist, and

    $$\begin{aligned} F_{gH}^\prime (x_0;d) = \left[ \left\langle \nabla \underline{f}(x_0) , d \right\rangle \vee \left\langle \nabla \overline{f}(x_0) , d \right\rangle \right] . \end{aligned}$$

    Particularly,

    $$\begin{aligned} \frac{\partial _{gH}F}{\partial x_i}(x_0) = \left[ \frac{\partial \underline{f}}{\partial x_i}(x_0) \vee \frac{\partial \overline{f}}{\partial x_i}(x_0) \right] , ~ i \in \{ 1,\ldots ,n \}. \end{aligned}$$
  2. (ii)

    \(\left( \nabla \underline{f} \right) _{-}(x_0)\), \(\left( \nabla \overline{f}\right) _{-}(x_0)\), \(\left( \nabla \underline{f}\right) _{+}(x_0)\), \(\left( \nabla \overline{f}\right) _{+}(x_0)\) exist, satisfy

    $$\begin{aligned}&\left\langle \left( \nabla \underline{f}\right) _{-}(x_0) , d \right\rangle = \left\langle \left( \nabla \overline{f}\right) _{+}(x_0) , d \right\rangle , \\&\left\langle \left( \nabla \overline{f}\right) _{-}(x_0) , d \right\rangle = \left\langle \left( \nabla \underline{f}\right) _{+}(x_0) , d \right\rangle , \end{aligned}$$

    and

    $$\begin{aligned} F_{gH}^\prime (x_0;d)= & {} \left[ \left\langle \left( \nabla \underline{f}\right) _{-}(x_0) , d \right\rangle \vee \left\langle \left( \nabla \overline{f}\right) _{-}(x_0) , d \right\rangle \right] \\= & {} \left[ \left\langle \left( \nabla \underline{f}\right) _{+}(x_0) , d \right\rangle \vee \left\langle \left( \nabla \overline{f}\right) _{+}(x_0) , d \right\rangle \right] . \end{aligned}$$

    Particularly,

    $$\begin{aligned} \frac{\partial _{gH}F}{\partial x_i}(x_0) \!=\! \left[ \frac{\partial \underline{f}}{\partial x_i^-}(x_0) \vee \frac{\partial \overline{f}}{\partial x_i^-}(x_0) \right] \!=\! \left[ \frac{\partial \underline{f}}{\partial x_i^+}(x_0) \vee \frac{\partial \overline{f}}{\partial x_i^+}(x_0)\right] , ~ i \!\in \! \{ 1,\ldots ,n \}. \end{aligned}$$

Proof

It follows directly from Proposition 2.1 and Theorem 2.2. \(\square \)

Let \(f : S \subseteq \mathbb {R}^n \rightarrow \mathbb {R}\) and \(x^*\in S\), where S is open, be such that every lateral partial derivative of f exists at \(x^*\). The following notation will be used later in the paper:

$$\begin{aligned}&\nabla _{\sharp } f(x^*) = {\left\{ \begin{array}{ll} \nabla f(x^*), &{} \text {if it exists}, \\ \left( \nabla f \right) _{-}(x^*), &{} \text {otherwise}, \end{array}\right. } \\&\frac{\partial f}{\partial x_i^\sharp }(x^*) = {\left\{ \begin{array}{ll} \frac{\partial f}{\partial x_i}(x^*), &{} \text {if it exists}, \\ \frac{\partial f}{\partial x_i^-}(x^*), &{} \text {otherwise}. \end{array}\right. } \end{aligned}$$

Given \(F : S \subseteq \mathbb {R}^n \rightarrow \mathcal {K}_C\), with \(F(x) = \left[ \underline{f}(x),\overline{f}(x) \right] \) \(\forall x \in S\), and \(x^*\in S\), when F is gH-differentiable at \(x^*\), we will denote:

$$\begin{aligned} \frac{\partial _{gH}F}{\partial x_i}(x^*)&= {\left\{ \begin{array}{ll} \left[ \frac{\partial \underline{f}}{\partial x_i}(x^*) \vee \frac{\partial \overline{f}}{\partial x_i}(x^*) \right] , &{} \text {if they exist}, \\ \left[ \frac{\partial \underline{f}}{\partial x_i^-}(x^*) \vee \frac{\partial \overline{f}}{\partial x_i^-}(x^*) \right] = \left[ \frac{\partial \underline{f}}{\partial x_i^+}(x^*) \vee \frac{\partial \overline{f}}{\partial x_i^+}(x^*)\right] , &{} \text {otherwise} \end{array}\right. } \\&= \left[ \frac{\partial \underline{f}}{\partial x_i^\sharp }(x^*) \vee \frac{\partial \overline{f}}{\partial x_i^\sharp }(x^*) \right] \end{aligned}$$

and

$$\begin{aligned} \nabla _{gH} F(x^*) = \left( \frac{\partial _{gH}F}{\partial x_1}(x^*),\ldots ,\frac{\partial _{gH}F}{\partial x_n}(x^*) \right) . \end{aligned}$$
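
For smooth extremum functions, \(\nabla _{gH} F\) can be assembled componentwise from the classic gradients via Theorem 2.3(i). A sketch (our example functions, gradients approximated by central differences):

```python
# Sketch: gH-gradient for F(x) = [x1*x2 - 1, x1*x2 + x1**2 + 1].
import numpy as np

def grad(f, x, h=1e-6):  # central-difference classic gradient
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x); e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

f_lo = lambda x: x[0] * x[1] - 1.0
f_hi = lambda x: x[0] * x[1] + x[0]**2 + 1.0

def gh_gradient(x):
    gl, gu = grad(f_lo, x), grad(f_hi, x)
    # i-th component is the interval [df_lo/dx_i v df_hi/dx_i]
    return [(min(a, b), max(a, b)) for a, b in zip(gl, gu)]

print(gh_gradient(np.array([1.0, 2.0])))
# expected: [(2.0, 4.0), (1.0, 1.0)] up to finite-difference error
```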

Next, we reproduce some key definitions and results from Girsanov [12] related to the Dubovitskii–Milyutin formalism.

Definition 2.6

Let E be a topological linear space. A nonzero continuous linear functional f is said to be a supporting functional for a set \(A \subset E\) at \(x_0 \in A\) if \(f(x) \ge f(x_0)\) for all \(x\in A\).

Definition 2.7

Let E be a topological linear space and \(E^\prime \) its topological dual space.

  1. (i)

    A set \(\mathcal {K}\) in E is called a cone with apex at 0 if \(x \in \mathcal {K}\) implies that \(\lambda x \in \mathcal {K}\) for all \(\lambda >0\).

  2. (ii)

    If \(\mathcal {K}\) is a cone in E, its dual cone, denoted by \(\mathcal {K}^*\), is defined as the set of all continuous linear functionals which are nonnegative on \(\mathcal {K}\), that is,

    $$\begin{aligned} \mathcal {K}^*= \{ f \in E^\prime : f(x) \ge 0 ~ \forall x \in \mathcal {K} \}. \end{aligned}$$

Lemma 2.2

(Dubovitskii–Milyutin, see Girsanov [12]) Let \(\mathcal {K}_1,\ldots ,\mathcal {K}_l,\mathcal {K}_{l+1}\) be convex cones with apex at the origin, where \(\mathcal {K}_1,\ldots ,\mathcal {K}_l\) are open. Then, \(\bigcap _{i=1}^{l+1} \mathcal {K}_i = \emptyset \) if, and only if, there exist linear functionals \(f_i \in \mathcal {K}_i^*\), \(i \in \{1,\ldots ,l+1\}\), not all zero, such that \(f_1+\ldots +f_l+f_{l+1}=0\).
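
A minimal illustration of Lemma 2.2 in \(\mathbb {R}^2\) (our choice of cones): the open convex cone \(\mathcal {K}_1 = \{x : x_1 > 0\}\) and the convex cone \(\mathcal {K}_2 = \{x : x_1 \le 0\}\) are disjoint, and the functionals \(f_1 = (1,0) \in \mathcal {K}_1^*\) and \(f_2 = (-1,0) \in \mathcal {K}_2^*\), not both zero, satisfy \(f_1+f_2=0\). The sketch below checks the dual-cone memberships by sampling:

```python
# Sketch: the Dubovitskii-Milyutin identity on two half-plane cones in R^2.
import numpy as np

f1, f2 = np.array([1.0, 0.0]), np.array([-1.0, 0.0])
rng = np.random.default_rng(1)
pts = rng.standard_normal((10_000, 2))
K1, K2 = pts[pts[:, 0] > 0], pts[pts[:, 0] <= 0]
assert (K1 @ f1 >= 0).all() and (K2 @ f2 >= 0).all()  # f_i nonnegative on K_i
print(f1 + f2)  # [0. 0.]: the functionals sum to zero, as the lemma asserts
```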

Definition 2.8

Let E be a topological linear space and \(Q \subseteq E\).

  1. (i)

A vector d is a feasible direction for Q at a point \(x_0\) if there exist a neighborhood U of d and a positive scalar \(\epsilon _0\) such that \(x_0+\epsilon \tilde{d} \in Q\) for all \(\epsilon \in (0,\epsilon _0)\) and all \(\tilde{d} \in U\). The set of all feasible directions will be denoted by \(\mathcal {K}_b\).

  2. (ii)

    A vector d is a tangent direction to Q at a point \(x_0\) if there exist a positive scalar \(\epsilon _0\) and a map \(r : (0,\epsilon _0) \rightarrow E\) such that \(x(\epsilon ) := x_0 + \epsilon d + r(\epsilon ) \in Q\) for all \(\epsilon \in (0,\epsilon _0)\), where for any neighborhood U of zero, \(\frac{1}{\epsilon } r(\epsilon )\in U\) for all small \(\epsilon >0\) (in a Banach space this is replaced by the simpler condition \(\Vert r(\epsilon ) \Vert = o(\epsilon )\)). The set of all tangent directions will be denoted by \(\mathcal {K}_k\).

We keep the classical notation \(\mathcal {K}_b\) and \(\mathcal {K}_k\) (the Russian initials of “feasible cone” and “tangent cone”).

Proposition 2.2

(Girsanov [12]) The set of all feasible directions \(\mathcal {K}_b\) is an open cone with apex at zero.

Proposition 2.3

(Girsanov [12]) The set of all tangent directions \(\mathcal {K}_k\) is a cone with apex at zero.

Theorem 2.4

(Girsanov [12]) Let E be a Banach space and \(f : E \rightarrow \mathbb {R}\) be Fréchet differentiable at \(x_0 \in E\). Denote \(Q = \{ x : f(x) \le f(x_0) \}\). If \(f^\prime (x_0) \ne 0\), then

$$\begin{aligned} \mathcal {K}_b = \{ d : f^\prime (x_0;d) < 0 \}. \end{aligned}$$

Theorem 2.5

(Girsanov [12]) Let Q be a closed convex set and \(x_0 \in Q\). If \(\mathrm {int}(Q) \ne \emptyset \), then

$$\begin{aligned} \mathcal {K}_b= & {} \left\{ \lambda [\mathrm {int}(Q) - x_0] : \lambda> 0 \right\} \\= & {} \left\{ d : d = \lambda (x - x_0), ~ x \in \mathrm {int}(Q), ~ \lambda > 0 \right\} . \end{aligned}$$

Theorem 2.6

(Lyusternik, see Girsanov [12]) Let \(E_1\) and \(E_2\) be Banach spaces, \(P : E_1 \rightarrow E_2\) be a Fréchet differentiable operator in a neighborhood of a point \(x_0\) with \(P(x_0) = 0\). Let \(P^\prime (x)\) be continuous in a neighborhood of \(x_0\) and suppose that \(P^\prime (x_0)\) maps \(E_1\) onto \(E_2\). Then,

$$\begin{aligned} \mathcal {K}_k = \{ d : P^\prime (x_0)d = 0 \}. \end{aligned}$$

Theorem 2.7

(Girsanov [12]) Let \(f \in E^\prime \) and denote

$$\begin{aligned} \mathcal {K}_I = \{ x : f(x) = 0 \}, \quad \mathcal {K}_{II} = \{ x : f(x) \ge 0 \} \quad \text {and} \quad \mathcal {K}_{III} = \{ x : f(x) > 0 \}. \end{aligned}$$

Then,

$$\begin{aligned} \mathcal {K}_I^*= \{ \lambda f : \lambda \in \mathbb {R} \}, \quad \mathcal {K}_{II}^*= \{ \lambda f : 0 \le \lambda < \infty \}, \quad \mathcal {K}_{III}^*= {\left\{ \begin{array}{ll} E^\prime , ~ \text {if} ~ f = 0, \\ \mathcal {K}_{II}^*, ~ \text {if} ~ f \ne 0. \end{array}\right. } \end{aligned}$$

Theorem 2.8

(Girsanov [12]) Let A be a \(k \times l\) matrix and \(K = \{ x \in \mathbb {R}^l : Ax = 0 \}\). Then, \(K^*= \{ A^Ty : y \in \mathbb {R}^k \}\).

Theorem 2.9

(Girsanov [12]) Let Q be a closed convex set and \(x_0 \in Q\). If \(\mathrm {int}(Q) \ne \emptyset \), then \(\mathcal {K}_b^*= Q^*= \{ f \in E^\prime : f(x) \ge f(x_0) ~ \forall x \in Q\}\).

2.2 The Interval Optimization Problem

We consider the following interval-valued optimization problem with functional and abstract constraints:

$$\begin{aligned} \begin{array}{ll} \text {minimize} &{} F(x) = [\underline{f}(x),\overline{f}(x)] \\ \text {subject to} &{} g_i(x) \le 0, ~ i \in I = \{1,\ldots ,p\}, \\ &{} h_j(x) = 0, ~ j \in J = \{1,\ldots ,q\}, \\ &{} x \in S \subseteq \mathbb {R}^n, \end{array} \qquad \qquad \qquad \text {(IOP)} \end{aligned}$$

where \(F = [\underline{f},\overline{f}]: \mathbb {R}^n \rightarrow \mathcal {K}_C\), \(\underline{f},\overline{f} : \mathbb {R}^n \rightarrow \mathbb {R}\), \(\underline{f}(x) \le \overline{f}(x)\) for all \(x \in \mathbb {R}^n\), \(g_i : \mathbb {R}^n \rightarrow \mathbb {R}\), \(i \in I\), \(h_j : \mathbb {R}^n \rightarrow \mathbb {R}\), \(j \in J\), and S is a nonempty set in \(\mathbb {R}^n\).

The set of all feasible solutions to (IOP) will be denoted by \(\mathcal {F}\), that is,

$$\begin{aligned} \mathcal {F} = \{ x \in \mathbb {R}^n : g_i(x) \le 0, ~ i \in I, ~ h_j(x) = 0, ~ j \in J, ~ x \in S \}. \end{aligned}$$

The set of indices related to the active constraints at a feasible point \(x^*\in \mathcal {F}\) will be denoted by \(I(x^*)\):

$$\begin{aligned} I(x^*) = \{ i \in I : g_i(x^*) = 0 \}. \end{aligned}$$
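
As a running illustration (ours, not an example from the paper), consider the toy instance of (IOP) with \(F(x) = [x_1+x_2-\vert x_1 \vert , x_1+x_2+\vert x_1 \vert ]\), \(g_1(x) = -x_1\), \(h_1(x) = x_1+x_2-1\) and \(S = \mathbb {R}^2\); on the feasible set one has \(F(x) = [1-x_1,1+x_1]\) with \(x_1 \ge 0\), so \(x^*= (0,1)\) is a global weak LU-solution. The sketch below encodes feasibility and the active set:

```python
# Sketch: a toy (IOP) instance with one inequality and one equality constraint.
import numpy as np

f_lo = lambda x: x[0] + x[1] - abs(x[0])   # lower extremum function
f_hi = lambda x: x[0] + x[1] + abs(x[0])   # upper extremum function
g = [lambda x: -x[0]]                      # g_1(x) <= 0
h = [lambda x: x[0] + x[1] - 1.0]          # h_1(x) = 0

def feasible(x, tol=1e-9):
    return all(gi(x) <= tol for gi in g) and all(abs(hj(x)) <= tol for hj in h)

def active_set(x, tol=1e-9):
    return [i for i, gi in enumerate(g) if abs(gi(x)) <= tol]

x_star = np.array([0.0, 1.0])
print(feasible(x_star), active_set(x_star))  # True [0]: g_1 is active at x*
```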

Next, we give the concepts of solutions that will be used for the problem (IOP). They are based on the LU order relation.

Definition 2.9

(See Wu [26]) Let \(x^*\in S\). Then,

  1. (i)

    \(x^*\) is said to be a local LU-solution of (IOP) if there exists \(\epsilon >0\) such that there does not exist \(x \in N_{\epsilon }(x^*) \cap \mathcal {F}\) with \(F(x) \le _{LU} F(x^*)\).

  2. (ii)

    \(x^*\) is said to be a local weak LU-solution of (IOP) if there exists \(\epsilon >0\) such that there does not exist \(x \in N_{\epsilon }(x^*) \cap \mathcal {F}\) with \(F(x) <_{LU} F(x^*)\).

If \(N_{\epsilon }(x^*)\) above is replaced by the whole space \(\mathbb {R}^n\), we have the corresponding definitions of global solutions for (IOP).

3 Necessary Optimality Conditions

In this section, necessary optimality conditions for (IOP) will be established through an adaptation of the Dubovitskii–Milyutin formalism. The cones of feasible and tangent directions and their respective dual cones can be found in Girsanov [12]. We define an appropriate cone of directions of decrease for interval-valued functions and characterize it by means of the gH-derivative of the function. A geometric necessary optimality condition involving the mentioned cones is derived. Algebraic necessary optimality conditions are then obtained by applying the Dubovitskii–Milyutin Lemma (Lemma 2.2; see also Lemma 5.11 in [12]).

The following assumptions will be required. Let \(x^*\in \mathcal {F}\).

  1. (A1)

F is Lipschitz continuous locally near \(x^*\).

  2. (A2)

    F is continuously gH-differentiable in a neighborhood of \(x^*\).

  3. (A3)

    \(g_i\), \(i \in I\), \(h_j\), \(j \in J\), are continuously differentiable in a neighborhood of \(x^*\).

  4. (A4)

    S is closed and convex, and \(\mathrm {int}(S) \ne \emptyset \).

Now, we define the directions of decrease for the interval-valued function F and then develop some properties of this notion.

Definition 3.1

A direction \(d \in \mathbb {R}^n\) is said to be an LU-direction of decrease of F at \(x^*\in \mathcal {F}\) if there exist a neighborhood N of d, a positive scalar \(\epsilon ^*\) and an interval \([\underline{\alpha },\overline{\alpha }] <_{LU} [0,0]\) such that

$$\begin{aligned} F(x^*+\epsilon \tilde{d}) <_{LU} F(x^*) + \epsilon [\underline{\alpha },\overline{\alpha }] ~ \forall \epsilon \in (0,\epsilon ^*), ~ \forall \tilde{d} \in N. \end{aligned}$$

The set of all LU-directions of decrease of F at \(x^*\) will be denoted by \(\mathcal {K}_y^{LU}\).

It is important to emphasize that, in Definition 3.1, both \(\underline{\alpha }\) and \(\overline{\alpha }\) depend on F, \(x^*\), and d.

Proposition 3.1

The set of all LU-directions of decrease \(\mathcal {K}_y^{LU}\) is an open convex cone.

The proof of Proposition 3.1 will be postponed to the end of the next subsection.

Next, we have a geometric characterization of weak LU-solutions.

Theorem 3.1

Let \(x^*\in \mathcal {F}\) be a local weak LU-solution of (IOP). Then,

$$\begin{aligned} \mathcal {K}_y^{LU} \cap \mathcal {K}_b \cap \mathcal {K}_k = \emptyset . \end{aligned}$$

Proof

Suppose that there exists \(d \in \mathcal {K}_y^{LU} \cap \mathcal {K}_b \cap \mathcal {K}_k\). Then, by Definition 3.1, there exist a neighborhood \(N_0\) of d, a positive scalar \(\epsilon _0\) and an interval \(\left[ \underline{\alpha },\overline{\alpha }\right] <_{LU} [0,0]\) such that

$$\begin{aligned} F(x^*+\epsilon \tilde{d}) <_{LU} F(x^*) + \epsilon \left[ \underline{\alpha },\overline{\alpha } \right] ~ \forall \epsilon \in (0,\epsilon _0), ~ \forall \tilde{d} \in N_0. \end{aligned}$$
(1)

By Definition 2.8-(i), it follows that there exists a neighborhood \(N_1\) of d and a scalar \(\epsilon _1>0\) such that

$$\begin{aligned} x^*+ \epsilon \tilde{d} \in \mathcal {F}_1 ~ \forall \epsilon \in (0,\epsilon _1), ~ \forall \tilde{d} \in N_1, \end{aligned}$$
(2)

where \(\mathcal {F}_1 = \{ x \in \mathbb {R}^n : g_i(x) \le 0, ~ i \in I, ~ x \in S \}\). We also have, by Definition 2.8-(ii), that there exist a scalar \(\epsilon _2>0\) and a map \(r : (0,\epsilon _2) \rightarrow \mathbb {R}^n\) such that

$$\begin{aligned} x^*+ \epsilon d + r(\epsilon ) \in \mathcal {F}_2 ~ \forall \epsilon \in (0,\epsilon _2), \end{aligned}$$
(3)

where \(\Vert r(\epsilon ) \Vert = o(\epsilon )\) and \(\mathcal {F}_2 = \{ x \in \mathbb {R}^n : h_j(x) = 0, ~ j \in J \}\). Let \(d_\epsilon := d + \frac{1}{\epsilon } r(\epsilon )\), \(\epsilon \in (0,\epsilon _3)\), where \(\epsilon _3 > 0\) is small enough so that \(d_\epsilon \in N := N_0 \cap N_1\) for all \(\epsilon \in (0,\epsilon _3)\). Since \(x^*\) is a local weak LU-solution of (IOP), there exists \(\epsilon _4 > 0\) such that there does not exist \(x \in N_{\epsilon _4}(x^*) \cap \mathcal {F}\) with \(F(x) <_{LU} F(x^*)\). If \(\epsilon _5 := \min \{ \epsilon _0,\epsilon _1,\epsilon _2,\epsilon _3,\epsilon _4 \}\), it follows, by (2)–(3), that

$$\begin{aligned} x_\epsilon := x^*+ \epsilon d_\epsilon \in N_{\epsilon _5}(x^*) \cap \mathcal {F}_1 \cap \mathcal {F}_2 = N_{\epsilon _5}(x^*) \cap \mathcal {F} ~ \forall \epsilon \in (0,\epsilon _5). \end{aligned}$$

Moreover, by (1),

$$\begin{aligned} F(x_\epsilon ) = F(x^*+\epsilon d_\epsilon )<_{LU} F(x^*) + \epsilon \left[ \underline{\alpha },\overline{\alpha } \right] <_{LU} F(x^*) ~ \forall \epsilon \in (0,\epsilon _5). \end{aligned}$$

Therefore, \(x_\epsilon \in N_{\epsilon _5}(x^*) \cap \mathcal {F}\) satisfies \(F(x_\epsilon ) <_{LU} F(x^*)\), which is a contradiction to the assumption that \(x^*\in \mathcal {F}\) is a local weak LU-solution. \(\square \)

3.1 Characterization of the Cone of LU-Directions of Decrease

In order to transcribe the necessary optimality condition given in geometric form in Theorem 3.1 into an algebraic one, it is necessary to characterize the cones of LU-directions of decrease and of feasible and tangent directions. The characterizations of the cones of feasible and tangent directions are already known in the literature and we have already stated them in Theorems 2.4, 2.5, and 2.6. So, we proceed to characterize \(\mathcal {K}_y^{LU}\).

Let \(x^*\in S\). In order to characterize the cone of LU-directions of decrease in terms of the gH-derivative of F, we define

$$\begin{aligned} \mathcal {K}_0 = \left\{ d \in \mathbb {R}^n : F_{gH}^\prime (x^*;d)<_{LU} [0,0] \right\} . \end{aligned}$$

Theorem 3.2

Assume that (A2) holds. Then, \(\mathcal {K}_y^{LU} \subset \mathcal {K}_0\).

Proof

Let \(d \in \mathcal {K}_y^{LU}\). Then, there exist a neighborhood N of d, a positive scalar \(\epsilon ^*\) and an interval \([\underline{\alpha },\overline{\alpha }] <_{LU} [0,0]\) such that

$$\begin{aligned} F(x^*+ \epsilon \tilde{d}) <_{LU} F(x^*) + \epsilon [\underline{\alpha },\overline{\alpha }] ~ \forall \epsilon \in (0,\epsilon ^*), ~ \forall \tilde{d} \in N. \end{aligned}$$
(4)

We have that

$$\begin{aligned} \lim _{\epsilon \rightarrow 0^+} \frac{1}{\epsilon } \cdot (F(x^*+\epsilon d) \ominus _{gH}F(x^*)) <_{LU} [0,0]. \end{aligned}$$

Indeed, let us set, for \(\epsilon \in (0,\epsilon ^*)\),

$$\begin{aligned} A_\epsilon&:= [\underline{a}_\epsilon ,\overline{a}_\epsilon ]= F(x^*+ \epsilon d), \\ B_\epsilon&:= [\underline{b}_\epsilon ,\overline{b}_\epsilon ]= F(x^*), \\ C_\epsilon&:= [\underline{c}_\epsilon ,\overline{c}_\epsilon ]= A_\epsilon \ominus _{gH}B_\epsilon = F(x^*+ \epsilon d)\ominus _{gH}F(x^*). \end{aligned}$$

Since \(A_\epsilon \ominus _{gH} B_\epsilon = C_\epsilon \), we have that

$$\begin{aligned} {\left\{ \begin{array}{ll} A_\epsilon = B_\epsilon +C_\epsilon \\ \text {or} \\ B_\epsilon = A_\epsilon -C_\epsilon \end{array}\right. }&\Leftrightarrow {\left\{ \begin{array}{ll} \underline{c}_\epsilon = \underline{a}_\epsilon - \underline{b}_\epsilon \quad \text {and} \quad \overline{c}_\epsilon = \overline{a}_\epsilon - \overline{b}_\epsilon \\ \text {or} \\ \overline{c}_\epsilon = \underline{a}_\epsilon -\underline{b}_\epsilon \quad \text {and} \quad \underline{c}_\epsilon = \overline{a}_\epsilon -\overline{b}_\epsilon . \end{array}\right. } \end{aligned}$$
(5)

By (4), we have that

$$\begin{aligned} A_\epsilon<_{LU} B_\epsilon + \epsilon [\underline{\alpha },\overline{\alpha }] \Leftrightarrow \left[ \underline{a}_\epsilon ,\overline{a}_\epsilon \right] <_{LU} \left[ \underline{b}_\epsilon + \epsilon \underline{\alpha },\overline{b}_\epsilon + \epsilon \overline{\alpha } \right] , \end{aligned}$$

so that

$$\begin{aligned} \frac{1}{\epsilon }(\underline{a}_\epsilon -\underline{b}_\epsilon )< \underline{\alpha } \quad \text {and} \quad \frac{1}{\epsilon }(\overline{a}_\epsilon -\overline{b}_\epsilon ) < \overline{\alpha }. \end{aligned}$$

Let us assume that the first case in (5) holds. If the second case occurs, we proceed similarly. Then,

$$\begin{aligned} \lim _{\epsilon \rightarrow 0^+} \frac{1}{\epsilon } \underline{c}_\epsilon \le \underline{\alpha } \quad \text {and} \quad \lim _{\epsilon \rightarrow 0^+} \frac{1}{\epsilon } \overline{c}_\epsilon \le \overline{\alpha }, \end{aligned}$$

that is,

$$\begin{aligned} \left[ \lim _{\epsilon \rightarrow 0^+} \frac{1}{\epsilon } \underline{c}_\epsilon , \lim _{\epsilon \rightarrow 0^+} \frac{1}{\epsilon } \overline{c}_\epsilon \right] \leqq _{LU} [\underline{\alpha },\overline{\alpha }] <_{LU} [0,0]. \end{aligned}$$

It follows from Theorem 2.1 that

$$\begin{aligned} \lim _{\epsilon \rightarrow 0^+} \frac{1}{\epsilon } \cdot C_\epsilon <_{LU} [0,0]. \end{aligned}$$

By Definition 2.5, \(F_{gH}^\prime (x^*;d) <_{LU} [0,0]\), that is, \(d \in \mathcal {K}_0\). \(\square \)

Now, we establish the converse of Theorem 3.2:

Theorem 3.3

Suppose that (A1)–(A2) hold. Then, \(\mathcal {K}_0 \subset \mathcal {K}_y^{LU}\).

Proof

Let \(d \in \mathcal {K}_0\). By Definition 2.5,

$$\begin{aligned} F_{gH}^\prime (x^*;d) = \lim _{\epsilon \rightarrow 0^+} \frac{1}{\epsilon } \cdot (F(x^*+\epsilon d) \ominus _{gH} F(x^*)) =: \left[ \underline{\delta },\overline{\delta } \right] \in \mathcal {K}_C, \end{aligned}$$

where \(\underline{\delta } \le \overline{\delta }\). Then, since \(d \in \mathcal {K}_0\),

$$\begin{aligned}{}[0,0] >_{LU} F_{gH}^\prime (x^*;d) = \left[ \underline{\delta },\overline{\delta } \right] , \end{aligned}$$

which implies that \(\left[ \underline{\delta },\overline{\delta } \right] <_{LU} [0,0]\), that is, \(\underline{\delta } < 0\) and \(\overline{\delta } < 0\). By definition of limit, given \(\epsilon ^\prime \in (0,-\overline{\delta }/2)\), there exists \(\delta ^\prime > 0\) such that

$$\begin{aligned} d_H \left( \frac{1}{\epsilon } \cdot (F(x^*+\epsilon d) \ominus _{gH} F(x^*)),[\underline{\delta },\overline{\delta }] \right) < \epsilon ^\prime \end{aligned}$$

for all \(\epsilon \in (0,\delta ^\prime )\). By setting

$$\begin{aligned} \frac{1}{\epsilon } \cdot (F(x^*+\epsilon d) \ominus _{gH} F(x^*)) =: K_\epsilon = \left[ \underline{k}_\epsilon ,\overline{k}_\epsilon \right] , \end{aligned}$$

where \(\epsilon \in (0,\delta ^\prime )\), we have that

$$\begin{aligned}&\max \{ \vert \underline{k}_\epsilon - \underline{\delta } \vert , \vert \overline{k}_\epsilon - \overline{\delta } \vert \} = d_{H} (K_\epsilon ,[\underline{\delta },\overline{\delta }])< \epsilon ^\prime \nonumber \\&\quad \Rightarrow \underline{k}_\epsilon - \underline{\delta } \le \vert \underline{k}_\epsilon - \underline{\delta } \vert< \epsilon ^\prime \quad \text {and} \quad \overline{k}_\epsilon -\overline{\delta } \le \vert \overline{k}_\epsilon - \overline{\delta } \vert< \epsilon ^\prime \nonumber \\&\quad \Rightarrow \underline{k}_\epsilon< \epsilon ^\prime + \underline{\delta }< -\frac{\overline{\delta }}{2} + \underline{\delta } \le -\frac{\underline{\delta }}{2} + \underline{\delta } = \frac{\underline{\delta }}{2} \quad \text {and} \quad \overline{k}_\epsilon< \epsilon ^\prime + \overline{\delta }< -\frac{\overline{\delta }}{2} + \overline{\delta } = \frac{\overline{\delta }}{2} \nonumber \\&\quad \Rightarrow K_\epsilon = \left[ \underline{k}_\epsilon ,\overline{k}_\epsilon \right]<_{LU} \left[ \frac{\underline{\delta }}{2},\frac{\overline{\delta }}{2} \right] \nonumber \\&\quad \Rightarrow F(x^*+\epsilon d) \ominus _{gH} F(x^*) <_{LU} \epsilon \left[ \frac{\underline{\delta }}{2},\frac{\overline{\delta }}{2} \right] ~ \forall \epsilon \in (0,\delta ^\prime ). \end{aligned}$$
(6)

Since F is Lipschitz continuous locally near \(x^*\) (assumption (A1)), there exist \(L>0\) and \(\epsilon ^*> 0\) such that

$$\begin{aligned} d_H(F(x_1),F(x_2)) \le L \Vert x_1 - x_2 \Vert \end{aligned}$$
(7)

for all \(x_1\) and \(x_2\) in which \(\Vert x_1 - x^*\Vert < \epsilon ^*\) and \(\Vert x_2 - x^*\Vert < \epsilon ^*\).

Let \(\tilde{d} \in \mathbb {R}^n\) be arbitrary but such that

$$\begin{aligned} \Vert \tilde{d} - d \Vert < -\frac{\overline{\delta }}{4L}. \end{aligned}$$
(8)

Whenever \(\epsilon \in (0,\hat{\epsilon })\), where \(\hat{\epsilon } < \min \left\{ \frac{4 \epsilon ^*L}{-\overline{\delta } + 4 \Vert d \Vert L},\frac{\epsilon ^*}{\Vert d \Vert }\right\} \), we obtain that

$$\begin{aligned} \Vert (x^*+ \epsilon \tilde{d}) - x^*\Vert \le \epsilon (\Vert \tilde{d} - d \Vert + \Vert d \Vert ) < \frac{4 \epsilon ^*L}{-\overline{\delta } + 4 \Vert d \Vert L} \left( -\frac{\overline{\delta }}{4 L} + \Vert d \Vert \right) = \epsilon ^*\end{aligned}$$

and

$$\begin{aligned} \Vert (x^*+ \epsilon d) - x^*\Vert = \epsilon \Vert d \Vert < \frac{\epsilon ^*}{\Vert d \Vert } \Vert d \Vert = \epsilon ^*. \end{aligned}$$

Hence, it follows from (7) that

$$\begin{aligned} d_H(F(x^*+\epsilon \tilde{d}),F(x^*+\epsilon d)) \le L \Vert (x^*+\epsilon \tilde{d}) - (x^*+\epsilon d) \Vert \end{aligned}$$

for all \(\epsilon \in (0,\hat{\epsilon })\). Then, by Lemma 2.1 and (8),

$$\begin{aligned}&F(x^*+\epsilon \tilde{d}) \ominus _{gH} F(x^*+\epsilon d) \\&\qquad \leqq _{LU} [d_H(F(x^*+\epsilon \tilde{d}),F(x^*+\epsilon d)), d_H(F(x^*+\epsilon \tilde{d}),F(x^*+\epsilon d))] \\&\qquad \leqq _{LU} [L \Vert (x^*+\epsilon \tilde{d}) - (x^*+\epsilon d) \Vert , L \Vert (x^*+\epsilon \tilde{d}) - (x^*+\epsilon d) \Vert ] \\&\qquad <_{LU} \left[ -L\epsilon \frac{ \overline{\delta }}{4 L}, -L\epsilon \frac{ \overline{\delta }}{4 L} \right] \\&\qquad = \epsilon \left[ -\frac{\overline{\delta }}{4},-\frac{\overline{\delta }}{4} \right] ~ \forall \epsilon \in (0,\hat{\epsilon }), \end{aligned}$$

i.e.,

$$\begin{aligned} F(x^*+\epsilon \tilde{d}) \ominus _{gH}F(x^*+\epsilon d) <_{LU} \epsilon \left[ -\frac{\overline{\delta }}{4},-\frac{\overline{\delta }}{4} \right] ~ \forall \epsilon \in (0,\hat{\epsilon }). \end{aligned}$$
(9)

Let \(\tilde{\epsilon } = \min \left\{ \delta ^\prime ,\hat{\epsilon } \right\} \) and \(\underline{\alpha }^\prime = \overline{\alpha }^\prime = \frac{\overline{\delta }}{4}\). Given \(0< \epsilon < \tilde{\epsilon }\), we have

$$\begin{aligned} F(x^*+\epsilon \tilde{d}) <_{LU} F(x^*) + \epsilon \left[ \underline{\alpha }^\prime ,\overline{\alpha }^\prime \right] . \end{aligned}$$

Indeed, set, for \(\epsilon \in (0,\tilde{\epsilon })\),

$$\begin{aligned} A_\epsilon&:= [\underline{a}_\epsilon ,\overline{a}_\epsilon ] = F(x^*+\epsilon d), \\ B_\epsilon&:= [\underline{b}_\epsilon ,\overline{b}_\epsilon ] = F(x^*), \\ C_\epsilon&:= [\underline{c}_\epsilon ,\overline{c}_\epsilon ] = A_\epsilon \ominus _{gH} B_\epsilon = F(x^*+\epsilon d) \ominus _{gH}F(x^*), \\ D_\epsilon&:= [\underline{d}_\epsilon ,\overline{d}_\epsilon ] = F(x^*+\epsilon \tilde{d}), \\ E_\epsilon&:= [\underline{e}_\epsilon ,\overline{e}_\epsilon ] = D_\epsilon \ominus _{gH} A_\epsilon = F(x^*+\epsilon \tilde{d}) \ominus _{gH}F(x^*+\epsilon d). \end{aligned}$$

Since \(A_\epsilon \ominus _{gH}B_\epsilon =C_\epsilon \), by (6), we have that

$$\begin{aligned} {\left\{ \begin{array}{ll} A_\epsilon = B_\epsilon +C_\epsilon \\ \text {or} \\ B_\epsilon = A_\epsilon + (-1) \cdot C_\epsilon \end{array}\right. } \quad \text {with} \quad \left[ \underline{c}_\epsilon ,\overline{c}_\epsilon \right] <_{LU} \epsilon \left[ \frac{\underline{\delta }}{2},\frac{\overline{\delta }}{2} \right] , \end{aligned}$$

and, since \(D_\epsilon \ominus _{gH}A_\epsilon = E_\epsilon \), by (9), we have that

$$\begin{aligned} {\left\{ \begin{array}{ll} D_\epsilon = A_\epsilon +E_\epsilon \\ \text {or} \\ A_\epsilon = D_\epsilon + (-1) \cdot E_\epsilon \end{array}\right. } \quad \text {with} \quad \left[ \underline{e}_\epsilon ,\overline{e}_\epsilon \right] <_{LU} \epsilon \left[ -\frac{\overline{\delta }}{4},-\frac{\overline{\delta }}{4} \right] . \end{aligned}$$

We have one of the following cases:

  1. (i)

    \(\left[ \underline{a}_\epsilon ,\overline{a}_\epsilon \right] = \left[ \underline{b}_\epsilon ,\overline{b}_\epsilon \right] + \left[ \underline{c}_\epsilon ,\overline{c}_\epsilon \right] \) and \(\left[ \underline{d}_\epsilon ,\overline{d}_\epsilon \right] = \left[ \underline{a}_\epsilon ,\overline{a}_\epsilon \right] + \left[ \underline{e}_\epsilon ,\overline{e}_\epsilon \right] \);

  2. (ii)

    \(\left[ \underline{a}_\epsilon ,\overline{a}_\epsilon \right] = \left[ \underline{b}_\epsilon ,\overline{b}_\epsilon \right] + \left[ \underline{c}_\epsilon ,\overline{c}_\epsilon \right] \) and \(\left[ \underline{a}_\epsilon ,\overline{a}_\epsilon \right] = \left[ \underline{d}_\epsilon ,\overline{d}_\epsilon \right] + \left[ -\overline{e}_\epsilon ,-\underline{e}_\epsilon \right] \);

  3. (iii)

    \(\left[ \underline{b}_\epsilon ,\overline{b}_\epsilon \right] = \left[ \underline{a}_\epsilon ,\overline{a}_\epsilon \right] + \left[ -\overline{c}_\epsilon ,-\underline{c}_\epsilon \right] \) and \(\left[ \underline{d}_\epsilon ,\overline{d}_\epsilon \right] = \left[ \underline{a}_\epsilon ,\overline{a}_\epsilon \right] + \left[ \underline{e}_\epsilon ,\overline{e}_\epsilon \right] \);

  4. (iv)

    \(\left[ \underline{b}_\epsilon ,\overline{b}_\epsilon \right] = \left[ \underline{a}_\epsilon ,\overline{a}_\epsilon \right] + \left[ -\overline{c}_\epsilon ,-\underline{c}_\epsilon \right] \) and \(\left[ \underline{a}_\epsilon ,\overline{a}_\epsilon \right] = \left[ \underline{d}_\epsilon ,\overline{d}_\epsilon \right] + \left[ -\overline{e}_\epsilon ,-\underline{e}_\epsilon \right] \);

always with

$$\begin{aligned} \left[ \underline{c}_\epsilon ,\overline{c}_\epsilon \right]<_{LU} \left[ \frac{\epsilon \underline{\delta }}{2},\frac{\epsilon \overline{\delta }}{2} \right] \quad \text{ and } \quad \left[ \underline{e}_\epsilon ,\overline{e}_\epsilon \right] <_{LU} \left[ -\frac{\epsilon \overline{\delta }}{4},-\frac{\epsilon \overline{\delta }}{4} \right] . \end{aligned}$$

In each of the cases (i)–(iv) above we have \(\left[ \underline{d}_\epsilon ,\overline{d}_\epsilon \right] <_{LU} \left[ \underline{b}_\epsilon ,\overline{b}_\epsilon \right] + \epsilon \left[ \underline{\alpha }^\prime ,\overline{\alpha }^\prime \right] \), with \(\underline{\alpha }^\prime = \overline{\alpha }^\prime = \frac{\overline{\delta }}{4}\), that is, \(D_\epsilon <_{LU} B_\epsilon + \epsilon \left[ \underline{\alpha }^\prime ,\overline{\alpha }^\prime \right] \), which is equivalent to

$$\begin{aligned} F(x^*+\epsilon \tilde{d}) <_{LU} F(x^*) + \epsilon \left[ \underline{\alpha }^\prime ,\overline{\alpha }^\prime \right] , \end{aligned}$$

with \(\left[ \underline{\alpha }^\prime ,\overline{\alpha }^\prime \right] = \left[ \frac{\overline{\delta }}{4},\frac{\overline{\delta }}{4} \right] <_{LU} [0,0]\), since \(\overline{\delta } < 0\). It follows from Definition 3.1 that \(d \in \mathcal {K}_y^{LU}\). \(\square \)

By Theorems 3.2 and 3.3, we have:

Theorem 3.4

Suppose that (A1)–(A2) hold. Then, the cone of LU-directions of decrease \(\mathcal {K}_y^{LU}\) coincides with \(\mathcal {K}_0\).

If (A2) holds, by Theorem 2.3, we have

$$\begin{aligned} F_{gH}^\prime (x^*;d) = \left[ \left\langle \nabla _\sharp \underline{f}(x^*), d \right\rangle \vee \left\langle \nabla _\sharp \overline{f}(x^*), d \right\rangle \right] , ~ d \in \mathbb {R}^n, \end{aligned}$$

so that we can write

$$\begin{aligned} \mathcal {K}_0 = \left\{ d \in \mathbb {R}^n : \left[ \left\langle \nabla _\sharp \underline{f}(x^*), d \right\rangle \vee \left\langle \nabla _\sharp \overline{f}(x^*), d \right\rangle \right] <_{LU} [0,0] \right\} . \end{aligned}$$

Now, let us set

$$\begin{aligned}&\mathcal {\underline{K}}_0 := \left\{ d \in \mathbb {R}^n : \left\langle \nabla _\sharp \underline{f}(x^*), d \right\rangle < 0 \right\} , \end{aligned}$$
(10)
$$\begin{aligned}&\mathcal {\overline{K}}_0 := \left\{ d \in \mathbb {R}^n : \left\langle \nabla _\sharp \overline{f}(x^*), d \right\rangle < 0 \right\} . \end{aligned}$$
(11)

It is clear that \(\mathcal {K}_0 = \mathcal {\underline{K}}_0 \cap \mathcal {\overline{K}}_0\).
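
Testing membership in \(\mathcal {K}_0\) thus amounts to two strict linear inequalities; a sketch (our names, gradients supplied as vectors):

```python
# Sketch: membership in K_0 via the gradients of the extremum functions.
import numpy as np

def in_K0(grad_lo, grad_hi, d):
    return grad_lo @ d < 0 and grad_hi @ d < 0

grad_lo, grad_hi = np.array([1.0, -1.0]), np.array([2.0, 1.0])
print(in_K0(grad_lo, grad_hi, np.array([-1.0, 0.0])))   # True: both products < 0
print(in_K0(grad_lo, grad_hi, np.array([-1.0, -3.0])))  # False: first product is 2
```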

Proposition 3.2

Assume that (A2) holds. Then,

  1. (i)

    \(\underline{\mathcal {K}}_0\) is an open convex cone;

  2. (ii)

    \(\overline{\mathcal {K}}_0\) is an open convex cone.

Proof

The proof is straightforward and has been omitted. \(\square \)

We end this subsection by providing the proof of Proposition 3.1.

Proof of Proposition 3.1

It follows directly from Theorem 3.4 and Proposition 3.2. \(\square \)

3.1.1 Characterization of the Cone of Feasible Directions

As already mentioned, the characterization of the cone of feasible directions can be obtained by applying Theorems 2.4 and 2.5. Here, we assume that (A3)–(A4) hold.

For each \(i \in I\), we set \(\mathcal {F}_{1i} = \{ x \in \mathbb {R}^n : g_i(x) \le 0 \}\) and denote by \(\mathcal {K}_{bi}\) the cone of feasible directions for \(\mathcal {F}_{1i}\) at \(x^*\). For \(i \in I(x^*)\), by Theorem 2.4, if \(\nabla g_i(x^*) \ne 0\),

$$\begin{aligned} \mathcal {K}_{bi} = \{ d \in \mathbb {R}^n : \langle \nabla g_i(x^*),d \rangle < 0 \}. \end{aligned}$$
(12)

If \(g_i(x^*) < 0\), then clearly \(\mathcal {K}_{bi} = \mathbb {R}^n\). Now let \(\mathcal {F}_{1s} = \{ x \in \mathbb {R}^n : x \in S \} = S\) and let \(\mathcal {K}_{bs}\) denote the cone of feasible directions for \(\mathcal {F}_{1s}\) at \(x^*\); by Theorem 2.5,

$$\begin{aligned} \mathcal {K}_{bs} = \left\{ d \in \mathbb {R}^n : d = \lambda (x - x^*), ~ x \in \mathrm {int}(S), ~ \lambda > 0 \right\} . \end{aligned}$$
(13)

It is easy to see that

$$\begin{aligned} \mathcal {F} = \left( \bigcap _{i \in I} \mathcal {F}_{1i} \right) \cap \mathcal {F}_{1s} \quad \text{ and } \quad \mathcal {K}_b = \left( \bigcap _{i \in I} \mathcal {K}_{bi} \right) \cap \mathcal {K}_{bs}. \end{aligned}$$
(14)

3.1.2 Characterization of the Cone of Tangent Directions

The characterization of the cone of tangent directions is done by making use of Theorem 2.6. Assume that (A3) is valid. We denote \(\mathcal {F}_2 = \{ x \in \mathbb {R}^n : h_j(x) = 0, ~ j \in J \}\). If \(\{ \nabla h_j(x^*) \}_{j \in J}\) is a linearly independent set, by Theorem 2.6,

$$\begin{aligned} \mathcal {K}_k = \{ d \in \mathbb {R}^n : \langle \nabla h_j(x^*),d \rangle = 0, ~ j \in J \}. \end{aligned}$$
(15)

3.1.3 The Multiplier Rule

The theorem below is a first step in transcribing the geometric necessary optimality condition into an algebraic one.

Theorem 3.5

Let \(x^*\in \mathcal {F}\) be a local weak LU-solution of (IOP). Assume that (A1)–(A2) hold. Then,

$$\begin{aligned} \underline{\mathcal {K}}_0 \cap \overline{\mathcal {K}}_0 \cap \left( \bigcap _{i \in I} \mathcal {K}_{bi} \right) \cap \mathcal {K}_{bs} \cap \mathcal {K}_k = \emptyset . \end{aligned}$$

Proof

The result is a direct consequence of Theorems 3.1 and 3.4 and (14). \(\square \)

In order to establish the main result of the paper, we will need a constraint qualification. Before stating it, we recall the definition of the normal cone.

Definition 3.2

The normal cone to a convex set \(S \subset \mathbb {R}^n\) at the point \(x^*\in S\), denoted by \(N_S(x^*)\), is defined as

$$\begin{aligned} N_S(x^*) = \{ v \in \mathbb {R}^n : \langle v , x - x^*\rangle \le 0 ~ \forall x \in S \}. \end{aligned}$$

Definition 3.3

The constraints of (IOP) are said to satisfy the positive linear independence constraint qualification (PLICQ) at a feasible point \(x^*\in \mathcal {F}\) if, whenever \(\mu _i \ge 0\), \(i \in I(x^*)\), and \(\lambda _j \in \mathbb {R}\), \(j \in J\), are such that

$$\begin{aligned} \sum _{i \in I(x^*)} \mu _i \nabla g_i(x^*) + \sum _{j \in J} \lambda _j \nabla h_j(x^*) \in - N_S(x^*) \end{aligned}$$

one has

$$\begin{aligned} \mu _i = 0, ~ i \in I(x^*), \quad \text{ and } \quad \lambda _j = 0, ~ j \in J. \end{aligned}$$
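
As a computational aside (our sketch, assuming NumPy and SciPy are available), consider the case \(S = \mathbb {R}^n\), where \(N_S(x^*) = \{0\}\) and (PLICQ) reduces to positive linear independence of the active gradients: if a vanishing combination has \(\mu = 0\), linear independence of the \(\nabla h_j(x^*)\) forces \(\lambda = 0\); otherwise it can be normalized so that \(\sum _i \mu _i = 1\), a condition detectable as the feasibility of a linear program:

```python
# Sketch: checking (PLICQ) at x* when S = R^n, so N_S(x*) = {0}.
import numpy as np
from scipy.optimize import linprog

def plicq_holds(G, H):
    """G: rows are grad g_i(x*) for i in I(x*); H: rows are grad h_j(x*)."""
    # (a) the equality-constraint gradients must be linearly independent
    if H.size and np.linalg.matrix_rank(H) < H.shape[0]:
        return False
    # (b) no vanishing combination with mu >= 0 and sum(mu) = 1 may exist
    p, q, n = G.shape[0], H.shape[0], G.shape[1]
    A_eq = np.vstack([np.hstack([G.T, H.T]),                            # combination = 0
                      np.hstack([np.ones((1, p)), np.zeros((1, q))])])  # sum(mu) = 1
    b_eq = np.append(np.zeros(n), 1.0)
    bounds = [(0, None)] * p + [(None, None)] * q
    res = linprog(np.zeros(p + q), A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return not res.success  # feasible LP means PLICQ fails

G = np.array([[-1.0, 0.0]])  # grad g_1(x*) for the toy instance of Sect. 2.2
H = np.array([[1.0, 1.0]])   # grad h_1(x*)
print(plicq_holds(G, H))     # True: the LP is infeasible, so (PLICQ) holds
```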

Theorem 3.6

(Necessary optimality conditions for (IOP)) Let \(x^*\in \mathcal {F}\) be a local weak LU-solution of (IOP). Assume that (A1)–(A4) and (PLICQ) hold at \(x^*\). Then, there exist \(\mu _i \ge 0\), \(i \in I(x^*)\), \(\lambda _j \in \mathbb {R}\), \(j \in J\), and \(y \in \mathbb {R}^n\) such that

$$\begin{aligned} y \in \nabla _{gH} F(x^*) \oplus \sum _{i \in I(x^*)} \mu _i \nabla g_i(x^*) \oplus \sum _{j \in J} \lambda _j \nabla h_j(x^*) \end{aligned}$$

and

$$\begin{aligned} \langle y , x-x^*\rangle \ge 0 ~ \forall x \in S. \end{aligned}$$

Proof

Since (PLICQ) is valid at \(x^*\), we have that \(\nabla g_i(x^*) \ne 0\), \(i \in I(x^*)\), and \(\nabla h_j(x^*) \ne 0\), \(j \in J\). Moreover, the gradients of the equality constraints are linearly independent at \(x^*\).

Suppose that \(\mathcal {\underline{K}}_0 \ne \emptyset \) and \(\mathcal {\overline{K}}_0 \ne \emptyset \). The exceptional cases (when this supposition is not valid) will be discussed at the end of the proof.

We know by Propositions 2.2, 2.3 and 3.2 and (12), (13) and (15) that \(\underline{\mathcal {K}}_0\), \(\overline{\mathcal {K}}_0\), \(\mathcal {K}_{bi}\), \(i \in I\), and \(\mathcal {K}_{bs}\) are open convex cones and \(\mathcal {K}_k\) is a convex cone. It follows from Theorem 3.5 and the Dubovitskii–Milyutin Lemma 2.2 that there exist linear functionals \(\underline{f}_0 \in \mathcal {\underline{K}}_0^*\), \(\overline{f}_0 \in \mathcal {\overline{K}}_0^*\), \(f_{1i} \in \mathcal {K}_{bi}^*\), \(i \in I\), \(f_{1s} \in \mathcal {K}_{bs}^*\) and \(f_2 \in \mathcal {K}_k^*\), not all zero, such that, for all \(d \in \mathbb {R}^n\),

$$\begin{aligned} \underline{f}_0(d) + \overline{f}_0(d) + \sum _{i \in I} f_{1i}(d) + f_{1s}(d) + f_2(d) = 0. \end{aligned}$$
(16)

By (10), (11), (12) and Theorem 2.7, we see that

$$\begin{aligned}&\mathcal {\underline{K}}_0^*= \{ - \underline{\lambda } \nabla _\sharp \underline{f}(x^*) : \underline{\lambda } \ge 0 \}, \quad \mathcal {\overline{K}}_0^*= \{ - \overline{\lambda } \nabla _\sharp \overline{f}(x^*) : \overline{\lambda } \ge 0 \}, \\&\mathcal {K}_{bi}^*= \{ - \mu _i \nabla g_i(x^*) : \mu _i \ge 0 \}, ~ i \in I(x^*), \quad \mathcal {K}_{bi}^*= \{ 0 \}, ~ i \in I {\setminus } I(x^*). \end{aligned}$$

From (13) and Theorem 2.9, we obtain

$$\begin{aligned} \mathcal {K}_{bs}^*= S^*= \{ f \in (\mathbb {R}^n)^\prime : f(x) \ge f(x^*) ~ \forall x \in S \}. \end{aligned}$$

By (15) and Theorem 2.8,

$$\begin{aligned} \mathcal {K}_k^*= \left\{ \sum _{j \in J} \lambda _j \nabla h_j(x^*) : \lambda _j \in \mathbb {R}, ~ j \in J \right\} . \end{aligned}$$

Therefore, from (16), there exist \(\underline{\lambda } \ge 0\), \(\overline{\lambda } \ge 0\), \(\mu _i \ge 0\), \(i \in I(x^*)\), \(\mu _i := 0\), \(i \in I {\setminus } I(x^*)\), and \(\lambda _j \in \mathbb {R}\), \(j \in J\), such that, for all \(d \in \mathbb {R}^n\),

$$\begin{aligned} f_{1s}(d)= & {} \underline{\lambda } \langle \nabla _\sharp \underline{f}(x^*) , d \rangle + \overline{\lambda } \langle \nabla _\sharp \overline{f}(x^*) , d \rangle \nonumber \\&+ \sum _{i \in I} \mu _i \langle \nabla g_i(x^*) , d \rangle + \sum _{j \in J} \lambda _j \langle \nabla h_j(x^*) , d \rangle . \end{aligned}$$
(17)

Since \(S^*\subset (\mathbb {R}^n)^\prime \approx \mathbb {R}^n\), by the Riesz Representation Theorem (see Brezis [6]), there exists a unique \(y \in \mathbb {R}^n\) such that

$$\begin{aligned} f_{1s}(x) = \langle y , x \rangle ~ \forall x \in \mathbb {R}^n. \end{aligned}$$
(18)

Since \(f_{1s}\) is a supporting functional of S at \(x^*\in S\), we have that

$$\begin{aligned} \langle y, x \rangle \ge \langle y, x^*\rangle \Leftrightarrow \langle y , x - x^*\rangle \ge 0 ~ \forall x \in S. \end{aligned}$$
(19)

It follows from (17) and (18) that

$$\begin{aligned} \left\langle y - \underline{\lambda } \nabla _\sharp \underline{f}(x^*) - \overline{\lambda } \nabla _\sharp \overline{f}(x^*) - \sum _{i \in I} \mu _i \nabla g_i(x^*) - \sum _{j \in J} \lambda _j \nabla h_j(x^*) , d \right\rangle = 0 \end{aligned}$$

for all \(d \in \mathbb {R}^n\). Then,

$$\begin{aligned} y = \underline{\lambda } \nabla _\sharp \underline{f}(x^*) + \overline{\lambda } \nabla _\sharp \overline{f}(x^*) + \sum _{i \in I} \mu _i \nabla g_i(x^*) + \sum _{j \in J} \lambda _j \nabla h_j(x^*). \end{aligned}$$
(20)

Note that \(\underline{\lambda } = 0\) and \(\overline{\lambda } = 0\) cannot occur simultaneously. Indeed, if \(\underline{\lambda } = \overline{\lambda } = 0\), by (19) and (20) and the fact that (PLICQ) is valid at \(x^*\), it follows that \(\mu _i = 0\), \(i \in I\), and \(\lambda _j = 0\), \(j \in J\). This implies that \(\underline{f}_0(d) = \overline{f}_0(d) = f_{1i}(d) = f_{1s}(d) = f_2(d) = 0\), \(i \in I\), \(j \in J\), for all \(d \in \mathbb {R}^n\), which is a contradiction, since these functionals are not all zero. Therefore, normalizing, we can assume that \(\underline{\lambda } + \overline{\lambda } = 1\).

Since \(\underline{\lambda } + \overline{\lambda } = 1\) with \(\underline{\lambda }, \overline{\lambda } \ge 0\), we have that

$$\begin{aligned} \underline{\lambda } \frac{\partial \underline{f}}{\partial x_i^\sharp }(x^*) + \overline{\lambda } \frac{\partial \overline{f}}{\partial x_i^\sharp }(x^*) \in \left[ \frac{\partial \underline{f}}{\partial x_i^\sharp }(x^*) \vee \frac{\partial \overline{f}}{\partial x_i^\sharp }(x^*) \right] = \frac{\partial _{gH} F}{\partial x_i}(x^*), ~ i = 1,\ldots ,n. \end{aligned}$$

Then,

$$\begin{aligned} \underline{\lambda } \nabla _\sharp \underline{f}(x^*) + \overline{\lambda } \nabla _\sharp \overline{f}(x^*) \in \nabla _{gH} F(x^*). \end{aligned}$$
(21)

From (20), (21) and the fact that \(\mu _i = 0\), \(i \in I {\setminus } I(x^*)\), we get

$$\begin{aligned} y \in \nabla _{gH} F(x^*) \oplus \sum _{i \in I(x^*)} \mu _i \nabla g_i(x^*) \oplus \sum _{j \in J} \lambda _j \nabla h_j(x^*). \end{aligned}$$
(22)

The result follows from (19) and (22).

Finally, we discuss the exceptional cases. Suppose that \(\mathcal {\underline{K}}_0=\emptyset \), that is,

$$\begin{aligned} \langle \nabla _\sharp \underline{f}(x^*),d \rangle = 0 ~ \forall d \in \mathbb {R}^n. \end{aligned}$$
(23)

Taking \(\underline{\lambda } = 1\), \(\overline{\lambda } = 0\), \(\mu _i = 0\), \(i \in I\), \(\lambda _j = 0\), \(j \in J\), there exists \(y = \nabla _\sharp \underline{f}(x^*) \in \mathbb {R}^n\) such that

$$\begin{aligned} y= & {} \underline{\lambda } \nabla _\sharp \underline{f}(x^*) + \overline{\lambda } \nabla _\sharp \overline{f}(x^*) + \sum _{i \in I} \mu _i \nabla g_i(x^*) + \sum _{j \in J} \lambda _j \nabla h_j(x^*) \\\in & {} \nabla _{gH} F(x^*) \oplus \sum _{i \in I(x^*)} \mu _i \nabla g_i(x^*) \oplus \sum _{j \in J} \lambda _j \nabla h_j(x^*). \end{aligned}$$

Moreover, (23) becomes \(\langle y,d \rangle = 0\) for all \(d \in \mathbb {R}^n\), which implies that \(y = 0\), so that

$$\begin{aligned} \langle y,x-x^*\rangle \ge 0 ~ \forall x \in S. \end{aligned}$$

In the case \(\mathcal {\overline{K}}_0=\emptyset \), we proceed analogously. The proof is complete. \(\square \)

Note that the condition \(\langle y,x-x^*\rangle \ge 0\) for all \(x \in S\) in Theorem 3.6 can also be expressed as \(-y \in N_S(x^*)\). Thus, the necessary optimality conditions for (IOP) can be stated in the following alternative form.

Theorem 3.7

Let \(x^*\in \mathcal {F}\) be a local weak LU-solution of (IOP). Assume that (A1)–(A4) and (PLICQ) hold at \(x^*\). Then, there exist \(\mu _i \ge 0\), \(i \in I(x^*)\), \(\lambda _j \in \mathbb {R}\), \(j \in J\), such that

$$\begin{aligned} 0 \in \nabla _{gH} F(x^*) \oplus \sum _{i \in I(x^*)} \mu _i \nabla g_i(x^*) \oplus \sum _{j \in J} \lambda _j \nabla h_j(x^*) \oplus N_S(x^*). \end{aligned}$$
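When the abstract set S is a box and the gH-gradient is available as a coordinate-wise interval vector (as in the numerical example of Section 4), the inclusion in Theorem 3.7 can be tested mechanically for candidate multipliers. The sketch below is a minimal Python illustration of such a test; all names (`Interval`, `normal_cone_box`, `kkt_inclusion_holds`) are ours and do not come from any established library.

```python
# Minimal sketch: coordinate-wise test of the inclusion of Theorem 3.7
# when S is a box. Illustrative only; names are our own.
from dataclasses import dataclass

INF = float("inf")

@dataclass
class Interval:
    lo: float
    hi: float
    def __add__(self, other):
        # Minkowski sum of two intervals
        return Interval(self.lo + other.lo, self.hi + other.hi)
    def shift(self, t):
        # Minkowski sum with the singleton {t}
        return Interval(self.lo + t, self.hi + t)

def normal_cone_box(bounds, x_star, tol=1e-12):
    """Normal cone to a box at x_star, one interval per coordinate.
    bounds[i] = (a_i, b_i); a_i == b_i encodes a singleton factor."""
    cone = []
    for (a, b), xi in zip(bounds, x_star):
        if b - a <= tol:               # singleton factor: normal cone is R
            cone.append(Interval(-INF, INF))
        elif abs(xi - a) <= tol:       # lower face: R_-
            cone.append(Interval(-INF, 0.0))
        elif abs(xi - b) <= tol:       # upper face: R_+
            cone.append(Interval(0.0, INF))
        else:                          # interior point: {0}
            cone.append(Interval(0.0, 0.0))
    return cone

def kkt_inclusion_holds(grad_F, grads_g, mus, grads_h, lams, cone):
    """Check 0 in grad_F + sum mu_i grad g_i + sum lam_j grad h_j + N_S(x*),
    coordinate by coordinate (grad_F is a list of Intervals)."""
    for i, Fi in enumerate(grad_F):
        shift = sum(m * g[i] for m, g in zip(mus, grads_g))
        shift += sum(l * h[i] for l, h in zip(lams, grads_h))
        total = Fi.shift(shift) + cone[i]
        if not (total.lo <= 0.0 <= total.hi):
            return False
    return True
```

Encoding the unbounded cone factors as intervals with infinite endpoints makes the membership test `lo <= 0 <= hi` uniform across all cases.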

In the following, we specialize the last result to some particular forms of (IOP). First, we consider the problem without the abstract constraint:

$$\begin{aligned} \begin{array}{ll} \text {minimize} &{} F(x) = [\underline{f}(x),\overline{f}(x)] \\ \text {subject to} &{} g_i(x) \le 0, ~ i \in I = \{1,\ldots ,p\}, \\ &{} h_j(x) = 0, ~ j \in J = \{1,\ldots ,q\}. \end{array} \qquad \qquad \text {(IOP1)} \end{aligned}$$

In this case, the constraint qualification reads

Definition 3.4

The constraints of (IOP1) are said to satisfy the positive linear independence constraint qualification (PLICQ) at a feasible point \(x^*\in \mathcal {F}\) if given \(\mu _i \ge 0\), \(i \in I(x^*)\), and \(\lambda _j \in \mathbb {R}\), \(j \in J\), such that

$$\begin{aligned} \sum _{i \in I(x^*)} \mu _i \nabla g_i(x^*) + \sum _{j \in J} \lambda _j \nabla h_j(x^*) = 0 \end{aligned}$$

one has

$$\begin{aligned} \mu _i = 0, ~ i \in I(x^*), \quad \text{ and } \quad \lambda _j = 0, ~ j \in J. \end{aligned}$$

Theorem 3.8

(Necessary optimality conditions for (IOP1)) Let \(x^*\in \mathcal {F}\) be a local weak LU-solution of (IOP1). Assume that (A1)–(A3) and (PLICQ) hold at \(x^*\). Then, there exist \(\mu _i \ge 0\), \(i \in I(x^*)\), \(\lambda _j \in \mathbb {R}\), \(j \in J\), such that

$$\begin{aligned} 0 \in \nabla _{gH} F(x^*) \oplus \sum _{i \in I(x^*)} \mu _i \nabla g_i(x^*) \oplus \sum _{j \in J} \lambda _j \nabla h_j(x^*). \end{aligned}$$

Now, we consider the problem with the abstract constraint only:

$$\begin{aligned} \begin{array}{ll} \text {minimize} &{} F(x) = [\underline{f}(x),\overline{f}(x)] \\ \text {subject to} &{} x \in S. \end{array} \qquad \qquad \qquad \text {(IOP2)} \end{aligned}$$

In this case, no constraint qualification is needed.

Theorem 3.9

(Necessary optimality conditions for (IOP2)) Let \(x^*\in \mathcal {F}\) be a local weak LU-solution of (IOP2). Assume that (A1), (A2) and (A4) hold at \(x^*\). Then, there exists \(y \in \mathbb {R}^n\) such that

$$\begin{aligned} y \in \nabla _{gH} F(x^*) \end{aligned}$$

and

$$\begin{aligned} \langle y , x-x^*\rangle \ge 0 ~ \forall x \in S. \end{aligned}$$

Alternatively,

Theorem 3.10

Let \(x^*\in \mathcal {F}\) be a local weak LU-solution of (IOP2). Assume that (A1), (A2) and (A4) hold at \(x^*\). Then,

$$\begin{aligned} 0 \in \nabla _{gH} F(x^*) \oplus N_S(x^*). \end{aligned}$$

We close this section by observing that, since every LU-solution is also a weak LU-solution, all the results above on necessary optimality conditions apply to LU-solutions as well.

4 Examples and Possible Applications

As already mentioned in the introduction, interval optimization problems arise naturally in practice. In this section, we briefly discuss some possible applications whose modeling falls into the framework of discrete-time optimal control. We first describe discrete-time optimal control problems with an interval-valued objective function. Afterward, we present an application to reservoir regulation.

A discrete-time optimal control problem with an interval-valued objective function may be posed as follows:

$$\begin{aligned} \begin{array}{ll} \min &{} F(x,u) = \displaystyle {\sum _{k=0}^{N-1} F_k(x_k,u_k) + F_N(x_N)} \\ \text{ s.t. } &{} x_{k+1} = f_k(x_k,u_k), ~ k \in \{0,\ldots ,N-1\}, \\ &{} (x_k,u_k) \in S_k, ~ k \in \{0,\ldots ,N-1\}, \end{array} \end{aligned}$$

where \(F_k : \mathbb {R}^n \times \mathbb {R}^m \rightarrow \mathcal {K}_C\), \(k \in \{0,\ldots ,N-1\}\), \(F_N : \mathbb {R}^n \rightarrow \mathcal {K}_C\), \(f_k : \mathbb {R}^n \times \mathbb {R}^m \rightarrow \mathbb {R}^n\), \(k \in \{0,\ldots ,N-1\}\), and \(S_k \subset \mathbb {R}^n \times \mathbb {R}^m\), \(k \in \{0,\ldots ,N-1\}\). The vectors \(u_k \in \mathbb {R}^m\) are called control vectors, while the \(x_k \in \mathbb {R}^n\) are termed state vectors. We refer to \(u=(u_0,\ldots ,u_{N-1})\) as a control policy. The corresponding \(x=(x_0,\ldots ,x_N)\) is said to be the state trajectory. The integer N is the number of periods. The vector difference equation \(x_{k+1} = f_k(x_k,u_k)\) is called the system equation and describes the “dynamics” of the problem.

Despite its particular structure, the discrete-time optimal control problem above fits squarely into the (IOP) framework without inequality constraints, as made explicit below. The results established in the last section can therefore be applied.
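Explicitly, one possible identification is to take the decision variable \(z = (x,u) = (x_0,\ldots ,x_N,u_0,\ldots ,u_{N-1})\), the equality constraints given by the system equation (read component-wise), and the abstract constraint collecting the sets \(S_k\):

$$\begin{aligned} h_{k+1}(z) = x_{k+1} - f_k(x_k,u_k), ~ k \in \{0,\ldots ,N-1\}, \qquad S = \left\{ z : (x_k,u_k) \in S_k, ~ k \in \{0,\ldots ,N-1\} \right\} , \end{aligned}$$

so that the problem becomes an (IOP) with \(I = \emptyset \). This is precisely the structure of the numerical example at the end of this section.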

Discrete-time optimal control problems appear often in applications. One instance is reservoir regulation. The deterministic case was described in Bertsekas [5], for example; what follows is an adaptation to the interval case. We denote by \(x_k\) the volume of water held in a reservoir at the k-th of N time periods and by \(u_k\) the water used for some productive purpose in period k. Then, the volume evolves according to

$$\begin{aligned} x_{k+1} = x_k - u_k, ~ k \in \{0,\ldots ,N-1\}. \end{aligned}$$

The outflows \(u_k\) play the role of the controls, while the volumes \(x_k\) play the role of the states. We may consider a cost, say \(F_N(x_N)\), associated with the terminal state, and costs, say \(F_k(u_k)\), related to the outflow \(u_k\) in period k. Owing to operational and market uncertainties, it is natural to model such costs as interval-valued functions. When \(u_k\) is used for electric power generation, for example, \(F_k(u_k)\) may be an interval function of the value of the power produced from \(u_k\). The outflows \(u_0\), \(u_1\), \(\ldots \), \(u_{N-1}\) are to be chosen so that \(\sum _{k=0}^{N-1}F_k(u_k) + F_N(x_N)\) is LU-minimized subject to some constraints on the volumes and on the outflows (for example, \(x_k\) should be kept between lower and upper bounds and \(u_k\) should have nonnegative entries). Denoting such constraints by \(S_k\) and eliminating \(u_k\) through the system equation \(x_{k+1} = x_k - u_k\), we can write the problem as

$$\begin{aligned} \begin{array}{ll} \min &{} F(x) = \displaystyle {\sum _{k=0}^{N-1} F_k(x_k-x_{k+1}) + F_N(x_N)} \\ \text{ s.t. } &{}(x_k,x_k-x_{k+1}) \in S_k, ~ k \in \{0,\ldots ,N-1\}, \end{array} \end{aligned}$$

and Theorem 3.10 can be applied when F(x) is gH-differentiable.
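To make the model concrete, the following Python sketch rolls the volume forward under a candidate outflow policy and accumulates the interval-valued cost. The stage and terminal cost data are placeholders of our own choosing (e.g., a power price known only up to a band), not data from [5].

```python
# Illustrative reservoir model with interval-valued costs; all data are placeholders.

def interval_add(A, B):
    """Minkowski sum of two intervals given as (lo, hi) pairs."""
    return (A[0] + B[0], A[1] + B[1])

def stage_cost(u):
    # Hypothetical interval revenue from outflow u: the power price is only
    # known to lie in the band [0.9, 1.1], so the (negative) cost is an interval.
    return (-1.1 * u, -0.9 * u)

def terminal_cost(x):
    # Hypothetical interval-valued cost on the terminal volume.
    return (-0.2 * x, -0.1 * x)

def rollout(x0, outflows):
    """Apply the system equation x_{k+1} = x_k - u_k and accumulate F."""
    x, total = x0, (0.0, 0.0)
    for u in outflows:
        total = interval_add(total, stage_cost(u))
        x = x - u                    # system equation
    return interval_add(total, terminal_cost(x)), x

# Candidate policy over N = 3 periods starting from volume x0 = 10:
F_interval, x_N = rollout(10.0, [2.0, 3.0, 1.0])
print(F_interval, x_N)   # interval objective value and terminal volume
```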

We close the section with a numerical example that illustrates the application of the main result, Theorem 3.6, or, equivalently, Theorem 3.7. For this purpose, we do not use the condensed form above. Let us consider the particular case below:

$$\begin{aligned} \begin{array}{ll} \min &{} F(x,u) = \displaystyle {\left[ -u_0^2,0\right] +\left[ -u_1^2,0\right] +\left[ -\frac{(x_2-2)^3}{8} , -\frac{(x_2-2)^3}{8}-\frac{x_2^2-4x_2}{8} \right] } \\ \text{ s.t. } &{} x_{1} = x_0-u_0, \\ &{} x_2 = x_1-u_1, \\ &{} (x_0,u_0) \in \{5\} \times [0,5], \\ &{} (x_1,u_1) \in [3,4] \times [0,3]. \end{array} \end{aligned}$$

From the system equation \(x_1=x_0-u_0\) and the constraints \(x_0 \in \{5\}\) and \(x_1 \in [3,4]\), we get \(u_0 \in [1,2]\). From the system equation \(x_2=x_1-u_1\) and the constraints \(x_1 \in [3,4]\) and \(u_1 \in [0,3]\) we see that \(x_2 \in [0,4]\). Therefore, the feasible set \(\mathcal {F}\) is contained in \(\{5\} \times [3,4] \times [0,4] \times [1,2] \times [0,3]\). With respect to the notation of (IOP), we clearly identify

$$\begin{aligned}&\underline{f}(x_0,x_1,x_2,u_0,u_1)=-u_0^2-u_1^2-\frac{(x_2-2)^3}{8}, \\&\overline{f}(x_0,x_1,x_2,u_0,u_1)=-\frac{(x_2-2)^3}{8}-\frac{x_2^2-4x_2}{8}, \\&h_1(x_0,x_1,x_2,u_0,u_1)=x_1-x_0+u_0, \\&h_2(x_0,x_1,x_2,u_0,u_1)=x_2-x_1+u_1, \\&S=\{5\} \times [3,4] \times \mathbb {R} \times [0,5] \times [0,3]. \end{aligned}$$

The point \(p^*=(x_0^*,x_1^*,x_2^*,u_0^*,u_1^*) = (5,4,4,1,0)\) is a local LU-solution. Indeed, assume that, for an arbitrary \(\epsilon \)-neighborhood \(N_\epsilon (p^*)\) of \(p^*\), there exists a feasible solution \(p \in N_\epsilon (p^*)\) with \(F(p) \le _{LU} F(p^*)\). Then, if \(p=(x_0,x_1,x_2,u_0,u_1)\), we would have either

$$\begin{aligned} \left\{ \begin{array}{l} -u_0^2-u_1^2-\frac{(x_2-2)^3}{8} \le -2 \\ -\frac{(x_2-2)^3}{8}-\frac{x_2^2-4x_2}{8} < -1 \end{array} \right. \end{aligned}$$

or

$$\begin{aligned} \left\{ \begin{array}{l} -u_0^2-u_1^2-\frac{(x_2-2)^3}{8} < -2 \\ -\frac{(x_2-2)^3}{8}-\frac{x_2^2-4x_2}{8} \le -1. \end{array} \right. \end{aligned}$$

In the first case, the second inequality implies \(x_2 > 4\), which contradicts the feasibility of p. In the second case, the second inequality gives \(x_2 \ge 4\), so that \(x_2=4\) since p is feasible. From the system equations, we have

$$\begin{aligned} 4=x_2=x_1-u_1=x_0-u_0-u_1=5-u_0-u_1 \Leftrightarrow u_0+u_1=1. \end{aligned}$$

Using this information together with \((u_0,u_1) \in [1,2] \times [0,3]\) (remember that \(p \in \mathcal {F}\)), we obtain \(u_0=1\) and \(u_1=0\). Substituting in the first inequality from the second case, we get the contradiction \(-2 < -2\). We now verify that (PLICQ) holds at \(p^*\). Let \(\lambda _1\) and \(\lambda _2\) be scalars such that \(\lambda _1 \nabla h_1(p^*) + \lambda _2 \nabla h_2(p^*) \in - N_S(p^*)\). Then,

$$\begin{aligned} \lambda _1 (-1,1,0,1,0) + \lambda _2 (0,-1,1,0,1) \in \mathbb {R} \times \mathbb {R}_- \times \{0\} \times \{0\} \times \mathbb {R}_+, \end{aligned}$$

so that \(\lambda _1=\lambda _2=0\), that is, (PLICQ) holds at \(p^*\). By Theorem 3.7, we know that there exist \(\lambda _1,\lambda _2 \in \mathbb {R}\) such that

$$\begin{aligned} 0 \in \nabla _{gH} F(p^*) \oplus \lambda _1 \nabla h_1(p^*) \oplus \lambda _2 \nabla h_2(p^*) \oplus N_S(p^*). \end{aligned}$$

The right-hand side above is equal to

$$\begin{aligned}&\left( [0,0],[0,0],\left[ -\frac{3}{8}(x_2^*-2)^2 \vee -\frac{3}{8}(x_2^*-2)^2-\frac{2}{8}x_2^*+\frac{4}{8} \right] ,[-2u_0^*,0],[-2u_1^*,0] \right) \\&\quad \oplus \lambda _1 (-1,1,0,1,0) \oplus \lambda _2 (0,-1,1,0,1) \oplus \mathbb {R} \times \mathbb {R}_+ \times \{0\} \times \{0\} \times \mathbb {R}_-. \end{aligned}$$

Hence, \(0 \in \nabla _{gH} F(p^*) \oplus \lambda _1 \nabla h_1(p^*) \oplus \lambda _2 \nabla h_2(p^*) \oplus N_S(p^*)\) if

$$\begin{aligned} \left\{ \begin{array}{l} 0 \in [-\lambda _1,-\lambda _1] + \mathbb {R}, \\ 0 \in [\lambda _1-\lambda _2,\lambda _1-\lambda _2] + \mathbb {R}_+, \\ 0 \in [-2,-3/2] + [\lambda _2,\lambda _2], \\ 0 \in [-2,0] + [\lambda _1,\lambda _1], \\ 0 \in [\lambda _2,\lambda _2] + \mathbb {R}_-. \end{array} \right. \end{aligned}$$

There exist infinitely many multipliers that fulfill the system of inclusions above; for example, \(\lambda _1=1\) and \(\lambda _2=7/4\). It is easy to see that the set of all multipliers is compact.
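The system of inclusions is also easy to check mechanically. The minimal Python snippet below (helper names are ours) confirms the multipliers \(\lambda _1 = 1\), \(\lambda _2 = 7/4\) and scans a grid, illustrating that the multiplier set is bounded:

```python
# Check of the five inclusions at p* = (5, 4, 4, 1, 0); helper names are ours.
INF = float("inf")

def contains_zero(lo, hi):
    return lo <= 0.0 <= hi

def inclusions_hold(l1, l2):
    tests = [
        (-l1 - INF, -l1 + INF),     # 0 in [-l1, -l1] + R
        (l1 - l2, l1 - l2 + INF),   # 0 in [l1 - l2, l1 - l2] + R_+
        (l2 - 2.0, l2 - 1.5),       # 0 in [-2, -3/2] + [l2, l2]
        (l1 - 2.0, l1),             # 0 in [-2, 0] + [l1, l1]
        (l2 - INF, l2),             # 0 in [l2, l2] + R_-
    ]
    return all(contains_zero(lo, hi) for lo, hi in tests)

assert inclusions_hold(1.0, 7.0 / 4.0)   # the multipliers given in the text

# Grid scan: the admissible multipliers form a bounded (indeed compact) set.
grid = [k / 8.0 for k in range(0, 25)]   # 0, 1/8, ..., 3
admissible = [(l1, l2) for l1 in grid for l2 in grid if inclusions_hold(l1, l2)]
print(len(admissible), max(l1 for l1, _ in admissible), max(l2 for _, l2 in admissible))
```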

5 Conclusion

We considered an interval optimization problem with equality and inequality constraints along with an abstract constraint. The necessary optimality conditions were obtained by specializing the Dubovitskii–Milyutin formalism. We defined LU-directions of decrease for interval-valued functions and characterized them through the gH-derivative of the interval-valued objective function. An application of the Dubovitskii–Milyutin Lemma led to the Euler–Lagrange equation, from which a multiplier rule of Karush–Kuhn–Tucker type was derived.

Among other settings, dealing with abstract constraints is important in optimal control problems: in general, control constraints are given in abstract form. In the case of discrete-time optimal control interval problems, our results can be directly applied, as mentioned in the last section. But even in the continuous case, although the optimality conditions developed here are not directly applicable, our results can be useful. It is well known that the necessary optimality conditions for optimal control problems given in the maximum principle of Pontryagin involve the maximization of the Hamiltonian function with respect to the control variable at each instant of time t. Although the optimal control problem is posed in infinite-dimensional spaces (in the continuous case), the maximization of the Hamiltonian at a given t falls exactly into an (IOP2)-type problem.

Another important feature of our work is that the necessary optimality conditions were achieved via a specification of the Dubovitskii–Milyutin formalism. This makes it straightforward to generalize the results to problems posed in Banach spaces, to continuous-time interval programming problems, and to optimal control interval problems.

For future research, it would be of interest to work with problems in which not only the objective function is interval-valued, but the constraints are as well. More challenging would be working with problems whose decision variables are intervals.